# OC R User Group Meet-Up

This a a summary of the presentation I gave at the October OC R User Group Meet Up. The focus on the talk was to solve the following problem statement using R.

I am investing in Lending Club notes. Specifically Grade A, 36 Month loans. I want to generate an expected cash flow for my investment.

## Note

Sometimes wordpress.com will format code like the pipes and R assignment incorrectly. You can find the code for this post here: https://github.com/bryantravissmith/OC-RUG

## Setup

I want to generate my expectations for grade A, 36 months loan. The main goal is to generate a framework for the solution that it can also be applied to other loans. I am starting off creating a data frame from all the join Lending Club Data from when I selected the target loans from tranches of loan that are mature.

library(dplyr) library(reshape2) library(ggplot2) library(zoo) library(stringr) library(knitr) library(DT) # Previously loaded all LC data into a single data frame and saved as RData load("/opt/science/Datasets/LendingClub/2016-06-30/lcCombinded.RData") lcA <- lc %>% mutate( #Crate a Data Object for the Issue Date issue_yearmon = as.Date(as.yearmon(issue_d,format="%b-%Y")), #Create a Numeric Issue Year For Group By Calculations issue_year = as.numeric(as.character(issue_yearmon,format="%Y")), #Convert to Date object to calculate the 'Age' of the loan last_pymnt_yearmon = as.Date(as.yearmon(last_pymnt_d,format="%b-%Y")), #Converte interest rate to ta numeric value interest = as.numeric(str_replace(int_rate,"%","")) ) %>% mutate( #Make what are roughly 1 month wide in the time between the loan was #originated and the time it stopped generating cash flows AgeBucket = cut(as.numeric((last_pymnt_yearmon-issue_yearmon)/365), breaks=seq(0,6,1/12), include.lowest=T, right=F) ) %>% mutate( #Converte the Age Bucket to a Numeric Month or Statement value Age = match(AgeBucket,levels(AgeBucket)) ) %>% mutate( #Get the total recieved payments for each loan total_rec = total_rec_prncp+total_rec_int ) %>% #Limit to the data we are interested in for the problem #Remove 2007 - stands out in graphs, actually doesn't change the results too much. filter(term == " 36 months",grade == "A", issue_year < 2013, issue_year > 2007)

## Cash Flow Expectation : Paid As Agreed

The simplest expectation is to just assume that everyone will make every schedule payment. The equation for the payment of a fix term loan is:

You can then use this to get the amortization schedule for the loan.

# Generates the expected payment for a fixed term loan. payment_value <- function(loan_amount,interest_rate,term_number) { loan_amount*interest_rate/12/(1 - (1+interest_rate/12)^(-1*term_number)) } #Gereates the scheduled payment behavior for a loan that is paid as agreed get_amortization <- function(term_number,interest_rate){ amortization <- data.frame(statement = seq(0,term_number,1), payment = c(0,rep(payment_value(1,interest_rate,term_number),term_number)), interest_payment = 0, principal_payment = 0, start_balance = 0, end_balance = 1) for(i in (1:term_number+1)){ amortization[i,'start_balance'] = amortization[i-1,'end_balance']*(1+interest_rate/12) amortization[i,'interest_payment'] = interest_rate*amortization[i-1,'end_balance']/12 amortization[i,'principal_payment'] = amortization[i,'payment'] - amortization[i,'interest_payment'] amortization[i,'end_balance'] = amortization[i,'start_balance']-amortization[i,'payment'] } amortization[term_number+1,'end_balance']=0 return(amortization) } #Helper function to round numeric values when writing to data table round_df <- function(df, digits) { nums <- vapply(df, is.numeric, FUN.VALUE = logical(1)) df[,nums] <- round(df[,nums], digits = digits) (df) } #Printing values get_amortization(36,0.078) %>% round_df(4) %>% kable(format = "markdown")

statement | payment | interest_payment | principal_payment | start_balance | end_balance |
---|---|---|---|---|---|

0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 1.0000 |

1 | 0.0312 | 0.0065 | 0.0247 | 1.0065 | 0.9753 |

2 | 0.0312 | 0.0063 | 0.0249 | 0.9816 | 0.9504 |

3 | 0.0312 | 0.0062 | 0.0251 | 0.9565 | 0.9253 |

4 | 0.0312 | 0.0060 | 0.0252 | 0.9313 | 0.9001 |

5 | 0.0312 | 0.0059 | 0.0254 | 0.9059 | 0.8747 |

6 | 0.0312 | 0.0057 | 0.0256 | 0.8803 | 0.8491 |

7 | 0.0312 | 0.0055 | 0.0257 | 0.8546 | 0.8234 |

8 | 0.0312 | 0.0054 | 0.0259 | 0.8287 | 0.7975 |

9 | 0.0312 | 0.0052 | 0.0261 | 0.8027 | 0.7714 |

10 | 0.0312 | 0.0050 | 0.0262 | 0.7764 | 0.7452 |

11 | 0.0312 | 0.0048 | 0.0264 | 0.7500 | 0.7188 |

12 | 0.0312 | 0.0047 | 0.0266 | 0.7235 | 0.6922 |

13 | 0.0312 | 0.0045 | 0.0267 | 0.6967 | 0.6655 |

14 | 0.0312 | 0.0043 | 0.0269 | 0.6698 | 0.6386 |

15 | 0.0312 | 0.0042 | 0.0271 | 0.6427 | 0.6115 |

16 | 0.0312 | 0.0040 | 0.0273 | 0.6154 | 0.5842 |

17 | 0.0312 | 0.0038 | 0.0274 | 0.5880 | 0.5567 |

18 | 0.0312 | 0.0036 | 0.0276 | 0.5604 | 0.5291 |

19 | 0.0312 | 0.0034 | 0.0278 | 0.5326 | 0.5013 |

20 | 0.0312 | 0.0033 | 0.0280 | 0.5046 | 0.4733 |

21 | 0.0312 | 0.0031 | 0.0282 | 0.4764 | 0.4452 |

22 | 0.0312 | 0.0029 | 0.0284 | 0.4481 | 0.4168 |

23 | 0.0312 | 0.0027 | 0.0285 | 0.4195 | 0.3883 |

24 | 0.0312 | 0.0025 | 0.0287 | 0.3908 | 0.3596 |

25 | 0.0312 | 0.0023 | 0.0289 | 0.3619 | 0.3307 |

26 | 0.0312 | 0.0021 | 0.0291 | 0.3328 | 0.3016 |

27 | 0.0312 | 0.0020 | 0.0293 | 0.3035 | 0.2723 |

28 | 0.0312 | 0.0018 | 0.0295 | 0.2740 | 0.2428 |

29 | 0.0312 | 0.0016 | 0.0297 | 0.2444 | 0.2131 |

30 | 0.0312 | 0.0014 | 0.0299 | 0.2145 | 0.1833 |

31 | 0.0312 | 0.0012 | 0.0301 | 0.1845 | 0.1532 |

32 | 0.0312 | 0.0010 | 0.0302 | 0.1542 | 0.1230 |

33 | 0.0312 | 0.0008 | 0.0304 | 0.1238 | 0.0925 |

34 | 0.0312 | 0.0006 | 0.0306 | 0.0931 | 0.0619 |

35 | 0.0312 | 0.0004 | 0.0308 | 0.0623 | 0.0310 |

36 | 0.0312 | 0.0002 | 0.0310 | 0.0312 | 0.0000 |

## Results: Paid As Agreed

I am now going to use the Lending Club data to calculate actual yield of grade A, 36 month loans and compare it to the expectations generated by the above payment schedule. I will do this calculation by year for each year of mature loans in Lending Club. We are assuming there is no prepayment and charge off behavior, so I will also calculate these to see scale of these behaviors.

lcA %>% # Perform the following calculation for each year group_by(issue_year) %>% # Aggregrate for each year summarise( #Number of Loans Count = n(), #Dollars Loaned Out DollarsFunded = sum(funded_amnt), #Dollars collected from payments DollarsRecieved = sum(total_rec), #Percent Charge Offs Chargeoffs = round(sum(ifelse(grepl("Charged Off",loan_status),1,0))/Count*100,1), #Percent Prepayments Prepays = round(sum(ifelse(grepl("Fully Paid",loan_status)&(Age < 36),1,0))/Count*100,1), #The Percent Increase on the Dollars Loans Yield = round(100*(DollarsRecieved/DollarsFunded-1),1), #Dollar weighted interest rate for each year Interest = round(weighted.mean(interest,funded_amnt),1)) %>% mutate( #Applys the amortization schedule and sumes payments recieved for each loan ExpectedYield = apply( (.), 1, function(x) sum(get_amortization(36, as.numeric(x['Interest'])/100)$payment))) %>% mutate( #Rounds Expected Yield to a percent ExpectedYield = round(100*(ExpectedYield-1),1)) %>% #Filter to relevant variables select(issue_year,DollarsFunded,Prepays,Chargeoffs,Interest,Yield,ExpectedYield) %>% #Convert to % for display mutate(Prepays=paste0(Prepays,"%"), Chargeoffs=paste0(Chargeoffs,"%"), Interest=paste0(Interest,"%"), Yield=paste0(Yield,"%"), ExpectedYield = paste0(ExpectedYield,"%")) %>% #Create a data table kable( format = "markdown")

issue_year | DollarsFunded | Prepays | Chargeoffs | Interest | Yield | ExpectedYield |
---|---|---|---|---|---|---|

2008 | 1899800 | 49.4% | 6% | 8.5% | 7.6% | 13.6% |

2009 | 8700550 | 49.7% | 6.7% | 8.8% | 7.6% | 14.1% |

2010 | 20650050 | 53.1% | 4.4% | 7.2% | 7.4% | 11.5% |

2011 | 49784200 | 50.9% | 6.4% | 7.2% | 6.4% | 11.5% |

2012 | 119545400 | 52.4% | 7.2% | 7.7% | 6.4% | 12.3% |

## Review: Paid As Agreed

I can see that there are signficant Prepayments and Charge Offs for each year of Lending Club, so I’ll try to include these in the expectations.

## Cash Flow Expectation : Account For Charge Offs

I am going to include charge offs in the payment expectations. Once a loan charges off, it no longer makes payments. This makes it straight forward to include in the amortatization schedule because we only need to keep track of how many loans are left, and add the payments of the left over loans. In reality there is also deliquency behavior that should be accounted for, but we will have to do that another time.

To execute this, I need to get get a feel for when the loans charge off in a given traunch of loans.

# Charege Offs lcA %>% #Perform the following calculations for each year group_by(issue_year) %>% ## Count the number of leans each year mutate(count=n()) %>% ## Perform a summary for each year and age of loan group_by(issue_year,Age) %>% ## Find the number of loans that charge off each month summarise(bad = sum(ifelse(grepl("Charged Off",loan_status),1,0)), #Get the count value for the year #count is a vector of the save value for each group count = max(count)) %>% ## ORder by the age for the cumsum arrange(Age) %>% ## Get the cumlative total of bad loans over the course of the term mutate(total_bad = cumsum(bad)) %>% ## Group by the traunch of loans group_by(issue_year) %>% ##Get the total number of bad loans for the traunch mutate(max_total_bad = max(total_bad)) %>% ##Plot the cumlative percentage of chargeoffs for each traunch ggplot(aes(Age,total_bad/max_total_bad,color=factor(issue_year)))+ geom_line()+ theme_bw()+ xlim(0,36)+ ylab('Percent Total Chargeoffs')

The cumulative charge off behavior has a characteristic shape, so we can use this shape to include in the expected number of units making payments in a given month. We can generate this expectations by just doing a weighted average on these curves (thought there are more thoughtful ways to do this as well). I’m going to take this average, and smoothing fit, and make a function of out it to include in the amortization of the loan.

percentChargeoffFunc <- lcA %>% #Perform the following calculations for each year group_by(issue_year) %>% ## Count the number of leans each year mutate(count=n()) %>% ## Perform a summary for each year and age of loan group_by(issue_year,Age) %>% ## Find the number of loans that charge off each month summarise(bad = sum(ifelse(grepl("Charged Off",loan_status),1,0)), #Get the count value for the year #count is a vector of the save value for each group count = max(count)) %>% ## ORder by the age for the cumsum arrange(Age) %>% ## Limit to 36 months filter(Age <= 36) %>% ## Get the cumlative total of bad loans over the course of the term mutate(total_bad = cumsum(bad)) %>% ## Group by the traunch of loans group_by(issue_year) %>% ##Get the total number of bad loans for the traunch mutate(max_total_bad = max(total_bad)) %>% ##Calculate percent bad mutate(percent_bad = total_bad/max_total_bad) %>% ## Group by Age to do the weighted average of all the curves group_by(Age) %>% ## Calculate the weighted average by the traunch size summarise(avg_percent_total_bad = weighted.mean(percent_bad,count)) %>% ## Return Approx Function to describe Chargeoff Behavior (function(df) { ## Make a loess fit of the average to smooth out mod <- loess(avg_percent_total_bad ~ Age,df,span=1/4) ## make the predictions pred <- predict(mod,df) ## create a data frame of the age and smoothed predictions tmp <- data.frame(Age = df$Age,pred_percent_total_bad=pred) ## make sure the new smooth values are between 0 and 1 tmp$pred_percent_total_bad = tmp$pred_percent_total_bad-min(tmp$pred_percent_total_bad) tmp$pred_percent_total_bad = tmp$pred_percent_total_bad/max(tmp$pred_percent_total_bad) # return approx function of the smooth function approxfun(tmp$Age,tmp$pred_percent_total_bad,method='linear',yleft = 0,yright=1) }) ## Plot Function Output data.frame(Age=seq(0,36), pco = percentChargeoffFunc(seq(0,36))) %>% ggplot(aes(Age,pco))+ geom_line()+ theme_bw()+ xlim(0,36)+ ylab('Percent Total Chargeoffs')

Now that I have the function, I can include the expected charge off function times the actual scale to update our cash flow expectations.

#Ammortizaiton Schedule for loans with Charge Off Expectations get_amortization_chargeoff <- function(term_number,interest_rate,percent_chargeoff){ amortization <- data.frame(statement=seq(0,term_number,1), payment=c(0,rep(payment_value(1,interest_rate,term_number),term_number)), interest_payment = 0, principal_payment=0, start_balance=0, end_balance=1, #Added a units chargeoff which is the new curve times the measured scale unit_chargeoff = percent_chargeoff*percentChargeoffFunc(seq(0,36))) for(i in (1:term_number+1)){ amortization[i,'start_balance'] = amortization[i-1,'end_balance']*(1+interest_rate/12) amortization[i,'interest_payment'] = interest_rate*amortization[i-1,'end_balance']/12 amortization[i,'principal_payment'] = amortization[i,'payment'] - amortization[i,'interest_payment'] amortization[i,'end_balance'] = amortization[i,'start_balance']-amortization[i,'payment'] } amortization[term_number+1,'end_balance']=0 #Adds a payment recieved which is the schedule payments times the number of units left on a given statement amortization <- amortization %>% mutate(payment_recieved = (1-unit_chargeoff)*payment) return(amortization) } #Print output get_amortization_chargeoff(36,0.07,0.06) %>% round_df(4) %>% kable( format = "markdown")

statement | payment | interest_payment | principal_payment | start_balance | end_balance | unit_chargeoff | payment_recieved |
---|---|---|---|---|---|---|---|

0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 1.0000 | 0.0000 | 0.0000 |

1 | 0.0309 | 0.0058 | 0.0250 | 1.0058 | 0.9750 | 0.0000 | 0.0309 |

2 | 0.0309 | 0.0057 | 0.0252 | 0.9806 | 0.9498 | 0.0008 | 0.0309 |

3 | 0.0309 | 0.0055 | 0.0253 | 0.9553 | 0.9244 | 0.0019 | 0.0308 |

4 | 0.0309 | 0.0054 | 0.0255 | 0.9298 | 0.8989 | 0.0033 | 0.0308 |

5 | 0.0309 | 0.0052 | 0.0256 | 0.9042 | 0.8733 | 0.0051 | 0.0307 |

6 | 0.0309 | 0.0051 | 0.0258 | 0.8784 | 0.8475 | 0.0070 | 0.0307 |

7 | 0.0309 | 0.0049 | 0.0259 | 0.8525 | 0.8216 | 0.0088 | 0.0306 |

8 | 0.0309 | 0.0048 | 0.0261 | 0.8264 | 0.7955 | 0.0107 | 0.0305 |

9 | 0.0309 | 0.0046 | 0.0262 | 0.8002 | 0.7693 | 0.0130 | 0.0305 |

10 | 0.0309 | 0.0045 | 0.0264 | 0.7738 | 0.7429 | 0.0152 | 0.0304 |

11 | 0.0309 | 0.0043 | 0.0265 | 0.7472 | 0.7163 | 0.0172 | 0.0303 |

12 | 0.0309 | 0.0042 | 0.0267 | 0.7205 | 0.6896 | 0.0191 | 0.0303 |

13 | 0.0309 | 0.0040 | 0.0269 | 0.6937 | 0.6628 | 0.0212 | 0.0302 |

14 | 0.0309 | 0.0039 | 0.0270 | 0.6667 | 0.6358 | 0.0235 | 0.0302 |

15 | 0.0309 | 0.0037 | 0.0272 | 0.6395 | 0.6086 | 0.0258 | 0.0301 |

16 | 0.0309 | 0.0036 | 0.0273 | 0.6122 | 0.5813 | 0.0282 | 0.0300 |

17 | 0.0309 | 0.0034 | 0.0275 | 0.5847 | 0.5538 | 0.0305 | 0.0299 |

18 | 0.0309 | 0.0032 | 0.0276 | 0.5570 | 0.5261 | 0.0327 | 0.0299 |

19 | 0.0309 | 0.0031 | 0.0278 | 0.5292 | 0.4983 | 0.0345 | 0.0298 |

20 | 0.0309 | 0.0029 | 0.0280 | 0.5012 | 0.4704 | 0.0363 | 0.0298 |

21 | 0.0309 | 0.0027 | 0.0281 | 0.4731 | 0.4422 | 0.0383 | 0.0297 |

22 | 0.0309 | 0.0026 | 0.0283 | 0.4448 | 0.4139 | 0.0399 | 0.0296 |

23 | 0.0309 | 0.0024 | 0.0285 | 0.4164 | 0.3855 | 0.0414 | 0.0296 |

24 | 0.0309 | 0.0022 | 0.0286 | 0.3877 | 0.3569 | 0.0429 | 0.0296 |

25 | 0.0309 | 0.0021 | 0.0288 | 0.3589 | 0.3281 | 0.0447 | 0.0295 |

26 | 0.0309 | 0.0019 | 0.0290 | 0.3300 | 0.2991 | 0.0469 | 0.0294 |

27 | 0.0309 | 0.0017 | 0.0291 | 0.3008 | 0.2700 | 0.0490 | 0.0294 |

28 | 0.0309 | 0.0016 | 0.0293 | 0.2715 | 0.2407 | 0.0510 | 0.0293 |

29 | 0.0309 | 0.0014 | 0.0295 | 0.2421 | 0.2112 | 0.0528 | 0.0292 |

30 | 0.0309 | 0.0012 | 0.0296 | 0.2124 | 0.1815 | 0.0543 | 0.0292 |

31 | 0.0309 | 0.0011 | 0.0298 | 0.1826 | 0.1517 | 0.0556 | 0.0292 |

32 | 0.0309 | 0.0009 | 0.0300 | 0.1526 | 0.1217 | 0.0567 | 0.0291 |

33 | 0.0309 | 0.0007 | 0.0302 | 0.1224 | 0.0916 | 0.0577 | 0.0291 |

34 | 0.0309 | 0.0005 | 0.0303 | 0.0921 | 0.0612 | 0.0586 | 0.0291 |

35 | 0.0309 | 0.0004 | 0.0305 | 0.0616 | 0.0307 | 0.0594 | 0.0290 |

36 | 0.0309 | 0.0002 | 0.0307 | 0.0309 | 0.0000 | 0.0600 | 0.0290 |

## Results: Account For Charge Offs

lcA %>% # Perform the following calculation for each year group_by(issue_year) %>% # Aggregrate for each year summarise( #Number of Loans Count = n(), #Dollars Loaned Out DollarsFunded = sum(funded_amnt), #Dollars collected from payments DollarsRecieved = sum(total_rec), #Percent Charge Offs Chargeoffs = round(sum(ifelse(grepl("Charged Off",loan_status),1,0))/Count*100,1), #Percent Prepayments Prepays = round(sum(ifelse(grepl("Fully Paid",loan_status)&(Age < 36),1,0))/Count*100,1), #The Percent Increase on the Dollars Loans Yield = round(100*(DollarsRecieved/DollarsFunded-1),1), #Dollar weighted interest rate for each year Interest = round(weighted.mean(interest,funded_amnt),1)) %>% mutate( #Applys the amortization schedule and sumes payments recieved for each loan ExpectedYield = apply( (.), 1, function(x) sum(get_amortization_chargeoff(36, as.numeric(x['Interest'])/100, as.numeric(x['Chargeoffs'])/100 )$payment_recieved))) %>% mutate( #Rounds Expected Yield to a percent ExpectedYield = round(100*(ExpectedYield-1),1) ) %>% #Filter to relevant variables select(issue_year,DollarsFunded,Prepays,Chargeoffs,Interest,Yield,ExpectedYield) %>% #Convert to % for display mutate(Prepays=paste0(Prepays,"%"), Chargeoffs=paste0(Chargeoffs,"%"), Interest=paste0(Interest,"%"), Yield=paste0(Yield,"%"), ExpectedYield = paste0(ExpectedYield,"%")) %>% #Create a data table kable( format = "markdown")

issue_year | DollarsFunded | Prepays | Chargeoffs | Interest | Yield | ExpectedYield |
---|---|---|---|---|---|---|

2008 | 1899800 | 49.4% | 6% | 8.5% | 7.6% | 10% |

2009 | 8700550 | 49.7% | 6.7% | 8.8% | 7.6% | 10.1% |

2010 | 20650050 | 53.1% | 4.4% | 7.2% | 7.4% | 8.9% |

2011 | 49784200 | 50.9% | 6.4% | 7.2% | 6.4% | 7.7% |

2012 | 119545400 | 52.4% | 7.2% | 7.7% | 6.4% | 8% |

## Review: Account For Charge Offs

I see that the expected values have decreased from the previous expectations. The expected and actuals are closer in line, but there is room for improvement. I have accounted for the charge offs, but not the prepayment behavior.

## Cash Flow Expectation : Account For Prepayments & Charge Offs

I can now repeat the process I did for including the effect of charge offs in our cash flow expectations. I can see if there is a characteristic cumulative behavior, generate a function that approximates the behavior, and incorporate it into the amortization.

# Prepays lcA %>% ## Perform the following calculations for each year group_by(issue_year) %>% ## Count the number of leans each year mutate(count=n()) %>% ## Perform a summary for each year and age of loan group_by(issue_year,Age) %>% ## Find the number of loans that prepayments each month summarise(prepay = sum(ifelse(grepl("Fully Paid",loan_status),1,0)), ## Get the count value for the year ## count is a vector of the save value for each group count = max(count)) %>% ## ORder by the age for the cumsum arrange(Age) %>% ## Can only prepay before the loan is mature filter(Age < 36) %>% ## Get the cumlative total of prepaid loans over the course of the term mutate(total_prepay = cumsum(prepay)) %>% ## Group by the traunch of loans group_by(issue_year) %>% ## Get the total number of prepayments for the traunch mutate(max_total_prepay = max(total_prepay)) %>% ## Plot the cumlative percentage of prepaid loans for each traunch ggplot(aes(Age,total_prepay/max_total_prepay,color=factor(issue_year)))+ geom_line()+ theme_bw()+ xlim(0,36)+ ylab('Percent Total Prepayment')

The cumulative prepayment behavior over the life does have a characteristic shape, so I will generate an approximation function similar to charge offs.

percentPrepaymentFunc <- lcA %>% ## Perform the following calculations for each year group_by(issue_year) %>% ## Count the number of leans each year mutate(count=n()) %>% ## Perform a summary for each year and age of loan group_by(issue_year,Age) %>% ## Find the number of loans that prepayments each month summarise(prepay = sum(ifelse(grepl("Fully Paid",loan_status),1,0)), ## Get the count value for the year ## count is a vector of the save value for each group count = max(count)) %>% ## ORder by the age for the cumsum arrange(Age) %>% ## Can only prepay before the loan is mature filter(Age < 36) %>% ## Get the cumlative total of prepaid loans over the course of the term mutate(total_prepay = cumsum(prepay)) %>% ## Group by the traunch of loans group_by(issue_year) %>% ## Get the total number of prepayments for the traunch mutate(max_total_prepay = max(total_prepay)) %>% ## Calculate the percentage of prepayments that happen in a given ## month out of all of the prepayments that happen in the traunch mutate(percent_prepay = total_prepay/max_total_prepay) %>% ## Calculate for eeach age group_by(Age) %>% ## The weighted mean of the percentage by the size of the traunch summarise(avg_percent_total_prepay = weighted.mean(percent_prepay,count)) %>% ## Return an approx function (function(df) { ## perform a loess fit to smoooth the function mod <- loess(avg_percent_total_prepay ~ Age,df,span=1/4) ## predict the results pred <- predict(mod,df) ## create a data frame to use to make the approx function tmp <- data.frame(Age = df$Age,pred_percent_total_prepay=pred) ## Make sure the fit produces values between 0 and 1 tmp$pred_percent_total_prepay = tmp$pred_percent_total_prepay-min(tmp$pred_percent_total_prepay) tmp$pred_percent_total_prepay = tmp$pred_percent_total_prepay/max(tmp$pred_percent_total_prepay) ## Return the approx function approxfun(tmp$Age,tmp$pred_percent_total_prepay,method='linear',yleft = 0,yright=1) }) ## Plot the approx functions data.frame(Age=seq(0,36), ppp = percentPrepaymentFunc(seq(0,36))) %>% ggplot(aes(Age,ppp))+ geom_line()+ theme_bw()+ xlim(0,36)+ ylab('Percent Total Chargeoffs')

## Results: Account For Prepayments & Charge Offs

I have the characteristic prepayment function and charge off function, so I can append it to the amortization schedule. Like loans that charge off, a prepaid loan does not generate payments after it is prepaid. Unlike charged off loans, it does return the principal. Once this is included, I suspect I will see higher than previously expected cashflows earlier due to returned principal.

#Ammortizaiton Schedule for loans with Charge Off and Prepayment Expectations get_amortization_chargeoff_prepay <- function(term_number,interest_rate,percent_chargeoff,percent_prepay){ amortization <- data.frame(statement=seq(0,term_number,1), payment=c(0,rep(payment_value(1,interest_rate,term_number),term_number)), interest_payment = 0, principal_payment=0, start_balance=0, end_balance=1, #Units chargeoffed is measures scale * expected percent of total unit_chargeoff = percent_chargeoff*percentChargeoffFunc(seq(0,36)), #Units prepyament is measured scale * expected percent of total unit_prepay = percent_prepay*percentPrepaymentFunc(seq(0,36))) for(i in (1:term_number+1)){ amortization[i,'start_balance'] = amortization[i-1,'end_balance']*(1+interest_rate/12) amortization[i,'interest_payment'] = interest_rate*amortization[i-1,'end_balance']/12 amortization[i,'principal_payment'] = amortization[i,'payment'] - amortization[i,'interest_payment'] amortization[i,'end_balance'] = amortization[i,'start_balance']-amortization[i,'payment'] } amortization[term_number+1,'end_balance']=0 amortization <- amortization %>% mutate(monthly_prepay = ifelse(is.na(lag(unit_prepay)),0,unit_prepay-lag(unit_prepay))) %>% mutate(payment_recieved = (1-unit_chargeoff-unit_prepay)*payment+monthly_prepay*(end_balance+payment)) return(amortization) } get_amortization_chargeoff_prepay(36,0.07,0.06,0.50) %>% round_df(4) %>% kable( format = "markdown")

statement | payment | interest_payment | principal_payment | start_balance | end_balance | unit_chargeoff | unit_prepay | monthly_prepay | payment_recieved |
---|---|---|---|---|---|---|---|---|---|

0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 1.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |

1 | 0.0309 | 0.0058 | 0.0250 | 1.0058 | 0.9750 | 0.0000 | 0.0000 | 0.0000 | 0.0309 |

2 | 0.0309 | 0.0057 | 0.0252 | 0.9806 | 0.9498 | 0.0008 | 0.0064 | 0.0064 | 0.0369 |

3 | 0.0309 | 0.0055 | 0.0253 | 0.9553 | 0.9244 | 0.0019 | 0.0135 | 0.0071 | 0.0372 |

4 | 0.0309 | 0.0054 | 0.0255 | 0.9298 | 0.8989 | 0.0033 | 0.0214 | 0.0079 | 0.0375 |

5 | 0.0309 | 0.0052 | 0.0256 | 0.9042 | 0.8733 | 0.0051 | 0.0298 | 0.0084 | 0.0374 |

6 | 0.0309 | 0.0051 | 0.0258 | 0.8784 | 0.8475 | 0.0070 | 0.0390 | 0.0093 | 0.0376 |

7 | 0.0309 | 0.0049 | 0.0259 | 0.8525 | 0.8216 | 0.0088 | 0.0492 | 0.0101 | 0.0377 |

8 | 0.0309 | 0.0048 | 0.0261 | 0.8264 | 0.7955 | 0.0107 | 0.0600 | 0.0108 | 0.0376 |

9 | 0.0309 | 0.0046 | 0.0262 | 0.8002 | 0.7693 | 0.0130 | 0.0718 | 0.0118 | 0.0377 |

10 | 0.0309 | 0.0045 | 0.0264 | 0.7738 | 0.7429 | 0.0152 | 0.0825 | 0.0107 | 0.0361 |

11 | 0.0309 | 0.0043 | 0.0265 | 0.7472 | 0.7163 | 0.0172 | 0.0928 | 0.0103 | 0.0352 |

12 | 0.0309 | 0.0042 | 0.0267 | 0.7205 | 0.6896 | 0.0191 | 0.1039 | 0.0111 | 0.0351 |

13 | 0.0309 | 0.0040 | 0.0269 | 0.6937 | 0.6628 | 0.0212 | 0.1168 | 0.0129 | 0.0356 |

14 | 0.0309 | 0.0039 | 0.0270 | 0.6667 | 0.6358 | 0.0235 | 0.1321 | 0.0153 | 0.0363 |

15 | 0.0309 | 0.0037 | 0.0272 | 0.6395 | 0.6086 | 0.0258 | 0.1474 | 0.0153 | 0.0353 |

16 | 0.0309 | 0.0036 | 0.0273 | 0.6122 | 0.5813 | 0.0282 | 0.1631 | 0.0157 | 0.0346 |

17 | 0.0309 | 0.0034 | 0.0275 | 0.5847 | 0.5538 | 0.0305 | 0.1788 | 0.0157 | 0.0336 |

18 | 0.0309 | 0.0032 | 0.0276 | 0.5570 | 0.5261 | 0.0327 | 0.1942 | 0.0154 | 0.0324 |

19 | 0.0309 | 0.0031 | 0.0278 | 0.5292 | 0.4983 | 0.0345 | 0.2096 | 0.0154 | 0.0315 |

20 | 0.0309 | 0.0029 | 0.0280 | 0.5012 | 0.4704 | 0.0363 | 0.2256 | 0.0160 | 0.0308 |

21 | 0.0309 | 0.0027 | 0.0281 | 0.4731 | 0.4422 | 0.0383 | 0.2432 | 0.0176 | 0.0305 |

22 | 0.0309 | 0.0026 | 0.0283 | 0.4448 | 0.4139 | 0.0399 | 0.2599 | 0.0167 | 0.0290 |

23 | 0.0309 | 0.0024 | 0.0285 | 0.4164 | 0.3855 | 0.0414 | 0.2759 | 0.0160 | 0.0278 |

24 | 0.0309 | 0.0022 | 0.0286 | 0.3877 | 0.3569 | 0.0429 | 0.2923 | 0.0163 | 0.0269 |

25 | 0.0309 | 0.0021 | 0.0288 | 0.3589 | 0.3281 | 0.0447 | 0.3098 | 0.0176 | 0.0262 |

26 | 0.0309 | 0.0019 | 0.0290 | 0.3300 | 0.2991 | 0.0469 | 0.3292 | 0.0194 | 0.0257 |

27 | 0.0309 | 0.0017 | 0.0291 | 0.3008 | 0.2700 | 0.0490 | 0.3486 | 0.0193 | 0.0244 |

28 | 0.0309 | 0.0016 | 0.0293 | 0.2715 | 0.2407 | 0.0510 | 0.3681 | 0.0196 | 0.0232 |

29 | 0.0309 | 0.0014 | 0.0295 | 0.2421 | 0.2112 | 0.0528 | 0.3872 | 0.0191 | 0.0219 |

30 | 0.0309 | 0.0012 | 0.0296 | 0.2124 | 0.1815 | 0.0543 | 0.4062 | 0.0189 | 0.0207 |

31 | 0.0309 | 0.0011 | 0.0298 | 0.1826 | 0.1517 | 0.0556 | 0.4249 | 0.0187 | 0.0195 |

32 | 0.0309 | 0.0009 | 0.0300 | 0.1526 | 0.1217 | 0.0567 | 0.4438 | 0.0190 | 0.0183 |

33 | 0.0309 | 0.0007 | 0.0302 | 0.1224 | 0.0916 | 0.0577 | 0.4632 | 0.0194 | 0.0172 |

34 | 0.0309 | 0.0005 | 0.0303 | 0.0921 | 0.0612 | 0.0586 | 0.4819 | 0.0187 | 0.0159 |

35 | 0.0309 | 0.0004 | 0.0305 | 0.0616 | 0.0307 | 0.0594 | 0.5000 | 0.0181 | 0.0147 |

36 | 0.0309 | 0.0002 | 0.0307 | 0.0309 | 0.0000 | 0.0600 | 0.5000 | 0.0000 | 0.0136 |

Now I can add the prepayment behavior into our cashflows and compare the expected to the actuals.

lcA %>% # Perform the following calculation for each year group_by(issue_year) %>% # Aggregrate for each year summarise( #Number of Loans Count = n(), #Dollars Loaned Out DollarsFunded = sum(funded_amnt), #Dollars collected from payments DollarsRecieved = sum(total_rec), #Percent Charge Offs Chargeoffs = round(sum(ifelse(grepl("Charged Off",loan_status),1,0))/Count*100,1), #Percent Prepayments Prepays = round(sum(ifelse(grepl("Fully Paid",loan_status)&(Age < 36),1,0))/Count*100,1), #The Percent Increase on the Dollars Loans Yield = round(100*(DollarsRecieved/DollarsFunded-1),1), #Dollar weighted interest rate for each year Interest = round(weighted.mean(interest,funded_amnt),1)) %>% mutate( #Applys the amortization schedule and sumes payments recieved for each loan ExpectedYield = apply( (.), 1, function(x) sum(get_amortization_chargeoff_prepay(36, as.numeric(x['Interest'])/100, as.numeric(x['Chargeoffs'])/100, as.numeric(x['Prepays'])/100 )$payment_recieved))) %>% mutate( #Rounds Expected Yield to a percent ExpectedYield = round(100*(ExpectedYield-1),1) ) %>% #Filter to relevant variables select(issue_year,DollarsFunded,Prepays,Chargeoffs,Interest,Yield,ExpectedYield) %>% #Convert to % for display mutate(Prepays=paste0(Prepays,"%"), Chargeoffs=paste0(Chargeoffs,"%"), Interest=paste0(Interest,"%"), Yield=paste0(Yield,"%"), ExpectedYield = paste0(ExpectedYield,"%")) %>% #Create a data table kable(format = "markdown")

issue_year | DollarsFunded | Prepays | Chargeoffs | Interest | Yield | ExpectedYield |
---|---|---|---|---|---|---|

2008 | 1899800 | 49.4% | 6% | 8.5% | 7.6% | 8.4% |

2009 | 8700550 | 49.7% | 6.7% | 8.8% | 7.6% | 8.3% |

2010 | 20650050 | 53.1% | 4.4% | 7.2% | 7.4% | 7.4% |

2011 | 49784200 | 50.9% | 6.4% | 7.2% | 6.4% | 6.3% |

2012 | 119545400 | 52.4% | 7.2% | 7.7% | 6.4% | 6.4% |

## Review: Account For Prepayments & Charge Offs

Now I have a framework for cashflow expectations that comes within error of the actuals. This method requires that I generate exceptions for the expected charge off and prepayment scales to forecast cashflows, but once I have that, the rest is just straight forward math. The cashflow model that comes describes lending club actuals is using the paid as agreed model, then account for charge off and prepayment behavior. Even though there are a number of simplification and assumptions in this framework, it is an improved understanding of the expected cashflows for a grade A, 36 month term loan in Lending Club.

You must log in to post a comment.