
Messages - PhilGD

Investors - LC / Re: Develop Loan-Picking Algorithm
« on: September 29, 2017, 12:59:34 PM »

- Are there some variables that are your smoking gun predictors?

Finding these variables is the most fun part of the model development process! This information also tends to be among the most closely guarded, and little is shared beyond the commonly known variables. Any answers you get to this question will likely be limited to the low-hanging fruit, e.g. FICO, DTI, and number of inquiries.

Investors - LC / Re: Data question
« on: September 27, 2017, 09:24:58 AM »
You might be able to estimate the monthly note payment as follows:

(original note amount / original loan amount) * monthly loan payment

You might want to further adjust this calculation to estimate the impact of the service fee. There could also be other factors that would impact this calculation, but they don't immediately come to mind.
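
Here is a minimal Python sketch of that estimate. The function name, the sample numbers, and the idea of treating the service fee as a flat percentage taken off each payment are my own assumptions for illustration, not anything LC documents:

```python
def estimate_note_payment(note_amount, loan_amount, loan_payment, service_fee=0.0):
    """Pro-rate the borrower's monthly loan payment down to a single note.

    service_fee is assumed to be a fraction (0.01 = 1%) deducted from each
    payment; set it to 0.0 to get the unadjusted estimate.
    """
    if loan_amount <= 0:
        raise ValueError("loan_amount must be positive")
    share = note_amount / loan_amount      # the note's slice of the whole loan
    gross = share * loan_payment           # the note's slice of each payment
    return gross * (1.0 - service_fee)     # optional fee adjustment


# Hypothetical example: a $25 note on a $10,000 loan paying $332.14 per month.
print(estimate_note_payment(25.0, 10_000.0, 332.14))
print(estimate_note_payment(25.0, 10_000.0, 332.14, service_fee=0.01))
```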

Investors - LC / Re: Old Data Download
« on: August 24, 2017, 02:38:12 PM »
The question in the OP isn't relevant anymore because LC is back to providing the expanded data attributes for all loans issued from August 2012 - present in their historical data downloads. Therefore the granularity of the historical data is now on par (for the most part) with the granularity of the data on loans available for investment.

Investors - LC / Re: Unable to Download LoanStats Files
« on: March 29, 2017, 07:14:14 PM »
Files are back up. Member IDs are now blank, but other than that, I didn't notice any differences at first glance.

Investors - LC / Re: Unable to Download LoanStats Files
« on: March 24, 2017, 10:16:13 PM »
I can verify I downloaded them earlier this month. I've been archiving them monthly since I began investing.

Investors - LC / Re: Unable to Download LoanStats Files
« on: March 24, 2017, 12:00:12 PM »
Loan stats download is still not working. Anyone hear from LC on this?
Several new data fields have been appended to the end of the browsenotes file. They look to be related to joint loan applications. I haven't been active in a while so I don't know how recently this began. Maybe they're updating the historical data download files to match?

Investors - LC / Re: Default rates by employment title
« on: December 16, 2016, 02:01:54 PM »
I dug a little deeper into the data to investigate the correlation between interest rates and bad loan rates by profession. Below is a chart detailing this comparison for the same sub-sample of professions that I included in the OP. I also made a new PDF file with this data for all 124 professions included in my initial study (attached).

It looks like LC is doing a good job of assigning interest rates, even though I suspect their underwriting doesn't really account for employment title, since the free-text format makes it hard to accurately automate the label assignments. That said, the imperfect relationship between interest rate and profession suggests there may be some arbitrage opportunities to be found in the data.


Here's another chart showing the comparison for all 124 professions. I didn't label the professions this time because it would be too messy to do so.

Outlier A
The highest average interest rate in the data was 14.7%, and it corresponds to the "Military" profession. "Military" is a category consisting of loans where the employment title contained any of the following keywords:

-National Guard
-Air Force
-Department of Defense

Outlier B
The other big outlier has an interest rate of 14.2% and corresponds to all employment titles containing the keyword "Government."

Intuitively there is a connection between Government employees and folks in the Military - they all work for the United States government. But it seems like they are both being treated unfairly since their 12-month default rates (3.77% for Government and 4.71% for Military) are both below the population average (5.34%). I'm not sure why this discrepancy exists in the data. Perhaps there is a greater concentration of 5-year loans in these categories compared to most. I could take a deeper dive into the data for these outliers if people are interested.

One final point. The profession with the highest default rates in the population is listed as "dealer." This corresponds mostly to card dealers for table games. I have a hunch that most of these people are actually gamblers rather than casino workers - and they probably tend to gamble away their LC loans.

Investors - LC / Default rates by employment title
« on: December 16, 2016, 01:20:15 AM »
This forum has been a huge resource for me since March 2015, and in the spirit of the holidays I've decided to share a recent batch of work I've done on Lending Club's historical data. For the past week I've been working to crack open the employment title data and see if it reliably predicts charge-off rates. My complete findings are contained in the attached PDF file.

The biggest challenge in this project was dealing with spelling errors and other issues that arise from a free-text field, which make it hard to group the data. For instance, using Excel, it is not easy to simply group all "Presidents" in one category. I had to deal with the people who misspelled a word, included extraneous spaces and special characters, or used compound descriptors like "President & CEO."


1. All loans issued between September 2007 (the earliest issue date) and June 2015 were included in my analysis - more than 643,000 loans. A "charge off" was defined as a loan status of "charged off" or "default," where the date of last payment occurred 0 - 12 months after the loan was issued. If the loan was charged off in month 18, for example, it was not counted as charged off for the purpose of this project. By aligning the data in this way, I was able to remove the effect of loan age for loans issued recently. Why not include all loans through November of 2015 (i.e., all loans with up to twelve months of payment data)? Because I wanted a buffer to account for delinquent loans that might roll into charge-off status. Why choose 12 months of payment data as the cut-off? A: I wanted the largest sample possible; B: fewer than 12 months wouldn't allow for enough seasoning; C: more than 12 months would whittle down the database too significantly; D: a person's employment status is more likely to change as a loan gets older.

2. After I had gathered the data and defined what a "charge off" is, I used a pivot table to determine the most commonly used employment titles. If you're curious, the top three most common were "teacher," "manager," and "owner." I decided to create a short list of employment title categories by taking the top 100 most common titles. The list eventually grew to 124 total categories, since some common categories were not detected by the initial pivot table analysis.

3. The pivot table report showed a clear separation between the low-hanging fruit in the database and everything else. The low-hanging fruit were the people who used a simple description for their employment title and spelled it correctly, such as "teacher." The difficult people used the name of their employer instead of an employment title, or used a compound description such as "President & CEO," or misspelled a word, or included special characters such as "&" or "/".

4. I quickly encountered a problem: some people could be included in multiple employment categories. Based on my shortlist, "Assistant director systems engineering" was an assistant, a director, an engineer, and a systems engineer all at once. I resolved to allow for multiple employment categories/labels per loan in order to overcome this problem:

Breakdown of loans by number of labels

Labels              Loans
One label         335,495
Two labels         71,845
Three labels        3,556
Four labels            98
Not labeled       271,000
Emp title blank    36,886
Total             643,698

5. I used keyword searches to label as many loans as possible. For example, "fire fighter," "fire marshall," and "fire chief" were all lumped together. Similarly, "CEO," "COO," "CTO," "CFO," and their non-abbreviated versions were all lumped together into the "C-suite" category.

6. As noted in the table above, over a quarter million loans in my sample remain unlabeled. This represents the really difficult ones - most of them are not employment titles, but employer names, and hence impossible to categorize. Others are indeed employment titles, but they contain severe misspellings or belong in categories that are too uncommon to be statistically significant. There are also probably many loans that could be labeled, but belong to a category that I missed.

7. I calculated charge-off rates for each category and sorted from lowest-to-highest. Median income was also calculated for each employment label. Complete results for all categories are included in the attached PDF file. Below is a small sampling of the categories for the purpose of including a pretty chart:

8. I validated the results by splitting the sample into "earlier loans" and "later loans" and recalculating the charge off rates. If the results were reasonably similar for the separate samples, then we can conclude that employment title is predictive of charge off rates. The cutoff date that I chose was October 2014 - loans issued on or before this date are in the "early" sample and loans after this date are in the "later" sample. I chose this cutoff point solely in the interest of creating equally-sized samples - I wanted an even split. Below is the chart that proves my methodology is mostly accurate:

Please feel free to dig into the complete data set (attached) and offer feedback. I'm particularly interested in suggestions for how to automate the labeling process - parsing through text data for keywords and misspellings is not easy to do in Excel. But if it can be accomplished, then I'm confident it would be useful for including this data in a regression model.
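
To make the request concrete, here is a rough Python sketch of how steps 1, 4, 5, and 7 might be automated outside Excel. Everything in it is an assumption for illustration: the keyword map is a tiny hypothetical stand-in for the 124 categories, the function names are made up, the misspelling fix uses the standard library's `difflib.get_close_matches` with an arbitrary 0.85 cutoff, and treating a missing last-payment date as month 0 is my own guess. It is not the method used for the attached results.

```python
import difflib
import re
from collections import defaultdict
from datetime import date

# Hypothetical keyword map; the real study used 124 categories.
CATEGORY_KEYWORDS = {
    "assistant": ["assistant"],
    "c-suite": ["ceo", "coo", "cto", "cfo", "chief executive officer"],
    "director": ["director"],
    "engineer": ["engineer", "engineering"],
    "fire": ["fire fighter", "firefighter", "fire marshall", "fire chief"],
    "president": ["president"],
    "teacher": ["teacher"],
}
# Every individual word that appears in some keyword, used for spelling repair.
VOCAB = sorted({w for kws in CATEGORY_KEYWORDS.values() for kw in kws for w in kw.split()})


def normalize(title):
    """Lowercase, replace special characters with spaces, repair near-miss spellings."""
    tokens = re.sub(r"[^a-z0-9]+", " ", title.lower()).split()
    fixed = []
    for tok in tokens:
        # Only try to repair longer words, so abbreviations like "coo" are left alone.
        if len(tok) > 4 and tok not in VOCAB:
            match = difflib.get_close_matches(tok, VOCAB, n=1, cutoff=0.85)
            tok = match[0] if match else tok
        fixed.append(tok)
    return " ".join(fixed)


def label_title(title):
    """Return every category whose keyword appears as whole words (multi-label)."""
    padded = " " + normalize(title) + " "
    return sorted(
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if any(" " + kw + " " in padded for kw in keywords)
    )


def is_early_charge_off(status, issue_date, last_payment_date, window=12):
    """Step 1: bad status AND last payment within 0..window months of issue."""
    if status.strip().lower() not in {"charged off", "default"}:
        return False
    if last_payment_date is None:
        return True  # assumption: a loan that never paid counts as month 0
    months = (last_payment_date.year - issue_date.year) * 12 + (
        last_payment_date.month - issue_date.month
    )
    return 0 <= months <= window


def charge_off_rates(loans):
    """loans: iterable of (emp_title, status, issue_date, last_payment_date)."""
    counts = defaultdict(lambda: [0, 0])  # label -> [charge-offs, loans]
    for title, status, issued, last_paid in loans:
        bad = is_early_charge_off(status, issued, last_paid)
        for label in label_title(title):
            counts[label][0] += bad
            counts[label][1] += 1
    return {label: bad / total for label, (bad, total) in counts.items()}


# Made-up rows, only to show the call shape.
SAMPLE_LOANS = [
    ("Teacher", "Fully Paid", date(2014, 1, 1), date(2017, 1, 1)),
    ("teachr ", "Charged Off", date(2014, 1, 1), date(2014, 9, 1)),
    ("President & CEO", "Charged Off", date(2013, 1, 1), date(2014, 7, 1)),
]
print(label_title("Assistant director systems engineering"))
print(charge_off_rates(SAMPLE_LOANS))
```

Matching on whole words rather than raw substrings matters here: a plain substring search for "cto" would also hit every "director." The trade-off is that variants like "engineering" have to be listed explicitly or caught by the spelling repair.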

Investors - LC / Re: 2015 & recent loan quality
« on: December 07, 2016, 12:37:17 AM »
I was wondering how much help higher interest rates will be in offsetting the degraded loan quality.
The following table provides an interesting look at cumulative charge off rate versus interest rate.
The numbers look reasonable but I really didn't know what to expect.

Hi Rob, thanks for the great table you put together. Can you explain the calculations you're using to get to annualized return given cumulative charge off rates and interest rates? Thanks!

Sure. Given the original loan(s) amount (this amount is arbitrary), annual interest rate, term and service fee I compute the total net amount received for loan(s) that fully perform and mature. I multiply that amount by (1 - cumulative charge off percentage) giving the total amount received net of fees and charge offs. Finally, I compute the interest rate that produces that amount received given the original loan(s) amount and term.

The Excel formula in the upper left cell (Interest Rate 6%, Charge Offs 0%) (C7) is:
This is copied and pasted to all the other cells.

$B$1 is the original loan(s) amount
$B$2 is the term in months
$B$3 is the service fee %

C$6  is the Cumulative Charge Off Row (C6, D6, ... L6)
$B7  is the Interest Rate Column (B7, B8, ... B26)

Gotcha, thanks for the explanation! I didn't know about that =RATE formula before, so this is really helpful.
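
For anyone who wants to reproduce the table outside Excel, here is a rough Python sketch of the calculation as described above. It is my reading of the description, not the actual spreadsheet formula: I'm assuming the service fee is a flat percentage of every payment, that the net amount received is spread evenly over the term, and that the final step is a plain bisection standing in for =RATE. All function names are made up.

```python
def monthly_payment(principal, annual_rate, months):
    """Level payment on a fully amortizing loan."""
    r = annual_rate / 12.0
    if r == 0:
        return principal / months
    return principal * r / (1.0 - (1.0 + r) ** -months)


def present_value(payment, monthly_rate, months):
    """Present value of `months` level payments at `monthly_rate`."""
    if abs(monthly_rate) < 1e-12:
        return payment * months
    return payment * (1.0 - (1.0 + monthly_rate) ** -months) / monthly_rate


def solve_annual_rate(principal, payment, months):
    """Bisection stand-in for RATE: the rate at which the payments are worth `principal`."""
    lo, hi = -0.5, 1.0  # monthly-rate bracket; wide enough for negative returns
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if present_value(payment, mid, months) > principal:
            lo = mid  # payments are worth too much, so the rate must be higher
        else:
            hi = mid
    return 12.0 * (lo + hi) / 2.0


def net_annualized_return(principal, annual_rate, months, service_fee, cum_charge_off):
    """Total received net of fees and charge-offs, converted back to an annual rate."""
    gross = monthly_payment(principal, annual_rate, months) * months
    received = gross * (1.0 - service_fee) * (1.0 - cum_charge_off)
    return solve_annual_rate(principal, received / months, months)


# Hypothetical inputs: $10,000, 36 months, 1% fee, 12% coupon, 8% cumulative charge-offs.
print(net_annualized_return(10_000.0, 0.12, 36, 0.01, 0.08))
```

A quick sanity check on the approach: with no fee and no charge-offs, the function should hand back the coupon rate you put in.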

Investors - LC / Re: 2015 & recent loan quality
« on: December 06, 2016, 03:36:08 PM »
I was wondering how much help higher interest rates will be in offsetting the degraded loan quality.
The following table provides an interesting look at cumulative charge off rate versus interest rate.
The numbers look reasonable but I really didn't know what to expect.

Hi Rob, thanks for the great table you put together. Can you explain the calculations you're using to get to annualized return given cumulative charge off rates and interest rates? Thanks!

Investors - LC / Re: Some help locating a loan? 93117022
« on: November 20, 2016, 01:57:00 AM »
An example of dicey is a recent class C application. Most of the numbers were good, but the applicant was requesting a $40K loan for debt consolidation and had $97K in revolving debt. That got to about 75% funded before it timed out. I don't know about others, but for me the fact that the applicant was requesting a maximum loan and the discrepancy between the loan and the debt kept me away from it.


Why did the loan amount being less than the revolving debt amount turn you off of investing?

Investors - LC / Re: 2015 & recent loan quality
« on: October 10, 2016, 11:02:54 PM »
That chart is great but I think we need more clarity into the underlying borrowers. Have the loans written so far in 2016, on average, been assigned higher interest rates that would compensate investors for the higher delinquency performance?

Investors - LC / Re: Charged off account SOLD with no reimbursement to me
« on: September 09, 2016, 05:11:18 PM »
Here's an interesting one, this was a straight roller, never paid a dime of the $5100 borrowed:

7/19/16 (Tuesday) Account was repurchased from a third party debt buyer

Account was repurchased from a third party debt buyer?  Why would that happen?


Maybe they found the loan to be fraudulent? In which case they're obligated to buy back the debt from investors. You might see a 100% recovery on this one. Let us know what happens!

General P2P Lending Discussion / Re: Stoneridge P2P mutual fund (LENDX)
« on: September 06, 2016, 09:54:25 PM »
I don't understand why this fund has loan servicing fees. Isn't the servicing done by the platforms even for whole loan buyers?

Investors - LC / Re: Worst Month Yet
« on: August 13, 2016, 12:10:22 PM »
There is a blog, I think it is lendingrobot, that shows a curve.  His curve is slightly different.  I believe it shows chargeoffs rather than delinquency.  Of course chargeoffs happen a little later than the first nonpayment event, so the bump in that curve is a few months later.

The Lending Robot "Lifetime Distribution" curve doesn't seem to have a bump.
The curve is based only on loans that charged off over a certain period and shows when they did.
It flattens dramatically after age 0.5 (at 18 months for 36 month loans or at 30 months for 60 month loans):

That's because their lifetime distribution curve is a cumulative representation. For instance, compare the two charts on the top of page two of this performance update from Prosper:

The delinquency curve on the top right is not cumulative, it is month-to-month, and this is the type of relationship that Fred was referring to.
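
A toy Python example of why the bump disappears in a cumulative view. The monthly numbers are invented purely for illustration and do not come from Lending Robot, Prosper, or LC:

```python
from itertools import accumulate

# Hypothetical share of all charge-offs that occur in each month of loan age.
monthly = [0.01, 0.03, 0.06, 0.10, 0.14, 0.16, 0.14, 0.12, 0.09, 0.07, 0.05, 0.03]

# Running total: the same data drawn as a cumulative (lifetime distribution) curve.
cumulative = list(accumulate(monthly))

# The month-to-month series has a visible peak...
peak_month = monthly.index(max(monthly)) + 1

# ...while the cumulative series can only rise, so the peak shows up as the
# steepest stretch of the curve rather than as a bump.
never_decreases = all(b >= a for a, b in zip(cumulative, cumulative[1:]))

print(peak_month, never_decreases)
print([round(x, 2) for x in cumulative])
```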
