Author Topic: Don't Just Auto Invest - Proving Systematic Bias at LendingClub  (Read 1324 times)

dacoinminster

Hi! Brand new account on this forum - I hope this is the right place to post this.

I spent the past few weeks diving deep into LendingClub's historical data, and found that they seem to be systematically mis-rating loans. This seems a gift to hedge funds at the expense of retail investors. Perhaps you guys are already well aware of this, but I found it pretty surprising.

Here's my analysis: https://medium.com/@jr.willett/dont-just-auto-invest-proving-systematic-bias-at-lendingclub-c9a9122ec8c1

I'd love to hear what you guys think!

pressure9pa

« Reply #1 on: July 19, 2018, 04:27:05 PM »
Good read - thanks for posting.

I've avoided renters since the opening of my account 2-3 years ago.  My only thought was that if we came into an inflationary environment where returns that once looked decent suddenly looked poor, those borrowers with a hard, generally appreciating asset would at least default less frequently than originally predicted and thus perhaps slightly raise the return more than expected.  I guess I was lucky that the logic behind my selection never really came into play, but the filter worked.

FWIW, my account is ~28 months old, about 75% in B's & C's, and the only consistent filter I've used is home ownership.  My annualized return is at 5.66%, which while not great I think is decent considering my entry timing into this market couldn't have been much worse.

arcee49

« Reply #2 on: July 19, 2018, 07:02:45 PM »
As far as the income verification loans are concerned, see this thread: https://forum.lendacademy.com/index.php/topic,2945.msg26788.html#msg26788

AnilG's theory is simply that lower quality loans are the ones that would require their income to be verified.

I also seem to remember an interview done (probably the Lend Academy Podcast) with someone in P2P lending, maybe even LC.  They basically said the same thing as AnilG.  In addition to that though, they commented that with good borrowers they didn't want to bother them with the hassle of verifying their income and potentially losing their business.

rubicon

« Reply #3 on: July 19, 2018, 07:07:13 PM »
LendingClub is also subject to ECOA (the Equal Credit Opportunity Act), which legally prohibits it from discriminating among certain borrower pools. The most common example is zip code, which could be a proxy for race.


Also, hedge funds can value their portfolios in one of two ways:
- mark to market, e.g. based on current interest rates
- held to maturity (not subject to mark to market), starting off with a loss reserve and making write-ups or write-downs when there's evidence the loss reserve is too conservative or too aggressive. This smooths out the returns stream over time and mitigates the decline in returns that retail investors see in their accounts, since retail accounts only get a mark-down when there's an actual credit event.

« Last Edit: July 19, 2018, 07:11:19 PM by rubicon »

dacoinminster

« Reply #4 on: July 20, 2018, 03:14:38 PM »
Thanks guys. The ECOA angle is very interesting. Funny how regulations can create perverse outcomes sometimes.

AnilG

« Reply #5 on: July 26, 2018, 03:00:26 AM »
Good analysis. Why and how did you decide to pick homeownership, unemployment, and verification attributes for analysis?

You would most probably have a different explanation if you used a covariance matrix to select the attributes to analyze. I believe your findings can easily be explained using the income attribute. Borrower income has the largest influence on loan performance. High-income borrowers (except where unusual/incorrect income numbers are reported) are likely to have a mortgage (not rent), be employed, and have income not verified.
---
Anil Gupta
PeerCube Thoughts blog https://www.peercube.com/blog
PeerCube https://www.peercube.com
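
A minimal sketch of the kind of attribute screening AnilG suggests above. It computes a Pearson correlation matrix (a covariance matrix normalized by the standard deviations, swapped in here so the columns are comparable) over a handful of toy loans. The rows are made up, and the keys annual_inc, rents, verified, and defaulted are illustrative names chosen for this example, not confirmed LendingClub column headers.

```python
import math

# Toy rows with made-up values. The keys are illustrative names,
# not confirmed LendingClub column headers.
loans = [
    {"annual_inc": 30000, "rents": 1, "verified": 1, "defaulted": 1},
    {"annual_inc": 40000, "rents": 1, "verified": 1, "defaulted": 0},
    {"annual_inc": 50000, "rents": 1, "verified": 0, "defaulted": 1},
    {"annual_inc": 80000, "rents": 0, "verified": 1, "defaulted": 0},
    {"annual_inc": 100000, "rents": 0, "verified": 0, "defaulted": 0},
    {"annual_inc": 120000, "rents": 0, "verified": 0, "defaulted": 0},
]
columns = ["annual_inc", "rents", "verified", "defaulted"]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / (spread_x * spread_y)


def correlation_matrix(rows, names):
    """Nested dict: matrix[a][b] is the correlation of column a with b."""
    data = {name: [float(row[name]) for row in rows] for name in names}
    return {a: {b: pearson(data[a], data[b]) for b in names} for a in names}


matrix = correlation_matrix(loans, columns)
for name in columns:
    print(name, [round(matrix[name][other], 2) for other in columns])
```

If income correlates strongly with renting, verification, and default all at once, as it does in this toy data by construction, then findings on the other three attributes may partly be income showing up under different names.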

Fred93

« Reply #6 on: July 26, 2018, 04:26:55 PM »
Quote
I spent the past few weeks diving deep into LendingClub's historical data, and found that they seem to be systematically mis-rating loans.

Welcome to our little club!

You have a sensationalist way of presenting your findings, using words like "systematically mis-rating".

Another way of describing what you found is that LC publishes this huge amount of data for use by investors, and shockingly, there actually is some useful information in this big pile of data.

I don't think you would prefer that they give us only useless data from which no information can be extracted.  Be happy that they give you (somewhat) useful data!
 

Quote
This seems a gift to hedge funds at the expense of retail investors.

I find it strange that you jump to this conclusion.  This data has been available to ALL INVESTORS since the very beginning.  (And there were retail investors several years before there were hedge funds invested in LC products.)  Not only could you download the data and do your own analysis, but there have been over time many different web sites which would use this data to back-test filters for LC investors.  I believe there is now only one.  www.nsrplatform.com   The presence of these web sites means that even a person without programming ability can back-test the value of filtering on various criteria, etc.  It's not a secret.

And I reject the notion, often expressed by fearful LC investors, that hedge fund investors are somehow magically smarter than the rest of us, so they can magically pick the good stuff and leave us the trash.  People who happen to work for a hedge fund aren't some kind of magic folks.  They're just people with a different primary occupation than most of us.  I think you and I and many of the people here are pretty smart folks, so I'm happy to compete with hedge fund weenies.  (We do it in all our other investing, so why not here?)


Quote
Perhaps you guys are already well aware of this, but I found it pretty surprising.

Folks who've been around here awhile are aware, yes.


Quote
Here's my analysis: https://medium.com/@jr.willett/dont-just-auto-invest-proving-systematic-bias-at-lendingclub-c9a9122ec8c1
I'd love to hear what you guys think!

Over the past few years I've read many blog articles where someone did a similar sort of analysis, so while you expressed various points from different angles than others, nothing was a surprise.  Like many folks here, I do my own analysis.  Like you, I don't use the auto-invest feature.

You noted that default rate didn't correspond to whether income had been verified in the fashion you expected.  This is because LC chooses who to verify, based on the other credit criteria.  They've never told us exactly what criteria they use, but one can examine the correlations.  It's LC's choice, not the borrower's choice, so it tells you something about LC, not something about the borrower.  This parameter is unusual for another reason.  It takes time for the verification to complete, and LC lets lenders go ahead and begin investing in loans while the process is ongoing.  Some investors have in the past tried to invest in only loans with verified income, but it is very difficult to do, because sometimes there's no more of the loan available by the time the verification comes in.  So for a couple of reasons this particular variable isn't very useful.
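
A small sketch of the selection effect described above, using made-up numbers rather than real LendingClub results. It assumes each loan is a (grade, verified, defaulted) tuple, which is a layout invented for this example. If verification is concentrated in riskier grades, verified loans can look worse in a pooled comparison even when verification makes no difference within any single grade.

```python
from collections import defaultdict


def make_loans(grade, verified, total, defaults):
    """Build `total` toy loans of one kind, the first `defaults` defaulted."""
    return [(grade, verified, i < defaults) for i in range(total)]


# Made-up counts: verification is rarer in grade B than in grade D,
# but within each grade both groups default at the same rate.
loans = (
    make_loans("B", False, total=8, defaults=2)
    + make_loans("B", True, total=4, defaults=1)
    + make_loans("D", False, total=2, defaults=1)
    + make_loans("D", True, total=8, defaults=4)
)


def rate_by(rows, key):
    """Default rate for each group, where key(row) names the group."""
    tallies = defaultdict(lambda: [0, 0])  # group -> [defaults, total]
    for row in rows:
        tally = tallies[key(row)]
        tally[0] += int(row[2])
        tally[1] += 1
    return {group: d / t for group, (d, t) in tallies.items()}


pooled = rate_by(loans, key=lambda row: row[1])
by_grade = rate_by(loans, key=lambda row: (row[0], row[1]))

print("pooled:", pooled)
print("within grade:", by_grade)
```

In this toy data the pooled rate is higher for verified loans, yet the within-grade rates are identical, so the pooled gap says something about which loans get picked for verification and nothing about the borrowers.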

You wrote "Unfortunately, the data provided by LendingClub does not show borrower credit history at the time they applied for the loan, but rather their current credit history."  That's not right.  The historical files show the credit variables at the time of the loan application. 

On the subject of "mis-rating" as you put it...  LC is driven by many forces.  There are regulatory issues.  There are supply & demand issues on both the borrower and investor side.  Banks are an increasing fraction of LC's investor base, and they like certain kinds of loans.  There is increasing competition from other online loan providers, which acts to keep LC from raising rates on some loans as much as I think they should.  Luckily they give you and me the ability to choose which loans we buy.

So if you went to a car dealer, and he had Fords and Chevys, and you looked at the prices, would you say "He's systematically mis-rating the automobiles!" ?  I don't think so.  You'd say "Ya'know, those Fords are overpriced.  I know 'cause I've looked at the data from NTSB and Consumer Reports, and I asked my uncle Charlie.  So I'm buyin' a Chevy, and he can keep the damn Fords."

« Last Edit: July 26, 2018, 04:28:39 PM by Fred93 »

rawraw

« Reply #7 on: July 27, 2018, 09:19:21 AM »
Great reply above

dacoinminster

« Reply #8 on: July 27, 2018, 12:54:54 PM »
Awesome. Thanks for your thoughts on this. To use your automobile analogy, to me this feels as if JD Power rated Ford highest in owner satisfaction when in fact the numbers all pointed to Chevy having fewer complaints. If the biases in LC data came and went over the years, that would be evidence that they were working to account for these variables properly, but most of the biases have persisted for many years, which seems to me like they are doing it on purpose.

I am very interested in your claim that credit data represents the data at the time they applied for the loan. If that is true, then I have a lot more interesting numbers to run! I based my claim on the fact that last_credit_pull_d (the date of the most recent credit pull) is usually a later date than the issue date for older loans. It appears to me that the credit data is usually fairly recent data, even if the loan is a couple years old. On what do you base your assertion that the historical files show the credit variables at the time of the loan application?

Thanks for your thoughtful response - I really appreciate hearing the perspective of someone who has been in this space for a while!


Fred93

« Reply #9 on: July 27, 2018, 10:05:44 PM »
Quote
I based my claim on the fact that last_credit_pull_d (the date of the most recent credit pull) is usually a later date than the issue date for older loans.

You're making an unwarranted presumption about the meaning of the last credit pull date.

The last credit pull date changes about once per month, as they do the periodic update credit pulls.  I say "about" once a month because they're not on a precise schedule.  The last credit pull date does not apply however to most of the credit fields LC provides you.  Most fields come from the application-time credit pull.

You will be able to see this if you observe over time.  If you own some loans, you can also see the credit data for loans of which you own a piece via the web site.  Via the web site you can also go to the folio secondary market, and see credit history information for any loan that has a piece of the loan for sale.  They even display a graph showing how fico score has changed over time since the beginning of the loan.

In the history files, note that there are FOUR fico-related fields. 
fico_range_high
fico_range_low
last_fico_range_high
last_fico_range_low

The fields that begin with "last" are from the most recent credit pull, which occurred on the last_credit_pull_d.  Only those fields change.  All the other credit fields are static over the life of the loan.
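
To make the distinction concrete, here is a minimal sketch that compares the application-time FICO range with the most recent one for a single loan. The field names are the ones listed above; the values are made up, and treating the midpoint of each range as "the score" is a simplification chosen for this example.

```python
# One loan row as it might come out of csv.DictReader (values are strings).
# Field names are from the thread; the values are made up.
loan = {
    "fico_range_low": "690",
    "fico_range_high": "694",
    "last_fico_range_low": "640",
    "last_fico_range_high": "644",
    "last_credit_pull_d": "Jul-2018",
}


def midpoint(row, low_key, high_key):
    """Middle of a FICO range, used here as a single-number stand-in."""
    return (float(row[low_key]) + float(row[high_key])) / 2


def fico_drift(row):
    """Latest FICO midpoint minus the application-time midpoint."""
    at_application = midpoint(row, "fico_range_low", "fico_range_high")
    latest = midpoint(row, "last_fico_range_low", "last_fico_range_high")
    return latest - at_application


drift = fico_drift(loan)
print(f"FICO drift as of {loan['last_credit_pull_d']}: {drift:+.0f}")
```

A negative drift means the borrower's score has fallen since the application. Only the last_ fields feed the "latest" side, which matches the point above that the other credit fields stay fixed for the life of the loan.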


Quote
On what do you base your assertion that the historical files show the credit variables at the time of the loan application?

Ten years of observation.

LC documentation is not precise or complete, so I don't expect that you will find this written down in any document that comes from LC.

dacoinminster

« Reply #10 on: July 29, 2018, 05:30:46 PM »
This is great information - thank you! The historical files I pulled (downloaded June 19th from here: https://www.lendingclub.com/info/download-data.action) do not have the four fico score fields. I know it shows up in the XLS file describing the fields, but none of the CSV files have the fico score fields in their headers. Do you know of a data source that has all the historical data for download WITH fico scores?

I noticed that when I download my own personal loans, those files DID have the fico score fields, so clearly the data exists somewhere.

I added a footnote in your honor, sir! :)

Fred93

« Reply #11 on: July 29, 2018, 07:37:43 PM »
Quote
This is great information - thank you! The historical files I pulled (downloaded June 19th from here: https://www.lendingclub.com/info/download-data.action) do not have the four fico score fields. I know it shows up in the XLS file describing the fields, but none of the CSV files have the fico score fields in their headers. Do you know of a data source that has all the historical data for download WITH fico scores?

You've stumbled on an obscure undocumented feature. 

Download again, this time make sure you are LOGGED IN to the web site when you download the historical files.  If you are logged in, you get more fields.
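
A quick way to check which kind of file you ended up with is to look for the four FICO fields in the CSV header. The sketch below is one way to do that with the standard library. It scans the first few rows instead of only the first, on the assumption that a file might begin with a preamble line before the real header; the sample headers (id, loan_amnt, grade) are illustrative, not copied from an actual download.

```python
import csv
import io

FICO_FIELDS = {
    "fico_range_high",
    "fico_range_low",
    "last_fico_range_high",
    "last_fico_range_low",
}


def missing_fico_fields(csv_file, rows_to_scan=5):
    """Return the FICO field names not found in the file's header.

    Scans the first few rows and keeps the best match, in case the
    header is not the very first line (an assumption about the layout).
    """
    best = set(FICO_FIELDS)
    for index, row in enumerate(csv.reader(csv_file)):
        if index >= rows_to_scan:
            break
        missing = FICO_FIELDS - {cell.strip() for cell in row}
        if len(missing) < len(best):
            best = missing
    return best


# Illustrative stand-ins for the two kinds of download.
anonymous = io.StringIO("id,loan_amnt,grade\n1,10000,B\n")
logged_in = io.StringIO(
    "id,loan_amnt,grade,fico_range_low,fico_range_high,"
    "last_fico_range_low,last_fico_range_high\n"
    "1,10000,B,690,694,640,644\n"
)

print("anonymous file is missing:", sorted(missing_fico_fields(anonymous)))
print("logged-in file is missing:", sorted(missing_fico_fields(logged_in)))
```

For a real file, pass an open file handle instead of the StringIO objects, for example one opened with open(path, newline="").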

dacoinminster

« Reply #12 on: July 30, 2018, 01:06:34 PM »
I downloaded again while logged in, and confirmed the additional data is now there. Son of a . . . .

This is immensely useful knowledge. I updated my article to include this critical piece of information, and credited you with that as well.

I'd like to point out to anybody else reading this, that by sharing information like this Fred93 may effectively be reducing his own future returns, since the more people that know about the value of this data the more people will exploit it, and the more competition there will be for the best loans. Fred93, I salute you for your selfless sharing of your knowledge!

There is clearly more data worth looking at. I may have to publish a "part 2" of my analysis . . .


rawraw

« Reply #13 on: July 30, 2018, 06:30:06 PM »
Remember that some variables can be predictive of returns but illegal for Lending Club to use in underwriting.  It cannot use variables that impact protected classes or that conflict with other consumer protections.
https://www.fdic.gov/regulations/compliance/manual/4/iv-1.1.pdf

SLCPaladin

« Reply #14 on: July 31, 2018, 01:13:28 PM »
Quote
I'd like to point out to anybody else reading this, that by sharing information like this Fred93 may effectively be reducing his own future returns, since the more people that know about the value of this data the more people will exploit it, and the more competition there will be for the best loans. Fred93, I salute you for your selfless sharing of your knowledge!

Well, that is one way of looking at it. I tend to look at the platform as a whole. The reality is if the median investor isn't doing well, the marketplace lending space will suffer. If the only winners are those who are doing advanced data analytics and "cherry picking" the best loans, and the rest are getting creamed, eventually that will lead to capital outflow. In the long run, the returns have to be attractive for the broad majority of investors, not just those with the special sauce. If they are not, this niche asset class will cease to be viable.