Topic: Additional variables in data & models for loans

breitenm

  • Newbie
  • *
  • Posts: 15
    • View Profile
Additional variables in data & models for loans
« on: October 20, 2012, 04:20:02 PM »
Hi,

Does anybody know when the new variables LC announced in their most recent blog post (http://blog.lendingclub.com/2012/09/28/investor-updates-and-enhancements/) are going to show up in the CSV files for download?

Also, when you guys are picking loans, how do you count loans marked as late (or in grace period) in the LoanStats.csv file? I'm building a model on the data and am currently counting them as bad, but that ignores the fact that many of them do recover, which may make the model's forecasts too pessimistic. Any thoughts?

Markus
My LendingClub Credit Rating Model: http://cervisia.org/lc_credit/

Peter

  • Administrator
  • Hero Member
  • *****
  • Posts: 754
    • View Profile
    • Lend Academy
    • Email
Re: Additional variables in data & models for loans
« Reply #1 on: October 20, 2012, 07:53:49 PM »
Most people who build models for Lending Club and Prosper using the entire database apply a discounting method to late loans. You can see the Lendstats model here:
http://www.lendstats.com/loansearch/lc/lcloanfilter.php
They use loss factors of 0.5 for loans on payment plans, 0.25 for loans in the grace period, 0.5 for loans 16-30 days late, 0.75 for loans 31-120 days late, and 0.99 for defaults.
Others, such as Interest Radar, use the Lending Club recovery rate data, which is more optimistic than Lendstats:
https://www.lendingclub.com/info/statistics-performance.action
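
As a rough illustration of the discounting approach, the Lendstats loss factors quoted above can be applied loan by loan to estimate a discounted loss. This is only a sketch: the status strings, the `expected_loss` helper, and the sample balances are assumptions made up for illustration, not the actual LoanStats.csv labels or Lendstats' own code.

```python
# Loss factors as quoted in the post above; the status names used as keys
# are hypothetical and may not match the labels in LoanStats.csv.
LOSS_FACTORS = {
    "Payment Plan": 0.50,
    "In Grace Period": 0.25,
    "Late (16-30 days)": 0.50,
    "Late (31-120 days)": 0.75,
    "Default": 0.99,
}


def expected_loss(status, outstanding_principal):
    """Return the discounted loss for one loan.

    Statuses without a listed factor (e.g. current loans) are assumed
    to carry no loss in this sketch.
    """
    return LOSS_FACTORS.get(status, 0.0) * outstanding_principal


# Made-up example portfolio: (status, outstanding principal in dollars).
loans = [
    ("Current", 1000.0),
    ("In Grace Period", 400.0),
    ("Late (31-120 days)", 200.0),
]
total = sum(expected_loss(status, principal) for status, principal in loans)
print(total)  # 0.25 * 400 + 0.75 * 200 = 250.0
```

Swapping in a different table of factors (for example, one derived from the Lending Club recovery data) would only require changing `LOSS_FACTORS`.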
Publisher of the Lend Academy blog

See my returns here: http://www.lendacademy.com/returns

brycemason

  • Hero Member
  • *****
  • Posts: 801
    • View Profile
    • P2P-Picks.com
    • Email
Re: Additional variables in data & models for loans
« Reply #2 on: October 21, 2012, 03:40:16 AM »
One way to go is to model only on loans that have termed out. No discounting is necessary. The downside is that you're three years out of date.
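
A minimal sketch of that filter, assuming each loan record carries an issue date and a term in months. The field names `issue_date` and `term_months`, the helper functions, and the sample loans are all hypothetical, not actual LoanStats.csv columns or data:

```python
from datetime import date


def months_between(start, end):
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1
    return months


def has_termed_out(loan, as_of):
    """True if the loan's full term has elapsed as of the given date."""
    return months_between(loan["issue_date"], as_of) >= loan["term_months"]


# Made-up loans for illustration.
loans = [
    {"id": 1, "issue_date": date(2009, 6, 15), "term_months": 36},
    {"id": 2, "issue_date": date(2011, 3, 1), "term_months": 36},
]
as_of = date(2012, 10, 21)
termed_out = [loan for loan in loans if has_termed_out(loan, as_of)]
print([loan["id"] for loan in termed_out])  # [1]
```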

breitenm

Re: Additional variables in data & models for loans
« Reply #3 on: October 24, 2012, 12:32:52 PM »
Does the LendStats model use historical data? It seems that the data available for download shows only the current status of each loan and does not include historical status changes. For example, I just dug out old loan files from September and October, and LoanID 1024323 went from Late31-120 to Late16-30 in September and back to Current in October. Unless these changes over time are accounted for and, say, the "worst" state over the lifetime of the loan is used in the model, the discounting factor of the loan would change all the time.

Does anybody know of a way to get old versions of the LoanStats files?
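
One way to make the "worst state" idea concrete, given several dated snapshots of the file, is to rank statuses by severity and keep the most severe one ever seen for each loan. The sketch below is illustrative only: the severity ordering is an assumption, only Late31-120, Late16-30, and Current are status names taken from this post, each snapshot is represented as a hypothetical loan-ID-to-status mapping, and loan 42 is made up.

```python
# Assumed severity ranking (higher = worse); the ordering and any status
# names beyond those quoted in the post are illustrative guesses.
SEVERITY = {
    "Current": 0,
    "In Grace Period": 1,
    "Late16-30": 2,
    "Late31-120": 3,
    "Default": 4,
}


def worst_status_per_loan(snapshots):
    """Given a list of {loan_id: status} snapshots, return the most
    severe status each loan has ever shown."""
    worst = {}
    for snapshot in snapshots:
        for loan_id, status in snapshot.items():
            previous = worst.get(loan_id)
            if previous is None or SEVERITY[status] > SEVERITY[previous]:
                worst[loan_id] = status
    return worst


# Made-up snapshots; loan 1024323 follows the sequence described above.
snapshots = [
    {1024323: "Late31-120", 42: "Current"},
    {1024323: "Late16-30", 42: "Current"},
    {1024323: "Current", 42: "Current"},
]
print(worst_status_per_loan(snapshots))
# {1024323: 'Late31-120', 42: 'Current'}
```

With this kind of aggregation, a loan that recovers to Current would still carry its worst observed status, so its discount factor would no longer jump around between snapshots.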

Peter

Re: Additional variables in data & models for loans
« Reply #4 on: October 24, 2012, 01:56:32 PM »
You are right that once a loan goes back to current, Lendstats and others treat it as if it had always been current. But we know that it has a higher likelihood of default than a loan that has never been late.

There is no way to obtain old versions of LoanStats.csv; it is updated every day. I have about a dozen downloaded versions on my computer, dating back to 2010, so I can see how things change over time.

breitenm

Re: Additional variables in data & models for loans
« Reply #5 on: October 25, 2012, 01:14:38 AM »
Could you put them in a zip file somewhere? :-) Then I can update my model to account for loans that were paying late.

Peter

Re: Additional variables in data & models for loans
« Reply #6 on: October 25, 2012, 06:54:23 PM »
Actually, I don't mind at all. Here is a link to a zip file with five different LoanStats files from 2010 and 2011. Warning: the file is 85 MB.
https://www.dropbox.com/s/mh8jl5lh5dfhpzu/LoanStatsArchive.zip

Let me know what your analysis finds.

breitenm

Re: Additional variables in data & models for loans
« Reply #7 on: October 27, 2012, 02:50:44 PM »
Sweet! Thank you very much. I'll look into it and will report back :-)

The loan ratings my current model comes up with (updated occasionally) are at http://cervisia.org/lc_credit/, btw. The model uses a variety of features I've derived from the data (among them some from the loan description), but it is a simple binary model that counts late/grace/default loans as bad and fully paid loans as good (loans that are current aren't used). Given the recovery rate of loans, it is probably too conservative in its estimates.
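
For concreteness, the labeling rule described above could look roughly like the following. The status strings and the `label_loan` helper are illustrative assumptions, not the model's actual code or the exact LoanStats.csv labels:

```python
# Hypothetical status names; the real LoanStats.csv labels may differ.
BAD_STATUSES = {"Late16-30", "Late31-120", "In Grace Period", "Default"}
GOOD_STATUSES = {"Fully Paid"}


def label_loan(status):
    """Return 1 for a bad loan, 0 for a good loan, and None for loans
    (such as current ones) that are left out of the training set."""
    if status in BAD_STATUSES:
        return 1
    if status in GOOD_STATUSES:
        return 0
    return None


statuses = ["Fully Paid", "Current", "Late31-120", "In Grace Period"]
labels = [label_loan(s) for s in statuses]
training = [(s, y) for s, y in zip(statuses, labels) if y is not None]
print(training)
# [('Fully Paid', 0), ('Late31-120', 1), ('In Grace Period', 1)]
```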
« Last Edit: October 27, 2012, 02:52:17 PM by breitenm »

Peter

Re: Additional variables in data & models for loans
« Reply #8 on: October 29, 2012, 04:43:49 PM »
Thanks. Look forward to your results. And thanks for the URL to your model - I have seen something similar developed by other investors.

breitenm

Re: Additional variables in data & models for loans
« Reply #9 on: November 17, 2012, 04:21:22 PM »
Quick update: it looks like using the archived files to account for past lateness skewed my model into predicting better (more accurate?) results for loans in lower credit tranches. It's a bit odd, and I had to change my definition of bad loans to exclude loans in the grace period. It looks like almost everyone misses a payment every now and then.