Topic: LC loan number comparison

hoggy1

  • Sr. Member
  • Posts: 401
LC loan number comparison
« on: July 03, 2014, 02:52:06 PM »
I recently downloaded all 3 of LC's historical loan files and merged them to get a loan count of 275066 through 3/31/2014. When I back-filter LC loans on NS and cut off loans after 3/31/2014, I get a loan count of 277814. I know the difference is small and unlikely to affect my statistics significantly, but it makes me nervous when I can't get even a stat this simple to agree with what I thought was the same original data set.
Are there LC loans that have been purged for some reason that NS captured in their regular downloads and has preserved?
I'm confused.
Steve

Fred93

  • Hero Member
  • Posts: 1907
Re: LC loan number comparison
« Reply #1 on: July 03, 2014, 03:02:32 PM »
My database, built from the 3 file download from LC, contains 277814 loans.

hoggy1

  • Sr. Member
  • Posts: 401
Re: LC loan number comparison
« Reply #2 on: July 03, 2014, 03:36:13 PM »
Thanks. Maybe I screwed up the merge or had some other kind of problem with Excel. I'll double check.
Steve

hoggy1

  • Sr. Member
  • Posts: 401
Re: LC loan number comparison
« Reply #3 on: July 03, 2014, 06:13:59 PM »
OK Fred,

Help me some more if you can. When you just append all the files you get 296879 loans. If you sort by status you find 19065 with no status issued from 7/12/2013 all the way through 3/31/2014. Removing these gives you the correct number of loans 277814.

Can you tell me what these loans are or were?

Anyway, I had removed these. My missing loans arose because I also removed loans whose status begins "Does not meet the credit policy. Status:" followed by current, or paid in full, etc., because I didn't know how to interpret this status. That is where my missing 2748 loans went.

These are not quiet period loans. Can you tell me how this "does not meet ..." status arises and what it means?
Steve
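For anyone repeating this reconciliation, here is a minimal Python sketch of the counting step. It assumes the status column is named `loan_status` (not confirmed in this thread), and the two small in-memory samples are made up to stand in for the real downloads. Only the prefix string and the totals in the final checks come from the posts above.

```python
import csv
import io

# Prefix quoted in the post above.
PREFIX = "Does not meet the credit policy. Status:"

def count_by_status_kind(files, status_col="loan_status"):
    """Append several CSV files and count rows by kind of status value.

    status_col is an assumed column name, not confirmed in the thread.
    """
    counts = {"total": 0, "blank": 0, "prefixed": 0, "plain": 0}
    for f in files:
        for row in csv.DictReader(f):
            status = (row.get(status_col) or "").strip()
            counts["total"] += 1
            if not status:
                counts["blank"] += 1       # rows with no status at all
            elif status.startswith(PREFIX):
                counts["prefixed"] += 1    # "Does not meet..." rows
            else:
                counts["plain"] += 1
    return counts

# Made-up samples standing in for the downloaded files.
sample_a = io.StringIO("id,loan_status\n1,Current\n2,Fully Paid\n")
sample_b = io.StringIO(
    "id,loan_status\n"
    "3,\n"
    "4,Does not meet the credit policy. Status:Current\n"
)
counts = count_by_status_kind([sample_a, sample_b])
print(counts)

# The figures quoted in this thread reconcile the same way:
assert 296879 - 19065 == 277814   # appended rows minus blank-status rows
assert 277814 - 2748 == 275066    # minus the "Does not meet..." rows
```

Counting each kind separately, rather than filtering in a spreadsheet, makes it easier to see which removal step accounts for a gap between two totals.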

Fred93

  • Hero Member
  • Posts: 1907
Re: LC loan number comparison
« Reply #4 on: July 03, 2014, 06:44:32 PM »
Quote
Help me some more if you can. When you just append all the files you get 296879 loans. If you sort by status you find 19065 with no status issued from 7/12/2013 all the way through 3/31/2014. Removing these gives you the correct number of loans 277814.

Can you tell me what these loans are or were?

I'm not in front of my database at the moment, but look at the policy column.  Policy=2 loans are some new lower credit grade (or something) loans they're not offering to us at this time, and they've blocked out many fields.  I throw those out, for obvious reasons.

Quote
My missing loans arose because I also removed loans whose status begins "Does not meet the credit policy. Status:" followed by current, or paid in full, etc because I didn't know how to interpret this status. That is where my missing 2748 loans went.

These are not quiet period loans. Can you tell me how this "does not meet ..." status arises and what it means?

Oh yes.  That's a horrible hack where they overloaded a column with two different kinds of information.  Ok for humans maybe but the most horrible sort of nonsense for computers to read!  When people at my company do this I yell at them. 

I removed the "does not meet" substring, and left the loans in the set.  LC hasn't said anywhere public what these loans are, but someone here talked with LC about this, and there's a bit of an explanation in some message here somewhere.  Perhaps you could search for "does not meet...".  If I remember correctly it means something like the credit policy was updated after this loan was listed but before it issued.  The credit policy changes all the time I imagine, so this is something I believe we should ignore.  The fact that someone at LC thought it was appropriate to throw in some words into a column of data, and thought it was obvious and didn't require explanation is just another example of how their people don't have the same perspective as the consumers of this data, and they would benefit by talking with us more.
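A minimal Python sketch of removing the "does not meet" substring while keeping the loan might look like the following. The function name `split_status` and the idea of keeping the flag as a separate value are illustrative assumptions, not something described in this thread. Only the prefix string comes from the posts above.

```python
# Prefix quoted earlier in this thread.
PREFIX = "Does not meet the credit policy. Status:"

def split_status(raw):
    """Split an overloaded status string into (status, had_prefix).

    had_prefix is True when the "Does not meet..." text was present.
    """
    text = raw.strip()
    if text.startswith(PREFIX):
        return text[len(PREFIX):].strip(), True
    return text, False

print(split_status("Does not meet the credit policy. Status:Current"))
print(split_status("Current"))
```

Returning the flag separately keeps both pieces of information, so the loans stay in the set and can still be filtered on the policy note later if that turns out to matter.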

hoggy1

  • Sr. Member
  • Posts: 401
Re: LC loan number comparison
« Reply #5 on: July 04, 2014, 08:18:33 AM »
Thanks so much. You're my hero!
Steve

Rob L

  • Hero Member
  • Posts: 1768
Re: LC loan number comparison
« Reply #6 on: July 23, 2014, 10:32:21 AM »
I have some minor discrepancies with NSR and my numbers regarding sub-grade. Grade numbers are all the same.
NSR has the following differences: A1 +2 loans, A5 -1 loan, B2 -1 loan, B5 +1 loan, C2 -2 loans and finally C2 +1 loan.
Total the differences and it's 0 so the total loan count agrees.
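A comparison like this can be sketched in Python with `collections.Counter`. The two short lists below are made-up stand-ins for the sub-grade column of each source, and the function name `subgrade_diff` is hypothetical.

```python
from collections import Counter

def subgrade_diff(mine, theirs):
    """Return {sub_grade: theirs_count - mine_count} for sub-grades that differ."""
    a, b = Counter(mine), Counter(theirs)
    return {k: b[k] - a[k] for k in sorted(set(a) | set(b)) if b[k] != a[k]}

# Made-up sample data: same number of loans, different sub-grade labels.
mine = ["A1", "B2", "C4", "B2"]
theirs = ["A1", "A1", "A1", "B2"]

diff = subgrade_diff(mine, theirs)
print(diff)
print(sum(diff.values()))
```

A net difference of zero only shows that the total loan counts agree. It says nothing about which individual loans carry a different sub-grade in the two sources.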

If you go to NSR back testing analytics and filter on A1 sub-grade only, the results show all the A1's and also one B grade and one C grade.
Clearly something is amiss.

rocco.g

  • Jr. Member
  • Posts: 53
    • Nickel Steamroller
Re: LC loan number comparison
« Reply #7 on: July 23, 2014, 05:00:05 PM »
Quote
If you go to NSR back testing analytics and filter on A1 sub-grade only, the results show all the A1's and also one B grade and one C grade.
Clearly something is amiss.

Maybe we are using different data sources, but I am seeing the data messed up in the CSV we downloaded from Lending Club.  In the LoanStats3b_securev1.csv file the notes 5628625 and 6300518 are both 'A1' sub grades, but 6300518 has 'B' for the grade and 5628625 has 'C' for the grade.  This is why you see the B and C notes when doing an A1 sub grade search.  It is what Lending Club is sending out.  I agree that this is amiss, but it should be amiss for everyone...

Rob L

  • Hero Member
  • Posts: 1768
Re: LC loan number comparison
« Reply #8 on: July 23, 2014, 05:27:06 PM »
Interesting. Mystery solved. We are using the same data source, however...
I only use sub-grade and never look at the grade column. Always figured an X3 loan was grade X.
Grade and sub-grade contain redundant information, not a good thing to have in a database.
Somehow it always gets out of sync.
BTW, thanks for your fantastic web site. Very nice piece of work!

Rob L

  • Hero Member
  • Posts: 1768
Re: LC loan number comparison
« Reply #9 on: July 23, 2014, 08:56:41 PM »
LC must have cleaned up the LoanStats3b_securev1.csv file since you downloaded it.
In my copy loan 6300518 is Grade B and Sub_grade B2 and loan 5628625 is Grade C and Sub_grade C4.
I scanned the entire database (LoanStats3a_securev1.csv, LoanStats3b_securev1.csv and LoanStats3c_securev1.csv) and found no inconsistencies between Grade and Sub_grade.
Nothing out of sync.

rocco.g

  • Jr. Member
  • Posts: 53
    • Nickel Steamroller
Re: LC loan number comparison
« Reply #10 on: July 24, 2014, 10:02:14 PM »
Something weird is going on.  I downloaded the file again at noon today and it is still showing the inconsistencies for me.

brother7

  • Full Member
  • Posts: 130
  • Aloha!
Re: LC loan number comparison
« Reply #11 on: July 24, 2014, 11:15:34 PM »
I just downloaded the files on 7/24/2014 at 8:10pm Pacific time. I concur with rocco.g. The inconsistencies are still there.

In file LoanStats3b_securev1.csv on row 77685, id = 6300518 has grade = B and sub_grade = A1.
In the same file on row 97444, id = 5628625 has grade = C and sub_grade = A1.

Rob L

  • Hero Member
  • Posts: 1768
Re: LC loan number comparison
« Reply #12 on: July 25, 2014, 03:01:25 AM »
My existing database must be older:
From my existing database:

"5628625","7011021","20000","20000","20000"," 60 months"," 16.29%","489.45","C","C4","Froedtert Memorial Lutheran Hospital ",
"6300518","7831802","24000","24000","23950"," 36 months"," 10.64%","781.65","B","B2","Quest Diagnostics"

From a newly loaded database:

"5628625","7011021","20000","20000","20000"," 60 months"," 16.29%","489.45","C","A1","Froedtert Memorial Lutheran Hospital ",
"6300518","7831802","24000","24000","23950"," 36 months"," 10.64%","781.65","B","A1","Quest Diagnostics"

I'm not making this up. How could it have happened?
I could understand the data being fixed in a later release, but here it is the later release that is corrupted.
As you said; weird!
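One way to pin down changes like this is to compare two downloads of the same file field by field, keyed on the loan id. The Python sketch below assumes the column names `id`, `grade` and `sub_grade` used earlier in this thread, and trims the two rows above down to those three fields. The helper name `changed_fields` is hypothetical.

```python
import csv
import io

def changed_fields(old_file, new_file, key="id"):
    """Return a list of (key, column, old_value, new_value) for loans in both files."""
    old = {row[key]: row for row in csv.DictReader(old_file)}
    changes = []
    for row in csv.DictReader(new_file):
        before = old.get(row[key])
        if before is None:
            continue  # loan only exists in the newer file
        for col, value in row.items():
            if col in before and before[col] != value:
                changes.append((row[key], col, before[col], value))
    return changes

# The two loans quoted above, reduced to three columns.
old_csv = io.StringIO(
    '"id","grade","sub_grade"\n"5628625","C","C4"\n"6300518","B","B2"\n'
)
new_csv = io.StringIO(
    '"id","grade","sub_grade"\n"5628625","C","A1"\n"6300518","B","A1"\n'
)
for change in changed_fields(old_csv, new_csv):
    print(change)
```

Keeping each dated download and diffing it against the previous one would show whether a value was fixed or broken between releases.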

Rob L

  • Hero Member
  • Posts: 1768
Re: LC loan number comparison
« Reply #13 on: July 25, 2014, 03:21:03 AM »
A scan of the newly downloaded database reveals 6 Grade / Sub_grade mismatch errors:

"8957001","10749045","15000","15000","15000"," 36 months"," 15.61%","524.48","C","A5","Senior Marketing Manager"
"8494880","10247000","15000","15000","15000"," 36 months","  8.90%","476.3","A","B4","Paralegal"
"6300518","7831802","24000","24000","23950"," 36 months"," 10.64%","781.65","B","A1","Quest Diagnostics"
"5977589","7450018","16000","16000","16000"," 36 months"," 12.35%","534.11","B","C2","Nationwide"
"5628625","7011021","20000","20000","20000"," 60 months"," 16.29%","489.45","C","A1","Froedtert Memorial Lutheran Hospital "
"2895818","3528124","6000","6000","6000"," 36 months"," 14.33%","206.03","C","B5","Active Motif"

These must be the 6 count differences I posted a couple of days ago.
Doesn't inspire a lot of confidence, does it?
There could be plenty of other errors that don't show up this way; for example, a loan listed as B / B1 that should be B / B4.
Have to cross check with the interest rate (at that date) to be sure the Grade and Sub_grade are correct.
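A scan of this kind can be sketched in a few lines of Python. The six flagged rows below use the id, grade and sub_grade values listed above. The final row is a made-up consistent loan, and the function name `grade_mismatches` is hypothetical.

```python
def grade_mismatches(rows):
    """Return rows whose sub_grade letter differs from the grade column."""
    return [r for r in rows if r["sub_grade"][:1] != r["grade"]]

rows = [
    {"id": "8957001", "grade": "C", "sub_grade": "A5"},
    {"id": "8494880", "grade": "A", "sub_grade": "B4"},
    {"id": "6300518", "grade": "B", "sub_grade": "A1"},
    {"id": "5977589", "grade": "B", "sub_grade": "C2"},
    {"id": "5628625", "grade": "C", "sub_grade": "A1"},
    {"id": "2895818", "grade": "C", "sub_grade": "B5"},
    {"id": "0000000", "grade": "B", "sub_grade": "B2"},  # made-up consistent row
]

bad_ids = [r["id"] for r in grade_mismatches(rows)]
print(len(bad_ids), bad_ids)
```

This check only compares the letter of the sub-grade with the grade, so a loan listed as B / B1 that should be B / B4 passes it. Catching that case would need the interest rate schedule for the issue date, which is not given in this thread.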

Fred93

  • Hero Member
  • Posts: 1907
Re: LC loan number comparison
« Reply #14 on: July 25, 2014, 06:41:56 AM »
Are you reporting these finds to LC?