Messages - sociallender

271
Investors - LC / Re: Automatic order creation and selection
« on: February 14, 2013, 11:47:26 AM »
@yojoakak

Good suggestion.  I am working on that for today's run.

@zpbsfg

(For my other stock market blog.)  The second column is my system (weekly market trades) and the third column is the benchmark, the S&P 500.  So for example, over the past 99 weeks, my system has a compound return of 64% while the S&P over that same period is at 13%.  You can take a look at the performance page for more of the trades and the breakdown.

@breitenm

Sadly my stat package (Weka) does not have the ability to display the RF trees.  Even if it did, a random forest uses multiple trees, so determining the importance of a single attribute would also need to be coded.  It just doesn't have that function to my knowledge.  However, I did run an attribute evaluator (InfoGain with Ranker), with the following results (descending order of importance):

Ranked attributes:
 0.033125    4 credit_grade
 0.031705    2 interest_rate
 0.020555    3 loan_length
 0.017653    9 fico_range
 0.013517   14 revolving_line_utilization
 0.009888    5 loan_purpose
 0.007297    8 monthly_income
 0.005468   15 inquiries_in_the_last_6
 0.004476    6 debt-to-income_ratio
 0.003225   21 months_since_last_record
 0.003147   20 public_records_on_file
 0.00206    22 employment_length
 0.001964   12 total_credit_lines
 0.001875   11 open_credit_lines
 0.001812    1 amount_requested
 0.001585   10 earliest_credit_line
 0.00086    23 desc_len
 0.000705    7 home_ownership
 0          13 revolving_credit_balance
 0          19 months_since_last_delinquency
 0          18 delinquencies_(last_2_yrs)
 0          16 accounts_now_delinquent
 0          17 delinquent_amount
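
For anyone curious what an information-gain ranking like the one above measures, here is a minimal sketch in plain Python. It is not Weka's code, and the toy rows are made up for illustration (they are not LC data). It assumes nominal attributes; numeric ones would have to be binned first. The score is the class entropy minus the class entropy that remains once the attribute's value is known.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def info_gain(values, labels):
    """Information gain of a nominal attribute: H(class) - H(class | attribute)."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Made-up toy data, for illustration only.
grades = ["A", "A", "B", "B", "C", "C"]
terms = ["36", "60", "36", "60", "36", "60"]
status = ["paid", "paid", "paid", "bad", "bad", "bad"]

scores = {
    "credit_grade": info_gain(grades, status),
    "loan_length": info_gain(terms, status),
}
# Rank attributes from most to least informative, like the Ranker output.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for name, score in ranked:
    print(f"{score:.6f}  {name}")
```

An attribute that scores 0, like the last five in the list above, leaves the class entropy unchanged under this measure.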

Also, if I understand the time question, you are interested in knowing the accuracy using a percentage split of the training set instead of cross-validation folds?  I would have to include the date to get a correct split (no dates are used in the current training set).  However, the loans are sorted oldest to newest in the training file, so I did a 66% split with the remaining 34% (presumably the most recent loans) held out as the test set.  The precision on true positives is still 94%: of 6526 instances in the test set, 1429 loans were classified as good, and of these 1429, 1351 were correct and 78 were incorrect.  This is consistent with 10-fold cross validation.  Hope that answers your question.
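
To make the arithmetic explicit, here is a small Python sketch of an ordered 66% split and the precision calculation. The function names are made up for illustration and the list of integers just stands in for loans sorted oldest to newest. Only the counts 1351 and 78 come from the run described above.

```python
def chronological_split(rows, train_fraction=0.66):
    """Split rows that are already sorted oldest to newest.

    The oldest rows become the training set and the newest rows the test set.
    """
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def precision(true_positives, false_positives):
    """Fraction of loans predicted 'good' that really were good."""
    return true_positives / (true_positives + false_positives)

# Stand-in data: 100 placeholder "loans" in date order.
train, test = chronological_split(list(range(100)))

# Counts from the post: 1429 predicted good, 1351 correct, 78 incorrect.
p = precision(1351, 78)
print(f"{len(train)} train / {len(test)} test, precision = {p:.3f}")
```

1351 / 1429 works out to roughly 94.5%, which matches the 94% figure quoted.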

@New Jersey Guy

Mmmm... I love peanuts and fruitcake!  I just did a simple pivot table in Excel of the loanStats.csv file that you can download from lendingclub.com.  However, I need to qualify that loans were categorized as charged off if their status was one of:

Charged Off
Default
Does not meet the current credit policy  Status: Charged Off
Does not meet the current credit policy  Status: Default

Unless I messed something up, these are the default rates for each loan grade, and the average is 18% across all loan grades.  If I am right (someone please confirm), then it pays to select your loans wisely.
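
For anyone who wants to redo the pivot outside of Excel, here is a rough Python sketch of the same grouping. The (grade, status) pairs are made-up sample rows, not values from loanStats.csv, and the function name is hypothetical. It assumes every loan in a grade counts in the denominator, which may not match how the pivot table was set up.

```python
from collections import defaultdict

# Statuses counted as charged off, per the list in the post
# (whitespace is normalized before comparing).
BAD_STATUSES = {
    "Charged Off",
    "Default",
    "Does not meet the current credit policy Status: Charged Off",
    "Does not meet the current credit policy Status: Default",
}

def default_rate_by_grade(rows):
    """Return {grade: share of loans whose status counts as charged off}."""
    totals = defaultdict(int)
    bad = defaultdict(int)
    for grade, status in rows:
        totals[grade] += 1
        if " ".join(status.split()) in BAD_STATUSES:
            bad[grade] += 1
    return {grade: bad[grade] / totals[grade] for grade in sorted(totals)}

# Made-up sample rows, for illustration only.
sample = [
    ("A", "Fully Paid"),
    ("A", "Charged Off"),
    ("A", "Current"),
    ("A", "Fully Paid"),
    ("B", "Default"),
    ("B", "Does not meet the current credit policy  Status: Charged Off"),
    ("B", "Fully Paid"),
    ("B", "Fully Paid"),
]
rates = default_rate_by_grade(sample)
for grade, rate in rates.items():
    print(f"{grade}: {rate:.0%}")
```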

John


272
Investors - LC / Re: Automatic order creation and selection
« on: February 14, 2013, 01:04:07 AM »
zpbsfg,

The loans are modeled using LC's historical database.  Unfortunately, it is very difficult to explain the rules created by the model.  Many algorithms, such as neural networks, are black boxes with no way to tell (well, actually there are some newer techniques that can for some of them).  Other algos such as decision trees can show the process, but with this number of attributes the trees contain hundreds of nodes.  In my case, I tested many algos such as SVMs, Bayes, linear regression and NNs, and settled on random forests with cost adjustment for imbalanced data.  My stat package provides attribute selection as well as precision metrics.  With 10-fold cross validation, the model seems to do a good job generalizing on true positives (good loans), with precision at approx 93%.  However, many loans are discarded due to the cost penalization even though they do become fully paid.  The idea for me was to only choose loans that have a good chance of being paid in full, even if that means many false negatives (good loans that were incorrectly classified as bad) get discarded.
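
To illustrate the trade-off, here is one common way a cost adjustment can be expressed: move the probability cutoff according to how much worse it is to accept a bad loan than to reject a good one. This is only a sketch of the general idea, not the actual model or the stat package's implementation. The function names and the 10:1 cost ratio are made up.

```python
def good_loan_threshold(cost_accept_bad, cost_reject_good):
    """Minimum predicted probability of 'good' needed to accept a loan.

    Accepting is the cheaper choice when
    (1 - p) * cost_accept_bad < p * cost_reject_good,
    i.e. when p > cost_accept_bad / (cost_accept_bad + cost_reject_good).
    """
    return cost_accept_bad / (cost_accept_bad + cost_reject_good)

def classify(prob_good, cost_accept_bad=10.0, cost_reject_good=1.0):
    """Label a loan 'good' only if its probability clears the cost-based cutoff."""
    threshold = good_loan_threshold(cost_accept_bad, cost_reject_good)
    return "good" if prob_good > threshold else "bad"

# Hypothetical predicted probabilities from some classifier.
for prob in (0.95, 0.85, 0.60):
    print(prob, classify(prob))
```

With these made-up costs the cutoff is 10/11, about 0.91, so a loan with an 85% chance of being paid is still rejected.  That is the kind of discarding described above: fewer loans pass, but the ones that do are more likely to be paid in full.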

The stock market system is a completely different methodology using neural networks.  I am working on an overview document of how it works in detail.  I hope to have it posted there in the next few weeks.

Thanks for pointing out the link wasn't working.  It was not my intention.  I just changed it to point to the folder of the spreadsheet.  I am not quite sure why gdocs is giving me issues.  I am using googlecl to upload the docs but it doesn't seem to be putting them in the sub folder correctly.  I have to manually move them and I think that may be the issue.  The folder link should be working though (and the icon now points to the folder).

John

273
Investors - LC / Automatic order creation and selection
« on: February 14, 2013, 12:08:21 AM »
Hello everyone,

I am new to lendingclub but pretty good with numbers and programming.  I started an account at LC and found it too time consuming to choose good loans and then create an order including each loan.  So, I created some software that statistically data mines the loans.  I also created a Windows application to allow users of LC to easily create an order without having to manually click on the URL of each loan.

I just started with LC so not sure how the loan selection is going to turn out but thought some of you may be interested in trying out the software (in beta test now) as well as the loan selection process.  The site is still under construction but the URL is sociallender.blogspot.com

I will be updating the site daily with the loans that meet the statistical model's criteria.  I currently use regression with penalization to create strict selection criteria.

For those that are interested in my other investment venture, I have a stock market system that I have implemented using similar modelling techniques (NNs) with over a year of production history.  It is assetclassta.blogspot.com.  As you can see, I enjoy numbers :)

John



