Backtesting against the historical loan data is the way to go. Statistics, data mining, and pattern recognition can help in identifying important factors and combinations of factors.

In my opinion, there are two types of analysis. In the first, someone throws different algorithms at a set of data and sees what comes out. The blogger who used a genetic algorithm on LC historical data is in this camp. In the second, someone explores the data first to understand what the different factors are, how they are related, and how they impact the results. I am in the second camp, as I prefer to understand the factors and explore the relationships before developing quantitative models. If you are interested in learning more about statistics, I recommend starting with the textbook "Practical Business Statistics" by my stats professor, Andrew Siegel.

Based on my analysis, some of the important factors for lowering defaults are Loan Grade, Interest Rate, Borrower's Location, Loan Purpose (its impact is declining with time, as I believe borrowers are manipulating it), Revolving Credit Utilization, Monthly payment to monthly income ratio, and Months since last delinquency and public record.
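To make the backtesting idea concrete, here is a minimal sketch in Python of how a filter built from a few of these factors could be compared against the full loan history. It is not the method I use on my blog or at PeerCube. The `Loan` fields, the `conservative` filter, its thresholds, and the six records are all made up for illustration; real LC historical data has its own column names and would need cleaning first.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Loan:
    # Hypothetical field names, not the actual LC data columns.
    grade: str
    revol_util: float         # revolving credit utilization, in percent
    payment_to_income: float  # monthly payment / monthly income
    defaulted: bool


def default_rate(loans: Iterable[Loan]) -> float:
    """Fraction of loans that defaulted (NaN if there are no loans)."""
    loans = list(loans)
    if not loans:
        return float("nan")
    return sum(loan.defaulted for loan in loans) / len(loans)


def backtest(loans: Iterable[Loan], loan_filter: Callable[[Loan], bool]) -> dict:
    """Compare the default rate of filtered loans with that of all loans."""
    loans = list(loans)
    selected = [loan for loan in loans if loan_filter(loan)]
    return {
        "all": default_rate(loans),
        "selected": default_rate(selected),
        "n_selected": len(selected),
    }


# Made-up records, for illustration only.
history = [
    Loan("A", 20.0, 0.05, False),
    Loan("B", 35.0, 0.08, False),
    Loan("C", 80.0, 0.15, True),
    Loan("D", 90.0, 0.20, True),
    Loan("B", 40.0, 0.07, True),
    Loan("A", 10.0, 0.04, False),
]


def conservative(loan: Loan) -> bool:
    # Arbitrary example thresholds, not recommendations.
    return (
        loan.grade in {"A", "B"}
        and loan.revol_util < 50.0
        and loan.payment_to_income < 0.10
    )


result = backtest(history, conservative)
print(result)
```

With real data you would also want to check how many loans pass the filter, since a low default rate on a handful of loans does not tell you much.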

You may also want to check out my blog, Random Thoughts, at http://andirog.blogspot.com, where I have been exploring different factors and their impact on defaults. As a conservative contrarian investor, I am biased toward reducing risk first and maximizing return second. Based on prior research in consumer lending, I recently introduced the BLE Risk Index (BLE = Bad Loan Experience) on my blog and at PeerCube.

Also, if you are interested in reviewing loans available through filters created by others, you may want to check out Peer Filters on PeerCube at http://www.peercube.com/lc/peerfilter. I continue to add new filters as I find them online and encourage users to share theirs. If you share your filter and send me a note requesting analysis, I will be happy to analyze it and publish the findings on my blog and at PeerCube.