Shostack + Friends Blog Archive

 

You talk like a delinquent

This is interesting. I'm not sure how robust the finding is, but according to an analysis [link to http://www.iq.harvard.edu/blog/sss/archives/2008/10/credit_scoring.shtml no longer works] of LendingClub data on all past loans [link to https://www.lendingclub.com/extdata/LoanStats.csv no longer works], which includes each applicant's description of what the money is for, applicants who use certain words in their descriptions are much more likely to default.

For our purposes define a Delinquency as either being late in your payments or having defaulted completely. The 10 words with the greatest p-values are below. […]

Word      Loans with word  P(Delinquency|No word)  P(Delinquency|Word)  p-value
also      215              0.067                   0.140                0.0004
need      608              0.062                   0.105                0.0015
business  233              0.069                   0.116                0.0038
live      91               0.070                   0.154                0.0057
already   64               0.071                   0.156                0.0059
other     285              0.068                   0.112                0.0081
bills     223              0.067                   0.135                0.0082
bill      279              0.066                   0.125                0.0117
interest  660              0.081                   0.053                0.0136

“Words and Credit Scores”, Social Science Statistics Blog [link to http://www.iq.harvard.edu/blog/sss/archives/2008/10/credit_scoring.shtml no longer works]
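For anyone who wants to try this kind of analysis on their own data, here is a minimal Python sketch of how the table's columns could be computed for a single word. The original post does not say which significance test it used, so the pooled two-proportion z-test here is an assumption, and the function names (`tokenize`, `word_stats`) and the toy loans are made up for illustration; they are not LendingClub records.

```python
import math
import re


def tokenize(text):
    """Lower-case a description and return its set of distinct words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def word_stats(loans, word):
    """Compute the table's columns for one word.

    loans: iterable of (description, is_delinquent) pairs.
    Returns (n_with, p_without, p_with, p_value).
    """
    n_with = d_with = n_without = d_without = 0
    for description, is_delinquent in loans:
        if word in tokenize(description):
            n_with += 1
            d_with += bool(is_delinquent)
        else:
            n_without += 1
            d_without += bool(is_delinquent)
    if n_with == 0 or n_without == 0:
        raise ValueError("need loans both with and without the word")

    p_with = d_with / n_with            # P(Delinquency | Word)
    p_without = d_without / n_without   # P(Delinquency | No word)

    # Assumed test: pooled two-proportion z-test, two-sided.
    pooled = (d_with + d_without) / (n_with + n_without)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_with + 1 / n_without))
    if se == 0:
        # Every loan has the same outcome, so there is no difference to test.
        return n_with, p_without, p_with, 1.0
    z = (p_with - p_without) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return n_with, p_without, p_with, p_value


# Hypothetical toy data, invented for this example.
toy_loans = [
    ("I need to pay my bills", True),
    ("I also need a car", True),
    ("Consolidating high interest debt", False),
    ("Lower interest rate on my card", False),
    ("I need money for my business", False),
    ("Wedding expenses", False),
]

n_with, p_without, p_with, p_value = word_stats(toy_loans, "need")
print(n_with, round(p_without, 3), round(p_with, 3), round(p_value, 4))
```

With only a handful of loans the normal approximation is rough; an exact test would be a safer choice for small counts. Testing many words at once also means some small p-values will turn up by chance, which is one reason to be cautious about how robust the finding is.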
Not something I’ve studied, but I wonder if a neural network could successfully classify these loans?

3 comments on "You talk like a delinquent"

  • Nicko says:

    You should be able to use an off-the-shelf Bayesian spam classifier for this. That said, there are certain spam words with much higher P-values than any of these!

  • chris says:

    “I need the loan to refinance my mortgage and buy Viagra”
    :^)

  • Nicko says:

    You have obviously been using my patent p-value enlargement products 🙂
