In the United States, stagnant interest rates, skepticism toward banks after the financial crisis, and the rise of crowdsourcing have led to the development of peer-to-peer lending as an alternative to loans offered by banks and other traditional financial institutions. Typically, investing in these loans involves manually building a portfolio one note at a time or automating the procedure based on investor-defined criteria. Investor returns depend directly on an individual borrower's repayment of the loan, so investing in a loan that defaults results in a direct loss.
This thesis will focus on two different approaches to analyzing default in this new lending environment. First, we will explore loan default as a binary classification problem. We will use initial borrower data to build a decision tree classifier and evaluate its performance using binary classification metrics. We will then examine the construction of the classifier in an effort to gain insight into the indicators of default and to develop possible investment strategies. Next, we will explore loan default as a survival analysis problem. This approach will use payment history data, along with the initial borrower data, to build a proportional hazards model that evaluates time until default. This model will also be explored for potential insight into investment strategy.
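As a rough illustration of the two evaluation ideas named above, the following minimal sketch computes standard binary classification metrics from predicted and actual default labels, and the partial log-likelihood of a proportional hazards model with a single covariate. It is a sketch under stated assumptions, not the implementation used in this thesis: the function names, the single-covariate restriction, the assumption of no tied default times, and the toy data are all hypothetical.

```python
import math


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for labels where 1 = default, 0 = repaid."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defaults correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # repaid loans flagged as default
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defaults that were missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # repaid loans correctly passed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def cox_partial_log_likelihood(beta, times, events, x):
    """Partial log-likelihood of a one-covariate proportional hazards model.

    times[i]  : observed time for loan i (time of default or of censoring)
    events[i] : 1 if loan i defaulted at times[i], 0 if it was censored
    x[i]      : covariate value for loan i (hypothetical, e.g. a scaled rate)
    Assumes no two defaults share the same time.
    """
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue  # censored loans contribute only through the risk sets
        # Risk set: every loan still under observation at time t_i.
        risk = sum(math.exp(beta * x[j]) for j in range(len(times)) if times[j] >= t_i)
        total += beta * x[i] - math.log(risk)
    return total


# Toy data, invented for illustration only.
metrics = classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
loglik = cox_partial_log_likelihood(0.0, [2, 5, 7], [1, 0, 1], [1.0, 0.0, 0.0])
```

In practice the coefficient would be chosen to maximize this partial log-likelihood, and a fitted decision tree would supply the predicted labels; both steps are omitted here to keep the sketch short.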