COMP0050 Assignment

Data
Download the file COMP0050CourseworkData.zip from Moodle. It contains two datasets:

1- peerToPeerLoans.csv: The data come from George, N. (2018) (All Lending Club loan data, version 6, February 2018, www.kaggle.com/wordsforthewise/lending-club). This is a subset of the datasets used in Turiel, J. D., & Aste, T. (2020). The variable of interest is charged_off, which takes value 1 if a debtor is not repaying the loan (0 otherwise).

2- stockReturns.csv: This dataset contains 500 daily percentage stock returns for 50 assets.

Tasks
There will be two tasks corresponding to the two datasets:

1. The task is to build a model to predict whether a customer will default on their loan. You should compare the performance of different methods (e.g. logistic regression, classification trees/forests) in terms of their ability to correctly predict loan defaults. You are free to focus on a subset of the data (e.g. a reduced set of features, or a subset of the loans) and to manipulate the data as you like, but you should explain your rationale.

2. Focus on the global minimum variance portfolio. Compare the portfolio variance using two different regularizers. Use validation methods to find the optimal values of the parameters.

For both tasks, justify whether you want to focus only on subsamples of the data. You are also free to explore questions related to the data and the tasks you think are interesting, as long as your analysis includes the development of predictive models of defaults for task 1 and regularized portfolio optimization for task 2.

Useful references in relation to the above tasks are the following:

• Turiel, J. D., & Aste, T. (2020). Peer-to-peer loan acceptance and default prediction with artificial intelligence. Royal Society Open Science, 7(6), 191649.
• Fastrich, B., Paterlini, S., & Winker, P. (2015). Constructing optimal sparse portfolios using regularization methods. Computational Management Science, 12(3), 417-434.
• Brodie, J., Daubechies, I., De Mol, C., Giannone, D., & Loris, I. (2009). Sparse and stable Markowitz portfolios. Proceedings of the National Academy of Sciences, 106(30), 12267-12272.

Written report
A brief written report (maximum 8 pages, with a maximum of 4 pages for each task) containing the justification of the approach, the results of your analysis, and a discussion of your results should be submitted to Moodle before the deadline of Wednesday 06/04/2022 at 16:00.

Marking
This assignment is worth 100% of the overall mark (50% for each task). The marking will be based on the following criteria (with uniform weights):
1) Clarity of presentation and explanations
2) Validity of results
3) Critical interpretation of the results
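
For Task 2, the global minimum variance portfolio with a ridge (squared L2) penalty has a closed form: minimising w' S w + lambda * ||w||^2 subject to the weights summing to 1 gives weights proportional to (S + lambda * I)^{-1} 1, where S is the sample covariance matrix. The sketch below illustrates one possible way to compute these weights and to choose lambda on held-out data. It is only an illustration under stated assumptions, not a prescribed solution: the function names gmv_weights and validate_ridge, the 70/30 chronological split, the candidate values of lambda, and the synthetic returns standing in for stockReturns.csv are all hypothetical choices.

```python
import numpy as np


def gmv_weights(cov, ridge=0.0):
    """Global minimum variance weights for the covariance matrix `cov`.

    Minimises w' cov w + ridge * ||w||^2 subject to sum(w) = 1, whose
    solution is proportional to (cov + ridge * I)^{-1} 1.
    """
    n = cov.shape[0]
    raw = np.linalg.solve(cov + ridge * np.eye(n), np.ones(n))
    return raw / raw.sum()


def validate_ridge(returns, ridges, train_frac=0.7):
    """Pick the ridge strength with the lowest hold-out portfolio variance.

    `returns` is a (days x assets) array. The first `train_frac` of the rows
    estimates the covariance; the remaining rows score each candidate.
    The split fraction is an assumption made for this sketch.
    """
    split = int(len(returns) * train_frac)
    train, valid = returns[:split], returns[split:]
    cov_train = np.cov(train, rowvar=False)  # columns are assets
    scores = {}
    for lam in ridges:
        w = gmv_weights(cov_train, ridge=lam)
        # Realised variance of the portfolio return on the validation rows.
        scores[lam] = float(np.var(valid @ w, ddof=1))
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Synthetic stand-in for stockReturns.csv (500 days x 50 assets).
    rng = np.random.default_rng(0)
    returns = rng.normal(0.0, 1.0, size=(500, 50))
    best, scores = validate_ridge(returns, ridges=[0.0, 0.01, 0.1, 1.0])
    print("selected ridge:", best)
```

A second regularizer, such as the L1 penalty studied by Brodie et al. (2009), has no closed-form solution and would need a numerical solver, but the same validation loop could be reused to score it. A single chronological split is only one validation scheme; rolling windows or cross-validation are alternatives worth justifying in the report.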