BSAN2205 MACHINE LEARNING FOR BUSINESS

Project Plan

The course BSAN2205 Machine Learning for Business has three assessment items: a Project Plan, a Project Report and Presentation, and a School-based Take-home Assessment (weighted 20%, 50%, and 30%, respectively). These notes outline my expectations for the Project Plan and introduce the context for the project work. I intend the Plan, or proposal, to be a formative piece of assessment. The Plan should set the groundwork for your project and project report. I will provide feedback on your Plan that you can incorporate into your project.

Background and Context

In competitive markets, businesses face the challenge of acquiring and retaining customers. Consider subscription services: for example, subscriptions to digital editions of newspapers and magazines, to streaming services (film and television, music, news, sport, etc.), and to cable television services (Foxtel). Other businesses face the same challenges, including airlines, banks, insurance companies, telecommunication companies, retailers, restaurants, and personal services businesses.

One retention strategy is to deepen relationships with customers through upselling, that is, convincing a customer to buy something in addition to, or more expensive than, what they have previously purchased from a business. Streaming services like Netflix and Spotify strive to build customer engagement, increasing the number of downloads and/or the time spent streaming.

Bank marketing provides the specific context for the project. Like many consumer businesses, banks confront the challenges of attracting new customers and retaining existing customers. Strategies for retaining customers provide the setting for the project. For banks, engagement is reflected in the number of products (active accounts) customers maintain. Often retention strategies have the goal of deepening engagement by encouraging customers to open new accounts.
Consolidating accounts with one rather than many banks may offer consumers some benefits at the margin. For example, highly engaged customers may be offered lower rates on loans and access to services for which they do not have to pay (at least, not directly), and they may reduce the overall burden of managing multiple banking relationships. For banks, the benefits of more highly engaged customers are larger and more stable cash flows, lower marketing expenses (the cost of attracting a customer is higher than the cost of retaining one, per customer relationship economics), and thus potentially higher profits.

Before moving on, I would like you to appreciate that many problems in business can be solved through effective predictive models of binary outcomes. Consider the decision to purchase or not purchase shares in a company, to acquire or merge with another business, to hire or not hire a prospective employee, and so on. All of these decisions involve binary outcomes (in some cases, they can be characterised as go/no go decisions). The specific focus of the project is customer acceptance of a marketing offer, but the concepts and models have much broader application.

Aims of the Proposal

The Project Plan has two broad aims. First, the Plan is a marketing document. Second, the Plan is a roadmap. As a marketing document, the Project Plan must sell the project to the stakeholder(s) and/or client. Thus, the Plan should emphasise the benefits of doing the project. As a proposal or roadmap, the Project Plan should outline in some detail the likely direction of the project. This might include identifying the key variables and methods of analysis.

Key Sections of the Project Plan

More specifically, you might consider including the following sections in your Plan.

1. Background statement
2. Conceptual development
3. Variable selection
4. Methods of analysis/analysis plan
5. Form of the results
6. Next steps

In the background statement (section 1), you may wish to sketch out the initial motivation for the study. This might include reference to the key stakeholder(s) and/or client. I recommend targeting the proposal at a (hypothetical) client to bring a degree of realism to the project and to help focus the project (for example, you could contextualise the study with reference to an Australian bank). In this section, also make sure to sell the project. For example, what are the likely benefits of doing the project, what new insights do you anticipate, and how will these improve decision making?

You might find value in a section 2 that outlines the conceptual framework for your project work. If you focus your project on customer engagement with banks, for example, you might give some thought to the advantages to banks and their customers from greater engagement and the process that might drive customers to respond favourably to a bank's marketing efforts. My preference is that you use your own common sense and logic to define the key concepts and to develop a rationale for their links. I do not expect a review of the literature, but you might find some desk (Google) research helpful in identifying past studies that have explored issues similar to the ones you are exploring. A boxes-and-arrows diagram might help to illustrate the core concepts and relationships.

The section on variable selection (section 3) is probably the key section. Be very specific about the variables you intend to study. In the social science tradition, much emphasis is placed on explaining why the variables selected for study have been selected; the focus is explanation rather than prediction. This is less the case with the data science paradigm, with its focus on prediction; business analysts/data scientists may wish to specify an (initial) model that includes all of the possible feature variables. My minimum expectation for this section is that you provide some description of the output and feature variables you intend to study, and explain why you chose these feature variables.

Section 4 outlines the methods of analysis. Here I would like you to be specific about the models you might use to analyse the data. You may have completed the course BSAN2204 Methods of Business Analytics. A focus of that course was predicting a numeric output variable (song hotness) using linear regression. For this course (BSAN2205 Machine Learning for Business), our target variable is categorical: it records whether customers opened or did not open a new account in response to the Bank's marketing efforts. My expectations for section 4 are that you can identify an appropriate statistical model (or models) for analysing the data, state something about the assumptions of the model, and perhaps list the key steps in employing the model. You could also write out the specific model you intend to estimate (write out the regression equation, for example, with reference to the y- and x-variables).

Section 5 (form of the results) should give an indication of what the outputs might look like. You could do a mock-up of the results. You could also say that you will document the results in PowerPoint format and present them verbally.

The next steps section concludes the proposal. Here you might remind the client of the core benefits and indicate what you need to initiate the project (final client sign-off, for example). You could also add a timeline or perhaps a Gantt chart (timetabling the key activities, when you will do them, and identifying any critical paths).

At this stage, refrain from doing any statistical analysis of the data; save the analysis for the project report. Use the Plan to develop some general knowledge of the models you intend to use and sketch out your best plan for the analysis you intend to implement.

The final section of your Plan might address next steps (section 6). You can briefly restate the main motivation for your Plan and highlight the key next steps.
Remember, the Plan is a marketing document; perhaps remind the reader of the Plan that this project is an important one and should be completed now.

The Bank Marketing Dataset

The project work for this Semester uses the Bank Marketing dataset. Several variations of the dataset exist. There is one variation available from the UCI Machine Learning Repository and another variation on Kaggle. We will use the version of the dataset available from Kaggle (with some minor variations). Owned by Google, Kaggle is an online community of business analysts and data scientists. Users can freely upload and download data to and from the site (kaggle.com). Kaggle runs competitions, often sponsored by third parties. I encourage you to explore the Kaggle website and join the Kaggle community. Kaggle is a great place for those with an interest in machine learning.

I have downloaded the dataset from Kaggle, introduced some further variations, and placed the dataset on the Blackboard site. Please use this version of the dataset for your project. Appendix A provides a list of the variables in the Bank Marketing dataset, including brief descriptions.

The target or output variable is customers' responses to a recent marketing campaign run by the Bank (the Bank being a European bank, specifically, a Portuguese bank). The data is real-world data offered freely by the Bank to the data science community. The data consists of 21 variables (the target variable and 20 feature variables) and observations on approximately 40,000 customers targeted with a particular marketing campaign. The output variable is a binary categorical variable: customers responded to the marketing campaign by either opening a new account or not.
The 20 feature variables include a mix of variables reflecting customers' characteristics (age, education, etc.), the nature and status of their existing accounts with the Bank (type of accounts, accounts in debit, etc.), variables describing the campaign (number of customer contacts during the campaign), and socio-economic variables (consumer confidence, etc.). The feature variables are a mix of categorical and numeric variables.

Given the output variable is a (binary) categorical variable, you should explore model forms other than linear regression. As a starting point, I recommend you fit a logistic regression model to the data and subsequently use tree-based methods. A comparison of these methods could be an important part of your overall project (logistic regression vs decision trees). Further, you might explore ensemble methods to enhance your implementation of tree-based methods. We will cover these methods in the coming weeks!

Submission Guidelines

The Project Plan has a weight of 20 percent of your score for the course. Please submit your Plan in the form of a written Word document. I expect you could easily write 2,000 words. Try not to write more than 3,000. I will give your Plan a score out of 100. I will also provide you with written feedback.

When marking the Project Plan, I will be looking closely at the links between the sections as much as at what you write in each individual section. For example, the background statement should set up the conceptual development, which in turn should set up the variable selection, and so on. A high-scoring Plan will have a degree of novelty to it (a unique and/or compelling contextualisation, a thoughtfully specified analysis plan including appropriate performance metrics, etc.).

Finally, these notes are a guide only to preparing your Project Plan. You may find other ways to present it that are more compelling, more compact, and more complete. If in doubt, do what you think is best.
I will separately provide you with the marking criteria for the Project Plan. Note that they will closely follow the criteria for the Project Plan in the course BSAN2204 Methods of Business Analytics.

Appendix A

The Bank Marketing Dataset is based on the Bank Marketing UCI dataset, with some variations. Table A1 below lists the variables in the dataset and offers brief descriptions.

Table A1 Variables and Variable Descriptions

Variable                               Variable Name   Variable Type  Units/Category Labels
Age                                    age             Numeric        Years
Type of Job                            job             Categorical    admin, blue-collar, entrepreneur, housemaid, management, retired, self-employed, services, student, technician, unemployed, unknown
Marital Status                         marital         Categorical    divorced, married, single, unknown
Education History                      education       Categorical    basic4y, basic6y, basic9y, highschool, illiterate, professionalcourse, universitydegree, unknown
Credit in Default                      default         Categorical    no, yes, unknown
Housing Loan                           housing         Categorical    no, yes, unknown
Personal Loan                          loan            Categorical    no, yes, unknown
Contact Type                           contact         Categorical    cellular, telephone
Month of Last Contact                  month           Categorical    jan, feb, mar, . . .
Day of Last Contact                    day_of_week     Categorical    mon, tue, wed, thu, fri
Duration of Last Call                  duration        Numeric        Seconds
Number of Contacts                     campaign        Numeric        Counts
Days since Last Contact                pdays           Numeric        Days
Prior Contacts                         previous        Numeric        Counts
Response to Last Campaign              poutcome        Categorical    failure, nonexistent, success
Cyclical Employment Variation          emp_var_rate    Numeric        Index
Consumer Price Index                   cons_price_idx  Numeric        Index
Consumer Confidence                    cons_conf_idx   Numeric        Index
Euro Interbank Offered Rate (Euribor)  euribor3m       Numeric        Interest rate
Employment Rate                        nr_employed     Numeric        Index
Customer Response                      response        Categorical    no, yes
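
When writing out a logistic regression model with reference to the y- and x-variables, it may help to see how variables from Table A1 would enter such a model. The sketch below is an illustration only, assuming Python and a small subset of the variables. The helper names `encode` and `predict_probability` are hypothetical, and the coefficient values are invented for the example; they are not estimates from the dataset.

```python
import math

# Category labels copied from Table A1 for two of the categorical features.
CATEGORIES = {
    "contact": ["cellular", "telephone"],
    "poutcome": ["failure", "nonexistent", "success"],
}
# A small subset of the numeric features in Table A1.
NUMERIC = ["age", "campaign", "previous"]


def encode(customer):
    """Turn one customer record (a dict) into a flat list of numbers.

    Numeric features pass through unchanged. Each categorical feature
    becomes 0/1 dummy variables, with the first label dropped so that it
    acts as the reference level.
    """
    row = [float(customer[name]) for name in NUMERIC]
    for name, labels in CATEGORIES.items():
        row.extend(1.0 if customer[name] == label else 0.0 for label in labels[1:])
    return row


def predict_probability(intercept, coefficients, row):
    """Logistic model: P(response = yes) = 1 / (1 + exp(-(b0 + sum(b_j * x_j))))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, row))
    return 1.0 / (1.0 + math.exp(-linear))


# ASSUMPTION: hypothetical coefficients, invented for illustration only.
# Order: age, campaign, previous, contact=telephone,
#        poutcome=nonexistent, poutcome=success
INTERCEPT = -2.0
COEFFICIENTS = [0.01, -0.05, 0.20, -0.40, 0.30, 1.50]

# A made-up customer record using the variable names from Table A1.
customer = {
    "age": 40,
    "campaign": 2,
    "previous": 1,
    "contact": "cellular",
    "poutcome": "success",
}

row = encode(customer)
p = predict_probability(INTERCEPT, COEFFICIENTS, row)
print("Encoded features:", row)
print("Predicted probability of response = yes:", round(p, 3))
```

In a real analysis the coefficients would be estimated from the data by a statistics or machine learning package rather than typed in by hand; the point here is only the shape of the model, with dummy variables for categorical features and a probability between 0 and 1 as the output.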

