STAT 1361/2360: Statistical Learning and Data Science
University of Pittsburgh
Data Science Project Details – 2020 Update
The information below is designed to supersede that provided in the original project description
(“Data Science Project Details”). Due to the COVID-19 outbreak in the U.S. and the subsequent
transitioning to online-only coursework, a number of changes are necessary to the original
project guidelines. The changes below are designed to ensure students are able to complete the
entirety of the project independently under the new guidelines.
1. Project Proposals (5%): No change. At the time of this writing, project proposals have
already been completed, so this portion of the project is unaffected.
2. Oral Presentation (5%): Canceled
3. Written Report (15%): The written report is now the only component of the final
project. Furthermore, the new guidelines given below consist of only 3 sections rather than the
5 originally given. Finally, these reports should now be completed independently. You
may, of course, discuss various aspects of your data and models with your other group members
via Skype (or some other online virtual meeting software). However, everyone in the class needs
to write all parts of their report independently and in their own words, and should express
their own views, thoughts, and opinions. Each written report must include the
following sections:
1. Introduction: An overview and clear description of the problem(s) of interest, along with
details of the specific dataset. (∼ 0.5 - 1 page)
2. Methods/Results Overview: Provide a brief summary of all models constructed and how
they performed relative to each other. This will consist primarily of two parts: (i) a summary
of the findings from the models/experiments done in the homework and (ii) an
explanation of any and all follow-up analyses you performed after all models were first
constructed in order to compare models, evaluate variable importance, etc. Note that you
should not generally need to include code here. If code and/or plots are necessary (in your
view) to fully explain your findings, you may include that kind of thing in an appendix
that appears at the end of the file. (∼ 1.5 - 2 pages)
3. Thoughts and Takeaways: After constructing the models and attempting to determine
which perform best and which variables are most important, there are a number of things
you ought to consider. Please think critically about each of the following issues/prompts
and reply to them directly in this section of your report. Note: it is likely easiest to simply
copy and paste the prompts below into your report, put them in bold, and then include
your thoughts/responses below them. No writing other than your responses to the
following prompts is required in this section.
(a) How many models seemed to perform “best” in terms of predictive accuracy? How
did you measure this? Relative to what the models are doing, does it make sense why
they would perform similarly well or are they quite different? Do you have a sense of
whether such models are actually “significantly” better than others?
(b) Among the top-performing models, which variables seemed most important? Are
they mostly the same between models or are they quite different? Do you have any
intuition as to why certain variables might appear more important in some models
but not in others? Think about what those variables actually measure, what their
general relationship to the response might be, and what kinds of models might do
better or worse at picking up different kinds of effects.
(c) What were the most challenging aspects of working with your particular dataset?
Were you able to mitigate these issues or do you feel that your final results are less
certain as a result of them? Perhaps most importantly, do you really trust your “best”
results at the end of the day? If you were in a position where you were personally
held liable for any negative outcomes associated with implementing your model, how
worried would you be?
(d) Imagine that you had to present a summary of your findings to a non-technical
decision-maker. In other words, this person is going to take some kind of action based
on what you report to them but they lack the technical expertise to really “check your
work” or even understand how you arrived at any of your conclusions. What would
you report to them (i.e. what kind of recommendations would you give)? How would
you present it to them? Imagine you had the ability to request more and/or different
data. What might you want to request? More observations? More (or different)
variables included?
It’s difficult to set any kind of length requirements here because some of you may have a lot
to say for some of these and relatively little to say for others. In general, I’m picturing roughly a
page or so for each of these. So each page would have the prompt at the top in bold followed by
a few paragraphs with your thoughts following that. If you have more ideas you want to share,
that’s great, but you probably shouldn’t need more than 2 pages maximum for each of these (try
to be concise). I’m really just interested in seeing how well you can put together the different
pieces of your analysis and recognize potential drawbacks and important issues. Please don’t
feel the need to drone on about one simple point for two pages just for appearances – one or
two interesting points that are well-summarized in a couple of paragraphs are much better than one
point drawn out over a page and a half. Really focus on trying to draw connections between the
models you’ve built and the results you’ve seen, piece that together with the numerous issues
we’ve discussed in class, and give a concise summary of that.
4. Peer Evaluations / Manager Reports (5%): Canceled
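As an illustration for prompt (a) in Thoughts and Takeaways: one rough way to get a sense of whether one model is “significantly” better than another is to compare their test errors fold by fold. The sketch below is only an example, not a required method. The function name paired_fold_comparison and the error values are hypothetical, and it assumes both models were evaluated on the same cross-validation folds. Because folds share training data, the per-fold differences are not truly independent, so the resulting t-statistic should be read as a rough guide rather than a formal test.

```python
import math
import statistics


def paired_fold_comparison(errors_a, errors_b):
    """Compare two models using per-fold test errors from the SAME folds.

    Returns (mean_diff, std_error, t_stat), where each difference is
    error_a - error_b, so a positive mean_diff means model B had the
    lower error on average.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("both models need one error per fold, on the same folds")
    if len(errors_a) < 2:
        raise ValueError("need at least two folds to estimate variability")

    # Pair the errors fold by fold so fold-to-fold difficulty cancels out.
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    mean_diff = statistics.mean(diffs)

    # Standard error of the mean difference (sample standard deviation / sqrt(k)).
    std_error = statistics.stdev(diffs) / math.sqrt(len(diffs))

    # Paired t-statistic; undefined when every fold difference is identical.
    t_stat = mean_diff / std_error if std_error > 0 else float("nan")
    return mean_diff, std_error, t_stat


# Hypothetical per-fold test errors (e.g. MSE) for two models on four folds.
model_a_errors = [1.0, 2.0, 3.0, 4.0]
model_b_errors = [0.5, 1.0, 2.5, 3.0]

mean_diff, std_error, t_stat = paired_fold_comparison(model_a_errors, model_b_errors)
print(f"mean difference: {mean_diff:.3f}, std. error: {std_error:.3f}, t: {t_stat:.3f}")
```

A mean difference that is small relative to its standard error suggests the two models are performing about equally well on this dataset, which is worth saying directly in your response rather than declaring a single winner.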