MA 705: Fall 2020 Prof. Cherveny
Individual MA705 Project
Due: Dec 14th
Goals
1. Demonstrate mastery of data science tools (Python, working with data frames,
visualizations, data collection and cleaning, dashboards)
2. Communicate data to a general audience
Description
Choose a real-world question and address it as a data scientist using tools learned in MA705.
Major components of the project are:
• Articulating a clear question
• Collecting data
• Cleaning the data
• Building a dashboard
Instructions
1. Topic: Choose a motivating question to anchor your project. This is the hardest part,
and doing it well makes the rest of the project easier and more fun. The topic may be
anything you like, but answering it should involve a variety of tools learned in MA705.
• Finance/Real estate/Health
• Sports
• Weather
• Pets/Animals/Plants/Nature
• Food/Dining/Cooking
• Music/Movies/TV/Books
• Product of interest (Coke, McDonalds, Pizza, Cereal)
• Cars/Trains/Flights
• Breakdown of a topic by state, country, county
2. Data Collection: Use web scraping, APIs, and publicly available data sets to collect
data that may be useful.
3. Data Wrangling: Appropriately clean the raw data, deal with missing values, merge
data sets, etc. Prepare at least one nice data frame for downstream use.
4. Data Presentation: Create a dashboard that presents your data to the user in a
way that answers the question according to the user’s interests. At a minimum, there
should be a table and a visualization that update according to the user’s input. Feel
free to include more elements. The challenge involved will be noted, but don’t
make a complex dashboard for the sake of complexity: dashboards that are not
user-friendly or not designed with a clear central question in mind will not be
viewed favorably.
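As an illustration of the Data Wrangling step, here is a minimal sketch using pandas. The two tables, their column names, and the helper name tidy are all hypothetical and chosen only for this example; your own data will need different cleaning rules.

```python
import pandas as pd

# Hypothetical raw data: both tables and all column names are made up for illustration.
sales = pd.DataFrame({
    "state": ["MA", "NY", "ma ", "TX"],
    "units": ["120", "95", "30", None],
})
population = pd.DataFrame({
    "state": ["MA", "NY", "CA"],
    "population": [6_900_000, 19_500_000, 39_500_000],
})

def tidy(sales: pd.DataFrame, population: pd.DataFrame) -> pd.DataFrame:
    out = sales.copy()
    # Normalize the key column so "ma " and "MA" refer to the same state.
    out["state"] = out["state"].str.strip().str.upper()
    # Convert text to numbers; unparseable or missing entries become NaN.
    out["units"] = pd.to_numeric(out["units"], errors="coerce")
    # Drop rows with no usable measurement, then total the units per state.
    out = out.dropna(subset=["units"]).groupby("state", as_index=False)["units"].sum()
    # Left merge keeps every state that has sales, even without population data.
    out = out.merge(population, on="state", how="left")
    out["units_per_million"] = out["units"] / out["population"] * 1_000_000
    return out

clean = tidy(sales, population)
print(clean)
```

Dropping rows with missing values is only one option; depending on the question, filling them in or flagging them may be more appropriate.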
Most projects will not use every tool discussed in MA705, but they should have the above
elements to some extent. The lack of challenge in one step may be offset by more involved
work in another step. You should feel free to incorporate models learned in other courses (if
done right!), but you may not hand in this project (or part of it) as your semester project
in another course without speaking to both instructors. Please speak to me if you want to
incorporate tools that significantly go beyond those covered in MA705.
You may consult any references (books, media, web). However, you shouldn’t consult
professionals about this project. Document all data sources and references.
Deliverables
• Primary: A well-designed Dashboard. It should be visually appealing, address a
clear motivating question, and be self-contained (meaning good labels on everything,
with brief text or markdown explanations as needed). It should allow the user to
customize what is displayed in some way.
• Other: Supporting .py files or data sets.
Project Examples
Here are two detailed examples of possible projects.
• (Based on # 3 from Week 11 Exercises) How can we find a great video game? Build
a dashboard that presents the user with recommendations based on customized filters.
This would involve:
– Scraping MetaCritic [1] to build a data set containing the 1000+ best video games.
Variables might include name, release date, platform, genre, rating, critic score,
user score, number of critic reviews, and summary.
– Building a dashboard that displays a list of the top 20 video games based on the
user’s search parameters and relevant information for each one. The dashboard
would have controls for release date, platform, genre, minimum number of critic
reviews, and whether results are based on user or critic score.
– Ideally a visualization would be involved. The Week 11 exercise suggested a
scatterplot of user score vs. critic score with a trend line for practice, but this
doesn’t seem like it would help the user with their choice. Perhaps a bar plot
containing the top 20 games for the search parameters, sorted by the overall score?
[1] https://www.metacritic.com/browse/games/score/metascore/all
– Challenge: Maybe the user could enter a keyword, and only results that also
contain that keyword in the summary would be displayed?
– Extra challenge: Add more information for each video game recommendation, such
as an icon of the video game or the best price on Amazon or a link to Amazon?
This might even involve actually getting the icon and real-time price from Amazon.
Again, this is extra challenge and would go beyond the project requirements.
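The filtering logic behind such a dashboard can be sketched as a single function that a dashboard callback might call. This is only a sketch under stated assumptions: the sample rows, the column names, and the function name recommend are invented for illustration, and a release date filter is omitted because it would follow the same pattern as the others.

```python
import pandas as pd

# Hypothetical sample rows; titles, scores and column names are invented for illustration.
games = pd.DataFrame({
    "name": ["Alpha Quest", "Beta Racer", "Gamma Tactics", "Delta Quest"],
    "platform": ["PC", "Switch", "PC", "PC"],
    "genre": ["RPG", "Racing", "Strategy", "RPG"],
    "critic_score": [91, 88, 94, 85],
    "user_score": [8.7, 9.1, 7.9, 9.0],
    "n_critic_reviews": [60, 12, 45, 30],
    "summary": ["An open world fantasy adventure.", "Arcade racing with friends.",
                "Turn-based fantasy warfare.", "A short story-driven adventure."],
})

def recommend(df, platform=None, genre=None, min_reviews=0,
              score_col="critic_score", keyword=None, top_n=20):
    """Return the top_n rows matching the filters, sorted by score_col."""
    mask = df["n_critic_reviews"] >= min_reviews
    if platform is not None:
        mask &= df["platform"] == platform
    if genre is not None:
        mask &= df["genre"] == genre
    if keyword:
        # Case-insensitive literal match; regex=False so user input is not treated as a pattern.
        mask &= df["summary"].str.contains(keyword, case=False, regex=False, na=False)
    return df[mask].sort_values(score_col, ascending=False).head(top_n)

picks = recommend(games, platform="PC", min_reviews=20, keyword="fantasy")
print(picks[["name", "critic_score"]])
```

Keeping the filtering in a plain function, separate from the dashboard layout, makes it easy to check the results before wiring up any controls.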
• (Current MA705 project) How does NBA team performance depend on player age?
– Prepare a data set containing the game logs for all players in the NBA during
the 2018-2019 season by either scraping or using the NBA.com API. For each
player and each game the player was in, the variables collected would include
minutes played, points scored, rebounds, the team they played for (players get
traded mid-season), and the date of the game.
– Based on the date of the game and (separately scraped) birthdate of a player, add
a variable to the data frame for age of each player in that game.
– Now prepare a dashboard that builds a histogram of, say, minutes played in the
season by players of each age. The user would select the histogram variable to
plot against age (minutes played, points scored, etc.) as well as the NBA team to
display the data for.
– When the user selects an NBA team, the team’s season stats appear, such as
win-loss record and whether they made the playoffs. It would be cool to have a
team logo display as well.
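The age variable described in this example can be sketched with the standard library alone. The player names, birthdates, game logs, and the helper name age_on below are hypothetical; with a data frame, the same function could be applied to each row.

```python
from collections import defaultdict
from datetime import date

def age_on(birthdate: date, game_date: date) -> int:
    """Age in completed years on the day of the game."""
    had_birthday = (game_date.month, game_date.day) >= (birthdate.month, birthdate.day)
    return game_date.year - birthdate.year - (0 if had_birthday else 1)

# Hypothetical players and game logs; names and numbers are invented for illustration.
birthdates = {"Player A": date(1990, 12, 1), "Player B": date(1997, 3, 15)}
game_logs = [
    {"player": "Player A", "date": date(2018, 11, 20), "minutes": 30},
    {"player": "Player A", "date": date(2018, 12, 5), "minutes": 28},
    {"player": "Player B", "date": date(2019, 3, 14), "minutes": 35},
    {"player": "Player B", "date": date(2019, 3, 16), "minutes": 33},
]

# Add an age to every game log, then total the minutes played at each age.
minutes_by_age = defaultdict(int)
for log in game_logs:
    log["age"] = age_on(birthdates[log["player"]], log["date"])
    minutes_by_age[log["age"]] += log["minutes"]

print(dict(minutes_by_age))
```

Computing age per game rather than once per season matters because a player's birthday can fall mid-season, so the same player may contribute to two age groups.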
