AD654: Marketing Analytics 
Boston University 
 
Assignment IV: Time Series and AB Testing 
 
Once you have completed this assignment, you will upload two files to Blackboard: the
.ipynb file that you create in Jupyter Notebook, and an .html file generated from your
.ipynb file. If you run into any trouble submitting the .html file to Blackboard, you can
submit it as a PDF instead.

For any question that asks you to perform some particular task, you just need to show your                                 
input and output in Jupyter Notebook. Tasks will always be written in regular, non-italicized                           
font.   
 
For any question that asks you to include interpretation, write your answer in a Markdown cell
in Jupyter Notebook. Any homework question that needs interpretation will be written in
italicized font. Do not simply write your answer in a code cell as a comment; use a
Markdown cell instead.

Remember to be resourceful! There are many helpful resources available to you, including the                           
video library, the class slides, the recitation sessions, the Zoom office hours sessions, and the                             
web.   
 

Part I: Working with Time Series Data   
  
 
A. Pick any publicly-traded company that trades on the Nasdaq or the NYSE. 
a. What company did you select, and what is its ticker symbol?   
 
B. Go to Yahoo! Finance: finance.yahoo.com. Enter your company’s ticker symbol                   
in the search bar near the top of your screen. Next, click on “Historical Data”                             
and then “Download.” This will automatically download a .csv with one year’s                       
worth of the company’s data onto your computer.   
 
C. Bring the dataset into your environment. For this step, bring the dataset into                         
your environment in the same way we have done throughout the semester --                         
just use read_csv() from pandas, passing the name of the file into the function.  
a. Use the head() function to explore the variables.  
b. Next, call the info() function on your dataset.    
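
As a minimal sketch of this step, the example below reads a small in-memory stand-in for the downloaded file. The column names are an assumption about the layout of a Yahoo! Finance export, the prices are made up, and "TICKER.csv" is a hypothetical filename.

```python
import io

import pandas as pd

# Made-up stand-in for the downloaded .csv (hypothetical values; the column
# names are assumed to match a typical Yahoo! Finance export).
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2020-02-03,10.0,10.5,9.8,10.2,10.2,1000
2020-02-04,10.2,10.8,10.1,10.6,10.6,1200
2020-02-05,10.6,10.9,10.3,10.4,10.4,900
"""

# With a real download this would be: df = pd.read_csv("TICKER.csv")
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())  # first rows of every column
df.info()         # column dtypes, non-null counts, and the index type
```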
 
D. Is this dataframe indexed by time values? How do you know this?   

E. If you answered no to the previous question, you will need to tell Python that                             
this data is actually a time series. Convert it to a time series now -- do this                                 
without reading the entire file back into your environment.   
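
One possible way to do the conversion in place, sketched on a tiny made-up dataframe, is to parse the date column and then make it the index. The column names "Date" and "Close" are assumptions about your file.

```python
import pandas as pd

# Tiny made-up dataframe standing in for the one already in memory.
df = pd.DataFrame(
    {
        "Date": ["2020-02-03", "2020-02-04", "2020-02-05"],
        "Close": [10.2, 10.6, 10.4],
    }
)

df["Date"] = pd.to_datetime(df["Date"])  # parse the strings into timestamps
df = df.set_index("Date")                # use the timestamps as the row index

print(type(df.index))
```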
 
F. In your Jupyter Notebook, view the index attribute of your time series.
a. Now, view the max and min values of your index attribute.
b. Now, view the argmax and argmin values of your index attribute.
c. What do the results of max, min, argmax, and argmin represent? 
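
The calls named above can be sketched on a short, deliberately unsorted index. The dates are made up; on your own data you would call the same methods on df.index.

```python
import pandas as pd

# Made-up, deliberately unsorted dates so the four results differ visibly.
idx = pd.DatetimeIndex(["2020-02-04", "2020-02-03", "2020-02-05"])

print(idx.max(), idx.min())        # timestamps taken from the index
print(idx.argmax(), idx.argmin())  # integers
```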
 
G. Let’s visualize the entire time series.   
a. Create a line plot that depicts all of the movement of your ‘Close’                         
variable for your stock.  
b. Now, add a horizontal, dashed line that spans the entire length of your                         
graph. The height of this line should represent the mean ‘Close’ value                       
from your dataset. Color this line with any color that you like (you might                           
even want to try a hexadecimal value!)  
c. As we all know, 2020 has been a pretty crazy year -- and for stock                             
market investors, it has sometimes felt like a ride on a roller coaster at                           
Lobster Land. Use shading to show the contrast between February,                   
March, April, May, and June of 2020. Shade each of these months in a                           
slightly different way on the graph.   
i. In a few sentences, how did your stock perform across this                     
five-month span? You don’t need to do any outside research or                     
analysis to answer this -- just describe what your graph is                     
showing.  
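
The plotting steps above could be sketched as follows. The price series is randomly generated rather than real stock data, and the line color, shading color, and alpha values are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up 'Close' series on business days (a random walk, not real prices).
dates = pd.bdate_range("2020-01-02", "2020-07-31")
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, len(dates)).cumsum(),
                  index=dates, name="Close")

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(close.index, close.values, label="Close")

# Horizontal dashed line at the mean, colored with a hexadecimal value.
ax.axhline(close.mean(), linestyle="--", color="#8e44ad", label="Mean Close")

# Shade February through June 2020, one span per month, each a bit darker.
month_starts = pd.date_range("2020-02-01", "2020-07-01", freq="MS")
alphas = [0.10, 0.18, 0.26, 0.34, 0.42]
for start, end, alpha in zip(month_starts[:-1], month_starts[1:], alphas):
    ax.axvspan(start, end, color="gray", alpha=alpha)

ax.set_xlabel("Date")
ax.set_ylabel("Close")
ax.legend()
```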
 
H. Let’s visualize some Simple Moving Averages. Show 5-, 10-, and 20-day Simple
Moving Averages of the ‘Close’ variable for your company’s data in three
separate line graphs. Each time, include the daily closing price for your company
overlaid on your graph.
a. How did the three simple moving average plots compare to one another?                       
How are they similar, and how are they different?   
b. What are some pros and cons of using simple moving averages? What                       
about the pros and cons of using shorter or longer k-values in a moving                           
average? 
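
A simple moving average can be computed with the rolling() method in pandas. The sketch below uses a made-up series of 30 values so the arithmetic is easy to check by hand. Each resulting series could then be plotted alongside the original 'Close' values.

```python
import pandas as pd

# Made-up closing prices 1.0, 2.0, ..., 30.0 (not real data).
close = pd.Series(range(1, 31), dtype=float, name="Close")

# One moving-average series per window length k.
smas = {k: close.rolling(window=k).mean() for k in (5, 10, 20)}

# The first k - 1 entries are NaN because a full window is not yet available.
print(smas[5].head(6))
```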
 
I. Next, we will try something called resampling.   
a. Resample your time series so that its values are based on some different                         
unit of time (larger than daily).   
i. Plot this newly-resampled time series. 
ii. Provide an example that explains why someone might care about                   
resampling. To answer this, you may use ANY example that you                     
can think of, or discover, from any field that uses time series data                         
(health, weather, market forecasting, etc.) You don’t need to                 
perform any outside research or go too deeply into domain                   
knowledge here -- 3-4 thoughtful sentences are all you need.   
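
As one possible illustration of resampling, the sketch below turns two made-up business weeks of daily values into weekly means. The choice of weekly frequency and of the mean as the aggregate are both assumptions; other units and aggregates work the same way.

```python
import pandas as pd

# Ten made-up business days: Monday 2020-02-03 through Friday 2020-02-14.
dates = pd.date_range("2020-02-03", periods=10, freq="B")
close = pd.Series(range(1, 11), index=dates, dtype=float, name="Close")

# Group the daily values into calendar weeks and average each week.
weekly = close.resample("W").mean()

print(weekly)
```

Calling weekly.plot() in a notebook would then draw the resampled series.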
  
 
 
  
 
 
 
 
 
 
 
 
 
Part II: Using a Statistical Test to Analyze Data  
 
 

This summer, Lobster Land introduced a new game of chance called “Giant Dice.” Any visitor                             
to the park can play this game for $5. After paying the money, the person is allowed to choose                                     
any number from 1 through 6. The visitor can then either roll, or throw, one gigantic wooden                                 
dice (as shown above). If the dice comes up with the same number that the player chose                                 
before throwing it, the player will receive a Lobster Land t-shirt.   
 
An angry park visitor has decided to sue Lobster Land, because he spent $80 playing this game                                 
but did not win a t-shirt. After 16 rolls of the dice, for which he chose the number “4” each                                       
time, he failed to win on any occasion. He is claiming that Lobster Land must have manipulated                                 
the dice so that the “4” result would not come up. Even though his legal fees will greatly                                   
exceed $80, he has announced that he will sue Lobster Land in a court of law to recover his                                     
losses.   
 
Lobster Land is hoping that you can save the day here! They hope that you can use some of                                     
your analytics skills to help them out and stop this lawsuit before it goes any further.   
 
A. Lobster Land built its dice so that they will perform in a similar manner to the
randint() function from the random module in Python. Using Python, simulate
120 rolls of a dice, being sure to use random.randint() to generate the values.
 
If this were a completely fair dice, how many instances of 1, 2, 3, 4, 5, and 6                                   
would you expect to result from this simulation? How many of each outcome                         
did you actually get?   
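
A minimal sketch of the simulation is shown below. The seed value is arbitrary and only makes the run repeatable; your own counts will differ if you change or remove it.

```python
import random
from collections import Counter

random.seed(654)  # arbitrary seed, used only so the run is repeatable

n_rolls = 120
rolls = [random.randint(1, 6) for _ in range(n_rolls)]  # endpoints 1 and 6 included

counts = Counter(rolls)   # observed number of times each face came up
expected = n_rolls / 6    # a fair dice gives each face an equal share

for face in range(1, 7):
    print(f"face {face}: observed {counts[face]}, expected {expected:.0f}")
```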
 
B. Using an appropriate statistical test, determine whether your results support the
claim made by the angry park visitor. What is the null hypothesis? Based on the
evidence here, will you reject or fail to reject the null hypothesis?
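
One reasonable choice here, though the assignment does not name a test, is a chi-square goodness-of-fit test comparing the observed counts with equal expected counts. The sketch below computes the statistic by hand and compares it with the 5% critical value for 5 degrees of freedom (about 11.07). The helper name chi_square_stat is hypothetical. If SciPy is available, scipy.stats.chisquare returns the same statistic along with a p-value.

```python
import random
from collections import Counter


def chi_square_stat(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


random.seed(654)  # arbitrary seed, used only so the run is repeatable

n_rolls = 120
counts = Counter(random.randint(1, 6) for _ in range(n_rolls))
observed = [counts[face] for face in range(1, 7)]
expected = [n_rolls / 6] * 6

stat = chi_square_stat(observed, expected)

# Critical value of the chi-square distribution, 5 degrees of freedom, alpha = 0.05.
CRITICAL_5PCT_DF5 = 11.07
print(f"statistic = {stat:.3f}, exceeds critical value: {stat > CRITICAL_5PCT_DF5}")
```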
 
C. Run the simulated dice roll again, but this time, use 1200 rolls, rather than 120.
Using these newly-obtained values, run your statistical test again. How did your
test statistic change? How will you interpret these results?
 
D. Run the simulated dice roll another time, but this time, use 12,000 rolls, rather
than 1200. Run the statistical test yet again. How did your test statistic change?
How will you interpret these results?
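
Rather than repeating the same cells three times, the repeated runs could be wrapped in a loop, as in the sketch below. The helper name simulate_counts is hypothetical, and the statistic shown is the chi-square goodness-of-fit statistic, which is an assumed choice of test.

```python
import random
from collections import Counter


def simulate_counts(n_rolls):
    """Roll a simulated dice n_rolls times; return the counts for faces 1 to 6."""
    counts = Counter(random.randint(1, 6) for _ in range(n_rolls))
    return [counts[face] for face in range(1, 7)]


results = {}
for n_rolls in (120, 1200, 12000):
    observed = simulate_counts(n_rolls)
    expected = n_rolls / 6  # equal share per face under a fair dice
    stat = sum((o - expected) ** 2 / expected for o in observed)
    results[n_rolls] = (observed, stat)
    print(n_rolls, observed, round(stat, 3))
```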
 
E. What general trend did you notice as you increased the number of dice rolls in                             
the simulation? Why do you think this is the case? To answer this, you don’t                             
need to cite any formal statistics rules or formulas -- you can answer this in your                               
own words, in a couple of sentences.   

