MGT458 Final Examination 2018 Fall Term
For Instructor Use:
Question Points Score
1 34
2 53
3 8
Total: 95
Page Bonus Points Score
13 4
Total: 4
1. Fall, Winter, Spring, Summary?! [45 minutes] (34 points)
When you give answers to the following questions, be very precise.
(a) (4 points) A data analyst claims to have used standard methods with the
default values provided by packages such as pandas to create the following
summary table. Is this claim believable?
Ages     Percent Frequency
9-13     10
14-19    26
20-24    22
25-28    15
29-32    17
(b) (3 points) The following data represent the net worth (in millions of dollars) of
45 national corporations.
Class limit    Frequency
10-20          2
21-31          8
32-42          15
43-53          7
54-64          10
65-75          3
What Python command could be used to create a histogram that presents these
data? Assume that you have access to the original data set with 45 observations.
(c) Are the following examples classification or regression/estimation problems?
Circle the correct answer.
i. (1 point) Predict ages of your customers.
A. Classification B. Regression/Estimation
ii. (1 point) Predict marital status of your customers.
A. Classification B. Regression/Estimation
iii. (1 point) Predict the time a customer spends browsing your website.
A. Classification B. Regression/Estimation
(d) Are the following examples supervised or unsupervised machine learning
problems? Circle the correct answer.
i. (1 point) Grouping similar customers of a retail company for targeted
advertisements.
A. Supervised B. Unsupervised
ii. (1 point) A Wall Street analyst has been asked to find out the expected
change in stock price for a set of companies with similar price/earnings
ratios.
A. Supervised B. Unsupervised
iii. (1 point) Predicting whether or not a customer will leave the company. You
have access to previous customers' labeled data.
A. Supervised B. Unsupervised
(e) (4 points) How can you find outliers for a variable in a given data set? Is it
always necessary to remove outliers from the data before performing any further
analysis? Explain why (or why not).
(f) (4 points) An e-commerce company wants you to calculate the average number
of sales per day. One year of sales data is given. On a few festive days, sales
are enormously high compared to all other days in the year. Which measure
(mean, median, or mode) would you use in this case to get a good estimate of
average sales on a regular day? Explain why you would use that measure.
(g) (4 points) Indicate Mean, Median, and Mode in the chart below. What can you
say about the skewness of the distribution?
(h) (3 points) Explain why a birthdate variable would be preferred to an age
variable in a database.
(i) (3 points) Can you think of any reasons why, as a strategy for dealing with
missing data, it might not be recommended to simply omit the records or fields
with missing values from the analysis?
(j) (3 points) Data visualization is a very important tool for exploring the data
before performing any further analyses. Individual variables can be explored
using histograms or bar charts, for example. When exploring relationships
between two quantitative variables, scatter plots are the most commonly used
tool. Draw a scatter plot that uncovers an outlier that would be invisible in a
one-dimensional exploration of the two individual variables.
2. Credit Cards [110 minutes] (53 points)
Suppose you work as a digital marketing analyst for a large bank. Your project for
today is to perform a marketing analysis of UTM staff members who use
special purchasing cards (P-cards) on campus. A P-card is a business
credit card that some employees are permitted to use to purchase necessary goods
and services. Employees who agree to certain rules can use a P-card to make
appropriate business purchases rather than using their own credit card, which
spares them from spending personal funds and seeking reimbursement. You have
been assigned the task of studying all of UTM's P-card transactions for 2017. To
perform this task, you received a CSV file of all P-card transactions for the entire
province of Ontario (the province collects all transactions for provincial and higher
education institutions) containing the last five years of transactions.
(a) (1 point) After loading the CSV file you want to inspect the types of the
variables. State a suitable Python command.
(b) (3 points) The resulting output is as below. Comment on this output.
AgencyNumber int64
AgencyName object
CardholderLastName object
CardholderFirstInitial object
Description object
Amount object
Vendor object
TransactionDate object
PostedDate object
MerchantCategoryCode(MCC) object
Here are two sample lines of the CSV file. Note: Each is printed across three
lines here to fit it onto the page.
"1000","UNIV OF TORONTO MISSISSAUGA","Edwards","M","GENERAL PURCHASE",
"$1,710.00","STYLEBOOK INC","23/7/2014 0:00","24/7/2014 0:00",
"WOMEN’S READY-TO-WEAR STORES"
"98000","RIVER DAM AUTH.","McGuire","D","GENERAL PURCHASE",
"($9.73)","WALKER’S HARDWARE","11/12/2017 0:00","12/12/2017 0:00",
"HARDWARE STORES"
For all the following parts of this question write Python code. You can
use suitable packages such as numpy, pandas, matplotlib, and so forth, and
you may assume that they have been imported accordingly.
(c) (4 points) Ensure that the dates are recognized correctly. Transform the
respective column(s).
(d) (4 points) Ensure that dollar amounts are recognized correctly.
(e) (2 points) Ensure that we are only considering transactions from 2017.
(f) (2 points) After that, ensure that we are only considering transactions from
UTM.
(g) (5 points) Find all the employees who spent more than $3,000 in a single
transaction. Display all transaction details sorted by the transaction amount in
decreasing order.
(h) (5 points) Visualize the findings of the previous analysis in a suitable graph.
(i) (6 points) Display the name and total amount spent during the year for all
employees who spent more than $30,000 in 2017. Sort by the total amount
spent with the larger amounts listed first.
(j) (6 points) Display the name, the month, and the total amount spent during
the month for all employees who spent more than $10,000 per month in 2014. Sort
by month (January listed first) and then by the total amount spent, with the
larger amounts listed first.
(k) (10 points) Did any of the employees split an amount of more than $3,000
between two or more swipes of the card by the same person? Display all transaction
details where the vendor and purchaser are the same on a specific day, there is
more than one transaction for the day, and the combined total of the transactions
was more than $3,000. Sort them in ascending order by the TransactionDate.
(l) (5 points) Continuing from the previous part of the question, count how often
each individual purchaser paid amounts over $3,000 in more than one
transaction. Sort this in descending order by the count.
3. Churn Clusters [25 minutes] (8 points)
Churn, also called attrition, is a term used to indicate a customer leaving the service
of one company in favor of another company. We are investigating a data set with
nine variables that contains 3333 customers. The variables of interest are
1. Account length: Integer-valued, how long account has been active.
2. International plan: Binary categorical, yes or no.
3. Voice mail plan: Binary categorical, yes or no.
4. Total day minutes: Continuous, minutes customer used service during the day.
5. Total eve minutes: Continuous, minutes customer used service during the evening.
6. Total night minutes: Continuous, minutes customer used service during the night.
7. Total international minutes: Continuous, minutes customer used service to make
international calls.
8. Number of calls to customer service: Integer-valued.
9. Churn: Target. Indicator of whether the customer has left the company (true or false).
Assume we used a clustering algorithm to cluster the data into three clusters. All of
the following figures show a statistic for all 3333 records followed by statistics for the
three individual clusters.
(a) (2 points) Briefly describe the members of the three clusters with respect to
International Plan adoption.
(b) (2 points) Briefly describe the members of the three clusters with respect to
VoiceMail Plan adoption.
In class we discussed that standardizing the data might help improve the
performance of certain machine learning algorithms. Instead of being standardized,
the attributes can also be normalized, which restricts the range of their values
to the interval from 0 to 1. Let MaxX be the maximum value of a particular
attribute X and let MinX be the minimum value of that attribute. Then the
normalized value, n(x), of a particular value x is calculated as
n(x) = (x - MinX)/(MaxX - MinX). For example, if MaxX = 50 and MinX = 10, then
for x = 20 we get n(20) = (20 - 10)/(50 - 10) = 10/40 = 0.25.
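The normalization rule above can be sketched in a few lines of Python. This is only an illustration: the function name min_max_normalize and its parameter names are hypothetical and not part of the exam, and the guard against MaxX = MinX is an added assumption, since the formula is undefined in that case. The sketch reproduces the worked example with MaxX = 50, MinX = 10 and x = 20.

```python
def min_max_normalize(x, min_x, max_x):
    """Rescale x into the range 0..1 using (x - min_x) / (max_x - min_x).

    The name and signature are hypothetical, chosen for illustration only.
    """
    if max_x == min_x:
        # Assumption: treat a constant attribute as an error, because the
        # formula would divide by zero.
        raise ValueError("max_x and min_x must differ")
    return (x - min_x) / (max_x - min_x)


# Worked example from the text: MaxX = 50, MinX = 10, x = 20.
example = min_max_normalize(20, 10, 50)
print(example)  # 0.25
```

With this rule the minimum of an attribute maps to 0 and the maximum maps to 1, so every observed value of that attribute lands between the two.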
(c) (2 points) If MaxX = 100 and MinX = 40, then for x = 60 we get n(x) = ?
The following table presents the normalized scores of the numerical variables
described at the beginning of the question. Each row gives the normalized mean
of each variable for one of the three clusters.
Cluster  Count  AcctLength  DayMins  EveMins  NightMins  IntlMins  CustServCalls
1        92     0.434       0.536    0.5669   0.4764     0.5468    0.1630
2        2411   0.413       0.513    0.5507   0.4774     0.5120    0.1753
3        830    0.412       0.509    0.5564   0.4795     0.5077    0.1701
(d) (2 points) Briefly comment on this information for the three means.
The following part is a bonus/challenge question. You are not required to
solve it, but you will receive a bonus towards the examination mark for
solving it.
(e) (4 points (bonus)) Give a detailed description of the cluster members in the
three different clusters.
SCRAP SHEET
This page will NOT be marked, but you must submit this page with your exam
paper.
End of exam.