辅导案例-RSM312H1

欢迎使用51辅导,51作业君孵化低价透明的学长辅导平台,服务保持优质,平均费用压低50%以上! 51fudao.top
Data and Information Management for Business Analytics
RSM312H1
Assignment – Building a Database and Information Management Solution
Using
Access, Excel and Tableau
Assignment Objective:
The objective of this assignment is to design, develop and implement a complete Business Analytical
Environment to support an e-commerce retailer. This analytical environment is required to deliver the
following functional capabilities:
1. Accept raw input source data
2. Store and Manage the business data in a Relational Database Management System
3. Support an Analytical Sandbox Environment by creating a linked data extraction layer
4. Provide and Analytical Sandbox to test and develop various scenario, what-if analysis capabilities
5. Provide an interactive data visualization layer to support data visualizations and performance
monitoring dashboards of key performance indicators.
The purpose of creating this analytical environment is to support the business decision making
processes for this retailer and your solution will be evaluated on 2 dimensions. The first dimension will
be based on the feature functionality, technical design, and usability of your implemented solution. The
second evaluation criteria will focus on your solutions ability to help support the managerial problem
that the e-commerce retailer is facing. By applying the information management and business analysis
skillsets that have been taught in your course, you will be going through a comprehensive database
design, development and implementation process, followed by the development on a data analysis and
visualization environment that will significantly expand the decision support capabilities of the e-
commerce retailer.
Please note that the use of any pre-developed templates (for either the database, or spreadsheet
components) is strictly prohibited. All work is expected to be original development. Use of ‘freeware’
or ‘shareware’ for any of the core design components will not be accepted.
Background:
Olist is one of the largest e-commerce websites in the Brazilian marketplace. Olist connects small
businesses from all over Brazil to retail channels under a single contract and without hassles. Merchants
are able to sell their products through the Olist eStore and ship them directly to the customers using
Olist’s logistics partners.
Since Olist’s business model is largely built on being an efficient facilitator of a transaction, any
disruptions in timely and efficient delivery of customer orders immediately attract managerial attention.
Any customer order when the delivery is not made within the quoted delivery time is regarded as a
“fail”. Moreover, there is evidence that long delivery times negatively affect sales, even when delivery
occurs within the promised times. Thus, Olist’s supply management team is interested in identifying and
prioritizing any issues impacting customer order fulfillment.
Olist has initiated a project whose goal is to identify factors, that are within Olist’s influence or control,
and that have the potential to improve outcomes in terms of customer order fulfillment process. Olist is
very early in their investigation process; you and your team have been retained as analytical consultants
for this project. Some background for the impetus of this project was provided by AE, one of Olist’s
senior managers.
“Customers who did not get their orders delivered by the promised date are often
quite unhappy, and are not shy in letting us know about it. We, in turn, flag these
incidents to the partners who were involved in this order. They know that we are
unhappy and if such incidents keep happening, we will review our relationship with
them. As a result, the proportion of orders with missed delivery dates is quite low,
though there may be some further avenues for improvement that data analysis will
uncover.
A more subtle issue is that we may have over-emphasized the importance of meeting
promised delivery dates. I believe that some of our partners have responded by
padding the estimated delivery times to give themselves a large margin of safety,
rather than optimizing their delivery procedures. As a result, the estimated delivery
time shown to the customer is often quite long. I firmly believe that many customers
will not place an order when they see estimated delivery times measured in weeks,
particularly when our competitors often promise delivery in a few days. It is hard to
measure this effect: there are many reasons why a customer who gets to the order
placement page does not place the order: some are just “window shopping” – they
may not have had any real intention of buying in the first place, some may get
“sticker shock” when the see the total bill and simply abandon the order. However, I
suspect there are many customers who are discouraged to buy when they see a long
estimated delivery time.
Another thing I wonder about is our policy to always ship items in the same order
together. We do it to minimize the shipping costs, both for us and for the customer.
However, if different sellers are involved with different fulfillment timelines for their
items, the whole order may be delayed because of issues with just one item. Should
we not, at the very least, offer our customers an opportunity to split the order and
have different parts delivered separately?”
Olist Order Fulfillment Process
A customer wishing to make a purchase on Olist site will select the items they want and place them in a
shopping cart. Note that an order may contain several identical items, or several different items,
possibly carried by different sellers.

Once the customer decides to purchase, Olist system determines which partner would handle this order
and contacts them to get an estimated delivery time. Each partner follows their own procedure to
determine the estimated delivery time. Generally, they will look at order complexity (number of items
and sellers involved), the shipping distance, as well as seller quoted delivery times – the number days
within which the seller commits to having a given item ready for pickup by the partner. All of this has to
happen in real time, as the customer has to see the estimated delivery time before the order is finalized.

Once the estimated delivery time is shown to the customer and they decide to place the order, they
enter their payment detail and a unique order id and a “purchase timestamp” are generated. Next the
customer payment is processed. Once an approval for payment is received, a “payment approved”
timestamp is generated and the order is passed on to the logistics partner. Once the partner
acknowledges receiving the order, another timestamp, “order delivered carrier” is generated. The
partner then contacts all the sellers involved in fulfilling this order and asks them to have their
respective items ready for pickup by their quoted “seller limit” date. The items are then picked up,
packaged together and shipped to the customer. Once the customer accepts the delivery, the final
“order delivered customer” timestamp is generated.

While the vast majority of the orders are eventually delivered to customers, exceptions do occur due to
item unavailability, order cancellations, or other circumstances. These are reflected in the “order status”
field.


The dataset has information on approximately 100k orders from 2016 to 2018 made at multiple
marketplaces in Brazil. After a customer purchases the product from the Olist Store, a seller gets notified
to fulfill that order.

Note:
1. An order might have multiple items.
2. Each item might be fulfilled by a distinct seller.


The following data schema with field descriptions has been provided to allow you to data analysis
solution.


Olist Customers
This dataset has information about the customer and its location. Each order is assigned to a unique
customer_id. This means that the same customer will get different ids for different orders. The purpose
of having a customer_unique_id on the dataset is to allow you to identify customers that made repeated
purchases at the store. Otherwise you would find that each order had a different customer associated
with.

customer_id key to the orders dataset. Each order has a unique customer_id.
customer_unique_id unique identifier of a customer.
customer_zip_code_prefix first five digits of customer zip code
customer_city customer city name
customer_state customer state


Olist Geolocation
This dataset has information about Brazilian zip codes and its lat/lng coordinates. You can use it to plot
maps and find distances between sellers and customers.
geolocation_zip_code_prefix first 5 digits of zip code
customer_lat latitude of centroid of customer’s 5-digit zip code area, decimal notation
customer_lng longitude of centroid of customer’s 5-digit zip code area, decimal
notation
geolocation_city city name
geolocation_state state

Olist Order Items Dataset
This dataset includes data about the individual items purchased within each order. Each order may
include multiple items, possibly supplied by different sellers. This dataset will include a separate entry
for each unique product included in the order.
order_id order unique identifier
order_item_id sequential number identifying the number of items within the same order.
product_id product unique identifier
seller_id seller unique identifier
shipping_limit_date shows the seller shipping limit date for handling the order over to the logistic
partner
price item price in Brazilian reals, R$. Currently 1R$ = 0.32CAD$
freight_value item freight value item (if an order has more than one item the freight value is
split between items)

Olist Order Payments Dataset
This dataset includes information about the orders payment options.
order_id order unique identifier
payment_sequential customer can pay with more than one payment method. If they do so, a sequence
will be created to accommodate all payments.
payment_type method of payment chosen by the customer
payment_installments number of installments chosen by the customer
payment_value transaction value


Olist Order Reviews Dataset
This dataset includes information about the reviews made by the customer. After a customer purchases
the product from the Olist Store, a seller gets notified to fulfill that order. Once the customer receives
the product, or the estimated delivery date is due, the customer gets a satisfaction survey by email
where he can give a note for the purchase experience and write down some comments.
review_id unique review identifier
order_id unique order identifier
review_score note ranging from 1 to 5 given by the customer on a satisfaction survey
review_comment_title comment title from the review left by the customer (in Portuguese)
review_comment_message comment message from the review left by the customer (in Portuguese)
review_creation_date shows the date in which the satisfaction survey was sent to the customer
review_answer_timestamp shows satisfaction survey answer timestamp

Olist Orders
This is the core dataset. From each order you might find all other information.
order_id unique identifier of the order
customer_id key to the customer dataset. Each order has a unique customer_id.
order_status reference to the order status (delivered, shipped, etc.)
order_purchase_timestamp shows the purchase timestamp
order_approved_at shows the payment approval timestamp
order_delivered_carrier_date shows the order posting timestamp. When it was handled to the logistic
partner
order_delivered_customer_date shows the actual order delivery date to the customer
order_estimated_delivery_date shows the estimated delivery date that was informed to customer at the
purchase moment

Olist Products
This dataset includes information about each product sold by Olist.
product_id unique product identifier
product_category_name root category of product, in Portuguese
product_name_length number of characters extracted from the product name
product_description_length number of characters extracted from the product description
product_photos_qty number of product photos on Olist website
product_weight_g product weight measured in grams
product_length_cm product length measured in centimeters
product_height_cm product height measured in centimeters
product_width_cm product width measured in centimeters


Olist Sellers Dataset
This dataset includes data about the sellers that fulfilled orders made at Olist. Use it to find the seller
location and to identify which seller fulfilled each product.
seller_id seller unique identifier
seller_zip_code_prefix first 5 digits of seller zip code
seller_city seller city name
seller_state seller state

NOTE: Category Name Translation
Product_category_name_translation.csv contains two columns, the first column
(product_category_name) contains the category name in Portuguese and the second column
(product_category_name_english) contains the translated category name in English.
NOTE: Calculating distances from coordinates
The locations for customers and sellers are provided in decimal coordinates
(https://en.wikipedia.org/wiki/Decimal_degrees), which you may want to convert to distances. While
there are many ways to accomplish this simple task, the following formula will do the trick:
Dist (in km) = ACOS(COS(RADIANS(90-Lat1)) *COS(RADIANS(90-Lat2)) +SIN(RADIANS(90-Lat1))
*SIN(RADIANS(90-Lat2)) *COS(RADIANS(Long1-Long2))) *6371

The Business Objectives
As described in the case, Olist has initiated a project whose goal is to identify factors, that are within
Olist’s influence or control, and that have the potential to improve outcomes in terms of customer order
fulfillment process. Olist is very early in their investigation process; you and your team have been
retained as analytical consultants for this project.
Several potential issues related to the order fulfillment process and their impact on customer
satisfaction and order placement are described in the case. Your task is to design and create both a data
management and analysis environment to help support these issues on an ongoing basis.
It is not necessary to try to address every aspect of the managerial problem in your project. However,
its focus should have clear managerial relevance, and its findings should, at least potentially, provide
valuable insights for Olist’s management.
Defining a specific focus for your project is part of your task. Too narrow a focus may make findings less
relevant as important parts of the managerial setting may be ignored. On the other hand too wide a
focus may make the task intractable or infeasible.
Deliverables:
In this assignment you will be delivering an integrated solution across 3 application environments;
1. A relational database (using Access)
2. An analytical environment (using Excel and/or Python)
3. A data visualization environment with an interactive performance monitoring dashboard (using
Tableau and/or Excel)
Part 1. Designing and Building a Database
You will be required to develop a physical database design in Access and import the raw csv datasets.
Using the Database Relationship View, all tables must be interconnected by the appropriate 1 to many
relationships. Standalone tables or tables with 1 to 1 relationships should not be incorporated into your
design (please check with your TA if you feel that an exception is valid)

Tables in your database must be given understandable names and descriptions. Variables (or columns)
in the tables must be defined across appropriate data types including; text, date, number.

Each table must have a primary key (or composite primary keys) defined.

Once you’ve designed and populated your initial database you will need to evolve it into a working
database that will demonstrate the following functional components that have been introduced in the
Access portion of the course including;
- At least 5 contrasting database administrative reports that can be used to summarize
measurement data across a range of dimensions in order to assess data quality (i.e. Monthly
Order Summary reports, Seller Reports, Product Performance Reports, Customer/Product
Sales Reports, … etc.) to display your data in an attractive and professional manner. Reports
should show a broad range of techniques and data views, and demonstrate different layout
formats, and be contrasting from each other in both function and format.
- At least 2 or more forms to facilitate data entry or updating for; orders, products, customers
and suppliers. Forms must be show contrasting design layout approaches
- Demonstrated capability to create detailed and summary SQL data extraction and data
transformation queries that can be used to directly link (i.e. using a data connection) into
your Excel Analytical environment as base input tables to support your analysis. You can
also use these ETL queries to feed into your Visualization Environment in Tableau.


Submit the completed database as an access database file attachment called
XX_Group_Project_Database.accdb where XX is replaced by your group number

(Please note: please ensure that original csv files are copied into your Access Database (i.e. not
embedded as linked files)


Part 2. Building an Analytical Environment in Excel
Now that you have created a data management solution that is able to manage and retrieve Olist’s
critical information, your next task is to focus on developing an analytical environment that takes data
directly from the Access database. Your Excel solution must demonstrate the following capabilities:

- Produce supporting analytical charts and graphs for senior management
- Demonstrate the ability to pivot data in both a chart and graphical format
- Through the use of formulas, take basic level data from the access database to create
derived variables to help support advanced analysis
- Within the appropriate context of your business problem, deploy and demonstrate the
use of advanced functions within Excel including a minimum of one example that
derives additional data elements from at least FIVE of each of the following categories:
i. Mathematical or Statistical
ii. Financial
iii. Character and Text Manipulation
iv. Database functions
v. Logical functions
vi. Lookup reference
vii. Date Time manipulation
- Create a series of analysis sheets that appropriately demonstrates A MINIMUM of
THREE of the following analytical techniques covered in the course: Scenario Manager,
Data Tables, Goal Seek, Solver, or a Statistical Analysis add in package. Capability must
be demonstrated within the context of your business problem and data.
- Your Excel analytical environment must contain an external data connection to your
Access database to bring the underlying data from your database into the spreadsheet
as a dynamic datalink.


Submit the completed spreadsheet as an Excel file attachment called
XX_Group_Project_Dashboard.xls where XX is replaced by your group number. You will also need
to include your access database as XX_Group_Project_Database.accdb so that the Excel reference
to the external data link can be found.





Part 3. Building a Visual Analysis and Dashboard Environment in Tableau

Now that you’ve developed your analytical environment and have been able to find insights and develop
recommendations to help Olist better understand and manage it’s delivery and logistics challenges, the
Olist team has requested your assistance in developing a data visualization environment that they can
use to visualize how their business is performing and keep track of the key performance indicators
through some form of a dashboard to help keep them on top of the logistical and delivery issues that
they are faced with. Using Tableau Desktop, you have been tasked to develop a more Executive Level
summarization of the key graphics, that includes a dashboard. In addition to the dashboard, the
management team has requested a brief presentation/story using Tableau that points out the key
trends and observations based on the analysis that was used to support your recommendations. Make
sure that your story has a combination of; graphics, text and interactions that will allow the senior
management team to get a better sense of the issues and concerns that need to be monitored on an
ongoing basis.


Submit your Tableau files as a published ‘read only’ Tableau packaged workbook (with your dataset
embedded) that can be distributed through the Tableau Reader called
XX_Assignment_3_Part_4_Dashboard.twbx where XX is replaced by your team number.

(Please Note: DO NOT submit your original Tableau file (with the .twb extension) – you must do a File
….. Save As …… and select the Save as type: Tableau Packaged Workbook (*.twbx))

Evaluation Criteria

Criteria Novice Proficient
Database
Environment
Some issues with data loading, completeness
of data. Unclear or incorrectly specified
relationships between entities. ETL queries
insufficient to provide a full view of the data
for the analytical environment. Forms are
simplistic and repetitive in structure and not
fully considered for the intended purpose.
Reports do not provide a clear or complete
picture of the overall
consistency/quality/integrity of the data
environment.
All tables fully populated with appropriately modelled
entity relationships. ETL Queries are complementary to
each other and illustrate a range of analytical views of
the data from summary to detail across a range of
dimensions. Appropriate use of Forms to support
updating of key information into the database. Forms
are easily interpretable and ensure that relational
integrity is managed when adding or updating records.
Range of contrasting reports to ensure that data
integrity is in place such as checking key metric totals
and breakdowns by common business dimensions.
Analytical
Environment
Unclear line of inquiry in the analysis or
difficult to follow due to lack of
documentation or explanation, or poorly
designed sheet layout. Most Excel derived or
transformed data elements demonstrated, but
more examples would be of further benefit to
the analysis. Insufficient demonstration of
analytical techniques in either number or
quality of the inquiry.
Well organized data linkages between underlying data
sheets and the database. Methodical and easy to follow
structure of analysis that is well documented and
commented. Ease of readability and interpretability of
analysis pages. Full and clear demonstration of various
Excel derived data capabilities. Demonstration of
minimum 3 required analytical techniques from;
Scenario Manager, Data Tables, Goal Seek, Solver, or
Statistical Analysis add in package.
Data
Visualization
Environment
A broader range of visualization techniques
should be demonstrated. Multiple aspects of
the 4Cs are missing, or have not been well
addressed in the visualization. Study is
required to interpret the data and how it
applies to the thesis of the analysis. There is a
clear message or story conveyed, but the
action or conclusion is not definitive. May
require interpretation. Chart types chosen are
acceptable, but axes may be cluttered or have
rotated text. Color choices communicate
meaning but can be improved. Delivery
provides some form of argument and is
partially supported; minor details should be
vetted and affirmed, or there may be a
lingering lack of clarity and work may be
required to review and confirm. The
visualization suggests some possibilities, but
does not lead to clarity of understanding and
therefore action is not possible. The
presentation and communication leaves
concerns or lingering lack of clarity. Work
required to review and confirm.
Methodical structure of data linkages into underlying
charts, graphs, dashboards and stories. Broad range of
visualization techniques demonstrated. The 4Cs are well
represented; the visualization is clear, clean, concise,
and captivating. The visualization facilitates quick
cognition and leading to a fact-based conclusion or
assertion. The visualization is targeted to the audience,
the story is evident, and the conclusion or action
required is clearly apparent. Chart types are suitable
and best options for the analysis. All axes and text are
treated appropriately. The application of color is correct
and clearly conveys meaning. The visualization (or
presentation) is delivered in a convincing way that
demonstrates confidence, competency, and
thoroughness. The visualization is targeted to the
audience, the story is evident, and the conclusion or
action required is clearly apparent. The visualization (or
presentation) is delivered in a convincing way that
demonstrates confidence, competency, and
thoroughness.



51作业君

Email:51zuoyejun

@gmail.com

添加客服微信: abby12468