程序代写案例-INFO90002 S2

INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

1


School of Computing and Information Systems <
/br>INFO90002
Database Systems and Information Modelling
Practice Exam 2 SOLUTIONS
Semester 1 2021
Reading Time: 30 minutes
Writing Time: 120 minutes (2 Hours)
This exam has 8 pages including this page
ATTEMPT ALL QUESTIONS IN ANY ORDER

Authorised Materials:
While you are undertaking this assessment, you are permitted to
• make use of any textbook, lecture slides (including soft copies)
• MySQL Workbench is supported for E.R. modelling questions
• Draw IO (diagrams.net) for Chen conceptual models or any suitable modelling equivalent
• Any lecture notes, books, laptop, PC
• You are free to use the course materials and your laptop/PC in this exam.
While you are undertaking this assessment, you MUST NOT
• make use of any messaging or communication technology
• record, screenshot, stream, upload or in any known format duplicate this document
• record, screenshot, stream, upload or in any known format duplicate your solutions
• make use of any world wide web or internet based resources such as wikipedia, github,
stackoverflow, google, Weichat or any known search engine / messaging services
• act in a manner that could be regarded as providing assistance to a student who is
undertaking this assessment or in the future will be undertaking this assessment
• seek assistance from any other student who is undertaking this assessment or in the future
will be undertaking this assessment

INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

2
Instructions to Students:
• This exam is in 8 sections. Attempt all questions in all sections.
• We recommend using pencil and paper for modelling questions to save time
• The total for this exam is 100 marks representing 50% of your final assessment
• Attempt all questions which are of unequal marks value
• This exam is a timed assessment which must be completed within 150 minutes of
official commencement time
• Questions can be answered in any order (please number your attempts)
• Start a new question on a new page
• PLEASE DO NOT USE RED font colour
• You must not communicate with other students whilst taking this exam, e.g. using
messaging, chat rooms or email
IMPORTANT
• Your file upload must be a single PDF document before the elapsed time.
• No other document format will be assessed (e.g. Pages, doc, txt, .SQL, etc).
• Email submissions will not be assessed.
• Every question attempt must be numbered (e.g. Q2C, Q1, Q6) to ensure it is assessed.
• If you choose to upload photos, ensure that they are clear, compressed to < 1mb, and
importantly, check them once you have uploaded. It is your responsibility to ensure your
file and any images has been submitted successfully.
The work you submit must be based on your own knowledge and skills and without the assistance of any other
person. You MUST NOT directly copy work that you have not authored (e.g. slide notes, websites, other student's
study notes).
INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

3
Q1. ER Modelling (30 Marks)
TeaTime
TeaTime is a platform for tea lovers. To use the system, users must install the TeaTime app on their
phone and set up a user profile that contains: a login name (their email address), password (encrypted
and stored as a string of 64 characters), short screen name e.g. "DonkeyKong64" (which other users
will see), profile photo, and a “My Cuppa” entry - their favourite tea type, chosen from a list of
standard drink names such as “Peppermint Tea” and “Black Tea with Honey”. We plan to have up to
2 million customers (end users) accessing the system.
While a user’s phone is switched on, the TeaTime app sends the phone’s current location to the server
once per minute. We store these, so that the system knows where a given user is “now”, as well as
the history of where the user has been.
All locations in this system are recorded as a pair of numbers representing latitude and longitude.
Latitudes are between -90 and 90 degrees (south pole to north pole) while longitudes are between -
180 and 180 degrees (west or east of the prime meridian in Greenwich). We will use a precision of 4
decimal places, which is about 11 metres at the equator. For example, the Doug McDonell building at
UniMelb is at latitude -37.7989, longitude 144.9627.
The names, locations and opening hours of about 600 tea stores are available via the app. Users can
browse stores either in an alphabetical list or via a map. When viewing a store's profile, users can see
reviews and photos uploaded by other users, and the store's average rating. The store's opening hours
are recorded as an opening and closing time for each day of the week (for example, a café might open
Mondays 8am to 6pm, Tuesdays 9:30am to 7pm, and so on). Each café stores a menu of the teas it
sells: these must be drawn from our standard list mentioned above.
Users can rate stores. A rating consists of a whole number between 0 and 10, along with an optional
piece of text (up to about 30 words). A given user can only rate a given tea store once. Users can mark
particular stores as favourites. These can be viewed in a list. Users can later “unfavourite” the tea
store if they wish, and yet later “favourite” it again. We keep a history of these favourites and
unfavourites. Customers can use the app to order drinks from a tea store. To do this a user first selects
a tea store, then chooses how many of each tea type(s) they want from the store's menu. We keep
track of when orders are placed and when customers later pick up the order.

Q.1. Draw a physical model in Crow’s Foot notation for this case study. Be sure to write down any
assumptions you make.

(30 marks)





INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

4
SOLUTION:



INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

5
Q2. SQL (20 marks)
Given the schema in Figure 2, write a single SQL statement to correctly answer each of the following
questions (3A – 3D). DO NOT USE VIEWS to answer questions.

Figure 1 Company Schema
Q.2A. Write a query that returns customers (company names) and the details of their orders
(orderid and orderdate), including customers who placed no orders.
(3 marks)
SELECT C.companyname, O.orderid, O.orderdate
FROM Customers AS C LEFT OUTER JOIN Orders AS O
ON O.custid = C.custid;

Q.2B. Write a query that returns the first name and last name of employees whose manager was
hired prior to 01/01/2002.
(4 marks)
SELECT E.firstname, E.lastname
FROM Employees AS E INNER JOIN Employees AS MNGR
ON E.managerid = MNGR.empid
WHERE MNGR.hiredate < '20020101';


Q.2C. Write a query that returns customers (customer ID) whose company name is ‘Google’, and for
each customer return the total number of orders and total quantities for all products that were not
discontinued (‘1’ means discontinued, ‘0’ not discontinued).
(5 marks)
INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

6
SELECT C.custid, COUNT(O.orderid) AS numorders, SUM(OD.quantity) AS
totalqty
FROM Customers AS C
JOIN Orders AS O
ON O.custid = C.custid
JOIN OrderDetails AS OD
ON OD.orderid = O.orderid
JOIN Products AS P
ON OD.productid = P.productid
WHERE C.company = 'Google'
AND P.discontinued = '0'
GROUP BY C.custid;


Q.2D. Write a query that returns the ID and company name of customers who placed orders in 2007
but not in 2008.
(8 marks)

SELECT custid, companyname
FROM Customers
WHERE custid IN
(SELECT custid
FROM Orders
WHERE orderdate >= '20070101'
AND orderdate < '20080101')
AND custid NOT IN
(SELECT custid
FROM Orders
WHERE orderdate >= '20080101'
AND orderdate < '20090101');

-- ALSO / OR

SELECT custid, companyname
FROM Customers INNER JOIN ORDERS O1
ON Customers.custid = O1.custID
WHERE YEAR(orderdate) = 20007;
WHERE NOT EXITS
(SELECT *
FROM Customers INNER JOIN ORDERS O2
ON Customers.custid = O2.custID
WHERE YEAR(orderdate) = 2018
AND O1.custid = O2.custID);
INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

7
Q3. Normalisation (16 marks)
Q.3A. The table shown below is part of an office inventory database. Identify the design problems and draw a revised table structure in 3rd Normal Form
(3NF) that corrects those problems. For each step explicitly identify and discuss which normal form is violated.
(Key: PK = Bold FK = Italic PFK = Bold + Italic)

Inventory (ItemID, Description, Qty, Cost/Unit, Dept, Dept Name, Dept Head)

ItemID is the candidate key for this table.

The following functional dependencies hold:
Dept ⟶ Dept Name and Dept Head
Qty, Cost/Unit ⟶ Inventory Value

ItemID Description Dept Dept Name Dept Head Qty Cost/Unit Inventory Value
4011 1.4m Desk MK Marketing Jane Thompson 5 200 1000
4020 Filing Cabinet MK Marketing Jane Thompson 10 75 750
4005 Executive chair MK Marketing Jane Thompson 5 100 500
4036 1.2m Desk ENG Engineering Ahmad Rashere 7 200 1400
Table 1. The Inventory table
(10 marks)



INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

8
1NF: All data is atomic – there are no repeating groups.
2NF: The table is in 1NF and there are no partial functional dependencies, so the table is in 2NF.
3NF: The table is in 2NF however there is a transitive functional dependency:
Dept → Dept Name, Dept Head is not a key attribute. This transitive FD violates 3NF.

Quan, Cost/Unit → Inventory Value
Because there is derived value between these columns (Quan × Cost/Unit = Inventory Value), this FD can be resolved
by removing the redundant Inventory Value column.

3NF
Inventory2(Item ID, Description, Dept, Quan, Cost/Unit)
Department (Dept, Dept Name, Dept Head)

KEY:
BOLD = PK
ITALIC = FK
BOLD + ITALIC = PFK



Q.3B. Given the following relation (Inventory), and its functional dependencies - is it possible to demonstrate Armstrong's Axioms of Reflexivity,
Augmentation and Transitivity?
Inventory (ItemID, Description, Qty, Cost/Unit, Dept, Dept Name, Dept Head)
ItemID is the candidate key for this table.

The following functional dependencies hold:
Dept ⟶ Dept Name and Dept Head
Quan, Cost/Unit ⟶ Inventory Value

INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

9
Reflexivity: Pick any two sets of attributes where one is a subset of the other.
E.g. {Dept} ⊆ {Dept, Qty} which then means
Dept, Qty ⟶ Dept
Augmentation: Take an existing FD and stick the same attribute to both sides. So from Description ⟶ Cost/Unit we
can get:
Description, Qty ⟶ Cost/Unit, Qty

Transitivity: If we take (2) above together with the original 3rd dependency we can do:
Description, Qty ⟶ Cost/Unit, Qty AND Qty, Cost/Unit ⟶ Inventory Value gives us
Description, Qty ⟶Inventory Value
(6 marks)
INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

10
Q4. Data Warehousing (10 marks)

Q4A. Transurban operates toll roads in Australia and the United States of America. Traffic data analysis
helps determine road maintenance, RFID reader maintenance, gantry maintenance and capacity
management to reduce inefficiencies in its tollway network.
Each part of the tollway is identified by a section (E3), and multiple sections (E1, E2, E3) make up a
chunk, (C3) and multiple Chunks make up an entire roadway ("Eastlink"). Many different vehicles on
toll roads every day. Each vehicle falls into one category: motor cycles, passenger vehicles, 4WD's,
vans, trucks, prime movers, buses, trailers, recreational and miscellaneous machinery (e.g. tractors,
cranes, street sweepers, back hoes).
Transurban's management wants to understand the vehicle trips on its road network. They need to
understand the number of trips, trip length, trip duration,
Draw a star schema to support the design of this data warehouse, showing the attributes in each table.
You do not need to select data types. Clearly display the legend for Primary Key, Foreign Key and
Primary Foreign Key.
(8 marks)


Q.4B. Why are star schemas preferred over relational database designs to support decision making?
(2 marks)
Star schemas support Data Warehouses which are organized around facts
(business measures) and dimensions that help with managerial decision
making. Star schemas are denormalised, making it faster to aggregate and
query data. Faster aggregates and query data is required due to the large
volume of data stored in data warehouses.

INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

11

Q5. Storage and Indexes (12 marks)

Q5A. Explain the components and storage hierarchy of a DBMS data file.
(4 marks)
DBMS files have the file, pages and records. The relationship is
hierarchical. One file has many pages, one page has many records.
The pages are linked together in a forward and backward direction so that
each page knows the previous page and the next page (also known in
programming terms as a doubly linked list).

Q5B. Sort the following values into a hash index where n=5.
{12, 5, 1, 34, 19, 13, 87, 2, 7, 15, 45, 8, 10, 101, 244, 910}
( 2marks)
Bucket 0 = {5, 10, 15, 45, 910}
Bucket 1 = {1, 101}
Bucket 2 = {2, 7, 12, 87}
Bucket 3 = {8, 13}
Bucket 4 = {19, 34, 244}

Q5.C An OLTP database has a large volume of users, with each user running a large volume and
variety of concurrent DML and SELECT statements within the database (schema).
What recommendations would you make about file organization, index choice and index
classification, given the volume and variety of transactions in OLTP databases? Justify your
recommendations.
(6 marks)
Sorted File Organization on Primary Key(although updates will take longer)
- Quicker for reads slightly longer for writes, but writes will be
comparatively small

Clustered Index on Primary Key of underlying Sorted File
- tables will need to be joined in an OLTP database which is usually
normalised and ER OLTP uses PK and FKs

Indexing only on frequently used columns only
for columns frequently used in WHERE and HAVING statements
Speed up retrieval of data for better throughput.
Not too many indexes
- effect insert update delete performance


INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

12
Q6. Transactions (8 marks)
Q6A. What is the inconsistent retrieval problem? Describe the problem and use your own example
to demonstrate the answer.
(4 marks)
The inconsistent retrieval problem is where data in one long running read
session (SQL SELECT) is reading data that has been modified in other
sessions. At the end of the read session it may have read unmodified and
modified data from other sessions.
In the table below Session 1 is calculating the interest payment for
credit card payments for all credit card accounts at a bank. Session 2,
Alice is paying off her entire credit card balance (a series of update
statements). In Session 3 Bob is adding more credit to his credit card
balance - another update.

Timestamp Session 1
Query Calculating
credit card interest
Session 2
Alice
Session 3
Bob
1 SELECT begins Balance 673.50 Balance 99.50
23 Reads Alice's balance
34 Alice pays off her
entire balance

47 Bob adds $49.50 to
his credit
55 Reads Bob's balance
60 Finishes running query
Table6. Transaction log demonstrating inconsistent retrieval problem.
In this scenario, the query is not including Alice's payment, but is
including Bob's purchase. This is an inconsistent retrieval of data.

Q6B. Describe how DBMS solve concurrency issues?
(4 marks)
There are three ways to resolve concurrency with database management
systems: Locking, Timestamping and the Optimistic method.
Locking ensures that all CRUD updates (insert update delete) occur
serially. Users can only modify records one at a time by locking the
object.
Timestamping uses an integer to record the time each transaction runs.
This ensures that all transactions are executed serially and we can have
an orderly access to data (managed by the Concurrency Manager in the DBMS
Architecture).
Optimistic method - allows all users to modify data concurrently, and only
checks if there are clashes in data modification when the users attempt to
commit. If users are modifying the same data their transactions are rolled
back and users can try the operation again.
INFO90002 S2 2020 Practice Exam No 2

© 2021 University 0f Melbourne

13
Q7. NoSQL (4 marks)
Q7. Domain integrity can be violated in NoSQL databases whereas Relational databases will report a
domain integrity violation. Discuss the benefits and risks of violating domain integrity.
(4 marks)

Benefits of domain integrity – data stored in the database must comply
with the rules of that domain (e.g. Date datatype must be stored as a
date).
Disadvantage of domain integrity – is that we are limited by the
databases capability to what types of domains we can store (e.g. spatial
data)

Benefit of violating domain integrity – we can store many different types
of data that is thematically related on a topic e.g. novel, movie,
reviews, critical essays, pictures they do not confirm to any specific
domain integrity and can be stored.

Disadvantage of violating domain integrity
Data accuracy, data integrity and data reliability

Good Luck!
END OF EXAM

欢迎咨询51作业君
51作业君 51作业君

Email:51zuoyejun

@gmail.com

添加客服微信: ITCSdaixie