Page 1 of 8 School of Science ISYS1055/1057 Database Concepts Assignment 2 Assessment Type: Individual assignment; no group work. Submit online via Canvas→Assignments→Assignment 2. Marks are awarded for meeting requirements as closely as possible. Clarifications/updates may be made via announcements/relevant discussion forums. Due date: Tuesday 26 May 2020, 23:59. Please check Canvas→Syllabus or via Canvas→Assignments→Assignment 2 for the most up to date information. As this is a major assignment in which you demonstrate your understanding, a late penalty of 10% of full available marks per day or part day applies for up to 5 days late. After 5 days, 0 marks will be awarded. Weighting: 50 marks 1. Overview Database systems are a key technology for the storage, management, manipulation, and retrieval of structured data. They have an impact on the use of information technology in applications ranging from banking, to travel bookings, to online shopping. In this assignment you will apply the skills and concepts that you have learned about database systems in the course so far. 2. Assessment Criteria This assessment will determine your ability to: 1. Follow coding, convention and behavioral requirements provided in this document and in the lessons. 2. Independently solve problems by using database concepts taught over the first several weeks of the course. 3. Understand the relational model. 4. Independently design a database using the ER model. 5. Write and understand SQL queries. 6. Meet deadlines. Seek clarification from your instructor, when needed, via discussion forums. This assignment is worth fifty points in total for five questions (15+8+8+10+9=50), which accounts for 50% of the overall assessment for the course. The revised assessment components and weights for the course are: Assignment 1 Assignment 2 Assignment 3 20% 50% 30% 3. Learning Outcomes This assessment is relevant to the following Course Learning Outcomes: • CLO 1: Describe various data modelling and database system technologies. • CLO 2: Explain the main concepts for data modelling and characteristics of database systems. • CLO 3: Identify issues with and compare, justify relational database design using the functional dependency concepts. • CLO 4: Apply SQL as a programming language to define database schemas and update database contents. • CLO 5: Apply SQL as programming language to extract data from databases for specific users’ information needs • CLO 6: Design a database schema using conceptual modeling mechanisms such as entity-relationship diagrams. It also supports the following Graduate Learning Outcomes: • Enabling Knowledge: You will gain skills as you apply data modelling knowledge effectively in diverse contexts. • Critical Analysis: Analyse and model requirements and constraints for the purpose of designing and implementing software artefacts and IT systems. • Problem solving: Design and implement database solutions that accommodate specified requirements and constraints, based on analysis or modelling or requirements specification. Page 2 of 8 4. Submission format Submit your assignment via Canvas→Assignments→Assignment 2. Your submission must be a single .pdf file, with the filename being your student number (e.g., S1234567.pdf) that contains the answers for each question below. ● Clearly number each question and sub-question in line with the numbering in the assignment (e.g. Q1.1, Q1.2, etc.) ● Start each main question (Q1, Q2, Q3, Q4, Q5) on a new page ● Use at least 11-point font size ● It is your responsibility to correctly submit your files. Please verify that your submission is correctly submitted by downloading what you have submitted to see if your .zip file includes the correct content. ● Never leave submission to the last minute -- you may have difficulty uploading files. ● You can submit multiple times – a new submission will override any earlier submissions. However, if your final submission is after the due time, late penalties apply. ● If unexpected circumstances affect your ability to complete the assignment, you can apply for special consideration. Special Consideration that extends beyond the release of solutions (typically 1—2 weeks) will automatically result in an equivalent assessment, assessing the same knowledge and skills of the assignment (time to be arranged by the course coordinator). ● More information on special consideration is available at https://www.rmit.edu.au/students/student-essentials/assessment-and-exams/assessment/special-consideration 5. Academic integrity and plagiarism (standard warning) Academic integrity is about honest presentation of your academic work. It means acknowledging the work of others while developing your own insights, knowledge and ideas. You should take extreme care that you have: • Acknowledged words, data, diagrams, models, frameworks and/or ideas of others you have quoted (i.e. directly copied), summarised, paraphrased, discussed or mentioned in your assessment through the appropriate referencing methods, • Provided a reference list of the publication details so your reader can locate the source if necessary. This includes material taken from Internet sites. If you do not acknowledge the sources of your material, you may be accused of plagiarism because you have passed off the work and ideas of another person without appropriate referencing, as if they were your own. RMIT University treats plagiarism as a very serious offence constituting misconduct. Plagiarism covers a variety of inappropriate behaviours, including: • Failure to properly document a source • Copyright material from the internet or databases • Collusion between students For further information on our policies and procedures, please refer to the University website. 6. Assessment declaration When you submit work electronically, you agree to the assessment declaration. 7. Rubric/assessment criteria for marking The detailed rubric and assessment criteria are available online via Canvas→Assignments→Assignment 2. Page 3 of 8 8. Assignment Questions Question 1. SQL (15 points). In addition to the lecture notes, you should also study by yourself the SQL*Plus tutorial on Canvas (the Oracle section) and other resources for syntax and useful functions. The relational schema for the Academics database is as follows: DEPARTMENT(deptnum, descrip, instname, deptname, state, postcode) ACADEMIC(acnum, deptnum*, famname, givename, initials, title) PAPER(panum, title) AUTHOR(panum*, acnum*) FIELD(fieldnum, id, title) INTEREST(fieldnum*, acnum*, descrip) Some notes on the Academics database: • An academic department belongs to one institution (instname) and often has many academics. An academic only works for one department. • Research papers (PAPER) are often authored by several academics, and of course an academic often writes several papers (AUTHOR). • A research field (FIELD) often attracts many academics and an academic can have interest in several research fields (INTEREST). Primary keys are underlined and foreign keys are marked with *. You should download the SQL script for defining and populating the database academics.sql from Canvas (the Oracle section) and run academics.sql in your Oracle account to build the database. Write ONE SQL query for each of questions 1.3 through to 1.15. Your query must be formatted in such a way that it could be directly ported and run in Oracle SQL Developer. For example, if you use explanatory comments, they must be appropriately formatted SQL comments. Do not include the output of the query or the script used to create the tables. Assessment notes: • Each question in this section is worth 1 point. • For questions with “You must ...” or “You must not ...” requirements, queries failing to meet the requirements receive a maximum of 0.5 mark. For example, question 1.2 has “You must use a subquery”. A query not using the subquery operator can receive a maximum of 0.5 mark. • Your query should not output duplicates, but use DISTINCT only if necessary. • Queries are marked in terms of both correctness and efficiency. Unnecessary joins will incur a deduction. 1.1. Explain the following query in English. A literal explanation will receive 0 marks. select givename, famname, instname from academic natural join department where acnum in (select acnum from author where acnum not in (select acnum from interest group by acnum)) and deptNum in (select deptNum from academic where deptname = ‘Computer Science’); Page 4 of 8 1.2. The following SQL query is meant to output a list of papers (panum) with the total number of authors for each paper. It has syntax errors and logic errors. Explain the syntax and logic errors and give the correct query. select PaNum, count(A1.AcNum) from Author A1, Author A2 where PaNum = A2.PaNum group by PaNum; 1.3. Find departments that have a description (descrip) available in the database. Return all details of these departments. 1.4. List the paper number and title of papers by the academic whose acnum is 100. 1.5. For each academic, give the acnum, givename, famname and the total number of papers s/he has written. Note that if an academic has not written any paper, his/her total should be zero. You can use JOIN operators such as NATURAL, JOIN ...ON. 1.6. The research field ID is a research field classification code representing classes for three “Levels”. These three Levels are separated by a “full stop” in a single string. For example the research field ID “B.1.6” represents that the research field belongs to Class “B” for Level one, Class “1” for Level two and Class “6” for Level three. For research fields in Class “1” for Level two, list the field IDs and the number of academics for each field ID. 1.7. Find departments where at least one academic does not have research interest, and list the deptnum, depntname, instname of these departments. Must use a subquery. 1.8. Output in alphabetical order the acnum, famname, givename, and department number (deptnum), and description (descrip) of authors whose family name starts with “C”. 1.9. List the fieldnum, title, and total number of interested academics (under the heading "NO. ACADEMICS INTERESTED") in each research field, in increasing order (i.e. ascending order) of fieldnum. 1.10. List in alphabetical order the institution and name of departments where at least 10 academics have written papers. 1.11. List the deptnum of departments whose postcodes are in the range 3000..3999 and that do not have any academics with the title of Professor (stored as “Prof” or “Prof.” in the database) , including departments that do not have any academics. 1.12. Find the departments that have produced at least ten papers (that is, those departments where the sum of papers written by their academics is at least ten). Output their deptnum and deptname in ascending order. 1.13. List the deptnum and deptname of departments whose academics have never written any papers. 1.14 List papers (panum) by academics with research interests in fields related to "data". You must use EXISTS. Note that “fields related to data” includes any occurrence of the four letters “data” within a field name, in any case. 1.15. The popularity of a field is measured by the number of interested academics. List details (filednum, ID and title) of the most popular field together with the total number of interested academics. Page 5 of 8 Question 2. The Relational Model (8 points). Consider the below relational database schema for a project management database of five relations. A sample tuple is given for each relation to help explain the meaning of attributes. The Project Management Database Department(deptID, deptName, manager) <2, ‘Production’, ‘E5’> Employee(empID, empName, deptID, email) <’E4’, ‘Ann’, 4, ‘
[email protected]’> Project(projID, startYear, deptID) <’P3’, 2000, 2> EmpProj(empID, projID, role) <’E2’, ‘P6’, ‘Designer’> Evaluation(projID, manager, evalDate, grade) <’P3’, ‘E5’, ’11-01-2020’, 5> Assume that • Each employee belongs to one department. Each department has one manager. • Each project is for one department only. • An employee can work for many projects with different roles, but has one role for a project. A project can have many employees. • The progress of projects is evaluated once a week by department managers. The evaluation of a project on a particular date has a grade, which is an integer between 0 (very bad) to 5 (excellent). Answer the following questions based on the given description. 2.1: (2 points) Give all likely FDs. Do not include trivial or redundant FDs. 2.2: (2 points) Give {empID, projID}+ and {deptID}+ based on the FDs in Question 2.1. 2.3: (2 points) Specify the primary key (by underlining) and any foreign keys (with *) for each relation. 2.4: (2 points) Discuss the normal form for the Evaluation relation using the FDs in Question 2.1. Page 6 of 8 Question 3. Normalisation (8 points). Consider the following APP relation schema about patients and their scheduled appointments with doctors at a clinic: APP(docID, docName, patID, patName, patDOB, appDate, appTime, roomNo) The semantics of attributes are as follows: • Several doctors are on duty to see patients at the clinic. docID and docName are the unique ID and name of doctors. • Attributes patID, patName and patDOB are the unique ID, name and date of birth of patients. Attributes appDate and appTime are the date and time for a patient's appointment. • The clinic has several consultation rooms. A doctor is assigned a particular room on a day for consultation, and no two doctors are assigned the same room on the same day. • A patient can make an appointment for a particular date and time for consultation with a doctor. A patient can only make one appointment for a day. FDs based on business rules are given as follows: docID --> docName patID --> patName, patDOB patID, appDate, appTime --> docID, roomNo appDate, docID --> roomNo patID, appDate --> appTime, roomNo, docID Answer the following questions. 3.1 (2 points) The given FDs may have redundancies. Give the minimal basis for the given FDs. 3.2 (2 points) Discuss all candidate keys for the APP relation. Explain your answer using the functional dependencies in Question 3.1. 3.3 (4 points) The APP relation is not in BCNF or 3NF. Decompose the relation into BCNF/3NF relations. Your decomposition must keep all functional dependencies and must be lossless. For each resultant relation, discuss if it is in BCNF or 3NF and indicate the primary key (underline) and any foreign keys (*). Page 7 of 8 Question 4. ER Model (10 points) Due to your experience in designing the database for the "Legendary League" game, you have been asked to design the ER diagram for a bigger database to manage the events for the "Legendary League" eSports Oceanic Championship (OC). The requirements are as follows: • Registered teams compete in the OC. Each team has a name, and a number of team members. A team also maintains a rank throughout the OC, reflecting how well it is doing in the championship. • Team members are identified by a unique member id, their first name, and their surname. There are two types of team members: managers, and players. For each manager, their number of years of experience is tracked. Each player in a team has an in-game player name, and an assigned role in the team. A role can be one of "top", "mid", "jungle", "adc" or "support". • A match is a contest between two teams. Teams get a final result, “win” or “lose”, for each match that they play in. • Matches occur at a specified date and time. They also take place in uniquely named venues. In addition to names, venues have a location and a capacity. Only venues that are part of the OC need to be tracked. • In a match, each player selects a single champion from a roster of pre-defined champions to play. Each champion may only be selected once in a match. A champion has a unique champion name, and also has a type (for example, “mage” or “support”). • Additional performance statistics need to be recorded for each champion being used by a particular player in a particular match. The three statistics to be recorded are a count of the number of each of the "kills", "deaths" and "assists" that they achieve (for example, in a particular match, a particular player playing a particular champion might achieve 4 kills, 2 deaths, and 3 assists). According to the requirements above, give the ER diagram for the database using the UML class symbols (as used in the lecture notes and tutorials), making appropriate assumptions where necessary. You must represent entities, relationships and their attributes, and all applicable constraints in your diagram. Explain any constraints that cannot be expressed in the diagram. Page 8 of 8 Question 5. ER to Relational Schema Mapping (9 points). Consider the University database ER diagram below, shown using the UML class symbols. Some notes on the ER diagram: • Classes within a course are identified by group numbers. Lecture classes always have groupNo zero (“0”), and tute classes are numbered as 1, 2, 3, etc. • Classes are taught by staff members, of which some are tutors on contracts. Map the diagram to a relational database schema. Indicate the primary key (underline) and any foreign keys (asterisk) for each relation. Figure 1 The ER model for a University database