Last time… 1 ◼ 1 NF ◼ 2 NF ◼ Functional Dependency Lecture 19 Outline ◼ THIRD NORMAL FORM 3NF ◼ R Review EXAMPLE 2 THIRD NORMAL FORM (3NF) A relation is in 3NF if both of the following conditions hold: (i) the relation is in 2NF. (ii) there are no TRANSITIVE FUNCTIONAL DEPENDENCIES of NON-PRIME ATTRIBUTES on the primary key. If A--->B and B--->C are two FDs then A--->C is a TRANSITIVE FUNCTIONAL DEPENDENCY Recall: If a set of attributes A of a relation uniquely identifies a set of attributes B of the same relation, then B is FUNCTIONALLY DEPENDENT on A. Written: A ---> B An attribute that is not part of any candidate key is a NON-PRIME ATTRIBUTE. 3 3NF: A SIMPLE EXAMPLE Consider the following relation schema: Student_Address (STUDENT_ID, STUDENT_NAME, ZIP, STATE, CITY) Assumptions: Each student has a unique STUDENT_ID and only one address, denoted by ZIP, STATE, CITY Super-keys: {STUDENT_ID}, {STUDENT_ID, STUDENT_NAME} and so on… Candidate keys: {STUDENT_ID} Primary key: {STUDENT_ID} Non-prime attributes: {STUDENT_NAME, ZIP, STATE, CITY} Note: 2NF holds. Relation is in 1NF and no non-prime attribute is functionally dependent on a proper subset of any candidate key. 4 3 NF 3NF will hold if there are no TRANSITIVE FUNCTIONAL DEPENDENCIES of NON-PRIME ATTRIBUTES on the primary key. 5 ZIP is functionally dependent on STUDENT_ID. STATE AND CITY is functionally dependent on ZIP STUDENT_ID ---> ZIP ZIP ---> {STATE, CITY} If A ---> B and B ---> C are two FDs then A ---> C is a TRANSITIVE FUNCTIONAL DEPENDENCY {STATE AND CITY} Is transitively functionally dependent on STUDENT_ID STUDENT_ID ---> {STATE, CITY} 3NF: A SIMPLE EXAMPLE STUDENT_ID ---> ZIP ZIP ---> {STATE, CITY} STUDENT_ID ---> {STATE, CITY} CONDITION FOR 3NF: There are no TRANSITIVE FUNCTIONAL DEPENDENCIES of NON-PRIME ATTRIBUTES on the primary key. STUDENT_ID ---> {STATE, CITY} is a transitive functional dependency STATE AND CITY are non-prime attributes STUDENT_ID is the primary key. Therefore, the relation Student_Address violates third normal form requirements. 6 DECOMPOSING A RELATION TO CONFORM WITH 3 NF EXAMPLE continued Student_Address (STUDENT_ID, STUDENT_NAME, ZIP, STATE, CITY) STUDENT_ID --->ZIP functionally dependency ZIP ---> {STATE, CITY} functionally dependency STUDENT_ID ---> {STATE, CITY} transitive functional dependency We need to remove the transitive functional dependency. Form a new relation that includes all attributes except those transitively determined. SA2 (STUDENT_ID, STUDENT_NAME, ZIP) (all attributes except STATE, CITY) This leaves us with attributes STATE and CITY. Form a new relation defined on these attributes with ZIP as primary key, since ZIP ---> {STATE, CITY} SA3 (ZIP, STATE, CITY) 7 DECOMPOSING A RELATION TO CONFORM WITH 3 NF EXAMPLE SUMMARY STUDENT_ID --->ZIP functionally dependency ZIP ---> STATE, CITY functionally dependency STUDENT_ID ---> STATE, CITY transitive functional dependency SCHEMA OF ORIGINAL RELATION Student_Address (STUDENT_ID, STUDENT_NAME, ZIP, STATE, CITY) SCHEMSA OF NEW RELATIONS Student_Address2
Student_Address3 8 Why is this better? ORIGINAL RELATION: Student_Address (STUDENT_ID, STUDENT_NAME, ZIP, STATE, CITY) NEW RELATIONS Student_Address2 Student_Address3 Look for potential anomalies: UPDATE ANOMALY, INSERTION ANOMALY, DELETION ANOMALY. 9 Boyce-Codd NF, 4NF, 5NF, Domain-key NF • There are other normal forms, each stricter than the one preceding it. • We do not discuss them. END OF DATABASE WORK 10 R Programming Review ◼ Review of R by conducting an analysis of a dataset. ◼ Dataset available on GS, Week 10. 11 titanic data set description 12 variables 12 • PassengerId: Serial Number • Survived: Contains binary Values of 0 & 1. Passenger did not survive — 0, Passenger Survived — 1. • Pclass — Ticket Class | 1st Class, 2nd Class or 3rd Class Ticket • Name — Name of the passenger • Sex — Male or Female • Age — Age in years — Integer • SibSp — No. of Siblings / Spouses — brothers, sisters and/or husband/wife • Parch — No. of parents/children — mother/father and/or daughter, son • Ticket — Serial Number • Fare — Passenger fare • Cabin — Cabin Number • Embarked — Port of Embarkment | C- Cherbourg, Q —Queenstown, S — Southampton R Programming Demo 1 ◼ Analyzing the ‘titanic’ dataset 13 THIS IS THE LAST LECTURE You should attend both discussion sections. Exam details will be posted separately on GS. This is the last lecture. (There is no Lecture 20) 14 Good luck! 15 欢迎咨询51作业君