Functional
Dependencies
COMP9311 24T2; Week 5.1
By Zhengyi Yang, UNSW
How good is your DB design?
➢ Conceptual Level
➢ How do users interpret the relation schemas and the meaning of
their attributes?
➢ Physical Level
➢ How the tuples in a base relation are stored and updated?
2
How good is your DB design?
➢ Information Preservation
➢ Does your design correctly capture all attributes, entities and relations?
➢ Minimum Redundancy
➢ Does your design minimize redundant storage of the same
information and reduce the need for multiple updates?
3
Example of Redundancy
Suppose we have a table inst_dept which contains information
for both instructor and department.
Result is possible repetition of information, which leads to update
anomalies.
4
Update Anomalies
Redundancy in a database means storing a piece of data more
than once.
Redundancy is often useful for efficiency and semantic reasons,
but creates the potential for consistency problems.
A poor redundancy control may cause update anomalies.
5
Update Anomalies
Consider the previous example relation.
Insertion Anomalies
➢
To insert a new employee, we must include the correct values for
➢
his/her department or NULLs.
How to insert a department with no employees? (set ID to null
➢
violates primary key constraint if ID is the primary key)
Deletion Anomalies
➢
What if we delete the last employee in a department? (lose the
➢
information of a department)
Modification Anomalies
➢
What if we change the budget of a department? (have to maintain
➢
multiple duplication of the same value)
6
Devise a Theory for what is Good
We want to do two things:
1. Decide whether a particular relation R is in “good” form.
2. If a relation R is not in “good” form, decompose it into a set of relations {R , R , ..., R }
1 2 n
such that
➢ each relation is in good form
➢ the decomposition is a lossless
Our theory/properties are defined based on functional
dependencies.
7
Attribute Values can be Related
8
Functional Dependencies
A functional dependency describes a relation between attributes
Whenever any two tuples t and t of r agree on one attribute α,
1 2
they also agree on another attribute β:
t [α] = t [α] ⇒ t [β ] = t [β ]
1 2 1 2
This relation is denoted α → β.
9
Functional Dependencies
ID → Name, Depart_name → Building
Describes the semantics or meaning of the attributes
10
Functional Dependencies
The functional dependency
ID Name Code Grade
X → Y is true (holds) 100 J 3550 A
200 X 3550 B
if and only if
100 J 4540 B
100 J 4550 A
t [X] = t [X] ⇒ t [Y] = t [Y]
1 2 1 2
in relation R
Example: R = {ID, Name, Code, Grade}
➢
ID → Name (OK)
➢
ID → Grade (not OK), ID → Code (not OK)
➢
ID, Name → Grade (not OK), ID, Code → Grade (OK)
➢
ID, Name → Name (trivial)
➢
11
Functional Dependencies: Test
Let’s see if you understand (Test1)
F: X → Y
X Y
---------
a b
a ?
12
Functional Dependencies: Test
Let’s see if you understand (Test1)
F: X → Y
X Y
---------
a b
a ?
? must be b
13
Functional Dependencies: Test
Let’s see if you understand (Test2)
F: X → Y
X Y
---------
a b
? b
14
Functional Dependencies: Test
Let’s see if you understand (Test2)
F: X → Y
X Y
---------
a b
? b
? can be any value
15
Functional Dependencies: Test
Let’s see if you understand (Test3)
F: X → Y
X Y
---------
a b
c b ?
is it okay?
16
Functional Dependencies: Test
Let’s see if you understand (Test3)
F: X → Y
X Y
---------
a b
c b ?
Yes, it doesn’t violates X → Y
17
Functional Dependencies: Test
Let’s see if you understand (Test4,5)
X, Y → X ?
X → X ?
18
Functional Dependencies: Test
Let’s see if you understand (Test4,5)
X, Y → X ? Yes
X → X ? Yes
Note: Functional dependencies like these are trivial
19
Functional Dependencies: Test
Let’s see if you understand (Test6)
Consider R (A , B) with the following instance r.
1 4
1 5
3 7
On this instance, A → B does NOT hold
20
FD: relation between two sets
A functional dependency is a relation between two sets of attributes.
I.e., the value for a set of attributes determines the value for another
set of attributes.
A functional dependency describes the relation between two sets of
attributes from a relation.
Examples:
XY → WZ
XW → Z
Z → XQ
21
Functional Dependencies
A functional dependency is a constraint between two sets of
attributes for all its relation instances.
A constraint means a constraint across all it's relation instances
(extensions), that it must hold for all relation instances.
F is a set of FD specified on relation R. It must hold on all relation
instances.
22
Constraint on all Relations
Example: course → course_code in Students
STUDENTS
id course course_code major prof
1 Database 353 Comp Sci Smith
2 Chem101 427 Chemistry Turner
3 Database 353 Comp Sci Smith
STUDENTS
id course course_code major prof
1 Database 353 Comp Sci Yu
4 Agile Dev 821 Comp Sci Turner
…
5 Compiler 237 Comp Sci Clark
23
Legal Extensions of R
Relation extensions r(R) that satisfy the functional dependency constraints
are called legal relation states (or legal extensions) of R.
Let course → course_code be the only FD for Students
STUDENTS Legal
id course course_code major prof
1 Database 353 Comp Sci Smith
2 Chem101 427 Chemistry Turner
3 Database 353 Comp Sci Clark
STUDENTS Also legal
id course course_code major prof
1 Database 353 Comp Sci Yu
4 Agile Dev 821 Comp Sci Turner
24
5 Compiler 237 Comp Sci Clark
Notation and Terminology
Let X → Y be a functional dependency on relation R
We say that
X → Y holds on R
➢
We say that
X functionally determines Y
➢
Y is functionally dependent on X
➢
We say that
X is determinant of the dependency
➢
Y is dependent of the dependency
➢
OR
X is left-hand side of the dependency
➢
Y is right-hand side of the dependency
➢
25
Functional Dependencies
A WORKS_ON relation
➢ Ssn = social security number
➢ Pnumber = project number
Question:
What might be the FDs of
WORKS_ON?
Ssn, Pnumber → Hours
26
Functional Dependencies
An EMPLOYEE relation
➢ SSn = social security number
➢ Bdate = birthday
➢ Dnumer = department number
Question: What might be the FDs of EMPLOYEE?
Ssn → Ename, Address, Bdate
27
Functional Dependencies
Example: R = {ID, Name, Code, Grade}
r(R) Instance A r(R) Instance B
ID Name Code Grade ID Name Code Grade
100 J 3550 A 100 J 3550 A
200 X 3550 B 200 X 3550 B
100 J 4540 B 100 J 4540 A
100 J 4550 A 100 J 4550 A
➢ ID => Name (OK), ➢ ID => Name (OK)
➢ ID => Grade (not OK), ➢ ID => Grade (OK),
➢ ID => Code (not OK), ➢ ID => Code (not OK)
➢ ID, Name => Grade (not OK), ➢ ID, Name => Grade (OK),
➢ ID, Code => Grade (OK). ➢ ID, Code => Grade (OK).
Important: You can’t infer FD’s from a relation’s instances 28
Functional Dependencies
Functional dependencies exist to:
➢ specify the semantics between attributes
➢ semantics of a relation should be kept across all its extensions
➢ specify constraints on a relational schema
➢ this semantics is not captured by ER
29
Designing FDs
FD cannot be inferred automatically from a given relation
extension r.
So given a relation, where do its FDs come from? Where do we
find it?
Deciding the FDs of a table is part of a design decision.
➢ Defined explicitly by someone who knows the semantics of the attributes of R.
30
Designing FDs
Assume we need to define the FDs of this relation
STUDENTS
ID Course Phone Major Prof Grade
We need to know the semantics of the columns.
Could each ID have a unique phone number and major?
31
Which Columns are Related?
STUDENTS
ID Course Phone Major Prof Grade
Every ID has a unique phone number and major
➢ We can say {ID} →{Phone , Major}
Other relations between columns:
➢ Every course has a unique professor {Course} → {Prof}
➢ Every ID and course has a unique grade {ID , Course} → {Grade}
Whenever the semantics of two sets of attributes in R indicate
that a functional dependency should hold, we specify the
dependency as a constraint.
32
Final Notations
We may denote the attributes sets with/without curly brackets
➢ With curly brackets, attributes are comma separated
➢ {X,Y} = XY
The order of the attribute sets doesn’t matter
➢ ZY = YZ
➢ {Z,Y} = {Y, Z}
33
Dependency Diagram
Each horizontal line represents a FD
➢ Left-hand side attr. connected by vertical lines to the line,
➢ Right-hand side attr. connected by vertical lines with arrows
➢ Arrow pointing toward the attributes
Dependency diagram from previous example.
Id Course Phone Major Prof Grade
34
Dependency Diagram (Cont.)
Some more examples of dependency diagrams.
35
Inferring other FDs
A → B and B → C, what do we know about A → C?
Given A → B and B → C on relation R,
We know A → C holds on R, given A determines B, and B
determines C.
36
Inferring Other FDs
It’s true that given a set F of functional dependencies, there are other
functional dependencies that are logically implied by F.
F |= X→ Y
Denotes that set of FDs F infers X → Y if all relation instances
satisfying F also satisfies X → Y .
Example:
F = A → B, B → C,
F |= A → C
Usually, the schema designer will only specify the functional
dependencies that are semantically obvious.
37
Armstrong’s Axioms
These are the inference rules for functional dependencies
➢ Rule 1 (reflexivity)
➢ if β ⊆ α, then α → β
➢ Rule 2 (augmentation)
➢ if α → β, then γ α → γ β
➢ Rule 3 (transitivity)
➢ if α → β, and β → γ, then α → γ
➢ Where α, β, γ are all (nonempty) sets of attributes
The above are the primary rules/axioms from Armstrong’s Axioms (1974)
38
Practice
R = (A, B, C, G, H, I)
F = { A → B, A → C, CG → H, CG → I, B → H }
These FDs can be inferred/deduced.
A → H
AG → I
CG → HI
39
(Solutions)
R = (A, B, C, G, H, I)
F = { A → B, A → C, CG → H, CG → I, B → H }
A → H
by transitivity from A → B and B → H
AG → I
by augmenting A → C to get AG → CG
then transitivity with given CG → I
CG → HI
by augmenting CG → I to infer CG → CGI,
then augmenting CG → H to infer CGI → HI,
followed up by a transitivity
40
Armstrong’s Axioms (Cont.)
Additional Rules we inferred from Armstrong’s axioms.
➢ Rule 4 (additivity):
➢ If α → β holds and α → γ holds, then α → β γ holds
➢ Rule 5 (projectivity):
➢ If α → β γ holds, then α → β holds and α → γ holds
➢ Rule 6 (pseudo-transitivity):
➢ If α → β holds and γ β → δ holds, then α γ → δ holds
41
Proving Secondary Rules
Let’s try prove rule 5: projectivity
{X → Y Z} |= X → Y
Cheat Sheet
F1 (Reflexivity) If X ⊇ Y then X →Y .
F2 (Augmentation) {X → Y } |= XZ → Y Z.
F3 (Transitivity) {X → Y , Y → Z} |= X → Z.
42
(Solution)
Let’s try prove rule 5: projectivity
{X → Y Z} |= X → Y
Step 1. X → Y Z (Given)
Step 2. YZ → Y (Reflexivity)
Step 3. X → Y (Transitivity of 1 and 2)
Cheat Sheet
F1 (Reflexivity) If X ⊇ Y then X →Y .
F2 (Augmentation) {X → Y } |= XZ → Y Z.
F3 (Transitivity) {X → Y , Y → Z} |= X → Z.
43
Proving Secondary Rules
Let’s prove rule 6: Pseudo-transitivity
{X → Y , Y Z → W} |= XZ → W
Cheat Sheet
F1 (Reflexivity) If X ⊇ Y then X →Y .
F2 (Augmentation) {X → Y } |= XZ → Y Z.
F3 (Transitivity) {X → Y , Y → Z} |= X → Z.
44
(Solution)
Let’s prove rule 6: Pseudo-transitivity
{X → Y , Y Z → W} |= XZ → W
Step 1. X → Y (Given)
Step 2. XZ → YZ (Augmentation of 1)
Step 3. YZ → W (Given)
Step 4. XZ → W (Transitivity, from 2 and 3)
Cheat Sheet
F1 (Reflexivity) If X ⊇ Y then X →Y .
F2 (Augmentation) {X → Y } |= XZ → Y Z.
F3 (Transitivity) {X → Y , Y → Z} |= X → Z.
45
Proving Secondary Rules
Let’s prove rule 4: Additivity
{X→ Y , X → Z} |= X → Y Z
Cheat Sheet
F1 (Reflexivity) If X ⊇ Y then X →Y .
F2 (Augmentation) {X → Y } |= XZ → Y Z.
F3 (Transitivity) {X → Y , Y → Z} |= X → Z.
46
(Solution)
Let’s prove rule 4: Additivity
{X→ Y , X → Z} |= X → Y Z
Step 1. X→ Y (Given)
Step 2 . XX → XY (Augmentation of 1); that is, X → XY
Step 3. X → Z (Given)
Step 4. X Y → Y Z (Augmentation of 2)
Step 5. X → Y Z (Transitivity, from 2 and 4)
Cheat Sheet
F1 (Reflexivity) If X ⊇ Y then X →Y .
F2 (Augmentation) {X → Y } |= XZ → Y Z.
F3 (Transitivity) {X → Y , Y → Z} |= X → Z. 47
Practice FD Inference
Given F = {A → B, A → C, BC → D}
Prove A → D:
Cheat Sheet
F1 (Reflexivity) If X ⊇ Y then X →Y .
F2 (Augmentation) {X → Y } |= XZ → Y Z.
F3 (Transitivity) {X → Y , Y → Z} |= X → Z.
F4 (Additivity) {X→ Y , X → Z} |= X → Y Z.
F5 (Projectivity) {X → Y Z} |= X → Y .
F6 (Pseudo-transitivity) {X → Y , Y Z → W} |= XZ → W.
48
(Solution)
Given F = {A → B, A → C, BC → D}
Cheat Sheet
Prove A → D: F1 (Reflexivity) If X ⊇ Y then X →Y .
F2 (Augmentation) {X → Y } |= XZ → Y Z.
Step 1. A → B (Given)
F3 (Transitivity) {X → Y , Y → Z} |= X → Z.
F4 (Additivity) {X→ Y , X → Z} |= X → Y Z.
Step 2. A → C (Given)
F5 (Projectivity) {X → Y Z} |= X → Y .
F6 (Pseudo-transitivity) {X → Y , Y Z → W}
Step 3. A → BC (Additivity, from 1 and
|= XZ → W.
2)
Step 4. BC → D (Given)
Step 5. A → D (Transitivity, from 3 and
4)
49
Closure of F
Definition. the set of all dependencies that can be inferred from F
is called the closure of F.
F+ denotes the closure of F.
F+ includes dependencies in F.
Note: We typically reserve F to denote the set of functional
dependencies that are specified on relation schema R.
50
The Procedure for Computing F+
To compute the closure of a set of functional dependencies F:
F+ = F
repeat
for each functional dependency f in F+
apply reflexivity and augmentation rules on f
add the resulting functional dependencies to F+
for each pair of functional dependencies f and f in F+
1 2
if f and f can be combined using transitivity
1 2
then add the resulting functional dependency to F+
until F+ does not change any further
51
The Procedure for Computing F+
F = { X → Y, Y → Z}
F+ = {XY → X, XY → Y, XY → Z, XZ → X, XZ → Y,
XZ → Z, XYZ → X, XYZ → Y, XYZ → Z, XY → XY,
XY → YZ, XY → XZ, …}
52
Checking Membership by F+
Given F = { X → Y, Y → Z}
Question: Can X → Z be inferred or derived from the FDs in F?
How to do it? Check X → Z by computing F+?
53
Checking Membership by F+
Given F = { X → Y, Y → Z}
Question: Can X → Z be inferred or derived from the FDs in F?
How to do it? Check X → Z by computing F+?
F+ = {XY → X, XY → Y, XY → Z, XZ → X, XZ → Y,
XZ → Z, XYZ → X, XYZ → Y, XYZ → Z, XY → XY,
XY → YZ, XY → XZ, … , X → Z , … }
Oh yes… X → Z is in the closure of F.
54
Checking Membership by F+
Given F = { X → Y, Y → Z}
Question: Can X → Z be inferred or derived from the FDs in F?
How to do it? Check X → Z by computing F+?
F+ = {XY → X, XY → Y, XY → Z, XZ → X, XZ → Y,
XZ → Z, XYZ → X, XYZ → Y, XYZ → Z, XY → XY,
XY → YZ, XY → XZ, … , X → Z , … }
Oh yes… X → Z is in the closure of F.
Problem: In real life, it is impossible to specify all possible functional
dependencies for a given situation. The size of F+ is always
exponential size w.r.t |F|.
55
Closure of Attributes
Given F = { X → Y, Y → Z}
Question: How else to check if X → Z without computing F+ ?
Definition: Given a set of attributes α, define the closure of α under F
(denoted by α+) as the set of attributes that are functionally determined
by α under F.
Realistically:
Narrow our attention to X, which is smaller than F.
Compute X+ instead of F+
Then check if Z is covered by X+
X+ is the largest set of attributes functionally determined by X.
56
Closure of Attribute Sets
Pseudocode to the closure of α under F
result := a;
while (changes to result) do
for each β → γ in F do
begin
if β ⊆ result then result := result ∪ γ
end
When no additional changes to result is possible, the final value of variable
result is α+
57
Algorithm to Compute X+
An algorithm for you to follow step by step
X+ := X;
change := true;
while change do
begin
change := false;
for each FD W → Z in F do
begin
if (W ⊆ X+) and (Z − X+ ≠ ∅) then do
begin
X+ := X+ ∪ Z;
change := true;
end
end
end
58
Exercise
F = { A → B, BC → D, A → C } Cheat Sheet:
X+ := X;
Practice: Compute A+
change := true;
while change do
begin
change := false;
for each FD W → Z in F do
begin
if (W ⊆ X+) and (Z − X+ ≠ ∅)
then do
begin
X+ := X+ ∪ Z;
change := true;
end
end
end
59
(Solution)
F = { A → B, BC → D, A → C }
Cheat Sheet:
Task: Compute {A}+
X+ := X;
change := true;
1st scan of F:
while change do
begin
X+ := {A}
change := false;
X+ := {A, B}
for each FD W → Z in F do
X+ := {A, B, C}
begin
2nd scan of F: if (W ⊆ X+) and (Z − X+ ≠ ∅)
then do
X+ := {A, B, C, D } begin
X+ := X+ ∪ Z;
3rd scan of F: no change,
change := true;
therefore, the algorithm terminates.
end
end
{A}+ := {A, B, C, D }
end
60
Recall of Attribute Set Closure
R = (A, B, C, G, H, I)
F = {A → B, A → C, CG → H, CG → I, B → H}
We know (AG)+ = ABCGHI
Observation: could AG be a candidate key?
Is AG a super key?
Does AG → R? => Is (AG) + = R?
Is any subset of AG a super key?
Does A → R? => Is (A) + = R?
Does G → R? => Is (G) + = R?
61
Functional Dependencies (Cont.)
K is a super key for relation schema R if and only if K → R
K is a candidate key for R if and only if
K → R, and
➢
for no α ⊂ K, α → R
➢
62
Answer
R = (A, B, C, G, H, I)
F = {A → B, A → C, CG → H, CG → I, B → H}
We know (AG)+ = ABCGHI
Observation: could AG be a candidate key?
Is AG a super key?
Does AG → R? => Is (AG) + = R? Yes, so AG is a super key
Is any subset of AG a super key?
Does A → R? => Is (A) + = R? No
Does G → R? => Is (G) + = R? No
So AG is a candidate key
63
Procedurally Determine Keys
How to compute a candidate key of a relation R
based on the FD’s belonging to R
Algorithm:
➢ Step 1 : Assign a super-key of R in F to X.
➢ Step 2 : Iteratively remove attributes from X while retaining the property
X+ = R till no reduction on X is possible.
➢ The remaining X is a key.
Let’s try an example
64
Practice
Given:
R = {A, B, C, D}
F = { A → B, BC → D, A → C }
Step 1 : Assign a super-key of R in F to X.
Step 2 : Iteratively remove attributes from X while retaining the property X+
= R till no reduction on X is possible.
The remaining X is a key.
65
(Solution)
Given:
R = {A, B, C, D} Step 1 : Assign a super-key of R in F to X.
F = { A → B, BC → D, A → C } Step 2 : Iteratively remove attributes from
X while retaining the property X+ = R till no
Let X = {A, B, C} ({A, B, C, D} is also a super
reduction on X is possible.
key)
The remaining X is a key.
A cannot be removed because {BC}+ = {B, C,
D} ≠ R
B can be removed because {AC}+ = {A, B, C,
D} = R
We remove B from X and update X to be { A, C}
C can be further removed because {A}+ = {A, B,
C, D}
We remove C from X and update X to be { A}
66
Compute all Candidate Keys
Given a relational schema R and a set of functional
dependencies F on R, find all the possible ways we can identify a
row.
Note: we know how to compute one candidate key already.
67
Compute All the Candidate Keys
Given a relational schema R and a set F of functional dependencies on R, the algorithm to compute all the
candidate keys is as follows:
T := ∅
Main:
X := S where S is a super key which does not contain any candidate key in T
remove := true
While remove do
For each attribute A ∈ X
Compute {X-A}+ with respect to F
If {X-A}+ contains all attributes of R then
X := X – {A}
Else
remove := false
T :=T ∪ X
Repeat Main until no available S can be found. Finally, T contains all the candidate keys. 68
Compute all Candidate Keys
Given relation R(A, B, C, D, E)
with set of FDs {A → B, BC → A, D → E}
Find all the candidate keys for relation R
69
(Solution)
Step 1: Steps 3,4,5:
Let X := {A, B, C, D} Attempts to remove B, C, D
separately
Step 2:
Try to remove A {C, D}+ = {C, D, E}
{B, C, D}+ = {A, B, C, D, E}
{B, D}+ = {B, D, E}
Thus X := {B, C, D}
{B, C}+ = {A, B, C}
None can be removed
So {B, C, D} is a candidate key
and add to T
70
(Solution)
Step 6:
Find another super key
Let X := {A, C, D}
Step 7,8,9:
Attempts to remove A, C, D separately
{C, D}+ = {C, D, E}
{A, D}+ = {A, B, D, E}
{A, C}+ = {A, B, C}
None cannot be removed
So, {A, C, D} is another candidate key and add to T
71
(Solution)
Step 10:
Cannot find any other super keys,
Conclusion: candidate keys are {B, C, D} and {A, C, D}
72
Lecture Learning Outcomes
Take aways
➢ Functional Dependencies
➢ Armstrong’s axioms
➢ Given a FD, check if the FD can be derived from a given set of FD
➢ How to compute one candidate key
➢ How to compute all candidate keys
73