Relational Database Design Functional Dependencies
Relational Database Design Functional Dependencies
Relational Database Design Functional Dependencies
For every instance of |R| of R, if two tuples in |R| agree on the values of the attributes in X, then they agree on the values of the attributes in Y
Trivial FDs
A functional dependency X Y is trivial if Y is a subset of X {name, supervisor_id} {name} If two records have the same values on both the name and supervisor_id attributes, then they obviously have the same name. Trivial dependencies hold for all relation instances
For instance, the specialization of a students must be the same as that of the supervisor.
They constrain the set of legal relation instances. For instance, if I try to insert two students under the same supervisor with different specializations, the insertion will be rejected by the DBMS
3
Example: From {sid} {first_name} and {sid} {last_name} We can infer {sid} {first_name, last_name}
Armstrongs Axioms
Be X, Y, Z be subset of the relation scheme of a relation R Reflexivity: If YX, then XY (trivial FDs)
{name, supervisor_id}{name}
Proof of soundness: Assume two tuples T1 and T2 of |R| are such that, for all attributes in X, T1.X = T2.X. We want to prove that if transitivity holds and T1.X = T2.X, then indeed T1.Z = T2.Z since XY and T1.X = T2.X then, T1.Y = T2.Y since YZ and T1.Y = T2.Y then
T1.Z = T2.Z
10
Finding Keys
Example: Consider the relation scheme R(A,B,C,D) with functional dependencies {A}{C} and {B}{D}.
determine all attributes (i.e., be a superkey) be minimal {A}{C} {A,B}{A,B,C} (augmentation by AB) {B}{D} {A,B,C}{A,B,C,D} (augmentation by A,B,C) We obtain {A,B}{A,B,C,D} (transitivity)
{A,B} is minimal because neither {A} nor {B} alone are candidate keys
11
{A}+ = {A,C}
{B}+ = {B,D}
{C}+={C} {D}+={D} {A,B}+ = {A,B,C,D}
12
Output:
X+ the closure of X w.r.t. F
X(0) := X Repeat X(i+1) := X(i) Z, where Z is the set of attributes such that
there exists YZ in F, and Y X(i)
13
R = {A,B,C,D,E,G} F = { {A,B}{C}, {C}{A}, {B,C}{D}, {A,C,D}{B}, {D}{E,G}, {B,E}{C}, {C,G}{B,D}, {C,E}{A,G}} X = {B,D} X(0) = {B,D}
14
15
Redundancy of FDs
Sets of functional dependencies may have redundant dependencies that can be inferred from the others
{A}{C} is redundant in: {{A}{B}, {B}{C},{A} {C}}
{{A}{B}, {B}{C}, {A}{C,D}} can be simplified to {{A}{B}, {B}{C}, {A}{D}} (because {A}{C} is inferred from {A} {B}, {B}{C}) Example of extraneous/redundant attribute on LHS: {{A}{B}, {B}{C}, {A,C}{D}} can be simplified to {{A}{B}, {B}{C}, {A}{D}} (because of {A}{C})
16
Canonical Cover
A canonical cover for F is a set of dependencies Fc such that
F and Fc,are equivalent
Fc contains no redundancy
Each left side of functional dependency in Fc is unique. For instance, if we have two FD XY, XZ, we convert them to XYZ.
Algorithm for canonical cover of F: repeat Use the union rule to replace any dependencies in F X1 Y1 and X1 Y2 with X1 Y1 Y2 Find a functional dependency X Y with an extraneous attribute either in X or in Y If an extraneous attribute is found, delete it from X Y until F does not change Note: Union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied
17
A is extraneous in AB C because of B C.
Set is now {A BC, B C}
18
Functional dependencies can be used to refine ER diagrams or independently (i.e., by performing repetitive decompositions on a "universal" relation that contains all attributes). Relational database design requires that we find a good collection of relation schemas. A bad design may lead to
Repetition of Information. Inability to represent certain information.
Design Goals:
Avoid redundant data Ensure that relationships among attributes are represented Facilitate the checking of updates for violation of database integrity constraints.
19
Bad Design Wastes space. Data for branch-name, branch-city, assets are repeated for each loan that a branch makes Complicates updating, introducing possibility of inconsistency of assets value Difficult to store information about a branch if no loans exist. Can use null values, but they are difficult to handle.
20
Usefulness of FDs
Use functional dependencies to decide whether a particular relation R is in good form. In the case that a relation R is not in good form, decompose it into a set of relations {R1, R2, ..., Rn} such that each relation is in good form the decomposition is a lossless-join decomposition if possible, preserve dependencies
In our example the problem occurs because there FDs ({branch-name}{branch-city, assets}) where the LHS is not a key
Solution: decompose the relation schema Lending-schema into: Branch-schema = (branch-name, branch-city,assets) Loan-info-schema = (customer-name, loan-number, branch-name, amount)
21