Normalisation
- Normalisation is a step towards
DB optimisation
- Normalisation is used to
minimise the redundancy
from a relations. - It is also used to
eliminate undesirable
characteristics like Insertion, Update, and Deletion Anomalies. - Normalisation
divides the composite attributes into individual
attributes ORlarger table into smaller
and links them using relationships.
Functional Dependency (FD)
- It’s a relationship between the
primary key
attribute (usually) of the relation to that of theother attribute
of the relation. - X -> Y, the left side of FD is known as a Determinant, the right side of the production is known as a Dependent.
- Types of FD
Trivial FD
- A → B has trivial functional dependency if B
is a subset
of A. - A → A, B → B are also Trivial FD.
- A → B has trivial functional dependency if B
Non-trivial FD
- A → B has a non-trivial functional dependency if B is
not a subset
of A. [i.e A intersection B is NULL].
- A → B has a non-trivial functional dependency if B is
- Rules of FD (Armstrong’s axioms)
- Reflexive
- If ‘A’ is a set of attributes and ‘B’ is a subset of ‘A’. Then, A→ B holds.
- If A ⊇ B then A → B.
- Augmentation
- If B can be determined from A, then adding an attribute to this functional dependency won’t change anything.
- If A→ B holds, then AX→ BX holds too. ‘X’ being a set of attributes.
- Transitivity
- If A determines B and B determines C, we can say that A determines C.
- if A→ B and B→ C then A→ C.
- Reflexive
Why Normalisation?
To avoid redundancy
in the DB, not to store redundant data.- Issue if we have redundant data
- Insertion, deletion and updation
anomalies arises
- Insertion, deletion and updation
Anomalies
Anomalies means abnormalities
- There are
three types of anomalies
introduced by data redundancy- Insertion anomaly
- When certain data (attribute) can not be inserted into the DB without the presence of other data.
- Deletion anomaly
- The delete anomaly refers to the situation where the deletion of data results in the
unintended loss
of some other important data.
- The delete anomaly refers to the situation where the deletion of data results in the
- Updation anomaly (or modification anomaly)
- The update anomaly is when an update of a single data value requires multiple rows of data to be updated.
- Due to updation to many places, may be
Data inconsistency arises
, if one forgets to update the data at all the intended places.
- Insertion anomaly
- Due to these anomalies
DB size increases
DB performance become very slow
Types of Normal forms
- First Normal Form (1NF)
- Every relation cell must have atomic value.
- Relation
must not have multi-valued attributes
- Second Normal Form (2NF)
- Relation
must be in 1NF
- There should
not be any partial dependency
- All non-prime attributes must be fully dependent on PK.
- Non prime attribute can not depend on the part of the PK.
- Relation
- Third Normal Form (3NF)
- Relation
must be in 2NF
- No transitivity dependency exists.
Non-prime attribute should not find a non-prime attribute
- Relation
- Boyce Codd normal form (BCNF)
- BCNF is the advance version of 3NF. It is stricter than 3NF.
- Relation
must be in 3NF
- FD: A -> B, A must be a super key.
- We must not derive prime attribute from any prime or non-prime attribute.
Advantages of Normalisation
- Normalisation helps to minimise data redundancy.
- Greater overall database organisation.
- Data consistency is maintained in DB.