Normalization is the process of efficiently organizing data in a database such that duplicate and redundant data is avoided.
- There are two goals of the normalization process:
  - Eliminating redundant data (for example, the same data stored in more than one table)
  - Ensuring that data dependencies make sense (only related data is stored together in a table)
- Both are worthy goals: they reduce the amount of space a database consumes and ensure that data is stored logically.
- Normalization is important for many reasons, chiefly because it keeps the database from storing the same data more than once, which saves disk space and can improve performance. If a piece of data is duplicated in several places in the database, there is a risk that it is updated in one place but not in the others, leaving the data inconsistent.
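
The risk of updating one copy of duplicated data but not the others can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module with an in-memory database; the table and column names (`orders_flat`, `customers`, `orders`) and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Unnormalized (hypothetical schema): the customer's city is repeated on every order row.
cur.execute(
    "CREATE TABLE orders_flat (order_id INTEGER PRIMARY KEY, customer TEXT, city TEXT)"
)
cur.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Ada", "London")],
)

# Only one of the two copies is updated, so the rows now disagree.
cur.execute("UPDATE orders_flat SET city = 'Paris' WHERE order_id = 1")
flat_cities = {
    row[0]
    for row in cur.execute("SELECT city FROM orders_flat WHERE customer = 'Ada'")
}

# Normalized (hypothetical schema): the city is stored once and referenced by key.
cur.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
cur.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(customer_id))"
)
cur.execute("INSERT INTO customers VALUES (1, 'Ada', 'London')")
cur.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 1), (2, 1)])

# A single update changes the city for every order that references the customer.
cur.execute("UPDATE customers SET city = 'Paris' WHERE customer_id = 1")
normalized_cities = {
    row[0]
    for row in cur.execute(
        "SELECT c.city FROM orders o JOIN customers c ON c.customer_id = o.customer_id"
    )
}

print(flat_cities)        # two conflicting values for the same customer
print(normalized_cities)  # one consistent value
```

In the flat table the same customer ends up with two different cities, while in the normalized layout there is only one place to update, so the orders cannot disagree.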
Normalization is also known as data normalization.
- Breaking the database up into numerous smaller tables and eliminating redundancies eases management and enhances efficiency. However, it can also decrease query performance, because many tables must be joined together to get complete answers. In this case, the database is often denormalized to create "reporting tables", which combine the normalized data into larger tables that can be queried more quickly. This strategy is used heavily in "data warehouses".
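
As a rough sketch of the reporting-table idea, the example below joins three normalized tables once and stores the result in a wider table that can then be queried without further joins. It uses Python's built-in `sqlite3` module; the schema (`customers`, `products`, `orders`, `sales_report`) and the sample data are assumptions made for illustration, not part of any particular system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Normalized tables (hypothetical schema): each fact is stored once.
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products  (product_id INTEGER PRIMARY KEY, title TEXT, price_cents INTEGER);
CREATE TABLE orders    (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                        product_id INTEGER, quantity INTEGER);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO products  VALUES (10, 'Widget', 250), (11, 'Gadget', 400);
INSERT INTO orders    VALUES (100, 1, 10, 4), (101, 1, 11, 1), (102, 2, 10, 2);
""")

# Denormalized reporting table: the joins are paid once, when the table is built.
conn.execute("""
CREATE TABLE sales_report AS
SELECT o.order_id,
       c.name  AS customer,
       p.title AS product,
       o.quantity * p.price_cents AS total_cents
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
JOIN products  p ON p.product_id  = o.product_id
""")

# Reports read the wide table directly, with no joins at query time.
report = conn.execute(
    "SELECT customer, SUM(total_cents) FROM sales_report "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(report)  # [('Ada', 1400), ('Grace', 500)]
```

The trade-off is that `sales_report` duplicates customer and product values, so it has to be rebuilt or refreshed whenever the underlying normalized tables change.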