Friday, February 5, 2016

What are Surrogate Keys, Primary Keys and Candidate keys? Where its used?

A Surrogate key is any column or set of columns that can be declared as the primary key instead of a real or natural key. Sometimes there can be several natural keys that could be declared as the primary key, and these are all called candidate keys. So we can call a  surrogate key is a candidate key.

A Surrogate key is the alternate of primary key that allows duplication of datas/records. It is an immutable set of attributes that uniquely identify a row that were generated specifically and solely to identify this row which is not in case of natural key or primary key.

A table could actually have more than one surrogate keys, although this would be unusual. A natural key is an immutable set of attributes that uniquely identify a row that occur naturally with the row itself. The most common type of surrogate key is an incrementing integer, such as an auto_increment column in MySQL, or a sequence in Oracle, or an identity column in SQL Server.

Primary key and Surrogate key are same but surrogate key is a system generated numeric or integer value to identify each row uniquely, it has a define incremental value for each row in a table.

Surrogate key does not have any business importance for the value it holds but primary key has a significant business value.

OLTP Databases are called as of  Normalised Form  whereas  Data warehouses - DWHs  are called as of De-normalised form as DWH is used to maintain the historic data for analyzing. To remain de-normalised, duplication is allowed. When data inserting in DWH, Surrogate key a new column named serial number is introduced to allow duplication.

A Surrogate key in a data warehouse is more than just a substitute for a natural key. In a data warehouse, a surrogate key is a necessary generalization of the natural production key and is one of the basic elements of data warehouse design. Surrogate Key is the solution for critical column problems.

Ex.  A customer purchases different items from stores at different locations. Here, we have to maintain historical data, by using surrogate key which introduces the row in the data warehouse to maintain historical data. Another example of its, a single mobile number is used by other person if it is not in use for more than one year, it is possible just because of this surrogate key.