Surrogate key In Data warehouse.
Defining Surrogate Key :
Surrogate Key may be defined as a system generated key which is used to maintain unique records in the Data warehouse databases.
Need for Surrogate key :
We all know that in Data warehouse systems , Data from source which is an OLTP system ,goes to Dimension tables which may be linked to many Fact Tables in the Data warehouse.Now as the Source only maintains the current data, it has the primary key to uniquely identify its records. But the Dimension table in Data warehouse maintains historical data ( SCD Type 2) , thus for the same record there will be multiple entries including the primary key of the source . As shown in below example.
Thus in order to identify the records uniquely in Dimension table , we have the concept of Surrogate Key.
Is Primary and Surrogate Key Same or Different ??
Most people think that both primary and surrogate keys are the same. But actually both are different.
Primary Key is used to maintain unique records in OLTP Databases .
Surrogate Key is used to maintain unique records in OLAP (Data warehouse) databases.
Primary Key Values are User-Generated.
Surrogate Key Values are System-Generated.
Primary Key can be Alphanumeric.
Surrogate Key is always Numeric.
Primary Key belongs to Business (Table) Data.
Surrogate Key belongs to Business data.
Primary Key Sequence order can be random.
Surrogate Key order is always in sequence and never random.