Slow Changing Dimensions in Informatica (SCD)
Defining Slow Changing Dimensions:
Change in the city names for e.g. Madras to Chennai , Change in the price of the product , Changes in the residential location of the people , Changes in the working location of the people , etc. All these changes vary over a period of time and does not occur on a regular basis .Capturing such changes and updating them in data warehouse is a big challenge as there is no fixed time when these changes would happen. The Dimension tables containing such data are called Slow Changing Dimensions.
Thus,
Slow changing Dimensions may be defined as the dimensions in which Data changes slowly over a period of time instead of regular basis. The reason why they are called slow changing and not fast changing dimensions is because the rate of these changes is slow.
SCD Management Techniques :
There are 3 different management methodologies to handle the above problem :
- SCD Type 1
- SCD Type 2
- SCD Type 3
SCD Type 1 :
In this method , the current data OVERWRITES the old data. Thus , no historical record is maintained in this technique.
Advantages :
- Easy to maintain as there is no need to track the old records and only current records needs to be handled.
- Occupies less space.
Disadvantages :
With this method it is not possible to track any historical record, as it only maintains the current data.
When to use Type 1 :
When the business requirement does not need the Data warehouse to maintain historical records , then this technique is best to use.
SCD Type 2 :
In this method , both the historical and current data are maintained. The current data enters as new record in the data warehouse.
For e.g , In above case suppose Ravi changes his city from Bangalore to Chennai and then pune. Now since in this technique , all the historical data is recorded , there will be 3 entries for Ravi in the dimension table including the EmpId which is a primary key in source table. Now if I want to find out the second city where Ravi lived , it will be very difficult as there is no flag or key to determine that.
Hence SCD Type 2 technique stores the historical data using 3 different ways :
- Versioning
- Flagging
- Effective Date
Versioning : In this a Version column is created in the dimension table . This Version number is a sequence . Every time the record is updated , it is loaded in the dimension table as new record with a new version number . for e.g. in above case first time when Ravi was living in Bangalore the version number was 1 . When he moved to Chennai the version number changed to 2 , and finally when he moved to pune the version number changed to 3. Thus version number is incremented each time with the new entry and latest version number indicates the current record.
Advantages :
Easy to maintain as there is no need to track the old records and only current records needs to be handled.
Disadvantages :
With this method it is not possible to track any historical record, as it only maintains the current data.
When to use Type 2 :
When the business requirement does not need the Data warehouse to maintain historical records , then this technique is best to use.
SCD Type 2 Methodology :
SCD Type 3 Methodology :