SCD-Slowly Changing Dimension
What is SCD?
Type 1 (changing attribute): When the changes in any attribute or column overwrites the existing records.
Please provide your feedback for the post, if you find this post useful.
In this post I will try to include everything about SCD.
What is SCD?
SCD is
Slowly Changing Dimension. As the name suggests, a dimension which changes
slowly. For Example, say there is a table Employee, which stores information
regarding employee as below:
BusinessEntityID,
NationalIDNumber, First_Name, last_Name LoginID, OrganizationNode
OrganizationLevel,
JobTitle, BirthDate, MaritalStatus, Gender, HireDate, SalariedFlag
CurrentFlag,
ModifiedDate
In this
Employee table, the data for an employee doesn't change very often,
but yes we can’t say that the changes won’t be there. The changes, which may
happen, are
· Mistakenly
spelling of First_Name is stored incorrect.
· The
employee gets married and marital status changes.
· Last_Name
changes.
· The
employee gets promotion and job designation changes and organization level
changes.
· The
columns which doesn't change except if we assume that no mistake
happens while data entry are HireDate, Gender, NationalIDNumber
The
changes discussed don’t happen frequently, but may happen after certain time.
SCD supports four types of changes:
changing attribute, historical attribute, fixed attribute, and inferred member.
changing attribute, historical attribute, fixed attribute, and inferred member.
Type 1 (changing attribute): When the changes in any attribute or column overwrites the existing records.
For
example; as discussed first name of employee is misspelled and wrong spelling
is stored in first name of that employee. For making the first name correct, we
don’t need to add one more record for the same employee, so we can overwrite
the first name. SCD which does this kind of changes comes into type 1 category.
This SCD transformation directs these rows to an output named Changing
Attributes Updates Output.
Emp
ID
|
First
Name
|
Last
Name
|
1
|
Rajan
|
Gupta
|
1
|
Ranjan
|
Gupta
|
This
SCD transformation directs these rows to an output named Changing
Attributes Updates Output.
Type 2 (historical attribute): when we need to maintain the
history of records, whenever some particular column value changes.
For
example the employee gets promotion, designation changes and organization level
changes. In such case we need to maintain the history of the employee, that
with which designation he joined, and when his designation and organizational
level changes.
For
these kinds of changes, there will be multiple records for the same employee
with different designation. Then to indentify the current records, we can
either add a column as current flag, which will be ‘y’ for the current or
latest records, Or else we can add two column as start date and end date
(expiry date), through which we can maintain history of employees records. This
SCD directs these rows to two outputs: Historical Attribute Inserts
Output and New Output.
EmpID
|
FirstName
|
DEsignation
|
StartDate
|
EndDate
|
Current
|
1
|
Ranjan
|
Graduate
Engineer
|
20-01-2010
|
25-01-2011
|
N
|
1
|
Ranjan
|
Analyst
Programmer
|
25-01-2011
|
25-01-2012
|
N
|
1
|
Ranjan
|
Business
Analyst
|
25-01-2012
|
1-01-2099
|
Y
|
Fixed attribute: when the attribute must not change.
For
example HireDate, Gender, NationalIDNumber should never change. So whenever
changes will occur in these columns value then either it should throw error or
the changes can be saved in some other destination. But changes should not be
applied in the columns.
This
SCD transformation detects changes and directs the rows with changes to an
output named Fixed Attribute Output.
Inferred member: are those records of the dimension, which are found missing during fact load.
Inferred member: are those records of the dimension, which are found missing during fact load.
For
example, say there is a fact table which contains employee and department
information. While generating the fact table from employee_stg table
and department_stg table , sometimes happenes that employee_stg contains
some departments name which has no records in department table
, and during fact table generation those records are found missing from
department table, these kind of member of dimension department are called
inferred member. It’s like ‘Fact arriving earlier than dimensions’.
This
SCD transformation directs these rows to an output named Inferred
Member Updates. When data for the inferred member is loaded, you can update
the existing record rather than create a new one.
Cllick For Implementation of SCD type 1 step by step
For Implementation of SCD type 2 step by step
For Alternate methods of SCD
For Implementation of SCD type 2 step by step
For Alternate methods of SCD
Please provide your feedback for the post, if you find this post useful.
it is very useful.Now i got clear understand about SCD.Thank you so much.Really it was very excellent.
ReplyDeleteAnurag Khare, Aug 1st, 2013 @ 10:05 AM
ReplyDeleteIts really very clear and useful information about the SCD. I need some option for incremental load. I have implemented this in my project and its working fine...
How can I validate about Inferred Member Updates. Pls reply if possible...
Thanks in Anticipation...
It is very clear to understand.Thank you so much.
ReplyDeletei liked it
ReplyDeleteExcellent article..very nice. Thanks for posting.
ReplyDeleteThis is the article i am looking for. Thank you.
ReplyDeleteOk but need to show it as by Code format.
ReplyDeleteBrilliant Article ....Thanks for Posting..
ReplyDeleteNicely articulated, it can be very well understood by a novice. Thanks for your efforts.
ReplyDeleteHey Ragini, many thanks for the post. extremely helpful.
ReplyDeleteSimple and clear to understand ....
ReplyDeleteIts is very useful with simple and clear information to understand.. Thank you.
ReplyDeleteWhich SCD types is not supported by SSIS, type 3 or type 4.
ReplyDeleteIt is very clear ....
ReplyDeletewhat are the other alternatives for scd and which one is used in real time... please let me know....thanks
ReplyDeleteits really clear please also write basics, like behind the scene also if possible
ReplyDeleteThis comment has been removed by the author.
ReplyDeleteThank you for this beautiful post,extremely helpful!!
ReplyDeletethe only question i have here is, Fixed Attribute and Inferred Member are type 3 and 4 or they are belong type2?