Wednesday, June 12, 2013

SSIS: SCD Type1 Step by Step



SCD Type 1 , step by step



Type 1 (changing attribute): 
When the changes in any attribute or column overwrites the existing records.

By considering Type1, I assumed that the changed made in FirstName, Middlename and last name of an employee will be overwritten. For making such scenario, I have created two tables; one is tmpPerson, which is created as follows by using AdbentureWorks tables:
---Start-TmpPerson table ---

create table tmpPerson (EmpId int, Title varchar(10) Null,FirstName varchar(20), MiddleName varchar(20),LastName varchar(20),
JobTitle varchar(100),BirthDate Date,MaritalStatus char(1),Gender char(1),NationalIDNumber varchar(20)
,CurrentFlag int , StartDate date,EndDate date default ('2050-12-31'))


insert into tmpPerson  (EmpId , Title ,FirstName , MiddleName ,LastName,
JobTitle ,BirthDate ,MaritalStatus ,Gender ,NationalIDNumber
,CurrentFlag, StartDate )
select e.BusinessEntityID,per.Title,per.FirstName,per.MiddleName,per.LastName,
e.JobTitle,e.BirthDate,e.MaritalStatus,e.Gender,e.NationalIDNumber,
e.CurrentFlag,e.ModifiedDate

from HumanResources.Employee e inner join Person.Person per on e.BusinessEntityID=per.BusinessEntityID

----End –TmpPerson table----

I have also created one table tmpPersonStage similar to tmpperson, which stores the records which are changed by changing any of the column value. For Type 1 I have made some changes in FirstName, MiddleName and LastName, as below:

---Start--tmpPersonStage table----
create table tmpPersonstage (EmpId int, Title varchar(10) Null,FirstName varchar(20), MiddleName varchar(20),LastName varchar(20),
JobTitle varchar(100),BirthDate Date,MaritalStatus char(1),Gender char(1),NationalIDNumber varchar(20)
,CurrentFlag int , StartDate date,EndDate date default ('2050-12-31'))

insert into tmpPersonstage  (EmpId , Title ,FirstName , MiddleName ,LastName,
JobTitle ,BirthDate ,MaritalStatus ,Gender ,NationalIDNumber
,CurrentFlag, StartDate,EndDate )
select EmpId , Title ,FirstName , MiddleName ,LastName,
JobTitle ,BirthDate ,MaritalStatus ,Gender ,NationalIDNumber
,CurrentFlag, StartDate,EndDate
from dbo.tmpperson where empid in (3,6,8,58,49)

 update tmpPersonstage set FirstName ='Robert' where EmpId=3
 update tmpPersonstage set LastName ='Goldb' where EmpId=6
 update tmpPersonstage set FirstName ='Josse' where EmpId=8
 update tmpPersonstage set MiddleName ='L' where EmpId=49
 update tmpPersonstage set FirstName ='Kendal' , LastName='Keill' where EmpId=58

 ---End---tmpPersonstage table -------


After preparing the initial steps for the scenario, let’s move to SSIS package to apply SCD transformation. The below steps are the implementation of SCD Type 1


Step 1:

 Drag and drop one Data Flow Task in the Control flow tabs.

And name this DFT as ‘scd type 1’



Step2:

Go to Data Flow tab, Drag and drop OleDB Source and Slowly Changing Dimension transformation from Data flow transformations.

Step 3:

Configure OleDB source, here the source will be the table which has changed or new records, which will be updated or inserted in the main table. In our scenario the source table is ‘tmpPersonStage’, which keeps some updated and new records which will be updated or inserted into the mail table ‘tmpPerson’











Step 4:

Configure the SCD transformation; double click on SCD transformation, one wizard will open as below:



Click on next and create either new connection or already created connection to AdventureWorks2008, and while mapping the input columns to dimension column, at least one column need to be mentioned as ‘Business Key’.

 Business key is the column on the basis of which the updation or insertion into ‘tmpPerson’ will be made. Here we have mentioned EMPID as BusinessKey, so if EmpId of input table ‘tmpPersonStage’ is found in ‘tmpPerson’ table then the respective row  in ‘tmpPerson’ will be updated with the changes  from ‘tmpPersonStage’ input table





Click on next. In the below window, change type need to be mentioned on certain column. Change Type can be ‘Fixed attribute’, ‘Changing attribute’ and ‘Historical Attribute’



As in this scenario is based on changing attribute we will select some columns which will be overwritten if found modified.



Go to Next and select the below option,



Go to next and unselect ‘Enable inferred member support’, as we don’t require it for this scenario.



Click next, which displays New Records, Updated, other Outputs, which means there will be three arrows from SCD, one will take the new records, another will take the records which needs to be updated, and other outputs, which can be used for other operations.



Click next and Finish the wizard. As we finish the wizard we see some transformation like ‘OleDb command’ and ‘Insert Destination’ are automatically created. Oledb command is for upfated records and insert destination for new records.





Step 5:

Double click on Oledb command, and it will be configured automatically as below. The connection will be automatically configured to table ‘tmpPerson’ of Adventure works 2008, in which the records will be updated.



Also when clicked on Component Properties tab then the update command is also configure, which we can see on string value editor.



The update command updates records on the basis of EmpId



In column mapping tab the parameter are mapped with the columns mentioned in query. These Parameters contain the changing attribute columns (firstname, middlename ,lastname and one business key’empid’) from SCD




Step 6:

Insert Destination will also be configured automatically, to insert the new records coming from SCD to tmpPerson table.



Step 7:
Execute the DFT







After execution, we can see that , we updated 5 records in the tmppersonstage table, which are updated in tmpperson table also.As there was no new records , so no row is inserted into tmpperson.







Please provide your feedback for the post, if you find this post useful.

3 comments:

  1. I wish i could buy u a cup of coffee or beer, your article has been so helpful thank you

    ReplyDelete