Showing posts with label sql server. Show all posts
Showing posts with label sql server. Show all posts

Monday, November 11, 2013

Surrogate key

Surrogate key

  • What is surrogate key?
  • Where do we use a surrogate key?
  • How it is different from the natural key?
  • How do we implement surrogate key?
  • What is the advantage of using a surrogate key?


  • What is surrogate key?

As the name suggests Surrogate means ‘substitute’.
The surrogate key is just another substitute key, that will be used to uniquely identify the rows in a dimension table. The surrogate key will have no business meaning.
It’s internally generated by the system and is invisible to the user or application. Thus its value cannot be modified by user or application.

  • Where do we use a surrogate key?

If the table is not having any attribute (or natural key), that uniquely identify a row.
If the attribute used as unique identifier uses much storage than required. For example, if the attribute is alphanumeric.
When there is doubt of changes in the values of the attribute (used for unique identification of the row)
If the table participates in more than one relationship from the logical data model. Then data needs to be fetched from multiple related tables using joins.
If we want to improve the performance of data retrieval from multiple tables.


  • Difference between Surrogate key and natural key

Surrogate key
Natural key
Don’t have business meaning
Have business meaning, for e.g. SSN number

These can be changed.
Small in size, generally of integer type. Can be created as identity column
Bigger in size, generally alphanumeric or string,
Comparison takes less time , as it compares to integer, joins are fast
Comparison  takes much time, as it converts string or alphanumeric to Ascii , and then comparison takes place, joins are slow
No locking contentions because of unique constraint, as the surrogates get generated by the DB and are cached – very scalable.

Locking contention may occur, as two sessions may try to insert the same unique business key,
Not searchable from application
searchable
An additional index is needed.  
No additional index required
There are often generated by the system .
Records cannot be inserted untill the key is not known, as it has got business meaning



Let's take a simple, classical example:
On the 1st of January 2002, Employee 'E1' belongs to Business Unit 'BU1' (that's what would be in your Employee Dimension). This employee has a turnover allocated to him on the Business Unit 'BU1' But on the 2nd of June the Employee 'E1' is moved from Business Unit 'BU1' to Business Unit 'BU2.' The entire new turnovers have to belong to the new Business Unit 'BU2' but the old one should belong to the Business Unit 'BU1.'
If you used the natural business key 'E1' for your employee within your datawarehouse everything would be allocated to Business Unit 'BU2' even what actually belongs to 'BU1.'
If you use surrogate keys, you could create on the 2nd of June a new record for the Employee 'E1' in your Employee Dimension with a new surrogate key.
This way, in your fact table, you have your old data (before 2nd of June) with the SID of the Employee 'E1' + 'BU1.' All new data (after 2nd of June) would take the SID of the employee 'E1' + 'BU2.'
You could consider Slowly Changing Dimension as an enlargement of your natural key: natural key of the Employee was Employee Code 'E1' but for you it becomes
Employee Code + Business Unit - 'E1' + 'BU1' or 'E1' + 'BU2.' But the difference with the natural key enlargement process, is that you might not have all part of your new key within your fact table, so you might not be able to do the join on the new enlarge key -> so you need another id.






Thursday, October 17, 2013

How to stop any user from dropping any database



How to stop any user from dropping any database


you can prevent casual database deletions with a trigger:

CREATE TRIGGER STOP_DB_DROP
ON ALL SERVER
FOR DROP_DATABASE
AS
PRINT 'You are not allowed to DROP databases'
ROLLBACK;

Monday, September 2, 2013

remove single user mode from a database

How to remove single user mode from a database?


Step1: Fine out the processes, which are using the particular database say ‘AdventureWorks’. This can be find executing below SP
Sp_who2

By executing sp_who2, get the SPID’s which uses particular database



Step 2: Kill that process as below

KILL [SPID]
KILL 52 --For example, if 52 is the SPID, that is using database AdventureWorks

Step 3: Execute Below query

ALTER DATABASE [DatabaseName] SET SINGLE_USER WITH ROLLBACK IMMEDIATE

Step 4: Set multiple user mode on database using below query


ALTER DATABASE [DatabaseName] SET MULTI_USER

Monday, August 26, 2013

Sql Server: index part 3

a.    What is non clustered index?

b.    How many non clustered indexes can be created on one table?

c.    Can you tell me some example on non clustered index that relates to real life experience?

d.    How non-clustered is different from clustered indexes?



Non-Clustered Index


 non-clustered index can be created as shown below:
create nonclustered index indexname on [TableName](ColumnName)

Some points regarding non-clustered indexex 
  • It’s similar to index in textbook, index at one place and data at other place. These indexes will have the pointers to the storage location of the actual data.

  • A table can have more than 1 non-clustered index, just like in a book, one index in starting of the book, another at the end of the book.

  • A table can only have one Clustered index and up to 249 Non-Clustered Indexes.  If a table does not have a clustered index it is referred to as a Heap.


Difference between Clustered and non clustered index
  • Max one clustered index and max 249 non clustered can be created on each table.

  • Clustered indexing is faster than non-clustered index. As clustered index will always cover  a query ,since it contains all of the data in the leaf node itself.

  •  Non-clustered indexes requires extra disk storage space.

for more questions and answers on indexes
Previous      Next

Sql Server: index part 1

a.    What is index? Why do we need indexes?
b.    What are the various types of indexes in sql server?
c.    How will you check if indexes already exist on a table
d.   
What will happen in case of no indexes?


Indexes are created to provide faster data access during specific data fetch and retrieval form huge data tables.

There are various types of indexes in sql server
a.    Clustered Index
b.    Non clustered index
c.    Unique
d.    Filtered
e.    Spatial
f.     Xml
g.    Full text
h.    Index with computed column
i.     Index with non computed column
  
    Using below procedure we can identify, if indexes exist on a table
Execute sp_helpindex table_name
For example:


In case of No index
If no index exists on a table, a table scan must be performed for each table referenced in a database query. The larger the table, the longer a table scan takes because a table scan requires each table row to be accessed sequentially. Although a table scan might be more efficient for a complex query that requires most of the rows in a table, for a query that returns only some table rows an index scan can access table rows more efficiently.

Next for more questions and answers on indexes
Understanding on Indexes


Sql Server: Understanding on indexes

Understanding on indexes

Indexes are created to provide faster data access during specific data fetch and retrieval form huge data tables.

There are various types of indexes in sql server
a.    Clustered Index
b.    Non clustered index
c.    Unique
d.    Filtered
e.    Spatial
f.     Xml
g.    Full text
h.    Index with computed column
i.     Index with non computed column

Mainly we will discuss here about Clustered index and non clustered index
Before going through understanding about indexes, let’s focus on the questions that one faces.

Monday, May 20, 2013

Sql Server: Local temporary table behavior in nested stored procedure with example


Local temporary table behavior in nested stored procedure with example


local temporary table scope remains only inside the stored procedure, in which it is created. Any inner stored procedure can access the local temporary table created in outer procedure. But any outer stored procedure can not access the local temporary table created in inner procedure.

Error you may getMsg 208, Level 16, State 0, Procedure procedure1, Line 6
Invalid object name '#temp1'.

Scenario 1: when one procedure calls another procedure and temp table is created in outer procedure procedure1 and accessed by inner procedure procedure2
Scenario 2: when one procedure calls another procedure and temp table is created in inner procedure procedure2 and accessed by outer procedure procedure1
create proc procedure1
as
begin
create table #temp1(name varchar(30),age int)
insert into #temp1
select 'ram',20
union all
select 'shyam',23
union all
select 'anjum',28
exec procedure2
end

GO;

create proc procedure2
as
begin
select * from #temp1
end

GO;
create proc procedure2
as
begin
create table #temp1(name varchar(30),age int)
insert into #temp1
select 'ram',20
union all
select 'shyam',23
union all
select 'anjum',28
end

GO;

create proc procedure1
as
begin
exec procedure2
select * from #temp1
end

GO;


As local temporary table scope is inside the stored procedure, in which it is created.

Procedure2 is nested inside procedure1 , thus procedure2 has access to temp table #temp1.
As local temporary table scope is only inside the stored procedure in which it is created.

Temp table #temp1 is created in procedure2, which is called by procedure1, thus any procedure inside procedure2 can access that table, but when procedure 1 tries to access the table #temp1 , which is an outer procedure, doesn’t find the table #temp1


Please provide your feedback for the post, if you find this post useful.

Friday, May 17, 2013

Sql Server: Difference between Row_Number, Rank, Dense_Rank


Difference between Row_Number, Rank, Dense_Rank

Syntax and use:

Row_Number 
Returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition.

ROW_NUMBER ( )     OVER ([<partition_by_clause>] <order_by_clause>)

Rank 
Returns the rank of each row within the partition of a result set. The rank of a row is one plus the number of ranks that come before the row in question.

RANK ( )    OVER ([< partition_by_clause >] < order_by_clause >)

Dense_Rank
 Returns the rank of rows within the partition of a result set, without any gaps in the ranking. The rank of a row is one plus the number of distinct ranks that come before the row in question.

DENSE_RANK ( )    OVER ([<partition_by_clause> ] < order_by_clause > )

NTILE 
Distributes the rows in an ordered partition into a specified number of groups. The groups are numbered, starting at one. For each row, NTILE returns the number of the group to which the row belongs.
NTILE (integer_expression) OVER ([<partition_by_clause>] < order_by_clause >)

Where
<partition_by_clause>
Divides the result set produced by the From clause into partitions to which the Row_Number/ Rank/ Dense_Rank/ Ntile function is applied.
<order_by_clause>
Determines the order in which the Row_Number/ Rank/ Dense_Rank/ Ntile values are applied to the rows in a partition. 

We will apply these function on the below customer product table CustProd.

name
Product
cust1
decoder
cust2
cable
cust1
cable
cust2
package
cust3
decoder
cust3
cable

Please see the below snapshot for understanding of these function through example



With partition by product and order by name,

When we use partition by product, then it divides the result on the basis of product, as there are three distinct products then there will be 3 partitions.

After partition, order by name is used, that means, in the partitions Row Number, Rank or Dense Rank will be assigned as per the order of name. Here in the below result we see that rank ,row number and dense rank, all are having same value, It’s because in each partition there are distinct name given, if name would have been repeated for the same product then those records will have same rank and dense rank, but row number would have been same as shown below.


When used order by product instead of name , then we see in the below result that, the Rank and dense Rank were 1, Because we did partition of result by product , that means there will be common product in each partition , and rank and dense rank will also be same for same product.

Please provide your feedback for the post, if you find this post useful.