Slowly Changing Dimensions

Ralph introduced the concept of "slowly changing dimension (SCD)" attributes in 1996.

Slowly Changing Dimensions (SCD) - dimensions that evolve slowly over time, rather than changing the time base in the regular calendar. In the data warehouse, it is necessary to follow the changes of the attributes in the dimension in order to make the data historized. In other words, the implementation of an SCD type must allow users to assign the dimension attribute value appropriate for a given date. Example of these dimensions could be: customer, geography, employee.

There are many approaches to implement SCD, the most popular are:

o Type 0 - The passive method

o Type 1 - Overwrite the old value

o Type 2 - Creating an additional new record

o Type 3 - Addition of a new column

o Type 4 - Using the historical table

o Type 6 - Combining 1,2,3 type approaches (1 + 2 + 3 = 6)

Type 0 - The passive method.

In this method, no special action is performed on dimensional changes. Some dimension data may remain the same as the first time inserted, others may be overwritten.

Type 1 - Overwrite the old value.

In this method, no change history is kept in the database. The old value is simply overwritten by the news. This type is easy to maintain and often used for data whose changes are correction processing results (for example: special characters for removal, correction of spelling mistakes).

Before the change:

EmployeeKey FirsName title
13 Sidney Production Technician

After the change:

EmployeeKey FirsName title
13 Sidney Production Supervisor
Type 2 - Creating an additional new record.

In this methodology, the entire history of dimension changes is kept in the database. You capture the attribute change by adding a new row with a new substitution key to the dimension table. Both lines contain the attribute of the natural key. Also "effective date" and "current indicator" are used in this method. There could be one record with the current indicator set to 'Y'. For the "effective date" columns, start_date and end_date, the end_date for the current record is usually set to 9999-12-31.

Before the change:

EmployeeKey FirsName title START_DATE END_DATE Current_Flag
13 Sidney Production Technician 01/08/2014 31/12/9999 Y

After the change:

EmployeeKey FirsName title START_DATE END_DATE Current_Flag
13 Sidney Production Technician 01/08/2014 01/10/2015 N
25 Sidney Production Supervisor 02/10/2015 31/12/9999 Y
Type 3 - Adding a new column .

In this type, usually the current value and the previous dimension value are kept in the database. The new value is loaded in the 'current / new' column and the old value in the 'old / previous' column.

In general, the history is limited to the number of columns created to store historical data

Before the change:

EmployeeKey FirstName Current_Title Previous_Title
13 Sidney Production Technician Production Technician

After the change:

EmployeeKey FirstName Current_Title Previous_Title
13 Sidney Production Technician Production Supervisor
Type 4 - Using the Historical Table.

In this method, a separate historical table is used to track changes in the historized attributes for each dimension.

The "main" dimension table retains only the current data, for example: the customer and customer_history tables.

Current table:

EmployeeKey FirstName title
13 Sidney Production Technician

Historical table:

EmployeeKey FirsName title START_DATE END_DATE
13 Sidney Production Technician 01/08/2014 01/10/2015
13 Sidney Production Supervisor 02/10/2015 01/01/2016
13 Sidney Production Control Manager 02/01/2016 31/12/9999
Type 6 - Combining 1,2,3 type approaches (1 + 2 + 3 = 6).

In this type we have the following additional columns in the dimension table:

current_title - to maintain the current value of the attribute. All the historized records have the same current value.

historical_title - to maintain the historical value of the attribute. All the historized records can have different values.

start_date - to keep the start date of the attribute history.

end_date - to maintain the end date of the attribute's history.

current_flag - to keep information about the most recent record.

In this method, to capture the attribute change, we add a new record as in the 2 type. The current_title information is replaced by the new one as in the 1 type. We store the history in a historical_title column as in the 3 type.

EmployeeKey FirsName Current_Title Historical_Title START_DATE END_DATE Current_Flag
13 Sidney Production Control Manager Production Technician 01/08/2014 01/10/2015 N
25 Sidney Production Control Manager Production Supervisor 02/10/2015 01/01/2016 N
30 Sidney Production Control Manager Production Control Manager 02/01/2016 31/12/9999 Y

2 thoughts on "Slowly Changing Dimensions"

  1. I was curious if you ever changed the structure of your website?
    Its very well written; I love what you've got to say. But maybe you could have a little more
    in the way of being happy.

    You've got an awful lot of text for only having one or two
    images. Maybe you could space it out better?

    Reply

    1. Hi,
      Thank you for these remarks, I will take them into consideration and try to improve the best possible.

      Reply

Leave a Reply

Your email address Will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment is processed.