Saturday, May 10, 2008

Handling Multiple Calendars with a M2M Scenario

Most dimensional models and BI solutions implement a date dimension of one type or another and many of them include multiple calendars -- the "wall" (or Gregorian) calendar, a fiscal calendar, a manufacturing calendar, etc.  Handling multiple calendars is fairly simple and most people approach the problem by adding additional columns to a date dimension table that holds a row for each date they need.  Thus, many date dimension tables look like the one in Adventure Works:

DimTime Dimension Table

However, what if you need to support more calendars now (maybe more than 3 or 4) and you need a solution that will support N calendars into the future?  Sounds a little crazy, but a while back I was working with a client that had such a need.  As it turned out, they wanted to have their major customer fiscal calendars included in their date dimension so that sales and marketing managers could analyze data according to the fiscal calendar of the customer.  Maybe not so crazy...

The client wanted a solution that would not require schema changes (relationally or in SSAS) whenever a new calendar needed to be added (or when a calendar needed to be retired).  Thus, the solution couldn't really rely on things like adding new columns to a date dimension.  After thinking about the problem quite a bit, we decided that we could solve it using a M2M scenario.  In this blog entry, I'll lay the ground work for the solution by providing the details on the relational side.  In a follow-on entry, I'll finish it up by showing the details on the SSAS side.

On the relational side, we needed to add a table to hold the calendar information as well as a "bridge" table that would relate dates to a given calendar.  I'll show how this might work in the AdventureWorksDW database.  To begin with, we'll continue to use the DimTime table as it exists today so that we can compare our new solution and its results to the existing mechanism for showing the Gregorian and fiscal calendars for Adventure Works.  A "final" version of this solution might trim down the DimTime table so that it only includes TimeKey and FullDateAlternateKey.

Now, we need a new dimension table to represent our new solution for calendars and dates.  Each row in the new table will represent a date within a given calendar; thus, we will end up with 1/1/2008 in the table for as many calendars as we need to support.  And every new calendar that gets added (in the future) will add a new row for 1/1/2008.  Thus, the grain of the dimension table is no longer just the date -- it's the combination of a date and the calendar it belongs to.  Thus, we create a new DimCalendar table that looks something like this:

DimCalendar Dimension Table

Notice that the table includes most of the normal attributes you'd think of for a date -- year attributes, quarter attributes, etc.  If a given calendar needs more attributes (for different "levels"), you can add columns for them (semesters, for example).  If a given calendar doesn't need that particular "level" or attribute, then you can simply leave that column NULL for each of the dates in that calendar.

You may be wondering why the table design includes the date itself (DayName and DayNumber attributes) since we're going to end up relating this new table to the existing DimTime table, which already includes date.  By including date here, we're able to include the day level in hierarchies that we build based on this new dimension (for example, Year-Quarter-Month-Day and Year-Week-Day hierarchies).

When populating this table, we decided to ensure that the DayNumber column provided a unique value for each day within each calendar.  Thus, for each row, we decided to assign a prefix (10 for the Gregorian calendar, 20 for the first fiscal calendar, etc.) to each day number.  So, the DayNumber column for 1/1/2008 in the Gregorian calendar became 1020080101 while the DayNumber column for 1/1/2008 in the first fiscal calendar became 2020080101.

With this table in hand, you can populate each calendar that you need by simply adding rows to the table for each day (across as many years as you want) for each calendar.  And in the future, you can add another calendar by simply adding new rows to the table for each day in the new calendar.  New calendars no longer need schema changes.  Here's what some of the data in the DimCalendar table might look like:

DimCalendar Table Data

Finally, to make the M2M scenario work for us, we need a bridge table that relates DimTime to DimCalendar.  This new bridge table will relate a date from DimTime (like 1/1/2008) to multiple rows in the DimCalendar table (like our 1/1/2008 row for the Gregorian calendar and our 1/1/2008 row in the first fiscal calendar.  We named this new bridge table FactDateCalendar and it looks like this:

FactDateCalendar Bridge Table

The overall model of using these new tables (shown along with a couple of the existing fact tables in the AdventureWorksDW database) then looks like this:

Updated AdventureWorksDW Model

With these tables populated (using whatever means necessary), we're ready to modify the SSAS solution for Adventure Works to use our new M2M scenario.  I'll follow-up in the next week or so with those details.


john said...

Great article! And great idea too!

In the article you mention:

"When populating this table, we decided to ensure that the DayNumber column provided a unique value for each day within each calendar. Thus, for each row, we decided to assign a prefix (10 for the Gregorian calendar, 20 for the first fiscal calendar, etc.) to each day number..."

But looking at the screenshot of DimCalendar, it looks like you also assigned the same prefix to the QuarterNumber, MonthNumber, and WeekNumber as well – correct? I assume that’s the case, but I just wanted to make sure.

Sam Kane said...

Here are this and some other articles on SSAS Date Dimension

Anonymous said...

The keying of the calendar table, and subsequently the M2M table, isn't clear to me. If the calendar table is only keyed on the Calendar ID, then there should only be one row allowed per calendar type. Is it really some combination of calendar and date? Why did you opt for a surrogate ID in the M2M table? Wouldn't you want to prevent duplicates? I'm intrigued by your results, but there's some apparent holes in the process that I can't figure out.