Sunday, February 19, 2012

Create Dimension Table from Fact Table!!

I have picked an exmple from this forum, to help me explain my current problem...

"I'm looking for a solution to import data from a flat file into an normalized data modell. To explain it a little simpler think about to following:

The Data Souce is a CSV-File with FirstName, LastName and Category. Sample data could be

Dirk; Bauer; sailing
Peter; Bauer; fishing
Marc; Bauer; reading

In my data modell I have defined the 2 tables "Person" and "Category":

Table "Person"
-
[PersonID] [int] IDENTITY(1,1) NOT NULL
[CategoryID] [int] NOT NULL
[FirstName] [nvarchar](50)
[LastName] [nvarchar](50)

Table "Category"
-
[CategoryID] [int] IDENTITY(1,1) NOT NULL
[CategoryName] [nvarchar](50)

Now I like to read my first row from the source and lookup a value for the CategoryID "sailing". As my data tables are empty right now, the lookup is not able to read a value for "sailing". Now I like to insert a new row in the table "Category" for the value "sailing" and receive the new "CategoryID" to insert my values in the table "Person" INCLUDING the new "CategoryID".

I think this is a normal way of reading data from a source and performing some lookups. In my "real world" scenario I have to lookup about 20 foreign keys before I'm able to insert the row read from the flat file source.

I really can't belief that this is a "special" case and I also can't belief that there is no easy and simple way to solve this with SSIS. Ok, the solution from Thomas is working but it is a very complex solution for this small problem. So, any help would be appreciated...

Thanks,
Dirk"

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=74752&SiteID=1

Could someone help me creating the dimension table?

Thanks!!

It's very common to derive dimensions from source data - that is the purpose of a data warehouse after all.

What you need to do is perform the dimension lookup with your source data. For values not found (New dimensions), you can run the records down the error output into a Derived Column derivation to create your new dimension table complete with keys, then merge them back into the flow to update the dimension tables. Your methodology will determine how this is done.

|||How about seperate packages for dimensions and facts. The dimensions get built first from the source data so there will always be a lookup match when building the fact|||That will work as well. Always transform dimensions before facts.

Wes|||

wesd wrote:

That will work as well. Always transform dimensions before facts.

Wes

That's what I would do.

|||Please mark this thread as answered, the op has the information he needs. Email me if you need more information.|||

I will Try!!

Thanks!!!

No comments:

Post a Comment