Dimensions definition
DBAsupport.com Forums - Powered by vBulletin
Results 1 to 9 of 9

Thread: Dimensions definition

  1. #1
    Join Date
    Jun 2000
    Location
    Madrid, Spain
    Posts
    7,447

    Dimensions definition

    Hi

    I have to modify a Datawarehouse and adapt it to multi-company. Basically this is what I should do

    a Sales fact table, the Group was one company but after some merges it becomes three companies. So now they have to do rollup and drill down analysis based on companies, product category and Category type

    For example

    company ===> 3 companies

    product category ===> engine oil, fuel etc

    category type ===> Services, marketing product, finishing material etc

    The new aggreates can be based on companies, product category and/or category type

    Some data

    Company
    ============
    company_A
    company_B
    company_C

    Product category (these are codes)
    ===================================
    10
    20
    30
    40
    50
    60

    Category type (these are codes)
    ===================================
    01
    02
    MK
    PS
    SV


    The problem is any company can or NOT to share the product category codes and category type codes. I.E

    Company_A, 10, 01 can be same as Company_B, 10, 01 or Company_C, 20, 02



    My question how on earth would the dimensions defined? I was thinking to define three dimensions however since the codes by themselves EXCEPT the company code are not UNIQUE (The unique key would be the combination of all three) so there is no way to isolate product category and category type in their dimension (If my understanding of dimension is correct)

    I am thinking to simply add the three columns to the fact table and byebye to the dimensions...

    Any suggestions?

  2. #2
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    Do you mean that the same code values are shared among the companies, but that knowing the meaning of the codes requires that you know the company also, because the sharedcode may mean something different in each company?
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

  3. #3
    Join Date
    Jun 2000
    Location
    Madrid, Spain
    Posts
    7,447
    The codes can have or not the same meaning in different company so basically is to get a unique code we require the company code too

    For example code 01 in company_A can be Fuel and 01 in company_B can be engine oil and 01 in company_C can mean something else OR same to any in company_A or company_B

    So the PK would be company_code, product_category_code and category_type_code. To make a dimension on these three columns would be pointless IMHO because these three columns have to go to the fact table, also would be pointless because the analysis can be based on any of these three columns or two of them or all of them

  4. #4
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    This is a tricky business then.

    A conventional way of dealing with this would be to translate the codes to a new, common basis. So if companies A, B, and C all have different codes for the product category "fuel", you would define a single standard "fuel" code (and it need bear no relation to the existing codes -- in fact, it's probably better if it doesn't, in order to avoid confusion). It's alsways nice if you can find some industry standard set of codes to map to, though it's not always possible to do so.

    If you really require that the companies be able to view their native codes then you could take one of two approaches.

    Approach 1 is to make the company's native codes children of the standard code, and do the company/code concatanation thing to make them unique. Only the company native code would appear in the fact table -- the standard code would be related to the native code through the dimension table.

    Approach 2 is to make the standard and native codes completely independant of each other by storing both in the fact table and maintaining separate dimensions tables for each.

    Whatever approach you take i would still maintain the company as a separate column from the product category in the fact table.

    It's stuff like this that means 70-80% of the DW effort is devoted to ETL coding.
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

  5. #5
    Join Date
    Jun 2000
    Location
    Madrid, Spain
    Posts
    7,447
    Hi Slimdave

    Thx for the replies.

    I just found out today that the codes are unique, i.e company_id is unique, product_category_id is unique and category_type_id is unique as well. Thing seems better now.

    Now I have some denormalization/normalization query. After knowing these three attributes can be three dimensions by themselves I am proceeding to create three dimensions, one for company, one for product_category and one for category_type. The SALES fact table would have all these three dimensions PK and corresponding FK to these dimensions. The query is there is a customer table which supposes to have all these three dimensions PKs as well because before adding these dimensions customer_id was the PK of customer table but from now on the combination of

    customer_id
    company_id
    product_category_id
    category_type_id

    is the new PK of customer table, I wonder should I create relationship between customer table and the three new dimensions? In OLTP I would do it but this is DWH and I am not so sure (I am not a modeling expert nor OLTP or DWH). Any suggestions :-?

    Cheers

  6. #6
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    The usual DWH practice is to code the ETL procedure to guarantee referential and primary key integrity. In such cases you define the PK and FK constraints but make them ...

    * novalidate
    * disable
    * rely

    The last setting will tell the optimizer to assume integrity even though the constraint is neither enforced nor validated by the db.

    You also save space on defining a pk index, which is often as large as the table itself.

    Is there a direct relationship between product category and category type, like type is the parent of category?
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

  7. #7
    Join Date
    Jun 2000
    Location
    Madrid, Spain
    Posts
    7,447
    No, there is no direct relationship between product category and category type...

    cheers

  8. #8
    Join Date
    Jun 2000
    Location
    Madrid, Spain
    Posts
    7,447
    By the way the company will only have 5 rows, product category 20 and category type 10... Not sure if they should be dimensions with so few number of rows! And simply add the ID columns and the description columns of all these three in the customer table then simply in the ETL process do the integrity checks
    Last edited by pando; 11-22-2003 at 03:16 AM.

  9. #9
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    Nothing wrong with so few rows -- I have many dimensions of just two or three values. That's what them bitmap indexes are for.
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width