time dimension: number or date as PK?
DBAsupport.com Forums - Powered by vBulletin
Page 1 of 2 12 LastLast
Results 1 to 10 of 17

Thread: time dimension: number or date as PK?

  1. #1
    Join Date
    Mar 2002
    Posts
    534

    time dimension: number or date as PK?

    Hi everybody,

    I'm creating a new data mart and was wondering what kind of datatype you would recommand to use for the time dimension PK.

    Kimball recommands to allways use numbers but personaly I would prefere to use a date so that the range partitioning of the fact table would be simpler.

    Regards
    Mike

  2. #2
    Join Date
    Nov 2000
    Location
    greenwich.ct.us
    Posts
    9,092
    Never use numbers for dates; It messes up the optimizer.
    Jeff Hunter
    marist89@yahoo.com
    http://marist89.blogspot.com/
    Get Firefox!
    "I pledge to stop eating sharks fin soup and will not do so under any circumstances."

  3. #3
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

  4. #4
    Join Date
    Mar 2002
    Posts
    534
    >Never use numbers for dates; It messes up the optimizer.

    why should it mess up the optimizer. I only use the PK key of the time dimension to join with the FK of the fact table but never to limit the result.

  5. #5
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    Wouldn't the range partitioning be more simple if you used a date?
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

  6. #6
    Join Date
    Nov 2000
    Location
    greenwich.ct.us
    Posts
    9,092
    Quote Originally Posted by mike9
    why should it mess up the optimizer. I only use the PK key of the time dimension to join with the FK of the fact table but never to limit the result.
    How many values are between 20060501 and 20060601? You say 31, the optimizer says 100.
    Jeff Hunter
    marist89@yahoo.com
    http://marist89.blogspot.com/
    Get Firefox!
    "I pledge to stop eating sharks fin soup and will not do so under any circumstances."

  7. #7
    Join Date
    Mar 2002
    Posts
    534
    marist89,

    Could you please be more precise on the definition of "values".
    Why would there be only 31 values? The smallest unit of a date value is one second not one day.
    So back to my question: why should it mess up the CBO?


    Slimdave,

    Yes I agree with you that it would be much simpler to range partition over a date. But the problem is that I have to convince some poeple that it would be better/easier not to take Kimballs way.

  8. #8
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    What arguments does Kimball make in support of his position?
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

  9. #9
    Join Date
    Nov 2000
    Location
    greenwich.ct.us
    Posts
    9,092
    Quote Originally Posted by mike9
    marist89,

    Could you please be more precise on the definition of "values".
    Why would there be only 31 values? The smallest unit of a date value is one second not one day.
    So back to my question: why should it mess up the CBO?
    In the simplest form of a date being an integer, (YYYYMMDD), the first of the month is 20060501. These dates continue until 20060531, or 31 values. However, the optimizer says these are numbers and 20060542 is a valid value, which, of course, we know is not possible.

    By using integers for dates, you are feeding the optimizer bad information. This confusion compounds when you add in additional time components of HH, MI, and SS.
    Jeff Hunter
    marist89@yahoo.com
    http://marist89.blogspot.com/
    Get Firefox!
    "I pledge to stop eating sharks fin soup and will not do so under any circumstances."

  10. #10
    Join Date
    May 2000
    Location
    ATLANTA, GA, USA
    Posts
    3,135
    Yes I agree with you that it would be much simpler to range partition over a date. But the problem is that I have to convince some poeple that it would be better/easier not to take Kimballs way.
    I think you misunderstood the Kimball's theory.
    The design should be :
    Kimball might have said that a DIM table should be created with surrogate key as PK, and date/time col.
    And the FACT table has FK referencing the DIM table.
    There is a small space advantage if you use adopt this design.
    A date col requires 7 bytes internal storage, where as the integer col in the FACT Table may not require 7 bytes.
    The 2nd advantage is tomorrow you have the flexibility of changing the data type of the date col in the DIM table to timestamp data type.

    Tamil

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width