DWH: how do you analyze fact and dimension tables
DBAsupport.com Forums - Powered by vBulletin
Results 1 to 9 of 9

Thread: DWH: how do you analyze fact and dimension tables

  1. #1
    Join Date
    Mar 2002
    Posts
    534

    DWH: how do you analyze fact and dimension tables

    Hi,

    I was wondering how you analyze your fact and dimension tables.

    Our fact table is partitioned per month. Each partition contains 4M rows and is 270 MB large. We are using 9 dimensions, 6 have about 50'000 rows (2MB), 1 about 1M rows (50MB) and 2 about 3M rows (200MB). All tables are compressed. The version of Oracle we are using is 9.2.0.5.

    What I was wondering is how you would, using dbms_stats, analyze the fact and dimension tables. Which percentage would you analyze? On which column would you build histograms?

  2. #2
    Join Date
    Dec 2002
    Location
    Bangalore ( India )
    Posts
    2,434
    any sample of more than 5% would be better and histograms for columns which are frequently joined.. ( which would mostly likely be indexed columns .. which otherwise i dont see why would an index be present
    on that col/cols )

    For partitions.. i would think stats for new partitions created would be sufficeint, as a general rule, where in application would have increasing amount of data on latest partitions which are mostly on date columns..

    Abhay.
    funky...

    "I Dont Want To Follow A Path, I would Rather Go Where There Is No Path And Leave A Trail."

    "Ego is the worst thing many have, try to overcome it & you will be the best, if not good, person on this earth"

  3. #3
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    Random thoughts ...

    i) Queries that address single partitions use partition-level statistics, otherwise they use global statistics.

    ii) I compute statistics where time permits, and analyze all indexed columns even if they have no skew.

    iii) Some global statistics can be calculated from partion-level statistics (number of rows, blocks etc) pretty easily. For others (column stats) it is usually necessary to estimate.
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

  4. #4
    Join Date
    Dec 2002
    Location
    Bangalore ( India )
    Posts
    2,434
    Originally posted by slimdave
    ii) I compute statistics where time permits
    Any good compared to estimate with sample more than 5%.. ( for very large tables )

    Abhay.
    funky...

    "I Dont Want To Follow A Path, I would Rather Go Where There Is No Path And Leave A Trail."

    "Ego is the worst thing many have, try to overcome it & you will be the best, if not good, person on this earth"

  5. #5
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    Originally posted by abhaysk
    Any good compared to estimate with sample more than 5%.. ( for very large tables )

    Abhay.
    "... where time permits"
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

  6. #6
    Join Date
    Dec 2002
    Location
    Bangalore ( India )
    Posts
    2,434
    Yes .. but when you can get almost same stats with estimate, then why waste resources ( rather resource's time ) by computing on table/index..?
    funky...

    "I Dont Want To Follow A Path, I would Rather Go Where There Is No Path And Leave A Trail."

    "Ego is the worst thing many have, try to overcome it & you will be the best, if not good, person on this earth"

  7. #7
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    Originally posted by abhaysk
    Yes .. but when you can get almost same stats with estimate, then why waste resources ( rather resource's time ) by computing on table/index..?
    "... where time permits"
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

  8. #8
    Join Date
    Mar 2002
    Posts
    534
    what's your opinion/experience about star transformation?

    Does Oracle im most case take the right decision to either access the fact table via full table scan or rowid (TABLE ACCESS BY LOCAL INDEX ROWID) ?

  9. #9
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    Generally so, I think. It's not always correct because estimating the number of rows to be accessed is a tricky business in the real world as Oracle doesn't have a mechanism for maintaining multi-column histograms, which would really help in many cases.

    The key is to get the right value of optimizer index cost adj, or to collect system (hardware) statistics.
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width