DBAsupport.com Forums - Powered by vBulletin
Results 1 to 1 of 1

Thread: Data Warehouse Query Performance question

  1. #1
    Join Date
    Nov 2008
    Posts
    1

    Data Warehouse Query Performance question

    I'm trying to write a complicated query for a data warehouse that uses the star schema. One of the fact tables holds inventory history data for many purposes. One of those purposes is history of inventory holds. One of the dimensions for this fact table contains a flag that, when set to true indicates the inventory is put on hold, when set to false indicates the inventory is taken off of hold, and when not set indicates that this fact record doesn't pertain to a hold.

    The query I'm writing is for a report that will show inventory that was placed on hold and how many of those have been taken off hold, indicating the remaining quantity that needs to be handled. To further complicate matters, it is possible to "handle" the On-Hold inventory in increments. In other words, you could place 100 parts on hold at once but then the released quantity could come back one at a time, 10 at a time, or all 100 at once. That means I need to sum up the released quantity on the fact records as they pertain to the released holds.

    Here is the solution I've come up with:

    I created one view called the init_hold_view that fetches the fact records pertaining to initialized holds (*note that I've left out all but the relevant names*)

    Code:
    select [dimension_keys],
           [needed_dimension_data]
           fact.qty init_qty
      from fact_table fact
      left join [all_dimension_tables]
           on [dimension.dimension_key] = [fact.dimension_key]
     where hold_disposition = 'T'
    I then created a second view called the released_hold_view that fetches the fact records pertaining to released holds.

    Code:
    select [dimension_keys],
           fact.qty rels_qty
      from fact_table fact
      left join [disposition_dimension]
        on [dim_key] = [fact_key]
     where hold_disposition = 'F'
    My report query then uses the two views to find the results:

    Code:
    select init.init_qty,
           sum(rlsd.rels_qty) rels_qty,
           [needed_dimension_data]
      from init_hold_view init
     inner join released_hold_view rlsd
        on [matching dimension keys]
     where ...
     group by [dimension_data and init_qty]
    My worry is that I'm performing too many joins and I need to find an optimal solution that can scale to millions of rows. I was hoping that by joining simple integer keys across views into the same table that I was saving myself some work. Looking back at it, I'm not so sure there isn't a better way. Does anyone have any suggestions on a better way to do something like this?

    Thank you in advance,
    J
    Last edited by gitter1226; 11-10-2008 at 11:38 AM. Reason: sorry - doing some formatting

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Click Here to Expand Forum to Full Width