-
DataWarehouse with hourly updates
Whats the best approach on this ? I have been asked during 2 different interviews how to do it.
Originally I had monthly partitions, then we were going to daily. In order to only summarize and index the delta 'one day' I was going to continue the monthly partition table and alongside that would be a 31 partition daily table. I was going to summarize each week to forestall having to do it all on the last day of the month.
Clearly we just want to summarize/index the delta. Now as this model changes to 'Hourly' delta's can I extend this approach further? Union monthly_partitioned table Union Daily_partitioned_table Union Hourly_partitioned_table.
I have only read about materialized views academically but am quite sure there is no magic or silver bullet there.
Have you built an hourly? Did you make each delta a seperate partition so you could summarize/index seperately and snap-on with 'exchange partition' ? Did you use seperate partitioned objects while still allowing the application to query them together ?
Thanks for your ideas in advance.
-
Tell them you split it into nanosecond partitions to add an element of future-proofing. Then stand up, slap them and walk out
-
Originally Posted by TimHall
Tell them you split it into nanosecond partitions to add an element of future-proofing. Then stand up, slap them and walk out
Given that you're a Karate black belt, wouldn't you rather poke them in the solarplexus and yell "kiai!!" ?
-
There's no reason why you couldn't be loading into hourly partitions, then overnight merge them into a single day partition, then at the end of the week merge the month-to-date daily partitions into a monthly one. Does that make sense? You can have a range partitioned table with varying granularity of partition key, and I don't believe that merge is a very expensive operation.
-
Originally Posted by slimdave
There's no reason why you couldn't be loading into hourly partitions, then overnight merge them into a single day partition, then at the end of the week merge the month-to-date daily partitions into a monthly one. Does that make sense? You can have a range partitioned table with varying granularity of partition key, and I don't believe that merge is a very expensive operation.
Right. Thats the approach I took going monthly to daily. I was wondering if there was a better way, because with hourly you have to union queries to all 3 types of partitioned tables or put a view in front of it them. I suppose if its setup properly then oracle might be able to eliminate one of them if the key is out of range, similarily to how partition elimination works. Actually I think thats doubtful. At any rate you can see the application programmers looking at it as a kluge.
Was wondering if there was a slicker way to pull them together than view or union.
-
You're talking about having multiple tables, each with a consistent partition granularity.
I'm talking about having a single table with a varying partition granularity. Your current setup of two unioned tables with different granularities is kludgy, and you ought to have a single table.
-
Originally Posted by slimdave
You're talking about having multiple tables, each with a consistent partition granularity.
I'm talking about having a single table with a varying partition granularity. Your current setup of two unioned tables with different granularities is kludgy, and you ought to have a single table.
It is kludgy. But I cannot believe I talked myself into thinking they could not go into the same table. Oracle would neither know nor care that some partitions were month intervals, day, hour etc. But there may still be a snag, the partitioning key of the monthly partitions would be YEAR-MON, this would need to be extended to include hour for the smallest granularity partitions. I think thats where its stops working for me. So if month keys were extended to YEAR-MON-DAY-HOUR then I wonder how that will play out when the DAY-HOUR portion is zeroes.
-
Why not just use a single column date type for the key?
-
Because thats how the big telecom DW I improved was originally setup. I dont see why it wont work out with a simple date partition key now that you point it out. Thanks for talking some sense into me.
-
The telecom approach might make some sense if they wanted to compare the numbers for a particular hour in day one to the same hour in day two, I suppose, but it sounds yucky.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|