DW-- Long hours for minimum incremental data
One of our client is having an DW environment, where
300mb of weekly incremental data is applied every week to the
Data warehousing environment.
They do the below steps,
Download the data which is in the form of Dump.
ETL,(17 procedures are running) which is running for 35 hours.
( The experts of the project who were supporting this told
the time taken is correct, but we doubt.)
Cube building --RUnning for 12 hours.
Every week they are doing the above procedure.
Database -- Oracle 8i.
Since We are new to data warehousing environment and we
were asked to look into this, we need advices from experts
about the job running for long hours. We think they are doing
the entire rebuilding instead of applying the incremental data
into the required format.
I am sure that I will get immediate reply from Slimdave / Tamil..if they available.
you should provide us with more info of the unput data, the programs
being run, and their codes what jah think
trace the statements, see what is being done all you have said so far is that something is being done and it takes a long time
I think you are more concern to the whole process of warehousing
the new data to the DW database and not a specific procedure
that hampers the performance of loading the data.
I have designed a DW setup before that cut the usual whole process of loading new data to the DW server, regenerate the cube to up to 400%.
Here is what I did, create a dummy database that its only purpose is to load new data anytime as long as the new data(either dump, data from other db server, etc.) is available. After the new data are loaded, when DW DB server is available for use, retrieve all new data from the dummy server using dblink and then update the date range of the cubes(discover in my case at that time) for regenerating/updating the cube. And then truncate again all data from dummy server.
So, users still enjoy the use of the DW server, while the dummy database uploads the new data. And I do it every night.
I don't like to sound picky, but it seems like what you want is for someone else to do your consultancy engagement for you.
So, maybe they are indeed doing a full rebuild instead of an incremental one. I guess the first step would be to actually find out.
It takes tham 35 hours + 12 hours for cube rebuilds to load 300Mb of data? That's less than a single CD! What's their ETL tool, Quick BASIC?
Started my analysis.(Oppurtunity for my to get into DW environment.)
There is no ETL tool used.
18 custom developed procedures are used during ETL operations.
Let me trace the procedures, and anlyse where the problem is.
Click Here to Expand Forum to Full Width