We have a daily job, run via cron starting at 2 AM, that collects schema stats via the DBMS_STATS package. ETL processing starts at 7:30 AM daily. The stats collection job does not finish before 7:30 AM; it usually keeps running until about 11 AM or noon.
Is it correct to assume that the ETL processes use the stats collected from the previous day's run? The concern here is that maybe no stats are being used by the optimizer at all.
It is hard to say what is being used. Please post the complete syntax of the dbms_stats job.
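For example, something along these lines, with your actual schema and values filled in (the names and parameters below are just placeholders, not your job):

  BEGIN
    DBMS_STATS.GATHER_SCHEMA_STATS(
      ownname          => 'DW_SCHEMA',   -- placeholder schema name
      estimate_percent => 30,            -- sampling percentage
      degree           => 4,             -- parallel degree
      cascade          => TRUE);         -- also gather index stats
  END;
  /

Settings like estimate_percent, degree, and cascade make a big difference to how long the run takes, which is why the exact syntax matters here.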
It's easy to say: whatever stats were in place when the query was parsed will be used.
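You can check which stats were in place by looking at when each table was last analyzed (schema name is a placeholder):

  SELECT table_name, last_analyzed
  FROM   dba_tables
  WHERE  owner = 'DW_SCHEMA'   -- placeholder schema name
  ORDER  BY last_analyzed;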
Wow, you run stats for 10 hours? How big is the DB? What degree do you use?
Basic Q is: why daily? Why not the GATHER STALE option?
"I Dont Want To Follow A Path, I would Rather Go Where There Is No Path And Leave A Trail."
"Ego is the worst thing many have, try to overcome it & you will be the best, if not good, person on this earth"
It's a DW, around 450 GB in size. GATHER STALE is not used because MONITORING is not turned on. As davey23uk stated, if the previous run's stats are being used then it's fine.
BTW - I saw notes on Metalink suggesting an estimate percent of 1-5%. Our jobs use 30%, and we have some vendor code that uses 3%. Is there a consensus on this too?
Thanks to all who replied
Try it with 1% and see how things perform.
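That is, drop the sampling percentage in the gather call (schema name is a placeholder); you can also let Oracle pick the sample size itself with DBMS_STATS.AUTO_SAMPLE_SIZE:

  BEGIN
    DBMS_STATS.GATHER_SCHEMA_STATS(
      ownname          => 'DW_SCHEMA',   -- placeholder schema name
      estimate_percent => 1);            -- 1% sample instead of 30%
  END;
  /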
It's pretty easy to test the accuracy of sampling: just use the SAMPLE clause on a SELECT statement with varying percentages, based on rows or blocks, and see how precise and accurate the row counts and column min/max values are.
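For example (table and column names here are placeholders):

  -- Full count for reference
  SELECT COUNT(*) FROM dw_schema.sales;

  -- 1% row sample, scaled up to estimate the full count
  SELECT COUNT(*) * 100 FROM dw_schema.sales SAMPLE (1);

  -- 1% block sample (faster, but can be less accurate on clustered data)
  SELECT COUNT(*) * 100 FROM dw_schema.sales SAMPLE BLOCK (1);

  -- How well does a 5% sample capture column extremes?
  SELECT MIN(order_date), MAX(order_date) FROM dw_schema.sales SAMPLE (5);

Run these a few times at different percentages and compare against the full-table results to see how much accuracy you actually give up at low sample sizes.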
Since you want to collect stats for a very large database, I would prefer to go with GATHER AUTO or GATHER STALE combined with table monitoring.
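A sketch of that approach (schema and table names are placeholders; on 10g and later, monitoring is handled automatically via STATISTICS_LEVEL, so the ALTER TABLE step is only needed on older versions):

  -- Enable DML monitoring on each table (pre-10g only)
  ALTER TABLE dw_schema.sales MONITORING;

  -- Then gather stats only for tables whose stats have gone stale
  -- (roughly 10% of rows changed since the last gather)
  BEGIN
    DBMS_STATS.GATHER_SCHEMA_STATS(
      ownname => 'DW_SCHEMA',
      options => 'GATHER STALE');
  END;
  /

On a 450 GB DW this would typically cut the nightly run down substantially, since most large fact tables don't change enough every day to need fresh stats.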