|
-
Hi,
I have read some documents in metalink about histograms, and it is not much clear for me when to use it.
Could you explain me when it is recimmended their use for increase performance of sql statements, and when it will decrease the performance?
Thanks in advance and best regards
Angel
-
In short, they can be very usefull if your data distribution on a particular table is very skewed.
For example, let's say you have 10000 rows in a table. On some indexed column col1 you have 5000 distinct values. So for CBO, you have on avarage 2 rows per key. So when you isue some query with condition:
... WHERE col1=0 ....
optimizer will probably find it very efficient to use that index.
But in reality you might have the situation that you have 5000 rows with value col1=0 and the remaining 5000 rows having all other distinct values. Without having histograms, CBO can not be aware that full table scan would be much more efficient than index scan for that particular query. But if you have proper histograms for that indexed column, CBO will have much more information about the actual data distribution and can choose much better execution plan.
When are histograms useless? probably in cases where:
- Data distribution is even, so that avarages are quite reliable and sufficient.
- You use bind variables. In that case histograms are useless, because the actual values used in the WHERE condition are not known at parse time, when the optimal execution path is determined (actual values are bound to the query only at execution time, when execution path is allready determined).
Jurij Modic
ASCII a stupid question, get a stupid ANSI
24 hours in a day .... 24 beer in a case .... coincidence?
-
Thank you very much Jurij for you well explanation
Regards
Angel
-
To add something...:
The first rule is to use histograms only if you understand well what they are and used for.
I have noticed that it is risky to make histograms for all columns, I have used them only for columns having a high degree of skew data.
The basic rules are not to use histograms
1. If the column is unique and used with =.
2. If you do not have that column in where statements.
3. If the column is uniformly distributed.
Try to understand the rules for choosing the sample size when analyzing (for example with histograms the number of sample rows must be > 100 times the number of buckets, etc). Then you should know how to choose the number of buckets, etc.
I do not use histograms for the predicates on the columns in our applications often use bind variables. During testing with histograms the applications was slower than without histograms.
-
It is not my case. It is simply curiosity, if I want to delete histograms, wich will be the steps to do it?
Thanks in advance
Angel
-
Originally posted by aarroyob
It is not my case. It is simply curiosity, if I want to delete histograms, wich will be the steps to do it?
Thanks in advance
Angel
:-) Good question.
According to Oracle, histogram information will always exist, according to bug 584092, which was closed as not a bug:
"When the FOR clause is not specified it is a histogram of one bucket.
When the optimizer sees this, it won't consider it a histogram. "
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|