-
Histograms Hindering in chosing optimal access path by CBO.
Hi All,
The index is getting used if i just analyze the table with buckets 1 for COLS stats, where as same is going for FTS when i take STATS of the cols with 50/75/100 buckets.
PS below
Code:
WW04_PRO_DBA> analyze table shipment_2002 delete statistics;
Table analyzed.
WW04_PRO_DBA> analyze table shipment_2002 estimate statistics sample 40 percent;
Table analyzed.
WW04_PRO_DBA> Select
2 TABLE_NAME ,
3 NUM_ROWS ,
4 BLOCKS ,
5 EMPTY_BLOCKS ,
6 AVG_SPACE ,
7 CHAIN_CNT ,
8 AVG_ROW_LEN ,
9 AVG_SPACE_FREELIST_BLOCKS ,
10 SAMPLE_SIZE ,
11 LAST_ANALYZED
12 From
13 dba_tables
14 Where
15 Table_Name = 'SHIPMENT_2002'
16 /
TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE CHAIN_CNT AVG_ROW_LEN SAMPLE_SIZE LAST_ANAL
--------------- ---------- -------- -------------- ----------- ----------- ------------- ------------ ---------
SHIPMENT_2002 1401935 20784 975 874 0 105 460658 11-SEP-03
WW04_PRO_DBA> ed
Wrote file afiedt.buf
1 Select
2 TABLE_NAME ,
3 COLUMN_NAME ,
4 NUM_DISTINCT ,
5 LOW_VALUE ,
6 HIGH_VALUE ,
7 DENSITY ,
8 NUM_NULLS ,
9 NUM_BUCKETS ,
10 LAST_ANALYZED ,
11 SAMPLE_SIZE
12 from
13 dba_tab_columns
14 where
15* Table_Name = 'SHIPMENT_2002'
WW04_PRO_DBA> /
TABLE_NAME COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE DENSITY NUM_BUCKETS SAMPLE_SIZE
---------------- ---------------------- ------------- ------------------------ -------------------------- -------------------- ------------- -------------
SHIPMENT_2002 SLS_ORG_CD 76 41455331 55535645 .0131578947368421 1 460658
SHIPMENT_2002 SLS_ORD_NUMBER 181091 30303030313631343333 49313433303932303530 5.52208558128234E-06 1 460658
SHIPMENT_2002 SHIPID 325636 30303231313333313837 41333733373434323838 .0000030709135353585 1 460658
SHIPMENT_2002 SLS_ORD_LINE_ITEM_NM 1413 303030303031 313232303031 .000707714083510262 1 460658
SHIPMENT_2002 PART_NUMBER 2879 3130333235302D423234 525A32394C2D454E .000347342827370615 1 460658
WW04_PRO_DBA> select * from shipment_2002 where part_number='ajsd';
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=382 Card=487 Bytes=4
3830)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'SHIPMENT_2002' (Cost=382
Card=487 Bytes=43830)
2 1 INDEX (RANGE SCAN) OF 'PK_SHIPMENT_2002' (UNIQUE) (Cost=
6 Card=487)
WW04_PRO_DBA> analyze table shipment_2002 estimate statistics sample 40 percent for all columns size 75;
Table analyzed.
WW04_PRO_DBA> Select
2 TABLE_NAME ,
3 COLUMN_NAME ,
4 NUM_DISTINCT ,
5 LOW_VALUE ,
6 HIGH_VALUE ,
7 DENSITY ,
8 NUM_NULLS ,
9 NUM_BUCKETS ,
10 LAST_ANALYZED ,
11 SAMPLE_SIZE
12 from
13 dba_tab_columns
14 where
15 Table_Name = 'SHIPMENT_2002';
TABLE_NAME COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE DENSITY NUM_BUCKETS SAMPLE_SIZE
---------------- ----------------------- ------------ ----------------------- ------------------------- -------------------- ------------ ------------
SHIPMENT_2002 SLS_ORG_CD 76 41455331 55535645 .00666666666666667 31 460658
SHIPMENT_2002 SLS_ORD_NUMBER 181091 30303030313631343333 49313433303932303530 .0000134466437177281 75 460658
SHIPMENT_2002 SHIPID 325636 30303231313333313837 41333733373434323838 .0000076974590687614 75 460658
SHIPMENT_2002 SLS_ORD_LINE_ITEM_NM 1413 303030303031 313232303031 .00666666666666667 42 460658
SHIPMENT_2002 PART_NUMBER 2879 3130333235302D423234 525A32394C2D454E .00587852006350807 68 460658
WW04_PRO_DBA> select * from shipment_2002 where part_number='ajsd';
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1268 Card=8241 Bytes
=741690)
1 0 TABLE ACCESS (FULL) OF 'SHIPMENT_2002' (Cost=1268 Card=824
1 Bytes=741690)
I dunno why CBO sometimes acts crazyly..could any one explain this behaviour.
Thanks
Abhay.
funky...
"I Dont Want To Follow A Path, I would Rather Go Where There Is No Path And Leave A Trail."
"Ego is the worst thing many have, try to overcome it & you will be the best, if not good, person on this earth"
-
Hi Abhay,
I got a similair problem but didnt have yet enough time to realy investigate it and find out how this histogramms work and how they are used by the CBO.
I guess the reason for the different behaviour is the value of density which change from 0.0003 to 0.005. So the optimizer thinks that it should be faster doing a fts then accesing each row via the index. Why? I haven't got a clue, he doesnt even know if the record exists!?
What happens if you lower the value of DB_FILE_MULTIBLOCK_READ_COUNT does he still do a fts?
You may also have a look at:
Performance poor after analyzeing table
http://asktom.oracle.com/pls/ask/f?p...3126073805757,
Density
http://asktom.oracle.com/pls/ask/f?p...2969235095639,
If you should find a realy good document which explains well how the histogramm is working and how it should be used it would be great if you could make a notice about it in this thread
-
-
Thanks for the links, but I am still not convinced as to why oracle behaves like this, according to my understanding its thinking somethink like
Avg No Of Non popular values per bucket=16.92425926283973353
Number of values per bucket=42
So Avg Number of popular values per bucket ~~ 23
It even isnt recognizing when queried with the very low (repeating values) & its going for FTS. I think 145 bucket size isnt sufficient for distinct count of 2000+ to abloutely determine the skewness of the column.
can somebody correct if i am wrong, or even suggest.
Well i am now getting convienced that Histograms on highly/averagely spread values should not be taken histograms. Well i only wanted to see how Oracle will calculate Evenness of the column.
Histograms will well be useful if distinct count of the value is well less than 254.
I would like to get suggestion anyhow ( from probabaly slimdave, pando, jurij, dapi, tamil, chris )
Abhay.
funky...
"I Dont Want To Follow A Path, I would Rather Go Where There Is No Path And Leave A Trail."
"Ego is the worst thing many have, try to overcome it & you will be the best, if not good, person on this earth"
-
This guy has a lot of complex scripts that would do the histogram-trick correctly.
Does anyone use these scripts? And if so, do they work okay?
I don't want to just copy the scripts and run them obliviously, but I can't determine if these script are totally useless or not. The biggest problem - for me - is the technical english in some of these docs.
An expert is one who knows more and more about less and less until he knows absolutely everything about nothing.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|