DBAsupport.com Forums - Powered by vBulletin
Results 1 to 3 of 3

Thread: Difference between Compute and estimate statistics

  1. #1
    Join Date
    Jun 2000
    Location
    chennai,tamil nadu,india
    Posts
    159
    Can anyone reply me with the actual difference between compute statistics and estimate statistics of analyze command.

  2. #2
    Join Date
    Nov 2000
    Posts
    344
    Compute uses all rows in the table/index/cluster to determine the statistics.

    Estimate uses only a portion of the rows (you specify how many rows or what percent of rows). If you are using, say , 50 % of the rows, I don't know if Oracle uses every other row, or the first or last half, though... does anyone know which portion of the rows Oracle uses?

    -John

  3. #3
    Join Date
    Nov 2000
    Location
    Baltimore, MD USA
    Posts
    1,339
    I'm not actually sure which records it reads. However, what I *have* noticed:

    - The stats *will not* change if neither the number of rows nor the data changed.
    This is important. This means that the algorithm they use is deterministic and based on the number of rows in the table together with the % to estimate.

    - If you add a single row, which may be less than 1/1000th of 1%, the stats can change by up to 5%. This is affected significantly by how 'skewed' your data is.

    - If you have any 'extreme' data values, do *not* use estimation! For example, say 99% of the data in a given, indexed date field are within the past 5 years. *However*, this one rogue entry exists with a year of 1897. With Estimation, this value may or may not be found each time. As you can imagine, such a value will significantly change the high and low values in your stats, and therefore the applicability of the index in ranged selects. Therefore, with estimation, you never know what kind of plans you will get.

    Of course, the solution is to fix the data, but this is sometimes not possible. Another solution is to create a histogram, but that only works if your date values are 'hard-coded' in the queries. The final choice is to always COMPUTE that table and optimize to a consistent set of 'bad' statistics, possibly hinting the index where necessary.

    - So, basically, estimation is best for tables with relatively even distribution and no 'extreme' values. If you *do* use estimation, Oracle suggests using at least 30%. Further, any value of 60% or greater is the same as using 100%.

    Hope that all made sense,

    - Chris

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Click Here to Expand Forum to Full Width