Incorrect and inconsistent result while grouping very large volume of data
DBAsupport.com Forums - Powered by vBulletin
Page 1 of 2 12 LastLast
Results 1 to 10 of 16

Thread: Incorrect and inconsistent result while grouping very large volume of data

  1. #1
    Join Date
    May 2006
    Location
    New York, USA
    Posts
    8

    Incorrect and inconsistent result while grouping very large volume of data

    Hello - here I am facing an interesting problem. The two aggregate queries below on the same table, written in slightly different way, gives me different results! More surprising is that, the second query gives different different aggregate result everytime I run it, and each time the result is different from what it actually should be - the value of first query. I have tested exactly similar kind of queries on smaller data set, say couple of millions, no problem at all. But the problem starts if the table has large number of records, which is true in my case - the f_rrn_gross table has 260M records, and it is partitioned by one of the date key fields (surrogate key). All the group by fields are actually surrogate key in the fact table, and they are all not null.

    I have tried different test cases to find out the root cause of the problem, but no clue till today. Consulted multiple DBAs, even opended a TAR in metalink, no solution or problem reason could be offered. Finally decided to post into this forum, wondering if someone could help me out.

    I would really appreciate if you can throw some light why this never-seen-before, unexplainable (to me) problem is happening.

    Thanks so much.
    -Titash.

    -------------------------------------------

    SELECT SUM(RR_USD_AMT)
    FROM F_RRN_GROSS

    42548085415.4457

    SELECT SUM (tot_rr_usd_amt)
    FROM (SELECT /*+ parallel(f,10) */
    SUM (rr_usd_amt) tot_rr_usd_amt
    FROM f_rrn_gross f
    GROUP BY bkng_date_key,
    bkng_loc_key,
    cur_key,
    PROMO_KEY,
    prop_key,
    rate_plan_key);

    42545111219.4799

  2. #2
    Join Date
    Nov 2002
    Location
    Geneva Switzerland
    Posts
    3,142
    An expression generally assumes the datatype of its components.
    So, what is the datatype of rr_usd_amt?

    Is it possible that you have VERY large, compensating, values in the GROUP BY sub-query? Could these produce rounding errors?
    "The power of instruction is seldom of much efficacy except in those happy dispositions where it is almost superfluous" - Gibbon, quoted by R.P.Feynman

  3. #3
    Join Date
    May 2006
    Location
    New York, USA
    Posts
    8
    Thanks for the response.

    The datatype of rr_usd_amt is NUMBER (20,6).

    I concure with that 6 decimal point precision could have created rouding error, but I have had the same discrepancies/problem even when I rounded off the rr_usd_amt field to 0 decimal point, and tehn summed up, like the one below.

    I would really appreciate if you could throw some more light on this.

    Thanks once again.

    -Titash.

    -----------------------------------------------
    SELECT SUM(round(RR_USD_AMT))
    FROM F_RRN_GROSS

    SELECT SUM (tot_rr_usd_amt)
    FROM (SELECT /*+ parallel(f,10) */
    SUM (round(rr_usd_amt)) tot_rr_usd_amt
    FROM f_rrn_gross f
    GROUP BY bkng_date_key,
    bkng_loc_key,
    cur_key,
    PROMO_KEY,
    prop_key,
    rate_plan_key);

  4. #4
    Join Date
    Nov 2002
    Location
    Geneva Switzerland
    Posts
    3,142
    I suspect I'm barking up the wrong tree on this. Before declaring it to be a bug, I'd look at:

    SELECT MAX(tot_rr_usd_amt), MIN(tot_rr_usd_amt)
    FROM (SELECT /*+ parallel(f,10) */
    SUM (rr_usd_amt) tot_rr_usd_amt
    FROM f_rrn_gross f
    GROUP BY bkng_date_key,
    bkng_loc_key,
    cur_key,
    PROMO_KEY,
    prop_key,
    rate_plan_key);
    "The power of instruction is seldom of much efficacy except in those happy dispositions where it is almost superfluous" - Gibbon, quoted by R.P.Feynman

  5. #5
    Join Date
    May 2006
    Location
    New York, USA
    Posts
    8
    Thanks for the quick response, I really appreciate this.

    Here is the result of the query:

    -----------------------------------------------------------
    MAX(TOT_RR_USD_AMT) MIN(TOT_RR_USD_AMT)
    376001148.24 -77255

    -----------------------------------------------------------

    Thanks a lot for your help.

    -Titash.

  6. #6
    Join Date
    Feb 2005
    Posts
    158
    Any chance you have a query rewrite on a stale MV ?
    How does it look if you just count(*), rather than summing $ values

    SELECT count(*)
    FROM F_RRN_GROSS;

    SELECT SUM (cnt)
    FROM (SELECT /*+ parallel(f,10) */
    count(*) cnt
    FROM f_rrn_gross f
    GROUP BY bkng_date_key,
    bkng_loc_key,
    cur_key,
    PROMO_KEY,
    prop_key,
    rate_plan_key);

  7. #7
    Join Date
    Nov 2002
    Location
    Geneva Switzerland
    Posts
    3,142
    Code:
    MAX(TOT_RR_USD_AMT)	MIN(TOT_RR_USD_AMT)
    376001148.24	            -77255
    I think that kills my rounding error theory.
    "The power of instruction is seldom of much efficacy except in those happy dispositions where it is almost superfluous" - Gibbon, quoted by R.P.Feynman

  8. #8
    Join Date
    May 2006
    Location
    New York, USA
    Posts
    8
    1. No, there is no MV with query rewrite.
    2. Both the queries give same final count, no issue here:

    Thanks so much for your help.

    -----------------------------------------------------
    SELECT count(*)
    FROM F_RRN_GROSS;

    297115024

    SELECT SUM (cnt)
    FROM (SELECT /*+ parallel(f,10) */
    count(*) cnt
    FROM f_rrn_gross f
    GROUP BY bkng_date_key,
    bkng_loc_key,
    cur_key,
    PROMO_KEY,
    prop_key,
    rate_plan_key);

    297115024

  9. #9
    Join Date
    May 2006
    Location
    New York, USA
    Posts
    8
    Hi DaPi,

    Do you think this can be a Oracle 10.2.0.1.0 (this is version of oracle I am using) bug? If you think so, by any chance, is there any patch or work-arround which you are aware of?

    Thanks so much for your help, I really appreciate this.

    -Titash.

  10. #10
    Join Date
    May 2000
    Location
    ATLANTA, GA, USA
    Posts
    3,135
    Quote Originally Posted by titash
    Hi DaPi,

    Do you think this can be a Oracle 10.2.0.1.0 (this is version of oracle I am using) bug? If you think so, by any chance, is there any patch or work-arround which you are aware of?

    Thanks so much for your help, I really appreciate this.

    -Titash.

    Did you try w/o parallel threads?

    Tamil

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Click Here to Expand Forum to Full Width