Difference between count(id) and count(distinct id) on unique column
DBAsupport.com Forums - Powered by vBulletin
Results 1 to 3 of 3

Thread: Difference between count(id) and count(distinct id) on unique column

  1. #1
    Join Date
    Mar 2002
    Posts
    534

    Question Difference between count(id) and count(distinct id) on unique column

    Hi,

    I'm working on Oracle 9.2.0.4.

    I noticed recently that Oracle makes a difference if a count() or a count(distinct) is executed against a column defined as unique (unique index + unique constraint + not null).

    From my point of view a count() and count(distinct) against a unique column should take the same query time because Oracle should know that each column is unique.
    However, as you can see below, when the distinct is used Oracle makes a group by and doesn't just aggregate the data.

    So my question is: has anybody a explanation for that behavior or any idea what I can do so that Oracle is as fast with as without the distinct?

    Code:
    SQL>  select count(  id) from sales3;
    
     COUNT(ID)
    ----------
       2032542
    
    Abgelaufen: 00:00:03.00
    
    Ausführungsplan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1)
       1    0   SORT (AGGREGATE)
       2    1     INDEX (FAST FULL SCAN) OF 'I_RN' (UNIQUE) (Cost=3 Card=1
              034313)
    
    SQL> select count( distinct id) from sales3;
    
    COUNT(DISTINCTID)
    -----------------
              2032542
    
    Abgelaufen: 00:00:07.00
    
    Ausführungsplan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1 Bytes=4)
       1    0   SORT (GROUP BY)
       2    1     INDEX (FAST FULL SCAN) OF 'I_RN' (UNIQUE) (Cost=3 Card=1
              034313 Bytes=4137252)

  2. #2
    Join Date
    Aug 2002
    Location
    Bangalore, India
    Posts
    405

    Re: Difference between count(id) and count(distinct id) on unique column

    Originally posted by mike9
    Hi,

    I'm working on Oracle 9.2.0.4.

    I noticed recently that Oracle makes a difference if a count() or a count(distinct) is executed against a column defined as unique (unique index + unique constraint + not null).

    From my point of view a count() and count(distinct) against a unique column should take the same query time because Oracle should know that each column is unique.
    However, as you can see below, when the distinct is used Oracle makes a group by and doesn't just aggregate the data.

    So my question is: has anybody a explanation for that behavior or any idea what I can do so that Oracle is as fast with as without the distinct?

    Code:
    SQL>  select count(  id) from sales3;
    
     COUNT(ID)
    ----------
       2032542
    
    Abgelaufen: 00:00:03.00
    
    Ausführungsplan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1)
       1    0   SORT (AGGREGATE)
       2    1     INDEX (FAST FULL SCAN) OF 'I_RN' (UNIQUE) (Cost=3 Card=1
              034313)
    
    SQL> select count( distinct id) from sales3;
    
    COUNT(DISTINCTID)
    -----------------
              2032542
    
    Abgelaufen: 00:00:07.00
    
    Ausführungsplan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1 Bytes=4)
       1    0   SORT (GROUP BY)
       2    1     INDEX (FAST FULL SCAN) OF 'I_RN' (UNIQUE) (Cost=3 Card=1
              034313 Bytes=4137252)
    Both the queries are using same execution plan, so should use same amount of CPU as long as blocks are fetched from either buffer pool or from disk not mix.

    The time showed above is elapsed time I guess. Elapsed time is different from CPU time. Elapsed time may depend on network traffic at that particular point of time, your local machine load at the time of query execution.
    -nagarjuna

  3. #3
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253

    Re: Re: Difference between count(id) and count(distinct id) on unique column

    Originally posted by nagarjuna
    Both the queries are using same execution plan, so should use same amount of CPU as long as blocks are fetched from either buffer pool or from disk not mix.

    The time showed above is elapsed time I guess. Elapsed time is different from CPU time. Elapsed time may depend on network traffic at that particular point of time, your local machine load at the time of query execution.
    Well they're not using the same plan, because a SORT (GROUP BY) is different from a SORT (AGGREGATE). I suspect that the SORT (AGGREGATE) is actually faster here.

    Mike is right, and similar things happen if you have a table with a PK on column MY_PK, and run something like ...
    Code:
    Select My_pk, sum(my_num)
    from   my_table
    group by my_pk
    /
    ... which you might expect Oracle to optimize as ...
    Code:
    Select My_pk, my_num
    from   my_table
    /
    There was a question on AskTom about this a while ago, but he seemed to think it was no big deal and that the answer was to optimize it on the application side. Not an opinion I share, but there we go, it's just the way it is.
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width