DBAsupport.com Forums - Powered by vBulletin
Results 1 to 8 of 8

Thread: Partition-Wise Join in different cases

  1. #1
    Join Date
    Apr 2002
    Location
    Shenzhen, China
    Posts
    327

    Exclamation Partition-Wise Join in different cases

    Hi, all

    Following is an example of partition-wise join in different cases, and leads to several question need your advice:


    drop table pn;
    drop table ps;
    drop table ps2;
    create table pn (u_id int, u_name varchar2(10)) partition by hash(u_id) partitions 4;
    create table ps (u_id int, u_sales int) partition by hash(u_id) partitions 4;
    create table ps2 (u_id int, u_sales int) partition by hash(u_id) partitions 4;


    insert into pn values(1,'Calvin');
    insert into pn values(2,' Jessica Qiu');
    insert into ps values(1, 1000);
    insert into ps values(2, 1000);
    insert into ps2 values(1, 1000);
    insert into ps2 values(2, 1000);
    commit;


    exec dbms_stats.set_table_stats('CALVIN', 'PN', numrows => 1000000000);
    exec dbms_stats.set_table_stats('CALVIN', 'PS', numrows => 100000000);
    commit;


    --Query 1
    explain plan for select pn.u_name, ps.u_sales from pn, ps where pn.u_id = ps.u_id;
    @?/rdbms/admin/utlxplp
    ----------------------------------------------------------------------------------------------------
    | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
    ----------------------------------------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 1000M| 56G| | 3623K (3)| 12:04:40 | | |
    | 1 | PARTITION HASH ALL | | 1000M| 56G| | 3623K (3)| 12:04:40 | 1 | 4 |
    |* 2 | HASH JOIN | | 1000M| 56G| 905M| 3623K (3)| 12:04:40 | | |
    | 3 | TABLE ACCESS FULL| PS | 100M| 2479M| | 2726 (100)| 00:00:33 | 1 | 4 |
    | 4 | TABLE ACCESS FULL| PN | 1000M| 32G| | 27047 (100)| 00:05:25 | 1 | 4 |
    ----------------------------------------------------------------------------------------------------

    NOTE: It's clear that PS joined with PN in partition-wise join mode.

    --Query 2
    explain plan for select pn.u_name, ps.sales from pn, (select u_id, sum(u_sales) sales from ps group by u_id) ps where pn.u_id = ps.u_id;
    @?/rdbms/admin/utlxplp
    -----------------------------------------------------------------------------------------------------
    | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
    -----------------------------------------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 1000M| 63G| | 4392K (9)| 14:38:28 | | |
    | 1 | PARTITION HASH ALL | | 1000M| 63G| | 4392K (9)| 14:38:28 | 1 | 4 |
    | 2 | SORT GROUP BY | | 1000M| 63G| | 4392K (9)| 14:38:28 | | |
    |* 3 | HASH JOIN | | 1000M| 63G| 905M| 4112K (2)| 13:42:28 | | |
    | 4 | TABLE ACCESS FULL| PS | 100M| 2479M| | 2726 (100)| 00:00:33 | 1 | 4 |
    | 5 | TABLE ACCESS FULL| PN | 1000M| 39G| | 27047 (100)| 00:05:25 | 1 | 4 |
    -----------------------------------------------------------------------------------------------------

    NOTE: PS joined with PN in partition-wise join mode also. My question is that why did "SORT GROUP BY" (OP#2) appear above "HASH JOIN" instead of "TABLE ACCESS FULL| PS"


    --Query 3
    explain plan for select pn.u_name, ps.u_sales from pn, (select * from ps union all select * from ps2) ps where pn.u_id = ps.u_id;
    @?/rdbms/admin/utlxplp
    ------------------------------------------------------------------------------------------------------
    | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
    ------------------------------------------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 32G| 1817G| | 3623K (3)| 12:04:40 | | |
    |* 1 | HASH JOIN | | 32G| 1817G| 3623M| 3623K (3)| 12:04:40 | | |
    | 2 | VIEW | | 100M| 2479M| | 2726 (100)| 00:00:33 | | |
    | 3 | UNION-ALL PARTITION| | | | | | | | |
    | 4 | PARTITION HASH ALL| | 100M| 2479M| | 2726 (100)| 00:00:33 | 1 | 4 |
    | 5 | TABLE ACCESS FULL| PS | 100M| 2479M| | 2726 (100)| 00:00:33 | 1 | 4 |
    | 6 | PARTITION HASH ALL| | 2 | 52 | | 4 (0)| 00:00:01 | 1 | 4 |
    | 7 | TABLE ACCESS FULL| PS2 | 2 | 52 | | 4 (0)| 00:00:01 | 1 | 4 |
    | 8 | PARTITION HASH ALL | | 1000M| 32G| | 27047 (100)| 00:05:25 | 1 | 4 |
    | 9 | TABLE ACCESS FULL | PN | 1000M| 32G| | 27047 (100)| 00:05:25 | 1 | 4 |
    ------------------------------------------------------------------------------------------------------

    NOTE: PS didn't join with PN in partition-wise join mode. I though that the access path should be:
    1, PS union all PS2 in partition level
    2, the interim result set from Step1 joins PN in partition-wise join mode.
    3, union all result from Step2.


    Thanks for help!

    Calvin
    Last edited by Calvin_Qiu; 11-01-2005 at 04:47 AM.
    Oracle Certified Master - September, 2003, the Second OCM in China
    *** LOOKING for PART TIME JOB***
    Data Warehouse & Business Intelligence Expert
    MCSE, CCNA, SCJP, SCSA from 1998

  2. #2
    Join Date
    Sep 2002
    Location
    England
    Posts
    7,334
    ok, i gotta ask, why the flowers in the insert statement?

  3. #3
    Join Date
    Mar 2002
    Posts
    534
    please use the CODE or PHP tag when you show explain plans

    which DB version are you using? I guess 10g or 10gR2.

    2) use the no_merge hint in your subquery and compare the costs

    3) it may be faster to do the 2 joins separatly because it would then do 2 partition wise joins:
    select pn.u_name, ps.u_sales from pn, ps where pn.u_id = ps.u_id
    union all
    select pn.u_name, ps2.u_sales from pn, ps2 where pn.u_id = ps2.u_id

  4. #4
    Join Date
    May 2000
    Location
    ATLANTA, GA, USA
    Posts
    3,135
    1. Even though the table is parttioned, it does not have actual data. You just manipulated statistics by updating number of rows. What about other statistics? So the optimizer comes with worst plan as you said.

    2. AUTOTRACE in sqlplus will not always tell the true access path.
    You need to run the script and see tkprof output. Even tkprof output may omit sql statements if they are used at the column level.

    Tamil
    Last edited by tamilselvan; 11-02-2005 at 09:01 PM.

  5. #5
    Join Date
    Mar 2002
    Posts
    534

    1. A partition wise join combined with a partition wise group by doesn't mean that it is the worst execution plan

    2. Even if AUTOTRACE and/or EXPLAIN PLAN FOR doesn't always show the correct execution plan it doesnt it in most cases and with this kind of large queries I would say that it even does it more often.

  6. #6
    Join Date
    May 2000
    Location
    ATLANTA, GA, USA
    Posts
    3,135
    Code:
    I simulated your query 2. 
    
    SQL> explain plan for select pn.u_name, ps.sales from pn,
         (select u_id, sum(u_sales) sales from ps group by u_id) ps 
         where pn.u_id = ps.u_id;
    
    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------------------------
    | Id  | Operation             |  Name       | Rows  | Bytes |TempSpc| Cost  | Pstart| Pstop |
    ---------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT      |             |  1000P|  40856|       |    10M|       |       |
    |   1 |  MERGE JOIN           |             |  1000P|  40856|       |    10M|       |       |
    |   2 |   VIEW                |             |  1000M|    24G|       |  2557K|       |       |
    |   3 |    SORT GROUP BY      |             |  1000M|    24G|       |  2557K|       |       |
    |   4 |     PARTITION HASH ALL|             |       |       |       |       |     1 |     4 |
    |   5 |      TABLE ACCESS FULL| PS          |  1000M|    24G|       |     4 |     1 |     4 |
    |*  6 |   SORT JOIN           |             |  1000M|    18G|    52G|  8255K|       |       |
    |   7 |    PARTITION HASH ALL |             |       |       |       |       |     1 |     4 |
    |   8 |     TABLE ACCESS FULL | PN          |  1000M|    18G|       |     4 |     1 |     4 |
    ---------------------------------------------------------------------------------------------
    
    Let us see the tkprof output:
    
    select pn.u_name, ps.sales
    from  pn, 
           (select u_id, sum(u_sales) sales 
              from ps group by u_id) ps 
    where pn.u_id  = ps.u_id
    
    
    call     count       cpu    elapsed       disk      query    current        rows
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    Parse        1      0.00       0.00          0          5          0           0
    Execute      1      0.00       0.00          0          0          0           0
    Fetch        2      0.00       0.00          0         24          0           2
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    total        4      0.00       0.00          0         29          0           2
    
    Misses in library cache during parse: 1
    Optimizer goal: CHOOSE
    Parsing user id: 31
    
    Rows     Row Source Operation
    -------  ---------------------------------------------------
          2  MERGE JOIN
          2   VIEW
          2    SORT GROUP BY
          2     PARTITION HASH ALL PARTITION: 1 4
          2      TABLE ACCESS FULL PS PARTITION: 1 4
          2   SORT JOIN
          2    PARTITION HASH ALL PARTITION: 1 4
          2     TABLE ACCESS FULL PN PARTITION: 1 4
    
    ********************************************************************************
    In general, Oracle will materialize the in-line view if it sees rownum , or any aggregate function in it as it happened in my test.
    But you got a different plan in which the tables are joined , then the sum is calculated.

    That's why I asked you to run tkprof.

    Tamil
    Last edited by tamilselvan; 11-02-2005 at 09:03 PM.

  7. #7
    Join Date
    Jun 2000
    Location
    Madrid, Spain
    Posts
    7,447
    Many times, autotrace desperates me.

  8. #8
    Join Date
    Mar 2002
    Posts
    534
    A important information which is missing here is which Oracle version is used. An interessting new feature introduced in 10gR2 is the use of the hash algorithm for the group by.
    http://www.oracle.com/technology/pro...10gr2_0705.pdf (page 9, Hash-Based Aggregation)

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Click Here to Expand Forum to Full Width