It is not so much a question of forcing the optimizer to do an anti-join as one of allowing it to.
Because of the logic of anti-join you must exclude the possibility of NULLs on either side of the join.
For columns or expressions not obviously NULL to the optimizer this is typically done by adding IS NOT NULL conditions to all columns or expressions referenced on BOTH sides of the in the NOT IN.
For example, the first query below does not guarantee that ename and loc are not null, hence hash anti-join is not an option. In the second query we explicitly state that ename and loc are NULL and anti-join results.
Code:
Personal Oracle Database 10g Release 10.1.0.2.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> SET AUTOTRACE ON EXPLAIN;
SQL> SELECT empno
2 FROM emp
3 WHERE ename NOT IN (
4 SELECT loc
5 FROM dept);
EMPNO
----------
7369
7499
7521
7566
7654
7698
7782
7788
7839
7844
7876
7900
7902
7934
14 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=76 Card=13 Bytes 130)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=55 Card=14 Bytes=140)
3 1 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=1 Bytes=7)
SQL> SELECT empno
2 FROM emp
3 WHERE ename IS NOT NULL
4 AND ename NOT IN (
5 SELECT loc
6 FROM dept
7 WHERE loc IS NOT NULL);
EMPNO
----------
7698
7654
7902
7499
7521
7934
7900
7369
7844
7782
7876
7788
7839
7566
14 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=59 Card=10 Bytes=170)
1 0 HASH JOIN (ANTI) (Cost=59 Card=10 Bytes=170)
2 1 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=55 Card=14 Bytes=140)
3 1 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=4 Bytes=28)
SQL>
Originally posted by padders Because of the logic of anti-join you must exclude the possibility of NULLs on either side of the join.
For columns or expressions not obviously NULL to the optimizer this is typically done by adding IS NOT NULL conditions to all columns or expressions referenced on BOTH sides of the in the NOT IN.
Its not the Q of "not null" at all, for the hash aj to be an option..
PS below.. the same query uses hash_aj if at all its run as query.. but when made as subqry in the update/merge statement it uses filter..
( here ofcourse i used not exists insead of not in, which should not make much difference ).. I tried with this not exists also in the merge statement still it goes for filter there....
Code:
PRO> explain plan set statement_id = 'ABHAY' for
2 select * from shipment_line sd
3 where
4 not exists (select /*+ hash_aj */ null from temp_slordt s where
5 S.SALES_ORGANIZATION_CD = Sd.SALES_ORGANIZATION_CD And
6 S.SALES_ORDER_NO = Sd.REFERENCE_DOCUMENT_NO And
7 Sd.REFRENCE_DOCUMENT_ITEM_NO = S.sales_order_item_no And
8 SUBSTR(s.SOURCE_NM,-3,3) = SUBSTR(sd.SOURCE_NM,-3,3)) and
9 sales_document_type <> '??' and --rownum > 0 and
10 exists (select null from temp_slsord s where
11 S.SALES_ORGANIZATION_CD = Sd.SALES_ORGANIZATION_CD And
12 S.SALES_ORDER_NO = Sd.REFERENCE_DOCUMENT_NO And
13 SUBSTR(s.SOURCE_NM,-3,3) = SUBSTR(sd.SOURCE_NM,-3,3));
Explained.
Elapsed: 00:00:01.40
PRO> @display_plan ABHAY
Id Par Pos Ins Plan
---- ---- ---- ---- ----------------------------------------------------------------------------------------------------------------------------------------------------------------
0 #### SELECT STATEMENT (choose) Cost (4743,6367,1139693)
1 0 1 HASH JOIN (anti) PARALLEL_TO_SERIAL Cost (4743,6367,1139693)
2 1 1 HASH JOIN (semi) PARALLEL_TO_PARALLEL Cost (3848,6431,964650)
3 2 1 PARTITION RANGE (all) PARALLEL_COMBINED_WITH_PARENT Pt id: 3 Pt Range: 1 - 4
4 3 1 1 TABLE ACCESS (analyzed) WWCIW_DBA SHIPMENT_LINE (full) PARALLEL_COMBINED_WITH_PARENT Pt id: 3 Pt Range: 1 - 4 Cost (3501,643102,81030852)
5 2 2 3 TABLE ACCESS (analyzed) WWCIW_SAP TEMP_SLSORD (full) PARALLEL_FROM_SERIAL Cost (165,79119,1898856)
6 1 2 2 TABLE ACCESS (analyzed) WWCIW_SAP TEMP_SLORDT (full) PARALLEL_FROM_SERIAL Cost (891,310268,8997772)
PS : In the table temp_slordt none of the columns are with "NOT NULL"..
Abhay.
funky...
"I Dont Want To Follow A Path, I would Rather Go Where There Is No Path And Leave A Trail."
"Ego is the worst thing many have, try to overcome it & you will be the best, if not good, person on this earth"
> here ofcourse i used not exists insead of not in, which should not make much difference
Ah but it does. NOT IN and NOT EXISTS are not the same thing, and the difference revolves around NULLs.
Granted I do not have your tables or data but I can get your MERGE to go from FILTER to HASH_AJ simply by adding 8 IS NOT NULLs. Note that the FILTER plan is actually cheaper here because there is no data in the tables.
Code:
Personal Oracle Database 10g Release 10.1.0.2.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> CREATE TABLE shipment_line (
2 sales_organization_cd NUMBER,
3 reference_document_no NUMBER,
4 reference_document_item_no NUMBER,
5 source_nm VARCHAR2 (10),
6 shipment_no NUMBER,
7 shipment_item_no NUMBER,
8 sales_document_type VARCHAR2 (2),
9 reportable_flag VARCHAR2 (1),
10 sab101_flag VARCHAR2 (1),
11 country_cd VARCHAR2 (1),
12 ship_to_customer VARCHAR2 (1),
13 so_entry_date DATE,
14 sales_order_item_short_tx VARCHAR2 (2));
Table created.
SQL> CREATE TABLE temp_slordt (
2 sales_organization_cd NUMBER,
3 sales_order_no NUMBER,
4 sales_order_item_no NUMBER,
5 source_nm VARCHAR2 (10));
Table created.
SQL> CREATE TABLE temp_slsord (
2 sales_organization_cd NUMBER,
3 sales_order_no NUMBER,
4 sales_order_item_no NUMBER,
5 source_nm VARCHAR2 (10));
Table created.
SQL> SET AUTOTRACE ON EXPLAIN;
SQL> MERGE INTO shipment_line x
2 USING (SELECT *
3 FROM shipment_line sd
4 WHERE (sd.sales_organization_cd,
5 sd.reference_document_no,
6 sd.reference_document_item_no,
7 SUBSTR (sd.source_nm, -3, 3)) NOT IN (
8 SELECT /*+ HASH_AJ */
9 s.sales_organization_cd, s.sales_order_no,
10 s.sales_order_item_no, SUBSTR (s.source_nm, -3, 3)
11 FROM temp_slordt s)
12 AND sales_document_type <> '??'
13 AND EXISTS (
14 SELECT NULL
15 FROM temp_slsord s
16 WHERE s.sales_organization_cd = sd.sales_organization_cd
17 AND s.sales_order_no = sd.reference_document_no
18 AND SUBSTR (s.source_nm, -3, 3) = SUBSTR (sd.source_nm, -3, 3))) y
19 ON (x.sales_organization_cd = y.sales_organization_cd
20 AND x.shipment_no = y.shipment_no
21 AND x.shipment_item_no = y.shipment_item_no
22 AND x.source_nm = y.source_nm)
23 WHEN MATCHED THEN UPDATE
24 SET x.sales_document_type = '??', x.reportable_flag = '?',
25 x.sab101_flag = '?', x.country_cd = '?',
26 x.ship_to_customer = '?', x.so_entry_date = '31-Dec-2999',
27 x.sales_order_item_short_tx = '??'
28 WHEN NOT MATCHED THEN INSERT (
29 x.sales_organization_cd, x.shipment_no,
30 x.shipment_item_no, x.source_nm)
31 VALUES (y.sales_organization_cd, y.shipment_no,
32 y.shipment_item_no, y.source_nm);
0 rows merged.
Execution Plan
----------------------------------------------------------
0 MERGE STATEMENT Optimizer=ALL_ROWS (Cost=9 Card=1 Bytes=133)
1 0 MERGE OF 'SHIPMENT_LINE'
2 1 VIEW
3 2 FILTER
4 3 HASH JOIN (SEMI) (Cost=7 Card=1 Bytes=230)
5 4 HASH JOIN (OUTER) (Cost=5 Card=1 Bytes=197)
6 5 TABLE ACCESS (FULL) OF 'SHIPMENT_LINE' (TABLE) (Cost=2 Card=1 Bytes=95)
7 5 TABLE ACCESS (FULL) OF 'SHIPMENT_LINE' (TABLE) (Cost=2 Card=1 Bytes=102)
8 4 TABLE ACCESS (FULL) OF 'TEMP_SLSORD' (TABLE) (Cost=2 Card=1 Bytes=33)
9 3 TABLE ACCESS (FULL) OF 'TEMP_SLORDT' (TABLE) (Cost=2 Card=1 Bytes=46)
SQL> MERGE INTO shipment_line x
2 USING (SELECT *
3 FROM shipment_line sd
4 WHERE sd.sales_organization_cd IS NOT NULL
5 AND sd.reference_document_no IS NOT NULL
6 AND sd.reference_document_item_no IS NOT NULL
7 AND sd.source_nm IS NOT NULL
8 AND (sd.sales_organization_cd,
9 sd.reference_document_no,
10 sd.reference_document_item_no,
11 SUBSTR (sd.source_nm, -3, 3)) NOT IN (
12 SELECT /*+ HASH_AJ */
13 s.sales_organization_cd, s.sales_order_no,
14 s.sales_order_item_no, SUBSTR (s.source_nm, -3, 3)
15 FROM temp_slordt s
16 WHERE s.sales_organization_cd IS NOT NULL
17 AND s.sales_order_no IS NOT NULL
18 AND s.sales_order_item_no IS NOT NULL
19 AND s.source_nm IS NOT NULL)
20 AND sales_document_type <> '??'
21 AND EXISTS (
22 SELECT NULL
23 FROM temp_slsord s
24 WHERE s.sales_organization_cd = sd.sales_organization_cd
25 AND s.sales_order_no = sd.reference_document_no
26 AND SUBSTR (s.source_nm, -3, 3) = SUBSTR (sd.source_nm, -3, 3))) y
27 ON (x.sales_organization_cd = y.sales_organization_cd
28 AND x.shipment_no = y.shipment_no
29 AND x.shipment_item_no = y.shipment_item_no
30 AND x.source_nm = y.source_nm)
31 WHEN MATCHED THEN UPDATE
32 SET x.sales_document_type = '??', x.reportable_flag = '?',
33 x.sab101_flag = '?', x.country_cd = '?',
34 x.ship_to_customer = '?', x.so_entry_date = '31-Dec-2999',
35 x.sales_order_item_short_tx = '??'
36 WHEN NOT MATCHED THEN INSERT (
37 x.sales_organization_cd, x.shipment_no,
38 x.shipment_item_no, x.source_nm)
39 VALUES (y.sales_organization_cd, y.shipment_no,
40 y.shipment_item_no, y.source_nm);
0 rows merged.
Execution Plan
----------------------------------------------------------
0 MERGE STATEMENT Optimizer=ALL_ROWS (Cost=10 Card=1 Bytes=141)
1 0 MERGE OF 'SHIPMENT_LINE'
2 1 VIEW
3 2 HASH JOIN (ANTI) (Cost=10 Card=1 Bytes=276)
4 3 HASH JOIN (SEMI) (Cost=7 Card=1 Bytes=230)
5 4 HASH JOIN (OUTER) (Cost=5 Card=1 Bytes=197)
6 5 TABLE ACCESS (FULL) OF 'SHIPMENT_LINE' (TABLE) (Cost=2 Card=1 Bytes=95)
7 5 TABLE ACCESS (FULL) OF 'SHIPMENT_LINE' (TABLE) (Cost=2 Card=1 Bytes=102)
8 4 TABLE ACCESS (FULL) OF 'TEMP_SLSORD' (TABLE) (Cost=2 Card=1 Bytes=33)
9 3 TABLE ACCESS (FULL) OF 'TEMP_SLORDT' (TABLE) (Cost=2 Card=1 Bytes=46)
SQL>
Please do me the honour of posting your plan for MERGE statement with IS NOT NULLs added as above.
> why Oracle didnt use Hash Aj when i used NOT EXISTS in merge statement
Not 100% sure on that one. NOT EXISTS appears to use HASH_AJ when I try it. Perhaps it is related to your use of the /*+ ORDERED */ hint. Can you post a full example of that case.
execution plan and run time execution plan can differ, it happened to me quite a few times , I think the real plan you can see in the tkprof file and v$sql_plan, what it happened to me was I had cursor_sharing set to FORCE
Yes i know... Run time plan is what the exact plan that oracle takes for execution... but my Q is Why does it change when we explain and at run time.. my cursor sharing is exact.. btw, i execute as SQL rather then in a PL/SQL block.
Abhay.
funky...
"I Dont Want To Follow A Path, I would Rather Go Where There Is No Path And Leave A Trail."
"Ego is the worst thing many have, try to overcome it & you will be the best, if not good, person on this earth"
Bookmarks