joins-nested loops,hash or merge
Q1:What are the factors that determine the join, viz. nested loop, hash or merge, to be used by a query?
not able to figure out the things as mentioned below:
SQL> create table e(name char(1), d_id number);
Table created.
SQL> create table d(d_id number, name char(1));
Table created.
SQL> select * from d,e where d.d_id=e.d_id;
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 MERGE JOIN
2 1 SORT (JOIN)
3 2 TABLE ACCESS (FULL) OF 'E'
4 1 SORT (JOIN)
5 4 TABLE ACCESS (FULL) OF 'D'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
0 bytes sent via SQL*Net to client
0 bytes received via SQL*Net from client
0 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
SQL> alter table d add primary key(d_id);
Table altered.
SQL> select * from d,e where d.d_id=e.d_id;
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 NESTED LOOPS
2 1 TABLE ACCESS (FULL) OF 'E'
3 1 TABLE ACCESS (BY INDEX ROWID) OF 'D'
4 3 INDEX (UNIQUE SCAN) OF 'SYS_C003007' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
0 bytes sent via SQL*Net to client
0 bytes received via SQL*Net from client
0 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
SQL>
Based on my reading of various googled topics, i felt that the nested loop join should have been used in the absence of indexes on d_id column of both tables and merge join should have been used in the presence of indexes on d_id column of both tables; and clueless abt when hash join to be used.
The misery doesnt end here.
Here is what i explored further.
SQL> alter table d drop primary key;
Table altered.
SQL> select * from d,e where d.d_id=e.d_id;
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'E'
2 1 NESTED LOOPS
3 2 TABLE ACCESS (FULL) OF 'D'
4 2 INDEX (RANGE SCAN) OF 'E_TEST' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
0 bytes sent via SQL*Net to client
0 bytes received via SQL*Net from client
0 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
SQL> analyze table e compute statistics;
Table analyzed.
SQL> analyze table d compute statistics;
Table analyzed.
SQL> select * from d,e where d.d_id=e.d_id;
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=32)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'E'
2 1 NESTED LOOPS (Cost=2 Card=1 Bytes=32)
3 2 TABLE ACCESS (FULL) OF 'D' (Cost=2 Card=1 Bytes=16)
4 2 INDEX (RANGE SCAN) OF 'E_TEST' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
0 bytes sent via SQL*Net to client
0 bytes received via SQL*Net from client
0 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
SQL> alter table d add primary key(d_id);
Table altered.
SQL> select * from d,e where d.d_id=e.d_id;
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=32)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'E'
2 1 NESTED LOOPS (Cost=2 Card=1 Bytes=32)
3 2 TABLE ACCESS (FULL) OF 'D' (Cost=2 Card=1 Bytes=16)
4 2 INDEX (RANGE SCAN) OF 'E_TEST' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
0 bytes sent via SQL*Net to client
0 bytes received via SQL*Net from client
0 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
SQL> select * from e,d where e.d_id=d.d_id;
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=32)
1 0 NESTED LOOPS (Cost=2 Card=1 Bytes=32)
2 1 TABLE ACCESS (FULL) OF 'E' (Cost=2 Card=1 Bytes=16)
3 1 TABLE ACCESS (BY INDEX ROWID) OF 'D'
4 3 INDEX (UNIQUE SCAN) OF 'SYS_C003008' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
0 bytes sent via SQL*Net to client
0 bytes received via SQL*Net from client
0 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
Q2. what determines the join order?
---ordering of tables in from clause(i figured, it is this factor; but still dont
want to be wrong at sm later stage...dats y asking)
---ordering of coumns in where clause
Q3: how do the following affect the performance of and type join to be used by a query?
---ordering of table names in the from clause
---order of columns in the where clause
---unique(primary key cloumn) and non-unique(foreign key column) indexes
on the columns being joined.