-
referencing the table name twice in the query
Hi all,
Our application in its legacy stage was using MS sql server. When the proceduress were changed to suit to Oracle, there may have been some inefficient statements left, which would have been working great in sql server. One such common occurance, was found to be a variant of the following sql:
INSERT INTO T#CORPRANGE ( DCORP1,DCORP2 ) SELECT C.DCORP,D.DCORP FROM LYRANGEENTITY A,LYRANGE B,LYENTITY C,LYENTITY D WHERE B.RANGEKEY = A.RANGEKEY AND B.RANGENAME = :b1 AND :b1 IS NOT NULL AND B.ENTERPRISEKEY = :b3 AND C.ENTITYKEY = A.ENTITYKEY_1 AND D.ENTITYKEY = A.ENTITYKEY_2
In absence of any self-joins, I asked the developers why the need for referencing LYENTITY twice. I was told that in sql server this is a necessity and the same thing has been retained in oracle too. I dont know ablut sql server but when I tried this statement with and without referencing the LYENTITY as C and D, I got the following results ( I had to include some hard values in place of variables):
*********************************************************************
Case 1. Referencing only once
SQL> set autotrace on exp stat
SQL> INSERT INTO T#CORPRANGE ( DCORP1,DCORP2 )
2 SELECT C.DCORP,C.DCORP FROM LYRANGEENTITY A,LYRANGE B,LYENTITY C
3 WHERE B.RANGEKEY = A.RANGEKEY AND B.RANGENAME = 'RSCONS' AND
4 'RSCONS' IS NOT NULL AND B.ENTERPRISEKEY = 82 AND C.ENTITYKEY = A.ENTITYKEY_1
5 AND C.ENTITYKEY = A.ENTITYKEY_2;
21 rows created.
Elapsed: 00:00:02.00
Execution Plan
----------------------------------------------------------
0 INSERT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=43)
1 0 HASH JOIN (Cost=6 Card=1 Bytes=43)
2 1 TABLE ACCESS (FULL) OF 'LYRANGEENTITY' (Cost=1 Card=864
Bytes=10368)
3 1 NESTED LOOPS (Cost=4 Card=963 Bytes=29853)
4 3 TABLE ACCESS (BY INDEX ROWID) OF 'LYRANGE' (Cost=1 Car
d=1 Bytes=16)
5 4 INDEX (UNIQUE SCAN) OF 'UKLYRANGE1' (UNIQUE)
6 3 TABLE ACCESS (FULL) OF 'LYENTITY' (Cost=3 Card=963 Byt
es=14445)
Statistics
----------------------------------------------------------
1474 recursive calls
17 db block gets
439 consistent gets
0 physical reads
404 redo size
848 bytes sent via SQL*Net to client
804 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
26 sorts (memory)
0 sorts (disk)
21 rows processed
********************************************************************
Case 2. Referencing twice
SQL> INSERT INTO T#CORPRANGE ( DCORP1,DCORP2 )
2 SELECT C.DCORP,D.DCORP FROM LYRANGEENTITY A,LYRANGE B,LYENTITY C,LYENTITY D
3 WHERE B.RANGEKEY = A.RANGEKEY AND B.RANGENAME = 'RSCONS'
4 AND 'RSCONS' IS NOT NULL AND B.ENTERPRISEKEY = 82
5 AND C.ENTITYKEY = A.ENTITYKEY_1 AND D.ENTITYKEY = A.ENTITYKEY_2;
21 rows created.
Elapsed: 00:00:01.53
Execution Plan
----------------------------------------------------------
0 INSERT STATEMENT Optimizer=CHOOSE (Cost=12 Card=6 Bytes=348)
1 0 NESTED LOOPS (Cost=12 Card=6 Bytes=348)
2 1 HASH JOIN (Cost=6 Card=6 Bytes=258)
3 2 TABLE ACCESS (FULL) OF 'LYRANGEENTITY' (Cost=1 Card=86
4 Bytes=10368)
4 2 NESTED LOOPS (Cost=4 Card=963 Bytes=29853)
5 4 TABLE ACCESS (BY INDEX ROWID) OF 'LYRANGE' (Cost=1 C
ard=1 Bytes=16)
6 5 INDEX (UNIQUE SCAN) OF 'UKLYRANGE1' (UNIQUE)
7 4 TABLE ACCESS (FULL) OF 'LYENTITY' (Cost=3 Card=963 B
ytes=14445)
8 1 TABLE ACCESS (BY INDEX ROWID) OF 'LYENTITY' (Cost=1 Card
=963 Bytes=14445)
9 8 INDEX (UNIQUE SCAN) OF 'PKLYENTITY' (UNIQUE)
Statistics
----------------------------------------------------------
1277 recursive calls
17 db block gets
434 consistent gets
0 physical reads
404 redo size
847 bytes sent via SQL*Net to client
815 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
27 sorts (memory)
0 sorts (disk)
21 rows processed
SQL>
*************************************************************
The timing was less in case 2, but the execution plan suggested that more work is being done. Also I tried another generic test:
09:49:45 SQL> select cse,amt from ctxamt where rownum<5;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1379 Card=318249 Byt
es=1272996)
1 0 COUNT (STOPKEY)
2 1 TABLE ACCESS (FULL) OF 'CTXAMT' (Cost=1379 Card=318249 B
ytes=1272996)
Statistics
----------------------------------------------------------
458 recursive calls
4 db block gets
181 consistent gets
8 physical reads
0 redo size
554 bytes sent via SQL*Net to client
426 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
4 rows processed
09:51:09 SQL> select a.cse,b.amt from ctxamt a, ctxamt b where rownum<5;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=182358056 Card=10128
2426001 Bytes=405129704004)
1 0 COUNT (STOPKEY)
2 1 MERGE JOIN (CARTESIAN) (Cost=182358056 Card=101282426001
Bytes=405129704004)
3 2 TABLE ACCESS (FULL) OF 'CTXAMT' (Cost=1379 Card=318249
Bytes=636498)
4 2 SORT (JOIN) (Cost=182356677 Card=318249 Bytes=636498)
5 4 INDEX (FAST FULL SCAN) OF 'UKCTXAMT1' (UNIQUE) (Cost
=573 Card=318249 Bytes=636498)
Statistics
----------------------------------------------------------
0 recursive calls
56 db block gets
3824 consistent gets
3828 physical reads
0 redo size
533 bytes sent via SQL*Net to client
426 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
1 sorts (disk)
4 rows processed
**********************************************************************
See the difference in the consistent gets and physical reads?
My question is, is it really a bad practice to use the tablenames twice when there are no self-joins? Or, are there some circumstances when it is required, or that it does not simply matter, one way or the other? I am sure it is a costly practice to use the tablenames twice, but would like to confirm from the others who might have come across such statements.
Our application is full of such occurances, and I would like to make a start on correcting these.
Thanks
Manjunath
-
This is really an advanced question.
Guys, please show up. I want to see the answers.
-
Since the two references to LYENTITY are joined to LYRANGEENTITY on different columns, the two queries are only identical if the LYRANGEENTITY.ENTITYKEY_2 and LYRANGEENTITY.ENTITYKEY_1 have the same value, no?
-
Hi Slimdave,
In the Case 2, LYENTITY is being referenced twice, seemingly without reason.
Thanks
Manjuanth
-
I don't understand why you can't see the reason.
In the query you propose that uses the table LYENTITY only once, you will get no rows inserted if LYRANGEENTITY.ENTITYKEY_2 and LYRANGEENTITY.ENTITYKEY_1 have different values, whereas in the original version you would still get a result as long as there were values (potentially in different rows) of LYRANGEENTITY.ENTITYKEY_2 and LYRANGEENTITY.ENTITYKEY_1 in the LYENTITY.ENTITYKEY table column.
Mybe the sefl-join thing is confusing you. There are joins in the SQL -- they are expressed by "table1.column1 = table2.column2"
-
enititykey_1 and entitykey_2 are separate fields in a table. How does it matter if I reference the tablename as a.entitykey_1 and b.entitykey_2? Why should there be aliases if selfjoin is not there?
-
I give up. Perhaps someone who speaks SQLServer would like to answer this, because we are obviously talking about two completely different types of SQL here.
-
AND C.ENTITYKEY = A.ENTITYKEY_1
AND D.ENTITYKEY = A.ENTITYKEY_2
is A.ENTITYKEY_1 FK of C.ENTITYKEY?
is A.ENTITYKEY_2 FK of D.ENTITYKEY?
You have to check the relationships not just because the column name are similar then you use it for join condition, now it may return same rows as other SQL statement but in the future?
Also your second example is totally wrong, how can you compare
select * from emp
and
select * from emp a, emp b
oranges and apples
-
Originally posted by manjunathk
enititykey_1 and entitykey_2 are separate fields in a table. How does it matter if I reference the tablename as a.entitykey_1 and b.entitykey_2? Why should there be aliases if selfjoin is not there?
Manjunath look into the example below....u will i undersatnd why we should use ref table twice........
PHP Code:
A.Ent1 A.Ent2 B.entpkey C.entky D.entky
1 11 82 1 11
11 111 82 1 11
111 1111 82 111 1111
1111 11111 82 22 11111
With ref table being used 2 times only row 1 and row 3 will be selected.
Had u used Ref table only once u wud have selected all the rows or no rows depending on A.ent1 and A.ent2 if they are same or different....it should be used in case if u want to roll ur rows into columns........
i hope this is clear.....
Abhay.
funky...
"I Dont Want To Follow A Path, I would Rather Go Where There Is No Path And Leave A Trail."
"Ego is the worst thing many have, try to overcome it & you will be the best, if not good, person on this earth"
-
create table t1 ( id number , city1 varchar2(30), city2 varchar2(30));
create table t2 ( id number, city varchar2(30));
insert into t1 values ( 1, 'Boston', 'Boston')
insert into t1 values ( 2, 'Boston' , 'Atlanta');
insert into t2 values ( 1, 'Boston');
insert into t2 values ( 2, 'Atlanta');
commit;
SQL> select * from t1;
ID CITY1 CITY2
---------- ------------------------------ ------------------------------
1 Boston Boston
2 Boston Atlanta
2 rows selected.
SQL> select * from t2 ;
ID CITY
---------- ------------------------------
1 Boston
2 Atlanta
2 rows selected.
Table t2 is used only once in the from clause.
SQL> select a.* from t1 a , t2 b where b.city = a.city1 and b.city = a.city2 ;
ID CITY1 CITY2
---------- ------------------------------ ------------------------------
1 Boston Boston
1 row selected.
Here, the tables are sorted and merge joined, producing only one row in t1 matches
city value of t2 with city1 and city2 in t1. This is what I expected. And Oracle gave me
a correct result set.
Now let us consider another example.
Table t2 is used 2 times in the from clause.
SQL> select a.* from t1 a , t2 b , t2 c where b.city = a.city1 and c.city = a.city2;
ID CITY1 CITY2
---------- ------------------------------ ------------------------------
2 Boston Atlanta
1 Boston Boston
2 rows selected.
Here, I got 2 rows becuase the join of t1(a) and t2(b) produced one row
and another join t1(a) and t2(c) produced 2nd row.
Probably your developer encountered the same set of data in his development envrionment,
that might caused him to use the table twice in his query.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|