Which query is better inner join or in???

**gandolf989** · 01-05-2006, 11:37 AM

I am currently working on a java application running against a 9.2.0.7 database running on windows 2000 with 2 gigs of ram. The java application dynamically creates queries based on a collection of stuff the a user has combined together. Currently the queries look something like this.

Code:

SQL> SELECT COUNT(*) FROM(
  2  SELECT detail.pkey    AS keyval,
  3         detail.p_date  AS grp0
  4    FROM detail
  5   WHERE TO_CHAR(detail.pkey) IN
  6         ( SELECT detail_pkey
  7             FROM agg_tbl
  8            WHERE agg_id = '0b080e0000010876d429ad0a1428098500' ));

The TO_CHAR function on the primary key of the detail table is completely unnecessary, but since the Java programmer would not remove it I created a function based index on that column for TO_CHAR, which helped a little.

Also the number of rows in agg_tbl will grow and shrink rapidly while the application is running. Someone might drag an object onto a chart or destroy a chart and cause the agg_tbl to grow or shrink by anywhere from 100 to 100,000 rows.

Most of the time the application does a full table scan on the detail table, which causes the application to slow down. I want to rewrite the query to the following.

Code:

SQL> SELECT COUNT(*) FROM(
  2  SELECT detail.pkey    AS keyval,
  3         detail.p_date  AS grp0
  4    FROM ( SELECT detail_pkey
  5             FROM agg_tbl
  6            WHERE comp_id = '0b080e0000010876d429ad0a1428098500' ) agg_tbl
  7   INNER JOIN detail
  8      ON detail.pkey = agg_tbl.detail_pkey);

I want to remove the TO_CHAR, that should have never been there, move the agg_tbl to the front of the from list because it controls what data gets returned and use an inner join instead of the in clause, so that it would be a correlated query rather than a non-correlated query. But the results seem rather similar. In SQL Plus it is taking .08 seconds. But there are times where it takes 10 - 15 seconds.

When I run the queries in SQL Plus they run fine, but when they run through the Java app they are painfully slow. I included a sql trace to show the difference. The performance of the app can vary greatly depending on how stale the stats are.

Does anyone have any suggestion as to how I should structure the queries so that the response time will be consistantly good?

Thanks.

**mike9** · 01-05-2006, 11:49 AM

Originally Posted by gandolf989

When I run the queries in SQL Plus they run fine, but when they run through the Java app they are painfully slow. I included a sql trace to show the difference. The performance of the app can vary greatly depending on how stale the stats are.

When you execute it via the application is it the sql query which is slow or the response time from the application. Did you trace a query executed by the application?

**marist89** · 01-05-2006, 12:14 PM

Couple points:
1. lots of sorts going on
2. why mess around with IN anyway, just join.
3. as you know, the to_char() will kill you
4. lets see a tkprof with timing and wait events

**marist89** · 01-05-2006, 12:16 PM

Originally Posted by gandolf989

The performance of the app can vary greatly depending on how stale the stats are.

Oh, and is that because of the stats or a High HWM when there is little data?

**slimdave** · 01-05-2006, 12:58 PM

Originally Posted by gandolf989

Also the number of rows in agg_tbl will grow and shrink rapidly while the application is running. Someone might drag an object onto a chart or destroy a chart and cause the agg_tbl to grow or shrink by anywhere from 100 to 100,000 rows.

You might like to investigate dynamic sampling, in that case.

**gandolf989** · 01-05-2006, 01:11 PM

I turned on the following tracing. "ALTER SYSTEM SET EVENTS '10046 trace name context forever, level 12';" and used tkprof with "tkprof.exe sid_ora_1264.trc tk01.out explain=schema/pwd@sid waits=yes sys=no sort=prsela,exeela,fchela"

Code:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.00       0.00          0          0          0           0
Execute      2      8.70      19.58          1       6977     218960           2
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      8.70      19.58          1       6977     218960           2

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 40  (CCDOTTSVR)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net more data from client                 829        0.01          0.14
  SQL*Net message to client                       2        0.00          0.00
  log file sync                                  11        0.99          9.58
  SQL*Net message from client                     2        1.05          1.09
********************************************************************************

When I did that I noticed that the log file sync caused a substantial wait on a database where I am the only user. Also the query that I used befiore does not include an extraneous table that the application query is using. There are three tables that are relevant to this query. The detail table contains information on stuff that is being readied to be shipped, the p_agg_tbl is a parent table that describes a shipment, and the to the agg_tbl contains pointers to the detail table attempts to show what will be included in a given shipment. I had the agg_id moved to the agg_tbl so that a three way join would no longer be needed. I inlcuded part of the trace file with the waits.

Thanks again.

**gandolf989** · 01-05-2006, 02:34 PM

Originally Posted by slimdave

You might like to investigate dynamic sampling, in that case.

I have thought about dynamic sampling. I will take a look at it again.
Thanks.

**marist89** · 01-05-2006, 04:04 PM

why sys=no? You have significant recursive calls, this might lead to something...

**tamilselvan** · 01-05-2006, 05:37 PM

You don't need to select "detail.p_date AS grp0" column.

Code:

Try this:

 SELECT COUNT(*) 
 FROM (
    SELECT distinct a.pkey   
       FROM detail a, agg_tbl b
     WHERE TO_CHAR(a.pkey) = b.detail_pkey
         AND b.agg_id  = '0b080e0000010876d429ad0a1428098500' );

Tamil

**gandolf989** · 01-05-2006, 05:59 PM

Originally Posted by tamilselvan

You don't need to select "detail.p_date AS grp0" column.

Code:

Try this:

 SELECT COUNT(*) 
 FROM (
    SELECT distinct a.pkey   
       FROM detail a, agg_tbl b
     WHERE TO_CHAR(a.pkey) = b.detail_pkey
         AND b.agg_id  = '0b080e0000010876d429ad0a1428098500' );

Tamil

I should have been more clear on this. I only encapsulated the query in a select count(*) so that I could find out how long it takes to get the data from the query. Looking at the data when I am doing the query is not as important.

Thread: Which query is better inner join or in???

Thread Tools

Display

Which query is better inner join or in???

Posting Permissions