Click to See Complete Forum and Search --> : Stats better but SQL is slower


Scorby
12-06-2005, 10:21 AM
Running Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production.

I had some SQL for detecting duplicates, which I rewrote to avoid having to do a self-join. The old SQL is:

SELECT /*+ USE_HASH(t d) */
t.BUSINESS_DATE, t.SCHEDULE_ID,
RANK() OVER
(PARTITION BY NVL(t.DEAL_ID, '~NVL'),
NVL(t.LEG_TYPE_CD, '~NVL')
ORDER BY t.SRC_SYS_FEED_ROW_ID DESC,
t.SCHEDULE_ID DESC) ERROR_BIT_
FROM OPD_SCHEDULE t,
(SELECT /*+ USE_HASH(t d) */ BUSINESS_DATE,
SRC_SYS_FEED_ID,
NVL(t.DEAL_ID, '~NVL') AS DEAL_ID,
NVL(t.LEG_TYPE_CD, '~NVL') AS LEG_TYPE_CD
FROM SCHEDULE t
WHERE BUSINESS_DATE = :p_business_date
and SRC_SYS_FEED_ID = :p_src_sys_feed_id
GROUP BY BUSINESS_DATE, SRC_SYS_FEED_ID,
DEAL_ID, LEG_TYPE_CD HAVING COUNT(*) > 1) d
WHERE t.BUSINESS_DATE = :p_business_date
and t.SRC_SYS_FEED_ID = :p_src_sys_feed_id
and BITAND(t.ROW_ERROR_CD, 3) = 0
and NVL(t.DEAL_ID, '~NVL') = d.DEAL_ID
and NVL(t.LEG_TYPE_CD, '~NVL') = d.LEG_TYPE_CD

I rewrote this to;

SELECT t.rowid, RANK() OVER (PARTITION BY DEAL_ID
,LEG_TYPE_CD
ORDER BY t.SRC_SYS_FEED_ROW_ID DESC, t.SCHEDULE_ID DESC) AS ERROR_BIT_ID
FROM
(SELECT ROWID, SRC_SYS_FEED_ROW_ID, SCHEDULE_ID,
NVL(t.DEAL_ID, '~NVL') AS DEAL_ID,
NVL(t.LEG_TYPE_CD, '~NVL') AS LEG_TYPE_CD,
COUNT(1) OVER (PARTITION BY NVL(t.DEAL_ID, '~NVL')
,NVL(t.LEG_TYPE_CD, '~NVL')
) DUPS FROM SCHEDULE t WHERE t.BUSINESS_DATE = :p_business_date
and t.SRC_SYS_FEED_ID = :p_src_sys_feed_id
and BITAND(t.ROW_ERROR_CD, 3) = 0 ) t
WHERE DUPS > 1

Using autotrace to look at the stats, things definitely seem to have improved.

Old:
0 recursive calls
29 db block gets
43550 consistent gets
58023 physical reads
0 redo size
210 bytes sent via SQL*Net to client
234 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
1 sorts (memory)
1 sorts (disk)
0 rows processed
New:
0 recursive calls
27 db block gets
21775 consistent gets
32073 physical reads
0 redo size
170 bytes sent via SQL*Net to client
234 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
1 sorts (memory)
1 sorts (disk)
0 rows processed

And yet...the old is actually quicker than the new one ! Admittedly, it's only a few seconds but it's a fairly consistent result across numerous runs. This is very annoying because I do think the new SQL is better and want to use it, but it's hard to justify when it seems to be slowing things down. Can anyone suggest an explanation why it is slower when all the stats point the other way ?

tamilselvan
12-06-2005, 10:31 AM
Do you have tkprof output for both SQL statements?

If yes, post them here.

Tamil

Scorby
12-06-2005, 10:48 AM
No, unfortunately I don't have access to the machine where the Oracle server runs, so I can't get hold of the trace file. I'm trying to get the DBA to send me it, but not having any response so far.

Scorby
12-06-2005, 11:51 AM
Investigating further, it looks like the main reason is just that the select ..group by.. having.. is just a lot faster than the partitioned version. The basic queries to get the duplicates are:

SELECT BUSINESS_DATE, SRC_SYS_FEED_ID,
NVL(t.DEAL_ID, '~NVL') AS DEAL_ID,NVL(t.LEG_TYPE_CD, '~NVL') AS LEG_TYPE_CD
FROM SCHEDULE t
WHERE BUSINESS_DATE = :p_business_date and SRC_SYS_FEED_ID = :p_src_sys_feed_id
GROUP BY BUSINESS_DATE, SRC_SYS_FEED_ID, DEAL_ID, LEG_TYPE_CD HAVING COUNT(*) > 1
This runs in about 30 seconds and the plan is:

SELECT STATEMENT Hint=CHOOSE
FILTER
SORT GROUP BY
PARTITION RANGE SINGLE
TABLE ACCESS FULL SCHEDULE

The partiioning version is:

select * from
(SELECT ROWID, SRC_SYS_FEED_ROW_ID, SCHEDULE_ID,
NVL(t.DEAL_ID, '~NVL') AS DEAL_ID,
NVL(t.LEG_TYPE_CD, '~NVL') AS LEG_TYPE_CD,
COUNT(1) OVER (PARTITION BY NVL(t.DEAL_ID, '~NVL')
,NVL(t.LEG_TYPE_CD, '~NVL')
) DUPS FROM SCHEDULE t WHERE t.BUSINESS_DATE = :p_business_date
and t.SRC_SYS_FEED_ID = :p_src_sys_feed_id)
where dups > 1
This takes twice as long and has plan:

SELECT STATEMENT Hint=CHOOSE
VIEW
WINDOW SORT
PARTITION RANGE SINGLE
TABLE ACCESS FULL SCHEDULE

tamilselvan
12-06-2005, 12:33 PM
Can you post the exact tkprof output after you get it from the DBA?

Tamil

Scorby
12-06-2005, 01:03 PM
I don't have the tkprof output. All I can get at the moment is the explalin plan from TOAD, which is:

Old:

Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop

SELECT STATEMENT Hint=CHOOSE 9 K 4885
FILTER
SORT GROUP BY 9 K 245 K 4885
PARTITION RANGE SINGLE KEY KEY
TABLE ACCESS FULL OPD_SCHEDULE 547 K 13 M 3319 KEY KEY

New:

Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop

SELECT STATEMENT Hint=CHOOSE 547 K 7629
VIEW 547 K 34 M 7629
WINDOW SORT 547 K 22 M 7629
PARTITION RANGE SINGLE KEY KEY
TABLE ACCESS FULL OPD_SCHEDULE 547 K 22 M 3319 KEY KEY

If this has any meaning at all, the key difference seems to be that the group by..having is able to do a sort on only 9K rows, whereas the other one is sorting the whole table. 547K is correct for the number of rows in the table (I've done an analyze compute statistics for all columns), but I'm at a complete loss to explain where 9K comes from. There aren't 9000 distinct combinations of business_date, src_sys_feed_id, deal_id, leg_type_cd (there are only 12 duplicates in all). All the rows in the table have the same business_date and src_sys_feed_id, so it's not produced by the where conditions.

tamilselvan
12-06-2005, 05:49 PM
Explain plan may not tell you the correct picture unless you cross verify with each line by running individual statement.
Tkprof is the right place to investigate. Even in tkprof output, it will hide the info if a SQL is used at the column level.

Tamil

WilliamR
12-06-2005, 08:58 PM
It's easier to read your explain plan output if you format it with [code] tags.

The explain plan output from SQL*Plus (http://www.williamrobertson.net/code/xplan.sql.txt) can sometimes give you more clues than the TOAD version.

The old version seems to join two tables but the new version does not, so I'm not sure I follow the business rule for duplicates. Also you refer to the new version as "partitioned", but the table appears to be partitioned and both versions seem to use partition elimination. Are you referring to the analytic functions?

This is a bit of a guess but using a standard dedupe approach I get this:

SELECT * FROM schedule
WHERE rowid IN
( SELECT LEAD(rowid) OVER
( PARTITION BY deal_id, leg_type_cd
ORDER BY schedule_id DESC )
FROM schedule
WHERE business_date = :p_business_date
AND src_sys_feed_row_id = :p_src_sys_feed_id );

Scorby
12-07-2005, 05:23 AM
The SQL has to detect all duplicates, one of which will be marked as the "real" one and the rest as the "error" duplicate. I think your SQL probably only detects the error duplicates.

Scorby
12-07-2005, 06:21 AM
I rewrote it using lead and lag but the response times and autotrace stats were almost identical:

select rowid, (case when leader is not null and lagger is null then 1 else 2 end) as error_bit_id
from
(select rowid, lead(rowid) over (PARTITION BY NVL(DEAL_ID, '~NVL'),NVL(LEG_TYPE_CD, '~NVL')
ORDER BY SRC_SYS_FEED_ROW_ID DESC, SCHEDULE_ID DESC) as leader,
lag(rowid) over
(PARTITION BY NVL(DEAL_ID, '~NVL'),NVL(LEG_TYPE_CD, '~NVL')
ORDER BY SRC_SYS_FEED_ROW_ID DESC, SCHEDULE_ID DESC) as lagger from schedule
WHERE BUSINESS_DATE = :p_business_date
and SRC_SYS_FEED_ID = :p_src_sys_feed_id
and BITAND(ROW_ERROR_CD, 3) = 0)
where leader is not null or lagger is not null

Plan is similar:

Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop

SELECT STATEMENT Hint=CHOOSE 5 K 3373
VIEW 5 K 112 K 3373
WINDOW SORT 5 K 304 K 3373
PARTITION RANGE SINGLE KEY KEY
TABLE ACCESS FULL SCHEDULE 5 K 304 K 3319 KEY KEY

Scorby
12-07-2005, 10:41 AM
I eventually got the tkprof output:

Old
SELECT /*+ USE_HASH(t d) */ t.BUSINESS_DATE, t.SCHEDULE_ID, RANK() OVER
(PARTITION BY NVL(t.DEAL_ID, '~NVL'),NVL(t.LEG_TYPE_CD, '~NVL') ORDER BY
t.SRC_SYS_FEED_ROW_ID DESC, t.SCHEDULE_ID DESC) ERROR_BIT_ID
FROM
SCHEDULE t, (SELECT /*+ USE_HASH(t d) */ BUSINESS_DATE, SRC_SYS_FEED_ID,
NVL(t.DEAL_ID, '~NVL') AS DEAL_ID,NVL(t.LEG_TYPE_CD, '~NVL') AS LEG_TYPE_CD
FROM SCHEDULE t WHERE BUSINESS_DATE = :p_business_date and
SRC_SYS_FEED_ID = :p_src_sys_feed_id GROUP BY BUSINESS_DATE,
SRC_SYS_FEED_ID, DEAL_ID, LEG_TYPE_CD HAVING COUNT(*) > 1) d WHERE
t.BUSINESS_DATE = :p_business_date and t.SRC_SYS_FEED_ID =
:p_src_sys_feed_id and BITAND(t.ROW_ERROR_CD, 3) = 0 and NVL(t.DEAL_ID,
'~NVL') = d.DEAL_ID and NVL(t.LEG_TYPE_CD, '~NVL') = d.LEG_TYPE_CD


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.03 0.02 0 0 0 0
Fetch 9 47.00 53.87 58335 43550 29 8
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 11 47.03 53.90 58335 43550 29 8

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 670 (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
8 WINDOW SORT
8 FILTER
548427 SORT GROUP BY
548435 HASH JOIN
548427 PARTITION RANGE SINGLE PARTITION: KEY KEY
548427 TABLE ACCESS FULL SCHEDULE PARTITION: KEY KEY
548427 PARTITION RANGE SINGLE PARTITION: KEY KEY
548427 TABLE ACCESS FULL SCHEDULE PARTITION: KEY KEY


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file scattered read 5440 0.01 6.31
direct path write 726 0.07 11.82
direct path read 1374 0.04 0.50

New:
SELECT t.rowid, RANK() OVER (PARTITION BY DEAL_ID
,LEG_TYPE_CD
ORDER BY t.SRC_SYS_FEED_ROW_ID DESC, t.SCHEDULE_ID DESC) AS ERROR_BIT_ID
FROM
(SELECT ROWID, SRC_SYS_FEED_ROW_ID, SCHEDULE_ID,
NVL(t.DEAL_ID, '~NVL') AS DEAL_ID,
NVL(t.LEG_TYPE_CD, '~NVL') AS LEG_TYPE_CD,
COUNT(1) OVER (PARTITION BY NVL(t.DEAL_ID, '~NVL')
,NVL(t.LEG_TYPE_CD, '~NVL')
) DUPS FROM SCHEDULE t WHERE t.BUSINESS_DATE = :p_business_date
and t.SRC_SYS_FEED_ID = :p_src_sys_feed_id
and BITAND(t.ROW_ERROR_CD, 3) = 0 ) t
WHERE DUPS > 1

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.01 0.00 0 0 0 0
Fetch 1 49.17 58.45 32260 21775 27 8
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 49.19 58.46 32260 21775 27 8

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 670 (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
8 WINDOW SORT
8 VIEW
548427 WINDOW SORT
548427 PARTITION RANGE SINGLE PARTITION: KEY KEY
548427 TABLE ACCESS FULL SCHEDULE PARTITION: KEY KEY


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file scattered read 2720 0.03 3.20
direct path write 135 0.07 5.24
direct path read 5903 0.09 8.98

wdfloyd
12-07-2005, 11:39 AM
It looks like you need to increase the sort_area_size for your session. The time is being spent on sorts (direct path waits).

I would give that a try and see if it helps. I agree with you, it seems like the second query should be the better approach.

Hope that helps.

Scorby
12-07-2005, 12:25 PM
I tried doubling the sort_area_size to 131072 and doubling it again to 262144, but there was no discernable change in the run times. Do I need to change anything else e.g. sort_area_retained_size (currently set to 0) ?

tamilselvan
12-07-2005, 12:31 PM
Set the sort area size to 8M and sort area retained size to 1MB.
If you use LMT TEMPORARY TABLESPACE, set it to uniform size of 8MB.

What is the HASH area size?
Do you use PGA?
What is the value for db_file_multiblock_read_count?

How many CPUs does this box have?

Tamil

Scorby
12-07-2005, 01:04 PM
After trying out your suggestions, the time for the job has come down by about 15 seconds, so that looks quite good. Many thanks for your help. On your other questions:

db_file_multiblock_read_count=8
hash_area_size=131072
cpu_count=3

I'm not sure what you mean by "Do you use PGA ?". I thought PGA was private global area - just the bit of memory assigned to the user.

wdfloyd
12-07-2005, 02:18 PM
Check the setting for workarea_size_policy. You will need to set this to manual during your session, otherwise Oracle automatically determines the size for the _size parameters.

It looked to me like the disk activity was coming from sorting and increasing the sort_area_size should move that activity into memory instead.

tamilselvan
12-07-2005, 04:08 PM
After trying out your suggestions, the time for the job has come down by about 15 seconds, so that looks quite good. Many thanks for your help. On your other questions:

db_file_multiblock_read_count=8
hash_area_size=131072
cpu_count=3

I'm not sure what you mean by "Do you use PGA ?". I thought PGA was private global area - just the bit of memory assigned to the user.

By increasing db_file_multiblock_read_count to 32 or 64 at the session level, you can reduce "db file scattered wait event". Also, hash_area_sise is too small. Set it to 8MB.

Tamil

Scorby
12-08-2005, 07:06 AM
I think my speed improvement must have been coincidental. pga_aggregate_target is set to 104857600 and workarea_size_policy is set to auto, so settings like sort_area_size should be ignored anyway.

I've tried setting:

alter session set workarea_size_policy=manual;
alter session set sort_area_size=8388608;
alter session set sort_area_retained_size=1048576;
alter session set db_file_multiblock_read_count=64;
alter session set hash_area_size=8388608;

Timings then went backwards, with the jobs definitely taking longer.

wdfloyd
12-08-2005, 08:40 AM
What does the trace file say now? Hash_area_size shouldn't be a factor anymore, it is not doing a hash join in the second version of the query.

I would test increasing the sort_area_size to larger amounts and see if there is any difference in performance. Try 20 Mb and see if your direct path waits go down.