subselect vs. no subselect sql with analytical functions
DBAsupport.com Forums - Powered by vBulletin
Results 1 to 6 of 6

Thread: subselect vs. no subselect sql with analytical functions

  1. #1
    Join Date
    Apr 2007
    Posts
    31

    subselect vs. no subselect sql with analytical functions

    The optimizer in the explain plan's cost and rows are the same for the two following queries however one is efficient (a) and the one using a subselect is extremely slow (b). Any ideas why this would occur, it seems like it's trying to satisfy the subselect before filtering it with a literal.

    (a)
    select * from (
    select a.a, b.b, a.c, rank () over (partition by b.b order by a.c desc)
    from a, b
    where a.x=b.x
    and a.x = 555)

    (b)
    select * from (
    select a.a, b.b, a.c, rank () over (partition by b.b order by a.c desc)
    from a, b
    where a.x=b.x)
    where x = 555
    Last edited by wakedba; 05-31-2007 at 11:37 AM. Reason: modified

  2. #2
    Join Date
    Mar 2000
    Location
    Atlanta, GA,USA
    Posts
    155
    Does the second query really work? The most outer query filters on column "x" that I don't see in the column list for the inner query.

    Here is my test that shows it does not:

    14:49:34 sspopov@TDSDEV> create table a (a number, b number, c number, x number);

    Table created.

    Elapsed: 00:00:00.01
    14:49:44 sspopov@TDSDEV> create table b (a number, b number, c number, x number);

    Table created.

    Elapsed: 00:00:00.01
    14:49:44 sspopov@TDSDEV>
    14:49:44 sspopov@TDSDEV>
    14:49:44 sspopov@TDSDEV> select * from (
    14:49:44 2 select a.a, b.b, a.c, rank () over (partition by b.b order by a.c desc)
    14:49:44 3 from a, b
    14:49:44 4 where a.x=b.x
    14:49:44 5 and a.x = 555)
    14:49:44 6 /

    no rows selected

    Elapsed: 00:00:00.01
    14:49:44 sspopov@TDSDEV>
    14:49:44 sspopov@TDSDEV> select * from (
    14:49:44 2 select a.a, b.b, a.c, rank () over (partition by b.b order by a.c desc)
    14:49:44 3 from a, b
    14:49:44 4 where a.x=b.x)
    14:49:44 5 where x = 555
    14:49:44 6 /
    where x = 555
    *
    ERROR at line 5:
    ORA-06553: PLS-306: wrong number or types of arguments in call to 'OGC_X'


    Elapsed: 00:00:00.01
    14:49:44 sspopov@TDSDEV>


    As for the explanation why a correct version of the second query would take longer to execute the only thing I can think of is the size of the result set from the inner query. Because of the pseudo column created by analytical function ( rank() ) the entire result set has to be materialized. In case of a large row source it will cause longer filter execution for x=555 predicate.

    Have fun,
    Sergey

  3. #3
    Join Date
    Apr 2007
    Posts
    31
    Right. For the second query to work I would've needed "x" in the inner select statement not just the where clause, that was my bad in typing up the example. I didn't realize the inner select for the 2nd one would processes the entire set of tables using the rank function w/o taking the filtered literal into consideration since the explain plans for both A and B looked almost identical, the explain plan for B without the literal would be what it's really doing then. One of the tables is ~100GB in size, the other roughly ~2GB which is why I don't want to use a subselect for performance reasons. Any other way to achieve the same goal?

  4. #4
    Join Date
    Mar 2000
    Location
    Atlanta, GA,USA
    Posts
    155
    Your first query would be the way to go. The basic rule is to trim your result set as early as possible. In order to do that you need to have a predicate with pretty good selectivity.
    BTW, what is the purpose of using rank() function? Do you have another query wrapped around this one? So far it does not serve any obvious purpose.

    Explain plan for two queries you have should have different figures for rows and bytes on intermediate steps given your statistics is up to date and no extreme values set for instance wide parameter (optimizer_*). Hope you are not using RBO (I don't think it is possible with analytics) :-)

    Have fun,
    Sergey

  5. #5
    Join Date
    Apr 2007
    Posts
    31
    The optimizer is cost based. The rank function is filtered outside of that select only to return records where rank=1. I definitely would love to use the 1st query, however the problem I have is that I need to put this logic into a view so that users can query the view passing in the literal, I can think of any other way to achieve this efficiently.

  6. #6
    Join Date
    Jun 2000
    Location
    Madrid, Spain
    Posts
    7,448
    you second query slow is because when analytics are used the view becomes Nonmergeable therefore the inner query must be resolved first and predicate cannot be pushed to your inner query

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width