need help to optimize query
DBAsupport.com Forums - Powered by vBulletin
Page 1 of 2 12 LastLast
Results 1 to 10 of 20

Thread: need help to optimize query

  1. #1
    Join Date
    Mar 2001
    Location
    New York , New York
    Posts
    577
    Hi,

    I am writing a query as


    select * from test
    where trade_date = (select max(trade_date) from test where trade_date <> trunc(sysdate));

    where I want the record having the max trade_date and which is not todays date.

    The data in the table looks something like this

    TRADE_DAT SHARES
    --------- ----------
    22-SEP-02 200
    21-SEP-02 300
    20-SEP-02 400
    24-SEP-02 100

    is there a better way of writing this query.

    Thanks
    Ronnie
    ronnie_yours@yahoo.com

    You can if you think you can.

  2. #2
    Join Date
    Nov 2000
    Location
    Pittsburgh, PA
    Posts
    4,002
    If there was a way that you could store the max(trade_date) before today without having to compute it.

    You could cache the data in a table and write a pl/sql procedure that could run shortly after midnight every day which would update the value. Why calculate the value many times a day when you can calulate it once a day. I'm guessing that it is doing a select on a detail table that has a lot of rows. Which is adding unnecessary overhead.

    You could even write a function to return the max trade date.

    select * from test where trade_date = getmaxtradedate();

  3. #3
    Join Date
    May 2000
    Location
    ATLANTA, GA, USA
    Posts
    3,135
    "<>" in the where clause will not induce index.

    If trade_date is always less than Sysdate and an index is available on trade_date column, then use LESS THAN "<" and remove truncate function.

    select * from test
    where trade_date = (select max(trade_date)
    from test
    where trade_date < sysdate);


  4. #4
    Join Date
    Nov 2000
    Location
    Baltimore, MD USA
    Posts
    1,339
    Exactly as Tamilselvan said.

    Do not implement it as a function as gandolf989 suggested - functions add unnecessary overhead.

    Additionally, we need to know if that column is indexed, and whether or not this simplified example is anywhere close to what your real-world SQL will be. This sql is trying to say 'get me the latest entry in the entire table.

    However, unless there is a unique index on that date column, it is possible to return more than one row from this SELECT. Is that what you want?

    If the column were indexed, a simple MIN/MAX lookup by the optimizer would get the date you want followed by a range scan of the index to get the actual row. This should all be very fast. However, if in the real world you are applying multiple predicates to your actual statement, you must consider more. Here's another version of the statement:
    Code:
    SELECT
       *
    FROM
       TEST
    WHERE
       SALESPERSON_ID  = :SALESPERSON_ID   AND
       TRADE_DATE      = (
                         SELECT
                            MAX(TRADE_DATE)
                         FROM
                            TEST
                         WHERE
                            SALESPERSON_ID  =   :SALESPERSON_ID   AND
                            TRADE_DATE      <   TRUNC(SYSDATE)      
                         )
    In this case, you will want to make sure there is an index on SALESPERSON_ID, TRADE_DATE.

    If you do not have the appropriate index for this query (or don't want to make one), another solution is:

    Code:
    SELECT
       *
    FROM
       (
       SELECT
          T.*         ,
          ROW_NUMBER()
             OVER(
                ORDER BY TRADE_DATE DESC
                )
             AS   RN
       FROM
          TEST   T
       WHERE
          SALESPERSON_ID  = :SALESPERSON_ID   AND
          TRADE_DATE      < TRUNC(SYSDATE)      
       )
    WHERE
       RN   =   2
    With this solution, the table is only accessed once instead of twice. However, only the SALESPERSON_ID index will be used, if it exists. A SALESPERSON_ID, TRADE_DATE index, if it existed, would not be used fully (meaning the second column would be irrelevant). That is why, if a SALESPERSON_ID, TRADE_DATE index exists, use the original sub-select solution. If not, this solution may prove faster.

    HTH,

    - Chris
    Christopher R. Long
    ChrisRLong@HotMail.Com
    But that's just my opinion. I could be wrong

  5. #5
    Join Date
    May 2000
    Location
    ATLANTA, GA, USA
    Posts
    3,135
    Use the following SQL.
    I changed <> to > and <.
    This uses index.

    select * from test
    where trade_date = (select max(trade_date)
    from test
    where trade_date < sysdate and trade_date > sysdate);

  6. #6
    Join Date
    Jun 2000
    Location
    Madrid, Spain
    Posts
    7,447
    in Chris post, if your table is big then second solution is always better, the consistent gets is much much smaller.

    2 weeks ago I had a very similar query in a 5 millions rows table, with correlated query (indexed column) it took 17 minutes, with analytic functions 7 minutes with 10% consistent gets of correlated query

  7. #7
    Join Date
    Nov 2000
    Location
    Pittsburgh, PA
    Posts
    4,002
    If this value doesn't change during the day then why keep performing queries to find a value that doesn't change. If it does change during the day the then some type of correlated subquery is needed. Which takes more effort a function that returns one value from a table with one record, or a range scan on an index for a table that could be quite large?

  8. #8
    Join Date
    Nov 2000
    Location
    Baltimore, MD USA
    Posts
    1,339
    Originally posted by pando
    in Chris post, if your table is big then second solution is always better, the consistent gets is much much smaller.
    Never say always (and never say never). Sometimes it is much better, sometimes it is worse. There appear to be a lot of variables and I haven't been able to pin them all down yet. The MIN/MAX optimiser step is very fast, so if a good index exists, it is hard to beat.

    Originally posted by gandolf989
    If this value doesn't change during the day then why keep performing queries to find a value that doesn't change. If it does change during the day the then some type of correlated subquery is needed. Which takes more effort a function that returns one value from a table with one record, or a range scan on an index for a table that could be quite large?
    Unfortunately, you are mixing solutions. Yes, if they keep querying the exact same value throughout the day (which rarely happens in the real world), then storing it somewhere (most likely as a de-normalized column in another table) would likely be more efficient. In either case, however, SELECTing that value from whichever table *directly in the sub-select* would be faster than having that same SELECT in a function. The function solution causes the function's statement to be parsed, etc. for every row as well as a context switch for every row. This is an awful lot of overhead. While using PL/SQL functions in SQL is necessary for certain situations, it should be avoided where possible.

    - Chris
    Christopher R. Long
    ChrisRLong@HotMail.Com
    But that's just my opinion. I could be wrong

  9. #9
    Join Date
    Jun 2000
    Location
    Madrid, Spain
    Posts
    7,447
    your function still has to read the table to get the maxvalue right?

    nothing is saved with a function

  10. #10
    Join Date
    Mar 2001
    Location
    New York , New York
    Posts
    577
    Originally posted by chrisrlong
    Additionally, we need to know if that column is indexed, and whether or not this simplified example is anywhere close to what your real-world SQL will be. This sql is trying to say 'get me the latest entry in the entire table.
    I am trying to get the latest entry from the table which is before sysdate. it can be sysdate-1 or sysdate-100.

    Originally posted by chrisrlong

    However, unless there is a unique index on that date column, it is possible to return more than one row from this SELECT. Is that what you want?

    Yes, the Trade date column has a unique index on it and should return only one row.

    Originally posted by tamilselvan
    Use the following SQL.
    I changed <> to > and <.
    This uses index.

    select * from test
    where trade_date = (select max(trade_date)
    from test
    where trade_date < sysdate and trade_date > sysdate);
    Using the above query returns no rows and will never return any rows :-)
    Ronnie
    ronnie_yours@yahoo.com

    You can if you think you can.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width