Resultset Paging & Performance increase using ROWNUM
DBAsupport.com Forums - Powered by vBulletin
Page 1 of 2 12 LastLast
Results 1 to 10 of 19

Thread: Resultset Paging & Performance increase using ROWNUM

  1. #1
    Join Date
    Dec 2000
    Posts
    22
    Hello friends,

    Is there a gain in performance when executing a query restricted by rownums? For example, I would like to create a procedure that would return 10 records (similar to altavista paging, or so).

    I have the following query:

    SELECT *
    FROM table1
    WHERE conditon AND rownum>N AND rownum<N+10;

    What I would like to know - when this query is executed, does Oracle first select all rows, and then returns the needed records, or it uses some optimization to select the needed rows without selecting all rows that satisfy the condition.

    If there isn't much gain using such a query, It is probably better to write some 3GL caching system.

    Has anyone coded such a system before?
    I will appreciate it very much if you share your ideas.

    Thank you.

  2. #2
    Join Date
    Jun 2000
    Location
    Hammersmith, London
    Posts
    20
    First, I dont think rowunm > N will work.

    If you have given rownum < 100 then oracle fetches upto 100 and stops. You can find out this by doing explain plan of the sql statement(COUNT (STOPKEY)).

    The read from the table is based on the DB block. It might fetch upto say 150 rows and return 100 rows out of it.

    Hope this helps.

  3. #3
    Join Date
    Dec 2000
    Posts
    22
    Hi g_raju

    Thank you for replying.
    Can you suggest a way to fetch records without rownum>N AND rownum<N+10?


  4. #4
    Join Date
    Nov 2000
    Location
    Baltimore, MD USA
    Posts
    1,339
    First of all, it's easier on the rest of us if you only post in one of the forums so we have all the answers consolidated.

    Regardless, the answer is a resounding Yes! The use of RUWNUM can significantly enhance performance. But, as with everything else, it depends what you want to do with it. Unfortunately, you have not been very specific about how you wish to use it. Also, the query you provided is neither syntactically nor logically correct.

    SELECT *
    FROM table1
    WHERE conditon AND rownum>N AND rownum

    I would guess that you were trying to say:

    SELECT *
    FROM table1
    WHERE conditon AND rownum>X AND rownum<Y

    So now we have the correct syntax, but the logic is wrong. You *cannot* test for a ROWNUM greater than a value. The ROWNUM is not assigned to a row until *after* it enters the result set (meaning that it has passed all the criteria). Therefore, ROWNUM 1 will not get created until at least one records pass your restrictions. If X>1, then this will never happen. The first record comes in with a rownum of 1 and is discarded because its ROWNUM is not >X. The next row is tests - it also has a rownum of 1 because there are still no records in the resultset. It fails for the same reason, and so on and so on.

    To get around this, what you really want to write is:

    SELECT
    --C1,
    --C2
    FROM
    --(
    --SELECT
    ----C1,
    ----C2,
    ----ROWNUM
    ------AS QUERY_ROWNUM
    --FROM
    ----TABLE
    --WHERE
    ----ROWNUM < Y
    --/* You can add an ORDER BY here if you want */
    --) T1
    WHERE
    --T1.QUERY_ROWNUM > X

    The advantage of such a query is 2-fold:
    - Use of the upper bound (ROWNUM<Y) will lessen the amount of work that the query has to do. As soon as Y records are retrieved, the query stops and exits. This can be a significant savings

    - Use of the lower bound does not help the query at all. Those records still had to be processed. However, if Y=1500 and X = 1400, you will be returning 1300 *fewer* records across the wire and to the middle tier. This can be a significant savings as well.

    As with everything, however, there are tradeoffs. Let's say that you are using this to page the user through a large result set. the query returns 5000 rows. You show the user 50 rows at a time. Therefore, you are calling this query for each 'page' that you show the user. If the user is only going to (usually) look at the first few (3) pages, then you have:

    - called this proc 3 times (with differing X and Y values)
    - each time the statement must be executed
    - each time 50 records are returned

    If you had returned the entire resultset and cached it in the middle tier, you would have:

    - called the proc once
    - the statement would have executed once
    - 5000 rows would have been returned across the wire and cached in the middle tier in memory.

    In this example, the ROWNUM solution was probably better.

    However, if the user *usually* pages through the entire result set, then you:

    - called this proc 100 times (with differing X and Y values)
    - each time the statement must be executed
    - each time 50 records are returned

    Whereas the cached version's metrics would not have changed:
    - called the proc once
    - the statement would have executed once
    - 5000 rows would have been returned across the wire and cached in the middle tier in memory.

    In this case, cacheing is probably a better idea.

    So, the questions are:
    - how big is the average result set?
    - how big is the average 'page'?
    - how many 'pages' are viewed, on the average?

    Of course, you also need to know:
    - how painful is this query to execute?
    - how fast is your network (how painful is it to transfer large result sets?)
    - how much memory do you have in the middle tier to store 5000 records/user?

    I make no claims that this is an exhaustive study on the subject - just some things to think about.

    Nothing's ever easy, is it? :)

    Hope this helped,

    - Chris

  5. #5
    Join Date
    Nov 2000
    Location
    Baltimore, MD USA
    Posts
    1,339
    Buggers. My post got cut off as well. I apologize for assuming you posted the statement incorrectly. Let's try this again:

    SELECT *
    FROM table1
    WHERE conditon AND
    rownum>X AND
    rownum<Y


  6. #6
    Join Date
    Nov 2000
    Location
    Baltimore, MD USA
    Posts
    1,339
    Okay, it seems that this forum software does not like the 'Less than' character. How odd. Well, anywhere that ROWNUM appears by itself and looks a little out of context, assume that 'LESS THAN Y' is appended to it.

    Also, here is the rest of the one point that was cut off:

    Use of the upper bound (ROWNUM LESS THAN Y) will reduce the amount of work that the query must perform. As soon as Y records are retrieved, the query can stop and exit. This can be a significant savings.

    Hope this all makes sense,

    - Chris

  7. #7
    Join Date
    Dec 2000
    Posts
    22
    Hi Chris!

    Thank you very much for your reply. You sure gave me a lecture on PLSQL. I wasn't aware of the rownum assignment.
    I will try your code, hope you don't mind if I ask you more questions later.

    Best regards,
    Seralex

  8. #8
    Join Date
    Jan 2001
    Posts
    318

    Cool

    SELECT *
    FROM table1
    WHERE conditon AND
    rownum<200
    MINUS
    SELECT *
    FROM table1
    WHERE conditon AND
    rownum<100

    This will do it for rownum100 to 200.
    Is this what you were looking for ?

    Sonali

    Sonali

  9. #9
    Join Date
    Dec 2000
    Posts
    22
    Hi Sonali

    Yes, that is it. We were also discussing with Chris the performance issue. As Chris noted, there two cases - one that is worth 3GL caching and the other is not.

    From your experience, could you tell what query is faster - the one with MINUS or the one Chris suggested? Or there's no difference at all and Oracle treats them equally. For queries that return 10000 or more records this can make a difference.

    Seralex


  10. #10
    Join Date
    Dec 2000
    Location
    Ljubljana, Slovenia
    Posts
    4,439
    [QUOTE][i]Originally posted by chrisrlong [/i]
    [B]
    SELECT
    --C1,
    --C2
    FROM
    --(
    --SELECT
    ----C1,
    ----C2,
    ----ROWNUM
    ------AS QUERY_ROWNUM
    --FROM
    ----TABLE
    --WHERE
    ----ROWNUM < Y
    --/* You can add an ORDER BY here if you want */
    --) T1
    WHERE
    --T1.QUERY_ROWNUM > X
    [/B][/QUOTE]

    In Development forum, where seralex asked this same question, I sugested the following:

    SELECT c1, c2 FROM
    ----(SELECT c1, c2 FROM table1 ORDER BY c1)
    --WHERE ROWNUM < Y
    MINUS
    SELECT c1, c2 FROM
    ----(SELECT c1, c2 FROM table1 WHERE c2 < X ORDER BY c1)
    --WHERE ROWNUM < X

    On which Chris suggested to me to take a look at his sollution as a more efficient one. I aggree that his sollution is much more efficient as it is written now, but if one follows his comment and include ORDER BY in the inline view, his sollution is *wrong*.

    I think we would all agree that sliding window mechanism is only usefull if it is applied to a previously sorted total resultset. That is why I included ORDER BY in my inline views. So if we follow this business requiremnet, we have to do what Chris sugested in his comment - we must include ORDER BY in his inline view T1. But if we do this then the records from his T1 will not be selected in a sorted order, because ROWNUM is applied before the sorting is done. In other words, Y number of records will be selected in a random fashion, then this Y records will be sorted and applied QUERY_ROWNUM values. When we will want to move our sliding window in a next step we will be in troubles because we will find out that the new window is not starting at the position where the previus ended!

    HTH,
    Jurij Modic
    ASCII a stupid question, get a stupid ANSI
    24 hours in a day .... 24 beer in a case .... coincidence?

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width