Resultset Paging & Performance increase using ROWNUM

**chrisrlong** · 01-04-2001, 11:46 AM

First of all, it's easier on the rest of us if you only post in one of the forums so we have all the answers consolidated.

Regardless, the answer is a resounding Yes! The use of RUWNUM can significantly enhance performance. But, as with everything else, it depends what you want to do with it. Unfortunately, you have not been very specific about how you wish to use it. Also, the query you provided is neither syntactically nor logically correct.

SELECT *
FROM table1
WHERE conditon AND rownum>N AND rownum

I would guess that you were trying to say:

SELECT *
FROM table1
WHERE conditon AND rownum>X AND rownum<Y

So now we have the correct syntax, but the logic is wrong. You *cannot* test for a ROWNUM greater than a value. The ROWNUM is not assigned to a row until *after* it enters the result set (meaning that it has passed all the criteria). Therefore, ROWNUM 1 will not get created until at least one records pass your restrictions. If X>1, then this will never happen. The first record comes in with a rownum of 1 and is discarded because its ROWNUM is not >X. The next row is tests - it also has a rownum of 1 because there are still no records in the resultset. It fails for the same reason, and so on and so on.

To get around this, what you really want to write is:

SELECT
--C1,
--C2
FROM
--(
--SELECT
----C1,
----C2,
----ROWNUM
------AS QUERY_ROWNUM
--FROM
----TABLE
--WHERE
----ROWNUM < Y
--/* You can add an ORDER BY here if you want */
--) T1
WHERE
--T1.QUERY_ROWNUM > X

The advantage of such a query is 2-fold:
- Use of the upper bound (ROWNUM<Y) will lessen the amount of work that the query has to do. As soon as Y records are retrieved, the query stops and exits. This can be a significant savings

- Use of the lower bound does not help the query at all. Those records still had to be processed. However, if Y=1500 and X = 1400, you will be returning 1300 *fewer* records across the wire and to the middle tier. This can be a significant savings as well.

As with everything, however, there are tradeoffs. Let's say that you are using this to page the user through a large result set. the query returns 5000 rows. You show the user 50 rows at a time. Therefore, you are calling this query for each 'page' that you show the user. If the user is only going to (usually) look at the first few (3) pages, then you have:

- called this proc 3 times (with differing X and Y values)
- each time the statement must be executed
- each time 50 records are returned

If you had returned the entire resultset and cached it in the middle tier, you would have:

- called the proc once
- the statement would have executed once
- 5000 rows would have been returned across the wire and cached in the middle tier in memory.

In this example, the ROWNUM solution was probably better.

However, if the user *usually* pages through the entire result set, then you:

- called this proc 100 times (with differing X and Y values)
- each time the statement must be executed
- each time 50 records are returned

Whereas the cached version's metrics would not have changed:
- called the proc once
- the statement would have executed once
- 5000 rows would have been returned across the wire and cached in the middle tier in memory.

In this case, cacheing is probably a better idea.

So, the questions are:
- how big is the average result set?
- how big is the average 'page'?
- how many 'pages' are viewed, on the average?

Of course, you also need to know:
- how painful is this query to execute?
- how fast is your network (how painful is it to transfer large result sets?)
- how much memory do you have in the middle tier to store 5000 records/user?

I make no claims that this is an exhaustive study on the subject - just some things to think about.

Nothing's ever easy, is it? :)

Hope this helped,

- Chris

Thread: Resultset Paging & Performance increase using ROWNUM

Thread Tools

Display

Threaded View

Posting Permissions