Another question or two on this topic.
chrisrlong, your discussion about these types of queries on the various threads has indeed been quite enlightening. I've only been reading these forums for a short while and discussions like these are what I was hoping to find here; discussions that teach me new things about how PL/SQL works. I wish that I had known about these forums sooner, especially when I participated in another discussion on another list (non-Oracle/DBA related) where someone had asked basically the same question. How do I select the N beginning/middle/ending number of rows inserted into a table?
So in the interests of furthuring my education, let me discuss the following concerning the use of ROWNUM in queries. I am appending to this thread since you suggested to other posters to keep the discussions in single threads so that all of the information is in one place. :)
In the other thread, I originally suggested the use of the ROWNUM column to find these rows. We discussed some queries, tried them out and discovered some problems with using ROWNUM. In particular, the first sentence from the following section from the Oracle 8.0.5 documentation troubled us: "Oracle assigns a ROWNUM value to each row as it is retrieved, before rows are sorted for an ORDER BY clause, so an ORDER BY clause normally does not affect the ROWNUM of each row. However, if an ORDER BY clause causes Oracle to use an index to access the data, Oracle may retrieve the rows in a different order than without the index, so the ROWNUMs may be different than they would be without the ORDER BY clause."
Another person suggested using an inline view as the method for solving this problem (very similar to your example). However, I have not reconciled to myself how inserting an ORDER BY clause in the inline view is going to result in ROWNUM values being assigned to the rows according to the specified order. My understanding of the statement from the Oracle documentation is that ROWNUM values are not affected by the ORDER BY clause. Is this a correct understanding or not? I believe that jmodic asked the same question in his reply on thread http://www.dbasupport.com/forums/sho...threadid=5480. To phrase the question again, is using the ORDER BY clause in the inline view guaranteed to assign ROWNUM values based upon the ordered set of rows?
I'm not an expert, so feel free to correct me. Just be nice. :)
On a related note, I have seen one other solution posted elsewhere to this problem. They suggested using the INDEX_DESC hint along with ROWNUM to get the last N number of rows from a table based upon some id column. For example,
select /*+ INDEX_DESC( ) */
from
where ROWNUM < N;
or something similar to that. Is using these types of hints an acceptable way of addressing this type of problem?
Oh, and I can post a really horribly inefficient query to determine these values as well. I'll refrain since it depends upon taking the cartesian product of a table with itself to assign the "numbers" to the rows. I suspect that it would be considered worse than the MINUS approach. :)
you can need also a bulk collect
hier an example:
declare
nr_var_tab nr_tab NUMBER := nr_tab();
text_var_tab text_tab VARCHAR2(50);
begin
open c1;
loop
fetch c1 bulk collect into
nr_var_tab, text_var_tab LIMIT 10;
exit when c1%notfound;
end loop;
end;
/
I hope you understand me because my English is bad, but i waant to learn it
no of rows ... order by
hi
just wanted to add my two bit ... the order by clause will *not* be used in the rownum part of the query
select * from
( select a.*, rownum rnum from
( YOUR_QUERY_GOES_HERE
-- including the order by ) a
where rownum <= 55 )
where rnum >= 35
this will fetch the rows from 35 to 55
Thus the rows are "ordered" before we use the Order By
These topN queries have become even easier with 8i rel2 with the analytic functions
for example
SELECT ename , deptno, sal,
RANK() OVER (PARTITION BY deptno ORDER BY sal DESC)
FROM emp;
Try them out very *cool*
Charanjiv
Okay, I just got majorly confused. I'll try to take this one issue at a time
First, Heath:
----------------
Yes! The trick will (almost) always work. The only time this trick *may* run into trouble is when you add more outer levels. This very special circumstance has to do with how the optimizer tries to satisfy the query. Basically, if you take the entire example I gave you and turned it into an in-line view *inside* another complicated SELECT statement, it *may or may not* do what you expect, because it may move some of the outside predicates into the in-line view.
I have, however, never seen it move the ROWNUM predicates before the ORDER BY.
***If you stick with the format:
SELECT
---OUTER.(target columns)
FROM
---(
------SELECT
---------INNER.(target columns),
---------ROWNUM
------------AS QUERY_ROWNUM
------FROM
---------(
---------(Whatever complicated SELECT you desire)
---------ORDER BY
------------(whatever you want to order by)
---------) INNER
------WHERE
---------ROWNUM < (window end)
---) OUTER
WHERE
---OUTER.QUERY_ROWNUM >= (window start)
You will *not* have a problem. ***
Why does this work?
Because ROWNUM is a virtual column that is *re-defined* at *every* level of the statement. Notice that we are not asking for the ROWNUM column from the inner-most SELECT. *That* is the ROWNUM that you are getting confused by. *That* ROWNUM would, indeed, get assigned *before* the ORDER BY is applied. Instead, we wait until *After* the ORDER BY has been done. We ask for the ROWNUM at the second-level, which is applied *after* the ORDER BY - this is very important. Note that *at this same level*, we can restrict by the upper-bound of the ROWNUM window. However, we cannot restrict by the lower-bound yet. That is because ROWNUMs are assigned *as they pass the WHERE clause*, or as the result set is actually built. Therefore, if we add AND ROWNUM > x to *any* SELECT, we will get no rows. We instead alias the ROWNUM column and wait until the *next* level of the query to restrict the lower bound. Note that this outer-most layer will *also* have a native ROWNUM column (which we don't use) along with the aliased QUERY_ROWNUM column from the second-level SELECT.
I hope this wasn't all too detailed, but people seem to be more than slightly confused about the whole issue, so I am trying to help clear it up. Let me know if I am still unclear.
Now, tomate:
----------------
Interesting solution. However, this limit is an upper-limit only. This means that you would have to enforce the lower-limit yourself, via the PL/SQL code. This further means that you will be retrieving from the database *more* records than you need. This is a BIG no-no in my book. One should almost *never* retrieve more records than one needs - this is a sure sign of innefficiency. Further, this upper-bound is only enforced *after* the entire result set has been built and Oracle has started returning rows to the calling program. This is also inefficient. With the ROWNUM solution, you will notice the COUNT (STOPKEY) and SORT (ORDER BY STOPKEY) operations in the plan - this shows that Oracle will only do as much sorting and as much result-set building as it has to. This is what makes the ROWNUM solution so efficient, and, unfortunately, a much better solution than using LIMIT.
Good try though!
Finally, ckalha,
----------------
I'm really not sure if the first part of your argument is in favor of the ROWNUM solution or not. If not, please see the explanation I gave above and let me know if I missed anything.
As for the Analytical functions - I LOVE THEM!!! They are the coolest thing since sliced bread. They are *THE new solution* for the WHERE column=(SELECT MAX(column)... problem. I implore everyone to check them out!
**** Digression:
Just ran into a reporting query yesterday against a 25 million record table. It used nested WHERE column=(SELECT MAX(column) calls!!! It took 2 hours and 16 million Logical Reads!! I re-wrote it using analytical functions and brought it down to 100,000 Logical reads and it returned in 5 minutes. THAT is how powerful these new functions are!!!
*****
However, they are generally *not* faster than the ROWNUM solution for windowing. I tried them, because I like them sooo much, but the ROWNUM solution won out.
Disclaimer: This may not be true in *every* situation. My guess would be that at *least* 80% of the time, the ROWNUM solution will be optimal, and will usually be *much* faster. However, I expect that under the right circumstances, such as maybe if using a single table, and selecting most of it, the ROW_NUMBER analytical function might be faster.
IF, by chance, you are ordering according to some complicated formula that an analytical function can replace, then, by ALL means, use that same function to do the windowing.
So, generally speaking, if the statement is already using or *can* use an analytical function for purposes *other* than the windowing... then it probably makes sense to use an analytical function *for* the windowing as well. If not, then I'd stick with the ROWNUM solution. It will generally be faster, and it is more easily implemented.
Phew, I hope I covered everything. Let me know if I didn't. And I *really* hope nobody has further replied to this post while I was writing this dissertation :)
- Chris
[Edited by chrisrlong on 04-26-2001 at 09:58 AM]