Hi all,
I have a query which returns like around 300K records. I want to know if it is possible to just fetch the 1st 10 records at one time, then move over to the next 10 & so on, maybe using the ROWNUM column....
Thanks,
Shiva.
Printable View
Hi all,
I have a query which returns like around 300K records. I want to know if it is possible to just fetch the 1st 10 records at one time, then move over to the next 10 & so on, maybe using the ROWNUM column....
Thanks,
Shiva.
Yes, and if you do a search on this forum, you can find many threads related to the topic. Here are a few...
http://www.dbasupport.com/forums/sho...?threadid=8025
http://www.dbasupport.com/forums/sho...?threadid=6830
http://www.dbasupport.com/forums/sho...?threadid=5480
...and here is the standard example I use...
SELECT
---OUTER.C1
FROM
---(
------SELECT
---------INNER.C1,
---------ROWNUM
------------AS QUERY_ROWNUM
------FROM
---------(
---------SELECT
------------C1
---------FROM
------------TABLE1
---------ORDER BY
------------C1
---------) INNER
------WHERE
---------ROWNUM < 8
---) OUTER
WHERE
---OUTER.QUERY_ROWNUM >= 4
Try This
Select, from table
where
rownum <=10
minus
Select, from table
where
rownum <=0
And keep changing the rownum variable and manage the output of the row,,,
something like this
Select, from table
where
rownum <=:variable1
minus
Select, from table
where
rownum <=:variable2
o to 10
10 to 20
etc..
etc..
...and if vbasker had read any of those links I posted, he would have seen that the solution as I posted it is much more efficient than the solution he posted. You definitely do *not* want to use the minus solution. I again refer you to the links I posted above.
- Chris
PS. Sorry if I seem cranky lately - I *am* quitting smoking, after all :)
i know that minus is not an efficient solution as compared to solution generated (inline views) !!
i ve seen u r posting 2 !! there is no double minds on this !! i ve just thrown the solution for better understanding !!! letz understand the logic rather than getting out to be best !!!
First, let me apologize for the tone of my previous comments.
Second, let me say that this reply is not meant to foster anything other than friendly debate.
However...
I must say that I have to disagree somewhat with your last statement. While I agree that it is important to understand the logic, I must counter that is more important to understand it within the proper context. Much like learning a sport, it is hard to break bad habits. If a person does not initially learn to swing a bat correctly, it is more difficult for them to learn the proper method later.
This analogy may seem a stretch, but stick with me :). To me, coding 'optimal' SQL is equivalent to coding SQL 'properly' or swinging that bat properly. While teaching someone to swing it any old way just so they can hit the ball may help them in the short term, it is detrimental to them in the long term.
In my opinion, it is more important to understand the problem in terms of *the optimal solution*, as opposed to understanding the problem in terms of any old solution.
So, back finally to the problem at hand. While learning of the existence of the MINUS functionality is important, so is learning the NOT IN functionality. However, as the problem was a general 'How do I do windowing?' question, the 'proper' answer is 'with ROWNUM', because it is the optimal solution for this problem (post 8i). The MINUS is wonderful at solving other problems, but not this one, and it should not be used to solve this problem, IMHO. I have tried on this forum to show how much better the ROWNUM solution is, and the links that I presented did already address the MINUS solution.
Therefore, when the MINUS solution was posted, without disclaimer, as a solution that was equally as viable as the ROWNUM solution, yeah - it bothered me a little. And again, I apologize for not biting my tongue - It's a character flaw of mine ;).
But I stick by my position - We *do* want to write the *best* solution. So when a less-optimal solution is presented *after* a more-optimal solution has already been given, it should at *least* be prefaced with some type of disclaimer. Such as 'another, less efficient solution...' or 'and if that doesn't work...' or something similar. Otherwise, you simply confuse the issue for the original post-er. When we know which solution is more efficient, it is our job to inform the post-er of that information, IMHO.
Hope that all made sense, and I again apologize if I came off too aggresively - I really do need a cigarette :D
- Chris
chrisrlong, your discussion about these types of queries on the various threads has indeed been quite enlightening. I've only been reading these forums for a short while and discussions like these are what I was hoping to find here; discussions that teach me new things about how PL/SQL works. I wish that I had known about these forums sooner, especially when I participated in another discussion on another list (non-Oracle/DBA related) where someone had asked basically the same question. How do I select the N beginning/middle/ending number of rows inserted into a table?
So in the interests of furthuring my education, let me discuss the following concerning the use of ROWNUM in queries. I am appending to this thread since you suggested to other posters to keep the discussions in single threads so that all of the information is in one place. :)
In the other thread, I originally suggested the use of the ROWNUM column to find these rows. We discussed some queries, tried them out and discovered some problems with using ROWNUM. In particular, the first sentence from the following section from the Oracle 8.0.5 documentation troubled us: "Oracle assigns a ROWNUM value to each row as it is retrieved, before rows are sorted for an ORDER BY clause, so an ORDER BY clause normally does not affect the ROWNUM of each row. However, if an ORDER BY clause causes Oracle to use an index to access the data, Oracle may retrieve the rows in a different order than without the index, so the ROWNUMs may be different than they would be without the ORDER BY clause."
Another person suggested using an inline view as the method for solving this problem (very similar to your example). However, I have not reconciled to myself how inserting an ORDER BY clause in the inline view is going to result in ROWNUM values being assigned to the rows according to the specified order. My understanding of the statement from the Oracle documentation is that ROWNUM values are not affected by the ORDER BY clause. Is this a correct understanding or not? I believe that jmodic asked the same question in his reply on thread http://www.dbasupport.com/forums/sho...threadid=5480. To phrase the question again, is using the ORDER BY clause in the inline view guaranteed to assign ROWNUM values based upon the ordered set of rows?
I'm not an expert, so feel free to correct me. Just be nice. :)
On a related note, I have seen one other solution posted elsewhere to this problem. They suggested using the INDEX_DESC hint along with ROWNUM to get the last N number of rows from a table based upon some id column. For example,
select /*+ INDEX_DESC(
) */
from
where ROWNUM < N;
or something similar to that. Is using these types of hints an acceptable way of addressing this type of problem?
Oh, and I can post a really horribly inefficient query to determine these values as well. I'll refrain since it depends upon taking the cartesian product of a table with itself to assign the "numbers" to the rows. I suspect that it would be considered worse than the MINUS approach. :) 04-26-2001, 01:55 AMtomateyou can need also a bulk collect
hier an example:
declare
nr_var_tab nr_tab NUMBER := nr_tab();
text_var_tab text_tab VARCHAR2(50);
begin
open c1;
loop
fetch c1 bulk collect into
nr_var_tab, text_var_tab LIMIT 10;
exit when c1%notfound;
end loop;
end;
/
I hope you understand me because my English is bad, but i waant to learn it 04-26-2001, 06:16 AMckalhano of rows ... order byhi
just wanted to add my two bit ... the order by clause will *not* be used in the rownum part of the query
select * from
( select a.*, rownum rnum from
( YOUR_QUERY_GOES_HERE
-- including the order by ) a
where rownum <= 55 )
where rnum >= 35
this will fetch the rows from 35 to 55
Thus the rows are "ordered" before we use the Order By
These topN queries have become even easier with 8i rel2 with the analytic functions
for example
SELECT ename , deptno, sal,
RANK() OVER (PARTITION BY deptno ORDER BY sal DESC)
FROM emp;
Try them out very *cool*
Charanjiv
04-26-2001, 09:47 AMchrisrlongOkay, I just got majorly confused. I'll try to take this one issue at a time
First, Heath:
----------------
Yes! The trick will (almost) always work. The only time this trick *may* run into trouble is when you add more outer levels. This very special circumstance has to do with how the optimizer tries to satisfy the query. Basically, if you take the entire example I gave you and turned it into an in-line view *inside* another complicated SELECT statement, it *may or may not* do what you expect, because it may move some of the outside predicates into the in-line view.
I have, however, never seen it move the ROWNUM predicates before the ORDER BY.
***If you stick with the format:
SELECT
---OUTER.(target columns)
FROM
---(
------SELECT
---------INNER.(target columns),
---------ROWNUM
------------AS QUERY_ROWNUM
------FROM
---------(
---------(Whatever complicated SELECT you desire)
---------ORDER BY
------------(whatever you want to order by)
---------) INNER
------WHERE
---------ROWNUM < (window end)
---) OUTER
WHERE
---OUTER.QUERY_ROWNUM >= (window start)
You will *not* have a problem. ***
Why does this work?
Because ROWNUM is a virtual column that is *re-defined* at *every* level of the statement. Notice that we are not asking for the ROWNUM column from the inner-most SELECT. *That* is the ROWNUM that you are getting confused by. *That* ROWNUM would, indeed, get assigned *before* the ORDER BY is applied. Instead, we wait until *After* the ORDER BY has been done. We ask for the ROWNUM at the second-level, which is applied *after* the ORDER BY - this is very important. Note that *at this same level*, we can restrict by the upper-bound of the ROWNUM window. However, we cannot restrict by the lower-bound yet. That is because ROWNUMs are assigned *as they pass the WHERE clause*, or as the result set is actually built. Therefore, if we add AND ROWNUM > x to *any* SELECT, we will get no rows. We instead alias the ROWNUM column and wait until the *next* level of the query to restrict the lower bound. Note that this outer-most layer will *also* have a native ROWNUM column (which we don't use) along with the aliased QUERY_ROWNUM column from the second-level SELECT.
I hope this wasn't all too detailed, but people seem to be more than slightly confused about the whole issue, so I am trying to help clear it up. Let me know if I am still unclear.
Now, tomate:
----------------
Interesting solution. However, this limit is an upper-limit only. This means that you would have to enforce the lower-limit yourself, via the PL/SQL code. This further means that you will be retrieving from the database *more* records than you need. This is a BIG no-no in my book. One should almost *never* retrieve more records than one needs - this is a sure sign of innefficiency. Further, this upper-bound is only enforced *after* the entire result set has been built and Oracle has started returning rows to the calling program. This is also inefficient. With the ROWNUM solution, you will notice the COUNT (STOPKEY) and SORT (ORDER BY STOPKEY) operations in the plan - this shows that Oracle will only do as much sorting and as much result-set building as it has to. This is what makes the ROWNUM solution so efficient, and, unfortunately, a much better solution than using LIMIT.
Good try though!
Finally, ckalha,
----------------
I'm really not sure if the first part of your argument is in favor of the ROWNUM solution or not. If not, please see the explanation I gave above and let me know if I missed anything.
As for the Analytical functions - I LOVE THEM!!! They are the coolest thing since sliced bread. They are *THE new solution* for the WHERE column=(SELECT MAX(column)... problem. I implore everyone to check them out!
**** Digression:
Just ran into a reporting query yesterday against a 25 million record table. It used nested WHERE column=(SELECT MAX(column) calls!!! It took 2 hours and 16 million Logical Reads!! I re-wrote it using analytical functions and brought it down to 100,000 Logical reads and it returned in 5 minutes. THAT is how powerful these new functions are!!!
*****
However, they are generally *not* faster than the ROWNUM solution for windowing. I tried them, because I like them sooo much, but the ROWNUM solution won out.
Disclaimer: This may not be true in *every* situation. My guess would be that at *least* 80% of the time, the ROWNUM solution will be optimal, and will usually be *much* faster. However, I expect that under the right circumstances, such as maybe if using a single table, and selecting most of it, the ROW_NUMBER analytical function might be faster.
IF, by chance, you are ordering according to some complicated formula that an analytical function can replace, then, by ALL means, use that same function to do the windowing.
So, generally speaking, if the statement is already using or *can* use an analytical function for purposes *other* than the windowing... then it probably makes sense to use an analytical function *for* the windowing as well. If not, then I'd stick with the ROWNUM solution. It will generally be faster, and it is more easily implemented.
Phew, I hope I covered everything. Let me know if I didn't. And I *really* hope nobody has further replied to this post while I was writing this dissertation :)
- Chris
[Edited by chrisrlong on 04-26-2001 at 09:58 AM] 04-26-2001, 10:30 AMHeathYour explanation was quite clearThanks for taking the time to straighten me out. :) 05-22-2001, 06:39 AMseenu27I modified the below query posted by shiva in the following way, is this query correct??. I tested as well and it is working fine. I cannot understand why shiva has included another inline query in his subquery, can anyone xplain for me please please.
thanks
Seenu
My query
------------------
select messageID
from
(select messageID, rownum as newcnt
from messages
where rownum <= 3)
where newcnt > 0
05-23-2001, 06:28 PMchrisrlongPlease re-read my explanations in this thread and the other referenced threads to fully understand the solution.
Your solution won't work if the result set needs to be ordered, and windowing generally only makes sense with an ordered result set. However, if ordering is not necessary, then your solution will work. I must admit I find it a little humorous that your complaint has to do with what you consider an extraneous layer to the original query while your example truly has one - checking for rownum>0 is definitely extraneous ;). Don't worry, I know where you were going, just found it funny.
Also, Shiva never actually posted a query, so I'm not actually sure which query you were referring to, not that it really matters though.
- Chris 08-16-2001, 07:50 PMssinhaTo further extend the discussion
I need 500 to 600 of 100000 records selected
here 100000 is the total number of records
of which I am displaying 500 - 600
ordered on some column ,.
What is the optimal solution here?
We use scrollable result set in Java . But I hear that in the above case all 100000 records have to be transfered across the wire .
SOmoene suggested that inline views are very slow and degrades performance. Is this true ? 08-17-2001, 11:41 AMchrisrlongUh, no. :)Quote:
Originally posted by ssinha
SOmoene suggested that inline views are very slow and degrades performance. Is this true ?
As for the optimal solution, I would still propose the ROWNUM solution. However, it would depend upon certain variables.
The reason the ROWNUM solution works so well is that the optimizer utilizes a STOPKEY method, which simply means that it will stop sorting once it reaches the upper limit (in your case, 600). So, for a single page, a lot of sorting has been avoided. However, if your user is going to end up paging through *the entire* resultset, then we will have actually performed more querying and sorting than if we only pulled the entire resultset once. In other words, if you simply selected all 100,000 rows into a local buffer and scrolled the user through that buffer, you would only have run the query once and sorted the resultset once. However, it is generally true that in *most* searches, where windowing is usually used, very few users will page through more than 5 pages. Most will only page once, maybe twice. Therefore, only sorting and returning the bare minimum number of rows is a better solution. If the user will always use a majority of the resultset, then a single mass query is preferred.
However, one must ask oneself how much use 100,000 rows of information will be to *anyone* on-line. Such massive queries are best left to nightly batch reporting, not on-line queries that the user scrolls through (windowing).
Food for thought,
- Chris 08-17-2001, 12:15 PMssinhathanks Chris
I have more questions hopfully someone can help
Yes most users dont scroll through all 100000 records. But still the requirement is to show for example rows 100 -200 and also display the total count at the bottom off the page which is 100000.
TO get the total count I have to do a count(*) query
I also would like note that the above query has atleast 10 tables and many of which is outer join and hierarchial.,
Does outerjoin cause full table scans or is there anyway to tune all the above combined ?
08-20-2001, 07:13 PMchrisrlongThe first thing to try is to get the requirement removed. It is costly to get that count. However, if you definitely need it, then there are a few options:
- Optimize the query that gets the count. Most queries contain joins to lookup tables and the like to translate codes to user-readable text, etc. Such joins might be able to be eliminated it the COUNT(*) query
- Only run the count(*) query on the first page - don't run it for every page
- If the COUNT query takes a long time anyway, it might be worth re-visiting the '1 big query' approach. Run the entire query and cache all the results. I wouldn't do this blindly - test the various options and know how many pages are usually viewed by the user.
HTH,
- Chris