I have a query which returns like around 300K records. I want to know if it is possible to just fetch the 1st 10 records at one time, then move over to the next 10 & so on, maybe using the ROWNUM column....
...and if vbasker had read any of those links I posted, he would have seen that the solution as I posted it is much more efficient than the solution he posted. You definitely do *not* want to use the minus solution. I again refer you to the links I posted above.
PS. Sorry if I seem cranky lately - I *am* quitting smoking, after all
i know that minus is not an efficient solution as compared to solution generated (inline views) !!
i ve seen u r posting 2 !! there is no double minds on this !! i ve just thrown the solution for better understanding !!! letz understand the logic rather than getting out to be best !!!
First, let me apologize for the tone of my previous comments.
Second, let me say that this reply is not meant to foster anything other than friendly debate.
I must say that I have to disagree somewhat with your last statement. While I agree that it is important to understand the logic, I must counter that is more important to understand it within the proper context. Much like learning a sport, it is hard to break bad habits. If a person does not initially learn to swing a bat correctly, it is more difficult for them to learn the proper method later.
This analogy may seem a stretch, but stick with me . To me, coding 'optimal' SQL is equivalent to coding SQL 'properly' or swinging that bat properly. While teaching someone to swing it any old way just so they can hit the ball may help them in the short term, it is detrimental to them in the long term.
In my opinion, it is more important to understand the problem in terms of *the optimal solution*, as opposed to understanding the problem in terms of any old solution.
So, back finally to the problem at hand. While learning of the existence of the MINUS functionality is important, so is learning the NOT IN functionality. However, as the problem was a general 'How do I do windowing?' question, the 'proper' answer is 'with ROWNUM', because it is the optimal solution for this problem (post 8i). The MINUS is wonderful at solving other problems, but not this one, and it should not be used to solve this problem, IMHO. I have tried on this forum to show how much better the ROWNUM solution is, and the links that I presented did already address the MINUS solution.
Therefore, when the MINUS solution was posted, without disclaimer, as a solution that was equally as viable as the ROWNUM solution, yeah - it bothered me a little. And again, I apologize for not biting my tongue - It's a character flaw of mine .
But I stick by my position - We *do* want to write the *best* solution. So when a less-optimal solution is presented *after* a more-optimal solution has already been given, it should at *least* be prefaced with some type of disclaimer. Such as 'another, less efficient solution...' or 'and if that doesn't work...' or something similar. Otherwise, you simply confuse the issue for the original post-er. When we know which solution is more efficient, it is our job to inform the post-er of that information, IMHO.
Hope that all made sense, and I again apologize if I came off too aggresively - I really do need a cigarette
chrisrlong, your discussion about these types of queries on the various threads has indeed been quite enlightening. I've only been reading these forums for a short while and discussions like these are what I was hoping to find here; discussions that teach me new things about how PL/SQL works. I wish that I had known about these forums sooner, especially when I participated in another discussion on another list (non-Oracle/DBA related) where someone had asked basically the same question. How do I select the N beginning/middle/ending number of rows inserted into a table?
So in the interests of furthuring my education, let me discuss the following concerning the use of ROWNUM in queries. I am appending to this thread since you suggested to other posters to keep the discussions in single threads so that all of the information is in one place. :)
In the other thread, I originally suggested the use of the ROWNUM column to find these rows. We discussed some queries, tried them out and discovered some problems with using ROWNUM. In particular, the first sentence from the following section from the Oracle 8.0.5 documentation troubled us: "Oracle assigns a ROWNUM value to each row as it is retrieved, before rows are sorted for an ORDER BY clause, so an ORDER BY clause normally does not affect the ROWNUM of each row. However, if an ORDER BY clause causes Oracle to use an index to access the data, Oracle may retrieve the rows in a different order than without the index, so the ROWNUMs may be different than they would be without the ORDER BY clause."
Another person suggested using an inline view as the method for solving this problem (very similar to your example). However, I have not reconciled to myself how inserting an ORDER BY clause in the inline view is going to result in ROWNUM values being assigned to the rows according to the specified order. My understanding of the statement from the Oracle documentation is that ROWNUM values are not affected by the ORDER BY clause. Is this a correct understanding or not? I believe that jmodic asked the same question in his reply on thread http://www.dbasupport.com/forums/sho...threadid=5480. To phrase the question again, is using the ORDER BY clause in the inline view guaranteed to assign ROWNUM values based upon the ordered set of rows?
I'm not an expert, so feel free to correct me. Just be nice. :)
On a related note, I have seen one other solution posted elsewhere to this problem. They suggested using the INDEX_DESC hint along with ROWNUM to get the last N number of rows from a table based upon some id column. For example,
select /*+ INDEX_DESC(
where ROWNUM < N;
or something similar to that. Is using these types of hints an acceptable way of addressing this type of problem?
Oh, and I can post a really horribly inefficient query to determine these values as well. I'll refrain since it depends upon taking the cartesian product of a table with itself to assign the "numbers" to the rows. I suspect that it would be considered worse than the MINUS approach. :)
Okay, I just got majorly confused. I'll try to take this one issue at a time
Yes! The trick will (almost) always work. The only time this trick *may* run into trouble is when you add more outer levels. This very special circumstance has to do with how the optimizer tries to satisfy the query. Basically, if you take the entire example I gave you and turned it into an in-line view *inside* another complicated SELECT statement, it *may or may not* do what you expect, because it may move some of the outside predicates into the in-line view.
I have, however, never seen it move the ROWNUM predicates before the ORDER BY.
***If you stick with the format:
---------(Whatever complicated SELECT you desire)
------------(whatever you want to order by)
---------ROWNUM < (window end)
---OUTER.QUERY_ROWNUM >= (window start)
You will *not* have a problem. ***
Why does this work?
Because ROWNUM is a virtual column that is *re-defined* at *every* level of the statement. Notice that we are not asking for the ROWNUM column from the inner-most SELECT. *That* is the ROWNUM that you are getting confused by. *That* ROWNUM would, indeed, get assigned *before* the ORDER BY is applied. Instead, we wait until *After* the ORDER BY has been done. We ask for the ROWNUM at the second-level, which is applied *after* the ORDER BY - this is very important. Note that *at this same level*, we can restrict by the upper-bound of the ROWNUM window. However, we cannot restrict by the lower-bound yet. That is because ROWNUMs are assigned *as they pass the WHERE clause*, or as the result set is actually built. Therefore, if we add AND ROWNUM > x to *any* SELECT, we will get no rows. We instead alias the ROWNUM column and wait until the *next* level of the query to restrict the lower bound. Note that this outer-most layer will *also* have a native ROWNUM column (which we don't use) along with the aliased QUERY_ROWNUM column from the second-level SELECT.
I hope this wasn't all too detailed, but people seem to be more than slightly confused about the whole issue, so I am trying to help clear it up. Let me know if I am still unclear.
Interesting solution. However, this limit is an upper-limit only. This means that you would have to enforce the lower-limit yourself, via the PL/SQL code. This further means that you will be retrieving from the database *more* records than you need. This is a BIG no-no in my book. One should almost *never* retrieve more records than one needs - this is a sure sign of innefficiency. Further, this upper-bound is only enforced *after* the entire result set has been built and Oracle has started returning rows to the calling program. This is also inefficient. With the ROWNUM solution, you will notice the COUNT (STOPKEY) and SORT (ORDER BY STOPKEY) operations in the plan - this shows that Oracle will only do as much sorting and as much result-set building as it has to. This is what makes the ROWNUM solution so efficient, and, unfortunately, a much better solution than using LIMIT.
Good try though!
I'm really not sure if the first part of your argument is in favor of the ROWNUM solution or not. If not, please see the explanation I gave above and let me know if I missed anything.
As for the Analytical functions - I LOVE THEM!!! They are the coolest thing since sliced bread. They are *THE new solution* for the WHERE column=(SELECT MAX(column)... problem. I implore everyone to check them out!
Just ran into a reporting query yesterday against a 25 million record table. It used nested WHERE column=(SELECT MAX(column) calls!!! It took 2 hours and 16 million Logical Reads!! I re-wrote it using analytical functions and brought it down to 100,000 Logical reads and it returned in 5 minutes. THAT is how powerful these new functions are!!!
However, they are generally *not* faster than the ROWNUM solution for windowing. I tried them, because I like them sooo much, but the ROWNUM solution won out.
Disclaimer: This may not be true in *every* situation. My guess would be that at *least* 80% of the time, the ROWNUM solution will be optimal, and will usually be *much* faster. However, I expect that under the right circumstances, such as maybe if using a single table, and selecting most of it, the ROW_NUMBER analytical function might be faster.
IF, by chance, you are ordering according to some complicated formula that an analytical function can replace, then, by ALL means, use that same function to do the windowing.
So, generally speaking, if the statement is already using or *can* use an analytical function for purposes *other* than the windowing... then it probably makes sense to use an analytical function *for* the windowing as well. If not, then I'd stick with the ROWNUM solution. It will generally be faster, and it is more easily implemented.
Phew, I hope I covered everything. Let me know if I didn't. And I *really* hope nobody has further replied to this post while I was writing this dissertation