I have tables with more than 100 million rows.
What is the fastest way to get the row count? count(*)
is putting too much load on the machine and taking a long time.
I have not analyzed these tables, so I can't use num_rows in all_tables.
Any ideas ??
03-07-2001, 09:04 PM
My understanding is that you should use count(rowid), since there is less data dictionary work involved in identifying which columns the * actually represents.
03-07-2001, 10:38 PM
Unfortunately, that's not true. Count(1), count(*), and count(ROWID) are all essentially the same thing, as long as you don't do count(columnname) where columnname is not part of the index it is using to count. Usually, count(*) will simply use the PK index, since it is the smallest object that contains all the rows. It has to go through all the leaf nodes to get the count, so whether it substitutes the PK for the *, or count(1) counts every entry, every entry contains the PK and ROWID, so they are all the same. If, however, you put another column name in there that is not the PK, then it will have to go to the table to get that column, and *that* will cost you.

The best you can do is make sure your PK is as small as possible, and is a single field, so the PK index is as small as possible. Of course, this should *always* be the case: every table, *especially* such large ones, should have a numeric, abstracted, single-field PK. Also, this will be hampered by such things as index-organized tables, clusters, and anything else that changes the index structure. As long as you have a numeric, single-field PK, the count - no matter how large the table - should not be *that* painful.
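If you want to verify which access path the count is actually taking, you can check the plan. A quick sketch, assuming a hypothetical table big_table with a numeric single-column PK and a standard PLAN_TABLE installed (all names here are assumptions, not from the original post):

```sql
-- big_table and its PK are hypothetical names for illustration.
EXPLAIN PLAN FOR
SELECT COUNT(*) FROM big_table;

-- A plan showing an INDEX (FULL SCAN or FAST FULL SCAN) step on the
-- PK index, with no TABLE ACCESS step, confirms the count never
-- touches the table itself.
SELECT operation, options, object_name
FROM   plan_table;
```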
Other than that - there's not much you can do. However, I am a little concerned that you say you haven't analyzed the tables. Please tell me you're not using the RULE optimization mode when dealing with such large tables! If so, more power to ya. :)
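For what it's worth, if an *approximate* count would do, gathering sampled statistics and then reading num_rows may be far cheaper than any COUNT(*) on a table this size. A sketch, again with a hypothetical big_table; the 5 percent sample is just an illustration, tune it to your load tolerance:

```sql
-- Sampled stats keep the load down on a 100-million-row table.
ANALYZE TABLE big_table ESTIMATE STATISTICS SAMPLE 5 PERCENT;

-- num_rows is then an estimate as of the analyze, not an exact count.
SELECT num_rows FROM user_tables WHERE table_name = 'BIG_TABLE';
```

This also gets you out of RULE-mode territory, since the optimizer will have statistics to work with.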
Actually, the only other thing that occurs to me rests on the resolution of an issue jmodic and I are discussing over on the developer's forum. If, as he suggests, a fast full index scan uses less buffer space than a regular index scan, it may be worthwhile to use a hint on your count, i.e.:
SELECT /*+ INDEX_FFS(Table1, pk_index_name) */ COUNT(*) FROM Table1
This *may* save you some cache space, which *may* be contributing to the load the statement is putting on your system. It may be worth a try.
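One way to check whether the hint actually buys you anything is to compare the execution statistics with and without it in SQL*Plus. A sketch, using the same hypothetical names as above:

```sql
-- Suppress output, show only execution statistics.
SET AUTOTRACE TRACEONLY STATISTICS

-- Run both forms and compare "consistent gets" and "physical reads";
-- lower numbers for the hinted version mean the FFS is paying off.
SELECT COUNT(*) FROM big_table;
SELECT /*+ INDEX_FFS(big_table, pk_index_name) */ COUNT(*) FROM big_table;
```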
Actually, upon re-reading what you said, you might have a slight point. Most people think count(*) is slower because they assume it counts *all* the columns, meaning it has to access the table, which is not true. I never actually thought about the fact that it might have to make the single extra step of resolving the * to a field name. *If* this is true, it simply *can't* take more than a few milliseconds. Considering the size of the table, and the length of the overall statement, I would say that is an insignificant difference. Interesting theory nonetheless - I'll have to look into that.