The only possible solution for this i can think of right now is:
Create a non-prefixed local index partitioned on zip , and use parallel clause , for this u have to enable parallel query parameters and
bitmap_merge_area_size (increase from 1M to equal ro sort_area_size).
and db_file_multiblock_read_count , increase it to 32.
03-28-2001, 01:56 PM
Your help is really wonderful and your replies are very optimistic.
The above query is using non-prefix bitmap index on zip code. The sort_area_size is 30M,
is it possible to increase the bitmap_merge_area_size to 30M ?
I hope you might have read this thread from the beginning ?, the application which is using this database lets the user create ad-hoc queries and its unfortunate that i can not create composite indexes as i don't know what queries the applicaion is going to build.
Another thing which is coming to my mind is about the hardware performance ?. I checked the server, its Sun Sparc 3000 with lot of smaller disks and with the help of solstice the bigger volumes have been made. What do you thing on this part ?.
03-28-2001, 01:57 PM
i wanted to add one more thing, db_file_multiblock_read_count is alread y32K
03-28-2001, 02:51 PM
I do not think it is a server disk problem, as u see lots of physical reads , anyway u can check if there are any disk i/o waits .
The only option left is parallel query.
Also give a try to this, enable 10053 event and see what the trace file gives u.
This will show what CBO is doing in detail.
03-29-2001, 04:19 AM
I've spend some time calculating row size in bytes(maximum),
assumed block size to be 8k,
assumed that block fits block_size/row_size rows
accounted for pctfree 5
row migration/chaining was not accounted.
Results: got that you read only <2 times more blocks than necessary:
156.562 consistent gets
129.468 physical gets
81.821 my estimated number of blocks occupied by rows returned in your query.
So from my understanding you should try to increase I/O speed and utilize memory, CPU work seems to be moderate for this particular query (try to check block cleanout statistics for session performing your query - those cleanouts may slow done things).
Could it be that your sybase machine keeps data in RAM?
Other issue is that you may need to read much more about sybase in order to reenginner properly the old database - from my knowledge Oracle select is not fastest and other DBMS has some advanced tricks for sql optimization.
03-29-2001, 06:27 AM
Is the count(*) necessary?
Try count(1).I think you will definitely have performance increase with the same set of output.
Pls lemme know if it helps
03-29-2001, 06:34 AM
can you explain why count(*) would be slower?
03-29-2001, 07:35 AM
what is interesting in this example is the fact that table is FIXED width.
If sybase is able to utilize this fact then it is no surprise it is fast, for example:
lets have an index on (gender, zip) (not zip, gender!).
a B-tree index can then be represented as:
Level0: M(male), F(female), I(others...)
Level1: zipM1, ...,zipMn; zipF1, ...zipFn; zipI1, ...zipIn
(zips at level1 are sorted for M,F and I entries)
Now, since rows are fixed width, it is just a matter of a simple calculation to find number of rows falling in ranges of (zipMx and zipMy), (zipFx and zipFy) and (zipIx and zipIy).
03-29-2001, 09:49 AM
Hmmm.... This is definitely a tough one. Unfortunately, I have little to add to the fray. I would agree that the row size is important, so making a larger block size may be prudent in this case. As for the specific indexes, you already said that you have no idea what they will be asking. Therefore, trying to craft an index specifically for this request seems irrelevant, as this request is not specifically your problem. I would first assume that you are dealing with something with low rate of updates, or bitmaps will get you in trouble. Given that assumption, multiple simple bitmap indexes are definitely the best idea. Going further with that assumption, parallel operations are a must; Larger block size is warrented; db_file_multiblock_read_count should be high. Also, this table should be thought of as a fact table in and of itself. To this end, think datawarehousing thoughts. This table needs to be more tightly normalized than it is. It should definitely *not* have 5-character zip codes in it. There should be a separate ZipCode_T table that holds the available zip codes with an abstracted, small as possible ID as the PK. Then, this much smaller FK would be in the very large customer table. There are a lot of issues like this that appear to me about this table design. If you are able, you should *really* think about redesigning the majority of this table. If not, oh well :).
The main reason I wanted to post a response here was to de-bunk the Count(*) vs. Count(1) issue - There is NO difference, period.
While I'm at it, cursor sharing is almost pointless in your case. It is not the parsing of the statement that is killing you - it is the size of the data. Besides, you already said they can build almost any select they want - the re-use would probably be low. Not that you can't do cursor sharing, but it is certainly nowhere near your major problem. Same with FIRST_ROWS, especially if you go with the bitmapped index solution where first_rows will not play mostly.
Finally, I guess I should ask how certain you are that partitioning on Region is the best. The *main* advantage of partitioning *from a performance perspective* is for partition elimination and parallelization. Partition elimination is the key. You want to partition on something that is *often* used in the WHERE clause, so the optimizer can easily decide to go after only a single partition or two. Partitioning can actually hurt performance if it is done on a bad field. (and especially if you do not use paralell execution)
Anyway, just things to think about. Sorry I don't have any hard and fast solutions for you - especially as to why Sybase is faster (which is a little shocking :) )