-
We're trying to determine which type of index (B-tree, Bitmap, etc..) would be better for an attribute that will only to values of either "Y" or "N" on a table that has 3.5 million records and growing. We are estimating an average of 500K with value of "N", the rest "Y".
Thanks in advance for your help on this?
-
Well, the B*tree index would be totaly useless with this kind of data distribution. Tho only usable option would be bitmap.
-
-
But if your database is OLTP you can forget about BITMAP indexes, would be too slow in DML operations
-
Thank you ALL for your response. BITMAT is it. I will look into the syntax now. Any examples out there?
-
What do you hope to gain by indexing? Neither option is likely to produce much benefit if any. BITMAP is a disaster in OLTP.
-
Unfortunately, it sounds as if you are taking an overly-simplistic view here.
A bitmap index in isolation is a waste.
First, you need to know if you have an OLTP or a DSS database. If OLTP, as others have emphasized - NO BITMAPS!!!!! Period.
Conversely, if you are DSS, then you should probably have LOTS of bitmaps. Bitmaps not only work better on low selectivity columns, they work better in groups.
Here is a thread where jurij and I discuss bitmap indexes in start schemas - this is applicable for most DSS situations in general, however.
http://www.dbasupport.com/forums/sho...threadid=11745
- Chris
-
We have a Data Warehouse that updates once a day with batch processing. There is a cursor that processes the table with 3.5 million records and it is using sequential scan and not the associated binary index because of NULLS in one of the concatenated indexed columns.
Given this, is BITMAP still the best and most efficient type of index we can use?
Thanks!!
-
Please take the time to read the thread.
- Chris
-
if the distinctveness is low you can use bitmap otherwise b*tree
-
We tried the BITMAP index approach. However, we had to back it out because it caused major overhead in regards to the nightly processing. We then reverted to Binary indexes and all is working fine and now seeing major improvements in the processing with this type of index.
Even though, we have a data warehouse (not OLTP) we decided on BITMAP to experience the results we would have--unfortuanatelly, due to the large file that was bein processed ~100K records, most definietly bitmap was not the way to go. Very good learning experience, though.
Thanks again to everyone for your contributions on my questions.
BR
-
Yes bitmap indices slows down inserts. You didnt specify in your original post you will be inserting 100k of records everynight, I thought it was some sort of a static look up table used in a DSS.
-
Sorry, but something still doesn't sound right. A standard B*Tree index on a Y/N field *will not be used*. The only possible exception is if the query was already restriciting on a number of fields that already had bitmap indexes on them. Then, it *might* choose this B*Tree index to include, since it can include a single B*Tree index together with the bitmap indexes. However, in such a case, the bitmap index would still perform better. Also, since you said you experienced performance problems in the load with the bitmap index, it sounds like it's your first one(bitmap index). If this is the case, I repeat: the B*Tree index will not be used. I see no way that this index could have helped your performance.
You definitely want to use bitmap indexes in a DSS - *especially* a warehouse. I *again* implore you to read the thread I referenced before - you *really* need to understand those issues.
http://www.dbasupport.com/forums/sho...threadid=11745
Now, as for loading - drop the index before the load and re-create it afterwards. Bitmap indexes are very fast to create. The time you save during the load will more than make up for the time necessary to re-create the bitmap index. That is, of course, if you are talking about simple loading. I am having difficulty in assessing the need you are trying to address. You speak about wanting an index for a daily update process, but you then say that the index slows down the nightly load. Are there actually 2 processes? And what in the world do you mean by "and it is using sequential scan and not the associated binary index because of NULLS in one of the concatenated indexed columns"? You were asking about a single Y/N field, but then you talk about a null-able field in a concatenated index - which is it? Finally, you talk about a cursor that is used to update this 3.5 million record table - this worries me because you really should avoid cursors whenever possible.
Not that it matters much that I am confused by what you need. But I am worried that you are drawing incorrect conclusions from your tests and missing the bigger picture as far as warehouse optimization is concerned.
Just my .02,
- Chris