Hi
Does anyone knows how bitmap index works internally? The difference between it and b*tree
cheers
Printable View
Hi
Does anyone knows how bitmap index works internally? The difference between it and b*tree
cheers
A bitmap index is a 2-dimensional table of bits.
when a certain value is OK, the bit is 1, otherwise it's 0
eg, table demo ( switchA char(1) )
values
A
B
A
C
D
E
the following table is created
--1 2 3 4 5 6" -> ID. There's a function that transforms this value to the ROWID if the row is selected
A 1 0 1 0 0 0
B 0 1 0 0 0 0
C 0 0 0 1 0 0
D 0 0 0 0 1 0
E 0 0 0 0 0 1
|--> possible values
When you do select ... where switchA = 'A'
--> all ID's where the 'A'-bit = 1 are selected, The ID is converted to a ROWID using a function. Now we have the rowid's off the records.
Hope this helps
Gert
Hi Thanks for the reply Devenge it certainly helped a bit, a bit because I dont see how this helps to improve the perfomance in a low cardinality column :P
For example if the table you put has 100 'A's will each 'A' get a different combination of 1 and 0?
what do you mean the 'A'-bit = 1
A minus bit?
what is a bitmap index ?
I said it's a 2-dimensional array.
each element of the array has 2 elements. the value + a serie off 1's and/or 0's. ( = bitmap )
each bit in the bitmap represents a row in the table.
If the row in the table has the same value as the 'value'- element,
the bit in the bitmap for that row is 1, otherwise it's 0.
Let's take my example and put it in a array
value+bitmap
--------------------
A + 101000
B + 010000
C + 000100
D + 000010
E + 000001
When selecting switchA = 'A'
the bitmap that corresponds to the value 'A' is taken.
==> 101000
then we can identify the positions in the bitmap where bit=1
==> position 1
position 3
then the positions are converted to the ROWID's
==> ROWID's of the records in the table
the same for where switchA='D'
==> 000010
==> postion 5 ==> rowid
Why is it faster than a 'normal' index
--> it takes less space
Why is it fast when we use multiple bitmap indexes
--> because we can do a binary operation before converting the positions to rowid
-->==> example
-->==> where switchA='A' and myspecialcode='Y'
-->==> we get the bitmap from switchA='A' : 101000
-->==> we get the bitmap from myspecialcode='Y': let assume the bitmap is 001000
-->==> 101000 AND 001000 = 001000
-->==> where do we have 1 in bitmap : position 3
-->==> position 3 ==> converted to rowid
why is it preferred for low cardinality column
--> for each different value, a value+bitmap combination is made. So if you have to much different values, the size of the bitmap indexes become to big.
--> although I known somebody : table with 10.000.000 records / column with about 48.000 different values and still a bitmap index gives a big performance gain.
I hope this was better then the first reply.
Regards
Gert
I think more or less I am getting the idea, I still dont see if there will be different combination of value+bitmap though
for example if I have these values
A
B
A
A
C
will it be like this ?
value+bitmap
A 101000
B 010000
A 101000
......
or
value+bitmap
A 101000
B 010000
A 101001
A 10110 -> record 1,3,4 have value A
B 01000 -> record 2 has value B
C 00001 -> record 5 has value C
when we add a new row with value D the new array will be
A 101100
B 010000
C 000010
D 000001
--> added value D
--> added a bit to every bitmap
when again we add a new row with value B the new array will be
A 1011000
B 0100001
C 0000100
D 0000010
--> added a bit to every bitmap
Gert
Ok I see it now, the bitmap is really 1 and 0 add together which becomes the number of rows :o
so if I have 10 rows
I will have 10 1`s and o's
if 11 rows 11 1's and 0's and so on...
by the way, bitmap index is not suitable for high cardinality columns because of SIZE issue? Because i think it´perfomance would be great in any query
It's not suitable for HC because the RDBMS will make the comparison with all the records just to get a few records.
For this kind is better b-tree because the RDBMS is going to navigate through the index to find the few records.
I have bitmapped indexes with 150 million of records and 100 thousand of values and it's still good enough.
The examples shown before are good, but they are really vertical not horizontal, I mean A,B,C,etc are the columns of the index, if you add a record with a new value it gonna be a NEW column in the index, with all the problems you know there are.
A tip:
Create your table=>loaded up=>alter table name minimize_records_per_block=>create all the bitmap indexes on the table.
Just be careful if your updating a column with a bitmap index on.
Because the smallest part of a bitmap index that can be locked is a bitmap segment (up to half a block). This actually contains a number of rows, so, if you have some parallel updates (like we have) - even though you may be updating records on say a primary key, you may still get deadlocking issues due locking on the index.
Bitmap indexes were put on our system ages ago to help performance, but they've introduced a whole host of problems - so the application was coded around it (i.e. commit's in cursor loops to stop locking), and now this is introducing snapshot too old.
Terry