Functional indexes
DBAsupport.com Forums - Powered by vBulletin
Results 1 to 6 of 6

Thread: Functional indexes

  1. #1
    Join Date
    Jul 2001
    Posts
    181
    CAn someone please show me the sql used to create a functional index?

    Thanks in advance.

  2. #2
    Join Date
    Jan 2002
    Location
    Up s**t creek
    Posts
    1,525
    Based on a table called test
    col1 number
    col2 number

    Example would be

    create index example
    on test(col1*col2);

    Regards
    Jim
    Oracle Certified Professional
    "Build your reputation by helping other people build theirs."

    "Sarcasm may be the lowest form of wit but its still funny"

    Click HERE to vist my website!

  3. #3
    Join Date
    Jul 2002
    Posts
    132

  4. #4
    Join Date
    Jul 2002
    Location
    Lincolnshire, IL
    Posts
    203
    Link does not work.... can you provide revised link...
    Thanks
    Sandy
    "Greatest Rewards come only with Greatest Commitments!"

  5. #5
    Join Date
    Jul 2002
    Posts
    132

    Function Based Indexes
    Oracle8i introduces a feature virtually every DBA and programmer will be using immediately -- the ability to index functions and use these indexes in query. In a nutshell, this capability allows you to have case insenstive searches or sorts, search on complex equations, and extend the SQL language efficiently by implementing your own functions and operators and then searching on them.



    --------------------------------------------------------------------------------

    Why to use this feature
    It's easy and provides immediate value
    It can be used to speed up existing applications without changing any of their logic or queries
    It can be used to supply additional functionality to applications with very little cost
    So why is it easy and of immediate value? It's easy because it's just a CREATE INDEX statement. Consider the following example: I begin by creating a copy of the 'scott/tiger' demo employee table. I then change the data in the employee name column to be in mixed case. I then create an index on the UPPER of the ename column -- efectively creating a case insensitive index:

    SQL> create table emp as select * from scott.emp;
    Table created.

    SQL> update emp set ename = initcap(ename);
    14 rows updated.

    SQL> commit;
    Commit complete.

    SQL> create index emp_upper_idx on emp(upper(ename));
    Index created.





    We now have an index on the "UPPER" of a column. Any application that already issues 'case insensitive' queries of the form:


    SQL> set autotrace on explain
    SQL> select ename, empno, sal from emp where upper(ename) = 'KING';

    ENAME EMPNO SAL
    ---------- ---------- ----------
    King 7839 5000


    Execution Plan
    ----------------------------------------------------------
    0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=40)
    1 0 TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (Cost=1 Card=1 Bytes=40)
    2 1 INDEX (RANGE SCAN) OF 'EMP_UPPER_IDX' (NON-UNIQUE) (Cost=1 Card=1)




    will transparently make use of this index -- gaining the performance boost an index can deliver. Before this feature was available, every row in the emp table would have been scanned, upper-cased and compared. In constrast, with the index on upper(ename), the query takes the constant KING to the index, range scans a little data and accesses the table by rowid to get the data. This is very fast.

    This performance boost is most visible when indexing user written functions on columns. Oracle7 Release 7.1 added the ability to use user written functions in SQL so that you could:


    SQL> select my_function(ename)
    2 from emp
    3 where some_other_function(empno) > 10
    4 /





    This was great because you could now effectively extend the SQL language to include application specific functions. Unfortunately however, the performance of the above query was a bit disappointing. Say the EMP table had 1,000 rows in it -- the function "some_other_function" would be executed 1,000 times during the query, once per row. Additionally, assume the function took 1/100 of a second to execute. This relatively simple query now takes at least 10 seconds.

    Here is a real example. I've implemented a modified 'soundex' routine in PL/SQL:


    SQL> create or replace package stats
    2 as
    /* We will use the stats package to count
    the number of times the 'my_soundex' function
    is called. The package will just let us maintain
    the state. This also demonstrates that the 'purity'
    restrictions from Oracle8.0 and less have been relaxed
    a great deal... There are no pragmas and we will be
    calling a function that modifies a package state in a
    where clause -- two things that could not happen
    in previous releases
    */
    3 cnt number default 0;
    4 end;
    5 /
    Package created.

    SQL> create or replace function my_soundex( p_string in varchar2 ) return varchar2
    2 deterministic
    3 as
    4 l_return_string varchar2(6) default substr( p_string, 1, 1 );
    5 l_char varchar2(1);
    6 l_last_digit number default 0;
    7
    8 type vcArray is table of varchar2(10) index by binary_integer;
    9 l_code_table vcArray;
    10
    11 begin
    12 stats.cnt := stats.cnt+1;
    13
    14 l_code_table(1) := 'BPFV';
    15 l_code_table(2) := 'CSKGJQXZ';
    16 l_code_table(3) := 'DT';
    17 l_code_table(4) := 'L';
    18 l_code_table(5) := 'MN';
    19 l_code_table(6) := 'R';
    20
    21
    22 for i in 1 .. length(p_string)
    23 loop
    24 exit when (length(l_return_string) = 6);
    25 l_char := substr( p_string, i, 1 );
    26
    27 for j in 1 .. l_code_table.count
    28 loop
    29 if ( instr( l_code_table(j), l_char ) > 0 AND j <> l_last_digit )
    30 then
    31 l_return_string := l_return_string || to_char(j,'fm9');
    32 l_last_digit := j;
    33 end if;
    34 end loop;
    35 end loop;
    36
    37 return rpad( l_return_string, 6, '0' );
    38 end;
    39 /

    Function created.





    Notice in this function, I am using a new keyword "deterministic". Deterministic declares that the above function -- when given the same inputs -- will always return the exact same output. This keyword is needed in order to create an index on a user written function. You must tell Oracle that the function is 'deterministic' and will return a consistent result given the same inputs. This implies for example that you cannot index using the package 'dbms_rand', the random number generator. Its results are not deterministic, given the same inputs you'll get random output. The builtin sql function UPPER on the other hand is deterministic so you can create an index on the UPPER of a column.

    Now that we have the function "My_Soundex()", lets see how it performs without an index...


    SQL> create table test_soundex( name varchar2(30) );

    Table created.

    SQL>
    SQL> set timing on
    SQL>
    SQL> insert into test_soundex
    2 select object_name
    3 from all_objects
    4 where rownum <= 1000;

    1000 rows created.

    Elapsed: 00:00:00.46
    SQL>
    SQL> set autotrace on explain
    SQL> exec stats.cnt := 0;

    PL/SQL procedure successfully completed.

    Elapsed: 00:00:00.04
    SQL> select name
    2 from test_soundex A
    3 where my_soundex(name) = my_soundex( 'FILE$' )
    4 /

    NAME
    ------------------------------
    FILE$

    Elapsed: 00:00:02.38

    Execution Plan
    ----------------------------------------------------------
    0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=34)
    1 0 TABLE ACCESS (FULL) OF 'TEST_SOUNDEX' (Cost=1 Card=1 Bytes=34)




    SQL> set autotrace off
    SQL> set timing off
    SQL> exec dbms_output.put_line( stats.cnt )
    2000

    PL/SQL procedure successfully completed.




    So, we can see this query took over 2 seconds to execute and had to do a full scan on the table. The function my_soundex was invoked 2,000 times (according to our counter), twice for each row. Lets see how indexing the function can be used to speed things up.

    The first thing we will do is create the index as follows:


    SQL> create index test_soundex_idx on test_soundex( substr(my_soundex(name),1,6) )
    2 /
    Index created.





    Now, the interesting thing to note in this create index command is the use of the substr function. This is because we are indexing a function that returns a string. If we were indexing a function that returned a number or date this substr would not be necessary. The reason we must substring the user written function that returns a string is that they return varchar2(4000) types. That is too big to be indexed -- index entries must fit within 1/3 the size of the block. If we tried we would recieve (in a database with a 2k blocksize) the following:


    SQL> create index test_soundex_idx on test_soundex( my_soundex(name),1,6 );
    create index test_soundex_idx on test_soundex( my_soundex(name),1,6 )
    *
    ERROR at line 1:
    ORA-01450: maximum key length (758) exceeded





    In databases with larger block sizes, the number 758 would vary but, unless you are using a 16k or larger blocksize, you will not be able to index a varchar2(4000).

    So, in order to index a user written function that returns a string, we must constrain the return type in the create index statement. In the above, knowing that my_soundex returns at most 6 characters, we are substringing the first six characters.

    We are now ready to test the performance of the table with the index on it. We would like to monitoring the effect of the index on INSERTS as well as the speedup for SELECTS to see the effect on each. In the un-indexed test case, our queries took over 2 seconds and the insert of 1,000 records took about 1/2 of a second. Looking at the new testcase we see:


    SQL> exec stats.cnt := 0;
    PL/SQL procedure successfully completed.

    SQL> insert into test_soundex
    2 select object_name
    3 from all_objects
    4 where rownum <= 1000;

    1000 rows created.

    Elapsed: 00:00:02.23
    SQL> exec dbms_output.put_line( stats.cnt )
    1000

    PL/SQL procedure successfully completed.

    Elapsed: 00:00:00.05
    SQL>
    SQL>
    SQL> set autotrace on explain
    SQL> exec stats.cnt := 0;

    PL/SQL procedure successfully completed.

    SQL> select name
    2 from test_soundex B
    3 where substr(my_soundex(name),1,6) = my_soundex( 'FILE$' )
    4 /

    NAME
    ------------------------------
    FILE$

    Elapsed: 00:00:00.05

    Execution Plan
    ----------------------------------------------------------
    0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=34)
    1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_SOUNDEX' (Cost=1 Card=1 Bytes=34)
    2 1 INDEX (RANGE SCAN) OF 'TEST_SOUNDEX_IDX' (NON-UNIQUE) (Cost=1 Card=1)


    SQL> exec dbms_output.put_line( stats.cnt )
    2




    If we compare the two examples (unindexed versus indexed) we find:


    Operation Unindexed Response Indexed Response Difference Increase/Decrease
    Insert 0.46 2.23 1.77 ~5 times slower
    Select 2.38 0.05 2.33 ~47 times faster

    The important things to note here are that:

    The insert of 1,000 records took approximately 5 times longer. Indexing a user written function will necessarily affect the peformance of inserts and some updates. Since most applications insert and update singleton entries. Each row took less then 3/1000 of a second to insert. In a typical application, this overhead won't even be noticed. Don't forget -- you insert a row once but query it thousands of times.

    While the insert ran 5 times slower, the query ran something like 47 times faster. It evaluated the my_soundex function 2 times instead of 2,000. There is no comparision in the performance of this indexed query to the non-indexed query. Also, as the size of our table grows, the full scan query will take longer and longer to execute. The index based query will always execute with the near same performance characteristics as the table gets larger.

    We had to use "substr()" in our query. This is not as nice as just coding "where my_soundex(name) = my_soundex( 'FILE$' )" but we can easily get around that we will show below
    So, the insert was marginally affected but the query ran incredibly fast. The payoff for a small reduction in insert/update performance is huge. Additionally, if you never update the columns involved in the my_soundex function call -- the updates are not penalized at all (my_soundex is invoked only if the 'name' column is updated).
    We would now like to see how to make it so the query does not have use the substr function call. The use of the substr call could be error prone -- your end users have to know to substr from 1 for 6 characters. If they used a different size, the index would not be used. Also, you want to control in the server the number of bytes to index. This will allow you do reimplement the my_soundex function later with 7 bytes instead of 6 if you want to. We can do this, hide the substr, with a view quite easily as follows:


    SQL> create or replace view test_soundex_v
    2 as
    3 select name, substr(my_soundex(name),1,6) name_soundex
    4 from test_soundex
    5 /
    View created.

    SQL> exec stats.cnt := 0;

    PL/SQL procedure successfully completed.

    Elapsed: 00:00:00.05
    SQL> select name
    2 from test_soundex_v B
    3 where name_soundex = my_soundex( 'FILE$' )
    4 /

    NAME
    ------------------------------
    FILE$

    Elapsed: 00:00:00.04

    Execution Plan
    ----------------------------------------------------------
    0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=34)
    1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_SOUNDEX' (Cost=1 Card=1 Bytes=34)
    2 1 INDEX (RANGE SCAN) OF 'TEST_SOUNDEX_IDX' (NON-UNIQUE) (Cost=1 Card=1)




    SQL> exec dbms_output.put_line( stats.cnt )
    2

    PL/SQL procedure successfully completed.





    So, what we have done here is to hide the substr( f(x), 1, 6 ) in the view itself. The optimizer still recognizes that this virtual column is in fact the indexed column and does the 'right thing'. We see the same performance improvement and the same query plan. Using this view is as good as using the base table, better even since it hides the complexity and allows you to change the size of the substr later.








    --------------------------------------------------------------------------------

    How to use this
    Above we now have most of the steps you need to use function based indexes. In addition to the above steps, there are some init.ora or session settings you must use and a privelege you must have. The following is a list of what needs to be done to use function based indexes:
    You must have the system privelege query rewrite to create function based indexes on tables in your own schema.
    You must have the system privelege global query rewrite to create function based indexes on tables in other schemas
    For the optimizer to use function based indexes, the following session or system variables must be set:
    QUERY_REWRITE_ENABLED=TRUE
    QUERY_REWRITE_INTEGRITY=TRUSTED

    You may enable these at either the session level with ALTER SESSION or at the system level via ALTER SYSTEM or by setting them in the init.ora parameter file. The meaning of query_rewrite_enabled is to allow the optimizer to rewrite the query allowing it to use the function based index. The meaning of query_rewrite_integrity=trusted is to tell the optimizer to 'trust' that the code marked deterministic by the programmer is in fact deterministic. If the code is in fact not deterministic (that is, it returns different output given the same inputs), the resulting rows from the index may be incorrect.
    Use the Cost Based Optimizer. Function based indexes are only visible to the Cost Based Optimizer and will not be used by the Rule Based Optimizer ever.
    Use substr() to constrain return values from user written functions that return VARCHAR2 or RAW types. Optionally hide the substr in a view (recommended).
    Once the above list has been satisfied, it is as easy as "CREATE INDEX" from there on in. The optimizer will find and use your indexes at runtime for you.


    --------------------------------------------------------------------------------

    Pros and Cons of this feature
    Here are some of the pros of this feature
    Pros Cons
    Its easy to use/implement and provides immediate value
    It can be used to speed up existing applications without changing any of their logic or queries. Many orders of magnitude query improvement may be observed.
    It can be used to precompute complex values without using a trigger
    Can be created either as B*Tree or bitmap index
    Index can be built on an arithmetic expression or expression containing PL/SQL, package functions, C callout or SQL built-in functions
    Optimizer can estimate selectivity more accurately if the expressions are materialized in a function-based index. Range scan can be used for queries with expression in where clause and has index build on the expression used.
    Provides efficient linguistic collation to use NLS sort index
    Indexes can be created on object columns and REF columns by using methods defined for the object.
    You cannot direct path load the table with a function based index if that function was user written and requires the SQL ENGIRE. That means you cannot direct path load into a table indexed using my_soundex(x), but you could if it had indexed upper(x).
    It will affect the performance of inserts and updates. (Remember, you insert a row once, you query it thousands of times.)
    If the function you are indexing is a user written function and it returns a string, you may have to expose a view for the end users to use.

    In general, the pros heavily out weigh any of the cons in this case. The inability to direct path load with a pl/sql based index can easily be overcome by indexing after the load with the parallel query option. The performance of inserts is only marginally affected, most applications won't even notice the effect.



    --------------------------------------------------------------------------------

    Scripts to demonstrate the new feature
    There is only one script to accompany this article. This script runs through the example we used above with the soundex function. It creates the 'my_soundex' function and 'stats' package. It creates a sample table 'test_soundex' by selecting the first 1000 records from your all_objects table. It queries this table with and without the function based index. It also demonstrates the use of the view to hide the substr on the function.
    In order for this script to execute correct only the following is assumed:

    Autotrace is enabled in sqlplus. Click here to see how to enable autotrace in sqlplus.
    The user executing the script has the rewrite query system privelege (included in the DBA role)
    Download Script (2k)



  6. #6
    Join Date
    Dec 2000
    Location
    Ljubljana, Slovenia
    Posts
    4,439
    Originally posted by sandycrab
    Link does not work.... can you provide revised link...
    Thanks
    Sandy
    Remove the last dot (following index.html) from that URL and it will work. So the correct URL is http://osi.oracle.com/~tkyte/article1/index.html
    Jurij Modic
    ASCII a stupid question, get a stupid ANSI
    24 hours in a day .... 24 beer in a case .... coincidence?

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width