Somebody please straighten me out on this one. :)

Let me describe the problem first.

There are two tables in the schema, table a and table b. Table a holds information about the individual units of an item. The business logic is that there will only be zero to three rows in table a for each item. Table b holds related information about activities performed on each unit of each item in table a.


create table TABLE_A(
    KEY NUMBER PRIMARY KEY,
    ID NUMBER,
    UNIT_ID NUMBER,
    COL_X VARCHAR2,
    COL_Y VARCHAR2,
    other columns );

create table TABLE_B(
    ID NUMBER,
    UNIT_ID NUMBER,
    ACTIVITY VARCHAR2,
    other columns );


Single column indexes exist on TABLE_A.ID and TABLE_B.ID.

The rows in the two tables are related by their ID and UNIT_ID values. There can be more than one row in table b for each ID, UNIT_ID pair.

I was asked to write a SELECT statement that generates a single string of the COL_X and COL_Y values of each unit of each item in table a, with the information for each unit being separated by a newline character. This is another example of the classic bounded table pivoting problem. So, I threw the following together assuming that the UNIT_ID values for each item ID were sequential numbers starting with 1.


    select a.ID,
        SUBSTR(
            MAX(
                CASE WHEN a.UNIT_ID = 1 THEN CHR(10) || a.COL_X || ' ' || a.COL_Y
                     ELSE NULL
                END )
            ||
            MAX(
                CASE WHEN a.UNIT_ID = 2 THEN CHR(10) || a.COL_X || ' ' || a.COL_Y
                     ELSE NULL
                END )
            ||
            MAX(
                CASE WHEN a.UNIT_ID = 3 THEN CHR(10) || a.COL_X || ' ' || a.COL_Y
                     ELSE NULL
                END), 2 )
            "LABEL"
    from TABLE_A a, TABLE_B b
    where NVL( b.ACTIVITY, 'I' ) = 'I' and a.ID = b.ID and a.UNIT_ID = b.UNIT_ID
    group by a.ID;


This worked just fine and returned the results quickly, but then I learned that my assumption was wrong. While there will be only up to 3 UNIT_ID values for a specific ID, no assumptions can be made as to the actual values of those unit identifiers. I then decided to try one of the new (at least to me) analytic functions since this customer is running 8i (8.1.6 on WinNT).

Here's what I wrote:


    select a.ID,
        SUBSTR(
            MAX(
                CASE WHEN inner.UNITROWNUM = 1 THEN CHR(10) || a.COL_X || ' ' || a.COL_Y
                     ELSE NULL
                END )
            ||
            MAX(
                CASE WHEN inner.UNITROWNUM = 2 THEN CHR(10) || a.COL_X || ' ' || a.COL_Y
                     ELSE NULL
                END )
            ||
            MAX(
                CASE WHEN inner.UNITROWNUM = 3 THEN CHR(10) || a.COL_X || ' ' || a.COL_Y
                     ELSE NULL
                END ), 2 )
            "LABEL"
    from(
        select TABLE_A.KEY,
               ROW_NUMBER() OVER(
                   PARTITION BY TABLE_A.ID
                   ORDER BY TABLE_A.UNIT_ID) AS "UNITROWNUM"
        from TABLE_A, TABLE_B b
        where NVL( b.ACTIVITY, 'I' ) = 'I' and TABLE_A.ID = b.ID
            and TABLE_A.UNIT_ID = b.UNIT_ID ) inner,
        TABLE_A a
    where a.KEY = inner.KEY
    group by a.ID


This new query works, but it is much slower than the original one. I looked at the execution plans of both statements and for the first, both of the indexes on the ID columns of table a and b are used. For the second select (using the analytic function) the index on the ID column of table a is used, but there is a full table scan on table b instead of using its index on the ID column.

I hope all of the above made sense. :)

Any suggestions as to what I can do to speed up the second query?

Thanks,

Heath