DBAsupport.com Forums - Powered by vBulletin
Results 1 to 5 of 5

Thread: Help .. I can't figure this query out...

  1. #1
    Join Date
    Mar 2002
    Location
    Mesa, Arizona
    Posts
    1,204
    I have a set of data that I would like to summarize. My goal is to return a set of data that has the person_id, gl # , and first effective_start_date and last effective_end_date for each range of GL#'s The trick is that the list must be in chronological order. I have tried first_value and last_value but can't get exactly what I would like.

    The raw data is

    person_id effective_start_date effective_end_date GL #
    3269 07/11/1994 12/31/1998 5135
    3269 01/01/1999 06/30/2000 5175
    3269 07/01/2000 12/31/2000 5135
    3269 01/01/2001 05/15/2002 5135
    3269 05/16/2002 12/31/2002 5135

    My desired results are:

    person_id effective_start_date effective_end_date GL #
    3269 07/11/1994 12/31/1998 5135
    3269 01/01/1999 06/30/2000 5175
    3269 07/01/2000 12/31/2002 5135

    Any suggestions would be appreciated.
    "I do not fear computers. I fear the lack of them." Isaac Asimov
    Oracle Scirpts DBA's need

  2. #2
    Join Date
    May 2002
    Posts
    2,645
    select person_id, effective_start_date start, effective_end_date end, gl_no from your_table_name
    order by 4, 1, 3;

  3. #3
    Join Date
    Mar 2002
    Location
    Mesa, Arizona
    Posts
    1,204
    stecal,

    Good first stab. This one is a major step up in complexity.

    Notice that not only do the records need to be returned in chrolological order, they need to be summarized when there is a duplication in the gl #.

    For example, 5135 should be returned only two times. Once with the 07/11/1994 - 12/31/1998 date range and the second 5135 record should return the 07/01/2000 start date from one record and the 12/31/2002 end data from another.

    I hope this makes sense. Thank you for your help.
    "I do not fear computers. I fear the lack of them." Isaac Asimov
    Oracle Scirpts DBA's need

  4. #4
    Join Date
    May 2002
    Posts
    2,645
    I see what you want now. It's going to require date checking - where an end date for one record has to be after the start date of another record for a GL to meet a continuity requirement. If end date is not in between start and end of another GL record, that record gets output as a separate record because the time interval does not overlap. Piece of cake.

  5. #5
    Join Date
    Apr 2002
    Posts
    14
    I've shortened the column names to save typing :-) but here goes - it seems to work okay.

    create table my_table (
    personID integer,
    startdate date,
    enddate date,
    glID integer );

    /* I assume/hope the following applies */
    create unique index my_index on my_table(personId, startDate);

    insert into my_table values(3269, to_date('07/11/1994','mm/dd/yyyy'), to_date('12/31/1998','mm/dd/yyyy'), 5135);
    insert into my_table values(3269, to_date('01/01/1999','mm/dd/yyyy'), to_date('06/30/2000','mm/dd/yyyy'), 5175);
    insert into my_table values(3269, to_date('07/01/2000','mm/dd/yyyy'), to_date('12/31/2000','mm/dd/yyyy'), 5135);
    insert into my_table values(3269, to_date('01/01/2001','mm/dd/yyyy'), to_date('05/15/2002','mm/dd/yyyy'), 5135);
    insert into my_table values(3269, to_date('05/16/2002','mm/dd/yyyy'), to_date('12/31/2002','mm/dd/yyyy'), 5135);


    /*** The select ***/

    select personId, startDate, max(endDate), glId
    from
    (
    select s.personID, s.glID, s.startDate, e.endDate, e.endDate - s.endDate period
    from my_table s, my_table e
    where
    e.personID = s.personID
    and e.glID = s.glID
    and e.startDate >= s.startDate
    and e.endDate - s.endDate = (
    select nvl(sum( t.endDate - t.startDate + 1 ),0) non_gap_days
    from my_table t
    where
    t.startDate >= s.endDate
    and t.endDate <= e.endDate
    and t.personID = s.personID
    and t.glID = s.glID
    )
    and not exists (
    select 'x' from my_table p
    where
    p.personID = s.personID
    and p.glID = s.glID
    and p.endDate + 1 = s.startDate
    )
    )
    group by
    personId, glId, startDate
    order by
    personId, startDate


    With the index on personId. startDate the explain plan in RULE based optimiser is
    Execution Plan
    ----------------------------------------------------------
    0 SELECT STATEMENT Optimizer=RULE
    1 0 SORT (GROUP BY)
    2 1 FILTER
    3 2 NESTED LOOPS
    4 3 TABLE ACCESS (FULL) OF 'MY_TABLE'
    5 3 TABLE ACCESS (BY INDEX ROWID) OF 'MY_TABLE'
    6 5 INDEX (RANGE SCAN) OF 'MY_INDEX' (UNIQUE)
    7 2 SORT (AGGREGATE)
    8 7 TABLE ACCESS (BY INDEX ROWID) OF 'MY_TABLE'
    9 8 INDEX (RANGE SCAN) OF 'MY_INDEX' (UNIQUE)
    10 2 TABLE ACCESS (BY INDEX ROWID) OF 'MY_TABLE'
    11 10 INDEX (RANGE SCAN) OF 'MY_INDEX' (UNIQUE)

    Clearly the sorts will be expensive (perhaps prohibatively so) if this is run on a large table (perhaps time to use PQO).

    The query tries to identify pairs of records with no gap in continuity between them (using the sum sub-query) and eliminates those where the first of the pair is the earliest record in a run of contiguous records (using the not exists). Finally the group by and max() removes those records where a longer encompassing run exists.

    Hopefully the logic stands up to different data situations.

    Andy

    P.S. Shame that the formatting goes to pieces

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Click Here to Expand Forum to Full Width