DBAsupport.com Forums - Powered by vBulletin
Results 1 to 5 of 5

Thread: SQL Help

  1. #1
    Join Date
    Dec 2000
    Location
    Virginia, USA
    Posts
    455
    I got two tables T1 and T2. Five columns together makes the primary Key in each table. They both have same number of columns with the same names.
    Only difference is the number of records in each of them. T1 has more records then the T2.

    What I want is everything from T1 that is not in T2.

    Thanks in Advance

  2. #2
    Join Date
    Sep 2000
    Posts
    362
    You can use the MINUS Set operator for doing this.


    Consider these two queries and their results:

    SELECT part
    FROM orders_list1;

    PART
    ----------
    SPARKPLUG
    FUEL PUMP
    FUEL PUMP
    TAILPIPE

    SELECT part
    FROM orders_list2;

    PART
    ----------
    CRANKSHAFT
    TAILPIPE
    TAILPIPE


    SELECT part
    FROM orders_list1
    MINUS
    SELECT part
    FROM orders_list2;

    PART
    ----------
    SPARKPLUG
    FUEL PUMP

    Hope this Helps
    Anurag

  3. #3
    Join Date
    Feb 2001
    Posts
    184
    Select Distinct col1, col2, col3 ...
    From Tab1 T1
    Where Not Exists ( Select Null From Tab2 T2
    Where T1.Col1 = T2.Col1 and T1.Col2 = T2.Col2 ..... )

    This will Give you the desired result...

    Hope will help

    Thanks

  4. #4
    Join Date
    Nov 2000
    Location
    Baltimore, MD USA
    Posts
    1,339
    As always, I'm here to add my 2 cents on behalf of performance.

    I have a table ORG with over 22,000 rows. I then made a copy of the table called CRL_ORG which I populated with all the 'even' Org_PK records, resulting in over 11,000 rows.

    I then compared the following, logically equivalent, statements:
    (as always, each statement was run multiple times on my own, dedicated server).

    --------------------------------------------------------------------
    Statement
    ------------
    SELECT
    ---*
    FROM
    ---ORG
    MINUS
    SELECT
    ---*
    FROM
    ---CRL_ORG

    Plan:
    ------
    SELECT STATEMENT Optimizer=CHOOSE (Cost=1944 Card=34175 Bytes=16779925)
    ---MINUS
    ------SORT (UNIQUE) (Cost=1292 Card=22734 Bytes=11162394)
    ---------TABLE ACCESS (FULL) OF ORG (Cost=33 Card=22734 Bytes=11162394)
    ------SORT (UNIQUE) (Cost=652 Card=11441 Bytes=5617531)
    ---------TABLE ACCESS (FULL) OF CRL_ORG (Cost=17 Card=11441 Bytes=5617531)

    Stats:
    -------
    Logical Reads: 1349 (So low because of the table scans)
    Time: 32.59

    --------------------------------------------------------------------
    Statement
    ------------
    SELECT
    ---O.*
    FROM
    ---ORG------O ,
    ---CRL_ORG---CO
    WHERE
    ---CO.ORG_PK---(+)=---O.ORG_PK---AND
    ---CO.ORG_PK---IS------NULL

    Plan
    -----
    SELECT STATEMENT Optimizer=CHOOSE (Cost=34 Card=22734 Bytes=11253330)
    ---FILTER
    ------NESTED LOOPS (OUTER)
    ---------TABLE ACCESS (FULL) OF ORG (Cost=33 Card=22734 Bytes=11162394)
    ---------INDEX (UNIQUE SCAN) OF SYS_C0035879 (UNIQUE)

    Stats
    ------
    Logical Reads: 47,037
    Time: 26.64

    --------------------------------------------------------------------
    Statement
    ------------
    SELECT
    ---*
    FROM
    ---ORG---O
    WHERE
    ---NOT EXISTS
    ------(
    ---------SELECT
    ------------1
    ---------FROM
    ------------CRL_ORG---CO
    ---------WHERE
    ------------CO.ORG_PK---=---O.ORG_PK
    ------)

    Plan
    -----
    SELECT STATEMENT Optimizer=CHOOSE (Cost=33 Card=1137 Bytes=558267)
    ---FILTER
    ------TABLE ACCESS (FULL) OF ORG (Cost=33 Card=1137 Bytes=558267)
    ------INDEX (UNIQUE SCAN) OF SYS_C0035879 (UNIQUE) (Cost=1 Card=1 Bytes=4)

    Stats
    ------
    Logical Reads: 47,037
    Time: 25.67

    --------------------------------------------------------------------
    I decided to give the first statement a better shot, so I re-wrote it a little.

    Statement
    ------------
    SELECT
    ---O.*
    FROM
    ---ORG---O,
    ---(
    ------SELECT
    ---------ORG_PK
    ------FROM
    ---------ORG
    ------MINUS
    ------SELECT
    ---------ORG_PK
    ------FROM
    ---------CRL_ORG
    ---)------O2
    WHERE
    ---O.ORG_PK---=---O2.ORG_PK

    Plan
    -----
    SELECT STATEMENT Optimizer=CHOOSE (Cost=179 Card=34175 Bytes=17224200)
    ---HASH JOIN (Cost=179 Card=34175 Bytes=17224200)
    ------VIEW (Cost=37 Card=34175 Bytes=444275)
    ---------MINUS
    ------------SORT (UNIQUE) (Cost=24 Card=22734 Bytes=90936)
    ---------------INDEX (FULL SCAN) OF XPKORG (UNIQUE) (Cost=25 Card=22734 Bytes=90936)
    ------------SORT (UNIQUE) (Cost=13 Card=11441 Bytes=45764)
    ---------------INDEX (FULL SCAN) OF SYS_C0035879 (UNIQUE) (Cost=13 Card=11441 Bytes=45764)
    ------TABLE ACCESS (FULL) OF ORG (Cost=88 Card=22734 Bytes=11162394)

    Stats
    ------
    Logical Reads: 1607 (low because of the FULL SCANs)
    Time: 25.22


    --------------------------------------------------------------------
    Statement
    ------------
    SELECT
    ---*
    FROM
    ---ORG---O
    WHERE
    ---O.ORG_PK NOT IN
    ------(
    ---------SELECT
    ------------ORG_PK
    ---------FROM
    ------------CRL_ORG---CO
    ------)

    Plan
    -----
    SELECT STATEMENT Optimizer=CHOOSE (Cost=33 Card=1137 Bytes=558267)
    ---FILTER
    ------TABLE ACCESS (FULL) OF ORG (Cost=33 Card=1137 Bytes=558267)
    ------INDEX (UNIQUE SCAN) OF SYS_C0035879 (UNIQUE) (Cost=1 Card=1 Bytes=4)

    Stats
    ------
    Logical Reads: 47037
    Time: 25.28

    --------------------------------------------------------------------
    Statement
    ------------
    SELECT
    ---*
    FROM
    ---ORG---O
    WHERE
    ---O.ORG_PK NOT IN
    ------(
    ---------SELECT /*+ HASH_AJ */
    ------------ORG_PK
    ---------FROM
    ------------CRL_ORG---CO
    ------)

    Plan
    -----
    SELECT STATEMENT Optimizer=CHOOSE (Cost=215 Card=11293 Bytes=5691672)
    ---HASH JOIN (ANTI) (Cost=215 Card=11293 Bytes=5691672)
    ------TABLE ACCESS (FULL) OF ORG (Cost=33 Card=22734 Bytes=11162394)
    ------VIEW OF VW_NSO_1 (Cost=1 Card=11441 Bytes=148733)
    ---------INDEX (FULL SCAN) OF SYS_C0035879 (UNIQUE) (Cost=13 Card=11441 Bytes=45764)

    Stats
    ------
    Logical Reads: 860 (low because of the FULL SCANs)
    Time: 27.8

    --------------------------------------------------------------------
    Statement
    ------------
    SELECT
    ---*
    FROM
    ---ORG---O
    WHERE
    ---O.ORG_PK NOT IN
    ------(
    ---------SELECT /*+ MERGE_AJ */
    ------------ORG_PK
    ---------FROM
    ------------CRL_ORG---CO
    ------)

    Plan
    -----
    SELECT STATEMENT Optimizer=CHOOSE (Cost=108 Card=11293 Bytes=5691672)
    ---MERGE JOIN (ANTI) (Cost=108 Card=11293 Bytes=5691672)
    ------TABLE ACCESS (BY INDEX ROWID) OF ORG (Cost=88 Card=22734 Bytes=11162394)
    ---------INDEX (FULL SCAN) OF XPKORG (UNIQUE) (Cost=25 Card=22734)
    ------SORT (UNIQUE) (Cost=20 Card=11441 Bytes=148733)
    ---------VIEW OF VW_NSO_1 (Cost=1 Card=11441 Bytes=148733)
    ------------INDEX (FULL SCAN) OF SYS_C0035879 (UNIQUE) (Cost=13 Card=11441 Bytes=45764)

    Stats
    ------
    Logical Reads: 9999
    Time: 25.07

    --------------------------------------------------------------------
    --------------------------------------------------------------------
    --------------------------------------------------------------------

    As you can see, the choices are relatively equal, although the straight 'minus' solution is definitely the worst. However, the modified 'minus' solution is conversely one of the best, with perhaps the MERGE_AJ being a little better, if for no other reason than the simplicity, although I generally avoid NOT IN sub-selects.

    So why did I do this? Mostly to try to shed some more light on the fact that there are always a ton of different ways to solve any given problem in SQL. It always pays to look into things a little deeper. In this case, we didn't see much gain, but we did see a 20% difference between the best and worst case, and this was a *real* basic example. Also note that this was not nearly an exhaustive study. There were permutations I did't address. Also, when the table sizes change or the percentage difference between the table change, then the optimal solution might be different. Also note that I did not include the INDEX_FFS hint (which didn't help anywhere). I also did not touch Parallel query options or partitioned tables, which would have made *massive* differences.

    Anyway, happy coding,

    - Chris

  5. #5
    Join Date
    Dec 2000
    Location
    Virginia, USA
    Posts
    455
    Thanks Awsome

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Click Here to Expand Forum to Full Width