Shell Sript to load data from flat file
DBAsupport.com Forums - Powered by vBulletin
Page 1 of 2 12 LastLast
Results 1 to 10 of 18

Thread: Shell Sript to load data from flat file

Hybrid View

  1. #1
    Join Date
    Sep 2000
    Posts
    362
    Hi,

    I am totally new to writing shell scripts.

    What I want to do is that i want to do a conditional load of data from a flat file onto a table.

    The data looks something like this

    MBOC YR_1991 REGROLL BSPSIEB 1.000000 1.000000
    ALLEP YR_1991 REGROLL COFYE 24.000000 24.000000
    ALLEP YR_1991 REGROLL1 COLOP 48.000000 48.000000
    ALLEP YR_1991 REGROLL2 COEPAM 8.000000 8.000000

    It is in a fixed length format. I want to load only those records in which the 3rd field starting fom 21st column and ending at 30th is "REGROLL" or "REGROLL2" .

    Please Help me out.

    Thanks
    Anurag
    Appreciation is a wonderful thing;
    It makes what is excellent in others belong to us as well.


  2. #2
    Join Date
    Jun 2001
    Location
    Charlotte, NC
    Posts
    27

    Cool

    Originally posted by anuragmin
    Hi,

    I am totally new to writing shell scripts.

    What I want to do is that i want to do a conditional load of data from a flat file onto a table.

    The data looks something like this

    MBOC YR_1991 REGROLL BSPSIEB 1.000000 1.000000
    ALLEP YR_1991 REGROLL COFYE 24.000000 24.000000
    ALLEP YR_1991 REGROLL1 COLOP 48.000000 48.000000
    ALLEP YR_1991 REGROLL2 COEPAM 8.000000 8.000000

    It is in a fixed length format. I want to load only those records in which the 3rd field starting fom 21st column and ending at 30th is "REGROLL" or "REGROLL2" .

    Please Help me out.


    Thanks
    Anurag


    Oracle has a utility called sql loader which allows a user to load data from a flat file into a table. The syntax is as follows

    sqlldr username/password contol=load.ctl

    the layout for the load.ctl file is as follows

    Load Data
    infile *
    choose one of the following options (truncate,append,replace)
    into table (table_name)
    when (21:30) = 'regroll' or (21:30) = 'regroll2)
    (column_name postion(x:x)
    ... continue to enter the record format
    ...
    )
    BEGINDATA
    MBOC YR_1991 REGROLL BSPSIEB 1.000000 1.000000
    ALLEP YR_1991 REGROLL COFYE 24.000000 24.000000
    ALLEP YR_1991 REGROLL1 COLOP 48.000000 48.000000
    ALLEP YR_1991 REGROLL2 COEPAM 8.000000 8.000000



  3. #3
    Join Date
    Sep 2000
    Posts
    362
    Originally posted by JUNE

    Load Data
    infile *
    choose one of the following options (truncate,append,replace)
    into table (table_name)
    when (21:30) = 'regroll' or (21:30) = 'regroll2)
    (column_name postion(x:x)
    ... continue to enter the record format
    ...
    )
    BEGINDATA
    MBOC YR_1991 REGROLL BSPSIEB 1.000000 1.000000
    ALLEP YR_1991 REGROLL COFYE 24.000000 24.000000
    ALLEP YR_1991 REGROLL1 COLOP 48.000000 48.000000
    ALLEP YR_1991 REGROLL2 COEPAM 8.000000 8.000000


    Hi,
    Sql Loader does not support the OR operator and thats teh reason I need to use a shell script.

    Please see my other thread where I had the same question for sql loader

    http://www.dbasupport.com/forums/sho...threadid=11922

    Thanks
    Anurag
    Appreciation is a wonderful thing;
    It makes what is excellent in others belong to us as well.


  4. #4
    Join Date
    Mar 2001
    Posts
    314
    An observation - somebody please explain :-)

    In the earlier thread, the solution to Anurag's problem suggested by Jurij works fine, but only if you use a fixed format input file with a control file like the following:

    LOAD DATA
    INFILE 'c:\tmp\tt.dat'
    APPEND
    INTO TABLE tt
    WHEN C3 = "REGROLL"
    (c1 POSITION(01:05) CHAR,
    c2 POSITION(06:12) CHAR,
    c3 POSITION(13:20) CHAR,
    c4 POSITION(21:27) CHAR,
    c5 POSITION(28:36) DECIMAL EXTERNAL,
    c6 POSITION(37:45) DECIMAL EXTERNAL)

    INTO TABLE tt
    WHEN C3 = "REGROLL2"
    (c1 POSITION(01:05) CHAR,
    c2 POSITION(06:12) CHAR,
    c3 POSITION(13:20) CHAR,
    c4 POSITION(21:27) CHAR,
    c5 POSITION(28:36) DECIMAL EXTERNAL,
    c6 POSITION(37:45) DECIMAL EXTERNAL)

    The input file used is:
    tt.dat
    ------------
    MBOC YR_1991REGROLL BSPSIEB 1.000000 1.000000
    ALLEPYR_1991REGROLL COFYE 24.00000024.000000
    ALLEPYR_1991REGROLL1COLOP 48.00000048.000000
    ALLEPYR_1992REGROLL2COEPAM 8.000000 8.000000
    ALLEPYR_1991REGROLL3COEPAM 9.000000 9.000000
    ---------------------------------


    Now, if I try to do the same thing using a free format input file (FIELDS SEPARATED BY WHITESPACE) it does not seem to work! The control file used is:

    LOAD DATA
    INFILE 'c:\tmp\tt.dat'
    APPEND
    INTO TABLE tt
    WHEN C3 = "REGROLL"
    FIELDS TERMINATED BY WHITESPACE
    (c1,c2,c3,c4,c5,c6)

    INTO TABLE tt
    WHEN C3 = 'REGROLL2'
    FIELDS TERMINATED BY WHITESPACE
    (c1,c2,c3,c4,c5,c6)

    The input file used is:
    tt.dat
    --------------------
    MBOC YR_1991 REGROLL BSPSIEB 1.000000 1.000000
    ALLEP YR_1991 REGROLL COFYE 24.000000 24.000000
    ALLEP YR_1991 REGROLL1 COLOP 48.000000 48.000000
    ALLEP YR_1991 REGROLL2 COEPAM 8.000000 8.000000
    --------------------------------------

    Only the first WHEN clause is satisfied.

    Now, I know that I am missing something - question is what?

    Can somebody explain please?

    -amar


  5. #5
    Join Date
    Dec 2000
    Location
    Ljubljana, Slovenia
    Posts
    4,439
    The following controlfile will do:

    LOAD DATA
    INFILE *
    APPEND
    INTO TABLE tt
    WHEN C3 = 'REGROLL'
    FIELDS TERMINATED BY WHITESPACE
    (c1,c2,c3,c4,c5,c6)
    INTO TABLE tt
    WHEN C3 = "REGROLL2"
    FIELDS TERMINATED BY WHITESPACE
    (c1 POSITION(1),c2,c3,c4,c5,c6)
    BEGINDATA
    MBOC YR_1991 REGROLL BSPSIEB 1,000000 1,000000
    ALLEP YR_1991 REGROLL COFYE 24,000000 24,000000
    ALLEP YR_1991 REGROLL1 COLOP 48,000000 48,000000
    ALLEP YR_1991 REGROLL2 COEPAM 8,000000 8,000000

    Note the POSITION(1) keyword after the first column specification for the second table load. This is neccesary for SQL*Loader to know that the field c1 in second table load begins at position 1 in the data line, othervise it would search for it after the field c6 in the first table load.
    Jurij Modic
    ASCII a stupid question, get a stupid ANSI
    24 hours in a day .... 24 beer in a case .... coincidence?

  6. #6
    Join Date
    Sep 2000
    Posts
    362
    Hi,

    My question of an alternative method of loading data is still open.


    Anurag


    Appreciation is a wonderful thing;
    It makes what is excellent in others belong to us as well.


  7. #7
    Join Date
    Aug 2000
    Location
    Belgium
    Posts
    342
    If you're on unix, make an awk script that strips the data you want to load.
    Then user SQLLOADER to load the data to oracle

    > cat test.txt
    line1 bla error
    line2 bla2 good
    line3 bla3 good
    line4 bla error bla2
    line5 bla2 good

    > cat test.awk
    {
    tekst = substr($0,7,4)
    if ( tekst == "bla2" ) print $0
    if ( tekst == "bla3" ) print $0
    }

    > awk -f test.awk test.txt
    line2 bla2 good
    line3 bla3 good
    line5 bla2 good

    Hope this helps
    Gert

  8. #8
    Join Date
    Sep 2000
    Posts
    362

    Unhappy

    Originally posted by denevge

    > cat test.awk
    {
    tekst = substr($0,7,4)
    if ( tekst == "bla2" ) print $0
    if ( tekst == "bla3" ) print $0
    }

    > awk -f test.awk test.txt
    Hi Denevge,
    Thanks a lot. The script works. But mine a little more complex and in the test.awk looks something like

    {
    tmpregion = substr( $0,21,10)
    tmpyear = substr($0,14,4)
    if ( tmpregion == "REGROLL " ) && (tmpyear > 1996) print $0
    }

    But when i try to execute awl from the command line it gives me the following error

    bash-2.02$ awk -f test.awk test.txt > a
    awk: test1.awk:4: if ( tmpregion == "REGROLL " ) && (tmpyear > 1996) print $
    0
    awk: test1.awk:4: ^ parse error

    Please help me out with writing this script as it is the first time in my life I am writing one.

    Thanks
    Anurag

    Appreciation is a wonderful thing;
    It makes what is excellent in others belong to us as well.


  9. #9
    Join Date
    Sep 2000
    Posts
    362
    hi,
    I was able to do it with the following change.

    {
    tmpregion = substr( $0,21,10)
    tmpyear = gsub(substr($0,14,4)
    if ( tmpregion == "REGROLL " && tmpyear >= 1996) print $0
    }

    But my problem is that I need to replace YR_1996 to 1996 in the output.


    The data looks something like this
    MBOC YR_1991 REGROLL1 BSPSIEB 1.000000 1.000000
    ALLEP YR_1991 REGROLL2 COFYE 24.000000 24.000000
    ALLEP YR_1997 REGROLL COLOP 48.000000 48.000000
    ALLEP YR_1996 REGROLL COEPAM 8.000000 8.000000

    I want to select only the records in which the year is greater than 1996 and also in the process replace YR_1991 with 1991 to be inserted in the database.

    Please Please Help.

    Thanks
    Anurag

    [Edited by anuragmin on 06-15-2001 at 01:58 PM]
    Appreciation is a wonderful thing;
    It makes what is excellent in others belong to us as well.


  10. #10
    Join Date
    Sep 2000
    Posts
    362
    Please Suggest. The question is still open

    Sorry for bugging again and again.

    Thanks a Lot.
    Anurag
    Appreciation is a wonderful thing;
    It makes what is excellent in others belong to us as well.


Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Click Here to Expand Forum to Full Width