Parse THIS!!! XML can't contain HTML (anchor tags)???
DBAsupport.com Forums - Powered by vBulletin
Page 1 of 2 12 LastLast
Results 1 to 10 of 12

Thread: Parse THIS!!! XML can't contain HTML (anchor tags)???

  1. #1
    Join Date
    Aug 2005
    Posts
    7

    Angry Parse THIS!!! XML can't contain HTML (anchor tags)???

    Is there a way to parse XML that has HTML tags and special characters within it?

    I've been searching for days for a solution with no joy.
    The Oracle scripts I wrote below work fine as long as only alphanumeric characters are in the XML doc.

    Here's what I'm doing...I'm page scraping several dynamic websites using ColdFusion. I remove all the unneeded HTML...and replace the remaining HTML with XML tags.

    I then want to pass this XML (attached txt file) variable (cfxml) to Oracle so it can be parsed and loaded into the database. If I remove the anchor tags and the CDATA tags...the code works fine....but I want the anchor tags loaded into my table.

    Anyone have any suggestions....preferably a solution so I can stop pulling my damn hair out.

    ---------------------------------------------------------
    Oracle Scripts
    ---------------------------------------------------------
    create or replace procedure LoadXML (cfxml in varchar) AS
    begin
    insert into JOBS (xmldoc) values
    (xmltype.createxml(cfxml));
    end;

    create or replace trigger job_bifert as
    begin
    if (:new.xmldoc is null) then
    return;
    end if;
    select
    xmltype.getstringval(:new.xmldoc.extract('/recordset/row/title/text()')),
    xmltype.getstringval(:new.xmldoc.extract('/recordset/row/company/text()')),
    xmltype.getstringval(:new.xmldoc.extract('/recordset/row/location/text()'))
    into :new.title, :new.company, :new.location
    from dual;
    end;
    Attached Files Attached Files

  2. #2
    Join Date
    Dec 2001
    Location
    UK
    Posts
    1,684
    Hi.

    Can't really understand what your problem is from your post. I can see your trigger definition is a little off. Is you problem the compilation of the trigger, or what it does subsequently. I guess it should read:

    Code:
    create or replace trigger job_bifert 
    before insert on jobs
    for each row
    begin
      if (:new.xmldoc is null) then
        return;
      end if;
      select
        xmltype.getstringval(:new.xmldoc.extract('/recordset/row/title/text()')),
        xmltype.getstringval(:new.xmldoc.extract('/recordset/row/company/text()')),
        xmltype.getstringval(:new.xmldoc.extract('/recordset/row/location/text()'))
      into :new.title, :new.company, :new.location
      from dual;
    end;
    /
    Is all the XML in a variable ready to pass into you procedure?
    Why are you parsing it in the trigger, rather than the procedure, since this would perform better than doing half of the job in the procedure and half in the trigger.

    Please provide everything you need to build the test case, including the CREATE TABLE statements. Provide a version of the code you are using, and the errors you are getting (compile or runtime errors).

    Cheers

    Tim...
    Tim...
    OCP DBA 7.3, 8, 8i, 9i, 10g, 11g
    OCA PL/SQL Developer
    Oracle ACE Director
    My website: www.oracle-base.com
    My blog: www.oracle-base.com/blog

  3. #3
    Join Date
    Aug 2005
    Posts
    7

    Scripts, Examples....Errs, Etc.

    Tim,

    Thanks for responding. Below is the table (JOBs) I want to insert records.

    CREATE TABLE JOBS
    (
    ID NUMBER,
    HOST VARCHAR2(100 BYTE),
    SEARCH VARCHAR2(100 BYTE),
    LINK VARCHAR2(4000 BYTE),
    POSTDATE DATE,
    TITLE VARCHAR2(4000 BYTE),
    COMPANY VARCHAR2(4000 BYTE),
    LOCATION VARCHAR2(100 BYTE),
    D_UPD DATE,
    STATE VARCHAR2(100 BYTE),
    CITY VARCHAR2(100 BYTE),
    TITLE_NOLINK VARCHAR2(100 BYTE),
    COMPANY_NOLINK VARCHAR2(100 BYTE),
    XMLDOC SYS.XMLTYPE
    )

    Like I was saying...when I try to use the SP and Trigger in Ex 1...everything works fine. Ex 2 doesn't work (see attachment).

    Any ideas why???
    Attached Files Attached Files
    Last edited by jpollar; 08-08-2005 at 06:03 AM.

  4. #4
    Join Date
    Dec 2001
    Location
    UK
    Posts
    1,684
    Hi.

    Looking at your code you are using a definition of "CFXML VARCHAR2(200);" to hold you XML document, prior to insertion. This is adequate for the first example but is too small in the second example, hence your problem. Try switching to "CFXML VARCHAR2(32767);".

    Try this out and tell me how you get on.

    Cheers

    Tim...
    Tim...
    OCP DBA 7.3, 8, 8i, 9i, 10g, 11g
    OCA PL/SQL Developer
    Oracle ACE Director
    My website: www.oracle-base.com
    My blog: www.oracle-base.com/blog

  5. #5
    Join Date
    Aug 2005
    Posts
    7

    Nope....still get the error

    Nope...didn't work.

    I even tried changing it to a CLOB (see below + attachment).

    Still getting this error.

    ORA-06502: PL/SQL: numeric or value error: character string buffer too small
    ORA-06512: at "SASSOH.JOBS_BIFERT", line 5
    ORA-04088: error during execution of trigger 'SASSOH.JOBS_BIFERT'
    ORA-06512: at "SASSOH.LOADXML", line 3
    ORA-06512: at line 22

    When I take out the anchor tags and the CDATA, everything works perfectly.


    --------------------------------------------------------------
    CREATE OR REPLACE PROCEDURE LoadXML (cfxml in CLOB) AS
    begin
    insert into JOBS (xmldoc) values (xmltype.createxml(cfxml));
    end;
    /
    Attached Files Attached Files

  6. #6
    Join Date
    Dec 2001
    Location
    UK
    Posts
    1,684
    Strange. This works fine for me:

    Code:
    DECLARE 
      CFXML VARCHAR2(32767);
    
    BEGIN 
      CFXML := '
    <row>
    <date>
    <![CDATA[ Jul 27 ]]>
    </date>
    <title>
    <![CDATA[ ><a href="http://jobsearch.monster.com:80/getjob.asp?JobID=32472300' || CHR(38) || 'AVSDM=2005%2D07%2D28+02%3A36%3A26' || CHR(38) || 'Logo=1' || CHR(38) || 'q=cold fusion' || CHR(38) || 'cy=US' || CHR(38) || 'sort=dt">ASP.NET Technical Program Manager with Marketing Savvy</a> ]]>
    </title>
    <company>
    <![CDATA[ Excell Data ]]>
    </company>
    <location>
    <![CDATA[ ><a onClick="popMMLL(''WA'',''Redmond''); return false;" href="http://jobsearch.monster.com:80/896">WA-Redmond</a> ]]>
    </location>
    </row>
    </recordset>';
    
      --SASSOH.LOADXML ( CFXML );
      COMMIT; 
    END; 
    /
    Notice, I've commented out the procedure call as the error is at line 5 which is the variable assignment in the PL/SQL block, not the procedure.

    Cheers

    Tim...
    Last edited by TimHall; 08-09-2005 at 07:01 AM.
    Tim...
    OCP DBA 7.3, 8, 8i, 9i, 10g, 11g
    OCA PL/SQL Developer
    Oracle ACE Director
    My website: www.oracle-base.com
    My blog: www.oracle-base.com/blog

  7. #7
    Join Date
    Aug 2005
    Posts
    7

    Same here...but

    It doesn't do me any good if I can't call the procedure.

    The error is coming from the 5th line in the trigger....and the 3rd line of the procedure.

    The assignment statement in the call is not causing an error.

    CREATE OR REPLACE TRIGGER jobs_bifert
    before insert on jobs
    for each row
    begin
    if (:new.xmldoc is null) then
    return;
    end if;
    select
    xmltype.getstringval(:new.xmldoc.extract('/recordset/row/title/text()')),
    xmltype.getstringval(:new.xmldoc.extract('/recordset/row/company/text()')),
    xmltype.getstringval(:new.xmldoc.extract('/recordset/row/location/text()'))
    into :new.title, :new.company, :new.location
    from dual;
    end;

    CREATE OR REPLACE PROCEDURE LoadXML (cfxml in CLOB) AS
    begin
    insert into JOBS (xmldoc) values (xmltype.createxml(cfxml));
    end;
    /
    Last edited by jpollar; 08-10-2005 at 09:24 AM.

  8. #8
    Join Date
    Aug 2005
    Posts
    7
    push

  9. #9
    Join Date
    Aug 2005
    Posts
    7
    push

  10. #10
    Join Date
    Dec 2001
    Location
    UK
    Posts
    1,684
    Hi.

    I've not been in the forum for a while, hence no response.

    The point about the commented out procedure, was that the calling code was wrong for a start, since you were trying to hold a large XML document in small variable. This was obviously only the first of several errors.

    The answer's, pretty simple really. Look at the size of the data in the elements of your XML document, then look at th size of the table columns you are trying to insert into. Notice anything, especially when you look at the locations column? Yep, it's smaller than the data, making the :new.location record element too small, hence the error. If you increase the size of he column it works fine, as seen here:

    Code:
    CREATE TABLE JOBS
    (
      ID NUMBER,
      HOST VARCHAR2(100 BYTE),
      SEARCH VARCHAR2(100 BYTE),
      LINK VARCHAR2(4000 BYTE),
      POSTDATE DATE,
      TITLE VARCHAR2(4000 BYTE),
      COMPANY VARCHAR2(4000 BYTE),
      LOCATION VARCHAR2(100 BYTE),
      D_UPD DATE,
      STATE VARCHAR2(100 BYTE),
      CITY VARCHAR2(100 BYTE),
      TITLE_NOLINK VARCHAR2(100 BYTE),
      COMPANY_NOLINK VARCHAR2(100 BYTE),
      XMLDOC SYS.XMLTYPE
    );
    
    CREATE OR REPLACE TRIGGER jobs_bifert
    before insert on jobs
    for each row
    begin
    if (:new.xmldoc is null) then
    return;
    end if;
     select
    xmltype.getstringval(:new.xmldoc.extract('/recordset/row/title/text()')),
    xmltype.getstringval(:new.xmldoc.extract('/recordset/row/company/text()')),
    xmltype.getstringval(:new.xmldoc.extract('/recordset/row/location/text()'))
    into :new.title, :new.company, :new.location
    from dual;
    end;
    /
    
    CREATE OR REPLACE PROCEDURE LoadXML (cfxml in CLOB) AS
    begin
    insert into JOBS (xmldoc) values (xmltype.createxml(cfxml));
    end;
    /
    
    SQL> DECLARE 
      2    CFXML VARCHAR2(32767);
      3  
      4  BEGIN 
      5    CFXML := '<recordset>
      6  <row>
      7  <date>
      8  <![CDATA[ Jul 27 ]]>
      9  </date>
     10  <title>
     11  <![CDATA[ ><a href="http://jobsearch.monster.com:80/getjob.asp?JobID=32472300' || CHR(38) || 'AVSDM=2005%2D07%2D28+02%3A36%3A26' || CHR(38) || 'Logo=1' || CHR(38) || 'q=cold fusion' || CHR(38) || 'cy=US' || CHR(38) || 'sort=dt">ASP.NET Technical Program Manager with Marketing Savvy</a> ]]>
     12  </title>
     13  <company>
     14  <![CDATA[ Excell Data ]]>
     15  </company>
     16  <location>
     17  <![CDATA[ ><a onClick="popMMLL(''WA'',''Redmond''); return false;" href="http://jobsearch.monster.com:80/896">WA-Redmond</a> ]]>
     18  </location>
     19  </row>
     20  </recordset>';
     21  
     22    LOADXML ( CFXML );
     23    COMMIT; 
     24  END; 
     25  /
    DECLARE
    *
    ERROR at line 1:
    ORA-06502: PL/SQL: numeric or value error: character string buffer too small
    ORA-06512: at "TIM_HALL.JOBS_BIFERT", line 5
    ORA-04088: error during execution of trigger 'TIM_HALL.JOBS_BIFERT'
    ORA-06512: at "TIM_HALL.LOADXML", line 3
    ORA-06512: at line 22
    
    
    SQL> alter table jobs modify (LOCATION VARCHAR2(150 BYTE));
    
    Table altered.
    
    SQL> DECLARE 
      2    CFXML VARCHAR2(32767);
      3  
      4  BEGIN 
      5    CFXML := '<recordset>
      6  <row>
      7  <date>
      8  <![CDATA[ Jul 27 ]]>
      9  </date>
     10  <title>
     11  <![CDATA[ ><a href="http://jobsearch.monster.com:80/getjob.asp?JobID=32472300' || CHR(38) || 'AVSDM=2005%2D07%2D28+02%3A36%3A26' || CHR(38) || 'Logo=1' || CHR(38) || 'q=cold fusion' || CHR(38) || 'cy=US' || CHR(38) || 'sort=dt">ASP.NET Technical Program Manager with Marketing Savvy</a> ]]>
     12  </title>
     13  <company>
     14  <![CDATA[ Excell Data ]]>
     15  </company>
     16  <location>
     17  <![CDATA[ ><a onClick="popMMLL(''WA'',''Redmond''); return false;" href="http://jobsearch.monster.com:80/896">WA-Redmond</a> ]]>
     18  </location>
     19  </row>
     20  </recordset>';
     21  
     22    LOADXML ( CFXML );
     23    COMMIT; 
     24  END; 
     25  /
    
    PL/SQL procedure successfully completed.
    
    SQL>
    Cheers

    Tim...
    Last edited by TimHall; 08-18-2005 at 01:02 PM.
    Tim...
    OCP DBA 7.3, 8, 8i, 9i, 10g, 11g
    OCA PL/SQL Developer
    Oracle ACE Director
    My website: www.oracle-base.com
    My blog: www.oracle-base.com/blog

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width