Parse THIS!!! XML can't contain HTML (anchor tags)???
Is there a way to parse XML that has HTML tags and special characters within it?
I've been searching for days for a solution with no joy.
The Oracle scripts I wrote below work fine as long as only alphanumeric characters are in the XML doc.
Here's what I'm doing...I'm page scraping several dynamic websites using ColdFusion. I remove all the unneeded HTML...and replace the remaining HTML with XML tags.
I then want to pass this XML (attached txt file) variable (cfxml) to Oracle so it can be parsed and loaded into the database. If I remove the anchor tags and the CDATA tags...the code works fine....but I want the anchor tags loaded into my table.
Anyone have any suggestions....preferably a solution so I can stop pulling my damn hair out.
---------------------------------------------------------
Oracle Scripts
---------------------------------------------------------
create or replace procedure LoadXML (cfxml in varchar) AS
begin
insert into JOBS (xmldoc) values
(xmltype.createxml(cfxml));
end;
create or replace trigger job_bifert as
begin
if (:new.xmldoc is null) then
return;
end if;
select
xmltype.getstringval(:new.xmldoc.extract('/recordset/row/title/text()')),
xmltype.getstringval(:new.xmldoc.extract('/recordset/row/company/text()')),
xmltype.getstringval(:new.xmldoc.extract('/recordset/row/location/text()'))
into :new.title, :new.company, :new.location
from dual;
end;
Can't really understand what your problem is from your post. I can see your trigger definition is a little off. Is you problem the compilation of the trigger, or what it does subsequently. I guess it should read:
Code:
create or replace trigger job_bifert
before insert on jobs
for each row
begin
if (:new.xmldoc is null) then
return;
end if;
select
xmltype.getstringval(:new.xmldoc.extract('/recordset/row/title/text()')),
xmltype.getstringval(:new.xmldoc.extract('/recordset/row/company/text()')),
xmltype.getstringval(:new.xmldoc.extract('/recordset/row/location/text()'))
into :new.title, :new.company, :new.location
from dual;
end;
/
Is all the XML in a variable ready to pass into you procedure?
Why are you parsing it in the trigger, rather than the procedure, since this would perform better than doing half of the job in the procedure and half in the trigger.
Please provide everything you need to build the test case, including the CREATE TABLE statements. Provide a version of the code you are using, and the errors you are getting (compile or runtime errors).
Looking at your code you are using a definition of "CFXML VARCHAR2(200);" to hold you XML document, prior to insertion. This is adequate for the first example but is too small in the second example, hence your problem. Try switching to "CFXML VARCHAR2(32767);".
I even tried changing it to a CLOB (see below + attachment).
Still getting this error.
ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at "SASSOH.JOBS_BIFERT", line 5
ORA-04088: error during execution of trigger 'SASSOH.JOBS_BIFERT'
ORA-06512: at "SASSOH.LOADXML", line 3
ORA-06512: at line 22
When I take out the anchor tags and the CDATA, everything works perfectly.
--------------------------------------------------------------
CREATE OR REPLACE PROCEDURE LoadXML (cfxml in CLOB) AS
begin
insert into JOBS (xmldoc) values (xmltype.createxml(cfxml));
end;
/
It doesn't do me any good if I can't call the procedure.
The error is coming from the 5th line in the trigger....and the 3rd line of the procedure.
The assignment statement in the call is not causing an error.
CREATE OR REPLACE TRIGGER jobs_bifert
before insert on jobs
for each row
begin if (:new.xmldoc is null) then
return;
end if;
select
xmltype.getstringval(:new.xmldoc.extract('/recordset/row/title/text()')),
xmltype.getstringval(:new.xmldoc.extract('/recordset/row/company/text()')),
xmltype.getstringval(:new.xmldoc.extract('/recordset/row/location/text()'))
into :new.title, :new.company, :new.location
from dual;
end;
CREATE OR REPLACE PROCEDURE LoadXML (cfxml in CLOB) AS
begin insert into JOBS (xmldoc) values (xmltype.createxml(cfxml));
end;
/
I've not been in the forum for a while, hence no response.
The point about the commented out procedure, was that the calling code was wrong for a start, since you were trying to hold a large XML document in small variable. This was obviously only the first of several errors.
The answer's, pretty simple really. Look at the size of the data in the elements of your XML document, then look at th size of the table columns you are trying to insert into. Notice anything, especially when you look at the locations column? Yep, it's smaller than the data, making the :new.location record element too small, hence the error. If you increase the size of he column it works fine, as seen here:
Code:
CREATE TABLE JOBS
(
ID NUMBER,
HOST VARCHAR2(100 BYTE),
SEARCH VARCHAR2(100 BYTE),
LINK VARCHAR2(4000 BYTE),
POSTDATE DATE,
TITLE VARCHAR2(4000 BYTE),
COMPANY VARCHAR2(4000 BYTE),
LOCATION VARCHAR2(100 BYTE),
D_UPD DATE,
STATE VARCHAR2(100 BYTE),
CITY VARCHAR2(100 BYTE),
TITLE_NOLINK VARCHAR2(100 BYTE),
COMPANY_NOLINK VARCHAR2(100 BYTE),
XMLDOC SYS.XMLTYPE
);
CREATE OR REPLACE TRIGGER jobs_bifert
before insert on jobs
for each row
begin
if (:new.xmldoc is null) then
return;
end if;
select
xmltype.getstringval(:new.xmldoc.extract('/recordset/row/title/text()')),
xmltype.getstringval(:new.xmldoc.extract('/recordset/row/company/text()')),
xmltype.getstringval(:new.xmldoc.extract('/recordset/row/location/text()'))
into :new.title, :new.company, :new.location
from dual;
end;
/
CREATE OR REPLACE PROCEDURE LoadXML (cfxml in CLOB) AS
begin
insert into JOBS (xmldoc) values (xmltype.createxml(cfxml));
end;
/
SQL> DECLARE
2 CFXML VARCHAR2(32767);
3
4 BEGIN
5 CFXML := '<recordset>
6 <row>
7 <date>
8 <![CDATA[ Jul 27 ]]>
9 </date>
10 <title>
11 <![CDATA[ ><a href="http://jobsearch.monster.com:80/getjob.asp?JobID=32472300' || CHR(38) || 'AVSDM=2005%2D07%2D28+02%3A36%3A26' || CHR(38) || 'Logo=1' || CHR(38) || 'q=cold fusion' || CHR(38) || 'cy=US' || CHR(38) || 'sort=dt">ASP.NET Technical Program Manager with Marketing Savvy</a> ]]>
12 </title>
13 <company>
14 <![CDATA[ Excell Data ]]>
15 </company>
16 <location>
17 <![CDATA[ ><a onClick="popMMLL(''WA'',''Redmond''); return false;" href="http://jobsearch.monster.com:80/896">WA-Redmond</a> ]]>
18 </location>
19 </row>
20 </recordset>';
21
22 LOADXML ( CFXML );
23 COMMIT;
24 END;
25 /
DECLARE
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at "TIM_HALL.JOBS_BIFERT", line 5
ORA-04088: error during execution of trigger 'TIM_HALL.JOBS_BIFERT'
ORA-06512: at "TIM_HALL.LOADXML", line 3
ORA-06512: at line 22
SQL> alter table jobs modify (LOCATION VARCHAR2(150 BYTE));
Table altered.
SQL> DECLARE
2 CFXML VARCHAR2(32767);
3
4 BEGIN
5 CFXML := '<recordset>
6 <row>
7 <date>
8 <![CDATA[ Jul 27 ]]>
9 </date>
10 <title>
11 <![CDATA[ ><a href="http://jobsearch.monster.com:80/getjob.asp?JobID=32472300' || CHR(38) || 'AVSDM=2005%2D07%2D28+02%3A36%3A26' || CHR(38) || 'Logo=1' || CHR(38) || 'q=cold fusion' || CHR(38) || 'cy=US' || CHR(38) || 'sort=dt">ASP.NET Technical Program Manager with Marketing Savvy</a> ]]>
12 </title>
13 <company>
14 <![CDATA[ Excell Data ]]>
15 </company>
16 <location>
17 <![CDATA[ ><a onClick="popMMLL(''WA'',''Redmond''); return false;" href="http://jobsearch.monster.com:80/896">WA-Redmond</a> ]]>
18 </location>
19 </row>
20 </recordset>';
21
22 LOADXML ( CFXML );
23 COMMIT;
24 END;
25 /
PL/SQL procedure successfully completed.
SQL>