Parsing XML flat file performance issues

All posts relating to Oracle PL/SQL development.

Moderator: Tim...

Parsing XML flat file performance issues

Postby mikedev » Thu Aug 23, 2012 7:26 pm

Hi first time posting to any forum so apologies if I do anything wrong.

We have implemented the following technique for parsing a XML flat file.

http://www.oracle-base.com/articles/8i/ ... nts-8i.php

I was wondering if there is any way to reduce the time that the FOR LOOP takes to add new records to the table collection.
What I’m experiencing, I’m going to use employee as an example form the article. If I process 35 employees the average time it takes the FOR LOOP to process is around 1.5 seconds per employee. If we process 120 employees in a file the average time jumps to over 5 seconds an employee. So when we get to processing around 500 employees the process is taking about 4 ½ hours to complete.

When I did my test remove the insert to Oracle table code.

We are using 8i database.

Is that just the natural of the beast or can I do anything to improve the loading of the table collection?

Thanks in advance for any help that is provide.
mikedev
Member
 
Posts: 2
Joined: Thu Aug 23, 2012 6:17 pm

Re: Parsing XML flat file performance issues

Postby Tim... » Thu Aug 23, 2012 10:24 pm

Hi.

It is unfortunate that parsing of XML using a DOM parser is done entirely in memory. To do this, the whole DOM tree has to be resolved in to memory in one go. As documents get bigger, the effort to parse them gets proportionately bigger, making them even slower.

For very large documents, DOM parsers are very inefficient and people tend to use event-based SAX parsers. Oracle do not support SAX parsers from PL/SQL. :(

The trick is to make the XML as small as possible. It is better to use attribute-based XML, rather than tag based, as it reduces the total size of the documents.

Code: Select all
Tag-Based:
<employee><id>1234</id><name>Tim Hall</name></employee>

Attribute-Based
<employee id="1234" name="Tim Hall" />


Much of the XML functionality in 8i is implemented in Java. In 9i and above it is implemented in the kernel (written in C) and is typically much faster.

Cheers

Tim...
Tim...
Oracle ACE Director
Oracle ACE of the Year 2006 - Oracle Magazine Editors Choice Awards
OakTable Member
OCP DBA 7.3, 8, 8i, 9i, 10g, 11g
OCP Advanced PL/SQL Developer
Oracle Database: SQL Certified Expert
My website: http://www.oracle-base.com
My blog: http://www.oracle-base.com/blog
Tim...
Site Admin
 
Posts: 17935
Joined: Mon Nov 01, 2004 5:56 pm
Location: England, UK

Re: Parsing XML flat file performance issues

Postby mikedev » Fri Aug 24, 2012 12:38 am

Thank you Tim!

You have been very helpful.
mikedev
Member
 
Posts: 2
Joined: Thu Aug 23, 2012 6:17 pm

Re: Parsing XML flat file performance issues

Postby Tim... » Fri Aug 24, 2012 6:03 am

:)
Tim...
Oracle ACE Director
Oracle ACE of the Year 2006 - Oracle Magazine Editors Choice Awards
OakTable Member
OCP DBA 7.3, 8, 8i, 9i, 10g, 11g
OCP Advanced PL/SQL Developer
Oracle Database: SQL Certified Expert
My website: http://www.oracle-base.com
My blog: http://www.oracle-base.com/blog
Tim...
Site Admin
 
Posts: 17935
Joined: Mon Nov 01, 2004 5:56 pm
Location: England, UK


Return to Oracle SQL and PL/SQL Development

Who is online

Users browsing this forum: No registered users and 6 guests

cron