Enhancing parsing performance

2003-01-13 - By Brian Madigan

Turn validation off!

DOMParser parser = new DOMParser();
parser.setFeature("http://xml.org/sax/features/validation", false);

or something to that effect. If I am not mistaken,
that should stop any DTD validation from happening.
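
Note that switching validation off does not necessarily stop the parser from fetching the external DTD named in the DOCTYPE, and a remote fetch would be consistent with the multi-second parse times in the quoted log. Below is a minimal sketch of one way to skip that fetch. It assumes the Xerces-specific feature `http://apache.org/xml/features/nonvalidating/load-external-dtd` is recognised by the parser in use (worth checking against your Xerces version). It goes through the standard JAXP `DocumentBuilderFactory` rather than `DOMParser` so that it runs on a stock JDK. The class name `NoExternalDtd` and the sample document are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration; not part of Xerces or the original code.
class NoExternalDtd {
    // Xerces feature: when false, a non-validating parser skips the external DTD.
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Parses the given XML text without validating and without loading the DTD.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);                 // no DTD validation
            factory.setFeature(LOAD_EXTERNAL_DTD, false); // no DTD retrieval
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped so callers in this sketch need no checked-exception handling.
            throw new IllegalStateException("Parse failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // Sample XHTML document whose DOCTYPE points at a remote DTD.
        String xml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.1//EN\" "
            + "\"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd\">"
            + "<html><body><p>hello</p></body></html>";
        Document doc = parse(xml);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

With the DTD skipped, entities declared only there (for example `&nbsp;` in XHTML) may no longer be expanded. If the documents rely on them, serving a local copy of the DTD through an `EntityResolver` could be the better trade-off.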

--- Jean Georges PERRIN <jgp@(protected)> wrote:
> Hi,
>
> Thanks for the hope message!
>
> I was timing the whole method; I have now focused on parser creation and
> parse time.
>
> I changed my code to:
>   public void load () {
>     DOMParser parser;
>     Logger log = ThinStructureConfiguration.getInstance().getLogger();
>
>     try {
>       long start = System.currentTimeMillis();
>       parser = new DOMParser();
>       long stop = System.currentTimeMillis();
>       log.finest ("Creating parser took " + (stop - start) + " ms");
>     }
>     catch (Exception e) {
>       log.severe ("Error: Unable to instantiate parser");
>       return;
>     }
>
>     try {
>       long start = System.currentTimeMillis();
>       parser.parse(m_file.toURI().toString());
>       long stop = System.currentTimeMillis();
>       log.finest ("Parsing of " + m_file.getName() + " took " + (stop - start) + " ms");
>       m_document = parser.getDocument();
>     }
>     catch (SAXParseException e) {
>       // ignore
>     }
>     catch (Exception e) {
>       String msg;
>       msg = ("Error: Parse error occurred, " + e.getMessage());
>       // getException() returns null when no exception is wrapped
>       if (e instanceof SAXException && ((SAXException)e).getException() != null) {
>         e = ((SAXException)e).getException();
>       }
>       msg += '\n' + e.toString();
>       log.severe (msg);
>     }
>   }
>
> Results are:
> Jan 13, 2003 11:52:20 PM com.awoma.ts.ui.impl.XHTML11Window load
> FINEST: Creating parser took 251 ms
> Jan 13, 2003 11:52:25 PM com.awoma.ts.ui.impl.XHTML11Window load
> FINEST: Parsing of emailpassword.xhtml took 5227 ms
> Jan 13, 2003 11:52:25 PM com.awoma.ts.ui.Store add
> INFO: Window definition emailpassword.xhtml added.
> Jan 13, 2003 11:52:25 PM com.awoma.ts.ui.impl.XHTML11Window load
> FINEST: Creating parser took 10 ms
> Jan 13, 2003 11:52:29 PM com.awoma.ts.ui.impl.XHTML11Window load
> FINEST: Parsing of emailpassword2.xhtml took 3085 ms
> Jan 13, 2003 11:52:29 PM com.awoma.ts.ui.Store add
> INFO: Window definition emailpassword2.xhtml added.
> Jan 13, 2003 11:52:29 PM com.awoma.ts.ui.impl.XHTML11Window load
> FINEST: Creating parser took 0 ms
> Jan 13, 2003 11:52:29 PM com.awoma.ts.ui.impl.XHTML11Window load
> FINEST: Parsing of emailpassword3.xhtml took 10 ms
> Jan 13, 2003 11:52:29 PM com.awoma.ts.ui.Store add
> INFO: Window definition emailpassword3.xhtml added.
> Jan 13, 2003 11:52:29 PM com.awoma.ts.ui.impl.XHTML11Window load
> FINEST: Creating parser took 0 ms
> Jan 13, 2003 11:52:31 PM com.awoma.ts.ui.impl.XHTML11Window load
> FINEST: Parsing of emailpassword4.xhtml took 2774 ms
>
> All files are identical, except #3, where I removed all references to the
> external world.
>
> I use Xerces J 2.2.1 (according to build.xml).
>
> Conclusions & questions:
> 1/ Creation of DOMParser() is slow the first time, but negligible
> afterwards, so there is no need to optimise that much.
> 2/ My parser seems to want to check validity through an external
> connection. How can I turn that off without modifying all my files?
>
> jgp
>
> > -----Original Message-----
> > From: Simon Kitching [mailto:simon@(protected)]
> > Sent: Monday, January 13, 2003 23:24
> > To: jgp@(protected)
> > Cc: xerces-j-user@(protected)
> > Subject: Re: Enhancing parsing performance
> >
> > Hi Jean Georges,
> >
> > Firstly, does the document you are parsing contain a DTD or schema
> > reference? If it uses http://acme.com/xyz.dtd, then much of your parsing
> > time may actually be in retrieval of the remote dtd. And if the
> > dtd/schema is large then time will be spent processing it. If this is
> > the case, there are optimisations available for both these problems.
> >
> > Secondly, you don't say exactly what you are timing. Is it the complete
> > application time, or the time taken by the method you include below, or
> > just the time for the parse method?
> >
> > Thirdly, you don't mention which version of Xerces you are using...
> >
> > Providing information on the above would allow people to provide better
> > suggestions for you.
> >
> > I certainly see better performance than you do, so there is hope :-)
> >
> > Regards,
> >
> > Simon
> >
> > On Tue, 2003-01-14 at 10:56, Jean Georges PERRIN wrote:
> > > Hi,
> > >
> > > Thanks for those who helped me with cloning...
> > >
> > > I am a little surprised with performance. Maybe there are some basic
> > > things I am doing wrong.
> > >
> > > I am parsing a 3 KB XHTML file and it takes me about 4 s, while cloning
> > > the tree takes a negligible amount of time (10 ms). This is on an Athlon
> > > XP 1800+ running XP (sure, I could switch to Linux, but it is not planned
> > > for now :) ).
> > >
> > > My code for parsing:
> > >   protected void load () {
> > >     DOMParser parser;
> > >
> > >     try {
> > >       parser = new DOMParser();
> > >     }
> > >     catch (Exception e) {
> > >       log.severe ("Error: Unable to instantiate parser");
> > >       return;
> > >     }
> > >
> > >     try {
> > >       parser.parse(m_file.toURI().toString());
> > >       m_document = parser.getDocument();
> > >     }
> > >     catch (SAXParseException e) {
> > >       // ignore
> > >     }
> > >     catch (Exception e) {
> > >       String msg;
> > >       msg = ("Error: Parse error occurred, " + e.getMessage());
> > >       // getException() returns null when no exception is wrapped
> > >       if (e instanceof SAXException && ((SAXException)e).getException() != null) {
> > >         e = ((SAXException)e).getException();
> > >       }
> > >       msg += '\n' + e.toString();
> > >       log.severe (msg);
> > >     }
> > >   }
> > >
> > > Questions:
> > > 1/ Will making my parser static speed up the process?
> > > 2/ Can I "pre" create some objects I can reuse?
> > > 3/ Are there some verifications I can turn
=== message truncated ===

