First proposal on SoC project "Add support for the StAX (JSR-173) cursor API

2007-04-01       - By Michael Glavassevich

Hi Wei,

Welcome to the list and thanks for sharing your thoughts with everyone. I
think you've got the general idea. Some initial comments below...

"wei duan" <weidua@(protected)> wrote on 04/01/2007 09:05:53 AM:

> Hello, everyone,
>        I'm a student applying for the SoC project "Add support for the
> StAX (JSR-173) cursor API to Xerces-J". Michael suggested I could
> discuss my proposal on the mailing list, so I would like to
> introduce my thoughts and plan for this student project. Any comments
> are welcome. :)
>         The abstract description of the project is: "To design and
> implement the cursor-based XMLStreamReader (and filtering
> support). It should be possible to accomplish this using XNI by building
> the XMLStreamReader on top of an XMLPullParserConfiguration."
> Besides XNI, there are several other ways to implement the StAX
> interface. For example, one could parse the XML document as raw text
> from scratch (scanning characters, building tokens, interpreting
> tokens, and so on), or implement a converter on top of the existing
> DOM or SAX interfaces.

The student we had for GSoC last year already implemented the DOM and SAX
converters. They're useful when you're starting from a SAX or DOM source,
but you really need a native solution to get decent performance if you're
parsing the document from a stream.

> However, after reading the Xerces source code, I found that
> both the SAX and DOM implementations are based on XNI, so it's very
> natural to build StAX on XNI.
> To implement XMLStreamReader, two important preconditions should be
> confirmed.
> 1.       XML event information can be received.
> 2.       The pull style parsing process can be simulated.
> When I looked through the XNI interfaces, I found that they actually meet
> these two preconditions. The handler interfaces in XNI such as
> XMLDocumentHandler and XMLDTDHandler receive XML events including
> startDocument and endDocument, which can be easily mapped to the
> corresponding StAX events. The XMLPullParserConfiguration interface in
> XNI represents a parser configuration that can be used as the
> configuration for a "pull" parser, so the pull parsing process of
> StAX can be simulated by calling the "boolean parse(boolean)" method of
> XMLPullParserConfiguration.
> Then I looked through the current Xerces implementation and found that
> the AbstractXMLDocumentParser class implements the XMLDocumentHandler,
> XMLDTDHandler, and XMLDTDContentModelHandler interfaces. Both
> AbstractDOMParser and AbstractSAXParser extend
> AbstractXMLDocumentParser. So I think I can implement an
> AbstractStAXParser extending AbstractXMLDocumentParser to get XML
> events.
> For example, code in current AbstractSAXParser:
>     public void comment(XMLString text, Augmentations augs) throws
> XNIException {
>         try {
>             // SAX2 extension
>             if (fLexicalHandler != null) {
>                 fLexicalHandler.comment(text.ch, 0, text.length);
>             }
>         }
>         catch (SAXException e) {
>             throw new XNIException(e);
>         }
>     } // comment(XMLString)
>
> And in my AbstractStAXParser, it may be implemented like this:
>    public class AbstractStAXParser extends AbstractXMLDocumentParser {
>         public int m_curEventType;
>         public String m_characters;
>        ….
>        public void comment(XMLString text, Augmentations augs)
> throws XNIException {
>               m_curEventType = XMLStreamConstants.COMMENT;
>               m_characters = new String(text.ch, text.offset,
>                                         text.length);

Where possible you should try to avoid creating new strings unless a call
to the API demands one.

>        }
>       …
>   }
>
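
To illustrate the point about avoiding new strings: StAX exposes getTextCharacters(), getTextStart() and getTextLength() alongside getText(), so a String is only really demanded by getText(). The following is a minimal sketch of deferring that allocation. LazyText is a hypothetical helper, not a Xerces class, and the sketch assumes the scanner's buffer cannot be relied on after the callback returns, which is why it copies into a reusable array.

```java
// Hypothetical holder for the text of the current event (not part of Xerces).
final class LazyText {
    private char[] buf = new char[64]; // reused across events
    private int length;
    private String cached;             // built at most once per event

    // Called from a handler callback such as comment(). The characters are
    // copied because the caller's buffer is assumed to be reused later.
    void set(char[] ch, int offset, int len) {
        if (len > buf.length) {
            buf = new char[Math.max(len, buf.length * 2)];
        }
        System.arraycopy(ch, offset, buf, 0, len);
        length = len;
        cached = null; // drop the String of the previous event
    }

    // Array-based accessors: no String is created on this path.
    char[] getTextCharacters() { return buf; }
    int getTextStart() { return 0; }
    int getTextLength() { return length; }

    // The String is created only when a caller actually asks for it.
    String getText() {
        if (cached == null) {
            cached = new String(buf, 0, length);
        }
        return cached;
    }
}
```

A callback would then call something like set(text.ch, text.offset, text.length) instead of building a String for every event, and applications that only use the array-based accessors never pay for one.
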
> Meanwhile, XMLPullParserConfiguration will be used to control the
> parsing process. XML11Configuration is the implementation of
> XMLPullParserConfiguration interface in Xerces.

It's one of several parser configurations which implement
XMLPullParserConfiguration. Given the way we implemented XInclude
(dispatching a child pipeline to read the entire include before returning
to the parent), XML11Configuration is probably a better choice than
XIncludeAwareParserConfiguration (the current default config for SAX and
DOM).

> I think I can implement StAXParserConfiguration which extends from
> XML11Configuration for XML 1.0 and XML 1.1.

I'm not sure why you would need a new parser configuration. I think
XML11Configuration would work just fine. Is there something that you think
is missing from it?

> At runtime, AbstractStAXParser will be set as the handler of the
> StAXParserConfiguration instance.
> As for XMLStreamReader, it can be implemented like this:
> public class StAXXMLStreamReader implements XMLStreamReader {
>        public StAXParserConfiguration m_configuration;
>        public StAXParser m_parser;
>        ….
>        public int getEventType() {
>            return m_parser.m_curEventType;
>        }
>        public int next() {
>            m_configuration.parse(false);
>            return m_parser.m_curEventType;
>        }
>        …
> }

I don't think the StAX and XNI method signatures overlap with each other.
You could probably merge this into the other class and avoid the
indirection.

public class AbstractStAXParser
  extends AbstractXMLDocumentParser
  implements XMLStreamReader {
 ...
}
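
A minimal sketch of that merged shape, where one object both receives the parser callbacks and answers the cursor-style queries. To keep it self-contained it does not use the real XNI or StAX interfaces: DocHandler, PullSource, MergedReader and ScriptedSource are all hypothetical stand-ins, and it assumes each parse(false) call delivers exactly one callback, which would need to be checked against the real configuration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.stream.XMLStreamConstants;

// Hypothetical stand-in for the document handler callbacks.
interface DocHandler {
    void startElement(String name);
    void endElement(String name);
    void endDocument();
}

// Hypothetical stand-in for a pull configuration. parse(false) is assumed
// to deliver one callback and return true while more input remains.
interface PullSource {
    void setHandler(DocHandler handler);
    boolean parse(boolean complete);
}

// One class plays both roles, so no second object has to be consulted.
final class MergedReader implements DocHandler {
    private final PullSource source;
    private int eventType = XMLStreamConstants.START_DOCUMENT;
    private String localName;

    MergedReader(PullSource source) {
        this.source = source;
        source.setHandler(this);
    }

    // Handler side: record what the parser just reported.
    @Override public void startElement(String name) {
        eventType = XMLStreamConstants.START_ELEMENT;
        localName = name;
    }
    @Override public void endElement(String name) {
        eventType = XMLStreamConstants.END_ELEMENT;
        localName = name;
    }
    @Override public void endDocument() {
        eventType = XMLStreamConstants.END_DOCUMENT;
        localName = null;
    }

    // Cursor side: advance the parser by one step and report the result.
    boolean hasNext() { return eventType != XMLStreamConstants.END_DOCUMENT; }
    int next() { source.parse(false); return eventType; }
    int getEventType() { return eventType; }
    String getLocalName() { return localName; }
}

// Toy source replaying the events of <a/>, for demonstration only.
final class ScriptedSource implements PullSource {
    private final Deque<Runnable> steps = new ArrayDeque<>();

    @Override public void setHandler(DocHandler handler) {
        steps.clear();
        steps.add(() -> handler.startElement("a"));
        steps.add(() -> handler.endElement("a"));
        steps.add(handler::endDocument);
    }
    @Override public boolean parse(boolean complete) {
        Runnable step = steps.poll();
        if (step != null) {
            step.run();
        }
        return !steps.isEmpty();
    }
}
```

A caller would drive it with the usual cursor loop, for example: MergedReader r = new MergedReader(new ScriptedSource()); while (r.hasNext()) { int event = r.next(); }
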

>       Above are some of my rough thoughts, so if you have any
> comments and questions, I would like to discuss with you.
>
> Thanks, Wei

Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@(protected)