determining the encoding of an external subset via XNI

2003-03-10       - By neilg@(protected)

Hi all,

In an attempt to generate some more discussion surrounding the issue I
raised in the message below, here are some ways by which we might move
forward.  For those who didn't see the previous thread, the Cole's Notes
version of the problem is that, as XNI is currently designed, there doesn't
seem to be any way of determining what the parser autodetected the encoding
of the DTD external subset to be--or any way of determining anything about
that encoding at all if the external subset doesn't happen to contain a
text decl.

Here are all the options that I've thought of:

1.  We could modify the XMLDTDHandler#startExternalSubset callback so that,
instead of looking like

     public void startExternalSubset(XMLResourceIdentifier identifier,
                                     Augmentations augs)

it looks like

     public void startExternalSubset(XMLResourceIdentifier identifier,
                                     String encoding, Augmentations augs)

This would make that callback much more symmetric to the startDocument
callback of the XMLDocumentHandler interface; unfortunately it has the
tremendous drawback of not being terribly backwards compatible.

2.  We could add a new callback to the XMLDTDHandler interface, something
like:

     public void externalSubsetEncoding(String encoding)

which we would advertise as occurring after the startExternalSubset
callback and before the textDecl call. While this would be far more
backward compatible, there's no precedent for anything like it in XNI;
also, the callback would only be useful for external subsets, since in all
other contexts we already have methods for conveying encoding information.

3.  We could use the Augmentations parameter of the startExternalSubset
callback.  This would preserve backward compatibility, but certainly
couldn't be accused of being beautiful; also, it would mark the first time
we've used Augmentations in Xerces for something at the level of a scanner.
So far, we've only employed that functionality in the context of schema
validation.

4.  We could amend the XMLLocator interface by adding a method like

     public String getEncoding()

along the lines of the SAX Locator2 interface.  This again would only be
really useful in this single context, since XNI goes out of its way
everywhere else to explicitly make provision for the passage of encoding
information; i.e., it doesn't seem to accord well with the overall design
of the API.
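
To make option 3 a little more concrete, here is a rough sketch of how a
handler might pick the encoding out of the Augmentations parameter. The
interfaces below are cut-down stand-ins for the XNI types of the same names
so that the example compiles on its own; the key name, MapAugmentations,
EncodingAwareHandler and reportExternalSubset are all invented for
illustration and exist nowhere in Xerces.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for org.apache.xerces.xni.Augmentations (only the two methods used here).
interface Augmentations {
    Object putItem(String key, Object item);
    Object getItem(String key);
}

// Stand-in for org.apache.xerces.xni.XMLResourceIdentifier, reduced to one accessor.
interface XMLResourceIdentifier {
    String getExpandedSystemId();
}

// Stand-in for the relevant slice of XMLDTDHandler; the signature is left unchanged.
interface XMLDTDHandler {
    void startExternalSubset(XMLResourceIdentifier identifier, Augmentations augs);
}

// Hypothetical map-backed Augmentations implementation for this sketch.
class MapAugmentations implements Augmentations {
    private final Map<String, Object> items = new HashMap<>();
    public Object putItem(String key, Object item) { return items.put(key, item); }
    public Object getItem(String key) { return items.get(key); }
}

class EncodingAwareHandler implements XMLDTDHandler {
    // Hypothetical key: an assumption of this sketch, not something Xerces defines.
    static final String ENCODING_KEY = "EXTERNAL_SUBSET_ENCODING";

    // Stays null when the scanner supplied no encoding augmentation.
    String externalSubsetEncoding;

    public void startExternalSubset(XMLResourceIdentifier identifier, Augmentations augs) {
        Object value = (augs != null) ? augs.getItem(ENCODING_KEY) : null;
        externalSubsetEncoding = (value instanceof String) ? (String) value : null;
    }
}

class AugmentationsSketch {
    // Plays the scanner's part: attach the detected encoding, then fire the callback.
    static void reportExternalSubset(XMLDTDHandler handler, String systemId, String encoding) {
        Augmentations augs = new MapAugmentations();
        augs.putItem(EncodingAwareHandler.ENCODING_KEY, encoding);
        handler.startExternalSubset(() -> systemId, augs);
    }

    public static void main(String[] args) {
        EncodingAwareHandler handler = new EncodingAwareHandler();
        reportExternalSubset(handler, "file:///example.dtd", "UTF-16");
        System.out.println(handler.externalSubsetEncoding);
    }
}
```

Handlers that don't know about the key simply never look it up, which is
where the backward compatibility comes from; the price is that the contract
lives in a string constant rather than in a method signature.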

I'll readily admit that none of these solutions is particularly attractive.
Thoughts, preferences, or more appealing solutions are thus even more than
usually welcome!

Cheers,
Neil
Neil Graham
XML Parser Development
IBM Toronto Lab
Phone:  905-413-3519, T/L 969-3519
E-mail:  neilg@(protected)


----- Forwarded by Neil Graham/Toronto/IBM on 03/10/2003 06:03 PM -----

From:     Neil Graham/Toronto/IBM@(protected)
Date:     03/04/2003 11:13 PM
To:       xerces-j-dev@(protected)
cc:
Subject:  another encoding issue



Hi all,

How does one determine the autodetected encoding of a DTD external subset?

Right now, our DTD scanner takes this information from the entity manager
in a (non-XNI) startEntity(name, resourceIdentifier, encoding) call but
drops the encoding information on the floor for entities whose names are
[dtd].

It sure would have been handy if the
XMLDTDHandler#startExternalSubset(XMLResourceIdentifier, Augmentations) had
also included an encoding parameter...
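
As a rough illustration of what that would look like, the sketch below has a
scanner pass the encoding it receives in startEntity straight through to the
handler when the entity is the external subset. Everything here is a
simplified stand-in rather than actual Xerces code: EncodingAwareDTDHandler
and DTDScannerSketch are invented names, and a plain system ID string takes
the place of the XMLResourceIdentifier.

```java
// Hypothetical handler whose external-subset callback also carries the encoding.
interface EncodingAwareDTDHandler {
    void startExternalSubset(String systemId, String encoding);
}

class DTDScannerSketch {
    private final EncodingAwareDTDHandler handler;

    DTDScannerSketch(EncodingAwareDTDHandler handler) {
        this.handler = handler;
    }

    // Same shape as the non-XNI startEntity(name, resourceIdentifier, encoding)
    // call described above, with the identifier reduced to a system ID.
    void startEntity(String name, String systemId, String encoding) {
        if ("[dtd]".equals(name)) {
            // Forward the autodetected encoding instead of discarding it.
            handler.startExternalSubset(systemId, encoding);
        }
        // Other entities are outside the scope of this sketch.
    }

    public static void main(String[] args) {
        DTDScannerSketch scanner = new DTDScannerSketch(
                (systemId, encoding) -> System.out.println(systemId + " is " + encoding));
        scanner.startEntity("[dtd]", "file:///example.dtd", "UTF-16");
    }
}
```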

Thoughts?

Cheers,
Neil
Neil Graham
XML Parser Development
IBM Toronto Lab
Phone:  905-413-3519, T/L 969-3519
E-mail:  neilg@(protected)




