high value unicode characters

2004-04-07 - By Joshua Santelli

Hello,

We're using the Xerces-C++ SAX2Print sample, version 2.5.0
(xerces-c_2_5_0-solaris_27-cc_62), and have run into a problem with a few
"high value" Unicode characters, i.e. characters above U+FFFF.  What we
would like to do is validate the file and convert it to UTF-8.  SAX2Print
completes with no error, but some strange characters appear after the
high-value Unicode characters (𝖢, 𝖧 and 𝒫) in the output.

    The command is: # SAX2Print -v=always -x=UTF-8 test1.xml

The error that I get using SAX2Print on the output XML file is:

    Fatal Error at file test1-out.xml, line 5, char 35
      Message: Got an unexpected trailing surrogate character


Any idea what is going wrong here?

Thanks in advance,
josh
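
For readers unfamiliar with the term in the error message: characters above
U+FFFF, such as the three in the test file, are stored in UTF-16 as a leading
and a trailing surrogate, and a correct UTF-8 conversion has to combine the
pair into a single 4-byte sequence. The Python sketch below is independent of
Xerces (the function names are illustrative, not part of any library) and only
shows that arithmetic. Whether SAX2Print 2.5.0 mishandles the pair is an
assumption, not something this post confirms.

```python
def to_surrogate_pair(code_point):
    """Split a code point in U+10000..U+10FFFF into (leading, trailing) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point: %#x" % code_point)
    offset = code_point - 0x10000
    # Upper 10 bits go into the leading surrogate, lower 10 bits into the trailing one.
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def from_surrogate_pair(leading, trailing):
    """Recombine a surrogate pair into the original code point."""
    if not (0xD800 <= leading <= 0xDBFF and 0xDC00 <= trailing <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((leading - 0xD800) << 10) + (trailing - 0xDC00)


if __name__ == "__main__":
    # The three characters referenced as entities in test1.xml.
    for cp in (0x1D5A2, 0x1D5A7, 0x1D4AB):
        leading, trailing = to_surrogate_pair(cp)
        utf8 = " ".join("%02X" % b for b in chr(cp).encode("utf-8"))
        print("U+%05X -> UTF-16 %04X %04X -> UTF-8 %s" % (cp, leading, trailing, utf8))
```

For U+1D5A2 this gives the pair D835 DDA2 and the UTF-8 bytes F0 9D 96 A2. A
trailing surrogate that reaches a parser without its leading half cannot be
recombined, which is the kind of condition the message above describes.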


=========================
<?xml version="1.0"?>
<!DOCTYPE test SYSTEM "test.dtd">
<test>
        <testPara>
                <head>1. high value Unicode characters and some
punctuation as entities</head>
                <p>Assuming &#x1D5A2;&#x1D5A7;, Hindman [ht1] showed that
the existence of certain ultrafilters on the power set of the natural
numbers is equivalent to Hindman&#x2019;s Theorem.  Adapting this work to a
countable setting formalized in RCA<sub>0</sub>, this article proves the
equivalence of the existence of certain ultrafilters on countable Boolean
algebras and an iterated form of Hindman&#x2019;s Theorem, which is closely
related to Milliken&#x2019;s Theorem.</p>
        </testPara>
        <testPara>
                <head>2. high value Unicode char and some Greek as
entities</head>
                <p>This article is a continuation of our search for
tautologies that are hard even for strong propositional proof systems like
EF, cf. [Kra-wphp,Kra-tau].  The particular tautologies we study, the
&#x03C4;-formulas, are obtained from any &#x1D4AB;/poly map g; they express
that a string is outside of the range of g. Maps g considered here are
particular pseudorandom generators. The ultimate goal is to deduce the
hardness of the &#x03C4;-formulas for at least EF from some general,
plausible computational hardness hypothesis.</p>
        </testPara>
</test>
=========================
<!ELEMENT test (testPara+) >
<!ELEMENT testPara (head, p) >
<!ELEMENT head (#PCDATA) >
<!ELEMENT p (#PCDATA | b | i | sub)* >
<!ELEMENT b (#PCDATA) >
<!ELEMENT i (#PCDATA) >
<!ELEMENT sub (#PCDATA) >
=========================
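
One way to check the converted file (test1-out.xml in the error above) without
going through the parser is to run its bytes through a strict UTF-8 decoder
and note where decoding first fails. This is a minimal sketch; the helper name
is hypothetical, and it relies only on Python's built-in decoder, which
rejects surrogates that were encoded individually instead of as one 4-byte
sequence.

```python
def first_invalid_utf8(data):
    """Return (byte_offset, offending_bytes) for the first invalid UTF-8
    sequence in data, or None if the whole buffer decodes cleanly."""
    try:
        data.decode("utf-8")  # strict by default
    except UnicodeDecodeError as err:
        return err.start, data[err.start:err.end]
    return None


if __name__ == "__main__":
    import sys

    # Usage: python check_utf8.py test1-out.xml
    with open(sys.argv[1], "rb") as handle:
        problem = first_invalid_utf8(handle.read())
    if problem is None:
        print("file is valid UTF-8")
    else:
        print("invalid UTF-8 at byte offset %d: %r" % problem)
```

A byte offset reported here can be compared with the line and column in the
parser's error message to see whether both point at the same character.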

