xerces always escapes ampersands

2003-08-04       - By Williams, Erskine BGI SF

I'm finding that Xerces always escapes ampersands, even when they are
part of a character reference. For example, if I define a text
element like so: <someText>&#x20AC;</someText> (where "&#x20AC;" is the
hexadecimal character reference for the euro sign), then when Xerces writes
this out to a file I invariably get: <someText>&amp;#x20AC;</someText>
Every ampersand is escaped into the entity reference "&amp;".

Perhaps my confusion arises from a poor understanding of XML, but I would
expect Xerces to escape only those ampersands that are not part of a valid
character or entity reference; i.e., if an ampersand is immediately followed
by a pound (#) sign, it should be left alone. Is there a more reliable way to
reference non-ASCII characters in XML, so that they pass through
Xerces unchanged?
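
One way to see the distinction behind this question is the minimal sketch below. It assumes the JDK's built-in JAXP parser and serializer as a stand-in for the Castor and dom4j setups in this post; the class name CharRefDemo and its helper methods are hypothetical. A character reference written in markup is resolved by the parser into a single character, whereas the same eight characters passed to an API as text content are plain data, so a serializer has to escape the ampersand to keep the output well-formed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses only the JAXP APIs bundled with the JDK.
public class CharRefDemo {

  // Parse an XML string and return the text content of its root element.
  public static String parseRootText(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      return doc.getDocumentElement().getTextContent();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  // Build <someText>text</someText> in memory and serialize it to a string.
  public static String serialize(String text) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .newDocument();
      Element el = doc.createElement("someText");
      el.setTextContent(text);
      doc.appendChild(el);

      Transformer t = TransformerFactory.newInstance().newTransformer();
      t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
      StringWriter out = new StringWriter();
      t.transform(new DOMSource(doc), new StreamResult(out));
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // In markup, the parser resolves the reference to one character, U+20AC.
    String parsed = parseRootText("<someText>&#x20AC;</someText>");
    System.out.println(parsed.length() + " char, is euro: "
        + parsed.equals("\u20AC"));

    // As text content, "&#x20AC;" is eight ordinary characters,
    // so the serializer escapes the leading ampersand.
    System.out.println(serialize("&#x20AC;"));
  }
}
```

Under these assumptions, the second call reproduces the &amp;#x20AC; output described above: the serializer cannot know that the literal text was meant as a reference.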

I use Castor and dom4j to manipulate XML in my application, and both use
Xerces under the covers, if I am not mistaken. Some simple test
cases are below. Any guidance is very much appreciated.
Cheers,
Erskine

/***********************
* Castor example
*
************************/
import java.io.FileWriter;
import java.io.File;

import org.exolab.castor.xml.Marshaller;

public class CastorTest {

 public static void main(String [] args) {

   //populate an arbitrary data object with special characters
   Factsheet fs = new Factsheet();
   ContentSections cs = new ContentSections();
   Content c = new Content();
   c.addPara("&#xA3; &#xA9; &#xAE;");
   cs.addContent(c);
   fs.setContentSections(cs);

   // now use the Castor marshalling framework to write the data object out to XML
   try (FileWriter fw = new FileWriter(new File("tmp.xml"))) {
     Marshaller m = new Marshaller(fw);
     m.setEncoding("iso-8859-1");
     m.marshal(fs);
   } catch (Exception e) {
     e.printStackTrace();
   }
 }
}

The resulting xml file looks like:

<?xml version="1.0" encoding="iso-8859-1"?>
<factsheet>
 <content>
   <para>&amp;#xA3; &amp;#xA9; &amp;#xAE;</para>
 </content>
</factsheet>

/********************************
*
* Dom4J example
*
********************************/
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

public class Dom4jTest {

 public static void main(String [] args) {
   Document document = DocumentHelper.createDocument();
   Element root = document.addElement("root");
   root.addElement("test").addText("&#xA3;,&#xAE;");
   // try-with-resources closes the writer even if write() fails
   try (Writer w = new FileWriter("tmp.xml")) {
     document.write(w);
   } catch (IOException e) {
     e.printStackTrace();
   }
 }
}

The result document is:

<?xml version="1.0" encoding="UTF-8"?>
<root>
 <test>&amp;#xA3;,&amp;#xAE;</test>
</root>
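
A possible workaround, sketched here under the assumption that the JDK's built-in JAXP serializer behaves comparably to the serializers used by Castor and dom4j, is to put the actual characters into the text (using Java's \uXXXX escapes) and let the serializer decide how to represent them for the chosen output encoding. The class name EncodingDemo and its helpers are hypothetical. Characters that the encoding cannot represent should come out as numeric character references, and either way the parsed text should match what was put in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; uses only the JAXP APIs bundled with the JDK.
public class EncodingDemo {

  // Serialize <para>text</para> to bytes in the requested encoding.
  public static byte[] serialize(String text, String encoding) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .newDocument();
      Element para = doc.createElement("para");
      para.setTextContent(text);
      doc.appendChild(para);

      Transformer t = TransformerFactory.newInstance().newTransformer();
      t.setOutputProperty(OutputKeys.ENCODING, encoding);
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      t.transform(new DOMSource(doc), new StreamResult(out));
      return out.toByteArray();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  // Parse serialized bytes and return the root element's text content.
  public static String parseRootText(byte[] xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml));
      return doc.getDocumentElement().getTextContent();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Real characters, not reference text: pound, copyright, registered, euro.
    String text = "\u00A3 \u00A9 \u00AE \u20AC";
    for (String encoding : new String[] {"ISO-8859-1", "UTF-8"}) {
      byte[] bytes = serialize(text, encoding);
      System.out.println(new String(bytes, Charset.forName(encoding)));
      System.out.println("round trip ok: " + parseRootText(bytes).equals(text));
    }
  }
}
```

The same idea would apply to the test cases above: passing "\u00A3" instead of the string "&#xA3;" to addPara or addText gives the serializer a real character to encode, so there is no ampersand for it to escape.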

