encoding and saxparser

2003-01-20       - By Voytenko, Dimitry

Hi Joseph,

Could you change a couple of lines of your code and try running it again?

old >>   ByteArrayOutputStream bos = new ByteArrayOutputStream();
new >> StringWriter bos = new StringWriter ();

old >> public java.io.PrintStream out = System.out;
new >> public java.io.PrintWriter out; // = new PrintWriter (System.out);

old >> public TestContentHandler (java.io.ByteArrayOutputStream bos){
old >>    out = new java.io.PrintStream(bos);
new >> public TestContentHandler (StringWriter bos){
new >>   out = new PrintWriter (bos);

I think the problem is that in the fragment
 public  void characters(char[] ch, int start, int length){
     out.print(new String(ch,start,length));
 }

You implicitly convert the string to bytes (using the default encoding) when
you print it to the PrintStream. Then you convert the bytes to a string again,
when you implicitly call bos.toString(), using the default encoding again.
Only then do you output this string to the console, using the console/output
encoding.
You can check your default character encoding using:
     System.err.println (sun.io.CharToByteConverter.getDefault());
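The call above relies on an internal sun.io class that is not part of the public API. As a rough sketch of the same check using only public APIs (the class name DefaultEncodingCheck is made up for illustration), something like this should work on Java 5 and later:

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of the original thread.
class DefaultEncodingCheck {

    // Reports the charset the JVM uses when no encoding is given explicitly.
    static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
             + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.err.println(describe());
    }
}
```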

Your default encoding might be different from ISO-8859-1 and may not support
character 0xE9. Your console/output encoding apparently supports this
character, since you can see it in the second case. And you don't use extra
char-to-byte-to-char conversions in the second case, which is why this is the
first thing to suspect.
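To illustrate the suspected round trip, here is a small sketch. It passes the charset explicitly instead of relying on the platform default, and US-ASCII stands in (as an assumption) for a default encoding that cannot represent 0xE9. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original thread.
class RoundTripDemo {

    // char -> byte -> char, the same path as PrintStream + bos.toString().
    static String roundTrip(String text, Charset cs) {
        byte[] bytes = text.getBytes(cs);   // unmappable chars become the charset's replacement byte
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        String name = "D\u00e9cio";         // \u00e9 is the character 0xE9 from the XML file

        // ISO-8859-1 can represent 0xE9, so the text survives.
        System.out.println(roundTrip(name, StandardCharsets.ISO_8859_1).equals(name)); // true

        // US-ASCII cannot, so the character is replaced during encoding.
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII));                // D?cio
    }
}
```

Once the character has been replaced on the way to bytes, no later choice of console encoding can bring it back.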

In either case, it's dangerous to rely on default encodings here, because
you might encounter deployment problems. Plus, you have several extra
conversions, which don't come free and are absolutely excessive.
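A minimal sketch of the character-only approach follows. It uses the JDK's JAXP SAXParserFactory rather than org.apache.xerces.parsers.SAXParser so that it runs without extra jars, and the class name CharCollector is made up for illustration. The handler writes to a PrintWriter over a StringWriter, so no default encoding is involved at any point.

```java
import java.io.ByteArrayInputStream;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, not the TestContentHandler from the thread.
class CharCollector extends DefaultHandler {
    private final StringWriter buffer = new StringWriter();
    private final PrintWriter out = new PrintWriter(buffer);

    @Override
    public void characters(char[] ch, int start, int length) {
        out.write(ch, start, length);       // chars stay chars, no byte conversion
    }

    String text() {
        out.flush();
        return buffer.toString();
    }

    // Parses raw bytes; the parser picks the encoding from the XML declaration.
    static String parse(byte[] xml) {
        try {
            CharCollector handler = new CharCollector();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new ByteArrayInputStream(xml)), handler);
            return handler.text();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                   + "<data><firstname>D\u00e9cio</firstname></data>";
        String result = parse(doc.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(result.equals("D\u00e9cio"));   // true
    }
}
```

If bytes are really needed at the end (for a file or a socket), the encoding can be named explicitly at that single point, for example with new OutputStreamWriter(stream, "ISO-8859-1").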

In conclusion, I ran your example (with my changes) using
Xalan 2.4.1 and Xerces 2.2.1, and everything was just fine in both cases.

Thanks,
Dimitry

-----Original Message-----
From: Joseph Shraibman [mailto:jks@(protected)]
Sent: Monday, January 20, 2003 17:44
To: xerces-j-user@(protected)
Subject: Re: encoding and saxparser


OK here is what I used to test:

My jsp:
===============================
<% response.setContentType("text/plain"); %>
<%@ page import="java.io.*" %>
<%@ page import="org.w3c.dom.Document" %>
<%@ page import="org.apache.xerces.parsers.*" %>
<%@ page import="org.xml.sax.*" %>
<%@ page import="org.apache.xerces.dom.*" %>
<%@ page import="javax.xml.transform.stream.*" %>
<%@ page import="javax.xml.transform.dom.*" %>
<%@ page import="javax.xml.transform.*" %>

Décio: <%= "Décio" %>

<%

{
File file = new java.io.File("/tmp/temp1.xml");
String xml_str = com.xtenit.control.SQLUtils.getFromFile(file);

 ByteArrayOutputStream bos = new ByteArrayOutputStream();
TestContentHandler tch = new TestContentHandler(bos);

 SAXParser sp = new SAXParser();
sp.setFeature("http://apache.org/xml/features/allow-java-encodings",true);
  sp.setContentHandler(tch);
  InputSource is = null;
is = new InputSource("/tmp/temp1.xml");
      sp.parse(is);

%>is encoding is: <%= is.getEncoding() %> xml: <br> <%= bos %><br>
now encoding is:  <%= is.getEncoding() %>

================================================================<%
}
{
File file = new java.io.File("/tmp/temp1.xml");
 DOMParser _dp = new DOMParser();
  InputStream is = new FileInputStream(file);

 _dp.parse(new InputSource(is));
  Document doc = _dp.getDocument() ;

 StringWriter sw = new StringWriter();
            TransformerFactory.newInstance().newTransformer().transform(new
                DOMSource(doc), new StreamResult(sw));
%> xml: <br> <%= sw %><br>

<%
}
%>
===================== end of jsp
TestContentHandler.java:

package com.xtenit.xml;

/**
 * TestContentHandler.java
 *
 *
 * Created: Mon Jan 13 19:59:00 2003
 *
 * @(protected) Joseph Shraibman
 * @(protected)
 */
import org.xml.sax.*;
import javax.xml.transform.stream.*;
import javax.xml.transform.sax.*;
import javax.xml.transform.*;
import org.apache.xerces.parsers.*;

public class TestContentHandler implements org.xml.sax.ContentHandler{

   public java.io.PrintStream out = System.out;

  public  void endDocument(){}
  public  void startElement(java.lang.String namespaceURI, java.lang.String localName,
                            java.lang.String qName, Attributes atts){
    StringBuffer sb = new StringBuffer();
    sb.append('<');
    if (namespaceURI != null && namespaceURI.length() > 0){
        sb.append(namespaceURI+':');
    }
    sb.append(localName);
    for ( int i = 0, atts_len = atts.getLength() ; i < atts_len ; i++ )
        sb.append(' ').append(atts.getLocalName(i)).append("=\"")
          .append(atts.getValue(i)).append('"');
    sb.append('>');
    out.print(sb.toString());
  }
 public  void characters(char[] ch, int start, int length){
     out.print(new String(ch,start,length));
 }
   public  void endElement(java.lang.String namespaceURI, java.lang.String localName,
                           java.lang.String qName){
       StringBuffer sb = new StringBuffer();
       sb.append("</");
       if (namespaceURI != null && namespaceURI.length() > 0 ){
           sb.append(namespaceURI+':');
       }
       sb.append(localName+">");
       out.print(sb.toString());
   }
   public  void endPrefixMapping(java.lang.String prefix){}
    public  void ignorableWhitespace(char[] ch, int start, int length){}
    public  void processingInstruction(java.lang.String target, java.lang.String data){}
    public  void setDocumentLocator(Locator locator){}
    public  void skippedEntity(java.lang.String name){
        if (true) out.println("DEBUG: skipped Entity: '"+name+"'");
    }
    public  void startDocument(){}
    public  void startPrefixMapping(java.lang.String prefix, java.lang.String uri){}

    public TestContentHandler (){

    }
      public TestContentHandler (java.io.ByteArrayOutputStream bos){
          out = new java.io.PrintStream(bos);
    }

  public static void main(String[] args)throws Exception{
      SAXParser sp = new SAXParser();
      TestContentHandler xc = new TestContentHandler();
      sp.setContentHandler(xc);
      String filename = args[0];
      InputSource is = null;
      if (filename.equals("-"))
          is = new InputSource(System.in); //use standard input
      else
          is = new InputSource(new java.io.FileReader(filename));
      sp.parse(is);
  }


}// TestContentHandler
===============================my xml file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<data>
      <firstname>Décio</firstname>
</data>

My Xerces version is 2.2.1. In my test the first case (SAXParser) does not
work but the last one (DOMParser) does.
You can see what the jsp looks like at http://xis.xtenit.com/temp.jsp
(except that it has an old version of TestContentHandler that puts colons in
the output).

Voytenko, Dimitry wrote:
> Hi Joseph,
>
> I'm afraid nobody will be able to answer this w/o seeing your code (the
> one with the SAXParser).
> So if you could send it (or just a fragment where you initialize SAXParser,
> start parsing and process the SAX events) it would be helpful.
> Plus, include Xerces version information.
>
> thanks,
> Dimitry
>
> -----Original Message-----
> From: Joseph Shraibman [mailto:jks@(protected)]
> Sent: Monday, January 20, 2003 11:58
> To: xerces-j-user@(protected)
> Subject: Re: encoding and saxparser
>
>
> neilg@(protected) wrote:
>
>>Hi Joseph,
>>
>>I had a feeling something like that might have been the case.  I'll bet
>>there's some difference in the way you're viewing the SAX output as
>>compared to the DOM output.
>
>
> No, I made sure that that isn't a problem.


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@(protected)
For additional commands, e-mail: xerces-j-user-help@(protected)


_____________________________________________________
Sector Data, LLC, is not affiliated with Sector, Inc., or SIAC
