charset problem - UTF-8 2003-02-21 - By Scott Eade
I have had a brief scan of the mail archive and have not come across anything like this, but that said, I am not sure exactly where this problem might be coming from.
Here is what I have:

1. Some data in a MySQL database that contains "right single quotation marks" (Unicode U+2019), thanks to the content being pasted in from MS Word.
2. The data is included in a CDATA section in a jdom-b8 tree.
3. A jdom XMLOutputter created with the encoding set to UTF-8:
       XMLOutputter outputter = new XMLOutputter(" ", true, "UTF-8");
4. A HttpServletResponse with ContentType set to "text/xml; charset=UTF-8":
       HttpServletResponse response = whatever...;
       response.setContentType("text/xml; charset=UTF-8");
5. The Writer for the response is used to output the content:
       outputter.output(doc, response.getWriter());
       response.flushBuffer();
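To look at step 5 in isolation, here is a minimal sketch that uses only the JDK (no JDOM or servlet classes). It rests on an assumption worth checking against the JDOM documentation: when an outputter is handed a Writer, the Writer's own charset, not the encoding given to the outputter, decides which bytes are produced. The class name WriterEncodingDemo and the method encode are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JDOM or the servlet API.
class WriterEncodingDemo {

    // Push text through a Writer bound to the given charset and
    // return the bytes that actually reach the underlying stream.
    public static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Render bytes as space-separated hex for easy inspection.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String quote = "\u2019"; // right single quotation mark
        // A UTF-8 Writer can represent the character (three bytes).
        System.out.println(hex(encode(quote, StandardCharsets.UTF_8)));
        // An ISO-8859-1 Writer cannot, so it substitutes '?'.
        System.out.println(hex(encode(quote, StandardCharsets.ISO_8859_1)));
    }
}
```

If the Writer returned by response.getWriter() is not actually a UTF-8 Writer (for example because the content type was set after the Writer was obtained), the second case would apply regardless of the encoding passed to the XMLOutputter.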
Now the trouble is that the \u2019 characters do not seem to be written correctly to the output stream. I am expecting to see "’" as a replacement for these characters, but instead I am seeing the square block placeholder (platform is win2k).
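One way to narrow this down is to check whether the string read from the database really contains U+2019 before it goes into the JDOM tree. The following is only a diagnostic sketch using modern JDK methods; the class name CodePointDump and the method describe are hypothetical, and the sample string stands in for whatever the database returns.

```java
// Hypothetical diagnostic helper for inspecting string contents.
class CodePointDump {

    // List every code point in the string as U+XXXX, separated by spaces.
    public static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a value fetched from the database.
        String sample = "it\u2019s";
        System.out.println(describe(sample));
    }
}
```

If the dump shows something other than U+2019 at the position of the quotation mark (a question mark, or a control character such as U+0092), the damage would have happened before the XML output stage rather than in the outputter or the response Writer.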
I am at a loss as to what to try next. I have gone from jdom-b7 to jdom-b8, and from xercesj-1.3.0 to xercesj-2.0.2 to xercesj-2.3.0, and the problem persists.
Interestingly, some other characters are being correctly converted to their character entity references, but not consistently: within the same document they are sometimes left unconverted.
Any clues would be most welcome. I'll probably try the jdom list as well.
Thanks in advance for any replies.
Cheers,
Scott
--
Scott Eade
Backstage Technologies Pty. Ltd.
http://www.backstagetech.com.au
.Mac Chat/AIM: seade at mac dot com