for XIncludeHandler.handleIncludeElement included, was Re: xinclude funn 2003-08-27 - By Neil Pitman
Hi Bob,
Thanks for your counsel. Indeed, it was my use of shortcuts instead of correct URIs that lay at the root of my problem. Everything fell into place with a "memory:" in front of the URIs.
The org.apache.xerces.xinclude.XIncludeHandler.handleIncludeElement is still a problem, though. Here is the patch that I implemented to make it work. What is the protocol for submitting it for due consideration? This is definitely not production quality (I don't know how to handle the possible exception), but it illustrates the issue.
1009,1015c1009,1021
<         XMLResourceIdentifier resourceIdentifier =
<             new XMLResourceIdentifierImpl(
<                 null,
<                 href,
<                 fDocLocation.getBaseSystemId(),
<                 null);
<
---
>         // XMLResourceIdentifier resourceIdentifier=
>         XMLResourceIdentifier resourceIdentifier;
>         try {
>             resourceIdentifier =
>                 new XMLResourceIdentifierImpl(
>                     null,
>                     href,
>                     fDocLocation.getBaseSystemId(),
>                     XMLEntityManager.expandSystemId(href, fDocLocation.getBaseSystemId(), false));
>         } catch (MalformedURIException e1) {
>             throw new XIncludeFatalError ("who knows",new Object[0]);
>         }
>         // null);
1044a1051
>         fChildConfig.setEntityResolver(fEntityResolver);
_________________________________________
Neil Pitman
neil.pitman@(protected)
+1.514.863.5465
ICQ#: 21101052
_________________________________________

----- Original Message -----
From: Bob Foster
To: xerces-j-user@(protected) ; Neil Pitman
Sent: Saturday, August 23, 2003 3:10 PM
Subject: Re: xinclude funnies
Hi Neil,
I can't give you any answers about handleIncludeElement and the null ids, though it sounds odd.
Your other troubles seem to come from not using URIs properly. Unless Xerces is misbehaving (and it's been broken before in this way, so it is a possibility) it will construct an absolute URI from the base URI of the document being parsed and the relative paths you are using in your XInclude directive. (If you don't like what it does with relative paths, use absolute paths. An absolute URI always has a scheme. But I'll go on with the relative URI example.) If Xerces has a null base URI, it picks some pseudo-random directory to be relative to - never one you might like - so the moral of this story is: make sure it knows the document location.
If you supply the EntityResolver, then you can make the base URI any well-formed thing you want. In particular, you can invent your own scheme for your EntityResolver to interpret. For example, if you give it "memory:docname" as a base URI and a relative URI in the XInclude of part1/subpart1/abc.xml, your resolver should get "memory:part1/subpart1/abc.xml" as the URI to resolve, and so on.
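A minimal sketch of the kind of resolver described above, assuming documents are held in a map keyed by their full "memory:" system id. The class name MemoryEntityResolver and its put() method are made up for illustration; only the org.xml.sax types come from the standard SAX API. It is not taken from Neil's or Bob's code.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver: serves XML text from memory, keyed by system id.
class MemoryEntityResolver implements EntityResolver {
    private final Map<String, String> documents = new HashMap<>();

    // Register a document under the system id the parser is expected to ask for.
    void put(String systemId, String xml) {
        documents.put(systemId, xml);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String xml = (systemId == null) ? null : documents.get(systemId);
        if (xml == null) {
            return null; // unknown key: let the parser use its default handling
        }
        InputSource source = new InputSource(new StringReader(xml));
        // Keep the key as the system id so nested includes have a base to resolve against.
        source.setSystemId(systemId);
        return source;
    }

    public static void main(String[] args) {
        MemoryEntityResolver resolver = new MemoryEntityResolver();
        resolver.put("memory:part1/subpart1/abc.xml", "<abc/>");
        InputSource hit = resolver.resolveEntity(null, "memory:part1/subpart1/abc.xml");
        System.out.println(hit.getSystemId());
    }
}
```

Such a resolver would be registered through the usual SAX route (XMLReader.setEntityResolver), and the top-level InputSource would need its own system id set so that the parser has a non-null base URI.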
Bob Foster

----- Original Message -----
From: Neil Pitman
To: xerces-j-user@(protected)
Sent: Saturday, August 23, 2003 8:35 AM
Subject: xinclude funnies
I'll preface this with "I'm a bit new to digging around in Xerces and XInclude". (Xerces always worked, but then I wasn't using new, beta features.)
I'm trying to make Saxon 7.6.5 (XSLT) work with Xerces 2.5.0 (the tarball, not recent CVS) using XInclude and SAX with my own XMLEntityResolvers/EntityResolvers. Saxon has its own issues, but that's for another mailing list. Once upon a time, I would preprocess my files with Elliotte Rusty Harold's XIncluder from SourceForge into a separate XML file. Now I'd like to stream it in situ. (That means the input files are understandable by elharo's implementation, so they are less suspect.) The trick is to change the OS-based file references into arbitrary key references.
Here is what I understand. My questions follow.
Soon after hitting the include element, org.apache.xerces.xinclude.XIncludeHandler.handleIncludeElement(XMLAttributes attributes) is called. It creates an XMLResourceIdentifier with an explicit null public id and an explicit null expanded system id. (The literal system id is retrieved from the href and represents a relative "file". This is what I used to use with XIncluder.) When it determines that there is, indeed, a resolver, it calls the EntityResolverWrapper holding my resolver. First the wrapper checks that the public id and the system id (the expanded one) are not null. They are both null, so it exits immediately.
I "fixed" this using XMLEntityManager.expandSystemId() to produce the expanded system id in handleIncludeElement where the XMLResourceIdentifier is first created.
Now, with the EntityResolverWrapper getting a reasonable system id, my resolver gets a reasonable id and attempts to load the first xinclude. The system ids are now a mixture of my keys and file bases. My entity resolver is completely memory-based, so the file-based URIs are confusing. For example, in the old system running from the file system there were three files:
file:///c:/work/proj/main.xml
file:///c:/work/proj/part1/subpart1/abc.xml
file:///c:/work/proj/part1/subpart1/helper.xml
With the elharo xincluder, main.xml had <xi:include href="part1/subpart1/abc.xml"/> and abc.xml contained <xi:include href="helper.xml"/>. My resolver receives file:///c:/home/npitman/part1/subpart1/abc.xml. It gets the "file:///c:/home/npitman/" part from the running location of the application. XMLEntityManager noticed that the base system id of "main.xml" was null, so it assumed that it would need a real URI and that this should be based on the current working directory. In the memory-based situation, the hrefs are not so much URIs as keys. I'm expecting just "part1/subpart1/abc.xml".
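The fallback described above can be modelled roughly as follows. This is a sketch using java.net.URI and the user.dir system property, not the actual XMLEntityManager.expandSystemId code; the class name WorkingDirFallback is invented for the example.

```java
import java.io.File;
import java.net.URI;

// Rough model (not Xerces code): a null base system id falls back to the working directory.
class WorkingDirFallback {
    static String expand(String baseSystemId, String href) {
        URI base = (baseSystemId == null)
                ? new File(System.getProperty("user.dir")).toURI() // e.g. the directory the app was started from
                : URI.create(baseSystemId);
        return base.resolve(href).toString();
    }

    public static void main(String[] args) {
        // With no base, the result is a file: URI rooted at the current working directory.
        System.out.println(expand(null, "part1/subpart1/abc.xml"));
    }
}
```

Under this model the bare key "part1/subpart1/abc.xml" can never reach the resolver unchanged: once the base is null, some absolute location is substituted before the resolver is consulted.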
This is what I find in the literal system id. Unfortunately, this helps little because the first include xincludes a second. This has an href of "helper.xml". In my key system, I'd expect to see "part1/subpart1/helper.xml".
Questions:
1) What is going on in org.apache.xerces.xinclude.XIncludeHandler.handleIncludeElement? Setting the ids to null can't be right.
2) Is there a way to accept a blank base system id?
I'd like href="part1/subpart1/abc.xml" within "main.xml" to try to resolve "part1/subpart1/abc.xml" and href="helper.xml" within "part1/subpart1/abc.xml" to try to resolve "part1/subpart1/helper.xml".
3) Alternatively, is there an extension mechanism, like the EntityResolver, to externalize expandSystemId()?
I suppose that the fallback would be to set the base system id of main.xml to an arbitrary scheme like "npitman://" and then look up "npitman://part1/subpart1/abc.xml" and "npitman://part1/subpart1/helper.xml".
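To illustrate how the nested keys would come out under such a made-up scheme, here is a small sketch using java.net.URI as a stand-in for Xerces' own expansion logic. Under generic URI syntax the first segment after "scheme://" is read as the authority, so the sketch assumes a placeholder authority ("store") in front of the keys. The scheme, the authority, and the class name KeyResolution are all hypothetical.

```java
import java.net.URI;

// Sketch: resolving XInclude hrefs against a custom-scheme base with java.net.URI.
class KeyResolution {
    // Resolve an href against the system id of the document that contains it.
    static String expand(String baseSystemId, String href) {
        return URI.create(baseSystemId).resolve(href).toString();
    }

    public static void main(String[] args) {
        String main = "memory://store/main.xml"; // hypothetical base system id for main.xml
        String abc = expand(main, "part1/subpart1/abc.xml");
        String helper = expand(abc, "helper.xml"); // nested include, relative to abc.xml
        System.out.println(abc);
        System.out.println(helper);
    }
}
```

The nested href "helper.xml" is resolved against the system id of abc.xml rather than that of main.xml, so it lands under part1/subpart1/.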
Thanks for your patience in reading.
_________________________________________
Neil Pitman
neil.pitman@(protected)
+1.514.863.5465
ICQ#: 21101052
_________________________________________