Commented: (XERCESJ-1066) Restriction+choice+substitutionGroup error

2006-09-28       - By Sandy Gao (JIRA)

   [ http://issues.apache.org/jira/browse/XERCESJ-1066?page=comments#action_12438500 ]

Sandy Gao commented on XERCESJ-1066:
------------------------------------

[Problem analysis]

First of all, this bug is *not* a duplicate of 1032. After applying the patch
provided in 1032, the 1032 test schema passes, but the schema attached to 1066
still fails.

There are 2 problems in Xerces' current implementation. The first one is, as
Lucian correctly pointed out in 1032, that the order of sub-group-expansion is
not specified (more a problem in the spec, as I mentioned in the first comment
to this bug).

The second problem (what's really causing 1066) is that "pointless particle
removal" happens *before* sub-group-expansion, as opposed to *after*, as
specified in the spec.

To be more specific about point 2 (and to correct what I said in the first
comment), for the schema attached above, removal followed by expansion gives:
Base = ((X|X1|X2|X3)|Y)*
Restriction = (X1|X2|Y)*

Note that Base has nested choice groups and Restriction doesn't. Now when the
"RecurseLax" rule is invoked, the 3 particles in Restriction need to map to 2
particles in the Base. Never possible, hence rejected.
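
To make the mismatch concrete, here is a minimal Python sketch of an
order-preserving mapping check. It is not the Xerces code: the encoding (a
string for an element, a tuple for a nested choice) and the names "restricts"
and "recurse_lax" are assumptions for illustration, occurrence ranges are
ignored, and each base particle is assumed to be usable at most once.

```python
def restricts(r, b):
    """Toy test: does restriction particle r fit base particle b?"""
    if isinstance(b, tuple):
        # b is a nested choice: accept an element it lists, or a sub-choice
        return r in b if isinstance(r, str) else set(r) <= set(b)
    return r == b


def recurse_lax(restriction, base):
    """Map each restriction particle to a later, unused base particle."""
    j = 0
    for r in restriction:
        # skip base particles that r cannot map to
        while j < len(base) and not restricts(r, base[j]):
            j += 1
        if j == len(base):
            return False  # ran out of base particles
        j += 1  # assumption: a base particle is consumed once used
    return True


# Shapes from the analysis above (the outer '*' is left out)
nested_base = (("X", "X1", "X2", "X3"), "Y")   # 2 particles
flat_restriction = ("X1", "X2", "Y")           # 3 particles

print(recurse_lax(flat_restriction, nested_base))
```

With the nested base, X1 uses up the only particle X2 could map to, so the
check fails. With a flat base such as ("X", "X1", "X2", "X3", "Y") the same
restriction maps cleanly.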

So to me, the right fix needs to contain 2 parts:
1. expand sub-groups *before* pointless particle removal
2. disregard ordering for choices resulting from sub-group expansion.
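
Why the order in part 1 matters can be sketched in a few lines of Python. This
is a toy model, not the Xerces implementation: the SUBST table and the
functions "expand" and "remove_pointless" are hypothetical, choices are plain
tuples, and every nested choice is assumed to have minOccurs=maxOccurs=1 so
that it counts as pointless.

```python
# Hypothetical substitution-group table: head -> members
SUBST = {"X": ["X1", "X2", "X3"]}


def expand(choice):
    """Replace each substitution-group head by a nested choice of its group."""
    out = []
    for p in choice:
        if isinstance(p, tuple):
            out.append(expand(p))
        elif p in SUBST:
            out.append((p, *SUBST[p]))
        else:
            out.append(p)
    return tuple(out)


def remove_pointless(choice):
    """Flatten nested choices into their parent choice."""
    out = []
    for p in choice:
        if isinstance(p, tuple):
            out.extend(remove_pointless(p))
        else:
            out.append(p)
    return tuple(out)


base = ("X", "Y")

# Removal first, then expansion: the nested choice survives
removal_first = expand(remove_pointless(base))

# Expansion first, then removal: the result is flat
expansion_first = remove_pointless(expand(base))

print(removal_first)
print(expansion_first)
```

Only the second ordering leaves Base flat, which is the shape the flattened
Restriction (X1|X2|Y) needs in order to map particle by particle.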

[Patch analysis - 1032]

The 1032 patch sorts particles resulting from sub-group expansion. This
partially fixes the ordering problem, but not completely. It works when both
base and restriction use sub-groups. After both are sorted, the "complete
mapping" rule can be applied. But it doesn't work when one of the types uses a
sub-group and the other doesn't, because the type that doesn't use a sub-group
may have elements in arbitrary order.
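
A small Python sketch of that limitation, assuming the mapping reduces to "the
restriction's element names appear in the base in the same relative order".
The helper "is_ordered_subset" and the element lists are made up for
illustration. They are not taken from the 1032 patch.

```python
def is_ordered_subset(restriction, base):
    """True if restriction's names occur in base in the same relative order."""
    remaining = iter(base)
    # 'name in remaining' advances the iterator up to and including the match
    return all(name in remaining for name in restriction)


# Both sides come from sub-group expansion, so both get sorted
sorted_base = sorted(["X", "X2", "X1"])
sorted_restriction = sorted(["X2", "X1"])

# Hand-written choice in the restriction: order is whatever the author typed
handwritten_restriction = ["X2", "X1"]

print(is_ordered_subset(sorted_restriction, sorted_base))
print(is_ordered_subset(handwritten_restriction, sorted_base))
```

The sorted pair maps, while the hand-written choice fails even though it
accepts a subset of what the base accepts.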

Knowing the sorting strategy may help schema designers: when writing choices,
try to sort them. This may or may not be appropriate for certain schema
authors/designs, and may or may not work for different languages.

Overall, 1032 is a safe fix: it improves things, though it doesn't fix the
problem entirely. I'm willing to apply it unless a better/more complete
solution is found.

[Patch analysis - 1066]

On the surface, Ignacio's patch works perfectly: both schemas from 1032 and
1066 are now accepted. But a careful look at the details reveals some rather
serious problems.

For the 1032 schema, this patch works because both sub-groups are turned into
this special MODELGROUP_SUBSTITUTIONGROUP and are handled specially (without
worrying about the order).

For the 1066 schema, it works because it treats X1 (the element) as restricting
the sub-group (X|X1|X2|X3) and X2 as restricting (X|X1|X2|X3) again. I would
consider this as "works by luck". :-)

It's "luck" because there are some schemas (valid and invalid) for which this
patch will give the wrong answer.

Case 1:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:restrict" xmlns="urn:restrict"
           elementFormDefault="qualified"
           attributeFormDefault="unqualified">

 <xsd:element name="X"/>
 <xsd:element name="X1" substitutionGroup="X"/>

 <xsd:complexType name="base">
   <xsd:sequence>
     <xsd:element ref="X" minOccurs="0"/>
   </xsd:sequence>
 </xsd:complexType>

 <xsd:complexType name="restriction">
   <xsd:complexContent>
     <xsd:restriction base="base">
       <xsd:choice minOccurs="0">
         <xsd:element ref="X1"/>
       </xsd:choice>
     </xsd:restriction>
   </xsd:complexContent>
 </xsd:complexType>

</xsd:schema>

After expansion and removal,
Base = (X|X1|)?
Restriction = (X1|)?

(The last '|' is just to indicate it's a choice.)

The spec is clear that this is a valid restriction (RecurseLax). But the 1066
patch would reject it, because now dType=choice and bType=subgroup, which is
not handled by the big switch.

Case 2:

Similar to Case 1, but with the <choice> in "restriction" changed to
<sequence>. It should still be valid (MapAndSum), but again the 1066 patch
rejects it, for the same reason.

Case 3:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:restrict" xmlns="urn:restrict"
           elementFormDefault="qualified"
           attributeFormDefault="unqualified">

 <xsd:element name="X"/>
 <xsd:element name="Y"/>
 <xsd:element name="Z"/>

 <xsd:complexType name="base">
   <xsd:choice>
     <xsd:choice minOccurs="0">
       <xsd:element ref="X"/>
       <xsd:element ref="Y"/>
     </xsd:choice>
     <xsd:element ref="Z"/>
   </xsd:choice>
 </xsd:complexType>

 <xsd:complexType name="restriction">
   <xsd:complexContent>
     <xsd:restriction base="base">
       <xsd:choice>
         <xsd:choice>
           <xsd:element ref="X"/>
           <xsd:element ref="Y"/>
         </xsd:choice>
         <xsd:element ref="Z"/>
       </xsd:choice>
     </xsd:restriction>
   </xsd:complexContent>
 </xsd:complexType>

</xsd:schema>

However valid this looks (I just changed something from optional to
mandatory), this is actually invalid in schema 1.0. (Note that schema 1.1 [1]
plans to replace the entire Particle Valid (Restriction) rule with a supposedly
simple statement: it's a restriction as long as it accepts a subset.)

Base = ((X|Y|)?|Z)
Restriction = (X|Y|Z)

"X" and "Y" in restriction cannot both map to (X|Y) in base, because
RecurseLax requires an *order-preserving* mapping. So this should be invalid,
but the 1066 patch says it's valid. What causes the wrong result here is the
same change that was introduced to make the original 1066 test case happy,
namely the change in the method "checkRecurseLax" to reuse the base particle.
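
The effect of reusing a base particle can be shown with a toy Python model.
The "reuse_base" flag below is only my way of modelling that change. The
function names and the tuple encoding are assumptions, not the actual
"checkRecurseLax" code, and occurrence ranges are ignored.

```python
def restricts(r, b):
    """Toy test: element r fits base particle b (an element or a choice)."""
    return r in b if isinstance(b, tuple) else r == b


def recurse_lax(restriction, base, reuse_base=False):
    """Order-preserving mapping of restriction particles onto base particles."""
    j = 0
    for r in restriction:
        while j < len(base) and not restricts(r, base[j]):
            j += 1
        if j == len(base):
            return False
        if not reuse_base:
            j += 1  # strict reading: a base particle is used at most once
    return True


# Case 3 after removal: Base = ((X|Y)?|Z), Restriction = (X|Y|Z)
case3_base = (("X", "Y"), "Z")
case3_restriction = ("X", "Y", "Z")

print(recurse_lax(case3_restriction, case3_base))                   # strict
print(recurse_lax(case3_restriction, case3_base, reuse_base=True))  # reuse
```

Under the strict reading Case 3 is rejected, as it should be. Once the base
particle may be reused, both X and Y map to (X|Y) and the invalid restriction
is accepted. The same flag is what lets the original 1066 shapes pass.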

Though it doesn't work like a charm, this patch actually involves some creative
thinking and is somewhat similar to some of my thoughts back in 2005 when 1066
was first opened (see below). Thanks for the effort and do keep trying. I
sincerely hope that you beat me in finding the *perfect* solution. :-)

[My Attempts]

My first attempt in 2005 was similar to your approaches in several respects. I
marked sub-group choices as special, and had a special method to handle the
RecurseLax case when either choice came from a sub-group. This does a better
job than the 1032 patch, because the special method discards order entirely,
instead of using a specific order.

This attempt would have fixed 1032, but my focus was 1066 and it didn't work
for 1066, because of the reason I mentioned earlier: expansion happened after
removal.

My second attempt was to move the expansion to happen before removal, but I
encountered a big problem: expansion and removal don't seem to work together
happily. Consider a choice

(A|B|C|D)

where B has X in its sub-group and D has Y in its sub-group. After
expansion/removal, it becomes

(A|B|X|C|D|Y)

Now we have to remember that the order between B and X doesn't matter, neither
does that between D and Y. But the order does matter between A and B/X and so
on.
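
The bookkeeping this would need can be pictured with a short Python sketch. It
is not a proposed fix, only an illustration under assumptions: the flattened
choice is kept as a list of unordered groups (GROUPS below), and the helper
"maps_in_order" is hypothetical.

```python
# (A|B|X|C|D|Y) with the sub-group origin remembered:
# order matters between groups, not inside a group
GROUPS = [{"A"}, {"B", "X"}, {"C"}, {"D", "Y"}]


def maps_in_order(restriction, groups):
    """Check order across groups while ignoring order inside each group."""
    index = {name: i for i, group in enumerate(groups) for name in group}
    last = -1
    seen = set()
    for name in restriction:
        if name not in index or name in seen:
            return False  # unknown element, or element mapped twice
        if index[name] < last:
            return False  # stepped back to an earlier group
        last = index[name]
        seen.add(name)
    return True


print(maps_in_order(["X", "B", "D"], GROUPS))  # X before B is fine
print(maps_in_order(["C", "A"], GROUPS))       # C before A is not
```

The sketch only shows that the origin of each particle has to survive
flattening. It says nothing about how to carry that information through the
rest of the restriction rules.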

This is where I stopped (it seemed too difficult to solve when no one was
pressing :p).


Ouch, it took almost an entire day to analyze the problems (again), look at
the patches, re-gather my thoughts from last year, and of course write this
long comment. I'm glad that I'm writing things down this time so that I don't
have to go through the same process again in the future. I will definitely
give it some more thought. The least we can do is to commit Lucian's patch (or
my attempt 1), or to make expansion happen before removal and then apply
Lucian's patch. Though not complete, the latter should make both test cases
from 1032 and 1066 happy.

[1] http://www.w3.org/TR/xmlschema11-1/#cos-content-act-restrict

> Restriction+choice+substitutionGroup error
> ------------------------------------------
>
>                 Key: XERCESJ-1066
>                 URL: http://issues.apache.org/jira/browse/XERCESJ-1066
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: XML Schema Structures
>    Affects Versions: 2.6.2
>         Environment: N/A
>            Reporter: Martin Thomson
>         Assigned To: Sandy Gao
>         Attachments: patch1.txt, patch2.txt
>
>
> When using a substitution group head in a choice, the head of the
> substitution group is not correctly treated as a choice.
> Given a choice of X and Y where X is the head of a group with the members X1,
> X2 and X3, the following SHOULD be true:
> Base = (X|Y)*
> ...according to clause 2.1 of Schema Component Constraint: Particle Valid
> (Restriction) <http://www.w3.org/TR/xmlschema-1/#cos-particle-restrict> this
> should be interpreted as:
> Base = ((X|X1|X2|X3)|Y)*
> Therefore the following should be a valid restriction, but Xerces does not
> allow it:
> Restriction = ((X1|X2)|Y)*
> I am aware that some simplification of the choices is required by clause 2.2
> of the above section, but this should not have the effect that it is.
> The following schema document demonstrates this:
> -----------------------------------------
> <?xml version="1.0"?>
> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>             targetNamespace="urn:restrict" xmlns="urn:restrict"
>             elementFormDefault="qualified"
>             attributeFormDefault="unqualified">
>   <xsd:complexType name="base">
>     <xsd:complexContent>
>       <xsd:restriction base="xsd:anyType">
>         <xsd:choice minOccurs="0" maxOccurs="unbounded">
>           <xsd:element ref="X"/>
>           <xsd:element ref="Y"/>
>         </xsd:choice>
>       </xsd:restriction>
>     </xsd:complexContent>
>   </xsd:complexType>
>   <xsd:element name="X"/>
>   <xsd:element name="Y"/>
>   <xsd:complexType name="restriction">
>     <xsd:complexContent>
>       <xsd:restriction base="base">
>         <xsd:choice minOccurs="0" maxOccurs="unbounded">
>           <xsd:choice>
>             <xsd:element ref="X1"/>
>             <xsd:element ref="X2"/>
>           </xsd:choice>
>           <xsd:element ref="Y"/>
>         </xsd:choice>
>       </xsd:restriction>
>     </xsd:complexContent>
>   </xsd:complexType>
>   <xsd:element name="X1" substitutionGroup="X"/>
>   <xsd:element name="X2" substitutionGroup="X"/>
> </xsd:schema>
