
<empty>, when child of <sequence> or <alternate>, not correctly processed #780

@sydb

See #263; although that issue is closed, I do not think the problem identified by @lb42 has been solved.

The <empty> element, as a member of model.contentPart, is permitted only as a child of <content>, <sequence>, or <alternate> (and, after TEIC/TEI#2538 is merged, <interleave>).

content/empty

I do not believe there is any controversy when <empty> is a child of <content>: this has been exercised, both in the Guidelines and in many people’s ODDs, dozens or hundreds of times. Since <content> is required to have one and only one child, <empty> has no siblings there, and so there is no sibling rivalry.

alternate/empty

But when <empty> is a child of <alternate>, its sibling seems to just beat it to death:

 <alternate minOccurs="1" maxOccurs="1">
   <empty/>
   <elementRef key="add" minOccurs="1" maxOccurs="1"/>
 </alternate>

This should produce an optional <add>: either ( empty | add ) or ( add )? or ( add? ) or perhaps even ( add | empty ). But what it actually produces is just ( add ), i.e. a required <add>, not an optional <add>. (Yes, I realize the correct effect can be obtained by using the much simpler <elementRef key="add" minOccurs="0" maxOccurs="1"/>, but that’s not the point.)
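For concreteness, here is a minimal sketch of one RELAX NG pattern (XML syntax) that would express the optional <add> described above. The pattern name `add` in the `<ref>` is an assumption for illustration; the names the Stylesheets actually emit may differ (for example, they may carry a prefix).

```xml
<!-- Sketch only: the pattern name "add" is assumed, not taken from real Stylesheets output. -->
<!-- A choice between nothing at all and one <add>, i.e. ( empty | add ). -->
<choice xmlns="http://relaxng.org/ns/structure/1.0">
  <empty/>
  <ref name="add"/>
</choice>
```

In RELAX NG this accepts the same content as wrapping the `<ref>` in `<optional>`, so any of the forms listed above would be acceptable output.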

sequence/empty

I have discovered at least one circumstance for which incorrect output is generated when <empty> is a child of <sequence>. Consider the following PureODD construction. While admittedly a bit off the beaten track, the intent is for a content model that allows either 0 <docDate> elements or 2 or more <docDate> elements — i.e., any number of <docDate>s except one; furthermore, if there are any <docDate>s there can also be global stuff with them.

 <content>
   <sequence minOccurs="0" maxOccurs="1">
     <empty/>
     <sequence minOccurs="2" maxOccurs="unbounded">
       <elementRef key="docDate" minOccurs="1" maxOccurs="1"/>
       <classRef key="model.global" minOccurs="0" maxOccurs="unbounded"/>
     </sequence>
   </sequence>
 </content>

However, the RELAX NG produced by these Stylesheets seems to completely lose the outer <sequence minOccurs="0" maxOccurs="1"> clause. I.e., the <empty> seems to commit not only suicide (which was expected), but parricide as well:

 (
   docDate, model.global*,
   docDate, model.global*,
   ( docDate, model.global* )*
 )
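For comparison, a sketch of a RELAX NG pattern (XML syntax) that I believe would express the intended model, i.e. the same group wrapped so that it may be absent altogether. The pattern names `docDate` and `model.global` are assumed for illustration and may not match what the Stylesheets emit; the <empty> is left out because it contributes nothing inside a group.

```xml
<!-- Sketch only: pattern names are assumptions, not real Stylesheets output. -->
<!-- <optional> stands for the outer <sequence minOccurs="0" maxOccurs="1">. -->
<optional xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- First required ( docDate, model.global* ) -->
  <group>
    <ref name="docDate"/>
    <zeroOrMore><ref name="model.global"/></zeroOrMore>
  </group>
  <!-- Second required ( docDate, model.global* ) -->
  <group>
    <ref name="docDate"/>
    <zeroOrMore><ref name="model.global"/></zeroOrMore>
  </group>
  <!-- Any number of further ( docDate, model.global* ) -->
  <zeroOrMore>
    <group>
      <ref name="docDate"/>
      <zeroOrMore><ref name="model.global"/></zeroOrMore>
    </group>
  </zeroOrMore>
</optional>
```

The only difference from the generated model is the enclosing `<optional>`, which is what permits zero <docDate> elements.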

If the outer <sequence> is changed to an <alternate>, the correct model is generated.
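A sketch of that variant, assuming the only change is the name of the outer element and that its minOccurs and maxOccurs attributes stay as they were:

```xml
<content>
  <!-- Outer <sequence> replaced by <alternate>; attributes assumed unchanged. -->
  <alternate minOccurs="0" maxOccurs="1">
    <empty/>
    <sequence minOccurs="2" maxOccurs="unbounded">
      <elementRef key="docDate" minOccurs="1" maxOccurs="1"/>
      <classRef key="model.global" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </alternate>
</content>
```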

It is possible these two problems are related, although I think it unlikely. (So it may be more convenient to split this into two issues.)

I plan to post an ODD that demonstrates these situations shortly.
