IT Services, University of Oxford

1. Why might you need an ODD?

  • You need to define an XML schema to describe your resource
  • You need to provide documentation about
    • the semantics of your XML schema
    • constraints, usage notes, examples
  • You need to keep the two in step
  • You want to share the results
    • with others
    • with yourself, long term
  • You don't want to reinvent the wheel

2. ODD: the basic notion

One Document Does it all

A special XML vocabulary for defining:
  • schemas
  • XML element types independent of a particular schema language
  • public or private groups of such elements
  • patterns (macros)
  • classes (and subclasses) of element
And also for defining references which can pull into a schema:
  • named components from the above list
  • objects from other namespaces

It is also closely integrated with a set of traditional document markup elements.

3. Basic ODD components for schema definition

<schemaSpec>
Defines and identifies a schema
<elementSpec>
Provides some or all of an element specification, new or existing
<elementRef>
References an existing element specification
<classSpec>, <classRef>
Likewise, for classes
<attDef>, <attRef>
Likewise, for attributes
<moduleRef>
References an existing ‘module’, i.e. a group of predefined elements and attributes, in whole or in part
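
As a rough sketch of what "in whole or in part" can mean for <moduleRef>: the fragment below assumes the TEI's own module names (tei, core, figures) and uses the include and except attributes to select elements. The ident value is invented, and the fragment only illustrates the attributes; it is not claimed to be a complete, compilable schema.

```xml
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0" ident="moduleRefSketch">
 <!-- the whole of a module -->
 <moduleRef key="tei"/>
 <!-- only the named elements from a module -->
 <moduleRef key="core" include="p list item"/>
 <!-- everything in a module apart from the named elements -->
 <moduleRef key="figures" except="formula notatedMusic"/>
</schemaSpec>
```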

(We discuss documentary components of an ODD later)

4. A simple example

Our markup uses a <book> element, which contains a mixture of <para>s and <picture>s. We have never heard of the TEI and we don't want to use it.

<schemaSpec ns="" start="book" ident="bookSchema">
 <elementSpec ident="book">
  <desc>Root element for a very simple schema</desc>
  <content>
   <alternate maxOccurs="unbounded">
    <elementRef key="para"/>
    <elementRef key="picture"/>
   </alternate>
  </content>
 </elementSpec>
<!-- ... continues on next slide -->
</schemaSpec>

5. A simple example, contd.


<!-- ... contd -->
<elementSpec ident="para">
 <desc>paragraph of running text</desc>
 <content>
  <textNode/>
 </content>
</elementSpec>
<elementSpec ident="picture">
 <desc>empty element pointing to a graphic file</desc>
 <content/>
 <attList>
  <attDef ident="href">
   <desc>supplies the URI of the object pointed at</desc>
   <datatype>
    <rng:data type="anyURI"/>
   </datatype>
  </attDef>
 </attList>
</elementSpec>
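
For orientation, here is a sketch of a document that should be valid against the schema defined on these two slides. It assumes the schema uses no namespace; the text and the file name in href are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical instance document: text and file name are made up -->
<book>
 <para>Some running text introducing the first picture.</para>
 <picture href="images/frontispiece.png"/>
 <para>More running text.</para>
</book>
```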

6. So what?

  • We can now build a schema in RELAX NG, W3C schema, or DTD language by a simple XSLT transformation
  • We can also extract documentary fragments (e.g. the descriptions of elements and attributes)
TEI provides a special element for the latter purpose:
<specList>
 <specDesc key="para"/>
 <specDesc key="picture"/>
</specList>
which would generate something like
<para>
textual element in a very simple schema (may have pictures in it)
<picture>
Empty element to point at a picture inside our running text
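
To give an idea of the first possibility (schema generation), the following is a hand-written RELAX NG grammar roughly equivalent to the three element specifications above. It is a sketch only: the output actually produced by the TEI stylesheets will differ in naming and layout, and we assume here that href is optional.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
  datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
 <start>
  <ref name="book"/>
 </start>
 <!-- book: one or more para or picture elements, in any order -->
 <define name="book">
  <element name="book">
   <oneOrMore>
    <choice>
     <ref name="para"/>
     <ref name="picture"/>
    </choice>
   </oneOrMore>
  </element>
 </define>
 <!-- para: text only -->
 <define name="para">
  <element name="para">
   <text/>
  </element>
 </define>
 <!-- picture: empty, with an (assumed optional) href attribute -->
 <define name="picture">
  <element name="picture">
   <optional>
    <attribute name="href">
     <data type="anyURI"/>
    </attribute>
   </optional>
   <empty/>
  </element>
 </define>
</grammar>
```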

7. Let's try this out ...

  • Start oXygen
  • Make a new document (CTRL-N) using the TEI-P5 -> ODD Customization Framework Templates
  • Replace the proposed <schemaSpec> with the content of the file oddex-1.xml; add the content of the file oddex-1-doc.xml before it; save the result as oddex-1.odd
  • Use built-in Transformation Scenarios TEI ODD to RELAX NG XML and TEI ODD to HTML to generate a schema and its documentation
  • Save the generated schema file as oddex-1.rng ; view the displayed documentation
  • Open the test file oddex-1-test.xml and associate it with the generated schema; validate the file.

8. Defining a model class

In the real world, the elements that can appear inside a <book> are likely to be many and various. It's convenient therefore to have a way of talking about all of them: in ODD, we say that all such elements are members of a model class.

We use the <classes> element to record an element's membership in a class:
<elementSpec ident="para">
<!-- ... -->
 <classes>
  <memberOf key="bookPart"/>
 </classes>
<!-- ... -->
</elementSpec>

And for completeness, here's a definition for the bookPart class.

<classSpec ident="bookPart" type="model">
 <desc>the elements of this class all represent top-level parts of a book</desc>
</classSpec>

9. Using a model class

Rather than say that a <book> contains <para> elements (and other things), we can now say that it contains members of the bookPart class.

<elementSpec ident="book">
 <desc>Root element for a very simple schema</desc>
 <content>
  <classRef key="bookPart" minOccurs="1" maxOccurs="unbounded"/>
 </content>
</elementSpec>

(When we realise that books can also contain <list>s this will save time!)
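
A sketch of that saving, assuming we invent a <list> element containing <item>s (neither is defined in the exercise files): the new element simply declares its membership of bookPart, and the specification for <book> does not change.

```xml
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0" ns="" start="book" ident="bookSchema">
 <!-- ... existing specifications for book, para, picture, bookPart unchanged ... -->
 <!-- hypothetical addition: a list is allowed wherever bookPart members are -->
 <elementSpec ident="list">
  <desc>a sequence of items</desc>
  <classes>
   <memberOf key="bookPart"/>
  </classes>
  <content>
   <elementRef key="item" minOccurs="1" maxOccurs="unbounded"/>
  </content>
 </elementSpec>
 <!-- hypothetical addition: items contain text only -->
 <elementSpec ident="item">
  <desc>one entry in a list</desc>
  <content>
   <textNode/>
  </content>
 </elementSpec>
</schemaSpec>
```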

10. Defining an attribute class

In the real world, it's also likely that several elements will have the same attributes. It's convenient therefore to define them once only: in ODD we say all elements with some attributes in common are members of an attribute class, which we define like this:
<classSpec ident="pointing" type="atts">
 <desc>elements of this class all have an href attribute</desc>
 <attList>
  <attDef ident="href">
   <desc>supplies a URI for the object pointed at</desc>
   <datatype>
    <rng:data type="anyURI"/>
   </datatype>
  </attDef>
 </attList>
</classSpec>
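
An element then acquires the attribute by joining the class, in the same way as for a model class. A minimal sketch, assuming <picture> drops its own <attList> and is also a member of the bookPart class introduced earlier:

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="picture">
 <desc>empty element pointing to a graphic file</desc>
 <classes>
  <!-- model class: says where the element may appear -->
  <memberOf key="bookPart"/>
  <!-- attribute class: supplies the href attribute -->
  <memberOf key="pointing"/>
 </classes>
 <content/>
</elementSpec>
```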

11. Test your understanding

  • Open the file oddex-2.odd with oXygen and compare it with oddex-1.odd
  • Generate a schema from it and make sure that the test file oddex-1-test.xml is still valid
  • Check that you understand how the class references are being used.

12. Controlling attribute values

  • The value of an attribute can be specified just by referring to an externally defined datatype such as anyURI or ID (these are datatypes defined by W3C XML Schema)
  • We can also supply and document our own list of required or recommended values using the <valList> element
For example...
<classSpec ident="bookAtts" type="atts">
 <desc>this class defines the attributes that can appear on any element inside a
   book</desc>
 <attList>
  <attDef ident="xml:id">
   <desc>provides a unique identifier for an element</desc>
   <datatype>
    <rng:data type="ID"/>
   </datatype>
  </attDef>
  <attDef ident="status">
   <desc>indicates the correction status of this element </desc>
   <valList>
    <valItem ident="red"/>
    <valItem ident="green"/>
    <valItem ident="unknown"/>
   </valList>
  </attDef>
 </attList>
</classSpec>
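
Whether the listed values are required or merely recommended depends on the type attribute of <valList>. As we understand the TEI defaults, a list with no type is treated as open (suggestions only); the sketch below shows a closed variant in which only the listed values should be accepted. The <desc> texts for the individual values are our own guesses, added to show where such documentation would go.

```xml
<attDef xmlns="http://www.tei-c.org/ns/1.0" ident="status" usage="opt">
 <desc>indicates the correction status of this element</desc>
 <!-- type="closed": only these values are permitted -->
 <valList type="closed">
  <valItem ident="red">
   <desc>(illustrative gloss) not yet corrected</desc>
  </valItem>
  <valItem ident="green">
   <desc>(illustrative gloss) corrected and checked</desc>
  </valItem>
  <valItem ident="unknown">
   <desc>(illustrative gloss) status not recorded</desc>
  </valItem>
 </valList>
</attDef>
```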

13. Test your understanding

  • The preceding attribute class definition is available in your file oddex-3.xml. Add it into your oddex-2.odd file
  • Provide appropriate <memberOf> elements for the elements <para> and <pointer> to make them both members of the bookAtts class
  • Generate a schema and check that the oddex-1-test.xml file is still valid against this version of the schema.
  • Check that oXygen now permits the attributes xml:id and status. What values can be used for them?

14. What else might you want to say about your elements?

  • Additional glosses and descriptions, perhaps in different languages
  • Usage examples
  • More sophisticated constraints
    • complex content models
    • contextual dependencies

Plus other documentary features: versioning, cross-references, ontological mappings ...

15. Alternative descriptions and glosses

<elementSpec ident="para">
 <gloss>paragraph</gloss>
 <desc>marks paragraphs in prose.</desc>
 <desc xml:lang="zh-tw">標記散文的段落。</desc>
 <desc xml:lang="ja"> 散文の段落を示す. </desc>
 <desc xml:lang="fr">marque les paragraphes dans un texte en prose.</desc>
 <desc xml:lang="es">marca párrafos en prosa.</desc>
 <desc xml:lang="it">indica i paragrafi in prosa</desc>
<!-- ... -->
</elementSpec>

16. Usage examples

Documenting an XML schema requires the inclusion of examples in XML. If your documentation is also in XML, you need to be a little devious. There are three possible approaches:
  • Hide everything within a CDATA marked section
  • Escape everything using entity references
  • Use a different namespace

The last has the great advantage that you can validate your examples against an XML schema.

17. Examples

<eg><![CDATA[<p>A paragraph</p> ]]></eg>
<eg>
 <code lang="XML">&lt;p>A paragraph&lt;/p></code>
</eg>
<egXML xmlns="http://www.tei-c.org/ns/Examples">
 <p>A paragraph</p>
</egXML>
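
Inside an element specification, such an example is normally wrapped in an <exemplum> element. A sketch for our <para> element, using the namespace approach (the example text is invented):

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="para">
 <desc>paragraph of running text</desc>
 <content>
  <textNode/>
 </content>
 <exemplum>
  <!-- the example lives in the Examples namespace, so it can be validated separately -->
  <egXML xmlns="http://www.tei-c.org/ns/Examples">
   <para>A paragraph of running text.</para>
  </egXML>
 </exemplum>
</elementSpec>
```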

18. More sophisticated constraints

  • We can define the legal content of an element using ‘pure ODD’ constructs
  • Alternatively we can use RELAX NG directly
  • Content can be further constrained by means of a <valList> element ...
  • ... or by means of a <datatype> element (which uses RELAX NG)
  • Contextual dependencies can be expressed by means of <constraint> elements (which use e.g. ISO Schematron)

We will introduce these possibilities gradually!
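
As a foretaste of the last possibility, here is a sketch of a contextual rule expressed in ISO Schematron. The rule itself (a picture must follow at least one para) and the ident value are invented for illustration; the value of the scheme attribute may need to be isoschematron with older versions of the TEI stylesheets.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
  xmlns:sch="http://purl.oclc.org/dsdl/schematron" ident="picture">
 <desc>empty element pointing to a graphic file</desc>
 <content/>
 <!-- hypothetical rule: a picture may not come before the first para -->
 <constraintSpec ident="pictureAfterPara" scheme="schematron">
  <constraint>
   <sch:rule context="picture">
    <sch:assert test="preceding-sibling::para">A picture must be preceded
      by at least one para.</sch:assert>
   </sch:rule>
  </constraint>
 </constraintSpec>
</elementSpec>
```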

19. Defining a content model in pure ODD

The <content> element can contain

  • References to other elements <elementRef>
  • References to classes of element <classRef>
  • Alternations of the foregoing <alternate>
  • Sequences of the foregoing <sequence>
  • Interleaved instances of the foregoing <interleave> (Warning: this is not yet implemented)

Attributes minOccurs and maxOccurs can be used to control repetition
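
A sketch combining these constructs: a <book> consisting of an optional <title> followed by one or more <para> or <picture> elements. The <title> element is hypothetical and would need its own <elementSpec>.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="book">
 <desc>Root element for a very simple schema</desc>
 <content>
  <sequence>
   <!-- hypothetical element: at most one title, at the start -->
   <elementRef key="title" minOccurs="0" maxOccurs="1"/>
   <!-- then one or more paras and pictures, in any order -->
   <alternate minOccurs="1" maxOccurs="unbounded">
    <elementRef key="para"/>
    <elementRef key="picture"/>
   </alternate>
  </sequence>
 </content>
</elementSpec>
```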

20. Is your journey really necessary?

The TEI defines elements very like yours. Why not use the TEI?

<schemaSpec
  source="http://www.tei-c.org/release/xml/tei/odd/p5subset.xml"
  start="div"
  ident="teiBook">

 <elementRef key="div"/>
 <elementRef key="p"/>
 <elementRef key="graphic"/>
 <elementRef key="figure"/>
 <moduleRef key="tei"/>
</schemaSpec>

The <moduleRef> here provides definitions for the TEI infrastructure, notably the classes and datatypes used throughout every TEI schema. Apart from that we just need to specify the TEI elements we want to use, by means of an <elementRef>.
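
A document intended for this schema might look like the sketch below. It assumes the root is <div> (as given by the start attribute) and that everything is in the TEI namespace; the text and file name are invented, and we have not validated it against a generated schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical TEI instance: text and file name are made up -->
<div xmlns="http://www.tei-c.org/ns/1.0">
 <p>Some running text introducing the first picture.</p>
 <!-- the graphic is wrapped in a figure rather than placed directly between paragraphs -->
 <figure>
  <graphic url="images/frontispiece.png"/>
 </figure>
 <p>More running text.</p>
</div>
```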

21. Constructing a TEI ODD

  • Open the file oddex-3.odd and compare it with the previous versions
  • Compile it as before, and use it to validate the TEI file oddex-3-test.xml
  • Note that a TEI document must use the TEI namespace
  • Note also that TEI concepts don't always overlap exactly with our initial model (e.g. a <graphic> cannot appear between <p> elements)

We'll look at the TEI use of ODD in more detail this afternoon



Lou Burnard
Copyright University of Oxford