IT Services, University of Oxford

1. Part 1

1.1. Before you start

In this exercise, you will use Roma, a web tool available from the TEI web site and usable with any web browser: Firefox, Chrome, Internet Explorer, Safari etc. Once you have created your schema, you will also need an XML-aware editor: oXygen in our case.

Our goal is to make a schema which we can use to mark up some pages of Punch magazine. We don't need all of TEI Lite, much less the full TEI, but we do need bits of various modules. We'll also have to tinker with some of the modules, to make our schema more helpful with daily editing.

You need some data files! Grab http://tei.oucs.ox.ac.uk/Talks/2011-06-18-odd/work.zip and unpack it somewhere you can find it again. We'll refer to this as the work directory whenever we mention files you need.

1.2. Your Material

If you run the oXygen XML editor and load the file ex1.xml, you will find some samples from volume 147 of Punch magazine. If it has started in ‘Author’ mode, click the word Text in the bottom left-hand corner. Don't worry too much about the content at the moment. Instead, note that the file is validating against the tei_all schema. This means that inside paragraphs you get lots of choices of elements, but you don't have a schema tailored to your needs. Move somewhere inside a paragraph and type < to get a dropdown list of all the elements available at this point. Do you think that we really need all of these elements in this case? (Delete the < when you are done.)

1.3. Making your own schema

  1. Open the Roma application, by pointing your favourite web browser at http://www.tei-c.org/Roma/ — if this is down, use the backup at http://tei.oucs.ox.ac.uk/Roma/.
  2. The Roma start screen allows you to create a new customization, or to upload an existing customization for further work. We will start from scratch, which means ticking the first radio button ("Build schema (Create a new customisation by adding elements and modules to the smallest recommended schema)"). Press the Start button at bottom left of the screen to continue.
The next and subsequent screens show you a row of tabs which you can select to carry out a part of the customization process:
    • New: start making a new customization
    • Customize: set the name, description etc. for the customization and also the interface language
    • Language: set the languages to be used for names in the generated schema and for the documentation
    • Modules: select the TEI modules you wish to operate on
    • Add Elements: define new (non-TEI) elements for use in the generated schema
    • Change Classes: modify existing attribute class definitions
    • Schema: generate a schema in DTD, RELAX NG, or W3C Schema language
    • Documentation: generate human-readable reference documentation in HTML or PDF
    • Save Customization: generate an ODD file defining the current customization
    • Sanity Checker: run a consistency check of the modifications made to a schema

At the foot of each screen is a red Save button: remember to press this before moving on to the next tab, or your changes will not be stored and so have no effect.

We won't explore all of these options in this tutorial.

  1. Enter a title and filename for your schema. We suggest TEI for Punch and my_ipp respectively. Change the Description of your schema at the bottom if you like, but leave the other fields unchanged.
  2. If you want to change the interface language, select the language of your choice (but the rest of this exercise assumes you are working in English).
  3. Press the red Save button and then select the Modules tab to proceed.
The modules screen shows two lists: on the left are all available TEI modules; on the right are the modules currently selected for your schema. You can add modules from the list on the left, and remove modules from the list on the right, by clicking the appropriate word next to the module you wish to operate on.
  1. For this exercise, we will need the following extra modules:
    • figures
    • corpus
    • namesdates
    Click the word add next to the name of each module.
  2. The modules chosen contain many more elements than we need, so we will now remove some of them, simplifying the view in the XML editor. Click the name of a module in the List of selected modules (the right-most column) to see a list of the elements this module defines.

Each element listed has a name, a radio button indicating whether it is to be included or excluded, a tag name, a description, and a link to a further screen where its attributes are specified. The question mark following each element name is linked to its full documentation in the TEI; the element name itself is linked to a screen on which you can modify it. We'll use these facilities later.

For now, note that you can toggle inclusion or exclusion of all elements in the list by clicking the appropriate column heading. For example, you can click on Exclude to remove all elements from a module.

Now work down the list clicking the radio button to restore or add elements. We make suggestions for the elements you will need to have in your schema for this and the following exercises, but feel free to add others if you wish. Remember to press the red Save button when you have finished with each module. Press the Modules or back links to go back to the list of modules.
    • From the core module, delete everything except <author>, <bibl>, <cb>, <cit>, <date>, <desc>, <emph>, <foreign>, <gap>, <graphic>, <head>, <hi>, <item>, <l>, <label>, <lb>, <lg>, <list>, <mentioned>, <milestone>, <name>, <note>, <p>, <pb>, <ptr>, <quote>, <ref>, <said>, <soCalled>, <sp>, <speaker>, <stage>, and <title>.
    • From the header module, delete everything except <availability>, <change>, <distributor>, <edition>, <editionStmt>, <encodingDesc>, <fileDesc>, <idno>, <keywords>, <principal>, <profileDesc>, <projectDesc>, <publicationStmt>, <revisionDesc>, <samplingDecl>, <sourceDesc>, <teiHeader>, and <titleStmt>.
    • From the textstructure module, delete everything except <TEI>, <back>, <body>, <div>, <front>, <signed>, and <text>.
    • From the figures module, delete <formula>.
    • From the corpus module, delete everything except <particDesc>.
    • From the namesdates module, it's up to you which elements you will need. To complete these exercises we suggest you retain at least the following: <affiliation>, <birth>, <death>, <education>, <event>, <forename>, <listPerson>, <occupation>, <orgName>, <persName>, <person>, <placeName>, <state>, <surname>, and <trait>.

We are now ready to generate a schema. Click the Schema tab, and then press the red Generate button, taking the default option of a RELAX NG compact schema. Your browser will ask whether you want to save or open the generated file: save it to your Work folder. Then complete this stage by going to the Save Customization tab of Roma and saving your work as a file in your Work folder. Do not close the web browser, as we'll use it again shortly.
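
The customization you save from the Save Customization tab is an ODD file. As a rough sketch of how the choices made so far might be recorded in it (the attributes on <schemaSpec>, the list of base modules, and the handful of deletions shown are assumptions for illustration, not Roma's verbatim output):

```xml
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0" ident="my_ipp" start="TEI">
  <!-- modules assumed to make up the smallest recommended schema -->
  <moduleRef key="tei"/>
  <moduleRef key="core"/>
  <moduleRef key="header"/>
  <moduleRef key="textstructure"/>
  <!-- extra modules added on the Modules tab -->
  <moduleRef key="figures"/>
  <moduleRef key="corpus"/>
  <moduleRef key="namesdates"/>
  <!-- one declaration per excluded element; only three are shown here -->
  <elementSpec ident="abbr" module="core" mode="delete"/>
  <elementSpec ident="formula" module="figures" mode="delete"/>
  <elementSpec ident="textDesc" module="corpus" mode="delete"/>
</schemaSpec>
```

Each excluded element gets its own declaration, so a customization that deletes most of a module produces a long run of such lines.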

1.4. Using your schema in oXygen

You can use oXygen and the file you just made to check that your schema is correct. Proceed as follows:
  • Go back to the file ex1.xml, or load it again.
  • At the head of the file there is an oXygen processing instruction beginning <?oxygen RNGSchema="... This tells oXygen which schema file it should use to validate your document; at the moment it probably points at the tei_all schema. We will now switch to the schema we have just generated. Delete the whole of the processing instruction line.
  • Now go to the menu Document, then Schema and then Associate Schema. Choose the RELAX NG tab, and locate your schema file (you will need to select "Browse for local folder" from the drop down menu, or select the middle folder icon on the right to browse).
  • oXygen will insert another processing instruction at the start of your file to mark the schema location, and attempt to validate the file. Try inserting some new elements, and you should see a different collection of possibilities from those you saw before.
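
For orientation, the processing instruction that oXygen writes might look something like one of the following. This is a sketch only: it assumes the schema file is named my_ipp.rnc and sits in the same folder as ex1.xml, and the exact form depends on the oXygen version in use.

```xml
<!-- form used by older oXygen releases for a RELAX NG compact schema (assumed) -->
<?oxygen RNGSchema="my_ipp.rnc" type="compact"?>

<!-- newer releases may write the W3C xml-model instruction instead -->
<?xml-model href="my_ipp.rnc" type="application/relax-ng-compact-syntax"?>
```

Only one such instruction is needed; if the validation results look wrong, check that the path in it really points at the schema you generated.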

1.5. Enhancing your schema

Our schema now includes only the elements we want for the Punch project, but we would like to constrain it further. For example, we have used the type attribute on the <div> element to categorize each component of an issue by means of a code. It would be useful to make sure that this code is always present, and also to make sure the values used all come from the same fixed list.

Go back to Roma. (If you have closed the browser, you will need to restart Roma and upload the customization you saved earlier.) Go to the Modules tab and click on textstructure in the right-hand column. Find <div> and click on Change attributes on the right-hand side. This will show you all the attributes of <div>. Click on type, and you will be able to change its properties:
  1. Change the Is it optional radio button to make it compulsory
  2. Change the radio button for Closed list? to make it a closed list
  3. In the box for List of values, type
    cartoon,verse,review,prose,snippet,snippets
    (i.e. a list of possible values, separated by commas, but without any spaces).
  4. Click the red Save button
Now generate a new schema, just as you did before, and save it under the name my_ipp.rnc, overwriting the file that you created earlier. Reload your file in oXygen. There should be a validation error, because one of the <div>s in the file has no type attribute specified. Supply a value and validate again; then try giving an illegal value to check that your list of legal values is being respected.
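
If you save the customization again and open the ODD file, the change to type should appear as a modified element specification. The following is a sketch of what to expect, assuming Roma records the change in the usual ODD way; attribute order and any extra attributes may differ in the real file.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             ident="div" module="textstructure" mode="change">
  <attList>
    <!-- usage="req" makes the attribute compulsory -->
    <attDef ident="type" mode="change" usage="req">
      <!-- type="closed" restricts values to the listed items -->
      <valList type="closed" mode="replace">
        <valItem ident="cartoon"/>
        <valItem ident="verse"/>
        <valItem ident="review"/>
        <valItem ident="prose"/>
        <valItem ident="snippet"/>
        <valItem ident="snippets"/>
      </valList>
    </attDef>
  </attList>
</elementSpec>
```

With a schema generated from this, <div type="verse"> should validate, while a <div> with no type, or one with a value outside the list (say type="poem", an invented example), should be reported as an error.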

This is the end of the first half of Exercise 1. If you have extra time, read the materials we've provided!

2. Part 2

2.1. Adding a New Element

Now we will add a completely new element. If you've had time to read any of Punch, you've probably noticed that Mr Punch is very fond of combining a quotation from some other newspaper with a sarcastic comment, and using the result to fill up space on the page.

The TEI already has an element for the combination of a quotation and a bibliographic reference (<cit>); we will define a new element called <citCom> for the combination of that with a comment.

  1. Re-open Roma if necessary.
  2. Select the Add Elements tab.
  3. Enter citCom as the name of the new element, and supply a brief description for it.
  4. This element can appear anywhere within a <div> element, so make it a member of the model.divPart model class by checking the tick box next to that name.
  5. We want this element to have a type attribute, so make it a member of the att.typed attribute class by checking the tick box next to that name.
  6. In the dropdown list labelled Contents select User content: we want to define our own content model for this element.
  7. Complete the <content> element in the box below by adding the following set of declarations:
    <rng:ref name="cit"   xmlns:tei="http://www.tei-c.org/ns/1.0"/>
    <rng:oneOrMore>
     <rng:ref name="model.pLike"/>
    </rng:oneOrMore>
    which says that a <citCom> must have a <cit> followed by one or more members of the model.pLike class.
  8. Save your changes, generate your schema, and use oXygen to check that your new element is now available for use.
You may like to mark up the ‘snippet’ about the Mexican rebel with your new element. It may end up looking like this:
    <citCom xmlns="http://www.example.org/ns/nonTEI">
     <cit xmlns="http://www.tei-c.org/ns/1.0">
      <quote>"<hi rend="sc">Mexican Rebel Split</hi>."</quote>
      <bibl><hi rend="it">Morning Post.</hi></bibl>
     </cit>
     <p xmlns="http://www.tei-c.org/ns/1.0">Now perhaps the other civilised Powers will intervene. We have heard of many inhumanities marking the war in Mexico, but this treatment of a rebel is surely the limit.</p>
    </citCom>

Note that your new element is defined in a new namespace because it is not a TEI element: it is unique to the IPP. This means that any of its direct child elements which are in the TEI namespace (for example, a <cit> or a <p>) must state this explicitly by including a TEI namespace declaration. oXygen will help you do this, but take care!
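
In the saved ODD file, the new element should appear as a specification of its own. A minimal sketch, assuming Roma writes the namespace, class memberships, and content model in the standard ODD form (the <desc> wording here is invented for illustration):

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             xmlns:rng="http://relaxng.org/ns/structure/1.0"
             ident="citCom" ns="http://www.example.org/ns/nonTEI" mode="add">
  <desc>a quotation with its source, followed by a comment on it</desc>
  <classes>
    <!-- allows citCom wherever div-level parts may appear -->
    <memberOf key="model.divPart"/>
    <!-- supplies the type attribute -->
    <memberOf key="att.typed"/>
  </classes>
  <content>
    <rng:ref name="cit"/>
    <rng:oneOrMore>
      <rng:ref name="model.pLike"/>
    </rng:oneOrMore>
  </content>
</elementSpec>
```

The ns attribute is what places <citCom> outside the TEI namespace, which is why its TEI children in the example above each carry their own namespace declaration.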

2.2. Documenting your schema with Roma

One of the major benefits of Roma is that after you have customized your schema it can produce two ways of documenting the changes you have made.
  • Select the Save Customization tab to save an ODD file
  • Select the Documentation tab to convert the ODD file into human-readable documentation in HTML.

The ODD file indicates how your schema differs from full TEI, what modules you have included, what elements you may have added and changed, and so forth. You should keep this file along with your generated schema, in case you need to generate a new schema with additional elements or constrain it further. Other people can generate the schema for your documents in different schema languages if needed. Moreover, this document, like other ODD files, is an ordinary TEI document which you can edit, using oXygen for example.

The HTML or other documentation generated by Roma looks like a subset of the standard TEI reference manual. Because the TEI is itself maintained as a very large ODD file, Roma has access to all the information in the Guidelines about all of the elements, attributes, classes, or macros you've chosen to include in your customization, and can include it in your user project manual.

3. Other things to try with Roma

  1. How do you go about renaming an existing element? What happens in the ODD when you do?
  2. Once you have an ODD, you can generate project-specific documentation. Where would you put this prose in the ODD file? Try generating some test documentation with additions you have made.
  3. Experiment with starting from the different exemplar customizations Roma offers. Which do you think would be best for your project?
  4. Try modifying the schema specification in your ODD using oXygen. If you remove one of the <elementSpec mode="delete"> lines what happens? Try improving the documentation of the changes you made in the schema. Try adding some usage examples.


TEI@Oxford. Date: 2011-06
Copyright University of Oxford