IT Services, University of Oxford

1. Using a TEI schema

In this exercise you will learn:
  • how to make a valid TEI document, using an Oxygen template
  • how to use the Roma web application to make a TEI schema

1.1. Using an Oxygen Template

In the first exercise, we made a well-formed document, but we did not try to make one which conformed to any schema.

Oxygen comes with predefined templates for a large number of commonly used schemas, including TEI. We'll use one of these to create a simple TEI Header for the poem we tagged yesterday.

Start up Oxygen. You will see that the documents you were working on previously are still there. We will start, however, by making a new document.
  1. As before, click the New Document icon at top left (or select New Document from the File menu, or type CTRL-N) to display the New dialog. This time, however, select the From Templates tab.

    Scroll down the list to select TEI P5 - Lite. Press OK.

  2. A new editing window opens, with the basic parts of a valid TEI header already present.


    Replace the English text content with some appropriate Russian text for the title, publication statement, and source description for your digital edition of the Pushkin poem.

  3. You need to replace the text <p>Some text here.</p> inside the <text> element that follows the <teiHeader> with the contents of the file you prepared earlier. The easiest way is to cut-and-paste:
    • Click on the tab containing the window in which you edited the Pushkin poem (it should be labelled pushkin.xml)
    • Use the mouse to select the whole of the <div type="poem"> element
    • Press CTRL-c (or select Copy from the Edit menu)
    • Now click on the tab for the file you just created (it's probably labelled untitled.xml unless you saved it)
    • Select the text <p>Some text here.</p> with the mouse and press CTRL-v (or select Paste from the Edit menu) to replace it with the Pushkin poem
  4. Is the poem valid? What do you need to do to make it valid?
  5. Now go back to your <teiHeader>. We will add a <revisionDesc> element to show when the document was completed.
    • Use the mouse to place the cursor immediately after the closing-tag </fileDesc>
    • Type the < character and observe that Oxygen shows you only the elements that are valid at this point together with a brief description of each

    • Use the arrow key to step down to the one you want to insert: <revisionDesc> and press RETURN
    • The cursor stays inside the start-tag for <revisionDesc> in case you want to add some attributes; since we don't want to, press the right arrow key to move immediately before the end-tag, and type in a < again.
    • You can enter a <list> or a <change>; we suggest the latter.
    • This time, with the cursor inside the start-tag for <change>, press the space bar.
    • A long list of the available attributes for this element appears. Scroll down it to select when, and press Return.

    • Enter today's date in the form YYYY-MM-DD as the value for this attribute. Note what happens if you try to supply an invalid date (such as the 32nd of August).
    • We suggest you enter some text such as ‘first valid TEI version’ as content for the <change> element.
  6. Experiment to see what other elements are available at different points in your document, if you like.
  7. Click on the Author button at the bottom of the screen to see your completed document. Remember to save it!
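
As a rough guide, the header might look something like the sketch below once steps 2 and 5 are done. The element structure follows the steps above; the text content is placeholder wording (the exercise asks for Russian text of your own), and the date is only an illustration in the required YYYY-MM-DD form.

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <!-- placeholder: replace with the title of your edition -->
      <title>Title of your edition</title>
    </titleStmt>
    <publicationStmt>
      <!-- placeholder: replace with your publication statement -->
      <p>Publication information</p>
    </publicationStmt>
    <sourceDesc>
      <!-- placeholder: replace with a description of your source -->
      <p>Information about the source</p>
    </sourceDesc>
  </fileDesc>
  <!-- added in step 5, immediately after the closing-tag </fileDesc> -->
  <revisionDesc>
    <!-- illustrative date only: use today's date -->
    <change when="2008-07-07">first valid TEI version</change>
  </revisionDesc>
</teiHeader>
```

If your own header differs only in its text content, Oxygen should report it as valid in the same way.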

1.2. Making a TEI Schema

In the previous exercise, we used a pre-defined TEI schema called TEI Lite. This schema has all you need for marking up most kinds of TEI documents -- but quite a lot more than most people want. You can use the web application Roma to make any kind of TEI schema, simple or complex. We'll start with a very simple one.
  1. Open http://www.tei-c.org/Roma/ using any web browser
  2. Click the radio button next to "Create Customization from Template". Use the default option "Absolutely Bare".


  3. Press the big red button at bottom left
  4. Roma shows you information about the schema you are about to create.

  5. Roma has many facilities, accessed by means of the buttons across the top of the screen. You can customize the interface language as well. If you'd rather see things in Russian, click the radio button next to "Russian", and then press the big red button at bottom left again.

  6. For now, we just want a schema. Press the button that says Schema (Схема).
  7. You have a choice of schema languages: choose the default, Relax NG schema (compact syntax), and press the big red button at bottom left again.

  8. Roma will send you a file tei_bare.rnc; save it in your working directory for the next part of the exercise.

1.3. Marking up Mr Punch

For the rest of this exercise, we'd like you to try marking up a page from Punch. To help you, we've prepared the following files, all of which you should be able to find in your Working Directory:
  • the directory Punch/Pages contains graphic images of each page of the issue, called things like 147_001.jpg (for page one)
  • the directory Punch/Text contains plain text for each page, called things like page2.txt (for page 2)
  • the file XML/punchHdr.xml contains a sample TEI Header for you to modify, or prefix to your document.
  • the directory XML/Graphics contains image files for all the illustrations in the issue, called things like 003.png for the illustration on page 3.

Start by choosing the page you want to work on. Some are very easy -- and some are less so!

Open Oxygen again, and make a new document, as before. This time, however, when you see the Create an XML Document dialog, make sure that the checkbox Use a DTD or Schema is checked.

Select the Relax NG tab, and then navigate to wherever you have saved the tei_bare.rnc file which Roma made for you in the previous exercise. Use this as your schema.

Using the techniques you've already learned you should find it easy to...
  • insert the default TEI Header at the start of your new document
  • find out what elements this schema offers you to mark up the text
  • select the elements you want, and mark up stretches of text with them.

How far can you get with your chosen page? Since the tei_bare schema really only offers paragraphs and lists, don't be too discouraged if the answer is ‘not very!’

We need to modify the schema to include more elements. To do this, we will return to Roma and discover more about what it offers.

1.4. A more ambitious schema

In this part of the exercise, we will build a ‘less bare’ version of the tei_bare schema, by adding to it some elements we need to mark up the Punch texts.

  1. Start up Roma and select the "Absolutely Bare" template as before.
  2. Select the Modules (Модули) tab: you will see a list of available modules on the left, and the modules selected on the right.


  3. Click on the word add (Добавление) next to the word figures on the left. This will add that module to the list on the right.
  4. Now we need to see what elements are being used from each of the modules in our schema. Start by clicking on the word core in the list on the right.
  5. A long list of element names appears. You can:
    • indicate whether the element is to be included (the Включить, "Include", button is selected) or excluded (the Исключить, "Exclude", button is selected)
    • change the spelling of the element's name (not recommended!)
    • use the question mark button to read a formal definition of this element in the online TEI Guidelines
    • ... or review the brief description of it
    • and you can also review the attributes available for this element by clicking on Изменение атрибутов ("Change attributes")

  6. We suggest that, for basic markup of a Punch page, you'll need to add at least the elements <bibl>, <cit>, <graphic>, <hi>, <l>, <lg>, <name>, <pb>, <q>, <sp>, <speaker> and <stage>. But feel free to explore, and add others!
  7. When you're done, click the "Submit Query" button at bottom left, and the schema will be updated with your selection. Then press the Back button at top left to return to the modules page.
  8. Review the elements provided by the figures module which you added to the schema earlier. You will see that this adds only a few elements: you may however want to remove the <formula> element.
  9. When you are happy with your selection, click on the Schema button to generate a schema as before.
  10. Save the schema generated as an RNC file, as before.
  11. You can also generate a manual for your colleagues: click on Documentation.
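
Behind the scenes, Roma records choices like these in a TEI customization file (an ODD). The fragment below is only a sketch of how the choices made in steps 3 and 8 could be expressed there. The identifier tei_punch is a hypothetical name, and the list of four starting modules is an assumption about what the bare template contains; the file Roma actually produces will be longer and may differ in detail.

```xml
<schemaSpec ident="tei_punch" start="TEI">
  <!-- assumed starting point: the modules behind the bare template -->
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="core"/>
  <moduleRef key="textstructure"/>
  <!-- step 3: the module added on the Modules tab -->
  <moduleRef key="figures"/>
  <!-- step 8: an element excluded from that module -->
  <elementSpec ident="formula" module="figures" mode="delete"/>
  <!-- exclusions made in the other modules would be recorded
       in the same way; they are omitted from this sketch -->
</schemaSpec>
```

You do not need to write or edit such a file by hand for this exercise: the Roma forms do it for you.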

Now it's over to you! Go back to Oxygen and see if you can improve on the tagging of the document you were working on by using your new schema.
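
To give an idea of what becomes possible, here is a sketch of a page body using some of the elements suggested in step 6. All the text content is placeholder wording, not taken from Punch, and the image path is an assumption about where your document sits relative to the Graphics directory.

```xml
<body>
  <!-- page break: the page number is illustrative -->
  <pb n="3"/>
  <!-- assumed relative path to one of the supplied illustrations -->
  <p><graphic url="Graphics/003.png"/></p>
  <p>Placeholder prose with a <hi rend="italic">highlighted</hi> phrase
    and a <name>Personal Name</name>.</p>
  <!-- a speech, as in a dialogue or a cartoon caption -->
  <sp>
    <speaker>First Speaker</speaker>
    <stage>(placeholder stage direction)</stage>
    <p>Placeholder speech.</p>
  </sp>
  <!-- a group of verse lines -->
  <lg>
    <l>Placeholder first line of verse,</l>
    <l>Placeholder second line of verse.</l>
  </lg>
</body>
```

None of this would validate against the tei_bare schema, apart from the paragraphs, which is why the extra elements had to be added in Roma first.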

To tell Oxygen to use your new schema, rather than tei_bare, proceed as follows:
  • Remove the processing instruction which Oxygen inserted at the head of your document: it probably looks like this: <?oxygen RNGSchema="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_bare.rng" type="xml"?>
  • Select "Associate Schema" on the "XML Document" submenu of the "Document" menu.

  • Specify the file in which you saved the schema that Roma generated for you.
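
After these steps, Oxygen should have written a new processing instruction at the head of your document. As a sketch, assuming you saved the new schema as a compact-syntax file called tei_punch.rnc (a hypothetical name) in the same directory as your document, it might look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- tei_punch.rnc is a hypothetical file name; the exact form of this
     instruction may vary with your version of Oxygen -->
<?oxygen RNGSchema="tei_punch.rnc" type="compact"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- your teiHeader and text go here -->
</TEI>
```

If the instruction still points at tei_bare, the elements you added in Roma will be reported as invalid.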

Date: 2008-07-07
Copyright University of Oxford