IT Services, University of Oxford

1. Spoken Texts

A spoken text may contain any of the following components:
  • utterances
  • pauses
  • vocalized but non-lexical phenomena such as coughs
  • kinesic (non-verbal, non-lexical) phenomena such as gestures
  • entirely non-linguistic incidents occurring during and possibly influencing the course of speech
  • writing, regarded as a special class of incident in that it can be transcribed, for example captions or overheads displayed during a lecture
  • shifts or changes in vocal quality

1.1. TEI for spoken texts

The spoken texts module proposes an XML-based markup for
  • a lexically useful subset of speech phenomena
  • a rich set of associated contextual information (metadata)
  • linking and alignment mechanisms

As a part of the TEI scheme, it can be integrated, extended, and customized in a standard way.
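
As a hedged illustration of what customizing "in a standard way" can look like, the sketch below is a TEI ODD schema specification that selects the spoken module alongside the usual infrastructure modules. The identifier tei_spoken_sketch and the decision to delete <kinesic> are assumptions made for this example only.

```xml
<schemaSpec ident="tei_spoken_sketch" start="TEI">
 <!-- infrastructure modules needed by most TEI schemas -->
 <moduleRef key="tei"/>
 <moduleRef key="core"/>
 <moduleRef key="header"/>
 <moduleRef key="textstructure"/>
 <!-- the spoken texts module discussed here -->
 <moduleRef key="spoken"/>
 <!-- hypothetical project decision: this corpus never records gestures -->
 <elementSpec ident="kinesic" module="spoken" mode="delete"/>
</schemaSpec>
```

A schema generated from such a specification would accept <u>, <pause>, <vocal> and the rest of the module, but reject <kinesic>.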

1.2. What is "lexically useful"?

1.3. The notion of "utterance"

  • problematic, but pragmatic
  • a sequence of speech from a single speaker
  • may be grouped into higher-level <div>s
  • or fragmented into smaller segments <seg> or <s>
  • the who attribute points to speaker information
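
A minimal sketch of how these pieces fit together, assuming a speaker identified as Jan (the identifier, the div type value, and the prose description are illustrative, not taken from a real corpus): the who attribute on <u> points at the xml:id of a <person> declared in the header, and the utterance is both grouped inside a <div> and split into <seg>s.

```xml
<!-- in the TEI header: speaker information -->
<particDesc>
 <listPerson>
  <person xml:id="Jan">
   <p>Hypothetical speaker, declared only to show the link.</p>
  </person>
 </listPerson>
</particDesc>

<!-- in the text: an utterance pointing back to that speaker -->
<div type="conversation">
 <u who="#Jan">
  <seg>This is</seg>
  <seg>just delicious</seg>
 </u>
</div>
```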

1.4. Transcribed Speech

Elements defined:
<broadcast>, <equipment>, <incident>, <kinesic>, <pause>, <recording>, <recordingStmt>, <scriptStmt>, <shift>, <u>, <vocal>, <writing>
Classes defined:
att.duration, model.divPart.spoken, model.global.spoken, model.recordingPart

1.5. Simple examples

Mixture of utterance and ‘paralinguistic’ information:
<u who="#Jan">This is just delicious</u>
<incident>
 <desc>telephone rings</desc>
</incident>
<u who="#Kim">I'll get it</u>
<u who="#Tom">I used to <vocal>
  <desc>coughs</desc>
 </vocal> smoke a lot</u>
<u who="#Bob">
 <vocal>
  <desc>sniffs</desc>
 </vocal>He thinks he's tough
</u>
<vocal who="#Ann">
 <desc>snorts</desc>
</vocal>
<u who="#Tom">Yeah
<kinesic>
  <desc>gives uplifted middle finger sign</desc>
 </kinesic>
</u>

1.6. Back channelling

<u who="#a">So what could I have done <vocal who="#b">
  <desc>tut-tutting</desc>
 </vocal> about it anyway?</u>

1.7. Example using other TEI elements

<u who="#mar">you never <pause/> take this cat for
show and tell
<pause/> meow meow</u>
<u who="#ros">yeah well I dont want to</u>
<incident>
 <desc>toy cat has bell in tail which continues
   to make a tinkling sound</desc>
</incident>
<u who="#ros">because it is so old</u>
<u who="#mar">how <choice>
  <orig>bout</orig>
  <reg>about</reg>
 </choice>
 <emph>your</emph> cat <pause/>yours is <emph>new</emph>
 <kinesic>
  <desc>shows Father the cat</desc>
 </kinesic>
</u>
<u trans="pause" who="#fat">thats <pause/> darling</u>
<u who="#mar">no <emph>mine</emph> isnt old
mine is just um a little dirty</u>

1.8. Shifts in voice quality

  • Classic multiple hierarchy problem
    • can use <shift> or <milestone> to mark boundaries...
    • ... or can use typed <seg> elements
  • useful also for code shifting
<u who="#LB">
 <shift feature="loud" new="f"/>Elizabeth
</u>
<u who="#EB">Yes</u>
<u who="#LB">
 <shift feature="loud"/>Come and try this <pause/>
 <shift feature="loud" new="ff"/>come on
 <shift feature="loud"/>
 <milestone type="codeshift" new="kreol"/>tinva
</u>
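
For comparison, here is a sketch of the typed <seg> alternative mentioned in the list above, applied to the first utterance. Using type for the feature and subtype for its value is an assumption of this example; a project would need to document its own convention.

```xml
<u who="#LB">
 <!-- the whole stretch spoken loudly is wrapped, rather than marked at its boundaries -->
 <seg type="loud" subtype="f">Elizabeth</seg>
</u>
```

Unlike <shift>, a <seg> has to nest inside a single <u>, so a quality that persists across several utterances needs one <seg> per utterance.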

1.9. Sample prosodic feature list

(based on Boase, Survey of English Usage, 1990)
  • tempo: fast, slow, getting faster, getting slower, etc.
  • loud: loud, soft, getting louder, getting softer
  • pitch range: high, low, wide, narrow, ascending...
  • tension: slurred, tense, staccato, legato...
  • rhythm: regular, irregular, spiky rising or falling...
  • voice quality: whisper, husky, falsetto, giggle, sobbing, yawning, sighing...

Researchers need to define their own terms

1.10. <shift/> example

<u who="#a">Listen to this <shift new="reading"/>The government is
confident, he said, that the current economic problems will be
completely overcome by June<shift/> what nonsense</u>

1.11. or as an <incident>

<u who="#a">Listen to this
<incident>
  <desc>reads aloud from newspaper</desc>
 </incident> what nonsense</u>

1.12. <vocal> vs <u>

Compare:
<vocal who="#ann">
 <desc>snorts</desc>
</vocal>
and
<u who="#ann">
 <vocal>
  <desc>snorts</desc>
 </vocal>
</u>

1.13. <writing> example

<u who="#a">look at this</u>
<writing who="#a" type="newspaper" gradual="false">
Government claims economic problems <soCalled>over by June</soCalled>
</writing>
<u who="#a">what nonsense!</u>

1.14. Timing issues

  • pausing: use <pause> element
  • duration: use dur attribute
  • synchronization: use synch attribute
  • overlap: use trans attribute

1.15. <pause> example

<u>Okay <pause dur="PT2M"/>U-m<pause dur="PT75S"/>the scene opens up
<pause dur="PT50S"/> with <pause dur="PT20S"/> um <pause dur="PT145S"/> you see
a tree okay?</u>

1.16. Overlap

Mutt: Have you heard the --
Jeff: the election result?
Mutt: It's a disaster!
Jeff: (at the same time) It's a miracle!
<u who="#mutt">have you heard the</u>
<u trans="latching" who="#jeff">the election result</u>
<u who="#mutt">its a disaster</u>
<u who="#jeff" trans="overlap">its a miracle</u>

1.17. More overlap

<u who="#tom">I used to smoke <anchor xml:id="TS-p10"/> a lot more than this <anchor xml:id="TS-p20"/> but I never inhaled the smoke</u>
<u start="#TS-p10" end="#TS-p20" who="#bob">You used to smoke</u>

1.18. Synchronization

<u who="#mutt">have you heard <anchor synch="#t1"/>the</u>
<u who="#jeff" synch="#t1">the election result</u>
<u who="#mutt" synch="#t2">its a disaster</u>
<u who="#jeff" synch="#t2">its a miracle</u>
<!-- Elsewhere in Document -->
<timeline origin="#t1">
 <when xml:id="t1"/>
 <when xml:id="t2"/>
</timeline>

1.19. <timeline> example

<timeline unit="s" origin="#TS-P1">
 <when xml:id="TS-P1" absolute="12:20:01"/>
 <when xml:id="TS-P2" interval="4.5" since="#TS-P1"/>
 <when xml:id="TS-P6"/>
 <when xml:id="TS-P3" interval="1.5" since="#TS-P6"/>
</timeline>
<u xml:id="TS-U1" start="#TS-P2" end="#TS-P3">This is my <anchor synch="#TS-P6" xml:id="TS-P6A"/> turn</u>

The start of utterance TS-U1 is aligned with TS-P2 and its end with TS-P3. The transition between the words my and turn occurs at point TS-P6A, which is synchronous with point TS-P6 on the timeline.
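
To make the arithmetic concrete, here is a sketch of the same timeline with one extra, purely hypothetical value: an interval of 2.0 seconds between TS-P2 and TS-P6, which the example above leaves unspecified. The clock times in the comments follow from that assumption only.

```xml
<timeline unit="s" origin="#TS-P1">
 <when xml:id="TS-P1" absolute="12:20:01"/>
 <!-- 12:20:01 + 4.5 s = 12:20:05.5 -->
 <when xml:id="TS-P2" interval="4.5" since="#TS-P1"/>
 <!-- hypothetical interval: 12:20:05.5 + 2.0 s = 12:20:07.5 -->
 <when xml:id="TS-P6" interval="2.0" since="#TS-P2"/>
 <!-- 12:20:07.5 + 1.5 s = 12:20:09.0 -->
 <when xml:id="TS-P3" interval="1.5" since="#TS-P6"/>
</timeline>
```

Under that assumption utterance TS-U1 would run from 12:20:05.5 to 12:20:09.0, a duration of 3.5 seconds, with the transition between my and turn at 12:20:07.5.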

1.20. Using elements seen elsewhere

<u>
 <del type="truncation">s</del>see
<del type="repetition">you you</del> you know
<del type="falseStart">it's</del> he's crazy

</u>
<gap reason="passing truck" extent="5" unit="s"/>
<u who="#P1">I proposed that <foreign xml:lang="de"> wir können <pause dur="PT1S"/> vielleicht </foreign> go to warsaw and <emph>vienna</emph>
</u>

1.21. Participant Description

<particDesc>
 <listPerson>
  <person xml:id="P-1234" sex="2" age="mid">
   <p>Female informant, well-educated, born in Shropshire UK, 12 Jan 1950, of unknown occupation. Speaks French fluently. Socio-Economic status B2.</p>
  </person>
  <person xml:id="P-4332" sex="1">
   <persName>
    <surname>Hancock</surname>
    <forename>Antony</forename>
    <forename>Aloysius</forename>
    <forename>St John</forename>
   </persName>
   <residence notAfter="1959">
    <address>
     <street>Railway Cuttings</street>
     <settlement>East Cheam</settlement>
    </address>
   </residence>
   <occupation>comedian</occupation>
  </person>
 </listPerson>
</particDesc>

1.22. Flexibility and standardization of metadata

  • Information can be supplied purely in documentary terms, as plain text...
  • ... or it can be organized in a structured way, using a rich set of XML descriptors
  • The available descriptors can be constrained by means of a customised schema
  • Additional descriptors can be added, and integrated by means of the TEI class system
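
As a sketch of the structured option, the prose description of informant P-1234 in the participant description above could be recast with dedicated elements. The element choices and the level value fluent are one possible reading of that paragraph, offered as an assumption, not as the original encoders' practice.

```xml
<person xml:id="P-1234" sex="2" age="mid">
 <birth when="1950-01-12">
  <date>12 Jan 1950</date>
  <name type="place">Shropshire, UK</name>
 </birth>
 <langKnowledge>
  <!-- level is free text here; "fluent" is an illustrative value -->
  <langKnown tag="fr" level="fluent">French</langKnown>
 </langKnowledge>
 <education>well-educated</education>
 <occupation>unknown</occupation>
 <socecStatus>B2</socecStatus>
</person>
```

The structured form is easier to query (for example, all informants born before a given date), at the cost of more encoding effort.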

1.23. <scriptStmt> example

<sourceDesc>
 <scriptStmt xml:id="CNN12">
  <bibl>
   <author>CNN Network News</author>
   <title>News headlines</title>
   <date when="1991-06-12">12 Jun 91</date>
  </bibl>
 </scriptStmt>
</sourceDesc>

1.24. Similarly for recordings...

<recordingStmt>
 <recording type="audio" dur="P30M">
  <respStmt>
   <resp>Location recording by</resp>
   <orgName>Sound Services Ltd.</orgName>
  </respStmt>
  <equipment>
   <p>Multiple close microphones mixed down to stereo Digital Audio Tape, standard play, 44.1 KHz sampling frequency</p>
  </equipment>
  <date>12 Jan 1987</date>
 </recording>
</recordingStmt>

1.25. Detailed <recording>

<recording type="audio" dur="P10M">
 <equipment>
  <p>Recorded from FM Radio to digital tape</p>
 </equipment>
 <broadcast>
  <bibl>
   <title>Interview on foreign policy</title>
   <author>BBC Radio 5</author>
   <respStmt>
    <resp>interviewer</resp>
    <name>Robin Day</name>
   </respStmt>
   <respStmt>
    <resp>interviewee</resp>
    <name>Margaret Thatcher</name>
   </respStmt>
  </bibl>
 </broadcast>
</recording>

1.26. Or maybe just...

<recordingStmt>
 <recording type="audio" dur="P15M" xml:id="rec-3001">
  <date>14 Feb 2001</date>
 </recording>
 <recording type="audio" dur="P15M" xml:id="rec-3002">
  <date>17 Feb 2001</date>
 </recording>
 <recording type="audio" dur="P15M" xml:id="rec-3003">
  <date>22 Feb 2001</date>
 </recording>
</recordingStmt>

1.27. ... and for settings

<setting xml:id="KDFSE002" n="063505" who="#PS0M6">
 <name type="place">Lancashire: Morecambe </name>
 <locale> at home </locale>
 <activity> watching television </activity>
</setting>

2. Analysis

  • associating simple analyses and interpretations with text elements
  • semantic or syntactic interpretations which an encoder wishes to attach to all or part of a text
  • mainly covering linguistic information
  • as often in the TEI, you can do the same thing in many ways:
    • using generic <seg> elements with type attributes
    • using the straightforward canned analyses described here
    • using the more powerful and general TEI Feature Structures
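
The three options can be sketched side by side for a single word. The tag NN1 and its gloss come from the word-analysis examples later in this section; the type values, the fs-NN1 identifier, and the feature names are assumptions made for illustration.

```xml
<!-- 1. a generic segment with typing attributes -->
<seg type="pos" subtype="NN1">victim</seg>

<!-- 2. a canned analysis: the word points to an interpretation -->
<w ana="#NN1">victim</w>
<interp xml:id="NN1">Noun singular</interp>

<!-- 3. a feature structure, which breaks the tag into named features -->
<w ana="#fs-NN1">victim</w>
<fs xml:id="fs-NN1" type="pos">
 <f name="class"><symbol value="noun"/></f>
 <f name="number"><symbol value="singular"/></f>
</fs>
```

The first is the lightest, the second keeps a reusable list of labels, and the third allows the analysis itself to be structured and validated.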

2.1. Linguistic units

To mark up text for linguistic purposes:
  • <s> (s-unit) contains a sentence-like division of a text.
  • <cl> (clause) represents a grammatical clause.
  • <phr> (phrase) represents a grammatical phrase.
  • <w> (word) represents a grammatical (not necessarily orthographic) word.
  • <m> (morpheme) represents a grammatical morpheme.
  • <c> (character) represents a character.
From the att.segLike class, these elements all have type and function attributes

2.2. Example of linguistic markup

Compare
<u>Like a suck of one of my sweets?</u>
<u>No I don't take sweets from strangers, oh God</u>
with....

2.3. linguistic markup

<u who="PS1K5">
 <s n="5963">
  <w type="AV0">Like</w>
  <w type="AT0">a</w>
  <w type="NN1">suck</w>
  <w type="PRF">of</w>
  <w type="CRD">one</w>
  <w type="PRF">of</w>
  <w type="DPS">my</w>
  <w type="NN2">sweets</w> ?</s>
</u>
<u trans="smooth" who="PS1BY">
 <s n="5964">
  <w type="ITJ">No </w>
  <w type="PNP">I </w>
  <w type="VDB">do</w>
  <w type="XX0">n't </w>
  <w type="VVI">take </w>
  <w type="NN2">sweets </w>
  <w type="PRP">from </w>
  <w type="NN2">strangers</w>
  <c type="PUN">, </c>
  <w type="ITJ">oh </w>
  <w type="NP0">God</w>
 </s>
</u>
(from British National Corpus, KSV 5963)

2.4. Mixing analysis with structure

Analytic units often cross structural boundaries. The <cl> (clause) elements here cross the verse lines (<l>). We can use the part attribute to show how a <cl> can be assembled:
<div type="stanza">
 <l>
  <cl part="I">Tweedledum and Tweedledee</cl>
 </l>
 <l>
  <cl part="F">Agreed to have a battle;</cl>
 </l>
 <l>
  <cl part="I">For Tweedledum said Tweedledee</cl>
 </l>
 <l>
  <cl part="F">Had spoiled his nice new rattle.</cl>
 </l>
</div>

2.5. Or the next attribute

<l>
 <cl next="#c5" xml:id="c3" part="I">For Tweedledum said
 <cl next="#c6" xml:id="c4" part="I">Tweedledee</cl>
 </cl>
</l>
<l>
 <cl prev="#c3" xml:id="c5" part="F">
  <cl prev="#c4" xml:id="c6" part="F">Had spoiled his nice new rattle.</cl>
 </cl>
</l>

2.6. Stand-off interpretation

When inline markup is inappropriate, the <span> element can be used to make ad hoc remarks about bits of text, linked to by ID. As usual, <spanGrp> is available to group assertions together.

<sp>
 <speaker>CORNWALL</speaker>
 <ab xml:id="eye_start">Lest it see more, prevent it. Out, vile jelly!</ab>
 <ab>Where is thy lustre now?</ab>
</sp>
<sp>
 <speaker>GLOUCESTER</speaker>
 <ab>All dark and comfortless. Where's my son Edmund?</ab>
 <ab>Edmund, enkindle all the sparks of nature,</ab>
 <ab xml:id="eye_end">To quit this horrid act.</ab>
</sp>
<span from="#eye_start" to="#eye_end">the eye is pulled out</span>

2.7. Stand-off interpretation (cont)

The <interp> element is used to encode an interpretation. The global ana attribute can point from the text to such an interpretation:
<sp>
 <speaker>CORNWALL</speaker>
 <ab ana="#eyeloss">Lest it see more, prevent it. Out, vile jelly!</ab>
 <ab>Where is thy lustre now?</ab>
</sp>
<sp>
 <speaker>GLOUCESTER</speaker>
 <ab>All dark and comfortless. Where's my son Edmund?</ab>
 <ab>Edmund, enkindle all the sparks of nature,</ab>
 <ab>To quit this horrid act.</ab>
</sp>
<interp resp="#SPQR" xml:id="eyeloss">removal of eyes</interp>

The <interpGrp> element is used to group interpretations together.

2.8. Interpretation example (1)

In this example:
  • A set of possible interpretations is defined, using <interp> elements
  • <seg> is used to mark up distinct portions of a narrative
  • <s> is used to mark sentences
  • the ana attribute links sections or milestones to the appropriate interpretation
<interpGrp resp="#TMA" type="structuralUnit">
 <interp xml:id="INTRO">introduction</interp>
 <interp xml:id="CONFLICT">conflict</interp>
 <interp xml:id="CLIMAX">climax</interp>
 <interp xml:id="REVENGE">revenge</interp>
 <interp xml:id="RECONCIL">reconciliation</interp>
 <interp xml:id="AFTERM">aftermath</interp>
</interpGrp>

2.9. Interpretation example (2)

<p xml:id="PP1">
 <seg xml:id="SS1-SS3" ana="#INTRO">
  <s xml:id="SS1">Sigmund ... was a king in Frankish country.</s>
  <s xml:id="SS2">Sinfiotli was the eldest of his sons.</s>
  <s xml:id="SS3">Borghild, Sigmund's wife, had a brother ... </s>
 </seg>
 <s xml:id="SS4A" ana="#CONFLICT">But Sinfiotli ... wooed the same woman</s>
 <s xml:id="SS4B" ana="#I3">and Sinfiotli killed him over it.</s>
 <seg xml:id="SS5-SS17" ana="#CLIMAX">
  <s xml:id="SS5">And when he came home, ... she was obliged to accept it.</s>
  <s xml:id="SS6">At the funeral feast Borghild was serving beer.</s>
  <s xml:id="SS17">Sinfiotli drank it off and at once fell dead.</s>
 </seg>
</p>
<anchor xml:id="NIL1" ana="#RECONCIL"/>
<p xml:id="PP2">Sigmund carried him a long way in his arms ... </p>

2.10. Linguistic Transcription

When transcribing, some people are more interested in the linguistic values of texts than their physical or semantic contexts.

2.11. Phrase segmentation

<s>
 <cl type="finite-declarative" function="independent">
  <phr type="NP" function="subject">It</phr>
  <phr type="VP" function="predicate">
   <phr type="V" function="verb-main">was</phr>
   also
   <phr type="NP" function="predicate-nom.">a crucial year for me</phr>
  </phr>
 </cl>
</s>

2.12. Words with lemmas and morphemes with types

<s xml:lang="la">
 <w lemma="timeo">timeo</w>
 <w lemma="danaii">Danaos</w>
 <w lemma="et">et</w>
 <w lemma="donum">dona</w>
 <w lemma="fero">ferentes</w>
</s>
or
<w type="adjective">
 <m type="prefix" baseForm="con">com</m>
 <m type="root">fort</m>
 <m type="suffix">able</m>
</w>

2.13. Nested <w>

<s>
 <w>I</w>
 <w>
  <w>did</w>
  <m>n't</m>
 </w>
 <w>do</w>
 <w>it</w>
 <c>.</c>
</s>

2.14. Word analysis

<s>
 <w ana="#AT0">The</w>
 <w ana="#NN1">victim</w>
 <w ana="#POS">'s</w>
 <w ana="#NN2">friends</w>
 <w ana="#VVD">told</w>
 <w ana="#NN2">police</w>
 <w ana="#CJT">that</w>
 <w ana="#NP0">Kruger</w>
 <w ana="#VVD">drove</w>
 <w ana="#PRP">into</w>
 <w ana="#AT0">the</w>
 <w ana="#NN1">quarry</w>
 <w ana="#CJC">and</w>
 <w ana="#AV0">never</w>
 <w ana="#VVD">surfaced</w>
</s>

2.15. Interpretation

<interpGrp type="POS">
 <interp xml:id="AT0">Definite article</interp>
 <interp xml:id="AV0">Adverb</interp>
 <interp xml:id="CJC">Conjunction</interp>
 <interp xml:id="CJT">Relative that</interp>
 <interp xml:id="NN1">Noun singular</interp>
 <interp xml:id="NN2">Noun plural</interp>
 <interp xml:id="NP0">Proper noun</interp>
 <interp xml:id="POS">Genitive marker</interp>
 <interp xml:id="PRP">Preposition</interp>
 <interp xml:id="VVD">Verb past tense</interp>
</interpGrp>

2.16. More interpretation

<u xml:id="u1">Can I have ten oranges and a kilo of bananas please?</u>
<u xml:id="u2">Yes, anything else?</u>
<u xml:id="u3">No thanks.</u>
<u xml:id="u4">That'll be dollar forty.</u>
<u xml:id="u5">Two dollars</u>
<u xml:id="u6">Sixty, eighty, two dollars. Thank you.</u>
<spanGrp type="transactions">
 <span from="#u1">sale request</span>
 <span from="#u2" to="#u3">sale compliance</span>
 <span from="#u4">sale</span>
 <span from="#u5">purchase</span>
 <span from="#u6">purchase closure</span>
</spanGrp>

2.17. British National Corpus

  • a snapshot of British English, taken at the end of the 20th century
  • 100 million words in approx. 4000 different text samples, both spoken (10%) and written (90%)
  • synchronic (1990-4), sampled, general purpose corpus
  • available under licence; latest edition is BNC-XML (13 March 2007)
  • Part-of-speech and lemma tagging
  • Uses a variant of TEI XML originally called CDIF

2.18. BNC XML

<div level="1" n="1" type="leaflet">
 <head type="MAIN">
  <s n="1">
   <w c5="NN1" hw="factsheet" pos="SUBST">FACTSHEET</w>
   <w c5="DTQ" hw="what" pos="PRON">WHAT</w>
   <w c5="VBZ" hw="be" pos="VERB">IS</w>
   <w c5="NN1" hw="aids" pos="SUBST">AIDS</w>
   <c c5="PUN">?</c>
  </s>
 </head>
 <p>
  <s n="2">
   <hi rend="bo">
    <w c5="NN1" hw="aids" pos="SUBST">AIDS</w>
    <c c5="PUL">(</c>
    <w c5="VVN-AJ0" hw="acquire" pos="VERB">Acquired</w>
    <w c5="AJ0" hw="immune" pos="ADJ">Immune</w>
    <w c5="NN1" hw="deficiency" pos="SUBST">Deficiency</w>
    <w c5="NN1" hw="syndrome" pos="SUBST">Syndrome</w>
    <c c5="PUR">)</c>
   </hi>
   <w c5="VBZ" hw="be" pos="VERB">is</w>
   <w c5="AT0" hw="a" pos="ART">a</w>
   <w c5="NN1" hw="condition" pos="SUBST">condition</w>
<!-- ... -->
  </s>
 </p>
</div>


James Cummings. Date: April 2009
Copyright University of Oxford