https://github.com/TEIC/TEI
Tip revision: 347c64fac3fa1a64ade0d8d1842813d4b8f7acec authored by Hugh Cayless on 12 May 2017, 16:57:59 UTC
SA.html

<!DOCTYPE html
  SYSTEM "about:legacy-compat">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><!--THIS FILE IS GENERATED FROM AN XML MASTER. DO NOT EDIT (4)--><title>16 Linking, Segmentation, and Alignment - The TEI Guidelines</title><meta property="Language" content="en" /><meta property="DC.Title" content="16 Linking, Segmentation, and Alignment - The TEI Guidelines" /><meta property="DC.Language" content="SCHEME=iso639 en" /><meta property="DC.Creator.Address" content="tei@oucs.ox.ac.uk" /><meta charset="utf-8" /><link href="guidelines.css" rel="stylesheet" type="text/css" /><link href="odd.css" rel="stylesheet" type="text/css" /><link rel="stylesheet" media="print" type="text/css" href="guidelines-print.css" /><script type="text/javascript" src="jquery-1.2.6.min.js"></script><script type="text/javascript" src="columnlist.js"></script><script type="text/javascript" src="popupFootnotes.js"></script><script type="text/javascript">
        $(function() {
         $('ul.attrefs-class').columnizeList({cols:3,width:30,unit:'%'});
         $('ul.attrefs-element').columnizeList({cols:3,width:30,unit:'%'});
         $(".displayRelaxButton").click(function() {
           $(this).parent().find('.RNG_XML').toggle();
           $(this).parent().find('.RNG_Compact').toggle();
         });
         $(".tocTree .showhide").click(function() {
          $(this).find(".tocShow,.tocHide").toggle();
          $(this).parent().find("ul.continuedtoc").toggle();
	  });
        })
    </script><script type="text/javascript"><!--
var displayXML=0;
states=new Array()
states[0]="element-a"
states[1]="element-b"
states[2]="element-c"
states[3]="element-d"
states[4]="element-e"
states[5]="element-f"
states[6]="element-g"
states[7]="element-h"
states[8]="element-i"
states[9]="element-j"
states[10]="element-k"
states[11]="element-l"
states[12]="element-m"
states[13]="element-n"
states[14]="element-o"
states[15]="element-p"
states[16]="element-q"
states[17]="element-r"
states[18]="element-s"
states[19]="element-t"
states[20]="element-u"
states[21]="element-v"
states[22]="element-w"
states[23]="element-x"
states[24]="element-y"
states[25]="element-z"

function startUp() {

}

function hideallExcept(elm) {
for (var i = 0; i < states.length; i++) {
 var layer;
 if (layer = document.getElementById(states[i]) ) {
  if (states[i] != elm) {
    layer.style.display = "none";
  }
  else {
   layer.style.display = "block";
      }
  }
 }
 var mod;
 if ( mod = document.getElementById('byMod') ) {
     mod.style.display = "none";
 }
}

function showall() {
 for (var i = 0; i < states.length; i++) {
   var layer;
   if (layer = document.getElementById(states[i]) ) {
      layer.style.display = "block";
      }
  }
}

function showByMod() {
  hideallExcept('');
  var mod;
  if (mod = document.getElementById('byMod') ) {
     mod.style.display = "block";
     }
}

	--></script></head><body><div id="container"><div id="banner"><img src="Images/banner.jpg" alt="Text Encoding Initiative logo and banner" /></div></div><div class="mainhead"><h1>P5: 
    Guidelines for Electronic Text Encoding and Interchange</h1><p>Version 3.1.1a. Last updated on
	10th May 2017, revision bd8dda3</p></div><div id="onecol" class="main-content"><h2><span class="headingNumber">16 </span>Linking, Segmentation, and Alignment</h2><div class="div1" id="SA"><div class="miniTOC miniTOC_left"><p><span class="subtochead">Table of contents</span></p><div class="subtoc"><ul class="subtoc"><li class="subtoc"><a class="subtoc" href="SA.html#SAPT" title="Links">16.1 Links</a></li><li class="subtoc"><a class="subtoc" href="SA.html#SAXP" title="Pointing Mechanisms">16.2 Pointing Mechanisms</a></li><li class="subtoc"><a class="subtoc" href="SA.html#SASE" title="Blocks Segments and Anchors">16.3 Blocks, Segments, and Anchors</a></li><li class="subtoc"><a class="subtoc" href="SA.html#SASY" title="Synchronization">16.4 Synchronization</a></li><li class="subtoc"><a class="subtoc" href="SA.html#SACS" title="Correspondence and Alignment">16.5 Correspondence and Alignment</a></li><li class="subtoc"><a class="subtoc" href="SA.html#SAIE" title="Identical Elements and Virtual Copies">16.6 Identical Elements and Virtual Copies</a></li><li class="subtoc"><a class="subtoc" href="SA.html#SAAG" title="Aggregation">16.7 Aggregation</a></li><li class="subtoc"><a class="subtoc" href="SA.html#SAAT" title="Alternation">16.8 Alternation</a></li><li class="subtoc"><a class="subtoc" href="SA.html#SASO" title="Standoff Markup">16.9 Stand-off Markup</a></li><li class="subtoc"><a class="subtoc" href="SA.html#SAAN" title="Connecting Analytic and Textual Markup">16.10 Connecting Analytic and Textual Markup</a></li><li class="subtoc"><a class="subtoc" href="SA.html#SAref" title="Module for Linking Segmentation and Alignment">16.11 Module for Linking, Segmentation, and Alignment</a></li></ul></div><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="CC.html"><span class="headingNumber">15 </span>Language Corpora</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="AI.html"><span 
class="headingNumber">17 </span>Simple Analytic Mechanisms</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><p>This chapter discusses a number of ways in which encoders may represent analyses of the structure of a text which are not necessarily linear or hierarchic. The module defined by this chapter provides for the following common requirements: </p><ul class="bulleted"><li class="item">to link disparate elements using the <span class="att">xml:id</span> attribute (section <a class="link_ptr" href="SA.html#SAPT" title="Links"><span class="headingNumber">16.1 </span>Links</a>);</li><li class="item">to link disparate elements without using the <span class="att">xml:id</span> attribute (sections <a class="link_ptr" href="SA.html#SAUR" title="Pointing Elsewhere"><span class="headingNumber">16.2.1 </span>Pointing Elsewhere</a> and <a class="link_ptr" href="SA.html#SATS" title="TEI XPointer Schemes"><span class="headingNumber">16.2.4 </span>TEI XPointer Schemes</a>);</li><li class="item">to segment text into elements convenient for the encoder and to mark arbitrary points within documents (section <a class="link_ptr" href="SA.html#SASE" title="Blocks Segments and Anchors"><span class="headingNumber">16.3 </span>Blocks, Segments, and Anchors</a>);</li><li class="item">to represent correspondence or alignment among groups of text elements, both those with content and those which are empty (section <a class="link_ptr" href="SA.html#SACS" title="Correspondence and Alignment"><span class="headingNumber">16.5 </span>Correspondence and Alignment</a>);<span id="Note94_return"><a class="notelink" title="We use the term alignment as a special case for the more general notion of correspondence. 
Using A as a short form for an element with its attribute x…" href="#Note94"><sup>57</sup></a></span></li><li class="item">to synchronize elements of a text, that is to represent temporal correspondences and alignments among text elements (section <a class="link_ptr" href="SA.html#SASY" title="Synchronization"><span class="headingNumber">16.4 </span>Synchronization</a>) and also to align them with specific points in time (section <a class="link_ptr" href="SA.html#SASYMP" title="Placing Synchronous Events in Time"><span class="headingNumber">16.4.2 </span>Placing Synchronous Events in Time</a>);</li><li class="item">to specify that one text element is identical to or a copy of another (section <a class="link_ptr" href="SA.html#SAIE" title="Identical Elements and Virtual Copies"><span class="headingNumber">16.6 </span>Identical Elements and Virtual Copies</a>);</li><li class="item">to aggregate possibly noncontiguous elements (section <a class="link_ptr" href="SA.html#SAAG" title="Aggregation"><span class="headingNumber">16.7 </span>Aggregation</a>);</li><li class="item">to specify that different elements are alternatives to one another and to express preferences among the alternatives (section <a class="link_ptr" href="SA.html#SAAT" title="Alternation"><span class="headingNumber">16.8 </span>Alternation</a>);</li><li class="item">to store markup separately from the data it describes (section <a class="link_ptr" href="SA.html#SASO" title="Standoff Markup"><span class="headingNumber">16.9 </span>Stand-off Markup</a>);</li><li class="item">to associate segments of a text with interpretations or analyses of their significance (section <a class="link_ptr" href="SA.html#SAAN" title="Connecting Analytic and Textual Markup"><span class="headingNumber">16.10 </span>Connecting Analytic and Textual Markup</a>).</li></ul><p>These facilities all use the same set of techniques based on the W3C XPointer framework (<a class="link_ptr" href="BIB.html#XPTRFMWK" title="Paul Grosso Eve 
Maler Jonathan Marsh Norman Walsh XPointer FrameworkW3C25 March 2003">Grosso et al. (eds.) (2003)</a>). This provides a variety of <span class="noindex">schemes</span>, the most convenient of which, and the one recommended by these Guidelines, makes use of the global <span class="att">xml:id</span> attribute, as defined in section <a class="link_ptr" href="ST.html#STGA" title="Global Attributes"><span class="headingNumber">1.3.1.1 </span>Global Attributes</a>, and introduced in the section of <a class="link_ptr" href="SG.html" title="A Gentle Introduction to XML"><span class="headingNumber">v. </span>A Gentle Introduction to XML</a> titled <a class="link_ptr" href="SG.html#SG-id" title="Identifiers and Indicators"><span class="headingNumber"></span>Identifiers and Indicators</a>. When the <span class="ident-module">linking</span> module is included in a schema, the attribute class <a class="link_odd" title="provides attributes common to all elements in the TEI encoding scheme." href="ref-att.global.html">att.global</a> is extended to include eight additional attributes to support the various kinds of linking listed above. Each of these attributes is introduced in the appropriate section below. 
In addition, for many of the topics discussed, a choice of methods of encoding is offered, ranging from simple but less general ones, which use attribute values only, to more elaborate and more general ones, which use specialized elements.</p><div class="div2" id="SAPT"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SAXP"><span class="headingNumber">16.2 </span>Pointing Mechanisms</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h3><span class="bookmarklink"><a class="bookmarklink" href="#SAPT" title="link to this section "><span class="invisible">TEI: Links</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.1 </span><span class="head">Links</span></h3><p>We say that one element <span class="noindex">points</span> to others if the first has an attribute whose value is a reference to the others: such an element is called a <span class="term">pointer element</span>, or simply a <span class="term">pointer</span>. Among the pointers that have been introduced up to this point in these Guidelines are <a class="gi" title="contains a note or annotation." href="ref-note.html">note</a>, <a class="gi" title="(reference) defines a reference to another location, possibly modified by additional text or comment." href="ref-ref.html">ref</a>, and <a class="gi" title="(pointer) defines a pointer to another location." href="ref-ptr.html">ptr</a>. These elements all indicate an association between one place in the document (the location of the pointer itself) and one or more others (the elements whose identifiers are specified by the pointer's <span class="att">target</span> attribute). The module described in this chapter introduces a variation on this basic kind of pointer, known as a <span class="term">link</span>, which specifies both ‘ends’ of an association. 
In addition, we define a syntax for representing locations in a document by a variety of means not dependent on the use of <span class="att">xml:id</span> attributes.</p><div class="div3" id="SAPTL"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SAPTEG"><span class="headingNumber">16.1.2 </span>Using Pointers and Links</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SAPTL" title="link to this section "><span class="invisible">TEI: Pointers and Links</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.1.1 </span><span class="head">Pointers and Links</span></h4><p>In section <a class="link_ptr" href="CO.html#COXR" title="Simple Links and CrossReferences"><span class="headingNumber">3.6 </span>Simple Links and Cross-References</a> we introduced the simplest pointer elements, <a class="gi" title="(pointer) defines a pointer to another location." href="ref-ptr.html">ptr</a> and <a class="gi" title="(reference) defines a reference to another location, possibly modified by additional text or comment." href="ref-ref.html">ref</a>. Here we introduce additionally the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element, which represents an association between two (or more) locations by specifying each location explicitly. Its own location is irrelevant to the intended linkage. All three elements use the attribute <span class="att">target</span>, provided by the <a class="link_odd" title="provides a set of attributes used by all elements which point to other elements by means of one or more URI references." 
href="ref-att.pointing.html">att.pointing</a> class as a means of indicating the location or locations referenced or pointed to. </p><ul class="specList"><li><span class="specList-classSpec"><a href="ref-att.pointing.html">att.pointing</a></span> provides a set of attributes used by all elements which point to other elements by means of one or more URI references.<table class="specDesc"><tr><td class="Attribute"><span class="att">target</span></td><td>specifies the destination of the reference by supplying one or more URI References</td></tr></table></li><li><span class="specList-elementSpec"><a href="ref-link.html">link</a></span> defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements.</li></ul><p> The <a class="gi" title="(pointer) defines a pointer to another location." href="ref-ptr.html">ptr</a> element may be called a ‘pure pointer’, because its primary function is simply to point. A pointer sets up a <span class="noindex">connection</span> between an element (which, in the case of a pure pointer, is simply a location in a document), and one or more others, known collectively as its <span class="term">target</span>. The <a class="gi" title="(pointer) defines a pointer to another location." href="ref-ptr.html">ptr</a> and <a class="gi" title="(reference) defines a reference to another location, possibly modified by additional text or comment." href="ref-ref.html">ref</a> elements  point, conceptually, at a single target, even if that target may be discontinuous in the document. The <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." 
href="ref-link.html">link</a> element  specifies at least two targets and represents an association between them, independent of its own location.</p><p>These three elements also share a common set of attributes, derived from the <a class="link_odd" title="provides a set of attributes used by all elements which point to other elements by means of one or more URI references." href="ref-att.pointing.html">att.pointing</a> and <a class="link_odd" title="provides attributes which can be used to classify or subclassify elements in any way." href="ref-att.typed.html">att.typed</a> classes: </p><ul class="specList"><li><span class="specList-classSpec"><a href="ref-att.pointing.html">att.pointing</a></span> provides a set of attributes used by all elements which point to other elements by means of one or more URI references.<table class="specDesc"><tr><td class="Attribute"><span class="att">evaluate</span></td><td>specifies the intended meaning when the target of a pointer is itself a pointer.</td></tr></table></li><li><span class="specList-classSpec"><a href="ref-att.typed.html">att.typed</a></span> provides attributes which can be used to classify or subclassify elements in any way.<table class="specDesc"><tr><td class="Attribute"><span class="att">type</span></td><td>characterizes the element in some sense, using any convenient classification scheme or typology.</td></tr><tr><td class="Attribute"><span class="att">subtype</span></td><td>provides a sub-categorization of the element, if needed</td></tr></table></li></ul><div class="p">Double connection among elements could also be expressed by a combination of pointer elements, for example, two <a class="gi" title="(pointer) defines a pointer to another location." href="ref-ptr.html">ptr</a> elements, or one <a class="gi" title="(pointer) defines a pointer to another location." href="ref-ptr.html">ptr</a> element and one <a class="gi" title="contains a note or annotation." href="ref-note.html">note</a> element. 
All that is required is that the value of the <span class="att">target</span> (or other pointing) attribute of the one be a reference to the value of the <span class="att">xml:id</span> attribute of the other. What the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element accomplishes is the handling of double connection by means of a single element. Thus, in the following encoding: <div id="index-egXML-d52e118238" class="pre egXML_valid"><span class="element">&lt;ptr <span class="attribute">xml:id</span>="<span class="attributevalue">sa-p1</span>" <span class="attribute">target</span>="<span class="attributevalue">#sa-p2</span>"/&gt;</span><br /><span class="element">&lt;ptr <span class="attribute">xml:id</span>="<span class="attributevalue">sa-p2</span>" <span class="attribute">target</span>="<span class="attributevalue">#sa-p1</span>"/&gt;</span></div> <span class="val">sa-p1</span> points to <span class="val">sa-p2</span>, and <span class="val">sa-p2</span> points to <span class="val">sa-p1</span>. This is logically equivalent to the more compact encoding: <div id="index-egXML-d52e118255" class="pre egXML_valid"><span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#sa-p1 #sa-p2</span>"/&gt;</span></div></div><p>As noted elsewhere, the <span class="att">target</span> attribute may take as its value one or more URI references. In the simplest case, each such reference will indicate an element in the current document (or in some other document), for example by supplying the value used for its global <span class="att">xml:id</span> attribute. It may, however, carry as its value any form of URI, such as a URL pointing to some other document or location on the Internet. 
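</p><div class="p">As a minimal sketch of the cases just described, the following pointers show a reference to an element in the current document, a reference to an element in some other document, a plain URL, and a value made up of more than one reference. The identifiers reuse those of the example above; the file name <span class="val">other.xml</span> and the address <span class="val">http://www.example.org/</span> are placeholders assumed purely for illustration: <div class="pre egXML_valid"><span class="comment">&lt;!-- an element in the current document, identified by its xml:id --&gt;</span><br /><span class="element">&lt;ptr <span class="attribute">target</span>="<span class="attributevalue">#sa-p1</span>"/&gt;</span><br /><span class="comment">&lt;!-- an element in another document (hypothetical file name) --&gt;</span><br /><span class="element">&lt;ptr <span class="attribute">target</span>="<span class="attributevalue">other.xml#sa-p1</span>"/&gt;</span><br /><span class="comment">&lt;!-- any other URI, here a placeholder URL --&gt;</span><br /><span class="element">&lt;ptr <span class="attribute">target</span>="<span class="attributevalue">http://www.example.org/</span>"/&gt;</span><br /><span class="comment">&lt;!-- more than one reference, separated by white space --&gt;</span><br /><span class="element">&lt;ptr <span class="attribute">target</span>="<span class="attributevalue">#sa-p1 other.xml#sa-p2</span>"/&gt;</span></div> In each case the value is read as a URI reference, so a leading <span class="val">#</span> is what distinguishes a reference within the current document from the name of another resource.</div><p>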
Pointing or linking to external documents, and pointing or linking where identifiers are not available, are described below in section <a class="link_ptr" href="SA.html#SAXP" title="Pointing Mechanisms"><span class="headingNumber">16.2 </span>Pointing Mechanisms</a>.</p></div><div class="div3" id="SAPTEG"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SAPTL"><span class="headingNumber">16.1.1 </span>Pointers and Links</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SAPTLG"><span class="headingNumber">16.1.3 </span>Groups of Links</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SAPTEG" title="link to this section "><span class="invisible">TEI: Using Pointers and Links</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.1.2 </span><span class="head">Using Pointers and Links</span></h4><div class="p">As an example of the use of mechanisms which establish connections among elements, consider the practice (common in 18th-century English verse and elsewhere) of providing footnotes citing parallel passages from classical authors. <figure class="figure float fullpage" id="POPE"><img src="Images/dunpic.png" alt="The figure shows the original page of Pope's Dunciad which is discussed in the text." class="graphic" /></figure> Such footnotes can of course simply be encoded using the <a class="gi" title="contains a note or annotation." 
href="ref-note.html">note</a> element (see section <a class="link_ptr" href="CO.html#CONO" title="Notes Annotation and Indexing"><span class="headingNumber">3.8 </span>Notes, Annotation, and Indexing</a>) without a <span class="att">target</span> attribute, placed adjacent to the passage to which the note refers:<span id="Note95_return"><a class="notelink" title="The type attribute on the note is used to classify the notes using the typology established in the Advertisement to the work: The Imitations of the An…" href="#Note95"><sup>58</sup></a></span> <div id="index-egXML-d52e118306" class="pre egXML_valid"><span class="element">&lt;l&gt;</span>(Diff'rent our parties, but with equal grace<span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;l&gt;</span>The Goddess smiles on Whig and Tory race,<span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;l&gt;</span><br /> <span class="element">&lt;note <span class="attribute">type</span>="<span class="attributevalue">imitation</span>" <span class="attribute">place</span>="<span class="attributevalue">bottom</span>"<br />  <span class="attribute">anchored</span>="<span class="attributevalue">false</span>"&gt;</span><br />  <span class="element">&lt;bibl&gt;</span>Virg. Æn. 
10.<span class="element">&lt;/bibl&gt;</span><br />  <span class="element">&lt;quote&gt;</span><br />   <span class="element">&lt;l&gt;</span>Tros Rutulusve fuat; nullo discrimine habebo.<span class="element">&lt;/l&gt;</span><br />   <span class="element">&lt;l&gt;</span>—— Rex Jupiter omnibus idem.<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;/quote&gt;</span><br /> <span class="element">&lt;/note&gt;</span>'Tis the same rope at sev'ral ends they twist,<span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;l&gt;</span>To Dulness, Ridpath is as dear as Mist)<span class="element">&lt;/l&gt;</span><div style="float: right;"><a href="BIB.html#SAPTEG-eg-3">bibliography</a> </div></div> </div><p>This use of the <a class="gi" title="contains a note or annotation." href="ref-note.html">note</a> element can be called <span class="term">implicit pointing</span> (or <span class="term">implicit linking</span>). It relies on the juxtaposition of the note to the text being commented on for the connection to be understood. If it is felt that the mere juxtaposition of the note to the text does not make it sufficiently clear exactly what text segment is being commented on (for example, is it the immediately preceding line, or the immediately preceding two lines, or what?), or if it is decided to place the note at some distance from the text, then the pointing or the linking must be made explicit. We now consider various methods for doing that.</p><div class="p">Firstly, a <a class="gi" title="(pointer) defines a pointer to another location." 
href="ref-ptr.html">ptr</a> element might be placed at an appropriate point within the text to link it with the annotation: <div id="index-egXML-d52e118341" class="pre egXML_valid"><span class="element">&lt;l&gt;</span>(Diff'rent our parties, but with equal grace<span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;l&gt;</span>The Goddess smiles on Whig and Tory race,<br /> <span class="element">&lt;ptr <span class="attribute">rend</span>="<span class="attributevalue">unmarked</span>" <span class="attribute">target</span>="<span class="attributevalue">#note3.284</span>"/&gt;</span><span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;l&gt;</span>'Tis the same rope at sev'ral ends they twist,<span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;l&gt;</span>To Dulness, Ridpath is as dear as Mist)<span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;note <span class="attribute">xml:id</span>="<span class="attributevalue">note3.284</span>" <span class="attribute">type</span>="<span class="attributevalue">imitation</span>"<br /> <span class="attribute">place</span>="<span class="attributevalue">bottom</span>" <span class="attribute">anchored</span>="<span class="attributevalue">false</span>"&gt;</span><br /> <span class="element">&lt;bibl&gt;</span>Virg. Æn. 10.<span class="element">&lt;/bibl&gt;</span><br /> <span class="element">&lt;quote&gt;</span><br />  <span class="element">&lt;l&gt;</span>Tros Rutulusve fuat; nullo discrimine habebo.<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;l&gt;</span>—— Rex Jupiter omnibus idem.<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;/quote&gt;</span><br /><span class="element">&lt;/note&gt;</span><div style="float: right;"><a href="BIB.html#SAPTEG-eg-3">bibliography</a> </div></div>  The <a class="gi" title="contains a note or annotation." 
href="ref-note.html">note</a> element has been given an arbitrary identifier (<span class="val">note3.284</span>) to enable it to be specified as the target of the pointer element. Because there is nothing in the text to signal the existence of the annotation, the <span class="att">rend</span> attribute has been given the value <span class="val">unmarked</span>.</div><div class="p">Secondly, the <span class="att">target</span> attribute of the <a class="gi" title="contains a note or annotation." href="ref-note.html">note</a> element can be used to point at its associated text, provided that an <span class="att">xml:id</span> attribute has been supplied for the associated text: <div id="index-egXML-d52e118386" class="pre egXML_valid"><span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">L3.283</span>"&gt;</span>(Diff'rent our parties, but with equal grace<span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">L3.284</span>"&gt;</span>The Goddess smiles on Whig and Tory race,<span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">L3.285</span>"&gt;</span>'Tis the same rope at sev'ral ends they twist,<span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">L3.286</span>"&gt;</span>To Dulness, Ridpath is as dear as Mist)<span class="element">&lt;/l&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><div style="float: right;"><a href="BIB.html#SAPTEG-eg-3">bibliography</a> </div></div> Given this encoding of the text itself, we can now link the various notes to it. 
In this case, the note itself contains a pointer to the place in the text which it is annotating; this could be encoded using a <a class="gi" title="(reference) defines a reference to another location, possibly modified by additional text or comment." href="ref-ref.html">ref</a> element, which bears a <span class="att">target</span> attribute of its own and contains a (slightly misquoted) extract from the text marked as a <a class="gi" title="(quotation) contains a phrase or passage attributed by the narrator or author to some agency external to the text." href="ref-quote.html">quote</a> element: <div id="index-egXML-d52e118407" class="pre egXML_valid"><span class="element">&lt;note <span class="attribute">type</span>="<span class="attributevalue">imitation</span>" <span class="attribute">place</span>="<span class="attributevalue">bottom</span>"<br /> <span class="attribute">anchored</span>="<span class="attributevalue">false</span>" <span class="attribute">target</span>="<span class="attributevalue">#L3.284</span>"&gt;</span><br /> <span class="element">&lt;ref <span class="attribute">rend</span>="<span class="attributevalue">sc</span>" <span class="attribute">target</span>="<span class="attributevalue">#L3.284</span>"&gt;</span>Verse 283–84.<br />  <span class="element">&lt;quote&gt;</span><br />   <span class="element">&lt;l&gt;</span>——. With equal grace<span class="element">&lt;/l&gt;</span><br />   <span class="element">&lt;l&gt;</span>Our Goddess smiles on Whig and Tory race.<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;/quote&gt;</span><span class="element">&lt;/ref&gt;</span><br /> <span class="element">&lt;bibl&gt;</span>Virg. Æn. 10.<span class="element">&lt;/bibl&gt;</span><br /> <span class="element">&lt;quote&gt;</span><br />  <span class="element">&lt;l&gt;</span>Tros Rutulusve fuat; nullo discrimine habebo.<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;l&gt;</span>—— Rex Jupiter omnibus idem. 
<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;/quote&gt;</span><br /><span class="element">&lt;/note&gt;</span><div style="float: right;"><a href="BIB.html#SAPTEG-eg-3">bibliography</a> </div></div> </div><div class="p">Combining these two approaches gives us the following associations: <ul class="bulleted"><li class="item">a pointer within one line indicates the note</li><li class="item">the note indicates the line</li><li class="item">a pointer within the note indicates the line</li></ul> Note that we do not have any way of pointing from the line itself to the note: the association is implied by containment of the pointer. We do not as yet have a true double link between text and note. To achieve that we will need to supply identifiers for the annotations as well as for the verse lines, and use a <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element to associate the two. Note that the <a class="gi" title="(pointer) defines a pointer to another location." href="ref-ptr.html">ptr</a> element and the <span class="att">target</span> attribute on the <a class="gi" title="contains a note or annotation." 
href="ref-note.html">note</a> may now be dispensed with: <div id="index-egXML-d52e118448" class="pre egXML_valid"><span class="element">&lt;note <span class="attribute">xml:id</span>="<span class="attributevalue">n3.284</span>" <span class="attribute">type</span>="<span class="attributevalue">imitation</span>"<br /> <span class="attribute">place</span>="<span class="attributevalue">bottom</span>" <span class="attribute">anchored</span>="<span class="attributevalue">false</span>"&gt;</span><br /> <span class="element">&lt;ref <span class="attribute">rend</span>="<span class="attributevalue">sc</span>" <span class="attribute">target</span>="<span class="attributevalue">#L3.284</span>"&gt;</span>Verse 283–84.<br />  <span class="element">&lt;quote&gt;</span><br />   <span class="element">&lt;l&gt;</span>——. With equal grace<span class="element">&lt;/l&gt;</span><br />   <span class="element">&lt;l&gt;</span>Our Goddess smiles on Whig and Tory race.<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;/quote&gt;</span><span class="element">&lt;/ref&gt;</span><br /> <span class="element">&lt;bibl&gt;</span>Virg. Æn. 10.<span class="element">&lt;/bibl&gt;</span><br /> <span class="element">&lt;quote&gt;</span><br />  <span class="element">&lt;l&gt;</span>Tros Rutulusve fuat; nullo discrimine habebo.<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;l&gt;</span>—— Rex Jupiter omnibus idem. 
<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;/quote&gt;</span><br /><span class="element">&lt;/note&gt;</span><br /><span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#n3.284 #L3.284</span>"/&gt;</span><div style="float: right;"><a href="BIB.html#SAPTEG-eg-3">bibliography</a> </div></div> </div><div class="p">The <span class="att">target</span> attribute of the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element here bears the identifier of the note followed by that of the verse line. We could also allocate an identifier to the reference within the note and encode the association between it and the verse line in the same way: <div id="index-egXML-d52e118475" class="pre egXML_valid"><span class="element">&lt;note <span class="attribute">type</span>="<span class="attributevalue">imitation</span>" <span class="attribute">place</span>="<span class="attributevalue">bottom</span>"<br /> <span class="attribute">anchored</span>="<span class="attributevalue">false</span>"&gt;</span><br /> <span class="element">&lt;ref <span class="attribute">rend</span>="<span class="attributevalue">sc</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">r3.284</span>"<br />  <span class="attribute">target</span>="<span class="attributevalue">#L3.284</span>"&gt;</span>Verse 283–84.<br />  <span class="element">&lt;quote&gt;</span><br />   <span class="element">&lt;l&gt;</span>——. With equal grace<span class="element">&lt;/l&gt;</span><br />   <span class="element">&lt;l&gt;</span>Our Goddess smiles on Whig and Tory race.<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;/quote&gt;</span><span class="element">&lt;/ref&gt;</span><br /><span class="comment">&lt;!-- ... 
--&gt;</span><br /><span class="element">&lt;/note&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /><span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#r3.284 #L3.284</span>"/&gt;</span><div style="float: right;"><a href="BIB.html#SAPTEG-eg-3">bibliography</a> </div></div>  Indeed, the two <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a>s could be combined into one, as follows: <div id="index-egXML-d52e118494" class="pre egXML_valid"><span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#n3.284 #r3.284 #L3.284</span>"/&gt;</span></div></div></div><div class="div3" id="SAPTLG"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SAPTEG"><span class="headingNumber">16.1.2 </span>Using Pointers and Links</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SAPTIP"><span class="headingNumber">16.1.4 </span>Intermediate Pointers</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SAPTLG" title="link to this section "><span class="invisible">TEI: Groups of Links</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.1.3 </span><span class="head">Groups of Links</span></h4><p>Clearly, there are many reasons for which an encoder might wish to represent a link or association between different elements. For some of them, specific elements are provided in these Guidelines; some of these are discussed elsewhere in the present chapter. 
The <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element is a general purpose element which may be used for any kind of association. The element <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> may be used to group links of a particular type together in a single part of the document; such a collection may be used to represent what is sometimes referred to in the literature of Hypertext as a <span class="term">web</span>, a term introduced by the Brown University FRESS project in 1969, and not to be confused with the World Wide Web. </p><ul class="specList"><li><span class="specList-elementSpec"><a href="ref-linkGrp.html">linkGrp</a></span> (link group) defines a collection of associations or hypertextual links.</li></ul><p> As a member of the class <a class="link_odd" title="provides a set of attributes common to all elements which enclose groups of pointer elements." 
href="ref-att.pointing.group.html">att.pointing.group</a>, this element shares the following attributes with other members of that class: </p><ul class="specList"><li><span class="specList-classSpec"><a href="ref-att.pointing.group.html">att.pointing.group</a></span> provides a set of attributes common to all elements which enclose groups of pointer elements.<table class="specDesc"><tr><td class="Attribute"><span class="att">domains</span></td><td>optionally specifies the identifiers of the elements within which all elements indicated by the contents of this element lie.</td></tr><tr><td class="Attribute"><span class="att">targFunc</span></td><td>(target function) describes the function of each of the values of the <span class="att">target</span> attribute of the enclosed <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a>, <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a>, or <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> tags.</td></tr></table></li></ul><p> It is also a member of the <a class="link_odd" title="provides a set of attributes used by all elements which point to other elements by means of one or more URI references." href="ref-att.pointing.html">att.pointing</a> and <a class="link_odd" title="provides attributes which can be used to classify or subclassify elements in any way." 
href="ref-att.typed.html">att.typed</a> classes, and therefore also carries the attributes specified in section <a class="link_ptr" href="SA.html#SAPTL" title="Pointers and Links"><span class="headingNumber">16.1.1 </span>Pointers and Links</a> above, in particular the <span class="att">type</span> attribute.</p><p>The <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> element provides a convenient way of establishing a default for the <span class="att">type</span> attribute on a group of links of the same type: by default, the <span class="att">type</span> attribute on a <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element has the same value as that given for <span class="att">type</span> on the enclosing <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a>.</p><div class="p">Typical software might hide a web entirely from the user, but use it as a source of information about links, which are displayed independently at their referenced locations. Alternatively, software might provide a direct view of the link collection, along with added functions for manipulating the collection, as by filtering, sorting, and so on. To continue our previous example, this text contains many other notes of a kind similar to the one shown above. 
Here are a few more of the lines to which annotations have to be attached, followed by the annotations themselves: <div id="index-egXML-d52e118554" class="pre egXML_valid"><span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">L2.79</span>"&gt;</span>A place there is, betwixt earth, air and seas<span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">L2.80</span>"&gt;</span>Where from Ambrosia, Jove retires for ease.<span class="element">&lt;/l&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /><span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">L2.88</span>"&gt;</span>Sign'd with that Ichor which from Gods distills.<span class="element">&lt;/l&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /><span class="element">&lt;note <span class="attribute">xml:id</span>="<span class="attributevalue">n2.79</span>" <span class="attribute">place</span>="<span class="attributevalue">bottom</span>"<br /> <span class="attribute">anchored</span>="<span class="attributevalue">false</span>"&gt;</span><br /> <span class="element">&lt;bibl&gt;</span>Ovid Met. 
12.<span class="element">&lt;/bibl&gt;</span><br /> <span class="element">&lt;quote <span class="attribute">xml:lang</span>="<span class="attributevalue">la</span>"&gt;</span><br />  <span class="element">&lt;l&gt;</span>Orbe locus media est, inter terrasq; fretumq;<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;l&gt;</span>Cœlestesq; plagas —<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;/quote&gt;</span><br /><span class="element">&lt;/note&gt;</span><br /><span class="element">&lt;note <span class="attribute">xml:id</span>="<span class="attributevalue">n2.88</span>" <span class="attribute">place</span>="<span class="attributevalue">bottom</span>"<br /> <span class="attribute">anchored</span>="<span class="attributevalue">false</span>"&gt;</span> Alludes to <span class="element">&lt;bibl&gt;</span>Homer, Iliad 5<span class="element">&lt;/bibl&gt;</span> ...<br /> <span class="element">&lt;/note&gt;</span></div> To avoid having to repeat the specification of <span class="att">type</span> as <span class="val">imitation</span> on each <a class="gi" title="contains a note or annotation." href="ref-note.html">note</a>, we may specify it once for all on a <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> element containing all links of this type. 
<div id="index-egXML-d52e118590" class="pre egXML_valid"><span class="element">&lt;linkGrp <span class="attribute">type</span>="<span class="attributevalue">imitation</span>"&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#n2.79 #L2.79</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#n2.88 #L2.88</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#n3.284 #L3.284</span>"/&gt;</span><br /><span class="element">&lt;/linkGrp&gt;</span></div></div><div class="p">Additional information for applications that use <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> elements can be provided by means of special attributes. First, the <span class="att">domains</span> attribute can be used to identify the text elements within which the individual targets of the links are to be found. Suppose that the text under discussion is organized into a <a class="gi" title="(text body) contains the whole body of a single unitary text, excluding any front or back matter." href="ref-body.html">body</a> element, containing the text of the poem, and a <a class="gi" title="(back matter) contains any appendixes, etc. following the main part of a text." href="ref-back.html">back</a> element containing the notes. Then the <span class="att">domains</span> attribute can have as its value the identifiers of the <a class="gi" title="(text body) contains the whole body of a single unitary text, excluding any front or back matter." href="ref-body.html">body</a> and the <a class="gi" title="(back matter) contains any appendixes, etc. following the main part of a text." 
href="ref-back.html">back</a>, to enable an application to verify that the link targets are in fact contained by appropriate elements, or to limit its search space: <div id="index-egXML-d52e118619" class="pre egXML_valid"><br /><span class="comment">&lt;!-- ... --&gt;</span><span class="element">&lt;linkGrp <span class="attribute">type</span>="<span class="attributevalue">imitation</span>"<br /> <span class="attribute">domains</span>="<span class="attributevalue">#dunciad #dunnotes</span>"&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#n2.79 #L2.79</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#n2.88 #L2.88</span>"/&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#n3.284 #L3.284</span>"/&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /><span class="element">&lt;/linkGrp&gt;</span></div></div><p>Note that there must be a single parent element for each ‘domain’; if some notes are contained by a section with identifier <span class="val">dunnotes</span>, and others by a section with identifier <span class="val">dunimits</span>, an intermediate pointer must be provided (as described in section <a class="link_ptr" href="SA.html#SAPTIP" title="Intermediate Pointers"><span class="headingNumber">16.1.4 </span>Intermediate Pointers</a>) within the <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> and its identifier used instead.</p><div class="p">Next, the <span class="att">targFunc</span> attribute can be used to provide further information about the role or function of the various targets specified for each link in the group. 
The value of the <span class="att">targFunc</span> attribute is a list of names (formally, name tokens), one for each of the targets in the link; these names can be chosen freely by the encoder, but their significance should be documented in the encoding description in the header.<span id="Note96_return"><a class="notelink" title="Since no special element is provided for this purpose in the present version of these Guidelines, the information should be supplied as a series of pa…" href="#Note96"><sup>59</sup></a></span> In the current example, we might think of the note as containing the <span class="noindex">source</span> of the imitation and the verse line as containing the <span class="noindex">goal</span> of the imitation. Accordingly, we can specify the <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> in the preceding example thus: <div id="index-egXML-d52e118669" class="pre egXML_valid"><span class="element">&lt;linkGrp <span class="attribute">type</span>="<span class="attributevalue">imitation</span>"<br /> <span class="attribute">domains</span>="<span class="attributevalue">#dunciad #dunnotes</span>" <span class="attribute">targFunc</span>="<span class="attributevalue">source goal</span>"&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#n2.79 #L2.79</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#n2.88 #L2.88</span>"/&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#n3.284 #L3.284</span>"/&gt;</span><br /><span class="comment">&lt;!-- ... 
--&gt;</span><br /><span class="element">&lt;/linkGrp&gt;</span></div></div></div><div class="div3" id="SAPTIP"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SAPTLG"><span class="headingNumber">16.1.3 </span>Groups of Links</a></li><li class="subtoc"></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SAPTIP" title="link to this section "><span class="invisible">TEI: Intermediate Pointers</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.1.4 </span><span class="head">Intermediate Pointers</span></h4><p>In the preceding examples, we have shown various ways of linking an annotation and a single verse line. However, the example cited in fact requires us to encode an association between the note and a <em>pair</em> of verse lines (lines 283 and 284); we call these two lines a <span class="term">span</span>.</p><p>There are a number of possible ways of correcting this error: one could use the <span class="att">target</span> attribute to indicate one end of the span and the special-purpose <span class="att">targetEnd</span> attribute on the <a class="gi" title="contains a note or annotation." href="ref-note.html">note</a> element to point to the other. Another possibility might be to create an element which represents the whole span itself and assign that an <span class="att">xml:id</span> attribute, which can then be linked to the <a class="gi" title="contains a note or annotation." href="ref-note.html">note</a> and <a class="gi" title="(reference) defines a reference to another location, possibly modified by additional text or comment." href="ref-ref.html">ref</a> elements. This could be done using, for example, the <a class="gi" title="(line group) contains one or more verse lines functioning as a formal unit, e.g. 
a stanza, refrain, verse paragraph, etc." href="ref-lg.html">lg</a> element defined in section <a class="link_ptr" href="CO.html#COVE" title="Core Tags for Verse"><span class="headingNumber">3.12.1 </span>Core Tags for Verse</a> or the ‘virtual’ <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> element discussed in section <a class="link_ptr" href="SA.html#SAAG" title="Aggregation"><span class="headingNumber">16.7 </span>Aggregation</a>.</p><div class="p">A third possibility would be to use an ‘intermediate pointer’ as follows: <div id="index-egXML-d52e118964" class="pre egXML_valid"><span class="element">&lt;ptr <span class="attribute">xml:id</span>="<span class="attributevalue">L3.283-284</span>"<br /> <span class="attribute">target</span>="<span class="attributevalue">#L3.283 #L3.284</span>"/&gt;</span></div> When the <span class="att">target</span> attribute of a <a class="gi" title="(pointer) defines a pointer to another location." href="ref-ptr.html">ptr</a> or <a class="gi" title="(reference) defines a reference to another location, possibly modified by additional text or comment." href="ref-ref.html">ref</a> element specifies more than one element, the indicated elements are intended to be combined or aggregated in some way to produce the object of the pointer. (Such aggregation is however the task of a processing application, and cannot be defined simply by the markup). The <span class="att">xml:id</span> attribute of the <a class="gi" title="(pointer) defines a pointer to another location." href="ref-ptr.html">ptr</a> then provides an identifier which can be linked to the <a class="gi" title="contains a note or annotation." href="ref-note.html">note</a> and <a class="gi" title="(reference) defines a reference to another location, possibly modified by additional text or comment." 
href="ref-ref.html">ref</a> elements: <div id="index-egXML-d52e118989" class="pre egXML_valid"><span class="element">&lt;link <span class="attribute">evaluate</span>="<span class="attributevalue">all</span>"<br /> <span class="attribute">target</span>="<span class="attributevalue">#n3.284 #r3.284 #L3.283-284</span>"/&gt;</span></div> </div><p>The <span class="val">all</span> value of <span class="att">evaluate</span> is used on the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element to specify that any pointer encountered as a target of that element is itself evaluated. If <span class="att">evaluate</span> had the value <span class="val">none</span>, the link target would be the pointer itself, rather than the objects it points to.</p><p>Where a <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> element is used to group a collection of <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> elements, any intermediate pointer elements used by those <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> elements should be included within the <a class="gi" title="(link group) defines a collection of associations or hypertextual links." 
href="ref-linkGrp.html">linkGrp</a>.</p></div></div><div class="div2" id="SAXP"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SAPT"><span class="headingNumber">16.1 </span>Links</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SASE"><span class="headingNumber">16.3 </span>Blocks, Segments, and Anchors</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h3><span class="bookmarklink"><a class="bookmarklink" href="#SAXP" title="link to this section "><span class="invisible">TEI: Pointing Mechanisms</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2 </span><span class="head">Pointing Mechanisms</span></h3><p>This section introduces more formally the pointing mechanisms available in the TEI. In addition to those discussed so far, the TEI provides methods of pointing: </p><ul class="bulleted"><li class="item">into documents other than the current document;</li><li class="item">to a particular element in a document other than the current document using its <span class="att">xml:id</span>;</li><li class="item">to a particular element whether in the current document or not, using its position in the XML element tree;</li><li class="item">at arbitrary content in any XML document using TEI-defined XPointer schemes.</li></ul><p>All TEI attributes used to point at something else are declared as having the datatype <span class="ident-datatype">data.pointer</span>, which is defined as a URI reference<span id="Note97_return"><a class="notelink" title="The URI (Uniform Resource Identifier) is defined in RFC 3986" href="#Note97"><sup>60</sup></a></span>; the cases so far discussed are all simple examples of a URI reference. 
Another familiar example is the mechanism used in XHTML to represent hypertext links by means of the <span class="att">href</span> attribute. A URI reference can reference the whole of an XML resource such as a document or an XML element, or a sub-portion of such a resource, identified by means of an appropriate <span class="term">fragment identifier</span>. Technically speaking, the ‘fragment identifier’ is that portion of a URI reference following the first unescaped <span class="q">‘#’</span> character; in practice, it provides a means of accessing some part of the resource described by the URI which is less than the whole. </p><p>The first three of the following subsections provide only a brief overview and some examples of the W3C mechanisms recommended. More detailed information on the use of these mechanisms is readily available elsewhere.</p><div class="div3" id="SAUR"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SABN"><span class="headingNumber">16.2.2 </span>Pointing Locally</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SAUR" title="link to this section "><span class="invisible">TEI: Pointing Elsewhere</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.1 </span><span class="head">Pointing Elsewhere</span></h4><p>Like the ubiquitous if misnamed XHTML pointing attribute <span class="att">href</span>, the TEI pointing attributes can point to a document that is not the current document (the one that contains the pointing element), whether it is in the same local filesystem as the current document or on a different system entirely. 
In either case, the pointing can be accomplished absolutely (using the entire address of the target document) or relatively (using an address relative to the current base URI in force). The ‘current base URI’ is defined according to <a class="link_ref" href="BIB.html#XMLBASE" title="Jonathan Marsh Richard Tobin XML Base (Second Edition)W3C28 January 2009">Marsh and Tobin 2009</a>. If no base URI has been set explicitly, the base URI is that of the current document. In common practice the current base URI in force is likely to be the value of the <span class="att">xml:base</span> attribute of the closest ancestor that has one. However, this may not be the case: a relative <span class="att">xml:base</span> value is itself resolved against the base URI inherited from its ancestors, so the effective base URI is accumulated through the hierarchy, beginning at the top and proceeding down to the context node.</p><div class="p">The following example demonstrates an absolute URI reference that points to a remote document: <div id="index-egXML-d52e119091" class="pre egXML_valid">The current base URI in force is as defined in the<br /> W3C <span class="element">&lt;ref <span class="attribute">target</span>="<span class="attributevalue">http://www.w3.org/TR/xmlbase/</span>"&gt;</span>XML<br />   Base<span class="element">&lt;/ref&gt;</span> recommendation.</div></div><div class="p">This example points explicitly to a location on the Web, accessible via HTTP. Suppose, however, that we wish to access a document stored locally in a file. 
Again we will supply an absolute URI reference, but this time using a different protocol: <div id="index-egXML-d52e119100" class="pre egXML_valid">This Debian package is distributed under the terms<br /> of the <span class="element">&lt;ref <span class="attribute">target</span>="<span class="attributevalue">file:///usr/share/common-licenses/GPL-2</span>"&gt;</span>GNU General Public License<span class="element">&lt;/ref&gt;</span>.</div></div><div class="p">In the following example, we use a relative URI reference to point to a local document: <div id="index-egXML-d52e119107" class="pre egXML_valid"><span class="element">&lt;figure <span class="attribute">rend</span>="<span class="attributevalue">float fullpage</span>"&gt;</span><br /> <span class="element">&lt;graphic <span class="attribute">url</span>="<span class="attributevalue">Images/compic.png</span>"/&gt;</span><br /> <span class="element">&lt;figDesc&gt;</span>The figure shows the page from the <span class="element">&lt;title&gt;</span>Orbis<br />       pictus<span class="element">&lt;/title&gt;</span> of Comenius which is discussed in the text.<span class="element">&lt;/figDesc&gt;</span><br /><span class="element">&lt;/figure&gt;</span></div> Since no <span class="att">xml:base</span> is specified here, the location of the resource <span class="ident-file">Images/compic.png</span> is determined relative to the resource indicated by the current base URI, which is the current document.</div><div class="p">In the following example, however, we first change the current base URI by setting a new value for <span class="att">xml:base</span>. 
The resource required is then identified by means of a relative URI: <div id="index-egXML-d52e119127" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">type</span>="<span class="attributevalue">chap</span>"<br /> <span class="attribute">xml:base</span>="<span class="attributevalue">http://classics.mit.edu/</span>"&gt;</span><br /> <span class="element">&lt;head&gt;</span>On Ancient Persian Manners<span class="element">&lt;/head&gt;</span><br /> <span class="element">&lt;p&gt;</span>In the very first story of <span class="element">&lt;ref <span class="attribute">target</span>="<span class="attributevalue">Sadi/gulistan.2.i.html</span>"&gt;</span><br />   <span class="element">&lt;title&gt;</span>The Gulistan of<br />         Sa'di<span class="element">&lt;/title&gt;</span><br />  <span class="element">&lt;/ref&gt;</span>,<br />     Sa'di relates moral advice worthy of Miss Minners ...<span class="element">&lt;/p&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /><span class="element">&lt;/div&gt;</span></div> Here the relative reference <span class="val">Sadi/gulistan.2.i.html</span> is resolved against the new base URI, yielding <span class="val">http://classics.mit.edu/Sadi/gulistan.2.i.html</span>.</div><p>As noted above, the current base URI is normally taken from the <span class="att">xml:base</span> attribute of the nearest ancestor that carries one. It is technically possible to use <span class="att">xml:base</span> as a means to shorten URIs, but this usage is not recommended. 
<a class="link_ref" href="SA.html#SAPU" title="Using Abbreviated Pointers">Abbreviated pointers</a> provide a more flexible and consistent method for creating shorthand links.</p></div><div class="div3" id="SABN"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SAUR"><span class="headingNumber">16.2.1 </span>Pointing Elsewhere</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SAPU"><span class="headingNumber">16.2.3 </span>Using Abbreviated Pointers</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SABN" title="link to this section "><span class="invisible">TEI: Pointing Locally</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.2 </span><span class="head">Pointing Locally</span></h4><div class="p">A <span class="term">shorthand pointer</span>, in which the URI consists only of <code>#</code> followed by the value of an <span class="att">xml:id</span> acts as a pointer to the element in the current document with that <span class="att">xml:id</span>, as in the following example. <div id="index-egXML-d52e119163" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">type</span>="<span class="attributevalue">section</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">sect106</span>"&gt;</span><br /><span class="comment">&lt;!-- ... 
--&gt;</span><br /><span class="element">&lt;/div&gt;</span><br /><span class="element">&lt;div <span class="attribute">type</span>="<span class="attributevalue">section</span>" <span class="attribute">n</span>="<span class="attributevalue">107</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">sect107</span>"&gt;</span><br /> <span class="element">&lt;head&gt;</span>Limitations on exclusive rights: Fair use<span class="element">&lt;/head&gt;</span><br /> <span class="element">&lt;p&gt;</span>Notwithstanding the provisions of<br />  <span class="element">&lt;ref <span class="attribute">target</span>="<span class="attributevalue">#sect106</span>"&gt;</span>section 106<span class="element">&lt;/ref&gt;</span>, the fair use of a<br />     copyrighted work, including such use by reproduction in copies<br />     or phonorecords or by any other means specified by that section,<br />     for purposes such as criticism, comment, news reporting,<br />     teaching (including multiple copies for classroom use),<br />     scholarship, or research, is not an infringement of copyright.<br />     In determining whether the use made of a work in any particular<br />     case is a fair use the factors to be considered shall<br />     include — <br />  <span class="element">&lt;list <span class="attribute">rend</span>="<span class="attributevalue">bulleted</span>"&gt;</span><br />   <span class="element">&lt;item <span class="attribute">n</span>="<span class="attributevalue">(1)</span>"&gt;</span>the purpose and character of the use, including<br />         whether such use is of a commercial nature or is for nonprofit<br />         educational purposes;<span class="element">&lt;/item&gt;</span><br />   <span class="element">&lt;item <span class="attribute">n</span>="<span class="attributevalue">(2)</span>"&gt;</span>the nature of the copyrighted work;<span class="element">&lt;/item&gt;</span><br />   <span class="element">&lt;item <span 
class="attribute">n</span>="<span class="attributevalue">(3)</span>"&gt;</span>the amount and substantiality of the portion<br />         used in relation to the copyrighted work as a whole;<br />         and<span class="element">&lt;/item&gt;</span><br />   <span class="element">&lt;item <span class="attribute">n</span>="<span class="attributevalue">(4)</span>"&gt;</span>the effect of the use upon the potential market<br />         for or value of the copyrighted work.<span class="element">&lt;/item&gt;</span><br />  <span class="element">&lt;/list&gt;</span><br />     The fact that a work is unpublished shall not itself bar a<br />     finding of fair use if such finding is made upon consideration<br />     of all the above factors.<span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;/div&gt;</span><div style="float: right;"><a href="BIB.html#SA-eg-01">bibliography</a> </div></div> This method of pointing, by referring to the <span class="att">xml:id</span> of the target element as a bare name only (e.g., <span class="val">#sect106</span>) is the simplest and often the best approach where it can be applied, i.e. where both the source element and target element are in the same XML document, and where the target element carries an identifier. 
It is the method used extensively in previous sections of this chapter and elsewhere in these Guidelines.</div></div><div class="div3" id="SAPU"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SABN"><span class="headingNumber">16.2.2 </span>Pointing Locally</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SATS"><span class="headingNumber">16.2.4 </span>TEI XPointer Schemes</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SAPU" title="link to this section "><span class="invisible">TEI: Using Abbreviated Pointers</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.3 </span><span class="head">Using Abbreviated Pointers</span></h4><p>Even in the case of relative links on the local file system, <span class="att">ref</span> or <span class="att">target</span> attributes may become quite lengthy and make XML code difficult to read. To deal with this problem, the TEI provides a useful method of using abbreviated pointers and documenting a way to dereference them automatically.</p><p>Imagine a project which has a large collection of XML documents organized like this:</p><ul><li class="item">anthology <ul><li class="item">poetry <ul><li class="item"><span class="ident-file">poem.xml</span></li></ul></li><li class="item">prose <ul><li class="item"><span class="ident-file">novel.xml</span></li></ul></li></ul></li><li class="item">references <ul><li class="item">people <ul><li class="item"><span class="ident-file">personography.xml</span></li></ul></li></ul></li></ul><div class="p">If you want to link a <a class="gi" title="(name, proper noun) contains a proper noun or noun phrase." 
href="ref-name.html">name</a> in the <span class="ident-file">novel.xml</span> file to a <a class="gi" title="provides information about an identifiable individual, for example a participant in a language interaction, or a person referred to in a historical source." href="ref-person.html">person</a> in the <span class="ident-file">personography.xml</span> file, the link will look like this: <div id="index-egXML-d52e119249" class="pre egXML_valid"><span class="element">&lt;name <span class="attribute">ref</span>="<span class="attributevalue">../../references/people/personography.xml#fred</span>"&gt;</span>Fred<span class="element">&lt;/name&gt;</span></div> If there are many names to tag in a single paragraph, the XML encoding will be congested, and such lengthy links are prone to typographical error. In addition, if the project organization is changed, every relative link will have to be found and altered.</div><div class="p">One way to deal with this is to use what is often referred to as a "magic token". You could make such links using the <span class="att">key</span> attribute: <div id="index-egXML-d52e119258" class="pre egXML_valid"><span class="element">&lt;name <span class="attribute">key</span>="<span class="attributevalue">fred</span>"&gt;</span>Fred<span class="element">&lt;/name&gt;</span></div> and document the meaning of the key using (for instance) a <a class="gi" title="defines a typology either implicitly, by means of a bibliographic citation, or explicitly by a structured taxonomy." href="ref-taxonomy.html">taxonomy</a> element in the TEI header, as described in <a class="link_ptr" href="CO.html#CONARS" title="Referring Strings"><span class="headingNumber">3.5.1 </span>Referring Strings</a>. 
However, such a link cannot be mechanically processed by an external system that does not know how to interpret it; a human will have to read the header explanation and write code explicitly to reconstruct the intended link.</div><div class="p">A more robust alternative is to use a <span class="term">private URI scheme</span>. This is a method of constructing a simple, key-like token which functions as a <span class="ident-datatype">data.pointer</span>, and can therefore be used as the value of any attribute which has that datatype, such as <span class="att">ref</span> and <span class="att">target</span>. Such a scheme consists of a prefix with a colon, and then a value. You might, for example, use the prefix <span class="val">psn</span> (for "person"), and structure your name tags like this: <div id="index-egXML-d52e119285" class="pre egXML_valid"><span class="element">&lt;name <span class="attribute">ref</span>="<span class="attributevalue">psn:fred</span>"&gt;</span>Fred<span class="element">&lt;/name&gt;</span></div> How is this different from a ‘magic token’? Essentially, it isn't, except that TEI provides a structured method of dereferencing it (turning it into a computable path, such as <span class="val">../../references/people/personography.xml#fred</span>) by means of a declaration inside <a class="gi" title="(encoding description) documents the relationship between an electronic text and the source or sources from which it was derived." 
href="ref-encodingDesc.html">encodingDesc</a> in the TEI header, using the elements and attributes for prefix declaration: <ul class="specList"><li><span class="specList-elementSpec"><a href="ref-listPrefixDef.html">listPrefixDef</a></span> (list of prefix definitions) contains a list of definitions of prefixing schemes used in <span class="ident-datatype">data.pointer</span> values, showing how abbreviated URIs using each scheme may be expanded into full URIs.</li><li><span class="specList-elementSpec"><a href="ref-prefixDef.html">prefixDef</a></span> (prefix definition) defines a prefixing scheme used in <span class="ident-datatype">data.pointer</span> values, showing how abbreviated URIs using the scheme may be expanded into full URIs.<table class="specDesc"><tr><td class="Attribute"><span class="att">ident</span></td><td>supplies a name which functions as the prefix for an abbreviated pointing scheme such as a private URI scheme. The prefix constitutes the text preceding the first colon.</td></tr></table></li><li><span class="specList-classSpec"><a href="ref-att.patternReplacement.html">att.patternReplacement</a></span> provides attributes for regular-expression matching and replacement.<table class="specDesc"><tr><td class="Attribute"><span class="att">matchPattern</span></td><td>specifies a regular expression against which the values of other attributes can be matched.</td></tr><tr><td class="Attribute"><span class="att">replacementPattern</span></td><td>specifies a ‘replacement pattern’, that is, the skeleton of a relative or absolute URI containing references to groups in the <span class="att">matchPattern</span> which, once subpattern substitution has been performed, complete the URI.</td></tr></table></li></ul></div><div class="p">This is how you might document a private URI scheme using the <span class="val">psn:</span> prefix: <div id="index-egXML-d52e119308" class="pre egXML_valid"><span class="element">&lt;listPrefixDef&gt;</span><br /> <span 
class="element">&lt;prefixDef <span class="attribute">ident</span>="<span class="attributevalue">psn</span>"<br />  <span class="attribute">matchPattern</span>="<span class="attributevalue">([a-z]+)</span>"<br />  <span class="attribute">replacementPattern</span>="<span class="attributevalue">../../references/people/personography.xml#$1</span>"&gt;</span><br />  <span class="element">&lt;p&gt;</span> In the context of this project, private URIs with the prefix<br />       "psn" point to <span class="element">&lt;gi&gt;</span>person<span class="element">&lt;/gi&gt;</span> elements in the project's<br />       personography.xml file.<br />   <span class="element">&lt;/p&gt;</span><br /> <span class="element">&lt;/prefixDef&gt;</span><br /><span class="element">&lt;/listPrefixDef&gt;</span></div> This specifies that where a <span class="ident-datatype">data.pointer</span> value is constructed with a <span class="val">psn:</span> prefix, a regular-expression replace operation can be performed on it to construct the full or relative URI to the target document or fragment. <a class="gi" title="(list of prefix definitions) contains a list of definitions of prefixing schemes used in data.pointer values, showing how abbreviated URIs using each scheme may be expanded into full URIs." href="ref-listPrefixDef.html">listPrefixDef</a> is a child of <a class="gi" title="(encoding description) documents the relationship between an electronic text and the source or sources from which it was derived." href="ref-encodingDesc.html">encodingDesc</a>, and it contains any number of <a class="gi" title="(prefix definition) defines a prefixing scheme used in data.pointer values, showing how abbreviated URIs using the scheme may be expanded into full URIs." href="ref-prefixDef.html">prefixDef</a> elements. Each <a class="gi" title="(prefix definition) defines a prefixing scheme used in data.pointer values, showing how abbreviated URIs using the scheme may be expanded into full URIs." 
href="ref-prefixDef.html">prefixDef</a> element provides a method of dereferencing or expanding an abbreviated pointer, based on a regular expression. The <span class="att">ident</span> attribute specifies the prefix to which the expansion applies (without the colon). The <span class="att">matchPattern</span> attribute contains a regular expression which is matched against the component of the pointer following the first colon, and the <span class="att">replacementPattern</span> provides the string which will be used as a replacement. In this example, using <span class="val">psn:fred</span>, the value <span class="val">fred</span> would be matched by the <span class="att">matchPattern</span>, and also captured (through the parentheses in the regular expression); it would then be replaced by the value <span class="val">../../references/people/personography.xml#fred</span> (with the <span class="val">$1</span> in the <span class="att">replacementPattern</span> being replaced by the captured value). The <a class="gi" title="(paragraph) marks paragraphs in prose." href="ref-p.html">p</a> element inside the <a class="gi" title="(prefix definition) defines a prefixing scheme used in data.pointer values, showing how abbreviated URIs using the scheme may be expanded into full URIs." href="ref-prefixDef.html">prefixDef</a> can be used to provide a human-readable explanation of the usage of this prefix.</div><p>Through this mechanism, any processor which encounters a <span class="ident-datatype">data.pointer</span> with a protocol unknown to it can check the <a class="gi" title="(list of prefix definitions) contains a list of definitions of prefixing schemes used in data.pointer values, showing how abbreviated URIs using each scheme may be expanded into full URIs." 
href="ref-listPrefixDef.html">listPrefixDef</a> in the header to see if there is an available expansion for it, and if there is, it can automatically provide the expansion and generate a full or relative URI.</p><div class="p">For any given prefix, it may be useful to supply more than one expansion. For instance, in addition to pointing at the <a class="gi" title="provides information about an identifiable individual, for example a participant in a language interaction, or a person referred to in a historical source." href="ref-person.html">person</a> element in the personography file, it might also be useful to point to an external source which is available on the network, representing the same information in a different way. So there might be a second <a class="gi" title="(prefix definition) defines a prefixing scheme used in data.pointer values, showing how abbreviated URIs using the scheme may be expanded into full URIs." href="ref-prefixDef.html">prefixDef</a> like this: <div id="index-egXML-d52e119387" class="pre egXML_valid"><span class="element">&lt;prefixDef <span class="attribute">ident</span>="<span class="attributevalue">psn</span>"<br /> <span class="attribute">matchPattern</span>="<span class="attributevalue">([a-z]+)</span>"<br /> <span class="attribute">replacementPattern</span>="<span class="attributevalue">http://www.example.com/personography.html#$1</span>"&gt;</span><br /> <span class="element">&lt;p&gt;</span> Private URIs with the prefix "psn" can be converted to point<br />     to a fragment on the Personography page of the project Website.<br />  <span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;/prefixDef&gt;</span></div> Any number of <a class="gi" title="(prefix definition) defines a prefixing scheme used in data.pointer values, showing how abbreviated URIs using the scheme may be expanded into full URIs." href="ref-prefixDef.html">prefixDef</a> elements may be provided for the same prefix. 
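As a rough illustration of the expansion just described, the following Python sketch expands <span class="val">psn:fred</span> against the two declarations shown above. It is not part of any TEI tool: the names <code>PREFIX_DEFS</code> and <code>expand_pointer</code> are hypothetical, the declarations are assumed to have been read from the header into tuples already, and the sketch makes the simplifying assumption that the whole of the value after the colon must match the <span class="att">matchPattern</span>.

```python
import re

# Hypothetical in-memory form of the two prefixDef declarations above:
# (ident, matchPattern, replacementPattern), kept in document order.
PREFIX_DEFS = [
    ("psn", "([a-z]+)", "../../references/people/personography.xml#$1"),
    ("psn", "([a-z]+)", "http://www.example.com/personography.html#$1"),
]


def expand_pointer(pointer, prefix_defs):
    """Return every expansion of an abbreviated pointer, in document order."""
    # The prefix is the text preceding the first colon.
    prefix, colon, value = pointer.partition(":")
    if not colon:
        return []
    expansions = []
    for ident, match_pattern, replacement_pattern in prefix_defs:
        if ident != prefix:
            continue
        # Simplifying assumption: the whole value must match the pattern.
        match = re.fullmatch(match_pattern, value)
        if match is None:
            continue
        # Replace each $n in the replacement pattern with captured group n.
        expansions.append(
            re.sub(
                r"\$(\d+)",
                lambda ref: match.group(int(ref.group(1))),
                replacement_pattern,
            )
        )
    return expansions


print(expand_pointer("psn:fred", PREFIX_DEFS))
```

Under these assumptions the call returns two URIs, the relative path into <span class="ident-file">personography.xml</span> followed by the address on the project Website, while a pointer with an undeclared prefix yields an empty list. The list preserves the order of the matching <a class="gi" href="ref-prefixDef.html">prefixDef</a> elements. 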
A processor may decide to process one or all of them; if it processes only one, it should choose the first one with the correct <span class="att">ident</span> value, so the primary or most important <a class="gi" title="(prefix definition) defines a prefixing scheme used in data.pointer values, showing how abbreviated URIs using the scheme may be expanded into full URIs." href="ref-prefixDef.html">prefixDef</a> for any given prefix should appear first in its parent <a class="gi" title="(list of prefix definitions) contains a list of definitions of prefixing schemes used in data.pointer values, showing how abbreviated URIs using each scheme may be expanded into full URIs." href="ref-listPrefixDef.html">listPrefixDef</a>.</div><p>When creating private URI schemes, it is recommended that you avoid using any existing registered prefix. A list of registered prefixes is maintained by IANA at <a class="link_ref" href="http://www.iana.org/assignments/uri-schemes.html">http://www.iana.org/assignments/uri-schemes.html</a>.</p><p>Note that this mechanism can also be used to dereference other abbreviated pointing systems which are based on prefixes, such as Tag URIs.</p><p>The <span class="att">matchPattern</span> and <span class="att">replacementPattern</span> attributes are also used in dereferencing canonical reference patterns, and further examples of the use of regular expressions are shown in <a class="link_ptr" href="SA.html#SACR" title="Canonical References"><span class="headingNumber">16.2.5 </span>Canonical References</a>.</p></div><div class="div3" id="SATS"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SAPU"><span class="headingNumber">16.2.3 </span>Using Abbreviated Pointers</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SACR"><span class="headingNumber">16.2.5 </span>Canonical References</a></li><li class="subtoc"><a 
class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SATS" title="link to this section "><span class="invisible">TEI: TEI XPointer Schemes</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.4 </span><span class="head">TEI XPointer Schemes</span></h4><p>The pointing schemes described in this chapter are part of a number of such schemes envisaged by the W3C, which together constitute a framework for addressing data within XML documents, known as the XPointer Framework (<a class="link_ref" href="BIB.html#XPTRFMWK" title="Paul Grosso Eve Maler Jonathan Marsh Norman Walsh XPointer FrameworkW3C25 March 2003">Grosso et al 2003</a>). This framework permits the definition of many other named addressing methods, each of which is known as an <span class="term">XPointer Scheme</span>. The W3C has predefined a set of such schemes, and maintains a register for their expansion.</p><p>One important scheme, also defined by the W3C, and recommended by these Guidelines is the <span class="name">xpath()</span> pointer scheme, which allows for any part of an XML structure to be selected using the syntax defined by the XPath specification. This is further discussed below, <a class="link_ptr" href="SA.html#SATSXP" title="xpath()"><span class="headingNumber">16.2.4.2 </span>xpath()</a>. These Guidelines also define six other pointer schemes, which provide access to parts of an XML document such as points within data content or stretches of data content. 
These additional TEI pointer schemes are defined in sections <a class="link_ptr" href="SA.html#SATSL" title="left()"><span class="headingNumber">16.2.4.3 </span>left()</a> to <a class="link_ptr" href="SA.html#SATSMA" title="match()"><span class="headingNumber">16.2.4.8 </span>match()</a> below.</p><div class="div4" id="SATSin"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SATSXP"><span class="headingNumber">16.2.4.2 </span>xpath()</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h5><span class="bookmarklink"><a class="bookmarklink" href="#SATSin" title="link to this section "><span class="invisible">TEI: Introduction to TEI Pointers</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.4.1 </span><span class="head">Introduction to TEI Pointers</span></h5><p>Before discussing the TEI pointer schemes, we introduce slightly more formally the terminology used to define them. So far, we have discussed only ways of pointing at components of the XML information set node such as elements and attributes. However, there is often a need in text analysis to address additional types of location such as the ‘point’ locations <em>between</em> ‘nodes’, and ‘sequences’ that may arbitrarily cross the boundaries of nodes in a document. The content of an XML document is organized sequentially as well as hierarchically, and it makes sense to consider ranges of characters within a document independently of the nodes to which they belong. From the perspective of most of the pointer schemes discussed below, a TEI document is a tree structure superimposed upon a character stream. Nodes are entities available only in the tree, while points are available only in the stream. 
For this reason, the schemes below that rely upon character positions (<code>string-index()</code>, <code>string-range()</code>, and <code>match()</code>) cannot take nodes into account. Similarly, XPath, being a method for locating nodes in the tree, treats those nodes as atomic, and is unable to address parts of nodes in their document context.</p><p>The TEI pointer scheme thus distinguishes the following kinds of object: </p><dl><dt><span>Node</span></dt><dd>A node is an instance of one of the node kinds defined in the <a class="link_ref" href="http://www.w3.org/TR/xpath-datamodel/">XQuery 1.0 and XPath 2.0 Data Model (Second Edition)</a>. It represents a single item in the XML information set for a document. For pointing purposes, the only nodes that are of interest are Text Nodes, Element Nodes, and Attribute nodes.</dd><dt><span>Sequence</span></dt><dd>A Sequence follows the definition in the XPath 2.0 Data Model, with one alteration. A Sequence is an ordered collection of zero or more items, where an item is either a node or a partial text node. </dd><dt><span>Text Stream</span></dt><dd>A Text Stream is the concatenation of the text nodes in a document and behaves as though all tags had been removed. A text stream begins at a reference node and encompasses all of the text inside that node (if any) and all the text following it in document order. In XPath terms, this would encompass all of the text nodes beginning at a particular node, and following it on the <a class="link_ref" href="http://www.w3.org/TR/xpath20/#axes">following axis</a>.</dd><dt><span>Point</span></dt><dd>A Point represents a dimensionless point between nodes or characters in a document. Every point is adjacent to either characters or elements, and never to another point. Points can only be referenced in relation to an element or text node in the document (i.e. something addressable by either an XPath or a fragment identifier). 
Points occur either immediately before or after an element, or at a numbered position inside a text stream. Position zero in the stream would be immediately before the first character. Note that points within attribute values cannot mark the beginning or end of a range extending beyond the attribute value, because points indicate a position within a document. Since attribute nodes are by definition un-ordered, they cannot be said to have a fixed position. </dd></dl><p>The TEI recommends the following seven pointer schemes: </p><dl><dt><span><span class="name">xpath()</span></span></dt><dd>addresses a node or node set using the XPath syntax (<a class="link_ptr" href="SA.html#SATSXP" title="xpath()"><span class="headingNumber">16.2.4.2 </span>xpath()</a>)</dd><dt><span><span class="name">left()</span> and <span class="name">right()</span></span></dt><dd>addresses the point before (left) or after (right) a node or node set (<a class="link_ptr" href="SA.html#SATSL" title="left()"><span class="headingNumber">16.2.4.3 </span>left()</a> and <a class="link_ptr" href="SA.html#SATSR" title="right()"><span class="headingNumber">16.2.4.4 </span>right()</a>)</dd><dt><span><span class="name">string-index()</span></span></dt><dd>addresses a point inside a text node (<a class="link_ptr" href="SA.html#SATSSI" title="stringindex()"><span class="headingNumber">16.2.4.5 </span>string-index()</a>)</dd><dt><span><span class="name">range()</span></span></dt><dd>addresses the range between two points (<a class="link_ptr" href="SA.html#SATSRN" title="range()"><span class="headingNumber">16.2.4.6 </span>range()</a>)</dd><dt><span><span class="name">string-range()</span></span></dt><dd>addresses a range of a specified length starting from a specified point (<a class="link_ptr" href="SA.html#SATSSR" title="stringrange()"><span class="headingNumber">16.2.4.7 </span>string-range()</a>)</dd><dt><span><span class="name">match()</span></span></dt><dd>addresses a range which matches a specified 
string within a node (<a class="link_ptr" href="SA.html#SATSMA" title="match()"><span class="headingNumber">16.2.4.8 </span>match()</a>)</dd></dl><p>The <span class="name">xpath()</span> scheme refers to the existing XPath specification which is adopted with one modification: the default namespace for any XPath used as a parameter to this scheme is assumed to be the TEI namespace <code>http://www.tei-c.org/ns/1.0</code>.</p><p>The other six schemes overlap in functionality with a W3C draft specification known as the <span class="name">XPointer scheme</span> draft, but are individually much simpler. At the time of this writing, there is no current or scheduled activity at the W3C towards revising this draft or issuing it as a recommendation.</p><p><span style="font-weight:bold">A note on namespaces</span>: The W3C defines an <span class="name">xmlns()</span> scheme (see <a class="link_ref" href="http://www.w3.org/TR/xptr-xmlns/">XPointer xmlns() Scheme</a>) which when prepended to a resolvable pointer allows for the definition of namespace prefixes to be used in XPaths in subsequent pointers. TEI Pointer schemes assume that un-prefixed element names in TEI Pointer XPaths are in the TEI namespace, <code>http://www.tei-c.org/ns/1.0</code>. The use of <span class="name">xmlns()</span> is thus optional, provided no new prefixes need to be defined. 
If the schemes described here are used to address non-TEI elements, then any new prefixes to be used in pointer XPaths may be defined using the <span class="name">xmlns()</span> scheme.</p></div><div class="div4" id="SATSXP"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SATSin"><span class="headingNumber">16.2.4.1 </span>Introduction to TEI Pointers</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SATSL"><span class="headingNumber">16.2.4.3 </span>left()</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h5><span class="bookmarklink"><a class="bookmarklink" href="#SATSXP" title="link to this section "><span class="invisible">TEI: xpath()</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.4.2 </span><span class="head">xpath()</span></h5><p><code>Sequence xpath(XPATH)</code></p><p>The <span class="name">xpath()</span> scheme locates a node within an XML Information Set. The single argument XPATH is an XPath path expression, following the latest scheme adopted by the W3C (currently <a class="link_ref" href="http://www.w3.org/TR/xpath20/">XPath 2.0</a>), that returns a sequence. XPaths returning atomic values (e.g. <span class="name">substring()</span>) are illegal in the <span class="name">xpath()</span> scheme because they represent extracted values rather than locations in the source document. 
XPath expressions that address attribute nodes are advisable only in the <span class="name">xpath()</span> scheme.</p><div class="p">The example below, and all subsequent examples in this section, refer to the following TEI fragment<a id="SATSXP-ex"><!--anchor--></a>:  <div id="index-egXML-d52e119614" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">xml:lang</span>="<span class="attributevalue">la</span>" <span class="attribute">type</span>="<span class="attributevalue">edition</span>"<br /> <span class="attribute">xml:space</span>="<span class="attributevalue">preserve</span>"&gt;</span><span class="element">&lt;ab&gt;</span>
<span class="element">&lt;lb <span class="attribute">n</span>="<span class="attributevalue">1</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">line1</span>"/&gt;</span><span class="element">&lt;supplied <span class="attribute">reason</span>="<span class="attributevalue">lost</span>"&gt;</span>si<span class="element">&lt;/supplied&gt;</span> non <span class="element">&lt;choice&gt;</span><span class="element">&lt;reg&gt;</span>habui<span class="element">&lt;/reg&gt;</span><span class="element">&lt;orig&gt;</span>abui<span class="element">&lt;/orig&gt;</span><span class="element">&lt;/choice&gt;</span> quidquam vaco 
<span class="element">&lt;lb <span class="attribute">n</span>="<span class="attributevalue">2</span>"/&gt;</span>si<span class="element">&lt;gap <span class="attribute">reason</span>="<span class="attributevalue">illegible</span>" <span class="attribute">quantity</span>="<span class="attributevalue">3</span>"<br /> <span class="attribute">unit</span>="<span class="attributevalue">character</span>"/&gt;</span>b<span class="element">&lt;gap <span class="attribute">reason</span>="<span class="attributevalue">illegible</span>" <span class="attribute">quantity</span>="<span class="attributevalue">3</span>"<br /> <span class="attribute">unit</span>="<span class="attributevalue">character</span>"/&gt;</span> 
  cohort<span class="element">&lt;unclear&gt;</span>e<span class="element">&lt;/unclear&gt;</span> mi rescribas 
<span class="element">&lt;lb <span class="attribute">n</span>="<span class="attributevalue">3</span>"/&gt;</span><span class="element">&lt;unclear&gt;</span>s<span class="element">&lt;/unclear&gt;</span>emp<span class="element">&lt;unclear&gt;</span>er<span class="element">&lt;/unclear&gt;</span> in <span class="element">&lt;choice&gt;</span><span class="element">&lt;reg&gt;</span>mente<span class="element">&lt;/reg&gt;</span><span class="element">&lt;orig&gt;</span>mentem<span class="element">&lt;/orig&gt;</span><span class="element">&lt;/choice&gt;</span> 
  <span class="element">&lt;choice&gt;</span><span class="element">&lt;reg&gt;</span>habe<span class="element">&lt;/reg&gt;</span><span class="element">&lt;orig&gt;</span>abe<span class="element">&lt;/orig&gt;</span><span class="element">&lt;/choice&gt;</span> supra res 
<span class="element">&lt;lb <span class="attribute">n</span>="<span class="attributevalue">4</span>"/&gt;</span>scriptas<span class="element">&lt;gap <span class="attribute">reason</span>="<span class="attributevalue">lost</span>" <span class="attribute">extent</span>="<span class="attributevalue">unknown</span>"<br /> <span class="attribute">unit</span>="<span class="attributevalue">character</span>"/&gt;</span> 
<span class="element">&lt;lb <span class="attribute">n</span>="<span class="attributevalue">5</span>"/&gt;</span>auge et opto u<span class="element">&lt;unclear&gt;</span>t<span class="element">&lt;/unclear&gt;</span> bene valeas<span class="element">&lt;/ab&gt;</span><br /><span class="element">&lt;/div&gt;</span></div></div><p>A TEI Pointer that references the "normalized" form in the <code>choice</code> in line 1 of the example might look like: <br /><code>#xpath(//lb[@n='1']/following-sibling::choice[1]/reg)</code>.</p><p>When an XPath is interpreted by a TEI processor, the information set of the referenced document is interpreted without any additional information supplied by any schema processing that may or may not be present. In particular, this means that no whitespace normalization is applied to a document before the XPath is interpreted. </p><p>This pointer scheme allows easy, direct use of the most widely implemented XML query method. It is probably the most robust pointing mechanism for the common situation of selecting an XML element or its contents where an <span class="att">xml:id</span> is not present. The ability to use element names and attribute names and values makes <span class="name">xpath()</span> pointers more robust than the other mechanisms discussed in this section even if the designated document changes. 
For durability in the presence of editing, use of <span class="att">xml:id</span> is always recommended when possible.</p></div><div class="div4" id="SATSL"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SATSXP"><span class="headingNumber">16.2.4.2 </span>xpath()</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SATSR"><span class="headingNumber">16.2.4.4 </span>right()</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h5><span class="bookmarklink"><a class="bookmarklink" href="#SATSL" title="link to this section "><span class="invisible">TEI: left()</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.4.3 </span><span class="head">left()</span></h5><p>Point <code>left( IDREF | XPATH )</code></p><p>The <span class="name">left()</span> scheme locates the point immediately preceding the node addressed by its argument, which is either an XPATH as defined above or an IDREF, the value of an <span class="att">xml:id</span> occurring in the document addressed by the base URI in effect for the pointer.</p><p>Example: the pointer <code>#left(//gap[1])</code> indicates the point immediately before the first <code>gap</code> in the <a class="link_ref" href="SA.html#SATSXP-ex" title="">example</a> above, that is, between the text <span class="q">‘si’</span> of line 2 and that <code>gap</code>.</p><p>Example: <code>#left(line1)</code> indicates the point immediately before the <code>&lt;lb n="1"/&gt;</code> element.</p></div><div class="div4" id="SATSR"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SATSL"><span class="headingNumber">16.2.4.3 </span>left()</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SATSSI"><span class="headingNumber">16.2.4.5 </span>string-index()</a></li><li 
class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h5><span class="bookmarklink"><a class="bookmarklink" href="#SATSR" title="link to this section "><span class="invisible">TEI: right()</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.4.4 </span><span class="head">right()</span></h5><p>Point <code>right( IDREF | XPATH )</code></p><p>The <span class="name">right()</span> scheme locates the point immediately following the node addressed by its argument.</p><p>Example: the pointer <code>#right(//lb[@n='3'])</code> indicates the point between the third <code>lb</code> and the <code>&lt;unclear&gt;s&lt;/unclear&gt;</code> element in the <a class="link_ref" href="SA.html#SATSXP-ex" title="">example</a>.</p></div><div class="div4" id="SATSSI"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SATSR"><span class="headingNumber">16.2.4.4 </span>right()</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SATSRN"><span class="headingNumber">16.2.4.6 </span>range()</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h5><span class="bookmarklink"><a class="bookmarklink" href="#SATSSI" title="link to this section "><span class="invisible">TEI: string-index()</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.4.5 </span><span class="head">string-index()</span></h5><p>Point <code>string-index( IDREF | XPATH, OFFSET )</code></p><p>The <span class="name">string-index()</span> scheme locates a point based on character positions in a text stream relative to the node identified by the IDREF or XPATH parameter. The OFFSET parameter is a positive, negative, or zero integer which determines the position of the point. 
An offset of 0 represents the position immediately before the first character in either the first text node descendant of the node addressed in the first parameter or the first following text node, if the addressed element contains no text node descendants.</p><p>Example: <code>#string-index(//lb[@n='2'],1)</code> indicates the point between the <span class="q">‘s’</span> and the <span class="q">‘i’</span> in the word <span class="q">‘si’</span> in line 2.</p><p><span style="font-weight:bold">Note</span>: The OFFSET parameter (and similarly the LENGTH parameter found below in the <span class="name">string-range()</span> scheme) is measured in characters. What is considered a single character will depend (assuming the document being evaluated is in Unicode) on the Normalization Form in use (see <a class="link_ref" href="http://unicode.org/reports/tr15/">UNICODE NORMALIZATION FORMS</a>). A letter followed by a combining diacritic counts as two characters, but the same diacritic precombined with a letter would count as a single character. Compare, for example, é (<code>\u0065</code> followed by <code>\u0301</code>) and é (<code>\u00E9</code>). These are equivalent, and a conversion between Normalization Forms C and D will transform one into the other. 
This specification does not mandate a particular Normalization Form (see <a class="link_ptr" href="CH.html#D4-46-2" title="Precomposed and Combining Characters and Normalization"><span class="headingNumber"></span>Precomposed and Combining Characters and Normalization</a>), but users and implementers should be aware that it affects the character count and therefore the result of evaluating pointers that rely on character counting.</p></div><div class="div4" id="SATSRN"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SATSSI"><span class="headingNumber">16.2.4.5 </span>string-index()</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SATSSR"><span class="headingNumber">16.2.4.7 </span>string-range()</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h5><span class="bookmarklink"><a class="bookmarklink" href="#SATSRN" title="link to this section "><span class="invisible">TEI: range()</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.4.6 </span><span class="head">range()</span></h5><p>Sequence <code>range( POINTER, POINTER[, POINTER, POINTER ...])</code></p><p>The <span class="name">range()</span> scheme takes as parameters one or more pairs of POINTERs, which are each members of the set IDREF, XPATH, <span class="name">left()</span>, <span class="name">right()</span>, or <span class="name">string-index()</span>. A <span class="name">range()</span> locates a (possibly non-contiguous) sequence beginning at the first POINTER parameter and ending at the last. If the POINTER locates a node (i.e. is an XPATH or IDREF), then that node is a member of the addressed sequence. If a sequence addressed by a range pointer overlaps, but does not wholly contain, an element (i.e. 
it contains only the start but not the end tag or vice-versa), then that element is not part of the sequence.</p><p><span class="name">Range()</span>s may address sequences of non-contiguous nodes. For example, a range() might select text beginning before an <a class="gi" title="(apparatus entry) contains one entry in a critical apparatus, with an optional lemma and usually one or more readings or notes on the relevant passage." href="ref-app.html">app</a>, encompassing the content of a single <a class="gi" title="(reading) contains a single reading within a textual variation." href="ref-rdg.html">rdg</a> and continuing after the <a class="gi" title="(apparatus entry) contains one entry in a critical apparatus, with an optional lemma and usually one or more readings or notes on the relevant passage." href="ref-app.html">app</a>.</p><p>Example: <code>#range(left(//lb[@n='3']),left(//lb[@n='4']))</code> indicates the whole of <a class="link_ref" href="SA.html#SATSXP-ex" title="">line 3</a> from the <code>&lt;lb n="3"/&gt;</code> to the point right before the following <code>&lt;lb n="4"/&gt;</code>.</p><p>Example: <code>#range(right(//lb[@n='3']),string-index(//lb[@n='3'],15))</code> indicates the sequence <code>&lt;unclear&gt;s&lt;/unclear&gt;emp&lt;unclear&gt;er&lt;/unclear&gt; in mente</code>.</p><p>Example: <code>#range(string-index(//lb[@n='3'],7),string-index(//lb[@n='3'],10),string-index(//lb[@n='3'],15),string-index(//lb[@n='3'],21))</code> indicates the non-contiguous sequence <span class="q">‘in mentem’</span>.</p></div><div class="div4" id="SATSSR"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SATSRN"><span class="headingNumber">16.2.4.6 </span>range()</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SATSMA"><span class="headingNumber">16.2.4.8 </span>match()</a></li><li class="subtoc"><a class="navigation" 
href="index.html">Home</a></li></ul></div><h5><span class="bookmarklink"><a class="bookmarklink" href="#SATSSR" title="link to this section "><span class="invisible">TEI: string-range()</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.4.7 </span><span class="head">string-range()</span></h5><p>Sequence <code>string-range(IDREF | XPATH, OFFSET, LENGTH[, OFFSET, LENGTH ...])</code></p><p>The string-range() scheme locates a sequence based on character positions in a text stream relative to the node identified by the first parameter. The location of the beginning of the addressed sequence is determined precisely as for <span class="name">string-index()</span>. The OFFSET parameter is defined as above in <span class="name">string-index()</span>. The LENGTH parameter is a positive integer that denotes the length of the text stream captured by the sequence. As with <span class="name">range()</span>, the addressed sequence may contain text nodes and/or elements. The <span class="name">string-range()</span> scheme can accept multiple OFFSET, LENGTH pairs to address a non-contiguous sequence in much the same way that range() can accept multiple pairs of pointers.</p><p>Because string-range() addresses points in the text stream, tags are invisible to it. For example, if an empty tag like <a class="gi" title="(line break) marks the start of a new (typographic) line in some edition or version of a text." 
href="ref-lb.html">lb</a> is encountered while processing a string-range(), it will be included in the resulting sequence, but the LENGTH count will not increment when it is captured.</p><p>Example: <code>#string-range(//lb[@n='5'],0,27)</code> indicates the whole of <a class="link_ref" href="SA.html#SATSXP-ex" title="">line 5</a> from the text immediately following the <code>lb</code> to the point right before the closing <code>ab</code> tag.</p><p>Example: <code>#string-range(//lb[@n='3'],7,8)</code> indicates the sequence <span class="q">‘in mente’</span>.</p><p>Example: <code>#string-range(//lb[@n='3'],7,3,15,6)</code> indicates the non-contiguous sequence <span class="q">‘in mentem’</span>.</p></div><div class="div4" id="SATSMA"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SATSSR"><span class="headingNumber">16.2.4.7 </span>string-range()</a></li><li class="subtoc"></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h5><span class="bookmarklink"><a class="bookmarklink" href="#SATSMA" title="link to this section "><span class="invisible">TEI: match()</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.4.8 </span><span class="head">match()</span></h5><p>Sequence <code>match(IDREF | XPATH, 'REGEX' [, INDEX])</code></p><p>The match scheme locates a sequence based on matching the REGEX parameter against a text stream relative to the reference node identified by the first parameter. 
REGEX is a regular expression as defined by <a class="link_ref" href="http://www.w3.org/TR/xpath-functions/#regex-syntax">XQuery 1.0 and XPath 2.0 Functions and Operators (Second Edition)</a>, with some modifications: </p><ul><li class="item">Because the regular expression is delimited by apostrophe characters, any such characters (<code>'</code> or <code>\u0027</code>) occurring inside the expression must be escaped using the URI percent-encoding scheme <code>%27</code>.</li><li class="item">Regular expressions in <code>match()</code> are assumed to operate in single-line mode. The end of the string to be matched against is either the end of the text contained by the element in the first parameter or the end of the document, if that parameter indicates an empty element. The meta-character <code>^</code> therefore matches the beginning of the text stream inside or following the reference node, and the meta-character <code>$</code> matches the end of that stream.</li></ul><p> The optional INDEX parameter is an integer greater than 0 which specifies which match should be chosen when there is more than one possibility. 
If omitted, the first match in the text stream will be used.</p><p>Like <code>string-range()</code>, <code>match()</code> may capture elements in the returned sequence, even though they are ignored for purposes of evaluating the match.</p><p>Example: <code>#match(//lb[@n='5'],'opto.*valeas')</code> indicates the sequence <code>opto u&lt;unclear&gt;t&lt;/unclear&gt; bene valeas</code> in <a class="link_ref" href="SA.html#SATSXP-ex" title="">line 5</a>.</p><p>Example: <code>#match(//lb[@n='3'],'semper')</code> would indicate the word <span class="q">‘semper’</span>, but would not capture the <code>unclear</code> elements in <code>&lt;unclear&gt;s&lt;/unclear&gt;emp&lt;unclear&gt;er&lt;/unclear&gt;</code>, just their text children.</p></div></div><div class="div3" id="SACR"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SATS"><span class="headingNumber">16.2.4 </span>TEI XPointer Schemes</a></li><li class="subtoc"></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SACR" title="link to this section "><span class="invisible">TEI: Canonical References</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.5 </span><span class="head">Canonical References</span></h4><p>By ‘canonical’ reference we mean any means of pointing into documents, specific to a community or corpus. For example, biblical scholars might understand <span class="q">‘Matt 5:7’</span> to mean <span class="q">‘the book called <span class="titlem">Matthew</span>, chapter 5, verse 7.’</span> They might then wish to translate the string <span class="q">‘Matt 5:7’</span> into a pointer into a TEI-encoded document, selecting the element which corresponds to the seventh <a class="gi" title="(text division) contains a subdivision of the front, body, or back of a text." 
href="ref-div.html">div</a> element within the fifth <a class="gi" title="(text division) contains a subdivision of the front, body, or back of a text." href="ref-div.html">div</a> element within the <a class="gi" title="(text division) contains a subdivision of the front, body, or back of a text." href="ref-div.html">div</a> element with the <span class="att">n</span> attribute valued <span class="q">‘Matt.’</span></p><p>Several elements in the TEI scheme (<a class="gi" title="identifies a phrase or word used to provide a gloss or definition for some other word or phrase." href="ref-gloss.html">gloss</a>, <a class="gi" title="(pointer) defines a pointer to another location." href="ref-ptr.html">ptr</a>, <a class="gi" title="(reference) defines a reference to another location, possibly modified by additional text or comment." href="ref-ref.html">ref</a>, and <a class="gi" title="contains a single-word, multi-word, or symbolic designation which is regarded as a technical term." href="ref-term.html">term</a>) bear a special attribute, <span class="att">cRef</span>, just for this purpose. Using the system described in this section, an encoder may specify references to canonical works in a discipline-familiar format, and expect software to derive a complete URI from it. The value of the <span class="att">cRef</span> attribute is processed as described in this section, and the resulting URI reference is treated as if it were the value of the <span class="att">target</span> attribute. The <span class="att">cRef</span> and <span class="att">target</span> attributes are mutually exclusive: only one or the other may be specified on any given occurrence of an element.</p><div class="p">For the <span class="att">cRef</span> attribute to function as required, a mechanism is needed to define the mapping between (for example) <span class="q">‘the book called <span class="titlem">Matt</span>’</span> and the part of the XML structure which corresponds with it. 
This is provided by the <a class="gi" title="(references declaration) specifies how canonical references are constructed for this text." href="ref-refsDecl.html">refsDecl</a> element in the TEI header, which contains an algorithm for translating a canonical reference string (like <span class="val">Matt 5:7</span>) into a URI such as <code>#xpath(//div[@n='Matt']/div[5]/div[7])</code>. The <a class="gi" title="(references declaration) specifies how canonical references are constructed for this text." href="ref-refsDecl.html">refsDecl</a> element is described in section <a class="link_ptr" href="HD.html#HD54" title="The Reference System Declaration"><span class="headingNumber">2.3.6 </span>The Reference System Declaration</a>; the following example is discussed in more detail below in section <a class="link_ptr" href="SA.html#SACRWE" title="Worked Example"><span class="headingNumber">16.2.5.1 </span>Worked Example</a>. <div id="index-egXML-d52e120128" class="pre egXML_valid"><span class="element">&lt;refsDecl <span class="attribute">xml:id</span>="<span class="attributevalue">biblical</span>"&gt;</span><br /> <span class="element">&lt;cRefPattern <span class="attribute">matchPattern</span>="<span class="attributevalue">(.+) (.+):(.+)</span>"<br />  <span class="attribute">replacementPattern</span>="<span class="attributevalue">#xpath(//div[@n='$1']/div[$2]/div[$3])</span>"&gt;</span><br />  <span class="element">&lt;p&gt;</span>This pointer pattern extracts and references the <span class="element">&lt;q&gt;</span>book,<span class="element">&lt;/q&gt;</span><br />   <span class="element">&lt;q&gt;</span>chapter,<span class="element">&lt;/q&gt;</span> and <span class="element">&lt;q&gt;</span>verse<span class="element">&lt;/q&gt;</span> parts of a biblical reference.<span class="element">&lt;/p&gt;</span><br /> <span class="element">&lt;/cRefPattern&gt;</span><br /> <span class="element">&lt;cRefPattern <span class="attribute">matchPattern</span>="<span 
class="attributevalue">(.+) (.+)</span>"<br />  <span class="attribute">replacementPattern</span>="<span class="attributevalue">#xpath(//div[@n='$1']/div[$2])</span>"&gt;</span><br />  <span class="element">&lt;p&gt;</span>This pointer pattern extracts and references the <span class="element">&lt;q&gt;</span>book<span class="element">&lt;/q&gt;</span> and<br />   <span class="element">&lt;q&gt;</span>chapter<span class="element">&lt;/q&gt;</span> parts of a biblical reference.<span class="element">&lt;/p&gt;</span><br /> <span class="element">&lt;/cRefPattern&gt;</span><br /> <span class="element">&lt;cRefPattern <span class="attribute">matchPattern</span>="<span class="attributevalue">(.+)</span>"<br />  <span class="attribute">replacementPattern</span>="<span class="attributevalue">#xpath(//div[@n='$1'])</span>"&gt;</span><br />  <span class="element">&lt;p&gt;</span>This pointer pattern extracts and references just the <span class="element">&lt;q&gt;</span>book<span class="element">&lt;/q&gt;</span><br />       part of a biblical reference.<span class="element">&lt;/p&gt;</span><br /> <span class="element">&lt;/cRefPattern&gt;</span><br /><span class="element">&lt;/refsDecl&gt;</span></div></div><p>When an application encounters a canonical reference as the value of a <span class="att">cRef</span> attribute, it might follow this sequence of specific steps to transform it into a URI reference: </p><ol class="numbered"><li class="item">Ascertain the correct <a class="gi" title="(references declaration) specifies how canonical references are constructed for this text." href="ref-refsDecl.html">refsDecl</a> following the rules summarized in section <a class="link_ptr" href="CC.html#CCAS3" title="Summary"><span class="headingNumber">15.3.3 </span>Summary</a>.</li><li class="item">For each <a class="gi" title="(canonical reference pattern) specifies an expression and replacement pattern for transforming a canonical reference into a URI." 
href="ref-cRefPattern.html">cRefPattern</a> element encountered in the appropriate <a class="gi" title="(references declaration) specifies how canonical references are constructed for this text." href="ref-refsDecl.html">refsDecl</a>, in the order encountered: <ol class="numbered"><li class="item">match the value of the <span class="att">cRef</span> attribute to the regular expression found as the value of the <span class="att">matchPattern</span> attribute</li><li class="item">if the value of the <span class="att">cRef</span> attribute matches: <ol class="numbered"><li class="item">take the value of the <span class="att">replacementPattern</span> attribute and substitute the back references ($1, $2, etc.) with the corresponding matched substrings</li><li class="item">the result is taken as if it were a relative or absolute URI reference specified on the <span class="att">target</span> attribute; i.e., it should be used as is or combined with the current <span class="att">xml:base</span> attribute value as usual</li><li class="item">no further processing of this value of the <span class="att">cRef</span> attribute against the <a class="gi" title="(references declaration) specifies how canonical references are constructed for this text." href="ref-refsDecl.html">refsDecl</a> should take place</li></ol></li><li class="item">if, however, the value of the <span class="att">cRef</span> attribute does not match the regular expression specified in the value of the <span class="att">matchPattern</span> attribute, proceed to the next <a class="gi" title="(canonical reference pattern) specifies an expression and replacement pattern for transforming a canonical reference into a URI." href="ref-cRefPattern.html">cRefPattern</a></li></ol></li><li class="item">If all the <a class="gi" title="(canonical reference pattern) specifies an expression and replacement pattern for transforming a canonical reference into a URI." 
href="ref-cRefPattern.html">cRefPattern</a> elements are examined in turn and none matches, the pointer fails.</li></ol><p>The regular expression language used as the value of the <span class="att">matchPattern</span> attribute is that used for the <span class="term">pattern</span> facet of the World Wide Web Consortium's XML Schema Language in an <a class="link_ref" href="http://www.w3.org/TR/xmlschema-2/#regexs">Appendix to XML Schema Part 2</a>.<span id="Note98_return"><a class="notelink" title="As always seems to be the case, no two regular expression languages are precisely the same. For those used to Perl regular expressions, be warned that…" href="#Note98"><sup>61</sup></a></span> The value of the <span class="att">replacementPattern</span> attribute is simply a string, except that occurrences of <span class="q">‘$1’</span> through <span class="q">‘$9’</span> are replaced by the corresponding substring match. Note that since a maximum of nine substring matches are permitted, the string <span class="q">‘$18’</span> means <span class="q">‘the value of the first matched substring followed by the character <span class="q">‘8’</span>’</span> as opposed to <span class="q">‘the eighteenth matched substring’</span>. If there is a need for an actual string including a dollar sign followed by a digit that is not supposed to be replaced, the dollar sign should be written as <code>$$</code>. 
Implementations must convert <code>$$</code> to <code>$</code> during processing.</p><div class="div4" id="SACRWE"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SACRex"><span class="headingNumber">16.2.5.2 </span>Complete and Partial URI Examples</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h5><span class="bookmarklink"><a class="bookmarklink" href="#SACRWE" title="link to this section "><span class="invisible">TEI: Worked Example</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.5.1 </span><span class="head">Worked Example</span></h5><p>Let us presume that with the example <a class="gi" title="(references declaration) specifies how canonical references are constructed for this text." href="ref-refsDecl.html">refsDecl</a> above, an application comes across a <span class="att">cRef</span> value of <span class="val">Matt 5:7</span>. The application would first apply the regular expression <code>(.+) (.+):(.+)</code> to <span class="q">‘Matt 5:7’</span>. This regular expression would successfully match. The first matched substring would be <span class="q">‘Matt’</span>, the second <span class="q">‘5’</span>, and the third <span class="q">‘7’</span>. The application would then apply these substrings to the pattern <code>#xpath(//div[@n='$1']/div[$2]/div[$3])</code>, producing <code>#xpath(//div[@n='Matt']/div[5]/div[7])</code>.</p><p>If, however, the input string had been <span class="q">‘Matt 5’</span>, the first regular expression would not have matched. The application would have then tried the second, <code>(.+) (.+)</code>, producing a successful match, and the matched substrings <span class="q">‘Matt’</span> and <span class="q">‘5’</span>. 
It would then have substituted those matched substrings into the pattern <code>#xpath(//div[@n='$1']/div[$2])</code> to produce a fragment identifier indicating the referenced element.</p><p>If the input string had been <span class="q">‘Matt’</span>, neither the first nor the second regular expressions would have successfully matched. The application would have then tried the third, <code>(.+)</code>, producing the matched substring <span class="q">‘Matt’</span>, and the URI Reference <code>#xpath(//div[@n='Matt'])</code>.</p><div class="p">A <a class="gi" title="(canonical reference pattern) specifies an expression and replacement pattern for transforming a canonical reference into a URI." href="ref-cRefPattern.html">cRefPattern</a> should not reference more matched substrings than its <span class="att">matchPattern</span> can produce. For example: <div id="index-egXML-d52e120357" class="pre egXML_valid"><span class="element">&lt;cRefPattern <span class="attribute">matchPattern</span>="<span class="attributevalue">(.+) (.+):(.+)</span>"<br /> <span class="attribute">replacementPattern</span>="<span class="attributevalue">//div[@n='$1']/div[$2]/div[$3]/p[$4]</span>"/&gt;</span></div> is faulty, since only three matched substrings would have been produced, but a fourth (<code>$4</code>) was referenced.</div></div><div class="div4" id="SACRex"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SACRWE"><span class="headingNumber">16.2.5.1 </span>Worked Example</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SACRmu"><span class="headingNumber">16.2.5.3 </span>Miscellaneous Usages</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h5><span class="bookmarklink"><a class="bookmarklink" href="#SACRex" title="link to this section "><span class="invisible">TEI: Complete and Partial URI Examples</span><span class="pilcrow">¶</span></a></span><span 
class="headingNumber">16.2.5.2 </span><span class="head">Complete and Partial URI Examples</span></h5><div class="p">In the above example, the value of <span class="att">cRef</span> was used to generate a Fragment Identifier. An absolute URI could be generated directly, as in the following example. <div id="index-egXML-d52e120371" class="pre egXML_valid"><span class="element">&lt;refsDecl <span class="attribute">xml:id</span>="<span class="attributevalue">USC</span>"&gt;</span><br /> <span class="element">&lt;cRefPattern <span class="attribute">matchPattern</span>="<span class="attributevalue">([0-9][0-9])\s*U\.?S\.?C\.?\s*[Cc](h(\.|ap(ter|\.)?)?)?\s*([1-9][0-9]*)</span>"<br />  <span class="attribute">replacementPattern</span>="<span class="attributevalue">http://uscode.house.gov/download/pls/$1C$5.txt</span>"&gt;</span><br />  <span class="element">&lt;p&gt;</span>Matches most standard references to particular<br />       chapters of the United States Code, e.g.<br />   <span class="element">&lt;val&gt;</span>11USCC7<span class="element">&lt;/val&gt;</span>, <span class="element">&lt;val&gt;</span>17 U.S.C. Chapter 3<span class="element">&lt;/val&gt;</span>, or<br />   <span class="element">&lt;val&gt;</span>14 USC Ch. 5<span class="element">&lt;/val&gt;</span>. 
Note that a leading zero is<br />       required for the title (must be two digits), but is not<br />       permitted for the chapter number.<span class="element">&lt;/p&gt;</span><br /> <span class="element">&lt;/cRefPattern&gt;</span><br /> <span class="element">&lt;cRefPattern <span class="attribute">matchPattern</span>="<span class="attributevalue">([0-9][0-9])\s*U\.?S\.?C\.?\s*[Pp](re(lim(inary)?)?)?\s*[Mm](at(erial)?)?</span>"<br />  <span class="attribute">replacementPattern</span>="<span class="attributevalue">http://uscode.house.gov/download/pls/$1T.txt</span>"&gt;</span><br />  <span class="element">&lt;p&gt;</span>Matches references to the preliminary material for a<br />       given title, e.g. <span class="element">&lt;val&gt;</span>11USCP<span class="element">&lt;/val&gt;</span>, <span class="element">&lt;val&gt;</span>17 U.S.C.<br />         Prelim Mat<span class="element">&lt;/val&gt;</span>, or <span class="element">&lt;val&gt;</span>14 USC pm<span class="element">&lt;/val&gt;</span>.<span class="element">&lt;/p&gt;</span><br /> <span class="element">&lt;/cRefPattern&gt;</span><br /> <span class="element">&lt;cRefPattern <span class="attribute">matchPattern</span>="<span class="attributevalue">([0-9][0-9])\s*U\.?S\.?C\.?\s*[Aa](ppend(ix)?)?</span>"<br />  <span class="attribute">replacementPattern</span>="<span class="attributevalue">http://uscode.house.gov/download/pls/$1A.txt</span>"&gt;</span><br />  <span class="element">&lt;p&gt;</span>Matches references to the appendix of a given title,<br />       e.g. <span class="element">&lt;val&gt;</span>05USCA<span class="element">&lt;/val&gt;</span>, <span class="element">&lt;val&gt;</span>11 U.S.C. 
Appendix<span class="element">&lt;/val&gt;</span>,<br />       or <span class="element">&lt;val&gt;</span>18 USC Append<span class="element">&lt;/val&gt;</span>.<span class="element">&lt;/p&gt;</span><br /> <span class="element">&lt;/cRefPattern&gt;</span><br /><span class="element">&lt;/refsDecl&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /><span class="element">&lt;p&gt;</span>The example in section 10 is taken<br />   from <span class="element">&lt;ref <span class="attribute">cRef</span>="<span class="attributevalue">17 USC Ch 1</span>"&gt;</span>Subject Matter and Scope of<br />     Copyright<span class="element">&lt;/ref&gt;</span>.<span class="element">&lt;/p&gt;</span></div></div><p>See <a class="link_ptr" href="SA.html#SAPU" title="Using Abbreviated Pointers"><span class="headingNumber">16.2.3 </span>Using Abbreviated Pointers</a> for another related use of the <span class="att">matchPattern</span> and <span class="att">replacementPattern</span> attributes.</p></div><div class="div4" id="SACRmu"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SACRex"><span class="headingNumber">16.2.5.2 </span>Complete and Partial URI Examples</a></li><li class="subtoc"></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h5><span class="bookmarklink"><a class="bookmarklink" href="#SACRmu" title="link to this section "><span class="invisible">TEI: Miscellaneous Usages</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.2.5.3 </span><span class="head">Miscellaneous Usages</span></h5><p>Canonical reference pointers are intended for use by TEI encoders. 
However, this specification might be useful to the development of a process for recognizing canonical references in non-TEI documents (such as plain text documents), possibly as part of their conversion to TEI.</p></div></div></div><div class="div2" id="SASE"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SAXP"><span class="headingNumber">16.2 </span>Pointing Mechanisms</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SASY"><span class="headingNumber">16.4 </span>Synchronization</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h3><span class="bookmarklink"><a class="bookmarklink" href="#SASE" title="link to this section "><span class="invisible">TEI: Blocks, Segments, and Anchors</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.3 </span><span class="head">Blocks, Segments, and Anchors</span></h3><p>In this section, we discuss three general-purpose elements which may be used to mark and categorize both a span of text and a point within one. These elements have several uses, most notably to provide elements which can be given identifiers for use when aligning or linking to parts of a document, as discussed elsewhere in this chapter. They also provide a convenient way of extending the semantics of the TEI markup scheme in a theory-neutral manner, by providing for two neutral or ‘anonymous’ elements to which the encoder can add any meaning not supplied by other TEI defined elements. 
</p><ul class="specList"><li><span class="specList-elementSpec"><a href="ref-anchor.html">anchor</a></span> (anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element.</li><li><span class="specList-elementSpec"><a href="ref-ab.html">ab</a></span> (anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph.</li><li><span class="specList-elementSpec"><a href="ref-seg.html">seg</a></span> (arbitrary segment) represents any segmentation of text below the ‘chunk’ level.</li></ul><p> The elements <a class="gi" title="(anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element." href="ref-anchor.html">anchor</a>, <a class="gi" title="(anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph." href="ref-ab.html">ab</a>, and <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> are members of the class <a class="link_odd" title="provides attributes which can be used to classify or subclassify elements in any way." 
href="ref-att.typed.html">att.typed</a>, from which they inherit the following attributes: </p><ul class="specList"><li><span class="specList-classSpec"><a href="ref-att.typed.html">att.typed</a></span> provides attributes which can be used to classify or subclassify elements in any way.<table class="specDesc"><tr><td class="Attribute"><span class="att">type</span></td><td>characterizes the element in some sense, using any convenient classification scheme or typology.</td></tr><tr><td class="Attribute"><span class="att">subtype</span></td><td>provides a sub-categorization of the element, if needed</td></tr></table></li></ul><p> The elements <a class="gi" title="(anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph." href="ref-ab.html">ab</a>, and <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> are members of the class <a class="link_odd" title="provides an attribute for representing fragmentation of a structural element, typically as a consequence of some overlapping hierarchy." 
href="ref-att.fragmentable.html">att.fragmentable</a>, from which they inherit the following attribute: </p><ul class="specList"><li><span class="specList-classSpec"><a href="ref-att.fragmentable.html">att.fragmentable</a></span> provides an attribute for representing fragmentation of a structural element, typically as a consequence of some overlapping hierarchy.<table class="specDesc"><tr><td class="Attribute"><span class="att">part</span></td><td>specifies whether or not its parent element is fragmented in some way, typically by some other overlapping structure: for example a speech which is divided between two or more verse stanzas, a paragraph which is split across a page division, a verse line which is divided between two speakers.</td></tr></table></li></ul><p> The <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element is also a member of the class <a class="link_odd" title="provides attributes for elements used for arbitrary segmentation." href="ref-att.segLike.html">att.segLike</a> from which it inherits the following attribute: </p><ul class="specList"><li><span class="specList-classSpec"><a href="ref-att.segLike.html">att.segLike</a></span> provides attributes for elements used for arbitrary segmentation.<table class="specDesc"><tr><td class="Attribute"><span class="att">function</span></td><td>characterizes the function of the segment.</td></tr></table></li></ul><p>The <a class="gi" title="(anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element." href="ref-anchor.html">anchor</a> element may be thought of as an empty <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a>, or as an artifice enabling an identifier to be attached to any position in a text. 
Like the <a class="gi" title="marks a boundary point separating any kind of section of a text, typically but not necessarily indicating a point at which some part of a standard reference system changes, where the change is not represented by a structural element." href="ref-milestone.html">milestone</a> element discussed in section <a class="link_ptr" href="CO.html#CORS" title="Reference Systems"><span class="headingNumber">3.10 </span>Reference Systems</a>, it is useful where multiple views of a document are to be combined, for example, when a logical view based on paragraphs or verse lines is to be mapped onto a physical view based on manuscript lines. Like that element, it is a member of the class <a class="link_odd" title="groups elements which may appear at any point within a TEI text." href="ref-model.global.html">model.global</a> and can therefore appear anywhere within a document when the module defined by this chapter is included in a schema. Unlike the other elements in its class, the <a class="gi" title="(anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element." href="ref-anchor.html">anchor</a> element is primarily intended to mark an arbitrary point used for alignment, or as the target of a spanning element such as those discussed in section <a class="link_ptr" href="PH.html#PHAD" title="Additions and Deletions"><span class="headingNumber">11.3.1.4 </span>Additions and Deletions</a>, rather than as a means of marking segment boundaries for some arbitrary segmentation of a text.</p><div class="p">For example, suppose that we wish to mark the end of the fifth word following each occurrence of some term in a particular text, perhaps to assist with some collocational analysis. This can most easily be done with the help of the <a class="gi" title="(anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element." 
href="ref-anchor.html">anchor</a> element, as follows:  <div id="index-egXML-d52e120511" class="pre egXML_valid">English language. Except for not very<span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">eng1</span>"/&gt;</span><br /> English at all at the time<span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">eng2</span>"/&gt;</span><br /> English was still full of flaws<span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">eng3</span>"/&gt;</span><br /> English. This was revised by young<br /><span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">eng4</span>"/&gt;</span><div style="float: right;"><a href="BIB.html#PNIN1">bibliography</a> </div></div> In section <a class="link_ptr" href="SA.html#SACS1" title="Correspondence"><span class="headingNumber">16.5.1 </span>Correspondence</a> we discuss ways in which these <a class="gi" title="(anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element." href="ref-anchor.html">anchor</a> points might be used to represent an alignment such as one might get in a keyword-in-context concordance.</div><p>The <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element may be used at the encoder's discretion to mark almost any segment of the text of interest for processing. One use of the element is to mark text features for which no appropriate markup is otherwise defined, i.e. as a simple extension mechanism. Another use is to provide an identifier for some segment which is to be pointed at by some other element, i.e. to provide a target, or a part of a target, for a <a class="gi" title="(pointer) defines a pointer to another location." 
href="ref-ptr.html">ptr</a> or other similar element.</p><p>Several examples of uses for the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element are provided elsewhere in these Guidelines. For example: </p><ul class="bulleted"><li class="item">as a means of marking segments significant in a metrical or rhyming analysis (see section <a class="link_ptr" href="VE.html#VEME" title="Rhyme and Metrical Analysis"><span class="headingNumber">6.4 </span>Rhyme and Metrical Analysis</a>)</li><li class="item">as a means of marking typographic lines in drama (see section <a class="link_ptr" href="DR.html#DRBOD" title="The Body of a Performance Text"><span class="headingNumber">7.2 </span>The Body of a Performance Text</a>) or title pages (see section <a class="link_ptr" href="DS.html#DSTITL" title="Title Pages"><span class="headingNumber">4.6 </span>Title Pages</a>)</li><li class="item">as a means of marking prosody- or pause-defined units in transcribed speech (see section <a class="link_ptr" href="TS.html#TSSASE" title="Segmentation"><span class="headingNumber">8.4.1 </span>Segmentation</a>)</li><li class="item">as a means of marking linguistic or other analyses in a theory-neutral manner (see chapter <a class="link_ptr" href="AI.html" title="15"><span class="headingNumber">17 </span>Simple Analytic Mechanisms</a> passim)</li></ul><div class="p">In the following simple example, the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element simply delimits the extent of a stutter, a textual feature for which no element is provided in these Guidelines. 
<div id="index-egXML-d52e120563" class="pre egXML_valid"><span class="element">&lt;q&gt;</span>Don't say <span class="element">&lt;q&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">stutter</span>"&gt;</span>I-I-I<span class="element">&lt;/seg&gt;</span>'m afraid,<span class="element">&lt;/q&gt;</span> Melvin, just say <span class="element">&lt;q&gt;</span>I'm<br />     afraid.<span class="element">&lt;/q&gt;</span><span class="element">&lt;/q&gt;</span><div style="float: right;"><a href="BIB.html#SASE-eg-32">bibliography</a> </div></div>  The <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element is particularly useful for the markup of linguistically significant constituents such as the phrases that may be the output of an automatic parsing system. This example also demonstrates the use of the <span class="att">xml:id</span> attribute to carry an identifier which other parts of a document may use to point to, or align with: <div id="index-egXML-d52e120583" class="pre egXML_valid"><span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">bl0034</span>" <span class="attribute">type</span>="<span class="attributevalue">sentence</span>"&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">bl0034.1</span>" <span class="attribute">type</span>="<span class="attributevalue">phrase</span>"&gt;</span>Literate and illiterate speech<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">bl0034.2</span>" <span class="attribute">type</span>="<span class="attributevalue">phrase</span>"&gt;</span>in a language like English<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span 
class="attributevalue">bl0034.3</span>" <span class="attribute">type</span>="<span class="attributevalue">phrase</span>"&gt;</span>are plainly different.<span class="element">&lt;/seg&gt;</span><br /><span class="element">&lt;/seg&gt;</span><div style="float: right;"><a href="BIB.html#SASE-eg-33">bibliography</a> </div></div> </div><div class="p">As the above example shows, <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> elements may be nested directly within one another, to any degree of analysis considered appropriate. This is taken a little further in the following example, where the <span class="att">type</span> and <span class="att">subtype</span> attributes have been used to further categorize each word of the sentence (the <span class="att">xml:id</span> attributes have been removed to reduce the complexity of the example): <div id="index-egXML-d52e120607" class="pre egXML_valid"><span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">sentence</span>" <span class="attribute">subtype</span>="<span class="attributevalue">declarative</span>"&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">phrase</span>" <span class="attribute">subtype</span>="<span class="attributevalue">noun</span>"&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">word</span>" <span class="attribute">subtype</span>="<span class="attributevalue">adjective</span>"&gt;</span>Literate<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">word</span>" <span class="attribute">subtype</span>="<span class="attributevalue">conjunction</span>"&gt;</span>and<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span 
class="attribute">type</span>="<span class="attributevalue">word</span>" <span class="attribute">subtype</span>="<span class="attributevalue">adjective</span>"&gt;</span>illiterate<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">word</span>" <span class="attribute">subtype</span>="<span class="attributevalue">noun</span>"&gt;</span>speech<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">phrase</span>" <span class="attribute">subtype</span>="<span class="attributevalue">preposition</span>"&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">word</span>" <span class="attribute">subtype</span>="<span class="attributevalue">preposition</span>"&gt;</span>in<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">word</span>" <span class="attribute">subtype</span>="<span class="attributevalue">article</span>"&gt;</span>a<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">word</span>" <span class="attribute">subtype</span>="<span class="attributevalue">noun</span>"&gt;</span>language<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">word</span>" <span class="attribute">subtype</span>="<span class="attributevalue">preposition</span>"&gt;</span>like<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">word</span>" <span class="attribute">subtype</span>="<span class="attributevalue">noun</span>"&gt;</span>English<span 
class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">phrase</span>" <span class="attribute">subtype</span>="<span class="attributevalue">verb</span>"&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">word</span>" <span class="attribute">subtype</span>="<span class="attributevalue">verb</span>"&gt;</span>are<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">word</span>" <span class="attribute">subtype</span>="<span class="attributevalue">adverb</span>"&gt;</span>plainly<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">word</span>" <span class="attribute">subtype</span>="<span class="attributevalue">adjective</span>"&gt;</span>different<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">punct</span>"&gt;</span>.<span class="element">&lt;/seg&gt;</span><br /><span class="element">&lt;/seg&gt;</span></div></div><div class="p">(The example values shown are chosen for simplicity of comprehension, rather than verisimilitude). It should also be noted that specialized segment elements are defined in section <a class="link_ptr" href="AI.html#AILC" title="Linguistic Segment Categories"><span class="headingNumber">17.1 </span>Linguistic Segment Categories</a> to facilitate this particular kind of analysis. 
These allow for the explicit markup of units called <span class="term">s-units</span>, <span class="term">clauses</span>, <span class="term">phrases</span>, <span class="term">words</span>, <span class="term">morphemes</span>, and <span class="term">characters</span>, which some encoders may prefer to the more generic approach typified by use of the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element. Using these, the first phrase above might be encoded simply as: <div id="index-egXML-d52e120664" class="pre egXML_valid"><span class="element">&lt;phr <span class="attribute">type</span>="<span class="attributevalue">noun</span>"&gt;</span><br /> <span class="element">&lt;w <span class="attribute">type</span>="<span class="attributevalue">adjective</span>"&gt;</span>Literate<span class="element">&lt;/w&gt;</span><br /> <span class="element">&lt;w <span class="attribute">type</span>="<span class="attributevalue">conjunction</span>"&gt;</span>and<span class="element">&lt;/w&gt;</span><br /> <span class="element">&lt;w <span class="attribute">type</span>="<span class="attributevalue">adjective</span>"&gt;</span>illiterate<span class="element">&lt;/w&gt;</span><br /> <span class="element">&lt;w <span class="attribute">type</span>="<span class="attributevalue">noun</span>"&gt;</span>speech<span class="element">&lt;/w&gt;</span><br /><span class="element">&lt;/phr&gt;</span></div> Note that the <span class="att">type</span> attribute of these specialized elements now carries the value carried by the <span class="att">subtype</span> attribute of the more general <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element. For an analysis that does not use these traditional linguistic categories, however, the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." 
href="ref-seg.html">seg</a> element provides a simple but powerful mechanism.</div><div class="p">In language corpora and similar material, the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element may be used to provide an end-to-end segmentation as an alternative to the more specific <a class="gi" title="(s-unit) contains a sentence-like division of a text." href="ref-s.html">s</a> element proposed in section <a class="link_ptr" href="AI.html#AILC" title="Linguistic Segment Categories"><span class="headingNumber">17.1 </span>Linguistic Segment Categories</a> for the markup of orthographic sentences, or <span class="term">s-units</span>. However, it may be more useful to use the <a class="gi" title="(s-unit) contains a sentence-like division of a text." href="ref-s.html">s</a> element for this purpose, since the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element then remains available to mark both features within s-units and segments composed of s-units, as in the following example:<span id="Note99_return"><a class="notelink" title="See section , where the text from which this fragment is taken is analyzed." 
href="#Note99"><sup>62</sup></a></span> <div id="index-egXML-d52e120713" class="pre egXML_valid"><span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">s1s3</span>" <span class="attribute">type</span>="<span class="attributevalue">narrative_unit</span>"&gt;</span><br /> <span class="element">&lt;s <span class="attribute">xml:id</span>="<span class="attributevalue">s1</span>"&gt;</span>Sigmund, the <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">patronymic</span>"&gt;</span>son of Volsung<span class="element">&lt;/seg&gt;</span>,<br />     was a king in Frankish country.<span class="element">&lt;/s&gt;</span><br /> <span class="element">&lt;s <span class="attribute">xml:id</span>="<span class="attributevalue">s2</span>"&gt;</span>Sinfiotli was the eldest of his sons.<span class="element">&lt;/s&gt;</span><br /> <span class="element">&lt;s <span class="attribute">xml:id</span>="<span class="attributevalue">s3</span>"&gt;</span> ... <span class="element">&lt;/s&gt;</span><br /><span class="element">&lt;/seg&gt;</span><div style="float: right;"><a href="BIB.html#AI-eg-01">bibliography</a> </div></div></div><div class="p">Like other elements, the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element must be properly nested within other elements. Thus, a single <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element can be used to group together words in different sentences only if the sentences are not themselves tagged. The first of the following two encodings is legal, but the second is not, because its seg and s elements overlap.  <div id="index-egXML-d52e120734" class="pre egXML_valid">Give me <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">phrase</span>"&gt;</span>a dozen. 
Or two or three.<span class="element">&lt;/seg&gt;</span></div> <pre class="pre_eg cdata">&lt;!-- Illegal! --&gt;
&lt;s&gt;Give me &lt;seg type="phrase"&gt;a dozen.&lt;/s&gt;
&lt;s&gt;Or two or three.&lt;/s&gt;&lt;/seg&gt;</pre></div><div class="p">The <span class="att">part</span> attribute may be used as one simple method of overcoming this restriction: <div id="index-egXML-d52e120747" class="pre egXML_valid"><span class="element">&lt;s&gt;</span>Give me <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">phrase</span>" <span class="attribute">part</span>="<span class="attributevalue">I</span>"&gt;</span>a dozen.<span class="element">&lt;/seg&gt;</span><span class="element">&lt;/s&gt;</span><br /><span class="element">&lt;s&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">part</span>="<span class="attributevalue">F</span>"&gt;</span>Or two or three.<span class="element">&lt;/seg&gt;</span><br /><span class="element">&lt;/s&gt;</span></div> Another solution is to use the <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> element discussed in section <a class="link_ptr" href="SA.html#SAAG" title="Aggregation"><span class="headingNumber">16.7 </span>Aggregation</a>; this requires that each of the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> elements be given an identifier. For further discussion of this generic encoding problem, see also chapter <a class="link_ptr" href="NH.html" title="31"><span class="headingNumber">20 </span>Non-hierarchical Structures</a>.</div><p>The <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element has the same content as a paragraph in prose: it can therefore be used to group together consecutive sequences of <a class="link_odd" title="groups elements which can appear either within or between paragraph-like elements." 
href="ref-model.inter.html">model.inter</a> class elements, such as lists, quotations, notes, stage directions, etc., as well as to contain sequences of phrase-level elements. It cannot, however, be used to group together sequences of paragraphs or similar text units such as verse lines; for this purpose, the encoder should use intermediate pointers, as described in section <a class="link_ptr" href="SA.html#SAPTIP" title="Intermediate Pointers"><span class="headingNumber">16.1.4 </span>Intermediate Pointers</a> or the methods described in section <a class="link_ptr" href="SA.html#SAAG" title="Aggregation"><span class="headingNumber">16.7 </span>Aggregation</a>. It is particularly important that the encoder provide a clear description of the principles by which a text has been segmented, and the way in which that segmentation is represented. This should include a description of the method used and the significance of any categorization codes. The description should be provided as a series of paragraphs within the <a class="gi" title="describes the principles according to which the text has been segmented, for example into sentences, tone-units, graphemic strata, etc." href="ref-segmentation.html">segmentation</a> element of the encoding description in the TEI header, as described in section <a class="link_ptr" href="HD.html#HD53" title="The Editorial Practices Declaration"><span class="headingNumber">2.3.3 </span>The Editorial Practices Declaration</a>.</p><p>The <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element may also be used to encode simultaneous or mutually exclusive variants of a text when the more specialized elements for simple editorial changes, abbreviation and expansion, addition and deletion, or for a critical apparatus are not appropriate. 
In these circumstances, one <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> is encoded for each possible variant, and the set of them is enclosed in a <a class="gi" title="groups a number of alternative encodings for the same point in a text." href="ref-choice.html">choice</a> element.</p><div class="p">For example, if one were writing dual-platform instructions for installation of software, it might be useful to use <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> to record platform-specific pieces of mutually exclusive text. <div id="index-egXML-d52e120801" class="pre egXML_valid">…pressing <span class="element">&lt;choice&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">platform</span>" <span class="attribute">subtype</span>="<span class="attributevalue">Mac</span>"&gt;</span>option<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">type</span>="<span class="attributevalue">platform</span>" <span class="attribute">subtype</span>="<span class="attributevalue">PC</span>"&gt;</span>alt<span class="element">&lt;/seg&gt;</span><br /><span class="element">&lt;/choice&gt;</span>-f will …</div></div><p>Elsewhere in this chapter we provide a number of examples where the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element is used simply to provide an element to which an identifier may be attached, for example so that another segment may be linked or related to it in some way.</p><p>The <a class="gi" title="(anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph." 
href="ref-ab.html">ab</a> (anonymous block) element performs a similar function to that of the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element, but is used for portions of the text which occur not within paragraphs or other component-level elements, but at the component level themselves. It is therefore a member of the <a class="link_odd" title="groups paragraph-like elements." href="ref-model.pLike.html">model.pLike</a> class.</p><div class="p">The <a class="gi" title="(anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph." href="ref-ab.html">ab</a> element may be used, for example, to tag the canonical verse divisions of Biblical texts: <div id="index-egXML-d52e120830" class="pre egXML_valid"><span class="element">&lt;div1 <span class="attribute">n</span>="<span class="attributevalue">Gen</span>" <span class="attribute">type</span>="<span class="attributevalue">book</span>"&gt;</span><br /> <span class="element">&lt;head&gt;</span>The First Book of Moses, Called<span class="element">&lt;/head&gt;</span><br /> <span class="element">&lt;head <span class="attribute">type</span>="<span class="attributevalue">main</span>"&gt;</span>Genesis<span class="element">&lt;/head&gt;</span><br /> <span class="element">&lt;div2 <span class="attribute">n</span>="<span class="attributevalue">1</span>" <span class="attribute">type</span>="<span class="attributevalue">chapter</span>"&gt;</span><br />  <span class="element">&lt;ab <span class="attribute">n</span>="<span class="attributevalue">1</span>"&gt;</span>In the beginning God created the heaven and the<br />       earth.<span class="element">&lt;/ab&gt;</span><br />  <span class="element">&lt;ab <span class="attribute">n</span>="<span class="attributevalue">2</span>"&gt;</span>And the earth 
was without form, and void; and darkness<br />   <span class="element">&lt;hi&gt;</span>was<span class="element">&lt;/hi&gt;</span> upon the face of the deep. And the Spirit of God<br />       moved upon the face of the waters.<span class="element">&lt;/ab&gt;</span><br />  <span class="element">&lt;ab <span class="attribute">n</span>="<span class="attributevalue">3</span>"&gt;</span>And God said, Let there be light: and there was<br />       light.<span class="element">&lt;/ab&gt;</span><br /> <span class="element">&lt;/div2&gt;</span><br /><span class="element">&lt;/div1&gt;</span><div style="float: right;"><a href="BIB.html#SASE-eg-40">bibliography</a> </div></div> </div><div class="p">In other cases, where the text clearly indicates paragraph divisions containing one or more verses, the <a class="gi" title="(paragraph) marks paragraphs in prose." href="ref-p.html">p</a> element may be used to tag the paragraphs, and the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element used to subdivide them. The <a class="gi" title="(anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph." href="ref-ab.html">ab</a> element is provided as an alternative to the <a class="gi" title="(paragraph) marks paragraphs in prose." href="ref-p.html">p</a> element; it may <em>not</em> be used within paragraphs. The <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element, by contrast, may appear only within and not between paragraphs (or anonymous block elements). 
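The two constraints just stated can be illustrated by contrast. The following sketch is not one of the Guidelines' own examples: it reuses the first verse quoted above, and both arrangements are offered only as assumed illustrations of markup that a schema generated from these Guidelines would be expected to reject. <pre class="pre_eg cdata">&lt;!-- Invalid: ab may not occur within a paragraph --&gt;
&lt;p&gt;
 &lt;ab n="1"&gt;In the beginning God created the heaven and the earth.&lt;/ab&gt;
&lt;/p&gt;
&lt;!-- Invalid: seg may not occur directly within a division, between paragraphs --&gt;
&lt;div2 n="1" type="chapter"&gt;
 &lt;seg n="1"&gt;In the beginning God created the heaven and the earth.&lt;/seg&gt;
&lt;/div2&gt;</pre> A valid arrangement, with each seg placed inside a p, is as follows: 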
<div id="index-egXML-d52e120869" class="pre egXML_valid"><span class="element">&lt;div1 <span class="attribute">n</span>="<span class="attributevalue">Gen</span>" <span class="attribute">type</span>="<span class="attributevalue">book</span>"&gt;</span><br /> <span class="element">&lt;head&gt;</span>Das Erste Buch Mose.<span class="element">&lt;/head&gt;</span><br /> <span class="element">&lt;div2 <span class="attribute">n</span>="<span class="attributevalue">1</span>" <span class="attribute">type</span>="<span class="attributevalue">chapter</span>"&gt;</span><br />  <span class="element">&lt;p&gt;</span><br />   <span class="element">&lt;seg <span class="attribute">n</span>="<span class="attributevalue">1</span>"&gt;</span>Am Anfang schuff Gott Himel vnd Erden.<span class="element">&lt;/seg&gt;</span><br />   <span class="element">&lt;seg <span class="attribute">n</span>="<span class="attributevalue">2</span>"&gt;</span>Vnd die Erde war wüst vnd leer / vnd es war<br />         finster auff der Tieffe / Vnd der Geist Gottes schwebet auff<br />         dem Wasser.<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;/p&gt;</span><br />  <span class="element">&lt;p&gt;</span><br />   <span class="element">&lt;seg <span class="attribute">n</span>="<span class="attributevalue">3</span>"&gt;</span>Vnd Gott sprach / Es werde Liecht / Vnd es ward<br />         Liecht.<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;/p&gt;</span><br /> <span class="element">&lt;/div2&gt;</span><br /><span class="element">&lt;/div1&gt;</span><div style="float: right;"><a href="BIB.html#SASE-eg-41">bibliography</a> </div></div> </div><div class="p">The <a class="gi" title="(anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph." 
href="ref-ab.html">ab</a> element is also useful for marking dramatic speeches when it is not clear whether the speech is to be regarded as prose or verse. If, for example, an encoder does not wish to express an opinion as to whether the opening lines of Shakespeare's <span class="titlem">The Tempest</span> are to be regarded as prose or as verse, they might be tagged as follows: <div id="index-egXML-d52e120892" class="pre egXML_valid"><span class="element">&lt;div1 <span class="attribute">n</span>="<span class="attributevalue">I</span>" <span class="attribute">type</span>="<span class="attributevalue">act</span>"&gt;</span><br /> <span class="element">&lt;div2 <span class="attribute">n</span>="<span class="attributevalue">1</span>" <span class="attribute">type</span>="<span class="attributevalue">scene</span>"&gt;</span><br />  <span class="element">&lt;head <span class="attribute">rend</span>="<span class="attributevalue">italic</span>"&gt;</span>Actus primus, Scena prima.<span class="element">&lt;/head&gt;</span><br />  <span class="element">&lt;stage <span class="attribute">rend</span>="<span class="attributevalue">italic</span>" <span class="attribute">type</span>="<span class="attributevalue">setting</span>"&gt;</span> A tempestuous noise of<br />       Thunder and Lightning heard:<br />       Enter a Ship-master, and a Boteswaine.<span class="element">&lt;/stage&gt;</span><br />  <span class="element">&lt;sp&gt;</span><br />   <span class="element">&lt;speaker&gt;</span>Master.<span class="element">&lt;/speaker&gt;</span><br />   <span class="element">&lt;ab&gt;</span>Bote-swaine.<span class="element">&lt;/ab&gt;</span><br />  <span class="element">&lt;/sp&gt;</span><br />  <span class="element">&lt;sp&gt;</span><br />   <span class="element">&lt;speaker&gt;</span>Botes.<span class="element">&lt;/speaker&gt;</span><br />   <span class="element">&lt;ab&gt;</span>Heere Master: What cheere?<span class="element">&lt;/ab&gt;</span><br />  <span 
class="element">&lt;/sp&gt;</span><br />  <span class="element">&lt;sp&gt;</span><br />   <span class="element">&lt;speaker&gt;</span>Mast.<span class="element">&lt;/speaker&gt;</span><br />   <span class="element">&lt;ab&gt;</span>Good: Speake to th' Mariners: fall too't, yarely,<br />         or we run our selues a ground, bestirre, bestirre.<br />    <span class="element">&lt;stage <span class="attribute">type</span>="<span class="attributevalue">move</span>"&gt;</span>Exit.<span class="element">&lt;/stage&gt;</span><span class="element">&lt;/ab&gt;</span><br />  <span class="element">&lt;/sp&gt;</span><br />  <span class="element">&lt;stage <span class="attribute">type</span>="<span class="attributevalue">move</span>"&gt;</span>Enter Mariners.<span class="element">&lt;/stage&gt;</span><br />  <span class="element">&lt;sp&gt;</span><br />   <span class="element">&lt;speaker&gt;</span>Botes.<span class="element">&lt;/speaker&gt;</span><br />   <span class="element">&lt;ab&gt;</span>Heigh my hearts, cheerely, cheerely my harts: yare, yare:<br />         Take in the toppe-sale: Tend to th' Masters whistle: Blow<br />         till thou burst thy winde, if roome e-nough.<span class="element">&lt;/ab&gt;</span><br />  <span class="element">&lt;/sp&gt;</span><br /> <span class="element">&lt;/div2&gt;</span><br /><span class="element">&lt;/div1&gt;</span><div style="float: right;"><a href="BIB.html#CODR-eg-295">bibliography</a> </div></div> See further <a class="link_ptr" href="CO.html#CODR" title="Core Tags for Drama"><span class="headingNumber">3.12.2 </span>Core Tags for Drama</a> and <a class="link_ptr" href="DR.html#DRPAL" title="Speech Contents"><span class="headingNumber">7.2.5 </span>Speech Contents</a>.</div></div><div class="div2" id="SASY"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SASE"><span class="headingNumber">16.3 </span>Blocks, Segments, and 
Anchors</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SACS"><span class="headingNumber">16.5 </span>Correspondence and Alignment</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h3><span class="bookmarklink"><a class="bookmarklink" href="#SASY" title="link to this section "><span class="invisible">TEI: Synchronization</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.4 </span><span class="head">Synchronization</span></h3><p>Elsewhere in this chapter (see <a class="link_ptr" href="SA.html#SACS" title="Correspondence and Alignment"><span class="headingNumber">16.5 </span>Correspondence and Alignment</a>) we discuss two particular kinds of alignment: the alignment of parallel texts in different languages, and the alignment of texts with portions of an image. In this section we address another specialized form of alignment: synchronization. The need to mark the relative positions of text components with respect to time arises most naturally and frequently in transcribed spoken texts, but it may arise in any text in which quoted speech occurs, or in which events are described within a time frame. 
The methods described here are also generalizable for other kinds of alignment (for example, alignment of text elements with respect to space).</p><div class="div3" id="SASYNC"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SASYMP"><span class="headingNumber">16.4.2 </span>Placing Synchronous Events in Time</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SASYNC" title="link to this section "><span class="invisible">TEI: Aligning Synchronous Events</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.4.1 </span><span class="head">Aligning Synchronous Events</span></h4><p>Provided that explicit elements are available to represent the parts or places to be synchronized, then the global linking attribute <span class="att">synch</span> may be used to encode such synchronization, once it has been identified. </p><ul class="specList"><li><span class="specList-classSpec"><a href="ref-att.global.linking.html">att.global.linking</a></span> provides a set of attributes for hypertextual linking.<table class="specDesc"><tr><td class="Attribute"><span class="att">synch</span></td><td>(synchronous) points to elements that are synchronous with the current element.</td></tr></table></li></ul><p> This is another of the attributes made globally available by the mechanism described in the introduction to this chapter. Alternatively, the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> and <a class="gi" title="(link group) defines a collection of associations or hypertextual links." 
href="ref-linkGrp.html">linkGrp</a> elements may be used to make explicit the fact that the synchronous elements are aligned.</p><p>To illustrate the use of these mechanisms for marking synchrony, consider the following representation of a spoken text:  </p><pre class="pre_eg cdata">B: The first time in twenty five years, we've cooked Christmas
   (unclear) for a blooming great load of people.
A: So you're [1] (unclear) [2]
B: [1] It will be [2] nice in a way, but, [3] be strange. [4]
A: [3] Yeah [4], yeah, cos it, it's [5] the [6]
B: [5] not [6]</pre><p>This representation uses numbers in brackets to mark the points at which speakers overlap each other. For example, the <span class="mentioned">[1]</span> in A's first speech is to be understood as coinciding with the <span class="mentioned">[1]</span> in B's second speech.<span id="Note100_return"><a class="notelink" title="This sample is taken from a conversation collected and transcribed for the British National Corpus." href="#Note100"><sup>63</sup></a></span></p><div class="p">To encode this we use the spoken texts module, described in chapter <a class="link_ptr" href="TS.html" title="11"><span class="headingNumber">8 </span>Transcriptions of Speech</a>, together with the module described in the present chapter. First, we transcribe this text, marking the synchronous points with <a class="gi" title="(anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element." href="ref-anchor.html">anchor</a> elements, and providing a <span class="att">synch</span> attribute on one anchor of each synchronous pair. As noted in section <a class="link_ptr" href="SA.html#SACSAL" title="Alignment of Parallel Texts"><span class="headingNumber">16.5.2 </span>Alignment of Parallel Texts</a>, correspondence, and hence synchrony, is a symmetric relation; the attribute therefore need only be specified on one member of each pair of synchronous anchors. 
<div id="index-egXML-d52e121304" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">xml:id</span>="<span class="attributevalue">BNC-d1</span>" <span class="attribute">type</span>="<span class="attributevalue">convers</span>"&gt;</span><br /> <span class="element">&lt;u <span class="attribute">xml:id</span>="<span class="attributevalue">u2b</span>" <span class="attribute">who</span>="<span class="attributevalue">#b</span>"&gt;</span> The first time in twenty five years,<br />     we've cooked Christmas <span class="element">&lt;unclear&gt;</span> for a blooming great<br />       load of people.<span class="element">&lt;/unclear&gt;</span><span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;u <span class="attribute">xml:id</span>="<span class="attributevalue">u3a</span>" <span class="attribute">who</span>="<span class="attributevalue">#a</span>"&gt;</span>So you're<br />  <span class="element">&lt;anchor <span class="attribute">synch</span>="<span class="attributevalue">#t1b</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">t1a</span>"/&gt;</span><br />  <span class="element">&lt;unclear&gt;</span><br />   <span class="element">&lt;anchor <span class="attribute">synch</span>="<span class="attributevalue">#t2b</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">t2a</span>"/&gt;</span><br />  <span class="element">&lt;/unclear&gt;</span><span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;u <span class="attribute">xml:id</span>="<span class="attributevalue">u3b</span>" <span class="attribute">who</span>="<span class="attributevalue">#b</span>"&gt;</span><br />  <span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">t1b</span>"/&gt;</span>It will be <span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">t2b</span>"/&gt;</span><br />     nice in a way, but, <span 
class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">t3b</span>"/&gt;</span><br />     be strange.<span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">t4b</span>"/&gt;</span><span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;u <span class="attribute">xml:id</span>="<span class="attributevalue">u4a</span>" <span class="attribute">who</span>="<span class="attributevalue">#a</span>"&gt;</span><br />  <span class="element">&lt;anchor <span class="attribute">synch</span>="<span class="attributevalue">#t3b</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">t3a</span>"/&gt;</span>Yeah<br />  <span class="element">&lt;anchor <span class="attribute">synch</span>="<span class="attributevalue">#t4b</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">t4a</span>"/&gt;</span>, yeah, cos it, it's<br />  <span class="element">&lt;anchor <span class="attribute">synch</span>="<span class="attributevalue">#t5b</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">t5a</span>"/&gt;</span>the<br />  <span class="element">&lt;anchor <span class="attribute">synch</span>="<span class="attributevalue">#t6b</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">t6a</span>"/&gt;</span><span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;u <span class="attribute">xml:id</span>="<span class="attributevalue">u4b</span>" <span class="attribute">who</span>="<span class="attributevalue">#b</span>"&gt;</span><br />  <span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">t5b</span>"/&gt;</span>not<span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">t6b</span>"/&gt;</span><span class="element">&lt;/u&gt;</span><br /><span class="comment">&lt;!-- ... 
--&gt;</span><br /><span class="element">&lt;/div&gt;</span><div style="float: right;"><a href="BIB.html#SA-eg-02">bibliography</a> </div></div></div><div class="p">We can encode this same example using <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> and <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> elements to make the temporal alignment explicit. A <a class="gi" title="(back matter) contains any appendixes, etc. following the main part of a text." href="ref-back.html">back</a> element has been used to enclose the <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> element, but the links may be located anywhere the encoder finds convenient: <div id="index-egXML-d52e121350" class="pre egXML_valid"><span class="element">&lt;back&gt;</span><br /> <span class="element">&lt;linkGrp <span class="attribute">xml:id</span>="<span class="attributevalue">lg1</span>"<br />  <span class="attribute">domains</span>="<span class="attributevalue">#BNC-d1 #BNC-d1</span>" <span class="attribute">targFunc</span>="<span class="attributevalue">speaker.a speaker.b</span>"<br />  <span class="attribute">type</span>="<span class="attributevalue">synchronous_alignment</span>"&gt;</span><br />  <span class="element">&lt;link <span class="attribute">xml:id</span>="<span class="attributevalue">L1</span>" <span class="attribute">target</span>="<span class="attributevalue">#t1a #t1b</span>"/&gt;</span><br />  <span class="element">&lt;link <span class="attribute">xml:id</span>="<span class="attributevalue">L2</span>" <span class="attribute">target</span>="<span class="attributevalue">#t2a #t2b</span>"/&gt;</span><br />  <span class="element">&lt;link <span class="attribute">xml:id</span>="<span 
class="attributevalue">L3</span>" <span class="attribute">target</span>="<span class="attributevalue">#t3a #t3b</span>"/&gt;</span><br />  <span class="element">&lt;link <span class="attribute">xml:id</span>="<span class="attributevalue">l4</span>" <span class="attribute">target</span>="<span class="attributevalue">#t4a #t4b</span>"/&gt;</span><br />  <span class="element">&lt;link <span class="attribute">xml:id</span>="<span class="attributevalue">l5</span>" <span class="attribute">target</span>="<span class="attributevalue">#t5a #t5b</span>"/&gt;</span><br />  <span class="element">&lt;link <span class="attribute">xml:id</span>="<span class="attributevalue">l6</span>" <span class="attribute">target</span>="<span class="attributevalue">#t6a #t6b</span>"/&gt;</span><br /> <span class="element">&lt;/linkGrp&gt;</span><br /><span class="element">&lt;/back&gt;</span></div> The <span class="att">xml:id</span> attributes are provided for the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> and <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> elements here for reasons discussed in the next section, <a class="link_ptr" href="SA.html#SASYMP" title="Placing Synchronous Events in Time"><span class="headingNumber">16.4.2 </span>Placing Synchronous Events in Time</a>.</div><div class="p">As with other forms of alignment, synchronization may be expressed between stretches of speech as well as between points. When complete utterances are synchronous, for example, if one person says <span class="mentioned">What?</span> and another <span class="mentioned">No!</span> at the same time, that can be represented without <a class="gi" title="(anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element." 
href="ref-anchor.html">anchor</a> elements as follows. <div id="index-egXML-d52e121383" class="pre egXML_valid"><span class="element">&lt;u <span class="attribute">synch</span>="<span class="attributevalue">#u02</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">u01</span>" <span class="attribute">who</span>="<span class="attributevalue">#a</span>"&gt;</span>What?<span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;u <span class="attribute">xml:id</span>="<span class="attributevalue">u02</span>" <span class="attribute">who</span>="<span class="attributevalue">#b</span>"&gt;</span>No!<span class="element">&lt;/u&gt;</span></div></div><div class="p">A simple way of expressing <span class="term">overlap</span> (where one speaker starts speaking before another has finished) is thus to use the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element to encode the overlapping portions of speech. For example, <div id="index-egXML-d52e121396" class="pre egXML_valid"><span class="element">&lt;u <span class="attribute">who</span>="<span class="attributevalue">#a</span>"&gt;</span> So you're <span class="element">&lt;unclear <span class="attribute">synch</span>="<span class="attributevalue">#u-b1</span>"/&gt;</span><span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;u <span class="attribute">who</span>="<span class="attributevalue">#b</span>"&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">u-b1</span>"&gt;</span> It will be <span class="element">&lt;/seg&gt;</span> nice in a way, but,<br /> <span class="element">&lt;seg <span class="attribute">synch</span>="<span class="attributevalue">#u-a3</span>"&gt;</span> be strange. 
<span class="element">&lt;/seg&gt;</span><span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;u <span class="attribute">who</span>="<span class="attributevalue">#a</span>"&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">u-a3</span>"&gt;</span> Yeah <span class="element">&lt;/seg&gt;</span>, yeah, cos it,<br />   it's <span class="element">&lt;seg <span class="attribute">synch</span>="<span class="attributevalue">#u-b2</span>"&gt;</span> the <span class="element">&lt;/seg&gt;</span><span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;u <span class="attribute">xml:id</span>="<span class="attributevalue">u-b2</span>" <span class="attribute">who</span>="<span class="attributevalue">#b</span>"&gt;</span> not <span class="element">&lt;/u&gt;</span></div> Note in this encoding how synchronization has been effected between an empty <a class="gi" title="contains a word, phrase, or passage which cannot be transcribed with certainty because it is illegible or inaudible in the source." href="ref-unclear.html">unclear</a> element and the content of a <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element, and between the content of a <a class="gi" title="(utterance) contains a stretch of speech usually preceded and followed by silence or by a change of speaker." href="ref-u.html">u</a> element and that of another <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a>, using the <span class="att">synch</span> attribute. Alternatively, a <a class="gi" title="(link group) defines a collection of associations or hypertextual links." 
href="ref-linkGrp.html">linkGrp</a> could be used in the same way as above.</div></div><div class="div3" id="SASYMP"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SASYNC"><span class="headingNumber">16.4.1 </span>Aligning Synchronous Events</a></li><li class="subtoc"></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SASYMP" title="link to this section "><span class="invisible">TEI: Placing Synchronous Events in Time</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.4.2 </span><span class="head">Placing Synchronous Events in Time</span></h4><p>A synchronous alignment specifies which points in a spoken text occur at the same time, and the order in which they occur, but does not say at what time those points actually occur. If that information is available to the encoder it can be represented by means of the <a class="gi" title="indicates a point in time either relative to other elements in the same timeline tag, or absolutely." href="ref-when.html">when</a> and <a class="gi" title="provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text." 
href="ref-timeline.html">timeline</a> elements, whose description and attributes are the following: </p><ul class="specList"><li><span class="specList-elementSpec"><a href="ref-when.html">when</a></span> indicates a point in time either relative to other elements in the same timeline tag, or absolutely.<table class="specDesc"><tr><td class="Attribute"><span class="att">absolute</span></td><td>supplies an absolute value for the time.</td></tr><tr><td class="Attribute"><span class="att">interval</span></td><td>specifies a time interval either as a number or as one of the keywords defined by the datatype data.interval</td></tr><tr><td class="Attribute"><span class="att">unit</span></td><td>specifies the unit of time in which the <span class="att">interval</span> value is expressed, if this is not inherited from the parent <a class="gi" title="provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text." href="ref-timeline.html">timeline</a>.
Suggested values include: d (days); h (hours); min (minutes); s (seconds); ms (milliseconds) </td></tr><tr><td class="Attribute"><span class="att">since</span></td><td>identifies the reference point for determining the time of the current <a class="gi" title="indicates a point in time either relative to other elements in the same timeline tag, or absolutely." href="ref-when.html">when</a> element, which is obtained by adding the interval to the time of the reference point.</td></tr></table></li><li><span class="specList-elementSpec"><a href="ref-timeline.html">timeline</a></span> provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text.<table class="specDesc"><tr><td class="Attribute"><span class="att">origin</span></td><td>designates the origin of the timeline, i.e. the time at which it begins.</td></tr><tr><td class="Attribute"><span class="att">interval</span></td><td>specifies a time interval either as a positive integral value or using one of a set of predefined codes.</td></tr><tr><td class="Attribute"><span class="att">unit</span></td><td>specifies the unit of time corresponding to the <span class="att">interval</span> value of the timeline or of its constituent points in time.
Suggested values include: d (days); h (hours); min (minutes); s (seconds); ms (milliseconds) </td></tr></table></li></ul><p>Each <a class="gi" title="indicates a point in time either relative to other elements in the same timeline tag, or absolutely." href="ref-when.html">when</a> element indicates a point in time, either directly by means of the <span class="att">absolute</span> attribute, whose value is a string which specifies a particular time, or indirectly by means of the <span class="att">since</span> attribute, which points to another <a class="gi" title="indicates a point in time either relative to other elements in the same timeline tag, or absolutely." href="ref-when.html">when</a>. If the <span class="att">since</span> attribute is used, then the <span class="att">interval</span> and <span class="att">unit</span> attributes should also be used to indicate the amount of time that has elapsed since the time specified by the element pointed to by the <span class="att">since</span> attribute; the value <span class="val">unknown</span> can be given to indicate that the interval is unknown.</p><p>If the <a class="gi" title="indicates a point in time either relative to other elements in the same timeline tag, or absolutely." href="ref-when.html">when</a> elements are uniformly spaced in time, then the <span class="att">interval</span> and <span class="att">unit</span> values need be given only once in the <a class="gi" title="provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text." href="ref-timeline.html">timeline</a>, and not repeated in any of the <a class="gi" title="indicates a point in time either relative to other elements in the same timeline tag, or absolutely." href="ref-when.html">when</a> elements. 
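</p><div class="p">As a minimal sketch of the uniformly spaced case, a timeline whose points fall at regular intervals might be encoded with the <span class="att">interval</span> and <span class="att">unit</span> values stated once on the <a class="gi" href="ref-timeline.html">timeline</a> element. The identifiers <span class="val">tL2</span> and <span class="val">r0</span> to <span class="val">r3</span>, the five-second spacing, and the start time are all invented for illustration and do not belong to the sample conversation; the <span class="att">since</span> attributes are included on the assumption that each point is to be measured from the one before it: <pre class="pre_eg cdata">&lt;timeline xml:id="tL2" origin="#r0" unit="s" interval="5"&gt;
 &lt;!-- hypothetical values: each point is assumed to fall 5 s after the point named by its since attribute --&gt;
 &lt;when xml:id="r0" absolute="11:30:00"/&gt;
 &lt;when xml:id="r1" since="#r0"/&gt;
 &lt;when xml:id="r2" since="#r1"/&gt;
 &lt;when xml:id="r3" since="#r2"/&gt;
&lt;/timeline&gt;</pre> Under these assumptions, <span class="val">r1</span> would fall at 11:30:05, <span class="val">r2</span> at 11:30:10, and <span class="val">r3</span> at 11:30:15, without any <a class="gi" href="ref-when.html">when</a> element having to repeat the interval or its unit.</div><p>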
If the intervals vary, but the units are all the same, then the <span class="att">unit</span> attribute alone can be given in the <a class="gi" title="provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text." href="ref-timeline.html">timeline</a> element, and the <span class="att">interval</span> attribute given in the <a class="gi" title="indicates a point in time either relative to other elements in the same timeline tag, or absolutely." href="ref-when.html">when</a> element.</p><p>The <span class="att">origin</span> attribute in the <a class="gi" title="provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text." href="ref-timeline.html">timeline</a> element points to a <a class="gi" title="indicates a point in time either relative to other elements in the same timeline tag, or absolutely." href="ref-when.html">when</a> element which specifies the reference or origin for the timings within the <a class="gi" title="provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text." href="ref-timeline.html">timeline</a>; this must, of course, specify its position in time absolutely. If the origin of a timeline is unknown, then this attribute may be omitted.</p><div class="p">The following <a class="gi" title="provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text." 
href="ref-timeline.html">timeline</a> might be used to accompany the marked up conversation shown in the preceding section: <div id="index-egXML-d52e121527" class="pre egXML_valid"><span class="element">&lt;timeline <span class="attribute">xml:id</span>="<span class="attributevalue">tL1</span>" <span class="attribute">origin</span>="<span class="attributevalue">#w0</span>"<br /> <span class="attribute">unit</span>="<span class="attributevalue">ms</span>"&gt;</span><br /> <span class="element">&lt;when <span class="attribute">xml:id</span>="<span class="attributevalue">w0</span>" <span class="attribute">absolute</span>="<span class="attributevalue">11:30:00</span>"/&gt;</span><br /> <span class="element">&lt;when <span class="attribute">xml:id</span>="<span class="attributevalue">w1</span>" <span class="attribute">interval</span>="<span class="attributevalue">unknown</span>"<br />  <span class="attribute">since</span>="<span class="attributevalue">#w0</span>"/&gt;</span><br /> <span class="element">&lt;when <span class="attribute">xml:id</span>="<span class="attributevalue">w2</span>" <span class="attribute">interval</span>="<span class="attributevalue">100</span>"<br />  <span class="attribute">since</span>="<span class="attributevalue">#w1</span>"/&gt;</span><br /> <span class="element">&lt;when <span class="attribute">xml:id</span>="<span class="attributevalue">w3</span>" <span class="attribute">interval</span>="<span class="attributevalue">200</span>"<br />  <span class="attribute">since</span>="<span class="attributevalue">#w2</span>"/&gt;</span><br /> <span class="element">&lt;when <span class="attribute">xml:id</span>="<span class="attributevalue">w4</span>" <span class="attribute">interval</span>="<span class="attributevalue">150</span>"<br />  <span class="attribute">since</span>="<span class="attributevalue">#w3</span>"/&gt;</span><br /> <span class="element">&lt;when <span class="attribute">xml:id</span>="<span class="attributevalue">w5</span>" <span 
class="attribute">interval</span>="<span class="attributevalue">250</span>"<br />  <span class="attribute">since</span>="<span class="attributevalue">#w4</span>"/&gt;</span><br /> <span class="element">&lt;when <span class="attribute">xml:id</span>="<span class="attributevalue">w6</span>" <span class="attribute">interval</span>="<span class="attributevalue">100</span>"<br />  <span class="attribute">since</span>="<span class="attributevalue">#w5</span>"/&gt;</span><br /><span class="element">&lt;/timeline&gt;</span></div> The information in this <a class="gi" title="provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text." href="ref-timeline.html">timeline</a> could now be linked to the information in the <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> which provides the temporal alignment (synchronization) for the text, as follows: <div id="index-egXML-d52e121543" class="pre egXML_valid"><span class="element">&lt;linkGrp <span class="attribute">type</span>="<span class="attributevalue">temporal_specification</span>"<br /> <span class="attribute">domains</span>="<span class="attributevalue">#lg1 #tL1</span>" <span class="attribute">targFunc</span>="<span class="attributevalue">synch.points when</span>"&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#L1 #w1</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#L2 #w2</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#L3 #w3</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#l4 #w4</span>"/&gt;</span><br /> <span class="element">&lt;link <span 
class="attribute">target</span>="<span class="attributevalue">#l5 #w5</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#l6 #w6</span>"/&gt;</span><br /><span class="element">&lt;/linkGrp&gt;</span></div></div><div class="p">To avoid the need for two distinct link groups (one marking the synchronization of anchors with each other, and the other marking their alignment with points on the time line) it would be better to link the <a class="gi" title="indicates a point in time either relative to other elements in the same timeline tag, or absolutely." href="ref-when.html">when</a> elements with the synchronous points directly: <div id="index-egXML-d52e121556" class="pre egXML_valid"><span class="element">&lt;linkGrp <span class="attribute">type</span>="<span class="attributevalue">temporal_specification</span>"<br /> <span class="attribute">domains</span>="<span class="attributevalue">#BNC-d1 #BNC-d1 #tL1</span>" <span class="attribute">targFunc</span>="<span class="attributevalue">speaker.a speaker.b when</span>"&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#t1a #t1b #w1</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#t2a #t2b #w2</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#t3a #t3b #w3</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#t4a #t4b #w4</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#t5a #t5b #w5</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#t6a #t6b #w6</span>"/&gt;</span><br /><span 
class="element">&lt;/linkGrp&gt;</span></div></div><div class="p">Finally, suppose that a digitized audio recording is also available, together with an XML file that assigns identifiers to the various temporal spans of sound, for example the following Synchronized Multimedia Integration Language (SMIL, pronounced "smile") fragment: <div id="index-egXML-d52e121568" class="pre egXML_invalid"><span class="element">&lt;audio xmlns="http://www.w3.org/2001/SMIL20/Language" <span class="attribute">src</span>="<span class="attributevalue">rtsp://soundstage.pi.cnr.it:554/home/az/bncSound/xmas4lots.mp3</span>"<br /> <span class="attribute">xml:id</span>="<span class="attributevalue">au1</span>" <span class="attribute">begin</span>="<span class="attributevalue">05.2s</span>"/&gt;</span><br /><span class="element">&lt;audio xmlns="http://www.w3.org/2001/SMIL20/Language" <span class="attribute">src</span>="<span class="attributevalue">rtsp://soundstage.pi.cnr.it:554/home/az/bncSound/xmas4lots.mp3</span>"<br /> <span class="attribute">xml:id</span>="<span class="attributevalue">au2</span>" <span class="attribute">begin</span>="<span class="attributevalue">05.7s</span>"/&gt;</span><br /><span class="element">&lt;audio xmlns="http://www.w3.org/2001/SMIL20/Language" <span class="attribute">src</span>="<span class="attributevalue">rtsp://soundstage.pi.cnr.it:554/home/az/bncSound/xmas4lots.mp3</span>"<br /> <span class="attribute">xml:id</span>="<span class="attributevalue">au3</span>" <span class="attribute">begin</span>="<span class="attributevalue">05.9s</span>"/&gt;</span><br /><span class="element">&lt;audio xmlns="http://www.w3.org/2001/SMIL20/Language" <span class="attribute">src</span>="<span class="attributevalue">rtsp://soundstage.pi.cnr.it:554/home/az/bncSound/xmas4lots.mp3</span>"<br /> <span class="attribute">xml:id</span>="<span class="attributevalue">au4</span>" <span class="attribute">begin</span>="<span class="attributevalue">06.3s</span>"/&gt;</span><br /><span 
class="element">&lt;audio xmlns="http://www.w3.org/2001/SMIL20/Language" <span class="attribute">src</span>="<span class="attributevalue">rtsp://soundstage.pi.cnr.it:554/home/az/bncSound/xmas4lots.mp3</span>"<br /> <span class="attribute">xml:id</span>="<span class="attributevalue">au5</span>" <span class="attribute">begin</span>="<span class="attributevalue">06.9s</span>"/&gt;</span><br /><span class="element">&lt;audio xmlns="http://www.w3.org/2001/SMIL20/Language" <span class="attribute">src</span>="<span class="attributevalue">rtsp://soundstage.pi.cnr.it:554/home/az/bncSound/xmas4lots.mp3</span>"<br /> <span class="attribute">xml:id</span>="<span class="attributevalue">au6</span>" <span class="attribute">begin</span>="<span class="attributevalue">07.4s</span>"/&gt;</span></div> URIs pointing to the <span class="gi">&lt;audio&gt;</span> elements could also be included as a fourth component in each of the above <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." 
href="ref-link.html">link</a> elements, thus providing a synchronized audio track to complement the transcribed text.</div><p>For further discussion of this and related aspects of encoding transcribed speech, refer to chapter <a class="link_ptr" href="TS.html" title="11"><span class="headingNumber">8 </span>Transcriptions of Speech</a>.</p></div></div><div class="div2" id="SACS"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SASY"><span class="headingNumber">16.4 </span>Synchronization</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SAIE"><span class="headingNumber">16.6 </span>Identical Elements and Virtual Copies</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h3><span class="bookmarklink"><a class="bookmarklink" href="#SACS" title="link to this section "><span class="invisible">TEI: Correspondence and Alignment</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.5 </span><span class="head">Correspondence and Alignment</span></h3><p>In this section we introduce the notions of <span class="term">correspondence</span>, expressed by the <span class="att">corresp</span> attribute, and of <span class="term">alignment</span>, which is a special kind of correspondence involving an ordered set of correspondences. Both cases may be represented using the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> and <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> elements introduced in section <a class="link_ptr" href="SA.html#SAPT" title="Links"><span class="headingNumber">16.1 </span>Links</a>. 
We also discuss the special case of alignment in time or <span class="term">synchronization</span>, for which special purpose elements are proposed in section <a class="link_ptr" href="SA.html#SASY" title="Synchronization"><span class="headingNumber">16.4 </span>Synchronization</a>.</p><div class="div3" id="SACS1"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SACSAL"><span class="headingNumber">16.5.2 </span>Alignment of Parallel Texts</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SACS1" title="link to this section "><span class="invisible">TEI: Correspondence</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.5.1 </span><span class="head">Correspondence</span></h4><p>A common requirement in text analysis is to represent correspondences between two or more parts of a single document, or between places in different documents. Provided that explicit elements are available to represent the parts or places to be linked, then the global linking attribute <span class="att">corresp</span> may be used to encode such correspondence, once it has been identified. </p><ul class="specList"><li><span class="specList-classSpec"><a href="ref-att.global.linking.html">att.global.linking</a></span> provides a set of attributes for hypertextual linking.<table class="specDesc"><tr><td class="Attribute"><span class="att">corresp</span></td><td>(corresponds) points to elements that correspond to the current element in some way.</td></tr></table></li></ul><p> This is one of the attributes made available by the mechanism described in the introduction to this chapter (<a class="link_ptr" href="SA.html" title="14"><span class="headingNumber">16 </span>Linking, Segmentation, and Alignment</a>). 
Correspondence can also be expressed by means of the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element introduced in section <a class="link_ptr" href="SA.html#SAPT" title="Links"><span class="headingNumber">16.1 </span>Links</a>.</p><p>Where the correspondence is between <em>spans</em>, the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element should be used, if no other element is available. Where the correspondence is between <em>points</em>, the <a class="gi" title="(anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element." href="ref-anchor.html">anchor</a> element should be used, if no other element is available.</p><div class="p">The use of the <span class="att">corresp</span> attribute with spans of content is illustrated by the following example: <div id="index-egXML-d52e122176" class="pre egXML_valid"><span class="element">&lt;title <span class="attribute">xml:id</span>="<span class="attributevalue">SHIRLEY</span>"&gt;</span>Shirley<span class="element">&lt;/title&gt;</span>, which made<br /> its Friday night debut only a month ago, was<br /> not listed on <span class="element">&lt;name <span class="attribute">xml:id</span>="<span class="attributevalue">NBC</span>"&gt;</span>NBC<span class="element">&lt;/name&gt;</span>'s new schedule,<br /> although <span class="element">&lt;seg <span class="attribute">corresp</span>="<span class="attributevalue">#NBC</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">NETWORK</span>"&gt;</span>the network<span class="element">&lt;/seg&gt;</span><br /> says <span class="element">&lt;seg <span class="attribute">corresp</span>="<span class="attributevalue">#SHIRLEY</span>" <span class="attribute">xml:id</span>="<span 
class="attributevalue">SHOW</span>"&gt;</span>the show<span class="element">&lt;/seg&gt;</span><br /> still is being considered.<div style="float: right;"><a href="BIB.html#SACS1-eg-48">bibliography</a> </div></div>    Here the anaphoric phrases <span class="mentioned">the network</span> and <span class="mentioned">the show</span> have been associated directly with the elements to which they refer by means of <span class="att">corresp</span> attributes. This mechanism is simple to apply, but has the drawback that it is not possible to specify more exactly what kind of correspondence is intended. Where this attribute is used, therefore, encoders are encouraged to specify their intent in the associated encoding description in the TEI header.</div><div class="p">Essentially, what the <span class="att">corresp</span> attribute does is to specify that elements bearing this attribute and those to which the attribute points are doubly linked. In the example above, the use of the <span class="att">corresp</span> attribute indicates that the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element containing <span class="q">‘the show’</span> and the <a class="gi" title="contains a title for any kind of work." href="ref-title.html">title</a> element containing <span class="q">‘Shirley’</span> correspond to each other: the correspondence relationship is not ‘from’ one to the other, but ‘between’ the two objects. It is thus different from the <span class="att">target</span> attribute, and provides functionality more similar to that of the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> and <a class="gi" title="(link group) defines a collection of associations or hypertextual links." 
href="ref-linkGrp.html">linkGrp</a> elements defined in section <a class="link_ptr" href="SA.html#SAPT" title="Links"><span class="headingNumber">16.1 </span>Links</a>, although it lacks their ability to indicate more precisely what kind of correspondence is intended, as in the following retagging of the preceding example: <div id="index-egXML-d52e122245" class="pre egXML_valid"><span class="element">&lt;title <span class="attribute">xml:id</span>="<span class="attributevalue">shirley</span>"&gt;</span>Shirley<span class="element">&lt;/title&gt;</span>, which made<br /> its Friday night debut only a month ago, was not<br /> listed on <span class="element">&lt;name <span class="attribute">xml:id</span>="<span class="attributevalue">nbc</span>"&gt;</span>NBC<span class="element">&lt;/name&gt;</span>'s new schedule,<br /> although <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">network</span>"&gt;</span>the network<span class="element">&lt;/seg&gt;</span> says<br /><span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">show</span>"&gt;</span>the show<span class="element">&lt;/seg&gt;</span> still is being considered.<br /><br /><span class="element">&lt;linkGrp <span class="attribute">type</span>="<span class="attributevalue">anaphoric_link</span>"<br /> <span class="attribute">targFunc</span>="<span class="attributevalue">antecedent anaphor</span>"&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#shirley #show</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#nbc #network</span>"/&gt;</span><br /><span class="element">&lt;/linkGrp&gt;</span></div></div><div class="p">In the following example, we use the same mechanism to express a correspondence amongst the anchors introduced following the fifth word after <span 
class="mentioned">English</span> in a text: <div id="index-egXML-d52e122266" class="pre egXML_valid">English language. Except for not very<span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">en1</span>"/&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /> English at all at the time<br /><span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">en2</span>"/&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /> English was still full of flaws<br /><span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">en3</span>"/&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /> English. This was revised by young<br /><span class="element">&lt;anchor <span class="attribute">xml:id</span>="<span class="attributevalue">en4</span>"/&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /><span class="element">&lt;linkGrp <span class="attribute">type</span>="<span class="attributevalue">five-word_collocates</span>"&gt;</span><br /> <span class="element">&lt;link <span class="attribute">type</span>="<span class="attributevalue">collocates_of_ENGLISH</span>"<br />  <span class="attribute">target</span>="<span class="attributevalue">#en1 #en2 #en3 #en4</span>"/&gt;</span><br /><span class="comment">&lt;!-- ... 
--&gt;</span><br /><span class="element">&lt;/linkGrp&gt;</span></div></div></div><div class="div3" id="SACSAL"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SACS1"><span class="headingNumber">16.5.1 </span>Correspondence</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SACSXA"><span class="headingNumber">16.5.3 </span>A Three-way Alignment</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SACSAL" title="link to this section "><span class="invisible">TEI: Alignment of Parallel Texts</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.5.2 </span><span class="head">Alignment of Parallel Texts</span></h4><p>One very important application area for the alignment of parallel texts is multilingual corpora. Consider, for example, the need to align ‘translation pairs’ of sentences drawn from a corpus such as the Canadian Hansard, in which each sentence is given in both English and French. Concerning this problem, Gale and Church write: </p><div class="q">Most English sentences match exactly one French sentence, but it is possible for an English sentence to match two or more French sentences. The first two English sentences [in the example below] illustrate a particularly hard case where two English sentences align to two French sentences. No smaller alignments are possible because the clause <span class="q">‘...sales...were higher...’</span> in the first English sentence corresponds to (part of) the second French sentence. The next two alignments ... illustrate the more typical case where one English sentence aligns with exactly one French sentence. The final alignment matches two English sentences to a single French sentence. 
These alignments [which were produced by a computer program] agreed with the results produced by a human judge.<span id="Note101_return"><a class="notelink" title="See Gale and Church, from which the example in the text is taken." href="#Note101"><sup>64</sup></a></span></div><p>The alignment produced by Gale and Church's program can be expressed in four different ways, arising from two independent choices. The encoder must first decide whether to represent the alignment in terms of points within each text (using the <a class="gi" title="(anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element." href="ref-anchor.html">anchor</a> element) or in terms of whole stretches of text, using the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element. To some extent the choice will depend on the process by which the software works out where alignment occurs, and the intention of the encoder. Secondly, the encoder may elect to represent the actual alignment using either <span class="att">corresp</span> attributes attached to the individual <a class="gi" title="(anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element." href="ref-anchor.html">anchor</a> or <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> elements, or using a free-standing <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> element.</p><div class="p">We present first a solution using <a class="gi" title="(anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element." 
href="ref-anchor.html">anchor</a> elements bearing only <span class="att">corresp</span> attributes: <div id="index-egXML-d52e122329" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">xml:lang</span>="<span class="attributevalue">en</span>" <span class="attribute">type</span>="<span class="attributevalue">subsection</span>"&gt;</span><br /> <span class="element">&lt;p&gt;</span><br />  <span class="element">&lt;anchor <span class="attribute">corresp</span>="<span class="attributevalue">#fa1</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">ea1</span>"/&gt;</span>According to our survey, 1988<br />     sales of mineral water and soft drinks were much higher than in 1987,<br />     reflecting the growing popularity of these products. Cola drink<br />     manufacturers in particular achieved above-average growth rates.<br />  <span class="element">&lt;anchor <span class="attribute">corresp</span>="<span class="attributevalue">#fa2</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">ea2</span>"/&gt;</span>The higher turnover was largely<br />     due to an increase in the sales volume.<br />  <span class="element">&lt;anchor <span class="attribute">corresp</span>="<span class="attributevalue">#fa3</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">ea3</span>"/&gt;</span>Employment and investment levels also climbed.<br />  <span class="element">&lt;anchor <span class="attribute">corresp</span>="<span class="attributevalue">#fa4</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">ea4</span>"/&gt;</span>Following a two-year transitional period,<br />     the new Foodstuffs Ordinance for Mineral Water came into effect on<br />     April 1, 1988. 
Specifically, it contains more stringent requirements<br />     regarding quality consistency and purity guarantees.<span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;/div&gt;</span><br /><span class="element">&lt;div <span class="attribute">xml:lang</span>="<span class="attributevalue">fr</span>" <span class="attribute">type</span>="<span class="attributevalue">subsection</span>"&gt;</span><br /> <span class="element">&lt;p&gt;</span><br />  <span class="element">&lt;anchor <span class="attribute">corresp</span>="<span class="attributevalue">#ea1</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">fa1</span>"/&gt;</span>Quant aux eaux minérales<br />     et aux limonades, elles rencontrent toujours plus d'adeptes. En effet,<br />     notre sondage fait ressortir des ventes nettement supérieures<br />     à celles de 1987, pour les boissons à base de cola<br />     notamment. <span class="element">&lt;anchor <span class="attribute">corresp</span>="<span class="attributevalue">#ea2</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">fa2</span>"/&gt;</span>La progression des<br />     chiffres d'affaires résulte en grande partie de l'accroissement<br />     du volume des ventes. 
<span class="element">&lt;anchor <span class="attribute">corresp</span>="<span class="attributevalue">#ea3</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">fa3</span>"/&gt;</span>L'emploi et<br />     les investissements ont également augmenté.<br />  <span class="element">&lt;anchor <span class="attribute">corresp</span>="<span class="attributevalue">#ea4</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">fa4</span>"/&gt;</span>La nouvelle ordonnance fédérale<br />     sur les denrées alimentaires concernant entre autres les eaux<br />     minérales, entrée en vigueur le 1er avril 1988 après<br />     une période transitoire de deux ans, exige surtout une plus<br />     grande constance dans la qualité et une garantie de la<br />     pureté.<span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;/div&gt;</span><div style="float: right;"><a href="BIB.html#SA-BIBL-1">bibliography</a> </div></div></div><p>There is no requirement that the <span class="att">corresp</span> attribute be specified in both English and French texts, since (as noted above) this attribute is defined as representing a mutual association. However, it may simplify processing to do so, and also avoids giving the impression that the English is translating the French, or vice versa. More seriously, this encoding does not make explicit that it is in fact the entire stretch of text between the anchors which is being aligned, not simply the points themselves. If for example one text contained material omitted from the other, this approach would not be appropriate.</p><div class="p">We now present the same passage using the alternative <a class="gi" title="(link group) defines a collection of associations or hypertextual links." 
href="ref-linkGrp.html">linkGrp</a> mechanism and marking explicitly the segments which have been aligned: <div id="index-egXML-d52e122360" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">xml:id</span>="<span class="attributevalue">div-e</span>" <span class="attribute">xml:lang</span>="<span class="attributevalue">en</span>"<br /> <span class="attribute">type</span>="<span class="attributevalue">subsection</span>"&gt;</span><br /> <span class="element">&lt;p&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">e_1</span>"&gt;</span>According to our survey, 1988 sales of mineral<br />       water and soft drinks were much higher than in 1987,<br />       reflecting the growing popularity of these products. Cola<br />       drink manufacturers in particular achieved above-average<br />       growth rates.<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">e_2</span>"&gt;</span>The higher turnover was largely due to an<br />       increase in the sales volume.<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">e_3</span>"&gt;</span>Employment and investment levels also climbed.<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">e_4</span>"&gt;</span>Following a two-year transitional period, the new<br />       Foodstuffs Ordinance for Mineral Water came into effect on<br />       April 1, 1988. 
Specifically, it contains more stringent<br />       requirements regarding quality consistency and purity<br />       guarantees.<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;/div&gt;</span><br /><span class="element">&lt;div <span class="attribute">xml:id</span>="<span class="attributevalue">div-f</span>" <span class="attribute">xml:lang</span>="<span class="attributevalue">fr</span>"<br /> <span class="attribute">type</span>="<span class="attributevalue">subsection</span>"&gt;</span><br /> <span class="element">&lt;p&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">f_1</span>"&gt;</span>Quant aux eaux minérales et aux limonades,<br />       elles rencontrent toujours plus d'adeptes. En effet, notre<br />       sondage fait ressortir des ventes nettement<br />       supérieures à celles de 1987, pour les<br />       boissons à base de cola notamment.<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">f_2</span>"&gt;</span>La progression des chiffres d'affaires<br />       résulte en grande partie de l'accroissement du volume<br />       des ventes.<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">f_3</span>"&gt;</span>L'emploi et les investissements ont<br />       également augmenté.<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">f_4</span>"&gt;</span>La nouvelle ordonnance fédérale sur<br />       les denrées alimentaires concernant entre autres les<br />       eaux minérales, entrée en vigueur le 1er avril<br />       1988 après une période transitoire de deux<br />       ans, exige surtout une plus grande constance dans la<br />       
qualité et une garantie de la pureté.<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;/div&gt;</span><br /><span class="element">&lt;linkGrp <span class="attribute">type</span>="<span class="attributevalue">alignment</span>"<br /> <span class="attribute">domains</span>="<span class="attributevalue">#div-e #div-f</span>"&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#e_1 #f_1</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#e_2 #f_2</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#e_3 #f_3</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#e_4 #f_4</span>"/&gt;</span><br /><span class="element">&lt;/linkGrp&gt;</span></div></div><div class="p">Note that use of the <a class="gi" title="(anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph." 
href="ref-ab.html">ab</a> element allows us to mark up the orthographic sentences in both languages independently of the alignment. The first translation pair in this example might be marked up as follows: <div id="index-egXML-d52e122391" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">xml:id</span>="<span class="attributevalue">english</span>" <span class="attribute">xml:lang</span>="<span class="attributevalue">en</span>"<br /> <span class="attribute">type</span>="<span class="attributevalue">subsection</span>"&gt;</span><br /> <span class="element">&lt;ab <span class="attribute">xml:id</span>="<span class="attributevalue">english1</span>"&gt;</span><br />  <span class="element">&lt;s&gt;</span>According to our survey, 1988 sales of mineral water and soft<br />       drinks were much higher than in 1987, reflecting the growing popularity<br />       of these products.<span class="element">&lt;/s&gt;</span><br />  <span class="element">&lt;s&gt;</span>Cola drink manufacturers in particular achieved above-average<br />       growth rates.<span class="element">&lt;/s&gt;</span><br /> <span class="element">&lt;/ab&gt;</span><br /><span class="element">&lt;/div&gt;</span><br /><span class="element">&lt;div <span class="attribute">xml:id</span>="<span class="attributevalue">french</span>" <span class="attribute">xml:lang</span>="<span class="attributevalue">fr</span>"<br /> <span class="attribute">type</span>="<span class="attributevalue">subsection</span>"&gt;</span><br /> <span class="element">&lt;ab <span class="attribute">xml:id</span>="<span class="attributevalue">french1</span>"&gt;</span><br />  <span class="element">&lt;s <span class="attribute">xml:id</span>="<span class="attributevalue">fs1</span>"&gt;</span>Quant aux eaux minérales et aux limonades, elles<br />       rencontrent toujours plus d'adeptes.<span class="element">&lt;/s&gt;</span><br />  <span class="element">&lt;s <span class="attribute">xml:id</span>="<span 
class="attributevalue">fs2</span>"&gt;</span>En effet, notre sondage fait ressortir des ventes nettement<br />       supérieures à celles de 1987, pour les boissons à<br />       base de cola notamment.<span class="element">&lt;/s&gt;</span><br /> <span class="element">&lt;/ab&gt;</span><br /><span class="element">&lt;/div&gt;</span></div></div></div><div class="div3" id="SACSXA"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SACSAL"><span class="headingNumber">16.5.2 </span>Alignment of Parallel Texts</a></li><li class="subtoc"></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SACSXA" title="link to this section "><span class="invisible">TEI: A Three-way Alignment</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.5.3 </span><span class="head">A Three-way Alignment</span></h4><p>The preceding encoding of the alignment of parallel passages from two texts requires that those texts and the alignment all be part of the same document. If the texts are in separate documents, then complete URIs, whether absolute or relative (section <a class="link_ptr" href="SA.html" title="14"><span class="headingNumber">16 </span>Linking, Segmentation, and Alignment</a>), will be required. These external pointers may appear anywhere within the document, but if they are created solely for use in encoding links, they may for convenience be grouped within the <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> (or other grouping element that uses them for linking).</p><p>To demonstrate this facility, we consider how we might encode the alignments in an extract from Comenius' <span class="titlem">Orbis Sensualium Pictus</span>, in the English translation of Charles Hoole (1659).  
</p><figure class="figure float fullpage" id="COMENIUS"><img src="Images/compic.png" alt="" class="graphic" /></figure><p> Each topic covered in this work has three parts: a picture, a prose text in Latin describing the topic, and a carefully-aligned translation of the Latin into English, German, or some other vernacular. Key terms in the two texts are typographically distinct, and are linked to the picture by numbers, which appear in the two texts and within the picture as well.</p><div class="p">First, we consider the text portions. The English and Latin portions have been encoded as distinct <a class="gi" title="(text division) contains a subdivision of the front, body, or back of a text." href="ref-div.html">div</a> elements. Identifiers have been attached to each typographic line, but no other encoding added, to simplify the example.  <div id="index-egXML-d52e122432" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">xml:id</span>="<span class="attributevalue">e98</span>" <span class="attribute">xml:lang</span>="<span class="attributevalue">en</span>"<br /> <span class="attribute">type</span>="<span class="attributevalue">lesson</span>"&gt;</span><br /> <span class="element">&lt;head&gt;</span>The Study<span class="element">&lt;/head&gt;</span><br /> <span class="element">&lt;p&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">e9801</span>"&gt;</span>The Study<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">e9802</span>"&gt;</span>is a place<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">e9803</span>"&gt;</span>where a Student,<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span 
class="attributevalue">e9804</span>"&gt;</span>a part from men,<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">e9805</span>"&gt;</span>sitteth alone,<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">e9806</span>"&gt;</span>addicted to his Studies,<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">e9807</span>"&gt;</span>whilst he readeth<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">e9808</span>"&gt;</span>Books,<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;/div&gt;</span><br /><span class="element">&lt;div <span class="attribute">xml:id</span>="<span class="attributevalue">l98</span>" <span class="attribute">xml:lang</span>="<span class="attributevalue">la</span>"<br /> <span class="attribute">type</span>="<span class="attributevalue">lesson</span>"&gt;</span><br /> <span class="element">&lt;head&gt;</span>Muséum<span class="element">&lt;/head&gt;</span><br /> <span class="element">&lt;p&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">l9801</span>"&gt;</span>Museum<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">l9802</span>"&gt;</span>est locus<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">l9803</span>"&gt;</span>ubi Studiosus,<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span 
class="attribute">xml:id</span>="<span class="attributevalue">l9804</span>"&gt;</span>secretus ab hominibus,<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">l9805</span>"&gt;</span>solus sedet,<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">l9806</span>"&gt;</span>Studiis deditus,<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">l9807</span>"&gt;</span>dum lectitat<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">l9808</span>"&gt;</span>Libros,<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;/div&gt;</span><div style="float: right;"><a href="BIB.html#SA-BIBL-2">bibliography</a> </div></div></div><div class="p">Next we consider the non-textual parts of the page. Encoding this requires providing two distinct components: firstly a digitized rendering of the page itself, and secondly a representation of the areas within that image which are to be aligned. In section <a class="link_ptr" href="PH.html#PHFAX" title="Digital Facsimiles"><span class="headingNumber">11.1 </span>Digital Facsimiles</a> we present a simple way of doing this using the TEI-defined markup for alignment of facsimiles. In the present chapter we demonstrate a more powerful means of aligning arbitrary polygons and points, which uses the XML notation SVG (see <a class="link_ref" href="BIB.html#SVG-11" title="Scalable Vector Graphics (SVG) 1.1 (Second Edition). Editors Erik Dahlström Jon Ferraiolo 藤沢 淳 Anthony Grasso Dean Jackson Chri...">SVG</a>). 
This provides appropriate facilities for both these requirements: <div id="index-egXML-d52e122481" class="pre egXML_invalid"><span class="element">&lt;svg xmlns="http://www.w3.org/2000/svg"&gt;</span><br /> <span class="element">&lt;image <span class="attribute">xlink:href</span>="<span class="attributevalue">p1764.png</span>" <span class="attribute">width</span>="<span class="attributevalue">597</span>"<br />  <span class="attribute">height</span>="<span class="attributevalue">897</span>" <span class="attribute">id</span>="<span class="attributevalue">p981</span>"/&gt;</span><br /> <span class="element">&lt;rect <span class="attribute">id</span>="<span class="attributevalue">p982</span>" <span class="attribute">x</span>="<span class="attributevalue">75</span>" <span class="attribute">y</span>="<span class="attributevalue">75</span>" <span class="attribute">width</span>="<span class="attributevalue">25</span>"<br />  <span class="attribute">height</span>="<span class="attributevalue">10</span>"/&gt;</span><br /> <span class="element">&lt;rect <span class="attribute">id</span>="<span class="attributevalue">p983</span>" <span class="attribute">x</span>="<span class="attributevalue">55</span>" <span class="attribute">y</span>="<span class="attributevalue">42</span>" <span class="attribute">width</span>="<span class="attributevalue">25</span>"<br />  <span class="attribute">height</span>="<span class="attributevalue">10</span>"/&gt;</span><br /><span class="element">&lt;/svg&gt;</span></div> This example of SVG defines two rectangles at the locations with the specified x and y coordinates. A view is defined on these, enabling them to be mapped by an SVG processor to the image found at the URL specified (<span class="ident-file">p1764.png</span>). 
It also defines unique identifiers for the whole image and for the two views of it, which we will use within our alignment, as shown next (for further discussion of the handling of images and graphics, see section <a class="link_ptr" href="FT.html#FTGRA" title="Specific Elements for Graphic Images"><span class="headingNumber">14.4 </span>Specific Elements for Graphic Images</a>; for further discussion of using non-TEI XML vocabularies such as SVG within a TEI document, see section <a class="link_ptr" href="TD.html#ST-aliens" title="Combining TEI and NonTEI Modules"><span class="headingNumber">22.8.2 </span>Combining TEI and Non-TEI Modules</a>).</div><p>As printed, the Comenius text exhibits three kinds of alignment. </p><ol class="numbered"><li class="item">The English and Latin portions are printed in two parallel columns, with corresponding phrases (represented above by <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> elements) more or less next to each other.</li><li class="item">Particular words or phrases are marked as terms in the two languages by a change of rendition: the English text, which otherwise uses black letter type throughout, has the words <span class="mentioned">The Study</span>, <span class="mentioned">a Student</span>, <span class="mentioned">Studies</span>, and <span class="mentioned">Books</span> in a roman font; in the Latin text, which is printed in roman, the corresponding words (<span class="mentioned">Museum</span>, <span class="mentioned">Studiosus</span>, <span class="mentioned">Studiis</span>, and <span class="mentioned">Libros</span>) are all in italic.</li><li class="item">Numbered labels appear within the text portions, linking keywords to each other and to sections of the picture. 
These labels, which have been left out of the above encoding, are attached to the first, third, and last segments in each language quoted below, and also appear (rather indistinctly) within the picture itself. Thus, the images of the study, the student, and his books are each aligned with the correct term for them in the two languages. </li></ol><div class="p">The first kind of alignment might be represented by using the <span class="att">corresp</span> attribute on the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> element. The second kind might be represented by using the <a class="gi" title="identifies a phrase or word used to provide a gloss or definition for some other word or phrase." href="ref-gloss.html">gloss</a> and <a class="gi" title="contains a single-word, multi-word, or symbolic designation which is regarded as a technical term." href="ref-term.html">term</a> mechanism described in section <a class="link_ptr" href="CO.html#COHQU" title="Terms Glosses Equivalents and Descriptions"><span class="headingNumber">3.3.4 </span>Terms, Glosses, Equivalents, and Descriptions</a>. The third kind of alignment might be represented using pointers embedded within the texts, for example: <div id="index-egXML-d52e122553" class="pre egXML_valid"><br /><span class="comment">&lt;!--... 
--&gt;</span><span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">xe9803</span>"&gt;</span>where a <span class="element">&lt;ref <span class="attribute">n</span>="<span class="attributevalue">2</span>" <span class="attribute">target</span>="<span class="attributevalue">#xp982</span>"&gt;</span>Student<span class="element">&lt;/ref&gt;</span>,<span class="element">&lt;/seg&gt;</span><br /><span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">xl9803</span>"&gt;</span>ubi <span class="element">&lt;ref <span class="attribute">n</span>="<span class="attributevalue">2</span>" <span class="attribute">target</span>="<span class="attributevalue">#xp982</span>"&gt;</span>Studiosus<span class="element">&lt;/ref&gt;</span>,<span class="element">&lt;/seg&gt;</span><br /><span class="comment">&lt;!--... --&gt;</span></div> We choose however to use the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element, since this provides a more efficient way of representing the three-way alignment between English, Latin, and picture without redundancy. 
<div id="index-egXML-d52e122570" class="pre egXML_valid"><span class="element">&lt;linkGrp <span class="attribute">type</span>="<span class="attributevalue">alignment</span>"&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#xe9801 #xl9801 #xp981</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#xe9802 #xl9802</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#xe9803 #xl9803 #xp982</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#xe9804 #xl9804</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#xe9805 #xl9805</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#xe9806 #xl9806</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#xe9807 #xl9807</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#xe9808 #xl9808 #xp983</span>"/&gt;</span><br /><span class="element">&lt;/linkGrp&gt;</span></div></div><p>This map, of course, only aligns whole segments and image portions, since these are the only parts of our encoding which bear identifiers and can therefore be pointed to. To add to it the alignment between the typographically distinct words mentioned above, new elements must be defined, either within the text itself or externally by using stand off techniques. Encoding these word pairs as <a class="gi" title="contains a single-word, multi-word, or symbolic designation which is regarded as a technical term." 
href="ref-term.html">term</a> and <a class="gi" title="identifies a phrase or word used to provide a gloss or definition for some other word or phrase." href="ref-gloss.html">gloss</a>, although intuitively obvious, requires a non-trivial decision as to whether the Latin text is glossing the English, or vice versa. Tagging all the marked words as <a class="gi" title="contains a single-word, multi-word, or symbolic designation which is regarded as a technical term." href="ref-term.html">term</a> avoids the difficult decision, but might be thought by some encoders to convey the wrong information about the words in question. Simply tagging them as additional embedded <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> elements with identifiers that can be aligned like the others is also a possibility.</p><div class="p">These solutions all require the addition of further markup to the text. This may pose no problems, or it may be infeasible, for example because the text is held on a read-only medium. If it is not feasible to add more markup to the original text, some form of stand-off markup will be needed. Any item within the text that can be pointed to using the various pointer schemes discussed in this chapter may be used, not simply those which rely on the existence of an <span class="att">xml:id</span> attribute. 
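By way of illustration only, here is a minimal sketch of the stand-off case. It assumes that the segmented text shown earlier is held unchanged in a separate file, here given the hypothetical name <span class="ident-file">comenius.xml</span>; the links are then kept in another document and point into that file by means of relative URIs with fragment identifiers: <pre class="pre_eg cdata">&lt;linkGrp type="alignment"&gt;
  &lt;!-- comenius.xml is an assumed file name, used for illustration only --&gt;
  &lt;link target="comenius.xml#e9801 comenius.xml#l9801"/&gt;
  &lt;link target="comenius.xml#e9802 comenius.xml#l9802"/&gt;
&lt;/linkGrp&gt;</pre> This sketch still depends on the <span class="att">xml:id</span> values already present in the source file; where none exist, one of the other pointer schemes is needed. 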
Suppose our example had been more lightly tagged, as follows: <div id="index-egXML-d52e122599" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">xml:id</span>="<span class="attributevalue">E98</span>" <span class="attribute">xml:lang</span>="<span class="attributevalue">en</span>"<br /> <span class="attribute">type</span>="<span class="attributevalue">lesson</span>"&gt;</span><br /> <span class="element">&lt;head&gt;</span>The Study<span class="element">&lt;/head&gt;</span><br /> <span class="element">&lt;ab&gt;</span>The Study<span class="element">&lt;/ab&gt;</span><br /> <span class="element">&lt;ab&gt;</span>is a place<span class="element">&lt;/ab&gt;</span><br /> <span class="element">&lt;ab&gt;</span>where a Student,<span class="element">&lt;/ab&gt;</span><br /><span class="element">&lt;/div&gt;</span><br /><span class="element">&lt;div <span class="attribute">xml:id</span>="<span class="attributevalue">L98</span>" <span class="attribute">xml:lang</span>="<span class="attributevalue">la</span>"<br /> <span class="attribute">type</span>="<span class="attributevalue">lesson</span>"&gt;</span><br /> <span class="element">&lt;head&gt;</span>Muséum<span class="element">&lt;/head&gt;</span><br /> <span class="element">&lt;ab&gt;</span>Museum<span class="element">&lt;/ab&gt;</span><br /> <span class="element">&lt;ab&gt;</span>est locus<span class="element">&lt;/ab&gt;</span><br /> <span class="element">&lt;ab&gt;</span>ubi Studiosus,<span class="element">&lt;/ab&gt;</span><br /><span class="element">&lt;/div&gt;</span><div style="float: right;"><a href="BIB.html#SA-BIBL-2">bibliography</a> </div></div></div><div class="p">To express the same alignment mentioned above, we could use an XPath expression to identify the required <a class="gi" title="(anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph." 
href="ref-ab.html">ab</a> elements: <div id="index-egXML-d52e122623" class="pre egXML_valid"><span class="element">&lt;linkGrp <span class="attribute">type</span>="<span class="attributevalue">alignment</span>"&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#xpath(//div[@xml:id='L98']/ab[1]) #xpath(//div[@xml:id='E98']/ab[1])</span>"/&gt;</span><br /> <span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#xpath(//div[@xml:id='L98']/ab[2]) #xpath(//div[@xml:id='E98']/ab[2])</span>"/&gt;</span><br /><span class="element">&lt;/linkGrp&gt;</span></div> In the absence of any markup around individual substrings of the element content, the string-range pointer scheme discussed in <a class="link_ptr" href="SA.html#SATSSR" title="stringrange()"><span class="headingNumber">16.2.4.7 </span>string-range()</a> may also be helpful: for example, to indicate that the words <span class="mentioned">Studies</span> and <span class="mentioned">Studiis</span> correspond, we might express the link between them as follows: <div id="index-egXML-d52e122637" class="pre egXML_valid"><span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#string-range(e9806,16,7) #string-range(l9806,0,7)</span>"/&gt;</span></div></div></div></div><div class="div2" id="SAIE"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SACS"><span class="headingNumber">16.5 </span>Correspondence and Alignment</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SAAG"><span class="headingNumber">16.7 </span>Aggregation</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h3><span class="bookmarklink"><a class="bookmarklink" href="#SAIE" title="link to this section "><span 
class="invisible">TEI: Identical Elements and Virtual Copies</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.6 </span><span class="head">Identical Elements and Virtual Copies</span></h3><p>This section introduces the notion of a <span class="term">virtual element</span>, that is, an element which is not explicitly present in a text, but the presence of which an application can infer from the encoding supplied. In this section, we are concerned with virtual elements made by simply cloning existing elements. In the next section (<a class="link_ptr" href="SA.html#SAAG" title="Aggregation"><span class="headingNumber">16.7 </span>Aggregation</a>), we discuss virtual elements made by aggregating existing elements.</p><p>Provided that explicit elements are available to represent the parts or places to be linked, then the global linking attributes <span class="att">sameAs</span> and <span class="att">copyOf</span> may be used to encode this kind of equivalence: </p><ul class="specList"><li><span class="specList-classSpec"><a href="ref-att.global.linking.html">att.global.linking</a></span> provides a set of attributes for hypertextual linking.<table class="specDesc"><tr><td class="Attribute"><span class="att">sameAs</span></td><td>points to an element that is the same as the current element.</td></tr><tr><td class="Attribute"><span class="att">copyOf</span></td><td>points to an element of which the current element is a copy.</td></tr></table></li></ul><div class="p">It is useful to be able to represent the fact that one element of text is identical to others, for analytical purposes, or (especially if the elements have lengthy content) to obviate the need to repeat the content. For example, consider the repetition of the <a class="gi" title="contains a date in any format." 
href="ref-date.html">date</a> element in the following material: <div id="index-egXML-d52e122665" class="pre egXML_valid"><span class="element">&lt;p&gt;</span>In small clumsy letters he wrote:<br /> <span class="element">&lt;q <span class="attribute">rend</span>="<span class="attributevalue">centered italic</span>"&gt;</span><br />  <span class="element">&lt;date <span class="attribute">xml:id</span>="<span class="attributevalue">d840404</span>"&gt;</span>April 4th,<br />       1984<span class="element">&lt;/date&gt;</span>.<span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;p&gt;</span>He sat back. A sense of complete helplessness had<br />   descended upon him. ...<span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;p&gt;</span>His small but childish handwriting straggled up<br />   and down the page, shedding first its capital letters<br />   and finally even its full stops:<br /> <span class="element">&lt;q <span class="attribute">rend</span>="<span class="attributevalue">italic</span>"&gt;</span><br />  <span class="element">&lt;date&gt;</span>April 4th, 1984<span class="element">&lt;/date&gt;</span>.<br />     Last night to the flicks. ... <span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><div style="float: right;"><a href="BIB.html#SA-eg-03">bibliography</a> </div></div> Suppose now that we wish to encode the fact that the second <a class="gi" title="contains a date in any format." href="ref-date.html">date</a> element above has identical content to the first. The <span class="att">sameAs</span> attribute is provided for this purpose. 
Using it, we could recode the last line of the above example as follows: <div id="index-egXML-d52e122687" class="pre egXML_valid"><span class="element">&lt;date <span class="attribute">sameAs</span>="<span class="attributevalue">#d840404</span>"&gt;</span>April 4th,<br />   1984<span class="element">&lt;/date&gt;</span><br /> Last night to the flicks ... </div></div><p>The <span class="att">sameAs</span> attribute may be used to document the fact that two elements have identical content. It may be regarded as a special kind of link. It should only be attached to an element with identical content to that which it targets, or to one the content of which clearly designates it as a repetition, such as the word <span class="mentioned">repeat</span> or <span class="mentioned">bis</span> in the representation of the chorus of a song, the second time it is to be sung. The relation specified by the <span class="att">sameAs</span> attribute is symmetric: if a chorus is repeated three times and each repetition bears a <span class="att">sameAs</span> attribute indicating the first occurrence of the element concerned, it is implied that each chorus is identical, and there is no need for the first occurrence to specify any of its copies.</p><div class="p">The <span class="att">copyOf</span> attribute is used in a similar way to indicate that the content of the element bearing it is identical to that of another. The difference is that the content is not itself repeated. The effect of this attribute is thus to create a <span class="term">virtual copy</span> of the element indicated. 
Using this attribute, the repeated date in the first example above could be recoded as follows: <div id="index-egXML-d52e122716" class="pre egXML_valid"><span class="element">&lt;date <span class="attribute">rend</span>="<span class="attributevalue">italic</span>" <span class="attribute">copyOf</span>="<span class="attributevalue">#d840404</span>"/&gt;</span></div></div><p>An application program should replace whatever is the actual content of an element bearing a <span class="att">copyOf</span> attribute with the content of the element specified by it. If the content of the element specified includes other elements, these will become embedded within the element bearing the attribute. Care must be taken to ensure that the document is valid both before and after this embedding takes place. If, for example, the element bearing a <span class="att">copyOf</span> attribute requires a mandatory sub-component, then this component must be present (though possibly empty), even though it will be replaced by the content of the targeted element.</p><div class="p">The following example demonstrates how the <span class="att">copyOf</span> attribute may be used in conjunction with the <a class="gi" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." 
href="ref-seg.html">seg</a> element to highlight the differences between almost identical repetitions: <div id="index-egXML-d52e122734" class="pre egXML_valid"><span class="element">&lt;sp&gt;</span><br /> <span class="element">&lt;speaker&gt;</span>Mikado<span class="element">&lt;/speaker&gt;</span><br /> <span class="element">&lt;l&gt;</span>My <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">Mik-L1s</span>"&gt;</span>object all sublime<span class="element">&lt;/seg&gt;</span><span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;l&gt;</span>I shall <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">Mik-L2s</span>"&gt;</span>achieve in time<span class="element">&lt;/seg&gt;</span>—<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">Mik-L3</span>"&gt;</span>To let <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">Mik-L3s</span>"&gt;</span>the punishment fit the crime<span class="element">&lt;/seg&gt;</span>,<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">Mik-l4</span>"&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">copyOf</span>="<span class="attributevalue">#Mik-L3s</span>"/&gt;</span>;<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">Mik-l5</span>"&gt;</span>And make each pris'ner pent<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">Mik-l6</span>"&gt;</span>Unwillingly represent<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;l <span class="attribute">xml:id</span>="<span 
class="attributevalue">Mik-l7</span>"&gt;</span>A source <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">Mik-l7s</span>"&gt;</span>of innocent merriment<span class="element">&lt;/seg&gt;</span>,<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">Mik-l8</span>"&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">copyOf</span>="<span class="attributevalue">#Mik-l7s</span>"/&gt;</span>!<span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;/sp&gt;</span><br /><span class="element">&lt;sp&gt;</span><br /> <span class="element">&lt;speaker&gt;</span>Chorus<span class="element">&lt;/speaker&gt;</span><br /> <span class="element">&lt;l&gt;</span>His <span class="element">&lt;seg <span class="attribute">copyOf</span>="<span class="attributevalue">#Mik-L1s</span>"/&gt;</span><span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;l&gt;</span>He will <span class="element">&lt;seg <span class="attribute">copyOf</span>="<span class="attributevalue">#Mik-L2s</span>"/&gt;</span><span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;l <span class="attribute">copyOf</span>="<span class="attributevalue">#Mik-L3</span>"/&gt;</span><br /> <span class="element">&lt;l <span class="attribute">copyOf</span>="<span class="attributevalue">#Mik-l4</span>"/&gt;</span><br /> <span class="element">&lt;l <span class="attribute">copyOf</span>="<span class="attributevalue">#Mik-l5</span>"/&gt;</span><br /> <span class="element">&lt;l <span class="attribute">copyOf</span>="<span class="attributevalue">#Mik-l6</span>"/&gt;</span><br /> <span class="element">&lt;l <span class="attribute">copyOf</span>="<span class="attributevalue">#Mik-l7</span>"/&gt;</span><br /> <span class="element">&lt;l <span class="attribute">copyOf</span>="<span class="attributevalue">#Mik-l8</span>"/&gt;</span><br 
/><span class="element">&lt;/sp&gt;</span><div style="float: right;"><a href="BIB.html#COVE-eg-284">bibliography</a> </div></div></div><p>For further examples of the use of this attribute, see <a class="link_ptr" href="SA.html#SAAT" title="Alternation"><span class="headingNumber">16.8 </span>Alternation</a> and <a class="link_ptr" href="GD.html#GDAT" title="Another Tree Notation"><span class="headingNumber">19.3 </span>Another Tree Notation</a>.</p></div><div class="div2" id="SAAG"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SAIE"><span class="headingNumber">16.6 </span>Identical Elements and Virtual Copies</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SAAT"><span class="headingNumber">16.8 </span>Alternation</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h3><span class="bookmarklink"><a class="bookmarklink" href="#SAAG" title="link to this section "><span class="invisible">TEI: Aggregation</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.7 </span><span class="head">Aggregation</span></h3><p>Because of the strict hierarchical organization of elements, or for other reasons, it may not always be possible or desirable to include all the parts of a possibly fragmented text segment within a single element. In section <a class="link_ptr" href="SA.html#SAPTIP" title="Intermediate Pointers"><span class="headingNumber">16.1.4 </span>Intermediate Pointers</a> we introduced the notion of an intermediate pointer as a way of pointing to discontinuous segments of this kind. In this section we first describe another way of linking the parts of a discontinuous whole, using a set of linking attributes, which are made available for any tag by following the procedure described at the beginning of this chapter. 
We then describe how the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element may be used to aggregate such segments, and finally introduce the <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> element, which is a special-purpose linking element specifically for representing the aggregation of parts, and the <a class="gi" title="(join group) groups a collection of &lt;join&gt; elements and possibly pointers." href="ref-joinGrp.html">joinGrp</a> for grouping <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> elements.</p><p>The linking attributes for aggregation are <span class="att">next</span> and <span class="att">prev</span>; each of these attributes has a single identifier as its value: </p><ul class="specList"><li><span class="specList-classSpec"><a href="ref-att.global.linking.html">att.global.linking</a></span> provides a set of attributes for hypertextual linking.<table class="specDesc"><tr><td class="Attribute"><span class="att">next</span></td><td>points to the next element of a virtual aggregate of which the current element is part.</td></tr><tr><td class="Attribute"><span class="att">prev</span></td><td>(previous) points to the previous element of a virtual aggregate of which the current element is part.</td></tr></table></li></ul><p>It is recommended that the elements indicated by these attributes be of the same type as the element bearing them.</p><p>The <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." 
href="ref-join.html">join</a> element is also a member of the class of <a class="link_odd" title="provides a set of attributes used by all elements which point to other elements by means of one or more URI references." href="ref-att.pointing.html">att.pointing</a> elements, and so may carry any of the attributes of that class; for the list, see section <a class="link_ptr" href="SA.html#SAPT" title="Links"><span class="headingNumber">16.1 </span>Links</a>.</p><div class="p">Here is the material on which we base our first illustration of the use of these mechanisms. Our problem is to represent the s-units identified below as <span class="val">qs3</span> and <span class="val">qs4</span> as a single (but discontinuous) whole: <div id="index-egXML-d52e122838" class="pre egXML_valid"><span class="element">&lt;q&gt;</span><br /> <span class="element">&lt;s <span class="attribute">xml:id</span>="<span class="attributevalue">qs2</span>"&gt;</span>Monsieur Paul, after he has taken equal<br />     parts of goose breast and the finest pork, and<br />     broken a certain number of egg yolks into them,<br />     and ground them <span class="element">&lt;emph&gt;</span>very<span class="element">&lt;/emph&gt;</span>, very fine,<br />     cooks all with seasoning for some three hours.<span class="element">&lt;/s&gt;</span><br /> <span class="element">&lt;s <span class="attribute">xml:id</span>="<span class="attributevalue">qs3</span>"&gt;</span><br />  <span class="element">&lt;emph&gt;</span>But<span class="element">&lt;/emph&gt;</span>,<span class="element">&lt;/s&gt;</span><br /><span class="element">&lt;/q&gt;</span><br /><span class="element">&lt;s <span class="attribute">xml:id</span>="<span class="attributevalue">ps2</span>"&gt;</span>she pushed her face nearer, and looked with<br />   ferocious gloating at the pâté<br />   inside me, her eyes like X rays,<span class="element">&lt;/s&gt;</span><br /><span class="element">&lt;q&gt;</span><br /> <span class="element">&lt;s 
<span class="attribute">xml:id</span>="<span class="attributevalue">qs4</span>"&gt;</span>he never stops stirring it!<span class="element">&lt;/s&gt;</span><br /> <span class="element">&lt;s <span class="attribute">xml:id</span>="<span class="attributevalue">qs5</span>"&gt;</span>Figure to yourself the work of it —<span class="element">&lt;/s&gt;</span><br /> <span class="element">&lt;s <span class="attribute">xml:id</span>="<span class="attributevalue">qs6</span>"&gt;</span>stir, stir, never stopping!<span class="element">&lt;/s&gt;</span><br /><span class="element">&lt;/q&gt;</span><div style="float: right;"><a href="BIB.html#SAAG-eg-72">bibliography</a> </div></div> </div><div class="p">Using the <span class="att">prev</span> and <span class="att">next</span> attributes, we can link the s-units with identifiers <span class="val">qs3</span> and <span class="val">qs4</span>, either singly or doubly as follows: <pre class="pre_eg cdata">  &lt;s xml:id="qs3" next="#qs4"&gt;&lt;emph&gt;But&lt;/emph&gt;,&lt;/s&gt;
  &lt;s xml:id="qs4"&gt;he never stops stirring it!&lt;/s&gt;</pre> <pre class="pre_eg cdata">  &lt;s xml:id="qs3"&gt;&lt;emph&gt;But&lt;/emph&gt;,&lt;/s&gt;
  &lt;s xml:id="qs4" prev="#qs3"&gt;he never stops stirring it!&lt;/s&gt;</pre> <pre class="pre_eg cdata">  &lt;s xml:id="qs3" next="#qs4"&gt;&lt;emph&gt;But&lt;/emph&gt;,&lt;/s&gt;
  &lt;s xml:id="qs4" prev="#qs3"&gt;he never stops stirring it!&lt;/s&gt;</pre> Double linking of the two s-units, as illustrated by the last of these encodings, is equivalent to specifying a <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element: <div id="index-egXML-d52e122888" class="pre egXML_valid"><span class="element">&lt;link <span class="attribute">type</span>="<span class="attributevalue">join</span>" <span class="attribute">target</span>="<span class="attributevalue">#qs3 #qs4</span>"/&gt;</span></div></div><p>Such a <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element must carry a <span class="att">type</span> attribute with a value of <span class="val">join</span> to specify that the link is to be understood as joining its targets into a single aggregate.</p><div class="p">The <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> element is equivalent to a <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element of type <span class="val">join</span>.  Unlike the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element, the <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." 
href="ref-join.html">join</a> element can additionally specify information about the virtual element which it represents, by means of its <span class="att">result</span> attribute. And finally, unlike the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element, the position of a <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> element within a text is significant: it must be supplied at a position where the element indicated by its <span class="att">result</span> attribute would be contextually legal. <ul class="specList"><li><span class="specList-elementSpec"><a href="ref-join.html">join</a></span> identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it.<table class="specDesc"><tr><td class="Attribute"><span class="att">result</span></td><td>specifies the name of an element which this aggregation may be understood to represent.</td></tr></table></li><li><span class="specList-elementSpec"><a href="ref-joinGrp.html">joinGrp</a></span> (join group) groups a collection of <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> elements and possibly pointers.<table class="specDesc"><tr><td class="Attribute"><span class="att">result</span></td><td>supplies the default value for the <span class="att">result</span> on each <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." 
href="ref-join.html">join</a> included within the group.</td></tr></table></li></ul> To conclude the above example, we now use a <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> element to represent the virtual sentence formed by the aggregation of <span class="val">qs3</span> and <span class="val">qs4</span>: <div id="index-egXML-d52e122947" class="pre egXML_valid"><span class="element">&lt;join <span class="attribute">target</span>="<span class="attributevalue">#qs3 #qs4</span>" <span class="attribute">result</span>="<span class="attributevalue">s</span>"/&gt;</span></div> As a further example, consider the following list of authors' names. The object of the <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> element here is to provide another list, composed of those authors from the larger list who happen to come from Heidelberg: <div id="index-egXML-d52e122953" class="pre egXML_valid"><span class="element">&lt;list&gt;</span><br /> <span class="element">&lt;head&gt;</span>Authors<span class="element">&lt;/head&gt;</span><br /> <span class="element">&lt;item <span class="attribute">xml:id</span>="<span class="attributevalue">a_uf</span>"&gt;</span>Figge, Udo <span class="element">&lt;/item&gt;</span><br /> <span class="element">&lt;item <span class="attribute">xml:id</span>="<span class="attributevalue">a_ch</span>"&gt;</span>Heibach, Christiane <span class="element">&lt;/item&gt;</span><br /> <span class="element">&lt;item <span class="attribute">xml:id</span>="<span class="attributevalue">a_gh</span>"&gt;</span>Heyer, Gerhard <span class="element">&lt;/item&gt;</span><br /> <span class="element">&lt;item <span class="attribute">xml:id</span>="<span class="attributevalue">a_bp</span>"&gt;</span>Philipp, Bettina <span 
class="element">&lt;/item&gt;</span><br /> <span class="element">&lt;item <span class="attribute">xml:id</span>="<span class="attributevalue">a_ms</span>"&gt;</span>Samiec, Monika <span class="element">&lt;/item&gt;</span><br /> <span class="element">&lt;item <span class="attribute">xml:id</span>="<span class="attributevalue">a_ss</span>"&gt;</span>Schierholz, Stefan <span class="element">&lt;/item&gt;</span><br /><span class="element">&lt;/list&gt;</span><br /><span class="element">&lt;join <span class="attribute">target</span>="<span class="attributevalue">#a_ch #a_bp #a_ss</span>"<br /> <span class="attribute">result</span>="<span class="attributevalue">list</span>"&gt;</span><br /> <span class="element">&lt;desc&gt;</span>Authors from Heidelberg<span class="element">&lt;/desc&gt;</span><br /><span class="element">&lt;/join&gt;</span></div></div><p>The following example shows how <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> can be used to reconstruct a text cited in fragments presented out of order. 
The poem being remembered (an unusual translation of a well-known poem by Basho) runs <span class="q">‘When the old pond / gets a new frog, / it's a new pond.’</span></p><div id="index-egXML-d52e122979" class="pre egXML_valid"><span class="element">&lt;sp&gt;</span><br /> <span class="element">&lt;speaker&gt;</span>Hughie<span class="element">&lt;/speaker&gt;</span><br /> <span class="element">&lt;p&gt;</span>How does it go?<br />  <span class="element">&lt;q&gt;</span><br />   <span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">frog-x1</span>"&gt;</span>da-da-da<span class="element">&lt;/l&gt;</span><br />   <span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">frog-L2</span>"&gt;</span>gets a new frog<span class="element">&lt;/l&gt;</span><br />   <span class="element">&lt;l&gt;</span>...<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;/sp&gt;</span><br /><span class="element">&lt;sp&gt;</span><br /> <span class="element">&lt;speaker&gt;</span>Louie<span class="element">&lt;/speaker&gt;</span><br /> <span class="element">&lt;p&gt;</span><br />  <span class="element">&lt;q&gt;</span><br />   <span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">frog-L1</span>"&gt;</span>When the old pond<span class="element">&lt;/l&gt;</span><br />   <span class="element">&lt;l&gt;</span>...<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;/q&gt;</span><br /> <span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;/sp&gt;</span><br /><span class="element">&lt;sp&gt;</span><br /> <span class="element">&lt;speaker&gt;</span>Dewey<span class="element">&lt;/speaker&gt;</span><br /> <span class="element">&lt;p&gt;</span><br />  <span class="element">&lt;q&gt;</span>...<br />   <span class="element">&lt;l <span 
class="attribute">xml:id</span>="<span class="attributevalue">frog-L3</span>"&gt;</span>It's a new pond.<span class="element">&lt;/l&gt;</span><span class="element">&lt;/q&gt;</span><br /> <span class="element">&lt;/p&gt;</span><br /> <span class="element">&lt;join <span class="attribute">target</span>="<span class="attributevalue">#frog-L1 #frog-L2 #frog-L3</span>"<br />  <span class="attribute">result</span>="<span class="attributevalue">lg</span>" <span class="attribute">scope</span>="<span class="attributevalue">root</span>"/&gt;</span><br /><span class="element">&lt;/sp&gt;</span></div><div class="p">As with other forms of link, a grouping element <a class="gi" title="(join group) groups a collection of &lt;join&gt; elements and possibly pointers." href="ref-joinGrp.html">joinGrp</a> is available for use when a number of <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> elements of the same kind co-occur. This avoids the need to specify the <span class="att">result</span> attribute for each <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> if they are all of the same type, and also allows us to restrict the domain within which their target elements are to be found, in the same way as for <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> elements (see <a class="link_ptr" href="SA.html#SAPTLG" title="Groups of Links"><span class="headingNumber">16.1.3 </span>Groups of Links</a>). Like a <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." 
href="ref-join.html">join</a>, a <a class="gi" title="(join group) groups a collection of &lt;join&gt; elements and possibly pointers." href="ref-joinGrp.html">joinGrp</a> may appear only where the elements represented by its contents are legal. Thus if we had created many <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> elements of the sort just described, we could group them together, and require that their components all be contained by an element with the identifier <span class="val">mfkfhungry</span> as follows: <div id="index-egXML-d52e123043" class="pre egXML_valid"><span class="element">&lt;joinGrp <span class="attribute">domains</span>="<span class="attributevalue">#mfkfhungry #mfkfhungry</span>"<br /> <span class="attribute">result</span>="<span class="attributevalue">s</span>"&gt;</span><br /> <span class="element">&lt;join <span class="attribute">target</span>="<span class="attributevalue">#qs3 #qs4</span>"/&gt;</span><br /> <span class="element">&lt;join <span class="attribute">target</span>="<span class="attributevalue">#qs5 #qs6</span>"/&gt;</span><br /><span class="element">&lt;/joinGrp&gt;</span></div></div><p>The <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> element is useful as a means of representing non-hierarchic structures (as further discussed in chapter <a class="link_ptr" href="NH.html" title="31"><span class="headingNumber">20 </span>Non-hierarchical Structures</a>). It may also be used as a convenient way of representing a variety of analytic units, like the <a class="gi" title="associates an interpretative annotation directly with a span of text." 
href="ref-span.html">span</a> and <a class="gi" title="(interpretation) summarizes a specific interpretative annotation which can be linked to a span of text." href="ref-interp.html">interp</a> elements discussed in chapter <a class="link_ptr" href="AI.html" title="15"><span class="headingNumber">17 </span>Simple Analytic Mechanisms</a>. As an example, consider the following famous Zen koan: </p><div class="q"><p>Zui-Gan called out to himself every day, <span class="q">‘Master.’</span></p> <p>Then he answered himself, <span class="q">‘Yes, sir.’</span></p> <p>And then he added, <span class="q">‘Become sober.’</span></p> <p>Again he answered, <span class="q">‘Yes, sir.’</span></p> <p><span class="q">‘And after that,’</span> he continued, <span class="q">‘do not be deceived by others.’</span></p> <p><span class="q">‘Yes, sir; yes, sir,’</span> he replied.</p></div><div class="p">Suppose now that we wish to represent an interpretation of the above passage in which we distinguish between the various ‘voices’ adopted by Zui-Gan. In the following encoding, the <span class="att">who</span> attribute has been used for this purpose; its value on each occasion supplies a pointer to the <span class="q">‘voice’</span> to which each speech is attributed. (For convenience in this example, we use simply the first occurrence of the names used for each voice as the target for these pointers.) 
Note also that we add <span class="att">xml:id</span> attributes to each distinct speech fragment, which we can then use to link the material spoken by each voice: <div id="index-egXML-d52e123112" class="pre egXML_valid"><span class="element">&lt;text <span class="attribute">xml:id</span>="<span class="attributevalue">zuitxt</span>"&gt;</span><br /> <span class="element">&lt;body&gt;</span><br />  <span class="element">&lt;p&gt;</span><br />   <span class="element">&lt;name <span class="attribute">xml:id</span>="<span class="attributevalue">zuigan</span>"&gt;</span>Zui-Gan<span class="element">&lt;/name&gt;</span> called out to himself every day,<br />   <span class="element">&lt;q <span class="attribute">next</span>="<span class="attributevalue">#zuiq2</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">zuiq1</span>"<br />    <span class="attribute">who</span>="<span class="attributevalue">#zuigan</span>"&gt;</span><br />    <span class="element">&lt;name <span class="attribute">xml:id</span>="<span class="attributevalue">master</span>"&gt;</span>Master<span class="element">&lt;/name&gt;</span>.<span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><br />  <span class="element">&lt;p&gt;</span>Then he answered himself,<br />   <span class="element">&lt;q <span class="attribute">next</span>="<span class="attributevalue">#zuiq4</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">zuiq2</span>"<br />    <span class="attribute">who</span>="<span class="attributevalue">#zuigan</span>"&gt;</span>Yes, sir.<span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><br />  <span class="element">&lt;p&gt;</span>And then he added,<br />   <span class="element">&lt;q <span class="attribute">next</span>="<span class="attributevalue">#zuiq5</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">zuiq3</span>"<br />    <span class="attribute">who</span>="<span 
class="attributevalue">#master</span>"&gt;</span>Become sober.<span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><br />  <span class="element">&lt;p&gt;</span>Again he answered,<br />   <span class="element">&lt;q <span class="attribute">next</span>="<span class="attributevalue">#zuiq7</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">zuiq4</span>"<br />    <span class="attribute">who</span>="<span class="attributevalue">#zuigan</span>"&gt;</span>Yes, sir.<span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><br />  <span class="element">&lt;p&gt;</span><br />   <span class="element">&lt;q <span class="attribute">next</span>="<span class="attributevalue">#zuiq6</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">zuiq5</span>"<br />    <span class="attribute">who</span>="<span class="attributevalue">#master</span>"&gt;</span>And after that,<span class="element">&lt;/q&gt;</span><br />       he continued,<br />   <span class="element">&lt;q <span class="attribute">xml:id</span>="<span class="attributevalue">zuiq6</span>" <span class="attribute">who</span>="<span class="attributevalue">#master</span>"&gt;</span>do not be deceived by others.<span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><br />  <span class="element">&lt;p&gt;</span><br />   <span class="element">&lt;q <span class="attribute">xml:id</span>="<span class="attributevalue">zuiq7</span>" <span class="attribute">who</span>="<span class="attributevalue">#zuigan</span>"&gt;</span>Yes, sir; yes, sir,<span class="element">&lt;/q&gt;</span><br />       he replied.<span class="element">&lt;/p&gt;</span><br /> <span class="element">&lt;/body&gt;</span><br /><span class="element">&lt;/text&gt;</span><div style="float: right;"><a href="BIB.html#SA-eg-04">bibliography</a> </div></div></div><div class="p">However, by using the <a class="gi" title="identifies a possibly fragmented segment of 
text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> element, we can directly represent the complete speech attributed to each voice: <div id="index-egXML-d52e123150" class="pre egXML_valid"><span class="element">&lt;joinGrp <span class="attribute">result</span>="<span class="attributevalue">q</span>"&gt;</span><br /> <span class="element">&lt;join <span class="attribute">target</span>="<span class="attributevalue">#zuiq1 #zuiq2 #zuiq4 #zuiq7</span>"&gt;</span><br />  <span class="element">&lt;desc&gt;</span>what Zui-Gan said<span class="element">&lt;/desc&gt;</span><br /> <span class="element">&lt;/join&gt;</span><br /> <span class="element">&lt;join <span class="attribute">target</span>="<span class="attributevalue">#zuiq3 #zuiq5 #zuiq6</span>"&gt;</span><br />  <span class="element">&lt;desc&gt;</span>what Master said<span class="element">&lt;/desc&gt;</span><br /> <span class="element">&lt;/join&gt;</span><br /><span class="element">&lt;/joinGrp&gt;</span></div></div><p>Note the use of the <a class="gi" title="(description) contains a brief description of the object documented by its parent element, typically a documentation element or an entity." href="ref-desc.html">desc</a> child element within the two <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a>s making up the <a class="gi" title="(join group) groups a collection of &lt;join&gt; elements and possibly pointers." href="ref-joinGrp.html">joinGrp</a> element here. 
These enable us to document the speakers of the two virtual <a class="gi" title="(quoted) contains material which is distinguished from the surrounding text using quotation marks or a similar method, for any one of a variety of reasons including, but not limited to: direct speech or thought, technical terms or jargon, authorial distance, quotations from elsewhere, and passages that are mentioned but not used." href="ref-q.html">q</a> elements represented by the <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> elements; this is necessary because there is no way of specifying the attributes to be associated with a virtual element and, in particular, no way to specify a <span class="att">who</span> value for it.</p><div class="p">Suppose now that <span class="att">xml:id</span> attributes, for whatever reason, are not available on the speech fragments. Then <a class="gi" title="(pointer) defines a pointer to another location." href="ref-ptr.html">ptr</a> elements pointing to the fragments may be created using any of the methods described in section <a class="link_ptr" href="SA.html#SATS" title="TEI XPointer Schemes"><span class="headingNumber">16.2.4 </span>TEI XPointer Schemes</a>. The <span class="att">xml:id</span> values of <em>these</em> elements may then be referenced by the <span class="att">target</span> attribute on the <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> elements. 
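As a reduced sketch of this indirection (the identifiers <span class="val">ptrA</span> and <span class="val">ptrB</span> and the paths shown are hypothetical, chosen only for illustration), each <a class="gi" title="(pointer) defines a pointer to another location." href="ref-ptr.html">ptr</a> locates one fragment by position, and the <a class="gi" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." href="ref-join.html">join</a> then addresses the pointers rather than the fragments themselves; the <span class="att">evaluate</span> value <span class="val">one</span> is assumed here to request that each pointer be followed one step to its own target: <pre class="pre_eg cdata">  &lt;ptr xml:id="ptrA" target="./#xpath(//div1[1]/p[1]/q[1])"/&gt;
  &lt;ptr xml:id="ptrB" target="./#xpath(//div1[1]/p[3]/q[1])"/&gt;
  &lt;join evaluate="one" result="q" target="#ptrA #ptrB"/&gt;</pre> The full encoding of the koan applies the same pattern: 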
<div id="index-egXML-d52e123202" class="pre egXML_valid"><span class="element">&lt;text&gt;</span><br /> <span class="element">&lt;body&gt;</span><br /><span class="comment">&lt;!-- five div1 elements --&gt;</span><br />  <span class="element">&lt;div1&gt;</span><br />   <span class="element">&lt;p&gt;</span>Zui-Gan called out to himself every day, <span class="element">&lt;q&gt;</span>Master.<span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><br />   <span class="element">&lt;p&gt;</span>Then he answered himself, <span class="element">&lt;q&gt;</span>Yes, sir.<span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><br />   <span class="element">&lt;p&gt;</span>And then he added, <span class="element">&lt;q&gt;</span>Become sober.<span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><br />   <span class="element">&lt;p&gt;</span>Again he answered, <span class="element">&lt;q&gt;</span>Yes, sir.<span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><br />   <span class="element">&lt;p&gt;</span><br />    <span class="element">&lt;q&gt;</span>And after that,<span class="element">&lt;/q&gt;</span> he continued, <span class="element">&lt;q&gt;</span>do not be deceived by others.<span class="element">&lt;/q&gt;</span><span class="element">&lt;/p&gt;</span><br />   <span class="element">&lt;p&gt;</span><br />    <span class="element">&lt;q&gt;</span>Yes, sir; yes, sir,<span class="element">&lt;/q&gt;</span> he replied.<span class="element">&lt;/p&gt;</span><br />   <span class="element">&lt;ab <span class="attribute">type</span>="<span class="attributevalue">aggregation</span>"&gt;</span><br />    <span class="element">&lt;ptr <span class="attribute">xml:id</span>="<span class="attributevalue">rzuiq1</span>"<br />     <span class="attribute">target</span>="<span class="attributevalue">./#xpath(//div1[6]/p[1]/q[1])</span>"/&gt;</span><br />    <span class="element">&lt;ptr 
<span class="attribute">xml:id</span>="<span class="attributevalue">rzuiq2</span>"<br />     <span class="attribute">target</span>="<span class="attributevalue">./#xpath(//div1[6]/p[2]/q[1])</span>"/&gt;</span><br />    <span class="element">&lt;ptr <span class="attribute">xml:id</span>="<span class="attributevalue">rzuiq3</span>"<br />     <span class="attribute">target</span>="<span class="attributevalue">./#xpath(//div1[6]/p[3]/q[1])</span>"/&gt;</span><br />    <span class="element">&lt;ptr <span class="attribute">xml:id</span>="<span class="attributevalue">rzuiq4</span>"<br />     <span class="attribute">target</span>="<span class="attributevalue">./#xpath(//div1[6]/p[4]/q[1])</span>"/&gt;</span><br />    <span class="element">&lt;ptr <span class="attribute">xml:id</span>="<span class="attributevalue">rzuiq5</span>"<br />     <span class="attribute">target</span>="<span class="attributevalue">./#xpath(//div1[6]/p[5]/q[1])</span>"/&gt;</span><br />    <span class="element">&lt;ptr <span class="attribute">xml:id</span>="<span class="attributevalue">rzuiq6</span>"<br />     <span class="attribute">target</span>="<span class="attributevalue">./#xpath(//div1[6]/p[5]/q[2])</span>"/&gt;</span><br />    <span class="element">&lt;ptr <span class="attribute">xml:id</span>="<span class="attributevalue">rzuiq7</span>"<br />     <span class="attribute">target</span>="<span class="attributevalue">./#xpath(//div1[6]/p[6]/q[1])</span>"/&gt;</span><br />    <span class="element">&lt;joinGrp <span class="attribute">evaluate</span>="<span class="attributevalue">one</span>" <span class="attribute">result</span>="<span class="attributevalue">q</span>"&gt;</span><br />     <span class="element">&lt;join <span class="attribute">target</span>="<span class="attributevalue">#rzuiq1 #rzuiq2 #rzuiq4 #rzuiq7</span>"&gt;</span><br />      <span class="element">&lt;desc&gt;</span>what Zui-Gan said<span class="element">&lt;/desc&gt;</span><br />     <span 
class="element">&lt;/join&gt;</span><br />     <span class="element">&lt;join <span class="attribute">target</span>="<span class="attributevalue">#rzuiq3 #rzuiq5 #rzuiq6</span>"&gt;</span><br />      <span class="element">&lt;desc&gt;</span>what Master said<span class="element">&lt;/desc&gt;</span><br />     <span class="element">&lt;/join&gt;</span><br />    <span class="element">&lt;/joinGrp&gt;</span><br />   <span class="element">&lt;/ab&gt;</span><br />  <span class="element">&lt;/div1&gt;</span><br /> <span class="element">&lt;/body&gt;</span><br /><span class="element">&lt;/text&gt;</span><div style="float: right;"><a href="BIB.html#SA-eg-04">bibliography</a> </div></div></div><p>The pointer with identifier <span class="val">rzuiq2</span>, for example, may be read as <span class="q">‘the first <a class="gi" title="(quoted) contains material which is distinguished from the surrounding text using quotation marks or a similar method, for any one of a variety of reasons including, but not limited to: direct speech or thought, technical terms or jargon, authorial distance, quotations from elsewhere, and passages that are mentioned but not used." href="ref-q.html">q</a> in the second <a class="gi" title="(paragraph) marks paragraphs in prose." href="ref-p.html">p</a>, within the sixth <a class="gi" title="(level-1 text division) contains a first-level subdivision of the front, body, or back of a text." 
href="ref-div1.html">div1</a> element of the current document.’</span></p></div><div class="div2" id="SAAT"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SAAG"><span class="headingNumber">16.7 </span>Aggregation</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SASO"><span class="headingNumber">16.9 </span>Stand-off Markup</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h3><span class="bookmarklink"><a class="bookmarklink" href="#SAAT" title="link to this section "><span class="invisible">TEI: Alternation</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.8 </span><span class="head">Alternation</span></h3><p>This section proposes elements for the representation of alternation. We say that two or more elements are in <span class="term">exclusive alternation</span> if any of those elements could be present in a text, but one and only one of them is; in addition, we say that those elements are <span class="term">mutually exclusive</span>. We say that the elements are in <span class="term">inclusive alternation</span> if at least one (and possibly more) of them is present. The elements that are in alternation may also be called <span class="term">alternants</span>.</p><p>The need to mark exclusive alternation arises frequently in text encoding. A common situation is one in which it can be determined that exactly one of several different words appears in a given location, but it cannot be determined which one. One way to mark such an exclusive alternation is to use the linking attribute <span class="att">exclude</span>. Having marked an exclusive alternation, it can sometimes later be determined which of the alternants actually appears in the given location. 
To preserve the fact that an alternation was posited, one can add the linking attribute <span class="att">select</span> to an element that hierarchically encompasses the alternants, pointing it at the alternant that actually appears. To assign responsibility and degree of certainty to the choice, one can use the <a class="gi" title="indicates the degree of certainty associated with some aspect of the text markup." href="ref-certainty.html">certainty</a> tag described in chapter <a class="link_ptr" href="CE.html" title="17"><span class="headingNumber">21 </span>Certainty, Precision, and Responsibility</a>. See that chapter also for further discussion of certainty in general.</p><p>The <span class="att">exclude</span> and <span class="att">select</span> attributes may be used with any element, provided that they have been declared following the procedure discussed in the introduction to this chapter. </p><ul class="specList"><li><span class="specList-classSpec"><a href="ref-att.global.linking.html">att.global.linking</a></span> provides a set of attributes for hypertextual linking.<table class="specDesc"><tr><td class="Attribute"><span class="att">exclude</span></td><td>points to elements that are in exclusive alternation with the current element.</td></tr><tr><td class="Attribute"><span class="att">select</span></td><td>selects one or more alternants; if one alternant is selected, the ambiguity or uncertainty is marked as resolved. If more than one alternant is selected, the degree of ambiguity or uncertainty is marked as reduced by the number of alternants not selected.</td></tr></table></li></ul><p>A more general way to mark alternation, encompassing both exclusive and inclusive alternation, is to use the linking element <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a>. 
The description and attributes of this tag and of the associated grouping tag <a class="gi" title="(alternation group) groups a collection of &lt;alt&gt; elements and possibly pointers." href="ref-altGrp.html">altGrp</a> are as follows. These elements are also members of the <a class="link_odd" title="provides a set of attributes used by all elements which point to other elements by means of one or more URI references." href="ref-att.pointing.html">att.pointing</a> class and therefore have all the attributes associated with that class. </p><ul class="specList"><li><span class="specList-elementSpec"><a href="ref-alt.html">alt</a></span> (alternation) identifies an alternation or a set of choices among elements or passages.<table class="specDesc"><tr><td class="Attribute"><span class="att">weights</span></td><td>If <span class="att">mode</span> is <code>excl</code>, each weight states the probability that the corresponding alternative occurs. If <span class="att">mode</span> is <span class="val">incl</span> each weight states the probability that the corresponding alternative occurs given that at least one of the other alternatives occurs.</td></tr></table></li><li><span class="specList-elementSpec"><a href="ref-altGrp.html">altGrp</a></span> (alternation group) groups a collection of <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> elements and possibly pointers.</li></ul><div class="p">To take a simple hypothetical example, suppose in transcribing a spoken text, we encounter an utterance that we can understand either as <span class="mentioned">We had fun at the beach today.</span> or as <span class="mentioned">We had sun at the beach today.</span> We can represent the exclusive alternation of these two possibilities by means of the <span class="att">exclude</span> attribute as follows. 
<div id="index-egXML-d52e123835" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">type</span>="<span class="attributevalue">interview</span>"&gt;</span><br /> <span class="element">&lt;u <span class="attribute">exclude</span>="<span class="attributevalue">#we.sun1</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">we.fun1</span>"&gt;</span>We had fun at the beach today.<span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;u <span class="attribute">exclude</span>="<span class="attributevalue">#we.fun1</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">we.sun1</span>"&gt;</span>We had sun at the beach today.<span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;/div&gt;</span></div></div><div class="p">If it is then determined that the speaker said <span class="mentioned">fun</span>, not <span class="mentioned">sun</span>, the encoder could amend the text by deleting the alternant containing <span class="mentioned">sun</span> and the <span class="att">exclude</span> attribute on the remaining alternant. Alternatively, the encoder could preserve the fact that there was uncertainty in the original transcription by retaining the alternants and assigning the value <span class="val">#we.fun2</span> to the <span class="att">select</span> attribute on the <a class="gi" title="(text division) contains a subdivision of the front, body, or back of a text." 
href="ref-div.html">div</a> element that encompasses the alternants, as in: <div id="index-egXML-d52e123865" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">select</span>="<span class="attributevalue">#we.fun2</span>" <span class="attribute">type</span>="<span class="attributevalue">interview</span>"&gt;</span><br /> <span class="element">&lt;u <span class="attribute">exclude</span>="<span class="attributevalue">#we.sun2</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">we.fun2</span>"&gt;</span>We had fun at the beach<br />     today.<span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;u <span class="attribute">exclude</span>="<span class="attributevalue">#we.fun2</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">we.sun2</span>"&gt;</span>We had sun at the beach today.<span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;/div&gt;</span></div></div><div class="p">The above alternation (including the <span class="att">select</span> attribute) could be recoded by assigning the <span class="att">exclude</span> attributes to tags that enclose just the words or even the characters that are mutually exclusive, as in:<span id="Note102_return"><a class="notelink" title="See section for discussion of the w and c tags that can be used in the following examples instead of the seg type=&#34;word&#34; and seg type=&#34;character&#34; tag…" href="#Note102"><sup>65</sup></a></span> <div id="index-egXML-d52e123896" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">type</span>="<span class="attributevalue">interview</span>"&gt;</span><br /> <span class="element">&lt;u <span class="attribute">select</span>="<span class="attributevalue">#fun3</span>"&gt;</span>We had<br />  <span class="element">&lt;seg <span class="attribute">exclude</span>="<span class="attributevalue">#sun3</span>" <span class="attribute">xml:id</span>="<span 
class="attributevalue">fun3</span>"<br />   <span class="attribute">type</span>="<span class="attributevalue">word</span>"&gt;</span>fun<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">exclude</span>="<span class="attributevalue">#fun3</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">sun3</span>"<br />   <span class="attribute">type</span>="<span class="attributevalue">word</span>"&gt;</span>sun<span class="element">&lt;/seg&gt;</span><br />     at the beach today.<span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;/div&gt;</span></div> <div id="index-egXML-d52e123906" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">type</span>="<span class="attributevalue">interview</span>"&gt;</span><br /> <span class="element">&lt;u&gt;</span>We had<br />  <span class="element">&lt;seg <span class="attribute">select</span>="<span class="attributevalue">#id-f</span>" <span class="attribute">type</span>="<span class="attributevalue">word</span>"&gt;</span><br />   <span class="element">&lt;seg <span class="attribute">exclude</span>="<span class="attributevalue">#id-s</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">id-f</span>"<br />    <span class="attribute">type</span>="<span class="attributevalue">character</span>"&gt;</span>f<span class="element">&lt;/seg&gt;</span><br />   <span class="element">&lt;seg <span class="attribute">exclude</span>="<span class="attributevalue">#id-f</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">id-s</span>"<br />    <span class="attribute">type</span>="<span class="attributevalue">character</span>"&gt;</span>s<span class="element">&lt;/seg&gt;</span><br />       un<span class="element">&lt;/seg&gt;</span><br />     at the beach today.<span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;/div&gt;</span></div></div><div class="p">Now suppose that the 
transcriber is uncertain whether the first word in the utterance is <span class="mentioned">We</span> or <span class="mentioned">Lee</span>, but is certain that if it is <span class="mentioned">Lee</span>, then the other uncertain word is definitely <span class="mentioned">fun</span> and not <span class="mentioned">sun</span>. The three utterances that are in mutual exclusion can be encoded as follows. <div id="index-egXML-d52e123935" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">type</span>="<span class="attributevalue">interview</span>"&gt;</span><br /><span class="comment">&lt;!-- ... --&gt;</span><br /> <span class="element">&lt;u <span class="attribute">exclude</span>="<span class="attributevalue">#we.sun4 #lee.fun4</span>"<br />  <span class="attribute">xml:id</span>="<span class="attributevalue">we.fun4</span>"&gt;</span>We had fun at the beach today.<span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;u <span class="attribute">exclude</span>="<span class="attributevalue">#we.fun4 #lee.fun4</span>"<br />  <span class="attribute">xml:id</span>="<span class="attributevalue">we.sun4</span>"&gt;</span>We had sun at the beach today.<span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;u <span class="attribute">exclude</span>="<span class="attributevalue">#we.fun4 #we.sun4</span>"<br />  <span class="attribute">xml:id</span>="<span class="attributevalue">lee.fun4</span>"&gt;</span>Lee had fun at the beach today.<span class="element">&lt;/u&gt;</span><br /><span class="comment">&lt;!-- ... 
--&gt;</span><br /><span class="element">&lt;/div&gt;</span></div></div><div class="p">The preceding example can also be encoded with <span class="att">exclude</span> attributes on the word segments <span class="mentioned">We</span>, <span class="mentioned">Lee</span>, <span class="mentioned">fun</span>, and <span class="mentioned">sun</span>: <div id="index-egXML-d52e123963" class="pre egXML_valid"><span class="element">&lt;u&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">exclude</span>="<span class="attributevalue">#lee</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">we</span>" <span class="attribute">type</span>="<span class="attributevalue">word</span>"&gt;</span>We<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">exclude</span>="<span class="attributevalue">#we #sun</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">lee</span>"<br />  <span class="attribute">type</span>="<span class="attributevalue">word</span>"&gt;</span>Lee<span class="element">&lt;/seg&gt;</span><br />   had<br /> <span class="element">&lt;seg <span class="attribute">exclude</span>="<span class="attributevalue">#sun</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">fun</span>"<br />  <span class="attribute">type</span>="<span class="attributevalue">word</span>"&gt;</span>fun<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">exclude</span>="<span class="attributevalue">#fun #lee</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">sun</span>"<br />  <span class="attribute">type</span>="<span class="attributevalue">word</span>"&gt;</span>sun<span class="element">&lt;/seg&gt;</span><br />   at the beach today.<span class="element">&lt;/u&gt;</span></div></div><div class="p">The value of the <span class="att">select</span> attribute is defined as a list of identifiers; 
hence it can also be used to narrow down the range of alternants, as in: <div id="index-egXML-d52e123980" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">select</span>="<span class="attributevalue">#we.fun5 #lee.fun5</span>"<br /> <span class="attribute">type</span>="<span class="attributevalue">interview</span>"&gt;</span><br /> <span class="element">&lt;u <span class="attribute">exclude</span>="<span class="attributevalue">#we.sun5 #lee.fun5</span>"<br />  <span class="attribute">xml:id</span>="<span class="attributevalue">we.fun5</span>"&gt;</span>We had fun at the beach today.<span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;u <span class="attribute">exclude</span>="<span class="attributevalue">#we.fun5 #lee.fun5</span>"<br />  <span class="attribute">xml:id</span>="<span class="attributevalue">we.sun5</span>"&gt;</span>We had sun at the beach today.<span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;u <span class="attribute">exclude</span>="<span class="attributevalue">#we.fun5 #we.sun5</span>"<br />  <span class="attribute">xml:id</span>="<span class="attributevalue">lee.fun5</span>"&gt;</span>Lee had fun at the beach today.<span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;/div&gt;</span></div> This is interpreted to mean that either the first or the third <a class="gi" title="(utterance) contains a stretch of speech usually preceded and followed by silence or by a change of speaker." 
href="ref-u.html">u</a> element tag appears, and is thus equivalent to just the alternation of those two tags: <div id="index-egXML-d52e123992" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">type</span>="<span class="attributevalue">interview</span>"&gt;</span><br /> <span class="element">&lt;u <span class="attribute">exclude</span>="<span class="attributevalue">#lee.fun6</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">we.fun6</span>"&gt;</span>We had fun at the beach<br />     today.<span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;u <span class="attribute">exclude</span>="<span class="attributevalue">#we.fun6</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">lee.fun6</span>"&gt;</span>Lee had fun at the beach today.<span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;/div&gt;</span></div></div><div class="p">The <span class="att">exclude</span> attribute can also be used in case there is uncertainty about the tag that appears in a certain position. For example, the occurrence of the word <span class="mentioned">May</span> in the s-unit <span class="mentioned">Let's go to May</span> can be interpreted, in the absence of other information, either as a person's name or as a date. The uncertainty can be rendered as follows, using the <span class="att">exclude</span> attribute. 
<div id="index-egXML-d52e124013" class="pre egXML_valid"><span class="element">&lt;s&gt;</span>Let's go to<br /> <span class="element">&lt;name <span class="attribute">exclude</span>="<span class="attributevalue">#mayn</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">mayd</span>"&gt;</span>May<span class="element">&lt;/name&gt;</span><br /> <span class="element">&lt;date <span class="attribute">copyOf</span>="<span class="attributevalue">#mayd</span>" <span class="attribute">exclude</span>="<span class="attributevalue">#mayd</span>"<br />  <span class="attribute">xml:id</span>="<span class="attributevalue">mayn</span>"/&gt;</span>.<span class="element">&lt;/s&gt;</span></div></div><p>Note the use of the <span class="att">copyOf</span> attribute discussed in section <a class="link_ptr" href="SA.html#SAIE" title="Identical Elements and Virtual Copies"><span class="headingNumber">16.6 </span>Identical Elements and Virtual Copies</a>; this avoids having to repeat the content of the element whose correct tagging is in doubt.</p><div class="p">The <span class="att">copyOf</span> and the <span class="att">exclude</span> attributes also provide for a simple way of indicating uncertainty about exactly where a particular element occurs in a document.<span id="Note103_return"><a class="notelink" title="An alternative way of representing this problem is discussed in chapter ." href="#Note103"><sup>66</sup></a></span> For example suppose that a particular <a class="gi" title="(level-2 text division) contains a second-level subdivision of the front, body, or back of a text." href="ref-div2.html">div2</a> element appears either as the third and last of the <a class="gi" title="(level-2 text division) contains a second-level subdivision of the front, body, or back of a text." href="ref-div2.html">div2</a> elements within the first <a class="gi" title="(level-1 text division) contains a first-level subdivision of the front, body, or back of a text." 
href="ref-div1.html">div1</a> element in the body of a document, or as the first <a class="gi" title="(level-2 text division) contains a second-level subdivision of the front, body, or back of a text." href="ref-div2.html">div2</a> of the second <a class="gi" title="(level-1 text division) contains a first-level subdivision of the front, body, or back of a text." href="ref-div1.html">div1</a>. One solution would be to record the <a class="gi" title="(level-2 text division) contains a second-level subdivision of the front, body, or back of a text." href="ref-div2.html">div2</a> in its entirety in the first of these positions, and a virtual copy of it in the second, and mark them as excluding each other as follows: <div id="index-egXML-d52e124059" class="pre egXML_valid"><span class="element">&lt;body&gt;</span><br /> <span class="element">&lt;div1 <span class="attribute">xml:id</span>="<span class="attributevalue">C1</span>"&gt;</span><br />  <span class="element">&lt;div2 <span class="attribute">xml:id</span>="<span class="attributevalue">C1S3</span>" <span class="attribute">exclude</span>="<span class="attributevalue">#C2S1</span>"/&gt;</span><br /> <span class="element">&lt;/div1&gt;</span><br /> <span class="element">&lt;div1 <span class="attribute">xml:id</span>="<span class="attributevalue">C2</span>"&gt;</span><br />  <span class="element">&lt;div2 <span class="attribute">xml:id</span>="<span class="attributevalue">C2S1</span>" <span class="attribute">copyOf</span>="<span class="attributevalue">#C1S3</span>"<br />   <span class="attribute">exclude</span>="<span class="attributevalue">#C1S3</span>"/&gt;</span><br /> <span class="element">&lt;/div1&gt;</span><br /><span class="element">&lt;/body&gt;</span></div> In this case, the <span class="att">select</span> attribute, if used, would appear on the <a class="gi" title="(text body) contains the whole body of a single unitary text, excluding any front or back matter." 
href="ref-body.html">body</a> element.</div><div class="p">Mutual exclusion can also be expressed using a <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a>; the first example in this section can be recoded by removing the <span class="att">exclude</span> attributes from the <a class="gi" title="(utterance) contains a stretch of speech usually preceded and followed by silence or by a change of speaker." href="ref-u.html">u</a> elements, and adding a <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element as follows:<span id="Note104_return"><a class="notelink" title="In this example, we have placed the link next to the elements that represent the alternants. It could also have been placed elsewhere in the document,…" href="#Note104"><sup>67</sup></a></span> <div id="index-egXML-d52e124097" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">type</span>="<span class="attributevalue">interview</span>"&gt;</span><br /> <span class="element">&lt;u <span class="attribute">xml:id</span>="<span class="attributevalue">we.had.fun</span>"&gt;</span>We had fun at the beach today.<span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;u <span class="attribute">xml:id</span>="<span class="attributevalue">we.had.sun</span>"&gt;</span>We had sun at the beach today.<span class="element">&lt;/u&gt;</span><br /> <span class="element">&lt;link <span class="attribute">type</span>="<span class="attributevalue">exclusiveAlternation</span>"<br />  <span class="attribute">target</span>="<span class="attributevalue">#we.had.fun #we.had.sun</span>"/&gt;</span><br /><span class="element">&lt;/div&gt;</span></div></div><div class="p">Now we define the specialized linking element <a 
class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a>, making it a member of the class <a class="link_odd" title="provides a set of attributes used by all elements which point to other elements by means of one or more URI references." href="ref-att.pointing.html">att.pointing</a>, and assigning it a <span class="att">mode</span> attribute, which can have either of the values <span class="val">excl</span> (for exclusive) or <span class="val">incl</span> (for inclusive). Then the following equivalence holds: <div id="index-egXML-d52e124122" class="pre egXML_valid"><span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#a #b</span>" <span class="attribute">mode</span>="<span class="attributevalue">excl</span>"/&gt;</span></div> = <div id="index-egXML-d52e124125" class="pre egXML_valid"><span class="element">&lt;link <span class="attribute">target</span>="<span class="attributevalue">#a #b</span>"<br /> <span class="attribute">type</span>="<span class="attributevalue">exclusiveAlternation</span>"/&gt;</span></div></div><div class="p">The preceding <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> element may therefore be recoded as the following <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> element. 
<div id="index-egXML-d52e124135" class="pre egXML_valid"><span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#we.had.fun #we.had.sun</span>"<br /> <span class="attribute">mode</span>="<span class="attributevalue">excl</span>"/&gt;</span></div></div><p>Another attribute that is defined specifically for the <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> element is <span class="att">weights</span>, which is to be used if one wishes to assign <span class="term">probabilistic weights</span> to the targets (alternants). Its value is a list of numbers, corresponding to the targets, expressing the probability that each target appears. If the alternants are mutually exclusive, then the weights must sum to 1.0.</p><div class="p">Suppose in the preceding example that it is equiprobable whether <span class="mentioned">fun</span> or <span class="mentioned">sun</span> appears. Then the <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> element that represents the alternation may be stated as follows: <div id="index-egXML-d52e124161" class="pre egXML_valid"><span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#we.had.fun #we.had.sun</span>"<br /> <span class="attribute">mode</span>="<span class="attributevalue">excl</span>" <span class="attribute">weights</span>="<span class="attributevalue">0.5 0.5</span>"/&gt;</span></div></div><div class="p">The assignment of a weight of 1.0 to one target (and weights of 0 to all the others) is equivalent to selecting that target. Thus the following encoding is equivalent to the second example at the beginning of this section. 
<div id="index-egXML-d52e124165" class="pre egXML_valid"><span class="element">&lt;u <span class="attribute">xml:id</span>="<span class="attributevalue">we.fun</span>"&gt;</span>We had fun at the beach today.<span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;u <span class="attribute">xml:id</span>="<span class="attributevalue">we.sun</span>"&gt;</span>We had sun at the beach today.<span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#we.fun #we.sun</span>" <span class="attribute">mode</span>="<span class="attributevalue">excl</span>"<br /> <span class="attribute">weights</span>="<span class="attributevalue">1 0</span>"/&gt;</span></div> The sum of the weights for <span class="tag">&lt;alt mode="incl"&gt;</span> tags ranges from 0% to (100 × <code>k</code>)%, where <code>k</code> is the number of targets. If the sum is 0%, then the alternation is equivalent to exclusive alternation; if the sum is (100 × <code>k</code>)%, then all of the alternants must appear, and the situation is better encoded without an <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> tag.</div><p>If it is desired, <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> elements may be grouped together in an <a class="gi" title="(alternation group) groups a collection of &lt;alt&gt; elements and possibly pointers." href="ref-altGrp.html">altGrp</a> element, and attribute values shared by the individual <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> elements may be identified on the <a class="gi" title="(alternation group) groups a collection of &lt;alt&gt; elements and possibly pointers." 
href="ref-altGrp.html">altGrp</a> element. The <span class="att">targFunc</span> attribute defaults to the value <span class="val">first.alternant next.alternant</span>. </p><div class="p">To illustrate, consider again the example of a transcribed utterance, in which it is uncertain whether the first word is <span class="mentioned">We</span> or <span class="mentioned">Lee</span>, whether the third word is <span class="mentioned">fun</span> or <span class="mentioned">sun</span>, but that if the first word is <span class="mentioned">Lee</span>, then the third word is <span class="mentioned">fun</span>. Now suppose we have the following additional information: if <span class="mentioned">we</span> occurs, then the probability that <span class="mentioned">fun</span> occurs is 50% and that <span class="mentioned">sun</span> occurs is 50%; if <span class="mentioned">fun</span> occurs, then the probability that <span class="mentioned">we</span> occurs is 40% and that <span class="mentioned">Lee</span> occurs is 60%. This situation can be encoded as follows. 
<div id="index-egXML-d52e124327" class="pre egXML_valid"><span class="element">&lt;u&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">exclude</span>="<span class="attributevalue">#lee2</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">we2</span>"<br />  <span class="attribute">type</span>="<span class="attributevalue">word</span>"&gt;</span>We<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">exclude</span>="<span class="attributevalue">#we2</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">lee2</span>"<br />  <span class="attribute">type</span>="<span class="attributevalue">word</span>"&gt;</span>Lee<span class="element">&lt;/seg&gt;</span><br />   had<br /> <span class="element">&lt;seg <span class="attribute">exclude</span>="<span class="attributevalue">#sun2</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">fun2</span>"<br />  <span class="attribute">type</span>="<span class="attributevalue">word</span>"&gt;</span>fun<span class="element">&lt;/seg&gt;</span><br /> <span class="element">&lt;seg <span class="attribute">exclude</span>="<span class="attributevalue">#fun2</span>" <span class="attribute">xml:id</span>="<span class="attributevalue">sun2</span>"<br />  <span class="attribute">type</span>="<span class="attributevalue">word</span>"&gt;</span>sun<span class="element">&lt;/seg&gt;</span><br />   at the beach today.<span class="element">&lt;/u&gt;</span><br /><span class="element">&lt;altGrp&gt;</span><br /> <span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#we2 #lee2</span>"/&gt;</span><br /> <span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#fun2 #sun2</span>"/&gt;</span><br /> <span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#we2 #fun2</span>" <span 
class="attribute">mode</span>="<span class="attributevalue">incl</span>"<br />  <span class="attribute">weights</span>="<span class="attributevalue">0.4 0.5</span>"/&gt;</span><br /> <span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#lee2 #fun2</span>" <span class="attribute">mode</span>="<span class="attributevalue">incl</span>"<br />  <span class="attribute">weights</span>="<span class="attributevalue">0.6 1.0</span>"/&gt;</span><br /><span class="element">&lt;/altGrp&gt;</span></div> As noted above, when the <span class="att">mode</span> attribute on an <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> has the value <span class="val">incl</span>, then each weight states the probability that the corresponding alternative occurs, given that at least one of the other alternatives occurs. Thus the weights <span class="val">0.4 0.5</span> state that <span class="mentioned">We</span> occurs with probability 40% given <span class="mentioned">fun</span>, and that <span class="mentioned">fun</span> occurs with probability 50% given <span class="mentioned">We</span>.</div><p>From the information in this encoding, we can determine that the probability is about 28.5% that the utterance is <span class="q">‘We had fun at the beach today’</span>, 28.5% that it is <span class="mentioned">We had sun at the beach today</span>, and 43% that it is <span class="mentioned">Lee had fun at the beach today</span>.</p><p>A very similar example concerns the text of a Broadway song. In three different versions of the song, the same line reads <span class="q">‘Her skin is tender as a leather glove’</span>, <span class="q">‘Her skin is tender as a baseball glove’</span>, and <span class="q">‘Her skin is tender as Dimaggio's glove.’</span><span id="Note105_return"><a class="notelink" title="The variant readings are found in the commercial sheet music, the performance score, and the Broadway cast recording." 
href="#Note105"><sup>68</sup></a></span></p><p>If we wish to express this textual variation using the <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> element, we can record our relative confidence in the readings <span class="mentioned">Dimaggio's</span> (with probability 50%), <span class="mentioned">a leather</span> (25%), and <span class="mentioned">a baseball</span> (25%).</p><div class="p">Let us extend the example with a further (imaginary) variation, supposing for the sake of the argument that the next line is variously given as <span class="mentioned">and she bats from right to left</span> (with probability 50%) or <span class="mentioned">now ain't that too damn bad</span> (with probability 50%). Using the <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> element, we can express the conviction that if the first choice for the second line is correct, then the probability that the first line contains <span class="mentioned">Dimaggio's</span> is 90%, and each of the others 5%; whereas if the second choice for the second line is correct, then the probability that the first line contains <span class="mentioned">Dimaggio's</span> is 10%, and each of the others is 45%. This can be encoded, with an <a class="gi" title="(alternation group) groups a collection of &lt;alt&gt; elements and possibly pointers." href="ref-altGrp.html">altGrp</a> element containing a combination of exclusive and inclusive <a class="gi" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> elements, as follows.  
<div id="index-egXML-d52e124418" class="pre egXML_valid"><span class="element">&lt;div <span class="attribute">xml:id</span>="<span class="attributevalue">bm</span>" <span class="attribute">type</span>="<span class="attributevalue">song</span>"&gt;</span><br /> <span class="element">&lt;l&gt;</span>Her skin is tender as<br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">dm</span>"&gt;</span>Dimaggio's<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">lt</span>"&gt;</span>a leather<span class="element">&lt;/seg&gt;</span><br />  <span class="element">&lt;seg <span class="attribute">xml:id</span>="<span class="attributevalue">bb</span>"&gt;</span>a baseball<span class="element">&lt;/seg&gt;</span><br />     glove,<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">rl</span>"&gt;</span>and she bats from right to left.<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;l <span class="attribute">xml:id</span>="<span class="attributevalue">db</span>"&gt;</span>now ain't that too damn bad.<span class="element">&lt;/l&gt;</span><br /><span class="element">&lt;/div&gt;</span><br /><span class="element">&lt;altGrp&gt;</span><br /> <span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#dm #lt #bb</span>" <span class="attribute">mode</span>="<span class="attributevalue">excl</span>"<br />  <span class="attribute">weights</span>="<span class="attributevalue">0.5 0.25 0.25</span>"/&gt;</span><br /> <span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#rl #db</span>" <span class="attribute">mode</span>="<span class="attributevalue">excl</span>"<br />  <span class="attribute">weights</span>="<span class="attributevalue">0.50 
0.50</span>"/&gt;</span><br /><span class="element">&lt;/altGrp&gt;</span><br /><span class="element">&lt;altGrp <span class="attribute">mode</span>="<span class="attributevalue">incl</span>"&gt;</span><br /> <span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#dm #rl</span>" <span class="attribute">weights</span>="<span class="attributevalue">0.90 0.90</span>"/&gt;</span><br /> <span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#lt #rl</span>" <span class="attribute">weights</span>="<span class="attributevalue">0.05 0.10</span>"/&gt;</span><br /> <span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#bb #rl</span>" <span class="attribute">weights</span>="<span class="attributevalue">0.05 0.10</span>"/&gt;</span><br /> <span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#dm #db</span>" <span class="attribute">weights</span>="<span class="attributevalue">0.10 0.10</span>"/&gt;</span><br /> <span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#lt #db</span>" <span class="attribute">weights</span>="<span class="attributevalue">0.45 0.90</span>"/&gt;</span><br /> <span class="element">&lt;alt <span class="attribute">target</span>="<span class="attributevalue">#bb #db</span>" <span class="attribute">weights</span>="<span class="attributevalue">0.45 0.90</span>"/&gt;</span><br /><span class="element">&lt;/altGrp&gt;</span></div></div></div><div class="div2" id="SASO"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SAAT"><span class="headingNumber">16.8 </span>Alternation</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SAAN"><span class="headingNumber">16.10 </span>Connecting Analytic and Textual 
Markup</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h3><span class="bookmarklink"><a class="bookmarklink" href="#SASO" title="link to this section "><span class="invisible">TEI: Stand-off Markup</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.9 </span><span class="head">Stand-off Markup</span></h3><div class="div3" id="SASOin"><h4><span class="bookmarklink"><a class="bookmarklink" href="#SASOin" title="link to this section "><span class="invisible">TEI: Introduction</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.9.1 </span><span class="head">Introduction</span></h4><p>Most of the mechanisms defined in this chapter rely to a greater or lesser extent on the fact that tags in a marked-up document can both assert a property for a span of text which they enclose, and assert the existence of an association between themselves and some other span of text elsewhere. In stand-off markup, there is a clear separation of these two behaviours: the markup does not directly contain any part of the text, but instead includes it by reference. One specific mechanism recommended by these Guidelines for this purpose is the standard XInclude mechanism defined by the W3C; another is to use pointers as demonstrated elsewhere in this chapter. 
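</p><div class="p">As a minimal sketch of these two approaches, assuming a hypothetical source document <span class="it">source.xml</span> that contains an element identified as <span class="val">w1</span> (both names are invented here for illustration), the same element might either be included by reference or merely pointed at: <pre class="pre_eg cdata">&lt;!-- inclusion by reference: the element identified as w1 is to be
     inserted at this point when the document is processed --&gt;
&lt;seg xmlns:xi="http://www.w3.org/2001/XInclude"&gt;
 &lt;xi:include href="source.xml" xpointer="w1"/&gt;
&lt;/seg&gt;

&lt;!-- pointing: only the association with w1 is recorded --&gt;
&lt;ptr target="source.xml#w1"/&gt;</pre></div><p>In the first case an XInclude processor replaces the inclusion instruction with the content referred to; in the second the content stays where it is and only the association is asserted.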
</p><p>There are many reasons for using stand-off markup: the source text might be read-only so that additional markup cannot be added, or a single text may need to be marked up according to several hierarchically incompatible schemes, or a single scheme may need to accommodate multiple hierarchical ambiguities, so that a single markup tree is not the most faithful representation of the source material.</p><p>This section describes a generic mechanism for expressing <em>all</em> kinds of markup externally as stand-off tags, for use whenever it is appropriate.</p><p>Throughout this section the following terms will be systematically used in specific senses. </p><dl><dt><span><span class="term">source document</span></span></dt><dd>a document to which the stand-off markup refers (a source document can be either XML or plain text); there may be more than one source document.</dd><dt><span><span class="term">internal markup</span></span></dt><dd>markup that is already present in an XML source document</dd><dt><span><span class="term">stand-off markup</span></span></dt><dd>markup that is either outside of the source document and points in to it to the data it describes, or alternatively is in another part of the source document and points elsewhere within the document to the data it describes</dd><dt><span><span class="term">external document</span></span></dt><dd>a document that contains stand-off markup that points to a different, source document</dd><dt><span><span class="term">internalize</span></span></dt><dd>the action of creating a new XML document with external markup and data integrated with the source document data, and possibly some source document markup as well</dd><dt><span><span class="term">externalize</span></span></dt><dd>a process applied to markup from a pre-existing XML document, which splits it into two documents, an XML (external) document containing some of the markup of the original document, and another (source) XML document containing whatever 
text content and markup has not been extracted into the stand-off document; if all markup has been externalized from a document, the new source may be a plain text document</dd></dl><p>The three major requirements satisfied by this scheme for stand-off markup are: </p><ol class="numbered"><li class="item">any valid TEI markup can be either internal or external,</li><li class="item">external markup can be internalized by applying it to the document content by either substituting the existing markup or adding to it, to form a valid TEI document, and</li><li class="item">the external markup itself specifies whether an internalized document is to be created by substituting the existing internal markup or by adding to it.</li></ol></div><div class="div3" id="SASOov"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SASOin"><span class="headingNumber">16.9.1 </span>Introduction</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SASOso"><span class="headingNumber">16.9.3 </span>Stand-off Markup in TEI</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SASOov" title="link to this section "><span class="invisible">TEI: Overview of XInclude </span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.9.2 </span><span class="head">Overview of XInclude </span></h4><p>Stand-off markup which relies on the inclusion of virtual content is adequately supported by the W3C XInclude recommendation, which is also recommended for use by these Guidelines.<span id="Note106_return"><a class="notelink" title="The version on which this text is based is the W3C Recommendation dated 20 December 2004." 
href="#Note106"><sup>69</sup></a></span> XInclude defines a namespace (<span class="mentioned">http://www.w3.org/2001/XInclude</span>), which in these Guidelines will be associated with the prefix <span class="mentioned">xi:</span>, and exactly two elements, <span class="gi">&lt;xi:include&gt;</span> and <span class="gi">&lt;xi:fallback&gt;</span>. XInclude relies on the <a class="link_ref" href="http://www.w3.org/TR/xptr-framework/">XPointer framework</a> discussed elsewhere in this chapter to point to the actual fragments of text to be internalized. Although XInclude only requires support for the <a class="link_ref" href="http://www.w3.org/TR/xptr-element/"><code>element()</code></a> scheme of XPointer, these Guidelines permit the use of any of the pointing schemes discussed in section <a class="link_ptr" href="SA.html#SAXP" title="Pointing Mechanisms"><span class="headingNumber">16.2 </span>Pointing Mechanisms</a>.</p><p>XInclude is a W3C recommendation which specifies a syntax for the inclusion within an XML document of data fragments placed in different resources. Included resources can be either plain text or XML. XInclude instructions within an XML document are meant to be replaced by a resource targeted by a URI, possibly augmented by an XPointer that identifies the exact subresource to be included. </p><p>The <span class="gi">&lt;xi:include&gt;</span> element uses the <span class="att">href</span> attribute to specify the location of the resource to be included; its value is a URI containing, if necessary, an XPointer. Additionally, it uses the <span class="att">parse</span> attribute (whose only valid values are <span class="val">text</span> and <span class="val">xml</span>) to specify whether the included content is plain text or an XML fragment, and the <span class="att">encoding</span> attribute to provide a hint, when the included fragment is text, of the character encoding of the fragment. 
An optional <span class="gi">&lt;xi:fallback&gt;</span> element is also permitted within an <span class="gi">&lt;xi:include&gt;</span>; it specifies alternative content to be used when the external resource cannot be fetched for some reason. Its use is not however recommended for stand-off markup.</p></div><div class="div3" id="SASOso"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SASOov"><span class="headingNumber">16.9.2 </span>Overview of XInclude </a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SASOva"><span class="headingNumber">16.9.4 </span>Well-formedness and Validity of Stand-off Markup</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SASOso" title="link to this section "><span class="invisible">TEI: Stand-off Markup in TEI</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.9.3 </span><span class="head">Stand-off Markup in TEI</span></h4><p>The operations of internalizing and externalizing markup are very useful and practically important. XInclude processing as defined by the W3C <em>is</em> internalization of one or more source documents' content into a stand-off document. TEI use of XInclude for stand-off markup enables use of XInclude-conformant software to perform this useful operation. However, internalization is not clearly defined for all stand-off files, because the structure of the internal and external markup trees may overlap. 
In particular, when an external markup document selects a range that overlaps partial elements in the source document, it is not clear how the semantics of internalization (inclusion) should work, since partial elements are not XML objects.<span id="Note107_return"><a class="notelink" title="This corresponds to the observation that overlapping XML tags reflecting a textual version of such an inclusion would not even be well-formed XML. Thi…" href="#Note107"><sup>70</sup></a></span> XInclude defines a semantics for this case that involves only complete elements.</p><div class="p">When a range selection partially overlaps a number of elements in a source document, XInclude specifies that the partially overlapping elements should be included as well as all completely overlapping elements and characters (partially overlapping characters are not possible). The effect of this is that elements that straddle the start or end of a selected range will be included as wrappers for those of their children that are completely or partially selected by the range. For example, given the following source document: <div id="index-egXML-d52e125047" class="pre egXML_valid"><span class="element">&lt;body&gt;</span><br /> <span class="element">&lt;p <span class="attribute">xml:id</span>="<span class="attributevalue">par1</span>"&gt;</span>home, <span class="element">&lt;emph&gt;</span>home<span class="element">&lt;/emph&gt;</span> on Brokeback Mountain.<span class="element">&lt;/p&gt;</span><br /> <span class="element">&lt;p <span class="attribute">xml:id</span>="<span class="attributevalue">par2</span>"&gt;</span>That was the <span class="element">&lt;emph&gt;</span>song<span class="element">&lt;/emph&gt;</span> that I sang<span class="element">&lt;/p&gt;</span><br /><span class="element">&lt;/body&gt;</span></div> and the following external document: <pre class="pre_eg cdata">
  &lt;body&gt;
     &lt;div&gt;&lt;include href="example1.xml" xmlns="http://www.w3.org/2001/XInclude"
xpointer="range(xpath(id('par1')//emph),xpath(id('par2')//emph))"/&gt;
     &lt;/div&gt;
 &lt;/body&gt;   
</pre> the resulting document after XInclude processing of this external document would be: <pre class="pre_eg cdata">
   &lt;body&gt;
   &lt;div&gt;
     &lt;p xml:id="par1"&gt;home, &lt;emph&gt;home&lt;/emph&gt; on Brokeback Mountain.&lt;/p&gt;
     &lt;p xml:id="par2"&gt;That was the &lt;emph&gt;song&lt;/emph&gt; that I sang&lt;/p&gt;
   &lt;/div&gt;
   &lt;/body&gt;
</pre> The result of the inclusion is two paragraph elements, whereas the original range designated in the source document overlapped two paragraph fragments. The semantics of XInclude require the creation of well-formed XML results even though the pointing mechanisms it uses do not necessarily respect the hierarchical structure of XML documents, as in this case. While this is a good way to ensure that internalization is always possible, it has implications for the use of XInclude as a notation for the <em>description</em> of overlapping markup structures.</div><p>When overlapping hierarchies need to be represented for a single document, each hierarchy must be represented by a separate set of XInclude tags pointing to a common source document. This sort of structure corresponds to common practice in work with linguistic text corpora. In such corpora, each potentially overlapping hierarchy of elements for the text is represented as a separate stream of stand-off markup. Generally the source text contains markup for the smallest significant units of analysis in the corpus, such as words or morphemes; this information and its markup represent a layer of common information that is shared by all the various hierarchies. 
As a way of organizing the representation of complex data, this technique generally allows a large number of <span class="att">xml:id</span> attributes to be attached to the shared elements, providing robust anchors for links and facilitating adjustments to the source document without breaking external documents that reference it.</p><p>Any tag can be externalized by  removing its content and replacing it with an <span class="gi">&lt;xi:include&gt;</span> element that contains an XPointer pointing to the desired content.</p><div class="p">For instance the following portion of a TEI document: <div id="index-egXML-d52e125087" class="pre egXML_valid"><span class="element">&lt;text&gt;</span><br /> <span class="element">&lt;body&gt;</span><br />  <span class="element">&lt;head&gt;</span>1755<span class="element">&lt;/head&gt;</span><br />  <span class="element">&lt;l&gt;</span>To make a prairie it takes a clover and one bee,<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;l&gt;</span>One clover, and a bee,<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;l&gt;</span>And revery.<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;l&gt;</span>The revery alone will do,<span class="element">&lt;/l&gt;</span><br />  <span class="element">&lt;l&gt;</span>If bees are few.<span class="element">&lt;/l&gt;</span><br /> <span class="element">&lt;/body&gt;</span><br /><span class="element">&lt;/text&gt;</span><div style="float: right;"><a href="BIB.html#VEST-eg-1">bibliography</a> </div></div> can be externalized by placing the actual text in a separate document, and providing exactly the same markup with the <span class="gi">&lt;xi:include&gt;</span> elements: <br />  <span class="it">Source.xml</span> <pre class="pre_eg cdata">&lt;content&gt;To make a prairie it takes a clover and one bee,\n
One clover, and a bee,\n
And revery.\n
The revery alone will do,\n
If bees are few.\n
&lt;/content&gt;</pre> <br />  <span class="it">External.xml</span> <pre class="pre_eg cdata">&lt;text xmlns:xi="http://www.w3.org/2001/XInclude"&gt;
 &lt;body&gt;
  &lt;head&gt;1755&lt;/head&gt;
   &lt;l&gt;
    &lt;xi:include href="Source.xml" parse="xml"
 xpointer="string-range(element(/1),  0, 48)"/&gt;
   &lt;/l&gt;
   &lt;l&gt;
    &lt;xi:include href="Source.xml" parse="xml"
 xpointer="string-range(element(/1), 49, 71)"/&gt;
   &lt;/l&gt;
   &lt;l&gt;
    &lt;xi:include href="Source.xml" parse="xml"
 xpointer="string-range(element(/1), 72, 83)"/&gt;
   &lt;/l&gt;
   &lt;l&gt;
    &lt;xi:include href="Source.xml" parse="xml"
 xpointer="string-range(element(/1), 84,109)"/&gt;
   &lt;/l&gt;
   &lt;l&gt;
    &lt;xi:include href="Source.xml" parse="xml"
 xpointer="string-range(element(/1),110,126)"/&gt;
   &lt;/l&gt;
 &lt;/body&gt;
&lt;/text&gt;</pre></div><p>Note that the XInclude namespace declaration must be present in all cases. The <span class="gi">&lt;xi:fallback&gt;</span> element contains text or XML fragments to be placed in the document if the inclusion fails for any reason (for instance due to inaccessibility of an external resource). The <span class="gi">&lt;xi:fallback&gt;</span> element is optional; if it is not present an XInclude processor must signal a fatal error when a resource is not found. This is the preferred behaviour for use with stand-off markup, and these Guidelines therefore recommend against the use of <span class="gi">&lt;xi:fallback&gt;</span> for stand-off markup.</p></div><div class="div3" id="SASOva"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SASOso"><span class="headingNumber">16.9.3 </span>Stand-off Markup in TEI</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SASOfr"><span class="headingNumber">16.9.5 </span>Including Text or XML Fragments</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SASOva" title="link to this section "><span class="invisible">TEI: Well-formedness and Validity of Stand-off Markup</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.9.4 </span><span class="head">Well-formedness and Validity of Stand-off Markup</span></h4><p>The whole source fragment identified by an XInclude element, as well as any markup contained therein, is inserted in the position specified, and an XInclude processor is required to ensure that the resulting internalized document is well-formed. This has obvious implications when the external document contains XML markup. A plain text source document will always create a well-formed internalized document. 
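</p><div class="p">For instance, assuming a hypothetical plain text file <span class="it">Source.txt</span> holding a single line of verse (the file name and the value given for <span class="att">encoding</span> are illustrative assumptions, not part of the examples above), a sketch of an inclusion that cannot disturb well-formedness might look as follows: <pre class="pre_eg cdata">&lt;l xmlns:xi="http://www.w3.org/2001/XInclude"&gt;
 &lt;!-- parse="text": the resource is included as character data --&gt;
 &lt;xi:include href="Source.txt" parse="text" encoding="UTF-8"/&gt;
&lt;/l&gt;</pre></div><p>Because the included content is treated as character data, any characters in it that resemble markup are included as text rather than parsed, so no new element boundaries are introduced.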
</p><p>While a TEI customization may permit <span class="gi">&lt;xi:include&gt;</span> elements in various places in a TEI document instance, in general these Guidelines suggest that validity be verified after the resolution of all the <span class="gi">&lt;xi:include&gt;</span> elements.</p></div><div class="div3" id="SASOfr"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SASOva"><span class="headingNumber">16.9.4 </span>Well-formedness and Validity of Stand-off Markup</a></li><li class="subtoc"></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h4><span class="bookmarklink"><a class="bookmarklink" href="#SASOfr" title="link to this section "><span class="invisible">TEI: Including Text or XML Fragments</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.9.5 </span><span class="head">Including Text or XML Fragments</span></h4><p>When the source text is plain text the overall form of the XPointer pointing to it is of minimal importance. The form of the XPointer matters considerably, on the other hand, when the source document is XML.</p><p>In this case, it is rather important to distinguish whether we intend to substitute the source XML with the new one, or just to add new markup to it. The XPointers used in the references can express both cases.</p><div class="p">A simple way is to make sure to select only textual data in the XPointer. 
For instance, given the following document: <br />  <span class="it">Source.xhtml</span> <div id="index-egXML-d52e125167" class="pre egXML_valid"><span class="element">&lt;xhtml:html&gt;</span><br /> <span class="element">&lt;xhtml:body&gt;</span><br />  <span class="element">&lt;xhtml:div&gt;</span>To make a prairie it takes a <span class="element">&lt;xhtml:a <span class="attribute">href</span>="<span class="attributevalue">clover.gif</span>"&gt;</span>clover<span class="element">&lt;/xhtml:a&gt;</span><br />       and one <span class="element">&lt;xhtml:a <span class="attribute">href</span>="<span class="attributevalue">bee.gif</span>"&gt;</span>bee<span class="element">&lt;/xhtml:a&gt;</span>,<span class="element">&lt;/xhtml:div&gt;</span><br />  <span class="element">&lt;xhtml:div&gt;</span>One <span class="element">&lt;xhtml:a <span class="attribute">href</span>="<span class="attributevalue">clover.gif</span>"&gt;</span>clover<span class="element">&lt;/xhtml:a&gt;</span>, and<br />       a <span class="element">&lt;xhtml:a <span class="attribute">href</span>="<span class="attributevalue">bee.gif</span>"&gt;</span>bee<span class="element">&lt;/xhtml:a&gt;</span>,<span class="element">&lt;/xhtml:div&gt;</span><br />  <span class="element">&lt;xhtml:div&gt;</span>And revery.<span class="element">&lt;/xhtml:div&gt;</span><br />  <span class="element">&lt;xhtml:div&gt;</span>The revery alone will do,<span class="element">&lt;/xhtml:div&gt;</span><br />  <span class="element">&lt;xhtml:div&gt;</span>If bees are few.<span class="element">&lt;/xhtml:div&gt;</span><br /> <span class="element">&lt;/xhtml:body&gt;</span><br /><span class="element">&lt;/xhtml:html&gt;</span></div> the expression <code>range(element(/1/2/1.0),element(/1/2/11.1))</code> will select the whole poem, text content <em>and</em> <a class="gi" title="(text division) contains a subdivision of the front, body, or back of a text." 
href="ref-div.html">div</a> elements <em>and</em> hypertext links (NB: in XPointer whitespace-only text nodes count).</div><p>By contrast, the expressions <code>xpointer(//text()/range-to(.))</code> and <code>xpointer(string-range(//text(),"To")/range-to(//text(),"few."))</code> will only select the text of the poem, with no markup inside.</p><p>Thus, the following could be a valid stand-off document for the <span class="titlem">Source.xhtml</span> document: <br />  <span class="it">External2.xml</span> </p><pre class="pre_eg cdata">&lt;text xmlns:xi="http://www.w3.org/2001/XInclude"&gt;
 &lt;body&gt;
  &lt;head&gt;1755&lt;/head&gt;
  &lt;l&gt;
   &lt;xi:include href="Source.xhtml"
 xpointer='xpointer(string-range(//div[1]/text(),"To")/range-to(//div[1]/text(),"bee,"))'/&gt;
  &lt;/l&gt;
  &lt;l&gt;
   &lt;xi:include href="Source.xhtml"
 xpointer='xpointer(string-range(//div[2]/text(),"One")/range-to(//div[2]/text(),"bee,"))'/&gt;
  &lt;/l&gt;
  &lt;l&gt;
   &lt;xi:include href="Source.xhtml"
 xpointer='xpointer(string-range(//div[3]/text(),"And")/range-to(//div[3]/text(),"."))'/&gt;
  &lt;/l&gt;
  &lt;l&gt;
   &lt;xi:include href="Source.xhtml"
 xpointer='xpointer(string-range(//div[4]/text(),"The")/range-to(//div[4]/text(),","))'/&gt;
  &lt;/l&gt;
  &lt;l&gt;
   &lt;xi:include href="Source.xhtml"
 xpointer='xpointer(string-range(//div[5]/text(),"If")/range-to(//div[5]/text(),"."))'/&gt;
  &lt;/l&gt;
 &lt;/body&gt;
&lt;/text&gt;</pre></div></div><div class="div2" id="SAAN"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SASO"><span class="headingNumber">16.9 </span>Stand-off Markup</a></li><li class="subtoc"><span class="nextLink"> » </span><a class="navigation" href="SA.html#SAref"><span class="headingNumber">16.11 </span>Module for Linking, Segmentation, and Alignment</a></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h3><span class="bookmarklink"><a class="bookmarklink" href="#SAAN" title="link to this section "><span class="invisible">TEI: Connecting Analytic and Textual Markup</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.10 </span><span class="head">Connecting Analytic and Textual Markup</span></h3><p>In chapters <a class="link_ptr" href="AI.html" title="15"><span class="headingNumber">17 </span>Simple Analytic Mechanisms</a> and <a class="link_ptr" href="FS.html" title="16"><span class="headingNumber">18 </span>Feature Structures</a> and elsewhere, provision is made for analytic and interpretive markup to be represented outside of textual markup, either in the same document or in a different document. The elements in these separate domains can be connected, either with the pointing attributes <span class="att">ana</span> (for <span class="mentioned">analysis</span>) and <span class="att">inst</span> (for <span class="mentioned">instance</span>), or by means of <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> and <a class="gi" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> elements. 
Numerous examples are given in these chapters.</p></div><div class="div2" id="SAref"><div class="miniTOC miniTOC_right"><ul class="subtoc"><li class="subtoc"><span class="previousLink"> « </span><a class="navigation" href="SA.html#SAAN"><span class="headingNumber">16.10 </span>Connecting Analytic and Textual Markup</a></li><li class="subtoc"></li><li class="subtoc"><a class="navigation" href="index.html">Home</a></li></ul></div><h3><span class="bookmarklink"><a class="bookmarklink" href="#SAref" title="link to this section "><span class="invisible">TEI: Module for Linking, Segmentation, and Alignment</span><span class="pilcrow">¶</span></a></span><span class="headingNumber">16.11 </span><span class="head">Module for Linking, Segmentation, and Alignment</span></h3><p>The module described in this chapter makes available the following components: </p><dl class="moduleSpec"><dt class="moduleSpecHead"><span lang="en">Module</span> linking: Linking, segmentation and alignment</dt><dd><ul><li><span lang="en">Elements defined</span>: <a class="link_odd" title="(anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph." href="ref-ab.html">ab</a> <a class="link_odd" title="(alternation) identifies an alternation or a set of choices among elements or passages." href="ref-alt.html">alt</a> <a class="link_odd" title="(alternation group) groups a collection of &lt;alt&gt; elements and possibly pointers." href="ref-altGrp.html">altGrp</a> <a class="link_odd" title="(anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element." href="ref-anchor.html">anchor</a> <a class="link_odd" title="identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it." 
href="ref-join.html">join</a> <a class="link_odd" title="(join group) groups a collection of &lt;join&gt; elements and possibly pointers." href="ref-joinGrp.html">joinGrp</a> <a class="link_odd" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> <a class="link_odd" title="(link group) defines a collection of associations or hypertextual links." href="ref-linkGrp.html">linkGrp</a> <a class="link_odd" title="(arbitrary segment) represents any segmentation of text below the ‘chunk’ level." href="ref-seg.html">seg</a> <a class="link_odd" title="provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text." href="ref-timeline.html">timeline</a> <a class="link_odd" title="indicates a point in time either relative to other elements in the same timeline tag, or absolutely." href="ref-when.html">when</a></li><li><span lang="en">Classes defined</span>: <a class="link_odd" title="provides a set of attributes for hypertextual linking." href="ref-att.global.linking.html">att.global.linking</a></li></ul></dd></dl><p> The selection and combination of modules to form a TEI schema is described in <a class="link_ptr" href="ST.html#STIN" title="Defining a TEI Schema"><span class="headingNumber">1.2 </span>Defining a TEI Schema</a>. 
</p></div></div><nav class="left"><span class="upLink"> ↑ </span><a class="navigation" href="index.html">TEI P5 Guidelines</a><span class="previousLink"> « </span><a class="navigation" href="CC.html"><span class="headingNumber">15 </span>Language Corpora</a><span class="nextLink"> » </span><a class="navigation" href="AI.html"><span class="headingNumber">17 </span>Simple Analytic Mechanisms</a></nav><!--Notes in [div]--><div class="notes"><div class="noteHeading">Notes</div><div class="note" id="Note94"><span class="noteLabel">57 </span><div class="noteBody">We use the term <span class="term">alignment</span> as a special case of the more general notion of correspondence. Using A as a short form for <span class="q">‘an element with its attribute <span class="att">xml:id</span> set to the value <span class="val">A</span>’</span>, suppose that elements A1, A2, and A3 occur in that order and form one group, while elements B1, B2, and B3 occur in that order and form another group. Then a relation in which A1 corresponds to B1, A2 corresponds to B2, and A3 corresponds to B3 is an alignment. 
On the other hand, a relation in which A1 corresponds to B2, B1 to C2, and C1 to A2 is not an alignment.</div> <a class="link_return" title="Go back to text" href="#Note94_return">↵</a></div><div class="note" id="Note95"><span class="noteLabel">58 </span><div class="noteBody">The <span class="att">type</span> attribute on the note is used to classify the notes using the typology established in the Advertisement to the work: <span class="q">‘The <span class="noindex">Imitations</span> of the Ancients are added, to gratify those who either never read, or may have forgotten them; together with some of the Parodies, and Allusions to the most excellent of the Moderns.’</span> In the source text, the text of the poem shares the page with two sets of notes, one headed <span class="q">‘Remarks’</span> and the other <span class="q">‘Imitations’</span>.</div> <a class="link_return" title="Go back to text" href="#Note95_return">↵</a></div><div class="note" id="Note96"><span class="noteLabel">59 </span><div class="noteBody">Since no special element is provided for this purpose in the present version of these Guidelines, the information should be supplied as a series of paragraphs at the end of the <a class="gi" title="(encoding description) documents the relationship between an electronic text and the source or sources from which it was derived." 
href="ref-encodingDesc.html">encodingDesc</a> element described in section <a class="link_ptr" href="HD.html#HD5" title="The Encoding Description"><span class="headingNumber">2.3 </span>The Encoding Description</a>.</div> <a class="link_return" title="Go back to text" href="#Note96_return">↵</a></div><div class="note" id="Note97"><span class="noteLabel">60 </span><div class="noteBody">The URI (Uniform Resource Identifier) is defined in <a class="link_ref" href="http://www.ietf.org/rfc/rfc3986.txt">RFC 3986</a>.</div> <a class="link_return" title="Go back to text" href="#Note97_return">↵</a></div><div class="note" id="Note98"><span class="noteLabel">61 </span><div class="noteBody">As always seems to be the case, no two regular expression languages are precisely the same. For those used to Perl regular expressions, be warned that while in Perl the pattern <code>tei</code> matches any string that contains <span class="mentioned">tei</span>, in the W3C language it only matches the string <span class="q">‘tei’</span>.</div> <a class="link_return" title="Go back to text" href="#Note98_return">↵</a></div><div class="note" id="Note99"><span class="noteLabel">62 </span><div class="noteBody">See section <a class="link_ptr" href="AI.html#AISP" title="Spans and Interpretations"><span class="headingNumber">17.3 </span>Spans and Interpretations</a>, where the text from which this fragment is taken is analyzed.</div> <a class="link_return" title="Go back to text" href="#Note99_return">↵</a></div><div class="note" id="Note100"><span class="noteLabel">63 </span><div class="noteBody">This sample is taken from a conversation collected and transcribed for the British National Corpus.</div> <a class="link_return" title="Go back to text" href="#Note100_return">↵</a></div><div class="note" id="Note101"><span class="noteLabel">64 </span><div class="noteBody">See <a class="citlink" href="BIB.html#SA-BIBL-1">Gale and Church (1993)</a>, from which the example in the text is taken.</div> <a 
class="link_return" title="Go back to text" href="#Note101_return">↵</a></div><div class="note" id="Note102"><span class="noteLabel">65 </span><div class="noteBody">See section <a class="link_ptr" href="AI.html#AILC" title="Linguistic Segment Categories"><span class="headingNumber">17.1 </span>Linguistic Segment Categories</a> for discussion of the <a class="gi" title="(word) represents a grammatical (not necessarily orthographic) word." href="ref-w.html">w</a> and <a class="gi" title="(character) represents a character." href="ref-c.html">c</a> tags that can be used in the following examples instead of the <span class="tag">&lt;seg type="word"&gt;</span> and <span class="tag">&lt;seg type="character"&gt;</span> tags.</div> <a class="link_return" title="Go back to text" href="#Note102_return">↵</a></div><div class="note" id="Note103"><span class="noteLabel">66 </span><div class="noteBody">An alternative way of representing this problem is discussed in chapter <a class="link_ptr" href="CE.html" title="17"><span class="headingNumber">21 </span>Certainty, Precision, and Responsibility</a>.</div> <a class="link_return" title="Go back to text" href="#Note103_return">↵</a></div><div class="note" id="Note104"><span class="noteLabel">67 </span><div class="noteBody">In this example, we have placed the <a class="gi" title="defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements." href="ref-link.html">link</a> next to the elements that represent the alternants. It could also have been placed elsewhere in the document, perhaps within a <a class="gi" title="(link group) defines a collection of associations or hypertextual links." 
href="ref-linkGrp.html">linkGrp</a>.</div> <a class="link_return" title="Go back to text" href="#Note104_return">↵</a></div><div class="note" id="Note105"><span class="noteLabel">68 </span><div class="noteBody">The variant readings are found in the commercial sheet music, the performance score, and the Broadway cast recording.</div> <a class="link_return" title="Go back to text" href="#Note105_return">↵</a></div><div class="note" id="Note106"><span class="noteLabel">69 </span><div class="noteBody">The version on which this text is based is the <a class="link_ref" href="http://www.w3.org/TR/2004/REC-xinclude-20041220/">W3C Recommendation dated <span class="date">20 December 2004</span></a>.</div> <a class="link_return" title="Go back to text" href="#Note106_return">↵</a></div><div class="note" id="Note107"><span class="noteLabel">70 </span><div class="noteBody">This corresponds to the observation that overlapping XML tags reflecting a textual version of such an inclusion would not even be well-formed XML. This kind of overlap in textual phenomena of interest is in fact the major reason that stand-off markup is needed.</div> <a class="link_return" title="Go back to text" href="#Note107_return">↵</a></div></div><div class="stdfooter autogenerated"><p>
    [<a href="../../en/html/SA.html">English</a>]
    [<a href="../../de/html/SA.html">Deutsch</a>]
    [<a href="../../es/html/SA.html">Español</a>]
    [<a href="../../it/html/SA.html">Italiano</a>]
    [<a href="../../fr/html/SA.html">Français</a>]
    [<a href="../../ja/html/SA.html">日本語</a>]
    [<a href="../../ko/html/SA.html">한국어</a>]
    [<a href="../../zh-TW/html/SA.html">中文</a>]
    </p><hr /><div class="footer"><a class="plain" href="http://www.tei-c.org/About/">TEI Consortium</a> | <a class="plain" href="http://www.tei-c.org/About/contact.xml">Feedback</a></div><hr /><address><br />TEI Guidelines <a class="link_ref" href="AB.html#ABTEI4">Version</a> <a class="link_ref" href="../../readme-3.1.1.html">3.1.1a</a>. Last updated on <span class="date">10th May 2017</span>, revision <a class="link_ref" href="https://github.com/TEIC/TEI/commit/bd8dda3">bd8dda3</a>. This page generated on 2017-05-12T12:30:09Z.</address></div></div></body></html>