edu.northwestern.at.wordhoard.tools.fixers
Class ShaTitle

java.lang.Object
  extended by edu.northwestern.at.wordhoard.tools.fixers.Fixer
      extended by edu.northwestern.at.wordhoard.tools.fixers.ShaTitle

public class ShaTitle
extends Fixer

Fixes Shakespeare titles ("head" elements).

In the XML, the title is supposed to be specified by a unique "head" child element of the "div" element for the act or scene. The source texts sometimes break this rule: the "head" child may be missing, or there may be more than one. If the "head" child is missing, we reconstruct the title from the "type" and "n" attributes of the "div" element and set it as the "head" attribute of the "div" element. If more than one "head" child is present, we delete all but the last one.
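The two repair cases described above can be sketched with the standard org.w3c.dom API. This is only an illustration, not the WordHoard implementation: the class name HeadRepairSketch, the helpers parse, headChildren, and repairHead, and the reconstructed title format (capitalized "type" followed by "n") are all assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; names and title format are assumptions, not WordHoard code.
public class HeadRepairSketch {

    /** Parses an XML string into a DOM tree (helper for trying the sketch). */
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Collects the direct "head" child elements of a "div" element. */
    public static List<Element> headChildren(Element div) {
        List<Element> heads = new ArrayList<>();
        for (Node child = div.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && "head".equals(child.getNodeName())) {
                heads.add((Element) child);
            }
        }
        return heads;
    }

    /** Applies the missing-head and duplicate-head repairs to one "div". */
    public static void repairHead(Element div) {
        List<Element> heads = headChildren(div);
        if (heads.isEmpty()) {
            // Missing head: rebuild a title from "type" and "n".
            // The "Type n" format is an assumption for this sketch.
            String type = div.getAttribute("type");
            String n = div.getAttribute("n");
            String capitalized = type.isEmpty()
                ? ""
                : Character.toUpperCase(type.charAt(0)) + type.substring(1);
            div.setAttribute("head", (capitalized + " " + n).trim());
        } else {
            // More than one head: delete all but the last one.
            for (int i = 0; i < heads.size() - 1; i++) {
                div.removeChild(heads.get(i));
            }
        }
    }
}
```

Under these assumptions, calling repairHead on a "div" with type="scene" and n="2" and no "head" child sets its "head" attribute to "Scene 2", while a "div" with two "head" children keeps only the second.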

We normalize all act and scene titles to use arabic numerals instead of roman numerals, mixed case instead of upper case, and no period at the end of the title.
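The normalization rules above can be illustrated with a small standalone sketch. It is not the code used by ShaTitle: the class TitleNormalizerSketch and its methods are hypothetical, it assumes titles are whitespace-separated words such as "ACT IV.", and it only converts roman numerals that appear after the first word and use the letters I, V, X, L, and C.

```java
import java.util.Locale;

// Hypothetical sketch of the title normalization; not WordHoard code.
public class TitleNormalizerSketch {

    /** Value of a single roman digit, or -1 if the character is not one. */
    private static int digit(char c) {
        switch (c) {
            case 'I': return 1;
            case 'V': return 5;
            case 'X': return 10;
            case 'L': return 50;
            case 'C': return 100;
            default:  return -1;
        }
    }

    /** Converts an upper-case roman numeral, or returns -1 if it is not one. */
    public static int romanToInt(String token) {
        if (token.isEmpty()) {
            return -1;
        }
        int total = 0;
        for (int i = 0; i < token.length(); i++) {
            int value = digit(token.charAt(i));
            if (value < 0) {
                return -1;
            }
            int next = i + 1 < token.length() ? digit(token.charAt(i + 1)) : -1;
            // A smaller digit before a larger one is subtracted (IV = 4).
            total += next > value ? -value : value;
        }
        return total;
    }

    /** Applies the three rules: no trailing period, mixed case, arabic numerals. */
    public static String normalizeTitle(String title) {
        String trimmed = title.trim();
        while (trimmed.endsWith(".")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        if (trimmed.isEmpty()) {
            return trimmed;
        }
        String[] words = trimmed.split("\\s+");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            String upper = words[i].toUpperCase(Locale.ROOT);
            // Only words after the first are treated as possible numerals.
            int value = i > 0 ? romanToInt(upper) : -1;
            if (i > 0) {
                out.append(' ');
            }
            if (value > 0) {
                out.append(value);
            } else {
                out.append(upper.charAt(0))
                   .append(words[i].substring(1).toLowerCase(Locale.ROOT));
            }
        }
        return out.toString();
    }
}
```

With this sketch, normalizeTitle("ACT IV.") yields "Act 4" and normalizeTitle("SCENE II") yields "Scene 2". Restricting numeral conversion to words after the first is a simplification that keeps an ordinary word at the start of a title from being misread as a number.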

For the poem "The Rape of Lucrece" we change the title of the first part from "Introduction" to "Argument".


Constructor Summary
ShaTitle()
           
 
Method Summary
 void fix(java.lang.String corpusTag, java.lang.String workTag, org.w3c.dom.Document document)
          Fixes an XML DOM tree.
 
Methods inherited from class edu.northwestern.at.wordhoard.tools.fixers.Fixer
enableLogMessages, log
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ShaTitle

public ShaTitle()

Method Detail

fix

public void fix(java.lang.String corpusTag,
                java.lang.String workTag,
                org.w3c.dom.Document document)
         throws java.lang.Exception
Fixes an XML DOM tree.

Specified by:
fix in class Fixer
Parameters:
corpusTag - Corpus tag.
workTag - Work tag.
document - XML DOM tree.
Throws:
java.lang.Exception