public class CastList
- extends Fixer
Fixes cast list elements.
There are so many problems with cast lists that the only real solution
is to go through all of them by hand and fix the source to generate
reasonable Dramatis Personae pages.
We do what we can here to clean up the most egregious messes.
- A number of basic typos are fixed. E.g., "Groomsetc." -> "Grooms etc."
- If there is more than one "castList" child element of
"TEI.2/text/front/div/", we issue a warning message that only the
first one is used.
- We issue a warning for "castGroup" elements which have no "head" child.
- Cast group titles have their first characters mapped to upper case
if necessary, and trailing periods are removed unless the last word is "etc."
- Roles have their first characters mapped to upper case if necessary,
and trailing periods are removed unless the last word is "etc."
- Leading commas and space characters are removed from role descriptions.
- Runs of multiple space characters are replaced by a single space
character in role descriptions.
- Multiple role descriptions are combined into a single role description.
- Trailing periods are removed from role descriptions unless the last word is "etc."
- If a cast item has a role description but no role, the first character
of the role description is mapped to upper case if necessary.
- We issue a warning for cast items which have neither a role nor a role description.
- Several cast items with type "list" are hacked to produce something
that is at least readable. In some cases, an existing role description
is changed. In other cases, where there is no role description,
a "roleDesc" attribute is added to the "castItem" element.
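
The text clean-up rules in the list above can be illustrated with a minimal sketch. This is not the code of CastList itself: the class name CastTextSketch and the method names normalizeTitle and cleanRoleDescription are hypothetical, and the sketch assumes the rules are applied to plain strings already extracted from the DOM.

```java
// Hypothetical sketch of the text rules described above; not the actual CastList code.
final class CastTextSketch {

    /** Removes one trailing period unless the last word is "etc." */
    static String stripTrailingPeriod(String text) {
        String s = text.trim();
        String lastWord = s.substring(s.lastIndexOf(' ') + 1);
        if (s.endsWith(".") && !lastWord.equals("etc.")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    /** Maps the first character to upper case if necessary. */
    static String upperFirst(String text) {
        if (text.isEmpty()) {
            return text;
        }
        return Character.toUpperCase(text.charAt(0)) + text.substring(1);
    }

    /** Rule for roles and cast group titles: strip the period, then capitalize. */
    static String normalizeTitle(String text) {
        return upperFirst(stripTrailingPeriod(text));
    }

    /** Rule for role descriptions: drop leading commas and spaces, collapse runs of spaces. */
    static String cleanRoleDescription(String text) {
        String s = text.replaceAll("^[, ]+", ""); // leading commas and spaces
        s = s.replaceAll(" {2,}", " ");           // runs of spaces become one space
        return stripTrailingPeriod(s);
    }

    public static void main(String[] args) {
        System.out.println(normalizeTitle("servants to Capulet."));          // Servants to Capulet
        System.out.println(normalizeTitle("lords, grooms etc."));            // Lords, grooms etc.
        System.out.println(cleanRoleDescription(", son  to   Montague."));   // son to Montague
    }
}
```

The last word is compared against "etc." exactly, so a title such as "Grooms etc." keeps its period while an ordinary sentence-final period is dropped.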
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
public void fix(java.lang.String corpusTag,
- Fixes an XML DOM tree.
- Specified by:
fix in class
corpusTag - Corpus tag.
workTag - Work tag.
document - XML DOM tree.
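
As an illustration of the kind of check the class description mentions (a warning when more than one "castList" element appears under "TEI.2/text/front/div/"), here is a hedged sketch. It assumes the document parameter is an org.w3c.dom.Document, which this page does not state; the class CastListCountSketch and its methods are hypothetical and use only standard JAXP calls.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch; assumes "document" is an org.w3c.dom.Document.
final class CastListCountSketch {

    /** Counts castList elements that are children of TEI.2/text/front/div. */
    static int countCastLists(Document document) throws Exception {
        NodeList lists = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate("/TEI.2/text/front/div/castList", document, XPathConstants.NODESET);
        return lists.getLength();
    }

    /** Parses an XML string into a DOM tree (helper for the example only). */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<TEI.2><text><front><div><castList/><castList/></div></front></text></TEI.2>");
        int count = countCastLists(doc);
        if (count > 1) {
            // Mirrors the warning described in the class comment.
            System.err.println("Warning: " + count
                + " castList elements found; only the first one is used.");
        }
    }
}
```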