edu.northwestern.at.wordhoard.tools
Class CloneData

java.lang.Object
  extended by edu.northwestern.at.wordhoard.tools.CloneData

public class CloneData
extends java.lang.Object

Clones the WordHoard raw data XML files.

This tool is used to make many copies of the WordHoard XML data to test scaling issues.

Usage:

CloneData inDir outDir nCopies

inDir = Input XML directory path.

outDir = Output XML directory path.

nCopies = Number of copies to clone.
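
The three arguments above could be validated along the following lines. This is only a sketch: the class CloneArgs, its fields, and the rule that nCopies must be at least 1 are assumptions made for illustration and are not taken from the tool's source.

```java
/** Hypothetical holder for the three command-line arguments (not part of WordHoard). */
final class CloneArgs {
    final String inDir;   // input XML directory path
    final String outDir;  // output XML directory path
    final int nCopies;    // number of copies to clone

    private CloneArgs(String inDir, String outDir, int nCopies) {
        this.inDir = inDir;
        this.outDir = outDir;
        this.nCopies = nCopies;
    }

    /** Parses "inDir outDir nCopies"; rejects anything else with a usage message. */
    static CloneArgs parse(String[] args) {
        if (args.length != 3) {
            throw new IllegalArgumentException("Usage: CloneData inDir outDir nCopies");
        }
        int n;
        try {
            n = Integer.parseInt(args[2]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("nCopies must be an integer: " + args[2]);
        }
        // Assumption: fewer than one copy is treated as an error.
        if (n < 1) {
            throw new IllegalArgumentException("nCopies must be at least 1");
        }
        return new CloneArgs(args[0], args[1], n);
    }
}
```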

Before running this tool, the output directory must exist, and it must contain copies of the "authors.xml" file, the "word-classes.xml" file, and the "pos" directory.
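
A caller might want to verify these prerequisites before launching the tool. The sketch below is one possible check using java.nio.file; the class OutputDirCheck and its method name are hypothetical, and the tool itself may check (or not check) these conditions differently.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-flight check for the output directory (not part of WordHoard). */
final class OutputDirCheck {

    /** Returns the names of the required entries that are missing; empty if all are present. */
    static List<String> missingPrerequisites(Path outDir) {
        List<String> missing = new ArrayList<>();
        // The output directory itself must already exist.
        if (!Files.isDirectory(outDir)) {
            missing.add(outDir.toString());
            return missing;
        }
        if (!Files.isRegularFile(outDir.resolve("authors.xml"))) {
            missing.add("authors.xml");
        }
        if (!Files.isRegularFile(outDir.resolve("word-classes.xml"))) {
            missing.add("word-classes.xml");
        }
        if (!Files.isDirectory(outDir.resolve("pos"))) {
            missing.add("pos");
        }
        return missing;
    }
}
```

An empty result means the directory satisfies the conditions listed above; otherwise the list names what still has to be copied in.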

The tool reads the input "corpora.xml" file and writes the output "corpora.xml" file containing the requested number of copies of all the corpora. For example, the "Shakespeare" corpus is cloned as "Shakespeare 1", "Shakespeare 2", etc.
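
The naming scheme for cloned corpora can be illustrated with a small helper. This is a sketch of the pattern described above, assuming the copy number is appended after a single space; CorpusNames and cloneTitles are hypothetical names, not the tool's own.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper showing how cloned corpus titles are formed (not part of WordHoard). */
final class CorpusNames {

    /** Returns "title 1", "title 2", ..., "title nCopies". */
    static List<String> cloneTitles(String title, int nCopies) {
        List<String> titles = new ArrayList<>();
        for (int copy = 1; copy <= nCopies; copy++) {
            titles.add(title + " " + copy);
        }
        return titles;
    }
}
```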

The tool reads the input "works" directory and writes the output "works" directory containing the requested number of copies of all the works.

The output directory of XML files contains no annotations, spellings, translations, or Benson glosses.

Tags (ids) for cloned objects are constructed by appending the copy number. For example, the tags for the copies of Hamlet are "ham-1", "ham-2", etc.
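
The tag rule can be sketched in the same way. The helper below assumes the original tag is joined to the copy number with a hyphen, as in the Hamlet example; CloneTags and cloneTag are hypothetical names used only for illustration.

```java
/** Hypothetical helper showing how cloned tags are formed (not part of WordHoard). */
final class CloneTags {

    /** Appends "-" and the copy number to the original tag. */
    static String cloneTag(String tag, int copy) {
        return tag + "-" + copy;
    }
}
```

With this helper, cloneTag("ham", 1) yields "ham-1" and cloneTag("ham", 2) yields "ham-2".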


Method Summary
static void main(java.lang.String[] args)
          The main program.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

main

public static void main(java.lang.String[] args)
The main program.

Parameters:
args - Command line arguments.