| Interface Summary | |
|---|---|
| Pretokenizer | Prepares a string for tokenization. |
| StringSimilarity | Interface defining a method for computing string similarity. |

| Class Summary | |
|---|---|
| BigramLogLikelihood | Computes Dunning's log-likelihood for bigrams. |
| Collocation | Computes bigram collocation measures. |
| DefaultPretokenizer | Prepares a string for tokenization. |
| DoubleMetaphone | Implements the Double Metaphone phonetic encoding algorithm. Based on an implementation by Ed Parrish, obtained from http://www.cse.ucsc.edu/~eparrish/toolbox/search.html |
| FileTokenizer | Tokenizes a text file. |
| Frequency | Computes frequency-based statistics for comparing corpora. |
| LevensteinDistance | Computes the Levenshtein edit distance between two strings. |
| NGramExtractor | Extracts n-grams from text. |
| Soundex | Implements the Soundex algorithm. |
| WordCountExtractor | Counts words in a text. |

Methods and interfaces for corpus linguistics, including comparative frequency analysis and collocation.
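
As an illustration of the kind of computation the edit-distance entry in the class summary describes, the following is a minimal, self-contained sketch of the standard dynamic-programming Levenshtein distance. It does not use or reproduce this package's API: the class name `EditDistanceSketch` and the method `distance` are hypothetical names chosen for the example, and the package's own `LevensteinDistance` class may expose different signatures and behavior.

```java
// Hypothetical illustration only; not part of this package.
public class EditDistanceSketch {

    /**
     * Returns the minimum number of single-character insertions,
     * deletions, and substitutions needed to turn a into b.
     */
    public static int distance(String a, String b) {
        // prev holds the previous row of the DP table, curr the row being filled.
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Turning the empty prefix of a into the first j characters of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i characters of a into the empty string takes i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows so that prev always refers to the most recently completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
    }
}
```

Keeping only two rows of the table holds memory proportional to the length of the second string; a full table would be needed only if the edit sequence itself had to be recovered.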