java.lang.Object
  edu.northwestern.at.utils.corpuslinguistics.Frequency
public class Frequency
Computes frequency-based statistics for comparing corpora.
| Constructor Summary | |
|---|---|
| protected | Frequency() Don't allow instantiation but do allow overrides. |
| Method Summary | |
|---|---|
| static double[] | logLikelihoodFrequencyComparison(int sampleCount, int refCount, int sampleSize, int refSize) Compute log-likelihood statistic for comparing frequencies in two corpora. |
| static double[] | logLikelihoodFrequencyComparison(int sampleCount, int refCount, int sampleSize, int refSize, boolean computeLLSig) Compute log-likelihood statistic for comparing frequencies in two corpora. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
protected Frequency()
| Method Detail |
|---|
public static double[] logLikelihoodFrequencyComparison(int sampleCount,
int refCount,
int sampleSize,
int refSize,
boolean computeLLSig)
Parameters:
sampleCount - Count of word/lemma appearances in the sample.
refCount - Count of word/lemma appearances in the reference corpus.
sampleSize - Total words/lemmas in the sample.
refSize - Total words/lemmas in the reference corpus.
computeLLSig - Whether to compute the significance of the log-likelihood.
The contents of the result array are as follows.
(0) Count of word/lemma appearance in sample.
(1) Percent of word/lemma appearance in sample.
(2) Count of word/lemma appearance in reference.
(3) Percent of word/lemma appearance in reference.
(4) Log-likelihood measure.
(5) Significance of log-likelihood.
Any value whose computation would require division by zero is set to zero.
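The following is a minimal, self-contained sketch of how the first five entries of such a result array could be filled. It assumes the log-likelihood formulation commonly used for corpus comparison (observed counts against expected counts derived from the pooled frequency); this page does not state the formula, so the actual implementation may differ. The class name LogLikelihoodSketch and the methods compare and term are hypothetical and are not part of this library. Entry (5), the significance, is left at zero here.

```java
// Hypothetical sketch; not the Frequency class documented on this page.
final class LogLikelihoodSketch {

    // One summand of the statistic: observed * ln(observed / expected).
    // Returns 0 when the term is undefined (zero count or zero expectation),
    // mirroring the documented rule that zero divides yield zero.
    static double term(double observed, double expected) {
        if (observed <= 0.0 || expected <= 0.0) {
            return 0.0;
        }
        return observed * Math.log(observed / expected);
    }

    // Fills a six-element array using the layout documented above.
    static double[] compare(int sampleCount, int refCount,
                            int sampleSize, int refSize) {
        double[] result = new double[6];
        result[0] = sampleCount;
        result[1] = sampleSize == 0 ? 0.0 : 100.0 * sampleCount / sampleSize;
        result[2] = refCount;
        result[3] = refSize == 0 ? 0.0 : 100.0 * refCount / refSize;

        double totalSize = (double) sampleSize + refSize;
        if (totalSize > 0.0) {
            // Assumed formulation: expected counts come from the pooled frequency.
            double pooled = ((double) sampleCount + refCount) / totalSize;
            double expectedSample = sampleSize * pooled;
            double expectedRef = refSize * pooled;
            result[4] = 2.0 * (term(sampleCount, expectedSample)
                             + term(refCount, expectedRef));
        }
        // result[5] (significance) is not computed in this sketch.
        return result;
    }

    public static void main(String[] args) {
        double[] r = compare(10, 5, 1000, 1000);
        System.out.printf("sample %%: %.3f, reference %%: %.3f, LL: %.4f%n",
                r[1], r[3], r[4]);
    }
}
```

Under this assumed formulation the statistic is zero when the word is equally frequent in both corpora and grows as the two relative frequencies diverge.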
public static double[] logLikelihoodFrequencyComparison(int sampleCount,
int refCount,
int sampleSize,
int refSize)
Parameters:
sampleCount - Count of word/lemma appearances in the sample.
refCount - Count of word/lemma appearances in the reference corpus.
sampleSize - Total words/lemmas in the sample.
refSize - Total words/lemmas in the reference corpus.
The contents of the result array are as follows.
(0) Count of word/lemma appearance in sample.
(1) Percent of word/lemma appearance in sample.
(2) Count of word/lemma appearance in reference.
(3) Percent of word/lemma appearance in reference.
(4) Log-likelihood measure.
(5) Significance of log-likelihood.
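Because the result is a bare double[] addressed by position, callers may find it easier to read with named index constants. The sketch below shows one way to do that; the class FrequencyResultSketch, its constants, and the direction helper are hypothetical and are not provided by this library. The array in main is a hand-built stand-in with the documented layout, not output from logLikelihoodFrequencyComparison.

```java
// Hypothetical helper for reading the documented six-element result layout.
final class FrequencyResultSketch {

    static final int SAMPLE_COUNT = 0;
    static final int SAMPLE_PERCENT = 1;
    static final int REF_COUNT = 2;
    static final int REF_PERCENT = 3;
    static final int LOG_LIKELIHOOD = 4;
    static final int SIGNIFICANCE = 5;

    // Compares the two percentages to report which corpus uses the word more.
    static String direction(double[] result) {
        if (result[SAMPLE_PERCENT] > result[REF_PERCENT]) {
            return "more frequent in sample";
        }
        if (result[SAMPLE_PERCENT] < result[REF_PERCENT]) {
            return "more frequent in reference";
        }
        return "equally frequent";
    }

    public static void main(String[] args) {
        // Stand-in values following the documented layout (illustrative only).
        double[] result = {10, 1.0, 5, 0.5, 1.699, 0.0};
        System.out.println(direction(result)
                + ", log-likelihood = " + result[LOG_LIKELIHOOD]);
    }
}
```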