edu.northwestern.at.wordhoard.swing.calculator.analysis
Class CompareTexts

java.lang.Object
  extended by edu.northwestern.at.wordhoard.swing.calculator.analysis.FrequencyAnalysisRunnerBase
      extended by edu.northwestern.at.wordhoard.swing.calculator.analysis.CompareTexts
All Implemented Interfaces:
AnalysisRunner

public class CompareTexts
extends FrequencyAnalysisRunnerBase
implements AnalysisRunner

Generates measures of similarity between two text sets.


Field Summary
 
Fields inherited from class edu.northwestern.at.wordhoard.swing.calculator.analysis.FrequencyAnalysisRunnerBase
adjustChiSquareForMultipleComparisons, analysisText, analysisTextBreakdownBy, analyzePhraseFrequencies, associationMeasure, blankReplacementCharacter, collocationOccurrenceMap, colorCodeOveruseColumn, compressValueRangeInTagClouds, contextButton, cutoff, displayProgress, filterBigramsByWordClass, filterMultiwordUnitsContainingVerbs, filterOutProperNames, filterSingleOccurrences, filterTrigramsByWordClass, filterUsingLocalMaxs, FONT_SIZE, frequencyAnalysisType, frequencyNormalizationMethod, FrequencyProfileResults, ignoreCaseAndDiacriticalMarks, leftSpan, markSignificantLogLikelihoodValues, maximumMultiwordUnitLength, minimumCount, minimumMultiwordUnitLength, minimumWorkCount, model, percentReportMethod, pluralWordFormString, progressReporter, referenceText, referenceTextBreakdownBy, resultsPanel, resultsScrollPane, resultsTable, rightSpan, roundNormalizedFrequencies, showPhraseFrequencies, showWordClasses, tableSelectionListener, useShortWorkTitlesInDialogs, useShortWorkTitlesInHeaders, useShortWorkTitlesInOutput, useShortWorkTitlesInWindowTitles, wordForm, wordFormString, wordOccs, wordToAnalyze
 
Constructor Summary
CompareTexts()
          Create a text comparison object.
 
Method Summary
static double[] computeDocumentSimilarities(java.util.Map countMap1, java.util.Map countMap2)
          Compute document similarity measures given two count maps.
protected  ResultsPanel generateResults(WordHoardSortedTableModel model, java.lang.String maxLabel)
          Displays results of text comparison in a sorted table.
static java.lang.String getWordCounterLabel(WordCounter wordCounter, int breakdownBy)
          Get label for word counter.
 void runAnalysis(javax.swing.JFrame parentWindow, ProgressReporter progressReporter)
          Run an analysis.
 boolean similaritiesAreOK(double[] similarities)
          Determine if similarity values are OK.
 
Methods inherited from class edu.northwestern.at.wordhoard.swing.calculator.analysis.FrequencyAnalysisRunnerBase
areResultOptionsAvailable, closeProgressReporter, createCloudAssociationMeasuresComboBox, createCompressValueRangeInTagCloudsCheckBox, generateResults, getAnalysisPercentColumnName, getChart, getCloud, getCloud, getColTitleWordFormString, getContext, getDoubleFormat, getPercentReportMethodFormat, getReferencePercentColumnName, getResultOptions, getResults, getTableFontSize, getTitle, handleTableSelectionChange, isCancelled, isChartAvailable, isCloudAvailable, isContextAvailable, isFilterAvailable, saveChart, setContextButton, showDialog
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface edu.northwestern.at.wordhoard.swing.calculator.analysis.AnalysisRunner
areResultOptionsAvailable, getChart, getCloud, getContext, getResultOptions, getResults, handleTableSelectionChange, isChartAvailable, isCloudAvailable, isContextAvailable, isFilterAvailable, saveChart, setContextButton, showDialog
 

Constructor Detail

CompareTexts

public CompareTexts()
Create a text comparison object.

Method Detail

runAnalysis

public void runAnalysis(javax.swing.JFrame parentWindow,
                        ProgressReporter progressReporter)
Run an analysis.

Specified by:
runAnalysis in interface AnalysisRunner
Overrides:
runAnalysis in class FrequencyAnalysisRunnerBase
Parameters:
parentWindow - Parent window for dialogs in the analysis.
progressReporter - Progress display for analysis.

getWordCounterLabel

public static java.lang.String getWordCounterLabel(WordCounter wordCounter,
                                                   int breakdownBy)
Get label for word counter.

Parameters:
wordCounter - The word counter for the table entry.
breakdownBy - The method of breaking down the texts.
Returns:
The label for the word counter.

similaritiesAreOK

public boolean similaritiesAreOK(double[] similarities)
Determine if similarity values are OK.

Parameters:
similarities - The similarities to check.
Returns:
true if the similarities appear OK; false if they consist solely of NaN and 0.0 values, with at least one NaN.
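
The return condition above can be illustrated with a minimal sketch. It is not the WordHoard source: the class name `SimilarityCheckSketch` is hypothetical, and the logic is only one reading of the documented contract (reject an array only when every entry is NaN or 0.0 and at least one entry is NaN).

```java
// Hypothetical stand-in for the documented check; not taken from WordHoard.
final class SimilarityCheckSketch {
    /**
     * Returns false only when the array holds nothing but NaN and 0.0
     * values and contains at least one NaN; returns true otherwise.
     */
    static boolean similaritiesAreOK(double[] similarities) {
        boolean sawNaN = false;
        for (double s : similarities) {
            if (Double.isNaN(s)) {
                sawNaN = true;
            } else if (s != 0.0) {
                // A usable non-zero value exists, so the result is accepted.
                return true;
            }
        }
        // Only zeros and NaNs were seen: acceptable unless a NaN was present.
        return !sawNaN;
    }
}
```

Under this reading, an array of all zeros is accepted, because the documented rejection case requires at least one NaN.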

computeDocumentSimilarities

public static double[] computeDocumentSimilarities(java.util.Map countMap1,
                                                   java.util.Map countMap2)
Compute document similarity measures given two count maps.

Parameters:
countMap1 - First count map.
countMap2 - Second count map.
Returns:
Array of four doubles:
  [0] = cosine similarity
  [1] = Dice similarity
  [2] = Jaccard similarity
  [3] = overlap similarity
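
As a rough illustration of the four measures, the sketch below computes them from two word-count maps using the common vector-space formulations. The page does not state which formulas WordHoard uses (for example, set-based versus count-weighted), so the formulas, the class name `SimilaritySketch`, and the `Map<String, Integer>` key and value types are all assumptions. Only the order of the returned array follows the documentation.

```java
import java.util.Map;

// Hypothetical sketch; the formulas are assumed, not taken from WordHoard.
final class SimilaritySketch {
    /** Returns {cosine, Dice, Jaccard, overlap} for two word-count maps. */
    static double[] similarities(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;    // sum of count products over shared words
        double normA = 0.0;  // squared length of the first count vector
        double normB = 0.0;  // squared length of the second count vector

        for (Map.Entry<String, Integer> e : a.entrySet()) {
            double x = e.getValue();
            normA += x * x;
            Integer y = b.get(e.getKey());
            if (y != null) {
                dot += x * y;
            }
        }
        for (int y : b.values()) {
            normB += (double) y * y;
        }

        // Division by zero (for example, two empty maps) yields NaN here.
        return new double[] {
            dot / Math.sqrt(normA * normB),   // [0] cosine
            2.0 * dot / (normA + normB),      // [1] Dice
            dot / (normA + normB - dot),      // [2] Jaccard
            dot / Math.min(normA, normB)      // [3] overlap
        };
    }
}
```

With these formulas, two empty count maps produce NaN in every position, which is the kind of result that `similaritiesAreOK` is documented to reject.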

generateResults

protected ResultsPanel generateResults(WordHoardSortedTableModel model,
                                       java.lang.String maxLabel)
Displays results of text comparison in a sorted table.

Parameters:
model - Table model holding data to display.
maxLabel - Maximum width value for first table column.
Returns:
ResultsPanel with table and title.