edu.northwestern.at.wordhoard.swing.calculator.analysis
Class FrequencyAnalysisRunnerBase

java.lang.Object
  extended by edu.northwestern.at.wordhoard.swing.calculator.analysis.FrequencyAnalysisRunnerBase
All Implemented Interfaces:
AnalysisRunner
Direct Known Subclasses:
CollocateFrequencies, CompareMultipleWordFrequencies, CompareSingleWordFrequencies, CompareTexts, FindMultiwordUnits, TrackWordOverTime, WordFrequencies

public class FrequencyAnalysisRunnerBase
extends java.lang.Object
implements AnalysisRunner

Base class for frequency analyses.


Field Summary
protected  boolean adjustChiSquareForMultipleComparisons
          True to adjust chi-square values for number of comparisons.
protected  WordCounter analysisText
          Analysis text.
protected  int analysisTextBreakdownBy
          Analysis text breakdown method.
protected  boolean analyzePhraseFrequencies
          True to compare phrase counts instead of word counts.
protected  int associationMeasure
          Association measure for localmaxs.
protected  java.lang.String blankReplacementCharacter
          Blank replacement character in tag clouds.
protected  java.util.TreeMap collocationOccurrenceMap
          TreeMap of word occurrences for each collocate.
protected  boolean colorCodeOveruseColumn
          True to use color coding for overuse table columns.
protected  boolean compressValueRangeInTagClouds
          True to compress value range in tag clouds.
protected  javax.swing.JButton contextButton
          The context button, if any.
protected  int cutoff
          Minimum number of times collocate word must appear.
protected  boolean displayProgress
          True to display progress dialog.
protected  boolean filterBigramsByWordClass
          Filter bigrams by word class flag.
protected  boolean filterMultiwordUnitsContainingVerbs
          Filter multiword units containing verbs.
protected  boolean filterOutProperNames
          True to filter out proper names.
 boolean filterSingleOccurrences
          Filter ngrams which occur only once.
protected  boolean filterTrigramsByWordClass
          Filter trigrams by word class flag.
 boolean filterUsingLocalMaxs
          Filter ngrams using localmaxs.
protected static int FONT_SIZE
          Font size for table.
protected  int frequencyAnalysisType
          Type of frequency analysis.
protected  int frequencyNormalizationMethod
          Normalization method for frequencies.
protected  java.util.ArrayList FrequencyProfileResults
          Holds frequency analysis results.
 boolean ignoreCaseAndDiacriticalMarks
          Ignore case and diacritical marks.
protected  int leftSpan
          Number of words to left of word to look for collocates.
protected  boolean markSignificantLogLikelihoodValues
          True to mark significant log-likelihood values in tabular display.
protected  int maximumMultiwordUnitLength
          Maximum multiword unit length.
protected  int minimumCount
          Minimum count for word to be analyzed.
protected  int minimumMultiwordUnitLength
          Minimum multiword unit length.
protected  int minimumWorkCount
          Minimum work count for word to be analyzed.
protected  WordHoardSortedTableModel model
          WordHoardSortedTableModel for holding results.
protected  int percentReportMethod
          Percent report method.
protected  java.lang.String pluralWordFormString
          Displayable plural word form type.
protected  ProgressReporter progressReporter
          Progress reporter.
protected  WordCounter referenceText
          Reference text.
protected  int referenceTextBreakdownBy
          Reference text breakdown method.
protected  ResultsPanel resultsPanel
          The results panel.
protected  XScrollPane resultsScrollPane
          The scroll pane around the results table.
protected  XTable resultsTable
          The results table.
protected  int rightSpan
          Number of words to right of word to look for collocates.
protected  boolean roundNormalizedFrequencies
          True to round normalized frequencies.
protected  boolean showPhraseFrequencies
          True to display phrase counts instead of word counts.
protected  boolean showWordClasses
          True to display word classes for all words.
protected  javax.swing.event.ListSelectionListener tableSelectionListener
          Watch for selection changes in results table.
protected  boolean useShortWorkTitlesInDialogs
          True to use short work names in dialogs.
protected  boolean useShortWorkTitlesInHeaders
          True to use short work names in headers.
protected  boolean useShortWorkTitlesInOutput
          True to use short work names in output.
protected  boolean useShortWorkTitlesInWindowTitles
          True to use short work names in window titles.
protected  int wordForm
          Word form type.
protected  java.lang.String wordFormString
          Displayable word form type.
protected  Word[] wordOccs
          Word occurrences for a collocation analysis.
protected  Spelling wordToAnalyze
          Word to analyze (spelling, lemma, etc.).
 
Constructor Summary
FrequencyAnalysisRunnerBase(int frequencyAnalysisType)
          Create a single word form frequency profile object.
 
Method Summary
 boolean areResultOptionsAvailable()
          Are result options available?
 boolean closeProgressReporter()
          Close progress reporter.
 javax.swing.JComboBox createCloudAssociationMeasuresComboBox()
          Create cloud association measures combobox.
 javax.swing.JCheckBox createCompressValueRangeInTagCloudsCheckBox()
          Create compress cloud value range checkbox result option.
protected  ResultsPanel generateResults(Spelling wordToAnalyze, java.lang.String title, java.lang.String shortTitle, java.lang.String[] columnLongValues, java.lang.String[] columnFormats, int initialSortColumn, int logLikelihoodColumn, int wordClassColumn, WordHoardSortedTableModel model, java.lang.String[] maxColumnValues)
          Displays results of analysis in a sorted table.
 java.lang.String getAnalysisPercentColumnName()
          Get analysis text percent column name.
 ResultsPanel getChart()
          Chart results.
 ResultsPanel getCloud()
          Cloud results.
 ResultsPanel getCloud(java.lang.String title, java.lang.String[] headers, boolean compressRange, int scoreCol, int overUseCol, int wordClassCol)
          Show tag cloud of association measure.
 java.lang.String getColTitleWordFormString(java.lang.String wordFormString)
          Convert word form string to column title ready string.
 ResultsPanel getContext(javax.swing.JFrame parentWindow, ProgressReporter progressReporter)
          Get context results.
 java.lang.String getDoubleFormat(int decimalPlaces)
          Get format for double value in table.
 java.lang.String getPercentReportMethodFormat()
          Get format for percent report method.
 java.lang.String getReferencePercentColumnName()
          Get reference text percent column name.
 LabeledColumn getResultOptions()
          Result options.
 ResultsPanel getResults()
          Get results.
static int getTableFontSize()
          Get table font size.
 java.lang.String getTitle(WordCounter wordCounter, boolean useShortTitle)
          Get title for a word counter.
 void handleTableSelectionChange(javax.swing.event.ListSelectionEvent event)
          Handle selection change in results table.
 boolean isCancelled(ProgressReporter progressReporter)
          Check if cancelled flag set in a progress reporter.
 boolean isChartAvailable()
          Is chart output available?
 boolean isCloudAvailable()
          Is cloud output available?
 boolean isContextAvailable()
          Is context output available?
 boolean isFilterAvailable()
          Is output filter available?
 void runAnalysis(javax.swing.JFrame parentWindow, ProgressReporter progressReporter)
          Run analysis.
 void saveChart()
          Saves chart.
 void setContextButton(javax.swing.JButton contextButton)
          Set the context button.
 boolean showDialog(javax.swing.JFrame parentFrame)
          Display the frequency analysis dialog.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

frequencyAnalysisType

protected int frequencyAnalysisType
Type of frequency analysis.


analysisText

protected WordCounter analysisText
Analysis text.


referenceText

protected WordCounter referenceText
Reference text.


minimumCount

protected int minimumCount
Minimum count for word to be analyzed.


minimumWorkCount

protected int minimumWorkCount
Minimum work count for word to be analyzed.


FrequencyProfileResults

protected java.util.ArrayList FrequencyProfileResults
Holds frequency analysis results.


wordForm

protected int wordForm
Word form type.


wordFormString

protected java.lang.String wordFormString
Displayable word form type.


pluralWordFormString

protected java.lang.String pluralWordFormString
Displayable plural word form type.


wordToAnalyze

protected Spelling wordToAnalyze
Word to analyze (spelling, lemma, etc.).


frequencyNormalizationMethod

protected int frequencyNormalizationMethod
Normalization method for frequencies.


roundNormalizedFrequencies

protected boolean roundNormalizedFrequencies
True to round normalized frequencies.
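
As an illustration of how a normalized frequency and the rounding flag might interact, here is a minimal sketch. It assumes normalization means scaling a raw count to a rate per fixed number of words. This page does not list the methods selectable through frequencyNormalizationMethod, so the class name, method name, and the base parameter are all hypothetical.

```java
final class NormalizedFrequencySketch {

    /**
     * Scales a raw count to a rate per "base" words (hypothetical helper).
     *
     * @param count      raw occurrences of the word
     * @param totalWords total words in the text (must be positive)
     * @param base       normalization base, e.g. 10000 for "per 10,000 words"
     * @param round      true to round the result to the nearest whole number
     */
    static double normalize(long count, long totalWords, long base, boolean round) {
        if (totalWords <= 0) {
            throw new IllegalArgumentException("totalWords must be positive");
        }
        double normalized = (double) count * base / totalWords;
        return round ? (double) Math.round(normalized) : normalized;
    }

    public static void main(String[] args) {
        // 25 occurrences in a 50,000-word text, reported per 10,000 words.
        System.out.println(normalize(25, 50_000, 10_000, false));
    }
}
```

With round set to true, fractional rates collapse to whole numbers, which is easier to read in a table but hides small differences between rare words.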


percentReportMethod

protected int percentReportMethod
Percent report method.


markSignificantLogLikelihoodValues

protected boolean markSignificantLogLikelihoodValues
True to mark significant log-likelihood values in tabular display.
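
For orientation, the log-likelihood statistic commonly used to compare a word's frequency in an analysis text against a reference text can be sketched as follows. This is the standard two-term G2 formula from corpus linguistics, not code taken from WordHoard. The class and method names are hypothetical, and the 3.84 threshold (the chi-square critical value for p < 0.05 with one degree of freedom) is only an assumption about what "significant" might mean here.

```java
final class LogLikelihoodSketch {

    /** Chi-square critical value for p < 0.05 with 1 degree of freedom (assumed threshold). */
    static final double CRITICAL_05 = 3.84;

    /** Two-corpus log-likelihood (G2) for one word. */
    static double logLikelihood(long analysisCount, long analysisTotal,
                                long referenceCount, long referenceTotal) {
        double combinedCount = (double) analysisCount + referenceCount;
        double combinedTotal = (double) analysisTotal + referenceTotal;
        // Expected counts if the word were equally frequent in both texts.
        double expectedAnalysis = analysisTotal * combinedCount / combinedTotal;
        double expectedReference = referenceTotal * combinedCount / combinedTotal;
        return 2.0 * (term(analysisCount, expectedAnalysis)
                    + term(referenceCount, expectedReference));
    }

    /** observed * ln(observed / expected), with the 0 * ln(0) case taken as 0. */
    private static double term(long observed, double expected) {
        return observed == 0 ? 0.0 : observed * Math.log(observed / expected);
    }

    static boolean isSignificant(double g2) {
        return g2 >= CRITICAL_05;
    }

    public static void main(String[] args) {
        double g2 = logLikelihood(30, 1000, 10, 1000);
        System.out.println(g2 + " significant=" + isSignificant(g2));
    }
}
```

A word that is equally frequent in both texts scores 0. The score grows as the observed counts diverge from the expected ones, regardless of which text overuses the word.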


filterOutProperNames

protected boolean filterOutProperNames
True to filter out proper names.


showWordClasses

protected boolean showWordClasses
True to display word classes for all words.


showPhraseFrequencies

protected boolean showPhraseFrequencies
True to display phrase counts instead of word counts.


analyzePhraseFrequencies

protected boolean analyzePhraseFrequencies
True to compare phrase counts instead of word counts.


leftSpan

protected int leftSpan
Number of words to left of word to look for collocates.


rightSpan

protected int rightSpan
Number of words to right of word to look for collocates.


cutoff

protected int cutoff
Minimum number of times collocate word must appear.
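
The interplay of leftSpan, rightSpan, and cutoff can be illustrated with a small sketch. It assumes a plain array of tokens, counts every word inside the window around each occurrence of the target word, and then drops collocates seen fewer than cutoff times. The class and method names are hypothetical, and WordHoard's own collocation code, which works on Word objects, may differ.

```java
import java.util.Map;
import java.util.TreeMap;

final class CollocateWindowSketch {

    /** Counts words within the span around each occurrence of target. */
    static Map<String, Integer> collocates(String[] tokens, String target,
                                           int leftSpan, int rightSpan, int cutoff) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(target)) {
                continue;
            }
            // Clamp the window to the bounds of the token array.
            int from = Math.max(0, i - leftSpan);
            int to = Math.min(tokens.length - 1, i + rightSpan);
            for (int j = from; j <= to; j++) {
                if (j != i) {
                    counts.merge(tokens[j], 1, Integer::sum);
                }
            }
        }
        // Keep only collocates that appear at least "cutoff" times.
        counts.values().removeIf(n -> n < cutoff);
        return counts;
    }

    public static void main(String[] args) {
        String[] tokens = "the king and the queen saw the king".split(" ");
        System.out.println(collocates(tokens, "king", 1, 1, 2));
    }
}
```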


wordOccs

protected Word[] wordOccs
Word occurrences for a collocation analysis.


collocationOccurrenceMap

protected java.util.TreeMap collocationOccurrenceMap
TreeMap of word occurrences for each collocate.


analysisTextBreakdownBy

protected int analysisTextBreakdownBy
Analysis text breakdown method.


referenceTextBreakdownBy

protected int referenceTextBreakdownBy
Reference text breakdown method.


associationMeasure

protected int associationMeasure
Association measure for localmaxs.


minimumMultiwordUnitLength

protected int minimumMultiwordUnitLength
Minimum multiword unit length.


maximumMultiwordUnitLength

protected int maximumMultiwordUnitLength
Maximum multiword unit length.


filterBigramsByWordClass

protected boolean filterBigramsByWordClass
Filter bigrams by word class flag.


filterTrigramsByWordClass

protected boolean filterTrigramsByWordClass
Filter trigrams by word class flag.


filterMultiwordUnitsContainingVerbs

protected boolean filterMultiwordUnitsContainingVerbs
Filter multiword units containing verbs.


filterUsingLocalMaxs

public boolean filterUsingLocalMaxs
Filter ngrams using localmaxs.


filterSingleOccurrences

public boolean filterSingleOccurrences
Filter ngrams which occur only once.


ignoreCaseAndDiacriticalMarks

public boolean ignoreCaseAndDiacriticalMarks
Ignore case and diacritical marks.
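
One common way to ignore case and diacritical marks in Java is to decompose the string, strip combining marks, and lower-case the result. The sketch below uses the standard java.text.Normalizer class. Whether WordHoard folds strings this way is not stated on this page, so treat the class and method names as hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;

final class FoldingSketch {

    /** Returns s without diacritical marks, in lower case. */
    static String fold(String s) {
        // NFD splits accented letters into a base letter plus combining marks.
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        // \p{M} matches the combining marks left behind by the decomposition.
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "Caf\u00E9" is "Cafe" with an acute accent on the final letter.
        System.out.println(fold("Caf\u00E9"));
    }
}
```

Two spellings that differ only in case or accents fold to the same key, so their counts can be merged before frequencies are compared.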


model

protected WordHoardSortedTableModel model
WordHoardSortedTableModel for holding results.


displayProgress

protected boolean displayProgress
True to display progress dialog.


progressReporter

protected ProgressReporter progressReporter
Progress reporter.


resultsPanel

protected ResultsPanel resultsPanel
The results panel.


contextButton

protected javax.swing.JButton contextButton
The context button, if any.


resultsTable

protected XTable resultsTable
The results table.


resultsScrollPane

protected XScrollPane resultsScrollPane
The scroll pane around the results table.


colorCodeOveruseColumn

protected boolean colorCodeOveruseColumn
True to use color coding for overuse table columns.


adjustChiSquareForMultipleComparisons

protected boolean adjustChiSquareForMultipleComparisons
True to adjust chi-square values for number of comparisons.


FONT_SIZE

protected static final int FONT_SIZE
Font size for table.

See Also:
Constant Field Values

useShortWorkTitlesInDialogs

protected boolean useShortWorkTitlesInDialogs
True to use short work names in dialogs.


useShortWorkTitlesInOutput

protected boolean useShortWorkTitlesInOutput
True to use short work names in output.


useShortWorkTitlesInWindowTitles

protected boolean useShortWorkTitlesInWindowTitles
True to use short work names in window titles.


useShortWorkTitlesInHeaders

protected boolean useShortWorkTitlesInHeaders
True to use short work names in headers.


compressValueRangeInTagClouds

protected boolean compressValueRangeInTagClouds
True to compress value range in tag clouds.


blankReplacementCharacter

protected java.lang.String blankReplacementCharacter
Blank replacement character in tag clouds. Currently a raised dot (Unicode U+00B7, ·).
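
A minimal sketch of why such a replacement is useful: a multiword phrase shown in a tag cloud stays visually a single tag when its blanks are replaced by a raised dot. The class and method names are hypothetical and only illustrate the idea.

```java
final class BlankReplacementSketch {

    /** Raised dot (middle dot), U+00B7. */
    static final String RAISED_DOT = "\u00B7";

    /** Replaces each blank in a phrase so it renders as one tag. */
    static String tagLabel(String phrase) {
        return phrase.trim().replace(" ", RAISED_DOT);
    }

    public static void main(String[] args) {
        System.out.println(tagLabel("to be or"));
    }
}
```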


tableSelectionListener

protected javax.swing.event.ListSelectionListener tableSelectionListener
Watch for selection changes in results table.

Constructor Detail

FrequencyAnalysisRunnerBase

public FrequencyAnalysisRunnerBase(int frequencyAnalysisType)
Create a single word form frequency profile object.

Parameters:
frequencyAnalysisType - Type of frequency analysis.
Method Detail

showDialog

public boolean showDialog(javax.swing.JFrame parentFrame)
Display the frequency analysis dialog.

Specified by:
showDialog in interface AnalysisRunner
Parameters:
parentFrame - The parent window for the dialog.
Returns:
true if OK pressed in dialog, false otherwise.

runAnalysis

public void runAnalysis(javax.swing.JFrame parentWindow,
                        ProgressReporter progressReporter)
Run analysis.

Specified by:
runAnalysis in interface AnalysisRunner
Parameters:
parentWindow - Parent window for dialogs in the analysis.
progressReporter - Progress display for analysis.

closeProgressReporter

public boolean closeProgressReporter()
Close progress reporter.

Returns:
The value of the progress reporter's cancelled flag.

getResults

public ResultsPanel getResults()
Get results.

Specified by:
getResults in interface AnalysisRunner
Returns:
ResultsPanel containing the analysis results.

generateResults

protected ResultsPanel generateResults(Spelling wordToAnalyze,
                                       java.lang.String title,
                                       java.lang.String shortTitle,
                                       java.lang.String[] columnLongValues,
                                       java.lang.String[] columnFormats,
                                       int initialSortColumn,
                                       int logLikelihoodColumn,
                                       int wordClassColumn,
                                       WordHoardSortedTableModel model,
                                       java.lang.String[] maxColumnValues)
Displays results of analysis in a sorted table.

Parameters:
wordToAnalyze - The word being analyzed.
title - Long results title.
shortTitle - Short results title.
columnLongValues - Column long values to set column widths.
columnFormats - Column formats for results.
initialSortColumn - Results sorted by this column.
logLikelihoodColumn - Column containing log-likelihood values.
wordClassColumn - Column containing word classes.
model - Table model holding results.
maxColumnValues - Maximum width values for table columns. If the number of entries k is less than the number of table columns, only the first k column widths are set.
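
The "first k columns" rule for maxColumnValues can be sketched as below. The sketch uses the character length of each string as a stand-in for a rendered width, which is an assumption made to keep the example self-contained. Real table code would more likely measure the string with font metrics. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class ColumnWidthSketch {

    /** Sets widths for the first k columns only, where k = maxColumnValues.length. */
    static int[] applyMaxColumnValues(int[] columnWidths, String[] maxColumnValues) {
        int[] result = columnWidths.clone();
        // Never index past either array: extra columns keep their current width.
        int k = Math.min(result.length, maxColumnValues.length);
        for (int i = 0; i < k; i++) {
            result[i] = maxColumnValues[i].length(); // stand-in for a rendered width
        }
        return result;
    }

    public static void main(String[] args) {
        int[] widths = applyMaxColumnValues(new int[] {5, 5, 5},
                                            new String[] {"abcdefgh", "ab"});
        System.out.println(Arrays.toString(widths));
    }
}
```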

getTableFontSize

public static int getTableFontSize()
Get table font size.

Returns:
table font size in pixels.

isChartAvailable

public boolean isChartAvailable()
Is chart output available?

Specified by:
isChartAvailable in interface AnalysisRunner
Returns:
true if chart output available, false otherwise.

isCloudAvailable

public boolean isCloudAvailable()
Is cloud output available?

Specified by:
isCloudAvailable in interface AnalysisRunner
Returns:
true if cloud output available, false otherwise.

isFilterAvailable

public boolean isFilterAvailable()
Is output filter available?

Specified by:
isFilterAvailable in interface AnalysisRunner
Returns:
true if output filter available, false otherwise.

areResultOptionsAvailable

public boolean areResultOptionsAvailable()
Are result options available?

Specified by:
areResultOptionsAvailable in interface AnalysisRunner
Returns:
true if result options are available, false otherwise.

isContextAvailable

public boolean isContextAvailable()
Is context output available?

Specified by:
isContextAvailable in interface AnalysisRunner
Returns:
true if context output available, false otherwise.

getChart

public ResultsPanel getChart()
Chart results.

Specified by:
getChart in interface AnalysisRunner
Returns:
ResultsPanel containing the chart.

getCloud

public ResultsPanel getCloud()
Cloud results.

Specified by:
getCloud in interface AnalysisRunner
Returns:
ResultsPanel containing the cloud.

getResultOptions

public LabeledColumn getResultOptions()
Result options.

Specified by:
getResultOptions in interface AnalysisRunner
Returns:
Result options as LabeledColumn.

createCompressValueRangeInTagCloudsCheckBox

public javax.swing.JCheckBox createCompressValueRangeInTagCloudsCheckBox()
Create compress cloud value range checkbox result option.

Returns:
JCheckBox for compress values option.

createCloudAssociationMeasuresComboBox

public javax.swing.JComboBox createCloudAssociationMeasuresComboBox()
Create cloud association measures combobox.

Returns:
JComboBox for cloud association measures.

getContext

public ResultsPanel getContext(javax.swing.JFrame parentWindow,
                               ProgressReporter progressReporter)
Get context results.

Specified by:
getContext in interface AnalysisRunner
Parameters:
parentWindow - Parent window for dialogs in the analysis.
progressReporter - Progress display for analysis.
Returns:
ResultsPanel containing the context results.

saveChart

public void saveChart()
Saves chart.

Specified by:
saveChart in interface AnalysisRunner

setContextButton

public void setContextButton(javax.swing.JButton contextButton)
Set the context button.

Specified by:
setContextButton in interface AnalysisRunner
Parameters:
contextButton - The context button.

handleTableSelectionChange

public void handleTableSelectionChange(javax.swing.event.ListSelectionEvent event)
Handle selection change in results table.

Specified by:
handleTableSelectionChange in interface AnalysisRunner
Parameters:
event - Table selection event.

getColTitleWordFormString

public java.lang.String getColTitleWordFormString(java.lang.String wordFormString)
Convert word form string to column title ready string.

Parameters:
wordFormString - The word form string ("spelling", "lemma", etc.)
Returns:
Version of word form string suitable for use as JTable column title.

getAnalysisPercentColumnName

public java.lang.String getAnalysisPercentColumnName()
Get analysis text percent column name.

Returns:
Column name for analysis percent.

getReferencePercentColumnName

public java.lang.String getReferencePercentColumnName()
Get reference text percent column name.

Returns:
Column name for reference percent.

getPercentReportMethodFormat

public java.lang.String getPercentReportMethodFormat()
Get format for percent report method.

Returns:
Format string for percent report method.

getDoubleFormat

public java.lang.String getDoubleFormat(int decimalPlaces)
Get format for double value in table.

Parameters:
decimalPlaces - Number of decimal places.
Returns:
Format string for a double value with the given number of decimal places.
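
A format helper of this kind might look like the following sketch. It assumes printf-style format strings as understood by java.util.Formatter. The format syntax WordHoard actually uses for its table columns is not shown on this page, so the class name and the returned pattern are illustrative only.

```java
import java.util.Locale;

final class DoubleFormatSketch {

    /** Builds a printf-style format with a fixed number of decimal places. */
    static String doubleFormat(int decimalPlaces) {
        if (decimalPlaces < 0) {
            throw new IllegalArgumentException("decimalPlaces must not be negative");
        }
        return "%." + decimalPlaces + "f";
    }

    public static void main(String[] args) {
        // Locale.ROOT keeps the decimal separator independent of the default locale.
        System.out.println(String.format(Locale.ROOT, doubleFormat(2), 3.14159));
    }
}
```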

getTitle

public java.lang.String getTitle(WordCounter wordCounter,
                                 boolean useShortTitle)
Get title for a word counter.

Parameters:
wordCounter - WordCounter whose title is desired.
useShortTitle - Return a short title.
Returns:
A long or short title.

isCancelled

public boolean isCancelled(ProgressReporter progressReporter)
Check if cancelled flag set in a progress reporter.

Parameters:
progressReporter - Progress reporter to check for cancel.
Returns:
true if the progress reporter's cancelled flag is set, false otherwise.

getCloud

public ResultsPanel getCloud(java.lang.String title,
                             java.lang.String[] headers,
                             boolean compressRange,
                             int scoreCol,
                             int overUseCol,
                             int wordClassCol)
Show tag cloud of association measure.

Parameters:
title - Cloud title.
headers - Header lines for cloud.
compressRange - Compress range of cloud tag values.
scoreCol - Tag score column in table.
overUseCol - Over/underuse column in table. -1 if none.
wordClassCol - Word class column in table. -1 if none.
Returns:
ResultsPanel containing the tag cloud.