edu.northwestern.at.utils.corpuslinguistics
Class Soundex

java.lang.Object
  extended by edu.northwestern.at.utils.corpuslinguistics.Soundex

public class Soundex
extends java.lang.Object

Soundex: Implements the Soundex Algorithm.

Soundex hashes words to a smaller space using a simple model that approximates how a native speaker of American English would pronounce the word. The hash is usually a four-character string in which the first character is an uppercase letter and the remaining characters are digits. Soundex was originally intended only for encoding surnames, but it occasionally finds other uses as well. The Soundex algorithm was devised and patented by Margaret K. Odell and Robert C. Russell in 1918.


Field Summary
static int MAXSOUNDEXLENGTH
           
static char[] US_ENGLISH_SOUNDEX_MAPPING
           
 
Constructor Summary
Soundex()
           
 
Method Summary
static java.lang.String soundex(java.lang.String s)
          Get Soundex code for a string.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

US_ENGLISH_SOUNDEX_MAPPING

public static final char[] US_ENGLISH_SOUNDEX_MAPPING
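This page does not list the contents of the array. As a rough illustration only, the sketch below assumes the standard American Soundex table, stored as one digit per letter from A to Z, with '0' marking letters that are not coded. The class name SoundexMappingSketch, the field ASSUMED_MAPPING, and the helper digitFor are hypothetical and are not part of this class.

```java
final class SoundexMappingSketch {
    // Assumption: standard American Soundex digits for 'A'..'Z'.
    // The real US_ENGLISH_SOUNDEX_MAPPING may hold different values.
    static final char[] ASSUMED_MAPPING =
        "01230120022455012623010202".toCharArray();

    // Returns the digit for an ASCII letter (either case),
    // or '0' for any other character.
    static char digitFor(char c) {
        char u = Character.toUpperCase(c);
        return (u >= 'A' && u <= 'Z') ? ASSUMED_MAPPING[u - 'A'] : '0';
    }
}
```

With this table, B, F, P and V share the digit 1, while vowels and H, W, Y map to '0' and contribute no digit.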

MAXSOUNDEXLENGTH

public static final int MAXSOUNDEXLENGTH
See Also:
Constant Field Values

Constructor Detail

Soundex

public Soundex()

Method Detail

soundex

public static java.lang.String soundex(java.lang.String s)
Get Soundex code for a string.

Parameters:
s - The string for which the Soundex code is desired.
Returns:
The Soundex code for "s", or an empty string when the Soundex code cannot be found. In particular, a Soundex code cannot be found if the first character in "s" is not a letter (a-z, A-Z).
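The contract above can be illustrated with a minimal, self-contained sketch of the classic American Soundex rules. This is not the source of this class: the class name SoundexSketch, the mapping table, and the maximum length of 4 are assumptions, and the library's handling of edge cases (for example H and W between consonants, or non-letters inside the string) may differ.

```java
final class SoundexSketch {
    // Assumption: standard American Soundex digits for 'A'..'Z';
    // '0' marks letters that are not coded.
    private static final char[] MAPPING =
        "01230120022455012623010202".toCharArray();

    // Assumption: stands in for MAXSOUNDEXLENGTH, whose value is not shown here.
    private static final int MAX_LENGTH = 4;

    static String soundex(String s) {
        if (s == null || s.isEmpty()) {
            return "";
        }
        char first = Character.toUpperCase(s.charAt(0));
        if (first < 'A' || first > 'Z') {
            return ""; // no code when the first character is not a letter
        }
        StringBuilder code = new StringBuilder(MAX_LENGTH);
        code.append(first);
        char last = MAPPING[first - 'A']; // the first letter's digit also suppresses repeats
        for (int i = 1; i < s.length() && code.length() < MAX_LENGTH; i++) {
            char c = Character.toUpperCase(s.charAt(i));
            if (c < 'A' || c > 'Z') {
                continue; // ignore non-letters
            }
            if (c == 'H' || c == 'W') {
                continue; // H and W do not separate letters with the same digit
            }
            char digit = MAPPING[c - 'A'];
            if (digit != '0' && digit != last) {
                code.append(digit);
            }
            last = digit; // a vowel sets last to '0', so a repeated digit is coded again
        }
        while (code.length() < MAX_LENGTH) {
            code.append('0'); // pad short codes with zeros
        }
        return code.toString();
    }
}
```

Under these assumptions, SoundexSketch.soundex("Robert") and SoundexSketch.soundex("Rupert") both yield "R163", and SoundexSketch.soundex("123") yields an empty string because the first character is not a letter.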