pt.tumba.cluster
Class TeXWordFinder
java.lang.Object
pt.tumba.cluster.DefaultWordFinder
pt.tumba.cluster.TeXWordFinder
public class TeXWordFinder
- extends DefaultWordFinder
A word finder for TeX and LaTeX documents. It searches the text for
sequences of letters, but ignores commands and environments, including
math environments.
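The scanning behaviour described above can be illustrated with a small, self-contained sketch. It is not the pt.tumba.cluster implementation: the class name TeXWordScanSketch and the method words are hypothetical, and the rules it applies (skip backslash commands, skip the braced name after \begin and \end, skip text between $ delimiters, skip % comments to the end of the line) are simplifying assumptions about what "ignoring commands and environments" means.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration; not the pt.tumba.cluster code. */
final class TeXWordScanSketch {

    /** Collects letter sequences, skipping commands, $...$ math and % comments. */
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        boolean inMath = false;
        int n = text.length();
        int i = 0;
        while (i < n) {
            char c = text.charAt(i);
            if (c == '\\') {
                i++; // step over the backslash
                int start = i;
                while (i < n && Character.isLetter(text.charAt(i))) {
                    i++; // step over the command name
                }
                if (i == start) {
                    if (i < n) {
                        i++; // escaped symbol such as \$ or \%
                    }
                } else {
                    String name = text.substring(start, i);
                    boolean env = name.equals("begin") || name.equals("end");
                    if (env && i < n && text.charAt(i) == '{') {
                        // Skip the environment name, e.g. {itemize}.
                        int close = text.indexOf('}', i);
                        i = (close < 0) ? n : close + 1;
                    }
                }
            } else if (c == '$') {
                inMath = !inMath; // toggle inline math mode
                i++;
            } else if (c == '%') {
                while (i < n && text.charAt(i) != '\n') {
                    i++; // comment runs to the end of the line
                }
            } else if (!inMath && Character.isLetter(c)) {
                int start = i;
                while (i < n && Character.isLetter(text.charAt(i))) {
                    i++;
                }
                out.add(text.substring(start, i));
            } else {
                i++; // punctuation, digits, braces, or math content
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String tex = "\\section{Intro} The energy $E = mc^2$ is \\emph{famous}. % note";
        System.out.println(words(tex));
    }
}
```

In this sketch the arguments of ordinary commands such as \section{...} are still treated as text, so "Intro" is reported as a word. Whether the real class does the same is not stated in this page.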
Method Summary
void addUserDefinedIgnores(java.util.Collection expressions, int regex)
    Imports a user-defined set of either strings or regular expressions to ignore.
private int ignoreUserDefined(int i)
java.lang.String next()
    Scans the text from the end of the last word and returns the next word as a String.
void setIgnoreComments(boolean ignore)
Methods inherited from class pt.tumba.cluster.DefaultWordFinder
current, currentSegment, getText, hasNext, ignore, ignore, ignore, ignore, init, isWordChar, isWordChar, nextSegment, replace, setSentenceIterator, setText, startsSentence, toString
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
IGNORE_COMMENTS
private boolean IGNORE_COMMENTS
user_defined_ignores
private java.util.HashSet user_defined_ignores
regex_user_defined_ignores
private int regex_user_defined_ignores
STRING_EXPR
public static final int STRING_EXPR
- See Also:
- Constant Field Values
REG_EXPR
public static final int REG_EXPR
- See Also:
- Constant Field Values
TeXWordFinder
public TeXWordFinder(java.lang.String inText)
- Creates a new TeXWordFinder object.
- Parameters:
inText
- the text to search.
TeXWordFinder
public TeXWordFinder()
next
public java.lang.String next()
- This method scans the text from the end of the last word, and returns
the next word as a String.
- Overrides:
next
in class DefaultWordFinder
- Returns:
- the next word.
- Throws:
WordNotFoundException
- search string contains no more words.
addUserDefinedIgnores
public void addUserDefinedIgnores(java.util.Collection expressions,
int regex)
- Imports a user-defined set of either strings or regular expressions to ignore.
- Parameters:
expressions
- a collection of Objects whose toString() value is used as the expression; typically String objects.
regex
- an integer specifying the type of expression to use, e.g. REG_EXPR or STRING_EXPR.
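As a rough illustration of how the two expression types could differ, the sketch below keeps literal strings and compiled regular expressions separately and checks a word against both. Everything in it is an assumption made for the example: the class UserIgnoreSketch, the method isIgnored, and the numeric values given to STRING_EXPR and REG_EXPR are hypothetical, since this page does not list the real constant values or describe how TeXWordFinder applies the expressions. The documented fields (a single HashSet plus an int flag) suggest the real class may track only one expression type at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; not the pt.tumba.cluster code. */
final class UserIgnoreSketch {
    // Assumed values; the real constant values are not given in this page.
    static final int STRING_EXPR = 0;
    static final int REG_EXPR = 1;

    private final Set<String> literals = new HashSet<>();
    private final List<Pattern> patterns = new ArrayList<>();

    /** Registers each element's toString() value as a literal or a regex. */
    void addUserDefinedIgnores(Collection<?> expressions, int type) {
        for (Object o : expressions) {
            String expr = o.toString();
            if (type == REG_EXPR) {
                patterns.add(Pattern.compile(expr));
            } else {
                literals.add(expr);
            }
        }
    }

    /** Returns true if the word equals a literal or fully matches a regex. */
    boolean isIgnored(String word) {
        if (literals.contains(word)) {
            return true;
        }
        for (Pattern p : patterns) {
            if (p.matcher(word).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UserIgnoreSketch ignores = new UserIgnoreSketch();
        ignores.addUserDefinedIgnores(Arrays.asList("TeX", "LaTeX"), STRING_EXPR);
        ignores.addUserDefinedIgnores(Arrays.asList("[A-Z]{2,}"), REG_EXPR);
        for (String w : Arrays.asList("TeX", "HTML", "word")) {
            System.out.println(w + " ignored: " + ignores.isIgnored(w));
        }
    }
}
```

The sketch requires a regular expression to match the whole word. A real implementation might instead search for a match starting at a given text position, which the int parameter of ignoreUserDefined hints at.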
ignoreUserDefined
private int ignoreUserDefined(int i)
setIgnoreComments
public void setIgnoreComments(boolean ignore)