org.apache.lucene.analysis

Class StopFilter

public final class StopFilter extends TokenFilter

Removes stop words from a token stream.
Constructor Summary
StopFilter(TokenStream input, String[] stopWords)
Construct a token stream filtering the given input.
StopFilter(TokenStream in, String[] stopWords, boolean ignoreCase)
Constructs a filter which removes words from the input TokenStream that are named in the array of words.
StopFilter(TokenStream in, Hashtable stopTable)
Constructs a filter which removes words from the input TokenStream that are named in the Hashtable.
StopFilter(TokenStream in, Hashtable stopTable, boolean ignoreCase)
Constructs a filter which removes words from the input TokenStream that are named in the Hashtable.
StopFilter(TokenStream input, Set stopWords, boolean ignoreCase)
Construct a token stream filtering the given input.
StopFilter(TokenStream in, Set stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the Set.
Method Summary
static Set makeStopSet(String[] stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
static Set makeStopSet(String[] stopWords, boolean ignoreCase)
Builds a Set from an array of stop words. If ignoreCase is true, all words are lowercased first.
static Hashtable makeStopTable(String[] stopWords)
Builds a Hashtable from an array of stop words, appropriate for passing into the StopFilter constructor.
static Hashtable makeStopTable(String[] stopWords, boolean ignoreCase)
Builds a Hashtable from an array of stop words, appropriate for passing into the StopFilter constructor.
Token next()
Returns the next input Token whose termText() is not a stop word.

Constructor Detail

StopFilter

public StopFilter(TokenStream input, String[] stopWords)
Construct a token stream filtering the given input.

StopFilter

public StopFilter(TokenStream in, String[] stopWords, boolean ignoreCase)
Constructs a filter which removes words from the input TokenStream that are named in the array of words.

StopFilter

public StopFilter(TokenStream in, Hashtable stopTable)

Deprecated: Use StopFilter(TokenStream, Set) instead.

Constructs a filter which removes words from the input TokenStream that are named in the Hashtable.

StopFilter

public StopFilter(TokenStream in, Hashtable stopTable, boolean ignoreCase)

Deprecated: Use StopFilter(TokenStream, Set) instead.

Constructs a filter which removes words from the input TokenStream that are named in the Hashtable. If ignoreCase is true, all keys in the stopTable should already be lowercased.

StopFilter

public StopFilter(TokenStream input, Set stopWords, boolean ignoreCase)
Construct a token stream filtering the given input.

Parameters:
input - The input TokenStream to filter.
stopWords - The set of stop words, as Strings. If ignoreCase is true, all strings should be lowercase.
ignoreCase - Ignore case when stopping. The stopWords set must be set up to contain only lowercase words.

StopFilter

public StopFilter(TokenStream in, Set stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the Set. It is crucial that an efficient Set implementation is used for maximum performance.

See Also: makeStopSet(java.lang.String[])

Method Detail

makeStopSet

public static final Set makeStopSet(String[] stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. This allows the stop word set to be built once and cached when an Analyzer is constructed.

See Also: makeStopSet(java.lang.String[], boolean), passing false to ignoreCase
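
The caching idea described for makeStopSet can be sketched in plain Java without any Lucene classes. The class below is a hypothetical stand-in for an Analyzer: CachedStopSetSketch, its analyze method, and the whitespace splitting are all assumptions made for illustration, not part of the Lucene API. It shows the set being built once in the constructor and reused for every call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an Analyzer; it does not use or reproduce Lucene classes.
final class CachedStopSetSketch {
    // Built once in the constructor and shared by every analyze() call.
    private final Set<String> stopSet;

    CachedStopSetSketch(String[] stopWords) {
        this.stopSet = Collections.unmodifiableSet(
                new HashSet<String>(Arrays.asList(stopWords)));
    }

    // Splits on whitespace (an assumption made for brevity) and drops stop words.
    List<String> analyze(String text) {
        List<String> kept = new ArrayList<String>();
        for (String term : text.trim().split("\\s+")) {
            if (!term.isEmpty() && !stopSet.contains(term)) {
                kept.add(term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        CachedStopSetSketch analyzer =
                new CachedStopSetSketch(new String[] {"the", "a", "of"});
        System.out.println(analyzer.analyze("the name of the rose")); // prints [name, rose]
        System.out.println(analyzer.analyze("a tale of two cities")); // prints [tale, two, cities]
    }
}
```

With the real classes, the analogous pattern would be to call makeStopSet once, keep the returned Set in a field, and pass it to each new StopFilter.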

makeStopSet

public static final Set makeStopSet(String[] stopWords, boolean ignoreCase)

Parameters:
stopWords - The array of stop words.
ignoreCase - If true, all words are lowercased first.

Returns: a Set containing the words
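
As a rough illustration of the ignoreCase contract, the following plain-Java sketch builds a set the way the description suggests: each word is lowercased before insertion when ignoreCase is true. The class StopSetSketch and the method buildStopSet are hypothetical names, and the use of Locale.ROOT is an assumption; this is not the Lucene source.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of the ignoreCase behaviour; not the Lucene implementation.
final class StopSetSketch {
    // Lowercases each word before insertion when ignoreCase is true.
    static Set<String> buildStopSet(String[] stopWords, boolean ignoreCase) {
        Set<String> stopSet = new HashSet<String>(stopWords.length);
        for (String word : stopWords) {
            stopSet.add(ignoreCase ? word.toLowerCase(Locale.ROOT) : word);
        }
        return stopSet;
    }

    public static void main(String[] args) {
        Set<String> stopSet = buildStopSet(new String[] {"The", "AND"}, true);
        System.out.println(stopSet.contains("the")); // prints true
        System.out.println(stopSet.contains("The")); // prints false
    }
}
```

Because the set then holds only lowercase entries, a case-insensitive filter has to lowercase each term before looking it up.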

makeStopTable

public static final Hashtable makeStopTable(String[] stopWords)

Deprecated: Use makeStopSet(String[]) instead.

Builds a Hashtable from an array of stop words, appropriate for passing into the StopFilter constructor. This permits this table construction to be cached once when an Analyzer is constructed.

makeStopTable

public static final Hashtable makeStopTable(String[] stopWords, boolean ignoreCase)

Deprecated: Use makeStopSet(java.lang.String[], boolean) instead.

Builds a Hashtable from an array of stop words, appropriate for passing into the StopFilter constructor. This permits this table construction to be cached once when an Analyzer is constructed.

next

public final Token next()
Returns the next input Token whose termText() is not a stop word.
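
The skipping behaviour of next() can be sketched with plain Java collections. StopFilterSketch is a hypothetical class that reads Strings from an Iterator instead of Tokens from a TokenStream. Returning null at the end of input is an assumption about the end-of-stream convention, and the sketch is not the Lucene implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of a stop filter over plain Strings; not the Lucene source.
final class StopFilterSketch {
    private final Iterator<String> input;
    private final Set<String> stopWords;
    private final boolean ignoreCase;

    StopFilterSketch(Iterator<String> input, Set<String> stopWords, boolean ignoreCase) {
        this.input = input;
        this.stopWords = stopWords;
        this.ignoreCase = ignoreCase;
    }

    // Returns the next term that is not a stop word, or null when the input is exhausted.
    String next() {
        while (input.hasNext()) {
            String term = input.next();
            // With ignoreCase, the set is expected to hold lowercase entries only.
            String key = ignoreCase ? term.toLowerCase(Locale.ROOT) : term;
            if (!stopWords.contains(key)) {
                return term; // the original term is returned unchanged
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> stops = new HashSet<String>(Arrays.asList("the", "and"));
        Iterator<String> terms = Arrays.asList("The", "quick", "and", "Brown").iterator();
        StopFilterSketch filter = new StopFilterSketch(terms, stops, true);
        for (String t = filter.next(); t != null; t = filter.next()) {
            System.out.println(t); // prints quick, then Brown
        }
    }
}
```

Only the lookup key is lowercased here, so surviving terms keep their original case.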
Copyright © 2000-2008 Apache Software Foundation. All Rights Reserved.