org.apache.lucene.analysis.fr
Class ElisionFilter
java.lang.Object
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.fr.ElisionFilter
public class ElisionFilter
- extends TokenFilter
Removes elisions from a token stream. For example, "l'avion" (the plane) is
emitted as the token "avion" (plane).
Note that StandardTokenizer treats the apostrophe ( ' ) as a space and drops it.
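The elision rule described above can be sketched in plain Java. This is an illustrative sketch, not Lucene's implementation: the class ElisionSketch, the method stripElision, and the article list are all hypothetical names and values chosen for the example, and the filter's real default articles may differ.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: illustrates the elision rule only.
final class ElisionSketch {
    // Assumed article list for illustration; the filter's real defaults may differ.
    static final Set<String> ARTICLES =
            new HashSet<String>(Arrays.asList("l", "d", "j", "qu"));

    // Returns the term without its leading elided article, or the term unchanged.
    static String stripElision(String term, Set<String> articles) {
        int apostrophe = term.indexOf('\'');
        if (apostrophe <= 0) {
            return term; // no apostrophe, or nothing before it
        }
        String prefix = term.substring(0, apostrophe).toLowerCase(Locale.ROOT);
        if (articles.contains(prefix)) {
            return term.substring(apostrophe + 1); // drop article and apostrophe
        }
        return term; // prefix is not a listed article
    }
}
```

Under these assumptions, stripElision("l'avion", ARTICLES) yields "avion", while "aujourd'hui" passes through unchanged because "aujourd" is not in the article set.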
- Author:
- Mathieu Lecarme
- See Also:
- Elision in Wikipedia
Method Summary
Token next()
- Returns the next input Token, with any leading elision removed from its termText()
void setArticles(Set articles)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ElisionFilter
protected ElisionFilter(TokenStream input)
- Constructs an elision filter that uses the standard set of articles (stop words)
ElisionFilter
public ElisionFilter(TokenStream input,
Set articles)
- Constructs an elision filter that uses the given Set of articles (stop words)
ElisionFilter
public ElisionFilter(TokenStream input,
String[] articles)
- Constructs an elision filter that uses the given array of articles (stop words)
setArticles
public void setArticles(Set articles)
next
public Token next()
throws IOException
- Returns the next input Token, with any leading elision removed from its termText()
- Overrides:
next in class TokenStream
- Throws:
IOException
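The pull-style contract of next() can be sketched without Lucene as a wrapper around an iterator of terms. This is a sketch under stated assumptions, not the filter's actual code: ElisionStreamSketch is a hypothetical class, plain String values stand in for Token, and the null return at end of input mirrors the usual TokenStream convention of this API generation.

```java
import java.util.Iterator;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a token filter: wraps a source of terms and
// strips a leading elided article from each one as it is pulled.
final class ElisionStreamSketch {
    private final Iterator<String> input;  // stands in for the wrapped TokenStream
    private final Set<String> articles;    // articles to strip, lower-cased

    ElisionStreamSketch(Iterator<String> input, Set<String> articles) {
        this.input = input;
        this.articles = articles;
    }

    // Returns the next term with any leading elision removed,
    // or null once the input is exhausted.
    String next() {
        if (!input.hasNext()) {
            return null;
        }
        String term = input.next();
        int apostrophe = term.indexOf('\'');
        if (apostrophe > 0) {
            String prefix = term.substring(0, apostrophe).toLowerCase(Locale.ROOT);
            if (articles.contains(prefix)) {
                return term.substring(apostrophe + 1);
            }
        }
        return term;
    }
}
```

A caller would loop until next() returns null, in the same way a consumer drains a TokenStream.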
Copyright © 2000-2008 Apache Software Foundation. All Rights Reserved.