org.biojava.bio.seq.io
Class WordTokenization

java.lang.Object
  extended by org.biojava.utils.Unchangeable
      extended by org.biojava.bio.seq.io.WordTokenization
All Implemented Interfaces:
Annotatable, Changeable, java.io.Serializable, SymbolTokenization
Direct Known Subclasses:
CrossProductTokenization, DoubleTokenization, IntegerTokenization, NameTokenization

public abstract class WordTokenization
extends Unchangeable
implements SymbolTokenization, java.io.Serializable

Base class for tokenizations that accept whitespace-separated "words". Input is split at whitespace, except where the whitespace is enclosed in double quotes ("), parentheses (), or square brackets [].
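
The splitting rule described above can be illustrated with a small standalone sketch. This is not the BioJava implementation: the class name WordSplitSketch and the method split are hypothetical, and the code makes several assumptions that this page does not confirm, namely that the enclosing delimiters are kept as part of each word, that nesting is tracked with a single depth counter shared by both bracket types, and that unbalanced input is tolerated rather than rejected (the real splitString is declared to throw IllegalSymbolException).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of org.biojava.
class WordSplitSketch {

    // Split text at whitespace, ignoring whitespace that sits inside
    // double quotes, parentheses, or square brackets.
    static List<String> split(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;            // open ( or [ not yet closed (assumption: one shared counter)
        boolean inQuotes = false; // true between a pair of double quotes

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);

            // Unprotected whitespace ends the current word, if any.
            if (Character.isWhitespace(c) && depth == 0 && !inQuotes) {
                if (current.length() > 0) {
                    words.add(current.toString());
                    current.setLength(0);
                }
                continue;
            }

            if (c == '"') {
                inQuotes = !inQuotes;
            } else if (!inQuotes) {
                if (c == '(' || c == '[') {
                    depth++;
                } else if ((c == ')' || c == ']') && depth > 0) {
                    depth--;
                }
            }
            // Assumption: delimiters are kept in the resulting word.
            current.append(c);
        }

        // Flush the last word.
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [alpha, (beta gamma), [delta epsilon], "zeta eta"]
        System.out.println(split("alpha (beta gamma) [delta epsilon] \"zeta eta\""));
    }
}
```

A subclass would presumably go on to turn each word into a Symbol from its alphabet; that step is outside the scope of this sketch.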

Since:
1.2
Author:
Thomas Down, Greg Cox, Keith James
See Also:
Serialized Form

Nested Class Summary
 
Nested classes inherited from class org.biojava.bio.seq.io.SymbolTokenization
SymbolTokenization.TokenType
 
Nested classes inherited from class org.biojava.bio.Annotatable
Annotatable.AnnotationForwarder
 
Field Summary
 
Fields inherited from interface org.biojava.bio.seq.io.SymbolTokenization
CHARACTER, FIXEDWIDTH, SEPARATED, UNKNOWN
 
Fields inherited from interface org.biojava.bio.Annotatable
ANNOTATION
 
Constructor Summary
WordTokenization(Alphabet fab)
           
 
Method Summary
 Alphabet getAlphabet()
          The alphabet to which this tokenization applies.
 Annotation getAnnotation()
          Should return the associated annotation object.
 SymbolTokenization.TokenType getTokenType()
          Determine the style of tokenization represented by this object.
 StreamParser parseStream(SeqIOListener siol)
          Return an object which can parse an arbitrary character stream into symbols.
protected  Symbol[] parseString(java.lang.String s)
           
protected  java.util.List splitString(java.lang.String str)
           
 java.lang.String tokenizeSymbolList(SymbolList sl)
          Return a string representation of a list of symbols.
 
Methods inherited from class org.biojava.utils.Unchangeable
addChangeListener, addChangeListener, addForwarder, getForwarders, getListeners, isUnchanging, removeChangeListener, removeChangeListener, removeForwarder
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.biojava.bio.seq.io.SymbolTokenization
parseToken, tokenizeSymbol
 
Methods inherited from interface org.biojava.utils.Changeable
addChangeListener, addChangeListener, isUnchanging, removeChangeListener, removeChangeListener
 

Constructor Detail

WordTokenization

public WordTokenization(Alphabet fab)

Method Detail

getAlphabet

public Alphabet getAlphabet()
Description copied from interface: SymbolTokenization
The alphabet to which this tokenization applies.

Specified by:
getAlphabet in interface SymbolTokenization

getTokenType

public SymbolTokenization.TokenType getTokenType()
Description copied from interface: SymbolTokenization
Determine the style of tokenization represented by this object.

Specified by:
getTokenType in interface SymbolTokenization

getAnnotation

public Annotation getAnnotation()
Description copied from interface: Annotatable
Should return the associated annotation object.

Specified by:
getAnnotation in interface Annotatable
Returns:
an Annotation object, never null

tokenizeSymbolList

public java.lang.String tokenizeSymbolList(SymbolList sl)
                                    throws IllegalSymbolException,
                                           IllegalAlphabetException
Description copied from interface: SymbolTokenization
Return a string representation of a list of symbols.

Specified by:
tokenizeSymbolList in interface SymbolTokenization
Throws:
IllegalAlphabetException - if alphabets don't match
IllegalSymbolException

parseStream

public StreamParser parseStream(SeqIOListener siol)
Description copied from interface: SymbolTokenization
Return an object which can parse an arbitrary character stream into symbols.

Specified by:
parseStream in interface SymbolTokenization
Parameters:
siol - The listener which gets notified of parsed symbols.

splitString

protected java.util.List splitString(java.lang.String str)
                              throws IllegalSymbolException
Throws:
IllegalSymbolException

parseString

protected Symbol[] parseString(java.lang.String s)
                        throws IllegalSymbolException
Throws:
IllegalSymbolException