org.biojava.bio.seq.io
Interface SymbolTokenization

All Superinterfaces:
Annotatable, Changeable
All Known Implementing Classes:
CharacterTokenization, WordTokenization

public interface SymbolTokenization
extends Annotatable

Encapsulates a mapping between BioJava Symbol objects and a string representation.

Since:
1.2
Author:
Thomas Down

Nested Class Summary
static class SymbolTokenization.TokenType
           
 
Nested classes inherited from class org.biojava.bio.Annotatable
Annotatable.AnnotationForwarder
 
Field Summary
static SymbolTokenization.TokenType CHARACTER
           
static SymbolTokenization.TokenType FIXEDWIDTH
           
static SymbolTokenization.TokenType SEPARATED
           
static SymbolTokenization.TokenType UNKNOWN
           
 
Fields inherited from interface org.biojava.bio.Annotatable
ANNOTATION
 
Method Summary
 Alphabet getAlphabet()
          The alphabet to which this tokenization applies.
 SymbolTokenization.TokenType getTokenType()
          Determine the style of tokenization represented by this object.
 StreamParser parseStream(SeqIOListener listener)
          Return an object which can parse an arbitrary character stream into symbols.
 Symbol parseToken(java.lang.String token)
          Returns the symbol for a single token.
 java.lang.String tokenizeSymbol(Symbol s)
          Return a token representing a single symbol.
 java.lang.String tokenizeSymbolList(SymbolList sl)
          Return a string representation of a list of symbols.
 
Methods inherited from interface org.biojava.bio.Annotatable
getAnnotation
 
Methods inherited from interface org.biojava.utils.Changeable
addChangeListener, addChangeListener, isUnchanging, removeChangeListener, removeChangeListener
 

Field Detail

CHARACTER

public static final SymbolTokenization.TokenType CHARACTER

FIXEDWIDTH

public static final SymbolTokenization.TokenType FIXEDWIDTH

SEPARATED

public static final SymbolTokenization.TokenType SEPARATED

UNKNOWN

public static final SymbolTokenization.TokenType UNKNOWN

Method Detail

getAlphabet

public Alphabet getAlphabet()
The alphabet to which this tokenization applies.


getTokenType

public SymbolTokenization.TokenType getTokenType()
Determine the style of tokenization represented by this object.


parseToken

public Symbol parseToken(java.lang.String token)
                  throws IllegalSymbolException
Returns the symbol for a single token.

The returned Symbol will be a member of the alphabet to which this tokenization applies (see getAlphabet). If the token does not map to a symbol, an IllegalSymbolException is thrown.

Parameters:
token - the token to retrieve a Symbol for
Returns:
the Symbol for that token
Throws:
IllegalSymbolException - if there is no Symbol for the token
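
The parseToken contract (a recognized token yields a symbol, anything else raises an exception) can be sketched in plain Java. This is a minimal stand-in rather than BioJava code: ToySymbol, ToyIllegalSymbolException, ToyCharTokenization and the bind method are hypothetical names introduced only for illustration, and the real interface works with Symbol and IllegalSymbolException.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a BioJava Symbol: just a named object.
final class ToySymbol {
    final String name;
    ToySymbol(String name) { this.name = name; }
}

// Hypothetical stand-in for IllegalSymbolException (checked, like the original).
class ToyIllegalSymbolException extends Exception {
    ToyIllegalSymbolException(String message) { super(message); }
}

// A toy two-way mapping between single-character tokens and symbols.
final class ToyCharTokenization {
    private final Map<String, ToySymbol> tokenToSymbol = new HashMap<>();
    private final Map<ToySymbol, String> symbolToToken = new HashMap<>();

    // Register a symbol together with the character that represents it.
    void bind(ToySymbol symbol, char token) {
        String t = String.valueOf(token);
        tokenToSymbol.put(t, symbol);
        symbolToToken.put(symbol, t);
    }

    // Mirrors parseToken: an unknown token raises an exception instead of returning null.
    ToySymbol parseToken(String token) throws ToyIllegalSymbolException {
        ToySymbol s = tokenToSymbol.get(token);
        if (s == null) {
            throw new ToyIllegalSymbolException("No symbol for token '" + token + "'");
        }
        return s;
    }

    // Mirrors tokenizeSymbol: the reverse lookup, with the same failure mode.
    String tokenizeSymbol(ToySymbol s) throws ToyIllegalSymbolException {
        String t = symbolToToken.get(s);
        if (t == null) {
            throw new ToyIllegalSymbolException("Symbol '" + s.name + "' is not recognized");
        }
        return t;
    }
}
```

In this sketch, parsing a token and then tokenizing the result returns the original token, so callers can rely on a round trip for any bound symbol.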

parseStream

public StreamParser parseStream(SeqIOListener listener)
Return an object which can parse an arbitrary character stream into symbols.

Parameters:
listener - The listener which gets notified of parsed symbols.
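
The listener-based design can be illustrated with a small plain-Java sketch: a parser receives characters in arbitrary chunks and reports the resulting symbols to a listener as it goes. Every type and method name here (ToySymbolListener, ToyStreamParser, characters, addSymbols, symbolFor) is made up for illustration, symbols are represented as plain strings, and the DNA letters are an assumed example alphabet. The actual StreamParser and SeqIOListener interfaces are not described on this page and may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener: receives symbols (here, plain strings) as they are parsed.
interface ToySymbolListener {
    void addSymbols(List<String> symbols);
}

// Hypothetical incremental parser: accepts characters in arbitrary chunks.
final class ToyStreamParser {
    private final ToySymbolListener listener;

    ToyStreamParser(ToySymbolListener listener) {
        this.listener = listener;
    }

    // Convert each character in the given window to a symbol and notify the listener once per chunk.
    void characters(char[] data, int start, int len) {
        List<String> batch = new ArrayList<>();
        for (int i = start; i < start + len; i++) {
            batch.add(symbolFor(data[i]));
        }
        listener.addSymbols(batch);
    }

    // Assumed example mapping for DNA letters; unknown characters are rejected.
    private static String symbolFor(char c) {
        switch (Character.toLowerCase(c)) {
            case 'a': return "adenine";
            case 'c': return "cytosine";
            case 'g': return "guanine";
            case 't': return "thymine";
            default:
                throw new IllegalArgumentException("No symbol for character '" + c + "'");
        }
    }
}
```

Because the parser pushes symbols to the listener chunk by chunk, a caller can feed it from a buffered reader without holding the whole sequence as a string.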

tokenizeSymbol

public java.lang.String tokenizeSymbol(Symbol s)
                                throws IllegalSymbolException
Return a token representing a single symbol.

Throws:
IllegalSymbolException - if the symbol isn't recognized.

tokenizeSymbolList

public java.lang.String tokenizeSymbolList(SymbolList sl)
                                    throws IllegalAlphabetException,
                                           IllegalSymbolException
Return a string representation of a list of symbols.

Throws:
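IllegalAlphabetException - if the alphabet of the SymbolList does not match the alphabet of this tokenization
IllegalSymbolException - if a symbol in the list isn't recognized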
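
The two failure modes of tokenizeSymbolList can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the BioJava implementation: ToySymbolList, ToyListTokenizer and the two exception classes are hypothetical stand-ins, alphabets are identified by name only, symbols are plain strings, and tokens are simply concatenated with no separator.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for IllegalAlphabetException.
class ToyAlphabetMismatchException extends Exception {
    ToyAlphabetMismatchException(String message) { super(message); }
}

// Hypothetical stand-in for IllegalSymbolException.
class ToyUnknownSymbolException extends Exception {
    ToyUnknownSymbolException(String message) { super(message); }
}

// Hypothetical stand-in for a SymbolList: symbol names plus the alphabet they belong to.
final class ToySymbolList {
    final String alphabetName;
    final List<String> symbols;

    ToySymbolList(String alphabetName, List<String> symbols) {
        this.alphabetName = alphabetName;
        this.symbols = symbols;
    }
}

final class ToyListTokenizer {
    private final String alphabetName;
    private final Map<String, String> symbolToToken;

    ToyListTokenizer(String alphabetName, Map<String, String> symbolToToken) {
        this.alphabetName = alphabetName;
        this.symbolToToken = symbolToToken;
    }

    // Check the alphabet first, then tokenize each symbol and concatenate the tokens.
    String tokenizeSymbolList(ToySymbolList sl)
            throws ToyAlphabetMismatchException, ToyUnknownSymbolException {
        if (!alphabetName.equals(sl.alphabetName)) {
            throw new ToyAlphabetMismatchException(
                "List alphabet " + sl.alphabetName + " does not match " + alphabetName);
        }
        StringBuilder out = new StringBuilder();
        for (String symbol : sl.symbols) {
            String token = symbolToToken.get(symbol);
            if (token == null) {
                throw new ToyUnknownSymbolException("Symbol '" + symbol + "' is not recognized");
            }
            out.append(token);
        }
        return out.toString();
    }
}
```

Checking the alphabet before touching any symbol means a mismatched list fails immediately instead of producing a partial string.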