org.biojava.bio.symbol
Class IntegerAlphabet

java.lang.Object
  extended by org.biojava.utils.Unchangeable
      extended by org.biojava.bio.symbol.IntegerAlphabet
All Implemented Interfaces:
Alphabet, Annotatable, Changeable, java.io.Serializable

public final class IntegerAlphabet
extends Unchangeable
implements Alphabet, java.io.Serializable

An efficient implementation of an Alphabet over the infinite set of integer values.

This class can be used to represent lists of integer numbers as a SymbolList with the alphabet IntegerAlphabet. These lists can then be annotated with features, or fed into dynamic-programming algorithms, or processed as per any other SymbolList object.

Object identity should be used to decide if two IntegerSymbol objects are the same. IntegerAlphabet ensures that all IntegerSymbol instances are canonicalized.
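
Canonicalization is what makes the identity comparison safe: if a factory hands out exactly one instance per int value, == and equality of value coincide. The following is a minimal sketch of that idea only. CanonicalIntSymbols and IntSym are hypothetical names invented for illustration, and nothing here is taken from the BioJava source.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a canonicalizing symbol factory; not BioJava code. */
final class CanonicalIntSymbols {
    /** Immutable wrapper around one int value. */
    static final class IntSym {
        private final int value;
        private IntSym(int value) { this.value = value; }
        int intValue() { return value; }
    }

    // One cached instance per int value, created on first request.
    private final Map<Integer, IntSym> cache = new HashMap<>();

    /** Returns the same IntSym instance for every call with the same value. */
    synchronized IntSym getSymbol(int val) {
        return cache.computeIfAbsent(val, v -> new IntSym(v));
    }
}
```

Because the constructor is private, callers can only obtain instances through getSymbol, so two references to the same value always point at the same object.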

Author:
Matthew Pocock, Mark Schreiber, Thomas Down
See Also:
Serialized Form

Nested Class Summary
static class IntegerAlphabet.IntegerSymbol
          A single int value.
static class IntegerAlphabet.SubIntegerAlphabet
          A class to represent a finite contiguous subset of the infinite IntegerAlphabet
 
Nested classes inherited from class org.biojava.bio.Annotatable
Annotatable.AnnotationForwarder
 
Field Summary
static IntegerAlphabet INSTANCE
          The singleton instance of the IntegerAlphabet class.
 
Fields inherited from interface org.biojava.bio.symbol.Alphabet
EMPTY_ALPHABET, PARSERS, SYMBOLS
 
Fields inherited from interface org.biojava.bio.Annotatable
ANNOTATION
 
Method Summary
 boolean contains(Symbol s)
           Returns whether or not this Alphabet contains the symbol.
static SymbolList fromArray(int[] iArray)
          Retrieve a SymbolList view of an array of integers.
 java.util.List getAlphabets()
          Return an ordered List of the alphabets which make up a compound alphabet.
 Symbol getAmbiguity(java.util.Set symSet)
           Get a symbol that represents the set of symbols in symSet.
 Annotation getAnnotation()
          Should return the associated annotation object.
 Symbol getGapSymbol()
           Get the 'gap' ambiguity symbol that is most appropriate for this alphabet.
static IntegerAlphabet getInstance()
          Retrieve the single IntegerAlphabet instance.
 java.lang.String getName()
          Get the name of the alphabet.
static IntegerAlphabet.SubIntegerAlphabet getSubAlphabet(int min, int max)
          Construct a finite contiguous subset of the IntegerAlphabet.
 IntegerAlphabet.IntegerSymbol getSymbol(int val)
          Retrieve the Symbol for an int.
 Symbol getSymbol(java.util.List symList)
           Get a symbol from the Alphabet which corresponds to the specified ordered list of symbols.
 SymbolTokenization getTokenization(java.lang.String name)
          Creates a new parser (Mark Schreiber 3 May 2001).
 void validate(Symbol s)
           Throws a precanned IllegalSymbolException if the symbol is not contained within this Alphabet.
 
Methods inherited from class org.biojava.utils.Unchangeable
addChangeListener, addChangeListener, addForwarder, getForwarders, getListeners, isUnchanging, removeChangeListener, removeChangeListener, removeForwarder
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.biojava.utils.Changeable
addChangeListener, addChangeListener, isUnchanging, removeChangeListener, removeChangeListener
 

Field Detail

INSTANCE

public static IntegerAlphabet INSTANCE
The singleton instance of the IntegerAlphabet class.

Method Detail

getSubAlphabet

public static IntegerAlphabet.SubIntegerAlphabet getSubAlphabet(int min,
                                                                int max)
                                                         throws java.lang.IllegalArgumentException
Construct a finite contiguous subset of the IntegerAlphabet. Useful for making CrossProductAlphabets with other FiniteAlphabets.

Parameters:
min - the lower bound of the Alphabet
max - the upper bound of the Alphabet
Returns:
A FiniteAlphabet from min to max inclusive.
Throws:
java.lang.IllegalArgumentException - if max < min
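
The contract above (inclusive bounds, IllegalArgumentException when max < min) can be modelled in a few lines. This is a sketch under stated assumptions: IntRange is a hypothetical class written for illustration and is not the SubIntegerAlphabet implementation.

```java
/** Hypothetical model of an inclusive integer range; not BioJava's SubIntegerAlphabet. */
final class IntRange {
    private final int min;
    private final int max;

    IntRange(int min, int max) {
        // Mirror the documented contract: reject an upper bound below the lower bound.
        if (max < min) {
            throw new IllegalArgumentException(
                "max (" + max + ") must not be less than min (" + min + ")");
        }
        this.min = min;
        this.max = max;
    }

    /** True when val lies in [min, max], both bounds included. */
    boolean contains(int val) {
        return val >= min && val <= max;
    }

    /** Number of values in the range; long arithmetic avoids int overflow. */
    long size() {
        return (long) max - (long) min + 1L;
    }
}
```

A range built with min equal to max is legal under this contract and holds exactly one value.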

fromArray

public static SymbolList fromArray(int[] iArray)
Retrieve a SymbolList view of an array of integers.

The returned object is a view onto the underlying array, and does not copy it. Changes made to the original array will alter the resulting SymbolList.

Parameters:
iArray - the array of integers to view
Returns:
a SymbolList over the IntegerAlphabet that represents the values in iArray
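
The practical consequence of view semantics is that later writes to the array show through the list. The sketch below demonstrates this with a plain java.util.List rather than a SymbolList. IntArrayView is a hypothetical class written for illustration and is not how fromArray is implemented.

```java
import java.util.AbstractList;

/** Hypothetical read-only List view over an int[]; not BioJava code. */
final class IntArrayView extends AbstractList<Integer> {
    // Shared with the caller, never copied.
    private final int[] backing;

    IntArrayView(int[] backing) {
        this.backing = backing;
    }

    @Override
    public Integer get(int index) {
        // Reads go straight to the caller's array, so external writes are visible.
        return backing[index];
    }

    @Override
    public int size() {
        return backing.length;
    }
}
```

If the array must stay independent of the list, copy it first (for example with int[].clone()) and build the view over the copy.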

getInstance

public static IntegerAlphabet getInstance()
Retrieve the single IntegerAlphabet instance.

Returns:
the singleton IntegerAlphabet instance

getSymbol

public IntegerAlphabet.IntegerSymbol getSymbol(int val)
Retrieve the Symbol for an int.

Parameters:
val - the int to view
Returns:
an IntegerSymbol embodying val

getGapSymbol

public Symbol getGapSymbol()
Description copied from interface: Alphabet

Get the 'gap' ambiguity symbol that is most appropriate for this alphabet.

In general, this will be a BasisSymbol that represents a list of AlphabetManager.getGapSymbol() the same length as the getAlphabets list.

Specified by:
getGapSymbol in interface Alphabet
Returns:
the appropriate gap Symbol instance

getAnnotation

public Annotation getAnnotation()
Description copied from interface: Annotatable
Should return the associated annotation object.

Specified by:
getAnnotation in interface Annotatable
Returns:
an Annotation object, never null

getAlphabets

public java.util.List getAlphabets()
Description copied from interface: Alphabet
Return an ordered List of the alphabets which make up a compound alphabet. For simple alphabets, this will return a singleton list of itself. The returned list should be immutable.

Specified by:
getAlphabets in interface Alphabet
Returns:
a List of alphabets
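
For a simple alphabet, the "immutable singleton list of itself" described above maps naturally onto java.util.Collections.singletonList. This is a sketch of that pattern only. SimpleAlphabet is a hypothetical class, and whether IntegerAlphabet builds its list this way is an assumption.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical simple (non-compound) alphabet; not BioJava code. */
final class SimpleAlphabet {
    private final String name;

    SimpleAlphabet(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    /** A one-element, unmodifiable list containing this alphabet. */
    List<SimpleAlphabet> getAlphabets() {
        return Collections.singletonList(this);
    }
}
```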

getSymbol

public Symbol getSymbol(java.util.List symList)
                 throws IllegalSymbolException
Description copied from interface: Alphabet

Get a symbol from the Alphabet which corresponds to the specified ordered list of symbols.

The symbol at i in the list must be a member of the i'th alphabet in getAlphabets. If all of the symbols in symList are atomic, then the resulting symbol will also be atomic. If any one of them is an ambiguity symbol then the resulting symbol will be the appropriate ambiguity symbol.

Specified by:
getSymbol in interface Alphabet
Parameters:
symList - A list of Symbol instances
Throws:
IllegalSymbolException - if the members of symList are not Symbols over the alphabets returned from getAlphabets

getAmbiguity

public Symbol getAmbiguity(java.util.Set symSet)
                    throws IllegalSymbolException
Description copied from interface: Alphabet

Get a symbol that represents the set of symbols in symSet.

symSet must be a set of Symbol instances, each of which is contained within this alphabet. This method is used to retrieve ambiguity symbols.

Specified by:
getAmbiguity in interface Alphabet
Parameters:
symSet - the Set of Symbols that will be found in getMatches of the returned symbol
Returns:
a Symbol (possibly fly-weighted) for the Set of symbols in symSet
Throws:
IllegalSymbolException

contains

public boolean contains(Symbol s)
Description copied from interface: Alphabet

Returns whether or not this Alphabet contains the symbol.

An alphabet contains an ambiguity symbol iff the ambiguity symbol's getMatches() returns an alphabet that is a subset of this alphabet. That means that every one of the symbols that could match the ambiguity symbol is also a member of this alphabet.

Specified by:
contains in interface Alphabet
Parameters:
s - the Symbol to check
Returns:
boolean true if the Alphabet contains the symbol and false otherwise
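
The membership rule for ambiguity symbols, that every symbol the ambiguity could match must itself be in the alphabet, amounts to a subset test. The sketch below expresses it with java.util.Set. AmbiguityCheck and containsAmbiguity are hypothetical names, and plain sets stand in for Alphabet and getMatches().

```java
import java.util.Set;

/** Hypothetical helper illustrating the subset rule; not BioJava code. */
final class AmbiguityCheck {
    private AmbiguityCheck() {}

    /**
     * True when every symbol the ambiguity could match is in the alphabet.
     *
     * @param alphabet the symbols that make up the alphabet
     * @param matches  the symbols the ambiguity symbol can stand for
     */
    static <T> boolean containsAmbiguity(Set<T> alphabet, Set<T> matches) {
        return alphabet.containsAll(matches);
    }
}
```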

validate

public void validate(Symbol s)
              throws IllegalSymbolException
Description copied from interface: Alphabet

Throws a precanned IllegalSymbolException if the symbol is not contained within this Alphabet.

This function is used all over the code to validate symbols as they enter a method. Also, the code is littered with catches for IllegalSymbolException. There is a preferred style of handling this, which should be covered in the package documentation.

Specified by:
validate in interface Alphabet
Parameters:
s - the Symbol to validate
Throws:
IllegalSymbolException - if s is not contained in this alphabet
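
The validate-on-entry idiom described above can be sketched as follows: a method checks its argument first, then proceeds on the assumption that the symbol is legal. Every name in the block (UnknownSymbolException, ValidatingAlphabet, lengthOf) is hypothetical. Strings stand in for Symbol objects, and the exception is a stand-in for IllegalSymbolException.

```java
import java.util.Set;

/** Hypothetical checked exception standing in for IllegalSymbolException. */
class UnknownSymbolException extends Exception {
    UnknownSymbolException(String message) {
        super(message);
    }
}

/** Hypothetical alphabet over strings; not BioJava code. */
final class ValidatingAlphabet {
    private final Set<String> symbols;

    ValidatingAlphabet(Set<String> symbols) {
        this.symbols = symbols;
    }

    boolean contains(String s) {
        return symbols.contains(s);
    }

    /** Throws if s is not a member; returns normally otherwise. */
    void validate(String s) throws UnknownSymbolException {
        if (!contains(s)) {
            throw new UnknownSymbolException("Symbol " + s + " is not in this alphabet");
        }
    }

    /** Validates on entry so the rest of the method can assume a legal symbol. */
    int lengthOf(String s) throws UnknownSymbolException {
        validate(s);
        return s.length();
    }
}
```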

getName

public java.lang.String getName()
Description copied from interface: Alphabet
Get the name of the alphabet.

Specified by:
getName in interface Alphabet
Returns:
the name as a string.

getTokenization

public SymbolTokenization getTokenization(java.lang.String name)
Creates a new parser (Mark Schreiber 3 May 2001).

Specified by:
getTokenization in interface Alphabet
Parameters:
name - Currently only "token" is supported.
Returns:
an IntegerParser.