org.biojava.bio.symbol
Class SuffixTree

java.lang.Object
  extended byorg.biojava.bio.symbol.SuffixTree
All Implemented Interfaces:
java.io.Serializable

public class SuffixTree
extends java.lang.Object
implements java.io.Serializable

Suffix tree implementation. The interface is a bit strange, as it needed to be as space-efficient as possible. More work could be done on the space issue.

A suffix tree is an efficient method for encoding the frequencies of motifs in a sequence. They are sometimes used to quickly screen for similar sequences. For instance, all motifs of length up to 2 in the sequence AAGT could be encoded as:

 root(4)
 |
 A(2)--------G(1)-----T(1)
 |           |
 A(1)--G(1)  T(1)
 

A possible method of comparing SuffixTrees is provided as a kernel function as org.biojava.stats.svm.tools.SuffixTreeKernel.

Author:
Matthew Pocock, Thomas Down (documentation and other updates)
See Also:
Serialized Form

Nested Class Summary
static class SuffixTree.SuffixNode
          A node in the suffix tree.
 
Constructor Summary
SuffixTree(FiniteAlphabet alphabet)
          Construct a new SuffixTree to contain motifs over the specified alphabet.
 
Method Summary
 void addSymbols(SymbolList sList, int window)
          Add a count for all motifs with length of up to window to this tree.
 int frequency(int length)
          Return the number of motifs of a given length encoded in this SuffixTree.
 FiniteAlphabet getAlphabet()
          Return the Alphabet containing all Symbols which might be found in this SuffixTree.
 SuffixTree.SuffixNode getChild(SuffixTree.SuffixNode node, int i)
          Get the n'th child of a node.
 SuffixTree.SuffixNode getChild(SuffixTree.SuffixNode node, Symbol s)
          Get a child of a SuffixTree.SuffixNode, constructing a new one if need be.
 SuffixTree.SuffixNode getRoot()
          Return the node object which is the root of this suffix tree.
protected  void incCounts(int i, int c)
           
 int maxLength()
          Return the length of the longest motif represented in this SuffixTree
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SuffixTree

public SuffixTree(FiniteAlphabet alphabet)
Construct a new SuffixTree to contain motifs over the specified alphabet.

Parameters:
alphabet - The alphabet of this SuffixTree (must be finite).
Method Detail

getAlphabet

public FiniteAlphabet getAlphabet()
Return the Alphabet containing all Symbols which might be found in this SuffixTree.


getRoot

public SuffixTree.SuffixNode getRoot()
Return the node object which is the root of this suffix tree. This represents the set of all motifs found in this tree.


getChild

public SuffixTree.SuffixNode getChild(SuffixTree.SuffixNode node,
                                      Symbol s)
                               throws IllegalSymbolException
Get a child of a SuffixTree.SuffixNode, constructing a new one if need be. This method is here due to memory optimisations.

Throws:
IllegalSymbolException

getChild

public SuffixTree.SuffixNode getChild(SuffixTree.SuffixNode node,
                                      int i)
Get the n'th child of a node.


addSymbols

public void addSymbols(SymbolList sList,
                       int window)
                throws IllegalSymbolException
Add a count for all motifs with length of up to window to this tree.

Parameters:
sList - a SymbolList whose motifs should be added to the tree.
window - The maximum motif length to count.
Throws:
IllegalSymbolException

incCounts

protected void incCounts(int i,
                         int c)

maxLength

public int maxLength()
Return the length of the longest motif represented in this SuffixTree


frequency

public int frequency(int length)
Return the number of motifs of a given length encoded in this SuffixTree.