Package org.biojava.bio.dp

HMM and Dynamic Programming Algorithms.

Interface Summary
DotState A dot state (a state that does not emit symbols).
DPFactory The interface for objects that can generate a DP object for a MarkovModel.
DPMatrix  
EmissionState A state in a Markov process that has an emission spectrum.
HMMTrainer interface implemented by objects that train HMMs.
MarkovModel A Markov model.
ModelInState A state that contains an entire sub-model.
ModelTrainer Encapsulates the training of an entire model.
ScoreType Computes the score that is used in a DP optimisation.
State A state in a Markov process.
StatePath Extends the Alignment interface so that it is explicitly used to represent a state path through an HMM, and the associated emitted sequence and likelihoods.
StoppingCriteria A callback that is invoked during the training of an HMM.
Trainable Flags an object as being able to register itself with a model trainer.
TrainingAlgorithm  
TransitionTrainer An object that can be used to train the transitions within a MarkovModel.
WeightMatrix A log odds weight matrix.
 

Class Summary
AbstractTrainer An abstract implementation of TrainingAlgorithm that provides a framework for plugging in per-cycle code for parameter optimization.
BackPointer A backpointer.
BaumWelchSampler Train a hidden Markov model using a sampling algorithm.
BaumWelchTrainer Train a hidden Markov model using maximum likelihood.
DP Objects that can perform dynamic programming operations upon sequences with HMMs.
DP.ReverseIterator  
DPFactory.DefaultFactory  
MagicalState Start/end state for HMMs.
ProfileHMM  
ScoreType.NullModel In this class, calculateScore returns the probability of a Symbol being emitted by the null model.
ScoreType.Odds In this class, calculateScore returns the odds ratio of a symbol being emitted.
ScoreType.Probability In this class, calculateScore returns the probability of a Symbol being emitted.
SimpleDotState A Dot state that you can make and use.
SimpleEmissionState  
SimpleHMMTrainer  
SimpleMarkovModel  
SimpleModelInState  
SimpleModelTrainer  
SimpleStatePath A no-frills implementation of StatePath.
SimpleWeightMatrix  
TrainerTransition This is a small and ugly class for storing a trainer and a transition.
Transition This is a small and ugly class for storing a transition.
WeightMatrixAnnotator Annotates a sequence with hits to a weight-matrix.
WMAsMM Wraps a weight matrix up so that it appears to be a very simple HMM.
XmlMarkovModel  
 

Exception Summary
IllegalTransitionException This exception indicates that there is no transition between two states.
 

Package org.biojava.bio.dp Description

HMM and Dynamic Programming Algorithms.

This package deals with dynamic programming. It uses the same notions of sequences, alphabets and alignments as org.biojava.bio.seq, and extends them to incorporate HMMs, HMM states and state paths. As far as possible, the implementation details are hidden from the casual user, so that these objects can be used as black boxes. Alternatively, there is scope for you to implement your own efficient representations of states and dynamic programming algorithms.

HMMs are defined by a finite set of states and a finite set of transitions. The states are encapsulated as subinterfaces of Symbol, so that we can reuse Alphabet and SymbolList to store the legal states and sequences of states. States that emit symbols must implement EmissionState; they define a probability distribution over an alphabet. Other states may contain entire sub-models, or be non-emitting states that make the model easier to wire. An HMM contains an alphabet of states and a set of scored transitions, so it closely resembles a directed weighted graph, with the states as nodes and the transitions as arcs.
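The graph view described above can be sketched in plain Java. This is an illustrative standalone sketch, not the BioJava API; the class, method and state names are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch: an HMM viewed as a directed weighted graph.
 * States are nodes; transitions are scored arcs; each state
 * carries an emission distribution over symbols.
 */
class HmmGraph {
    // transitions.get(from).get(to) = probability of the arc from -> to
    final Map<String, Map<String, Double>> transitions = new HashMap<>();
    // emissions.get(state).get(symbol) = probability that state emits symbol
    final Map<String, Map<Character, Double>> emissions = new HashMap<>();

    void addTransition(String from, String to, double p) {
        transitions.computeIfAbsent(from, k -> new HashMap<>()).put(to, p);
    }

    void addEmission(String state, char symbol, double p) {
        emissions.computeIfAbsent(state, k -> new HashMap<>()).put(symbol, p);
    }
}
```

A non-emitting (dot) state in this picture is simply a node with an empty emission map, kept only to simplify the wiring of arcs.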

A simple HMM can be aligned to a single sequence at a time. This effectively finds the most likely way that the HMM could have emitted that sequence. More complex algorithms may align more than one sequence to a model simultaneously. For example, Smith-Waterman can be expressed as a three-state model that aligns two sequences to each other and to the model. These more complex models can still be represented as producing a single sequence, but in this case the sequence is an alignment of the two input sequences against one another (including gap characters where appropriate).
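Finding the most likely way an HMM could have emitted a sequence is a Viterbi-style dynamic programme: fill a matrix of best log-probabilities and keep backpointers for the traceback. The following is a minimal standalone sketch, not the BioJava DP class; the array-based model encoding is an assumption made for the example.

```java
/**
 * Sketch: Viterbi decoding for a simple HMM.
 * start[s]    = probability of starting in state s
 * trans[p][s] = probability of moving from state p to state s
 * emit[s][o]  = probability that state s emits observed symbol o
 */
class Viterbi {
    /** Returns the most probable state path for the observations (obs must be non-empty). */
    static int[] path(double[] start, double[][] trans, double[][] emit, int[] obs) {
        int nStates = start.length;
        double[][] v = new double[obs.length][nStates]; // best log-probability so far
        int[][] back = new int[obs.length][nStates];    // backpointers for traceback
        for (int s = 0; s < nStates; s++)
            v[0][s] = Math.log(start[s]) + Math.log(emit[s][obs[0]]);
        for (int t = 1; t < obs.length; t++) {
            for (int s = 0; s < nStates; s++) {
                double best = Double.NEGATIVE_INFINITY;
                int bestPrev = 0;
                for (int p = 0; p < nStates; p++) {
                    double score = v[t - 1][p] + Math.log(trans[p][s]);
                    if (score > best) { best = score; bestPrev = p; }
                }
                v[t][s] = best + Math.log(emit[s][obs[t]]);
                back[t][s] = bestPrev;
            }
        }
        // Trace back from the best final state.
        int[] states = new int[obs.length];
        int last = 0;
        for (int s = 1; s < nStates; s++)
            if (v[obs.length - 1][s] > v[obs.length - 1][last]) last = s;
        states[obs.length - 1] = last;
        for (int t = obs.length - 1; t > 0; t--)
            states[t - 1] = back[t][states[t]];
        return states;
    }
}
```

With a classic two-state fair/biased-coin model, a run of heads pulls the decoded path into the biased state, which is the behaviour the DP classes in this package generalise to richer state types.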