Package org.biojava.bio.symbol

Representation of the Symbols that make up a sequence, and locations within them.

Interface Summary
Alignment An alignment containing multiple SymbolLists.
Alphabet The set of AtomicSymbols which can be concatenated together to make a SymbolList.
AlphabetIndex Map between Symbols and index numbers.
AtomicSymbol A symbol that is not ambiguous.
BasisSymbol A symbol that can be represented as a string of Symbols.
FiniteAlphabet An alphabet over a finite set of Symbols.
FuzzyLocation.RangeResolver Determines how a FuzzyLocation should be treated when used as a normal Location.
FuzzyPointLocation.PointResolver Determines how a FuzzyPointLocation should be treated when used as a normal Location.
GappedSymbolList This extends SymbolList with API for manipulating, inserting and deleting gaps.
Location A set of integers, often used to represent positions on biological sequences.
Packing An encapsulation of the way symbols map to bit-patterns.
ReversibleTranslationTable A translation table that can also translate from the target to source alphabet.
Symbol A single symbol.
SymbolList A sequence of symbols that belong to an alphabet.
SymbolListFactory This interface exists to hide implementational details of SymbolLists when making chunked symbol lists.
SymbolPropertyTable Class for maintaining properties associated with a symbol.
TranslationTable Encapsulates the mapping from a source to a destination alphabet.
 

Class Summary
AbstractAlphabet An abstract implementation of Alphabet.
AbstractLocation An abstract implementation of Location.
AbstractLocationDecorator Abstract Location decorator (wrapper).
AbstractRangeLocation Base class for simple contiguous Location implementations.
AbstractSymbol The base-class for Symbol implementations.
AbstractSymbolList Abstract helper implementation of the SymbolList core interface.
Alignment.SymbolListIterator Iterator implementation looping over symbol lists in an alignment using the labels.
AlphabetManager Utility methods for working with Alphabets.
BetweenLocation Between view onto an underlying Location instance.
CircularLocation Circular view onto an underlying Location instance.
DNAAmbPack Packing utility class for DNA.
DNANoAmbPack A Packing implementation which handles the DNA alphabet, without any support for ambiguity symbols.
DoubleAlphabet An efficient implementation of an Alphabet over the infinite set of double values.
DoubleAlphabet.DoubleRange A range of double values.
DoubleAlphabet.DoubleSymbol A single double value.
DoubleAlphabet.SubDoubleAlphabet A class to represent a contiguous range of double symbols.
DummySymbolList Symbol list which just consists of non-informative symbols.
Edit Encapsulates an edit operation on a SymbolList.
FundamentalAtomicSymbol An atomic symbol consisting only of itself.
FuzzyLocation A 'fuzzy' location, in the style of EMBL fuzzy locations.
FuzzyPointLocation FuzzyPointLocation represents two types of EMBL-style partially-defined locations.
IntegerAlphabet An efficient implementation of an Alphabet over the infinite set of integer values.
IntegerAlphabet.IntegerSymbol A single int value.
IntegerAlphabet.SubIntegerAlphabet A class to represent a finite contiguous subset of the infinite IntegerAlphabet.
LocationTools Tools class containing a number of operators for working with Location objects.
MergeLocation Produced by LocationTools as a result of union operations.
MotifTools MotifTools contains utility methods for sequence motifs.
PackedDnaSymbolList A class that implements storage of symbols in packed form (2 symbols per byte).
PackedSymbolList A SymbolList that stores symbols as bit-patterns in an array of longs.
PackedSymbolListFactory This class makes PackedSymbolLists.
PackingFactory A factory that is used to maintain associations between alphabets and preferred bit-packings for them.
PointLocation A location representing a single point.
RangeLocation A simple implementation of Location that contains all points between getMin and getMax inclusive.
RelabeledAlignment An alignment that relabels another alignment.
SimpleAlignment A simple implementation of an Alignment.
SimpleAlphabet A simple no-frills implementation of the FiniteAlphabet interface.
SimpleAtomicSymbol A basic implementation of AtomicSymbol.
SimpleGappedSymbolList This implementation of GappedSymbolList wraps a SymbolList, allowing you to insert gaps.
SimpleReversibleTranslationTable A no-frills implementation of ReversibleTranslationTable that uses two Maps to map between symbols in a finite source alphabet into a finite target alphabet.
SimpleSymbolList Basic implementation of SymbolList.
SimpleSymbolListFactory This class makes SimpleSymbolLists.
SimpleSymbolPropertyTable Class that implements the SymbolPropertyTable interface.
SimpleTranslationTable A no-frills implementation of TranslationTable that uses a Map to map from symbols in a finite source alphabet into a target alphabet.
SingletonAlphabet An alphabet that contains a single atomic symbol.
SuffixTree Suffix tree implementation.
SuffixTree.SuffixNode A node in the suffix tree.
SymbolList.EmptySymbolList The empty immutable implementation.
SymbolListViews Tools class for constructing views of SymbolList objects.
UkkonenSuffixTree A suffix tree is an efficient method for encoding the frequencies of motifs in a sequence.
 

Exception Summary
IllegalAlphabetException The exception to indicate that an invalid alphabet has been used.
IllegalSymbolException The exception to indicate that a symbol is not valid within a context.
 

Package org.biojava.bio.symbol Description

This package is not intended to have strong biological ties. It exists to make tasks such as dynamic programming much easier to implement. It also handles serialization of well-known alphabets so that the applicable singleton properties of alphabets and Symbols are maintained.
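One general Java idiom for keeping singleton identity across serialization is a readResolve method. The sketch below only illustrates that idiom under stated assumptions; it does not describe how this package implements it. The names WellKnownAlphabet, DNA and roundTrip are hypothetical and are not part of BioJava.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Hypothetical stand-in for a well-known, singleton alphabet (not a BioJava class).
final class WellKnownAlphabet implements Serializable {
    private static final long serialVersionUID = 1L;

    // The single canonical instance.
    static final WellKnownAlphabet DNA = new WellKnownAlphabet("DNA");

    private final String name;

    private WellKnownAlphabet(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Called by Java serialization after the object is read: swap the freshly
    // deserialized copy for the canonical instance so that == keeps working.
    private Object readResolve() throws ObjectStreamException {
        if (DNA.name.equals(name)) {
            return DNA;
        }
        throw new InvalidObjectException("Unknown alphabet: " + name);
    }

    // Serializes a value to a byte array and reads it back again.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(DNA);
        System.out.println(copy == DNA); // true: identity survives the round trip
    }
}
```

Without readResolve, the round trip would yield a second object with the same name, and reference comparisons against the canonical instance would fail.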

All coordinates are 'bio-coordinates': legal indexes start from 1 and ranges are inclusive (4 to 7 includes 4, 5, 6 and 7).
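To make the convention concrete, here is a minimal sketch in plain Java showing how a 1-based inclusive range maps onto Java's 0-based, end-exclusive string indexes. It does not use the BioJava API; the class BioCoords and its methods length and subStr are hypothetical helpers written for this illustration.

```java
// Hypothetical helper, not part of BioJava: converts between bio-coordinates
// (1-based, inclusive) and Java's 0-based, end-exclusive string indexes.
final class BioCoords {
    private BioCoords() {
    }

    // Number of positions covered by the inclusive range min..max.
    static int length(int min, int max) {
        if (min < 1 || max < min) {
            throw new IllegalArgumentException(
                "Invalid bio-coordinate range: " + min + ".." + max);
        }
        return max - min + 1;
    }

    // Extracts the inclusive range min..max from a sequence string.
    static String subStr(String seq, int min, int max) {
        length(min, max); // validates the range
        if (max > seq.length()) {
            throw new IndexOutOfBoundsException(
                "max " + max + " exceeds sequence length " + seq.length());
        }
        // Shift the start down by one; the inclusive end becomes the exclusive end.
        return seq.substring(min - 1, max);
    }

    public static void main(String[] args) {
        String seq = "ACGTTGCA";
        System.out.println(length(4, 7));      // 4 (positions 4, 5, 6 and 7)
        System.out.println(subStr(seq, 4, 7)); // TTGC
    }
}
```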

A Symbol is a single token. The Symbol maintains a name, a token (char), and an Annotation bundle. A set of Symbols is represented by an Alphabet instance. If the Alphabet can guarantee that there are only ever a finite number of Symbols contained within it, then it must implement FiniteAlphabet. The Symbol objects within a FiniteAlphabet can be tested for equality by comparing their references directly. A SymbolList is a string over the Symbols from a single Alphabet instance. This allows you to represent a sequence of tokens, such as DNA nucleotides or stock-market prices.
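The reference-equality property can be illustrated with a small stand-alone sketch. It assumes an alphabet that hands out exactly one canonical object per symbol, so every occurrence of a token in a parsed sequence is the same object. Sym and TinyAlphabet are hypothetical stand-ins, not the BioJava Symbol and FiniteAlphabet types, and they omit annotations entirely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a symbol: a name plus a single-character token.
final class Sym {
    private final String name;
    private final char token;

    Sym(String name, char token) {
        this.name = name;
        this.token = token;
    }

    String getName() {
        return name;
    }

    char getToken() {
        return token;
    }
}

// Hypothetical finite alphabet that owns one canonical Sym per token.
final class TinyAlphabet {
    private final Map<Character, Sym> byToken = new LinkedHashMap<>();

    TinyAlphabet add(String name, char token) {
        byToken.put(token, new Sym(name, token));
        return this;
    }

    // Always returns the same object for the same token.
    Sym symbolFor(char token) {
        Sym sym = byToken.get(token);
        if (sym == null) {
            throw new IllegalArgumentException("No symbol for token '" + token + "'");
        }
        return sym;
    }

    // Builds a list of symbols, all drawn from this one alphabet.
    List<Sym> parse(String text) {
        List<Sym> symbols = new ArrayList<>();
        for (char c : text.toCharArray()) {
            symbols.add(symbolFor(c));
        }
        return symbols;
    }

    public static void main(String[] args) {
        TinyAlphabet dna = new TinyAlphabet()
            .add("adenine", 'a').add("cytosine", 'c')
            .add("guanine", 'g').add("thymine", 't');
        List<Sym> seq = dna.parse("gattaca");
        // Both 'a' positions hold the same canonical object, so == is sufficient.
        System.out.println(seq.get(1) == seq.get(4)); // true
        System.out.println(seq.get(0) == seq.get(1)); // false
    }
}
```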

CrossProductAlphabet and CrossProductSymbol allow alphabets and symbols to be represented that are the combination of two or more alphabets and symbols under cross-product. For example, the cross-product alphabet DNA x DNA would contain all di-nucleotides. DNA x DNA x DNA x Protein would contain all combinations of three nucleotides and a single amino-acid. Dice x Coin would contain every possible combination of die rolls (1..6) and coin flips (Heads, Tails) as the Symbol objects (1, Heads), (1, Tails), (2, Heads) ... (6, Tails). If any one of the Alphabets that make up the source of a CrossProductAlphabet is not finite, then the resulting CrossProductAlphabet will not be finite either.
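The Dice x Coin example can be enumerated with a short stand-alone sketch. It models each finite alphabet as a plain list of strings and each cross-product symbol as a list of component symbols; CrossProductSketch and crossProduct are hypothetical names for this illustration, not BioJava API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a cross product over finite alphabets.
final class CrossProductSketch {
    private CrossProductSketch() {
    }

    // Returns every tuple that takes one symbol from each alphabet, in order.
    static List<List<String>> crossProduct(List<List<String>> alphabets) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<String>()); // start from the single empty tuple
        for (List<String> alphabet : alphabets) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String symbol : alphabet) {
                    List<String> tuple = new ArrayList<>(prefix);
                    tuple.add(symbol);
                    next.add(tuple);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> dice = Arrays.asList("1", "2", "3", "4", "5", "6");
        List<String> coin = Arrays.asList("Heads", "Tails");
        List<List<String>> tuples = crossProduct(Arrays.asList(dice, coin));
        System.out.println(tuples.size()); // 12
        System.out.println(tuples.get(0)); // [1, Heads]
        System.out.println(tuples.get(tuples.size() - 1)); // [6, Tails]
    }
}
```

Enumeration of this kind only terminates when every component alphabet is finite, which matches the finiteness rule stated above.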

Locations within a SymbolList can be represented by a Location object. This interface defines the subset of points that lie within the Location. It uses bio-coordinates and defines the operations that you are likely to need to build your own Locations (union, intersection and the like).
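The idea of a location as a set of points, with union and intersection defined on it, can be sketched in plain Java as follows. This is a deliberately naive model that stores every point explicitly; it says nothing about how the Location implementations in this package are actually built. PointSetLocation and all of its methods are hypothetical names.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical model of a location as an explicit, sorted set of 1-based points.
final class PointSetLocation {
    private final SortedSet<Integer> points;

    private PointSetLocation(SortedSet<Integer> points) {
        this.points = points;
    }

    // All points from min to max inclusive, in bio-coordinates.
    static PointSetLocation range(int min, int max) {
        if (min < 1 || max < min) {
            throw new IllegalArgumentException("Invalid range: " + min + ".." + max);
        }
        SortedSet<Integer> set = new TreeSet<>();
        for (int i = min; i <= max; i++) {
            set.add(i);
        }
        return new PointSetLocation(set);
    }

    boolean contains(int point) {
        return points.contains(point);
    }

    boolean isEmpty() {
        return points.isEmpty();
    }

    // Smallest and largest points; both throw NoSuchElementException when empty.
    int getMin() {
        return points.first();
    }

    int getMax() {
        return points.last();
    }

    // True when the points form one unbroken run (or when there are none).
    boolean isContiguous() {
        return points.isEmpty() || getMax() - getMin() + 1 == points.size();
    }

    PointSetLocation union(PointSetLocation other) {
        SortedSet<Integer> copy = new TreeSet<>(points);
        copy.addAll(other.points);
        return new PointSetLocation(copy);
    }

    PointSetLocation intersection(PointSetLocation other) {
        SortedSet<Integer> copy = new TreeSet<>(points);
        copy.retainAll(other.points);
        return new PointSetLocation(copy);
    }

    public static void main(String[] args) {
        PointSetLocation a = range(4, 7);
        PointSetLocation b = range(6, 10);
        PointSetLocation c = range(20, 25);
        PointSetLocation ab = a.intersection(b);
        System.out.println(ab.getMin() + ".." + ab.getMax()); // 6..7
        System.out.println(a.union(b).isContiguous());        // true
        System.out.println(a.union(c).isContiguous());        // false
        System.out.println(a.intersection(c).isEmpty());      // true
    }
}
```

Storing every point is only practical for short ranges; a real implementation would more plausibly keep just the range boundaries.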