org.biojava.bio.seq.db
Class IndexedSequenceDB

java.lang.Object
  extended by org.biojava.utils.AbstractChangeable
      extended by org.biojava.bio.seq.db.AbstractSequenceDB
          extended by org.biojava.bio.seq.db.IndexedSequenceDB
All Implemented Interfaces:
java.io.Serializable, SequenceDB, SequenceDBLite, Changeable

public final class IndexedSequenceDB
extends AbstractSequenceDB
implements SequenceDB, java.io.Serializable

This class implements SequenceDB on top of a set of sequence files and sequence offsets within these files.

This class is primarily responsible for managing the sequence IO, such as calculating the sequence file offsets and parsing individual sequences based upon those offsets. The actual persistent storage of all this information is delegated to an instance of IndexStore, such as TabIndexStore.

 // create a new index store and populate it
 // this may take some time
 TabIndexStore indexStore = new TabIndexStore(
   storeFile, indexFile, dbName,
   format, sbFactory, symbolParser );
 IndexedSequenceDB seqDB = new IndexedSequenceDB(indexStore);

 // files is assumed to be a java.io.File[] of sequence files
 for(int i = 0; i < files.length; i++) {
   seqDB.addFile(files[i]);
 }

 // later, in a separate scope or program run:
 // load an existing index store and fetch a sequence
 // this should be quite quick
 TabIndexStore indexStore = TabIndexStore.open(storeFile);
 SequenceDB seqDB = new IndexedSequenceDB(indexStore);
 Sequence seq = seqDB.getSequence(id);
 

Note: We may be able to improve the indexing speed further by discarding all feature creation & annotation requests during index parsing.
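
The offset-based lookup described above can be illustrated without BioJava. The following is a minimal sketch under stated assumptions, not the library's implementation: the class name OffsetIndexSketch, its methods, and the choice of FASTA-style records are all hypothetical, and the index lives in memory rather than in an IndexStore. It records the byte offset of each record header once, then seeks straight to that offset to parse a single record on demand.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not BioJava code and not the IndexStore file format.
final class OffsetIndexSketch {
    // Maps each record ID to the byte offset of its '>' header line.
    private final Map<String, Long> offsets = new LinkedHashMap<>();
    private final Path file;

    // Indexing pass: scan the whole file once and remember where each record starts.
    OffsetIndexSketch(Path file) throws IOException {
        this.file = file;
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long lineStart = raf.getFilePointer();
            String line;
            while ((line = raf.readLine()) != null) {
                if (line.startsWith(">")) {
                    offsets.put(idOf(line), lineStart);
                }
                lineStart = raf.getFilePointer();
            }
        }
    }

    // Assumption: the ID is the first whitespace-delimited token after '>'.
    private static String idOf(String header) {
        return header.substring(1).trim().split("\\s+")[0];
    }

    // Lookup: seek to the stored offset and parse only that one record.
    String fetch(String id) throws IOException {
        Long offset = offsets.get(id);
        if (offset == null) {
            throw new IllegalArgumentException("Unknown id: " + id);
        }
        StringBuilder residues = new StringBuilder();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);
            raf.readLine(); // skip the header line
            String line;
            while ((line = raf.readLine()) != null && !line.startsWith(">")) {
                residues.append(line.trim());
            }
        }
        return residues.toString();
    }
}
```

The expensive scan happens once, in the constructor; each later fetch touches only the bytes of the requested record. This mirrors why populating an index may take some time while fetching from an existing index should be quick.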

Author:
Matthew Pocock, Thomas Down, Keith James
See Also:
TabIndexStore, Serialized Form

Field Summary
 
Fields inherited from interface org.biojava.bio.seq.db.SequenceDBLite
SEQUENCES
 
Constructor Summary
IndexedSequenceDB(IDMaker idMaker, IndexStore indexStore)
          Create an IndexedSequenceDB by specifying both the IDMaker and IndexStore used.
IndexedSequenceDB(IndexStore indexStore)
          Create an IndexedSequenceDB by specifying the IndexStore used.
 
Method Summary
 void addFile(java.io.File seqFile)
          Add sequences from a file to the sequence database.
 IndexStore getIndexStore()
          Retrieve the IndexStore.
 java.lang.String getName()
          Get the name of this sequence database.
 Sequence getSequence(java.lang.String id)
          Retrieve a single sequence by its id.
 java.util.Set ids()
          Get an immutable set of all of the IDs in the database.
 SequenceIterator sequenceIterator()
          Returns a SequenceIterator over all sequences in the database.
 
Methods inherited from class org.biojava.bio.seq.db.AbstractSequenceDB
addSequence, filter, removeSequence
 
Methods inherited from class org.biojava.utils.AbstractChangeable
addChangeListener, addChangeListener, generateChangeSupport, getChangeSupport, hasListeners, hasListeners, isUnchanging, removeChangeListener, removeChangeListener
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.biojava.bio.seq.db.SequenceDB
filter
 
Methods inherited from interface org.biojava.bio.seq.db.SequenceDBLite
addSequence, removeSequence
 
Methods inherited from interface org.biojava.utils.Changeable
addChangeListener, addChangeListener, isUnchanging, removeChangeListener, removeChangeListener
 

Constructor Detail

IndexedSequenceDB

public IndexedSequenceDB(IDMaker idMaker,
                         IndexStore indexStore)
Create an IndexedSequenceDB by specifying both the IDMaker and IndexStore used.

The IDMaker will be used to calculate the ID for each Sequence. The database will delegate the storage and retrieval of the sequence offsets to the IndexStore.

Parameters:
idMaker - the IDMaker used to calculate Sequence IDs
indexStore - the IndexStore delegate

IndexedSequenceDB

public IndexedSequenceDB(IndexStore indexStore)
Create an IndexedSequenceDB by specifying the IndexStore used.

IDMaker.byName will be used to calculate the ID for each Sequence. The database will delegate the storage and retrieval of the sequence offsets to the IndexStore.

Parameters:
indexStore - the IndexStore delegate
Method Detail

getIndexStore

public IndexStore getIndexStore()
Retrieve the IndexStore.

Returns:
the IndexStore delegate

addFile

public void addFile(java.io.File seqFile)
             throws IllegalIDException,
                    BioException,
                    ChangeVetoException
Add sequences from a file to the sequence database. This method works on an "all or nothing" principle. If it can successfully interpret the entire file, all the sequences will be read in. However, if it encounters any problems, it will abandon the whole file and throw one of the exceptions declared below. A BioException will be thrown if it has problems reading or understanding the sequences. Multiple files may be indexed into a single database.

Parameters:
seqFile - the file containing the sequence or set of sequences
Throws:
BioException - if for any reason the sequences can't be read correctly
ChangeVetoException - if there is a listener that vetoes adding the files
IllegalIDException
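
The "all or nothing" principle can be sketched as a stage-then-commit pattern. This is an illustration under assumptions, not how addFile is actually implemented: AllOrNothingIndexSketch and its methods are hypothetical names, entries are plain (id, offset) pairs, and an unchecked IllegalArgumentException stands in for the checked exceptions listed above. Entries from one file are validated into a temporary map and copied into the live index only if every entry passes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of "all or nothing" indexing; not BioJava's implementation.
final class AllOrNothingIndexSketch {
    private final Map<String, Long> committed = new LinkedHashMap<>();

    // Adds every entry of the batch, or none of them if any entry is rejected.
    void addAll(Map<String, Long> batch) {
        Map<String, Long> staged = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : batch.entrySet()) {
            String id = entry.getKey();
            // Assumed rejection rules: empty IDs and IDs already in the index.
            if (id == null || id.isEmpty() || committed.containsKey(id)) {
                // Nothing has been written to 'committed' yet, so the batch is simply dropped.
                throw new IllegalArgumentException("Rejected id: " + id);
            }
            staged.put(id, entry.getValue());
        }
        // Reached only when the whole batch validated.
        committed.putAll(staged);
    }

    boolean contains(String id) {
        return committed.containsKey(id);
    }

    int size() {
        return committed.size();
    }
}
```

Because the live index is modified in a single final step, a failure partway through a file leaves the entries from previously added files untouched.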

getName

public java.lang.String getName()
Get the name of this sequence database. The name is retrieved from the IndexStore delegate.

Specified by:
getName in interface SequenceDBLite
Returns:
the name of the sequence database, which may be null.

getSequence

public Sequence getSequence(java.lang.String id)
                     throws IllegalIDException,
                            BioException
Description copied from interface: SequenceDBLite
Retrieve a single sequence by its id.

Specified by:
getSequence in interface SequenceDBLite
Parameters:
id - the id to retrieve by
Returns:
the Sequence with that id
Throws:
IllegalIDException - if the database doesn't know about the id
BioException - if there was a failure in retrieving the sequence

sequenceIterator

public SequenceIterator sequenceIterator()
Description copied from interface: SequenceDB
Returns a SequenceIterator over all sequences in the database. The order of retrieval is undefined.

Specified by:
sequenceIterator in interface SequenceDB
Overrides:
sequenceIterator in class AbstractSequenceDB
Returns:
a SequenceIterator over all sequences

ids

public java.util.Set ids()
Description copied from interface: SequenceDB
Get an immutable set of all of the IDs in the database. The ids are legal arguments to getSequence.

Specified by:
ids in interface SequenceDB
Returns:
a Set of ids - at the moment, strings