EMBOSS: union


Program union

Function

Reads sequence fragments and builds one sequence

Description

union reads in several sequences, concatenates them and writes them out as a single sequence.

It is most useful when the input sequences are specified in a List file. The List file (a file of sequence names) can name any set of sequences in files or database entries, each specified as a normal EMBOSS USA (Uniform Sequence Address). A USA can include the specification of a sub-region of the sequence, e.g. 'em:hsfau[20:55]'. Specifying several such sub-regions of one or more sequences lets you join disjoint regions into a single sequence.
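
To make the idea of joining disjoint sub-regions concrete, here is a minimal Python sketch. It is not how union is implemented; it only mimics the effect on an in-memory string. The helper names `parse_region` and `join_regions` and the toy sequence are hypothetical, and the sketch assumes that ranges are 1-based and inclusive and written in the `name[start:end]` form used in the List file example in this document.

```python
import re

# Assumed pattern for a USA with a sub-region, e.g. "toy[3:6]".
REGION = re.compile(r"^(?P<name>[^\[\]]+)\[(?P<start>\d+):(?P<end>\d+)\]$")

def parse_region(usa):
    """Split 'name[start:end]' into (name, start, end)."""
    match = REGION.match(usa.strip())
    if match is None:
        raise ValueError(f"not a 'name[start:end]' region: {usa!r}")
    return match["name"], int(match["start"]), int(match["end"])

def join_regions(sequences, usas):
    """Concatenate the requested sub-regions in the order given.

    Positions are assumed to be 1-based and inclusive.
    """
    parts = []
    for usa in usas:
        name, start, end = parse_region(usa)
        parts.append(sequences[name][start - 1:end])
    return "".join(parts)

# Toy data, not a real database entry.
sequences = {"toy": "AAACCCGGGTTT"}
joined = join_regions(sequences, ["toy[1:3]", "toy[7:9]"])
print(joined)  # AAAGGG
```

The regions are joined in the order they are listed, so swapping the two entries would give a different result.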

Usage

Here is a sample session with union. The file 'cds.list' contains a list of the regions making up the coding sequence of 'embl:hsfau':

% union
Reads sequence fragments and builds one sequence
Input sequence(s): @cds.list
Output sequence [hsfau1.fasta]: fau.cds 
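
The same run can be scripted by supplying both mandatory qualifiers on the command line, which should avoid the interactive prompts. The sketch below builds that command in Python and runs it only if a `union` executable is found on the PATH and the List file exists. The helper name `union_command` is hypothetical; the qualifiers and file names are those shown in this document.

```python
import os
import shutil
import subprocess

def union_command(list_file, outseq):
    """Build the argument list for a union run without prompts.

    The leading '@' marks the input as a List file, as in the session above.
    """
    return ["union", "-sequence", f"@{list_file}", "-outseq", outseq]

cmd = union_command("cds.list", "fau.cds")
print(" ".join(cmd))  # union -sequence @cds.list -outseq fau.cds

# Run only when EMBOSS is installed and the List file is present.
if shutil.which("union") is not None and os.path.exists("cds.list"):
    subprocess.run(cmd, check=True)
```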

Command line arguments

   Mandatory qualifiers:
  [-sequence]          seqall     Sequence database USA
  [-outseq]            seqout     Output sequence USA

   Optional qualifiers: (none)
   Advanced qualifiers: (none)
   General qualifiers:
  -help                bool       report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose


   Mandatory qualifiers                      Allowed values        Default
  [-sequence]      Sequence database USA     Readable sequence(s)  Required
  (Parameter 1)
  [-outseq]        Output sequence USA       Writeable sequence    <sequence>.format
  (Parameter 2)

   Optional qualifiers                       Allowed values        Default
  (none)

   Advanced qualifiers                       Allowed values        Default
  (none)

Input file format

The input can be any set of sequences in a file of multiple sequence entries or a List file. The sequences are concatenated in the order in which they appear in the file.

The input used in the above example is:

em-id:HSFAU1[782:856]
em-id:HSFAU1[951:1095]
em-id:HSFAU1[1557:1612]
em-id:HSFAU1[1787:1912]
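
As a quick sanity check before running union, the expected length of the joined sequence can be worked out from the list entries alone. The sketch below assumes 1-based, inclusive `[start:end]` ranges; `region_length` is a hypothetical helper, not part of EMBOSS.

```python
import re

LIST_ENTRIES = """\
em-id:HSFAU1[782:856]
em-id:HSFAU1[951:1095]
em-id:HSFAU1[1557:1612]
em-id:HSFAU1[1787:1912]
"""

def region_length(entry):
    """Length of a 'name[start:end]' region, assuming inclusive ends."""
    match = re.search(r"\[(\d+):(\d+)\]\s*$", entry)
    if match is None:
        raise ValueError(f"no [start:end] range in {entry!r}")
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1

lengths = [region_length(line) for line in LIST_ENTRIES.splitlines() if line.strip()]
total = sum(lengths)
print(lengths, total)  # [75, 145, 56, 126] 402
```

Under these assumptions the four regions total 402 bases, a multiple of three, which is what one would expect for a coding sequence.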

You may find the program yank useful for creating List files.

Output file format

The result is a normal sequence file containing a single sequence: the concatenation of the input sequences.

Data files

None.

Notes

None.

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with status 0.

Known bugs

None.

See also

Program name  Description
biosed        Replace or delete sequence sections
cutseq        Removes a specified section from a sequence
degapseq      Removes gap characters from sequences
descseq       Alter the name or description of a sequence
entret        Reads and writes (returns) flatfile entries
extractfeat   Extract features from a sequence
extractseq    Extract regions from a sequence
listor        Writes a list file of the logical OR of two sets of sequences
maskfeat      Mask off features of a sequence
maskseq       Mask off regions of a sequence
newseq        Type in a short new sequence
noreturn      Removes carriage return from ASCII files
notseq        Excludes a set of sequences and writes out the remaining ones
nthseq        Writes one sequence from a multiple set of sequences
pasteseq      Insert one sequence into another
revseq        Reverse and complement a sequence
seqret        Reads and writes (returns) sequences
seqretsplit   Reads and writes (returns) sequences in individual files
splitter      Split a sequence into (overlapping) smaller sequences
swissparse    Retrieves sequences from swissprot using keyword search
trimest       Trim poly-A tails off EST sequences
trimseq       Trim ambiguous bits off the ends of sequences
vectorstrip   Strips out DNA between a pair of vector sequences
yank          Reads a sequence range, appends the full USA to a list file

Author(s)

This application was written by Peter Rice (peter.rice@uk.lionbioscience.com)

History

Written (March 2002) - Peter Rice.

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments