EMBOSS: dotmatcher
The two sequences are placed on the axes of a rectangular image and (subject to threshold conditions) wherever there is a similarity between the sequences a dot is placed on the image.
Where the two sequences have substantial regions of similarity, many dots align to form diagonal lines. It is therefore possible to see at a glance where there are local regions of similarity as these will have long diagonal lines. It is also easy to see other features such as repeats (which form parallel diagonal lines), and insertions or deletions (which form breaks or discontinuities in the diagonal lines).
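The features described above can be seen in a toy text dotplot. The following is a minimal sketch, not dotmatcher's own method: it marks every identical residue pair with no window or threshold, and the function name `dot_matrix` is hypothetical.

```python
def dot_matrix(seq_a, seq_b):
    """Return the rows of a text dotplot.

    seq_a runs along the x axis (columns), seq_b along the y axis (rows).
    A '*' marks an identical pair of residues, a '.' marks a mismatch.
    """
    return [
        "".join("*" if a == b else "." for a in seq_a)
        for b in seq_b
    ]


# seq_a contains the motif ACGT twice, so the plot shows two parallel diagonals.
rows = dot_matrix("ACGTACGT", "ACGT")
for row in rows:
    print(row)
```

Because the first sequence repeats the motif, each row carries two dots and the dots line up as two parallel diagonals, which is the repeat signature described above.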
dotmatcher plots a match only where a score, calculated from the substitution matrix, exceeds a threshold. A window of specified length is moved along every possible diagonal, and a score is calculated for the window at each position along each diagonal. The score is the sum of the pairwise comparisons of the two sequences within the window, using the given similarity matrix. If the score is above the threshold, a line is plotted on the image over the position of the window.
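The windowed scoring can be sketched as follows. This is an illustration under stated assumptions, not the dotmatcher source: `window_hits` and `toy_score` are hypothetical names, `toy_score` stands in for a real substitution matrix, "above the threshold" is taken to mean strictly greater, and each window is rescored from scratch for clarity rather than speed.

```python
def window_hits(seq_a, seq_b, score, windowsize=10, threshold=50):
    """Return (i, j) start positions of windows scoring above the threshold.

    i indexes seq_a and j indexes seq_b. Each window compares
    seq_a[i:i+windowsize] with seq_b[j:j+windowsize], i.e. one stretch
    of a single diagonal.
    """
    hits = []
    for i in range(len(seq_a) - windowsize + 1):
        for j in range(len(seq_b) - windowsize + 1):
            # Sum the pairwise scores along the window.
            total = sum(score(seq_a[i + k], seq_b[j + k])
                        for k in range(windowsize))
            if total > threshold:  # assumption: strictly above
                hits.append((i, j))
    return hits


def toy_score(a, b):
    # Hypothetical stand-in for a substitution matrix such as EBLOSUM62.
    return 5 if a == b else -4


hits = window_hits("AAAACCCC", "GGCCCCGG", toy_score,
                   windowsize=4, threshold=15)
print(hits)
```

With this toy scoring, a window of four identical residues scores 20 and a window with one mismatch scores 11, so a threshold of 15 keeps only exact four-residue matches. Lowering the threshold or lengthening the window changes how much mismatch a plotted line tolerates.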
% dotmatcher sw:hba_human sw:hbb_human

    Mandatory qualifiers (* if not always prompted):
      [-sequencea]  sequence  Sequence USA
      [-sequenceb]  sequence  Sequence USA
    * -data         bool      Output the match data to a file instead of plotting it
    * -graph        graph     Graph type
    * -xygraph      xygraph   Graph type
    * -outfile      outfile   Display as data
    Optional qualifiers:
      -windowsize   integer   window size over which to test threshhold
      -threshold    integer   threshold
      -matrixfile   matrix    Matrix file
    Advanced qualifiers:
      -stretch      bool      Display a non-proportional graph
    General qualifiers:
      -help         bool      report command line options. More information on associated and general qualifiers can be found with -help -verbose

Mandatory qualifiers | Description | Allowed values | Default |
---|---|---|---|
[-sequencea] (Parameter 1) | Sequence USA | Readable sequence | Required |
[-sequenceb] (Parameter 2) | Sequence USA | Readable sequence | Required |
-data | Output the match data to a file instead of plotting it | Yes/No | No |
-graph | Graph type | EMBOSS has a list of known devices, including postscript, ps, hpgl, hp7470, hp7580, meta, colourps, cps, xwindows, x11, tektronics, tekt, tek4107t, tek, none, null, text, data, xterm, png | EMBOSS_GRAPHICS value, or x11 |
-xygraph | Graph type | EMBOSS has a list of known devices, including postscript, ps, hpgl, hp7470, hp7580, meta, colourps, cps, xwindows, x11, tektronics, tekt, tek4107t, tek, none, null, text, data, xterm, png | EMBOSS_GRAPHICS value, or x11 |
-outfile | Display as data | Output file | <sequence>.dotmatcher |
Optional qualifiers | Description | Allowed values | Default |
-windowsize | window size over which to test threshold | Integer 3 or more | 10 |
-threshold | threshold | Integer 0 or more | 50 |
-matrixfile | Matrix file | Comparison matrix file in EMBOSS data path | EBLOSUM62 for protein, EDNAFULL for DNA |
Advanced qualifiers | Description | Allowed values | Default |
-stretch | Display a non-proportional graph | Yes/No | No |
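
When dotmatcher is driven from a script, the qualifiers in the table above map directly onto a command line. The helper below is a hypothetical sketch (the name `dotmatcher_argv` is not part of EMBOSS): it only assembles the argument list, using the qualifier names, defaults and allowed ranges listed in the table, and leaves running the program to the caller.

```python
def dotmatcher_argv(seq_a, seq_b, windowsize=10, threshold=50,
                    graph=None, stretch=False):
    """Build an argument list for dotmatcher from the documented qualifiers."""
    # Ranges taken from the qualifier table.
    if windowsize < 3:
        raise ValueError("windowsize must be an integer of 3 or more")
    if threshold < 0:
        raise ValueError("threshold must be an integer of 0 or more")

    argv = ["dotmatcher", seq_a, seq_b,
            "-windowsize", str(windowsize),
            "-threshold", str(threshold)]
    if graph is not None:
        argv += ["-graph", graph]   # e.g. "png" or "ps"
    if stretch:
        argv.append("-stretch")
    return argv


argv = dotmatcher_argv("sw:hba_human", "sw:hbb_human", graph="png")
print(" ".join(argv))
# To run it (requires an EMBOSS installation on the PATH):
#   import subprocess; subprocess.run(argv, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with sequence addresses that contain characters such as `:` or `*`.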
By default, EBLOSUM62 is used as the substitution matrix for protein sequences and EDNAFULL for nucleotide sequences. Other matrices can be specified with -matrixfile.
EMBOSS data files are distributed with the application and stored in the standard EMBOSS data directory, which is defined by EMBOSS environment variable EMBOSS_DATA.
Users can provide their own data files in their own directories. Project specific files can be put in the current directory, or for tidier directory listings in a subdirectory called ".embossdata". Files for all EMBOSS runs can be put in the user's home directory, or again in a subdirectory called ".embossdata".
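The directories are searched in the following order:
1. . (the current directory)
2. .embossdata (under the current directory)
3. ~/ (the user's home directory)
4. ~/.embossdata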
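
A lookup of this kind can be sketched as below. This is not EMBOSS code: `find_data_file` is a hypothetical helper, the order of the user directories follows the order in which they are described above, and falling back to the directory named by EMBOSS_DATA last is an assumption.

```python
import os
from pathlib import Path


def find_data_file(name, cwd=None, home=None):
    """Return the first existing path for a data file, or None if not found."""
    cwd = Path(cwd) if cwd else Path.cwd()
    home = Path(home) if home else Path.home()

    # User-supplied locations first, in the order described in the text.
    candidates = [cwd, cwd / ".embossdata", home, home / ".embossdata"]

    # Assumption: the standard data directory is consulted last.
    emboss_data = os.environ.get("EMBOSS_DATA")
    if emboss_data:
        candidates.append(Path(emboss_data))

    for directory in candidates:
        path = directory / name
        if path.is_file():
            return path
    return None


print(find_data_file("EBLOSUM62"))
```

The first match wins, so a project-specific matrix placed in the current directory would shadow one of the same name in the home directory.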
Program name | Description |
---|---|
dotpath | Displays a non-overlapping wordmatch dotplot of two sequences |
dottup | Displays a wordmatch dotplot of two sequences |
polydot | Displays all-against-all dotplots of a set of sequences |
dottup, by comparison, uses a wordmatch-style method and has no threshold. It is less sensitive than dotmatcher, but substantially faster.
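To illustrate the trade-off, here is a minimal sketch of a wordmatch-style comparison. It is not dottup's implementation: the function name `word_matches` and the parameter `wordsize` are chosen for illustration, and the sketch simply indexes the exact words of one sequence and looks up the words of the other.

```python
from collections import defaultdict


def word_matches(seq_a, seq_b, wordsize=4):
    """Return (i, j) positions where seq_a and seq_b share an identical word."""
    # Index every word of seq_b by its start position.
    index = defaultdict(list)
    for j in range(len(seq_b) - wordsize + 1):
        index[seq_b[j:j + wordsize]].append(j)

    # Look up each word of seq_a; only exact matches are reported.
    hits = []
    for i in range(len(seq_a) - wordsize + 1):
        for j in index.get(seq_a[i:i + wordsize], ()):
            hits.append((i, j))
    return hits


matches = word_matches("AAAACCCC", "GGCCCCGG", wordsize=4)
print(matches)
```

Each word is found with a dictionary lookup instead of scoring every window on every diagonal, which is why this style of method is fast. A single mismatch inside a word hides the match entirely, which is why it is less sensitive than a scored, thresholded window.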
Completed 1st June 1999. Last modified 16th June 1999.