EMBOSS: dan


Program dan

Function

Calculates DNA or RNA/DNA melting temperature

Description

dan calculates the melting temperature (Tm) and the percent G+C of a nucleic acid sequence, optionally plotting them. The melting temperature profile uses free energy values calculated from nearest-neighbor thermodynamics (Breslauer et al. (1986) Proc. Natl. Acad. Sci. USA 83, 3746-3750 and Baldino et al. (1989) Methods in Enzymol. 168, 761-777).

Usage

Here is a sample session with dan.

% dan
Input sequence: embl:paamir
Enter window size [20]: 
Enter Shift Increment [1]: 
Enter DNA concentration (nM) [50.]: 
Enter salt concentration (mM) [50.]: 
Output file [paamir.dan]: 

An example of producing a plot of Tm:

% dan -plot
Input sequence(s): embl:paamir
Enter window size [20]: 
Enter Shift Increment [1]: 
Enter DNA concentration (nM) [50.]: 
Enter salt concentration (mM) [50.]: 
Enter minimum temperature [55.]: 
Graph type [x11]: 

Command line arguments

   Mandatory qualifiers (* if not always prompted):
  [-sequence]          seqall     Sequence database USA
   -windowsize         integer    The values of melting point and other
                                  thermodynamic properties of the sequence are
                                  determined by taking a short length of
                                  sequence known as a window and determining
                                  the properties of the sequence in that
                                  window. The window is incrementally moved
                                  along the sequence with the properties being
                                  calculated at each new position.
   -shiftincrement     integer    This is the amount by which the window is
                                  moved at each increment in order to find the
                                  melting point and other properties along
                                  the sequence.
   -dnaconc            float      Enter DNA concentration (nM)
   -saltconc           float      Enter salt concentration (mM)
*  -formamide          float      This specifies the percent formamide to be
                                  used in calculations (it is ignored unless
                                  -product is used).
*  -mismatch           float      This specifies the percent mismatch to be
                                  used in calculations (it is ignored unless
                                  -product is used).
*  -prodlen            integer    This specifies the product length to be used
                                  in calculations (it is ignored unless
                                  -product is used).
*  -mintemp            float      Enter a minimum value for the temperature
                                  scale (y-axis) of the plot.
*  -graph              xygraph    Graph type
*  -outfile            report     If a plot is not being produced then data on
                                  the melting point etc. in each window along
                                  the sequence is output to the file.

   Optional qualifiers (* if not always prompted):
*  -temperature        float      If -thermo has been specified then this
                                  specifies the temperature at which to
                                  calculate the DeltaG, DeltaH and DeltaS
                                  values.

   Advanced qualifiers:
   -rna                bool       This specifies that the sequence is an RNA
                                  sequence and not a DNA sequence.
   -product            bool       This prompts for percent formamide, percent
                                  of mismatches allowed and product length.
   -thermo             bool       Output the DeltaG, DeltaH and DeltaS values
                                  of the sequence windows to the output data
                                  file.
   -plot               bool       If this is not specified then the file of
                                  output data is produced, else a plot of the
                                  melting point along the sequence is
                                  produced.

   General qualifiers:
  -help                bool       report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose


Mandatory qualifiers

  [-sequence] (Parameter 1)
      Sequence database USA
      Allowed values: Readable sequence(s)
      Default: Required

  -windowsize
      The values of melting point and other thermodynamic properties of the sequence are determined by taking a short length of sequence known as a window and determining the properties of the sequence in that window. The window is incrementally moved along the sequence with the properties being calculated at each new position.
      Allowed values: Integer from 1 to 100
      Default: 20

  -shiftincrement
      This is the amount by which the window is moved at each increment in order to find the melting point and other properties along the sequence.
      Allowed values: Integer 1 or more
      Default: 1

  -dnaconc
      Enter DNA concentration (nM)
      Allowed values: Number from 1.000 to 100000.000
      Default: 50.

  -saltconc
      Enter salt concentration (mM)
      Allowed values: Number from 1.000 to 1000.000
      Default: 50.

  -formamide
      This specifies the percent formamide to be used in calculations (it is ignored unless -product is used).
      Allowed values: Number from 0.000 to 100.000
      Default: 0.

  -mismatch
      This specifies the percent mismatch to be used in calculations (it is ignored unless -product is used).
      Allowed values: Number from 0.000 to 100.000
      Default: 0.

  -prodlen
      This specifies the product length to be used in calculations (it is ignored unless -product is used).
      Allowed values: Any integer value
      Default: Window size (20)

  -mintemp
      Enter a minimum value for the temperature scale (y-axis) of the plot.
      Allowed values: Number from 0.000 to 150.000
      Default: 55.

  -graph
      Graph type
      Allowed values: EMBOSS has a list of known devices, including postscript, ps, hpgl, hp7470, hp7580, meta, colourps, cps, xwindows, x11, tektronics, tekt, tek4107t, tek, none, null, text, data, xterm, png
      Default: EMBOSS_GRAPHICS value, or x11

  -outfile
      If a plot is not being produced then data on the melting point etc. in each window along the sequence is output to the file.
      Allowed values: Report file

Optional qualifiers

  -temperature
      If -thermo has been specified then this specifies the temperature at which to calculate the DeltaG, DeltaH and DeltaS values.
      Allowed values: Number from 0.000 to 100.000
      Default: 25.

Advanced qualifiers

  -rna
      This specifies that the sequence is an RNA sequence and not a DNA sequence.
      Allowed values: Yes/No
      Default: No

  -product
      This prompts for percent formamide, percent of mismatches allowed and product length.
      Allowed values: Yes/No
      Default: No

  -thermo
      Output the DeltaG, DeltaH and DeltaS values of the sequence windows to the output data file.
      Allowed values: Yes/No
      Default: No

  -plot
      If this is not specified then the file of output data is produced, else a plot of the melting point along the sequence is produced.
      Allowed values: Yes/No
      Default: No
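
For scripted use, the qualifiers above can be supplied on the command line instead of answering prompts. The following is a minimal Python sketch that checks a few values against the documented ranges and assembles an argument list. The helper name build_dan_command is hypothetical and not part of EMBOSS; the sketch only builds the list and does not run dan, and it assumes that supplying every mandatory qualifier is enough to avoid prompting.

```python
# Inclusive ranges copied from the qualifier listing above.
RANGES = {
    "windowsize": (1, 100),
    "dnaconc": (1.0, 100000.0),
    "saltconc": (1.0, 1000.0),
}


def build_dan_command(sequence, outfile, windowsize=20, shiftincrement=1,
                      dnaconc=50.0, saltconc=50.0):
    """Return an argument list for dan (hypothetical helper, not an EMBOSS API)."""
    values = {"windowsize": windowsize, "dnaconc": dnaconc, "saltconc": saltconc}
    for name, value in values.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"-{name} must be between {low} and {high}, got {value}")
    if shiftincrement < 1:
        raise ValueError("-shiftincrement must be 1 or more")
    return [
        "dan",
        "-sequence", sequence,
        "-windowsize", str(windowsize),
        "-shiftincrement", str(shiftincrement),
        "-dnaconc", str(dnaconc),
        "-saltconc", str(saltconc),
        "-outfile", outfile,
    ]


# Example: the defaults used in the sample session.
print(" ".join(build_dan_command("embl:paamir", "paamir.dan")))
```

The resulting list could be passed to subprocess.run on a system where dan is installed; rejecting out-of-range values before launching the program gives an earlier and clearer error in a pipeline.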

Input file format

Any DNA or RNA sequence USA.

Output file format

If a plot is not being produced, dan reports the sequence of each oligomer window, its melting temperature under the specified conditions and its GC content.
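
The windowing and the GC column can be illustrated with a short Python sketch. It is not dan's implementation and does not attempt the Tm calculation; the function name gc_windows is made up for this example. It slides a window along the sequence and reports 1-based start and end positions with the percent G+C of each window.

```python
def gc_windows(sequence, windowsize=20, shiftincrement=1):
    """Yield (start, end, percent_gc, window) using 1-based inclusive positions."""
    seq = sequence.lower()
    for offset in range(0, len(seq) - windowsize + 1, shiftincrement):
        window = seq[offset:offset + windowsize]
        gc = sum(base in "gc" for base in window)  # count G and C bases
        yield offset + 1, offset + windowsize, 100.0 * gc / windowsize, window


# The first 21 bases of the PAAMIR example give two windows of length 20.
for start, end, gc, window in gc_windows("ggtaccgctggccgagcatct"):
    print(f"{start:7d} {end:7d} {gc:5.1f}  {window}")
```

With a window size of 20 and a shift increment of 1, a sequence of 2167 bases yields 2167 - 20 + 1 = 2148 windows, which matches the HitCount shown in the example report.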

The output is a standard EMBOSS report file.

The results can be output in one of several styles by using the command-line qualifier -rformat xxx, where 'xxx' is replaced by the name of the required format. The available format names are: embl, genbank, gff, pir, swiss, trace, listfile, dbmotif, diffseq, excel, feattable, motif, regions, seqtable, simple, srs, table, tagseq

See: http://www.uk.embnet.org/Software/EMBOSS/Themes/ReportFormats.html for further information on report formats.

By default dan writes a 'seqtable' report file.

This is the start and the end of the output file from the example.


########################################
# Program: dan
# Rundate: Mon Feb 11 12:07:10 2002
# Report_file: paamir.dan
########################################

#=======================================
#
# Sequence: PAAMIR     from: 1   to: 2167
# HitCount: 2148
#=======================================

  Start     End Tm     GC     DeltaG DeltaH DeltaS TmProd Sequence
      1      20 64.9   70.0   .      .      .      .      ggtaccgctggccgagcatc
      2      21 63.7   65.0   .      .      .      .      gtaccgctggccgagcatct
      3      22 63.7   65.0   .      .      .      .      taccgctggccgagcatctg
      4      23 66.9   70.0   .      .      .      .      accgctggccgagcatctgc
      5      24 66.7   70.0   .      .      .      .      ccgctggccgagcatctgct
      6      25 65.5   70.0   .      .      .      .      cgctggccgagcatctgctc
      7      26 65.5   70.0   .      .      .      .      gctggccgagcatctgctcg
      8      27 63.7   65.0   .      .      .      .      ctggccgagcatctgctcga
      9      28 62.9   60.0   .      .      .      .      tggccgagcatctgctcgat
     10      29 62.6   65.0   .      .      .      .      ggccgagcatctgctcgatc
     11      30 61.7   60.0   .      .      .      .      gccgagcatctgctcgatca
     12      31 60.2   60.0   .      .      .      .      ccgagcatctgctcgatcac
etc.

   2143    2162 65.6   70.0   .      .      .      .      ggtggccgccaaccagttcc
   2144    2163 64.4   65.0   .      .      .      .      gtggccgccaaccagttcct
   2145    2164 64.1   65.0   .      .      .      .      tggccgccaaccagttcctc
   2146    2165 65.4   70.0   .      .      .      .      ggccgccaaccagttcctcg
   2147    2166 64.2   65.0   .      .      .      .      gccgccaaccagttcctcga
   2148    2167 62.4   65.0   .      .      .      .      ccgccaaccagttcctcgag

#---------------------------------------
#---------------------------------------     

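The header information contains details of the program, date and sequence.

Subsequent lines contain columns of data for each window as it is moved along the sequence, giving:

  Start, End   the first and last positions of the window in the sequence
  Tm           the melting temperature of the window
  GC           the percent G+C of the window
  DeltaG, DeltaH, DeltaS
               thermodynamic values, shown as '.' unless -thermo is given
  TmProd       the melting temperature of the product, shown as '.' unless -product is given
  Sequence     the sequence of the window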
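
Reports in this layout are straightforward to read back into a script. The following Python sketch is one way to do it, assuming whitespace-separated columns and a header line beginning with "Start" as in the examples shown here; parse_dan_report is a hypothetical helper, not part of EMBOSS.

```python
def parse_dan_report(text):
    """Parse the data rows of a dan 'seqtable' report into dictionaries."""
    columns = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and '#' header/footer lines
        if fields[0] == "Start":
            columns = fields  # column header line
            continue
        if columns is None or len(fields) != len(columns):
            continue  # ignore anything that is not a data row
        row = dict(zip(columns, fields))
        for key in ("Start", "End"):
            if key in row:
                row[key] = int(row[key])
        for key in ("Tm", "GC", "DeltaG", "DeltaH", "DeltaS", "TmProd"):
            if key in row:
                # '.' marks a value that was not calculated
                row[key] = None if row[key] == "." else float(row[key])
        rows.append(row)
    return rows


sample = """\
# Sequence: PAAMIR     from: 1   to: 2167
# HitCount: 2148

  Start     End Tm     GC     DeltaG DeltaH DeltaS TmProd Sequence
      1      20 64.9   70.0   .      .      .      54.9   ggtaccgctggccgagcatc
      2      21 63.7   65.0   .      .      .      52.8   gtaccgctggccgagcatct
"""

for row in parse_dan_report(sample):
    print(row["Start"], row["End"], row["Tm"], row["TmProd"], row["Sequence"])
```

Mapping '.' to None keeps uncalculated values distinct from a real value of zero.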

If the qualifier '-product' is used to make the program prompt for percent formamide, percent of mismatches allowed and product length, then the output includes the melting temperature of the specified product:


########################################
# Program: dan
# Rundate: Mon Feb 11 12:11:25 2002
# Report_file: paamir.dan
########################################

#=======================================
#
# Sequence: PAAMIR     from: 1   to: 2167
# HitCount: 2148
#=======================================

  Start     End Tm     GC     DeltaG DeltaH DeltaS TmProd Sequence
      1      20 64.9   70.0   .      .      .      54.9   ggtaccgctggccgagcatc
      2      21 63.7   65.0   .      .      .      52.8   gtaccgctggccgagcatct
      3      22 63.7   65.0   .      .      .      52.8   taccgctggccgagcatctg
      4      23 66.9   70.0   .      .      .      54.9   accgctggccgagcatctgc

etc.

If the qualifier '-thermo' is given then the DeltaG, DeltaH and DeltaS of the sequence in each window are also output.

Data files

The EMBOSS data files "Edna.melt" and "Erna.melt" are used to read in the entropy/enthalpy/energy data for DNA and RNA respectively.

EMBOSS data files are distributed with the application and stored in the standard EMBOSS data directory, which is defined by EMBOSS environment variable EMBOSS_DATA.

Users can provide their own data files in their own directories. Project specific files can be put in the current directory, or for tidier directory listings in a subdirectory called ".embossdata". Files for all EMBOSS runs can be put in the user's home directory, or again in a subdirectory called ".embossdata".

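The directories are searched in the following order:

  . (the current directory)
  .embossdata (under the current directory)
  ~/ (the user's home directory)
  ~/.embossdata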
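
A lookup of this kind can be sketched in Python. The search order used here (current directory, its .embossdata subdirectory, the home directory, its .embossdata subdirectory, and finally the directory named by EMBOSS_DATA) is an assumption made for illustration, and candidate_dirs and find_data_file are hypothetical helpers rather than EMBOSS functions.

```python
import os
from pathlib import Path


def candidate_dirs(cwd=None, home=None, emboss_data=None):
    """Return the directories to try, in the ASSUMED search order."""
    cwd = Path(cwd) if cwd else Path.cwd()
    home = Path(home) if home else Path.home()
    dirs = [cwd, cwd / ".embossdata", home, home / ".embossdata"]
    # Assumption: the standard data directory is consulted last.
    emboss_data = emboss_data or os.environ.get("EMBOSS_DATA")
    if emboss_data:
        dirs.append(Path(emboss_data))
    return dirs


def find_data_file(name, dirs):
    """Return the first existing file called `name` in `dirs`, or None."""
    for directory in dirs:
        path = directory / name
        if path.is_file():
            return path
    return None


# Example: report which copy of Edna.melt would be picked up, if any.
print(find_data_file("Edna.melt", candidate_dirs()))
```

Because the first match wins, a project-specific Edna.melt placed in the current directory would shadow the copy distributed with EMBOSS under this assumed order.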

Notes

None.

References

  1. Breslauer, K.J., Frank, R., Blocker, H., and Marky, L.A. (1986). "Predicting DNA Duplex Stability from the Base Sequence." Proceedings of the National Academy of Sciences USA 83, 3746-3750.
  2. Baldino, M., Jr. (1989). "High Resolution In Situ Hybridization Histochemistry." In Methods in Enzymology, (P.M. Conn, ed.), 168, 761-777, Academic Press, San Diego, California, USA.

Warnings

RNA sequences must be submitted to this application with the '-rna' qualifier on the command line; otherwise the sequence is assumed to be DNA.

Diagnostic Error Messages

None.

Exit status

0 if successful.

Known bugs

None.

See also

Program name  Description
banana        Bending and curvature plot in B-DNA
btwisted      Calculates the twisting in a B-DNA sequence
chaos         Create a chaos game representation plot for a sequence
compseq       Counts the composition of dimer/trimer/etc words in a sequence
freak         Residue/base frequency table or plot
isochore      Plots isochores in large DNA sequences
wordcount     Counts words of a specified size in a DNA sequence

Author(s)

This program was originally included in EGCG under the names "MELT" and "MELTPLOT", written by Rodrigo Lopez.

This application was written by Alan Bleasby (ableasby@hgmp.mrc.ac.uk).

History

Written (1999) - Alan Bleasby

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments