EMBOSS: sigcleave

    % sigcleave
    Reports peptide signal cleavage sites
    Input sequence: sw:ach2_drome
    Output file [ach2_drome.out]:
    Minimum weight [3.5]:


    Mandatory qualifiers:
      [-sequence]    seqall     Sequence database USA
       -minweight    float      Minimum scoring weight value for the predicted
                                cleavage site
      [-outfile]     report     (no help text) report value

    Optional qualifiers:
       -prokaryote   bool       Specifies the sequence is prokaryotic and changes
                                the default scoring data file name

    Advanced qualifiers:
       -pval         integer    Specifies the number of columns before the
                                residue at the cleavage site in the weight
                                matrix table
       -nval         integer    Specifies the number of columns after the
                                residue at the cleavage site in the weight
                                matrix table

    General qualifiers:
       -help         bool       Report command line options. More information on
                                associated and general qualifiers can be found
                                with -help -verbose

Mandatory qualifiers | Description | Allowed values | Default |
---|---|---|---|
[-sequence] (Parameter 1) | Sequence database USA | Readable sequence(s) | Required |
-minweight | Minimum scoring weight value for the predicted cleavage site | Number from 0.000 to 100.000 | 3.5 |
[-outfile] (Parameter 2) | (no help text) report value | Report file | |

Optional qualifiers | Description | Allowed values | Default |
---|---|---|---|
-prokaryote | Specifies the sequence is prokaryotic and changes the default scoring data file name | Yes/No | No |

Advanced qualifiers | Description | Allowed values | Default |
---|---|---|---|
-pval | Specifies the number of columns before the residue at the cleavage site in the weight matrix table | Integer from -13 to -1 | -13 |
-nval | Specifies the number of columns after the residue at the cleavage site in the weight matrix table | Integer 1 or more | Pval+15 (2) |
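
For scripted runs, the qualifiers above can be assembled into a command line programmatically. The following is a minimal Python sketch, not part of EMBOSS: the helper name `build_sigcleave_command` is hypothetical, and it assumes the `sigcleave` executable is on the `PATH`. It only uses the qualifiers and value ranges listed in the table.

```python
def build_sigcleave_command(sequence, outfile, minweight=3.5,
                            prokaryote=False, pval=None, nval=None):
    """Return an argument list for sigcleave (hypothetical helper).

    Range checks mirror the 'Allowed values' column of the qualifier table.
    """
    if not 0.0 <= minweight <= 100.0:
        raise ValueError("minweight must be between 0.000 and 100.000")
    cmd = ["sigcleave",
           "-sequence", sequence,
           "-minweight", str(minweight),
           "-outfile", outfile]
    if prokaryote:
        # Boolean qualifier: its presence switches on the prokaryotic data file.
        cmd.append("-prokaryote")
    if pval is not None:
        if not -13 <= pval <= -1:
            raise ValueError("pval must be an integer from -13 to -1")
        cmd += ["-pval", str(pval)]
    if nval is not None:
        if nval < 1:
            raise ValueError("nval must be an integer of 1 or more")
        cmd += ["-nval", str(nval)]
    return cmd


# Example: the interactive session shown earlier, as an argument list.
command = build_sigcleave_command("sw:ach2_drome", "ach2_drome.out")
# To execute it (requires an EMBOSS installation):
#   import subprocess; subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when sequence addresses contain characters such as `:`.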
The output is a standard EMBOSS report file.
The results can be output in one of several styles by using the command-line qualifier -rformat xxx, where 'xxx' is replaced by the name of the required format. The available format names are: embl, genbank, gff, pir, swiss, trace, listfile, dbmotif, diffseq, excel, feattable, motif, regions, seqtable, simple, srs, table, tagseq
See: http://www.uk.embnet.org/Software/EMBOSS/Themes/ReportFormats.html for further information on report formats.
By default sigcleave writes a 'motif' report file.
The output from the above example is:

    ########################################
    # Program: sigcleave
    # Rundate: Mon Feb 11 13:50:56 2002
    # Report_file: ach2_drome.sig
    ########################################

    #=======================================
    #
    # Sequence: ACH2_DROME     from: 1   to: 576
    # HitCount: 9
    #
    # Reporting scores over 3.50
    #
    #=======================================

    (1) Score 13.739 length 13 at residues 29->41

     Sequence: LLVLLLLCETVQA
               |           |
              29           41
     mature_peptide: NPDAKRLYDDLLSNYNRLIRPVSNNTDTVLVKLGLRLSQLIDLNLKDQIL

    (2) Score 3.632 length 13 at residues 308->320

     Sequence: LLISEIIPSTSLA
               |           |
             308           320
     mature_peptide: LPLLGKYLLFTMLLVGLSVVITIIILNIHYRKPSTHKMRPWIRSFFIKRL

    (3) Score 3.751 length 13 at residues 527->539

     Sequence: LFLWLFMIASLVG
               |           |
             527           539
     mature_peptide: TFVILGEAPSLYDDTKAIDVQLSDVAKQIYNLTEKKN

    (4) Score 4.026 length 13 at residues 31->43

     Sequence: VLLLLCETVQANP
               |           |
              31           43
     mature_peptide: DAKRLYDDLLSNYNRLIRPVSNNTDTVLVKLGLRLSQLIDLNLKDQILTT

    (5) Score 5.057 length 13 at residues 24->36

     Sequence: KPLCLLLVLLLLC
               |           |
              24           36
     mature_peptide: ETVQANPDAKRLYDDLLSNYNRLIRPVSNNTDTVLVKLGLRLSQLIDLNL

    (6) Score 6.981 length 13 at residues 330->342

     Sequence: FTMLLVGLSVVIT
               |           |
             330           342
     mature_peptide: IIILNIHYRKPSTHKMRPWIRSFFIKRLPKLLLMRVPKDLLRDLAANKIN

    (7) Score 7.360 length 13 at residues 528->540

     Sequence: FLWLFMIASLVGT
               |           |
             528           540
     mature_peptide: FVILGEAPSLYDDTKAIDVQLSDVAKQIYNLTEKKN

    (8) Score 10.465 length 13 at residues 28->40

     Sequence: LLLVLLLLCETVQ
               |           |
              28           40
     mature_peptide: ANPDAKRLYDDLLSNYNRLIRPVSNNTDTVLVKLGLRLSQLIDLNLKDQI

    (9) Score 12.135 length 13 at residues 26->38

     Sequence: LCLLLVLLLLCET
               |           |
              26           38
     mature_peptide: VQANPDAKRLYDDLLSNYNRLIRPVSNNTDTVLVKLGLRLSQLIDLNLKD

    #---------------------------------------
    #---------------------------------------

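
Because the hits are not listed in score order, it can be convenient to post-process the report. The following Python sketch pulls the hits out of a 'motif' style report like the one above; the function name `parse_sigcleave_hits` and the dictionary keys are illustrative choices, and the parser assumes the line layout shown in this example rather than any formally specified grammar.

```python
import re

# Matches lines such as "(1) Score 13.739 length 13 at residues 29->41".
HIT_RE = re.compile(
    r"\((\d+)\) Score ([\d.]+) length (\d+) at residues (\d+)->(\d+)")


def parse_sigcleave_hits(report_text):
    """Return a list of dicts, one per hit, in report order."""
    hits = []
    current = None
    for raw in report_text.splitlines():
        line = raw.strip()
        match = HIT_RE.match(line)
        if match:
            current = {
                "index": int(match.group(1)),
                "score": float(match.group(2)),
                "length": int(match.group(3)),
                "start": int(match.group(4)),
                "end": int(match.group(5)),
            }
            hits.append(current)
        elif current is not None and line.startswith("Sequence:"):
            # The header line "# Sequence: ..." starts with '#', so it is skipped.
            current["signal"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("mature_peptide:"):
            current["mature_peptide"] = line.split(":", 1)[1].strip()
    return hits


# Two hits copied from the example report above.
sample = """\
# Sequence: ACH2_DROME     from: 1   to: 576
(1) Score 13.739 length 13 at residues 29->41
 Sequence: LLVLLLLCETVQA
 mature_peptide: NPDAKRLYDDLLSNYNRLIRPVSNNTDTVLVKLGLRLSQLIDLNLKDQIL
(2) Score 3.632 length 13 at residues 308->320
 Sequence: LLISEIIPSTSLA
 mature_peptide: LPLLGKYLLFTMLLVGLSVVITIIILNIHYRKPSTHKMRPWIRSFFIKRL
"""
hits = parse_sigcleave_hits(sample)
best = max(hits, key=lambda hit: hit["score"])
```

For larger pipelines, one of the tabular report formats selected with -rformat may be easier to parse than the default layout.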
EMBOSS data files are distributed with the application and stored in the standard EMBOSS data directory, which is defined by the EMBOSS environment variable EMBOSS_DATA.
Users can provide their own data files in their own directories. Project-specific files can be put in the current directory or, for tidier directory listings, in a subdirectory called ".embossdata". Files for all EMBOSS runs can be put in the user's home directory or, again, in a subdirectory called ".embossdata".
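The directories are searched in the following order:
- . (your current directory)
- .embossdata (under your current directory)
- ~/ (your home directory)
- ~/.embossdata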
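
A minimal Python sketch of such a lookup is given below. It is an illustration, not EMBOSS code: it assumes the user directories are tried in the order current directory, `.embossdata` under it, home directory, `.embossdata` under home, and that the directory named by EMBOSS_DATA is tried last. The function names and the file name used in the example are hypothetical.

```python
import os
from pathlib import Path


def candidate_data_dirs(cwd=None, home=None, env=None):
    """List directories to try, in the assumed search order."""
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    env = os.environ if env is None else env
    dirs = [cwd, cwd / ".embossdata", home, home / ".embossdata"]
    emboss_data = env.get("EMBOSS_DATA")
    if emboss_data:
        # Assumption: the standard data directory is the final fallback.
        dirs.append(Path(emboss_data))
    return dirs


def find_data_file(name, cwd=None, home=None, env=None):
    """Return the first existing path for `name`, or None if not found."""
    for directory in candidate_data_dirs(cwd, home, env):
        path = directory / name
        if path.is_file():
            return path
    return None


# Example with a hypothetical file name; the result is None when no
# directory in the list contains such a file.
location = find_data_file("my_signal_matrix.dat")
```

Under these assumptions, a file placed in a project directory shadows a file of the same name in the standard data directory.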

    # Amino acid counts for 161 Eukaryotic Signal Peptides,
    # from von Heijne (1986), Nucl. Acids. Res. 14:4683-4690
    #
    # The cleavage site is between +1 and -1
    # Sample: 161 aligned sequences
    #
    # R -13 -12 -11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1  +1  +2 Expect
    # - --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ------
      A  16  13  14  15  20  18  18  17  25  15  47   6  80  18   6   14.5
      C   3   6   9   7   9  14   6   8   5   6  19   3   9   8   3    4.5
      D   0   0   0   0   0   0   0   0   5   3   0   5   0  10  11    8.9
      E   0   0   0   1   0   0   0   0   3   7   0   7   0  13  14   10.0
      F  13   9  11  11   6   7  18  13   4   5   0  13   0   6   4    5.6
      G   4   4   3   6   3  13   3   2  19  34   5   7  39  10   7   12.1
      H   0   0   0   0   0   1   1   0   5   0   0   6   0   4   2    3.4
      I  15  15   8   6  11   5   4   8   5   1  10   5   0   8   7    7.4
      K   0   0   0   1   0   0   1   0   0   4   0   2   0  11   9   11.3
      L  71  68  72  79  78  45  64  49  10  23   8  20   1   8   4   12.1
      M   0   3   7   4   1   6   2   2   0   0   0   1   0   1   2    2.7
      N   0   1   0   1   1   0   0   0   3   3   0  10   0   4   7    7.1
      P   2   0   2   0   0   4   1   8  20  14   0   1   3   0  22    7.4
      Q   0   0   0   1   0   6   1   0  10   8   0  18   3  19  10    6.3
      R   2   0   0   0   0   1   0   0   7   4   0  15   0  12   9    7.6
      S   9   3   8   6  13  10  15  16  26  11  23  17  20  15  10   11.4
      T   2  10   5   4   5  13   7   7  12   6  17   8   6   3  10    9.7
      V  20  25  15  18  13  15  11  27   0  12  32   3   0   8  17   11.1
      W   4   3   3   1   1   2   6   3   1   3   0   9   0   2   0    1.8
      Y   0   1   4   0   0   1   3   1   1   2   0   5   0   1   7    5.6

If you use matrix tables with a different number of columns before or after the cleavage site, you must also set the advanced qualifiers -pval and -nval to match.
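
To illustrate how a count table of this kind and the -pval and -nval settings fit together, here is a small Python sketch of weight-matrix scanning. It is not the sigcleave implementation: it assumes each weight is the natural log of the observed count over the expected count, that zero counts are replaced by a pseudocount of 1, and that residues missing from the table contribute nothing. The tiny two-residue table is made up purely to keep the example short.

```python
import math


def log_odds_weights(counts, expect, pseudocount=1.0):
    """Convert per-column counts into ln(count / expected) weights.

    Assumption: zero counts are replaced by `pseudocount` to avoid ln(0).
    """
    return {
        residue: [math.log((c if c > 0 else pseudocount) / expect[residue])
                  for c in row]
        for residue, row in counts.items()
    }


def scan(sequence, weights, pval=-13, nval=2, minweight=3.5):
    """Slide a window of -pval + nval columns along the sequence.

    Returns (position, score) pairs, where position is the 1-based index of
    the residue at column -1, i.e. the last residue before the cleavage site.
    """
    width = -pval + nval
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(weights[res][col]
                    for col, res in enumerate(window) if res in weights)
        if score >= minweight:
            hits.append((start - pval, score))
    return hits


# Toy table with two columns before and one after the site (pval=-2, nval=1).
toy_counts = {"A": [1, 8, 2], "L": [9, 0, 2]}
toy_expect = {"A": 4.0, "L": 4.0}
toy_weights = log_odds_weights(toy_counts, toy_expect)
toy_hits = scan("LAAL", toy_weights, pval=-2, nval=1, minweight=0.5)
```

The window width is derived from the two qualifiers, which is why a table with a different number of columns needs matching -pval and -nval values.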
Program name | Description |
---|---|
antigenic | Finds antigenic sites in proteins |
digest | Protein proteolytic enzyme or reagent cleavage digest |
fuzzpro | Protein pattern search |
fuzztran | Protein pattern search after translation |
helixturnhelix | Report nucleic acid binding motifs |
oddcomp | Finds protein sequence regions with a biased composition |
patmatdb | Search a protein sequence with a motif |
patmatmotifs | Search a PROSITE motif database with a protein sequence |
pepcoil | Predicts coiled coil regions |
preg | Regular expression search of a protein sequence |
pscan | Scans proteins using PRINTS |
Original program "SIGCLEAVE" by Peter Rice (EGCG 1989)