Class | Bio::GFF::GFF3::Record::Gap |
In: | lib/bio/db/gff.rb |
Parent: | Object |
Bio::GFF::GFF3::Record::Gap is a class to store the data of a "Gap" attribute.
Code | = | Struct.new(:code, :length) | Code is a class that stores a single-letter code and its length. |
data | [R] | Internal data. Users must not use it. |
Arguments:
# File lib/bio/db/gff.rb, line 1275
def initialize(str = nil)
  if str then
    # Split on spaces; each token is a single-letter code followed by a length.
    @data = str.split(/ +/).collect do |x|
      if /\A([A-Z])([0-9]+)\z/ =~ x.strip then
        Code.new($1.intern, $2.to_i)
      else
        # Unknown tokens are reported only in verbose mode, then dropped.
        warn "ignored unknown token: #{x.inspect}" if $VERBOSE
        nil
      end
    end
    @data.compact!
  else
    @data = []
  end
end
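The parsing step shown in the constructor can be tried without BioRuby installed. The following is a minimal standalone sketch of the same tokenizing logic; `GapCode` and `parse_gap` are hypothetical names used only for illustration and are not part of the library.

```ruby
# Standalone sketch (assumption: mirrors the constructor's parsing logic,
# but GapCode and parse_gap are illustrative names, not BioRuby API).
GapCode = Struct.new(:code, :length)

def parse_gap(str)
  return [] if str.nil?
  parsed = str.split(/ +/).map do |token|
    if /\A([A-Z])([0-9]+)\z/ =~ token.strip
      # $1 is the single-letter code, $2 the run length.
      GapCode.new($1.intern, $2.to_i)
    else
      warn "ignored unknown token: #{token.inspect}" if $VERBOSE
      nil
    end
  end
  parsed.compact
end

codes = parse_gap("M8 D3 M6 I1 M6")
codes.each { |c| puts "#{c.code} #{c.length}" }
```

Tokens that do not match an uppercase letter followed by digits are dropped, so a malformed attribute yields a shorter list rather than an exception.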
Creates a new Gap object from given sequence alignment.
Note that sites of which both reference and target are gaps are silently removed.
Arguments:
# File lib/bio/db/gff.rb, line 1391
def self.new_from_sequences_na(reference, target,
                               gap_regexp = /[^a-zA-Z]/)
  gap = self.new
  gap.instance_eval {
    __initialize_from_sequences_na(reference, target,
                                   gap_regexp)
  }
  gap
end
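The private method `__initialize_from_sequences_na` is not shown on this page, so its exact behavior is not reproduced here. As a rough illustration of the idea, the sketch below derives a Gap string from two gapped nucleotide sequences of equal length, assuming the GFF3 meanings of the codes (M for match, I for a gap in the reference, D for a gap in the target). `gap_string_from_alignment` is a hypothetical helper, not a BioRuby method.

```ruby
# Standalone sketch (assumption: GFF3 semantics for M, I and D;
# this is not the library's implementation).
def gap_string_from_alignment(reference, target, gap_regexp = /[^a-zA-Z]/)
  unless reference.length == target.length
    raise ArgumentError, "aligned sequences must have equal length"
  end

  runs = [] # run-length encoded list of [code, length] pairs
  reference.chars.zip(target.chars).each do |r, t|
    r_gap = gap_regexp.match?(r)
    t_gap = gap_regexp.match?(t)
    next if r_gap && t_gap # columns where both are gaps are dropped

    code = if r_gap then "I"    # gap in the reference
           elsif t_gap then "D" # gap in the target
           else "M"
           end
    if runs.last && runs.last[0] == code
      runs.last[1] += 1
    else
      runs << [code, 1]
    end
  end
  runs.map { |code, len| "#{code}#{len}" }.join(" ")
end

puts gap_string_from_alignment("CAAGACCTAAACTGGAT-TCCAAT",
                               "CAAGACCT---CTGGATATCCAAT")
```

Columns where both sequences contain a gap character are skipped, which corresponds to the note that such sites are silently removed.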
Creates a new Gap object from given sequence alignment.
Note that sites of which both reference and target are gaps are silently removed.
For incorrect alignments that break the 3:1 rule (three nucleotides per amino acid), gap positions are moved inside codons, unwanted gaps are removed, and forward or reverse frameshifts are inserted.
For example,
  atgg-taagac-att
  M  V  K  -  I
is treated as:
  atggt<aagacatt
  M  V  K  >>I
An incorrect combination of a frameshift with another frameshift or a gap may cause undefined behavior.
It is recommended that forward frameshifts be indicated in the target sequence. Reverse frameshifts can be indicated in either the reference sequence or the target sequence.
Priority of regular expressions:
space > forward/reverse frameshift > gap
Arguments:
# File lib/bio/db/gff.rb, line 1587
def self.new_from_sequences_na_aa(reference, target,
                                  gap_regexp = /[^a-zA-Z]/,
                                  space_regexp = /\s/,
                                  forward_frameshift_regexp = /\>/,
                                  reverse_frameshift_regexp = /\</)
  gap = self.new
  gap.instance_eval {
    __initialize_from_sequences_na_aa(reference, target,
                                      gap_regexp,
                                      space_regexp,
                                      forward_frameshift_regexp,
                                      reverse_frameshift_regexp)
  }
  gap
end
Returns true if self == other; otherwise, returns false.
# File lib/bio/db/gff.rb, line 1615
def ==(other)
  # Equal only when both the class and the internal data match.
  if other.class == self.class and
      @data == other.data then
    true
  else
    false
  end
end
Processes nucleotide sequences and returns gapped sequences as an array of sequences.
Note for forward/reverse frameshift: a forward frameshift is simply treated as a gap insertion in the target sequence, and a reverse frameshift as a gap insertion in the reference sequence.
Arguments:
# File lib/bio/db/gff.rb, line 1715
def process_sequences_na(reference, target, gap_char = '-')
  s_ref, s_tgt = dup_seqs(reference, target)

  s_ref, s_tgt = __process_sequences(s_ref, s_tgt,
                                     gap_char, gap_char,
                                     1, 1,
                                     gap_char, gap_char)

  if $VERBOSE and s_ref.length != s_tgt.length then
    warn "returned sequences not equal length"
  end
  return s_ref, s_tgt
end
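Since `__process_sequences` is private and not listed here, the following standalone sketch only illustrates the documented behavior for nucleotide sequences: applying a Gap string to two ungapped sequences to obtain gapped ones. It assumes the GFF3 meanings of M, I and D, and treats F and R as plain gaps in the target and reference respectively, as the note above describes. `apply_gap_na` is a hypothetical name, not a BioRuby method.

```ruby
# Standalone sketch (assumption: GFF3 semantics for the codes;
# this is not the library's __process_sequences).
def apply_gap_na(reference, target, gap_str, gap_char = "-")
  ref_out = String.new
  tgt_out = String.new
  r = 0 # current position in the ungapped reference
  t = 0 # current position in the ungapped target

  gap_str.split(/ +/).each do |token|
    m = /\A([A-Z])([0-9]+)\z/.match(token)
    next unless m
    code = m[1]
    len = m[2].to_i
    case code
    when "M" # aligned region: consume both sequences
      ref_out << reference[r, len].to_s
      tgt_out << target[t, len].to_s
      r += len
      t += len
    when "D", "F" # gap in the target: consume the reference only
      ref_out << reference[r, len].to_s
      tgt_out << gap_char * len
      r += len
    when "I", "R" # gap in the reference: consume the target only
      ref_out << gap_char * len
      tgt_out << target[t, len].to_s
      t += len
    end
  end
  [ref_out, tgt_out]
end

ref, tgt = apply_gap_na("CAAGACCTAAACTGGATTCCAAT",
                        "CAAGACCTCTGGATATCCAAT",
                        "M8 D3 M6 I1 M6")
puts ref
puts tgt
```

The two returned strings should have equal length when the Gap string is consistent with the input sequences, which is the condition the library method warns about in verbose mode.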
Processes sequences and returns gapped sequences as an array of sequences. reference must be a nucleotide sequence, and target must be an amino acid sequence.
Note for reverse frameshift: Reverse frameshift characters are inserted in the reference sequence. For example, the alignment for "Gap=M3 R1 M2" is:
  atgaagat<aatgtc
  M  K  I  N  V
The alignment for "Gap=M3 R3 M3" is:
  atgaag<<<attaatgtc
  M  K  I  I  N  V
Arguments:
# File lib/bio/db/gff.rb, line 1752
def process_sequences_na_aa(reference, target,
                            gap_char = '-',
                            space_char = ' ',
                            forward_frameshift = '>',
                            reverse_frameshift = '<')
  s_ref, s_tgt = dup_seqs(reference, target)
  # Pad each amino acid so that it occupies the width of one codon.
  s_tgt = s_tgt.gsub(/./, "\\0#{space_char}#{space_char}")
  ref_increment = 3
  tgt_increment = 1 + space_char.length * 2
  ref_gap = gap_char * 3
  tgt_gap = "#{gap_char}#{space_char}#{space_char}"
  return __process_sequences(s_ref, s_tgt,
                             ref_gap, tgt_gap,
                             ref_increment, tgt_increment,
                             forward_frameshift,
                             reverse_frameshift)
end
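The padding step in the method above is what lets one amino acid line up with one codon. The sketch below isolates that step and the two gap units built from it. `pad_amino_acids` and `gap_units` are hypothetical helper names used for illustration; only the `gsub` expression and the gap strings are taken from the listing.

```ruby
# Standalone sketch of the padding and gap units used in the listing above
# (helper names are illustrative, not BioRuby API).
def pad_amino_acids(target, space_char = " ")
  # Each residue is followed by two space characters, giving it codon width.
  target.gsub(/./, "\\0#{space_char}#{space_char}")
end

def gap_units(gap_char = "-", space_char = " ")
  ref_gap = gap_char * 3                              # one codon-wide gap
  tgt_gap = "#{gap_char}#{space_char}#{space_char}"   # one padded residue gap
  [ref_gap, tgt_gap]
end

puts "atgaagattaatgtc"
puts pad_amino_acids("MKINV")
```

With the default single-character `space_char`, every padded residue and every gap unit is three characters wide, so the reference and target stay column-aligned as codes are applied.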