Class | Bio::Tree |
In: | lib/bio/tree.rb, lib/bio/tree/output.rb |
Parent: | Object |
This class represents and stores a phylogenetic tree.
Internally, it is based on the Bio::Pathway class. However, users cannot access the Bio::Pathway object directly.
This is an alpha version. Incompatible changes may be made frequently.
DEFAULT_OPTIONS | = | { :indent => ' ' } | default options |
options | [RW] | tree options; mainly used for tree output |
root | [RW] | root node of this tree (even if unrooted tree, it is used by some methods) |
Creates a new phylogenetic tree. When no arguments are given, it creates a new empty tree. When a Tree object is given, it copies the tree. Note that the new tree shares Node and Edge objects with the given tree.
# File lib/bio/tree.rb, line 258
def initialize(tree = nil)
  # creates an undirected adjacency list graph
  @pathway = Bio::Pathway.new([], true)
  @root = nil
  @options = {}
  _init_cache
  self.concat(tree) if tree
end
Adds a new edge to the tree. Returns the newly added edge. If the edge already exists, it is overwritten with the new one.
# File lib/bio/tree.rb, line 380
def add_edge(source, target, edge = Edge.new)
  _clear_cache
  @pathway.append(Bio::Relation.new(source, target, edge))
  edge
end
Adds a node to the tree. Returns self. If the node already exists, it does nothing.
# File lib/bio/tree.rb, line 402
def add_node(node)
  _clear_cache
  @pathway.graph[node] ||= {}
  self
end
Shows the adjacency matrix representation of the tree. The matrix covers only the given nodes. If nodes is nil or is omitted, it acts the same as tree.adjacency_matrix(tree.nodes). If a block is given, it yields source, target, and edge for each edge and stores the block's return value in the matrix. Without a block, it stores the edge itself. Returns a matrix object.
# File lib/bio/tree.rb, line 822
def adjacency_matrix(nodes = nil,
                     default_value = nil,
                     diagonal_value = nil) #:yields: source, target, edge
  nodes ||= self.nodes
  size = nodes.size
  hash = {}
  nodes.each_with_index { |x, i| hash[x] = i }
  # prepares a matrix
  matrix = Array.new(size, nil)
  matrix.collect! { |x| Array.new(size, default_value) }
  (0...size).each { |i| matrix[i][i] = diagonal_value }
  # fills the matrix from each edge
  self.each_edge do |source, target, edge|
    i_source = hash[source]
    i_target = hash[target]
    if i_source and i_target then
      val = block_given? ? (yield source, target, edge) : edge
      matrix[i_source][i_target] = val
      matrix[i_target][i_source] = val
    end
  end
  Matrix.rows(matrix, false)
end
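The following is a minimal standalone sketch of the same matrix-filling idea. It does not use Bio::Tree; the edge list, the node symbols, and the method name adjacency_matrix_sketch are all hypothetical stand-ins chosen for illustration.

```ruby
require 'matrix'

# Hypothetical stand-in for a tree: undirected [source, target, edge_value]
# triples (this is not the Bio::Tree API).
edges = [[:root, :a, 1.0], [:root, :b, 2.0], [:b, :c, 0.5]]

# Builds a symmetric adjacency matrix restricted to the given nodes.
def adjacency_matrix_sketch(edges, nodes, default_value = nil, diagonal_value = nil)
  index = {}
  nodes.each_with_index { |node, i| index[node] = i }
  rows = Array.new(nodes.size) { Array.new(nodes.size, default_value) }
  nodes.size.times { |i| rows[i][i] = diagonal_value }
  edges.each do |source, target, edge|
    i = index[source]
    j = index[target]
    next unless i && j # skip edges that touch a node outside the selection
    value = block_given? ? yield(source, target, edge) : edge
    rows[i][j] = value
    rows[j][i] = value
  end
  Matrix.rows(rows, false)
end

# :c is left out, so the :b - :c edge is ignored.
m = adjacency_matrix_sketch(edges, [:root, :a, :b], 0, 0) { |_s, _t, e| e * 10 }
```

Passing a block here plays the same role as in the documented method: the block's return value, rather than the edge itself, ends up in the matrix cell.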
Gets all ancestral nodes of the node. If root isn't specified or root is nil, @root is used. Returns an array of Nodes. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb, line 757
def ancestors(node, root = nil)
  root ||= @root
  (self.path(root, node) - [ node ]).reverse
end
Gets the adjacent children nodes of the node. If root isn't specified or root is nil, @root is used. Returns an array of Nodes. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb, line 701
def children(node, root = nil)
  root ||= @root
  c = self.adjacent_nodes(node)
  c.delete(self.parent(node, root))
  c
end
Removes all edges connected with the node. Returns self. If the node does not exist, raises IndexError.
# File lib/bio/tree.rb, line 417
def clear_node(node)
  unless self.include?(node)
    raise IndexError, 'the node does not exist'
  end
  _clear_cache
  @pathway.relations.delete_if do |rel|
    rel.node.include?(node)
  end
  @pathway.graph[node].each_key do |k|
    @pathway.graph[k].delete(node)
  end
  @pathway.graph[node].clear
  self
end
Replaces each edge with the block's return value for that edge. Returns self.
# File lib/bio/tree.rb, line 527
def collect_edge! #:yields: source, target, edge
  _clear_cache
  @pathway.relations.each do |rel|
    newedge = yield rel.node[0], rel.node[1], rel.relation
    rel.edge = newedge
    @pathway.append(rel, false)
  end
  self
end
Replaces each node with the block's return value for that node. Returns self.
# File lib/bio/tree.rb, line 506
def collect_node! #:yields: node
  _clear_cache
  tr = {}
  self.each_node do |node|
    tr[node] = yield node
  end
  # replaces nodes in @pathway.relations
  @pathway.relations.each do |rel|
    rel.node.collect! { |node| tr[node] }
  end
  # re-generates @pathway from relations
  @pathway.to_list
  # adds orphan nodes
  tr.each_value do |newnode|
    @pathway.graph[newnode] ||= {}
  end
  self
end
Concatenates the other tree. If the same edge exists, the edge in other is used. Returns self. The result is unspecified if other isn't a Tree object. Note that the Node and Edge objects in the other tree are shared with the concatenated tree.
# File lib/bio/tree.rb, line 595
def concat(other)
  #raise TypeError unless other.kind_of?(self.class)
  _clear_cache
  other.each_node do |node|
    self.add_node(node)
  end
  other.each_edge do |node1, node2, edge|
    self.add_edge(node1, node2, edge)
  end
  self
end
Gets all descendent nodes of the node. If root isn't specified or root is nil, @root is used. Returns an array of Nodes. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb, line 712
def descendents(node, root = nil)
  root ||= @root
  distance, route = @pathway.breadth_first_search(root)
  d = distance[node]
  result = []
  distance.each do |key, val|
    if val > d then
      x = key
      while x = route[x]
        if x == node then
          result << key
          break
        end
        break if distance[x] <= d
      end
    end
  end
  result
end
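As a rough illustration of the idea behind this method, the standalone sketch below records each node's parent with a breadth-first search from the root, then keeps every node whose chain of parents passes through the node of interest. It uses a plain Hash adjacency list rather than Bio::Tree, and the names graph and descendants_sketch are hypothetical.

```ruby
# Hypothetical undirected tree as an adjacency list (not the Bio::Tree API):
#        root
#       /    \
#      x      c
#     / \
#    a   b
graph = {
  :root => [:x, :c],
  :x    => [:root, :a, :b],
  :a    => [:x],
  :b    => [:x],
  :c    => [:root]
}

# Returns all nodes below +node+ when the tree is rooted at +root+.
def descendants_sketch(graph, node, root)
  # Breadth-first search from the root, remembering each node's parent.
  parent = { root => nil }
  queue = [root]
  until queue.empty?
    current = queue.shift
    graph[current].each do |adj|
      next if parent.key?(adj)
      parent[adj] = current
      queue << adj
    end
  end
  # A candidate is a descendant if climbing its parents reaches +node+.
  graph.keys.select do |candidate|
    x = parent[candidate]
    x = parent[x] while x && x != node
    x == node
  end
end

below_x = descendants_sketch(graph, :x, :root)
below_root = descendants_sketch(graph, :root, :root)
```

The documented method additionally compares breadth-first distances so that it can stop climbing early; the sketch omits that shortcut for brevity.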
Returns the distance between node1 and node2. It raises an error if the edges do not contain distance values. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb, line 640
def distance(node1, node2)
  distance = 0
  self.each_edge_in_path(node1, node2) do |source, target, edge|
    distance += get_edge_distance(edge)
  end
  distance
end
Calculates the distance matrix of the given nodes. If nodes is nil or is omitted, it acts the same as tree.distance_matrix(tree.leaves). Returns a matrix object. The result is unspecified for cyclic trees. Note 1: The diagonal values of the matrix are 0. Note 2: If the distance cannot be calculated, nil will be set.
# File lib/bio/tree.rb, line 793
def distance_matrix(nodes = nil)
  nodes ||= self.leaves
  matrix = []
  nodes.each_index do |i|
    row = []
    nodes.each_index do |j|
      if i == j then
        distance = 0
      elsif r = matrix[j] and val = r[i] then
        distance = val
      else
        distance = (self.distance(nodes[i], nodes[j]) rescue nil)
      end
      row << distance
    end
    matrix << row
  end
  Matrix.rows(matrix, false)
end
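To make the symmetric fill concrete, here is a standalone sketch that computes pairwise path distances on a small weighted tree and reuses the already computed (j, i) cell for (i, j). It does not call Bio::Tree; the Hash layout and the helper names path_distance and distance_matrix_sketch are assumptions made for this example.

```ruby
require 'matrix'

# Hypothetical weighted tree: node => { neighbor => branch length }.
graph = {
  :root => { :a => 1.0, :x => 0.5 },
  :x    => { :root => 0.5, :b => 2.0, :c => 3.0 },
  :a    => { :root => 1.0 },
  :b    => { :x => 2.0 },
  :c    => { :x => 3.0 }
}

# Sums branch lengths along the unique path between two nodes using a
# depth-first walk; returns nil when no path exists.
def path_distance(graph, from, to, seen = {})
  return 0 if from == to
  seen[from] = true
  graph[from].each do |adj, length|
    next if seen[adj]
    rest = path_distance(graph, adj, to, seen)
    return length + rest if rest
  end
  nil
end

# Fills the matrix row by row; cells below the diagonal copy the value
# already computed for the mirrored cell.
def distance_matrix_sketch(graph, nodes)
  rows = []
  nodes.each_index do |i|
    row = nodes.each_index.map do |j|
      if i == j
        0
      elsif j < i
        rows[j][i]
      else
        path_distance(graph, nodes[i], nodes[j])
      end
    end
    rows << row
  end
  Matrix.rows(rows, false)
end

dm = distance_matrix_sketch(graph, [:a, :b, :c])
```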
Iterates over each edge from node1 to node2. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb, line 626
def each_edge_in_path(node1, node2)
  path = self.path(node1, node2)
  source = path.shift
  path.each do |target|
    edge = self.get_edge(source, target)
    yield source, target, edge
    source = target
  end
  self
end
Iterates over each node of this tree.
# File lib/bio/tree.rb, line 305
def each_node(&x) #:yields: node
  @pathway.graph.each_key(&x)
  self
end
Iterates over each edge connected to the given node. Returns self.
The method name "each_out_edge" comes from the Boost Graph Library.
# File lib/bio/tree.rb, line 355
def each_out_edge(source) #:yields: source, target, edge
  h = @pathway.graph[source]
  h.each { |key, val| yield source, key, val } if h
  self
end
Returns a new edge whose distance is the sum of the distances of edge1 and edge2. If only one of the edges has a distance, that distance is used; if neither has one, a new edge without a distance is returned.
# File lib/bio/tree.rb, line 121
def get_edge_merged(edge1, edge2)
  dist1 = get_edge_distance(edge1)
  dist2 = get_edge_distance(edge2)
  if dist1 and dist2 then
    Edge.new(dist1 + dist2)
  elsif dist1 then
    Edge.new(dist1)
  elsif dist2 then
    Edge.new(dist2)
  else
    Edge.new
  end
end
# File lib/bio/tree.rb, line 237
def get_node_bootstrap(node)
  begin
    node.bootstrap
  rescue NoMethodError
    nil
  end
end
# File lib/bio/tree.rb, line 245
def get_node_bootstrap_string(node)
  begin
    node.bootstrap_string
  rescue NoMethodError
    nil
  end
end
Finds a node in the tree by the given name and returns the node. If the node is not found, returns nil. If multiple nodes with the same name exist, one of them is returned (which one is unspecified).
# File lib/bio/tree.rb, line 390
def get_node_by_name(str)
  self.each_node do |node|
    if get_node_name(node) == str
      return node
    end
  end
  nil
end
Gets the name of the node. If the node does not respond to name, node.to_s is returned instead.
# File lib/bio/tree.rb, line 229
def get_node_name(node)
  begin
    node.name
  rescue NoMethodError
    node.to_s
  end
end
If the node exists, returns true. Otherwise, returns false.
# File lib/bio/tree.rb, line 410
def include?(node)
  @pathway.graph[node] ? true : false
end
Inserts a new node between adjacent nodes node1 and node2. The old edge between node1 and node2 is changed to the edge between new_node and node2. The edge between node1 and new_node is newly created.
If new_distance is specified, the distance between node1 and new_node is set to new_distance, and distance between new_node and node2 is set to tree.get_edge(node1, node2).distance - new_distance.
Returns self. If node1 and node2 are not adjacent, raises IndexError.
If new_node already exists in the tree, the tree would become circular. In addition, if the edge between new_node and node1 (or node2) already exists, it will be erased.
# File lib/bio/tree.rb, line 890
def insert_node(node1, node2, new_node, new_distance = nil)
  unless edge = self.get_edge(node1, node2) then
    raise IndexError, 'nodes not found or two nodes are not adjacent'
  end
  _clear_cache
  new_edge = Edge.new(new_distance)
  self.remove_edge(node1, node2)
  self.add_edge(node1, new_node, new_edge)
  if new_distance and old_distance = get_edge_distance(edge) then
    old_distance -= new_distance
    begin
      edge.distance = old_distance
    rescue NoMethodError
      edge = old_distance
    end
  end
  self.add_edge(new_node, node2, edge)
  self
end
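The distance bookkeeping can be sketched on a plain Hash as follows. This is only an illustration under the assumption that edges are bare branch lengths; it is not Bio::Tree code, and insert_node_sketch is a hypothetical name.

```ruby
# Hypothetical weighted adjacency list with a single edge a - b of length 5.0.
graph = { :a => { :b => 5.0 }, :b => { :a => 5.0 } }

# Inserts new_node on the edge node1 - node2. new_distance becomes the length
# of node1 - new_node; the remainder stays on new_node - node2.
def insert_node_sketch(graph, node1, node2, new_node, new_distance)
  old_distance = graph[node1] && graph[node1][node2]
  unless old_distance
    raise IndexError, 'nodes not found or two nodes are not adjacent'
  end
  graph[node1].delete(node2)
  graph[node2].delete(node1)
  graph[new_node] ||= {}
  graph[node1][new_node] = graph[new_node][node1] = new_distance
  rest = old_distance - new_distance
  graph[new_node][node2] = graph[node2][new_node] = rest
  graph
end

insert_node_sketch(graph, :a, :b, :m, 2.0)
```

After the call, the original length of 5.0 is split into 2.0 (a to m) and 3.0 (m to b), which mirrors the subtraction described above.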
If node is nil, returns an array of all leaves (nodes connected with one edge). Otherwise, gets all descendent leaf nodes of the node. If root isn't specified or root is nil, @root is used. Returns an array of Nodes. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb, line 738
def leaves(node = nil, root = nil)
  unless node then
    nodes = []
    self.each_node do |x|
      nodes << x if self.out_degree(x) == 1
    end
    return nodes
  else
    root ||= @root
    self.descendents(node, root).find_all do |x|
      self.adjacent_nodes(x).size == 1
    end
  end
end
Gets the lowest common ancestor of the two nodes. If root isn't specified or root is nil, @root is used. Returns a Node object or nil. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb, line 766
def lowest_common_ancestor(node1, node2, root = nil)
  root ||= @root
  distance, route = @pathway.breadth_first_search(root)
  x = node1; r1 = []
  begin; r1 << x; end while x = route[x]
  x = node2; r2 = []
  begin; r2 << x; end while x = route[x]
  return (r1 & r2).first
end
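The approach in the listing (walk from each node up to the root, then take the first node common to both walks) can be tried out with the standalone sketch below. It uses a plain Hash adjacency list instead of Bio::Tree, and bfs_parents and lca_sketch are hypothetical helper names.

```ruby
# Hypothetical undirected tree as an adjacency list (not the Bio::Tree API):
#        root
#       /    \
#      x      c
#     / \
#    a   b
graph = {
  :root => [:x, :c],
  :x    => [:root, :a, :b],
  :a    => [:x],
  :b    => [:x],
  :c    => [:root]
}

# Breadth-first search from root; returns a Hash mapping node => parent.
def bfs_parents(graph, root)
  parent = { root => nil }
  queue = [root]
  until queue.empty?
    node = queue.shift
    graph[node].each do |adj|
      next if parent.key?(adj)
      parent[adj] = node
      queue << adj
    end
  end
  parent
end

# Collects the chain node, parent, grandparent, ... for both nodes and
# returns the first element shared by the two chains.
def lca_sketch(graph, node1, node2, root)
  parent = bfs_parents(graph, root)
  chain = lambda do |node|
    result = []
    while node
      result << node
      node = parent[node]
    end
    result
  end
  (chain.call(node1) & chain.call(node2)).first
end

lca_ab = lca_sketch(graph, :a, :b, :root)
lca_ac = lca_sketch(graph, :a, :c, :root)
```

Array#& preserves the order of its receiver, so the first shared element is the one closest to node1, which is the lowest common ancestor.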
Returns the number of edges connected to the given node.
The method name "out_degree" comes from the Boost Graph Library.
# File lib/bio/tree.rb, line 365
def out_degree(source)
  h = @pathway.graph[source]
  h ? h.size : 0
end
Returns all edges connected to the given node, as an array of [ source, target, edge ] arrays.
The method name "out_edges" comes from the Boost Graph Library.
# File lib/bio/tree.rb, line 341
def out_edges(source)
  h = @pathway.graph[source]
  if h
    h.collect { |key, val| [ source, key, val ] }
  else
    []
  end
end
Returns the tree as formatted text. Currently supported formats are :newick, :nhx, and :phylip_distance_matrix. Any other format raises an error.
# File lib/bio/tree/output.rb, line 230
def output(format, *arg, &block)
  case format
  when :newick
    output_newick(*arg, &block)
  when :nhx
    output_nhx(*arg, &block)
  when :phylip_distance_matrix
    output_phylip_distance_matrix(*arg, &block)
  else
    raise 'Unknown format'
  end
end
Returns a Newick formatted string. If a block is given, the nodes are sorted using it (in the same manner as Enumerable#sort).
Available options:
:indent: | indent string; set false to disable (default: ' ') |
:bootstrap_style: | :disabled disables bootstrap representations. :traditional for traditional style. :molphy for Molphy style (default). |
# File lib/bio/tree/output.rb, line 198
def output_newick(options = {}, &block) #:yields: node1, node2
  root = @root
  root ||= self.nodes.first
  return '();' unless root
  __to_newick([], root, 0, :__to_newick_format_leaf, options, &block) +
    __to_newick_format_leaf(root, Edge.new, options) +
    ";\n"
end
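For readers unfamiliar with the Newick format, the sketch below shows the general recursive shape of such a writer on a plain Hash. It is not the __to_newick implementation (which is not shown here) and ignores the :indent and :bootstrap_style options; newick_sketch and the Hash layout are hypothetical.

```ruby
# Hypothetical weighted tree: node => { neighbor => branch length }.
graph = {
  :root => { :a => 1.0, :x => 0.5 },
  :x    => { :root => 0.5, :b => 2.0, :c => 3.0 },
  :a    => { :root => 1.0 },
  :b    => { :x => 2.0 },
  :c    => { :x => 3.0 }
}

# Writes the subtree below +node+ as "(child:length,...)label", recursing
# into every neighbor except the one we arrived from.
def newick_sketch(graph, node, parent = nil)
  children = graph[node].keys.reject { |adj| adj == parent }
  label = node.to_s
  return label if children.empty?
  parts = children.map do |child|
    "#{newick_sketch(graph, child, node)}:#{graph[node][child]}"
  end
  "(#{parts.join(',')})#{label}"
end

newick = newick_sketch(graph, :root) + ';'
```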
Returns an NHX (New Hampshire eXtended) formatted string. If a block is given, the nodes are sorted using it (in the same manner as Enumerable#sort).
Available options:
:indent: | indent string; set false to disable (default: ' ') |
# File lib/bio/tree/output.rb, line 218
def output_nhx(options = {}, &block) #:yields: node1, node2
  root = @root
  root ||= self.nodes.first
  return '();' unless root
  __to_newick([], root, 0,
              :__to_newick_format_leaf_NHX, options, &block) +
    __to_newick_format_leaf_NHX(root, Edge.new, options) +
    ";\n"
end
Generates a PHYLIP-style distance matrix as a string. If nodes is not given, all leaves in the tree are used. If the names of some of the given (or default) nodes are not defined or are empty, the names are automatically generated.
# File lib/bio/tree/output.rb, line 251
def output_phylip_distance_matrix(nodes = nil, options = {})
  nodes = self.leaves unless nodes
  names = nodes.collect do |x|
    y = get_node_name(x)
    y = sprintf("%x", x.__id__.abs) if y.empty?
    y
  end
  m = self.distance_matrix(nodes)
  Bio::Phylip::DistanceMatrix.generate(m, names, options)
end
Gets the parent node of the node. If root isn't specified or root is nil, @root is used. Returns a Node object or nil. If no root is available, IndexError is raised. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb, line 687
def parent(node, root = nil)
  root ||= @root
  raise IndexError, 'can not get parent for unrooted tree' unless root
  unless ret = _get_cached_parent(node, root) then
    ret = self.path(root, node)[-2]
    _cache_parent(node, ret, root)
  end
  ret
end
Gets the path from node1 to node2. Returns an array of nodes, including node1 and node2. If node1 and/or node2 do not exist, IndexError is raised. If node1 and node2 are not connected, NoPathError is raised. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb, line 612
def path(node1, node2)
  raise IndexError, 'node1 not found' unless @pathway.graph[node1]
  raise IndexError, 'node2 not found' unless @pathway.graph[node2]
  return [ node1 ] if node1 == node2
  return [ node1, node2 ] if @pathway.graph[node1][node2]
  step, path = @pathway.bfs_shortest_path(node1, node2)
  unless path[0] == node1 and path[-1] == node2 then
    raise NoPathError, 'node1 and node2 are not connected'
  end
  path
end
Removes an edge between source and target. Returns self. If the edge does not exist, raises IndexError.
# File lib/bio/tree.rb, line 465
def remove_edge(source, target)
  unless self.get_edge(source, target) then
    raise IndexError, 'edge not found'
  end
  _clear_cache
  fwd = [ source, target ]
  rev = [ target, source ]
  @pathway.relations.delete_if do |rel|
    rel.node == fwd or rel.node == rev
  end
  h = @pathway.graph[source]
  h.delete(target) if h
  h = @pathway.graph[target]
  h.delete(source) if h
  self
end
Removes each edge for which the block returns a true value (neither nil nor false). Returns self.
# File lib/bio/tree.rb, line 484
def remove_edge_if #:yields: source, target, edge
  _clear_cache
  removed_rel = []
  @pathway.relations.delete_if do |rel|
    if yield rel.node[0], rel.node[1], rel.edge then
      removed_rel << rel
      true
    end
  end
  removed_rel.each do |rel|
    source = rel.node[0]
    target = rel.node[1]
    h = @pathway.graph[source]
    h.delete(target) if h
    h = @pathway.graph[target]
    h.delete(source) if h
  end
  self
end
Removes the given node from the tree. All edges connected with the node are also removed. Returns self. If the node does not exist, raises IndexError.
# File lib/bio/tree.rb, line 436
def remove_node(node)
  #_clear_cache #done in clear_node(node)
  self.clear_node(node)
  @pathway.graph.delete(node)
  self
end
Removes each node for which the block returns a true value (neither nil nor false). All edges connected with the removed nodes are also removed. Returns self.
# File lib/bio/tree.rb, line 446
def remove_node_if
  #_clear_cache #done in clear_node(node)
  all = self.nodes
  all.each do |node|
    if yield node then
      self.clear_node(node)
      @pathway.graph.delete(node)
    end
  end
  self
end
Removes all nodes that are neither branches nor leaves. That is, removes nodes connected with exactly two edges. For each removed node, the two adjacent edges are merged and a new edge is created. Returns the removed nodes. Note that orphan nodes are kept unchanged.
# File lib/bio/tree.rb, line 852
def remove_nonsense_nodes
  _clear_cache
  hash = {}
  self.each_node do |node|
    hash[node] = true if @pathway.graph[node].size == 2
  end
  hash.each_key do |node|
    adjs = @pathway.graph[node].keys
    edges = @pathway.graph[node].values
    new_edge = get_edge_merged(edges[0], edges[1])
    @pathway.graph[adjs[0]].delete(node)
    @pathway.graph[adjs[1]].delete(node)
    @pathway.graph.delete(node)
    @pathway.append(Bio::Relation.new(adjs[0], adjs[1], new_edge))
  end
  #@pathway.to_relations
  @pathway.relations.reject! do |rel|
    hash[rel.node[0]] or hash[rel.node[1]]
  end
  return hash.keys
end
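The merge step can be illustrated with a standalone sketch on a plain Hash, assuming edges are bare branch lengths that are simply added together. It is not Bio::Tree code, and remove_degree_two_nodes is a hypothetical name.

```ruby
# Hypothetical weighted adjacency list: a - m - b, with b also linked to
# c and d. Only m has exactly two edges.
graph = {
  :a => { :m => 1.0 },
  :m => { :a => 1.0, :b => 2.0 },
  :b => { :m => 2.0, :c => 1.0, :d => 1.0 },
  :c => { :b => 1.0 },
  :d => { :b => 1.0 }
}

# Removes every node that has exactly two neighbors and joins those
# neighbors with one edge whose length is the sum of the two old edges.
def remove_degree_two_nodes(graph)
  targets = graph.keys.select { |node| graph[node].size == 2 }
  targets.each do |node|
    (left, d1), (right, d2) = graph[node].to_a
    graph[left].delete(node)
    graph[right].delete(node)
    graph.delete(node)
    graph[left][right] = graph[right][left] = d1 + d2
  end
  targets
end

removed = remove_degree_two_nodes(graph)
```

As with the documented method, the list of nodes to remove is decided before any removal takes place, and the removed nodes are returned.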
Gets the sub-tree consisting of the given nodes. nodes must be an array of nodes. Nodes that do not exist in the original tree are ignored. Returns a Tree object. Note that the sub-tree shares Node and Edge objects with the original tree.
# File lib/bio/tree.rb, line 543
def subtree(nodes)
  nodes = nodes.find_all do |x|
    @pathway.graph[x]
  end
  return self.class.new if nodes.empty?
  # creates subtree
  new_tree = self.class.new
  nodes.each do |x|
    new_tree.add_node(x)
  end
  self.each_edge do |node1, node2, edge|
    if new_tree.include?(node1) and new_tree.include?(node2) then
      new_tree.add_edge(node1, node2, edge)
    end
  end
  return new_tree
end
Gets the sub-tree consisting of the given nodes and all internal nodes on the paths between them. nodes must be an array of nodes. Nodes that do not exist in the original tree are ignored. Returns a Tree object. The result is unspecified for cyclic trees. Note that the sub-tree shares Node and Edge objects with the original tree.
# File lib/bio/tree.rb, line 569
def subtree_with_all_paths(nodes)
  hash = {}
  nodes.each { |x| hash[x] = true }
  nodes.each_index do |i|
    node1 = nodes[i]
    (0...i).each do |j|
      node2 = nodes[j]
      unless node1 == node2 then
        begin
          path = self.path(node1, node2)
        rescue IndexError, NoPathError
          path = []
        end
        path.each { |x| hash[x] = true }
      end
    end
  end
  self.subtree(hash.keys)
end