Module HTree::Container::Trav
In: htree/modules.rb
htree/traverse.rb

Included Modules

Traverse

Public Instance methods

each_hyperlink traverses hyperlinks, such as the HTML href attribute of an A element.

It yields HTree::Text or HTree::Loc.

Note that each_hyperlink also yields the href attribute of the HTML BASE element.

each_hyperlink_uri traverses hyperlinks, such as the HTML href attribute of an A element.

It yields HTree::Text (or HTree::Loc) and a URI for each hyperlink.

The URI objects are created with a base URI, which is given by the HTML BASE element or by the argument base_uri. each_hyperlink_uri doesn't yield the href of the BASE element.

each_uri traverses hyperlinks, such as the HTML href attribute of an A element.

It yields a URI for each hyperlink.

The URI objects are created with a base URI, which is given by the HTML BASE element or by the argument base_uri.
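The base URI resolution described for each_hyperlink_uri and each_uri can be illustrated with Ruby's standard URI library alone. This is a minimal sketch, not HTree's implementation: the helper name resolve_href and the example.com addresses are hypothetical, and it only shows how an href value is typically combined with a base URI.

```ruby
require 'uri'

# Hypothetical helper (not part of HTree): resolve an href value
# against a base URI, as a BASE element or a base_uri argument would supply.
def resolve_href(base_uri, href)
  URI.parse(base_uri) + href   # URI::Generic#+ merges a reference into the base
end

base = 'http://example.com/docs/index.html'

puts resolve_href(base, 'guide.html')              # http://example.com/docs/guide.html
puts resolve_href(base, '/top.html')               # http://example.com/top.html
puts resolve_href(base, 'http://other.example/x')  # http://other.example/x
```

A relative href replaces the last path segment of the base, a rooted href replaces the whole path, and an absolute href ignores the base entirely.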

filter rebuilds the tree without some components.

  node.filter {|descendant_node| predicate } -> node
  loc.filter {|descendant_loc| predicate } -> node

filter yields each node except the top node. If the block returns false, the corresponding node is dropped. If the block returns true, the corresponding node is retained and its inner nodes are examined.

filter returns a node. It doesn't return a location object even if self is a location object.
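The rebuilding rule above can be modelled on a toy tree. The sketch below is an illustration under stated assumptions, not HTree's code: ToyNode, toy_filter and toy_names are hypothetical names, and real HTree nodes carry far more than a name and a child list.

```ruby
# Toy node type for illustration only; HTree's real node classes differ.
ToyNode = Struct.new(:name, :children)

# Rebuild the tree below +node+. Each descendant is passed to the block;
# a false result drops it (its inner nodes are never examined), a true
# result keeps it and recurses. The top node is never passed to the block.
def toy_filter(node, &block)
  kept = []
  node.children.each do |child|
    kept << toy_filter(child, &block) if block.call(child)
  end
  ToyNode.new(node.name, kept)
end

# Flatten a toy tree into a list of names, parents before children.
def toy_names(node)
  [node.name, *node.children.flat_map { |child| toy_names(child) }]
end

tree = ToyNode.new('a', [
  ToyNode.new('b', [ToyNode.new('script', []), ToyNode.new('i', [])]),
  ToyNode.new('script', [ToyNode.new('b', [])])
])

pruned = toy_filter(tree) { |n| n.name != 'script' }
p toy_names(pruned)   # ["a", "b", "i"]
```

The sketch returns a fresh tree and leaves the original untouched, which mirrors the "rebuilds the tree" wording rather than modifying nodes in place.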

find_element searches for an element whose universal name is specified by the arguments. It returns nil if no such element is found.

traverse_element traverses the elements in the tree. It yields elements in depth-first order.

If no names are given, it yields all elements. If names are given, they should be universal names, and only elements with a matching name are yielded.

Nested elements are yielded in depth-first order, as follows.

  t = HTree('<a id=0><b><a id=1 /></b><c id=2 /></a>')
  t.traverse_element("a", "c") {|e| p e}
  # =>
  {elem <a id="0"> {elem <b> {emptyelem <a id="1">} </b>} {emptyelem <c id="2">} </a>}
  {emptyelem <a id="1">}
  {emptyelem <c id="2">}
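The visiting order in the example above can be reproduced on a toy structure. This is a sketch only: ToyElem, toy_traverse and toy_tree are hypothetical names that mimic the documented behaviour (parent before children, optional name filter) and are not taken from HTree's source.

```ruby
# Toy element type for illustration only; not an HTree class.
ToyElem = Struct.new(:name, :id, :children)

# Visit +elem+ and its descendants depth first, parent before children.
# With no names every element is yielded; otherwise only matching ones.
def toy_traverse(elem, *names, &block)
  block.call(elem) if names.empty? || names.include?(elem.name)
  elem.children.each { |child| toy_traverse(child, *names, &block) }
end

# Same shape as <a id=0><b><a id=1 /></b><c id=2 /></a>.
def toy_tree
  ToyElem.new('a', 0, [
    ToyElem.new('b', nil, [ToyElem.new('a', 1, [])]),
    ToyElem.new('c', 2, [])
  ])
end

visited = []
toy_traverse(toy_tree, 'a', 'c') { |e| visited << [e.name, e.id] }
p visited   # [["a", 0], ["a", 1], ["c", 2]]
```

The inner a element is reached through b even though b itself does not match, so the name filter affects what is yielded, not how deep the traversal goes.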

Universal names are specified as follows.

  t = HTree(<<'End')
  <html>
  <meta name="robots" content="index,nofollow">
  <meta name="author" content="Who am I?">
  </html>
  End
  t.traverse_element("{http://www.w3.org/1999/xhtml}meta") {|e| p e}
  # =>
  {emptyelem <{http://www.w3.org/1999/xhtml}meta name="robots" content="index,nofollow">}
  {emptyelem <{http://www.w3.org/1999/xhtml}meta name="author" content="Who am I?">}
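The universal names shown above follow a {namespace-URI}local-name notation. As a rough sketch of how such a string can be taken apart, assuming that notation and nothing more, the helper below splits a name into its two parts. split_universal_name is a hypothetical name and not an HTree method.

```ruby
# Hypothetical helper (not an HTree API): split a universal name written
# as "{namespace-uri}local" into [namespace_uri, local_name].
# A name without a leading "{...}" part yields nil for the namespace.
def split_universal_name(name)
  if (m = /\A\{([^}]*)\}(.+)\z/.match(name))
    [m[1], m[2]]
  else
    [nil, name]
  end
end

p split_universal_name('{http://www.w3.org/1999/xhtml}meta')
# ["http://www.w3.org/1999/xhtml", "meta"]
p split_universal_name('a')
# [nil, "a"]
```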
