htree.rb

Path: htree.rb
Last Update: Fri Sep 17 15:12:27 +0000 2004

HTML/XML document tree

Author: Tanaka Akira <akr@m17n.org>

Features

  • Permissive unified HTML/XML parser
  • byte-to-byte round-tripping unparser
  • XML namespace support
  • Dedicated class for escaped strings, which eases sanitization.
  • XHTML/XML generator
  • template engine: files/htree/template_rb.html
  • recursive template expansion
  • REXML tree generator: files/htree/rexml_rb.html
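
The idea behind a dedicated escaped-string class can be illustrated with a minimal sketch. This is not htree's implementation: `EscapedText`, `from_raw`, and `paragraph` are hypothetical names chosen for illustration. The point is that escaping happens exactly once, at construction, and that output code accepts only the wrapper type, so raw text cannot reach the markup by accident.

```ruby
# Hypothetical wrapper (an assumption, not htree's actual class): a value
# that is known to be HTML-escaped already.
class EscapedText
  ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

  # Build from raw, untrusted text; escaping happens exactly once, here.
  def self.from_raw(raw)
    new(raw.gsub(/[&<>"]/, ESCAPES))
  end

  def initialize(escaped)
    @escaped = escaped.dup.freeze
  end

  # The escaped form is what gets written into markup.
  def to_s
    @escaped
  end
end

# Accepts only EscapedText, so a plain String is rejected instead of
# being written out unescaped.
def paragraph(text)
  raise TypeError, 'expected EscapedText' unless text.is_a?(EscapedText)
  "<p>#{text}</p>"
end

user_input = '<script>alert(1)</script>'
html = paragraph(EscapedText.from_raw(user_input))
puts html
```

Because the type carries the "already escaped" guarantee, double escaping and forgotten escaping both become type errors rather than silent bugs.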

Example

The following one-liner prints the parsed tree object.

  % ruby -rhtree -e 'pp HTree(ARGF)' html-file

The following two-line script converts HTML to XHTML.

  require 'htree'
  HTree(STDIN).display_xml

Conversion to a REXML tree is provided by the to_rexml method.

    HTree(...).to_rexml
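
Once a REXML tree is available, the standard REXML API applies to it. The sketch below shows one possible follow-up step. As an assumption made so that it runs without htree installed, the document is built directly from a string with `REXML::Document.new` as a stand-in for the result of `to_rexml`, and the sample input declares no XML namespace.

```ruby
require 'rexml/document'

# Stand-in for the tree that to_rexml would return (assumption: built from
# a literal string here instead of from a parsed HTML file).
xhtml = '<html><body><p>first</p><p>second</p></body></html>'
doc = REXML::Document.new(xhtml)

# Query the tree with XPath and collect the text of every <p> element.
paragraphs = REXML::XPath.match(doc, '//p').map(&:text)
puts paragraphs.inspect
```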

Module/Class Hierarchy

Method Summary

HTree provides the following methods.

  • Parsing Methods
  • Generation Methods
    • HTree::Node#display_xml -> STDOUT
    • HTree::Node#display_xml(out) -> out
    • HTree::Node#display_xml(out, encoding) -> out
    • HTree::Text#to_s -> String
  • Template Methods
  • Traverse Methods
  • Predicate Methods
    • HTree::Traverse#doc? -> true or false
    • HTree::Traverse#elem? -> true or false
    • HTree::Traverse#text? -> true or false
    • HTree::Traverse#xmldecl? -> true or false
    • HTree::Traverse#doctype? -> true or false
    • HTree::Traverse#procins? -> true or false
    • HTree::Traverse#comment? -> true or false
    • HTree::Traverse#bogusetag? -> true or false
  • REXML Tree Generator
    • HTree::Node#to_rexml -> REXML::Child
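
Predicate methods such as elem? and text? let traversal code branch on the node kind without naming concrete classes. The following sketch shows that pattern only. `NodePredicates`, `MiniElem`, `MiniText`, `MiniComment`, and `collect_text` are hypothetical stand-ins and do not reflect htree's internal class layout.

```ruby
# Hypothetical mixin: every predicate answers false unless a class overrides it.
module NodePredicates
  def elem?;    false; end
  def text?;    false; end
  def comment?; false; end
end

# Hypothetical element node holding child nodes.
class MiniElem
  include NodePredicates
  attr_reader :name, :children

  def initialize(name, children = [])
    @name = name
    @children = children
  end

  def elem?; true; end
end

# Hypothetical text node.
class MiniText
  include NodePredicates
  attr_reader :content

  def initialize(content)
    @content = content
  end

  def text?; true; end
end

# Hypothetical comment node.
class MiniComment
  include NodePredicates

  def initialize(content)
    @content = content
  end

  def comment?; true; end
end

# Gather text recursively; comments and any other node kinds are skipped.
def collect_text(node)
  if node.text?
    [node.content]
  elsif node.elem?
    node.children.flat_map { |child| collect_text(child) }
  else
    []
  end
end

tree = MiniElem.new('p', [
  MiniText.new('Hello, '),
  MiniComment.new('internal note'),
  MiniElem.new('b', [MiniText.new('world')])
])
result = collect_text(tree).join
puts result
```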
