HTML::Seamstress - HTML::Tree subclass for HTML templating via tree rewriting
HTML::Seamstress provides ``fourth generation'' dynamic HTML generation (templating).
In the beginning we had...
First generation dynamic HTML production used server-side includes:
<p>Today's date is <!--#echo var="DATE_LOCAL" --> </p>
The next phase of HTML generation saw embedded HTML snippets in Perl code. For example:
sub header {
    my $title = shift;
    print <<"EOHEADER";
<head>
<title>$title</title>
</head>
EOHEADER
}
The 3rd generation solutions embed programming language constructs with HTML. The language constructs are either a real language (as is with the HTML::Mason manpage) or a pseudo/mini-language (as is with PeTaL, Template or the HTML::Template manpage). Let's see some Template code:
<p>Hi there [% name %], are you enjoying your stay?</p>
Up to now, all approaches to this issue tamper with the HTML in some form or fashion.
The fourth generation of HTML production is distinguished by no need for tampering with the HTML. There is a wealth of XML-based modules which provide this approach (the XML::Twig manpage, the XML::LibXML manpage, the XML::TreeBuilder manpage, the XML::DOM manpage). HTML::Seamstress is the one CPAN module based around HTML and the HTML::Tree manpage for this approach.
When looking at HTML::Seamstress, we are looking at a uniquely positioned 4th-generation HTML generator. Seamstress offers two sets of advantages: those common to all 4th generation htmlgens and those common to a subclass of the HTML::Tree manpage.
What advantages does this fourth way of HTML manipulation offer? Let's take a look:
The contents of the document remain legal HTML/XML that can be developed using standard interactive design tools. The flow of control of the code remains separate from the page. Technologies that mix content and data in a single file result in code that is often difficult to understand and has trouble taking full advantage of the object oriented programming paradigm.
If you have a strong hold on object-oriented Perl and a solid understanding of the tree-based nature of HTML, then all you need to do is read the manual pages showing how Seamstress and related modules offer tree manipulation routines and you are done.
Extension just requires writing new Perl methods - a snap for any object oriented Perler.
Mixing Perl and HTML (by any of the generation 1-3 approaches) makes it impossible to use standard validation and formatting tools for either Perl or HTML.
Perl and HTML are solid technologies with years of effort behind making them robust and flexible enough to meet real-world technological demands.
Because manipulator and manipulated are separate, we can choose manipulators and/or stack them at will.
The real world is unfortunately more about getting HTML to work with IE and maybe 1 or 2 other browsers. Strict XHTML may not be acceptable under time and corporate pressures to get things to work with quirky browsers.
The HTML::Tree manpage has a large set of accessor/modifier functions. If that is not enough, then take a gander at Matthew Sisk's contributions: http://search.cpan.org/~msisk/ as well as the HTML::Element::Library manpage.
Now it's time to look at some examples. Before doing so, it is imperative that you understand the tree structure of HTML.
The best representation of this fact is this slide right here:
http://xmlc.objectweb.org/doc/xmlcSlides/xmlcSlides.html#de
If you understand this (and maybe the rest of the slides), then you have a good grip on seeing HTML as a tree.
The HTML::AboutTrees manpage also teaches this, but it takes a while before it gets to what matters to us. It's a fun read nonetheless.
Now that we've got this concept under our belts let's try some full examples.
The first thing to remember is that Seamstress is really just convenience functions for HTML::Tree. You can do entirely without Seamstress. It's just that my daily real-world obligations have led to a set of library functions (HTML::Element::Library) and a convenient way to locate ``templates'' (spkg.pl) that work well on top of HTML::Tree.
sbase.pl and spkg.pl (both of which should be on your $PATH) are used to simplify the process of parsing an HTML file into an HTML::TreeBuilder object. In other words, instead of having to do this in your Perl programs:
use HTML::TreeBuilder;
my $tree = HTML::TreeBuilder->new_from_file('/usr/htdocs/hello.html');
You can do this:
use htdocs::hello;
my $tree = htdocs::hello->new;
The line count is not much different, but abstracting away absolute paths is important in production environments where the absolute path may come from who knows where via who knows how.
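To make the idea concrete, here is a minimal sketch of what such a wrapper package could look like. It is not the code that spkg.pl generates: the package name Demo::hello, its comp_root() body, and the temporary directory standing in for a document root are all assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use HTML::TreeBuilder;

# Stand-in for a document root; a real setup would use a fixed directory.
my $root = tempdir(CLEANUP => 1);
open my $fh, '>', File::Spec->catfile($root, 'hello.html') or die $!;
print {$fh} '<html><body><h1>Hello World</h1></body></html>';
close $fh;

package Demo::hello;    # hypothetical wrapper, not spkg.pl output

# The only place that knows where the HTML lives.
sub comp_root { return $root }

# Parse the file relative to comp_root and hand back the tree.
sub new {
    my ($class) = @_;
    my $file = File::Spec->catfile($class->comp_root, 'hello.html');
    return HTML::TreeBuilder->new_from_file($file);
}

package main;

my $tree = Demo::hello->new;
print $tree->as_HTML, "\n";
```

Callers only ever see Demo::hello->new, so moving the HTML means changing comp_root() in one place.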
HTML::Seamstress::Base must be on your @INC. This module contains one function, comp_root(), which points to a place you wouldn't typically have on your @INC but which you must have, because your HTML file and the corresponding .pm abstracting it are going to be there.
metaperl@pool-71-109-151-76:~/www$ spkg.pl moose.html
comp_root........ /home/metaperl/
html_file_path... /home/metaperl/www/
html_file........ moose.html
html_file sans... moose
moose.html compiled to package www::moose
use www::moose;
my $tree = www::moose->new;
# manipulate tree...
$tree->as_HTML;
In a mod_perl setup, you would want to pre-load your HTML, and Class::Cache was designed for this very purpose. But that's a topic for another time.
In a setup with HTML files in numerous places, I recommend setting up multiple base packages (HTML::Seamstress::Base::here, HTML::Seamstress::Base::there), one for each file root. To do this, you will need to use the --base_pkg and --base_pkg_root options to spkg.pl.
Every package comes from one spkg.pl call. Just supply it with a different HTML file to create a different package. Then use them, new them, manipulate them and $tree->as_HTML them at will.
Now it's time to rock and roll!
In our first example, we want to perform simple text substitution on the HTML template document:
<html>
<head>
  <title>Hello World</title>
</head>
<body>
<h1>Hello World</h1>
  <p>Hello, my name is <span id="name">dummy_name</span>.
  <p>Today's date is <span id="date">dummy_date</span>.
</body>
</html>
First save this somewhere on your document root. Then compile it with spkg.pl. Now you simply use the ``compiled'' version of HTML with API calls to HTML::TreeBuilder, HTML::Element, and HTML::Element::Library.
use html::hello_world;
my $tree = html::hello_world->new;
# quote the id values: barewords here fail under "use strict"
$tree->look_down(id => 'name')->replace_content('terrence brannon');
$tree->look_down(id => 'date')->replace_content('5/11/1969');
print $tree->as_HTML;
replace_content() is a convenience function in the HTML::Element::Library manpage.
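For readers who want to see the node mutation without HTML::Element::Library, the same effect can be sketched with plain HTML::Element calls. The helper name set_text_by_id is hypothetical, and this is an approximation of what replace_content() does, not its actual source.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: replace the children of the element with this id.
sub set_text_by_id {
    my ($tree, $id, $text) = @_;
    my $node = $tree->look_down(id => $id)
        or die "no element with id '$id'";
    $node->delete_content;         # drop the placeholder text
    $node->push_content($text);    # append the real value
    return $node;
}

my $tree = HTML::TreeBuilder->new_from_content(
    '<p>Hello, my name is <span id="name">dummy_name</span>.</p>'
);
set_text_by_id($tree, 'name', 'terrence brannon');
print $tree->as_HTML, "\n";
```

The template keeps its dummy text for designers; only the in-memory tree is changed.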
node(s) deletion

<span id="age_dialog">
  <span id="under10">
    Hello, does your mother know you're using her AOL account?
  </span>
  <span id="under18">
    Sorry, you're not old enough to enter (and too dumb to lie about your age)
  </span>
  <span id="welcome">
    Welcome
  </span>
</span>
Again, compile and use the module:
use html::age_dialog;
my $tree = html::age_dialog->new;
$tree->highlander(
    age_dialog => [
        under10 => sub { $_[0] < 10 },
        under18 => sub { $_[0] < 18 },
        welcome => sub { 1 },
    ],
    $age
);
print $tree->as_HTML;
# will only output one of the 3 dialogues, based on which closure fires first
And once again, the function we used is the highlander method, also a part of the HTML::Element::Library manpage.
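The "there can be only one" logic can be sketched with plain HTML::Element calls as well. The helper keep_first_match below is hypothetical and is not how HTML::Element::Library implements highlander; it only illustrates the idea of keeping the first child whose test fires and deleting the others.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: walk (id => test) pairs in order, keep the first
# node whose test returns true, and delete every other listed node.
sub keep_first_match {
    my ($tree, $conds, @args) = @_;
    my @pairs = @$conds;
    my $kept;
    while (my ($id, $test) = splice @pairs, 0, 2) {
        my $node = $tree->look_down(id => $id) or next;
        if (!defined $kept && $test->(@args)) {
            $kept = $id;      # survivor: leave it in the tree
        }
        else {
            $node->delete;    # everyone else is removed
        }
    }
    return $kept;
}

my $tree = HTML::TreeBuilder->new_from_content(
    '<span id="age_dialog">'
  . '<span id="under10">Too young</span>'
  . '<span id="under18">Not old enough</span>'
  . '<span id="welcome">Welcome</span>'
  . '</span>'
);
my $age  = 15;    # sample input for the sketch
my $kept = keep_first_match(
    $tree,
    [
        under10 => sub { $_[0] < 10 },
        under18 => sub { $_[0] < 18 },
        welcome => sub { 1 },
    ],
    $age,
);
print $tree->as_HTML, "\n";
```

Order matters: the catch-all welcome test must come last, or it would win every time.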
The following libraries are always available for more complicated manipulations:
Table unrolling, pulldown creation, li unrolling, and dl unrolling are all examples of a tree operation in which you take a child of a node and clone it and then alter it in some way (replace the content, alter some of its attributes), and then stick it under its parent.
Functions for use with the common HTML elements (<table>, <ol>, <ul>, <dl>, <select>) are documented in the HTML::Element::Library manpage and are prefaced with the words ``Tree Building Methods''.
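The clone-alter-reattach pattern described above can be sketched for a <ul> using only HTML::Element methods. The helper unroll_list and the sample markup are hypothetical; the Tree Building Methods in HTML::Element::Library offer richer versions of the same idea.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: use the first <li> as a template and emit one
# altered clone of it per data item.
sub unroll_list {
    my ($tree, $list_id, @items) = @_;
    my $list = $tree->look_down(id => $list_id)
        or die "no element with id '$list_id'";
    my $template = $list->look_down(_tag => 'li')
        or die "list '$list_id' has no <li> to copy";
    $template->detach;        # keep the sample row, but out of the tree
    $list->delete_content;    # clear any remaining sample rows
    for my $item (@items) {
        my $li = $template->clone;    # attributes are copied too
        $li->delete_content;
        $li->push_content($item);
        $list->push_content($li);     # stick it under its parent
    }
    $template->delete;
    return $list;
}

my $tree = HTML::TreeBuilder->new_from_content(
    '<ul id="fruits"><li class="fruit">sample</li></ul>'
);
unroll_list($tree, 'fruits', qw(apple banana cherry));
print $tree->as_HTML, "\n";
```

Because each row is a clone of the designer's sample, any class or style attributes on it carry over to every generated row.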
Beyond the ``compilation'' support documented above, Seamstress offers nothing more than a simple structure-modifying method, expand_replace(). And to be honest, it probably shouldn't offer that. But once, when de-Mason-izing a site, it was easier to keep little itty-bitty components all over and so I wrote this method to facilitate the process.
Let's say you have this HTML:
<div id="sidebar">
<div class="sideBlock" id=mpi>mc::picBar::index</div>
<div class="sideBlock" id=mnm>mc::navBox::makeLinks</div>
<div class="sideBlock" id=mg>mc::gutenBox</div>
</div>
In this case, the content of each sideBlock is the name of a Perl Seamstress-style class. As you know, when the constructor for such a class is called, an HTML::Element, $E, will be returned for its parsed content.
In this case, we want the content of the div element to go from being the class name to being the HTML::Element that the class constructs. So to inline all 3 tags you would do the following:
$tree->look_down(id => $_)->expand_replace for qw(mpi mnm mg);
Useful in mod_perl environments and anywhere you want control over the timing of object creation.
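As a rough illustration of the mechanism, the sketch below inlines one component by hand. Demo::NavBox and inline_component are hypothetical names, and the real expand_replace() may differ in details such as how it loads the class; this only shows the text-to-element swap.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::Element;

package Demo::NavBox;    # hypothetical stand-in for a Seamstress-style class

# The constructor returns an HTML::Element, as such classes do.
sub new {
    my $e = HTML::Element->new('ul');
    $e->push_content(['li', 'Home'], ['li', 'About']);
    return $e;
}

package main;

# Hypothetical helper: treat the node's text as a class name, build that
# class, and make the result the node's only child.
sub inline_component {
    my ($node) = @_;
    (my $class = $node->as_text) =~ s/^\s+|\s+$//g;
    my $fragment = $class->new;
    $node->delete_content;
    $node->push_content($fragment);
    return $node;
}

my $tree = HTML::TreeBuilder->new_from_content(
    '<div id="sidebar"><div class="sideBlock" id="mnm">Demo::NavBox</div></div>'
);
inline_component($tree->look_down(id => 'mnm'));
print $tree->as_HTML, "\n";
```

Nothing is constructed until inline_component() runs, which is what gives the caller control over the timing of object creation.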
A fierce head-to-head between PeTaL and Seamstress goes on for several days in this thread!
A striking example of the limitations of mini-languages is shown here: http://perlmonks.org/
But the most cogent argument for using full-strength languages as opposed to mixing them occurs in the Text::Template manpage docs:
When people make a template module like this one, they almost always start by inventing a special syntax for substitutions. For example, they build it so that a string like %%VAR%% is replaced with the value of $VAR. Then they realize the need extra formatting, so they put in some special syntax for formatting. Then they need a loop, so they invent a loop syntax. Pretty soon they have a new little template language.
This approach has two problems: First, their little language is crippled. If you need to do something the author hasn't thought of, you lose. Second: Who wants to learn another language? You already know Perl, so why not use it?
http://www.servlets.com/soapbox/problems-jsp-reaction.html
http://www-106.ibm.com/developerworks/library/w-friend.html
http://www.theserverside.com/resources/article.jsp
Two other frameworks come to mind. Both are stricter with regard to the correctness of the HTML and both use a different means for node lookup and rewrite.
From the docs, it looks like the XML::GDOME manpage is the successor to this module.
http://lists.sourceforge.net/lists/listinfo/seamstress-discuss
Terrence Brannon, tbone@cpan.org
I would like to thank
HTML_Tree
HTML_Tree is a C++ HTML manipulator with a Perl interface. Upon using his Perl interface, I began to notice limitations and extended his Perl interface. The author was not interested in working with me or my extensions, so I had to continue on a separate path.
johnnywang for his post about dynamic HTML generation
Copyright 2002-2005 by Terrence Brannon.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.