A template system can be used to separate output formatting specifications, which govern the appearance and location of output text and data elements, from the executable logic which prepares the data and makes decisions about what appears in the output.
Template systems lie along a continuum of power versus separation. "Powerful" constructs like variable assignment or conditional statements make it easy to modify the look of an application within the template system exclusively, without having to modify any of the underlying "application logic". They do so, however, at the cost of separation, turning the templates themselves into part of the application logic.
This template system leans strongly towards preserving the separation of logic and presentation. It is intentionally constrained in the features it supports and, as a result, applications tend to require quite a bit of code to instantiate a template. This may not be to everybody's tastes. However, while this design limits the power of the template language, it does not limit the power or flexibility of the template system. This system supports arbitrarily complex text formatting. Many Google applications, including the "main" Google web search, use this system exclusively for formatting output.
Finally, this system is designed with an eye towards efficiency. Template instantiation is very quick, and the implementation aims to minimize both memory use and memory fragmentation.
There are two parts to the Google Template System: templates and data dictionaries.
The templates are text files that contain the format specification for the formatted output, i.e., the template language. The data dictionaries contain the mappings from the template elements (markers) embedded in the templates to the data that they will format. Here's a simple template:
<html><head><title>{{TITLE}}</title>{{META_TAGS}}</head>
<body>{{BODY}}</body></html>
Here's a dictionary that one could use to instantiate the template:
{"TITLE": "Template example", "BODY": "This is a simple template example.\nIt's boring", "DATE": "11/20/2005"}
If we instantiated the template with this dictionary, here's the output we would get:
<html><head><title>Template example</title></head>
<body>This is a simple template example.
It's boring</body></html>
{{TITLE}} and {{BODY}} are template elements, also called markers. In the dictionary, TITLE, BODY, and DATE are dictionary names, and the values associated with each one, such as 11/20/2005, are dictionary values.
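A few points are clear even from this simple example:
Dictionary values come already formatted. It is up to the application code to decide how to format the value for DATE, and to insert the date into the dictionary already formatted.
Not every dictionary value needs to be used by a template. DATE is entirely ignored.
Not every template marker needs to exist in the dictionary. {{META_TAGS}} is not found in the dictionary. This is perfectly legal; missing variable markers evaluate to the empty string.
The template language has four types of markers:
variable markers: {{VARIABLE}}
start-section and end-section markers: {{#SECTION_NAME}}...{{/SECTION_NAME}}
template-include markers: {{>FILENAME}}
comment markers: {{! comment lives here -- cool, no?}}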
These marker types each have their own namespace. For readability, however, it is best to not overuse a single name.
Anything found in a template of the form {{...}} is interpreted as a template marker. All other text is considered formatting text and is output verbatim at template expansion time. Formatting text may consist of HTML tags, XML tags, linefeeds and other spacing characters, constant text, etc.
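The marker rules above can be illustrated with a small standalone sketch. It is a toy model, not the library's implementation: the function name ExpandVariables is hypothetical, it understands only variable and comment markers (no sections, includes, or modifiers), and it copies an unterminated marker verbatim where the real system would report a syntax error.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of variable-marker expansion (hypothetical helper, not part of
// the template library). Handles only {{NAME}} and {{! comment}} markers.
std::string ExpandVariables(const std::string& tpl,
                            const std::map<std::string, std::string>& dict) {
  std::string out;
  std::string::size_type pos = 0;
  while (pos < tpl.size()) {
    const std::string::size_type open = tpl.find("{{", pos);
    const std::string::size_type close =
        open == std::string::npos ? open : tpl.find("}}", open + 2);
    if (close == std::string::npos) {
      // No complete marker remains: copy the rest as formatting text.
      out.append(tpl, pos, std::string::npos);
      break;
    }
    out.append(tpl, pos, open - pos);  // formatting text is output verbatim
    const std::string name = tpl.substr(open + 2, close - open - 2);
    if (name.empty() || name[0] != '!') {  // comment markers emit nothing
      const auto it = dict.find(name);
      if (it != dict.end()) out += it->second;  // missing keys expand to ""
    }
    pos = close + 2;
  }
  return out;
}
```

Calling it with a dictionary that holds TITLE and DATE on a template that uses TITLE and META_TAGS shows both effects from the example: the unused DATE value is ignored, and the missing META_TAGS marker expands to nothing.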
A data dictionary is a map from keys to values. The keys are always strings, each string representing either a variable, a section, or a template-include file. (Comments are not stored in the data dictionary!) These keys correspond to the name of the associated template marker: a section {{#FOO}} in the template text is matched to the key "FOO" in the dictionary, if it exists. Note the case must match as well.
The value associated with a key differs according to key type. The value associated with a variable is simple: it's the value for that variable. Both keys and values can be any 8-bit character-string, and may include internal NULs (\0).
The value associated with a section is more complicated, and somewhat recursive: it's a list of data dictionaries. Come template-expansion time, the section is expanded once for each dictionary in the list, so if there are two dictionaries in the list, then the section text will occur in the output twice. The first time, all variables/etc. in the section will be evaluated taking into account the first dictionary. The second time, all variables/etc. will be evaluated taking into account the second dictionary. (See below for a definition of "taking into account.")
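The "once per dictionary" behaviour can be sketched in a few lines. This is an illustrative model under stated assumptions: the type aliases VarMap and SectionMap and the function ExpandSection are hypothetical names, the section body may contain only variable markers, and parent-dictionary lookup is left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy types: a section's value is a list of dictionaries.
using VarMap = std::map<std::string, std::string>;
using SectionMap = std::map<std::string, std::vector<VarMap>>;

// Expands `body` once for each dictionary registered under `section_name`.
// A section with no entry is expanded zero times.
std::string ExpandSection(const std::string& section_name,
                          const std::string& body,
                          const SectionMap& sections) {
  std::string out;
  const auto sec = sections.find(section_name);
  if (sec == sections.end()) return out;
  for (const VarMap& child : sec->second) {
    std::string::size_type pos = 0;
    while (pos < body.size()) {
      const std::string::size_type open = body.find("{{", pos);
      const std::string::size_type close =
          open == std::string::npos ? open : body.find("}}", open + 2);
      if (close == std::string::npos) {  // no complete marker left
        out.append(body, pos, std::string::npos);
        break;
      }
      out.append(body, pos, open - pos);
      const std::string name = body.substr(open + 2, close - open - 2);
      const auto it = child.find(name);
      if (it != child.end()) out += it->second;
      pos = close + 2;
    }
  }
  return out;
}
```

With two dictionaries in the list, the section text appears twice in the output, each time using that dictionary's values.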
A template-include is a special type of section, so the associated value is the same: a list of dictionaries. Template-includes also have one other, mandatory associated piece of information: the filename of the template to include. This filename may be specified either as an absolute path, or as a relative path. (In the latter case, the path is taken relative to the template_root, as set by the application.)
The application program is responsible for building this data dictionary, including all nesting. It then applies this dictionary to a single template to produce formatted output.
A program using Google Templates typically reads in templates at load time. During the course of program execution, the program will repeatedly perform the following two steps: first, instantiate a data dictionary, and second, apply the dictionary to the template to produce output.
The template system applies a dictionary to a template by finding all template markers in the template, and replacing them with the appropriate dictionary values. It matches template markers to dictionary keys in the obvious way. For instance, a template marker {{FOO}} matches the dictionary key FOO. {{FOO:html_escape}} matches FOO as well. The marker {{#BAR}} matches the dictionary key BAR, as does the marker {{/BAR}}. The marker {{>BAZ}} matches the dictionary key BAZ. (And of course, the marker {{! comment}} doesn't match any dictionary key at all.)
Template-variables can also have modifiers. In that case, the template-system starts by finding the appropriate value for that variable in the dictionary, just like normal. Then it applies each modifier to the variable, left to right. Finally, it emits the modified value to the output. Template-includes can have modifiers in a similar way. In such cases, after the sub-template is expanded, but before its content is injected into the current template, it has the modifiers applied.
If no dictionary key is found for a given template marker, then the template marker is ignored: if a variable, it expands to the empty string; if a section or include-template, the section or include-template is expanded zero times.
All names are case sensitive. Names -- that is, variable keys and, as a result, template markers -- must be made of (7-bit ASCII) alphanumeric characters and the underscore. The comment marker, which does not map to dictionary keys, may contain any characters whatsoever except }, the close-curly brace. It's a syntax error for any template marker to violate this rule.
Outside of the template markers, templates may contain any text whatsoever, including (single) curly braces and NUL characters.
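As a rough illustration of the naming rule for markers, a check along the following lines would accept exactly the allowed characters. The function name IsValidMarkerName is hypothetical, and rejecting the empty name is an assumption of this sketch rather than something stated here.

```cpp
#include <cassert>
#include <string>

// Returns true if `name` consists only of 7-bit ASCII letters, digits, and
// underscores. Hypothetical helper; the empty-name rule is an assumption.
bool IsValidMarkerName(const std::string& name) {
  if (name.empty()) return false;
  for (unsigned char c : name) {
    const bool ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                    (c >= '0' && c <= '9') || c == '_';
    if (!ok) return false;
  }
  return true;
}
```

Because names are case sensitive, such a check validates the characters only; PREV_SEARCH and prev_search would both pass yet refer to different keys.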
Recall that variables look like this: {{VARNAME}}. We actually allow a more generic form: the variable name may be followed by one or more modifiers. A modifier is a filter that's applied at template-expand time, that munges the value of the variable before it's output. For instance, consider a template that looks like this:
<html><body>{{NAME:html_escape}}</body></html>
This asks the template system to apply the built-in html_escape modifier when expanding {{NAME}}. If you set NAME in your dictionary to be Jim & Bob, what will actually be emitted in the template is Jim &amp; Bob.
Modifiers work for variable names and also for template-includes: {{>SUB_TEMPLATE:html_escape}} means that when you expand SUB_TEMPLATE, html-escape the expanded text before inserting it into the current template.
You can chain modifiers together. This template first html-escapes NAME, and then javascript-escapes that result:
<html><body>{{NAME:html_escape:javascript_escape}}</body></html>
Modifiers typically have a long, descriptive name and also a one-letter abbreviation. So this example is equivalent to the previous one:
<html><body>{{NAME:h:j}}</body></html>
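The left-to-right chaining, and the equivalence of long and one-letter names, can be modelled as follows. This is a sketch under assumptions: HtmlEscape and JavascriptEscape are simplified stand-ins that handle only a handful of characters, ApplyModifiers is a hypothetical helper, and unknown modifier names are silently ignored here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for html escaping (assumption: only & < > " handled).
std::string HtmlEscape(const std::string& in) {
  std::string out;
  for (char c : in) {
    switch (c) {
      case '&': out += "&amp;"; break;
      case '<': out += "&lt;"; break;
      case '>': out += "&gt;"; break;
      case '"': out += "&quot;"; break;
      default: out += c;
    }
  }
  return out;
}

// Simplified stand-in for javascript escaping (quotes, backslash, newline).
std::string JavascriptEscape(const std::string& in) {
  std::string out;
  for (char c : in) {
    switch (c) {
      case '"': out += "\\\""; break;
      case '\'': out += "\\'"; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n"; break;
      default: out += c;
    }
  }
  return out;
}

// Applies each modifier in order, accepting long or one-letter names.
std::string ApplyModifiers(std::string value,
                           const std::vector<std::string>& modifiers) {
  for (const std::string& m : modifiers) {
    if (m == "html_escape" || m == "h") {
      value = HtmlEscape(value);
    } else if (m == "javascript_escape" || m == "j") {
      value = JavascriptEscape(value);
    }
  }
  return value;
}
```

In this model, applying {"h", "j"} and {"html_escape", "javascript_escape"} to the same value gives identical results, mirroring the two equivalent templates above.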
Here are the modifiers that are built in to the template system:
long name | short name | description |
---|---|---|
:html_escape | :h | html-escapes the variable before output (eg & -> &amp;) |
:pre_escape | :p | pre-escapes the variable before output (same as html_escape but whitespace is preserved; useful for <pre>...</pre>) |
:url_query_escape | :u | performs URL escaping on the variable before output. Space is turned into +, and everything other than [0-9a-zA-Z.,_:*/~!()-] is transformed into %-style escapes. Use this when you are building URLs with variables as parameters: <a href="http://google.com/search?q={{QUERY:u}}">{{QUERY:h}}</a> |
:javascript_escape | :j | javascript-escapes the variable before output (eg " -> \") |
:cleanse_css | :c | Removes characters not safe for a CSS value. Safe characters are alphanumeric, space, underscore, period, comma, exclamation mark, pound, percent, and dash. |
:json_escape | :o | json-escapes a variable before output as a string in json; similar to javascript escaping, but ignores characters such as = and &. |
:html_escape_with_arg | :H | special purpose html escaping. See below for details |
:url_escape_with_arg | :U | special purpose url escaping. See below for details |
:none | | leaves the variable as is |
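The rule given for :url_query_escape in the table is precise enough to sketch directly. The function name UrlQueryEscape is hypothetical, and the use of uppercase hex digits in the %-escapes is an assumption; the table does not say which case the library emits.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Sketch of the :url_query_escape rule: space becomes '+', characters in
// [0-9a-zA-Z.,_:*/~!()-] pass through, everything else becomes %XX.
// Uppercase hex is an assumption of this sketch.
std::string UrlQueryEscape(const std::string& in) {
  static const char kHex[] = "0123456789ABCDEF";
  static const char kSafePunct[] = ".,_:*/~!()-";
  std::string out;
  for (unsigned char c : in) {
    const bool alnum = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
                       (c >= 'A' && c <= 'Z');
    if (c == ' ') {
      out += '+';
    } else if (alnum ||
               (c != '\0' && std::strchr(kSafePunct, c) != nullptr)) {
      out += static_cast<char>(c);
    } else {
      out += '%';
      out += kHex[c >> 4];
      out += kHex[c & 0x0F];
    }
  }
  return out;
}
```

The explicit check for the NUL byte matters because values may contain internal NULs, and strchr would otherwise match the terminator of the safe-character list.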
The html_escape_with_arg and url_escape_with_arg modifiers are a bit different because they require a value to specify the type of escaping to use. For example, this template is equivalent to using the pre_escape modifier:
<html><body><pre>{{BODY:H=pre}}</pre></body></html>
Here are the values that are supported by the html_escape_with_arg modifier:
value | description |
---|---|
=snippet | like html_escape, but allows HTML entities and some tags to pass through unchanged. The allowed tags are <br>, <wbr>, <b>, and </b>. |
=pre | same as pre_escape |
=url | same as :U=html below. For backwards compatibility. |
=attribute | replaces characters not safe for use in an unquoted attribute with underscore. Safe characters are alphanumeric, underscore, dash, period, and colon. |
Here are the values that are supported by the url_escape_with_arg modifier:
value | description |
---|---|
=html | Ensures that a variable contains a safe URL. Safe means that it is either a http or https URL, or else it has no protocol specified. If the URL is safe it is html-escaped, otherwise it is replaced with #. |
=javascript | Same as =html, but using javascript escaping instead of html escaping. |
=query | Same as url_query_escape. |
NOTE: At the moment, there are no filters for handling XML attributes and text nodes. For HTML snippets, use the html filter; in other situations, it may be appropriate to use CDATA blocks.
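The "safe URL" test described for the =html value of url_escape_with_arg can be approximated as below. This is only a sketch: SafeUrlOrHash and StartsWithIgnoreCase are hypothetical names, the rule "a colon before the first slash means a protocol is specified" is an assumption about how "no protocol" is decided, and the html-escaping that follows for safe URLs is omitted.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Case-insensitive prefix test; `prefix` must already be lowercase.
bool StartsWithIgnoreCase(const std::string& s, const std::string& prefix) {
  if (s.size() < prefix.size()) return false;
  for (std::string::size_type i = 0; i < prefix.size(); ++i) {
    if (std::tolower(static_cast<unsigned char>(s[i])) != prefix[i]) {
      return false;
    }
  }
  return true;
}

// Returns `url` if it is http(s) or has no protocol, otherwise "#".
// Assumption: a ':' before the first '/' marks a protocol.
std::string SafeUrlOrHash(const std::string& url) {
  const std::string::size_type slash = url.find('/');
  const std::string::size_type colon = url.find(':');
  const bool has_protocol =
      colon != std::string::npos &&
      (slash == std::string::npos || colon < slash);
  if (!has_protocol) return url;
  if (StartsWithIgnoreCase(url, "http://") ||
      StartsWithIgnoreCase(url, "https://")) {
    return url;
  }
  return "#";
}
```

A value such as javascript:alert(1) is replaced with # in this model, while a relative path like /search?q=1 passes through for the subsequent escaping step.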
In addition to the built-in modifiers, you can write your own modifier. Custom modifiers must have a name starting with "x-", and the name can contain alphanumeric characters plus dashes and underscores. Custom modifiers can also accept values with any character except for : and }. For example this template could be a valid use of a custom modifier:
{{VAR:x-my_modifier:value1,value2,value3 has spaces,etc}}
See <template_modifiers.h> for details on how to write a modifier and how to register it (there's also an example of a custom modifier below).
The dictionary structure is a tree: there's a 'main' dictionary, and then sub-dictionaries for each section or include-template. Even with all this complexity, the lookup rules are mostly straightforward: when looking up a marker -- be it a variable, section, or include-template marker -- the system looks in the currently applicable dictionary. If it's found there, great. If not, and the parent dictionary is not an include-template, it continues the lookup in the parent dictionary, and possibly the grandparent, etc. That is, lookup has static scoping: you look in your dictionary and any parent dictionary that is associated with the same template-file. As soon as continuing the lookup would require you to jump to a new template-file (which is what include-template would do), the lookup stops.
For instance, for a template that says {{#RESULTS}}{{RESULTNUM}}. {{>ONE_RESULT}}{{/RESULTS}}, "ONE_RESULT" is looked for in the "RESULTS" dictionary, and if not found there, is looked for in the main, top-level dictionary. Likewise, the variable "RESULTNUM" is looked for first in the "RESULTS" dictionary, then in the main dictionary if necessary. However, markers inside the "ONE_RESULT" template will not do equivalent cascading lookups. In fact, its dictionary will have no parent dictionaries at all, because it's a different template file and thus in a different scope.
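The scoping rule can be pictured as a walk up a parent chain that stops at an include boundary. The sketch below is a toy model: ScopedDict and Lookup are hypothetical names, only variables are modelled, and the global dictionary is ignored.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy dictionary node for illustrating lookup scope (not the library type).
struct ScopedDict {
  std::map<std::string, std::string> vars;
  const ScopedDict* parent = nullptr;  // enclosing dictionary, if any
  bool is_include = false;             // true for an include-template dict
};

// Looks up `name` in `dict`, then in its ancestors, but never continues
// past an include-template dictionary, since its parent belongs to a
// different template file. Returns nullptr if the name is out of scope.
const std::string* Lookup(const ScopedDict& dict, const std::string& name) {
  for (const ScopedDict* d = &dict; d != nullptr; d = d->parent) {
    const auto it = d->vars.find(name);
    if (it != d->vars.end()) return &it->second;
    if (d->is_include) break;  // do not cross into the including template
  }
  return nullptr;
}
```

In terms of the example, a lookup starting in the RESULTS dictionary can reach the main dictionary, while a lookup starting in the ONE_RESULT dictionary cannot.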
Because of these scoping rules, it's perfectly reasonable to set all variables that are needed in a given template file, in the top-level dictionary for that template. In fact, the ShowSection() function is provided to support just this idiom. To avoid confusion in such a usage mode, it's strongly encouraged that you give unique names to all sections and include-templates in a single template file. (It's no problem, given the template scoping rules, for a single section or include-template name to be repeated across different template files.)
There's a single special case: the global variable dictionary. Every dictionary inherits its initial set of values from the global dictionary. Clients can set variables in the global dictionary just like they can in normal template dictionaries they create.
The system initializes the global dictionary with a few useful values for your convenience. All system variables are prefixed with BI_, to emphasize they are "built in" variables.
BI_SPACE, which has the value <space>. It is used to force a space at the beginning or end of a line in the template, where it would normally be suppressed. (See below.)
BI_NEWLINE, which has the value <newline>. It is used to force a newline at the end of a line, where it would normally be suppressed. (See below.)
As is usual for inheritance, if a user explicitly assigns a value to these variable-names in its own dictionary, this overrides the inherited value. So, dict->SetValue("BI_SPACE", "&nbsp;") causes BI_SPACE to have the value &nbsp;, rather than <space>, when expanding dict.
Note that only variables can be inherited from the global dictionary, not section dictionaries or include-file dictionaries.
A couple of small implementation notes: global inheritance is "last chance", so if a section's parent dictionary redefined BI_SPACE, say, the section dictionary inherits the parent-dict value, not the global-dict value. Second, variable inheritance happens at expand time, not at dictionary-create time. So if you create a section dictionary, and then afterwards set a variable in its parent dictionary (or in the global dictionary), the section will inherit that variable value, if it doesn't define the value itself.
Most application code concerns filling a template dictionary, but there is also code for loading templates themselves from disk. A final category of code lets you inspect and control the template system.
The code below assumes the default configuration option of putting all template code in namespace google.
The main routine to load a template is google::Template::GetTemplate(), defined in template.h. This is a static, factory method, that loads a template from either disk or from an internal template cache, and returns a pointer to a Template object. Besides a filename to load from, this routine takes a 'strip' argument which defines how to expand whitespace found in a template file. It can have one of the following values:
google::DO_NOT_STRIP: do nothing. This expands the template file verbatim.
google::STRIP_BLANK_LINES: remove all blank lines when expanding. This ignores any blank lines found in the template file when expanding. When the template is html, this reduces the size of the output text without requiring a sacrifice of readability for the input file.
google::STRIP_WHITESPACE: remove not only blank lines when expanding, but also whitespace at the beginning and end of each line. It also removes any linefeed (possibly following whitespace) that follows a closing '}}' of any kind of template marker except a template variable. (This means a linefeed may be removed anywhere by simply placing a comment marker as the last element on the line.) When the template is html, this reduces the size of the output html without changing the way it renders (except in a few special cases). When using this flag, the built-in template variables BI_NEWLINE and BI_SPACE can be useful to force a space or newline in a particular situation.
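To make the difference between the three strip values concrete, here is a line-level model. It is a partial sketch with hypothetical names (ToyStrip, Trim, StripTemplateText): it treats whitespace-only lines as blank, always ends kept lines with a newline, and deliberately leaves out the STRIP_WHITESPACE rule that removes the linefeed after a non-variable marker.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-ins for the three strip values.
enum ToyStrip { kDoNotStrip, kStripBlankLines, kStripWhitespace };

// Trims spaces and tabs from both ends of a line.
std::string Trim(const std::string& line) {
  const std::string::size_type first = line.find_first_not_of(" \t");
  if (first == std::string::npos) return "";
  const std::string::size_type last = line.find_last_not_of(" \t");
  return line.substr(first, last - first + 1);
}

// Line-level model of the strip modes. Assumption: a line containing only
// spaces or tabs counts as blank. Marker-related linefeed removal is omitted.
std::string StripTemplateText(const std::string& text, ToyStrip strip) {
  if (strip == kDoNotStrip) return text;
  std::istringstream in(text);
  std::string line;
  std::string out;
  while (std::getline(in, line)) {
    const std::string trimmed = Trim(line);
    if (trimmed.empty()) continue;  // blank line: dropped in both modes
    out += (strip == kStripWhitespace) ? trimmed : line;
    out += '\n';
  }
  return out;
}
```

Running the same indented HTML fragment through each mode shows why the stricter modes shrink the output while the template file itself can stay readable.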
This factory method returns NULL if the template cannot be found, or if there is a syntax error trying to load it.
Besides loading templates, the application can also ask the template system to reload a template, via template->ReloadIfChanged(). (You can also reload all templates at once via google::Template::ReloadAllIfChanged().) ReloadIfChanged() looks on disk, and if it notices the template file has changed since the last load, it will reload the template from disk, replacing the old contents. Actually, the reload is done lazily: ReloadIfChanged just sets a bit that causes the template to be reloaded next time GetTemplate is called.
The class google::TemplateDictionary is used for all template dictionary operations. new google::TemplateDictionary(name) is used to create a new top-level dictionary. dict->AddSectionDictionary(name) and dict->AddIncludeDictionary(name) are used to create sub-dictionaries for sections or include-files. After creating a dictionary, the application should call one or more functions for each marker in the template. As an example, consider the following template:
<html><body> {{! This page has no head section.}}
{{#CHANGE_USER}}
<A HREF="/login">Click here</A> if you are not {{USERNAME}}<br>
{{/CHANGE_USER}}
Last five searches:<ol>
{{#PREV_SEARCHES}}
<li> {{PREV_SEARCH}}
{{/PREV_SEARCHES}}
</ol>
{{>RESULT_TEMPLATE}}
{{FOOTER}}
</body></html>
To instantiate the template, the user should call a function to set up FOOTER, and a function to say what to do for the sections CHANGE_USER and PREV_SEARCHES, and for the include-template RESULT_TEMPLATE. Quite likely, the application will also want to create a sub-dictionary for CHANGE_USER, and in that sub-dictionary call a function to set up USERNAME. There will also be sub-dictionaries for PREV_SEARCHES, each of which will need to set PREV_SEARCH. Only when this is all set up will the application be able to apply the dictionary to the template to get output.
The appropriate function to call for a given template marker depends on its type.
For variables, the only interesting action is to set the variable's value. For most variables, the right method to call is dict->SetValue(name, value). (The name and value can be specified as strings in a variety of ways: C++ strings, char *'s, or char *'s plus length.)
There are two other ways to set a variable's value as well, each with a different scoping rule. You can call google::TemplateDictionary::SetGlobalValue(name, value) -- no TemplateDictionary instance needed here -- to set a variable that can be used by all templates in an application. This is quite rare.
You can also call dict->SetTemplateGlobalValue(name, value). This sets a variable that is seen by all child dictionaries of this dictionary: sub-sections you create via AddSectionDictionary, and included templates you create via AddIncludeDictionary (both described below). This differs from SetValue(), because SetValue() values are never inherited across template-includes. Almost always, SetValue is what you want; SetTemplateGlobalValue is intended for variables that are "global" to a particular template but not all templates, such as a color scheme to use, a language code, etc.
To make it easier to use SetValue(), there are a few helper routines for setting values of a few special forms.
SetIntValue(name, int): takes an int as the value.
SetEscapedValue(name, value, escape_functor): escapes the value, using the escape-functor, which takes a string as input and gives a "munged" string as output. TemplateDictionary has a few escape-functors built in, including html_escape, which replaces <, >, &, and " with the appropriate html entity; xml_escape, which deals with the &nbsp; entity; and javascript_escape, which escapes quotes and other characters that are meaningful to javascript. These are helpful in avoiding security holes when the template is html/xml/javascript. You can also define your own functor; see the example below.
SetFormattedValue(name, fmt, ...): the fmt and ... work just like in printf: SetFormattedValue("HOMEPAGE", "http://%s/", hostname).
SetEscapedFormattedValue(name, escape_functor, fmt, ...): formats the value just like printf, and then escapes the result using the given functor.
Example:
google::TemplateDictionary* dict = new google::TemplateDictionary("var example");
dict->SetValue("FOOTER", "Aren't these great results?");
class StarEscape : public template_modifiers::TemplateModifier {
  void Modify(const char* in, size_t inlen,
              const template_modifiers::ModifierData* per_expand_data,
              ExpandEmitter* outbuf, const string& arg) const {
    outbuf->Emit(string("*") + string(in, inlen) + string("*"));
  }
};
dict->SetEscapedValue("USERNAME", username, StarEscape());
Note that the template itself can also specify escaping via variable modifiers! It's very possible for you to escape the value when setting it in the dictionary, and then have the template escape it again when outputting, so be careful you escape only as much as you need to.
Sections are used in two ways in templates. One is to expand some text multiple times. This is how PREV_SEARCHES is used in the example above. In this case we'll have one small sub-dictionary for each of the five previous searches the user did. To do this, call AddSectionDictionary(section_name) to create the sub-dictionary. It returns a TemplateDictionary* that you can use to fill the sub-dictionary.
The other use of sections is to conditionally show or hide a block of text at template-expand time. This is how CHANGE_USER is used in the example template: if the user is logged in, we show the section with the user's username, otherwise we choose not to show the section.
This second case is a special case of the first, and the "standard" way to show a section is to expand it exactly one time, by calling AddSectionDictionary() once, and then setting USERNAME in the sub-dictionary.
However, the hide/show idiom is so common there are a few convenience methods to make it simpler. The first takes advantage of the fact sections inherit variables from their parent: you set USERNAME in the parent dictionary, rather than a section sub-dictionary, and then call ShowSection(), which adds a single, empty dictionary for that section. This causes the section to be shown once, and to inherit all its variable values from its parent.
A second convenience method is written for the particular case we have with USERNAME: if the user's username is non-empty, we wish to show the section with USERNAME set to the username, otherwise we wish to hide the section and show neither USERNAME nor the text around it. The method SetValueAndShowSection(name, value, section_name) does exactly that: if value is non-empty, add a single dictionary to section_name and call section_dict->SetValue(name, value). There's also SetEscapedValueAndShowSection(name, value, escape_functor, section_name), which lets you escape value.
Example:
using google::TemplateDictionary;
TemplateDictionary* dict = new TemplateDictionary("section example");

const char* username = GetUsername();   // returns "" for no user
if (username[0] != '\0') {
  TemplateDictionary* sub_dict = dict->AddSectionDictionary("CHANGE_USER");
  sub_dict->SetValue("USERNAME", username);
} else {
  // don't need to do anything; we want a hidden section, which is the default
}

// Instead of the above 'if' statement, we could have done this:
if (username[0] != '\0') {
  dict->ShowSection("CHANGE_USER");       // adds a single, empty dictionary
  dict->SetValue("USERNAME", username);   // take advantage of inheritance
} else {
  // don't need to do anything; we want a hidden section, which is the default
}

// Or we could have done this:
dict->SetValueAndShowSection("USERNAME", username, "CHANGE_USER");

// Moving on...
GetPrevSearches(prev_searches, &num_prev_searches);
if (num_prev_searches > 0) {
  for (int i = 0; i < num_prev_searches; ++i) {
    TemplateDictionary* sub_dict = dict->AddSectionDictionary("PREV_SEARCHES");
    sub_dict->SetEscapedValue("PREV_SEARCH", prev_searches[i],
                              TemplateDictionary::html_escape);
  }
}
Template-include markers are much like section markers, so AddIncludeDictionary(name) acts, not surprisingly, exactly like AddSectionDictionary(name). However, since variable inheritance doesn't work across include boundaries, there is no template-include equivalent to ShowSection() or SetValueAndShowSection().
One difference between template-includes and sections is that for a sub-dictionary that you create via AddIncludeDictionary(), you must call subdict->SetFilename() to indicate the name of the template to include. If you do not set this, the sub-dictionary will be ignored. The filename may be absolute, or relative, in which case it's relative to template_root.
Example:
using google::TemplateDictionary;
TemplateDictionary* dict = new TemplateDictionary("include example");
GetResults(results, &num_results);
for (int i = 0; i < num_results; ++i) {
  TemplateDictionary* sub_dict = dict->AddIncludeDictionary("RESULT_TEMPLATE");
  sub_dict->SetFilename("results.tpl");
  FillResultsTemplate(sub_dict, results[i]);
}
In practice, it's much more likely that FillResultsTemplate() will be the one to call SetFilename(). Note that it's not an error to call SetFilename() on a dictionary even if the dictionary is not being used for a template-include; in that case, the function is a no-op, but is perhaps still useful as self-documenting code.
This last category is a bit esoteric: if you write your own modifier, you can pass data to that modifier when you call Expand(). The intended use of this functionality is to allow a modifier to work one way when you expand a template with dictionary A, and another way when you expand a template with dictionary B. For instance, you might have a modifier that encrypts part of a webpage using a user's secret-key, and the secret-key is of course different every time you expand the webpage.
To set data that the modifier can use, you call SetModifierData() on a TemplateDictionary. For instance:
TemplateDictionary* dict = new TemplateDictionary("modifier example");
dict->SetValue("USERNAME", thisuser->username);   // regular var-setting
dict->SetModifierData("encrypt key", (void*)thisuser->secret_key);
...
Your custom modifier is passed all the ModifierData as one of the arguments to Modify: see <google/template_modifiers.h> for more details.
Once you have a template and a template dictionary, it's simplicity itself to expand the template with those dictionary values, putting the output in a string:
google::Template* tpl = google::Template::GetTemplate(<filename>, google::STRIP_WHITESPACE);
google::TemplateDictionary dict("debug-name");
FillDictionary(&dict, ...);
string output;
bool error_free = tpl->Expand(&output, &dict);
// output now holds the expanded template
// Expand returns false if the system cannot load any of the template files
// referenced by the TemplateDictionary.
The expanded template is written to the string output. If output was not empty before calling Expand(), the expanded template is appended to the end of output.
The TemplateFromString class, in template_from_string.h, is an alternative to the Template class when you really want your template to be built in to the executable rather than read from a file. It's a drop-in replacement, that takes an extra argument which is the template contents.
Prefer Template to TemplateFromString, for several reasons. For one, updating the template requires merely a data push, rather than pushing the new executable. Also, you can load the new template without needing to restart the binary. It also makes it easier for non-programmers to modify the template. Finally, string-templates cannot be included by other templates, since {{>include}} takes a filename.
One reason to use TemplateFromString is if you are in an environment where having data files could be dangerous -- for instance, you work on a disk that is usually full, or need the template to work even in the face of disk I/O errors.
This package comes with a script, template-converter, that takes a template file as input and emits a C++ code snippet (an .h file) that defines a string with those template contents. This makes it easy to start by using a normal, file-based template, and then switch to template-from-string later if you so desire.
You can use the MakeCopy() method on a template dictionary to make a "deep" copy of the dictionary. This can be useful for situations like the following: you want to fill a template several times, each time with 90% of the values the same, but the last 10% different. Computing the values is slow. With MakeCopy(), you fill dict once with the shared values, then make a copy for each expansion and set the remaining values on each copy:
newdict1 = dict->MakeCopy();
newdict2 = dict->MakeCopy();
Like all web applications, programs that use the Google Template System to create HTML documents can be vulnerable to Cross-Site-Scripting (XSS) attacks unless data inserted into a template is appropriately sanitized and/or escaped. Which specific form of escaping or sanitization is required depends on the context in which the template variable appears within an HTML document (such as regular "inner text", within a <script> tag, or within an onClick handler). The remainder of this section provides a brief summary of techniques to prevent XSS vulnerabilities due to template variables in various HTML contexts. Note that while escaping is typically required, escaping alone is often not enough! You also may need to sanitize or validate the input, as for instance with URL attributes. For further information, refer to additional resources on Cross-Site-Scripting issues.
For regular text outside of tags, use the :html_escape or :h modifier to HTML-escape the variable:
<h1>{{HEADING:h}}</h1>
For tag attributes, ensure that the attribute is enclosed in double quotes in the template, and use the :html_escape or :h modifier to escape the variable:
<form ... <input name=q value="{{QUERY:h}}"> </form>
For URL attributes, validate that the URL is a well-formed URL with an appropriate scheme (e.g., http(s), ftp, mailto). Then enclose the URL in quotes in the template and use the :html_escape or :h modifier to escape the variable:
<img src="{{IMAGE_URL:h}}">
Be careful when inserting variables into a style tag or attribute. Certain CSS style-sheet constructs can result in the invocation of javascript. To prevent XSS, the variable must be carefully validated and sanitized.
For string literals in javascript: ensure that the literal is enclosed in quotes and apply the :javascript_escape or :j modifier to escape the variable:
<script>
// ...
var msg_text = '{{MESSAGE:j}}';
// ...
</script>
Literals of non-string types cannot be quoted and escaped. Instead, ensure that the variable's value is set such that it is guaranteed that the resulting string corresponds to a javascript literal of the expected type. For example, use
dict->SetIntValue("NUM_ITEMS", num_items);
to populate an integer javascript variable in the template fragment
<script>
// ...
var num_items = {{NUM_ITEMS}};
// ...
</script>
Tag attributes whose values are evaluated as a javascript expression (such as on{Click,Load,etc} handlers, for example onClick) require an additional consideration, since the attribute's value is HTML-unescaped by the browser before it is passed to the javascript interpreter. To avoid XSS vulnerabilities, it is generally necessary to HTML-escape after javascript-escaping:
<button ... onclick='GotoUrl("{{TARGET_URL:j:h}}");'>
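The following sketch models why HTML-escaping alone is not enough in this context. All four helpers are hypothetical, heavily simplified stand-ins (they handle only quotes, backslashes, and ampersands), and BrowserDecodeAttribute is merely a toy model of the browser's HTML-unescaping step.

```cpp
#include <string>

// Replaces every occurrence of the non-empty string `from` in `s` with `to`.
std::string ReplaceAll(std::string s, const std::string& from,
                       const std::string& to) {
  for (std::string::size_type pos = 0;
       (pos = s.find(from, pos)) != std::string::npos; pos += to.size()) {
    s.replace(pos, from.size(), to);
  }
  return s;
}

// Simplified javascript-escaping: backslashes and quotes only.
std::string JsEscapeQuotes(std::string s) {
  s = ReplaceAll(s, "\\", "\\\\");
  s = ReplaceAll(s, "\"", "\\x22");
  s = ReplaceAll(s, "'", "\\x27");
  return s;
}

// Simplified HTML-escaping: ampersands and quotes only.
std::string HtmlEscapeQuotes(std::string s) {
  s = ReplaceAll(s, "&", "&amp;");
  s = ReplaceAll(s, "\"", "&quot;");
  s = ReplaceAll(s, "'", "&#39;");
  return s;
}

// Toy model of the browser HTML-unescaping an attribute value before the
// value is handed to the javascript interpreter.
std::string BrowserDecodeAttribute(std::string s) {
  s = ReplaceAll(s, "&quot;", "\"");
  s = ReplaceAll(s, "&#39;", "'");
  s = ReplaceAll(s, "&amp;", "&");
  return s;
}
```

For a value such as ");alert(1);// the HTML-escaped form decodes back to the original text, so the raw double quote reaches the javascript interpreter and ends the string literal. When the value is javascript-escaped first and then HTML-escaped, the decoded attribute contains \x22 instead of a quote, and the value stays inside the literal.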
There are a number of scenarios in which XSS can arise that are unrelated to the insertion of values into HTML templates, including cases that involve HTTP response headers and related settings (Location, charset, Content-Type, Content-Disposition: attachment). Please consult additional documentation on Cross-Site-Scripting for more detailed discussion of such issues.
Both dictionary keys and template filenames are strings. Instead of using raw strings, we encourage you to use a bit of machinery to help protect against various types of errors.
For dictionary keys, you can use the make_tpl_varnames_h tool to create static string variables to use instead of a string constant. This will protect against typos, as the make_tpl_varnames_h documentation describes.
For template filenames that a program uses -- including sub-templates -- we suggest the following idiom:
#include "example.tpl.varnames.h"   // defines 1 string per dictionary key
RegisterTemplateFilename(EXAMPLE_FN, "example.tpl");   // defines template
...
google::Template* tpl = google::Template::GetTemplate(EXAMPLE_FN, ...);
...
include_dict->SetFilename(EXAMPLE_FN);
By registering the filename, you can query the template system to detect syntax errors, reload-status, and so forth.
The following functions affect the global state of the template system.
google::Template::SetTemplateRootDirectory(root): when GetTemplate() is called with a relative filename, the template system will try to load the template from root/file. This defaults to ./.
There are some administrative tools that can help with tweaking template performance and debugging template problems. The following functions work on registered templates.
google::TemplateNameList::GetMissingList(): returns a list of all registered templates where the file could not be found on disk.
google::TemplateNameList::AllDoExist(): true iff the missing-list is empty.
google::TemplateNameList::GetBadSyntaxList(): returns a list of all registered templates where the template contains a syntax error, and thus cannot be used.
google::TemplateNameList::IsAllSyntaxOkay(): true iff the bad-syntax list is empty.
google::TemplateNameList::GetLastmodTime(): the latest last-modified time for any registered template-file.
The following functions help with debugging, by allowing you to examine the template dictionaries and expanded templates in more detail.
dict->Dump(): dumps the contents of the dictionary (and any sub-dictionaries) to stderr.
dict->DumpToString(): dumps the contents of the dictionary (and sub-dictionaries) to the given string.
dict->SetAnnotateOutput(): when applying this dictionary to a template, add marker-strings to the output to indicate what template-substitutions the system was making. This takes a string argument which can be used to shorten the filenames printed in the annotations: if the filename contains the string you give, everything before that string is elided from the filename before printing. It's confusing, but fear not: it's safe to just always pass in the empty string.
Finally, ClearCache() removes all template objects from the cache used by GetTemplate(). Typically, this is only used in environments that check for memory leaks: calling this at the end of the program will clean up all memory that the template system uses.
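The filename-shortening rule for SetAnnotateOutput() is easier to see in code. The sketch below is a stand-alone model of the rule as described above; the function name ShortenForAnnotation is hypothetical and the code is not taken from the template system.

```cpp
#include <string>

// Models the shortening rule: if `filename` contains `marker`, everything
// before the first occurrence of `marker` is dropped. Otherwise the filename
// is returned unchanged.
std::string ShortenForAnnotation(const std::string& filename,
                                 const std::string& marker) {
  const std::string::size_type pos = filename.find(marker);
  if (pos == std::string::npos) return filename;
  return filename.substr(pos);
}
```

In this model an empty marker is found at position 0, so the full filename is kept, which is consistent with the advice that passing the empty string is always safe.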
All static methods on Template and TemplateDictionary objects are threadsafe: you can safely call google::TemplateDictionary::SetGlobalValue() without needing to worry about locking.
Non-static methods are not thread-safe. It is not safe for two threads to assign values to the same template-dictionary without doing their own locking. Note that this is expected to be quite rare: usually only one thread will care about a given template-dictionary.
For Template objects, the most common idiom is that a template is loaded via GetTemplate(), and after that only const methods like Expand() are called on the template. With such usage, it's safe to use the same Template object in multiple threads without locking. Be careful, however, if you also call functions like ReloadIfChanged().
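If two threads really do need to fill the same dictionary, "doing their own locking" might look like the sketch below. LockedDictionary is a hypothetical stand-in that wraps a plain std::map so the example compiles without the template library. In real code the mutex would guard calls on the shared template-dictionary instead.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical wrapper: every access to the shared values goes through a
// mutex, so concurrent callers cannot interleave their updates.
class LockedDictionary {
 public:
  void SetValue(const std::string& key, const std::string& value) {
    std::lock_guard<std::mutex> lock(mu_);
    values_[key] = value;
  }

  // Returns the stored value, or an empty string if the key is not set.
  std::string GetValue(const std::string& key) const {
    std::lock_guard<std::mutex> lock(mu_);
    const auto it = values_.find(key);
    return it == values_.end() ? std::string() : it->second;
  }

 private:
  mutable std::mutex mu_;  // mutable so that const readers can lock it
  std::map<std::string, std::string> values_;
};
```

The simpler design, where each thread builds and expands its own dictionary, avoids the lock entirely and matches the common case described above.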
This package includes several tools to make it easier to write and use templates.
make_tpl_varnames_h is a "lint" style syntax checker and header file generator. It takes the names of template files as command line arguments and loads each file into a Template object by retrieving the file via the Template factory method. The loading of the file does pure syntax checking and reports such errors as mis-matched section start/end markers, mis-matched open/close double-curly braces, such as "{{VAR}", or invalid characters in template variables/names/comments.
If the template passes the syntax check, by default the utility then creates a header file for use in the executable code that fills the dictionary for the template. If the developer includes this header file, then constants in the header file may be referenced in the dictionary building function, rather than hard-coding strings as variable and section names. By using these constants, the compiler can notify the developer of spelling errors and mismatched names. Here's an example of how this is used, and how it helps prevent errors:
const char * const kosr_RESULT_NUMBER = "RESULT_NUMBER";  // script output
dict.SetValue("RESSULT_NUMBER", "4");      // typo is silently missed
dict.SetValue(kosr_RESSULT_NUMBER, "4");   // compiler catches typo
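Each constant is named with a leading k, a set of prefix letters derived from the template file name, an underscore, and then the variable or section name. For example, the prefix letters for the file one_search_result_post20020815.tpl are osr. The section name "RESULT_NUMBER" in the file one_search_result_post20020815.tpl would therefore be given the constant name kosr_RESULT_NUMBER and would appear in the header file as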
const char * const kosr_RESULT_NUMBER = "RESULT_NUMBER";
-- as in the example above.
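The naming scheme can be sketched in code. The rule used here is an assumption inferred from the single example above: the prefix takes the first letter of the file name and the first letter after each underscore, and skips a component that starts with "post". The function name MakeConstantName is hypothetical, and the real tool may differ in details.

```cpp
#include <string>

// Builds a constant name of the form k<prefix>_<name> from a template file
// name, using the inferred (assumed) prefix rule described in the lead-in.
std::string MakeConstantName(const std::string& template_file,
                             const std::string& var_name) {
  std::string prefix;
  bool at_component_start = true;
  for (std::string::size_type i = 0; i < template_file.size(); ++i) {
    const char c = template_file[i];
    if (c == '.') break;  // stop at the file extension
    if (c == '_') {
      at_component_start = true;
      continue;
    }
    if (at_component_start) {
      // Skip a version component such as "post20020815".
      if (template_file.compare(i, 4, "post") != 0) prefix += c;
      at_component_start = false;
    }
  }
  return "k" + prefix + "_" + var_name;
}
```

For the file one_search_result_post20020815.tpl and the name RESULT_NUMBER, this sketch produces kosr_RESULT_NUMBER, matching the constant shown above.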
By default, the header file is produced in the current directory. An alternate output directory may be specified by the command line flag --header_dir.
The name of the generated header file is the same as the name of the template file with an extension added to the name. By default, that extension is .varnames.h. In the above example, the header file containing the constant declarations would be named one_search_result_post20020815.tpl.varnames.h. An alternate extension may be provided via the command line flag --outputfile_suffix.
Important command line flags:
--noheader -- Indicates that a header file should not be generated; only syntax checking should be done.
--header_dir -- sets the directory where the header is written. Default: "./"
--template_dir -- sets the template root directory. Default: ./ which is the correct specification when it is run from the directory where the templates are located. This is only used if the input template filenames are specified as relative paths rather than absolute paths.
--outputfile_suffix -- the extension added to the name of the template file to create the name of the generated header file. Default: .varnames.h.
For a full list of command line flags, run make_tpl_varnames_h --help.
The TemplateFromString class lets you load a template from a string instead of a file. Applications may prefer this option to reduce the dependencies of the executable, or use it in environments where data files are not practical. In such cases, template-converter can be used as a template "compiler", letting the developer write a template file as a data file in the normal way, and then "compiling" it to a C++ string to be included in the executable.
Usage is template-converter <template filename>.
C++ code is output to stdout; it can be stored in a .h file or included directly into a C++ file. Perl must be installed to use this script.
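To illustrate the idea of "compiling" a template into a C++ string, the sketch below turns template text into a quoted string literal. It is only a model of the concept: the function name ToCppStringLiteral is hypothetical, and the output of the actual template-converter script may be formatted differently.

```cpp
#include <string>

// Converts raw template text into the text of a C++ string literal by
// quoting it and escaping backslashes, double quotes, and common whitespace
// control characters.
std::string ToCppStringLiteral(const std::string& text) {
  std::string out = "\"";
  for (char c : text) {
    switch (c) {
      case '\\': out += "\\\\"; break;
      case '"':  out += "\\\""; break;
      case '\n': out += "\\n";  break;
      case '\r': out += "\\r";  break;
      case '\t': out += "\\t";  break;
      default:   out += c;      break;
    }
  }
  out += "\"";
  return out;
}
```

The resulting literal can be pasted into a source file or written to a header, so the template contents travel with the executable instead of being read from disk at run time.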