Package genshi :: Module core :: Class Stream

Class Stream



object --+
         |
        Stream

Represents a stream of markup events.

This class is essentially an iterable over the events: iterating over a Stream yields its events in order.

Stream events are tuples of the form:

(kind, data, position)

where kind is the event kind (such as START, END, or TEXT), data depends on the kind of event, and position is a (filename, line, offset) tuple giving the location of the original element or text in the input. If the original location is unknown, position is (None, -1, -1).
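
To make the tuple layout concrete, the following sketch builds a few events by hand for the markup <p>Hi</p>. It uses plain strings and tuples instead of Genshi's own types so that it runs without Genshi installed. The file name example.html, the line and offset values, and the (tag, attributes) shape shown for the START data are assumptions made for illustration.

```python
# Event kinds, mirroring the class variables of Stream (plain strings here).
START, END, TEXT = 'START', 'END', 'TEXT'

# Position used when the original location is unknown.
UNKNOWN = (None, -1, -1)

# Hand-written events for <p>Hi</p>; the positions are made up.
events = [
    (START, ('p', []), ('example.html', 1, 0)),  # data: (tag, attributes), assumed shape
    (TEXT, 'Hi', ('example.html', 1, 3)),        # data: the text itself
    (END, 'p', UNKNOWN),                         # location not known
]

# Every event unpacks into exactly three fields; position unpacks further.
for kind, data, (filename, line, offset) in events:
    print(kind, data, filename, line, offset)

# Collect the literal text carried by the TEXT events.
text = ''.join(data for kind, data, pos in events if kind == TEXT)
```

Because events are ordinary tuples, a consumer can unpack them directly in a for loop and branch on kind.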

The class also provides ways to serialize the stream to text. The serialize() method returns an iterator over the generated strings, while render() returns the complete generated text at once. Both accept parameters that control how the stream is serialized.



Instance Methods
 
__init__(self, events)
Initialize the stream with a sequence of markup events.
 
__iter__(self)
 
__or__(self, function)
Override the "bitwise or" operator to apply filters or serializers to the stream, providing a syntax similar to pipes on Unix shells.
 
filter(self, *filters)
Apply filters to the stream.
 
render(self, method='xml', encoding='utf-8', **kwargs)
Return a string representation of the stream.
 
select(self, path, namespaces=None, variables=None)
Return a new stream that contains the events matching the given XPath expression.
 
serialize(self, method='xml', **kwargs)
Generate strings corresponding to a specific serialization of the stream.
 
__str__(self)
str(x)
 
__unicode__(self)

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__

Class Variables
  START = 'START'
a start tag
  END = 'END'
an end tag
  TEXT = 'TEXT'
literal text
  DOCTYPE = 'DOCTYPE'
doctype declaration
  START_NS = 'START_NS'
start namespace mapping
  END_NS = 'END_NS'
end namespace mapping
  START_CDATA = 'START_CDATA'
start CDATA section
  END_CDATA = 'END_CDATA'
end CDATA section
  PI = 'PI'
processing instruction
  COMMENT = 'COMMENT'
comment
Properties
  events

Inherited from object: __class__

Method Details

__init__(self, events)
(Constructor)

 
Initialize the stream with a sequence of markup events.
Parameters:
  • events - a sequence or iterable providing the events
Overrides: object.__init__

__or__(self, function)
(Or operator)

 

Override the "bitwise or" operator to apply filters or serializers to the stream, providing a syntax similar to pipes on Unix shells.

Assume the following stream produced by the HTML function:

>>> from genshi.input import HTML
>>> html = HTML('''<p onclick="alert('Whoa')">Hello, world!</p>''')
>>> print(html)
<p onclick="alert('Whoa')">Hello, world!</p>

A filter such as the HTML sanitizer can be applied to that stream using the pipe notation as follows:

>>> from genshi.filters import HTMLSanitizer
>>> sanitizer = HTMLSanitizer()
>>> print(html | sanitizer)
<p>Hello, world!</p>

A filter can be any function that accepts and produces a stream (where a stream is anything that iterates over events):

>>> from genshi.core import TEXT
>>> def uppercase(stream):
...     for kind, data, pos in stream:
...         if kind is TEXT:
...             data = data.upper()
...         yield kind, data, pos
>>> print(html | sanitizer | uppercase)
<p>HELLO, WORLD!</p>

Serializers can also be used with this notation:

>>> from genshi.output import TextSerializer
>>> output = TextSerializer()
>>> print(html | sanitizer | uppercase | output)
HELLO, WORLD!

Serializers should generally be used at the end of the "pipeline", because they produce strings rather than events; using them somewhere in the middle may produce unexpected results.
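
The pipe notation can be pictured with a small stand-in class. This is a minimal sketch of one way such an operator could work, not Genshi's actual implementation: MiniStream and drop_comments are hypothetical names, and the events are plain tuples so that the example runs without Genshi installed.

```python
class MiniStream(object):
    """Hypothetical stand-in for Stream; not Genshi's implementation."""

    def __init__(self, events):
        self.events = events

    def __iter__(self):
        return iter(self.events)

    def __or__(self, function):
        # Call the filter with this stream and wrap whatever iterable it
        # returns, so that further "|" steps can be chained.
        return MiniStream(function(self))


def uppercase(stream):
    """Filter: upper-case the data of every TEXT event."""
    for kind, data, pos in stream:
        if kind == 'TEXT':
            data = data.upper()
        yield kind, data, pos


def drop_comments(stream):
    """Filter: remove COMMENT events."""
    return (event for event in stream if event[0] != 'COMMENT')


pos = (None, -1, -1)
stream = MiniStream([('TEXT', 'hello', pos), ('COMMENT', 'note', pos)])

# Filters run left to right, each consuming the output of the previous one.
result = list(stream | drop_comments | uppercase)
print(result)
```

Each step in this sketch is lazy: nothing is computed until the final stream is iterated, here by list().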

filter(self, *filters)

 

Apply filters to the stream.

This method returns a new stream with the given filters applied. Each filter must be a callable that accepts the stream object as its parameter and returns the filtered stream.

The call:

stream.filter(filter1, filter2)

is equivalent to:

stream | filter1 | filter2
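
The equivalence can be checked with a small model in which filter() simply folds the "|" operator over its arguments. This is a sketch under assumptions: MiniStream is a hypothetical stand-in for Stream, and strip_text and uppercase are example filters invented for the demonstration.

```python
from functools import reduce
import operator


class MiniStream(object):
    """Hypothetical stand-in for Stream, just enough to show filter()."""

    def __init__(self, events):
        self.events = events

    def __iter__(self):
        return iter(self.events)

    def __or__(self, function):
        return MiniStream(function(self))

    def filter(self, *filters):
        # Fold "|" over the filters, left to right:
        # ((self | filters[0]) | filters[1]) | ...
        return reduce(operator.or_, (self,) + filters)


def strip_text(stream):
    """Example filter: strip surrounding whitespace from TEXT data."""
    for kind, data, pos in stream:
        yield kind, (data.strip() if kind == 'TEXT' else data), pos


def uppercase(stream):
    """Example filter: upper-case TEXT data."""
    for kind, data, pos in stream:
        yield kind, (data.upper() if kind == 'TEXT' else data), pos


events = [('TEXT', '  hi  ', (None, -1, -1))]
via_filter = list(MiniStream(events).filter(strip_text, uppercase))
via_pipes = list(MiniStream(events) | strip_text | uppercase)
```

Called with no filters at all, the fold in this sketch returns the stream itself unchanged.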

render(self, method='xml', encoding='utf-8', **kwargs)

 

Return a string representation of the stream.

Any additional keyword arguments are passed to the serializer, and thus depend on the method parameter value.

Parameters:
  • method - determines how the stream is serialized; can be one of "xml", "xhtml", "html", "text", or a custom serializer class
  • encoding - how the output string should be encoded; if set to None, this method returns a unicode object

See Also: XMLSerializer.__init__, XHTMLSerializer.__init__, HTMLSerializer.__init__, TextSerializer.__init__
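
The effect of the encoding parameter can be sketched as joining the serialized chunks and then optionally encoding the result. The helper render_chunks below is hypothetical and only models the behaviour described above; it does not call Genshi.

```python
def render_chunks(chunks, encoding='utf-8'):
    """Hypothetical helper: join serialized chunks, then encode if asked."""
    text = u''.join(chunks)
    if encoding is None:
        return text               # unencoded text (a unicode object)
    return text.encode(encoding)  # encoded byte string


chunks = [u'<p>', u'caf\xe9', u'</p>']

as_text = render_chunks(chunks, encoding=None)
as_bytes = render_chunks(chunks)  # default encoding: utf-8
```

With encoding=None the non-ASCII character stays a single character; with the default it becomes the two-byte UTF-8 sequence.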

select(self, path, namespaces=None, variables=None)

 
Return a new stream that contains the events matching the given XPath expression.
Parameters:
  • path - a string containing the XPath expression
  • namespaces - mapping of namespace prefixes used in the path
  • variables - mapping of variable names to values
Returns:
the selected substream
Raises:
  • PathSyntaxError - if the given path expression is invalid or not supported
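
What "a new stream that contains the matching events" means can be illustrated with a heavily simplified stand-in for path matching. The function select_elements below is hypothetical and is not Genshi's XPath engine: it only matches elements by tag name at any depth, and it assumes START data of the form (tag, attributes).

```python
def select_elements(events, tag):
    """Yield the events of every element named `tag`, contents included.

    Toy stand-in for a path expression; hypothetical, not Genshi's engine.
    """
    depth = 0  # greater than zero while inside a matching element
    for kind, data, pos in events:
        if kind == 'START':
            name = data[0]
            if depth or name == tag:
                depth += 1
        if depth:
            yield kind, data, pos
        if kind == 'END' and depth:
            depth -= 1


pos = (None, -1, -1)
# Events for <ul><li>a</li><li>b</li></ul>
events = [
    ('START', ('ul', []), pos),
    ('START', ('li', []), pos), ('TEXT', 'a', pos), ('END', 'li', pos),
    ('START', ('li', []), pos), ('TEXT', 'b', pos), ('END', 'li', pos),
    ('END', 'ul', pos),
]

selected = list(select_elements(events, 'li'))
texts = [data for kind, data, _ in selected if kind == 'TEXT']
```

The result is itself a sequence of events, so it can be filtered or serialized like any other stream.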

serialize(self, method='xml', **kwargs)

 

Generate strings corresponding to a specific serialization of the stream.

Unlike the render() method, this method is a generator that yields the serialized output incrementally instead of returning a single string.

Any additional keyword arguments are passed to the serializer, and thus depend on the method parameter value.

Parameters:
  • method - determines how the stream is serialized; can be one of "xml", "xhtml", "html", "text", or a custom serializer class

See Also: XMLSerializer.__init__, XHTMLSerializer.__init__, HTMLSerializer.__init__, TextSerializer.__init__
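
The difference between incremental and all-at-once output can be sketched with a toy generator. The function serialize_events is hypothetical: it ignores attributes, performs no escaping, and assumes START data of the form (tag, attributes), so it only illustrates the generator pattern rather than any of Genshi's serializers.

```python
import io


def serialize_events(events):
    """Toy XML-like serializer (hypothetical): yields one string per event."""
    for kind, data, pos in events:
        if kind == 'START':
            tag, attrs = data  # attributes are ignored in this sketch
            yield u'<%s>' % tag
        elif kind == 'END':
            yield u'</%s>' % data
        elif kind == 'TEXT':
            yield data


pos = (None, -1, -1)
events = [('START', ('p', []), pos), ('TEXT', u'Hello', pos), ('END', 'p', pos)]

# Incremental use: each chunk is written as soon as it is produced.
out = io.StringIO()
for chunk in serialize_events(events):
    out.write(chunk)
incremental = out.getvalue()

# All at once: join every chunk into a single string.
at_once = u''.join(serialize_events(events))
```

Writing chunk by chunk avoids holding the whole document in memory, which matters for large streams or when output is sent over a network connection.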

__str__(self)
(Informal representation operator)

 
str(x)
Overrides: object.__str__
(inherited documentation)