Description
This module is for working with HTML/XML. It deals with both well-formed XML and
malformed HTML from the web. It features:
- A lazy parser, based on the HTML 5 specification - see parseTags.
- A renderer that can write out HTML/XML - see renderTags.
- Utilities for extracting information from a document - see ~==, sections and partitions.
The standard practice is to parse a String to [Tag String] using parseTags,
then operate upon it to extract the necessary information.
Data structures and parsing
A Tag represents a single HTML element. A whole document is represented by a list of Tag.
There is no requirement for TagOpen and TagClose to match.
Constructors:
- TagOpen str [Attribute str]: An open tag with Attributes in their original order
- TagClose str: A closing tag
- TagText str: A text node, guaranteed not to be the empty string
- TagComment str: A comment
- TagWarning str: Meta: A syntax error in the input file
- TagPosition !Row !Column: Meta: The position of a parsed element
Row: The row/line of a position, starting at 1
Column: The column of a position, starting at 1
type Attribute str = (str, str)
An HTML attribute id="name" generates ("id","name")
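As a rough illustration of the representation described above, the sketch below declares a local stand-in for the Tag type (same constructor names as listed, but defined here rather than imported, with Row and Column assumed to be Int) and writes a small fragment by hand. The names exampleDoc and attributesOf are hypothetical and exist only for this example.

```haskell
-- Local stand-in mirroring the documented constructors;
-- this is not the library's own declaration.
type Row = Int       -- assumption: positions are plain Ints
type Column = Int
type Attribute str = (str, str)

data Tag str
  = TagOpen str [Attribute str]
  | TagClose str
  | TagText str
  | TagComment str
  | TagWarning str
  | TagPosition !Row !Column
  deriving (Eq, Show)

-- Hand-written tags for: <p id="name">Hello<br>world</p>
-- The <br> has no matching TagClose, which the representation permits.
exampleDoc :: [Tag String]
exampleDoc =
  [ TagOpen "p" [("id", "name")]
  , TagText "Hello"
  , TagOpen "br" []
  , TagText "world"
  , TagClose "p"
  ]

-- Hypothetical helper: collect every attribute pair in document order.
attributesOf :: [Tag str] -> [Attribute str]
attributesOf tags = concat [attrs | TagOpen _ attrs <- tags]
```

Here attributesOf exampleDoc evaluates to [("id","name")], matching the attribute encoding described above.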
module Text.HTML.TagSoup.Parser
module Text.HTML.TagSoup.Render
Turns all tag names and attributes to lower case and
converts DOCTYPE to upper case.
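One possible reading of this normalisation is sketched below. It rests on assumptions not stated in the text: that "attributes" means attribute names only (values are left alone), and that a DOCTYPE shows up as a TagOpen whose name begins with '!'. The Tag type is a reduced local stand-in and canonicalize is a hypothetical name, not the library's function.

```haskell
import Data.Char (toLower, toUpper)

-- Reduced local stand-in for the Tag type (not the library's declaration).
data Tag str
  = TagOpen str [(str, str)]
  | TagClose str
  | TagText str
  deriving (Eq, Show)

-- Hypothetical sketch of the normalisation described above.
canonicalize :: [Tag String] -> [Tag String]
canonicalize = map step
  where
    -- Assumption: declarations such as DOCTYPE are open tags starting with '!'.
    step (TagOpen ('!' : name) attrs) = TagOpen ('!' : map toUpper name) attrs
    -- Lower-case the tag name and attribute names; values are untouched.
    step (TagOpen name attrs) =
      TagOpen (map toLower name) [(map toLower k, v) | (k, v) <- attrs]
    step (TagClose name) = TagClose (map toLower name)
    step other = other
```

Under these assumptions, TagOpen "A" [("HREF","X.html")] becomes TagOpen "a" [("href","X.html")], and TagOpen "!doctype" [] becomes TagOpen "!DOCTYPE" [].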
Tag identification
- Test if a Tag is a TagOpen
- Test if a Tag is a TagClose
- Test if a Tag is a TagText
- Test if a Tag is a TagWarning
- Test if a Tag is a TagPosition
- Returns True if the Tag is TagOpen and matches the given name
- Returns True if the Tag is TagClose and matches the given name
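The predicates described in this section could be written roughly as follows. This is a minimal sketch over a reduced local stand-in for Tag; the function names are local definitions chosen for the example, and the library's own definitions may differ.

```haskell
-- Reduced local stand-in for the Tag type (not the library's declaration).
data Tag str
  = TagOpen str [(str, str)]
  | TagClose str
  | TagText str
  deriving (Eq, Show)

-- True only for the TagOpen constructor.
isTagOpen :: Tag str -> Bool
isTagOpen (TagOpen _ _) = True
isTagOpen _ = False

-- True only for the TagClose constructor.
isTagClose :: Tag str -> Bool
isTagClose (TagClose _) = True
isTagClose _ = False

-- True for a TagOpen whose name equals the given one.
isTagOpenName :: Eq str => str -> Tag str -> Bool
isTagOpenName name (TagOpen n _) = n == name
isTagOpenName _ _ = False

-- True for a TagClose whose name equals the given one.
isTagCloseName :: Eq str => str -> Tag str -> Bool
isTagCloseName name (TagClose n) = n == name
isTagCloseName _ _ = False
```

A typical use would be something like filter (isTagOpenName "a") tags to keep only the opening anchor tags of a document.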
Extraction
- Extract the string from within a TagText; crashes if the Tag is not a TagText.
- Extract an attribute; crashes if the Tag is not a TagOpen. Returns "" if the attribute is not present.
- Extract the string from within a TagText, otherwise Nothing.
- Extract the string from within a TagWarning, otherwise Nothing.
- Extract all text content from tags (similar to Verbatim found in HaXml).
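To make the difference between the crashing and the Maybe-returning extractors concrete, here is a minimal sketch specialised to String. The Tag type is a reduced local stand-in, and the three functions are local re-implementations written from the descriptions above, so details (error messages in particular) are assumptions.

```haskell
import Data.Maybe (fromMaybe, mapMaybe)

-- Reduced local stand-in for the Tag type (not the library's declaration).
data Tag str
  = TagOpen str [(str, str)]
  | TagClose str
  | TagText str
  deriving (Eq, Show)

-- Total variant: Nothing for anything that is not a TagText.
maybeTagText :: Tag str -> Maybe str
maybeTagText (TagText s) = Just s
maybeTagText _ = Nothing

-- Partial variant: "" when the attribute is absent, error when not a TagOpen.
fromAttrib :: String -> Tag String -> String
fromAttrib key (TagOpen _ attrs) = fromMaybe "" (lookup key attrs)
fromAttrib _ _ = error "fromAttrib: not a TagOpen"

-- Concatenate the contents of every text node, ignoring all other tags.
innerText :: [Tag String] -> String
innerText = concat . mapMaybe maybeTagText
```

With these definitions, fromAttrib "id" (TagOpen "a" [("href","x")]) gives "", while innerText over the tags for <b>Hello</b> world gives "Hello world".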
Utility
sections takes a list and returns all suffixes whose
first item matches the predicate.
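The behaviour described for sections can be sketched with Data.List.tails. This is a local definition written from the description, not a copy of the library source.

```haskell
import Data.List (tails)

-- Every suffix of the input whose first element satisfies the predicate.
-- The pattern s@(x : _) skips the empty suffix produced by tails.
sections :: (a -> Bool) -> [a] -> [[a]]
sections p xs = [s | s@(x : _) <- tails xs, p x]

-- sections even [1, 2, 3, 4, 5] == [[2, 3, 4, 5], [4, 5]]
```

Note that the suffixes overlap: 4 and 5 appear in both results.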
partitions is similar to sections, but splits the list
so that no element appears in more than one partition.
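A non-overlapping split of this kind can be sketched with Data.List.groupBy: drop everything before the first match, then start a new group at each matching element. This is a local definition written from the description, not a copy of the library source.

```haskell
import Data.List (groupBy)

-- Split the list into chunks that each begin with a matching element.
-- Elements before the first match are discarded.
partitions :: (a -> Bool) -> [a] -> [[a]]
partitions p = groupBy (\_ y -> not (p y)) . dropWhile (not . p)

-- partitions even [1, 2, 3, 4, 5] == [[2, 3], [4, 5]]
```

Each element of the input now lands in at most one chunk, unlike the overlapping suffixes returned by sections.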
Combinators
A class that allows Strings or Tag str values to be used as matches.
Performs an inexact match; the first item should be the thing to match.
If the second item is a blank string, it is considered to match anything.
For example:
(TagText "test" ~== TagText "" ) == True
(TagText "test" ~== TagText "test") == True
(TagText "test" ~== TagText "soup") == False
For TagOpen, missing attributes on the right are allowed.
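The matching rules above can be approximated by the sketch below. It is a simplification under stated assumptions: it works only on Tag String (no String patterns via the class), treats a blank tag name or text on the right as a wildcard, and requires each right-hand attribute to appear verbatim on the left. The name matches is hypothetical, and the Tag type is a reduced local stand-in.

```haskell
-- Reduced local stand-in for the Tag type (not the library's declaration).
data Tag str
  = TagOpen str [(str, str)]
  | TagClose str
  | TagText str
  deriving (Eq, Show)

-- Hypothetical inexact match: the left argument is the tag under test,
-- the right argument is the pattern.
matches :: Tag String -> Tag String -> Bool
matches (TagText y) (TagText x) = null x || x == y
matches (TagClose y) (TagClose x) = null x || x == y
matches (TagOpen y ys) (TagOpen x xs) =
  -- Attributes missing from the pattern are simply not checked.
  (null x || x == y) && all (`elem` ys) xs
matches _ _ = False
```

For instance, TagOpen "a" [("href","x"),("id","y")] matches the pattern TagOpen "a" [("href","x")], but not the other way round.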
Negation of ~==
Produced by Haddock version 2.4.2