6. detect

<detect>
  Content: and|dtdPublicId|dtdSystemId|fileNameExtension|mimeType|
           not|or|rootElementLocalName|rootElementNamespace|
           rootElementAttribute|schemaType
</detect>

<and>
  Content: [ and|dtdPublicId|dtdSystemId|fileNameExtension|mimeType|
             not|or|rootElementLocalName|rootElementNamespace|
             rootElementAttribute|schemaType ]+
</and>

<dtdPublicId
  substring = boolean : false
>
  Content: non empty token
</dtdPublicId>

<dtdSystemId>
  Content: anyURI
</dtdSystemId>

<fileNameExtension>
  Content: file name extension
</fileNameExtension>

<mimeType>
  Content: non empty token
</mimeType>

<not>
  Content: and|dtdPublicId|dtdSystemId|fileNameExtension|mimeType|
           not|or|rootElementLocalName|rootElementNamespace|
           rootElementAttribute|schemaType
</not>

<or>
  Content: [ and|dtdPublicId|dtdSystemId|fileNameExtension|mimeType|
             not|or|rootElementLocalName|rootElementNamespace|
             rootElementAttribute|schemaType ]+
</or>

<rootElementLocalName>
  Content: Name
</rootElementLocalName>

<rootElementNamespace>
  Content: anyURI
</rootElementNamespace>

<rootElementAttribute
  localName = Name
  namespace = anyURI
  value = string
  substring = boolean : false
/>

<schemaType>
  Content: 'dtd' | 'schema' | 'relaxng'
</schemaType>

Registers with XXE a condition that is used to detect the type of a document.

During its start-up, XXE loads all the configuration files it can find, because it needs to keep a list of all detect elements.

The position of a detect element in this list depends on the location of its configuration file: configurations loaded from the config subdirectory of the user preferences directory precede configurations loaded from the locations specified by environment variable XXE_ADDON_PATH, which in turn precede configurations loaded from the addon subdirectory of the XXE distribution directory.

When a document is opened, XXE tries each detect element in turn. If the condition expressed in the detect element evaluates to true, the detection phase stops and the configuration containing the detect element is associated with the newly opened document.

Child elements of detect:

and

Evaluates to true if all its children evaluate to true.
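
For instance, the following sketch matches only documents that satisfy both conditions at once. The root element name report and the extension rpt are hypothetical and chosen purely for illustration:

```xml
<and>
  <!-- True only when both children are true. -->
  <rootElementLocalName>report</rootElementLocalName>
  <fileNameExtension>rpt</fileNameExtension>
</and>
```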

dtdPublicId

Evaluates to true if the document has a document type declaration (<!DOCTYPE>) with a public ID equal to the content of this element.

If substring="true", evaluates to true if the public ID contains the specified string.
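
As an illustration, the first condition below requires an exact match, while the second accepts any public ID containing the given string. The DocBook 4.5 public ID is used here only as a sample value:

```xml
<!-- Exact match: true only for this precise public ID. -->
<dtdPublicId>-//OASIS//DTD DocBook XML V4.5//EN</dtdPublicId>

<!-- Substring match: true for any public ID containing "DTD DocBook XML". -->
<dtdPublicId substring="true">DTD DocBook XML</dtdPublicId>
```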

dtdSystemId

Evaluates to true if the document has a document type declaration (<!DOCTYPE>) with a system ID equal to the content of this element.
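
A minimal sketch, reusing the system ID of the XHTML 1.0 Strict DTD. As the synopsis above shows, dtdSystemId has no substring attribute, so partial matches are not available:

```xml
<dtdSystemId>http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd</dtdSystemId>
```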

fileNameExtension

Evaluates to true if the file containing the document has a name which ends with '.' followed by the content of this element.
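
For example, assuming a hypothetical project that stores its documents in files ending with .dbk, the condition would be written without the leading dot:

```xml
<!-- Matches manual.dbk; the '.' is implied and is not repeated here. -->
<fileNameExtension>dbk</fileNameExtension>
```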

mimeType

Evaluates to true if the file containing the document has a MIME type equal to the content of this element.
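
A short sketch using application/xhtml+xml, the registered MIME type of XHTML. How XXE obtains the MIME type of a file is not described in this section, so this should be read as an illustration of the syntax only:

```xml
<mimeType>application/xhtml+xml</mimeType>
```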

not

Evaluates to true if its child evaluates to false.
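
Since not accepts a single child, negating several conditions at once requires wrapping them in or (or in and). A sketch, with hypothetical extensions:

```xml
<!-- True when the file name ends neither with .bak nor with .tmp. -->
<not>
  <or>
    <fileNameExtension>bak</fileNameExtension>
    <fileNameExtension>tmp</fileNameExtension>
  </or>
</not>
```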

or

Evaluates to true if any of its children evaluates to true.
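
For instance, the following sketch accepts any of three file name extensions commonly used for HTML documents:

```xml
<or>
  <fileNameExtension>html</fileNameExtension>
  <fileNameExtension>htm</fileNameExtension>
  <fileNameExtension>xhtml</fileNameExtension>
</or>
```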

rootElementLocalName

Evaluates to true if the document has a root element with a local name (name without the namespace part) equal to the content of this element.
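
Because only the local part is compared, the prefix used in the document should not matter. A sketch:

```xml
<!-- Matches a root element written <book> as well as <db:book xmlns:db="...">. -->
<rootElementLocalName>book</rootElementLocalName>
```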

rootElementNamespace

Evaluates to true if the document has a root element with a name which belongs to the namespace equal to the content of this element.

Use "<rootElementNamespace xsi:nil='true' />" to specify that the name of root element has no namespace.
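
The two forms can be sketched as follows. The xsi prefix is assumed to be bound to http://www.w3.org/2001/XMLSchema-instance; it is declared locally here only to keep the fragment self-contained, and in practice it may already be declared on an enclosing element:

```xml
<!-- The name of the root element belongs to the DocBook 5 namespace. -->
<rootElementNamespace>http://docbook.org/ns/docbook</rootElementNamespace>

<!-- The name of the root element has no namespace. -->
<rootElementNamespace xsi:nil="true"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" />
```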

Important

  • The detection step is always namespace-aware, even when the document to be opened conforms to a DTD[8].

    XHTML example:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
        ...

    In the above case, the namespace of the root element is "http://www.w3.org/1999/xhtml", even though the document starts with a <!DOCTYPE> and thus conforms to a DTD.

  • Attribute default values are not considered during the detection step.

    XHTML example:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html>
        ...

    In the above case, the root element has no namespace, even if "<!ATTLIST html xmlns %URI; #FIXED 'http://www.w3.org/1999/xhtml'>" is declared in the DTD.
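
One practical consequence of the two points above: a condition meant to accept both sample documents cannot rely on rootElementNamespace alone. A possible sketch, not taken from an actual XXE configuration, combines it with the public ID:

```xml
<or>
  <!-- First sample: xmlns is written explicitly on the root element. -->
  <rootElementNamespace>http://www.w3.org/1999/xhtml</rootElementNamespace>
  <!-- Second sample: xmlns is only a DTD default, so test the DOCTYPE instead. -->
  <dtdPublicId>-//W3C//DTD XHTML 1.0 Strict//EN</dtdPublicId>
</or>
```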

rootElementAttribute

Evaluates to true if the document has a root element which has at least one attribute for which all of the following are true:

  • The local part of the name of the attribute is equal to the value of localName. When localName is not specified, any local part will do.

  • The namespace URI of the name of the attribute is equal to the value of namespace. When namespace is not specified, any namespace URI or no namespace URI at all will do.

    Use the empty string (e.g. namespace="") to specify that the name of the attribute should have no namespace at all.

  • The value of the attribute must be equal to the value of value. When value is not specified, any value will do.

    If substring is specified with value true, it suffices for the value of the attribute to contain the value of value.
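
XML Schema example: the namespace attribute can be illustrated with a namespaced attribute such as xsi:noNamespaceSchemaLocation. In this sketch, the schema file name acme.xsd is hypothetical:

```xml
<!-- True when the root element carries an xsi:noNamespaceSchemaLocation
     attribute whose value contains "acme.xsd". -->
<rootElementAttribute
  localName="noNamespaceSchemaLocation"
  namespace="http://www.w3.org/2001/XMLSchema-instance"
  value="acme.xsd" substring="true" />
```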

DocBook 5 example: use a specific configuration for documents conforming to version 1.0 of Acme Corporation's extension of DocBook 5. As explained in the DocBook 5 documentation, the root element of such a document should have a version attribute with value 5.0-extension acme-1.0.

<rootElementAttribute localName="version" value="acme" substring="true" />

What follows is even more precise, though not strictly needed:

<rootElementAttribute localName="version" namespace="" value="acme" substring="true" />

schemaType

Evaluates to true

  • if the document is explicitly constrained by a DTD (that is, has a <!DOCTYPE>) and the content of this element is dtd,

  • OR if the document is explicitly constrained by a W3C XML Schema (that is, has an xsi:schemaLocation or an xsi:noNamespaceSchemaLocation attribute on its root element) and the content of this element is schema,

  • OR if the document is explicitly constrained by a RELAX NG schema (that is, contains a <?xxe-relaxng-schema location="..."?> processing instruction) and the content of this element is relaxng.

Use "<schemaType xsi:nil='true' />" to specify that the document is not explicitly constrained by a DTD, a W3C XML Schema or a RELAX NG schema.
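
A sketch of both forms. The xsi prefix is assumed to be bound to http://www.w3.org/2001/XMLSchema-instance and is declared locally only to keep the fragment self-contained:

```xml
<!-- The document contains an xxe-relaxng-schema processing instruction. -->
<schemaType>relaxng</schemaType>

<!-- The document has no DOCTYPE, no schema location attribute and no
     xxe-relaxng-schema processing instruction. -->
<schemaType xsi:nil="true"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" />
```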

Examples:

Example 7.1. DocBook DTD

<detect>
  <and>
    <or>
      <rootElementLocalName>book</rootElementLocalName>
      <rootElementLocalName>article</rootElementLocalName>
      <rootElementLocalName>chapter</rootElementLocalName>
      <rootElementLocalName>section</rootElementLocalName>
      <rootElementLocalName>sect1</rootElementLocalName>
      <rootElementLocalName>sect2</rootElementLocalName>
      <rootElementLocalName>sect3</rootElementLocalName>
      <dtdPublicId substring="true">DTD DocBook XML</dtdPublicId>
    </or>
    <rootElementNamespace xsi:nil="true" />
    <not>
      <dtdPublicId substring="true">Simplified</dtdPublicId>
    </not>
  </and>
</detect>

The detect element in the previous example can be described as follows: the opened document is a DocBook document if

  • The local name of the root element is one of book, article, chapter, section, sect1, sect2, sect3.

    OR the public ID of its DTD contains string "DTD DocBook XML".

  • AND the name of its root element does not belong to any namespace.

  • AND the public ID of its DTD does not contain string "Simplified".


Example 7.2. DocBook RELAX NG

<detect>
  <rootElementNamespace>http://docbook.org/ns/docbook</rootElementNamespace>
</detect>

Example 7.3. XHTML Strict DTD

<detect>
  <or>
    <dtdPublicId>-//W3C//DTD XHTML 1.0 Strict//EN</dtdPublicId>
    <and>
       <schemaType xsi:nil="true" />
       <or>
         <rootElementLocalName>body</rootElementLocalName>
         <rootElementLocalName>div</rootElementLocalName>
         <rootElementLocalName>html</rootElementLocalName>
       </or>
    </and>
  </or>
</detect>

Example 7.4. XHTML RELAX NG

<detect>
  <and>
    <rootElementNamespace>http://www.w3.org/1999/xhtml</rootElementNamespace>
    <not>
      <or>
        <dtdPublicId>-//W3C//DTD XHTML 1.0 Strict//EN</dtdPublicId>
        <dtdPublicId>-//W3C//DTD XHTML 1.0 Transitional//EN</dtdPublicId>
      </or>
    </not>
  </and>
</detect>



[8] Remember that XXE is not namespace-aware when the document being edited conforms to a DTD.