saxon:collation

Note: It is also possible to specify a collation directly by using a URI of the form http://saxon.sf.net/collation?keyword=value;keyword=value;.... For details see Collation URIs.

The saxon:collation element is a top-level element used to define collating sequences that may be used in sort keys and in functions such as compare(). The collation name is nominally a URI, although any string is accepted, and is given in the mandatory name attribute. The other attributes control how the collation is defined. There are three ways of setting up a collation:

  1. by class. In this case the class attribute is used to specify the fully qualified name of a Java class that implements the java.util.Comparator interface. Note that if the collation is to be used in functions such as contains() and starts-with(), this class must also be a java.text.RuleBasedCollator. This approach allows a user-defined collation to be implemented in Java.

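  2. by rules. In this case the rules attribute is used to specify details of the ordering required, using the syntax of the Java RuleBasedCollator. To give a simplified example, the rule string "< A < B < C" sorts A before B before C. Because the rules appear in an XML attribute value, each "<" must be escaped in the stylesheet, giving rules="&lt; A &lt; B &lt; C".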

  3. by locale and tailoring. In this case the lang, strength, and decomposition attributes are used to obtain a collation for a particular locale, and to customize it. The lang attribute follows the rules of the xml:lang attribute; for example, "en-US" selects US English. This value is used to find the collation appropriate to a Java locale. The strength attribute sets the strength of the collator. Values are "primary", "secondary", "tertiary", and "identical". The decomposition attribute determines how the collator handles Unicode composed characters. Values are "none", "standard", and "full". See the JDK documentation for full details of these attributes.
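The three approaches above correspond closely to standard java.util and java.text facilities. The following is a minimal Java sketch of what each one amounts to on the JDK side. It is not Saxon's own code: the names CollationSketch, IgnoreCaseCollation, byRules and byLocale are hypothetical and chosen only for illustration.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Comparator;
import java.util.Locale;

// Illustrative sketch only; all class and method names here are hypothetical.
class CollationSketch {

    // Way 1 (by class): a user-defined ordering that ignores case.
    // This is a plain Comparator, not a RuleBasedCollator.
    static class IgnoreCaseCollation implements Comparator<String> {
        @Override
        public int compare(String a, String b) {
            return a.compareToIgnoreCase(b);
        }
    }

    // Way 2 (by rules): build a collator from RuleBasedCollator rule syntax.
    static Collator byRules(String rules) {
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid collation rules: " + rules, e);
        }
    }

    // Way 3 (by locale and tailoring): obtain the collator for a locale,
    // then set its strength and decomposition mode.
    static Collator byLocale(String lang, int strength, int decomposition) {
        Collator collator = Collator.getInstance(Locale.forLanguageTag(lang));
        collator.setStrength(strength);
        collator.setDecomposition(decomposition);
        return collator;
    }
}
```

For instance, byRules("< c < b < a") orders "a" after "b", and byLocale("en-US", Collator.PRIMARY, Collator.CANONICAL_DECOMPOSITION) treats "a" and "A" as equal. Mapping the attribute value "standard" to Collator.CANONICAL_DECOMPOSITION is an assumption made for this sketch. A class named in the class attribute would presumably need to be public with a public no-argument constructor, and, as noted in item 1, a plain Comparator such as IgnoreCaseCollation would not be usable with contains() or starts-with().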

The default attribute specifies whether this collation is the default collation. The value is "yes" or "no"; "yes" indicates that this collation is to be used as the default collation. If more than one collation is specified as the default, the last one wins. If no default collation is specified, Unicode codepoint collation is used.

Sorting and comparison according to Unicode codepoints can be achieved by declaring a collation as follows:

    <saxon:collation name="unicode" class="net.sf.saxon.sort.CodepointCollator"/>
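To make concrete what ordering by Unicode codepoints means, here is a small Java sketch that compares two strings one codepoint at a time. It is not the implementation of net.sf.saxon.sort.CodepointCollator: the class name CodepointOrderSketch and the method compareCodepoints are hypothetical, and only standard java.lang calls are used.

```java
// Illustrative sketch only; not Saxon's CodepointCollator.
class CodepointOrderSketch {

    // Compares two strings by the numeric values of their Unicode codepoints.
    static int compareCodepoints(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return Integer.compare(ca, cb);
            }
            // Advance by one or two chars, depending on the codepoint's width.
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other (or they are equal):
        // the string with characters remaining sorts last.
        return Integer.compare(a.length() - i, b.length() - j);
    }
}
```

Under this ordering "Z" (U+005A) sorts before "a" (U+0061), whereas the JDK collator for English places "a" before "Z". This is the practical difference between codepoint collation and a locale-based collation.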

Note that a stylesheet containing a saxon:collation declaration cannot be compiled at this release, because the underlying Java classes are not serializable.
