The encodings supported on input depend entirely on your choice of XML parser.
On output, any encoding supported by the Java VM may be used.
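As an illustration of requesting an output encoding, the following minimal sketch uses only the standard JAXP interfaces (javax.xml.transform), which work with whichever transformer implementation is on the classpath. The class name OutputEncodingDemo and the helper serialize are hypothetical names chosen for this example, not part of Saxon.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: copies an XML document through an identity
// transformation and serializes it in the requested encoding.
final class OutputEncodingDemo {

    static byte[] serialize(String xml, String encoding) {
        try {
            // newTransformer() with no stylesheet gives an identity transformer.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            // Standard JAXP output property naming the serialization encoding.
            identity.setOutputProperty(OutputKeys.ENCODING, encoding);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            identity.transform(new StreamSource(new StringReader(xml)),
                               new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serialize("<greeting>caf\u00e9</greeting>", "ISO-8859-1");
        System.out.println(bytes.length + " bytes written");
    }
}
```

Writing to a byte stream rather than a character stream matters here: the serializer can only apply the requested encoding when it controls the conversion from characters to bytes.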
The encodings iso-646 and iso646 (in any mixture of upper and lower case) are recognized as synonyms of US-ASCII, even though they are not supported directly by JDK 1.4.
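The synonym handling described above can be sketched as a case-insensitive name mapping applied before the encoding is looked up. This is only an illustration of the idea, not Saxon's actual code; the class EncodingNames and its normalize method are hypothetical.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Hypothetical helper: maps the ISO 646 spellings onto US-ASCII.
final class EncodingNames {

    // Returns "US-ASCII" for iso-646 or iso646 in any mixture of case;
    // every other name is passed through unchanged.
    static String normalize(String requested) {
        String lower = requested.toLowerCase(Locale.ROOT);
        if (lower.equals("iso-646") || lower.equals("iso646")) {
            return "US-ASCII";
        }
        return requested;
    }

    public static void main(String[] args) {
        String name = normalize("Iso-646");
        // US-ASCII is one of the charsets every Java platform must support.
        System.out.println(name + " resolves to " + Charset.forName(name).name());
    }
}
```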
There are some differences between the character encodings supported by the old java.io package and the new java.nio package. If the requested encoding is not supported by the java.nio package, then all non-ASCII characters will be represented using numeric character references. If the encoding is not supported by the java.io package, then Saxon will revert to using UTF-8 as the actual output encoding.
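The two fallbacks can be illustrated with the standard java.nio.charset API. The sketch below is a simplified approximation, not Saxon's implementation: it falls back to UTF-8 when the JVM does not know the requested name, and it escapes only the characters the chosen encoder cannot represent, whereas the behaviour described above escapes all non-ASCII characters when java.nio support is missing. The class EncodingFallback and its methods are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the two fallback strategies.
final class EncodingFallback {

    // Returns the requested charset, or UTF-8 if the JVM does not support it.
    static Charset chooseCharset(String requested) {
        try {
            if (Charset.isSupported(requested)) {
                return Charset.forName(requested);
            }
        } catch (IllegalCharsetNameException e) {
            // The name is syntactically invalid; fall through to UTF-8.
        }
        return StandardCharsets.UTF_8;
    }

    // Replaces each code point the charset cannot encode with a decimal
    // numeric character reference such as &#8364;.
    static String escapeUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            int len = Character.charCount(cp);
            if (encoder.canEncode(text.subSequence(i, i + len))) {
                sb.appendCodePoint(cp);
            } else {
                sb.append("&#").append(cp).append(';');
            }
            i += len;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset cs = chooseCharset("US-ASCII");
        System.out.println(escapeUnencodable("caf\u00e9 \u20ac", cs));
    }
}
```

Numeric character references keep the output well-formed and lossless in any encoding, because an XML parser expands them back to the original characters.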
A list of the character encodings supported in the java.nio package can be obtained by using the command java net.sf.saxon.charcode.CharacterSetFactory, with no parameters. Java does not provide any means of determining the list of encodings supported by the java.io package.
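A similar list can be produced without Saxon by calling the standard method java.nio.charset.Charset.availableCharsets(). The sketch below is an independent illustration and is not claimed to match the output of the Saxon command; the class name ListCharsets is hypothetical.

```java
import java.nio.charset.Charset;
import java.util.Map;
import java.util.SortedMap;

// Hypothetical helper: lists the charsets known to java.nio on this JVM.
final class ListCharsets {

    // Canonical charset names mapped to Charset objects, sorted by name.
    static SortedMap<String, Charset> available() {
        return Charset.availableCharsets();
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Charset> entry : available().entrySet()) {
            // Print each canonical name followed by its registered aliases.
            System.out.println(entry.getKey() + " " + entry.getValue().aliases());
        }
    }
}
```

The result varies between Java installations, since only a small set of charsets (including US-ASCII, ISO-8859-1, UTF-8 and UTF-16) is required on every platform.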
Although Saxon now supports any encoding that the Java VM supports, it still handles certain encodings itself, because this is more efficient and more reliable. The encodings that Saxon supports directly (including synonyms) are ASCII, US-ASCII, iso-646, iso646, iso-8859-1, ISO8859_1, iso-8859-2, ISO8859_2, UTF-8, UTF8, UTF-16, UTF16, KOI8-R, cp1250, windows-1250, cp1251, windows-1251, cp1252, windows-1252, cp852, windows-852.