[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
This module provides a set of packages with a common interface to access the characters contained in a stream. Various implementations are provided to access files and manipulate standard Ada strings.
A top-level tagged type is provided that must be extended for the various streams. It is assumed that the pointer to the current character in the stream can only go forward, and never backward. As a result, it is possible to implement this package for sockets or other strings where it isn't even possible to go backward. This also means that one doesn't have to provide buffers in such cases, and thus that it is possible to provide memory-efficient readers.
Two predefined readers are available, namely String_Input
to read
characters from a standard Ada string, and File_Input
to read
characters from a standard text file.
They all provide the following primite operations:
Open
Close
Next_Char
The next time this function is called, it returns the following character from the stream.
Eof
It is the responsability of this stream reader to correctly call the
decoding functions in the unicode module so as to return one single
valid unicode character. No further processing is done on the result
of Next_Char
. Note that the standard File_Input
and
String_Input
streams can automatically detect the encoding to
use for a file, based on a header read directly from the file.
Based on the first four bytes of the stream (assuming this is valid XML), they will automatically detect whether the file was encoded as Utf8, Utf16,... If you are writing your own input streams, consider adding this automatic detection as well.
However, it is always possible to override the default through a call to
Set_Encoding
. This allows you to specify both the character set
(Latin1, ...) and the character encoding scheme (Utf8,...).
The user is also encouraged to set the identifiers for the stream they
are parsing, through called to Set_System_Id
and
Set_Public_Id
. These are used when reporting error messages.
[ << ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |