Cross-Platform C++

ot
class Character

#include "ot/base/Character.h"

Represents a Unicode character using an internal sequence of one or more CharType characters. It provides optimized routines for converting Unicode characters into a sequence of one or more CharType characters and for decoding a multi-character sequence into a Unicode code-point UCS4Char value.

The Character class also contains a number of convenience methods for querying the characteristics of the encoded Unicode character. These routines, such as isHexDigit() and isSpace(), are simply wrappers for functions in the Unicode class. They have counterparts in the standard C++ library, but the standard library routines rely on the capabilities of a locale, which may not be available for Unicode. The Unicode class does not suffer from this drawback.
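
As a rough illustration of the locale-independent approach described above, the sketch below classifies a code point by its numeric value alone, so the result never depends on the current locale. It is not the library's implementation: the type UCS4 (a stand-in for UCS4Char) and the functions isAsciiCp, isDigitCp and isHexDigitCp are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the library's UCS4Char type.
typedef std::uint32_t UCS4;

// True for code points in the ASCII range U+0000-U+007F.
inline bool isAsciiCp(UCS4 cp)
{
    return cp <= 0x7F;
}

// True only for the ASCII decimal digits 0-9 (U+0030-U+0039).
inline bool isDigitCp(UCS4 cp)
{
    return cp >= 0x30 && cp <= 0x39;
}

// True for the ASCII hexadecimal digits [0-9], [A-F], [a-f].
inline bool isHexDigitCp(UCS4 cp)
{
    return isDigitCp(cp)
        || (cp >= 0x41 && cp <= 0x46)   // A-F
        || (cp >= 0x61 && cp <= 0x66);  // a-f
}
```

Because the tests are plain range comparisons, a non-ASCII digit such as U+0663 is reported as "not a digit", which matches the ASCII-only wording used for isDigit() in this class.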




Constructor/Destructor Summary
Character()
         Default constructor.
Character(const Character& rhs)
         Copy constructor.
Character(UCS4Char ch)
         Constructs a Character with an internal CharType sequence equivalent to the Unicode character represented by the value of ch.
Character(const CharType* pSeqStart, size_t len)
         Constructs a Character given a pointer to the first member of a multi-character sequence and its maximum length.

Method Summary
 void appendToString(String& str) const
         Appends the multi-character sequence controlled by this Character to the passed String str.
 const CharType* data() const
         Returns a pointer to the controlled CharType character sequence buffer.
 const CharType first() const
         Returns the first CharType character in the controlled sequence.
 bool isAscii() const
         Tests if the Unicode character represented by this Character is in the ASCII range U+0000-U+007F.
 bool isDigit() const
         Tests if the Unicode character represented by this Character represents an ASCII decimal digit 0-9.
 bool isEOF() const
         Tests if this Character is equal to the special Character: Character::EndOfFileCharacter.
 bool isHexDigit() const
         Tests if the Unicode character represented by this Character represents an ASCII hexadecimal digit [0-9], [A-F], [a-f].
 bool isSpace() const
         Tests if the Unicode character represented by this Character represents white-space according to common Windows and Unix conventions.
 const size_t length() const
         Returns the number of CharType characters in the controlled character sequence.
 bool operator!=(const Character& rhs) const
         Inequality operator.
 bool operator!=(CharType c) const
         Inequality operator.
 Character& operator=(const Character& rhs)
         Assignment operator.
 bool operator==(const Character& rhs) const
         Equality operator.
 bool operator==(CharType c) const
         Equality operator.
 String toString() const
         Returns the multi-character sequence controlled by this Character as a String.
 UCS4Char toUnicode() const
         Converts the controlled multi-character sequence into a 32-bit Unicode code-point value.

Public Static Data Members

EndOfFileCharacter

Character EndOfFileCharacter

Character representing the 'end of file' condition. This is a special Character that can be returned from functions that read a single Character when the end of file condition has been reached.


Constructor/Destructor Detail

Character

 Character()
Default constructor. Creates a Character that is equivalent to Character::EndOfFileCharacter.


Character

 Character(const Character& rhs)
Copy constructor. Constructs a Character with the same value as rhs.

Parameters:
rhs - the Character to copy

Character

 Character(UCS4Char ch)
Constructs a Character with an internal CharType sequence equivalent to the Unicode character represented by the value of ch.

Exceptions:
IllegalCharacterException - if ch is not a legal Unicode character in the range U+0000-U+10FFFF.
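
The conversion performed by this constructor can be pictured with the following sketch. It assumes that CharType is char and that the internal encoding is UTF-8, which this page does not state. The names UCS4 and encodeUtf8 are hypothetical, std::out_of_range stands in for IllegalCharacterException, and rejecting surrogate values is a choice made by the sketch rather than documented library behaviour.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the library's UCS4Char type.
typedef std::uint32_t UCS4;

// Encodes cp as a UTF-8 sequence of 1 to 4 char units.
// Throws std::out_of_range (standing in for IllegalCharacterException)
// if cp is above U+10FFFF or is a surrogate value (U+D800-U+DFFF).
inline std::string encodeUtf8(UCS4 cp)
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        throw std::out_of_range("not a legal Unicode character");

    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Under these assumptions, U+0041 yields a one-unit sequence and U+20AC yields a three-unit sequence, which corresponds to the value length() would report for such a Character.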

Character

 Character(const CharType* pSeqStart,
           size_t len)
Constructs a Character given a pointer to the first member of a multi-character sequence and its maximum length. A multi-character sequence consists of one or more CharType characters that, taken together, represent a single Unicode character.

The sequence, including the first CharType character and any trailing characters, is copied into the internal multi-character sequence.

Parameters:
pSeqStart - a pointer to the first character of a multi-character sequence that represents a single Unicode character.
len - the number of CharType characters that are legally addressable within the array starting at pSeqStart
Exceptions:
NullPointerException - if pSeqStart is null.
IllegalCharacterException - if the array starting at pSeqStart does not represent a valid Unicode character in the internal encoding
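
The role of the len parameter can be sketched as follows, again assuming that CharType is char and that the internal encoding is UTF-8 (neither is stated on this page). The function utf8SequenceLength is a hypothetical helper: it works out how many units the sequence needs from the first unit, then refuses to read beyond len. std::invalid_argument stands in for both NullPointerException and IllegalCharacterException.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Returns the number of char units in the UTF-8 sequence starting at p,
// never reading more than len units. Throws std::invalid_argument for a
// null pointer, a bad lead unit, a truncated sequence or a bad trailing
// unit. Overlong encodings are not detected by this sketch.
inline std::size_t utf8SequenceLength(const char* p, std::size_t len)
{
    if (!p)
        throw std::invalid_argument("null sequence pointer");
    if (len == 0)
        throw std::invalid_argument("empty sequence");

    const unsigned char lead = static_cast<unsigned char>(p[0]);
    std::size_t n = 0;
    if (lead < 0x80)                n = 1;
    else if ((lead & 0xE0) == 0xC0) n = 2;
    else if ((lead & 0xF0) == 0xE0) n = 3;
    else if ((lead & 0xF8) == 0xF0) n = 4;
    else
        throw std::invalid_argument("invalid lead unit");

    if (n > len)
        throw std::invalid_argument("sequence truncated");

    // Every trailing unit must have the bit pattern 10xxxxxx.
    for (std::size_t i = 1; i < n; ++i) {
        if ((static_cast<unsigned char>(p[i]) & 0xC0) != 0x80)
            throw std::invalid_argument("invalid trailing unit");
    }
    return n;
}
```

Note that len is only an upper bound: a buffer holding a three-unit character followed by other text gives a result of 3, regardless of how much addressable data follows.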

Method Detail

appendToString

void appendToString(String& str) const
Appends the multi-character sequence controlled by this Character to the passed String str.

Parameters:
str - the String which will have this Character appended

data

const CharType* data() const
Returns a pointer to the controlled CharType character sequence buffer.

Returns:
a pointer to the controlled character sequence.
See also:
length()

first

const CharType first() const
Returns the first CharType character in the controlled sequence.

Returns:
the first CharType character in the controlled sequence.
Exceptions:
IllegalCharacterException - if this Character does not represent a valid Unicode character in the range U+0000-U+10FFFF.

isAscii

bool isAscii() const
Tests if the Unicode character represented by this Character is in the ASCII range U+0000-U+007F.

Returns:
true if this Character is in the ASCII range; false otherwise.
See also:
UnicodeCharacterType::IsAscii()

isDigit

bool isDigit() const
Tests if the Unicode character represented by this Character represents an ASCII decimal digit 0-9.

Returns:
true if this Character is a decimal digit [0-9]; false otherwise.
See also:
UnicodeCharacterType::IsDigit()

isEOF

bool isEOF() const
Tests if this Character is equal to the special Character: Character::EndOfFileCharacter. Functions that read a character stream and return a Character need a method to indicate that the end of stream has been reached. To achieve this they return a special Character with a unique value that is different from all valid Unicode characters.

Returns:
true if this Character is equal to the Character::EndOfFileCharacter; false otherwise.
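
The sentinel technique described above can be illustrated with a small standalone sketch. It does not use the library: kEndOfFile, AsciiReader and countChars are hypothetical names, and the sentinel is simply a value outside U+0000-U+10FFFF so that it cannot collide with a real character.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the library's UCS4Char type.
typedef std::uint32_t UCS4;

// Hypothetical sentinel: outside the Unicode range, so no valid
// character can ever compare equal to it.
const UCS4 kEndOfFile = 0xFFFFFFFFu;

// Minimal reader over an ASCII-only buffer; returns one code point per
// call and the sentinel once the buffer is exhausted.
class AsciiReader
{
public:
    explicit AsciiReader(const std::string& text) : m_text(text), m_pos(0) {}

    UCS4 read()
    {
        if (m_pos >= m_text.size())
            return kEndOfFile;
        return static_cast<unsigned char>(m_text[m_pos++]);
    }

private:
    std::string m_text;
    std::size_t m_pos;
};

// Typical read loop: keep reading until the sentinel appears.
inline std::size_t countChars(AsciiReader& reader)
{
    std::size_t n = 0;
    while (reader.read() != kEndOfFile)
        ++n;
    return n;
}
```

With Character, the equivalent loop condition would be a call to isEOF() on the value returned by the reading function.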

isHexDigit

bool isHexDigit() const
Tests if the Unicode character represented by this Character represents an ASCII hexadecimal digit [0-9], [A-F], [a-f].

Returns:
true if this Character is a hexadecimal digit; false otherwise.
See also:
UnicodeCharacterType::IsHexDigit()

isSpace

bool isSpace() const
Tests if the Unicode character represented by this Character represents white-space according to common Windows and Unix conventions. Space characters are:-

Returns:
true if this Character is a space character; false otherwise.
See also:
UnicodeCharacterType::IsSpace()

length

const size_t length() const
Returns the number of CharType characters in the controlled character sequence.

Returns:
the length of the controlled character sequence.
See also:
data()

operator!=

bool operator!=(const Character& rhs) const
Inequality operator. Tests if the Unicode character represented by this Character is not the same Unicode character as rhs.

Returns:
false if the Unicode character represented by this Character is equal to the Unicode character rhs; true otherwise

operator!=

bool operator!=(CharType c) const
Inequality operator. Tests if the internal multi-character sequence has a length other than 1 or the first member is not equal to c.

Returns:
false if the Unicode character represented by this Character is equal to the single CharType character c; true otherwise

operator=

Character& operator=(const Character& rhs)
Assignment operator. Sets this Character equal to rhs.

Returns:
a reference to this Character.

operator==

bool operator==(const Character& rhs) const
Equality operator. Tests if the Unicode character represented by this Character is the same Unicode character as rhs.

Returns:
true if the Unicode character represented by this Character is equal to the Unicode character rhs; false otherwise

operator==

bool operator==(CharType c) const
Equality operator. Tests if the internal multi-character sequence has a length of 1 and the first member is equal to c.

Returns:
true if the Unicode character represented by this Character is equal to the CharType character c; false otherwise
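
The rule used when comparing against a single CharType (length of 1 and matching first member) can be sketched as below. MiniChar, equalsUnit and differsFromUnit are hypothetical names, and the sketch assumes that CharType is char; it is not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical miniature of a Character: a short unit sequence plus
// the number of units in use.
struct MiniChar
{
    char        seq[4];
    std::size_t len;
};

// Equal only when the sequence is exactly one unit long and that unit is c.
inline bool equalsUnit(const MiniChar& ch, char c)
{
    return ch.len == 1 && ch.seq[0] == c;
}

// The exact negation: a length other than 1, or a different first unit.
inline bool differsFromUnit(const MiniChar& ch, char c)
{
    return !equalsUnit(ch, c);
}
```

One consequence of this rule is that a multi-unit character never compares equal to a single CharType, even when its first unit happens to match.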

toString

String toString() const
Returns the multi-character sequence controlled by this Character as a String.

Returns:
a String with the same sequence of CharType characters.

toUnicode

UCS4Char toUnicode() const
Converts the controlled multi-character sequence into a 32-bit Unicode code-point value.

Returns:
the Unicode character represented by this Character as a 32-bit value.
Exceptions:
IllegalCharacterException - if this Character does not represent a valid Unicode character in the range U+0000-U+10FFFF.
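
The decoding step can be sketched in the same spirit, assuming that CharType is char and that the internal encoding is UTF-8 (an assumption, not something this page states). UCS4 and decodeUtf8 are hypothetical names, and the standard exceptions stand in for IllegalCharacterException.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the library's UCS4Char type.
typedef std::uint32_t UCS4;

// Decodes a string holding exactly one UTF-8 encoded character into its
// 32-bit code point. Throws on malformed input or a value above U+10FFFF.
// Overlong encodings are not detected by this sketch.
inline UCS4 decodeUtf8(const std::string& seq)
{
    if (seq.empty())
        throw std::invalid_argument("empty sequence");

    const unsigned char lead = static_cast<unsigned char>(seq[0]);
    std::size_t n = 0;
    UCS4 cp = 0;
    if (lead < 0x80)                { n = 1; cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { n = 2; cp = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { n = 3; cp = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { n = 4; cp = lead & 0x07; }
    else
        throw std::invalid_argument("invalid lead unit");

    if (seq.size() != n)
        throw std::invalid_argument("unexpected sequence length");

    // Each trailing unit contributes its low six bits.
    for (std::size_t i = 1; i < n; ++i) {
        const unsigned char b = static_cast<unsigned char>(seq[i]);
        if ((b & 0xC0) != 0x80)
            throw std::invalid_argument("invalid trailing unit");
        cp = (cp << 6) | (b & 0x3F);
    }

    if (cp > 0x10FFFF)
        throw std::out_of_range("code point above U+10FFFF");
    return cp;
}
```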


Copyright © 2000-2003 ElCel Technology