au.id.jericho.lib.html
Class ParseText

java.lang.Object
  extended by ParseText
All Implemented Interfaces:
java.lang.CharSequence

public final class ParseText
extends java.lang.Object
implements java.lang.CharSequence

Represents the text from the source document that is to be parsed.

This class is normally only of interest to users who wish to create custom tag types.

The parse text is defined as the entire text of the source document in lower case, with all ignored segments replaced by space characters.

The text is stored in lower case to make case insensitive parsing as efficient as possible.
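
To make the definition above concrete, the following is a minimal sketch of how such a text could be derived from a source string. The class name ParseTextSketch, the build helper, and the representation of ignored segments as {begin, end} index pairs are assumptions made for illustration only; they are not part of the library. Because ignored characters are overwritten rather than removed, every index in the result still lines up with the same index in the source (for the ASCII input used here).

```java
import java.util.Locale;

// Illustrative only: not the library's implementation.
final class ParseTextSketch {
    /**
     * Lower-cases the source text and overwrites every ignored range with spaces.
     * Each range is a hypothetical {begin, end} pair, with end exclusive.
     */
    static String build(String sourceText, int[][] ignoredRanges) {
        char[] chars = sourceText.toLowerCase(Locale.ROOT).toCharArray();
        for (int[] range : ignoredRanges) {
            for (int i = range[0]; i < range[1]; i++) {
                chars[i] = ' ';
            }
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        String source = "<P><%= x %>Hi</P>";
        // Treat the server-side block at indices 3..10 as an ignored segment.
        String parseText = build(source, new int[][] {{3, 11}});
        System.out.println("[" + parseText + "]");
        System.out.println(parseText.length() == source.length()); // true
    }
}
```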

This class provides many of the methods found in the java.lang.String class, but adds an extra parameter called breakAtIndex to the various indexOf and lastIndexOf methods. This parameter allows a search to be restricted to a specified segment of the text, which is not possible using the normal String class.
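
As a rough sketch of what a breakAtIndex parameter provides, the helper below emulates a bounded forward character search over any CharSequence. The class BoundedSearchSketch, its indexOf helper, and the value -1 chosen for its NO_BREAK constant are assumptions for illustration; the real constant's value is not stated on this page. A common workaround on a plain String is to copy the range with substring first, which the last line shows for comparison.

```java
// Illustrative only: a stand-in for the bounded search described above.
final class BoundedSearchSketch {
    // Hypothetical stand-in for ParseText.NO_BREAK; the value is an assumption.
    static final int NO_BREAK = -1;

    /** Returns the first index of c in [fromIndex, breakAtIndex), or -1 if absent. */
    static int indexOf(CharSequence text, char c, int fromIndex, int breakAtIndex) {
        int end = (breakAtIndex == NO_BREAK || breakAtIndex > text.length())
                ? text.length()
                : breakAtIndex;
        for (int i = Math.max(fromIndex, 0); i < end; i++) {
            if (text.charAt(i) == c) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "<a href=x>link</a>";
        System.out.println(indexOf(text, '>', 0, 5));        // -1: only indices 0..4 are searched
        System.out.println(indexOf(text, '>', 0, NO_BREAK)); // 9: search runs to the end
        // Plain-String workaround: copy the range, then search the copy.
        System.out.println(text.substring(0, 5).indexOf('>')); // -1
    }
}
```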

ParseText instances are obtained using the Source.getParseText() method.


Field Summary
static int NO_BREAK
          A value to use as the breakAtIndex argument in certain methods to indicate that the search should continue to the start or end of the parse text.
 
Method Summary
 char charAt(int index)
          Returns the character at the specified index.
 boolean containsAt(java.lang.String str, int pos)
          Indicates whether this parse text contains the specified string at the specified position.
 int indexOf(char[] searchCharArray, int fromIndex)
          Returns the index within this parse text of the first occurrence of the specified character array, starting the search at the position specified by fromIndex.
 int indexOf(char[] searchCharArray, int fromIndex, int breakAtIndex)
          Returns the index within this parse text of the first occurrence of the specified character array, starting the search at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.
 int indexOf(char searchChar, int fromIndex)
          Returns the index within this parse text of the first occurrence of the specified character, starting the search at the position specified by fromIndex.
 int indexOf(char searchChar, int fromIndex, int breakAtIndex)
          Returns the index within this parse text of the first occurrence of the specified character, starting the search at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.
 int indexOf(java.lang.String searchString, int fromIndex)
          Returns the index within this parse text of the first occurrence of the specified string, starting the search at the position specified by fromIndex.
 int indexOf(java.lang.String searchString, int fromIndex, int breakAtIndex)
          Returns the index within this parse text of the first occurrence of the specified string, starting the search at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.
 int lastIndexOf(char[] searchCharArray, int fromIndex)
          Returns the index within this parse text of the last occurrence of the specified character array, searching backwards starting at the position specified by fromIndex.
 int lastIndexOf(char[] searchCharArray, int fromIndex, int breakAtIndex)
          Returns the index within this parse text of the last occurrence of the specified character array, searching backwards starting at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.
 int lastIndexOf(char searchChar, int fromIndex)
          Returns the index within this parse text of the last occurrence of the specified character, searching backwards starting at the position specified by fromIndex.
 int lastIndexOf(char searchChar, int fromIndex, int breakAtIndex)
          Returns the index within this parse text of the last occurrence of the specified character, searching backwards starting at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.
 int lastIndexOf(java.lang.String searchString, int fromIndex)
          Returns the index within this parse text of the last occurrence of the specified string, searching backwards starting at the position specified by fromIndex.
 int lastIndexOf(java.lang.String searchString, int fromIndex, int breakAtIndex)
          Returns the index within this parse text of the last occurrence of the specified string, searching backwards starting at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.
 int length()
          Returns the length of the parse text.
 java.lang.CharSequence subSequence(int beginIndex, int endIndex)
          Returns a new character sequence that is a subsequence of this sequence.
 java.lang.String substring(int beginIndex, int endIndex)
          Returns a new string that is a substring of this parse text.
 java.lang.String toString()
          Returns the content of the parse text as a String.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

NO_BREAK

public static final int NO_BREAK
A value to use as the breakAtIndex argument in certain methods to indicate that the search should continue to the end of the parse text (in the indexOf methods) or to the start of the parse text (in the lastIndexOf methods).

See Also:
Constant Field Values

Method Detail

containsAt

public boolean containsAt(java.lang.String str,
                          int pos)
Indicates whether this parse text contains the specified string at the specified position.

This method is analogous to the java.lang.String.startsWith(String prefix, int toffset) method.

Parameters:
str - a string.
pos - the position (index) in this parse text at which to check for the specified string.
Returns:
true if this parse text contains the specified string at the specified position, otherwise false.
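
The analogy with String.startsWith can be sketched on a plain lower-case string. The class name ContainsAtSketch and the sample text are made up for illustration. The last call reflects an inference rather than a documented guarantee: since the parse text is stored in lower case, an argument containing upper-case letters would presumably never match.

```java
// Illustrative only: emulates the documented analogy on a plain String.
final class ContainsAtSketch {
    /** A prefix test at an offset, as String.startsWith(prefix, toffset) performs it. */
    static boolean containsAt(String lowerCaseText, String str, int pos) {
        return lowerCaseText.startsWith(str, pos);
    }

    public static void main(String[] args) {
        String text = "<div><!-- note --></div>"; // already lower case, like a parse text
        System.out.println(containsAt(text, "<!--", 5)); // true
        System.out.println(containsAt(text, "<!--", 0)); // false
        System.out.println(containsAt(text, "<DIV", 0)); // false: the case differs
    }
}
```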

charAt

public char charAt(int index)
Returns the character at the specified index.

Specified by:
charAt in interface java.lang.CharSequence
Parameters:
index - the index of the character.
Returns:
the character at the specified index, which is always in lower case.

indexOf

public int indexOf(char searchChar,
                   int fromIndex)
Returns the index within this parse text of the first occurrence of the specified character, starting the search at the position specified by fromIndex.

If the specified character is not found then -1 is returned.

Parameters:
searchChar - a character.
fromIndex - the index to start the search from.
Returns:
the index within this parse text of the first occurrence of the specified character at or after fromIndex, or -1 if the character is not found.

indexOf

public int indexOf(char searchChar,
                   int fromIndex,
                   int breakAtIndex)
Returns the index within this parse text of the first occurrence of the specified character, starting the search at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.

The position specified by breakAtIndex is not included in the search.

If the search is to continue to the end of the text, the value ParseText.NO_BREAK should be specified as the breakAtIndex.

If the specified character is not found then -1 is returned.

Parameters:
searchChar - a character.
fromIndex - the index to start the search from.
breakAtIndex - the index at which to break off the search, or NO_BREAK if the search is to continue to the end of the text.
Returns:
the index within this parse text of the first occurrence of the specified character within the specified range, or -1 if the character is not found.

lastIndexOf

public int lastIndexOf(char searchChar,
                       int fromIndex)
Returns the index within this parse text of the last occurrence of the specified character, searching backwards starting at the position specified by fromIndex.

If the specified character is not found then -1 is returned.

Parameters:
searchChar - a character.
fromIndex - the index to start the search from.
Returns:
the index within this parse text of the last occurrence of the specified character at or before fromIndex, or -1 if the character is not found.

lastIndexOf

public int lastIndexOf(char searchChar,
                       int fromIndex,
                       int breakAtIndex)
Returns the index within this parse text of the last occurrence of the specified character, searching backwards starting at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.

The position specified by breakAtIndex is not included in the search.

If the search is to continue to the start of the text, the value ParseText.NO_BREAK should be specified as the breakAtIndex.

If the specified character is not found then -1 is returned.

Parameters:
searchChar - a character.
fromIndex - the index to start the search from.
breakAtIndex - the index at which to break off the search, or NO_BREAK if the search is to continue to the start of the text.
Returns:
the index within this parse text of the last occurrence of the specified character within the specified range, or -1 if the character is not found.
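
For the backward direction, the searched range runs from fromIndex down to, but not including, breakAtIndex. The sketch below emulates that on a plain CharSequence; BackwardSearchSketch, its lastIndexOf helper, and the value -1 used for NO_BREAK are assumptions for illustration, not the library's code.

```java
// Illustrative only: a stand-in for the backward bounded search described above.
final class BackwardSearchSketch {
    // Hypothetical stand-in for ParseText.NO_BREAK; the value is an assumption.
    static final int NO_BREAK = -1;

    /** Returns the last index of c in (breakAtIndex, fromIndex], or -1 if absent. */
    static int lastIndexOf(CharSequence text, char c, int fromIndex, int breakAtIndex) {
        // With NO_BREAK the scan continues down to index 0 inclusive.
        int stop = (breakAtIndex == NO_BREAK) ? -1 : breakAtIndex;
        for (int i = Math.min(fromIndex, text.length() - 1); i > stop; i--) {
            if (text.charAt(i) == c) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "<b>bold</b>"; // '<' occurs at indices 0 and 7
        System.out.println(lastIndexOf(text, '<', 10, NO_BREAK)); // 7
        System.out.println(lastIndexOf(text, '<', 6, NO_BREAK));  // 0
        System.out.println(lastIndexOf(text, '<', 6, 0));         // -1: index 0 is excluded
    }
}
```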

indexOf

public int indexOf(java.lang.String searchString,
                   int fromIndex)
Returns the index within this parse text of the first occurrence of the specified string, starting the search at the position specified by fromIndex.

If the specified string is not found then -1 is returned.

Parameters:
searchString - a string.
fromIndex - the index to start the search from.
Returns:
the index within this parse text of the first occurrence of the specified string within the specified range, or -1 if the string is not found.

indexOf

public int indexOf(char[] searchCharArray,
                   int fromIndex)
Returns the index within this parse text of the first occurrence of the specified character array, starting the search at the position specified by fromIndex.

If the specified character array is not found then -1 is returned.

Parameters:
searchCharArray - a character array.
fromIndex - the index to start the search from.
Returns:
the index within this parse text of the first occurrence of the specified character array within the specified range, or -1 if the character array is not found.

indexOf

public int indexOf(java.lang.String searchString,
                   int fromIndex,
                   int breakAtIndex)
Returns the index within this parse text of the first occurrence of the specified string, starting the search at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.

The position specified by breakAtIndex is not included in the search.

If the search is to continue to the end of the text, the value ParseText.NO_BREAK should be specified as the breakAtIndex.

If the specified string is not found then -1 is returned.

Parameters:
searchString - a string.
fromIndex - the index to start the search from.
breakAtIndex - the index at which to break off the search, or NO_BREAK if the search is to continue to the end of the text.
Returns:
the index within this parse text of the first occurrence of the specified string within the specified range, or -1 if the string is not found.
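
The description above does not say whether a match that begins before breakAtIndex but extends past it is reported. The sketch below assumes that only the starting position of a match is bounded; this is an assumption made for illustration, and the library may behave differently. The class BoundedStringSearchSketch and the value -1 for NO_BREAK are likewise hypothetical.

```java
// Illustrative only: one possible reading of the bounded string search.
final class BoundedStringSearchSketch {
    // Hypothetical stand-in for ParseText.NO_BREAK; the value is an assumption.
    static final int NO_BREAK = -1;

    /**
     * Returns the first index i in [fromIndex, breakAtIndex) at which search begins,
     * or -1 if there is none. ASSUMPTION: only the start of the match must lie
     * before breakAtIndex.
     */
    static int indexOf(String text, String search, int fromIndex, int breakAtIndex) {
        int lastStart = text.length() - search.length(); // last index where a match could begin
        int end = (breakAtIndex == NO_BREAK)
                ? lastStart + 1
                : Math.min(breakAtIndex, lastStart + 1);
        for (int i = Math.max(fromIndex, 0); i < end; i++) {
            if (text.startsWith(search, i)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "<p>one</p><p>two</p>"; // "<p>" begins at indices 0 and 10
        System.out.println(indexOf(text, "<p>", 1, NO_BREAK)); // 10
        System.out.println(indexOf(text, "<p>", 1, 10));       // -1: index 10 is excluded
        System.out.println(indexOf(text, "</p>", 0, 7));       // 6 under the stated assumption
    }
}
```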

indexOf

public int indexOf(char[] searchCharArray,
                   int fromIndex,
                   int breakAtIndex)
Returns the index within this parse text of the first occurrence of the specified character array, starting the search at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.

The position specified by breakAtIndex is not included in the search.

If the search is to continue to the end of the text, the value ParseText.NO_BREAK should be specified as the breakAtIndex.

If the specified character array is not found then -1 is returned.

Parameters:
searchCharArray - a character array.
fromIndex - the index to start the search from.
breakAtIndex - the index at which to break off the search, or NO_BREAK if the search is to continue to the end of the text.
Returns:
the index within this parse text of the first occurrence of the specified character array within the specified range, or -1 if the character array is not found.

lastIndexOf

public int lastIndexOf(java.lang.String searchString,
                       int fromIndex)
Returns the index within this parse text of the last occurrence of the specified string, searching backwards starting at the position specified by fromIndex.

If the specified string is not found then -1 is returned.

Parameters:
searchString - a string.
fromIndex - the index to start the search from.
Returns:
the index within this parse text of the last occurrence of the specified string within the specified range, or -1 if the string is not found.

lastIndexOf

public int lastIndexOf(char[] searchCharArray,
                       int fromIndex)
Returns the index within this parse text of the last occurrence of the specified character array, searching backwards starting at the position specified by fromIndex.

If the specified character array is not found then -1 is returned.

Parameters:
searchCharArray - a character array.
fromIndex - the index to start the search from.
Returns:
the index within this parse text of the last occurrence of the specified character array within the specified range, or -1 if the character array is not found.

lastIndexOf

public int lastIndexOf(java.lang.String searchString,
                       int fromIndex,
                       int breakAtIndex)
Returns the index within this parse text of the last occurrence of the specified string, searching backwards starting at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.

The position specified by breakAtIndex is not included in the search.

If the search is to continue to the start of the text, the value ParseText.NO_BREAK should be specified as the breakAtIndex.

If the specified string is not found then -1 is returned.

Parameters:
searchString - a string.
fromIndex - the index to start the search from.
breakAtIndex - the index at which to break off the search, or NO_BREAK if the search is to continue to the start of the text.
Returns:
the index within this parse text of the last occurrence of the specified string within the specified range, or -1 if the string is not found.

lastIndexOf

public int lastIndexOf(char[] searchCharArray,
                       int fromIndex,
                       int breakAtIndex)
Returns the index within this parse text of the last occurrence of the specified character array, searching backwards starting at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.

The position specified by breakAtIndex is not included in the search.

If the search is to continue to the start of the text, the value ParseText.NO_BREAK should be specified as the breakAtIndex.

If the specified character array is not found then -1 is returned.

Parameters:
searchCharArray - a character array.
fromIndex - the index to start the search from.
breakAtIndex - the index at which to break off the search, or NO_BREAK if the search is to continue to the start of the text.
Returns:
the index within this parse text of the last occurrence of the specified character array within the specified range, or -1 if the character array is not found.

length

public int length()
Returns the length of the parse text.

Specified by:
length in interface java.lang.CharSequence
Returns:
the length of the parse text.

substring

public java.lang.String substring(int beginIndex,
                                  int endIndex)
Returns a new string that is a substring of this parse text.

The substring begins at the specified beginIndex and extends to the character at index endIndex - 1. Thus the length of the substring is endIndex - beginIndex.

Parameters:
beginIndex - the begin index, inclusive.
endIndex - the end index, exclusive.
Returns:
a new string that is a substring of this parse text.
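
The index convention described here appears to be the same as that of String.substring, so the arithmetic can be checked on a plain string. The class SubstringRangeSketch, its helpers, and the sample text are made up for illustration.

```java
// Illustrative only: checks the inclusive-begin, exclusive-end convention on a String.
final class SubstringRangeSketch {
    /** Length of the half-open range [beginIndex, endIndex). */
    static int rangeLength(int beginIndex, int endIndex) {
        return endIndex - beginIndex;
    }

    /** Extracts [beginIndex, endIndex) using String.substring. */
    static String extract(String text, int beginIndex, int endIndex) {
        return text.substring(beginIndex, endIndex);
    }

    public static void main(String[] args) {
        String text = "<title>home</title>";
        String inner = extract(text, 7, 11);      // characters at indices 7, 8, 9 and 10
        System.out.println(inner);                // home
        System.out.println(rangeLength(7, 11));   // 4, equal to inner.length()
    }
}
```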

subSequence

public java.lang.CharSequence subSequence(int beginIndex,
                                          int endIndex)
Returns a new character sequence that is a subsequence of this sequence.

This is equivalent to substring(beginIndex, endIndex).

Specified by:
subSequence in interface java.lang.CharSequence
Parameters:
beginIndex - the begin index, inclusive.
endIndex - the end index, exclusive.
Returns:
a new character sequence that is a subsequence of this sequence.
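
Because the class implements java.lang.CharSequence, an instance can presumably be passed to any API that accepts that interface, such as java.util.regex. The sketch below uses a hypothetical minimal wrapper, LowerCaseText, as a stand-in; it is not the library's class. It implements subSequence by delegating to substring, mirroring the equivalence described above.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a minimal lower-case CharSequence, not the library's ParseText.
final class LowerCaseText implements CharSequence {
    private final String text;

    LowerCaseText(String source) {
        this.text = source.toLowerCase(Locale.ROOT);
    }

    @Override
    public char charAt(int index) {
        return text.charAt(index);
    }

    @Override
    public int length() {
        return text.length();
    }

    String substring(int beginIndex, int endIndex) {
        return text.substring(beginIndex, endIndex);
    }

    @Override
    public CharSequence subSequence(int beginIndex, int endIndex) {
        return substring(beginIndex, endIndex);
    }

    @Override
    public String toString() {
        return text;
    }

    /** Returns the start index of the first regex match in text, or -1 if none. */
    static int firstMatch(CharSequence text, String regex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        LowerCaseText t = new LowerCaseText("<IMG Src='a.png'>");
        System.out.println(firstMatch(t, "src\\s*=")); // 5
        System.out.println(t.subSequence(1, 4));       // img
    }
}
```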

toString

public java.lang.String toString()
Returns the content of the parse text as a String.

Specified by:
toString in interface java.lang.CharSequence
Overrides:
toString in class java.lang.Object
Returns:
the content of the parse text as a String.