StringSearch

This module specifies a common framework for string search algorithms. Example implementations are StringSearch:SubstringBF, which performs brute-force substring searching, and StringSearch:Regexp, a regular expression engine.

Import List

    Object
    RT0
 
Class List
Factory

          Factory for Matcher objects.
MatchObject

          Represents the result of a successful match or search operation.
Matcher

          A representation of a search pattern.
Class Summary: Factory
  +---RT0.Object
       |
       +---Object.Object
            |
            +--StringSearch.Factory

Factory for Matcher objects. An instance of Matcher is created from a textual description of a search pattern by calling Factory.Compile.

Constructor Summary
InitFactory(Factory)

          
Method Summary
Compile(String8, Flags): Matcher

          Compile an expression, for example a regular expression pattern, into a Matcher object.
Destroy()

          
Inherited Methods

From RT0.Object:

          Finalize

From Object.Object:

          Equals, HashCode, ToString

 
Class Summary: MatchObject
  +---RT0.Object
       |
       +---Object.Object
            |
            +--StringSearch.MatchObject

Represents the result of a successful match or search operation.

Field Summary
endpos-: LONGINT

          The value of endpos which was passed to a search or match function.
matcher-: Matcher

          The matcher object whose match or search method produced this MatchObject instance.
pos-: LONGINT

          The value of pos which was passed to a search or match function.
string-: String8

          The string passed to a match or search function.
Constructor Summary
InitMatchObject(MatchObject, LONGINT, LONGINT, Matcher, String8)

          
Method Summary
Destroy()

          
End(LONGINT): LONGINT

          Returns the index of the end of the substring matched by group.
Start(LONGINT): LONGINT

          Returns the index of the start of the substring matched by group.
Inherited Methods

From RT0.Object:

          Finalize

From Object.Object:

          Equals, HashCode, ToString

 
Class Summary: Matcher
  +---RT0.Object
       |
       +---Object.Object
            |
            +--StringSearch.Matcher

A representation of a search pattern. An instance of Matcher is applied to a string by calling one of the match or search methods.

Field Summary
flags-: Flags

          The flags argument used when the matcher object was compiled.
groups-: LONGINT

          The number of groups defined in the pattern string.
pattern-: String8

          The pattern string from which the matcher object was compiled.
Constructor Summary
InitMatcher(Matcher, String8, Flags, LONGINT)

          
Method Summary
Destroy()

          
Match(String8, LONGINT, LONGINT): MatchObject

          Like Matcher.MatchChars, but works on an instance of Object.String8.
MatchChars(ARRAY OF CHAR, LONGINT, LONGINT): MatchObject

          Returns a corresponding MatchObject instance if zero or more characters at the beginning of string match this Matcher.
Search(String8, LONGINT, LONGINT): MatchObject

          Like Matcher.SearchChars, but works on an instance of Object.String8.
SearchChars(ARRAY OF CHAR, LONGINT, LONGINT): MatchObject

          Scans through string looking for a location where this Matcher produces a match, and returns a corresponding MatchObject instance.
Inherited Methods

From RT0.Object:

          Finalize

From Object.Object:

          Equals, HashCode, ToString

 
Type Summary
Flags = SET

          
Constant Summary
copyString

          If set, then the match and search functions working on character arrays create a new copy of the searched string for MatchObject.string.
ignoreCase

          Perform case-insensitive matching.

Class Detail: Factory
Constructor Detail

InitFactory

PROCEDURE InitFactory(f: Factory)
Method Detail

Compile

PROCEDURE (f: Factory) Compile(pattern: String8; 
                  flags: Flags): Matcher

Compile an expression, for example a regular expression pattern, into a Matcher object. The matcher object can be used for matching through its Matcher.MatchChars and Matcher.SearchChars methods.

The pattern's behaviour can be modified by specifying a flags value. The set can include any of the following constants: ignoreCase, copyString.

Result is NIL if the given pattern is invalid.

Pre-condition: The value of pattern does not contain the character 0X.
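The compile step described above can be modelled in a few lines. The following is only a sketch in Python, not the Oberon-2 API: the function name compile_pattern and the string values of the two flag constants are hypothetical, and Python's re module stands in for a concrete Factory. It shows the two documented outcomes, a usable matcher or a NIL-like result for an invalid pattern.

```python
import re

# Hypothetical stand-ins for the documented flag constants; the values
# are arbitrary and only serve as set members.
IGNORE_CASE = "ignoreCase"
COPY_STRING = "copyString"


def compile_pattern(pattern, flags=frozenset()):
    """Model of Factory.Compile: return a matcher, or None for a bad pattern."""
    if "\x00" in pattern:
        # Documented pre-condition: the pattern must not contain 0X.
        raise ValueError("pattern must not contain the character 0X")
    re_flags = re.IGNORECASE if IGNORE_CASE in flags else 0
    try:
        return re.compile(pattern, re_flags)
    except re.error:
        # Corresponds to the documented NIL result for an invalid pattern.
        return None


matcher = compile_pattern("[a-z]+", {IGNORE_CASE})
broken = compile_pattern("[a-z")  # unterminated character class
```

Callers are expected to check the result before using it, since an invalid pattern is reported through the return value rather than through an error.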


Destroy

PROCEDURE (f: Factory) Destroy()
 
Class Detail: MatchObject
Field Detail

endpos

FIELD endpos-: LONGINT

The value of endpos which was passed to a search or match function. This is the index into the string beyond which the matcher engine will not go.


matcher

FIELD matcher-: Matcher

The matcher object whose match or search method produced this MatchObject instance.


pos

FIELD pos-: LONGINT

The value of pos which was passed to a search or match function. This is the index into the string at which the matcher engine started looking for a match.


string

FIELD string-: String8

The string passed to a match or search function. This field is NIL if Matcher.MatchChars or Matcher.SearchChars is called on a matcher that was compiled without the flag copyString.

Constructor Detail

InitMatchObject

PROCEDURE InitMatchObject(m: MatchObject; 
                          pos: LONGINT; 
                          endpos: LONGINT; 
                          matcher: Matcher; 
                          string: String8)
Method Detail

Destroy

PROCEDURE (m: MatchObject) Destroy()

End

PROCEDURE (m: MatchObject) End(group: LONGINT): LONGINT

Returns the index of the end of the substring matched by group. See MatchObject.Start.


Start

PROCEDURE (m: MatchObject) Start(group: LONGINT): LONGINT

Returns the index of the start of the substring matched by group. A group of `0' refers to the whole matched substring. Returns `-1' if the group exists but did not contribute to the match. For a match object m, and a group g that did contribute to the match, the substring matched by group g is `string[m.Start(g), m.End(g)['. Note that `m.Start(g)' will equal `m.End(g)' if g matched a null string.

Note: Not all matcher implementations implement groups other than the whole match.

Pre-condition: `0 <= group <= m.matcher.groups'
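The Start/End conventions can be tried out with Python's re module, whose match objects happen to follow the same rules: group 0 is the whole match, -1 marks a group that did not contribute, and the matched text is the half-open slice from start to end. This is a sketch in Python rather than Oberon-2, and the helper group_text is a hypothetical name introduced for illustration.

```python
import re


def group_text(string, m, group):
    """Return the text matched by group, or None if it did not contribute."""
    start = m.start(group)
    if start == -1:
        return None
    # Half-open interval: the character at m.end(group) is not included.
    return string[start:m.end(group)]


text = "xxbbbyy"
m = re.search(r"(a+)|(b+)", text)

whole = group_text(text, m, 0)    # whole match
first = group_text(text, m, 1)    # (a+) exists but did not contribute
second = group_text(text, m, 2)   # (b+) produced the match

# A group that matched the null string has equal start and end indices.
empty = re.search(r"(x*)", "abc")
```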

 
Class Detail: Matcher
Field Detail

flags

FIELD flags-: Flags

The flags argument used when the matcher object was compiled.


groups

FIELD groups-: LONGINT

The number of groups defined in the pattern string. If there are no groups, for example because the matcher does not support them, the field is zero.


pattern

FIELD pattern-: String8

The pattern string from which the matcher object was compiled.

Constructor Detail

InitMatcher

PROCEDURE InitMatcher(matcher: Matcher; 
                      pattern: String8; 
                      flags: Flags; 
                      groups: LONGINT)
Method Detail

Destroy

PROCEDURE (matcher: Matcher) Destroy()

Match

PROCEDURE (matcher: Matcher) Match(string: String8; 
                pos: LONGINT; 
                endpos: LONGINT): MatchObject

Like Matcher.MatchChars, but works on an instance of Object.String8.


MatchChars

PROCEDURE (matcher: Matcher) MatchChars(string: ARRAY OF CHAR; 
                     pos: LONGINT; 
                     endpos: LONGINT): MatchObject

Returns a corresponding MatchObject instance if zero or more characters at the beginning of string match this Matcher. Returns NIL if the string does not match the pattern. Note that this is different from a zero-length match.

Note: If you want to locate a match anywhere in string, use Matcher.SearchChars (or Matcher.Search for an instance of Object.String8) instead.

The second parameter pos gives an index in the string where the search is to start, for example 0 to start at the beginning of the string.

The parameter endpos limits how far the string will be searched. It will be as if the string is endpos characters long, so only the characters in `[pos, endpos[' will be searched for a match. A value of `-1' is equivalent to an endpos of `Length(string)'.

Pre-condition: The start position is within the string, `0 <= pos <= Length(string)', and the given end position is either `-1' or lies between the start position and the end of the string, `pos <= endpos <= Length(string)'.
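The handling of pos and endpos can be written down as a small helper. The sketch below is Python, not Oberon-2, and the name normalize_range is hypothetical. It resolves an endpos of -1 to the string length and rejects arguments that violate the pre-condition. A real matcher may simply assume the pre-condition instead of checking it.

```python
def normalize_range(length, pos, endpos):
    """Resolve endpos == -1 and check the documented pre-condition.

    Returns the half-open range (pos, endpos) that would be searched.
    """
    if not 0 <= pos <= length:
        raise ValueError("pos must satisfy 0 <= pos <= Length(string)")
    if endpos == -1:
        # -1 is shorthand for "up to the end of the string".
        endpos = length
    if not pos <= endpos <= length:
        raise ValueError("endpos must satisfy pos <= endpos <= Length(string)")
    return pos, endpos


full = normalize_range(len("hello"), 0, -1)
part = normalize_range(len("hello"), 2, 4)
```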


Search

PROCEDURE (matcher: Matcher) Search(string: String8; 
                 pos: LONGINT; 
                 endpos: LONGINT): MatchObject

Like Matcher.SearchChars, but works on an instance of Object.String8.


SearchChars

PROCEDURE (matcher: Matcher) SearchChars(string: ARRAY OF CHAR; 
                      pos: LONGINT; 
                      endpos: LONGINT): MatchObject

Scans through string looking for a location where this Matcher produces a match, and returns a corresponding MatchObject instance. Returns NIL if no position in the string matches the pattern. Note that this is different from finding a zero-length match at some point in the string.

The pos and endpos parameters have the same meaning as for the Matcher.MatchChars method.
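The difference between matching at pos and searching from pos, and between "no match" and a zero-length match, can be illustrated with a brute-force model for literal substrings. This is a Python sketch under stated assumptions: the functions match_chars and search_chars are hypothetical stand-ins that return an index pair instead of a MatchObject, and the code is not the implementation used by StringSearch:SubstringBF.

```python
def match_chars(pattern, string, pos=0, endpos=-1):
    """Return (start, end) if pattern occurs exactly at pos, else None."""
    if endpos == -1:
        endpos = len(string)
    end = pos + len(pattern)
    # The match must lie entirely inside the half-open range [pos, endpos[.
    if end <= endpos and string[pos:end] == pattern:
        return (pos, end)
    return None


def search_chars(pattern, string, pos=0, endpos=-1):
    """Return the first (start, end) at or after pos, else None."""
    if endpos == -1:
        endpos = len(string)
    # Try every start position, including endpos itself, where only an
    # empty pattern can still match.
    for start in range(pos, endpos + 1):
        found = match_chars(pattern, string, start, endpos)
        if found is not None:
            return found
    return None


anchored = match_chars("lo", "hello")     # pattern is not at position 0
located = search_chars("lo", "hello")     # found further into the string
zero_length = search_chars("", "abc")     # a match, but of length zero
missing = search_chars("z", "abc")        # no match at any position
cut_off = search_chars("lo", "hello", 0, 4)  # endpos hides the last char
```

Here zero_length is a successful result whose start and end coincide, while missing is the NIL-like None. Callers that test only for an empty matched substring would confuse the two.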

 
Type Detail

Flags

TYPE Flags = SET
Constant Detail

copyString

CONST copyString 

If set, then the match and search functions working on character arrays create a new copy of the searched string for MatchObject.string. Setting this flag is only useful if you intend to extract matched substrings from an instance of MatchObject. Note that most matchers do not support capturing matched substrings.
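The effect of copyString on the result object can be modelled as follows. This is a Python sketch, not the Oberon-2 API: MatchResult, search_chars and matched_text are hypothetical names, and a plain substring search stands in for the matcher. The point is that the matched text can only be recovered from the result when a copy of the searched characters was kept.

```python
from dataclasses import dataclass
from typing import Optional

COPY_STRING = "copyString"  # hypothetical stand-in for the flag constant


@dataclass
class MatchResult:
    start: int
    end: int
    string: Optional[str]  # None models the documented NIL field


def search_chars(needle, chars, flags=frozenset()):
    """Search a character sequence; keep a copy only if COPY_STRING is set."""
    text = "".join(chars)
    start = text.find(needle)
    if start < 0:
        return None
    kept = text if COPY_STRING in flags else None
    return MatchResult(start, start + len(needle), kept)


def matched_text(result):
    """Extract the matched substring, if the searched string was kept."""
    if result.string is None:
        return None
    return result.string[result.start:result.end]


chars = list("hello world")
without_copy = search_chars("world", chars)
with_copy = search_chars("world", chars, {COPY_STRING})
```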


ignoreCase

CONST ignoreCase 

Perform case-insensitive matching. For example, a regular expression like `[A-Z]' will match lowercase letters, too.
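The character-class example above behaves the same way in Python's re module, which is used here only as a stand-in for a regular expression matcher compiled with ignoreCase. The variable names are illustrative.

```python
import re

sensitive = re.compile("[A-Z]")
insensitive = re.compile("[A-Z]", re.IGNORECASE)

upper_only = sensitive.match("q")      # lowercase letter is rejected
either_case = insensitive.match("q")   # lowercase letter is accepted
```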