T3X
Version 7.0.3
Copyright © 1996-2002 Nils M Holm
mail: nmh@t3x.org
home: http://www.t3x.org/
T3X is a small, portable, procedural, block-structured, recursive, almost typeless, and to some degree object oriented language. Its syntax is derived from Pascal and BCPL and its object oriented model is similar to that of Java, but much simpler. The structured approach to programming is well understood, provides a sufficient degree of abstraction, and can easily be translated into native machine code at the same time. The object model eases the development of general and reusable code.
T3X is an imperative language. This means that a program consists of a set of instructions which tell the computer in what way to manipulate the data defined by the program. An instruction is also called a statement. In structured programming languages, there are four fundamental ways of formulating statements:
- assignment
- sequence
- branch
- iteration
The assignment is a fundamental property of imperative languages. It is used to move data from one location to another by assigning values to variables. In a sequence - which is basically a list of statements - the statements are processed from the top towards the bottom of the list. Each statement is guaranteed to be completely processed before the next one is interpreted. A branch is a statement which is executed only if an associated condition applies. Iteration is the repetition of a statement depending on a condition. In a block-structured language, statements may be grouped in statement blocks or compound statements. Each block may have its own local data which cannot be affected by statements contained in other blocks.
An additional layer of abstraction is added to an imperative, block-structured language by providing user-defined procedures or functions (in this document, these terms will be used synonymously). A procedure is a statement or a set of statements which is bound to a symbolic name. A procedure can be executed by coding a call to that procedure. Most languages provide a mechanism to transport data to a procedure and return a value to the calling program. Some languages (like BCPL and Pascal) make a distinction between procedures and functions, others (like K&R C) do not. In languages which make a distinction between procedures and functions, only functions may return values. In T3X, all procedures return values, but the caller is free to ignore them. Therefore, procedures and functions are basically the same.
Another level of abstraction is provided by adding an object model to the language. The object model of T3X consists solely of classes, objects (instances of classes), and messages.
Classes are used to encapsulate code and data of a program. A class may contain any number of data objects and procedures. Only public procedures (so-called methods) may be called by procedures (or methods) of other classes. Objects are used to instantiate classes. Each instance of a class has its own private data area. Hence the same class may be used as a template to create multiple independent objects. Messages are used to activate methods of specific objects. T3X does not provide inheritance, since it is the source of a whole load of semantic problems. It does not support different protection levels (like public data or 'friend' relationships), either, because these concepts undermine the object oriented model.
T3X is an almost typeless language. There exist two different types: so-called atomic variables, which may hold small data objects like characters, numbers, and references to other data objects, and vectors, which are used to store logically connected groups of small data objects. Additionally, there are constants, templates for defining structured data objects and classes, and different types of procedure declarations. The T3X compiler does not allow some combinations of operators which do not make sense (like assigning a value to a procedure or sending a message to a vector). Consequently, T3X's type checking is much stricter than, for example, BCPL's, but much less restrictive than Pascal's. Weakly typed and typeless languages have been exposed to a lot of criticism in the past, because they are considered 'insecure', but the degree of simplicity and flexibility which is bought by 'sacrificing' this bit of security is immense.
The type checking mechanisms of the T3X language are limited to the detection of
BTW: during the development of an early version of T3X, a severe error occurred in the compiler. After tracking it down, it turned out that it was limited to the (type-safe) ANSI C version of the translator and did not affect the T3X version. Of course, this was coincidence, but to some degree it weakens the proposition that typeless languages are per se insecure and dangerous.
In the mid-90's, I was (once again) looking for a programming language providing the following properties:
One might think that there must have been quite a few languages providing these features, but obviously my search did not lead to any satisfactory result, or I would not have invented T3X (which I originally called T). Point (3) turned out to be particularly hard to match. The language which came closest to my requirements was BCPL. The typeless approach, which has been very consistently implemented in this language, leads to clear, simple, and flexible semantics. The language is portable, its implementation is small and can easily be done in BCPL itself. The compiler provided by Martin Richards, the inventor of BCPL, generates code which is aimed at interpretation (for the purpose of porting the compiler), but may be translated into native code as well.
Unfortunately, the syntax of BCPL
reflects its otherwise overwhelming elegance only to a limited degree.
The precedences of some operators have (in my opinion) been chosen in a
too complex way and the syntax of the language is hard to parse by a
pure recursive descent (RD) parser. There are, for example, precedence rules
for statements, which make parsing and understanding BCPL programs
unnecessarily hard. An RD parser had always been a prerequisite for my
language of choice, because they are very easy to implement.
[At this point I have to add that Richards' BCPL compiler is
small, elegant and easy to understand, even though it does use syntax trees
and a bottom up parsing technique.]
Even if BCPL did not match my ideas exactly, it came pretty close and studying the language and compiler sources has influenced the design of T3X a lot. Without BCPL, T3X would not be the language it is today.
The most important thing when designing a programming language is - in my opinion - to define its main purpose. The design goal of T3X was to create a portable, simple, and easy to understand notation for the description of algorithms. T3X was never aimed at industrial software development. Its purpose is to support the programmer in the process of reasoning about problems. It should be a productivity tool in the sense that it provides a playground for new ideas and allows the creator of these ideas to share them with others using a formal notation. Such a notation, of course, has to be clear, simple, and easy to learn, and it would be a great advantage if a compiler for this notation were available in many different environments.
Naturally, my interpretation of productivity is not exactly the same as in the rather profit-oriented `real world' and the design of the T3X language reflects this intention well. T3X is not suitable for writing large scale application programs, nor for `rapid application development'. Originally, it was more a notation than a programming language. Because it is simple and straight-forward, it does not force its user to pay too much attention to the language itself. Instead, it provides some very basic building-stones which may be used to construct a formal solution for a given problem.
There are many popular programming languages which provide
very high level features like data abstraction, inheritance,
exception handling, and environments containing pre-defined solutions
to create popup menus, radio buttons, database queries,
event handling and a load of features which is very helpful when
creating 'real world application programs'.
However, all these features do not really help the programmer
solve a problem - unless the problem is to create an application
program. When I am talking about a `problem', however, I usually
mean the search for an algorithm. Frequently, people say
things like
Mostly, such statements are the result of too little reasoning. Basically, any algorithm can be implemented in any language. The only difference is in the amount of work one has to do to solve the same problem in different languages. So the correct form of the above statement would be
Problem `A' is inconvenient to solve in language `B'.
T3X provides only a very basic set of building-stones, but it turns out that this set is suitable to solve a variety of different problems in a convenient way - including, for example, the creation of a compiler and runtime environment for the T3X language itself.
T3X is an almost typeless, block-structured, procedural, object oriented programming language. Programs, classes, procedures, statements, and expressions form a hierarchy: Programs consist of classes, procedures, and statements, classes contain procedures and statements, procedures usually contain statements, and statements mostly contain expressions. Variables may be atoms (ordinal) or vectors (one-dimensional arrays). Since there are no different types, composed data types - called structures - are basically equal to vectors. Constants may be used to represent frequently used or tuneable values.
This chapter is written in bottom-up order, so that the building stones of larger entities already have been explained when the entities themselves are discussed.
The T3X compiler expects its input in the form of an ASCII file (a sequence of octets where the least significant seven bits of each octet contain the ASCII code of one character and the high bit is set to zero). The following characters will be treated as white space (The C-style 0x-notation is used to represent hexa-decimal numbers):
White space characters delimit tokens, but will otherwise be ignored by the compiler.
Valid input characters are the upper and lower case alphabetic characters A-Z, a-z, the decimal digits 0-9, and the following special characters:
! " # % & ( ) * + , - . / : ; < = > @ [ \ ] ^ _ | ~
Characters which are not contained in this alphabet may only occur in string literals, character literals, and comments. Otherwise they will cause an error.
A comment may be introduced at almost any point in a T3X program using an exclamation point (!). It extends up to but not including the end of the current line. Therefore, a comment is treated the same way as a single white space character, and consequently,
wh! this is a comment ile(1) ;
is equal to
wh ile(1) ;
and not to
while(1) ;
Therefore, comments may not occur inside of a single token, but only between two tokens. This applies in particular to string literals and character literals, which are single tokens as well. A ! character inside one of these literals is treated as an ordinary character.
Symbolic names may contain alphabetic characters, the underscore character (_), and decimal digits, where the first character must be alphabetic or an underscore. Upper case characters will be folded to lower case. Therefore, the names
abc abC aBc aBC Abc AbC ABc ABC
would all refer to the same symbol. The T3X compiler always uses all characters contained in two symbols to distinguish them, so
very_very_very_long_symbol_number_one
and
very_very_very_long_symbol_number_two
are guaranteed to be different. The maximum length of symbol names may be limited by other factors, though (like the maximum length of a token).
In fact, T3X is not a totally typeless language. It is called a 'typeless' language anyway, because even BCPL (which pushes the typeless concept quite to its limit) has at least two types (variables and MANIFEST constants) which require different handling at compile time. In T3X, the following types exist:
Vectors and structures are basically the same and there is no big difference between methods and procedures, either. Atomic variables are used to hold small numeric values or single ASCII characters (which are represented by numbers). Constants are used to provide symbolic names for immutable numeric values. Vectors are sequences of atomic variables. A structure is a set of constants which is used to give names to specific members of a vector. Procedures process parameters and return values just like mathematical functions. Since T3X is an imperative language, they usually have side effects, too. A method is a procedure which is used to alter the state of an object. An object is an instance of a class (which is not a type of its own). Classes will be discussed in detail in the section about object-oriented programming.
Each (atomic) T3X variable allocates exactly one machine word. When talking about variables in the remainder of this document, the attribute atomic is implied. Vectors will be implicitly referred to as vectors or arrays.
Variables are defined using a VAR statement. Any number of names may be defined in a single statement:
VAR x_coord, y_coord, depth;
It is recommended, though, to define only logically connected variables in a single statement.
All types of values may be stored in a variable: numeric values, pointers to strings, pointers to vectors, pointers to structures, pointers to objects, or single characters. The range of numeric values which may be stored in a variable actually depends upon the implementation. The Tcode engine uses only 16 bits to represent a cell - independently from the underlying platform. Therefore, programs which use values not in the range -32767...32767 should be considered machine-dependent. (The T3X compiler will not allow the use of numeric literals outside of this range.)
When a variable is placed in an expression (frequently also called a righthand side value), it evaluates to its value. When it is placed on the lefthand side of an assignment, however, it evaluates to its address (which will be dereferenced immediately by the following assignment operator).
Constants are variables which exist only at compile time (so-called compile time variables). Instead of an automatically assigned address, they are initialized with an explicitly specified value when they are declared. Since they are compile time entities, the values of constants may not change at run time. Any number of constants may be declared in a single CONST statement:
CONST READ=1, WRITE=2, RDWR=READ|WRITE;
Each constant name must be followed by an equal sign (=) and a constant expression which evaluates to the value of the constant. Constant expressions will be explained in a later section.
Constants may occur only in righthand side expressions, where they evaluate to their values.
Vectors are compile time variables, too. When they are declared, they will be initialized with the address of an array of subsequent machine words, the so-called vector members or vector elements. The address of a vector is equal to the address of its first member. Any number of vectors may be defined in a single VAR statement. Declarations of vectors and atomic variables may be mixed in one and the same statement:
VAR RingBuffer[1000], Head, Tail;
Vector declarations differ from atomic variable declarations by the trailing square brackets containing a constant expression which specifies the size of the vector in machine words. The first member of a vector has the index value 0 and the last one has the index vectorsize-1 (999 in the above example). The size of a vector may range from 1 to 16383 elements.
Since vector addresses are stored in compile time variables, they may not change at run time. It is legal to change the values of vector members, though. When occurring in righthand side expressions, vector names evaluate to the addresses of their associated arrays.
Single members of a vector may be addressed using the subscript operator []. The expression
v[5]
for example, evaluates to the fifth member of the vector v (given that the first member of the vector is actually referred to as the zeroth member). Subscripted vectors may occur on the lefthand side of assignments as well. The assignment
v[i] := 99;
would change the i'th member of v to 99. Like atomic variables, the members of vectors may be used to store any data type, even pointers to vectors. See the description of the []-operator for details about nested vectors.
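The declaration and subscript rules above can be put together in a short fragment. This is only a sketch: the names v, i, and x are made up for illustration, and the assignments would have to appear inside a procedure or the main program of a complete T3X program.

```t3x
VAR v[10], i, x;    ! v has the members v[0] through v[9]

i := 3;
v[i] := 99;         ! store 99 in the member with the index 3
x := v[i];          ! x is 99 now
x := v;             ! without a subscript, v evaluates to its address
```

After the last assignment, x[3] would refer to the same member as v[3], because both names evaluate to the same address.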
A special case of the vector is the byte vector. Like 'ordinary' vectors, byte vectors are declared in VAR statements:
VAR Input::256, Output::256;
The only difference between a vector and a byte vector is the computation of the required size. The size value after the ::-operator specifies the number of characters required. The amount of memory actually allocated depends on the size of a machine word on the target machine, which is returned by the core class procedure T3X.BPW() (the method BPW of the class T3X). For all Tcode programs, T3X.BPW()=2 applies. The size of a byte vector is computed using this formula (T being an instance of the T3X class):
(vectorsize + T.BPW() - 1) / T.BPW()
which allocates enough space for at least vectorsize characters. No further type information is associated with vectors. Therefore, it is valid to access byte vector members using [] and word vector members using ::. However, this is discouraged, because the actual vector sizes might depend on a specific implementation and alignment errors may occur at runtime.
A byte vector may not be larger than 32766 bytes (16383 machine words on the Tcode machine).
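As a worked example of the size computation, the following declarations show the number of machine words that would be allocated on the Tcode machine. The sketch assumes T.BPW()=2 (as stated above for Tcode programs) and a division that discards the remainder. The name Line is made up for illustration.

```t3x
VAR Input::256;    ! (256 + 2 - 1) / 2 = 128 machine words = 256 bytes
VAR Line::81;      ! ( 81 + 2 - 1) / 2 =  41 machine words =  82 bytes
```

The second declaration shows why the formula guarantees space for at least the requested number of characters: an odd size is rounded up to the next full machine word.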
A structure is a composed data object. Only one structure may be defined in a single STRUCT statement:
STRUCT POINT = PT_x, PT_y;
Such a statement does not actually create a new data object, but only the `layout' of a structure. For example, to create a 'point' data object, an additional VAR statement is required:
VAR point_a[POINT], point_b[POINT];
This statement creates two point entities, point_a and point_b. The members of such structures can be addressed using the subscript operator: point_a[PT_x] and point_a[PT_y].
Structures do not really have a type of their own. As the declaration and member access syntax already suggests, they are ordinary arrays and the member names are constants. In fact, the statement
STRUCT s = a, b, c;
is perfectly equal to
CONST s=3, a=0, b=1, c=2;
The STRUCT statement only defines symbolic names for accessing vector members with a fixed position and known meaning. The structure name is another constant which holds the number of constants used to name the members (and therefore the size of the entire structure in machine words).
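The following fragment sketches how the POINT structure from above might be filled in and read back. The variable dx is made up for illustration, and the assignments are assumed to appear inside a procedure or the main program.

```t3x
STRUCT POINT = PT_x, PT_y;   ! same as CONST POINT=2, PT_x=0, PT_y=1;

VAR point_a[POINT], point_b[POINT], dx;

point_a[PT_x] := 10;         ! same as point_a[0] := 10;
point_a[PT_y] := 20;         ! same as point_a[1] := 20;
point_b[PT_x] := 15;
dx := point_b[PT_x] - point_a[PT_x];   ! dx is 5
```

Because the member names are plain constants, nothing prevents the use of PT_x with a vector that was not declared using POINT. The structure only documents the intended layout.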
This section describes the most basic elements of each T3X program, the factors which may be used to form expressions.
There are many different kinds of factors: symbols, numeric literals, character literals, string literals, tables, procedure calls, messages, and class constants. A factor may only occur in expressions and a single factor is the minimum form of an expression. Factors may be prefixed by unary operators and they may be combined using binary or ternary operators. Basically, all sorts of factors are exchangeable: where one of them may occur, all others are allowed, too. The only exception is the symbol which has some additional properties which make it special. For example, symbols may be subscripted and it is possible to compute their addresses. These operations are limited to symbols. All other operations may be applied to any kind of factor, even if it makes little sense, like the multiplication of two strings (which will lead to highly environment-dependent results):
"Hello" * "World"
The evaluation of a symbol depends on its type. Variables and constants evaluate to their values, vectors and objects evaluate to their addresses. Structure names and structure member names are treated the same way as constants.
Class constants are public constants which are defined inside classes. To include a class constant in an expression, it must be prefixed with the name of the defining class and a dot:
T3X.SYSOUT
Like 'ordinary' constants, they evaluate to their values.
Numeric literals are written in decimal, hexa-decimal, or binary notation and represent their own values. A percent sign may be used to negate a number:
%123 = -123
The difference between %123 and -123 is that %123 is a factor while -123 is an expression (`minus' applied to a numeric factor). In fact, the percent sign has little meaning in T3X today, since the compiler accepts ordinary minus prefixes in constant expression contexts, too. [In early T3X versions, constant expressions were limited to single factors and therefore, the percent sign was required to define negative constant values. The %-prefix is kept for compatibility reasons.] An optimizing compiler might turn -n into %n, if n is a constant numeric factor.
The hexa-decimal notation may be used to represent a numeric value when prefixing the literal with the strings '0x' or '0X' (zero, X). No space is allowed between the prefix and the hexa-decimal digits. The number 4095, for example, can be written as '0xfff' or '0xFFF'. The characters 'A' through 'F' (alternatively 'a'...'f') are used to represent the hexa-decimal digits with the values '10' to '15'. No difference is made between upper and lower case characters. The literals
0x1f 0X1f 0x1F 0X1F
all express the decimal value 31. The percent prefix may be combined with hexa-decimal factors as well.
Numbers may be expressed in binary notation by prefixing the literal with the strings '0b' or '0B' (zero, B). No space is allowed between the prefix and the binary digits. The number 170, for example, could be written as '0b10101010'.
Note: The literals 0x8000 and 0b1000000000000000 should not be used to express the (decimal) value -32768. This value is not defined in T3X. Since 0x8000 is useful to mask the most significant bit of a pattern, the compiler will allow its use, but it will not allow the notation -32768.
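The three notations and the percent prefix can be compared side by side. The constant names in this sketch are made up for illustration. The form %0x1f is assumed from the statement that the percent prefix may be combined with hexa-decimal factors.

```t3x
CONST DEC_255 = 255,  HEX_255 = 0xff,  BIN_255 = 0b11111111;  ! all 255
CONST NEG_DEC = %31,  NEG_HEX = %0x1f;                        ! both -31
CONST HIGHBIT = 0x8000;   ! allowed as a bit mask for the high bit
```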
Character literals are single characters or escape sequences enclosed by single quote characters like
'a' '0' '\s' ''' '\'' '\\'
A character literal evaluates to the ASCII code of the enclosed character. An escape sequence may be used to include certain unprintable or special characters. The backslash character is used to introduce the sequence. The '\' itself and the following character will be removed and replaced with the associated special character. Note that no escape sequence is required to represent an apostrophe: '''. Besides most C-style sequences, the following translations will be performed: \e->ESC, \q->", and \s->blank. The latter has been included for readability reasons. Unlike C, T3X accepts uppercase sequences as well: \e and \E both evaluate to ESC. The escape character may be used to escape itself. Thereby, it loses its special meaning and '\\' evaluates to a single literal backslash. A summary of all escape sequences is listed in the quick reference section.
String literals are sequences of characters delimited by double quotes ("):
"Hello, World!\n"
Each character either represents itself or is part of an escape sequence as described above. Each character is stored in a single byte. Each string literal is terminated with a NUL character, so n+1 bytes are required to store a string of the length n.
Since a string is an array of subsequent bytes, the ::-operator may be used to access its single characters.
At runtime, each string literal evaluates to the address of its first character.
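A short fragment may illustrate how a string is stored and accessed. The variables s and c are made up for illustration, and the assignments are assumed to appear inside a procedure or the main program.

```t3x
VAR s, c;

s := "T3X";     ! s holds the address of the first character
c := s::0;      ! c is 'T' (ASCII code 84)
c := s::2;      ! c is 'X' (ASCII code 88)
c := s::3;      ! c is 0, the terminating NUL character
```

Note that s itself is an atomic variable. Only the address of the literal is stored in it, not the characters.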
A more general form of a literal vector is the table. A table is a static initialized vector and a generalization of BCPL-style TABLEs. Syntactically, it is a list of table members delimited by square brackets:
[ 7, "MOD", @modulo ]
Each table member occupies exactly one machine word. A string, for example, is represented by a pointer, while the string literal itself is placed outside of the table. Therefore, table members can be accessed using the subscript operator []: if
X = [ 77,88,99 ]
then
X[2]
evaluates to 99. The square bracket notation was chosen for delimiting tables because of the strong connection between vectors and the subscript operator.
The type of each table member may be any out of the following list:
Constant expressions include everything which has a value that may be computed at compile time (like numeric literals). The inclusion of strings has been explained above. Addresses of global variables and procedures are represented by a symbol name prefixed with the address operator @.
What makes tables particularly flexible is the possibility to nest them:
[ [ 2, 9, 4 ], [ 7, 5, 3 ], [ 6, 1, 8 ] ]
Like strings, embedded tables are stored outside of the surrounding table and included as pointers. If, for example, the above table is assigned to the symbol v, the following conditions hold:
v[0] = [ 2, 9, 4 ]
v[1] = [ 7, 5, 3 ]
v[2] = [ 6, 1, 8 ]
Since applying a subscript operator to a table containing tables results in a vector again, the subscript operator may be applied one more time, and consequently,
v[1][1]
would result in 5:
v = [ [2,9,4], [7,5,3], [6,1,8] ]
v[1] = [7,5,3]
v[1][1] = 5
(Remember that the first element of a vector has the index 0.)
A table which contains at least one non-constant expression is called a dynamic table. Non-constant expressions must be put in parentheses when they are to be included in a table:
v := [ "a * b = ", (a*b) ];
Embedded (non-constant) expressions are computed freshly each time the flow of the program passes the table they are contained in (each time the table is evaluated). Therefore, the values of table members computed by embedded expressions may be different each time the table is evaluated. This is why such a table is called 'dynamic'. The parentheses show the compiler that an expression is non-constant and make it generate additional code to fill in the value of the expression whenever the table is encountered. Therefore, static (constant) expressions should never be parenthesized in tables, because doing so would result in inefficient code. For example,
v := [ "5 * 7 = ", (5*7) ];
works, but computes 5*7 each time the table is evaluated. (Note: Even if an optimizing compiler folded 5*7 to 35, the value would have to be stored in the table each time it is passed.)
On the other hand, including dynamic expressions in a table without any parentheses will lead to an error:
v := [ "a * b = ", a*b ];
will not work unless both a and b are constant.
Tables may be prefixed with the keyword PACKED. Packed tables may only contain byte-size values. Therefore, their members are limited to constant expressions with bit patterns where only the least significant 8 bits may contain values other than 0. Expressed in numbers, this is the range from -128 to 255.
A string may be considered a special form of a packed table. Consequently, each string may be written as a packed table as well. For example,
"T3X"
is equal to
PACKED [ 'T', '3', 'X', 0 ]
(Note the trailing zero in the vector literal.) Like strings, packed tables will be padded with zeroes up to the next word boundary.
The maximum number of members per table may be limited, but at least 128 elements per table must be allowed by any T3X implementation. The elements contained in nested tables do not count, but the entire embedded table counts as a single member. The same limit may exist for packed tables and string literals.
Procedure calls are represented by a procedure name followed by a parentheses-enclosed list of zero or more comma-separated arguments:
find(text, "word", 0, TEXT_SIZE);
Each argument may be any valid expression. When a procedure expects zero arguments, the parentheses must still be supplied: P(). A procedure call evaluates to the return value of the called procedure.
In T3X, only procedures may be called. Calls to absolute addresses and computed calls - like in BCPL - are not allowed. There is a mechanism to perform indirect calls, though: the CALL operator.
More detailed information on procedure calls and the procedure call operators can be found in later sections.
A message is used to call a method of a class. It is sent to an instance of its class, also known as an object. The message syntax is equal to a procedure call prefixed with the name of the instance the message is sent to, and a dot:
t.write(T3X.SYSOUT, "Hello, World!\n", 14);
Details about messages can be found in the chapter on object-oriented programming.
Numeric entities usually carry a sign in T3X. This means that a part of a bit pattern representing a number is reserved to indicate the number's sign (positive or negative). On two's-complement machines, the most significant bit (high bit) contains the sign flag. If this bit is set, the number is negative and otherwise it is positive. Therefore, the numeric range on the Tcode machine includes the values -32767 to 32767 (bit patterns 0x8001 to 0x7fff).
Under some circumstances, it is desirable to interpret a number as an unsigned entity instead (for example when comparing pointers). In this document, a leading dot is used to indicate an unsigned number. In T3X itself, no such notation exists, but some operators may be modified with a leading dot to turn them into 'unsigned' operators. Unsigned operators treat the sign bit as a part of the value. Therefore, the domain of these operators is {0 ... 65535} rather than {-32767...32767}.
[Note: Unsigned values may not be expressed in the form of numeric literals in T3X.]
Since the modified operators operate on raw 'bit patterns', -1 and 65535 represent the same value to them on two's-complement machines. To avoid confusion, signed and unsigned operators should only be applied to the following ranges:
Range | Operators |
---|---|
-32767...32767 | signed: * / < > <= >= |
0...65535 | unsigned: .* ./ .< .> .<= .>= |
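The difference between the two operator families shows up as soon as the high bit of an operand is set. The following fragment is a sketch for the 16-bit Tcode machine. The variables a, b, and r are made up for illustration, and the statements are assumed to appear inside a procedure or the main program.

```t3x
VAR a, b, r;

a := %1;        ! -1, bit pattern 0xffff
b := 1;
r := a < b;     ! true:  -1 is less than 1
r := a .< b;    ! false: 0xffff is read as 65535, which is not less than 1
```

This is why pointers should be compared with the dotted operators: an address with the high bit set would otherwise be treated as a negative number.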
In expressions, operators may be used to modify or combine factors in various ways. Most operators may be applied to any kind of operand, even if the resulting operation may not evaluate to any meaningful value.
There are different kinds of operators and like procedures, they are classified by the number of their arguments (which are called operands in this context). There are unary (prefix) operators, binary (infix) operators, and there is one ternary operator and one variadic operator.
Operators may also be classified by their precedences. The higher the precedence of an operator is, the stronger it binds its operands. For example, the term operators (product, quotient, remainder) bind stronger than the sum operators (sum, difference). Therefore,
a * b + c * d
is equal to
(a*b) + (c*d)
Like in math expressions, parentheses may be used to override these default bindings. The precedence rules are simple in T3X:
The precedence rules in 3. are similar to the rules used in the evaluation of algebraic math expressions.
Another property of an operator is its associativity. An operator associates to the left when a sequence of identical operations is evaluated from the left to the right:
Associativity | Expression | Meaning |
---|---|---|
left | A op B op C | (A op B) op C |
right | A op B op C | A op (B op C) |
In T3X, all binary operations with the sole exception of :: are left-associative. The byte operator is right-associative. Unary operators are always right-associative.
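A few concrete expressions may help to see what associativity changes. The variable names in this sketch are made up for illustration, and the statements are assumed to appear inside a procedure or the main program.

```t3x
VAR x, c, v, i, j;

x := 10 - 4 - 3;    ! left:  (10 - 4) - 3 = 3, not 10 - (4 - 3) = 9
x := 24 / 4 / 2;    ! left:  (24 / 4) / 2 = 3, not 24 / (4 / 2) = 12
c := v::i::j;       ! right: v::(i::j), so i::j is evaluated first
```

In the last line, the byte found at position j of i is used as the index into v.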
In the remainder of this section, all available operators will be explained. The appearance is ordered by descending precedence.
The operators (), [], and :: are the only postfix operators. They always bind to primary factors in the form of symbols. Subscripts and call operators may be considered a part of a factor rather than an operator applied to a factor.
The ()-operator is the only variadic operator. Given the procedure call
P(a1, ..., aN)
its arity is N+1 (P plus N arguments). The meaning of the operator is the application of the procedure P to the (optional) arguments a1 through aN. If P does not have any formal arguments, the syntax is
P()
The value of the operation depends on the semantics of P. See the description of the RETURN statement for further details.
() may only be applied to symbols of the type `procedure'. The procedure must have been declared before its first application using either a procedure definition or declaration. The number of arguments to a procedure call will be checked against the arity of the called procedure. If the numbers do not match, an error will be signalled.
Each argument to a procedure call may be any valid expression itself which includes, of course, procedure calls. Given the binary function P2, the following expression is perfectly valid:
P2( P2(1, 2), P2( 3, P2(4, 5) ) )
An indirect procedure call may be performed using the CALL operator which is in fact an extension of the () operator. The expression
CALL PP(a1, ..., aN)
evaluates to the result of the application of PP to a1...aN, but in this case, PP is a procedure pointer instead of an actual procedure. A procedure pointer is an ordinary variable which has been assigned the address of a procedure using the address operator:
PP := @P;
In indirect procedure calls, no type checking (as described above) will be performed. If PP is the name of a procedure instead of a variable, the keyword CALL will be ignored.
In direct and indirect calls, the calling scheme is call by value. This means that all arguments of the call will be evaluated before the control is transferred to the called procedure, so that the value of each parametric expression will be transported to the procedure.
Note: Since vectors evaluate to their addresses, passing a vector by value will actually pass a reference to the array associated with the vector symbol. Therefore, vectors are always passed by reference: Instead of passing the entire vector, only the address of its first member is transported to the called procedure. There, the address will be stored in an (atomic) parameter variable. Since parameters are always atomic and therefore evaluate to their values and vectors evaluate to their addresses, both the actual vector and the parameter will reference the same memory location and subsequently, the parameter may be used in the same way as the original vector.
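The consequence of this note can be sketched as follows. The procedure clear is a hypothetical helper introduced only for illustration. Because only the address of the vector is passed, assignments made through the parameter v change the vector of the caller.

```t3x
! clear is a hypothetical helper, not a part of T3X.
! It stores zero in the first n members of the vector passed in v.
clear(v, n) DO VAR i;
    FOR (i=0, n) v[i] := 0;
END

DO VAR a[10];
    a[3] := 42;
    clear(a, 10);    ! a is passed by reference, so a[3] is zero afterwards
END
```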
Procedure parameters are guaranteed to be evaluated in the order of occurrence (from the left to the right). For example, given the expression
P( Q(), R() );
the programmer may rely on the fact that Q() will be called before R().
The []-operator may be applied to vectors as well as to atomic variables. The subscript in
symbol [subscript]
may be any valid expression. If a is a vector, the subscript operation
a[b]
evaluates to the b'th member of a. If a is an atomic variable, the operation evaluates to the b'th member of the vector that a points to. This means that both subscripts in the following example would evaluate to the same value:
var v[100], pv;
var a1, a2;
pv := v;
a1 := v[25];
a2 := pv[25];
The reason is simple. Since vectors evaluate to their addresses, the assignment
pv := v;
stores the address of v in pv. Atomic variables, on the other hand, evaluate to their values and therefore, pv evaluates to the address of v which has been previously stored in it. Consequently, v and pv both evaluate to the address of v in the above example. Hence, a variable which holds the address of a vector may be used in the place of that vector. For this reason, the subscript operator may be applied to atomic variables as well.
Since there is no nesting limit for vectors, any number of subscript operators may follow a single symbol. Assuming that v5 holds a vector containing five levels of nested vectors, the expression
v5[i1][i2][i3][i4][i5]
may be used to access single elements at the deepest nesting level. Such chains of subscripts evaluate from the left to the right.
The ::-operator (aka byte (subscript) operator) differs from the ordinary (word) subscript operator in several ways. First, it addresses bytes in (byte) vectors and second, it is right-associative. The expression
a::b
evaluates to the b'th byte of the vector a. Therefore, :: is mostly used to access characters in strings. Since the results of ::-operations are always limited to byte width, they cannot be assumed to be valid addresses. For this reason, byte subscripts are right-associative (the result of a byte subscript may very well be a valid subscript). If the expression
a :: b :: c
were evaluated from the left to the right, like (a::b)::c, the result of a::b would probably not be a valid address, since it is limited to eight bits. The following subscript would then reference position c of a non-vector, which is certainly not the desired result. If, on the other hand, the expression is evaluated from the right to the left, like a::(b::c), the subexpression b::c is evaluated first and will probably return a valid subscript. This subscript is then applied to the (also valid) vector a.
Finally, the :: operator differs from [] in the fact that it has no righthand side delimiter. Therefore, the righthand side of :: is always a single factor and expressions like
a::b+c
actually evaluate to
(a::b)+c
since :: has the highest precedence. To address the b+c'th byte in the array a, the subscript must be parenthesized:
a::(b+c)
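As a small sketch of this rule, the following hypothetical procedure (lastbyte is an invented name) returns the last byte of a byte vector of n bytes. It assumes that n is greater than zero.

```t3x
! lastbyte is a hypothetical helper, introduced for illustration only.
! The parentheses are required: s::n-1 would mean (s::n)-1.
lastbyte(s, n) RETURN s::(n-1);
```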
All unary operators have a high precedence and bind to single factors. Unless explicitly specified using parentheses, they never affect subexpressions containing other operators (except for postfix operators which have an even higher precedence). The suffix operators must bind stronger than the prefix operators, because this order leads to much more sensible semantics. For example
-P(a,b)
means `negate the result of applying P to a and b' and
~v[j]
means `evaluate to the inverse value of the j'th member of v'.
If the order of precedence were reversed, the meaning of the first example would be `apply whatever is at the negative address of P to a and b' and the second one would mean `evaluate to the j'th member of the vector located at the address expressed by the inverse value of v'.
Altogether, there are four prefix operators. The minus sign (-) (which exists as a binary operator, too) evaluates to the negative value of its operand. Like in math expressions, any even number of minus signs has no effect. The unary minus sign is distinguished from the binary '-' by its context. When the sign occurs between two operands, it is binary. If it occurs at the place of a factor, it is unary and the factor itself follows.
The tilde operator (~) results in the value of its operand with all bits inverted. Since inverting a bit twice always yields the original state, even numbers of ~-operators have no effect, either.
The backslash (\) represents the logical NOT (while ~ represents the bitwise NOT). This operator evaluates to logical truth (-1), if its operand is logically false (0) and vice versa. Only the zero value represents logical falsity in T3X and all non-zero values represent logical truth. The normal form of the `true' value is -1. Two (or any other even number of) subsequent logical NOT operators may be used to create the normal form of a truth value.
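The normalization of truth values described above might be packaged as in the following sketch. The procedure name normalize is hypothetical, and the inner NOT is parenthesized only to keep the two backslashes visibly apart.

```t3x
! normalize is a hypothetical helper, introduced for illustration only.
! It maps every non-zero value to -1 and leaves zero unchanged.
normalize(x) RETURN \(\x);
```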
The address operator (@) evaluates to the address of its operand. Therefore, it may only be applied to symbol names. The addresses of constants, structure member names, and classes may not be computed using @, because such entities have no addresses. Since the subscript operators bind stronger than the address operator, @ may be used to compute addresses of vector and structure members, and even the addresses of members of nested tables:
@v[i][j]
computes the address of the j'th member of the embedded vector v[i]. Of course, the address operator may be combined with byte subscripts as well:
@s::i
yields the address of the i'th byte of s.
Annotation:
In early versions of T3X, it was a common technique to write
x := a+i;
to compute the address of the i'th member of a. Code like this is likely to cause serious trouble when passed through a non-16-bit backend, since the domain D of the + operator is
D = { -32767, ... 32767 }
Therefore, the equation
a+i = a+i & _16_BIT_MASK_
(where _16_BIT_MASK_ = 0xFFFF) holds in all strictly conformant T3X programs. For efficiency reasons, the results of arithmetic operations are usually not truncated by code generators. Hence using the above code to compute the address of a member of a is undefined.
The only correct way to compute the address of the i'th member of the byte vector a is to write
x := @a::i;
For the same reason, code of the form
x := a + i*t.bpw();
does not work under all circumstances and should be replaced with
x := @a[i];
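A short sketch of the recommended form, using invented variable names: the address computed by @ is stored in an atomic variable, which may then be subscripted like a vector, as explained in the description of the [] operator.

```t3x
DO VAR v[100], tail;
    v[10] := 7;
    tail := @v[10];    ! address of the member v[10]
    tail[0] := 8;      ! refers to the same storage location as v[10]
END
```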
The operation A*B evaluates to the product of A and B. If A*B does not fit in a machine word, the result is undefined.
A/B results in the integral part of the quotient of A and B. The result is undefined, if B is zero.
A MOD B evaluates to the difference between A and A./B.*B where A./B is an unsigned integer division and .* is an unsigned multiplication. Therefore, A MOD B is the division remainder of A./B. Like /, MOD leads to an undefined result, if B=0.
All term operators respect the signs of both of their operands. Two equally signed operands yield a positive result and operands with different signs lead to a negative result.
However, the T3X language also provides some modified operators which work on unsigned values. Modified versions of the multiplication and division operator exist. Like all modified operators, they are prefixed with a dot `.'.
The operation A.*B evaluates to the product of the unsigned values .A and .B. A./B results in the integral part of the quotient of .A and .B.
The notation .X is used to denote the unsigned value of X.
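The difference between the signed and the unsigned division may be illustrated by the following fragment, which is only a sketch with invented variable names. The exact unsigned result depends on the width of a machine word, so it is described in the comment rather than spelled out.

```t3x
DO VAR a, q, uq;
    a := -2;
    q := a / 2;      ! signed quotient: -1
    uq := a ./ 2;    ! unsigned quotient: the bit pattern of -2 is
                     ! interpreted as a large positive number first
END
```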
A+B evaluates to the sum of A and B and A-B evaluates to their difference.
In T3X, all bit operations have the same precedence. Grouping such operations usually requires parentheses. Otherwise, the evaluation is performed from the left to the right.
The operation A&B results in the bitwise AND of A and B. Each bit is the result of computing the logical product of one bit in A with the bit at the same position in B.
A|B yields the result of performing a bitwise OR on A and B. Each bit in the result is a logical sum of a bit in A and the bit at the same position in B.
A^B performs a bitwise exclusive OR (XOR). In this case, the computation of a single bit is done by combining bits at the same positions in A and B using a logical negative equivalence ('not equal') operation.
See the following table for the results of applying logical operations to pairs of bits.
A | B | AND,* | OR,+ | XOR,\= |
---|---|---|---|---|
0 | 0 | 0 | 0 | 0 |
0 | 1 | 0 | 1 | 1 |
1 | 0 | 0 | 1 | 1 |
1 | 1 | 1 | 1 | 0 |
A<<B evaluates to the value of A with all bits shifted to the left by B positions. This is the same as an unsigned multiplication with the B'th power of 2:
a<<b = a .* 2^b    (2^b denotes the b'th power of 2 here, not an XOR operation)
After such an operation, the sign of the result must be considered undefined. This is not relevant, of course, if A is used as a bit field where each bit represents a binary state.
A>>B yields the result of shifting the bits of A to the right by B positions. This is basically equal to the computation of the quotient
a ./ 2^b    (2^b denotes the b'th power of 2 here, not an XOR operation)
Like in left-shift operations, the sign must be considered undefined after right-shift operations.
Technically speaking, one might say that the shift operators in T3X perform bitwise rather than arithmetic shift operations. This implementation has been chosen, because it is in some cases hard to manipulate bit fields using arithmetic shift operators.
Relational operators are used to compare two operands. The relation between the operands is expressed as a truth value: all these operators return truth, if their meaning applies to their operands and otherwise falsity. The following relational operations exist (.X denotes the unsigned value of X):
Operator | Description |
---|---|
A < B | A is less than B |
A > B | A is greater than B |
A <= B | A is less than or equal to B |
A >= B | A is greater than or equal to B |
A .< B | .A is less than .B |
A .> B | .A is greater than .B |
A .<= B | .A is less than or equal to .B |
A .>= B | .A is greater than or equal to .B |
A = B | A is equal to B |
A \= B | A is not equal to B |
Note: the operators expressing equivalence (=, \=) have a lower precedence than operators expressing ordering (> , <, >=, <=, .<, .>, .<=, .>=). For example,
A < B = C < D
is equal to
(A<B) = (C<D)
Consequently, the equation sign may be interpreted as `logical equivalence' when used between comparisons: the above expression evaluates to true, if either
(A<B) AND (C<D)
or
\(A<B) AND \(C<D)
applies. Since the inequation operator \= has the same precedence as =, it may be used as a negative logical equivalence operator (aka an Exclusive OR):
A<0 \= B<0
becomes true, if exactly one of A and B is negative. If the truth values of the comparisons A<0 and B<0 are equal, the expression yields the result `false'.
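Two small sketches of these operators follow. Both procedure names are hypothetical. The first one wraps the exclusive OR of two comparisons, the second one uses a single unsigned comparison as a range check and assumes that n is not negative.

```t3x
! Both procedures are hypothetical helpers, introduced for illustration only.

! True, if exactly one of a and b is negative.
signsdiffer(a, b) RETURN a<0 \= b<0;

! True, if 0 <= i < n. A negative i has a large unsigned value
! and therefore fails the test.
inrange(i, n) RETURN i .< n;
```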
Note that any value may be considered a truth value in T3X. Everything but the zero value is interpreted as `truth', and only 0 may be used to express the `false' value.
The operators A/\B and A\/B reflect logical conjunction (AND) and disjunction (OR). Generally, the expression
A /\ B
evaluates to some true value only if both A and B evaluate to `truth'. Analogously,
A \/ B
yields a true result if either A OR B (or both of them) evaluate to `truth'.
More specifically, /\ and \/ are so-called short circuit operators. Since the expression A/\B can lead to a true result only if all its operands are true, there is no actual need to evaluate B, if A already has evaluated to 'false'. Therefore, the second operand of a conjunction never will be evaluated by a T3X program, if the first one already is false. The result will be zero in this case. If, on the other hand, the first value is true, the result of the entire conjunctional expression will be the value of the second operand. Therefore, the result of
A /\ B
can be specified more precisely as
zero, if A=0 and
B, if A\=0.
The meaning is just a more general form of the logical AND.
Similarly, the expression A\/B can never become `false', if A already has been found out to be true. Therefore, no T3X program will ever evaluate B in such a case, and the result of the disjunction
A \/ B
can be explained more precisely as
A, if A\=0.
B, if A=0.
Like in ordinary algebra, conjunctions bind stronger than disjunctions:
A /\ B \/ C /\ D
equals
(A/\B) \/ (C/\D)
In chains of equal logical operations, the order of evaluation is from the left to the right (as in all binary operations). This means that chains of conjunctions will be evaluated up to the first `false' occurrence and chains of disjunctions will be processed up to the first `true' occurrence. In either case, the result of the entire chain is the value of the operand most recently processed.
There exists a connection between the logical operators and conditional statements: Because of their short circuit nature, logical operators may be used to implement flow control inside of expressions. The expression
A /\ B()
has almost the same meaning as
IF (A) B();
The only difference is that the expression yields a value, while the statement only has a side effect. Likewise, the expression
A \/ B()
has the same meaning as
IF (\A) B();
when ignoring the value of the expression. The IF-statement will be explained in a later section.
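The short circuit behaviour is often used to guard an operation that is only valid under some condition. The following procedures are a sketch with hypothetical names. Note that they return the value of the operand processed last, which is not necessarily the normal form -1.

```t3x
! Both procedures are hypothetical helpers, introduced for illustration only.

! Non-zero, if i is a valid index and v[i] is non-zero.
! v[i] is accessed only if both range checks have succeeded.
isset(v, n, i) RETURN i >= 0 /\ i < n /\ v[i];

! Yields n, if n is non-zero, and the default value 10 otherwise.
ordefault(n) RETURN n \/ 10;
```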
The ternary conditional operator has the lowest precedence. Therefore, it may be used to combine any kind of expressions without using parentheses. The following expression, for example, implements the minimum function:
a<b -> a : b
Since the operator has three operands, it consists of two parts: '->' and ':'. The meaning of the conditional operator is as follows: Given the expression
A-> B: C
if the operand A (the condition) evaluates to some `true' value, B will be evaluated and otherwise, C will be evaluated. If B is evaluated, C will not be evaluated and vice versa. The result of the expression is equal to the value of the most recently evaluated operand.
Like the logical operators /\ and \/, the conditional operator has a connection to conditional statements:
A-> B(): C()
is equivalent to
IE (A) B(); ELSE C();
except for the fact, of course, that the expression has a value, while the statement only has a side effect. (IE means If/Else and introduces a conditional statement with an alternative). The IE-statement will be discussed in a later section.
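Two more applications of the conditional operator, given as a sketch with hypothetical procedure names:

```t3x
! Both procedures are hypothetical helpers, introduced for illustration only.

! The greater one of a and b.
maxval(a, b) RETURN a > b -> a : b;

! The absolute value of x.
absval(x) RETURN x < 0 -> -x : x;
```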
Constant expressions are used wherever a value must be known at compile time. Only a limited set of operators is allowed in constant expressions and the order of evaluation is always from the left to the right. Only a single unary operator is allowed per factor. There are no precedence or associativity rules. For example, the constant expression
L+1*10
evaluates to (L+1)*10 and not to L+(1*10) as it would in ordinary expressions. The reasons for this decision were 1) ease of implementation and 2) the fact that most constant expressions contain only a single operator or none at all.
The following operators are recognized inside of constant expressions:
While the associativity and precedence rules specify which operation is to be performed first, the order of evaluation determines which factor is to be evaluated first. For example, in the expression
A * B
A may be evaluated before B or vice versa. The order of evaluation becomes important, if both A and B have side effects. If A had the side effect of printing 'A' on the terminal screen and B would print 'B', the terminal output of above expression may be "AB" as well as "BA".
The order of evaluation is undefined in most operations, but there are exceptions: the conjunction, disjunction, and conditional operators are defined by their orders of evaluation. Therefore, the order of evaluation of an expression like
A() /\ B()
is exactly defined. The lefthand side is always evaluated first and the righthand side is only evaluated, if the value of the lefthand side is non-zero. Given the above side effects, this expression would print "AB", if A() is non-zero and "A", if it is zero. It would under no circumstances print "BA".
The other exception is the order of evaluation of procedure call arguments and nested procedure calls. Procedure call arguments are guaranteed to be evaluated from the left to the right and nested calls are evaluated inside-out. The expression
P( A(), B() )
would be invariably evaluated in the order A, B, P. Therefore, it is safe, for example, to format a string in a procedure call argument in T3X and compute the length of the formatted string in a following argument. The statement
t.write(1, str.format(buf, "%S/.txrc", [(path)]), str.length(buf));
would print the string formatted in str.format correctly.
Notice: A T3X programmer should never rely on any order of evaluation not explicitly specified in this subsection! Even if precedence rules may suggest a specific order of evaluation, it may in fact be different and, even worse, it may change without breaking any rules, when turning optimizations on or off or using a different compiler version.
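When two operands with side effects must be evaluated in a specific order, one way to enforce it is to evaluate them in separate statements, because statements are processed in sequence. The following fragment is a sketch of this approach. A and B are hypothetical procedures that stand in for procedures with side effects.

```t3x
! A and B are hypothetical procedures, introduced for illustration only.
A() RETURN 2;
B() RETURN 3;

DO VAR x, y, p;
    x := A();      ! A is certainly called first
    y := B();      ! B is certainly called second
    p := x * y;    ! no side effects are left in this expression
END
```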
Statements are the basic building blocks of T3X programs. While expressions just have a value, statements are used to `tell the program to do something'. Therefore, T3X is called an imperative language. Each program is a list of commands which is executed in sequence. Each command is also called a statement in the terminology of imperative programming.
There are different kinds of statements: assignments, procedure calls, conditional statements, loop statements, branch statements, and compound statements. The assignment is an essential part of every imperative language. It is frequently even used to characterize the imperative approach. Compound statements have no meaning of their own, but they are used to group statements to form the bodies of loops, conditionals, and procedures. All other statement types serve to control the program's flow.
In T3X, all statements have to be terminated with a semicolon. This means that a semicolon must follow every statement in a program, except for compound statements which are delimited by the keywords DO and END. In other procedural languages (like BCPL and Pascal), statements are separated rather than terminated. In such languages, a delimiter is only necessary, if two or more statements are written in sequence - there may not be any delimiter after the last statement. The separation rules in some languages are rather complex and the saving in delimiters is usually not worth the extra expense of having to remember these rules. Therefore, the most simple form of combining statements has been chosen in T3X: Each (non-compound) statement has to be terminated.
An assignment is used to transfer the value of an expression to a specific storage location. For example, the statement
A := B;
copies the value of B to A. After the assignment, both variables will have the same value. The previous value of A is thereby lost.
The righthand side of an assignment may be any valid expression as described in the previous section. The lefthand side is restricted to a subset of expressions which is frequently referred to as lvalues (lefthand side values). In T3X, each lvalue may be one of the following:
(*) Vector members and structure members are basically the same.
Assignments to vector members are in no way limited; addressing elements of multiply nested vectors is perfectly legal. The evaluation of variables on lefthand and righthand sides of assignments was explained in detail in the section about factors. In short, on righthand sides variables evaluate to their values and on lefthand sides they evaluate to their addresses. The assignment operator := first evaluates the expression on its left side and remembers the resulting address. Then, it evaluates the expression to its right and stores the result at the memorized address.
A generalization of the evaluation of lefthand sides is the following: all but the last reference on the lefthand side of an assignment evaluate to their values. Only the last reference evaluates to its address. Here are some examples:
A := B;
The symbol A references a specific storage location. Since it is the only reference in the lvalue, it evaluates to its address. In the statement
A[i] := B;
A is not the last reference and hence it yields its value (which is its address in case A is a vector). The operation [i] references the i'th member of A. Since it is the last reference on the lefthand side, it evaluates to the address of A[i] instead of its value. Consequently, the following assignment operator stores B at the address of the i'th member of A. The same is valid for the access of vector elements at any nesting level. The statement
A[i1][i2][i3][i4] := B;
for example, stores B in the i4'th member of A[i1][i2][i3].
Accessing byte vectors works in the same way:
A::i := B;
stores the least significant eight bits of B in the i'th byte of A.
Since :: associates to the right, the last evaluated reference is the leftmost one in chains of byte operators like
A::B::i := C;
Because B::i will be evaluated first in this example, it will yield its value. Then, the address of A::(B::i) is computed. Since no more references are following after A::, the (least significant eight bits of the) value of C will be stored in the (B::i)'th byte of A.
Note: Although the assignment symbol := looks like an operator (and is frequently even referred to as such), it may not be used inside of expressions, but only to combine expressions, thereby turning them into a statement.
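The different forms of lvalues discussed in this section may be summarized in one fragment. This is only a sketch; all names in it are invented for illustration.

```t3x
DO VAR v[3], row[4], s::8;
    v[1] := row;     ! row is a vector, so its address is stored in v[1]
    v[1][2] := 5;    ! v[1] yields its value (the address of row),
                     ! so 5 is stored in row[2]
    s::0 := 'A';     ! stores a single byte in the byte vector s
END
```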
The application of a procedure may form a complete statement:
fill(a, 'X', 10);
In this case, the return value of the called procedure will be discarded and only the side effects of the procedure will actually take effect. The side effect of the above statement, for example, could be to fill the first 10 characters of the vector a with the character 'X'.
Each procedure - no matter whether it returns a specific value or not - may be used in a standalone procedure call. For details on procedure calls, see the sections on factors and procedures in this manual.
There are two forms of the conditional statement. The first one is the IF statement which is available in most procedural languages. Its general syntax is
IF (expression) statement
where expression may be any expression and statement may be any statement. The IF statement itself does not have to be terminated with a semicolon, since its body, which is a statement, too, already supplies the terminating semicolon. The statement which forms the body of the IF statement will be executed, only if expression evaluates to a `true' (non-zero) value. The following statement turns a into its absolute value:
IF (a < 0) a := -a;
If a is less than zero, then a will be replaced with -a, thereby changing its sign. Since the body a := -a is executed only if a < 0 applies, this conditional statement always leaves a non-negative value in a. The semicolon in the above example belongs to the assignment.
The second form of the conditional statement is the IE statement, which implements a conditional with an alternative:
IE (expression) statement-T ELSE statement-F
Like in IF statements, any valid expression or statement may be used in the places of expression, statement-T, and statement-F.
The meaning of the IE statement is equal to the one of the IF statement, as long as the expression becomes `true'. In this case, statement-T will be executed. If the expression evaluates to `false', however, statement-F will be executed, while an IF statement would not have any effect in this case. Therefore
IE (expr) stmt ELSE ;
is equal to
IF (expr) stmt
IE is an abbreviation for If/Else. In most languages, the IF statement may or may not have an alternative. In T3X, there is a separate type of statement for each version. The reason for this choice is the `dangling else' problem which cannot arise when these statement types are separated. If no further information is supplied, the following program written in a language which allows optional alternatives would be ambiguous:
IF (condition1) IF (condition2) statement1 ELSE statement2
The problem is to decide which IF the ELSE branch belongs to: is it the alternative of IF (condition1) or IF (condition2)? In fact, most languages will bind it to the most recently opened IF - the second one in this example. In T3X, such an ambiguity does not exist:
IE (condition1) IF (condition2) statement1 ELSE statement2
Since the IF statement cannot have an alternative, the ELSE branch must belong to IE (condition1).
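Since the alternative of an IE statement may be any statement, it may be another IE statement. The following sketch, using the hypothetical procedure name classify, chains two IE statements to select one of three cases.

```t3x
! classify is a hypothetical helper, introduced for illustration only.
! It returns -1 for negative values, 1 for positive values, and 0 for zero.
classify(x) DO
    IE (x < 0) RETURN -1;
    ELSE IE (x > 0) RETURN 1;
    ELSE RETURN 0;
END
```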
There are two kinds of loops: `while' loops and `for' loops which represent two classes of problems: those which are computable by algorithms with a known upper limit of iterations (FOR-computable or primitive recursive functions) and problems which cannot be computed by algorithms with a fixed number of iterations (WHILE-computable or general recursive functions). Since the FOR-computable functions are a subset of the WHILE-computable ones, FOR statements may be considered a special case of WHILE statements and in fact, it is possible to express a FOR loop using WHILE, but not vice versa.
[In actual programming languages, it is possible to express WHILE using FOR, but theoretically, it is not.]
There is a third kind of loop in many other languages, the repeat loop, which tests its condition at the end of its body, but it turns out to be a special case of the WHILE loop. Repeat loops are not needed very frequently and if they are, they can easily be constructed using WHILE and IF in T3X.
The WHILE loop has the following general form:
WHILE (expression) statement
where expression may be any expression and statement may be any statement. The body consisting of the statement will be executed while the test expression in parentheses evaluates to some `true' value. Hence the name of this loop. If the expression becomes `false' before the statement has been passed for the first time, the statement will never be executed. However, a loop which tests its exit condition at the end of the statement may be constructed using WHILE, IF, LEAVE, and a compound statement (which will be explained later in this chapter):
WHILE (-1) DO    ! -1 = true
    statement
    IF (\condition) LEAVE;
END
In this case, statement will be executed at least once, because the loop condition -1 is a `true' constant. In the subsequent IF statement, the loop will be left if condition does not apply. LEAVE is used to branch out of a loop. It will be explained later, too.
The FOR loop exists in two forms: an explicit form and a short form. The explicit form looks as follows:
FOR (var=start, limit, step) statement
Var is an atomic variable which must have been declared earlier in the program. Unlike in BCPL, it will not be declared implicitly by the FOR statement. Start and limit are expressions and step is a constant expression. The FOR loop works this way:
First, var is initialized with the value of start.
Second, var is compared against limit. If either the condition
var<limit /\ step>=0
or
var>limit /\ step<0
holds, the statement is executed. Otherwise the loop is left and statement will not be executed any more.
Finally, step is added to var, and the loop will be repeated from the point where the exit condition is checked. Like in a WHILE loop, the statement will never be executed, if the exit condition already is true when it is checked for the first time (which is the case, if both of the above conditions become false).
The following examples print the numbers from 0 to 9 using the procedure print (which is only defined in the first example). (Print uses some routines of the classes t3x and string which will be explained in a later chapter.)
print(n) DO VAR b::10;
    t.write(T3X.SYSOUT, str.format(b, "%D\n", [(n)]), str.length(b));
END

DO VAR i;
    FOR (i=0, 10, 1) print(i);
END
This example counts down from 9 to 0:
DO VAR i;
    FOR (i=9, -1, -1) print(i);
END
Special attention should be paid to the limits of the FOR loops in these examples. They always specify the first value which will not be applied to the statement. Another way to write the second example would be the following one, where the FOR loop has been replaced by a WHILE loop:
DO VAR i;
    i := 9;
    WHILE (i > -1) DO
        print(i);
        i := i-1;
    END
END
The meaning of this program fragment is exactly the same as that of the one employing a FOR loop, but the syntax of the FOR statement is more compact and expresses the purpose of the statement more clearly.
The step value is optional in FOR statements. In the short form of the statement, it is omitted. If only two operands are specified in FOR, the step width defaults to one. Therefore, the statements
FOR (j=0, 100, 1) p(j);
and
FOR (j=0, 100) p(j);
have exactly the same meaning.
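A typical use of the short form is a loop over all members of a vector. The following procedure is a sketch with an invented name. As in the examples above, the limit n is the first value which is not applied to the statement.

```t3x
! sum is a hypothetical helper, introduced for illustration only.
! It adds up the members v[0] through v[n-1].
sum(v, n) DO VAR i, s;
    s := 0;
    FOR (i=0, n) s := s + v[i];
    RETURN s;
END
```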
A branch passes control to a specific point in a program. Typical destinations for branch commands are the beginnings or the ends of loops or the ends of procedures or programs. There is no branch command with a freely definable destination like Goto in BCPL.
The LEAVE command causes the immediate termination of the innermost WHILE or FOR loop. There are no operands to LEAVE.
The following code compares the characters in two strings A and B. The loop is left at the first position where the strings differ, but in any case after 100 steps:
FOR (i=0, 100) IF (a::i \= b::i) LEAVE;
The loop is set up for 100 passes and the LEAVE command makes the loop terminate, as soon as a mismatch is found.
The LOOP command transfers control to the beginning of the innermost loop. Like LEAVE, it does not have any operands. If LOOP is used inside of a FOR loop, it branches to the increment part where the value of the index variable is modified. In WHILE loops, it branches directly to the point where the exit condition is checked.
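The following procedure sketches a use of LOOP inside of a FOR loop. The procedure name count is hypothetical. Whenever a zero member is found, the rest of the loop body is skipped and the loop continues with the next value of i.

```t3x
! count is a hypothetical helper, introduced for illustration only.
! It counts the non-zero members among v[0] through v[n-1].
count(v, n) DO VAR i, k;
    k := 0;
    FOR (i=0, n) DO
        IF (v[i] = 0) LOOP;    ! branch to the increment part of FOR
        k := k+1;
    END
    RETURN k;
END
```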
To leave a procedure, a RETURN statement may be used. It has the general forms
RETURN expression;
and
RETURN ;
The statement evaluates the specified expression, if any, and passes it back to the calling procedure. Then, it performs a branch to the end of the procedure where local storage is released and the procedure is left. The value received by the calling procedure is the value of expression:
P(x) RETURN x*x;

Q() DO VAR y;
    y := P(5);
END
In this short example program, 5 is passed as an argument to the procedure P. The procedure computes the square of its argument and returns it to Q where the result (25) will be stored in y.
When no expression is specified after RETURN, a zero value will be assumed. Therefore,
RETURN;
is the short form of
RETURN 0;
All the above branch statements take care of locally allocated storage. If local symbols are defined in the bodies of loops, for example, LOOP and LEAVE will release this storage before branching to their respective destinations. This allows the use of these commands in any loop context, even if local symbols are present.
The HALT statement with the general forms
HALT constant-expression ;
and
HALT ;
branches to the end of the entire program.
If necessary, the command cleans up the runtime environment of the program. The value of the specified constant expression is returned to the calling process. Only the least significant eight bits are guaranteed to be returned to the caller.
The argument of HALT may be omitted. In this case, zero will be delivered to the caller.
A compound statement (sometimes also called a block statement) is a group of statements which is treated like a single statement under some aspects. For example, a compound statement may occur at any place where a simple statement is expected. In commands like
IF (expression) statement
a compound statement can be used to extend the scope of the conditional statement so that it is applied to a group of statements instead of a single statement:
IF (a < '0' \/ a > '9') DO VAR b::3;
    t.write(T3X.SYSOUT, "Not a valid digit. Code=", 25);
    t.write(T3X.SYSOUT, str.format(b, "%X\N", [(a & 0xff)]), str.length(b));
    RETURN -1;
END
In this example, both t.write() messages and the RETURN statement will be processed only if the IF-condition applies. (The concept of sending messages will be explained in detail in the chapter about object-oriented T3X.) The keywords DO and END are used to delimit statement blocks. There is no terminating semicolon after a compound statement. The line
DO p(); q(); END ;
would be recognized as a compound statement containing the procedure calls P() and Q() and an empty statement consisting only of a single semicolon.
In T3X, compound statements are ordinary statements and they may occur at any place where a statement is expected. Even statements like
DO DO END DO END END
are perfectly valid. The use of compound statements in sequences becomes clear in the next sections where the allocation of local storage in compound statements is explained.
Besides the grouping of commands, compound statements provide a mechanism for the definition of local symbols and the allocation of dynamic storage. Declaration statements have already been explained in a previous section. All data objects which can be created in T3X may also be declared locally inside of compound statements by placing their declarations at the beginning of the statement block. Any number of declarations will be accepted after the keyword DO, which introduces the block.
The declaration statements themselves do not change in local contexts. Only the position inside of a statement block makes the declared symbols local to that block. The statement
DO VAR i; FOR (i=0, 10) p(i); END
for example, applies the procedure p to the sequence 0...9. The index variable is declared inside of a compound statement which also contains the FOR loop generating the sequence. The variable i does not exist before the compound statement is entered. It will be created automatically at the point of its declaration and it will cease to exist at the end of the block it has been declared in. Therefore, variables which are local to compound statements are sometimes also called automatic variables.
Besides atomic variables and vectors, structures, constants, and objects may be declared locally, too. Unlike BCPL, T3X does not support nested procedure definitions. In case of atomic variables and vectors, the storage required by the variables is allocated when the symbol becomes valid and released when the variable is destroyed again. In most environments, automatic storage will be allocated on the runtime stack.
To illustrate another application of local storage allocation, imagine the following situation:
P() DO
    VAR big_V[LARGE_1];
    VAR big_W[LARGE_2]; ! Too big
    task1(big_V);
    task2(big_W);
END
In this procedure, two tasks requiring large amounts of storage shall be run sequentially, but not enough memory for both arrays is available. One solution would be the creation of two procedures where each one creates local storage for only one of the tasks. Another one would be to share the vector, but both solutions only work at the cost of readability and maintainability. T3X provides another solution, since the compiler guarantees that local storage is allocated exactly at the point of its declaration and released immediately at the point of the destruction of the associated symbol:
P() DO
    DO VAR big_V[LARGE_1];
        task1(big_V);
    END ! big_V gets released here
    DO VAR big_W[LARGE_2];
        task2(big_W);
    END ! big_W gets released here
END
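The effect of the nested blocks on memory consumption can be made visible with a toy model. The Python sketch below does not describe how a T3X compiler is implemented. It merely assumes a stack on which storage is allocated at a declaration and released at the matching END, and it uses made-up sizes for LARGE_1 and LARGE_2.

```python
class StackModel:
    """Toy model of a runtime stack that records its peak usage."""

    def __init__(self):
        self.used = 0
        self.peak = 0

    def allocate(self, words):
        self.used += words
        self.peak = max(self.peak, self.used)

    def release(self, words):
        self.used -= words


LARGE_1, LARGE_2 = 6000, 4000   # made-up sizes, in machine words


def flat(stack):
    # Both vectors are declared in the same block and exist together.
    stack.allocate(LARGE_1)
    stack.allocate(LARGE_2)
    stack.release(LARGE_2)
    stack.release(LARGE_1)


def nested(stack):
    # Each vector is declared in an inner block of its own.
    stack.allocate(LARGE_1)
    stack.release(LARGE_1)      # END of the first inner block
    stack.allocate(LARGE_2)
    stack.release(LARGE_2)      # END of the second inner block


flat_stack, nested_stack = StackModel(), StackModel()
flat(flat_stack)
nested(nested_stack)
print(flat_stack.peak, nested_stack.peak)
```

In this model the flat version needs room for the sum of both vectors at once, while the nested version never needs more than the larger of the two.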
Since compound statements may be nested, naming conflicts may occur in many languages, as the following example (in C) illustrates:
{ int i;
    i = 123;
    { int i;
        i = 456;
    }
    printf("%d\n", i);
}
The variable i, which is defined in the outer compound statement, is redefined in the inner block. Inside of the inner block, the variable i is assigned the value 456. Clearly, the assignment i = 123; in the outer block references the variable defined in the outer block, but which one is referenced in the inner scope? C - like most other procedural languages - resolves this ambiguity by always giving precedence to the innermost definition. Therefore, the example program fragment would print 123. When this method is used, the symbol i defined in the outer scope becomes inaccessible in the embedded scope.
This effect is called shadowing: The inner definition `shadows' the outer one which thereby becomes temporarily invisible to the compiler.
T3X uses stricter scoping rules than most other languages: Symbols generally may not be redefined in T3X programs. This also applies to global symbols (symbols which have been declared at the top level - outside of procedure definitions, classes, or statement blocks). This way, shadowing can never happen. The flexibility of local symbols remains, though, since names can be reused as soon as the local data object they denote has been destroyed:
F(x,y) DO VAR i, j;
    ! ...
END

G(x,y) DO VAR i, j; ! The names x,y,i,j are re-used
    ! ...
END
As shown in this example, symbol names may be reused in procedure definitions (for formal argument names) as well as in subsequent compound statements. Since the variables i and j will be destroyed at the end of the compound statement forming the body of F, they can be reused in G. The same is valid for the argument names x and y.
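The rule that a name may not be redefined while it is alive, but may be reused afterwards, can be sketched as a small symbol table. The Python code below is a simplified model and not the algorithm of the T3X compiler. The class and method names are hypothetical, and class contexts as well as the letter case of symbols are ignored.

```python
class RedefinitionError(Exception):
    """Raised when a name is declared while it is still alive."""


class ScopeChecker:
    def __init__(self):
        self.scopes = [set()]          # the first entry models the global scope

    def enter(self):
        self.scopes.append(set())      # DO or procedure head: open a scope

    def leave(self):
        self.scopes.pop()              # END: all names of the scope are destroyed

    def declare(self, name):
        # A name is rejected if ANY enclosing scope still holds it,
        # so an inner declaration can never shadow an outer one.
        if any(name in scope for scope in self.scopes):
            raise RedefinitionError(name)
        self.scopes[-1].add(name)


checker = ScopeChecker()
checker.enter()                        # body of F
checker.declare("i")
checker.enter()                        # a nested block inside of F
try:
    checker.declare("i")               # would shadow the outer i
    shadowing_allowed = True
except RedefinitionError:
    shadowing_allowed = False
checker.leave()
checker.leave()                        # end of F: i is destroyed
checker.enter()                        # body of G
checker.declare("i")                   # accepted: the name is free again
print(shadowing_allowed)
```

The nested declaration is refused, whereas the declaration in the body of G is accepted because the scope of F has already been closed.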
The following example shows some local and global symbols and their scopes.
+++  VAR GX, GY;
 |
+++  CLASS A()
 |       STRUCT C=R,G,B;                          +++
 |                                                 |
 |       P(x, y) DO                         +++   +++
 |           VAR x1, x2;                    +++    |
 |       END                                ---    |
 |   END                                          ---
 |
+++  P(x, y) DO VAR x1, y1;                       +++
 |                                                 |
 |       STRUCT PT=PX,PY;                         +++
 |                                                 |
 |       DO VAR i, j;                       +++    |
 |           DO VAR x2, y2;           +++    |     |
 |           END                      ---    |     |
 |                                           |     |
 |           DO VAR x2, y2;           +++    |     |
 |           END                      ---    |     |
 |       END                                ---    |
 |                                                 |
 |       DO CONST t=%1, f=~t;               +++    |
 |           DO VAR x2, y2;           +++    |     |
 |           END                      ---    |     |
 |       END                                ---    |
 |   END                                          ---
 |
Fig.1 Scopes (example)
Like all other symbols, the global variables GX and GY are valid from the point of their declaration, but unlike locally declared names, they remain in existence up to the end of the program. Their scope is the entire program (beginning at their declaration). The scopes of all symbols in the example are illustrated using vertical bars. Plus signs indicate the point where a symbol name becomes valid and its storage is allocated, and minus signs mark the point of its destruction.
Note: the names x2 and y2, which are used in different scopes, denote different variables. A value stored in x2 within the first scope, for example, cannot be retrieved from x2 in the second or the third scope, because the name references different locations in different scopes. The variable which is created at the beginning of the first scope containing x2 is deleted at the end of this scope and the value stored in that variable is lost. Assignments to local variables only remain valid between the corresponding +++ and --- indicators.
All symbols which are defined in a so-called class context (between the keyword CLASS and the matching scope terminator END) are only valid inside of this context. Both the structure C and the procedure P defined in the class A are only valid inside of the class context of A. There is no conflict between A.P (the method P of A) and the procedure P which is defined at the top level. (Class contexts will be discussed in detail in the section on object-oriented T3X later in this document.)
There are two forms of the empty statement (aka null statement) in T3X. The first form is the single semicolon
;
Compound statements may be empty, too:
DO END
Both null statements have absolutely no effect. Their only purpose is to fill a gap where a statement is required, but there is nothing to do. They are useful to negate the meanings of complex conditions, for example. Instead of negating the condition at the cost of making it harder to understand, one might turn
IF (complex-condition) statement
into
IE (complex-condition) ; ELSE statement
Each procedure may be considered a separate small program. It communicates with other procedures using parameters and return values and/or through global variables. Each procedure has access to all global data objects which have been declared before itself. Generally, it is considered `good style' in procedural languages to keep procedures self-contained and use global storage as little as possible, but when data has to be shared between a large number of different procedures, the use of top-level definitions is very common (and more efficient).
The definition of a procedure has only one single form in T3X. Since there is no support for nested routines, all procedure declarations and definitions must occur at the top level (the space between the other global declarations) or in class contexts. Public procedures declared in class contexts are called methods. They will be explained later.
The only form of the procedure definition is
P(a1, ... aN) statement
where P is the name of the procedure, a1...aN are the names of its formal arguments, and statement is the body of the procedure - the part which describes its semantics.
The procedure name may be any valid symbol and it is declared in the global context. Therefore, procedure names may not be reused ever. (One advantage of T3X's strict scoping rules is that procedures cannot get shadowed.) The arguments a1...aN are local to the procedure (not local to the statement forming the body). Their names will cease to exist after the statement has been accepted. Hence, they may be reused after the procedure definition, but not inside of it. The parentheses around the argument list must always be specified, even if the list is empty:
Q() statement
The number of arguments specified in a procedure definition determines the type of the procedure. The type of a procedure is written as an integer number which represents the number of its arguments. In T3X, the argument counts of all procedure calls will be checked against the procedure type. The compiler will not allow calls with a wrong number of parameters. This is done because of T3X's calling conventions: Parameters are passed in reverse order and therefore, each procedure relies on a correct number of arguments. BCPL and C, for example, use a different approach in which it is easy to compensate for missing or superfluous procedure parameters. The only real advantage of this approach, however, is the possibility of defining real variadic procedures (procedures with a variable number of arguments). Variable argument lists can also be realized in T3X, but using a different mechanism which will be explained later.
When a procedure is called, it may receive data through its arguments. This works in the following way. Given a procedure
P(x, a, b, c) RETURN a*x*x + b*x + c;
and a procedure call
Q() DO VAR y;
    y := P(2, 3, 5, -7);
END
the caller (Q) places the values of the actual arguments 2, 3, 5, and -7 in a temporary storage location (usually on the runtime stack), saves the address of the following operation (in this case the assignment) and then transfers control to the procedure P. In P, the formal arguments x, a, b, and c reference storage locations which exactly match the temporary locations of the values passed to the routine, so that x=2, a=3, b=5, and c=-7.
The procedure P computes a*x*x+b*x+c and returns the resulting value to the caller. Each procedure returns automatically when its body has been processed completely or when an explicit RETURN statement is executed. In the above example, both happen at the same time. It is not unusual to specify a RETURN statement at the end of a procedure, since only RETURN may pass an explicit value back to the caller. Procedures which do not return through RETURN have an implicit zero return value. In the example, however, the value of P is explicitly specified. After passing control back to the caller, the assignment takes place, and the result of the procedure call is stored in y. Between the procedure return and the assignment, the temporary storage where the actual arguments were held is released again.
The most frequently used form of the procedure has a body consisting of a compound statement:
fib(n) DO VAR f, i, j, k;
    f := 1;
    j := 1;
    FOR (i=1, n) DO
        k := f;
        f := j;
        j := j+k;
    END
    RETURN f;
END
Note: The variables declared at the beginning of the procedure
VAR f, i, j, k;
belong to the compound statement rather than to the procedure. Like in conditional statements and loops, the statement block is used to extend the scope of the procedure: not only a single statement, but a group of statements forms the body of the routine.
It is perfectly safe for a procedure to call itself. Since the declaration of a procedure takes place while parsing its head (consisting of its name and its argument list), the declaration is already valid when the compiler processes the body. Therefore, a procedure may recurse into itself:
fac(n) RETURN n=0-> 1: fac(n-1)*n;
This small example computes n!. For the trivial case n=0, it simply returns 1. To compute n! where n>0, it first computes (n-1)! and then multiplies the result by n. To compute the factorial of n-1, it calls itself. Since the value of the argument of the recursive call is decremented by one at each level of recursion, it will finally reach 0 and the procedure will start returning.
Recursion is safe in T3X, because local variables (which include formal arguments) are created freshly each time a declaration is passed. Therefore, the symbol n in the above example denotes different variables at each level of recursion. To see how recursion works, the following program including a modified example of the factorial function is recommended:
MODULE visual_fac(t3x, string);

OBJECT t[t3x], str[string];

fac(n) DO VAR b::30;
    ie (n=0) do
        t.write(T3X.SYSOUT, " 1", 3);
        return 1;
    end
    else do
        t.write(T3X.SYSOUT, str.format(b, " %D *", [(n)]), str.length(b));
        return n*fac(n-1);
    end
END

DO var b::80;
    t.write(T3X.SYSOUT, "fac(7) =", 8);
    t.write(T3X.SYSOUT, str.format(b, " = %D\n", [(fac(7))]), str.length(b));
END
Of course, the usual restrictions concerning the use of global memory and other shared resources in recursive procedures apply in T3X, too.
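For readers who cannot run the visual_fac program above, the order in which its pieces of output are produced can be retraced with the following Python sketch. It is a transcription made for illustration only: the output is collected in a list instead of being written to T3X.SYSOUT, and the exact spacing of the real T3X output may differ.

```python
def visual_fac(n, out):
    """Compute n! recursively and record each step in the list `out`."""
    if n == 0:
        out.append(" 1")               # the trivial case ends the recursion
        return 1
    out.append(f" {n} *")              # written BEFORE the recursive call
    return n * visual_fac(n - 1, out)  # n is a fresh variable at each level


pieces = ["fac(7) ="]
result = visual_fac(7, pieces)         # all factors are recorded here
pieces.append(f" = {result}")          # the result is known only afterwards
line = "".join(pieces)
print(line)
```

Each level of recursion records its own value of n before descending, which is why the factors appear in descending order and the result appears last.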
Recursive procedures which depend on each other are called `mutually recursive'. Such a configuration introduces the following problem: Given the procedures
A() DO
    !...
    B();
END

B() DO
    !...
    A();
END
which depend on each other, it does not matter which one is declared first - one will always be inaccessible from within the other. In the above example, B is undefined in A because it is declared after A. When swapping the definitions, A will become undefined in B.
The problem is solved by introducing procedure declarations which may occur before the matching definition. A declaration makes a procedure symbol known to the compiler, but does not associate any meaning with it. In the case of procedures, the definition may be `subsequently delivered'. To declare a procedure, the DECL statement is used:
DECL name(type);
Like in most other declaration statements, any number of comma-separated declarations may be included in a single DECL statement. Name is the name of the procedure to declare and type is a constant expression specifying the number of formal arguments of that procedure. This value is required to type check forward calls to the procedure. The number of formal arguments in a subsequent definition must exactly match the type specified in the declaration. Otherwise, a redefinition error will be signalled.
DECL reserves the given names for later procedure definitions. Therefore, each of these names may only be re-used in one single procedure definition. Declaring a procedure without defining it later is an error, since this may leave forward references to the declared procedure unresolved.
To correct the above program fragment containing the mutually recursive procedures A and B, the declaration
DECL B(0);
has to be inserted before the definition of A. Like procedure definitions, DECL statements are only allowed at the top level and in class contexts, but not inside of local scopes.
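The interplay of declarations, definitions, and argument count checks can be summarized in a small model. The following Python sketch is not taken from the T3X compiler. The class ProcedureTable and its methods are invented for illustration, and the model assumes that declaring the same name twice counts as a redefinition.

```python
class ProcedureTable:
    """Toy bookkeeping for DECL statements and procedure definitions."""

    def __init__(self):
        self.declared = {}   # name -> argument count given in DECL
        self.defined = {}    # name -> argument count of the definition

    def decl(self, name, argc):
        if name in self.declared or name in self.defined:
            raise ValueError(f"redefinition of {name}")   # assumed rule
        self.declared[name] = argc

    def define(self, name, argc):
        if name in self.defined:
            raise ValueError(f"redefinition of {name}")
        if name in self.declared and self.declared[name] != argc:
            raise ValueError(f"redefinition of {name}: argument count differs")
        self.defined[name] = argc

    def check_call(self, name, argc):
        known = {**self.declared, **self.defined}
        if name not in known:
            raise NameError(f"undefined procedure {name}")
        if known[name] != argc:
            raise TypeError(f"wrong number of arguments to {name}")

    def unresolved(self):
        """Names which have been declared, but not defined yet."""
        return sorted(set(self.declared) - set(self.defined))


table = ProcedureTable()
table.decl("B", 0)             # DECL B(0);
table.define("A", 0)           # A() DO ... B(); END
table.check_call("B", 0)       # the forward call inside of A is accepted
pending = table.unresolved()   # B still lacks a definition at this point
table.define("B", 0)           # B() DO ... A(); END
table.check_call("A", 0)
print(pending, table.unresolved())
```

A name remaining in the unresolved list at the end of the program corresponds to the error of declaring a procedure without defining it.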
All procedures have fixed numbers of arguments in T3X. It is possible, however, to pass a variable number of arguments to a procedure using a dynamic vector. The following simple example computes the average of n values stored in the vector v:
average(n, v) DO VAR i, t; t := 0; FOR (i=0, n) t := t+v[i]; RETURN t/n; END
Since vectors are first-class objects in T3X, it is possible to inline vectors in procedure applications, thereby forming an elegant way of passing a variable number of values to a procedure:
average(5, [ 2, 3, 5, 7, 11 ]);
average(3, [ (fib(10)), (fac(5)), 789 ]);
Another example is illustrated below. Given a T3X implementation of a subset of the variadic standard C library procedure printf() and the fib() procedure which has been defined earlier in this chapter, it would be possible to write the following program which prints the line
fib(n) = m
for each n=1...10 and m=fib(n):
DO VAR i;
    FOR (i=1, 11)
        printf("fib(%d) = %d\n", [ (i), (fib(i)) ]);
END
printf() replaces each %d with the readable representation of the value of one of the arguments in the table. Each time a %d is processed, the procedure advances to the next argument. The C version uses a variable number of arguments while the T3X version uses a dynamic vector to transport the arguments.
BTW: printf() uses the number of %-patterns to determine the number of arguments passed to it.
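The technique of transporting a variable number of values in a single vector argument can be tried out in Python, where a list plays the role of the dynamic vector. The sketch below is an approximation written for this purpose. The helper format_d is hypothetical and handles nothing but %d, and the integer division matches t/n only for non-negative values.

```python
def average(n, v):
    """Average of the first n members of v, as in the T3X example."""
    total = 0
    for i in range(n):
        total += v[i]
    return total // n          # integer division (non-negative values only)


def fib(n):
    """Python transcription of the fib() procedure shown earlier."""
    f, j = 1, 1
    for _ in range(1, n):
        f, j = j, j + f
    return f


def format_d(template, args):
    """Replace each %d in `template` with the next value from `args`."""
    parts = template.split("%d")
    # The number of %d patterns decides how many values are consumed.
    if len(parts) - 1 != len(args):
        raise ValueError("argument count does not match the %d patterns")
    out = parts[0]
    for value, rest in zip(args, parts[1:]):
        out += str(value) + rest
    return out


lines = [format_d("fib(%d) = %d", [i, fib(i)]) for i in range(1, 11)]
print(average(5, [2, 3, 5, 7, 11]))
print("\n".join(lines))
```

The callee has a fixed number of parameters in every call. Only the length of the vector varies.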
Like many popular object oriented languages, T3X is a hybrid language. A hybrid language is a language incorporating (at least) two different paradigms. T3X uses the object oriented approach at a rather abstract level and the procedural approach at the lower levels. For example, numbers are no objects in T3X and adding numbers is not done by sending messages. In a purely object oriented language, the term
5 + 7
would be interpreted as send the message '+' with the argument '7' to the object '5'. In a procedural language, however, adding numbers is done by combining the factors '5' and '7' using the '+' operator. Interpreting numbers as objects and expressions as messages makes no sense in a procedural language, since numbers and operators are not implemented this way. There are many hybrid languages employing both the procedural and object oriented approach. For example, C++, Java, and ObjectPascal fit in this category. A well-known member of the family of purely object oriented languages is Smalltalk.
What is the object oriented programming (OOP) paradigm anyway? The term 'object oriented' (OO) has become a little fuzzy, since it has been excessively misused in marketing campaigns and blurry language definitions.
An object oriented language encapsulates code and data definitions in templates called classes. A class contains data definitions, procedures, and public procedures (aka methods). Each class can be instantiated by declaring an object (aka instance) of that class. Each object contains the data objects defined inside of its class. (The term object is used to refer to instances of classes in this section. Variables and vectors are referred to as data objects.) Only procedures defined in its class may access the data contained in an object. No procedure which is not contained in the class of an object may ever access data of the object. This principle is called encapsulation. This is a fundamental property of OO languages.
Each class may have multiple instances (objects). In this case, each object of the class has its own private data area. This is why classes may be reused. Manipulating one object has no effect on other objects of the same class.
Methods of classes are invoked by sending messages to objects of that class. A message is similar to a procedure call. It may transport arguments 'into' the object and the object may return a value to the caller. Since the method is part of the class of the object, it is allowed to manipulate data inside of the object. This way, methods provide a clean and abstract interface to the data of the object. The data structure itself is hidden from the user and may change without changing the interface.
Other languages define additional concepts like inheritance, protected and public variables, virtual methods, friend relationships, class variables, etc., but most of these concepts are semantically hard to handle and weaken the object oriented principles. All concepts that are actually necessary to define an object oriented model are
The general forms of the class declaration are as follows:
CLASS classname() declarations END CLASS classname(required, ...) declarations END
The context of a class is delimited by the class header (consisting of the keyword CLASS, the name of the class, and the dependency list in parentheses) and the keyword END. Inside of this context, there may be any number of declarations. These types of declarations are allowed in class contexts:
Nested classes are not defined in the T3X object model.
All declarations between CLASS and END are local to the class. Therefore, classes add an additional level of scoping between the global level and the procedural level. All data objects and procedures declared inside of a class are only visible inside of that class. The names of entities declared at class level may be reused outside of the scope of the class. Hence different classes may define data objects, procedures and even methods with equal names. The following example illustrates this principle.
CLASS a()
    VAR flag;
    PUBLIC flip() flag := \flag;
END ! the scope of class A ends here.

CLASS b()
    VAR flag;
    PUBLIC flop() flag := \flag;
END
This example defines two classes A and B, each defining a variable named flag. At the end of the scope of A, all declarations of A become invisible (encapsulated) and so the name flag may be reused in B. Since the procedure flip is contained in the same scope as flag, it may access the flag of A. In the same way, the procedure flop may access the flag of B. The two variables named flag are different entities, though, since they are contained in different classes.
Like structures, classes are merely templates for data objects. They describe the layout of a data structure plus a set of methods which may be used to manipulate the structure. The size of a class is computed in the same way as the size of a structure: it is equal to the sum of the sizes of all class members. In expressions, the name of a class is a constant evaluating to the size of the class. Classes without any instance variables have a size of one machine word.
The only way of changing the state of a class from the outside is to send it a message. T3X supports a simplified form of methods called class constants. Class constants may be thought of as 'lightweight' methods returning a constant value. They allow values and structures to be exported without having to send a full message. OO systems which do not allow the state of an object to be changed without sending a message are said to employ strict encapsulation. Strict encapsulation in T3X is illustrated in the following figure.
+--------------------------------------------+
|                                            |
|               M e t h o d s                |
|                                            |
|              +--------------+              |
|              |  Variables   |              |
|              +--------------+              |
|              |  Constants   |              |
|              +--------------+              |
|--------------|  Structures  |--------------|
|              +--------------+              |
|              |  Procedures  |              |
|              +--------------+              |
|              |   Objects    |              |
|              +--------------+              |
|                                            |
|       C l a s s   C o n s t a n t s        |
|                                            |
+--------------------------------------------+
Fig.2 Encapsulation
Objects are used to instantiate classes. An OBJECT statement is to a CLASS definition what a VAR statement is to a STRUCT definition. While the class defines the layout of an object, the OBJECT statement actually creates an object in memory. The general form of the object definition is as follows:
OBJECT an_object[a_class], ... ;
Any number of objects may be defined in a single OBJECT statement. Each class may be instantiated any number of times and different classes may be instantiated in the same statement. The name of the object to define is specified before the square brackets and the class of the object inside of the brackets:
OBJECT str[string];
creates an object of the class string named str.
An object may be a factor in an expression. It evaluates to the address of its first member. The notations
objectname
and
@objectname
are equivalent.
When creating multiple instances of a class, only the data definitions of the class are instantiated. The methods of a class belong to the class rather than the object. They are created when a class is declared.
The only way to alter the state of an object is to send it a message. Therefore, the state of each object is - ideally - completely independent from the states of other objects, even if they belong to the same class. In a hybrid language like T3X, however, the procedures of a class may change data objects defined in the global scope. Changing a global object from within an object changes the state of all other objects of the same class (and maybe even others). Therefore, this technique is deprecated. Of course, there are situations where an object has to change the global state, for example when performing input/output operations. Classes defining such objects are said to have side effects.
In order to create an instance of a class A inside of a class B, the class B must require A. In this case, B is also said to depend on A. The simplest scenario contains two classes which are defined inside of the same file:
CLASS a()
    ! definitions of A
END

CLASS b(a)
    OBJECT xa[a];
END
Since B instantiates A, it requires A. A class is required by including its name in the dependency list of the class header of the dependent class. Requiring a class has two effects:
Things get a little more complex if the dependent class and the required class are located in different files. Since class names are contained in the global scope, they are lost as soon as the compiler has finished the translation of the file they are contained in. Hence the required class would be unknown when the compiler translates the file containing the dependent class. To allow classes to be located in different files, an additional level above the global scope is added. It is called the public scope and it persists even when the compiler finishes.
To add a class to the public scope, two steps are necessary. First, the file must get a module header which names the file. A MODULE statement has the following general form:
MODULE module_name (required, ...);
The module name specified in the module header must be the same as the actual name of the file containing the module (but not including the .t suffix). If, for example, the class A is located in a file named tools.t, the module header would look like this:
MODULE tools();
The parentheses after the module name serve the same purpose as in class definitions: they delimit a dependency list. It may be ignored for now. The module header allows the compiler to locate the definition of a class, even if multiple classes are contained in a single file. (If files were named after classes, only a single class could be included per file.)
The second step required to export a class to the public level is to prefix its class header with the keyword PUBLIC:
PUBLIC CLASS a()
The following example shows the contents of two files file_a (containing the required class A) and file_b (containing the dependent class B). When compiling first file_a, A is exported to the public level. When compiling file_b, A is imported from the public level when it is required by B.
MODULE file_a();            MODULE file_b();

PUBLIC CLASS a()            CLASS b(a)
    ! definitions               OBJECT xa[a];
END                         END
Another purpose of the MODULE statement is to provide an interface between the procedural and the object oriented parts of T3X. If only classes could require classes, it would be impossible to instantiate a class inside of a T3X program, since the main program is procedural. To instantiate a class in a procedural program, it is added to the dependency list of the module wishing to instantiate the class. It does not matter whether the required class is a public class contained in a different module or a 'private' class contained in the same module. In the latter case, however, the MODULE statement must be located after the class definition. The next example shows a procedural program sending a message to a class.
CLASS A()
    PUBLIC m() DO
        t.write(T3X.SYSOUT, "A: received m.\n", 15);
    END
END

MODULE main(A);

DO
    OBJECT xa[A];
    xa.m();
END
Messages are used to alter the state of an object from the outside. Basically, procedural programs are sets of procedures calling each other. An object oriented program is a set of objects sending messages to each other. Sending a message to an object activates a public procedure defined in the class of the object. A method definition looks like a procedure definition with the keyword PUBLIC attached:
PUBLIC procname(arguments) statement
Method definitions are only valid in class-level scopes.
A message is sent to an object using the syntax
objectname.methodname(arguments)
Messages may be factors in expressions or standalone statements. When used as statements, they must be terminated with a semicolon:
objectname.methodname(arguments);
The arguments of a method are passed in the same way as the arguments of a procedure and like a procedure, a method returns a value. The difference between an 'ordinary' procedure and a method is that a method changes the instance context upon entry. The instance context is the set of data objects defined in a class. It is comparable to local contexts of procedures: when a procedure is entered, it creates a new local scope and when it leaves, it restores the caller's context. Unlike a local scope, the instance context is persistent, though. Therefore, methods do not create a new instance context, but just activate an existing context. The caller's context is saved upon entry (usually on the runtime stack) and restored when the method returns. Since instance contexts are persistent, the changes performed by methods are permanent.
Each object has its own instance context, which is divided into the data objects declared in its class. Methods use the instance context to access class-level data. By shifting the instance context upon entry, each object accesses only its own private data. The instance context may be thought of as a multiplexer. The principle is illustrated in the following figure:
    +------------+
    |  Class A   |
    |------------|             +----------------+
    |  Method m  | ----------> |    Access V    |
    |    ...     |             +----------------+
    | Variable V |                     |
    +------------+                     V
                             +--------------------+
                             |  Instance Context  |
                             +--------------------+
                                  |          |
                         ,--------'          '--------,
                         |                            |
                         V                            V
                 +---------------+            +---------------+
                 |   V of x[A]   |            |   V of z[A]   |
                 +---------------+            +---------------+
                 |  Object x[A]  |            |  Object z[A]  |
                 +---------------+            +---------------+

Fig.3 Multiplexing Method Applications
In this figure, class A defines a method m which accesses the instance variable V which is also declared in A. The instance of V accessed by m depends on the object the message is sent to. Sending x.m(), for example, results in accessing the V of x and sending z.m() results in accessing the V of z.
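The multiplexing shown in Fig.3 can be modelled in C by handing the instance context to the method as an explicit pointer. This is only an illustrative sketch: the names `struct A` and `A_m`, and the choice to let `m` increment `V`, are assumptions made for the example and are not part of T3X.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C model of class A: one instance variable V. */
struct A {
    int V;
};

/* Model of method m. The first parameter plays the role of the
 * instance context: it selects whose V is accessed. */
int A_m(struct A *self)
{
    assert(self != NULL);
    self->V = self->V + 1;   /* the change persists in the object */
    return self->V;
}
```

Calling `A_m(&x)` changes only the `V` of `x`, and calling `A_m(&z)` changes only the `V` of `z`, which corresponds to sending `x.m()` and `z.m()`.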
In T3X, the current instance (the currently active instance context) can be referred to using the symbol SELF. SELF is a pseudo-variable which always refers to the object owning the current instance context. Therefore, SELF may only be used inside of procedures local to classes. Using SELF, an object may send a message to itself, as shown in the next example:
    CLASS math()
        PUBLIC prod(i, j) DO VAR p;
            p := 1;
            FOR (i=i, j+1) p := p*i;
            RETURN p;
        END
        PUBLIC fac(n) RETURN self.prod(1, n);
    END
In this example, the method fac of the class math uses the method prod of the same class to express the factorial of n by sending the message prod(1,n) to itself. Of course, methods may recurse, too, since they are basically procedures. Therefore, fac could as well be defined this way:
PUBLIC fac(n) RETURN n<1-> 1: self.fac(n-1) * n;
Since objects are basically vectors, they may be passed to procedures (or methods) as parameters. By passing an object to a procedure, however, the object loses its type information, since the pointer to the object is stored in a typeless argument variable by the callee. To be able to send messages to such objects, the SEND operator is introduced. Its general form is
SEND(variable, classname, methodname(arguments))
This operator sends the message methodname(arguments) to the object of the class classname pointed to by variable. For example, the statement
y := m.fac(5);
is equal to
pm := @m; y := SEND(pm, math, fac(5));
Constants may be public as well:
PUBLIC CONST symbol = constant_expression;
Such constants can be accessed from outside the class by sending a special form of a message to the class which defines the constant. Given the constant MAXLEN of the class STRING:
    PUBLIC CLASS STRING()
        PUBLIC CONST MAXLEN = 32767;
    END
the expression
STRING.MAXLEN
could be used to access the value of MAXLEN. So the general form of the class constant access is
classname.constname
The 'classic' way of exporting such a constant would be to define a method returning the constant:
PUBLIC maxlen() RETURN 32767;
Class constants have the advantage of saving a procedure call. Since their values are known at compile time, they can also be used in constant expression contexts.
Structures can be exported in the same way as constants:
PUBLIC STRUCT structname = member1, ..., memberN;
Public structures are useful for objects requiring arguments in structured form or returning values in this form. For example, the SYSTEM.STAT function (the function STAT of the SYSTEM class) returns a structure containing information about a specific file. In addition, the SYSTEM class provides a public structure describing the layout of the structure returned by the STAT function. This structure allows programs using the SYSTEM.STAT function to decompose the returned information.
Since class constants cannot be altered, they do not weaken the strict encapsulation principle. Public structures are an interface and an implementation at the same time. Since the same PUBLIC STRUCT statement is used to define the same structure internally and externally, the interface changes automatically when the implementation changes.
An interface class is a class depending on an external object file called an extension object. Extension objects may be linked against the Virtual Tcode Machine or against native code generated by the T3X compiler.
The declaration of an interface class begins with an ICLASS statement. This statement has the following general form:
ICLASS class_name ("extension_object_name")
Class_name is the name of the class to declare and extension_object_name is the name of the object file holding the code of the interface class. The name of the object file should be specified without any suffixes such as ".o" or ".lib".
Like other class contexts, interface class contexts are terminated with the keyword END.
In addition to the declarations allowed in class contexts, interface classes may contain so called interface declarations. An interface declaration describes an interface procedure contained in an extension object. Any number of interface procedures may be declared in a single IDECL statement:
IDECL proc_name(type, call_map), ...;
Proc_name is the name of an interface method to declare and type is the number of arguments of that procedure. Call_map describes the types of the parameters passed to the interface procedure. It is a bit map where each bit is associated with a procedure argument as outlined in the following table.
Argument Number | Call Map Value | Argument Number | Call Map Value |
---|---|---|---|
1 | 0x0001 | 9 | 0x0100 |
2 | 0x0002 | 10 | 0x0200 |
3 | 0x0004 | 11 | 0x0400 |
4 | 0x0008 | 12 | 0x0800 |
5 | 0x0010 | 13 | 0x1000 |
6 | 0x0020 | 14 | 0x2000 |
7 | 0x0040 | 15 | 0x4000 |
8 | 0x0080 | 16 | 0x8000 |
Each bit with a zero value denotes an atomic (numeric) argument and each bit with a one value denotes a vector argument (a pointer in the argument list of a corresponding C function). For example, the C function _spawn() of the extension object system could be declared as follows:
int _spawn S3(char *prog, char **args, XCELL wait);
The S3() macro is used to generate lists of three arguments. Similar macros exist for declarations of functions with up to 16 arguments. They are called S0(), ..., S16(). The Sn() macros reverse the supplied argument lists to meet the T3X calling conventions and supply a dummy argument to intercept the additional instance context parameter which T3X programs pass to each method.
Do use the Sn() macros. Otherwise, interface procedures will not work.
XCELL is a macro expanding to the type of a signed cell (integer) of the same size as a pointer.
All macros discussed here are defined in the 'txx.h' header file. This header also contains some other macros which are useful for implementing interface procedures.
The first and second argument of _spawn() are pointers and so the bits 0x0001 and 0x0002 in the call map have to be set. This leads to the following interface declaration:
    ICLASS system("system")
        ...
        IDECL _spawn(3,0x0003);
        ...
    END
The call map is required to translate vector addresses to native pointer size before passing them to interface procedures. Call maps are limited to 16 bits, so interface procedures may not have more than 16 arguments.
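The rule from the table above (argument n corresponds to bit n-1 of the call map) can be written down as a small C helper. This is a sketch for illustration only; `build_call_map` is a hypothetical name and not part of 'txx.h'.

```c
#include <assert.h>

/* Build a call map for an interface procedure.
 * is_vector[i] is non-zero when argument i+1 is a vector (pointer)
 * and zero when it is an atomic (numeric) argument. */
unsigned build_call_map(const int *is_vector, int nargs)
{
    unsigned map = 0;
    int i;

    assert(nargs >= 0 && nargs <= 16);   /* call maps hold 16 bits */
    for (i = 0; i < nargs; i++)
        if (is_vector[i])
            map |= 1u << i;              /* argument 1 -> 0x0001 */
    return map;
}
```

For _spawn(), whose first two arguments are pointers and whose third is numeric, the flags {1, 1, 0} give the value 0x0003 used in the declaration above.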
NOTE: when passing vectors of pointers to interface procedures, the vector of pointers must be converted to native pointer size, too. The T3X core method T3X.CVALIST performs this operation.
To the T3X programmer, an interface procedure is a method of an interface class. In order to invoke an interface procedure, the interface class is instantiated and a message is sent to the resulting object:
    ICLASS world("world_code")
        IDECL hello(1,1);
    END
    MODULE test(world);
    OBJECT aworld[world];
    DO
        aworld.hello("Hello, world!\n");
    END
The HELLO method used in the previous subsection could be implemented as follows:
    /* world_code.c */
    #include <txx.h>
    #include <string.h>
    #include <unistd.h>   /* declares write() */
    int world_hello S2(XCELL n, char *s) {
        /* write the string s to standard output n times */
        while (n--) write(1, s, strlen(s));
        return 0;
    }
The proper arguments for compiling this code to an extension object depend on the host environment. However, the following requirements must be met:
To compile the above interface code to an extension object on a generic Unix system, a command like this may be used:
cc -o world_code.o -I /usr/local/include -DLIBRARY -c world_code.c
When compiling an object to be linked against TXX, -DLIBRARY should be omitted.
Scoping rules define the contexts in which symbols are valid and under which conditions they may be redefined. In T3X, there are five different contexts and very strict and simple redefinition rules:
(*) Technical remark: To purge the public symbol table, the storage holding the table (usually a file) must be cleared (eg by deleting the file).
The following redefinition rules apply:
(*) This happens when a module containing public classes is recompiled. In this case, the definition in the public context is silently updated.
    +-- Public ------------------------------------+
    | +-- Global --------------------------------+ |
    | | +-- Class -----------------------------+ | |
    | | | +-- Procedure ---------------------+ | | |
    | | | | +-- Block**0 ------------------+ | | | |
    | | | | | +--- ... ------------------+ | | | | |
    | | | | | |                          | | | | | |
    | | | | | | +-- Block**N ----------+ | | | | | |
    | | | | | | |                      | | | | | | |
    | | | | | | |                      | | | | | | |
    | | | | | | +----------------------+ | | | | | |
    | | | | | |                          | | | | | |
    | | | | | +--------------------------+ | | | | |
    | | | | +------------------------------+ | | | |
    | | | +----------------------------------+ | | |
    | | +--------------------------------------+ | |
    | +------------------------------------------+ |
    +----------------------------------------------+

Fig.4 Scopes (overview)
At the end of a class context, all names contained in that class context as well as the classname itself will be removed from the global context. However, the class name will be memorized at a different location and may never be reused in the same program. This is a workaround for an ugly inconsistency resulting from an interference with the module system. Imagine the following situation:
    CLASS A() ... END
    A() DO ... END
    CLASS B(A) ... END
In this case, the class B would depend upon the procedure A, which would be semantically incorrect. On the other hand, if the name A persisted, the following code would be correct:
    CLASS A() ... END
    CLASS B()
        OBJECT XA[A];
    END
In this case, class B could use class A without being dependent on it, which would also be semantically incorrect, because the module extension requires that B be dependent on A in this case. Therefore, the (admittedly brute force) solution of not permitting the reuse of (deleted) class names has been chosen.
In T3X, there are only very few type checking mechanisms to detect things like assignments to constants, calls of non-procedures, etc. Including the object meta type, T3X has six different types on which only specific operations are allowed. Type-related semantic checks carried out by the compiler intercept the following errors:
In a truly object-oriented language with first class objects, however, more extensive type checking would be required. When, for example, an object is passed to or returned by a function, the passed value becomes a first class value which may be assigned to a variable. No type information would be associated with such a variable in T3X. Therefore, the compiler could not determine which messages this object could answer and which not. This problem has been solved by not allowing messages to be sent to any type of data object other than an object. The SEND operator may be used to circumvent this restriction. Like the CALL operator, SEND should be used with special care.
The following table contains an overview of the operations which may be applied to each type or entity:
Type | Evaluate | Change | Call | Send to |
---|---|---|---|---|
Constant | + | - | - | - |
Variable | + | + | (+) | (*) |
Vector (#) | + | - | - | - |
Procedure | + | - | + | - |
Class | + | - | - | - |
Object | + | - | - | + |
(+) Using the CALL operator
(*) Using the SEND operator
(#) A structure is a combination of a vector and a set of constants
In this context, evaluating an entity means to compute its value.
This is the value assigned to a constant, the content of an atomic variable,
the size of a class or structure, and the address of every other kind
of entity (vector, packed vector, procedure, object).
To change an entity means to assign a new value to it.
This operation is limited to atomic variables.
Calling an entity denotes a procedure call and excludes sending
a message.
To send a message means to apply a method to an object.
Meta commands are used to control the behaviour of the compiler. No code will be generated for meta commands. Meta commands do not belong to the T3X language itself and different implementations may provide different sets of meta commands. The commands described in this section should be present in any implementation of the T3X language, though.
All meta commands begin with a hash sign (#) and like all other statements, they are terminated with a semicolon. They may occur at any place where a statement or a declaration (either local or global) is expected, but not inside of statements or declarations. The following meta commands exist:
#CLASSPATH "path";
Specify an alternative path for searching class files. Normally, the translator searches for class files in the current working directory and in some compiled-in paths. When bootstrapping the compiler or running it in some other non-standard environment, it may be necessary to specify the class path explicitly using this command. Only one path may be specified. When using multiple #CLASSPATH commands, only the last one will take effect. This solution is a makeshift and will probably be combined with a more flexible technique in the future. When a class path is specified using this meta command, it will take precedence over all other class paths.
#DEBUG;
Turn on emission of debug information like source code line numbers
and variable names and addresses. When this option is turned on, the
T3X translator will generate a LINE instruction at
the beginning of each statement and LSYM, ISYM, or
GSYM instructions for each local variable, instance variable, or
global variable.
Debug information is intended to be used by a source level debugger.
Each program has an initial entry point where the execution begins at run time. In T3X, the entry point is a compound statement at the top level which does not belong to any procedure context. This compound statement is mandatory and it must always be the last definition in the entire program. Consequently, the minimum valid T3X program is
DO END
The main procedure, like any other compound statement, may declare its own local symbols. Since it has no name, it cannot recurse, though. RETURN may not be used in it, because there is no procedure to return to.
When execution reaches the end of the main procedure, the program terminates and delivers a zero return code back to the calling process.
Most of this chapter has been automatically generated from the structured document 'classes.sd' which is contained in the T3X Release 7 distribution.
There are three different types of classes: the core class, native classes, and interface classes. To the programmer, there is no difference between these types, but when implementing additional runtime classes, it is important to know the differences.
The T3X core class is written in a system-level language like C or assembly. Calling functions of this class is done through an internal jump table. The core class cannot be extended or modified.
The native class is the most common type. Such classes are written in T3X using the techniques described in the section about the object-oriented programming and modules. The major part of the runtime system is implemented this way. There is no difference between a native class and a user-defined module. Programs are linked against native classes using the Tcode linker.
Interface classes allow low-level (LL) functions to be added to a program. The LL functions themselves are written in a language suitable for systems-level programming. The foreign language code is compiled to a relocatable object or library. Additionally, a native interface class must be defined to describe the functions contained in the object holding the code.
A runtime class is linked into a program by requiring it either at class level or at module level. For example,
class foo(t3x, iostream)
would be the header of a class requiring the T3X core class and the iostream class, and the statement
module bar(t3x, char, string);
would require the core class plus the char and string classes.
The following runtime classes belong to the T3X environment:
Name | Type | Description |
---|---|---|
t3x | core | basic routines |
char | native | character manipulation |
iostream | native | buffered I/O-streams |
memory | native | dynamic memory management |
string | native | string manipulation |
system | interface | (mostly) portable system calls |
ttyctl | interface | terminal control |
xmem | interface | external memory access |
These classes will be explained in detail in the following sections.
OBJECT T[T3X];
The T3X class contains some procedures which provide access to the most common operating system services, like opening, reading, writing, and erasing files, copying and comparing memory regions, receiving arguments, etc. It also contains the dynamic loader interface, if available. The class requires no explicit initialization or shutdown.
The T3X class does not contain any variables. Therefore, it is sufficient to create a single instance per module.
T.BPW() ! => Num
Return the number of Bytes Per Word on the host machine. When running a Tcode program, this value will always be 2, regardless of the host environment. When called by a native machine program, the procedure will return the actual machine word size of the target machine.
T.CLOSE(fdesc) ! Fdesc => Num
Close the file descriptor 'fdesc'. To obtain a valid file descriptor, use T.OPEN.
T.CLOSE returns 0 on success and a negative value in case of an error.
See also: OPEN, SYS.DUP, SYS.DUP2, SYS.PIPE, READ, WRITE
T.CVALIST(n, bmap, ilist, olist) ! Num,Num,Vec,Vec => 0
Convert a Tcode argument list into a native argument list. Since the Tcode machine is a 16-bit architecture, argument lists may need to be extended before passing them to machine code procedures in 32-bit or 64-bit environments.
Extending an argument list from 16 to 32 bits (or whatever is appropriate on the host system) is done by zero-extending all values in the argument vector to the size of a generic pointer on the host machine. Additionally, the offset of the Tcode machine's data area will be added to pointer type arguments. The bitmap 'bmap' specifies the type of each argument.
Argument lists may not be longer than 16 elements (plus the trailing null).
'N' specifies the number of elements in the argument list 'ilist'. If a trailing null is required, it must be counted, too. 'Ilist' is a vector containing the arguments. 'Bmap' is a bit field where a bit is set when the argument at the corresponding offset is a pointer: if bit #0 is set (bmap & 1), ilist[0] is a pointer; if bit #1 is set (bmap & 2), ilist[1] is a pointer; and so on. 'Olist' will be filled with the extended argument list. It must provide up to 17 times the size of a generic pointer in bytes (which is usually equal to 17 machine words on the host system).
CVALIST may relocate 'olist' if it is not aligned to a native machine word boundary. It returns the number of bytes 'olist' was moved. This number should be used to compute the new address of olist:
    offset := t.cvalist(n, bmap, ilist, olist);
    olist := @olist::offset;
When a negative count is supplied, the effect of T.CVALIST is reversed. In this case, each member of 'olist' will be copied to 'ilist' and truncated to 16 bits. No pointers may be processed this way. The 'Bmap' argument is ignored when converting argument lists in this direction.
T.CVALIST is used to prepare argument lists for passing them to dynamically loaded procedures in Tcode programs. When a T3X program is run in an environment where the size of a pointer is equal to the size of a T3X machine word, T.CVALIST simply copies the argument vector.
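The forward conversion described above (zero-extend every cell, then add the data area offset to the cells flagged in 'bmap') can be sketched in C as follows. This is a hypothetical model for illustration, not the actual implementation: the function name `extend_arglist` and the `data_base` parameter are assumptions, and the alignment handling of 'olist' is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the forward conversion done by T.CVALIST.
 * ilist:     n 16-bit Tcode cells
 * olist:     n pointer-sized cells receiving the result
 * data_base: assumed address of the Tcode machine's data area */
void extend_arglist(int n, uint32_t bmap, const uint16_t *ilist,
                    uintptr_t *olist, uintptr_t data_base)
{
    int i;

    assert(n >= 0 && n <= 17);           /* 16 arguments plus a null */
    for (i = 0; i < n; i++) {
        olist[i] = (uintptr_t)ilist[i];  /* zero-extend the cell */
        if ((bmap >> i) & 1u)
            olist[i] += data_base;       /* relocate pointer arguments */
    }
}
```

A trailing null is never flagged in the bitmap, so it stays zero after the conversion.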
SYS.SPAWN uses T.CVALIST internally. Most programs will not require its use.
See also: SYS.SPAWN
T.GETARG(n, buffer, size) ! Num,Str,Num => Num
Retrieve the 'n'th command line argument and store its first 'size'-1 characters in 'buffer'. If the length K of the requested argument is less than 'size'-1, copy only K characters. In either case, append a NUL character.
T.GETARG returns the number of characters copied. A return code of -1 indicates that a non-existing argument has been requested ('n' is too big).
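The truncation rule can be modelled in a few lines of C. The sketch below only shows how the copy length and the return value relate to 'size'; `copy_arg` is a hypothetical name, and the requirement that 'size' be at least 1 is an assumption of this example.

```c
#include <assert.h>
#include <string.h>

/* Store at most size-1 characters of arg in buffer, append a NUL,
 * and return the number of characters copied. */
int copy_arg(const char *arg, char *buffer, int size)
{
    int k = (int)strlen(arg);

    assert(size > 0);                /* room for the NUL is required */
    if (k > size - 1)
        k = size - 1;                /* truncate to fit the buffer */
    memcpy(buffer, arg, (size_t)k);
    buffer[k] = '\0';
    return k;
}
```

With a buffer of size 4, a five-character argument is cut to three characters, while a two-character argument is copied completely.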
See also: GETENV
T.GETENV(name, buffer, size) ! Str,Str,Num => Num
Retrieve the value of the environment variable 'name' and store up to 'size'-1 characters of its value in 'buffer'. Append a NUL character to the text in 'buffer'.
T.GETENV returns the number of characters copied. A return code of -1 indicates that a non-existing variable name has been specified.
See also: GETARG
T.MEMCOMP(r1, r2, len) ! Bvec,Bvec,Num => Num
Compare up to 'len' bytes of the regions 'r1' and 'r2'. When a mismatch is found during the comparison, the procedure returns
r1::p - r2::p
where 'p' is the position of the mismatch. When 'len' bytes have been compared without encountering a mismatch, zero is returned.
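The return value can be illustrated with a C model. It assumes that bytes are compared as unsigned values; `memcomp_model` is a hypothetical name used only for this sketch.

```c
#include <assert.h>

/* Return the difference of the first mismatching bytes of r1 and r2,
 * or zero when the first len bytes are equal. */
int memcomp_model(const unsigned char *r1, const unsigned char *r2, int len)
{
    int p;

    assert(len >= 0);
    for (p = 0; p < len; p++)
        if (r1[p] != r2[p])
            return (int)r1[p] - (int)r2[p];   /* r1::p - r2::p */
    return 0;
}
```

Unlike the C library's memcmp(), which only promises the sign of its result, this model returns the exact byte difference given in the description above.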
See also: MEMCOPY, MEMFILL, MEMSCAN
T.MEMCOPY(dest, src, len) ! Bvec,Bvec,Num => 0
Copy 'len' bytes from region 'src' to region 'dest'. The regions may overlap.
See also: MEMCOMP, MEMFILL, MEMSCAN
T.MEMFILL(region, val, len) ! Bvec,Num,Num => 0
Fill the first 'len' bytes of 'region' with the value of the least significant byte of 'val'.
See also: MEMCOMP, MEMCOPY, MEMSCAN
T.MEMSCAN(region, val, len) ! Bvec,Num,Num => Num
Scan the first 'len' bytes of 'region' for 'val'. If the scanned region contains 'val', return its offset (0...len-1) and otherwise return -1.
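A C model of the scan is shown below. It is a sketch under one assumption: only the least significant byte of 'val' is compared, as T.MEMFILL does when filling. The name `memscan_model` is hypothetical.

```c
#include <assert.h>

/* Return the offset (0...len-1) of the first byte equal to val,
 * or -1 when the first len bytes do not contain it. */
int memscan_model(const unsigned char *region, int val, int len)
{
    int i;

    assert(len >= 0);
    for (i = 0; i < len; i++)
        if (region[i] == (unsigned char)val)
            return i;
    return -1;
}
```

Since offset 0 is a valid result, callers have to test for -1 rather than for a false value.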
See also: MEMCOMP, MEMCOPY, MEMFILL
T.NEWLINE(s) ! Str => Str
Write a system-dependent newline sequence to the string 's'. The sequence will move the cursor to the beginning of a new line when sent to terminal screens (in cooked mode). The sequence written to 's' will not be longer than four characters.
T.NEWLINE returns a pointer to 's'.
See also: WRITE, TTY.MODE
T.OPEN(path, mode) ! Str,Num => Fdesc
Open the file whose path is specified in 'path' in the given 'mode'. The exact format of 'path' depends on the operating system. The following modes exist:
Mode constant | ReadOK | WriteOK | Create |
---|---|---|---|
T3X.OREAD | Yes | No | No |
T3X.OWRITE | No | Yes | Yes |
T3X.ORDWR | Yes | Yes | No |
T3X.OAPPND | Yes | Yes | No |
When OWRITE is specified and a file with the given name already exists, it will be deleted first.
OAPPND is like ORDWR, but the file pointer will be positioned at the end of the file so that T.WRITE will append its output to the file.
T.OPEN returns a file descriptor for accessing 'path' on success and a negative number in case of an error.
When a T3X program starts up, there already are some open file descriptors, which are by default connected to the user's terminal:
Name | Descriptor | Mode |
---|---|---|
T3X.SYSIN | standard input | read-only |
T3X.SYSOUT | standard output | write-only |
T3X.SYSERR | standard error | write-only |
See also: CLOSE, READ, WRITE, SEEK
T.READ(fdesc, buffer, count) ! Fdesc,Vec,Num => Num
Read up to 'count' characters from the file descriptor 'fdesc' into 'buffer'. Return the number of characters read.
A return value less than zero indicates a severe error. A return value which is less than 'count' usually indicates that the end of the input has been reached.
When reading line oriented devices, such as terminals, a return value below 'count' may indicate the end of a line. In this case, a zero value means that the input stream is exhausted.
For a summary of standard descriptors (system input and output), see T.OPEN.
See also: OPEN, CLOSE, SYS.PIPE, WRITE
T.REMOVE(path) ! Str => Num
Remove the directory entry specified in 'path'. The exact format of 'path' depends on the operating system.
On systems supporting multiple links (names) for a single file, this procedure will only remove the specified link. On such systems, other links to the file may still be used to access the file. Only when the last link is removed will the file become inaccessible. On other systems, T.REMOVE deletes the given file immediately.
T.REMOVE returns zero, if the directory entry could be successfully deleted and otherwise a negative value.
See also: SYS.OPENDIR, RENAME
T.RENAME(old, new) ! Str,Str => Num
Rename the directory entry whose path is specified in 'old' to 'new'. 'Old' and 'new' may describe names contained in different paths. In this case, the directory entry will be 'moved' to the directory specified in 'new'. The old and the new name of the directory entry must both reside on the same physical device.
T.RENAME returns zero upon success and a negative value in case of an error.
See also: REMOVE
T.SEEK(fdesc, where, origin) ! Fdesc,Num,Num => Num
Move the file pointer associated with the file descriptor 'fdesc' to a new position. 'Where' specifies the desired position and 'origin' specifies where the motion shall start. The following origins are possible:
Constant | Origin | Distance |
---|---|---|
T3X.SEEK_SET | Beginning of the file | +where |
T3X.SEEK_FWD | Current position | +where |
T3X.SEEK_END | End of the file | -where |
T3X.SEEK_BCK | Current position | -where |
SEEK_SET and SEEK_FWD move the file pointer forward, SEEK_END and SEEK_BCK move it backward. In either case, 'where' is an unsigned value so that offsets may range from 0 to 65535 bytes.
T.SEEK returns zero upon success and -1 in case of an error.
The SEEK operation may be undefined on some devices and pipes.
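The table of origins translates into a simple position calculation, sketched here in C. The enumeration values are hypothetical stand-ins for the T3X.SEEK_* constants, whose real values are not given in this text, and `seek_target` is a name invented for the example.

```c
#include <assert.h>

/* Hypothetical stand-ins for T3X.SEEK_SET, SEEK_FWD, SEEK_END, SEEK_BCK. */
enum seek_origin { ORIGIN_SET, ORIGIN_FWD, ORIGIN_END, ORIGIN_BCK };

/* Compute the position a seek would aim for.
 * cur: current position, size: file size, where: unsigned distance. */
long seek_target(long cur, long size, unsigned where, enum seek_origin origin)
{
    assert(where <= 65535u);                      /* 16-bit unsigned offset */
    switch (origin) {
    case ORIGIN_SET: return (long)where;          /* beginning + where */
    case ORIGIN_FWD: return cur + (long)where;    /* current + where   */
    case ORIGIN_END: return size - (long)where;   /* end - where       */
    case ORIGIN_BCK: return cur - (long)where;    /* current - where   */
    }
    return -1;                                    /* unknown origin */
}
```

Because 'where' is unsigned, the direction of the motion is chosen by the origin alone, which is why separate forward and backward origins exist for the current position.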
See also: OPEN, CLOSE, READ, WRITE
T.WRITE(fdesc, buffer, count) ! Fdesc,Vec,Num => Num
Write 'count' characters from 'buffer' to the file descriptor 'fdesc'. Return the number of characters actually written.
A return value which is less than 'count' indicates a severe error (such as insufficient space left on a device).
For a summary of standard descriptors (system input and output), see T.OPEN.
See also: OPEN, CLOSE, SYS.PIPE, READ
OBJECT CHR[CHAR]; CHR.INIT();
The CHAR class contains functions for determining character types and converting characters. They all operate on ASCII values.
This class must be initialized by calling CHR.INIT before it can be used. An explicit shutdown is not required.
The CHAR class does not contain any variables. Therefore, it is sufficient to create a single instance per module.
CHR.INIT() ! => 0
Initialize the character class by loading an internal pointer with the character type map.
See also: MAP
CHR.ALPHA(c) ! Char => Num
Return TRUE (-1), if 'c' is an alphabetic character (in the range 'a'...'z' or 'A'...'Z'). Otherwise return FALSE (0).
CHR.ASCII(c) ! Char => Num
Return TRUE (-1), if 'c' is a valid ASCII value (in the range 0...127). Otherwise return FALSE (0).
CHR.CNTRL(c) ! Char => Num
Return TRUE (-1), if 'c' is a control character (in the range 0...31 or equal to 127). Otherwise return FALSE (0).
CHR.DIGIT(c) ! Char => Num
Return TRUE (-1), if 'c' is a decimal digit (in the range '0'...'9'). Otherwise return FALSE (0).
CHR.LCASE(c) ! Char => Char
If the character 'c' is an upper case character (see CHR.UPPER), convert it to lower case and return it. Otherwise, return it unchanged.
See also: UCASE
CHR.LOWER(c) ! Char => Num
Return TRUE (-1), if 'c' is a lower case letter (in the range 'a'...'z'). Otherwise return FALSE (0).
CHR.MAP() ! => Vec
Return the character description map used internally. This map is a vector of 128 words containing flags for describing each ASCII character. It can be used to implement faster character checks. For example,
IF (CHR.LOWER(c)) ...
can be written as
    chrmap := CHR.MAP();
    ...
    IF (chrmap[c] & (CHAR.C_UPPER|CHAR.C_ALPHA) = CHAR.C_ALPHA) ...
which saves a procedure call each time a character is tested for being lower case.
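The flag test used in this expression can be tried out in C. The sketch assumes that lower case letters carry the alphabetic flag without the upper case flag, as the expression suggests. The flag values and the helper `char_flags`, which stands in for a lookup in the map, are hypothetical.

```c
#include <assert.h>

/* Hypothetical flag values; the actual values of CHAR.C_ALPHA and
 * CHAR.C_UPPER are not stated in this text. */
enum { C_ALPHA = 1, C_UPPER = 2 };

/* Flag word for one ASCII character, standing in for a map entry. */
int char_flags(int c)
{
    assert(c >= 0 && c <= 127);      /* the map covers 128 characters */
    if (c >= 'A' && c <= 'Z') return C_ALPHA | C_UPPER;
    if (c >= 'a' && c <= 'z') return C_ALPHA;
    return 0;
}

/* Lower case means: alphabetic, but not flagged as upper case. */
int is_lower(int c)
{
    return (char_flags(c) & (C_UPPER | C_ALPHA)) == C_ALPHA;
}
```

Masking with both flags and comparing against the alphabetic flag alone rejects upper case letters and non-letters in a single test. Note that C needs the extra parentheses around the masking step.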
The following public constants are defined in the CHAR class and can be used for testing character flags:
Flag | Property |
---|---|
CHAR.C_ALPHA | alphabetic |
CHAR.C_UPPER | upper case |
CHAR.C_DIGIT | decimal digit |
CHAR.C_SPACE | white space |
CHAR.C_CNTRL | control character |
CHR.SPACE(c) ! Char => Num
Return TRUE (-1), if 'c' is a space character (HT(9), LF(10), VT(11), FF(12), CR(13)). Otherwise return FALSE (0).
CHR.UCASE(c) ! Char => Char
If the character 'c' is a lower case character (see CHR.LOWER), convert it to upper case and return it. Otherwise, return it unchanged.
See also: LCASE
CHR.UPPER(c) ! Char => Num
Return TRUE (-1), if 'c' is an upper case letter (in the range 'A'...'Z'). Otherwise return FALSE (0).
OBJECT IOS[IOSTREAM];
The IOSTREAM class implements fully buffered I/O streams.
I/O streams provide a string/character-oriented interface to the programmer while performing block-oriented I/O to the file or device associated with a stream. This way, they combine the speed of block-I/O with the flexibility of character-based I/O.
This class contains the I/O stream data structure and procedures for creating, opening, closing, reading, and writing streams.
A separate IOSTREAM object must be defined for each stream to be used in a program.
ios.CLOSE() ! => Num
Shut down the I/O stream 'ios' by first flushing its buffer and then closing the file associated with the stream. Flushing a buffer means to write pending output (if the stream has been written to) and to discard any pending input (if the stream is to be read from).
IOS.CLOSE returns zero, if the stream could be closed and otherwise -1. After successfully sending CLOSE, the receiving stream becomes invalid immediately and should no longer be accessed.
See also: CREATE, OPEN, FLUSH
ios.CREATE(fd, buffer, len, mode) ! Fdesc,Bvec,Num,Num => 0
Initialize the iostream 'ios' with the given parameters. 'Fd' is an open file descriptor which will be associated with the stream. 'Buffer' will be used for buffering read/write operations on the stream. 'Len' specifies the size of 'buffer' in characters. 'Mode' controls the operations allowed on 'ios'. The following flags may be used to build the mode value:
Mode constant | ReadOK | WriteOK | LF>CRLF | CRLF>LF |
---|---|---|---|---|
IOSTREAM.FREAD | Yes | No | - | - |
IOSTREAM.FWRITE | No | Yes | - | - |
IOSTREAM.FRDWR | Yes | Yes | - | - |
IOSTREAM.FKILLCR | - | - | - | Yes |
IOSTREAM.FADDCR | - | - | Yes | - |
IOSTREAM.FTRANS | - | - | Yes | Yes |
CRLF>LF denotes that each CR character found in an input stream will be silently discarded. This is useful when reading DOS-style ASCII text files. LF>CRLF means that a CR character will be added before each LF in the output stream. Since FADDCR has no effect on input and FKILLCR has no effect on output, FTRANS may be used safely on input as well as output streams.
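The two translations can be expressed as simple filters. The following C sketch only illustrates the character-level effect described above and says nothing about how the IOSTREAM class implements it. The names `kill_cr` and `add_cr` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* FKILLCR-style input filter: drop every CR. Returns the new length.
 * out must have room for len characters. */
size_t kill_cr(const char *in, size_t len, char *out)
{
    size_t i, n = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < len; i++)
        if (in[i] != '\r')
            out[n++] = in[i];
    return n;
}

/* FADDCR-style output filter: put a CR before every LF.
 * Returns the new length. out must have room for 2*len characters. */
size_t add_cr(const char *in, size_t len, char *out)
{
    size_t i, n = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < len; i++) {
        if (in[i] == '\n')
            out[n++] = '\r';
        out[n++] = in[i];
    }
    return n;
}
```

Each filter touches only one direction of the data flow, which matches the statement that FTRANS is safe on input as well as output streams.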
IOS.CREATE only initializes an IOSTREAM object with some data. It cannot fail and therefore always returns 0.
When using IOS.CREATE to create a stream for accessing standard file descriptors (such as T3X.SYSIN and T3X.SYSOUT), these streams should never be closed. IOS.FLUSH may be used to synchronize them.
See also: OPEN, CLOSE, FLUSH
ios.EOF() ! => Num
Return a flag indicating whether input has been exhausted on the stream 'ios'. When IOS.EOF returns TRUE (-1), no more input can be read from 'ios'. This is the case when the end of the associated input file has been reached or when an EOF character has been typed on a terminal.
See also: READ, READS, RDCH, RESET
ios.FLUSH() ! => Num
Flush the stream 'ios' and return a value indicating whether the operation was successful. Zero means success, -1 means failure.
Flushing an output stream means writing all pending data to the associated file. Flushing an input stream means discarding all pending input. The operation performed on a combined input/output stream depends on the type of the last operation performed on it (reading or writing).
See also: OPEN, CLOSE, READ, WRITE
ios.MOVE(offset, origin) ! Num,Num => Num
Move the file pointer of the file descriptor associated with 'ios' to a new position. The position is computed using the given 'offset' and 'origin'. 'Offset' is the number of bytes to move and 'origin' specifies where the motion shall begin. The following origin values are available:
Constant | Origin |
---|---|
IOSTREAM.SEEK_SET | the beginning of the file |
IOSTREAM.SEEK_FWD | the current position (move forward) |
IOSTREAM.SEEK_END | the end of the file |
IOSTREAM.SEEK_BCK | the current position (move backward) |
SEEK_SET and SEEK_FWD move the file pointer forward, SEEK_END and SEEK_BCK move it backward. In either case, 'offset' is an unsigned value so that offsets may range from 0 to 65535 bytes.
IOS.MOVE always flushes the stream buffer before changing the file pointer.
It returns zero upon success and -1 in case of an error.
See also: FLUSH, T.SEEK
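Because offsets are unsigned and the origin determines the direction, reading the tail of a file means seeking backward from SEEK_END. The sketch below tries to print the last 16 bytes of a file. The file name "input.txt", the module name, and the buffer sizes are assumptions for this example.

```t3x
! Sketch: print the last 16 bytes of a file (names are illustrative).
MODULE tail16(t3x, iostream);

OBJECT t[t3x], inp[iostream];

VAR ibuf::512, last::16;

DO VAR k;
    IF (inp.open("input.txt", ibuf, 512, IOSTREAM.FREAD) = 0) DO
        ! SEEK_END moves backward: 16 bytes before the end of file.
        IF (inp.move(16, IOSTREAM.SEEK_END) = 0) DO
            k := inp.read(last, 16);
            IF (k > 0) t.write(T3X.SYSOUT, last, k);
        END
        inp.close();
    END
END
```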
ios.OPEN(path, buffer, len, mode) ! Str,Vec,Num,Num => Num
Open the file specified in 'path' and initialize 'ios' with the resulting file descriptor and the arguments 'buffer', 'len', and 'mode'. See IOS.CREATE for details. The exact format of 'path' depends on the operating system. The following open modes ('mode' values) are common:
Mode constant | ReadOK | WriteOK | Create |
---|---|---|---|
IOSTREAM.FREAD | Yes | No | No |
IOSTREAM.FWRITE | No | Yes | Yes |
IOSTREAM.FRDWR | Yes | Yes | No |
For additional modes, see IOS.CREATE.
When creating a file, any existing file with the same name will be deleted.
IOS.OPEN returns zero upon success and -1 in case of an error.
See also: CREATE, CLOSE, FLUSH, T.OPEN
ios.RDCH() ! => Char
Read a single character from 'ios' and return it. When the EOF condition is true on 'ios', return -1 (which cannot be a valid character).
See also: READ, READS, WRCH, EOF
ios.READ(buffer, len) ! Vec,Num => Num
Read up to 'len' characters from 'ios' into 'buffer'. Return the number of characters actually read. A return value less than 'len' may indicate the end of input or the beginning of a new line on a terminal. A return value of zero always indicates the EOF. A value below zero indicates a severe error.
See also: RDCH, READS, WRITE, EOF
ios.READS(buffer, len) ! Vec,Num => Num
Read up to 'len'-1 characters from 'ios' into 'buffer'. Return the number of characters actually read. A return value of zero indicates that the EOF has been reached. A value below zero indicates a severe error.
Unlike READ, READS stops reading when it encounters a line separator (LF).
See also: RDCH, READ, WRITE, EOF
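A typical read loop built from OPEN, READS, and CLOSE might look like the following sketch. It counts the chunks returned by READS. The file name, module name, and buffer sizes are assumptions, and a line longer than 80 characters would be counted more than once.

```t3x
! Sketch: count the chunks returned by READS (names are illustrative).
MODULE lines(t3x, iostream, util);

OBJECT inp[iostream], u[util];

VAR ibuf::512, line::81;

DO VAR k, n;
    n := 0;
    IE (inp.open("input.txt", ibuf, 512, IOSTREAM.FREAD) < 0)
        u.printf("cannot open input.txt\n", 0);
    ELSE DO
        ! READS stores at most 81-1 characters per call.
        k := inp.reads(line, 81);
        WHILE (k > 0) DO        ! 0 = EOF, below 0 = severe error
            n := n+1;
            k := inp.reads(line, 81);
        END
        inp.close();
        u.printf("%D chunks read\n", [(n)]);
    END
END
```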
ios.RESET() ! => 0
Reset the error flag of the given iostream 'ios'. Resetting the error flag is necessary to access a stream after an error has occurred (for example, after reading beyond the EOF).
See also: EOF, READ
ios.WRCH(c) ! Char => Char|Num
Write the character 'c' to the stream 'ios'. If the character could be written, return its ASCII code and otherwise return -1.
See also: WRITE, WRITES, FLUSH
ios.WRITE(buffer, len) ! Vec,Num => Num
Write 'len' characters from 'buffer' to 'ios'. Return the number of characters actually written. A return value less than 'len' indicates a severe error (such as no space left on the target device).
See also: WRITES, WRCH, FLUSH
ios.WRITES(str) ! Str => Num
Write the string 'str' to 'ios'. Return the number of characters actually written. A return value less than the length of 'str' indicates a severe error (such as no space left on the target device).
See also: WRITE, WRCH, FLUSH
OBJECT MEM[MEMORY];
The MEMORY class implements dynamic memory pools. When initialized, the address of a static data area is passed to a MEMORY object. This area (called a 'pool') will be managed by the MEMORY object. Vectors can be allocated from the pool and released to it again when they are no longer required.
A first-match algorithm is used to allocate memory in a pool. The algorithm is optimized for sequential allocation. The pool is defragmented when releasing memory, but no garbage collection is performed.
MEMORY objects are inefficient when allocating a large number of small vectors, since the free list is kept inside of the pool.
Multiple memory pools may be defined using the MEMORY class.
mem.ALLOC(size) ! Num => Vec
Allocate 'size' bytes from the memory pool 'mem' and return a pointer to the allocated vector. If the request could not be satisfied due to insufficient memory, return 0.
Up to 32765 bytes may be allocated in a single request.
See also: FREE
mem.FREE(vec) ! Vec => 0
Release the memory occupied by the vector 'vec' to 'mem' and defragment 'mem'. Thereby, the size of 'vec' will be added to the amount of free memory in 'mem'.
'Vec' must be the address of a vector which has been previously allocated in 'mem'. Otherwise, the calling program may be terminated with an error message of the form
mem_free(): bad block
The result of accessing a freed vector is undefined.
See also: ALLOC
mem.INIT(pool, size) ! Vec,Num => 0
Initialize the memory pool 'mem' by adding 'size' bytes to its internal freelist. 'Pool' must have a size of at least 'size' bytes. 'Size' may not be larger than 32767.
All previously allocated vectors of 'mem' will be freed by INIT.
MEM.INIT always returns 0.
mem.WALK(vec, sizep, statp) ! Vec,Vec,Vec => Vec
Traverse the list of vectors in 'mem'. This list contains both allocated and free vectors. Traversing the list works as follows:
When MEM.WALK is called the first time, 'vec' must be zero:
v := mem.walk(0, @p, @s);
This call will return a pointer to the first vector in 'mem'. The returned vector can be passed to MEM.WALK in a subsequent call to retrieve a pointer to the next vector:
v := mem.walk(v, @p, @s);
When the 'vec' argument finally points to the last vector in 'mem', MEM.WALK will return zero, thereby indicating the end of the list.
The argument 'sizep' is a one-word vector which will be filled with the size of the returned vector. 'Statp' is also a one-word vector argument which will be filled with the status of the vector (1=free, 0=allocated). If either 'statp' or 'sizep' is zero, it will be ignored by MEM.WALK.
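Putting INIT, ALLOC, FREE, and WALK together, a pool could be set up and inspected as in the sketch below. The pool size, the request sizes, and the module and object names are assumptions chosen for illustration.

```t3x
! Sketch: allocate from a pool and list its vectors (illustrative).
MODULE memdemo(t3x, memory, util);

OBJECT mem[memory], u[util];

VAR pool::4096;

DO VAR a, b, v, size, stat;
    mem.init(pool, 4096);
    a := mem.alloc(100);
    b := mem.alloc(200);
    IF (a \= 0 /\ b \= 0) DO
        mem.free(a);        ! leaves a free vector in front of 'b'
        ! First call with 0, then pass the previous result.
        v := mem.walk(0, @size, @stat);
        WHILE (v \= 0) DO
            u.printf("%D bytes, status %D (1=free)\n",
                [(size), (stat)]);
            v := mem.walk(v, @size, @stat);
        END
        mem.free(b);
    END
END
```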
OBJECT STR[STRING];
The STRING class contains procedures for manipulating ASCIZ strings (NUL-terminated sequences of ASCII characters).
The STRING class does not contain any data and it does not require any explicit initialization or shutdown calls.
STR.COMP(a, b) ! Str,Str => Num
Compare each character in 'a' with the character at the same position in 'b' and return the difference
a::i - b::i
of the characters at the first mismatching position 'i'. When no mismatch is encountered, the difference between the terminating NUL characters (0) is returned. Consequently, the return value of STR.COMP can be interpreted as follows:
Value | Meaning |
---|---|
>0 | 'a' is lexically greater than 'b' |
<0 | 'a' is lexically less than 'b' |
=0 | 'a' is equal to 'b' |
See also: FIND, SCAN, RSCAN
STR.COPY(a, b) ! Str,Str => 0
Copy the string stored at the location 'b' to the location 'a'. Return zero.
See also: FORMAT
STR.FIND(a, b) ! Str,Str => Num
Find the first occurrence of the string 'b' in the string 'a'. Return the offset of the string found, if any. Return -1, if 'a' does not contain 'b'.
See also: COMP, SCAN, RSCAN
STR.FORMAT(buf, tmpl, list) ! Str,Str,Vec => Str
Format the arguments contained in 'list' according to the template 'tmpl' and store the resulting string in 'buf'.
'Tmpl' is a string containing literal characters as well as 'format definitions'. A format definition is a substring which begins with a percent sign and ends with one of the characters in {C,D,S,X,%}. When 'tmpl' does not contain any format definitions, it will be copied to 'buf' and 'list' will be ignored. When format definitions exist, each definition will be used to format one element of 'list'. Instead of the definition itself, the result of formatting the current member of 'list' according to the definition will be inserted into 'buf'.
A format definition has the following syntax:
%[max][:F][U][{LR}]{CDSX%}
([x] indicates an optional element, {xyz} indicates 'one out of x,y,z'.)
The following types exist ('i' denotes the index of the current argument):
Type | Insert list[i] as |
---|---|
C | character |
D | decimal numeric literal |
S | string |
X | hexa-decimal numeric literal |
The string '%%' may be used to include a literal percent sign.
Examples:
Template | Argument list | Result |
---|---|---|
"%D%% of %10:*D = %D" | [10,200,20] | 10% of *******200 = 20 |
"'%C' = 0X%X = %D" | ['A','A','A'] | 'A' = 0X41 = 65 |
"%:-9LS%:+9RS" | ["ZZZ","YYY"] | ZZZ------++++++YYY |
STR.FORMAT returns the address of 'buf'.
See also: PARSE, COPY
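The following sketch formats run-time values using only forms that appear in the examples above. Dynamic list members are written in parentheses, as in the UTIL examples of this manual. The module name, buffer size, and variable names are assumptions.

```t3x
! Sketch: format run-time values into a buffer (illustrative names).
MODULE fmtdemo(t3x, string);

OBJECT t[t3x], str[string];

VAR buf::128;

DO VAR done, total;
    done := 30;
    total := 120;
    ! One list member per format definition, in template order.
    str.format(buf, "%S: %D of %10:*D (0X%X)\n",
        ["files", (done), (total), (total)]);
    t.write(T3X.SYSOUT, buf, str.length(buf));
END
```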
STR.LENGTH(a) ! Str => Num
Return the number of characters contained in 'a' (excluding the terminating NUL character).
STR.NUMTOSTR(buf, n, radix) ! Str,Num,Num => Str
Convert a number 'n' into a string representing that number with respect to a given 'radix'. The resulting numeric literal will be stored in 'buf'. If 'radix' is negative and 'n' is also negative, a leading minus sign will be generated. If 'radix' is positive, an unsigned literal will be generated.
'Buf' must provide enough space to hold the resulting literal.
Valid values for 'radix' range from '2' (binary) to '16' (hexa-decimal) and from '-2' (signed binary) to '-16' (signed hexa-decimal).
STR.NUMTOSTR returns the address of the first character of the resulting literal.
See also: STRTONUM, FORMAT
STR.PARSE(source, tmpl, list) ! Str,Str,Vec => Num
Extract patterns described in 'tmpl' from 'source' and store the extracted objects in the members of 'list'. Patterns used in 'tmpl' are similar to format descriptions as used by STR.FORMAT. Characters not belonging to patterns are matched literally.
STR.PARSE compares each character contained in 'tmpl' with a character in 'source' (like STR.COMP) unless it finds a '%'-character in 'tmpl'. A percent character indicates the beginning of a pattern. Patterns match specific classes of characters. Instead of matching the pattern description, the character class described by the pattern is matched.
Some patterns store the matched substring in an element of 'list' and some do not. Each pattern may consist of the following parts:
%[len][:D]{CDSWX%}
([x] indicates an optional element, {xyz} indicates 'one out of x,y,z'.)
The special form %[c0...cN] may be used to match any character in the range 'c0'...'c1'.
The following pattern types exist:
Type | Store as | Matches |
---|---|---|
C | character | any single character |
D | number | a signed decimal number (+) |
S | string | a string (*) |
W | - | space (any number of '\s' or '\t' characters) |
X | number | a signed hexa-decimal number (+) |
% | - | a percent sign |
(+) These leading prefixes are accepted: {+-%}
(*) When no length is specified, %S matches the entire rest of 'source'. ':D' or a length may be specified to match a substring.
Numbers and characters are stored in 'list[i][0]' (where 'i' is the index of the current member of 'list') and strings are copied to the location pointed to by 'list[i]'.
Example:
    VAR name::50, speed, unit::10;
    STR.PARSE("HAL9000 @ 500 MHz", "%:@S@ %D%W%S",
        [ name, @speed, unit]);
will store
    "HAL9000 " in 'name'
    500 in 'speed'
    "MHz" in 'unit'
STR.PARSE returns the number of patterns stored.
See also: FORMAT, COMP
STR.RSCAN(s, c) ! Str,Num => Num
Find the rightmost occurrence of the character 'c' in the string 's' and return the offset (position) of the character found. If 'c' is not contained in 's', return -1.
See also: SCAN, FIND, COMP
STR.SCAN(s, c) ! Str,Num => Num
Find the first occurrence of the character 'c' in the string 's' and return the offset (position) of the character found. If 'c' is not contained in 's', return -1.
See also: RSCAN, FIND, COMP
STR.STRTONUM(s, radix, lastp) ! Str,Num,Vec => Num
Compute the value represented by the numeric literal stored in the string 's'. 'Radix' specifies the base of the literal in 's'. It may range from '2' (binary) to '16' (hexa-decimal).
STR.STRTONUM performs the following steps:
The following characters may represent the digits from 0 to 15: "0123456789ABCDEF".
When the argument 'lastp' is non-zero, it will be filled with the number of characters processed. Consequently, it points to the first non-numeric character in 's' when STR.STRTONUM returns.
STR.STRTONUM returns the computed value.
No overflow checking is performed.
See also: NUMTOSTR, PARSE
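STRTONUM and NUMTOSTR can be combined to convert a literal between bases. The sketch below reads a hexa-decimal literal and writes it back in decimal. The input string, the buffer size, and the module name are assumptions for this example.

```t3x
! Sketch: convert a hexa-decimal literal to decimal (illustrative).
MODULE numdemo(t3x, string);

OBJECT t[t3x], str[string];

VAR buf::32;

DO VAR n, used, p;
    ! 'used' receives the number of characters processed.
    n := str.strtonum("7F and more", 16, @used);
    ! Use the returned address: it is the first character of the
    ! literal, which is not necessarily the start of 'buf'.
    p := str.numtostr(buf, n, 10);
    t.write(T3X.SYSOUT, p, str.length(p));
    t.write(T3X.SYSOUT, "\n", 1);
END
```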
STR.XLATE(s, old, new) ! Str,Num,Num => Str
Replace each occurrence of the character 'old' in the string 's' with 'new'. Return the address of 's'.
See also: SCAN, RSCAN
MODULE MODNAME(TCODE); CLASS CLASSNAME(TCODE) ... END
The TCODE class contains a set of public constants describing the instruction set of the Tcode machine. There is no need to instantiate this class, since it does not contain any state or methods. To access the opcode of a specific Tcode instruction, use the class constant notation
tcode.IINSTRUCTION
For example, to load the variable I with the opcode of the JUMP instruction, use
I := tcode.IJUMP;
The special constant IENDOFSET contains a value which is one above the highest value used to form Tcode instructions. To check if a variable J contains a valid instruction, the following code may be used:
    IF ((J & 0x7F) .>= tcode.IENDOFSET)
        ; ! Call your illegal instruction handler here
OBJECT U[UTIL];
The UTIL (utility) class contains utility procedures for miscellaneous tasks. Currently, there are methods for sending formatted output to various channels like the system output, file descriptors or streams. Using the UTIL class simplifies many frequently used code fragments. For example, the code
    DO VAR buffer::256;
        t.write(T3X.SYSOUT,
            str.format(buffer, "X = %D\N", [(x)]),
            str.length(buffer));
    END
can be replaced with
u.printf("X = %D\N", [(x)]);
in modules depending on the UTIL class.
UTIL.BUFLEN
This public constant holds the maximum length of strings formatted by the PRINTF, WRITEF, and SWRITEF methods. This length includes the terminating NUL character.
U.PRINTF(tmpl, args) ! Str,Vec => Num
Format the arguments contained in the vector 'args' using
STR.FORMAT(buffer, tmpl, args)
where 'buffer' is an internal buffer of the length UTIL.BUFLEN. The buffer is then written to the system output device T3X.SYSOUT.
PRINTF returns the number of characters written using T3X.WRITE.
U.SWRITEF(ios, tmpl, args) ! IOS,Str,Vec => Num
Format the arguments contained in the vector 'args' using
STR.FORMAT(buffer, tmpl, args)
where 'buffer' is an internal buffer of the length UTIL.BUFLEN. The buffer is then written to the output stream 'ios'.
SWRITEF returns the number of characters written using IOSTREAM.WRITE.
U.WRITEF(fd, tmpl, args) ! FDesc,Str,Vec => Num
Format the arguments contained in the vector 'args' using
STR.FORMAT(buffer, tmpl, args)
where 'buffer' is an internal buffer of the length UTIL.BUFLEN. The buffer is then written to the file descriptor 'fd'.
WRITEF returns the number of characters written using T3X.WRITE.
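The three methods differ only in their output channel, which the following sketch shows side by side. The module name, the stream object 'out', and its buffer size are assumptions for this example.

```t3x
! Sketch: the same value sent to three channels (illustrative).
MODULE udemo(t3x, util, iostream);

OBJECT u[util], out[iostream];

VAR obuf::256;

DO VAR x;
    x := 42;
    u.printf("X = %D\n", [(x)]);                 ! system output
    u.writef(T3X.SYSOUT, "X = 0X%X\n", [(x)]);   ! file descriptor
    out.create(T3X.SYSOUT, obuf, 256, IOSTREAM.FWRITE);
    u.swritef(out, "X = '%C'\n", [(x)]);         ! I/O stream
    out.flush();    ! stream output appears when flushed
END
```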
OBJECT SYS[SYSTEM]; SYS.INIT();
The SYSTEM class contains some procedures which form a more or less portable interface to the operating system. Most procedures have the same names and functions as Unix system calls. Some functions may be unavailable on non-Unix systems.
This class is implemented as a shared object. Therefore, it must be initialized using SYS.INIT and shut down by calling SYS.FINI.
SYSTEM does not contain any variables. Therefore, it is sufficient to create a single instance per module.
SYS.INIT() ! => 0
Initialize the operating system interface.
See also: FINI
SYS.CHDIR(path) ! Str => Num
Change the current working directory to 'path'. 'Path' contains the operating system dependent representation of a path.
SYS.CHDIR returns 0 on success and a negative value in case of an error.
See also: MKDIR, RMDIR, OPENDIR
SYS.CLOSEDIR(ddesc) ! Ddesc => Num
Close the directory descriptor 'ddesc'.
SYS.CLOSEDIR returns 0 on success and a negative value in case of an error.
See also: OPENDIR, READDIR, STAT
SYS.DUP(oldfd) ! Fdesc => Fdesc | -1
Duplicate the file descriptor 'oldfd' and return a new descriptor which will reference the same file. Since the old and the new descriptor reference the same file, all operations performed on one of them will also affect the other.
SYS.DUP returns a new descriptor on success and a negative number in case of an error.
See also: DUP2, T.OPEN, T.CLOSE, PIPE, FORK
SYS.DUP2(oldfd, newfd) ! Fdesc,Fdesc => Num
Duplicate the file descriptor 'oldfd' and make 'newfd' reference the same file. Thereafter, all operations performed on one of them will also affect the other. If 'newfd' already references a valid descriptor, it will first be closed using T.CLOSE.
SYS.DUP2 returns zero upon success, and a negative number in case of an error.
See also: DUP, T.OPEN, T.CLOSE, PIPE, FORK
SYS.FINI() ! => 0
Shut down the operating system interface. After calling SYS.FINI, the SYSTEM services become unavailable.
See also: INIT
SYS.FORK() ! => Num
Duplicate the calling process. The new process -- called the child process -- will start running exactly at the point where SYS.FORK returns. Each process has its own data segment and its own set of file descriptors. Descriptors which were open when SYS.FORK was called will reference the same files in both processes, though.
After the successful creation of the new process, SYS.FORK returns 0 to the child process and the process ID of the child to the parent process.
In case of an error, it returns -1.
See also: KILL, SPAWN, T.OPEN
SYS.GETDIR(buf, len) ! Str,Num => Num
Store the fully qualified path name of the current working directory in 'buf'. Do not store more than the first 'len'-1 characters. Append a trailing NUL character. 'Buf' must be at least 'len' characters in length, and it may not be smaller than 65 characters.
If 'len' is less than 65 or the function fails, -1 is returned. Upon success, the number of stored characters is returned.
See also: MKDIR, RMDIR, OPENDIR
SYS.KILL(pid, sig) ! Num,Num => Num
Send a signal to the process with the process ID 'pid'. The following constants may be used in the place of 'sig' to specify which signal to send to the process:
Constant | Action |
---|---|
SYSTEM.SIGTEST | Test process |
SYSTEM.SIGTERM | Request termination |
SYSTEM.SIGKILL | Force termination |
SYS.KILL returns zero if the signal could be delivered successfully, and a negative number in case of an error.
Delivering SIGTEST does not have any effect. Therefore, it can be used to check whether the process with a given PID exists.
SIGTERM may be caught by the receiving process to initiate a clean shutdown.
SIGKILL terminates the receiving process immediately.
See also: FORK, SPAWN
SYS.MKDIR(path) ! Str => Num
Create a directory with the name stored in 'path'. 'Path' is an operating system dependent path name.
SYS.MKDIR returns zero, if the directory could be created and otherwise a negative value.
See also: CHDIR, RMDIR, OPENDIR, GETDIR
SYS.OPENDIR(path) ! Str => Ddesc | -1
Open the directory specified in 'path'. The exact format of 'path' depends on the underlying operating system.
Upon success, SYS.OPENDIR returns a directory descriptor and in case of an error, it returns -1.
See also: READDIR, CLOSEDIR, STAT
SYS.PIPE(vec) ! Vec => Num
Create a pipe (a FIFO structure) and fill the vector 'vec' with two descriptors which can be used to access the pipe. Each element of 'vec' will be filled with an ordinary file descriptor as returned by T.OPEN. Therefore, the usual I/O operations can be used to read and write a pipe.
After the successful creation of a pipe, 'vec[0]' will contain the output descriptor (which is read-only) and 'vec[1]' will contain the input descriptor (which is write-only).
Data written to 'vec[1]' can be read from 'vec[0]'. Write requests will block, if the pipe is full and read requests will block, if the pipe is empty. The size of the pipe depends on the operating system.
SYS.PIPE returns zero when a pipe could be created and a negative value in case of an error.
See also: T.OPEN, T.CLOSE, T.READ, T.WRITE
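A pipe is usually combined with SYS.FORK so that one process writes and the other reads. The sketch below sends a short message from the child to the parent. It assumes that T.READ takes the same (descriptor, buffer, length) arguments as T.WRITE and that T.CLOSE takes a descriptor. The module name and the buffer size are also assumptions.

```t3x
! Sketch: child writes into a pipe, parent reads (illustrative).
! Assumes t.read(fd, buf, len) and t.close(fd) in the T3X class.
MODULE pipedemo(t3x, system);

OBJECT t[t3x], sys[system];

VAR fds[2], msg::16;

DO VAR pid, k;
    sys.init();
    IF (sys.pipe(fds) = 0) DO
        pid := sys.fork();
        IE (pid = 0) DO
            ! Child: fds[1] is the write-only end.
            t.close(fds[0]);
            t.write(fds[1], "ping\n", 5);
            t.close(fds[1]);
        END
        ELSE IF (pid > 0) DO
            ! Parent: fds[0] is the read-only end.
            t.close(fds[1]);
            k := t.read(fds[0], msg, 16);
            IF (k > 0) t.write(T3X.SYSOUT, msg, k);
            t.close(fds[0]);
            sys.wait(pid);
        END
    END
    sys.fini();
END
```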
SYS.RDCHK(fdesc) ! Fdesc => Num
Check whether input is available from the file descriptor 'fdesc' (whether a read operation on 'fdesc' would NOT block).
SYS.RDCHK returns a non-zero value, if the operation would succeed without blocking. If the read request would block, zero is returned.
See also: T.READ
SYS.READDIR(ddesc, buffer, lim) ! Ddesc,Str,Num => Num
Read the next directory entry from the directory descriptor 'ddesc' and fill 'buffer' with the name of that entry. If the name is longer than 'lim'-1 characters, truncate it to 'lim'-1 characters. In any case, terminate the name with a NUL character.
SYS.READDIR returns the length of the name read upon success, and -1 in case of an error. Reading beyond the end of the directory will also return -1.
See also: OPENDIR, CLOSEDIR, STAT
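The three directory procedures are normally used together. The following sketch lists the entries of a directory. The path "." is an assumption (path formats depend on the operating system), as are the module name and the name buffer size.

```t3x
! Sketch: list the entries of a directory (illustrative names).
MODULE lsdemo(t3x, system, util);

OBJECT sys[system], u[util];

VAR name::64;

DO VAR dd;
    sys.init();
    dd := sys.opendir(".");
    IF (dd \= %1) DO            ! %1 is the literal for -1
        ! READDIR returns -1 on error and at the end of the
        ! directory; names are truncated to 64-1 characters.
        WHILE (sys.readdir(dd, name, 64) >= 0)
            u.printf("%S\n", [(name)]);
        sys.closedir(dd);
    END
    sys.fini();
END
```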
SYS.RMDIR(path) ! Str => Num
Remove the directory specified in 'path'. 'Path' is an operating system dependent path name.
SYS.RMDIR returns zero, if the directory could be removed and otherwise a negative value.
See also: CHDIR, MKDIR, OPENDIR, GETDIR
SYS.SPAWN(prog, args, mode) ! Str,Vec,Num => Num
Create a new process by running the program 'prog' with the command line options stored in 'args'. 'Prog' contains the path of the executable in an operating system dependent format. 'Args' is a vector of strings where each one contains one command line argument. The last element of the vector must be zero. 'Mode' controls whether execution of the calling process will be suspended until the spawned process exits. The following modes exist:
Mode name | Meaning |
---|---|
SYSTEM.SPAWN_NOWAIT | Execute the new process concurrently. |
SYSTEM.SPAWN_WAIT | Suspend the caller until the new process terminates. |
NOTE: some operating systems may restrict the space which can be used for passing command line arguments.
NOTE2: on non-multitasking systems, SPAWN_NOWAIT may be unimplemented.
SYS.SPAWN returns the exit code of the subprocess when called with mode=SPAWN_WAIT and zero when called with mode=SPAWN_NOWAIT. In case of an error, it will return -1.
See also: FORK, WAIT
SYS.STAT(path, sb) ! Str,Statbuf => Num
Retrieve some information about the file specified in 'path'. The path is in a system dependent format. The retrieved information will be stored in a 'Statbuf' structure which has the following format:
struct STATBUF = | |
---|---|
ST_DEV, | ! device ID |
ST_INO, | ! inode number |
ST_MODE, | ! access bits |
ST_NLINK, | ! number of links |
ST_UID, | ! user ID of owner |
ST_GID, | ! group ID of owner |
ST_RDEV, | ! device type |
ST_SIZE, | ! file size in bytes |
ST_EXT64, | ! file size in 64K blocks |
ST_MTIME, | ! date of last modification (8 bytes); format: CYMDHMSh, see SYS.TIME |
ST_MT_2, | ! \ |
ST_MT_3, | ! > Buffer for ST_MTIME |
ST_MT_4; | ! / |
Depending on the operating system, some fields will be filled with more or less meaningful standard values. For example, systems not supporting multiple links will fill the ST_NLINK field with 1.
The access field ST_MODE may have the following flags set:
Flag | Description |
---|---|
SYSTEM.FM_RDOK | file is readable |
SYSTEM.FM_WROK | file is writeable |
SYSTEM.FM_EXOK | file is executable (*) |
SYSTEM.FM_ISDIR | file is a directory |
(*) DOS files do not have an executable flag. Therefore, the expression x[SYSTEM.ST_MODE] & SYSTEM.FM_EXOK is always zero on DOS systems.
SYS.STAT returns zero upon success and otherwise a negative value.
See also: OPENDIR, READDIR, CLOSEDIR, GETDIR
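As a usage sketch, the access flags can be tested by AND'ing them with the ST_MODE field. The example assumes that the structure size SYSTEM.STATBUF may be used as a vector size in a declaration. The file name and the module name are hypothetical.

```t3x
! Sketch: query a file and test a mode flag (illustrative).
! Assumes SYSTEM.STATBUF is usable as a vector size.
MODULE statdemo(t3x, system, util);

OBJECT sys[system], u[util];

VAR sb[SYSTEM.STATBUF];

DO
    sys.init();
    IE (sys.stat("input.txt", sb) < 0)
        u.printf("stat failed\n", 0);
    ELSE DO
        u.printf("size: %D bytes\n", [(sb[SYSTEM.ST_SIZE])]);
        IF (sb[SYSTEM.ST_MODE] & SYSTEM.FM_ISDIR)
            u.printf("this is a directory\n", 0);
    END
    sys.fini();
END
```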
SYS.TIME(tbuf) ! Bvec => 0
Fill the buffer 'tbuf' with the current system time. 'Tbuf' must provide eight bytes of space which will be filled as follows:
Field | Value | Range |
---|---|---|
tbuf[0] | year / 100 | 19... |
tbuf[1] | year mod 100 | 0...99 |
tbuf[2] | month | 1...12 |
tbuf[3] | day | 1...31 |
tbuf[4] | hour | 0...23 |
tbuf[5] | minute | 0...59 |
tbuf[6] | second | 0...59 |
tbuf[7] | second/100 | 0...99 |
SYS.TIME never fails and always returns 0. It might return an incorrect time, though, and on systems without a clock, it may fill 'tbuf' with the same values each time it is called.
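The sketch below prints the current date and time from the buffer fields. Since 'tbuf' is a byte vector, the fields are read here with the byte operator '::'. This reading of the field notation, the module name, and the output format are assumptions.

```t3x
! Sketch: print date and time (illustrative; fields read as bytes).
MODULE timedemo(t3x, system, util);

OBJECT sys[system], u[util];

VAR tb::8;

DO
    sys.init();
    sys.time(tb);
    ! Rebuild the year from 'year / 100' and 'year mod 100'.
    u.printf("%D-%D-%D %D:%D:%D\n", [
        ((tb::0) * 100 + (tb::1)), (tb::2), (tb::3),
        (tb::4), (tb::5), (tb::6) ]);
    sys.fini();
END
```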
SYS.WAIT(pid) ! Num => Num
Wait for a subprocess to terminate and return its exit code. 'Pid' contains the process ID of the process to wait for. It must be a PID obtained from SYS.SPAWN or SYS.FORK.
See also: SPAWN, FORK
OBJECT TTY[TTYCTL];
The TTYCTL class implements a set of routines for controlling character-based video terminals and reading keyboards. Procedures contained in this class include writing to the terminal screen, cursor movement, clearing and scrolling screen regions, setting display colors (where available), and decoding keyboard input.
The TTYCTL routines must be initialized by calling TTY.INIT and shut down by calling TTY.FINI.
This class does not contain any data.
TTY.INIT() ! => 0
Initialize the TTY control structures. This routine must be called before any other procedure of this class can be used. It performs the following steps (depending on the used operating system, some of these steps may be skipped):
TTY.INIT may fail for any of the following reasons:
In any of the above cases, an appropriate message will be printed and the calling program will be terminated.
See also: MODE
TTY.CLEAR() ! => 0
Clear the terminal screen using the currently selected color.
See also: CLREOL, COLOR
TTY.CLREOL() ! => 0
Clear all characters from the cursor position to the end of the current line using the currently selected color.
See also: CLEAR, COLOR
TTY.COLOR(color) ! Num => 0
Select a new foreground and background color. 'Color' is created by OR'ing together a foreground and a background color value. The following values exist (F_ indicates 'foreground' and B_ indicates 'background'):
F_BLACK, F_BLUE, F_GREEN, F_CYAN, F_RED, F_MAGENTA, F_YELLOW, F_GREY, B_BLACK, B_BLUE, B_GREEN, B_CYAN, B_RED, B_MAGENTA, B_YELLOW, B_GREY
The special value F_BRIGHT may be OR'ed in to increase the intensity of the foreground color. For example,
tty.color(TTYCTL.F_CYAN | TTYCTL.B_BLUE | TTYCTL.F_BRIGHT)
selects bright cyan color on blue background.
On monochrome terminals, only the color values
F_GREY|B_BLACK and F_BLACK|B_GREY
should be considered defined.
See also: SCREENTYPE
TTY.COLORS() ! => Num
Return a non-zero value, if the controlled terminal supports color.
See also: COLUMNS, LINES
TTY.COLUMNS() ! => Num
Return the number of columns per line on the screen of the controlled terminal.
See also: COLORS, LINES
TTY.FINI() ! => 0
Shut down the TTYCTL class and unload the TTYCTL.SO module.
See also: INIT
TTY.LINES() ! => Num
Return the number of lines on the screen of the controlled terminal.
See also: COLORS, COLUMNS
TTY.MODE(rawflag) ! Num => 0
Switch the terminal to 'raw mode' or back to 'cooked mode'. Some terminals (especially in the Unix world) must be in raw mode to allow reading single characters from them. In non-raw ('cooked') mode, reading a TTY device does not return until CR (aka ENTER, NL) is pressed on the terminal's keyboard. To make the read call return immediately after a key has been pressed, the TTY driver must be in raw mode.
TTY.MODE(1) selects raw mode and TTY.MODE(0) selects cooked mode.
These calls may have no effect on other platforms, but when switching a TTY driver to raw mode, it should be switched back to cooked mode before terminating the program. Otherwise, the TTY driver may be left in an undesired state and make the TTY inaccessible.
On some systems, cooked mode may not be implemented. In this case, READC will always return after receiving a single key.
See also: READC
TTY.MOVE(x, y) ! Num,Num => 0
Move the cursor to the specified location (column 'x', row 'y'). If the specified coordinates do not exist on the used TTY, the result is undefined. Coordinates start at (0,0) in the upper/left corner.
TTY.QUERY() ! => Num
Check whether there are characters in the keyboard input buffer. If there are characters, TTY.READC would return when called at that moment. Otherwise, it would block.
TTY.QUERY returns -1 if there are characters in the buffer and otherwise 0.
See also: READC
TTY.READC() ! => Num
Read a single character from the terminal's keyboard and return its keycode. For keys generating ASCII characters, the ASCII code of the key will be returned. 'Special' keys like the arrow keys, PREVIOUS PAGE, NEXT PAGE, INSERT, DELETE, and the function keys return values above 255. The following symbols may be used to match special key codes:
Keycode | Label or Keys |
---|---|
TTYCTL.K_HOME | 'Home' |
TTYCTL.K_LEFT | Left arrow |
TTYCTL.K_RGHT | Right arrow |
TTYCTL.K_END | 'End' |
TTYCTL.K_BKSP | Backspace, <--, <X] |
TTYCTL.K_DEL | 'Del', 'Delete', 'Remove' |
TTYCTL.K_KILL | Control + 'U', Control + Backspace |
TTYCTL.K_INS | 'Ins', 'Insert' |
TTYCTL.K_CR | 'CR', 'Enter', 'Return', <-' |
TTYCTL.K_UP | Up arrow |
TTYCTL.K_DOWN | Down arrow |
TTYCTL.K_ESC | 'ESC', 'Escape' |
TTYCTL.K_PREV | 'Prev', 'PgUp', 'PageUp' |
TTYCTL.K_PGUP | = K_PREV |
TTYCTL.K_NEXT | 'Next', 'PgDn', 'PageDn' |
TTYCTL.K_PGDN | = K_NEXT |
TTYCTL.K_F1 | 'F1' |
TTYCTL.K_F2 | 'F2' |
TTYCTL.K_F3 | 'F3' |
TTYCTL.K_F4 | 'F4' |
TTYCTL.K_F5 | 'F5' |
TTYCTL.K_F6 | 'F6' |
TTYCTL.K_F7 | 'F7' |
TTYCTL.K_F8 | 'F8' |
TTYCTL.K_F9 | 'F9' |
TTYCTL.K_F10 | 'F10' |
Some systems may require to switch the TTY driver to raw mode (see MODE) before single characters can be received from a terminal.
See also: MODE, QUERY, WRITEC
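A key-handling loop might be organized as in the following sketch, which switches to raw mode, reacts to two special keys, echoes ASCII characters, and restores cooked mode when ESC is pressed. The module name and the choice of keys are assumptions for this example.

```t3x
! Sketch: read keys in raw mode until ESC is pressed (illustrative).
MODULE keydemo(t3x, ttyctl);

OBJECT tty[ttyctl];

DO VAR k;
    tty.init();
    tty.mode(1);            ! raw mode: READC returns per key
    k := tty.readc();
    WHILE (k \= TTYCTL.K_ESC) DO
        IE (k = TTYCTL.K_UP) tty.writes("up ");
        ELSE IE (k = TTYCTL.K_DOWN) tty.writes("down ");
        ELSE IF (k < 256) tty.writec(k);    ! plain ASCII key
        k := tty.readc();
    END
    tty.mode(0);            ! restore cooked mode before exiting
    tty.fini();
END
```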
TTY.RSCROLL(top, bottom) ! Num,Num => 0
Scroll the screen region from 'top' to 'bottom' down by one line. At the top of the region, a blank line will be inserted using the currently selected color. Line numbers start at 0.
See also: SCROLL, SCREENTYPE
TTY.SCROLL(top, bottom) ! Num,Num => 0
Scroll the screen region from 'top' to 'bottom' up by one line. At the bottom of the region, a blank line will be inserted using the currently selected color. Line numbers start at 0.
See also: RSCROLL, SCREENTYPE
TTY.WRITEC(c) ! Num => Num
Write the character 'c' to the terminal screen and return its ASCII code. The character will be output at the current cursor position. Writing a character advances the cursor. When the cursor is in the rightmost column when writing a character, the cursor position is undefined after the output operation.
See also: READC, WRITES
TTY.WRITES(string) ! Str => 0
Write a string to the terminal screen as if each character of the string had been written using TTY.WRITEC. However, TTY.WRITES is usually faster than the character-oriented WRITEC method.
See also: WRITEC, READC
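The output procedures of this class can be combined as in the sketch below, which picks a color depending on the terminal type, clears the screen, and writes a text to the last line. The module name and the chosen colors are assumptions.

```t3x
! Sketch: clear the screen and write to the last line (illustrative).
MODULE scrdemo(t3x, ttyctl);

OBJECT tty[ttyctl];

DO
    tty.init();
    IE (tty.colors())
        tty.color(TTYCTL.F_YELLOW | TTYCTL.B_BLUE | TTYCTL.F_BRIGHT);
    ELSE
        ! One of the two values defined on monochrome terminals.
        tty.color(TTYCTL.F_GREY | TTYCTL.B_BLACK);
    tty.clear();
    ! Coordinates start at (0,0), so the last row is LINES-1.
    tty.move(0, tty.lines()-1);
    tty.writes("bottom line");
    tty.move(0, 0);
    tty.fini();
END
```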
OBJECT XM[XMEM]; XM.INIT();
The XMEM class provides access to external memory blocks. An external memory block is a continuous region of memory not contained in the T3X data area. XM blocks are byte addressed. Bytes in XM blocks can only be read and written using the procedures XM.GET, XM.PUT, and friends, as defined by this class.
The XMEM class does have a state which is implemented in the shared object part. Therefore, only one single instance of the class may be loaded.
Since each external memory block must be completely addressable using Tcode machine words, its size may not exceed 65536 bytes.
The XMEM class must be initialized by calling XM.INIT before its use and shut down by calling XM.FINI.
XM.INIT() ! => 0
Initialize the external memory interface. Basically, this routine loads the shared object containing the actual interface procedures. XM.INIT will fail, if the shared object could not be opened. In this case, the calling program will be halted.
See also: FINI
XM.ALLOC(length) ! => id | -1
Allocate a block of external memory with a size of LENGTH bytes. Upon success, return an identifier which may be used in subsequent XM operations to access the block. In case of an error (out of memory / out of IDs), return -1.
See also: FREE
XM.FREE(id) ! => 0 | -1
Release a previously allocated external memory block. ID is an identifier returned by XM.ALLOC.
See also: ALLOC
XM.GET(id, index) ! => value | -1
Return the byte stored at address INDEX of the external memory block referenced through ID. INDEX may not exceed X-1 where X is the size of the block as specified at allocation time. XM.GET returns -1, if an invalid ID is passed to it.
See also: ALLOC, PUT, READ.
XM.PUT(id, index, value) ! => value | -1
Replace the value of the byte stored at address INDEX of the external memory block referenced through ID with VALUE. INDEX may not exceed X-1 where X is the size of the block as specified at allocation time. All but the least significant 8 bits of VALUE will be discarded. XM.PUT returns -1, if an invalid ID is passed to it.
See also: ALLOC, GET, WRITE.
XM.READ(id, index, buffer, length) ! => 0 | -1
Copy LENGTH bytes stored at address INDEX of the external memory block referenced through ID into BUFFER. INDEX may not exceed X-1-LENGTH where X is the size of the block as specified at allocation time. XM.READ returns -1, if an invalid ID is passed to it.
See also: ALLOC, GET, WRITE.
XM.WRITE(id, index, buffer, length) ! => 0 | -1
Copy LENGTH bytes from BUFFER to the address INDEX of the external memory block referenced through ID. INDEX may not exceed X-1-LENGTH where X is the size of the block as specified at allocation time. XM.WRITE returns -1, if an invalid ID is passed to it.
See also: ALLOC, PUT, READ.
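The byte-level behaviour described above can be sketched with a small Python model. This is only an illustration of the documented semantics, not the actual XMEM implementation: the class name `XMemModel`, the way IDs are handed out, and the choice to return the stored byte from `put` are assumptions made for this sketch.

```python
class XMemModel:
    """Toy model of XM.ALLOC, XM.FREE, XM.GET, and XM.PUT (hypothetical)."""

    MAX_SIZE = 65536  # a block must be addressable with a 16-bit word

    def __init__(self):
        self.blocks = {}   # id -> bytearray
        self.next_id = 0   # assumption: IDs are small consecutive integers

    def alloc(self, length):
        # Assumption: an impossible size is reported like any other error.
        if not 0 < length <= self.MAX_SIZE:
            return -1
        ident = self.next_id
        self.next_id += 1
        self.blocks[ident] = bytearray(length)
        return ident

    def free(self, ident):
        return 0 if self.blocks.pop(ident, None) is not None else -1

    def get(self, ident, index):
        block = self.blocks.get(ident)
        return -1 if block is None else block[index]

    def put(self, ident, index, value):
        block = self.blocks.get(ident)
        if block is None:
            return -1
        block[index] = value & 0xFF   # all but the low 8 bits are discarded
        return block[index]           # assumption: the stored byte is returned
```

Storing 0x1234 with `put` and reading it back with `get` yields 0x34 in this model, which mirrors the truncation rule given for XM.PUT.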
The Tcode machine is the target of the reference implementation of T3X. Tcode is suitable for both interpretation and transformation into native code. It also provides mechanisms for static linking so that multiple Tcode modules can be linked together forming one single program. Since version 3, support for object-oriented programming is built into the virtual Tcode machine. This chapter describes Tcode7 and its virtual machine in detail.
The Tcode machine is a virtual 16-bit machine basically consisting of the following parts:
+---------------+0xFFFF                 0xFFFF+---------------+
|               |                             |               |
|               |       :-- 16 bits --:       |     Stack     |
|  U n u s e d  |       +-------------+       |      and      |
|   S p a c e   |   +---|     IP      |       |    Dynamic    |
|               |   |   +-------------+  +--->|    Storage    |
|               |   |   |     RR      |  |    |               |
|- - - - - - - -|   |   +-------------+  | +->|- - - - - - - -|
|               |   |   |     FP      |--+ |  |    F r e e    |
|               |   |   +-------------+    |  |               |
| Tcode Program |<--+   |     SP      |----+  |  M e m o r y  |
|               |       +-------------+       |               |
| Instructions  |       |    SELF     |---+   |- - - - - - - -|
|               |       +-------------+   |   |    Static     |
|               |          REGISTERS      +-->|     Data      |
|               |                             |               |
+---------------+0x0000                 0x0000+---------------+
   CODE ARRAY                                    DATA ARRAY
Fig.5 The Architecture of the Tcode Machine
There are two byte-addressable memory regions called the code array and the data array. The code array holds the Tcode program which is to be executed and the data array is used to hold the data used by the program. Each cell in one of the arrays is completely addressable using a 16-bit pointer. Therefore, the maximum size for each array is 65536 bytes.
Machine words - which are always 2 bytes wide - are stored with the least significant byte in the cell with the lower address: 0x1234 = 0x34 0x12 (little endian byte ordering). However, this byte ordering applies only to the way machine words are stored in Tcode programs. Actual implementations of the Tcode machine may store machine words in any format as long as the word size is left unchanged. Therefore, the result of accessing single bytes of a machine word is undefined.
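The byte ordering used in Tcode programs can be illustrated with Python's `struct` module. The helper names `store_word` and `load_word` are made up for this sketch.

```python
import struct

def store_word(value):
    """Return the two bytes of a 16-bit word as stored in a Tcode program."""
    return struct.pack("<H", value & 0xFFFF)   # "<H": little endian, 16 bits

def load_word(data):
    """Rebuild a 16-bit word from two bytes read from a Tcode program."""
    return struct.unpack("<H", data)[0]

print(store_word(0x1234).hex())   # prints "3412"
```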
The Tcode machine has five 16-bit wide special purpose registers which are outlined in the following overview.
FP, the Frame Pointer.
The frame pointer always points to the stack frame (aka context) of the
currently running procedure. FP is implicitly referenced by the
instructions LDL, LDLV, SAVL, and
INCL, which address local objects. FP is modified only
by HDR, END, MHDR, and ENDM instructions.
See also calling conventions.
IP, the Instruction Pointer.
This register always points to the instruction which will be
interpreted next. IP is interpreted as an offset into the code
array. It cannot be accessed directly and it is changed by jump, call,
and branch instructions.
RR, the Return Register.
This register is used to transport procedure results back to the caller.
The return register is loaded by the POP instruction and saved by
CLEAN. See also
calling conventions.
SELF, the class context pointer.
SELF points to the instance context
which is currently in effect. This is equal to the
first byte of the data space of the object which is
currently receiving a message. Instance contexts
are static. They are established using MHDR
and released using ENDM. The SELF
register is used by the LDI, LDIV,
and INCI instructions to compute the addresses
of instance variables. See also
instance contexts.
SP, the Stack Pointer.
The stack pointer points to the object most recently placed on the
stack. Moving an object onto the stack implicitly decreases SP
by 1 machine word. Removing an object increases it by 1 machine word.
SP may be explicitly modified using the STACK
instruction.
The Tcode machine instructions can be divided into the following nine groups:
Declarations, external linkage, and debug instructions will be processed only once (therefore, they may be resolved in a preprocessing step). This means that an instruction like
STR 5 H e l l o
will not create a new string literal each time it is interpreted, but only the first time. (One might also think of this behaviour as creating the same object each time a declaration is executed.)
Arithmetic instructions and predicates expect their arguments on the runtime stack and also place their results there. Since there are no general purpose registers, most operations are performed on stack elements.
A procedure should always begin with a HDR instruction, which saves the caller's context and creates a new stack frame, and end with an END instruction, which restores the saved stack frame and jumps back to the caller.
A procedure call
P(a, b, c)
where a, b, and c are global variables, is coded as follows:
LDG a
LDG b
LDG c
CALL LP
CLEAN 3
Each LDG instruction loads the value of a global variable onto the stack. CALL performs the procedure call which returns with its result in the Return Register (RR). LP denotes the label tagging the procedure P. The final CLEAN instruction removes the three arguments from the stack and replaces them with the value returned in RR so that the top stack element finally holds the procedure return value.
Each procedure may expect the following stack configuration when called:
FP+M | Argument #1 |
FP+3 | Argument #N-1 |
FP+2 | Argument #N |
FP+1 | Return Address (saved by CALL or CALR) |
FP+0 | Old FP (saved by HDR) |
FP-1 | Local Variable #1 |
FP-2 | Local Variable #2 |
FP-J | Local Variable #K |
SP | ( Free memory below ) |
Note: The arguments are passed to the procedure in reverse order with the first argument at the highest address.
Both arguments and local variables may be accessed using LDL instructions. Given the above context,
LDL -M
would access the first argument.
LDL -2
always loads the value of the last argument, if any. Local storage is accessed using positive offsets:
NUM 25
SAVL 2
would load the second local variable with the value 25.
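The relation between LDL operands and the frame layout shown above can be written down as two small helpers. This is a sketch that assumes the layout of the table (last argument at FP+2, first argument at the highest address); the function names are hypothetical.

```python
def local_address(fp, n):
    """Byte address referenced by LDL n, LDLV n, or SAVL n: FP - n*2."""
    return (fp - n * 2) & 0xFFFF

def argument_operand(position, count):
    """LDL operand for the argument at 1-based POSITION out of COUNT.

    The last argument sits at FP+2, so its operand is -2; each earlier
    argument lies one machine word higher.
    """
    return -(count + 2 - position)
```

For a procedure with three arguments, `argument_operand(1, 3)` gives -4, which corresponds to the operand -M in the text with M=4.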
A method is a procedure used for manipulating data of an object. Instead of the usual procedure frame as described in the previous section, it should contain instructions to build and destroy a local context and to shift and restore the instance context which is held in the SELF register. The MHDR instruction, which shifts the instance context, expects the new context pointer at FP+2. ENDM is like END, but also restores the instance context of the caller. A method frame looks as follows:
CLAB procedure-label
MHDR
... code ...
CLAB exit-label
ENDM
Passing a message m with three arguments to a global object O
O.m(1,2,3);
would be coded as follows:
NUM 1
NUM 2
NUM 3
LDGV LO
CALL Lm
CLEAN 4
Each called method may expect the following stack configuration:
FP+M | Argument #1 |
FP+4 | Argument #N-1 |
FP+3 | Argument #N |
FP+2 | Receiver's Address |
FP+1 | Return Address (saved by CALL or CALR) |
FP+0 | Old FP (saved by MHDR) |
FP-1 | Sender's Address (Old Instance Context, saved by MHDR) |
FP-2 | Local Variable #1 |
FP-3 | Local Variable #2 |
FP-J | Local Variable #K |
SP | ( Free memory below ) |
A cycle is the set of operations which is required to execute one single Tcode instruction. Each cycle consists of the following steps:
These steps are repeated until the Tcode machine is halted by executing HALT.
If an instruction modifies stack elements, first all its operands are removed from the stack, then the operation denoted by the instruction is performed, and finally the result is placed back on the stack.
This section describes the state of the Tcode machine when it is started.
Symbol | Size (bits) | Description |
---|---|---|
M N | 16 | generic numeric values |
.N .M | 16 | unsigned numeric values |
L | 16 | a label tagging a data word or a procedure. |
E | 16 | a label referencing an external procedure. |
I | 16 | a label referencing an interface procedure. |
C | 8 | an 8-bit character. |
X1...XN | var | a vector containing N elements of the type X. |
memory[X] | 16 | the content of the X'th machine word in the data array. |
memory::X | 8 | the content of the X'th byte in the data array. |
S0...SN | 16 | the N+1 elements most recently pushed onto the stack. |
Annotations
Normally, the most significant bit of each machine word is interpreted as a sign flag (1 indicates a negative number). The leading dot notation .N indicates that the MSB of N should be treated as a part of the value instead of a sign indicator.
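The two readings of a machine word can be sketched as follows. The text above only states that the most significant bit acts as a sign flag; that negative numbers are represented in two's complement is an assumption of this sketch, as are the function names.

```python
def signed(word):
    """Read a 16-bit word as N: the MSB is a sign flag (two's complement assumed)."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def unsigned(word):
    """Read a 16-bit word as .N: the MSB is part of the value."""
    return word & 0xFFFF
```

Under this assumption the bit pattern 0xFFFF denotes -1 when read as N and 65535 when read as .N.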
An address is an offset into the code or data array where the base (code or data array) is implicitly determined by the associated instruction. Addresses are 16 bits wide.
There is a relation between labels (L) and addresses (A). When a label Lx tags a specific instruction at the address Ay, then Lx and Ay are exchangeable. In the following sections, L will be used to denote the creation of or the reference to a label while the notation A will be used to denote a reference to a location tagged by a label.
External labels are used to create a connection between the name of an external procedure and a reference to such a procedure. External label IDs are 16 bits wide.
Interface labels are used to create a connection between the name of an interface procedure and a reference to such a procedure. Interface label IDs are 16 bits wide.
S0 denotes the element most recently pushed onto the stack. When popping elements from a stack holding N+1 elements, S0 will be removed first and SN will be removed last.
0x82 CLAB L - Code LABel
Define a label identified by the value L which tags a subsequent
procedure.
0x85 CREF L - Code REFerence
Define a word-size storage location holding the address of the
procedure tagged by the label L.
0x84 DATA N - DATA definition
Define a word-size storage location containing the value N.
0x87 VEC N - VECtor declaration
Define a vector with a length of N machine words and undefined
content.
0x83 DLAB L - Data LABel
Define a label identified by the value L which tags a subsequent data
object.
0x86 DREF L - Data REFerence
Define a word-size storage location holding the address of the data
object tagged by the label L.
0xCD INIT N L - INITialize
Originally used to initialize the Tcode environment - hence its name.
Each program must begin with this instruction. The argument N
specifies the Tcode version the program complies to. This document
describes version 7 of the Tcode language.
L is a code label tagging the initial entry point of the Tcode module
containing the instruction. This label may be evaluated by a Tcode
linker.
0x88 STR N C1 ... CN - define STRing
Define a vector with a length of
(N+2) / 2
machine words containing the characters C1 through CN. Each character is stored in a separate byte. All trailing bytes of the vector are filled with zeroes so that a properly terminated string is created.
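The length formula guarantees at least one terminating zero byte for strings of any length. A short sketch (the helper name `str_vector` is hypothetical) computes the vector size and its content for a given string:

```python
def str_vector(text):
    """Return (length in machine words, byte content) of a STR vector."""
    n = len(text)
    words = (n + 2) // 2                       # (N+2) / 2, integer division
    data = text.encode("ascii").ljust(words * 2, b"\0")
    return words, data

print(str_vector("Hello"))   # prints (3, b'Hello\x00')
```

A string of even length, such as "Hi", occupies two words and is followed by two zero bytes.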
0x0A END - END procedure
Remove two elements S0 and S1. Restore the context of the calling
procedure by loading FP with S0 and then perform a branch to S1. S1 is
usually a return address which has been saved by a CALL or CALR
instruction.
0x0C ENDM - END Method
First, load SELF with the value previously saved on the stack by MHDR,
thereby restoring the instance context of the sender. The sender's context
will be removed from the stack.
Then, remove two elements S0 and S1. Restore the context of the calling
procedure by loading FP with S0 and then perform a branch to S1. S1 is
usually a return address which has been saved by CALL or CALR.
0x09 HDR - HeaDeR
Push the context of the calling procedure (FP) and create a fresh
procedure context by loading FP with SP.
0x0B MHDR - Method HeaDeR
First, push the context of the calling procedure (FP) and create a fresh
procedure context by loading FP with SP.
Then, push the context of the sending method or procedure (SELF) and
shift the instance context by loading SELF with the machine word
pointed to by FP+2 (the last argument passed to the answering
method).
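How CALL, HDR, LDL, and END cooperate can be traced with a very small model. This is a sketch under simplifying assumptions, not the reference implementation: the class name `FrameModel` is made up, the stack is kept in a dictionary keyed by byte address, and the initial SP value is arbitrary.

```python
class FrameModel:
    """Minimal model of CALL, HDR, LDL, and END (hypothetical)."""

    def __init__(self, sp=0x8000):
        self.mem = {}      # byte address -> 16-bit word
        self.sp = sp       # arbitrary start value for this sketch
        self.fp = 0
        self.ip = 0

    def push(self, value):
        self.sp -= 2                      # pushing decreases SP by one word
        self.mem[self.sp] = value & 0xFFFF

    def pop(self):
        value = self.mem[self.sp]
        self.sp += 2
        return value

    def call(self, address):
        self.push(self.ip)                # save the return address
        self.ip = address

    def hdr(self):
        self.push(self.fp)                # save the caller's context
        self.fp = self.sp                 # create a fresh frame

    def ldl(self, n):
        self.push(self.mem[self.fp - n * 2])

    def end(self):
        self.fp = self.pop()              # S0: the saved context
        self.ip = self.pop()              # S1: the return address
```

After pushing three arguments and executing `call` and `hdr`, the word at FP holds the value pushed by HDR, the word at FP+1 holds the return address, and `ldl(-2)` fetches the last argument.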
0x91 CLEAN N - CLEAN up arguments
Remove N procedure arguments from the stack:
SP := SP + N*2
and then push the content of RR, the return register.
0x0E DUP - DUPlicate
Push the current top of the stack (S0), thereby duplicating it.
0x0D POP
Pop the top element S0 and load it into the return register RR.
0x90 STACK N
Move the stack pointer SP by N machine words:
SP := SP - N*2.
Moving the stack pointer `down' (N>=1) allocates space on the stack, moving it `up' (N<=-1) deallocates space. STACK is primarily used to allocate and release dynamic memory in procedures.
0x0F SWAP
Exchange the values of S0 and S1.
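The effect of the stack management instructions can be modelled on a Python list whose last element plays the role of S0. The class `StackModel` is an illustrative assumption; STACK is omitted because the list does not model SP directly.

```python
class StackModel:
    """Toy model of CLEAN, DUP, POP, and SWAP (hypothetical)."""

    def __init__(self):
        self.stack = []    # the last element is S0
        self.rr = 0        # the return register

    def push(self, value):
        self.stack.append(value & 0xFFFF)

    def dup(self):
        self.push(self.stack[-1])

    def swap(self):
        self.stack[-1], self.stack[-2] = self.stack[-2], self.stack[-1]

    def pop(self):
        self.rr = self.stack.pop()        # POP loads RR with S0

    def clean(self, n):
        if n:
            del self.stack[-n:]           # remove N arguments
        self.push(self.rr)                # then push the procedure result
```

With three arguments on the stack and a result of 99 in RR, `clean(3)` leaves exactly one element, 99, on the stack.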
0x1A ADD
Remove two elements S0 and S1 and push their sum: S1+S0.
0x1C BAND - Bitwise AND
Remove two elements S0 and S1, perform a bitwise AND on them and push
the result: S1 & S0.
0x14 BNOT - Bitwise NOT
Invert each bit of the top element: ~S0.
0x1D BOR - Bitwise OR
Remove two elements S0 and S1, perform a bitwise OR on them and push
the result: S1 | S0.
0x1F BSHL - Bitwise SHift Left
Remove two elements S0 and S1, shift the bits of S1 to the left by S0
positions and push the result: S1 << S0.
0x20 BSHR - Bitwise SHift Right
Remove two elements S0 and S1, shift the bits of S1 to the right by S0
positions and push the result: S1 >> S0.
0x1E BXOR - Bitwise eXclusive OR
Remove two elements S0 and S1, perform a bitwise XOR on them and push
the result: S1 ^ S0.
0x16 DIV - integer DIVide
Remove two elements S0 and S1, compute the (signed) integer part of
their quotient and push it: S1 / S0.
If S0=0, signal a fatal error and halt.
0xCE INCG A N - INCrement Global
Add the value N to the value of the memory cell located at the address
A. This is exactly the same as
LDG A NUM N ADD SAVG A
but more efficient.
0xCF INCI M N - INCrement Instance variable
Add the value N to the value of the memory cell whose absolute address
is SELF + M*2.
INCI M N is equal to
LDI M NUM N ADD SAVI M
but more efficient.
0xD0 INCL M N - INCrement Local
Add the value N to the value of the memory cell whose absolute address
is FP - M*2.
INCL M N is equal to
LDL M NUM N ADD SAVL M
but more efficient.
0x13 LNOT - Logical NOT
If the top element is equal to zero, replace it with -1 and otherwise
with zero: S0=0-> -1: 0.
0x19 MOD - MODulo
Remove two elements S0 and S1, compute their division remainder and
push it: S1 MOD S0. S1 MOD S0 is defined as S1 - S1./S0.*S0 where
'./' denotes an unsigned integer division and '.*' an
unsigned multiplication. If S0=0, signal a fatal error and halt.
0x15 MUL - MULtiply
Remove two elements S0 and S1, compute their (signed) product and push
it: S1 * S0. Do not perform any overflow checking.
0x12 NEG - NEGate
Negate the top element: -S0.
0x00 GLUE - GLUE (no operation)
Rest for a cycle.
0x1B SUB - SUBtract
Remove two elements S0 and S1 and push their difference: S1-S0.
0x18 UDIV - Unsigned integer DIVide
Remove two elements S0 and S1, compute the unsigned integer part of
their quotient and push it: .S1 / .S0.
If S0=0, signal a fatal error and halt.
0x17 UMUL - Unsigned MULtiply
Remove two elements S0 and S1, compute their unsigned product and push
it: .S1 * .S0. Do not perform any overflow checking.
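The unsigned operations and the definition of MOD can be checked with a few lines of Python. The sketch assumes that results are truncated to 16 bits, which the text does not state explicitly for ADD and SUB, and it raises an exception where the Tcode machine would signal a fatal error and halt.

```python
MASK = 0xFFFF

def add(s1, s0):
    return (s1 + s0) & MASK               # assumption: wraps around at 16 bits

def sub(s1, s0):
    return (s1 - s0) & MASK               # assumption: wraps around at 16 bits

def udiv(s1, s0):
    if s0 & MASK == 0:
        raise ZeroDivisionError("the Tcode machine would halt here")
    return (s1 & MASK) // (s0 & MASK)     # .S1 / .S0

def umul(s1, s0):
    return ((s1 & MASK) * (s0 & MASK)) & MASK   # no overflow checking

def mod(s1, s0):
    # S1 MOD S0 = S1 - S1./S0.*S0
    return sub(s1, umul(udiv(s1, s0), s0))
```

Because MOD is defined through unsigned division, `mod(0xFFFF, 2)` yields 1 here: the operand is treated as 65535, not as -1.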
0x21 EQU - EQUal
Remove two elements S0 and S1. Push true, if they are equal and
otherwise false: S1=S0-> -1: 0.
0x24 GRTR - GReaTeR than
Remove two elements S0 and S1. Push true, if S1 is greater than S0 and
otherwise false: S1>S0-> -1: 0.
S0 and S1 are both signed.
0x26 GTEQ - Greater Than or EQual to
Remove two elements S0 and S1. Push true, if S1 is greater than or
equal to S0 and otherwise false: S1>=S0-> -1: 0.
S0 and S1 are both signed.
0x23 LESS - LESS than
Remove two elements S0 and S1. Push true, if S1 is less than S0 and
otherwise false: S1<S0-> -1: 0.
S0 and S1 are both signed.
0x25 LTEQ - Less Than or EQual to
Remove two elements S0 and S1. Push true, if S1 is less than or equal
to S0 and otherwise false: S1<=S0-> -1: 0.
S0 and S1 are both signed.
0x22 NEQU - Not EQUal
Remove two elements S0 and S1. Push true, if they are not equal and
otherwise false: S1\=S0-> -1: 0.
0x28 UGRTR - Unsigned GReaTeR than
Remove two elements S0 and S1. Push true, if .S1 is greater than .S0
and otherwise false: .S1>.S0-> -1: 0.
0x2A UGTEQ - Unsigned Greater Than or EQual to
Remove two elements S0 and S1. Push true, if .S1 is greater than or
equal to .S0 and otherwise false: .S1>=.S0-> -1: 0.
0x27 ULESS - Unsigned LESS than
Remove two elements S0 and S1. Push true, if .S1 is less than .S0 and
otherwise false: .S1<.S0-> -1: 0.
0x29 ULTEQ - Unsigned Less Than or EQual to
Remove two elements S0 and S1. Push true, if .S1 is less than or equal
to .S0 and otherwise false: .S1<=.S0-> -1: 0.
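The difference between the signed and the unsigned predicates shows up as soon as an operand has its most significant bit set. The following sketch assumes two's complement for signed words; the function names are hypothetical.

```python
TRUE, FALSE = 0xFFFF, 0     # -1 and 0 as 16-bit words

def signed(word):
    """Read a 16-bit word as a signed number (two's complement assumed)."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def less(s1, s0):
    """LESS: signed comparison S1 < S0."""
    return TRUE if signed(s1) < signed(s0) else FALSE

def uless(s1, s0):
    """ULESS: unsigned comparison .S1 < .S0."""
    return TRUE if (s1 & 0xFFFF) < (s0 & 0xFFFF) else FALSE
```

For S1=0xFFFF and S0=1, `less` returns true (-1 is less than 1) while `uless` returns false (65535 is not less than 1).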
0x34 DEREF - DEREFerence
Remove two values S0 and S1, load a machine word from memory[S1/2+S0],
zero-extend it, and push it. The NORM instruction is implied. It converts
a pointer (S1) and an offset (S0) into a pointer.
0x35 DREFB - DeREFerence Byte
Remove two values S0 and S1, load a single byte from memory::(S1+S0),
zero-extend it, and push it. The NORMB instruction is implied. It converts
a pointer (S1) and an offset (S0) into a pointer.
0xAB LDG A - LoaD Global
Push the value stored in the memory cell with the address A:
memory[A/2].
0xAC LDGV A - LoaD Global Vector
Push the address A.
0xAF LDI N - LoaD Instance variable
Push the value located at memory[(SELF/2)+N]. This instruction is used
to load the content of an instance variable. N specifies the offset of
the variable relative to the beginning of the object's data area.
0xB0 LDIV N - LoaD Instance Vector
Push the address SELF+N*2. This instruction is used to load the
address of an instance variable. N specifies the offset of the
variable relative to the beginning of the data area of the currently
addressed object.
0xAD LDL N - LoaD Local
Push the value stored at the N'th position `below' the stack frame
base. The address of the cell is computed as follows:
FP - N*2
Consequently, negative values of N may be used to access locations `above' the frame base.
0xB1 LDLAB A - LoaD LABel
Push the address A. This instruction is similar to LDGV, but more general.
It may also be used to reference procedures.
0xAE LDLV N - LoaD Local Vector
Push the absolute address of a local object. N specifies the offset of
the object relative to the stack frame base. The absolute address is
computed using the formula FP - N*2.
0x36 NORM - NORMalize reference
Remove two elements S0 and S1 and compute the absolute address of the
S0'th member of the vector pointed to by S1: S1 + S0*2. Push the
computed address.
This instruction converts a pointer plus a machine word offset into a
pure pointer which references the same location.
0x37 NORMB - NORMalize Byte reference
Remove two elements S0 and S1 and compute the absolute address of the
S0'th byte of the vector pointed to by S1 (S0+S1). Push the computed
address.
This instruction converts a pointer plus a byte offset into a pure
pointer which references the same location.
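The address arithmetic of NORM and NORMB, and the way DEREF and DREFB build on it, can be sketched against a Python `bytearray` standing in for the data array. Storing words in little endian order is an assumption of this sketch, since an implementation is free to use another format.

```python
def norm(s1, s0):
    """NORM: address of the S0'th word of the vector at S1."""
    return (s1 + s0 * 2) & 0xFFFF

def normb(s1, s0):
    """NORMB: address of the S0'th byte of the vector at S1."""
    return (s1 + s0) & 0xFFFF

def deref(data, s1, s0):
    """DEREF: load the word at memory[S1/2+S0] (little endian assumed)."""
    address = norm(s1, s0)
    return data[address] | (data[address + 1] << 8)

def drefb(data, s1, s0):
    """DREFB: load the byte at memory::(S1+S0)."""
    return data[normb(s1, s0)]
```

With the words 0x1234 and 0x5678 stored at byte address 4, `deref(data, 4, 1)` selects the second word while `drefb(data, 4, 1)` selects the second byte.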
0xB2 NUM N - load NUMber
Push the value N.
0xB8 SAVG A - SAVe Global
Pop one element and save it in the memory cell with the address A:
memory[A/2] := S0.
0xBA SAVI N - SAVe Instance variable
Pop one element and save it in the memory cell
memory[(SELF/2)+N]
This instruction is used to alter the state of an instance variable. N specifies the offset of the variable relative to the beginning of the data area of the currently addressed object.
0xB9 SAVL N - SAVe Local
Pop one element and save it in the storage cell with the address
FP - N*2. See also LDL.
0x33 SELF - SELF reference
Push the content of the SELF register onto the stack.
0x3C STORB - STORe Byte
Pop two elements S0 and S1 and store the least significant 8 bits of
S0 in the byte pointed to by S1: memory::S1 := S0.
0x3B STORE
Pop two elements S0 and S1 and store the value S0 in the
memory cell pointed to by S1:
memory[S1/2] := S0.
0xBD BRF A - BRanch on False
Remove the element S0 and branch to the address A, if S0 is false.
0xBE BRT A - BRanch on True
Remove the element S0 and branch to the address A, if S0 is true.
0xC5 CALL A - procedure CALL
Push the current value of the Instruction Pointer and then perform
a branch to the address A.
0x46 CALR - CALl through Register
Push the current value of the Instruction Pointer and then remove one
element and perform a branch to the location it points to. The
destination is implicitly located in the code array of the program
(branches to the data array cannot be done).
0xC3 DNEXT A 0 - Downward NEXT
Remove two elements S0 and S1 and branch to the address A, if
S1 <= S0. (*)
0xC8 SYS N - SYStem call
Execute the system procedure associated with the index value N. The
index value is removed and a call-dependent number of arguments is
passed to the respective system procedure. The system procedure may
return a machine word size return value in the Return Register. The
stack must be cleaned up using CLEAN after calling a system procedure.
System procedures are implemented as methods of the T3X core class.
Therefore, they should only be invoked by sending a message to an
instance of the T3X class.
The exact semantics of SYS depend on the called procedure.
0xCA ICALL N - Interface CALL
Call the interface procedure located at slot N. Slot values are
dynamically generated using IPROC, IREF, and ICALX. The interface
procedure may return a machine word size return value in the Return
Register. The stack must be cleaned up using CLEAN after calling an
interface procedure. Interface procedures are normally implemented in
languages other than T3X (such as C or assembly language). They are
used to extend the T3X runtime environment.
The exact semantics of ICALL depend on the called procedure.
0xC4 HALT N
Instantly halt the Tcode machine. The least significant
eight bits of N will be delivered to the invoker of the program as a
return code.
0xC1 JUMP A
Unconditionally jump to the address A.
0xBF NBRF A - Nondestructive BRanch on False
Branch to the address A, if S0 is false. Do not remove S0.
0xC0 NBRT A - Nondestructive BRanch on True
Branch to the address A, if S0 is true. Do not remove S0.
0xC2 UNEXT A - Upward NEXT
Remove two elements S0 and S1 and branch to the address A, if S1 >=
S0. (*)
(*) The instructions UNEXT and DNEXT have been designed for use in counting loops (FOR-NEXT loops in BASIC). The idea is as follows: At the end of the loop, the current loop index and the loop limit are both pushed onto the stack. UNEXT compares the values and branches out of the loop, if index>=limit. DNEXT branches, if index<=limit. Therefore, UNEXT is used in upward counting loops and DNEXT is used in countdown loops.
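The exit conditions of UNEXT and DNEXT can be tried out in a short simulation. Two points are assumptions of this sketch rather than statements of the text: the comparison is taken to be signed, and the test is made before each pass through the loop body.

```python
def signed(word):
    """Read a 16-bit word as a signed number (two's complement assumed)."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def unext_exits(index, limit):
    """UNEXT leaves the loop if index >= limit."""
    return signed(index) >= signed(limit)

def dnext_exits(index, limit):
    """DNEXT leaves the loop if index <= limit."""
    return signed(index) <= signed(limit)

def count_up(start, limit):
    """Index values visited by an upward counting loop."""
    visited, index = [], start
    while not unext_exits(index, limit):
        visited.append(index)
        index += 1
    return visited

def count_down(start, limit):
    """Index values visited by a countdown loop."""
    visited, index = [], start
    while not dnext_exits(index, limit):
        visited.append(index)
        index -= 1
    return visited
```

In this model an upward loop from 0 to 3 visits 0, 1, and 2, and a countdown loop from 3 to 0 visits 3, 2, and 1; the limit itself is never reached.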
0xC7 CALX E - CALl eXternal procedure
Call a procedure contained in a different module. There must be an
EXT record defining the external label E in the same module and a PUB
record with the same name in another module, so that the external
reference can be resolved.
0xD5 CMAP N M - Call MAP
Call maps describe the arguments of interface procedures. The operands
of CMAP are identical to the parameters of interface declarations
(see IDECL for details).
0xD2 EXT E N C1 ... CN - EXTernal reference
Create an external reference to the symbol represented by the
characters C1 through CN. E is a so-called external label. Such labels
are used to reference external symbols in CALX instructions. See the
section on
loading Tcode
for details.
Note: When interpreting a Tcode program, this instruction may
lead to an error (unresolved external reference).
0xCB ICALX I - Interface CALl of eXternal procedure
Call the interface procedure described by the interface label I. There
must be an IREF record defining I in the same module. An IPROC record
with the same name as the IREF record must be supplied to resolve the
reference.
0xC9 ILIB N C1 ... CN - Interface LIBrary
Name the relocatable object file or library containing the interface
procedures described by the IPROC record in this module. This name
can be used for linking the appropriate libraries when translating
Tcode to native code or to dynamically load extensions into the
Tcode machine.
0xD3 IPROC N M C1 ... CM - Interface PROCedure
Assign the interface procedure named by the characters C1 through CM
to the interface slot N. IPROC records are also used to resolve ICALX
instructions. See the section on
resolving interface references
for details.
0xD4 IREF I N C1 ... CN - Interface REFerence
Create an interface reference to the symbol represented by the
characters C1 through CN. I is a so-called interface label. Such labels
are used to reference interface procedures in ICALX instructions. See
the section on
resolving interface references
for details.
Note: When interpreting a Tcode program, this instruction may
lead to an error (unresolved interface reference).
0xD1 PUB L N C1 ... CN - PUBlic reference
Signal the Tcode linker that the procedure tagged by the label L is
public and can be referenced externally using the name formed by the
characters C1 through CN. Again, see the section on
loading Tcode
for details.
When interpreting Tcode programs, this instruction type may be ignored
safely.
0xD6 GSYM L N C1...CN - Global SYMbol
Name a global symbol. C1 through CN contain the characters of the
symbol name. L is the ID of the label which marks the named
symbol. GSYM instructions should be generated for global variable
names.
0xD8 ISYM M N C1...CN - Instance SYMbol
Name an instance variable. C1 through CN contain the characters
of the symbol name. M is the offset in machine words (into the class's
data space) of the data object named by the symbol.
0xC6 LINE N - LINE number
Indicate that the following instructions have been generated from line
N of the source program from which the containing Tcode program was
created.
0xD7 LSYM M N C1...CN - Local SYMbol
Name a local symbol. C1 through CN contain the characters of
the symbol name. M is a signed number holding the position of the
variable relative to the Frame Pointer.
0x81 HINT N - pass a HINT
Pass some meta information to later stages of the compiler like the
optimizer or the code generator. The information itself is encoded
in a single machine word. Hint instructions are null operations when
executed by the Tcode machine. When a program (like a code generator)
detects a HINT in a context where it does not expect one, it may
safely ignore it.
The following hints are currently generated by the T3X translator (version 7.0.x):
HINT Context | Contained Information |
---|---|
HDR HINT N | The number of formal arguments of the procedure being defined. (*) |
MHDR HINT N | The number of formal arguments of the method being defined plus 1. (In terms of Tcode, the new instance context passed to the method is also an argument. Therefore, N is one higher than the number of arguments defined in the original T3X program.) (*) |
(*) These values are particularly useful when translating Tcode back into a high level language, since they make it possible to create a procedure header with the proper number of formal arguments without having to scan the entire procedure body.
The Tcode definition provides a set of instructions for binding together modules which were separately compiled to Tcode. External references are limited to procedure calls. This means that a module can call procedures defined in an external module, but it cannot access an external module's data.
To call an external procedure, the label which tags the entry point of the routine must be declared public (using a PUB instruction) in the module containing the called routine. In the module of the caller, it must be declared extern (using EXT). The 'extern' declaration creates a so-called external label which may be referenced by calls to external procedures (CALX instructions).
The instruction PUB provides a symbolic name for a procedure. This symbolic name may be referenced by EXT in a different module. CALX is used to reference an EXT instruction defined in the same module. An external reference is resolved in four steps:
The following figure illustrates the principle of external references.
+----------------------------+    +----------------------------+
|                            |    |                            |
|  ,--> EXT E 4 name >===============> PUB L 4 name            |
|  |                         |    |    CLAB L HDR ... END      |
|  '-------.-.-----------,   |    |                            |
|          | |           |   |    |                            |
| CALX E >-' | CALX E >--'   |    |                            |
|            |               |    |                            |
|  CALX E >--'               |    |                            |
|                            |    |                            |
+----------------------------+    +----------------------------+
       Caller's Module                   Callee's Module
Fig.6 External References
Since labels are represented by integers in Tcode, label collisions will occur when binding two (or more) Tcode modules together. Therefore, labels must be renamed in this case: When a module A already has been loaded and a module B is to be loaded, the highest label ID used in A should be added to each (non-external) label in B.
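The renaming rule can be expressed in one line of Python. The function name and the use of a dictionary to record the mapping are choices made for this sketch; external labels are assumed to have been filtered out beforehand.

```python
def relocate_labels(labels_in_b, highest_in_a):
    """Map each (non-external) label of module B to its new ID.

    The highest label ID used in the already loaded module A is added
    to every label of B, so the two ranges cannot overlap.
    """
    return {label: label + highest_in_a for label in labels_in_b}

print(relocate_labels([1, 2, 3], 10))   # prints {1: 11, 2: 12, 3: 13}
```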
Two or more EXT records with the same name may exist, because the same symbol may be associated with different external labels in different modules.
The existence of two PUB records with the same name is considered an error (redefinition error).
There must be a matching PUB record for each EXT record. Otherwise, an error is signalled (unresolved external).
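The three rules above translate into a simple resolution pass. The following is a sketch of how a linker might apply them, not a description of the actual T3X linker; the record representation (tuples of names and labels) and the function name are assumptions.

```python
def resolve_externals(pub_records, ext_records):
    """Match EXT records against PUB records.

    pub_records: list of (name, label) pairs
    ext_records: list of (external_label, name) pairs
    Returns a mapping from external labels to procedure labels.
    """
    public = {}
    for name, label in pub_records:
        if name in public:
            raise ValueError("redefinition error: " + name)
        public[name] = label

    resolved = {}
    for external_label, name in ext_records:
        # Several EXT records may share one name; each gets the same target.
        if name not in public:
            raise ValueError("unresolved external: " + name)
        resolved[external_label] = public[name]
    return resolved
```

Two EXT records naming the same symbol both resolve to the label of the single matching PUB record, while a duplicate PUB name or a missing PUB record raises an error.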
Interface references are references to procedures which are not located in the code area of the Tcode machine. Modules can provide interfaces by exporting IPROC records and request interfaces using IREF records. ICALX instructions are used to call unresolved interfaces and ICALL instructions are used to call resolved interfaces.
Notice that resolving an interface means assigning a unique slot number to the name of an interface procedure. It is the responsibility of the Tcode machine to place the correct procedure in this slot. IPROC records are used to assign names to slot numbers.
Since multiple modules may export IPROC records and each module starts numbering interfaces at one, the Tcode loader must relocate IPROCs by assigning unique slot numbers to them.
IPROC records and corresponding IREF records may be located in the same module. In any case, though, the IPROC record must precede the IREF record and the IREF record must precede any ICALX instructions referencing the IREF record. In this section, IPROC and IREF are assumed to be in different modules.
IPROC records provide a symbolic name for an interface procedure. This symbolic name may be referenced by IREF records. ICALX instructions are used to reference IREF records defined in the same module. An interface reference is resolved in four steps:
The following figure illustrates the principle of interface references.
       Caller's Module                   Callee's Module
+----------------------------+    +----------------------------+
|                            |    |                            |
|  ,--> IREF I 4 name >==============> IPROC L 4 name          |
|  |                         |    |          \/                |
|  '--------.-.-----------,  |    |          ||                |
|           | |           |  |    |          ||                |
| ICALX I >-' | ICALX I >-'  |    |          ||                |
|             |              |    |          ||                |
|   ICALX I >-'              |    |          ||                |
|                            |    |          ||                |
+----------------------------+    +----------||----------------+
                                             ||
    Tcode Machine or external object         ||
+--------------------------------------------||----------------+
|                                            ||                |
|  Interface procedure slot #0               ||                |
|  ...                                       ||                |
|  Interface procedure slot #L <=============='                |
|  ...                                                         |
|                                                              |
+--------------------------------------------------------------+
Fig.7 Interface References
The existence of two IPROC records with the same name is considered an error (redefinition error).
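The relocation and resolution described in this section can be modelled in a few lines. The following is only a sketch in Python, not the actual Tcode loader: the module layout (dictionaries with `iprocs`, `irefs`, and `code` entries), the function names, and the interface names `open` and `draw` are hypothetical. It assumes that global slots are numbered from zero, as suggested by Fig.7.

```python
class RedefinitionError(Exception):
    """Raised when two IPROC records carry the same name."""


def relocate_iprocs(modules):
    """Assign one unique global slot to each exported interface name.

    Every module numbers its own IPROC records starting at one, so the
    local numbers are ignored and replaced by loader-wide slot numbers.
    """
    slots = {}
    for module in modules:
        for _local_slot, name in module["iprocs"]:
            if name in slots:
                raise RedefinitionError(name)
            slots[name] = len(slots)  # next free global slot
    return slots


def resolve_module(module, slots):
    """Rewrite unresolved ICALX instructions into resolved ICALL ones."""
    irefs = dict(module["irefs"])  # local IREF id -> interface name
    resolved = []
    for opcode, arg in module["code"]:
        if opcode == "ICALX":
            resolved.append(("ICALL", slots[irefs[arg]]))
        else:
            resolved.append((opcode, arg))
    return resolved


# Two exporting modules, both numbering their interfaces from one.
callee_a = {"iprocs": [(1, "open")], "irefs": [], "code": []}
callee_b = {"iprocs": [(1, "draw")], "irefs": [], "code": []}
caller = {
    "iprocs": [],
    "irefs": [(1, "draw")],
    "code": [("ICALX", 1), ("POP", None)],
}

slots = relocate_iprocs([callee_a, callee_b, caller])
resolved = resolve_module(caller, slots)
print(slots)
print(resolved)
```

Loading a second module that also exports `draw` would raise `RedefinitionError` in this model, which corresponds to the redefinition error mentioned above.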
Basically, the syntax of the T3X language is described in a BNF-style format similar to the one accepted by the YACC parser generator. A more detailed description follows.
The T3X grammar is described as a set of rules of the following format:
Name: Pattern1 | Pattern2 | ... | PatternN ;
It reads 'Name may also be written as Pattern1 OR Pattern2 OR ... OR PatternN'. Each pattern may consist of names of rules or terminal symbols. Each terminal symbol is enclosed in apostrophes, like '=', 'CONST', or '0x'. An apostrophe may be included in a terminal (symbol) by doubling it. Consequently, a terminal represented by an apostrophe is written ''''.
An example: The rule
BinaryDigit: '0' | '1' ;
is read 'A BinaryDigit may be represented by either the string '0' or the string '1''. A (recursive) rule to define arbitrary-length binary numbers based upon BinaryDigit would look like this:
BinaryNumber: BinaryDigit | BinaryDigit BinaryNumber ;
In this case, a BinaryNumber would be either a single BinaryDigit or a BinaryDigit followed by another BinaryNumber (and therefore more BinaryDigits).
The hash symbol (#) is used to indicate that no white space is allowed between the elements of a pattern. While the above rule would match both of the following strings
a rule containing the concatenation symbol (#) would only match the latter version. Such a rule would be written:
BinaryNumber: BinaryDigit | BinaryDigit # BinaryNumber ;
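The difference between the two forms of the BinaryNumber rule can be illustrated with a small recursive recognizer. This is a sketch in Python and not part of the T3X tool chain; the function name `is_binary_number`, its `concatenated` flag, and the sample strings `1 0 1` and `101` are assumptions made for this illustration.

```python
def is_binary_number(text, concatenated):
    """Recognize BinaryNumber: BinaryDigit | BinaryDigit [#] BinaryNumber.

    With concatenated=True the rule behaves as if written with '#',
    so no white space may separate the digits.
    """
    if text[:1] not in ("0", "1"):
        return False            # a BinaryDigit must come first
    rest = text[1:]
    if rest == "":
        return True             # first alternative: a single BinaryDigit
    if not concatenated:
        rest = rest.lstrip(" \t\n")  # white space between elements is fine
    return is_binary_number(rest, concatenated)


print(is_binary_number("1 0 1", concatenated=False))
print(is_binary_number("101", concatenated=False))
print(is_binary_number("1 0 1", concatenated=True))
print(is_binary_number("101", concatenated=True))
```

Only the call that combines the spaced input with `concatenated=True` is rejected, because the white space is not a BinaryDigit.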
Ellipses (...) are used to represent obvious parts of sequences in patterns. For example, the following two patterns are equal:
A special rule named <character>, which is not defined inside of the formal grammar, is used to refer to an arbitrary character contained in the character set of the implementation environment.
Program: DeclList CompoundStmt ;
DeclList: Declaration | Declaration DeclList ;
Declaration:
      'VAR' VarDeclList ';'
    | 'CONST' ConstDeclList ';'
    | 'DECL' ProtoDeclList ';'
    | 'STRUCT' Symbol '=' StructMemList ';'
    | ProcDecl
    | ClassDecl
    | 'PUBLIC' ClassDecl
    | 'OBJECT' ObjDeclList ';'
    | 'MODULE' Symbol '(' ModList ')' ';'
    | 'MODULE' Symbol '(' ')' ';'
    | 'INTERFACE' IfaceDeclList ';'
    ;
VarDeclList: VarDecl | VarDeclList ',' VarDecl ;
VarDecl: Symbol | Symbol '[' ConstValue ']' | Symbol '::' ConstValue ;
ConstDeclList: Symbol '=' ConstValue | Symbol '=' ConstValue ',' ConstDeclList ;
ModList: Symbol | Symbol ',' ModList ;
StructMemList: Symbol | Symbol ',' StructMemList ;
ClassDecl:
      'CLASS' Symbol '(' ModList ')' InstDeclList 'END'
    | 'CLASS' Symbol '(' ')' InstDeclList 'END'
    | 'ICLASS' Symbol '(' String ')' IClassInstDeclList 'END'
    ;
IClassInstDeclList: IClassInstDecl | IClassInstDecl IClassInstDeclList ;
InstDeclList: InstDecl | InstDecl InstDeclList ;
IClassInstDecl: InterfaceDecl | InstDecl ;
InstDecl:
      'VAR' VarDeclList ';'
    | 'CONST' ConstDeclList ';'
    | 'DECL' ProtoDeclList ';'
    | 'INTERFACE' IfaceDeclList ';'
    | 'STRUCT' Symbol '=' StructMemList ';'
    | ProcDecl
    | 'OBJECT' ObjDeclList ';'
    | 'PUBLIC' ProcDecl
    | 'PUBLIC' 'CONST' ConstDeclList ';'
    | 'PUBLIC' 'STRUCT' Symbol '=' StructMemList ';'
    ;
InterfaceDecl: 'IDECL' InfDeclList ;
InfDeclList: InfDecl | InfDecl ',' InfDeclList ;
InfDecl: Symbol '(' ConstValue ',' ConstValue ')' ;
ObjDeclList: Symbol '[' Symbol ']' | Symbol '[' Symbol ']' ',' ObjDeclList ;
ProtoDeclList: ProtoDecl | ProtoDecl ',' ProtoDeclList ;
ProtoDecl: Symbol '(' ConstValue ')' ;
ProcDecl: Symbol '(' ArgumentList ')' Statement | Symbol '(' ')' Statement ;
ArgumentList: Symbol | Symbol ',' ArgumentList ;
IfaceDeclList: IfaceDecl | IfaceDecl ',' IfaceDeclList ;
IfaceDecl: ProtoDecl | ProtoDecl '=' ConstValue ;
Statement:
      CompoundStmt
    | Symbol ':=' Expression ';'
    | Symbol Subscripts ':=' Expression ';'
    | ProcedureCall
    | 'CALL' ProcedureCall ';'
    | Symbol '.' ProcedureCall ';'
    | 'SEND' '(' Symbol ',' Symbol ',' ProcedureCall ')' ';'
    | 'IF' '(' Expression ')' Statement
    | 'IE' '(' Expression ')' Statement 'ELSE' Statement
    | 'WHILE' '(' Expression ')' Statement
    | 'FOR' '(' Symbol '=' Expression ',' Expression ')' Statement
    | 'FOR' '(' Symbol '=' Expression ',' Expression ',' ConstValue ')' Statement
    | 'LEAVE' ';'
    | 'LOOP' ';'
    | 'RETURN' ';'
    | 'RETURN' Expression ';'
    | 'HALT' ';'
    | 'HALT' ConstValue ';'
    | ';'
    ;
CompoundStmt:
      'DO' 'END'
    | 'DO' LocalDeclList 'END'
    | 'DO' StatementList 'END'
    | 'DO' LocalDeclList StatementList 'END'
    ;
LocalDeclList: LocalDecl | LocalDecl LocalDeclList ;
LocalDecl:
      'VAR' VarDeclList ';'
    | 'CONST' ConstDeclList ';'
    | 'STRUCT' Symbol '=' StructMemList ';'
    | 'OBJECT' ObjDeclList ';'
    ;
StatementList: Statement | Statement StatementList ;
ExprList: Expression | Expression ',' ExprList ;
Expression: Disjunction | Disjunction '->' Expression ':' Expression ;
Disjunction: Conjunction | Disjunction '\/' Conjunction ;
Conjunction: Equation | Conjunction '/\' Equation ;
Equation: Relation | Equation '=' Relation | Equation '\=' Relation ;
Relation:
      BitOperation
    | Relation '<' BitOperation
    | Relation '>' BitOperation
    | Relation '<=' BitOperation
    | Relation '>=' BitOperation
    | Relation '.<' BitOperation
    | Relation '.>' BitOperation
    | Relation '.<=' BitOperation
    | Relation '.>=' BitOperation
    ;
BitOperation:
      Sum
    | BitOperation '&' Sum
    | BitOperation '|' Sum
    | BitOperation '^' Sum
    | BitOperation '<<' Sum
    | BitOperation '>>' Sum
    ;
Sum: Term | Sum '+' Term | Sum '-' Term ;
Term:
      Factor
    | Term '*' Factor
    | Term '/' Factor
    | Term '.*' Factor
    | Term './' Factor
    | Term 'MOD' Factor
    ;
Factor:
      Number
    | String
    | Table
    | 'PACKED' PackedTable
    | Symbol
    | Symbol Subscripts
    | Symbol '.' Symbol
    | ProcedureCall
    | 'CALL' ProcedureCall
    | Symbol '.' ProcedureCall
    | 'SEND' '(' Symbol ',' Symbol ',' ProcedureCall ')'
    | '@' Symbol
    | '-' Factor
    | '\' Factor
    | '~' Factor
    | '(' Expression ')'
    ;
Subscripts: '[' Expression ']' | '::' Factor | '[' Expression ']' Subscripts ;
Table: '[' MemberList ']' ;
MemberList: TableMember | TableMember ',' MemberList ;
TableMember:
      ConstValue
    | String
    | Table
    | 'PACKED' PackedTable
    | '@' Symbol
    | '(' Expression ')'
    ;
PackedTable: '[' PackedTableMembers ']' ;
PackedTableMembers: PackedTableMember | PackedTableMember ',' PackedTableMembers ;
PackedTableMember: Symbol | Number ;
ProcedureCall: Symbol '(' ')' | Symbol '(' ExprList ')' ;
ConstValue:
      SimpleConst
    | ConstValue '+' SimpleConst
    | ConstValue '*' SimpleConst
    | ConstValue '|' SimpleConst
    ;
SimpleConst: Symbol | Number | '-' SimpleConst | '~' SimpleConst ;
Number:
      DecimalNumber
    | '0x' # HexNumber
    | '0X' # HexNumber
    | '0b' # BinaryNumber
    | '0B' # BinaryNumber
    | '%' # DecimalNumber
    | '%' # '0x' # HexNumber
    | '%' # '0X' # HexNumber
    | '''' # AnyChar # ''''
    ;
BinaryNumber: BinaryDigit | BinaryDigit # BinaryNumber ;
DecimalNumber: DecimalDigit | DecimalDigit # DecimalNumber ;
HexNumber: HexDigit | HexDigit # HexNumber ;
Symbol: Letter | Letter # SymbolChars ;
SymbolChars: Letter | DecimalDigit | Letter # SymbolChars | DecimalDigit # SymbolChars ;
Letter: '_' | 'A' | 'B' | ... | 'Y' | 'Z' | 'a' | 'b' | ... | 'y' | 'z' ;
HexDigit: DecimalDigit | 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'a' | 'b' | 'c' | 'd' | 'e' | 'f' ;
DecimalDigit: '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ;
BinaryDigit: '0' | '1' ;
String: '"' # StringChars # '"' ;
StringChars: AnyChar | AnyChar # StringChars ;
AnyChar: <character> | '\' # <character> ;
Statement | Description |
---|---|
VAR name, ... ; | Define atomic variables |
VAR name[cexpr], ... ; | Define vectors, size = cexpr |
VAR name::cexpr, ... ; | Define byte vectors, size = at least cexpr characters |
VAR name[structname], ...; | Define structured vectors |
CONST name = cexpr, ... ; | Define constants |
PUBLIC CONST ... | Define class constants |
STRUCT sname = m1, ... mN ; | Define structure sname with members m1...mN |
PUBLIC STRUCT ... | Define class structures |
CLASS cname(req-module, ...) class-declarations END | Define class cname and its dependencies |
PUBLIC CLASS ... | Define a public class |
OBJECT name[cname], ... ; | Define instance of class cname |
DECL name(cexpr), ... ; | Declare procedures (type = cexpr) |
pname(a1, ..., aN) stmt | Define procedure pname with arguments a1...aN and body stmt |
PUBLIC pname(a1, ..., aN) stmt | Define a public procedure (method) |
INTERFACE name(cexpr) = slot, name(cexpr), ... | Define interface procedures (obsolete, do not use) |
MODULE mname(req-module, ...) | Name module mname and define dependencies |
Statement | Description |
---|---|
lvalue := expr; | Assign the value of expr to lvalue |
IF (expr) stmt | Run stmt, if expr is true |
IE (expr) stmt-T ELSE stmt-F | Run stmt-T, if expr is true and stmt-F, if expr is false |
WHILE (expr) stmt | Run (and repeat) stmt while expr is true |
FOR (var=start, limit, cexpr) stmt | Count from start to limit-cexpr using cexpr increments. If cexpr is negative, count to limit+cexpr. Run stmt in each step. start and limit are expressions. |
FOR (var=start, limit) stmt | Short form of FOR (var=start, limit, 1) ... |
LEAVE; | Leave the innermost WHILE or FOR loop |
LOOP; | Restart the innermost WHILE or FOR loop. In FOR loops, go to the increment step. |
RETURN expr; | Leave the current procedure and return expr to the calling procedure. |
RETURN; | Short form of RETURN 0; |
HALT cexpr; | Terminate program, deliver exit code cexpr |
HALT; | Short form of HALT 0; |
pname(a1, ..., aN); | Call procedure pname with arguments a1...aN |
CALL ptr(a1, ..., aN); | Indirect procedure call through ptr |
SEND(ptr, cname, mname(a1, ..., aN)); | Send message mname to the object of class cname pointed to by ptr |
DO decls ...; stmts ... END | Compound statement. Declarations (decls) and statements (stmts) are both optional. |
; | Empty statement |
Operator | Prec | Assoc | Description |
---|---|---|---|
(expr) | 0 | - | Give a (sub)expression the highest precedence |
pname(a1, ..., aN) | 0 | L | The value of applying procedure pname to the arguments a1 through aN |
vname[expr] | 0 | L | The expr'th element of the vector vname. [expr] may be repeated |
vname::expr | 0 | R | The expr'th byte of the vector vname |
@lvalue | 1 | R | The address of lvalue |
~X | 2 | R | The bitwise complement of X |
\X | 2 | R | The logical complement of X |
-X | 2 | R | The negative value of X |
X * Y | 3 | L | The product of X and Y |
X / Y | 3 | L | The quotient of X and Y |
X MOD Y | 3 | L | The division remainder of .X and .Y |
X .* Y | 3 | L | The unsigned product of .X and .Y |
X ./ Y | 3 | L | The unsigned quotient of .X and .Y |
X + Y | 4 | L | The sum of X and Y |
X - Y | 4 | L | The difference between X and Y |
X & Y | 5 | L | The bitwise logical product X AND Y |
X \| Y | 5 | L | The bitwise logical sum X OR Y |
X ^ Y | 5 | L | The bitwise logical negative equivalence X XOR Y |
X << Y | 5 | L | X shifted to the left by Y bits |
X >> Y | 5 | L | X shifted to the right by Y bits |
X < Y | 6 | L | True, if X is less than Y |
X <= Y | 6 | L | True, if X is less than or equal to Y |
X > Y | 6 | L | True, if X is greater than Y |
X >= Y | 6 | L | True, if X is greater than or equal to Y |
X .< Y | 6 | L | True, if .X is less than .Y |
X .<= Y | 6 | L | True, if .X is less than or equal to .Y |
X .> Y | 6 | L | True, if .X is greater than .Y |
X .>= Y | 6 | L | True, if .X is greater than or equal to .Y |
X = Y | 7 | L | True, if X is equal to Y |
X \= Y | 7 | L | True, if X is not equal to Y |
X /\ Y | 8 | L | Short circuit logical AND: 0 if X=0; Y, if X\=0 |
X \/ Y | 9 | L | Short circuit logical OR: X if X\=0; Y, if X=0 |
X -> Y : Z | 10 | L | Conditional expression: Y, if X\=0; Z, if X=0 |
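The difference between the signed and the dotted (unsigned) operators, and the values delivered by the short-circuit operators, can be modelled as follows. This is a sketch in Python under stated assumptions: a 32-bit machine word (the real size is reported by T.BPW()), Python booleans standing in for T3X truth values, and hypothetical helper names. The right-hand operands of the short-circuit operators are passed as functions so that they are evaluated only when needed.

```python
WORD_BITS = 32                     # assumed word size
WORD_MASK = (1 << WORD_BITS) - 1


def unsigned(x):
    # Reinterpret a signed word as unsigned (written .X in the table).
    return x & WORD_MASK


def uless(x, y):
    # Model of X .< Y
    return unsigned(x) < unsigned(y)


def land(x, y):
    # Model of X /\ Y: 0 if X=0; Y, if X\=0
    return 0 if x == 0 else y()


def lor(x, y):
    # Model of X \/ Y: X if X\=0; Y, if X=0
    return x if x != 0 else y()


def cond(x, y, z):
    # Model of X -> Y : Z: Y, if X\=0; Z, if X=0
    return y() if x != 0 else z()


print(-1 < 1)         # signed comparison
print(uless(-1, 1))   # unsigned comparison: -1 is the largest word
print(land(0, lambda: 1 // 0))   # right operand is never evaluated
print(lor(5, lambda: 1 // 0))    # right operand is never evaluated
print(cond(0, lambda: 10, lambda: 20))
```

The two comparisons give opposite answers for the same operands, which is the reason for having both operator families in an almost typeless language.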
Meta command | Description |
---|---|
#L number "filename"; | Set internal line counter to number and input file name to filename. For use in preprocessors. |
#CLASSPATH "path"; | Specify an alternative location for searching class files. |
#DEBUG; | Turn on emission of debug information. |
Procedure | Description |
---|---|
T.BPW() | Return the number of bytes per machine word on the host machine |
T.CLOSE(fd) | Close file descriptor FD |
T.CVALIST(n, bmap, in, out) | Convert argument list, N=number of arguments, BMAP=argument type bitmap, IN=source list, OUT=destination list, return number of bytes OUT was relocated |
T.GETARG(n, buf, max) | Copy MAX characters of command line argument #N into BUF |
T.GETENV(name, buf, max) | Copy MAX characters of the value of the environment variable NAME to BUF |
T.MEMCOMP(r1, r2, len) | Compare LEN bytes at the regions R1 and R2 |
T.MEMCOPY(dest, src, len) | Copy LEN bytes from SRC to DEST |
T.MEMFILL(dest, char, len) | Fill LEN bytes at DEST with CHAR |
T.MEMSCAN(src, char, lim) | Search LIM bytes at SRC for CHAR |
T.NEWLINE(buf) | Fill BUF with a system dependent newline sequence |
T.OPEN(path, mode) | Open file PATH in the given MODE |
T3X.OAPPND | append mode (write-only) |
T3X.ORDWR | read/write mode |
T3X.OREAD | read-only mode |
T3X.OWRITE | create/overwrite mode (write-only) |
T.READ(fd, buf, len) | Read LEN bytes from file descriptor FD into buffer BUF |
T.REMOVE(path) | Delete (a link to) the file PATH |
T.RENAME(old, new) | Rename the file OLD to NEW |
T.SEEK(fd, pos, org) | Move the file pointer of FD by POS bytes starting at ORG. |
T3X.SEEK_END | origin: end of file |
T3X.SEEK_REL | origin: current position |
T3X.SEEK_SET | origin: beginning of file |
T.WRITE(fd, buf, len) | Write LEN characters from buffer BUF to file descriptor FD |
Procedure | Description |
---|---|
CHR.INIT() | Initialize the CHAR class |
CHAR.C_ALPHA | property: alphabetic |
CHAR.C_CNTRL | property: control character |
CHAR.C_DIGIT | property: decimal digit |
CHAR.C_SPACE | property: white space character |
CHAR.C_UPPER | property: upper case character |
CHR.ALPHA(char) | Alphabetic character test |
CHR.ASCII(char) | ASCII character test |
CHR.CNTRL(char) | Control character test |
CHR.DIGIT(char) | Numeric character test |
CHR.LCASE(char) | Convert CHAR to lower case, if upper case |
CHR.LOWER(char) | Lower case character test |
CHR.MAP() | Return map of character properties |
CHR.SPACE(char) | White space character test |
CHR.UCASE(char) | Convert CHAR to upper case, if lower case |
CHR.UPPER(char) | Upper case character test |
Procedure | Description |
---|---|
IOS.CLOSE() | Close stream |
IOS.CREATE(fd, buf, len, mode) | Create stream using the buffer BUF with the given length LEN and connect it to the file descriptor FD (MODEs are described under IOS.OPEN) |
IOS.EOF() | Return true, if input is exhausted |
IOS.FLUSH() | Write buffered data to the associated descriptor |
IOS.MOVE(offset, origin) | Move the stream pointer to the given OFFSET starting at ORIGIN. |
IOSTREAM.SEEK_BCK | origin: current position, move back |
IOSTREAM.SEEK_END | origin: end of file, move back |
IOSTREAM.SEEK_FWD | origin: current position, move forward |
IOSTREAM.SEEK_SET | origin: beginning of file, move forward |
IOS.OPEN(path, buf, len, mode) | Like IOS.CREATE(), but open PATH instead of connecting the stream to an existing descriptor |
IOSTREAM.FADDCR | mode: Add CR before LF on output |
IOSTREAM.FKILLCR | mode: Remove any CRs from input |
IOSTREAM.FRDWR | mode: Open file in read/write mode |
IOSTREAM.FREAD | mode: Open file read-only |
IOSTREAM.FTRANS | mode: Apply both FADDCR and FKILLCR |
IOSTREAM.FWRITE | mode: Create file in write-only mode |
IOS.RDCH() | Read and return a single character |
IOS.READ(buf, len) | Read up to LEN characters into BUF |
IOS.RESET() | Reset error flag of stream |
IOS.READS(buf, len) | Read a line with up to LEN-1 characters into BUF |
IOS.WRCH(char) | Write single character |
IOS.WRITE(buf, len) | Write LEN characters from BUF. |
IOS.WRITES(str) | Write NUL-terminated string |
Procedure | Description |
---|---|
MEM.INIT(pool, size) | Initialize a new memory POOL. A pool of the specified SIZE must be supplied. |
MEM.WALK(block, sizep, statp) | Walk the block list starting at BLOCK (0=beginning of pool). The vectors SIZEP and STATP will be filled with size and status. |
MEM.ALLOC(size) | Allocate a block of the given SIZE |
MEM.FREE(block) | Free a previously allocated BLOCK |
Procedure | Description |
---|---|
STR.COMP(s1, s2) | Compare strings |
STR.COPY(dest, src) | Copy SRC to DEST |
STR.FIND(str, pat) | Find position of substring PAT in STR |
STR.FORMAT(buf, tmpl, list) | Format BUF according to TMPL using the arguments in LIST |
STR.LENGTH(str) | Compute the length of a string |
STRING.MAXLEN | Constant: maximum string length |
STR.NUMTOSTR(buf, n, radix) | Create a string representing the number N using the given RADIX. Store the result in BUF. |
STR.PARSE(str, tmpl, list) | Extract data described in TMPL from STR. Store results in the variables pointed to by the members of LIST. |
STR.RSCAN(str, char) | Find the rightmost occurrence of CHAR in STR |
STR.SCAN(str, char) | Find the first occurrence of CHAR in STR |
STR.STRTONUM(s, radix, lp) | Convert the string S which contains a numeric string with the given RADIX into a value. Fill LP with the position of the first non-digit. |
STR.XLATE(str, old, new) | Replace each OLD character in STR with NEW |
Procedure | Description |
---|---|
SYS.INIT() | Initialize the system interface |
SYS.FINI() | Shutdown the system interface |
SYS.CHDIR(path) | Change the current working directory to PATH |
SYS.CLOSEDIR(dirfd) | Close directory file descriptor |
SYS.DUP(fd) | Duplicate file descriptor FD |
SYS.DUP2(old, new) | Duplicate file descriptor OLD to NEW. If NEW already is open, close it first. |
SYS.FORK() | Duplicate the calling process |
SYS.KILL(pid, sig) | Send signal SIG to process PID |
SYSTEM.SIGKILL | signal: kill process |
SYSTEM.SIGTERM | signal: terminate process |
SYSTEM.SIGTEST | signal: test process id |
SYS.MKDIR(path) | Create new directory |
SYS.OPENDIR(path) | Open a directory file descriptor |
SYS.PIPE(fdvec) | Create pipe. FDVEC[0] is filled with output end (read-only) and FDVEC[1] with the input descriptor (write-only). |
SYS.RDCHK(fd) | Check FD for pending input |
SYS.READDIR(dirfd, buf, max) | Read the first MAX characters of the next directory entry (file name) from DIRFD into BUF |
SYS.RMDIR(path) | Remove the given directory |
SYS.SPAWN(prog, args, mode) | Spawn program PROG as a subprocess in the given MODE, passing the arguments pointed to by ARGS to it |
SYSTEM.SPAWN_NOWAIT | mode: create background process |
SYSTEM.SPAWN_WAIT | mode: wait for process termination |
SYS.STAT(path, sb) | Retrieve statistics for file PATH |
SYSTEM.STATBUF | Structure for file statistics (SB) |
SYSTEM.ST_DEV | SB: device ID |
SYSTEM.ST_EXT | SB: size of file in full 64K blocks |
SYSTEM.ST_GID | SB: group ID of owner |
SYSTEM.ST_INO | SB: inode number |
SYSTEM.ST_MODE | SB: permission flags |
SYSTEM.ST_MTIME | SB: modification time: YYMDHMSh |
SYSTEM.ST_MT_2 | SB: 8-byte MTIME buffer |
SYSTEM.ST_MT_3 | SB: 8-byte MTIME buffer |
SYSTEM.ST_MT_4 | SB: 8-byte MTIME buffer |
SYSTEM.ST_NLINK | SB: number of links |
SYSTEM.ST_RDEV | SB: device type |
SYSTEM.ST_SIZE | SB: size mod 64K |
SYSTEM.ST_UID | SB: user ID of owner |
SYS.TIME(tbuf) | Fill TBUF with system time (8 bytes = YYMDHMSh) |
SYS.WAIT() | Wait for subprocess termination |
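The file size reported by SYS.STAT is split across the ST_EXT and ST_SIZE members of the status buffer. The arithmetic for recombining them can be sketched in Python as follows, assuming that 64K means 65536 bytes; the helper names `file_size` and `split_size` are hypothetical and not part of the SYSTEM class.

```python
BLOCK = 65536  # assumption: "64K" is 65536 bytes


def file_size(st_ext, st_size):
    # st_ext: number of full 64K blocks, st_size: size mod 64K
    return st_ext * BLOCK + st_size


def split_size(size):
    # Inverse operation: returns the pair (ST_EXT, ST_SIZE).
    return divmod(size, BLOCK)


print(file_size(3, 1000))
print(split_size(197608))
```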
Procedure | Description |
---|---|
TTY.INIT() | Initialize the TTY interface |
TTY.FINI() | Shutdown the TTY interface |
TTY.CLEAR() | Clear the terminal screen |
TTY.CLREOL() | Clear the current line starting at the cursor position |
TTY.COLOR(color) | Select COLOR for TTY writes |
TTYCTL.B_BLACK TTYCTL.B_BLUE TTYCTL.B_CYAN TTYCTL.B_GREEN TTYCTL.B_GREY TTYCTL.B_MAGENTA TTYCTL.B_RED TTYCTL.B_YELLOW | background colors |
TTYCTL.F_BLACK TTYCTL.F_BLUE TTYCTL.F_CYAN TTYCTL.F_GREEN TTYCTL.F_GREY TTYCTL.F_MAGENTA TTYCTL.F_RED TTYCTL.F_YELLOW | foreground colors |
TTYCTL.F_BRIGHT | Foreground intensity flag |
TTY.COLORS() | Return a flag indicating whether the terminal supports color |
TTY.COLUMNS() | Return the number of characters per line on the connected TTY |
TTY.MODE(raw) | Select raw or cooked mode (where available) |
TTY.MOVE(x, y) | Move the cursor to the given coordinates |
TTY.LINES() | Return the number of lines per screen on the connected TTY |
TTY.QUERY() | Check for pending characters in the keyboard buffer |
TTY.READC() | Read a single character from the keyboard. Special keys return special codes listed here: |
TTYCTL.K_BKSP | Backspace |
TTYCTL.K_CR | Enter / CR / Return |
TTYCTL.K_DEL | Delete |
TTYCTL.K_DOWN | Down arrow |
TTYCTL.K_END | End |
TTYCTL.K_ESC | Escape |
TTYCTL.K_F1 | F1 |
TTYCTL.K_F2 | F2 |
TTYCTL.K_F3 | F3 |
TTYCTL.K_F4 | F4 |
TTYCTL.K_F5 | F5 |
TTYCTL.K_F6 | F6 |
TTYCTL.K_F7 | F7 |
TTYCTL.K_F8 | F8 |
TTYCTL.K_F9 | F9 |
TTYCTL.K_F10 | F10 |
TTYCTL.K_HOME | Home |
TTYCTL.K_INS | Insert |
TTYCTL.K_KILL | Kill, Erase line |
TTYCTL.K_LEFT | Left arrow |
TTYCTL.K_NEXT | Next, Page down |
TTYCTL.K_PGDN | Next, Page down |
TTYCTL.K_PGUP | Prev, Page up |
TTYCTL.K_PREV | Prev, Page up |
TTYCTL.K_RIGHT | Right arrow |
TTYCTL.K_UP | Up arrow |
TTY.RSCROLL(top, bot) | Scroll down the lines from TOP to BOT |
TTY.SCROLL(top, bot) | Scroll up the lines from TOP to BOT |
TTY.WRITEC(char) | Write a single character to the screen |
TTY.WRITES(string) | Write a string of characters to the screen |
Procedure | Description |
---|---|
XM.INIT() | Initialize the XMEM interface |
XM.FINI() | Shutdown the XMEM interface |
XM.ALLOC(size) | Allocate an external memory block of SIZE bytes |
XM.FREE(id) | Release allocated block with the given ID |
XM.GET(id, index) | Read a byte from address INDEX of the block referenced by ID and return it |
XM.PUT(id, index, value) | Change the INDEX'th byte of block ID to VALUE |
XM.READ(id, index, buf, len) | Copy LEN bytes from addresses starting at INDEX of the block referenced by ID into BUF |
XM.WRITE(id, index, buf, len) | Copy LEN bytes from BUF to addresses starting at INDEX in the block referenced by ID |
Escape Sequences | ASCII Code | Name | Description |
---|---|---|---|
\a \A | 0x07 | BEL | Ring terminal bell |
\b \B | 0x08 | BS | Backspace |
\e \E | 0x1B | ESC | Introduce control sequence |
\f \F | 0x0C | FF | Form feed |
\n \N | 0x0A | LF | Line feed |
\q \Q \" | 0x22 | - | Literal quote character |
\r \R | 0x0D | CR | Carriage return |
\t \T | 0x09 | HT | Horizontal tabulator |
\v \V | 0x0B | VT | Vertical tabulator |
\\ | 0x5C | - | Literal backslash |
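A decoder for these escape sequences can be sketched in a few lines of Python. This is an illustration rather than the compiler's actual scanner: the names `ESCAPES` and `decode` are hypothetical, and the treatment of a backslash followed by a character that is not listed in the table (the character is kept as it is) is an assumption.

```python
# Escape letter (lower case) -> ASCII code, following the table above.
ESCAPES = {
    "a": 0x07, "b": 0x08, "e": 0x1B, "f": 0x0C, "n": 0x0A,
    "q": 0x22, "r": 0x0D, "t": 0x09, "v": 0x0B,
    '"': 0x22, "\\": 0x5C,
}


def decode(body):
    """Translate the body of a string literal into a list of codes."""
    codes = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # Escape letters are accepted in either case.
            code = ESCAPES.get(nxt.lower())
            # Assumption: unknown escapes yield the character itself.
            codes.append(code if code is not None else ord(nxt))
            i += 2
        else:
            codes.append(ord(ch))
            i += 1
    return codes


print(decode("Hi\\N"))    # the four characters H i \ N
print(decode("\\q"))      # literal quote character
```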
hex | 00/80 | 10/90 | 20/A0 | 30/B0 | 40/C0 | 50/D0 |
---|---|---|---|---|---|---|
00 | GLUE | STACK* | BSHR | LDIV* | NBRT* | INCLL** |
01 | HINT* | CLEAN* | EQU | LDLAB* | JUMP* | PUB**S |
02 | CLAB* | NEG | NEQU | NUM* | UNEXT* | EXT**S |
03 | DLAB* | LNOT | LESS | SELF | DNEXT* | IPROC**S |
04 | DATA* | BNOT | GRTR | DEREF | HALT* | IREF**S |
05 | CREF* | MUL | LTEQ | DREFB | CALL* | CMAP** |
06 | DREF* | DIV | GTEQ | NORM | CALR | GSYM**S |
07 | VEC* | UMUL | ULESS | NORMB | CALX* | LSYM**S |
08 | STR*S | UDIV | UGRTR | SAVG* | SYS* | ISYM**S |
09 | HDR | MOD | ULTEQ | SAVL* | ILIB*S | |
0A | END | ADD | UGTEQ | SAVI* | ICALL* | |
0B | MHDR | SUB | LDG* | STORE | ICALX* | |
0C | ENDM | BAND | LDGV* | STORB | LINE* | |
0D | POP | BOR | LDL* | BRF* | INIT** | |
0E | DUP | BXOR | LDLV* | BRT* | INCG** | |
0F | SWAP | BSHL | LDI* | NBRF* | INCI** |
* instruction has an argument.
** instruction has two arguments.
S instruction has a string argument.
In any of these cases, add 0x80 to the opcode.
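The opcode of an instruction can be computed from its position in the table. The sketch below, written in Python, assumes that the row label is the low nibble and the column heading the high nibble of the opcode, which is one reading of the table and is not stated explicitly. The names `BASE`, `ARG_FLAG`, `encode`, and `split` are hypothetical, and the encoding of the arguments themselves is not modelled.

```python
ARG_FLAG = 0x80  # added when the instruction carries any argument

# A few base opcodes read off the table (column + row).
BASE = {
    "POP": 0x0D,    # column 00, row 0D, no argument
    "ADD": 0x1A,    # column 10, row 0A, no argument
    "NUM": 0x32,    # column 30, row 02, one argument
    "ICALL": 0x4A,  # column 40, row 0A, one argument
    "ICALX": 0x4B,  # column 40, row 0B, one argument
}


def encode(name, has_args):
    # Base opcodes are below 0x80, so OR-ing equals adding 0x80.
    base = BASE[name]
    return base | ARG_FLAG if has_args else base


def split(opcode):
    # Returns (base opcode, True if arguments follow).
    return opcode & 0x7F, bool(opcode & ARG_FLAG)


print(hex(encode("ADD", False)))
print(hex(encode("NUM", True)))
print(split(0xCB))
```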
hex | 00 | 10 | 20 | 30 | 40 | 50 | 60 | 70 |
---|---|---|---|---|---|---|---|---|
dec | 0 | 16 | 32 | 48 | 64 | 80 | 96 | 112 |
00 | NUL | DLE | SP | 0 | @ | P | ` | p |
01 | SOH | DC1 | ! | 1 | A | Q | a | q |
02 | STX | DC2 | " | 2 | B | R | b | r |
03 | ETX | DC3 | # | 3 | C | S | c | s |
04 | EOT | DC4 | $ | 4 | D | T | d | t |
05 | ENQ | NAK | % | 5 | E | U | e | u |
06 | ACK | SYN | & | 6 | F | V | f | v |
07 | BEL | ETB | ' | 7 | G | W | g | w |
08 | BS | CAN | ( | 8 | H | X | h | x |
09 | HT | EM | ) | 9 | I | Y | i | y |
0A | LF | SUB | * | : | J | Z | j | z |
0B | VT | ESC | + | ; | K | [ | k | { |
0C | FF | FS | , | < | L | \ | l | \| |
0D | CR | GS | - | = | M | ] | m | } |
0E | SO | RS | . | > | N | ^ | n | ~ |
0F | SI | US | / | ? | O | _ | o | DEL |
Fig.1 Scopes (example)
Fig.2 Encapsulation
Fig.3 Multiplexing Method Applications
Fig.4 Scopes (overview)
Fig.5 The Architecture of the Tcode Machine
Fig.6 External References
Fig.7 Interface References
This manual is part of the T3X compiler package which is distributed under the following terms.
T3X -- A Compiler for the Minimum Procedural Language T3X
Copyright (C) 1996-2002 Nils M Holm. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.