A "match specification" (match_spec) is an Erlang term describing a
small "program" that tries to match something: either the parameters
to a function, as used in the erlang:trace_pattern/2 BIF, or the
objects in an ETS table.
The match_spec in many ways works like a small function in Erlang, but is
interpreted/compiled by the Erlang runtime system to something much more
efficient than calling an Erlang function. The match_spec is also
very limited compared to the expressiveness of real Erlang functions.
Match specifications are given to the BIF erlang:trace_pattern/2 to
execute matching of function arguments as well as to define some
actions to be taken when the match succeeds (the MatchBody part).
Match specifications can also be used in ETS, to specify objects to
be returned from an ets:select/2 call (or other select calls). The
semantics and restrictions differ slightly between match
specifications used for tracing and those used in ETS; the
differences are described in a separate paragraph below.
The most notable difference between a match_spec and an Erlang fun is
of course the syntax. Match specifications are Erlang terms, not
Erlang code. A match_spec also has a somewhat strange concept of
exceptions. An exception (e.g., badarg) in the MatchCondition part,
which resembles an Erlang guard, will generate immediate failure,
while an exception in the MatchBody part, which resembles the body of
an Erlang function, is implicitly caught and results in the single
atom 'EXIT'.
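The difference between the two exception behaviors can be observed with ets:test_ms/2. The following is a minimal sketch; the module name ms_exception_demo and its function names are hypothetical, chosen only for illustration. Both functions apply {hd, '$1'} to an atom, which is a bad argument.

```erlang
-module(ms_exception_demo).
-export([guard_failure/0, body_exit/0]).

%% {hd, '$1'} is evaluated with '$1' bound to the atom a, which is a
%% bad argument. In the condition part this makes the match fail.
guard_failure() ->
    MS = [{{'$1'}, [{'=:=', {hd, '$1'}, x}], ['$_']}],
    ets:test_ms({a}, MS).

%% The same bad call in the body is implicitly caught, so the body
%% evaluates to the atom 'EXIT'.
body_exit() ->
    MS = [{{'$1'}, [], [{hd, '$1'}]}],
    ets:test_ms({a}, MS).
```

Under these assumptions, guard_failure/0 should report that nothing matched, while body_exit/0 should report a successful match whose result is 'EXIT'.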
A match_spec can be described in this informal grammar:
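MatchExpression ::= [ MatchFunction, ... ]
MatchFunction ::= { MatchHead, MatchConditions, MatchBody }
MatchHead ::= MatchVariable | '_' | [ MatchHeadPart, ... ]
MatchHeadPart ::= term() | MatchVariable | '_'
MatchVariable ::= '$<number>'
MatchConditions ::= [ MatchCondition, ...] | []
MatchCondition ::= { GuardFunction } | { GuardFunction, ConditionExpression, ... }
BoolFunction ::= is_atom | is_constant | is_float | is_integer | is_list | is_number | is_pid | is_port | is_reference | is_tuple | is_binary | is_function | is_record | is_seq_trace | 'and' | 'or' | 'not' | 'xor' | andalso | orelse
ConditionExpression ::= ExprMatchVariable | { GuardFunction } | { GuardFunction, ConditionExpression, ... } | TermConstruct
ExprMatchVariable ::= MatchVariable (bound in the MatchHead) | '$_' | '$$'
TermConstruct ::= {{}} | {{ ConditionExpression, ... }} | [] | [ConditionExpression, ...] | NonCompositeTerm | Constant
NonCompositeTerm ::= term() (not list or tuple)
Constant ::= {const, term()}
GuardFunction ::= BoolFunction | abs | element | hd | length | node | round | size | tl | trunc | '+' | '-' | '*' | 'div' | 'rem' | 'band' | 'bor' | 'bxor' | 'bnot' | 'bsl' | 'bsr' | '>' | '>=' | '<' | '=<' | '=:=' | '==' | '=/=' | '/=' | self | get_tcw
MatchBody ::= [ ActionTerm ]
ActionTerm ::= ConditionExpression | ActionCall
ActionCall ::= {ActionFunction} | {ActionFunction, ActionTerm, ...}
ActionFunction ::= set_seq_token | get_seq_token | message | return_trace | process_dump | enable_trace | disable_trace | trace | display | caller | set_tcw | silent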
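To relate the grammar to a concrete term, the sketch below builds an ETS match specification piece by piece and runs it with ets:select/2. The module name ms_grammar_demo, the table name demo, and the sample data are assumptions made for illustration; the variable names simply mirror the nonterminals of the grammar.

```erlang
-module(ms_grammar_demo).
-export([run/0]).

run() ->
    MatchHead = {'$1', '$2'},             % binds '$1' and '$2'
    MatchConditions = [{'>', '$2', 3}],   % a single MatchCondition
    MatchBody = [{{'$1', '$2'}}],         % TermConstruct that builds a tuple
    MatchExpression = [{MatchHead, MatchConditions, MatchBody}],
    Tab = ets:new(demo, [set]),
    true = ets:insert(Tab, [{a, 1}, {b, 5}, {c, 9}]),
    %% A set table has no defined traversal order, so sort the result.
    Result = lists:sort(ets:select(Tab, MatchExpression)),
    true = ets:delete(Tab),
    Result.
```

With the sample data above, only the objects whose second element is greater than 3 are selected, so run/0 should return [{b,5},{c,9}].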
The different functions allowed in match_spec work like this:
is_atom, is_constant, is_float, is_integer, is_list, is_number,
is_pid, is_port, is_reference, is_tuple, is_binary, is_function:
Like the corresponding guard tests in Erlang, return true or false.
is_record: Takes an additional parameter, which must be the result
of record_info(size, <record_type>), like in {is_record, '$1',
rectype, record_info(size, rectype)}.
'not': Negates its single argument (anything other than false gives
false).
'and': Returns true if all its arguments (variable length argument
list) evaluate to true, else false. Evaluation order is undefined.
'or': Returns true if any of its arguments evaluates to true.
Variable length argument list. Evaluation order is undefined.
andalso: Like 'and', but quits evaluating its arguments as soon as
one argument evaluates to something else than true. Arguments are
evaluated left to right.
orelse: Like 'or', but quits evaluating as soon as one of its
arguments evaluates to true. Arguments are evaluated left to right.
'xor': Only two arguments, of which one has to be true and the other
false to return true; otherwise 'xor' returns false.
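The boolean functions above can be tried out with ets:test_ms/2. This is a small sketch with a hypothetical module name (ms_bool_demo) and hypothetical function names. Note that and, or, not, xor, andalso and orelse are reserved words in Erlang, so as atoms they have to be written in single quotes.

```erlang
-module(ms_bool_demo).
-export([exactly_one_atom/1, all_positive/1]).

%% 'xor' takes exactly two arguments.
exactly_one_atom(Tuple) ->
    MS = [{{'$1', '$2'},
           [{'xor', {is_atom, '$1'}, {is_atom, '$2'}}],
           [matched]}],
    ets:test_ms(Tuple, MS).

%% 'and' accepts a variable-length argument list; here it has three.
all_positive(Tuple) ->
    MS = [{{'$1', '$2', '$3'},
           [{'and', {'>', '$1', 0}, {'>', '$2', 0}, {'>', '$3', 0}}],
           [matched]}],
    ets:test_ms(Tuple, MS).
```

For example, exactly_one_atom({a, 1}) and all_positive({1, 2, 3}) should both give {ok, matched}, while exactly_one_atom({a, b}) and all_positive({1, -2, 3}) should give {ok, false}.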
abs, element, hd, length, node, round, size, tl, trunc, '+', '-',
'*', 'div', 'rem', 'band', 'bor', 'bxor', 'bnot', 'bsl', 'bsr', '>',
'>=', '<', '=<', '=:=', '==', '=/=', '/=', self: Work as the
corresponding Erlang BIFs (or operators). In case of bad arguments,
the result depends on the context. In the MatchConditions part of
the expression, the test fails immediately (like in an Erlang
guard), but in the MatchBody, exceptions are implicitly caught and
the call results in the atom 'EXIT'.
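As an illustration of operators used both as a condition and as a body expression, the following sketch selects objects with an even key and returns the sum of key and value. The module name ms_arith_demo, the table name, and the data are assumptions for this example only.

```erlang
-module(ms_arith_demo).
-export([even_keys/0]).

even_keys() ->
    Tab = ets:new(demo, [set]),
    %% Objects are {N, N*N} for N = 1..6.
    true = ets:insert(Tab, [{N, N * N} || N <- lists:seq(1, 6)]),
    %% Condition: the key is even. Body: key + value.
    MS = [{{'$1', '$2'},
           [{'=:=', {'rem', '$1', 2}, 0}],
           [{'+', '$1', '$2'}]}],
    Result = lists:sort(ets:select(Tab, MS)),
    true = ets:delete(Tab),
    Result.
```

The even keys are 2, 4 and 6, so even_keys/0 should return [6,20,42].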
is_seq_trace: Returns true if a sequential trace token is set for
the current process, otherwise false.
set_seq_token: Works like seq_trace:set_token/2, but returns true on
success and 'EXIT' on error or bad argument. Only allowed in the
MatchBody part and only allowed when tracing.
get_seq_token: Works just like seq_trace:get_token/0, and is only
allowed in the MatchBody part when tracing.
message: Sets an additional message appended to the trace message
sent. One can only set one additional message in the body;
subsequent calls will replace the appended message. As a special
case, {message, false} disables sending of trace messages ('call'
and 'return_to') for this function call, just as if the match_spec
had not matched, which can be useful if only the side effects of the
MatchBody are desired. Another special case is {message, true},
which sets the default behavior, as if the function had no
match_spec: the trace message is sent with no extra information (if
no other calls to message are placed before {message, true}, it is
in fact a "noop"). Takes one argument, the message. Returns true and
can only be used in the MatchBody part and when tracing.
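The effect of message can be sketched with a short tracing session. Everything named here (the module ms_message_demo, the traced function add/2, and the tag sum_of) is hypothetical. The sketch assumes that the appended message arrives as a fifth element of the call trace tuple, and it traces a helper process rather than the calling process, which acts as the tracer.

```erlang
-module(ms_message_demo).
-export([run/0, add/2]).

add(A, B) -> A + B.

run() ->
    %% The tracee waits until tracing has been set up.
    Tracee = spawn(fun() -> receive go -> ?MODULE:add(1, 2) end end),
    erlang:trace(Tracee, true, [call]),
    %% Append {sum_of, Arg1, Arg2} to the call trace message.
    MS = [{['$1', '$2'], [], [{message, {{sum_of, '$1', '$2'}}}]}],
    erlang:trace_pattern({?MODULE, add, 2}, MS, [local]),
    Tracee ! go,
    Extra = receive
                {trace, Tracee, call, {?MODULE, add, [1, 2]}, Msg} -> Msg
            after 1000 -> timeout
            end,
    %% Remove the trace pattern again.
    erlang:trace_pattern({?MODULE, add, 2}, false, [local]),
    Extra.
```

If the assumptions hold, run/0 should return the tuple built in the body, {sum_of,1,2}, or the atom timeout if no such trace message arrives within a second.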
return_trace: Causes a return_from trace message to be sent upon
return from the current function. Takes no arguments, returns true
and can only be used in the MatchBody part when tracing. If the
process trace flag silent is active, the return_from trace message
is inhibited.
NOTE! If the traced function is tail recursive, this match spec
function destroys that property. Hence, if a match spec executing
this function is used on a perpetual server process, it may only be
active for a limited time, or the emulator will eventually use all
memory in the host machine and crash. If this match_spec function is
inhibited using the silent process trace flag, tail recursiveness
still remains.
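A minimal sketch of return_trace in use follows. The module ms_return_demo and the traced function double/1 are hypothetical names; as in any tracing session, the traced process is a separate helper process and the caller acts as the tracer. The ordinary call trace message is also delivered to the caller and is simply left in its mailbox here.

```erlang
-module(ms_return_demo).
-export([run/0, double/1]).

double(X) -> 2 * X.

run() ->
    Tracee = spawn(fun() -> receive go -> ?MODULE:double(21) end end),
    erlang:trace(Tracee, true, [call]),
    %% Match any argument list and request a return_from message.
    MS = [{'_', [], [{return_trace}]}],
    erlang:trace_pattern({?MODULE, double, 1}, MS, [local]),
    Tracee ! go,
    Ret = receive
              {trace, Tracee, return_from, {?MODULE, double, 1}, Value} ->
                  Value
          after 1000 -> timeout
          end,
    erlang:trace_pattern({?MODULE, double, 1}, false, [local]),
    Ret.
```

The return_from message carries the return value of the traced call, so run/0 should return 42.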
exception_trace: Same as return_trace; in addition, if the traced
function exits due to an exception, an exception_from trace message
is generated, whether the exception is caught or not.
process_dump: Returns some textual information about the current
process as a binary. Takes no arguments and is only allowed in the
MatchBody part when tracing.
enable_trace: With one parameter this function turns on tracing like
the Erlang call erlang:trace(self(), true, [P2]), where P2 is the
parameter to enable_trace. With two parameters, the first parameter
should be either a process identifier or the registered name of a
process. In this case tracing is turned on for the designated
process in the same way as in the Erlang call erlang:trace(P1, true,
[P2]), where P1 is the first and P2 is the second argument. The
process P1 gets its trace messages sent to the same tracer as the
process executing the statement uses. P1 cannot be one of the atoms
all, new or existing (unless, of course, they are registered names).
P2 can be neither cpu_timestamp nor {tracer,_}. Returns true and may
only be used in the MatchBody part when tracing.
disable_trace: With one parameter this function disables tracing
like the Erlang call erlang:trace(self(), false, [P2]), where P2 is
the parameter to disable_trace. With two parameters it works like
the Erlang call erlang:trace(P1, false, [P2]), where P1 can be
either a process identifier or a registered name and is given as the
first argument to the match_spec function. P2 can be neither
cpu_timestamp nor {tracer,_}. Returns true and may only be used in
the MatchBody part when tracing.
trace: With two parameters this function takes a list of trace flags
to disable as first parameter and a list of trace flags to enable as
second parameter. Logically, the disable list is applied first, but
effectively all changes are applied atomically. The trace flags are
the same as for erlang:trace/3, not including cpu_timestamp but
including {tracer,_}. If a tracer is specified in both lists, the
tracer in the enable list takes precedence. If no tracer is
specified, the same tracer as the process executing the match spec
is used. With three parameters to this function, the first is either
a process identifier or the registered name of a process to set
trace flags on, the second is the disable list, and the third is the
enable list. Returns true if any trace property was changed for the
trace target process, or false if not. It may only be used in the
MatchBody part when tracing.
caller: Returns the calling function as a tuple {Module, Function,
Arity} or the atom undefined if the calling function cannot be
determined. May only be used in the MatchBody part when tracing.
Note that if a "technically built in function" (i.e., a function not
written in Erlang) is traced, the caller function will sometimes
return the atom undefined. The calling Erlang function is not
available during such calls.
display: For debugging purposes only; displays the single argument
as an Erlang term on stdout, which is seldom what is wanted. Returns
true and may only be used in the MatchBody part when tracing.
get_tcw: Takes no argument and returns the value of the node's trace
control word. The same is done by
erlang:system_info(trace_control_word). The trace control word is a
32-bit unsigned integer intended for generic trace control. The
trace control word can be tested and set both from within trace
match specifications and with BIFs. This call is only allowed when
tracing.
set_tcw: Takes one unsigned integer argument, sets the value of the
node's trace control word to the value of the argument and returns
the previous value. The same is done by
erlang:system_flag(trace_control_word, Value). It is only allowed to
use set_tcw in the MatchBody part when tracing.
silent: Takes one argument. If the argument is true, the call trace
message mode for the current process is set to silent for this call
and all subsequent calls, i.e., call trace messages are inhibited
even if {message, true} is called in the MatchBody part for a traced
function. This mode can also be activated with the silent flag to
erlang:trace/3. If the argument is false, the call trace message
mode for the current process is set to normal (non-silent) for this
call and all subsequent calls. If the argument is neither true nor
false, the call trace message mode is unaffected.
Note that all "function calls" have to be tuples, even if they take
no arguments. The value of self is the atom() self, but the value of
{self} is the pid() of the current process.
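The distinction between the atom self and the call {self} can be checked with ets:test_ms/2, which evaluates the specification in the calling process. The module name ms_self_demo is a hypothetical name used only for this sketch.

```erlang
-module(ms_self_demo).
-export([run/0]).

run() ->
    %% The bare atom is a literal and evaluates to itself.
    {ok, Atom} = ets:test_ms({x}, [{'_', [], [self]}]),
    %% The one-element tuple is a function call returning a pid.
    {ok, Pid} = ets:test_ms({x}, [{'_', [], [{self}]}]),
    {Atom, Pid =:= self()}.
```

Here run/0 should return {self, true}: the first body yields the atom, the second the pid of the process that ran the match.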
Variables take the form '$<number>' where <number> is an integer
between 0 (zero) and 100000000 (1e+8); the behavior if the number is
outside these limits is undefined. In the MatchHead part, the
special variable '_' matches anything, and never gets bound (like _
in Erlang). In the MatchCondition/MatchBody parts, no unbound
variables are allowed, so '_' is interpreted as itself (an atom).
Variables can only be bound in the MatchHead part. In the MatchBody
and MatchCondition parts, only variables bound previously may be
used. As a special case, in the MatchCondition/MatchBody parts, the
variable '$_' expands to the whole expression which matched the
MatchHead (i.e., the whole parameter list to the possibly traced
function or the whole matching object in the ets table) and the
variable '$$' expands to a list of the values of all bound variables
in order (i.e., ['$1','$2', ...]).
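The two special variables can be compared side by side with ets:test_ms/2. In this sketch (the module name ms_vars_demo is hypothetical) the middle element is matched with '_' and therefore is not bound.

```erlang
-module(ms_vars_demo).
-export([whole/0, bound/0]).

%% '$_' is the whole object that matched the MatchHead.
whole() ->
    ets:test_ms({a, b, c}, [{{'$1', '_', '$2'}, [], ['$_']}]).

%% '$$' is the list of bound variables in order; the element
%% matched by '_' is not part of it.
bound() ->
    ets:test_ms({a, b, c}, [{{'$1', '_', '$2'}, [], ['$$']}]).
```

whole/0 should give {ok, {a,b,c}} and bound/0 should give {ok, [a,c]}.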
In the MatchHead part, all literals (except the variables noted
above) are interpreted as is. In the MatchCondition/MatchBody parts,
however, the interpretation is in some ways different. Literals in
the MatchCondition/MatchBody can either be written as is, which
works for all literals except tuples, or by using the special form
{const, T}, where T is any Erlang term. For tuple literals in the
match_spec, one can also use double tuple parentheses, i.e.,
construct them as a tuple of arity one containing a single tuple,
which is the one to be constructed. The "double tuple parenthesis"
syntax is useful to construct tuples from already bound variables,
like in {{'$1', [a,b,'$2']}}. Some examples may be needed:
Expression | Variable bindings | Result |
{{'$1','$2'}} | '$1' = a, '$2' = b | {a,b} |
{const, {'$1', '$2'}} | doesn't matter | {'$1', '$2'} |
a | doesn't matter | a |
'$1' | '$1' = [] | [] |
['$1'] | '$1' = [] | [[]] |
[{{a}}] | doesn't matter | [{a}] |
42 | doesn't matter | 42 |
"hello" | doesn't matter | "hello" |
$1 | doesn't matter | 49 (the ASCII value for the character '1') |
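The rows of the table can be reproduced with a small helper that evaluates one expression as the only body term. The module ms_literal_demo and the function eval/2 are hypothetical; the helper binds '$1' and '$2' to the two elements of the tuple it is given.

```erlang
-module(ms_literal_demo).
-export([eval/2]).

%% Evaluate Expr as the single body term of a match specification,
%% with '$1' bound to A and '$2' bound to B.
eval(Expr, {A, B}) ->
    MS = [{{'$1', '$2'}, [], [Expr]}],
    {ok, Result} = ets:test_ms({A, B}, MS),
    Result.
```

For instance, eval({{'$1','$2'}}, {a, b}) should build the tuple {a,b}, whereas eval({const, {'$1','$2'}}, {a, b}) should return the literal tuple {'$1','$2'} with no substitution, and eval(['$1'], {[], b}) should return [[]].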
The execution of the match expression, when the runtime system
decides whether a trace message should be sent, goes as follows:
For each tuple in the MatchExpression list and while no match has
succeeded:
1. Match the MatchHead part against the arguments to the function,
binding the '$<number>' variables (much like in ets:match/2). If the
MatchHead cannot match the arguments, the match fails.
2. Evaluate each MatchCondition (where only '$<number>' variables
previously bound in the MatchHead can occur) and expect it to return
the atom true. As soon as a condition does not evaluate to true, the
match fails. If any BIF call generates an exception, also fail.
3. Evaluate each ActionTerm in the same way as the MatchConditions,
but completely ignore the return values. Regardless of what happens
in this part, the match has succeeded.
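The clause-by-clause order described above (head first, then conditions, stopping at the first clause that succeeds) can be observed with ets:test_ms/2. This sketch uses the ETS context purely because it makes the chosen clause visible through its return value; the module ms_order_demo and the result atoms are hypothetical.

```erlang
-module(ms_order_demo).
-export([classify/1]).

classify(Term) ->
    MS = [%% Both conditions must evaluate to true for this clause.
          {{'$1'}, [{is_integer, '$1'}, {'<', '$1', 0}], [negative]},
          %% Only reached when the first clause did not match.
          {{'$1'}, [{is_integer, '$1'}], [non_negative]},
          %% Catch-all clause.
          {'_', [], [not_an_integer]}],
    {ok, Class} = ets:test_ms({Term}, MS),
    Class.
```

classify(-3) should give negative, classify(7) should give non_negative, and classify(foo) should fall through to not_an_integer.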
ETS match specifications are there to produce a return value.
Usually the expression contains one single ActionTerm which defines
the return value without having any side effects. Calls with side
effects are not allowed in the ETS context.
When tracing there is no return value to produce; the match
specification either matches or doesn't. The effect when the
expression matches is a trace message rather than a returned term.
The ActionTerms are executed as in an imperative language, i.e., for
their side effects. Functions with side effects are also allowed
when tracing.
In ETS the match head is a tuple() (or a single match variable)
while it is a list (or a single match variable) when tracing.
Match an argument list of three where the first and third arguments are equal:
[{['$1', '_', '$1'], [], []}]
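Match an argument list of three where the second argument is greater than three (add an {is_number, '$1'} condition to exclude non-numbers, which sort after all numbers in the Erlang term order):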
[{['_', '$1', '_'], [{ '>', '$1', 3}], []}]
Match an argument list of three, where the third argument is a tuple
containing argument one and two or a list beginning with argument
one and two (i.e., [a,b,[a,b,c]] or [a,b,{a,b}]):
[{['$1', '$2', '$3'], [{'orelse', {'=:=', '$3', {{'$1','$2'}}}, {'and', {'=:=', '$1', {hd, '$3'}}, {'=:=', '$2', {hd, {tl, '$3'}}}}}], []}]
The above problem may also be solved like this:
[{['$1', '$2', {'$1', '$2'}], [], []}, {['$1', '$2', ['$1', '$2' | '_']], [], []}]
Match two arguments where the first is a tuple beginning with a list which in turn begins with the second argument times two (i.e., [{[4,x],y},2] or [{[8], y, z},4]):
[{['$1', '$2'], [{'=:=', {'*', 2, '$2'}, {hd, {element, 1, '$1'}}}], []}]
Match three arguments. When all three are equal and are numbers, append the process dump to the trace message, else let the trace message be as is, but set the sequential trace token label to 4711.
[{['$1', '$1', '$1'], [{is_number, '$1'}], [{message, {process_dump}}]}, {'_', [], [{set_seq_token, label, 4711}]}]
As can be noted above, the parameter list can be matched against a
single MatchVariable or an '_'. To replace the whole parameter list
with a single variable is a special case. In all other cases the
MatchHead has to be a proper list.
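Tracing specifications such as the ones above can be checked without starting a trace session by using erlang:match_spec_test/3 with the type trace. The helper below is a sketch; the module name ms_trace_test_demo and the function matches/2 are hypothetical. It only distinguishes "matched" from "did not match" and deliberately ignores the exact result term.

```erlang
-module(ms_trace_test_demo).
-export([matches/2]).

%% Args is the argument list of the (imagined) traced call.
matches(Args, MS) when is_list(Args) ->
    case erlang:match_spec_test(Args, MS, trace) of
        {ok, false, _Flags, _Warnings} -> false;   % no clause matched
        {ok, _Result, _Flags, _Warnings} -> true
    end.
```

For example, matches([1, 2, 1], [{['$1', '_', '$1'], [], []}]) should be true because the first and third arguments are equal, while the same specification applied to [1, 2, 3] should give false.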
Match all objects in an ets table where the first element is the atom 'strider' and the tuple arity is 3 and return the whole object.
[{{strider,'_','_'}, [], ['$_']}]
Match all objects in an ets table with arity > 1 and the first element is 'gandalf', return element 2.
[{'$1', [{'==', gandalf, {element, 1, '$1'}},{'>=',{size, '$1'},2}], [{element,2,'$1'}]}]
In the above example, if the first element had been the key, it
would be much more efficient to match that key in the MatchHead part
than in the MatchConditions part. The search space of the tables is
restricted with regard to the MatchHead so that only objects with
the matching key are searched.
Match tuples of 3 elements where the second element is either 'merry' or 'pippin', return the whole objects.
[{{'_',merry,'_'}, [], ['$_']}, {{'_',pippin,'_'}, [], ['$_']}]
The function ets:test_ms/2 can be useful for testing complicated ets
matches.
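As a sketch of such testing, the module below (ms_test_ms_demo, a hypothetical name, as are its functions) wraps two of the ETS examples above and also shows what happens with a malformed specification. The guard name no_such_guard is made up on purpose so that the specification is invalid.

```erlang
-module(ms_test_ms_demo).
-export([strider/1, hobbit/1, broken/0]).

%% Objects of arity 3 whose first element is the atom strider.
strider(Tuple) ->
    ets:test_ms(Tuple, [{{strider, '_', '_'}, [], ['$_']}]).

%% Objects of arity 3 whose second element is merry or pippin.
hobbit(Tuple) ->
    ets:test_ms(Tuple, [{{'_', merry, '_'}, [], ['$_']},
                        {{'_', pippin, '_'}, [], ['$_']}]).

%% An invalid specification is reported as {error, Errors}.
broken() ->
    ets:test_ms({a}, [{{'$1'}, [{no_such_guard, '$1'}], ['$_']}]).
```

A matching object comes back as {ok, Object}, for example strider({strider, ranger, 87}); a non-matching one such as strider({strider, x}) gives {ok, false}. Note that {ok, false} is also what a body returning false would produce, so it helps to test with bodies that return something else.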