To administer Zebra, you run the zebraidx program. This program
supports a number of options, which are preceded by a minus sign, and
a few commands (not preceded by a minus sign).
Both the Zebra administrative tool and the Z39.50 server share a
set of index files and a global configuration file. The
name of the configuration file defaults to zebra.cfg.
The configuration file includes specifications on how to index
various kinds of records and where the other configuration files
are located. zebrasrv and zebraidx must be run in the directory
where the configuration file lives unless you indicate the location
of the configuration file with the -c option.
Indexing is a per-record process. Before a record is indexed, search
keys are extracted from whatever layout the original record has
(SGML, HTML, text, etc.).
The Zebra system currently supports two fundamental types of records:
structured and simple text.
To specify a particular extraction process, use either the
command line option -t or specify a recordType setting in the
configuration file.
The Zebra configuration file, read by zebraidx and zebrasrv,
defaults to zebra.cfg unless another file is specified with the
-c option.
You can edit the configuration file with a normal text editor.
Parameter names and values are separated by colons in the file. Lines
starting with a hash sign (#) are treated as comments.
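As a minimal sketch of this syntax, the fragment below shows comment
lines followed by settings. It reuses setting names and values that
appear elsewhere in this chapter; they are illustrative only.

```
# Directories searched for profile specification files
profilePath: /usr/lib/yaz/tab:/usr/lib/zebra/tab
# Treat all records as plain text unless overridden
recordType: text
```

Note that the profilePath value itself contains a colon. The sketch
assumes that only the first colon on a line separates the name from
the value.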
If you manage different sets of records that share common
characteristics, you can organize the configuration settings for each
type into "groups".
When zebraidx is run and you wish to address a given group, you
specify the group name with the -g option. In this case, settings
that have the group name as their prefix will be used by zebraidx.
If no -g option is specified, the settings with no prefix are used.
In the configuration file, the group name is placed before the option
name itself, separated by a dot (.). For instance, to set the record
type for group public to grs.sgml (the SGML-like format for
structured records) you would write:
public.recordType: grs.sgml
To set the default value of the record type to text, write:
recordType: text
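Putting the two forms together, a configuration file might hold a
default and a group-specific value side by side. The sketch below
reuses the public group from the text above; which line takes effect
depends on whether zebraidx is run with -g public.

```
# Used when zebraidx is run without -g
recordType: text
# Used when zebraidx is run with -g public
public.recordType: grs.sgml
```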
The available configuration settings are summarized below. They will be explained further in the following sections.
group.recordType[.name]
Specifies how records with the file extension name should
be handled by the indexer. This option may also be specified
as a command line option (-t). Note that if you do not
specify a name, the setting applies to all files. In general,
the record type specifier consists of the following elements,
separated by dots: fundamental-type, file-read-type and arguments.
Currently, two fundamental types exist, text and grs.
group.recordId
Specifies how the records are to be identified when updated. See
section Locating Records.
group.database
Specifies the Z39.50 database name.
group.storeKeys
Specifies whether key information should be saved for a given group
of records. If you plan to update or delete records of this type
later, this should be specified as 1; otherwise it should be 0
(default), to save register space.
group.storeData
Specifies whether the records should be stored internally in the
Zebra system files. If you want to maintain the raw records yourself,
this option should be false (0). If you want Zebra to take care of
the records for you, it should be true (1).
lockDir
Directory in which various lock files are stored.
keyTmpDir
Directory in which temporary files used during the update phase of
zebraidx are stored.
setTmpDir
Specifies the directory that the server uses for temporary result
sets. If not specified, /tmp will be used.
profilePath
Specifies the location of profile specification files.
attset
Specifies the filename(s) of attribute set files for use in
searching. At least the Bib-1 set should be loaded (bib1.att).
The profilePath setting is used to look for the specified files.
See section The Attribute Set Files.
memMax
Specifies the size of internal memory to use for the zebraidx
program. The amount is given in megabytes; the default is 4 (4 MB).
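The fragment below is a hedged sketch of how several of these
settings might be combined for a group whose records will be updated
later. The group name public, the sgml extension, the memory size and
all directory paths are hypothetical placeholders. The setting names
(recordType, storeKeys, storeData, lockDir, keyTmpDir, setTmpDir,
memMax) are assumed to match those of your Zebra version, so check
them against your installation before use.

```
# Files with the extension sgml in group public are structured records
public.recordType.sgml: grs.sgml
# Keep key information so these records can be updated or deleted later
public.storeKeys: 1
# The raw records are maintained outside Zebra
public.storeData: 0
# Hypothetical directories for lock files and temporary files
lockDir: /var/zebra/lock
keyTmpDir: /var/zebra/tmp
setTmpDir: /var/zebra/sets
# Memory available to zebraidx, in megabytes
memMax: 8
```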
The default behaviour of the Zebra system is to reference the
records from their original location, i.e. where they were found when
you ran zebraidx. That is, when a client wishes to retrieve a record
following a search operation, the files are accessed from the place
where you originally put them. If you remove the files (without
running zebraidx again), the client will receive a diagnostic
message.
If your input files are not permanent, for example if you retrieve
your records from an outside source or if they were temporarily
mounted on a CD-ROM drive, you may want Zebra to make an internal
copy of them. To do this, you specify 1 (true) in the storeData
setting. When the Z39.50 server retrieves the records, they will be
read from the internal file structures of the system.
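For instance, the setting could be written in the configuration file
as follows. This is a minimal sketch that applies the setting without
a group prefix.

```
# Store an internal copy of each record in the Zebra system files
storeData: 1
```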
Consider a system in which you have a group of text files called
simple. That group of records should belong to a Z39.50 database
called textbase. The following zebra.cfg file will suffice:
profilePath: /usr/lib/yaz/tab:/usr/lib/zebra/tab
attset: explain.att
attset: bib1.att
simple.recordType: text
simple.database: textbase