NPULL (C-1)





Copyright (C) 1994-95 Conetic Software Systems, Inc. All names, products, and services mentioned are the trademarks of their respective organizations.



NAME


npull - a fast data record extraction utility (uses indexes)

SYNOPSIS


npull [-abdfnpr] [-g#] [-s|R|i indexno|I indexlist] [-tSEP] [-Bstring]
[-Estring] [-N#] datafile [printfields] [selectfields]
or  npull -h [-tSEP] [-Bstring] [-Estring] datafile
or  npull -c [-dfn(p[r])] [-g#] [-tSEP] [-Bstring] [-Estring] [-N#]
controlfile [procname]
or  npull ?

DESCRIPTION


npull(C-1) reads the specified datafile and produces on the standard 
output one text line for each selected record.  Each output line is made 
up of fields separated by carets (^), followed by the record number in 
the datafile.  A field in a data record is output only if it is listed 
as one of the printfield parameters in the command line.  An output line 
is generated only if the data values in the record fall within the bounds 
specified by the selectfield parameters.

A printfield parameter specifies a field name within the datafile.  A 
value is output onto the standard output file for each printfield listed, 
in the order the printfields are listed.
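
As an illustration of the output format described above, the following 
Python sketch splits one output line into its field values and the 
trailing record number.  The function name parse_npull_line and the sample 
line are hypothetical.  The sketch assumes that -r was not given and that 
no field value contains the separator itself.

```python
def parse_npull_line(line, sep="^"):
    """Split one npull output line into (field values, record number).

    Assumes the default layout: values separated by `sep`, with the
    record number as the final item (that is, -r was not used).
    """
    *values, recno = line.rstrip("\n").split(sep)
    return values, int(recno)


# Hypothetical output line for two printfields.
values, recno = parse_npull_line("1010^Cash^17\n")
print(values, recno)
```

If a separator other than the caret was chosen with -t, pass the same 
string as sep.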

A selectfield parameter has one of the following forms:

	"fieldname>value"       	greater than
	"fieldname>=value"      	greater than or equal to
	"fieldname=value"       	equal to
	"fieldname<>value"      	not equal to
	"fieldname<=value"      	less than or equal to
	"fieldname<value"       	less than

Enclose each selectfield in quotes so that the shell does not interpret 
the > and < symbols.  A condition with an '@' appended onto it causes 
npull to ignore the condition if there is no value present.
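
The selectfield forms can be made concrete with a small Python sketch that 
splits a selectfield into field name, operator, and value.  This is not 
how npull itself parses its arguments.  The names SelectField and 
parse_selectfield are hypothetical, and the sketch only records a trailing 
'@' as a flag without interpreting it.

```python
import re
from typing import NamedTuple

# Two-character operators come first so ">=" is not read as ">" then "=".
_SELECT_RE = re.compile(
    r"^(?P<field>[^<>=]+)(?P<op>>=|<=|<>|>|<|=)(?P<value>.*)$"
)


class SelectField(NamedTuple):
    field: str
    op: str
    value: str
    optional: bool  # True when the condition ended with '@'


def parse_selectfield(text):
    """Parse e.g. 'glacct>=100' into a SelectField tuple."""
    optional = text.endswith("@")
    if optional:
        text = text[:-1]
    match = _SELECT_RE.match(text)
    if match is None:
        raise ValueError(f"not a selectfield: {text!r}")
    return SelectField(match["field"], match["op"], match["value"], optional)


print(parse_selectfield("glacct>=100"))
```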

Whenever possible, npull(C-1) will try to use an index to read the 
data file, based upon the selectfields given.  First it will try to use 
all "=" (equal) conditions on any of the keys in the file.  If it finds 
a suitable key, it can then use a dfindk(C-3) search if it is the primary 
key, or a dfindm(C-3) search if it is a secondary key.  If it can not 
find a suitable key, it then tries to use part of a key or conditions 
that use ">" (greater than) or ">=" (greater than or equal to).  If there
is a suitable key for these conditions, it will then use a dfindi(C-3)
search.  If there are no suitable keys after these checks, then it will
do a sequential read of the file, using dfind(C-3).
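
The order of these checks can be sketched in Python as follows.  This is 
a simplified model and not npull's actual logic: the function name 
choose_search is hypothetical, keys are modelled as plain lists of field 
names with the primary key first, and "part of a key" is assumed to mean 
that a condition exists on the key's leading field.

```python
def choose_search(keys, conditions):
    """Return (routine_name, index_number) for a simplified model.

    keys: list of field-name lists; keys[0] is the primary key.
    conditions: list of (fieldname, operator) pairs.
    """
    equal = {field for field, op in conditions if op == "="}
    ranged = {field for field, op in conditions if op in (">", ">=")}

    # Step 1: a key whose fields are all matched by "=" conditions.
    for indexno, key in enumerate(keys):
        if key and all(field in equal for field in key):
            return ("dfindk" if indexno == 0 else "dfindm", indexno)

    # Step 2: a key whose leading field has an "=", ">" or ">=" condition
    # (an assumption about what "part of a key" means).
    for indexno, key in enumerate(keys):
        if key and key[0] in (equal | ranged):
            return ("dfindi", indexno)

    # Step 3: no usable key, so read the file sequentially.
    return ("dfind", None)


# Hypothetical file with a primary key and one secondary key.
keys = [["invoice"], ["customer", "date"]]
print(choose_search(keys, [("customer", "="), ("date", ">=")]))
```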

OPTIONS

If the -a option is specified, then all of the fields in the data file 
are pulled, and any printfields on the command line are ignored.

When -Bstring is used, npull will place 'string' at the beginning of 
each output record.  With -Estring, 'string' is appended onto the end 
of each output record.

If the -b option is used, npull reads the file backwards starting at the 
end of the file.

If the -c option is used, the path of a controlfile must be specified.  
This controlfile allows you to pull multiple files simultaneously.  
This type of pull is used for grace reports in which you want to select 
records in one file based upon the contents of records in other files.  
An example would be sorting invoice detail records by the product in the 
detail and the date in the matching header records.  The output from this 
type of pull could then be piped into wtr(C-1) using the -s option of 
wtr.  The procname of the procedure you wish to use from the controlfile 
may optionally be specified.

If the -d option is used, DATE and TIME fields are output as decimal 
numbers with a value equivalent to the internal representation of the 
field.  Normally, DATE and TIME fields are output in their standard 
format, i.e., dates as MM/DD/YY and times as HH:MM:SS.  With -d, a date 
field is instead output as a long integer giving the number of days from 
January 1, 1800.  This allows programs such as csort(C-1) to sort date 
fields in date order.  put(C-1) converts such fields back to their proper 
internal representation automatically.
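
The day-count form of a date can be converted to and from a calendar date 
with a short Python sketch.  The function names are hypothetical.  The 
manual does not say whether January 1, 1800 itself is numbered 0 or 1, so 
the first_day parameter (default 0) is an assumption that should be checked 
against real npull -d output.

```python
from datetime import date, timedelta

EPOCH = date(1800, 1, 1)


def day_number_to_date(days, first_day=0):
    """Convert a -d style day count to a calendar date.

    first_day is the number assigned to January 1, 1800.  Whether that
    is 0 or 1 is not stated in the manual; 0 is only an assumption.
    """
    return EPOCH + timedelta(days=days - first_day)


def date_to_day_number(value, first_day=0):
    """Inverse of day_number_to_date, under the same assumption."""
    return (value - EPOCH).days + first_day


n = date_to_day_number(date(1995, 6, 15))
print(n, day_number_to_date(n))
```

Integer day counts compare in date order, whereas MM/DD/YY strings do 
not: as text, "01/01/95" sorts before "12/31/94".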

If the -f option is specified, then the first line output contains the 
list of printfield names that are being output.  Each printfield name 
is separated from the next by a caret in the same manner the printfield 
values are separated.

If the -g# option is used, npull(C-1) will print debugging information 
to standard error.  This output can be useful when npull(C-1) does not 
pull the records you are expecting.  This option is most useful when 
using the -c option.  There are 4 levels of debugging output.  1 gives 
the least amount of information, 4 the most.

If the -i# option is used, npull(C-1) will read the file using the 
specified index number.

If the -Iindexlist option is used, npull will try to find an index in the 
file which starts with the field names in indexlist.  The field names 
should be separated by commas, for example: -Iorder,sequence

If the -n option is used, npull(C-1) will print to stderr the number of 
records pulled.

If the -N# option is used, npull(C-1) will output at most the given 
number of records, then stop.

If the -p option is used, npull(C-1) will format its output in 
dprint(C-1) style. That is, it will print  "fieldname=fieldvalue", one
per line, and print the record number on the first line.

If the -r option is used, the record numbers are stripped from the output.

If the -s option is used, npull(C-1) will always use a sequential read 
of the file, rather than using one of the indexes.  This is useful when 
you are attempting to repair a damaged file and the keys are corrupted.

If the -R option is used, npull(C-1) will read standard input for a list 
of record numbers to use when reading the file.  Each record number must 
be at the end of its line and must be preceded by a delimiting caret (^).  
This is the same format that npull(C-1) uses for its output.

If the -t option is used, npull(C-1) will use the string following the 
't' as the field separator, rather than a caret.

If the -h option is used, npull(C-1) will only output the complete field 
list for the file.  This will give you the same output as if you had 
specified the -f and -a options, but with no data being output.

The ? option provides additional information on each option.

The following is an example of npull(C-1) extracting all General Ledger 
Chart of Accounts records whose glacct value is greater than or equal to 
"100" and less than "200":

npull -af -i0 cbooks~glacct "glacct>=100" "glacct<200"

CONTROL FILES

Syntax for control files:
%BEGIN [procname]       		(required)
%FILE filename [selectfields...]	(may be repeated)
%SEQUENTIAL filename       		(optional command)
%BACKWARDS filename       		(optional command)
%INDEXNO filename indexno  		(optional command)
%INDEXLIST filename indexlist 		(optional command)
%PRINT filename[.fieldname] [...]	(may be repeated)
%PRINTALL filename [...]       		(may be repeated)
%END [comment]       			(required)

Lines that do not begin with one of the commands listed above are ignored.  
This allows an npull(C-1) control file to be placed inside a grace 
report file, within a section of comments.

You may put more than one control procedure in a control file.  If a 
procname is given as a parameter on the command line, then that procedure 
will be used.  If no procname is given, the first procedure encountered 
in the file will be used.  An optional comment may be placed after the 
%END statement.

The %FILE statement declares which files are to be opened and read.  
Files are scanned in the order specified.  Selectfields are the same as 
those described earlier, except that a field in another file may be used 
in place of a value, written as follows:
                             
 		~filename.fieldname

%PRINT statements declare which fields are to be sent to standard output.  
If the fieldname is not given, and only the filename is there, then the 
current record number of that file will be output.  %PRINTALL statements 
declare files in which all fields are to be output, including the record 
number.  This is the same as using the -a option on a normal pull.

%SEQUENTIAL forces a sequential read of the specified file.  %INDEXNO 
forces a read of the file based on the given index number (primary key 
is index #0). %INDEXLIST finds an index containing at least the specified 
fields.  Fieldnames should be separated by commas and nothing else.  
%BACKWARDS forces a backwards read of the specified file.  %BACKWARDS 
may be used with %INDEXNO, %INDEXLIST, and %SEQUENTIAL. %SEQUENTIAL can 
not be used with %INDEXNO or %INDEXLIST. Also, %INDEXNO and %INDEXLIST 
can not be used together.
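
Taken together, these restrictions mean that each file may have at most 
one of %SEQUENTIAL, %INDEXNO, and %INDEXLIST, while %BACKWARDS may be 
combined with any of them.  A minimal Python sketch of that check follows.  
The function name check_access_commands and the (command, filename) pair 
representation are hypothetical.

```python
def check_access_commands(commands):
    """Return a list of error strings for conflicting access commands.

    commands: (command, filename) pairs, e.g. ("%INDEXNO", "product").
    """
    exclusive = {"%SEQUENTIAL", "%INDEXNO", "%INDEXLIST"}
    seen = {}  # filename -> set of mutually exclusive commands used
    for command, filename in commands:
        if command in exclusive:
            seen.setdefault(filename, set()).add(command)
    return [
        f"{filename}: {' and '.join(sorted(used))} cannot be combined"
        for filename, used in sorted(seen.items())
        if len(used) > 1
    ]


# %INDEXNO and %INDEXLIST on the same file conflict.
print(check_access_commands([("%INDEXNO", "product"),
                             ("%INDEXLIST", "product")]))
```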

CONTROL FILE EXAMPLE

Suppose you wish to produce a report of all the products you sold, sorted 
by the invoice date, the product category, and the product number.  You
could use the following control file:

%BEGIN all_products_sold_by_category
%FILE invoiced
%SEQUENTIAL invoiced
%FILE product product=~invoiced.product
%INDEXNO product 0
%FILE invoicem invoice=~invoiced.invoice
%INDEXLIST invoicem invoice
%PRINT invoicem.date product.category invoiced.product invoiced
%END

This control file would do a sequential read on the invoiced file, use 
index number 0 (the primary key) on the product file, and find and use 
an index that starts with the invoice field on the invoicem file.  
If any of those commands is omitted, npull will determine an 
appropriate index to use on that file.

Note: invoiced is the last item on the %PRINT line so that the invoiced 
record numbers are placed at the end of each line for wtr to use.  
Assuming that the above control file was placed inside the grace program 
"report" as a comment, and that the grace program was previously compiled 
into the file "report.rw", the shell command to use it would be:

npull -c report | csort -0 -1 -2 | wtr -s invoiced report.rw | lp -s