Introduction
This document is the official description of the Plucker format.
Overview
All Plucker documents have an index record followed by a series of data records.
The Index Record
This record describes the compression type used for the Plucker document and the IDs used by the reserved records. The viewer uses it to locate the reserved records and to determine whether it needs support for ZLib compression. This record should always be the first record in the Plucker document (i.e. at index 0).
Field | Bytes | Type | Notes |
uid | 2 | Numeric | unique ID for record, always 0x0001 |
version | 2 | Numeric | 0x0002 if data is ZLib compressed, 0x0001 if DOC compressed |
records | 2 | Numeric | number of reserved records |
reserved | 4*records | Numeric | reserved ID array |
The reserved ID array consists of a series of name/ID pairs, where the ID is the unique ID (2 bytes) for the record and the name is a value (2 bytes) from the following list.
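As an illustration of the layout above, here is a minimal Python sketch of reading the index record. The function name parse_index_record is hypothetical, and the sketch assumes that all 2-byte values are stored big-endian; this document does not state the byte order.

```python
import struct


def parse_index_record(data: bytes) -> dict:
    """Parse an index record into its fields.

    Assumption: 2-byte values are big-endian (not stated in the format text).
    """
    uid, version, count = struct.unpack_from(">HHH", data, 0)
    if uid != 0x0001:
        raise ValueError(f"unexpected index record uid: {uid:#06x}")

    # version doubles as the compression indicator
    compression = {0x0001: "DOC", 0x0002: "ZLib"}.get(version)
    if compression is None:
        raise ValueError(f"unknown version: {version:#06x}")

    # the reserved array holds `count` name/ID pairs of 2 + 2 bytes each
    reserved = []
    for i in range(count):
        name, record_id = struct.unpack_from(">HH", data, 6 + 4 * i)
        reserved.append((name, record_id))

    return {"uid": uid, "compression": compression, "reserved": reserved}
```

The name values are returned as plain numbers, so the caller can map them to the reserved record names.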
The Data Records
There are several different types of data records.
Text records start with a header, followed by a series of paragraph headers and then the compressed/uncompressed data; all other record types have only a header and data.
NOTE: No text data should be larger than 32k. If the original document is larger than 32k, the parser has to split it into several records. Images must be less than 60k uncompressed.
Field | Bytes | Type | Notes |
uid | 2 | Numeric | unique ID for record |
paragraphs | 2 | Numeric | number of paragraphs |
size | 2 | Numeric | total length of data before compression |
type | 2 | Numeric | data type (only the first byte of the type is used, the second byte should be set to 0x0) |
Data type | Value |
DATATYPE_PHTML | 0 |
DATATYPE_PHTML_COMPRESSED | 1 |
DATATYPE_TBMP | 2 |
DATATYPE_TBMP_COMPRESSED | 3 |
DATATYPE_MAILTO | 4 |
DATATYPE_LINK_INDEX | 5 |
DATATYPE_LINKS | 6 |
DATATYPE_LINKS_COMPRESSED | 7 |
DATATYPE_BOOKMARKS | 8 |
DATATYPE_CATEGORY | 9 |
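The header and the data type values can be read as in the following sketch. The names DataType, COMPRESSED_TYPES and parse_record_header are illustrative only, and big-endian byte order is an assumption that this document does not confirm.

```python
import struct
from enum import IntEnum


class DataType(IntEnum):
    PHTML = 0
    PHTML_COMPRESSED = 1
    TBMP = 2
    TBMP_COMPRESSED = 3
    MAILTO = 4
    LINK_INDEX = 5
    LINKS = 6
    LINKS_COMPRESSED = 7
    BOOKMARKS = 8
    CATEGORY = 9


COMPRESSED_TYPES = {
    DataType.PHTML_COMPRESSED,
    DataType.TBMP_COMPRESSED,
    DataType.LINKS_COMPRESSED,
}

HEADER_SIZE = 8  # uid + paragraphs + size + type, 2 bytes each


def parse_record_header(data: bytes) -> dict:
    """Parse the 8-byte data record header (big-endian is an assumption)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("record shorter than its header")
    uid, paragraphs, size = struct.unpack_from(">HHH", data, 0)
    # only the first byte of the 2-byte type field is used
    data_type = DataType(data[6])
    return {
        "uid": uid,
        "paragraphs": paragraphs,
        "size": size,
        "type": data_type,
        "compressed": data_type in COMPRESSED_TYPES,
    }
```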
For text data, the header is followed by a series of paragraph headers, each of which represents a paragraph block in the text data.
Field | Bytes | Type | Notes |
size | 2 | Numeric | total length of paragraph before compression |
attributes | 2 | Numeric | paragraph info |
The first 5 bits in the attributes are unused; the 3 LSBs indicate the amount of extra paragraph spacing (2*value pixels).
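A short sketch of walking the paragraph headers of a text record and deriving the extra spacing from the 3 least significant bits of the attributes. The function name is hypothetical, and the sketch assumes big-endian values and that the paragraph headers start directly after the 8-byte record header.

```python
import struct

HEADER_SIZE = 8  # uid, paragraphs, size, type


def parse_paragraph_headers(record: bytes):
    """Return (paragraph list, offset of the text data that follows).

    Assumption: big-endian values, paragraph headers start at byte 8.
    """
    (paragraphs,) = struct.unpack_from(">H", record, 2)
    result = []
    offset = HEADER_SIZE
    for _ in range(paragraphs):
        size, attributes = struct.unpack_from(">HH", record, offset)
        # the 3 LSBs hold the spacing value; spacing is 2*value pixels
        spacing = 2 * (attributes & 0x07)
        result.append({"size": size, "extra_spacing_px": spacing})
        offset += 4
    return result, offset
```

The returned offset marks where the compressed/uncompressed text data begins.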
Text data
The (uncompressed) text data contains either characters or 'functions'.
A function is introduced by a NUL byte (\0), followed by a function code
and up to 7 bytes of data. The 3 LSBs of a function code give the number
of data bytes that follow the code; the 5 MSBs represent the actual
function code.
Code | Description | Bytes | Arguments |
0x0A | Anchor begins | 2 | record ID |
0x0C | Named anchor begins | 4 | record ID, paragraph offset |
0x08 | Anchor ends | 0 | no data |
0x11 | Set style | 1 | font style |
0x1A | Embedded image | 2 | record ID |
0x22 | Set margin | 2 | left margin, right margin |
0x29 | Alignment of text | 1 | alignment |
0x33 | Horizontal rule | 3 | height, width (pixels), width (%) |
0x38 | New line | 0 | no data |
0x40 | Italic text begins | 0 | no data |
0x48 | Italic text ends | 0 | no data |
0x5C | Multiple embedded image | 4 | alternate image ID, image ID |
0x60 | Underline text begins | 0 | no data |
0x68 | Underline text ends | 0 | no data |
0x70 | Strike-through text begins | 0 | no data |
0x78 | Strike-through text ends | 0 | no data |
Argument | Bytes | Notes |
record ID | 2 | reference to record in Plucker document |
image ID | 2 | reference to image in Plucker document |
paragraph offset | 2 | paragraph number (starting from 0) to jump to |
font style | 1 | |
left margin | 1 | left margin in pixels |
right margin | 1 | right margin in pixels |
alignment | 1 | alignment code (left = 0, right = 1, center = 2) |
height | 1 | height of horizontal rule in pixels, if not given a default value of 2 pixels will be used |
width (pixels) | 1 | width in pixels, should be 0 if percentage value should be used |
width (%) | 1 | width as the percentage between the current left and right margins. The default is 100% |
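Because the argument length is encoded in the function code itself, a reader can step over functions without knowing each one. The sketch below splits uncompressed text data into text runs and functions; the names FUNCTION_NAMES and tokenize are illustrative and not part of the format.

```python
FUNCTION_NAMES = {
    0x0A: "anchor begins",
    0x0C: "named anchor begins",
    0x08: "anchor ends",
    0x11: "set style",
    0x1A: "embedded image",
    0x22: "set margin",
    0x29: "alignment of text",
    0x33: "horizontal rule",
    0x38: "new line",
    0x40: "italic text begins",
    0x48: "italic text ends",
    0x5C: "multiple embedded image",
    0x60: "underline text begins",
    0x68: "underline text ends",
    0x70: "strike-through text begins",
    0x78: "strike-through text ends",
}


def tokenize(text: bytes):
    """Yield ("text", bytes) and ("function", code, args) tuples."""
    i = 0
    start = 0
    n = len(text)
    while i < n:
        if text[i] != 0:
            i += 1
            continue
        if i > start:
            yield ("text", text[start:i])
        if i + 1 >= n:
            raise ValueError("truncated function: missing function code")
        code = text[i + 1]
        arg_len = code & 0x07  # 3 LSBs: number of argument bytes
        args = text[i + 2 : i + 2 + arg_len]
        if len(args) != arg_len:
            raise ValueError("truncated function arguments")
        yield ("function", code, args)
        # skip the NUL, the code and the arguments in one step, so that
        # zero bytes inside the arguments are not mistaken for a new function
        i += 2 + arg_len
        start = i
    if start < n:
        yield ("text", text[start:])
```

For example, the byte 0x0C has the low bits 100, so four argument bytes follow, which matches the record ID and paragraph offset of a named anchor.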
Image data
The image data is the Tbmp bitmap, either compressed or uncompressed as indicated by the type in the header.
Mailto data
The mailto data contains information about the e-mail addresses that
are referenced by the mailto anchors. All offsets are counted
from the end of the header.
Field | Bytes | Type | Notes |
to_offset | 2 | Numeric | offset to TO string |
cc_offset | 2 | Numeric | offset to CC string |
subject_offset | 2 | Numeric | offset to SUBJECT string |
body_offset | 2 | Numeric | offset to BODY string |
strings | 0+ | NULL-terminated strings | a list of To, Cc, Subject and Body strings (if any) |
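A sketch of extracting the four strings from mailto data. It rests on several assumptions that are not spelled out here: values are big-endian, "the header" is the 8-byte record header (so the function takes the bytes that follow it), and an offset of 0 means that the string is absent. The names read_cstring and parse_mailto are hypothetical.

```python
import struct


def read_cstring(buf: bytes, offset: int) -> bytes:
    """Return the bytes from offset up to (not including) the next NUL."""
    end = buf.index(b"\x00", offset)
    return buf[offset:end]


def parse_mailto(payload: bytes) -> dict:
    """Parse mailto data; payload is the record content after the header.

    Assumptions: big-endian offsets relative to the start of payload,
    and offset 0 marks a missing string.
    """
    offsets = struct.unpack_from(">HHHH", payload, 0)
    fields = {}
    for name, off in zip(("to", "cc", "subject", "body"), offsets):
        fields[name] = read_cstring(payload, off) if off else None
    return fields
```

The strings are returned as raw bytes, since the character encoding is not specified in this section.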
URL handling data
The URL handling data is used to find the correct URL string. It
contains a series of 2-byte number pairs.
Field | Bytes | Type | Notes |
last_url | 2 | Numeric | the record ID of the last URL in the group |
id | 2 | Numeric | record ID for the URL record containing the group |
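One way to use these pairs is sketched below: scan for the first group whose last_url is not smaller than the wanted record ID. The function names are illustrative. The sketch assumes big-endian values, pairs sorted by ascending last_url, and that the first URL belongs to record ID 1 (the first_id parameter); none of these is confirmed by this document.

```python
import struct


def parse_link_index(payload: bytes):
    """Return the list of (last_url, record_id) pairs (big-endian assumed)."""
    pairs = []
    for off in range(0, len(payload) - 3, 4):
        last_url, record_id = struct.unpack_from(">HH", payload, off)
        pairs.append((last_url, record_id))
    return pairs


def find_url_record(pairs, target_id, first_id=1):
    """Return (URL record ID, position inside that record) or None.

    Assumption: groups are sorted and numbering starts at first_id.
    """
    first = first_id
    for last_url, record_id in pairs:
        if target_id <= last_url:
            return record_id, target_id - first
        first = last_url + 1
    return None
```

The second element of the result is the zero-based position of the URL string inside the URL record that holds the group.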
URL data
The URL data contains a list of the URLs. Each record contains up
to 200 URLs; additional records are created if needed.
Field | Bytes | Type | Notes |
URLs | 1+ | NULL-terminated strings | a list of up to 200 URLs (only text and image records are included, other records are represented only by the presence of a NULL) |
These records may or may not be compressed, as indicated by the type in the header. They are used by the Details form to display the URL of the current record and by the External Reference form to display the URL of pages that were not collected. From either form you can copy the URL to a Memo to remind you to pluck it at a later date.
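Splitting an uncompressed URL record into its entries can be sketched as follows. The function name split_urls is hypothetical; a lone NUL is returned as None to mark a record that has no URL string.

```python
def split_urls(payload: bytes):
    """Split uncompressed URL data into a list of URLs.

    Entries that consist only of the terminating NUL become None.
    """
    parts = payload.split(b"\x00")
    if parts and parts[-1] == b"":
        parts.pop()  # drop the empty piece after the final terminator
    return [part if part else None for part in parts]
```

Compressed URL records would have to be decompressed first, using the method given by the version field of the index record.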
External bookmarks data
The external bookmarks data contains a list of bookmarks added by the
parser. These bookmarks work just like named anchors.
Field | Bytes | Type | Notes |
bookmarks | 2 | Numeric | number of bookmarks |
offset | 2 | Numeric | offset to the start of the bookmark data (counting from the beginning of the record) |
names | < 21*bookmarks | NULL-terminated strings | a list of bookmark names (each name is max 20 chars) |
bookmark_data | 4*bookmarks | Bookmark Data | block of data for the location of the external bookmarks (see below) |
The bookmark data is a series of uid/offset pairs.
Field | Bytes | Type | Notes |
uid | 2 | Numeric | unique ID for record |
offset | 2 | Numeric | paragraph offset |
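The two tables above can be combined into a small reader, sketched below. It assumes big-endian values, that the bookmark fields start after the 8-byte record header, and that the offset is counted from byte 0 of the whole record including that header. The function name parse_bookmarks is hypothetical.

```python
import struct

HEADER_SIZE = 8  # uid, paragraphs, size, type


def parse_bookmarks(record: bytes):
    """Return a list of (name, uid, paragraph offset) tuples.

    Assumptions: big-endian values; `offset` is relative to the start
    of the record, record header included.
    """
    count, data_offset = struct.unpack_from(">HH", record, HEADER_SIZE)

    # read `count` NUL-terminated names that follow the two counters
    names = []
    pos = HEADER_SIZE + 4
    for _ in range(count):
        end = record.index(b"\x00", pos)
        names.append(record[pos:end])
        pos = end + 1

    # each bookmark has one uid/offset pair of 2 + 2 bytes
    result = []
    for i, name in enumerate(names):
        uid, paragraph = struct.unpack_from(">HH", record, data_offset + 4 * i)
        result.append((name, uid, paragraph))
    return result
```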
© Copyright 2000 Michael Nordström <micke@sslug.dk> | $Id: PluckerDB.tex,v 1.15 2001/09/13 18:04:24 nordstrom Exp $ |