A. Anatomy of a unit file

1 Basics

As described in chapter GenCode, unit description files (hereafter called PPU files for short) are used to determine whether the unit code must be recompiled. In other words, PPU files act as mini-makefiles that are used to check the dependencies between code modules and to verify whether the modules are up to date. A PPU file also contains all public symbols defined for a module.

The general layout of a PPU file is shown here:

[Figure: ppu.png]

To read or write a ppufile, use the ppu unit (ppu.pas). It provides an object called tppufile, which holds all routines that deal with ppufile handling. While describing the layout of a ppufile, the methods that can be used for each part are presented as well.

A unit file consists of five or six parts:

  1. A unit header.
  2. A general information part (wrongly named interface section in the code).
  3. A definition part. Contains all type and procedure definitions.
  4. A symbol part. Contains all symbol names and references to their definitions.
  5. A browser part. Contains all references from this unit to other units and inside this unit. Only available when the uf_has_browser flag is set in the unit flags.
  6. A file implementation part (currently unused).

2 Reading ppufiles

We first create a ppufile object, which is used in the examples below. As an example, we open the unit test.ppu.

var
  ppufile : pppufile;
begin
  { Initialize object }
  ppufile:=new(pppufile,init('test.ppu'));
  { Open the unit and read the header; returns false when it fails }
  if not ppufile^.openfile then
    error('error opening unit test.ppu');

  { here we can read the unit }

  { close unit }
  ppufile^.closefile;
  { release object }
  dispose(ppufile,done);
end;

Note: When a function fails (for example not enough bytes left in an entry) it sets the ppufile.error variable.

3 The Header

The header consists of a record (tppuheader) containing several pieces of information needed for recompilation. Its layout is shown in table (PPUHeader). The header is always stored in little-endian format.


Table: PPU Header
offset  size (bytes)  description
00h     3             Magic: 'PPU' in ASCII
03h     3             PPU file format version (e.g. '021' in ASCII)
06h     2             Compiler version used to compile this module (major, minor)
08h     2             Code module target processor
0Ah     2             Code module target operating system
0Ch     4             Flags for PPU file
10h     4             Size of PPU file (without header)
14h     4             CRC-32 of the entire PPU file
18h     4             CRC-32 of partial data of PPU file (public data mostly)
1Ch     8             Reserved

The header is already read by the ppufile.openfile command. You can access all fields using ppufile.header which holds the current header record.


Table: PPU CPU Field values
value  description
0      unknown
1      Intel 80x86 or compatible
2      Motorola 680x0 or compatible
3      Alpha AXP or compatible
4      PowerPC or compatible
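
The header layout and the CPU values from the two tables above can be sketched as a packed record. This is a minimal sketch, not the actual tppuheader declaration from ppu.pas: the type name TSketchHeader, its field names, and the helpers HasPpuMagic and CpuName are hypothetical. Only the sizes, offsets, and CPU values are taken from the tables.

```pascal
program ppuheadersketch;

type
  { Hypothetical names; sizes and offsets follow table (PPUHeader). }
  TSketchHeader = packed record
    Magic      : array[1..3] of char;  { 00h: 'PPU' in ASCII }
    Version    : array[1..3] of char;  { 03h: format version, e.g. '021' }
    Compiler   : word;                 { 06h: compiler version }
    Cpu        : word;                 { 08h: target processor }
    Target     : word;                 { 0Ah: target operating system }
    Flags      : longint;              { 0Ch: flags }
    Size       : longint;              { 10h: file size without header }
    Crc        : longint;              { 14h: CRC-32 of the entire file }
    PartialCrc : longint;              { 18h: CRC-32 of partial data }
    Reserved   : array[0..7] of byte;  { 1Ch: reserved }
  end;

{ Returns true when the first three bytes spell 'PPU'. }
function HasPpuMagic(const h : TSketchHeader) : boolean;
begin
  HasPpuMagic:=(h.Magic[1]='P') and (h.Magic[2]='P') and (h.Magic[3]='U');
end;

{ Maps the CPU field to the descriptions in table (PPU CPU Field values). }
function CpuName(cpu : word) : string;
begin
  case cpu of
    1 : CpuName:='Intel 80x86 or compatible';
    2 : CpuName:='Motorola 680x0 or compatible';
    3 : CpuName:='Alpha AXP or compatible';
    4 : CpuName:='PowerPC or compatible';
  else
    CpuName:='unknown';
  end;
end;

var
  h : TSketchHeader;
begin
  { Build a header in memory instead of reading a real file. }
  FillChar(h,SizeOf(h),0);
  h.Magic[1]:='P';
  h.Magic[2]:='P';
  h.Magic[3]:='U';
  h.Cpu:=1;
  writeln('header size : ',SizeOf(h));   { 36 bytes for this packed layout }
  writeln('magic ok    : ',HasPpuMagic(h));
  writeln('cpu         : ',CpuName(h.Cpu));
end.
```

Because the header is always stored in little-endian format, reading it directly into a record like this on a big-endian host would require byte-swapping the multi-byte fields.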

Some of the possible flags in the header are described in table (PPUHeaderFlags). Not all flags are described; for more information, read the source code of ppu.pas.


Table: PPU Header Flag values
Symbolic bit flag name  Description
uf_init                 Module has an initialization (either Delphi or TP style) section.
uf_finalize             Module has a finalization section.
uf_big_endian           All the data stored in the chunks is in big-endian format.
uf_has_browser          Unit contains symbol browser information.
uf_smart_linked         The code module has been smartlinked.
uf_static_linked        The code is statically linked.
uf_has_resources        Unit has resource section.

4 The sections

Apart from the header section, all the data in the PPU file is separated into data blocks, which makes it easy to add new data blocks without compromising backward compatibility. This is similar to both Electronic Arts' IFF chunk format and Microsoft's RIFF chunk format.

Each 'chunk' (tppuentry) has the following format, and can be nested:


Table: chunk data format
offset  size (bytes)  description
00h     1             Block type (nested (2) or main (1))
01h     1             Block identifier
02h     4             Size of this data block
06h+    <variable>    Data for this block

Each main section chunk must end with an end chunk. Nested chunks are used for record, class or object fields.
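
The chunk layout above can be illustrated by walking a byte buffer and skipping each data block by its size field. This is a minimal sketch that does not use ppu.pas: the names EntryHeaderSize, ReadLE32, CountEntries, and Sample are hypothetical, the identifiers in the sample data are made up, and the size field is assumed to be stored in little-endian order. Nested chunks are not descended into.

```pascal
program ppuchunkwalk;

const
  EntryHeaderSize = 6;  { 1 type byte + 1 identifier byte + 4 size bytes }

{ Decode a little-endian 32-bit value starting at buf[ofs]. }
function ReadLE32(const buf : array of byte; ofs : longint) : longint;
begin
  ReadLE32:=longint(buf[ofs]) or (longint(buf[ofs+1]) shl 8) or
            (longint(buf[ofs+2]) shl 16) or (longint(buf[ofs+3]) shl 24);
end;

{ Count the chunks in buf[0..len-1], skipping each payload via its size field. }
function CountEntries(const buf : array of byte; len : longint) : longint;
var
  p,n,datasize : longint;
begin
  p:=0;
  n:=0;
  while p+EntryHeaderSize<=len do
    begin
      datasize:=ReadLE32(buf,p+2);
      if datasize<0 then
        break;  { corrupt size field: stop rather than walk backwards }
      inc(n);
      p:=p+EntryHeaderSize+datasize;
    end;
  CountEntries:=n;
end;

const
  { Two hand-made main chunks: identifier 10 with 3 data bytes,
    then identifier 11 with no data. }
  Sample : array[0..14] of byte =
    (1,10,3,0,0,0,$AA,$BB,$CC,
     1,11,0,0,0,0);

begin
  writeln('entries: ',CountEntries(Sample,15));  { prints "entries: 2" }
end.
```

Skipping by the size field is what lets a reader ignore chunk types it does not know, which is the backward-compatibility property described above.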

To read an entry, call ppufile.readentry:byte. It returns the tppuentry.nr field, which holds the type of the entry. A common pattern (shown here for the symbols) is:

  repeat
    b:=ppufile^.readentry;
    case b of
      ib<etc> :
        begin
        end;
      ibendsyms :
        break;
    end;
  until false;

The possible entry types are found in ppu.pas; a short description of the most common ones is shown in table (PPUEntryTypes).


Table: Possible PPU Entry types
Symbolic name         Location        Description
ibmodulename          General         Name of this unit.
ibsourcefiles         General         Name of source files.
ibusedmacros          General         Name and state of macros used.
ibloadunit            General         Modules used by this unit.
iblinkunitofiles      General         Object files associated with this unit.
iblinkunitstaticlibs  General         Static libraries associated with this unit.
iblinkunitsharedlibs  General         Shared libraries associated with this unit.
ibendinterface        General         End of General information section.
ibstartdefs           Interface       Start of definitions.
ibenddefs             Interface       End of definitions.
ibstartsyms           Interface       Start of symbol data.
ibendsyms             Interface       End of symbol data.
ibendimplementation   Implementation  End of implementation data.
ibendbrowser          Browser         End of browser section.
ibend                 General         End of Unit file.

You can then parse each entry type yourself. ppufile.readentry takes care of skipping any unread bytes in the current entry, so the next entry is always read correctly. A special function is skipuntilentry(untilb:byte):boolean; which reads the ppufile until it finds entry untilb among the main entries.

Parsing an entry can be done with ppufile.getxxx functions. The available functions are:

procedure getdata(var b;len:longint);
function  getbyte:byte;
function  getword:word;
function  getlongint:longint;
function  getreal:ppureal;
function  getstring:string;

To check if you're at the end of an entry you can use the following function:

function  EndOfEntry:boolean;

Notes:
  1. ppureal is the most precise real type available on the CPU for which the unit is created. Currently it is extended for i386 and single for m68k.
  2. The ibobjectdef and ibrecorddef entries store their own definition and symbol sections, so parsing them requires a recursive call. See ppudump.pp for a correct implementation.

A complete list of entries and what their fields contain can be found in ppudump.pp.
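
The reading steps described in this section can be combined into a small program. This is a sketch under stated assumptions, not a replacement for ppudump.pp: it assumes that ppu.pas from the compiler sources is on the unit path and exports pppufile and the ib constants, that ppufile.error is a boolean, that ibmodulename holds a single string, and that ibsourcefiles holds a sequence of strings.

```pascal
program listsources;

uses
  ppu;  { ppu.pas from the compiler sources; assumed to be on the unit path }

var
  ppufile : pppufile;
  b       : byte;
begin
  ppufile:=new(pppufile,init('test.ppu'));
  if not ppufile^.openfile then
    begin
      writeln('error opening unit test.ppu');
      dispose(ppufile,done);
      halt(1);
    end;

  { Walk the general information part; entries that are not handled
    here are skipped by the next readentry call. }
  repeat
    b:=ppufile^.readentry;
    case b of
      ibmodulename :
        { Assumption: the entry data is one string. }
        writeln('module: ',ppufile^.getstring);
      ibsourcefiles :
        { Assumption: the entry data is a list of strings. }
        while not ppufile^.EndOfEntry do
          writeln('source: ',ppufile^.getstring);
      ibendinterface :
        break;
    end;
  until ppufile^.error;

  ppufile^.closefile;
  dispose(ppufile,done);
end.
```

The loop stops either at ibendinterface or when the error variable is set, so a truncated file does not cause an endless loop.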

5 Creating ppufiles

Creating a new ppufile works almost the same as reading one. First initialize the object and call createfile:

  ppufile:=new(pppufile,init('output.ppu'));
  ppufile^.createfile;

After that you can simply write all needed entries. You'll have to take care that you write at least the basic entries for the sections:

  ibendinterface
  ibenddefs
  ibendsyms
  ibendbrowser (only when you've set uf_has_browser!)
  ibendimplementation
  ibend

Writing an entry is a little different from reading it. First put everything that belongs in the entry with ppufile.putxxx:

procedure putdata(var b;len:longint);
procedure putbyte(b:byte);
procedure putword(w:word);
procedure putlongint(l:longint);
procedure putreal(d:ppureal);
procedure putstring(s:string);

After putting all the things in the entry you need to call ppufile.writeentry(ibnr:byte) where ibnr is the entry number you're writing.

At the end of the file you need to call ppufile.writeheader to write the new header to the file. This automatically takes care of the new size of the ppufile. When that is done, you can call ppufile.closefile and dispose of the object.
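
Put together, the writing steps above might look as follows. This is a minimal sketch, assuming that ppu.pas from the compiler sources is on the unit path, that ibmodulename takes a single string, and that uf_has_browser is not set (so no ibendbrowser entry is written). The unit name 'OUTPUT' is made up, the header fields such as target CPU and flags are left untouched, and the result is not claimed to be a unit the compiler would accept.

```pascal
program writeminimalppu;

uses
  ppu;  { ppu.pas from the compiler sources; assumed to be on the unit path }

var
  ppufile : pppufile;
begin
  ppufile:=new(pppufile,init('output.ppu'));
  ppufile^.createfile;

  { General information part: the unit name, then the end marker. }
  ppufile^.putstring('OUTPUT');
  ppufile^.writeentry(ibmodulename);
  ppufile^.writeentry(ibendinterface);

  { Empty definition and symbol parts. }
  ppufile^.writeentry(ibenddefs);
  ppufile^.writeentry(ibendsyms);

  { No browser part, because uf_has_browser is not set in this sketch. }
  ppufile^.writeentry(ibendimplementation);
  ppufile^.writeentry(ibend);

  { Write the final header, then close and release the object. }
  ppufile^.writeheader;
  ppufile^.closefile;
  dispose(ppufile,done);
end.
```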

Extra functions/variables available for writing are:

ppufile.NewHeader;
ppufile.NewEntry;
  These give you a clean header or entry. Normally they are called automatically in ppufile.writeentry, so there should be no need to call them yourself.

ppufile.flush;
  Flushes the current buffers to disk.

ppufile.do_crc:boolean;
  Set this to false if you do not want the CRC to be updated. This is necessary when you write, for example, the browser data.




2002-04-25