Copyright © 2003 Iustin Pop, <iusty@k1024.org>
This is the user manual for the cfvers project; the homepage is at http://www.nongnu.org/bakonf/. New versions of this document are also available there.
Revision: $Id: manual.xml 71 2003-10-28 21:56:30Z iusty $
Making backups is an important aspect of system administration. Backup techniques are explained in any good document about system administration, so they won't be repeated here.
However, text configuration files are better suited to versioning systems than to full/incremental backups, which are targeted at binary files and miscellaneous data. Unfortunately, versioning systems are not very good at working directly on a live system: the main reasons are the creation of extra files, the inability to cope with special files, and the difficulty of keeping permissions intact.
The working model of the classic versioning systems is built around a central repository (very precious), or more than one, and a multitude of developers' workspaces, which hold semi-important data; by this I mean it's OK to delete or otherwise break a developer's workspace when no changes have been made to it, since all state can be restored from the central repository.
In contrast, a versioning system designed for system configuration has its priorities almost reversed: the critical issue is the filesystem, and the repository is secondary to that. This means that such software should obey the following rules:
keep the system's integrity: the software must not do anything to the filesystem it hasn't been asked to do
treat the meta-data of versioned items as being as important as the data
when in doubt about the success of an operation, abort rather than damage the workspace
cfvers has been designed with these objectives in mind[1].
How to create your first repository
decide on which back-end to use (either sqlite or postgresql for now), and configure it in /etc/cfvers.conf, like this:
    [repositories]
    #For sqlite one
    ;default=sqlite:/var/lib/cfvers/repo.sqlite
    #For postgresql
    ;default=postgres::cfvers
    #This is the default:
    ;default-area=default
run cfvadmin --init in order to create the initial repository.
run cfv store ITEMS... in order to register (and store the first version of) the items you want versioned.
after every change to the system's configuration, rerun the cfv store command in order to update the versioned items. New items you want stored must be given in a separate call.
schedule a cron job to watch for differences or do automatic commits.
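To illustrate the configuration step above, here is a minimal sketch of reading a [repositories] section with Python's standard configparser module. It is not taken from the cfvers sources: the helper name read_repo_settings is hypothetical, the sample text has the default line uncommented, and the fallback for default-area simply mirrors the default named in the sample configuration.

```python
import configparser

SAMPLE = """\
[repositories]
# For sqlite
default=sqlite:/var/lib/cfvers/repo.sqlite
# default-area is left out here, so the fallback applies
"""

def read_repo_settings(text):
    """Return (repository string, area name) from INI-style text.

    Hypothetical helper for illustration; not part of cfvers.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    repo = parser.get("repositories", "default", fallback=None)
    area = parser.get("repositories", "default-area", fallback="default")
    return repo, area

repo, area = read_repo_settings(SAMPLE)
print(repo, area)
```

Lines starting with # or ; are treated as comments by configparser, which is why the commented-out entries in the sample file have no effect until they are uncommented.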
I tried to keep cfvers as simple as possible. It's implemented in Python, and uses an SQL repository (encapsulated of course in a class and easily replaced if someone is so inclined).
The repository syntax is used whenever you need to specify a repository to cfvers: in configuration files and using the -d option to cfv and cfvadmin.
Generally speaking, the string specifying the repository is composed of two parts:
the backend driver, specifying the database used to store the repository
connection-specific information
These are joined together using a colon, like this: backend:conn_info.
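As a small sketch of the backend:conn_info syntax, the function below splits a repository string at the first colon only, so that connection information containing further colons survives intact. The function name split_repository is an assumption for this example, and how cfvers itself parses the string may differ.

```python
def split_repository(spec):
    """Split 'backend:conn_info' at the first colon only."""
    backend, sep, conn_info = spec.partition(":")
    if not sep or not backend:
        raise ValueError("expected backend:conn_info, got %r" % (spec,))
    return backend, conn_info

print(split_repository("sqlite:/var/lib/cfvers/repo.sqlite"))
# ('sqlite', '/var/lib/cfvers/repo.sqlite')
print(split_repository("postgres::cfvers"))
# ('postgres', ':cfvers')
```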
The following objects are defined:
Each repository consists of one or several 'areas', each area rooted at a specific point in the filesystem. Usually you'll have one area, rooted at "/".
Each object to be versioned is defined by an 'item'. Right now, only files are supported, but more types are planned.
Each new revision to an area has several parameters; together they form an area revision.
The data of each revision of an item is held in a revision entry. These revision entries are linked to an area revision.
The 'area' concept has been invented in order to ease the keeping of different sets of configuration files in one repository. In later versions, migration of config files from one area to another could be a possibility.
The properties of an area are:
name - text; the name of the area and primary key;
root - text; where in the filesystem the area is rooted; all operations will be relative to this (as if in a chroot)
ctime - timestamp; the creation date for this area;
description - text; free-flow description of this area;
Each versioned item is represented by this object. In the SQL repository, it is represented by a row in the 'items' table.
The basic properties of an item are:
id - integer; represents the primary key
area - text; represents the area this item belongs to
name - text; the name of the item; until we implement renames (and deletions), this is the same as the name in all of the item's revision entries.
ctime - timestamp; the creation time of the item; useful for knowing when it has entered the repository
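The area and item properties listed above can be pictured as two small record types. This is only a sketch under assumptions: the class layout is mine, not the cfvers classes, and the real_path method is a hypothetical illustration of what "relative to the root, as if in a chroot" could mean for an item name.

```python
import os.path
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Area:
    name: str                      # primary key
    root: str = "/"                # where the area is rooted
    description: str = ""
    ctime: datetime = field(default_factory=datetime.now)

    def real_path(self, item_name):
        """Map an item name to a path under the root (assumed behaviour)."""
        return os.path.join(self.root, item_name.lstrip("/"))

@dataclass
class Item:
    id: int                        # primary key
    area: str                      # name of the owning area
    name: str
    ctime: datetime = field(default_factory=datetime.now)

area = Area(name="default")
item = Item(id=1, area=area.name, name="/etc/fstab")
print(area.real_path(item.name))
```

With the usual single area rooted at "/", the item name and the real path coincide; with an area rooted elsewhere (say /srv/jail), the same item name would resolve under that directory.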
The parameters common to all item revisions that share a revision number are gathered in the area revision object. These include attributes such as the log message, the timestamp of the revision, committer information, etc.
Table 1. Area revision attributes
Name | Type | Description |
---|---|---|
area | text | the name (primary key) of the parent area |
revno | integer | the ID of the area revision |
logmsg | text | the log message for this revision |
ctime | timestamp | the creation date of this revision |
uid | integer | the uid of the creation process |
gid | integer | the gid of the creation process |
commiter | text | free-form description of commit type; can be used to differentiate between manual and automatic commits |
server | text | the hostname of the server on which the commit was made. |
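A minimal sketch of how the attributes in Table 1 could be gathered at commit time, using only the Python standard library. The function name new_area_revision and the default value "manual" for commiter are assumptions made for the example; the dictionary keys follow the attribute names in the table.

```python
import os
import socket
import time

def new_area_revision(area, revno, logmsg, commiter="manual"):
    """Build one area revision record (illustrative, not the cfvers code)."""
    return {
        "area": area,
        "revno": revno,
        "logmsg": logmsg,
        "ctime": time.time(),            # creation date of this revision
        "uid": os.getuid(),              # uid of the committing process
        "gid": os.getgid(),              # gid of the committing process
        "commiter": commiter,            # e.g. manual vs. automatic commit
        "server": socket.gethostname(),  # host on which the commit was made
    }

rev = new_area_revision("default", 1, "initial import")
print(sorted(rev))
```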
Each revision of each item is represented in the database using a revision entry. This object is stored in the 'revisions' table.
The metadata is stored in various fields (attributes) of the table (object). For regular files, the contents are stored in different ways, depending on what they are, in order not to violate the constraints of each backend. ASCII text files are stored as-is, while binary files are encoded (using either base64 or quoted-printable, whichever is shorter).
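The "whichever is shorter" rule can be sketched as follows. This is an illustration under assumptions, not the cfvers implementation: the function names and the encoding labels ("as-is", "base64", "quoted-printable") are invented here, and "ASCII text" is approximated as "every byte decodes as 7-bit ASCII".

```python
import base64
import binascii

def encode_contents(data):
    """Return (text, encoding label) for raw file bytes."""
    try:
        return data.decode("ascii"), "as-is"
    except UnicodeDecodeError:
        pass
    b64 = base64.b64encode(data)
    # istext=False also escapes CR/LF, so decoding is byte-exact.
    qp = binascii.b2a_qp(data, istext=False)
    if len(qp) < len(b64):
        return qp.decode("ascii"), "quoted-printable"
    return b64.decode("ascii"), "base64"

def decode_contents(text, encoding):
    """Inverse of encode_contents()."""
    raw = text.encode("ascii")
    if encoding == "base64":
        return base64.b64decode(raw)
    if encoding == "quoted-printable":
        return binascii.a2b_qp(raw)
    return raw

for sample in (b"root:x:0:0\n", b"caf\xe9 au lait\n", bytes(range(256))):
    text, label = encode_contents(sample)
    print(label, len(text))
```

Mostly-text data with a few stray high bytes tends to come out shorter in quoted-printable, while truly binary data comes out shorter in base64.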
Table 2. Revision entry attributes
Name | Type | Description |
---|---|---|
item | integer | The ID of the item to which this revision belongs. |
revno | integer | The revision number of this revision entry. |
filename | text | The filename this entry represents. |
filetype | integer | The file type of this entry, one of the S_IF* values |
filecontents | text | the encoded contents of the file, for file types that have such a thing |
sha1sum | text | the SHA1 checksum over the unencoded filecontents |
size | integer | the size of the file |
mode | integer | the st_mode entry in the stat result |
mtime | integer | The modification time of this inode |
atime | integer | The access time of this inode |
ctime | integer | The change time of this inode |
inode | integer | The inode number of this file |
device | integer | The device on which the inode resides |
nlink | integer | Number of links to this inode |
uid | integer | The UID of the owner of this file |
gid | integer | The GID of the group of this file |
rdev | integer | For device files, their major/minor numbers |
blocks | integer | The number of blocks occupied by this file, if the operating system/file system reports it |
blksize | integer | The size of those blocks, if the operating system/file system reports it |
encoding | text | Information about how the filecontents has been encoded |
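Most of the attributes in Table 2 map directly onto the result of a stat call. The sketch below collects them for one path with the Python standard library; the function name snapshot is hypothetical, the contents and encoding fields are left out, and this is not how cfvers itself is necessarily written.

```python
import hashlib
import os
import stat
import tempfile

def snapshot(path):
    """Collect stat-derived attributes for one path (illustrative only)."""
    st = os.lstat(path)  # lstat, so a symlink is described itself, not its target
    entry = {
        "filename": path,
        "filetype": stat.S_IFMT(st.st_mode),  # file type bits only
        "mode": st.st_mode,
        "size": st.st_size,
        "mtime": int(st.st_mtime),
        "atime": int(st.st_atime),
        "ctime": int(st.st_ctime),
        "inode": st.st_ino,
        "device": st.st_dev,
        "nlink": st.st_nlink,
        "uid": st.st_uid,
        "gid": st.st_gid,
        # Not every platform reports these three, hence getattr().
        "rdev": getattr(st, "st_rdev", None),
        "blocks": getattr(st, "st_blocks", None),
        "blksize": getattr(st, "st_blksize", None),
        "sha1sum": None,
    }
    if stat.S_ISREG(st.st_mode):
        # Checksum over the raw (unencoded) contents.
        with open(path, "rb") as fh:
            entry["sha1sum"] = hashlib.sha1(fh.read()).hexdigest()
    return entry

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.conf")
    with open(path, "wb") as fh:
        fh.write(b"key=value\n")
    entry = snapshot(path)

print(entry["size"], entry["sha1sum"])
```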
This section should be very big. It's small because I didn't have time to fill it, not because cfvers is complete :-)
These are limitations or design decisions inherent to the POSIX specification or the GNU/Linux implementation. While developing cfvers, I found:
You can't change the ctime of an inode. This is by design in the POSIX filesystem layer: the ctime records metadata modifications, and the mtime/atime pair records data write/read accesses. Since the ctime is itself part of the metadata, setting it would count as a metadata modification and immediately update it again, rendering the change useless :). A separate "read" timestamp for the metadata would be inappropriate, I think, because such reads are very frequent.
utimes(2) and chmod(2) act on the target of a symlink (when given an argument which is a symlink). I can't think why anyone would want this (you can always expand the symlink yourself using readlink, but right now you can't act on the symlink itself!).
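Both observations can be reproduced from Python on a POSIX system with a few standard library calls. This is just a demonstration in a temporary directory; the file names are made up for the example.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target.conf")
    link = os.path.join(tmp, "link.conf")
    with open(target, "w") as fh:
        fh.write("key=value\n")
    os.symlink(target, link)

    # chmod through the link changes the permission bits of the target.
    os.chmod(link, 0o600)
    target_mode = stat.S_IMODE(os.stat(target).st_mode)

    # utime can set atime and mtime to any value (here: the epoch),
    # but the kernel stamps ctime with the time of the call itself.
    os.utime(target, (0, 0))
    st = os.stat(target)
    mtime_after, ctime_after = st.st_mtime, st.st_ctime

print(oct(target_mode), mtime_after, ctime_after)
```

The mtime comes back as 0 as requested, while the ctime is the current time: restoring a file's timestamps always leaves a fresh ctime behind.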
SQLite: File size is limited to 1MiB. This should not be of great concern (the file will be bzip'ed first, and the commit is aborted only if the compressed and base64-encoded size still exceeds the limit), and cfvers is aiming at small configuration files, but still...
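The size check described above can be sketched like this, assuming the limit applies to the bzip2-compressed, base64-encoded text. The names pack_for_sqlite, unpack and LIMIT are hypothetical, and raising an exception stands in for "the commit is aborted".

```python
import base64
import bz2

LIMIT = 1024 * 1024  # 1 MiB, taken from the note above

def pack_for_sqlite(data, limit=LIMIT):
    """Compress, then base64-encode; refuse anything still over the limit."""
    packed = base64.b64encode(bz2.compress(data))
    if len(packed) > limit:
        raise ValueError(
            "encoded size %d exceeds limit %d, aborting" % (len(packed), limit)
        )
    return packed.decode("ascii")

def unpack(text):
    """Inverse of pack_for_sqlite()."""
    return bz2.decompress(base64.b64decode(text))

config = b"option=value\n" * 200000   # about 2.6 MB of repetitive text
stored = pack_for_sqlite(config)
print(len(config), len(stored))
```

Repetitive text such as configuration files compresses very well, so a file far larger than the limit on disk can still fit once packed.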
[1] However, nobody said it attained these goals - after all, it's software!