What it is - MetaKit is an embeddable database which runs on Unix, Windows, Macintosh, and other platforms. It lets you build applications which store their data efficiently, in a portable way, and which will not need a complex runtime installation. In terms of the data model, MetaKit takes the middle ground between RDBMS, OODBMS, and flat-file databases - yet it is quite different from each of them.
What it isn't - MetaKit is not: 1) an SQL database, 2) multi-user, 3) scalable to gigabytes, 4) proprietary software, 5) a toy.
Technology - Everything is stored variable-sized yet with efficient positional row access. Changing an existing datafile structure is as simple as re-opening it with that new structure. All changes are transacted. You can mix and match software written in C++, Python, and Tcl. Things can't get much more flexible...
Python - The extension for Python is called "Mk4py". It provides a lower-level API to the MetaKit C++ core than an earlier version of this interface did, and uses SCXX by Gordon McMillan as its C++ glue interface.
Mk4py 2.01 - is the latest release. The homepage points to a download area with pre-compiled shared libraries for Unix, Windows, and Macintosh. The MetaKit source distribution includes this documentation, the Mk4py C++ source code, a "MkMemoIO.py" class which provides efficient and fail-safe I/O (therefore also pickling) using MetaKit memo fields, and a few more goodies.
Changes since 2.0 - Mk4py (which at one point was called MkWrap) is now part of MetaKit 2.0, and adds:
License and support - MetaKit 2.01 is distributed under the liberal X/MIT-style open source license. Commercial support is available through an Enterprise License. See the license page for details.
Credits - Are due to Gordon McMillan for not stopping at the original Mk4py and coming up with a more Pythonic interface, and to Christian Tismer for pushing Mk4py way beyond its design goals. Also to GvR and the Python community for taking scripting to such fascinating heights...
Updates - The latest version of this document is at
http://www.equi4.com/metakit/python.html
The terms adopted by MetaKit can be summarized as follows:
A few more comments about the semantics of MetaKit:
Create a database:
    import Mk4py
    mk = Mk4py
    db = mk.Storage("datafile.mk",1)
Create a view (this is the MetaKit term for "table"):
    vw = db.getas("people[first:S,last:S,shoesize:I]")
Add two rows (this is the MetaKit term for "record"):
    vw.append(first='John',last='Lennon',shoesize=44)
    vw.append(first='Flash',last='Gordon',shoesize=42)
Commit the changes to file:
    db.commit()
Show a list of all people:
    for r in vw: print r.first, r.last, r.shoesize
Show a list of all people, sorted by last name:
    for r in vw.sort(vw.last): print r.first, r.last, r.shoesize
Show a list of all people with first name 'John':
    for r in vw.select(first='John'): print r.first, r.last, r.shoesize
    import Mk4py
    mk = Mk4py
    del Mk4py # tidy up
SYNOPSIS
ADDITIONAL DETAILS
- db = mk.Storage()
- Create an in-memory database (can't use commit/rollback)
- db = mk.Storage(file)
- Use a specified file object to build the storage on
- db = mk.Storage(name, rwflag)
- Open the named file, creating it if absent and rwflag is non-zero. The file is opened read-only and shared if rwflag is zero, else r/w and exclusively.
- vw = mk.View()
- Create a standalone view; not in any storage object
- pr = mk.Property(type, name)
- Create a property (a column, when associated to a view)
- vw = mk.Wrap(sequence, proplist, byPos=0)
- Wraps a Python sequence as a read-only view
Storage - When given a single argument, the file object must be a real stdio file, not a class implementing the file r/w protocol. When the storage object is destroyed (such as with 'db = None'), the associated datafile will be closed. Be sure to keep a reference to it around as long as you use it.
Wrap - This call can be used to wrap any Python sequence. It assumes that each item is either a dictionary or an object with attribute names corresponding to the property names. Alternatively, if byPos is nonzero, each item can be a list or tuple, in which case the values are accessed by position instead. This mechanism can be used for joins and other view operations.
ADDITIONAL DETAILS
- vw = storage.getas(description)
- Locate, define, or re-define a view stored in a storage object
- vw = storage.view(viewname)
- The normal way to retrieve an existing view
- storage.rollback()
- Revert data and structure as was last committed to disk
- storage.commit()
- Permanently commit data and structure changes to disk
- ds = storage.description(viewname='')
- Return the description string of the storage or of one view (the format is described under getas)
- vw = storage.contents()
- Returns the View which holds the meta data for the Storage.
- storage.autocommit()
- Commit changes automatically when the storage object goes away
- storage.load(fileobj)
- Replace storage contents with data from file (or any other object supporting read)
- storage.save(fileobj)
- Serialize storage contents to file (or any other object supporting write)
contents - Advanced use only!
description - A description of the entire storage is returned if no viewname is specified, otherwise just that of the specified top-level view.
getas - Side-effects: the structure of the view is changed.
Notes: Normally used to create a new View, or alter the structure of an existing one.
A description string looks like:
    "people[name:S,addr:S,city:S,state:S,zip:S]"
That is:
    "<viewname>[<propertyname>:<propertytype>...]"
Where the property type is one of:
    I    adaptable integer (becomes Python int)
    F    C float (becomes Python float)
    D    C double (is a Python float)
    S    C null terminated string (becomes Python string)
    B    C array of bytes (becomes Python string)
    M    C string (long) (becomes Python string)
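To make the description format concrete, here is a small pure-Python sketch that splits a flat description string into its view name and its list of (property name, type code) pairs. The helper `parse_description` is hypothetical and is not part of Mk4py. It assumes every property carries an explicit type code from the table above, and it does not handle nested subviews.

```python
import re

# Type codes taken from the table above; anything else is rejected.
PROPERTY_TYPES = {"I", "F", "D", "S", "B", "M"}


def parse_description(description):
    """Split a flat '<viewname>[<name>:<type>,...]' string.

    Hypothetical helper for illustration only; Mk4py does its own parsing.
    Returns (viewname, [(propertyname, typecode), ...]).
    """
    match = re.fullmatch(r"\s*(\w+)\[(.*)\]\s*", description)
    if match is None:
        raise ValueError("not a view description: %r" % description)
    viewname, body = match.groups()
    if "[" in body:
        raise ValueError("nested subviews are not handled by this sketch")

    properties = []
    for field in body.split(","):
        field = field.strip()
        if not field:
            continue  # tolerate an empty property list such as "v[]"
        name, sep, code = field.partition(":")
        if not sep or code not in PROPERTY_TYPES:
            raise ValueError("bad property definition: %r" % field)
        properties.append((name, code))
    return viewname, properties


viewname, properties = parse_description(
    "people[name:S,addr:S,city:S,state:S,zip:S]")
```

A tool built on such a helper could, for instance, compare the structure requested in a getas call with the one reported by storage.description() before restructuring a datafile.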
    r = view[0]
    r.name = 'Julius Caesar'
    view[0].name # will yield 'Julius Caesar'
Slices return copies. You can create an empty view with the same structure as another view with:
    v2 = v[0:0]
Setting a slice changes the view:
    v[:] = [] # empties the view
View supports getattr, which returns a Property (e.g. view.shoesize can be used to refer to the shoesize column). Views can be obtained from Storage objects: view = db.view('inventory') or from other views (see select, sort, flatten, join, project...) or empty, columnless views can be created: vw = Mk4py.View()
SYNOPSIS
ADDITIONAL DETAILS
- view.insert(index, obj)
- Coerce object to a Row and insert at index in View
- ix = view.append(obj)
- Object is coerced to Row and added to end of View
- view.delete(index)
- Row at index removed from View
- lp = view.structure()
- Return a list of property objects
- cn = view.addproperty(prop)
- Define a new property, return its column position
addproperty - This adds properties which do not persist when committed. To make them persist, you should have used storage.getas() when defining the view.
append - Also supports keyword arguments (colname=value...).
insert - coercion to a Row is driven by the View's columns, and works for:
- dictionaries (column name -> key)
- instances (column name -> attribute name)
- lists (column number -> list index) - watch out!
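The three coercion cases can be modelled in plain Python as follows. This is only a sketch of the mapping rules listed above, not the Mk4py implementation: `coerce_row` and `Person` are hypothetical names, and silently skipping keys or attributes that are absent is an assumption.

```python
def coerce_row(obj, columns):
    """Model of row coercion: return a {column name: value} dict for obj.

    Hypothetical helper; `columns` stands in for the View's column names.
    """
    if isinstance(obj, dict):
        # dictionaries: column name -> key
        return {name: obj[name] for name in columns if name in obj}
    if isinstance(obj, (list, tuple)):
        # lists: column number -> list index (values are taken by position)
        return dict(zip(columns, obj))
    # instances: column name -> attribute name
    return {name: getattr(obj, name) for name in columns if hasattr(obj, name)}


class Person:
    """Hypothetical record class used only for this example."""

    def __init__(self, first, last, shoesize):
        self.first = first
        self.last = last
        self.shoesize = shoesize


columns = ["first", "last", "shoesize"]
from_dict = coerce_row({"first": "John", "last": "Lennon", "shoesize": 44}, columns)
from_instance = coerce_row(Person("Flash", "Gordon", 42), columns)
# Positional coercion trusts the order: swapping two items swaps two columns.
from_list = coerce_row(["Lennon", "John", 44], columns)
```

The last call shows why the list case deserves the "watch out!" above: nothing checks that the items are in column order, so 'Lennon' ends up in the first column.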
ADDITIONAL DETAILS
- vw = view.select(criteria...)
- Return a view with the rows whose fields match the given criteria
- vw = view.select(low, high)
- Return a view with rows in the specified range (inclusive)
- vw = view.sort()
- Sort view in "native" order, i.e. the definition order of its keys
- vw = view.sort(property...)
- Sort view in the specified order
- vw = view.sortrev((propall...), (proprev...))
- Sort view in specified order, with optionally some properties in reverse
- vw = view.project(property...)
- Returns a derived view with only the named columns
select - Example selections, returning the corresponding subsets:
    inventory.select(shoesize=44)
    inventory.select({'shoesize':40},{'shoesize':43})
    inventory.select({},{'shoesize':43})
The derived view is "connected" to the base view. Modifications of rows in the derived view are reflected in the base view.
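The selection examples can be read as the following plain-Python model, in which a view is a list of dictionaries. The function `select` below is a hypothetical stand-in written for illustration and does not call Mk4py. It assumes keyword criteria mean exact matches and that the low/high pair gives inclusive bounds, as stated in the synopsis. Because it returns the same row objects rather than copies, it also mimics the "connected" behaviour.

```python
def select(rows, low=None, high=None, **criteria):
    """Model of view.select for a list of dict rows (illustration only)."""
    if criteria:
        # select(shoesize=44): every named field must match exactly
        return [r for r in rows
                if all(r[k] == v for k, v in criteria.items())]
    low = low or {}
    high = high or {}
    # select(low, high): each named field must lie within inclusive bounds
    return [r for r in rows
            if all(r[k] >= v for k, v in low.items())
            and all(r[k] <= v for k, v in high.items())]


# Hypothetical sample data
inventory = [
    {"name": "sandal", "shoesize": 39},
    {"name": "loafer", "shoesize": 40},
    {"name": "sneaker", "shoesize": 43},
    {"name": "boot", "shoesize": 44},
]

exact = select(inventory, shoesize=44)
ranged = select(inventory, {"shoesize": 40}, {"shoesize": 43})
up_to = select(inventory, {}, {"shoesize": 43})

# Changing a row in the derived list changes the base list as well.
exact[0]["name"] = "hiking boot"
```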
sort - Example, returning the sorted permutation:
    inventory.sort(inventory.shoesize)
See the notes for select concerning changes to the sorted view.
ADDITIONAL DETAILS
- vw = view.flatten(subprop, outer=0)
- Produces one 'flat' view from a nested view
- vw = view.join(view, property...,outer=0)
- Both views must have a property (column) of that name and type
- ix = view.find(criteria..., start=0)
- Returns the index of the found row, or -1
- ix = view.search(criteria...)
- Binary search (native view order), returns match or insertion point
- vw = view.unique()
- Returns a new view without duplicate rows (a set)
- vw = view.union(view2)
- Returns a new view which is the set union of view and view2
- vw = view.intersect(view2)
- Returns a new view which is the set intersection of view and view2
- vw = view.different(view2)
- Returns a new view which is the set XOR of view and view2
- vw = view.minus(view2)
- Returns a new view which is (in set terms) view - view.intersect(view2)
- vw = view.rename('oldname', 'newname')
- Returns a derived view with one property renamed
- vw = view.product(view)
- Returns the cartesian product of both views
- vw = view.groupby(property..., 'subname')
- Groups on specified properties, with subviews to hold groups
- vw = view.counts(property..., 'name')
- Groups on specified properties, replacing rest with a count field
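As a rough guide to the four set operations in the list above, here is a plain-Python model in which each view is a list of row tuples. The function names mirror the method names but are hypothetical stand-ins, not Mk4py calls. The sketch assumes both inputs are already free of duplicates, and the row order it produces is not claimed to match what MetaKit returns.

```python
def union(a, b):
    """Rows that appear in a or in b."""
    return a + [row for row in b if row not in a]


def intersect(a, b):
    """Rows that appear in both a and b."""
    return [row for row in a if row in b]


def minus(a, b):
    """Rows of a that do not appear in b, i.e. a - intersect(a, b)."""
    return [row for row in a if row not in b]


def different(a, b):
    """Set XOR: rows that appear in exactly one of a and b."""
    return minus(a, b) + minus(b, a)


# Hypothetical sample rows, one column each
view1 = [(1,), (2,), (3,)]
view2 = [(2,), (3,), (4,)]

all_rows = union(view1, view2)
shared = intersect(view1, view2)
only_one = different(view1, view2)
only_first = minus(view1, view2)
```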
find - view[view.find(firstname='Joe')] is the same as view.select(firstname='Joe')[0], but much faster. Subsequent finds use the "start" keyword: view.find(firstname='Joe', start=3)
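The "start" keyword makes it possible to walk through every match in turn. The sketch below models that loop in plain Python on a list of dictionaries. Here `find` and `find_all` are hypothetical helpers, not Mk4py functions, and the sketch assumes that the search begins at the row with index `start` itself.

```python
def find(rows, start=0, **criteria):
    """Model of view.find: index of the first match at or after start, or -1."""
    for index in range(start, len(rows)):
        if all(rows[index][k] == v for k, v in criteria.items()):
            return index
    return -1


def find_all(rows, **criteria):
    """Collect every matching index by restarting just past the last hit."""
    indexes = []
    index = find(rows, **criteria)
    while index != -1:
        indexes.append(index)
        index = find(rows, start=index + 1, **criteria)
    return indexes


# Hypothetical sample data
people = [
    {"firstname": "Ann"},
    {"firstname": "Joe"},
    {"firstname": "Bob"},
    {"firstname": "Joe"},
]

first_joe = find(people, firstname="Joe")
every_joe = find_all(people, firstname="Joe")
nobody = find(people, firstname="Zed")
```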