Tom Lord's Hackery

Rewriting Arch for Tla 2.0

Here are notes analyzing the possibility of simply rewriting tla as a way to accomplish the goals for 2.0.

Past Experience

Writing tla followed the same pattern as writing larch: it is very natural to implement a core arch in steps because the core starts off with distinctly layered, independently useful components:

The Progression of Implementation Steps

- inventory
- mkpatch/dopatch
- in-tree patch-log access
- archive access
- simple revision builder
- revlib support
- commit and merge commands
- then fancier builders, caching, etc.

User-parameter (.arch-params) support proceeds in parallel with all of the above.

It's a little more complicated this time because several of those steps involve both redesign and implementing backwards compatibility. For example, inventory will presumably use something besides =tagging-method in new-format trees and will change the rules for finding tagline tags. At the same time, when run on an old-format tree, inventory should produce the same results as before.

Librification from the Ground Up

Obviously, it makes sense to plan librification from the ground up during a rewrite.

Arch, fortunately, can be implemented using only very simple data structures (file descriptors, strings, numbers, tables and associative tables of those, essentially). All functions in a libarch can be designed so that all parameters and return values are of those simple types.

Sticking to simple types can both simplify much of the coding of arch and result in a library that is used more easily in more situations.
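As a rough illustration of what "simple types only" could look like in C, here is a minimal sketch of a tagged value and an associative table of such values. Every name in it (arch_value, arch_table, arch_table_set and so on) is hypothetical and not part of any existing libarch or hackerlab API; nested tables are left out to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical value kinds; nested tables are omitted to keep this short. */
enum arch_kind { ARCH_NONE, ARCH_FD, ARCH_NUMBER, ARCH_STRING };

struct arch_value {
  enum arch_kind kind;
  long number;   /* payload for ARCH_FD and ARCH_NUMBER */
  char *string;  /* payload for ARCH_STRING: an owned copy, else NULL */
};

/* Associative table: parallel arrays of owned keys and values. */
struct arch_table {
  size_t len, cap;
  char **keys;
  struct arch_value *vals;
};

/* Returns a freshly allocated copy of S, or NULL if out of memory. */
char *arch_copy_string(const char *s)
{
  size_t n = strlen(s) + 1;
  char *copy = malloc(n);
  if (copy)
    memcpy(copy, s, n);
  return copy;
}

struct arch_value arch_number(long n)
{
  struct arch_value v = { ARCH_NUMBER, n, NULL };
  return v;
}

/* Yields kind ARCH_NONE if the copy cannot be allocated. */
struct arch_value arch_string(const char *s)
{
  struct arch_value v = { ARCH_NONE, 0, NULL };
  v.string = arch_copy_string(s);
  if (v.string)
    v.kind = ARCH_STRING;
  return v;
}

/* Look up KEY; returns NULL when it is absent. */
struct arch_value *arch_table_get(struct arch_table *t, const char *key)
{
  size_t i;
  for (i = 0; i < t->len; ++i)
    if (strcmp(t->keys[i], key) == 0)
      return &t->vals[i];
  return NULL;
}

/* Bind KEY to V. On success the table owns V. Returns 0, or -1 if out of memory. */
int arch_table_set(struct arch_table *t, const char *key, struct arch_value v)
{
  struct arch_value *slot = arch_table_get(t, key);
  if (slot) {
    free(slot->string);   /* release the value being replaced */
    *slot = v;
    return 0;
  }
  if (t->len == t->cap) {
    size_t cap = t->cap ? 2 * t->cap : 4;
    char **keys = realloc(t->keys, cap * sizeof *keys);
    struct arch_value *vals;
    if (!keys)
      return -1;
    t->keys = keys;
    vals = realloc(t->vals, cap * sizeof *vals);
    if (!vals)
      return -1;
    t->vals = vals;
    t->cap = cap;
  }
  t->keys[t->len] = arch_copy_string(key);
  if (!t->keys[t->len])
    return -1;
  t->vals[t->len++] = v;
  return 0;
}
```

A caller would start from a zeroed table (all fields 0 or NULL) and bind entries such as "category" or "patch-level" with arch_table_set. A linear search is used purely for brevity; a real table would likely hash or sort its keys.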

Errors, obviously, have to be propagated to callers rather than causing emergency exits; a hackerlab-friendly approach to this was designed in an earlier conversation with rbcollins.
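To make the idea of propagating errors concrete, here is a generic sketch in C. It is not the hackerlab-friendly design mentioned above, whose details are not given here; it only shows the general shape, in which every function returns a status code and records details in a caller-supplied error record instead of exiting. The names arch_error, arch_fail, parse_patch_level and next_patch_level are all hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error record filled in by the failing function. */
struct arch_error {
  int code;            /* 0 means no error */
  char message[128];
};

/* Record an error (if ERR is non-NULL) and return the failure status. */
int arch_fail(struct arch_error *err, int code, const char *msg)
{
  if (err) {
    err->code = code;
    snprintf(err->message, sizeof err->message, "%s", msg);
  }
  return -1;
}

/* Low-level step: parse a name such as "patch-12" into the number 12. */
int parse_patch_level(const char *name, long *out, struct arch_error *err)
{
  const char *prefix = "patch-";
  size_t n = strlen(prefix);
  char *end;
  long level;

  if (strncmp(name, prefix, n) != 0)
    return arch_fail(err, 1, "not a patch-level name");
  if (!isdigit((unsigned char) name[n]))
    return arch_fail(err, 2, "patch level is not a number");
  errno = 0;
  level = strtol(name + n, &end, 10);
  if (*end != '\0' || errno == ERANGE)
    return arch_fail(err, 2, "patch level is not a number");
  *out = level;
  return 0;
}

/* Higher-level step: it never exits, it just passes failures upward. */
int next_patch_level(const char *name, long *out, struct arch_error *err)
{
  long level;
  if (parse_patch_level(name, &level, err) != 0)
    return -1;          /* ERR already describes what went wrong */
  *out = level + 1;
  return 0;
}
```

With this convention, only the outermost caller (a command-line front end, say) decides whether a failure becomes a message, a retry, or an exit. Library code in between simply checks the status and returns.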

So, overall, one of the first steps will be to specify the APIs for basic data types and the API conventions for libarch functions.

One hard part of this step concerns Unicode: the string type libarch uses should have, at least, a Unicode-friendly API.

One interesting opportunity during this work is to make libarch "reflective": for example, to have libarch contain a data structure which describes the argument and return types of every libarch function; to have a way to invoke a libarch function given only a table of the arguments to it and the name (a string) of the function.
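One way such reflection might be laid out in C is sketched below, assuming a deliberately tiny value type. A static table describes each function's name, parameters and return type, and a single entry point invokes a function given only its name and a table of named arguments. All names here (fn_desc, arch_describe, arch_invoke, and the two sample functions "next-level" and "name-length") are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum kind { KIND_NUMBER, KIND_STRING };

struct value { enum kind kind; long number; const char *string; };
struct named_arg { const char *name; struct value value; };
struct param { const char *name; enum kind kind; };

typedef int (*arch_fn)(const struct named_arg *args, size_t n_args,
                       struct value *result);

/* The reflective part: a description of one callable function. */
struct fn_desc {
  const char *name;
  const struct param *params;
  size_t n_params;
  enum kind returns;
  arch_fn call;
};

const struct value *find_arg(const struct named_arg *args, size_t n_args,
                             const char *name)
{
  size_t i;
  for (i = 0; i < n_args; ++i)
    if (strcmp(args[i].name, name) == 0)
      return &args[i].value;
  return NULL;
}

/* Sample functions; arch_invoke has already checked their arguments. */
int fn_next_level(const struct named_arg *args, size_t n_args,
                  struct value *result)
{
  result->kind = KIND_NUMBER;
  result->number = find_arg(args, n_args, "level")->number + 1;
  result->string = NULL;
  return 0;
}

int fn_name_length(const struct named_arg *args, size_t n_args,
                   struct value *result)
{
  result->kind = KIND_NUMBER;
  result->number = (long) strlen(find_arg(args, n_args, "name")->string);
  result->string = NULL;
  return 0;
}

const struct param next_level_params[] = { { "level", KIND_NUMBER } };
const struct param name_length_params[] = { { "name", KIND_STRING } };

const struct fn_desc registry[] = {
  { "next-level", next_level_params, 1, KIND_NUMBER, fn_next_level },
  { "name-length", name_length_params, 1, KIND_NUMBER, fn_name_length },
};
const size_t registry_len = sizeof registry / sizeof registry[0];

/* Return the description of the named function, or NULL if unknown. */
const struct fn_desc *arch_describe(const char *name)
{
  size_t i;
  for (i = 0; i < registry_len; ++i)
    if (strcmp(registry[i].name, name) == 0)
      return &registry[i];
  return NULL;
}

/* Returns 0 on success, -1 for an unknown function, -2 for bad arguments. */
int arch_invoke(const char *name, const struct named_arg *args, size_t n_args,
                struct value *result)
{
  const struct fn_desc *fn = arch_describe(name);
  size_t i;
  if (!fn)
    return -1;
  for (i = 0; i < fn->n_params; ++i) {
    const struct value *v = find_arg(args, n_args, fn->params[i].name);
    if (!v || v->kind != fn->params[i].kind)
      return -2;
    if (v->kind == KIND_STRING && !v->string)
      return -2;
  }
  return fn->call(args, n_args, result);
}
```

Because argument checking is driven entirely by the descriptions, a language binding or a command-line front end could enumerate the registry and expose every function without per-function glue code.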

Relationship to XL

The data structures and API conventions chosen for libarch are, in essence, the built-in data structures and API conventions of the XL runtime system.

That's true, more or less, by definition.

My reasoning, while wearing my XL Language Designer Hat, is that XL is really aiming to be an awk-like language, but with C-like pragmatic prowess. "Awk-like" in the sense that it gives a simple and relatively familiar programming model, perhaps with some idiosyncratic ways to manage flow of control, on top of a very minimalist set of easy-to-implement data types. "C-like" in the sense that it should be a fast and functionally "complete" environment, practical in 90% or more of the cases where C is the current first choice.

Arch is itself a program specification which admits a clear, clean, awk-ish high level expression --- but which also requires, to be practical, an implementation with C-like performance characteristics.

Since I believe in the possibility of XL --- a language that bridges the gap between Awk-like and C-like --- a rewrite of arch is a good opportunity to put stakes in the ground and design a good chunk, at least, of the XL run-time system: a set of core data structures and calling conventions.

In other words: I'm committing (so far) to building GNU Arch 2.0 on top of the first version of the XL Run-time System.

I'm not committing to actually then writing any of 2.0 in XL, although that is certainly an idea I have in mind.

Next Steps

On the one hand, I plan to keep making progress on the "programmer's blog." That counts as prototyping higher-level UI features for 2.0, but it only indirectly counts as progress on the code-base for an actual 2.0 implementation. (Many of the data structures I'm working on for gtla and the blog are virtually certain to see re-use in 2.0 but there is a certain amount of gtla code which, really, is only a discardable prototype of some 2.0 features.)

On the other hand, I'd like to get started on the 2.0 code base more directly.

I think that inventory probably makes a good first milestone to aim for:

In order to write a brand-new, stand-alone, new-project-tree-format-defining implementation of inventory I will first have to spec out the details of "Librification" for 2.0.

Having specced out those interface and run-time system requirements, actually implementing inventory will make a good test: does the proposed approach to librification actually work out smoothly in practice?

Copyright

Copyright (C) 2004 Tom Lord

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

See the file COPYING for further information about the copyright and warranty status of this work.