3 Implementation notes about Fuse for FreeBSD, with special emphasis on comparing it to Linux Fuse

Here I summarize the differences between the original Fuse implementation for Linux and Fuse for FreeBSD, with references to the differences between the OSes themselves where necessary.

First let’s see what’s implemented by the FreeBSD Fuse module. Perhaps the most platform-independent way of expressing the functionality is to list the utilized Fuse operations and say: the OS functionality provided is what these operations can be used for. So, we (Fuse for FreeBSD) utilize these:

  • FUSE_LOOKUP
  • FUSE_FORGET
  • FUSE_GETATTR
  • FUSE_SETATTR
  • FUSE_READLINK
  • FUSE_SYMLINK
  • FUSE_MKNOD
  • FUSE_MKDIR
  • FUSE_UNLINK
  • FUSE_RMDIR
  • FUSE_RENAME
  • FUSE_LINK
  • FUSE_OPEN
  • FUSE_READ
  • FUSE_WRITE
  • FUSE_STATFS
  • FUSE_RELEASE
  • FUSE_FSYNC
  • FUSE_INIT
  • FUSE_OPENDIR
  • FUSE_READDIR
  • FUSE_RELEASEDIR
  • FUSE_FSYNCDIR
  • FUSE_ACCESS
  • FUSE_CREATE

I.e., we don’t use the following ones:

  • FUSE_SETXATTR
  • FUSE_GETXATTR
  • FUSE_LISTXATTR
  • FUSE_REMOVEXATTR
  • FUSE_FLUSH

However, some nuances are left hidden by this rudimentary description. We do provide mmap, and give kernel level support to the allow_other, direct_io and kernel_cache options. We don’t support attribute and name caching (“d_cache” in Linux). Nor do we have specific large file support.

Support for all the above mentioned goodies can be added in due time, maybe except for (the functionality behind) FUSE_FLUSH (more on this later). That is, these differences are related to the level of maturity of the two implementations, and as such, these are the less interesting ones. Let’s see those differences of the modules which stem from the differences of the two OS and/or preferences of the implementors.

3.1 VFS API

This is the difference with the greatest impact. Linux VFS operations are inherently (struct) file based, while BSD VFS operations are inherently vnode based (BSD vnodes correspond to Linux inodes). That is, all file-related VFS operations take a (struct) file parameter in Linux.

In BSD, in some cases you can put your hands on (struct) files, in some cases not. The API encourages and helps you not to use files (if the nature of the fs permits, these can be completely avoided), and a consistently file based design is simply impossible (as I experienced when I tried to make a “naive” port). You are given a subtle system of structures and op vectors which lets you do many things with files if you insist on doing so, but which gently shields you from seeing those ugly files in your everyday life. (The vnode centered nature is common to all BSD flavours; how comprehensive that optional access to files is depends on the flavour, but as I said, it is never complete.)

This becomes manifest in different ways. The most visible of these is

3.2 Mounting

3.2.1 interface

There are many factors which result in a different user interface for the two OS.

In Linux, there is one Fuse device, and when a Fuse daemon is started, it calls a helper utility which opens the Fuse device for the daemon and performs the mount syscall “atomically”. Hence all mount parameters, including the mount point, have to be passed to the daemon.

In FreeBSD, there are several Fuse devices. When a Fuse daemon is started, it attaches itself to one – either completely on its own or using an already existing file descriptor, but it doesn’t call any helper program and doesn’t do anything mount related. Mounting is done by an external utility. This means that there must exist a global namespace in which the mounter can specify the daemon to mount. As one Fuse device can serve no more than one daemon, it’s easy: the device name is used to specify the subject of the mount (what a novelty).

In Linux, this couldn’t work out this way (given the design chosen by Linux Fuse): there is one Fuse device only, and a fail-safe identification of the subject of the mount is possible only within the scope of one process.

In FreeBSD, departing from the Linux way of atomic mounting is not a design choice, it’s a must. Unlike Linux, BSDs use different types of op vectors for devices and for regular files (the device access and vnode APIs are not unified). The device access API is not only different but almost completely (struct) file unaware. So we can’t distinguish between different openers of a given device.

The traditional solution would be working with a static set of fuse devices (nnpfs, the multiplatform userspace vfs framework of Arla, has chosen this solution in its BSD implementations). However, FreeBSD has a peculiarity which lets us implement Fuse as dynamically as it is in Linux: a wisely designed in-kernel device filesystem.

FreeBSD devfs provides a mechanism to register event handlers for certain device name patterns, by which new devices can be spawned on the fly and system call handling can be delegated. We rely on this mechanism for providing a dedicated Fuse device to each Fuse daemon.
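The one-device-per-daemon scheme can be pictured with a small userspace model. This is only a sketch under stated assumptions: the slot table, the `fuseN` naming and the names `dev_attach` and `dev_mount` are made up for illustration, and no real devfs facility is called.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_DEVS 4          /* arbitrary limit for this model */

/* One slot per (hypothetical) Fuse device node "fuseN". */
struct fuse_dev {
    int in_use;             /* a daemon is attached */
    int daemon_pid;         /* who attached */
    int mounted;            /* a mount already refers to this device */
};

static struct fuse_dev devs[MAX_DEVS];

/* Daemon side: attach to the first free device, return its unit number. */
int dev_attach(int daemon_pid)
{
    for (int i = 0; i < MAX_DEVS; i++) {
        if (!devs[i].in_use) {
            devs[i].in_use = 1;
            devs[i].daemon_pid = daemon_pid;
            devs[i].mounted = 0;
            return i;
        }
    }
    return -1;              /* no free device */
}

/* Mounter side: the device name alone identifies the daemon to mount. */
int dev_mount(const char *name)
{
    int unit;

    assert(name != NULL);
    if (sscanf(name, "fuse%d", &unit) != 1 || unit < 0 || unit >= MAX_DEVS)
        return -1;          /* not a Fuse device name */
    if (!devs[unit].in_use || devs[unit].mounted)
        return -1;          /* nobody to mount, or already mounted */
    devs[unit].mounted = 1;
    return devs[unit].daemon_pid;
}
```

The point of the model is that the mounter needs no other handle on the daemon than the device name, which is what makes the external mount utility workable.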

At this point, one could spot one definite advantage of the Linux way of doing the mount: it’s comfortable. That the state of Fuse daemons is explicitly reflected in the devfs namespace might seem to be an elegant concept, yet performing the device squatting and the mount syscall via separate commands will get tedious pretty fast.

To get over this, the mount utility serves as an umbrella frontend to all these mechanisms: running mount_fusefs auto mp daemon args will open a Fuse device, start the daemon with the given args and make it chew on the device, and finally mount the daemon on mp.

A more recent enhancement of Fuse on FreeBSD is adding support also for the Linux style interface. That is, you can mount Fuse daemons with the daemon args mp syntax as well, in which case the daemon will invoke mount_fusefs by herself. (Note: this of course won’t work if you want to do mounting on a different privilege level than that of the daemon).

It’s also a comfortable thing in Linux that when you run mount(8) to see the list of mounted filesystems, what you will find in the “mounted device” (first) column for Fuse filesystems is a compact custom description of the filesystem. Under FreeBSD you will see simply the device name as is, which might feel crude compared to the Linux listing. Yet I don’t plan to implement an “fs-summary-as-device” hack under FreeBSD: that would mean loss of information. Even if showing the device name is not so friendly, that is the exact unique identifier of the mount. What I might happen to do is to write (or to wait for one being written) a nifty shell script which gathers Fuse mount related information and presents it in a human-loveable way – you can find out everything you want via mount(8), ps(1) and fstat(1).

3.2.2 security

Fuse is a dedicatedly Promethean filesystem: it aims to bring the power of interaction via a custom filesystem interface to ordinary users. Practically, this boils down to doing customized non-privileged mounts. In Linux, ordinary users are usually allowed to do mounts via the user(s) option of fstab. This is a fairly static mechanism, so to be able to do the customized non-privileged mounts as required, Linux Fuse rolls its own: the above mentioned device opener/mounter utility is written in a way so that it can safely bear the suid bit, and then the appropriate permission handling logic is stuffed into this utility, too.

In FreeBSD no heroic action is needed. No setuid mounting is needed – unlike Linux, there is no user(s) option in fstab, and mount(8) itself is not setuid.

Non-privileged mounts are handled via a system-wide policy: if the vfs.usermount sysctl is 1, users can mount over mount points owned by them; if it’s 0, only root is allowed to mount. Unlike with the user(s) fstab options, Fuse mounts fit into this frame nicely, so there is no need to implement special methods for non-privileged mounts.

Yet there might be administrators who find this control too rudimentary… What do such people do? Most likely, they set vfs.usermount to 0, and use their homebrew privilege delegation policies; in all likelihood, they implement them via sudo(8). However, mount_fusefs can run an arbitrary other command, so it can’t be called from sudoers without care. Nevertheless, the command line interface is designed in such a way that it is easy to ban daemon spawning (e.g., mount_fusefs -S will not be willing to start a daemon).

Moreover, besides the above mentioned vfs.usermount sysctl, FreeBSD has another access control mechanism for mounting. This comes into play if the filesystem is backed by a device. In that case, only those who have read/write access to the device to be mounted (or read access for a read only mount) can mount the filesystem.

This latter mechanism can be used with Fuse, too: despite its somewhat synthetic character, Fuse is a device backed filesystem. There is, though, one subtle difference between Fuse and traditional device (disk) backed filesystems in this respect: with traditional filesystems, permissions of the device are used also for providing access control for the device file as such, which is a valid entity on its own and can be used for performing raw I/O on the appropriate hardware.

On the contrary, Fuse devices have no use without being mounted (the kernel is not willing to interact with a reader/writer of the device file until the VFS layer pushes messages onto it). Hence permission settings of Fuse devices are to be directly interpreted as permissions for mounting Fuse filesystems. So this is the tool by which fine-grained control over mounting Fuse filesystems can be set up.

By default, Fuse devices can be used by members of the operator group (that’s used for controlling access to, e.g., usb devices). One can set permissions of Fuse devices directly, by chmod, or generally, via devfs(8) rules.
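Put together, the two mechanisms amount to a decision like the one sketched below. This is a simplified model of the policy as described in the text, not the kernel's actual code path; the struct and the name `may_mount` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Inputs of the decision; all fields belong to this model only. */
struct mount_req {
    unsigned uid;          /* who asks for the mount */
    unsigned mp_owner;     /* owner of the mount point */
    bool usermount;        /* value of the vfs.usermount sysctl */
    bool dev_readable;     /* caller may read the Fuse device */
    bool dev_writable;     /* caller may write the Fuse device */
    bool read_only;        /* a read only mount is requested */
};

bool may_mount(const struct mount_req *r)
{
    assert(r != NULL);
    if (r->uid == 0)
        return true;                       /* root is always allowed */
    if (!r->usermount)
        return false;                      /* system-wide policy says no */
    if (r->uid != r->mp_owner)
        return false;                      /* must own the mount point */
    if (!r->dev_readable)
        return false;                      /* device backed: need access */
    return r->read_only || r->dev_writable;
}
```

With vfs.usermount set to 1, the device permissions become the only knob an administrator has to turn to decide who may mount Fuse filesystems.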

Let’s elaborate a bit more on this “naturally useable” BSD mount access control. This also makes Fuse more exposed to attacks under FreeBSD than it is under Linux: in Linux, non-privileged mountability of Fuse based filesystems doesn’t open up further privileged tasks. In FreeBSD, mounting and unmounting will be available more generally if the respective permissive move (“sysctl vfs.usermount=1”) has been made. (With the help of sudo, one can set up an access control scheme which is similar to that of Linux, yet we are to give full support for the system provided facilities.)

As we said, device permissions can fall into the role of mount permissions, thus there are limits to the freedom provided by vfs.usermount=1, but this happens only with device based filesystems. The null filesystem (providing functionality similar to “bind mounting” in Linux) is one example for a deviceless filesystem. A user can create a deadlock by null mounting a directory of a Fuse filesystem over another directory, if the Fuse filesystem requires this other directory during its operation. And users can freely unmount their filesystems, including forced unmounts, which can easily lead to panics. (Note that while Linux tends to stay on the safe side and refuse forced umounts too, if the filesystem is busy, FreeBSD tends to go forward and perform the forced unmount and occasionally panic.) So in FreeBSD, we will have to cope with these, too, if we want to claim that mounting of untrusted daemons is safe.

Careful code review slowly leads us toward a state in which this claim can be maintained. To add, Linux Fuse doesn’t seem to rely on its more protected situation: Linux Fuse was immune to those crash scenarios which non-privileged users used to be able to trigger in FreeBSD (in Linux, these were attempted as root, of course).

3.2.3 dealing with the “allow other” misery

This is related to security, too, but deserves a dedicated subchapter.

The problem is that the Fuse daemon sees I/O activity of the users of the filesystem driven by her. As usually no privileges are needed for using Fuse, we can’t guarantee that the daemon belongs to a trusted user account. Moreover, mounts are transparent for normal filesystem related activities, so the user of a filesystem might be unaware of the fact that a given path belongs to a Fuse mount.

How is this handled in Linux? By default a user is denied usage of a Fuse mount unless the daemon is “more privileged” than the user (that is, the daemon’s credentials allow her to trace the user’s processes).

There is a mount option, allow_other, which removes this limitation. Of course, if anyone could use this option, that would pretty much defeat the very purpose of its existence. So by default, only root can use this. However, the final decision is made by the setuid dispatcher; and his decision is based upon settings in the respective config file /etc/fuse.conf.

The problem with this approach is that it’s pretty hard for the root to make sure that no user of the system would mind an “everyone can allow_other” policy.

And in FreeBSD, we couldn’t even follow this approach, as we don’t have a setuid dispatcher for deciding about the fate of allow_other attempts.

So, what do we do about it in FreeBSD?

The basic setup is the same as in Linux: there is an allow_other mount option, useable only by root. But we don’t make exceptions: allow_other can be used only by root, period.

Yet we have our own ways of being not too draconian. We have an explicit global unique userspace identifier of daemons in work.

This allows the introduction of shared daemons. When the first (primary) mount of a daemon has been completed, other users can do secondary mounts of the same daemon. A secondary mount works like a symlink – it forwards all requests to the primary mount. So it is a very lightweight mechanism.

And what’s most important, doing a secondary mount can be viewed as signing an agreement of traceability. The secondary mounter can be expected to have the necessary knowledge about the primary mount which she or he joins. Hence while she or he possesses the secondary mount, s/he will be allowed to use the filesystem – either via its primary or secondary mounts.
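The resulting access rule can be sketched as follows. Note the simplifications, which are mine: the "daemon may trace the user" test is reduced to a plain uid comparison, and the struct and the name `may_use` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fuse_mount_model {
    unsigned daemon_uid;        /* owner of the daemon (primary mounter) */
    bool allow_other;           /* set at mount time, root only */
    const unsigned *secondary;  /* uids holding a secondary mount */
    size_t nsecondary;
};

/* May `uid` use the filesystem served by this daemon? */
bool may_use(const struct fuse_mount_model *m, unsigned uid)
{
    assert(m != NULL);
    if (m->allow_other || uid == m->daemon_uid)
        return true;
    for (size_t i = 0; i < m->nsecondary; i++)
        if (m->secondary[i] == uid)
            return true;        /* agreed to be traceable */
    return false;
}
```

The secondary mount list is what lets a non-root user opt in for herself or himself, without root having to decide for everybody.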

3.2.4 anything else

Not specific to Fuse, but it’s a great revelation that one doesn’t need to fight with the dreaded /etc/mtab file under FreeBSD. Even under Linux, traditional file system authors don’t get involved with such activities, but due to its alternative mounting mechanisms, Linux Fuse does have to take special care about mtab maintenance.

3.3 [vi]node operations

The design of the Fuse “rpc” system shows the traits of the file centric nature of the Linux VFS. In Linux, upon each open system call, a FUSE_OPEN message is sent to the daemon, who then performs whatever opening means to her, and sends back a file handle identifier to the kernel, which then gets attached to the file structure open is done for. In the sequel, when this file structure is used for I/O, the kernel will use the file handle identifier as a reference by which the daemon can find the file’s userspace counterpart. So, not only inodes and the daemon’s file (node) structures are kept in sync, but entities representing open files, too. (Though this is somewhat relaxed by the fact that the daemon can use the same file handle to serve I/O request for multiple (struct) files.)

Keeping vnodes and daemon’s file (node) structures in sync happens in FreeBSD, too – it’s a central concept of Fuse, this could hardly be avoided. But concerning files… It’s not possible to maintain such a close correspondence between in-kernel file structures and the daemon’s file handles in FreeBSD. Luckily for us, this is not necessary either (even if the rpc system suggests having this correspondence).

This is so because fuse filehandles are not stateful, unlike file descriptors/streams/structures. That is, if we are in the userspace, and we open a file (node) three times, then it very much makes a difference which file descriptor/stream is used in a given read operation (assuming other parameters of the read call are fixed): each of them stores a file offset during its lifetime, and the read performed on one is understood to start from that given offset (which is then advanced according to the number of bytes read).

There is no offset kept with Fuse filehandles: the FUSE_READ operation requires an explicit offset parameter and the filehandle should serve data according to it, regardless of its exact identity. As process id and credentials are also given with a FUSE_READ, these parameters had better match, but apart from that, it should not matter which filehandle is used. (“Highly synthetic” filesystem daemons might opt to serve data in a filehandle specific manner, but that’s an extremity.)
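A toy daemon makes the statelessness tangible. The sketch below is an illustration only: `struct read_req` is a trimmed-down, hypothetical stand-in for the real read message, and `daemon_read` is a made-up function serving a fixed string.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down read request: the offset travels with it. */
struct read_req {
    uint64_t fh;        /* filehandle returned by an earlier open */
    uint64_t offset;    /* explicit file offset */
    uint32_t size;      /* number of bytes wanted */
};

static const char file_data[] = "hello, fuse";

/* Toy daemon: serves data by offset only; fh is not consulted. */
uint32_t daemon_read(const struct read_req *req, char *out)
{
    uint64_t len = sizeof(file_data) - 1;
    uint32_t n;

    assert(req != NULL && out != NULL);
    if (req->offset >= len)
        return 0;                           /* read past end of file */
    n = req->size;
    if (n > len - req->offset)
        n = (uint32_t)(len - req->offset);  /* clamp to end of file */
    memcpy(out, file_data + req->offset, n);
    return n;
}
```

Two requests which differ only in their `fh` field yield the same bytes, which is exactly why the kernel side is free to pick any suitable filehandle.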

So far, it’s been explained why it is possible to depart from the “strict correspondence between file structures and daemon’s filehandles” model. Now let’s see: why is this necessary?

There is the so-called “strategy” vnode method, which is used to transfer data between the “storage” (the daemon in our case) and the vmio buffers; this is the engine of buffered I/O in BSD. It takes only two parameters: the vnode we operate on, and the buffer object we read into from or write to the storage (to read or to write: this info is kept with the buffer). With Fuse, what are we to say to the daemon, when the strategy is invoked? We need a “key”, a suitable filehandle identifier to perform the I/O request – where to get one?

The situation is easy when reading or writing regular files: these operations can easily be arranged in a way that they will be file aware. When they arrive at the point where the strategy needs to be called, they simply don’t call the “official” strategy… but an internal version, which happens to take a filehandle parameter.

On the contrary, the readdir operation is not (struct) file aware (file awareness can be forced by a dirty hack, but we better refrain from going that way). Even “worse”, when doing an internal mmap, which happens eg. when executing a file, there is no file structure involved at all, and strategy is invoked by the vmio system, with no chance to smuggle in a file parameter.

In these cases, the FUSE_OPEN message is sent by the strategy itself, and thus it gets its vehicle for the I/O.

But what to do with the filehandle when the strategy has done its job?

Releasing it immediately (that is, strategy releases it before return) is pretty inefficient: the file would have to be re-opened at each turn of a lengthy read-in.

Just simply forgetting about it and polluting the daemon with worn-out filehandles is not a good idea either. Some kind of resource management should be used for these “unbound” filehandles.

As the first step of that, a list of filehandles is maintained by each vnode, and when strategy needs one, it first looks for a suitable one in the list of existing ones, and asks for a new one only if it finds none. This still doesn’t seem to considerably lower the frequency of requests for new filehandles: the criterion for “suitable” is to have the same credentials, mode and pid recorded with the filehandle, thus new processes need to get new filehandles. And filehandles are still kept around ad infinitum. Some kind of garbage collection is needed. The question then: what event should trigger gc and on what thread should gc run?

There is a neat built-in gc mechanism: it’s invoked when vnodes become unused. The intended effect of this method is disassociating the vnode from its file (node), and putting it back to the pool of free vnodes. We don’t do that, as then we would lose the number of lookups (which is needed for Fuse to operate correctly). Yet it’s a pretty fine time to gc unbound filehandles.

But if a file is kept open for a longer while, that will block this mechanism: the respective vnode remains in use during this time. Then the filehandles created by subsequent internal (fileless) opens will persist. (Imagine that one opens /mnt/fuse/bin/ls as a regular file, and keeps it open. In this case each execution of this file will create a new unbound filehandle which won’t go away upon the termination of the process.)

So one more gc entry point is needed. Using the open handler for this purpose is good enough. This will stop the proliferation of unbound filehandles in the above scenario, yet won’t occur with an unbearable frequency.
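The lookup-or-open step and the garbage collection can be modelled in a few lines. This is a sketch of the idea, not the module's code: `struct fufh`, `fh_get` and `fh_gc` are hypothetical names, credentials are reduced to a uid, and a counter stands in for the FUSE_OPEN round trip.

```c
#include <assert.h>
#include <stdlib.h>

struct fufh {                   /* hypothetical filehandle record */
    unsigned uid;
    int mode;
    int pid;
    int bound;                  /* attached to a (struct) file */
    unsigned long fh_id;        /* id handed out by the daemon */
    struct fufh *next;
};

struct vnode_model {
    struct fufh *handles;
    unsigned long opens_sent;   /* counts simulated FUSE_OPEN messages */
};

/* Return a matching filehandle, or "open" a new unbound one. */
struct fufh *fh_get(struct vnode_model *vp, unsigned uid, int mode, int pid)
{
    struct fufh *f;

    assert(vp != NULL);
    for (f = vp->handles; f != NULL; f = f->next)
        if (f->uid == uid && f->mode == mode && f->pid == pid)
            return f;
    f = calloc(1, sizeof(*f));
    if (f == NULL)
        return NULL;
    f->uid = uid; f->mode = mode; f->pid = pid;
    f->fh_id = ++vp->opens_sent;    /* stands in for the daemon's reply */
    f->next = vp->handles;
    vp->handles = f;
    return f;
}

/* Garbage collect unbound filehandles; returns how many were released. */
int fh_gc(struct vnode_model *vp)
{
    struct fufh **pp = &vp->handles;
    int n = 0;

    while (*pp != NULL) {
        struct fufh *f = *pp;
        if (f->bound) {
            pp = &f->next;
        } else {
            *pp = f->next;          /* a FUSE_RELEASE would be sent here */
            free(f);
            n++;
        }
    }
    return n;
}
```

In this model `fh_gc` would be called from both entry points named above: when the vnode becomes unused, and from the open handler.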

3.4 Syncing

As far as I know, both the Linux and FreeBSD modules pass written data over to the daemon immediately, so all writes are synchronous in the traditional (vmio buffers vs. storage) sense.

Yet it doesn’t mean that syncing type operations wouldn’t make sense with Fuse: there might be a second layer of cached data, maintained by the daemon in the userspace (who might have its own background storage, too). So these operations do make sense, but they should be passed on to the daemon. The Fuse rpc kit does have those operations which are devoted to these purposes: FUSE_FSYNC and FUSE_FLUSH, and they correspond to the respective Linux syscall handlers. According to Linux semantics, their functionality differs in the way of relating to file data and meta data (I never remember exactly how).

In FreeBSD, there is just one syscall of this type, fsync. It is both for synchronizing file data and meta data. For both types of data, two modes of operation are available: the one where the caller waits for the operation to complete, and the one where the operation is backgrounded. The actual combination of these modes is determined by the mount flags of the filesystem.

So, to summarize, both OS’ syncing methods can relate to file data and meta data in various subtle ways, but in rather different ways.

Making up our mind in respect of flush is easy: we don’t have such a syscall under FreeBSD, hence we just never use FUSE_FLUSH. An alternative is merging the use of the library’s flush and fsync functionality in FreeBSD Fuse’s fsync handling. This could be considered upon seeing a Fuse filesystem which implements the flush and fsync userspace callbacks in an essentially different (yet BSD [POSIX] compatible) way, but this moment of enlightenment is yet to come.

When trying to implement fsync for Fuse, once again we bump into the basic difference: Linux fsync (flush) is file based, FreeBSD fsync is vnode based. Here I can imagine that file basedness has a significance to the userspace: eg., it is possible that sshfs runs different sftp connection threads for different filehandles, and syncing the data stuffed into one connection doesn’t imply syncing the data of the other (I don’t know whether it goes this way or not actually, but it’s enough to see that this is a realistic scenario).

So what we do during the fsync vnode op is that we walk over the list of filehandles and send a FUSE_FSYNC for each. Unlike Linux Fuse (and regardless of the mount flags) we don’t wait for the answer: sending the FUSE_FSYNC messages and waiting for the answer one after the other would be too much pain (and we can’t [yet] send/wait for many messages at once, in a batch).
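In outline, the fsync vnode op looks like the loop below. It is a model under assumptions: the fixed-size array stands in for the upgoing message queue, and `vnode_fsync` and `send_fsync_nowait` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_MAX 8

struct fh_node {
    unsigned long fh_id;
    struct fh_node *next;
};

/* Stand-in for the upgoing message queue. */
struct up_queue {
    unsigned long fsync_fh[QUEUE_MAX];
    size_t len;
};

/* Queue one (simulated) FUSE_FSYNC; nobody sleeps on the answer. */
static int send_fsync_nowait(struct up_queue *q, unsigned long fh_id)
{
    if (q->len == QUEUE_MAX)
        return -1;
    q->fsync_fh[q->len++] = fh_id;
    return 0;
}

/* The fsync vnode op: one message per filehandle of the vnode. */
int vnode_fsync(struct up_queue *q, const struct fh_node *handles)
{
    int sent = 0;

    assert(q != NULL);
    for (const struct fh_node *f = handles; f != NULL; f = f->next) {
        if (send_fsync_nowait(q, f->fh_id) != 0)
            break;
        sent++;
    }
    return sent;
}
```

The price of not waiting is that the caller of fsync learns nothing about whether the daemon actually managed to sync.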

3.5 Messaging

Here I give a brief comparison of the ways of implementing messaging between kernel and userspace in Linux and in FreeBSD.

Before anything else: I don’t claim superiority of either solution over the other. I implemented my solution from scratch, without understanding the respective parts of the Linux code, and without having a clear vision of how this will be used by the VFS. (This latter implies that I tried to make the design as general as possible. On one hand, this is good; on the other hand, it means it doesn’t contain any Fuse-specific tuning.)

Now, post festum, I took the effort of peeking at Linux Fuse’s messaging code, and I feel able to make this comparison. Nothing is carved in stone: I might make up my mind and bend my code closer to that of Linux Fuse. Or I might make it even more different.

Terminology: up will mean from kernel to userspace, down will mean… you can guess. (“In” and “out” are too relativistic to my taste.) The basic vehicle of messaging is called a request in Linux, a ticket in FreeBSD.

The basic mode of operation is similar.

  • There is a pool of requests/tickets.
  • Syscall handler wants to get data from daemon. Takes a request/ticket from the pool, fills in its fields, and inserts into upgoing queue. If buffered I/O is being done, the backing pages/buffers are attached to the request/ticket, too. Handler alerts device’s read method and falls asleep, waiting for answer.
  • The device’s read method pushes up the message to the daemon.
  • The daemon does whatever she should do with it, and sends back an answer. The device’s write method grabs the answer and finds out its requester and wakes that up. If buffered read is being done, appropriate parts of the answer are handled differently, and copied into attached page/buffer.
  • Syscall handler woken up, processes answer, drops requests/ticket, returns.
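The round trip described by the list above can be replayed in a single-threaded model. Everything here is a sketch with hypothetical names (`ticket_send`, `dev_read`, `dev_write`, `ticket_drop`); sleeping and waking are reduced to flags, and messages are plain strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define NTICKETS 4

struct ticket {
    unsigned long unique;   /* identifies the exchange */
    int busy;               /* taken from the pool */
    int queued;             /* waiting for the device read method */
    int answered;           /* answer has arrived */
    char msg[32];           /* upgoing message */
    char ans[32];           /* answer from the daemon */
};

static struct ticket pool[NTICKETS];

/* Syscall handler: take a ticket, fill it in, put it on the up queue. */
struct ticket *ticket_send(const char *msg)
{
    assert(msg != NULL);
    for (size_t i = 0; i < NTICKETS; i++) {
        struct ticket *t = &pool[i];
        if (!t->busy) {
            memset(t, 0, sizeof(*t));
            t->busy = t->queued = 1;
            t->unique = i + 1;
            snprintf(t->msg, sizeof(t->msg), "%s", msg);
            return t;       /* the real handler would now sleep */
        }
    }
    return NULL;
}

/* Device read method: hand the next queued message up to the daemon. */
struct ticket *dev_read(void)
{
    for (size_t i = 0; i < NTICKETS; i++)
        if (pool[i].busy && pool[i].queued) {
            pool[i].queued = 0;
            return &pool[i];
        }
    return NULL;
}

/* Device write method: match the answer to its ticket, wake the handler. */
int dev_write(unsigned long unique, const char *ans)
{
    for (size_t i = 0; i < NTICKETS; i++) {
        struct ticket *t = &pool[i];
        if (t->busy && !t->queued && t->unique == unique) {
            snprintf(t->ans, sizeof(t->ans), "%s", ans);
            t->answered = 1;
            return 0;
        }
    }
    return -1;              /* no requester is waiting for this answer */
}

/* Syscall handler, after wakeup: drop the ticket back into the pool. */
void ticket_drop(struct ticket *t)
{
    t->busy = 0;
}
```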

Differences:

  • In Linux, one Fuse mount works with a fixed number of preallocated requests (with some exceptions, when new ones are created), in FreeBSD, tickets are created on demand.
  • Unlike Linux, messaging facilities are unconditionally available in FreeBSD. That is, there are no forced delays depending on the intensity of filesystem usage. (This is not because of a forced-delays-are-evil policy; it’s rather due to a better-to-do-nothing-than-to-do-something-without-understanding-it standpoint.)
  • In Linux, the buffers hosting the fields of a request are allocated on the stack (that is, these fields are pointers to structures held in variables of syscall handlers), in FreeBSD, they are allocated dynamically (they are not freed when the ticket gets dropped, they are kept, reused, and reallocated on demand).
  • In Linux, the unique field of the request is filled with a really unique value upon being taken out of the pool (i.e., number of take-out). In FreeBSD, unique values are owned by the ticket itself (it’s not changed during the ticket’s lifetime), so unique values give information about the number of messaging sessions going in parallel (there is a secondary field for each ticket which stores the number of take-outs that ticket went through, but that’s rarely used).
  • Messaging API: in Linux, the fields of requests are Fuse specific (e.g., there are fields named inode and inode2, as file operations take one or two inodes). This means that syscall handlers usually can fill these fields in a straightforward way (req->inode = inode;).
    In FreeBSD, there are just raw message and answer buffers attached to a given ticket. Syscall handlers use variables of pointers of the required structs, and frontend methods for tickets set them to an appropriate value (to the appropriate point in the ticket’s appropriate buffer). In some of the more complex cases this means a bit of manual pointer arithmetic; for those of the complex patterns which occur repeatedly (mknod/creat/link), further, specific frontend methods are used (to note, in Linux, too). In general, I didn’t feel that this approach yields too much tedious repetition when setting up a ticket.
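The raw-buffer style of the FreeBSD side can be sketched like this. The names are hypothetical (`struct msgbuf`, `msg_reserve`, `msg_fill_mknod`), and `struct mknod_in` is a made-up two-field stand-in for the fixed part of a message, not the real protocol struct.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* A ticket's upgoing side: just a growable raw byte buffer. */
struct msgbuf {
    char *base;
    size_t len;     /* bytes used */
    size_t cap;     /* bytes allocated; kept across reuse */
};

/* Hypothetical fixed part of a mknod-like message. */
struct mknod_in {
    uint32_t mode;
    uint32_t rdev;
};

/* Frontend: reserve `size` bytes and return where they start. */
void *msg_reserve(struct msgbuf *b, size_t size)
{
    assert(b != NULL);
    if (b->len + size > b->cap) {
        size_t ncap = (b->len + size) * 2;
        char *n = realloc(b->base, ncap);
        if (n == NULL)
            return NULL;
        b->base = n;
        b->cap = ncap;
    }
    void *p = b->base + b->len;
    b->len += size;
    return p;
}

/* Specific frontend for the "struct followed by a name" pattern. */
int msg_fill_mknod(struct msgbuf *b, uint32_t mode, uint32_t rdev,
    const char *name)
{
    size_t nlen = strlen(name) + 1;
    struct mknod_in in = { mode, rdev };
    char *p;

    b->len = 0;                         /* reuse the buffer from scratch */
    p = msg_reserve(b, sizeof(in) + nlen);
    if (p == NULL)
        return -1;
    memcpy(p, &in, sizeof(in));         /* fixed part first... */
    memcpy(p + sizeof(in), name, nlen); /* ...then the name, by hand */
    return 0;
}
```

The second `memcpy` is the "bit of manual pointer arithmetic" mentioned above; wrapping the recurring pattern in one function keeps it out of the individual syscall handlers.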

Interrupt handling: in Linux, when a syscall is interrupted, the corresponding request is “backgrounded”. It’s put into another queue, and when the daemon (unaware of the interrupt) sends its reply, it gets dropped silently.

And by-and-large, the same happens in FreeBSD – just as the special case of a more general mechanism.

Tickets have a callback field, which can hold an arbitrary function (of the given type), or can be NULL. When the device write method finds the ticket to which a given answer should be passed, it invokes the callback on the incoming data (so that’s what “passing” means), provided it’s not NULL. If the callback is not NULL, then the device write method expects the callback to do the necessary resource management; but if it’s NULL, the device write method takes up this role and does what it can do – drops the ticket, etc.

In most cases we use the so-called standard callback, which does what’s described above: it fetches the answer and wakes up the syscall handler. But we also use NULL, and custom cleanup functions for backgrounded requests.

NULL is used when we don’t want to wait for the answer (in case of doing a RELEASE, and in FreeBSD, for FSYNC too), and to handle interruption. If the syscall is interrupted (that is, if it returns from sleep with an error), then it locks the answer queue and replaces the callback with NULL, and thus the device write routine will aptly discard the answer. Well, there is no guarantee of winning the race: it’s possible that the device write routine has already taken out the ticket from the answer queue. In that case, it will be passed to the standard handler. That’s not a problem, we can make him notice the interruption, and then he will drop the answer rather than waking up anyone. [1] The only difference is I/O: the standard handler copies in data nevertheless, while this is skipped if the device write routine finds a NULL callback.
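The dispatch logic, including the lost race, fits into a short model. As before, this is a sketch with hypothetical names (`struct cb_ticket`, `standard_cb`, `dev_write_answer`, `handler_interrupted`); locking is only hinted at in a comment, and "dropping" is reduced to setting a flag.

```c
#include <assert.h>
#include <stddef.h>

struct cb_ticket;
typedef void (*answer_cb)(struct cb_ticket *, const char *ans);

struct cb_ticket {
    answer_cb callback;     /* NULL: nobody waits for the answer */
    int interrupted;        /* set by the handler if its sleep failed */
    int woken;              /* handler has been woken up */
    int dropped;            /* ticket has been released */
    const char *answer;     /* data passed to the handler */
};

/* The "standard" callback: pass the answer on and wake the handler. */
void standard_cb(struct cb_ticket *t, const char *ans)
{
    if (t->interrupted) {   /* lost the race: discard instead of waking */
        t->dropped = 1;
        return;
    }
    t->answer = ans;
    t->woken = 1;
}

/* Device write method: dispatch an incoming answer to its ticket. */
void dev_write_answer(struct cb_ticket *t, const char *ans)
{
    assert(t != NULL);
    if (t->callback != NULL)
        t->callback(t, ans);    /* callback owns resource management */
    else
        t->dropped = 1;         /* nobody cares: clean up ourselves */
}

/* Syscall handler side: called when the sleep was interrupted. */
void handler_interrupted(struct cb_ticket *t)
{
    t->interrupted = 1;
    t->callback = NULL;         /* done under the answer queue lock */
}
```

Either way the answer ends up discarded; which of the two routines does the discarding depends only on who got to the ticket first.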

3.6 Miscellaneous

Now you can ask: what do I think, which VFS design is the better? Vnode centric BSD or file centric Linux?

Frankly: I don’t know. While it sounds reasonable not to keep something always on the surface if it’s rarely used directly, I have never tried to get something useful out of the Linux VFS, so I don’t know how it feels to make a filesystem under Linux.

What I do have an opinion about is the FreeBSD VFS.

Concerning the, so to say, “OO design” it features, the hierarchy of objects and their method kit: I can say it’s a carefully crafted masterpiece. There are subsystems which remain completely hidden by default, but they are there at your disposal if you ever need to tweak them. This yields an interface which is both high-level and highly flexible.

On the other hand, when we come to implement the functionality required by a given method (syscall handler), we can see that having a broad range of supported filesystems gives Linux an edge. In FreeBSD, one has the feeling that the way syscall handlers are utilized by the OS has too deep ties with UFS, and that when NFS entered the scene, it was bastardized and hacked on until it worked, but the need for each user of the API to do low-level legwork and re-implement general functionality again and again has not been thoroughly and conceptually eliminated.

Some entries from my factbook:

  • In Linux, you have a nifty generic_file_read() function which you can just plug in as the read handler, and which does data transfer between user space and the buffer cache in a generic way. In BSD you have to implement read completely by yourself. (For some VFS functionality there are useable defaults in BSD, too: these include syncing, getting/putting pages from/to the storage, and advisory file locking.)
  • In BSD, you have to take care of many small details by yourself, e.g.: “shouldn’t we bail out here because we are mounted read only?”; check whether an attempt is made to move a directory into a subdirectory of itself when doing a rename; check whether vnodes are from the same filesystem when creating hard links, and so on. Sometimes it’s trivial to do these (they just shouldn’t be forgotten about), sometimes not so much…
  • The BSD codebase encourages “code reuse by means of copy and paste”, which is not a particularly good thing. Just run through UFS code, and grep all the other fs code for comments seen there… This is a sign of not properly abstracted generic functionality.
  • In BSD, some of the arguments of the readdir handler are there only for the sake of NFS: you have to process them appropriately if you want your filesystem to be NFS exportable. You can ignore them if that’s not a concern.
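To give a flavour of the small checks a BSD filesystem has to carry by itself, here is a userspace sketch of three of them. The `struct node` tree with parent pointers is a hypothetical stand-in for vnodes, and the function names are made up; a real implementation also has to deal with locking, which is where the "not so much" part comes from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *parent;    /* NULL for the filesystem root */
    int fsid;               /* which filesystem the node lives on */
    bool is_dir;
};

/* True if `dir` is `anc` itself or lies somewhere below it. */
bool is_same_or_below(const struct node *dir, const struct node *anc)
{
    for (; dir != NULL; dir = dir->parent)
        if (dir == anc)
            return true;
    return false;
}

/* Refuse moving a directory into itself or into its own subtree. */
bool rename_allowed(const struct node *src, const struct node *dst_dir,
    bool fs_read_only)
{
    assert(src != NULL && dst_dir != NULL);
    if (fs_read_only)
        return false;       /* "shouldn't we bail out here?" */
    if (src->is_dir && is_same_or_below(dst_dir, src))
        return false;
    return true;
}

/* Hard links must stay within one filesystem. */
bool link_allowed(const struct node *target, const struct node *dst_dir)
{
    return target->fsid == dst_dir->fsid;
}
```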

[1] Daemons are treated as female, for the sake of correctness. Let the standard handler be male.
