DEFFS stands for "Distributed, Encrypted, Fractured File System". Its primary goal is to secure a group's files while keeping them accessible from multiple machines on a local network and preserving the ability to keep files private from others.
A small laboratory has three employees: a Chemist, a Biologist, and a Physicist. They're working on an experiment together, so they'd like to share files. Traditional solutions to this issue include git or Google Drive, but none of the three employees are programmers, and their files are large enough that they would have to pay Google for extra storage, which they'd rather avoid.
DEFFS provides a non-invasive, easy-to-use solution for their issue. On each of their machines, they create two folders, "storepoint" and "mountpoint", at any location they desire.
$ mkdir storepoint mountpoint
They execute the DEFFS binary, providing the paths to the two folders they created and the number of machines that DEFFS will interact with (2, since the Chemist's machine will interact with the Biologist's and the Physicist's, and likewise for the other two).
$ DEFFS mountpoint storepoint -n 2
Now, whenever they want to share files, all they have to do is copy the files into "mountpoint", or just work in that directory by default.
Whenever files are written inside of the mountpoint, their contents are encrypted and split into shards which are distributed to the other machines via their shared local network.
For example, the Physicist would like to share his new star charts with his coworkers. He copies the chart files and pastes them into a new folder inside of "mountpoint". Each chart is split into 3 encrypted shards, 2 of which are sent to the Chemist's and Biologist's machines. When the Chemist opens her own "mountpoint" directory, she will see the new folder containing the star charts. When she opens a chart, DEFFS sends a TCP request to the other 2 machines, asking for the corresponding file shards. The shards are recombined, decrypted, and displayed to the Chemist. No manual uploading or downloading is required. In this way, DEFFS treats a network of machines as a single machine, distributing its contents over each node.
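The split-and-recombine step in the example above can be sketched as follows. This is only an illustration under assumptions: the helper names (`shard_bounds`, `shard_extract`, `shard_place`) are hypothetical, shards are assumed to be contiguous and near-equal in size, and encryption is left out entirely. DEFFS's actual shard layout may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compute where shard `index` (0-based) starts and how
 * long it is when `total` bytes are cut into `count` contiguous shards.
 * The first (total % count) shards each carry one extra byte. */
static void shard_bounds(size_t total, size_t count, size_t index,
                         size_t *offset, size_t *length)
{
    assert(count > 0 && index < count);
    size_t base = total / count;
    size_t extra = total % count;
    *length = base + (index < extra ? 1 : 0);
    *offset = index * base + (index < extra ? index : extra);
}

/* Copy shard `index` out of `data` into `out`, which must hold at least
 * total / count + 1 bytes. Returns the shard length. */
static size_t shard_extract(const unsigned char *data, size_t total,
                            size_t count, size_t index, unsigned char *out)
{
    size_t offset, length;
    shard_bounds(total, count, index, &offset, &length);
    memcpy(out, data + offset, length);
    return length;
}

/* Write a retrieved shard back into its slot of the reassembled buffer. */
static void shard_place(unsigned char *data, size_t total, size_t count,
                        size_t index, const unsigned char *shard)
{
    size_t offset, length;
    shard_bounds(total, count, index, &offset, &length);
    memcpy(data + offset, shard, length);
}
```

With 3 machines, a 10-byte file would be cut into shards of 4, 3, and 3 bytes. Placing all three back with `shard_place` restores the original bytes.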
DEFFS currently encrypts file contents with the plain AES cipher provided by OpenSSL, but this will soon be replaced by AES-GCM for improved security. The symmetric encryption keys are passed through Shamir's Secret Sharing Scheme, providing a number of key chunks equal to the number of shards to be created and machines to interact with. Shard files are stored in a hidden directory within the storepoint (the shardpoint) under a SHA-256-generated filename, and contain the Shamir-encoded encryption key chunks. "Header" files, the normally visible files that users interact with, contain only the SHA-256 hash that corresponds to the file's shards. To access a file, all of its shards must be retrieved, because the Shamir-encoded key can only be reconstructed once all of its chunks are present.
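To make the key-splitting step concrete, here is a minimal sketch of Shamir's scheme in which all n chunks are needed to rebuild the key, matching the behavior described above. Several things here are assumptions rather than DEFFS internals: the arithmetic is done byte-wise in GF(2^8) with the AES reduction polynomial, the function names (`sss_split`, `sss_combine`) are hypothetical, and `rand()` is only a stand-in for a cryptographically secure random source such as the one OpenSSL provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SSS_MAX_NODES 255 /* x-coordinates 1..255 are the nonzero field elements */

/* Multiply two elements of GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t p = 0;
    while (b) {
        if (b & 1) p ^= a;
        uint8_t carry = a & 0x80;
        a = (uint8_t)(a << 1);
        if (carry) a ^= 0x1b;
        b >>= 1;
    }
    return p;
}

/* Multiplicative inverse of a nonzero element: a^254, since a^255 == 1. */
static uint8_t gf_inv(uint8_t a)
{
    uint8_t r = 1;
    for (int i = 0; i < 254; i++) r = gf_mul(r, a);
    return r;
}

/* Split `key` into n chunks of key_len bytes each (chunk i starts at
 * chunks + i * key_len). Each key byte becomes the constant term of a random
 * polynomial of degree n - 1; chunk i holds its value at x = i + 1. */
static void sss_split(const uint8_t *key, size_t key_len, size_t n,
                      uint8_t *chunks)
{
    assert(n >= 1 && n <= SSS_MAX_NODES);
    uint8_t coeff[SSS_MAX_NODES];
    for (size_t j = 0; j < key_len; j++) {
        coeff[0] = key[j];
        for (size_t c = 1; c < n; c++)
            coeff[c] = (uint8_t)(rand() & 0xff); /* NOT secure: placeholder */
        for (size_t i = 0; i < n; i++) {
            uint8_t x = (uint8_t)(i + 1), y = 0;
            for (size_t c = n; c-- > 0;) /* Horner evaluation */
                y = gf_mul(y, x) ^ coeff[c];
            chunks[i * key_len + j] = y;
        }
    }
}

/* Rebuild the key from all n chunks by Lagrange interpolation at x = 0. */
static void sss_combine(const uint8_t *chunks, size_t key_len, size_t n,
                        uint8_t *key_out)
{
    assert(n >= 1 && n <= SSS_MAX_NODES);
    for (size_t j = 0; j < key_len; j++) key_out[j] = 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t xi = (uint8_t)(i + 1), weight = 1;
        for (size_t m = 0; m < n; m++) {
            if (m == i) continue;
            uint8_t xm = (uint8_t)(m + 1);
            /* In GF(2^8) subtraction is XOR, so xm - xi == xm ^ xi. */
            weight = gf_mul(weight, gf_mul(xm, gf_inv(xm ^ xi)));
        }
        for (size_t j = 0; j < key_len; j++)
            key_out[j] ^= gf_mul(weight, chunks[i * key_len + j]);
    }
}
```

Because the polynomial has degree n - 1, any n - 1 chunks leave the constant term (the key byte) undetermined, which is why every shard has to be fetched before a file can be decrypted.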
Distributed filesystems are not a new concept, and DEFFS builds on the conventions that previous projects have set up. Among other distributed filesystems, DEFFS is most similar to Tahoe-LAFS.
DEFFS stands apart in a few key ways:
Configuration
- DEFFS aims to be a highly configurable zeroconf system, meaning it is well tuned for use out of the box, but can be fine-tuned for any user's individual needs. Configuration options range from simple ones, like the port number over which DEFFS communicates, to high-level ones, like node groups for larger networks. Have a config request? Email me!
Language
- DEFFS is written in C for maximum compatibility and efficiency.
Node Communication
- Each node in a DEFFS network leaves a port open for communication. A socket on a new thread is opened and closed for each interaction, as in any threaded server. Only encrypted contents are sent across these connections.
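The thread-per-connection pattern described under Node Communication might look roughly like the sketch below. It is not DEFFS's actual server code: the function names are hypothetical, and the handler simply echoes the request back where a real node would look up the requested shard and return its encrypted bytes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Worker for one interaction: read one request, reply, close the socket. */
static void *handle_peer(void *arg)
{
    assert(arg != NULL);
    int fd = *(int *)arg;
    free(arg);
    char request[64];
    ssize_t got = read(fd, request, sizeof request);
    ssize_t sent = 0;
    /* Placeholder reply: echo the request. A real node would send the
     * requested shard here, already encrypted. */
    while (got > 0 && sent < got) {
        ssize_t w = write(fd, request + sent, (size_t)(got - sent));
        if (w <= 0) break;
        sent += w;
    }
    close(fd);
    return NULL;
}

/* Hand a connected socket to a new detached thread. Returns 0 on success. */
static int spawn_handler(int fd)
{
    int *arg = malloc(sizeof *arg);
    if (arg == NULL) return -1;
    *arg = fd;
    pthread_t tid;
    if (pthread_create(&tid, NULL, handle_peer, arg) != 0) {
        free(arg);
        return -1;
    }
    pthread_detach(tid);
    return 0;
}

/* Open the node's listening TCP socket on `port`. Returns the fd or -1. */
int open_listener(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Accept loop: every incoming connection gets its own short-lived thread. */
void serve_forever(int listen_fd)
{
    for (;;) {
        int peer = accept(listen_fd, NULL, NULL);
        if (peer < 0) continue;
        if (spawn_handler(peer) != 0) close(peer);
    }
}
```

Detaching each thread means the accept loop never has to join workers, so the socket's lifetime is tied to the single interaction it serves.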
No, shard requests and replies are completely automatic.
By default, the new DEFFS instance will sniff the network for other DEFFS instances and notify them that it has joined the network. The new node will receive file headers, but existing shards will not be redistributed. The new machine will only begin to store shards when a file is next written to.
The default DEFFS configuration is geared towards an always-on network of reliable machines, so handling machines that go offline is not a top priority right now. The currently planned solution is to base the DEFFS shard architecture on RAID 1 mirroring, which would allow any machine to disconnect without data loss. A simpler option, to be implemented soon, is to automatically create redundant shards.
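As an illustration of how redundant shards could tolerate a disconnected machine, the sketch below uses a purely hypothetical placement rule: copy `copy` of shard `shard` is stored on node `(shard + copy) % nodes`. Since this feature is still planned, nothing here reflects an actual DEFFS design. It only shows that two copies per shard are enough to survive the loss of any single node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placement rule: which node stores a given copy of a shard. */
static size_t replica_node(size_t shard, size_t copy, size_t nodes)
{
    assert(nodes > 0);
    return (shard + copy) % nodes;
}

/* Returns 1 if every shard still has at least one copy on a node other
 * than `down`, and 0 if losing that node makes some shard unavailable. */
static int survives_loss(size_t shards, size_t copies, size_t nodes,
                         size_t down)
{
    for (size_t s = 0; s < shards; s++) {
        int alive = 0;
        for (size_t c = 0; c < copies; c++) {
            if (replica_node(s, c, nodes) != down) alive = 1;
        }
        if (!alive) return 0;
    }
    return 1;
}
```

With 3 nodes and 3 shards, one copy per shard does not survive losing a node, while two copies per shard do, at the cost of doubling the storage used on each machine.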
How to handle concurrent file modifications is currently undecided, but there are a few different ways to go. The simplest solution is to lock a file, or make it read-only, on all machines while it is open on one, preventing more than one user from working on a given file at once. This solution is better for smaller networks whose users rarely modify each other's files. The second option is a git-like commit feature, which would be harder to implement and more complicated for users to understand, but would handle cases where files are heavily shared.