author | Lars Wirzenius <liw@liw.fi> | 2020-11-27 16:57:21 +0200 |
---|---|---|
committer | Lars Wirzenius <liw@liw.fi> | 2020-11-28 10:37:40 +0200 |
commit | aa6709b2f657a964c2af04f6370e7eb7c117bf9e (patch) | |
tree | 9e0f5ba56f45eb24d286ac38a67b27ed0899f9d6 /summain.md | |
parent | b5b5884097219d77aa0b4cd6ad4d3a9c1407f5a6 (diff) | |
feat: implement Summain in Rust
Diffstat (limited to 'summain.md')
-rw-r--r-- | summain.md | 187 |
1 files changed, 187 insertions, 0 deletions
diff --git a/summain.md b/summain.md
new file mode 100644
index 0000000..7c2c5fa
--- /dev/null
+++ b/summain.md
@@ -0,0 +1,187 @@
+# Introduction
+
+A file manifest lists files, with their metadata.
+
+To verify a backup has been restored correctly, one can compare a
+manifest of the data before the backup and after it has been restored.
+If the manifests are identical, the data has been restored correctly.
+
+This requires a way to produce manifests that is deterministic: if run
+twice on the same input files, without the files having changed, the
+result should be identical. The Summain program does this.
+
+This version of Summain has been written in Rust for the [Obnam][]
+project.
+
+[Obnam]: https://obnam.org/
+
+## Why not mtree?
+
+[mtree]: http://cdn.netbsd.org/pub/pkgsrc/current/pkgsrc/pkgtools/mtree/README.html
+[NetBSD]: https://en.wikipedia.org/wiki/NetBSD
+
+[mtree][] is a tool included in [NetBSD][] Unix since version 1.2,
+released in 1996. It produces a manifest, and can check a manifest
+against the file system. It is, in principle, a tool that solves the
+same problem as Summain. Why not use an existing tool? Some reasons:
+
+* I'm an anti-social not-invented-here jerk.
+* It's an old C program, without tests in the source tree.
+* The file format is custom, and not nice for reading by humans.
+* It doesn't handle Unicode well.
+  - a filename of `ö` is encoded as `\M-C\M-6`
+  - but at least it can handle non-ASCII characters!
+* It doesn't handle file metadata that's Linux specific.
+  - extended attributes
+  - the ext4 immutable bit
+* It's single-threaded.
+
+In principle, there is no reason why mtree couldn't be extended to
+support everything I need for Obnam. In practice, since I'm working on
+this in my free time in order to have fun, I prefer to write a new
+tool in Rust.
+
+
+## Why not use the old Python version of Summain?
+
+I don't like Python anymore. The old tool would need updates to work
+with current Python, and I'd rather use Rust.
+
+
+# Usage
+
+Summain is given one or more files or directories on the command line,
+and it outputs to its standard output a manifest. If the command line
+arguments are the same, and the files haven't changed, the manifest is
+the same.
+
+The output is YAML. Each file gets its own YAML document, delimited
+by `---` and `...` as usual.
+
+Summain does not itself traverse directories. Instead, a tool like
+**find**(1) should be used. Summain will, however, sort its command
+line arguments, so the output does not depend on the order in which
+they are given.
+
+# Acceptance criteria
+
+## Directory
+
+~~~scenario
+given an installed summain
+given directory empty
+and atime for empty is 123
+and mtime for empty is 456
+when I run chmod a=rx empty
+when I run summain empty
+then output matches file empty.yaml
+~~~
+
+```{#empty.yaml .file .numberLines}
+---
+path: empty
+atime: 123
+atime_nsec: 0
+mode: dr-xr-xr-x
+mtime: 456
+mtime_nsec: 0
+nlink: 2
+size: ~
+```
+
+## Writeable file
+
+
+~~~scenario
+given an installed summain
+given file foo
+and atime for foo is 11
+and mtime for foo is 22
+when I run chmod a=rw foo
+when I run summain foo
+then output matches file foo.yaml
+~~~
+
+```{#foo.yaml .file .numberLines}
+---
+path: foo
+atime: 11
+atime_nsec: 0
+mode: "-rw-rw-rw-"
+mtime: 22
+mtime_nsec: 0
+nlink: 1
+size: 0
+```
+
+## Read-only file
+
+~~~scenario
+given an installed summain
+given file foo
+and atime for foo is 33
+and mtime for foo is 44
+when I run chmod a=r foo
+when I run summain foo
+then output matches file readonly.yaml
+~~~
+
+```{#readonly.yaml .file .numberLines}
+---
+path: foo
+atime: 33
+atime_nsec: 0
+mode: "-r--r--r--"
+mtime: 44
+mtime_nsec: 0
+nlink: 1
+size: 0
+```
+
+## Two files sorted
+
+~~~scenario
+given an installed summain
+given file aaa
+and atime for aaa is 33
+and mtime for aaa is 44
+given file bbb
+and atime for bbb is 33
+and mtime for bbb is 44
+when I run chmod a=r aaa bbb
+when I run summain bbb aaa
+then output matches file aaabbb.yaml
+~~~
+
+```{#aaabbb.yaml .file .numberLines}
+---
+path: aaa
+atime: 33
+atime_nsec: 0
+mode: "-r--r--r--"
+mtime: 44
+mtime_nsec: 0
+nlink: 1
+size: 0
+---
+path: bbb
+atime: 33
+atime_nsec: 0
+mode: "-r--r--r--"
+mtime: 44
+mtime_nsec: 0
+nlink: 1
+size: 0
+```
+
+---
+title: "Summain—deterministic file manifests"
+author: Lars Wirzenius
+template: python
+bindings:
+  - subplot/summain.yaml
+  - subplot/runcmd.yaml
+functions:
+  - subplot/summain.py
+  - subplot/runcmd.py
+...
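
The Usage section of the added `summain.md` describes two behaviours: the command line arguments are sorted, and each file gets its own YAML document. The following is a minimal Rust sketch of that idea, not the code from this commit. The function names `mode_string`, `sorted_paths` and `manifest_entry` are hypothetical, the use of `symlink_metadata` is an assumption, and the output is formatted by hand rather than by a YAML library, so the quoting of `mode` may differ from the expected files in the scenarios.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::{Path, PathBuf};

/// Render a Unix mode in `ls -l` style, e.g. "-rw-rw-rw-".
/// Assumption: only directories, symlinks and "everything else" are
/// told apart, and setuid/setgid/sticky bits are ignored.
fn mode_string(mode: u32) -> String {
    let kind = match mode & 0o170000 {
        0o040000 => 'd',
        0o120000 => 'l',
        _ => '-',
    };
    let mut s = String::with_capacity(10);
    s.push(kind);
    // Owner, group and other permission triplets, in that order.
    for shift in [6, 3, 0] {
        let bits = (mode >> shift) & 0o7;
        s.push(if bits & 0o4 != 0 { 'r' } else { '-' });
        s.push(if bits & 0o2 != 0 { 'w' } else { '-' });
        s.push(if bits & 0o1 != 0 { 'x' } else { '-' });
    }
    s
}

/// Sort the arguments so the output order does not depend on the
/// order in which they were given on the command line.
fn sorted_paths(args: &[&str]) -> Vec<PathBuf> {
    let mut paths: Vec<PathBuf> = args.iter().map(|a| PathBuf::from(*a)).collect();
    paths.sort();
    paths
}

/// Format one manifest entry with the fields shown in the scenarios.
/// Assumption: directories get `~` as their size, as in empty.yaml.
fn manifest_entry(path: &Path) -> io::Result<String> {
    let meta = fs::symlink_metadata(path)?;
    let size = if meta.is_dir() {
        "~".to_string()
    } else {
        meta.size().to_string()
    };
    Ok(format!(
        "---\npath: {}\natime: {}\natime_nsec: {}\nmode: {}\nmtime: {}\nmtime_nsec: {}\nnlink: {}\nsize: {}\n",
        path.display(),
        meta.atime(),
        meta.atime_nsec(),
        mode_string(meta.mode()),
        meta.mtime(),
        meta.mtime_nsec(),
        meta.nlink(),
        size
    ))
}

fn main() -> io::Result<()> {
    let args: Vec<String> = std::env::args().skip(1).collect();
    let refs: Vec<&str> = args.iter().map(String::as_str).collect();
    for path in sorted_paths(&refs) {
        print!("{}", manifest_entry(&path)?);
    }
    Ok(())
}
```

Run as, for example, `find . -print0 | xargs -0 ./sketch`, this prints one entry per path in sorted order. Because the entries depend only on the sorted paths and the file metadata, two runs over unchanged files should give identical output, which is the property the manifest comparison relies on.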