# Introduction

A file manifest lists files, with their metadata.

To verify a backup has been restored correctly, one can compare a
manifest of the data before the backup and after it has been restored.
If the manifests are identical, the data has been restored correctly.

This requires a way to produce manifests that is deterministic: if run
twice on the same input files, without the files having changed, the
result should be identical. The Summain program does this.
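
As an illustration of the comparison step, here is a minimal sketch in Rust. It is not part of Summain; the function name `first_difference` is made up for this example. It assumes both manifests have already been read into strings.

```rust
/// Return the 1-based number of the first line at which two manifests
/// differ, or `None` if they consist of exactly the same lines.
/// (Hypothetical helper, not part of Summain. Because it compares line
/// by line, a missing final newline is not counted as a difference.)
pub fn first_difference(before: &str, after: &str) -> Option<usize> {
    let mut a = before.lines();
    let mut b = after.lines();
    let mut line_no = 1;
    loop {
        match (a.next(), b.next()) {
            // Both manifests ended at the same time: no difference.
            (None, None) => return None,
            // Same line in both: keep going.
            (x, y) if x == y => line_no += 1,
            // Lines differ, or one manifest is longer than the other.
            _ => return Some(line_no),
        }
    }
}
```

If the result is `None`, the restored data matches the original as far as the manifest can tell. Otherwise the line number points at the first piece of metadata that changed.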

This version of Summain has been written in Rust for the [Obnam][]
project.

[Obnam]: https://obnam.org/

## Why not mtree?

[mtree]: http://cdn.netbsd.org/pub/pkgsrc/current/pkgsrc/pkgtools/mtree/README.html
[NetBSD]: https://en.wikipedia.org/wiki/NetBSD

[mtree][] is a tool included in [NetBSD][] Unix since version 1.2,
released in 1996. It produces a manifest, and can check a manifest
against the file system. It is, in principle, a tool that solves the
same problem as Summain. Why not use an existing tool? Some reasons:

* I'm an anti-social not-invented-here jerk.
* It's an old C program, without tests in the source tree.
* The file format is custom, and not nice for reading by humans.
* It doesn't handle Unicode well.
  - a filename of `ö` is encoded as `\M-C\M-6`
  - but at least it can handle non-ASCII characters!
* It doesn't handle file metadata that's Linux specific.
  - extended attributes
  - the ext4 immutable bit
* It's single-threaded.

In principle, there is no reason why mtree couldn't be extended to
support everything I need for Obnam. In practice, since I'm working on
this in my free time in order to have fun, I prefer to write a new
tool in Rust.


## Why not use the old Python version of Summain?

I don't like Python anymore. The old tool would need updates to work
with current Python, and I'd rather use Rust.


# Usage

Summain is given one or more files or directories on the command line,
and it outputs to its standard output a manifest. If the command line
arguments are the same, and the files haven't changed, the manifest is
the same.

The output is YAML. Each file gets its own YAML document, delimited
by `---` and `...` as usual.

Summain does not itself traverse directories. Instead, a tool like
**find**(1) should be used. Summain will, however, sort its command
line arguments, so that the output does not depend on the order in
which they are given.
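
The sorting can be sketched as follows. This is an illustration only: the function name `sorted_paths` is hypothetical, and ordering by `PathBuf` comparison is an assumption, not necessarily how Summain orders its arguments.

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Turn command line arguments into a sorted list of paths, so that
/// the order in which the caller gave them does not matter.
/// (Hypothetical helper; the sort order is an assumption.)
pub fn sorted_paths<I>(args: I) -> Vec<PathBuf>
where
    I: IntoIterator<Item = OsString>,
{
    let mut paths: Vec<PathBuf> = args.into_iter().map(PathBuf::from).collect();
    // PathBuf implements Ord, comparing paths component by component.
    paths.sort();
    paths
}
```

A program would call this as `sorted_paths(std::env::args_os().skip(1))`, skipping the program name. Using `args_os` rather than `args` avoids a panic on file names that are not valid Unicode.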

# Acceptance criteria

These scenarios verify that Summain handles the various kinds of file
system objects it may encounter, with two exceptions: block and
character devices. To create those, one needs to be the `root` user,
and we don't want to have to run the test suite as root. Instead, we
blithely rely on the output being correct for those anyway. Testing
manually indicates that it works, and the only difference from, say,
regular files is that the mode starts with a `b` or `c`, which is
exactly correct.
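
To show why device nodes differ only in the first character, here is a rough sketch of how an `ls`-style mode string can be derived from the raw `st_mode` bits. It is not Summain's code. The function names are made up, the file type constants are the usual Linux values, and the setuid, setgid and sticky bits are ignored for brevity.

```rust
/// Mask for the file type bits of `st_mode` (Linux value).
const S_IFMT: u32 = 0o170000;

/// Map the file type bits to the first character of the mode string.
/// (Hypothetical helper, using the usual Linux constants.)
pub fn file_type_char(st_mode: u32) -> char {
    match st_mode & S_IFMT {
        0o140000 => 's', // Unix domain socket
        0o120000 => 'l', // symbolic link
        0o100000 => '-', // regular file
        0o060000 => 'b', // block device
        0o040000 => 'd', // directory
        0o020000 => 'c', // character device
        0o010000 => 'p', // named pipe
        _ => '?',
    }
}

/// Render a mode such as 0o100644 as "-rw-r--r--".
/// Setuid, setgid and sticky bits are not handled in this sketch.
pub fn mode_string(st_mode: u32) -> String {
    let mut out = String::with_capacity(10);
    out.push(file_type_char(st_mode));
    // User, group and other permissions, three bits each.
    for shift in [6u32, 3, 0] {
        let bits = (st_mode >> shift) & 0o7;
        out.push(if bits & 0o4 != 0 { 'r' } else { '-' });
        out.push(if bits & 0o2 != 0 { 'w' } else { '-' });
        out.push(if bits & 0o1 != 0 { 'x' } else { '-' });
    }
    out
}
```

Only `file_type_char` looks at the file type, so a block or character device goes through exactly the same permission rendering as a regular file.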

## Directory

~~~scenario
given an installed summain
given directory empty
and mtime for empty is 456
when I run chmod a=rx empty
when I run summain empty
then output matches file empty.yaml
~~~

```{#empty.yaml .file .numberLines}
---
path: empty
mode: dr-xr-xr-x
mtime: 456
mtime_nsec: 0
nlink: 2
size: ~
sha256: ~
target: ~
```
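
The expected output above suggests one record per file, with fields that are null (`~`) when they don't apply to the kind of file. A sketch of such a record in Rust, with a hand-written renderer, might look like this. The type and its methods are hypothetical and written only for illustration; a real program would more likely use a YAML library. The quoting rule is a deliberate simplification: it only quotes values that start with `-`, which is enough for the mode strings in this document but not for arbitrary file names.

```rust
/// One manifest entry, with the fields seen in the expected outputs.
/// (Hypothetical type for illustration; not Summain's own.)
pub struct Entry {
    pub path: String,
    pub mode: String,
    pub mtime: i64,
    pub mtime_nsec: i64,
    pub nlink: u64,
    pub size: Option<u64>,
    pub sha256: Option<String>,
    pub target: Option<String>,
}

/// Quote a string value if it starts with `-`. Simplification: real
/// YAML needs quoting in many more cases.
fn scalar(s: &str) -> String {
    if s.starts_with('-') {
        format!("\"{}\"", s)
    } else {
        s.to_string()
    }
}

/// Render a missing value as the YAML null `~`.
fn or_null(v: Option<String>) -> String {
    v.unwrap_or_else(|| "~".to_string())
}

impl Entry {
    /// Render the entry as one YAML document, starting with `---`.
    pub fn to_yaml(&self) -> String {
        format!(
            "---\npath: {}\nmode: {}\nmtime: {}\nmtime_nsec: {}\nnlink: {}\nsize: {}\nsha256: {}\ntarget: {}\n",
            scalar(&self.path),
            scalar(&self.mode),
            self.mtime,
            self.mtime_nsec,
            self.nlink,
            or_null(self.size.map(|n| n.to_string())),
            or_null(self.sha256.clone()),
            or_null(self.target.as_deref().map(scalar)),
        )
    }
}
```

For a directory, `size`, `sha256` and `target` would all be `None`, matching the three `~` values in `empty.yaml`.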

## Writeable file


~~~scenario
given an installed summain
given file foo
and mtime for foo is 22
when I run chmod a=rw foo
when I run summain foo
then output matches file foo.yaml
~~~

```{#foo.yaml .file .numberLines}
---
path: foo
mode: "-rw-rw-rw-"
mtime: 22
mtime_nsec: 0
nlink: 1
size: 0
sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
target: ~
```

## Read-only file

~~~scenario
given an installed summain
given file foo
and mtime for foo is 44
when I run chmod a=r foo
when I run summain foo
then output matches file readonly.yaml
~~~

```{#readonly.yaml .file .numberLines}
---
path: foo
mode: "-r--r--r--"
mtime: 44
mtime_nsec: 0
nlink: 1
size: 0
sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
target: ~
```

## Two files sorted

~~~scenario
given an installed summain
given file aaa
and mtime for aaa is 44
given file bbb
and mtime for bbb is 44
when I run chmod a=r aaa bbb
when I run summain bbb aaa
then output matches file aaabbb.yaml
~~~

```{#aaabbb.yaml .file .numberLines}
---
path: aaa
mode: "-r--r--r--"
mtime: 44
mtime_nsec: 0
nlink: 1
size: 0
sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
target: ~
---
path: bbb
mode: "-r--r--r--"
mtime: 44
mtime_nsec: 0
nlink: 1
size: 0
sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
target: ~
```

## Symlink

~~~scenario
given an installed summain
given symlink ccc pointing at aaa
and mtime for ccc is 44
when I run summain ccc
then output matches file ccc.yaml
~~~

```{#ccc.yaml .file .numberLines}
---
path: ccc
mode: lrwxrwxrwx
mtime: 44
mtime_nsec: 0
nlink: 1
size: 3
sha256: ~
target: aaa
```
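
The `size: 3` above is the length in bytes of the link target, `aaa`, not the size of whatever the link points at. A small sketch of how both values can be read with the Rust standard library follows. The function name `symlink_info` is made up for this example, and it is not taken from Summain.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Return the size and target of a symbolic link, without following it.
/// (Hypothetical helper for illustration.)
pub fn symlink_info(path: &Path) -> io::Result<(u64, PathBuf)> {
    // symlink_metadata is like lstat(2): it describes the link itself.
    let meta = fs::symlink_metadata(path)?;
    // read_link is like readlink(2): it returns the stored target path.
    let target = fs::read_link(path)?;
    Ok((meta.len(), target))
}
```

Using `fs::metadata` instead would follow the link, and would fail for a dangling link like the one in the scenario, where `aaa` does not exist.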

## Unix domain socket

~~~scenario
given an installed summain
given socket aaa
and file aaa has mode 0700
and mtime for aaa is 44
when I run summain aaa
then output matches file socket.yaml
~~~

```{#socket.yaml .file .numberLines}
---
path: aaa
mode: srwx------
mtime: 44
mtime_nsec: 0
nlink: 1
size: 0
sha256: ~
target: ~
```

## Named pipe

~~~scenario
given an installed summain
given named pipe aaa
and file aaa has mode 0700
and mtime for aaa is 44
when I run summain aaa
then output matches file fifo.yaml
~~~

```{#fifo.yaml .file .numberLines}
---
path: aaa
mode: prwx------
mtime: 44
mtime_nsec: 0
nlink: 1
size: 0
sha256: ~
target: ~
```