README for story testing
========================

Introduction
------------

`story-test` is a story testing tool: you write a story describing how a user uses your software and what should happen, and express the story, using a very lightweight syntax, in such a way that it can be tested automatically. The story has a simple but strict structure:

    GIVEN some setup for the test
    WHEN thing that is to be tested happens
    THEN the post-conditions must be true

As an example, consider a very short test story for verifying that a backup program works, at least for one simple case.

    GIVEN some live data in a directory
    AND an empty backup repository
    WHEN a backup is made
    THEN the data can be restored

(Note the addition of AND: you can have multiple GIVEN, WHEN, and THEN statements.)

Stories are meant to be written in somewhat human-readable language. However, they are not free-form text. In addition to the GIVEN/WHEN/THEN structure, the text for each of the steps needs a computer-executable implementation. This is done by using IMPLEMENTS. The backup story from above might be implemented as follows:

    IMPLEMENTS GIVEN some live data in a directory
    rm -rf "$TESTDIR/data"
    mkdir "$TESTDIR/data"
    echo foo > "$TESTDIR/data/foo"

    IMPLEMENTS GIVEN an empty backup repository
    rm -rf "$TESTDIR/repo"
    mkdir "$TESTDIR/repo"

    IMPLEMENTS WHEN a backup is made
    backup-program -r "$TESTDIR/repo" "$TESTDIR/data"

    IMPLEMENTS THEN the data can be restored
    mkdir "$TESTDIR/restored"
    restore-program -r "$TESTDIR/repo" "$TESTDIR/restored"
    diff -rq "$TESTDIR/data" "$TESTDIR/restored"

Each "IMPLEMENTS GIVEN" (or WHEN, THEN) is followed by a regular expression on the same line, and then a shell script that gets executed to implement any step that matches the regular expression. The implementation can extract data from the match as well: for example, the regular expression might allow an amount to be specified.

The above example is a bit silly, of course: why go to the effort to obfuscate the various steps?
The answer is that the various steps, implemented using IMPLEMENTS, can be combined in many ways to test different aspects of the program being tested.

Moreover, by making the step descriptions human-language text, matched by regular expressions, most of the test can hopefully be written, and understood, by non-programmers. Someone who understands what a program should do could write tests to verify its behaviour. The implementations of the various steps need to be written by a programmer, but given a well-designed set of steps, with enough flexibility in their implementation, quite a good test suite can be written.

Test language specification
---------------------------

A test document is written in [Markdown][markdown], with block quoted code blocks being interpreted specially. Each block must follow the syntax defined here.

* Every step in a story is one line, and starts with a keyword.

* Each implementation (IMPLEMENTS) starts as a new block, and continues until there is a block that starts with another keyword.

The following keywords are defined.

* **STORY** starts a new story. The rest of the line is the name of the story. The name is used for documentation and reporting purposes only and has no semantic meaning. STORY MUST be the first keyword in a story, with the exception of IMPLEMENTING. A test document may contain any number of stories, but must contain at least one. The IMPLEMENTING sections are shared between them.

* **ASSUMING** defines a condition for the story. The rest of the line is "matched text", which gets implemented by an IMPLEMENTING section. If the code executed by the implementation fails, the story is skipped.

* **GIVEN** prepares the world for the test to run. If the implementation fails, the story fails.

* **WHEN** makes the change to the world that is to be tested. If the code fails, the story fails.

* **THEN** verifies that the changes made by the WHEN steps did the right thing. If the code fails, the story fails.

* **FINALLY** specifies how to clean up after a story. If the code fails, the story fails. The code gets run at the end of the story, regardless of whether the story is failing or not.

* **AND** acts as ASSUMING, GIVEN, WHEN, THEN, or FINALLY: whichever was used last. It must not be used unless the previous step was one of those, or another AND.

* **IMPLEMENTING** is followed by one of ASSUMING, GIVEN, WHEN, THEN, or FINALLY, and a PCRE regular expression, and then shell commands until the end of the block quoted code block. Markdown is unclear whether an empty line (no characters, not even whitespace) between two block quoted code blocks starts a new one or not, so we resolve the ambiguity by specifying that the next code block is a continuation unless it starts with one of the story testing keywords.

The shell commands get the parenthesised parts of the regular expression match as positional arguments (`$1` etc). For example, if the regexp is `a (\d+) byte file`, then `$1` gets set to the number matched by `\d+`.

The test runner creates a temporary directory, whose name is given to the shell code in the `TESTDIR` environment variable.

The shell commands get invoked with `/bin/sh -eu`, and need to be written accordingly. Be careful about commands that return a non-zero exit code: under `-e`, any such command makes the whole step fail.

The code block of an IMPLEMENTING block fails if the shell invocation exits with a non-zero exit code. Output to stderr is not an indication of failure. Any output to stdout or stderr may or may not be shown to the user.

Semantics:

* The name of each story (given with STORY) must be unique.

* Every ASSUMING, GIVEN, WHEN, THEN, FINALLY must be matched by exactly one IMPLEMENTING. The test runner checks this before running any code.

* Every IMPLEMENTING must match at least one ASSUMING, GIVEN, WHEN, THEN, or FINALLY. The test runner checks this before running any code.

* If ASSUMING fails, that story is skipped, and any FINALLY steps are not run.

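
The rules for running step implementations (captures as positional arguments, `TESTDIR`, and `/bin/sh -eu`) can be tried out by hand. The following is only a sketch under stated assumptions, not the test runner's actual code: it assumes captures are handed over the way `sh -c` passes extra arguments, and the step text, the `step_body` variable, and the `size` and `status` names are made up for illustration.

```shell
#!/bin/sh
set -eu

# Stand-in for the temporary directory the test runner would create.
TESTDIR=$(mktemp -d)
export TESTDIR

# Hypothetical IMPLEMENTING body for the regexp: a (\d+) byte file
# Single quotes keep $1 and $TESTDIR unexpanded until the inner shell runs.
step_body='dd if=/dev/zero of="$TESTDIR/file" bs=1 count="$1" 2>/dev/null'

# With "sh -c", the first argument after the script becomes $0 and the
# following ones become $1, $2, ...; here 100 plays the role of the
# text captured by (\d+).
/bin/sh -eu -c "$step_body" step 100
size=$(wc -c < "$TESTDIR/file" | tr -d ' ')
echo "file size: $size"

# Under -e the body stops at the first failing command, and the
# non-zero exit status is what marks the step as failed.
if /bin/sh -eu -c 'false; echo "not reached"' step; then
    status=passed
else
    status=failed
fi
echo "step status: $status"

rm -rf "$TESTDIR"
```

Because of `-e`, a command that is expected to return non-zero (for example a `grep` that should find nothing) needs to be wrapped in an `if` or similar construct, otherwise it ends the step early.
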

See also
--------

Wikipedia has an article on [Behaviour Driven Development][BDD], which can provide background and further explanation of what this tool tries to do.

[BDD]: https://en.wikipedia.org/wiki/Behavior-driven_development
[Markdown]: ...

TODO
----

* Add FINALLY to clean up at the end, regardless of test failures.
* Add DEFINING, PRODUCING, if they turn out to be useful.
* Need something like ASSUMING, except fail the story if the pre-condition is not true. Useful for testing that you can ssh to localhost when flinging, for example.