# Introduction

Ambient CI will be a continuous integration system: an automated system that builds, tests, delivers, and deploys software. It will be simple, sane, safe, secure, speedy, and supercalifragilisticexpialidocious. There are many existing such systems, but we feel none of them are excellent.

For now, it is a command line tool that runs a build locally, in a VM, without network access.

## Continuous integration concepts

The concepts and related terminology around continuous integration systems are not entirely standardized. To be clear, here are the definitions Ambient uses. The goal is to be clear and unambiguous in the Ambient context, not to define terminology for the wider community of software developers.

Note: Ambient is currently aimed at building, testing, and publishing software, but not deploying it. That may change, but it's possible that deployment will need its own kind of system rather than forcing a CI system to do it badly.

A CI system consists of several abstract components, or responsibilities. These may end up as separate programs and services in Ambient, or several may be combined into one. For thinking about and discussing the system and its software architecture, we pretend that they're all separate programs.

* **artifact** -- a file produced by a build
* **artifact store** -- where artifacts are stored after a build step is finished
  - we can't assume we can access files on the worker, but we do need some place to store artifacts
  - the artifact store can also store intermediate state between build steps
  - later build steps and build runs may fetch artifacts from the store
* **build** or **build run** -- an execution of all the steps of a build graph
  - some CI systems call this a "job", but that seems like unclear terminology
* **build graph** -- the full set of steps to build a project
  - in the simple case, the steps form a simple sequence:
    - build program
    - run its tests
    - build Debian package
    - publish Debian package
  - in the general case, some steps may be performed concurrently (building the Debian package can happen while tests are run):
    - build program
    - concurrently:
      - run its tests
      - build Debian package
    - publish Debian package
  - in the even more general case, some steps may need to be performed on several target systems:
    - build Debian source package
    - concurrently on amd64, armhf, aarch64, riscv32, riscv64:
      - build program
      - run its tests
      - build Debian binary package from source package
    - publish Debian source package and all binary packages
      - but only if all binary packages succeeded
  - how users specify this is going to be crucial for Ambient; see the sketch after this list for one way the simple case could look when executed
* **build step** -- a concrete step in a build graph
  - for example, "run this shell command to build executable binaries from the source code", or "run this program to create a tar archive of the binaries"
* **controller** -- the system that keeps track of projects, build runs, their current state, and what needs to happen next
  - once a build run is triggered, the controller makes sure every step gets executed, and handles steps failing or taking too long
  - the controller tells workers what to do
  - the controller checks the result of each step, and picks the next step to execute, and the worker to execute it
* **project** -- a thing that needs to be built and tested
  - each project has a build graph
* **trigger** -- what causes a build run to start
  - to trigger a build, _something_ tells the controller that a build is needed; the controller does not trigger a build itself
  - triggering can be done by a change in a git repository (as if by the git server), or otherwise (e.g., a cron job to trigger a nightly build)
* **worker** -- executes build steps
  - there can be many workers, and many workers may be needed for a complete build
  - consecutive steps for the same build run may or may not be executed by the same worker, at the discretion of the controller; if necessary, state is communicated between workers via the artifact store
  - conceptually, a worker executes one build step at a time; to make better use of hardware resources, multiple workers can be run concurrently
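To make the build graph concept a bit more concrete, here is a purely illustrative sketch of what the simple sequential case above might boil down to when executed, written as a shell script. This is not a format Ambient defines; the commands and the publishing target are assumptions made for the sake of the example.

~~~sh
#!/bin/sh
# Illustrative expansion of the simple sequential build graph above.
# Each command is one build step; the run stops at the first failing step.
set -eu

cargo build --release           # build program
cargo test --release            # run its tests
dpkg-buildpackage -us -uc -b    # build Debian package
dput example-repo ../*.changes  # publish Debian package (hypothetical repository name)
~~~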
# Motivation

## Problems we see

These are not in any particular order.

* Debugging: when (not if) there is a build failure, it can be tedious and frustrating to figure out what the cause is and to fix it. Often the failure is difficult to reproduce locally, or in any way that can be inspected other than via the build log.
* Capacity: individuals and small organizations often don't have much capacity to spare for CI purposes, which hampers and slows down development and collaboration.
* Generality: many CI systems run jobs in containers, which has low overhead but limits what a job can do. For example, containers don't allow building on a different operating system from the host, or on a different computer architecture.
* Security: typically a CI system doesn't put many limits on what the job can do, which means that building and testing software can be a security risk.

## Needs and wants we have

These are not in any particular order.

* We want to build software for different operating systems and computer architectures, and versions thereof, as one project, with a minimum of fuss. One project should be able to build binaries and installation packages for any number of targets as part of one build.
* It must be possible to construct and update the build environments within the CI system itself, for example by building the virtual machine base image for build workers.
* We want builds to be quick. The CI system should add only a little overhead to a build. When it's possible to break a build into smaller, independent parts, they are run concurrently, as far as hardware capacity allows.
* We want it to be easy to provide build workers, without having to worry about the security of the worker host, or the security of the build artifacts.
* If a build is going to fail for a reason that can be predicted before it even starts, the job should not start. For example, if a build step runs a shell command, the syntax should be checked before the job starts (see the sketch after this list). Obviously this is not possible in every case, but in the common case it is.
* Build failures should be easy to debug. Exactly what this means is unclear at the time of writing, but it should be a goal for all design and development work.
* It's easy to host both the client and server components.
* It's possible, straightforward, and safe for workers to require payment to run a build step. This needs to be done in a way that is unlikely to lead to anyone being scammed.
* It's integrated with major git hosting platforms (GitHub, GitLab, etc.), but is not tied to any git platform, or to git at all.
* Build logs can be post-processed by other programs.
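The shell-syntax item above can be illustrated with a small sketch: a controller could refuse to start a build whose build script does not even parse, since `bash -n` reads and parses a script without executing it. The use of the project's `.ambient-script` as the script to check is an assumption for this example; this is not part of Ambient.

~~~sh
# Sketch of a pre-flight check: do not start a build whose build script
# has a shell syntax error. `bash -n` parses the script without running it.
if ! bash -n .ambient-script; then
    echo "build script has a syntax error; refusing to start the build" >&2
    exit 1
fi
~~~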
# Visions of CI

## Megalomaniac vision: CI is everywhere, all the time

Ambient links together build workers and development projects to provide a CI system that is just there, all the time, everywhere. Anyone can provide a worker that anyone can use. The system constrains build jobs so that they can only do safe things and can only use an acceptable amount of resources, and it guarantees that the output of the build can be trusted.

(This may be impossible to achieve, but we can dream. If you don't aim for the stars you are at risk of shooting yourself in the foot.)

## Plausible vision: CI is easy to provide and use

The various components of Ambient are easy to set up, to keep running, and to use. Within an organization it's easy to access. It's so easy to provide a worker on one's own machine, without worry, that everyone in the organization can be expected to do so. Builds can easily be reproduced locally.

## Realistic vision

Ambient provides a command line tool to run a job in a safe, secure manner that is easily reproduced by different people, to allow collaborating on a software development project in a controlled way.

# Threat modeling

This model concerns itself with running a build locally. Some terminology:

* project -- what is being built
* host -- the computer where Ambient is run

Building many software projects requires running code from the project itself, and running its automated tests certainly does. The code might not be directly part of the project, but might come from a dependency specified by the project. This code can do anything. It might be malicious and attack the build host. It probably doesn't, but Ambient must be able to safely and securely build and test projects that aren't fully trusted and trustworthy.

The attacks we are concerned with are:

* reading, modifying, or storing data on the host, in unexpected ways
* using too much CPU on the host
* using too much memory on the host
* using too much disk space on the host
* accessing the network from the host, in unexpected ways

## Prevention

We build and test in a local virtual machine.

* The VM has no network access at all.
* We provide the VM with the project source code via a read-only virtual disk.
* We provide the VM with another virtual disk where it can store any artifacts that should persist. Neither virtual disk contains a file system; each contains a tar archive.
* We provide the VM with a pre-determined amount of virtual disk. The project won't be able to use more.
* We provide the VM with an operating system image with all the dependencies the project needs pre-installed. Anything that needs to be downloaded from online repositories is specified by URL and cryptographic checksum, downloaded before the VM starts, and provided to the build via a virtual disk.
* We interact with the VM via a serial console only.
* We run the VM with a pre-determined amount of disk and memory, and a pre-determined number of CPUs.
* We fail the build if it exceeds a pre-determined time limit.
* We fail the build if the amount of output via the serial console exceeds a pre-determined limit.
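To make these prevention measures concrete, here is a sketch of roughly how such a constrained VM could be started with QEMU. The QEMU options shown are real, but the file names, resource limits, and the use of `timeout` and `head` to enforce the time and output limits are illustrative assumptions, not a description of Ambient's actual implementation (raw tar archives would also need to be padded to whole 512-byte sectors to be usable as disk images).

~~~sh
# Sketch: run a constrained build VM with QEMU (illustrative only).
# - no network             (-nic none)
# - fixed memory and CPUs  (-m, -smp)
# - read-only source and dependency disks, read/write artifact and cache disks
# - interaction via the serial console only, with an output size limit
# - an overall wall-clock time limit via timeout(1)
timeout 1800 qemu-system-x86_64 \
    -m 2048 -smp 2 \
    -nic none \
    -display none \
    -serial stdio \
    -drive file=system.img,if=virtio,format=raw \
    -drive file=src.tar,if=virtio,format=raw,readonly=on \
    -drive file=artifacts.img,if=virtio,format=raw \
    -drive file=deps.tar,if=virtio,format=raw,readonly=on \
    -drive file=cache.tar,if=virtio,format=raw \
    | head -c 10M > build.log
~~~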
# Architecture

At a very abstract level, the Ambient architecture is as follows:

* Ambient creates a virtual machine with four block devices (using `virtio_blk`) in addition to the system disk (`/dev/vda` on Linux):
  - `/dev/vdb`: the read-only source device: a tar archive of the project's source tree
  - `/dev/vdc`: the read/write artifact device: for the project to write a tar archive of any build artifacts it wants to export
    - this would be write-only if that were possible
    - when the build starts, this contains only zeroes
    - after the build, a tar archive is extracted from it
  - `/dev/vdd`: the read-only dependencies device: a tar archive of additional dependencies in a form that the project can use
  - `/dev/vde`: the read/write cache device: a tar archive of any files the project wants to persist across runs; for example, for a Rust project, this would contain the cargo target directory contents
    - when a build starts, this can be empty; the build must deal with an empty cache
* The VM additionally has a serial port to which it writes the build log. On Linux this is `/dev/ttyS0`.
* On boot, the VM automatically creates `/workspace/{src,cache,deps}`, and extracts the source, cache, and dependencies tar archives into those directories.
* The VM then changes its current working directory to `/workspace/src` and runs `./.ambient-script` (if the script isn't executable, the VM first makes it so). The script's stdout and stderr are redirected to the serial port.

The `ambient-build.service` and `ambient-run-script` files in the Ambient source tree implement this for Linux with systemd, and have been tested with Debian.
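To illustrate what this looks like from a project's point of view, here is a sketch of an `.ambient-script` a Rust project might use. It assumes, as the test script later in this document does, that the script receives the path to which it must write the artifact tar archive as its first argument; the binary name `myprogram` and the use of the cache for the cargo target directory are assumptions for the example, not requirements imposed by Ambient.

~~~sh
#!/bin/bash
# Illustrative .ambient-script for a Rust project; not prescribed by Ambient.
# Assumptions: the current directory is /workspace/src, the cache was
# extracted to ../cache, and "$1" is where the artifact tar archive must be
# written (mirroring the test script later in this document).
set -euo pipefail

# Keep the cargo target directory in the cache so later runs can reuse it.
export CARGO_TARGET_DIR=../cache/target

# The VM has no network, so crate dependencies would have to be available
# offline, for example vendored via the dependencies device (../deps).
cargo build --release --offline
cargo test --release --offline

# Export the built binary (hypothetical name) as the build artifact.
tar -cf "$1" -C "$CARGO_TARGET_DIR/release" myprogram
~~~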
# Acceptance criteria

[Subplot]: https://subplot.tech

These acceptance criteria are written for the [Subplot][] tool to process. They are verified using scenarios expressed in a given/when/then language. For details, please see the Subplot documentation.

## `ambient-run-script`

This section concerns itself with `ambient-run-script`, which is part of the VM in which the build is run.

### Accepts the various devices

_Requirement: `ambient-run-script` accepts the various input and output devices correctly._

We verify this by running the script with an option, which exists only for this purpose, that dumps its configuration as text.

~~~scenario
given an installed ambient-run-script
given file tars.txt
when I run ./ambient-run-script --dump-config=dump.txt -t /dev/ttyS0 -s input.tar -a output.tar -c cache.tar -d deps.tar
then files dump.txt and tars.txt match
~~~

~~~{#tars.txt .file .text}
ambient-run-script:
- dump: dump.txt
- tty: /dev/ttyS0
- src: input.tar
- artifact: output.tar
- cache: cache.tar
- dependencies: deps.tar
- root: /
- dry_run: None
~~~

### Lists steps in happy path

This scenario verifies two requirements, for the sake of simplicity of test implementation.

* _Requirement: `ambient-run-script` must perform the same steps every time, unless something goes wrong._

We verify this by having `ambient-run-script` list the steps it would do, without actually doing them, using the `--dry-run` option.

~~~scenario
given an installed ambient-run-script
given file expected-steps.txt
when I run ./ambient-run-script --dry-run=steps.txt -t /dev/ttyS0 -s input.tar -a output.tar -c cache.tar -d deps.tar
then files steps.txt and expected-steps.txt match
~~~

~~~{#expected-steps.txt .file .text}
create /workspace
extract input.tar to /workspace/src
extract cache.tar to /workspace/cache
extract deps.tar to /workspace/deps
build in /workspace/src
save /workspace/cache to cache.tar
~~~

### Performs expected steps in happy path

_Requirement: `ambient-run-script` must prepare to run a build in a VM._

`ambient-run-script` runs inside the VM, so we verify this requirement by having it run a build that we prepare so that it's safe to run in our test environment, without a VM. We make use of a special option so that the paths used by the program are relative to the current working directory, instead of absolute. We also verify that the output device gets output written to it, and that the cache device gets updated.

This is a bit of a long scenario, so it's divided into chunks. First we set things up.

~~~scenario
given an installed ambient-run-script
given file project/.ambient-script from simple-ambient-script
given tar archive input.tar with contents of project
given file cached/data from cache.data
given tar archive cache.tar with contents of cached
given file deps/deps.data from deps.data
given tar archive deps.tar with contents of deps
~~~

We now have the source files, cached data, and dependencies, and we can run `ambient-run-script` with them.

~~~scenario
when I run ./ambient-run-script --root=. -t log -s input.tar -a output.tar -c cache.tar -d deps.tar
~~~

The workspace must now exist and have the expected contents.

~~~scenario
then file workspace/src/.ambient-script exists
then file workspace/cache/data exists
then file workspace/cache/cached contains "hello, cache"
then file workspace/deps/deps.data exists
then file log contains "hello, ambient script"
~~~

The artifact tar archive must contain the expected contents.

~~~scenario
when I create directory untar-output
when I run tar -C untar-output -xvf output.tar
then file untar-output/greeting contains "hello, there"
~~~

The cache tar archive must contain the expected contents.

~~~scenario
when I create directory untar-cache
when I run tar -C untar-cache -xvf cache.tar
then file untar-cache/cached contains "hello, cache"
then file untar-cache/data exists
~~~

That's all, folks.

~~~{#simple-ambient-script .file .sh}
#!/bin/bash
set -xeuo pipefail
echo hello, ambient script
echo hello, cache > ../cache/cached
echo hello, there > greeting
tar -cf "$1" greeting
~~~

~~~{#cache.data .file}
This is cached data.
~~~

~~~{#deps.data .file}
This is dependency data.
~~~