[[!meta title="Ick—architecture"]]

Introduction
=============================================================================

Ick2 is a continuous integration (CI) system. It is being developed by
Lars Wirzenius and other people, for their own needs. It is very early
days. You don't want to use Ick2 yet, but if you have opinions on what a
CI system should be like, feedback is welcome.

This document describes the architecture of Ick2. Specifically, the
architecture for the upcoming ALPHA-1 release, and no further than that.
It is a capital mistake to design software before you have all the
requirements: it biases the judgment. Since you can rarely have all the
requirements a priori, you have to iterate to gather them, and designing
beyond one iteration is a mistake.

Background and justification
-----------------------------------------------------------------------------

This section should be written some day. In short, Lars got tired of
Jenkins, and all competitors seem insufficient or somehow
unpleasant. Then Daniel suggested a name, and Lars is incapable of not
starting a project if given a name for it.


Overview
-----------------------------------------------------------------------------

A continuous integration (CI) or continuous deployment (CD) system is,
at its simplest core, an automated system that reacts to changes in
a program's source code by doing a build of the program, running its
automated tests, and then publishing the results somewhere. A
CD system continues from there to also install the new version of
the program on all relevant computers. If a build or an automated test
fails, the system notifies the relevant parties.

Ick2 aims to be a CI/CD system.
It deals with a small number of
concepts:

* **projects**, which consist of **source code** in a version control
  system (mainly git right now)
* **pipelines**, which are sequences of steps aiming to convert source
  code into something executable, or to test the program
* **worker build hosts**, which do all the heavy lifting

The long-term goal for Ick2 is to provide a CI/CD system that can be
used to build and deploy any reasonable software project, including
building packages of any reasonable type. In our wildest dreams it'll
be scalable enough to build a full, large Linux distribution such as
Debian. We'll see.

Example
-----------------------------------------------------------------------------

We will be returning to this example throughout this document. Imagine
a static website that is built using the ikiwiki software. The source
of the web pages is stored in a git repo, and the generated HTML pages
are published on a web server.

This might be expressed as an Ick2 configuration like this:

    projects:
      website:
        workspace:
          - git: ssh://git@git.example.com/website.git
        pipelines:
          - name: getsource
            steps:
              - shell: git clone ssh://git@git.example.com/website.git src
          - name: ikiwiki
            steps:
              - shell: mkdir html
              - shell: ikiwiki src html
          - name: publish
            steps:
              - shell: rsync -a --delete html/. www-user@www.example.com:/srv/http/.

Note that pipelines are defined in the configuration. Eventually, Ick2
may come with pre-defined libraries of pipelines that can easily be
reused, but it will always be possible for users to define their own.

Pipeline steps will not be able to use variables in ALPHA-1. That's
probably going to be added later.


Ick2 ALPHA-1
=============================================================================

We are currently working on what will be called the ALPHA-1 version of
Ick2. This chapter outlines its intended functionality and the shape
of its architecture.
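As a reading aid for the example configuration shown earlier, here is a
minimal Python sketch of the same structure and a helper that lists the
shell commands a named pipeline would run. The dict layout mirrors the
YAML example; the `shell_steps` function is made up for illustration and
is not part of Ick2.

```python
# The example project from the previous chapter, as a Python dict.
PROJECT = {
    "website": {
        "pipelines": [
            {"name": "getsource",
             "steps": [{"shell": "git clone ssh://git@git.example.com/website.git src"}]},
            {"name": "ikiwiki",
             "steps": [{"shell": "mkdir html"},
                       {"shell": "ikiwiki src html"}]},
            {"name": "publish",
             "steps": [{"shell": "rsync -a --delete html/. www-user@www.example.com:/srv/http/."}]},
        ],
    },
}


def shell_steps(project: dict, pipeline_name: str) -> list:
    """Return the shell commands of one pipeline, in execution order."""
    for pipeline in project["pipelines"]:
        if pipeline["name"] == pipeline_name:
            return [step["shell"] for step in pipeline["steps"]]
    raise KeyError(pipeline_name)
```

For instance, `shell_steps(PROJECT["website"], "ikiwiki")` yields the two
commands of the build pipeline, in order.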


Ick2 ALPHA-1 definition
-----------------------------------------------------------------------------

This is the current working definition of the aim for the ALPHA-1
version of Ick2:

> ALPHA-1 of Ick2 can be deployed and configured easily, and can
> concurrently build multiple projects using multiple workers. Builds may be
> traditional builds from source code, may involve running unit tests
> or other build-time tests, may involve building Debian packages, and
> build artifacts are published in a suitable location. Builds may
> also be builds of static web sites or documentation, and those build
> artifacts may be published on suitable web servers. Builds happen on
> workers in reasonably well isolated, automatically maintained
> environments akin to pbuilder or schroot (meaning the sysadmin is
> not expected to set up the pbuilder base tarball; ick2 will do
> that).

Ick2 acceptance criteria
-----------------------------------------------------------------------------

Acceptance criteria for ALPHA-1:

* All Ick2 components and the workers are deployable using Ansible or
  similar configuration management tooling.

* At least two people (not only Lars) have set up a CI cluster to
  build at least two different projects on at least two workers. One
  of the projects should build documentation for ick2 itself, the
  other should build .deb packages of ick2. Bonus points for
  building projects other than ick2 as well.

* Builds get triggered automatically by a git server on any commit to
  the master branch.

* Build logs can be viewed while builds are running, or afterwards, via
  an HTTP API (perhaps wrapped in a command line tool). Bonus points
  if someone builds a web app on top of the API.

* A modicum of thought has been spent on security, and the major
  contributors agree the security design is not idiotic.
The goal is
  to be confident that a future version of Ick2 can be made reasonably
  secure, even if that doesn't happen for ALPHA-1.

* The workspace is constructed from several git repositories, e.g., so
  that the debian subdir comes from a different repo than the main
  source tree.

* The pipeline steps are not merely snippets of shell script to run.
  Instead, steps may name operations that get executed by the workers
  without specifying the implementation in the Ick2 project
  configuration.


Ick2 ALPHA-1 architecture
-----------------------------------------------------------------------------

The future architecture of Ick2 is a collection of mutually recursive
self-modifying microservices.

* A project consists of a description of the workspace, and one or
  more pipelines to be executed when triggered to do so. Each
  pipeline needs to be triggered individually. Each pipeline acts on
  the same workspace. The entire pipeline is executed on the same
  worker.

* The workspace description is, initially, a set of git repos and
  corresponding refs to clone (or update from) into a tree. Later
  (after ALPHA-1) the workspace may be built from multiple git repos,
  or artifacts of other builds, or other things that turn out to be
  useful.

  Accessing git repositories may require credentials, which only
  specific workers will have.

* The workspace is, essentially, a directory tree populated by the files
  needed for doing a build. The "source tree", if you wish.

* The project's pipelines do things like: prepare the workspace, run the
  actual build, and publish build artifacts from the worker to a suitable
  server. The controller keeps track of where in each pipeline a
  build is.

* Workers are represented by worker-managers, which request work
  from the controller and perform the work by running commands locally
  or over ssh on the actual worker host.
Worker-managers may be on the
  worker hosts or elsewhere, depending on what suits each CI
  cluster best.

* Worker-managers register their workers with the controller. For
  ALPHA-1, all workers are assumed to be equivalent.

* A pipeline is a sequence of steps (such as shell commands to run),
  plus some requirements as to what attributes the worker that runs the
  pipeline should have. All the steps of a pipeline get executed by
  the same worker.

* If a pipeline step fails, the controller will mark the pipeline
  execution as having failed and won't schedule more steps to execute.
  Likewise, later pipelines in the same project won't be executed. If
  the failure was transient (e.g., a DNS lookup error), the user may
  trigger a rebuild manually (via the trigger service).

Ick2 ALPHA-1 components
-----------------------------------------------------------------------------

Ick2 consists of several independent services. This document describes
how they are used individually and together.

* The **controller** keeps track of projects, build pipelines, workers,
  and the current state of each. It decides which build step is next,
  and who should execute it. The controller provides a simple,
  unconditional "build this pipeline" API call, to be used by the
  trigger service (see below).

* A **worker-manager** represents a **build host**. It queries the
  controller for work, makes the build host (the actual worker)
  execute it, and then reports results back to the controller.

* The **trigger service** decides when a build should start. It polls
  the state of the universe, or gets notifications of changes to the
  same.

* The controller and trigger services provide an API. The **identity
  provider** (IDP) takes care of the authentication of each API
  client, and of what privileges each should have. The API client
  authenticates itself to the IDP, and receives an access token.
The
  API provider gets the token in each request, validates it, and
  inspects it to see what the client is allowed to do.

  A major point of the IDP is to have just a single place where
  authentication and authorisation are configured.

On an implementation level, the various services of Ick2 may be
implemented using any language and framework that works. However, to
keep things simple, initially we'll be using Python 3, Bottle, and
Green Unicorn. Also, the actual API implementation ("backend") will be
running behind haproxy, such that haproxy decrypts TLS and sends the
actual HTTP request over unencrypted localhost connections to the
backend.

@startuml
title Ick2 services

[git server] --> [trigger service] : notify of change
[trigger service] --> [controller] : start pipeline
[controller] <-- [worker manager] : get work, report result
[worker manager] --> [host] : execute command
[git server] --> [IDP] : get access token
[trigger service] .. [IDP] : get access token
[worker manager] .. [IDP] : get access token
@enduml

The API-providing services will be running in a configuration like
this:

@startuml
title API arch
node service {
  component haproxy
  component backend
}
[API client] --> [haproxy] : HTTPS (TLS)
[haproxy] --> [backend] : HTTP over localhost
@enduml


Individual APIs
=============================================================================

This chapter covers interactions with individual APIs.


On security
-----------------------------------------------------------------------------

All APIs are provided over TLS only. Access tokens are signed using public
key encryption, and the public part of the signing keys is provided
"somehow" to all API clients.
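As a rough sketch of what an API provider does with such a token: split
it, check the signature, check expiry, and read the claims. This is a
minimal illustration assuming JWT-shaped tokens; real signature
verification would use a proper library (e.g., PyJWT with the IDP's
public key), so it is passed in here as a stand-in callable.

```python
import base64
import json


def b64url_decode(part: str) -> bytes:
    """Decode base64url, restoring the padding JWTs strip off."""
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def inspect_token(token: str, now: float, verify_signature) -> dict:
    """Validate a JWT-style token and return its claims.

    `verify_signature(signed_part, signature)` stands in for real
    public-key verification, which a production service would delegate
    to a crypto library.
    """
    header_b64, payload_b64, signature_b64 = token.split(".")
    claims = json.loads(b64url_decode(payload_b64))
    if not verify_signature(header_b64 + "." + payload_b64,
                            b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if "exp" in claims and claims["exp"] < now:
        raise ValueError("token expired")
    return claims
```

The claims returned would then be checked against the privileges needed
for the requested operation.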


Getting an access token
-----------------------------------------------------------------------------

The API client (user's command line tool, a putative web app, the git
server, a worker-manager, etc) authenticates itself to the IDP and, if
successful, gets back a signed JSON Web Token. It will include the
token in all requests to all APIs, so that the API provider will know
what the client is allowed to do.

The privileges for each API client are set by the sysadmin who
installs the CI system, or by a user who's been given IDP admin
privileges by the sysadmin.

@startuml
hide footbox
title Get an access token
client -> IDP : GET /auth, with Basic Auth, over https
IDP --> client : signed JWT token
@enduml

All API calls need a token. Getting a token happens the same way for
every API client.


Worker (worker-manager) registration
-----------------------------------------------------------------------------

The sysadmin arranges to start a worker-manager for every build host.
They may run on the same host, or not: the Ick2 architecture doesn't
really care. If they run on the same host, the worker-manager will
start a subprocess. If on different hosts, the subprocess will be
started using ssh.

The CI admin may define tags for each worker. Tags may include
things like whether the worker can be trusted with credentials for
logging into other workers, or for retrieving source code from the git
server. Workers may not override such tags. Workers may, however,
provide other tags, to e.g. report their CPU architecture or Debian
release. The controller will eventually be able to use the tags to
choose which worker should execute which pipeline steps.
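One way the controller might combine the two kinds of tags, sketched
under the assumption that tags are simple key-value pairs; the function
name and the example tag names are made up for illustration.

```python
def merge_worker_tags(admin_tags: dict, worker_tags: dict) -> dict:
    """Combine admin-defined and worker-reported tags.

    Admin-defined tags always win: a worker may add tags the admin did
    not set (e.g. its CPU architecture), but may not override tags such
    as whether it is trusted with credentials.
    """
    merged = dict(worker_tags)
    merged.update(admin_tags)  # admin values overwrite worker values
    return merged
```

So a worker claiming `trusted: true` cannot promote itself if the admin
has said otherwise, while its self-reported `arch` tag survives.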

@startuml
hide footbox
title Register worker
worker_manager -> IDP : GET /auth, with Basic Auth, over https
IDP --> worker_manager : token A
worker_manager -> controller : POST /workers (token A)
controller --> worker_manager : success
@enduml

The worker-manager runs a very simple state machine.

@startuml
title Worker-manager state machine

Querying : ask controller for work
Running : run subprocess

[*] -down-> Idle : start
Idle -down-> Querying : short timeout has expired
Querying -up-> Idle : nothing to do
Querying --> Running : something to do

Running --> Running : get output, report to controller
Running --> Idle : subprocess finished, report to controller
@enduml


Add project to controller
-----------------------------------------------------------------------------

The CI admin (or a user authorised by the CI admin) adds projects to
the controller to allow them to be built. This is done using a "CI
administration application", which initially will be a command line
tool, but may later become a web application as well. Either way, the
controller provides API endpoints for this.

@startuml
hide footbox
title Add project to controller

adminapp -> IDP : GET /auth, with Basic Auth, over https
IDP --> adminapp : token B
adminapp -> controller : POST /projects (token B)
controller --> adminapp : success or failure indication
@enduml


A full build
=============================================================================

Next we look at how the various components interact during a complete
build, using a single worker, which is trusted with credentials. We
assume the worker has been registered and the projects added.

The sequence diagrams in this chapter have been split into stages, to
make them easier to view and read. Each diagram after the first one
continues where the previous one left off.
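The worker-manager state machine shown earlier can be sketched as a
plain polling loop. This is a simplified illustration: `get_work` and
`report` are hypothetical stand-ins for the authenticated HTTP calls to
the controller, streaming output while the subprocess runs is omitted,
and `rounds` exists only so the sketch terminates.

```python
import subprocess
import time


def worker_manager_loop(get_work, report, poll_interval=5.0, rounds=None):
    """One possible shape for the worker-manager's state machine.

    `get_work` asks the controller for the next step (None if there is
    nothing to do); `report` sends the output and exit code back to the
    controller. The real loop would run forever; `rounds` bounds it for
    testing.
    """
    while rounds is None or rounds > 0:
        if rounds is not None:
            rounds -= 1
        step = get_work()                  # Idle -> Querying
        if step is None:                   # Querying -> Idle
            time.sleep(poll_interval)
            continue
        proc = subprocess.run(             # Querying -> Running
            step["shell"], shell=True,
            capture_output=True, text=True)
        report({                           # Running -> Idle
            "exit_code": proc.returncode,
            "stdout": proc.stdout,
            "stderr": proc.stderr,
        })
```

The short-timeout polling matches the Idle/Querying edges of the
diagram; the Running -> Running "report output" edge would require the
worker-manager to read the subprocess output incrementally instead.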

Although not shown in the diagrams, the same sequence is meant to work
when multiple projects are running concurrently on multiple workers.

Trigger build by pushing changes to git server
-----------------------------------------------------------------------------

@startuml
hide footbox
title Build triggered by git change

developer -> gitano : git push

gitano -> IDP : GET /auth, with Basic Auth, over https
IDP --> gitano : token C
gitano -> trigger : POST /git/website.git (token C)
note right
  Git server notifies
  trigger service that
  a git repo has changed
end note

|||

trigger -> IDP : GET /auth, with Basic Auth, over https
IDP --> trigger : token D
trigger -> controller : GET /projects (token D)
note right
  trigger service queries
  controller to get list
  of all projects, so it
  knows which builds to
  start
end note
controller --> trigger : list of projects

|||

trigger -> controller : GET /projects/website (token D)
note right
  trigger service
  gets project config
  so it knows what
  pipelines project has
end note
controller --> trigger : project description, incl. pipelines

|||

trigger -> controller : POST /projects/website/pipelines/getsource/+start (token D)
@enduml

The first pipeline has now been started by the trigger service.


Pipeline 1: get sources
-----------------------------------------------------------------------------

The first pipeline uses the trusted worker to fetch source code from
the git server (we assume that requires credentials), and push it
to the powerful worker.

@startuml
hide footbox
title Build pipeline: get source

trusty -> IDP : GET /auth, with Basic Auth, over https
IDP --> trusty : token E

|||

trusty -> controller : GET /worker/trusty (token E)
controller --> trusty : "clone website source into workspace"
trusty -> gitano : git clone
gitano --> trusty : website source code
trusty -> controller : POST /worker/trusty, exit=0 (token E)

|||

trusty -> controller : GET /worker/trusty (token E)
controller -> trusty : "notify trigger service pipeline is finished **successfully**"
trusty -> trigger : GET /pipelines/website/getsource, exit=0 (token E)
note right
  No need to have the trigger service query the controller, since
  it has been told the status of the pipeline by the worker.
end note
trusty -> controller : POST /worker/trusty, exit=0 (token E)
note right
  If the notification to the trigger service failed,
  this can be reported to the controller for logging.
end note
trigger -> controller : POST /projects/website/pipelines/ikiwiki/+start (token D)
@enduml

The first pipeline has finished, and the website build can start.
That's the second pipeline, which has just been started.


Pipeline 2: Build static web site
-----------------------------------------------------------------------------

The second pipeline runs on the same worker. The source is already
there, and the pipeline just needs to perform the build.

@startuml
hide footbox
title Build static website

trusty -> controller : GET /worker/trusty (token E)
controller -> trusty : "build static website"
trusty -> trusty : run ikiwiki to build site
trusty -> controller : POST /worker/trusty, exit=0 (token E)

|||

trusty -> controller : GET /worker/trusty (token E)
controller -> trusty : "notify trigger service pipeline is finished"
trusty -> controller : POST /worker/trusty, exit=0 (token E)
trusty -> trigger : GET /pipelines/website/ikiwiki (token E)
trigger -> controller : GET /projects/website/pipelines/ikiwiki (token D)
trigger -> controller : POST /projects/website/pipelines/publish/+start (token D)

@enduml

At the end of the second pipeline, we start the third one.

Pipeline 3: Publish web site to web server
-----------------------------------------------------------------------------

The third pipeline copies the built static website from the trusty
worker to the actual web server.

@startuml
hide footbox
title Copy built site from trusty to web server

trusty -> controller : GET /worker/trusty (token E)
controller -> trusty : "rsync static website to web server"
trusty -> webserver : rsync
trusty -> controller : POST /worker/trusty, exit=0 (token E)

|||

trusty -> controller : GET /worker/trusty (token E)
controller --> trusty : "notify trigger service pipeline is finished"
trusty -> controller : POST /worker/trusty, exit=0 (token E)
trusty -> trigger : GET /pipelines/website/publish (token E)
trigger -> controller : GET /projects/website/pipelines/publish (token D)
note right
  There are no further pipelines.
end note

@enduml

The website is now built and published.

Ick APIs
=============================================================================

APIs follow the RESTful style
-----------------------------------------------------------------------------

All the Ick APIs are [RESTful][].
Server-side state is represented by
a set of "resources". These are data objects that can be addressed using
URLs, and they are manipulated using HTTP methods (verbs): GET, POST,
PUT, DELETE. There can be many instances of a type of resource; these
are handled as a collection. Example: given a resource type for
projects Ick should build, the API would have the following calls:

    POST /projects -- create a new project, giving it an ID
    GET /projects -- get list of all project ids
    GET /projects/ID -- get info on project ID
    PUT /projects/ID -- update project ID
    DELETE /projects/ID -- remove a project

[RESTful]: https://en.wikipedia.org/wiki/Representational_state_transfer

Resources are all handled the same way, regardless of the type of the
resource. This gives a consistency that makes it easier to use the
APIs.

Note that the server doesn't store any client-side state at all. There
are no sessions, no logins, etc. Authentication is handled by attaching
a token (in the `Authorization` header) to each request. An identity
provider gives out the tokens to API clients, on request.

Note also that the API doesn't have RPC-style calls. The server end may
decide to do some action as a side effect of a resource being created
or updated, but the API client can't invoke the action directly. Thus,
there's no way to say "run this pipeline"; instead, there's a resource
showing the state of a pipeline, and changing that resource to say the
state is "triggered" instead of "idle" is how an API client tells the
server to run a pipeline.


Ick controller resources and API
-----------------------------------------------------------------------------

A project consists of a workspace specification and an ordered list
of pipelines. Additionally, the project has a list of builds, and for
each build a build log and metadata (time and duration of the build,
what triggered it, whether it was successful or not). There is also a
current state of the workspace.

A project resource:

    {
        "project": "liw.fi",
        "parameters": {
            "rsync_target": "www-data@www.example.com:/srv/http/liw.fi"
        },
        "workspace": {
            "gits": [
                {
                    "git": "ssh://git@git.liw.fi/liw.fi",
                    "branch": "master",
                    "dir": "src"
                }
            ]
        },
        "pipelines": [
            {
                "name": "workspace-setup",
                "actions": [
                    { "name": "clone-gits" }
                ]
            },
            {
                "name": "ikiwiki-config",
                "actions": [
                    { "shell": "cat src/ikiwiki.setup.template > ikiwiki.setup" },
                    { "shell": "echo \"destdir: {{ workspace }}/html\" >> ikiwiki.setup" },
                    { "name": "mkdir", "dirname": "html" }
                ]
            },
            {
                "name": "ikiwiki-run",
                "actions": [
                    { "shell": "ikiwiki --setup ikiwiki.setup" }
                ]
            },
            {
                "name": "rsync",
                "actions": [
                    { "shell": "rsync -a --delete html/. \"{{ rsync_target }}/.\"" }
                ]
            }
        ]
    }

Here:

- each pipeline consists of a sequence of steps
- each step is a shell snippet (expanded with jinja2) or a built-in
  operation implemented by the worker-manager directly
- project parameters are used by steps

A pipeline status resource at
`/projects/PROJECTNAME/pipelines/PIPELINENAME`, created automatically
when a project resource is updated to include the pipeline:

    {
        "status": "idle/triggered/running/paused"
    }

To trigger a pipeline, PUT a pipeline resource with a status field of
"triggered". It is an error to do that when the current status is not
idle.

A build resource is created automatically, at
`/projects/PROJECTNAME/builds`, when a pipeline actually starts (not
when it's triggered). It can't be changed via the API.

    {
        "build": "12765",
        "project": "liw.fi",
        "pipeline": "ikiwiki-run",
        "worker": "bartholomew",
        "status": "running/success/failure",
        "started": "TIMESTAMP",
        "ended": "TIMESTAMP",
        "triggerer": "WHO/WHAT",
        "trigger": "WHY"
    }

A build log is stored at `/projects/liw.fi/builds/12765/log` as a
blob. The build log is appended to by the worker-manager by reporting
output.
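The "triggered only from idle" rule for pipeline status resources can be
sketched as a small transition check. The status names mirror the
resource above; the allowed transitions other than idle-to-triggered are
assumptions about how running and paused might behave, and the function
itself is illustrative, not part of Ick2.

```python
# Allowed pipeline status transitions. Only the idle -> triggered rule
# is stated in the text; the rest are plausible guesses.
ALLOWED_TRANSITIONS = {
    ("idle", "triggered"),
    ("triggered", "running"),
    ("running", "idle"),
    ("running", "paused"),
    ("paused", "running"),
}


def update_status(current: str, requested: str) -> str:
    """Validate a PUT of a pipeline status resource."""
    if (current, requested) not in ALLOWED_TRANSITIONS:
        raise ValueError("cannot go from %s to %s" % (current, requested))
    return requested
```

A PUT requesting "triggered" while the pipeline is already running would
thus be rejected with an error, as the text requires.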

Workers are registered with the controller by creating a worker
resource. Later on, we can add useful metadata to the resource, but
for now we'll have just the name.

    {
        "worker": "bartholomew"
    }

A work resource tells a worker what to do next:

    {
        "project": "liw.fi",
        "pipeline": "ikiwiki-run",
        "step": {
            "shell": "ikiwiki --setup ikiwiki.setup"
        },
        "parameters": {
            "rsync_target": "..."
        }
    }

The controller provides a simple API to give work to each worker:

    GET /work/bartholomew
    PUT /work/bartholomew

The controller keeps track of which worker is currently running which
pipeline.

Work output resource:

    {
        "worker": "bartholomew",
        "project": "liw.fi",
        "pipeline": "ikiwiki-run",
        "exit_code": null,
        "stdout": "...",
        "stderr": "...",
        "timestamp": "..."
    }

When `exit_code` is non-null, the step has finished, and the
controller knows it should schedule the next step in the pipeline.


Known problems
=============================================================================

The architecture shown in this document for ALPHA-1 is not perfect. At
least the following things will probably need to be addressed in the
future. We've made compromises to gain simplicity and get something
working sooner, allowing things to be iterated on faster.

* It's not OK for all workers to be trusted with credentials to access
  all git repositories and all web servers.