


<!--

Copyright 2017 Lars Wirzenius

-->

---
title: Ick2, a CI system (architecture)
author: Lars Wirzenius
date: work-in-progress for ALPHA-1
...


Introduction
=============================================================================

Ick2 is a continuous integration (CI) system. It is being developed by
Lars Wirzenius and other people, for their own needs. It is very early
days. You don't want to use Ick2, but if you have opinions on what a
CI system should be like, feedback is welcome.

This document describes the architecture of Ick2. Specifically, the
architecture for the upcoming ALPHA-1 release, not further than that.
It is a capital mistake to design software before you have all the
requirements. It biases the judgment. You can rarely have all the
requirements a priori; you have to iterate to gather them. Designing
beyond one iteration is a mistake.

Background and justification
-----------------------------------------------------------------------------

This section should be written some day. In short, Lars got tired of
Jenkins, and all competitors seem insufficient or somehow
unpleasant. Then Daniel suggested a name and Lars is incapable of not
starting a project if given a name for it.


Overview
-----------------------------------------------------------------------------

A continuous integration (CI) or continuous deployment (CD) system is,
at its most simple core, an automated system that reacts to changes in
a program's source code by doing a build of the program, running any
of its automated tests, and then publishing the results somewhere. A
CD system continues from there by also installing the new version of
the program on all relevant computers. If a build or an automated test
fails, the system notifies the relevant parties.

Ick2 aims to be a CI/CD system. It deals with a small number of
concepts:

* **projects**, which consist of **source code** in a version control
  system (mainly git right now)
* **pipelines**, which are sequences of steps aiming to convert source
  code into something executable, or test the program
* **worker build hosts**, which do all the heavy lifting

The long-term goal for Ick2 is to provide a CI/CD system that can be
used to build and deploy any reasonable software project, including
building packages of any reasonable type. In our wildest dreams it'll
be scalable enough to build a full, large Linux distribution such as
Debian. We'll see.

Example
-----------------------------------------------------------------------------

We will be returning to this example throughout this document. Imagine
a static website that is built using the ikiwiki software. The source
of the web pages is stored in a git repo, and the generated HTML pages
are published on a web server.

This might be expressed as an Ick2 configuration like this:

    projects:
        website:
            workspace:
                - git: ssh://git@git.example.com/website.git
            pipelines:
                - name: getsource
                  steps:
                  - shell: git clone ssh://git@git.example.com/website.git src
                - name: ikiwiki
                  steps:
                  - shell: mkdir html
                  - shell: ikiwiki src html
                - name: publish
                  steps:
                  - shell: rsync -a --delete html/. www-user@www.example.com:/srv/http/.

Note that pipelines are defined in the configuration. Eventually, Ick2
may come with pre-defined libraries of pipelines that can easily be
reused, but it will always be possible for users to define their own.

Pipeline steps will not be able to use variables, in ALPHA-1. That's
probably going to be added later.
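
To make the shape of the configuration concrete, here is a minimal
sketch of the example above as a Python data structure, with a helper
that lists the steps in execution order. The key names mirror the
example; the `iter_steps` helper is hypothetical and only meant to
illustrate how pipelines and steps nest. It is not Ick2's actual
implementation.

```python
# Hypothetical in-memory form of the example configuration above.
# The key names mirror the example; nothing here is an Ick2 API.
CONFIG = {
    "projects": {
        "website": {
            "workspace": [{"git": "ssh://git@git.example.com/website.git"}],
            "pipelines": [
                {"name": "getsource",
                 "steps": [{"shell": "git clone ssh://git@git.example.com/website.git src"}]},
                {"name": "ikiwiki",
                 "steps": [{"shell": "mkdir html"},
                           {"shell": "ikiwiki src html"}]},
                {"name": "publish",
                 "steps": [{"shell": "rsync -a --delete html/. www-user@www.example.com:/srv/http/."}]},
            ],
        }
    }
}


def iter_steps(config, project_name):
    """Yield (pipeline name, shell command) pairs in execution order."""
    project = config["projects"][project_name]
    for pipeline in project["pipelines"]:
        for step in pipeline["steps"]:
            yield pipeline["name"], step["shell"]
```

Calling `list(iter_steps(CONFIG, "website"))` gives the four shell
steps, each paired with the name of the pipeline it belongs to.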


Ick2 ALPHA-1
=============================================================================

We are currently working on what will be called the ALPHA-1 version of
Ick2. This chapter outlines its intended functionality and the shape
of its architecture.


Ick2 ALPHA-1 definition
-----------------------------------------------------------------------------

This is the current working definition of the aim for the ALPHA-1
version of Ick2:

> ALPHA-1 of Ick2 can be deployed and configured easily, and can
> concurrently build multiple projects using multiple workers. Builds may be
> traditional builds from source code, may involve running unit tests
> or other build-time tests, may involve building Debian packages, and
> build artifacts are published in a suitable location. Builds may
> also be builds of static web sites or documentation, and those build
> artifacts may be published on suitable web servers. Builds happen on
> workers in reasonably well isolated, automatically maintained
> environments akin to pbuilder or schroot (meaning the sysadmin is
> not expected to set up the pbuilder base tarball, ick2 will do
> that).

Ick2 acceptance criteria
-----------------------------------------------------------------------------

Acceptance criteria for ALPHA-1:

* All Ick2 components and the workers are deployable using Ansible or
  similar configuration management tooling.

* At least two people (not only Lars) have set up a CI cluster to
  build at least two different projects on at least two workers. One
  of the projects should build documentation for ick2 itself, the
  other should build a .deb package of ick2. Bonus points for
  building other projects than ick2 as well.

* Builds get triggered automatically by a git server on any commit to
  the master branch.

* Build logs can be viewed while builds are running or afterwards via
  an HTTP API (perhaps wrapped in a command line tool). Bonus points
  if someone builds a web app on top of the API.

* A modicum of thought has been spent on security and the major
  contributors agree the security design is not idiotic. The goal is
  to be confident that a future version of Ick2 can be made reasonably
  secure, even if that doesn't happen for ALPHA-1.

* The workspace is constructed from several git repositories, e.g., so
  that the debian subdir comes from a different repo than the main
  source tree.

* The pipeline steps are not merely snippets of shell scripts to run.
  Instead, steps may name operations that get executed by the workers
  without specifying the implementation in the Ick2 project
  configuration.


Ick2 ALPHA-1 architecture
-----------------------------------------------------------------------------

The future architecture of Ick2 is a collection of mutually recursive
self-modifying microservices.

* A project consists of a description of the workspace, and one or
  more pipelines to be executed when triggered to do so. Each
  pipeline needs to be triggered individually. Each pipeline acts in
  the same workspace. The entire pipeline is executed on the same
  worker.

* The workspace description is, initially, a set of git repos and
  corresponding refs to clone (or update from) into a tree. Later
  (after ALPHA-1) the workspace may be built from multiple git repos,
  or artifacts of other builds, or other things that turn out to be
  useful.

  Accessing git repositories may require credentials that all specific
  workers will have.

* The workspace is, essentially, a directory tree, populated by files
  needed for doing a build. The "source tree" if you wish.

* The project's pipelines do things like: prepare workspace, run
  actual build, publish build artifacts from worker to a suitable
  server. The controller keeps track of where in each pipeline a
  build is.

* Workers are represented by worker-managers, which request work
  from the controller and perform the work by running commands locally
  or over ssh on the actual worker host. Worker-managers may be on the
  worker hosts or elsewhere, depending on what suits best for each CI
  cluster.

* Worker-managers register their workers with the controller. For
  ALPHA-1 all workers are assumed to be equivalent.

* A pipeline is a sequence of steps (such as shell commands to run),
  plus some requirements for what attributes the worker that runs the
  pipeline should have. All the steps of a pipeline get executed by
  the same worker.

* If a pipeline step fails, the controller will mark the pipeline
  execution as having failed and won't schedule more steps to execute.
  Likewise, later pipelines in the same project won't be executed. If
  the failure was transient (e.g., DNS lookup error), the user may
  trigger a rebuild manually (via the trigger service).
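
The failure rule in the last point can be sketched as follows. This is
a simplification under stated assumptions: in the architecture above
the controller hands out steps one at a time and tracks state, whereas
this hypothetical `run_pipeline` function runs all steps locally in one
go. It only illustrates "stop at the first failing step".

```python
import subprocess


def run_pipeline(steps, cwd=None):
    """Run shell steps in order; stop at the first non-zero exit code.

    Returns (succeeded, results), where results holds one
    (command, exit_code) pair per step that was actually run.
    """
    results = []
    for command in steps:
        completed = subprocess.run(command, shell=True, cwd=cwd)
        results.append((command, completed.returncode))
        if completed.returncode != 0:
            # A failed step ends the pipeline; later steps are not run.
            return False, results
    return True, results
```

Steps after the failing one never appear in `results`, which matches
the idea that the controller won't schedule them.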

Ick2 ALPHA-1 components
-----------------------------------------------------------------------------

Ick2 consists of several independent services. This document describes
how they are used individually and together.

* The **controller** keeps track of projects, build pipelines, workers,
  and the current state of each. It decides which build step is next,
  and who should execute it. The controller provides a simple,
  unconditional "build this pipeline" API call, to be used by the
  trigger service (see below).

* A **worker-manager** represents a **build host**. It queries the
  controller for work, and makes the build host (the actual worker)
  execute it, and then reports results back to the controller.

* The **trigger service** decides when a build should start. It polls
  the state of the universe, or gets notifications of changes of the
  same.

* The controller and trigger services provide an API. The **identity
  provider** (IDP) takes care of the authentication of each API
  client, and what privileges each should have. The API client
  authenticates itself to the IDP, and receives an access token. The
  API provider gets the token in each request, validates it, and
  inspects it to see what the client is allowed to do.

  A major point of the IDP is to have just a single place where
  authentication and authorisation is configured.
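
  As a rough illustration of the "inspects it" part, the sketch below
  decodes the claims of a JWT and checks for a required privilege. Two
  things are assumptions and not part of this design: that privileges
  are carried in a space-separated `scope` claim, and the function
  names. Signature verification is deliberately left out, since it
  needs a cryptography library; a real API provider must do it before
  trusting any claim.

```python
import base64
import json


def decode_claims(token):
    """Return the claims of a JWT WITHOUT verifying its signature.

    A real API provider must verify the signature against the IDP's
    public key first; that step is deliberately left out of this sketch.
    """
    header_b64, payload_b64, signature_b64 = token.split(".")
    # JWTs use unpadded base64url; restore the padding before decoding.
    padding = "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64 + padding))


def is_allowed(claims, required_scope):
    """Check the (assumed) space-separated "scope" claim for a privilege."""
    return required_scope in claims.get("scope", "").split()
```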

On an implementation level, the various services of Ick2 may be
implemented using any language and framework that works. However, to
keep things simple, initially we'll be using Python 3, Bottle, and
Green Unicorn. Also, the actual API implementation ("backend") will be
running behind haproxy, such that haproxy decrypts TLS and sends the
actual HTTP request over unencrypted localhost connections to the
backend.

@startuml
title Ick2 services


[git server] --> [trigger service] : notify of change
[trigger service] --> [controller] : start pipeline
[controller] <-- [worker manager] : get work, report result
[worker manager] --> [host] : execute command
[git server] --> [IDP] : get access token
[trigger service] .. [IDP] : get access token
[worker manager] .. [IDP] : get access token
@enduml

The API providing services will be running in a configuration like
this:

@startuml
title API arch
node service {
    component haproxy
    component backend
}
[API client] --> [haproxy] : HTTPS (TLS)
[haproxy] --> [backend] : HTTP over localhost
@enduml
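
A backend in this arrangement is just a plain-HTTP application. The
sketch below is a bare WSGI callable, the interface that Bottle
applications expose and Green Unicorn serves. It is not Ick2 code, and
it assumes the access token arrives in an `Authorization: Bearer ...`
header, which this document does not specify.

```python
def backend(environ, start_response):
    """Minimal WSGI app: reject requests that carry no bearer token.

    TLS is assumed to have been terminated by haproxy already, so the
    app only ever sees plain HTTP arriving over localhost.
    """
    auth = environ.get("HTTP_AUTHORIZATION", "")
    if not auth.startswith("Bearer "):
        start_response("401 Unauthorized", [("Content-Type", "text/plain")])
        return [b"missing access token\n"]
    # Token validation and the real API logic would go here.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]
```

Such an app would be served bound to a localhost address only, so that
nothing but haproxy can reach it.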


Individual APIs
=============================================================================

This chapter covers interactions with individual APIs.


On security
-----------------------------------------------------------------------------

All APIs are provided over TLS only. Access tokens are signed using public
key cryptography and the public part of the signing keys is provided
"somehow" to all API clients.


Getting an access token
-----------------------------------------------------------------------------

The API client (user's command line tool, a putative web app, git
server, worker-manager, etc) authenticates itself to the IDP, and if
successful, gets back a signed JSON Web Token. It will include the
token in all requests to all APIs so that the API provider will know
what the client is allowed to do.

The privileges for each API client are set by the sysadmin who
installs the CI system, or a user who's been given IDP admin
privileges by the sysadmin.

@startuml
hide footbox
title Get an access token
client -> IDP : GET /auth, with Basic Auth, over https
IDP --> client : signed JWT token
@enduml

All API calls need a token. Getting a token happens the same way for
every API client.
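
A minimal sketch of the client side of this exchange, using only the
Python standard library. The function name, the IDP URL, and the
credentials are placeholders. The request is built but not sent, so
the example does not depend on a running IDP.

```python
import base64
import urllib.request


def build_auth_request(idp_url, client_id, client_secret):
    """Build (but do not send) a GET /auth request using HTTP Basic Auth."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        idp_url.rstrip("/") + "/auth",
        headers={"Authorization": "Basic " + encoded},
        method="GET",
    )
```

Passing the resulting request to `urllib.request.urlopen` would
perform the call; the response body would then be the signed token.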


Worker (worker-manager) registration
-----------------------------------------------------------------------------

The sysadmin arranges to start a worker-manager for every build host.
They may run on the same host, or not: the Ick2 architecture doesn't
really care. If they run on the same host, the worker manager will
start a subprocess. If on different hosts, the subprocess will be
started using ssh.

The CI admin may define tags for each worker. Tags may include
things like whether the worker can be trusted with credentials for
logging into other workers, or for retrieving source code from the git
server. Workers may not override such tags. Workers may, however,
provide other tags, e.g., to report their CPU architecture or Debian
release. The controller will eventually be able to use the tags to
choose which worker should execute which pipeline steps.
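
The precedence rule for tags can be sketched like this, assuming tags
are simple key/value pairs (the document does not fix their format).
The function names are hypothetical.

```python
def effective_tags(admin_tags, worker_tags):
    """Combine tags so that admin-defined tags always win.

    Tags a worker reports are kept only for keys the CI admin has not set.
    """
    merged = dict(worker_tags)
    merged.update(admin_tags)
    return merged


def worker_matches(required, tags):
    """True if every required tag is present with the required value."""
    return all(tags.get(key) == value for key, value in required.items())
```

For example, a worker claiming `trusted: True` stays untrusted if the
CI admin has set `trusted: False`, while its self-reported `arch` tag
is kept.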

@startuml
hide footbox
title Register worker
worker_manager -> IDP : GET /auth, with Basic Auth, over https
IDP --> worker_manager : token A
worker_manager -> controller : POST /workers (token A)
controller --> worker_manager : success
@enduml

The worker manager runs a very simple state machine.

@startuml
title Worker-manager state machine

Querying : ask controller for work
Running : run subprocess


[*] -down-> Idle : start
Idle -down-> Querying : short timeout has expired
Querying -up-> Idle : nothing to do
Querying --> Running : something to do

Running --> Running : get output, report to controller
Running --> Idle : subprocess finished, report to controller
@enduml
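
The same state machine as a transition table, as a sketch. The state
names come from the diagram; the event names are hypothetical labels
for its arrows.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    QUERYING = "Querying"
    RUNNING = "Running"


# Event names are hypothetical labels for the diagram's transitions.
TRANSITIONS = {
    (State.IDLE, "timeout"): State.QUERYING,      # short timeout has expired
    (State.QUERYING, "no_work"): State.IDLE,      # nothing to do
    (State.QUERYING, "got_work"): State.RUNNING,  # something to do
    (State.RUNNING, "output"): State.RUNNING,     # report output to controller
    (State.RUNNING, "finished"): State.IDLE,      # subprocess finished
}


def next_state(state, event):
    """Return the state that follows `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in state {state.name}"
        ) from None
```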


Add project to controller
-----------------------------------------------------------------------------

The CI admin (or a user authorised by the CI admin) adds projects to
the controller to allow them to be built. This is done using a "CI
administration application", which initially will be a command line
tool, but may later become a web application as well. Either way, the
controller provides API endpoints for this.

@startuml
hide footbox
title Add project to controller

adminapp -> IDP : GET /auth, with Basic Auth, over https
IDP --> adminapp : token B
adminapp -> controller : POST /projects (token B)
controller --> adminapp : success or failure indication
@enduml


A full build
=============================================================================

Next we look at how the various components interact during a complete
build, using a single worker, which is trusted with credentials. We
assume the worker has been registered and projects added.

The sequence diagrams in this chapter have been split into stages, to
make them easier to view and read. Each diagram after the first one
continues where the previous one left off.

Although not shown in the diagrams, the same sequence is meant to work
when multiple projects are running concurrently on multiple workers.

Trigger build by pushing changes to git server
-----------------------------------------------------------------------------

@startuml
hide footbox
title Build triggered by git change

developer -> gitano : git push

gitano -> IDP : GET /auth, with Basic Auth, over https
IDP --> gitano : token C
gitano -> trigger : POST /git/website.git (token C)
note right
    Git server notifies
    trigger service that
    a git repo has changed
end note

|||

trigger -> IDP : GET /auth, with Basic Auth, over https
IDP --> trigger : token D
trigger -> controller : GET /projects (token D)
note right
    trigger service queries
    controller to get list
    of all projects, so it
    knows which builds to
    start
end note
controller --> trigger : list of projects

|||

trigger -> controller : GET /projects/website (token D)
note right
    trigger service
    gets project config
    so it knows what
    pipelines project has
end note
controller --> trigger : project description, incl. pipelines

|||

trigger -> controller : POST /projects/website/pipelines/getsource/+start (token D)
@enduml

The first pipeline has now been started by the trigger service.
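
The decision the trigger service makes in this stage can be sketched
as a pure function. How a notification such as `/git/website.git` is
matched to projects is not defined in this document; matching on the
last component of the workspace's git URL is an assumption made for
this example, as is the function name.

```python
def pipelines_to_start(changed_repo, projects):
    """Return the +start paths for projects whose workspace uses changed_repo.

    projects maps project name to a description shaped like the example
    configuration (a "workspace" list and a "pipelines" list).
    """
    paths = []
    for name, project in sorted(projects.items()):
        repos = [entry.get("git", "") for entry in project["workspace"]]
        # Assumption: a repo matches if its URL ends with the changed name.
        if any(repo.endswith("/" + changed_repo) for repo in repos):
            first = project["pipelines"][0]["name"]
            paths.append(f"/projects/{name}/pipelines/{first}/+start")
    return paths
```

For the example project, a change to `website.git` yields the single
path that the trigger service POSTs to in the diagram above.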


Pipeline 1: get sources
-----------------------------------------------------------------------------

The first pipeline uses the trusted worker to fetch source code from
the git server (we assume that requires credentials) into the
workspace on that worker.

@startuml
hide footbox
title Build pipeline: get source

trusty -> IDP : GET /auth, with Basic Auth, over https
IDP --> trusty : token E

|||

trusty -> controller : GET /worker/trusty (token E)
controller --> trusty : "clone website source into workspace"
trusty -> gitano : git clone
gitano --> trusty : website source code
trusty -> controller : POST /worker/trusty, exit=0 (token E)

|||

trusty -> controller : GET /worker/trusty (token E)
controller -> trusty  : "notify trigger service pipeline is finished **successfully**"
trusty -> trigger     : GET /pipelines/website/getsource, exit=0 (token E)
note right
    No need to have the trigger service query the controller since
    it has been told the status of pipeline by the worker.
end note
trusty -> controller  : POST /worker/trusty, exit=0 (token E)
note right
    If the notification to the trigger service failed,
    this can be reported to the controller for logging.
end note
trigger -> controller : POST /projects/website/pipelines/ikiwiki/+start (token D)
@enduml

The first pipeline finished, and the website building can start.
That's the second pipeline, which has just been started.


Pipeline 2: Build static web site
-----------------------------------------------------------------------------

The second pipeline runs on the same worker. The source is already
there and it just needs to perform the build.

@startuml
hide footbox
title Build static website

trusty -> controller : GET /worker/trusty (token E)
controller -> trusty : "build static website"
trusty -> trusty : run ikiwiki to build site
trusty -> controller : POST /worker/trusty, exit=0 (token E)

|||

trusty -> controller : GET /worker/trusty (token E)
controller -> trusty  : "notify trigger service pipeline is finished"
trusty -> controller  : POST /worker/trusty, exit=0 (token E)
trusty -> trigger     : GET /pipelines/website/ikiwiki (token E)
trigger -> controller : GET /projects/website/pipelines/ikiwiki (token D)
trigger -> controller : POST /projects/website/pipelines/publish/+start (token D)

@enduml

At the end of the second pipeline, we start the third one.

Pipeline 3: Publish web site to web server
-----------------------------------------------------------------------------

The third pipeline copies the built static website from the trusty
worker to the actual web server.

@startuml
hide footbox
title Copy built site from trusty to web server

trusty -> controller : GET /worker/trusty (token E)
controller -> trusty : "rsync static website to web server"
trusty -> webserver  : rsync
trusty -> controller : POST /worker/trusty, exit=0 (token E)

|||

trusty -> controller : GET /worker/trusty (token E)
controller --> trusty : "notify trigger service pipeline is finished"
trusty -> controller  : POST /worker/trusty, exit=0 (token E)
trusty -> trigger     : GET /pipelines/website/publish (token E)
trigger -> controller : GET /projects/website/pipelines/publish (token D)
note right
    There are no further pipelines.
end note

@enduml

The website is now built and published.
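
Across the three stages, the trigger service repeats one decision:
given the pipeline that just finished, which one starts next? A
minimal sketch, with a hypothetical function name, assuming pipelines
run in the order they are listed in the project:

```python
def next_pipeline(pipeline_names, finished, exit_code):
    """Return the pipeline to start after `finished`, or None.

    None means either the pipeline failed or there are no further
    pipelines. Raises ValueError if `finished` is not a known pipeline.
    """
    if exit_code != 0:
        # A failed pipeline stops the build; later pipelines are skipped.
        return None
    index = pipeline_names.index(finished)
    if index + 1 < len(pipeline_names):
        return pipeline_names[index + 1]
    return None
```

With the example project's pipelines `getsource`, `ikiwiki`, and
`publish`, a successful `getsource` leads to `ikiwiki`, and a
successful `publish` leads to nothing further.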

Known problems
=============================================================================

The architecture shown in this document for ALPHA-1 is not perfect. At
least the following things will probably need to be addressed in the
future. We've made compromises to gain simplicity and get something
working sooner, to allow things to be iterated (faster).

* It's not OK for all workers to be trusted with credentials to access
  all git repositories and all web servers.