[[!meta title="Architecture of Ick"]]

[[!toc levels=2]]

Introduction
=============================================================================

Ick is a tool to aid in continuous integration (CI). It may some day
also evolve into a tool for doing continuous deployment (CD).

This document describes the technical architecture of ick.
Specifically, the architecture for the upcoming ALPHA-6 release, but
not further than that. ALPHA-6 is meant to be usable by people other
than the primary developer.

What is continuous integration?
-----------------------------------------------------------------------------

Continuous integration is a software development style where changes
get integrated to the main line of development frequently. It is
contrasted with a style where features get developed in their own,
long-living branches and only get integrated when finished, often
after weeks or months of development. The latter often results in an
integration phase that is long and error-prone. Frequent integration
typically results in fast, painless integration, because not much has
changed.


Background and justification
-----------------------------------------------------------------------------

Ick's main developer learned to program in the 1980s, studied computer
science in the early 1990s, and has been working in the industry since
the mid-1990s. Very roughly, in the 1980s there was little if any
automated testing, some appeared in the late 1990s, and it became
prevalent in the early 2000s. As automated tests became more common,
it turned out that programmers keep forgetting to run them. Thus
automation was created to run the tests whenever the code changed.
This has since morphed into full-blown continuous integration systems,
which do builds, run tests, and possibly do other things.

A common CI system is Jenkins, originally called Hudson, but renamed
after Oracle bought Sun Microsystems. Jenkins is used widely in the
industry. Ick's main developer has not been happy with it, however,
and decided to write a new system. The current ick is the second
generation of ick. The first generation (only ever used by its
developer) was written in a two-week frenzy of hacking to get
something, anything that could replace Jenkins in specific use cases.
The first generation was just good enough to be used by its developer,
but not satisfactory otherwise. It also had a very awkward
architecture that among other things only allowed running one build at
a time, and did not work well as a service.

The second (current) generation of ick is a re-design from scratch,
keeping nothing of the first generation. It is explicitly aimed to
become a "hostable" system: to allow an ick instance to be a CI system
to a number of independent users.

The name "ick" was suggested by Daniel Silverstone in an IRC
discussion. He said "all CI systems are icky", and this prompted Lars
to name the first generation "ick".


Overview
-----------------------------------------------------------------------------

A continuous integration system is, at its simplest, an automated
system that reacts to changes in a program's source code by doing a
build of the program, running any of its automated tests, and then
publishing the results somewhere. A continuous deployment system
continues from there by also installing the new version of the program
on all relevant computers. If any step in the process fails, the
system notifies the relevant parties.

Ick aims to be a CI system. It deals with a small number of concepts:

* **projects**, which consist of **source code** in a version control
  system
* **pipelines**, which are reusable sequences of actions aiming to
  achieve some task (build program source code, run tests, etc)
* **workers**, which do all the actual work by executing pipeline
  actions
* **artifact store**, which holds results of project builds, and
  intermediary results used by the build
* **identity provider**, which handles authentication of users

The long-term goal for ick is to provide a CI/CD system that can be
used to build and deploy any reasonable software project, including
building packages of any reasonable type. In our wildest dreams it'll
be scalable enough to build a full, large Linux distribution such as
Debian. Also, it should be painless to deploy, operate, and use.

Example project
-----------------------------------------------------------------------------

We will be returning to this example throughout this document. Imagine
a static website that is built using the [ikiwiki][] software, using a
wrapper that also pushes the generated HTML files to a web server over
rsync. The source of the web pages is stored in a git repo, and the
generated HTML pages are published on a web server.

[ikiwiki]: http://ikiwiki.info/

This might be expressed as an Ick configuration like this:

    projects:
      - project: ick.liw.fi
        parameters:
            git_url: git://git.liw.fi/ick.liw.fi
            git_ref: master
            rsync_target: ickliwfi@ick.liw.fi:/srv/http/ick.liw.fi
        pipelines:
        - get_source
        - build_ikiwiki_site
        - publish_html

    pipelines:

      - pipeline: get_source
        parameters:
        - git_url
        - git_ref
        actions:
        - python: |
            import subprocess
            def R(*args):
              subprocess.check_call(*args, stdout=None, stderr=None)
            R(['git', 'clone', '-vb', params['git_ref'],
               params['git_url'], 'src'])
          where: host

      - pipeline: build_ikiwiki_site
        actions:
        - python: |
            import subprocess
            def R(*args):
              subprocess.check_call(*args, stdout=None, stderr=None)
            R(['ikiwiki', 'src/ikiwiki.setup'])
          where: host

      - pipeline: publish_html
        parameters:
        - rsync_target
        actions:
        - shell: |
            tgt="$(params | jq .)"
            rsync -a --delete html/. "$tgt"
          where: host

Note that pipelines are defined in the configuration by the user.
Eventually, ick will come with libraries of pre-defined pipelines that
can easily be reused, but it will always be possible for users to
define their own.


Ick architecture
=============================================================================

The architecture of ick is a collection of mutually recursive
self-modifying microservices. (That's intended to scare you off.)

* A project consists of one or more pipelines to be executed when
  triggered to do so. A project defines some parameters given to the
  pipelines. The user (or some other entity, such as a version control
  server) triggers a project, and ick will execute all the pipelines.
  Each pipeline acts in the same workspace. The entire pipeline is
  executed on the same worker. All workers are considered equal.

* There is no separate workspace description. Each project needs to
  construct the workspace itself, if it needs to. Each build starts
  with an empty directory as the workspace. The project needs to
  populate it by, say, `git clone` or by telling ick to fetch the
  contents of the previous build's workspace from the artifact store.

* The project's pipelines do things like: prepare workspace, run
  actual build, publish build artifacts from worker to a suitable
  server. The controller keeps track of where in each pipeline a
  build is.

* Each worker is represented by a worker-manager running on the worker
  host. It requests work from the controller and performs the work by
  running commands locally, and reporting output and exit code to the
  controller.

* Worker-managers register themselves with the controller using a
  secret set during deployment time. The secret allows them to
  authenticate themselves to the identity provider.

* A pipeline is a sequence of actions (such as shell or python
  snippets to run), plus some parameters that the actions can
  reference.

* If a pipeline action fails, the controller will mark the pipeline
  execution as having failed and won't schedule more steps to execute.


Ick components
-----------------------------------------------------------------------------

Ick consists of several independent services. This document describes
how they are used individually and together.

* The **controller** keeps track of projects, pipelines, workers,
  builds, and the current state of each. It decides which build action
  is next, and who should execute it. The controller provides a
  simple, unconditional "build this project" API call, which the user
  can use.

* A **worker-manager** represents and directly controls a **build
  host**. It queries the controller for work, executes the related
  action on its build host, and then reports results back to the
  controller. Results consist of any output (stdout, stderr) and the
  exit code.

* An **artifact store** stores individual files (which may be tar
  files). As an example, the container system tree (see below) will be
  stored in the artifact store.

* The controller and artifact store provide an API. The **identity
  provider** (IDP) takes care of authenticating each API client, and
  of what privileges each should have. The API client authenticates
  itself to the IDP, and receives an access token. The client includes
  the access token with each call to an API, and the API provider
  validates the token and inspects it to see what the client is
  allowed to do.

* The **identity provider** (IDP) authenticates the user, ick
  components, and other API users. The authenticated entity gets an
  **access token**, and each API provider (controller, artifact store,
  etc) accepts API requests if accompanied with a valid access token.

  We use the Qvisqve software as the IDP.

* The **notification service** provides an API to send out
  notifications about ended builds to users. The API is used by the
  controller via a worker, when a build ends.

* The **APT repository** provides access to Debian packages built by
  the ick instance, so that users can install them easily. (Note that
  this does not make ick Debian specific. Adding support for the
  equivalent repository for, say, RPM packages is possible, and will
  hopefully happen not too far in the future.)

* The **icktool** command line tool provides the ick user interface.
  It gets an access token from the identity provider, and uses the
  controller and artifact store APIs to manage project and pipeline
  descriptions, build artifacts, trigger builds, and view build
  status.

On an implementation level, the various services of ick may be
implemented using any language and framework that works. However, to
keep things simple, currently we are using Python 3, Bottle, and Green
Unicorn. Also, the actual API implementation ("backend") will be
running behind haproxy, such that haproxy terminates TLS and sends the
actual HTTP request over unencrypted localhost connections to the
backend.

In addition to the actual components of ick, a few other entities are
relevant:

* An **SMTP server** is needed to send notifications. This is not part
  of ick, and access to an external server is needed.

* A **git server** is external to ick. It is expected to trigger
  builds when a repository changes. Any git server will do, as long as
  an ick worker can access it.

* The **end user** (developer) defines projects and pipelines, and is
  generally an important part of the ick view of the universe.

    @startuml
    title Ick components

    node "Mail server" {
        [SMTP server] as smtp_server
    }

    node "IDP" {
        [haproxy] as qvisqve_haproxy
        [qvisqve] as qvisqve_backend
        [qvisqve_haproxy] -down-> [qvisqve_backend]
    }

    node "Controller" {
        [haproxy] as controller_haproxy
        [controller] as controller_backend
        [controller_haproxy] -down-> [controller_backend]
        [controller_backend] -down-> [local_disk]
    }

    node "Artifact store" {
        [haproxy] as as_haproxy
        [artifact store] as as_backend
        [as_haproxy] -down-> [as_backend]
    }

    node "APT repository" {
         [http server] as apt_httpd
         [.deb repository] as apt_repo
         [ssh server] as apt_ssh
         [apt_ssh] -left-> apt_repo
         [apt_httpd] -right-> apt_repo
    }

    node "Notification service" {
        [haproxy] as notif_haproxy
        [notifier] as notif_backend
        [notif_haproxy] -down-> [notif_backend]
        [notif_backend] --> [smtp_server]
    }

    node "Worker" {
        [worker manager] as worker_manager
        [local process] as local_host
        [worker_manager] -left-> [controller_haproxy]
        [worker_manager] -right-> [local_host]
        [worker_manager] -down-> [qvisqve_haproxy]
        [worker_manager] -down-> [as_haproxy]
        [worker_manager] -down-> [notif_haproxy]
        [worker_manager] -up-> [apt_ssh]
    }

    [git server] as git_server
    [End user] as user
    [git_server] -down-> [controller_haproxy]
    [user] -down-> [controller_haproxy]
    [user] -down-> [apt_httpd]
    @enduml


Individual APIs
=============================================================================

This chapter covers interactions with individual APIs.


On security
-----------------------------------------------------------------------------

All APIs are provided over TLS only. Access tokens are signed using
public-key cryptography, and the public part of the signing keys is
provided to all API providers at deployment time.

The access tokens contain the identity of the API client and possibly
the end-user, and a list of "scopes", which define what the bearer of
the token can do. Each API call has its own scope (HTTP method, plus
path component of the URL).
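
As an illustration, an API provider might validate a token and check a
scope along these lines. This is a sketch only: it assumes the PyJWT
library, and the scope naming scheme shown here is a made-up example,
not ick's actual scheme.

    # Sketch: validate a signed access token and check a scope.
    # Assumes PyJWT; the scope naming below is illustrative only.
    import jwt

    def is_allowed(token, public_key, method, path):
        try:
            claims = jwt.decode(token, public_key, algorithms=['RS256'])
        except jwt.InvalidTokenError:
            return False
        # Hypothetical scope format: HTTP method plus the path
        # component of the URL, e.g. "GET /projects".
        wanted = '{} {}'.format(method, path)
        return wanted in claims.get('scope', '').split()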


Getting an access token: icktool and OAuth2
-----------------------------------------------------------------------------

Ick uses [Qvisqve][] as the IDP solution. For non-interactive API
clients, which act independently of an end-user, the [OAuth2][]
protocol is used, and in particular the "client credentials grant"
variant.

[OAuth2]: https://oauth.net/2/

The API client (`icktool`, worker-manager) authenticates itself to the
IDP, and if successful, gets back a signed JSON Web Token. It will
include the token in all requests to all APIs so that the API provider
will know what the client is allowed to do.

The privileges for each API client are set by the sysadmin who
installs the CI system.

    @startuml
    hide footbox
    title Get an access token
    client -> IDP : GET /auth, with Basic Auth, over https
    IDP --> client : signed JWT token
    @enduml
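
As a sketch, a non-interactive client might fetch a token with the
requests library like this. The token endpoint path and parameter
names follow the standard OAuth2 client credentials grant; the diagram
above shows the exchange only schematically.

    # Sketch: OAuth2 client credentials grant with requests.
    # The '/token' endpoint path is an assumption.
    import requests

    def get_token(idp_url, client_id, client_secret, scopes):
        r = requests.post(
            idp_url + '/token',
            auth=(client_id, client_secret),  # HTTP Basic Auth
            data={
                'grant_type': 'client_credentials',
                'scope': ' '.join(scopes),
            })
        r.raise_for_status()
        return r.json()['access_token']  # a signed JWT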

All API calls need a token, and getting a token happens the same way
for every API client. There are three exceptions:

* The call to get an access token.
* Getting the version from the controller, which includes the URL to
  the IDP.
* Triggering a project to build. This is temporarily un-authenticated,
  to avoid having to distribute API credentials to the git server.
  This will be fixed later.


Getting an access token: ickui and OpenID Connect
-----------------------------------------------------------------------------

For use cases where an end-user uses ick interactively, via a web user
interface, the [OpenID Connect][] (OIDC) protocol is used, in
particular the "authorization code flow" variant. This is somewhat
more complicated than the client credentials grant for non-interactive
use.

[OpenID Connect]: https://openid.net/specs/openid-connect-core-1_0.html

In summary, there are five entities involved:

* the end-user who owns (in a legal sense) the resources involved
* the "resource server" where the resources technically are: this
  means the controller and artifact store, and possibly other ick
  components that hold data on behalf of the end-user
* the IDP (Qvisqve), which authenticates the end-user and gives out
  access tokens that allow the bearer of the access token to do things
  with the user's resources
* the front-end running in the end-user's web browser; this is
  Javascript and other data loaded into the browser
* a "facade" that sits between the browser and the resource servers

The facade is necessary for security. We do not trust the browser to
keep an access token secure from malware running on the end-user's
machine or device, including in the browser. The facade runs on what
is assumed to be a more secure machine, and can thus be trusted with
the access token. The facade can also provide a more convenient API
for the front-end than what the actual resource servers provide. The
facade makes HTTP requests to resource servers on behalf of the
front-end, and includes the access token to those.

### OIDC protocol overview

* User initiates login by clicking on a "login" link in the front-end
  UI; or else the facade initiates this, when its access token
  expires. Either way, the browser makes a request to the facade's
  login endpoint.
* Facade redirects user's browser to Qvisqve.
    * This is called the "authorization request". It includes some
      data that's needed to prevent various security intrusions.
    * Also includes information of what kind of access is wanted
      ("scopes").
* Qvisqve lets user authenticate themselves.
    * Username and password for now, other methods will be added
      later.
* Qvisqve redirects user's browser back to the facade.
    * This includes an "authorization code", which can be used a
      single time by the facade. The browser will see the
      authorization code, but since it can be used only once, the
      consequences of the code leaking are tolerable. (And also, the
      authorization code is useless on its own.)
* Facade retrieves an access token from Qvisqve.
    * Facade authenticates itself to Qvisqve using a pre-registered
      client id and secret. The request includes the authorization
      code.
* Facade uses access token to use resource servers.


### The authorization request

The authorization request has the following parameters:

* REQUIRED: `scope`. MUST include `openid`. If it is not there,
  behaviour is unspecified; Qvisqve should return an error. Any other
  scope values will be included in the access token, if Qvisqve is
  configured to allow them for the user and application.

* REQUIRED: `response_type`. Must be `code`.

* REQUIRED: `client_id`. An id Qvisqve knows. If unknown, Qvisqve
  returns an error. This is the client id for the facade.

* REQUIRED: `redirect_uri`. MUST exactly match one of the
  callback-URIs pre-registered for the application with Qvisqve.

* RECOMMENDED: `state`. The facade will generate this; Qvisqve will
  require it, and will return an error if it is missing. Needed for
  security (XSRF mitigation).

* Qvisqve will ignore any other parameters, for now. The OIDC protocol
  defines a bunch, and they may be useful later.


### The authorization code flow: Protocol messages

User clicks on a "login" link, or facade gets an error indicating the
access token it was using has expired. In either case, the facade is
doing something in response to an HTTP request from the browser.

Facade initiates login with "Authorization request" by returning a 302
(moved temporarily) response, with a Location header like this:

    HTTP/1.1 302 Found
    Location: https://qvisqve/auth?
        response_type=code
        &scope=openid
        &client_id=CLIENTID
        &state=RANDOMSTRING
        &redirect_uri=CALLBACKURI

Here, `CLIENTID` is the client id the facade has for accessing
Qvisqve, and `CALLBACKURI` is a URL pre-registered with Qvisqve for
the facade. `RANDOMSTRING` is a large random value (such as a UUID4),
which the facade generates and remembers.
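
As a sketch, the facade might construct that redirect URL like this,
using only the parameters described above:

    # Sketch: build the authorization request URL for the redirect.
    # client_id and callback_uri come from the facade's configuration.
    import uuid
    from urllib.parse import urlencode

    def authorization_url(qvisqve_url, client_id, callback_uri):
        state = str(uuid.uuid4())  # RANDOMSTRING; remember for later
        params = {
            'response_type': 'code',
            'scope': 'openid',
            'client_id': client_id,
            'state': state,
            'redirect_uri': callback_uri,
        }
        return qvisqve_url + '/auth?' + urlencode(params), state

The facade stores the returned `state` value so it can compare it
against the one that comes back with the callback.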

The browser follows the redirect, and Qvisqve checks the request
parameters. If it looks OK, Qvisqve creates an "authorization attempt
object", and stores `CLIENTID`, `RANDOMSTRING`, and `CALLBACKURI` in
it. It also generates an object id (a UUID4).

Qvisqve returns to the browser a login form, asking for username and
password, plus a hidden form field with the authorization attempt
object id.

The user fills in the login form, and submits it. The submitted form
includes the authorization attempt object id field. Qvisqve checks
that the id value corresponds to an existing authorization attempt
object, and that the credentials are valid. If so, it creates an
authorization code, stores that into the authorization object, and
responds to the form submission with:

    HTTP/1.1 302 Found
    Location: CALLBACKURI?code=AUTHZCODE&state=RANDOMSTRING

Here, `CALLBACKURI` and `RANDOMSTRING` are the ones retrieved from the
authorization object.

Browser follows the redirect to the facade. The facade checks that
`RANDOMSTRING` matches one it remembered earlier, and extracts
`AUTHZCODE` from the request URI.

The facade then makes a request to Qvisqve to get actual tokens:

    POST /token HTTP/1.1
    Host: qvisqve
    Authorization: Basic czZCaGRSa3F0MzpnWDFmQmF0M2JW
    Content-Type: application/x-www-form-urlencoded

    grant_type=authorization_code&code=AUTHZCODE&redirect_uri=CALLBACKURI

Here, `AUTHZCODE` comes from the request from the browser, and
`CALLBACKURI` is the same URI as given in the initial authorization
request. Note that the `Authorization` header Basic Auth encodes the
client id and client secret for the facade, as registered with
Qvisqve. The client id must be the same as given in the initial
authorization request.

Qvisqve checks the client's credentials (`Authorization` header), and
that the `AUTHZCODE` is one it has generated and that hasn't yet been
used, and that `CALLBACKURI` is registered with the facade. If all is
well, it responds to the facade's token request:

    HTTP/1.1 200 OK
    Content-Type: application/json;charset=UTF-8
    Cache-Control: no-store
    Pragma: no-cache

    {
        "access_token":"ACCESSTOKEN",
        "token_type":"bearer",
        "expires_in":3600,
    }

Here, `ACCESSTOKEN` is the access token, a signed JSON Web Token,
which the facade will use in all future requests to resource servers.
Scopes in the access token are those listed in the `scope` parameter
in the initial authorization request, or the subset Qvisqve is
configured to grant to the user.

When returning the access token, Qvisqve destroys the authorization
object, so that any further use of the authorization attempt object id
(via the login form's hidden id field, or the authorization code) will
fail.


### The authorization code flow: Sequence diagram

The "happy path" of the authorization code flow, as a UML sequence
diagram.

    @startuml

    actor enduser
    participant browser
    participant facade
    participant qvisqve
    participant resources

    group Successful login

        enduser -> browser : click on "login" in application
        activate browser
            browser -> facade  : GET https://facade/login
            activate facade
                facade -> facade : create, remember STATE value
                facade  -> browser : Redirect to https://qvisqve/auth?manyparams
            deactivate facade

            browser -> qvisqve : GET https://qvisqve/auth?manyparams
            activate qvisqve
                qvisqve -> qvisqve : valid request?\ncreate authz object\nwith STATE, etc
                qvisqve -> browser : login form with authz object id
            deactivate qvisqve
            browser -> enduser : show login form
        deactivate browser

        enduser -> browser : enter credentials, click on submit
        activate browser
            browser -> qvisqve : POST https://qvisqve/auth
            activate qvisqve
                qvisqve -> qvisqve : valid request (authz obj id)?\ncredentials OK?\ncreate authz code
                qvisqve -> browser : redirect to https://facade/callback?code=CODE&state=STATE
            deactivate qvisqve

            browser -> facade  : https://facade/callback?code=CODE&state=STATE
            activate facade
                facade -> facade : valid request? (STATE matches)
                facade -> qvisqve : POST https://qvisqve/token with params, Basic Auth
                activate qvisqve
                    qvisqve -> qvisqve : client creds OK? params OK?\ncreate access token\ndestroy authz obj
                    qvisqve -> facade  : access token
                deactivate qvisqve
                facade  -> facade  : create cookie, associate with access token
                facade  -> browser : logged in page, cookie
            deactivate facade
           browser -> enduser : show "you're logged in!"
       deactivate browser

    group User is already logged in

        enduser -> browser : click on something that requires being logged in
        activate browser
            browser -> facade  : GET https://facade/something
            activate facade
                facade -> facade : got valid cookie?
                facade -> resources : GET https://resources/something with access token
                activate resources
                    resources -> resources : check access token
                    facade <- resources : some resource
                deactivate resources
                facade -> browser : some thing
            deactivate facade
            browser -> enduser : show thing to user
        deactivate browser

    end

    @enduml

(This should be the same protocol as described in prose above.)

The worker-manager
-----------------------------------------------------------------------------

The sysadmin arranges to start a worker-manager on every build host
and installs IDP credentials for each worker-manager.

    @startuml
    hide footbox
    title Register worker
    worker_manager -> IDP : GET /auth, with Basic Auth, over https
    IDP --> worker_manager : token A
    worker_manager -> controller : POST /workers (token A)
    controller --> worker_manager : success
    @enduml

The worker manager runs a very simple state machine.

    @startuml
    title Worker-manager state machine

    Querying : ask controller for work
    Running : run subprocess


    [*] -down-> Idle : start
    Idle -down-> Querying : short timeout has expired
    Querying -up-> Idle : nothing to do
    Querying --> Running : something to do

    Running --> Running : get output, report to controller
    Running --> Idle : subprocess finished, report to controller
    @enduml
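
In code, this state machine amounts to a polling loop along the
following lines. This is a sketch only: the `api` object is a
hypothetical wrapper that attaches the access token to each request,
and the incremental reporting of partial output (exit code null) is
omitted.

    # Sketch of the worker-manager's polling loop.
    import subprocess
    import time

    def work_loop(api, poll_interval=5):
        while True:
            work = api.get('/work')  # Querying
            if not work:
                time.sleep(poll_interval)  # Idle
                continue
            step = work['step']  # Running
            p = subprocess.run(
                ['sh', '-c', step['shell']],
                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            api.post('/work', {  # report to controller
                'exit_code': p.returncode,
                'stdout': p.stdout.decode(),
                'stderr': p.stderr.decode(),
            })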

The worker manager can execute a number of different actions. Some of
these are built into the worker manager itself, and some require
executing an external program. It can run the action on the host, in a
chroot, or in a container.


Add project to controller
-----------------------------------------------------------------------------

The CI admin (or a user authorised by the CI admin) adds projects to
the controller to allow them to be built. This is done using
`icktool`. The controller provides API endpoints for this.

    @startuml
    hide footbox
    title Add project to controller

    adminapp -> IDP : GET /auth, with Basic Auth, over https
    IDP --> adminapp : token B
    adminapp -> controller : POST /projects (token B)
    controller --> adminapp : success or failure indication
    @enduml

Pipeline descriptions are added in the same way, except using
different resources and endpoints.
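
With the requests library, adding a project might look like the
following sketch; the resource body would follow the example project
earlier in this document, and `token` is an access token obtained from
the IDP.

    # Sketch: add a project resource to the controller, roughly as
    # icktool does it. 'token' is an access token from the IDP.
    import requests

    def add_project(controller_url, token, project):
        r = requests.post(
            controller_url + '/projects',
            json=project,
            headers={'Authorization': 'Bearer ' + token})
        r.raise_for_status()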


A full build
=============================================================================

Next we look at how the various components interact during a complete
build, using a single worker, which is trusted with credentials to
external systems. We assume the worker has been registered and
projects added.

The sequence diagrams in this chapter have been split into stages, to
make them easier to view and read. Each diagram continues where the
previous one left off.

Although not shown in the diagrams, the same sequence is meant to work
when multiple projects run concurrently on multiple workers.

Trigger build by pushing changes to git server
-----------------------------------------------------------------------------

    @startuml
    hide footbox
    title Build triggered by git change

    developer -> gitano : git push

    gitano -> controller : GET /projects/foo/+trigger (no auth)
    note right
        Git server notifies
        controller that a
        project needs to be
        built
    end note

    @enduml

The project has now been marked by the controller as triggered.


Pipeline: `get_source`
-----------------------------------------------------------------------------

The first pipeline uses the worker to fetch the source code from the
git server (we assume that requires credentials) into the workspace.

    @startuml
    hide footbox
    title Build pipeline: get source

    worker -> IDP : GET /auth, with Basic Auth, over https
    IDP --> worker : token E

    |||

    worker -> controller : GET /work (token E)
    controller --> worker : "clone website source into workspace"

    |||

    worker -> gitano : git clone
    worker -> controller : POST /work, exit=null (token E)
    note right
        Report partial
        output
    end note
    gitano --> worker : website source code
    worker -> controller : POST /work, exit=0 (token E)
    @enduml

The first pipeline has finished, and the website build can start.


Pipeline: `build_ikiwiki_site`
-----------------------------------------------------------------------------

The second pipeline runs on the same worker. The source is already
there and it just needs to perform the build.

    @startuml
    hide footbox
    title Build static website

    worker -> controller : GET /work (token E)
    controller -> worker : "build static website"
    worker -> worker : run ikiwiki to build site
    note right
        Running happens
        directly on the
        host in the
        example.
    end note
    worker -> controller : POST /work, exit=0 (token E)

    @enduml

At the end of the second pipeline, we start the third one.


Pipeline: `publish_html`
-----------------------------------------------------------------------------

The third pipeline copies the built static website from the worker to
the actual web server.

    @startuml
    hide footbox
    title Copy built site from worker to web server

    worker -> controller : GET /work (token E)
    controller -> worker : "rsync static website to web server"
    worker -> webserver  : rsync
    worker -> controller : POST /work, exit=0 (token E)

    @enduml

The website is now built and published. The controller won't give the
worker anything else to do until a new build is started.



Ick APIs
=============================================================================

APIs follow the RESTful style
-----------------------------------------------------------------------------

All the Ick APIs are [RESTful][]. Server-side state is represented by
a set of "resources": data objects that can be addressed using URLs
and manipulated using the HTTP methods GET, POST, PUT, and DELETE.
There can be many instances of a type of resource; these are handled
as a collection. Example: given a resource type for projects ick
should build, the API would have the following calls:

* `POST /projects` &ndash; create a new project, giving it an ID
* `GET /projects` &ndash; get list of all project ids
* `GET /projects/ID` &ndash; get info on project ID
* `PUT /projects/ID` &ndash; update project ID
* `DELETE /projects/ID` &ndash; remove a project

[RESTful]: https://en.wikipedia.org/wiki/Representational_state_transfer

Resources are all handled the same way, regardless of the type of the
resource. This gives a consistency that makes it easier to use the
APIs.

Except for blobs, all resources are in the JSON format. Blobs are just
sequences of bytes and don't have structure. Build artifacts and build
logs are blobs.

Note that the server doesn't store any client-side state at all. There
are no sessions, no logins, etc. Authentication is handled by
attaching (in the `Authorization` header) a token to each request. The
identity provider gives out the tokens to API clients, on request.

Note also that the API doesn't have RPC-style calls. The server end
may decide to do some action as a side effect of a resource being
created or updated, but the API client can't invoke the action
directly. Thus, there's no way to say "run this pipeline"; instead,
there's a resource showing the state of a pipeline, and changing that
resource so its state is "triggered" instead of "idle" is how an API
client tells the server to run a pipeline.


Ick controller resources and API
-----------------------------------------------------------------------------

See the example project for examples. Each item in the `projects` and
`pipelines` lists is a resource. The example is in YAML syntax, but is
trivially converted to JSON, which the API talks. (The example is
input to the `icktool` command and is meant to be human-editable; YAML
is better for that than JSON.)

For a fuller description of the APIs, see the [yarn][] scenario tests
in the ick source code: <http://git.liw.fi/ick2/tree/yarns>

A build resource is created automatically at `/builds/BUILDID` when a
project is triggered. It can't be changed via the API.

    {
        "project": "liw.fi",
        "build_id": "liw.fi/12765",
        "build_number": 12765,
        "log": "logs/liw.fi/12765",
        "parameters": {},
        "pipeline": "ikiwiki-run",
        "worker": "bartholomew",
        "status": "building",
    }

A build log is stored at `/logs/liw.fi/12765` as a blob. The
worker-manager appends to it by reporting output.

Workers are registered to the controller by creating a worker
resource. Later on, we can add useful metadata to the resource, but
for now we'll have just the name.

    {
        "worker": "bartholomew"
    }

A work resource tells a worker what to do next:

    {
        "project": "liw.fi",
        "pipeline": "ikiwiki-run",
        "step": {
            "shell": "ikiwiki --setup ikiwiki.setup"
        },
        "parameters": {
            "rsync-target": "..."
        }
    }

The controller provides a simple API to give work to each worker:

    GET /work

The controller identifies the worker from the access token.

The controller keeps track of which worker is currently running each
pipeline.

Work output resource:

    {
        "worker": "bartholomew",
        "project": "liw.fi",
        "pipeline": "ikiwiki-run",
        "exit_code": null,
        "stdout": "...",
        "stderr": "...",
        "timestamp": "..."
    }

When `exit_code` is non-null, the step has finished, and the
controller knows it should schedule the next step in the pipeline.
If `exit_code` is a non-zero integer, the action failed.
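
In other words, the controller's handling of a work output resource
might be sketched like this; `build` stands for the controller's
internal record of the build, and persistence is omitted.

    # Sketch: controller-side handling of a work output resource.
    def handle_work_output(build, output):
        build.append_to_log(output['stdout'], output['stderr'])
        exit_code = output['exit_code']
        if exit_code is None:
            return  # step still running; this was partial output
        if exit_code != 0:
            build.status = 'failed'  # schedule no more steps
        elif build.has_more_steps():
            build.schedule_next_step()  # give next action to a worker
        else:
            build.status = 'done'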