---
title: Implementing CI/CD around GitLab
author: Lars Wirzenius
documentclass: article
...

# Introduction

![components](component.svg)\ 

This is the plan for the **first iteration** of implementing a CI/CD
system around GitLab for WMF. This iteration tries to do the least
amount of work needed to prove that the planned architecture is
workable. There are some small differences from the architecture in
the planning document.

The components are:

* **Gerrit** (or any git server): this is the canonical location for
  the source code. It emits *events* that the controller reacts to.
  Also, the controller sends messages to it. This will be mocked in
  the first iteration by using any dumb git server.
* **controller**: this orchestrates builds and deployments; in the
  first iteration, it won't be listening to Gerrit events, and will
  have an HTTP API for triggering builds instead
* **GitLab**: this is used only for its CI/CD functionality; a
  secondary copy of the git repository is kept here, because GitLab
  requires it
* **Runner**: this is used by GitLab to run builds and tests; it
  uploads built binaries to the artifact store; this corresponds to
  "build worker" in the planning document
* **artifact store**: this stores binaries or other build artifacts so
  they persist when the Runner goes away, given the Runner is a Docker
  container and has no persistent storage
* **VCS worker**: this retrieves source code from Gerrit (or other git
  server) and pushes it to GitLab; it's a separate system so it can be
  given credentials to access non-public git repositories
* **deployment worker**: this gets binaries from the artifact store
  and deploys them to the test environment
* **test environment**: this mocks a production-like environment for
  running sites and services

Differences from the planning document:

* There's GitLab to run builds, rather than the controller commanding
  build workers directly.
* There's no log store. This isn't necessary for the first iteration.
* There's only one environment, the test environment, and it won't be
  running sites or services. Deployment is simulated by merely
  publishing the build artifacts in the test environment.

# New components

There will need to be some new components. We'll keep them as simple
as possible. Most will have a simple HTTP API. We'll use signed JWT
access tokens, which will be generated statically and installed when
the components are set up. (This is highly inadequate for production,
but this is just the first iteration.)
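
As a rough illustration of what a statically generated, signed token
could look like, here is a minimal sketch using only the Python
standard library. It assumes HS256 (HMAC-SHA256 with a shared secret),
which this plan does not specify; the function names are hypothetical,
and the sketch checks neither the header's `alg` field nor any expiry
claim.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_token(claims: dict, secret: bytes) -> str:
    """Create a compact HS256-signed JWT carrying the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}  # assumed algorithm
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    sig = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(sig)


def verify_token(token: str, secret: bytes) -> dict:
    """Return the claims if the signature matches, else raise ValueError."""
    try:
        header_b64, claims_b64, sig_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token")
    signing_input = (header_b64 + "." + claims_b64).encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    # compare_digest avoids leaking where the signatures first differ.
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    return json.loads(_b64url_decode(claims_b64))
```

A token would be created once with `make_token` at setup time and
installed on the component that needs it; the receiving component
calls `verify_token` on each request.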

All HTTP APIs will be served over HTTPS using a TLS certificate from
Let's Encrypt, for the first iteration.

None of the components will aim to be fast or to serve many clients,
in the first iteration. They'll be implemented using haproxy (TLS),
bottle.py, and some custom Python code. All of this may change after
the first iteration, but these tools are familiar to me so I can just
use them, and don't have to learn stuff to get started.

## Controller

* Simple HTTP API
* Endpoint: POST /cd, body specifies which repo and ref to build and
  deploy; queues the build; the queued build will be visible via the
  /status endpoint

        {
            "repo": "...",
            "ref": "...",
            "artifact": "..."
        }

* Endpoint: GET /status, which lists which jobs (posts to /cd) are
  queued, running, or finished

        {
            "builds": {
                "id1": {"repo":"...", "ref":"...", "artifact":"..."},
                "id2": {"repo":"...", "ref":"...", "artifact":"..."},
                "id3": {"repo":"...", "ref":"...", "artifact":"..."}
            },
            "queued": ["id1", ...],
            "building": ["id2", ...],
            "finished": ["id3", ...]
        }
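
The bookkeeping behind these two endpoints could be as small as the
following sketch. It is an assumption, not the planned implementation:
the class and method names are hypothetical, state is kept in memory
only, and `builds` is treated as a mapping from build id to the
request that created it.

```python
import itertools


class BuildQueue:
    """In-memory record of builds and their states (lost on restart)."""

    FIELDS = ("repo", "ref", "artifact")

    def __init__(self):
        self._ids = itertools.count(1)
        self.builds = {}     # build id -> {"repo", "ref", "artifact"}
        self.queued = []
        self.building = []
        self.finished = []

    def submit(self, body: dict) -> str:
        """Handle a POST /cd body: validate it and queue a build."""
        missing = [k for k in self.FIELDS if k not in body]
        if missing:
            raise ValueError("missing fields: " + ", ".join(missing))
        build_id = "id%d" % next(self._ids)
        self.builds[build_id] = {k: body[k] for k in self.FIELDS}
        self.queued.append(build_id)
        return build_id

    def start_next(self):
        """Move the oldest queued build to 'building'; None if none queued."""
        if not self.queued:
            return None
        build_id = self.queued.pop(0)
        self.building.append(build_id)
        return build_id

    def finish(self, build_id: str) -> None:
        """Move a running build to 'finished'."""
        self.building.remove(build_id)
        self.finished.append(build_id)

    def status(self) -> dict:
        """Return the document that GET /status would serve."""
        return {
            "builds": dict(self.builds),
            "queued": list(self.queued),
            "building": list(self.building),
            "finished": list(self.finished),
        }
```

An HTTP handler for POST /cd would call `submit` with the parsed JSON
body, and the handler for GET /status would serialise the result of
`status`.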

## VCS worker

* Simple HTTP API
* Endpoint: POST /repo, body specifies repo and ref to fetch; git
  clones (or pulls) that, and pushes to GitLab repo with same name
  (creates it if necessary), but using only the master branch; returns
  info of how things went

        {
            "git": "git://git.liw.fi/heippa",
            "ref": "master",
            "name": "hithere2"
        }

* Clones the specified git repository, and pushes the specified branch
  as master in GitLab. Deletes the repository on GitLab if it exists
  already (for hygiene). If there's a `.gitlab-ci.yml` in the repo,
  GitLab CI will run it.
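
The git side of this could be sketched as below. The function only
builds the command lines so they can be inspected before anything is
run; `plan_mirror`, `gitlab_base` and `workdir` are hypothetical names,
and deleting or creating the project on GitLab is left out entirely.

```python
import os


def plan_mirror(spec: dict, gitlab_base: str, workdir: str) -> list:
    """Return the git commands (as argv lists) that mirror one ref.

    spec is a POST /repo body with "git", "ref" and "name" keys.
    gitlab_base is an assumed URL prefix under which projects live.
    """
    clone_dir = os.path.join(workdir, spec["name"])
    target = gitlab_base.rstrip("/") + "/" + spec["name"] + ".git"
    return [
        # Clone with the requested branch or tag checked out as HEAD.
        ["git", "clone", "--branch", spec["ref"], spec["git"], clone_dir],
        # Push whatever is checked out as the master branch on GitLab.
        ["git", "-C", clone_dir, "push", target, "HEAD:refs/heads/master"],
    ]
```

Each returned list can be handed to `subprocess.run(argv, check=True)`
in order; a failure of either command would then be reported back in
the response to the controller.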

## Artifact store

* Simple HTTP API
* We can probably use the Ick artifact store for this, at least
  initially
* Endpoint: PUT /blobs/NAME, stores body as blob named NAME,
  overwriting it if it existed already
* Endpoint: GET /blobs/NAME, returns blob, with a generic content
  type, or 404 if not found
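
The semantics of the two endpoints fit in a few lines. The sketch
below is an in-memory stand-in, not the Ick artifact store; the class
name, the 201/200 status codes for PUT, and the choice of
`application/octet-stream` as the generic content type are
assumptions.

```python
class BlobStore:
    """In-memory stand-in for the artifact store's two endpoints."""

    GENERIC_TYPE = "application/octet-stream"  # assumed generic type

    def __init__(self):
        self._blobs = {}

    def put(self, name: str, body: bytes) -> int:
        """PUT /blobs/NAME: store or overwrite; return an HTTP status."""
        created = name not in self._blobs
        self._blobs[name] = bytes(body)
        return 201 if created else 200

    def get(self, name: str):
        """GET /blobs/NAME: return (status, content type, body)."""
        if name not in self._blobs:
            return 404, None, b""
        return 200, self.GENERIC_TYPE, self._blobs[name]
```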

## Deployment worker

* Simple HTTP API
* Endpoint: POST /deploy, body specifies name in artifact store,
  copies that to the test environment using SSH

        {
            "artifact_id": "foo",
            "published_name": "foo-1.2.4"
        }
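
A deployment then has two steps: fetch the blob, copy it over SSH. The
sketch below only works out what those steps would be; `plan_deploy`
and its parameters are hypothetical, and using `scp` for the copy is
an assumption (the plan only says "using SSH").

```python
from urllib.parse import quote


def plan_deploy(body: dict, store_url: str, ssh_target: str, tmp_path: str) -> dict:
    """Describe the two steps of a deployment for a POST /deploy body.

    ssh_target is an assumed "user@host:/directory" destination and
    tmp_path is where the fetched blob would be saved locally.
    """
    for key in ("artifact_id", "published_name"):
        if key not in body:
            raise ValueError("missing field: " + key)
    if "/" in body["published_name"]:
        # Refuse names that could escape the served directory.
        raise ValueError("published_name must not contain '/'")
    return {
        # Step 1: GET this URL from the artifact store, save to tmp_path.
        "fetch_url": store_url.rstrip("/") + "/blobs/" + quote(body["artifact_id"], safe=""),
        # Step 2: copy the saved file to the test environment.
        "copy_argv": ["scp", tmp_path, ssh_target.rstrip("/") + "/" + body["published_name"]],
    }
```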


## Test environment

* SSH server with HTTP server serving a directory
* Deployment is mocked by copying blob from artifact store to the
  directory being served by HTTP server

# How a build works

![Build sequence](buildseq.svg)\ 


The build has the following steps:

1. Gerrit notifies the controller of a change in a git repo.

2. The controller tells the VCS worker to copy the repo to GitLab, by
   doing a "POST /repo" HTTP request.

3. The VCS worker git clones the repo from Gerrit. It may do a git
   pull instead to update an existing clone, but that's an
   optimisation, which we'll implement when it's needed. The git
   operation may require credentials (e.g., security embargoed
   repositories), which the VCS worker has: they're installed when the
   host is deployed. No other CI host has those credentials.

4. The VCS worker pushes the repository to GitLab. This may again
   require credentials. The push triggers GitLab CI to run the commit
   stage build and test commands specified in `.gitlab-ci.yml`.

5. The VCS worker responds to the POST request from the controller
   with results.

6. GitLab tells the Runner host to run the build and test commands.
   The Runner does that.

7. The Runner uploads any binaries it builds to the artifact store.

8. The Runner tells GitLab it's finished.

9. GitLab tells the controller via a webhook that a build has
   finished.

0. The controller tells the deployer to start a deployment, via a
   "POST /deploy" HTTP API call.

0. The deployer fetches the artifacts it has been told to deploy
   from the artifact store.

0. The deployer copies the artifacts to the test environment. (In a
   future iteration this will be a more sophisticated deployment
   process.)

0. The deployer responds to the HTTP request from the controller with
   the results.

0. The controller notifies Gerrit of a build and deployment having
   been finished. (Except not in the first iteration.)
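
Seen from the controller, the steps above reduce to three calls and an
optional notification. The following sketch shows that control flow
with the other components passed in as plain callables; all names, and
the `"ok"` key in the results, are hypothetical stand-ins for the HTTP
calls and the GitLab webhook.

```python
def run_pipeline(change, vcs, wait_for_build, deploy, notify=None):
    """Drive one change through the build steps; return a log of outcomes.

    vcs, wait_for_build and deploy each return a dict with an "ok"
    boolean. The pipeline stops at the first step that fails.
    """
    log = []

    def step(name, result):
        log.append((name, result["ok"]))
        return result["ok"]

    # Steps 2-5: ask the VCS worker to copy the repo to GitLab.
    spec = {"git": change["repo"], "ref": change["ref"], "name": change["name"]}
    if not step("vcs", vcs(spec)):
        return log

    # Steps 6-9: GitLab and the Runner build; the controller only
    # learns the outcome, via the webhook.
    if not step("build", wait_for_build(change["name"])):
        return log

    # Steps 10-13: ask the deployment worker to publish the artifact.
    request = {
        "artifact_id": change["artifact"],
        "published_name": change["artifact"],
    }
    if not step("deploy", deploy(request)):
        return log

    # Step 14: report back to Gerrit (not done in the first iteration,
    # so notify defaults to None).
    if notify is not None:
        notify(change, log)
    return log
```

Passing the components in as callables keeps the ordering logic
testable without any of the other hosts running.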