Return-Path: <ick-discuss-bounces@ick.liw.fi>
X-Original-To: distix@pieni.net
Delivered-To: distix@pieni.net
Received: from yaffle.pepperfish.net (yaffle.pepperfish.net [88.99.213.221])
	by pieni.net (Postfix) with ESMTPS id 43452431FA
	for <distix@pieni.net>; Fri,  8 Feb 2019 21:42:34 +0000 (UTC)
Received: from platypus.pepperfish.net (unknown [10.112.101.20])
	by yaffle.pepperfish.net (Postfix) with ESMTP id 1491A41318
	for <distix@pieni.net>; Fri,  8 Feb 2019 21:42:34 +0000 (GMT)
Received: from ip6-localhost.nat ([::1] helo=platypus.pepperfish.net)
	by platypus.pepperfish.net with esmtp (Exim 4.80 #2 (Debian))
	id 1gsDur-0006qU-Vo; Fri, 08 Feb 2019 21:42:33 +0000
Received: from koom.pieni.net ([88.99.190.206] helo=pieni.net)
 by platypus.pepperfish.net with esmtpsa (Exim 4.80 #2 (Debian))
 id 1gsDur-0006qF-0d
 for <ick-discuss@ick.liw.fi>; Fri, 08 Feb 2019 21:42:33 +0000
Received: from exolobe4 (mobile-access-6df022-41.dhcp.inet.fi [109.240.34.41])
 by pieni.net (Postfix) with ESMTPSA id 44190431FA
 for <ick-discuss@ick.liw.fi>; Fri,  8 Feb 2019 21:42:30 +0000 (UTC)
Message-ID: <5e2b278847b76dd0311d5050b3455b12b4dd3077.camel@liw.fi>
From: Lars Wirzenius <liw@liw.fi>
To: ick-discuss@ick.liw.fi
Date: Fri, 08 Feb 2019 13:42:10 -0800
User-Agent: Evolution 3.30.4-1 
Mime-Version: 1.0
X-Pepperfish-Transaction: d5be-f6af-3d3b-972a
X-Pepperfish-Transaction-By: platypus
Subject: Plan for using Muck for the Ick controller
X-BeenThere: ick-discuss@ick.liw.fi
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: discussions about the ick CI system <ick-discuss-ick.liw.fi>
List-Unsubscribe: <https://listmaster.pepperfish.net/cgi-bin/mailman/listinfo/ick-discuss-ick.liw.fi>,
 <mailto:ick-discuss-request@ick.liw.fi?subject=unsubscribe>
List-Archive: <http://listmaster.pepperfish.net/pipermail/ick-discuss-ick.liw.fi>
List-Post: <mailto:ick-discuss@ick.liw.fi>
List-Help: <mailto:ick-discuss-request@ick.liw.fi?subject=help>
List-Subscribe: <https://listmaster.pepperfish.net/cgi-bin/mailman/listinfo/ick-discuss-ick.liw.fi>,
 <mailto:ick-discuss-request@ick.liw.fi?subject=subscribe>
Content-Type: multipart/mixed; boundary="===============6608641942915556769=="
Mime-version: 1.0
Sender: ick-discuss-bounces@ick.liw.fi
Errors-To: ick-discuss-bounces@ick.liw.fi


--===============6608641942915556769==
Content-Type: multipart/signed; micalg="pgp-sha512";
 protocol="application/pgp-signature"; boundary="=-2hGZcifWjHnZE7/eshss"


--=-2hGZcifWjHnZE7/eshss
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

TL;DR: The Ick controller can be changed in a straightforward manner
to use Muck for persistent data storage, except for log files. It
doesn't seem worthwhile to support an in-place conversion. Instead,
the few people affected can re-create their projects and pipelines on
a new, fresh demo Ick instance, which will replace the current one.

TS;RF (too short; rambling follows):

Currently, the Ick controller stores various resources directly in its
local filesystem. I wish to change the controller to use Muck instead.
The main motivation for this is to have better access control: users
of the same controller shouldn't see or be able to change projects or
pipelines for other users.

Muck is a JSON data store with access control. Part of the access
control is that every JSON object ("resource") is owned by a specific
user. Every user can only access their own objects, for now (although
this will become more flexible later).

Where the controller currently creates, say, a resource for the
project foo, by storing it as YAML in the file
"/var/lib/ick/state/projects/foo", I plan to change that so that it
creates a JSON resource in Muck like this:

    {
        "_type": "project",
        "_name": "foo",
        ... # all other fields as in the current YAML file
    }

Muck invents a unique identifier for each object, and guarantees no
other object has that identifier. The controller will not use this,
and will instead use a search on the "_type" and "_name" fields to
find the right object. This is so that users may refer to projects and
pipelines using more humane names: "ick.liw.fi" instead of
"48053f4f-71d9-42a1-b3ca-8574cbb788aa" for example.

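As a rough sketch of that lookup (this is not actual Ick or Muck code:
the FakeStore class and its create/search methods are made up for
illustration and stand in for Muck's real HTTP API, which is not shown
here):

```python
import uuid


class FakeStore:
    """Hypothetical in-memory stand-in for Muck (assumption)."""

    def __init__(self):
        self.objects = {}  # store-assigned id -> JSON-like dict

    def create(self, obj):
        # The store, not the caller, invents the unique identifier.
        obj_id = str(uuid.uuid4())
        self.objects[obj_id] = dict(obj)
        return obj_id

    def search(self, **conditions):
        # Return every object whose fields equal all given conditions.
        return [
            obj
            for obj in self.objects.values()
            if all(obj.get(k) == v for k, v in conditions.items())
        ]


store = FakeStore()
store.create({"_type": "project", "_name": "ick.liw.fi"})
store.create({"_type": "pipeline", "_name": "ick.liw.fi"})

# The caller never needs the generated id, only the type and name.
matches = store.search(_type="project", _name="ick.liw.fi")
```

Searching on both fields is what keeps "project ick.liw.fi" apart from
"pipeline ick.liw.fi" in the sketch.
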
Muck allows arbitrary JSON objects to be stored. For the Ick
controller, the following approach seems like a reasonable first
attempt:

* Each object will have a "_type" field, which specifies the type of
  the object: project, pipeline, build, log, or worker. This is needed
  so that one can search for a "project foo", as opposed to "pipeline
  foo".

* Each object will have a user-assigned "_name" field, which the
  controller makes sure is unique and prefixed with the object owner's
  username.

  Muck does not have transactions that span multiple HTTP requests.
  The controller will do its best to ensure a name is unique, but it
  can't guarantee that. However, if it notices a name clash later, it
  will treat that as an error. For example, if there are two projects
  named "liw/ick.liw.fi", the controller will refuse to trigger
  either. (This is a limitation in Muck and will be fixed later,
  possibly by teaching Muck about user-defined names for resources,
  and having it make sure they're unique. But that will have to wait
  for a later version of Muck.)

* Depending on the type of the object, it may contain other fields as
  well, see below.

* The controller creates objects in Muck based on API calls to the
  controller, and passes on the access token it gets. Muck uses the
  access token's "sub" field to assign ownership of the object; some
  access tokens do not have such a field, and objects created with
  them can thus not be accessed by any user; this will be the case
  for workers that register themselves.
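
The naming rule and the clash check could look roughly like this. It
is only a sketch under assumptions: the function names, the NameClash
exception and the "owner/name" format are illustrative (the format is
guessed from the "liw/ick.liw.fi" example), and the object list stands
in for a Muck search result.

```python
class NameClash(Exception):
    """Raised when more than one object has the same type and name."""


def qualified_name(owner, name):
    # Prefix the user-assigned name with the owner's username.
    return f"{owner}/{name}"


def find_unique(objects, obj_type, name):
    # 'objects' stands in for what a search in the store would return.
    matches = [
        obj
        for obj in objects
        if obj.get("_type") == obj_type and obj.get("_name") == name
    ]
    if len(matches) > 1:
        # Uniqueness can't be guaranteed at creation time, so a clash
        # noticed later is treated as an error.
        raise NameClash(f"{len(matches)} {obj_type}s named {name}")
    return matches[0] if matches else None


name = qualified_name("liw", "ick.liw.fi")
clashing = [
    {"_type": "project", "_name": name},
    {"_type": "project", "_name": name},
]

try:
    find_unique(clashing, "project", name)
    refused = False
except NameClash:
    refused = True  # the controller would refuse to trigger either
```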

The following object types will be supported initially:

* project - projects defined by the user
* pipeline - pipelines defined by the user
* build - a build that's been triggered, is running, or is finished
* log - a build log
* worker - a worker

The contents of these object types are as they are in the controller
now. Switching to Muck does not change that, except for logs, which
need special handling to avoid very bad performance.

Muck stores all incoming resources to a "change log". A build log may
get thousands of updates: each line of output may become an update.
If each update rewrote a single log object, every new version would be
identical to the previous one, except with a line of new text. When a
build produces a thousand lines of output, Muck would store a thousand
nearly identical copies of the log object in its change log, which is
quite wasteful.
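
To put a number on the waste, here is a small back-of-the-envelope
calculation. It assumes every update adds exactly one line and that
the change log keeps every version in full; the function names are
made up for illustration.

```python
def lines_stored_as_versions(n_updates):
    # Version k of the single log object holds k lines, and the change
    # log keeps all n versions: 1 + 2 + ... + n lines in total.
    return n_updates * (n_updates + 1) // 2


def lines_stored_as_snippets(n_updates):
    # One small object per update, each holding just its own line.
    return n_updates


whole = lines_stored_as_versions(1000)
split = lines_stored_as_snippets(1000)
print(whole, split)
```

Under those assumptions a thousand-line build stores 500500 lines of
text in the change log, versus 1000 lines if each line is stored only
once.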

To avoid this waste, the controller will be changed to store the build
logs as follows:

* The worker-manager will send each build log snippet as a separate
  update, as before, but it will also add a sequence number to the
  updates.

* The controller will create a separate log object in Muck for each
  update. The new log object will contain only the new snippet, plus
  a reference to the build it is part of, and the sequence number.

* The controller will reconstruct the whole build log when a complete
  log is requested, by fetching all log objects that refer to the
  specified build, and catenating the log output snippets in order of
  the sequence number.

I think this will work, but I haven't written any code yet. Comments?
I plan on starting work on this next week, when I get back from
gallivanting in SF.

I don't plan on converting existing projects and pipelines on the demo Ick
instance. That seems like work that would be useful, but it's also work
that's maybe not worthwhile yet, given the demo Ick instance only has a few
users. I'm lazy, sorry.


--=-2hGZcifWjHnZE7/eshss
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part
Content-Transfer-Encoding: 7bit

-----BEGIN PGP SIGNATURE-----

iQIzBAABCgAdFiEETNTnrewG6wEE1EJ3bC+mFux6IDEFAlxd97IACgkQbC+mFux6
IDE+uA//ciOWY53Sv6SW4p5fOZPZ0SGtOJEP2+Y36Bx+b9s0v4/RpW1b00Zs6alT
E50Y4MoFBD2IR6S/jPs2kDlve4CSrIUiI0gfMt5iBWrcJK3GlzqvOK33RY+ofQFW
OCdA5o1U/YFdSZzdJqJJqx5hg0BKrR887WXKwQOX/K9eq8hCSgtrtKNho2TgPO/s
8s3hveugFXJND8AtEuhuTyuh3e3PhKluiZYt7gTjCFyYMi83I/0k1i0SX1xlUisl
IV83ZbnAfQWC9f9q6b4Qmq/rmEEMJEchwGKKlrbbQx1A2H8yAo1Rq+TKonygXCSA
58WVTBpkREh42X9AzLsk53hQmGtrKZ4GTr/tilr8rStxwfWGO07v19mtdTLuAhYx
xKR621JcGv06TYE6BUmCpW45fKyGkfWFOzfyQrMK47Of/Ke0/oFspWkswzHZInvk
O/cGPjjjS7/2+ouXiiXNy45te26uyWU1uIy6bwYCgqZqgQsT59pgOb8UvqinG6rW
qz6Bhqj2AkQCHHa1TJHABV7vgC5YxUMy9l4UnmlfJbTk/vBPMTBk1KXjvltfHPlL
3EvC/o2zU096Q6IurxnYcdbeO1+zP3tv9T3tCxrQ9KTgfTvB2WFQ4xXAl6EXS8sp
zp92RajksrkbunjGwB6DBxQqKn2NbpvioIljZUWlMn0Nk5IkVpU=
=hHLc
-----END PGP SIGNATURE-----

--=-2hGZcifWjHnZE7/eshss--



--===============6608641942915556769==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
ick-discuss mailing list
ick-discuss@ick.liw.fi
https://listmaster.pepperfish.net/cgi-bin/mailman/listinfo/ick-discuss-ick.liw.fi

--===============6608641942915556769==--