author Lars Wirzenius <lwirzenius@wikimedia.org> 2019-05-07 16:08:38 +0300
committer Lars Wirzenius <lwirzenius@wikimedia.org> 2019-05-07 16:08:38 +0300
commit f52ce479fbf21f8b9da5f0449d5a7007b98a9cd7
tree c07a2f3498676fd8ea3a6cbe868922297bd9c48c
parent cf43a0fa2ff22c8717f6c141541d8ea160c2e4e0
Change: un-bulletize the credentials management
-rw-r--r-- ci-arch.html 24
-rw-r--r-- ci-arch.mdwn 108
-rw-r--r-- ci-arch.pdf bin 285975 -> 286470 bytes
3 files changed, 68 insertions, 64 deletions
diff --git a/ci-arch.html b/ci-arch.html
index 9672f31..a2e4d45 100644
--- a/ci-arch.html
+++ b/ci-arch.html
@@ -41,7 +41,6 @@
<li><a href="#log-storage">Log storage</a></li>
<li><a href="#artifact-storage">Artifact storage</a></li>
<li><a href="#credentials-management-and-access-control">Credentials management and access control</a></li>
-<li><a href="#interdependent-changes-to-multiple-components">Interdependent changes to multiple components</a></li>
</ul></li>
<li><a href="#architecture-the-wmf-development-ecosystem">Architecture: The WMF development ecosystem</a></li>
<li><a href="#the-default-pipeline">The (default?) pipeline</a><ul>
@@ -235,18 +234,16 @@
</ul>
<h2 id="credentials-management-and-access-control">Credentials management and access control</h2>
<ul>
-<li><p>Credentials and other secrets are used to allow access to servers, services, and files. They are often highly security sensitive data. The CI system needs to protect them, but allow controlled use of them.</p></li>
-<li><p>Example: a CI job needs to deploy a Docker image with a tested and reviewed change as a container orchestrated by Kubernetes. For this, it needs to authenticate itself to the Kubernetes API. This is typically done by a username/password combination. How will the future CI system handle this?</p></li>
-<li><p>Example: for tests, and in production, a MediaWiki container needs access to a MariaDB database, and MW needs to authenticate itself to the database. MW gets the necessary credentials for this from its configuration, which CI will install during deployment. The configuration will be specific for what the container is being used: if it’s for testing a change, the configuration only allows access to a test database, but for production it provides access to the production database.</p></li>
-<li><p>FIXME: This is unclear as yet, the text below is some incoherent preliminary rambling by Lars which needs review and fixing.</p></li>
-<li><p>Builds are done in isolated containers. These containers have no credentials. Build artifacts are extracted from the containers and stored in an artifact storage system by the CI system, and this is done in a controlled environment, where only vetted code is run, not code from the repository being tested.</p></li>
-<li><p>Deployments happen in controlled environments, with access to the credentials needed for deployment. The deployment retrieve artifacts from the artifact storage system. The deployments are to containers, and the deployed continers don’t have any credentials, unless CI has been configured to install them, in which case CI installs the credentials for the intended use of the container.</p></li>
-<li><p>Tests run against software deployed to containers, and those containers only have access to the backing services needed for the test.</p></li>
-<li><p>The CI system needs a way to store the credentials that can only be accessed by CI itself, when it’s deploying a container (Kubernetes API access) or configuring the container (installing credentials for the intended use of container).</p>
-<p>This might be, for example, a set of files deployed to the CI host where container deployment or configuration runs, with access control provided by Unix permissions. Not sure if this is sufficiently secure.</p></li>
+<li>FIXME: This is unclear as yet, the text below is some incoherent preliminary rambling by Lars which needs review and fixing.</li>
</ul>
-<h2 id="interdependent-changes-to-multiple-components">Interdependent changes to multiple components</h2>
-<p>FIXME All or none of the changes need to merged and deployed. Lars would prefer to avoid having such changes.</p>
+<p>Credentials and other secrets are needed to allow access to servers, services, and files. They are highly security sensitive data. The CI system needs to protect them, but allow controlled use of them.</p>
+<p>Example: a CI job needs to deploy a Docker image with a tested and reviewed change as a container orchestrated by production Kubernetes. For this, it needs to authenticate itself to the Kubernetes API. This is typically done by a username/password combination, but might be an API token of some kind (though it doesn’t really matter; it’s all just secret bits at some level). How will the future CI system handle this?</p>
+<p>Example: for tests, and in production, a MediaWiki container needs access to a MariaDB database, and MW needs to authenticate itself to the database. MW gets the necessary credentials for this from its configuration, which CI will install during deployment. The configuration will be specific for what the container is being used: if it’s for testing a change, the configuration only allows access to a test database, but for production it provides access to the production database.</p>
+<p>Builds are done in isolated containers. These containers have no credentials. Build artifacts are extracted from the containers and stored in an artifact storage system by the CI system, and this extraction is done in a controlled environment, where only vetted code is run, not code from the repository being tested. The build environment can’t push artifacts directly to the artifact store.</p>
+<p>Deployments happen in controlled environments, with access to the credentials needed for deployment. The deployment retrieves artifacts from the artifact storage system. The deployments are to containers, and the deployed containers don’t have any credentials, unless CI has been configured to install them, in which case CI installs the credentials for the intended use of the container.</p>
+<p>Note that credentials should not come directly from the source code of the deployed program. CI deploys configuration when it deploys the software. This way, the same software (build artifacts) can be deployed to different environments. (This may be complicated by the way MediaWiki is configured, using a PHP file in the source tree. This will need discussion.)</p>
+<p>Tests run against software deployed to containers, and those containers only have access to the backing services needed for the test, and may even be firewalled to not have access to any other network locations.</p>
+<p>Suggestion: Deployments will be done in dedicated deployment environments, which run a “pingee” service. When a pipeline executes a deployment stage, deploying to any environment, the stage runs in a suitable container, but doesn’t actually do the deployment itself. Instead, it “pings” a deployment service, with information of who is deploying, what, and where, and the deployment service inspects the change, and if it looks acceptable, does the actual deployment to the desired environment. The deployment service has access to the credentials it needs for accessing the artifacts and doing the deployment. There may be several deployment services, for deploying to environments with different security needs.</p>
<h1 id="architecture-the-wmf-development-ecosystem">Architecture: The WMF development ecosystem</h1>
<figure>
<img src="ecosystem.svg" alt="The WMF development ecosystem, roughly" style="height:25.0%" /><figcaption>The WMF development ecosystem, roughly</figcaption>
@@ -290,7 +287,8 @@
<p>FIXME This needs to be written, but it needs a lot of thinking first</p>
<h1 id="acceptance-criteria">Acceptance criteria</h1>
<ul>
-<li>FIXME: This chapter will sketch some automated acceptance tests using a Gherkin/Cucumber-like pseudo code language. Or in some other way that can be automatically executed.</li>
+<li><p>FIXME: This chapter will sketch some automated acceptance tests using a Gherkin/Cucumber-like pseudo code language. Or in some other way that can be automatically executed.</p>
+<p>The goal is to have CI deploy itself, and as part of the pipeline run acceptance tests defined here.</p></li>
</ul>
</body>
</html>
diff --git a/ci-arch.mdwn b/ci-arch.mdwn
index 590c108..ee32a82 100644
--- a/ci-arch.mdwn
+++ b/ci-arch.mdwn
@@ -508,60 +508,66 @@ we plan CI to implement them.
## Credentials management and access control
-* Credentials and other secrets are used to allow access to servers,
- services, and files. They are often highly security sensitive data.
- The CI system needs to protect them, but allow controlled use of
- them.
-
-* Example: a CI job needs to deploy a Docker image with a tested and
- reviewed change as a container orchestrated by Kubernetes. For this,
- it needs to authenticate itself to the Kubernetes API. This is
- typically done by a username/password combination. How will the
- future CI system handle this?
-
-* Example: for tests, and in production, a MediaWiki container needs
- access to a MariaDB database, and MW needs to authenticate itself to
- the database. MW gets the necessary credentials for this from its
- configuration, which CI will install during deployment. The
- configuration will be specific for what the container is being used:
- if it's for testing a change, the configuration only allows access
- to a test database, but for production it provides access to the
- production database.
-
* FIXME: This is unclear as yet, the text below is some incoherent
preliminary rambling by Lars which needs review and fixing.
-* Builds are done in isolated containers. These containers have no
- credentials. Build artifacts are extracted from the containers and
- stored in an artifact storage system by the CI system, and this is
- done in a controlled environment, where only vetted code is run, not
- code from the repository being tested.
-
-* Deployments happen in controlled environments, with access to the
- credentials needed for deployment. The deployment retrieve artifacts
- from the artifact storage system. The deployments are to containers,
- and the deployed continers don't have any credentials, unless CI has
- been configured to install them, in which case CI installs the
- credentials for the intended use of the container.
-
-* Tests run against software deployed to containers, and those
- containers only have access to the backing services needed for the
- test.
-
-* The CI system needs a way to store the credentials that can only be
- accessed by CI itself, when it's deploying a container (Kubernetes
- API access) or configuring the container (installing credentials for
- the intended use of container).
-
- This might be, for example, a set of files deployed to the CI host
- where container deployment or configuration runs, with access
- control provided by Unix permissions. Not sure if this is
- sufficiently secure.
-
-## Interdependent changes to multiple components
-
-FIXME All or none of the changes need to merged and deployed. Lars
-would prefer to avoid having such changes.
+Credentials and other secrets are needed to allow access to servers,
+services, and files. They are highly security sensitive data. The CI
+system needs to protect them, but allow controlled use of them.
+
+Example: a CI job needs to deploy a Docker image with a tested and
+reviewed change as a container orchestrated by production Kubernetes.
+For this, it needs to authenticate itself to the Kubernetes API. This
+is typically done by a username/password combination, but might be an
+API token of some kind (though it doesn't really matter; it's all just
+secret bits at some level). How will the future CI system handle this?
+
+Example: for tests, and in production, a MediaWiki container needs
+access to a MariaDB database, and MW needs to authenticate itself to
+the database. MW gets the necessary credentials for this from its
+configuration, which CI will install during deployment. The
+configuration will be specific for what the container is being used:
+if it's for testing a change, the configuration only allows access to
+a test database, but for production it provides access to the
+production database.
+
+Builds are done in isolated containers. These containers have no
+credentials. Build artifacts are extracted from the containers and
+stored in an artifact storage system by the CI system, and this
+extraction is done in a controlled environment, where only vetted code
+is run, not code from the repository being tested. The build
+environment can't push artifacts directly to the artifact store.
+
+Deployments happen in controlled environments, with access to the
+credentials needed for deployment. The deployment retrieves artifacts
+from the artifact storage system. The deployments are to containers,
+and the deployed containers don't have any credentials, unless CI has
+been configured to install them, in which case CI installs the
+credentials for the intended use of the container.
+
+Note that credentials should not come directly from the source code of
+the deployed program. CI deploys configuration when it deploys the
+software. This way, the same software (build artifacts) can be
+deployed to different environments. (This may be complicated by the way
+MediaWiki is configured, using a PHP file in the source tree. This
+will need discussion.)
+
+Tests run against software deployed to containers, and those
+containers only have access to the backing services needed for the
+test, and may even be firewalled to not have access to any other
+network locations.
+
+Suggestion: Deployments will be done in dedicated deployment
+environments, which run a "pingee" service. When a pipeline executes
+a deployment stage, deploying to any environment, the stage runs in a
+suitable container, but doesn't actually do the deployment itself.
+Instead, it "pings" a deployment service, with information of who is
+deploying, what, and where, and the deployment service inspects the
+change, and if it looks acceptable, does the actual deployment to the
+desired environment. The deployment service has access to the
+credentials it needs for accessing the artifacts and doing the
+deployment. There may be several deployment services, for deploying to
+environments with different security needs.
# Architecture: The WMF development ecosystem
diff --git a/ci-arch.pdf b/ci-arch.pdf
index 9bb3ced..b9f6c81 100644
--- a/ci-arch.pdf
+++ b/ci-arch.pdf
Binary files differ
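
The "pingee" suggestion added by this change can be illustrated with a minimal sketch. This is not part of the commit or of any existing WMF tooling: the names `DeployRequest`, `DeploymentService`, `deployment_stage`, and the allow-list policy are all hypothetical, and the actual deployment is replaced by a stand-in that only records what would be deployed. The point it shows is that the pipeline stage sends who, what, and where, while the credentials stay inside the deployment service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeployRequest:
    """The "ping": who is deploying, what, and where (hypothetical shape)."""
    who: str
    artifact: str
    environment: str


class DeploymentService:
    """Hypothetical "pingee": holds credentials, decides, and deploys."""

    def __init__(self, environment, credentials, allowed_deployers):
        self.environment = environment
        self._credentials = credentials  # kept here, never handed to the caller
        self._allowed = frozenset(allowed_deployers)
        self.deployed = []

    def inspect(self, request):
        """Placeholder policy; a real service would inspect the change itself."""
        return (request.environment == self.environment
                and request.who in self._allowed)

    def ping(self, request):
        """Handle a ping from a pipeline stage; return True if deployed."""
        if not self.inspect(request):
            return False
        self._deploy(request.artifact)
        return True

    def _deploy(self, artifact):
        # Stand-in for fetching the artifact and talking to the target
        # environment with self._credentials; here we only record it.
        self.deployed.append(artifact)


def deployment_stage(service, who, artifact, environment):
    """What the pipeline stage does: ping only, holding no credentials."""
    return service.ping(DeployRequest(who, artifact, environment))
```

Several such services, one per environment, would match the note that environments with different security needs may get separate deployment services; each instance would hold only the credentials for its own environment.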