field | value | date |
---|---|---|
author | Lars Wirzenius <liw@liw.fi> | 2021-02-17 08:49:38 +0200 |
committer | Lars Wirzenius <liw@liw.fi> | 2021-02-18 10:42:40 +0200 |
commit | a85e7d15c57c11d8286656531c7394681fb855d9 | |
tree | 872ec655a99606e8fe2147a4b8d4fb3e1aeb186b /muck.md | |
parent | 4b9edb354297353cbbe38575553ca1351f68d380 | |
refactor: move echo and muck examples under new examples/ directory
Diffstat (limited to 'muck.md')
-rw-r--r-- | muck.md | 513 |
1 files changed, 0 insertions, 513 deletions
diff --git a/muck.md b/muck.md
deleted file mode 100644
index d5470a0..0000000
--- a/muck.md
+++ /dev/null
@@ -1,513 +0,0 @@
----
-title: Muck JSON storage server and API
-author: Lars Wirzenius
-date: work in progress
-bindings: muck.yaml
-functions: muck.py
-template: python
-...
-
-Introduction
-=============================================================================
-
-Muck is intended for storing relatively small pieces of data securely,
-and accessing them quickly. Intended use cases are:
-
-* storing user, client, application, and related data for an OpenID
-  Connect authentication server
-* storing personally identifiable information of data subjects (in the
-  GDPR sense) in a way that they can access and update, assuming
-  integration with a suitable authentication and authorization server
-* in general, storage for web applications of data that isn't large
-  and fits easily into RAM
-
-Muck is a JSON store, with an access controlled RESTful HTTP API. Data
-stored in Muck is persistent, but kept in memory for fast access. Data
-is represented as JSON objects.
-
-Access is granted based on signed JWT bearer tokens. An OpenID Connect
-or OAuth2 identity provider is expected to give such tokens to Muck
-clients. The tokens must be signed with a key whose public half Muck is
-configured to accept.
-
-Access control is simplistic. Each resource is assigned an owner
-upon creation, and each user can access (see, update, delete) only
-their own resources. A user with "super" powers can access, update, and
-delete resources they don't own, but can't create resources for others.
-This will be improved later.
-
-Architecture
------------------------------------------------------------------------------
-
-Muck stores data persistently in its local file system. It provides an
-HTTP API for clients. Muck itself does not communicate otherwise with
-external entities.
-
-```dot
-digraph "architecture" {
-muck [shape=box label="Muck"];
-storage [shape=tab label="Persistent \n storage"];
-client [shape=ellipse label="API client"];
-idp [shape=ellipse label="OAuth2/OIDC server"];
-
-storage -> muck [label="Read at \n startup"];
-muck -> storage [label="Write \n changes"];
-client -> muck [label="API read/write \n (HTTP)"];
-client -> idp [label="Get access token"];
-idp -> muck [label="Token signing key"];
-}
-```
-
-
-Authentication
------------------------------------------------------------------------------
-
-[OAuth2]: https://oauth.net/
-[OpenID Connect]: https://openid.net/connect/
-[JWT]: https://en.wikipedia.org/wiki/JSON_Web_Token
-
-Muck uses [OAuth2][] or [OpenID Connect][] bearer tokens as access
-tokens. The tokens are granted by some form of authentication service,
-are [JWT][] tokens, and signed using public-key cryptography. The
-authentication service is outside the scope of this document; any
-standard implementation should work.
-
-Muck will be configured with one public key for validating the tokens.
-For Muck to accept a token:
-
-* its signature must be valid according to the public key
-* it must be used while it's valid (after the validity starts, but
-  before it expires)
-* its audience must be the specific Muck instance
-* its scope claim must contain the scopes needed for the
-  attempted operation
-* it must specify an end-user (data subject)
-
-Every request to the Muck API must include a token, in the
-`Authorization` header as a bearer token. The request is denied if the
-token does not pass all the above checks.
-
-Requirements
-=============================================================================
-
-This chapter lists high level requirements for Muck.
-
-Each requirement here is given a unique mnemonic id for easier
-reference in discussions.
-
-**SimpleOps**
-
-:   Muck must be simple to install and operate.
-    Installation should consist of
-    installing a .deb package, and configuration of setting the
-    authentication server's public key for token signing.
-
-**Fast**
-
-:   Muck must be fast. The speed requirement is that Muck must be able
-    to handle at least 100 concurrent clients, creating 1000 objects
-    each, and then retrieving each object, and then deleting each
-    object, and all of this must happen in no more than ten minutes
-    (600 seconds). Muck and the clients should run on different
-    virtual machines.
-
-**Secure**
-
-:   Muck must allow access only by an authenticated client
-    representing a data subject, and must only allow that client to
-    access objects owned by the data subject, unless the client has
-    super privileges. The data subject specifies, via the access
-    token, what operations the client is allowed to do: whether they
-    may read, update, or delete objects.
-
-
-HTTP API
-=============================================================================
-
-The Muck HTTP API has one endpoint – `/res` – that's used
-for all objects. The objects are called resources by Muck.
-
-The JSON objects Muck operates on must be valid, but their structure
-does not matter to Muck.
-
-Metadata
------------------------------------------------------------------------------
-
-Each JSON object stored in Muck is associated with metadata, which is
-represented as the following HTTP headers:
-
-* **Muck-Id** – the resource id
-* **Muck-Revision** – the resource revision
-
-The id is assigned by Muck at object creation time. The revision is
-assigned by Muck when the object is created or modified.
-
-
-API requests
------------------------------------------------------------------------------
-
-The RESTful API requests are POST, PUT, GET, and DELETE.
-
-* **POST /res** – create a new object
-* **PUT /res** – update an existing object
-* **GET /res** – retrieve an existing object
-* **DELETE /res** – delete an existing object
-
-Although it is usual for RESTful HTTP APIs to encode resource
-identifiers in the URL, Muck uses headers (Muck-Id, Muck-Revision) for
-consistency, and to provide for later expansion. Muck is not intended
-to be used manually, but by programmatic clients.
-
-Additionally, the "sub" claim in the token is used to assign and check
-ownership of the object. If the scope contains "super", the sub claim
-is ignored, except for creation.
-
-The examples in this chapter use HTTP/1.1, but should provide the
-necessary information for other versions of HTTP. Also, only the
-headers relevant to Muck are shown. For example, HTTP/1.1 also
-requires a Host header, but this is not shown in the examples.
-
-
-
-### Creating an object: POST /res
-
-Creating requires:
-
-* "create" in the scope claim
-* a non-empty "sub" claim, which will be stored by Muck as the owner
-  of the created object
-
-The creation request looks like this:
-
-~~~{.numberLines}
-POST /res HTTP/1.1
-Content-Type: application/
-Authorization: Bearer TOKEN
-
-{"foo": "bar"}
-~~~
-
-Note that the creation request does not include Muck-Id or
-Muck-Revision headers.
-
-A successful response looks like this:
-
-~~~{.numberLines}
-201 Created
-Content-Type: application/json
-Muck-Id: ID
-Muck-Revision: REV1
-~~~
-
-Note that the response does not contain a copy of the resource.
-
-
-
-### Updating an object: PUT /res
-
-Updating requires:
-
-* "update" in the scope claim
-* one of the following:
-  - "super" in the scope claim
-  - "sub" claim matches owner of object in Muck; super user can update
-    any resource, but otherwise data subjects can only update their own
-    objects
-* Muck-Revision matches the current revision in Muck; this functions
-  as a simplistic guard against conflicting updates from different
-  clients.
-
-The update request looks like this:
-
-~~~{.numberLines}
-PUT /res HTTP/1.1
-Authorization: Bearer TOKEN
-Content-Type: application/json
-Muck-Id: ID
-Muck-Revision: REV1
-
-{"foo": "yo"}
-~~~
-
-In the request, ID identifies the object, and REV1 is its revision.
-
-The successful response:
-
-~~~{.numberLines}
-200 OK
-Content-Type: application/json
-Muck-Id: ID
-Muck-Revision: REV2
-~~~
-
-Note that the update response also doesn't contain the object. The
-client should remember the new revision, or retrieve the object to get
-the latest revision before the next update.
-
-
-### Retrieving an object: GET /res
-
-Retrieving requires:
-
-* "show" in the scope claim
-* one of the following:
-  - "super" in the scope claim
-  - "sub" claim matches owner of object in Muck; super user can retrieve
-    any resource, but otherwise data subjects can only retrieve their own
-    objects
-
-The request to retrieve an object:
-
-~~~{.numberLines}
-GET /res HTTP/1.1
-Authorization: Bearer TOKEN
-Muck-Id: ID
-~~~
-
-A successful response:
-
-~~~{.numberLines}
-200 OK
-Content-Type: application/json
-Muck-Id: ID
-Muck-Revision: REV2
-
-{"foo": "yo"}
-~~~
-
-Note that the response does NOT indicate the owner of the resource.
-
-
-
-Acceptance criteria for Muck
-=============================================================================
-
-This chapter details the acceptance criteria for Muck, and how they're
-verified.
-
-
-Basic object handling
------------------------------------------------------------------------------
-
-First, we need a new Muck server. It will initially have no objects.
-We also need a test user, whom we'll call Tomjon.
-
-~~~scenario
-given a fresh Muck server
-given I am Tomjon
-~~~
-
-Tomjon can create an object.
-
-~~~scenario
-when I do POST /res with {"foo": "bar"}
-then response code is 201
-then header Muck-Id is ID
-then header Muck-Revision is REV1
-~~~
-
-Tomjon can then retrieve the object. It has the same revision and
-body.
-
-~~~scenario
-when I do GET /res with Muck-Id: {ID}
-then response code is 200
-then header Muck-Revision matches {REV1}
-then body matches {"foo": "bar"}
-~~~
-
-Tomjon can update the object, and the update has the same id, but a
-new revision and body.
-
-~~~scenario
-when I do PUT /res with Muck-Id: {ID}, Muck-Revision: {REV1}, and body {"foo":"yo"}
-then response code is 200
-then header Muck-Revision is {REV2}
-then revisions {REV1} and {REV2} are different
-~~~
-
-If Tomjon tries to update with the old revision, it fails.
-
-~~~scenario
-when I do PUT /res with Muck-Id: {ID}, Muck-Revision: {REV1}, and body {"foo":"yo"}
-then response code is 409
-~~~
-
-After the failed update, neither the object nor its revision has changed.
-
-~~~scenario
-when I do GET /res with Muck-Id: {ID}
-then response code is 200
-then header Muck-Revision matches {REV2}
-then body matches {"foo": "yo"}
-~~~
-
-We can delete the resource, and then it's gone.
-
-~~~scenario
-when I do DELETE /res with Muck-Id: {ID}
-then response code is 200
-when I do GET /res with Muck-Id: {ID}
-then response code is 404
-~~~
-
-
-Restarting Muck
------------------------------------------------------------------------------
-
-Muck should store data persistently. For this we need our test user to
-have the "super" capability.
-
-~~~scenario
-given a fresh Muck server
-given I am Tomjon, with super capability
-when I do POST /res with {"foo": "bar"}
-then header Muck-Id is ID
-then header Muck-Revision is REV1
-~~~
-
-So far, so good. Nothing new here. Now we restart Muck. The resource
-just created must still be there.
-
-~~~scenario
-when I restart Muck
-when I do GET /res with Muck-Id: {ID}
-then response code is 200
-then header Muck-Revision matches {REV1}
-then body matches {"foo": "bar"}
-~~~
-
-
-Super user access
------------------------------------------------------------------------------
-
-Check here that if we have super scope, we can retrieve, update, and
-delete someone else's resources, but if we create a resource, it's
-ours.
-
-Invalid requests
------------------------------------------------------------------------------
-
-There are a number of ways in which a request might be rejected. This
-section verifies all of them.
-
-### Accessing someone else's data
-
-~~~scenario
-given a fresh Muck server
-given I am Tomjon
-when I do POST /res with {"foo": "bar"}
-then header Muck-Id is ID
-then header Muck-Revision is REV1
-when I do GET /res with Muck-Id: {ID}
-then response code is 200
-then header Muck-Revision matches {REV1}
-then body matches {"foo": "bar"}
-~~~
-
-After this, we morph into another test user.
-
-~~~scenario
-given I am Verence
-when I do GET /res with Muck-Id: {ID}
-then response code is 404
-~~~
-
-Note that we get a "not found" error and not an "access denied" error
-so that Verence doesn't know if the resource exists or not.
-
-
-### Updating someone else's data
-
-This is similar to retrieving it, but we try to update instead.
-
-~~~scenario
-given a fresh Muck server
-given I am Tomjon
-when I do POST /res with {"foo": "bar"}
-then header Muck-Id is ID
-then header Muck-Revision is REV1
-given I am Verence
-when I do PUT /res with Muck-Id: {ID}, Muck-Revision: {REV1}, and body {"foo":"yo"}
-then response code is 404
-~~~
-
-
-### Deleting someone else's data
-
-This is similar to retrieving it, but we try to delete it instead.
-
-~~~scenario
-given a fresh Muck server
-given I am Tomjon
-when I do POST /res with {"foo": "bar"}
-then header Muck-Id is ID
-then header Muck-Revision is REV1
-given I am Verence
-when I do DELETE /res with Muck-Id: {ID}
-then response code is 404
-~~~
-
-### Bad signature
-
-### Not valid yet
-
-### Not valid anymore
-
-### Not for our instance
-
-### Lack scope for creation
-
-### Lack scope for retrieval
-
-### Lack scope for updating
-
-### Lack scope for deletion
-
-### No subject when creating
-
-### No subject when retrieving
-
-### No subject when updating
-
-### No subject when deleting
-
-### Invalid JSON when creating
-
-### Invalid JSON when updating
-
-
-# Possible future changes
-
-* There is no way to list all the resources a user has, or search for
-  resources. This should be doable in some way. With a search, a
-  listing operation is not strictly necessary.
-
-* It's going to be inconvenient to only be able to access one's own
-  resources. It would be good to support groups. A resource could be
-  owned by a group, and end-users / subjects could belong to any
-  number of groups. Also, groups should be able to belong to groups.
-  Each resource should be able to specify for each group what access
-  members of that group should have (retrieve, update, delete). There
-  should be no limits to how many group access control rules there are
-  per resource.
-
-  This would allow setups such as each resource representing a stored
-  file, and some groups would be granted read access, or read-write
-  access, or read-delete access to the files.
-
-* Also, it might be good to be able to grant other groups access to
-  control a resource's access control rules.
-
-* It might be good to support schemas for resources?
-
-* It might be good to have a configurable maximum size of a resource.
-  Possibly per-user quotas.
-
-* It would be good to support replication, sharding, and fault
-  tolerance.
-
-* Monitoring, logging, other ops requirements?
-
-* Encryption of resources, so that Muck doesn't see the contents?
-
-* Should Muck sign the resources it returns, with its own key?
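
The ownership and revision rules that the deleted muck.md describes (owner-only access, the "super" override, 409 on a stale Muck-Revision, 404 for someone else's object) can be modeled in a few lines of Python, the language of the document's step functions. This is a minimal sketch, not Muck's implementation: the `MuckModel` class, its method names, the `is_super` flag, and the use of random UUIDs as ids and revisions are all assumptions made for illustration. Checks on the create/show/update/delete scopes are left out because the document does not say which status code they produce.

```python
import uuid


class MuckModel:
    """Toy in-memory model of the ownership and revision rules (hypothetical)."""

    def __init__(self):
        # Maps resource id -> (owner, revision, body).
        self._objects = {}

    def _lookup(self, res_id, sub, is_super):
        # Hide objects owned by someone else unless the caller is "super",
        # so a foreign object looks exactly like a missing one.
        entry = self._objects.get(res_id)
        if entry is None or (entry[0] != sub and not is_super):
            return None
        return entry

    def create(self, sub, body):
        # The creator always becomes the owner, even with "super" scope.
        res_id, rev = uuid.uuid4().hex, uuid.uuid4().hex
        self._objects[res_id] = (sub, rev, body)
        return 201, res_id, rev

    def get(self, sub, res_id, is_super=False):
        entry = self._lookup(res_id, sub, is_super)
        if entry is None:
            return 404, None, None
        return 200, entry[1], entry[2]

    def update(self, sub, res_id, rev, body, is_super=False):
        entry = self._lookup(res_id, sub, is_super)
        if entry is None:
            return 404, None
        if entry[1] != rev:
            return 409, None  # stale revision: conflicting update
        new_rev = uuid.uuid4().hex
        self._objects[res_id] = (entry[0], new_rev, body)
        return 200, new_rev

    def delete(self, sub, res_id, is_super=False):
        if self._lookup(res_id, sub, is_super) is None:
            return 404
        del self._objects[res_id]
        return 200


# Usage, mirroring the "Basic object handling" scenario:
muck = MuckModel()
status, res_id, rev1 = muck.create("tomjon", {"foo": "bar"})
status, rev2 = muck.update("tomjon", res_id, rev1, {"foo": "yo"})
stale_status, _ = muck.update("tomjon", res_id, rev1, {"foo": "yo"})
```

Returning 404 rather than an access error from `_lookup` follows the document's note that another user should not learn whether a resource exists.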