From a3a9897dedf847023586f1e24724029f747be78c Mon Sep 17 00:00:00 2001
From: Lars Wirzenius
Date: Mon, 29 Oct 2018 11:01:34 +0200
Subject: Change: README

---
 README | 219 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 214 insertions(+), 5 deletions(-)

diff --git a/README b/README
index 05a2659..29530cb 100644
--- a/README
+++ b/README
@@ -1,11 +1,13 @@
 muck-poc - a JSON store with an HTTP API and access control
 =============================================================================
 
-This is a proof of concept.
+> This is a proof of concept. It's not meant to be performant. It's a
+> vehicle for exploring what the optimal API and feature set should
+> look like.
 
-Muck is a JSON store, with an access controlled, RESTful HTTP API.
-Data stored in Muck is persistent, but kept in memory for speed
-of access (similar to Redis). Data is stored as flat JSON objects:
+Muck is a JSON store, with an access controlled RESTful HTTP API. Data
+stored in Muck is persistent, but kept in memory for simplicity. Later
+on, also for speed. Data is stored as flat JSON objects, which means:
 
 * an object may have any number of fields
 * each field has a value that is `null`, a UTF-8 string, or a list of
@@ -14,8 +16,215 @@ of access (similar to Redis). Data is stored as flat JSON objects:
 Access is granted based on signed JWT bearer tokens. An OpenID Connect
 or OAuth2 identity provider is expected to give such tokens to
 authorized users. The tokens are signed with a public key, and the
-expected signing key is a key Muck configuration item.
+expected signing key is a key Muck configuration item. I use Qvisqve
+for my OpenID provider, but any provider should work.
 
+Access control is currently very simplistic, but will be improved
+later. The goal is to allow access to be specified per user, per
+resource and per operation.
+
+Muck is currently a single-threaded Python program using the Bottle.py
+framework and its built-in HTTP server. The production version of Muck
+will probably be written in Rust for performance. The current Python
+version can do on the order of 900 requests per second on a Thinkpad X220
+laptop (plain HTTP over localhost).
+
+Architecture
+-----------------------------------------------------------------------------
+
+Muck is in essence a dict in memory, indexed by resource id, and an
+HTTP layer to allow it to be accessed. Any changes are logged to an
+append-only `changelog` file. At startup, the `changelog` is read and
+the changes are made to the dict.
+
+There are currently no index data structures, so searches are very
+slow.
+
+Startup can be slow if `changelog` is long. Eventually this will be
+fixed by having occasional snapshots of the dict, and only reading
+change log entries made after the snapshot.
+
+Hacking
+-----------------------------------------------------------------------------
+
+Run `./check` to run the full test suite: unit tests, and integration
+tests. You'll need various build dependencies. I'm too lazy to list
+them here.
+
+Run `./benchmark` and `./benchmark-http` to run some simplistic
+benchmarking.
+
+The tests and benchmarks create access tokens using pre-generated test
+keys. If you use those keys for anything else, I will laugh at you.
+
+Configuration and starting and stopping
+-----------------------------------------------------------------------------
+
+Create a JSON configuration file:
+
+    {
+        "log": "muck.log",
+        "pid": "muck.pid",
+        "store": "muck.store",
+        "signing-key-filename": "trusted-key.pub"
+    }
+
+Create the directory given as the store. Put the token-signing public
+key in the named file. Start Muck with the following command:
+
+    ./muck_poc config.json
+
+Muck will listen on port 12765 on localhost. If you want to expose
+Muck to the external network, you should run a TLS-enabled reverse
+proxy (like haproxy or nginx) in front of it.
+
+Muck writes its PID into the named PID file. To stop it, send SIGTERM
+or SIGKILL to the process.
+
+
+HTTP API
+-----------------------------------------------------------------------------
+
+The HTTP API requires all requests to have an `Authorization: Bearer
+TOKEN` header, where `TOKEN` is a valid JWT access token whose
+signature can be checked using the public key Muck is configured to
+trust. The token should have a `scope` claim with space-delimited
+words to allow specific operations.
+
+The API has two endpoints: `/res` for resources, `/search` for search.
+Resources are managed as follows:
+
+* `POST /res` — create a new resource (need `create` in scope)
+* `PUT /res` — update an existing resource (need `update` in scope)
+* `GET /res` — retrieve a specific resource (need `show` in scope)
+* `DELETE /res` — delete a specific resource (need `delete` in scope)
+
+In all requests and responses that transport a resource, it is in the
+body, represented as JSON, using the `application/json` content type.
+
+Resource metadata is always given using HTTP headers of the request
+and response:
+
+* `Muck-Id` — the resource id
+* `Muck-Revision` — the resource revision
+
+The request should have these headers, if the operation requires
+them. Responses always have them, if a resource is returned.
+
+Searches are done by using a GET request to the `/search` endpoint,
+with a JSON body like this:
+
+    {
+        "cond": [
+            {
+                "where": "meta",
+                "field": "id",
+                "pattern": "ID123",
+                "op": "=="
+            }]
+    }
+
+The search condition is a list of simple conditions, which must all
+match. A simple condition consists of four parts:
+
+* `where` — should be `meta` to match metadata, or `data` to
+  match the actual resource
+* `field` — the name of the field to compare
+* `pattern` — the value to compare the field to
+* `op` — the comparison operation: `==`, `>=`, or `<=`
+
+The response is a JSON object listing all the ids of resources that
+match all the simple conditions.
+
+Searches require the `show` scope.
+
+
+API examples
+-----------------------------------------------------------------------------
+
+All these examples assume you've already retrieved an access token.
+
+To create a resource:
+
+    POST /res HTTP/1.1
+    Authorization: Bearer TOKEN
+    Content-Type: application/json
+
+    {"foo": "bar"}
+
+The response:
+
+    201 Created
+    Content-Type: application/json
+    Muck-Id: ID
+    Muck-Revision: REV1
+
+    {"foo": "bar"}
+
+Note that in the future Muck might decide to modify the resource by
+filling in missing fields. The canonical representation of the
+resource is in the response.
+
+To update a resource:
+
+    PUT /res HTTP/1.1
+    Authorization: Bearer TOKEN
+    Content-Type: application/json
+    Muck-Id: ID
+    Muck-Revision: REV1
+
+    {"foo": "yo"}
+
+The response:
+
+    200 OK
+    Content-Type: application/json
+    Muck-Id: ID
+    Muck-Revision: REV2
+
+    {"foo": "yo"}
+
+To retrieve a resource:
+
+    GET /res HTTP/1.1
+    Authorization: Bearer TOKEN
+    Muck-Id: ID
+
+The response:
+
+    200 OK
+    Content-Type: application/json
+    Muck-Id: ID
+    Muck-Revision: REV2
+
+    {"foo": "yo"}
+
+To delete a resource:
+
+    DELETE /res HTTP/1.1
+    Authorization: Bearer TOKEN
+    Muck-Id: ID
+
+The response:
+
+    200 OK
+
+To search:
+
+    GET /search HTTP/1.1
+    Authorization: Bearer TOKEN
+    Content-Type: application/json
+
+    {"cond": [
+        {"where":"data", "field":"name", "pattern":"James", "op":">="}]
+    }
+
+The response:
+
+    200 OK
+    Content-Type: application/json
+
+    {"resources": ["ID"]}
 
 Legalese
 -----------------------------------------------------------------------------
-- 
cgit v1.2.1
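
As an illustration of the request conventions the README text in this patch describes (bearer token in `Authorization`, resource metadata in `Muck-Id` and `Muck-Revision` headers, JSON bodies, and the `cond` list for searches), here is a minimal Python sketch. It is not part of Muck or of this patch: `BASE_URL`, `make_request` and `search_body` are hypothetical names chosen for the example, the port is taken from the README, and the requests are only built, never sent.

```python
import json
import urllib.request

# Assumption: Muck listens on localhost:12765, as the README states.
BASE_URL = "http://localhost:12765"


def make_request(method, path, token, body=None, res_id=None, rev=None):
    """Build (but do not send) a request following the README's conventions.

    Hypothetical helper for illustration only; not part of Muck.
    """
    headers = {"Authorization": "Bearer " + token}
    data = None
    if body is not None:
        # Resources and search conditions travel as JSON in the body.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if res_id is not None:
        headers["Muck-Id"] = res_id
    if rev is not None:
        headers["Muck-Revision"] = rev
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def search_body(*conds):
    """Build a search body; each cond is a (where, field, op, pattern) tuple."""
    return {
        "cond": [
            {"where": where, "field": field, "pattern": pattern, "op": op}
            for (where, field, op, pattern) in conds
        ]
    }


# Mirrors the "To update a resource" and "To search" examples in the README.
update = make_request("PUT", "/res", "TOKEN", body={"foo": "yo"},
                      res_id="ID", rev="REV1")
search = make_request("GET", "/search", "TOKEN",
                      body=search_body(("data", "name", ">=", "James")))
```

Passing either object to `urllib.request.urlopen` would send it, assuming a running Muck instance and a valid token in place of `TOKEN`. Note that `urllib.request` stores header names in capitalized form, so `Muck-Id` is kept internally as `Muck-id`; HTTP header names are case-insensitive, so this should not matter to the server.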