From 0cfe3e388772f07a88e9fdd1bb61967502815368 Mon Sep 17 00:00:00 2001
From: Lars Wirzenius
Date: Sat, 26 Jan 2019 09:17:22 +0200
Subject: Add: Yuck arch doc page

---
 yuck.mdwn | 488 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 488 insertions(+)
 create mode 100644 yuck.mdwn

diff --git a/yuck.mdwn b/yuck.mdwn
new file mode 100644
index 0000000..7ac0040
--- /dev/null
+++ b/yuck.mdwn
@@ -0,0 +1,488 @@
+[[!meta title="Yuck - an authentication server"]]
+
+[[!toc levels=2]]
+
+**NOTE**: Yuck is in its planning phase at the moment. No code exists,
+only this document. Feedback on this document is welcome, preferably
+via email to liw@liw.fi. Ick will continue to use Qvisqve for the
+time being, until Yuck is ready to replace it.
+
+# Introduction
+
+Yuck is an **identity provider** that allows end users to **securely
+authenticate** themselves to web sites and applications. Yuck also
+allows users to **authorize** applications to act on their behalf.
+Yuck supports the **OAuth2** and **OpenID Connect** protocols, and has
+an API to allow storing and managing data about end users,
+applications, and other entities related to authentication.
+
+Yuck does not provide any services unrelated to authentication. Other
+services can work with Yuck to control access to them.
+
+OpenID Connect (OIDC) is a protocol suitable for interactively
+authenticating a person (the end user). OAuth2 is suitable for
+non-interactive API clients, possibly ones acting on behalf of the end
+user.
+
+Both OAuth2 and OpenID Connect provide a number of variants and
+extensions. Yuck implements the "client credentials grant" for OAuth2,
+and the "authorization code flow" for OIDC.
+
+Yuck has an extensible architecture for supporting different ways for
+users to authenticate, and for optionally using multiple
+authentication factors.
+Initially it will implement traditional passwords and time-based
+one-time passwords (TOTP, same as "Google Authenticator").
+
+The Yuck architecture supports different ways for storing the data and
+credentials it needs. Initially it will come with support for using the
+Muck JSON store, but support for, say, LDAP can be added.
+
+## Terminology and concepts
+
+* **access token**: a token which grants access to a service or
+  resource; usually short-lived, but see refresh token
+
+* **API client**: a program that uses the API, either on behalf of an
+  end-user, or on its own behalf
+
+* **application**: software that provides a service using the RP
+
+* **authenticate**: prove the identity of someone or something; "this
+  is how you know I am who I say I am"; authentication can happen in any
+  number of ways, and different relying parties may have different
+  requirements: government ID; being able to read email sent to an
+  email address; knowing a secret; possessing a unique thing; acting
+  in a particular way; having particular body features (fingerprint,
+  face, voice, hand shape, ...); etc.; the list is almost endless
+
+* **authorize**: grant access to an authenticated entity; "what are
+  they allowed to do?"
+
+* **end-user**: a human using the system, typically the reason the
+  system exists; can also be a subject
+
+* **front end**: provides the user interface to an end user via the
+  user agent or browser; typically provides HTML, JS, CSS, and images,
+  statically or generated dynamically, but could be audio, video, or
+  anything the user can interact with
+
+* **IDP**: short for identity provider
+
+* **identify**: claim an identity; "this is who I say I am"
+
+* **identity**: who a human is, or which instance of a program is
+
+* **identity provider**: software that authenticates end users and
+  non-human entities, and also stores authorizations for them
+
+* **JWT**: a standard way to represent tokens, see [JWT][]; Yuck will
+  use digitally signed tokens
+
+* **OAuth2**: a protocol for authenticating software; see [OAuth2][]
+
+* **OIDC**: short for OpenID Connect; a protocol for authenticating
+  end users; see [OIDC][]
+
+* **refresh token**: a token that can be used to get a new access
+  token; usually long-lived, but can be revoked
+
+* **relying party**: software that relies on the IDP for
+  authentication and authorization; often a resource provider, but can
+  also do things on request instead of merely storing things
+
+* **resource**: data stored by a resource provider
+
+* **resource provider**: stores resources and allows authorized access
+  to them; "database"
+
+* **RP**: short for relying party or resource provider
+
+* **subject**: a person whose personal information is handled by the
+  system, see end-user
+
+* **user agent**: typically a web browser, but can be a mobile
+  or desktop application; assumed to be under complete user control,
+  and so trusted by the user, but not by the ecosystem
+
+[JWT]: https://en.wikipedia.org/wiki/JSON_Web_Token
+[OAuth2]: https://en.wikipedia.org/wiki/OAuth#OAuth_2.0
+[OIDC]: https://en.wikipedia.org/wiki/OpenID_Connect
+
+# Requirements
+
+[RFC 2119]: https://www.ietf.org/rfc/rfc2119.txt
+
+Yuck has at least the
+following high-level requirements.
+
+In this section, the key words "MUST", "MUST NOT", "REQUIRED",
+"SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
+and "OPTIONAL" are to be interpreted as described in
+[RFC 2119][].
+
+Each requirement and sub-requirement is given a unique name for easier
+reference in discussions.
+
+* (SECURE) Yuck MUST be secure.
+    * (CREDSTORE) Yuck MUST store credentials in a way that
+      minimises damage if they leak. Credentials SHOULD be stored
+      hashed using a respected key derivation function (such as
+      scrypt) and using per-credential salting. Something stronger
+      MAY be implemented instead.
+    * (MFA) Yuck MUST support multi-factor authentication using secure
+      factors.
+    * (PROTOS) Yuck MUST use secure protocols to authenticate users
+      and API clients.
+    * (HTTPS) Yuck MUST NOT ever use plain HTTP, only HTTPS.
+    * (AUDIT) Yuck SHOULD undergo security audits, and general
+      scrutiny. Audits SHOULD happen regularly. (This is not an
+      absolute requirement, as it depends on the availability of
+      competent auditors. Yuck is not a for-profit project, and may
+      not be able to pay them.)
+    * (SECUREANDUSABLE) The Yuck developers MUST keep security at the
+      highest priority, without sacrificing usability.
+* (QUALITY) The Yuck project MUST aim for high quality, by applying
+  development methods that are known to work for achieving quality,
+  such as test-driven development, automated test suites with high
+  test coverage, and code review.
+* (HSCALABLE) The Yuck architecture MUST be horizontally scalable to
+  very large numbers of concurrent users and API clients.
+    * (NOTUNSCALABLE) The implementation might not scale to very many
+      users or concurrent users, especially initially, but the
+      architecture MUST NOT prevent a scalable implementation.
+* (ADMINFRIENDLY) Yuck MUST be flexible for system administrators to
+  manage, and applications to use.
+    * (ADMINAPIS) Yuck SHOULD provide APIs for managing the entities
+      and data it needs, such as for creating end users and API
+      clients, or changing their credentials.
+    * (APPFRIENDLY) Yuck SHOULD enable applications to delegate all
+      authentication to Yuck.
+* (FREEDOM) Yuck MUST be free software. It MUST NOT require
+  applications, API clients, and other software that works with Yuck
+  to be free software.
+* (PRIVACYSTORE) Yuck MUST NOT store personal information it does not
+  need.
+* (PRIVACYLEAK) Yuck MUST NOT leak personal information.
+
+
+# Architecture: the ecosystem
+
+[[!graph type=digraph src="""
+user [shape="ellipse" label="end user" margin="0.2,0.2"];
+browser [shape="tab" label="web browser /\nuser agent" margin="0.2,0.2"];
+webapp [shape="component" label="application\nfrontend" margin="0.2,0.2"];
+IDP [shape="component" label="IDP" margin="0.2,0.2"];
+RP [shape="cylinder" margin="0.2,0.3"];
+app_api [shape="component" label="application\nbackend" margin="0.2,0.2"];
+user_client [shape="tab" label="API client\n(on behalf of user)" margin="0.2,0.2"];
+auto_client [shape="tab" label="API client\n(autonomous)" margin="0.2,0.2"];
+
+user -> browser;
+browser -> webapp;
+browser -> IDP;
+webapp -> IDP;
+webapp -> app_api;
+app_api -> RP;
+
+user_client -> IDP;
+user_client -> app_api;
+
+auto_client -> IDP;
+auto_client -> app_api;
+"""]]
+
+An IDP interacts with several other systems to enable end users to do their
+thing. The RP provides the actual service, and delegates
+authentication to the IDP. There can be other services in front of the
+RP, and for security reasons there has to be at least one for end-user
+authentication.
+
+* The end user interacts directly with their web browser or other user
+  agent, which is assumed to be entirely under their control, and thus
+  not trusted by the IDP or other components. The end user is assumed
+  to trust what they use.
+
+* The browser talks to the facade (to get the HTML and JS and other
+  files to present a UI to the user), and the IDP (to allow the user
+  to authenticate themselves).
+
+* The facade holds the access token on behalf of an authenticated end
+  user.
+
+* The facade talks to a backend, giving it the user's access token as
+  proof of authentication and authorization.
+
+* The backend provides an API suitable for the service it provides. It
+  also allows access based on the access token.
+
+* The resource provider stores data for the backend. It also allows
+  access based on the access token.
+
+Some access is not made interactively by the end user, but by API
+clients that either act on behalf of the user, or are unrelated to
+them in any way. The end user can authorize an API client to act on
+their behalf. The authorization can limit the API client's access to a
+subset of what the end user can do. If the end user can both read and
+write a resource, the authorization might only allow the API client to
+read the resource.
+
+API clients that are unrelated to the user are authorized by the
+owners of the RP. See below for an example.
+
+## Authentication scenarios
+
+As examples of how an authentication server might be used, consider
+an online banking system. It should support at least three scenarios.
+
+**End user interactively accesses their account**: The end user opens up
+the bank web page, and logs in, and can interactively do whatever
+they're allowed to do: view their bank statement, transfer money, etc.
+
+**End user authorizes an API client**: The end user, who happens to be
+a Unix sysadmin, might want to automatically retrieve their bank
+statement and feed it to their accounting system. They create an
+authorization for an API client that only allows it to retrieve the
+statement, but not do anything else.
+This creates, in the IDP, a new API client identity, which is tied to
+the end user's identity, so that whatever the API client does, it is
+known to act on behalf of the end user.
+
+**Bank pays interest automatically**: The bank runs an API client,
+authorized by the bank to act autonomously and without end user
+authorization, which annually transfers interest from the bank's own
+account to each end user's account.
+
+Obviously, a real bank would need a lot more scenarios, but these will
+do for discussing Yuck.
+
+## Data model
+
+Yuck needs to store data about end users, applications, and API
+clients. It models the data as a set of "resources", which can be
+represented as JSON objects. Initially, Yuck will store the JSON
+objects in Muck, which is a dedicated JSON object store, but Yuck will
+be able to support any store that supports the following:
+
+* an object can be created and assigned a unique ID and revision
+* an object can be updated, with collision prevention using the
+  revision (the updater gives what they think is the newest revision;
+  the store will fail the update if it isn't)
+* an object can be retrieved, given the ID
+* an object can be deleted, given the ID
+* objects can be searched for, based on any field defined below, using
+  case-independent equality or comparison to a pattern
+
+### A user
+
+A user resource represents the user. Its object ID is used to
+identify users in the ecosystem, not a username. The object ID
+is unique, never changes, and is chosen by Yuck. Ideally it is never
+shown to the user, and is only used to reference the user internally.
+
+The user resource stores the following data:
+
+* `allowed_scopes` — (a list of strings) the scopes the user is
+  allowed to have
+
+Note that the user object does not store usernames or credentials in
+any way. A user may have any number of credentials, for multi-factor
+authentication. When a user is being authenticated, they must provide
+all credentials.
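
The store requirements listed under "Data model" above can be illustrated with a small in-memory stand-in. This is only a sketch under stated assumptions: `MemoryStore`, `RevisionMismatch`, and the method names are hypothetical and are not Muck's API, and shell-style wildcards stand in for the unspecified "comparison to a pattern".

```python
import fnmatch
import uuid


class RevisionMismatch(Exception):
    """Raised when an update names a revision that is no longer the newest."""


class MemoryStore:
    """Hypothetical in-memory object store; not the Muck API."""

    def __init__(self):
        self._objects = {}  # object ID -> (revision, body)

    def create(self, body):
        # The store, not the caller, chooses the ID and the first revision.
        obj_id = uuid.uuid4().hex
        rev = uuid.uuid4().hex
        self._objects[obj_id] = (rev, dict(body))
        return obj_id, rev

    def get(self, obj_id):
        rev, body = self._objects[obj_id]
        return rev, dict(body)

    def update(self, obj_id, rev, body):
        # Collision prevention: fail unless the caller names the newest revision.
        current_rev, _ = self._objects[obj_id]
        if rev != current_rev:
            raise RevisionMismatch(obj_id)
        new_rev = uuid.uuid4().hex
        self._objects[obj_id] = (new_rev, dict(body))
        return new_rev

    def delete(self, obj_id):
        del self._objects[obj_id]

    def search(self, field, pattern):
        # Case-independent match on a string field. A pattern without
        # wildcards behaves as an equality test.
        pattern = pattern.lower()
        return [
            obj_id
            for obj_id, (_, body) in self._objects.items()
            if isinstance(body.get(field), str)
            and fnmatch.fnmatchcase(body[field].lower(), pattern)
        ]


store = MemoryStore()
user_id, rev = store.create({"allowed_scopes": ["create", "read"]})
store.create({"user_ref": user_id, "username": "Alice"})
newer_rev = store.update(user_id, rev, {"allowed_scopes": ["read"]})
```

A second `store.update(user_id, rev, ...)` with the old `rev` would raise `RevisionMismatch`, which is the behaviour the revision requirement asks for.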
+
+### A username
+
+A username resource stores one name by which the user is identified to
+the system. As far as Yuck is concerned, a user may have any number of
+usernames, and they can change. The username is user-visible, and
+chosen by the user. Usernames need to be unique.
+
+* `user_ref` — (a string) ID of the user resource for the user
+* `username` — (a string) a username for the user
+
+Yuck stores as little about a user as possible. For example, it does
+not store the full name, or any contact information. The applications
+may store that separately.
+
+### An OAuth2 API client
+
+For OAuth2 API clients, the following data is stored:
+
+* `user_ref` — (a string, or `null`) ID of the user resource for
+  the user on behalf of whom the API client acts, if any
+* `allowed_scopes` — (a list of strings) the scopes the API
+  client is allowed to have
+
+Note that an API client may act on behalf of a user, but does not need
+to do so. If `user_ref` is set to a non-empty string, it is acting on
+behalf of a user, and this will cause any access tokens the API client
+gets to have the `sub` claim set to the user's ID.
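
As an illustration of the `user_ref` rule above, the claims for an API client's access token might be derived roughly as follows. This is a sketch, not Yuck's implementation: the function names are hypothetical, token signing is left out, silently dropping scopes that are not allowed is one possible policy, and the space-separated `scope` string follows the usual OAuth2 convention rather than anything this document specifies.

```python
def claims_for_api_client(client, requested_scopes):
    """Derive token claims from an API client resource (hypothetical helper)."""
    allowed = set(client["allowed_scopes"])
    # Grant only the requested scopes that the client is allowed to have.
    granted = [scope for scope in requested_scopes if scope in allowed]
    claims = {"scope": " ".join(granted)}
    # A non-empty user_ref means the client acts on behalf of that user.
    if client.get("user_ref"):
        claims["sub"] = client["user_ref"]
    return claims


def has_scope(claims, required_scope):
    """Check, as a resource provider might, that a token carries a scope."""
    return required_scope in claims.get("scope", "").split()


on_behalf = {"user_ref": "user-id-1", "allowed_scopes": ["read"]}
autonomous = {"user_ref": None, "allowed_scopes": ["pay_interest"]}

user_claims = claims_for_api_client(on_behalf, ["read", "create"])
auto_claims = claims_for_api_client(autonomous, ["pay_interest"])
```

Here `user_claims` carries `sub` and only the `read` scope, while `auto_claims` has no `sub` claim because the autonomous client is not tied to a user.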
+
+### An OIDC application front end
+
+For OIDC application front ends, the following data is stored:
+
+* `allowed_scopes` — (a list of strings) the scopes the application
+  front end is allowed to have
+* `callbacks` — (a list of strings) the callback URIs for the
+  application
+
+### A password credential for scrypt
+
+For password based authentication for users, API clients, and
+application front ends, Yuck will store the following data:
+
+* `user_ref` — (a string, or `null`) ID of the user resource for
+  the user, if any
+* `client_ref` — (a string, or `null`) ID of the resource for
+  the API client, if any
+* `hash` — (a string) password hashed using scrypt, encoded
+  as hexadecimal
+* `salt` — (a string) randomly chosen string to salt the
+  hash, encoded as hexadecimal
+* `key_len` — (an integer) used for scrypt
+* `N` — (an integer) used for scrypt
+* `r` — (an integer) used for scrypt
+* `p` — (an integer) used for scrypt
+
+Note that Yuck will require exactly one of `user_ref` and `client_ref`
+to be set to a non-empty string, and the other one to be `null`.
+
+The `key_len`, `N`, `r`, and `p` fields are the scrypt parameters.
+They are stored with each credential so that they can later be varied
+without making previously stored passwords invalid.
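
A minimal sketch of how such a credential could be created and checked, using `hashlib.scrypt` from the Python standard library (which needs Python built against OpenSSL 1.1 or later). The function names and the default parameter values are assumptions made for illustration; the document does not fix them.

```python
import hashlib
import hmac
import os


def make_password_credential(password, user_ref=None, client_ref=None,
                             key_len=64, N=2**14, r=8, p=1):
    """Build a password credential resource as a JSON-compatible dict."""
    # Exactly one of user_ref and client_ref must be a non-empty string.
    if bool(user_ref) == bool(client_ref):
        raise ValueError("set exactly one of user_ref and client_ref")
    salt = os.urandom(16)  # fresh random salt for every credential
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                            n=N, r=r, p=p, dklen=key_len)
    return {
        "user_ref": user_ref,
        "client_ref": client_ref,
        "hash": digest.hex(),
        "salt": salt.hex(),
        "key_len": key_len,
        "N": N,
        "r": r,
        "p": p,
    }


def verify_password(credential, password):
    """Recompute the hash with the stored parameters and compare."""
    digest = hashlib.scrypt(password.encode("utf-8"),
                            salt=bytes.fromhex(credential["salt"]),
                            n=credential["N"], r=credential["r"],
                            p=credential["p"], dklen=credential["key_len"])
    # Constant-time comparison, so timing does not reveal partial matches.
    return hmac.compare_digest(digest, bytes.fromhex(credential["hash"]))


credential = make_password_credential("correct horse", user_ref="user-id-1")
```

Because verification reads `N`, `r`, `p`, and `key_len` from the stored credential rather than from global settings, the defaults can be raised later and old credentials still verify.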
+
+### A TOTP credential for a user
+
+Yuck stores the TOTP credential for a user as follows:
+
+* `user_ref` — (a string) ID of the user resource for the user
+* the rest to be determined, when TOTP is implemented
+
+## External interfaces of Yuck
+
+Yuck provides the following interfaces to the rest of the ecosystem:
+
+* endpoints for managing users, API clients, OIDC application
+  frontends, including their credentials
+* an endpoint for OAuth2 API clients to get tokens using client
+  credential grants
+* endpoints for OIDC frontends to use for interactively authenticating
+  the end users, and for getting the resulting tokens (including
+  refreshed tokens)
+* an endpoint for monitoring the health of Yuck
+
+Details will be specified later.
+
+# Authentication protocols
+
+This chapter will walk through each of the protocols Yuck supports,
+down to sample HTTP requests and responses.
+
+## Authorization information
+
+Overview of how authorization happens in the ecosystem:
+
+* The IDP keeps track of what each end user and API client is
+  authorized to do. This is encoded by storing a list of "scopes". A
+  scope is a permission to do something, such as "create a resource"
+  or "update a resource the end user owns". See `allowed_scopes` in
+  the user and API client resources.
+
+* The access token identifies the end user. The token grants
+  permission to its bearer to do specific actions, encoded as a list
+  of scopes. Note that an access token need not have all the allowed
+  scopes.
+
+* The API provider actually implements the access control checks based
+  on the access token and its contents. The API provider implements
+  specific actions, and associates each with a scope, and checks that
+  the token has that scope.
+
+For example, assume that Alice is authorized for the actions "create
+resource" and "read resource owned by the user"; her `allowed_scopes`
+has the scopes `create` and `read`.
+
+Alice creates an API client, but only allows it the `read` scope.
+When the API client gets an access token, it will have the `sub` claim
+set to `alice`, and the `scope` claim set to `read`. With such an access
+token, the API client can read any resources that Alice can read, but
+can't create new resources.
+
+## OAuth2 for autonomous API clients
+
+* walkthrough of an API client getting tokens via OAuth2 CC
+* and using them
+
+## OIDC for interactive end users
+
+* walkthrough of an end-user causing facade to get tokens
+* and facade using them
+* web sessions
+
+## End users authorizing API clients
+
+* walkthrough
+
+# Architecture: Yuck itself
+
+[[!graph src="""
+user [shape="ellipse" label="end user" margin="0.2,0.2"];
+browser [shape="tab" label="web browser /\nuser agent" margin="0.2,0.2"];
+webapp [shape="component" label="application\nfrontend" margin="0.2,0.2"];
+user_client [shape="tab" label="API client\n(on behalf of user)" margin="0.2,0.2"];
+auto_client [shape="tab" label="API client\n(autonomous)" margin="0.2,0.2"];
+IDP_auth [shape="component" label="IDP auth endpoints" margin="0.2,0.2"];
+IDP_token [shape="component" label="IDP token endpoints" margin="0.2,0.2"];
+IDP_admin [shape="component" label="IDP admin endpoints" margin="0.2,0.2"];
+IDP_store [shape="cylinder" label="IDP data store" margin="0.2,0.3"];
+
+user -> browser;
+browser -> webapp;
+webapp -> IDP_auth;
+webapp -> IDP_token;
+webapp -> IDP_admin;
+
+user_client -> IDP_token;
+user_client -> IDP_admin;
+
+auto_client -> IDP_token;
+
+IDP_auth -> IDP_store;
+IDP_token -> IDP_store;
+IDP_admin -> IDP_store;
+
+"""]]
+
+The diagram above doesn't include parts of the ecosystem that are not
+part of Yuck or don't directly interact with Yuck.
+
+Yuck consists of three sets of endpoints, and a data store. The
+endpoints implement the external interfaces for the authentication
+protocols, and for administration. The data store stores JSON objects.
+
+An API client acting on behalf of an administrator will use the Yuck
+admin endpoints to manage users, API clients, and OIDC applications. An
+application frontend may provide a user interface for doing the same.
+
+Note that the various Yuck endpoints and the processes implementing
+them do not need to interact except via the data store. This enables
+horizontal scalability to the extent the data store scales.
+
+(It may be more sensible to have the application backend provide an
+interface for admin actions. It will still need to use the Yuck admin
+endpoints for doing that. This possibility has been left out of the
+diagram to avoid clutter.)
+
+## The data store
+
+[Muck]: http://git.liw.fi/muck-poc/tree/
+
+The data store will initially be [Muck][], which has a RESTful HTTP API
+for managing JSON objects. The API uses the kind of JWT access tokens
+for access control that Yuck creates. Yuck can create the tokens for
+its own use.
+
+Later, support for other data stores can be added. LDAP is probably
+going to be desired. This can be done by implementing a new component
+that provides a Muck-like interface, but stores the data in LDAP.
+Similarly, support can be added for SQL databases, etc.