[[!meta title="Yuck - an authentication server"]]

**NOTE:** Yuck is in its planning phase at the moment. No code exists, only this document. Feedback on this document is welcome, via normal Ick channels. Ick will continue to use Qvisqve for the time being, until Yuck is ready to replace it.

[[!toc levels=2]]

# Introduction

Yuck is an **identity provider** that allows end users to **securely authenticate** themselves to web sites and applications. Yuck also allows users to **authorize** applications to act on their behalf. Yuck supports the **OAuth2** and **OpenID Connect** protocols, and has an API to allow storing and managing data about end users, applications, and other entities related to authentication.

Yuck is intended to be used by web applications. It is not meant for authenticating Unix or ssh logins or such.

The status quo is that web applications often implement authentication themselves, but it is the opinion of Yuck's authors that this is a bad architectural design. Having a dedicated identity provider keeps the security-sensitive parts of authentication in one place, without mixing them with application logic, and results in a more cohesive, less coupled architecture and an implementation that is more easily reviewed and modified. A separate identity provider also makes it easier to provide single sign-on for groups of applications, without complicating each application.

Yuck does not provide any services unrelated to authentication. Other services can work with Yuck to control access to them.

OpenID Connect (OIDC) is a protocol suitable for interactively authenticating a person (the end user). OAuth2 is suitable for non-interactive API clients, possibly ones acting on behalf of the end user.

Both OAuth2 and OpenID Connect provide a number of variants and extensions. Yuck implements the **client credentials grant** for OAuth2, and the **authorization code flow** for OIDC.

Yuck has an extensible architecture for supporting different ways for users to authenticate, and for optionally using multiple authentication factors. Initially it will implement traditional passwords and time-based one-time passwords (TOTP, same as "Google Authenticator").

The Yuck architecture supports different ways of storing the data and credentials it needs. Initially it comes with support for using the Muck JSON store, but support for, say, LDAP can be added.

## Terminology and concepts

* **access token:** a token which grants access to a service or resource; usually quite short-lived (maybe less than a minute), since it can't be easily revoked, but see refresh token
* **API client:** a program that uses the API, either on behalf of an end-user, or on its own behalf
* **application:** software that provides a service using the RP
* **authenticate:** prove the identity of someone or something; "this is how you know I am who I say I am"; authentication can happen in any number of ways, and different relying parties may have different requirements: government ID; being able to read email sent to an email address; knowing a secret; possessing a unique thing; acting in a particular way; having particular body features (fingerprint, face, voice, hand shape, ...); etc, the list is almost endless
* **authorize:** grant access to an authenticated entity; "what are they allowed to do?"
* **end-user:** a human using the system, typically the reason the system exists; can also be a subject
* **front end:** provides the user interface to an end user via the user agent or browser; typically provides HTML, JS, CSS, and images, statically or generated dynamically, but could be audio, video, or anything the user can interact with
* **IDP:** short for identity provider
* **identify:** claim an identity; "this is who I say I am"
* **identity:** who a human is, or which instance of a program something is
* **identity provider:** software that authenticates end users and non-human entities, and also stores authorizations for them
* **JWT:** a standard way to represent tokens, see [JWT][]; Yuck will use digitally signed tokens
* **OAuth2:** a protocol for authenticating software; see [OAuth2][]
* **OIDC:** short for OpenID Connect; a protocol for authenticating end users; see [OIDC][]
* **refresh token:** a token that can be used to get a new access token; usually long-lived, but can be revoked, since every use can be checked by the IDP
* **relying party:** software that relies on the IDP for authentication and authorization; often a resource provider, but can also do things on request instead of merely storing things
* **resource:** data stored by a resource provider
* **resource provider:** stores resources and allows authorized access to them; "database"
* **RP:** short for relying party or resource provider
* **subject:** a person whose personal information is handled by the system, see end-user
* **user agent:** typically a web browser, but can be a mobile or desktop application; assumed to be under complete user control, and so trusted by the user, but not by the ecosystem

[JWT]: https://en.wikipedia.org/wiki/JSON_Web_Token
[OAuth2]: https://en.wikipedia.org/wiki/OAuth#OAuth_2.0
[OIDC]: https://en.wikipedia.org/wiki/OpenID_Connect

# Requirements

[RFC 2119]: https://www.ietf.org/rfc/rfc2119.txt

Yuck has at least the following high level requirements.

In this section, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in [RFC 2119][].

Each requirement and sub-requirement is given a unique name for easier reference in discussions.

* (SECURE) Yuck MUST be secure.
* (CREDSTORE) Yuck MUST store credentials in a way that minimises damage if they leak. Credentials SHOULD be stored hashed using a respected password hashing algorithm (such as scrypt) with per-credential salting. Alternatively, something stronger may be implemented. Additionally, all the credential records SHOULD be encrypted for an additional layer of defense.
* (MFA) Yuck MUST support multi-factor authentication using secure factors.
* (PROTOS) Yuck MUST use secure protocols to authenticate users and API clients.
* (HTTPS) Yuck MUST NOT ever use plain HTTP, only HTTPS.
* (AUDIT) Yuck SHOULD undergo security audits, and general scrutiny. Audits SHOULD happen regularly. (This is not an absolute requirement, as it depends on the availability of competent auditors. Yuck is not a for-profit project, and may not be able to pay them.)
* (SECUREANDUSABLE) The Yuck developers MUST keep security at the highest priority, without sacrificing usability.
* (QUALITY) The Yuck project MUST aim for high quality, by applying development methods that are known to work for achieving quality, such as test-driven development, automated test suites with high test coverage, and code review.
* (HSCALABLE) The Yuck architecture MUST be horizontally scalable to very large numbers of concurrent users and API clients.
* (NOTUNSCALABLE) The implementation might not scale to very many users or concurrent users, especially initially, but the architecture MUST NOT prevent a scalable implementation.
* (ADMINFRIENDLY) Yuck MUST be flexible for system administrators to manage, and for applications to use.
* (ADMINAPIS) Yuck SHOULD provide APIs for managing the entities and data it needs, such as for creating end users and API clients, or changing their credentials.
* (ACCOUNTAPI) Yuck MUST provide an API for managing end-user accounts: creating, deleting, changing, resetting secrets, etc.
* (APPFRIENDLY) Yuck SHOULD enable applications to delegate all authentication to Yuck.
* (CREDNOTIFY) Yuck SHOULD provide an out-of-band notification for users and admins when credentials are changed.
* (FREEDOM) Yuck MUST be free software. It MUST NOT require applications, API clients, and other software that works with Yuck to be free software.
* (PRIVACYSTORE) Yuck MUST NOT store personal information it does not need.
* (PRIVACYLEAK) Yuck MUST NOT leak personal information.
* (PWRESET) Yuck MUST support the user resetting their password, securely. Possibly by supporting a random, single-use link that can be communicated to the user (perhaps via email) to allow them to change the password.
* (TEMPLOCK) Yuck MUST support locking an account temporarily, if it is the target of too many failed authentication attempts. This is to prevent an attacker from brute-forcing a password by trying many times.
* (TEMPLOCKNOTIFY) Yuck MUST notify an account owner of temporary locking, out of band.
* (ACLSIMPLE) It must be easy to understand and reason about ACL rules. It may be good to aid this by visualising the rules.
* (ACLTRY) There must be a way to test ACL rules: if *this* user in *these groups* does *this* operation for *this* resource, is it allowed? This may require additional support from the RP.
* (DISABLEACCT) It must be possible to disable an account (whether for an end-user or an API client) so that it still exists, but authentication cannot ever succeed.
* (KILLSESSION) It must be possible to kill existing individual web sessions to kick out someone who is logged in to Yuck.
* (KEYROTATION) The IDP MUST rotate signing keys so that a leaked key can be easily replaced.
  The IDP MUST have a secure way to distribute the key to clients.
* (AUTHLIMIT) It MUST be possible to limit authentication by time or IP address.
* (VISUALIZE) There should be a way to visualize access control rules so that it's easier to debug them.
* (TESTRULES) There should be a way to test access control rules for specific entities and resources.

# Architecture: the ecosystem

[[!graph type=digraph src="""
user [shape="ellipse" label="end user" margin="0.2,0.2"];
browser [shape="tab" label="web browser /\nuser agent" margin="0.2,0.2"];
webapp [shape="component" label="application\nfrontend\n(facade)" margin="0.2,0.2"];
IDP [shape="component" label="IDP" margin="0.2,0.2"];
RP [shape="cylinder" margin="0.2,0.3"];
app_api [shape="component" label="application\nbackend" margin="0.2,0.2"];
user_client [shape="tab" label="API client\n(on behalf of user)" margin="0.2,0.2"];
auto_client [shape="tab" label="API client\n(autonomous)" margin="0.2,0.2"];
user -> browser;
browser -> webapp;
browser -> IDP;
webapp -> IDP;
webapp -> app_api;
app_api -> RP;
user_client -> IDP;
user_client -> app_api;
auto_client -> IDP;
auto_client -> app_api;
"""]]

An IDP interacts with several other systems to enable end users to do their thing. The RP provides the actual service, and delegates authentication to the IDP. There can be other services in front of the RP, and for security reasons there has to be at least one for end-user authentication.

* The end user interacts directly with their web browser or other user agent, which is assumed to be entirely under their control, and thus not trusted by the IDP or other components. The end user is assumed to trust what they use.
* The browser talks to the facade (to get the HTML and JS and other files to present a UI to the user), and to the IDP (to allow the user to authenticate themselves).
* The facade holds the access token on behalf of an authenticated end user.
  The access token can't be given to the browser, since the browser can't be assumed to be highly secure, from the point of view of the relying party.
* The facade talks to a backend, giving it the user's access token as proof of authentication and authorization.
* The backend provides an API suitable for the service it provides. It also allows access based on the access token.
* The resource provider stores data for the backend. It also allows access based on the access token.

Some access is not interactive access by the end user, but access by API clients that either act on behalf of the user, or are unrelated to them in any way. The end user can authorize an API client to act on their behalf. The authorization can limit the API client's access to a subset of what the end user can do. If the end user can both read and write a resource, the authorization might only allow the API client to read the resource.

API clients that are unrelated to the user are authorized by the owners of the RP. See below for an example.

## Authentication scenarios

As examples of how an authentication server might be used, consider an online banking system. It should support at least three scenarios.

**End user interactively accesses their account:** The end user opens up the bank web page, and logs in, and can interactively do whatever they're allowed to do: view their bank statement, transfer money, etc.

**End user authorizes an API client:** The end user, who happens to be a Unix sysadmin, might want to automatically retrieve their bank statement and feed it to their accounting system. They create an authorization for an API client that only allows it to retrieve the statement, but not do anything else. This creates, in the IDP, a new API client identity, which is tied to the end user's identity, so that whatever the API client does, it is known to act on behalf of the end user.

**Bank pays interest automatically:** The bank runs an API client, authorized by the bank to act autonomously and without end user authorization, which annually transfers interest from the bank's own account to each end user's account.

Obviously, a real bank would need a lot more scenarios, but these will do for discussing Yuck.

## Data model

Yuck needs to store data about end users, applications, and API clients. It models the data as a set of "resources", which can be represented as JSON objects. Initially, Yuck will store the JSON objects in Muck, which is a dedicated JSON object store, but Yuck will be able to support any store that supports the following:

* an object can be created and assigned a unique ID and revision
* an object can be updated, with collision prevention using the revision (the updater gives the revision it thinks is the newest; the store will fail the update if it isn't)
* an object can be retrieved, given the ID
* an object can be deleted, given the ID
* objects can be searched for, based on any field defined below, using case-independent equality or comparison to a pattern

The facade will need to store user login session data, such as the access and refresh tokens for the user. It will store these in some secure manner that prevents them from leaking to an attacker, such as in memory only. It may store them (possibly encrypted) in Muck instead, if this is needed to allow the facade to be restarted without breaking sessions, or to run multiple copies of the facade.

### A user

A user resource represents the user. Its object ID, not a username, is used to identify users in the ecosystem. The object ID is unique, never changes, and is chosen by Yuck. Ideally it is never shown to the user, and is only used to reference the user internally.

The user resource stores the following data:

* `allowed_scopes` (a list of strings): the scopes the user is allowed to have

Note that the user object does not store usernames or credentials in any way. A user may have any number of credentials, for multi-factor authentication. When a user is being authenticated, they must provide all credentials.

### A username

A username resource stores one name by which the user is identified to the system. As far as Yuck is concerned, a user may have any number of usernames, and they can change. The username is user-visible, and chosen by the user. Usernames need to be unique.

* `user_ref` (a string): ID of the user resource for the user
* `username` (a string): a username for the user

Yuck stores as little about a user as possible. For example, it does not store the full name, or any contact information. The applications may store that separately.

### An OAuth2 API client

For OAuth2 API clients, the following data is stored:

* `user_ref` (a string, or `null`): ID of the user resource for the user on behalf of whom the API client acts, if any
* `allowed_scopes` (a list of strings): the scopes the API client is allowed to have

Note that an API client may act on behalf of a user, but does not need to do so. If `user_ref` is set to a non-empty string, the API client is acting on behalf of a user, and this will cause any access tokens the API client gets to have the `sub` claim set to the user's ID.
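The rule about `user_ref` and the `sub` claim can be illustrated with a minimal Python sketch. No Yuck code exists yet, so everything here is an assumption made for illustration: the function name `build_access_token_claims`, the idea that requested scopes are filtered against `allowed_scopes`, and the space-separated form of the `scope` claim are all hypothetical.

```python
def build_access_token_claims(client: dict, requested_scopes: list) -> dict:
    """Return the claims for an access token issued to an API client.

    `client` is an API client resource with `user_ref` and
    `allowed_scopes` fields, as described in the data model.
    """
    # Assumption: only scopes the client is allowed to have are granted.
    allowed = set(client["allowed_scopes"])
    granted = [scope for scope in requested_scopes if scope in allowed]

    # Assumption: the scope claim is a space-separated string.
    claims = {"scope": " ".join(granted)}

    # A non-empty user_ref means the client acts on behalf of that user,
    # so the token carries the user's ID in the sub claim.
    user_ref = client.get("user_ref")
    if user_ref:
        claims["sub"] = user_ref
    return claims
```

An autonomous client (with `user_ref` set to `null`) would get a token without a `sub` claim in this sketch.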
### An OIDC application front end

For OIDC application front ends, the following data is stored:

* `allowed_scopes` (a list of strings): the scopes the API client is allowed to have
* `callbacks` (a list of strings): the callback URIs for the application

### A password credential for scrypt

For password based authentication for users, API clients, and application front ends, Yuck will store the following data:

* `user_ref` (a string, or `null`): ID of the user resource for the user, if any
* `client_ref` (a string, or `null`): ID of the resource for the API client, if any
* `hash` (a string): the scrypt hash of the password, encoded as hexadecimal
* `salt` (a string): randomly chosen string to salt the hash, encoded as hexadecimal
* `key_len` (an integer): used for scrypt
* `N` (an integer): used for scrypt
* `r` (an integer): used for scrypt
* `p` (an integer): used for scrypt

Note that Yuck will require exactly one of `user_ref` and `client_ref` to be set to a non-empty string, and the other one to be `null`.

The `key_len`, `N`, `r`, and `p` fields are the scrypt parameters. They are stored so that they can later be varied without making previously stored passwords invalid.

### A TOTP credential for a user

Yuck stores the TOTP credential for a user as follows:

* `user_ref` (a string): ID of the user resource for the user
* the rest to be determined, when TOTP is implemented

## External interfaces of Yuck

Yuck provides the following interfaces to the rest of the ecosystem:

* endpoints for managing users, API clients, and OIDC application frontends, including their credentials
* an endpoint for OAuth2 API clients to get tokens using client credential grants
* endpoints for OIDC frontends to use for interactively authenticating the end users, and for getting the resulting tokens (including refreshed tokens)
* an endpoint for monitoring the health of Yuck

Details will be specified later.
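As a rough illustration of what the credential-management endpoints would have to produce and check, the password credential resource from the data model can be sketched in Python with the standard library's `hashlib.scrypt`. The function names and the default parameter values (`key_len=32`, `N=2**14`, `r=8`, `p=1`, a 16-byte salt) are assumptions for this sketch, not decisions made by Yuck.

```python
import hashlib
import hmac
import os


def make_password_credential(password, user_ref=None, client_ref=None,
                             key_len=32, N=2**14, r=8, p=1):
    """Build a password credential resource (hypothetical helper)."""
    # Exactly one of user_ref and client_ref must be a non-empty string.
    if bool(user_ref) == bool(client_ref):
        raise ValueError("exactly one of user_ref and client_ref must be set")
    salt = os.urandom(16)  # per-credential random salt
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                            n=N, r=r, p=p, dklen=key_len)
    return {
        "user_ref": user_ref or None,
        "client_ref": client_ref or None,
        "hash": digest.hex(),
        "salt": salt.hex(),
        "key_len": key_len,
        "N": N,
        "r": r,
        "p": p,
    }


def verify_password(credential, password):
    """Recompute the hash using the parameters stored in the credential."""
    digest = hashlib.scrypt(password.encode("utf-8"),
                            salt=bytes.fromhex(credential["salt"]),
                            n=credential["N"], r=credential["r"],
                            p=credential["p"], dklen=credential["key_len"])
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(digest.hex(), credential["hash"])
```

Because verification reads `key_len`, `N`, `r`, and `p` from the stored resource, the defaults for new credentials could be raised later without invalidating old ones.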
# Authentication protocols

This chapter will walk through each of the protocols Yuck supports, down to sample HTTP requests and responses.

## Authorization information

Overview of how authorization happens in the ecosystem:

* The IDP keeps track of what each end user and API client is authorized to do. This is encoded by storing a list of "scopes". A scope is a permission to do something, such as "create a resource" or "update a resource the end user owns". See `allowed_scopes` in the user and API client resources.
* The access token identifies the end user. The token grants permission to its bearer to do specific actions, encoded as a list of scopes. Note that an access token need not have all the allowed scopes.
* The API provider actually implements the access control checks based on the access token and its contents. The API provider implements specific actions, and associates each with a scope, and checks that the token has that scope.

For example, assume that Alice is authorized for the actions "create resource" and "read resource owned by the user"; her `allowed_scopes` has the scopes `create` and `read`. Alice creates an API client, but only allows it the `read` scope. When the API client gets an access token, it will have the `sub` claim set to `alice`, and the `scope` claim set to `read`. With such an access token, the API client can read any resources that Alice can read, but can't create new resources.
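The check an API provider makes in the Alice example might look like the following Python sketch. It assumes the token's signature has already been verified and its claims decoded into a dictionary, and that the `scope` claim is a space-separated string; the names `ACTION_SCOPES` and `is_allowed` are hypothetical.

```python
# Hypothetical mapping from the API provider's actions to required scopes.
ACTION_SCOPES = {
    "create_resource": "create",
    "read_resource": "read",
}


def is_allowed(claims: dict, action: str) -> bool:
    """Return True if the token's scopes permit the given action."""
    required = ACTION_SCOPES.get(action)
    if required is None:
        return False  # unknown actions are denied
    granted = claims.get("scope", "").split()
    return required in granted


# Claims of the access token issued to Alice's API client.
token_claims = {"sub": "alice", "scope": "read"}
```

With these claims, `is_allowed(token_claims, "read_resource")` is true and `is_allowed(token_claims, "create_resource")` is false, matching the example.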
## OAuth2 for autonomous API clients

* walkthrough of an API client getting tokens via OAuth2 CC
* and using them

## OIDC for interactive end users

* walkthrough of an end-user causing facade to get tokens
* and facade using them
* web sessions

## End users authorizing API clients

* walkthrough

# Architecture: Yuck itself

[[!graph src="""
user [shape="ellipse" label="end user" margin="0.2,0.2"];
browser [shape="tab" label="web browser /\nuser agent" margin="0.2,0.2"];
webapp [shape="component" label="application\nfrontend" margin="0.2,0.2"];
user_client [shape="tab" label="API client\n(on behalf of user)" margin="0.2,0.2"];
auto_client [shape="tab" label="API client\n(autonomous)" margin="0.2,0.2"];
IDP_auth [shape="component" label="IDP auth endpoints" margin="0.2,0.2"];
IDP_token [shape="component" label="IDP token endpoints" margin="0.2,0.2"];
IDP_admin [shape="component" label="IDP admin endpoints" margin="0.2,0.2"];
IDP_store [shape="cylinder" label="IDP data store" margin="0.2,0.3"];
user -> browser;
browser -> webapp;
webapp -> IDP_auth;
webapp -> IDP_token;
webapp -> IDP_admin;
user_client -> IDP_token;
user_client -> IDP_admin;
auto_client -> IDP_token;
IDP_auth -> IDP_store;
IDP_token -> IDP_store;
IDP_admin -> IDP_store;
"""]]

The diagram above doesn't include parts of the ecosystem that are not part of Yuck or don't directly interact with Yuck.

Yuck consists of three sets of endpoints, and a data store. The endpoints implement the external interfaces for the authentication protocols, and for administration. The data store stores JSON objects.

An API client acting on behalf of an administrator will use the Yuck admin endpoints to manage users, API clients, and OIDC applications. An application frontend may provide a user interface for doing the same.

Note that the various Yuck endpoints and the processes implementing them do not need to interact except via the data store. This enables horizontal scalability to the extent the data store scales.

(It may be more sensible to have the application backend provide an interface for admin actions. It will still need to use the Yuck admin endpoints for doing that. This possibility has been left out of the diagram to avoid clutter.)

## The data store

[Muck]: http://git.liw.fi/muck-poc/tree/

The data store will initially be [Muck][], which has a RESTful HTTP API for managing JSON objects. The API uses the kind of JWT access tokens for access control that Yuck creates. Yuck can create the tokens for its own use.

Later, support for other data stores can be added. LDAP is probably going to be desired. This can be done by implementing a new component that provides a Muck-like interface, but stores the data in LDAP. Similarly, support can be added for SQL databases, etc.
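To make the idea of a "Muck-like interface" concrete, here is a minimal in-memory Python sketch of the operations the data model requires of a store: create, update with revision-based collision prevention, retrieve, delete, and case-independent search. This is not Muck's actual API; the class `MemoryStore`, its method names, and the use of random UUIDs for IDs and revisions are assumptions for illustration, and pattern matching in search is left out.

```python
import uuid


class RevisionMismatch(Exception):
    """Raised when an update is based on a stale revision."""


class MemoryStore:
    def __init__(self):
        self._objects = {}  # object ID -> (revision, object)

    def create(self, obj):
        """Store a new object; return its ID and initial revision."""
        obj_id, rev = str(uuid.uuid4()), str(uuid.uuid4())
        self._objects[obj_id] = (rev, dict(obj))
        return obj_id, rev

    def get(self, obj_id):
        """Return the current revision and a copy of the object."""
        rev, obj = self._objects[obj_id]
        return rev, dict(obj)

    def update(self, obj_id, rev, obj):
        """Replace the object, but only if rev is the newest revision."""
        current_rev, _ = self._objects[obj_id]
        if rev != current_rev:
            raise RevisionMismatch(obj_id)
        new_rev = str(uuid.uuid4())
        self._objects[obj_id] = (new_rev, dict(obj))
        return new_rev

    def delete(self, obj_id):
        del self._objects[obj_id]

    def search(self, field, value):
        """Return IDs of objects whose field equals value, ignoring case."""
        wanted = value.lower()
        return [obj_id for obj_id, (_, obj) in self._objects.items()
                if isinstance(obj.get(field), str)
                and obj[field].lower() == wanted]
```

A component backed by LDAP or an SQL database would offer the same five operations while keeping the data elsewhere; the revision check in `update` is what lets several Yuck processes share one store without silently overwriting each other's changes.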