Best Practice and Example Identifier Provider

From MgmtWiki
Revision as of 19:27, 30 July 2018 by Tom (talk | contribs) (Questions and Answers for Designers and Developers)

Status

This example is now a work in progress based on the IDEF requirements which transitioned to Kantara in June 2018.

This page is designed to assist designers and developers of Identifier or Attribute Provider (IAP) web sites, but is specifically targeted at just the Identifier Provider part of that broader category as currently implemented. Technical terms that may be unfamiliar to the casual reader are used to convey the information that designers and developers need to perform their task.

For a less technical document see the Identity Model Overview.

Goals of the Best Practices and Example

In furtherance of the objective of an ecosystem of trusted identities in cyberspace, it is incumbent on the various working groups of standards organizations like Kantara to produce guidance on the use of their work product. This example is designed to provide one way in which compliance can be earned.

The goals below set out the needs and constraints for an example to be used in the design of an Identifier or Attribute Provider (IAP) system.

  1. Provide web sites with guidelines and an example of code that can be used (in order of priority):
    1. to provide an example that can be used by any web site to more easily achieve IDEF and Kantara compliance,
    2. to create an IDEF compliant Identifier or Attribute Provider (IAP) web site so that Kantara can be a paradigm for its own principles,
    3. to create a database of site members that can be used to create identity claims for various Kantara sites and activities.
  2. Support federation to more than one external widely used or IDEF compliant Best Practice and Example Relying Party.
  3. Support two factor authentication.
  4. Provide in-line guidance in the example code of the IDEF requirements that a developer of any web site can apply.
  5. Support various stake-holders of compliant Identity systems, such as Health Care and Government.

Non-goals for this Example

This project is limited to the case of a real-world user interacting with one Identifier Provider over an extended period of time using one or more Relying Parties (RP). The example does not consider the case where privacy enhancing technology is used to prevent linkage between different instances of use of the internet.

Design Thinking

Given the goals above, design thinking is needed to produce a solution in the form of patterns that meet the user's desires and intent. For example, to capture the user's acceptance of the privacy policy and the terms of use, a design pattern could just let the user read and accept the site's stated policies. But what the user wants is control over how their private information is used. The solution below tries to address both the RP site goals and the user desires in a way that allows users to see their past expressed intent and change it as befits their current intent. This is in keeping with the Usable Best Practice A of the IDEF requirements.

Access to the Example

Currently a working implementation of the IdP example web site is available here: https://idesg-idp.azurewebsites.net/

Identifier used in Best Practices

Selection of the format of the Identifier

The operating assumption is that any identifier used for authentication must be unique and work effectively on the internet. On the internet, unique identifiers are created as schemes under the Uniform Resource Identifier (URI) syntax defined in RFC 3986. The only scheme that is widely used for identifiers is the mailto: scheme (RFC 6068), which carries the email address format originally specified in RFC 822: "a@b.c", where "a" is usually considered to be a name and "b.c" the Identity Provider (IdP). While other identifiers have been created by IdPs, only the mailto: scheme is both URI compliant and well known to the community, so it is the recommended best practice for Kantara use. The specific technical reason for defining the identifier in such a limited structure is so that the RP can find the IdP configuration by looking at a well-known URL (see RFC 5785) for that information. (N.B. The OpenID 1.0 code required a different syntax for identifiers that uses URL encoding of a.b.c described in the acct: scheme. A few existing OpenID providers still use only the acct: scheme. If they have not fixed their code to permit the mailto: scheme for acquiring configuration information, they will not be able to provision IDs for this design.)
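
As a minimal sketch of how an RP might use the "a@b.c" structure to locate IdP configuration, the Python below splits an identifier and builds a well-known URL. The function names are hypothetical, and the path /.well-known/openid-configuration is an assumption that the IdP publishes OpenID Connect discovery metadata directly on the "b.c" host; a real deployment may need an extra lookup step to find the issuer first.

```python
from typing import Tuple
from urllib.parse import urlunsplit


def split_identifier(identifier: str) -> Tuple[str, str]:
    """Split an "a@b.c" identifier into (name, idp_host)."""
    # Accept an optional "mailto:" prefix on the identifier.
    if identifier.lower().startswith("mailto:"):
        identifier = identifier[len("mailto:"):]
    # Split on the last "@", since the host part cannot contain one.
    name, sep, host = identifier.rpartition("@")
    if not sep or not name or "." not in host:
        raise ValueError(f"not an a@b.c identifier: {identifier!r}")
    # Host names are case-insensitive; the name part is left untouched.
    return name, host.lower()


def well_known_config_url(identifier: str) -> str:
    """Build the URL where the IdP for this identifier may publish its configuration."""
    _, host = split_identifier(identifier)
    # Assumption: discovery metadata lives at this well-known path on the host.
    return urlunsplit(("https", host, "/.well-known/openid-configuration", "", ""))
```

Note that only the host part is needed to build the URL; the name part is parsed but not sent anywhere.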

With the rise of personal communication devices, the internet may not be the only source of user identifiers. Specifically, one good practice is to use the user's cell phone number for authentication. The security and privacy of other authentication identifiers should be explained to the user before they are permitted to select one.

It is not required that any identifier is provisioned with an email (see RFC 7505) or other account that can receive notifications. However, if the relying party wants to use an account for notifications they should verify that the account provided belongs to a user that is willing to receive notifications. It is not sufficient simply to ask the user that is registering with the relying party since an attacker could be trying to send unwanted notifications to the unsuspecting owner of the account. Therefore it is incumbent on the relying party to verify the validity of the account by sending a verification to the account and awaiting a favorable response from someone that can access the account before adding the identifier to a list to receive notifications. This best practice applies to applications installed on mobile devices as well as to web browsers.
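
The verification step described above can be sketched as follows. This is an illustration under stated assumptions, not the example site's code: the token format, the 24-hour lifetime, the in-memory notification_list, and all function names are hypothetical. The point it shows is that an account joins the notification list only after someone with access to that account returns the token that was sent to it.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

SECRET_KEY = secrets.token_bytes(32)   # hypothetical per-site signing secret
TOKEN_LIFETIME_SECONDS = 24 * 60 * 60  # assumed validity window

notification_list = set()  # accounts that have been verified for notifications


def _sign(account: str, issued: str) -> str:
    message = f"{account}|{issued}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_verification_token(account: str, now: Optional[float] = None) -> str:
    """Create the token that the RP sends to the account being verified."""
    issued = str(int(time.time() if now is None else now))
    return f"{issued}.{_sign(account, issued)}"


def confirm_account(account: str, token: str, now: Optional[float] = None) -> bool:
    """Add the account to the notification list only if the token is valid."""
    current = time.time() if now is None else now
    issued, _, mac = token.partition(".")
    if not issued.isdigit():
        return False
    expected = _sign(account, issued)
    # Constant-time comparison avoids leaking how much of the token matched.
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return False
    if current - int(issued) > TOKEN_LIFETIME_SECONDS:
        return False
    notification_list.add(account)
    return True
```

Sending the token (by email or SMS) is left out; only the party that can read the account's messages can present it back to confirm_account.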

Compliance and Recovery of User Identifiers

To minimize the cognitive load on the user, this example asks for the user name in the form of an email address and then populates both the user name and the user email address fields of the database with that one string. If the user starts from a federated identity, like Google, that will be the user's email address at the federated IdP. If the user creates their own logon first, then the Google email address will not be stored anywhere, because the identifier that Google supplies is specific to the Google sign-in. While that provides some privacy protection, it is only because the user name appears in text form for limited parts of the interchange, all of which should be protected by TLS. If user cognitive overload is not a primary concern, the implementation could be changed to force the user to pick a user name as described below in this same section.

The identifier used for authentication of a user is not well suited as the primary database key for the user since the specific IdP selected by the user for authentication is known to change over time. For this reason the identifier in the example database is a simple GUID which is statistically sure to be unique. Also the example allows for more than one authentication scheme to be enabled by a user. This choice carries some cognitive load for the user as the specific identifier selected by the user at registration must be known by the user at the time of later authentication. For that reason it is best practice for the RP to provide the user with other means to request their authentication (signin) identifier, which is yet another reason for the user to register an email address or phone number to receive the forgotten information.
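
A minimal sketch of this separation, assuming an in-memory store with hypothetical class and method names: the user record is keyed by a random GUID, and each sign-in identifier is a separate entry that maps back to that GUID, so a user can enable more than one authentication scheme and can be reminded which identifiers they registered.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class User:
    display_name: str
    # Primary key: a random GUID that stays fixed even if sign-in identifiers change.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


class UserStore:
    def __init__(self) -> None:
        self.users: Dict[str, User] = {}
        # (provider, identifier at that provider) -> user GUID
        self.logins: Dict[Tuple[str, str], str] = {}

    def register(self, display_name: str, provider: str, subject: str) -> User:
        user = User(display_name=display_name)
        self.users[user.id] = user
        self.add_login(user.id, provider, subject)
        return user

    def add_login(self, user_id: str, provider: str, subject: str) -> None:
        if user_id not in self.users:
            raise KeyError(user_id)
        key = (provider, subject)
        if self.logins.get(key, user_id) != user_id:
            raise ValueError("sign-in identifier already belongs to another user")
        self.logins[key] = user_id

    def find_by_login(self, provider: str, subject: str) -> Optional[User]:
        user_id = self.logins.get((provider, subject))
        return self.users.get(user_id) if user_id else None

    def logins_for(self, user_id: str) -> List[Tuple[str, str]]:
        """List the sign-in identifiers a user registered, e.g. for a reminder."""
        return sorted(key for key, owner in self.logins.items() if owner == user_id)
```

Because lookups go through the logins mapping, a user can drop or replace an IdP without the primary key, or anything that references it, having to change.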

If the RP registers itself with a federated IdP (such as this example, Microsoft, or Google), then the user's email address never needs to be exposed to the RP in any situation, only the GUID assigned by the IdP, which is put into the signin data table. If the user tries authentication with a dynamically registered IdP with OpenID Connect, then the user's email address must be requested by the RP for the purposes of locating the configuration information of the IdP. Note that the name part of the user email address is not required, but there is unlikely to be any user experience that would allow that distinction. Note also that the OpenID Connect compliance certification process requires dynamic registration with a TCP port number, which is bad privacy practice and incompatible with the known naming schemes listed above. That requires special code to be inserted into the example for the sole purpose of OpenID compliance; that code is never used in production.

For compliance tests of the RP system by a privacy auditor, the database dump will show that for preregistered federated IDs, the user email address is not stored in the database. It is likely that compliance tests will force the UX to be modified to encourage the user to enter a user name for display within the RP registration name space that is not related to any email or real world name and address. If the RP requires a real-world name and address for its own compliance criteria, then that must remain distinct from the user display name and the user email address. Any real-world identification information must be protected as befits the compliance criteria that the RP chooses to adopt.

Identifier from federated IdPs

Note that the identity stored in the user logins table is typically anonymized by the IdP so that it bears no relation to the user's URI at that IdP. So it is possible to use Google for sign-in to the RP web site without ever seeing or storing the email address that Google uses as a URI. Note that this is not guaranteed by Google. Since most IdPs of interest use OpenID Connect, the OAuth 2.0 request to Google from the RP could ask for any of the profile information kept by Google. If the user declines to release the information to the RP, the authentication attempt with OpenID Connect will fail. From that perspective, OpenID Connect is doing nothing within the protocol to protect User Private Information. It is up to the RP to ask for the minimum required information.
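
As a hedged illustration of asking for the minimum, the sketch below builds an OpenID Connect authorization request that carries only the openid scope unless the caller explicitly adds more. The function name and the endpoint, client ID, and redirect URI used to call it are placeholders, not values from the example site; the parameter names are the standard ones for the authorization code flow.

```python
import secrets
from typing import Iterable
from urllib.parse import urlencode


def build_auth_request(
    authorization_endpoint: str,
    client_id: str,
    redirect_uri: str,
    extra_scopes: Iterable[str] = (),
) -> str:
    """Return an authorization URL that requests only what the RP needs."""
    # "openid" alone yields an ID token with the subject identifier and no
    # profile or email claims; anything more must be added deliberately.
    scopes = ["openid", *extra_scopes]
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": secrets.token_urlsafe(16),  # ties the response to this request
        "nonce": secrets.token_urlsafe(16),  # ties the ID token to this request
    }
    return f"{authorization_endpoint}?{urlencode(params)}"
```

An RP that needs the email address for notifications would pass extra_scopes=("email",), which makes the additional request visible in code review rather than hidden in a default.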

User Experience at the RPs

The user needs to be able to understand the nature of the Identity Model that is the basis for the User Private Information held by the RP. The sample below shows on the left side a typical federated ID site with a collection of well-known social sites plus a place for a username if that is what the user prefers. The right side of the image shows the next page after the user has selected a social site for signin. This image shows some good patterns as well as some bad (anti-) patterns. First the good: the user knows what attributes are being requested by the RP from the social site. It is also good that the page shows the user how to change the permissions for user attributes granted to it. The bad is that the user cannot change the user attributes that are selected on first signin. Also, the terms and privacy statement are dense legalese written to protect the site owner. Terms that are hard for the user to understand are not permissible in the EU under the GDPR, which became enforceable in May 2018.

File:Social-combo.png

Personal Information evaluated for use in the Example

Note that the term Individual here means a human being. That should not be considered to prevent the RP from also supporting non-human users.

  1. Categories of data editing enabled for users of data contained in the user database (these categories are shown as part of the database schema, but some fields may change categories as the relationship between the user and RP changes over time):
    1. Required by the operation of the site and not editable,
    2. Required by the site owner but editable by the user,
    3. Optional data editable by the user,
    4. Links from external IdPs that are removable by the user,
    5. Data provided by 3rd parties together with a link to those 3rd parties for redress by the user,
    6. Data provided by 3rd parties that can be marked as in dispute by the user if they cannot alter it.
  2. Personal data evaluated against the above categories is shown in the following model of the user. Where possible, names used in the model match those used in the OpenID Connect standard claims. In those cases the lower-case, underscore-separated name was converted to a user-friendly spelling and the background highlighted in blue. Strictly speaking, identity claims data is ONLY relevant in the AspNetUserLogins table, so this linkage is only for convenience as a taxonomy of the element names.

For more information see this page on User Private Information.

Model of the User in the Database

The goal of the data collected (as a User Object) about the individual and organizational users is to model the Users that are served by the web site with User Private Information intended for release by the users. Field names taken from other standards, like OpenID Connect, are highlighted in blue. This is a practice where the RP can determine the category tag used on each data field in advance based on the needs of the site. It may be necessary for the RP to get user consent, which requires that some fields be dynamically tagged based on user intent or user actions. In those cases a more complex design might be required. For example, a deletion date may need to be associated with some fields.

| Element Name | Contents | Cat | Explanation for category |
|---|---|---|---|
| ID | identifier unique within the db | 1 | required for internal lookups |
| Status | Boolean | 1 | yes = active, no = inactive (entries are never deleted) |
| Type | numeric | 1 | (1) individual, (2) organization (can be a parent), (3) pseudonym (avoid any linkage) |
| Parent | link (exactly one) | 2 | link to an organization in this db; the default parent is the Kantara root entry |
| Member | code | 1 | 0 = none, 1 = member in good standing, 2 = candidate for membership, 3 = suspended or resigned |
| Role | link (zero or more) | 1 | none, registered member, member of the parent member, voting member of the parent, site admin |
| Email Type | | 3 | 0 = untyped, 1 = prohibited, 2 = allowed, 3 = digest only |
| ui locales | URI | 3 | User's preferred languages, represented as a space-separated list of BCP47 [RFC5646] language tag values, ordered by preference. For instance, the value "fr-CA fr en" represents a preference for French as spoken in Canada, then French (without a region designation), followed by English (without a region designation). |
| ZoneInfo | URI | 3 | time zone for the named individual; it has less relevance for orgs |
| Email | URI | 2 | this example assumes that contact with the user is required; also used for recovery. While it is possible for later reuse of the email name to cause a collision, this needs to be considered as unique at the time it was verified. |
| Email Verified | Boolean | 1 | to avoid spam against some email that has been entered without the owner's knowledge |
| Password Hash | hash | 3 | to allow user access local to the web site - only one authN method is required |
| Federated Logins | link (zero or more) | 4 | The example shows how to create these for Google and Facebook |
| Security Stamp | byte string | 1 | |
| Phone Number | string | 3 | not required in this example; also used for recovery with SMS. As with email, this ID can be compromised or reused at a later time, but would be unique when verified. |
| Phone Number Verified | Boolean | 1 | May be required if the telephone number is entered, in order to avoid spamming. |
| Two Factor Enabled | Boolean | 5 | User choice at first; this should be moved to the authN table |
| Lockout end date | Date | 1 | set by the system when lockout occurs - since the user is locked out, this cannot be changed |
| Lockout enabled | Boolean | 1 | Not required on this site - possibly the value could be given to the user |
| Access Failed | count | 1 | site accounting |
| Legal name | alphanumeric | 2 | Since the key is the ID, this can be changed, but only when legal documents are changed as well - it is never displayed except to site admins |
| Display name | alphanumeric | 3 | So that members will know how to refer to each other, aka user or username |
| Stipulation | link to document (zero or more) | 2 | User accepted stipulation on privacy, ToU, IPR or other policy together with any user response |
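
One way to make the category tags enforceable in code is sketched below. The enum names, the action names ("edit", "clear", "remove", "dispute"), and the small field subset are assumptions chosen for illustration; the sketch also folds the two third-party categories from the list above into one, since the table only uses values 1 to 5.

```python
from enum import IntEnum


class Category(IntEnum):
    SITE_REQUIRED = 1   # required by the operation of the site, not editable
    OWNER_REQUIRED = 2  # required by the site owner, editable by the user
    OPTIONAL = 3        # optional data editable by the user
    EXTERNAL_LINK = 4   # link from an external IdP, removable by the user
    THIRD_PARTY = 5     # supplied by a 3rd party; user can only dispute it


# Hypothetical action names; what each category lets the user do.
ALLOWED_ACTIONS = {
    Category.SITE_REQUIRED: frozenset(),
    Category.OWNER_REQUIRED: frozenset({"edit"}),
    Category.OPTIONAL: frozenset({"edit", "clear"}),
    Category.EXTERNAL_LINK: frozenset({"remove"}),
    Category.THIRD_PARTY: frozenset({"dispute"}),
}

# A subset of the fields in the table, tagged in advance by the RP.
FIELD_CATEGORY = {
    "ID": Category.SITE_REQUIRED,
    "Email": Category.OWNER_REQUIRED,
    "Display name": Category.OPTIONAL,
    "Federated Logins": Category.EXTERNAL_LINK,
}


def user_may(action: str, field_name: str) -> bool:
    """Return True if the user is allowed to perform the action on the field."""
    return action in ALLOWED_ACTIONS[FIELD_CATEGORY[field_name]]
```

Keeping the tag in a lookup table rather than in the field logic means a field can move between categories, as the list above anticipates, by changing one entry.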

Site documents and user expressed intent

Note that this record (called the stipulation) allows a range of responses from the user, such as would be the case if the user accepted data sharing with affiliated companies, but not with unrelated parties.

| Element Name | Contents | Cat | Explanation for category |
|---|---|---|---|
| ID | identifier unique within the db | 1 | required for internal lookups |
| User ID | GUID | 1 | Link back to the user record |
| Document | URI | 1 | Link to the document shown to the user |
| Document Validation | hash | 1 | Used to prove the document accepted by the user |
| Doc Type | code | 1 | 0 = unk, 1 = PP, 2 = ToU, 3 = IPR etc. |
| Intent | XML | 2 | The coded response of the user to the document as shown (an alternative would be a JSON object) |
| Date of Intent | Date | 1 | The date posted to the db |
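
A stipulation record along these lines could be sketched as follows. The choice of SHA-256 for the Document Validation hash, the use of the JSON alternative for Intent, and all function and field names are assumptions made for illustration; the source does not specify them.

```python
import hashlib
import json
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

DOC_TYPES = {"unk": 0, "PP": 1, "ToU": 2, "IPR": 3}  # codes from the table


@dataclass(frozen=True)
class Stipulation:
    user_id: str              # GUID of the user record
    document_uri: str         # link to the document shown to the user
    document_validation: str  # hex digest of the exact document bytes shown
    doc_type: int
    intent: str               # user's coded response, serialized as JSON here
    date_of_intent: datetime
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


def record_stipulation(user_id: str, document_uri: str, document_bytes: bytes,
                       doc_type: str, intent: dict) -> Stipulation:
    """Capture what the user was shown and how they responded."""
    return Stipulation(
        user_id=user_id,
        document_uri=document_uri,
        document_validation=hashlib.sha256(document_bytes).hexdigest(),
        doc_type=DOC_TYPES[doc_type],
        # sort_keys gives a stable serialization of the same response.
        intent=json.dumps(intent, sort_keys=True),
        date_of_intent=datetime.now(timezone.utc),
    )


def matches_document(stipulation: Stipulation, document_bytes: bytes) -> bool:
    """Check whether a document is the one the user actually accepted."""
    digest = hashlib.sha256(document_bytes).hexdigest()
    return digest == stipulation.document_validation
```

A structured intent such as {"share_with_affiliates": True, "share_with_unrelated_parties": False} records the kind of partial acceptance described above, and the hash lets the site show later that the policy text has not changed since the user responded.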

Schema used in the Example

Some of these connections are described in theoretical terms on the page Identity Model.

File:IDESG Identity Schema.png

Support for the Kantara Web Site

The Kantara web site has several additional requirements that have been instrumental in shaping the model of user data described above.

Working Notes

Next steps

Set up goals and start to build examples and best practices for all of the roles in an ID ecosystem.

  1. Now that the Kantara Initiative has absorbed the IDEF self assessment, work on the ID ecosystem needs to enable IdPs, RPs and other entities to comply.
  2. Promote a Trustmark with UX collateral, images of various sizes for web sites.
  3. Consider how to handle Attribute Providers as well as Identifier Providers (and how the two might interact).
  4. The Kantara web site itself should be an example of Guidelines.
  5. There is an IdP which shows the Guidelines for that industry. (The role of the IdP is to generate identity tokens. It may consume input from user credentials, but not identity tokens.)
  6. The various Kantara web sites become examples of RP Guidelines.
  7. Move the example IdP into real-world web sites.
  8. Working example of the RP is operational.
  9. The best practice protocol for inter op between IAPs and RPs is the OpenID Connect protocol.

Questions and Answers for Designers and Developers

Will need to build for best practices:

  • Consent Receipt (CR)
  • Privacy policy (PP)
  • Terms of use (ToU)
  • Are there specific ToU and PP provisions that demonstrate how the IDEAL RP might deal with Identities in their policies?
  • IDEF Logos and Marks in vector graphic form for web sites to use.

UX Questions Specific to the RP example code

  1. Demo verification of email (or cell phone) address - will be needed in the future
  2. What are the canonical terms for identification?
    1. logon logout register resign
    2. login logoff create remove
    3. signin signout signup signoff
  3. User roles - how to model - note that one user can have multiple roles

Other issues to look at:

  • Some method for creating a strong web site identity, e.g. Federation or AV Certs
  • Can the Kantara Initiative publish a PP that web sites can include as their own by explicit reference or as a result of using the IDESG TrustMark?
  • Partnering later with a company that implements the RP, or is a provider to the RP (IdP), to write up our plans as a use case
  • Should the example code be available to all comers, or only to members of the IDESG?
  • Recording devices that are under the control of the user together with the device capabilities for data capture and display
  • While all content on this page is covered by the IPR rules, it should be clear that the content on pages linked from this site may have different ownership rights asserted.

References and Coordination

Issues and Comments

  • Any comments, suggestions or issues with the best practices or code can be tracked at this site.
  • General comments about this web page can be made on the "Discussion" tab at the top of the page.