PeerPoint Requirements Definition -- Section 1. Identity Management

Poor Richard

Jul 10, 2012, 9:29:19 AM

to building-a-distributed...@googlegroups.com

[The following is a new draft addition to the PeerPoint Open Requirements Definition and Design Specification Proposal. The PeerPoint project is an open and collaborative effort to develop requirements, standards, and specifications for peer-to-peer internet technologies that will promote fair and sustainable societies.]

PeerPoint Identity Management

The first step in defining the problem space of identity management is to define identity. What is it? From The Free Dictionary (tfd.com):

identity: 1. The collective aspect of the set of characteristics by which a thing is definitively recognizable or known

PeerPoint Terms and Definitions

entity: anything that has a definite, recognizable identity, whether a person, group, organization, place, object, computer, mobile device, concept, etc.

Identity conceptual view (credit: Wikipedia)
attribute: any characteristic, property, quality, trait, etc. that is inherent in or attributed to an entity. An entity has one or more attributes and an attribute has one or more values. For example "the sky (entity) has color (attribute) of blue (value)." This entity-attribute-value (EAV) model corresponds to the "triple" of the Resource Description Framework (RDF), where the three parts are called subject, predicate, and object. An attribute (which is also a kind of entity) may have attributes of its own. These are often logically nested in a hierarchical fashion. For example, an address may be an attribute of a company but also an entity with attributes of street, city, state, etc. An entity may have multiple instances of the same attribute, such as multiple aliases or addresses. (Different programming languages, protocols, frameworks, and applications may organize the entity-attribute-value model differently, or use different terms such as object for entity or property for attribute, but this is probably the most generic approach.)
Rdf-graph3 (Photo credit: Wikipedia)

identity: a definitive and recognizable set of attribute-value pairs (or entity-attribute-value triples) for a particular entity. The set of attribute-value pairs may be partial or exhaustive, depending on the intended purpose of the identity construct.
identification (ID): a dataset (value, record, file, etc.) which represents the most concise amount of information required to specify a particular entity and distinguish it from others. An ID may be local to a particular context, such as a company employee ID or inventory number, or it may be universal. Examples of universal IDs are Global Trade Item Numbers (GTINs) and uniform resource identifiers (URIs). The ID typically consists of a smaller quantity of data than the full identity dataset and only represents or refers to the full identity.
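
The entity-attribute-value model defined above can be sketched in a few lines of Python. This is only an illustration under assumed names, not part of the PeerPoint specification: the `Triple` type, the `describe` helper, and the sample entity IDs such as "acme-address-1" are all invented for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One entity-attribute-value statement (hypothetical type for illustration)."""
    entity: str
    attribute: str
    value: str


def describe(triples, entity):
    """Collect every attribute of `entity` into {attribute: [values]}."""
    description = defaultdict(list)
    for t in triples:
        if t.entity == entity:
            description[t.attribute].append(t.value)
    return dict(description)


triples = [
    Triple("sky", "color", "blue"),
    # An attribute value can itself be an entity with its own attributes.
    Triple("acme", "address", "acme-address-1"),
    Triple("acme-address-1", "city", "Springfield"),
    Triple("acme-address-1", "state", "IL"),
    # Multiple instances of the same attribute are allowed.
    Triple("acme", "alias", "Acme Co."),
    Triple("acme", "alias", "ACME Corporation"),
]

acme = describe(triples, "acme")
```

Here the ID "acme" is the short handle, while the dictionary returned by `describe` is the (partial) identity it refers to.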

Identity management problem space

The PeerPoint requirements will explore various parts of the Identity Management problem space, all of which overlap or interpenetrate each other:

description
classification
identity provisioning and discovery (directory services, including identity & directory linking, mapping, and federation)
authentication (validation, verification, security token service)
authorization (access control, role-based access control, single sign on)
security (anonymity, vulnerabilities, risk management)

1. Description

Description is meant here in its most general sense as the entire set of attributes and values that describe an entity, and not simply a "description" box or field in a record. This is the aspect of identity management which establishes the attributes and values by which an entity is typically recognizable or known in a particular context. A description can attempt to be exhaustive, but in most cases it is only as complete as required for its intended purpose in a given application.

PeerPoint requirements

Identity management functions should be consistent across all PeerPoint applications, so the requirements should be implemented as part of a PeerPoint system library from which all applications, middleware, APIs, etc. can call the necessary functions. Interfaces or connectors must be provided for non-PeerPoint-compatible systems.
There are many methods in existing software applications, protocols, and frameworks to describe the identity of entities. The PeerPoint identity management solutions must inter-operate with as many of these as possible. For that reason the PeerPoint descriptions of entities must be as generic, modular, composable, and extensible (open-ended) as possible.
PeerPoint user interfaces (UI) must allow users to extend and customize entity descriptions in as intuitive a manner as possible without reducing or destroying the interoperability of the descriptions with those of other platforms. One approach is to provide user input forms with the most common or universal attributes for various types of entities, combined with fields for additional user-defined attribute-value pairs as well as simple tags.
In both standardized and customizable parts of entity descriptions, the UI should provide as much guidance as possible about the most typical names and/or value ranges for attributes without locking the user in to these "preferred" or popular choices.
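
One possible shape for such an extensible description is sketched below. It assumes a small, hypothetical set of "common" attributes for a person entity (`COMMON_PERSON_ATTRIBUTES`) and keeps user-defined attribute-value pairs and tags separate, so that the standardized part can still be exported to other platforms. None of these names come from an existing PeerPoint library.

```python
from dataclasses import dataclass, field

# Hypothetical "common" attributes a form might offer for a person entity.
COMMON_PERSON_ATTRIBUTES = ("name", "email", "location")


@dataclass
class EntityDescription:
    entity_type: str
    standard: dict = field(default_factory=dict)  # well-known attributes
    custom: dict = field(default_factory=dict)    # user-defined pairs
    tags: set = field(default_factory=set)        # free-form tags

    def set_attribute(self, attribute, value):
        """Store well-known attributes separately so other platforms can map them."""
        key = attribute.strip().lower()
        if key in COMMON_PERSON_ATTRIBUTES:
            self.standard[key] = value
        else:
            self.custom[key] = value

    def export_interoperable(self):
        """Return only the part other platforms are expected to understand."""
        return {"type": self.entity_type, **self.standard}


profile = EntityDescription("person")
profile.set_attribute("Name", "Poor Richard")
profile.set_attribute("favorite almanac", "1733 edition")
profile.tags.add("p2p")
```

The user is never prevented from adding "favorite almanac"; it simply lands in the custom part instead of the interoperable one.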

One of the most basic entities in social networking systems is the person, member, or user account. The identity description for such an entity is commonly called a "user profile." User profiles are also found in most applications that involve online collaboration. The most primitive form of user account consists of a user ID (or UID) and a password, where both the ID and password are simple alphanumeric strings. But increasingly, user accounts for social and collaborative applications include elaborate user profiles. Facebook is a good example, having one of the most extensive user profiles of any internet application.

This is a partial screenshot of Poor Richard's Facebook Profile:

The information in a Facebook User Profile is organized into numerous logical categories. Some not shown above include the user's friends, Facebook groups to which the user belongs, and a personal library of documents and images. Other profile sections include unlimited free-form text.

Many of the profile data categories such as "Arts and Entertainment" may include unlimited numbers of "likes" or tags. These are added via an intuitive interface in which the user begins typing something such as a-r-e-t-h-a- -f-r-a-n-k... and as the user types, a list of matching tags is displayed and continuously updated with each keystroke, showing possible matches from the Facebook database. If no match is found by the end of typing, the entered tag label is displayed as-is with a generic icon. Facebook's database of entities in the various categories is created and maintained primarily by Facebook users who create Facebook "pages" for people, groups, companies, products, movies, authors, artists, etc.
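
The type-ahead behavior described above can be approximated with a prefix search over a sorted list of known labels. The sketch below is an assumption about how such a lookup might work, not a description of Facebook's implementation. The `suggest` function and the sample labels are invented for illustration.

```python
from bisect import bisect_left


def suggest(sorted_labels, typed, limit=5):
    """Return up to `limit` labels that start with what the user has typed.

    `sorted_labels` must be case-folded and sorted.
    """
    prefix = typed.casefold()
    if not prefix:
        return []
    # Jump to the first label that could match, then scan forward.
    start = bisect_left(sorted_labels, prefix)
    matches = []
    for label in sorted_labels[start:]:
        if not label.startswith(prefix):
            break
        matches.append(label)
        if len(matches) == limit:
            break
    return matches


labels = sorted(
    s.casefold()
    for s in ["Aretha Franklin", "Arcade Fire", "Art Blakey", "Benny Goodman"]
)

# Called once per keystroke; the list narrows as the prefix grows.
after_two_keys = suggest(labels, "ar")
after_three_keys = suggest(labels, "are")

# No match: the caller would keep the typed text as a new, unlinked tag.
no_match = suggest(labels, "zydeco")
```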

Other social network sites have profile features not found in the Facebook User Profile. Google+ adds a feature to the "friends" data category called "circles" and a homepage feature called "hangouts." Google+ users can organize friends into user-defined categories called circles that inter-operate with other Google apps, and can create live audio-video chat groups with user-defined membership. LinkedIn has additional profile data categories for resumes, CVs, and employment references, recommendations, or testimonials.

In addition to users, on various social networks accounts may be created for special-interest groups, fan clubs, companies, organizations, and topic pages of all kinds. The structures of the profiles for different types of accounts on different networks vary widely.

Very limited, generic profiles are also hosted by services such as Gravatar and About.me.

Sample Gravatar profile:

OpenID Simple Registration is an extension to the OpenID Authentication protocol that allows for very light-weight profile exchange. It is designed to pass eight commonly requested pieces of information when an End User goes to register a new account with a web service.

Gravatar and OpenID SR are simple examples of what PeerPoint will call a meta-profile.

PeerPoint requirements:

the capability to create and maintain meta-profiles for any type of entity
intuitive user interface for creating, customizing, and maintaining meta-profiles
allow the creator of a profile to determine where any portion of it is stored and with whom any portion of it is shared
capability to synchronize the PeerPoint meta-profile with profiles in non-PeerPoint applications
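
As a rough sketch of the third requirement above (the creator decides where each portion is stored and with whom it is shared), a meta-profile could attach a storage location and an audience to every field. The class names, the "public" audience marker, and the sample values are assumptions made for this example only.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileField:
    value: str
    stored_at: str = "local"                        # where this portion lives
    shared_with: set = field(default_factory=set)   # who may read it


class MetaProfile:
    def __init__(self, owner):
        self.owner = owner
        self.fields = {}

    def put(self, name, value, stored_at="local", shared_with=()):
        """Add or replace one field together with its storage and sharing choices."""
        self.fields[name] = ProfileField(value, stored_at, set(shared_with))

    def view_for(self, viewer):
        """Return only the fields the owner chose to share with `viewer`."""
        return {
            name: f.value
            for name, f in self.fields.items()
            if viewer == self.owner
            or viewer in f.shared_with
            or "public" in f.shared_with
        }


profile = MetaProfile(owner="poor-richard")
profile.put("nickname", "Poor Richard", shared_with={"public"})
profile.put("email", "pr@example.org", stored_at="home-server", shared_with={"melvin"})
profile.put("birthdate", "1970-01-01")  # shared with nobody

public_view = profile.view_for("stranger")
friend_view = profile.view_for("melvin")
```

Synchronizing with a non-PeerPoint profile would then amount to exporting one such view in the format the other system expects.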

2. Classification: "people, places and things"

Different kinds of entities have different kinds of descriptions, so an important part of the identity management problem is the problem of sorting things into various categories. Sorting things into categories or classes is often called categorization or classification. Classification systems are often called taxonomies. Examples might include the index of an encyclopedia, a library card catalog, or a glossary of internet terms.

In the case of information systems, the term ontology means "a rigorous and exhaustive organization of some knowledge domain that is usually hierarchical and contains all the relevant entities and their relations" (tfd.com). Wikipedia says: "An ontology renders shared vocabulary and taxonomy which models a domain with the definition of objects and/or concepts and their properties and relations. Ontologies are the structural frameworks for organizing information and are used in artificial intelligence, the Semantic Web, systems engineering, software engineering, biomedical informatics, library science, enterprise bookmarking, and information architecture as a form of knowledge representation about the world or some part of it. The creation of domain ontologies is also fundamental to the definition and use of an enterprise architecture framework."

Another related term in information systems is namespace, often used in relation to wiki structures and directory services.

In identity management, two of the main systems of categories, or taxonomies, would be categories of entities and categories of attributes. Attributes are themselves categories of values (the attribute "color" is a category of colors: red, blue, green, etc.).

Examples of high-level categories of entities might include:

people
groups
organizations
places
internet technologies
devices

Examples of very high-level categories of attributes could include:

These taxonomies become semantic web ontologies when they are defined in machine-readable protocols such as:

Resource Description Framework (RDF)
Web Ontology Language (OWL)
Extensible Markup Language (XML)
Simple Object Access Protocol (SOAP)
Description of a Project (DOAP) (an RDF schema and XML vocabulary to describe software projects)
Service Provisioning Markup Language (SPML) (an XML-based framework, being developed by OASIS, for exchanging user, resource, and service provisioning information between cooperating organizations)
Friend of a friend (FOAF) (a machine-readable ontology describing persons, their activities, and their relations to other people and objects)
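
To make "machine-readable" concrete, the sketch below writes a tiny FOAF description as N-Triples, one of the plain-text serializations of RDF. The FOAF and RDF namespace IRIs are the published ones. The example.org people and the helper functions `iri`, `literal`, and `to_ntriples` are invented for illustration, and the `literal` helper only handles the most common escapes.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal (handles backslash, quote, newline only)."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical people, identified by HTTP URIs.
alice = "http://example.org/people/alice#me"
bob = "http://example.org/people/bob#me"

triples = [
    (iri(alice), iri(RDF_TYPE), iri(FOAF + "Person")),
    (iri(alice), iri(FOAF + "name"), literal("Alice Example")),
    (iri(alice), iri(FOAF + "knows"), iri(bob)),
]

document = to_ntriples(triples)
```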

Linked Data

One great advantage of machine-readable ontologies is the ability to semantically link data across the web.

Linking open-data community project

The goal of the W3C Semantic Web Education and Outreach group's Linking Open Data community project is to extend the Web with a data commons by publishing various open datasets as RDF on the Web and by setting RDF links between data items from different data sources. In October 2007, datasets consisted of over two billion RDF triples, which were interlinked by over two million RDF links. By September 2011 this had grown to 31 billion RDF triples, interlinked by around 504 million RDF links. There is also an interactive visualization of the linked data sets to browse through the cloud.
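
An RDF link between two datasets is often an owl:sameAs statement saying that two identifiers name the same thing. The sketch below shows, under simplified assumptions, how a consumer might use such links to pool statements from different sources. The a.example and b.example identifiers, the population predicate, and the `merge_linked` helper are hypothetical. The owl:sameAs and rdfs:label IRIs are the standard ones.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def merge_linked(triples):
    """Group identifiers connected by owl:sameAs and pool their other statements."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # First pass: join identifiers that are declared to be the same.
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            parent[find(s)] = find(o)

    # Second pass: file every other statement under its group's representative.
    merged = {}
    for s, p, o in triples:
        if p != OWL_SAME_AS:
            merged.setdefault(find(s), set()).add((p, o))
    return merged


triples = [
    # Statement from a hypothetical dataset A.
    ("http://a.example/city/berlin", RDFS_LABEL, "Berlin"),
    # Statement from a hypothetical dataset B.
    ("http://b.example/place/42", "http://b.example/vocab/population", "3500000"),
    # The RDF link that connects the two datasets.
    ("http://a.example/city/berlin", OWL_SAME_AS, "http://b.example/place/42"),
]

merged = merge_linked(triples)
```

Without the third triple, the two statements would stay under separate identifiers and nothing would connect them.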

Dataset instance and class relationships

Clickable diagrams that show the individual datasets and their relationships within the DBpedia-spawned LOD cloud, as shown by the figures to the right, are:

3. Identity provisioning and discovery (directory services, including identity & directory linking, mapping, and federation)

(requirements to be determined)

No center or hub (identityblog.com)

4. Authentication (validation, verification, security token service)

(requirements to be determined)

5. Authorization (access control, role-based access control, single sign on)

(requirements to be determined)

6. Security (anonymity, vulnerabilities, risk management)

(requirements to be determined)

LAWS OF IDENTITY IN BRIEF

1. User Control and Consent

Digital identity systems must only reveal information identifying a user with the user’s consent.

2. Limited Disclosure for Limited Use

The solution which discloses the least identifying information and best limits its use is the most stable, long-term solution.

3. The Law of Fewest Parties

Digital identity systems must limit disclosure of identifying information to parties having a necessary and justifiable place in a given identity relationship.

4. Directed Identity

A universal identity metasystem must support both “omnidirectional” identifiers for use by public entities and “unidirectional” identifiers for private entities, thus facilitating discovery while preventing unnecessary release of correlation handles.

5. Pluralism of Operators and Technologies

A universal identity metasystem must channel and enable the interworking of multiple identity technologies run by multiple identity providers.

6. Human Integration

A unifying identity metasystem must define the human user as a component integrated through protected and unambiguous human-machine communications.

7. Consistent Experience Across Contexts

A unifying identity metasystem must provide a simple consistent experience while enabling separation of contexts through multiple operators and technologies.
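
The "unidirectional" identifiers of the Directed Identity law can be illustrated with pairwise identifiers: the user presents a different, stable identifier to each party, so two parties cannot correlate their records by comparing IDs. The sketch below derives such identifiers with an HMAC over the party's address. It is one possible technique, not something the laws prescribe, and the `pairwise_id` function and example addresses are assumptions.

```python
import hashlib
import hmac
import secrets


def pairwise_id(user_secret: bytes, relying_party: str) -> str:
    """Derive a stable identifier that is specific to one relying party."""
    return hmac.new(
        user_secret, relying_party.encode("utf-8"), hashlib.sha256
    ).hexdigest()


# Secret held only by the user (or the user's identity agent).
user_secret = secrets.token_bytes(32)

id_at_shop = pairwise_id(user_secret, "https://shop.example")
id_at_forum = pairwise_id(user_secret, "https://forum.example")
```

A public entity that wants to be discoverable would instead publish a single "omnidirectional" identifier, such as an HTTP URI.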

Additional Information Management Resources:

Glossary of Semantic Technology Terms (mkbergman.com)
The Laws of Identity
The National Strategy for Trusted Identities in Cyberspace (NSTIC)
No hub. No center.
Federated Identity Management in Cloud Computing (clean-clouds.com)
Reimagining Active Directory for the Social Enterprise
The Open Group and MIT Experts Detail New Advances in ID Management (sys-con.com)
What is OpenID Connect? "OpenID Connect is a suite of lightweight specifications that provide a framework for identity interactions via RESTful APIs. The simplest deployment of OpenID Connect allows for clients of all types including browser-based, mobile, and javascript clients, to request and receive information about identities and currently authenticated sessions. The specification suite is extensible, allowing participants to optionally also support encryption of identity data, discovery of the OpenID Provider, and advanced session management, including logout."
Security Assertion Markup Language (SAML)
MIT Core ID Project Site "The increase dependence today of citizens on the IT and telecoms infrastructure for their day-to-day activities points to the crucial need for an “identity infrastructure” that offers an ecosystem in which digital identities can be created, managed and destroyed in a practical manner. Such an identity ecosystem must support digital identities which maintain the privacy of the human person associated with the identity, and allows the human person to personalize their identity according to their needs."
The Jericho Forum Identity Commandments "define the principles that must be observed when planning an identity eco-system. Whilst building on “good practice”, these commandments specifically address those areas that will allow “identity” processes to operate on a global, de-perimeterised scale; this necessitates open and interoperable standards and a commitment to implement such standards by both identity providers and identity consumers."
Access governance: Identity management gets down to business; NetIQ integrates former Novell IDM tools (securitybistro.com)
OIX Open Identity Exchange "Building trust in online identity"

Melvin Carvalho

Jul 10, 2012, 9:45:04 AM

to building-a-distributed...@googlegroups.com

Wow! What a fantastic writeup. Not least because there's so much noise to signal in this space, and your knife has brilliantly cut through the majority of it.

I jotted some notes on identity some time back, but they now seem basic in comparison to your thorough analysis.

http://melvincarvalho.com/blog/the-five-stars-of-web-identity/

The gist is to use HTTP URIs for identity (as Facebook does). The Facebook Open Graph protocol is a decent template to start with. But your post has shown that so much more can be achieved. KUTGW!

Poor Richard

Jul 12, 2012, 5:50:41 AM

to building-a-distributed...@googlegroups.com

Melvin,

Your positive comment is very much appreciated.

I added your "5 stars of web identity" to the resource list, and will probably work it in further as a recap of some of the higher-level identity management requirements.

As you can see, four of the six main subsections are still "requirements to be determined."

Please feel free to add your ideas.

PR

Melvin Carvalho

Jul 12, 2012, 9:10:28 AM

to building-a-distributed...@googlegroups.com

On 10 July 2012 15:29, Poor Richard <poor.r...@gmail.com> wrote:

Google Social Graph API is quite good but sadly shutting down.

These guys are working on identity search http://www.foaf-search.net/

> (requirements to be determined)
>
> No center or hub (identityblog.com)
> 4. Authentication (validation, verification, security token service)
>
> (requirements to be determined)

*work in progress*

Some pointers to authentication:

http://www.w3.org/community/rww/wiki/Authentication

> 5. Authorization (access control, role-based access control, single sign on)

*needs fleshing out*

Some pointers to Authz (incomplete)

http://www.w3.org/community/rww/wiki/AccessControl

> (requirements to be determined)
> 6. Security (anonymity, vulnerabilities, risk management)
>
> (requirements to be determined)

Security can never be 100%. It is often based on trust and reputation, which we need for the web (without gatekeepers).

Security by obscurity is an important principle. Privacy could be thought of as a form of security.

Poor Richard

Jul 14, 2012, 1:27:29 AM

to building-a-distributed...@googlegroups.com

Melvin,

Thanks for the input. In future feel free to edit the Google Doc https://docs.google.com/document/d/1TkAUpUxdfKGr_5Qio2SlZcnBu_sgnZWdoVTZuD_Regs/edit#

> Google Social Graph API is quite good but sadly shutting down.

Sometimes Google puts its discards in the public domain. Does anybody know where Social Graph API may wind up?

> These guys are working on identity search http://www.foaf-search.net/

Great. I added this as well as your other comments and links.

PR

On Thursday, July 12, 2012 8:10:28 AM UTC-5, melvincarvalho wrote:
