SCIM protocol - 3 suggestions for improvement

Ganesh Prasad

Aug 1, 2012, 9:01:21 AM

to cloud-d...@googlegroups.com

Hi,

My name is Ganesh Prasad, and my experience in Identity and Access Management is mainly through a 3-year project at an Australian insurance company, an experience I have written about as an eBook on InfoQ (http://www.infoq.com/minibooks/Identity-Management-Shoestring).

I have been following the SCIM spec off and on, and based on my experience with a loosely-coupled architecture that I found to be successful, I have the following 3 suggestions to make.

1. The enterprise client and the cloud provider should maintain their own internal IDs for a resource, which they should not reveal to each other. Both of them should map their internal IDs to a shared External ID, and this is the only ID that should be exposed through the API. The current specification's provision of an id (which is the external ID and the only one to be transferred through the API) and an "external ID" (which is the client's internal ID and should be hidden) is diametrically opposite to this.

2. When dealing with multi-valued attributes of a resource (expressed as arrays in JSON), they must be converted from an array into a dictionary with unique keys (UUIDs generated by the cloud provider when the attribute is created). Without unique keys for every attribute value of a resource, manipulating it will be clumsy and inelegant.

3. The PATCH command can be improved in 3 significant ways:

3a. Leverage the fact (from 2 above) that every value has a key, to greatly simplify the API

3b. Use special verbs as nested operations of the PATCH command to add, modify and delete attributes at any level

3c. Use the WebDAV status code of "207 Multi-Status" instead of "200 OK" as the response to a PATCH (or BULK) command.

To elaborate,

1. Revealing private IDs externally is a form of tight coupling. A major requirement with Identity Management is to split (or merge) identities when false positives (or false negatives) are detected, i.e., when a resource is discovered to be more than one, or when multiple resources are detected to be the same. If internal identifiers are revealed to external domains, such clean-ups become difficult. Hence every domain that wants to expose references to a resource must map its internal ID to an external one created for this explicit purpose, and reveal only that.

In the SCIM case, when an enterprise client POSTs a resource creation request, the cloud provider must generate its own internal UUID as well as an external UUID, map them together, and only return the external UUID in the "Location:" header. The enterprise client should map this external UUID to a newly-generated internal ID of its own. In case the resource already has an identifier within the enterprise client's domain, then this is the internal ID that must be mapped to the external UUID returned through the POST response.
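The mapping described above could be sketched roughly as follows. This is only an illustration under assumptions: the IdMap class, its method names and the sample internal IDs ("prov-000123", "emp-42") are hypothetical and do not come from the SCIM spec.

```python
import uuid


class IdMap:
    """Hypothetical two-way table between private internal IDs and shared external IDs."""

    def __init__(self):
        self._external_by_internal = {}
        self._internal_by_external = {}

    def link(self, internal_id, external_id):
        """Record that internal_id corresponds to external_id."""
        self._external_by_internal[internal_id] = external_id
        self._internal_by_external[external_id] = internal_id

    def mint(self, internal_id):
        """Provider side: generate an external UUID for a new internal ID."""
        external_id = str(uuid.uuid4())
        self.link(internal_id, external_id)
        return external_id

    def resolve(self, external_id):
        """Translate an external ID back to this domain's internal ID."""
        return self._internal_by_external[external_id]


# Provider handles the POST: it keeps its internal ID private and
# returns only the external UUID (e.g. in the "Location:" header).
provider = IdMap()
external_id = provider.mint("prov-000123")

# Client maps the returned external UUID to its own internal ID.
client = IdMap()
client.link("emp-42", external_id)
```

Neither side ever sees the other's internal ID, so either side can split or merge its internal records and only needs to update its own table.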

2. If a resource is to be created, and one of its attributes is multi-valued, e.g.,

"email-addrs" :
[
    "john_...@yahoo.com",
    "john....@gmail.com",
    "jsmit...@hotmail.com"
]

then on successful creation, the server response should include the representation of the resource, and this attribute should look like this:

"email-addrs" :
[
    { "7dfcb444-74d8-4f17-aa66-daf9ea3bd902" : "john_...@yahoo.com" },
    { "3bd10085-c474-43b9-9cda-8646c3085bbf" : "john....@gmail.com" },
    { "581da5c7-c6e1-4cca-9db7-7a6d1de664e1" : "jsmit...@hotmail.com" }
]

The client now knows what each value is labelled. This now provides an unambiguous way to reference a value to add, modify and delete it:

Add:

POST /Users/2819c223-7f76-453a-919d-413861904646/email-addrs

value="js...@easy.com.au"

Modify:

PUT /Users/2819c223-7f76-453a-919d-413861904646/email-addrs/3bd10085-c474-43b9-9cda-8646c3085bbf

value="john.r...@gmail.com"

Delete:

DELETE /Users/2819c223-7f76-453a-919d-413861904646/email-addrs/581da5c7-c6e1-4cca-9db7-7a6d1de664e1

One can even delete all email addresses like this:

DELETE /Users/2819c223-7f76-453a-919d-413861904646/email-addrs

I believe this is more elegant than what the spec recommends.
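To make the keyed-value idea concrete, here is a minimal server-side sketch, assuming an in-memory store. The KeyedAttribute class, its method names and the example.com addresses are hypothetical and only mirror the POST, PUT and DELETE examples above.

```python
import uuid


class KeyedAttribute:
    """A multi-valued attribute whose values are addressed by server-generated keys."""

    def __init__(self, values=()):
        # dicts preserve insertion order, so the original ordering survives.
        self._items = {str(uuid.uuid4()): v for v in values}

    def add(self, value):
        """POST to the collection: store the value and return its new key."""
        key = str(uuid.uuid4())
        self._items[key] = value
        return key

    def modify(self, key, value):
        """PUT to an existing key; raises KeyError if the key is unknown."""
        if key not in self._items:
            raise KeyError(key)
        self._items[key] = value

    def delete(self, key):
        """DELETE one value; raises KeyError if the key is unknown."""
        del self._items[key]

    def clear(self):
        """DELETE on the collection itself: remove every value."""
        self._items.clear()

    def representation(self):
        """The list of single-entry {key: value} objects returned to the client."""
        return [{k: v} for k, v in self._items.items()]


emails = KeyedAttribute(["a@example.com", "b@example.com", "c@example.com"])
keys = [next(iter(item)) for item in emails.representation()]

new_key = emails.add("d@example.com")      # Add
emails.modify(keys[1], "b2@example.com")   # Modify
emails.delete(keys[2])                     # Delete
```

Because every value has its own key, none of the operations needs to send or compare the old value to say which entry is meant.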

3. It's possible to think of the operations POST, PUT and DELETE as nested operations inside a PATCH. PATCH itself need not be nested because its semantics apply throughout the "tree" of a resource.

However, the semantics of PUT are a little messy. Also, the use of HTTP verbs at a different level could be confusing. That's why I would recommend 6 separate verbs whose meanings are less ambiguous:

1. INCLUDE (equivalent to POST): Add this resource to a collection and return a generated URI

2. PLACE (equivalent to one form of PUT): Add this resource at the location specified by the accompanying URI. (If there’s already a value at that location, return an error status.)

3. REPLACE (equivalent to another form of PUT): Replace the value at the location specified by the accompanying URI with this value. (If there’s no such URI, return an error status.)

4. FORCE (equivalent to a third form of PUT): This means PLACE or REPLACE. (At the end of this operation, we want the specified URI to hold the accompanying value whether the URI already existed or not.)

5. RETIRE (equivalent to DELETE): Delete, deactivate or otherwise render inaccessible the resource at the specified URI.

6. AMEND (equivalent to PATCH): (This verb is just listed for completeness. We probably don’t need a nested PATCH since PATCH cascades to every level of the tree.)
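The difference between the verbs listed above can be sketched in a few lines. This is an illustration only: it assumes the resource is held as a flat dict keyed by path, and the function names, the Conflict and NotFound exceptions, and the choice to raise an error when RETIRE targets a missing key are all assumptions rather than anything the spec defines.

```python
import uuid


class Conflict(Exception):
    """Raised when PLACE targets a location that already holds a value."""


class NotFound(Exception):
    """Raised when REPLACE or RETIRE targets a location that does not exist."""


def include(store, collection, value):
    """INCLUDE: add value to a collection and return the generated key."""
    key = f"{collection}/{uuid.uuid4()}"
    store[key] = value
    return key


def place(store, key, value):
    """PLACE: create only; it is an error if the key already exists."""
    if key in store:
        raise Conflict(key)
    store[key] = value


def replace(store, key, value):
    """REPLACE: update only; it is an error if the key does not exist."""
    if key not in store:
        raise NotFound(key)
    store[key] = value


def force(store, key, value):
    """FORCE: PLACE or REPLACE, whichever applies."""
    store[key] = value


def retire(store, key):
    """RETIRE: remove the value (assumed here to fail on a missing key)."""
    if key not in store:
        raise NotFound(key)
    del store[key]


user = {"first-name": "John"}
replace(user, "first-name", "Jack")
place(user, "middle-name", "Richard")
force(user, "dob", "01-Jan-1971")
email_key = include(user, "email-addrs", "someone@example.com")
```

Splitting PUT three ways lets the client state its expectation about whether the target already exists, so the server can reject the call when that expectation is wrong.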

A PATCH request could therefore look like this:

PATCH /Users/2819c223-7f76-453a-919d-413861904646 HTTP/1.1
Host: example.com
Accept: application/json
Authorization: Bearer h480djs93hd8
Content-length: ...

{
    REPLACE : {
        "key" : "first-name",
        "value" : "Jack"
    },
    PLACE : {
        "key" : "middle-name",
        "value" : "Richard"
    },
    FORCE : {
        "key" : "dob",
        "value" : "01-Jan-1971"
    },
    REPLACE : {
        "key" : "address.unit-number",
        "value" : "12"
    },
    PLACE : {
        "key" : "address.state",
        "value" : "SA"
    },
    FORCE : {
        "key" : "address.country",
        "value" : "Australia"
    },
    INCLUDE : {
        "key" : "email-addrs",
        "value" : "js...@easy.com.au"
    },
    REPLACE : {
        "key" : "email-addrs/3bd10085-c474-43b9-9cda-8646c3085bbf",
        "value" : "john.r...@gmail.com"
    },
    RETIRE : {
        "key" : "email-addrs/581da5c7-c6e1-4cca-9db7-7a6d1de664e1"
    }
}

The PATCH response should utilise the status code "207 Multi-Status" because the nested operations could have varying status codes. A sample response is below:

HTTP/1.1 207 Multi-Status
Content-Type: application/json
ETag: W/"b431af54f0671a2"
Location: https://example.com/v1/Users/2819c223-7f76-453a-919d-413861904646

{
    "schemas":["urn:scim:schemas:core:1.0"],
    "external-id":"2819c223-7f76-453a-919d-413861904646",
    REPLACE : {
        "status" : "200 OK",
        "key" : "first-name",
        "value" : "Jack"
    },
    PLACE : {
        "status" : "200 OK",
        "key" : "middle-name",
        "value" : "Richard"
    },
    FORCE : {
        "status" : "200 OK",
        "key" : "dob",
        "value" : "01-Jan-1971"
    },
    REPLACE : {
        "status" : "200 OK",
        "key" : "address.unit-number",
        "value" : "12"
    },
    PLACE : {
        "status" : "200 OK",
        "key" : "address.state",
        "value" : "SA"
    },
    FORCE : {
        "status" : "200 OK",
        "key" : "address.country",
        "value" : "Australia"
    },
    INCLUDE : {
        "status" : "201 Created",
        "key" : "email-addrs/11f664ec-898b-4f6f-8948-ecfda74deff0",
        "value" : "js...@easy.com.au"
    },
    REPLACE : {
        "status" : "200 OK",
        "key" : "email-addrs/3bd10085-c474-43b9-9cda-8646c3085bbf",
        "value" : "john.r...@gmail.com"
    },
    RETIRE : {
        "status" : "200 OK",
        "key" : "email-addrs/581da5c7-c6e1-4cca-9db7-7a6d1de664e1"
    },
    "meta": {
        "created":"2011-08-08T04:56:22Z",
        "lastModified":"2011-08-08T08:00:12Z",
        "location":"https://example.com/v1/Users/2819c223-7f76-453a-919d-413861904646",
        "version":"W\/\"b431af54f0671a2\""
    }
}

If there are errors, they will take the place of the "200 OK" or "201 Created" status codes in the above successful case. But the outer status will remain "207 Multi-Status".
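A rough sketch of how a server might assemble such a response is below. Several things in it are assumptions made for illustration: the operations arrive as an ordered list of (verb, key, value) tuples, the resource is a flat dict keyed by path, the error codes are "409 Conflict", "404 Not Found" and "400 Bad Request", and a failed operation does not roll back the ones before it.

```python
import uuid


def apply_patch(store, operations):
    """Apply nested operations in order and collect one status per operation.

    Returns the outer status line and the list of per-operation results.
    """
    results = []
    for verb, key, value in operations:
        exists = key in store
        if verb == "INCLUDE":
            key = f"{key}/{uuid.uuid4()}"      # generated key inside the collection
            store[key] = value
            status = "201 Created"
        elif verb == "PLACE" and exists:
            status = "409 Conflict"            # assumed error code
        elif verb in ("REPLACE", "RETIRE") and not exists:
            status = "404 Not Found"           # assumed error code
        elif verb == "RETIRE":
            del store[key]
            status = "200 OK"
        elif verb in ("PLACE", "REPLACE", "FORCE"):
            store[key] = value
            status = "200 OK"
        else:
            status = "400 Bad Request"         # unknown verb, assumed error code
        results.append({"verb": verb, "status": status, "key": key})
    # The outer status stays the same whether or not individual operations failed.
    return "207 Multi-Status", results


user = {"first-name": "John"}
outer, results = apply_patch(user, [
    ("REPLACE", "first-name", "Jack"),
    ("PLACE", "first-name", "Jim"),        # fails: value already present
    ("RETIRE", "nickname", None),          # fails: nothing to retire
    ("INCLUDE", "email-addrs", "someone@example.com"),
])
```

The client reads the outer 207 as "look inside", then checks each nested status to see which operations took effect.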

The same scheme can be used to deal with operations on members of a group, and for bulk operations.

I hope you find these suggestions useful.

I read the SCIM spec afresh last week and these ideas came flooding into my head because I have been working at another organisation (a telco) for the last 5 months, also in Identity and Access Management, and my thoughts have moved further along the direction of evolving a specialised data model based on specific principles, especially for IAM.

I am planning to write about this and also the data-related principles soon and am in negotiations with InfoQ regarding publication.

Regards,

Ganesh Prasad

Leif Johansson

Aug 1, 2012, 11:17:28 AM

to cloud-d...@googlegroups.com

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/01/2012 03:01 PM, Ganesh Prasad wrote:
> Hi,
>
> My name is Ganesh Prasad, and my experience in Identity and Access

Hi Ganesh,

You should consider joining the SCIM WG in the IETF - that's where
the work going ahead will be focused.

Cheers Leif (scim co-chair)

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlAZSHwACgkQ8Jx8FtbMZne8ZgCfQvbpBZBPkgwke01QjDxVUlid
YqoAnRpUuGveoLzxv0zARGZZRVDyPOgf
=eeQD
-----END PGP SIGNATURE-----

Peter Saint-Andre

Aug 1, 2012, 11:37:19 AM

to cloud-d...@googlegroups.com, Leif Johansson

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 8/1/12 9:17 AM, Leif Johansson wrote:
> On 08/01/2012 03:01 PM, Ganesh Prasad wrote:
>> Hi,
>
>> My name is Ganesh Prasad, and my experience in Identity and
>> Access
>
> Hi Ganesh,
>
> You should consider joining the SCIM WG in the IETF - thats where
> the work going ahead will be focused.

Yes, please provide feedback on the sc...@ietf.org list:

https://www.ietf.org/mailman/listinfo/scim

Peter

P.S. You don't really need to "join" the SCIM WG because IETF working
groups don't have members -- just subscribe to the mailing list and
post there. :)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.18 (Darwin)

Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlAZTS8ACgkQNL8k5A2w/vwTDwCfYizigQik1KXmNSBzcTl+Yd62
n88AoIctOEkMx0mkEhDMY1yOPKqOEAtN
=Go5H
-----END PGP SIGNATURE-----

Trey Drake

Aug 1, 2012, 2:30:52 PM

to cloud-d...@googlegroups.com

As others have noted, if you want to influence the spec then join the sc...@ietf.org mailing list rather than writing about SCIM in an InfoQ paper.

As for the ID comment:

The Service Provider *may*, and likely will, establish an id for internal purposes other than the id minted at time of creation. The Service Provider minted id is the shareable, stable id. I don't see any reason why the consumer-specified externalId must remain private. In any case, the externalId is a suggestion (read: optional), offered as a useful facility for searches by the consumer.
