Hi,
I have been following the SCIM spec off and on, and based on my experience with a loosely-coupled architecture that I found to be successful, I have the following 3 suggestions to make.
1. The enterprise client and the cloud provider should maintain their own internal IDs for a resource, which they should not reveal to each other. Both of them should map their internal IDs to a shared External ID, and this is the only ID that should be exposed through the API. The current specification's provision of an id (which is the external ID and the only one to be transferred through the API) and an "external ID" (which is the client's internal ID and should be hidden) is diametrically opposite to this.
2. When dealing with multi-valued attributes of a resource (expressed as arrays in JSON), they must be converted from an array into a dictionary with unique keys (UUIDs generated by the cloud provider when the attribute is created). Without unique keys for every attribute value of a resource, manipulating it will be clumsy and inelegant.
3. The PATCH command can be improved in 3 significant ways:
3a. Leverage the fact (from 2 above) that every value has a key, to greatly simplify the API
3b. Use special verbs as nested operations of the PATCH command to add, modify and delete attributes at any level
3c. Use the WebDAV status code of "207 Multi-Status" instead of "200 OK" as the response to a PATCH (or BULK) command.
To elaborate,
1. Revealing private IDs externally is a form of tight coupling. A major requirement in Identity Management is the ability to split (or merge) identities when false positives (or false negatives) are detected, i.e., when what was thought to be one resource turns out to be several, or when multiple resources turn out to be the same. If internal identifiers are revealed to external domains, such clean-ups become difficult. Hence every domain that wants to expose references to a resource must map its internal ID to an external one created for this explicit purpose, and reveal only the external one.
In the SCIM case, when an enterprise client POSTs a resource creation request, the cloud provider must generate its own internal UUID as well as an external UUID, map them together, and only return the external UUID in the "Location:" header. The enterprise client should map this external UUID to a newly-generated internal ID of its own. In case the resource already has an identifier within the enterprise client's domain, then this is the internal ID that must be mapped to the external UUID returned through the POST response.
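A minimal sketch of this mapping in Python may make the flow clearer. The class name ExternalIdMap, its methods, and the internal IDs "prov-row-17" and "emp-0042" are all made up for illustration; nothing here comes from the SCIM spec.

```python
import uuid


class ExternalIdMap:
    """Maps one party's private internal IDs to shared external IDs."""

    def __init__(self):
        self._internal_to_external = {}
        self._external_to_internal = {}

    def register(self, internal_id, external_id=None):
        """Bind an internal ID to an external ID, minting a UUID if none is given."""
        if external_id is None:
            external_id = str(uuid.uuid4())
        self._internal_to_external[internal_id] = external_id
        self._external_to_internal[external_id] = internal_id
        return external_id

    def to_internal(self, external_id):
        return self._external_to_internal[external_id]

    def to_external(self, internal_id):
        return self._internal_to_external[internal_id]

    def remap(self, external_id, new_internal_id):
        """Point an existing external ID at a different internal record,
        e.g. after a merge or split, without the other party noticing."""
        old_internal = self._external_to_internal.pop(external_id)
        del self._internal_to_external[old_internal]
        self.register(new_internal_id, external_id)


# Each side keeps its own map; only the external ID crosses the API.
provider = ExternalIdMap()
client = ExternalIdMap()

# Provider handles the POST: mints the external UUID for its internal record
# and would return it in the "Location:" header.
external_id = provider.register("prov-row-17")

# Client maps the returned external UUID to its own (hypothetical) internal ID.
client.register("emp-0042", external_id)

# Later the provider cleans up its own data; the shared ID is unchanged.
provider.remap(external_id, "prov-row-99")
```

Because neither side ever sees the other's internal ID, the remap at the end is invisible across the API, which is the loose coupling argued for above.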
2. If a resource is to be created, and one of its attributes is multi-valued, e.g.,
then on successful creation, the server response should include the representation of the resource, and this attribute should look like this:
The client now knows the key assigned to each value. This provides an unambiguous way to reference a value in order to add, modify or delete it:
Add:
POST /Users/2819c223-7f76-453a-919d-413861904646/email-addrs
Modify:
PUT /Users/2819c223-7f76-453a-919d-413861904646/email-addrs/3bd10085-c474-43b9-9cda-8646c3085bbf
Delete:
DELETE /Users/2819c223-7f76-453a-919d-413861904646/email-addrs/581da5c7-c6e1-4cca-9db7-7a6d1de664e1
One can even delete all email addresses like this:
DELETE /Users/2819c223-7f76-453a-919d-413861904646/email-addrs
I believe this is more elegant than what the spec recommends.
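As a rough illustration of why keyed values are easier to manipulate, here is a small Python sketch. The helper key_values and the example.com addresses are hypothetical, and the comments only indicate which of the requests above each step would correspond to.

```python
import uuid


def key_values(values):
    """Turn a list of attribute values into a dict keyed by fresh UUIDs."""
    return {str(uuid.uuid4()): value for value in values}


# The provider converts the submitted array into a keyed dictionary.
user = {"email-addrs": key_values(["a@example.com", "b@example.com"])}
emails = user["email-addrs"]
first_key, second_key = list(emails)

# Add: POST .../email-addrs (the provider generates the new key)
new_key = str(uuid.uuid4())
emails[new_key] = "c@example.com"

# Modify: PUT .../email-addrs/{key}
emails[first_key] = "a2@example.com"

# Delete: DELETE .../email-addrs/{key}
del emails[second_key]

# Delete all: DELETE .../email-addrs would amount to emails.clear()
```

Each operation addresses exactly one value by its key, so there is no need to match on value content or array position.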
3. It's possible to think of the operations POST, PUT and DELETE as nested operations inside a PATCH. PATCH itself need not be nested because its semantics apply throughout the "tree" of a resource.
However, the semantics of PUT are a little messy. Also, the use of HTTP verbs at a different level could be confusing. That's why I would recommend 6 separate verbs that are a little more unambiguous in their meaning:
1. INCLUDE (equivalent to POST): Add this resource to a collection and return a generated URI.
2. PLACE (equivalent to one form of PUT): Add this resource at the location specified by the accompanying URI. (If there’s already a value at that location, return an error status.)
3. REPLACE (equivalent to another form of PUT): Replace the value at the location specified by the accompanying URI with this value. (If there’s no such URI, return an error status.)
4. FORCE (equivalent to a third form of PUT): This means PLACE or REPLACE. (At the end of this operation, we want the specified URI to hold the accompanying value whether the URI already existed or not.)
5. RETIRE (equivalent to DELETE): Delete, deactivate or otherwise render inaccessible the resource at the specified URI.
6. AMEND (equivalent to PATCH): (This verb is just listed for completeness. We probably don’t need a nested PATCH since PATCH cascades to every level of the tree.)
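The difference between the verbs is easiest to see in code. The following Python sketch applies them to a plain dictionary. The function names, the sample data, and the specific error codes ("409 Conflict", "404 Not Found") are my assumptions; the list above only says "return an error status". The error on RETIRE of a missing key is likewise an assumption.

```python
import uuid


def include(collection, value):
    """INCLUDE: add the value under a generated key and return that key."""
    key = str(uuid.uuid4())
    collection[key] = value
    return "201 Created", key


def place(store, key, value):
    """PLACE: create only; fail if the key is already occupied."""
    if key in store:
        return "409 Conflict", key  # assumed error code
    store[key] = value
    return "200 OK", key


def replace(store, key, value):
    """REPLACE: update only; fail if the key does not exist."""
    if key not in store:
        return "404 Not Found", key  # assumed error code
    store[key] = value
    return "200 OK", key


def force(store, key, value):
    """FORCE: PLACE or REPLACE, whichever applies."""
    store[key] = value
    return "200 OK", key


def retire(store, key):
    """RETIRE: remove the value at the key (error on a missing key is assumed)."""
    if key not in store:
        return "404 Not Found", key
    del store[key]
    return "200 OK", key


user = {"first-name": "John", "email-addrs": {}}
results = [
    replace(user, "first-name", "Jack"),    # exists, so it is replaced
    place(user, "first-name", "Jim"),       # exists, so PLACE is refused
    place(user, "middle-name", "Richard"),  # absent, so it is created
    replace(user, "nickname", "JR"),        # absent, so REPLACE is refused
    force(user, "dob", "01-Jan-1971"),      # succeeds either way
    retire(user, "dob"),
]
status, email_key = include(user["email-addrs"], "jack@example.com")
```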
A PATCH request could therefore look like this:
PATCH /Users/2819c223-7f76-453a-919d-413861904646 HTTP/1.1
Accept: application/json
Authorization: Bearer h480djs93hd8
Content-length: ...

{
  REPLACE: {
    "key" : "first-name",
    "value" : "Jack"
  },
  PLACE : {
    "key" : "middle-name",
    "value" : "Richard"
  },
  FORCE : {
    "key" : "dob",
    "value" : "01-Jan-1971"
  },
  REPLACE : {
    "key" : "address.unit-number",
    "value" : "12"
  },
  PLACE : {
    "key" : "address.state",
    "value" : "SA"
  },
  FORCE : {
    "key" : "address.country",
    "value" : "Australia"
  },
  INCLUDE : {
    "key" : "email-addrs",
  },
  REPLACE : {
    "key" : "email-addrs/3bd10085-c474-43b9-9cda-8646c3085bbf",
  },
  RETIRE : {
    "key" : "email-addrs/581da5c7-c6e1-4cca-9db7-7a6d1de664e1"
  }
}
The PATCH response should utilise the status code "207 Multi-Status" because the nested operations could have varying status codes. A sample response is below:
HTTP/1.1 207 Multi-Status
Content-Type: application/json
ETag: W/"b431af54f0671a2"

{
  "schemas":["urn:scim:schemas:core:1.0"],
  "external-id":"2819c223-7f76-453a-919d-413861904646",
  REPLACE: {
    "status" : "200 OK",
    "key" : "first-name",
    "value" : "Jack"
  },
  PLACE : {
    "status" : "200 OK",
    "key" : "middle-name",
    "value" : "Richard"
  },
  FORCE : {
    "status" : "200 OK",
    "key" : "dob",
    "value" : "01-Jan-1971"
  },
  REPLACE : {
    "status" : "200 OK",
    "key" : "address.unit-number",
    "value" : "12"
  },
  PLACE : {
    "status" : "200 OK",
    "key" : "address.state",
    "value" : "SA"
  },
  FORCE : {
    "status" : "200 OK",
    "key" : "address.country",
    "value" : "Australia"
  },
  INCLUDE : {
    "status" : "201 Created",
    "key" : "email-addrs/11f664ec-898b-4f6f-8948-ecfda74deff0",
  },
  REPLACE : {
    "status" : "200 OK",
    "key" : "email-addrs/3bd10085-c474-43b9-9cda-8646c3085bbf",
  },
  RETIRE : {
    "status" : "200 OK",
    "key" : "email-addrs/581da5c7-c6e1-4cca-9db7-7a6d1de664e1"
  },
  "meta": {
    "created":"2011-08-08T04:56:22Z",
    "lastModified":"2011-08-08T08:00:12Z",
    "version":"W\/\"b431af54f0671a2\""
  }
}
If there are errors, they will take the place of the "200 OK" or "201 Created" status codes in the above successful case. But the outer status will remain "207 Multi-Status".
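To show how a mixed outcome would be reported, here is a hedged Python sketch of a server-side loop that applies each nested operation and collects a per-operation status under an outer "207 Multi-Status". It assumes the operations arrive as an ordered list of objects with a "verb" field, rather than as repeated member names in one JSON object, since many JSON parsers keep only the last of several duplicate names. The handler names, the result layout and the error codes are illustrative assumptions, and only two verbs are implemented to keep it short.

```python
def replace(store, key, value=None):
    """REPLACE: update only; report an error if the key is missing."""
    if key not in store:
        return "404 Not Found"  # assumed error code
    store[key] = value
    return "200 OK"


def retire(store, key, value=None):
    """RETIRE: remove the value at the key."""
    if key not in store:
        return "404 Not Found"  # assumed error code
    del store[key]
    return "200 OK"


HANDLERS = {"REPLACE": replace, "RETIRE": retire}


def apply_patch(store, operations):
    """Apply each nested operation in order and gather individual statuses."""
    results = []
    for op in operations:
        handler = HANDLERS.get(op["verb"])
        if handler is None:
            status = "400 Bad Request"  # unknown verb, assumed error code
        else:
            status = handler(store, op["key"], op.get("value"))
        results.append({"verb": op["verb"], "key": op["key"], "status": status})
    # The outer status stays 207 whether or not individual operations failed.
    return {"status": "207 Multi-Status", "results": results}


user = {"first-name": "John", "dob": "01-Jan-1971"}
response = apply_patch(user, [
    {"verb": "REPLACE", "key": "first-name", "value": "Jack"},
    {"verb": "REPLACE", "key": "middle-name", "value": "Richard"},  # fails: no such key
    {"verb": "RETIRE", "key": "dob"},
])
```

Here the second operation fails on its own while the first and third succeed, and the client reads the individual results instead of relying on a single overall status.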
The same scheme can be used to deal with operations on members of a group, and for bulk operations.
I hope you find these suggestions useful.
I read the SCIM spec afresh last week, and these ideas came flooding into my head. For the last 5 months I have been working at another organisation (a telco), also in Identity and Access Management, and my thinking has moved further towards a specialised data model for IAM, built on specific principles.
I am planning to write about this and also the data-related principles soon and am in negotiations with InfoQ regarding publication.
Regards,
Ganesh Prasad