validation


Andrei Neculau

Sep 26, 2012, 8:44:30 AM
to api-...@googlegroups.com
This might be an obvious decision (it is for me, though as you will see below, my analysis is skewed and biased),
but since I need to defend it in front of non-technical people, I thought I'd fish for opinions and experiences.

Strict validation:
+ fail fast
any integration mishaps will surface ASAP, and the API will give descriptive errors for unknown, misspelled or mistyped properties (e.g. "unknown field <adres>, maybe <address>?" or "post_letter_at must be a timestamp, not an ISO string")
+ inherent memory optimization
if more data is sent than you know how to process, you free up memory immediately by throwing an error. This matters even more if you work in a functional style and pass contexts around, which may otherwise carry useless data with them.
+ inherent clean client-server contract
server says "send me something that i know how to process, and i will do what you (client) expect me to do"
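
A minimal Python sketch of the fail-fast behaviour listed above. The schema (KNOWN_FIELDS) and the function name validate_strict are hypothetical, chosen only to mirror the <adres>/<address> and post_letter_at examples; a real API would more likely use a schema library.

```python
import difflib

# Hypothetical schema for illustration: field name -> expected Python type.
KNOWN_FIELDS = {"address": str, "post_letter_at": int}


def validate_strict(payload):
    """Return a list of error messages; an empty list means the payload is accepted."""
    errors = []
    for name, value in payload.items():
        expected = KNOWN_FIELDS.get(name)
        if expected is None:
            # Unknown field: suggest the closest known name, if one is similar enough.
            close = difflib.get_close_matches(name, list(KNOWN_FIELDS), n=1)
            hint = f", maybe <{close[0]}>?" if close else ""
            errors.append(f"unknown field <{name}>{hint}")
        elif type(value) is not expected:
            # Exact type check, so that True/False are not accepted as integers.
            errors.append(
                f"{name} must be {expected.__name__}, not {type(value).__name__}"
            )
    return errors


print(validate_strict({"adres": "Main Street 1"}))
print(validate_strict({"post_letter_at": "2012-09-26T08:44:30Z"}))
```

The point of the sketch is only that the error names the offending property and, where possible, the likely intended one.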

Loose validation:
0 (a fake impression that) integration is smoother
since sending unknown properties still allows for successful API requests
0 (a fake impression that) you will get more useful data now
just that you don't know what to do with it, other than store it,
but that in the future you will get the chance to analyze it one way or the other
+ memory optimization is still possible
if you clean up the request data and only keep data that you know how to process
- fuzzy client-server contract
server says "send me something, and i will pick what i want from it, and do something that you might or might not expect me to do". E.g. sell car X to _persson_ Y: the server doesn't understand "persson", only "person", but it does have a feature to sell the car to nobody (i.e. put the car on the market for sale). The server responds 200 OK and it is up to the client to fix the mess.
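
The car example can be sketched in Python as follows. The field names, function names and response shape are all assumptions made for illustration; the sketch only contrasts silently dropping an unknown property with rejecting it.

```python
# Hypothetical set of properties the "sell car" endpoint understands.
KNOWN_FIELDS = {"car", "person"}


def sell_car_loose(payload):
    """Loose handling: silently drop unknown fields, then act on what is left."""
    known = {k: v for k, v in payload.items() if k in KNOWN_FIELDS}
    buyer = known.get("person")  # None when the client misspelled the field
    action = "sold" if buyer is not None else "listed_for_sale"
    return {"status": 200, "car": known.get("car"), "buyer": buyer, "action": action}


def sell_car_strict(payload):
    """Strict handling: refuse the request if any field is unknown."""
    unknown = sorted(set(payload) - KNOWN_FIELDS)
    if unknown:
        return {"status": 400, "error": "unknown fields: " + ", ".join(unknown)}
    return sell_car_loose(payload)


request = {"car": "X", "persson": "Y"}  # note the misspelled property
print(sell_car_loose(request))   # 200, but the car is listed for sale, not sold to Y
print(sell_car_strict(request))  # 400, naming the unknown field
```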

thank you for your time.

Steve Klabnik

Sep 26, 2012, 9:27:09 AM
to api-...@googlegroups.com
You didn't really define what you mean by 'validation.'

Andrei Neculau

Sep 26, 2012, 9:49:43 AM
to api-...@googlegroups.com
True. I'll be specific: JSON-schema validation

Steve Klabnik

Sep 26, 2012, 10:42:11 AM
to api-...@googlegroups.com
Well yes, your colors are showing. ;)

For a counter-argument, look up any given static vs. dynamic typing
debate that has ever happened on the internet, ever. They're all
pretty interchangeable. They all make good points, on both sides.

As usual, with engineering, it boils down to "What am I trying to
accomplish? What are my tolerances for risk? _Where_ are my tolerances
for risk?"

Peter Williams

Sep 26, 2012, 11:05:39 AM
to api-...@googlegroups.com

I think schema validation is a pretty horrible idea, except as a design-time debugging tool. Even that role is better served by a validation program. Schema languages are rarely powerful enough to fully specify message requirements (e.g. numbers that must be inside a particular range, strings that must have particular structures, etc.).

Disadvantages of strict runtime validation:

* introduces unnecessary failures for messages that don't conform but that would be usable by the recipient
* makes introducing new properties awkward. New properties must be optional so as to not break existing clients even if you would really prefer all clients send them.
* inefficient because the message must be interpreted extra times. Once by the recipient to accomplish its goal, and one additional time for each schema validation.

I think schema based message validation provides very little value and requires a fair bit of work. You'd be better off using that time implementing really good error messages on the server side.

Peter
barelyenough.org

> --
> You received this message because you are subscribed to the Google Groups "API Craft" group.
> To unsubscribe from this group, send email to api-craft+...@googlegroups.com.
> Visit this group at http://groups.google.com/group/api-craft?hl=en.
>  
>  

Mike Kelly

Sep 26, 2012, 11:19:04 AM
to api-...@googlegroups.com
it's all just prickles and goo:

http://www.youtube.com/watch?v=XXi_ldNRNtM

Andrei Neculau

Sep 26, 2012, 11:21:03 AM
to api-...@googlegroups.com
I see your point - risk questions are pretty tough =)
At times I feel I'm less of a developer/API designer, and more of a tester (highlighting corner cases and their risks)

Andrei Neculau

Sep 26, 2012, 11:26:04 AM
to api-...@googlegroups.com
Peter, I would say that you took my answer too much ad litteram. In practice, I am going to use JSON schema validation.

If you set schema validation aside, my question still stands at the point where you have to decide whether or not to allow unknown properties to be sent.
Do you accept unknown data and terminate successfully, or do you barf, saying "talk (my) language"?

If I try to simplify my analysis: the former is deceiving.
I tell you "please bring me water and an aspirin" and you come back with just the aspirin: "you're welcome!" -> (o.O)

Andrei Neculau

Sep 26, 2012, 11:28:57 AM
to api-...@googlegroups.com
Mike, you made my day =)

Pat Cappelaere

Sep 26, 2012, 11:28:56 AM
to api-...@googlegroups.com
Mike,

:) I really liked that.
Prickly gooey must our API be. Give it a discovery document for the pricks and hypermedia as goo.
This will make everybody happy.
Pat.

Steve Klabnik

Sep 26, 2012, 11:31:50 AM
to api-...@googlegroups.com
> Peter, I would say that you took my answer too much ad litteram. In practice,
> I am going to use JSON schema validation.

Then why are you asking for opinions? Your mind is made up.

Pat Cappelaere

Sep 26, 2012, 11:32:46 AM
to api-...@googlegroups.com
Runtime message validation on the server side is pretty critical to me.
You do not have to use a schema but if you had one, you could use it for validation and you could inform your client of your validation rules.
Client can then use it any way it wants (or even ignore it).  Seems a win-win to me…
Pat.

Andrei Neculau

Sep 26, 2012, 12:34:54 PM
to api-...@googlegroups.com
Maybe I'm wrong, and maybe I didn't do a good job at phrasing the question and maybe the type of validation plays a strong role in my question,
but again: I don't see this as a decision between using or not using schema validation. I have only answered JSON schema to be transparent, to give you a context, not because it actually matters.

If you're still following this thread, I will do my best to simplify the question.

If I break down validation, hoping to move away from schema validation, there's
- syntax validation (JSON)
- structural validation (schema) - types, allowing unknown properties or not
- semantic validation (schema + logic) - regex format, minimum, maximum, whether you can create a kid resource if one of the parent resources is sterile, etc.
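
The three layers might be sketched in Python like this. The schema (name, age) and the age range are hypothetical; the only point is that each layer runs after the previous one has passed and can be reported separately.

```python
import json


def validate(raw):
    """Return (layer, detail): the first layer that fails, or ("ok", parsed data)."""
    # 1. Syntax validation: is the body JSON at all?
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return "syntax", str(exc)

    # 2. Structural validation: types, and whether unknown properties are allowed.
    if not isinstance(data, dict):
        return "structure", "expected a JSON object"
    allowed = {"name": str, "age": int}  # hypothetical schema
    for key, value in data.items():
        if key not in allowed:
            return "structure", f"unknown property {key!r}"
        if type(value) is not allowed[key]:
            return "structure", f"{key!r} must be {allowed[key].__name__}"

    # 3. Semantic validation: ranges, formats and business rules.
    if "age" in data and not 0 <= data["age"] <= 150:
        return "semantics", "'age' must be between 0 and 150"
    return "ok", data


for body in ['{oops', '{"nme": "Ann"}', '{"age": 200}', '{"name": "Ann", "age": 3}']:
    print(validate(body)[0])
```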

So, given that breakdown, I can apply my question just on the topic of syntax validation.

Given a collection resource, you do
POST /x
> Content-Type: "application/json"
request_payload

The payload is entirely optional, and the request is expected to create a resource and return its location.
Since Content-Type says "application/json", you try to parse "request_payload", which is not valid JSON.
What do you do when the parsing fails? You end up with 200, 500 or 400? I guess you will say 400. But that means
#1 the payload is actually not so optional
#2 you validate JSON
#3 you are transparent to the client: the request failed because the request payload was malformed

The question: why did you fail the request? Since the payload is optional, the request could also be terminated successfully.
Same thing applies if you can handle some ID property as an integer, but you get a string.

So bottom-line, if there's a chance for the request to succeed (you have your required information), do you ignore what you don't like and let it succeed (goo),
or do you throw an error (prickle)?

PS I liked your question regarding the risks, and my corner-case example is that today you might allow the client to send X (you know how to handle it) and anything else (call it Y). Tomorrow you actually implement Y, and you define it as Y1, while the client assumed it will mean Y2. What gives?

PPS: I may have my mind made up, entirely or up to a point, voluntarily or limited to 1 choice by circumstances. But when I stop asking questions and listening, I will be limited by something else...

Mike Schinkel

Sep 26, 2012, 4:25:58 PM
to api-...@googlegroups.com
On Sep 26, 2012, at 11:19 AM, Mike Kelly <mikeke...@gmail.com> wrote:
> it's all just prickles and goo:
>
> http://www.youtube.com/watch?v=XXi_ldNRNtM

Priceless! Thank you. :)

On Sep 26, 2012, at 12:34 PM, Andrei Neculau <andrei....@gmail.com> wrote:
> So bottom-line, if there's a chance for the request to succeed (you have your required information), do you ignore what you don't like and let it succeed (goo),
> or do you throw an error (prickle)?

The answer is found in Postel's law: prickly goo. Figure out whatever you can (be gooey), and if you can't, throw an error (be a prickle).
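
One way to read "prickly goo" for the integer-versus-string ID case raised earlier in the thread, as a rough Python sketch. The function name coerce_id and the exact tolerance rules are assumptions, not anything the thread prescribes.

```python
def coerce_id(value):
    """Accept an integer ID, or a string that unambiguously encodes one."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; treat it as a mistake, not an ID.
        raise ValueError("id must be an integer, not a boolean")
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        text = value.strip()
        if text.isascii() and text.isdigit():
            return int(text)  # gooey: "42" is close enough to 42
    # prickly: nothing sensible can be figured out, so say so
    raise ValueError(f"id must be an integer, got {value!r}")


print(coerce_id(42))
print(coerce_id(" 42 "))
```

Passing something like "forty-two" raises ValueError, which the server would turn into an error response rather than guessing.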

-Mike

sune jakobsson

Sep 26, 2012, 5:30:03 PM
to api-...@googlegroups.com
And if you really need to be strict about the validation, there is always XML...

Sune


Mike Schinkel

Sep 26, 2012, 5:53:11 PM
to api-...@googlegroups.com
On Sep 26, 2012, at 5:30 PM, sune jakobsson <sune.ja...@gmail.com> wrote:
> And if you really need to be strict about the validation, there is always XML...

http://apijoy.tumblr.com/post/31525536559/soap-xml

-Mike

Diego Magalhães

Sep 26, 2012, 5:57:38 PM
to api-...@googlegroups.com
Hahahaha Epic!

Diego Magalhães
http://diegomagalhaes.com
claro @ +55 21 9411 2823






Peter Williams

Sep 27, 2012, 11:07:14 AM
to api-...@googlegroups.com
On Wed, Sep 26, 2012 at 10:34 AM, Andrei Neculau
<andrei....@gmail.com> wrote:
> So, given that breakdown, I can apply my question just on the topic of
> syntax validation.
>
> Given a collection resource, you do
> POST /x
>> Content-Type: "application/json"
> request_payload
>
> The payload is entirely optional, and the request is expected to create a
> resource and return its location.
> Since Content-Type says "application/json", you try to parse
> "request_payload", which is not valid JSON.
> What do you do when the parsing fails? You end up with 200, 500 or 400? I
> guess you will say 400.

That is what i would do.

> The question: why did you fail the request? Since the payload is optional,
> the request could also be terminated successfully.

I don't think this follows. Having optional parts of a message usually
means you can either A) provide a semantically meaningful value or B)
omit that part altogether. Optional does not normally mean that you
shove garbage in the slot and expect everything to be fine.

> So bottom-line, if there's a chance for the request to succeed (you have
> your required information), do you ignore what you don't like and let it
> succeed (goo),
> or do you throw an error (prickle)?

I think either approach would be acceptable. There certainly are
situations where a recipient might want to "succeed" at all costs, but
clients should not assume that they can shove garbage at the server
and it will succeed. Fortunately, this is not a problem in practice.
Clients usually want to accomplish something meaningful so their
messages are rarely completely bogus.

Most of the apis i work on require a minimum level of correctness in
the messages: usually that they are syntactically well formed, and that
they meet the business rule validations for the resource flavor in
question. If they are not syntactically well formed, the server generally
responds with a 400 and a plain text body describing the parse
failure and, hopefully, the character offset of the failure. If they are
semantically invalid, then the response is a 422 with a body that
indicates each of the properties that were invalid, with a human
readable description of the invalidity, in the language of the domain.
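
A rough Python sketch of that 400/422 split, using only the standard json module. The handler name, the 201 status for success and the "quantity" business rule are hypothetical; JSONDecodeError does expose the message and character offset used here.

```python
import json


def handle_post(body):
    """Return (status, response_body) for a hypothetical resource-creation request."""
    if not body.strip():
        # The payload is optional: omitting it altogether is fine.
        return 201, ""
    try:
        data = json.loads(body)
    except json.JSONDecodeError as exc:
        # Not syntactically well formed: 400 with a plain text parse error and offset.
        return 400, f"malformed JSON: {exc.msg} at character offset {exc.pos}"
    if not isinstance(data, dict):
        return 422, {"errors": {"": "request body must be a JSON object"}}

    # Hypothetical business rule: quantity, if present, is a whole number >= 1.
    errors = {}
    quantity = data.get("quantity")
    if quantity is not None and (type(quantity) is not int or quantity < 1):
        errors["quantity"] = "must be a whole number of at least 1"
    if errors:
        # Semantically invalid: 422 with one human readable message per property.
        return 422, {"errors": errors}
    return 201, ""


print(handle_post('{"quantity": '))
print(handle_post('{"quantity": 0}'))
```

Here an omitted payload succeeds while garbage in the payload slot does not, which is the reading of "optional" argued above.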

The key, imo, is recognizing that the server/recipient is the final
authority on what is acceptable. This is the case whether you have a
strict schema or not. Messages that are valid against the schema may
be rejected based on complex business logic. Messages that are not
valid against the schema might be acceptable (via Postel's principle)
if they are not rejected out of hand because of the apparent
invalidity.

> PS I liked your question regarding the risks, and my corner-case example is
> that today you might allow the client to send X (you know how to handle it)
> and anything else (call it Y). Tomorrow you actually implement Y, and you
> define it as Y1, while the client assumed it will mean Y2. What gives?

This issue is one of the reasons i like RDF. It provides globally
unique names for all properties. By doing so it prevents this sort of
name collision for non-link properties in the same way that the rel
value does for links.

Peter
barelyenough.org

Greg Brail

Sep 28, 2012, 1:56:30 PM
to api-...@googlegroups.com
One thing we did on a recent project was have POST and even DELETE API calls return the resource, as rendered by the server.

It's still important IMHO to validate the requests and ensure that we never leave the client thinking that the server did something when actually it ignored the whole thing, but at the very least this helps us find bugs. 

So for instance, the client sends a POST and includes:

{ foo: 12, bar: "baz" }

and the server responds:

{"foo":12,"bar":"baz"}

because it nicely interpreted the original request even though technically it is invalid JSON.

or another example, the client POSTs:

<Foo>
  <Bar>Hello, "World!"</Bar>
</Foo>

and the server responds:

<Foo>
  <Bar>Hello, &quot;World!&quot;</Bar>
</Foo>

At the very least this makes it easier to track down surprises.
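
On the client side, the echoed resource can be compared with what was sent. A small Python sketch, assuming both are flat JSON objects already parsed into dicts; the function name find_surprises and the example fields are made up for illustration.

```python
def find_surprises(sent, echoed):
    """Compare what the client sent with the resource the server returned."""
    return {
        # Properties the server dropped, e.g. because it did not understand them.
        "ignored": sorted(k for k in sent if k not in echoed),
        # Properties the server kept but interpreted differently.
        "changed": sorted(k for k in sent if k in echoed and echoed[k] != sent[k]),
        # Properties the server added on its own (IDs, defaults, timestamps).
        "added": sorted(k for k in echoed if k not in sent),
    }


sent = {"foo": 12, "persson": "Y"}
echoed = {"foo": 12, "id": 7}
print(find_surprises(sent, echoed))
```

A non-empty "ignored" list is the kind of surprise that would otherwise only show up much later.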




--
Gregory Brail  |  Technology  |  Apigee

sarah wilkes

Oct 11, 2012, 6:41:42 AM
to api-...@googlegroups.com
There are also plenty of free tools online to validate your XML:

http://www.liquid-technologies.com/FreeXmlTools/FreeXmlValidator.aspx
