Implementing a JSON Schema Validator. What is the best approach?


Francesco Guardiani

Aug 13, 2018, 4:04:02 AM
to JSON Schema
Hi,
I'm Francesco Guardiani, project maintainer for Eclipse Vert.x, a reactive toolkit for the JVM. I'm planning to implement a JSON Schema validator in Java that fits our needs:
  • Support for our own JSON object and JSON array types, without converting between different JSON representations
  • Asynchronous $ref resolution (which implies asynchronous validation)
  • Support for different JSON Schema drafts and, most importantly, OpenAPI schemas
As far as I know, there is no existing JSON Schema validator for Java that supports async validation and OpenAPI schemas, so here I am! I'm exploring the JSON Schema world to understand the best approach to the problem, keeping in mind that I want to focus mostly on the performance and extensibility of the tool.

Looking around, I found three approaches to JSON Schema validation:
  • Dynamically generate the code that does the validation (AJV does this). This seems to be the best-performing approach, but it can be hard to implement in Java and can make schema parsing much more expensive
  • Represent the model keyword by keyword: each keyword is an object. Good for extensibility and multi-version support, but maybe a memory killer?!
  • Represent the model as type entities: an object schema object, a number schema object, a string schema object, and so on. When no type is specified, there is a special "multiple types" model. Less memory usage and better performance?!
What do you think? Any advice from validator developers?
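
To make the second approach concrete, here is a minimal sketch of a keyword-per-object model in which every keyword validates asynchronously, so a $ref keyword can wait for its target schema to load. All names (`AsyncKeyword`, `MinimumKeyword`, `RefKeyword`, `KeywordSchema`) are hypothetical and are not part of Vert.x or any existing validator; instances are plain Java objects rather than Vert.x JSON types.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical interface: one implementation per JSON Schema keyword.
interface AsyncKeyword {
    CompletableFuture<Boolean> validate(Object instance);
}

// "minimum" only constrains numbers; any other instance passes.
final class MinimumKeyword implements AsyncKeyword {
    private final double minimum;

    MinimumKeyword(double minimum) {
        this.minimum = minimum;
    }

    @Override
    public CompletableFuture<Boolean> validate(Object instance) {
        boolean ok = !(instance instanceof Number)
                || ((Number) instance).doubleValue() >= minimum;
        return CompletableFuture.completedFuture(ok);
    }
}

// "$ref": the target future completes once the referenced schema is loaded
// (assumed to be done elsewhere, e.g. over the network).
final class RefKeyword implements AsyncKeyword {
    private final CompletableFuture<AsyncKeyword> target;

    RefKeyword(CompletableFuture<AsyncKeyword> target) {
        this.target = target;
    }

    @Override
    public CompletableFuture<Boolean> validate(Object instance) {
        return target.thenCompose(schema -> schema.validate(instance));
    }
}

// A schema is a list of keywords; the instance is valid if all of them pass.
final class KeywordSchema implements AsyncKeyword {
    private final List<AsyncKeyword> keywords;

    KeywordSchema(List<AsyncKeyword> keywords) {
        this.keywords = keywords;
    }

    @Override
    public CompletableFuture<Boolean> validate(Object instance) {
        CompletableFuture<Boolean> result = CompletableFuture.completedFuture(true);
        for (AsyncKeyword keyword : keywords) {
            result = result.thenCombine(keyword.validate(instance), (a, b) -> a && b);
        }
        return result;
    }
}
```

The memory cost is one small object per keyword occurrence; whether that matters in practice would need measuring.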

Thank you for the help!

Francesco Guardiani

Henry Andrews

Aug 13, 2018, 3:45:19 PM
to JSON Schema
Hi Francesco,
  Great to hear about a new implementation!  You will probably get more response by joining our Slack workspace, as the mailing list is fairly quiet these days:  https://join.slack.com/t/json-schema/shared_invite/enQtMjk1NDcyNDI2NTAwLTcyYmYwMjdmMmUxNzZjYzIxNGU2YjdkNzdlOGZiNjIwNDI2M2Y3NmRkYjA4YmMwODMwYjgyOTFlNWZjZjAyNjg

  One thing worth noting, Phil Sturgeon and I have been working with the OpenAPI Technical Steering Committee to get them to converge with JSON Schema proper.  The next release of OpenAPI will include experimental support for using any version of JSON Schema, as indicated by the "$schema" keyword.  See https://github.com/OAI/OpenAPI-Specification/issues/1532 for details.

  Regarding types, JSON Schema is more flexible about types than many realize.  If you make completely separate code paths for objects vs integers or whatever, you will find yourself in an awkward spot for schemas that work with multiple types simultaneously.  For example, this is a valid schema:

{
    "type": ["integer", "string"],
    "minimum": 0,
    "maximum": 255,
    "minLength": 1,
    "maxLength": 50
}

which validates any integer from 0 to 255 or any string of 1 to 50 characters.  Implementations that assume a single type tend to have trouble with these schemas.
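
A minimal sketch of why keyword-by-keyword evaluation handles this schema naturally: each keyword is checked on its own and simply ignores instances of a type it does not apply to. The class and method names are hypothetical, the schema is held in a plain `Map`, and only the keywords from the example are covered. As a simplification, only `Integer` and `Long` count as integers here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MultiTypeSketch {

    // "type": true if the instance matches any of the allowed type names.
    static boolean typeMatches(List<?> allowed, Object instance) {
        if (instance instanceof String) {
            return allowed.contains("string");
        }
        if (instance instanceof Integer || instance instanceof Long) {
            return allowed.contains("integer") || allowed.contains("number");
        }
        if (instance instanceof Number) {
            return allowed.contains("number");
        }
        return false; // other JSON types are omitted from this sketch
    }

    static boolean validate(Map<String, Object> schema, Object instance) {
        Object type = schema.get("type");
        if (type instanceof List && !typeMatches((List<?>) type, instance)) {
            return false;
        }
        // Numeric keywords apply only to numbers.
        if (instance instanceof Number) {
            double value = ((Number) instance).doubleValue();
            Object min = schema.get("minimum");
            Object max = schema.get("maximum");
            if (min instanceof Number && value < ((Number) min).doubleValue()) return false;
            if (max instanceof Number && value > ((Number) max).doubleValue()) return false;
        }
        // String keywords apply only to strings; length is counted in code points.
        if (instance instanceof String) {
            String text = (String) instance;
            int length = text.codePointCount(0, text.length());
            Object minLen = schema.get("minLength");
            Object maxLen = schema.get("maxLength");
            if (minLen instanceof Number && length < ((Number) minLen).intValue()) return false;
            if (maxLen instanceof Number && length > ((Number) maxLen).intValue()) return false;
        }
        return true;
    }

    // The schema from the example above.
    static Map<String, Object> exampleSchema() {
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("type", Arrays.asList("integer", "string"));
        schema.put("minimum", 0);
        schema.put("maximum", 255);
        schema.put("minLength", 1);
        schema.put("maxLength", 50);
        return schema;
    }
}
```

With this shape, `validate(exampleSchema(), 42)` and `validate(exampleSchema(), "hello")` both pass, while `256` and the empty string fail.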

thanks,
-henry



Francesco Guardiani

Aug 13, 2018, 5:24:51 PM
to json-...@googlegroups.com
Hi Henry! Thanks for the answer!
I follow the OpenAPI spec discussions closely and I will be very happy to see this new feature in the next release. Of course, the new validator will be shipped as a JSON validator in its own right and used together with OpenAPI request validation.

A couple of months ago I created a PoC: https://github.com/slinkydeveloper/vertx-json-validator-poc/. What I want to achieve is support for draft 8 and the OpenAPI dialect. I've also solved a problem similar to the one you are pointing out (it's necessary in order to pass the JSON Schema test suite!) using this: https://github.com/slinkydeveloper/vertx-json-validator-poc/blob/master/src/main/java/io/vertx/ext/json/validator/schema/MissingTypeSchema.java . With this "hack", missing types are handled without problems. I can of course do something similar for multiple types.

I currently have two doubts about a "keyword-oriented" validator:
  • Objects have some keywords that depend on each other, so I would need a "hack" for the object-related keywords. Am I right?
  • Is it smart to develop an entity-based validator to optimize performance and memory, and solve these corner cases with some hacks? Honestly, I have no idea how often these multiple-type or missing-type features are used
Francesco

Henry Andrews

Aug 20, 2018, 12:50:54 AM
to json-...@googlegroups.com
We're trying to make the interdependent keywords make more sense in draft-08, although in practice people may implement some of them more directly for performance reasons.

Basically, we're defining keyword interactions in terms of annotation collection, which at least puts some boundaries on what keywords can do.  For `additionalProperties` and `additionalItems`, it's arguably more complicated to actually implement them that way, but `unevaluatedProperties` and `unevaluatedItems` will only work based on annotation collection.
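
As a rough illustration of that idea, and not a rendering of the draft text, the sketch below has `properties` report the instance property names it matched as an annotation, and `additionalProperties` read the collected annotations to work out which names are left for it. The class, the method names, and the shape of the annotation map are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class AnnotationSketch {

    // "properties": its annotation is the set of instance property names it matched.
    static Set<String> propertiesAnnotation(Set<String> declared, Map<String, Object> instance) {
        Set<String> matched = new TreeSet<>(instance.keySet());
        matched.retainAll(declared);
        return matched;
    }

    // "additionalProperties": applies only to names not covered by the annotations
    // of "properties" and "patternProperties" in the same schema object.
    static Set<String> additionalTargets(Map<String, Set<String>> annotations,
                                         Map<String, Object> instance) {
        Set<String> rest = new TreeSet<>(instance.keySet());
        for (String keyword : Arrays.asList("properties", "patternProperties")) {
            rest.removeAll(annotations.getOrDefault(keyword, Collections.<String>emptySet()));
        }
        return rest;
    }

    // Instance {name, age, extra} against a schema declaring "name" and "age".
    static Set<String> demo() {
        Map<String, Object> instance = new HashMap<>();
        instance.put("name", "x");
        instance.put("age", 3);
        instance.put("extra", true);

        Map<String, Set<String>> annotations = new HashMap<>();
        Set<String> declared = new TreeSet<>(Arrays.asList("name", "age"));
        annotations.put("properties", propertiesAnnotation(declared, instance));

        return additionalTargets(annotations, instance);
    }
}
```

In `demo()`, only `extra` is left for `additionalProperties` to check. The same lookup could serve `unevaluatedProperties` if annotations from subschemas were merged into the map first, which this sketch does not attempt.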

thanks,
-henry

