Custom Contexts JSON Schema & Schema Guru


Joao Correia

Mar 31, 2016, 7:53:50 PM
to Snowplow
Hi Snowplowers,

When creating a custom context JSON schema, can we define a default value for a certain field?

Also, I have a schema I need to update. When incrementing the JSON schema version from 1-0-0 to 1-0-1, the SQL generated by schema-guru still gives me a table xxxx_1. Shouldn't it create a table xxxx_1_0_1?

How does Snowplow know which data goes to each schema? For some time I will have some data going to 1-0-0 and other data going to 1-0-1.


Thanks
Joao


Ihor Tomilenko

Mar 31, 2016, 8:41:50 PM
to Snowplow
Hi Joao,

When creating a custom context JSON schema, can we define a default value for a certain field?

"Our" JSON schema is a standard JSON schema. The only difference is that we require it to be self-describing. Apart from that, follow the standard. This blog post may be a more pleasant read.

Here is an example taken from the JSON schemas created for Snowplow:

{
    "$schema" : "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
    "self" : {
        "vendor" : "com.amazon.aws.ec2",
        "name" : "instance_identity_document",
        "version" : "1-0-0",
        "format" : "jsonschema"
    },
    "type" : "object",
    "properties" : {
        "instanceId" : {
            "type" : "string",
            "minLength" : 10,
            "maxLength" : 19
        },
        ...
    }
}
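On the default-value question: standard JSON Schema does have a `default` keyword, but it is an annotation, and validators are not required to insert it into the data. Nothing in this thread says that any Snowplow component fills defaults in, so the sketch below assumes it does not and applies them on the sending side before tracking. The `region` property, its default, and the `apply_defaults` helper are hypothetical and only serve as illustration.

```python
import copy

# Hypothetical schema fragment: "region" and its default are invented for this sketch.
schema = {
    "type": "object",
    "properties": {
        "instanceId": {"type": "string", "minLength": 10, "maxLength": 19},
        "region": {"type": "string", "default": "unknown"},
    },
}


def apply_defaults(schema, instance):
    """Return a copy of `instance` with missing top-level properties
    set to the `default` declared in the schema, if any."""
    result = copy.deepcopy(instance)
    for key, subschema in schema.get("properties", {}).items():
        if key not in result and "default" in subschema:
            result[key] = copy.deepcopy(subschema["default"])
    return result


# The missing "region" is filled in; values already present are left alone.
filled = apply_defaults(schema, {"instanceId": "i-0123456789"})
```

This only handles top-level properties; nested objects would need a recursive version.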

Also, I have a schema I need to update. When incrementing the JSON schema version from 1-0-0 to 1-0-1, the SQL generated by schema-guru still gives me a table xxxx_1. Shouldn't it create a table xxxx_1_0_1?

No. We follow the MODEL-REVISION-ADDITION versioning model, and the table name carries only the MODEL number, so 1-0-0 and 1-0-1 both map to table xxxx_1. Please read more on this approach in this blog post, and also take a look at the wiki page.
  • MODEL when you make a breaking schema change which will prevent interaction with any historical data
  • REVISION when you introduce a schema change which may prevent interaction with some historical data
  • ADDITION when you make a schema change that is compatible with all historical data
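Since 1-0-0 and 1-0-1 both end up in table xxxx_1, the table name depends only on the MODEL part of the version. A minimal sketch of that rule follows; the `parse_schemaver` and `table_name` helpers are hypothetical, and the exact naming convention (vendor with dots replaced by underscores, then the schema name, then the MODEL) is an assumption rather than something stated in this thread.

```python
import re

# MODEL starts at 1; REVISION and ADDITION start at 0.
SCHEMAVER = re.compile(r"^([1-9][0-9]*)-(0|[1-9][0-9]*)-(0|[1-9][0-9]*)$")


def parse_schemaver(version):
    """Split 'MODEL-REVISION-ADDITION' into a tuple of three integers."""
    match = SCHEMAVER.match(version)
    if match is None:
        raise ValueError(f"not a MODEL-REVISION-ADDITION version: {version!r}")
    return tuple(int(part) for part in match.groups())


def table_name(vendor, name, version):
    """Hypothetical helper: build a table name from vendor, name and MODEL only.
    The naming convention used here is an assumption."""
    model, _revision, _addition = parse_schemaver(version)
    return f"{vendor.replace('.', '_')}_{name}_{model}".lower()


# Both versions resolve to the same table; only a MODEL bump changes the name.
t_100 = table_name("com.amazon.aws.ec2", "instance_identity_document", "1-0-0")
t_101 = table_name("com.amazon.aws.ec2", "instance_identity_document", "1-0-1")
t_200 = table_name("com.amazon.aws.ec2", "instance_identity_document", "2-0-0")
```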
Side note: Unfortunately, the current version of Schema Guru is not version-aware. When table definitions and JSONPaths files are generated for schema version 1-0-1, they come out exactly as if 1-0-1 were a brand-new schema, just as they would for 1-0-0. That is incorrect: we want Schema Guru to be aware that 1-0-1 is an update to 1-0-0, and that any new fields must go at the end of the Redshift table definition (because Redshift doesn't support adding columns with AFTER clauses). The next Schema Guru release should resolve this problem.
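To illustrate the ordering constraint from the side note: an existing table can only gain columns at the end, so a version-aware generator would have to keep the old column order and append whatever is new. The sketch below shows that idea only; it is not Schema Guru's implementation, and the `migrated_columns` helper and the column names are hypothetical.

```python
def migrated_columns(old_columns, new_fields):
    """Keep the existing column order and append fields that were not
    in the old table, in the order they are given."""
    seen = set(old_columns)
    # dict.fromkeys drops duplicates while preserving order.
    appended = [f for f in dict.fromkeys(new_fields) if f not in seen]
    return list(old_columns) + appended


# Hypothetical columns for 1-0-0, and the full field set of 1-0-1.
old = ["instance_id", "region"]
new = ["account_id", "instance_id", "region"]

# A generator that simply re-sorts all fields would put account_id first,
# which would not match a table that was altered in place.
columns = migrated_columns(old, new)
```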

How does Snowplow know which data goes to each schema? For some time I will have some data going to 1-0-0 and other data going to 1-0-1.

When a tracker sends data for custom contexts or events, it sends a self-describing JSON. Its "schema" field names the exact schema version the data belongs to:

{
    "schema": "iglu:com.snowplowanalytics/ad_click/jsonschema/1-0-0",
    "data": {
        "bannerId": "4732ce23d345"
    }
}
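As a rough sketch of how the "schema" field identifies the version, the reference can be split into vendor, name, format and version. The `parse_iglu_uri` helper below is hypothetical and is not taken from any Snowplow library; it only assumes the `iglu:vendor/name/format/version` layout visible in the example above.

```python
import json


def parse_iglu_uri(uri):
    """Split 'iglu:vendor/name/format/version' into its four parts."""
    prefix = "iglu:"
    if not uri.startswith(prefix):
        raise ValueError(f"not an iglu reference: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected vendor/name/format/version: {uri!r}")
    vendor, name, fmt, version = parts
    return {"vendor": vendor, "name": name, "format": fmt, "version": version}


payload = json.loads("""
{
    "schema": "iglu:com.snowplowanalytics/ad_click/jsonschema/1-0-0",
    "data": {"bannerId": "4732ce23d345"}
}
""")

# Each payload names its own schema version, so 1-0-0 and 1-0-1 data can coexist.
key = parse_iglu_uri(payload["schema"])
```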

Regards,
Ihor

Joao Correia

Apr 1, 2016, 12:20:17 PM
to Snowplow
Hi Ihor,

This is extremely useful! Thank you!