Yes, we just added initial support for JSON! The CLI now has json-schema
and json-import commands, and the API also has a JSON format.
JSON support works like the CSV support. Kite assumes that the JSON
records adhere to some schema and constructs records according to that
schema. If a required field is missing, Kite will complain. If the JSON
contains data for fields that are not in the schema, that data is ignored.
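To make those two rules concrete, here is a minimal Python sketch of the behavior described above. It is not Kite's implementation: `build_record`, the `SCHEMA` layout, and the required flag are hypothetical names chosen only for illustration.

```python
import json

# Hypothetical, simplified schema: field name -> whether the field is required.
# This is NOT how Kite represents schemas; it only illustrates the two rules.
SCHEMA = {"id": True, "title": True, "release_date": False}


def build_record(json_line, schema=SCHEMA):
    """Build a record dict from one line of JSON according to the schema."""
    data = json.loads(json_line)
    record = {}
    for name, required in schema.items():
        if name in data:
            record[name] = data[name]
        elif required:
            # Rule 1: a missing required field is an error.
            raise ValueError(f"missing required field: {name}")
        else:
            # Optional fields that are absent are left empty in this sketch.
            record[name] = None
    # Rule 2: keys in `data` that are not in the schema are never copied,
    # so they are silently ignored.
    return record
```

With this sketch, a line that carries an extra `imdb_url` key still produces a record with only the three schema fields, while a line without `title` raises a `ValueError`.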
We'd be happy to get feedback on how to improve the support, so please
try it out and get back to us!
Here's the output of json-schema on a JSON sample:
blue@work:~/tmp$ head movies.json -n 1
{"id": 1, "title": "Toy Story (1995)", "release_date": "01-Jan-1995",
"video_release_date": "", "imdb_url":
"http:\/\/us.imdb.com\/M\/title-exact?Toy%20Story%20(1995)"}
blue@work:~/tmp$ kite-dataset json-schema movies.json --class Movie
{
  "type" : "record",
  "name" : "Movie",
  "fields" : [ {
    "name" : "id",
    "type" : "int",
    "doc" : "Type inferred from '1'"
  }, {
    "name" : "title",
    "type" : "string",
    "doc" : "Type inferred from '\"Toy Story (1995)\"'"
  }, {
    "name" : "release_date",
    "type" : "string",
    "doc" : "Type inferred from '\"01-Jan-1995\"'"
  }, {
    "name" : "video_release_date",
    "type" : "string",
    "doc" : "Type inferred from '\"\"'"
  }, {
    "name" : "imdb_url",
    "type" : "string",
    "doc" : "Type inferred from '\"http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)\"'"
  } ]
}
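For anyone curious how a schema like the one above can be derived from a single sample record, here is a rough Python sketch. It only mimics the shape of the output shown; it is not the algorithm json-schema actually uses, and `infer_type` and `infer_schema` are hypothetical helper names. The int/long split by 32-bit range is also an assumption of the sketch.

```python
import json


def infer_type(value):
    """Map a parsed JSON value to an Avro primitive type name (simplified)."""
    # bool must be checked before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        # Assumption: values outside the 32-bit range become "long".
        return "int" if -2**31 <= value < 2**31 else "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    # Nested objects, arrays, and nulls are out of scope for this sketch.
    raise TypeError(f"unsupported value in this sketch: {value!r}")


def infer_schema(json_line, name):
    """Build an Avro-style record schema dict from one sample JSON line."""
    data = json.loads(json_line)
    fields = [
        {
            "name": key,
            "type": infer_type(value),
            # Record the sample value, re-serialized as JSON, in the doc string.
            "doc": f"Type inferred from '{json.dumps(value)}'",
        }
        for key, value in data.items()
    ]
    return {"type": "record", "name": name, "fields": fields}
```

Calling `infer_schema(line, "Movie")` on the first line of movies.json and printing the result with `json.dumps(schema, indent=2)` gives a structure with the same fields, types, and doc strings as the output above, though the whitespace will differ.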
rb
--
Ryan Blue
Software Engineer
Cloudera, Inc.