Hello guys,
I am trying to define a Hive schema to query data stored in HDFS. The data has a parent field, a sub-field, and a sub-field of that sub-field.
My data format is:
++++
country -> main field
state -> sub field of main field "country"
city -> sub field of field "state"
+++++
Hive schema:
+++++
CREATE EXTERNAL TABLE IF NOT EXISTS test_table (
country map<string,map<string,string>>
)
PARTITIONED BY (date string, hour string)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE LOCATION '/user/logs/';
++++
With the table above I can query country['state'] and country['city'], but I am not able to query the parent field "country" itself. If I change the schema to country map<string,string>, then country and country['state'] work, but country['state']['city'] does not.
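To make this concrete, the three lookups would look roughly like the queries below. This is only a sketch: partition filters are left out, LIMIT 10 is an arbitrary choice, and the comments label the levels as described in the data format above.
++++
-- parent field
SELECT country FROM test_table LIMIT 10;
-- sub field of "country"
SELECT country['state'] FROM test_table LIMIT 10;
-- sub field of "state"; nested access like this needs map<string,map<string,string>>
SELECT country['state']['city'] FROM test_table LIMIT 10;
++++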
Can anyone help me figure out how to query all three fields (parent, child, and child of child)?