Re: Importing from JSON flattening too much

67 views
Skip to first unread message

David Huynh

unread,
Nov 17, 2012, 11:01:38 PM11/17/12
to openr...@googlegroups.com
After the project is created, you can use the Join Multi-valued Cells command to combine "android" and "ios" into one cell, thus reducing 4 rows to 3.

David

On Sat, Nov 17, 2012 at 4:51 PM, Kim Pham <kimc...@gmail.com> wrote:
When I try to import this json data to Refine.

{"items" : [
{"name":"item A", "tags":["android","ios"]},
{"name":"item B", "tags":["ios"]},
{"name":"item C", "tags":["android"]},
]}

It doesn't give me 3 rows as expected, but 4 rows instead. It tried to expand the first item's tags attribute

 "tags":["android","ios"]

into 2 records.

Is there any way to tell Refine to not flatten the value of each column. Thus, the output table will have 3 rows, whose "tags" column will contain

["android","ios"]
["ios"]
["android"]

respectively.

Thanks !

Kim

--



David Huynh

unread,
Nov 19, 2012, 12:16:29 AM11/19/12
to openr...@googlegroups.com
The "records" model was not thoroughly designed and implemented. But within an expression being applied on a particular row, you could still access all rows in the same record.

For your case, I'd recommend combining the "os" and "brand" columns first, such as creating a new column with this expression

cells["os"].value + " : " + cells["brand"].value

and then invoke the Join Multi-valued Cells command on that new column. Then you'd get something like

android : google, jellybean : google

David

On Sun, Nov 18, 2012 at 1:55 PM, Kim Pham <kimc...@gmail.com> wrote:
That works for the previous case, but won't work for something like this:

{"items" : [
{"name":"item A", "tags":[{"os":"android","brand":"google"}, {"os":"jellybean","brand":"google"}]},
{"name":"item B", "tags":{"os":"ios","brand":"apple"}},
{"name":"item C", "tags":{"os":"android","brand":"google"}},
]}

When you apply Join Multi-valued cells command to "os" and "brand", with a separate ",", you'll get two columns
"os" with value "android","jellybean"
"brand" with value "google","google"

Which throws away the structure of the json.

Btw, i notice that after importing, refine has a notion of "records", and it knows that I have 3 records, not 4 records. However, all the commands only works with rows. I wonder what the point of "records view"?

Vào 20:01:40 UTC-8 Thứ bảy, ngày 17 tháng mười một năm 2012, David Huynh đã viết:
--
 
 

Tom Morris

unread,
Nov 20, 2012, 9:41:06 AM11/20/12
to openr...@googlegroups.com
Another approach would be to pre-process the JSON using either Refine (import as text instead of JSON) or another tool to convert the nested arrays into strings and then import the result as JSON for processing.

As David said, record mode as currently implemented doesn't handle complete representation of possible JSON or XML structures.  There are a few small improvements in the current git repository, but nothing that would help with this case.  We'd need to rework the underlying data structures to be able to handle this correctly.

Tom


--
 
 

Reply all
Reply to author
Forward
0 new messages