convert csv to flare.json format

jenny999

Nov 3, 2011, 2:54:21 PM
to d3-js
Hello!

Is there a fast way to convert a large csv file to the flare.json
format so that the data can be shown in a treemap?

Here is a small snippet of the csv file:

ParentItem    SubsidiaryItem  Country  Price
Car           SteeringWheel   USA      High
Car           Tyre            USA      High
Car           SeatBelt        USA      High
Bicycle       Handle          Mexico   Medium
Bicycle       Peddle          Mexico   Medium
Plane         Wing            France   Very High
Plane         Engine          France   Very High
Plane         Fuselage        France   Very High
Plane         Seat Tray       China    Very High
Skateboard    WheelA          China    Low
Skateboard    PlasticPart     China    Low
RollerSkates  Strap           China    Low
RollerSkates  WheelB          China    Low

The treemap should use color to represent the countries, and
the size should represent the number of SubsidiaryItems under each
ParentItem. Could someone show a simple way to get this result,
preferably with the zoom function? Thank you very much.

wimdows

Nov 4, 2011, 5:40:59 AM
to d3-js
Jenny,

Try converting your data with one of the CSV-to-JSON conversion pages
you can find on the web. After that you can check the syntax with
JSONLint (also an online service).

regards,

wim


Iain

Nov 4, 2011, 7:34:45 AM
to d3...@googlegroups.com
I'm not familiar with the structure you need to create the treemap, but a nice (graphical) way to convert a table data structure to json is to use Google Refine. It also makes cleaning, joining etc. easier.

Iain

Peter Rust

Nov 4, 2011, 10:08:09 AM
to d3-js
Jenny,

If the data doesn't change, it would be preferable to convert it with an
online tool as wim suggests. However, if the data changes frequently
and you need the webpage to always grab the latest CSV, you can import
csv with the d3 functions described here: https://github.com/mbostock/d3/wiki/CSV

Once you have the CSV loaded as JSON/javascript objects, you'll want
to convert it from a flat list to a nested/hierarchical structure
using d3.nest() as described here: https://github.com/mbostock/d3/wiki/Arrays#wiki-d3_nest.

As far as a simple way to get the results with a zoom function, you'll
probably want to pick one of the example Tree Layouts from the wiki
that you like (https://github.com/mbostock/d3/wiki) and take care to
get your data looking like the data in the example (via d3.nest()) and
then switch the example to use your data instead of the example data.
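
For the offline route, the flat-list-to-hierarchy step can also be sketched outside the browser. The Python below is only a minimal sketch, not a posted solution: it assumes the file is comma-separated with the column names from the original question, and the names `SAMPLE`, `to_flare` and the root label "items" are made up for illustration. Each leaf gets a size of 1, so a treemap would size each ParentItem by its number of SubsidiaryItems, and the country is kept on the leaf for colouring.

```python
import csv
import io
import json

# First rows of the snippet from the question, assumed to be comma-separated.
SAMPLE = """ParentItem,SubsidiaryItem,Country,Price
Car,SteeringWheel,USA,High
Car,Tyre,USA,High
Car,SeatBelt,USA,High
Bicycle,Handle,Mexico,Medium
Bicycle,Peddle,Mexico,Medium
"""


def to_flare(rows, root_name="items"):
    """Group flat rows under their ParentItem in a flare-style tree."""
    parents = {}  # ParentItem -> node, in first-seen order
    for row in rows:
        parent = parents.setdefault(
            row["ParentItem"], {"name": row["ParentItem"], "children": []}
        )
        # size 1 per leaf: the parent's area then reflects its item count
        parent["children"].append(
            {"name": row["SubsidiaryItem"], "country": row["Country"], "size": 1}
        )
    return {"name": root_name, "children": list(parents.values())}


flare = to_flare(csv.DictReader(io.StringIO(SAMPLE)))
print(json.dumps(flare, indent=1))
```

The printed text can be saved as a .json file and loaded in place of the example data.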

-- Peter Rust
Developer, Cornerstone Systems

jenny999

Nov 5, 2011, 4:44:07 AM
to d3-js
Thank you all. I understand the d3.nest() function better now. I was
also wondering if there is a way to display the name on the treemap
cell in addition to the returned value, i.e. if a particular color
represents a country, how do we display the country name on the cell?
Or is there a way to add a legend? Thanks.

Mike Bostock

Nov 5, 2011, 11:47:44 AM
to d3...@googlegroups.com
> I was also wondering if there is a way to display the name on
> the treemap cell in addition to the return value on it.

The treemap examples do this. In SVG, you use an svg:text element; in
HTML, you just set the text content of your div elements with the
`text` operator.

Mike

jenny999

Nov 6, 2011, 2:13:07 AM
to d3-js
Alright, thanks. I've added the zoom function and bound it with
.on("click"). I also want to add a mouseover handler with
.on("mouseover") to show the extra labels of the treemap. The
problem is that when I add both to the cell var, only the one that
appears first works. Is there a way to work around this? Thanks.

Thug

Nov 18, 2011, 3:49:32 AM
to d3...@googlegroups.com
One further question. In contrast to results shown in the d3.nest() example, the current flare.json data structure is free of redundancies (parents common to multiple children). Looking at the d3.nest() example, this suggests that instead of :

[{year: 1931, values: [
   {key: "Manchuria", values: [
     {yield: 27.00, variety: "Manchuria", year: 1931, site: "University Farm"},
     :
..we should ideally see something along the lines of :

  {
   "year": "1931",
   "values": [
    {
     "variety": "Manchuria",
     "values": [
      {
       "site": "University Farm",
       "values": [
        {yield: 27.00}
     ]
    },
    :

Is such an optimisation possible at the moment using d3?

Thanks
Thug

Mike Bostock

Nov 18, 2011, 10:38:18 AM
to d3...@googlegroups.com
> Is such an optimisation possible at the moment using d3?

That would require creating new objects that contain only the
properties that weren't nested. So I'm not sure that would really be
an optimization. You can do this yourself by specifying a rollup
operator, for example, rollup(function(v) { return v.map(function(d) {
return d.yield; }); }) would have values contain an array of yields
rather than objects.
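
To make the effect of the rollup concrete, here is a rough Python analogue of nesting with a rollup. It is a sketch of the idea only, not d3's implementation: `nest` is a hypothetical helper, the first row is taken from the example quoted above, and the other two rows are invented for illustration. Unlike d3, this version leaves the keys as they are rather than turning them into strings.

```python
def nest(rows, keys, rollup=None):
    """Group rows by each key in turn; apply rollup to the innermost groups."""
    if not keys:
        return rollup(rows) if rollup else rows
    groups = {}  # key value -> rows, in first-seen order
    for row in rows:
        groups.setdefault(row[keys[0]], []).append(row)
    return [
        {"key": k, "values": nest(group, keys[1:], rollup)}
        for k, group in groups.items()
    ]


rows = [
    {"year": 1931, "variety": "Manchuria", "site": "University Farm", "yield": 27.0},
    {"year": 1931, "variety": "Manchuria", "site": "Waseca", "yield": 48.9},       # invented
    {"year": 1931, "variety": "Glabron", "site": "University Farm", "yield": 43.1},  # invented
]

# Without a rollup the leaves are the full row objects; with it, only the yields.
nested = nest(rows, ["year", "variety"], rollup=lambda v: [d["yield"] for d in v])
print(nested)
```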

Mike

Thug

Nov 22, 2011, 12:23:35 PM
to d3...@googlegroups.com
Though rollup seems to work, my json-from-csv data still didn't get displayed, so I switched to using data from the stocks example (known to work in another context, so should be ok in generating a node-link tree):

symbol,date,price
S&P 500,Jan 2000,1394.46
S&P 500,Feb 2000,1366.42
S&P 500,Mar 2000,1498.58
S&P 500,Apr 2000,1452.43
S&P 500,May 2000,1420.6
S&P 500,Jun 2000,1454.6 
       :             :           :

The trace looks ok up to the point where the links are created. Step by step (with boldening by me):

d3.csv("file:///Volumes/PLEXTOR/AAA_MyStartup/CSV/stocks.csv", function(csv) {
console.log("csv = " + csv.toSource() );

csv = [{symbol:"S&P 500", date:"Jan 2000", price:"1394.46"}, {symbol:"S&P 500", date:"Feb 2000", price:"1366.42"}, {symbol:"S&P 500", date:"Mar 2000", price:"1498.58"}, {symbol:"S&P 500", date:"Apr 2000", price:"1452.43"}, {symbol:"S&P 500", date:"May 2000", price:"1420.6"}, {symbol:"S&P 500", date:"Jun 2000",
     :             :
{symbol:"AAPL", date:"Dec 2009", price:"210.73"}, {symbol:"AAPL", date:"Jan 2010", price:"192.06"}, {symbol:"AAPL", date:"Feb 2010", price:"204.62"}, {symbol:"AAPL", date:"Mar 2010", price:"223.02"}]
 

json = d3.nest()
.key(function(d) { return d.symbol; })
.key(function(d) { return d.date; })
.rollup(function(v) { return v.map(function(d) { return d.price; }); })
.entries(csv);

console.log("json = " + json.toSource() );

json = [{key:"S&P 500", values:[{key:"Jan 2000", values:["1394.46"]}, {key:"Feb 2000", values:["1366.42"]}, {key:"Mar 2000", values:["1498.58"]}, {key:"Apr 2000", values:["1452.43"]}, {key:"May 2000", values:["1420.6"]}, {key:"Jun 2000",
     :             :
{key:"Oct 2009", values:["188.5"]}, {key:"Nov 2009", values:["199.91"]}, {key:"Dec 2009", values:["210.73"]}, {key:"Jan 2010", values:["192.06"]}, {key:"Feb 2010", values:["204.62"]}, {key:"Mar 2010", values:["223.02"]}]}]

var nodes = tree.nodes(json);
console.log("nodes = " + nodes.toSource() );

 
nodes = [[{key:"S&P 500", values:[{key:"Jan 2000", values:["1394.46"]}, {key:"Feb 2000", values:["1366.42"]}, {key:"Mar 2000", values:["1498.58"]}, {key:"Apr 2000", values:["1452.43"]}, {key:"May 2000", values:["1420.6"]}, {key:"Jun 2000", values:["1454.6"]}, {key:"Jul 2000", values:["1430.83"]},
     :             :
{key:"Oct 2009", values:["188.5"]}, {key:"Nov 2009", values:["199.91"]}, {key:"Dec 2009", values:["210.73"]}, {key:"Jan 2010", values:["192.06"]}, {key:"Feb 2010", values:["204.62"]}, {key:"Mar 2010", values:["223.02"]}]}]]
 

var link = vis.selectAll("path.link").data(tree.links(nodes))
.enter().append("svg:path").attr("class", "link").attr("d", diagonal);

console.log("link = " + link.toSource() );

link = [[]]

var link = vis.selectAll("path.link").data(tree.links(nodes)).enter().append("svg:path").attr("class", "link").attr("d", diagonal);
console.log("link = " + link.toSource() );

node = [[({})]]

Comparing the above to a sane json file (flare.json): only thing looking suspect are the increasing number of enclosing brackets:

{
 "name": "flare",
 "children": [
  {
   "name": "analytics",
   "children": [
    {
     "name": "cluster",
     "children": [
      {"name": "AgglomerativeCluster", "size": 3938},
      :

Any idea what is going wrong?

Thanks
Thug

Thug

Nov 22, 2011, 1:18:03 PM
to d3...@googlegroups.com

Copy/paste error above. Last trace output should of course be preceded by node instead of link code :

var node = vis.selectAll("g.node").data(nodes).enter().append("svg:g").attr("class", "node").attr("transform", function(d) {return "rotate(" + (d.x - 90) + ")translate(" + d.y + ")"});

console.log("node = " + node.toSource() );

node = [[({})]]

Thug

Nov 22, 2011, 1:57:00 PM
to d3...@googlegroups.com

In the DOM inspector, I see the following is throwing up a NaN error:

node = vis.selectAll("g.node").data(nodes).enter().append("svg:g").attr("class", "node").attr("transform", function(d) { return "rotate(" + (d.x - 90) + ")translate(" + d.y + ")"});

class="node" transform="rotate(NaN)translate(0)"

This also seems to suggest a formatting problem arising during the csv-json conversion process.

Thug

Nov 22, 2011, 2:47:57 PM
to d3...@googlegroups.com
Mmm. All I see is that d contains the json:

node = vis.selectAll("g.node").data(nodes).enter().append("svg:g").attr("class", "node").attr("transform", function(d) {

console.log("d = " + d.toSource());


return "rotate(" + (d.x - 90) + ")translate(" + d.y + ")";
});

d = [{key:"S&P 500", values:[{key:"Jan 2000", values:["1394.46"]}, {key:"Feb 2000", values:["1366.42"]}, {key:"Mar 2000", values:["1498.58"]}, {key:"Apr 2000", values:["1452.43"]}, {key:"May 2000", values:["1420.6"]}, {key:"Jun 2000", values:["1454.6"]}, {key:"Jul 2000", values:["1430.83"]}, {key:"Aug 2000", values:["1517.68"]}, {key:"Sep ...

Thug

Nov 30, 2011, 10:00:50 AM
to d3...@googlegroups.com
Hi,

I've returned to this problem (generating "optimised", non-redundant JSON from CSV files) a number of times. Even with CSV files known to work in other contexts, extended by a root column holding the value zero throughout, the generated JSON (tested using http://www.jsonlint.com) is invariably invalid.

Here (in order of likelihood) are the aspects I feel may be contributing to the failure: 

  • Missing double quotes around the generated collection labels key and values
  • Though present in original csv, missing label for field contents identified in the rollup function (here price).
  • A gradual accumulation of outer square brackets around the entire logged output construct.

Valid JSON looks like:

{
    "name": "flare",
    "children": [
        {
            "name": "analytics",
            "children": [
                {
                    "name": "cluster",
                    "children": [
                        {
                            "name": "AgglomerativeCluster",
                            "size": 3938
                        },

Invalid JSON (here generated from stocks.csv) looks like:

{
    key: "0",
    values: [
        {
            key: "S&P 500",
            values: [
                {
                    key: "Jan-00",
                    values: [
                        "1394.46"
                    ]
                },
                {
                    key: "Feb-00",
                    values: [
                        "1366.42"
                    ]
                },

Here the code generating the above:

json = d3.nest().key(function(d) { return d.root; })


.key(function(d) { return d.symbol; })
.key(function(d) { return d.date; })
.rollup(function(v) { return v.map(function(d) { return d.price; }); })
     .entries(csv);

Are there inconsistencies in some of these respects in d3's conversion of csv to json? From http://www.jsonlint.com : "Be sure to follow JSON's syntax properly. For example, always use double quotes, always quotify your keys, and remove all callback functions"
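
One possible reading of the points above: the logged text appears to come from toSource(), which prints JavaScript source notation rather than JSON, so the unquoted key and values labels may be a property of the logging call rather than of the nested data. A serialiser (JSON.stringify in the browser, json.dumps in Python) quotes every key. As a language-neutral sketch, assuming the nested structure shown above has simply been copied into a Python literal:

```python
import json

# The nested structure from the post above, typed in by hand.
nested = {
    "key": "0",
    "values": [
        {
            "key": "S&P 500",
            "values": [
                {"key": "Jan-00", "values": ["1394.46"]},
                {"key": "Feb-00", "values": ["1366.42"]},
            ],
        }
    ],
}

# Serialising quotes every key, which is what a JSON validator expects.
text = json.dumps(nested, indent=4)
print(text)

# Round trip: the serialised text parses back to the same structure.
assert json.loads(text) == nested
```

Whether the result is then usable by a tree layout is a separate question from whether it is valid JSON, since the layout still has to be told where the children live.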

The underlying motivation is that I would like to use part of a very large, open-source, csv format spreadsheet, but realise that the json equivalent of the little actually needed could be many times smaller. My thought was that having shown that a node-link tree could be generated from the converted csv, I could in good faith write the json to file for this purpose. Moreover the code could be used as a base for a more general purpose file converter/optimiser.

Regards
Thug


skore

Dec 3, 2011, 3:04:43 PM
to d3-js
I'm pretty new to javascript and hacking my way through d3.js, so I'm
not sure how much of this applies to your problem and please take what
I write with some salt. In any case, this is how I solved the very
same issue for my case:

One of the most important parts is how the partition layout converts
the nested set. For that, you need to define the call like so:

var partition = d3.layout.partition()
.sort(null)
.size([2 * Math.PI, r * r])
.value(function(d) { return d.values; })
.children(function (d) { return d.values; });

This means that it looks for both the value and the children
in .values (instead of the standard .children) - we have to do this
because a nested set only has .values as properties of a node.

I pull my data from a json source, but it really doesn't matter much -
my data isn't even close to being formatted in the way we need it
here, it's just a collection of sales records:

{"type":"SalesCollection","sales":[
{"type":"Sale","id":"9122","invoice":"invoice-1","date":"2011-11-23","plan":"60","group":"10","amount":"55.00"},
{"type":"Sale","id":"9123","invoice":"invoice-2","date":"2011-11-23","plan":"60","group":"10","amount":"34.00"},
{"type":"Sale","id":"9124","invoice":"invoice-3","date":"2011-11-23","plan":"69","group":"8","amount":"12.00"}
]}

From this, I build the nested set:

pre.values = d3.nest()
    .key(function(d) { return d.group; })
    .rollup(function(v) {
        return d3.nest()
            .key(function(d) { return d.plan; })
            .rollup(function(v) { return d3.sum(v.map(function(d) { return d.amount; })); })
            .entries(v);
    })
    .entries(json.sales); // <- get the sales array from the json above

So in this case, I get a two-level nested set array: group -> plan ->
amount. I get the amount with the rollup trick that Mike quoted
above.
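
For readers who want to check the expected shape of that two-level set without running d3, here is a rough Python equivalent. It is a sketch under assumptions: `nest_sum` is a hypothetical helper, the records are the three sales quoted above with unused fields dropped, and amounts are converted to floats explicitly.

```python
# The three sales records from above, trimmed to the fields that matter here.
SALES = [
    {"id": "9122", "plan": "60", "group": "10", "amount": "55.00"},
    {"id": "9123", "plan": "60", "group": "10", "amount": "34.00"},
    {"id": "9124", "plan": "69", "group": "8", "amount": "12.00"},
]


def nest_sum(records, keys, field):
    """Nest by each key in turn; the innermost value is the sum of `field`."""
    if not keys:
        return sum(float(r[field]) for r in records)
    groups = {}  # key value -> records, in first-seen order
    for record in records:
        groups.setdefault(record[keys[0]], []).append(record)
    return [
        {"key": k, "values": nest_sum(group, keys[1:], field)}
        for k, group in groups.items()
    ]


# A single root object wrapping the nested array, mirroring `pre` above.
pre = {"key": 0, "values": nest_sum(SALES, ["group", "plan"], "amount")}
print(pre)
```

Every node then has a key and a values entry: a list for inner nodes and a number for the innermost ones, which is the distinction the accessors above rely on.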

Finally, I take almost the same call for the path:

var path = vis.data([pre]).selectAll("path")
    .data(partition.nodes).enter()
    .append("svg:path")
    .attr("display", function(d) { return d.depth ? null : "none"; }) // hide inner ring
    .attr("d", arc)
    .attr("fill-rule", "evenodd")
    .style("opacity", "0.8")
    .style("fill", function(d) { return color(((typeof d.values != 'object') ? d.parent : d).key); });

As you can see, in the "fill" rule, I make a distinction on whether
the content of d.values is an object - thus giving me the correct
color rule depending on whether I'm in a child item or a group.

The part I'm working on right now is that instead of just calling in
the data, I would like to attach it to the path I'm creating.

Anyhow - hope it helps. Oh and also: Hello World, d3-js list.

cheers,
skore


skore

Dec 3, 2011, 3:12:09 PM
to d3-js
Minor oversight, this is the complete code for creating the nested
array:

var pre = new Object;
pre.key = 0;

pre.values = d3.nest()
    .key(function(d) { return d.group; })
    .rollup(function(v) {
        return d3.nest()
            .key(function(d) { return d.plan; })
            .rollup(function(v) { return d3.sum(v.map(function(d) { return d.amount; })); })
            .entries(v);
    })
    .entries(json.sales); // <- get the sales array from the json above

cheers,
skore

Jon

Dec 7, 2011, 12:35:51 PM
to d3-js
I have been messing around trying to convert csv to flare style json
with limited success.... :(

My trial code is here: http://bl.ocks.org/1439067

Eventually I want to allow users to:

1 -- Apply multiple filters to a csv file to obtain a sub selection of
interest via dropdown select box.
2 -- Choose from a variety of different nest options to create
different "flare.json" style format.
3 -- Choose from a variety of different visualization options that use
the "flare.json" style format.
4 -- RENDER the visualisation.

I can generate a treemap from a filtered and nested csv file. Then if
I select a csv filter and refresh the page it updates the treemap!
This "refresh" approach doesn't seem to work on bl.ocks.org... and I
would rather it happened on an "onchange" event... but I can't figure out
how to make it work!

I am sure this is probably something really simple for somebody who
knows what they are doing!!

Hope somebody can point out the mistake... or even better suggest ways
to progress towards the end goal....

cheers
James

James

Dec 8, 2011, 7:55:29 AM
to d3-js
Ok was trying to do too much...

much simplified version now here: http://bl.ocks.org/1446865

On select it draws the treemap for the filtered csv and onchange it
redraws. Hooray! .. BUT positioning goes haywire and old nodes are not
being removed.... not so good.

I'm guessing this is something to do with cell.exit().remove();

Despite lots of poking I still can't figure out the problem.


d3 is tough with my almost non existent javascript or programming
background.... but I recognise its power and hope I can figure out
some more of the basics soon!

James

Mike Bostock

Dec 8, 2011, 11:37:29 AM
to d3...@googlegroups.com
The standard enter/update/exit pattern should look like this:

// join data to .cell elements
// optionally specify a key function to the data join!
var cell = svg.selectAll(".cell")
.data(nodes);

// enter new elements
var cellEnter = cell.enter().append("g")
.attr("class", "cell")

cellEnter.append("rect")

cellEnter.append("text")

// update remaining elements
cell.select("rect")

cell.select("text")

// remove old elements
cell.exit().remove();

Your code isn't working because you are appending the rect and text to
the update selection (cell) rather than the enter selection
(cell.enter()). And conversely, you're not updating the update
selection or using a key function for the data join.
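
The bookkeeping behind a keyed data join can be sketched without any DOM. The Python below is an illustration of the concept only, not how d3 implements it; `data_join` and the sample item names are hypothetical. Items are matched by key: data with no matching element "enters", matched items "update", and elements with no matching data "exit".

```python
def data_join(existing, data, key=lambda d: d):
    """Split a join into (enter, update, exit) lists by matching keys."""
    old = {key(d): d for d in existing}  # what is currently drawn
    new = {key(d): d for d in data}      # what should be drawn
    enter = [d for k, d in new.items() if k not in old]   # create elements
    update = [d for k, d in new.items() if k in old]      # reposition/restyle
    exit_ = [d for k, d in old.items() if k not in new]   # remove elements
    return enter, update, exit_


drawn = ["Car", "Bicycle", "Plane"]        # cells from the previous filter
wanted = ["Plane", "Car", "Skateboard"]    # cells for the new filter

enter, update, exit_ = data_join(drawn, wanted)
print(enter, update, exit_)
```

Without a key function the match is by position instead, so a cell can be reused for a different item; that is one reason stale content may linger after a filter change.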

Mike

James

Dec 8, 2011, 4:08:13 PM
to d3-js
Thanks Mike,

Have been hacking stuff together that I still don't fully understand.
Your guidance above is the clearest yet...definitely some dim lights
at the end of my tunnel as things become clearer!

Ok v3 is now updated here:

http://bl.ocks.org/1446865

It's getting better. Old cells are now removed properly but new cells
are still all over the place. I am thinking this is something to do
with treemap.sticky(false)?

Found this: https://github.com/mbostock/d3/issues/393

and tried some tweaks but no luck yet. Is this an unresolved issue?

Mike Bostock

Dec 8, 2011, 4:19:30 PM
to d3...@googlegroups.com
> I am thinking this is something to do with treemap.sticky(false)?

Not likely. That bug that you linked isn't a bug, just a
misunderstanding about the behavior of sticky(true).

The problem in your case is that you've forgotten to update the g.cell
elements' transform attribute. You could say:

cell.attr("transform", function(d) { return "translate(" + d.x + "," + d.y + ")"; });

Also, you can simplify your code by removing the duplicate operations
on both the enter and update selections. Whatever you apply to the
update selection after enter is applied to both entering and updating
nodes. So, for example, rather than saying this:

var cellEnter = cell.enter().append("g")
.attr("class", "cell");

cellEnter.append("rect")
.attr("width", function(d) { return d.dx - 1; })
.attr("height", function(d) { return d.dy - 1; })
.style("fill", function(d) { return color(d.parent.key); })

cell.select("rect")
.attr("width", function(d) { return d.dx - 1; })
.attr("height", function(d) { return d.dy - 1; })
.style("fill", function(d) { return color(d.parent.key); })

You can just say:

var cellEnter = cell.enter().append("g")
.attr("class", "cell");

cellEnter.append("rect");

cell.select("rect")
.attr("width", function(d) { return d.dx - 1; })
.attr("height", function(d) { return d.dy - 1; })
.style("fill", function(d) { return color(d.parent.key); })

Of course, if something only needs to be set on enter, you should set
it there rather than on update to improve performance.

Mike

James

Dec 8, 2011, 4:47:45 PM
to d3-js
Sweet and simple. updated here: http://bl.ocks.org/1446865

now to test with big data....

thanks Mike!!

"the conductor enters the stage, raps three times with his baton, and
harmony emerges from the chaos." --Arthur Koestler, The Sleepwalkers
(New York: Macmillan, 1968), p. 25

Thug

Dec 10, 2011, 1:51:57 PM
to d3...@googlegroups.com
Hi Mike, d3'ers

Mike : given difficulties encountered generating valid JSON from CSV, a burning question for me now is how the original flare.json was generated. Can you summarise the steps you used?

D3'ers: for the reasons described below, should someone be in possession of a working version of a node-link tree based on CSV rather than JSON data, I'd be glad indeed if it could be made public.

Why? My only viable data source is a large (ca. 100-line), deeply nested (up to 10 levels) csv-formatted file. I want very much to load this into the node-link tree, but I've been at this (on and off) since mid-November... :-(

Don't get me wrong. I've come a long way in other areas of d3, and understand how CSV has been made to work with other hierarchical visualisations (such as the one mentioned in the thread immediately above). CSV display in a node-link tree appears, however (and especially for multi-level hierarchies of varying depths), hamstrung by one or more formatting issues. I've tried at least three approaches, but feel I'm close to exhausting the possibilities:
  • json produced from csv using online tools

    Using only csv files known to work on other public, hierarchical visualisations (such as the one mentioned in the thread above), no online csv-to-json converter has produced a file acceptable to d3.tree (not to speak of "optimised" or truly hierarchical json, as in the original flare.json file).

  • json produced from csv using d3's key, map and rollup calls

    The calls fail consistently, but either deep within the d3 libraries or in the form of empty link or node objects. I had hoped to get away without logging calls from within d3, but this now seems unavoidable.

  • not in itself a solution, but more recently I've been trying to set up a jsfiddle with which I could share the problem. Unfortunately, difficulties with cross domain data loading are increasingly pushing the approach away from the original.

With the CSV and JSON specs at hand, for the first two approaches I experimented with csv files both with and without:
  • additional "root" column
  • multiple "root" values
  • identical values in the first (root) csv column
  • column headers line
  • empty strings ("")
  • empty fields ( ", , , ")
  • surrounding quotes ("word", 'word', word)
Thanks
Thug

James

Dec 10, 2011, 2:17:24 PM
to d3-js
Hi Thug,

To date I have been using python to generate flare style
hierarchies...It has successfully converted 12,000 line csv files into
exact d3 flare format (with a bit of cutting and pasting from the
python shell).

For the moment it is MUCH faster than through nested csv with d3.js
(but I still dont understand nesting that well). I still like the
pure javascript option because I want to filter and nest my data in
multiple ways and tweaking the python code every time is tedious!

Maybe somebody who understands python better could turn it into some
sort of useful tool.

code is pretty simple...

import csv, itertools, json

def cluster(rows):
    result = []
    # groupby only merges adjacent rows, so rows that share a value in the
    # first column must already sit next to each other in the CSV
    for k, g in itertools.groupby(rows, lambda r: r[0]):
        group = list(g)
        row = group[-1]  # last row of the group (Python 2 leaked this from the comprehension)
        group_rows = [r[1:] for r in group]

        if len(row[1:]) == 4:
            # NEED TO FIDDLE WITH THIS LINE DEPENDING ON NUMBER OF COLUMNS IN YOUR CSV
            result.append({"name": row[1], "fon": row[1],
                           "bud": int(row[2]), "act": int(row[3])})
        else:
            result.append({"name": k, "children": cluster(group_rows)})

    return result

if __name__ == '__main__':
    # PASTE YOUR CSV BETWEEN THE TRIPLE QUOTES
    s = '''\
'''

    rows = list(csv.reader(s.splitlines()))
    print(json.dumps(cluster(rows), indent=2))

James

Dec 10, 2011, 6:59:49 PM
to d3-js
On re-reading, the last example was not so understandable... so

here's an example whose output has been validated by JSONLint

# -*- coding: utf8 -*-
import csv, itertools, json

def cluster(rows):
    result = []
    # groupby only merges adjacent rows, so equal values must be contiguous
    for k, g in itertools.groupby(rows, lambda r: r[0]):
        group = list(g)
        row = group[-1]  # last row of the group (Python 2 leaked this from the comprehension)
        group_rows = [r[1:] for r in group]

        if len(row[1:]) == 1:
            # leaf: one name column followed by one size column
            result.append({"name": row[0], "size": int(row[1])})
        else:
            result.append({"name": k, "children": cluster(group_rows)})

    return result

if __name__ == '__main__':
    s = '''\
wotamess,907,3-4-0-0412-070-0160-02-1,3-621-10,850000
wotamess,907,3-4-0-0412-070-0160-02-1,3-628-20,850000
wotamess,907,4-4-0-0912-070-0150-02-1,2-611-07,111318000
wotamess,907,4-4-0-0912-070-0150-02-1,2-611-22,440775000
'''
    rows = list(csv.reader(s.splitlines()))
    print(json.dumps(cluster(rows), indent=2))

churcjos

Apr 20, 2012, 2:38:57 PM
to d3...@googlegroups.com
Hi All,

I came up with a fairly simple solution to get flat, record-level data into the nested flare json format. All we had to do was edit 2 lines in the d3.v2.js file, which changed the output of the d3.nest function to "name" & "children" instead of "key" & "values". Then we created a parent object to insert the nest into. And voilà, that worked. See the fiddle to find what edits I made to the d3.v2.js file and how I used the nest function.
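
The same renaming can also be done as a post-processing step, without touching the library file. The Python below is a sketch of that idea rather than the edit described above: `rename` and `wrap` are hypothetical helpers, the sample entries are hand-written in the key/values shape, and it assumes a rollup has reduced each innermost values entry to a single number.

```python
import json


def rename(entry):
    """Turn one {key, values} entry into a flare-style {name, ...} node."""
    node = {"name": entry["key"]}
    values = entry["values"]
    if isinstance(values, list):
        # inner node: values holds further {key, values} entries
        node["children"] = [rename(v) for v in values]
    else:
        # innermost node: values is assumed to be a rolled-up number
        node["size"] = values
    return node


def wrap(entries, root_name="root"):
    """Insert the renamed entries into a single parent object."""
    return {"name": root_name, "children": [rename(e) for e in entries]}


# Hand-written sample in the key/values shape.
entries = [
    {"key": "10", "values": [{"key": "60", "values": 89.0}]},
    {"key": "8", "values": [{"key": "69", "values": 12.0}]},
]

flare = wrap(entries, "sales")
print(json.dumps(flare, indent=1))
```

If the innermost values are still arrays of raw records (no rollup), this sketch would not apply as written, because those records have no key entry.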


Sorry if this got posted twice.  I think the first time I tried it just sent an email rather than get added to the discussion.

TapioK

Oct 10, 2012, 5:54:24 AM
to d3...@googlegroups.com
Interesting. I've tried to get this to work on partition-icicle-zoom. Am I missing something.. how do I manipulate the csv to change key -> name and values->children. 
So instead of  
"key": "Jan 2000",
      "values": [
         {
            "key": "S&P 500",
            "values": [
               {
                  "symbol": "S&P 500",
                  "date": "Jan 2000",
                  "price": "1394.46"

I get
"name": "Jan 2000",
      "children": [
         {
            "name": "S&P 500",
            "children": [
               {
                  "symbol": "S&P 500",
                  "date": "Jan 2000",
                  "price": "1394.46"

churcjos' tip does not seem to work on partition?

Kai Chang

Oct 10, 2012, 7:14:33 AM
to d3...@googlegroups.com
Check out underscore.nest, which is designed to do exactly this
(convert a csv to nested structure) with convenient, simple syntax:

https://github.com/iros/underscore.nest

TapioK

Oct 11, 2012, 2:13:31 PM
to d3...@googlegroups.com, kai.s...@gmail.com
Thanks! I'll check that.

Sudhir C

Nov 1, 2012, 11:18:28 PM
to d3...@googlegroups.com, jarobe...@googlemail.com
Hi James,

The python code works very well, Thank you for providing it.

Is there a PHP version of this, so I can directly convert database results to flare.json kinda format?

Thanks again,
Sudhir

Nikhil VJ

Dec 13, 2014, 6:45:16 AM
to d3...@googlegroups.com
Hi James,

I copy-pasted the script exactly as you've given; but no luck.. am getting this error message:
  File "D:\TECH\visualization D3 dot JS\test.py", line 25
    rows = list(csv.reader(s.splitlines()))
                                          ^
IndentationError: unindent does not match any outer indentation level


(The caret is positioned under the last bracket.)

I'm using Python 3.4.2, on a windows command line prompt.

Attached here is the CSV file that needs to be converted to hierarchical JSON. It has id in first column, and parent_id in second column.
This format has worked with the project here:
https://github.com/stephen-james/DataStructures.Tree/
...and is also the easiest to implement with the kind of data I'm working with (A city's budget book).

electricalbudgetforcsv.csv

Alex J

Jan 2, 2015, 12:17:06 PM
to d3...@googlegroups.com
hey Nikhil,

I've got a py script for converting flat csvs to flare.json. I went to do your file but noticed that the delimiters are a little wonky and you have some fields that are splitting when they shouldn't be (e.g. your header shows that you only have 4 fields, with 'size' being the last, however some rows split into a fifth field). If you can clean that part up a little I'll pass it through the script and see if I can get you the flare.json; I just don't have time to clean up the original csv.

Nikhil VJ

Jan 3, 2015, 7:40:38 AM
to d3...@googlegroups.com
Hi Alex,

Sorry, that earlier CSV was a dirty one. Here's a fixed version attached.. all the best!

There are some more columns now, but the deal remains the same : first column is id, second column is parentId. It's a self-referencing table.
It would be great if the JSON produced treated numbers as numbers and didn't put double-quotes around them, since the visualization code I'm using seems to be allergic to numbers in quotes!

About python : is there some difference between Python 3.4.2 and 2.7.9 that breaks the older scripts? Or was James's code not compatible with this kind of CSV (not exactly flat.. more self-referencing and flexible-depth) to begin with?

Alternately, can there be a CSV-to-CSV converter that "flattens" out a self-referencing CSV and makes it d3.nest()-friendly?
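
For reference, converting a self-referencing table into a hierarchy does not require flattening it first. The Python 3 sketch below is only an illustration under assumptions: the attached file is not reproduced here, so `SAMPLE` is invented data using hypothetical column names (id, parentId, name, size), and `to_number` and `build_tree` are made-up helpers. Numeric sizes are converted before serialising, so they come out without quotes.

```python
import csv
import io
import json

# Invented sample: first column id, second column parentId.
SAMPLE = """id,parentId,name,size
1,,Budget,
2,1,Wiring,
3,2,Cables,1200
4,2,Switches,300.5
5,1,Lighting,450
"""


def to_number(text):
    """Return an int or float when the text parses as one, else the text."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def build_tree(rows):
    """Link each row to its parent by id; rows without a known parent are roots."""
    nodes, parent_of = {}, {}
    for row in rows:
        node = {"name": row["name"]}
        if row["size"] != "":
            node["size"] = to_number(row["size"])
        nodes[row["id"]] = node
        parent_of[row["id"]] = row["parentId"]
    roots = []
    for node_id, node in nodes.items():
        parent = nodes.get(parent_of[node_id])
        if parent is None:
            roots.append(node)
        else:
            parent.setdefault("children", []).append(node)
    return roots


tree = build_tree(csv.DictReader(io.StringIO(SAMPLE)))
text = json.dumps(tree, indent=1)
print(text)
```

Because links are resolved in a second pass, a child row may appear before its parent in the file.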


--
Cheers,
Nikhil
+91-966-583-1250
Pune, India
Self-designed learner at Swaraj University <http://www.swarajuniversity.org>
http://www.nikhilsheth.tk

electrical5.csv

Nikhil VJ

Jan 3, 2015, 8:29:07 AM
to d3...@googlegroups.com
Whoops, scratch the last attachment too! I'd left some bad entries in. Here's the working version.. confirmed since I was able to make it work with the DataStructures.Tree visualization.



electrical5.csv

Nikhil VJ

Jan 4, 2015, 4:04:49 AM
to d3...@googlegroups.com
Hi.. just a quick heads-up.. I figured out how to insert the CSV
handling code into other d3.js visualizations. We have to replace just
the json-loading line with about 11 other lines.

I'm now able to draw a sunburst using simple self-referencing CSV as input!
Check out the detailed explanation on
http://stackoverflow.com/questions/27576807/d3-js-zoomable-sunburst-visualization-from-self-referencing-csv-input
(yep, answered my own question)
I've also included instructions on how one can sneak out the
internally created JSON. So now an HTML file can make your JSON for
you. It's a bit messy (you have to modify the .js file to match the
columns in your csv.. it's not automatic), so would love to see some
programming genius going into that.

-Nikhil