How do you create a subset of the data?

Stuart Sharples

Oct 21, 2011, 5:59:29 AM
to d3...@googlegroups.com
Hi all,

I primarily work with R, and D3 is my first experience of Java, so while I accept things are done differently, I'm still trying to shoehorn my existing concepts of R data manipulation into Java.

So I've been struggling with a couple of problems for the last day or so, but first here's some data:

var data = [{author:"Roald Dahl",title:"The BFG",year:1982},
            {author:"Roald Dahl",title:"Matilda",year:1988},
            {author:"Roald Dahl",title:"George's Marvellous Medicine",year:1981},
            {author:"J K Rowling",title:"Harry Potter and the Philosopher's Stone",year:1997},
            {author:"J K Rowling",title:"Harry Potter and the Chamber of Secrets",year:1998},
            {author:"J K Rowling",title:"Harry Potter and the Prisoner of Azkaban",year:1999},
            {author:"J K Rowling",title:"Harry Potter and the Goblet of Fire",year:2000}];

Now for the problems:
1) How would you get an object that is a subset of the data?
  a) All rows for which the author is "Roald Dahl"?
  b) All rows for which the title contains "Harry Potter" and the year is greater than 1998?

2) How would you get an array of unique values such as authors?

In both cases, what I'm aiming for is something that can then be used in, say, the following:

      chart.selectAll("line")
          .data(myData)

but with 'myData' then being either the data subset itself or a function that returns the subset from (1) or (2).

Cheers

Stu

Master Bold

Oct 21, 2011, 6:26:22 AM
to d3...@googlegroups.com
Hello!

D3 is a JavaScript library, not a Java library.
Take a look at the API Reference on the Working with Arrays section.
You have different operators like cross or nest that will help you achieve this.
Also search on the forum all the posts about cross and nesting.
If you understand SQL you should be able to grasp these concepts.
This should help you achieve what you want.

Good luck!

wimdows

Oct 21, 2011, 7:39:04 AM
to d3-js
You are all but asking us to explain at least one (big) chapter of the
manual, Stu.

For selecting data you can use .filter and combine it with comparisons
like some_variable == some_value, or != if that is preferable.
All sorts of operators are available of course: search for JavaScript
operators online, if need be. You can also combine several conditions
with a double ampersand (&&, logical AND).

An example:

    dataFiltered = dataloaded.filter(function(d) { return d.someThing == currentThing; });

You can do your selecting just about anywhere in the flow of your
graph: at startup, when called in a function, or in the line that
binds the data, etcetera.

I don't know how to extract unique values. (As far as I know
protovis.uniq never got reincarnated as d3.uniq, so you might have to
find a Javascript or other way of doing that.)
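
As a rough sketch of the .filter approach applied to the book data from the first post: the snippet below uses only the standard array.filter method, so it runs without D3. The variable names dahl and laterPotter are made up for illustration.

```javascript
// Book data from the first post in this thread.
var data = [
  {author: "Roald Dahl", title: "The BFG", year: 1982},
  {author: "Roald Dahl", title: "Matilda", year: 1988},
  {author: "Roald Dahl", title: "George's Marvellous Medicine", year: 1981},
  {author: "J K Rowling", title: "Harry Potter and the Philosopher's Stone", year: 1997},
  {author: "J K Rowling", title: "Harry Potter and the Chamber of Secrets", year: 1998},
  {author: "J K Rowling", title: "Harry Potter and the Prisoner of Azkaban", year: 1999},
  {author: "J K Rowling", title: "Harry Potter and the Goblet of Fire", year: 2000}
];

// (1a) All rows for which the author is "Roald Dahl".
var dahl = data.filter(function(d) {
  return d.author === "Roald Dahl";
});

// (1b) All rows whose title contains "Harry Potter" AND whose year is > 1998.
var laterPotter = data.filter(function(d) {
  return d.title.indexOf("Harry Potter") !== -1 && d.year > 1998;
});

console.log(dahl.length);        // 3
console.log(laterPotter.length); // 2
```

Either result is an ordinary array, so it should be usable wherever the original data was, for example chart.selectAll("line").data(dahl).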


Stu Sharples

Oct 21, 2011, 8:00:26 AM
to d3...@googlegroups.com
Thank you both for your quick and useful(!) responses.

Yeah, I appreciate my questions are fairly fundamental and broad.

I'll look up cross and nesting, and I've made a note about .filter, thanks.

In terms of finding unique values within a 'variable' such as author I've rolled my own function:

    function uniqueAuthor(data) {
      var unique = [];

      // Declare i with var so it does not leak into the global scope.
      for (var i = 0; i < data.length; i++) {
        var authorName = data[i].author;
        var inArray = jQuery.inArray(authorName, unique); // returns -1 if not found

        if (inArray === -1) {
          unique.push(authorName);
        }
      }

      return unique;
    }

This then returns an array with each author mentioned once.

Mike Bostock

Oct 21, 2011, 10:00:22 AM
to d3...@googlegroups.com
A much faster version of your uniqueAuthor function uses a map
(constant-time lookup) rather than jQuery.inArray (scans each element
in the array).

function authors(data) {
  var map = {}, i = -1, n = data.length;
  while (++i < n) map[data[i].author] = 1;
  return d3.keys(map);
}

You can also use d3.nest to do this:

var authors = d3.keys(d3.nest()
    .key(function(d) { return d.author; })
    .map(data));

Of course if you want to keep the data associated with each author
then you'd remove the d3.keys and leave it as a map or entries array.
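
To give a feel for what "leave it as a map" means, here is a plain-JavaScript stand-in that groups rows by author. It assumes the nest map is an object keyed by author, with an array of matching rows under each key; groupByAuthor is a hypothetical helper written for this sketch, not a D3 function.

```javascript
// Hypothetical helper (not part of D3): group rows into an object keyed by author.
function groupByAuthor(rows) {
  var groups = {};
  rows.forEach(function(d) {
    // Create the bucket the first time an author is seen.
    if (!Object.prototype.hasOwnProperty.call(groups, d.author)) {
      groups[d.author] = [];
    }
    groups[d.author].push(d);
  });
  return groups;
}

// A shortened version of the book data from the first post.
var data = [
  {author: "Roald Dahl", title: "The BFG", year: 1982},
  {author: "Roald Dahl", title: "Matilda", year: 1988},
  {author: "J K Rowling", title: "Harry Potter and the Goblet of Fire", year: 2000}
];

var byAuthor = groupByAuthor(data);

console.log(Object.keys(byAuthor));           // [ 'Roald Dahl', 'J K Rowling' ]
console.log(byAuthor["Roald Dahl"].length);   // 2
```

The keys give the unique authors, and the rows for each author stay available under that key.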

Also, you can use array.indexOf rather than jQuery.inArray; this is
standard JavaScript and eliminates a jQuery dependency.

Mike
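
For reference, a minimal sketch of the array.indexOf suggestion above, i.e. the original loop with the jQuery call swapped out. The function name uniqueAuthors and the sample rows are chosen here for illustration only.

```javascript
// Same idea as the jQuery.inArray version, using the standard array.indexOf.
function uniqueAuthors(rows) {
  var unique = [];
  for (var i = 0; i < rows.length; i++) {
    var name = rows[i].author;
    // indexOf returns -1 when the value is not in the array yet.
    if (unique.indexOf(name) === -1) {
      unique.push(name);
    }
  }
  return unique;
}

var sample = [
  {author: "Roald Dahl", title: "The BFG"},
  {author: "J K Rowling", title: "Harry Potter and the Goblet of Fire"},
  {author: "Roald Dahl", title: "Matilda"}
];

console.log(uniqueAuthors(sample)); // [ 'Roald Dahl', 'J K Rowling' ]
```

This only removes the jQuery dependency. indexOf still scans the array on every lookup, so the map-based version is the one to prefer for large inputs.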

Tim Jestico

Mar 10, 2014, 9:49:29 PM
to d3...@googlegroups.com, mbos...@cs.stanford.edu
Hi,

I'm very new to D3 (and JS...) and have been trying to get my head around how I can use this d3.nest method to remove duplicates from a multi-dimensional array, as opposed to just creating a unique array of keys.

An extract of my data is as follows

var data = [{name:"Jeff",color:"#EDF1DE"},
{name:"Sue",color:"#E4DFEC"},
{name:"John",color:"#E4DFEC"},
{name:"Jeff",color:"#EDF1DE"},
{name:"Sam",color:"#EBF1DE"},
{name:"Sue",color:"#E4DFEC"}];

I would like to remove any name duplicates but keep the corresponding colors, i.e.

var data = [{name:"Jeff",color:"#EDF1DE"},
{name:"Sue",color:"#E4DFEC"},
{name:"John",color:"#E4DFEC"},
{name:"Sam",color:"#EBF1DE"}];

I've tried looking through examples of the .map and .entries methods but haven't had any luck making them work.

Thanks,

Tim

Mike Bostock

Mar 11, 2014, 12:00:27 PM
to Tim Jestico, d3...@googlegroups.com
1. Define a nest operator that uses the name as a key:

var nestByName = d3.nest()
    .key(function(d) { return d.name; });

2. Apply this nest operator to your data, returning a list of entries. Each entry has a key property (the name as returned by the key function) and a values array (the matching data elements):

var nameEntries = nestByName.entries(data);

3. Convert this array to just the first value for each name using array.map:

var uniqueData = nameEntries.map(function(entry) { return entry.values[0]; });

Alternatively, as a single statement:

var uniqueData = d3.nest()
    .key(function(d) { return d.name; })
    .entries(data)
    .map(function(entry) { return entry.values[0]; });

Mike
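
For anyone who wants the same result without loading D3, the steps above can be approximated in plain JavaScript by keeping the first row seen for each name. This is only a sketch: uniqueBy is a hypothetical helper, not a D3 function, and the order of its output is not guaranteed to match what d3.nest produces.

```javascript
// Hypothetical helper (not part of D3): keep the first row for each key.
function uniqueBy(rows, keyOf) {
  var seen = {};
  var result = [];
  rows.forEach(function(d) {
    var key = keyOf(d);
    // Only the first row with a given key is kept; later duplicates are skipped.
    if (!Object.prototype.hasOwnProperty.call(seen, key)) {
      seen[key] = true;
      result.push(d);
    }
  });
  return result;
}

// Data extract from the question above.
var data = [
  {name: "Jeff", color: "#EDF1DE"},
  {name: "Sue", color: "#E4DFEC"},
  {name: "John", color: "#E4DFEC"},
  {name: "Jeff", color: "#EDF1DE"},
  {name: "Sam", color: "#EBF1DE"},
  {name: "Sue", color: "#E4DFEC"}
];

var uniqueData = uniqueBy(data, function(d) { return d.name; });

console.log(uniqueData.map(function(d) { return d.name; }));
// [ 'Jeff', 'Sue', 'John', 'Sam' ]
```

Each surviving row is the original object, so its color comes along unchanged.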

Tim Jestico

Mar 12, 2014, 6:25:29 PM
to d3...@googlegroups.com, Tim Jestico, mi...@ocks.org
Awesome, thanks Mike, it works perfectly.