Need help with dc, crossfilter and reductio visualization

Margaret Greaney

Apr 27, 2016, 1:16:23 PM
to dc-js user group

Hello,


I am learning about the dc, crossfilter and reductio libraries and am attempting to make an intermediate kind of example, basing the design on one from peacecorps.gov. This set of three examples will be further documented and shared.


I have had help from Ethan Jewett, who helped me get an example working with reductio; his example showed that I was missing two countries in the JSON. He said I should move this question to this Google group and mentioned a fiddle to start from. The problem is that even with that example I exceeded the quota on jsfiddle.


I am stuck and would appreciate any help getting past the next problem.  My attempts to get a working fiddle on jsfiddle failed because I keep exceeding the quota there, even with smaller data sets. (Here is where I tried last,  https://jsfiddle.net/wheatgrass/qdnbogs0/10/)


I have three examples on GitHub as gists. Examples two and three almost work, but none of them shows all the countries, just some of them. I don't know whether I am filtering incorrectly or whether it is a data problem.

The data is a subset and "flattened".


I am interested mostly in discovering why the complete set of countries does not display after filtering n2dim. Just some of them show up. I have other things to fix too, but mainly I want to find out whether it is the data or the filtering, or both, that need fixing.


I appreciate any help or suggestions.


https://gist.github.com/greaneym/51c08fbdf2b61c5c5645f84e625728de


http://bl.ocks.org/greaneym/raw/2a600873150a8e3f4bede48356579ad3/


thank you,


Margaret

Ethan Jewett

Apr 27, 2016, 1:45:04 PM
to dc-js-us...@googlegroups.com
Hi Margaret,

Thanks for moving to the list. This is easier than comment threads on SO.

So, in your JSFiddle, the issue is that the external dependencies are loaded in the wrong order. This is a huge pain to get right and is hard to identify when it happens. I’ve fixed that in the example link below.

After fixing the external dependencies, the only error aside from a failed image load is: “ReferenceError: Can’t find variable: theCountries”

This happens down on line 623: var n1 = crossfilter(theCountries);

That’s been changed to ‘worldData’ here: https://jsfiddle.net/esjewett/qdnbogs0/12/

It’s still a bit broken somehow, but maybe you are in a position to fix it now. It’s also quite likely that others will be able to help you based on the Gists/blocks examples.

Cheers,
Ethan

--
You received this message because you are subscribed to the Google Groups "dc-js user group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dc-js-user-gro...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dc-js-user-group/8452e38c-819c-4521-9362-f3c01202bbf0%40googlegroups.com.

Margaret Greaney

Apr 27, 2016, 3:40:31 PM
to dc-js user group
thanks, Ethan,

I wish I could figure out how to put the external resources in a particular order, but thanks for starting another fiddle.
I put some changes on a fork of this here,

and am unable to add the icon either as a pasted blob in the fiddle or as an external resource. Also, I added in bootstrap.css but that did not fix my menu selector. But I think that whoever is helping can see my stumbling block now. When you mouse over the bars below the map, some but not all the countries show up, so not all the countries are being selected. Is that because my data is bad, or because I am not making the filters correctly? The same thing happens on the gist: when a country is selected via the menu, only some of the countries show up after the selection.
If I can get past that problem, I think I can fix the rest.

Ethan Jewett

Apr 27, 2016, 4:29:27 PM
to dc-js-us...@googlegroups.com
You have to remove external dependencies and re-add them in the correct order to get it working. It’s terrible. Don’t worry about it.

I think you are not filtering correctly, but I’m not sure. When you say this, for example, in the event handler for a bar…

.on('mouseover.chart', function(d) {
    //reset_filter();
    var thestateid = d.data.key;
    var dname = lrgColDimension.filter(d.data.key);
    var location = dname.top(Infinity)[0].nameloc;
    namesDimension.filterExact(location);

You’ve just grabbed a single college (dname is the same dimension object as lrgColDimension if I remember correctly, so dname.top(Infinity)[0].nameloc is just the nameloc property of the top college in the current filter) and then filtered the namesDimension.
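
To make the point concrete, here is a minimal plain-JavaScript sketch that uses ordinary arrays in place of Crossfilter dimensions. The rows, the college names and the `stateid` field are made up for illustration (only `nameloc` appears in the handler above), and the `matching` helper is a hypothetical stand-in for the filtered dimension. The idea is only that `[0]` keeps one record even when several match.

```javascript
// Hypothetical rows: two colleges share the same (made-up) state id.
const rows = [
  { stateid: "WA", nameloc: "College A" },
  { stateid: "WA", nameloc: "College B" },
  { stateid: "OR", nameloc: "College C" }
];

// Stand-in for filtering a dimension and reading back every matching row.
function matching(rows, key) {
  return rows.filter(function (r) { return r.stateid === key; });
}

const hits = matching(rows, "WA");

// What the handler does: keep only the first matching record.
const firstOnly = hits[0].nameloc;

// What is probably wanted: the names of all matching records.
const allNames = hits.map(function (r) { return r.nameloc; });

console.log(firstOnly); // College A
console.log(allNames);  // [ 'College A', 'College B' ]
```

In Crossfilter itself, `dimension.filterFunction` accepts a predicate, so testing membership in a list like `allNames` may be closer to the intent than `filterExact` on a single name.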

I’d suggest a few things:
  1. Managing all of this interaction between different charts with different Crossfilters is a bit of a nightmare. Can you figure out a way to combine your data into a single set of data that you can put into a single Crossfilter? This will allow you to really use Crossfilter filtering and just have the map directly display your data and maybe avoid having to manage filtering across different Crossfilters.
  2. For the moment, after doing #1, it would make sense to get rid of the renderlets and focus on getting the standard click-to-filter functionality working to filter in the way you want. Once you have that working, it’s fairly straightforward to translate that into hover-to-filter based on events or a renderlet if necessary.
  3. I think it would be helpful if you trim the example down to the simplest version of what you are trying to do. Just a map, a single bar chart, and maybe a data table.
  4. You can also remove your data and geoJSON from your example Javascript file and use rawgit.com to serve them based on the versions in Github/Gists. Put the rawgit.com URLs into the external resources for your JSFiddle.
Sorry for all the suggestions that are just steps along the path. What you have looks really cool, but the underlying structure has probably gotten to its breaking point, so it’s likely time to back up and address that issue.

Ethan


Margaret Greaney

Apr 27, 2016, 5:00:07 PM
to dc-js user group
Ethan, 

I appreciate your help. I did try with one data set but wasn't able to get that working. But I think that does make the most sense and will try that again first.

What I was trying to do is make some smaller data sets, first with the college divisions, small, medium and large, then with a data set that had the state ids with the countries, then the original set that was then flattened. So, I will try again with just one set.   Also I have been doing what you suggested all along, working with a small group of chart, table and map and then combining them when I got some results.

I do have a question about using one data set. Does it have to be sorted beforehand in any way?

thank you.



Ethan Jewett

Apr 28, 2016, 5:39:04 PM
to dc-js-us...@googlegroups.com
Hi Margaret,

Ok, great to hear you’ve already been working that way. It just looked complicated to me because I only saw the end result, I suspect. The example you put together is ingenious.

Regarding data sorting, no the data set doesn’t need to be sorted in any way before using it. What you usually want to do is identify all the dimensions of your data (College, Size, Country, Ranking (multiple if there are multiple types of rankings)) and then create one big table that includes all the information in all your tables. This does result in data duplication, where you’ll have the same number (such as a college ranking) in a bunch of rows (because the college will have a row in the table for each country). But that actually ends up making things work better for Crossfilter in the longer run.
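
As a rough sketch of what "one big table" could look like, the plain-JavaScript snippet below turns nested records into one row per college and country. The input shape (a `countries` array of `{key, value}` pairs), the `volunteers` field name and the `flatten` helper are assumptions for illustration, not the structure of the actual data set.

```javascript
// Hypothetical nested input: one object per college, with a list of countries.
const colleges = [
  {
    nameloc: "College A",
    size: "Large",
    rank: 1,
    countries: [
      { key: "Armenia", value: 1 },
      { key: "Peru", value: 3 }
    ]
  },
  {
    nameloc: "College B",
    size: "Small",
    rank: 2,
    countries: [{ key: "Peru", value: 2 }]
  }
];

// Produce one flat row per (college, country) pair.
// College-level fields such as rank are repeated on every row.
function flatten(colleges) {
  const rows = [];
  colleges.forEach(function (c) {
    c.countries.forEach(function (country) {
      rows.push({
        nameloc: c.nameloc,
        size: c.size,
        rank: c.rank,
        country: country.key,
        volunteers: country.value
      });
    });
  });
  return rows;
}

const rows = flatten(colleges);
console.log(rows.length); // 3
```

An array like `rows` is the kind of thing that would then be passed to `crossfilter(...)`, with dimensions defined on `nameloc`, `size`, `country` and so on.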

Cheers,
Ethan


Margaret Greaney

Apr 29, 2016, 6:20:24 PM
to dc-js user group
I am still very ignorant about filtering and such, but I wanted to ask what you meant about the table. I have been working on the one big data set and making the various dimensions and groups. Then you said to put them all into one table. In my example I am using dataTable. Is that where I should put everything? Right now I have id, rank, lat, lng and total volunteers. Specifically, do you mean I should put all the other variables into the dataTable as well? Thank you.



Margaret Greaney

Apr 29, 2016, 6:35:39 PM
to dc-js user group
By the way, thank you for your kind words, they provide encouragement!

Here is the first record in the data. Is it flattened enough?

var rankings = [
{
  "key":"Armenia", "value":1,
  "city":"Seattle",
  "key":"Mozambique", "value":5,
  "total":"72",
  "key":"Peru", "value":3,
  "key":"Cambodia", "value":1,
  "key":"Costa Rica", "value":3,
  "key":"Albania", "value":2,
  "key":"Liberia", "value":1,
  "key":"Macedonia", "value":4,
  "key":"Nepal", "value":1,
  "key":"Lesotho", "value":1,
  "key":"Vanuatu", "value":2,
  "key":"Rwanda", "value":2,
  "key":"Swaziland", "value":1,
  "key":"Zambia", "value":3,
  "key":"Guyana", "value":3,
  "key":"Belize", "value":2,
  "key":"China", "value":2,
  "key":"South Africa", "value":1,
  "id":"4854",
  "key":"Ethiopia", "value":3,
  "key":"United Republic of Tanzania", "value":4,
  "address":"1410 NE Campus Pkwy",
  "key":"Jamaica", "value":1,
  "rank":"1",
  "key":"Kyrgyzstan", "value":1,
  "key":"Moldova", "value":1,
  "key":"Madagascar", "value":1,
  "key":"Burkina Faso", "value":2,
  "lat":"47.6062095",
  "key":"Cameroon", "value":1,
  "zip":"98195",
  "key":"Botswana", "value":1,
  "nameloc":"University of Washington",
  "key":"Malawi", "value":2,
  "size":"Large",
  "key":"Panama", "value":2,
  "key":"Mongolia", "value":2,
  "link":"http://www.peacecorps.gov/recruiters/?zipcode=98195&radius=20&utm_source=college%20rankings&utm_medium=data%20visualization&utm_campaign=hq%20college%20ranking%2016-2-19#zip-search-results",
  "nid":"0",
  "key":"Senegal", "value":3,
  "key":"Morocco", "value":2,
  "key":"Mexico", "value":1,
  "key":"Georgia", "value":2,
  "key":"Ghana", "value":1,
  "state":"WA",
  "key":"Gambia", "value":1,
  "key":"Guatemala", "value":1,
  "order":"1",
  "key":"Nicaragua", "value":1,
  "lng":"-122.3320708"
},


thank you
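
One thing worth checking in a record shaped like the one above: a JavaScript object (including one produced by `JSON.parse`) keeps only one value per property name, so repeated "key" and "value" entries collapse to the last pair. The short check below uses a trimmed copy of the record. The `countries` array in the second half is just one possible way to keep every pair, not the layout of the real data.

```javascript
// Trimmed copy of the record above, as JSON text with repeated names.
const text = '{"key":"Armenia","value":1,"city":"Seattle",' +
             '"key":"Mozambique","value":5,"key":"Peru","value":3}';

const record = JSON.parse(text);

// Only the last "key"/"value" pair survives parsing.
console.log(record.key);                 // Peru
console.log(Object.keys(record).length); // 3 (key, value, city)

// One possible alternative: keep the pairs in an array so none are lost.
const nested = {
  city: "Seattle",
  countries: [
    { key: "Armenia", value: 1 },
    { key: "Mozambique", value: 5 },
    { key: "Peru", value: 3 }
  ]
};
console.log(nested.countries.length);    // 3
```

If the real file has the same repetition, each college would carry only its last listed country, which could be one reason some countries never appear.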



Margaret Greaney

Apr 30, 2016, 12:18:20 PM
to dc-js user group

I found this comment in SO

http://stackoverflow.com/questions/24225364/apply-filter-from-one-crossfilter-dataset-to-another-crossfilter


and I think that is the method you are telling me to try, which is to change the data that goes into crossfilter(data). I will proceed with that method, although I do see that other people are using multiple crossfilter data sets.




Ethan Jewett

May 5, 2016, 12:18:20 AM
to dc-js-us...@googlegroups.com
Yes, that’s what I meant, but was not clear about it. There are different ways to approach the problem, but what you’ll usually want to do is decide what the basic object of analysis is. In your case it sounds like you are interested in the population of a university from countries other than where the university is? That population will have attributes like what university it belongs to, what country it is from, and rankings and various demographic information about that country. So the data you’d pass into Crossfilter would be an array of objects describing those populations. Something like:

crossfilter([{
  university: "University of Washington",
  country: "Canada",
  num_students: 100,
  ranking: 2,
  …
}, {
  university: "University of Washington",
  country: "Mexico",
  num_students: 100,
  ranking: 2,
}, {
  …
}
])

Notice how countries and universities repeat. That’s OK. The important thing is that you have an array of objects, each of which describes attributes of one “thing” - in this case a population of students from a particular country.

I’m sure I’ve misunderstood the use case you have in mind with my example above, but hopefully that explains the idea of combining your data sets a little more.

Ethan


Margaret Greaney

May 6, 2016, 3:19:56 PM
to dc-js user group
Thanks Ethan,

I was able to get the visualization working with two crossfilter data sets, but it still has some problems.
It was working earlier, but I didn't realize that my world json was missing some countries and that is one reason why some countries did not show up.

The goal was just to try to reproduce the highcharts example of Peace Corps University rankings, but use dc.js and crossfilter.js so I could learn it better. I did get this to work and want to thank you again for the reductio example and help.
There are two things I would still ask for help with, 

1. How do you use the color accessor to color the countries in with navy? I tried this but the error says it can't see the d.exceptionSum.value,
2. The x-axes didn't work at all. I tried various things but they didn't work.  If you or anyone could make a suggestion on how to do this, that would be great. 

Also, I want to learn how to make the charts from just one crossfilter data set, but so far was not able to do this. When I use one set, I wasn't able to figure out how to separate the three categories of small, med and large sized colleges, but I will keep trying. That is I could use the filter to separate them, but the charts didn't work. It gave me one chart instead of three.

Here is the working demo:
http://bl.ocks.org/greaneym/raw/51c08fbdf2b61c5c5645f84e625728de/


Ethan Jewett

May 9, 2016, 11:42:30 AM
to dc-js-us...@googlegroups.com
Hi Margaret,

Comments inline

On Fri, May 6, 2016 at 2:19 PM, Margaret Greaney <grea...@gmail.com> wrote:
1. How do you use the color accessor to color the countries in with navy? I tried this but the error says it can't see the d.exceptionSum.value,

That’s what I would recommend. I’d be surprised if the valueAccessor works when the same function with the colorAccessor doesn’t work. If you can make an example with it failing, you can watch in the debugger or just add a console statement in the colorAccessor function to try to see what is happening. Keep in mind that the way you are setting up your color scales, many of your values will be outside the range, so you’ll want your colorAccessor to actually return either 0 or 1 depending on the value of the exceptionSum. If the error you are encountering is missing values, you may be able to just use .colorAccessor(function(d) { return d.value && d.value.exceptionCount ? 1 : 0; }). This should first check for existence, then check if the count is actually non-zero. If the count doesn’t exist or is 0, it will return 0. If the count does exist and is non-zero it will return 1. Might work.
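
A minimal sketch of that accessor as a standalone function, so it can be tried outside the chart. The sample group entries are invented for illustration; `exceptionCount` is the property name used in the snippet above, and the real groups may expose a different one.

```javascript
// Returns 1 when the group entry has a non-zero exceptionCount, otherwise 0.
// Guards against entries whose value is missing altogether.
function colorAccessor(d) {
  return d.value && d.value.exceptionCount ? 1 : 0;
}

// Invented sample entries in the {key, value} shape that groups return.
const samples = [
  { key: "Peru", value: { exceptionCount: 3 } },
  { key: "Chile", value: { exceptionCount: 0 } },
  { key: "Fiji" } // no value at all
];

const colors = samples.map(colorAccessor);
console.log(colors); // [ 1, 0, 0 ]
```

On the chart this function would be handed to `.colorAccessor(...)`, paired with a color scale that has exactly two entries for 0 and 1.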
 
2. The x-axes didn't work at all. I tried various things but they didn't work.  If you or anyone could make a suggestion on how to do this, that would be great.

It looks like your x-axis is ordinal (college names?). I’ve never been able to get this to work, but I hear it is possible. There is an example here that looks like it is working, but it also looks pretty similar to what you are doing: https://github.com/dc-js/dc.js/blob/master/web/examples/ord.html
 
Also, I want to learn how to make the charts from just one crossfilter data set, but so far was not able to do this. When I use one set, I wasn't able to figure out how to separate the three categories of small, med and large sized colleges, but I will keep trying. That is I could use the filter to separate them, but the charts didn't work. It gave me one chart instead of three.

Yes, that’s definitely a problem. Merging to a single table causes its own headaches, but they are the kind of headaches that Crossfilter is good at dealing with. Using Reductio, you’d want to merge them all into one dataset with a “size” property that would have values ‘small’, ‘medium’, ‘large’ or something similar. Then you’d use a “filter” to create groups that only aggregate records with the correct size category, then use these filtered groups to build your tables. Docs on the Reductio filter here: https://github.com/crossfilter/reductio#aggregations-standard-aggregations-reductio-b-filter-b-i-filterfn-i- (You can also do this using custom reducers.)
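
To show the idea of filtered groups without depending on Reductio, here is a plain-JavaScript imitation: it sums a value per key, but only over rows whose `size` matches. The rows, the `volunteers` field and the `sumBySize` helper are hypothetical; Reductio's own filter is described at the link above.

```javascript
// Hypothetical flat rows: one per (college, country) pair.
const rows = [
  { nameloc: "College A", size: "large", country: "Peru", volunteers: 3 },
  { nameloc: "College A", size: "large", country: "Armenia", volunteers: 1 },
  { nameloc: "College B", size: "small", country: "Peru", volunteers: 2 },
  { nameloc: "College C", size: "medium", country: "Peru", volunteers: 4 }
];

// Sum volunteers per college, counting only rows of the given size.
// Returns an array of {key, value} objects, similar in shape to group output.
function sumBySize(rows, size) {
  const totals = {};
  rows.forEach(function (r) {
    if (r.size !== size) { return; } // skip rows outside this category
    totals[r.nameloc] = (totals[r.nameloc] || 0) + r.volunteers;
  });
  return Object.keys(totals).map(function (k) {
    return { key: k, value: totals[k] };
  });
}

// One result per chart: small, medium and large.
const small = sumBySize(rows, "small");
const medium = sumBySize(rows, "medium");
const large = sumBySize(rows, "large");

console.log(large); // [ { key: 'College A', value: 4 } ]
```

A real Crossfilter group would still list every key, with a zero total for colleges outside the category; this sketch simply omits them, so a chart built on a real filtered group may need its empty bars hidden.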

Ethan

Margaret Greaney

May 9, 2016, 6:58:15 PM
to dc-js user group
Ethan,

Thank you very much for your comments. I will review carefully and try your suggestions. Very helpful!

Margaret

Margaret Greaney

May 11, 2016, 8:08:16 AM
to dc-js user group
Thanks again. I was able to get things working on this example. I appreciate your help.

