DC.js multiple charts are not showing correct values.


Aydin Jalilov

Apr 29, 2020, 4:08:29 AM
to dc-js user group
I am developing a small app based on the Stack Overflow survey data for 2019. I am trying to show the top 10 languages per Dev. Type and the respective salary ranges. Dev. Type is a pie chart. I use a row chart for the top 10 languages and another row chart for the salary ranges. Additionally, I have a box plot that shows the relation between compensation and job satisfaction. I have included screenshots of the charts and will try to explain the issue I am facing. Sorry for the quality of the pictures: the Windows Snipping Tool does not capture the additional on-screen elements when I take a screenshot of the charts, so I took pictures of them with my phone camera instead.

For the record, respondents can have multiple DevTypes and LanguageWorkedWith as shown in single_document.png. I split them inside my js code and group them using reduceAdd and reduceRemove functions. 
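
A minimal sketch of the split-and-count idea described above, in plain JavaScript without crossfilter. The sample rows, the tag labels, and the names `reduceAdd`, `reduceRemove`, and `tally` are made up for illustration; they only mirror the shape of custom reducers and are not the code from the fiddle.

```javascript
// Hypothetical sample rows: each respondent stores several values in one
// semicolon-separated string, as in the survey export.
const rows = [
  { DevType: "full-stack;back-end" },
  { DevType: "full-stack" },
  { DevType: "data scientist;back-end" },
];

// Add one respondent: increment the count of every tag in the field.
function reduceAdd(counts, row) {
  for (const tag of row.DevType.split(";")) {
    counts[tag] = (counts[tag] || 0) + 1;
  }
  return counts;
}

// Remove one respondent (for example when a filter excludes them):
// undo exactly what reduceAdd did for that row.
function reduceRemove(counts, row) {
  for (const tag of row.DevType.split(";")) {
    counts[tag] -= 1;
  }
  return counts;
}

// Count every tag over the whole sample.
const tally = rows.reduce(reduceAdd, {});
```

The two reducers have to stay exact inverses of each other; otherwise the counts drift as soon as filters start adding and removing rows.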

The original data has 80K+ data points, and after some manipulation and cleaning I ended up with 50K+ data points. For test purposes I worked with a limited dataset of only 1000 points. I should mention that the data has no null values.
When the data is first loaded, everything looks fine. The functions count the occurrences of DevType, LanguageWorkedWith and ConvertedComp (salary in USD) correctly. pie_fullstack.jpg shows that full-stack developer occurred 534 times, meaning 534 of the 1000 respondents listed full-stack among their Dev. Types. After I click full-stack on the pie chart, the other charts adjust accordingly, but they show wrong counts. sixtyk_js.jpg shows that the top salary range for full-stack developers is 60k+, and when I hover the mouse over that bar it shows that it is based on 29 people. If it worked properly, the counts of people across all salary ranges should sum to 534. When I sum the counts in each salary range, I get only 124.

Next I hovered the mouse over JS in the top 10 languages chart. JS has 109 occurrences (top_10_JS.jpg). It is strange that only 109 out of 534 full-stack developers know JS. I summed the number of occurrences of each language and got 599. Given that respondents can choose multiple languages, and full-stack developers should be proficient in several, a total of 599 is only possible if at most 65 of the 534 developers know more than one language, which is highly unlikely.
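
The consistency argument above can be written out as a small check. This is only a sketch using the figures quoted in this post; it assumes the salary chart shows every range and that every respondent listed at least one language, and the variable names are made up for illustration.

```javascript
// Figures quoted in the post after selecting full-stack on the pie chart.
const selected = 534;    // respondents tagged full-stack
const salarySum = 124;   // sum of the counts over the salary-range bars
const languageSum = 599; // sum of the counts over the language bars

// Each respondent has exactly one salary, so the salary bars should add
// up to the number of selected respondents.
const salaryConsistent = salarySum === selected;

// Assumption: every respondent lists at least one language, so anything
// above `selected` comes from respondents who listed more than one.
const extraSelections = languageSum - selected;

console.log({ salaryConsistent, extraSelections });
```

A salary total far below the number of selected respondents, together with so few extra language selections, points at the filter rather than at the data.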

I then clicked JS, and the rest of the charts were recalculated and redrawn. When I checked the salary chart, there was only one salary range, and it was based on only one person (ninetyk_clicked.jpg). I expected it to total 109 (even though 109 is not correct either).
As I mentioned above, when the data is first loaded and no chart has been clicked, all the values seem OK. Once I start clicking, the charts behave strangely, and I do not know how to fix this problem.

Hope someone has patience to read through this and help me:)

Thank you.

Here is the link to fiddle: https://jsfiddle.net/9f8ax3vr/
ninetyk_clicked.jpg
pie_fullstack.jpg
single_document.PNG
sixtyk_js.jpg
top_10_JS.jpg

Gordon Woodhull

Apr 29, 2020, 6:46:56 PM
to dc.js user group
Hi Aydin,

I sort of understand what you're saying - lots of tag/array-key dimensions, I take it. 

But I'll have to run it in order to understand the problems you're looking at.

I forked your fiddle and got the dependencies working: https://jsfiddle.net/gordonwoodhull/4b9hcx50/4/

I don't have the data but I've added my favorite way of embedding data in a fiddle.

Please look for
   <pre id="data"></pre>

in the HTML section and paste a sample of the JSON and see if the fiddle runs. 

(Or if the data is available directly by HTTP, that could work too.)

Thanks,
Gordon



--
You received this message because you are subscribed to the Google Groups "dc-js user group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dc-js-user-gro...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dc-js-user-group/a4c4cdb7-f9d3-4af6-a765-032d1ff4c03c%40googlegroups.com.

Gordon Woodhull

Apr 29, 2020, 6:47:37 PM
to dc.js user group
Sorry, a few revisions while I was writing this, now https://jsfiddle.net/gordonwoodhull/4b9hcx50/9/

Aydin Jalilov

Apr 29, 2020, 8:15:31 PM
to dc-js user group
Hey Gordon, thanks for the reply. I just created a JSON file that contains some of the data you may need to run this script. Since the original data has 50K+ records, I limited the data to 2000 records and put it in the HTML file.
Thanks again for your support.

Aydin Jalilov

Apr 29, 2020, 8:26:04 PM
to dc-js user group
I think I messed up with jsfiddle. I inserted the data, closed the fiddle, and when I reopened it the data was not there anymore. I was then able to save the data, but now the fiddle shows as belonging to me. So here is a new link to the latest script: https://jsfiddle.net/aydinjalil/wfre8tj4/



Gordon Woodhull

Apr 30, 2020, 7:21:41 AM
to dc.js user group
I got your fiddle working in this version:


It looks like you are using an ancient and very clunky way to deal with tag dimensions. I guess old code never dies.

If you are using tag dimensions, please use the feature that is available in version 1.4+:

    var dev_dim = ndx.dimension(function(d) {return d.DevType.split(";");}, true);

I think you had a mixture of the two, because it wouldn't make sense to return an array from the dimension key function without this feature.

With the developer and language charts changed this way, hopefully they will make more sense. Unfortunately they don't look as good because there are a lot of Others.


Also, it doesn't make sense to use a tag dimension with a pie chart, because the slices are going to add up to much more than 100%.
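
To see why the slices overshoot, here is a minimal plain-JavaScript sketch that counts tags the way a tag dimension groups them, once per array element. The sample rows and the names `counts`, `sliceTotal`, and `shareOfRows` are hypothetical, and no crossfilter or dc.js calls are involved.

```javascript
// Hypothetical rows: three respondents, two of them with two roles each.
const rows = [
  { DevType: "full-stack;back-end" },
  { DevType: "full-stack;front-end" },
  { DevType: "back-end" },
];

// Count each tag once per respondent that lists it.
const counts = new Map();
for (const row of rows) {
  for (const tag of row.DevType.split(";")) {
    counts.set(tag, (counts.get(tag) || 0) + 1);
  }
}

// Total of all slice values versus the number of respondents.
const sliceTotal = [...counts.values()].reduce((a, b) => a + b, 0);

// Sum of the slices as a fraction of the respondents; above 1 means the
// pie represents more than 100% of the people.
const shareOfRows = sliceTotal / rows.length;
```

With three respondents the slices total five, so a pie drawn from these counts describes role mentions rather than people.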

I'm afraid this is all the time I have to help you with this. Hopefully you can experiment some more and understand what is going on.



Aydin Jalilov

May 1, 2020, 12:00:23 AM
to dc-js user group

Hi Gordon, 

Thanks again for taking the time to answer my question. crossfilter_dimension.group() won't deliver the desired result for me, since it looks for the specific combinations of dev types and languages that developers entered in their responses. In the solution I now see that the top language is HTML/CSS,JavaScript,PHP,SQL (an array of languages) when I try to break it down to single languages. It appears that in the given dataset the combination HTML/CSS,JavaScript,PHP,SQL appears the most. My aim, on the other hand, is to see which language is the top language overall, and to be able to do that when I click a certain Dev. Type on the pie chart.
My knowledge of JS is very basic; I am a Python developer. Python visualization tools have their limitations, and I thought it would be better to explore JS libraries that are capable of accomplishing this task.
I do appreciate your support in this group and will keep trying to solve this issue.

Thank you,
Aydin.

Aydin Jalilov

May 1, 2020, 12:45:22 AM
to dc-js user group
Gordon, I want to add that you are awesome. I fixed my problem by simply changing crossfilter in my library to 1.4.
Thank you tons.

Gordon Woodhull

May 1, 2020, 12:51:35 AM
to dc.js user group
Ha, that's great, I knew that was the right solution but my fiddle wasn't working because it imported crossfilter twice, here it is fixed: 


Here is the documentation for the "tag dimension" or "dimension with array keys" feature in crossfilter:


Thanks for following up! 

I think since your original dimension didn't have this feature, you were drawing the data correctly but filtering would only select the exact item, that is people who specified they only have one role or language.
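
A rough plain-JavaScript emulation of the two filtering behaviours described above. It does not reproduce crossfilter's internals; the sample rows and the names `exactMatch` and `tagMatch` are made up for illustration.

```javascript
// Hypothetical rows: one respondent with a single role, one with two
// roles, and one without the role being selected.
const rows = [
  { id: 1, DevType: "full-stack" },
  { id: 2, DevType: "full-stack;back-end" },
  { id: 3, DevType: "back-end" },
];

const clicked = "full-stack";

// Without the tag feature, the clicked value is compared with the whole
// key, so only respondents whose entire answer is that one role match.
const exactMatch = rows.filter((r) => r.DevType === clicked);

// With tag semantics, a respondent matches if any of their roles equals
// the clicked value.
const tagMatch = rows.filter((r) => r.DevType.split(";").includes(clicked));

console.log(exactMatch.map((r) => r.id), tagMatch.map((r) => r.id));
```

The exact comparison keeps only respondent 1, while the tag comparison keeps respondents 1 and 2, which is consistent with the other charts shrinking to a handful of people after a click.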



Aydin Jalilov

May 1, 2020, 1:17:23 AM
to dc-js user group
Yep, that was exactly what the issue was. I would like to find out how crossfilter and grouping work in the source code, so I will explore the library a little bit. dc.js is lit.

Thanks again. 



Aydin Jalilov

May 1, 2020, 1:28:59 AM
to dc-js user group
Also, I have modified some dimensions a little:

    var country_dim = ndx.dimension(function(d) { return d.Country; });
    var jobSat_dim = ndx.dimension(function(d) { return d.JobSat; });

and did the same for the compensation dimension. The dimension call had true as its last argument, and that was messing everything up.

Btw after brushing up the code your boxplot appears nicer than mine on my machine. Why is that? 

Gordon Woodhull

May 1, 2020, 2:57:27 AM
to dc-js-us...@googlegroups.com
Do you have dc.css loading correctly? That’s the only thing I can think of that would cause different appearance (and not just breakage haha).

Yes, you will only want to use the second parameter to .dimension() when it’s really a tag dimension with a key function that returns an array.

