Scatterplot point size from data, area or radius?

Bob Monteverde

Mar 14, 2012, 12:58:08 PM
to d3...@googlegroups.com
As the title says, if I'm using the size of a point to represent a data value, does it make more sense to map the value to the radius or to the area of the dot?

While initially I was using the radius, the more I think about it, considering it's a two-dimensional visualization, maybe using the area to show the data will be more statistically accurate.  Gonna have to skim through some Tufte books to see what his 2 cents are (well, it's Tufte, so maybe his 98 cents vs my 2 cents).

Anyway, just wanting to hear what you guys use, or at least what you think is the more accurate representation.

Thanks,

Bob

Scott Murray

Mar 14, 2012, 1:04:14 PM
to d3...@googlegroups.com
Great question.

Unfortunately, people are not good at perceiving relative areas of circles. Even though I think using area is a more "honest" interpretation of data, it's not meaningful perceptually. Using radius is maybe less honest, but more visually distinguishable.

That said, circles are actually not great for visualizing relative values, because it's hard for our brains to make 2-dimensional comparisons. Better are lines or rectangles (1-dimensional comparisons, like bar charts).

Nonetheless, circles are common because they look cool and have center points, so they work well to identify points on maps. I love circles — I just wish our brains were better at reading data out of them. :-)

Scott

Bob Monteverde

Mar 14, 2012, 1:08:26 PM
to d3...@googlegroups.com
Wow, literally just had the same conversation with a coworker.  We both pretty much agreed that area would be more statistically accurate, but at the same time, to derive a value, it's a more complex equation: r vs. pi*r*r.

I'll have to see what Tufte says, but as much as his reasoning for a lot of what he teaches is logically sound (so much so, sometimes very hard to argue against), sometimes the answer is simply "What are people used to?"

Bob

Mike Bostock

Mar 14, 2012, 5:29:39 PM
to d3...@googlegroups.com
> Unfortunately, people are not good at perceiving relative areas
> of circles.  Even though I think using area is a more "honest"
> interpretation of data, it's not meaningful perceptually.
> Using radius is maybe less honest, but more visually distinguishable.

Whoa, hold up! It's true that length is more accurate than area (and
position more accurate than both), and that people slightly
underestimate increases in area. But that's an argument for using a
different visual encoding, not for distorting the data! It is still
accepted best practice that circles should encode data as area, not
radius. You perceive the number of colored pixels as the amount of
data; you don't perceive the radius directly.

The underestimation effect of area is small and varies greatly from
individual to individual; trying to compensate for the average
underestimation is therefore of little value, just adding unnecessary
complexity. Even so, using radius rather than area distorts much more
significantly (in the other direction) than the average
underestimation.
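
To put a rough number on that distortion, here is a small sketch in plain JavaScript (no D3 needed). The function names and the scaling constants are made up for illustration. It compares how much the drawn area grows when the data value doubles under each encoding:

```javascript
// Area of a circle with the given radius.
function circleArea(radius) {
  return Math.PI * radius * radius;
}

// Radius encoding: the radius is proportional to the data value.
// pixelsPerUnit is a hypothetical scaling constant.
function radiusEncoding(value, pixelsPerUnit) {
  return value * pixelsPerUnit;
}

// Area encoding: the area is proportional to the data value,
// so the radius grows with the square root of the value.
// areaPerUnit is a hypothetical scaling constant.
function areaEncoding(value, areaPerUnit) {
  return Math.sqrt((value * areaPerUnit) / Math.PI);
}

// Two data values, one twice the other.
const small = 10;
const large = 20;

const ratioWithRadius =
  circleArea(radiusEncoding(large, 1)) / circleArea(radiusEncoding(small, 1));
const ratioWithArea =
  circleArea(areaEncoding(large, 1)) / circleArea(areaEncoding(small, 1));

console.log("radius encoding, area ratio:", ratioWithRadius.toFixed(2)); // 4.00
console.log("area encoding, area ratio:", ratioWithArea.toFixed(2)); // 2.00
```

With radius encoding, a value that is twice as large gets four times the ink; with area encoding it gets twice the ink, matching the data.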

The d3.svg.symbol encodes area by default, rather than radius. But
<circle> elements are defined in terms of radius, so you typically use
a d3.scale.sqrt to compute the radius, which makes the area
proportional to the data. I'd lean towards area if you're
intending this to be used for visualization. Otherwise, make sure your
examples use area and document best practice.

Mike
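
For anyone wanting to see the square-root mapping spelled out, here is a minimal sketch that does by hand roughly what a sqrt scale with a zero-based domain and range would do. It is not D3's implementation; makeAreaScale, maxValue, and maxRadius are hypothetical names chosen for this example.

```javascript
// Hypothetical helper: builds a function that maps a data value to a
// circle radius so that the circle's AREA is proportional to the value.
// maxValue is the largest expected data value; maxRadius is the radius
// (in pixels) that the largest value should get.
function makeAreaScale(maxValue, maxRadius) {
  return function (value) {
    // Negative values have no meaningful area, so treat them as zero.
    const v = Math.max(0, value);
    // Area ~ value, and area ~ radius^2, so radius ~ sqrt(value).
    return maxRadius * Math.sqrt(v / maxValue);
  };
}

const radiusFor = makeAreaScale(100, 20);

// Example data values and the radii you would set on each circle's r attribute.
const values = [0, 25, 100];
const radii = values.map(radiusFor);

console.log(radii);
```

Note that a value of 25, a quarter of the maximum, gets half the maximum radius, which is exactly a quarter of the maximum area.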

Bob Monteverde

Mar 14, 2012, 6:47:05 PM
to d3...@googlegroups.com
Well, I knew I asked the question here for a reason.  Thank you, I shall be implementing with area, not radius.  I do remember a Tufte example (well, a bad example) of an old data graphic showing oil prices with a drawing of different drums of oil, where the drums were drawn in 3 dimensions, but the interpretation was actually the height of the drum.  I guess someone thought they were putting a good spin on a boring bar chart, but in reality they destroyed a valid set of data by adding an absurd level of confusion... it could have been volume, area, or height; at first glance you would not know which.

Bob

Scott Murray

Mar 14, 2012, 9:28:06 PM
to d3...@googlegroups.com
Hmm, maybe I mis-remembered the area vs. radius debate. Thanks for clarifying, Mike.

Scott

Gopal Vaswani

Mar 14, 2012, 11:46:35 PM
to d3...@googlegroups.com
These kinds of discussions about data graphics perception are very useful.
Is there a Google (or some other) group devoted to discussing perception, aesthetics, usability, etc. of data graphics?

Thanks
Gopal

--
Gopal vaswani