RFC: Stacked Bar Plot Implementation

macserv

Aug 24, 2010, 2:06:13 PM
to coreplot-discuss
Hi, All,

I've done some work toward getting a stacked bar plot implementation
in place that doesn't require significant work on the developer's
part. I'd like to share my approach, and get feedback on the
direction moving forward. I'd also like to thank Jason Boyle; I've
used his code for adjusting bar position during path creation.

In my current implementation, to get the bars to stack, all you have
to do is this (to your CPBarPlot instance)...

[plot setShouldStackBars:YES];

and the CPBarPlot handles the rest. This is accomplished by caching
the heights of the bars in an NSMutableDictionary, which is stored in/
managed by CPBarPlot's container class, CPXYGraph. The bar's location
is the key, so only bars with the same location value will stack. The
cache structure currently looks like this:

Root (NSDictionary)
    "location" (NSDictionary)
        "numbers" (NSArray)
            [0] (NSNumber)
            [1] (NSNumber)
            [2] (NSNumber)
        "paths" (Array)
            [0] (CGPathRef)
            [1] (CGPathRef)
            [2] (CGPathRef)

This stacking cache does not exist until used, and is only used when
the plot shouldStackBars. The "numbers" are used during
newBarPathWithContext:recordIndex: to adjust the base and tip points
based on the previous bars. The paths generated by that method are
then cached in "paths", and are then used when positioning labels
(TODO: I have yet to implement this for the recent change in the way
plots are labeled). I've been thinking about creating a simple
"CPBarPlotBar" object to contain the data in a more organized
fashion. I'm also aware that I might be caching some data twice now,
and should probably separate or unify things to make sure that doesn't
happen.
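
As a rough illustration of how the "numbers" part of the cache could yield base and tip points, here is a minimal sketch in Python rather than Objective-C. The names stack_cache and stacked_bar_extent are hypothetical and are not taken from the patch or from Core Plot.

```python
from collections import defaultdict

# Hypothetical stand-in for the stacking cache:
# location -> list of bar lengths, in the order the plots were drawn.
stack_cache = defaultdict(list)

def stacked_bar_extent(location, length):
    """Return (base, tip) for a new bar at `location` and record its length."""
    base = sum(stack_cache[location])   # total length of the bars already stacked here
    tip = base + length
    stack_cache[location].append(length)
    return base, tip

first = stacked_bar_extent(1.0, 3.0)    # first bar at location 1.0 starts at zero
second = stacked_bar_extent(1.0, 2.0)   # second bar sits on top of the first
other = stacked_bar_extent(2.0, 5.0)    # a different location starts its own stack
```

Only bars that share a location value accumulate, which matches the keying described above.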

I also added a means to set the size of the gap between bars. Right
now, it just shortens all bars at both ends by half the gap size...
TODO: It shouldn't do this at the very top and bottom of the stack.
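
A minimal sketch of the gap adjustment, including the TODO case, might look like the following. It is written in Python for brevity, assumes bars with positive length, and the function name and the is_bottom/is_top flags are hypothetical.

```python
def apply_gap(base, tip, gap, is_bottom=False, is_top=False):
    """Shorten a bar by half the gap at each end that touches another bar.

    With both flags left False this reproduces the current behavior
    (both ends shortened). The flags cover the TODO: the outer ends of
    the stack are left untouched.
    """
    half = gap / 2.0
    new_base = base if is_bottom else base + half
    new_tip = tip if is_top else tip - half
    return new_base, new_tip

current = apply_gap(0.0, 3.0, 1.0)                      # shortened at both ends
bottom_bar = apply_gap(0.0, 3.0, 1.0, is_bottom=True)   # base stays on the axis
top_bar = apply_gap(3.0, 5.0, 1.0, is_top=True)         # tip keeps its full height
```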

I've posted a patch in the files section that contains my diffs from
the head of the tree (as of 2010-08-24, 14:00 ET). I'd appreciate any
feedback you all may have, and will do my best to respond in a timely
manner.

Thanks In Advance
--Matt

Eric

Aug 24, 2010, 7:38:44 PM
to coreplot-discuss
Matt,

Nice work. I haven't looked at your code yet, but I have a few
observations based on your comments. I will take a look at the code
soon.

1. We want to keep the implementation general enough that it can be
used by other plot types as well. Bar plots are a good place to start,
but I see definite usefulness for this with scatter plots as well. I'm
sure there are other plot types that are not implemented yet that
could also use a stacking option. For example, we might want to move
the "shouldStackBars" property to the CPPlot class and rename it
something like "shouldStack" or "shouldStackPlots".

2. Why cache the paths for the bars? Each plot only draws itself, so
it can figure out its own path outlines without knowing the others.
Just having the other values at the same location and knowing the
index of where the plot is in the stack should be enough information
to draw the bars. You'd have to re-cache the path for the bars at the
top (right) of the stack when you add and remove plots anyway because
of the offsets you mentioned and the fact that the ends of the bars
can be rounded. The graph should tell the affected stacked plots to
redraw when a plot is added to or removed from the stack.

3. We should support the case where there are two or more independent
stacks on the same graph. For instance, you could have multiple plot
spaces with a set of stacked plots in each one. We could put your
dictionary structure inside an array or another dictionary. You would
then have to tell the plot which stack it belonged to in addition to
turning on the stacking behavior.
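
To make point 3 concrete, one possible shape is to nest the per-location dictionary inside another dictionary keyed by a stack identifier. This is only a sketch in Python; the names stacks, add_bar, and the "left"/"right" identifiers are made up for illustration.

```python
from collections import defaultdict

# Hypothetical layout: stack identifier -> location -> list of bar lengths.
stacks = defaultdict(lambda: defaultdict(list))

def add_bar(stack_id, location, length):
    """Record a bar in the named stack and return the base it should start from."""
    lengths = stacks[stack_id][location]
    base = sum(lengths)
    lengths.append(length)
    return base

left_first = add_bar("left", 1, 4.0)     # starts at zero
left_second = add_bar("left", 1, 2.0)    # stacks on the first "left" bar
right_first = add_bar("right", 1, 7.0)   # same location, independent stack
```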

Eric

macserv

Aug 26, 2010, 10:48:21 AM
to coreplot-discuss
Hey, Eric,

1. I'm all for abstracting the logic as much as possible, but I'm not
sure how it could go into CPPlot. The cache itself is currently
implemented in CPXYGraph, and in CPBarPlot, the cache is used in
newBarPathWithContext:. It would also be used in
positionLabelAnnotation:forIndex:.

2. Short answer: labels. If I'm reading the code correctly, I'll need
the actual coordinates of the bars (not just their lengths) to
position the labels properly. For a stacked chart, label positioning
becomes more complex: the value appears in the middle of the bar,
unless it's too small, and then it goes on top. Avoiding label
collisions will take a good bit of logic.

3. Good point... that would make the implementation more robust.
Noted for when I'm able to work on this more in the future.
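
The basic labeling rule from point 2 (centered in the bar unless the bar is too small, otherwise on top) could be sketched as below. This is Python for brevity, the padding value is an arbitrary assumption, and collision avoidance between neighboring labels is not attempted.

```python
def label_position(base, tip, label_height, padding=2.0):
    """Return (center_y, placement) for a label on a vertical bar.

    The label is centered inside the bar when the bar is tall enough to
    hold it with padding on both sides; otherwise it is placed just
    above the tip. The padding default is an assumed value.
    """
    if (tip - base) >= label_height + 2 * padding:
        return (base + tip) / 2.0, "inside"
    return tip + padding + label_height / 2.0, "above"

tall = label_position(0.0, 40.0, 10.0)    # fits inside the bar
short = label_position(40.0, 45.0, 10.0)  # too short, so it goes above the tip
```

Note that this needs the bar's actual base and tip coordinates, not just its length.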

My focus has shifted temporarily to another project, so feel free to
run with these ideas if you're inclined to do so. I'll pick it back
up when I get back to working on my previous app.

Thanks!
--Matt

Eric

Aug 26, 2010, 8:47:18 PM
to coreplot-discuss
Matt,

1. I'm not talking about moving the stacking logic to CPPlot, just the
properties that control it. They would then be inherited by other plot
classes that could implement their own stacking behavior if it makes
sense. Similarly, the stacking cache should probably live in CPGraph
instead of CPXYGraph. Not only would this get rid of a bunch of
typecasts, but it would be inherited by other future graph types. For
example, it's not unreasonable to expect that Core Plot will
eventually have a polar graph class and polar plots. Keeping the
stacking infrastructure as generic as possible will make it easy to
add something like a stacked radar plot.

2. The cache stores the length of each bar. You can compute the
starting location from the lengths of the bars underneath it in the
stack. For the more complex labeling options, we might need some sort
of controller that can see all of the plots in order to decide where
to put the labels. This could be a separate class that's instantiated
as needed to manage a stack, or it could be the graph (or CPXYGraph in
this case). The plot would provide a few helper methods specific to
the type of plot and the graph could do the general layout work and
make decisions about where to place the labels.

3. You've made a good start. I hope you can come back soon!

Thanks,
Eric

Drew McCormack

Aug 27, 2010, 7:02:09 AM
to coreplot...@googlegroups.com
I'm a bit confused by this discussion. Why would plotting functionality be pushed into CPGraph? Seems like a bad idea to me.

Is the idea that you will have multiple CPPlots, stacked on top of one another? If that is the case, it may be best implemented by having a 'supportingPlot' property, or something like that, which one plot can use to ask another plot for its starting position (i.e., its base). I think caching the data in CPGraph would be an uglier solution. By simply having a reference from one plot to another, the data can be generated on the fly, and lazily.
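
A minimal sketch of the 'supportingPlot' idea, in Python rather than Objective-C: each plot holds a reference to the plot beneath it and computes its base on demand. The Plot class and its methods here are hypothetical, not Core Plot API.

```python
class Plot:
    """Hypothetical plot with one bar length per index."""

    def __init__(self, values, supporting_plot=None):
        self.values = values
        self.supporting_plot = supporting_plot  # plot directly beneath, or None

    def base(self, index):
        # Computed lazily by walking down the chain of supporting plots.
        if self.supporting_plot is None:
            return 0.0
        return self.supporting_plot.tip(index)

    def tip(self, index):
        return self.base(index) + self.values[index]

a = Plot([1.0, 2.0])
b = Plot([3.0, 4.0], supporting_plot=a)
c = Plot([5.0, 6.0], supporting_plot=b)
c_base = c.base(0)   # 1.0 + 3.0
c_tip = c.tip(1)     # 2.0 + 4.0 + 6.0
```

Nothing is cached in this version; every lookup re-walks the chain.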

Another option would simply be to have a single plot draw the stacked bars, rather than having multiple plots. In this case the CPPlot class would be updated to request multiple y values for each x value, or something along those lines.

Drew

> --
> You received this message because you are subscribed to the Google Groups "coreplot-discuss" group.
> To post to this group, send email to coreplot...@googlegroups.com.
> To unsubscribe from this group, send email to coreplot-discu...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/coreplot-discuss?hl=en.
>

Eric

Aug 27, 2010, 2:38:53 PM
to coreplot-discuss
Drew,

I was just talking about pushing the cache to CPGraph, not any of the
drawing.

I don't like the idea of linking one plot to another. You're then in
the situation of having to maintain a linked list every time you add
or remove plots from the stack.

Your alternate suggestion is probably the cleanest solution. We could
have a subclass of CPBarPlot that accepts multiple values for each
location and draws the stacked bars. I think the stacking
functionality should be separated from the standard bar plot class.
They will share some properties like the corner radius and bar
offsets, but the data handling and drawing will be different.

As I mentioned above, we need a similar stacked version of
CPScatterPlot also. This would be especially useful for stacked area
plots.

Eric

Drew McCormack

Aug 27, 2010, 3:09:11 PM
to coreplot...@googlegroups.com
Sounds right.
I would still avoid moving the plot cache into CPGraph. It would be better in the CPPlot class, where caching is done now for the existing plot types.

Drew

Eric

Aug 27, 2010, 3:25:13 PM
to coreplot-discuss
Drew,

If we make a separate stacked plot type, we don't need the new cache
structure at all. The existing caching mechanism in the plots will
work fine.

Eric

macserv

Aug 28, 2010, 5:48:44 PM
to coreplot-discuss
I had originally thought of going that way (an array of bars per
location), but I went the other route because it's more
philosophically correct from a data perspective. The goal of my
implementation is to avoid having to create a lot of custom logic on
the developer's end, having to pivot data, or calculate what does and
doesn't go into an array of bars in a stack.

Looking at it from a tabular data perspective, the relationship
between the bars is horizontal, not vertical.

A A A
B B B
C C C

Now, if I want to turn off the data in plot B, I simply remove its
plot, and it falls out of all of the stacks. If I do it the other
way, I have to re-process my arrays, and not include plot B.
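
To illustrate the one-series-per-plot view with made-up numbers, here is a small Python sketch. The values and the stack_totals helper are hypothetical.

```python
# One series per plot: each plot contributes one bar at every location.
plots = {
    "A": [1.0, 2.0, 3.0],
    "B": [4.0, 5.0, 6.0],
    "C": [7.0, 8.0, 9.0],
}

def stack_totals(plots):
    """Total height of the stack at each location."""
    return [sum(column) for column in zip(*plots.values())]

before = stack_totals(plots)
del plots["B"]            # removing the plot drops it from every stack at once
after = stack_totals(plots)
```

With an array of bars per location, the same change would mean rebuilding each location's array without B's entry.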

A couple of notes: Having a shared cache in a container object is far
from abnormal. It's also only used when stacking is enabled. Also,
we need to bear in mind that the bars not only have to be drawn, they
have to be labeled, and we need a means of determining the position of
those labels at some point after the draw methods run. For stacked bars, it's
more complicated than simple ones... the labels have to sit inside the
bars, unless the bar is too short, and then the behavior would be
preferential (draw above, hide label, show on hover, etc.).

Ultimately, I'd be fine with it either way, as long as everything
works, and I don't have to write a ton more code to process the data
I'm getting from a JSON service response.

--Matt

Eric

Aug 28, 2010, 7:39:27 PM
to coreplot-discuss
There's another issue to consider that hasn't been discussed in this
thread yet. We also have to detect touch and click events in the bars.
The event handling delegate only gets an index right now. This assumes
that each plot has only one data series.

Instead of caching all of the stacked plot data in the graph, why not
just give it the ability to manage the stacks? It could keep track of
which plots are stacked and how the stacks are grouped together, but
leave all of the details of the data and drawing to the individual
plots. When stacked, the plot could ask its graph for the offset due
to the plots stacked below it and whether the drawing should be
modified due to being at the bottom, middle, or top of the stack.
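
One way to picture this division of labor is sketched below in Python: the graph only tracks stack membership and order, and each plot asks it for an offset and a position. The Graph and Plot classes and their method names are hypothetical, not existing Core Plot API.

```python
class Plot:
    """Hypothetical plot that owns its own data (one bar length per index)."""
    def __init__(self, values):
        self.values = values

class Graph:
    """Tracks which plots are stacked together, bottom first. Holds no plot data."""
    def __init__(self):
        self.stacks = {}

    def add_to_stack(self, name, plot):
        self.stacks.setdefault(name, []).append(plot)

    def offset_for(self, name, plot, index):
        # Sum the bar lengths of every plot below this one in the stack.
        stack = self.stacks[name]
        below = stack[:stack.index(plot)]
        return sum(p.values[index] for p in below)

    def position_for(self, name, plot):
        # A single-plot stack reports "bottom" in this sketch.
        stack = self.stacks[name]
        i = stack.index(plot)
        if i == 0:
            return "bottom"
        if i == len(stack) - 1:
            return "top"
        return "middle"

graph = Graph()
a, b, c = Plot([1.0, 2.0]), Plot([3.0, 4.0]), Plot([5.0, 6.0])
for plot in (a, b, c):
    graph.add_to_stack("main", plot)
c_offset = graph.offset_for("main", c, 1)   # 2.0 + 4.0
b_position = graph.position_for("main", b)
```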

Eric

Drew McCormack

Aug 29, 2010, 5:13:45 AM
to coreplot...@googlegroups.com
OK, but I still think we are loading functionality into CPGraph that doesn't belong there.
What if we just created a new class, called CPPlotStack, or something like that? It could be used internally to cache data, and generally organize the plots in a stack.

Drew
