On 28/11/2012 11:45, CG wrote:
> for ordinal dimensions (PAD, RED in your dataset), it is pretty easy.
> something like : dim = crossfilter.dimension(function(d) {return
> !!d.ordinal ? d.ordinal : '__missing'}) should work.
Okay, but what do you do with the __missing value?
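
For reference, a minimal sketch of what the sentinel buys you, assuming missing means undefined, null or an empty string. It is plain JavaScript with no crossfilter loaded; ordinalOrMissing, countByKey and the sample records are hypothetical names made up for illustration, and only '__missing' comes from the suggestion above.

```javascript
// Sentinel key for records with no ordinal value ('__missing' is from the thread).
var MISSING = '__missing';

// Hypothetical accessor: return the value, or the sentinel when it is absent.
function ordinalOrMissing(value) {
  return (value === undefined || value === null || value === '') ? MISSING : value;
}

// Plain-JavaScript stand-in for a per-key count, so the sketch runs on its own.
function countByKey(records, accessor) {
  var counts = {};
  records.forEach(function (d) {
    var key = accessor(d);
    counts[key] = (counts[key] || 0) + 1;
  });
  return counts;
}

// Made-up sample records: one with a destination, two without.
var records = [
  { origin: 'PAD', destination: 'RED' },
  { origin: 'PAD', destination: '' },
  { origin: 'PAD' }
];

var destinationCounts = countByKey(records, function (d) {
  return ordinalOrMissing(d.destination);
});
```

The missing records end up under a single key of their own, so they can be shown as a separate bar or dropped from the chart rather than being mixed in with real values.
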
> What I am planning to do for interval dimension (numbers) is to have my
> dimension function return Infinity when the value is missing/NaN. This
> still allows for correct crossfilter ordering.
That makes sense, but if you have a lot of data in this category, how do
you stop that category from artificially reducing the height of the
histogram? (or are D3's histograms smarter than that?)
> Then I'd add one group
> for dis-aggregating/filtering Infinity from other correct values.
Not sure I follow here, can you give me an example?
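
My best guess at what is meant, as a rough sketch in plain JavaScript (no crossfilter loaded): the accessor returns Infinity for anything non-numeric, and a grouping function gives that sentinel its own key so it can be separated from the real delays. The names delayOrInfinity and delayBin and the 10-minute bin width are my own assumptions, not anything from crossfilter or from the suggestion above.

```javascript
// Hypothetical accessor: numeric delay, or Infinity when the field is
// missing or not a number (e.g. "CANCELLED").
function delayOrInfinity(raw) {
  if (raw === undefined || raw === null || raw === '') return Infinity;
  var n = Number(raw);
  return isNaN(n) ? Infinity : n;
}

// Hypothetical group key: finite delays go into 10-minute bins (assumed width),
// while the Infinity sentinel keeps a key of its own.
function delayBin(delay) {
  return isFinite(delay) ? Math.floor(delay / 10) * 10 : Infinity;
}

// Sample rows taken from the data in the original question.
var rows = [
  { departed: '201211230716', delay: '0' },
  { departed: '201211260701', delay: 'CANCELLED' },
  { departed: '201211260721', delay: '2' }
];

var delays = rows.map(function (d) { return delayOrInfinity(d.delay); });

// The sentinel still has a well-defined place in the ordering: at the end.
var sorted = delays.slice().sort(function (a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
});

// Values that would feed the "arrival delay" histogram.
var finiteOnly = delays.filter(function (v) { return isFinite(v); });
```

With crossfilter itself, I assume the accessor would go into dimension(function(d) { return delayOrInfinity(d.delay); }) and delayBin into that dimension's group(delayBin). The chart would then leave the Infinity group out when working out bar heights, and since filterRange excludes its upper bound, filterRange([-Infinity, Infinity]) should keep only the real delays.
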
Chris
> On Wednesday, November 28, 2012 8:54:03 AM UTC+1, ChrisW wrote:
>
> Hi All,
>
> I didn't see a mailing list for crossfilter; if there is one, please can
> someone point me there?
>
> In the meantime, I want to know what to do about some missing data.
> The data I'm exploring is actually very similar to the data in the
> example at http://square.github.com/crossfilter/, I'm looking at delays
> and cancellations on the UK rail network.
>
> So, I have data that looks like this:
>
> origin,destination,departed,delay
> PAD,RED,201211230716,0
> PAD,RED,201211260701,CANCELLED
> PAD,RED,201211260721,2
>
> ...and I want to show cancelled services as a red bar stacked on top of
> the "date" histogram in the example.
>
> I had thought to re-structure the data to be like:
>
> origin,destination,departed,delay,cancelled
> PAD,RED,201211230716,0,0
> PAD,RED,201211260701,?,1
> PAD,RED,201211260721,2,0
>
> ...but, how do I sum up those cancels for each day to stack them onto
> the "date" histogram?
>
> More importantly, what do I put in place of the '?' for delay?
> If I put "0" they'll show up in the "0 minutes delayed" bucket, which
> wouldn't be correct ;-) How do I exclude cancelled services from the
> "arrival delay" histogram?
>
> cheers,
>
> Chris
>
> --
> Simplistix - Content Management, Batch Processing & Python Consulting
> - http://www.simplistix.co.uk
>