
Histogram ignores 'bspec' (bin spec) ??


James Stein

Jun 4, 2012, 4:21:20 AM

I would like to plot a histogram with a fixed number of bars, e.g. ten bars.
Consider this code:

SeedRandom [ 1 ] ;
foo = RandomVariate [ NormalDistribution [ 0, 1], 1000 ] ;
baz = Table [ Histogram [ foo, k ], { k, 8, 15 } ] ;
1 == Length [ DeleteDuplicates [ baz ] ]

The last line prints 'True', confirmed by visual inspection of baz which
reveals eight identical histograms, each containing 13 bins. What do I fail
to understand here?


Chris Degnen

Jun 5, 2012, 4:52:04 AM
The bin number specification is only approximate: Histogram adjusts the bin
boundaries so that they fall on simple numbers.

The legacy version of Histogram had an option called ApproximateIntervals
which would control "whether to adjust the interval boundaries to be simple
numbers". You can read this in the 'More information' section here:
http://reference.wolfram.com/mathematica/Histograms/ref/Histogram.html

You can obtain a histogram with exactly 10 bins using the bin width
specification, dx.

I.e. Histogram[foo, {(Max@foo - Min@foo)/(10 - 1)}]

So your code below would be:

SeedRandom[1];
foo = RandomVariate[NormalDistribution[0, 1], 1000];
{fmin, fmax} = #@foo & /@ {Min, Max};
baz = Table[Histogram[foo, {(fmax - fmin)/(k - 1)}], {k, 8, 15}]
1 == Length[DeleteDuplicates[baz]]
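
As a cross-check on the bin counts, here is a minimal sketch that passes
explicit bin edges instead of a width, so the count does not depend on how
bins are aligned. The helper names edgesFor and binCount are hypothetical,
introduced only for this illustration, and the sketch assumes HistogramList
accepts the same bspec forms as Histogram.

```mathematica
SeedRandom[1];
foo = RandomVariate[NormalDistribution[0, 1], 1000];
{fmin, fmax} = {Min[foo], Max[foo]};

(* edgesFor (hypothetical helper): k + 1 equally spaced edges from fmin to fmax *)
edgesFor[k_] := fmin + (fmax - fmin) Range[0, k]/k;

(* binCount (hypothetical helper): HistogramList returns {edges, heights},
   so the length of the second part is the number of bins *)
binCount[bspec_] := Length[Last[HistogramList[foo, bspec]]];

(* Explicit edges are given as {{b1, b2, ...}}, a list wrapped in a list *)
Table[binCount[{edgesFor[k]}], {k, 8, 15}]

(* The same bspec used for the plots themselves *)
baz = Table[Histogram[foo, {edgesFor[k]}], {k, 8, 15}]
```

Since bins are documented as half-open intervals, the single largest value
sits on the last edge and may not be counted; moving the last edge slightly
to the right should avoid that.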



Nasser M. Abbasi

Jun 5, 2012, 4:55:07 AM
I am not sure why the same bins show up each time.

But the way I would do this, at least for now, is to
use {dx} rather than n for bspec. That way I know exactly
the bin width I am using. Something like this:

-----------------------
SeedRandom[1];
foo = RandomVariate[NormalDistribution[0,1],1000];
max = Max[foo];
min = Min[foo];
wid = max-min;
baz = Table[Histogram[foo,{wid/(k-1)},"PDF"],{k,8,15}]
----------------------

I used "PDF" above just to get total area = 1, that is all.
It is not needed otherwise.

Now

1 == Length[DeleteDuplicates[baz]]

gives False. Note that the total width of the data is less
than 8: about 99.99% of values from a standard normal lie within
4 standard deviations of the mean, so the data runs from
roughly -4 to 4 (you used NormalDistribution with zero mean and std=1).

Not sure if Mathematica has used '1' for the width of
each bin when using just 'n'.

If so, asking for 8 bins or more would put the same
data into only the first 8 bins (any bins beyond
8 would have zero elements, hence might not
show in the plot). This might explain why all plots
have 8 bins.

Not sure. But either way, using {dx} seems safer
until some expert here can explain this better.

--Nasser
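
One way to test the guess about the bin width is to look at the bin edges
Histogram actually picks for each integer bspec. This is only a sketch,
assuming HistogramList uses the same binning as Histogram; it prints the
requested k, the width of the first bin, and the resulting number of bins.

```mathematica
SeedRandom[1];
foo = RandomVariate[NormalDistribution[0, 1], 1000];

(* First part of HistogramList is the list of bin edges actually used *)
Table[
  With[{e = First[HistogramList[foo, k]]},
    {k, e[[2]] - e[[1]], Length[e] - 1}],
  {k, 8, 15}]
```

If every row reports the same width and the same count, that would match the
identical plots described in the original question.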






Chris Degnen

Jun 7, 2012, 5:22:15 AM
On Tuesday, June 5, 2012 9:52:04 AM UTC+1, I wrote:
> ... [So code] would be:
>
> SeedRandom[1];
> foo = RandomVariate[NormalDistribution[0, 1], 1000];
> {fmin, fmax} = #@foo & /@ {Min, Max};
> ...

{fmin, fmax} = Through[{Min, Max}[foo]]

was what I meant.
