Binning data?


Bruce Miller

Feb 7, 2018, 11:56:00 AM
to ded...@googlegroups.com
Hi all,

I need to summarize hundreds of large data sets into "bins" of value
ranges and then get a simple count of the values in each bin.
The data frame has a simple 2-column format: the first column is a character
and the second a numeric value.

The numeric values are frequencies in kHz with possible values ranging
from 10-110 kHz.

What I need to do is group the data into 10 kHz bins, e.g.
10-20, 21-30, 31-40, 41-50, etc. up to 110.
What I would like is to have a result that can be written as a CSV file
with the number of values for each bin.

Raw data looks like this:

Filename       Fc
Q5161811.04    20.36
Q5161811.04    20.46
Q5161811.04    24.17
Q5161811.04    20.20
Q5161811.04    21.14
Q5161811.04    20.23
Q5161811.04    20.41
Q5161811.04    25.36
Q5161811.04    22.50
Q5161811.04    21.83
Q5161811.04    20.10
Q5161811.04    20.97
Q5161811.04    21.11
Q5161811.04    21.62
Q5161811.04    25.48
Q5161811.04    22.44
Q5161811.04    20.49
Q5161811.04    23.05
Q5161811.04    29.57
Q5161811.04    32.50
Q5161811.04    38.06
Q5161811.04    42.18
Q5161811.04    42.19
Q5161811.04    42.23
Q5161811.04    42.35
Q5161811.04    42.18
Q5161811.04    55.01
Q5161811.04    56.20
Q5161811.04    60.12
Q5161811.04    63.05


What I need is for this to be summarized by 10 kHz bins, so the output
would be bins 10-20, 21-30, 31-40, 41-50, etc. up to 110, with the number
of records that fell within each.
If there are no values for a given bin then the result would be 0.

I have looked at binr and dplyr but have not been able to find a
vignette that shows the syntax to set this up.

Thanks for any assistance,

Bruce AKA Bat Dude

Ian Fellows

Feb 7, 2018, 12:16:32 PM
to ded...@googlegroups.com
For binning try something like:

dat$newFc <- floor(dat$Fc / 10) * 10 + 5

Then you can run the frequencies on newFc.
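
A minimal sketch of that idea, using a handful of the Fc values from the original post (the data frame name `dat` follows the line above; the sample rows are only for illustration). Note that `floor(Fc / 10) * 10 + 5` labels each row with the midpoint of its 10 kHz bin, so 20.36 becomes 25 and the bins are effectively 10 to <20, 20 to <30, and so on:

```r
# Small sample taken from the data in the original post
dat <- data.frame(
  Filename = "Q5161811.04",
  Fc = c(20.36, 24.17, 29.57, 32.50, 38.06, 42.18, 55.01, 63.05)
)

# Label each row with the midpoint of its 10 kHz bin (e.g. 20.36 -> 25)
dat$newFc <- floor(dat$Fc / 10) * 10 + 5

# Count the rows per bin; bins with no rows do not appear in this table
counts <- table(dat$newFc)
print(counts)
```

Because `newFc` is still numeric here, `table()` only lists bins that actually occur, so the zero-count bins asked for in the question would need to be added separately.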

ian

Iurie Malai

Feb 9, 2018, 1:13:22 AM
to Deducer
You can use the Recode Variables dialog to create a new variable from Fc, then convert it to a Factor type. If some factor levels are missing (because their frequency is 0), you will need to add them manually. Then you can run the frequencies on the new variable, as Ian suggested.
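
For anyone scripting this rather than using the dialog, a rough R equivalent is to build the factor with an explicit set of levels, so that empty bins are kept with a count of 0. This is only a sketch: `dat`, `newFc`, and the midpoint labels follow Ian's suggestion earlier in the thread, `all_bins` is a name made up for illustration, and the sample values are a subset of the posted data.

```r
# Illustrative subset of the Fc values from the original post
dat <- data.frame(Fc = c(20.36, 24.17, 32.50, 42.18, 63.05))

# Midpoint label for each 10 kHz bin, as in Ian's suggestion
dat$newFc <- floor(dat$Fc / 10) * 10 + 5

# Every bin midpoint from 15 to 105 (bins covering 10 up to, but not including, 110)
all_bins <- seq(15, 105, by = 10)

# Declaring the levels explicitly keeps bins that have no observations
dat$newFc <- factor(dat$newFc, levels = all_bins)

# table() reports every level, so empty bins are listed with a count of 0
counts <- table(dat$newFc)
print(counts)
```

With this labeling, a value of exactly 110 would map to 115, which is not among the levels and would become NA, so the upper limit may need adjusting for real data.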


Iurie

Tom

Feb 13, 2018, 4:11:34 PM
to Deducer
Bruce,

Assuming your two columns are in a data frame the_data:

the_data$bins <- cut(the_data$Fc, breaks = seq(10, 110, by = 10), include.lowest = TRUE)

might do what you want.
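
A sketch of how this could be carried through to the CSV summary asked for in the original post. The sample rows come from the posted data; the names `bin_counts` and `out_file` and the output file name are assumptions for illustration. `include.lowest = TRUE` makes sure a value of exactly 10 lands in the first bin rather than becoming NA.

```r
# Small sample taken from the data in the original post
the_data <- data.frame(
  Filename = "Q5161811.04",
  Fc = c(20.36, 24.17, 29.57, 32.50, 38.06, 42.18, 55.01, 63.05)
)

# Ten bins: [10,20], (20,30], ..., (100,110]
the_data$bins <- cut(the_data$Fc,
                     breaks = seq(10, 110, by = 10),
                     include.lowest = TRUE)

# table() on a factor reports every level, so empty bins get a count of 0
bin_counts <- as.data.frame(table(the_data$bins))
names(bin_counts) <- c("bin", "count")

# Write the summary to a CSV file (hypothetical file name in a temp directory)
out_file <- file.path(tempdir(), "bin_counts.csv")
write.csv(bin_counts, out_file, row.names = FALSE)
```

By default `cut()` closes each interval on the right, so 20 falls in the first bin and 20.36 in the second, which is close to the 10-20, 21-30 grouping described in the question.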

Regards,

Tom