Best way to implement a moving window?

4,537 views
Skip to first unread message

Jake Brukhman

unread,
Feb 12, 2012, 10:08:55 PM2/12/12
to golang-nuts
A moving window is a queue which holds a number of elements up to a
certain maximum capacity; once capacity is reached and a new item is
pushed onto the queue, the head of the queue is popped to make room
for it.

I've implemented a moving window with container/list, as follows:

l := list.New()
for {
val := <-data // channel with incoming data
l.PushBack(val)
for l.Len() > MAX_CAPACITY {
l.Remove(l.Front())
}
process(l)
}

The issue is that process() needs to turn the list into an array, and
therefore must traverse the list one extra time. Is there an efficient
way to implement a moving window using slices, where that extra
traversal isn't necessary? Or will it always involve copying to a new
array on each iteration (when at capacity)?

Thanks,
Jake

David Symonds

unread,
Feb 12, 2012, 10:12:33 PM2/12/12
to Jake Brukhman, golang-nuts
The canonical implementation of a moving window (a.k.a. "sliding
window") is a fixed size buffer treated as a ring buffer. That is,
allocate a fixed-length array/slice, and keep track of separate read
and write markers.


Dave.

spir

unread,
Feb 13, 2012, 2:10:58 AM2/13/12
to golan...@googlegroups.com

0. At first sight, but I don't know the details indeed, I find strange
your list update routine: since you push a single item at a time (val),
why do you need to remove the front in a loop? But maybe I just miss the
point.

Concerning your question:
1. Do you implement process, esp its input? If yes, consider using a
list as well. There are chances process only needs traversing the data
and thus a list would do the job perfectly. (The core advantage of an
array is instant access to arbitrary items by index. In go, another is
that one can slice them, but we could do that with lists as well.) In
this case, no copy indeed.
2. There is a data structure I used once a long time ago (don't know the
proper name in english, call it "circular array"), which is just an
array which "first" index is not fixed as 0 (or 1). In other words, you
have a moving start pos, which is a field of the structure. Thus,
instead of performing pop-front, one increments start-pos. The trick is
just to add this offet to indices, and to "modulo" len so that higher
indices reach back the beginning of the array. Thus, on writing, this
beginning is automagically reused. (Hope I'm clear, it's easy in
practice and transparent to the client once implemented.) Again, in this
case, no copy needed.

I hope I properly understood what your question is actually about,
Denis

Dustin

unread,
Feb 13, 2012, 11:54:14 AM2/13/12
to golan...@googlegroups.com
I'm doing exactly this in my thermometer:  http://bleu.west.spy.net/house/  --- each room has a different width and keeps just enough readings around to be able to render a sparkline on demand.

I'm using the stdlib's container/ring for this.  I construct it based on the width of the box here:


And then for each new reading, I go backwards in the list and overwrite the oldest value:


When I want to read the entire history, I convert it to a slice:


Or I can just read the "current" value (into a one-element slice):


Jake Brukhman

unread,
Feb 13, 2012, 12:52:47 PM2/13/12
to golang-nuts
Thanks everyone for your input. I ended up using a generalized ring
structure based on arrays. As I mentioned, I wanted to minimize the
amount of array copying that needed to be done, so I decided to trade
off space and copying complexity. I describe what I did below, in case
this is helpful to someone.

If your intended window size is S, then pick some integer M >= 1, and
allocate an array of size S*M. I kept head and tail markers, advancing
the tail marker every time an element is added. If the distance
between the markers became S, then I would advance the head marker as
well, thus keeping the window size. When, finally, the array was full
and a new element would need to be inserted, I would do a "rewind"
operation: I would simply copy the last S-1 elements from the end of
the array to the beginning and insert the new element at the S-th
index.

The nice thing about this (for me) is that to get a view of the moving
window as a slice never ever requires copying arrays:

func (m *MovingWindow) ToSlice() []float64 {
return m.arr[m.head:m.tail]
}

Finally, you can generate approx. S*M moving windows before you need
to do S copy operations. So your amortized cost in copying is
proportional to 1/M, which you can control.

Thanks,
Jake.

On Feb 13, 11:54 am, Dustin <dsalli...@gmail.com> wrote:
> I'm doing exactly this in my thermometer:  http://bleu.west.spy.net/house/
>  --- each room has a different width and keeps just enough readings around
> to be able to render a sparkline on demand.
>
> I'm using the stdlib's container/ring for this.  I construct it based on
> the width of the box here:
>
> https://github.com/dustin/web-thermometer/blob/master/gohousetherm/re...
>
> And then for each new reading, I go backwards in the list and overwrite the
> oldest value:
>
> https://github.com/dustin/web-thermometer/blob/master/gohousetherm/re...
>
> When I want to read the entire history, I convert it to a slice:
>
> https://github.com/dustin/web-thermometer/blob/master/gohousetherm/re...
>
> Or I can just read the "current" value (into a one-element slice):
>
> https://github.com/dustin/web-thermometer/blob/master/gohousetherm/re...

Dustin

unread,
Feb 13, 2012, 1:02:49 PM2/13/12
to golan...@googlegroups.com

On Monday, February 13, 2012 9:52:47 AM UTC-8, Jake Brukhman wrote:
Thanks everyone for your input. I ended up using a generalized ring
structure based on arrays. As I mentioned, I wanted to minimize the
amount of array copying that needed to be done, so I decided to trade
off space and copying complexity. I describe what I did below, in case
this is helpful to someone.  

  It would help more people if you pushed it up as a library we could "go get"  :) 

Jake Brukhman

unread,
Feb 13, 2012, 2:16:10 PM2/13/12
to golang-nuts
I've put it up here:

http://github.com/jbrukh/window

Thanks.

Miki Tebeka

unread,
Feb 13, 2012, 2:19:08 PM2/13/12
to golan...@googlegroups.com
Is there a reason you're not using http://golang.org/pkg/container/ring/ ?

Jake Brukhman

unread,
Feb 13, 2012, 2:46:52 PM2/13/12
to golang-nuts
If I used a ring, then in my application I would have to allocate a
new array for each window and perform a linear-time copy (as in
Dustin's third link above). My intention is to avoid these extra
allocations; furthermore, the amortized cost of copying can be made
arbitrarily small.

To be more precise, I can try to optimize the quantity "copy
operations per windows displayed" for my application, which my moving
window sets at 1/M, where M is a parameter. For Dustin's
implementation, this number is always ~N, the capacity of his ring.

Jake

Jake Brukhman

unread,
Feb 13, 2012, 11:18:47 PM2/13/12
to golan...@googlegroups.com
Here are some benchmarks:

BenchmarkListS1000        1       42649824000 ns/op
BenchmarkS1000M1       1 2707254000 ns/op
BenchmarkS1000M10     100  10864690 ns/op
BenchmarkS1000M100     100  10774190 ns/op
BenchmarkS1000M500     100  11953970 ns/op

The first test is running 1,000,000 data points through a 1000-point moving window implemented with a list. On each insert, a slice is generated. The subsequent tests run the same number of data points through a moving window with different size and M combinations (as above). On each insert, a slice is generated as well.

Dustin

unread,
Feb 14, 2012, 1:38:11 AM2/14/12
to golan...@googlegroups.com
  Science... awesome.  I'd consider using ring since that's what you're actually wanting.  :)  Also, separate the adds and the slice conversion (if if you *really*  need a slice every time, it pays off to measure separately).  This is on my ancient atom box:

BenchmarkS1000M1       1 14553606000 ns/op
BenchmarkS1000M10      50  40414860 ns/op
BenchmarkS1000M100      50  39455580 ns/op
BenchmarkS1000M500      50  39409940 ns/op
BenchmarkListS1000       5 957675800 ns/op
BenchmarkRingS1000      10 186544400 ns/op

Jake Brukhman

unread,
Feb 14, 2012, 10:02:26 PM2/14/12
to golang-nuts
Sweet deal. If you post the code, I'll include it in the tests.

Jake

Dustin

unread,
Feb 15, 2012, 2:54:47 AM2/15/12
to golan...@googlegroups.com

On Tuesday, February 14, 2012 7:02:26 PM UTC-8, Jake Brukhman wrote:
Sweet deal. If you post the code, I'll include it in the tests.  

  I kind of mutilated your tests to get there.  :/  I removed all of the ->slice stuff.  I think adding back separate ->slice benches would be good.  In particular, the naïve way I did it for ring was much slower than the way you did it for list (but your method would probably work fine).  This is the basic ring test, though (as it fit your framework, minus the slicification):

func r(size int, b *testing.B) {
       b.StopTimer()
       rng := ring.New(size)
       TIMES := 1000 * size
       b.StartTimer()
       for i := 0; i < b.N; i++ {
               for j := 0; j < TIMES; j++ {
                       rng.Value = float64(j)
                       rng = rng.Prev()
               }
       }
}

Jake Brukhman

unread,
Feb 15, 2012, 10:24:53 AM2/15/12
to golang-nuts
Thanks, Dustin. I followed your suggestion and separated out the
benchmarks. Check out these results on a MacBook Pro:

BenchmarkListS1000 10 188497800 ns/op
BenchmarkRingS1000 50 35760520 ns/op
BenchmarkS1000M1 1 2817453000 ns/op
BenchmarkS1000M10 200 9012795 ns/op
BenchmarkS1000M100 200 8807370 ns/op
BenchmarkS1000M500 200 8471190 ns/op
BenchmarkSlicifyList 50000 41656 ns/op
BenchmarkSlicifyRing 50000 48598 ns/op
BenchmarkSlicifyWindow 200000000 9.30 ns/op
Message has been deleted

Kyle Lemons

unread,
Feb 23, 2012, 2:42:42 PM2/23/12
to Jake Brukhman, golang-nuts
I have a data processing application that does moving windows somewhat differently.  Because a window is often used for running sums and such, instead of forcing the application to recompute the sum each time, it provides the start/end indices of the range of the slice being added and removed from the window.  In your case, I would provide pointers to the start/end of each subset being added and removed.


Basically this allows you to process large rolling windows without having to parse a piece of data too many times.
Reply all
Reply to author
Forward
0 new messages