Implementation of an fps counter

Andrea Fazzi

Jul 6, 2010, 10:39:14 AM
to golang-nuts

Hi all,

I'm implementing an fps counter for gospeccy and I'd like to use goroutines
and channels, of course. The idea is to create a service that streams the
fps values calculated over a range of timings provided by the client code
in a given time interval.

I ended up with something like this (implementation + test):

http://gist.github.com/465454

I'd like to know your opinion about this approach. In particular, I'm not
sure about the way I handle the non-blocking stream of fps values. Thank
you in advance!

Andrea

<<CODE

package spectrum

import (
	"testing"
	"time"
)

const (
	second = 1e9
	ms     = 1e6
)

type FpsCounter struct {
	// The channel on which timings are sent from client code
	Timings chan<- int64

	// Client code receives fps values from this channel
	Fps <-chan float

	// Same as Timings but the end we use
	timings <-chan int64

	timeInterval int64

	// Same as Fps but the end we use
	fps chan<- float

	// The ticker that triggers the calculation of average fps
	ticker <-chan int64
}

func NewFpsCounter(timeInterval int64) *FpsCounter {

	timings := make(chan int64)
	fps := make(chan float)

	fpsCounter := &FpsCounter{timings, fps, timings, timeInterval, fps, time.Tick(timeInterval)}

	var (
		sum, numSamples int64
		lastFps         float
	)

	// Non-blocking fps stream
	go func() {
		for {
			fpsCounter.fps <- lastFps
		}
	}()

	// Wait for timings and calculate the sum
	go func() {
		for t := range timings {
			sum += t
			numSamples++
		}
	}()

	// Calculate average fps and reset variables every tick
	go func() {
		for {
			<-fpsCounter.ticker

			if numSamples > 0 {
				avgTime := sum / numSamples
				lastFps = 1 / (float(avgTime) / second)
			}

			sum, numSamples = 0, 0
		}
	}()

	return fpsCounter
}

// Helper for TestFpsCounter
func loopFor(timeInterval int64, block func(elapsedTime int64)) {
	var elapsedTime int64
	startTime := time.Nanoseconds()
	for elapsedTime < timeInterval {
		block(elapsedTime)
		elapsedTime = time.Nanoseconds() - startTime
	}
}

func TestFpsCounter(t *testing.T) {
	// Collect timings every second
	var timeInterval int64 = 1 * second
	// Create a new FpsCounter service with the given timeInterval
	fpsCounter := NewFpsCounter(timeInterval)

	// Simulate a three-second emulator loop
	loopFor(3 * second, func(elapsedTime int64) {

		// Send a dummy time of 20ms (50 fps)
		fpsCounter.Timings <- 20 * ms

		// Receive fps from the FpsCounter service
		fps := <-fpsCounter.Fps

		// After 2 seconds average fps should be 50
		if elapsedTime > 2 * second {
			if fps != 50 {
				t.Errorf("fps should be 50 but got %f", fps)
			}
		}
	})
}

CODE
--
Andrea Fazzi @ alcacoop.it
Read my blog at http://freecella.blogspot.com
Follow me on http://twitter.com/remogatto

roger peppe

Jul 6, 2010, 11:19:12 AM
to Andrea Fazzi, golang-nuts
a problem with this code is that you're sharing
values between goroutines with no synchronisation
between them.

i think you can do better by avoiding the shared variables
("share memory by communicating", right ? :-))

see the attached for one possibility. only barely tested.

speccytest.go

Andrea Fazzi

Jul 9, 2010, 1:12:20 PM
to golang-nuts
Excerpts from roger peppe's message of Tue Jul 06 17:19:12 +0200 2010:

> a problem with this code is that you're sharing
> values between goroutines with no synchronisation
> between them.
>
> i think you can do better by avoiding the shared variables
> ("share memory by communicating", right ? :-))
>
> see the attached for one possibility. only barely tested.

Roger,

thank you very much, you showed me a better and more idiomatic way of doing
it. It was a good opportunity to carefully read the documentation on
select statements. I refactored[1] your solution just a bit, getting rid of
a few useless non-public struct fields (timings and ticker). Now it looks
much better.

Thank you,
Andrea

[1] - http://gist.github.com/469721

⚛

Jul 9, 2010, 3:30:00 PM
to golang-nuts
I find it strange that you want to implement the FPS counter using
goroutines. I would do this:

1. I am assuming there are at least two goroutines (OS threads): one
performs the Z80 CPU emulation and the other performs SDL rendering
(plus upscaling and filtering).

2. When the Z80 thread finishes processing the instructions belonging to
one frame (ideally at a frequency of 50 Hz), it notifies the SDL thread
via a channel.

3. The SDL thread receives the 6912 "VRAM" bytes from the Z80 thread,
which can then start processing further instructions. I suppose the whole
VRAM data can simply be copied into a new array and sent over a
channel to the SDL thread.

4. The Z80 thread and SDL thread are now (potentially) working in
parallel. The SDL thread is converting the Spectrum screen into SDL
pixels.

5. The display timing data and the FPS counter are *local* to the SDL
thread. There is no special goroutine for handling FPS.

(As I already mentioned elsewhere, I would put the keyboard/joystick
input processing into another separate goroutine.)

... or is there some reason why the display timing and FPS counter
should be placed in distinct goroutines?

Andrea Fazzi

Jul 10, 2010, 11:25:32 AM
to golang-nuts
Excerpts from ⚛'s message of Fri Jul 09 21:30:00 +0200 2010:

> I find it strange that you want to implement the FPS counter using
> goroutines. I would do this:
>
> 1. I am assuming there are at least two goroutines (OS threads): one
> performs the Z80 CPU emulation and the other performs SDL rendering
> (plus upscaling and filtering).
>
> 2. When the Z80 thread finishes processing the instructions belonging to
> one frame (ideally at a frequency of 50 Hz), it notifies the SDL thread
> via a channel.
>
> 3. The SDL thread receives the 6912 "VRAM" bytes from the Z80 thread,
> which can then start processing further instructions. I suppose the whole
> VRAM data can simply be copied into a new array and sent over a
> channel to the SDL thread.
>
> 4. The Z80 thread and SDL thread are now (potentially) working in
> parallel. The SDL thread is converting the Spectrum screen into SDL
> pixels.
>
> 5. The display timing data and the FPS counter are *local* to the SDL
> thread. There is no special goroutine for handling FPS.
>
> (As I already mentioned elsewhere, I would put the keyboard/joystick
> input processing into another separate goroutine.)

Hi,

your suggested design is certainly fascinating :) My doubts are about
performance: I see a lot of per-frame communication between goroutines!
Currently, video memory is written *directly* to the host video surface
through the DisplayAccessor interface; there is no post-processing or
filtering at all. This leads to higher performance, I guess. What do you
think?

Andrea

--
Andrea Fazzi @ alcacoop.it

⚛

Jul 10, 2010, 2:16:04 PM
to golang-nuts
> Read my blog at http://freecella.blogspot.com
> Follow me on http://twitter.com/remogatto

- Several days ago, I did some simple benchmarking and found that
most of the time is spent in the CPU emulation. The display-related
stuff needs a much smaller portion of the time.

- Copying some 50*6912 bytes (345KB) per second is completely
negligible. It does not even need to be flushed from the x86 CPU cache
to main memory, since 6912 bytes fit even into the L1 data cache. The
bandwidth of the L1 cache is several gigabytes per second. (There could be
a (very minor) performance problem if the Go memory allocator does not
recycle the most recently deallocated 6912-byte chunk, in which case the
solution is to allocate a permanent buffer.)

- The Z80 CPU emulation executes 3.5 million instructions per
second. For *each* instruction, there are several memory reads and
memory writes. My estimate is that the number of bytes the x86 CPU
accesses while emulating a single Z80 instruction is something like
10-100. So, that is at least 10*3.5*10^6 bytes (35MB) per second for
the Z80 CPU emulation. Now compare that to the 345KB/s mentioned
above ...
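
The arithmetic in this message can be checked with a few lines of Go. The constants are the figures quoted above (50 frames per second, 6912 bytes per frame, 3.5 million instructions per second, at least 10 bytes touched per instruction); the function names are made up for this sketch, and the numbers are the poster's estimates, not measurements.

```go
package main

import "fmt"

const (
	frameBytes       = 6912    // screen bytes copied per frame
	framesPerSec     = 50      // target frame rate
	instrPerSec      = 3500000 // instruction rate as estimated in the post
	minBytesPerInstr = 10      // lower end of the 10-100 bytes estimate
)

// copyBytesPerSec is the traffic caused by sending one screen copy per frame.
func copyBytesPerSec() int { return frameBytes * framesPerSec }

// emuBytesPerSec is the lower bound on memory traffic caused by the emulation itself.
func emuBytesPerSec() int { return instrPerSec * minBytesPerInstr }

func main() {
	c, e := copyBytesPerSec(), emuBytesPerSec()
	fmt.Printf("frame copies: %d bytes/s\n", c)                // 345600
	fmt.Printf("emulation:    %d bytes/s (lower bound)\n", e)  // 35000000
	fmt.Printf("ratio:        about %dx\n", e/c)               // about 101x
}
```

Even with the lowest per-instruction estimate, the emulation touches roughly a hundred times more memory than the per-frame copies.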

Andrea Fazzi

Jul 11, 2010, 12:13:04 PM
to golang-nuts
Excerpts from ⚛'s message of Sat Jul 10 20:16:04 +0200 2010:

> - Several days ago, I did some simple benchmarking and found that
> most of the time is spent in the CPU emulation. The display-related
> stuff needs a much smaller portion of the time.
>
> - Copying some 50*6912 bytes (345KB) per second is completely
> negligible. It does not even need to be flushed from the x86 CPU cache
> to main memory, since 6912 bytes fit even into the L1 data cache. The
> bandwidth of the L1 cache is several gigabytes per second. (There could be
> a (very minor) performance problem if the Go memory allocator does not
> recycle the most recently deallocated 6912-byte chunk, in which case the
> solution is to allocate a permanent buffer.)
>
> - The Z80 CPU emulation executes 3.5 million instructions per
> second. For *each* instruction, there are several memory reads and
> memory writes. My estimate is that the number of bytes the x86 CPU
> accesses while emulating a single Z80 instruction is something like
> 10-100. So, that is at least 10*3.5*10^6 bytes (35MB) per second for
> the Z80 CPU emulation. Now compare that to the 345KB/s mentioned
> above ...

Ok, it's definitely worth a try, thanks for the benchmarking. Would you
like to provide a patch? :) Well... At the very least I'll rely on your
feedback if I end up with something working along these lines :) This could
be an interesting way to exploit Go's peculiarities in emulation...

With regard to the original question about the fps counter, I think
running it in separate goroutines is a good thing. The counter calculates
the average fps from the average of the timings sent to it during a
given time interval. The end of each interval is signalled by a time.Tick
channel. Receiving from that channel blocks, so I run the receive in a
separate goroutine. Do you see better or more idiomatic alternatives?

Cheers,
Andrea

--
Andrea Fazzi @ alcacoop.it
