Multi-threaded parallel worker testing (more channels, PWM)

Micah Wedemeyer

Jan 16, 2015, 12:38:39 PM
to lightsh...@googlegroups.com
I've played some more with breaking the guts of play_song into multiple parallel workers, and so far I've had great success. Overall CPU usage is low (12-15% with cached FFT. That's good, right?) and the lame process (used for mp3 decoding, right?) typically takes up about the same CPU as python.

However, I only have 8 channels and am doing on/off (with the Sainsmart SSR). I'd like to know how it works for someone with a lot of channels and/or doing PWM. I'm curious if there's any stuttering or freezing.

I'd love for someone with a high channel and/or PWM setup to give my code a run. Right now it only works with mp3/wav input, so no audio_in yet.


Clone the repo, switch to the pipes_and_filters branch, drop in your overrides.cfg, and you should be good to go. Just run py/synchronized_lights.py with a playlist and tell me how it goes.

If you're curious about the guts, here's where I've been fiddling around:

I create a bunch of Queues (a thread-safe collection structure) and a bunch of workers. Then each worker just processes its own queue. The added overhead should be tiny, since under the hood it's basically just pushing/popping a bunch of pointers (to the string audio data bytes) on several queues. The benefit is that some workers can keep going while others are blocked on i/o.
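A minimal sketch of that arrangement, for anyone who wants the shape of it without reading the branch. This is not the code on pipes_and_filters: the names (run_pipeline, worker, feed, SENTINEL) are made up for illustration, and it uses the Python 3 module name queue (it is Queue on Python 2.7).

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker, not from the branch


def feed(chunks, outbox):
    """Push every input chunk into the first queue, then signal the end."""
    for chunk in chunks:
        outbox.put(chunk)  # blocks while the first queue is full
    outbox.put(SENTINEL)


def worker(transform, inbox, outbox):
    """Pull items from inbox, apply transform, push the results to outbox."""
    while True:
        item = inbox.get()  # blocks until upstream has produced something
        if item is SENTINEL:
            outbox.put(SENTINEL)  # pass the shutdown signal downstream
            break
        outbox.put(transform(item))  # blocks while the downstream queue is full


def run_pipeline(chunks, stages, queue_size=10):
    """Run chunks through stages, one thread per stage, and collect the output."""
    queues = [queue.Queue(maxsize=queue_size) for _ in range(len(stages) + 1)]
    threads = [threading.Thread(target=feed, args=(chunks, queues[0]))]
    threads += [
        threading.Thread(target=worker, args=(stage, queues[i], queues[i + 1]))
        for i, stage in enumerate(stages)
    ]
    for thread in threads:
        thread.start()

    # The calling thread drains the last queue so the bounded queues never jam.
    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)

    for thread in threads:
        thread.join()
    return results
```

With one thread per stage, items come out in the order they went in. Only references move between queues, so the per-item cost is small compared with decoding or FFT work.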

Anyways, please give it a try and let me know if it works.

Paul

Jan 16, 2015, 4:31:50 PM
to lightsh...@googlegroups.com
I only have 8 channels active on both raspberries. But I am in the process of adding another 8 to my main one, driven by an MCP23017.
I can test it on that once I have the rest of the wiring done, but am not sure I will have it done this weekend; next week some time though.

cheers!
--
http://www.lightshowpi.com/
---
You received this message because you are subscribed to the Google Groups "LightShow Pi Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lightshowpi-d...@googlegroups.com.
To post to this group, send email to lightsh...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lightshowpi-dev/ca190826-83f7-4307-8a95-946df9383d57%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Tom Enos

Jan 17, 2015, 2:52:19 AM
to lightsh...@googlegroups.com
12-15% is a big jump.  My baseline was in the 10-12% range with the stable and master branches with 8 channels.  Lame was all over the place depending on what I was decoding.  I'll test out your code and let you know the results by the end of next week.  

Tom Enos

Jan 17, 2015, 9:09:15 PM
to lightsh...@googlegroups.com
It will not run for me. I deleted my sync file and it is still a no go.
Exception in thread Lightshow Worker:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/home/pi/lightshowpi/py/workers/lightshow_worker.py", line 29, in run
    self._update_lights(matrix, mean, std)
  File "/home/pi/lightshowpi/py/workers/lightshow_worker.py", line 48, in _update_lights
    brightness = matrix[i] - mean[i] + 0.5 * std[i]
IndexError: index out of bounds



overrides.cfg
[hardware]
devices = {
    "mcp23s17": [
        {
            "pinBase": "65",
            "spiPort": "0",
            "devId": "0"
        },
        {
            "pinBase": "100",
            "spiPort": "0",
            "devId": "1"
        },
        {
            "pinBase": "200",
            "spiPort": "0",
            "devId": "2"
        }
    ]
    }
gpio_pins = 65,66,67,68,69,70,71,72,100,101,102,103,104,105,106,107,200,201,202,203,204,205,206,207,73,74,75,76,77,78,79,80,108,109,110,111,112,113,114,115,208,209,210,211,212,213,214,215
pin_modes = onoff




Micah Wedemeyer

Jan 18, 2015, 8:22:35 AM
to lightsh...@googlegroups.com
Thanks for giving it a shot. I really appreciate that. I'll see if I can figure out what the issue is and push a fix.


Micah Wedemeyer

Jan 18, 2015, 9:18:52 AM
to lightsh...@googlegroups.com
Turns out I was not passing in the correct number of channels to the FftAnalyzer. Please update to the HEAD and try again.
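For anyone hitting the same IndexError: the failing line in the traceback indexes matrix, mean and std once per channel, so all three need one entry per configured channel. A hedged sketch of a guard that fails early with a readable message follows. The function name compute_brightness and its num_channels parameter are hypothetical, and only the arithmetic is taken from the traceback.

```python
def compute_brightness(matrix, mean, std, num_channels):
    """Return one brightness value per channel, failing early on a size mismatch."""
    for name, values in (("matrix", matrix), ("mean", mean), ("std", std)):
        if len(values) != num_channels:
            raise ValueError(
                "%s has %d entries but %d channels are configured"
                % (name, len(values), num_channels)
            )
    # Same arithmetic as the line shown in the traceback, once per channel.
    return [matrix[i] - mean[i] + 0.5 * std[i] for i in range(num_channels)]
```

A check like this turns a bare "index out of bounds" from a worker thread into a message that says which array is short.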

And again, I really appreciate you testing it out with more channels. That's a big help. I'm really curious to see if buffering and multi-threading can work in a high-channel setup.

Micah Wedemeyer

Jan 18, 2015, 1:30:42 PM
to lightsh...@googlegroups.com
Also, if you get any freezing or stuttering, try upping the queue_size in synchronized_lights from 10 to something like 50, or even 100. Especially if you see any log messages about the Lightshow worker starving.

This will use more memory, but allow for deeper buffering of the audio and light analysis data. CPU usage should spike at the start and then level out once the buffers fill up.
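Roughly what the buffer depth and the starving message amount to, as a sketch only. next_packet, the logger name and the timeout value are assumptions for illustration, not the names used in synchronized_lights.

```python
import logging
import queue

log = logging.getLogger("lightshow")  # hypothetical logger name


def next_packet(packets, timeout=0.05):
    """Return the next packet, or None (with a warning) if the queue ran dry."""
    try:
        return packets.get(timeout=timeout)
    except queue.Empty:
        # The consumer is ready but upstream has not produced anything yet.
        log.warning("Lightshow worker starving: no light data ready")
        return None


# A deeper buffer than 10: producers can run up to 50 packets ahead.
packets = queue.Queue(maxsize=50)
```

maxsize caps how far the producers can run ahead, so memory use grows with the queue size but stays bounded.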

If there are still stuttering issues after the buffer size is increased...well then maybe my idea is a no-go after all.

Micah

Tom Enos

Jan 18, 2015, 5:00:09 PM
to lightsh...@googlegroups.com
Okay, so it works now. I had to fix issue #61 first (that was my fault).
First test: 48 channels, onoff mode, sms disabled (had to alter the start scripts), no prepost shows. The only other things changed from the defaults are devices and the number of pins.

With 48 channels, performance took a big hit. Adjusting the queue size only matters while caching, and even then it makes little difference. Commenting out the logging statements made more of a difference.
But after caching, and with the logging back in, it was never starving with a queue size of 10. So the queue only affects caching. There is still a 10-15% increase in CPU usage after caching. (NOTE: you have a bug that makes it resave the cache even when you already have a valid sync file.)

With modified startup scripts and a few changes to check_sms I was able to get it working with sms, but only just. If check_sms had to process more than 1 or 2 messages, it would sometimes cause problems. The size of the playlist affects this too, as check_sms has to read and write the playlist from/to the SD card, and it starts to show at about 20-30 songs (I worked this out from testing on master while working out something else). But with a short playlist and only a few messages to process it will work.

It was no surprise that pwm playback did not work with 48 channels; it doesn't work with master or stable either.
But I narrowed the limit down to 22 channels, with a little stutter only every so often (maybe once in a song, and not every time). By comparison, I get the same results with master at 24 channels. And this is with sms disabled.

Tom Enos

Jan 18, 2015, 10:38:41 PM
to lightsh...@googlegroups.com
Well, I uninstalled pulseaudio and bluez and we both get a performance boost. I need to retest a few things, but I think we will both gain a few more channels in pwm mode and a lot more in onoff mode.

Your code is still a lot heavier, 10-15%, but without pulseaudio it at least now looks a lot better.

Micah Wedemeyer

Jan 19, 2015, 7:40:16 AM
to lightsh...@googlegroups.com
I'm actually curious as to why more channels make a difference. Unless the actual writes to the pins are somehow slow, it seems odd to me that going from 8 channels to 24 would make a big difference. 8 to 800 maybe, but 24 still seems low. Given the amount of code being executed and the timeframe involved, it seems like there should be plenty of headroom, even with only 700MHz.

To help me test with a realistic setup, one thought I had would be to simply duplicate pins in the config. So, for example, if I have:

gpio_pins = 1,2,3,4

then I could increase my channels with

gpio_pins = 1,1,1,2,2,2,3,3,3,4,4,4

Right? It seems like doing that will allow me to boost up to any number of channels I want and it should still be a valid test.
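A throwaway helper for building that line, as a sketch. duplicate_pins and to_config_line are made-up names, and the output just mirrors the gpio_pins format shown above.

```python
def duplicate_pins(pins, copies):
    """Repeat each pin number `copies` times, keeping the duplicates adjacent."""
    return [pin for pin in pins for _ in range(copies)]


def to_config_line(pins):
    """Format a pin list the way gpio_pins appears in overrides.cfg."""
    return "gpio_pins = " + ",".join(str(pin) for pin in pins)


print(to_config_line(duplicate_pins([1, 2, 3, 4], 3)))
# prints: gpio_pins = 1,1,1,2,2,2,3,3,3,4,4,4
```

One caveat, stated as a guess: duplicated native pins exercise the analysis path for more channels, but they may not reproduce whatever cost the writes to an SPI port expander add.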

Thanks again for your help,
Micah




Micah Wedemeyer

Jan 19, 2015, 11:00:05 AM
to lightsh...@googlegroups.com
I did duplicate my channels like I said, going up to 32 channels, and it definitely caused playback slowness. Looking in the logs, it seems like the lightshow playback worker is starving for light matrix data packets. I interpret that to mean that the FFT analysis is too slow, even when read from cache. That actually makes a lot of sense to me and would explain why performance degrades with more channels.

I may poke around in the FFT and see what I can see. I may also try to create some kind of dummy analyzer that just pumps out light matrix data as fast as possible (no FFT, no cache, just repeat copies of the same data) and see if stuttering goes away.
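Something along these lines is what I mean by a dummy analyzer. It is a sketch only: dummy_analyzer, stop_event and the constant level are invented for illustration, and the real light matrix format may differ.

```python
import queue
import threading


def dummy_analyzer(outbox, num_channels, stop_event, level=0.5):
    """Push identical light-matrix rows as fast as the queue will accept them."""
    row = (level,) * num_channels  # immutable, so sharing one object is safe
    while not stop_event.is_set():
        try:
            outbox.put(row, timeout=0.1)  # short timeout so we notice stop_event
        except queue.Full:
            continue  # consumer is behind; loop around and re-check stop_event


# Usage: run it in a thread in place of the FFT stage and watch the logs.
rows = queue.Queue(maxsize=10)
stop = threading.Event()
producer = threading.Thread(target=dummy_analyzer, args=(rows, 32, stop))
producer.start()
first = rows.get(timeout=1)
stop.set()
producer.join()
```

If the playback worker still starves with a producer this cheap, the FFT is probably not the bottleneck.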