Hey Guys,
My app sometimes hangs on startup. I hit it maybe 1 time in 10, but some other people get it 100% of the time.
My pthread knowledge is a little lacking... Does anyone see any issues?
Here's the high-level render loop:
1) In my Instance's init function, I call CallOnMainThread with a delay of 0, which calls my Update function for the first time.
2) In my Update function, I render to my Graphics2D context (render details below!) and then flush the context.
3) In the callback of my 2D context flush, I call my Update function again, which renders and flushes again. Then I get another callback, render and flush again, etc. (this is essentially the render loop).
Rendering details (my rendering is multithreaded):
At init (in the constructor; this is just setup for the render stuff)...
* I create a "BackToSleep" mutex and condition variable
* I get the number of processors on the system by calling sysconf(_SC_NPROCESSORS_ONLN)
* I create the same number of threads as there are processors. For each one I...
  * Create a "WakeUp" mutex and condition variable for this thread
  * Put a pointer to the "BackToSleep" mutex and condition variable in the parameters passed to the thread
  * Also put a pointer to a variable called "ThreadsInFlight" in those parameters, along with a pointer to the mutex that protects that variable
  * Lastly, start the thread
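A minimal sketch of that setup, to make the shape concrete. All the names here (SharedState, ThreadParams, WorkerCount, StartWorkers) are placeholders for illustration, not my real code, and the thread entry point is passed in as a parameter just to keep the sketch self-contained:

```cpp
#include <pthread.h>
#include <unistd.h>
#include <cassert>
#include <vector>

// Hypothetical layout: state shared by the main thread and every worker.
struct SharedState {
    pthread_mutex_t backToSleepMutex;
    pthread_cond_t  backToSleepCond;
    pthread_mutex_t threadsInFlightMutex;  // protects threadsInFlight
    unsigned        threadsInFlight;
};

// Hypothetical per-thread parameters: a private "WakeUp" pair plus a
// pointer back to the shared state.
struct ThreadParams {
    pthread_mutex_t wakeUpMutex;
    pthread_cond_t  wakeUpCond;
    SharedState*    shared;
};

// One worker per online processor; sysconf returns -1 on error,
// so fall back to a single worker in that case.
long WorkerCount() {
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

void InitShared(SharedState* s) {
    assert(s != nullptr);
    pthread_mutex_init(&s->backToSleepMutex, nullptr);
    pthread_cond_init(&s->backToSleepCond, nullptr);
    pthread_mutex_init(&s->threadsInFlightMutex, nullptr);
    s->threadsInFlight = 0;
}

// Creates the per-thread "WakeUp" objects and starts each worker.
// Returns false if any pthread_create call fails.
bool StartWorkers(SharedState* shared,
                  std::vector<ThreadParams>* params,
                  std::vector<pthread_t>* threads,
                  void* (*entry)(void*)) {
    assert(shared != nullptr && params != nullptr && threads != nullptr);
    const long n = WorkerCount();
    // Size both vectors up front so the pointers handed to the threads
    // are never invalidated by a later reallocation.
    params->resize(n);
    threads->resize(n);
    for (long i = 0; i < n; ++i) {
        ThreadParams& p = (*params)[i];
        pthread_mutex_init(&p.wakeUpMutex, nullptr);
        pthread_cond_init(&p.wakeUpCond, nullptr);
        p.shared = shared;
        if (pthread_create(&(*threads)[i], nullptr, entry, &p) != 0) {
            return false;
        }
    }
    return true;
}
```

The params vector has to outlive the threads, since each thread only holds a pointer into it.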
In my rendering, the screen is broken up into tiles. There's a queue of tiles to render, and each thread just grabs the next tile index whenever it's looking for work. When the threads are out of work, they go back to sleep.
When rendering (main thread)...
* I set a variable NextCellToRender to the number of screen tiles that there are
* I set the "ThreadsInFlight" variable to the number of threads that there are (this write is protected by the "ThreadsInFlight" mutex. Doesn't need to be, but doesn't hurt... the "ThreadsInFlight" value is actually part of a ThreadSafeUINT class)
* For each thread, I call pthread_cond_signal on that thread's "WakeUp" condition variable, to wake up all the threads.
* I then lock my "BackToSleep" mutex and wait on the "BackToSleep" condition variable. After that, of course, I unlock the "BackToSleep" mutex.
* Now that the threads have rendered all the screen tiles (which each have their own pixel buffer for their region of the screen), I gather up all the tiles and copy them onto the REAL pixel buffer (the one that gets flushed to the screen)
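For the last step, the gather is just a row-by-row copy of each tile into the frame. Here's a rough sketch, assuming 32-bit pixels stored row-major; the Tile struct and GatherTiles are made-up names for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical tile: its own pixel buffer plus where it sits in the frame.
struct Tile {
    int x, y;                      // top-left corner in the frame, in pixels
    int width, height;             // tile size in pixels
    std::vector<uint32_t> pixels;  // width * height pixels, row-major
};

// Copies every tile into the real pixel buffer, one row at a time.
// The frame is assumed to be frameWidth * frameHeight pixels, row-major.
void GatherTiles(const std::vector<Tile>& tiles,
                 uint32_t* frame, int frameWidth, int frameHeight) {
    assert(frame != nullptr);
    for (const Tile& t : tiles) {
        // Each tile must lie fully inside the frame and own a full buffer.
        assert(t.x >= 0 && t.y >= 0);
        assert(t.x + t.width <= frameWidth);
        assert(t.y + t.height <= frameHeight);
        assert(t.pixels.size() ==
               static_cast<std::size_t>(t.width) * t.height);
        for (int row = 0; row < t.height; ++row) {
            const uint32_t* src = t.pixels.data() +
                static_cast<std::size_t>(row) * t.width;
            uint32_t* dst = frame +
                static_cast<std::size_t>(t.y + row) * frameWidth + t.x;
            std::memcpy(dst, src, t.width * sizeof(uint32_t));
        }
    }
}
```

This only runs on the main thread, after the wait on "BackToSleep" returns, so it doesn't take any locks itself.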
Render Thread Function...
* I start out by casting the thread parameters to the appropriate struct type to get the parameters passed to the thread
* While(1)...
  * Lock the "WakeUp" mutex, wait on the "WakeUp" condition variable, then unlock the "WakeUp" mutex (i.e. it should sleep until woken up by the main thread)
  * I have a function that locks the "NextCellToRender" mutex, decrements the value if it is greater than zero, unlocks the mutex, and returns true if the number was > 0 before the decrement, else returns false. The thread calls this in a while loop (basically: while there is work to do, keep going)
  * Inside the while loop, it renders the current cell to the pixel buffer owned by the current screen cell
  * After the while loop that renders the screen cells, it locks the "ThreadsInFlight" mutex, decrements ThreadsInFlight if it's > 0, and unlocks the mutex.
  * If this thread is the one that decremented ThreadsInFlight to 0, that means it's the final thread that is awake, so it signals the "BackToSleep" condition variable, which should let the main thread wake up again and continue on its way.
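To make the work-grabbing part concrete, here's a stripped-down sketch of the two counters. Counter stands in for my ThreadSafeUINT class, and DecrementIfPositive and DrainTiles are placeholder names, not the real code. The condition variable wait and signal are left out on purpose; this only shows the locked decrements:

```cpp
#include <pthread.h>
#include <cassert>

// Stand-in for ThreadSafeUINT: a value plus the mutex that protects it.
struct Counter {
    pthread_mutex_t mutex;
    unsigned        value;
};

// Locks, decrements the value if it is > 0, unlocks.
// Returns true if the value was > 0 before the decrement.
// If 'after' is non-null it receives the value left in the counter.
bool DecrementIfPositive(Counter* c, unsigned* after) {
    assert(c != nullptr);
    pthread_mutex_lock(&c->mutex);
    const bool wasPositive = c->value > 0;
    if (wasPositive) {
        --c->value;
    }
    const unsigned now = c->value;
    pthread_mutex_unlock(&c->mutex);
    if (after != nullptr) {
        *after = now;
    }
    return wasPositive;
}

// One pass of a worker after it has been woken: render cells until the
// queue is empty, then check out of ThreadsInFlight.
// Returns true if this thread took ThreadsInFlight to 0, i.e. it is the
// one that would signal "BackToSleep".
bool DrainTiles(Counter* nextCellToRender, Counter* threadsInFlight,
                void (*renderCell)(unsigned cell, void* ctx), void* ctx) {
    unsigned cell = 0;
    while (DecrementIfPositive(nextCellToRender, &cell)) {
        renderCell(cell, ctx);  // cell indices run from N-1 down to 0
    }
    unsigned remaining = 0;
    const bool decremented = DecrementIfPositive(threadsInFlight, &remaining);
    return decremented && remaining == 0;
}
```

With 5 cells queued and ThreadsInFlight set to 2, the first call renders all 5 cells and returns false; a second call finds nothing to render and returns true.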
What do you guys think... do you see any dangerous areas that could be causing the hang?