Hi all,
I am experimenting with partial reconfiguration using ReconOS and Vivado. Before reconfiguring a slot, I suspend the running hardware thread with the function reconos_thread_suspend_block() (see reconos.c). When a thread is suspended, its delegate thread is likely to be inside a blocking function call (e.g. mbox_get). To interrupt the blocking call, the suspend function sends a signal to the delegate thread with pthread_kill(). It can now happen that the signal is sent and the signal handler is executed, but the blocking call is not interrupted and/or the delegate thread is not resumed.
In more detail:
I modified the hardware thread sources (VHDL) of the sortdemo so that the thread exits when HWT_signal is set. This is necessary to use the suspend function. To test the suspend function, I simply run the sortdemo and, at the end, call the suspend function on all hardware threads. Sometimes all HWTs are suspended successfully, sometimes not. After some test runs, I would say that using software threads significantly decreases the chance of success.
The code that sends the interrupt signal looks like this:
(from hwslot_suspendthread(struct hwslot *slot) in reconos.c)

    do {
        switch (slot->dt_state) {
            case DELEGATE_STATE_BLOCKED_OSIF:
                reconos_osif_break(slot->osif);
                break;

            case DELEGATE_STATE_BLOCKED_SYSCALL:
                pthread_kill(slot->dt, SIGUSR1);
                break;
        }

        sched_yield();
    } while (slot->dt_flags & DELEGATE_FLAG_SUSPEND);

I have never used signals in my own code before, but I suppose the rationale behind the snippet is to send the signal to interrupt the blocking call and then yield in the hope that the delegate is scheduled next, so that the suspend can continue. The signal handler prints a debug message, so with debug output enabled I can observe how often the handler is called. The delegate thread, in turn, also prints a debug message when its blocking call is interrupted.
Even when everything works fine, the signal handler is called roughly 800 to 8000 times while suspending a single HWT. Once the function call has been interrupted, the signal handler is no longer called, and the remainder of the suspend function runs to completion and successfully suspends the thread.
In many cases, especially when software threads are used, the suspend function does not terminate. In these situations the function call is never interrupted; instead, the signal handler is called continuously.
Does anyone have suggestions or ideas on how to solve the problem that the signal does not interrupt the blocking call? Since the problem seems to get worse as the number of additional software threads increases, I suspect that the scheduler's behavior has something to do with it, but I don't know whether this is a plausible explanation.
Thanks in advance! (And for reading this rather long text...)
Tim