Mac crash on timeout


Gonzalo Garramuno

Jan 3, 2021, 4:56:38 PM
to fltkc...@googlegroups.com
I am getting crashes in a timeout on macOS Big Sur on the second run of a movie that was removed and loaded again. The same code works fine on Linux.


repeat_timeout 0.0104167

AddressSanitizer:DEADLYSIGNAL
=================================================================
==2150==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x7fff204e2af6 bp 0x7ffeec2ad880 sp 0x7ffeec2ad880 T0)
==2150==The signal is caused by a READ memory access.
==2150==Hint: address points to the zero page.
#0 0x7fff204e2af6 in _CFGetNonObjCTypeID+0xa (CoreFoundation:x86_64h+0x14baf6)
#1 0x7fff20455141 in CFRunLoopTimerSetNextFireDate+0x39 (CoreFoundation:x86_64h+0xbe141)
#2 0x10479f8bb in Fl_Cocoa_Screen_Driver::repeat_timeout(double, void (*)(void*), void*)+0x5b (mrv-dbg:x86_64+0x100e538bb)
#3 0x1042d7c83 in mrv::ImageView::timeout() mrvImageView.cpp:3797
#4 0x10430d4b8 in mrv::ImageView::handle_timeout() mrvImageView.cpp:8082
#5 0x1042aab9d in mrv::static_timeout(mrv::ImageView*) mrvImageView.cpp:1675
#6 0x10479f7d4 in do_timer(__CFRunLoopTimer*, void*)+0x34 (mrv-dbg:x86_64+0x100e537d4)
#7 0x7fff204318fc in __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__+0x13 (CoreFoundation:x86_64h+0x9a8fc)
#8 0x7fff204313d7 in __CFRunLoopDoTimer+0x399 (CoreFoundation:x86_64h+0x9a3d7)
#9 0x7fff20430f31 in __CFRunLoopDoTimers+0x132 (CoreFoundation:x86_64h+0x99f31)
#10 0x7fff2041756e in __CFRunLoopRun+0x7d7 (CoreFoundation:x86_64h+0x8056e)
#11 0x7fff204166bd in CFRunLoopRunSpecific+0x232 (CoreFoundation:x86_64h+0x7f6bd)
#12 0x7fff28682fcf in RunCurrentEventLoopInMode+0x123 (HIToolbox:x86_64+0x30fcf)
#13 0x7fff28682c21 in ReceiveNextEventCommon+0x11a (HIToolbox:x86_64+0x30c21)
#14 0x7fff28682aee in _BlockUntilNextEventMatchingListInModeWithFilter+0x3f (HIToolbox:x86_64+0x30aee)
#15 0x7fff22c2ff84 in _DPSNextEvent+0x372 (AppKit:x86_64+0x3ef84)
#16 0x7fff22c2e74a in -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:]+0x555 (AppKit:x86_64+0x3d74a)
#17 0x1047a2955 in Fl_Cocoa_Screen_Driver::wait(double)+0x245 (mrv-dbg:x86_64+0x100e56955)
#18 0x10472514c in Fl::run()+0x1c (mrv-dbg:x86_64+0x100dd914c)
#19 0x1047010b4 in main main.cpp:523
#20 0x7fff2033b630 in start+0x0 (libdyld.dylib:x86_64+0x15630)

==2150==Register values:
rax = 0x9e49b50ccf28004e rbx = 0x00000001042aab60 rcx = 0x79c5804a578a830b rdx = 0x0000000000000000
rdi = 0x0000000000000000 rsi = 0x00000000850431b4 rbp = 0x00007ffeec2ad880 rsp = 0x00007ffeec2ad880
r8 = 0x0000035ad9784647 r9 = 0x00000000029e4b17 r10 = 0x000000010f03be00 r11 = 0x00000000ffffff00
r12 = 0x0000000000000000 r13 = 0x00001fffdd855d2c r14 = 0x000061a00002ee80 r15 = 0x00007ffeec2aeaa0
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (CoreFoundation:x86_64h+0x14baf6) in _CFGetNonObjCTypeID+0xa
==2150==ABORTING

#3 0x1042d7c83 in mrv::ImageView::timeout() mrvImageView.cpp:3797

This is the following line, which works fine until the crash:

std::cerr << "repeat_timeout " << delay << std::endl;
Fl::repeat_timeout( delay, (Fl_Timeout_Handler)static_timeout, this );

And static_timeout function is:
void static_timeout( mrv::ImageView* v )
{
v->handle_timeout(); // this function does a lot of things
}

Any ideas what I could try to narrow the problem down? The problem seems to be in the ObjC code but not sure how to debug that code.


Gonzalo Garramuno
ggar...@gmail.com




Ian MacArthur

Jan 3, 2021, 5:27:58 PM
to coredev fltk
On 3 Jan 2021, at 21:56, Gonzalo Garramuno wrote:
>
> This is the following line, which works fine until the crash:
>
> std::cerr << "repeat_timeout " << delay << std::endl;
> Fl::repeat_timeout( delay, (Fl_Timeout_Handler)static_timeout, this );
>
> And static_timeout function is:
> void static_timeout( mrv::ImageView* v )
> {
> v->handle_timeout(); // this function does a lot of things
> }
>
> Any ideas what I could try to narrow the problem down? The problem seems to be in the ObjC code but not sure how to debug that code.
>


I’ve got nothing to suggest regarding the debugging, though I wonder: is there any chance the repeat_timeout call is adding the timer into the list “again” before it has actually triggered?
Adding the same timer into the list multiple times has done weird stuff to me in the past...



Gonzalo Garramuño

Jan 3, 2021, 5:35:22 PM
to fltkc...@googlegroups.com

On 3/1/21 at 19:27, Ian MacArthur wrote:
> I’ve got nothing to suggest regarding the debugging, though I wonder: is there any chance the repeat_timeout call is adding the timer into the list “again” before it has actually triggered?
> Adding the same timer into the list multiple times has done weird stuff to me in the past...
>
Hmm... I did not check that.  I only checked add/remove pairs were
consistent and repeat was getting called in between those.  I'll check
if repeat timeout is called again before it has actually triggered.

Gonzalo Garramuño

Jan 3, 2021, 7:00:31 PM
to fltkc...@googlegroups.com

On 3/1/21 at 19:27, Ian MacArthur wrote:
> I’ve got nothing to suggest regarding the debugging, though I wonder: is there any chance the repeat_timeout call is adding the timer into the list “again” before it has actually triggered?
> Adding the same timer into the list multiple times has done weird stuff to me in the past...
I think you nailed it, although it isn't clear to me where in my code this
is happening.  I added Fl::has_timeout calls right before the
Fl::repeat_timeout calls and everything works fine now.

Albrecht Schlosser

Jan 4, 2021, 6:48:57 AM
to fltkc...@googlegroups.com
Hmm, why do you have multiple "repeat timeouts", i.e.
Fl::repeat_timeout() calls? Each Fl::repeat_timeout() call should only
be inside the timer callback that triggered it, so there's likely only
one close to the end (or maybe at the beginning) of the timer callback.
If you call Fl::repeat_timeout() inside the timer callback (once) it's
always OK. Everything else results in undefined behavior.

The documentation states:

"You may only call this method inside a timeout callback of the same
timer or at least a closely related timer, otherwise the timing accuracy
can't be improved and the behavior is undefined."

https://www.fltk.org/doc-1.4/classFl.html#ae5373d1d50c2b0ba38280d78bb6d2628
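
The rule quoted above can be sketched without FLTK at all. The following is a minimal toy model, not FLTK's implementation: everything in the `toy` namespace, plus `Player`, `on_tick` and `run_ticks`, is a hypothetical stand-in. Only the handler shape `void (*)(void*)` and the roles of Fl::add_timeout() and Fl::repeat_timeout() are taken from FLTK. It shows a timer started once from normal code and then re-armed exactly once at the end of its own callback.

```cpp
#include <cassert>
#include <vector>

// Same shape as FLTK's Fl_Timeout_Handler. Everything in `toy` is a
// hypothetical stand-in for illustration, not FLTK's implementation.
using Handler = void (*)(void*);

namespace toy {
struct Timer { double delay; Handler cb; void* data; };
std::vector<Timer> pending;   // timers waiting to fire
bool in_callback = false;     // true while a handler is running

// Start a timer from normal code (the role of Fl::add_timeout).
void add_timeout(double t, Handler cb, void* data) {
    pending.push_back({t, cb, data});
}

// Re-arm from inside the handler (the role of Fl::repeat_timeout).
void repeat_timeout(double t, Handler cb, void* data) {
    assert(in_callback && "re-arm only from inside the timer callback");
    pending.push_back({t, cb, data});
}

// Fire every timer that was pending when this call started.
void fire_pending() {
    std::vector<Timer> due;
    due.swap(pending);
    for (const Timer& t : due) {
        in_callback = true;
        t.cb(t.data);
        in_callback = false;
    }
}
}  // namespace toy

struct Player { int frames = 0; bool playing = true; };

// Handler with a single re-arm at the very end.
void on_tick(void* p) {
    Player* player = static_cast<Player*>(p);
    ++player->frames;
    if (player->playing)
        toy::repeat_timeout(0.0104167, on_tick, p);
}

// Start once with add_timeout, then let the handler keep itself going.
int run_ticks(int n) {
    toy::pending.clear();
    Player player;
    toy::add_timeout(0.0104167, on_tick, &player);
    for (int i = 0; i < n; ++i) toy::fire_pending();
    toy::pending.clear();   // drop the last re-armed timer before `player` goes away
    return player.frames;
}
```

With a single re-arm per callback, each pass over the pending list fires the handler once, so the frame count matches the number of passes.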


I'm also wondering how you use Fl::has_timeout() in this context? The
docs say "Returns true if the timeout exists and has not been called yet."

If it returns true you should not call Fl::repeat_timeout() for sure,
but if it returns false there are three possible situations:

(1) you are *inside* the timer callback and it's fine to call
Fl::repeat_timeout or

(2) you are *outside* the timer callback and

(2a) the timer exists and has not been called yet or
(2b) the timer does not exist

In neither case 2a nor 2b should you call Fl::repeat_timeout().

Basically only your code can tell if it's OK to call
Fl::repeat_timeout() and Fl::has_timeout() can't help here. I'm puzzled.
Or am I missing anything?

Albrecht Schlosser

Jan 4, 2021, 7:14:07 AM
to fltkc...@googlegroups.com
On 1/4/21 12:48 PM Albrecht Schlosser wrote:

> If you call Fl::repeat_timeout() inside the timer callback (once) it's
> always OK. Everything else results in undefined behavior.

Note: Emphasis on *(once)*.

> [...]
> Basically only your code can tell if it's OK to call
> Fl::repeat_timeout() and Fl::has_timeout() can't help here. I'm puzzled.
> Or am I missing anything?

Well, after thinking about it and looking closer at the backtrace you
(Gonzalo, the OP) posted it looks as if you're calling
Fl::repeat_timeout() twice *inside* the timer callback. In this case
Fl::has_timeout() would indeed return true after the first call and
could prevent calling Fl::repeat_timeout() twice. Still strange though
-- why would you have two distinct calls of Fl::repeat_timeout() inside
the same timer callback? If this is the case there must be some control
flow issues in your code which I'd suggest to check first.

Another question (wild guess): Do you call Fl::remove_timeout()
[correctly] in the scenario you described as "on the second run of a
movie that was removed and loaded again".

Gonzalo Garramuño

Jan 4, 2021, 8:12:01 AM
to fltkc...@googlegroups.com

On 4/1/21 at 09:14, Albrecht Schlosser wrote:
> On 1/4/21 12:48 PM Albrecht Schlosser wrote:
>
>> If you call Fl::repeat_timeout() inside the timer callback (once)
>> it's always OK. Everything else results in undefined behavior.
>
> Note: Emphasis on *(once)*.
>
>> [...]
>> Basically only your code can tell if it's OK to call
>> Fl::repeat_timeout() and Fl::has_timeout() can't help here. I'm
>> puzzled. Or am I missing anything?
>
> Well, after thinking about it and looking closer at the backtrace you
> (Gonzalo, the OP) posted it looks as if you're calling
> Fl::repeat_timeout() twice *inside* the timer callback. In this case
> Fl::has_timeout() would indeed return true after the first call and
> could prevent calling Fl::repeat_timeout() twice. Still strange though
> -- why would you have two distinct calls of Fl::repeat_timeout()
> inside the same timer callback? If this is the case there must be some
> control flow issues in your code which I'd suggest to check first.

Where did you see that in the stack trace?  AFAICT, it is only getting
called once in the stack trace.

The only weird thing is that I have three timeouts.  Two with the same
user data but pointing to different functions and one with different
function and different data.  These may be called independently at any
time and in reality most of the time only the two with the same data are
called at the same time but they go through different paths and
different repeat_timeouts.

>
> Another question (wild guess): Do you call Fl::remove_timeout()
> [correctly] in the scenario you described as "on the second run of a
> movie that was removed and loaded again".
>
Yes, AFAIK.


Albrecht Schlosser

Jan 4, 2021, 2:25:01 PM
to fltkc...@googlegroups.com
On 1/4/21 2:11 PM Gonzalo Garramuño wrote:
>
> On 4/1/21 at 09:14, Albrecht Schlosser wrote:
>> On 1/4/21 12:48 PM Albrecht Schlosser wrote:
>>
>> Well, after thinking about it and looking closer at the backtrace you
>> (Gonzalo, the OP) posted it looks as if you're calling
>> Fl::repeat_timeout() twice *inside* the timer callback. In this case
>> Fl::has_timeout() would indeed return true after the first call ...
>
> Where did you see that in the stack trace?  AFAICT, it is only getting
> called once in the stack trace.

Actually I don't see that it is really called twice. Sorry for the
confusion.

All I can see is that Fl::repeat_timeout() is called from within a
timeout handler function, based on this part of the stack trace:

> #2 0x10479f8bb in Fl_Cocoa_Screen_Driver::repeat_timeout(double, void (*)(void*), void*)+0x5b (mrv-dbg:x86_64+0x100e538bb)
> #3 0x1042d7c83 in mrv::ImageView::timeout() mrvImageView.cpp:3797
> #4 0x10430d4b8 in mrv::ImageView::handle_timeout() mrvImageView.cpp:8082
> #5 0x1042aab9d in mrv::static_timeout(mrv::ImageView*) mrvImageView.cpp:1675
> #6 0x10479f7d4 in do_timer(__CFRunLoopTimer*, void*)+0x34 (mrv-dbg:x86_64+0x100e537d4)
> #7 0x7fff204318fc in __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__+0x13 (CoreFoundation:x86_64h+0x9a8fc)

I conclude that do_timer() [presumably the macOS-specific FLTK function]
calls your static timer callback mrv::static_timeout(), which eventually
(indirectly) calls Fl_Cocoa_Screen_Driver::repeat_timeout(), which
finally causes the issue.

The fact that you wrote that adding Fl::has_timeout() helped led me
to *assume* that this was the second call and hence Fl::has_timeout()
would return true, after the first call had already added the same or a
similar timeout.

> The only weird thing is that I have three timeouts.  Two with the same
> user data but pointing to different functions and one with different
> function and different data.

This should be fine. Timer functions are (or at least should be, per
definition) distinguished by the combination of the function pointer and
the user data. userdata == NULL is a special case.
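
As a rough illustration of that identity rule, here is a small sketch in plain C++. It is an assumption-laden model, not FLTK code: `TimerKey`, `distinct_timers`, `thread_setup` and the three handler names are hypothetical, and the special NULL case is deliberately left out. It only shows that two registrations sharing user data but using different functions count as different timers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Handler = void (*)(void*);   // same shape as Fl_Timeout_Handler

// Hypothetical registry entry: a timer is identified by BOTH fields.
struct TimerKey {
    Handler cb;
    void* data;
    bool operator==(const TimerKey& o) const {
        return cb == o.cb && data == o.data;
    }
};

// Three unrelated handlers (illustrative names only).
void play_video(void* p) { *static_cast<int*>(p) += 1; }
void play_audio(void* p) { *static_cast<int*>(p) += 2; }
void refresh_ui(void* p) { *static_cast<int*>(p) += 3; }

// Count how many distinct timers a list of registrations amounts to.
std::size_t distinct_timers(const std::vector<TimerKey>& regs) {
    std::vector<TimerKey> seen;
    for (const TimerKey& k : regs) {
        assert(k.cb != nullptr);   // a timer without a callback makes no sense
        if (std::find(seen.begin(), seen.end(), k) == seen.end())
            seen.push_back(k);
    }
    return seen.size();
}

// The setup described in the thread: two timers share user data but use
// different functions; a third differs in both function and data.
std::size_t thread_setup() {
    static int view, panel;   // stand-ins for the user data objects
    std::vector<TimerKey> regs = {
        TimerKey{play_video, &view},
        TimerKey{play_audio, &view},
        TimerKey{refresh_ui, &panel},
    };
    return distinct_timers(regs);
}
```

Under this model the three timeouts never collide with each other; only registering the same function with the same data twice would produce a duplicate.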

> These may be called independently at any
> time and in reality most of the time only the two with the same data are
> called at the same time but they go through different paths and
> different repeat_timeouts.

Note that if these timers are instantiated for the first time they
should be added with Fl::add_timeout() although Fl::repeat_timeout()
would also kinda work but, as documented, would result in undefined
behavior (which means: anything could happen, even a system crash).

You still did not answer my question *how exactly* adding
Fl::has_timeout() helped in your case. I think I asked this, but if not,
please answer now: For which return value (true or false) did you
subsequently call Fl::repeat_timeout()? And, BTW, since you are
obviously using the 'userdata' argument in your timers you should also
include it in the call to Fl::has_timeout(). So although this might be
obvious, please elaborate to make sure we're not "thinking in the wrong
direction".

>> Another question (wild guess): Do you call Fl::remove_timeout()
>> [correctly] in the scenario you described as "on the second run of a
>> movie that was removed and loaded again".
>>
> Yes, AFAIK.

OK, then we can rule this out.

That all said, I don't know and can't effectively check the macOS code
[3], hence I can't definitely say that the macOS implementation is 100%
correct. (Neither can I say this for any FLTK code, you know...) Maybe
you found a bug, but I don't consider this very probable.

Finally, there are still some things unclear to me, regarding the timer
callbacks as such: I'm not sure if it is (or should be) allowed to queue
two timer callbacks with the same callback function and the same
userdata argument. As far as I can tell off the top of my head this
*might* be possible/allowed and (if true) should not result in any timer
problems on the FLTK library side. This seems at least reasonable for
the Unix/Linux implementation which schedules and triggers timer
callbacks by itself, but ISTR that Windows works differently (using system
timers). I have no idea how the macOS implementation works...

I'm now curious and intend to look into this more deeply, but very
likely not today.

Thinking out loud: Assuming someone managed to add two identical timer
callbacks, i.e. with the same callback and userdata [1] and a similar
time it could happen that both timer events would be serviced in the
same run of the FLTK event loop, i.e. the second timer callback would
see [2] the previously queued timer callback from the first timer event.
Generally this would always be the case, i.e. any timer callback would
see [2] the timer instance of the "other" timer that's still or already
active. Again, I'm not sure if this is allowed (i.e. documented to be
allowed) or of practical use at all. Needs some investigation and maybe
better documentation...

Footnotes:

[1] Note that userdata can be NULL which works like a wildcard in terms
of Fl::has_timeout()

[2] "see" in the sense that Fl::add_timeout() would return true.

[3] I'm not familiar with ObjC and I don't know the intrinsics of the
macOS specific system functions

Albrecht Schlosser

Jan 4, 2021, 2:31:24 PM
to fltkc...@googlegroups.com
CORRECTION:

On 1/4/21 8:24 PM Albrecht Schlosser wrote:
>
> Footnotes:
>
> [2] "see" in the sense that Fl::add_timeout() would return true.

should read: ... Fl::has_timeout() would return true.

Gonzalo Garramuno

Jan 5, 2021, 3:33:08 PM
to fltkc...@googlegroups.com


> On Jan 4, 2021, at 16:24, Albrecht Schlosser <Albrech...@online.de> wrote:
>
> You still did not answer my question *how exactly* adding Fl::has_timeout() helped in your case. I think I asked this, but if not, please answer now: For which return value (true or false) did you subsequently call Fl::repeat_timeout()? And, BTW, since you are obviously using the 'userdata' argument in your timers you should also include it in the call to Fl::has_timeout(). So although this might be obvious, please elaborate to make sure we're not "thinking in the wrong direction".

For the false case, like:

if ( ! Fl::has_timeout( handler, data ) )
{
    Fl::repeat_timeout( delay, handler, data );
}
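
Why such a guard can hide a double re-arm is easy to see in a toy model. The sketch below is not FLTK's implementation: the `toy` namespace, `rearm_once`, `rearm_twice` and `on_tick` are hypothetical names, and the assumption that an unguarded re-arm simply queues a duplicate is mine. It models two code paths that both try to re-arm the same (handler, data) pair during one callback.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Handler = void (*)(void*);   // same shape as Fl_Timeout_Handler

// Hypothetical stand-in for the pending-timer list; not FLTK's implementation.
namespace toy {
struct Timer { double delay; Handler cb; void* data; };
std::vector<Timer> pending;

// True if the same (handler, data) pair is already queued.
bool has_timeout(Handler cb, void* data) {
    return std::any_of(pending.begin(), pending.end(),
        [=](const Timer& t) { return t.cb == cb && t.data == data; });
}

// Assumption of this toy: re-arming queues blindly, duplicates included.
void repeat_timeout(double t, Handler cb, void* data) {
    assert(cb != nullptr);
    pending.push_back({t, cb, data});
}
}  // namespace toy

void on_tick(void*) {}

// Guarded re-arm: queue the timer only if the same (handler, data) pair
// is not already pending.
void rearm_once(double delay, Handler cb, void* data) {
    if (!toy::has_timeout(cb, data))
        toy::repeat_timeout(delay, cb, data);
}

// Two code paths both try to re-arm during one callback; returns how many
// timers end up queued.
std::size_t rearm_twice(bool guarded) {
    toy::pending.clear();
    static int view;   // stand-in for the user data object
    for (int path = 0; path < 2; ++path) {
        if (guarded) rearm_once(0.0104167, on_tick, &view);
        else         toy::repeat_timeout(0.0104167, on_tick, &view);
    }
    return toy::pending.size();
}
```

In this model the guard turns the second re-arm into a no-op. It avoids the symptom, but the control-flow path that reaches the re-arm twice is still there and is worth tracking down.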


Gonzalo Garramuno
ggar...@gmail.com




Ian MacArthur

Jan 5, 2021, 3:55:07 PM
to coredev fltk
On 5 Jan 2021, at 20:33, Gonzalo Garramuno wrote:
>
>
>
>> On Jan 4, 2021, at 16:24, Albrecht Schlosser wrote:
>>
>> You still did not answer my question *how exactly* adding Fl::has_timeout() helped in your case. I think I asked this, but if not, please answer now: For which return value (true or false) did you subsequently call Fl::repeat_timeout()? And, BTW, since you are obviously using the 'userdata' argument in your timers you should also include it in the call to Fl::has_timeout(). So although this might be obvious, please elaborate to make sure we're not "thinking in the wrong direction".
>
> For the false case, like:
>
> if ( ! Fl::has_timeout( handler, data ) )
> {
>     Fl::repeat_timeout( delay, handler, data );
> }
>


Looks about right to me - at least, that’s what I'd envisaged when Gonzalo posted that he tried it and it resolved (or avoided at any rate!) the issue...


Albrecht Schlosser

Jan 6, 2021, 7:37:45 AM
to fltkc...@googlegroups.com
Yes, that looks fine as long as it is called inside a callback of the
same timeout function ('handler') with the same userdata argument ('data').