Fl_Choice initiates infinite loop


Rob McDonald

Jun 15, 2022, 11:58:59 PM
to fltk.coredev
Clicking a Fl_Choice will kick off an infinite loop with the choice repeatedly incrementing and firing off repeated events.  Instead of incrementing from 0 to 1, it goes to 1, 2, 3, 4, 5, 6, etc.

This behavior depends on how long my program takes to respond to the event.  If my program is fast, the bug does not appear.  However, if my program takes 'too long', then the Fl_Choice sends another event and it slips into the infinite loop.

I manually bisected this behavior down to this commit:

29d9e31c51e6c  Consolidate timeout handling across platforms (#379) Albrecht Schlosser

The current tip of master (d8eb1f9ca46) still has this problem.

I am on MacOS.

I have not tried to duplicate this in any of the test programs - or to create a MWE.  I'm hopeful that this will start a conversation and perhaps someone will see the problem with the new timeout handling code.

Best,

Rob

Albrecht Schlosser

Jun 16, 2022, 10:54:21 AM
to fltkc...@googlegroups.com
On 6/16/22 05:58 Rob McDonald wrote:
> Clicking a Fl_Choice will kick off an infinite loop with the choice repeatedly incrementing and firing off repeated events.  Instead of incrementing from 0 to 1, it goes to 1, 2, 3, 4, 5, 6, etc.

Hmm, this sounds more like Fl_Counter rather than Fl_Choice. Correct?


> This behavior depends on how long my program takes to respond to the event.  If my program is fast, the bug does not appear.  However, if my program takes 'too long', then the Fl_Choice sends another event and it slips into the infinite loop.

I have a theory what *might* happen but I need more info.

(1) How are you handling the event? Are you using a callback?

(2) Do you create another window in your event (callback) handling?

(3) If yes, is this a modal window, maybe something like fl_message() or fl_ask() or one of the other common dialogs?

(4) If it's not (2) or (3), can you describe what your program does when it "takes 'too long' " ? Everything related to the event loop (Fl::wait, Fl::check etc.) would be important, as well as info about opening other windows.

More questions below...


> I manually bisected this behavior down to this commit:
>
> 29d9e31c51e6c  Consolidate timeout handling across platforms (#379) Albrecht Schlosser
>
> The current tip of master (d8eb1f9ca46) still has this problem.
>
> I am on MacOS.

Would your program run on Linux too? If yes, can you please test it on Linux and:

(5a) Does it exhibit the same behavior on Linux?

(5b) Does it also do this in commit cf4a832e6, the one before 29d9e31c51e6c?


> I have not tried to duplicate this in any of the test programs - or to create a MWE.  I'm hopeful that this will start a conversation and perhaps someone will see the problem with the new timeout handling code.

FWIW: I can replicate the issue with Fl_Counter (sic!) and calling fl_message() in the callback.
Minimal test case (counter.cxx) attached.

Note that this test program exhibits the issue on Linux (git current) and even before commit 29d9e31c51e6c. In fact, it's also "broken" in FLTK 1.3 (git branch-1.3 latest). I didn't bother to test on macOS (yet).

The fact that your program worked on macOS before changing the timeout handling was supposedly only luck (not your "fault" ;-) ). I didn't test my demo program on macOS yet, awaiting your response with more info (see questions above).

FYI: My demo program can be "fixed" with both changes given in the attached Fl_Counter.patch independently but this is only a first proof of concept, not a real solution. However, if your issue is similar to what I *guessed* then you might want to test the patch and report if any one of the changes (each one, separately) fixes the issue for you.

Looking forward to your reply. TIA.

counter.cxx
Fl_Counter.patch

Albrecht Schlosser

Jun 16, 2022, 11:08:44 AM
to fltkc...@googlegroups.com
On 6/16/22 16:54 Albrecht Schlosser wrote:
> On 6/16/22 05:58 Rob McDonald wrote:
>> Clicking a Fl_Choice will kick off an infinite loop with the choice
>> repeatedly incrementing and firing off repeated events.  Instead of
>> incrementing from 0 to 1, it goes to 1, 2, 3, 4, 5, 6, etc.
>
> Hmm, this sounds more like Fl_Counter rather than Fl_Choice. Correct?
>
> FWIW: I can replicate the issue with Fl_Counter (sic!) and calling
> fl_message() in the callback.
> Minimal test case (counter.cxx) attached.
>
> Note that this test program exhibits the issue on Linux (git current)
> and even before commit 29d9e31c51e6c. In fact, it's also "broken" in
> FLTK 1.3 (git branch-1.3 latest). I didn't bother to test on macOS (yet).

Note for testers: once you trigger the loop you can either press and
hold the "Escape" key to stop the program or you need to abort (kill)
the program otherwise.

Note also that FLTK 1.4 creates lots of individual fl_message() windows
whereas FLTK 1.3 creates only one window (that's an intended change in
FLTK 1.4).

Rob McDonald

Jun 16, 2022, 12:56:42 PM
to fltk.coredev
On Thursday, June 16, 2022 at 7:54:21 AM UTC-7 Albrecht Schlosser wrote:
> On 6/16/22 05:58 Rob McDonald wrote:
>> Clicking a Fl_Choice will kick off an infinite loop with the choice repeatedly incrementing and firing off repeated events.  Instead of incrementing from 0 to 1, it goes to 1, 2, 3, 4, 5, 6, etc.
>
> Hmm, this sounds more like Fl_Counter rather than Fl_Choice. Correct?

Sorry about that -- my mistake.  I don't know how I got so crossed up, my fingers must have auto-completed...

Yes, this is all about Fl_Counter (not choice). 

 
>> This behavior depends on how long my program takes to respond to the event.  If my program is fast, the bug does not appear.  However, if my program takes 'too long', then the Fl_Choice sends another event and it slips into the infinite loop.
>
> I have a theory what *might* happen but I need more info.
>
> (1) How are you handling the event? Are you using a callback?

Yes, I have a callback.

 
> (2) Do you create another window in your event (callback) handling?

No additional FLTK calls should be made.  No window should be opened.
 
> (3) If yes, is this a modal window, maybe something like fl_message() or fl_ask() or one of the other common dialogs?
>
> (4) If it's not (2) or (3), can you describe what your program does when it "takes 'too long' " ? Everything related to the event loop (Fl::wait, Fl::check etc.) would be important, as well as info about opening other windows.

I do a bunch of math to regenerate a Bezier surface.  It should not touch anything else FLTK related.  If the Bezier surface is a simple one (fast), the problem does not occur.  If the case happens to be sufficiently complex (slow) then the problem does occur.

 
> More questions below...
>
>> I manually bisected this behavior down to this commit:
>>
>> 29d9e31c51e6c  Consolidate timeout handling across platforms (#379) Albrecht Schlosser
>>
>> The current tip of master (d8eb1f9ca46) still has this problem.
>>
>> I am on MacOS.
>
> Would your program run on Linux too? If yes, can you please test it on Linux and:

It will take me a while to get a Linux build set up to use an arbitrary version of FLTK.  I will see what I can do. 


 
> (5a) Does it exhibit the same behavior on Linux?
>
> (5b) Does it also do this in commit cf4a832e6, the one before 29d9e31c51e6c?

I'll let you know.  On my Mac, cf4a832e6 works fine.
 
>> I have not tried to duplicate this in any of the test programs - or to create a MWE.  I'm hopeful that this will start a conversation and perhaps someone will see the problem with the new timeout handling code.
>
> FWIW: I can replicate the issue with Fl_Counter (sic!) and calling fl_message() in the callback.
> Minimal test case (counter.cxx) attached.
 
You might just put a sleep of some sort -- my expectation is that a wait on the order of one second is enough to get the loop going.

 
> Note that this test program exhibits the issue on Linux (git current) and even before commit 29d9e31c51e6c. In fact, it's also "broken" in FLTK 1.3 (git branch-1.3 latest). I didn't bother to test on macOS (yet).
>
> The fact that your program worked on macOS before changing the timeout handling was supposedly only luck (not your "fault" ;-) ). I didn't test my demo program on macOS yet, awaiting your response with more info (see questions above).
>
> FYI: My demo program can be "fixed" with both changes given in the attached Fl_Counter.patch independently but this is only a first proof of concept, not a real solution. However, if your issue is similar to what I *guessed* then you might want to test the patch and report if any one of the changes (each one, separately) fixes the issue for you.
>
> Looking forward to your reply. TIA.

I will give it a shot and report back.  Thanks,

Rob

 

Rob McDonald

Jun 16, 2022, 2:00:11 PM
to fltk.coredev
On Thursday, June 16, 2022 at 7:54:21 AM UTC-7 Albrecht Schlosser wrote:

> FWIW: I can replicate the issue with Fl_Counter (sic!) and calling fl_message() in the callback.
> Minimal test case (counter.cxx) attached.
>
> Note that this test program exhibits the issue on Linux (git current) and even before commit 29d9e31c51e6c. In fact, it's also "broken" in FLTK 1.3 (git branch-1.3 latest). I didn't bother to test on macOS (yet).
>
> The fact that your program worked on macOS before changing the timeout handling was supposedly only luck (not your "fault" ;-) ). I didn't test my demo program on macOS yet, awaiting your response with more info (see questions above).
>
> FYI: My demo program can be "fixed" with both changes given in the attached Fl_Counter.patch independently but this is only a first proof of concept, not a real solution. However, if your issue is similar to what I *guessed* then you might want to test the patch and report if any one of the changes (each one, separately) fixes the issue for you.
>
> Looking forward to your reply. TIA.

On my Mac, neither of the proposed fixes helps the situation.

I tried it with both fixes -- and it _maybe_ made a slight improvement.  The first time I pressed the button, I had success, but the second time, it initiated the infinite loop.

Rob
 

Albrecht Schlosser

Jun 16, 2022, 6:00:55 PM
to fltkc...@googlegroups.com
OK, taking your earlier reply into account, I changed the demo program to just (u)sleep() inside the callback and can trigger the issue in a reproducible way. This may be different than your real case but it shows the issue. I'm attaching the modified demo program counter.cxx. I know what is causing the issue, i.e. what happens once it gets triggered, but I don't know yet how to fix it. I'll look deeper into it tomorrow.

What I found out so far:

(1) My first assumption was that the Fl_Counter widget misses the FL_RELEASE event and this can easily be demonstrated by opening a modal window inside the callback. This is test case 1 [change '#if (0)' to '#if (1)' ...] and can be fixed by my previously posted patch.

(2) My new test program works by sleeping at least 100 ms inside the callback. It seems necessary to hold the Fl_Counter button pressed at least this time so the internal repeat timer gets called before the FL_RELEASE event is delivered. This restarts the repeat timer and triggers the (in this case potentially infinite) loop. This happens on both Linux and macOS.

(3) Testing on the Mac I could confirm that commit cf4a832e6 (the one before the timeout changes) does not have this issue.

(4) The problem is obviously in the code of Fl_Counter. It's not a direct regression introduced by the timeout changes but it is revealed by those.

(5) Another attempt to fix all known issues is my new patch Fl_Counter_v2.patch (attached). It's likely not the final solution (fixing symptoms only) but it's a start.

Rob, can you please:

(a) test and (hopefully) confirm that this patch fixes your issue too?

(b) open a GitHub Issue describing the issue in a short form so the issue won't be forgotten. You can refer to this discussion for further information [1]. I will investigate further and (hopefully, again) find a proper solution later.

Thanks for finding the issue and your support in testing.

[1] Link: https://groups.google.com/g/fltkcoredev/c/daKlBxeOJVk/m/D-MPHPiCAAAJ

counter.cxx
Fl_Counter_v2.patch

Albrecht Schlosser

Jun 16, 2022, 6:12:03 PM
to fltkc...@googlegroups.com
On 6/17/22 00:00 Albrecht Schlosser wrote:
> What I found out so far:
...
>
> (4) The problem is obviously in the code of Fl_Counter. It's not a
> direct regression introduced by the timeout changes but it is revealed
> by those.
>
> (5) Another attempt to fix all known issues is my new patch
> Fl_Counter_v2.patch (attached). It's likely not the final solution
> (fixing symptoms only) but it's a start.

@Rob: I don't think it's necessary to test your program on Linux, you
can save that extra work. Thanks.

Rob McDonald

Jun 16, 2022, 7:36:36 PM
to fltk.coredev
On Thursday, June 16, 2022 at 3:00:55 PM UTC-7 Albrecht Schlosser wrote:
> OK, taking your earlier reply into account, I changed the demo program to just (u)sleep() inside the callback and can trigger the issue in a reproducible way. This may be different than your real case but it shows the issue. I'm attaching the modified demo program counter.cxx. I know what is causing the issue, i.e. what happens once it gets triggered, but I don't know yet how to fix it. I'll look deeper into it tomorrow.
>
> What I found out so far:
>
> (1) My first assumption was that the Fl_Counter widget misses the FL_RELEASE event and this can easily be demonstrated by opening a modal window inside the callback. This is test case 1 [change '#if (0)' to '#if (1)' ...] and can be fixed by my previously posted patch.
>
> (2) My new test program works by sleeping at least 100 ms inside the callback. It seems necessary to hold the Fl_Counter button pressed at least this time so the internal repeat timer gets called before the FL_RELEASE event is delivered. This restarts the repeat timer and triggers the (in this case potentially infinite) loop. This happens on both Linux and macOS.
>
> (3) Testing on the Mac I could confirm that commit cf4a832e6 (the one before the timeout changes) does not have this issue.
>
> (4) The problem is obviously in the code of Fl_Counter. It's not a direct regression introduced by the timeout changes but it is revealed by those.
>
> (5) Another attempt to fix all known issues is my new patch Fl_Counter_v2.patch (attached). It's likely not the final solution (fixing symptoms only) but it's a start.
>
> Rob, can you please:
>
> (a) test and (hopefully) confirm that this patch fixes your issue too?

I tried this patch (thanks for the prompt work).  It helps, but isn't a fix...

When I click the counter, it now increments two values (usually) and then stops.  It doesn't skip from 2 to 4, I see the intermediate 3.  It goes 2, 3, 4 and then stops -- for a single click.

Occasionally (once out of 20+ attempts) it will only increment by one (my sequence switched from even to odd).

 
> (b) open a GitHub Issue describing the issue in a short form so the issue won't be forgotten. You can refer to this discussion for further information [1]. I will investigate further and (hopefully, again) find a proper solution later.
>
> Thanks for finding the issue and your support in testing.
>
> [1] Link: https://groups.google.com/g/fltkcoredev/c/daKlBxeOJVk/m/D-MPHPiCAAAJ

I'll create the issue.  Thanks for your prompt attention to this.

Rob

 

Bill Spitzak

Jun 16, 2022, 7:58:40 PM
to fltkc...@googlegroups.com
I think what is happening is the RELEASE event is getting lost, so the counter thinks the button is pressed and keeps incrementing.

It is probably worth while to figure out why the event is not getting delivered, but it would be possible for the counter to check if the mouse is currently held down before repeating the timeout.


--
You received this message because you are subscribed to the Google Groups "fltk.coredev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fltkcoredev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/fltkcoredev/11c023a5-fe80-4b71-9d35-ec5dd8e9f1c1n%40googlegroups.com.

Albrecht Schlosser

Jun 17, 2022, 6:25:38 AM
to fltkc...@googlegroups.com
On 6/17/22 01:36 Rob McDonald wrote:
> On Thursday, June 16, 2022 at 3:00:55 PM UTC-7 Albrecht Schlosser wrote:
>> (5) Another attempt to fix all known issues is my new patch Fl_Counter_v2.patch (attached). It's likely not the final solution (fixing symptoms only) but it's a start.
>>
>> Rob, can you please:
>>
>> (a) test and (hopefully) confirm that this patch fixes your issue too?
>
> I tried this patch (thanks for the prompt work).  It helps, but isn't a fix...
>
> When I click the counter, it now increments two values (usually) and then stops.  It doesn't skip from 2 to 4, I see the intermediate 3.  It goes 2, 3, 4 and then stops -- for a single click.
>
> Occasionally (once out of 20+ attempts) it will only increment by one (my sequence switched from even to odd).

In my demo I need to hold the mouse button for a "significant" time. This time is presumably longer than 0.1 sec which is the repeat timer of Fl_Counter. This can be seen clearly in the existing code: if the FL_RELEASE event is delivered before the timer event the internal variable `mouseobj` is reset and the timer repetition is terminated.

Is it possible that your mouse button "click" lasts longer than 0.1 sec?
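
For readers following along, the ordering described above can be modelled without FLTK at all. This is only a toy sketch: `ToyCounter`, `push()`, `release()` and `repeat_timer()` are made-up names and not the actual Fl_Counter code; only the idea of a `mouseobj` state that FL_RELEASE resets is taken from the description above.

```cpp
#include <cassert>

// Toy stand-in for the repeat logic described above; NOT the real Fl_Counter.
struct ToyCounter {
    int value = 0;
    int mouseobj = 0;  // 0 = no arrow button active, non-zero = which one

    // Mouse button goes down on arrow button `which` (must be non-zero).
    void push(int which) {
        assert(which != 0);
        mouseobj = which;
        ++value;  // the first step happens right away
    }

    // The release event reaches the widget and clears the state.
    void release() { mouseobj = 0; }

    // Repeat timer fires. Returns true if it would re-arm itself.
    bool repeat_timer() {
        if (mouseobj == 0) return false;  // release came first: stop
        ++value;
        return true;
    }
};
```

If `release()` runs before `repeat_timer()`, the value changes once and the repetition ends. If the timer gets in first, every tick adds another step until a release finally arrives.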



>> (b) open a GitHub Issue describing the issue in a short form so the issue won't be forgotten. You can refer to this discussion for further information [1]. I will investigate further and (hopefully, again) find a proper solution later.
>>
>> Thanks for finding the issue and your support in testing.
>>
>> [1] Link: https://groups.google.com/g/fltkcoredev/c/daKlBxeOJVk/m/D-MPHPiCAAAJ
>
> I'll create the issue.  Thanks for your prompt attention to this.

Thanks for opening the issue.

Albrecht Schlosser

Jun 17, 2022, 6:44:05 AM
to fltkc...@googlegroups.com
On 6/17/22 01:58 Bill Spitzak wrote:
> I think what is happening is the RELEASE event is getting lost, so the
> counter thinks the button is pressed and keeps incrementing.

Yes, that was my first thought too and my first demo program opened a
modal window in the callback which directs the FL_RELEASE event to the
modal window and thus it is lost for the Fl_Counter widget. But this is
only one of the issues.

> It is probably worth while to figure out why the event is not getting
> delivered,

See above for one possible reason. But unfortunately that's not all that
can happen. I'm currently investigating another issue...

> ... but it would be possible for the counter to check if the mouse is
> currently held down before repeating the timeout.

That's something I'm definitely going to look at.

imm

Jun 17, 2022, 7:14:00 AM
to coredev fltk
On Fri, 17 Jun 2022 at 11:44, Albrecht Schlosser wrote:
>
> On 6/17/22 01:58 Bill Spitzak wrote:
> > I think what is happening is the RELEASE event is getting lost, so the
> > counter thinks the button is pressed and keeps incrementing.
>
> Yes, that was my first thought too and my first demo program opened a
> modal window in the callback which directs the FL_RELEASE event to the
> modal window and thus it is lost for the Fl_Counter widget. But this is
> only one of the issues.
>

So... I haven't really been following this, but I'm concerned that
having a callback that runs for a significant time period might be a
tricky problem to solve anyway.

Whilst the button callback is "in flight" it seems that there could,
potentially, always be a risk of missing some events being delivered.
I think Windows tends to queue 'em all up, but I'm not sure to what
extent that happens for other OS, or how robust that really is...
Certainly on Windows if you block event delivery for "long enough"
things can start to get weird...

So it is useful to understand how/why the rebase event is getting lost
but I'm also concerned that whatever we do may not be able to fix all
the cases.

Rob, if you break the timing dependency out - that is, have the button
cb return "instantly" and then perform the time consuming
computations in a different thread (say) does that then make things
"work" - i.e. is it possible to demonstrate that it is timing that
is the crux here, or is there something more going on?



> > ... but it would be possible for the counter to check if the mouse is
> > currently held down before repeating the timeout.
>
> That's something I'm definitely going to look at.

Though, reading this (and not knowing what is happening underneath!) I
was concerned that this may itself induce some aberrant behaviours...

Rob McDonald

Jun 17, 2022, 11:23:59 AM
to fltk.coredev
On Friday, June 17, 2022 at 3:25:38 AM UTC-7 Albrecht Schlosser wrote:
> On 6/17/22 01:36 Rob McDonald wrote:
>> On Thursday, June 16, 2022 at 3:00:55 PM UTC-7 Albrecht Schlosser wrote:
>>> (5) Another attempt to fix all known issues is my new patch Fl_Counter_v2.patch (attached). It's likely not the final solution (fixing symptoms only) but it's a start.
>>>
>>> Rob, can you please:
>>>
>>> (a) test and (hopefully) confirm that this patch fixes your issue too?
>>
>> I tried this patch (thanks for the prompt work).  It helps, but isn't a fix...
>>
>> When I click the counter, it now increments two values (usually) and then stops.  It doesn't skip from 2 to 4, I see the intermediate 3.  It goes 2, 3, 4 and then stops -- for a single click.
>>
>> Occasionally (once out of 20+ attempts) it will only increment by one (my sequence switched from even to odd).
>
> In my demo I need to hold the mouse button for a "significant" time. This time is presumably longer than 0.1 sec which is the repeat timer of Fl_Counter. This can be seen clearly in the existing code: if the FL_RELEASE event is delivered before the timer event the internal variable `mouseobj` is reset and the timer repetition is terminated.
>
> Is it possible that your mouse button "click" lasts longer than 0.1 sec?

I am just clicking the mouse in a normal way.  I am not dwelling on the button in any way.  I don't have a millisecond stopwatch connected to my mouse finger -- but I'm not doing anything unusual as a user. 

Is a counter supposed to repeat events if you hold down the button?  I guess I'm failing to see how it doesn't behave like a normal button press and why we don't see this problem for all kinds of FLTK events.

If it is 'supposed' to repeat, then I should probably avoid Fl_Counter in the first place and just use normal buttons.

Rob


Rob McDonald

Jun 17, 2022, 11:37:26 AM
to fltk.coredev
I don't see a practical way to do this (certainly not in the short term).  My program is not thread safe and we take a long time to come back in a number of situations.  This program and its ancestors have been using FLTK (and XForms and SGI Forms before that) since the early 1990's -- at different times, computers get faster, but our calculations also get more complex.  Most things maintain the illusion of happening 'instantly', but users can create arbitrarily complex models -- which can take time to update.

That said, I can certainly load up a trivially simple model -- and when I do that, I never see the repeated events or the infinite loop.  Everything works as normal.  So while I can't return from the callback "instantly" by spawning a new thread -- if I return "fast enough", I never see the problem.

I will certainly take free advice -- long-term, how would I go about re-designing my program to improve this interactivity?  I imagine a separate thread for most of my program -- with a wait loop listening for events -- then FLTK sends events over to that separate thread.  It seems to me that this would not improve the user's experience at all.  While the GUI would return instantly and allow the user to continue queueing up commands -- the background engine would still take time and would have to handle those requests in series (and in the order they were given).
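
To make the design I'm imagining concrete, here is a rough sketch in plain standard C++, with no FLTK in it. `SerialWorker` and `submit()` are hypothetical names, not anything from FLTK or from my program. The GUI callback would only enqueue a job and return; a single background thread runs the jobs one at a time, in the order they were given.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical helper: runs submitted jobs in order on one background thread.
class SerialWorker {
public:
    SerialWorker() : thread_([this] { run(); }) {}

    // Finishes all queued jobs, then stops the thread.
    ~SerialWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        thread_.join();
    }

    // Called from the GUI side; returns immediately.
    void submit(std::function<void()> job) {
        assert(job != nullptr);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so submit() never blocks on a job
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so the members above exist first
};
```

This only shows the queueing. Getting results back to the GUI thread safely (FLTK has Fl::awake() for that) and making the rest of the program thread safe are the hard parts, and the sketch does nothing about them.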

Rob

Albrecht Schlosser

Jun 17, 2022, 12:49:27 PM
to fltkc...@googlegroups.com
On 6/17/22 17:23 Rob McDonald wrote:
> On Friday, June 17, 2022 at 3:25:38 AM UTC-7 Albrecht Schlosser wrote:
>> On 6/17/22 01:36 Rob McDonald wrote:
>>> On Thursday, June 16, 2022 at 3:00:55 PM UTC-7 Albrecht Schlosser wrote:
>>>> (5) Another attempt to fix all known issues is my new patch Fl_Counter_v2.patch (attached). It's likely not the final solution (fixing symptoms only) but it's a start.
>>>>
>>>> Rob, can you please:
>>>>
>>>> (a) test and (hopefully) confirm that this patch fixes your issue too?
>>>
>>> I tried this patch (thanks for the prompt work).  It helps, but isn't a fix...
>>>
>>> When I click the counter, it now increments two values (usually) and then stops.  It doesn't skip from 2 to 4, I see the intermediate 3.  It goes 2, 3, 4 and then stops -- for a single click.
>>>
>>> Occasionally (once out of 20+ attempts) it will only increment by one (my sequence switched from even to odd).
>>
>> In my demo I need to hold the mouse button for a "significant" time. This time is presumably longer than 0.1 sec which is the repeat timer of Fl_Counter. This can be seen clearly in the existing code: if the FL_RELEASE event is delivered before the timer event the internal variable `mouseobj` is reset and the timer repetition is terminated.
>>
>> Is it possible that your mouse button "click" lasts longer than 0.1 sec?
>
> I am just clicking the mouse in a normal way.  I am not dwelling on the button in any way.  I don't have a millisecond stopwatch connected to my mouse finger -- but I'm not doing anything unusual as a user.

Yes, I see this too now on macOS under certain conditions. I pushed two commits that solve (AFAICT) all problems I could see on Linux and Windows (the latter tested only briefly using Wine on Linux). There was indeed an issue with the new timeout handling (thanks for pointing this out) but there was also something going awry in the Fl_Counter code. The two mentioned commits fix the obvious issues.

Please update your FLTK repo and try the latest commits with your program and please try also my latest demo program which I posted to GitHub issue #450 which you created.

The behavior on macOS is weird and I'd like to get feedback from others testing my tiny demo program on their macOS systems. Instructions can be found at https://github.com/fltk/fltk/issues/450#issuecomment-1159038176 . Thanks in advance for all feedback.


> Is a counter supposed to repeat events if you hold down the button?  I guess I'm failing to see how it doesn't behave like a normal button press and why we don't see this problem for all kinds of FLTK events.

Yes, it is supposed to repeat if you keep the mouse button pressed: once every 0.1 seconds after an initial delay of 0.5 seconds. This is also true if you drag the mouse over the buttons (while keeping the mouse button down, of course).
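
As a back-of-the-envelope illustration of that schedule, here is a tiny standalone helper. It is an idealized sketch, not FLTK code: `callbacks_for_hold()` is a made-up name, and it assumes one callback at press time, the first repeat 500 ms later, further repeats every 100 ms, and callbacks that take no time at all.

```cpp
#include <cassert>

// Idealized count of callbacks for a button held `hold_ms` milliseconds,
// assuming one callback at press time, the first repeat 500 ms later and
// further repeats every 100 ms, with callbacks that take no time.
int callbacks_for_hold(int hold_ms) {
    assert(hold_ms >= 0);
    const int initial_delay_ms = 500;
    const int repeat_ms = 100;
    int count = 1;  // the callback triggered by the press itself
    // Repeats fire at 500, 600, 700, ... ms while the button is still down.
    for (int t = initial_delay_ms; t < hold_ms; t += repeat_ms) ++count;
    return count;
}
```

Under these assumptions an ordinary click of well under half a second gives exactly one callback, and holding the button for a full second gives six.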


> If it is 'supposed' to repeat, then I should probably avoid Fl_Counter in the first place and just use normal buttons.

What did you want to achieve with the Fl_Counter in the first place? Do you handle different buttons differently?

Anyway, clicking on one of the buttons in a "normal" way should never trigger the auto-repeat function and this is what I don't understand (yet) in the macOS implementation. This is really a strange and unexpected behavior.

Bill Spitzak

Jun 17, 2022, 1:08:17 PM
to fltkc...@googlegroups.com
I'm pretty certain all platforms queue up events. X certainly does.

However it is not hard for bugs or badly-written code to cause the RELEASE event to be sent to the wrong place.

It probably is best for the counter (and anywhere else timeouts are repeated) to check if the mouse button is still held down. This indicator is set when the event is processed even if the event is not delivered to the right place.



Albrecht Schlosser

Jun 17, 2022, 1:10:37 PM
to fltkc...@googlegroups.com
On 6/17/22 13:16 imm wrote:
> On Fri, 17 Jun 2022 at 11:44, Albrecht Schlosser wrote:
>> On 6/17/22 01:58 Bill Spitzak wrote:
>>> I think what is happening is the RELEASE event is getting lost, so the
>>> counter thinks the button is pressed and keeps incrementing.
>> Yes, that was my first thought too and my first demo program opened a
>> modal window in the callback which directs the FL_RELEASE event to the
>> modal window and thus it is lost for the Fl_Counter widget. But this is
>> only one of the issues.
>>
> So... I haven't really been following this, but I'm concerned that
> having a callback that runs for a significant time period might be a
> tricky problem to solve anyway.

Yes, sure, but the library should handle it gracefully - as far as possible.

> Whilst the button callback is "in flight" it seems that there could,
> potentially, always be a risk of missing some events being delivered.
> I think Windows tends to queue 'em all up, but I'm not sure to what
> extent that happens for other OS, or how robust that really is...
> Certainly on Windows if you block event delivery for "long enough"
> things can start to get weird...
>
> So it is useful to understand how/why the release [typo corrected] event is getting lost
> but I'm also concerned that whatever we do may not be able to fix all
> the cases.

Most of the issues are by now well understood (by me ;-) ).

One point was that the current timeout handling could cause repeated
timeouts to fire again and again w/o intervening event handling which -
in this case - would miss the FL_RELEASE event because all the timeouts
ran in sequence based on just the button click.

To be more specific: if the timeout callback takes 50 ms and then calls
Fl::repeat_timeout(0.1, ...), what should happen? For the best accuracy
the new timer is internally corrected to trigger after 100 ms - 50 ms =
50 ms. This has always been the case on Linux but not on other platforms.

In our case the repetition timeout is 0.1 seconds. What if the callback
takes more than 0.1 seconds? In this case Fl::repeat_timeout() would
calculate 0 or a negative value which triggers the timer immediately.
This is basically what happened in my demo program. The fact (bug) that
the next timer callback could be called w/o intervening event handling
caused the problem that the FL_RELEASE event would never be seen as long
as the timer would repeat. This bug has been fixed now.
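
For illustration only, the arithmetic and the starvation effect can be
reproduced in a few lines of standalone C++. This is a toy model under
stated assumptions, not the FLTK implementation: the function names are
made up, a negative delay is simply clamped to zero, and the "event
queue" is reduced to a single release at a known time.

```cpp
#include <algorithm>
#include <cassert>

// Delay a drift-correcting repeat timer would schedule: the nominal
// interval minus the time the callback already used, clamped at zero.
int corrected_delay_ms(int interval_ms, int callback_ms) {
    assert(interval_ms > 0 && callback_ms >= 0);
    return std::max(0, interval_ms - callback_ms);
}

// Toy event loop: counts timer callbacks until the release (which the
// user performed at release_ms) is noticed, or until `cap` is reached.
int callbacks_until_release(int callback_ms, int interval_ms, int release_ms,
                            bool events_between_timeouts, int cap) {
    int now = 0;
    int callbacks = 0;
    bool release_seen = false;
    while (!release_seen && callbacks < cap) {
        ++callbacks;          // timer callback runs ...
        now += callback_ms;   // ... and blocks the loop this long
        const int delay = corrected_delay_ms(interval_ms, callback_ms);
        if (delay > 0 || events_between_timeouts) {
            now += delay;     // the loop waits and reads pending events
            release_seen = (now >= release_ms);
        }
        // Otherwise the timer is already due again and fires at once,
        // without the event queue ever being looked at.
    }
    return callbacks;
}
```

With a 50 ms callback and a 100 ms interval the model stops soon after
the release. With a 150 ms callback it only stops if events are handled
between timeouts; otherwise it runs until the cap.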

>>> ... but it would be possible for the counter to check if the mouse is
>>> currently held down before repeating the timeout.
>> That's something I'm definitely going to look at.
> Though, reading this (and not knowing what is happening underneath!) I
> was concerned that this may itself induce some aberrant behaviours...

I did this in another commit which I pushed as well. I didn't find
another way to make sure that the Fl_Counter widget gets all necessary
events. Why? The simple case that the callback opens a modal window
(e.g. fl_message()) would redirect all following events including
FL_RELEASE to that modal window and thus eclipse the FL_RELEASE event
from the Fl_Counter widget. Maybe we could use FL_LEAVE but I'm not sure
that this would be delivered in this case.

Once the blocking of the event handling was fixed I could rely on the
current Fl::event_state() info about mouse buttons etc. Although this
is not 100% bullet-proof it will usually be enough to terminate the
repetition of Fl_Counter callbacks which are based on an internal state
variable which in turn needs the FL_RELEASE event or some other events
to reset.
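
A minimal standalone sketch of that idea, for illustration: the names
`ToyEventState`, `ToyRepeater` and `tick()` are hypothetical, and this
is not the committed Fl_Counter code. `ToyEventState` merely stands in
for the role Fl::event_state() plays, i.e. a button state that is
updated even when the release event goes to another window.

```cpp
#include <cassert>

// Stand-in for the library's "sticky" record of the last known button
// state (the role Fl::event_state() plays in the discussion above).
struct ToyEventState {
    bool button_down = false;
};

// Toy widget; NOT the real Fl_Counter.
struct ToyRepeater {
    int value = 0;
    bool repeating = false;  // only a delivered release clears this

    void push(ToyEventState& state) {
        assert(!repeating);  // the sketch expects a press only while idle
        state.button_down = true;
        repeating = true;
        ++value;
    }

    void release_delivered() { repeating = false; }

    // Timer tick. With check_state the widget also consults the global
    // button state, so a release it never received still stops it.
    bool tick(const ToyEventState& state, bool check_state) {
        if (!repeating) return false;
        if (check_state && !state.button_down) {
            repeating = false;
            return false;
        }
        ++value;
        return true;
    }
};
```

If the release is swallowed by a modal window, `release_delivered()` is
never called. Without the extra check the toy widget keeps ticking; with
it, the first tick after the button went up ends the repetition.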

Unless someone else has better ideas how to solve this issue this is now
committed and should suffice.

There's still an open issue on macOS only which is described in GitHub
issue #450 though. I have no idea what's going on in this case.

Albrecht Schlosser

Jun 17, 2022, 1:16:54 PM6/17/22
to fltkc...@googlegroups.com
On 6/17/22 19:08 Bill Spitzak wrote:
> I'm pretty certain all platforms queue up events. X certainly does.
>
> However it is not hard for bugs or badly-written code to cause the
> RELEASE event to be sent the wrong place.
>
> It probably is best for the counter (and anywhere else timeouts are
> repeated) to check if the mouse button is still held down. This
> indicator is set when the event is processed even if the event is not
> delivered to the right place.

Yep, I did this and it catches even the case when a modal dialog window
is opened in the callback. This works sufficiently because the event
state etc. (Fl::event_*) is "sticky" - I know that's your work, Bill
(thanks!), just for others reading here.

Rob McDonald

Jun 17, 2022, 8:00:59 PM6/17/22
to fltk.coredev
On Friday, June 17, 2022 at 9:49:27 AM UTC-7 Albrecht Schlosser wrote:
Is a counter supposed to repeat events if you hold down the button?  I guess I'm failing to see how it doesn't behave like a normal button press and why we don't see this problem for all kinds of FLTK events.

Yes, it is supposed to repeat if you keep the mouse button pressed: once every 0.1 seconds after an initial delay of 0.5 seconds. This is also true if you drag the mouse over the buttons (while keeping the mouse button down, of course).

Interesting, I never realized that.  Sure enough, it works as described.

If it is 'supposed' to repeat, then I should probably avoid Fl_Counter in the first place and just use normal buttons.

What did you want to achieve with the Fl_Counter in the first place? Do you handle different buttons differently?

I don't use Fl_Counter very often in my program.  Usually the counter is limited to about 10 items, so there isn't much reason to fly through them quickly.

It looks like, for most 'normal' buttons, we don't actually trigger the event until button release.
 
Anyway, clicking on one of the buttons in a "normal" way should never trigger the auto-repeat function and this is what I don't understand (yet) in the macOS implementation. This is really a strange and unexpected behavior.

I appreciate all the effort by you and the rest of the FLTK team -- both on this bug and in general.

Rob

 

imacarthur

Jun 20, 2022, 3:38:49 AM6/20/22
to fltk.coredev
On Friday, 17 June 2022 at 16:37:26 UTC+1 Rob wrote:
Rob, if you break the timing dependency out - that is, have the button
cb return "instantly" and then perform the time consuming
computations in a different thread (say) does that then make things
"work" - i.e. is it possible to demonstrate that it is timing that
is the crux here, or is there something more going on?

I don't see a practical way to do this (certainly not in the short term).  My program is not thread safe and we take a long time to come back in a number of situations.  This program and its ancestors have been using FLTK (and XForms and SGI Forms before that) since the early 1990's -- at different times, computers get faster, but our calculations also get more complex.  Most things maintain the illusion of happening 'instantly', but users can create arbitrarily complex models -- which can take time to update.

What I'm about to add is way off-topic for the original thread, and probably not all that useful either, but here goes anyway...

You say "at different times, computers get faster", but that's not quite what happens any more - CPU clocks hit the 3 ~ 3.5GHz level many years ago now and haven't really gone much higher since - sure, there have been incremental improvements in microcode execution, branch prediction, pipelining, etc. that have each allowed modern CPUs to do a bit more per clock cycle, but the raw single-thread performance is still largely determined by raw clock speed, which hasn't really changed all that much (indeed, in practice a lot of modern cores have lower nominal clock speeds in a drive to lower the TDP.)

Rather, modern CPUs have become much "wider" rather than "faster" - they're still using all the extra transistors that Moore promised us, but not to go faster...
But the only way to get at that extra capacity is to use more threads - which is a nuisance, but an apparently necessary one.
So if your program is being tasked with increasingly complex models to execute, it may well be worth the pain to see if those tasks can be spun into their own threads in some safe way...

 

Rob McDonald

Jun 20, 2022, 11:12:57 PM6/20/22
to fltk.coredev
On Monday, June 20, 2022 at 12:38:49 AM UTC-7 imacarthur wrote:
I don't see a practical way to do this (certainly not in the short term).  My program is not thread safe and we take a long time to come back in a number of situations.  This program and its ancestors have been using FLTK (and XForms and SGI Forms before that) since the early 1990's -- at different times, computers get faster, but our calculations also get more complex.  Most things maintain the illusion of happening 'instantly', but users can create arbitrarily complex models -- which can take time to update.

What I'm about to add is way off-topic for the original thread, and probably not all that useful either, but here goes anyway...

You say "at different times, computers get faster", but that's not quite what happens any more - CPU clocks hit the 3 ~ 3.5GHz level many years ago now and haven't really gone much higher since - sure, there have been incremental improvements in microcode execution, branch prediction, pipelining, etc. that have each allowed modern CPUs to do a bit more per clock cycle, but the raw single-thread performance is still largely determined by raw clock speed, which hasn't really changed all that much (indeed, in practice a lot of modern cores have lower nominal clock speeds in a drive to lower the TDP.)

Rather, modern CPUs have become much "wider" rather than "faster" - they're still using all the extra transistors that Moore promised us, but not to go faster...
But the only way to get at that extra capacity is to use more threads - which is a nuisance, but an apparently necessary one.
So if your program is being tasked with increasingly complex models to execute, it may well be worth the pain to see if those tasks can be spun into their own threads in some safe way...

This is a fair point.

I will have to start thinking about ways to make this happen.

There are admittedly a lot of things that can be done to make my program faster in places -- lots of them are easier than making it thread safe.  Unfortunately, while there is sometimes funding to add features, there is seldom funding to buy down technical debt.

I also struggle a bit with what a multi-threaded use case would be like.  In my program, you are interacting with a central model.  When you make a change, it takes a certain amount of time.  It doesn't really make sense to start making the next change before the model has updated for the previous one -- you're somewhat forced to serialize around the user's ability to see the impact of their change.

Think about applying a filter or blur in Photoshop or Gimp.  If it takes 1.5 seconds for the filter to be applied to your image -- does it make sense for the user to be already applying the next filter?  In many ways, the best thing to do is show the user an hourglass (or spinning beachball) and make them wait.

Rob


Ian MacArthur

Jun 21, 2022, 8:17:16 AM6/21/22
to coredev fltk
On 21 Jun 2022, at 04:12, Rob McDonald wrote:
>
> There are admittedly a lot of things that can be done to make my program faster in places -- lots of them are easier than making it thread safe. Unfortunately, while there is sometimes funding to add features, there is seldom funding to buy down technical debt.

This: I know...

>
> I also struggle a bit with what a multi-threaded use case would be like. In my program, you are interacting with a central model. When you make a change, it takes a certain amount of time. It doesn't really make sense to start making the next change before the model has updated for the previous one -- you're somewhat forced to serialize around the user's ability to see the impact of their change.

Well.... I’d say you need to serialize their (ability to make) changes to the model, but that does not necessarily mean you need to lock out the UI whilst the model is recalculated.

Running the computation in the context of the callback does tend to lock out the UI, which just seems wrong to me (and on some platforms can lead to weirdness if the OS is sending events that seem to be ignored...)

So I’d always favour returning the callback ASAP and keeping the UI “alive” and responsive - even if the response is just to say “hold on, I’m still busy!”
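
One way to picture "serialize the changes, but keep the UI alive" is a worker that accepts a new model update only when the previous one has finished. The following is a minimal sketch in standard C++ with no FLTK calls; `ModelWorker` and `try_start` are hypothetical names, and it assumes `try_start()` is only ever called from the UI thread. A real FLTK program would additionally have to hand results back to the main thread (FLTK offers Fl::lock() and Fl::awake() for that), which is left out here.

```cpp
#include <atomic>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical helper: runs at most one model update at a time on a
// background thread so that the UI callback can return immediately.
class ModelWorker {
public:
    ~ModelWorker() { wait(); }

    // Call from the UI callback. Returns false ("hold on, still busy")
    // if the previous update has not finished yet.
    bool try_start(std::function<void()> update) {
        bool expected = false;
        if (!busy_.compare_exchange_strong(expected, true)) return false;
        wait();  // reap the previous, already finished, thread
        worker_ = std::thread([this, update = std::move(update)] {
            update();            // the slow recalculation
            busy_.store(false);  // allow the next change
        });
        return true;
    }

    bool busy() const { return busy_.load(); }

    // Block until the current update (if any) has finished.
    void wait() {
        if (worker_.joinable()) worker_.join();
    }

private:
    std::atomic<bool> busy_{false};
    std::thread worker_;
};
```

In this arrangement the callback would call `try_start()` and return. If it gets `false` back, the UI can show a busy indicator instead of appearing frozen.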


>
> Think about applying a filter or blur in Photoshop or Gimp. If it takes 1.5 seconds for the filter to be applied to your image -- does it make sense for the user to be already applying the next filter? In many ways, the best thing to do is show the user an hourglass (or spinning beachball) and make them wait.

Sure, but I’d counter with: “Ah ha! But applying a filter is a classic case where threading can help.” Since many of those operations do lend themselves to being accelerated by subdivision of the work... (Not that long ago I was using - I think it was the gimp - on a really slow old laptop, and I could actually see it filling in tiles “randomly” across the scene as it applied the filter.)

And yes, I recognise you didn’t specifically mean an operation that lends itself to SIMD operations and to being split over multiple tiles.
The point is rather that you need to complete one change before you start the next; but the time it takes to complete the change can very often be shortened by subdividing the work; even where the subdivision itself adds a fair bit of extra work, it is often a net win. If it makes the job 50% harder but allows 400% more CPU to be brought to bear...

