Clicking a Fl_Choice will kick off an infinite loop with the choice repeatedly incrementing and firing off repeated events. Instead of incrementing from 0 to 1, it goes to 1, 2, 3, 4, 5, 6, etc.
This behavior depends on how long my program takes to respond to the event. If my program is fast, the bug does not appear. However, if my program takes 'too long', then the Fl_Choice sends another event and it slips into the infinite loop.
I manually bisected this behavior down to this commit:
29d9e31c51e6c Consolidate timeout handling across platforms (#379) Albrecht Schlosser
The current tip of master (d8eb1f9ca46) still has this problem.
I am on MacOS.
I have not tried to duplicate this in any of the test programs - or to create a MWE. I'm hopeful that this will start a conversation and perhaps someone will see the problem with the new timeout handling code.
On 6/16/22 05:58 Rob McDonald wrote:
Clicking a Fl_Choice will kick off an infinite loop with the choice repeatedly incrementing and firing off repeated events. Instead of incrementing from 0 to 1, it goes to 1, 2, 3, 4, 5, 6, etc.
Hmm, this sounds more like Fl_Counter rather than Fl_Choice. Correct?
This behavior depends on how long my program takes to respond to the event. If my program is fast, the bug does not appear. However, if my program takes 'too long', then the Fl_Choice sends another event and it slips into the infinite loop.
I have a theory what *might* happen but I need more info.
(1) How are you handling the event? Are you using a callback?
(2) Do you create another window in your event (callback) handling?
(3) If yes, is this a modal window, maybe something like fl_message() or fl_ask() or one of the other common dialogs?
(4) If it's not (2) or (3), can you describe what your program does when it "takes 'too long' " ? Everything related to the event loop (Fl::wait, Fl::check etc.) would be important, as well as info about opening other windows.
More questions below...
I manually bisected this behavior down to this commit:
29d9e31c51e6c Consolidate timeout handling across platforms (#379) Albrecht Schlosser
The current tip of master (d8eb1f9ca46) still has this problem.
I am on MacOS.
Would your program run on Linux too? If yes, can you please test it on Linux and:
(5a) Does it exhibit the same behavior on Linux?
(5b) Does it also do this in commit cf4a832e6, the one before 29d9e31c51e6c?
I have not tried to duplicate this in any of the test programs - or to create a MWE. I'm hopeful that this will start a conversation and perhaps someone will see the problem with the new timeout handling code.
FWIW: I can replicate the issue with Fl_Counter (sic!) and calling fl_message() in the callback.
Minimal test case (counter.cxx) attached.
Note that this test program exhibits the issue on Linux (git current) and even before commit 29d9e31c51e6c. In fact, it's also "broken" in FLTK 1.3 (git branch-1.3 latest). I didn't bother to test on macOS (yet).
The fact that your program worked on macOS before changing the timeout handling was supposedly only luck (not your "fault" ;-) ). I didn't test my demo program on macOS yet, awaiting your response with more info (see questions above).
FYI: My demo program can be "fixed" with both changes given in the attached Fl_Counter.patch independently but this is only a first proof of concept, not a real solution. However, if your issue is similar to what I *guessed* then you might want to test the patch and report if any one of the changes (each one, separately) fixes the issue for you.
Looking forward to your reply. TIA.
OK, taking your earlier reply into account, I changed the demo program to just (u)sleep() inside the callback and can trigger the issue in a reproducible way. This may be different than your real case but it shows the issue. I'm attaching the modified demo program counter.cxx. I know what is causing the issue, i.e. what happens once it gets triggered, but I don't know yet how to fix it. I'll look deeper into it tomorrow.
What I found out so far:
(1) My first assumption was that the Fl_Counter widget misses the FL_RELEASE event and this can easily be demonstrated by opening a modal window inside the callback. This is test case 1 [change '#if (0)' to '#if (1)' ...] and can be fixed by my previously posted patch.
(2) My new test program works by sleeping at least 100 ms inside the callback. It seems necessary to hold the Fl_Counter button pressed at least this time so the internal repeat timer gets called before the FL_RELEASE event is delivered. This restarts the repeat timer and triggers the (in this case potentially infinite) loop. This happens on both Linux and macOS.
(3) Testing on the Mac I could confirm that commit cf4a832e6 (the one before the timeout changes) does not have this issue.
(4) The problem is obviously in the code of Fl_Counter. It's not a direct regression introduced by the timeout changes but it is revealed by those.
(5) Another attempt to fix all known issues is my new patch Fl_Counter_v2.patch (attached). It's likely not the final solution (fixing symptoms only) but it's a start.
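Point (2) above can be illustrated with a toy model. This is only a sketch, not FLTK code and not the attached counter.cxx: `simulate_click`, `ClickResult` and the millisecond bookkeeping are hypothetical, and the model assumes that a due timer is serviced before pending mouse input. The 0.5 s initial delay and 0.1 s repeat interval are the values mentioned in this thread.

```cpp
#include <algorithm>
#include <cassert>

// Toy model only: these names are hypothetical and none of this is FLTK code.
struct ClickResult {
  int callbacks;      // how many times the user callback ran
  bool release_seen;  // whether the release was ever processed
};

ClickResult simulate_click(int callback_ms, int release_at_ms,
                           int max_callbacks = 50) {
  assert(callback_ms >= 0 && release_at_ms >= 0 && max_callbacks >= 1);
  const int initial_delay_ms = 500;  // initial repeat delay quoted in this thread
  const int repeat_ms = 100;         // repeat interval quoted in this thread

  int now = 0;
  bool pressed = true;                     // plays the role of `mouseobj`
  int timer_due = now + initial_delay_ms;  // the press arms the repeat timer
  now += callback_ms;                      // the press runs the blocking callback
  int callbacks = 1;

  while (pressed && callbacks < max_callbacks) {
    if (now >= timer_due) {
      // ASSUMPTION: a due timer is serviced before pending mouse input.
      timer_due = now + repeat_ms;  // re-arm, then run the blocking callback
      now += callback_ms;
      ++callbacks;
    } else if (now >= release_at_ms) {
      pressed = false;  // release delivered: repetition ends
    } else {
      now = std::min(timer_due, release_at_ms);  // idle until the next event
    }
  }
  return {callbacks, !pressed};
}
```

In this model `simulate_click(1, 80)` runs the callback once, `simulate_click(1, 750)` runs it four times, and `simulate_click(150, 520)` never gets to process the release: each callback outlasts the repeat interval, so the timer is always due again and the run stops only at the `max_callbacks` cap.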
Rob, can you please:
(a) test and (hopefully) confirm that this patch fixes your issue too?
(b) open a GitHub Issue describing the issue in a short form so the issue won't be forgotten. You can refer to this discussion for further information [1]. I will investigate further and (hopefully, again) find a proper solution later.
Thanks for finding the issue and your support in testing.
[1] Link: https://groups.google.com/g/fltkcoredev/c/daKlBxeOJVk/m/D-MPHPiCAAAJ
--
You received this message because you are subscribed to the Google Groups "fltk.coredev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fltkcoredev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/fltkcoredev/11c023a5-fe80-4b71-9d35-ec5dd8e9f1c1n%40googlegroups.com.
On Thursday, June 16, 2022 at 3:00:55 PM UTC-7 Albrecht Schlosser wrote:
(5) Another attempt to fix all known issues is my new patch Fl_Counter_v2.patch (attached). It's likely not the final solution (fixing symptoms only) but it's a start.
Rob, can you please:
(a) test and (hopefully) confirm that this patch fixes your issue too?
I tried this patch (thanks for the prompt work). It helps, but isn't a fix...
When I click the counter, it now increments two values (usually) and then stops. It doesn't skip from 2 to 4, I see the intermediate 3. It goes 2, 3, 4 and then stops -- for a single click.
Occasionally (once out of 20+ attempts) it will only increment by one (my sequence switched from even to odd).
(b) open a GitHub Issue describing the issue in a short form so the issue won't be forgotten. You can refer to this discussion for further information [1]. I will investigate further and (hopefully, again) find a proper solution later.
Thanks for finding the issue and your support in testing.
[1] Link: https://groups.google.com/g/fltkcoredev/c/daKlBxeOJVk/m/D-MPHPiCAAAJ
I'll create the issue. Thanks for your prompt attention to this.
On 6/17/22 01:36 Rob McDonald wrote:
On Thursday, June 16, 2022 at 3:00:55 PM UTC-7 Albrecht Schlosser wrote:
(5) Another attempt to fix all known issues is my new patch Fl_Counter_v2.patch (attached). It's likely not the final solution (fixing symptoms only) but it's a start.
Rob, can you please:
(a) test and (hopefully) confirm that this patch fixes your issue too?
I tried this patch (thanks for the prompt work). It helps, but isn't a fix...
When I click the counter, it now increments two values (usually) and then stops. It doesn't skip from 2 to 4, I see the intermediate 3. It goes 2, 3, 4 and then stops -- for a single click.
Occasionally (once out of 20+ attempts) it will only increment by one (my sequence switched from even to odd).
In my demo I need to hold the mouse button for a "significant" time. This time is presumably longer than 0.1 sec which is the repeat timer of Fl_Counter. This can be seen clearly in the existing code: if the FL_RELEASE event is delivered before the timer event the internal variable `mouseobj` is reset and the timer repetition is terminated.
Is it possible that your mouse button "click" lasts longer than 0.1 sec?
On Friday, June 17, 2022 at 3:25:38 AM UTC-7 Albrecht Schlosser wrote:
On 6/17/22 01:36 Rob McDonald wrote:
On Thursday, June 16, 2022 at 3:00:55 PM UTC-7 Albrecht Schlosser wrote:
(5) Another attempt to fix all known issues is my new patch Fl_Counter_v2.patch (attached). It's likely not the final solution (fixing symptoms only) but it's a start.
Rob, can you please:
(a) test and (hopefully) confirm that this patch fixes your issue too?
I tried this patch (thanks for the prompt work). It helps, but isn't a fix...
When I click the counter, it now increments two values (usually) and then stops. It doesn't skip from 2 to 4, I see the intermediate 3. It goes 2, 3, 4 and then stops -- for a single click.
Occasionally (once out of 20+ attempts) it will only increment by one (my sequence switched from even to odd).
In my demo I need to hold the mouse button for a "significant" time. This time is presumably longer than 0.1 sec which is the repeat timer of Fl_Counter. This can be seen clearly in the existing code: if the FL_RELEASE event is delivered before the timer event the internal variable `mouseobj` is reset and the timer repetition is terminated.
Is it possible that your mouse button "click" lasts longer than 0.1 sec?
I am just clicking the mouse in a normal way. I am not dwelling on the button in any way. I don't have a millisecond stopwatch connected to my mouse finger -- but I'm not doing anything unusual as a user.
Is a counter supposed to repeat events if you hold down the button? I guess I'm failing to see how it doesn't behave like a normal button press and why we don't see this problem for all kinds of FLTK events.
If it is 'supposed' to repeat, then I should probably avoid Fl_Counter in the first place and just use normal buttons.
Is a counter supposed to repeat events if you hold down the button? I guess I'm failing to see how it doesn't behave like a normal button press and why we don't see this problem for all kinds of FLTK events.
Yes, it is supposed to repeat if you keep the mouse button pressed: once every 0.1 seconds after an initial delay of 0.5 seconds. This is also true if you drag the mouse over the buttons (while keeping the mouse button down, of course).
If it is 'supposed' to repeat, then I should probably avoid Fl_Counter in the first place and just use normal buttons.
What did you want to achieve with the Fl_Counter in the first place? Do you handle different buttons differently?
Anyway, clicking on one of the buttons in a "normal" way should never trigger the auto-repeat function and this is what I don't understand (yet) in the macOS implementation. This is really a strange and unexpected behavior.
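As a worked example of the repeat schedule described above (0.5 s initial delay, then once every 0.1 s), the number of callbacks expected for a given hold time can be sketched as follows. `expected_callbacks` is a hypothetical helper, not an FLTK function. It assumes the callback returns instantly and that a timer falling due at the same moment as the release still fires.

```cpp
#include <cassert>

// Hypothetical helper, not part of FLTK: how many callbacks a press held for
// `hold_ms` milliseconds should produce under the schedule quoted above.
int expected_callbacks(int hold_ms) {
  assert(hold_ms >= 0);
  const int initial_delay_ms = 500;  // 0.5 s before the first repeat
  const int repeat_ms = 100;         // 0.1 s between later repeats
  if (hold_ms < initial_delay_ms) {
    return 1;  // ordinary click: only the press itself
  }
  // The press, the first repeat at 0.5 s, then one per full 0.1 s after it.
  return 2 + (hold_ms - initial_delay_ms) / repeat_ms;
}
```

Under these assumptions a normal click of well under half a second yields exactly one callback, which is why repeated increments from a single click are unexpected.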
Rob, if you break the timing dependency out - that is, have the button cb return "instantly" and then perform the time consuming computations in a different thread (say) does that then make things "work" - i.e. is it possible to demonstrate that it is timing that is the crux here, or is there something more going on?
I don't see a practical way to do this (certainly not in the short term). My program is not thread safe and we take a long time to come back in a number of situations. This program and its ancestors have been using FLTK (and XForms and SGI Forms before that) since the early 1990's -- at different times, computers get faster, but our calculations also get more complex. Most things maintain the illusion of happening 'instantly', but users can create arbitrarily complex models -- which can take time to update.
I don't see a practical way to do this (certainly not in the short term). My program is not thread safe and we take a long time to come back in a number of situations. This program and its ancestors have been using FLTK (and XForms and SGI Forms before that) since the early 1990's -- at different times, computers get faster, but our calculations also get more complex. Most things maintain the illusion of happening 'instantly', but users can create arbitrarily complex models -- which can take time to update.
What I'm about to add is way off-topic for the original thread, and probably not all that useful either, but here goes anyway...
You say "at different times, computers get faster", but that's not quite what happens any more - CPU clocks hit the 3 ~ 3.5GHz level many years ago now and haven't really gone much higher since - sure, there have been incremental improvements in microcode execution, branch prediction, pipelining, etc. that have each allowed modern CPUs to do a bit more per-clock cycle, but the raw single-thread performance is still largely determined by raw clock speed, which hasn't really changed all that much (indeed, in practice a lot of modern cores have lower nominal clock speeds in a drive to lower the TDP.)
Rather, modern CPUs have become much "wider" rather than "faster" - they're still using all the extra transistors that Moore promised us, but not to go faster...
But the only way to get at that extra capacity is to use more threads - which is a nuisance, but an apparently necessary one.
So if your program is being tasked with increasingly complex models to execute, it may well be worth the pain to see if those tasks can be spun into their own threads in some safe way...
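The idea of having the callback return "instantly" and doing the heavy work in another thread could look roughly like the sketch below. It uses only the C++ standard library. `update_model`, `on_value_changed` and `result_ready` are hypothetical names, the 200 ms sleep stands in for an expensive calculation, and the sketch assumes the work touches no shared state and no GUI objects, which is precisely the thread-safety condition questioned above. The FLTK-specific step of handing the result back to the GUI thread is left out.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for an expensive model update.
// ASSUMPTION: it touches no shared state and no GUI objects.
int update_model(int input) {
  std::this_thread::sleep_for(std::chrono::milliseconds(200));
  return input * 2;
}

// Hypothetical callback body: start the work on another thread and return
// right away, so the event loop is not blocked while the model updates.
std::future<int> on_value_changed(int input) {
  return std::async(std::launch::async, update_model, input);
}

// Non-blocking poll, e.g. called periodically from the GUI thread.
bool result_ready(std::future<int>& pending) {
  assert(pending.valid());
  return pending.wait_for(std::chrono::seconds(0)) ==
         std::future_status::ready;
}
```

A caller would keep the returned future, poll it with `result_ready`, and call `get()` once it reports true. Whether this is practical depends on how much of the existing calculation can be isolated from shared data.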