FLTK Performance on Mac


Daniel Harding

Feb 18, 2024, 11:09:45 PM
to fltk.general
Spinning off from some of the discussion in this previous thread: https://groups.google.com/g/fltkgeneral/c/xUEnuZs-THA

I have a real-time audio application that ideally needs a frame rate of at least 45 frames per second.

https://github.com/dannye/crystal-tracker
https://www.youtube.com/watch?v=BPsvsHj3eLM

I have no problem achieving a frame rate higher than this on Linux and Windows, but the frame rate struggles a little bit on Mac -- at least on my somewhat old 2013 MacBook Pro.

I'm interested in investigating what might be causing this lower frame rate and if there's anything that can be done to improve it.

I suspect that the main bottleneck is sheer drawing speed, because the frame rate gets noticeably worse when I make the window bigger and immediately improves when I resize the window smaller.

Some points already discussed in the previous thread:

* Performance is already significantly better in FLTK 1.4 than in 1.3. Awake callbacks are no longer indefinitely stalled as a result of never-ending mouse/keyboard events (wiggling the mouse cursor inside a disabled button, scroll events in a Fl_Scroll, or holding down a keyboard shortcut). Also, scroll "inertia" from the trackpad is greatly improved in FLTK 1.4.

* In my application, the audio thread attempts to refill its audio sample buffer roughly every 8 milliseconds, after which it will call Fl::awake() if and only if there is not already a queued awake callback that the main thread has not yet fulfilled.
This awake callback is strictly for redrawing the UI. All audio processing is done by the audio thread and is not affected by frame rate.
This leads to fantastic frame rate/low latency on Windows and Linux. But on Mac, songs with faster tempos and/or shorter notes will have a frame rate low enough that short notes will be completely "skipped over" by the main thread UI. By this I mean that the auto-scrolling done by the GUI will scroll past whole notes in a single frame as the song plays. Ideally this would not happen.

* I've always wondered if this issue would "go away" on a newer MacBook, but I do not currently have access to a newer MacBook.

* Interestingly, the frame rate is perfect when running the Windows 64-bit executable on Mac via Wine, on the same 2013 MacBook Pro. This leads me to think that a newer MacBook should not be necessary (or maybe not even very helpful) in trying to achieve a higher frame rate.
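
[Editor's sketch] The "call Fl::awake() only if no awake callback is still queued" rule described above can be modelled with a single atomic flag. This is a minimal illustration under stated assumptions, not the application's actual code: the class name RedrawGate and its methods are hypothetical, and only Fl::awake() itself is FLTK API.

```cpp
#include <atomic>

// Hypothetical helper (not part of FLTK): remembers whether a redraw
// request is already queued for the main thread.
class RedrawGate {
public:
    // Called from the audio thread after refilling its buffer.
    // Returns true only when no request was pending, i.e. when the
    // caller should go ahead and post a new awake callback.
    bool try_request() {
        return !pending_.exchange(true, std::memory_order_acq_rel);
    }

    // Called from the main thread at the start of the awake callback,
    // so a newer request can be queued while this frame is drawn.
    void fulfill() {
        pending_.store(false, std::memory_order_release);
    }

    // True while a request is queued and not yet fulfilled.
    bool pending() const {
        return pending_.load(std::memory_order_acquire);
    }

private:
    std::atomic<bool> pending_{false};
};
```

In a real program the audio thread would presumably do something like `if (gate.try_request()) Fl::awake(redraw_cb, nullptr);`, where redraw_cb is a hypothetical handler that calls gate.fulfill() and then redraws the UI. With this scheme, at most one redraw request is in flight at any time, no matter how often the audio thread refills its buffer.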

Anyone interested may try to compile my program from the GitHub repo. The INSTALL.md instructions are very straightforward for Mac and Linux, but pretty tedious for Windows.

As I have time, I may add more in-depth explanations here or screen recordings to demonstrate the performance that I'm seeing. I may even try to create a simpler demo program that illustrates the same idea without all of the dependencies of the full program, but no guarantees on that.

Thanks in advance to anyone for your input.

Albrecht Schlosser

Feb 20, 2024, 10:50:26 AM
to fltkg...@googlegroups.com
On 2/19/24 05:09 Daniel Harding wrote:
> Spinning off from some of the discussion in this previous thread:
> https://groups.google.com/g/fltkgeneral/c/xUEnuZs-THA
>
> I have a real-time audio application that ideally needs a frame rate
> of at least 45 frames per second or higher.
>
> https://github.com/dannye/crystal-tracker
> https://www.youtube.com/watch?v=BPsvsHj3eLM
>
> I have no problem achieving a frame rate higher than this on Linux and
> Windows, but the frame rate struggles a little bit on Mac -- at least
> on my somewhat old 2013 MacBook Pro.
>
> I'm interested in investigating what might be causing this lower frame
> rate and if there's anything that can be done to improve it.

@Daniel, thanks for opening this thread. It's interesting, but
unfortunately not essential to investigate this before the release of
FLTK 1.4.0. Therefore, I can't spend time on investigating this issue.
However, this should not be forgotten. There is an independent
investigation going on, regarding "DataReadyThread" which could also
influence the performance on macOS (not only drawing speed as you assumed).

This issue should be investigated at some time...

> As I have time, I may add more in depth explanations here, screen
> recordings to demonstrate the performance that I'm seeing, or may even
> try to create a simpler demo program that illustrates the same idea
> but without all of the dependencies of the full program, but no
> guarantees on that.

The simpler example would IMHO be the way to go. Seeing performance in
screen recordings would be "nice" but building one's own test program
that can be tested on all platforms to see the difference would be
preferable. So if you are willing to invest more time, then the small
demo program would IMHO be the better choice. Thanks in advance.

Some ideas and suggestions: draw a lot of objects like your keyboard and
the "notes" at arbitrary (maybe random) locations with different sizes.
Are you using widgets like Fl_Box in your original program, or do you
use "draw" methods like fl_rectf() etc.? Do the same in the demo
program. Make the window contents "scroll" as in your example program
(use the same Fl::awake() mechanism like your audio thread). Add a
slider where you can change the "speed" interactively. Try to add an
"fps" counter that may show a result that's comparable between systems.
Make the main window resizable so we can test with different sizes. Etc...

The smaller the demo code that shows the effect (some kind of saturation
on macOS) the better it is.

> Thanks in advance to anyone for your input.

Sorry that I can't help with "real" input because I'm concentrating on
finishing the 1.4.0 release.

Daniel Harding

Feb 20, 2024, 12:13:21 PM
to fltk.general
On Tuesday, February 20, 2024 at 9:50:26 AM UTC-6 Albrecht Schlosser wrote:
> @Daniel, thanks for opening this thread. It's interesting, but
> unfortunately not essential to investigate this before the release of
> FLTK 1.4.0.

Oh that's no problem. I by no means expect this to be anyone's priority or impact the release schedule for FLTK 1.4. Especially because the performance in 1.4 is already better, not worse, than in 1.3. This can be a long term thread that anyone interested can help investigate, purely as they have the time available.
 
> The simpler example would IMHO be the way to go. Seeing performance in
> screen recordings would be "nice" but building one's own test program
> that can be tested on all platforms to see the difference would be
> preferable. So if you are willing to invest more time, then the small
> demo program would IMHO be the better choice. Thanks in advance.
>
> Some ideas and suggestions: draw a lot of objects like your keyboard and
> the "notes" at arbitrary (maybe random) locations with different sizes.
> Are you using widgets like Fl_Box in your original program, or do you
> use "draw" methods like fl_rectf() etc.? Do the same in the demo
> program. Make the window contents "scroll" as in your example program
> (use the same Fl::awake() mechanism like your audio thread). Add a
> slider where you can change the "speed" interactively. Try to add an
> "fps" counter that may show a result that's comparable between systems.
> Make the main window resizable so we can test with different sizes. Etc...
>
> The smaller the demo code that shows the effect (some kind of saturation
> on macOS) the better it is.

Sure, I'll see what demo I can get going in the next week or so. (FWIW, if the demo is a stripped down version of my full program then the demo will be LGPL by extension. I don't expect that to be a problem, but just FYI.)

Albrecht Schlosser

Feb 20, 2024, 12:50:06 PM
to fltkg...@googlegroups.com
Great, looking forward to seeing it.


> (FWIW, if the demo is a stripped down version of my full program then the demo will be LGPL by extension. I don't expect that to be a problem, but just FYI.)

I don't think that this would be an issue, at least as long as we're only using it to test the FLTK stuff.

But (and this is a BIG But): the demo program should be self-contained, i.e. no external dependencies. We can only test FLTK if it's really (pure) FLTK code. It would also be easier to build and run, and devs wouldn't have to "waste" time figuring out how to build the demo.

And, BTW, the easier you make it to build, the earlier one of the devs will likely try to build and test it.

Daniel Harding

Feb 20, 2024, 2:47:57 PM
to fltk.general
On Tuesday, February 20, 2024 at 11:50:06 AM UTC-6 Albrecht Schlosser wrote:
> But (and this is a BIG But): the demo program should be self-contained, i.e. no external dependencies. We can only test FLTK if it's really (pure) FLTK code. It would also be easier to build and run, and devs wouldn't have to "waste" time figuring out how to build the demo.
>
> And, BTW, the easier you make it to build, the earlier one of the devs will likely try to build and test it.


Oh of course, that's exactly what I'll do.

Daniel Harding

Feb 25, 2024, 12:32:47 AM
to fltk.general
I made a ~1000 line demo that illustrates the lower frame rate on Mac: https://github.com/dannye/fltk-scroll-perf-test/blob/main/src/main.cpp

I get 60 or 120 fps on Windows and Linux but only 10-25 fps on Mac. You can build the standalone main.cpp however you usually build and link with FLTK.

Thanks.

Mo_Al_

Feb 25, 2024, 3:00:04 PM
to fltk.general
I built the demo on my fairly old MacBook Air (early 2015) with 4 GB RAM and it runs at around 94-95 fps:
Screen Shot 2024-02-25 at 22.56.47.jpg

Daniel Harding

Feb 25, 2024, 3:13:27 PM
to fltk.general
On Sunday, February 25, 2024 at 2:00:04 PM UTC-6 may64...@gmail.com wrote:
> I built the demo on my fairly old macbook air (early 2015) with 4gb ram and it runs at around 94-95 fps:

Thanks for trying it out. Was that with a small window? I also get 60-90 fps on Mac when the window is only a quarter the size of my screen or smaller, but the fps plummets to 10-15 when the window is resized to take up the whole screen. If you're able to verify how window size affects fps on your machine I would really appreciate it.

Mo_Al_

Feb 25, 2024, 3:22:43 PM
to fltk.general
In fullscreen mode, it's around 46, and if I increase the speed to max, it's around 39-40.

This is with the latest FLTK built in Release, and the app built with -O3.

Daniel Harding

Feb 25, 2024, 4:31:43 PM
to fltk.general
Thanks for that. So the difference isn't as dramatic as on my slightly older 2013 MacBook, but it's still achieving fewer than half the frames per second compared to when the window is rather small.

I don't see any drop-off in fps on Windows or Linux, or Wine on Mac, with respect to window size.

I get a very steady 105 frames per second when running the .exe through Wine on the same 2013 MacBook both when the window is small and when the window is full-sized.

I get ~70 fps on Windows 11 regardless of window size (fluctuates between 60 and 80 but irrespective of window size), and a very steady 123 fps on Linux regardless of window size.

I'm curious if someone more knowledgeable than me may be able to help with profiling/debugging to determine where that excess time is being spent, but if not I may try to tinker and see what I can find without knowing where I'm looking. Thanks in advance.

Daniel Harding

Feb 25, 2024, 4:39:34 PM
to fltk.general
I can actually achieve >150 fps on Windows 11, >250 fps on Linux, and >190 fps on Wine by violently sliding the scroll bar to trigger many excessive redraws.

But doing the same on Mac has no effect on the ~15 fps frame rate, positive or negative, which makes sense since it is clearly already saturated.

Daniel Harding

Feb 25, 2024, 8:36:56 PM
to fltk.general
On Tuesday, February 20, 2024 at 9:50:26 AM UTC-6 Albrecht Schlosser wrote:
> Some ideas and suggestions: draw a lot of objects like your keyboard and
> the "notes" at arbitrary (maybe random) locations with different sizes.
> Are you using widgets like Fl_Box in your original program, or do you
> use "draw" methods like fl_rectf() etc.? Do the same in the demo
> program. Make the window contents "scroll" as in your example program
> (use the same Fl::awake() mechanism like your audio thread). Add a
> slider where you can change the "speed" interactively. Try to add an
> "fps" counter that may show a result that's comparable between systems.
> Make the main window resizable so we can test with different sizes. Etc...
>
> The smaller the demo code that shows the effect (some kind of saturation
> on macOS) the better it is.

@Albrecht While the larger demo that I shared yesterday is a nice visualization of refresh rate, it turns out that it isn't necessary. The low frame rate is not caused by the large Fl_Scroll, or all the Fl_Boxes in it, or all the fl_rectf calls in the Fl_Group, or the Fl::awake() callback mechanism used from another thread to trigger a redraw.

This 50-line program removes essentially all complexity but still only achieves 15 frames per second -- exactly the same as with all the box drawing and such:
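#include <cstdio>   // snprintf
#include <ctime>    // time, time_t

#include <FL/Fl.H>
#include <FL/fl_draw.H>
#include <FL/Fl_Double_Window.H>

class Main_Window : public Fl_Double_Window {
private:
    int _frames = 0;                  // frames drawn since _frame_time
    int _frames_per_second = 0;       // smoothed FPS shown in the window
    time_t _frame_time = time(NULL);  // start of the current measurement
public:
    Main_Window(int x, int y, int w, int h, const char *l = nullptr);
protected:
    void draw() override;
};

Main_Window::Main_Window(int x, int y, int w, int h, const char *) : Fl_Double_Window(x, y, w, h, "Perf Test") {
    size_range(100, 20);  // minimum window size
}

void Main_Window::draw() {
    Fl_Double_Window::draw();

    // Count this frame and update the FPS estimate once the clock has
    // advanced (time() only has one-second resolution).
    _frames += 1;
    time_t current_time = time(NULL);
    if (current_time > _frame_time) {
        // Weighted blend: 1/4 previous estimate, 3/4 newest measurement.
        _frames_per_second = (_frames_per_second + 3 * _frames / int(current_time - _frame_time)) / 4;
        _frame_time = current_time;
        _frames = 0;
    }

    // Draw the FPS label in the top-left corner.
    char s[16];
    snprintf(s, sizeof(s), "FPS: %d", _frames_per_second);
    fl_color(FL_FOREGROUND_COLOR);
    fl_draw(s, 0, 0, 100, 20, FL_ALIGN_LEFT);
}

static Main_Window *window = nullptr;

int main(int argc, char **argv) {
    window = new Main_Window(48, 48, 800, 600);
    window->show();
    // Hacky loop: mark the whole window damaged every time Fl::wait()
    // returns, so each handled event triggers a full redraw.
    while (true) {
        Fl::wait(1e20);
        window->damage(FL_DAMAGE_ALL);
    }
    return 0;  // never reached
}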


(If you choose to run this program, simply start measuring fps by scrubbing the mouse cursor inside the window to interrupt the Fl::wait loop. You'll have to terminate it with Control+C from the terminal that launched it because of the hacky infinite Fl::wait loop.)
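
[Editor's sketch] The counter in the program above relies on time(), which only advances once per second, and blends integer rates. For anyone who wants finer-grained numbers when comparing systems, here is one possible alternative built on std::chrono::steady_clock. It is an assumption-laden illustration, not part of the posted demo: the class name FpsCounter, its methods, and the 0.5 second default window are all hypothetical choices.

```cpp
#include <chrono>

// Hypothetical FPS counter: counts frames and publishes a new rate each
// time at least `window_seconds` of monotonic time has elapsed.
class FpsCounter {
public:
    explicit FpsCounter(double window_seconds = 0.5) : window_(window_seconds) {}

    // Call once per draw() with a monotonic timestamp in seconds.
    // Returns the most recently published frames-per-second value
    // (0.0 until the first measurement window has completed).
    double on_frame(double now) {
        if (!started_) {          // first call only starts the clock
            start_ = now;
            started_ = true;
            return fps_;
        }
        ++frames_;
        const double elapsed = now - start_;
        if (elapsed >= window_) { // window complete: publish and reset
            fps_ = frames_ / elapsed;
            frames_ = 0;
            start_ = now;
        }
        return fps_;
    }

    // Convenience: current monotonic time in seconds.
    static double now_seconds() {
        using clock = std::chrono::steady_clock;
        return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
    }

private:
    double window_;
    double start_ = 0.0;
    double fps_ = 0.0;
    int frames_ = 0;
    bool started_ = false;
};
```

Inside a draw() override this would be used roughly as `double fps = counter.on_frame(FpsCounter::now_seconds());`. Passing the timestamp in as a parameter keeps the class easy to test without a running event loop.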

Screen Shot 2024-02-25 at 7.20.54 PM.png

I think the slowness comes from something very core to FLTK's syncing of the window's buffer to the OS, and for some reason this slowness grows proportionally with window size.
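
[Editor's sketch] The "grows proportionally with window size" hypothesis can be checked against the numbers reported earlier in the thread. Assuming that per-frame cost is dominated by work proportional to the number of pixels (an assumption, not a measured fact), the expected frame rate at a new window size is the measured rate scaled by the ratio of pixel counts. The function name and the example window dimensions below are hypothetical.

```cpp
// Assumption: time per frame is proportional to the window's pixel count,
// so fps scales with (measured_pixels / target_pixels).
double predicted_fps(double measured_fps, double measured_pixels, double target_pixels) {
    return measured_fps * measured_pixels / target_pixels;
}

// Example with hypothetical sizes: a 720x450 window covers a quarter of
// the pixels of a 1440x900 window.
double quarter_to_full_example() {
    const double quarter = 720.0 * 450.0;
    const double full = 1440.0 * 900.0;
    return predicted_fps(60.0, quarter, full);
}
```

Under this model, 60 fps at a quarter of the screen predicts 15 fps at full size, which is in line with the reported drop from 60-90 fps to 10-15 fps on the 2013 MacBook Pro. It does not by itself say where the per-pixel cost comes from.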

I think I'm at the end of what I know and can experiment with inside the application code. To get any further I would need to dig into the FLTK code (which isn't a bad thing).

I'm curious what others might think. Thanks.

Mo_Al_

Feb 26, 2024, 1:23:37 PM
to fltk.general
I tried checking with Instruments.app and I get:
11112 a.out (3108)
11009 Main Thread  0x11f85
11009 start
11009 Fl::run()
11008 Fl_Darwin_System_Driver::wait(double)
10461 Fl::flush()
10461 Fl_Cocoa_Window_Driver::flush()
10429 -[NSView displayIfNeeded]
10402 -[_NSBackingLayer displayIfNeeded]
10402 -[_NSViewBackingLayer display]
10379 -[_NSBackingLayer display]
10369 -[CALayer _display]
10353 invocation function for block in CA::Layer::display_()
10352 CABackingStoreUpdate_
10181 -[NSView(NSLayerKitGlue) drawLayer:inContext:]
7841 CGDisplayListDrawInContextDelegate
7839 CG::DisplayList::execute(CGContextDelegate*, CGRenderingState*, CGGStack*, CGRect const*, __CFDictionary const*)
7839 CG::DisplayList::executeEntries(std::__1::__wrap_iter<std::__1::unique_ptr<CG::DisplayListEntry const, std::__1::default_delete<CG::DisplayListEntry const> >*>, std::__1::__wrap_iter<std::__1::unique_ptr<CG::DisplayListEntry const, std::__1::default_delete<CG::DisplayListEntry const> >*>, CGContextDelegate*, CGRenderingState*, CGGStack*, CGRect const*, __CFDictionary const*, bool)
7836 CG::DisplayListExecutor::drawImage(CG::DisplayListEntryImage const*)
7833 ripc_DrawImage
7379 ripc_AcquireRIPImageData
7378 RIPImageCacheGetRetained
7361 RIPImageDataInitializeShared
7359 CGSImageDataLock
7342 img_data_lock
7314 img_raw_read
7313 get_chunks_direct
7310 CGDataProviderDirectGetBytesAtPositionInternal
7290 provider_for_destination_get_bytes_at_position_inner
5580 CGColorTransformConvertUsingCMSConverter
5516 convert_icc
5505 convert_using_vImageConverter
5440 vImageConverterConvert
5370 vImageConverter_convert_internal
5354 vImageConvert_AnyToAny
3929 AnyToAnyBlock
3920 AnyToAnyBlockInternal
815 LookupTable_Planar8toPlanar16
723 vImageLookupTable_Planar8toPlanar16
676 vLookupTable_Planar8toPlanar16

The output is the same in fullscreen and windowed.
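
[Editor's sketch] One way to read the trace above is to convert the first column into shares of the total. This assumes the first column is an inclusive sample count (samples spent in a function and everything it calls), which the post does not state explicitly. The sample counts are copied from the trace; the helper names are hypothetical.

```cpp
#include <cstdio>

// Share of the total samples, in percent.
double share_percent(long samples, long total) {
    return 100.0 * static_cast<double>(samples) / static_cast<double>(total);
}

// Prints the share of a few frames from the trace above.
// Assumption: the counts are inclusive samples, with 11112 as the total.
void print_profile_shares() {
    const long total = 11112;  // a.out
    struct Row { const char *name; long samples; };
    const Row rows[] = {
        {"Fl::flush()", 10461},
        {"CG::DisplayListExecutor::drawImage", 7836},
        {"CGColorTransformConvertUsingCMSConverter", 5580},
        {"vImageConvert_AnyToAny", 5354},
    };
    for (const Row &row : rows) {
        std::printf("%-42s %5.1f%%\n", row.name, share_percent(row.samples, total));
    }
}
```

Read this way, roughly 70% of the samples fall under the image draw and about half fall under color conversion of the image data. That would be per-pixel work, so it would be expected to grow with window size, but this interpretation rests on the assumption about the first column.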

I tried looking online and it seems a similar issue was opened against gimp:

According to this site, https://cocoadev.github.io/HowToSpeedUpDrawing/, calling [NSView displayIfNeeded]:
"slows down your actual performance, but you can draw a fraction of a second earlier. You still won’t be able to eliminate all lag: if it’s an issue, try OpenGL."

I'm not sure how Wine does it.

Daniel Harding

Feb 26, 2024, 1:54:48 PM
to fltk.general
@Mo Thank you very much for helping to test. I've never done profiling on Mac before so I didn't know about Instruments, thank you for mentioning it.

When you say "The output is the same in fullscreen and windowed," do you mean that the metrics/analytics are essentially the same whether the app is getting ~100 fps or whether it's getting <50 fps? That is surprising. I would imagine that the time spent in one or more functions would increase noticeably when the frame rate is at its lowest. (I'm assuming that the first column of numbers in your output is "time spent".)

I will try out Instruments and see what I can find. Thanks for the other info and links as well.

Mo_Al_

Feb 26, 2024, 3:00:21 PM
to fltk.general
Sorry I wasn't clear. I meant the call stack appears to be the same for the drawing pipeline. So it doesn't seem there are extra calls or checks that are being issued by FLTK while in fullscreen.

I'm not sure how one can work around this from the application side, since the slowness seems to be at the AppKit level.

Daniel Harding

Feb 26, 2024, 8:22:05 PM
to fltk.general
Oh I see, thank you. I wouldn't expect the drawing pipeline to change much or at all, because this is also not about full screen mode per se, but just about small windows versus large windows. You should be able to slowly resize the window larger and larger and observe the frame rate drop. So the way that profiling may help is to see which functions take longer and longer as window size increases.

Daniel Harding

Jun 11, 2024, 9:10:55 PM
to fltk.general
I just wanted to return to this thread to say that I finally replaced my 2013 MacBook Pro (macOS 11.4) with a 2023 MacBook Pro (macOS 14.5) and there are no performance problems whatsoever.

I get 100 fps using the performance testing app shared earlier in this thread, both with a tiny window as well as a full-sized window.

It's still definitely interesting, because the 2013 laptop wasn't "slow" in general -- YouTube in the web browser, for example, played just fine at a smooth frame rate. So it's still interesting that FLTK apps can only run at ~15 fps on that laptop.

But given that performance is fantastic on a new MacBook, and, importantly, that the fps is the same whether the window is small or large, I completely understand if you determine it to be not worth it to investigate/improve performance any further on older Macs.

Thanks again for entertaining me on this topic.

P.S. I just released the latest version of my song editor, this time built with the FLTK 1.4 master branch (because all the improvements have become too good to ignore!), and I want to thank all of the developers of FLTK. Version 1.4 is fantastic, and I especially want to recognize all the massive improvements with regard to macOS window management/reliability: multiple windows, individual fullscreen windows, mergeable windows, the entire new Window menu of the system menu bar, etc. Thank you, Manolo, for all of that in particular.

Cheers.