JNA segfaults on MacOS, often but not always


Support

Nov 22, 2023, 8:05:17 PM
to jna-users
Greetings!

I need your kind help and advice. Our project https://github.com/manticore-projects/fpng-java compiles on Linux, Windows and MacOS (using GitHub runners).
It also calls the native libs via JNA successfully on Linux and Windows -- 100% stable.
However, on MacOS it sometimes succeeds, but most of the time it segfaults.


> Task :maven-test:test
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGILL (0x4) at pc=0x000000011033aa93, pid=1285, tid=9219
#
# JRE version: OpenJDK Runtime Environment Temurin-11.0.21+9 (11.0.21+9) (build 11.0.21+9)
# Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.21+9 (11.0.21+9, mixed mode, tiered, compressed oops, g1 gc, bsd-amd64)
# Problematic frame:
# C [libfpng.dylib+0x4a93] fpng::fpng_adler32(void const*, unsigned long, unsigned int)+0xc3
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again

I have two questions here:
1) Could this be a JNA problem? Why would the same C code work flawlessly on AMD64 Linux, and work sometimes on AMD64 MacOS, but also segfault like that?
2) What steps should I take to tackle this (given that I have zero clue about MacOS)?

I am really confused since it works sometimes (about 2 out of 10 runs).

Thank you already and best regards
Andreas

Support

Nov 22, 2023, 8:10:14 PM
to jna-...@googlegroups.com
Although it took me 5 runs to get 1 winner 😞
--
You received this message because you are subscribed to the Google Groups "Java Native Access" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jna-users+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jna-users/df823ce383afc9b45d859ec27a19036d16d867e3.camel%40manticore-projects.com.

Daniel B. Widdis

Nov 23, 2023, 12:46:22 AM
to jna-...@googlegroups.com

Native crashes are very frequently a result of incorrect mappings.

This project fails on my M2 Macbook, although the crashing native function is different.  Here’s the stack trace:

Stack: [0x00000003057e3000,0x00000003058e3000],  sp=0x00000003058dcd80,  free space=999k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libfpng.dylib+0xdbb0]  fpng_encode_image_to_memory+0x10
C  [jna15943423394244317169.tmp+0xf17a]  ffi_prep_go_closure+0x55a

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  com.sun.jna.Native.invokePointer(Lcom/sun/jna/Function;JI[Ljava/lang/Object;)J+0
j  com.sun.jna.Function.invokePointer(I[Ljava/lang/Object;)Lcom/sun/jna/Pointer;+7
j  com.sun.jna.Function.invoke([Ljava/lang/Object;Ljava/lang/Class;ZI)Ljava/lang/Object;+440
j  com.sun.jna.Function.invoke(Ljava/lang/reflect/Method;[Ljava/lang/Class;Ljava/lang/Class;[Ljava/lang/Object;Ljava/util/Map;)Ljava/lang/Object;+271
j  com.sun.jna.Library$Handler.invoke(Ljava/lang/Object;Ljava/lang/reflect/Method;[Ljava/lang/Object;)Ljava/lang/Object;+390
j  com.sun.proxy.$Proxy16.fpng_encode_image_to_memory([BIIII)Lcom/manticore/tools/Encoder$ByteArray;+46
j  com.manticore.tools.FPNGEncoder.encode(Ljava/awt/image/BufferedImage;II)[B+28

The bottom line in that stack trace indicates a Java BufferedImage object is being passed to native, and since that’s not something the type mapper understands, unpredictable things are happening.

In the function in the test report I’d investigate the byte[] argument.

Try to instrument your crashing gradle test runner to output the top several lines of the generated log files.
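
A minimal sketch of that instrumentation, assuming the JVM writes its fatal error logs under HotSpot's default name hs_err_pid<pid>.log into the directory the tests run in. The class name `CrashLogHead` and the 30-line limit are illustrative choices, not part of the project. The first lines of such a log carry the signal (SIGILL in the report at the top of this thread) and the problematic frame.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

final class CrashLogHead {

    /** Returns a banner plus the first maxLines lines of every hs_err_pid*.log in dir. */
    static List<String> head(Path dir, int maxLines) throws IOException {
        List<String> out = new ArrayList<>();
        try (DirectoryStream<Path> logs = Files.newDirectoryStream(dir, "hs_err_pid*.log")) {
            for (Path log : logs) {
                out.add("== " + log.getFileName() + " ==");
                // ISO-8859-1 maps every byte to a character, so odd bytes in a log cannot abort the read.
                try (Stream<String> lines = Files.lines(log, StandardCharsets.ISO_8859_1)) {
                    lines.limit(maxLines).forEach(out::add);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Directory to scan; defaults to the current working directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        head(dir, 30).forEach(System.out::println);
    }
}
```

Running this as a final CI step (for example `java CrashLogHead.java maven-test`, where the directory is whatever the test task uses as its working directory) would put the crash header into the build output even when the runner is thrown away afterwards.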

 


Support

Nov 23, 2023, 12:52:15 AM
to jna-...@googlegroups.com
Thank you very much Daniel, I appreciate your feedback.

In fact, we pass a byte[] array to native, and we do this in the same way on all 3 platforms. `FPNGEncoder.encode` translates the BufferedImage into such an array of bytes.
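
For illustration, a minimal sketch of how a BufferedImage can be flattened into a byte[] before it is handed to a native call. `PixelBytes` and `toBytes` are hypothetical names and this is not the project's `FPNGEncoder` code; the byte order (R, G, B, then an optional A) and the tight row packing are assumptions.

```java
import java.awt.image.BufferedImage;

final class PixelBytes {

    /** Packs the image row by row into RGB (3 channels) or RGBA (4 channels) bytes. */
    static byte[] toBytes(BufferedImage image, int channels) {
        if (channels != 3 && channels != 4) {
            throw new IllegalArgumentException("channels must be 3 or 4, got " + channels);
        }
        int width = image.getWidth();
        int height = image.getHeight();
        // getRGB always returns packed 0xAARRGGBB values, whatever the image type is.
        int[] argb = image.getRGB(0, 0, width, height, null, 0, width);
        byte[] out = new byte[width * height * channels];
        int o = 0;
        for (int pixel : argb) {
            out[o++] = (byte) (pixel >> 16);      // red
            out[o++] = (byte) (pixel >> 8);       // green
            out[o++] = (byte) pixel;              // blue
            if (channels == 4) {
                out[o++] = (byte) (pixel >>> 24); // alpha
            }
        }
        return out;
    }
}
```

The point for the crash discussion is that the array length must equal width * height * channels, because the native side reads exactly that many bytes and has no way to check the Java array's bounds.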

If the code was wrong, would it not:
a) always fail on MacOS and
b) also fail on the other 2 platforms?

Why/how would it fail only occasionally?
Can I ask you for the big favour of repeating this test 10x and confirming whether it always fails?

Thank you again big time and cheers
Andreas

Support

Nov 23, 2023, 12:57:36 AM
to jna-...@googlegroups.com
On Thu, 2023-11-23 at 12:52 +0700, Support wrote:
> M2 Macbook

Please allow me to ask:

Is this "x64" or "arm64"?
The libraries are provided for "x64" only since I did not find any Code Runner for "arm64".

Also, have you compiled it yourself or tried the Maven artifacts?

Thank you and cheers!

Tres Finocchiaro

Nov 23, 2023, 12:59:44 AM
to jna-...@googlegroups.com
M2 is ARM64

Daniel B. Widdis

Nov 23, 2023, 1:00:10 AM
to jna-...@googlegroups.com

Yep, that’s arm64 which may explain why I’m getting different results than you (and why it doesn’t work, ever, for me.)

 


Tres Finocchiaro

Nov 23, 2023, 1:00:28 AM
to jna-...@googlegroups.com
> M2 is ARM64

Sorry, this is presumptuous since Apple offers Rosetta 2.

Daniel B. Widdis

Nov 23, 2023, 1:05:33 AM
to jna-...@googlegroups.com

I’ll go try on an x86 mac shortly.

As for “what’s different” there could be different assumptions about the processor.

I looked into the native code for the function that’s crashing and see a compiler conditional based on a processor feature (can_use_sse41). It is really hard to debug based on a single line.

uint32_t fpng_adler32(const void* pData, size_t size, uint32_t adler)
{
#if FPNG_X86_OR_X64_CPU && !FPNG_NO_SSE
  if (g_cpu_info.can_use_sse41())
    return adler32_sse_16((const uint8_t*)pData, size, adler);
#endif
  return fpng_adler32_scalar((const uint8_t*)pData, size, adler);
}
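
For reference, a minimal sketch of what the scalar fallback branch is expected to compute. The body of `fpng_adler32_scalar` is not shown in this thread, so this is the textbook Adler-32 definition written in Java, not fpng's code; `Adler32Scalar` is a hypothetical name. A start value of 1 is the usual Adler-32 initial value.

```java
import java.nio.charset.StandardCharsets;

final class Adler32Scalar {

    private static final int MOD = 65521; // largest prime below 2^16

    /** Continues an Adler-32 checksum from the given start value over data. */
    static long adler32(byte[] data, long adler) {
        long a = adler & 0xFFFF;          // low 16 bits: running sum of the bytes
        long b = (adler >>> 16) & 0xFFFF; // high 16 bits: running sum of a
        for (byte value : data) {
            a = (a + (value & 0xFF)) % MOD;
            b = (b + a) % MOD;
        }
        return (b << 16) | a;
    }

    public static void main(String[] args) {
        byte[] sample = "Wikipedia".getBytes(StandardCharsets.US_ASCII);
        System.out.printf("%08x%n", adler32(sample, 1));
    }
}
```

A SIMD variant has to return exactly the same value for the same input, so a reference like this (or `java.util.zip.Adler32`) is a convenient oracle when comparing the SSE path against the scalar path.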

 


Support

Nov 23, 2023, 1:06:05 AM
to jna-...@googlegroups.com
On Thu, 2023-11-23 at 01:00 -0500, Tres Finocchiaro wrote:
> M2 is ARM64
>
> Sorry, this is presumptuous since Apple offers Rosetta 2.

My instinct tells me to run it on bare x64 first, before trying to emulate it with Rosetta.
I also wonder whether MacOS would not complain when the architecture does not match (instead of just segfaulting).

How could I bribe you into compiling from source on MacOS arm64?

Cheers
Andreas

Tres Finocchiaro

Nov 23, 2023, 1:09:17 AM
to jna-...@googlegroups.com
I made an attempt to patch fpng-python for a baseline.


I've only tested it on M1, where it causes a segmentation fault on MacOS but not on Ubuntu (both ARM64).

I don't know Python well enough to claim that my patches to this project are correct, but I thought that if you had a working baseline for comparison, it would help narrow down the cause.

-Tres

Support

Nov 23, 2023, 1:13:44 AM
to jna-...@googlegroups.com
On Thu, 2023-11-23 at 01:09 -0500, Tres Finocchiaro wrote:
> I made an attempt to patch fpng-python for a baseline.
>
> I've only tested it on M1, but it causes a segmentation fault on MacOS, but not on Ubuntu (both ARM64).

You are awesome!

But please let's go back a step, as I am indeed running and testing against x64 MacOS (which should be close to x64 Linux).
And it runs in general, but segfaults often.

Apologies for asking stupid questions: As long as my test data are static, would the code not always fail if the memory allocation/freeing or the type mapping was an issue?
Why would it fail only on Apple, but not always?

Cheers
Andreas

PS: digesting the Python variant fixes now.

Daniel B. Widdis

Nov 23, 2023, 1:21:02 AM
to jna-...@googlegroups.com
First attempt on x86 macOS failed on the Java side, perhaps a clue?

> Task :maven-test:test

FPNGTest > encodeFPNGETest(String, int) > [1] example, 3 FAILED
    javax.imageio.IIOException at FPNGETest.java:32
        Caused by: java.io.EOFException at FPNGETest.java:32

Second attempt, two for one sale! (Same for third, fourth, fifth....)

> Task :maven-test:test

FPNGTest > encodeFPNGTest(String, int) > [1] example, 3 FAILED
    javax.imageio.IIOException at FPNGETest.java:21
        Caused by: java.io.EOFException at FPNGETest.java:21

FPNGTest > encodeFPNGETest(String, int) > [1] example, 3 FAILED
    javax.imageio.IIOException at FPNGETest.java:32
        Caused by: java.io.EOFException at FPNGETest.java:32

I'd suggest digging into how your code detects EOF.





--
Dan Widdis

Support

Nov 23, 2023, 1:26:51 AM
to jna-...@googlegroups.com
Oughhhh!!!

This is totally unrelated, since Java ImageIO fails to read the sample.png into a BufferedImage.
I have also seen this error on GitHub Runners, but again only for MacOS (not for Windows and not for Linux).

And this code is pure Java only (before the native libs get called).

Pardon me for stressing you. What will happen with:

./gradlew jmh

or

./gradlew test

(This would build the native libs on your computer.)

Big thank you! I am grateful!
Andreas

Daniel B. Widdis

Nov 23, 2023, 1:30:07 AM
to jna-...@googlegroups.com
I was getting those failures with ./gradlew test.  The jmh one seems to run benchmarks normally with no crash.



--
Dan Widdis

Support

Nov 23, 2023, 1:33:41 AM
to jna-...@googlegroups.com
On Wed, 2023-11-22 at 22:29 -0800, Daniel B. Widdis wrote:
> I was getting those failures with ./gradlew test.  The jmh one seems to run benchmarks normally with no crash.

That's a strong hint, thank you.
The Tests copy the samples from the Benchmark resources. Although pure Java Filesystem, maybe the paths get broken.

Please share benchmark results -- I am so curious! 😄

In the meantime I will try to get a MacOS VM somehow in order to ensure working code structures.

Thank you again big time!


Tres Finocchiaro

Nov 23, 2023, 1:39:30 AM
to jna-...@googlegroups.com
I've heard that sosumi is a pretty quick way to fire up a macOS VM on a Linux host.

Daniel B. Widdis

Nov 23, 2023, 1:39:58 AM
to jna-...@googlegroups.com
Benchmark                                           (imageName)  Mode  Cnt     Score     Error  Units
FPNGEBenchmark.encode                               example.png  avgt    3     3.760 ±   0.282  ms/op
FPNGEBenchmark.encode                   looklet-look-scale6.png  avgt    3   374.481 ± 174.265  ms/op
FPNGEncoderBenchmark.encode                         example.png  avgt    3     8.693 ±   5.758  ms/op
FPNGEncoderBenchmark.encode             looklet-look-scale6.png  avgt    3   642.674 ± 319.904  ms/op
ImageIOEncoderBenchmark.encode                      example.png  avgt    3    96.261 ±   4.445  ms/op
ImageIOEncoderBenchmark.encode          looklet-look-scale6.png  avgt    3  2265.919 ± 467.095  ms/op
ObjectPlanetPNGEncoderBenchmark.encode              example.png  avgt    3    76.676 ±   5.559  ms/op
ObjectPlanetPNGEncoderBenchmark.encode  looklet-look-scale6.png  avgt    3  1653.728 ± 135.696  ms/op
PNGEncoderBenchmark.encode                          example.png  avgt    3    55.403 ±  13.296  ms/op
PNGEncoderBenchmark.encode              looklet-look-scale6.png  avgt    3  1129.696 ± 166.870  ms/op
PNGEncoderBenchmark.encodeFastest                   example.png  avgt    3    52.664 ±   4.737  ms/op
PNGEncoderBenchmark.encodeFastest       looklet-look-scale6.png  avgt    3   980.785 ± 163.863  ms/op



--
Dan Widdis

Support

Nov 23, 2023, 1:52:30 AM
to jna-...@googlegroups.com
On Wed, 2023-11-22 at 22:39 -0800, Daniel B. Widdis wrote:
> Benchmark                                           (imageName)  Mode  Cnt     Score     Error  Units
> FPNGEBenchmark.encode                               example.png  avgt    3     3.760 ±   0.282  ms/op
> FPNGEBenchmark.encode                   looklet-look-scale6.png  avgt    3   374.481 ± 174.265  ms/op
> FPNGEncoderBenchmark.encode                         example.png  avgt    3     8.693 ±   5.758  ms/op
> FPNGEncoderBenchmark.encode             looklet-look-scale6.png  avgt    3   642.674 ± 319.904  ms/op
> ImageIOEncoderBenchmark.encode                      example.png  avgt    3    96.261 ±   4.445  ms/op
> ImageIOEncoderBenchmark.encode          looklet-look-scale6.png  avgt    3  2265.919 ± 467.095  ms/op <-- ouch!
> ObjectPlanetPNGEncoderBenchmark.encode              example.png  avgt    3    76.676 ±   5.559  ms/op
> ObjectPlanetPNGEncoderBenchmark.encode  looklet-look-scale6.png  avgt    3  1653.728 ± 135.696  ms/op
> PNGEncoderBenchmark.encode                          example.png  avgt    3    55.403 ±  13.296  ms/op
> PNGEncoderBenchmark.encode              looklet-look-scale6.png  avgt    3  1129.696 ± 166.870  ms/op
> PNGEncoderBenchmark.encodeFastest                   example.png  avgt    3    52.664 ±   4.737  ms/op
> PNGEncoderBenchmark.encodeFastest       looklet-look-scale6.png  avgt    3   980.785 ± 163.863  ms/op

Very cool! Thank you so much.
It's interesting how slow Java on ARM64 is. The native libraries don't differ much from my benchmarks, but the java encoders are 3 times slower.

Also it shows that it runs on Mac ARM64, which is amazing.
Now I only need to find out about the problem with the binaries.

Daniel B. Widdis

Nov 23, 2023, 1:55:24 AM
to jna-...@googlegroups.com
These benchmarks are on my (well, my company's) x86 mac (2.6 GHz 6-Core Intel Core i7)

Let me go try the benchmarks on my M2 :) 



--
Dan Widdis

Daniel B. Widdis

Nov 23, 2023, 1:59:55 AM
to jna-...@googlegroups.com

Nope, even jmh crashes on my (ARM64) M2.

 


Support

Nov 23, 2023, 2:02:03 AM
to jna-...@googlegroups.com
On Wed, 2023-11-22 at 22:59 -0800, Daniel B. Widdis wrote:
> Nope, even jmh crashes on my (ARM64) M2.

But it compiled?
Are there dylibs in fpng/build and fpnge/build?

And how can I buy you a beer or coffee, my friend?

Cheers
Andreas

Daniel B. Widdis

Nov 23, 2023, 2:08:30 AM
to jna-...@googlegroups.com

Nope, doesn’t look like anything built. There are x86-64 images there that I presume were downloaded from the github fork.

 

No need to buy me beverages, just contribute to the open source community and pay it forward.

 


Support

Nov 23, 2023, 2:13:05 AM
to jna-...@googlegroups.com
On Wed, 2023-11-22 at 23:08 -0800, Daniel B. Widdis wrote:
> No need to buy me beverages, just contribute to the open source community and pay it forward.

Thank you! I tip my hat and leave you in peace for a while. Installing MacOS as we speak (x64 only though).
Cheers

Andreas

Support

Nov 28, 2023, 8:11:00 PM
to jna-...@googlegroups.com
Daniel and All,

compliments of the day and thank you for your patience.
I would like to share my latest findings regarding JNA and MacOS related to 

1) There seems to be a MacOS-specific JDK bug when reading certain 3-channel PNGs via InputStream into ImageIO. They claimed to have fixed that for JDK 11/12, but I still reproduced this reliably on the GitHub MacOS runner and on my Catalina VM.
I mitigated it successfully by writing the Image InputStream into a File first, before loading this File into ImageIO. (Although this should not make any difference.)

2) I still get the occasional Segfaults on MacOS, but only on the GitHub Runner. I have not been able to reproduce this on the Catalina VM (trying more than 20 times).
If anyone wants to lend a helping MacOS hand:

git clone --depth 1 https://github.com/manticore-projects/fpng-java.git
cd fpng-java
./gradlew :maven-test:test

I would be very interested if you experience segfaults and if you can tell me why.
Unfortunately I am unsure if this is a MacOS specific JNA problem or a MacOS problem (although I can't believe that a static test case fails randomly, when it runs stable on Linux and Windows).

3) I think I am going to write a Gradle plugin for copying the native binaries from Gradle `cpp-library` into a JNA-compatible LIBS folder, renaming the libraries according to the architecture and also applying UPX compression when possible.
Most of those parts I have already done as part of FPNG-JAVA. Would there be any interest in such a Gradle plugin?
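
The workaround from item 1 can be sketched roughly as follows. `ImageLoader.readViaTempFile` is a hypothetical helper, not the project's code, and the temp-file prefix and suffix are arbitrary. The sketch only shows the mechanics of the detour through a file; it does not explain why the direct stream read fails.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import javax.imageio.ImageIO;

final class ImageLoader {

    /** Spools the stream into a temporary file and lets ImageIO decode that file. */
    static BufferedImage readViaTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("sample-", ".png");
        try {
            // createTempFile already created the file, so the copy must be allowed to replace it.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            BufferedImage image = ImageIO.read(tmp.toFile());
            if (image == null) {
                // ImageIO.read returns null when no registered reader accepts the data.
                throw new IOException("No ImageIO reader could decode the data");
            }
            return image;
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: pass the path of an image file; its size is printed.
        try (InputStream in = Files.newInputStream(java.nio.file.Paths.get(args[0]))) {
            BufferedImage image = readViaTempFile(in);
            System.out.println(image.getWidth() + "x" + image.getHeight());
        }
    }
}
```

In a test this would replace a direct `ImageIO.read(inputStream)` call on a classpath resource stream.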

Thank you all and cheers
Andreas




Tres Finocchiaro

Nov 28, 2023, 9:28:07 PM
to jna-...@googlegroups.com
> 1) There seems to be a MacOS specific JDK bug, when reading certain 3 channel PNG via InputStream into ImageIO. They claimed to have fixed that for JDK 11/12 but I still reproduced this reliably on GitHub MacOS Runner and on my Catalina VM.
> I mitigated it successfully by writing the Image InputStream into a File first, before loading this File into ImageIO. (Although this should not make any difference.)

I remember a similar bug I had about 8 years ago.  Converting it to ARGB was my workaround.  Is there a chance that writing it to PNG adds an alpha channel?

With regards to testing, I tried to execute :maven-test:test on M1, but all I get are errors.

Support

Nov 28, 2023, 9:35:07 PM
to jna-...@googlegroups.com
Good Morning Tres.

On Tue, 2023-11-28 at 21:27 -0500, Tres Finocchiaro wrote:
> I remember a similar bug I had about 8 years ago.  Converting it to ARGB was my workaround.  Is there a chance that writing it to PNG adds an alpha channel?

Yes, but it defeats the purpose of the test, since 3-channel encoding is (supposed to be) faster and smaller than 4-channel encoding.
Never mind, my mitigation of this is effective. I just felt the obligation to point out this mess.

> With regards to testing, I tried to execute :maven-test:test on M1, but all I get are errors.

M1 is ARM64, right?
We do provide binaries for x64 only (since there is no ARM64 runner on Github), although I will investigate cross-compiling much later.
I would have expected a reasonable error message about "ARM64", no?

Would you mind sharing with me verbose log files, so I can see where it goes wrong and how to catch it gracefully? 
What is the best practice for simulating an ARM64 architecture (although Rosetta may blur things further)?

Sorry for asking so many questions, I am just an accountant.
Cheers

Andreas


Tres Finocchiaro

Nov 28, 2023, 9:53:47 PM
to jna-...@googlegroups.com
XCode can target ARM64 from Intel, but I don't know how to get cpp-library to produce them.  My other projects use CMAKE_OSX_ARCHITECTURES via CMake to do this from Intel.

Catalina's XCode should be new enough to target ARM64.  Naturally, the tests would have to be skipped.

With regards to logs, Gradle gives me a 50-line debug and a 8,000-line debug. Here they both are: https://gist.github.com/tresf/379d866a1bd25539c55b40ad0070b45e


Support

Nov 28, 2023, 10:00:28 PM
to jna-...@googlegroups.com
On Tue, 2023-11-28 at 21:53 -0500, Tres Finocchiaro wrote:
> With regards to logs, Gradle gives me a 50-line debug

Thank you! That's perfect:

> Task :maven-test:test

FPNGTest > encodeFPNGTest(String, int) > [1] example, 3 FAILED
    java.lang.UnsatisfiedLinkError at FPNGETest.java:38

It states that the native lib was not found in the Maven jar files (because it is not included). I only need to add a message saying that this is expected, since you are on ARM64.
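
A minimal sketch of such a check, run before the library is loaded so that the user sees a readable message instead of a bare UnsatisfiedLinkError. `NativeSupport` and both method names are hypothetical, and the rule "only x86-64 binaries are bundled" is taken from the statements in this thread rather than from the project's build files. The `os.arch` spellings handled here ("amd64", "x86_64", "aarch64", "arm64") are the common ones; a JVM may report others.

```java
import java.util.Locale;

final class NativeSupport {

    /** Maps the JVM's os.arch spellings onto one canonical name per architecture. */
    static String normalizeArch(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        if (arch.equals("amd64") || arch.equals("x86_64")) {
            return "x86-64";
        }
        if (arch.equals("aarch64") || arch.equals("arm64")) {
            return "aarch64";
        }
        return arch;
    }

    /** Throws a descriptive error when no native binary is bundled for this architecture. */
    static void requireBundledBinary(String osName, String osArch) {
        String arch = normalizeArch(osArch);
        if (!arch.equals("x86-64")) {
            throw new UnsupportedOperationException(
                    "No bundled native library for " + osName + " on " + arch
                    + ": only x86-64 binaries are shipped. Build the library from source for this architecture.");
        }
    }

    public static void main(String[] args) {
        try {
            requireBundledBinary(System.getProperty("os.name"), System.getProperty("os.arch"));
            System.out.println("Bundled native library should be available.");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that a check on `os.arch` describes the JVM, not the hardware: an x86-64 JVM running under Rosetta 2 on Apple Silicon would report an x86-64 architecture.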

What happens when you try to compile the Native Libs by yourself?

./gradlew :fpng-java:test :fpnge-java:test


Much obliged, cheers!

Andreas

Tres Finocchiaro

Nov 28, 2023, 10:04:19 PM
to jna-...@googlegroups.com
> ./gradlew :fpng-java:test :fpnge-java:test

It briefly shows a bunch of messages about X86_64 but they disappear and there are more similar errors.

Support

Nov 28, 2023, 10:07:09 PM
to jna-...@googlegroups.com
On Tue, 2023-11-28 at 22:04 -0500, Tres Finocchiaro wrote:
> It briefly shows a bunch of messages about X86_64 but they disappear and there are more similar errors.

Thanks again, I have stressed you enough.
I will need to find a way to get an ARM64 VM running on my x64 host (although Linux/BSD may be sufficient, there should not be many Apple Silicon *Servers* out there.)

Cheers
Andreas

Daniel B. Widdis

Nov 29, 2023, 12:54:26 AM
to jna-...@googlegroups.com

If you’re using this for developing Open Source software, consider applying for an account at the Gnu Compile Farm: https://gcc.gnu.org/wiki/CompileFarm

 

One of the machines you’ll have access to is cfarm104, an M1 Mac Mini (arm64).

 


Support

Nov 29, 2023, 1:09:59 AM
to jna-...@googlegroups.com
On Tue, 2023-11-28 at 21:54 -0800, Daniel B. Widdis wrote:

> If you’re using this for developing Open Source software, consider applying for an account at the Gnu Compile Farm: https://gcc.gnu.org/wiki/CompileFarm
>
> One of the machines you’ll have access to is cfarm104, an M1 Mac Mini (arm64).


Thank you so much! Exactly what I was looking for:

cfarm104 | 22 | Mac mini (2020) | arm64 | Apple M1 | 8 cores, 8 threads | 16.0 GB memory | 0 bytes disk usage | MacOSX 12.6, 21.6.0

Much appreciated and should be promoted more.
Cheers
Andreas


Support

Nov 29, 2023, 3:24:54 AM
to jna-...@googlegroups.com
I apologise upfront for breaking the scope of this list. But since there is not much good documentation out there it may be of interest:

1) A Gradle target machine can be added, and it will build targets for MacOS AARCH64 even on the MacOS X86 runner. No extra AARCH64 machine is needed, provided your C code is correct.

library {
    linkage = [Linkage.STATIC, Linkage.SHARED]
    targetMachines = [
        machines.linux.x86_64,
        machines.windows.x86, machines.windows.x86_64,
        machines.macOS.x86_64, machines.macOS.architecture("aarch64")
    ]
}

2) However, this won't work as is, since FPNG and FPNGe use SSE/AVX intrinsics. On AARCH64 the counterpart is NEON, which uses different headers and intrinsics:

#if defined(__SSE4_2__)
#include <nmmintrin.h>
#elif defined(__aarch64__)
#include <arm_neon.h>
#endif

// z[i] = x[i] + y[i]
void vadd(const int* x, const int* y, int* z, unsigned int count) {
    // process 4 integers (128bits) with simd
    unsigned int i = 0;
    for (; i + 4 <= count; i += 4) {
#if defined(__SSE4_2__)
        const __m128i vx = _mm_lddqu_si128((const __m128i*)(x + i));
        const __m128i vy = _mm_lddqu_si128((const __m128i*)(y + i));
        const __m128i vz = _mm_add_epi32(vx, vy);
        _mm_storeu_si128((__m128i*)(z + i), vz);
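#elif defined(__aarch64__)
        const int32x4_t vx = vld1q_s32(x + i);
        const int32x4_t vy = vld1q_s32(y + i);
        const int32x4_t vz = vaddq_s32(vx, vy);
        vst1q_s32(z + i, vz);
#else
        // no SIMD available: add the 4 elements one by one so z is still filled
        for (unsigned int k = 0; k < 4; ++k) {
            z[i + k] = x[i + k] + y[i + k];
        }
#endif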
    }

    // tail loop
    for (; i < count; ++i) {
        z[i] = x[i] + y[i];
    }
}

Long story short, I will need to edit the C++ sources to support AARCH64.
I am not really certain whether I will go through this pain, especially when it's not easy or pretty to get this stuff running in a normal VM.

Thank you all for the guidance and help, I think all my questions have been answered.
Cheers and all the best

Andreas




Tres Finocchiaro

Nov 29, 2023, 11:13:11 AM
to jna-...@googlegroups.com
Perhaps this header file will offer some insight:


I have to disclaim that I don't know much about c++/clang/gcc compiler flags, but the example in this header seems to mention "-mfloat-abi=softfp", which seems a bit dated since all ARM64 hardware should have hardware floating point support.  I assume this is still around for older ARM hardware, such as iOS devices.

Here's a copy that's on my M1:


Notice that this one says "-mfloat-abi=hard" 

With this knowledge, doing a GitHub crawl for this produces some CMake examples:

Which produces:

set(CMAKE_C_FLAGS "-march=native -mfpu=neon -mfloat-abi=hard ${CMAKE_C_FLAGS}")
set(CMAKE_CXX_FLAGS "-march=native -mfpu=neon -mfloat-abi=hard ${CMAKE_CXX_FLAGS}")

I would expect cpp-library to offer a way to provide these flags.

As a bare minimum, I would encourage enabling a debug flag which shows the verbose compiler output so that you can see what flags are currently being provided to the compiler.  This should also help confirm that when new flags are added that they're automatically picked up.

A quick glance at the CMakeLists file for fpng, it does appear to be missing some optimizations for non-SSE processors.  Since you're working from a copy of fpng (rather than a submodule) you don't appear to be using this, however if any compiler flags can be offered upstream, it'll help others.

Personally, I try to leverage CMake directly for these tasks, especially when the upstream repository uses it.  I would likely switch fpng-java to using a submodule and invoke CMake directly versus cpp-library, but from what I understand, your goal is to leverage Gradle as much as possible so I'll try not to be too prescriptive. :)

Support

Nov 29, 2023, 5:36:58 PM
to jna-...@googlegroups.com
On Wed, 2023-11-29 at 11:12 -0500, Tres Finocchiaro wrote:
> for fpng, it does appear to be missing some optimizations for non-SSE processors.  Since you're working from a copy of fpng (rather than a submodule) you don't appear to be using this, however if any compiler flags can be offered upstream, it'll help others.

Exactly. Both FPNG and FPNGe are x64 AVX/SSE only and do not implement the equivalent Neon calls.
This mostly affects Adler, CRC and maybe Deflate, as well as the RGBA byte swaps.

There is ZLIB-NG, which has optimised Adler, CRC and Deflate on AVX/SSE and also Neon. The byte swap is rather simple to translate.
So eventually I will have to sit down and start porting (although this exceeds my scope massively, I just wanted to have a fast PNG encoder).

And yes, I will of course feed any changes back to the FPNG and FPNGe developers once the dust has settled.

Cheers
Andreas


Support

Nov 29, 2023, 7:28:49 PM
to jna-...@googlegroups.com
All,

turns out you actually can simulate an AARCH64 guest on a x64 host via QEMU:

localhost:~# cat /proc/cpuinfo 
processor       : 0
BogoMIPS        : 125.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x1
CPU part        : 0xd07
CPU revision    : 0

ASIMD is the Neon instruction set, available on CORTEX-A57.
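
A small sketch for checking that flag programmatically inside the guest, for example before enabling Neon-specific tests. `CpuInfo.hasFeature` is a hypothetical helper; it assumes the Linux AARCH64 layout shown above, where the flags sit on a line starting with "Features" (x86 Linux uses a "flags" line instead, which this sketch does not read).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

final class CpuInfo {

    /** Returns true when a "Features" line lists the given flag as a whole word. */
    static boolean hasFeature(List<String> cpuinfoLines, String feature) {
        for (String line : cpuinfoLines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank separator lines between processors
            }
            String key = line.substring(0, colon).trim();
            if (key.equals("Features")) {
                List<String> flags = Arrays.asList(line.substring(colon + 1).trim().split("\\s+"));
                if (flags.contains(feature)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Path cpuinfo = Paths.get("/proc/cpuinfo");
        if (Files.isReadable(cpuinfo)) {
            System.out.println("asimd: " + hasFeature(Files.readAllLines(cpuinfo), "asimd"));
        } else {
            System.out.println("/proc/cpuinfo is not available on this system");
        }
    }
}
```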

Important: use an Alpine AARCH64 image because other images kept failing/freezing (for whatever reason).

qemu-system-aarch64 -m 2048 -smp 4 -M virt -nographic \
-cpu cortex-a57 \
-bios QEMU_EFI.fd \
-drive if=none,file=alpine-virt-3.18.4-aarch64.iso,id=hd0 -device virtio-blk-device,drive=hd0 \
-device virtio-net-device,netdev=net0 -netdev user,hostfwd=tcp:127.0.0.1:2222-:22,id=net0

Sources:

For SSH'ing into this machine, just add a user image:

-drive file=user-data.img,format=raw

Now, install CLANG and then we should be able to port to Neon and compile and link the binaries. Once the C++ code runs on Linux AARCH64 it *should* compile on MacOS too.

Maybe this helps other adventurers.

Cheers
Andreas

PS: if interested in this please follow the discussion at https://github.com/manticore-projects/fpng-java/discussions/2 since I drift more and more off topic here. Apologies to the JNA team.