Support Linux/ppc64le arch


M G

Jan 7, 2025, 6:35:13 PM
to pdfium
I'm trying to build the pdfium binary so that I can then build the pypdfium2 Python wheel.

However, it seems the toolchain (vpython?) doesn't support Linux/ppc64le. This is my current process:

```bash
#centos7 OS

$ export PATH=$(pwd)/depot_tools:$PATH
$ mkdir pypdfium && cd pypdfium

$ fetch --nohooks pdfium
Bundled Python 3.11 not found. Use VPYTHON_BYPASS if prebuilt cpython not available on this platform: open /depot_tools/.cipd_bin/.cipd/pkgs/0/yIqdB2JmeFcl0j2lRWic0BJJ_bz7nnMF1oM_HP5bvcAC/3.11/.versions/cpython3.cipd_version: no such file or directory

```

I tried using the VPYTHON_BYPASS environment variable, but I end up with different errors. Is there an alternative way to compile pypdfium on non-supported platforms like Linux/ppc64le?

Thank you!

Lei Zhang

Jan 7, 2025, 7:07:32 PM
to M G, pdfium
What is the next error about VPYTHON_BYPASS?


geisserml

Jan 8, 2025, 10:54:25 AM
to pdfium
pypdfium2 maintainer here. Unfortunately, AFAIK, building pdfium on non-standard platforms is troublesome. The pypdfium2 readme says:
"Building PDFium may take a long time, as it comes with its bundled toolchain and deps, rather than taking them from the system. [...] PDFium may not compile on arbitrary hosts. The [build] script is limited to build hosts supported by Google's toolchain. Ideally, we'd need an alternative build system that runs with system packages instead."


> is there an alternative to compile pypdfium on non-supported platforms like Linux/ppc64le?
You could try to cross-compile pdfium from x86_64 by setting `target_cpu = "ppc64"` in args.gn, but unfortunately, with s390x, we've had the experience that this doesn't work and produces an x86_64 binary anyway.
Otherwise, you could see if your distributor provides libreoffice and builds it with pdfium. Unfortunately, I think Red Hat / that podman container image probably don't.
(Note that libreoffice builds pdfium through their own build system, but I'm not sure if it can be used standalone for just pdfium.)

geisserml

Jan 8, 2025, 11:02:18 AM
to pdfium
> You could try to cross-compile pdfium from x86_64 by setting `target_cpu = "ppc64"` in args.gn

Though, this might actually be big endian, and I think you are looking for little endian (which does not seem to be available in `gn help target_cpu` -> "Possible values"), so this may not be an option anyway. :/

M G

Jan 8, 2025, 12:42:42 PM
to pdfium
I'm sorry for the poor formatting here. When running VPYTHON_BYPASS I see:

```bash
fetch --nohooks pdfium
//depot_tools/fetch.py:82: DeprecationWarning: Use shutil.which instead of find_executable
  if not spawn.find_executable('gclient'):
Running: gclient root
Traceback (most recent call last):
  File "//depot_tools/gclient.py", line 108, in <module>
    import gclient_scm
  File "/depot_tools/gclient_scm.py", line 23, in <module>
    import gerrit_util
  File "/depot_tools/gerrit_util.py", line 36, in <module>
    import httplib2
ModuleNotFoundError: No module named 'httplib2'
//depot_tools/fetch.py:82: DeprecationWarning: Use shutil.which instead of find_executable
  if not spawn.find_executable('gclient'):
Running: gclient config --spec 'solutions = [
  {
    "name": "pdfium",
    "url": "https://pdfium.googlesource.com/pdfium.git",
    "managed": False,
    "custom_vars": {},
  },
]
'
Errors:
  failed to resolve infra/3pp/tools/git/linux-ppc64le@version:2...@2.41.0.chromium.11 (line 24): no such package: infra/3pp/tools/git/linux-ppc64le
  failed to resolve infra/3pp/tools/cpython3/linux-ppc64le@version:2...@3.11.8.chromium.35 (line 21): no such package: infra/3pp/tools/cpython3/linux-ppc64le
//depot_tools/bootstrap_python3: line 32: boots...@3.11.8.chromium.35_bin/python3/bin/python3: No such file or directory
Traceback (most recent call last):
  File "//depot_tools/gclient.py", line 108, in <module>
    import gclient_scm
  File "/depot_tools/gclient_scm.py", line 23, in <module>
    import gerrit_util
  File "/depot_tools/gerrit_util.py", line 36, in <module>
    import httplib2
ModuleNotFoundError: No module named 'httplib2'
Subprocess failed with return code 1.
```

I can manually install these libs, but I guess fundamentally vpython has its own wheels, which do not support ppc64le.

M G

Jan 8, 2025, 12:42:42 PM
to pdfium
Ah I haven't thought about cross-compiling - I'll give that a try!

Usually things work a bit more easily on Linux on Power since we don't have the endianness challenge (which exists e.g. on Power AIX or IBM Z/s390x).
Although ppc64 indeed seems big endian.

I already tried installing libreoffice (on manylinux2014 which is centos7) but it doesn't come with a pdfium binary, also `rpm -qa | grep pdfium` does not show anything :/
Only pdfimport is shown

```bash
$ rpm -qa | grep pdf
libreoffice-pdfimport-5.3.6.1-26.el7_9.ppc64le
```

If all of this fails, is there another way forward? I'm a bit afraid that porting the whole toolchain might be a lot of work!

Thanks a lot for your ideas, support & quick answers here!

Miklos Vajna

Jan 8, 2025, 1:21:27 PM
to pdfium
Hi,

On Wed, Jan 08, 2025 at 09:33:11AM -0800, M G <marvin....@gmail.com> wrote:
> Only pdfimport is shown
>
> ```bash
>
> $ rpm -qa | grep pdf
>
> libreoffice-pdfimport-5.3.6.1-26.el7_9.ppc64le
> ```

That's unrelated, libreoffice has a poppler-based (default) and
pdfium-based (non-default) PDF importer.

The RPM spec disables pdfium:

https://src.fedoraproject.org/rpms/libreoffice/blob/rawhide/f/libreoffice.spec#_1214

So no, you won't get a pdfium binary there.

> >> (Note that libreoffice build pdfium through their own build system
> >> <https://github.com/LibreOffice/core/tree/master/external/pdfium>, but
> >> I'm not sure if it can be used standalone for just pdfium.)

That uses gbuild (libreoffice's home-grown build system on top of make)
and it wasn't tested on anything exotic like ppc64le, so I don't think
that's a promising direction.

Sorry. :-)

Regards,

Miklos

geisserml

Jan 8, 2025, 1:50:58 PM
to pdfium
FWIW, debian (and probably ubuntu) do build libreoffice with pdfium, on ppc64el (see [1] and search for libpdfiumlo.so). So it seems to be possible in theory ;)
The debian folks are even building Chromium on ppc64el. Perhaps you can find some hints in the workflows/patches that they use.

[1]: https://packages.debian.org/bookworm/ppc64el/libreoffice-core/filelist

Miklos Vajna

Jan 9, 2025, 2:36:18 AM
to geisserml, pdfium
Hi,

On Wed, Jan 08, 2025 at 10:50:58AM -0800, geisserml <geis...@gmail.com> wrote:
> FWIW, debian (and probably ubuntu) do build libreoffice with pdfium, on
> ppc64el (see [1] and search for libpdfiumlo.so). So it seems to be possible
> in theory ;)
> The debian folks are even building Chromium on ppc64el. Perhaps you can
> find some hints in the workflows/patches that they use.

Oh, I see. libreoffice has a set of allowed build systems to use for
externals, gn is not one of them, so the usual workaround is to just
take the c++ files of the external and build them using own makefiles
(these use libreoffice's gbuild macros), that's what I did for pdfium
years ago (and just tweak those as the pdfium version updates as
necessary). If that works for ppc64le by accident, great. I ~only did
explicit testing on x86_64.

Still, if you would need pdfium on ppc64le in other contexts, perhaps
best to just take those makefiles as an example to help understanding
the original gn build system -- but otherwise best to build your own
makefiles.

Or even better if you find out how to get ppc64le working with gn,
fixing the root of the problem. :-)

Regards,

Miklos

M G

Jan 15, 2025, 11:07:52 AM
to pdfium
I was able to successfully build pdfium for ppc64le, but I figured out that my library (which I built using the pdfium-binaries GitHub project) is linked against a newer glibc (2.27 .. 2.29), whereas aarch64 & x86_64 have a max glibc of 2.17 (which can be used in Python manylinux wheels and has greater compatibility, AFAIU).
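
For context on why 2.17 matters: under PEP 600, a wheel's platform tag encodes the newest glibc version the binary requires, and `manylinux2014` is the legacy alias for glibc 2.17. A minimal sketch of that naming rule follows. The helper name `manylinux_tag` is made up for illustration and is not part of any packaging tool.

```python
def manylinux_tag(glibc_version, arch):
    """Build a PEP 600 platform tag from a (major, minor) glibc version.

    Hypothetical helper: it only formats the tag and does not check
    whether the binary really stays within that glibc version.
    """
    major, minor = glibc_version
    return f"manylinux_{major}_{minor}_{arch}"


# glibc 2.17 corresponds to the legacy alias manylinux2014.
print(manylinux_tag((2, 17), "ppc64le"))
# A library that needs GLIBC_2.29 symbols would need this tag instead.
print(manylinux_tag((2, 29), "ppc64le"))
```

A library that imports any `GLIBC_2.29` symbol can at best be tagged `manylinux_2_29`, which excludes every system with an older glibc.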

I assume the way I built my debian_bullseye_ppc64el_sysroot is missing some symbol extraction step which I've read about in some other issues, but I'm not sure whether that is still done.

A bit more detailed description of the issue with glibc is here: https://github.com/bblanchon/pdfium-binaries/issues/187

The main patches I needed to apply (besides providing sysroot and libclang_rt.builtins) can be found here: https://github.com/mgiessing/pdfium-binaries/blob/ppc64le/patches/build_config_ppc64le.patch

If it's really an issue with my sysroot, I'd be happy to understand how to build it correctly with a MAX_GLIBC of 2.17.

geisserml

Jan 15, 2025, 1:28:09 PM
to pdfium
Great to hear you got so far already!

As for the glibc symbols, I vaguely remember https://github.com/bblanchon/pdfium-binaries/issues/82 and a `reversion_glibc.py` script, but you're probably already aware of that.
Unfortunately I can't help any further (in the end I'm only a bindings writer/packager), but perhaps Lei Zhang or Benoît Blanchon can.

PS: Once the patches have matured, it would be great if you could contribute them to pdfium/pdfium-binaries, then we might be able to package ppc64le wheels directly at pypdfium2.

M G

Jan 16, 2025, 4:56:46 AM
to pdfium
Yes, I saw the reversion_glibc.py script (which is removed in the current checkout btw.) but it had no effect on my libc/libm :/
I will also try to upstream patches to pdfium-binaries & pypdfium2, but I guess first I need to file proper PRs to build (the correct) ppc64el sysroot & the clang_rt.builtins for Power.
Unfortunately I have no experience with google git/gerrit (yet).

In regards to the glibc issue I investigated things further and very similar to what was mentioned in the issue you referenced I'm almost certain there must be a different way to build the sysroot or any kind of post-processing that handles these symbols:

1.) I checked what symbols exactly cause 2.27/2.29:

readelf --symbols ppc64le_ninja_out/libpdfium.so | grep -E 'GLIBC_2\.27|GLIBC_2\.29'
    47: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND powf@GLIBC_2.27 (5)
    48: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND log2@GLIBC_2.29 (6)
    52: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND logf@GLIBC_2.27 (5)
    85: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND pow@GLIBC_2.29 (6)
    86: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND exp@GLIBC_2.29 (6)
    89: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND log@GLIBC_2.29 (6)


It seems to be the math functions!

2.) After investigating the ppc64el-sysroot which I built manually I saw the GLIBC2.29 is the default (AFAIK double @ indicates default; I tested log, exp and pow).

$ pushd ${MANUALLY_BUILT_PPC64LE_SYSROOT}/lib
$ nm -D powerpc64le-linux-gnu/libm.so.6 | grep -E ' log2@'
0000000000056470 T log2@@GLIBC_2.29
00000000000114c0 T log2@GLIBC_2.17


3.) When I compared it to the arm64-sysroot I downloaded (referenced in sysroots.json) I saw GLIBC2.17 is the default (double @)

$ pushd ${DOWNLOADED_ARM64_SYSROOT}/lib
$ nm -D aarch64-linux-gnu/libm.so.6 | grep -E ' log2@'
000000000003d3e0 T log2@GLIBC_2.29
000000000000ea60 T log2@@GLIBC_2.17


4.) As a last test I built the arm64 sysroot again manually using the build script: `sysroot_creator.py build arm64` and I saw the same behaviour as with the manually built ppc64el sysroot => double @ for GLIBC2.29:

$ pushd ${MANUALLY_BUILT_ARM64_SYSROOT}/lib
$ nm -D aarch64-linux-gnu/libm.so.6 | grep -E ' log2@'
000000000003d3e0 T log2@@GLIBC_2.29
000000000000ea60 T log2@GLIBC_2.17

So if someone could tell me how you usually build the sysroot for arm, arm64, x64, x86 that would be a great help :) 
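
The check in step 1 can be automated. Below is a rough sketch that scans `readelf --symbols` output for undefined symbols requiring a glibc newer than a given limit. The function name `glibc_requirements` is hypothetical, the first two sample rows are taken from the output above, and the `memcpy` row is an invented example of a symbol within the limit.

```python
import re

# Matches a versioned symbol reference such as "powf@GLIBC_2.27".
GLIBC_REF = re.compile(r"(\w+)@+GLIBC_(\d+(?:\.\d+)+)")


def glibc_requirements(readelf_output, limit=(2, 17)):
    """Return {symbol: version tuple} for undefined symbols newer than limit."""
    too_new = {}
    for line in readelf_output.splitlines():
        if " UND " not in line:
            continue  # only undefined (imported) symbols create a requirement
        match = GLIBC_REF.search(line)
        if match is None:
            continue
        version = tuple(int(part) for part in match.group(2).split("."))
        if version > limit:
            too_new[match.group(1)] = version
    return too_new


SAMPLE = """\
    47: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND powf@GLIBC_2.27 (5)
    48: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND log2@GLIBC_2.29 (6)
    10: 0000000000000000     0 FUNC    GLOBAL DEFAULT   UND memcpy@GLIBC_2.17 (2)
"""

print(glibc_requirements(SAMPLE))
```

Comparing versions as integer tuples avoids the string-comparison trap where "2.2.5" would sort after "2.17".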

Lei Zhang

Jan 16, 2025, 9:45:14 PM
to M G, pdfium
On Thu, Jan 16, 2025 at 1:56 AM M G <marvin....@gmail.com> wrote:
> So if someone could tell me how you usually build the sysroot for arm, arm64, x64, x86 that would be a great help :)

Look in build/linux/sysroot_scripts in the Chromium source repo.
https://chromium.googlesource.com/chromium/src/+/main/build/linux/sysroot_scripts

M G

Jan 17, 2025, 4:43:03 AM
to pdfium
I very much appreciate the feedback. However, that is exactly the script I'm already using, and there is not much documentation about the specifics :)

Is there anything else to consider with regard to the host system I'm using to build the sysroot?
I don't think so as the sysroot_creator.py script basically downloads a lot of stuff right?

Here is a very detailed description of how I built things (for the sake of simplicity I'm rebuilding amd64 to showcase the issue):

1.) Build new amd64 sysroot
mkdir -p $HOME/build-deb-sysroot
git clone https://chromium.googlesource.com/chromium/src/build.git $HOME/build-deb-sysroot/build
podman run -ti -v $HOME/build-deb-sysroot:/root/build-deb-sysroot ubuntu:22.04
apt update && apt install -y build-essential python3 python3-requests git file wget
cd /root/build-deb-sysroot/
./build/linux/sysroot_scripts/sysroot_creator.py build amd64

2.) Get the "prebuilt" sysroot from sysroots.json to compare (download)
mkdir prebuilt-out
tar -xf prebuilt-amd64.tar.xz -C prebuilt-out


3.) Compare the symbols in their libm.so.6
3a) Prebuilt (downloaded) => looks good, it has the default glibc<2.17
nm -D prebuilt-out/lib/x86_64-linux-gnu/libm.so.6 | grep -i " pow@"
0000000000040040 T pow@GLIBC_2.29
0000000000010020 T pow@@GLIBC_2.2.5

3b) Manually built => looks bad, it has the default glibc at 2.29
nm -D out/sysroot-build/bullseye/bullseye_amd64_staging/lib/x86_64-linux-gnu/libm.so.6 | grep -i " pow@"
0000000000040040 A pow@@GLIBC_2.29
0000000000010020 A pow@GLIBC_2.2.5

Can you confirm this behaviour?
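
To make the 3a/3b comparison less error-prone than reading `@` versus `@@` by eye, here is a small sketch that extracts the default version of each symbol from `nm -D` output. The function name `default_versions` is made up for illustration, and the sample text is the output pasted above.

```python
def default_versions(nm_output):
    """Map each symbol to its default version, i.e. the one marked with '@@'."""
    defaults = {}
    for line in nm_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[-1]  # the versioned symbol is the last column of `nm -D`
        if "@@" in name:
            symbol, version = name.split("@@", 1)
            defaults[symbol] = version
    return defaults


PREBUILT = """\
0000000000040040 T pow@GLIBC_2.29
0000000000010020 T pow@@GLIBC_2.2.5
"""

MANUAL = """\
0000000000040040 A pow@@GLIBC_2.29
0000000000010020 A pow@GLIBC_2.2.5
"""

print(default_versions(PREBUILT))  # {'pow': 'GLIBC_2.2.5'}
print(default_versions(MANUAL))    # {'pow': 'GLIBC_2.29'}
```

The linker binds new references to the default version, so the manually built sysroot is what makes the resulting library require GLIBC_2.29.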

M G

Jan 17, 2025, 9:10:17 AM
to pdfium
Okay, I managed to get a sysroot with patched glibc and finally also a "clean" libpdfium.so. Essentially two things needed to be changed:

1.) Checkout the correct chromium/src/build.git

The sysroots were uploaded on 1 May 2024, so I checked out the matching commit, which indeed still had the reversion_glibc.py script available. With that commit, arm64 & amd64 were correctly built with a max glibc of 2.17.
However, for powerpc64le this was still not the case.

2.) Change the reversion_glibc.py script

The script changes the default glibc version of libc, libm and libcrypt by splitting the readelf output on spaces, but on powerpc64le the format is slightly different:

#power sample
317: 000000000004d960 1756 FUNC GLOBAL DEFAULT [<localentry>: 8] 12 pow@@GLIBC_2.29

#amd64 sample
861: 0000000000040040 182 FUNC GLOBAL DEFAULT 14 pow@@GLIBC_2.29

After editing the script to strip the "[<localentry>: *]" field, everything worked fine and I got a clean glibc 2.17 pdfium library on Power!
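
For anyone hitting the same problem, the idea can be sketched roughly as follows. This is not the actual change to reversion_glibc.py (whose code is not shown in this thread), only an illustration of removing the extra ppc64le annotation before splitting on whitespace. The names `LOCALENTRY` and `split_symbol_line` are hypothetical, and the two rows are the samples above.

```python
import re

# On ppc64le, readelf may print an extra "[<localentry>: N]" annotation.
LOCALENTRY = re.compile(r"\[<localentry>:\s*\d+\]")


def split_symbol_line(line):
    """Split a readelf symbol row into its eight standard columns."""
    fields = LOCALENTRY.sub("", line).split()
    if len(fields) < 8:
        raise ValueError(f"unexpected symbol row: {line!r}")
    keys = ("num", "value", "size", "type", "bind", "vis", "ndx", "name")
    # Any trailing extras, such as a version index, are ignored.
    return dict(zip(keys, fields[:8]))


POWER_ROW = "317: 000000000004d960 1756 FUNC GLOBAL DEFAULT [<localentry>: 8] 12 pow@@GLIBC_2.29"
AMD64_ROW = "861: 0000000000040040 182 FUNC GLOBAL DEFAULT 14 pow@@GLIBC_2.29"

print(split_symbol_line(POWER_ROW)["name"])
print(split_symbol_line(AMD64_ROW)["name"])
```

With the annotation removed, both rows yield the same column layout, so any logic that indexes columns by position behaves the same on Power and on amd64.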

Thanks a lot for all your suggestions & help!

geisserml

Jan 17, 2025, 3:43:49 PM
to pdfium
Very nice, congrats!

geisserml

May 5, 2025, 1:04:13 PM
to pdfium
Late update: pypdfium2 now also comes with a native build script that should be portable across different Linux architectures (and possibly other OSes if they are handled by the build system and provide a compatible system library environment). This was possible with pdfium's GN build system after all, with a few tricks. (Thanks to the libpdfium-nojs AUR and libpdfium COPR recipe authors for showing how to do this!)

However, I believe it would still make sense to handle ppc64le in the toolchain, for cross-compilation, symbol reversioning, inclusion of dependency libraries etc.
Alternatively, perhaps using a PyPA manylinux container + auditwheel might also work. (polyfill-glibc only supports x86_64 and aarch64 AFAIK.)