Support Linux/ppc64le arch


M G

Jan 7, 2025, 6:35:13 PM
to pdfium
I'm trying to build the pdfium binary so that I can then build the pypdfium2 Python wheel.

However, it seems the toolchain (vpython?) doesn't support Linux/ppc64le. This is my current process:

```bash
#centos7 OS

$ export PATH=$(pwd)/depot_tools:$PATH
$ mkdir pypdfium && cd pypdfium

$ fetch --nohooks pdfium
Bundled Python 3.11 not found. Use VPYTHON_BYPASS if prebuilt cpython not available on this platform: open /depot_tools/.cipd_bin/.cipd/pkgs/0/yIqdB2JmeFcl0j2lRWic0BJJ_bz7nnMF1oM_HP5bvcAC/3.11/.versions/cpython3.cipd_version: no such file or directory

```

I tried using the VPYTHON_BYPASS env var, but I ended up with different errors. Is there an alternative way to compile pypdfium on non-supported platforms like Linux/ppc64le?

Thank you!

Lei Zhang

Jan 7, 2025, 7:07:32 PM
to M G, pdfium
What is the next error about VPYTHON_BYPASS?


geisserml

Jan 8, 2025, 10:54:25 AM
to pdfium
pypdfium2 maintainer here. Unfortunately, AFAIK, building pdfium on non-standard platforms is troublesome. The pypdfium2 readme says:
"Building PDFium may take a long time, as it comes with its bundled toolchain and deps, rather than taking them from the system. [...] PDFium may not compile on arbitrary hosts. The [build] script is limited to build hosts supported by Google's toolchain. Ideally, we'd need an alternative build system that runs with system packages instead."


> is there an alternative to compile pypdfium on non-supported platforms like Linux/ppc64le?
You could try to cross-compile pdfium from x86_64 by setting `target_cpu = ppc64` in args.gn, but unfortunately, with s390x, we've had the experience that this doesn't work, and produces an x86_64 binary anyway.
Otherwise, you could see if your distributor provides libreoffice, and builds it with pdfium. Unfortunately, I think Red Hat / that podman container image probably don't.
(Note that libreoffice builds pdfium through their own build system, but I'm not sure if it can be used standalone for just pdfium.)

geisserml

Jan 8, 2025, 11:02:18 AM
to pdfium
> You could try to cross-compile pdfium from x86_64 by setting `target_cpu = ppc64` in args.gn

Though, this might actually be big endian, and I think you are looking for little endian (which does not seem to be available in `gn help target_cpu` -> "Possible values"), so this may not be an option anyway. :/
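
For anyone wanting to confirm which variant a host or container really is, here is a minimal Python sketch using only the standard library. The helper name `describe_host` is hypothetical, and the values mentioned in the comment are what one would typically expect on Linux rather than anything guaranteed.

```python
import platform
import sys


def describe_host(machine: str, byteorder: str) -> str:
    """Return a short label such as 'ppc64le (little endian)'."""
    return f"{machine} ({byteorder} endian)"


# platform.machine() typically reports 'ppc64le' on little-endian Power Linux
# and 'ppc64' on big-endian; sys.byteorder is either 'little' or 'big'.
print(describe_host(platform.machine(), sys.byteorder))
```

Running this inside the target container shows at a glance whether a big-endian `ppc64` target would even match the machine in question.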

M G

Jan 8, 2025, 12:42:42 PM
to pdfium
I'm sorry for the poor formatting here. When running VPYTHON_BYPASS I see:

```bash
fetch --nohooks pdfium
//depot_tools/fetch.py:82: DeprecationWarning: Use shutil.which instead of find_executable
  if not spawn.find_executable('gclient'):
Running: gclient root
Traceback (most recent call last):
  File "//depot_tools/gclient.py", line 108, in <module>
    import gclient_scm
  File "/depot_tools/gclient_scm.py", line 23, in <module>
    import gerrit_util
  File "/depot_tools/gerrit_util.py", line 36, in <module>
    import httplib2
ModuleNotFoundError: No module named 'httplib2'
//depot_tools/fetch.py:82: DeprecationWarning: Use shutil.which instead of find_executable
  if not spawn.find_executable('gclient'):
Running: gclient config --spec 'solutions = [
  {
    "name": "pdfium",
    "url": "https://pdfium.googlesource.com/pdfium.git",
    "managed": False,
    "custom_vars": {},
  },
]
'
Errors:
  failed to resolve infra/3pp/tools/git/linux-ppc64le@version:2...@2.41.0.chromium.11 (line 24): no such package: infra/3pp/tools/git/linux-ppc64le
  failed to resolve infra/3pp/tools/cpython3/linux-ppc64le@version:2...@3.11.8.chromium.35 (line 21): no such package: infra/3pp/tools/cpython3/linux-ppc64le
//depot_tools/bootstrap_python3: line 32: boots...@3.11.8.chromium.35_bin/python3/bin/python3: No such file or directory
Traceback (most recent call last):
  File "//depot_tools/gclient.py", line 108, in <module>
    import gclient_scm
  File "/depot_tools/gclient_scm.py", line 23, in <module>
    import gerrit_util
  File "/depot_tools/gerrit_util.py", line 36, in <module>
    import httplib2
ModuleNotFoundError: No module named 'httplib2'
Subprocess failed with return code 1.
```

I can manually install these libs, but I guess fundamentally vpython has its own wheels, which do not support ppc64le.

M G

Jan 8, 2025, 12:42:42 PM
to pdfium
Ah I haven't thought about cross-compiling - I'll give that a try!

Usually things work a bit easier on Linux on Power since we don't have the endianness challenge (that exists e.g. on Power AIX or IBM z/s390x).
Although ppc64 indeed seems to be big endian.

I already tried installing libreoffice (on manylinux2014 which is centos7) but it doesn't come with a pdfium binary, also `rpm -qa | grep pdfium` does not show anything :/
Only pdfimport is shown

```bash
$ rpm -qa | grep pdf
libreoffice-pdfimport-5.3.6.1-26.el7_9.ppc64le
```

If all of this fails, is there another way forward? I'm a bit afraid that porting the whole toolchain might be a lot of work!

Thanks a lot for your ideas, support & quick answers here!

Miklos Vajna

Jan 8, 2025, 1:21:27 PM
to pdfium
Hi,

On Wed, Jan 08, 2025 at 09:33:11AM -0800, M G <marvin....@gmail.com> wrote:
> Only pdfimport is shown
>
> ```bash
>
> $ rpm -qa | grep pdf
>
> libreoffice-pdfimport-5.3.6.1-26.el7_9.ppc64le
> ```

That's unrelated, libreoffice has a poppler-based (default) and
pdfium-based (non-default) PDF importer.

The RPM spec disables pdfium:

https://src.fedoraproject.org/rpms/libreoffice/blob/rawhide/f/libreoffice.spec#_1214

So no, you won't get a pdfium binary there.

> >> (Note that libreoffice build pdfium through their own build system
> >> <https://github.com/LibreOffice/core/tree/master/external/pdfium>, but
> >> I'm not sure if it can be used standalone for just pdfium.)

That uses gbuild (libreoffice's home-grown build system on top of make)
and it wasn't tested on anything exotic like ppc64le, so I don't think
that's a promising direction.

Sorry. :-)

Regards,

Miklos

geisserml

Jan 8, 2025, 1:50:58 PM
to pdfium
FWIW, debian (and probably ubuntu) do build libreoffice with pdfium, on ppc64el (see [1] and search for libpdfiumlo.so). So it seems to be possible in theory ;)
The debian folks are even building Chromium on ppc64el. Perhaps you can find some hints in the workflows/patches that they use.

[1]: https://packages.debian.org/bookworm/ppc64el/libreoffice-core/filelist

Miklos Vajna

Jan 9, 2025, 2:36:18 AM
to geisserml, pdfium
Hi,

On Wed, Jan 08, 2025 at 10:50:58AM -0800, geisserml <geis...@gmail.com> wrote:
> FWIW, debian (and probably ubuntu) do build libreoffice with pdfium, on
> ppc64el (see [1] and search for libpdfiumlo.so). So it seems to be possible
> in theory ;)
> The debian folks are even building Chromium on ppc64el. Perhaps you can
> find some hints in the workflows/patches that they use.

Oh, I see. libreoffice has a set of allowed build systems to use for
externals, gn is not one of them, so the usual workaround is to just
take the c++ files of the external and build them using own makefiles
(these use libreoffice's gbuild macros), that's what I did for pdfium
years ago (and just tweak those as the pdfium version updates as
necessary). If that works for ppc64le by accident, great. I ~only did
explicit testing on x86_64.

Still, if you would need pdfium on ppc64le in other contexts, perhaps
best to just take those makefiles as an example to help understanding
the original gn build system -- but otherwise best to build your own
makefiles.

Or even better if you find out how to get ppc64le working with gn,
fixing the root of the problem. :-)

Regards,

Miklos

M G

Jan 15, 2025, 11:07:52 AM
to pdfium
I was able to successfully build pdfium for ppc64le, but I found that my library (which I built using the pdfium-binaries GitHub project) is linked against a newer glibc (2.27 .. 2.29), whereas the aarch64 & x86_64 builds have a max glibc of 2.17 (which means they can be used in Python manylinux wheels and have greater compatibility, AFAIU).

I assume the way I built my debian_bullseye_ppc64el_sysroot is missing some symbol extraction step which I've read about in some other issues, but I'm not sure whether that is still done.

A bit more detailed description of the issue with glibc is here: https://github.com/bblanchon/pdfium-binaries/issues/187

The main patches I needed to apply (besides providing sysroot and libclang_rt.builtins) can be found here: https://github.com/mgiessing/pdfium-binaries/blob/ppc64le/patches/build_config_ppc64le.patch

If it's really an issue with my sysroot, I'd be happy to learn how to build it correctly with a MAX_GLIBC of 2.17.

geisserml

Jan 15, 2025, 1:28:09 PM
to pdfium
Great to hear you got so far already!

As for the glibc symbols, I vaguely remember https://github.com/bblanchon/pdfium-binaries/issues/82 and a `reversion_glibc.py` script, but you're probably already aware of that.
Unfortunately I can't help any further (in the end I'm only a bindings writer/packager), but perhaps Lei Zhang or Benoît Blanchon can.

PS: Once the patches have matured, it would be great if you could contribute them to pdfium/pdfium-binaries, then we might be able to package ppc64le wheels directly at pypdfium2.

M G

Jan 16, 2025, 4:56:46 AM
to pdfium
Yes, I saw the reversion_glibc.py script (which is removed in the current checkout btw.) but it had no effect on my libc/libm :/
I will also try to upstream patches to pdfium-binaries & pypdfium2, but I guess I first need to file proper PRs to build (the correct) ppc64el sysroot & the clang_rt.builtins for Power.
Unfortunately I have no experience with google git/gerrit (yet).

In regards to the glibc issue I investigated things further and very similar to what was mentioned in the issue you referenced I'm almost certain there must be a different way to build the sysroot or any kind of post-processing that handles these symbols:

1.) I checked what symbols exactly cause 2.27/2.29:

readelf --symbols ppc64le_ninja_out/libpdfium.so | grep -E 'GLIBC_2\.27|GLIBC_2\.29'
    47: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND powf@GLIBC_2.27 (5)
    48: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND log2@GLIBC_2.29 (6)
    52: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND logf@GLIBC_2.27 (5)
    85: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND pow@GLIBC_2.29 (6)
    86: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND exp@GLIBC_2.29 (6)
    89: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND log@GLIBC_2.29 (6)


It seems to be the math functions!
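
The check above can also be scripted. The following is a rough sketch, not part of pdfium-binaries or the Chromium scripts: it scans `readelf --symbols` text for `GLIBC_x.y` version tags and reports the highest one. The function names are made up for illustration, and the sample input is taken from the output quoted above.

```python
import re
from typing import Optional, Set, Tuple

# Matches both 'sym@GLIBC_2.27' and 'sym@@GLIBC_2.29'.
GLIBC_TAG = re.compile(r"@GLIBC_(\d+(?:\.\d+)*)")


def glibc_versions(readelf_text: str) -> Set[Tuple[int, ...]]:
    """Collect every GLIBC version tag as a tuple of ints, e.g. (2, 29)."""
    return {
        tuple(int(part) for part in match.group(1).split("."))
        for match in GLIBC_TAG.finditer(readelf_text)
    }


def max_glibc(readelf_text: str) -> Optional[str]:
    """Return the highest GLIBC version referenced, or None if there is none."""
    versions = glibc_versions(readelf_text)
    if not versions:
        return None
    return ".".join(str(part) for part in max(versions))


SAMPLE = """\
    47: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND powf@GLIBC_2.27 (5)
    48: 0000000000000000     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]   UND log2@GLIBC_2.29 (6)
"""
print(max_glibc(SAMPLE))
```

Versions are compared as integer tuples rather than strings, so that 2.2.5 sorts below 2.17 instead of above it.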

2.) After investigating the ppc64el-sysroot which I built manually I saw the GLIBC2.29 is the default (AFAIK double @ indicates default; I tested log, exp and pow).

$ pushd ${MANUALLY_BUILT_PPC64LE_SYSROOT}/lib
$ nm -D powerpc64le-linux-gnu/libm.so.6 | grep -E ' log2@'
0000000000056470 T log2@@GLIBC_2.29
00000000000114c0 T log2@GLIBC_2.17


3.) When I compared it to the arm64-sysroot I downloaded (referenced in sysroots.json) I saw GLIBC2.17 is the default (double @)

$ pushd ${DOWNLOADED_ARM64_SYSROOT}/lib
$ nm -D aarch64-linux-gnu/libm.so.6 | grep -E ' log2@'
000000000003d3e0 T log2@GLIBC_2.29
000000000000ea60 T log2@@GLIBC_2.17


4.) As a last test I built the arm64 sysroot again manually using the build script, `sysroot_creator.py build arm64`, and I saw the same behaviour as for the manually built ppc64el sysroot => double @ for GLIBC2.29:

$ pushd ${MANUALLY_BUILT_ARM64_SYSROOT}/lib
$ nm -D aarch64-linux-gnu/libm.so.6 | grep -E ' log2@'
000000000003d3e0 T log2@@GLIBC_2.29
000000000000ea60 T log2@GLIBC_2.17
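
The comparison in steps 2 to 4 boils down to finding which version carries the double `@`. A small sketch of that, assuming plain `nm -D` text as input; the helper name `default_symbol_version` is hypothetical:

```python
from typing import Optional


def default_symbol_version(nm_text: str, symbol: str) -> Optional[str]:
    """Return the version tagged as default ('@@') for `symbol`, or None."""
    prefix = symbol + "@@"
    for line in nm_text.splitlines():
        fields = line.split()
        # The last column of `nm -D` output is 'name@VERSION' or 'name@@VERSION'.
        if fields and fields[-1].startswith(prefix):
            return fields[-1][len(prefix):]
    return None


MANUAL = """\
0000000000056470 T log2@@GLIBC_2.29
00000000000114c0 T log2@GLIBC_2.17
"""
PREBUILT = """\
000000000003d3e0 T log2@GLIBC_2.29
000000000000ea60 T log2@@GLIBC_2.17
"""
print(default_symbol_version(MANUAL, "log2"))
print(default_symbol_version(PREBUILT, "log2"))
```

For the two listings quoted above, this reports `GLIBC_2.29` for the manually built sysroot and `GLIBC_2.17` for the downloaded one, which is exactly the discrepancy in question.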

So if someone could tell me how you usually build the sysroot for arm, arm64, x64, x86 that would be a great help :) 

Lei Zhang

Jan 16, 2025, 9:45:14 PM
to M G, pdfium
On Thu, Jan 16, 2025 at 1:56 AM M G <marvin....@gmail.com> wrote:
> So if someone could tell me how you usually build the sysroot for arm, arm64, x64, x86 that would be a great help :)

Look in build/linux/sysroot_scripts in the Chromium source repo.
https://chromium.googlesource.com/chromium/src/+/main/build/linux/sysroot_scripts

M G

Jan 17, 2025, 4:43:03 AM
to pdfium
I very much appreciate the feedback; however, I'm already using exactly that script, but there is not much documentation about the specifics :)

Is there anything else to consider with regards of the host system I'm using to build the sysroot?
I don't think so as the sysroot_creator.py script basically downloads a lot of stuff right?

Here is a very detailed description of how I built things (for the sake of simplicity I'm rebuilding amd64 to showcase the issue):

1.) Build new amd64 sysroot
mkdir -p $HOME/build-deb-sysroot
git clone https://chromium.googlesource.com/chromium/src/build.git $HOME/build-deb-sysroot/build
podman run -ti -v $PWD/build-deb-sysroot:/root/build-deb-sysroot ubuntu:22.04
apt update && apt install -y build-essential python3 python3-requests git file wget
cd /root/build-deb-sysroot/
./build/linux/sysroot_scripts/sysroot_creator.py build amd64

2.) Get the "prebuilt" sysroot from sysroots.json to compare (download)
mkdir prebuilt-out
tar -xf prebuilt-amd64.tar.xz -C prebuilt-out


3.) Compare the symbols in their libm.so.6
3a) Prebuilt (downloaded) => looks good, it has the default glibc<2.17
nm -D prebuilt-out/lib/x86_64-linux-gnu/libm.so.6 | grep -i " pow@"
0000000000040040 T pow@GLIBC_2.29
0000000000010020 T pow@@GLIBC_2.2.5

3b) Manually built => looks bad, it has the default glibc at 2.29
nm -D out/sysroot-build/bullseye/bullseye_amd64_staging/lib/x86_64-linux-gnu/libm.so.6 | grep -i " pow@"
0000000000040040 A pow@@GLIBC_2.29
0000000000010020 A pow@GLIBC_2.2.5

Can you confirm this behaviour?

M G

Jan 17, 2025, 9:10:17 AM
to pdfium
Okay, I managed to get a sysroot with patched glibc and finally also a "clean" libpdfium.so. Essentially two things needed to be changed:

1.) Checkout the correct chromium/src/build.git

The sysroots were uploaded on 1st May 2024, so I checked out the matching commit, which indeed had the reversion_glibc.py script available. With that commit, arm64 & amd64 were correctly built with a max glibc of 2.17.
However, for powerpc64le this was still not the case.

2.) Change the reversion_glibc.py script

The script changes the default glibc version of libc, libm and libcrypt by splitting readelf output on spaces, but on powerpc64le the format is slightly different:

#power sample
317: 000000000004d960 1756 FUNC GLOBAL DEFAULT [<localentry>: 8] 12 pow@@GLIBC_2.29

#amd64 sample
861: 0000000000040040 182 FUNC GLOBAL DEFAULT 14 pow@@GLIBC_2.29

After editing the script to strip the "[<localentry>: *]" field, everything worked fine and I got a clean glibc 2.17 pdfium library on Power!
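
To illustrate the parsing problem, here is a minimal sketch of one way to handle both row formats. This is not the actual change made to reversion_glibc.py, whose code is not shown in this thread; the function name and the returned field names are assumptions chosen for the example.

```python
import re

# ppc64le readelf output carries an extra '[<localentry>: N]' annotation
# between the visibility and section-index columns.
LOCALENTRY = re.compile(r"\[<localentry>:\s*\d+\]")


def parse_symbol_line(line: str) -> dict:
    """Split one `readelf --symbols` row into named string fields."""
    # Drop the annotation first so that whitespace splitting yields the
    # same column layout on every architecture.
    fields = LOCALENTRY.sub(" ", line).split()
    if len(fields) < 8:
        raise ValueError(f"unexpected symbol line: {line!r}")
    num, value, size, sym_type, bind, vis, ndx, name = fields[:8]
    return {
        "num": num.rstrip(":"),
        "value": value,
        "size": size,
        "type": sym_type,
        "bind": bind,
        "vis": vis,
        "ndx": ndx,
        "name": name,
    }


power = "317: 000000000004d960 1756 FUNC GLOBAL DEFAULT [<localentry>: 8] 12 pow@@GLIBC_2.29"
amd64 = "861: 0000000000040040 182 FUNC GLOBAL DEFAULT 14 pow@@GLIBC_2.29"
print(parse_symbol_line(power)["name"], parse_symbol_line(amd64)["name"])
```

Without the substitution, a naive split on the Power row produces two extra columns, so any code that picks the name by position ends up reading the wrong field.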

Thanks a lot for all your suggestions & help!

geisserml

Jan 17, 2025, 3:43:49 PM
to pdfium
Very nice, congrats!

geisserml

May 5, 2025, 1:04:13 PM
to pdfium
Late update: pypdfium2 now also comes with a native build script that should be portable across different Linux architectures (and possibly other OSes if they are handled by the build system and provide a compatible system library environment). This was possible with pdfium's GN build system after all, with a few tricks. (Thanks to the libpdfium-nojs AUR and libpdfium COPR recipe authors for showing how to do this!)

However, I believe it would still make sense to handle ppc64le in the toolchain, for cross-compilation, symbol reversioning, inclusion of dependency libraries etc.
Alternatively, perhaps using a PyPA manylinux container + auditwheel might also work. (polyfill-glibc only supports x86_64 and aarch64 AFAIK.)

geisserml

Oct 31, 2025, 3:33:45 PM
to pdfium
Coming back to this again. I finally realized I don't need native hardware but can investigate this with an emulated podman container as in the original post.

The depot_tools issue from Comment 5 is actually quite manageable. I did
$ export VPYTHON_BYPASS="manually managed python not supported by chrome operations"
and then (after $ python3 -m ensurepip --upgrade && python3 -m pip install -U pip):
$ python3 -m pip install httplib2==0.22.0
httplib2 appears to be a pure Python package, and the only external dependency needed to make `gclient --help` work.
depot_tools currently needs httplib2 v0.22.0, because the httplib2.socks module has disappeared in v0.30.
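
A quick way to verify this precondition before invoking gclient is to probe for the modules from Python. This is only a sketch with a made-up helper name; it uses the standard `importlib.util.find_spec` and makes no claim about how depot_tools itself checks its dependencies.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Report whether `name` (e.g. 'httplib2.socks') can be located."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised for a dotted name when the parent package itself is missing.
        return False


for module in ("httplib2", "httplib2.socks"):
    print(module, "ok" if module_available(module) else "MISSING")
```

If `httplib2` is reported as present but `httplib2.socks` as missing, that would point to the too-new httplib2 release described above.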

Also note that you have to use the newer manylinux_2_28 container, where python3 is 3.11; the python 3.6 in manylinux2014 is too old for the depot_tools sources.
(Also CentOS 7 appears to be too old to build pdfium itself, at least when using gcc and system libcxx, I tried with pypdfium2's build_native.py.)

Then I did as usual:
$ gclient config --custom-var checkout_configuration=minimal --unmanaged https://pdfium.googlesource.com/pdfium
This shows errors
```
Errors:

  failed to resolve infra/3pp/tools/cpython3/linux-ppc64le@version:2...@3.11.8.chromium.35 (line 21): no such package: infra/3pp/tools/cpython3/linux-ppc64le
/home/test/depot_tools/bootstrap_python3: line 32: boots...@3.11.8.chromium.35_bin/python3/bin/python3: No such file or directory
```
but they appear to be benign and we can still proceed with sync:
$ gclient sync --revision origin/chromium/7191 --no-history --shallow
```
Errors:

  failed to resolve infra/3pp/tools/cpython3/linux-ppc64le@version:2...@3.11.8.chromium.35 (line 21): no such package: infra/3pp/tools/cpython3/linux-ppc64le
/home/test/depot_tools/bootstrap_python3: line 32: boots...@3.11.8.chromium.35_bin/python3/bin/python3: No such file or directory
Syncing projects: 100% (35/35), done.                                                                                                                                    

________ running 'cipd ensure -log-level error -root /home/test/lib -ensure-file /tmp/tmphurniz2u.ensure' in '.'
Errors:
  failed to resolve gn/gn/linux-ppc64le@git_revision:487f8353f15456474437df32bb186187b0940b45 (line 5): no such package: gn/gn/linux-ppc64le
  failed to resolve infra/rbe/client/linux-ppc64le@re_client_version:0.177.1.e58c0145-gomaip (line 8): no such package: infra/rbe/client/linux-ppc64le
  failed to resolve infra/3pp/tools/ninja/linux-ppc64le@version:3...@1.12.1.chromium.4 (line 11): no such package: infra/3pp/tools/ninja/linux-ppc64le
Error: Command 'cipd ensure -log-level error -root /home/test/lib -ensure-file /tmp/tmphurniz2u.ensure' returned non-zero exit status 1
Errors:
  failed to resolve gn/gn/linux-ppc64le@git_revision:487f8353f15456474437df32bb186187b0940b45 (line 5): no such package: gn/gn/linux-ppc64le
  failed to resolve infra/rbe/client/linux-ppc64le@re_client_version:0.177.1.e58c0145-gomaip (line 8): no such package: infra/rbe/client/linux-ppc64le
  failed to resolve infra/3pp/tools/ninja/linux-ppc64le@version:3...@1.12.1.chromium.4 (line 11): no such package: infra/3pp/tools/ninja/linux-ppc64le
```
Errors again, but it is anticipated that some binaries cannot be sourced for ppc64le the usual way.
Now we have a checkout with the vendored libraries, and it should be feasible to make a build from that state, though I imagine is_clang=false and use_sysroot=false might be needed.

gn and ninja binaries can be obtained from the container distro:
$ yum install gn ninja-build

I will experiment with the build options and might make a follow-up post, but I'm not sure it is feasible to do a full build under emulation.

geisserml

Oct 31, 2025, 6:28:02 PM
to pdfium
Okay, I'm currently building with the following args.gn:
```
is_debug = false
use_remoteexec = false
use_siso = false
treat_warnings_as_errors = false
use_glib = false
is_component_build = true
pdf_is_standalone = true
pdf_use_partition_alloc = false
pdf_enable_v8 = false
pdf_enable_xfa = false
pdf_use_skia = false
is_clang = false
use_custom_libcxx = false
use_libcxx_modules = false
clang_use_chrome_plugins = false
use_sysroot = false
```
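
If the config is generated from a build script rather than written by hand, rendering it could look roughly like the sketch below. The helper `format_gn_args` is hypothetical (it is not taken from pypdfium2's build scripts) and does no escaping of special characters in string values.

```python
def format_gn_args(args: dict) -> str:
    """Render a dict as args.gn text: bools as true/false, strings quoted."""
    lines = []
    for key, value in args.items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        elif isinstance(value, str):
            rendered = '"' + value + '"'  # assumes no quotes, '$' or backslashes
        else:
            rendered = str(value)
        lines.append(f"{key} = {rendered}")
    return "\n".join(lines) + "\n"


print(format_gn_args({
    "is_debug": False,
    "pdf_is_standalone": True,
    "is_clang": False,
    "use_sysroot": False,
}), end="")
```

The bool check comes before any other branch because Python's `True`/`False` would otherwise be written with a capital letter, which GN does not accept.
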
I'm calling /usr/bin/gn and /usr/bin/ninja directly, as the depot_tools "gn" wrapper failed for some reason.
Also, I needed to apply a small patch for compatibility with the container's older version of gn, which does not know the path_exists() function. [1]
Alternatively, newer gn could be built from source (I actually did it in the manylinux2014 container as the older distro did not provide gn).

The emulated build is slow, I'm currently at [655/1178], so far without issues.

It may even be possible to do use_sysroot = true now, as I noticed during checkout that the ppc64le sysroot from [2] seems to have arrived.
Given that, I will also retry cross-compilation, but I'm not sure if the sysroot is enough, or if more will be needed to make cross-compilation work?

[1]: https://github.com/pypdfium2-team/pypdfium2/blob/56650c449b37238d06f051ad438ad3c3c7382bc8/pdfium_patches/siso.patch
[2]: https://chromium-review.googlesource.com/c/chromium/src/+/6187330

geisserml

Oct 31, 2025, 7:24:34 PM
to pdfium
For the record, the build was successful.
I have uploaded the DLLs and a packaged pypdfium2 wheel here: https://github.com/mara004/binary_staging/releases/tag/2-pdfium-ppc64le
The min glibc requirement is 2_28, matching the build container.

geisserml

Oct 31, 2025, 7:45:07 PM
to pdfium
> I'm calling /usr/bin/gn and /usr/bin/ninja directly, as the depot_tools "gn" wrapper failed for some reason.

Also for the record, the precise error with depot_tools gn is
```
gn gen out/Default/
python3_bin_reldir.txt not found. need to initialize depot_tools by
running gclient, update_depot_tools or ensure_bootstrap.
```
Anyway, this is not a problem as we can just call /usr/bin/gn instead. Or perhaps remove depot_tools from $PATH after the checkout phase.


> Also note that you have to use the newer manylinux_2_28 container where python3 is 3.11
Huh, for some reason python3 fell back to python3.6, and python3.11 is now Python 3.11.14.
But when I open a fresh container from the same image, python3 is 3.11.13 as it was at first.
I suppose that I installed something or ran some command that caused this mess. Perhaps best call python3.11 explicitly.

geisserml

Mar 16, 2026, 7:32:52 PM
to pdfium
Getting back to the sub-thread with @vmiklos, I finally figured out how to build pdfium through Libreoffice's build system.
Here's a reconstruction of the commands used (on Fedora):
```
git clone --depth 1 https://gerrit.libreoffice.org/core libreoffice
cd libreoffice/
sudo dnf -y builddep libreoffice && sudo dnf -y install meson ant
./autogen.sh --without-junit
make Library_pdfium
ls -l ./instdir/program/ && ldd ./instdir/program/libpdfiumlo.so  # informational
```
However, there are some drawbacks:
This clones all of libreoffice, and installs all build dependencies just to satisfy autogen, although we only want to build pdfium.
Also, `make` first downloads a lot of dependency archives, but only a subset is needed for pdfium.
@vmiklos: Is there any chance of sidestepping or reducing the amount of unrelated dependencies?

The reason I looked into this again is that, given [1], pdfium's GN build probably doesn't work on BSD systems (e.g. FreeBSD).

[1]: https://groups.google.com/a/chromium.org/g/chromium-dev/c/b57hDs8yE4g/m/5tXefZ74AQAJ

Miklos Vajna

Mar 18, 2026, 4:47:57 PM
to geisserml, pdfium
Hi,

On Mon, Mar 16, 2026 at 04:32:52PM -0700, geisserml <geis...@gmail.com> wrote:
> This clones all of libreoffice, and installs all build dependencies just to
> satisfy autogen, although we only want to build pdfium.
> Also, `make` first downloads a lot of dependency archives, but only a
> subset is needed for pdfium.
> @vmiklos: Is there any chance of sidestepping or reducing the amount of
> unrelated dependencies?

Oh, I think you took this idea a bit too far. :-)

What I wanted to point out is that pdfium's build system is luckily not
hyper-complicated; you can write your own build system if you prefer
something other than GN or you hit a limit that's not supported by GN.

In the libreoffice case, there is a fixed set of supported build systems
for dependencies and if you're not on that list, you need to build the
dependency with the "own" build system, which is what we do for pdfium.

If you are e.g. familiar with how other c++ code is built for
Linux/ppc64le using let's say cmake, then you could certainly come up
with a cmake config for pdfium.

I don't think it makes sense for others to take the entire libreoffice
build system just because of some arch which is not supported out of the
box by pdfium's GN config.

Regards,

Miklos

geisserml

Mar 18, 2026, 5:11:53 PM
to pdfium
Makes sense, yes. I initially thought one might just be able to go in external/pdfium and make, but after I realized you first have to run top-level autogen/configure, with all the dependencies, it dawned on me too that this wasn't a good approach just to build pdfium. Though I like the fact that, thanks to libreoffice, distributions often incidentally provide a pdfium shared library, with libreoffice typically being installed on end user systems.

As for FreeBSD, maybe the situation is better than I thought. I knew Chromium needs over a thousand patches to build on BSD, but seeing
it looks like the amount of patching needed for pdfium is actually quite small.

geisserml

Mar 23, 2026, 7:12:54 PM
to pdfium
I have another note/question with respect to libreoffice-pdfium.

As of Libreoffice 25.8.4.2 / pdfium 7012 on FreeBSD, there are APIs missing that should exist in that pdfium version.
Here's the set as reported by ctypesgen:
- FPDF_NewXObjectFromPage FPDF_NewFormObjectFromXObject FPDF_CloseXObject
- FPDF_ImportPages FPDF_ImportPagesByIndex FPDF_ImportNPagesToOne FPDF_CopyViewerPreferences
- FPDFPage_GetRawThumbnailData FPDFPage_GetThumbnailAsBitmap FPDFPage_GetDecodedThumbnailData
- FPDFDoc_GetJavaScriptAction FPDFJavaScriptAction_GetScript FPDFDoc_GetJavaScriptActionCount FPDFJavaScriptAction_GetName FPDFDoc_CloseJavaScriptAction

The missing XObject and page import APIs cause failures in pypdfium2's test suite.
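
For reference, a check like this can be reproduced without ctypesgen. The sketch below relies on the fact that looking up an unknown symbol on a `ctypes.CDLL` object raises `AttributeError`, so `hasattr` works as a probe. The helper name and the library path in the comment are placeholders, not the real install location.

```python
def missing_symbols(lib, names):
    """Return the names that cannot be resolved as attributes of `lib`."""
    return [name for name in names if not hasattr(lib, name)]


EXPECTED = [
    "FPDF_NewXObjectFromPage",
    "FPDF_ImportPages",
    "FPDFPage_GetRawThumbnailData",
    "FPDFDoc_GetJavaScriptAction",
]

# Typical use (the path is a placeholder for wherever the library lives):
#   import ctypes
#   lib = ctypes.CDLL("/path/to/libpdfiumlo.so")
#   print(missing_symbols(lib, EXPECTED))
```

The function only needs an object with attribute lookup, so it can be tried against any loaded shared library.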

@vmiklos: Sorry to bother you again. I wanted to ask, is this still an issue, or has it already been fixed in newer versions? Or is it even intentional that these APIs are not included, like because libreoffice doesn't need them?

Miklos Vajna

Mar 24, 2026, 3:42:28 AM
to geisserml, pdfium
Hi,

On Mon, Mar 23, 2026 at 04:12:54PM -0700, geisserml <geis...@gmail.com> wrote:
> - FPDFDoc_GetJavaScriptAction FPDFJavaScriptAction_GetScript
> FPDFDoc_GetJavaScriptActionCount FPDFJavaScriptAction_GetName
> FPDFDoc_CloseJavaScriptAction

The XFA / JS bits are indeed disabled in our builds, since we don't have
a v8 at hand.

Right now we bundle pdfium 7691 here:

https://github.com/libreoffice/core/tree/4731284939ce286316738d1fced7f16e9b91028b/external/pdfium

Hopefully that has all the APIs you would expect from that version,
apart from these fxjs ones. Only that part is intentional.

It can happen that some object file is omitted by accident, since we
only get a build failure if some API that we use is not there.

Regards,

Miklos

geisserml

Mar 24, 2026, 9:25:43 AM
to pdfium
Thank you for the clarification!

FWIW, we don't use those JavaScript APIs, but at the same time I don't think the ones listed above are V8 specific per preprocessor?
At least, ctypesgen does not pick up any members behind #ifdef's unless we explicitly define the condition, and we don't define PDF_ENABLE_V8 when binding to libreoffice-pdfium...

Miklos Vajna

Mar 24, 2026, 11:59:44 AM
to geisserml, pdfium
Hi,

On Tue, Mar 24, 2026 at 06:25:43AM -0700, geisserml <geis...@gmail.com> wrote:
> FWIW, we don't use those JavaScript APIs, but at the same time I don't
> think the ones listed above are V8 specific per preprocessor?
> At least, ctypesgen does not pick up any members behind #ifdef's unless we
> explicitly define the condition, and we don't define PDF_ENABLE_V8 when
> binding to libreoffice-pdfium...

I see, e.g. fpdfsdk/fpdf_ppo.cpp is meant to be part of the resulting
shared object, but no code on our end uses it, so it wasn't detected
it's missing. It could be added, sure.

My point was: this Makefile demonstrates that luckily pdfium is mostly a
sane C++ project where it's not impossible to build the code without the
default build system if you don't like it for some reason.

If you're interested in solving your Linux/ppc64le need without touching
GN, I guess the most straightforward way would be to write your own
Makefile for a specific version of pdfium. You can script an initial
version of that based on the compile database that ninja can generate
for you.
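
As a starting point for such scripting, here is a minimal sketch that pulls the list of source files out of a compile database. It assumes the usual `compile_commands.json` layout (a JSON array of entries with a `file` key), such as the one ninja's `compdb` tool emits; the function name and the sample entries are made up for illustration.

```python
import json
from typing import List, Sequence


def sources_from_compdb(compdb_json: str,
                        suffixes: Sequence[str] = (".c", ".cc", ".cpp")) -> List[str]:
    """Return the unique source files of a compile database, in first-seen order."""
    sources: List[str] = []
    for entry in json.loads(compdb_json):
        path = entry["file"]
        if path.endswith(tuple(suffixes)) and path not in sources:
            sources.append(path)
    return sources


SAMPLE_DB = json.dumps([
    {"directory": "/src/out", "command": "c++ -c a.cpp", "file": "../fpdfsdk/fpdf_ppo.cpp"},
    {"directory": "/src/out", "command": "c++ -c b.cpp", "file": "../fpdfsdk/fpdf_view.cpp"},
    {"directory": "/src/out", "command": "c++ -c a.cpp", "file": "../fpdfsdk/fpdf_ppo.cpp"},
])
for path in sources_from_compdb(SAMPLE_DB):
    print(path)
```

The resulting list could then be fed into a Makefile template. The compiler flags are available from each entry's `command` field if they are needed as well.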

Most probably the libreoffice Makefiles are not useful outside of its
home-grown build system for its own needs, so please don't misinterpret
that as a generic mechanism that gives you a Linux/ppc64le binary, I
mentioned it as a proof that you can build pdfium without GN if you
want to go that way. :-)

Thanks,

Miklos

geisserml

Mar 24, 2026, 12:23:57 PM
to pdfium
I agree. FWIW I am mostly fine with GN, and we already provide ppc64le builds nowadays.
It is probably easier for us to patch pdfium's GN files instead of writing a whole new build system.
The reason I keep having an eye on libpdfiumlo.so is that there are still some platforms we don't provide builds for, like the BSDs, and it's convenient for an end user installation to reuse an existing binary instead of trying to build from scratch, no matter what build system. That's faster and more likely to work out of the box, whereas building from scratch mostly needs some degree of user interaction.
I can live with some APIs being missing; most downstreams sort of use pypdfium2 for rendering and text extraction so if these core APIs are available that is already a win for us.