Hello,
I'm using pdf2htmlEX and trying to upgrade from pdf2htmlEX-0.14.6 to the latest version, pdf2htmlEX-0.18.8.rc1.
However, the converted file contains no text at all, while photos are converted as expected.
I'm getting this warning message during the conversion:
Working: 3/100 Working: 1/100 Working: 1/100 Working: 2/100 Working: 1/100
Warning: encoding confliction detected in font: 11
This is the Dockerfile I use to build the Docker image that runs the conversion:
FROM ubuntu:20.04
RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections
ENV DEBIAN_FRONTEND=noninteractive
RUN dpkg --configure -a
RUN apt-get clean
RUN apt-get update
RUN apt-get install -f -y python3
RUN apt-get install dialog apt-utils -y
RUN apt-get install -f -y python3-pip
RUN apt-get install -f -y python3-setuptools
RUN apt-get install -f -y wget
RUN apt-get install -f -y poppler-utils
RUN apt-get install -f -y jq
RUN apt-get install -f -y zip unzip
RUN apt-get install -f -y pdftk
RUN apt-get install -f -y ffmpeg
RUN apt-get install -f -y libfontforge-dev
RUN DEBIAN_FRONTEND=noninteractive apt-get install -f -y pdftk-java
RUN apt install -f -y ghostscript
RUN pip3 install --upgrade pip \
&& apt-get clean
RUN pip3 --no-cache-dir install --upgrade awscli
WORKDIR /tmp
COPY lib/pdf2htmlEX-0.18.8.rc1-master-20200630-Ubuntu-focal-x86_64.deb /tmp
RUN apt install -y ./pdf2htmlEX-0.18.8.rc1-master-20200630-Ubuntu-focal-x86_64.deb
RUN wget https://www.imagemagick.org/download/ImageMagick.tar.gz && \
tar -xf ImageMagick.tar.gz && \
cd ImageMagick* && \
./configure && \
make && \
make install && \
ldconfig /usr/local/lib
Please advise on how I can resolve this.