Nice work. Regarding this note on your site...
"Testing on Linux will be performed once the shared object library
equivalent to the DLL becomes available."
I don't know if you know, but I've more or less taken over development
of Tesjeract, the JNI equivalent. I got it working on Linux by
basically mimicking the necessary parts of tessdll.dll. Feel free to
borrow that code. Ideally I'd like to ditch that ugly mess and use
Tesseract 3, but I've been tidying up Leptonica first.
http://code.google.com/p/tesjeract/
James
public static void main(String[] args) {
    File imageFile = new File("C:\\tesseract-ocr\\JNA\\Tess4J\\eurotext.png");
    //File imageFile = new File("C:\\tesseract-ocr\\H1B.jpg");
    Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
    // Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping

    try {
        String result = instance.doOCR(imageFile);
        System.out.println(result);
    } catch (TesseractException e) {
        System.err.println(e.getMessage());
    } catch (Exception ex) {
        System.err.println(ex.getMessage());
    }
}
Here are the server logs:
07-Mar-2025 20:58:12.967 INFO [https-jsse-nio-8443-exec-2] com.sun.jna.Native.extractFromResourcePath Looking in classpath from com.pega.pegarules.bootstrap.loader.PRAppLoader@2b7e8044 for /com/sun/jna/linux-x86-64/libjnidispatch.so
07-Mar-2025 20:58:13.053 INFO [https-jsse-nio-8443-exec-2] com.sun.jna.Native.extractFromResourcePath Found library resource at pegajdbc://408132785:0/jna-5.8.0.jar!/com/sun/jna/linux-x86-64/libjnidispatch.so
07-Mar-2025 20:58:13.149 INFO [https-jsse-nio-8443-exec-2] com.sun.jna.Native.extractFromResourcePath Extracting library to /usr/local/tomcat/temp/jna14459296195516564982.tmp
07-Mar-2025 20:58:13.150 INFO [https-jsse-nio-8443-exec-2] com.sun.jna.Native.loadNativeDispatchLibraryFromClasspath Trying /usr/local/tomcat/temp/jna14459296195516564982.tmp
07-Mar-2025 20:58:13.157 INFO [https-jsse-nio-8443-exec-2] com.sun.jna.Native.loadNativeDispatchLibraryFromClasspath Found jnidispatch at /usr/local/tomcat/temp/jna14459296195516564982.tmp
07-Mar-2025 20:58:13.867 INFO [https-jsse-nio-8443-exec-2] com.sun.jna.NativeLibrary.loadLibrary Looking for library 'tesseract'
07-Mar-2025 20:58:13.867 INFO [https-jsse-nio-8443-exec-2] com.sun.jna.NativeLibrary.loadLibrary Adding paths from jna.library.path: /mnt/BCDS/outbound/tess4j/linux-x86-64/
07-Mar-2025 20:58:13.911 INFO [https-jsse-nio-8443-exec-2] com.sun.jna.NativeLibrary.loadLibrary Trying /mnt/BCDS/outbound/tess4j/linux-x86-64/libtesseract.so
07-Mar-2025 20:58:14.255 INFO [https-jsse-nio-8443-exec-2] com.sun.jna.NativeLibrary.loadLibrary Found library 'tesseract' at /mnt/BCDS/outbound/tess4j/linux-x86-64/libtesseract.so
!strcmp(locale, "C"):Error:Assert failed:in file /mnt/c/nix/Dev/cpp/lib/tesseract/src/api/baseapi.cpp, line 209
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f63e0bff898, pid=1, tid=523
#
# JRE version: OpenJDK Runtime Environment Temurin-11.0.21+9 (11.0.21+9) (build 11.0.21+9)
# Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.21+9 (11.0.21+9, mixed mode, tiered, compressed oops, serial gc, linux-amd64)
# Problematic frame:
# C [libc.so.6+0x28898] abort+0x178
#
# Core dump will be written. Default location: //core.1
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid1.log
#
# If you would like to submit a bug report, please visit:
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
Jar files imported into the Pega application:
tess4j-5.4.0.jar
jna-5.8.0.jar
jna-platform-5.8.0.jar
slf4j-api-1.7.30.jar
slf4j-simple-1.7.30.jar
lept4j-1.16.2.jar
commons-io-2.6.jar
I also copied the jar files to the Azure mount location (/mnt/BCDS/outbound/) on the server.
JVM options (Java system properties):
-Djna.library.path=/mnt/BCDS/outbound/tess4j/linux-x86-64/
-Dtessdata.path=/mnt/BCDS/outbound/tess4j/tessdata/
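
A guess based on the log above: the line `!strcmp(locale, "C")` from baseapi.cpp looks like Tesseract's check that the process is running in the "C" locale, and the SIGSEGV in abort() is most likely just the consequence of that assert failing. As far as I know the JVM picks up the process locale from the environment at startup, so on a server with LANG or LC_ALL set to something like en_US.UTF-8 the check would fail. Here is a small diagnostic sketch (not part of Tess4J; the class and method names are my own) that applies the usual POSIX precedence of LC_ALL, then the category variable, then LANG, to the environment the JVM was started with. It only reads environment variables and does not call into native code, so treat the result as a hint rather than proof.

```java
import java.util.Map;

// Hypothetical helper, not part of Tess4J or JNA.
class LocaleCheck {

    // Resolve the locale name for one category using POSIX precedence:
    // LC_ALL overrides the category variable, which overrides LANG.
    // Unset or empty variables are skipped; with nothing set the result is "C".
    static String effectiveLocale(Map<String, String> env, String category) {
        for (String name : new String[] {"LC_ALL", category, "LANG"}) {
            String value = env.get(name);
            if (value != null && !value.isEmpty()) {
                return value;
            }
        }
        return "C";
    }

    // True when both categories checked here resolve to "C".
    // Assumption: these are the categories the failing assert cares about.
    static boolean looksLikeCLocale(Map<String, String> env) {
        for (String category : new String[] {"LC_CTYPE", "LC_NUMERIC"}) {
            if (!effectiveLocale(env, category).equals("C")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        System.out.println("LC_CTYPE   resolves to: " + effectiveLocale(env, "LC_CTYPE"));
        System.out.println("LC_NUMERIC resolves to: " + effectiveLocale(env, "LC_NUMERIC"));
        System.out.println("Looks like C locale: " + looksLikeCLocale(env));
    }
}
```

If this reports a non-C locale inside the Tomcat process, the workaround I have usually seen suggested is to export LC_ALL=C in the environment before the JVM starts (for Tomcat, bin/setenv.sh is a common place for that) and then retry the OCR call.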