I'm trying to use GPU acceleration during Tesseract training via the OpenCL libraries.
I have built Tesseract with OpenCL support and installed the NVIDIA compute driver 440 on my Ubuntu 19.10 installation.
Whenever I run tesstrain.sh, however, the program refuses to select the proper GPU. Rather than use my
NVIDIA GeForce GTX 1060 6GB, it selects the CPU as the default OpenCL device, even though it detects the GPU
and scores it higher in the built-in benchmark (1.846448 versus 0.503215).
Setting TESSERACT_OPENCL_DEVICE=1 seems to do nothing: the log reports the override, but nvidia-smi shows that the process is not using my GPU.
Here is my tesstrain.sh output:
=== Starting training for language 'eng'
[Tue 18 Feb 2020 04:55:13 PM PST] /usr/local/bin/text2image --fonts_dir=/usr/share/fonts --ptsize 12 --font=Chit --outputbase=/tmp/font_tmp.Hk8xAdjwI8/sample_text.txt --text=/tmp/font_tmp.Hk8xAdjwI8/sample_text.txt --fontconfig_tmpdir=/tmp/font_tmp.Hk8xAdjwI8
Rendered page 0 to file /tmp/font_tmp.Hk8xAdjwI8/sample_text.txt.tif
=== Phase I: Generating training images ===
Rendering using Chit
[Tue 18 Feb 2020 04:55:15 PM PST] /usr/local/bin/text2image --fontconfig_tmpdir=/tmp/font_tmp.Hk8xAdjwI8 --fonts_dir=/usr/share/fonts --strip_unrenderable_words --leading=32 --xsize=3600 --char_spacing=0.0 --exposure=1 --outputbase=/tmp/eng-2020-02-18.Gfj/eng.Chit.exp1 --max_pages=0 --font=Chit --ptsize 12 --text=/home/tim/PycharmProjects/RnD/OCR_Dataset/langdata/eng/eng.training_text
Stripped 35 unrenderable words
Rendered page 0 to file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp1.tif
Stripped 6 unrenderable words
Rendered page 1 to file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp1.tif
Rendering using Chit
[Tue 18 Feb 2020 04:55:17 PM PST] /usr/local/bin/text2image --fontconfig_tmpdir=/tmp/font_tmp.Hk8xAdjwI8 --fonts_dir=/usr/share/fonts --strip_unrenderable_words --leading=32 --xsize=3600 --char_spacing=0.0 --exposure=2 --outputbase=/tmp/eng-2020-02-18.Gfj/eng.Chit.exp2 --max_pages=0 --font=Chit --ptsize 12 --text=/home/tim/PycharmProjects/RnD/OCR_Dataset/langdata/eng/eng.training_text
Stripped 35 unrenderable words
Rendered page 0 to file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp2.tif
Stripped 6 unrenderable words
Rendered page 1 to file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp2.tif
Rendering using Chit
[Tue 18 Feb 2020 04:55:19 PM PST] /usr/local/bin/text2image --fontconfig_tmpdir=/tmp/font_tmp.Hk8xAdjwI8 --fonts_dir=/usr/share/fonts --strip_unrenderable_words --leading=32 --xsize=3600 --char_spacing=0.0 --exposure=3 --outputbase=/tmp/eng-2020-02-18.Gfj/eng.Chit.exp3 --max_pages=0 --font=Chit --ptsize 12 --text=/home/tim/PycharmProjects/RnD/OCR_Dataset/langdata/eng/eng.training_text
Stripped 35 unrenderable words
Rendered page 0 to file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp3.tif
Stripped 6 unrenderable words
Rendered page 1 to file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp3.tif
Rendering using Chit
[Tue 18 Feb 2020 04:55:22 PM PST] /usr/local/bin/text2image --fontconfig_tmpdir=/tmp/font_tmp.Hk8xAdjwI8 --fonts_dir=/usr/share/fonts --strip_unrenderable_words --leading=32 --xsize=3600 --char_spacing=0.0 --exposure=4 --outputbase=/tmp/eng-2020-02-18.Gfj/eng.Chit.exp4 --max_pages=0 --font=Chit --ptsize 12 --text=/home/tim/PycharmProjects/RnD/OCR_Dataset/langdata/eng/eng.training_text
Stripped 35 unrenderable words
Rendered page 0 to file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp4.tif
Stripped 6 unrenderable words
Rendered page 1 to file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp4.tif
Rendering using Chit
[Tue 18 Feb 2020 04:55:24 PM PST] /usr/local/bin/text2image --fontconfig_tmpdir=/tmp/font_tmp.Hk8xAdjwI8 --fonts_dir=/usr/share/fonts --strip_unrenderable_words --leading=32 --xsize=3600 --char_spacing=0.0 --exposure=5 --outputbase=/tmp/eng-2020-02-18.Gfj/eng.Chit.exp5 --max_pages=0 --font=Chit --ptsize 12 --text=/home/tim/PycharmProjects/RnD/OCR_Dataset/langdata/eng/eng.training_text
Stripped 35 unrenderable words
Rendered page 0 to file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp5.tif
Stripped 6 unrenderable words
Rendered page 1 to file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp5.tif
Rendering using Chit
[Tue 18 Feb 2020 04:55:27 PM PST] /usr/local/bin/text2image --fontconfig_tmpdir=/tmp/font_tmp.Hk8xAdjwI8 --fonts_dir=/usr/share/fonts --strip_unrenderable_words --leading=32 --xsize=3600 --char_spacing=0.0 --exposure=6 --outputbase=/tmp/eng-2020-02-18.Gfj/eng.Chit.exp6 --max_pages=0 --font=Chit --ptsize 12 --text=/home/tim/PycharmProjects/RnD/OCR_Dataset/langdata/eng/eng.training_text
Stripped 35 unrenderable words
Rendered page 0 to file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp6.tif
Stripped 6 unrenderable words
Rendered page 1 to file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp6.tif
=== Phase UP: Generating unicharset and unichar properties files ===
[Tue 18 Feb 2020 04:55:28 PM PST] /usr/local/bin/unicharset_extractor --output_unicharset /tmp/eng-2020-02-18.Gfj/eng.unicharset --norm_mode 1 /tmp/eng-2020-02-18.Gfj/eng.Chit.exp1.box /tmp/eng-2020-02-18.Gfj/eng.Chit.exp2.box /tmp/eng-2020-02-18.Gfj/eng.Chit.exp3.box /tmp/eng-2020-02-18.Gfj/eng.Chit.exp4.box /tmp/eng-2020-02-18.Gfj/eng.Chit.exp5.box /tmp/eng-2020-02-18.Gfj/eng.Chit.exp6.box
Extracting unicharset from box file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp1.box
Extracting unicharset from box file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp2.box
Extracting unicharset from box file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp3.box
Extracting unicharset from box file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp4.box
Extracting unicharset from box file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp5.box
Extracting unicharset from box file /tmp/eng-2020-02-18.Gfj/eng.Chit.exp6.box
Other case É of é is not in unicharset
Wrote unicharset file /tmp/eng-2020-02-18.Gfj/eng.unicharset
[Tue 18 Feb 2020 04:55:29 PM PST] /usr/local/bin/set_unicharset_properties -U /tmp/eng-2020-02-18.Gfj/eng.unicharset -O /tmp/eng-2020-02-18.Gfj/eng.unicharset -X /tmp/eng-2020-02-18.Gfj/eng.xheights --script_dir=/home/tim/PycharmProjects/RnD/OCR_Dataset/langdata
Loaded unicharset of size 102 from file /tmp/eng-2020-02-18.Gfj/eng.unicharset
Setting unichar properties
Other case É of é is not in unicharset
Setting script properties
Warning: properties incomplete for index 25 = ~
Writing unicharset to file /tmp/eng-2020-02-18.Gfj/eng.unicharset
=== Phase E: Generating lstmf files ===
Using TESSDATA_PREFIX=/home/tim/PycharmProjects/RnD/OCR_Dataset/tessdata_best
[Tue 18 Feb 2020 04:55:29 PM PST] /usr/local/bin/tesseract /tmp/eng-2020-02-18.Gfj/eng.Chit.exp1.tif /tmp/eng-2020-02-18.Gfj/eng.Chit.exp1 --psm 6 lstm.train
[Tue 18 Feb 2020 04:55:29 PM PST] /usr/local/bin/tesseract /tmp/eng-2020-02-18.Gfj/eng.Chit.exp3.tif /tmp/eng-2020-02-18.Gfj/eng.Chit.exp3 --psm 6 lstm.train
[Tue 18 Feb 2020 04:55:29 PM PST] /usr/local/bin/tesseract /tmp/eng-2020-02-18.Gfj/eng.Chit.exp4.tif /tmp/eng-2020-02-18.Gfj/eng.Chit.exp4 --psm 6 lstm.train
[Tue 18 Feb 2020 04:55:29 PM PST] /usr/local/bin/tesseract /tmp/eng-2020-02-18.Gfj/eng.Chit.exp2.tif /tmp/eng-2020-02-18.Gfj/eng.Chit.exp2 --psm 6 lstm.train
[Tue 18 Feb 2020 04:55:29 PM PST] /usr/local/bin/tesseract /tmp/eng-2020-02-18.Gfj/eng.Chit.exp5.tif /tmp/eng-2020-02-18.Gfj/eng.Chit.exp5 --psm 6 lstm.train
[Tue 18 Feb 2020 04:55:29 PM PST] /usr/local/bin/tesseract /tmp/eng-2020-02-18.Gfj/eng.Chit.exp6.tif /tmp/eng-2020-02-18.Gfj/eng.Chit.exp6 --psm 6 lstm.train
[DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 1:GeForce GTX 1060 6GB score is 1.846448
[DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[2] 0:(null) score is 0.503215
[DS] Selected Device[2]: "(null)" (Native)
[DS] Overriding Device Selection (TESSERACT_OPENCL_DEVICE=1, 1)
[DS] Overridden Device[1]: "GeForce GTX 1060 6GB" (OpenCL)
[DS] Device[1] 1:GeForce GTX 1060 6GB score is 1.846448
[DS] Device[2] 0:(null) score is 0.503215
[DS] Selected Device[2]: "(null)" (Native)
[DS] Overriding Device Selection (TESSERACT_OPENCL_DEVICE=1, 1)
[DS] Overridden Device[1]: "GeForce GTX 1060 6GB" (OpenCL)
[DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 1:GeForce GTX 1060 6GB score is 1.846448
[DS] Device[2] 0:(null) score is 0.503215
[DS] Selected Device[2]: "(null)" (Native)
[DS] Overriding Device Selection (TESSERACT_OPENCL_DEVICE=1, 1)
[DS] Overridden Device[1]: "GeForce GTX 1060 6GB" (OpenCL)
[DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 1:GeForce GTX 1060 6GB score is 1.846448
[DS] Device[2] 0:(null) score is 0.503215
[DS] Selected Device[2]: "(null)" (Native)
[DS] Overriding Device Selection (TESSERACT_OPENCL_DEVICE=1, 1)
[DS] Overridden Device[1]: "GeForce GTX 1060 6GB" (OpenCL)
[DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 1:GeForce GTX 1060 6GB score is 1.846448
[DS] Device[2] 0:(null) score is 0.503215
[DS] Selected Device[2]: "(null)" (Native)
[DS] Overriding Device Selection (TESSERACT_OPENCL_DEVICE=1, 1)
[DS] Overridden Device[1]: "GeForce GTX 1060 6GB" (OpenCL)
[DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 1:GeForce GTX 1060 6GB score is 1.846448
[DS] Device[2] 0:(null) score is 0.503215
[DS] Selected Device[2]: "(null)" (Native)
[DS] Overriding Device Selection (TESSERACT_OPENCL_DEVICE=1, 1)
[DS] Overridden Device[1]: "GeForce GTX 1060 6GB" (OpenCL)
Tesseract Open Source OCR Engine v5.0.0-alpha with Leptonica
Tesseract Open Source OCR Engine v5.0.0-alpha with Leptonica
Tesseract Open Source OCR Engine v5.0.0-alpha with Leptonica
Page 1
Page 1
Page 1
Tesseract Open Source OCR Engine v5.0.0-alpha with Leptonica
Page 1
Tesseract Open Source OCR Engine v5.0.0-alpha with Leptonica
Tesseract Open Source OCR Engine v5.0.0-alpha with Leptonica
Page 1
Page 1
Page 2
Loaded 56/56 lines (1-56) of document /tmp/eng-2020-02-18.Gfj/eng.Chit.exp4.lstmf
Page 2
Page 2
Page 2
Loaded 56/56 lines (1-56) of document /tmp/eng-2020-02-18.Gfj/eng.Chit.exp1.lstmf
Loaded 56/56 lines (1-56) of document /tmp/eng-2020-02-18.Gfj/eng.Chit.exp3.lstmf
Loaded 56/56 lines (1-56) of document /tmp/eng-2020-02-18.Gfj/eng.Chit.exp2.lstmf
Page 2
Loaded 56/56 lines (1-56) of document /tmp/eng-2020-02-18.Gfj/eng.Chit.exp5.lstmf
Page 2
Loaded 56/56 lines (1-56) of document /tmp/eng-2020-02-18.Gfj/eng.Chit.exp6.lstmf
=== Constructing LSTM training data ===
[Tue 18 Feb 2020 04:55:33 PM PST] /usr/local/bin/combine_lang_model --input_unicharset /tmp/eng-2020-02-18.Gfj/eng.unicharset --script_dir /home/tim/PycharmProjects/RnD/OCR_Dataset/langdata --words /home/tim/PycharmProjects/RnD/OCR_Dataset/langdata/eng/eng.wordlist --numbers /home/tim/PycharmProjects/RnD/OCR_Dataset/langdata/eng/eng.numbers --puncs /home/tim/PycharmProjects/RnD/OCR_Dataset/langdata/eng/eng.punc --output_dir /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA --lang eng
Loaded unicharset of size 102 from file /tmp/eng-2020-02-18.Gfj/eng.unicharset
Setting unichar properties
Other case É of é is not in unicharset
Setting script properties
Config file is optional, continuing...
Failed to read data from: /home/tim/PycharmProjects/RnD/OCR_Dataset/langdata/eng/eng.config
Null char=2
Reducing Trie to SquishedDawg
Reducing Trie to SquishedDawg
Reducing Trie to SquishedDawg
=== Saving box/tiff pairs for training data ===
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp1.box to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp2.box to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp3.box to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp4.box to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp5.box to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp6.box to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp1.tif to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp2.tif to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp3.tif to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp4.tif to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp5.tif to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp6.tif to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
=== Moving lstmf files for training data ===
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp1.lstmf to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp2.lstmf to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp3.lstmf to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp4.lstmf to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp5.lstmf to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Moving /tmp/eng-2020-02-18.Gfj/eng.Chit.exp6.lstmf to /home/tim/PycharmProjects/RnD/OCR_Dataset/DATA
Created starter traineddata for LSTM training of language 'eng'
Run 'lstmtraining' command to continue LSTM training for language 'eng'
And here is my nvidia-smi output during the training process:
Tue Feb 18 17:01:50 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  On   | 00000000:01:00.0  On |                  N/A |
| 22%   56C    P0    23W / 120W |    402MiB /  6072MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1254      G   /usr/lib/xorg/Xorg                            32MiB |
|    0      2003      G   /usr/lib/xorg/Xorg                           155MiB |
|    0      2223      G   /usr/bin/gnome-shell                          97MiB |
|    0      2724      G   ...p/pycharm-professional/183/jbr/bin/java     2MiB |
|    0      4932      G   /usr/bin/nvidia-settings                       0MiB |
|    0      5071      G   ...AAAAAAAAAAAAAAgAAAAAAAAA --shared-files    62MiB |
+-----------------------------------------------------------------------------+
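To make the device-selection part of the tesstrain.sh output easier to read, here is a minimal sketch that pulls the benchmark scores and the final device choice out of the `[DS]` lines. It assumes the lines have exactly the format shown in my log; the function name `summarize_device_selection` and the regular expressions are my own and are not part of Tesseract. It only reports what the log says and does not interpret whether a higher or lower score is preferred.

```python
import re

# Patterns written against the "[DS]" lines as they appear in the log above.
# They are an assumption about the format, not an official Tesseract API.
SCORE_RE = re.compile(r'\[DS\] Device\[(\d+)\] \d+:(.+?) score is ([\d.]+)')
CHOICE_RE = re.compile(r'\[DS\] (Selected|Overridden) Device\[(\d+)\]: "(.*)" \((\w+)\)')


def summarize_device_selection(log_text):
    """Return (scores, final_choice) parsed from [DS] log lines.

    scores maps device index -> (device name, score).
    final_choice is the last Selected/Overridden entry as
    (action, device index, device name, backend), or None if absent.
    """
    scores = {}
    choices = []
    for line in log_text.splitlines():
        match = SCORE_RE.search(line)
        if match:
            scores[int(match.group(1))] = (match.group(2), float(match.group(3)))
            continue
        match = CHOICE_RE.search(line)
        if match:
            choices.append(
                (match.group(1), int(match.group(2)), match.group(3), match.group(4))
            )
    final_choice = choices[-1] if choices else None
    return scores, final_choice


# A few lines copied from the log above.
sample = """\
[DS] Device[1] 1:GeForce GTX 1060 6GB score is 1.846448
[DS] Device[2] 0:(null) score is 0.503215
[DS] Selected Device[2]: "(null)" (Native)
[DS] Overriding Device Selection (TESSERACT_OPENCL_DEVICE=1, 1)
[DS] Overridden Device[1]: "GeForce GTX 1060 6GB" (OpenCL)
"""

scores, final_choice = summarize_device_selection(sample)
print(scores)
print(final_choice)
```

On my log this confirms that the override line names the GTX 1060 as the final device, which is why the lack of any tesseract process in nvidia-smi is confusing.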