Fail to untar tesseract language pack files

iori

Jun 19, 2015, 2:01:45 AM
to tesser...@googlegroups.com
I'm trying to use the tesseract library on the iOS platform but am having some trouble. After downloading any of the language packs (.tar.gz) and successfully gunzipping it (to a .tar), the popular Light-Untar-for-iOS library fails to untar the file.

Does the .tar format in the tesseract language packs have a special structure? One thing I noticed is that the first block is always a null block, which is not the case for any of the other tar files I tried. The library untars other tar files I found on my Mac without issues.

The file is not corrupted, because I can untar it on my Mac without issues.

I have posted the more detailed description of my problem at:

zdenko podobny

Jun 19, 2015, 2:14:52 AM
to tesser...@googlegroups.com
There should be nothing special about the tar structure. AFAIR the language packages were compressed on Linux, and I have no problem decompressing them on Linux or Windows.

Zdenko

iori

Jun 19, 2015, 11:22:27 PM
to tesser...@googlegroups.com
I have no problem decompressing them at the OS level either. It's just that the few TAR libraries I found for the iOS platform all seem to be based on Light-Untar-for-iOS, and they fail to untar the file with an "invalid block type" error.

iori

Jun 20, 2015, 12:38:46 AM
to tesser...@googlegroups.com
It seems the problem is that the library doesn't handle the 'L' block type, which marks a long file name.
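
For anyone who wants to check their own archive, here is a rough sketch of how the header type flags can be listed. It is only an illustration under a few assumptions: it is written in Python to run on a desktop machine (not on iOS), it assumes plain octal size fields, and `list_typeflags` and `make_demo_archive` are hypothetical helper names, with the demo archive standing in for a real language pack. In the GNU tar format, an entry whose type flag is 'L' carries the long name of the next entry in its data blocks.

```python
import io
import tarfile

BLOCK = 512  # a tar archive is a sequence of 512-byte blocks


def list_typeflags(data):
    """Return (typeflag, name) for every header block in raw tar bytes."""
    entries = []
    offset = 0
    while offset + BLOCK <= len(data):
        header = data[offset:offset + BLOCK]
        offset += BLOCK
        if header == b"\0" * BLOCK:
            continue  # all-zero block: padding or end-of-archive marker
        # Name field: bytes 0-99, NUL-terminated.
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8", "replace")
        # Size field: bytes 124-135, octal digits (assumption: not base-256).
        size_field = header[124:136].strip(b"\0 ").decode("ascii")
        size = int(size_field, 8) if size_field else 0
        # Type flag: byte 156. 'L' is the GNU long-name entry.
        typeflag = header[156:157].decode("ascii", "replace")
        entries.append((typeflag, name))
        # Skip the entry's data, which is padded to a multiple of 512 bytes.
        offset += (size + BLOCK - 1) // BLOCK * BLOCK
    return entries


def make_demo_archive():
    """Hypothetical stand-in for a language pack: one file, 120-char name."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        info = tarfile.TarInfo(name="d" * 116 + ".txt")
        payload = b"hello"
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


for flag, name in list_typeflags(make_demo_archive()):
    print(repr(flag), name)
```

On the demo archive the first header has type flag 'L', followed by the regular file header whose name field holds only the first 100 bytes of the name. To inspect a real file, pass `open(path, "rb").read()` to `list_typeflags` instead.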



I still don't know how to solve the issue.

Tom Morris

Jun 20, 2015, 11:19:59 PM
to tesser...@googlegroups.com
On Saturday, June 20, 2015 at 12:38:46 AM UTC-4, iori wrote:
> It seems the problem is that the library doesn't handle the 'L' block type, which marks a long file name.

Good work tracking down the root cause!

> I still don't know how to solve the issue.

Seems like there are a few straightforward solutions:

1. Patch the library to handle long names
2. Switch to a different library
3. Recompress/tar the data files to be compatible with the library you are using.
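
As a rough illustration of option 3, the archive could be rewritten on a desktop machine before it is shipped to the device. The sketch below uses Python's standard `tarfile` module to repack an archive in the plain ustar format, which has no 'L' entries. `repack_as_ustar` and its `rename` parameter are hypothetical names made up for this example. Note that ustar stores names longer than 100 bytes by splitting them into a prefix field and a name field; whether the iOS library reads the prefix field is an assumption you would need to check, so the optional `rename` callback lets you shorten names instead.

```python
import io
import tarfile


def repack_as_ustar(src_bytes, rename=None):
    """Rewrite a tar archive in ustar format (no GNU 'L' long-name entries).

    rename: optional function mapping each member name to a new name,
    for example to drop leading directories so every name fits in 100 bytes.
    """
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(src_bytes), mode="r:*") as src, \
         tarfile.open(fileobj=out, mode="w",
                      format=tarfile.USTAR_FORMAT) as dst:
        for member in src:
            # Only regular files carry data that has to be copied.
            fileobj = src.extractfile(member) if member.isfile() else None
            if rename is not None:
                member.name = rename(member.name)
            # tarfile raises ValueError if a name cannot be stored in ustar.
            dst.addfile(member, fileobj)
    return out.getvalue()
```

For example, `repack_as_ustar(data, rename=lambda n: n.rsplit("/", 1)[-1])` keeps only the final path component of each name. That flattens the directory layout, so it only makes sense if your code looks for the data files by base name.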

Tom