Generating a PDF with Tesseract C++


Saliaj Adrian

Apr 28, 2017, 4:05:24 AM
to tesseract-ocr
Hello,

I am testing the Tesseract C++ API.

Here is my code:

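#include <cstdio>
#include <cstdlib>

#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>

int main()
{
    fprintf(stderr, "Heyhey !\n");
    char * outText;

    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();

    // Init() returns 0 on success; NULL uses the default tessdata path.
    if (api->Init(NULL, "fra")) {
        fprintf(stderr, "Could not initialize tesseract.\n");
        exit(1);
    }

    Pix * image = pixRead("/path/test.tif");
    if (image == NULL) {
        fprintf(stderr, "Could not read the image.\n");
        exit(1);
    }
    api->SetImage(image);

    // Run OCR and get the recognized text as UTF-8.
    outText = api->GetUTF8Text();
    printf("OCR output:\n%s", outText);

    // Release everything that was allocated.
    api->End();
    delete api;
    delete [] outText;
    pixDestroy(&image);

    return 0;
}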

It works and displays the text output.

But how can I get a searchable PDF file as output and save it on my computer?

I mean exactly what the command line does: tesseract test.tif output pdf

Thanks a lot

Zdenko Podobný

Apr 28, 2017, 4:39:02 AM
to tesser...@googlegroups.com

Zdenko

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/28da1809-a51b-4918-a375-99ca6cd9c713%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Saliaj Adrian

Apr 28, 2017, 6:40:29 AM
to tesseract-ocr
So I must set the variable "tessedit_create_pdf" to true, but how? I cannot find the setter.

Thanks

Zdenko Podobný

Apr 28, 2017, 1:03:34 PM
to tesser...@googlegroups.com
Really? Where did you look?

BTW: It seems you did not understand that example: setting the variable will not help you.

Zdenko



Saliaj Adrian

May 22, 2017, 10:04:45 AM
to tesseract-ocr
Thank you, but it still doesn't work...

Can someone just tell me what should be added to this code to generate a PDF output?


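#include <cstdio>
#include <cstdlib>

#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>

int main()
{
    fprintf(stderr, "Heyhey !\n");
    char * outText;

    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();

    // Init() returns 0 on success; NULL uses the default tessdata path.
    if (api->Init(NULL, "fra")) {
        fprintf(stderr, "Could not initialize tesseract.\n");
        exit(1);
    }

    Pix * image = pixRead("/path/test.tif");
    if (image == NULL) {
        fprintf(stderr, "Could not read the image.\n");
        exit(1);
    }
    api->SetImage(image);

    // Run OCR and get the recognized text as UTF-8.
    outText = api->GetUTF8Text();
    printf("OCR output:\n%s", outText);

    // Release everything that was allocated.
    api->End();
    delete api;
    delete [] outText;
    pixDestroy(&image);

    return 0;
}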



Thanks a lot

ShreeDevi Kumar

May 22, 2017, 10:54:42 AM
to tesser...@googlegroups.com

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
