Issue With Page Segment Mode


Shobhit Kapil

Mar 25, 2019, 7:59:54 AM
to tesseract-ocr
Hi Team,

I am using Tesseract version 4.0.0.0 in C#, with the code below.

I am processing a scanned PDF image. Instead of processing the whole image, I set the image height to img.height / 3. In that case I get the exception below at page1.GetIterator().
The error does not occur for every file, only for a few. When I do not divide the height by 3 and pass the full height, it works for all the images.

System.AccessViolationException: 'Attempted to read or write protected memory. This is often an indication that other memory is corrupt.'

Any help would be appreciated.

_engine = new TesseractEngine(
    Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location) +
    "\\tessdata", "eng", EngineMode.Default);

using (var ocr = Engine)
{
    using (var page1 = ocr.Process(img1, PageSegMode.SparseText))
    {
        using (var iterator = page1.GetIterator())
        {
            iterator.Begin();
            do
            {
                processWord = iterator.GetText(PageIteratorLevel.Word);
                iterator.TryGetBoundingBox(PageIteratorLevel.Word, out Rect bounds);
                OCRWords = new OCRObjects();
                OCRWords.index = _index;
                OCRWords.key = processWord;
                OCRWords.bounds = bounds;
                ocrObjects.Add(OCRWords);
                _index++;
            } while (iterator.Next(PageIteratorLevel.Word));
        }
    }
} // End of OCR iteration.
0710c7c5-bd3e-47c8-a0a4-6d0cb70e6295.PDF
a68f2bb3-a084-4701-88fc-286e3b002654.PDF

Carsten Giesen

Aug 7, 2019, 2:35:58 AM
to tesseract-ocr
Hello,

I ran into exactly the same error and have not found a solution so far.
Did you find something, or do you have an idea?

cu
Carsten