Status: New
Owner: ----
New issue 1496 by rtaylor...@gmail.com: Access Violation - reading outside image buffer during line detection
https://code.google.com/p/tesseract-ocr/issues/detail?id=1496
What steps will reproduce the problem?
1.  Tesseract 3.02+ command line
2.  "tesseract -l eng Image_crop.png Image pdf"
What is the expected output? What do you see instead?
> I expect Tesseract to run and produce output.
> Instead, Tesseract crashes with an "ACCESS VIOLATION (0xC0000005)"-type
> error.
What version of the product are you using? On what operating system?
Seen in Tesseract 3.02.02 and code from SVN around March 2015.
Windows 7, with 32-bit (Win32) Tesseract builds.
Please provide any additional information below.
- Doesn't happen in 64-bit Windows build (lucky?)
- Attached image has non-white pixels at image edges - this seems to  
trigger this crash bug.
- The access violation occurs in TextlineProjection::MeanPixelsInLineSegment()
when it calls GET_DATA_BYTE() (~line 550). This breaks when the start_pt/end_pt
Y value is 0 and offset is negative. It also breaks when the start_pt/end_pt
Y value is at the bottom of the image and offset is positive. Either condition
leads to an attempted read before or after the image buffer.
- Related problems would occur horizontally (i.e. X value = 0 or at the right
edge of the image). In these cases there is less chance of stepping outside the
image buffer (unless at a corner), but a good chance that the algorithm will
not read the intended data, because the access wraps to the other side of the
image.
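One possible guard against both the vertical and the horizontal case is to clamp the coordinates before indexing. The sketch below is only an illustration of that idea and is not taken from the Tesseract source: GrayImage, ClampInt and SafePixel are hypothetical names, and a plain row-major byte buffer stands in for the Leptonica Pix that GET_DATA_BYTE() reads. Skipping out-of-range samples instead of clamping them would be an equally reasonable choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an 8-bit image; not a Leptonica or Tesseract type.
struct GrayImage {
  int width;
  int height;
  std::vector<std::uint8_t> pixels;  // row-major, width * height bytes
};

// Clamps v into the closed range [lo, hi].
inline int ClampInt(int v, int lo, int hi) {
  return std::max(lo, std::min(v, hi));
}

// Reads the pixel at (x, y + y_offset). Both coordinates are clamped to the
// image, so the read never leaves the buffer (Y out of range) and never
// lands in a neighbouring row (X out of range).
inline std::uint8_t SafePixel(const GrayImage& img, int x, int y,
                              int y_offset) {
  assert(img.width > 0 && img.height > 0);
  assert(img.pixels.size() ==
         static_cast<std::size_t>(img.width) * img.height);
  const int cx = ClampInt(x, 0, img.width - 1);
  const int cy = ClampInt(y + y_offset, 0, img.height - 1);
  return img.pixels[static_cast<std::size_t>(cy) * img.width + cx];
}
```

With a guard like this, a sample requested at Y = 0 with a negative offset returns the top-row pixel, and one at the bottom row with a positive offset returns the bottom-row pixel, rather than reading outside the buffer.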
Attachments:
	Image_crop.png  1.5 MB