Cognitive OpenOCR

Llanque Mazurek

Aug 4, 2024, 1:29:13 PM

to newssourmata

Cognitive OpenOCR is an open source program for Microsoft Dynamics GP. It's free and was developed by combining the libraries from several open source OCR applications, and by collecting user feedback and input. It doesn't bundle many bloatware tools during installation, and the few it does are easily avoidable. The software also works with many different languages. One of the best things about Cognitive OpenOCR (other than its simplicity) is that it integrates extremely well with Microsoft SharePoint and other MS Office programs.

The problem with so many of the current OCR platforms available to businesses is that they're all based on Windows, which is not an ideal platform for enterprise applications due to the security risks associated with using Windows, especially in a cloud environment. Cognitive OpenOCR, however, is designed to run on Windows, so it can be used with any of the other available OCR platforms out there, as long as they're all running on the same server. If you have a Windows-based system, Cognitive OpenOCR should work out of the box. If you use Windows-based machines for your business, however, you'll want to look into hosted OCR solutions like Microsoft Business Solutions. These types of hosted solutions allow you to use Microsoft Cognitive Studio or Visual Studio to build your own cognitive recognition software, while leveraging already existing back-end processing and data sources.

One of the key features of Cognitive OpenOCR (other than its simplicity and integration with Microsoft systems) is the implementation of a convolutional neural network, called the ConvNet. The ConvNet is a network that "lays down" the actual image or text that is being recognized. This then becomes the basis for generating high-quality, crisp text recognition results. Another key feature of Cognitive OpenOCR is that it implements the idea of optical character recognition. By making use of the convolutional network and OCR technologies, this product promises to deliver near-flawless recognition results, while saving large amounts of money and time during the process.

Computer Vision is an AI service that analyzes content in images. We will use the OCR feature of Computer Vision to detect the printed text in an image. The application will extract the text from the image and detect the language of the text.

On the next screen, click on the Add button. It will open the Cognitive Services marketplace page. Search for Computer Vision in the search bar and click on the search result. It will open the Computer Vision API page. Click on the Create button to create a new Computer Vision resource. Refer to the image shown below.

We will install the Azure Computer Vision API library which will provide us with the models out of the box to handle the Computer Vision REST API response. To install the package, navigate to Tools >> NuGet Package Manager >> Package Manager Console. It will open the Package Manager Console. Run the command as shown below.

Right-click on the ngComputerVision project and select Add >> New Folder. Name the folder Models. Then right-click on the Models folder and select Add >> Class to add a new class file. Name the class file LanguageDetails.cs and click Add.

The Post method will receive the image data as a file collection in the request body and return an object of type OcrResultDTO. We will convert the image data to a byte array and invoke the ReadTextFromStream method. We will deserialize the response into an object of type OcrResult. We will then form the sentence by iterating over the OcrWord objects.
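The "form the sentence" step can be sketched roughly as below. This is a Python illustration rather than the C# controller from the article, and it assumes the OCR response nests regions, then lines, then words, each word carrying a `text` field. The function name `extract_text` and the sample payload are made up for the example.

```python
from typing import Any


def extract_text(ocr_result: dict[str, Any]) -> str:
    """Join recognized words with spaces, one output line per OCR line."""
    lines = []
    for region in ocr_result.get("regions", []):
        for line in region.get("lines", []):
            words = [word["text"] for word in line.get("words", [])]
            lines.append(" ".join(words))
    return "\n".join(lines)


# Hand-written sample shaped like an OCR response (not real service output).
sample = {
    "language": "en",
    "regions": [
        {
            "lines": [
                {"words": [{"text": "Hello"}, {"text": "world"}]},
                {"words": [{"text": "OCR"}]},
            ]
        }
    ],
}

print(extract_text(sample))
```

Keeping one output line per OCR line preserves the layout of the source image; joining everything with a single space would work just as well if only a flat sentence is needed.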

Inside the ReadTextFromStream method, we will create a new HttpRequestMessage. This HTTP request is a POST request. We will pass the subscription key in the header of the request. The OCR API will return a JSON object containing each word from the image as well as the detected language of the text.
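For readers who want to see the shape of that request outside of C#, here is a minimal Python sketch that only builds the request and does not send it. The `Ocp-Apim-Subscription-Key` header is how the subscription key is passed; the URL path, the query parameters, the example endpoint, and the helper name `build_ocr_request` are assumptions to check against your own Computer Vision resource.

```python
import urllib.request


def build_ocr_request(endpoint: str, subscription_key: str,
                      image_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying raw image bytes."""
    # Assumed path and query string; verify against your resource's API version.
    url = endpoint.rstrip("/") + "/vision/v2.0/ocr?language=unk&detectOrientation=true"
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,  # key travels in a header
        "Content-Type": "application/octet-stream",     # body is the binary image
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")


# Placeholder endpoint, key, and bytes for illustration only.
request = build_ocr_request(
    "https://example.cognitiveservices.azure.com", "YOUR_KEY", b"fake-image-bytes"
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and parsing the JSON body of the response.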

The GetAvailableLanguages method will return the list of all the languages supported by the Translate Text API. We will set the request URI and create an HttpRequestMessage which will be a GET request. This request URI will return a JSON object which will be deserialized to an object of type AvailableLanguage.

The OCR API returns the language code (e.g. en for English, de for German, etc.) of the detected language. But we cannot display the language code on the UI as it is not user-friendly. Therefore, we need a dictionary to look up the language name corresponding to the language code.

The Azure Computer Vision OCR API supports 25 languages. For the full set, see the list of supported languages. These languages are a subset of the languages supported by the Azure Translate Text API.

Since there is no dedicated API endpoint to fetch the list of languages supported by OCR API, we are using the Translate Text API endpoint to fetch the list of languages. We will create the language lookup dictionary using the JSON response from this API call and filter the result based on the language code returned by the OCR API.
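A rough Python sketch of building that lookup dictionary follows. It assumes the languages response has a top-level `translation` object keyed by language code, with a `name` field on each entry. The sample JSON is hand-written and `build_language_lookup` is a hypothetical helper name.

```python
import json


def build_language_lookup(languages_json: str) -> dict[str, str]:
    """Map language code -> display name from a languages response body."""
    payload = json.loads(languages_json)
    return {
        code: details["name"]
        for code, details in payload.get("translation", {}).items()
    }


# Hand-written sample in the assumed response shape (not real service output).
sample_response = json.dumps({
    "translation": {
        "en": {"name": "English", "nativeName": "English", "dir": "ltr"},
        "de": {"name": "German", "nativeName": "Deutsch", "dir": "ltr"},
    }
})

lookup = build_language_lookup(sample_response)
print(lookup)
```

Because the OCR languages are a subset of the translation languages, every code the OCR API returns should have an entry in this dictionary, though a fallback for missing codes is still a sensible precaution.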

We have defined a text area to display the detected text and a text box for displaying the detected language. We have defined a file upload control which will allow us to upload an image. After uploading the image, the preview of the image will be displayed using an <img> element.

The uploadImage method will be invoked upon uploading an image. We will check if the uploaded file is a valid image and within the allowed size limit. We will process the image data using a FileReader object. The readAsDataURL method will read the contents of the uploaded file.
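The validation step might look something like this, sketched in Python instead of the Angular component. The allowed MIME types, the 4 MB ceiling, and the error messages are assumptions for illustration; use whatever limits your OCR resource actually documents.

```python
# Assumed limits for illustration; confirm them against the service documentation.
ALLOWED_TYPES = {"image/jpeg", "image/png", "image/gif", "image/bmp"}
MAX_SIZE_BYTES = 4 * 1024 * 1024


def validate_upload(content_type: str, size_bytes: int) -> tuple[bool, str]:
    """Return (is_valid, message); message is empty when the file is accepted."""
    if content_type not in ALLOWED_TYPES:
        return False, "Please select a valid image file."
    if size_bytes > MAX_SIZE_BYTES:
        return False, "The image exceeds the allowed size limit."
    return True, ""


print(validate_upload("image/png", 1024))
print(validate_upload("application/pdf", 1024))
```

Checking the type before the size means a user who picks the wrong kind of file gets the more useful message first.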

Upon successful completion of the read operation, the reader.onload event will be triggered. The value of imagePreview will be set to the result returned by the FileReader object, which for readAsDataURL is a base64-encoded data URL string rather than an ArrayBuffer.
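A data URL, which is what readAsDataURL produces, is just the file's bytes base64-encoded behind a MIME-type prefix. The Python sketch below shows the format only; `to_data_url` is a hypothetical helper and the three-byte payload is a stand-in for real image data.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw bytes as a data URL suitable for an image preview."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


print(to_data_url(b"abc"))  # prints data:image/png;base64,YWJj
```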

Inside the GetText method, we will append the image file to a variable of type FormData. We will invoke the getTextFromImage method of the service and bind the result to an object of type OcrResult. We will search for the language name in the array availableLanguage, based on the language code returned from the service. If the language code is not found, we will set the language as unknown.
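The search-with-fallback described here can be sketched as follows. The field names `language_id` and `language_name` are assumptions, since the post does not show the AvailableLanguage type, and `resolve_language_name` is a made-up helper.

```python
from dataclasses import dataclass


@dataclass
class AvailableLanguage:
    # Field names are assumed for this sketch.
    language_id: str
    language_name: str


def resolve_language_name(code: str, available: list[AvailableLanguage]) -> str:
    """Return the display name for a language code, or 'unknown' if absent."""
    for language in available:
        if language.language_id == code:
            return language.language_name
    return "unknown"


languages = [AvailableLanguage("en", "English"), AvailableLanguage("de", "German")]
print(resolve_language_name("de", languages))
print(resolve_language_name("xx", languages))
```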

We will add the navigation links for our components in the nav menu. Open nav-menu.component.html and remove the links for Counter and Fetch data components. Add the following lines in the list of navigation links.

We have created an optical character recognition (OCR) application using Angular and the Computer Vision Azure Cognitive Service. The application is able to extract the printed text from the uploaded image and recognize the language of the text. It uses the Computer Vision OCR API, which can recognize text in 25 languages.

Optical Character Recognition (OCR) detects text in an image and extracts the recognized characters into a machine-usable character stream.

First, subscribe to Microsoft Azure. Microsoft gives a 7-day trial subscription key: -us/try/cognitive-services/ . We can use that subscription key for testing purposes.

Next, we need to log in to the Azure portal with our Azure credentials. Then we need to create an Azure Computer Vision subscription key in the Azure portal.

var characters = OCR.GetImageOCR(key, imagePath, endPoint = " ")

In our previous tutorial, you learned how to use the Amazon Rekognition API to OCR images. The hardest part of using the Amazon Rekognition API was obtaining your API keys. However, once you had your API keys, it was smooth sailing.

Ensure you have followed Obtaining Your Microsoft Cognitive Services Keys to obtain your subscription keys to the MCS API. From there, open the microsoft_cognitive_services.py file and update your SUBSCRIPTION_KEY:

We only need a single argument here, --image, which is the path to the input image on disk. We read this image from disk twice: once as a binary byte array (so we can submit it to the MCS API), and then again in OpenCV/NumPy format (so we can draw on/annotate it).
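A minimal sketch of that argument handling is shown below, using only the standard library. The OpenCV load (typically `cv2.imread(path)`) is left out so the snippet runs without extra packages. The function names and the `sign.jpg` path are placeholders, not the tutorial's actual code.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the single --image argument (path to the input image on disk)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--image", required=True,
                        help="path to the input image on disk")
    return parser.parse_args(argv)


def read_image_bytes(path: str) -> bytes:
    """Read the image as raw bytes, the form submitted to the OCR API."""
    with open(path, "rb") as image_file:
        return image_file.read()


# Explicit argv for illustration; a real script would call parse_args() with no arguments.
args = parse_args(["--image", "sign.jpg"])
print(args.image)
```

Reading the raw bytes separately from the OpenCV array avoids re-encoding the image before upload, so the API receives exactly the file that is on disk.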

Figure 2 shows the output of applying the MCS OCR API to our aircraft warning sign. If you recall, this is the same image we used in a previous tutorial when applying the Amazon Rekognition API. Therefore, I included the same image in this tutorial to demonstrate that the MCS OCR API can correctly OCR this image.

When working with low-quality images, the MCS API shined. Typically, I recommend you programmatically detect and discard low-quality images (as we did in a previous tutorial). However, if you find yourself in a situation where you have to work with low-quality images, it may be worth your while to use the Microsoft Azure Cognitive Services OCR API.

In this article, we will create an optical character recognition (OCR) application using Angular and the Azure Computer Vision Cognitive Service. Computer Vision is an AI service that analyzes content in images. We will use the OCR feature of Computer Vision to detect the printed text in an image. The application will extract the text from the image and detect the language of the text. Currently, the OCR API supports 25 languages.