
OCR engines

An OCR engine is used in the Digitization component to identify text in a file when native content is not available.

note

Images to be processed should have a resolution within the following range:

  • min: 50 x 50 pixels
  • max: 9000 x 9000 pixels
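
The range above can be checked before an image is sent to an OCR engine. The following is a minimal sketch, assuming the limits apply to each dimension independently and are inclusive; the function name `is_supported_resolution` and the constants are hypothetical and not part of any UiPath package.

```python
# Documented limits, assumed here to apply per dimension and to be inclusive.
MIN_SIDE = 50
MAX_SIDE = 9000


def is_supported_resolution(width: int, height: int) -> bool:
    """Return True if both dimensions fall within the documented range."""
    return MIN_SIDE <= width <= MAX_SIDE and MIN_SIDE <= height <= MAX_SIDE


if __name__ == "__main__":
    # A typical scanned page passes; a tiny thumbnail does not.
    print(is_supported_resolution(2480, 3508))
    print(is_supported_resolution(40, 40))
```

Images outside the range would need to be resized or rescanned before digitization.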

The following OCR engines are available throughout the Document Understanding™ Framework. Choose one according to your needs.

| OCR Engine | Activity Pack | Debug Logs Format in Logs Folder | Reports Confidence |
| --- | --- | --- | --- |
| UiPath Extended Languages OCR | UiPath.OCR.Activities | ${date:format=yyyy-MM-dd} | |
| UiPath Document OCR | UiPath.OCR.Activities | ${date:format=yyyy-MM-dd} | |
| OCR for Chinese, Japanese and Korean | UiPath.Core.Activities.CjkOCR | ${date:format=yyyy-MM-dd} | |
| OmniPage OCR | UiPath.OmniPage.Activities | ${date:format=yyyy-MM-dd} | |
| Google Cloud Vision OCR | UiPath.UIAutomation.Activities | ${date:format=yyyy-MM-dd} | ❌ if DetectionMode is set to TextDetection (default); ✅ if DetectionMode is set to DocumentTextDetection |
| Microsoft Azure Computer Vision OCR | UiPath.UIAutomation.Activities | ${date:format=yyyy-MM-dd} | ❌ if UseReadAPI is not selected (default); ✅ if UseReadAPI is selected |
| Microsoft OCR | UiPath.UIAutomation.Activities | ${date:format=yyyy-MM-dd} | |
| Tesseract OCR | UiPath.UIAutomation.Activities | ${date:format=yyyy-MM-dd} | |

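
For the two engines whose confidence reporting depends on a setting, the rule in the table can be written as a small lookup. This is only an illustrative sketch: the function `reports_confidence` is hypothetical, and it returns `None` for engines whose confidence behavior is not stated in the table.

```python
from typing import Optional


def reports_confidence(
    engine: str,
    *,
    detection_mode: str = "TextDetection",  # default per the table
    use_read_api: bool = False,             # default per the table
) -> Optional[bool]:
    """Return whether the engine reports confidence, or None if not stated."""
    if engine == "Google Cloud Vision OCR":
        # Confidence is reported only in DocumentTextDetection mode.
        return detection_mode == "DocumentTextDetection"
    if engine == "Microsoft Azure Computer Vision OCR":
        # Confidence is reported only when UseReadAPI is selected.
        return use_read_api
    # The table gives no condition for the remaining engines.
    return None
```

With the default settings, both calls return `False`; switching `detection_mode` to `"DocumentTextDetection"` or setting `use_read_api=True` returns `True` for the respective engine.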
note

When debugging errors, check the relevant OCR log files in the logs folder. Read more about logging here.
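
The `${date:format=yyyy-MM-dd}` token in the table looks like a date layout token that expands to a stamp such as `2024-03-05`. Assuming log file names contain that stamp, the sketch below finds the log files for a given day. The helper names `date_stamp` and `find_logs_for_date` are hypothetical, and the folder location is left to the caller because it is not specified here.

```python
from datetime import date
from pathlib import Path


def date_stamp(day: date) -> str:
    """Format a date as yyyy-MM-dd, e.g. 2024-03-05."""
    return day.strftime("%Y-%m-%d")


def find_logs_for_date(logs_folder: Path, day: date) -> list[Path]:
    """Return files in logs_folder whose names contain the day's stamp.

    Assumption: the date stamp appears somewhere in the file name.
    """
    stamp = date_stamp(day)
    return sorted(
        p for p in logs_folder.iterdir() if p.is_file() and stamp in p.name
    )
```

Calling `find_logs_for_date(Path(folder), date.today())` would then list the files to inspect for errors raised today.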