OCR Automation

This section describes how to use OCR Automation activities to extract text from an image or application and transfer it to a variable or application.

About OCR Automation Activities and Engine

The OCR Automation activities use Optical Character Recognition (OCR) technology, which enables you to scan different types of documents (for example, scanned paper documents, PDF files, MS Word documents, text files, or images) as well as the entire screen of the machine.

To extract and repurpose textual content from a scanned document or image file, OCR Automation activities rely on OCR engines to access the textual content and transfer it to a variable or machine. The activities receive a document or image file as a web link, local file, or web element, process it, and then send it to the OCR engine for extraction. As output, the activities provide a variable that contains the extracted textual content, which can be used in subsequent automation tasks.

The following OCR engines are available:

  • Tesseract
  • ABBYY FineReader
  • Amazon Textract
  • Google Cloud Vision API

The Tesseract OCR engine is preconfigured and does not require any additional setup. It is available by default for textual content extraction from document files or images.

OCR Engine Subscription

You must set up user or group subscriptions with the respective OCR engine provider to obtain the license keys and activation.

For more information about subscription:

Google Cloud

You can use Google Cloud Vision with the OCR activities to extract text from a specified web UI image. Cloud Vision provides an API that allows you to integrate its OCR image extraction services with the OCR engine in ASG-Studio.

Enable the Google Cloud Vision API

To use the image text extraction services provided by Google Cloud Vision, you need to enable the API and authenticate to it.

Perform the following main steps to enable and authenticate to the API:

  1. If you do not have a Google Account (Gmail or Google Apps), create one.
    • Sign in to the Google Cloud Platform console (console.cloud.google.com) and create a new project.
  2. After you create the project, enable the Google Cloud Vision API.
  3. On the Cloud Vision API page, create credentials to authenticate to the API.
  4. Create an API credentials key to make API calls.
  5. Select JSON as the key type and download the file.
  6. Rename the JSON file to gcred.json and then save it to the following location:
     C:\Program Files (x86)\ASG RPA BotEngine\Resources

For detailed information about Google Cloud Platform API management, see the Cloud Vision documentation.
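
Before you point ASG-Studio at the renamed key file, it can help to confirm that the download is intact. The following Python sketch is not part of ASG-Studio. It assumes the downloaded key is a Google Cloud service account key, which normally contains the fields listed in REQUIRED_FIELDS; the function name check_key_file is hypothetical.

```python
import json
from pathlib import Path

# Assumption: gcred.json is a service account key, which normally
# carries at least these fields.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(path):
    """Return a list of problems found in the key file (empty if none)."""
    key_path = Path(path)
    if not key_path.is_file():
        return [f"file not found: {key_path}"]
    try:
        data = json.loads(key_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    # Report every required field that is absent or empty.
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)
    ]
    if data.get("type") and data["type"] != "service_account":
        problems.append(f"unexpected key type: {data['type']}")
    return problems


if __name__ == "__main__":
    # Location taken from the steps above.
    target = r"C:\Program Files (x86)\ASG RPA BotEngine\Resources\gcred.json"
    for problem in check_key_file(target):
        print(problem)
```

An empty result only means that the file looks like a service account key. It does not confirm that the Cloud Vision API is enabled for the project.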

Configure OCR Engine

Perform the following steps:

  1. In ASG-Studio, navigate to the RPA visual designer and click the OCR Configuration button.
  2. In the OCR Configuration console, complete the steps for the OCR engine you want to configure:

    • Google Vision
      1. Expand the Google Vision section and click the Choose File button.
      2. Select the gcred.json file and open it.

    • Abbyy
      1. Select the Configure checkbox.
      2. Select region: Select the region to which your license key is registered.
      3. App Id: The application ID provided when subscribing to the ABBYY FineReader service.
      4. App Password: The application password provided when subscribing to the ABBYY FineReader service.

    • AWS
      1. Select the Configure checkbox.
      2. AWS Access Key ID: The access key ID provided when subscribing to the Amazon Textract service.
      3. AWS Secret Access Key: The secret access key provided when subscribing to the Amazon Textract service.
      4. S3 Bucket Name: The Amazon S3 bucket you created to upload your data.
      5. Region: Select the region to which your license key is registered.

  3. Click Save.
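
The settings that each engine needs can be summarized as a small checklist. The Python sketch below is illustrative only and does not read or write any ASG-Studio configuration. The setting names mirror the labels in the OCR Configuration console, while "Credentials File" and the function missing_settings are hypothetical names introduced for this example.

```python
# Settings each engine needs, based on the configuration steps above.
# "Credentials File" stands for the gcred.json file chosen for Google Vision.
REQUIRED_SETTINGS = {
    "Tesseract": [],  # preconfigured, no additional setup
    "Google Vision": ["Credentials File"],
    "Abbyy": ["Region", "App Id", "App Password"],
    "AWS": ["AWS Access Key ID", "AWS Secret Access Key", "S3 Bucket Name", "Region"],
}


def _is_blank(value):
    """Treat None and whitespace-only strings as not filled in."""
    return value is None or not str(value).strip()


def missing_settings(engine, settings):
    """Return the required settings that are absent or blank for an engine."""
    if engine not in REQUIRED_SETTINGS:
        raise ValueError(f"unknown OCR engine: {engine}")
    return [
        name for name in REQUIRED_SETTINGS[engine] if _is_blank(settings.get(name))
    ]


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    print(missing_settings("AWS", {"AWS Access Key ID": "EXAMPLE", "Region": "eu-west-1"}))
```

Running such a check before you click Save is a quick way to spot an empty field, for example a missing S3 bucket name for Amazon Textract.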

OCR Automation Activity

You can use the following activity to extract textual content from document files and images:

Perform OCR

Extracts textual content from a document file or image that is available as a local file on your machine, a web link, or a web element. This automation action uses OCR technology and an OCR engine to scan and extract data.

Properties

  • Name: Enter the display name of the action.
  • OCR Engine: Select the OCR engine you want to use for data extraction.

  • Processing Type: Select the location of the document or image file.

    • Local File

      • File Path: Specify the name (including extension) and path of the document or image file.
    • Web Link

      • Web Link: Specify the web address of the image.
    • Web Element

      • Instance Name: Specify the instance of the web browser which you want to use as a reference for the automation job.
      • Element Search Type: Select one of the following attributes that will be used to search for a UI element.
        • Find Element By ID
        • Find Element By Name
        • Find Element By XPath
        • Find Element By Tag Name
        • Find Element By Class Name
        • Find Element By CSS Selector
      • Search Parameter: The value that exists for a specific attribute of a UI element you want to find.
  • Variable Name: Specify the variable to store the extracted data.
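
The dependency between Processing Type and the remaining properties can be sketched as a simple validation routine. This Python example is not ASG-Studio code. It assumes that OCR Engine, Processing Type, and Variable Name are mandatory, and that the Element Search Type options correspond to the Selenium WebDriver locator strategies of the same names, which this section does not confirm. The function validate_perform_ocr is hypothetical.

```python
# Assumption: the search types correspond to Selenium-style locator strategies.
SEARCH_TYPES = {
    "Find Element By ID": "id",
    "Find Element By Name": "name",
    "Find Element By XPath": "xpath",
    "Find Element By Tag Name": "tag name",
    "Find Element By Class Name": "class name",
    "Find Element By CSS Selector": "css selector",
}

# Properties that become relevant for each Processing Type.
REQUIRED_BY_PROCESSING_TYPE = {
    "Local File": ["File Path"],
    "Web Link": ["Web Link"],
    "Web Element": ["Instance Name", "Element Search Type", "Search Parameter"],
}


def validate_perform_ocr(props):
    """Return a list of problems with a Perform OCR property set (empty if none)."""
    errors = []
    # Assumption: these general properties are all mandatory.
    for name in ("OCR Engine", "Processing Type", "Variable Name"):
        if not props.get(name):
            errors.append(f"missing property: {name}")
    processing_type = props.get("Processing Type")
    if not processing_type:
        return errors
    if processing_type not in REQUIRED_BY_PROCESSING_TYPE:
        errors.append(f"unknown processing type: {processing_type}")
        return errors
    for name in REQUIRED_BY_PROCESSING_TYPE[processing_type]:
        if not props.get(name):
            errors.append(f"missing property: {name}")
    search_type = props.get("Element Search Type")
    if processing_type == "Web Element" and search_type and search_type not in SEARCH_TYPES:
        errors.append(f"unknown element search type: {search_type}")
    return errors


if __name__ == "__main__":
    # Example property set; the file path and variable name are placeholders.
    example = {
        "Name": "Read invoice",
        "OCR Engine": "Tesseract",
        "Processing Type": "Local File",
        "File Path": r"C:\scans\invoice.png",
        "Variable Name": "ocrText",
    }
    print(validate_perform_ocr(example))
```

The point of the sketch is that only one group of location properties applies at a time: a Local File needs a File Path, a Web Link needs a web address, and a Web Element needs a browser instance together with a search type and a search parameter.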