OCR

The OCR connector recognizes the characters in a scanned document and extracts them as text. When the OCR language is not Japanese, the extracted text can be saved in DOCX or XML format. The OCR connector also identifies the top and bottom of the document and assigns a file name based on the text extracted from the first page of the scanned document.

This connector can only be added to a workflow whose [Job Processing Location] is set to [On Server].

  • When converted to the RTF or DOCX format, the extracted text is placed in text boxes to maintain document layout.
  • If the document contains text in a language other than the one specified, the parts that cannot be recognized are output as blank spaces.

OCR File Format Selector Key

To allow the image conversion plug-in to read the file format from the ScanSetting, enable the GC key:

  1. Log in to Streamline NX as a customer engineer.

    To grant the Customer Engineer privilege to a user (Admin) account, go to System ⟶ Security ⟶ User Accounts. Select the account, add the Customer Engineer role, save the changes, and then log in to SLNX again.

  2. In the Advanced System Settings Editor tab, click [View] and choose [Delegation Server settings].

  3. Click [Add].

  4. In the Details tab, enter the following:

    • Key: ds.scan.server.FileFormatSupport.mode

    • Type: Boolean

    • Value: True

  5. Click [Save].

  6. Log out of the Management Console and then log back in. The File Format option now appears when you configure the OCR connector properties.
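When this setting is tracked in a change log or deployment checklist, it can help to write the entry down as a small record. The following is a minimal Python sketch for that purpose only. The `AdvancedSetting` class and the `is_enabled` helper are hypothetical names introduced for illustration; they are not part of any Streamline NX interface, and the setting itself is still entered through the Management Console as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdvancedSetting:
    """Hypothetical record of one entry typed into the Details tab."""
    key: str
    type: str
    value: str


# The key, type, and value are the ones given in the procedure above.
FILE_FORMAT_SELECTOR = AdvancedSetting(
    key="ds.scan.server.FileFormatSupport.mode",
    type="Boolean",
    value="True",
)


def is_enabled(setting: AdvancedSetting) -> bool:
    """Return True when a Boolean setting holds a true value."""
    return setting.type == "Boolean" and setting.value.strip().lower() == "true"
```

A checklist script could, for example, call `is_enabled(FILE_FORMAT_SELECTOR)` before reminding the administrator to log out and back in.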

Configure the OCR Connector Properties

  1. In the Delivery Flow, click the [OCR] connector icon.

  2. Specify the display name.

  3. In General Settings, specify the Document Name Extraction and Format Conversion.

    • Document Name Extraction: the file name is generated using the keyword on the first page of the scanned document.

    • Format Conversion: when the format conversion function is enabled, the input files are combined into one file.

Supported Formats (Input Data)

  • TIFF

  • TIFF-F

  • DCX

  • BMP

  • JPEG

  • PNG

  • GIF

Convertible Formats (Output Data)

  • RTF

  • XLS

  • XLSX

  • DOCX

  • File Format Selected on [Scan Settings] tab: select this option to allow the user to select any file format
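For pre-checking files before they are submitted to a workflow, the format lists above can be expressed as a simple lookup. This is a sketch under two assumptions that the source does not state: that the image formats (TIFF through GIF) are the input formats and RTF, XLS, XLSX, and DOCX are the output formats, and that files carry the conventional file extensions shown in the code. The function name `input_format` is hypothetical. TIFF-F files normally share the TIFF extension, so they cannot be told apart by name alone.

```python
from pathlib import Path
from typing import Optional

# Assumed extension-to-format mapping; the source lists format names only.
INPUT_FORMATS_BY_EXTENSION = {
    ".tif": "TIFF",   # also used by TIFF-F, which this check cannot distinguish
    ".tiff": "TIFF",
    ".dcx": "DCX",
    ".bmp": "BMP",
    ".jpg": "JPEG",
    ".jpeg": "JPEG",
    ".png": "PNG",
    ".gif": "GIF",
}

# Output formats as listed for the connector.
OUTPUT_FORMATS = ("RTF", "XLS", "XLSX", "DOCX")


def input_format(file_name: str) -> Optional[str]:
    """Return the assumed input format for a file name, or None if unsupported."""
    return INPUT_FORMATS_BY_EXTENSION.get(Path(file_name).suffix.lower())
```

A file for which `input_format` returns `None` would fall under the "input data cannot be processed" condition described in the processing conditions.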

OCR Connector processing conditions

  • When the input data cannot be processed: the job moves to the next step in the delivery flow without performing OCR. No error occurs.
  • When the input data contains some data that cannot be processed: the files that include unprocessable data are skipped, and only the processable data is processed by OCR.
  • If an internal error occurs, OCR fails, and the remaining delivery flow is not executed.
  • When [Top and Bottom Identification] is enabled and the input data is in TIFF-F format, the output data is converted to TIFF format. The compression format (MMR, MH, etc.), however, is the same as the original input data.
  • When the 'File Format Selected on [Scan Settings] tab' option is enabled, place the OCR connector before the Image Conversion connector. Do not arrange more than one OCR connector in series in any other combination.
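The first three conditions above amount to a small decision table. The Python sketch below models it for illustration only; it is not the connector's implementation. The `Outcome` enum, the `ocr_outcome` function, and the idea of representing each input file as a processable/unprocessable flag are assumptions made for this example.

```python
from enum import Enum
from typing import Iterable


class Outcome(Enum):
    OCR_SKIPPED = "flow continues without OCR, no error"
    OCR_PARTIAL = "unprocessable files skipped, the rest processed"
    OCR_COMPLETE = "all files processed"
    FLOW_ABORTED = "OCR fails, remaining delivery flow not executed"


def ocr_outcome(processable: Iterable[bool], internal_error: bool = False) -> Outcome:
    """Map the documented processing conditions to an outcome.

    `processable` holds one flag per input file (True = OCR can process it).
    """
    flags = list(processable)
    if internal_error:
        # An internal error stops the rest of the delivery flow.
        return Outcome.FLOW_ABORTED
    if not any(flags):
        # Nothing can be processed: OCR is skipped and the flow continues.
        return Outcome.OCR_SKIPPED
    if all(flags):
        return Outcome.OCR_COMPLETE
    # Mixed input: only the processable files go through OCR.
    return Outcome.OCR_PARTIAL
```

For example, `ocr_outcome([True, False, True])` returns `Outcome.OCR_PARTIAL`, while `ocr_outcome([False, False])` returns `Outcome.OCR_SKIPPED`.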