Tasks

Analyze Document

The Analyze Document task provides various methods of extracting information from documents, such as Optical Character Recognition (OCR), entity extraction, ID proofing, and classification.

Use Cases

Use this task to extract text from a document
Use to extract a data structure of all entities in a filled in form
Use to extract data from a South African ID document
Use to classify a proof of payment document as belonging to a specific bank in South Africa

NOTE

If you are specifically wanted to extract financial information from an invoice or receipt, you should use the Analyze Financial File task.

Basic usage

Parameters

document_processor_type
required

string

The type of analysis that you would like to perform on the file.

Current Options are:

ocr
Extract text from a document with optical character recognition. The document can be a pdf or an image.
entity_extraction
Extract entities and form fields from a document. The document can be a pdf or an image.
id_proofing
Id proof a document, getting verification signals on image manipulation, verification of an id document and checking for suspicious words. The document can be a pdf or an image.
classifier
Classify a proof of payment document as belonging to a specific bank in South Africa. The document can be a pdf or an image.
parse_contents
Parse the contents of a document. The document can be a pdf, pptx, docx, html, xml.

[document_processor_type]
required

object

This object key must correspond with the document_processor_type.
This object will contain all the options and params for the selected document_processor_type.

For example, if the document_processor_type is ocr, then there must be an ocr object with the required options for the ocr processor type.

Show child attributes

params

optional

object

This object contains all the options and params for the selected document_processor_type.

files

required

object[]

An array of file objects. Each object contains file meta data.
The most important key for these objects is the fileuuid

Show child attributes

fileuuid
required

string

The unique identifier of the file that should be transcribed.

Result

Properties

payload

object

The returned payload

total_pages

integer

Number of pages analyzed

ocr

json object

Contains all the analysis data for the ocr document_processor_type. This object with change to correspond to the document_processor type.

ocr.full_text

string

Contains all the text the analysis could get from the file

ocr.detected_languages

json array

An array of objects. Each object in the array is a page of the file and indicates which languages were detected on that page of the document.

ocr.all_languages

json array

An array containing language codes for all languages detected in the document.

OCR

Shown in the Basic Usage

Entity Extraction

This document_processor_type processor type is used to extract entity and form fields from a document.

Example would be a document that has an organization and fields that have information in them like a receipt, invoice, application form, etc.

It will attempt to identify the field name and the value of that field and create a key value pair. It will do this for each form field that it detects.

Here is an example of a form. Its a Fax Cover Sheet with a form with fields and entries for each field.

On the left is what it has extracted from the image displayed on the right.

You will notice the document_processor_type is now entity_extraction and so there should be the corresponding object with the same name referring to the file it should analyze.

Result

The result returned would look something like this. The returned data is quite large so only showing a sample of the important data that is returned.

The result payload is similar to the first example with a few additions:

entity_extraction

json object

Contains all the analysis data for the ocr document_processor_type. This object with change to correspond to the document_processor type.

Please see above OCR example for explanations for the detected_languages and full_text objects & key values.

entity_extraction.entities

array

Contains an object for each entity detected. eg. organization is: Staples

entity_extraction.form_fields.pages

array

Contains an object for each pages.

Each page is an object in the array and each page object contains an object for each field detected. eg. Date: 04/12/2021 From: Jane Doe

ID Proofing

This is used to extract all the information from South African Identity documents, ie. Green bar code ID Book, Smart ID Card or Passports.

It doesn't only extract the information, it also does fraud checks to indicate if the document is valid or fraudulent.

You will notice the document_processor_type is now id_proofing and so there should be the corresponding object with the same name referring to the file it should analyze.

Result

The result returned would look something like this. The returned data is quite large so only showing a sample of the important data that is returned.

The result payload is similar to the first example with a few additions:

id_proofing

json object

Contains all the analysis data for the id_proofing document_processor_type. This object with change to correspond to the document_processor type.

Please see above OCR example for explanations for the detected_languages and full_text objects & key values.

id_proofing.id_proofing_signals

object

Contains an entry for each id proofing check performed and its result. eg. fraud_signals_suspicious_words: PASS

This indicates the the suspicious words check passed inspection. No suspicious words found.

id_proofing.all_signals_passed

boolean

Indicates if all the id proofing checks passed. If one check fails, this will be false.

Parse Contents

This document_processor_type is used to parse the text from a document. The document can be a pdf, pptx, docx, html, xml.

It uses a Large Language Model (LLM) to parse the text from the document.

Parameters

These parameters are specific to the parse_contents processor type.

parse_contents.sites

optional

object[]

An array of site objects. Each object contains site meta data.
The most important key for these objects is the url.

Only sites or files can be used, not both.

Show child attributes

url
required

string

The URL of the site that should be parsed. Will only parse this specific page.

parse_contents.params

optional

object

The type of analysis that you would like to perform on the file.

Show child attributes

parse_mode

optional

string

The mode of parsing to use. One of fast, accurate, premium

Default: accurate

fast | Best for text-only PDFs.
accurate | Performs OCR, image extraction, and table/heading identification. Ideal for complex documents with images.
premium | Same features as accurate, and outputs equations in LaTeX format. Outputs diagrams in Mermaid format.

result_type

optional

string

The type of result to return. One of text, markdown, json.

Default: markdown

markdown and json are only available in accurate and premium mode.

parsing_instruction

optional

string

The prompt to provide to the LLM to help it understand the context of the document. Applied to each page of the parsed document.

disable_ocr

optional

boolean

Skip the OCR step. Useful for text-only PDFs as it will speed up the parsing process.

skip_diagonal_text

optional

boolean

Skip any text that is not horizontal or vertical.

do_not_unroll_columns

optional

boolean

By default multiple columns of text are parsed as a single continuous block. Set this to true to keep the columns separate.

Result

The result returned would look something like this. The returned data is quite large so only showing a sample of the important data that is returned.

full_text

string

Contains all the parsed text from the document. Typically in the format specified by result_type.

Links

Options

Analyze Document

Use Cases

Basic usage

Parameters

document_processor_type
required

string

[document_processor_type]
required

object

fileuuid
required

string

Result

Properties

OCR

Entity Extraction

Result

ID Proofing

Result

Parse Contents

Parameters

url
required

string

Result

On this page

Analyze Document

# Use Cases

# Basic usage

# Parameters

# document_processor_type required string

# [document_processor_type] required object

# fileuuid required string

# Result

# Properties

# OCR

# Entity Extraction

# Result

# ID Proofing

# Result

# Parse Contents

# Parameters

# url required string

# Result

On this page

Use Cases

Basic usage

Parameters

document_processor_type
required

string

[document_processor_type]
required

object

fileuuid
required

string

Result

Properties

OCR

Entity Extraction

Result

ID Proofing

Result

Parse Contents

Parameters

url
required

string

Result