PDF Tools
OCR PDF
Perform OCR on PDF files, making text in images selectable and searchable.
Uses
Use this task to convert unselectable text in images within a PDF into selectable, searchable text, which is essential for further processing, indexing, or accessibility.
Basic Usage
Flow Data Explained
required array
files An array of PDF files to be processed by OCR.
required string
function The pdf_tools function to call; for this task, ocr_pdf.
required object
ocr_settings The OCR settings.
required array
languages Specifies one or more languages for OCR text recognition. Use language codes from the list below:
eng – English
chi_sim – Chinese (Simplified)
deu – German
fra – French
por – Portuguese
required string
ocr_type Determines how the OCR engine handles existing text in the PDF:
Normal – OCR only images or pages without existing text.
skip-text – Skips OCR entirely on pages where selectable text already exists.
force-ocr – Forces OCR on all pages, even if they already contain text.
required string
ocr_render_type Controls how the OCR result is embedded into the PDF:
hocr – Produces an invisible text layer without altering the visual content.
sandwich – Inserts the recognized text behind the existing page images, allowing text selection while preserving the original visual appearance.
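
The attributes above can be assembled into flow data roughly as follows. This is a minimal Python sketch, not the official client: the helper name build_ocr_flow_data is hypothetical, the nesting of languages, ocr_type, and ocr_render_type under ocr_settings is assumed from the attribute layout, and representing each file as a plain string is an assumption, since this page does not say how files are referenced.

```python
import json

# Allowed values, copied from the attribute descriptions in this section.
LANGUAGES = {"eng", "chi_sim", "deu", "fra", "por"}
OCR_TYPES = {"Normal", "skip-text", "force-ocr"}
RENDER_TYPES = {"hocr", "sandwich"}


def build_ocr_flow_data(files, languages, ocr_type, ocr_render_type):
    """Assemble flow data for ocr_pdf, rejecting values not listed above.

    Hypothetical helper: the field nesting is an assumption, not a
    confirmed request schema.
    """
    if not files:
        raise ValueError("files must contain at least one PDF")
    if not languages:
        raise ValueError("languages must contain at least one code")
    unknown = sorted(set(languages) - LANGUAGES)
    if unknown:
        raise ValueError(f"unsupported language codes: {unknown}")
    if ocr_type not in OCR_TYPES:
        raise ValueError(f"unsupported ocr_type: {ocr_type!r}")
    if ocr_render_type not in RENDER_TYPES:
        raise ValueError(f"unsupported ocr_render_type: {ocr_render_type!r}")

    return {
        "files": list(files),  # assumed: file references as plain strings
        "function": "ocr_pdf",
        "ocr_settings": {
            "languages": list(languages),
            "ocr_type": ocr_type,
            "ocr_render_type": ocr_render_type,
        },
    }


# Example: OCR an English/German scan, leaving pages with existing text alone.
flow_data = build_ocr_flow_data(
    files=["scan.pdf"],  # placeholder file reference
    languages=["eng", "deu"],
    ocr_type="skip-text",
    ocr_render_type="sandwich",
)
print(json.dumps(flow_data, indent=2))
```

Validating the values before sending keeps a misspelled language code or OCR mode from reaching the task; how the service itself reports such errors is not described here.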