Extract Documents Node

AI/Processing

Extract Documents

Extracts text and content from multiple documents in parallel.

ai_processing_extract_documents | processing | Long running
Inputs: 3
Outputs: 2
Security exposure: 0/10
Package: processing

Ratings

Scores range from 0 to 10. Higher values mean more impact, exposure, or operational weight.

Security: 0/10. Attack surface and exposure impact.
Privacy: 0/10. Potential sensitivity of processed data.
Performance: 1/10. Runtime or resource pressure.
Governance: 0/10. Policy, audit, or compliance impact.
Reliability: 2/10. Operational stability considerations.
Cost: 0/10. External or compute cost impact.

Input Pins (3)

Input

Execution
exec_in

Execution trigger to start batch document extraction.

Files

Struct Array
files

Array of document files to extract.

FlowPath (3 fields, schema enforced)
path: string, required
store_ref: string, required
cache_store_ref: string | null
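To make the shape of a `files` entry concrete, here is a minimal Python sketch that mirrors the three FlowPath fields listed above. The class, its validation, and the sample values (`reports/q1.pdf`, `example-store`) are illustrative assumptions, not part of the node's API; only the field names and their required or nullable status come from the schema.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowPath:
    """Hypothetical mirror of the FlowPath schema (not an official class)."""

    path: str                              # required string
    store_ref: str                         # required string
    cache_store_ref: Optional[str] = None  # string or null

    def __post_init__(self) -> None:
        # The schema is enforced, so reject wrongly typed fields early.
        for name in ("path", "store_ref"):
            if not isinstance(getattr(self, name), str):
                raise ValueError(f"{name} must be a string")
        if self.cache_store_ref is not None and not isinstance(
            self.cache_store_ref, str
        ):
            raise ValueError("cache_store_ref must be a string or None")


# Sample payload for the `files` pin; the values are placeholders.
files = [
    asdict(FlowPath(path="reports/q1.pdf", store_ref="example-store")),
    asdict(FlowPath(path="reports/q2.pdf", store_ref="example-store")),
]
```

Each entry serializes to a plain dictionary with exactly the three schema keys, and `cache_store_ref` falls back to `None` when omitted.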

Extract Images

Boolean
extract_images

Whether to extract and embed images from documents.

Default: true

Output Pins (2)

Output

Execution
exec_out

Execution output after all extractions complete.

Results

Struct Array
results

Array of extracted document pages for each file.

DocumentPage (3 fields, schema enforced)
page_number: integer (uint32, min 0), required
content: string, required
images: Array<NodeImage>, required
  NodeImage (array item)
  image_ref: string, required
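As a rough illustration of consuming the `results` output, the following Python sketch types the DocumentPage fields and joins page text in page order. It assumes `results` arrives as a flat list of page dictionaries; the page describes it only as an array of pages for each file, so a nested layout is also possible. The helper `summarize` and the sample data are hypothetical.

```python
from typing import List, Tuple, TypedDict


class NodeImage(TypedDict):
    image_ref: str


class DocumentPage(TypedDict):
    page_number: int  # uint32 in the schema, so never negative
    content: str
    images: List[NodeImage]


def summarize(pages: List[DocumentPage]) -> Tuple[str, List[str]]:
    """Return the combined text and all image references, ordered by page number."""
    ordered = sorted(pages, key=lambda page: page["page_number"])
    text = "\n\n".join(page["content"] for page in ordered)
    image_refs = [img["image_ref"] for page in ordered for img in page["images"]]
    return text, image_refs


# Placeholder results, deliberately out of order to show the sort.
results: List[DocumentPage] = [
    {"page_number": 2, "content": "Second page", "images": []},
    {"page_number": 1, "content": "First page", "images": [{"image_ref": "img-1"}]},
]

text, image_refs = summarize(results)
```

Sorting by `page_number` is a defensive choice here: the page does not say whether pages are returned in order.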

Node Info

Internal name: ai_processing_extract_documents
Category: AI/Processing