AI Extract Documents Node

AI Extract Documents

Extracts text and other content from multiple documents in parallel using AI.

ai_processing_extract_documents_ai

processing

Long running

Inputs: 8

Outputs: 2

Security exposure: 5/10

Package: processing

Ratings

Scores range from 0 to 10. Higher values mean more impact, exposure, or operational weight.

Security: Attack surface and exposure impact.

5/10 (Medium)

Privacy: Potential sensitivity of processed data.

5/10 (Medium)

Performance: Runtime or resource pressure.

3/10 (High)

Governance: Policy, audit, or compliance impact.

5/10 (Medium)

Reliability: Operational stability considerations.

3/10 (High)

Cost: External or compute cost impact.

7/10 (Low)

Input Pins

Input

Execution

exec_in

Execution trigger to start AI-powered batch extraction.

Files

Struct Array

files

Array of document files to extract.

Schema enforced

Model

Struct

model

Vision-capable AI model for image analysis and OCR.

Schema enforced

Extract Images

Boolean

extract_images

Whether to extract and embed images from documents.

Default: true

Images Per Message

Integer

images_per_message

Number of images sent in each LLM request. Higher values finish faster but may exceed the model's token limit.

Default: 2

Pages Per Batch

Integer

pages_per_batch

Number of PDF pages processed in parallel. Higher values finish faster but use more memory.

Default: 2

Temperature

Float

temperature

LLM sampling temperature (0.0 = deterministic, 1.0 = creative). Lower values work better for extraction.

Default: 0.1

Max Tokens

Integer

max_tokens

Maximum output tokens per LLM call. Set to 0 to use the model default. Use a lower value for unreliable models.

Default: 4096
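
The tunable input pins and their documented defaults can be summarized in a small sketch. It is illustrative only: `ExtractSettings`, `chunk`, and `estimate_requests` are hypothetical names and not part of the node's API. The grouping logic is an assumption about how `images_per_message` batches images into LLM requests, not the node's confirmed implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class ExtractSettings:
    """Hypothetical container mirroring the documented pins and defaults."""

    extract_images: bool = True
    images_per_message: int = 2
    pages_per_batch: int = 2
    temperature: float = 0.1
    max_tokens: int = 4096  # 0 means "use the model default"

    def validate(self) -> None:
        # Assumed sanity checks; the node's real validation rules are not documented here.
        if self.images_per_message < 1:
            raise ValueError("images_per_message must be at least 1")
        if self.pages_per_batch < 1:
            raise ValueError("pages_per_batch must be at least 1")
        if self.temperature < 0.0:
            raise ValueError("temperature must not be negative")
        if self.max_tokens < 0:
            raise ValueError("max_tokens must be 0 (model default) or positive")


def chunk(items: list, size: int) -> list:
    """Split items into consecutive groups of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def estimate_requests(image_count: int, settings: ExtractSettings) -> int:
    """Estimate LLM requests, assuming one request per group of images."""
    return math.ceil(image_count / settings.images_per_message)


if __name__ == "__main__":
    settings = ExtractSettings()
    settings.validate()
    pages = ["page-1", "page-2", "page-3", "page-4", "page-5"]
    print(chunk(pages, settings.pages_per_batch))
    print(estimate_requests(len(pages), settings))
```

Under these assumptions, five images with the default `images_per_message` of 2 would take three requests. Raising the value to 5 would take one request, at the price of a larger prompt that is more likely to hit the model's token limit.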

Output Pins

Output

Execution

exec_out

Execution output that fires after all extractions complete.

Results

Struct Array

results

Array of extracted document pages, with AI-generated descriptions, for each input file.

Schema enforced

Node Info

Internal name: ai_processing_extract_documents_ai
Category: AI/Processing
Version: 2