Summarize Document Node

Creates an intelligent summary of document pages using AI with configurable strategies and detail levels. Handles long documents via chunked summarization with multiple strategy options.

ai_processing_summarize_document
Package: processing
Long running: yes
Inputs: 12
Outputs: 2
Security exposure: 5/10

Ratings

Scores range from 0 to 10. Higher values mean more impact, exposure, or operational weight.

Security: 5/10 (Medium). Attack surface and exposure impact.
Privacy: 5/10 (Medium). Potential sensitivity of processed data.
Performance: 4/10 (Medium). Runtime or resource pressure.
Governance: 5/10 (Medium). Policy, audit, or compliance impact.
Reliability: 3/10 (High). Operational stability considerations.
Cost: 6/10 (Medium). External or compute cost impact.

Input Pins (12)

Input

Execution
exec_in

Execution trigger to start summarization.

Pages

Struct Array
pages

Document pages to summarize.

Schema: DocumentPage (3 fields, schema enforced)
  page_number: integer (uint32, min 0), required
  content: string, required
  images: Array<NodeImage>, required
    NodeImage (array item)
      image_ref: string, required
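
As a rough illustration of the DocumentPage shape above, the following Python sketch mirrors the three fields with dataclasses and serializes them to plain dictionaries. The class names follow the schema, but the sample values and the idea of passing a list of dictionaries to the node are assumptions; how the runtime actually accepts the `pages` value is not described here.

```python
from dataclasses import asdict, dataclass, field

UINT32_MAX = 0xFFFFFFFF


@dataclass
class NodeImage:
    image_ref: str


@dataclass
class DocumentPage:
    page_number: int
    content: str
    images: list[NodeImage] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The schema declares page_number as uint32 with a minimum of 0.
        if not 0 <= self.page_number <= UINT32_MAX:
            raise ValueError("page_number must fit in uint32")


# Made-up sample pages, for illustration only.
pages = [
    DocumentPage(page_number=1, content="Introduction to the report."),
    DocumentPage(page_number=2, content="Methods.", images=[NodeImage("img-001")]),
]

# asdict recurses into nested dataclasses, so images become plain dicts too.
payload = [asdict(page) for page in pages]
```

Every page carries an `images` list, even when empty, because the schema marks the field as required.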

Model

Struct
model

AI model to use for summarization.

Schema: Bit (19 fields, schema enforced)
  id: string, default ""
  type: BitTypes, enum "Llm", "Vlm", "Tts", "Stt"..., default "Other"
  meta: Map<string, Metadata>, default {}
    Metadata (map value)
      name: string, required
      description: string, required
      long_description: string | null
      release_notes: string | null
      tags: Array<string>, required
      +11 more fields
  authors: Array<string>, default []
  repository: string | null, default null
  download_link: string | null, default null
  file_name: string | null, default null
  hash: string, default ""
  size: integer | null (uint64, min 0), default null
  hub: string, default ""
  parameters: value, default null
  version: string | null, default null
  license: string | null, default null
  dependencies: Array<string>, default []
  dependency_tree_hash: string, default ""
  created: string, default ""
  updated: string, default ""
  model_slug: string | null, default null
  +1 more fields

Detail Level

String
detail_level

Summary detail level: Low (very concise), Medium (balanced), High (comprehensive).

Default Medium
Options: Low, Medium, High

Include TOC

Boolean
include_toc

Whether to include a table of contents with page references.

Default true

Strategy

String
strategy

Summarization strategy:
• Refine: sequential, best coherence, no parallelism
• MapReduce: parallel chunking, fast, may lose cross-chunk context
• Hierarchical: structure-aware tree, best for headed documents
• Hybrid: MapReduce speed plus Refine coherence polish
• SlidingWindow: fixed memory buffer, best for very long documents

Default Refine
Options: Refine, MapReduce, Hierarchical, Hybrid, SlidingWindow
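
To make the difference between the first two strategies concrete, here is a minimal Python sketch of the general Refine and MapReduce patterns. It is not the node's implementation: `summarize` is a hypothetical callable standing in for a model call, and the prompt assembly is deliberately simplistic.

```python
from typing import Callable

# Hypothetical stand-in for a model call: text in, shorter text out.
Summarizer = Callable[[str], str]


def refine(chunks: list[str], summarize: Summarizer) -> str:
    """Sequential: each step sees the running summary plus the next chunk."""
    summary = ""
    for chunk in chunks:
        summary = summarize(f"{summary}\n{chunk}".strip())
    return summary


def map_reduce(chunks: list[str], summarize: Summarizer) -> str:
    """Summarize chunks independently, then summarize the partial summaries."""
    if not chunks:
        return ""
    # The map step has no dependencies between chunks, so it could run in parallel.
    partials = [summarize(chunk) for chunk in chunks]
    return summarize("\n".join(partials))


def truncate(text: str) -> str:
    # Toy summarizer for demonstration only: keep the first 40 characters.
    return text[:40]


demo_chunks = ["First section text.", "Second section text.", "Third section text."]
refined = refine(demo_chunks, truncate)
reduced = map_reduce(demo_chunks, truncate)
```

In this sketch Refine makes one call per chunk and each call depends on the previous one, which is why it cannot be parallelized. MapReduce makes one extra call for the reduce step, but each map call only ever sees its own chunk, which is where cross-chunk context can be lost.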

Densification

String
densification

Post-processing to increase information density:
• None: use the strategy output as-is
• ChainOfDensity: iteratively compress to optimal density

Default None
Options: None, ChainOfDensity
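
The iterative nature of ChainOfDensity can be sketched as a simple loop. This is only an outline under assumptions: `densify` is a hypothetical callable that rewrites a summary against the source text, and the step range of 1 to 5 is taken from the `density_steps` pin described on this page. The prompts the node actually uses are not documented here.

```python
from typing import Callable

# Hypothetical rewrite step: (current summary, source text) -> denser summary.
Densifier = Callable[[str, str], str]


def chain_of_density(
    summary: str,
    source: str,
    densify: Densifier,
    density_steps: int = 3,
) -> str:
    # density_steps is documented as 1-5 with a default of 3.
    if not 1 <= density_steps <= 5:
        raise ValueError("density_steps must be between 1 and 5")
    for _ in range(density_steps):
        # Each pass receives the previous pass's output, not the original summary.
        summary = densify(summary, source)
    return summary
```

Each step costs one additional model call on top of the chosen strategy, so a higher `density_steps` value increases both latency and cost.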

Max Context Tokens

Integer
max_context_tokens

Maximum characters per summarization chunk (adjust based on model context window).

Default 8000

Chunk Overlap %

Integer
chunk_overlap

Overlap between adjacent chunks as a percentage (0-50). Prevents information loss at chunk boundaries (default: 10).

Default 10
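
The interaction between the chunk size and the overlap percentage can be shown with a short Python sketch. It takes the `max_context_tokens` description at its word and counts characters; the function name and the exact boundary handling are assumptions, not the node's actual chunker.

```python
def chunk_text(text: str, max_chars: int = 8000, overlap_pct: int = 10) -> list[str]:
    """Split text into chunks of at most max_chars, overlapping by overlap_pct percent."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap_pct <= 50:
        raise ValueError("overlap_pct must be between 0 and 50")

    overlap = max_chars * overlap_pct // 100
    step = max_chars - overlap  # always positive because overlap is at most 50%

    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks
```

With the defaults, the overlap is 800 characters and each chunk starts 7200 characters after the previous one, so the last 800 characters of one chunk are repeated at the start of the next.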

Track Entities

Boolean
track_entities

Extract and track named entities across chunks to prevent information loss.

Default false

Parallel Requests

Integer
parallel_requests

Number of chunks to process in parallel for MapReduce/Hybrid strategies. 0 = unlimited (default: 4).

Default 4
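
One common way to cap concurrent requests is a semaphore. The sketch below uses Python's `asyncio` and treats 0 as unlimited, matching the description above. The `summarize` coroutine is a hypothetical stand-in for a model call, and nothing here is taken from the node's own code.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical async model call: chunk in, partial summary out.
AsyncSummarizer = Callable[[str], Awaitable[str]]


async def map_chunks(
    chunks: list[str],
    summarize: AsyncSummarizer,
    parallel_requests: int = 4,
) -> list[str]:
    if parallel_requests < 0:
        raise ValueError("parallel_requests must be 0 (unlimited) or positive")

    if parallel_requests == 0:
        # Unlimited: start every request at once.
        return list(await asyncio.gather(*(summarize(c) for c in chunks)))

    gate = asyncio.Semaphore(parallel_requests)

    async def limited(chunk: str) -> str:
        # At most parallel_requests coroutines are inside this block at a time.
        async with gate:
            return await summarize(chunk)

    # gather returns results in input order, regardless of completion order.
    return list(await asyncio.gather(*(limited(c) for c in chunks)))
```

Setting the value to 0 removes the cap entirely, which may run into provider rate limits on documents with many chunks.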

Density Steps

Integer
density_steps

Number of Chain of Density refinement steps when densification is enabled (1-5, default: 3).

Default 3
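
The configurable input pins above, together with their documented defaults and ranges, can be collected into one validated settings object. This Python sketch is illustrative only: the class name `SummarizeSettings` is hypothetical, and the positivity check on `max_context_tokens` is an assumption rather than a documented constraint.

```python
from dataclasses import dataclass

DETAIL_LEVELS = {"Low", "Medium", "High"}
STRATEGIES = {"Refine", "MapReduce", "Hierarchical", "Hybrid", "SlidingWindow"}
DENSIFICATIONS = {"None", "ChainOfDensity"}


@dataclass(frozen=True)
class SummarizeSettings:
    # Defaults copied from the pin descriptions on this page.
    detail_level: str = "Medium"
    include_toc: bool = True
    strategy: str = "Refine"
    densification: str = "None"
    max_context_tokens: int = 8000
    chunk_overlap: int = 10
    track_entities: bool = False
    parallel_requests: int = 4
    density_steps: int = 3

    def __post_init__(self) -> None:
        if self.detail_level not in DETAIL_LEVELS:
            raise ValueError(f"unknown detail_level: {self.detail_level}")
        if self.strategy not in STRATEGIES:
            raise ValueError(f"unknown strategy: {self.strategy}")
        if self.densification not in DENSIFICATIONS:
            raise ValueError(f"unknown densification: {self.densification}")
        if self.max_context_tokens <= 0:  # assumed constraint, not documented
            raise ValueError("max_context_tokens must be positive")
        if not 0 <= self.chunk_overlap <= 50:
            raise ValueError("chunk_overlap must be between 0 and 50")
        if self.parallel_requests < 0:
            raise ValueError("parallel_requests must be 0 (unlimited) or positive")
        if not 1 <= self.density_steps <= 5:
            raise ValueError("density_steps must be between 1 and 5")


defaults = SummarizeSettings()
fast = SummarizeSettings(strategy="MapReduce", parallel_requests=8)
```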

Output Pins (2)

Output

Execution
exec_out

Execution output after summarization completes.

Summary

Struct
summary

The generated document summary.

Schema: DocumentSummary (3 fields, schema enforced)
  summary: string, required
  keywords: Array<string>, required
  page_references: Array<PageReference>, required
    PageReference (array item): reference to content location within the document
      topic: string, required
      pages: Array<integer (uint32, min 0)>, required
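
A downstream consumer might want to invert `page_references` into a page-to-topics lookup. The Python sketch below shows one way to do that on a dictionary shaped like DocumentSummary. The sample values are made up for illustration, and `topics_by_page` is a hypothetical helper, not part of the node.

```python
# Made-up sample shaped like the DocumentSummary schema above.
doc_summary = {
    "summary": "The report covers goals, methods, and results.",
    "keywords": ["goals", "methods", "results"],
    "page_references": [
        {"topic": "Goals", "pages": [1, 2]},
        {"topic": "Methods", "pages": [2, 3]},
        {"topic": "Results", "pages": [4]},
    ],
}


def topics_by_page(summary: dict) -> dict[int, list[str]]:
    """Map each page number to the topics that reference it."""
    index: dict[int, list[str]] = {}
    for ref in summary["page_references"]:
        for page in ref["pages"]:
            index.setdefault(page, []).append(ref["topic"])
    return index


page_index = topics_by_page(doc_summary)
```

A page that appears under several topics, such as page 2 in the sample, ends up with all of those topics in its list.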

Node Info

Internal name: ai_processing_summarize_document
Category: AI/Processing
Version: 4