Create DataFusion Session Node

Create DataFusion Session

Creates a new DataFusion session for SQL analytics. Configure optimization settings for production workloads.

df_create_session
Inputs: 10
Outputs: 2
Security exposure: 0/10
Package: data

Ratings

Scores range from 0 to 10. Higher values mean more impact, exposure, or operational weight.

Security: 0/10. Attack surface and exposure impact.
Privacy: 0/10. Potential sensitivity of processed data.
Performance: 1/10. Runtime or resource pressure.
Governance: 1/10. Policy, audit, or compliance impact.
Reliability: 1/10. Operational stability considerations.
Cost: 0/10. External or compute cost impact.

Input Pins (10)

Input

Execution
exec_in

Trigger execution

Session Name

String
session_name

Unique name for this session (used for caching)

Default default

Target Partitions

Integer
target_partitions

Number of partitions for parallel query execution. Higher values increase parallelism but add overhead. 0 = auto (uses CPU count).

Default 0

Batch Size

Integer
batch_size

Number of rows processed per batch. Larger batches improve throughput but use more memory.

Default 8192

Repartition Joins

Boolean
repartition_joins

Enable automatic repartitioning before joins for better parallelism

Default true

Repartition Aggregations

Boolean
repartition_aggregations

Enable automatic repartitioning before aggregations

Default true

Repartition Sorts

Boolean
repartition_sorts

Enable automatic repartitioning for parallel sorting

Default true

Coalesce Batches

Boolean
coalesce_batches

Combine small batches into larger ones to reduce overhead

Default true

Parquet Pruning

Boolean
parquet_pruning

Enable predicate pushdown and column pruning for Parquet files

Default true

Collect Statistics

Boolean
collect_statistics

Collect statistics from data sources for query optimization

Default true
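
The inputs above can be pictured as a single settings object that is translated into DataFusion configuration keys. The following is a minimal sketch under stated assumptions, not the node's actual implementation: `SessionSettings`, `resolved_partitions`, and `to_datafusion_options` are hypothetical names, the mapping from each pin to a `datafusion.*` key is an assumption about how the node is wired, and resolving `0` to the CPU count on the Python side (rather than inside the engine) is also an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionSettings:
    """Hypothetical container mirroring the node's input pins and defaults."""

    session_name: str = "default"
    target_partitions: int = 0  # 0 = auto (uses CPU count)
    batch_size: int = 8192
    repartition_joins: bool = True
    repartition_aggregations: bool = True
    repartition_sorts: bool = True
    coalesce_batches: bool = True
    parquet_pruning: bool = True
    collect_statistics: bool = True

    def resolved_partitions(self) -> int:
        """Return the effective partition count, treating 0 as 'auto'."""
        if self.target_partitions < 0:
            raise ValueError("target_partitions must be >= 0")
        if self.target_partitions == 0:
            # os.cpu_count() may return None, so fall back to one partition.
            return os.cpu_count() or 1
        return self.target_partitions

    def to_datafusion_options(self) -> dict[str, str]:
        """Render the settings as DataFusion config key/value strings.

        Assumption: each pin corresponds to the key shown here.
        """
        if self.batch_size <= 0:
            raise ValueError("batch_size must be positive")

        def flag(value: bool) -> str:
            return "true" if value else "false"

        return {
            "datafusion.execution.target_partitions": str(self.resolved_partitions()),
            "datafusion.execution.batch_size": str(self.batch_size),
            "datafusion.optimizer.repartition_joins": flag(self.repartition_joins),
            "datafusion.optimizer.repartition_aggregations": flag(self.repartition_aggregations),
            "datafusion.optimizer.repartition_sorts": flag(self.repartition_sorts),
            "datafusion.execution.coalesce_batches": flag(self.coalesce_batches),
            "datafusion.execution.parquet.pruning": flag(self.parquet_pruning),
            "datafusion.execution.collect_statistics": flag(self.collect_statistics),
        }


if __name__ == "__main__":
    settings = SessionSettings(target_partitions=4, batch_size=1024)
    for key, value in settings.to_datafusion_options().items():
        print(f"{key} = {value}")
```

Keeping the defaults in one frozen object makes it easy to compare a production configuration against the documented defaults before the session is created.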

Output Pins (2)

Done

Execution
exec_out

Session created successfully

Session

Struct
session

DataFusion session reference for use with other DataFusion nodes

DataFusionSession (1 field)
cache_key: string, required
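
Because the session output carries only a `cache_key`, downstream nodes presumably look the live session up by that key rather than receiving the engine object itself. The sketch below illustrates that pattern under stated assumptions: `SessionCache`, `create`, and `resolve` are hypothetical names, the `df_session:` key prefix is invented for illustration, and both deriving the key from Session Name and replacing an existing entry when a name is reused are assumptions rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFusionSession:
    """Mirrors the documented output struct: one required string field."""

    cache_key: str


class SessionCache:
    """Hypothetical in-memory store mapping cache keys to engine contexts."""

    def __init__(self) -> None:
        self._contexts: dict[str, object] = {}

    def create(self, session_name: str, context: object) -> DataFusionSession:
        """Store a context and return the lightweight reference to it."""
        if not session_name:
            raise ValueError("session_name must not be empty")
        # Assumption: the cache key is derived directly from the session name.
        key = f"df_session:{session_name}"
        # Assumption: reusing a name replaces the previously cached context.
        self._contexts[key] = context
        return DataFusionSession(cache_key=key)

    def resolve(self, session: DataFusionSession) -> object:
        """Return the cached context for a session reference."""
        try:
            return self._contexts[session.cache_key]
        except KeyError:
            raise LookupError(
                f"no session cached under {session.cache_key!r}"
            ) from None


if __name__ == "__main__":
    cache = SessionCache()
    ref = cache.create("default", context=object())
    print(ref.cache_key)
```

Passing a small reference struct between nodes keeps the wire format serializable, at the cost of requiring the session name to be unique within the cache.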

Node Info

Internal name
df_create_session
Category
Data/DataFusion