Overview

Gateway performance depends on several factors:
  • Hardware resources: CPU core count, memory amount
  • Traffic profile: Concurrent request count, request size, SSE/streaming usage
  • Policy complexity: Number and weight of applied policies (e.g., CPU-intensive policies like Content Filtering)
  • Backend response time: Latency profile of the target service
Tuning approach — priority order:
  1. Automatic profile — Sufficient for most environments, no intervention needed
  2. Tier recommendations — Ready-made parameter sets based on resource size (this page)
  3. Manual tuning — Fine-tuning based on load test results
The following tables show recommended Gateway Worker settings for different hardware profiles. These values should be validated with pre-production load tests.

Tier 1 — 2 Core / 2 GB RAM

Suitable for low-traffic environments, development, and PoC scenarios.
Parameter                                     Recommended Value
tuneIoThreads                                 2
tuneWorkerThreads                             512
tuneWorkerMaxThreads                          1024
tuneRoutingConnectionPoolMinThreadCount       512
tuneRoutingConnectionPoolMaxThreadCount       2048
tuneRoutingConnectionPoolMaxConnectionTotal   2048
tuneElasticsearchClientIoThreadCount          8
tuneAsyncExecutorCorePoolSize                 512
tuneAsyncExecutorMaxPoolSize                  1024
tuneAsyncExecutorQueueCapacity                1000
Tier 1 Environment Variable Block Example:
tuneIoThreads=2
tuneWorkerThreads=512
tuneWorkerMaxThreads=1024
tuneRoutingConnectionPoolMinThreadCount=512
tuneRoutingConnectionPoolMaxThreadCount=2048
tuneRoutingConnectionPoolMaxConnectionTotal=2048
tuneElasticsearchClientIoThreadCount=8

Tier 2 — 4 Core / 4 GB RAM

Recommended for medium-scale production environments. A balanced starting point for most deployments.
Parameter                                     Recommended Value
tuneIoThreads                                 4
tuneWorkerThreads                             1024
tuneWorkerMaxThreads                          2048
tuneRoutingConnectionPoolMinThreadCount       1024
tuneRoutingConnectionPoolMaxThreadCount       4096
tuneRoutingConnectionPoolMaxConnectionTotal   4096
tuneElasticsearchClientIoThreadCount          16
tuneAsyncExecutorCorePoolSize                 1024
tuneAsyncExecutorMaxPoolSize                  2048
tuneAsyncExecutorQueueCapacity                1000
Tier 2 Environment Variable Block Example:
tuneIoThreads=4
tuneWorkerThreads=1024
tuneWorkerMaxThreads=2048
tuneRoutingConnectionPoolMinThreadCount=1024
tuneRoutingConnectionPoolMaxThreadCount=4096
tuneRoutingConnectionPoolMaxConnectionTotal=4096
tuneElasticsearchClientIoThreadCount=16

Tier 3 — 8 Core / 8 GB RAM

Recommended for high-traffic production environments.
Parameter                                     Recommended Value
tuneIoThreads                                 8
tuneWorkerThreads                             2048
tuneWorkerMaxThreads                          4096
tuneRoutingConnectionPoolMinThreadCount       1024
tuneRoutingConnectionPoolMaxThreadCount       8192
tuneRoutingConnectionPoolMaxConnectionTotal   8192
tuneElasticsearchClientIoThreadCount          32
tuneAsyncExecutorCorePoolSize                 2048
tuneAsyncExecutorMaxPoolSize                  4096
tuneAsyncExecutorQueueCapacity                2000
Tier 3 Environment Variable Block Example:
tuneIoThreads=8
tuneWorkerThreads=2048
tuneWorkerMaxThreads=4096
tuneRoutingConnectionPoolMinThreadCount=1024
tuneRoutingConnectionPoolMaxThreadCount=8192
tuneRoutingConnectionPoolMaxConnectionTotal=8192
tuneElasticsearchClientIoThreadCount=32
JVM Parameters: The automatic memory profile system is recommended for all tiers (no JAVA_OPTS needed). For profile details and manual GC configuration, see JVM Garbage Collector Tuning.

Thread Tuning

Worker Threads

Worker threads are the core thread pool that processes incoming HTTP requests.
Parameter              Default   Description
tuneWorkerThreads      1024      Minimum worker thread count; threads created at server startup
tuneWorkerMaxThreads   2048      Maximum worker thread count; upper limit the pool can grow to under load

CPU Cores   tuneWorkerThreads   tuneWorkerMaxThreads
1           512                 1024
2           1024                2048
4           2048                4096
8           4096                8192
General rule: tuneWorkerThreads ≈ CPU cores × 512, tuneWorkerMaxThreads ≈ CPU cores × 1024. These values are based on the thread management model of Undertow, the HTTP server used by the Gateway.
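The rule of thumb above can be sketched as a small shell helper at deploy time (a minimal sketch; `worker_sizing` is an illustrative helper name, not a Gateway command, and the 512/1024 multipliers come from the table above):

```shell
# Sketch of the sizing rule: tuneWorkerThreads = cores * 512,
# tuneWorkerMaxThreads = cores * 1024.
worker_sizing() {
  cores=$1
  echo "tuneWorkerThreads=$((cores * 512))"
  echo "tuneWorkerMaxThreads=$((cores * 1024))"
}

# Pass the core count of the target host, e.g. from nproc:
worker_sizing "$(nproc)"
```

Treat the output as a starting point only; validate it against your load tests as recommended above.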

IO Threads

IO threads are low-level threads that handle network I/O operations (socket read/write).
Parameter       Default          Description
tuneIoThreads   CPU core count   IO thread count
IO thread count is typically kept equal to the CPU core count. Increasing it provides no benefit in most scenarios; it may cause context switching overhead.
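To keep the two in lockstep, the value can be derived from the host at deploy time (a sketch; `nproc` is a Linux coreutils command, not a Gateway feature):

```shell
# Pin IO threads to the CPU core count, per the guidance above.
cores=$(nproc)
echo "tuneIoThreads=${cores}"
```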

Async Executor Thread Pool

Asynchronous operations such as RestApi Policy, Script Policy, logging, and traffic mirroring use this separate thread pool.
Parameter                        Default                                   Description
tuneAsyncExecutorCorePoolSize    Same as tuneWorkerThreads                 Minimum number of threads kept alive in the pool
tuneAsyncExecutorMaxPoolSize     Same as tuneWorkerMaxThreads              Maximum number of threads that can be created
tuneAsyncExecutorQueueCapacity   tuneMaxQueueSize if > 0, otherwise 1000   Maximum number of tasks that can wait in the queue when all threads are busy
Thread Pool Sizing Warning: The async executor thread pool is independent of the main worker thread pool. Ensure that the combined thread count (worker + async executor) does not exceed your system’s capacity; consider CPU core count and available memory when sizing the pools.
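A quick way to apply this check is to add up the worst-case counts before rollout (a sketch using the Tier 2 recommendations from this page as inputs):

```shell
# Worst-case thread total = worker max + async executor max.
# Inputs are the Tier 2 (4 Core / 4 GB) recommended values above.
worker_max=2048
async_max=2048
total=$((worker_max + async_max))
echo "worst-case threads: ${total}"
```

Compare the total against what your load tests show the host can sustain; each live thread also carries a native stack, so memory headroom matters as much as CPU.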

Connection Pool Tuning

Routing Connection Pool

Manages HTTP connections to backend services. These values are critical for high concurrent request volumes.
Parameter                                       Default   Description
tuneRoutingConnectionPoolMinThreadCount                   Minimum connection thread count
tuneRoutingConnectionPoolMaxThreadCount                   Maximum connection thread count
tuneRoutingConnectionPoolMaxConnectionPerHost   1024      Maximum connections to a single backend host
tuneRoutingConnectionPoolMaxConnectionTotal     2048      Total maximum connections across all backend hosts
When to increase:
  • When 503 errors or connection timeout logs are observed
  • When backend services respond slowly and connections are exhausted
  • When concurrent request count exceeds current pool limits
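The symptoms above can be spot-checked in a log file before raising pool limits (a sketch; the log path and both match patterns are illustrative assumptions — adjust them to your own deployment and log format):

```shell
# Count 503 responses and connection-timeout messages in a log file.
# LOG and the grep patterns are illustrative, not Gateway defaults.
LOG=${LOG:-/var/log/gateway/access.log}
if [ -f "$LOG" ]; then
  echo "503 responses:       $(grep -c ' 503 ' "$LOG")"
  echo "connection timeouts: $(grep -ci 'connection timeout' "$LOG")"
fi
```

If either count climbs under steady traffic, that supports raising the pool limits; if both stay at zero, look at backend latency first.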

Cache Connection Pool

Manages connections to cache services.
Parameter                                   Default   Description
tuneCacheConnectionPoolMaxConnectionTotal   2048      Total maximum connection count for the cache connection pool

API Call Connection Pool

Manages external API calls made by RestApi Policy and similar policies.
Parameter                                       Default   Description
tuneApiCallConnectionPoolMaxConnectionPerHost   256       Maximum connections per host
tuneApiCallConnectionPoolMaxConnectionTotal     4096      Total maximum connection count

Timeout Tuning

Parameter                  Default             Description
tuneReadTimeout            30000 ms (30 sec)   Client data read timeout; the connection is closed if the client stops sending data
tuneStreamingReadTimeout   0 ms (unlimited)    Client-side read timeout for SSE/LLM streaming connections
tuneNoRequestTimeout       60000 ms (60 sec)   Timeout for a connection that has been established but has sent no request

Relationship Between Timeout Parameters

Client connection → tuneNoRequestTimeout (waiting for request)
                  → Request received → tuneReadTimeout (reading data)
                                      → Streaming? → tuneStreamingReadTimeout
  • tuneNoRequestTimeout: Connection opened but no HTTP request sent — cleans up idle connections
  • tuneReadTimeout: Client stopped sending data during request processing — cleans up stuck requests
  • tuneStreamingReadTimeout: For long-lived connections such as SSE (Server-Sent Events) and LLM streaming, where the normal tuneReadTimeout is insufficient
SSE/LLM Streaming Scenarios: In streaming connections, clients may not send data for extended periods, so the normal tuneReadTimeout can close the connection prematurely. tuneStreamingReadTimeout assigns a dedicated timeout value to streaming connections. The default of 0 (unlimited) suits most scenarios; however, you may set an upper limit appropriate to your environment to prevent resource leaks.
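For an SSE/LLM workload where streams should never outlive, say, 10 minutes, the cap could look like this (a sketch; the 600000 ms figure is an illustrative choice for this example, not a recommendation from this page):

```shell
# Keep the default request read timeout, but cap streaming reads at
# 10 minutes to guard against leaked connections. Values illustrative.
tuneReadTimeout=30000
tuneStreamingReadTimeout=600000
```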

Benchmark Reference

For tier-based performance comparisons and detailed benchmark results, see Capacity Planning.
Benchmark results were measured under ideal conditions (fast backend, minimal network latency, simple policy chain). Always perform load tests based on your own traffic patterns for production environments.

JVM Garbage Collector Tuning

Automatic memory profiles, GC options, and heap configuration

Capacity Planning

Hardware sizing, RAM calculation, and benchmark results

Gateway Runtimes

UI-based configuration and parameter reference for gateway environments

Pod Thread Count Periodic Monitoring

Monitoring thread usage over time and alert mechanisms