Canary Release and Traffic Mirroring
These features are available only for HTTP-type API Proxies. They do not apply to gRPC or WebSocket API Proxies.
Canary Release
Canary Release is a deployment strategy used to minimize risk by deploying a new backend version to production with a small percentage of traffic. This allows you to test the performance and stability of the new version on real traffic.
How Does Canary Release Work?
The canary release mechanism works as follows:
- Traffic Percentage: A certain percentage of traffic (e.g., 10%) is routed to the canary backend
- Counter-Based Routing: A per-request counter makes the routing decision deterministic (the decision is derived from counter × percentage / 100)
- Health Check Integration: When the canary backend becomes unhealthy, a cooldown period is automatically started
- Cooldown Period: During the cooldown period, no traffic is sent to canary; all traffic is routed to the stable (PRIMARY) backend
Canary Release Parameters
| Parameter | Description |
|---|---|
| Traffic Percentage | Percentage of traffic to be routed to the canary backend (0-100). |
| Cooldown Period (Seconds) | How long canary traffic is suspended after the canary backend is marked unhealthy. Default: 300 seconds (5 minutes). |
Canary Routing Decision Flow
The canary routing decision mechanism works as follows:
- When a request arrives, it is checked whether canary release is enabled:
  - If canary release is not enabled: Request is routed directly to PRIMARY backend
- If canary release is enabled, cooldown period is checked:
  - If cooldown period is active: All traffic is routed to PRIMARY backend (no traffic is sent to canary)
  - If cooldown period is not active: Traffic distribution is performed
- Traffic distribution (Bresenham-style distribution algorithm):
  - Counter is incremented for each request
  - Distribution threshold is calculated: `floor(counter * percentage / 100)`
  - If threshold changes from previous request: Request is routed to CANARY backend
  - If threshold remains the same: Request is routed to PRIMARY backend
  - This ensures even distribution across requests (e.g., for 50%: 1st→PRIMARY, 2nd→CANARY, 3rd→PRIMARY, 4th→CANARY, ...)
- After routing to canary backend:
  - If request is successful: Operation is completed
  - If request fails: Automatic failover to PRIMARY backend is performed
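The order of checks above can be sketched in a few lines of Java. This is only an illustration: `CanaryFlow`, `choose`, and `call` are hypothetical names, and treating a thrown exception as a "failed request" is an assumption, since the text does not say how failure is detected.

```java
import java.util.function.Supplier;

// Hypothetical sketch; only the order of the checks is taken from the text.
final class CanaryFlow {
    enum Target { PRIMARY, CANARY }

    /** Disabled canary or active cooldown always means PRIMARY; otherwise the distributor decides. */
    static Target choose(boolean canaryEnabled, boolean cooldownActive, boolean selectedByDistribution) {
        if (!canaryEnabled) {
            return Target.PRIMARY;
        }
        if (cooldownActive) {
            return Target.PRIMARY;
        }
        return selectedByDistribution ? Target.CANARY : Target.PRIMARY;
    }

    /** Calls the canary when chosen and fails over to PRIMARY if that call throws (assumed failure signal). */
    static <T> T call(Target target, Supplier<T> canaryCall, Supplier<T> primaryCall) {
        if (target == Target.CANARY) {
            try {
                return canaryCall.get();
            } catch (RuntimeException e) {
                return primaryCall.get(); // automatic failover to PRIMARY
            }
        }
        return primaryCall.get();
    }
}
```

Keeping the decision (`choose`) separate from the invocation (`call`) makes each rule easy to check in isolation.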
Bresenham-Style Distribution Algorithm
Canary release uses Bresenham-style distribution to evenly distribute traffic across requests:
How Does Bresenham Distribution Work?
Unlike simple modulo-based approaches that create traffic bursts (e.g., first 50 requests to canary, next 50 to primary), Bresenham distribution ensures even spacing:
Example for 50% traffic:
- 1st request → PRIMARY
- 2nd request → CANARY
- 3rd request → PRIMARY
- 4th request → CANARY
- ... (alternating pattern continues)
Example for 33% traffic:
- 1st, 2nd, 3rd → PRIMARY (floor(3 × 33 / 100) is still 0)
- 4th → CANARY
- 5th, 6th → PRIMARY
- 7th → CANARY
- ... (evenly spaced pattern continues)
Algorithm Details
- Counter Increment: Counter is incremented for each request (using thread-safe atomic counter)
- Threshold Calculation:
  - Current threshold = `floor(counter × percentage / 100)`
  - Previous threshold = `floor((counter-1) × percentage / 100)`
- Decision Making: If current threshold ≠ previous threshold, route to canary
- Overflow Prevention: When counter reaches 10,000, it is automatically reset to 0
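A minimal Java sketch of the threshold comparison described in these steps follows. `BresenhamSelector` and its methods are hypothetical names, not the product's actual classes, and the counter reset is left out to keep the example short.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the threshold comparison; not the product's real class.
final class BresenhamSelector {
    private final AtomicLong counter = new AtomicLong();
    private final int percentage;

    BresenhamSelector(int percentage) {
        if (percentage < 0 || percentage > 100) {
            throw new IllegalArgumentException("percentage must be between 0 and 100");
        }
        this.percentage = percentage;
    }

    /** Pure decision for one counter value (counter >= 1). */
    static boolean isSelected(long counter, int percentage) {
        // Integer division equals floor() here because both operands are non-negative.
        long current = counter * percentage / 100;
        long previous = (counter - 1) * percentage / 100;
        return current != previous;
    }

    /** Increments the shared counter atomically and decides for this request. */
    boolean next() {
        return isSelected(counter.incrementAndGet(), percentage);
    }
}
```

With `new BresenhamSelector(50)`, successive `next()` calls return `false, true, false, true`, which matches the alternating 50% pattern described earlier.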
sequenceDiagram
participant R as Request
participant C as Counter
participant D as Decision Engine
participant BE as Backend
R->>C: Increment Counter
C->>C: counter++ (thread-safe increment)
C->>C: currentThreshold = (counter * percentage) / 100
C->>C: previousThreshold = ((counter-1) * percentage) / 100
C->>D: currentThreshold, previousThreshold
alt currentThreshold != previousThreshold
D->>BE: Route to Canary
Note over D: Request selected for canary<br/>(threshold changed)
else currentThreshold == previousThreshold
D->>BE: Route to PRIMARY
Note over D: Request not selected<br/>(threshold unchanged)
end
alt counter >= 10000
C->>C: Reset to 0 (atomic reset operation)
end
Why Bresenham? This algorithm is named after Bresenham's line algorithm used in computer graphics. It ensures even distribution without clumping, providing a smooth traffic distribution pattern that's ideal for testing and monitoring.
Distribution Quality
The Bresenham distribution provides:
- Even Spacing: Requests are evenly distributed, not clustered
- Deterministic: Same counter value always produces same decision
- Thread-Safe: Works correctly in multi-threaded environments
- Accurate: Over any 100-request window, exactly the specified percentage routes to canary
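The accuracy claim can be checked directly. The sketch below assumes a whole-number percentage and a window that does not straddle a counter reset; `WindowCheck` is a hypothetical helper written for this illustration.

```java
// Hypothetical helper that counts canary selections in 100-request windows.
final class WindowCheck {
    /** Same threshold comparison as described above: selected when the floor value changes. */
    static boolean isSelected(long counter, int percentage) {
        return counter * percentage / 100 != (counter - 1) * percentage / 100;
    }

    /** Counts selected requests for counter values start+1 .. start+100. */
    static int selectedInWindow(long start, int percentage) {
        int selected = 0;
        for (long c = start + 1; c <= start + 100; c++) {
            if (isSelected(c, percentage)) {
                selected++;
            }
        }
        return selected;
    }

    /** True when every window starting at 0 .. 9,900 selects exactly `percentage` requests. */
    static boolean allWindowsMatch(int percentage) {
        for (long start = 0; start <= 9_900; start++) {
            if (selectedInWindow(start, percentage) != percentage) {
                return false;
            }
        }
        return true;
    }
}
```

The result follows from the arithmetic: shifting the counter by 100 raises `counter × percentage / 100` by exactly `percentage`, and each request can raise the floor value by at most 1.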
Automatic Failback with Health Check
The health status of the canary backend is continuously monitored by the active health check mechanism:
- Unhealthy Detection: When the number of consecutive health check failures reaches the fail threshold, the canary backend is marked as unhealthy
- Cooldown Start: When the canary backend is marked unhealthy, the cooldown period starts automatically
- Traffic Stop: During the cooldown period, no traffic is sent to canary
- Automatic Recovery: After the cooldown period ends, canary becomes active again once the health check succeeds again
sequenceDiagram
participant HC as Health Check Service
participant CB as Canary Backend
participant CM as Canary Manager
participant C as Cache (Cooldown)
loop Every Interval Seconds
HC->>CB: GET /health
CB-->>HC: Response
alt Unhealthy Detected
HC->>HC: Consecutive Failures >= Fail Threshold
HC->>CM: triggerCanaryCooldown()
CM->>C: Set Cooldown Key (TTL)
Note over CM,C: Cooldown Period Started<br/>(Traffic stopped)
end
alt Cooldown Period Passed
HC->>CB: GET /health
CB-->>HC: 200 OK
HC->>HC: Consecutive Successes >= Pass Threshold
Note over CM: Cooldown automatically ends<br/>with TTL
end
end
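The cooldown behaviour can be sketched as follows. This is an assumption-laden illustration: an in-memory expiry timestamp stands in for the cache key with a TTL shown in the diagram, and `CanaryCooldown` with its methods is a hypothetical name.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: an expiry timestamp stands in for the cache key with a TTL.
final class CanaryCooldown {
    private final int failThreshold;
    private final Duration cooldown;
    private int consecutiveFailures;
    private Instant cooldownUntil = Instant.MIN;

    CanaryCooldown(int failThreshold, Duration cooldown) {
        this.failThreshold = failThreshold;
        this.cooldown = cooldown;
    }

    /** Records one health check result; enough consecutive failures start (or extend) the cooldown. */
    synchronized void recordHealthCheck(boolean healthy, Instant now) {
        if (healthy) {
            consecutiveFailures = 0;
            return;
        }
        consecutiveFailures++;
        if (consecutiveFailures >= failThreshold) {
            cooldownUntil = now.plus(cooldown);
        }
    }

    /** The cooldown ends by itself once its expiry time has passed, like a TTL. */
    synchronized boolean isCooldownActive(Instant now) {
        return now.isBefore(cooldownUntil);
    }
}
```

Passing `now` in explicitly keeps the sketch deterministic in tests; a real service would more likely read a clock internally.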
Traffic Mirroring
Traffic Mirroring sends a copy of live traffic to a test environment. Mirror requests run asynchronously and do not affect the main request.
How Does Traffic Mirroring Work?
The traffic mirroring mechanism works as follows:
- Main Request: Sent normally to PRIMARY backend
- Mirror Request: A copy of a certain percentage of traffic is sent asynchronously to MIRROR backend
- Asynchronous Processing: Mirror requests do not block the main request
- Result Irrelevance: Success/failure of mirror requests does not affect the main request
Traffic Mirroring Parameters
| Parameter | Description |
|---|---|
| Traffic Mirror Enabled | Whether traffic mirroring is enabled. |
| Mirror Percentage | Percentage of traffic to be mirrored (0-100). Determined with counter-based routing. |
Traffic Mirroring Architecture
graph TB
subgraph "Primary Request Flow"
A[Client Request] --> B[API Proxy]
B --> C[PRIMARY Backend]
C --> D[Response]
D --> E[Client]
end
subgraph "Mirror Request Flow (Async)"
B --> F{Mirror Percentage?}
F -->|Selected| G[Async Processing]
G --> H[MIRROR Backend]
H --> I[Log Results]
I --> J[Elasticsearch]
end
style G fill:#e1f5ff
style H fill:#e1f5ff
style I fill:#e1f5ff
Asynchronous Mirror Processing
Mirror requests are processed asynchronously:
- Async Task Creation: An asynchronous task is created for each mirror address
- Async Execution: Mirror requests run on async executor
- Result Aggregation: All mirror results are collected and added to message context
- Logging: Mirror results are logged to Elasticsearch (by LogHandler)
sequenceDiagram
participant RH as RoutingHandler
participant TMH as TrafficMirrorHandler
participant EX as Async Executor
participant MB as Mirror Backend
participant MC as Message Context
participant ES as Elasticsearch
RH->>TMH: executeMirrors(mirrorAddresses)
TMH->>TMH: Filter by Mirror Percentage
loop Each Mirror Address
TMH->>EX: Execute Async (mirrorRequest)
EX->>MB: Send Mirror Request
MB-->>EX: Response
EX-->>TMH: Async Result
end
TMH->>TMH: Aggregate Results
TMH->>MC: Add Mirror Results
TMH-->>RH: Async Processing Complete
Note over RH: Primary request continues<br/>(does not wait for mirror)
RH->>ES: Log Mirror Results (async)
If mirror requests fail, the main request is not affected. Mirror results are only used for logging and monitoring purposes.
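A minimal sketch of this fire-and-collect pattern using `CompletableFuture` is shown below. `MirrorSketch`, `MirrorResult`, and the `send` callback are hypothetical names, and percentage filtering is omitted; the real handler classes named in the diagram are not reproduced here.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.stream.Collectors;

// Hypothetical sketch of asynchronous mirroring; not the product's handler code.
final class MirrorSketch {
    /** Outcome of one mirror call; error stays null when the call succeeded. */
    static final class MirrorResult {
        final String address;
        final String error;

        MirrorResult(String address, String error) {
            this.address = address;
            this.error = error;
        }
    }

    /** Starts one async task per mirror address and returns a future holding all results. */
    static CompletableFuture<List<MirrorResult>> mirror(
            List<String> mirrorAddresses, Consumer<String> send, Executor executor) {
        List<CompletableFuture<MirrorResult>> tasks = mirrorAddresses.stream()
                .map(address -> CompletableFuture
                        .supplyAsync(() -> {
                            send.accept(address); // hypothetical transport call
                            return new MirrorResult(address, null);
                        }, executor)
                        // A failed mirror call becomes a result to log, never a thrown exception.
                        .exceptionally(ex -> new MirrorResult(address, String.valueOf(ex))))
                .collect(Collectors.toList());
        return CompletableFuture.allOf(tasks.toArray(new CompletableFuture[0]))
                .thenApply(done -> tasks.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }
}
```

The caller handling the primary request would not call `join()` on the returned future. It would attach a logging callback (for example with `thenAccept`) so the primary response is never delayed by a mirror.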
Percentage-Based Router
Both Canary Release and Traffic Mirroring use the same PercentageBasedRouter utility class for traffic distribution.
Counter Management
- Local Counter: Independent counter is maintained in each pod (using thread-safe map)
- Thread-Safety: Thread-safe operations are performed using atomic counter
- Overflow Prevention: When counter reaches 10,000, it is automatically reset to 0 using atomic reset operation
Counter Reset Mechanism
The counter reset mechanism prevents Long overflow:
- When counter reaches 10,000, it is reset to 0 using atomic operation
- Atomic reset operation:
  - If counter value is 10,000, a reset to 0 is attempted using `compareAndSet`
  - If reset operation succeeds: Counter becomes 0 and distribution continues
  - If reset operation fails (another thread has already reset): Distribution continues with current value
- Why 10,000?: This threshold provides:
  - Sufficient range for accurate percentage distribution
  - Protection against Long.MAX_VALUE overflow
  - Minimal performance overhead from reset operations
This mechanism ensures Long overflow never occurs while maintaining accurate traffic distribution.
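The reset can be sketched with `AtomicLong.compareAndSet`. `ResettingCounter` is a hypothetical name, and comparing against the value the thread just observed (rather than a fixed 10,000) is an assumption made so that the reset still happens if another thread increments first.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a counter that resets itself without locking.
final class ResettingCounter {
    static final long RESET_AT = 10_000;
    private final AtomicLong counter = new AtomicLong();

    /** Returns the next counter value and resets to 0 once the threshold is reached. */
    long next() {
        long value = counter.incrementAndGet();
        if (value >= RESET_AT) {
            // Succeeds only if no other thread changed the counter in the meantime.
            // On failure the caller simply continues with the value it already has.
            counter.compareAndSet(value, 0);
        }
        return value;
    }
}
```

Because 10,000 is a multiple of 100, restarting from 0 resumes the same distribution pattern for whole-number percentages.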
Lifecycle Management
When an API Proxy is deployed, updated, or undeployed, canary and mirror counters are automatically cleaned up:
- Deploy/Update: Counters are reset (fresh start)
- Undeploy: All counters are cleaned up
Resetting counters ensures a consistent start with the new configuration. For example, when traffic percentage changes, counters need to be reset.
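One way to picture this lifecycle is a per-pod registry keyed by proxy and feature. The sketch below is an assumption, not the actual `PercentageBasedRouter` code: `CounterRegistry`, its method names, and the key format are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-pod registry of counters; key format is the caller's choice.
final class CounterRegistry {
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    /** Increments the counter for a key such as "orders-proxy:canary", creating it on first use. */
    long increment(String key) {
        return counters.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }

    /** Deploy or update: start again from zero so a changed percentage applies cleanly. */
    void onDeployOrUpdate(String key) {
        counters.put(key, new AtomicLong());
    }

    /** Undeploy: drop the counter entirely. */
    void onUndeploy(String key) {
        counters.remove(key);
    }

    boolean hasCounter(String key) {
        return counters.containsKey(key);
    }
}
```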
Canary vs Mirror Comparison
| Feature | Canary Release | Traffic Mirroring |
|---|---|---|
| Purpose | Test new version with small traffic | Copy live traffic to test environment |
| Backend Type | CANARY | MIRROR |
| Traffic Impact | Selected traffic goes to canary | Main traffic to PRIMARY, copy to MIRROR |
| Failure Impact | If canary fails, failover to PRIMARY | If mirror fails, main request is not affected |
| Health Check | If canary unhealthy, cooldown starts | Health check not performed for mirror |
| Synchronization | Synchronous (main request waits for canary) | Asynchronous (main request does not wait for mirror) |
Related Topics
- Retry and Failover - Health check and circuit breaker integration
- Load Balancing - Load distribution among backend addresses