Overview of Apinizer Gateway Metrics
GatewayMetricsService collects various metrics about API traffic, external connections, cache operations and JVM state. These metrics fall into the following categories:
- API Traffic Metrics: API requests, success/error rates, response times and sizes
- External Connection Metrics: Requests to external services, success/error rates, response times
- Cache Metrics: Cache operations, success/failure rates, response times
- JVM, System and Process Metrics: Memory usage, GC, thread state, CPU utilization, open files
Each metric is collected in two formats:
- General Metric: Total values without tags (e.g., total number of all API requests)
- Tagged Metric: Metrics enriched with tags for detailed analysis (e.g., requests by API ID)
Prometheus Metric Types
Gateway metrics are collected using four basic metric types. Counter and Gauge are native Prometheus types; Timer and DistributionSummary come from instrumentation libraries such as Micrometer. Each type suits a different kind of data and behavior, so the type of a metric determines which operations are meaningful when you query and analyze it.
Counter
A counter is a value that only increases. It starts at zero when the application starts and resets only when the application is restarted. Counter-type metrics are ideal for tracking continuously increasing values such as the total number of requests, the number of errors, or the number of completed transactions.
Available Operations:
- sum: The total value itself.
- rate: Calculates the rate of increase over time (such as how much it increases per second).
- increase: Calculates the total increase over a specific time interval.
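The rate and increase operations above can be illustrated with a minimal Python sketch. It is a simplified model, not how Prometheus itself is implemented (Prometheus also extrapolates to the window boundaries); the sample values and the 15-second scrape interval are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def counter_increase(samples: List[Sample]) -> float:
    """Total increase across the samples, treating any drop as a restart."""
    total = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A counter never decreases on its own, so a lower value means the
        # application restarted and the counter began again from zero.
        total += curr if curr < prev else curr - prev
    return total


def counter_rate(samples: List[Sample]) -> float:
    """Average per-second increase between the first and last sample."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    return counter_increase(samples) / elapsed if elapsed > 0 else 0.0


# Hypothetical scrapes of a request counter, including one restart at t=30.
scrapes = [(0.0, 100.0), (15.0, 130.0), (30.0, 10.0), (45.0, 40.0)]
print(counter_increase(scrapes))  # 30 + 10 + 30 = 70.0
print(counter_rate(scrapes))      # 70 / 45 requests per second
```

Handling the reset explicitly is what makes rate and increase safe to use across restarts, whereas reading the raw counter value is not.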
Gauge
A gauge represents the current value at the moment it is read. This value can increase, decrease, or remain constant. Gauge-type metrics are used to monitor a current state or level, such as memory usage, CPU usage, or the number of active threads.
Available Operations:
- sum: The sum of Gauge values grouped by tags.
- mean (average): The average of Gauge values grouped by tags.
- min/max: The minimum or maximum of values grouped by tags.
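As a rough illustration of grouping gauge values by tags, the following Python sketch aggregates a handful of readings. The `pod` and `state` label names and all values are hypothetical and chosen only for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

Reading = Tuple[Dict[str, str], float]  # (labels, gauge value)

# Hypothetical thread-count readings from two worker pods.
readings: List[Reading] = [
    ({"pod": "worker-1", "state": "runnable"}, 12.0),
    ({"pod": "worker-1", "state": "waiting"}, 30.0),
    ({"pod": "worker-2", "state": "runnable"}, 8.0),
    ({"pod": "worker-2", "state": "waiting"}, 26.0),
]


def aggregate_by(label: str, data: List[Reading]) -> Dict[str, Dict[str, float]]:
    """Group gauge values by one label and compute sum, mean, min and max."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for labels, value in data:
        groups[labels[label]].append(value)
    return {
        key: {"sum": sum(vals), "mean": mean(vals), "min": min(vals), "max": max(vals)}
        for key, vals in groups.items()
    }


by_state = aggregate_by("state", readings)
print(by_state["runnable"])  # {'sum': 20.0, 'mean': 10.0, 'min': 8.0, 'max': 12.0}
```

Unlike a counter, a gauge can be aggregated directly because each reading is already a point-in-time value.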
Timer
Timer measures how long an operation takes (usually in milliseconds or seconds). It is not a native Prometheus type; libraries such as Micrometer create it by combining a DistributionSummary of the observed durations with a Counter of how many operations were timed. Timer metrics provide the average duration, the maximum duration, and percentiles.
Available Operations:
- sum: Returns the total duration.
- count: Returns the total number of times the operation was performed.
- mean (average): Calculates the average duration of the operation.
- max: Returns the longest observed duration.
- histogram_quantile: Calculates percentiles.
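The mean of a Timer over a time window is usually derived from its sum and count rather than read directly. A minimal Python sketch, assuming two hypothetical scrapes of the same timer:

```python
from typing import Optional


def mean_duration(sum_start: float, count_start: float,
                  sum_end: float, count_end: float) -> Optional[float]:
    """Average duration of the operations observed between two scrapes.

    The result is in whatever unit the timer's sum is reported in.
    Returns None when no operation completed in the window.
    """
    observed = count_end - count_start
    if observed <= 0:
        return None
    return (sum_end - sum_start) / observed


# Hypothetical values: total duration grew from 1200 to 1800 while the
# number of timed operations grew from 40 to 60.
print(mean_duration(1200.0, 40.0, 1800.0, 60.0))  # 600 / 20 = 30.0
```

Dividing the deltas instead of the raw totals gives the average for the chosen window only, not the average since the application started.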
DistributionSummary
DistributionSummary is used to track the distribution of a value. It works similarly to Timer, but instead of measuring time, it measures arbitrary numerical values such as request size or file size. This metric also provides statistical information such as average, maximum, and percentiles.
Available Operations:
- sum: Returns the total value.
- count: Returns the number of observed values.
- mean (average): Calculates the average of the values.
- max: Returns the largest observed value.
- histogram_quantile: Calculates percentiles.
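To show what a percentile calculation over bucketed data involves, here is a simplified Python sketch of linear interpolation within cumulative buckets, in the spirit of PromQL's histogram_quantile. It assumes the metric is published with histogram buckets, which this document does not confirm for the gateway; the bucket bounds and counts are hypothetical.

```python
import math
from typing import List, Tuple

Bucket = Tuple[float, float]  # (upper bound, cumulative count up to that bound)


def histogram_quantile(q: float, buckets: List[Bucket]) -> float:
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(buckets)
    total = ordered[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, cumulative in ordered:
        if cumulative >= rank:
            if math.isinf(upper_bound):
                # The open-ended bucket has no upper edge to interpolate
                # towards, so fall back to the highest finite bound.
                return lower_bound
            in_bucket = cumulative - lower_count
            if in_bucket == 0:
                return upper_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, cumulative
    return lower_bound


# Hypothetical response-size buckets (bytes).
size_buckets = [(100.0, 50.0), (250.0, 80.0), (500.0, 95.0), (math.inf, 100.0)]
print(histogram_quantile(0.5, size_buckets))  # 100.0
print(histogram_quantile(0.9, size_buckets))  # about 416.67
```

Because the result is interpolated, its accuracy depends on how finely the bucket bounds are chosen around the values of interest.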
API Traffic Metrics
These metrics track API requests passing through Apinizer and measure their performance. Total, successful, failed and blocked requests and cache hits are tracked as counts, while request processing times and data sizes are measured for performance analysis. Each metric is also provided in a tagged variant with api_id and api_name tags for detailed per-API analysis.
Metric Name | Description | Type | Tags |
---|---|---|---|
apinizer_api_traffic_total_count_total | Total API traffic requests | Counter | - |
apinizer_api_traffic_success_count_total | Successful API requests | Counter | - |
apinizer_api_traffic_error_count_total | Failed API requests | Counter | - |
apinizer_api_traffic_blocked_count_total | Blocked API requests | Counter | - |
apinizer_api_traffic_request_pipeline_time | API request pipeline time (ms) | Timer | - |
apinizer_api_traffic_routing_time | API routing time (ms) | Timer | - |
apinizer_api_traffic_response_pipeline_time | API response pipeline time (ms) | Timer | - |
apinizer_api_traffic_total_time | API total time (ms) | Timer | - |
apinizer_api_traffic_request_size | API request size (bytes) | DistributionSummary | - |
apinizer_api_traffic_response_size | API response size (bytes) | DistributionSummary | - |
apinizer_api_traffic_cache_hits_count | API cache hit count | Counter | - |
apinizer_api_traffic_total_count_tagged | Total API traffic requests | Counter | api_id, api_name |
apinizer_api_traffic_success_count_tagged | Successful API requests | Counter | api_id, api_name |
apinizer_api_traffic_error_count_tagged | Failed API requests | Counter | api_id, api_name |
apinizer_api_traffic_blocked_count_tagged | Blocked API requests | Counter | api_id, api_name |
apinizer_api_traffic_request_pipeline_time_tagged | API request pipeline time (ms) | Timer | api_id, api_name |
apinizer_api_traffic_routing_time_tagged | API routing time (ms) | Timer | api_id, api_name |
apinizer_api_traffic_response_pipeline_time_tagged | API response pipeline time (ms) | Timer | api_id, api_name |
apinizer_api_traffic_total_time_tagged | API total time (ms) | Timer | api_id, api_name |
apinizer_api_traffic_request_size_tagged | API request size (bytes) | DistributionSummary | api_id, api_name |
apinizer_api_traffic_response_size_tagged | API response size (bytes) | DistributionSummary | api_id, api_name |
apinizer_api_traffic_cache_hits_count_tagged | API cache hit count | Counter | api_id, api_name |
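As an example of using the tagged counters from the table above, the following Python sketch computes an error ratio per API from Prometheus text-format lines. The scrape output is hypothetical: the label values are invented, and the exact series names exposed by a real gateway (for example, any added suffix) may differ from the table names used here.

```python
import re
from collections import defaultdict
from typing import Dict

# Matches lines such as: metric_name{label="value",...} 12.0
LINE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')

TOTAL = "apinizer_api_traffic_total_count_tagged"
ERROR = "apinizer_api_traffic_error_count_tagged"

SAMPLE = """\
# Hypothetical scrape output; real label values and name suffixes may differ.
apinizer_api_traffic_total_count_tagged{api_id="a1",api_name="orders"} 200.0
apinizer_api_traffic_error_count_tagged{api_id="a1",api_name="orders"} 10.0
apinizer_api_traffic_total_count_tagged{api_id="b2",api_name="billing"} 50.0
apinizer_api_traffic_error_count_tagged{api_id="b2",api_name="billing"} 0.0
"""


def error_ratio_by_api(text: str) -> Dict[str, float]:
    """Return failed requests divided by total requests for each api_name."""
    totals: Dict[str, float] = defaultdict(float)
    errors: Dict[str, float] = defaultdict(float)
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # comments, blank lines and untagged series
        api = dict(LABEL.findall(match["labels"])).get("api_name")
        if api is None:
            continue
        value = float(match["value"])
        if match["name"] == TOTAL:
            totals[api] += value
        elif match["name"] == ERROR:
            errors[api] += value
    return {api: errors[api] / total for api, total in totals.items() if total > 0}


ratios = error_ratio_by_api(SAMPLE)
print(ratios)  # {'orders': 0.05, 'billing': 0.0}
```

Since counters are cumulative, this ratio covers the whole period since the last restart; for a recent time window, apply the same division to the increase of each counter over that window.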
External Connection Metrics
These metrics monitor external requests made through Apinizer. The performance of external services is analyzed by measuring the total number of requests, the number of errors and the response time. Each metric is also provided in a tagged variant with a url tag for detailed per-URL analysis.
Metric Name | Description | Type | Tags |
---|---|---|---|
apinizer_external_requests_total_count | Total external request count | Counter | - |
apinizer_external_errors_total_count | Total external error count | Counter | - |
apinizer_external_response_time | External response time (ms) | Timer | - |
apinizer_external_requests_total_count_tagged | Total external request count | Counter | url |
apinizer_external_errors_total_count_tagged | Total external error count | Counter | url |
apinizer_external_response_time_tagged | External response time (ms) | Timer | url |
Cache Metrics
These metrics monitor how the worker (gateway) pod interacts with the cache. The total number of cache requests, the number of errors and the response time show how well the worker pod performs its cache operations.
Metric Name | Description | Type | Tags |
---|---|---|---|
apinizer_cache_requests_total_count | Total cache request count | Counter | - |
apinizer_cache_errors_total_count | Total cache error count | Counter | - |
apinizer_cache_response_time | Cache operation response time (ms) | Timer | - |
JVM Metrics
These metrics monitor JVM performance and resource utilization in the worker (gateway) pod. They help you verify that the system is running efficiently by providing detailed information about memory, GC (Garbage Collection) activity and thread status.
Metric Name | Description | Type | Tags |
---|---|---|---|
jvm_buffer_count_buffers | Number of buffers used by JVM | Gauge | - |
jvm_buffer_memory_used_bytes | Total used buffer memory (bytes) | Gauge | - |
jvm_buffer_total_capacity_bytes | Total buffer capacity (bytes) | Gauge | - |
jvm_gc_live_data_size_bytes | Size of surviving data after GC (bytes) | Gauge | - |
jvm_gc_max_data_size_bytes | Maximum data size for GC (bytes) | Gauge | - |
jvm_gc_memory_allocated_bytes_total | Memory allocated in the young generation between GCs (bytes) | Counter | - |
jvm_gc_memory_promoted_bytes_total | Memory promoted to the old generation by GC (bytes) | Counter | - |
jvm_gc_pause_seconds_count | Total number of GC pauses | Counter | - |
jvm_gc_pause_seconds_max | Longest GC pause (seconds) | Gauge | - |
jvm_gc_pause_seconds_sum | Total GC pause time (seconds) | Counter | - |
jvm_memory_committed_bytes | Memory committed for use by the JVM (bytes) | Gauge | - |
jvm_memory_max_bytes | Maximum memory available to JVM (bytes) | Gauge | - |
jvm_memory_used_bytes | Memory used by JVM (bytes) | Gauge | - |
jvm_threads_daemon_threads | Number of live daemon threads | Gauge | - |
jvm_threads_live_threads | Number of live threads (daemon and non-daemon) | Gauge | - |
jvm_threads_peak_threads | Highest number of threads reached | Gauge | - |
jvm_threads_started_threads_total | Total number of threads started | Counter | - |
jvm_threads_states_threads | Number of threads in different states | Gauge | state |
System Metrics
These metrics monitor the CPU and system load of the worker (gateway) pod. They provide information about the CPU core count, the utilization rate and the load average.
Metric Name | Description | Type | Tags |
---|---|---|---|
system_cpu_count | Total CPU core count | Gauge | - |
system_cpu_usage | System-wide CPU usage rate | Gauge | - |
system_load_average_1m | System load average for the last 1 minute | Gauge | - |
Process Metrics
These metrics track the resource utilization of the JVM process running in the worker (gateway) pod. They provide information about CPU utilization, the number of open files and the maximum file limit.
Metric Name | Description | Type | Tags |
---|---|---|---|
process_cpu_usage | JVM's CPU usage rate | Gauge | - |
process_files_max_files | Maximum number of files that can be opened | Gauge | - |
process_files_open_files | Number of open files | Gauge | - |
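One way to use the two file gauges together is to express open files as a fraction of the limit. The Python sketch below shows that calculation; the 0.8 warning threshold and the sample values are assumptions for illustration, not recommendations from Apinizer.

```python
from typing import Optional


def file_descriptor_utilization(open_files: float, max_files: float) -> Optional[float]:
    """Fraction of the file limit in use, or None if the limit is unknown."""
    if max_files <= 0:
        return None
    return open_files / max_files


def is_near_limit(open_files: float, max_files: float, threshold: float = 0.8) -> bool:
    """True when utilization reaches the (hypothetical) warning threshold."""
    utilization = file_descriptor_utilization(open_files, max_files)
    return utilization is not None and utilization >= threshold


# Hypothetical readings of process_files_open_files and process_files_max_files.
print(file_descriptor_utilization(512.0, 1024.0))  # 0.5
print(is_near_limit(900.0, 1024.0))                # True
```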