GatewayMetricsService collects metrics about API traffic, external connections, cache operations, and JVM state. These metrics fall into the following categories:

  1. API Traffic Metrics: API requests, success/error rates, response times and sizes
  2. External Connection Metrics: Requests to external services, success/error rates, response times
  3. Cache Metrics: Cache operations, success/failure rates, response times
  4. JVM Metrics: Memory usage, GC, thread state, processor utilization

Each metric is collected in two formats:

  • General Metric: Total values without tags (e.g., the total number of all API requests)
  • Tagged Metric: The same values broken down by tags for detailed analysis (e.g., the number of requests per API ID)
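To make the relationship between the two formats concrete, the sketch below parses a few lines of Prometheus text exposition and sums the tagged series of one metric so the result can be compared with the general total. The sample values and the api_id/api_name values are invented for illustration, and the idea that the tagged series always add up exactly to the general counter on a live gateway is an assumption.

```python
import re

# Hypothetical scrape output; the numbers and label values are invented.
SAMPLE = """\
apinizer_api_traffic_total_count_total 150.0
apinizer_api_traffic_total_count_tagged{api_id="a1",api_name="orders"} 100.0
apinizer_api_traffic_total_count_tagged{api_id="b2",api_name="users"} 50.0
"""

LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse(text):
    """Return a list of (name, labels, value) tuples, skipping comments and blanks."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match:
            labels = dict(LABEL.findall(match.group("labels") or ""))
            samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def total(samples, name):
    """Sum the values of every series that has the given metric name."""
    return sum(value for n, _, value in samples if n == name)
```

Calling total(parse(SAMPLE), ...) once with the general name and once with the tagged name gives two numbers that can be compared directly.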

Prometheus Metric Types

Gateway metrics are collected using four basic metric types and exposed in Prometheus format. Each type represents a different kind of data and behavior, so the type of a metric determines which queries and aggregations are meaningful for it.

Counter

A counter is a value that only increases. It starts at zero when the application starts and resets only when the application restarts. Counter-type metrics are ideal for tracking continuously increasing values such as the total number of requests, the number of errors, or the number of completed transactions.

Available Operations:

  • sum: The total value itself.
  • rate: Calculates the rate of increase over time (such as how much it increases per second).
  • increase: Calculates the total increase over a specific time interval.
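The rate and increase operations can be sketched over raw counter samples as follows. This is a simplified illustration, not how Prometheus itself computes them: the real functions also extrapolate to the edges of the query window. The only behavior modeled here is that a drop in value is treated as a restart from zero.

```python
def increase(samples):
    """Total increase across (timestamp_seconds, value) samples, tolerating resets."""
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero, so the whole new value counts.
        total += (cur - prev) if cur >= prev else cur
    return total


def rate(samples):
    """Average per-second increase between the first and the last sample."""
    elapsed = samples[-1][0] - samples[0][0]
    return increase(samples) / elapsed if elapsed > 0 else 0.0
```

For example, the samples [(0, 100), (30, 160), (60, 220)] give an increase of 120 and a rate of 2 per second. If the last sample were (60, 20) instead, the drop would be read as a restart and the increase would be 80.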

Gauge

A gauge represents the current value of something at the moment it is sampled. This value can increase, decrease, or remain constant. Gauge-type metrics are used to monitor a state or level, such as current memory usage, current CPU usage, or the number of active threads.

Available Operations:

  • sum: The sum of Gauge values grouped by tags.
  • mean (average): The average of Gauge values grouped by tags.
  • min/max: The minimum or maximum of values grouped by tags.
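Grouping gauge values by a tag can be sketched as below. The input is a list of (labels, value) pairs; the "pod" label and the sample numbers used in the usage note are hypothetical and only serve to produce more than one series per group.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by(samples, tag):
    """Group (labels, value) gauge samples by one tag and summarize each group."""
    groups = defaultdict(list)
    for labels, value in samples:
        # Series without the tag are collected under the empty string.
        groups[labels.get(tag, "")].append(value)
    return {
        key: {"sum": sum(vals), "mean": mean(vals), "min": min(vals), "max": max(vals)}
        for key, vals in groups.items()
    }
```

With two hypothetical pods reporting 10 and 14 threads in the "runnable" state, aggregate_by(samples, "state") yields a sum of 24, a mean of 12, a min of 10, and a max of 14 for that state.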

Timer

A Timer measures how long an operation takes (usually in milliseconds or seconds). It is not a native Prometheus type; libraries such as Micrometer provide it by recording a count, a total duration, and a maximum for the operation, optionally with histogram buckets. From these values you can derive the average duration, the maximum duration, and percentiles.

Available Operations:

  • sum: Returns the total duration.
  • count: Returns the total number of times the operation was performed.
  • mean (average): Calculates the average duration of the operation.
  • max: Returns the longest observed duration.
  • histogram_quantile: Calculates percentiles.
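The mean operation follows directly from sum and count. The sketch below computes the average duration of the operations recorded between two scrapes, assuming the timer's total duration and observation count are available as two cumulative values (Micrometer typically exposes them as series suffixed _sum and _count, but the exact series names on a given gateway are an assumption).

```python
def mean_duration(sum_start, count_start, sum_end, count_end):
    """Average duration of the operations recorded between two scrapes."""
    observed = count_end - count_start
    if observed <= 0:
        # No new observations, or the series was reset between the scrapes.
        return None
    return (sum_end - sum_start) / observed
```

If the total duration grows from 1200 ms to 2100 ms while the count grows from 10 to 16, the six new operations took 150 ms on average.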

DistributionSummary

DistributionSummary is used to track the distribution of a value. It works similarly to Timer, but instead of measuring time, it measures arbitrary numerical values such as request size or file size. This metric also provides statistical information such as average, maximum, and percentiles.

Available Operations:

  • sum: Returns the total value.
  • count: Returns the number of observed values.
  • mean (average): Calculates the average of the values.
  • max: Returns the largest observed value.
  • histogram_quantile: Calculates percentiles.
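The idea behind histogram_quantile can be sketched as linear interpolation over cumulative buckets. This is a simplified illustration of the technique, not Prometheus' exact implementation, and it assumes histogram buckets are published for the metric, which depends on how the gateway is configured. The bucket boundaries in the usage note are invented.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative (upper_bound, count) pairs.

    Buckets must be sorted by upper bound and end with (math.inf, total_count).
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # Rank falls in the overflow bucket: report the last finite bound.
                return lower_bound
            in_bucket = count - lower_count
            if in_bucket == 0:
                return lower_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound
```

With the buckets [(100, 20), (500, 80), (1000, 95), (math.inf, 100)], the median lands halfway through the second bucket, so the estimate is 300. Because values are assumed to be evenly spread inside each bucket, the result is an estimate whose accuracy depends on the bucket layout.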


API Traffic Metrics

These metrics track API requests passing through Apinizer and measure their performance. Total, successful, failed, and blocked requests and cache hits are tracked as counters, while request processing times and data sizes are measured for performance analysis. The tagged variants carry api_id and api_name tags for detailed per-API analysis.

Metric Name | Description | Type | Tags
apinizer_api_traffic_total_count_total | Total API traffic requests | Counter | -
apinizer_api_traffic_success_count_total | Successful API requests | Counter | -
apinizer_api_traffic_error_count_total | Failed API requests | Counter | -
apinizer_api_traffic_blocked_count_total | Blocked API requests | Counter | -
apinizer_api_traffic_request_pipeline_time | API request pipeline time (ms) | Timer | -
apinizer_api_traffic_routing_time | API routing time (ms) | Timer | -
apinizer_api_traffic_response_pipeline_time | API response pipeline time (ms) | Timer | -
apinizer_api_traffic_total_time | API total time (ms) | Timer | -
apinizer_api_traffic_request_size | API request size (bytes) | DistributionSummary | -
apinizer_api_traffic_response_size | API response size (bytes) | DistributionSummary | -
apinizer_api_traffic_cache_hits_count | API cache hit count | Counter | -
apinizer_api_traffic_total_count_tagged | Total API traffic requests | Counter | api_id, api_name
apinizer_api_traffic_success_count_tagged | Successful API requests | Counter | api_id, api_name
apinizer_api_traffic_error_count_tagged | Failed API requests | Counter | api_id, api_name
apinizer_api_traffic_blocked_count_tagged | Blocked API requests | Counter | api_id, api_name
apinizer_api_traffic_request_pipeline_time_tagged | API request pipeline time (ms) | Timer | api_id, api_name
apinizer_api_traffic_routing_time_tagged | API routing time (ms) | Timer | api_id, api_name
apinizer_api_traffic_response_pipeline_time_tagged | API response pipeline time (ms) | Timer | api_id, api_name
apinizer_api_traffic_total_time_tagged | API total time (ms) | Timer | api_id, api_name
apinizer_api_traffic_request_size_tagged | API request size (bytes) | DistributionSummary | api_id, api_name
apinizer_api_traffic_response_size_tagged | API response size (bytes) | DistributionSummary | api_id, api_name
apinizer_api_traffic_cache_hits_count_tagged | API cache hit count | Counter | api_id, api_name
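As an illustration of how the general counters can be combined, the sketch below compares two scrapes and derives an error ratio and a cache hit ratio for the interval between them. The scrapes are represented as plain dictionaries of metric name to value, and the numbers in the usage note are invented; dividing by the total request counter is one reasonable definition of these ratios, not necessarily the one Apinizer uses internally.

```python
def ratio(numerator_delta, denominator_delta):
    """Share of one counter's increase relative to another's; None without traffic."""
    if denominator_delta <= 0:
        return None
    return numerator_delta / denominator_delta


def traffic_summary(before, after):
    """Summarize the interval between two scrapes (dicts of metric name -> value)."""
    def delta(name):
        return after[name] - before[name]

    requests = delta("apinizer_api_traffic_total_count_total")
    return {
        "requests": requests,
        "error_ratio": ratio(delta("apinizer_api_traffic_error_count_total"), requests),
        "cache_hit_ratio": ratio(delta("apinizer_api_traffic_cache_hits_count"), requests),
    }
```

If the total grows from 1000 to 1200 while errors grow from 50 to 60 and cache hits from 200 to 280, the interval saw 200 requests with an error ratio of 0.05 and a cache hit ratio of 0.4.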

External Connection Metrics

These metrics are used to monitor external requests made through Apinizer. The performance of external services is analyzed by measuring total requests, number of errors and response time. Some metrics are provided with url tags for detailed URL-based analysis.

Metric Name | Description | Type | Tags
apinizer_external_requests_total_count | Total external request count | Counter | -
apinizer_external_errors_total_count | Total external error count | Counter | -
apinizer_external_response_time | External response time (ms) | Timer | -
apinizer_external_requests_total_count_tagged | Total external request count | Counter | url
apinizer_external_errors_total_count_tagged | Total external error count | Counter | url
apinizer_external_response_time_tagged | External response time (ms) | Timer | url
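The url tag makes it possible to compare external services with each other. A minimal sketch, assuming the tagged request and error counters have already been collected into dictionaries keyed by the url tag value (the URLs in the usage note are hypothetical):

```python
def error_ratio_by_url(requests, errors):
    """Map each url to errors/requests, skipping urls without any requests."""
    return {
        url: errors.get(url, 0.0) / count
        for url, count in requests.items()
        if count > 0
    }


def worst_url(requests, errors):
    """Return the url with the highest error ratio, or None if there is no traffic."""
    ratios = error_ratio_by_url(requests, errors)
    return max(ratios, key=ratios.get) if ratios else None
```

For requests of {"https://a.example": 200, "https://b.example": 50} and errors of {"https://a.example": 4, "https://b.example": 5}, the second URL has the higher error ratio (0.1 against 0.02), even though it produced a similar absolute number of errors.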

Cache Metrics

These metrics monitor how the worker (gateway) pod interacts with the cache. They measure the total number of cache requests, the number of errors, and the response time, so you can analyze how well the worker pod performs cache operations.

Metric Name | Description | Type | Tags
apinizer_cache_requests_total_count | Total cache request count | Counter | -
apinizer_cache_errors_total_count | Total cache error count | Counter | -
apinizer_cache_response_time | Cache operation response time (ms) | Timer | -

JVM Metrics

These metrics monitor JVM performance and resource utilization in the worker (gateway) pod. They provide detailed information about memory, GC (garbage collection) activity, and thread status, which helps you verify that the system is running efficiently.

Metric Name | Description | Type | Tags
jvm_buffer_count_buffers | Number of buffers used by JVM | Gauge | -
jvm_buffer_memory_used_bytes | Total used buffer memory (bytes) | Gauge | -
jvm_buffer_total_capacity_bytes | Total buffer capacity (bytes) | Gauge | -
jvm_gc_live_data_size_bytes | Size of surviving data after GC (bytes) | Gauge | -
jvm_gc_max_data_size_bytes | Maximum data size for GC (bytes) | Gauge | -
jvm_gc_memory_allocated_bytes_total | Amount of memory allocated by GC (bytes) | Counter | -
jvm_gc_memory_promoted_bytes_total | Memory promoted from eden by GC (bytes) | Counter | -
jvm_gc_pause_seconds_count | Total number of GC pauses | Counter | -
jvm_gc_pause_seconds_max | Longest GC pause (seconds) | Gauge | -
jvm_gc_pause_seconds_sum | Total GC pause time (seconds) | Gauge | -
jvm_memory_committed_bytes | Memory allocated by JVM (bytes) | Gauge | -
jvm_memory_max_bytes | Maximum memory available to JVM (bytes) | Gauge | -
jvm_memory_used_bytes | Memory used by JVM (bytes) | Gauge | -
jvm_threads_daemon_threads | Number of running daemon threads | Gauge | -
jvm_threads_live_threads | Number of active running threads | Gauge | -
jvm_threads_peak_threads | Highest number of threads reached | Gauge | -
jvm_threads_started_threads_total | Total number of threads started | Counter | -
jvm_threads_states_threads | Number of threads in different states | Gauge | state
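Two derived values are often more useful than the raw JVM series: how full memory is relative to its maximum, and how long an average GC pause lasts. The sketch below takes already-scraped numbers as plain arguments; it assumes jvm_memory_used_bytes and jvm_memory_max_bytes refer to the same memory pool, and that jvm_gc_pause_seconds_sum and jvm_gc_pause_seconds_count are read from the same scrape.

```python
def usage_ratio(used_bytes, max_bytes):
    """Fraction of the maximum in use, or None when no maximum is defined."""
    # The JVM reports a maximum of -1 for memory pools without a defined limit.
    if max_bytes <= 0:
        return None
    return used_bytes / max_bytes


def gc_pause_mean(pause_sum_seconds, pause_count):
    """Average GC pause length in seconds, or None if no pause was recorded."""
    return pause_sum_seconds / pause_count if pause_count > 0 else None
```

For instance, 512 MiB used out of a 2048 MiB maximum gives a ratio of 0.25, and 1.5 seconds of total pause time over 30 pauses gives an average pause of 0.05 seconds.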

System Metrics

These metrics monitor the CPU and system load of the worker (gateway) pod. They provide information about the CPU core count, the CPU utilization rate, and the load average.

Metric Name | Description | Type | Tags
system_cpu_count | Total CPU core count | Gauge | -
system_cpu_usage | System-wide CPU usage rate | Gauge | -
system_load_average_1m | System load average for the last 1 minute | Gauge | -

Process Metrics

These metrics track the resource utilization of the JVM process running in the worker (gateway) pod. They provide information about CPU utilization, the number of open files, and the maximum file limit.

Metric Name | Description | Type | Tags
process_cpu_usage | JVM's CPU usage rate | Gauge | -
process_files_max_files | Maximum number of files that can be opened | Gauge | -
process_files_open_files | Number of open files | Gauge | -