Overview of Apinizer Cache Metrics
CacheMetricsService collects metrics about the Hazelcast cache, API performance, the JVM, and system health. These metrics fall into the following categories:
- Hazelcast cache statistics
- API performance metrics
- JVM metrics
- System metrics
Prometheus Metric Types
Cache metrics are exposed to Prometheus using four metric types: Counter, Gauge, Timer, and Summary. Each type represents a different kind of data and behavior, and the type determines which operations are meaningful when you query and analyze the metric.
Counter
A Counter is a monotonically increasing value. It starts at zero when the application starts and resets only when the application is restarted. Counter-type metrics are ideal for tracking continuously increasing values such as total request count, error count, or completed transaction count.
Available Operations:
- sum: The total value itself.
- rate: Calculates the rate of increase over time (e.g., how much it increases per second).
- increase: Calculates the total increase over a specific time interval.
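The rate and increase operations can be illustrated with a minimal sketch in plain Python. It assumes a list of (timestamp, value) scrapes and treats any drop in value as a counter reset. Prometheus additionally extrapolates to the edges of the query window, which this sketch leaves out. The sample values are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def increase(samples: List[Sample]) -> float:
    """Total growth of a counter across the samples, tolerating resets."""
    total = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the application restarted and the counter began again
        # at zero, so the whole current value counts as new growth.
        total += curr - prev if curr >= prev else curr
    return total


def rate(samples: List[Sample]) -> float:
    """Average per-second growth over the time span covered by the samples."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    return increase(samples) / elapsed if elapsed > 0 else 0.0


# Hypothetical scrapes of apinizer_cache_api_requests_total, 15 s apart,
# with a restart between the third and fourth scrape.
scrapes = [(0.0, 100.0), (15.0, 130.0), (30.0, 160.0), (45.0, 20.0), (60.0, 50.0)]
window_increase = increase(scrapes)
window_rate = rate(scrapes)
```

Because the reset is absorbed inside increase, a restart does not show up as a negative rate.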
Gauge
A Gauge represents an instantaneous value. This value can increase, decrease, or remain constant. Gauge-type metrics are used to monitor a momentary state or level, such as current memory usage, instantaneous CPU usage, or the number of active threads.
Available Operations:
- sum: The sum of Gauge values grouped by tags.
- mean (average): The average of Gauge values grouped by tags.
- min/max: The minimum or maximum of values grouped by tags.
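Grouping by tags can be sketched as follows. This is an illustration only: the samples are hypothetical, and the label names (area, id) and their values are assumptions about how a gauge such as jvm_memory_used_bytes might be tagged.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

GaugeSample = Tuple[Dict[str, str], float]  # (labels, current value)


def aggregate_by(samples: List[GaugeSample], label: str) -> Dict[str, Dict[str, float]]:
    """Group gauge samples by one label and compute sum, mean, min and max per group."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for labels, value in samples:
        # Samples that lack the label are collected under the empty string.
        groups[labels.get(label, "")].append(value)
    return {
        key: {"sum": sum(values), "mean": mean(values), "min": min(values), "max": max(values)}
        for key, values in groups.items()
    }


# Hypothetical samples; label names and values are illustrative only.
samples = [
    ({"area": "heap", "id": "eden"}, 100.0),
    ({"area": "heap", "id": "old"}, 300.0),
    ({"area": "nonheap", "id": "metaspace"}, 50.0),
]
by_area = aggregate_by(samples, "area")
```

Unlike a Counter, a Gauge is aggregated on its current values, so no rate calculation is involved.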
Timer
A Timer measures how long an operation takes (usually in milliseconds or seconds). It is not a native Prometheus type; libraries such as Micrometer produce it from a combination of DistributionSummary and Counter. Timer metrics provide information such as average duration, maximum duration, and percentiles.
Available Operations:
- sum: Returns the total duration.
- count: Returns the total number of times the operation was performed.
- mean (average): Calculates the average duration of the operation.
- max: Returns the longest observed duration.
- histogram_quantile: Calculates percentiles.
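The mean and histogram_quantile operations can be sketched in plain Python. The sketch assumes the timer is exported with cumulative histogram buckets (a series such as apinizer_cache_api_response_time_bucket with an upper-bound label). It uses linear interpolation inside the matching bucket, which is a simplified version of what Prometheus does and does not cover all of its edge cases. The bucket counts are hypothetical.

```python
import math
from typing import List, Tuple

Bucket = Tuple[float, float]  # (upper bound in seconds, cumulative observation count)


def timer_mean(total_seconds: float, count: float) -> float:
    """Average duration: total recorded time divided by number of recordings."""
    return total_seconds / count if count else 0.0


def histogram_quantile(q: float, buckets: List[Bucket]) -> float:
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets."""
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            if math.isinf(upper_bound):
                # The +Inf bucket has no upper edge, so fall back to the
                # highest finite bound.
                return lower_bound
            in_bucket = cumulative - lower_count
            if in_bucket == 0:
                return upper_bound
            # Assume observations are spread evenly inside the bucket.
            return lower_bound + (upper_bound - lower_bound) * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, cumulative
    return lower_bound


# Hypothetical cumulative bucket counts for an API response time timer.
buckets = [(0.1, 50.0), (0.5, 90.0), (1.0, 99.0), (math.inf, 100.0)]
p95 = histogram_quantile(0.95, buckets)
p50 = histogram_quantile(0.5, buckets)
average = timer_mean(12.5, 100.0)
```

The estimate can only be as precise as the bucket boundaries allow: every observation in the 0.5 to 1.0 second bucket is assumed to be evenly spread across that range.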
Summary
A Summary records the sum and count of the observed values together with predefined quantiles. Unlike Timer and DistributionSummary, a Summary calculates the quantiles itself while the metric is being collected rather than at query time, which generally makes the reported quantiles more accurate.
Available Operations:
- sum: The sum of the observed values.
- count: The number of observed values.
- Specified quantiles (<metric_name>{quantile="0.99"}).
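To show how these three pieces appear together, the following sketch reads a Summary from Prometheus text exposition output. The scrape text and its values are hypothetical, and the parser is deliberately minimal: it only looks for the quantile label and ignores other labels, escaping, and timestamps.

```python
import re
from typing import Dict, Optional, Union

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
QUANTILE_LABEL = re.compile(r'quantile="([^"]+)"')

SummaryData = Dict[str, Union[Optional[float], Dict[str, float]]]


def read_summary(text: str, metric: str) -> SummaryData:
    """Collect the sum, count and precomputed quantiles of one summary metric."""
    quantiles: Dict[str, float] = {}
    result: SummaryData = {"sum": None, "count": None, "quantiles": quantiles}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if not match:
            continue
        name = match.group("name")
        value = float(match.group("value"))
        if name == metric + "_sum":
            result["sum"] = value
        elif name == metric + "_count":
            result["count"] = value
        elif name == metric:
            found = QUANTILE_LABEL.search(match.group("labels") or "")
            if found:
                quantiles[found.group(1)] = value
    return result


# Hypothetical scrape output for a latency summary.
scrape = """
# TYPE cache_gets_latency_seconds summary
cache_gets_latency_seconds{quantile="0.5"} 0.002
cache_gets_latency_seconds{quantile="0.99"} 0.015
cache_gets_latency_seconds_sum 3.2
cache_gets_latency_seconds_count 1200
"""
summary = read_summary(scrape, "cache_gets_latency_seconds")
```

The quantile values arrive already computed by the application, so no query-time calculation is needed to read them.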
Cache Metrics
These metrics monitor the performance of the Hazelcast cache. Efficiency is analyzed by tracking cache lookups, insertions, and latencies. Memory cost and the per-partition entry breakdown are also measured.
Metric | Description | Type |
---|---|---|
cache_gets_total | Total cache lookups (hits and misses) | Counter |
cache_puts_total | Total cache additions | Counter |
cache_size | Current number of entries in cache | Gauge |
cache_entries | Number of entries per cache partition | Gauge |
cache_entry_memory_bytes | Memory cost of cache entries | Gauge |
cache_gets_latency_seconds | Cache access latency | Summary |
cache_puts_latency_seconds | Cache insertion latency | Summary |
cache_removals_latency_seconds | Cache removal latency | Summary |
API Metrics
These metrics track the performance of the Cache APIs. API performance is evaluated using the number of requests, response times, and error counts.
Metric | Description | Type |
---|---|---|
apinizer_cache_api_requests_total | Total number of API requests | Counter |
apinizer_cache_api_response_time | API response time (seconds) | Timer |
apinizer_cache_api_errors_total | Total number of API errors | Counter |
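An error rate can be derived from the two counters above. The following is a minimal sketch, assuming two scrapes of each counter taken at the same moments and no restart in between. The function name and the sample values are hypothetical.

```python
def error_ratio(requests_start: float, requests_end: float,
                errors_start: float, errors_end: float) -> float:
    """Share of requests that failed between two scrapes (0.0 to 1.0)."""
    requests = requests_end - requests_start
    errors = errors_end - errors_start
    if requests <= 0:
        # No traffic in the interval (or a counter reset, which this sketch
        # does not handle), so no ratio can be stated.
        return 0.0
    return errors / requests


# Hypothetical values of apinizer_cache_api_requests_total and
# apinizer_cache_api_errors_total at the start and end of an interval.
ratio = error_ratio(1000.0, 1500.0, 10.0, 25.0)
```

Using the difference between two scrapes, rather than the raw totals, keeps the ratio focused on recent traffic instead of the whole lifetime of the process.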
JVM Metrics
These metrics track Cache's memory and thread utilization. JVM performance is analyzed with data such as memory usage, GC (Garbage Collection) pause times and active threads.
Metric | Description | Type |
---|---|---|
jvm_memory_used_bytes | Memory used, by area (heap/non-heap) | Gauge |
jvm_memory_committed_bytes | Memory committed, by area | Gauge |
jvm_memory_max_bytes | Maximum memory by area | Gauge |
jvm_gc_pause_seconds | GC pause time | Summary |
jvm_threads_live_threads | Number of live threads (daemon and non-daemon) | Gauge |
jvm_threads_daemon_threads | Number of live daemon threads | Gauge |
System Metrics
These metrics monitor Cache's overall performance with data such as CPU utilization, number of processors, and load average. Resource utilization is also evaluated with data such as process uptime and the number of open files.
Metric | Description | Type |
---|---|---|
system_cpu_usage | CPU utilization of the host system | Gauge |
system_cpu_count | Number of available processors | Gauge |
system_load_average_1m | System load average (1 minute) | Gauge |
process_cpu_usage | CPU usage of the JVM process | Gauge |
process_uptime_seconds | JVM process uptime | Gauge |
process_files_open_files | Number of open file descriptors | Gauge |