In today's microservices architectures, monitoring system performance and behavior is vital to maintaining reliable operations. Apinizer, a powerful API management platform, offers robust metrics collection capabilities that can be integrated with industry-standard monitoring tools such as Prometheus and Grafana. This article provides a detailed guide on how you can leverage Apinizer's metrics features to gain valuable insights into your API gateway and cache performance.

Overview of Apinizer's Metric System

Apinizer's metric system is built on Micrometer. The platform collects a wide range of metrics through two main components:

  1. Apinizer Gateway: Collects metrics on API traffic, external connections, JVM health and system resources
  2. Apinizer Cache: Monitors cache operations, API requests, JVM performance and system health

These metrics provide comprehensive visibility into the performance and health of your API management infrastructure, enabling proactive identification of bottlenecks, troubleshooting and capacity planning.

Metrics Collected by Apinizer

Apinizer Gateway Metrics

The Gateway component collects metrics in various categories:

API Traffic Metrics

These metrics track requests passing through the Apinizer Gateway:

  • Total API traffic requests
  • Successful/failed/blocked API requests
  • Request processing times (pipeline, routing, total)
  • Request and response sizes
  • Cache hit statistics

Each metric is available in two forms:

  • Total metrics (e.g., total API requests across all APIs)
  • Tagged metrics broken down by dimensions such as API ID or API name

External Connection Metrics

These track connections to external services:

  • Total external requests
  • Number of external errors
  • External response times

JVM Metrics

These provide insights into the Java Virtual Machine:

  • Memory usage (heap, non-heap)
  • Garbage collection statistics
  • Thread counts and states

System Metrics

These monitor the underlying system:

  • CPU utilization
  • Number of processors
  • System load average
  • Open file descriptor counts

Apinizer Cache Metrics

The Cache component collects the following:

Cache Operation Metrics

  • Cache get/put counts
  • Cache size and number of entries
  • Cache operation latencies
  • Memory usage by cache entries

API Metrics

  • API request counts
  • API response times
  • API error counts

JVM and System Metrics

Similar to Gateway, the Cache component monitors JVM performance and system resource utilization.

Setup of Prometheus Integration

1. Enabling Metrics in Apinizer Components

For Apinizer Gateway:

In the Apinizer interface, go to the Gateway Environments page and enable the “Prometheus Metric Server” option. This will enable metric publishing over port 9091.

For Apinizer Cache:

Edit the cache deployment and add the METRICS_ENABLED=true environment variable. This can be done in the following ways:

Through the Apinizer interface: Gateway Environments > Deployments & Services > Cache > Edit deployment

Via Kubernetes CLI:

kubectl edit deploy -n <namespace> cache

# Add the following environment variable
- name: METRICS_ENABLED
  value: "true"
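
For orientation, the variable ends up under the cache container's env list in the Deployment spec. A minimal sketch; the container name is an assumption and may differ in your installation:

spec:
  template:
    spec:
      containers:
        - name: cache                # check the actual container name in your deployment
          env:
            - name: METRICS_ENABLED
              value: "true"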

2. Configuring Prometheus to Collect Metrics

You can configure Prometheus to collect metrics from Apinizer components in two different ways:

Static Scraping (Fixed Targets)

Create a service that targets Apinizer components on port 9091:

apiVersion: v1
kind: Service
metadata:
  name: cache-prometheus-service  # or worker-prometheus-service for the Gateway
  namespace: <NAMESPACE>
spec:
  ports:
    - port: 9091
      protocol: TCP
      targetPort: 9091
  selector:
    app: cache  # or worker for Gateway
  type: ClusterIP

Then configure Prometheus to collect metrics from these services:

scrape_configs:
  - job_name: 'apinizer-components'
    static_configs:
      - targets: ['cache-prometheus-service.<NAMESPACE>.svc.cluster.local:9091', 'worker-prometheus-service.<NAMESPACE>.svc.cluster.local:9091']

Kubernetes Service Discovery and Dynamic Scraping

For more flexible configurations, you can use Kubernetes service discovery with pod annotations:

1. Add annotations to the Deployment:

template:
  metadata:
    annotations:
      prometheus.io/port: "9091"
      prometheus.io/scrape: "true"

2. Configure Prometheus to use Kubernetes service discovery:

scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port, __meta_kubernetes_pod_ip]
        action: replace
        regex: (\d+);((([0-9]+?)(\.|$)){4})
        replacement: $2:$1
        target_label: __address__       
      # Additional relabeling configurations
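
The trailing comment stands in for further relabelings. A common addition, shown here as a general Kubernetes pattern rather than an Apinizer requirement, is to surface the namespace and pod name as labels so dashboards can filter on them; these entries would sit under the same relabel_configs list:

      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: pod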

Analyzing Apinizer Metrics with PromQL

Once Prometheus starts collecting metrics from the Apinizer components, you can use PromQL (the Prometheus Query Language) to analyze the data. Here are some useful queries:

Gateway API Traffic Analysis

# Total API requests in the last hour
sum(increase(apinizer_api_traffic_total_count_total[1h]))

# Requests per API in the last 5 minutes
sum by (api_name) (increase(apinizer_api_traffic_total_count_tagged_total[5m]))

# API success rate in the last 10 minutes (%)
(sum(increase(apinizer_api_traffic_success_count_total[10m])) / sum(increase(apinizer_api_traffic_total_count_total[10m]))) * 100

# Average response time per API (milliseconds)
sum by (api_name) (rate(apinizer_api_traffic_total_time_tagged_seconds_sum[5m])) / sum by (api_name) (rate(apinizer_api_traffic_total_time_tagged_seconds_count[5m])) * 1000

Cache Performance Analysis

# Total cache get operations in the last hour
sum(increase(cache_gets_total[1h]))

# Cache hit rate (%)
(sum(increase(cache_gets_total[5m])) - sum(increase(apinizer_cache_api_errors_total[5m]))) / sum(increase(cache_gets_total[5m])) * 100

# Available cache entries
sum(cache_size)

JVM Analysis

# Memory utilization (%)
(sum(jvm_memory_used_bytes{application="apinizer-cache"}) * 100) / sum(jvm_memory_max_bytes{application="apinizer-cache"})

# Garbage collection time
sum(rate(jvm_gc_pause_seconds_sum[5m]))
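
Expressions that you chart or alert on frequently can be precomputed with Prometheus recording rules so that dashboards query a single series. A minimal sketch built on the memory query above; the rule name is illustrative, and the application label is assumed to match your cache deployment:

groups:
  - name: apinizer-recording-rules
    rules:
      # Precompute cache JVM memory utilization (%) once per evaluation interval
      - record: apinizer_cache:jvm_memory_utilization:percent
        expr: (sum(jvm_memory_used_bytes{application="apinizer-cache"}) * 100) / sum(jvm_memory_max_bytes{application="apinizer-cache"})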

Creating Grafana Dashboards

After adding Prometheus as a data source in Grafana, you can create dashboards to visualize Apinizer metrics.
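
If you manage Grafana declaratively, the data source can be added through Grafana's provisioning mechanism. A minimal sketch, assuming Prometheus is reachable at the in-cluster address below (adjust the service name, namespace and port to your setup):

apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus-server.<NAMESPACE>.svc.cluster.local:9090   # assumed service name and port
    isDefault: true

Place the file under /etc/grafana/provisioning/datasources/ (or mount it via a ConfigMap) and restart Grafana. With the data source in place, here are some dashboard suggestions: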

API Traffic Dashboard

Panel 1: API Request Volume by Type

  • Metrics:
  • Total requests: sum(rate(apinizer_api_traffic_total_count_total[5m]))
  • Successful requests: sum(rate(apinizer_api_traffic_success_count_total[5m]))
  • Failed requests: sum(rate(apinizer_api_traffic_error_count_total[5m]))
  • Visualization: Time series

Panel 2: Most Used APIs by Traffic

  • Metric: topk(5, sum by (api_name) (increase(apinizer_api_traffic_total_count_tagged_total[5m])))
  • Visualization: Bar chart

Panel 3: API Processing Time Distribution

  • Metrics:
  • Request pipeline:  sum(rate(apinizer_api_traffic_request_pipeline_time_seconds_sum[5m])) / sum(rate(apinizer_api_traffic_request_pipeline_time_seconds_count[5m])) * 1000
  • Routing:  sum(rate(apinizer_api_traffic_routing_time_seconds_sum[5m])) / sum(rate(apinizer_api_traffic_routing_time_seconds_count[5m])) * 1000
  • Response pipeline: sum(rate(apinizer_api_traffic_response_pipeline_time_seconds_sum[5m])) / sum(rate(apinizer_api_traffic_response_pipeline_time_seconds_count[5m])) * 1000
  • Visualization: Time series

Panel 4: Request/Response Size

  • Metrics:
  • Request size: sum(rate(apinizer_api_traffic_request_size_bytes_sum[5m])) / sum(rate(apinizer_api_traffic_request_size_bytes_count[5m]))
  • Response size: sum(rate(apinizer_api_traffic_response_size_bytes_sum[5m])) / sum(rate(apinizer_api_traffic_response_size_bytes_count[5m]))
  • Visualization: Time series

Cache Performance Dashboard

Panel 1: Cache Operations

  • Metrics:
  • Get operations: rate(cache_gets_total[5m])
  • Put operations: rate(cache_puts_total[5m])
  • Visualization: Time series

Panel 2: Cache Hit Rate

  • Metric: (sum(increase(cache_gets_total[5m])) - sum(increase(apinizer_cache_api_errors_total[5m]))) / sum(increase(cache_gets_total[5m])) * 100
  • Visualization: Gauge

Panel 3: Cache Memory Usage

  • Metric: sum(cache_entry_memory_bytes)
  • Visualization: Stat or gauge

System Health Dashboard

Panel 1: JVM Memory Usage

  • Metric: sum by (area)(jvm_memory_used_bytes) / sum by (area)(jvm_memory_max_bytes) * 100
  • Visualization: Gauge or time series

Panel 2: CPU Utilization

  • Metric: sum(system_cpu_usage{pod=~".*"}) by (pod) * 100
  • Visualization: Time series

Panel 3: Active Threads

  • Metric: sum(jvm_threads_live_threads)
  • Visualization: Stat or gauge

Best Practices

1. Metric Retention Period

Configure appropriate retention periods in Prometheus according to your needs. The configuration shown here retains data for 7 days:

- "--storage.tsdb.path=/prometheus"
- "--storage.tsdb.retention.time=7d"

Consider remote storage integrations or adjust these values for longer retention.
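
If you need a longer horizon than local storage comfortably allows, Prometheus can forward samples to a long-term store via remote_write. A minimal sketch; the endpoint below is a placeholder for whatever remote storage backend you run, not an Apinizer component:

remote_write:
  - url: "https://<REMOTE_WRITE_ENDPOINT>"   # endpoint of your long-term storage backend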

2. Alert Configuration

Set alerts for critical metrics in Prometheus Alertmanager or Grafana; a sample rule-file sketch follows the list:

  • High API error rates:  (sum(increase(apinizer_api_traffic_error_count_total[5m])) / sum(increase(apinizer_api_traffic_total_count_total[5m]))) * 100 > 10
  • High memory usage: sum(jvm_memory_used_bytes) / sum(jvm_memory_max_bytes) * 100 > 85
  • Slow response times: sum(rate(apinizer_api_traffic_total_time_seconds_sum[5m])) / sum(rate(apinizer_api_traffic_total_time_seconds_count[5m])) > 1
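
As one way to wire the first of these up, the expression can be dropped into a standard Prometheus rule file. A minimal sketch; the group name, alert name, wait duration, and severity label are illustrative choices, not Apinizer defaults:

groups:
  - name: apinizer-alerts
    rules:
      - alert: HighApiErrorRate
        # Fires when more than 10% of gateway requests failed over the last 5 minutes
        expr: (sum(increase(apinizer_api_traffic_error_count_total[5m])) / sum(increase(apinizer_api_traffic_total_count_total[5m]))) * 100 > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Apinizer API error rate above 10% for the last 5 minutes"

Load the file through the rule_files section of prometheus.yml, or the equivalent field of your Prometheus Operator resources.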

3. Dashboard Organization

Organize your Grafana dashboards logically:

  • Create separate dashboards for Gateway and Cache components
  • Group related metrics together
  • Use variables that allow filtering by namespace, pod or API

4. Label Usage

Leverage Prometheus labels for more effective querying:

  • Filter by specific APIs using the api_name label
  • Analyze metrics by namespace or pod
  • Compare performance between different environments

Conclusion

Integrating Apinizer with Prometheus and Grafana provides powerful monitoring capabilities for your API management infrastructure. By properly configuring metric collection, creating informative dashboards, and implementing alerts, you can ensure optimal performance, quickly identify issues, and make data-driven decisions about your API ecosystem.

This integration capitalizes on the strengths of each component:

  • Apinizer's comprehensive metrics collection
  • Prometheus' efficient time series database and powerful query language
  • Grafana's flexible and beautiful visualizations

Start monitoring your Apinizer deployment today to gain deeper insights into API gateway and cache performance.
