Apinizer's Integration with Prometheus and Grafana: A Comprehensive Guide
In microservice architectures, monitoring system performance and behavior is essential to keeping services healthy. Apinizer, an API management platform, collects a broad set of metrics that can be exported to industry-standard monitoring tools such as Prometheus and Grafana. This article explains how to use Apinizer's metrics features to gain insight into your API gateway and cache performance.
Overview of Apinizer's Metric System
Apinizer's metric system is built on Micrometer. The platform collects a wide range of metrics through two main components:
- Apinizer Gateway: Collects metrics on API traffic, external connections, JVM health and system resources
- Apinizer Cache: Monitors cache operations, API requests, JVM performance and system health
These metrics provide comprehensive visibility into the performance and health of your API management infrastructure, enabling proactive identification of bottlenecks, troubleshooting and capacity planning.
Metrics Collected by Apinizer
Apinizer Gateway Metrics
The Gateway component collects metrics in various categories:
API Traffic Metrics
These metrics track requests passing through the Apinizer Gateway:
- Total API traffic requests
- Successful/failed/blocked API requests
- Request processing times (pipeline, routing, total)
- Request and response sizes
- Cache hit statistics
Each metric is available in two forms:
- Total metrics (e.g., total API requests across all APIs)
- Tagged metrics broken down by label (e.g., requests per API ID or API name)
External Connection Metrics
These track connections to external services:
- Total external requests
- Number of external errors
- External response times
JVM Metrics
These provide insights into the Java Virtual Machine:
- Memory usage (heap, non-heap)
- Garbage collection statistics
- Thread counts and states
System Metrics
These monitor the underlying system:
- CPU utilization
- Number of processors
- System load average
- File descriptor counts
Apinizer Cache Metrics
The Cache component collects the following:
Cache Operation Metrics
- Cache get/put counts
- Cache size and number of entries
- Cache operation latencies
- Memory usage by cache entries
API Metrics
- API request counts
- API response times
- API error counts
JVM and System Metrics
Similar to Gateway, the Cache component monitors JVM performance and system resource utilization.
Setting Up Prometheus Integration
1. Enabling Metrics in Apinizer Components
For Apinizer Gateway:
In the Apinizer interface, go to the Gateway Environments page and enable the “Prometheus Metric Server” option. This will enable metric publishing over port 9091.
For Apinizer Cache:
Edit the cache deployment and add the METRICS_ENABLED=TRUE environment variable. This can be done in the following ways:
Through the Apinizer interface: Gateway Environments > Deployments & Services > Cache > Edit deployment
Via Kubernetes CLI:
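One way to make this change from the command line is to run `kubectl edit deployment` against the cache deployment and add the variable to the container's `env` list. The fragment below is a minimal sketch of the relevant part of the spec. Only `METRICS_ENABLED` and its value come from the text above; the deployment, namespace, and container names are placeholders, not confirmed Apinizer defaults.

```yaml
# Fragment of the cache Deployment spec; all names are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cache            # hypothetical: use your actual cache deployment name
  namespace: apinizer    # hypothetical namespace
spec:
  template:
    spec:
      containers:
        - name: cache    # hypothetical container name
          env:
            # Enables metric publishing for the cache component.
            - name: METRICS_ENABLED
              value: "TRUE"
```

Changing the pod template triggers a rolling restart, so the new pods start with metrics enabled.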
2. Configuring Prometheus to Collect Metrics
You can configure Prometheus to collect metrics from Apinizer components in two different ways:
Static Scraping
Create a service that targets Apinizer components on port 9091:
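A minimal sketch of such a Service for the Gateway is shown here. Port 9091 comes from the text above; the service name, namespace, and selector labels are assumptions and must be adjusted to match the labels on your own gateway pods. A second Service with a different selector would be needed for the Cache pods.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: apinizer-gateway-metrics   # hypothetical service name
  namespace: apinizer              # hypothetical namespace
spec:
  selector:
    app: worker                    # assumption: must match your gateway pod labels
  ports:
    - name: metrics
      port: 9091                   # metrics port published by the component
      targetPort: 9091
```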
Then configure Prometheus to collect metrics from these services:
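As a sketch, a static scrape job in `prometheus.yml` could look like the following. The job name and target host are hypothetical and assume a Service named `apinizer-gateway-metrics` in an `apinizer` namespace. No `metrics_path` is set, so Prometheus uses its default of `/metrics`; set it explicitly if your Apinizer version publishes metrics on a different path.

```yaml
scrape_configs:
  - job_name: apinizer-gateway     # hypothetical job name
    scrape_interval: 15s           # assumption: tune to your needs
    static_configs:
      - targets:
          # Hypothetical in-cluster DNS name of the metrics Service.
          - apinizer-gateway-metrics.apinizer.svc.cluster.local:9091
```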
Kubernetes Service Discovery and Dynamic Scraping
For more flexible configurations, you can use Kubernetes service discovery with pod annotations:
1. Add annotations to the Deployment:
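The annotations go on the pod template so that every pod created by the Deployment carries them. The sketch below uses the widely adopted `prometheus.io/*` annotation convention. These keys are not built into Prometheus; they only take effect when the scrape configuration contains matching relabel rules.

```yaml
# Fragment of the Deployment spec: annotations on the pod template.
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"   # opt this pod in to scraping
        prometheus.io/port: "9091"     # metrics port from the setup above
```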
2. Configure Prometheus to use Kubernetes service discovery:
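A minimal sketch of a scrape job that discovers pods through the Kubernetes API and honors the `prometheus.io/scrape` and `prometheus.io/port` annotations follows. The job name is arbitrary, and Prometheus needs RBAC permission to list pods for this to work.

```yaml
scrape_configs:
  - job_name: kubernetes-pods        # hypothetical job name
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Point the scrape address at the annotated port.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: '([^:]+)(?::\d+)?;(\d+)'
        replacement: $1:$2
        target_label: __address__
      # Expose namespace and pod name as labels for querying.
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: pod
```

The `namespace` and `pod` labels added at the end are what make per-pod filtering possible in queries and dashboard variables.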
Analyzing Apinizer Metrics with PromQL
Once Prometheus starts collecting metrics from the Apinizer components, you can use PromQL (Prometheus Query Language) to analyze the data. Here are some useful queries:
Gateway API Traffic Analysis
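The queries below are a sketch built from the gateway metric names used in the dashboard panels later in this article. They are written as Prometheus recording rules so the results are precomputed; each `expr` can also be pasted directly into the Prometheus expression browser. The group and `record` names are hypothetical.

```yaml
groups:
  - name: apinizer-gateway-traffic          # hypothetical group name
    rules:
      # Requests per second across all APIs over the last 5 minutes.
      - record: apinizer:api_requests:rate5m
        expr: sum(rate(apinizer_api_traffic_total_count_total[5m]))
      # Requests per second broken down by API name (tagged metric).
      - record: apinizer:api_requests_by_api:rate5m
        expr: sum by (api_name) (rate(apinizer_api_traffic_total_count_tagged_total[5m]))
      # Fraction of requests that failed (0 to 1).
      - record: apinizer:api_error_ratio:rate5m
        expr: >-
          sum(rate(apinizer_api_traffic_error_count_total[5m]))
          / sum(rate(apinizer_api_traffic_total_count_total[5m]))
```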
Cache Performance Analysis
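A comparable sketch for the cache, again using only metric names that appear in the dashboard panels later in this article. The group and `record` names are hypothetical.

```yaml
groups:
  - name: apinizer-cache                    # hypothetical group name
    rules:
      # Cache read operations per second.
      - record: apinizer:cache_gets:rate5m
        expr: sum(rate(cache_gets_total[5m]))
      # Cache write operations per second.
      - record: apinizer:cache_puts:rate5m
        expr: sum(rate(cache_puts_total[5m]))
      # Total memory currently held by cache entries, in bytes.
      - record: apinizer:cache_entry_memory_bytes:sum
        expr: sum(cache_entry_memory_bytes)
```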
JVM Analysis
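For the JVM, a sketch based on the Micrometer JVM metrics referenced in the System Health Dashboard later in this article. The group and `record` names are hypothetical.

```yaml
groups:
  - name: apinizer-jvm                      # hypothetical group name
    rules:
      # Memory in use as a percentage of the maximum, per area (heap / nonheap).
      - record: apinizer:jvm_memory_used:percent
        expr: >-
          sum by (area) (jvm_memory_used_bytes)
          / sum by (area) (jvm_memory_max_bytes) * 100
      # Number of live threads across all scraped JVMs.
      - record: apinizer:jvm_threads_live:sum
        expr: sum(jvm_threads_live_threads)
```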
Creating Grafana Dashboards
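Dashboards need Prometheus registered as a Grafana data source. This can be done in the Grafana UI or through a provisioning file; the following is a minimal sketch of the latter, where the data source name and URL are assumptions for a Prometheus server reachable inside the cluster.

```yaml
# Grafana data source provisioning file, e.g. placed under provisioning/datasources/.
apiVersion: 1
datasources:
  - name: Prometheus                     # hypothetical display name
    type: prometheus
    access: proxy                        # Grafana backend proxies the queries
    url: http://prometheus-server:9090   # hypothetical Prometheus address
    isDefault: true
```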
After adding Prometheus as a data source in Grafana, you can create dashboards to visualize Apinizer metrics. Here are some dashboard suggestions:
API Traffic Dashboard
Panel 1: API Request Volume by Type
- Metrics:
- Total requests: sum(rate(apinizer_api_traffic_total_count_total[5m]))
- Successful requests: sum(rate(apinizer_api_traffic_success_count_total[5m]))
- Failed requests: sum(rate(apinizer_api_traffic_error_count_total[5m]))
- Visualization: Time series
Panel 2: Most Used APIs by Traffic
- Metric:
topk(5, sum by (api_name) (increase(apinizer_api_traffic_total_count_tagged_total[5m])))
- Visualization: Bar chart
Panel 3: API Processing Time Distribution
- Metrics:
- Request pipeline:
sum(rate(apinizer_api_traffic_request_pipeline_time_seconds_sum[5m])) / sum(rate(apinizer_api_traffic_request_pipeline_time_seconds_count[5m])) * 1000
- Routing:
sum(rate(apinizer_api_traffic_routing_time_seconds_sum[5m])) / sum(rate(apinizer_api_traffic_routing_time_seconds_count[5m])) * 1000
- Response pipeline:
sum(rate(apinizer_api_traffic_response_pipeline_time_seconds_sum[5m])) / sum(rate(apinizer_api_traffic_response_pipeline_time_seconds_count[5m])) * 1000
- Visualization: Time series
Panel 4: Request/Response Size
- Metrics:
- Request size:
sum(rate(apinizer_api_traffic_request_size_bytes_sum[5m])) / sum(rate(apinizer_api_traffic_request_size_bytes_count[5m]))
- Response size:
sum(rate(apinizer_api_traffic_response_size_bytes_sum[5m])) / sum(rate(apinizer_api_traffic_response_size_bytes_count[5m]))
- Visualization: Time series
Cache Performance Dashboard
Panel 1: Cache Operations
- Metrics:
- Get operations:
rate(cache_gets_total[5m])
- Put operations:
rate(cache_puts_total[5m])
- Visualization: Time series
Panel 2: Cache Hit Rate
- Metric:
(sum(increase(cache_gets_total[5m])) - sum(increase(apinizer_cache_api_errors_total[5m]))) / sum(increase(cache_gets_total[5m])) * 100
- Visualization: Gauge
Panel 3: Cache Memory Usage
- Metric:
sum(cache_entry_memory_bytes)
- Visualization: Stat or gauge
System Health Dashboard
Panel 1: JVM Memory Usage
- Metric:
sum by (area)(jvm_memory_used_bytes) / sum by (area)(jvm_memory_max_bytes) * 100
- Visualization: Gauge or time series
Panel 2: CPU Utilization
- Metric:
sum(system_cpu_usage{pod=~".*"}) by (pod) * 100
- Visualization: Time series
Panel 3: Active Threads
- Metric:
sum(jvm_threads_live_threads)
- Visualization: Stat or gauge
Best Practices
1. Metric Retention Period
Configure retention periods in Prometheus according to your needs. The example configuration below retains data for 7 days (Prometheus itself defaults to 15 days when the retention flag is omitted):
- "--storage.tsdb.path=/prometheus"
- "--storage.tsdb.retention.time=7d"
Consider remote storage integrations or adjust these values for longer retention.
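As a sketch of the remote storage option, a `remote_write` section in `prometheus.yml` forwards samples to a long-term store while local retention stays short. The endpoint URL is a placeholder and assumes a backend that accepts the Prometheus remote-write protocol.

```yaml
remote_write:
  # Hypothetical endpoint: replace with your long-term storage backend.
  - url: https://remote-storage.example.com/api/v1/write
```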
2. Alert Configuration
Define alerts for critical metrics as Prometheus alerting rules (routed through Alertmanager) or as Grafana alerts:
- High API error rates:
(sum(increase(apinizer_api_traffic_error_count_total[5m])) / sum(increase(apinizer_api_traffic_total_count_total[5m]))) * 100 > 10
- High memory usage:
sum(jvm_memory_used_bytes) / sum(jvm_memory_max_bytes) * 100 > 85
- Slow response times:
sum(rate(apinizer_api_traffic_total_time_seconds_sum[5m])) / sum(rate(apinizer_api_traffic_total_time_seconds_count[5m])) > 1
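The expressions above can be placed in a Prometheus rule file. The following is a minimal sketch for two of them; the alert names, the `for` duration, the `severity` labels, and the annotation text are assumptions to adapt to your own alerting policy.

```yaml
groups:
  - name: apinizer-alerts                   # hypothetical group name
    rules:
      - alert: ApinizerHighApiErrorRate     # hypothetical alert name
        expr: >-
          (sum(increase(apinizer_api_traffic_error_count_total[5m]))
          / sum(increase(apinizer_api_traffic_total_count_total[5m]))) * 100 > 10
        for: 5m                             # assumption: must hold for 5 minutes
        labels:
          severity: critical
        annotations:
          summary: "API error rate is above 10%"
      - alert: ApinizerSlowResponses        # hypothetical alert name
        expr: >-
          sum(rate(apinizer_api_traffic_total_time_seconds_sum[5m]))
          / sum(rate(apinizer_api_traffic_total_time_seconds_count[5m])) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Average API response time is above 1 second"
```

The `for` clause keeps short spikes from firing an alert; the condition has to stay true for the whole period before Prometheus sends it to Alertmanager.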
3. Dashboard Organization
Organize your Grafana dashboards logically:
- Create separate dashboards for Gateway and Cache components
- Group related metrics together on the same dashboard or row
- Use variables that allow filtering by namespace, pod or API
4. Label Usage
Use Prometheus labels for more effective querying:
- Filter by specific APIs using the api_name label
- Analyze metrics by namespace or pod
- Compare performance between different environments
Conclusion
Integrating Apinizer with Prometheus and Grafana provides powerful monitoring capabilities for your API management infrastructure. By properly configuring metric collection, creating informative dashboards, and setting up alerts, you can keep performance on target, identify issues quickly, and make data-driven decisions about your API ecosystem.
This integration capitalizes on the strengths of each component:
- Apinizer's comprehensive metrics collection
- Prometheus' efficient time series database and powerful query language
- Grafana's flexible and beautiful visualizations
Start monitoring your Apinizer deployment today to gain deeper insights into API gateway and cache performance.