
Monitoring Component Concept

Uptime Monitor

Monitors the uptime status of API Proxy endpoints. Runs regular checks at specified time intervals and triggers actions when expectations are not met.

Anomaly Detector

Detects anomalies in log records. Analyzes time-based data and flags values that exceed threshold values.

System Health

Monitors platform and component health. Tracks the status of Kubernetes, Elasticsearch, and other platform components.

Performance Metrics

Covers CPU, memory, disk, and network metrics. Collects and analyzes system and application performance metrics.

Alarm Management

Handles alarm production and management. Produces alarms and sends notifications for various system components.

Monitoring Component Features

Uptime Monitor

  • Endpoint accessibility check with HTTP requests
  • Regular checks at specified time intervals
  • Method, URL, parameter, and header support
  • Response validation with assertion
  • Determining execution frequency with job scheduler
  • Selecting from test collection
  • Timeout settings
  • Retry on failure
  • Triggering actions when expectations are not met
  • Actions like email, API call, notification
  • Integration with connectors
For detailed information about Uptime Monitor, see the Uptime Monitor page.
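The features above (HTTP request, timeout, retry on failure, response assertion, action on unmet expectation) can be illustrated with a minimal sketch. This is not Apinizer's implementation: the function names `http_fetch` and `run_check`, the parameter names, and the single status-code assertion are all assumptions made for illustration.

```python
import urllib.error
import urllib.request


def http_fetch(method, url, headers, timeout):
    """Send one HTTP request and return (status_code, body_text)."""
    request = urllib.request.Request(url, method=method, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code worth asserting on.
        return err.code, err.read().decode("utf-8", "replace")


def run_check(url, method="GET", headers=None, timeout=5.0, retries=2,
              expected_status=200, fetch=http_fetch, on_failure=None):
    """Return True if any attempt meets the expectation, else run the action."""
    last_error = "no attempt made"
    for attempt in range(1 + retries):
        try:
            status, _body = fetch(method, url, headers or {}, timeout)
        except OSError as err:  # URLError and socket timeouts are OSError subclasses
            last_error = f"attempt {attempt + 1}: {err}"
            continue
        if status == expected_status:
            return True
        last_error = f"attempt {attempt + 1}: status {status}"
    if on_failure is not None:
        # Hypothetical hook standing in for an email, API call, or notification.
        on_failure(url, last_error)
    return False
```

The `fetch` parameter is injectable so the retry and assertion logic can be exercised without a live endpoint. A job scheduler would call `run_check` at the configured interval.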

Performance Metrics

CPU Metrics

  • CPU usage
  • CPU load average
  • CPU core count

Memory Metrics

  • Memory usage
  • Heap memory
  • Garbage collection

Disk Metrics

  • Disk usage
  • Disk I/O
  • Disk space

Network Metrics

  • Network throughput
  • Network latency
  • Connection count

API Metrics

  • Request rate
  • Request latency
  • Request size
  • Response time
  • Response size
  • Status code distribution
  • Error rate
  • Error types
  • Error trends
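As a rough illustration of how several of the API metrics above relate to raw request records, the sketch below derives request rate, mean latency, status code distribution, and error rate from a list of records. The record fields (`status`, `latency_ms`) and the rule that any status of 400 or above counts as an error are assumptions, not Apinizer's log schema.

```python
from collections import Counter


def summarize(records, window_seconds):
    """Compute request rate, mean latency, status distribution, and error rate."""
    total = len(records)
    if total == 0:
        return {"request_rate": 0.0, "mean_latency_ms": 0.0,
                "status_distribution": {}, "error_rate": 0.0}
    statuses = Counter(r["status"] for r in records)
    # Assumption: HTTP status codes >= 400 are treated as errors.
    errors = sum(count for status, count in statuses.items() if status >= 400)
    return {
        "request_rate": total / window_seconds,  # requests per second
        "mean_latency_ms": sum(r["latency_ms"] for r in records) / total,
        "status_distribution": dict(statuses),
        "error_rate": errors / total,
    }
```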

Anomaly Detector

Four different condition types can be used for anomaly detection:
  • Threshold Value Check: Flags an anomaly when a metric value exceeds a defined threshold value
  • EMA with Bollinger Bands: Detects anomalies using an Exponential Moving Average (EMA) and Bollinger Bands
  • Query/Filter Ratio Check: Detects anomalies based on the ratio between query results and filter results
  • Custom Conditions: User-defined complex conditions
Anomaly Detector detects unexpected behavior by analyzing API traffic logs and produces alarms. Configuring a detector involves:
  • Defining queries and filters
  • Determining conditions
  • Time range and triggering frequency
  • Defining actions when anomaly is detected
For detailed information about Anomaly Detector, see the Anomaly Detector page.
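To make the "EMA with Bollinger Bands" condition type more concrete, here is one common way to combine the two ideas: track an exponential moving average as the center line and flag a value when it falls outside the center plus or minus `k` standard deviations of the preceding window. The exact formula Apinizer uses is not described here, so the smoothing factor, window size, and band width below are assumptions.

```python
import statistics


def ema_bollinger_anomalies(values, window=5, k=2.0):
    """Return indices whose value lies outside EMA +/- k * stddev of the prior window."""
    alpha = 2.0 / (window + 1)  # conventional EMA smoothing factor
    anomalies = []
    ema = None
    for i, value in enumerate(values):
        if i >= window:
            recent = values[i - window:i]
            spread = k * statistics.pstdev(recent)
            if abs(value - ema) > spread:
                anomalies.append(i)
        # Update the EMA after the comparison so a spike cannot hide itself.
        ema = value if ema is None else alpha * value + (1 - alpha) * ema
    return anomalies
```

For example, in the series `[10, 11, 10, 9, 10, 11, 10, 50, 10, 9]` only the value 50 at index 7 is flagged. The first `window` values are never flagged because no band exists yet.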

Alarm Management

The following alarm types are available in Apinizer:
  • Kubernetes Pod Health Status: Health status of Kubernetes pods
  • Kubernetes Node Health Status: Health status of Kubernetes nodes
  • Kubernetes Node CPU Percentage: Kubernetes node CPU usage percentage
  • Elasticsearch Health Status: Elasticsearch cluster health status
  • Elasticsearch CPU Percentage: Elasticsearch CPU usage percentage
  • Elasticsearch Disk Percentage: Elasticsearch disk usage percentage
  • API Traffic Logs Exist in Database: Existence of API traffic logs in database
  • Remaining Expiration Days of SSL: Remaining validity days of SSL certificate
  • Remaining Expiration Days of JWK: Remaining validity days of JWK key
  • Application Logs Count: Application log count
Alarm types enable early detection of problems by monitoring different components of the system. Alarms are produced in the following situations:
  • Threshold value exceedance
  • Anomaly detection
  • Health check failure
  • Uptime monitor failure
Alarm notifications can be sent through various channels:
  • Email: Email notifications
  • Webhook: Webhook integration
  • Connectors: Actions with connectors like email, API call, notification, SNMP
Alarm channels are configured using connectors. For detailed information, see the Actions and Connectors page.
Alarm management features include:
  • Alarm grouping
  • Alarm filtering
  • Alarm acknowledgment
  • Alarm escalation
For detailed information about alarm management, see the Alarm (Alert) page.
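Alarm grouping can be pictured as bucketing alarms that share a type and source within a time window, so that one notification covers the whole group. The sketch below is only an illustration: the alarm fields (`type`, `source`, `timestamp`) and the fixed 300-second window are assumptions, not Apinizer's data model.

```python
from collections import defaultdict


def group_alarms(alarms, window_seconds=300):
    """Bucket alarms by (type, source, time window) and count occurrences."""
    groups = defaultdict(list)
    for alarm in alarms:
        bucket = alarm["timestamp"] // window_seconds
        groups[(alarm["type"], alarm["source"], bucket)].append(alarm)
    return [
        {"type": alarm_type, "source": source, "count": len(items),
         "first_seen": min(a["timestamp"] for a in items)}
        for (alarm_type, source, _bucket), items in sorted(groups.items())
    ]
```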

Monitoring Component Subcomponents

Metric Collector

Component that collects metrics
  • System Metrics: CPU, memory, disk, network
  • Application Metrics: API metrics, business metrics
  • Custom Metrics: User-defined metrics

Alarm Manager

Component that manages alarms
  • Rule Engine: Evaluates alarm rules
  • Notification Service: Sends alarm notifications
  • Alarm Aggregation: Combines related alarms

Dashboard

Visualization and monitoring interface
  • Real-Time Dashboards: Dashboards that update in real time
  • Custom Dashboards: User-defined dashboards
  • Widgets: Various widget types

Monitoring Component Integrations

Prometheus

  • Prometheus integration
  • Metric export
  • Prometheus scraping
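Prometheus scrapes metrics over HTTP in a plain-text exposition format. The sketch below renders samples in that format to show what an exported metric looks like. The metric name `api_request_total` and its labels are hypothetical and are not taken from Apinizer's actual export.

```python
def _escape(value):
    """Escape a label value as required by the Prometheus text format."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_prometheus_text(name, help_text, samples, metric_type="gauge"):
    """Render (labels, value) samples for one metric as exposition-format text."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{_escape(val)}"' for key, val in sorted(labels.items())
        )
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_prometheus_text(
        "api_request_total", "Total requests.",
        [({"proxy": "orders", "status": "200"}, 42)], "counter"), end="")
```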

Grafana

  • Grafana integration
  • Dashboard import
  • Visualization

ELK Stack

  • Elasticsearch, Logstash, Kibana
  • Log collection
  • Log analysis

Custom Integrations

  • Webhook integration
  • Custom API integration

Monitoring Usage Scenarios

API Proxy Uptime Monitoring

  1. Monitoring API Proxy endpoints with Uptime Monitor
  2. Sending regular HTTP requests
  3. Response validation and assertion check
  4. Triggering actions on failure

Anomaly Detection

  1. Defining queries and filters in log records
  2. Determining conditions (for example, threshold value or EMA with Bollinger Bands)
  3. Producing alarms when anomaly is detected
  4. Sending notifications with actions
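The steps above can be sketched for a ratio-style condition: a query selects the log records of interest, a filter narrows them further, and the share of filtered records is compared against a limit. How Apinizer defines the query/filter ratio is not specified here, so the direction of the ratio (filtered matches divided by query matches), the record fields, and the function name `ratio_exceeds` are assumptions.

```python
def ratio_exceeds(records, query, filter_, max_ratio):
    """Return (ratio, is_anomaly) for filter matches over query matches."""
    matched = [r for r in records if query(r)]
    if not matched:
        return 0.0, False  # nothing matched the query, so nothing to compare
    filtered = sum(1 for r in matched if filter_(r))
    ratio = filtered / len(matched)
    return ratio, ratio > max_ratio


if __name__ == "__main__":
    logs = [
        {"proxy": "orders", "status": 200},
        {"proxy": "orders", "status": 500},
        {"proxy": "orders", "status": 502},
        {"proxy": "billing", "status": 200},
    ]
    ratio, anomaly = ratio_exceeds(
        logs,
        query=lambda r: r["proxy"] == "orders",
        filter_=lambda r: r["status"] >= 500,
        max_ratio=0.5,
    )
    if anomaly:
        print(f"alarm: server error ratio {ratio:.2f} exceeds 0.50")
```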

System Component Monitoring

  1. Monitoring Kubernetes pod and node statuses
  2. Monitoring Elasticsearch health and resource usage
  3. Monitoring SSL certificate and JWK expiration
  4. Monitoring API traffic logs and application logs
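For the SSL expiration step, the remaining validity days can be derived from the certificate's `notAfter` field using Python's standard `ssl` module. This is a sketch of the general idea, not Apinizer's check; the function names are hypothetical, and `fetch_not_after` requires network access to the target host.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host, port=443, timeout=5.0):
    """Read the notAfter field of the certificate a server presents."""
    # Note: the default context verifies the certificate, so an already
    # expired certificate fails the handshake instead of returning a date.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def remaining_days(not_after, now=None):
    """Return whole days left until a notAfter timestamp such as 'Jan 15 09:34:43 2030 GMT'."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).days
```

An alarm rule could then compare `remaining_days(...)` against a configured minimum, for example 30 days.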

Performance Monitoring

  1. Collecting CPU and memory metrics
  2. Performing trend analysis
  3. Detecting bottlenecks
  4. Producing optimization recommendations
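Trend analysis on collected metrics can be as simple as fitting a straight line to evenly spaced samples and projecting when a limit would be reached. The sketch below uses an ordinary least-squares slope; the function names and the idea of projecting toward a fixed limit are illustrative assumptions, not a description of Apinizer's analysis.

```python
def trend_slope(samples):
    """Least-squares slope of evenly spaced samples (units per sample interval)."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def intervals_until(samples, limit):
    """Estimate sample intervals until the trend reaches limit, or None if not rising."""
    slope = trend_slope(samples)
    if slope <= 0:
        return None
    return max(0.0, (limit - samples[-1]) / slope)
```

With CPU usage samples of 40, 45, 50, 55, and 60 percent, the slope is 5 percentage points per interval, so a 90 percent limit is projected 6 intervals ahead.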

Monitoring Best Practices

Metric Collection

  • Collect important metrics
  • Filter unnecessary metrics
  • Use sampling

Alarm Configuration

  • Select appropriate threshold values
  • Reduce false positives
  • Group alarms
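One widely used way to reduce false positives is to raise an alarm only after the threshold has been exceeded for several consecutive checks, so a single spike does not trigger a notification. The sketch below shows that idea; the function name and the default of three consecutive breaches are assumptions.

```python
def should_alarm(values, threshold, consecutive=3):
    """Return True once `consecutive` successive values exceed the threshold."""
    streak = 0
    for value in values:
        streak = streak + 1 if value > threshold else 0
        if streak >= consecutive:
            return True
    return False
```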

Dashboard Design

  • Design meaningful dashboards
  • Highlight important metrics
  • Provide both real-time and historical views

Retention

  • Define appropriate retention policies
  • Plan for long-term retention
  • Optimize storage costs

Next Steps