Monitoring Component

Monitoring Component Concept

Uptime Monitor

Monitoring uptime status of API Proxy endpoints

Runs checks at specified time intervals and triggers actions when expectations are not met.

Anomaly Detector

Anomaly detection in log records

Analyzes time-based data to detect values that exceed defined thresholds.

System Health

Platform and component health monitoring

Monitoring status of Kubernetes, Elasticsearch, and other platform components.

Performance Metrics

CPU, memory, disk, network metrics

Collecting and analyzing system and application performance metrics.

Alarm Management

Alarm generation and management

Generating alarms and sending notifications for various system components.

Monitoring Component Features

Uptime Monitor

Endpoint Monitoring
  • Endpoint accessibility check with HTTP requests
  • Regular checks at specified time intervals
  • Method, URL, parameter, and header support
  • Response validation with assertion
Configuration
  • Determining execution frequency with job scheduler
  • Selecting from test collection
  • Timeout settings
  • Retry on failure
Actions
  • Triggering actions when expectations are not met
  • Actions like email, API call, notification
  • Integration with connectors
Tip

For detailed information about Uptime Monitor, see the Uptime Monitor page.
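
The endpoint check described above (HTTP request, timeout, retry on failure, assertion on the response) can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not Apinizer's implementation: the names `check_endpoint`, `fetch_status`, and `expected_status` are hypothetical, and the default fetcher relies only on the standard library `urllib`.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float) -> int:
    """Send a GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code worth asserting on.
        return err.code


def check_endpoint(
    url: str,
    expected_status: int = 200,
    timeout: float = 5.0,
    retries: int = 2,
    delay: float = 0.0,
    fetch: Callable[[str, float], int] = fetch_status,
) -> bool:
    """Return True if any attempt satisfies the assertion, else False."""
    for attempt in range(retries + 1):
        try:
            if fetch(url, timeout) == expected_status:
                return True
        except OSError:
            # Connection errors and timeouts count as a failed attempt.
            pass
        if attempt < retries:
            time.sleep(delay)
    return False
```

A scheduler would call `check_endpoint` at the configured interval and trigger the configured action whenever it returns `False`. The `fetch` parameter is injectable so the retry logic can be exercised without a network.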

Performance Metrics

CPU Metrics
  • CPU usage
  • CPU load average
  • CPU core count
Memory Metrics
  • Memory usage
  • Heap memory
  • Garbage collection
Disk Metrics
  • Disk usage
  • Disk I/O
  • Disk space
Network Metrics
  • Network throughput
  • Network latency
  • Connection count
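
As a rough illustration of collecting a few of the system metrics listed above, the sketch below uses only the Python standard library. The function name `collect_system_metrics` and the metric key names are hypothetical; a real collector would likely cover memory and network metrics as well, which the standard library does not expose portably.

```python
import os
import shutil


def collect_system_metrics(path: str = "/") -> dict:
    """Collect a few host-level metrics using only the standard library."""
    usage = shutil.disk_usage(path)
    metrics = {
        "cpu_core_count": os.cpu_count(),
        "disk_total_bytes": usage.total,
        "disk_used_bytes": usage.used,
        "disk_used_percent": (
            round(100 * usage.used / usage.total, 2) if usage.total else 0.0
        ),
    }
    # os.getloadavg() exists only on Unix-like systems and may fail there too.
    if hasattr(os, "getloadavg"):
        try:
            metrics["cpu_load_avg_1m"] = os.getloadavg()[0]
        except OSError:
            pass
    return metrics
```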

API Metrics

Request Metrics
  • Request rate
  • Request latency
  • Request size
Response Metrics
  • Response time
  • Response size
  • Status code distribution
Error Metrics
  • Error rate
  • Error types
  • Error trends
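
The request, response, and error metrics above can be derived from a batch of traffic records. The following sketch assumes a hypothetical record shape (a dict with `status` and `latency_ms` keys) and treats any status code of 400 or above as an error; neither assumption comes from Apinizer itself.

```python
from collections import Counter


def summarize_requests(records: list[dict], window_seconds: float) -> dict:
    """Compute request rate, error rate, and status code distribution.

    Each record is assumed to have "status" (int) and "latency_ms" (float).
    """
    total = len(records)
    statuses = Counter(r["status"] for r in records)
    errors = sum(n for code, n in statuses.items() if code >= 400)
    latency_sum = sum(r["latency_ms"] for r in records)
    return {
        "request_rate_per_s": total / window_seconds,
        "error_rate": errors / total if total else 0.0,
        "avg_latency_ms": latency_sum / total if total else 0.0,
        "status_distribution": dict(statuses),
    }
```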

Anomaly Detector

Condition Types

Four different condition types can be used for anomaly detection:

  • Threshold Value Check: Flags an anomaly when a metric value exceeds a defined threshold
  • EMA with Bollinger Bands: Flags an anomaly using an Exponential Moving Average (EMA) and Bollinger Bands
  • Query/Filter Ratio Check: Flags an anomaly based on the ratio between query results and filter results
  • Custom Conditions: User-defined complex conditions
Tip

The Anomaly Detector analyzes API traffic logs to detect unexpected behavior and generates alarms.
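
To make the EMA with Bollinger Bands condition more concrete, here is one possible sketch. The exact formula Apinizer uses is not given here, so everything below is an assumption: the smoothing factor is `2 / (span + 1)`, the band width is `k` population standard deviations of the preceding `span` values, and the function name `ema_bollinger_anomalies` is hypothetical.

```python
import statistics


def ema_bollinger_anomalies(values, span=5, k=2.0):
    """Return indices whose value falls outside EMA +/- k * stddev.

    Bands for each point are built from the preceding `span` values only,
    so a spike does not widen the band it is tested against.
    """
    alpha = 2.0 / (span + 1)
    anomalies = []
    ema = None
    for i, value in enumerate(values):
        if i >= span:
            window = values[i - span:i]
            spread = k * statistics.pstdev(window)
            if abs(value - ema) > spread:
                anomalies.append(i)
        # Update the EMA only after the current value has been tested.
        ema = value if ema is None else alpha * value + (1 - alpha) * ema
    return anomalies
```

For the series `[10, 11, 10, 9, 10, 11, 10, 50, 10, 10]` with the defaults, only index 7 (the value 50) is flagged. A perfectly flat series yields a zero-width band, so a production version would probably want a minimum band width.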

Configuration
  • Defining queries and filters
  • Determining conditions
  • Time range and triggering frequency
  • Defining actions when anomaly is detected
Tip

For detailed information about Anomaly Detector, see the Anomaly Detector page.
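
The configuration steps above (query, filter, condition) can be illustrated with a Query/Filter Ratio Check. This sketch assumes one plausible reading of that condition: the ratio is the number of query results that also match the filter, divided by the number of query results. The class `RatioCondition` and its fields are hypothetical names, not Apinizer configuration keys.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RatioCondition:
    """Hypothetical config: flag when filtered/queried ratio exceeds max_ratio."""

    query: Callable[[dict], bool]
    log_filter: Callable[[dict], bool]
    max_ratio: float

    def is_anomaly(self, logs: list[dict]) -> bool:
        matched = [log for log in logs if self.query(log)]
        if not matched:
            return False  # nothing to compare against
        filtered = sum(1 for log in matched if self.log_filter(log))
        return filtered / len(matched) > self.max_ratio
```

For example, a query selecting all requests to one path combined with a filter selecting 5xx responses flags an anomaly when the share of server errors on that path exceeds `max_ratio`.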

Alarm Management

Alarm Types

The following alarm types are available in Apinizer:

  • Kubernetes Pod Health Status: Health status of Kubernetes pods
  • Kubernetes Node Health Status: Health status of Kubernetes nodes
  • Kubernetes Node CPU Percentage: Kubernetes node CPU usage percentage
  • Elasticsearch Health Status: Elasticsearch cluster health status
  • Elasticsearch CPU Percentage: Elasticsearch CPU usage percentage
  • Elasticsearch Disk Percentage: Elasticsearch disk usage percentage
  • API Traffic Logs Exist in Database: Whether API traffic logs are present in the database
  • Remaining Expiration Days of SSL: Days remaining until the SSL certificate expires
  • Remaining Expiration Days of JWK: Days remaining until the JWK key expires
  • Application Logs Count: Number of application log records
Tip

Alarm types monitor different components of the system so that problems can be detected early.
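
As an example of how a threshold-style alarm type such as Remaining Expiration Days of SSL might be evaluated, consider the sketch below. The function names and the 30-day default threshold are assumptions chosen for illustration; the certificate's expiry timestamp is passed in rather than read from a live certificate.

```python
from datetime import datetime, timezone


def remaining_days(not_after: datetime, now: datetime) -> int:
    """Whole days left until expiry; negative if already expired."""
    return (not_after - now).days


def ssl_expiry_alarm(
    not_after: datetime, now: datetime, threshold_days: int = 30
) -> bool:
    """Return True when the remaining days are at or below the threshold."""
    return remaining_days(not_after, now) <= threshold_days


if __name__ == "__main__":
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    expiry = datetime(2024, 1, 21, tzinfo=timezone.utc)
    print(remaining_days(expiry, now), ssl_expiry_alarm(expiry, now))
```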

Alarm Generation
  • Threshold value exceeded
  • Anomaly detection
  • Health check failure
  • Uptime monitor failure
Alarm Channels

Alarm notifications can be sent through various channels:

  • Email: Email notifications
  • Webhook: Webhook integration
  • Connectors: Actions through connectors such as email, API call, notification, and SNMP
Tip

Alarm channels are configured using connectors. For detailed information, see the Actions and Connectors page.
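
For the webhook channel, a receiving system typically gets an HTTP POST with a JSON body. The sketch below builds such a request with the standard library without sending it. The payload fields (`type`, `severity`, `message`) and the function name are hypothetical; the actual payload format is defined by the connector configuration.

```python
import json
import urllib.request


def build_webhook_request(url: str, alarm: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request describing an alarm."""
    body = json.dumps(alarm).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload; field names are illustrative only.
example = build_webhook_request(
    "https://example.com/hook",
    {"type": "Elasticsearch Health Status", "severity": "critical",
     "message": "Cluster status is red"},
)
```

Passing the resulting object to `urllib.request.urlopen` would deliver the notification.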

Alarm Management
  • Alarm grouping
  • Alarm filtering
  • Alarm acknowledgment
  • Alarm escalation
Tip

For detailed information about alarm management, see the Alarm (Alert) page.
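
Alarm grouping, filtering, and acknowledgment can be pictured with a small sketch. It assumes alarms are plain dicts with a `type` key and an optional `acknowledged` flag; both the record shape and the function name `group_alarms` are hypothetical.

```python
from collections import defaultdict


def group_alarms(alarms: list[dict], key: str = "type") -> dict:
    """Group unacknowledged alarms by the given key."""
    groups = defaultdict(list)
    for alarm in alarms:
        # Acknowledged alarms are filtered out of the active view.
        if not alarm.get("acknowledged", False):
            groups[alarm[key]].append(alarm)
    return dict(groups)
```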

Monitoring Component Subcomponents

Metric Collector

Component that collects metrics

  • System Metrics: CPU, memory, disk, network
  • Application Metrics: API metrics, business metrics
  • Custom Metrics: User-defined metrics
Alarm Manager

Component that manages alarms

  • Rule Engine: Evaluates alarm rules
  • Notification Service: Sends alarm notifications
  • Alarm Aggregation: Groups related alarms
Dashboard

Visualization and monitoring interface

  • Real-Time Dashboards: Dashboards showing live data
  • Custom Dashboards: User-defined dashboards
  • Widgets: Various widgets

Monitoring Component Integrations

Prometheus
  • Prometheus integration
  • Metric export
  • Prometheus scraping
Grafana
  • Grafana integration
  • Dashboard import
  • Visualization
ELK Stack
  • Elasticsearch, Logstash, Kibana
  • Log collection
  • Log analysis
Custom Integrations
  • Webhook integration
  • Custom API integration
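
For the Prometheus metric export mentioned above, metrics are served in the Prometheus text exposition format, which Prometheus then scrapes. The sketch below renders gauge samples in that format. The metric name and the helper `to_prometheus_text` are hypothetical, and the helper assumes label values contain no quotes, backslashes, or newlines (it performs no escaping).

```python
def to_prometheus_text(name: str, help_text: str, samples: list) -> str:
    """Render gauge samples in the Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        suffix = f"{{{label_str}}}" if label_str else ""
        lines.append(f"{name}{suffix} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_prometheus_text(
        "node_cpu_percent",
        "Node CPU usage percentage",
        [({"node": "worker-1"}, 42.5)],
    ))
```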

Monitoring Usage Scenarios

API Proxy Uptime Monitoring
  1. Monitoring API Proxy endpoints with Uptime Monitor
  2. Sending regular HTTP requests
  3. Response validation and assertion check
  4. Triggering actions on failure
Anomaly Detection
  1. Defining queries and filters in log records
  2. Determining conditions (threshold value, EMA with Bollinger Bands, query/filter ratio)
  3. Producing alarms when anomaly is detected
  4. Sending notifications with actions
System Component Monitoring
  1. Monitoring Kubernetes pod and node statuses
  2. Monitoring Elasticsearch health and resource usage
  3. Monitoring SSL certificate and JWK key expiration
  4. Monitoring API traffic logs and application logs
Performance Monitoring
  1. Collecting CPU, memory metrics
  2. Performing trend analysis
  3. Detecting bottlenecks
  4. Producing optimization recommendations

Monitoring Best Practices

Metric Collection
  • Collect important metrics
  • Filter unnecessary metrics
  • Use sampling
Alarm Configuration
  • Select appropriate threshold values
  • Reduce false positives
  • Group alarms
Dashboard Design
  • Design meaningful dashboards
  • Highlight important metrics
  • Provide both real-time and historical views
Retention
  • Define appropriate retention policies
  • Plan for long-term retention
  • Optimize cost
