Monitoring Component
Monitoring Component Concept
Monitoring uptime status of API Proxy endpoints
Checks endpoints at specified time intervals and triggers actions when expectations are not met.
Anomaly detection in log records
Analyzes time-based data to detect values that exceed threshold values.
Platform and component health monitoring
Monitors the status of Kubernetes, Elasticsearch, and other platform components.
CPU, memory, disk, network metrics
Collects and analyzes system and application performance metrics.
Alarm production and management
Produces alarms and sends notifications for various system components.
Monitoring Component Features
Uptime Monitor
Endpoint Monitoring
- Endpoint accessibility check with HTTP requests
- Regular checks at specified time intervals
- Method, URL, parameter, and header support
- Response validation with assertions
Configuration
- Setting the execution frequency with the job scheduler
- Selecting from test collection
- Timeout settings
- Retry on failure
Actions
- Triggering actions when expectations are not met
- Actions such as email, API call, and notification
- Integration with connectors
For detailed information about Uptime Monitor, see the Uptime Monitor page.
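The endpoint check, timeout, retry, and assertion settings listed above can be illustrated with a short sketch. It is not Apinizer's implementation: `check_endpoint`, `CheckResult`, and the `send` parameter are hypothetical names, and the only assertion modeled here is an expected HTTP status code.

```python
from dataclasses import dataclass
from typing import Callable, Optional
import urllib.error
import urllib.request


@dataclass
class CheckResult:
    ok: bool                 # True when the assertion held
    attempts: int            # number of requests actually sent
    status: Optional[int]    # last HTTP status seen, if any
    detail: str              # short human-readable explanation


def http_get_status(url: str, timeout: float) -> int:
    """Send a GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code worth asserting on.
        return err.code


def check_endpoint(
    url: str,
    expected_status: int = 200,
    timeout: float = 5.0,
    retries: int = 2,
    send: Callable[[str, float], int] = http_get_status,
) -> CheckResult:
    """Check one endpoint, retrying up to `retries` times on failure."""
    last_status: Optional[int] = None
    detail = ""
    for attempt in range(1, retries + 2):
        try:
            last_status = send(url, timeout)
        except OSError as err:
            # URLError and socket timeouts are OSError subclasses.
            last_status = None
            detail = f"request failed: {err}"
            continue
        if last_status == expected_status:
            return CheckResult(True, attempt, last_status, "ok")
        detail = f"expected {expected_status}, got {last_status}"
    return CheckResult(False, retries + 1, last_status, detail)
```

Passing a custom `send` callable keeps the check testable without network access. A scheduler would call `check_endpoint` at each interval and trigger the configured action whenever `ok` is false.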
Performance Metrics
- CPU usage
- CPU load average
- CPU core count
- Memory usage
- Heap memory
- Garbage collection
- Disk usage
- Disk I/O
- Disk space
- Network throughput
- Network latency
- Connection count
API Metrics
Request Metrics
- Request rate
- Request latency
- Request size
Response Metrics
- Response time
- Response size
- Status code distribution
Error Metrics
- Error rate
- Error types
- Error trends
Anomaly Detector
Condition Types
Four different condition types can be used for anomaly detection:
- Threshold Value Check: Flags an anomaly when a metric value exceeds a defined threshold
- EMA with Bollinger Bands: Flags an anomaly using an Exponential Moving Average (EMA) and Bollinger Bands
- Query/Filter Ratio Check: Flags an anomaly based on the ratio between query and filter results
- Custom Conditions: User-defined complex conditions
The Anomaly Detector analyzes API traffic logs to detect unexpected behavior and produces alarms.
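As an illustration of the EMA with Bollinger Bands condition, the sketch below flags a value when it falls outside a band centered on the exponential moving average of the preceding values. The exact formula Apinizer uses is not stated here, so the details are assumptions: the smoothing factor is `2 / (window + 1)`, the band width is `k` times the population standard deviation of the last `window` values, and the function name `ema_bollinger_anomalies` is hypothetical.

```python
from collections import deque
from statistics import pstdev
from typing import List, Sequence


def ema_bollinger_anomalies(
    values: Sequence[float], window: int = 5, k: float = 2.0
) -> List[int]:
    """Return indices of values outside EMA +/- k * rolling std."""
    alpha = 2.0 / (window + 1)          # assumed smoothing factor
    recent = deque(maxlen=window)       # last `window` observations
    ema = None
    anomalies = []
    for i, x in enumerate(values):
        # Only judge a point once a full window of history exists.
        if len(recent) == window:
            band = k * pstdev(recent)
            if abs(x - ema) > band:
                anomalies.append(i)
        # Update the moving average and the window with the new point.
        ema = x if ema is None else alpha * x + (1 - alpha) * ema
        recent.append(x)
    return anomalies


series = [100, 102, 98, 101, 99, 100, 250, 101]
print(ema_bollinger_anomalies(series))  # prints [6]
```

Each point is compared against bands built only from earlier data, so a spike cannot widen the band that is used to judge it.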
Configuration
- Defining queries and filters
- Determining conditions
- Time range and triggering frequency
- Defining actions when anomaly is detected
For detailed information about Anomaly Detector, see the Anomaly Detector page.
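The Query/Filter Ratio Check can be read as follows: a query selects the log records of interest within the time range, a filter selects a subset of them, and the condition fires when the subset's share exceeds a limit. That reading is an assumption, and the names `ratio_condition`, `query`, `flt`, and `max_ratio` below are hypothetical.

```python
from typing import Callable, Iterable, Mapping

Record = Mapping[str, object]


def ratio_condition(
    records: Iterable[Record],
    query: Callable[[Record], bool],
    flt: Callable[[Record], bool],
    max_ratio: float,
) -> bool:
    """True when filtered records / queried records exceeds max_ratio."""
    matched = [r for r in records if query(r)]
    if not matched:
        # No records in the time range: nothing to compare against.
        return False
    hits = sum(1 for r in matched if flt(r))
    return hits / len(matched) > max_ratio


# Example: alarm when more than 10% of calls to one proxy return 5xx.
logs = [
    {"proxy": "orders", "status": 200},
    {"proxy": "orders", "status": 503},
    {"proxy": "orders", "status": 200},
    {"proxy": "billing", "status": 500},
]
is_anomaly = ratio_condition(
    logs,
    query=lambda r: r["proxy"] == "orders",
    flt=lambda r: int(r["status"]) >= 500,
    max_ratio=0.10,
)
```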
Alarm Management
Alarm Types
The following alarm types are available in Apinizer:
- Kubernetes Pod Health Status: Health status of Kubernetes pods
- Kubernetes Node Health Status: Health status of Kubernetes nodes
- Kubernetes Node CPU Percentage: Kubernetes node CPU usage percentage
- Elasticsearch Health Status: Elasticsearch cluster health status
- Elasticsearch CPU Percentage: Elasticsearch CPU usage percentage
- Elasticsearch Disk Percentage: Elasticsearch disk usage percentage
- API Traffic Logs Exist in Database: Whether API traffic logs are present in the database
- Remaining Expiration Days of SSL: Days remaining until the SSL certificate expires
- Remaining Expiration Days of JWK: Days remaining until the JWK expires
- Application Logs Count: Number of application log entries
These alarm types monitor different components of the system so that problems can be detected early.
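The two "Remaining Expiration Days" alarm types amount to a date difference compared against a limit. The sketch below shows that arithmetic only. The function names and the 30-day limit are assumptions, and how the expiry date is read from a certificate or JWK is outside its scope.

```python
from datetime import datetime, timezone


def remaining_days(not_after: datetime, now: datetime) -> int:
    """Whole days left until expiry (negative once expired)."""
    return (not_after - now).days


def expiry_alarm(not_after: datetime, now: datetime, limit_days: int = 30) -> bool:
    """True when the remaining validity is at or below limit_days."""
    return remaining_days(not_after, now) <= limit_days


expires = datetime(2025, 3, 1, tzinfo=timezone.utc)
today = datetime(2025, 2, 10, tzinfo=timezone.utc)
print(remaining_days(expires, today))  # prints 19
print(expiry_alarm(expires, today))    # prints True
```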
Alarm Production
- Threshold value exceedance
- Anomaly detection
- Health check failure
- Uptime monitor failure
Alarm Channels
Alarm notifications can be sent through various channels:
- Email: Email notifications
- Webhook: Webhook integration
- Connectors: Connector-based actions such as email, API call, notification, and SNMP
Alarm channels are configured using connectors. For detailed information, see the Actions and Connectors page.
Alarm Management
- Alarm grouping
- Alarm filtering
- Alarm acknowledgment
- Alarm escalation
For detailed information about alarm management, see the Alarm (Alert) page.
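Alarm grouping and filtering can be pictured as plain operations over a list of alarm records. The sketch below is illustrative only: the record fields `type`, `source`, and `acknowledged` are hypothetical and do not reflect Apinizer's data model.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Alarm = Dict[str, object]


def group_alarms(alarms: List[Alarm]) -> Dict[Tuple[str, str], List[Alarm]]:
    """Group alarms that share the same type and source."""
    groups: Dict[Tuple[str, str], List[Alarm]] = defaultdict(list)
    for alarm in alarms:
        key = (str(alarm["type"]), str(alarm["source"]))
        groups[key].append(alarm)
    return dict(groups)


def unacknowledged(alarms: List[Alarm]) -> List[Alarm]:
    """Filter out alarms that someone has already acknowledged."""
    return [a for a in alarms if not a.get("acknowledged", False)]


alarms = [
    {"type": "Elasticsearch Disk Percentage", "source": "es-1", "acknowledged": False},
    {"type": "Elasticsearch Disk Percentage", "source": "es-1", "acknowledged": True},
    {"type": "Kubernetes Pod Health Status", "source": "worker-2"},
]
grouped = group_alarms(alarms)
pending = unacknowledged(alarms)
```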
Components of the Monitoring Component
Component that collects metrics
- System Metrics: CPU, memory, disk, network
- Application Metrics: API metrics, business metrics
- Custom Metrics: User-defined metrics
Component that manages alarms
- Rule Engine: Evaluates alarm rules
- Notification Service: Sends alarm notifications
- Alarm Aggregation: Aggregates related alarms
Visualization and monitoring interface
- Real-Time Dashboards: Real-time dashboards
- Custom Dashboards: Custom dashboards
- Widgets: Various widgets
Monitoring Component Integrations
- Prometheus integration
  - Metric export
  - Prometheus scraping
- Grafana integration
  - Dashboard import
  - Visualization
- Elasticsearch, Logstash, Kibana
  - Log collection
  - Log analysis
- Webhook integration
- Custom API integration
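For the Prometheus items above, metric export means serving samples in the Prometheus text exposition format so that a Prometheus server can scrape them. The sketch below renders that format by hand to show what a scrape target returns. The metric name and label are made up for illustration, and a real deployment would more likely rely on an official Prometheus client library.

```python
from typing import Dict, List, Tuple

Sample = Tuple[Dict[str, str], float]


def _escape(value: str) -> str:
    """Escape backslash, double quote, and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_prometheus_text(
    name: str, help_text: str, samples: List[Sample], metric_type: str = "gauge"
) -> str:
    """Render one metric family in the text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            label_str = ",".join(
                f'{key}="{_escape(val)}"' for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric name and label, for illustration only.
body = to_prometheus_text(
    "api_request_latency_seconds",
    "Request latency",
    [({"proxy": "orders"}, 0.25)],
)
```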
Monitoring Usage Scenarios
- Monitoring API Proxy endpoints with Uptime Monitor
- Sending regular HTTP requests
- Response validation and assertion check
- Triggering actions on failure
- Defining queries and filters in log records
- Determining conditions (threshold value, EMA with Bollinger Bands)
- Producing alarms when anomaly is detected
- Sending notifications with actions
- Monitoring Kubernetes pod and node statuses
- Monitoring Elasticsearch health and resource usage
- Monitoring SSL and JWK certificate expiration
- Monitoring API traffic logs and application logs
- Collecting CPU, memory metrics
- Performing trend analysis
- Detecting bottlenecks
- Optimization recommendations
Monitoring Best Practices
- Collect important metrics
- Filter unnecessary metrics
- Use sampling
- Select appropriate threshold values
- Reduce false positives
- Group alarms
- Build meaningful dashboards
- Highlight important metrics
- Provide real-time and historical views
- Define appropriate retention policies
- Plan for long-term retention
- Optimize storage costs