Planq Monitoring Guide

This guide covers monitoring setup for your Planq node, including metrics collection, alerting, and dashboard configuration.

Overview

Monitoring your Planq node is crucial for:

Ensuring node health and uptime
Tracking performance metrics
Detecting issues before they become critical
Understanding resource usage patterns

Metrics Endpoints

Planq exposes the following metrics endpoints:

Endpoint	Port / Path	Description
Prometheus Metrics	26660	Node metrics in Prometheus format
Health Check	1317/health	Basic health status
Node Status	26657/status	Detailed node status
EVM RPC	8545	Ethereum-compatible RPC
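As a quick check that the Prometheus endpoint from the table is serving data, the sketch below pulls a single metric value out of the text exposition format. It assumes the metrics are served at the conventional /metrics path on port 26660. `metric_value` is a hypothetical helper name, and the label stripping is deliberately simple (it does not handle a `}` inside a label value).

```shell
# metric_value NAME: read Prometheus text-format metrics on stdin and print
# the value of every series called NAME (hypothetical helper, not part of Planq).
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                          # skip HELP and TYPE comment lines
    {
      line = $0
      sub(/\{[^}]*\}/, "", line)           # drop the {label="value",...} block
      n = split(line, field, " ")
      if (n >= 2 && field[1] == name) print field[2]
    }'
}

# Usage against a local node (assumes the /metrics path):
#   curl -s localhost:26660/metrics | metric_value tendermint_p2p_peers
```

An empty result means the metric name was not found, which usually points to Prometheus metrics being disabled in the node configuration or to a different metric namespace.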

Setting Up Prometheus

1. Install Prometheus

# Download Prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.45.0/prometheus-2.45.0.linux-amd64.tar.gz
tar xvf prometheus-2.45.0.linux-amd64.tar.gz
sudo mv prometheus-2.45.0.linux-amd64 /opt/prometheus

# Create prometheus user
sudo useradd --no-create-home --shell /bin/false prometheus
sudo chown -R prometheus:prometheus /opt/prometheus

2. Configure Prometheus

/opt/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'planq_node'
    static_configs:
      - targets: ['localhost:26660']
        labels:
          instance: 'main'
          node_type: 'planq'

3. Create Prometheus Service

/etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus
After=network.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/opt/prometheus/prometheus \
  --config.file /opt/prometheus/prometheus.yml \
  --storage.tsdb.path /opt/prometheus/data \
  --web.console.templates=/opt/prometheus/consoles \
  --web.console.libraries=/opt/prometheus/console_libraries
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
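The unit file has no effect until systemd has loaded it. A minimal sketch of the usual follow-up is shown here, wrapped in a hypothetical `enable_prometheus` function so that pasting it does not run anything immediately. It assumes a systemd host and that Prometheus listens on its default port 9090.

```shell
# enable_prometheus: load the new unit, start it now and on every boot,
# then print its status. The function name is illustrative.
enable_prometheus() {
  sudo systemctl daemon-reload &&
  sudo systemctl enable --now prometheus &&
  systemctl --no-pager status prometheus
}

# After calling enable_prometheus, the readiness endpoint should answer
# (assuming the default listen address):
#   curl -s http://localhost:9090/-/ready
```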

Key Metrics to Monitor

Node Health Metrics

Basic Metrics

Metric	Description	Alert Threshold
`up`	Node availability	< 1
`tendermint_consensus_height`	Current block height	Stalled > 5 min
`tendermint_p2p_peers`	Connected peers	< 3
`tendermint_consensus_fast_syncing`	Sync status (1 while syncing)	1 for > 30 min

Performance

Metric	Description	Alert Threshold
`process_cpu_seconds_total`	CPU usage (counter; apply rate() to get usage)	> 80%
`process_resident_memory_bytes`	Memory usage (resident bytes)	> 90% of system RAM
`planq_disk_usage`	Disk usage	> 85%
`tendermint_p2p_message_receive_bytes_total`	Network I/O	High rate

EVM Specific

Metric	Description	Alert Threshold
`eth_block_number`	Latest EVM block	Stalled > 2 min
`eth_pending_transactions`	Pending EVM transactions	> 1000
`eth_gas_price`	Current gas price	Abnormally high
`json_rpc_requests_total`	RPC request rate	Monitor trends

Setting Up Alerts

1. Configure Alertmanager

/opt/prometheus/alertmanager.yml
global:
  resolve_timeout: 5m

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'telegram'

receivers:
  - name: 'telegram'
    telegram_configs:
      - bot_token: 'YOUR_BOT_TOKEN'
        chat_id: YOUR_CHAT_ID
        parse_mode: 'HTML'

2. Create Alert Rules

/opt/prometheus/alerts.yml
groups:
  - name: planq_alerts
    interval: 30s
    rules:
      - alert: NodeDown
        expr: up{job="planq_node"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Planq node is down"
          description: "Node {{ $labels.instance }} has been down for more than 2 minutes."

      - alert: LowPeerCount
        expr: tendermint_p2p_peers < 3
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low peer count"
          description: "Node has only {{ $value }} peers connected."

      - alert: NodeNotSyncing
        expr: increase(tendermint_consensus_height[5m]) == 0
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Node stopped syncing"
          description: "Block height has not increased for 10 minutes."

      - alert: EVMBlockStalled
        expr: increase(eth_block_number[2m]) == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "EVM blocks not being produced"
          description: "EVM block number has not increased for 5 minutes."

Prometheus evaluates these rules only once alerts.yml is listed under rule_files in prometheus.yml, and it delivers notifications only once the alerting section of that file points at a running Alertmanager.
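Indentation mistakes in these YAML files are easy to make and only surface when Prometheus refuses to start. A hedged sketch of a pre-flight check using promtool, which ships in the same release archive as the prometheus binary; `validate_prometheus_files` is a hypothetical helper name, and the paths assume the install layout used earlier in this guide.

```shell
# Path to promtool; the install step places it next to the prometheus binary.
PROMTOOL="${PROMTOOL:-/opt/prometheus/promtool}"

# validate_prometheus_files: syntax-check the main config and the alert rules.
# Returns non-zero as soon as one check fails. (Illustrative helper name.)
validate_prometheus_files() {
  "$PROMTOOL" check config /opt/prometheus/prometheus.yml &&
  "$PROMTOOL" check rules /opt/prometheus/alerts.yml
}

# Typical use: only restart the service when both files pass.
#   validate_prometheus_files && sudo systemctl restart prometheus
```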

Monitoring Commands

Check Node Status

# Basic status
curl -s localhost:26657/status | jq .

# Check sync status
curl -s localhost:26657/status | jq .result.sync_info

# Get peer count
curl -s localhost:26657/net_info | jq .result.n_peers

# Check EVM status
curl -X POST -H "Content-Type: application/json" \
--data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
http://localhost:8545
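The `eth_blockNumber` call returns the height as a hex string such as 0x1a2b, which is awkward to compare with the decimal Tendermint height. A small sketch of the conversion follows; `hex_to_dec` is a hypothetical helper, and the commented pipeline assumes jq is installed, as the commands above already do.

```shell
# hex_to_dec HEX: print a 0x-prefixed hex quantity (the JSON-RPC encoding)
# as a decimal number. Illustrative helper name.
hex_to_dec() {
  printf '%d\n' "$1"
}

# hex_to_dec 0x1a2b prints 6699

# Usage against a local node:
#   hex_to_dec "$(curl -s -X POST -H 'Content-Type: application/json' \
#     --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
#     http://localhost:8545 | jq -r .result)"
```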

Log Analysis

# View recent logs
journalctl -u planqd -n 100 --no-pager

# Follow logs in real-time
journalctl -u planqd -f

# Search for errors
journalctl -u planqd | grep -i error | tail -20

# Export logs for analysis
journalctl -u planqd --since "1 hour ago" > node-logs.txt
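To turn the error search into a number that can be tracked or compared between runs, the sketch below counts matching log lines. `count_matches` is a hypothetical helper; the match is a plain case-insensitive substring search, so it will also count lines that merely mention the word.

```shell
# count_matches [PATTERN]: count stdin lines containing PATTERN,
# case-insensitively (default pattern: "error").
count_matches() {
  grep -ci -- "${1:-error}" || true   # grep exits 1 on zero matches but still prints 0
}

# Usage:
#   journalctl -u planqd --since "1 hour ago" --no-pager | count_matches
#   journalctl -u planqd --since "1 hour ago" --no-pager | count_matches panic
```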

Dashboard Examples

Basic Node Dashboard

Key panels to include:

Node Status: Up/Down indicator
Block Height: Current vs network height
EVM Block: Latest EVM block number
Peer Count: Connected peers over time
Resource Usage: CPU, Memory, Disk
RPC Requests: API usage metrics
Gas Usage: EVM transaction costs

Example Query Expressions

# Uptime percentage (last 24h)
avg_over_time(up{job="planq_node"}[24h]) * 100

# Blocks behind the highest height among scraped nodes (meaningful only when more than one node is scraped)
scalar(max(tendermint_consensus_height)) - tendermint_consensus_height

# EVM RPC request rate
rate(json_rpc_requests_total[5m])

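# Memory usage percentage (requires node_exporter on the same single host for node_memory_MemTotal_bytes)
100 * process_resident_memory_bytes{job="planq_node"} / scalar(node_memory_MemTotal_bytes)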

Best Practices

Regular Backups: Back up Prometheus data regularly
Retention Policy: Set appropriate data retention (e.g., 30 days with --storage.tsdb.retention.time=30d)
Alert Fatigue: Tune alerts to reduce false positives
Dashboard Organization: Create separate dashboards for different concerns
Documentation: Document custom metrics and alert thresholds
