What is VictoriaMetrics(VM)?

  • VictoriaMetrics is a free open-source time series database (TSDB) and monitoring solution, designed to collect, store and process real-time metrics
  • My company used VM to replace Prometheus for high availability and better performance.
  • In this article, i will introduce 2 types of VM:
    • Single node VM: for small businesses that have fewer than 10-20 servers
    • Cluster VM: for larger businesses

Small business with VictoriaMetrics single-mode

Docker compose

  • Yes, using docker compose for fast deployment and easy to control.
  • Docker compose file:
services:
  victoria-metrics:
    image: victoriametrics/victoria-metrics:v1.114.0
    container_name: vm_single_node
    ports:
      - "8428:8428"  # Default port for HTTP API
    volumes:
      - vm_data:/storage  # Save data to a volume
      - ./prometheus.yml:/etc/vm/prometheus.yml  # Mount file config
    command:
      - "-storageDataPath=/storage"  # Specify the path to store data
      - "-retentionPeriod=30d"      # Retain data for 30 days
      - "-promscrape.config=/etc/vm/prometheus.yml"  # Specify the path to the config file
    restart: unless-stopped

  vmalert:
    image: victoriametrics/vmalert:v1.114.0
    container_name: vmalert
    ports:
      - "8880:8880"  # Port for vmalert HTTP API/UI
    volumes:
      - ./alert_rules.yml:/etc/alerts/alert_rules.yml  # Mount alert rules file
    command:
      - "-rule=/etc/alerts/alert_rules.yml"  # Path to alert rules
      - "-datasource.url=http://victoria-metrics:8428"  # Connect to VM
      - "-notifier.url=http://alertmanager:9093"  # Connect to Alertmanager
      - "-httpListenAddr=:8880"  # vmalert listens on 8880
    depends_on:
      - victoria-metrics
      - alertmanager
    restart: unless-stopped

  alertmanager:
    image: prom/alertmanager:v0.27.0
    container_name: alertmanager
    ports:
      - "9093:9093"  # Default Alertmanager port
    volumes:
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml  # Mount Alertmanager config
    command:
      - "--config.file=/etc/alertmanager/alertmanager.yml"  # Use mounted config file
    restart: unless-stopped

volumes:
  vm_data:
  • prometheus.yml file:
global:
  scrape_interval: 60s  # Interval between each scrape
  scrape_timeout: 30s   # Timeout for each scrape

scrape_configs:
  - job_name: 'ES-Dev'
    metrics_path: '/metrics'
    static_configs:
      - targets: ['10.1.1.1:9114']  # target to scrape
  • alert_rules.yml file sample rules to trigger notify only:
groups:
  - name: ExampleAlerts
    rules:
      - alert: HighLatencyImmediate
        expr: http_request_duration_seconds > 0.5
        labels:
          severity: critical
        annotations:
          summary: "Immediate High Latency Detected"
          description: "A request exceeded 0.5 second latency."
  • alertmanager.yml file sample with telegram notifier:
route:
  receiver: 'telegram-notifications'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h

receivers:
- name: 'telegram-notifications'
  telegram_configs:
  - bot_token: '<YOUR_TELEGRAM_BOT_TOKEN>'
    chat_id: <YOUR_TELEGRAM_CHAT_ID>
    send_resolved: true # Optional: Send notifications when alerts are resolved.
    parse_mode: 'HTML' # Optional: Use HTML formatting.
    disable_web_page_preview: true # Optional: Disable web page previews.

templates:
- |-
    {{ define "telegram.message" }}
    {{ range .Alerts }}
    <b>{{ .Annotations.summary | default .Labels.alertname }}</b>

    {{ if .Labels.severity }}Severity: {{ .Labels.severity }}{{ end }}
    {{ if .Labels.namespace }}Namespace: {{ .Labels.namespace }}{{ end }}
    {{ if .Labels.job }}Job: {{ .Labels.job }}{{ end }}
    {{ if .Labels.instance }}Instance: {{ .Labels.instance }}{{ end }}

    {{ if .Annotations.description }}Description: {{ .Annotations.description }}{{ end }}

    {{ if gt (len .GeneratorURL) 0 }}<a href="{{ .GeneratorURL }}">View in Prometheus</a>{{ end }}

    {{ if gt (len .DashboardURL) 0 }}<a href="{{ .DashboardURL }}">View Dashboard</a>{{ end }}

    {{ if eq .Status "firing" }}🔥 Firing{{ else }} Resolved{{ end }}
    {{ end }}
    {{ end }}

    {{ define "telegram.title" }}
    {{ .CommonLabels.alertname }}: {{ .GroupLabels.alertname }}
    {{ end }}
  • I guess i don't need to write how to start this container!

Insert and test query

  • Insert script, run this, and wait like 2-3 minutes to fill up some metrics
#!/bin/bash

for i in $(seq 1 100); do
  timestamp=$(date +"%Y-%m-%d %H:%M:%S")
  echo "[$timestamp] Inserting metric: kienlt{kien=\"depzai\"} $i into http://localhost:8428/api/v1/import/prometheus"
  curl -d "kienlt{kien=\"depzai\"} $i" http://localhost:8428/api/v1/import/prometheus
  echo "[$timestamp] Sleep for 10 seconds. Insert count: $i"
  sleep 10
done

echo "Completed!"
  • Query with VM ui at http://localhost:8428/vmui

alt text

Scrape Target

  • We have defined our target need to scrape metrics in the file prometheus.yml
  • Let's check the target at the endpoint: http://localhost:8428/api/v1/targets

alt text

Rule and Notify

  • Let's make a sample test to trigger our rules:
for i in $(seq 0 100); do curl -d 'http_request_duration_seconds 1' http://localhost:8428/api/v1/import/prometheus; sleep 1; echo "sleep 1s"; done
  • After a little time, it will appear in vmalert UI and will be forwarded to Alertmanager

alt text

alt text

  • How is it displayed in Telegram? (I know the format is ugly, but it is a demo xD)

alt text

Conclusion single mode

  • There are many features of VM but i want to keep this simple and short for single-node since we still have Cluster VM to go.
  • I included vmalert and alertmanager for the complete stack!

Large business with VictoriaMetrics cluster mode

Introduction

  • vmstorage: Save time-series data
  • vminsert: receive metrics from client and write to vmstorage
  • vmselect: handle query, select data from vmstorage
  • vmagent: scrape metrics from targets and send them to vminsert
  • vmalert: evaluate alerting rules and send notifications to alertmanager
  • alertmanager: handle alerts sent by vmalert and manage notifications

Installation

For this mode, i prefer to use a dedicated host/ virtual machine to install VictoriaMetrics. I will use 3 nodes to setup this cluster.

  • Here is the installation script that will install: setup-vm-cluster.sh

  • Things need to change in this script in case you want to modify the version and hosts

    • hosts: update this Step 3: Update /etc/hosts to your host or don't need if you have internal DNS system
    • variables VM_VERSION, AM_VERSION
  • Check logs and port status after inserting if you have any problems. Below are examples of output from mine:

netstat -tulpn|grep -E "vmstorage|vmagent|vmselect|vminsert|vmalert|alertmanager"
tcp        0      0 0.0.0.0:8401            0.0.0.0:*               LISTEN      491202/vmstorage-pr 
tcp        0      0 0.0.0.0:8400            0.0.0.0:*               LISTEN      491202/vmstorage-pr 
tcp        0      0 0.0.0.0:8429            0.0.0.0:*               LISTEN      491225/vmagent-prod 
tcp        0      0 0.0.0.0:8482            0.0.0.0:*               LISTEN      491202/vmstorage-pr 
tcp        0      0 0.0.0.0:8481            0.0.0.0:*               LISTEN      491216/vmselect-pro 
tcp        0      0 0.0.0.0:8480            0.0.0.0:*               LISTEN      491207/vminsert-pro 
tcp        0      0 0.0.0.0:8880            0.0.0.0:*               LISTEN      491226/vmalert-prod 
tcp6       0      0 :::9094                 :::*                    LISTEN      491238/alertmanager 
tcp6       0      0 :::9093                 :::*                    LISTEN      491238/alertmanager 
udp6       0      0 :::9094                 :::*                                491238/alertmanager 

Port access and information

  • vmstorage: This has 3 ports

    • 8400: Receive metrics from vminsert
    • 8401: internal communication between vmstorage in the cluster
    • 8482: No idea about this much but this is where exporter exposes metrics of vmstorage with path: vm-node1:8482/metrics
  • vminsert: receive metrics from other sources and optimize process writes data into vmstorage in a distributed way! Here is an example write into vmstorage via vminsert. It will use port 8480

curl -X POST --data-binary 'kienlt_victoria_metrics_cluster,kiendepzai=ahihi value=42' "http://vm-node1:8480/insert/0/influx/write"

remoteWrite setting from Prometheus to VictoriaMetrics for vminsert:

    remoteWrite:
      - url: http://victoria-metrics:8480/insert/0/prometheus/api/v1/write
  • vmselect: We have inserted data, now let's check. The endpoint will be:
http://vm-node1:8481/select/0/prometheus/vmui

alt text

  • vmagent: This will show all active targets and another useful api like showing all targets in json format. The endpoint will be: http://vm-node1:8429/targets

alt text

  • vmalert: Endpoint will be: http://vm-node1:8880/vmalert/alerts . Same as vmalert in single mode

  • alertmanager: http://vm-node1:9093. Same as single-node but in cluster mode xD

Conclusion cluster mode

  • This is a simple article for VictoriaMetrics like how it works in single mode and cluster mode
  • I'm so lazy at this time and so this article still lacks of things below:
    • Real notification setup for alert manager
    • Backup and Restore with vmutils
    • Authentication: no auth for endpoints to access vminsert / vmselect
    • No HAProxy for failover and load balancing
  • There are many features i still don't know about VictoriaMetrics, i will write more articles in case i need to learn to use xD

Ref


Published

Category

Knowledge Base

Tags

Contact