What is infrastructure monitoring

Infrastructure monitoring is the continuous collection of metrics from servers, containers, databases and the network, together with storing them, visualizing them and alerting when a value goes out of bounds. The goal is to notice a problem before users do and to find its cause faster.

What exactly gets tracked

Hosts: CPU, memory, disk (space and I/O), network, load average.
Containers and orchestration: pod restarts, CPU/memory limits, OOM kills.
Databases: connections, slow queries, replication lag.
Services and applications: error rate, latency (p50/p95/p99), queues.
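
To make the latency percentiles above concrete, here is a minimal sketch that computes p50/p95/p99 with the nearest-rank method. The sample latencies and the function name are made up for illustration; real monitoring systems usually estimate percentiles from histograms rather than from raw samples.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample that has
    at least p percent of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]

# Hypothetical request latencies in milliseconds (two slow outliers).
latencies_ms = [12, 15, 11, 14, 13, 250, 16, 12, 900, 14]

summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
print(summary)  # {'p50': 14, 'p95': 900, 'p99': 900}
```

The median stays at 14 ms while p95 and p99 jump to 900 ms, which is why tail percentiles are tracked next to the median: they expose slow requests that an average or a median hides.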

What it consists of

Collection: an agent gathers metrics on the host and ships them to storage, or an exporter exposes them for the monitoring server to scrape.
Storage: a time-series database (TSDB) holds each metric as a series of timestamped points.
Visualization: dashboards with charts.
Alerting: rules that send a notification to Slack, Telegram or the on-call engineer when a metric breaches a threshold.
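
The collection, storage and alerting stages can be sketched in a few lines. This is an illustration only: the dictionary stands in for a real TSDB, and the Rule class, the metric name and the "for_points" setting are hypothetical, not the configuration format of any particular product.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    metric: str
    threshold: float
    for_points: int  # consecutive breaching points required (assumed >= 1)

def evaluate(rule, values):
    """Return True when the last `for_points` values all exceed the threshold."""
    recent = values[-rule.for_points:]
    return len(recent) == rule.for_points and all(v > rule.threshold for v in recent)

# Storage: an in-memory stand-in for a TSDB, metric name -> [(timestamp, value)].
storage = {}

def collect(metric, timestamp, value):
    """Collection: append one timestamped point to the metric's series."""
    storage.setdefault(metric, []).append((timestamp, value))

# Hypothetical CPU readings, one per collection interval.
for ts, cpu in enumerate([40, 55, 92, 95, 97]):
    collect("cpu_percent", ts, cpu)

# Alerting: fire only after three consecutive points above 90%.
rule = Rule(metric="cpu_percent", threshold=90, for_points=3)
values = [value for _, value in storage[rule.metric]]
firing = evaluate(rule, values)
print("ALERT" if firing else "ok")  # ALERT
```

Requiring several consecutive breaching points is one common way to keep a single spike from paging anyone.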

Push vs pull

In the pull model the monitoring server connects to each target and fetches the metrics itself, as Prometheus does. In the push model the agent on the host initiates the connection and sends the data. Unimoni uses push over mTLS, so you do not need to open inbound ports on your servers.
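
The difference comes down to which side initiates the exchange. The sketch below models that in-process with hypothetical classes. It does not open network connections, does not implement mTLS and is not the Unimoni agent or the Prometheus protocol.

```python
class Host:
    """A monitored host that can read its own metrics."""
    def __init__(self, name):
        self.name = name

    def read_metrics(self):
        # Hard-coded sample value, for illustration only.
        return {"host": self.name, "cpu_percent": 42.0}

class PullServer:
    """Pull: the server knows its targets and asks each one for metrics."""
    def __init__(self, targets):
        self.targets = targets
        self.received = []

    def scrape_all(self):
        for target in self.targets:
            # Server -> host: the host must accept inbound requests.
            self.received.append(target.read_metrics())

class PushServer:
    """Push: the server only accepts what agents send to it."""
    def __init__(self):
        self.received = []

    def ingest(self, payload):
        self.received.append(payload)

class PushAgent:
    """Runs on the host and initiates every exchange itself."""
    def __init__(self, host, server):
        self.host = host
        self.server = server

    def send(self):
        # Host -> server: only an outbound connection is needed.
        self.server.ingest(self.host.read_metrics())

web1 = Host("web-1")

pull_server = PullServer(targets=[web1])
pull_server.scrape_all()

push_server = PushServer()
PushAgent(web1, push_server).send()

print(pull_server.received == push_server.received)  # True
```

Both servers end up with the same data. What differs is who has to be reachable: in the pull case the host, in the push case only the server.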

Where to start

Start by capturing basic host metrics using the USE method (Utilization, Saturation, Errors), then set up a few actionable alerts: host down, low disk space, rising error rate. Keep the noise down: an alert that calls for no action only dulls attention.
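
As a starting point, two of these signals can be read with the Python standard library alone. This is a minimal sketch: the function names, the 90% limit and the wording of the message are assumptions, and a real agent would also ship the values to storage instead of printing them.

```python
import os
import shutil

def disk_used_percent(path="/"):
    """Utilization: share of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100 * usage.used / usage.total

def load_per_cpu():
    """Saturation: 1-minute load average divided by the CPU count.
    Returns None where the load average is unavailable (e.g. Windows)."""
    if not hasattr(os, "getloadavg"):
        return None
    try:
        load1, _, _ = os.getloadavg()
    except OSError:
        return None
    return load1 / (os.cpu_count() or 1)

def check_disk(path="/", limit_percent=90):
    """Return an actionable message when the disk is too full, else None."""
    used = disk_used_percent(path)
    if used >= limit_percent:
        return f"disk {path} is {used:.0f}% full: free up space or extend the volume"
    return None

print(f"disk used: {disk_used_percent():.1f}%")
print(f"load per CPU: {load_per_cpu()}")
print(check_disk() or "disk ok")
```

The alert message names the next step, which is what makes an alert actionable rather than noise.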
