Update docs, add success metric
- Update README.md. (Fixes #1)
- Update prometheus-client version.
- Add rsync_success metric.
- Update metric names to follow prometheus naming conventions.
- `--pushgw -` will now dump metrics to stdout, for debugging.
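
Queries and alerts written against the old metric names need updating after this commit. The following is a minimal sketch of a migration helper, not part of `rsnap_prom_stats`: the old and new names are taken from the diff, while `RENAMED` and `migrate_expr` are hypothetical names introduced here for illustration. It does plain text substitution, so it would also rewrite an old name that happens to appear inside a label value.

```python
import re

# Old name -> new name, as renamed in this commit.
# rsync_num_files keeps its name and rsync_success is new, so neither is listed.
RENAMED = {
    "rsnapshot_start": "rsnapshot_start_time",
    "rsnapshot_end": "rsnapshot_end_time",
    "rsnapshot_duration": "rsnapshot_duration_seconds",
    "rsync_start": "rsync_start_time",
    "rsync_end": "rsync_end_time",
    "rsync_duration": "rsync_duration_seconds",
    "rsync_num_files_xferred": "rsync_num_xferred_files",
    "rsync_total_file_size": "rsync_total_file_bytes",
    "rsync_total_xferred_file_size": "rsync_total_xferred_file_bytes",
    "rsync_literal_data": "rsync_literal_data_bytes",
    "rsync_matched_data": "rsync_matched_data_bytes",
    "rsync_file_list_size": "rsync_file_list_bytes",
    "rsync_file_list_gen_time": "rsync_file_list_gen_seconds",
    "rsync_file_list_xfer_time": "rsync_file_list_xfer_seconds",
    "rsync_total_sent": "rsync_total_sent_bytes",
    "rsync_total_recv": "rsync_total_recv_bytes",
}

# \b stops already-migrated names (e.g. rsync_start_time) from matching again,
# because "_" counts as a word character.
_OLD_NAME_RX = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in RENAMED) + r")\b")


def migrate_expr(expr):
    """Rewrite old metric names in a PromQL expression (hypothetical helper)."""
    return _OLD_NAME_RX.sub(lambda m: RENAMED[m.group(1)], expr)


print(migrate_expr("rate(rsync_total_sent[1h])"))
# prints: rate(rsync_total_sent_bytes[1h])
```

Running the helper twice over the same expression is harmless, since the new names never match the old-name pattern.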
kormat committed May 31, 2020
1 parent a967bdc commit 8b2d482
Showing 3 changed files with 138 additions and 58 deletions.
76 changes: 66 additions & 10 deletions README.md
@@ -2,24 +2,80 @@
Exports [rsnapshot](http://rsnapshot.org/) stats to [Prometheus](http://prometheus.io/) (via [Pushgateway](https://github.com/prometheus/pushgateway/)).

## Requirements:
- python3
- The Prometheus [python client library](https://github.com/prometheus/client_python)
- A running pushgateway instance.

This assumes that you already have Prometheus set up, and a pushgateway running on the local machine.

A Debian package is available from [deb.ichbinn.net](https://deb.ichbinn.net/).
## Installation:
Download a binary from the [releases](https://github.com/kormat/rsnap_prom_stats/releases) page, install it somewhere in your `$PATH` (e.g. `/usr/local/bin`), and make it executable.

## Setup
rsnap_prom_stats needs 2 settings changed in `rsnapshot.conf`. The first is setting `verbose` to at least 3 (so that rsnapshot prints the rsync commands that it runs), and the second is adding `--stats --no-human-readable` to the `rsync_long_args` setting, so that rsync prints out stats after every run (in a machine-readable format). Example of how the entries will look:
`rsnap_prom_stats` needs two settings changed in `rsnapshot.conf`. First, set `verbose` to at least 4, so that rsnapshot prints the rsync commands that it runs and the resulting stats. Second, add `--stats --no-human-readable` to the `rsync_long_args` setting, so that rsync prints stats after every run in a machine-readable format. Adding `--no-verbose` to `rsync_long_args` is also recommended, to reduce [log spam](https://github.com/rsnapshot/rsnapshot/issues/203#issuecomment-369386151). Example of how the entries will look:
```
verbose 3
rsync_long_args --delete --numeric-ids --relative --delete-excluded --stats --no-human-readable
verbose 4
rsync_long_args --delete --numeric-ids --relative --delete-excluded --stats --no-human-readable --no-verbose
```
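
The two required settings can be checked before wiring the exporter into a backup job. This is a minimal sketch under stated assumptions: `check_rsnapshot_conf` is a hypothetical helper (it is not part of `rsnap_prom_stats`), it only looks at the `verbose` and `rsync_long_args` entries, and it treats a missing `verbose` entry as a problem instead of trying to guess rsnapshot's default.

```python
def check_rsnapshot_conf(text):
    """Return a list of problems; an empty list means both settings look right.

    Hypothetical helper: `text` is the content of rsnapshot.conf.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split the key from the rest of the line on the first run of whitespace.
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1]

    problems = []
    try:
        verbose = int(settings.get("verbose", ""))
    except ValueError:
        verbose = None  # missing or not a number
    if verbose is None or verbose < 4:
        problems.append("verbose must be set to at least 4")

    args = settings.get("rsync_long_args", "").split()
    for flag in ("--stats", "--no-human-readable"):
        if flag not in args:
            problems.append("rsync_long_args is missing " + flag)
    return problems


sample = (
    "verbose\t4\n"
    "rsync_long_args\t--delete --numeric-ids --relative --delete-excluded "
    "--stats --no-human-readable --no-verbose\n"
)
print(check_rsnapshot_conf(sample))
# prints: []
```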

## Running
To use rsnap_prom_stats, simply pipe the output of `rsnapshot sync` into it. E.g. (assuming that rsnap_prom_stats is in your path):
To use `rsnap_prom_stats`, pipe the output of `rsnapshot sync` into it. For example (assuming that `rsnap_prom_stats` is in your `$PATH`):
```
rsnapshot sync | rsnap_prom_stats
```

For now, the location of your pushgateway is assumed to be `localhost:9091`, edit the `PUSH_GATEWAY` constant at the top of `rsnap_prom_stats.py` to change this.
If your pushgateway isn't running at `localhost:9091`, use the `--pushgw` flag to specify its location.

## Example metrics
```
# HELP rsnapshot_start_time Timestamp rsnapshot started at
# TYPE rsnapshot_start_time gauge
rsnapshot_start_time{instance="local.host"} 1.5909347692082808e+09
# HELP rsnapshot_end_time Timestamp rsnapshot finished at
# TYPE rsnapshot_end_time gauge
rsnapshot_end_time{instance="local.host"} 1.5909347717281592e+09
# HELP rsnapshot_duration_seconds How long rsnapshot ran for
# TYPE rsnapshot_duration_seconds gauge
rsnapshot_duration_seconds{instance="local.host"} 2.519878387451172
# HELP rsync_start_time Time rsync started at
# TYPE rsync_start_time gauge
rsync_start_time{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 1.590934769995552e+09
# HELP rsync_end_time Time rsync finished at
# TYPE rsync_end_time gauge
rsync_end_time{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 1.590934769997473e+09
# HELP rsync_duration_seconds How long rsync ran for
# TYPE rsync_duration_seconds gauge
rsync_duration_seconds{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 0.001920938491821289
# HELP rsync_success 0 if rsync encountered no errors, 1 otherwise.
# TYPE rsync_success gauge
rsync_success{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 0.0
# HELP rsync_num_files Number of files
# TYPE rsync_num_files gauge
rsync_num_files{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 511795.0
# HELP rsync_num_xferred_files Number of regular files transferred
# TYPE rsync_num_xferred_files gauge
rsync_num_xferred_files{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 17.0
# HELP rsync_total_file_bytes Total file size
# TYPE rsync_total_file_bytes gauge
rsync_total_file_bytes{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 2.0689480235e+010
# HELP rsync_total_xferred_file_bytes Total transferred file size
# TYPE rsync_total_xferred_file_bytes gauge
rsync_total_xferred_file_bytes{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 1.13147736e+08
# HELP rsync_literal_data_bytes Literal data
# TYPE rsync_literal_data_bytes gauge
rsync_literal_data_bytes{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 4.946028e+06
# HELP rsync_matched_data_bytes Matched data
# TYPE rsync_matched_data_bytes gauge
rsync_matched_data_bytes{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 1.08201708e+08
# HELP rsync_file_list_bytes File list size
# TYPE rsync_file_list_bytes gauge
rsync_file_list_bytes{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 5.248188e+06
# HELP rsync_file_list_gen_seconds File list generation time
# TYPE rsync_file_list_gen_seconds gauge
rsync_file_list_gen_seconds{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 0.001
# HELP rsync_file_list_xfer_seconds File list transfer time
# TYPE rsync_file_list_xfer_seconds gauge
rsync_file_list_xfer_seconds{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 0.0
# HELP rsync_total_sent_bytes Total bytes sent
# TYPE rsync_total_sent_bytes gauge
rsync_total_sent_bytes{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 167812.0
# HELP rsync_total_recv_bytes Total bytes received
# TYPE rsync_total_recv_bytes gauge
rsync_total_recv_bytes{dst_host="local.host",dst_path="/rsnapshot/.sync/remote.host/",instance="local.host",src_host="remote.host",src_path="/"} 2.078377e+07
```
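
Note that, per its `HELP` text, `rsync_success` is 0 when rsync ran without errors and 1 otherwise. A minimal sketch of picking failed runs out of a metrics dump (for example one produced with `--pushgw -`) follows. It uses only the standard library and handles just the simple sample lines shown above (no timestamps); `failed_syncs` and the `other.host` sample are hypothetical and exist only for illustration.

```python
import re

# One sample line: name, optional {labels}, then the value.
SAMPLE_RX = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$')
# One key="value" pair inside the braces.
LABEL_RX = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def failed_syncs(text):
    """Return the label sets of rsync runs whose rsync_success is non-zero."""
    failed = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_RX.match(line)
        if not m or m.group("name") != "rsync_success":
            continue
        if float(m.group("value")) != 0:
            failed.append(dict(LABEL_RX.findall(m.group("labels") or "")))
    return failed


dump = '''# HELP rsync_success 0 if rsync encountered no errors, 1 otherwise.
# TYPE rsync_success gauge
rsync_success{dst_host="local.host",src_host="remote.host",src_path="/"} 0.0
rsync_success{dst_host="local.host",src_host="other.host",src_path="/"} 1.0
'''
for labels in failed_syncs(dump):
    print("rsync failed for", labels["src_host"])
# prints: rsync failed for other.host
```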
2 changes: 1 addition & 1 deletion requirements.txt
@@ -1 +1 @@
prometheus_client==0.5.0 --hash=sha256:e8c11ff5ca53de6c3d91e1510500611cafd1d247a937ec6c588a0a7cc3bef93c
prometheus_client==0.8.0 --hash=sha256:983c7ac4b47478720db338f1491ef67a100b474e3bc7dafcbaefb7d0b8f9b01c
118 changes: 71 additions & 47 deletions rsnap_prom_stats/__init__.py
@@ -19,7 +19,12 @@
import sys
import time

from prometheus_client import CollectorRegistry, Gauge, push_to_gateway
from prometheus_client import (
    CollectorRegistry,
    Gauge,
    generate_latest,
    push_to_gateway,
)

DEFAULT_PUSH_GATEWAY = "localhost:9091"
DEFAULT_JOB_NAME = "rsnapshot"
@@ -28,31 +33,33 @@
gauges = {}
RSYNC_STATS = {
    # Metadata
    "rsync_start": "Time rsync started at",
    "rsync_end": "Time rsync finished at",
    "rsync_duration": "How long rsync ran for",
    "rsync_start_time": "Time rsync started at",
    "rsync_end_time": "Time rsync finished at",
    "rsync_duration_seconds": "How long rsync ran for",
    "rsync_success": "0 if rsync encountered no errors, 1 otherwise.",
    # Stats directly from rsync
    "rsync_num_files": "Number of files",
    "rsync_num_files_xferred": "Number of regular files transferred",
    "rsync_total_file_size": "Total file size",
    "rsync_total_xferred_file_size": "Total transferred file size",
    "rsync_literal_data": "Literal data",
    "rsync_matched_data": "Matched data",
    "rsync_file_list_size": "File list size",
    "rsync_file_list_gen_time": "File list generation time",
    "rsync_file_list_xfer_time": "File list transfer time",
    "rsync_total_sent": "Total bytes sent",
    "rsync_total_recv": "Total bytes received",
    "rsync_num_xferred_files": "Number of regular files transferred",
    "rsync_total_file_bytes": "Total file size",
    "rsync_total_xferred_file_bytes": "Total transferred file size",
    "rsync_literal_data_bytes": "Literal data",
    "rsync_matched_data_bytes": "Matched data",
    "rsync_file_list_bytes": "File list size",
    "rsync_file_list_gen_seconds": "File list generation time",
    "rsync_file_list_xfer_seconds": "File list transfer time",
    "rsync_total_sent_bytes": "Total bytes sent",
    "rsync_total_recv_bytes": "Total bytes received",
}


class Stats:
    START_NAME = {v: k for k, v in RSYNC_STATS.items()}
    STAT_NAME = {v: k for k, v in RSYNC_STATS.items()}

    def __init__(self, line):
        self._metrics = {}
        self._metrics['rsync_start'] = time.time()
        self._metrics['rsync_start_time'] = time.time()
        self._end = 0
        self._success = True
        self.src_host = None
        self.src_path = None
        self.dst_host = None
@@ -67,27 +74,35 @@ def _parse_rsync_line(self, line):
    def _get_host_path(self, s):
        remote_rx = re.compile(r'((.*@)?(?P<host>.+):)?(?P<path>.+)$')
        m = remote_rx.match(s)
        host = m.group('host') or socket.getfqdn()
        host = m.group('host') or localhost
        path = m.group('path')
        return host, path

    def parse(self, line):
        """
        Returns None on success, False on error
        """
        parse_rx = re.compile(r'^(?P<desc>[^:]+): (?P<val>\S+)')
        m = parse_rx.match(line)
        if not m:
            return
        name = self.START_NAME.get(m.group('desc'))
        desc = m.group('desc')
        if desc == "rsync error":
            self._success = False
            return False
        name = self.STAT_NAME.get(m.group('desc'))
        if not name:
            # Skip non-machines lines
            # Skip non-matching lines
            return
        self._metrics[name] = float(m.group('val'))

    def publish(self, def_labels):
        self._metrics['rsync_end'] = time.time()
        self._metrics['rsync_duration'] = (
            self._metrics['rsync_end'] - self._metrics['rsync_start'])
        print("Publishing %s:%s -> %s:%s" % (
            self.src_host, self.src_path, self.dst_host, self.dst_path))
        self._metrics['rsync_end_time'] = time.time()
        self._metrics['rsync_duration_seconds'] = (
            self._metrics['rsync_end_time'] - self._metrics['rsync_start_time'])
        self._metrics['rsync_success'] = 0 if self._success else 1
        logging.info("Publishing %s:%s -> %s:%s",
                     self.src_host, self.src_path, self.dst_host, self.dst_path)
        labels = {
            'src_host': self.src_host,
            'src_path': self.src_path,
@@ -101,44 +116,52 @@ def publish(self, def_labels):


def main():
    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser = argparse.ArgumentParser(
        prog="rsnap_prom_stats",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument("--pushgw", default=DEFAULT_PUSH_GATEWAY,
            help="Address of the pushgateway to publish to.")
                        help="Address of the pushgateway to publish to. If "
                        "set to '-' it will print the metrics to stdout instead.")
    parser.add_argument("--job", default=DEFAULT_JOB_NAME,
            help="Pushgateway job name.")
                        help="Pushgateway job name.")
    parser.add_argument("-v", action="store_true",
            help="Print some information to stdout.")
                        help="Print some information to stderr.")
    args = parser.parse_args()
    logging.basicConfig(level=logging.INFO)
    level = logging.WARNING
    if args.v:
        level = logging.INFO
    logging.basicConfig(
        format='[%(asctime)s] %(message)s',
        level=level)

    registry = setup_metrics()
    start = time.time()
    if args.v:
        logging.info("started")
    def_labels = {'instance': socket.getfqdn()}
    logging.info("Started")
    def_labels = {'instance': localhost}
    process_input(def_labels)
    end = time.time()
    if args.v:
        logging.info("finished reading output.")
    gauges["rsnapshot_start"].labels(def_labels).set(start)
    gauges["rsnapshot_end"].labels(def_labels).set(end)
    gauges["rsnapshot_duration"].labels(def_labels).set(end - start)
    if args.v:
    logging.info("Finished reading output")
    # Unpack the dict so the label value is the hostname, not the dict's repr.
    gauges["rsnapshot_start_time"].labels(**def_labels).set(start)
    gauges["rsnapshot_end_time"].labels(**def_labels).set(end)
    gauges["rsnapshot_duration_seconds"].labels(**def_labels).set(end - start)
    if args.pushgw == "-":
        print(generate_latest(registry).decode("utf-8"))
    else:
        logging.info("publishing to pushgateway @ %s", args.pushgw)
    push_to_gateway(args.pushgw, job=args.job, registry=registry)
        push_to_gateway(args.pushgw, job=args.job, registry=registry)


def setup_metrics():
    registry = CollectorRegistry()
    basic_labels = ['instance']
    gauges["rsnapshot_start"] = Gauge(
        "rsnapshot_start", "Timestamp rsnapshot started at", basic_labels,
    gauges["rsnapshot_start_time"] = Gauge(
        "rsnapshot_start_time", "Timestamp rsnapshot started at", basic_labels,
        registry=registry)
    gauges["rsnapshot_end"] = Gauge(
        "rsnapshot_end", "Timestamp rsnapshot finished at", basic_labels,
    gauges["rsnapshot_end_time"] = Gauge(
        "rsnapshot_end_time", "Timestamp rsnapshot finished at", basic_labels,
        registry=registry)
    gauges["rsnapshot_duration"] = Gauge(
        "rsnapshot_duration", "How long rsnapshot ran for", basic_labels,
    gauges["rsnapshot_duration_seconds"] = Gauge(
        "rsnapshot_duration_seconds", "How long rsnapshot ran for", basic_labels,
        registry=registry)
    rsync_labels = ['src_host', 'src_path', 'dst_host', 'dst_path']
    for name, desc in RSYNC_STATS.items():
@@ -160,11 +183,12 @@ def process_input(def_labels):
            # Don't bother parsing lines until we found the start of a stats
            # block
            continue
        if line.startswith('sent '):
        if line.startswith('sent ') or s.parse(line) is False:
            # We've reached the end of the stats block, or an rsync error
            # was encountered. Either way, publish the stats.
            s.publish(def_labels)
            s = None
            continue
        s.parse(line)


def read_lines():
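
The two regular expressions in the diff above do most of the parsing work, so it can help to try them in isolation. In this sketch the patterns are copied from the diff; the wrapper functions `split_host_path` and `parse_stat`, the `default_host` parameter and the sample inputs are assumptions made for illustration and are not the project's actual API.

```python
import re

# Patterns copied from the diff above.
REMOTE_RX = re.compile(r'((.*@)?(?P<host>.+):)?(?P<path>.+)$')
PARSE_RX = re.compile(r'^(?P<desc>[^:]+): (?P<val>\S+)')


def split_host_path(spec, default_host="local.host"):
    """Split an rsync source or destination into (host, path).

    A spec without a "host:" prefix is treated as local, so the
    (hypothetical) default_host is returned for it.
    """
    m = REMOTE_RX.match(spec)
    return m.group('host') or default_host, m.group('path')


def parse_stat(line):
    """Return (description, first value token) for a "desc: value" line."""
    m = PARSE_RX.match(line)
    if not m:
        return None
    return m.group('desc'), m.group('val')


print(split_host_path("backup@remote.host:/"))
# prints: ('remote.host', '/')
print(split_host_path("/rsnapshot/.sync/remote.host/"))
# prints: ('local.host', '/rsnapshot/.sync/remote.host/')
print(parse_stat("Total file size: 20689480235 bytes"))
# prints: ('Total file size', '20689480235')
print(parse_stat("rsync error: some files could not be transferred"))
# prints: ('rsync error', 'some')
```

The last call shows why `parse` can detect failures by comparing the description with `"rsync error"`: an error line has the same `description: value` shape as a stats line. A line without a colon, such as the closing `sent ...` summary, does not match at all.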