# Writing Munin Plugins

## Protocol

A plugin is called with no arguments to fetch values, and with `config` to describe the graph. Two further arguments are optional: `autoconf` (print `yes` or `no`) and `suggest` (print the valid instance names for a wildcard plugin).
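
The dispatch described above can be sketched as a minimal skeleton. The load-average metric, the `plugin_main` function name, and the field name `load` are illustrative choices, not part of the protocol.

```sh
#!/bin/sh
# Minimal plugin skeleton (illustrative): graphs the 1-minute load average.

plugin_main() {
    case "${1:-}" in
        autoconf)
            # Report whether the plugin can run on this host.
            if [ -r /proc/loadavg ]; then
                echo yes
            else
                echo "no (/proc/loadavg not readable)"
            fi
            ;;
        config)
            echo "graph_title Load average"
            echo "graph_vlabel load"
            echo "graph_category system"
            echo "load.label load"
            ;;
        *)
            # No argument: fetch. Fall back to U if the file cannot be read.
            value=$(cut -d' ' -f1 /proc/loadavg 2>/dev/null)
            echo "load.value ${value:-U}"
            ;;
    esac
}

plugin_main "$@"
```

Keeping the logic in a function makes each mode easy to call on its own while testing.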

### Config output

Global attributes describe the graph:

| Attribute | Purpose | Example |
|---|---|---|
| `graph_title` | Title above graph | `graph_title CPU usage` |
| `graph_vlabel` | Y-axis label | `graph_vlabel percent` |
| `graph_category` | Grouping in web UI | `graph_category system` |
| `graph_args` | Passed to rrdgraph | `graph_args --base 1000 -l 0` |
| `graph_scale` | Toggle SI-prefix scaling of values (default `yes`) | `graph_scale no` |
| `graph_info` | Description below graph | `graph_info CPU usage by type` |

Per-field attributes:

| Attribute | Purpose | Example |
|---|---|---|
| `field.label` | Legend label (required) | `cpu.label CPU` |
| `field.type` | Data source type: GAUGE (default), COUNTER, DERIVE, ABSOLUTE | `cpu.type DERIVE` |
| `field.draw` | Drawing style, e.g. LINE1, LINE2, AREA, AREASTACK | `mem.draw AREASTACK` |
| `field.min` | Minimum valid value | `cpu.min 0` |
| `field.max` | Maximum valid value | `cpu.max 100` |
| `field.warning` | Warning threshold | `cpu.warning 80` |
| `field.critical` | Critical threshold | `cpu.critical 95` |
| `field.info` | Field description | `cpu.info Percent CPU used` |
| `field.cdef` | RPN expression applied to the value when graphing | `bytes.cdef bytes,8,*` |
| `field.negative` | Draw the named field mirrored below the x-axis | `up.negative down` |
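
As a sketch of how these attributes combine, the function below prints a `config` response for a two-field traffic graph, with `down` mirrored below the axis by `up.negative`. The field names `up` and `down` and the function name are illustrative. It also uses `field.graph no`, an attribute not listed in the table above, to stop `down` from being drawn a second time as its own line.

```sh
# Illustrative config output for an in/out traffic graph.
traffic_config() {
    echo "graph_title Traffic"
    echo "graph_vlabel bits per second in (-) / out (+)"
    echo "graph_category network"
    echo "graph_args --base 1000"
    # Received bytes: a cumulative counter, converted to bits when graphing.
    echo "down.label received"
    echo "down.type DERIVE"
    echo "down.min 0"
    echo "down.graph no"
    echo "down.cdef down,8,*"
    # Sent bytes: drawn above the axis, with "down" mirrored below it.
    echo "up.label bps"
    echo "up.type DERIVE"
    echo "up.min 0"
    echo "up.negative down"
    echo "up.cdef up,8,*"
}
```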

### Value output

```
fieldname.value 42
otherfield.value 3.14
```

Print `U` when a value is unknown or could not be read: `fieldname.value U`
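
One way to apply this consistently is a small helper that falls back to `U` whenever a reading is empty or not numeric. The helper name `emit_value` is hypothetical, and the check is deliberately coarse: it only rejects characters that cannot appear in a plain decimal number.

```sh
# Print "<field>.value <reading>", or U if the reading is unusable.
emit_value() {
    # $1 = field name, $2 = raw reading (may be empty or garbage)
    case "${2:-}" in
        ''|*[!0-9.-]*) echo "$1.value U" ;;   # empty, or contains a non-numeric character
        *)             echo "$1.value $2" ;;
    esac
}
```

A fetch section can then call `emit_value cpu "$reading"` without checking the reading first.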

## Wildcard plugins

A single plugin script handles multiple instances via its symlink name. The plugin parses `basename $0` to determine what to monitor.

Example: the `if_` plugin is symlinked as `if_eth0` and `if_wlan0`. The script strips the `if_` prefix from its own name to get the interface name.
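
The prefix stripping can be done with POSIX parameter expansion alone. This is a minimal sketch; the function name `instance_from_name` and the variable `INTERFACE` are illustrative.

```sh
# Derive the monitored instance from the name the plugin was invoked as.
instance_from_name() {
    base=${1##*/}                 # drop the directory part, like basename
    printf '%s\n' "${base#if_}"   # drop the plugin's own prefix
}

# In a real plugin the argument is "$0".
INTERFACE=$(instance_from_name "$0")
```

Called with `/etc/munin/plugins/if_eth0`, the function prints `eth0`.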

Wildcard plugins should implement `suggest` to list valid instances:

```sh
if [ "${1:-}" = "suggest" ]; then
    ls /sys/class/net/ | grep -v '^lo$'
    exit 0
fi
```

## Multigraph plugins

Emit `multigraph <graph_id>` before each graph's output:

```sh
#!/bin/sh
# Placeholders: replace these with commands that query your service.
get_user_count() { echo U; }
get_uptime_days() { echo U; }

if [ "${1:-}" = "config" ]; then
    echo "multigraph service_users"
    echo "graph_title Users"
    echo "graph_category myservice"
    echo "graph_vlabel count"
    echo "total.label Total users"

    echo "multigraph service_uptime"
    echo "graph_title Uptime"
    echo "graph_category myservice"
    echo "graph_vlabel days"
    echo "uptime.label Uptime"
    exit 0
fi

echo "multigraph service_users"
echo "total.value $(get_user_count)"

echo "multigraph service_uptime"
echo "uptime.value $(get_uptime_days)"
```

These plugins appear in the node's `list` output only after the master has sent `cap multigraph`. The master does this automatically during normal polling; when talking to the node by hand, send the command yourself.
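
For a manual check, the command sequence can be sketched as below. The function name `node_session` is illustrative, and the usage line assumes `nc` is installed and munin-node is listening on its default port 4949 on localhost.

```sh
# Commands for a manual session with munin-node: request the multigraph
# capability first, then ask for the plugin list.
node_session() {
    printf 'cap multigraph\nlist\nquit\n'
}

# Usage (assumes nc is available and the node listens on localhost:4949):
#   node_session | nc localhost 4949
```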

## Graph args reference

Common `graph_args` values:

- `--base 1000`: decimal SI units (1k = 1000)
- `--base 1024`: binary units (1Ki = 1024); use for bytes
- `-l 0`: lower limit of 0; the axis still stretches below 0 if the data does, unless `--rigid` is also given
- `--upper-limit 100`: upper limit of 100 (for percentages)
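
As an illustration of these flags in context, the sketch below prints a `config` response for a percentage graph with a 0 to 100 axis. The graph title, category, and field name `used` are made up for the example.

```sh
# Illustrative config output for a percentage graph.
percent_graph_config() {
    echo "graph_title Disk usage"
    echo "graph_vlabel percent"
    echo "graph_category disk"
    # Axis from 0 to 100; SI scaling is off so 0.5 is not shown as "500 m".
    echo "graph_args --base 1000 -l 0 --upper-limit 100"
    echo "graph_scale no"
    echo "used.label used"
    echo "used.min 0"
    echo "used.max 100"
}
```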

## Magic markers

Add these comments for `munin-node-configure` auto-detection:

```sh
#%# family=auto          # or contrib, manual
#%# capabilities=autoconf suggest
```
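
A plugin that declares the `autoconf` capability has to answer that argument. A minimal sketch follows, assuming the plugin depends on a single directory; the function name `autoconf_check` and the default path are illustrative. By convention the answer is `yes`, or `no` followed by a reason in parentheses, and the plugin exits 0 either way.

```sh
# Answer "autoconf": can this plugin work on this host?
autoconf_check() {
    # $1 = path the plugin depends on (the default is an assumption for this sketch)
    path=${1:-/sys/class/net}
    if [ -d "$path" ]; then
        echo "yes"
    else
        echo "no ($path not found)"
    fi
}

if [ "${1:-}" = "autoconf" ]; then
    autoconf_check
    exit 0
fi
```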

## Testing during development

```bash
# Set MUNIN_LIBDIR if plugin uses plugin.sh helpers
export MUNIN_LIBDIR=/usr/share/munin   # or /usr/lib/munin on Arch

# Test directly
chmod +x ./myplugin
./myplugin config
./myplugin

# Test through munin-run (sets up full environment)
cp myplugin /etc/munin/plugins/
munin-run myplugin config
munin-run myplugin
```

## Common patterns

### Monitoring a CLI tool's output

```sh
#!/bin/sh
case ${1:-} in
    config)
        echo "graph_title My Service Stats"
        echo "graph_category myservice"
        echo "graph_vlabel count"
        echo "connections.label Active connections"
        exit 0 ;;
esac
# Report U if the tool fails or prints no matching line.
value=$(myservice-ctl status | awk '/connections:/ {print $2}')
echo "connections.value ${value:-U}"
```

### Monitoring an API endpoint

```sh
#!/bin/sh
case ${1:-} in
    config)
        echo "graph_title API Response Time"
        echo "graph_category network"
        echo "graph_vlabel ms"
        echo "response.label Response time"
        exit 0 ;;
esac
# curl reports time_total in seconds; convert it to milliseconds.
# If the request fails, report U instead of a misleading timing.
if secs=$(curl -s -o /dev/null --max-time 5 -w '%{time_total}' http://localhost:8080/health); then
    ms=$(echo "$secs" | awk '{printf "%.0f", $1 * 1000}')
else
    ms=U
fi
echo "response.value $ms"
```

### Monitoring container resource usage (Incus/LXD)

Query the Incus API with `incus query` for per-container stats. Run the plugin as root by setting `user root` in its `plugin-conf.d` section. Use the DERIVE type for CPU, because the API reports cumulative nanoseconds that Munin has to turn into a rate, and draw memory with AREASTACK.
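
The field configuration for such a plugin might look like the sketch below. It only covers the `config` side and takes instance names as arguments; how the names and readings are obtained from `incus query` is left out. All function names are hypothetical. The `cdef` is an optional addition that converts nanoseconds of CPU per second into percent of one core (divide by 10^9, multiply by 100).

```sh
# Turn an instance name into a valid Munin field name.
field_name() {
    printf '%s' "$1" | tr -c 'A-Za-z0-9_' '_'
}

# Emit CPU field config for each instance name given as an argument.
cpu_fields_config() {
    for name in "$@"; do
        f=$(field_name "$name")
        echo "$f.label $name"
        echo "$f.type DERIVE"        # cumulative counter -> per-second rate
        echo "$f.min 0"              # discard negative rates after a counter reset
        echo "$f.cdef $f,10000000,/" # ns of CPU per second -> percent of one core
    done
}

# Emit memory field config, stacked so the total is visible.
mem_fields_config() {
    for name in "$@"; do
        f=$(field_name "$name")
        echo "$f.label $name"
        echo "$f.draw AREASTACK"
    done
}
```

Sanitizing the name matters because instance names may contain hyphens, which are not valid in Munin field names.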