Secrets
Non-secret information or fields may be available via the HTTP API and/or logs.
In Prometheus, metadata retrieved from service discovery is not considered
secret. Throughout the Prometheus system, metrics are not considered secret.
Fields containing secrets in configuration files (marked explicitly as such in
the documentation) will not be exposed in logs or via the HTTP API. Secrets
should not be placed in other configuration fields, as it is common for
components to expose their configuration over their HTTP endpoint. It is the
responsibility of the user to protect files on disk from unwanted reads and
writes.
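As an illustration, here is a minimal scrape configuration sketch (the job name and credentials are made up) where the basic_auth password is such a marked secret field; Prometheus masks it, for example showing <secret> when the configuration is served over the HTTP API:
scrape_configs:
  - job_name: 'app'            # made-up job name
    basic_auth:
      username: monitor        # not a secret field
      password: s3cr3t         # marked as a secret; rendered as <secret> via the HTTP API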
Secrets from other sources used by dependencies (e.g. the AWS_SECRET_KEY
environment variable as used by EC2 service discovery) may end up exposed due to
code outside of our control or due to functionality that happens to expose
wherever it is stored.
Prometheus
It is presumed that untrusted users have access to the Prometheus HTTP endpoint
and logs. They have access to all time series information contained in the
database, plus a variety of operational/debugging information.
It is also presumed that only trusted users have the ability to change the
command line, configuration file, rule files and other aspects of the runtime
environment of Prometheus and other components.
Which targets Prometheus scrapes, how often and with what other settings is
determined entirely via the configuration file. The administrator may
decide to use information from service discovery systems, which combined with
relabelling may grant some of this control to anyone who can modify data in
that service discovery system.
Scraped targets may be run by untrusted users. It should not by default be
possible for a target to expose data that impersonates a different target. The
honor_labels option removes this protection, as can certain relabelling
setups.
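As an illustration, a scrape job sketch that keeps the default honor_labels: false, so labels exposed by a target cannot override the job and instance labels that Prometheus attaches (job name and target are made up):
scrape_configs:
  - job_name: 'untrusted-targets'        # made-up job name
    honor_labels: false                  # the default; prevents label impersonation
    static_configs:
      - targets: ['203.0.113.10:8080']   # example target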
As of Prometheus 2.0, the --web.enable-admin-api flag controls access to the
administrative HTTP API which includes functionality such as deleting time
series. This is disabled by default. If enabled, administrative and mutating
functionality will be accessible under the /api/*/admin/ paths. The
--web.enable-lifecycle flag controls HTTP reloads and shutdowns of
Prometheus. This is also disabled by default. If enabled they will be
accessible under the /-/reload and /-/quit paths.
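For example, both features have to be enabled explicitly on the command line; with the lifecycle endpoint enabled, a reload can then be requested over HTTP:
prometheus --config.file=/etc/prometheus/prometheus.yml \
  --web.enable-admin-api \
  --web.enable-lifecycle

curl -X POST http://localhost:9090/-/reload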
In Prometheus 1.x, /-/reload and using DELETE on /api/v1/series are
accessible to anyone with access to the HTTP API. The /-/quit endpoint is
disabled by default, but can be enabled with the -web.enable-remote-shutdown
flag.
The remote read feature allows anyone with HTTP access to send queries to the
remote read endpoint. If for example the PromQL queries were ending up directly
run against a relational database, then anyone with the ability to send queries
to Prometheus (such as via Grafana) can run arbitrary SQL against that
database.
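For illustration, a remote_read client configuration sketch (the adapter URL is made up); any PromQL query sent to this Prometheus may be forwarded to that endpoint:
remote_read:
  - url: http://sql-adapter.example.internal:9201/read   # made-up backend adapter URL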
Preparing the server
Let's configure a few server settings required for the system to work correctly.
Time
For events to be displayed with the correct time, time synchronization must be configured. To do this, install chrony:
a) on CentOS / Red Hat systems:
yum install chrony
systemctl enable chronyd
systemctl start chronyd
b) on Ubuntu / Debian systems:
apt-get install chrony
systemctl enable chrony
systemctl start chrony
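To check that synchronization is working, you can, for example, run:
chronyc tracking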
Firewall
If a firewall is used, the following ports must be opened:
- TCP 9090: HTTP for the Prometheus server.
- TCP 9093: HTTP for Alertmanager.
- TCP and UDP 9094: for Alertmanager.
- TCP 9100: for node_exporter.
a) with firewalld:
firewall-cmd --permanent --add-port=9090/tcp --add-port=9093/tcp --add-port=9094/{tcp,udp} --add-port=9100/tcp
firewall-cmd --reload
b) with iptables:
iptables -I INPUT 1 -p tcp --match multiport --dports 9090,9093,9094,9100 -j ACCEPT
iptables -A INPUT -p udp --dport 9094 -j ACCEPT
c) with ufw:
ufw allow 9090,9093,9094,9100/tcp
ufw allow 9094/udp
ufw reload
SELinux
By default, SELinux is active on Red Hat based operating systems. Check whether it is enabled on your system:
getenforce
If the response is:
Enforcing
… it must be disabled with the following commands:
setenforce 0
sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
* if instead we get the response "The program 'getenforce' is currently not installed", then SELinux is not installed on the system.
prometheus-pushgateway
Installation and configuration
# apt install prometheus-pushgateway
# cat /etc/prometheus/prometheus.yml
...
  - job_name: 'pushgateway'
    honor_labels: true
    static_configs:
      - targets:
Checking the configuration and restarting
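For example (a typical sequence; adjust the service name to your setup):
promtool check config /etc/prometheus/prometheus.yml
service prometheus restart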
Prometheus pushgateway example in bash
# cat ip_dhcp_binding.sh
#!/bin/sh
unset http_proxy
DHCP_SERVER=router
NET=192.168
COUNT=`rsh ${DHCP_SERVER} show ip dhcp binding | grep ${NET} | wc -l`
cat << EOF | curl --data-binary @- http://127.0.0.1:9091/metrics/job/cisco_dhcp/dhcp_server/${DHCP_SERVER}/net/${NET}
ip_dhcp_binding ${COUNT}
EOF
ip_dhcp_binding{dhcp_server="router",job="cisco_dhcp",net="192.168"}
# crontab -l
* * * * * /root/ip_dhcp_binding.sh
Authentication, Authorization, and Encryption
In the future, server-side TLS support will be rolled out to the different
Prometheus projects. Those projects include Prometheus, Alertmanager,
Pushgateway and the official exporters.
Authentication of clients by TLS client certs will also be supported.
The Go projects will share the same TLS library, which will be based on the
Go vanilla crypto/tls library.
We default to TLS 1.2 as minimum version. Our policy regarding this is based on
Qualys SSL Labs recommendations, where we strive to
achieve a grade ‘A’ with a default configuration and correctly provided
certificates, while sticking as closely as possible to the upstream Go defaults.
Achieving that grade provides a balance between perfect security and usability.
TLS will be added to Java exporters in the future.
If you have special TLS needs, like a different cipher suite or older TLS
version, you can tune the minimum TLS version and the ciphers, as long as the
cipher is not marked as insecure in the crypto/tls library. If that still
does not suit you, the current TLS settings enable you to build a secure tunnel
between the servers and reverse proxies with more special requirements.
HTTP Basic Authentication will also be supported. Basic Authentication can be
used without TLS, but it will then expose usernames and passwords in cleartext
over the network.
On the server side, basic authentication passwords are stored as hashes with the
bcrypt algorithm. It is your
responsibility to pick the number of rounds that matches your security
standards. More rounds make brute-force more complicated at the cost of more CPU
power and more time to authenticate the requests.
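For illustration, such a hash can be generated with the htpasswd tool from apache2-utils; the cost of 12 rounds here is only an example:
htpasswd -nbBC 12 admin 'mypassword'   # -B selects bcrypt, -C sets the number of rounds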
Various Prometheus components support client-side authentication and
encryption. If TLS client support is offered, there is often also an option
called insecure_skip_verify which skips SSL verification.
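For example, a client tls_config sketch as used in a scrape job (the paths are illustrative); leaving insecure_skip_verify at false keeps certificate verification on:
tls_config:
  ca_file: /etc/prometheus/ca.crt      # illustrative path to the CA certificate
  insecure_skip_verify: false          # setting this to true skips SSL verification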
Alertmanager
Any user with access to the Alertmanager HTTP endpoint has access to its data.
They can create and resolve alerts. They can create, modify and delete
silences.
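For example, anyone who can reach the HTTP endpoint can create an alert with a request like this sketch (host and labels are made up):
curl -X POST http://alertmanager.example.internal:9093/api/v2/alerts \
  -H 'Content-Type: application/json' \
  -d '[{"labels":{"alertname":"TestAlert","severity":"warning"}}]'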
Any secret fields which are templatable are intended for routing notifications
in the above use case. They are not intended as a way for secrets to be
separated out from the configuration files using the template file feature. Any
secrets stored in template files could be exfiltrated by anyone able to
configure receivers in the Alertmanager configuration file. For example, in
large setups each team might have an Alertmanager configuration file fragment
which they fully control and which is then combined into the full final
configuration file.
Prometheus
Prometheus is not installed from a repository and has a relatively involved installation process: download the release archive, create a user, manually copy the required files, set permissions, and create a systemd unit for autostart.
Download
Go to the official download page and copy the link to the package for Linux:
… and use it to download the package onto the Linux server:
wget https://github.com/prometheus/prometheus/releases/download/v2.20.1/prometheus-2.20.1.linux-amd64.tar.gz
* if the system returns an error, install the wget package.
Installation (copying files)
After downloading the prometheus archive, unpack it and copy its contents into the appropriate directories.
First, create the directories into which the prometheus files will be copied:
mkdir /etc/prometheus
mkdir /var/lib/prometheus
Unpack the archive:
tar zxvf prometheus-*.linux-amd64.tar.gz
… and change into the directory with the unpacked files:
cd prometheus-*.linux-amd64
Distribute the files across the directories:
cp prometheus promtool /usr/local/bin/
cp -r console_libraries consoles prometheus.yml /etc/prometheus
Setting permissions
Create the user that the monitoring system will run as:
useradd --no-create-home --shell /bin/false prometheus
* this creates the prometheus user without a home directory and without the ability to log in to the server console.
Set the owner of the directories created in the previous step:
chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus
Set the owner of the copied files:
chown prometheus:prometheus /usr/local/bin/{prometheus,promtool}
Startup and verification
Start prometheus with the command:
/usr/local/bin/prometheus --config.file /etc/prometheus/prometheus.yml --storage.tsdb.path /var/lib/prometheus/ --web.console.templates=/etc/prometheus/consoles --web.console.libraries=/etc/prometheus/console_libraries
… the startup log will be shown, ending with "Server is ready to receive web requests":
level=info ts=2019-08-07T07:39:06.849Z caller=main.go:621 msg="Server is ready to receive web requests."
Open a web browser and go to http://<server IP address>:9090; the Prometheus console will load:
The installation is complete.
Autostart
We have installed the monitoring server, but it has to be started manually, which is not suitable for server operation. To configure automatic startup of Prometheus, we will create a new systemd unit.
Return to the server console and stop Prometheus with Ctrl + C. Create the prometheus.service file:
vi /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Service
After=network.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file /etc/prometheus/prometheus.yml \
    --storage.tsdb.path /var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure

[Install]
WantedBy=multi-user.target
Reload the systemd configuration:
systemctl daemon-reload
Enable autostart:
systemctl enable prometheus
After the manual start we did for testing, the ownership of the data directory may have changed; set its owner again:
chown -R prometheus:prometheus /var/lib/prometheus
Start the service:
systemctl start prometheus
… and check that it started correctly:
systemctl status prometheus
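You can additionally check that the web endpoint responds, for example:
curl http://localhost:9090/-/healthy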
Monitoring Linux services
To monitor services with Prometheus, we will configure metric collection and alert display.
Collecting metrics with node_exporter
Open the service unit created for node_exporter:
vi /etc/systemd/system/node_exporter.service
… and add to ExecStart:
…
ExecStart=/usr/local/bin/node_exporter --collector.systemd
…
* this option tells the exporter to monitor the state of every systemd service.
If needed, we can monitor only selected services by adding the collector.systemd.unit-whitelist option:
ExecStart=/usr/local/bin/node_exporter --collector.systemd --collector.systemd.unit-whitelist="(chronyd|mariadb|nginx).service"
* in this example only the chronyd, mariadb and nginx services will be monitored.
… or, conversely, monitor all services except specific ones:
ExecStart=/usr/local/bin/node_exporter --collector.systemd --collector.systemd.unit-blacklist="(auditd|dbus|kdump).service"
* with this setting, monitoring of the auditd, dbus and kdump services is excluded.
To apply the settings, reload the systemd configuration:
systemctl daemon-reload
Restart node_exporter:
systemctl restart node_exporter
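To verify that systemd metrics are now being collected, a query like the following can be run in the Prometheus console (metric and labels as exposed by the systemd collector):
node_systemd_unit_state{name="nginx.service",state="active"}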
Displaying alerts
Let's set up monitoring for the NGINX service.
Create a file with the rule:
vi /etc/prometheus/services.rules.yml
groups:
  - name: services.rules
    rules:
      - alert: nginx_service
        expr: node_systemd_unit_state{name="nginx.service",state="active"} == 0
        for: 1s
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} is down."
Include the rule file in the prometheus configuration file:
vi /etc/prometheus/prometheus.yml
…
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
  - "alert.rules.yml"
  - "services.rules.yml"
…
* in this example we added our services.rules.yml file next to the previously added alert.rules.yml in the rule_files section.
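Before restarting, the rule file can be validated, for example:
promtool check rules /etc/prometheus/services.rules.yml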
Restart prometheus:
systemctl restart prometheus
To test, stop the service:
systemctl stop nginx
In the Prometheus console, in the Alerts section, we should see the alert:
Exporters
prometheus-node-exporter
- On Debian it is installed as a dependency of the prometheus package and is already added to the configuration
Example metrics
node_filesystem_free_bytes
$ df /
...
/dev/mapper/debian--vg-root  15662008  1877488  12969212  13% /
...
# TYPE node_filesystem_free_bytes gauge
node_filesystem_free_bytes{device="/dev/mapper/debian--vg-root",fstype="ext4",mountpoint="/"} = (15662008 - 1877488) * 1024
node_network_receive_bytes_total
$ cat /sys/class/net/eth1/statistics/rx_bytes
# TYPE node_network_receive_bytes_total counter
node_network_receive_bytes_total{device="eth1"}
Connecting to prometheus
# less /etc/prometheus/prometheus.yml
...
  - job_name: node
    # If prometheus-node-exporter is installed, grab stats about the local
    # machine by default.
    static_configs:
      - targets:
PromQL queries
8*rate(node_network_receive_bytes_total[5m])
8*rate(node_network_receive_bytes_total{device="eth1"}[5m])
8*rate(node_network_receive_bytes_total{device="eth1",instance="localhost:9100",job="node"}[5m])
prometheus-blackbox-exporter
# apt install prometheus-blackbox-exporter
Example configuration
# cat /etc/prometheus/blackbox.yml
...
  http_2xx:
    prober: http
    http:
      preferred_ip_protocol: "ip4"
...
# service prometheus-blackbox-exporter restart
# cat /etc/prometheus/prometheus.yml
...
  - job_name: check_http
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://google.com
          - https://ya.ru
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9115
  - job_name: check_ssh
    metrics_path: /probe
    params:
      module: [ssh_banner]
    static_configs:
      - targets:
          - switch1:22
          - switch2:22
          - switch3:22
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9115
probe_success... probe_duration_seconds... probe_http_duration_seconds...
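For example, failing probes can be found with a query such as:
probe_success == 0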
Example of using file-based service discovery and the ping check
# cat /etc/prometheus/prometheus.yml
...
  - job_name: check_ping
    metrics_path: /probe
    params:
      module: [icmp]
    file_sd_configs:
      - files:
          # - switchs.yml
          # - switchs.json
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9115
# cat /etc/prometheus/switchs.json
[
  { "targets": [ "switch1", "switch2", "switch3" ] }
]
# cat /etc/prometheus/switchs.yml
- targets:
    - switch1
    - switch2
    - switch3
Checking the configuration and restarting
prometheus-snmp-exporter
# apt install prometheus-snmp-exporter
# cat /etc/prometheus/snmp.yml
#if_mib:  # the default module; allows omitting "module" in the HTTP request
snmp_in_out_octets:
  version: 2
  auth:
    community: public
  walk:
    - 1.3.6.1.2.1.2.2.1.10
    - 1.3.6.1.2.1.2.2.1.16
    - 1.3.6.1.2.1.2.2.1.2
  metrics:
    - name: if_in_octets
      oid: 1.3.6.1.2.1.2.2.1.10
      type: counter
      indexes:
        - labelname: ifIndex
          type: Integer
      lookups:
        - labels:
            - ifIndex
          labelname: ifDescr
          oid: 1.3.6.1.2.1.2.2.1.2
          type: DisplayString
    - name: if_out_octets
      oid: 1.3.6.1.2.1.2.2.1.16
      type: counter
      indexes:
        - labelname: ifIndex
          type: Integer
      lookups:
        - labels:
            - ifIndex
          labelname: ifDescr
          oid: 1.3.6.1.2.1.2.2.1.2
          type: DisplayString
# service prometheus-snmp-exporter restart
http://server.corpX.un:9116/
# curl --noproxy 127.0.0.1 'http://127.0.0.1:9116/snmp?target=router&module=snmp_in_out_octets'
# cat /etc/prometheus/prometheus.yml
...
  - job_name: 'snmp'
    static_configs:
      - targets:
          - router
    metrics_path: /snmp
    params:
      module: [snmp_in_out_octets]
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9116
Checking the configuration and restarting
rate(if_in_octets{ifDescr="FastEthernet1/1",ifIndex="3",instance="router",job="snmp"}[5m])
8*rate(if_in_octets{ifDescr="FastEthernet1/1",instance="router"}[5m])
prometheus-alertmanager
# apt install prometheus-alertmanager
# cat /etc/prometheus/alertmanager.yml
...
global:
  smtp_smarthost: 'localhost:25'
  smtp_from: '[email protected]'
  smtp_require_tls: false
  # smtp_auth_username: 'alertmanager'
  # smtp_auth_password: 'password'
...
  # A default receiver
  receiver: team-X-mails
...
receivers:
  - name: 'team-X-mails'
    email_configs:
      # - to: '[email protected]'
      - to: '[email protected]'
        send_resolved: true
...
# service prometheus-alertmanager restart
# cat /etc/prometheus/first_rules.yml
groups:
  - name: alert.rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: "critical"
        annotations:
          summary: "Endpoint {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
      - alert: EndpointDown
        expr: probe_success == 0
        for: 1m
        labels:
          severity: "critical"
        annotations:
          summary: "Endpoint {{ $labels.instance }} down"
      - alert: CriticalTraffic
        expr: rate(if_in_octets{instance="router"}[5m]) > 125000
        for: 1m
        labels:
          severity: "critical"
        annotations:
          summary: "CriticalTraffic {{ $labels.instance }}"
# cat /etc/prometheus/prometheus.yml
...
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "first_rules.yml"
  # - "second_rules.yml"
...
Checking the configuration and restarting
... Checking /etc/prometheus/first_rules.yml SUCCESS: N rules found ...
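The Alertmanager configuration can be checked in a similar way, for example:
amtool check-config /etc/prometheus/alertmanager.yml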
http://server.corpX.un:9090/alerts
Sending notifications
Now let's integrate with Alertmanager to send notifications by email.
Configure alertmanager:
vi /etc/alertmanager/alertmanager.yml
Add to the global section:
global:
  …
  smtp_from: [email protected]
Bring the route section to the following form:
… then add another receiver (a combined sketch is shown after the notes below):
* in this example we send the message to the [email protected] mailbox from the local server
Note that sending mail externally requires a correctly configured mail server (otherwise, the mail may end up in spam).
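A minimal sketch of what the route and the extra receiver might look like, assuming mail should go to [email protected] through the local mail server (the receiver name is made up):
route:
  receiver: alert-mail          # made-up default receiver name
receivers:
  - name: alert-mail
    email_configs:
      - to: [email protected]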
Restart the Alertmanager service:
systemctl restart alertmanager
Now let's connect prometheus to alertmanager; open the monitoring server's configuration file:
vi /etc/prometheus/prometheus.yml
Bring the alerting section to the following form:
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 192.168.0.14:9093
* where 192.168.0.14 is the IP address of the server running alertmanager.
Restart the service:
systemctl restart prometheus
Wait a little and open the Alertmanager web interface; we should see the alert:
… and an email with the alert should arrive in the mailbox.
API Security
As administrative and mutating endpoints are intended to be accessed via simple
tools such as cURL, there is no built in
CSRF protection as
that would break such use cases. Accordingly when using a reverse proxy, you
may wish to block such paths to prevent CSRF.
For non-mutating endpoints, you may wish to set CORS headers such as
Access-Control-Allow-Origin in your reverse proxy to prevent
XSS.
If you are composing PromQL queries that include input from untrusted users
(e.g. URL parameters to console templates, or something you built yourself) who
are not meant to be able to run arbitrary PromQL queries, make sure any
untrusted input is appropriately escaped to prevent injection attacks. For
example, up{job="<user_input>"} would become up{job=""} or some_metric{zzz=""} if the <user_input> was "} or some_metric{zzz=".
For those using Grafana note that dashboard permissions are not data source
permissions, so do not limit a user's ability to run arbitrary queries in proxy mode.