Differences
This page shows the differences between two revisions of the page.
| software:monitoring [2025/11/17 00:45] – 192.168.1.1 | software:monitoring [2025/12/07 02:38] (current) – [Software tools] mirocow | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | {{tag> | + | {{tag> |
| - | ====== | + | ====== |
| - | ===== Versions: ===== | + | {{:software:67567.jpg?600|}} |
| - | + | ||
| - | * grafana/ | + | |
| - | * grafana/ | + | |
| - | * grafana/ | + | |
| - | ===== Settings | + | IT infrastructure monitoring (IT monitoring) is the process of continuously collecting, processing, and analyzing data on the state of the elements of the IT environment: |
| - | ==== [[ loki_config ]] ==== | + | * Minimizing downtime: fast response to problems (restarting services, notifying administrators, |
| + | * Reducing operating costs: automated monitoring of the network, servers, | ||
| + | * Improving the user experience: stable operation of systems affects the satisfaction of customers and employees, | ||
| + | * Planning infrastructure growth: analysis of historical data and trends makes it possible to forecast future resource needs and plan the development of the IT infrastructure. | ||
| - | <code yaml> | + | ===== Programs ===== |
| - | auth_enabled: | + | |
| - | server: | + | Grafana, Promtail, Loki, Prometheus, Fluent Bit, Fluentd, Kibana, Logstash, Elasticsearch, |
| - | http_listen_port: | + | |
| - | common: | + | ==== Recommendations ==== |
| - | instance_addr: | + | |
| - | path_prefix: | + | |
| - | storage: | + | |
| - | filesystem: | + | |
| - | chunks_directory: | + | |
| - | rules_directory: | + | |
| - | replication_factor: | + | |
| - | ring: | + | |
| - | kvstore: | + | |
| - | store: inmemory | + | |
| - | schema_config: | + | * Use Fluentd or Fluent Bit, both of which can ship data to Loki. Unlike Promtail, they come with ready-made parsers for almost any log format and also handle multiline logs. |
| - | configs: | + | |
| - | | + | |
| - | store: tsdb | + | |
| - | object_store: | + | |
| - | schema: v13 | + | |
| - | index: | + | |
| - | prefix: index_ | + | |
| - | period: 24h | + | |
| - | ruler: | + | ===== Software tools ===== |
| - | alertmanager_url: | + | |
| - | limits_config: | ||
| - | retention_period: | ||
| - | reject_old_samples: | ||
| - | reject_old_samples_max_age: | ||
| - | table_manager: | + | * [[system: |
| - | retention_deletes_enabled: | + | |
| - | retention_period: | + | |
| - | </ | + | |
| - | + | | |
| - | ==== [[ promtail_config | + | |
| - | + | * [[system:htop]] | |
| - | [[promtail]] | + | * [[system:iotop]] |
| - | + | * [[software:iperf3]] | |
| - | <code yaml> | + | * [[:lsof]] |
| - | server: | + | * [[system:elasticsearch:logstash]] |
| - | | + | * [[:lm-sensors]] |
| - | http_listen_address: | + | |
| - | + | | |
| - | positions: | + | |
| - | filename: / | + | |
| - | # Write positions every 15 seconds | + | |
| - | sync_period: | + | |
| - | + | | |
| - | clients: | + | |
| - | | + | |
| - | backoff_config: | + | |
| - | min_period: 10s | + | |
| - | max_period: 5m | + | |
| - | max_retries: | + | |
| - | batchwait: 30s | + | |
| - | batchsize: 2097152 | + | |
| - | timeout: 60s | + | |
| - | external_labels: | + | |
| - | cluster: docker-swarm | + | |
| - | host: " | + | |
| - | + | ||
| - | scrape_configs: | + | |
| - | + | ||
| - | | + | |
| - | static_configs: | + | |
| - | | + | |
| - | | + | |
| - | labels: | + | |
| - | job: varlogs | + | |
| - | __path__: / | + | |
| - | + | ||
| - | # See https:// | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | - host: unix:/// | + | |
| - | | + | |
| - | | + | |
| - | - name: status | + | |
| - | values: | + | |
| - | relabel_configs: | + | |
| - | - target_label: | + | |
| - | replacement: | + | |
| - | - source_labels: | + | |
| - | | + | |
| - | target_label: container | + | |
| - | action: replace | + | |
| - | - source_labels: | + | |
| - | | + | |
| - | regex: " | + | |
| - | action: keep | + | |
| - | - source_labels: | + | |
| - | target_label: swarm_service_name | + | |
| - | | + | |
| - | | + | |
| - | action: replace | + | |
| - | - source_labels: | + | |
| - | target_label: | + | |
| - | action: replace | + | |
| - | - source_labels: | + | |
| - | | + | |
| - | action: replace | + | |
| - | - source_labels: | + | |
| - | target_label: log_stream | + | |
| - | action: replace | + | |
| - | - source_labels: | + | |
| - | | + | |
| - | action: replace | + | |
| - | - source_labels: | + | |
| - | target_label: | + | |
| - | action: replace | + | |
| - | # Drop rule for testing | + | |
| - | # - source_labels: | + | |
| - | # | + | |
| - | # | + | |
| - | pipeline_stages: | + | |
| - | - docker: {} | + | |
| - | - timestamp: | + | |
| - | source: current_time | + | |
| - | format: RFC3339 | + | |
| - | </ | + | |
| - | + | ||
| - | ===== Docker Compose ===== | + | |
| - | + | ||
| - | + | ||
| - | monitoring | + | |
| - | <code yaml> | + | |
| - | version: " | + | |
| - | + | ||
| - | services: | + | |
| - | + | ||
| - | | + | |
| - | | + | |
| - | ports: | + | |
| - | - " | + | |
| - | configs: | + | |
| - | - source: loki_config | + | |
| - | target: / | + | |
| - | volumes: | + | |
| - | - loki_data:/ | + | |
| - | - / | + | |
| - | - / | + | |
| - | command: | + | |
| - | - -config.file=/ | + | |
| - | - -config.expand-env=true | + | |
| - | - -target=all | + | |
| - | networks: | + | |
| - | - monitoring | + | |
| - | deploy: | + | |
| - | placement: | + | |
| - | constraints: | + | |
| - | - node.role == manager | + | |
| - | resources: | + | |
| - | limits: | + | |
| - | memory: 2G # Reduce memory | + | |
| - | cpus: ' | + | |
| - | reservations: | + | |
| - | memory: 1G | + | |
| - | cpus: ' | + | |
| - | restart_policy: | + | |
| - | condition: on-failure | + | |
| - | delay: 10s | + | |
| - | max_attempts: | + | |
| - | + | ||
| - | | + | |
| - | image: grafana/ | + | |
| - | configs: | + | |
| - | - source: promtail_config | + | |
| - | target: / | + | |
| - | volumes: | + | |
| - | - / | + | |
| - | - / | + | |
| - | - / | + | |
| - | - / | + | |
| - | - / | + | |
| - | - promtail_positions:/ | + | |
| - | - / | + | |
| - | - / | + | |
| - | command: | + | |
| - | - -config.file=/ | + | |
| - | - -client.external-labels=host=${HOSTNAME} | + | |
| - | - -config.expand-env=true | + | |
| - | #- -log.level=info | + | |
| - | environment: | + | |
| - | - HOSTNAME={{.Node.Hostname}} | + | |
| - | networks: | + | |
| - | - monitoring | + | |
| - | deploy: | + | |
| - | mode: global | + | |
| - | resources: | + | |
| - | limits: | + | |
| - | memory: 512M | + | |
| - | cpus: ' | + | |
| - | reservations: | + | |
| - | memory: 256M | + | |
| - | cpus: ' | + | |
| - | restart_policy: | + | |
| - | condition: any | + | |
| - | delay: 30s | + | |
| - | max_attempts: | + | |
| - | + | ||
| - | | + | |
| - | image: grafana/ | + | |
| - | ports: | + | |
| - | - " | + | |
| - | environment: | + | |
| - | - GF_SECURITY_ADMIN_PASSWORD=admin | + | |
| - | - GF_SECURITY_ADMIN_USER=admin | + | |
| - | - GF_USERS_ALLOW_SIGN_UP=false | + | |
| - | - GF_AUTH_ANONYMOUS_ENABLED=true | + | |
| - | - GF_AUTH_ANONYMOUS_ORG_ROLE=Viewer | + | |
| - | # Settings to work around the database locking problem | + | |
| - | - GF_DATABASE_TYPE=sqlite3 | + | |
| - | - GF_DATABASE_PATH=grafana.db | + | |
| - | - GF_DATABASE_MAX_IDLE_CONN=1 | + | |
| - | - GF_DATABASE_MAX_OPEN_CONN=1 | + | |
| - | - GF_DATABASE_CONN_MAX_LIFETIME=14400 | + | |
| - | # Disable features, | + | |
| - | - GF_ALERTING_ENABLED=false | + | |
| - | - GF_REPORTING_ENABLED=false | + | |
| - | - GF_LIVE_ENABLED=false | + | |
| - | volumes: | + | |
| - | - grafana_data:/ | + | |
| - | - / | + | |
| - | - / | + | |
| - | networks: | + | |
| - | - monitoring | + | |
| - | deploy: | + | |
| - | placement: | + | |
| - | constraints: | + | |
| - | - node.role == manager | + | |
| - | resources: | + | |
| - | limits: | + | |
| - | memory: 1G | + | |
| - | cpus: ' | + | |
| - | reservations: | + | |
| - | memory: 512M | + | |
| - | cpus: ' | + | |
| - | restart_policy: | + | |
| - | condition: on-failure | + | |
| - | delay: 10s | + | |
| - | max_attempts: | + | |
| - | + | ||
| - | configs: | + | |
| - | | + | |
| - | external: true | + | |
| - | | + | |
| - | external: true | + | |
| - | + | ||
| - | networks: | + | |
| - | | + | |
| - | driver: overlay | + | |
| - | attachable: true | + | |
| - | + | ||
| - | + | ||
| - | volumes: | + | |
| - | | + | |
| - | driver: local | + | |
| - | | + | |
| - | driver: local | + | |
| - | grafana_data: | + | |
| - | driver: local | + | |
| - | </ | + | |
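
The recommendation in the new revision, shipping logs to Loki with Fluent Bit instead of Promtail, could look roughly like the sketch below. It is not taken from either revision of the page: the log path, the tag, the `loki` host name, port 3100 and the `job` label are assumptions chosen for illustration, and the built-in `docker` and `cri` multiline parsers stand in for whatever parser the actual logs require.

<code yaml>
# Minimal Fluent Bit sketch (YAML configuration format, Fluent Bit 2.x or later).
# All values below are illustrative assumptions, not settings from this page.
pipeline:
  inputs:
    - name: tail
      # Hypothetical path; point it at the log files to be collected.
      path: /var/log/containers/*.log
      tag: containers.*
      # Built-in multiline parsers that reassemble split container log lines.
      multiline.parser: docker, cri
  outputs:
    - name: loki
      match: '*'
      # Assumed host name and port of the Loki instance.
      host: loki
      port: 3100
      # Static label attached to every stream sent to Loki.
      labels: job=fluentbit
</code>

Keeping the label set small and static, as here, is usually a reasonable starting point, since every distinct label combination creates a separate stream in Loki.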