Prometheus (metrics collection)

Prometheus - Monitoring system & time series database
- Open source, written in Go.
- Focused on collecting metrics. Well suited to monitoring microservice setups on Docker and k8s (Kubernetes).
- For graphing, use Memo/Grafana.

Japanese translation of the official Prometheus documentation, plus extras - it-engineer’s blog

Similar OSS:

Memo/VictoriaMetrics

Articles

Troubleshooting

Articles:

Problems encountered when monitoring several hundred hosts with Prometheus | GREE Engineering (in Japanese)

When Prometheus fails to start

Prometheus err message : get segment range: segments are not sequential
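- The suggested fix is to delete wal and chunks_head and then restart. The WAL holds the most recent raw data, so roughly the last day of data is lost.

Move the wal directory aside and try starting Prometheus again. The most recent raw data (about one day) is lost.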

sudo service prometheus stop
sudo mv /mnt/disk1/prometheus/data/wal /mnt/disk1/prometheus/data/wal.$(date +%Y%m%d)
sudo service prometheus start
sudo service prometheus status

# If there are no problems, delete the moved directory (use the date suffix created above)
sudo rm -rf /mnt/disk1/prometheus/data/wal.20250201

Thanos: storing Prometheus data in an S3 bucket on k8s

Using Thanos:

Dealing with a spike in S3 API costs

The number of GetObject/HeadObject API calls becomes enormous, which drives up the cost.

thanos s3 tier 2 request usage is very high #7916
- Options reported to have reduced requests from 200K/5min to 5K/5min:
```
--sync-block-duration=30m
--wait-interval=4h
--compact.cleanup-interval=0s
--compact.progress-interval=0s
```

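compact: storage compaction and downsampling

Data keeps accumulating in the S3 bucket (or similar storage), so old data should be deleted or downsampled.
- Reportedly it is better not to use S3 lifecycle rules; the compact command is recommended instead. https://github.com/thanos-io/thanos/issues/2869#issuecomment-1585790279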

Improving HA and long-term storage for Prometheus using Thanos on EKS with S3 | AWS Open Source Blog
Deploy Thanos Compactor

Compactor

Dealing with metrics bloat

WAL compression has been enabled by default since 2.20.0.
https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects
--storage.tsdb.wal-compression

This flag was introduced in 2.11.0 and enabled by default in 2.20.0.

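Finding metrics with many series:

Jobs and instances | Prometheus
- scrape_samples_scraped: the number of samples each target exposed in its last scrape. Querying it is enough to see which targets have grown.

List of jobs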

up

# In Grafana, extract only the job= value
/.*job="([^"]+)/
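
As a rough illustration of what the Grafana regex above captures, the following Python sketch applies the same pattern to series strings. The sample strings are made up, and Python's `re` module is used only as a stand-in for Grafana's regex handling.

```python
import re

# Same pattern as the Grafana regex above, without the surrounding slashes.
JOB_PATTERN = re.compile(r'.*job="([^"]+)')


def extract_job(series: str):
    """Return the value of the job label, or None when the label is absent."""
    match = JOB_PATTERN.search(series)
    return match.group(1) if match else None


# Hypothetical series strings, shaped like the result labels of an `up` query.
samples = [
    'up{instance="10.0.0.1:2020", job="fluent-bit"}',
    'up{instance="10.0.0.2:9100", job="node"}',
]
jobs = [extract_job(s) for s in samples]
print(jobs)
```

The capture group stops at the first closing quote, so only the job name itself is kept.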

Show all metrics for a specific instance
```
{instance="<IP>:<PORT>",job="example"}
```

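Other

# Series count per job, top 10
topk(10,count by (job)({__name__=~".+"}))
# Finer-grained variant: series count per metric name and job, top 10
topk(10,count by (__name__, job)({__name__=~".+"}))

# Series count per metric name, top 10
topk(10,count by (__name__)({__name__=~".+"}))

# Series count per metric name and instance, top 10
topk(10,count by (__name__, instance)({__name__=~".+"}))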
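
To see where the series counts come from, the same kind of counting can be reproduced client-side. This is a minimal sketch, assuming the label sets have already been fetched (for example through the HTTP API); the sample data and the `top_series_counts` helper are hypothetical.

```python
from collections import Counter


def top_series_counts(series, label, k=10):
    """Count series per value of `label` and return the k largest.

    Roughly what topk(k, count by (label)(...)) does; series that lack
    the label are skipped here.
    """
    counts = Counter(s[label] for s in series if label in s)
    return counts.most_common(k)


# Hypothetical label sets, one dict per time series.
series = [
    {"__name__": "up", "job": "node", "instance": "a:9100"},
    {"__name__": "up", "job": "fluent-bit", "instance": "b:2020"},
    {"__name__": "node_cpu_seconds_total", "job": "node", "instance": "a:9100"},
    {"__name__": "node_cpu_seconds_total", "job": "node", "instance": "c:9100"},
    {"__name__": "node_cpu_seconds_total", "job": "node", "instance": "d:9100"},
]

by_metric = top_series_counts(series, "__name__")
by_job = top_series_counts(series, "job")
print(by_metric)
print(by_job)
```

Grouping by `__name__` points at the metrics with the most series, while grouping by `job` points at the exporter responsible for them.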

Articles:

How to properly manage high-cardinality data in Prometheus | New Relic (in Japanese)

Monitoring Prometheus itself

Reference dashboard:

Prometheus 2.0 Overview | Grafana Labs
- It references prometheus_tsdb_head_samples_appended_total

Getting started

Set up SSH port forwarding to the host where Prometheus is installed

ssh -fNg -L 9090:localhost:9090 user@hostA

# To stop the SSH port forwarding
pkill -u $USER -f 9090:

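Open http://localhost:9090/ in a browser
At http://localhost:9090/targets, check that each target is recognized with the expected Endpoint, job, and labels
At http://localhost:9090/service-discovery, check the Endpoints and labels discovered automatically by ec2_sd_configs

At http://localhost:9090/graph, check that the targets can be queried

# Show only the recognized targets of the given job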
up{job="fluent-bit"}

# Count the recognized targets of the given job
count(up{job="fluent-bit"})

Try it with curl. Complex queries must be URL-encoded.

# up: one sample per target (1 = scrape succeeded, 0 = scrape failed)
curl -s "http://localhost:9090/api/v1/query?query=up&time=$(date +%s)" | jq .

{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "up",
...
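
PromQL selectors contain characters such as `{`, `"` and `=`, so the `query` parameter has to be percent-encoded before it goes into the URL. The following is a minimal sketch using only the Python standard library. `build_query_url` and `instances_from_response` are hypothetical helper names, and the response is a hand-written sample in the shape shown above, not real output.

```python
import json
from urllib.parse import quote, urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL with the PromQL expression percent-encoded."""
    return f"{base_url}/api/v1/query?" + urlencode({"query": promql}, quote_via=quote)


def instances_from_response(body: str):
    """Return (instance, value) pairs from an instant-query response of type vector."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("query failed")
    return [(r["metric"].get("instance"), r["value"][1]) for r in payload["data"]["result"]]


url = build_query_url("http://localhost:9090", 'count(up{job="fluent-bit"})')
print(url)

# Hand-written sample response (not real output).
sample = """{"status": "success", "data": {"resultType": "vector", "result": [
  {"metric": {"__name__": "up", "instance": "10.0.0.1:2020", "job": "fluent-bit"},
   "value": [1700000000, "1"]}]}}"""
print(instances_from_response(sample))
```

Sample values arrive as strings in the JSON, so convert them with `float()` before doing arithmetic on them.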

query:

Articles:

Getting started with PromQL: cheat sheet included! – Sysdig (in Japanese)

topk()/bottomk(): get only the top or bottom K

Get only the top 10
```
topk(10, ...)
```

Get only the bottom 10
```
bottomk(10, ...)
```

Regular expressions

https://github.com/google/re2/wiki/Syntax

# Matches every job
http_requests_total{job=~".*"}

# Matches job values that end with "server"
http_requests_total{job=~".*server"}
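
One detail worth keeping in mind is that Prometheus regex matchers are fully anchored, so `job=~".*server"` has to match the whole label value. The sketch below imitates that with Python's `re.fullmatch`. Python's regex engine is not RE2, so this only illustrates the anchoring, and the job names are made up.

```python
import re


def matches_label(pattern: str, value: str) -> bool:
    """Imitate a PromQL =~ matcher: the pattern must match the whole label value."""
    return re.fullmatch(pattern, value) is not None


# Hypothetical job names.
jobs = ["api-server", "server-exporter", "fluent-bit"]

ends_with_server = [j for j in jobs if matches_label(".*server", j)]
bare_server = [j for j in jobs if matches_label("server", j)]
print(ends_with_server)
print(bare_server)
```

The bare pattern `server` matches none of these names, which is why the leading `.*` is needed for a suffix match.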

up{}: number of jobs and instances

Jobs and instances | Prometheus

Number of targets Prometheus scrapes for the job
```
count(up{job="$job"})
```

Number of targets in the given job from which fluentbit_uptime can be retrieved
```
count(fluentbit_uptime{job="$job"})
```

Integration with CloudWatch

Articles
- Using Prometheus metrics in Amazon CloudWatch | Amazon Web Services Blog (in Japanese)

ec2_sd_config: service discovery

ec2_sd_config
- The "ec2:DescribeInstances" permission is required, so attach an IAM role to the EC2 instance

Articles
- An introduction to Prometheus: Service Discovery - Qiita (in Japanese)
