Docker Metrics with InstrumentalD
The best way to get Docker metrics into Instrumental is with InstrumentalD, the fast and reliable server agent created by the Instrumental team. By using InstrumentalD to collect Docker metrics, you'll get premade Docker graphs and unlock the full power of our Query Language.
Quick Start
Check out our Installation Instructions for more details. Otherwise, here's the bare minimum to get up and running.
Homebrew (macOS):
brew install instrumental/instrumentald/instrumentald
echo 'docker = ["unix:///var/run/docker.sock"]' >> instrumentald.toml
instrumentald -c instrumentald.toml -k PROJECT_TOKEN

apt (Debian/Ubuntu):
curl https://packagecloud.io/install/repositories/expectedbehavior/instrumental/script.deb.sh | sudo bash
sudo apt-get install instrumentald
echo 'project_token = "PROJECT_TOKEN"' | sudo tee /etc/instrumentald.toml
echo 'docker = ["unix:///var/run/docker.sock"]' | sudo tee -a /etc/instrumentald.toml
sudo systemctl restart instrumentald

yum (RPM-based distributions):
curl https://packagecloud.io/install/repositories/expectedbehavior/instrumental/script.rpm.sh | sudo bash
sudo yum install instrumentald
echo 'project_token = "PROJECT_TOKEN"' | sudo tee /etc/instrumentald.toml
echo 'docker = ["unix:///var/run/docker.sock"]' | sudo tee -a /etc/instrumentald.toml
sudo service instrumentald restart
Configuring InstrumentalD
InstrumentalD will collect the metrics below from as many Docker endpoints as configured. Here's a basic example of the Docker config:
docker = ["unix:///var/run/docker.sock"]
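Because `docker` is a list, more than one endpoint can be configured. As a minimal sketch, the helper below prints a `docker = [...]` line from any number of endpoint URLs. The function name `docker_config_line` and the second socket path are hypothetical and used only for illustration; only the `unix:///var/run/docker.sock` endpoint comes from the example above.

```shell
# Hypothetical helper: print a `docker = [...]` line for instrumentald.toml
# from the endpoint URLs passed as arguments.
docker_config_line() {
  line='docker = ['
  sep=''
  for endpoint in "$@"; do
    # Quote each endpoint and separate entries with ", ".
    line="${line}${sep}\"${endpoint}\""
    sep=', '
  done
  printf '%s]\n' "$line"
}

# Example: the default socket plus a second, made-up socket path.
docker_config_line "unix:///var/run/docker.sock" "unix:///var/run/docker-alt.sock"
```

The printed line can be appended to the config file in the same way as the Quick Start commands, for example by piping it to `sudo tee -a /etc/instrumentald.toml`.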
Metrics Collected
InstrumentalD collects both container-related metrics and host-specific metrics.
Container-Related Metrics
Container-related metrics collected by InstrumentalD follow this pattern:
docker.container.<image name>.<container name>.<metric>
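To make the pattern concrete, here is a small sketch that assembles a metric name from its three parts. The function name `container_metric` and the image and container names are made up for illustration. How InstrumentalD treats characters such as `:` or `/` in image names is not covered here, so check the names that actually appear in your Instrumental project.

```shell
# Hypothetical helper: join the parts of the container metric pattern.
container_metric() {
  # $1 = image name, $2 = container name, $3 = metric
  printf 'docker.container.%s.%s.%s\n' "$1" "$2" "$3"
}

# Example with made-up image and container names.
container_metric "nginx" "web-1" "usage_percent"
```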
Container Memory Metrics
| Metric | Description |
|---|---|
| `fail_count` | Number of times memory usage has hit limits |
| `limit` | Maximum memory allowed for the container, in bytes |
| `total_cache` | Size of the page cache, in bytes |
| `total_pgfault` | Number of times a process in the cgroup triggered a page fault. A page fault happens when a process accesses a part of its virtual memory space that is nonexistent or protected |
| `total_pgmajfault` | Number of times a process in the cgroup triggered a major fault. A major fault happens when the kernel has to read the data from disk. When it only has to duplicate an existing page or allocate an empty page, it is a regular (or "minor") fault |
| `total_rss` | The amount of memory that doesn't correspond to anything on disk: stacks, heaps, and anonymous memory maps |
| `total_unevictable` | The amount of memory that cannot be reclaimed. Generally, this is memory that has been "locked" with mlock, which crypto frameworks often use to make sure that secret keys and other sensitive material never get swapped out to disk |
| `usage` | Memory usage, in bytes |
| `usage_percent` | Memory usage as a percent of total available memory |
Container CPU Metrics
Container CPU metrics follow the general pattern described above, with one additional part, `cpu-total`, between the container name and the metric name. The table below lists each metric with that `cpu-total.` prefix already attached.
docker.container.<image name>.<container name>.cpu-total.<metric>
| Metric | Description |
|---|---|
| `cpu-total.throttling_periods` | Total number of periods in which the container could have been throttled |
| `cpu-total.throttling_throttled_periods` | Total number of periods in which the container was throttled |
| `cpu-total.throttling_throttled_time` | The amount of time the container was throttled, in microseconds |
| `cpu-total.usage_percent` | CPU usage as a percent of total available |
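The two throttling counters above are often easier to read as a ratio. The sketch below shows the arithmetic under that assumption: the share of periods in which the container was actually throttled. The function name `throttled_percent` and the sample values are hypothetical, and this is not a metric that InstrumentalD reports itself.

```shell
# Hypothetical helper: percent of periods in which the container was throttled.
throttled_percent() {
  # $1 = cpu-total.throttling_periods
  # $2 = cpu-total.throttling_throttled_periods
  awk -v periods="$1" -v throttled="$2" 'BEGIN {
    # Avoid dividing by zero when no periods have been counted yet.
    if (periods <= 0) { print "0.0"; exit }
    printf "%.1f\n", (throttled / periods) * 100
  }'
}

# Example with made-up readings: 50 throttled periods out of 200.
throttled_percent 200 50
```

With these sample readings the function prints `25.0`.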
Container Network Metrics
Container network metrics follow the general pattern described above, with one additional part for the interface name (e.g. `eth0`):
docker.container.<image name>.<container name>.<interface name>.<metric>
| Metric | Description |
|---|---|
| `rx_bytes` | Bytes received |
| `rx_dropped` | Inbound packets dropped |
| `rx_errors` | Inbound packet errors |
| `rx_packets` | Packets received |
| `tx_bytes` | Bytes sent |
| `tx_dropped` | Outbound packets dropped |
| `tx_errors` | Outbound packet errors |
| `tx_packets` | Packets sent |
Container Block I/O Metrics
Container block I/O metrics follow the general pattern described above, with one additional part for the device major and minor numbers joined by an underscore (e.g. `254_0`):
docker.container.<image name>.<container name>.<major_minor>.<metric>
| Metric | Description |
|---|---|
| `io_service_bytes_recursive_async` | Volume of serviced asynchronous block I/O requests, in bytes |
| `io_service_bytes_recursive_read` | Volume read from block devices, in bytes |
| `io_service_bytes_recursive_sync` | Volume of serviced synchronous block I/O requests, in bytes |
| `io_service_bytes_recursive_write` | Volume written to block devices, in bytes |
| `io_serviced_recursive_async` | Count of serviced asynchronous block I/O requests |
| `io_serviced_recursive_read` | Count of read requests from block devices serviced |
| `io_serviced_recursive_sync` | Count of serviced synchronous block I/O requests |
| `io_serviced_recursive_write` | Count of write requests to block devices serviced |
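Tools such as `lsblk` report device numbers in `MAJ:MIN` form (e.g. `254:0`), while the `<major_minor>` part uses an underscore. As a small sketch, the hypothetical helper `device_part` converts one form to the other so a device can be matched to its metric names. It assumes the only difference between the two forms is the separator.

```shell
# Hypothetical helper: turn "MAJ:MIN" device numbers into the
# underscore-separated form used in block I/O metric names.
device_part() {
  # $1 = device numbers in MAJ:MIN form, e.g. "254:0"
  printf '%s\n' "$1" | tr ':' '_'
}

# Example: prints 254_0
device_part "254:0"
```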
Host-Specific Metrics
Host-specific metrics collected by InstrumentalD follow this pattern:
docker.host.<hostname>.<metric>
| Metric | Description |
|---|---|
| `bytes.memory_total` | Total memory allocated for all containers |
| `n_containers` | Total number of running containers |
| `n_cpus` | Total number of CPUs available to Docker |
| `n_images` | Total number of images |
| `n_listener_events` | Current number of listeners connected to Docker |
| `n_used_file_descriptors` | Total number of file descriptors in use by Docker |