Overview
Kubelet Stats provides metrics collected from the kubelet on each node, covering pod, CPU, memory, and disk usage.
Why It’s Useful
- Detects resource bottlenecks per node.
- Helps confirm that pods are running on healthy nodes.
- Helps identify memory leaks or runaway processes.
What Users Can Do
- Monitor pod and node-level usage.
- Track CPU/memory utilization trends.
- Debug performance issues for workloads.
Steps to Modify Configuration
- Get the existing ConfigMap:
    kubectl get cm opsramp-k8s-infra-metric-user-config -n <agent-installed-namespace> -o yaml
- Edit the ConfigMap:
    kubectl edit cm opsramp-k8s-infra-metric-user-config -n <agent-installed-namespace>
- Locate the kubelet_stats section in the ConfigMap:
    k8s_cluster:
      enabled: true
      config:
        scrape_interval: "2m"
- Update the required parameters.
- Save and apply the changes.
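The kubectl commands in the steps above can also be assembled from a script. The following is a minimal sketch, assuming `kubectl` is on the PATH and already authenticated against the cluster. The function names are hypothetical helpers introduced for illustration, and the namespace is always passed in by the caller because it depends on where the agent was installed.

```python
import subprocess
from typing import List

# ConfigMap name taken from the steps above.
CONFIGMAP = "opsramp-k8s-infra-metric-user-config"


def get_configmap_cmd(namespace: str) -> List[str]:
    """Build the kubectl command that prints the ConfigMap as YAML."""
    return ["kubectl", "get", "cm", CONFIGMAP, "-n", namespace, "-o", "yaml"]


def edit_configmap_cmd(namespace: str) -> List[str]:
    """Build the kubectl command that opens the ConfigMap in an editor."""
    return ["kubectl", "edit", "cm", CONFIGMAP, "-n", namespace]


def fetch_configmap_yaml(namespace: str) -> str:
    """Run the 'get' command and return the YAML text (requires cluster access)."""
    result = subprocess.run(
        get_configmap_cmd(namespace),
        check=True,           # raise CalledProcessError if kubectl fails
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode stdout as str
    )
    return result.stdout
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything touches the cluster. `kubectl edit` is interactive, so it is usually better run directly in a terminal than through `subprocess`.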
Supported Metrics
| Metric Name | Description |
|---|---|
| k8s_node_cpu_usage | Total CPU usage (sum of all cores per second) averaged over the sample window. |
| k8s_node_cpu_utilization | Node CPU utilization. |
| k8s_node_cpu_time | Total cumulative CPU time (sum of all cores) spent by the node since creation. |
| k8s_node_memory_available | Node memory available. |
| k8s_node_memory_usage | Node memory usage. |
| k8s_node_memory_rss | Node memory RSS. |
| k8s_node_memory_working_set | Node memory working set. |
| k8s_node_memory_page_faults | Node memory page faults. |
| k8s_node_memory_major_page_faults | Node memory major page faults. |
| k8s_node_filesystem_available | Node filesystem available. |
| k8s_node_filesystem_capacity | Node filesystem capacity. |
| k8s_node_filesystem_usage | Node filesystem usage. |
| k8s_node_network_io | Node network I/O. |
| k8s_node_network_errors | Node network errors. |
| k8s_node_uptime | The time since the node started. |
| k8s_pod_cpu_usage | Total CPU usage (sum of all cores per second) averaged over the sample window. |
| k8s_pod_cpu_utilization | Pod CPU utilization. |
| k8s_pod_cpu_time | Total cumulative CPU time (sum of all cores) spent by the pod since creation. |
| k8s_pod_memory_available | Pod memory available. |
| k8s_pod_memory_usage | Pod memory usage. |
| k8s_pod_cpu_node_utilization | Pod CPU utilization as a ratio of the node's capacity. |
| k8s_pod_cpu_limit_utilization | Pod CPU utilization as a ratio of the pod's total container limits. Metric not emitted if any container is missing a limit. |
| k8s_pod_cpu_request_utilization | Pod CPU utilization as a ratio of the pod's total container requests. Metric not emitted if any container is missing a request. |
| k8s_pod_memory_node_utilization | Pod memory utilization as a ratio of the node's capacity. |
| k8s_pod_memory_limit_utilization | Pod memory utilization as a ratio of the pod's total container limits. Metric not emitted if any container is missing a limit. |
| k8s_pod_memory_request_utilization | Pod memory utilization as a ratio of the pod's total container requests. Metric not emitted if any container is missing a request. |
| k8s_pod_memory_rss | Pod memory RSS. |
| k8s_pod_memory_working_set | Pod memory working set. |
| k8s_pod_memory_page_faults | Pod memory page faults. |
| k8s_pod_memory_major_page_faults | Pod memory major page faults. |
| k8s_pod_filesystem_available | Pod filesystem available. |
| k8s_pod_filesystem_capacity | Pod filesystem capacity. |
| k8s_pod_filesystem_usage | Pod filesystem usage. |
| k8s_pod_network_io | Pod network I/O. |
| k8s_pod_network_errors | Pod network errors. |
| k8s_pod_uptime | The time since the pod started. |
| container_cpu_usage | Total CPU usage (sum of all cores per second) averaged over the sample window. |
| container_cpu_utilization | Container CPU utilization. |
| container_cpu_time | Total cumulative CPU time (sum of all cores) spent by the container since creation. |
| container_memory_available | Container memory available. |
| container_memory_usage | Container memory usage. |
| k8s_container_cpu_node_utilization | Container CPU utilization as a ratio of the node's capacity. |
| k8s_container_cpu_limit_utilization | Container CPU utilization as a ratio of the container's limits. |
| k8s_container_cpu_request_utilization | Container CPU utilization as a ratio of the container's requests. |
| k8s_container_memory_node_utilization | Container memory utilization as a ratio of the node's capacity. |
| k8s_container_memory_limit_utilization | Container memory utilization as a ratio of the container's limits. |
| k8s_container_memory_request_utilization | Container memory utilization as a ratio of the container's requests. |
| container_memory_rss | Container memory RSS. |
| container_memory_working_set | Container memory working set. |
| container_memory_page_faults | Container memory page faults. |
| container_memory_major_page_faults | Container memory major page faults. |
| container_filesystem_available | Container filesystem available. |
| container_filesystem_capacity | Container filesystem capacity. |
| container_filesystem_usage | Container filesystem usage. |
| container_uptime | The time since the container started. |
| k8s_volume_available | The number of available bytes in the volume. |
| k8s_volume_capacity | The total capacity in bytes of the volume. |
| k8s_volume_inodes | The total inodes in the filesystem. |
| k8s_volume_inodes_free | The free inodes in the filesystem. |
| k8s_volume_inodes_used | The inodes used by the filesystem. This may not equal k8s_volume_inodes minus k8s_volume_inodes_free because the filesystem may share inodes with other filesystems. |
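The ratio metrics in the table (for example k8s_pod_cpu_limit_utilization and k8s_pod_memory_node_utilization) follow a simple pattern: usage divided by a reference value. The sketch below illustrates that arithmetic and the "not emitted if any container is missing a limit" rule. It is not the agent's implementation; the function names are hypothetical, and treating a non-positive total as "no metric" is a defensive assumption made here.

```python
from typing import Optional, Sequence


def pod_limit_utilization(
    pod_usage: float,
    container_limits: Sequence[Optional[float]],
) -> Optional[float]:
    """Return pod usage as a ratio of the pod's total container limits.

    A limit of None stands for a container with no limit set. In that case
    the function returns None, mirroring the "metric not emitted" rule.
    """
    if not container_limits or any(limit is None for limit in container_limits):
        return None
    total_limit = sum(container_limits)
    if total_limit <= 0:
        # Assumption: a zero or negative total is treated as "no usable limit".
        return None
    return pod_usage / total_limit


def node_utilization(usage: float, node_capacity: float) -> float:
    """Return usage as a ratio of the node's capacity."""
    return usage / node_capacity
```

For instance, a pod using 0.5 cores whose two containers are each limited to 1 core yields a limit utilization of 0.25, while the same pod with one unlimited container yields no value at all. The same pattern applies to the request-based and memory-based variants, with requests or bytes substituted for CPU limits.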