Collector Type: Agent
Category: Application Monitors
Application Name: nvidiagpumonitor
G2 Monitor Name: Agent G2 - Nvidia Gpu Monitor
Global Template Name: Agent G2 - Linux - Nvidia GPU Monitoring
Supported DCGM Version: 3.1.7
Note
This template works with Kubernetes agent version 15.0.0 and above.
Configuration Parameters
Name | Description | Default Value |
---|---|---|
Namespace | Namespace in which the DCGM exporter is running | gpu-operator |
Port | Port on which metrics are exported | 9400 |
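To illustrate how the Namespace and Port parameters above might combine into a scrape target, here is a minimal Python sketch. The service name `nvidia-dcgm-exporter`, the `/metrics` path, and the helper names `metrics_url` and `fetch_metrics` are assumptions for illustration only; the agent's actual endpoint discovery is not described in this document and may differ.

```python
import urllib.request

def metrics_url(namespace="gpu-operator", port=9400,
                service="nvidia-dcgm-exporter"):
    """Build an in-cluster URL for the DCGM exporter metrics endpoint.

    namespace and port default to the template's documented defaults.
    The service name and the /metrics path are assumptions.
    """
    return f"http://{service}.{namespace}.svc:{port}/metrics"

def fetch_metrics(url, timeout=5.0):
    """Fetch the raw metrics text from the exporter (requires cluster access)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")

if __name__ == "__main__":
    # Only prints the URL; call fetch_metrics(...) from inside the cluster.
    print(metrics_url())
```

If the exporter runs in a different namespace or listens on a non-default port, pass those values explicitly, for example `metrics_url(namespace="monitoring", port=9500)`.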
Collected Metrics
Monitor Name | Display Name | Description |
---|---|---|
nvidia_dcgm_power_usage | Nvidia Dcgm Power Usage | Power draw |
nvidia_dcgm_mem_clock | Nvidia Dcgm Mem Clock Freq | Memory clock frequency |
nvidia_dcgm_mem_copy_util | Nvidia Dcgm Mem Util | Memory utilization |
nvidia_dcgm_fb_mem_used | Nvidia Dcgm Framebuffer Memory Used | Framebuffer memory used |
nvidia_dcgm_gpu_temp | Nvidia Dcgm Gpu Temp | GPU temperature |
nvidia_dcgm_memory_temp | Nvidia Dcgm Memory Temp | Memory temperature |
nvidia_dcgm_gpu_util | Nvidia Dcgm Gpu Util | GPU utilization |
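As a rough sketch of how exporter output could relate to the monitor names in the table above, the following Python snippet parses Prometheus-style text and keys each reading by monitor name and GPU index. The mapping from `DCGM_FI_DEV_*` field names to monitor names is an assumption inferred from the descriptions, not something this document confirms, and the sample input is hypothetical. The parser is deliberately simplified: it ignores lines without labels or with trailing timestamps.

```python
import re

# Assumed mapping from DCGM exporter field names to the monitor names above.
DCGM_TO_MONITOR = {
    "DCGM_FI_DEV_POWER_USAGE": "nvidia_dcgm_power_usage",
    "DCGM_FI_DEV_MEM_CLOCK": "nvidia_dcgm_mem_clock",
    "DCGM_FI_DEV_MEM_COPY_UTIL": "nvidia_dcgm_mem_copy_util",
    "DCGM_FI_DEV_FB_USED": "nvidia_dcgm_fb_mem_used",
    "DCGM_FI_DEV_GPU_TEMP": "nvidia_dcgm_gpu_temp",
    "DCGM_FI_DEV_MEMORY_TEMP": "nvidia_dcgm_memory_temp",
    "DCGM_FI_DEV_GPU_UTIL": "nvidia_dcgm_gpu_util",
}

# Matches lines of the form: NAME{label="v",...} VALUE
LINE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
GPU_RE = re.compile(r'(?:^|,)gpu="([^"]*)"')

def parse_dcgm_metrics(text):
    """Return {(monitor_name, gpu_index): value} for recognised metrics."""
    readings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # unlabeled or timestamped lines are ignored here
        name, labels, value = match.groups()
        monitor = DCGM_TO_MONITOR.get(name)
        if monitor is None:
            continue  # metric not covered by this template
        gpu = GPU_RE.search(labels)
        gpu_index = gpu.group(1) if gpu else "unknown"
        readings[(monitor, gpu_index)] = float(value)
    return readings

# Hypothetical sample input, not captured from a real exporter.
SAMPLE = """\
# TYPE DCGM_FI_DEV_GPU_TEMP gauge
DCGM_FI_DEV_GPU_TEMP{gpu="0",UUID="GPU-aaaa"} 41
DCGM_FI_DEV_POWER_USAGE{gpu="0",UUID="GPU-aaaa"} 61.5
"""

if __name__ == "__main__":
    for key, value in sorted(parse_dcgm_metrics(SAMPLE).items()):
        print(key, value)
```

Keying readings by GPU index keeps values from multi-GPU nodes separate, which is why the `gpu` label is extracted rather than discarded.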