Use this section to troubleshoot alerts that are not generated at all, or that are not generated for specific metrics. This section includes:

  • Fixing alerts not being generated for specific resources or metrics
  • Resolving issues with alert thresholds and configurations
  • Ensuring alarms and events are being triggered as expected

Alerts not getting generated

  1. Verify that metric graph data is being collected at the defined frequency. If data is not being collected, troubleshoot the issue using the debugging process for monitoring. See "Metric graph data is not getting on all resources".

  2. Verify that alerts are enabled in the metric definition in the template:

    • Ensure that the resource-level or component-level alert checkbox is enabled for the specific resource.
    • Verify template-level alert customization (ComponentAlertThresholds).

  3. Using the gcli command, verify the alert settings of the metric definition (raiseAlert and the Warning and Critical thresholds) in the template configuration stored in the gateway database.

  4. Verify the resource-level alert configurations in the gateway database using the gcli command.
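
The four checks above can be read as an ordered checklist that stops at the first failure. The following is a minimal Python sketch of that ordering. The AlertChecks fields are hypothetical names for the results of checks performed by hand (in the UI and with gcli); they are not values read from any gateway API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertChecks:
    """Results of the manual checks. All field names are hypothetical."""
    data_collected: bool             # step 1: metric graph data arrives at the defined frequency
    alert_enabled_in_template: bool  # step 2: alert checkbox and ComponentAlertThresholds
    template_config_ok: bool         # step 3: raiseAlert and thresholds in the template configuration
    resource_config_ok: bool         # step 4: resource-level alert configuration


def first_failed_check(checks: AlertChecks) -> Optional[str]:
    """Return a description of the first failed step, or None if every check passed."""
    ordered = [
        (checks.data_collected,
         "Step 1: metric graph data is not collected at the defined frequency"),
        (checks.alert_enabled_in_template,
         "Step 2: alerts are not enabled in the metric definition in the template"),
        (checks.template_config_ok,
         "Step 3: raiseAlert or thresholds are wrong in the template configuration"),
        (checks.resource_config_ok,
         "Step 4: resource-level alert configuration is wrong"),
    ]
    for passed, message in ordered:
        if not passed:
            return message
    return None


if __name__ == "__main__":
    result = first_failed_check(AlertChecks(True, False, True, True))
    print(result or "All checks passed")
```

Working through the steps in this order avoids inspecting the gateway database when the real cause is missing metric data or a disabled alert checkbox.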

Alerts not getting generated for a particular metric on a resource

  1. Verify that metric graph data is being collected at the defined frequency. If data is not being collected, troubleshoot the issue using the debugging process for monitoring. See "Metric graph data is not getting on all resources".

  2. Verify that alerts are enabled in the metric definition in the template:

    • Ensure that the resource-level or component-level alert checkbox is enabled for the specific resource.
    • Verify template-level alert customization (ComponentAlertThresholds).

  3. Using the gcli command, verify the alert settings of the metric definition (raiseAlert and the Warning and Critical thresholds) in the template configuration stored in the gateway database.

  4. Verify the resource-level alert configurations in the gateway database using the gcli command.

Alerts not getting generated based on alert thresholds

  1. Verify that metric graph data is being collected at the defined frequency. If data is not being collected, troubleshoot the issue using the debugging process for monitoring. See "Metric graph data is not getting on all resources".

  2. Verify that alerts are enabled in the metric definition in the template:

    • Ensure that the resource-level or component-level alert checkbox is enabled for the specific resource.
    • Verify template-level alert customization (ComponentAlertThresholds).

  3. Using the gcli command, verify the alert settings of the metric definition (raiseAlert and the Warning and Critical thresholds) in the template configuration stored in the gateway database.

  4. Verify the resource-level alert configurations in the gateway database using the gcli command.
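
When checking thresholds, it can help to work out by hand which severity a sampled metric value should produce, and then compare that with what was actually raised. The sketch below assumes a simple "value at or above threshold" comparison, with the Critical threshold taking precedence over Warning. The actual comparison operator depends on the metric definition and is not specified in this section. The function and parameter names are illustrative only.

```python
from typing import Optional


def expected_severity(value: float,
                      warning: float,
                      critical: float,
                      raise_alert: bool = True) -> Optional[str]:
    """Return the severity a value would be expected to raise, or None.

    Assumption: an alert fires when the value is greater than or equal to
    a threshold. Real metric definitions may use a different operator.
    """
    if not raise_alert:
        # With raiseAlert disabled, no alert is expected at any value.
        return None
    if value >= critical:
        return "Critical"
    if value >= warning:
        return "Warning"
    return None


if __name__ == "__main__":
    # Hypothetical sample values checked against Warning=80 and Critical=90.
    for sample in (75.0, 85.0, 92.0):
        print(sample, expected_severity(sample, warning=80.0, critical=90.0))
```

If the expected severity differs from what the gateway raised, recheck the thresholds and raiseAlert value at both the template level and the resource level.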

Alarms / Events not getting generated

  1. Verify the app configurations related to event polling.
  2. Verify that the alarm-related or event-related metric exists in the template of the root resource.
  3. Verify the template configurations in the gateway database using the gcli command:

    a. Go to the Resource Details page for the resource where monitoring is not working, and copy the Resource UUID.
    b. Launch the GCLI terminal. Refer to the instructions on how to connect to the GCLI terminal in the gateway.
    c. Execute the following commands:

        gcli> sdkgetresourceconfig <resourceId>
        gcli> sdkgetmonitoringconfig <templateId>
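
To avoid typing mistakes when pasting identifiers into the GCLI terminal, the two command lines can be prepared in advance. The following is a small Python sketch that only formats the strings; it does not connect to the gateway or run gcli. The helper name and the sample identifiers are hypothetical, and it assumes the copied Resource UUID is the value passed as <resourceId>.

```python
from typing import List


def gcli_commands(resource_id: str, template_id: str) -> List[str]:
    """Build the two command lines from this section for pasting at the gcli> prompt."""
    resource_id = resource_id.strip()
    template_id = template_id.strip()
    if not resource_id or not template_id:
        raise ValueError("Both a resource ID and a template ID are required")
    return [
        f"sdkgetresourceconfig {resource_id}",
        f"sdkgetmonitoringconfig {template_id}",
    ]


if __name__ == "__main__":
    # Placeholder identifiers; replace with the values for the affected resource.
    for line in gcli_commands(" example-resource-uuid ", "example-template-id"):
        print(line)
```

Stripping surrounding whitespace guards against stray spaces picked up when copying the UUID from the Resource Details page.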