Overview

Resource availability is a state of a resource. It is determined from alerts on the availability metric after the resource is onboarded to OpsRamp.

OpsRamp continuously monitors resources and keeps track of all metric samples. When the availability metric reaches its critical threshold limit, an alert is raised, and the availability state of the resource changes based on that alert.

Availability Calculation

Most OpsRamp out-of-the-box templates include one or two metrics for availability calculation. These metrics help identify the availability state of a resource.

How to Configure the Availability

Follow the steps to configure the availability:

  1. Select any template of your choice and Edit the template.
  2. Go to the Metrics section and select the metric that is most important for the resource.
  3. Select the Availability checkbox and Save the template.
  4. Apply the Template to the resources.

Availability States

Availability State  Description                                                   Color Indication
UP                  No critical alert on availability metrics.                    GREEN
DOWN                Critical alert on availability metrics.                       RED
UNKNOWN             Data samples are not available for the availability metrics.  GREY
UNDEFINED           No availability metric on the resource.                       BROWN
UNMONITORED         These resources are not supported for monitoring.             LIGHT TEAL

Every onboarded resource in your client falls into one of the above states.
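
As an illustration only, the state table above can be read as a small decision function. The names `AvailabilityState` and `classify`, the input flags, and the order in which the checks run are assumptions made for this sketch. The table does not say which state wins when a metric has both a critical alert and missing samples, so the sketch checks the alert first.

```python
from enum import Enum


class AvailabilityState(Enum):
    # Each value is the color indication listed in the state table.
    UP = "GREEN"
    DOWN = "RED"
    UNKNOWN = "GREY"
    UNDEFINED = "BROWN"
    UNMONITORED = "LIGHT TEAL"


def classify(monitoring_supported: bool,
             availability_metric_count: int,
             samples_available: bool,
             critical_alert: bool) -> AvailabilityState:
    """Map simplified resource facts to an availability state.

    The precedence of the checks is an assumption of this sketch.
    """
    if not monitoring_supported:
        return AvailabilityState.UNMONITORED
    if availability_metric_count == 0:
        return AvailabilityState.UNDEFINED
    if critical_alert:
        return AvailabilityState.DOWN
    if not samples_available:
        return AvailabilityState.UNKNOWN
    return AvailabilityState.UP


state = classify(True, 1, True, False)
print(state.name, state.value)
```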

Availability Rules

When you apply a template, the first option, ALL, is selected by default, but you can change it to ANY if you prefer. To change it, select the resource, click the Monitors tab on the right side, and then click Availability Rule.

The availability rule has two options:

  • ALL: The resource is considered UP (OK) only if none of the availability metrics has a critical alert. If any availability metric has a critical alert, the resource is considered DOWN.
  • ANY: The resource is considered UP (OK) if at least one availability metric has no critical alert. If all availability metrics have a critical alert, the resource is considered DOWN.

The following options are displayed, and you can switch between them:

  • Resource is UP, if ALL availability metrics are OK. Otherwise, the resource is DOWN.
  • Resource is UP, if ANY availability metric is OK. Otherwise, the resource is DOWN.

Possible States for Availability Rule

The tables below show the state of a resource for every combination of availability metric results, assuming the resource has two availability metrics.

How is the state calculated for the ALL rule?

Resource is UP, if ALL availability metrics are OK. Otherwise, the resource is DOWN.

Resource    Metric Sample #1  Metric Sample #2  Sample #1 Critical Alert?  Sample #2 Critical Alert?  Availability
Resource A  Collected         Collected         No                         No                         UP
Resource A  Collected         Collected         Yes                        Yes                        DOWN
Resource A  Collected         Collected         Yes                        No                         DOWN
Resource A  Collected         Not collected     Yes                        N/A                        DOWN
Resource A  Collected         Not collected     No                         N/A                        UNKNOWN
Resource A  Not collected     Not collected     Yes                        N/A                        DOWN
Resource A  Not collected     Not collected     N/A                        N/A                        UNKNOWN

How is the state calculated for the ANY rule?

Resource is UP, if ANY availability metric is OK. Otherwise, the resource is DOWN.

Resource    Metric Sample #1  Metric Sample #2  Sample #1 Critical Alert?  Sample #2 Critical Alert?  Availability
Resource A  Collected         Collected         No                         No                         UP
Resource A  Collected         Collected         Yes                        Yes                        DOWN
Resource A  Collected         Collected         Yes                        No                         UP
Resource A  Collected         Not collected     Yes                        N/A                        UP
Resource A  Collected         Not collected     No                         N/A                        UP
Resource A  Not collected     Not collected     Yes                        N/A                        UNKNOWN
Resource A  Not collected     Not collected     N/A                        N/A                        UNKNOWN
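
The two tables above can be reproduced with a short function. This is a sketch inferred from the table rows, not OpsRamp's implementation: the names `MetricStatus` and `resource_state` are hypothetical, an N/A alert is treated as "no critical alert", and the order of the checks was chosen so that every row in both tables comes out as listed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MetricStatus:
    collected: bool       # was a sample collected for this metric?
    critical_alert: bool  # open critical alert? (N/A is treated as False)


def resource_state(rule: str, metrics: List[MetricStatus]) -> str:
    """Return UP, DOWN, UNKNOWN, or UNDEFINED for the given rule."""
    if not metrics:
        # No availability metric on the resource.
        return "UNDEFINED"

    any_critical = any(m.critical_alert for m in metrics)
    all_critical = all(m.critical_alert for m in metrics)
    any_collected = any(m.collected for m in metrics)
    all_collected = all(m.collected for m in metrics)

    if rule == "ALL":
        # A single critical alert is enough to mark the resource DOWN.
        if any_critical:
            return "DOWN"
        return "UP" if all_collected else "UNKNOWN"
    if rule == "ANY":
        # With no samples at all, the state cannot be determined.
        if not any_collected:
            return "UNKNOWN"
        return "DOWN" if all_critical else "UP"
    raise ValueError(f"unsupported rule: {rule!r}")


# One metric is healthy, the other has a critical alert.
mixed = [MetricStatus(True, True), MetricStatus(True, False)]
print(resource_state("ALL", mixed))
print(resource_state("ANY", mixed))
```

The same input yields different states under the two rules, which is the practical difference to weigh when choosing between them.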

When to use the ALL availability rule?

Choose this rule if every availability metric matters and you expect all of them to stay healthy, that is, every metric sample stays below its critical threshold limit.
With this rule, the resource is in the UP state only when all availability metrics are below the critical threshold limit.

When to use the ANY availability rule?

Choose this rule if it is enough for any one of the availability metrics to be healthy, that is, at least one metric sample stays below its critical threshold limit.
With this rule, the resource is in the UP state as long as any one of the availability metrics is below the critical threshold limit.

Resource Availability Score

The resource availability score is calculated based on the state of the availability metric.
Example: If a resource is DOWN for some time, the overall resource availability score is reduced.

Availability Score (%) = 100 - (Downtime Score)

The downtime score is determined by critical alerts on the availability metrics, in combination with the availability rule on the resource.
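
To make the formula concrete, here is a minimal sketch. This page does not define how the downtime score is computed, so the sketch assumes it is the percentage of a reporting window that the resource spent in the DOWN state. The function name `availability_score` and its parameters are hypothetical.

```python
def availability_score(down_seconds: float, window_seconds: float) -> float:
    """Availability Score (%) = 100 - (Downtime Score).

    Assumption: the downtime score is the share of the reporting
    window spent in the DOWN state, expressed as a percentage.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    if not 0 <= down_seconds <= window_seconds:
        raise ValueError("down_seconds must be within the window")
    downtime_score = 100.0 * down_seconds / window_seconds
    return 100.0 - downtime_score


# 36 minutes DOWN in a 24-hour window.
print(availability_score(36 * 60, 24 * 60 * 60))  # 97.5
```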

Generate Alerts for Resources with an Unknown Availability State

When does a resource go into the UNKNOWN availability state?

When a template with at least one availability metric is applied to a resource, the resource goes into the UNKNOWN state if no data sample has been collected for the metric(s) in the last 30 minutes.

How will the user know if a resource goes into an UNKNOWN availability state?

If a resource's availability state changes to UNKNOWN, a client-level critical alert is generated every 30 minutes.

The critical alert will contain the link to the list of resources that have no monitoring data for the last 30 minutes. When you click the link, the Infrastructure > Resources page is displayed, with the list of UNKNOWN resources.

The alert is healed automatically when all the resources in the alert move out of the UNKNOWN state.
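
The 30-minute check and the heal condition can be sketched as follows. This is an illustration under stated assumptions, not OpsRamp's code: the function names, the dictionary of last-sample timestamps, and the action labels "raise", "heal", and "none" are all hypothetical, and a sample exactly 30 minutes old is treated as still fresh.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional

UNKNOWN_AFTER = timedelta(minutes=30)


def unknown_resources(last_sample_at: Dict[str, Optional[datetime]],
                      now: datetime) -> List[str]:
    """List resources with no sample in the last 30 minutes."""
    return sorted(
        name
        for name, ts in last_sample_at.items()
        if ts is None or now - ts > UNKNOWN_AFTER
    )


def alert_action(alert_open: bool, unknown: List[str]) -> str:
    """Decide what one 30-minute evaluation cycle does with the alert."""
    if unknown:
        return "raise"  # critical alert listing the UNKNOWN resources
    return "heal" if alert_open else "none"


now = datetime(2024, 1, 1, 12, 0)
samples = {
    "web-01": now - timedelta(minutes=5),
    "db-01": now - timedelta(minutes=45),
    "cache-01": None,  # never reported a sample
}
stale = unknown_resources(samples, now)
print(stale, alert_action(False, stale))
```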

This alerting option is disabled by default. You can enable or disable it from the Setup > Accounts > Clients page.


See Create a Client for more information.