Supported Target Version
Rubrik cluster software version: 8.0.2-p2-22662

Application Version and Upgrade Details

Application Version | Bug fixes / Enhancements
2.0.2 | Added support for NativeType display order changes and resource grouping by type in the UI.
2.0.3 | Bug fix for an intermittent metrics issue.
2.0.1 | Added Metric Labels support; missing component alerts; changed the metric instance name to the resource name for single-instance metrics.
2.0.0 | API statistics metric; full discovery support; included "ObjectName" and "ObjectType" in the alert description so that Event Polling alerts can be correlated by alert description or alert subject.
1.0.0 | Initial SDK 2.0 app discovery and monitoring implementations.

Introduction

Rubrik simplifies backup and recovery for hybrid cloud environments. By combining data orchestration, catalog management, and deduplicated storage into a single software platform, it removes the complexity of legacy backup systems. Enterprises can use Rubrik’s API-first software to drive automation and unlock the cloud for long-term data retention or disaster recovery. Rubrik is designed to be vendor-neutral and supports the leading operating systems, databases, hypervisors, clouds, and SaaS apps.

Rubrik helps organizations maintain data integrity, keep data available under challenging circumstances, continuously track data risks and threats, and restore business data when infrastructure is attacked.

Key Use cases

Discovery Use cases

  • Discovers the Rubrik Cluster components.
  • Publishes relationships between resources to provide a topological view and ease maintenance.

Monitoring Use cases

  • Provides metrics related to job scheduling time, job status, and so on.
  • Generates alerts for each metric and notifies administrators about the issue with the resource.

Prerequisites

  • OpsRamp Classic Gateway 14.0.0 and above.
  • OpsRamp NextGen Gateway 14.0.0 and above.
    Note: OpsRamp recommends using the latest Gateway version for full coverage of recent bug fixes, enhancements, etc.
  • The provided IP address/hostname and credentials must work with the Rubrik REST APIs.
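
The last prerequisite can be checked before installing the application. The following is a minimal sketch, not part of the integration itself: it assumes the cluster exposes a basic-auth endpoint at /api/v1/cluster/me on port 443, and the function names and host/credential values are hypothetical. Run it from a host with the same network access as the Gateway.

```python
import base64
import urllib.request


def build_cluster_request(host: str, username: str, password: str,
                          port: int = 443) -> urllib.request.Request:
    """Build (but do not send) a basic-auth GET request to a Rubrik cluster."""
    # Assumed endpoint path; adjust it to the API version your cluster exposes.
    url = f"https://{host}:{port}/api/v1/cluster/me"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


def check_access(request: urllib.request.Request, timeout: float = 10.0) -> bool:
    """Return True if the cluster answers the request with HTTP 200."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError both derive from OSError, so this covers
        # unreachable hosts, TLS failures, and 401/403 responses alike.
        return False
```

For example, check_access(build_cluster_request("rubrik.example.com", "admin", "secret")) returns False when the host is unreachable or the credentials are rejected.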

Hierarchy of Rubrik resources

  - Rubrik Cluster
         - Rubrik Node
                    - Rubrik Disk
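
As an illustration of the hierarchy above, the three levels can be modeled as a small tree and flattened into parent/child relationships. This is a sketch only; the class and function names are hypothetical and do not reflect how the application stores resources.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    name: str


@dataclass
class Node:
    name: str
    disks: list[Disk] = field(default_factory=list)


@dataclass
class Cluster:
    name: str
    nodes: list[Node] = field(default_factory=list)


def relationships(cluster: Cluster) -> list[tuple[str, str]]:
    """Flatten the tree into (parent, child) pairs, top-down."""
    pairs = []
    for node in cluster.nodes:
        pairs.append((cluster.name, node.name))
        for disk in node.disks:
            pairs.append((node.name, disk.name))
    return pairs
```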

Supported Metrics

Native Type | Metric Name | Display Name | Unit | Application Version | Description
Rubrik Cluster | rubrik_cluster_runwayRemaining | Rubrik Cluster Runway Remaining | Days | 1.0.0 | Number of days remaining before the system fills up.
Rubrik Cluster | rubrik_cluster_Status | Rubrik Cluster Status | - | 1.0.0 | Status of the Rubrik cluster.
Rubrik Cluster | rubrik_cluster_StorageUsage | Rubrik Cluster Storage Usage | GB | 1.0.0 | Used storage of the Rubrik cluster.
Rubrik Cluster | rubrik_cluster_StorageUtilization | Rubrik Cluster Storage Utilization | % | 1.0.0 | Storage utilization of the Rubrik cluster.
Rubrik Cluster | rubrik_cluster_PhysicalDataIngestion | Rubrik Cluster Physical Data Ingestion | Bytes/sec | 1.0.0 | Physical data ingestion of the Rubrik cluster.
Rubrik Cluster | rubrik_cluster_ReadIOPS | Rubrik Cluster Read IOPS | IOPS | 1.0.0 | Read IOPS of the Rubrik cluster.
Rubrik Cluster | rubrik_cluster_WriteIOPS | Rubrik Cluster Write IOPS | IOPS | 1.0.0 | Write IOPS of the Rubrik cluster.
Rubrik Cluster | rubrik_cluster_ReadIOThroughput | Rubrik Cluster Read IO Throughput | Bytes/sec | 1.0.0 | Read IO throughput statistics of the Rubrik cluster.
Rubrik Cluster | rubrik_cluster_WriteIOThroughput | Rubrik Cluster Write IO Throughput | Bytes/sec | 1.0.0 | Write IO throughput statistics of the Rubrik cluster.
Rubrik Cluster | rubrik_task_SuccessCount | Rubrik Task Success Count | count | 1.0.0 | Success count of tasks on the Rubrik cluster.
Rubrik Cluster | rubrik_task_FailureCount | Rubrik Task Failure Count | count | 1.0.0 | Failure count of tasks on the Rubrik cluster.
Rubrik Cluster | rubrik_job_SuccessCount | Rubrik Job Success Count | count | 1.0.0 | Success count of jobs run in the last 24 hours.
Rubrik Cluster | rubrik_job_FailureCount | Rubrik Job Failure Count | count | 1.0.0 | Failure count of jobs run in the last 24 hours.
Rubrik Cluster | rubrik_job_ActiveCount | Rubrik Job Active Count | count | 1.0.0 | Active jobs running in the last 24 hours.
Rubrik Cluster | rubrik_job_CanceledCount | Rubrik Job Canceled Count | count | 1.0.0 | Canceled jobs in the last 24 hours.
Rubrik Cluster | rubrik_cluster_RegisteredHostStatus | Rubrik Cluster Registered Host Status | - | 1.0.0 | Connection status of hosts registered to the Rubrik cluster.
Rubrik Cluster | rubrik_resource_APIStats | Rubrik API Statistics | count | 2.0.0 | Provides the number of API calls and resources made within the frequency.
Rubrik Cluster | rubrik_event_Statistics | Rubrik Event Statistics | - | 1.0.0 | Provides the count of events polled within the frequency.
Rubrik Node | rubrik_node_Status | Rubrik Node Status | - | 1.0.0 | Status of the Rubrik cluster node.
Rubrik Node | rubrik_node_ReadIOPS | Rubrik Node Read IOPS | IOPS | 1.0.0 | Rubrik cluster node read IOPS.
Rubrik Node | rubrik_node_WriteIOPS | Rubrik Node Write IOPS | IOPS | 1.0.0 | Rubrik cluster node write IOPS.
Rubrik Node | rubrik_node_ReadIOThroughput | Rubrik Node Read IO Throughput | Bytes/sec | 1.0.0 | Rubrik cluster node read IO throughput.
Rubrik Node | rubrik_node_WriteIOThroughput | Rubrik Node Write IO Throughput | Bytes/sec | 1.0.0 | Rubrik cluster node write IO throughput.
Rubrik Disk | rubrik_disk_Status | Rubrik Disk Status | - | 1.0.0 | Status of the Rubrik cluster node disk.
Rubrik Disk | rubrik_disk_Usage | Rubrik Disk Usage | GB | 1.0.0 | Rubrik cluster node disk usage.
Rubrik Disk | rubrik_disk_Utilization | Rubrik Disk Utilization | % | 1.0.0 | Rubrik cluster node disk utilization.

Default Monitoring Configurations

OpsRamp provides default Global Device Management Policies, Global Templates, Global Monitors, and Global Metrics for Rubrik. You can customize these default monitoring configurations for your business use cases by cloning the respective Global Templates and Global Device Management Policies. OpsRamp recommends doing this before installing the application to avoid noisy alerts and data.

  1. Default Global Device Management Policies

    OpsRamp has a Global Device Management Policy for each Native Type of Rubrik Cluster. You can find these Device Management Policies at Setup > Resources > Device Management Policies by searching for the suggested names in the global scope. Each Device Management Policy follows the naming convention below:

    {appName nativeType - version}

    Ex: rubrik Rubrik Cluster - 1 (i.e., appName = rubrik, nativeType = Rubrik Cluster, version = 1)

  2. Default Global Templates

    OpsRamp has a Global Template for each Native Type of Rubrik Cluster. You can find these templates at Setup > Monitoring > Templates by searching for the suggested names in the global scope. Each template follows the naming convention below:

    {appName nativeType 'Template' - version}

    Ex: rubrik Rubrik Cluster Template - 1 (i.e., appName = rubrik, nativeType = Rubrik Cluster, version = 1)

  3. Default Global Monitors

    OpsRamp has a Global Monitor for each Native Type that has monitoring support. You can find these monitors at Setup > Monitoring > Monitors by searching for the suggested names in the global scope. Each monitor follows the naming convention below:

    {monitorKey appName nativeType - version}

    Ex: Rubrik Cluster Monitor rubrik Rubrik Cluster 1 (i.e., monitorKey = Rubrik Cluster Monitor, appName = rubrik, nativeType = Rubrik Cluster, version = 1)
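
The three naming conventions above can be sketched as small helpers for building the search strings. This is an illustration that applies the stated patterns literally; the function names are hypothetical, and the names shown in your OpsRamp tenant may differ slightly in punctuation, so treat the output as a search hint.

```python
def policy_name(app_name: str, native_type: str, version: int) -> str:
    """{appName nativeType - version}"""
    return f"{app_name} {native_type} - {version}"


def template_name(app_name: str, native_type: str, version: int) -> str:
    """{appName nativeType 'Template' - version}"""
    return f"{app_name} {native_type} Template - {version}"


def monitor_name(monitor_key: str, app_name: str, native_type: str,
                 version: int) -> str:
    """{monitorKey appName nativeType - version}"""
    return f"{monitor_key} {app_name} {native_type} - {version}"
```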

Configure and Install the Rubrik Integration

  1. From All Clients, select a client.
  2. Go to Setup > Account.
  3. Select the Integrations and Apps tab.
  4. The Installed Integrations page appears, displaying all the installed applications.
    Note: If there are no installed applications, it will navigate to the ADD APP page.
  5. Click + ADD on the Installed Integrations page. The Available Integrations and Apps page displays all the available applications along with the newly created application with the version.
    Note: You can also search for the application using the search option or the All Categories option.
  6. Click ADD in the Rubrik application.
  7. In the Configuration page, click + ADD. The Add Configuration page appears.
  8. Enter the following BASIC INFORMATION:
Functionality | Description
Name | Enter the name for the configuration.
Rubrik Cluster IP Address/Host Name | Enter the host name or the IP address.
Rubrik REST API Port | API port information.
Credential | Select the credentials from the drop-down list. Note: Click + Add to create a credential.

Notes:

  • By default, the Is Secure checkbox is selected.
  • Rubrik Cluster IP Address/Host Name and Rubrik REST API Port must be accessible from the Gateway.
  • Select the following:
    • App Failure Notifications: if enabled, an alert is raised for connectivity and authentication exceptions:
      • Discovery - the alert is raised on the gateway resource that is registered with the application.
      • Monitoring - the alert is raised on the particular Rubrik resource.
    • Alert Configuration: enables integrating third-party alerts into OpsRamp using further configurations.
  • Below are the default values set for:
    • Alert Severity: specifies which alert severities are integrated, out of all possible severities.
      • Default Values: Critical, Warning.
      • Possible Values: Critical, Warning.
    • Alert Severity Mapping: enables you to map the severities between Rubrik and OpsRamp, as severities are predefined values in each system.
      • Possible values of the Alert Severity Mapping Filter configuration property are {"Critical":"Critical","Warning":"Warning"}
        Note: You can change it as per your business use cases at any point in time from the Configuration page.
  9. Select the following Custom Attribute:
Functionality | Description
Custom Attribute | Select the custom attribute from the drop-down list.
Value | Select the value from the drop-down list.

Note: The custom attribute that you add here will be assigned to all the resources that are created by the integration. You can add a maximum of five custom attributes (key and value pair).

  10. In the RESOURCE TYPE section, select:
    • ALL: All the existing and future resources will be discovered.
    • SELECT: You can select one or multiple resources to be discovered.
  11. In the DISCOVERY SCHEDULE section, select Recurrence Pattern to add one of the following patterns:
    • Minutes
    • Hourly
    • Daily
    • Weekly
    • Monthly
  12. Click ADD.

The configuration is now saved and displayed on the Configurations page.
Note: From the same page, you may Edit and Remove the created configuration.

  13. Click Next.
  14. Optionally, perform the following steps on the Installation page.
  • Under ADVANCED SETTINGS, select the Bypass Resource Reconciliation option if you want to bypass resource reconciliation when the same resources are discovered by multiple applications.

    Note: If two different applications provide identical discovery attributes, two separate resources will be generated with those respective attributes from the individual discoveries.

  • Click +ADD to create a new collector by providing a name or use the pre-populated name.
  15. Select an existing registered profile.
  16. Click FINISH.

The integration is now installed and displayed on the Installed Integrations page. Use the search field to find the installed application.

Modify the Configuration

View the Rubrik details

The discovered resources are displayed under Infrastructure > Resources > Server, with Native Resource Type as Rubrik Node. Navigate to the Attributes tab to view the discovery details, and to the Metrics tab to view the metric details for Rubrik Node.


Resource Type Filter Keys

Rubrik application resources are filtered and discovered based on the keys below:

Resource Type | Supported Input Keys
All Types | resourceName, hostName, aliasName, dnsName, ipAddress, macAddress, os, make, model, serialNumber
Rubrik Cluster | Version, API Version, Registered Mode, Timezone
Rubrik Disk | Disk Type, Node Id, path
Rubrik Node | BrikId
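
The sketch below illustrates key-based filtering of discovered resources. How the application actually compares values (exact match, contains, or pattern) is not stated in this document; the example assumes a case-insensitive exact match, and the resource dictionaries and function names are hypothetical.

```python
def matches(resource: dict, filters: dict) -> bool:
    """True if every filter key is present with an equal value (case-insensitive)."""
    return all(
        key in resource and str(resource[key]).lower() == str(value).lower()
        for key, value in filters.items()
    )


def select_resources(resources: list, filters: dict) -> list:
    """Keep only the resources that satisfy all filter keys."""
    return [r for r in resources if matches(r, filters)]
```

For example, filtering on {"Disk Type": "SSD"} keeps only disk resources whose Disk Type attribute is SSD; resources without that key are excluded.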

Risks, Limitations & Assumptions

  • The application can handle Critical/Recovery failure notifications for the following two cases when the user enables App Failure Notifications in the configuration:
    • Connectivity Exception
    • Authentication Exception
  • The application will not send duplicate or repeated failure alert notifications until the existing critical alert is recovered.
  • The application uses metrics to monitor resources and generates alerts when threshold values are breached.
  • The application cannot control monitoring pause/resume actions based on the above alerts.
  • Showing the activity log and applied time is not supported.
  • Event polling generates Critical alerts based on “event_series_status” category “Failure” events.
  • Event monitoring cannot relate a Failure event to a subsequent heal event, so the application has no healing mechanism for these alerts. The customer must heal them manually in every case.
  • This application supports both Classic Gateway and NextGen Gateway.
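
The Critical/Recovery behavior for failure notifications described in the list above can be sketched as a small state tracker. This is a hypothetical illustration of the duplicate-suppression rule only, not the application's code; the class and method names are invented for the example.

```python
from typing import Optional


class FailureNotifier:
    """Tracks open failures so a repeated failure does not raise a second alert."""

    def __init__(self) -> None:
        self._open = set()  # (resource, failure_type) pairs with an open Critical alert

    def report(self, resource: str, failure_type: str, failed: bool) -> Optional[str]:
        """Return "Critical", "Recovery", or None when nothing should be sent."""
        key = (resource, failure_type)
        if failed:
            if key in self._open:
                return None  # duplicate suppressed until the alert recovers
            self._open.add(key)
            return "Critical"
        if key in self._open:
            self._open.remove(key)
            return "Recovery"
        return None  # healthy and no open alert: nothing to send
```

Calling report three times for the same gateway with a failure, another failure, and then a success yields "Critical", None, and "Recovery" in that order.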