Scale Up Operation
The Scale Up operation in the Elastic Collector Profiles feature lets users expand gateway collector capacity dynamically when a single replica is insufficient to manage large-scale resources. This document outlines the best use cases for the Scale Up operation, how it works, and key considerations to ensure optimal performance.
Use Case Overview
When a single Elastic Collector Profile is overloaded due to managing a large number of resources, Scale Up ensures that new replicas are created within the cluster to handle the load efficiently. This feature provides automatic new replica deployment, registration, and resource redistribution among the newly created replicas.
Key Features of Scale Up Operation
- Automatic Replica Deployment: When the Scale Up action is triggered from the OpsRamp Collector Profiles UI, a new replica is deployed in the cluster.
- Replica Auto-Registration: The newly created replica is automatically registered with the cloud.
- Resource Redistribution: Resources managed by the master replica are equally redistributed across all replicas, ensuring balanced workload distribution.
- Sequential Scaling: A Scale Up operation must complete successfully before another Scale Up can be triggered.
- Naming Convention for Replicas: New replicas follow a standardized naming pattern:
- If the Master Replica Profile is named OpsRamp_Elastic profile, then the replicas will be named sequentially as:
- OpsRamp_Elastic_nextgen_gw_1
- OpsRamp_Elastic_nextgen_gw_2
- OpsRamp_Elastic_nextgen_gw_3, and so on.
- Monitoring & Discovery:
- Discovery happens through the Master Replica.
- Monitoring is performed across all replica profiles.
- Failure Handling:
- If the ScaleUp operation fails, an alert is generated on the Master Replica Gateway Device.
- The alert contains failure reasons, and users can take corrective actions accordingly.
- The status of the last Scale Up operation remains visible against the Elastic profile for up to 15 minutes.
- Manual Rebalancing:
If automatic redistribution/rebalancing of resources fails, users can manually rebalance resources via the Rebalance option in the Collector Profile Actions.
Resource Distribution Scenarios
Scenario 1: Equal Distribution at Discovery
- When new resources are discovered, they are equally distributed among all existing replicas.
- Example:
If 3 replicas (including the master) exist and 1000 new resources are discovered, two replicas receive 333 resources each and one receives 334; which replica gets the extra resource may vary (for example, 334 to the Master Replica, 333 to Replica 1, and 333 to Replica 2). A minimal sketch of this split follows.
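The following Python sketch (illustrative only, not OpsRamp product code; the helper name split_equally is hypothetical) shows how an equal split with a remainder produces counts such as 334/333/333:

```python
# Minimal sketch: split newly discovered resources equally across replicas,
# handing out the remainder one extra resource per replica until exhausted.
def split_equally(total_resources: int, replica_count: int) -> list[int]:
    base, remainder = divmod(total_resources, replica_count)
    return [base + 1 if i < remainder else base for i in range(replica_count)]

print(split_equally(1000, 3))  # [334, 333, 333]
```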
Scenario 2: Scaling Up When Master Replica is Overloaded
- If the Master Replica initially manages 1000 resources without any replicas:
- A ScaleUp operation creates Replica 1, and resources redistribute as:
- 500 for Master Replica
- 500 for Replica 1
- If another ScaleUp operation is performed, a second replica is created, and resources redistribute as:
- 334 for Master Replica
- 333 for Replica 1
- 333 for Replica 2
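The sketch below (illustrative only; it reuses the hypothetical split_equally helper from Scenario 1) shows how per-replica counts change as each Scale Up adds a replica to a profile holding 1000 resources:

```python
# Minimal sketch: recompute the equal split each time a Scale Up adds a replica.
def split_equally(total_resources: int, replica_count: int) -> list[int]:
    base, remainder = divmod(total_resources, replica_count)
    return [base + 1 if i < remainder else base for i in range(replica_count)]

total = 1000
for replicas in (1, 2, 3):  # master only, then after each Scale Up
    print(replicas, split_equally(total, replicas))
# 1 [1000]
# 2 [500, 500]
# 3 [334, 333, 333]
```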
Scenario 3: Unequal Load Redistribution/Rebalance
- When some replicas already manage resources and new resources are discovered:
- Newly discovered resources are assigned to the replica that is managing the fewest resources.
- Example:
If Replica 1 has 600 resources, and Replica 2 has 400 resources, newly discovered resources will first be assigned to Replica 2 to balance the load.
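A minimal sketch of this least-loaded assignment (illustrative only; the function name assign_new_resources and the in-memory load model are assumptions, not the gateway's actual logic):

```python
import heapq

# Minimal sketch: each newly discovered resource goes to whichever replica
# currently manages the fewest resources, so lightly loaded replicas catch up.
def assign_new_resources(current_loads: dict[str, int], new_count: int) -> dict[str, int]:
    loads = dict(current_loads)
    heap = [(count, name) for name, count in loads.items()]
    heapq.heapify(heap)
    for _ in range(new_count):
        count, name = heapq.heappop(heap)
        loads[name] = count + 1
        heapq.heappush(heap, (count + 1, name))
    return loads

print(assign_new_resources({"Replica 1": 600, "Replica 2": 400}, 100))
# {'Replica 1': 600, 'Replica 2': 500}
```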
Limitations & Important Considerations
- Sufficient Cluster Resources: Ensure the cluster has enough CPU, memory, and disk space before scaling up.
- Single Integration Point: The Master Replica Profile should be used for all app/integration installations.
- Scaling Frequency Restriction: A new Scale-Up operation cannot be initiated until the previous one has successfully completed.
- Elastic Profile Boundaries: Resources can only be redistributed/rebalanced within the same Elastic Profile and not to another profile.
- Standard Profiles Exclusion: Standard Collector Profiles do not support Scale-Up or Rebalance operations.
Rebalance Operation
The Rebalance operation in the Elastic Collector Profiles feature allows users to manually redistribute resources when automatic distribution fails after a Scale Up operation. This ensures efficient load balancing among all replicas within an Elastic Collector Profile. This document outlines the best use cases, functionality, and key considerations for using the Rebalance option.
Use Case Overview
The Rebalance feature is useful when:
- A ScaleUp operation successfully creates a new replica, but automatic resource redistribution fails.
- Users need to manually redistribute/rebalance resources across all replicas in an Elastic Collector Profile to maintain an optimal balance.
Key Features of Rebalance Operation
- Manual Redistribution: If automatic resource distribution fails after a Scale Up operation, users can manually trigger Rebalance via the OpsRamp Collector Profiles UI.
- Formula-Based Redistribution: Resources are distributed using the formula:
Total Resources in Elastic Profile ÷ Total Replicas in Elastic Profile
- Replica-Aware Distribution: Resources are evenly divided among all available replicas, including the Master Replica and newly created replicas.
Rebalance Process & Example
- Scenario: Automatic Redistribution Failure After Scale-Up
- A user performs a ScaleUp operation.
- A new Replica Profile is successfully created and registered to the cloud.
- Automatic resource redistribution fails, causing an imbalance in the workload.
- The user triggers the Rebalance option to manually redistribute resources.
- Rebalance Calculation Example
- Suppose an Elastic Collector Profile manages 1000 resources with 3 replicas (including the master replica).
- The Rebalance operation distributes resources equally among all replicas using the formula:
1000 ÷ 3 = 333 each, with the remainder of 1 assigned to one replica (334, 333, 333)
- The resource allocation will be:
- Master Replica: 334 resources
- Replica 1: 333 resources
- Replica 2: 333 resources
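The rebalance arithmetic can be sketched as follows (illustrative only; which replica absorbs the remainder may vary, and the function name rebalance is hypothetical):

```python
# Minimal sketch: total resources divided by total replicas, with the
# remainder handed out one resource at a time starting from the first replica.
def rebalance(total_resources: int, replica_names: list[str]) -> dict[str, int]:
    base, remainder = divmod(total_resources, len(replica_names))
    return {name: base + (1 if i < remainder else 0)
            for i, name in enumerate(replica_names)}

print(rebalance(1000, ["Master Replica", "Replica 1", "Replica 2"]))
# {'Master Replica': 334, 'Replica 1': 333, 'Replica 2': 333}
```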
Limitations & Considerations
- Rebalance is only available in Elastic Collector Profiles. Standard profiles do not support this option.
- Ensure sufficient cluster resources (CPU, memory, disk) before triggering a Rebalance.
- Rebalance applies only to existing replicas and does not create new ones.
- Users must wait for the previous Scale-Up/Rebalance operation to complete before initiating another.
Scale Down Operation
The Scale Down operation in the Elastic Collector Profiles feature allows users to remove the most recently deployed replica when system resource utilization is consistently low. This helps optimize resource usage while ensuring efficient workload distribution among the remaining replicas. This document outlines the best use cases, functionality, and key considerations for using the Scale Down option.
Use Case Overview
The Scale Down feature is useful when:
- CPU, memory, or network usage is consistently low, and reducing replicas helps save cluster resources.
- The number of managed resources per replica is minimal, and the user wants to free up cluster resources.
Key Features of Scale Down Operation
- Most Recent Replica Removal: The last deployed replica is removed first when Scale Down is performed.
- Automatic Reallocation: Resources from the deleted replica are redistributed among the remaining active replicas.
- Cluster Resource Optimization: Helps in optimizing CPU, memory, and network usage by reducing unnecessary replicas.
Scale Down Process & Example
- Initial State:
- The cluster has 3 replicas (including the master replica).
- Each replica manages 200 resources:
- Master Replica: 200 resources
- Replica 1: 200 resources
- Replica 2: 200 resources
- Triggering Scale Down:
- The user performs a Scale Down operation.
- The most recently created replica (Replica 2) is deleted.
- The resources from Replica 2 are redistributed to the remaining replicas.
- Final State After Scale Down:
- The remaining 2 replicas now manage 300 resources each:
- Master Replica: 300 resources
- Replica 1: 300 resources
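A minimal sketch of this Scale Down redistribution (illustrative only; the function name scale_down and the assumption that the last dictionary entry is the most recently created replica are hypothetical, not the gateway's actual logic):

```python
# Minimal sketch: remove the most recently created replica and spread its
# resources across the remaining replicas, always topping up the least loaded.
def scale_down(loads: dict[str, int]) -> dict[str, int]:
    names = list(loads)
    removed = names[-1]                  # most recently created replica
    freed = loads[removed]
    remaining = {name: loads[name] for name in names[:-1]}
    for _ in range(freed):
        target = min(remaining, key=remaining.get)
        remaining[target] += 1
    return remaining

print(scale_down({"Master Replica": 200, "Replica 1": 200, "Replica 2": 200}))
# {'Master Replica': 300, 'Replica 1': 300}
```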
Limitations & Considerations
- Only the most recently created replica is removed—you cannot specify which replica to delete.
- Scale Down is available only in Elastic Collector Profiles; standard profiles do not support this feature.
- The option is not visible when only one replica (Master Replica) exists.
- Cluster resources must be evaluated before scaling down to avoid overloading remaining replicas.
- Once a replica is removed, it cannot be restored—a new Scale Up operation would be required to add it back.