General Scaling

Core concepts and decision criteria for NodeGroup scaling operations.

Scaling Timeline

Scale-Up Operations (5-8 minutes total):

  • Initiation: 0-30 seconds (validation and planning)
  • Resource allocation: 30 seconds - 2 minutes (VM provisioning)
  • Node bootstrap: 2-5 minutes (OS initialization and Kubernetes registration)
  • Health checks: 1-2 minutes (readiness verification)

Scale-Down Operations (3-5 minutes total):

  • Workload migration: 1-3 minutes (pod eviction and rescheduling)
  • Node draining: 30 seconds - 1 minute (graceful removal)
  • Resource cleanup: 30 seconds - 1 minute (VM termination)
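
The upper bounds of the two windows above can serve as polling timeouts in automation. The following is a minimal sketch, not a platform API: wait_until_ready and the get_state callable are hypothetical names, and it assumes the NodeGroup reports "Ready" again once the operation completes.

```python
import time

SCALE_UP_TIMEOUT_S = 8 * 60    # upper bound of the 5-8 minute scale-up window
SCALE_DOWN_TIMEOUT_S = 5 * 60  # upper bound of the 3-5 minute scale-down window


def wait_until_ready(get_state, timeout_s, poll_s=15,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll get_state() until it returns "Ready" or timeout_s elapses.

    get_state is a caller-supplied callable (hypothetical; wire it to
    whatever client you use to read the NodeGroup state).
    Returns True on "Ready", False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_state() == "Ready":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

A caller would pass SCALE_UP_TIMEOUT_S after requesting more nodes and SCALE_DOWN_TIMEOUT_S after requesting fewer. The clock and sleep parameters exist only so the loop can be exercised in tests without real waiting.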

Scaling Constraints

Node Limits:

  • Minimum: 1 node per worker NodeGroup
  • Maximum: 10 nodes per NodeGroup
  • State: the NodeGroup must be in the "Ready" state before it can be scaled
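
The node limits above can be checked before a scaling request is submitted. This is a minimal sketch under stated assumptions: the NodeGroup class and validate_scale_request function are hypothetical illustrations, not part of any platform API, and the check covers only the limits listed here, not quota or IP availability.

```python
from dataclasses import dataclass

MIN_NODES = 1   # minimum per worker NodeGroup (from the limits above)
MAX_NODES = 10  # maximum per NodeGroup (from the limits above)


@dataclass
class NodeGroup:
    # Hypothetical model; field names are illustrative only.
    name: str
    state: str
    node_count: int


def validate_scale_request(group: NodeGroup, target: int) -> list[str]:
    """Return the reasons a request would be rejected (empty list if OK)."""
    problems = []
    if group.state != "Ready":
        problems.append(f'NodeGroup is in state "{group.state}", expected "Ready"')
    if target < MIN_NODES:
        problems.append(f"target {target} is below the minimum of {MIN_NODES}")
    if target > MAX_NODES:
        problems.append(f"target {target} exceeds the maximum of {MAX_NODES}")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every violated constraint at once.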

Resource Requirements:

  • vCloud quota availability for CPU, memory, and storage
  • IP address availability within assigned subnets
  • Sufficient cluster-level resources for orchestration

When to Scale

Scale Up Scenarios

  • Performance Issues: CPU utilization above 70%, memory utilization above 80%, or pods that fail to schedule
  • Capacity Planning: Traffic growth, seasonal events, or high availability requirements
  • Development: Testing and deployment activities requiring extra capacity

Scale Down Scenarios

  • Efficiency: Resource utilization below 30% across multiple nodes
  • Cost Optimization: Reducing unnecessary infrastructure expenses
  • Operational: Maintenance windows, off-peak periods, or project completion
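
The numeric thresholds in the scenarios above can be combined into a simple decision helper. This is a sketch, not a prescribed policy: NodeMetrics and recommend are hypothetical names, the scale-up check is assumed to use the average across the NodeGroup, and "multiple nodes" is interpreted here as at least two nodes with both CPU and memory below 30%.

```python
from dataclasses import dataclass


@dataclass
class NodeMetrics:
    # Utilization as percentages (0-100); hypothetical shape for illustration.
    cpu_pct: float
    memory_pct: float


def recommend(nodes: list[NodeMetrics], pending_pods: int = 0) -> str:
    """Return "scale-up", "scale-down", or "hold" for a NodeGroup."""
    if not nodes:
        return "hold"
    avg_cpu = sum(n.cpu_pct for n in nodes) / len(nodes)
    avg_mem = sum(n.memory_pct for n in nodes) / len(nodes)

    # Scale-up signals: CPU above 70%, memory above 80%, or unschedulable pods.
    if avg_cpu > 70 or avg_mem > 80 or pending_pods > 0:
        return "scale-up"

    # Scale-down signal: utilization below 30% on at least two nodes (assumption).
    idle = [n for n in nodes if n.cpu_pct < 30 and n.memory_pct < 30]
    if len(idle) >= 2:
        return "scale-down"
    return "hold"
```

Scale-up is checked first so that a capacity shortage always wins over a cost saving. The non-numeric scenarios (capacity planning, maintenance windows, project completion) remain a human decision and are not modeled.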

Scaling Strategies

Conservative Scaling

  • Add or remove 1-2 nodes at a time
  • Monitor the impact of each change before scaling further
  • Best for: Production environments, cost-sensitive workloads
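
The conservative approach can be expressed as a series of small steps toward a target size. The sketch below is illustrative only: conservative_steps is a hypothetical helper, and the monitoring pause between steps is left to the operator.

```python
def conservative_steps(current: int, target: int, max_step: int = 2) -> list[int]:
    """Return the node counts to request, in order, changing the size
    by at most max_step nodes per operation."""
    if max_step < 1:
        raise ValueError("max_step must be at least 1")
    steps = []
    count = current
    while count != target:
        # Clamp the remaining difference to the range [-max_step, +max_step].
        delta = max(-max_step, min(max_step, target - count))
        count += delta
        steps.append(count)
    return steps
```

For example, growing from 3 to 8 nodes yields the requests 5, 7, 8, and shrinking from 6 to 3 yields 4, 3. Each intermediate size is a point at which to observe the cluster before issuing the next request.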

Aggressive Scaling

  • Scale rapidly to meet immediate demand
  • Accept higher initial over-provisioning
  • Best for: High-availability requirements, variable workloads

Predictive Scaling

  • Scale based on historical patterns
  • Pre-scale before anticipated load increases
  • Best for: Scheduled workloads, known traffic patterns
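
One simple way to pre-scale from historical patterns is to build an hourly plan from past demand. This is a rough sketch, not a forecasting method endorsed by the platform: hourly_plan is a hypothetical helper, the input format of (hour_of_day, nodes_needed) samples is an assumption, and the per-hour mean is a deliberately naive predictor. The 1 and 10 node bounds come from the node limits stated earlier.

```python
import math
from collections import defaultdict


def hourly_plan(samples, lead_hours=1, min_nodes=1, max_nodes=10):
    """Build {hour: node_count} from historical (hour_of_day, nodes_needed) samples.

    For each hour, the plan covers the highest mean demand seen in that hour
    and the following lead_hours hours, so capacity is in place before the
    load arrives. Hours with no history in that window are left out.
    """
    by_hour = defaultdict(list)
    for hour, needed in samples:
        by_hour[hour % 24].append(needed)

    plan = {}
    for hour in range(24):
        window = [by_hour[(hour + k) % 24] for k in range(lead_hours + 1)]
        means = [sum(obs) / len(obs) for obs in window if obs]
        if not means:
            continue
        # Round up, then clamp to the NodeGroup limits.
        plan[hour] = max(min_nodes, min(max_nodes, math.ceil(max(means))))
    return plan
```

With a one-hour lead, a history showing 2 nodes needed at 08:00 and an average of 7 at 09:00 produces a plan that already requests 7 nodes at 08:00. A larger lead_hours trades extra cost for more headroom.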