OptScale Setup

OptScale is an open-source FinOps and cloud cost management platform originally developed by Hystax and now maintained as a comprehensive multi-cloud optimization solution. This guide covers deployment, cloud connector setup, and practical optimization workflows.

What is OptScale

OptScale, originally developed by Hystax, is an open-source FinOps platform that provides:

  • Multi-cloud cost aggregation: Unified billing across AWS, Azure, and GCP.
  • Resource discovery and inventory: Complete visibility into all cloud assets.
  • Right-sizing recommendations: ML-based suggestions for EC2, RDS, and EBS optimization.
  • Reserved Instance and Savings Plans analytics: Purchase recommendations with break-even analysis.
  • Anomaly detection: Automated detection of unusual spending patterns.
  • Cost allocation by team/project: Tag-based showback and chargeback reporting.

Open Source Advantage: OptScale's source code is available on GitHub, which allows full customization and air-gapped deployment and avoids percentage-of-spend pricing. This makes it a good fit for organizations with data-privacy requirements or those that want to avoid variable FinOps tooling costs.

Deployment Options

Deployment Mode | Best For | Effort | Maintenance
OptScale SaaS (Hosted) | Quick start; teams without Kubernetes expertise | Low (sign up and connect clouds) | None (vendor-managed)
Docker Compose (Self-Hosted) | Small to medium deployments; single-node | Medium | Self-managed updates
Kubernetes (Helm Charts) | Production-scale; high availability; large teams | High | Requires K8s operational expertise

Docker Compose Deployment Guide

This section provides a complete Docker Compose deployment for a self-hosted OptScale instance. This is the recommended starting point for teams new to OptScale.

Prerequisites

  • Linux server (Ubuntu 22.04 LTS recommended) with 8+ vCPUs, 32GB RAM, 200GB SSD
  • Docker Engine 24.0+ and Docker Compose v2
  • Public IP or domain name for the OptScale UI
  • Outgoing HTTPS access to cloud provider APIs
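
The hardware prerequisites above can be verified before anything is installed. The following is a minimal sketch, assuming a Linux host that exposes /proc/meminfo; the function names, the 5% RAM slack (a nominal 32GB machine reports slightly less usable memory), and the checked mount point are illustrative assumptions, not part of OptScale.

```python
import os
import shutil

# Thresholds taken from the prerequisites list above.
MIN_VCPUS = 8
MIN_RAM_GB = 32
MIN_DISK_GB = 200
RAM_SLACK = 0.95  # assumption: tolerate memory reserved by firmware/kernel

def read_total_ram_gb(meminfo_path="/proc/meminfo"):
    """Return total RAM in GiB from a Linux meminfo file, or None if unavailable."""
    try:
        with open(meminfo_path) as fh:
            for line in fh:
                if line.startswith("MemTotal:"):
                    kib = int(line.split()[1])  # MemTotal is reported in kB
                    return kib / (1024 * 1024)
    except OSError:
        return None
    return None

def check_host(vcpus, ram_gb, disk_gb):
    """Return a list of problems; an empty list means the host meets the minimums."""
    problems = []
    if vcpus < MIN_VCPUS:
        problems.append(f"vCPUs: {vcpus} < {MIN_VCPUS}")
    if ram_gb < MIN_RAM_GB * RAM_SLACK:
        problems.append(f"RAM: {ram_gb:.1f} GB < {MIN_RAM_GB} GB")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"Disk: {disk_gb:.1f} GB < {MIN_DISK_GB} GB")
    return problems

if __name__ == "__main__":
    ram = read_total_ram_gb() or 0.0
    disk = shutil.disk_usage("/").total / 1e9
    issues = check_host(os.cpu_count() or 0, ram, disk)
    print("Host OK" if not issues else "\n".join(issues))
```

Keeping the comparison in `check_host` separate from the probing code means the thresholds can be tested without access to the target server.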

Step 1: Install Docker Engine

#!/bin/bash
# install-docker.sh: Docker & Docker Compose installation for Ubuntu 22.04
set -euo pipefail

# Update package index
sudo apt-get update

# Install prerequisites
sudo apt-get install -y ca-certificates curl gnupg lsb-release

# Add Docker's official GPG key
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
  sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

# Add Docker repository
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker Engine and Compose plugin
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin

# Enable and start Docker
sudo systemctl enable docker
sudo systemctl start docker

# Add current user to docker group
sudo usermod -aG docker "$USER"

# Both commands already print the product name, so do not prefix it again
echo "$(docker --version) installed successfully."
echo "$(docker compose version) installed successfully."
echo "Log out and back in for group changes to take effect."

Step 2: Clone OptScale and Deploy

#!/bin/bash
# deploy-optscale.sh: Deploy OptScale via Docker Compose
set -euo pipefail

OPTSCALE_DIR="/opt/optscale"

# Clone the OptScale repository
sudo mkdir -p "$OPTSCALE_DIR"
sudo chown $(id -u):$(id -g) "$OPTSCALE_DIR"
git clone https://github.com/hystax/optscale.git "$OPTSCALE_DIR"

cd "$OPTSCALE_DIR"

# Copy and customize environment configuration
cp .env.example .env

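# Define values that are referenced again later in this script
OPTSCALE_HOST="finops.company.com"   # Your domain or IP
ADMIN_PASSWORD="$(openssl rand -base64 24)"

# Append critical variables to .env
# (the heredoc delimiter is unquoted so that $(...) and $VARS are expanded;
# a quoted 'EOF' would write the literal command text into the file)
cat >> .env <<EOF
# OptScale Configuration
OPTSCALE_HOST=$OPTSCALE_HOST
OPTSCALE_PORT=443
ADMIN_EMAIL=finops-admin@company.com
ADMIN_PASSWORD=$ADMIN_PASSWORD

# Database
POSTGRES_PASSWORD=$(openssl rand -base64 24)
MONGO_PASSWORD=$(openssl rand -base64 24)
REDIS_PASSWORD=$(openssl rand -base64 24)

# ClickHouse (analytics database)
CLICKHOUSE_PASSWORD=$(openssl rand -base64 24)

# JWT Secret for API authentication
JWT_SECRET=$(openssl rand -base64 48)
EOF
chmod 600 .env

# Temporary copy of the admin password. Move it to a secrets manager
# (e.g., AWS Secrets Manager or HashiCorp Vault) and then delete this file.
sudo install -m 600 /dev/null /opt/optscale-secrets.txt
echo "Admin password: $ADMIN_PASSWORD" | sudo tee /opt/optscale-secrets.txt > /dev/null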

# Deploy with Docker Compose
docker compose -f docker-compose.yaml up -d

# Verify services are running
echo "Checking OptScale services..."
sleep 30
docker compose ps

# Check health endpoints
curl -sf http://localhost:9000/health || echo "WARNING: Health check failed"

echo "OptScale deployment complete. Access at https://$OPTSCALE_HOST"
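
The fixed `sleep 30` in the script above is a guess at how long the containers need to start. A minimal sketch of polling instead, assuming the health endpoint used by the script (`http://localhost:9000/health`) answers with HTTP 200 once the stack is ready; `wait_for_health` and its parameters are hypothetical names introduced for illustration.

```python
import time
import urllib.error
import urllib.request

def wait_for_health(url, timeout_s=300, interval_s=5, opener=urllib.request.urlopen):
    """Poll `url` until it returns HTTP 200 or `timeout_s` elapses.

    Returns True if the endpoint became healthy, False on timeout.
    `opener` is injectable so the loop can be tested without a network.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with opener(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # service not reachable yet; retry until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_for_health("http://localhost:9000/health")` after `docker compose up -d` returns as soon as the service responds, rather than always waiting the full 30 seconds or giving up too early on a slow host.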

Step 3: Configure Nginx Reverse Proxy with SSL

# /etc/nginx/sites-available/optscale
# Nginx reverse proxy configuration for OptScale

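upstream optscale_backend {
    # Assumed port: point this at the host port your OptScale UI publishes.
    # It cannot be 80, because this Nginx instance listens on port 80 itself
    # and would otherwise proxy to its own redirect.
    server 127.0.0.1:8080;
    keepalive 32;
}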

server {
    listen 80;
    server_name finops.company.com;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name finops.company.com;

    ssl_certificate     /etc/letsencrypt/live/finops.company.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/finops.company.com/privkey.pem;
    ssl_protocols       TLSv1.2 TLSv1.3;
    ssl_ciphers         HIGH:!aNULL:!MD5;

    client_max_body_size 100M;

    location / {
        proxy_pass http://optscale_backend;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }

    location /api/ {
        proxy_pass http://optscale_backend/api/;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # WebSocket support for real-time updates
    location /ws/ {
        proxy_pass http://optscale_backend/ws/;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}

# Install Nginx and Certbot
sudo apt-get install -y nginx certbot python3-certbot-nginx

# Obtain the certificate first: the server block above references the
# certificate files, so "nginx -t" fails until they exist
sudo certbot certonly --nginx -d finops.company.com --non-interactive --agree-tos \
  -m admin@company.com

# Enable the Nginx configuration
sudo ln -sf /etc/nginx/sites-available/optscale /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx

AWS Account Connection

Connecting AWS accounts requires IAM roles and Cost and Usage Report (CUR) configuration for accurate cost data.

Step 1: Enable Cost and Usage Report (CUR)

# AWS CLI commands to set up CUR
# Run this in your AWS management (payer) account

export AWS_PROFILE=master-account
export BILLING_BUCKET="optscale-cur-reports-$(aws sts get-caller-identity --query Account --output text)"
export REPORT_PREFIX="optscale/cur"

# Create S3 bucket for CUR delivery
aws s3api create-bucket \
  --bucket "$BILLING_BUCKET" \
  --region us-east-1

# Enable versioning and encryption on the CUR bucket
aws s3api put-bucket-versioning \
  --bucket "$BILLING_BUCKET" \
  --versioning-configuration Status=Enabled

aws s3api put-bucket-encryption \
  --bucket "$BILLING_BUCKET" \
  --server-side-encryption-configuration '{
    "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
  }'

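# Allow the AWS billing service to deliver reports into the bucket
aws s3api put-bucket-policy --bucket "$BILLING_BUCKET" --policy '{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow", "Principal": {"Service": "billingreports.amazonaws.com"},
     "Action": ["s3:GetBucketAcl", "s3:GetBucketPolicy"],
     "Resource": "arn:aws:s3:::'"$BILLING_BUCKET"'"},
    {"Effect": "Allow", "Principal": {"Service": "billingreports.amazonaws.com"},
     "Action": "s3:PutObject",
     "Resource": "arn:aws:s3:::'"$BILLING_BUCKET"'/*"}
  ]
}'

# Create CUR definition (requires Billing permissions; the CUR API is served from us-east-1)
aws cur put-report-definition --region us-east-1 --report-definition '{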
  "ReportName": "optscale-cur-daily",
  "TimeUnit": "DAILY",
  "Format": "Parquet",
  "Compression": "Parquet",
  "AdditionalSchemaElements": ["RESOURCES"],
  "S3Bucket": "'"$BILLING_BUCKET"'",
  "S3Prefix": "'"$REPORT_PREFIX"'",
  "S3Region": "us-east-1",
  "AdditionalArtifacts": ["ATHENA"],
  "ReportVersioning": "OVERWRITE_REPORT"
}'

echo "CUR configured. Reports will be delivered to s3://$BILLING_BUCKET/$REPORT_PREFIX/"
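
Several fields in the report definition above depend on each other: to the best of my understanding of the CUR API, the ATHENA artifact is only accepted together with Parquet format and compression and with OVERWRITE_REPORT versioning. The sketch below checks a definition for those combinations before it is sent; `validate_cur_definition` is a hypothetical helper, the bucket name is a placeholder, and the rules should be confirmed against the current AWS documentation.

```python
import json

def validate_cur_definition(defn):
    """Return a list of problems found in a CUR report definition dict."""
    problems = []
    # Fields this guide always sets; an empty value is treated as missing.
    required = ("ReportName", "TimeUnit", "Format", "Compression",
                "S3Bucket", "S3Prefix", "S3Region")
    for key in required:
        if not defn.get(key):
            problems.append(f"missing or empty field: {key}")
    if "ATHENA" in defn.get("AdditionalArtifacts", []):
        if (defn.get("Format"), defn.get("Compression")) != ("Parquet", "Parquet"):
            problems.append("ATHENA requires Format and Compression set to Parquet")
        if defn.get("ReportVersioning") != "OVERWRITE_REPORT":
            problems.append("ATHENA requires ReportVersioning=OVERWRITE_REPORT")
    return problems

# Same values as the CLI call above, with a placeholder bucket name.
definition = {
    "ReportName": "optscale-cur-daily",
    "TimeUnit": "DAILY",
    "Format": "Parquet",
    "Compression": "Parquet",
    "AdditionalSchemaElements": ["RESOURCES"],
    "S3Bucket": "optscale-cur-reports-123456789012",
    "S3Prefix": "optscale/cur",
    "S3Region": "us-east-1",
    "AdditionalArtifacts": ["ATHENA"],
    "ReportVersioning": "OVERWRITE_REPORT",
}

issues = validate_cur_definition(definition)
print("\n".join(issues) if issues else json.dumps(definition, indent=2))
```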

Step 2: Create OptScale IAM Role

# optscale-iam-role.yaml: CloudFormation template for OptScale IAM role
AWSTemplateFormatVersion: '2010-09-09'
Description: 'IAM Role for OptScale cost optimization and resource discovery'

Parameters:
  OptScaleAccountId:
    Type: String
    Description: 'AWS Account ID where OptScale is deployed (or use external ID for SaaS)'
    AllowedPattern: '^[0-9]{12}$'

  ExternalId:
    Type: String
    Description: 'External ID for cross-account role assumption'
    NoEcho: true

Resources:
  OptScaleRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: OptScaleIntegrationRole
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              AWS: !Sub 'arn:aws:iam::${OptScaleAccountId}:root'
            Action: sts:AssumeRole
            Condition:
              StringEquals:
                sts:ExternalId: !Ref ExternalId
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/ReadOnlyAccess
      Policies:
        - PolicyName: OptScaleCostOptimization
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              # Required for Cost Explorer access
              - Effect: Allow
                Action:
                  - ce:GetCostAndUsage
                  - ce:GetReservationUtilization
                  - ce:GetSavingsPlansUtilization
                  - ce:GetReservationCoverage
                  - ce:GetSavingsPlansCoverage
                  - ce:GetCostForecast
                  - ce:GetUsageForecast
                  - pricing:GetProducts
                Resource: '*'
              # Required for resource optimization actions
              - Effect: Allow
                Action:
                  - ec2:ModifyInstanceAttribute
                  - ec2:StopInstances
                  - ec2:StartInstances
                  - ec2:TerminateInstances
                  - ec2:ModifyVolume
                  - rds:ModifyDBInstance
                  - rds:StopDBInstance
                  - rds:StartDBInstance
                  - autoscaling:UpdateAutoScalingGroup
                Resource: '*'
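                Condition:
                  StringEquals:
                    # aws:ResourceTag is a global key that covers the EC2, RDS,
                    # and Auto Scaling actions; ec2:ResourceTag would never
                    # match the non-EC2 actions in this statement
                    aws:ResourceTag/ManagedBy: optscale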

  OptScaleRoleArn:
    Type: AWS::SSM::Parameter
    Properties:
      Name: /optscale/role-arn
      Type: String
      Value: !GetAtt OptScaleRole.Arn

Outputs:
  RoleArn:
    Description: 'ARN of the OptScale IAM Role'
    Value: !GetAtt OptScaleRole.Arn
    Export:
      Name: OptScaleRoleArn

# Deploy the IAM role to each target account
# (CAPABILITY_NAMED_IAM is required because the template sets an explicit RoleName)
aws cloudformation deploy \
  --template-file optscale-iam-role.yaml \
  --stack-name optscale-integration \
  --parameter-overrides \
    OptScaleAccountId=123456789012 \
    ExternalId=optscale-ext-uuid-12345 \
  --capabilities CAPABILITY_NAMED_IAM \
  --profile target-account
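
The `ExternalId` value in the deploy command is a placeholder; a random, unguessable value is generally preferable for cross-account roles. A minimal sketch, assuming a UUID4 is an acceptable external ID for your setup; `new_external_id` and `trust_policy` are hypothetical helpers that mirror the trust policy in the template so it can be inspected before deployment.

```python
import json
import re
import uuid

def new_external_id():
    """Generate a random external ID (UUID4 string) for sts:AssumeRole."""
    return str(uuid.uuid4())

def trust_policy(optscale_account_id, external_id):
    """Build the trust policy that the CloudFormation template renders, as a dict."""
    # Same constraint as the template's AllowedPattern.
    if not re.fullmatch(r"[0-9]{12}", optscale_account_id):
        raise ValueError("AWS account ID must be exactly 12 digits")
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{optscale_account_id}:root"},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        }],
    }

if __name__ == "__main__":
    ext_id = new_external_id()
    print(json.dumps(trust_policy("123456789012", ext_id), indent=2))
```

The generated value would then be passed as the `ExternalId` parameter override and entered in OptScale when the account is connected.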

Azure Connector Setup

#!/bin/bash
# azure-connector-setup.sh: Register OptScale with Azure subscriptions
set -euo pipefail

export SUBSCRIPTION_ID="your-subscription-id"
export OPTSCALE_APP_NAME="OptScaleFinOps"

# Login to Azure
az login
az account set --subscription "$SUBSCRIPTION_ID"

# Create Azure AD Application for OptScale
APP_OUTPUT=$(az ad app create \
  --display-name "$OPTSCALE_APP_NAME" \
  --sign-in-audience AzureADMyOrg \
  --query '{appId:appId,objectId:id}' -o json)

APP_ID=$(echo "$APP_OUTPUT" | jq -r '.appId')
APP_OBJECT_ID=$(echo "$APP_OUTPUT" | jq -r '.objectId')

echo "Created App Registration: $APP_ID"

# Create Service Principal
SP_OBJECT_ID=$(az ad sp create --id "$APP_ID" --query 'id' -o tsv)
echo "Created Service Principal: $SP_OBJECT_ID"

# Create client secret (valid for 2 years)
CLIENT_SECRET=$(az ad app credential reset \
  --id "$APP_ID" \
  --display-name "optscale-secret" \
  --years 2 \
  --query password -o tsv)

# Recent Azure CLI versions do not return the expiry from "credential reset",
# so read it back from the application's credential list
SECRET_EXPIRY=$(az ad app credential list --id "$APP_ID" \
  --query "[?displayName=='optscale-secret'].endDateTime | [0]" -o tsv)

echo "Client Secret created (expires: $SECRET_EXPIRY)"

# Assign required roles at subscription scope
# Reader role for cost and resource data
az role assignment create \
  --assignee "$SP_OBJECT_ID" \
  --role "Reader" \
  --scope "/subscriptions/$SUBSCRIPTION_ID"

# Cost Management Reader for billing data
az role assignment create \
  --assignee "$SP_OBJECT_ID" \
  --role "Cost Management Reader" \
  --scope "/subscriptions/$SUBSCRIPTION_ID"

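# Reservations Reader for RI recommendations (if applicable).
# Reservations are tenant-level resources, so this built-in role is assigned
# at the Microsoft.Capacity scope rather than on the subscription.
az role assignment create \
  --assignee "$SP_OBJECT_ID" \
  --role "Reservations Reader" \
  --scope "/providers/Microsoft.Capacity"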

# Export credentials for OptScale configuration
cat > optscale-azure-credentials.json <<EOF
{
  "subscription_id": "$SUBSCRIPTION_ID",
  "tenant_id": "$(az account show --query tenantId -o tsv)",
  "client_id": "$APP_ID",
  "client_secret": "$CLIENT_SECRET",
  "secret_expiry": "$SECRET_EXPIRY"
}
EOF

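# The file contains the client secret; restrict it to the current user
chmod 600 optscale-azure-credentials.json

echo "Azure credentials saved to optscale-azure-credentials.json"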
echo "Upload these credentials to OptScale UI: Settings > Cloud Connectors > Azure"
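
Because the client secret created above expires after two years, it helps to track the `secret_expiry` value stored in the credentials file. A minimal sketch, assuming the timestamp is ISO 8601 with whole seconds (for example `2027-01-31T00:00:00Z`); `days_until_expiry` is a hypothetical helper, not part of OptScale or the Azure CLI.

```python
from datetime import datetime, timezone

def days_until_expiry(expiry_iso, now=None):
    """Return whole days until an ISO 8601 expiry timestamp (negative if expired)."""
    now = now or datetime.now(timezone.utc)
    # fromisoformat() in older Python versions does not accept a trailing "Z"
    expiry = datetime.fromisoformat(expiry_iso.replace("Z", "+00:00"))
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)  # assume UTC when unspecified
    return (expiry - now).days

def needs_rotation(expiry_iso, warn_days=30, now=None):
    """True when the secret expires within `warn_days` days or has already expired."""
    return days_until_expiry(expiry_iso, now) <= warn_days
```

A scheduled job could load `optscale-azure-credentials.json`, call `needs_rotation(creds["secret_expiry"])`, and alert the team that owns credential rotation.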

GCP Connector Setup

#!/bin/bash
# gcp-connector-setup.sh: Register OptScale with GCP projects
set -euo pipefail

export PROJECT_ID="your-gcp-project"
export BILLING_ACCOUNT_ID="your-billing-account-id"   # placeholder: the billing account linked to the project
export OPTSCALE_SA="optscale-finops@$PROJECT_ID.iam.gserviceaccount.com"

# Set active project
gcloud config set project "$PROJECT_ID"

# Enable required APIs first, so the IAM and billing calls below succeed
gcloud services enable \
  iam.googleapis.com \
  cloudbilling.googleapis.com \
  compute.googleapis.com \
  recommender.googleapis.com \
  bigquery.googleapis.com \
  monitoring.googleapis.com

echo "Required GCP APIs enabled."

# Create dedicated service account for OptScale
gcloud iam service-accounts create optscale-finops \
  --display-name="OptScale FinOps Integration" \
  --description="Service account for OptScale cost management and optimization"

# Grant required IAM roles at project level
ROLES=(
  "roles/viewer"                          # General resource viewing
  "roles/bigquery.jobUser"                # For BigQuery cost exports
  "roles/recommender.viewer"              # Right-sizing recommendations
  "roles/compute.instanceAdmin"           # For optimization actions (optional)
)

for ROLE in "${ROLES[@]}"; do
  gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member="serviceAccount:$OPTSCALE_SA" \
    --role="$ROLE"
done

# Billing roles cannot be bound on a project; grant billing data access
# on the billing account instead
gcloud billing accounts add-iam-policy-binding "$BILLING_ACCOUNT_ID" \
  --member="serviceAccount:$OPTSCALE_SA" \
  --role="roles/billing.costsManager"

# Create and download service account key
gcloud iam service-accounts keys create optscale-gcp-key.json \
  --iam-account="$OPTSCALE_SA"
chmod 600 optscale-gcp-key.json

echo "GCP service account key saved to optscale-gcp-key.json"
echo "Upload this key to OptScale UI: Settings > Cloud Connectors > GCP"
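
Before uploading the key file, it can be worth confirming that it is a service account key for the intended project. A minimal sketch based on the standard fields of a downloaded service account key; `check_service_account_key` and `load_and_check` are hypothetical helpers, and the check does not contact GCP.

```python
import json

REQUIRED_FIELDS = ("type", "project_id", "private_key_id", "private_key", "client_email")

def check_service_account_key(key, expected_project_id):
    """Return a list of problems found in a parsed service account key dict."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected key type: {key['type']}")
    if key.get("project_id") and key["project_id"] != expected_project_id:
        problems.append(f"key belongs to project {key['project_id']}")
    email = key.get("client_email", "")
    if email and not email.endswith(".iam.gserviceaccount.com"):
        problems.append(f"client_email is not a service account address: {email}")
    return problems

def load_and_check(path, expected_project_id):
    """Load a key file from disk and run the checks above."""
    with open(path) as fh:
        return check_service_account_key(json.load(fh), expected_project_id)
```

For the script above, `load_and_check("optscale-gcp-key.json", "your-gcp-project")` should return an empty list.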

Kubernetes Cluster Cost Allocation

OptScale integrates with Kubernetes for pod-level cost allocation. This requires deploying a lightweight agent to each cluster.

# optscale-k8s-agent.yaml: Kubernetes deployment for OptScale agent
apiVersion: v1
kind: Namespace
metadata:
  name: optscale
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: optscale-agent
  namespace: optscale
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: optscale-agent
rules:
  - apiGroups: [""]
    resources: ["nodes", "pods", "namespaces", "services", "persistentvolumes", "persistentvolumeclaims"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets", "daemonsets", "replicasets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["metrics.k8s.io"]
    resources: ["nodes", "pods"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: optscale-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: optscale-agent
subjects:
  - kind: ServiceAccount
    name: optscale-agent
    namespace: optscale
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: optscale-agent
  namespace: optscale
  labels:
    app: optscale-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: optscale-agent
  template:
    metadata:
      labels:
        app: optscale-agent
    spec:
      serviceAccountName: optscale-agent
      containers:
        - name: agent
          image: hystax/optscale-agent:latest
          env:
            - name: OPTSCALE_API_URL
              value: "https://finops.company.com/api/v2"
            - name: OPTSCALE_API_KEY
              valueFrom:
                secretKeyRef:
                  name: optscale-api-key
                  key: api-key
            - name: CLUSTER_NAME
              valueFrom:
                configMapKeyRef:
                  name: cluster-metadata
                  key: cluster-name
            - name: CLUSTER_PROVIDER
              value: "eks"  # eks | aks | gke
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "200m"
---
apiVersion: v1
kind: Secret
metadata:
  name: optscale-api-key
  namespace: optscale
type: Opaque
stringData:
  api-key: "YOUR_OPTSCALE_API_KEY_HERE"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-metadata
  namespace: optscale
data:
  cluster-name: "production-eks-us-east-1"

# Deploy the agent to your cluster
kubectl apply -f optscale-k8s-agent.yaml

# Verify agent is running
kubectl -n optscale get pods
kubectl -n optscale logs -l app=optscale-agent --tail=50
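
To make "pod-level cost allocation" concrete, the sketch below splits one node's hourly cost across its pods in proportion to their CPU requests. This is a common illustrative model, not OptScale's documented algorithm; the function name and the input shape are assumptions, and real allocators usually also account for memory and idle capacity.

```python
def allocate_node_cost(node_hourly_cost, pod_cpu_requests):
    """Split a node's hourly cost across pods by CPU request (millicores).

    `pod_cpu_requests` maps pod name to requested millicores. Pods with no
    requests receive nothing; if no pod requests CPU, every share is 0.0.
    """
    total = sum(pod_cpu_requests.values())
    if total <= 0:
        return {pod: 0.0 for pod in pod_cpu_requests}
    return {
        pod: node_hourly_cost * requested / total
        for pod, requested in pod_cpu_requests.items()
    }

if __name__ == "__main__":
    shares = allocate_node_cost(0.40, {"api": 500, "worker": 1500})
    for pod, cost in shares.items():
        print(f"{pod}: ${cost:.4f}/hour")
```

In this model the whole node cost is always distributed, so idle capacity is charged to whichever pods are scheduled on the node.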

Resource Discovery and Inventory

Once cloud connectors are established, OptScale automatically discovers resources across all connected accounts. The discovery process typically runs every 6 hours.

Discovery Coverage: OptScale discovers EC2 instances, EBS volumes, RDS databases, ELB/ALB, Lambda functions, S3 buckets, ECS clusters, EKS clusters, Auto Scaling groups, and more. Azure and GCP resources have equivalent coverage.

Querying Resource Inventory via API

#!/usr/bin/env python3
# optscale_inventory.py: Query OptScale resource inventory via API
import os

import requests

OPTSCALE_URL = os.environ.get("OPTSCALE_URL", "https://finops.company.com")
API_KEY = os.environ.get("OPTSCALE_API_KEY")

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

def get_resource_inventory(cloud_account_id=None, resource_type=None):
    """Fetch resource inventory from OptScale."""
    url = f"{OPTSCALE_URL}/api/v2/resources"
    params = {}
    if cloud_account_id:
        params["cloud_account_id"] = cloud_account_id
    if resource_type:
        params["type"] = resource_type  # Instance, Volume, Database, etc.

    response = requests.get(url, headers=headers, params=params, timeout=30)
    response.raise_for_status()
    return response.json()

def get_untagged_resources(tag_key="CostCenter"):
    """Find resources missing a specific tag."""
    resources = get_resource_inventory()
    untagged = [
        r for r in resources.get("data", [])
        # "tags" may be absent or null, so fall back to an empty dict
        if not (r.get("tags") or {}).get(tag_key)
    ]
    return untagged

def get_idle_resources(days=7):
    """Identify resources with low utilization over the past N days."""
    url = f"{OPTSCALE_URL}/api/v2/resources/idle"
    params = {"days": days, "cpu_threshold": 5}
    response = requests.get(url, headers=headers, params=params, timeout=30)
    response.raise_for_status()
    return response.json()

# Example usage
if __name__ == "__main__":
    print("=== OptScale Resource Inventory ===")
    inventory = get_resource_inventory(resource_type="Instance")
    print(f"Total instances: {inventory.get('total', 0)}")

    print("\n=== Untagged Resources ===")
    untagged = get_untagged_resources("CostCenter")
    print(f"Resources missing CostCenter tag: {len(untagged)}")
    for r in untagged[:10]:
        print(f"  - {r.get('name', 'unnamed')} ({r.get('cloud_resource_id')})")

    print("\n=== Idle Resources ===")
    idle = get_idle_resources(days=7)
    print(f"Idle resources (7d): {idle.get('total', 0)}")

Right-Sizing Recommendations

OptScale analyzes CloudWatch/Cloud Monitoring metrics to generate right-sizing recommendations. Recommendations are available for:

Resource Type | Metrics Analyzed | Recommendation Type
EC2 Instances | CPU utilization, memory utilization, network I/O | Resize to smaller instance, migrate family, or use Graviton
RDS Databases | CPU, memory, connections, storage IOPS | Resize instance, upgrade storage, enable Multi-AZ
EBS Volumes | Read/write IOPS, throughput | Downgrade volume type, resize volume, or migrate to gp3
Auto Scaling Groups | Group utilization patterns | Adjust min/max/desired capacity
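
As an illustration of how utilization metrics can translate into a recommendation, the sketch below applies a simple 95th-percentile threshold rule to CPU samples. It is a deliberately simplified stand-in, not OptScale's recommendation logic; the function names and the 40% and 80% thresholds are assumptions chosen for the example.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers (pct in 1..100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def rightsizing_hint(cpu_samples, downsize_below=40.0, upsize_above=80.0):
    """Return "downsize", "upsize", or "keep" based on p95 CPU utilization (%)."""
    if not cpu_samples:
        return "keep"  # no data: never recommend a change
    p95 = percentile(cpu_samples, 95)
    if p95 < downsize_below:
        return "downsize"
    if p95 > upsize_above:
        return "upsize"
    return "keep"
```

Using the 95th percentile rather than the mean keeps short bursts from being averaged away, while still ignoring the rarest spikes.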

Implementing Right-Sizing via API

#!/usr/bin/env python3
# optscale_rightsize.py: Automated right-sizing using OptScale API
import requests
import sys
import os

OPTSCALE_URL = os.environ.get("OPTSCALE_URL")
API_KEY = os.environ.get("OPTSCALE_API_KEY")
headers = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

def get_recommendations(resource_type="Instance"):
    """Fetch available right-sizing recommendations."""
    url = f"{OPTSCALE_URL}/api/v2/recommendations"
    response = requests.get(
        url, headers=headers,
        params={"type": "rightsize", "resource_type": resource_type},
        timeout=30
    )
    response.raise_for_status()
    return response.json()

def approve_recommendation(rec_id):
    """Approve and apply a right-sizing recommendation."""
    url = f"{OPTSCALE_URL}/api/v2/recommendations/{rec_id}/apply"
    response = requests.post(url, headers=headers, timeout=60)
    response.raise_for_status()
    return response.json()

def main():
    print("Fetching right-sizing recommendations...")
    recs = get_recommendations()
    recommendations = recs.get("data", [])

    if not recommendations:
        print("No recommendations found.")
        return

    total_savings = 0
    for rec in recommendations:
        resource = rec.get("resource_name", "unknown")
        current = rec.get("current_instance_type", "unknown")
        recommended = rec.get("recommended_instance_type", "unknown")
        monthly_savings = rec.get("monthly_savings_usd", 0)
        confidence = rec.get("confidence", "unknown")

        print(f"\nResource: {resource}")
        print(f"  Current:    {current}")
        print(f"  Recommend:  {recommended}")
        print(f"  Savings:    ${monthly_savings:.2f}/month")
        print(f"  Confidence: {confidence}")

        total_savings += monthly_savings

    print(f"\n=== Total Potential Savings: ${total_savings:.2f}/month ===")

    # Auto-approve high-confidence recommendations with at least $50/month savings
    auto_approved = 0
    for rec in recommendations:
        if rec.get("confidence") == "high" and rec.get("monthly_savings_usd", 0) >= 50:
            print(f"Auto-approving: {rec.get('resource_name')}")
            try:
                result = approve_recommendation(rec["id"])
                print(f"  Result: {result.get('status')}")
                auto_approved += 1
            except Exception as e:
                print(f"  ERROR: {e}", file=sys.stderr)

    print(f"\nAuto-approved {auto_approved} recommendations.")

if __name__ == "__main__":
    main()

Reserved Instance and Savings Plans Recommendations

OptScale analyzes historical usage patterns to recommend RI and Savings Plans purchases with break-even analysis.

#!/usr/bin/env python3
# optscale_ri_recommendations.py: RI and Savings Plans analysis
import requests
import os

OPTSCALE_URL = os.environ.get("OPTSCALE_URL")
API_KEY = os.environ.get("OPTSCALE_API_KEY")
headers = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

def get_ri_recommendations(cloud_account_id, term="1_year", payment="partial_upfront"):
    """Get RI purchase recommendations with break-even analysis."""
    url = f"{OPTSCALE_URL}/api/v2/recommendations/ri"
    params = {
        "cloud_account_id": cloud_account_id,
        "term": term,
        "payment_option": payment
    }
    response = requests.get(url, headers=headers, params=params, timeout=30)
    response.raise_for_status()
    return response.json()

def print_break_even_analysis(rec):
    """Print detailed break-even analysis for a recommendation."""
    instance_family = rec.get("instance_family")
    region = rec.get("region")
    recommended_qty = rec.get("recommended_quantity")

    on_demand_hourly = rec.get("on_demand_hourly_rate", 0)
    ri_hourly = rec.get("ri_effective_hourly_rate", 0)
    upfront_cost = rec.get("upfront_cost", 0)
    total_savings = rec.get("total_savings_3_year", 0)
    break_even_months = rec.get("break_even_months", 0)

    hourly_savings = on_demand_hourly - ri_hourly
    # Guard against a missing on-demand rate to avoid dividing by zero
    savings_pct = (hourly_savings / on_demand_hourly * 100) if on_demand_hourly else 0.0
    utilization = rec.get("expected_utilization", 0) * 100

    print(f"\n{'='*60}")
    print(f"RI Recommendation: {instance_family} in {region}")
    print(f"{'='*60}")
    print(f"Recommended Quantity:  {recommended_qty}")
    print(f"Expected Utilization:  {utilization:.1f}%")
    print()
    print(f"On-Demand Hourly:      ${on_demand_hourly:.4f}")
    print(f"RI Effective Hourly:   ${ri_hourly:.4f}")
    print(f"Hourly Savings:        ${hourly_savings:.4f} ({savings_pct:.1f}%)")
    print()
    print(f"Upfront Cost:          ${upfront_cost:,.2f}")
    print(f"Break-even Period:     {break_even_months:.1f} months")
    print(f"3-Year Total Savings:  ${total_savings:,.2f}")
    print(f"ROI:                   {(total_savings / max(upfront_cost, 1)) * 100:.1f}%")

# Example usage
if __name__ == "__main__":
    recs = get_ri_recommendations("aws-account-uuid-123", term="1_year")
    for rec in recs.get("data", [])[:5]:
        print_break_even_analysis(rec)
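
The break-even figure printed above can be reproduced by hand: it is the upfront payment divided by the monthly saving relative to on-demand pricing. A minimal sketch, assuming 730 hours per month and a recurring hourly RI charge that excludes the upfront payment; `break_even_months` here is an illustrative function, not a field or call from the OptScale API.

```python
HOURS_PER_MONTH = 730  # common approximation: 8760 hours per year / 12

def break_even_months(upfront_cost, on_demand_hourly, ri_recurring_hourly):
    """Months until the upfront payment is recovered through the lower hourly rate.

    Returns None when the reservation is not cheaper per hour, because the
    upfront cost is then never recovered.
    """
    monthly_saving = (on_demand_hourly - ri_recurring_hourly) * HOURS_PER_MONTH
    if monthly_saving <= 0:
        return None
    return upfront_cost / monthly_saving

if __name__ == "__main__":
    months = break_even_months(upfront_cost=500.0,
                               on_demand_hourly=0.192,
                               ri_recurring_hourly=0.060)
    print(f"Break-even after {months:.1f} months")
```

A reservation whose break-even point falls close to the end of its term leaves little margin if the workload shrinks, which is why the break-even period is worth reviewing alongside total savings.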

Anomaly Detection Configuration

OptScale uses statistical anomaly detection on daily cost streams. Configure anomaly detection per cloud account with custom sensitivity.

#!/usr/bin/env python3
# optscale_anomaly_config.py: Configure anomaly detection
import requests
import json
import os

OPTSCALE_URL = os.environ.get("OPTSCALE_URL")
API_KEY = os.environ.get("OPTSCALE_API_KEY")
headers = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

def configure_anomaly_detection(cloud_account_id, config):
    """Configure anomaly detection for a cloud account."""
    url = f"{OPTSCALE_URL}/api/v2/anomaly_detection/settings"
    payload = {
        "cloud_account_id": cloud_account_id,
        "sensitivity": config.get("sensitivity", "medium"),  # low | medium | high
        "min_daily_spend_threshold": config.get("min_threshold", 10.0),
        "alert_channels": config.get("channels", ["email"]),
        "notification_recipients": config.get("recipients", []),
        "group_by_dimensions": config.get("dimensions", ["SERVICE", "USAGE_TYPE"]),
        "exclude_tags": config.get("exclude_tags", {})
    }
    response = requests.post(url, headers=headers, json=payload, timeout=30)
    response.raise_for_status()
    return response.json()

# Configuration for production account
prod_config = {
    "sensitivity": "high",
    "min_threshold": 50.0,
    "channels": ["email", "slack", "webhook"],
    "recipients": ["finops-alerts@company.com", "sre-oncall@company.com"],
    "dimensions": ["SERVICE", "USAGE_TYPE", "REGION"],
    "exclude_tags": {"Environment": "sandbox"}
}

# Configuration for development account (lower threshold)
dev_config = {
    "sensitivity": "medium",
    "min_threshold": 5.0,
    "channels": ["email", "slack"],
    "recipients": ["dev-team@company.com"],
    "dimensions": ["SERVICE"],
    "exclude_tags": {}
}

if __name__ == "__main__":
    result_prod = configure_anomaly_detection("aws-prod-account-uuid", prod_config)
    print(f"Prod anomaly config: {json.dumps(result_prod, indent=2)}")

    result_dev = configure_anomaly_detection("aws-dev-account-uuid", dev_config)
    print(f"Dev anomaly config: {json.dumps(result_dev, indent=2)}")
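
To show how a sensitivity level and a minimum daily spend threshold might interact, the sketch below flags a day as anomalous when its cost exceeds the recent mean by a number of standard deviations. This z-score rule is a generic illustration, not OptScale's detection method; the mapping from sensitivity names to z-score limits is an assumption made for the example.

```python
from statistics import mean, stdev

# Assumed mapping: higher sensitivity means a smaller deviation triggers an alert.
SENSITIVITY_Z = {"low": 4.0, "medium": 3.0, "high": 2.0}

def is_anomalous(history, today_cost, sensitivity="medium", min_daily_spend=10.0):
    """Return True if `today_cost` is an upward outlier against `history`.

    `history` is a list of previous daily costs. Days below `min_daily_spend`
    are never flagged, which suppresses noise from very small accounts.
    """
    if today_cost < min_daily_spend or len(history) < 2:
        return False
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today_cost > mu  # perfectly flat history: any increase stands out
    return (today_cost - mu) / sigma > SENSITIVITY_Z[sensitivity]
```

Only upward deviations are flagged here, since a sudden drop in spend is rarely what a cost alert is meant to catch.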

Cost Allocation by Team and Project

OptScale supports tag-based cost allocation. First, ensure your tagging strategy is implemented consistently across all resources.

#!/usr/bin/env python3
# optscale_cost_allocation.py: Generate cost allocation reports
import requests
import os
from datetime import datetime, timedelta

OPTSCALE_URL = os.environ.get("OPTSCALE_URL")
API_KEY = os.environ.get("OPTSCALE_API_KEY")
headers = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

def get_cost_by_tags(start_date, end_date, tag_key="Team"):
    """Get cost breakdown by a specific tag dimension."""
    url = f"{OPTSCALE_URL}/api/v2/costs/breakdown"
    params = {
        "start_date": start_date,
        "end_date": end_date,
        "group_by": f"tag:{tag_key}",
        "granularity": "daily"
    }
    response = requests.get(url, headers=headers, params=params, timeout=30)
    response.raise_for_status()
    return response.json()

def generate_team_report(month):
    """Generate monthly cost report by team."""
    start = f"{month}-01"
    year, mon = month.split("-")
    # Calculate last day of month
    import calendar
    last_day = calendar.monthrange(int(year), int(mon))[1]
    end = f"{month}-{last_day}"

    data = get_cost_by_tags(start, end, "Team")

    print(f"\n{'='*70}")
    print(f"  Monthly Cost Report: {month}")
    print(f"{'='*70}")
    print(f"  {'Team':<25} {'Cost':>12} {'% of Total':>10} {'MoM Change':>12}")
    print(f"  {'-'*25} {'-'*12} {'-'*10} {'-'*12}")

    total = data.get("total_cost", 0)
    for team in data.get("breakdown", []):
        name = team.get("key", "untagged")
        cost = team.get("cost", 0)
        pct = (cost / total * 100) if total > 0 else 0
        mom = team.get("month_over_month_change")
        # A 0.0% change is valid data; only a missing value prints as N/A
        mom_str = f"{mom:+.1f}%" if mom is not None else "N/A"
        # "$" plus an 11-character field matches the 12-character Cost column
        print(f"  {name:<25} ${cost:>11,.2f} {pct:>9.1f}% {mom_str:>12}")

    print(f"  {'-'*25} {'-'*12} {'-'*10} {'-'*12}")
    print(f"  {'TOTAL':<25} ${total:>11,.2f} {'100.0%':>10}")

    return data

if __name__ == "__main__":
    last_month = (datetime.now().replace(day=1) - timedelta(days=1)).strftime("%Y-%m")
    generate_team_report(last_month)
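
Tag-based allocation comes down to grouping cost line items by a tag value and keeping a separate bucket for anything untagged. A minimal sketch of that grouping, assuming line items are dicts with `cost` and `tags` keys; this input shape and the function name are assumptions for illustration, not the OptScale API response format.

```python
from collections import defaultdict

def cost_by_tag(line_items, tag_key="Team"):
    """Sum cost per tag value; items without the tag go to "untagged"."""
    totals = defaultdict(float)
    for item in line_items:
        owner = (item.get("tags") or {}).get(tag_key) or "untagged"
        totals[owner] += item.get("cost", 0.0)
    return dict(totals)

if __name__ == "__main__":
    items = [
        {"cost": 120.0, "tags": {"Team": "platform"}},
        {"cost": 80.0, "tags": {"Team": "data"}},
        {"cost": 30.0, "tags": {}},
        {"cost": 20.0, "tags": None},
    ]
    for team, cost in sorted(cost_by_tag(items).items()):
        print(f"{team:<10} ${cost:,.2f}")
```

The size of the "untagged" bucket is a direct measure of tag compliance, which is why consistent tagging matters before showback or chargeback numbers are shared.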

API and Webhook Integrations

OptScale exposes a comprehensive REST API. Use webhooks to trigger external workflows on cost events.

#!/usr/bin/env python3
# optscale_webhook_server.py: Receive and process OptScale webhooks
from flask import Flask, request, jsonify
import hmac
import hashlib
import os
import requests

app = Flask(__name__)
# Fail fast if the secret is missing; a hard-coded fallback would let anyone forge events
WEBHOOK_SECRET = os.environ["OPTSCALE_WEBHOOK_SECRET"]
SLACK_WEBHOOK_URL = os.environ.get("SLACK_WEBHOOK_URL")

def verify_signature(payload, signature):
    """Verify HMAC signature from OptScale webhook."""
    expected = hmac.new(
        WEBHOOK_SECRET.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(f"sha256={expected}", signature)

def send_slack_alert(event_type, data):
    """Send formatted alert to Slack."""
    if not SLACK_WEBHOOK_URL:
        return

    color_map = {
        "anomaly_detected": "danger",
        "budget_threshold": "warning",
        "recommendation_created": "good",
        "optimization_applied": "#36a64f"
    }

    message = {
        "attachments": [{
            "color": color_map.get(event_type, "#808080"),
            "title": f"OptScale Alert: {event_type}",
            "fields": [
                {"title": "Account", "value": data.get("cloud_account_name", "N/A"), "short": True},
                {"title": "Cost Impact", "value": f"${data.get('cost_impact', 0):,.2f}", "short": True},
                {"title": "Details", "value": data.get("description", "No details"), "short": False}
            ],
            "footer": "OptScale FinOps",
            "ts": data.get("timestamp")
        }]
    }

    requests.post(SLACK_WEBHOOK_URL, json=message, timeout=10)

@app.route('/webhook/optscale', methods=['POST'])
def optscale_webhook():
    """Receive OptScale webhook events."""
    signature = request.headers.get('X-OptScale-Signature', '')
    payload = request.get_data()

    if not verify_signature(payload, signature):
        return jsonify({"error": "Invalid signature"}), 401

    # Tolerate a missing or non-JSON body instead of failing on None
    event = request.get_json(silent=True) or {}
    event_type = event.get("event_type")
    data = event.get("data", {})

    print(f"Received event: {event_type}")
    print(f"Data: {data}")

    # Route events to appropriate handlers
    if event_type == "anomaly_detected":
        send_slack_alert(event_type, data)
        # Could also create PagerDuty incident here
    elif event_type == "budget_threshold":
        send_slack_alert(event_type, data)
    elif event_type == "recommendation_created":
        # Auto-apply high-confidence recommendations during maintenance windows
        if data.get("confidence") == "high" and data.get("auto_applicable", False):
            print(f"Auto-applying recommendation: {data.get('id')}")
            # Call OptScale API to apply
    elif event_type == "optimization_applied":
        send_slack_alert(event_type, data)

    return jsonify({"status": "processed"}), 200

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=5000)
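
To exercise the receiver locally, a test client has to produce the same signature that `verify_signature` expects: an HMAC-SHA256 of the raw request body, hex-encoded and prefixed with `sha256=`. The sketch below computes that header value; `sign_payload`, the sample event, and the secret are illustrative, and the header format is the one assumed by the server code above.

```python
import hashlib
import hmac
import json

def sign_payload(secret, body):
    """Return the signature header value for raw request bytes `body`."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"

# Serialize once and sign exactly these bytes; re-serializing the JSON later
# could change whitespace or key order and invalidate the signature.
body = json.dumps({
    "event_type": "anomaly_detected",
    "data": {"cloud_account_name": "prod", "cost_impact": 125.5},
}).encode()
signature = sign_payload("test-secret", body)
print(signature)
```

Sending `body` to `/webhook/optscale` with `X-OptScale-Signature` set to this value, while the server runs with `OPTSCALE_WEBHOOK_SECRET=test-secret`, should pass the signature check.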

Slack and Teams Notification Setup

Slack Integration

#!/bin/bash
# Setup Slack notification channel for OptScale

# Create a Slack app at https://api.slack.com/apps
# Enable Incoming Webhooks
# Copy the webhook URL

SLACK_WEBHOOK="https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"

# Register webhook in OptScale
curl -X POST "$OPTSCALE_URL/api/v2/notification_channels" \
  -H "Authorization: Bearer $OPTSCALE_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"type\": \"slack\",
    \"name\": \"finops-alerts\",
    \"config\": {
      \"webhook_url\": \"$SLACK_WEBHOOK\",
      \"channel\": \"#finops-alerts\",
      \"username\": \"OptScale Bot\"
    },
    \"events\": [
      \"anomaly_detected\",
      \"budget_threshold_exceeded\",
      \"recommendation_created\",
      \"optimization_applied\"
    ]
  }"

echo "Slack notification channel configured."

Microsoft Teams Integration

# Teams uses Office 365 Connector Webhooks
# From Teams channel: ... > Connectors > Incoming Webhook

TEAMS_WEBHOOK="https://company.webhook.office.com/webhookb2/..."

curl -X POST "$OPTSCALE_URL/api/v2/notification_channels" \
  -H "Authorization: Bearer $OPTSCALE_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"type\": \"teams\",
    \"name\": \"finops-teams-alerts\",
    \"config\": {
      \"webhook_url\": \"$TEAMS_WEBHOOK\",
      \"title\": \"OptScale FinOps Alert\"
    },
    \"events\": [
      \"anomaly_detected\",
      \"budget_threshold_exceeded\"
    ]
  }"

Security Note: Store webhook URLs and API keys in a secrets manager (AWS Secrets Manager, HashiCorp Vault, or Azure Key Vault). Never commit credentials to version control. Rotate API keys quarterly.

Maintenance and Operations

Task | Frequency | Owner
Review anomaly detection alerts | Daily | FinOps Analyst
Review and act on right-sizing recommendations | Weekly | SRE + FinOps
Review RI/Savings Plans utilization | Weekly | FinOps Analyst
Generate and distribute cost reports | Monthly | FinOps Analyst
Tag compliance audit | Monthly | FinOps + Engineering
OptScale version upgrade | Quarterly | Platform Engineering
Credential rotation (cloud, API keys) | Quarterly | Security + FinOps
Disaster recovery test (backup restore) | Semi-annually | Platform Engineering

Operational Tip: Treat OptScale like any production service. Monitor its health endpoints, maintain backups of the PostgreSQL and ClickHouse databases, and document your runbook for common operational tasks. Use the same on-call rotation as your other FinOps tooling.