
Tagging Strategy

A well-designed tagging strategy is the foundation of cloud cost allocation, security governance, and operational automation. This guide provides a complete tagging framework with Policy-as-Code enforcement that I have successfully deployed to achieve >95% tag compliance across multi-account AWS Organizations.

Why Tagging Matters

Tagging is not merely administrative hygiene; it is a critical operational capability. Without consistent tags, organizations face:

| Domain | Without Tags | With Proper Tagging |
| --- | --- | --- |
| Cost Management | "We spent $200K on EC2. We don't know who owns it." | "Team Platform spent $45K on EC2 production, trending down 8% MoM." |
| Security | "We found a public S3 bucket. Who owns it?" | Alert routed to bucket owner via automated tag lookup. |
| Operations | "Which environment is this RDS instance in?" | Immediate identification: Environment=production, Team=payments. |
| Automation | Manual resource categorization for patching | Automated patching by environment tag with maintenance window mapping. |
| Compliance | Weeks of manual work for audit evidence | Instant cost allocation reports by compliance scope tag. |

Mandatory vs Optional Tags

Mandatory Tags (Enforced at Creation)

These tags must be present on every provisioned resource. Missing tags trigger automated remediation or deployment blocking.

| Tag Key | Description | Allowed Values | Example |
| --- | --- | --- | --- |
| CostCenter | Finance cost center code for chargeback | Valid cost center IDs from finance system | CC-12345 |
| BusinessUnit | Business unit or division | Platform, Product, Data, Marketing, Sales, G&A | Platform |
| Project | Project or service name (kebab-case) | Lowercase alphanumeric with hyphens | payment-gateway |
| Environment | Deployment environment | dev, staging, production | production |
| Owner | Team or primary contact | team-{name} or email address | team-platform |
| DataClassification | Data sensitivity classification | public, internal, confidential, restricted | internal |
| ComplianceScope | Applicable compliance frameworks | SOC2, PCI-DSS, HIPAA, GDPR, ISO27001, none | SOC2 |
| ManagedBy | Infrastructure management tool | terraform, cloudformation, pulumi, manual | terraform |
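
The allowed values above lend themselves to a small pre-flight check. The following is a minimal sketch, assuming the cost center format `CC-` plus five digits (taken from the example value) and a simple `team-{name}` or email pattern for Owner; `validate_tags` and the constant names are hypothetical helpers, not part of any AWS tooling.

```python
import re

# Controlled vocabularies copied from the mandatory tag table.
ALLOWED_VALUES = {
    "BusinessUnit": {"Platform", "Product", "Data", "Marketing", "Sales", "G&A"},
    "Environment": {"dev", "staging", "production"},
    "DataClassification": {"public", "internal", "confidential", "restricted"},
    "ComplianceScope": {"SOC2", "PCI-DSS", "HIPAA", "GDPR", "ISO27001", "none"},
    "ManagedBy": {"terraform", "cloudformation", "pulumi", "manual"},
}

# Free-form tags are checked by pattern (formats are assumptions, see lead-in).
PATTERNS = {
    "CostCenter": re.compile(r"CC-[0-9]{5}"),
    "Project": re.compile(r"[a-z0-9]+(-[a-z0-9]+)*"),
    "Owner": re.compile(r"team-[a-z0-9-]+|[^@\s]+@[^@\s]+"),
}

MANDATORY = set(ALLOWED_VALUES) | set(PATTERNS)


def validate_tags(tags):
    """Return a list of human-readable problems; an empty list means compliant."""
    problems = []
    for key in sorted(MANDATORY):
        value = tags.get(key)
        if not value:
            problems.append(f"missing tag: {key}")
        elif key in ALLOWED_VALUES and value not in ALLOWED_VALUES[key]:
            problems.append(f"invalid value for {key}: {value!r}")
        elif key in PATTERNS and not PATTERNS[key].fullmatch(value):
            problems.append(f"invalid format for {key}: {value!r}")
    return problems
```

Returning a list of messages rather than raising on the first failure lets a pipeline report every problem in one run.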

Optional Tags (Recommended)

| Tag Key | Description | Example |
| --- | --- | --- |
| AutoShutdown | Whether resource can be auto-stopped (non-prod) | true, false |
| BackupPolicy | Backup retention and schedule | daily-7d, daily-30d, none |
| MaintenanceWindow | Approved maintenance window | tue-02:00-tue-04:00 |
| ExpenseType | CapEx vs OpEx classification | capex, opex |
| Lifecycle | Resource lifecycle status | active, deprecated, pending-deletion |

Recommended Tag Schema

Naming Conventions

## Tag Naming Standards

### Case Convention: PascalCase
# Applies to: AWS (CostCenter, BusinessUnit)
# Rationale: AWS Cost Explorer groups tags case-sensitively.
# PascalCase provides readability and consistency.

### Case Convention: lowercase-with-hyphens
# Applies to: GCP labels, Azure tags
# Rationale: GCP labels enforce lowercase; hyphens improve readability.

### Value Conventions:
# - Use kebab-case for multi-word values: "payment-gateway", not "PaymentGateway"
# - Use controlled vocabularies for Environment: "dev|staging|production"
# - Never use spaces in tag values
# - Never use special characters except hyphens and underscores
#   (an email address in Owner is the one exception on AWS)
# - Keep tag values under 50 characters where possible

### Example Resource Tags (AWS EC2):
{
  "CostCenter":        "CC-12345",
  "BusinessUnit":      "Platform",
  "Project":           "api-gateway",
  "Environment":       "production",
  "Owner":             "team-platform@company.com",
  "DataClassification": "internal",
  "ComplianceScope":   "SOC2",
  "ManagedBy":         "terraform",
  "AutoShutdown":      "false"
}

### Example GCP Labels:
{
  "cost-center":        "cc-12345",
  "business-unit":      "platform",
  "project":            "api-gateway",
  "environment":        "production",
  "owner":              "team-platform",
  "data-classification": "internal",
  "compliance-scope":   "soc2"
}
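
The two naming conventions can be bridged mechanically. The sketch below converts a PascalCase AWS tag map into lowercase, hyphenated GCP-style labels. `pascal_to_kebab` and `to_gcp_labels` are hypothetical helper names, and the value check is a simplified version of GCP's label rules (lowercase letters, digits, hyphens, underscores, at most 63 characters), so values such as email addresses are rejected rather than rewritten.

```python
import re


def pascal_to_kebab(key):
    """Convert a PascalCase tag key such as 'CostCenter' to 'cost-center'."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", key).lower()


def to_gcp_labels(aws_tags):
    """Map AWS-style tags to GCP-style labels, lowercasing keys and values."""
    labels = {}
    for key, value in aws_tags.items():
        label_value = value.lower()
        # Simplified check of GCP label value rules (assumption, see lead-in).
        if not re.fullmatch(r"[a-z0-9_-]{0,63}", label_value):
            raise ValueError(f"value for {key} is not a valid label: {value!r}")
        labels[pascal_to_kebab(key)] = label_value
    return labels
```

Raising on an invalid value is a deliberate choice here: silently stripping the domain from an Owner email would hide a data-quality decision that a person should make.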

Tag Enforcement with Policy-as-Code

AWS Organizations Tagging Policies

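AWS Organizations tag policies standardize tag keys and values across all accounts in your organization. For the resource types listed for enforcement in the policy, tagging operations that use a non-compliant value are rejected; other violations are surfaced in the tag policy compliance report. A tag policy validates tags that are present and does not by itself require a tag to exist, so pair it with the pipeline checks described later in this guide or with an AWS Config required-tags rule.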

# tagging-policy.json: AWS Organizations Tag Policy
# (caption only; the file itself starts at the opening brace)
# Tag policy syntax uses lowercase keys (tags, tag_key, tag_value, enforced_for)
# and has no Version field.
{
  "tags": {
    "CostCenter": {
      "tag_key": {
        "@@assign": "CostCenter"
      },
      "tag_value": {
        "@@assign": ["CC-10001", "CC-10002", "CC-10003", "CC-10004", "CC-10005"]
      },
      "enforced_for": {
        "@@assign": [
          "ec2:instance",
          "ec2:volume",
          "ec2:snapshot",
          "rds:db",
          "rds:snapshot",
          "s3:bucket",
          "lambda:function",
          "elasticloadbalancing:loadbalancer",
          "ecs:service",
          "eks:cluster",
          "dynamodb:table"
        ]
      }
    },
    "Environment": {
      "tag_key": {
        "@@assign": "Environment"
      },
      "tag_value": {
        "@@assign": ["dev", "staging", "production"]
      },
      "enforced_for": {
        "@@assign": [
          "ec2:instance",
          "ec2:volume",
          "rds:db",
          "s3:bucket",
          "lambda:function",
          "elasticloadbalancing:loadbalancer",
          "dynamodb:table"
        ]
      }
    },
    "Owner": {
      "tag_key": {
        "@@assign": "Owner"
      },
      "tag_value": {
        "@@assign": [
          "team-platform",
          "team-data",
          "team-product",
          "team-security",
          "team-sre"
        ]
      },
      "enforced_for": {
        "@@assign": [
          "ec2:instance",
          "ec2:volume",
          "rds:db",
          "s3:bucket",
          "lambda:function"
        ]
      }
    },
    "DataClassification": {
      "tag_key": {
        "@@assign": "DataClassification"
      },
      "tag_value": {
        "@@assign": ["public", "internal", "confidential", "restricted"]
      },
      "enforced_for": {
        "@@assign": [
          "ec2:instance",
          "rds:db",
          "s3:bucket",
          "dynamodb:table"
        ]
      }
    }
  }
}

#!/bin/bash
# deploy_tag_policy.sh: Deploy tagging policy via AWS CLI
set -euo pipefail

POLICY_NAME="mandatory-tags-policy"
TARGET_ID="r-xxxx"  # Your Organization Root ID

# Enable the TAG_POLICY type on the root first: a policy whose type is not
# enabled cannot be attached. The call errors if the type is already enabled,
# so that failure is tolerated here.
aws organizations enable-policy-type \
  --root-id "$TARGET_ID" \
  --policy-type TAG_POLICY \
  || echo "TAG_POLICY type was not newly enabled (it may already be enabled); continuing."

# Create the tag policy
POLICY_ID=$(aws organizations create-policy \
  --name "$POLICY_NAME" \
  --description "Mandatory tagging policy for all cloud resources" \
  --type TAG_POLICY \
  --content file://tagging-policy.json \
  --query 'Policy.PolicySummary.Id' --output text)

echo "Created tag policy: $POLICY_ID"

# Attach to organization root (affects all accounts)
aws organizations attach-policy \
  --policy-id "$POLICY_ID" \
  --target-id "$TARGET_ID"

echo "Tag policy attached to organization root."
echo "Compliance results appear in the tag policy compliance report once evaluation completes."
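
The tag policy document in this section repeats the same three-part structure for every tag, so it can be generated from a compact specification instead of being edited by hand. This is a minimal sketch: `build_tag_policy` is a hypothetical helper, and it emits the lowercase `tags`, `tag_key`, `tag_value`, and `enforced_for` keys used by AWS tag policy syntax.

```python
import json


def build_tag_policy(spec):
    """Build a tag policy document.

    spec maps each tag key to a pair: (allowed values, enforced resource types).
    """
    tags = {}
    for key, (values, enforced_for) in spec.items():
        tags[key] = {
            "tag_key": {"@@assign": key},
            "tag_value": {"@@assign": list(values)},
            "enforced_for": {"@@assign": list(enforced_for)},
        }
    return {"tags": tags}


if __name__ == "__main__":
    policy = build_tag_policy({
        "Environment": (
            ["dev", "staging", "production"],
            ["ec2:instance", "s3:bucket"],
        ),
    })
    # Write the result to tagging-policy.json for the deploy script to pick up.
    print(json.dumps(policy, indent=2))
```

Generating the file keeps the allowed values in one place, which helps when the same vocabulary also feeds pipeline checks.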

Terraform Sentinel Policies

For Terraform Cloud/Enterprise users, Sentinel policies block non-compliant resources before they reach the cloud provider.

# enforce_mandatory_tags.sentinel: Terraform Sentinel Policy
# This policy enforces mandatory tags on all AWS resources

# resource_changes is provided by the v2 plan import
import "tfplan/v2" as tfplan

# Mandatory tags that must be present on all taggable resources
mandatory_tags = ["CostCenter", "Environment", "Owner", "DataClassification"]

# Allowed Environment values (same vocabulary as the mandatory tag table)
valid_environments = ["dev", "staging", "production"]

# List of AWS resource types that must be tagged
required_resource_types = [
  "aws_instance",
  "aws_db_instance",
  "aws_s3_bucket",
  "aws_lambda_function",
  "aws_ebs_volume",
  "aws_lb",
  "aws_ecs_service",
  "aws_eks_cluster",
  "aws_dynamodb_table",
]

# Get all resources from the plan
all_resources = tfplan.resource_changes

# Validate tags on each resource
violations = 0
for all_resources as address, rc {
  # Check if this resource type requires tags
  if rc.type in required_resource_types {
    # Get the planned tags (after apply); treat absent or null tags as empty
    planned_tags = rc.change.after.tags else {}
    if planned_tags is null {
      planned_tags = {}
    }

    # Check each mandatory tag; a missing, null, or empty value is a violation
    for mandatory_tags as tag {
      tag_value = planned_tags[tag] else null
      if tag_value is null or tag_value is "" {
        print("VIOLATION: Missing mandatory tag '" + tag + "' on resource: " + address)
        violations = violations + 1
      }
    }

    # Validate the Environment tag value when the tag is present
    # (a missing tag is already counted above)
    env_value = planned_tags["Environment"] else null
    if env_value is not null and env_value not in valid_environments {
      print("VIOLATION: Invalid Environment tag value '" + env_value +
            "' on resource: " + address + ". Must be: dev, staging, or production.")
      violations = violations + 1
    }
  }
}

# Policy decision
main = rule {
  violations == 0
}

# Soft-mandatory enforcement: fails the run but can be overridden
# Hard-mandatory: set enforcement_level to "hard-mandatory" in Terraform Cloud

# sentinel.hcl: Sentinel configuration for the workspace
policy "enforce_mandatory_tags" {
  enforcement_level = "soft-mandatory"
  source            = "./enforce_mandatory_tags.sentinel"
}

policy "enforce_cost_center_values" {
  enforcement_level = "hard-mandatory"
  source            = "./enforce_cost_center_values.sentinel"
}

policy "prevent_production_in_dev_account" {
  enforcement_level = "hard-mandatory"
  source            = "./prevent_production_in_dev_account.sentinel"
}

OPA (Open Policy Agent) Rules

OPA with Terraform provides flexible, language-agnostic policy enforcement that works across CI/CD pipelines.

# mandatory_tags.rego: OPA Policy for Terraform tagging compliance
package terraform.aws.tags

import future.keywords.contains
import future.keywords.if
import future.keywords.in

# List of resource types that require mandatory tags
required_tagged_resources := {
    "aws_instance",
    "aws_db_instance",
    "aws_s3_bucket",
    "aws_lambda_function",
    "aws_ebs_volume",
    "aws_lb",
    "aws_ecs_service",
    "aws_eks_cluster",
    "aws_dynamodb_table",
    "aws_autoscaling_group",
    "aws_rds_cluster",
    "aws_elasticache_cluster",
}

# Mandatory tag keys
mandatory_tags := {"CostCenter", "Environment", "Owner", "DataClassification"}

# Valid Environment values (same vocabulary as the mandatory tag table)
valid_environments := {"dev", "staging", "production"}

# Valid DataClassification values
valid_classifications := {"public", "internal", "confidential", "restricted"}

# Deny if resource is missing mandatory tags.
# "contains" makes deny a set of messages; a resource with no tags attribute
# at all is treated as having an empty tag map instead of being skipped.
deny contains msg if {
    some resource in input.resource_changes
    resource.type in required_tagged_resources
    is_object(resource.change.after) # skip resources being destroyed
    tags := object.get(resource.change.after, "tags", {})
    missing := mandatory_tags - {key | some key; tags[key]}
    count(missing) > 0
    msg := sprintf(
        "Resource '%s' (%s) is missing mandatory tags: %s",
        [resource.address, resource.type, concat(", ", missing)]
    )
}

# Deny if Environment tag has invalid value
deny contains msg if {
    some resource in input.resource_changes
    resource.type in required_tagged_resources
    tags := resource.change.after.tags
    env := tags.Environment
    not env in valid_environments
    msg := sprintf(
        "Resource '%s' has invalid Environment tag value '%s'. Valid values: %s",
        [resource.address, env, concat(", ", valid_environments)]
    )
}

# Deny if DataClassification tag has invalid value
deny contains msg if {
    some resource in input.resource_changes
    resource.type in required_tagged_resources
    tags := resource.change.after.tags
    classification := tags.DataClassification
    not classification in valid_classifications
    msg := sprintf(
        "Resource '%s' has invalid DataClassification tag value '%s'. Valid values: %s",
        [resource.address, classification, concat(", ", valid_classifications)]
    )
}

# Deny if CostCenter is empty
deny contains msg if {
    some resource in input.resource_changes
    resource.type in required_tagged_resources
    tags := resource.change.after.tags
    tags.CostCenter == ""
    msg := sprintf(
        "Resource '%s' has empty CostCenter tag. Must be a valid cost center code.",
        [resource.address]
    )
}

# Allow if no violations
allow if {
    count(deny) == 0
}

#!/bin/bash
# Evaluate OPA policy against Terraform plan
set -euo pipefail

# Generate Terraform plan
terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json

# Run OPA evaluation
opa eval --data mandatory_tags.rego \
  --input tfplan.json \
  "data.terraform.aws.tags.deny" \
  --format pretty

# Gate pipeline: fail if there are denials.
# The JSON output nests the deny set under result[0].expressions[0].value,
# so the length must be taken there, not on the top-level object.
VIOLATIONS=$(opa eval --data mandatory_tags.rego \
  --input tfplan.json \
  "data.terraform.aws.tags.deny" \
  --format json | jq '.result[0].expressions[0].value | length')

if [ "$VIOLATIONS" -gt 0 ]; then
    echo "ERROR: $VIOLATIONS tagging policy violations found."
    exit 1
fi

echo "✓ All tagging policies passed."
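
For pipelines where OPA is not available, the missing-tag check can be approximated directly against the plan JSON. The sketch below is an illustration under assumptions, not a replacement for the Rego policy: `find_tag_violations` is a hypothetical function, the resource type list is abbreviated, and only tag presence is checked.

```python
import json

MANDATORY_TAGS = {"CostCenter", "Environment", "Owner", "DataClassification"}
# Abbreviated list for illustration; extend to match the policy in use.
TAGGED_TYPES = {"aws_instance", "aws_db_instance", "aws_s3_bucket", "aws_lambda_function"}


def find_tag_violations(plan):
    """Return one message per planned resource that lacks mandatory tags.

    plan is the parsed output of `terraform show -json`.
    """
    violations = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") not in TAGGED_TYPES:
            continue
        after = rc.get("change", {}).get("after")
        if after is None:  # resource is being destroyed
            continue
        tags = after.get("tags") or {}
        # Empty strings count as missing.
        present = {key for key, value in tags.items() if value}
        missing = sorted(MANDATORY_TAGS - present)
        if missing:
            violations.append(f"{rc['address']} is missing: {', '.join(missing)}")
    return violations


if __name__ == "__main__":
    with open("tfplan.json") as handle:
        problems = find_tag_violations(json.load(handle))
    for line in problems:
        print(line)
    raise SystemExit(1 if problems else 0)
```

The non-zero exit code is what gates the pipeline, in the same way as the shell script's `exit 1`.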

Complete Terraform Module with Mandatory Tagging

# modules/tagged_resources/main.tf
# Terraform module enforcing mandatory tags on all resources

variable "name" {
  description = "Resource name"
  type        = string
}

variable "environment" {
  description = "Environment"
  type        = string
  validation {
    condition     = contains(["dev", "staging", "production"], var.environment)
    error_message = "Environment must be dev, staging, or production."
  }
}

variable "cost_center" {
  description = "Finance cost center code"
  type        = string
  validation {
    condition     = can(regex("^CC-[0-9]{5}$", var.cost_center))
    error_message = "CostCenter must be in format CC-XXXXX."
  }
}

variable "owner" {
  description = "Team or owner email"
  type        = string
}

variable "data_classification" {
  description = "Data classification level"
  type        = string
  default     = "internal"
  validation {
    condition     = contains(["public", "internal", "confidential", "restricted"], var.data_classification)
    error_message = "DataClassification must be public, internal, confidential, or restricted."
  }
}

variable "project" {
  description = "Project name"
  type        = string
}

variable "business_unit" {
  description = "Business unit"
  type        = string
  default     = "Platform"
}

variable "compliance_scope" {
  description = "Compliance frameworks"
  type        = string
  default     = "SOC2"
}

variable "managed_by" {
  description = "Infrastructure management tool"
  type        = string
  default     = "terraform"
}

locals {
  # Universal tags applied to all resources
  mandatory_tags = {
    CostCenter          = var.cost_center
    BusinessUnit        = var.business_unit
    Project             = var.project
    Environment         = var.environment
    Owner               = var.owner
    DataClassification  = var.data_classification
    ComplianceScope     = var.compliance_scope
    ManagedBy           = var.managed_by
  }

  # Additional tags derived from variables
  derived_tags = {
    AutoShutdown = var.environment == "production" ? "false" : "true"
  }

  # Merge all tags
  common_tags = merge(local.mandatory_tags, local.derived_tags)
}

# Output the tags for use in parent modules
output "tags" {
  description = "Mandatory tags map"
  value       = local.common_tags
}

output "mandatory_tag_keys" {
  description = "List of mandatory tag keys"
  value       = keys(local.mandatory_tags)
}

# Example usage in a root module:
# module "tags" {
#   source              = "./modules/tagged_resources"
#   name                = "api-service"
#   environment         = "production"
#   cost_center         = "CC-12345"
#   owner               = "team-platform@company.com"
#   project             = "payment-gateway"
#   data_classification = "confidential"
# }
#
# resource "aws_instance" "api" {
#   ami           = data.aws_ami.amazon_linux_2023.id
#   instance_type = "m6i.xlarge"
#   tags          = module.tags.tags
# }

Tag Inheritance and Propagation

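Tags should flow automatically from higher-level constructs to resources. AWS does not inherit tags from an OU, account, or VPC on its own, so each hop in the hierarchy has to be implemented explicitly, for example in Terraform. Implement inheritance at multiple levels: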

## Tag Inheritance Architecture

Organization (AWS Organization)
  ├── OU: Production
  │     └── Account: Production-Platform
  │           └── VPC: vpc-platform-prod
  │                 ├── Tag: Environment=production (inherited by all resources)
  │                 └── EKS Cluster: eks-platform-prod
  │                       ├── Tag: Project=platform (propagated to nodes)
  │                       └── Node Group: platform-nodes
  │                             └── EC2 Instances inherit: Environment, Project, Cluster
  │
  ├── OU: Development
  │     └── Account: Dev-Sandbox
  │           └── Tag: AutoShutdown=true (account-level)
  │
  └── Tag Policies (enforced at Organization level)

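The inheritance hierarchy can be modelled as an ordered merge in which more specific levels override broader ones. The following is a minimal sketch of that idea; `resolve_tags` is a hypothetical helper, and the level names are taken from the diagram for illustration only.

```python
def resolve_tags(levels):
    """Resolve effective tags from broadest to most specific level.

    levels is a list of (level name, tags) pairs. The result maps each tag
    key to a (value, source level) pair so overrides stay traceable.
    """
    resolved = {}
    for level_name, tags in levels:
        for key, value in tags.items():
            resolved[key] = (value, level_name)
    return resolved


if __name__ == "__main__":
    effective = resolve_tags([
        ("vpc-platform-prod", {"Environment": "production"}),
        ("eks-platform-prod", {"Project": "platform"}),
        ("platform-nodes", {"Cluster": "eks-platform-prod"}),
    ])
    for key, (value, source) in sorted(effective.items()):
        print(f"{key}={value} (from {source})")
```

Recording the source level alongside each value makes it easier to explain to an engineer why a resource carries a given tag.
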
Propagation in Terraform

# tag_propagation.tf: Tag propagation patterns in Terraform

# Pattern 1: VPC-level tags propagated to subnets
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"

  tags = local.mandatory_tags
}

resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
  availability_zone = var.availability_zones[count.index]

  # Merge VPC tags with subnet-specific tags
  tags = merge(aws_vpc.main.tags, {
    Name = "private-${var.availability_zones[count.index]}"
    Type = "private"
  })
}

# Pattern 2: ASG tag propagation to instances
resource "aws_autoscaling_group" "app" {
  name                = "app-asg"
  vpc_zone_identifier = aws_subnet.private[*].id
  desired_capacity    = 4
  min_size            = 2
  max_size            = 10

  launch_template {
    id      = aws_launch_template.app.id
    version = "$Latest"
  }

  # Tags propagated to all instances launched by this ASG
  tag {
    key                 = "CostCenter"
    value               = local.mandatory_tags["CostCenter"]
    propagate_at_launch = true
  }
  tag {
    key                 = "Environment"
    value               = local.mandatory_tags["Environment"]
    propagate_at_launch = true
  }
  tag {
    key                 = "Project"
    value               = local.mandatory_tags["Project"]
    propagate_at_launch = true
  }
  tag {
    key                 = "Owner"
    value               = local.mandatory_tags["Owner"]
    propagate_at_launch = true
  }
}

# Pattern 3: EKS cluster tags propagated to node groups
resource "aws_eks_cluster" "main" {
  name     = var.cluster_name
  role_arn = aws_iam_role.eks_cluster.arn
  version  = var.kubernetes_version

  vpc_config {
    subnet_ids = var.private_subnet_ids
  }

  tags = local.mandatory_tags
}

resource "aws_eks_node_group" "main" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "main"
  node_role_arn   = aws_iam_role.eks_nodes.arn
  subnet_ids      = var.private_subnet_ids

  scaling_config {
    desired_size = 3
    min_size     = 2
    max_size     = 10
  }

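  # These tags apply to the node group resource itself. EKS does not copy
  # node group tags to the EC2 instances; use tag_specifications in a launch
  # template when the instances themselves must carry the mandatory tags.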
  tags = merge(local.mandatory_tags, {
    "k8s.io/cluster-autoscaler/enabled"             = "true"
    "k8s.io/cluster-autoscaler/${var.cluster_name}" = "owned"
  })
}

Tag-Based IAM Policies

Tags enable attribute-based access control (ABAC), allowing IAM policies that grant access based on resource tags rather than explicit ARNs.

# abac_iam_policy.json (ABAC): engineers can only manage resources with their team tag
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowEC2ManagementForTeamResources",
      "Effect": "Allow",
      "Action": [
        "ec2:StartInstances",
        "ec2:StopInstances",
        "ec2:RebootInstances",
        "ec2:TerminateInstances",
        "ec2:ModifyInstanceAttribute",
        "ec2:CreateTags",
        "ec2:DeleteTags"
      ],
      "Resource": "arn:aws:ec2:*:*:instance/*",
      "Condition": {
        "StringEquals": {
          "ec2:ResourceTag/Owner": "${aws:PrincipalTag/Team}"
        }
      }
    },
    {
      "Sid": "AllowViewAllEC2",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:DescribeVolumes",
        "ec2:DescribeSnapshots"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DenyProductionModification",
      "Effect": "Deny",
      "Action": [
        "ec2:TerminateInstances",
        "ec2:StopInstances"
      ],
      "Resource": "arn:aws:ec2:*:*:instance/*",
      "Condition": {
        "StringEquals": {
          "ec2:ResourceTag/Environment": "production"
        }
      }
    }
  ]
}

# Terraform for ABAC IAM role
resource "aws_iam_role" "platform_engineer" {
  name = "PlatformEngineer"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        AWS = "arn:aws:iam::${var.master_account_id}:root"
      }
      Condition = {
        "StringEquals" = {
          "aws:PrincipalTag/Team" = "team-platform"
        }
      }
    }]
  })

  tags = local.mandatory_tags
}
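
The decision logic of the ABAC policy above can be traced with a small simulation. This is a deliberately simplified model of the three statements, not IAM's actual evaluation engine: `is_allowed` is a hypothetical function that applies the explicit deny first, then the read-only allow, then the team-tag match.

```python
TEAM_ACTIONS = {
    "ec2:StartInstances", "ec2:StopInstances", "ec2:RebootInstances",
    "ec2:TerminateInstances", "ec2:ModifyInstanceAttribute",
    "ec2:CreateTags", "ec2:DeleteTags",
}
READ_ONLY_ACTIONS = {"ec2:DescribeInstances", "ec2:DescribeVolumes", "ec2:DescribeSnapshots"}
PRODUCTION_DENIED = {"ec2:TerminateInstances", "ec2:StopInstances"}


def is_allowed(action, principal_tags, resource_tags):
    """Approximate the policy outcome for one action on one instance."""
    # An explicit Deny always wins over any Allow.
    if action in PRODUCTION_DENIED and resource_tags.get("Environment") == "production":
        return False
    # Describe calls are allowed regardless of tags.
    if action in READ_ONLY_ACTIONS:
        return True
    # Management actions require the resource Owner to equal the principal Team.
    team = principal_tags.get("Team")
    return (
        action in TEAM_ACTIONS
        and team is not None
        and resource_tags.get("Owner") == team
    )
```

One consequence worth noticing: with this policy, a team cannot stop or terminate even its own production instances, because the deny statement does not look at the Owner tag.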

Cost Allocation Report by Tags

#!/usr/bin/env python3
# cost_allocation_by_tags.py: Generate cost allocation reports using tags
import boto3
from datetime import datetime, timedelta

def get_cost_by_tags(start_date, end_date, tag_key="CostCenter"):
    """Query AWS Cost Explorer for costs grouped by tag."""
    ce = boto3.client('ce')

    response = ce.get_cost_and_usage(
        TimePeriod={
            'Start': start_date,
            'End': end_date
        },
        Granularity='MONTHLY',
        Metrics=['BlendedCost', 'UsageQuantity'],
        GroupBy=[
            {'Type': 'TAG', 'Key': tag_key}
        ]
    )

    return response['ResultsByTime']

def get_untagged_cost(start_date, end_date):
    """Find costs not associated with any tag."""
    ce = boto3.client('ce')

    response = ce.get_cost_and_usage(
        TimePeriod={
            'Start': start_date,
            'End': end_date
        },
        Granularity='MONTHLY',
        Metrics=['BlendedCost'],
        Filter={
            'Tags': {
                'Key': 'CostCenter',
                'MatchOptions': ['ABSENT']
            }
        }
    )

    return response['ResultsByTime']

def generate_monthly_report(year, month):
    """Generate comprehensive cost allocation report."""
    start = f"{year}-{month:02d}-01"
    if month == 12:
        end = f"{year + 1}-01-01"
    else:
        end = f"{year}-{month + 1:02d}-01"

    print(f"\n{'='*70}")
    print(f"  Cost Allocation Report: {year}-{month:02d}")
    print(f"{'='*70}")

    # Cost by CostCenter
    print("\n  --- By Cost Center ---")
    cost_by_cc = get_cost_by_tags(start, end, "CostCenter")
    total = 0.0

    for period in cost_by_cc:
        for group in period.get('Groups', []):
            tag_value = group['Keys'][0].replace('CostCenter$', '')
            cost = float(group['Metrics']['BlendedCost']['Amount'])
            total += cost
            if cost > 0:
                print(f"  {tag_value:<25} ${cost:>12,.2f}")

    # Untagged cost. The untagged query has no GroupBy, so Cost Explorer
    # returns the amount under 'Total' rather than under 'Groups'.
    print("\n  --- Untagged Resources ---")
    untagged = get_untagged_cost(start, end)
    untagged_total = 0.0
    for period in untagged:
        untagged_total += float(period['Total']['BlendedCost']['Amount'])

    print(f"  Untagged Total:          ${untagged_total:>12,.2f}")
    if total > 0:
        compliance = ((total - untagged_total) / total) * 100
        print(f"  Tag Compliance:          {compliance:.1f}%")

    # Cost by Environment
    print("\n  --- By Environment ---")
    cost_by_env = get_cost_by_tags(start, end, "Environment")
    for period in cost_by_env:
        for group in period.get('Groups', []):
            tag_value = group['Keys'][0].replace('Environment$', '')
            cost = float(group['Metrics']['BlendedCost']['Amount'])
            if cost > 0:
                pct = (cost / total) * 100 if total > 0 else 0
                print(f"  {tag_value:<25} ${cost:>12,.2f} ({pct:.1f}%)")

    # Cost by Team/Owner
    print("\n  --- By Owner ---")
    cost_by_owner = get_cost_by_tags(start, end, "Owner")
    for period in cost_by_owner:
        for group in period.get('Groups', []):
            tag_value = group['Keys'][0].replace('Owner$', '')
            cost = float(group['Metrics']['BlendedCost']['Amount'])
            if cost > 0:
                print(f"  {tag_value:<25} ${cost:>12,.2f}")

    print(f"\n  {'='*70}")
    print(f"  TOTAL:                    ${total:>12,.2f}")
    print(f"  {'='*70}")

if __name__ == "__main__":
    today = datetime.now()
    first_of_month = today.replace(day=1)
    last_month = first_of_month - timedelta(days=1)
    generate_monthly_report(last_month.year, last_month.month)
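
The aggregation step of the report can be separated from the API call so that it is testable without AWS access. The sketch below assumes the `ResultsByTime` shape returned by a grouped Cost Explorer query, where each group key looks like `CostCenter$CC-12345` and an empty value after the `$` denotes untagged spend; `summarize_by_tag` is a hypothetical helper.

```python
from collections import defaultdict


def summarize_by_tag(results_by_time, metric="BlendedCost"):
    """Sum cost per tag value and compute the share of tagged spend.

    Returns (totals by tag value, tagged coverage in percent).
    """
    totals = defaultdict(float)
    for period in results_by_time:
        for group in period.get("Groups", []):
            # Keys look like "CostCenter$CC-12345"; keep the part after "$".
            value = group["Keys"][0].split("$", 1)[-1] or "(untagged)"
            totals[value] += float(group["Metrics"][metric]["Amount"])
    grand_total = sum(totals.values())
    tagged = grand_total - totals.get("(untagged)", 0.0)
    coverage = tagged / grand_total * 100 if grand_total else 0.0
    return dict(totals), coverage
```

Because the function takes plain dictionaries, it can be fed the output of `get_cost_by_tags` directly or a fixture captured from a real account.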

Tag Compliance Dashboard in Datadog

# datadog_tag_compliance.py: Publish tag compliance metrics to Datadog
from datadog import initialize, api
import boto3
import os
import time  # used by publish_to_datadog(); imported here so the module also works when imported

options = {
    'api_key': os.environ['DATADOG_API_KEY'],
    'app_key': os.environ['DATADOG_APP_KEY']
}
initialize(**options)

def calculate_tag_compliance():
    """Calculate tag compliance percentage across EC2 instances and EBS volumes in all enabled regions."""
    ec2 = boto3.client('ec2')
    regions = ec2.describe_regions()['Regions']

    mandatory_tags = ['CostCenter', 'Environment', 'Owner', 'DataClassification']
    total_resources = 0
    compliant_resources = 0

    for region in regions:
        region_name = region['RegionName']
        regional_ec2 = boto3.client('ec2', region_name=region_name)

        # Check instances (paginated so large accounts are counted in full)
        instance_pages = regional_ec2.get_paginator('describe_instances').paginate(
            Filters=[{'Name': 'instance-state-name', 'Values': ['running', 'stopped']}]
        )
        for page in instance_pages:
            for reservation in page['Reservations']:
                for instance in reservation['Instances']:
                    total_resources += 1
                    tags = {tag['Key']: tag['Value'] for tag in instance.get('Tags', [])}
                    if all(tag in tags for tag in mandatory_tags):
                        compliant_resources += 1

        # Check volumes (also paginated)
        for page in regional_ec2.get_paginator('describe_volumes').paginate():
            for volume in page['Volumes']:
                total_resources += 1
                tags = {tag['Key']: tag['Value'] for tag in volume.get('Tags', [])}
                if all(tag in tags for tag in mandatory_tags):
                    compliant_resources += 1

    compliance_pct = (compliant_resources / total_resources * 100) if total_resources > 0 else 0
    return compliance_pct, total_resources, compliant_resources

def publish_to_datadog():
    """Publish tag compliance metrics to Datadog."""
    compliance_pct, total, compliant = calculate_tag_compliance()

    now = int(time.time())

    api.Metric.send([
        {
            'metric': 'finops.tag.compliance.percentage',
            'points': [(now, compliance_pct)],
            'type': 'gauge',
            'tags': ['env:all', 'team:finops'],
            'host': 'finops-automation'
        },
        {
            'metric': 'finops.tag.compliance.total_resources',
            'points': [(now, total)],
            'type': 'gauge',
            'tags': ['env:all', 'team:finops'],
            'host': 'finops-automation'
        },
        {
            'metric': 'finops.tag.compliance.compliant_resources',
            'points': [(now, compliant)],
            'type': 'gauge',
            'tags': ['env:all', 'team:finops'],
            'host': 'finops-automation'
        }
    ])

    print(f"Published: {compliance_pct:.1f}% compliance ({compliant}/{total} resources)")

    # Print a warning if compliance drops below 95% (alerting itself is configured in Datadog)
    if compliance_pct < 95:
        print(f"WARNING: Tag compliance {compliance_pct:.1f}% is below 95% threshold!")

if __name__ == "__main__":
    import time
    publish_to_datadog()
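
A single compliance percentage does not show which tags are dragging the score down. As a complement, the sketch below counts how often each mandatory tag is missing across a set of resources; `missing_tag_counts` is a hypothetical helper that works on plain tag dictionaries such as those built inside `calculate_tag_compliance`.

```python
from collections import Counter


def missing_tag_counts(resources, mandatory_tags):
    """Count, per mandatory tag, how many resources lack it.

    resources is an iterable of tag dictionaries (key -> value).
    Empty values are treated as missing.
    """
    counts = Counter()
    for tags in resources:
        for key in mandatory_tags:
            if not tags.get(key):
                counts[key] += 1
    return counts


if __name__ == "__main__":
    fleet = [
        {"CostCenter": "CC-12345", "Environment": "dev"},
        {"CostCenter": "CC-12345", "Owner": "team-data"},
        {},
    ]
    mandatory = ["CostCenter", "Environment", "Owner", "DataClassification"]
    for tag, count in missing_tag_counts(fleet, mandatory).most_common():
        print(f"{tag}: missing on {count} resource(s)")
```

Each count could be published as its own gauge, tagged with the tag key, to give the dashboard a per-tag breakdown.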

Tag Governance Process and Review Cycle

## Tag Governance Process

### Roles and Responsibilities

┌─────────────────────────────────────────────────────────────────┐
│  FinOps Team (Owner)                                            │
│  • Define and maintain tagging standards                        │
│  • Configure Policy-as-Code enforcement                         │
│  • Monitor compliance dashboards                                │
│  • Report compliance metrics to leadership                      │
├─────────────────────────────────────────────────────────────────┤
│  Platform/SRE Team                                              │
│  • Implement tagging in Terraform modules and templates         │
│  • Maintain CI/CD pipeline policy checks                        │
│  • Provide tooling for automated remediation                    │
├─────────────────────────────────────────────────────────────────┤
│  Engineering Teams                                              │
│  • Tag all resources at creation time                           │
│  • Remediate non-compliant resources within 5 business days     │
│  • Use approved Terraform modules with built-in tags            │
├─────────────────────────────────────────────────────────────────┤
│  Security Team                                                  │
│  • Review DataClassification tagging accuracy                   │
│  • Audit compliance-scope tags for regulated workloads          │
│  • Escalate resources with incorrect classification             │
└─────────────────────────────────────────────────────────────────┘

### Review Cycle

Weekly:  Automated compliance report generated and distributed
         New non-compliant resources flagged in #finops-alerts

Monthly: Tag compliance reviewed in FinOps working group meeting
         Backfill remediation completed for resources >30 days old
         Policy adjustments made based on false positive rate

Quarterly: Full tagging standard review with all stakeholders
           New mandatory tags proposed and ratified
           Cost center validation against finance system
           Tagging policy version updated and communicated

Annually:  Comprehensive tagging strategy review
           Benchmark against industry standards
           Tool evaluation (AWS Tag Editor, custom solutions)
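
The five-business-day remediation window from the roles above can be checked automatically as part of the weekly report. This is a minimal sketch: `business_days_between` and `is_overdue` are hypothetical helpers, weekends are skipped, and public holidays are ignored for simplicity.

```python
from datetime import date, timedelta


def business_days_between(start, end):
    """Count weekdays after start, up to and including end."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            days += 1
    return days


def is_overdue(flagged_on, today, sla_days=5):
    """True when a flagged resource has exceeded the remediation window."""
    return business_days_between(flagged_on, today) > sla_days


if __name__ == "__main__":
    flagged = date(2024, 1, 1)  # a Monday
    print(is_overdue(flagged, date(2024, 1, 9)))
```

Resources for which `is_overdue` returns true are candidates for escalation in the monthly review.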

FinOps-grade Tagging Standards: The tagging framework described here is based on my direct experience introducing FinOps-grade tagging standards that enabled per-team cost attribution at previous organizations. The key success factors were: (1) start with 4-5 mandatory tags, not 15+; (2) enforce at CI/CD time, not post-deployment; (3) provide engineers with pre-tagged Terraform modules so tagging is automatic; (4) publish a weekly compliance scorecard to create positive competitive pressure between teams. With this approach, we achieved >95% compliance within 60 days.

Policy-as-Code Guardrails: I have proposed and implemented Policy-as-Code guardrails that block non-compliant resources pre-deployment. The critical insight is that enforcement must happen in the CI/CD pipeline, not after deployment. Post-deployment remediation is expensive and creates friction. Terraform Sentinel or OPA policies that fail builds with clear error messages create a self-service model where engineers fix issues themselves without FinOps team intervention.