Cloud Armor Done Right: WAF Rules, Adaptive Protection, and Bot Management That Actually Work

Most teams bolt security onto GCP as an afterthought. They enable Cloud Armor in five minutes, click "Add rule", pick the OWASP preset, and ship it — then spend the next week firefighting false positives that block legitimate users or, worse, never notice the attack traffic sailing right through because the rule sensitivity is set too low.

This article fixes that. We’re going to set up Cloud Armor properly: WAF rules with tuned sensitivity, Adaptive Protection that actually auto-deploys mitigation instead of just whining in the logs, and bot management that distinguishes a browser from a scraper without turning your login page into a CAPTCHA obstacle course.

The official Cloud Armor docs are decent reference material but they read like a manual. This is a field guide.

How Cloud Armor Actually Sits in Your Stack

Cloud Armor operates at the Global HTTP(S) Load Balancer layer, not at the VM or container level. Rules are evaluated before traffic hits your backends. That means it catches L7 attacks regardless of what’s running behind the LB — GKE, Cloud Run, App Engine, a plain instance group.

Security policies are attached to backend services, not to the load balancer frontend. One policy can be shared across multiple backend services, or you can use separate policies per service if they have different risk profiles. For most shops, one policy per environment (prod, staging) is the right granularity.

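One important architectural note: Cloud Armor is enforced at the load balancer, so traffic that never passes through a supported load balancer is never inspected. Backend security policies were designed around external Application Load Balancers. Support for other load balancer types, and for Cloud CDN content (which uses separate edge security policies), differs by product, so check the current support matrix and plan accordingly.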

Creating the Base Security Policy

Let’s do this with Terraform from the start — clicking around the console creates snowflakes that you can’t reproduce.

# security_policy.tf

resource "google_compute_security_policy" "main" {
  name        = "prod-security-policy"
  description = "Production WAF + bot management policy"

  # Adaptive Protection — enable immediately, tune later
  adaptive_protection_config {
    layer_7_ddos_defense_config {
      enable          = true
      rule_visibility = "STANDARD"  # or PREMIUM for more signal
    }
  }

  # Default rule: allow everything not matched above
  # We'll add deny rules before this
  rule {
    action   = "allow"
    priority = 2147483647  # INT32_MAX — the default rule
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "Default allow rule"
  }
}

# Attach to your backend service
resource "google_compute_backend_service" "api" {
  name     = "api-backend"
  # ... your backends ...

  security_policy = google_compute_security_policy.main.id
}

Gotcha #1: Every policy has a default rule at priority 2147483647 that matches all traffic. Declare it explicitly in Terraform so the fall-through action is a visible decision rather than an implicit one: allow for most public services, deny(403) if you want an allowlist-only setup.
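For the allowlist-only variant mentioned above, a minimal sketch of the default rule might look like the following. It reuses the same rule shape as the base policy; only the action and description change, and the description text is illustrative.

```hcl
# Sketch: default rule for an allowlist-only policy.
# Everything not explicitly allowed by a higher-precedence rule is rejected.
rule {
  action   = "deny(403)"
  priority = 2147483647 # the default rule slot
  match {
    versioned_expr = "SRC_IPS_V1"
    config {
      src_ip_ranges = ["*"]
    }
  }
  description = "Default deny: only explicitly allowed sources get through"
}
```

With this in place, every legitimate source needs its own allow rule at a lower priority number, so it suits partner or admin endpoints better than a public site.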

WAF Rules: Preconfigured Rulesets

Cloud Armor ships preconfigured rules based on the OWASP ModSecurity Core Rule Set. Each rule group targets a specific attack class. Here are the ones worth enabling for anything public-facing:

Rule Group	Expression	What It Catches
SQLi	`evaluatePreconfiguredWaf('sqli-v33-stable')`	SQL injection
XSS	`evaluatePreconfiguredWaf('xss-v33-stable')`	Cross-site scripting
LFI	`evaluatePreconfiguredWaf('lfi-v33-stable')`	Local file inclusion
RFI	`evaluatePreconfiguredWaf('rfi-v33-stable')`	Remote file inclusion
RCE	`evaluatePreconfiguredWaf('rce-v33-stable')`	Remote code execution
Scanners	`evaluatePreconfiguredWaf('scannerdetection-v33-stable')`	Vulnerability scanners
Protocol attacks	`evaluatePreconfiguredWaf('protocolattack-v33-stable')`	HTTP protocol abuse
PHP	`evaluatePreconfiguredWaf('php-v33-stable')`	PHP-specific attacks

The -v33-stable suffix is the CRS version. Always pin to stable in production — the canary variants get updates that can break things.

# WAF rules block — add to your security_policy resource

rule {
  action   = "deny(403)"
  priority = 1000
  match {
    expr {
      expression = "evaluatePreconfiguredWaf('sqli-v33-stable', {'sensitivity': 1})"
    }
  }
  description = "Block SQLi"
}

rule {
  action   = "deny(403)"
  priority = 1001
  match {
    expr {
      expression = "evaluatePreconfiguredWaf('xss-v33-stable', {'sensitivity': 1})"
    }
  }
  description = "Block XSS"
}

rule {
  action   = "deny(403)"
  priority = 1002
  match {
    expr {
      expression = "evaluatePreconfiguredWaf('rce-v33-stable', {'sensitivity': 1})"
    }
  }
  description = "Block RCE attempts"
}

rule {
  action   = "deny(403)"
  priority = 1003
  match {
    expr {
      expression = "evaluatePreconfiguredWaf('lfi-v33-stable', {'sensitivity': 1})"
    }
  }
  description = "Block LFI"
}
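Four near-identical rule blocks invite copy-paste drift. One way to keep them in sync is a dynamic block driven by a map, sketched below. The variable waf_preview and the local waf_rulesets are hypothetical names introduced for this example, and the priorities simply mirror the ones used above.

```hcl
variable "waf_preview" {
  description = "Hypothetical toggle: true logs WAF matches, false enforces them"
  type        = bool
  default     = true
}

locals {
  # Ruleset name => rule priority (illustrative values)
  waf_rulesets = {
    "sqli-v33-stable"             = 1000
    "xss-v33-stable"              = 1001
    "rce-v33-stable"              = 1002
    "lfi-v33-stable"              = 1003
    "scannerdetection-v33-stable" = 1004
  }
}

resource "google_compute_security_policy" "main" {
  name = "prod-security-policy"

  # One rule per ruleset, all sharing action, sensitivity and preview state
  dynamic "rule" {
    for_each = local.waf_rulesets
    content {
      action      = "deny(403)"
      priority    = rule.value
      preview     = var.waf_preview
      description = "WAF: ${rule.key}"
      match {
        expr {
          expression = "evaluatePreconfiguredWaf('${rule.key}', {'sensitivity': 1})"
        }
      }
    }
  }

  # Explicit default rule
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "Default allow rule"
  }
}
```

The trade-off is that every ruleset shares one sensitivity and one preview flag. If a single ruleset needs different tuning, pull it out into its own explicit rule block.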

Sensitivity Levels: Don’t Skip This Part

Every preconfigured rule accepts a sensitivity parameter from 0 to 4. This is the most important tuning knob you have.

Sensitivity 0: No rules active unless you opt individual signatures in. Rarely what you want.
Sensitivity 1: Low false positive rate. Start here for production.
Sensitivity 2: More coverage, some false positives. Consider for APIs that don’t serve browser traffic.
Sensitivity 3–4: High detection, high false positives. Only useful if you’ve tuned out the false positives manually — which almost nobody does correctly.

Start at sensitivity 1. Run rules in preview mode for a week. Promote to blocking after reviewing logs.

Gotcha #2: Sensitivity and per-signature exclusions both live in the options map of the expression. To drop an individual signature that is causing false positives, list it in opt_out_rule_ids, for example evaluatePreconfiguredWaf('sqli-v33-stable', {'sensitivity': 2, 'opt_out_rule_ids': ['owasp-crs-v030301-id942110-sqli']}). An opt-out only has an effect for signatures that are active at the chosen sensitivity, and the ID prefix has to match the ruleset version (v030301 for the v33 rulesets). You can find the specific rule ID in the Cloud Logging enforcedSecurityPolicy.preconfiguredExprIds field.
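As a full rule block, an exclusion could look roughly like this. The signature ID and the sensitivity are placeholders for whatever your own logs point at, so treat both as assumptions rather than recommended values.

```hcl
# Sketch: SQLi ruleset with one noisy signature opted out.
# Replace the ID with the one reported in preconfiguredExprIds for your traffic.
rule {
  action   = "deny(403)"
  priority = 1000
  preview  = true # re-validate in preview after changing exclusions
  match {
    expr {
      expression = "evaluatePreconfiguredWaf('sqli-v33-stable', {'sensitivity': 2, 'opt_out_rule_ids': ['owasp-crs-v030301-id942110-sqli']})"
    }
  }
  description = "SQLi with signature 942110 excluded"
}
```

Keeping the rule in preview for one more cycle after each exclusion confirms that the false positive is gone before the rule goes back to blocking.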

Preview Mode: Use It

Switching a rule to preview = true means it logs matches but doesn’t enforce them. Critical for validating rules before they block real users.

rule {
  action   = "deny(403)"
  priority = 1000
  preview  = true  # Log only — no blocking yet
  match {
    expr {
      expression = "evaluatePreconfiguredWaf('sqli-v33-stable', {'sensitivity': 2})"
    }
  }
  description = "SQLi preview — promoting to block after review"
}

Query preview hits in Cloud Logging:

# Preview matches are logged under previewSecurityPolicy, not enforcedSecurityPolicy
gcloud logging read \
  'resource.type="http_load_balancer" AND jsonPayload.previewSecurityPolicy.outcome="DENY"' \
  --project=YOUR_PROJECT \
  --format="table(timestamp, jsonPayload.previewSecurityPolicy.name, jsonPayload.previewSecurityPolicy.priority, httpRequest.requestUrl)" \
  --limit=100

Custom Rules: Rate Limiting and IP Controls

WAF rules handle attack signatures. Custom rules handle everything else — abusive clients, scrapers, internal allowlists.

# Block known bad IP ranges
rule {
  action   = "deny(403)"
  priority = 100
  match {
    versioned_expr = "SRC_IPS_V1"
    config {
      src_ip_ranges = [
        "192.0.2.0/24",   # example bad range
        "203.0.113.0/24",
      ]
    }
  }
  description = "Blocklist — manually maintained bad actors"
}

# Allowlist internal monitoring and CI
rule {
  action   = "allow"
  priority = 50
  match {
    versioned_expr = "SRC_IPS_V1"
    config {
      src_ip_ranges = [
        "10.0.0.0/8",       # internal VPC
        "35.191.0.0/16",    # GCP health checker range
        "130.211.0.0/22",   # GCP health checker range
      ]
    }
  }
  description = "Allow internal + health checks"
}

# Rate limit by IP — 100 requests per minute
rule {
  action   = "throttle"
  priority = 500

  rate_limit_options {
    conform_action = "allow"
    exceed_action  = "deny(429)"

    rate_limit_threshold {
      count        = 100
      interval_sec = 60
    }

    enforce_on_key = "IP"
  }

  match {
    versioned_expr = "SRC_IPS_V1"
    config {
      src_ip_ranges = ["*"]
    }
  }
  description = "Global rate limit — 100 req/min per IP"
}

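Gotcha #3: Health check probes come from 35.191.0.0/16 and 130.211.0.0/22, but they go straight to your backends and are not evaluated by Cloud Armor. Those ranges must be allowed in your VPC firewall rules, otherwise the load balancer marks backends as unhealthy, which is one of the most common "I broke prod" mistakes when locking down a service. Listing them in the Cloud Armor allowlist is harmless but does not replace the firewall rule. What the allowlist rule does protect is your own monitoring and CI traffic: give it a lower priority number (higher precedence) than your deny rules.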
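A flat per-IP throttle treats every path the same. For endpoints that attract credential stuffing, a tighter rule that temporarily bans offenders can sit in front of it. The sketch below assumes a /login path, a priority of 400, and thresholds picked purely for illustration.

```hcl
# Sketch: ban clients that hammer the login endpoint.
# Path, priority and thresholds are assumptions; tune them from real traffic.
rule {
  action   = "rate_based_ban"
  priority = 400

  rate_limit_options {
    conform_action = "allow"
    exceed_action  = "deny(429)"
    enforce_on_key = "IP"

    rate_limit_threshold {
      count        = 20
      interval_sec = 60
    }

    # Once the threshold is exceeded, keep rejecting this client for 10 minutes
    ban_duration_sec = 600
  }

  match {
    expr {
      expression = "request.path.matches('/login')"
    }
  }
  description = "Ban IPs exceeding 20 login requests per minute"
}
```

Because it has a lower priority number than the global throttle at 500, login traffic is evaluated against this rule first, and everything else still falls through to the general limit.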

Adaptive Protection: ML-Based L7 DDoS Defense

Adaptive Protection monitors your traffic baseline and flags volumetric L7 attacks — coordinated floods, slowloris variants, layer 7 amplification. When it detects an attack, it suggests a mitigation rule in the console and in Cloud Logging. With auto-deploy enabled, it can activate that rule without manual intervention.

Enabling Auto-Deploy

Auto-deploy is opt-in and deserves thought before you enable it. It will block traffic based on ML scoring. False positives happen.

resource "google_compute_security_policy" "main" {
  name = "prod-security-policy"

  adaptive_protection_config {
    layer_7_ddos_defense_config {
      enable          = true
      rule_visibility = "STANDARD"
    }

    auto_deploy_config {
      load_threshold              = 0.8   # CPU/RPS load % that triggers auto-deploy
      confidence_threshold        = 0.5   # ML confidence required before deploying
      impacted_baseline_threshold = 0.01  # % of legit traffic the rule must not block
      expiration_sec              = 7200  # Auto-deployed rules expire after 2 hours
    }
  }
  # ... rules ...
}

The impacted_baseline_threshold is the most important safety parameter. 0.01 means: don’t auto-deploy a rule if it would affect more than 1% of your normal traffic. Raising it toward 1.0 removes the safety net, because a rule that hits most of your legitimate traffic would still qualify for deployment. Keep it low in production.
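The auto_deploy_config block only sets the thresholds. As far as I know, the mitigation is enforced through an ordinary rule whose match expression is evaluateAdaptiveProtectionAutoDeploy(). A minimal sketch follows; the priority of 300 is an assumption, and starting in preview is a suggestion rather than a requirement.

```hcl
# Sketch: rule that applies Adaptive Protection's auto-deployed mitigations.
# Priority is illustrative; place it where it fits your own ladder.
rule {
  action   = "deny(403)"
  priority = 300
  preview  = true # observe what would be blocked before enforcing
  match {
    expr {
      expression = "evaluateAdaptiveProtectionAutoDeploy()"
    }
  }
  description = "Enforce Adaptive Protection auto-deployed mitigations"
}
```

Running it in preview through at least one real alert shows what the model would have blocked before you trust it with enforcement.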

Monitoring Adaptive Protection Events

# See what Adaptive Protection is detecting
gcloud logging read \
  'resource.type="network_security_policy"' \
  --project=YOUR_PROJECT \
  --freshness=1h \
  --format=json | jq '.[] | .jsonPayload'

You can also create a log-based alert in Cloud Monitoring on these Adaptive Protection log entries to get paged when an attack is in progress.

Gotcha #4: Adaptive Protection needs at least a few days of baseline traffic to work well. If you deploy it on day one of a new service and immediately get probed, the "baseline" is too noisy for the model to distinguish attack from normal. Give it a week before trusting auto-deploy suggestions.

Bot Management with reCAPTCHA Enterprise

Cloud Armor’s bot management integrates with reCAPTCHA Enterprise. There are three modes:

Token validation — your frontend embeds a reCAPTCHA v3 token; Cloud Armor validates it server-side and enforces based on score.
Browser challenge — Cloud Armor redirects suspicious requests to a reCAPTCHA challenge page transparently (good for login pages).
Manual challenge — explicit CAPTCHA for high-risk actions.

Wiring Up reCAPTCHA Enterprise

First, you need a reCAPTCHA Enterprise key:

gcloud recaptcha keys create \
  --web \
  --display-name="prod-web-key" \
  --allow-all-domains \
  --integration-type=score \
  --waf-feature=session-token \
  --waf-service=ca \
  --project=YOUR_PROJECT

Then link the key to your Cloud Armor policy:

resource "google_recaptcha_enterprise_key" "prod" {
  display_name = "prod-web-key"
  project      = var.project_id

  web_settings {
    integration_type  = "SCORE"
    allow_all_domains = true
    # Or restrict to specific domains:
    # allowed_domains = ["yourdomain.com"]
  }
}

resource "google_compute_security_policy" "main" {
  name = "prod-security-policy"

  recaptcha_options_config {
    redirect_site_key = google_recaptcha_enterprise_key.prod.name
  }

  # ... other config ...
}

Bot Management Rules

# Block bot sessions with very low reCAPTCHA score
rule {
  action   = "deny(403)"
  priority = 200
  match {
    expr {
      expression = "request.path.matches('/api/.*') && token.recaptcha_session.score < 0.3"
    }
  }
  description = "Block low-score bots from API"
}

# Redirect suspicious login attempts to CAPTCHA challenge
rule {
  action   = "redirect"
  priority = 210

  redirect_options {
    type   = "GOOGLE_RECAPTCHA"
    # TARGET not needed for GOOGLE_RECAPTCHA type
  }

  match {
    expr {
      expression = "request.path == '/login' && token.recaptcha_session.score < 0.5"
    }
  }
  description = "Challenge suspicious login attempts"
}

# Allow humans through
rule {
  action   = "allow"
  priority = 215
  match {
    expr {
      expression = "token.recaptcha_session.score >= 0.5"
    }
  }
  description = "Allow high-confidence human sessions"
}

Frontend Integration

Your pages need to embed the reCAPTCHA session token. For a React app:

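import { useEffect } from 'react';

// Load the reCAPTCHA script with your site key
// (call this inside a component that wraps the pages you want protected)
useEffect(() => {
  const script = document.createElement('script');
  script.src = `https://www.google.com/recaptcha/enterprise.js?render=${RECAPTCHA_SITE_KEY}&waf=session`;
  script.async = true;
  document.head.appendChild(script);
  // Remove the tag on unmount so remounts don't stack duplicate scripts
  return () => script.remove();
}, []);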

The waf=session parameter is what tells the reCAPTCHA library to inject the session cookie that Cloud Armor reads. Without it, token validation won’t work at the WAF layer.

Gotcha #5: The token.recaptcha_session context variable only works when users have the reCAPTCHA script loaded. API clients that don’t run JavaScript will have no token, and rules checking the score will not match, so those requests fall through to the next rule. Structure your rules so that no-token requests either hit a challenge or are rate-limited separately. Don’t assume "no token = bot" naively, because internal monitoring and server-to-server clients also won’t have tokens.
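One way to handle tokenless traffic is a separate, stricter throttle. The sketch below assumes that the rules language exposes a boolean token.recaptcha_session.valid attribute, which is worth verifying against the current reference before relying on it. The path, priority and limits are illustrative.

```hcl
# Sketch: throttle API requests that carry no valid reCAPTCHA session token.
# Assumes token.recaptcha_session.valid exists; verify before deploying.
rule {
  action   = "throttle"
  priority = 220

  rate_limit_options {
    conform_action = "allow"
    exceed_action  = "deny(429)"
    enforce_on_key = "IP"

    rate_limit_threshold {
      count        = 30
      interval_sec = 60
    }
  }

  match {
    expr {
      expression = "request.path.matches('/api/.*') && !token.recaptcha_session.valid"
    }
  }
  description = "Stricter rate limit for tokenless API clients"
}
```

Known server-to-server clients should be matched by an allow rule at a lower priority number so they never reach this throttle.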

Logging and Monitoring

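Cloud Armor decisions are written to the load balancer’s request logs, and those are only produced when logging is enabled on the backend service. Turn it on with a 100% sample rate while you tune: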

gcloud compute backend-services update api-backend \
  --global \
  --logging-sample-rate=1.0 \
  --enable-logging

In Terraform:

resource "google_compute_backend_service" "api" {
  name = "api-backend"

  log_config {
    enable      = true
    sample_rate = 1.0
  }
}
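Request logging on the backend service records which rule matched. For WAF tuning it can also help to raise the policy's own log level, which is a separate setting. A minimal sketch, assuming the same policy resource as above:

```hcl
# Sketch: verbose Cloud Armor logging on the policy itself.
# Adds detail about why a preconfigured WAF signature matched.
resource "google_compute_security_policy" "main" {
  name = "prod-security-policy"

  advanced_options_config {
    log_level = "VERBOSE" # default is NORMAL
  }

  # ... rules ...
}
```

Verbose entries are larger and can include parts of the matched request, so it may be worth switching back to NORMAL once tuning is done.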

Key fields in jsonPayload.enforcedSecurityPolicy:

name: which policy fired
priority: which rule matched
outcome: ACCEPT or DENY (preview-mode matches are recorded separately under jsonPayload.previewSecurityPolicy)
preconfiguredExprIds: specific WAF sub-rule IDs (gold for tuning false positives)

Build a Cloud Monitoring dashboard tracking:

networksecurity.googleapis.com/https/request_count filtered by blocked=true: your block rate
A log-based metric over Adaptive Protection alert entries: whether an attack is currently being flagged
403 and 5xx rates at the load balancer: sanity check that the WAF isn’t over-blocking and the backends are healthy

Production Hardening Checklist

Start in preview mode. Every new WAF rule gets preview = true for at least 48 hours, and ideally the full week suggested earlier for the initial rollout. No exceptions. Review logs, identify false positives, add exclusions, then flip to blocking.

Use a separate policy for each risk tier. Your /api/public/health endpoint doesn’t need the same scrutiny as /admin or /api/payments. Different backend services can carry policies with different sensitivity levels.
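A second policy for a higher-risk tier can be sketched as below. The names admin and admin-backend are hypothetical, and the higher sensitivity is only an example of how the two tiers might differ.

```hcl
# Sketch: stricter policy for an admin backend (names are hypothetical).
resource "google_compute_security_policy" "admin" {
  name        = "prod-admin-security-policy"
  description = "Stricter WAF settings for admin endpoints"

  rule {
    action   = "deny(403)"
    priority = 1000
    match {
      expr {
        expression = "evaluatePreconfiguredWaf('sqli-v33-stable', {'sensitivity': 2})"
      }
    }
    description = "Block SQLi at higher sensitivity"
  }

  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "Default allow rule"
  }
}

resource "google_compute_backend_service" "admin" {
  name = "admin-backend"
  # ... your backends ...

  security_policy = google_compute_security_policy.admin.id
}
```

This only works if the risk tiers are served by different backend services, which usually means routing /admin through its own path rule in the URL map.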

Automate the blocklist pipeline. Don’t maintain IP blocklists by hand. Feed threat intel (from services like Google Cloud Threat Intelligence or external feeds) into a Cloud Function that calls the Cloud Armor API to update the security policy. Stale blocklists give false confidence.
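If the feed is refreshed through CI rather than a Cloud Function, the same idea can be expressed in Terraform by generating rules from a variable. This is a swapped-in approach, not the API-driven pipeline described above. The sketch assumes a hypothetical blocked_ranges variable and splits it into groups of ten, since to my knowledge a single SRC_IPS_V1 rule accepts at most ten ranges.

```hcl
variable "blocked_ranges" {
  description = "Hypothetical input: CIDR ranges produced by your threat-intel job"
  type        = list(string)
  default     = ["192.0.2.0/24", "203.0.113.0/24"]
}

locals {
  # Split the list into chunks of 10 ranges, one rule per chunk
  blocklist_chunks = chunklist(var.blocked_ranges, 10)
}

resource "google_compute_security_policy" "main" {
  name = "prod-security-policy"

  dynamic "rule" {
    for_each = local.blocklist_chunks
    content {
      action   = "deny(403)"
      priority = 100 + rule.key # rule.key is the chunk index: 100, 101, ...
      match {
        versioned_expr = "SRC_IPS_V1"
        config {
          src_ip_ranges = rule.value
        }
      }
      description = "Blocklist chunk ${rule.key}"
    }
  }

  # ... other rules, including the default rule ...
}
```

Leave a gap in the priority ladder after 100 so a growing list never collides with the next rung.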

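Test before promoting. Keep the new rule in preview mode, send a crafted request through a non-production load balancer, and confirm in the logs which rule matched (staging.example.com is a placeholder hostname):

curl -s -o /dev/null -w "%{http_code}\n" \
  "https://staging.example.com/search?q=1'+OR+'1'='1"

gcloud logging read \
  'resource.type="http_load_balancer" AND jsonPayload.previewSecurityPolicy.outcome="DENY"' \
  --project=YOUR_PROJECT \
  --freshness=10m \
  --limit=10

This tells you which rule would fire without blocking any real users.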

Set budget alerts on Cloud Armor pricing. Adaptive Protection and bot management cost money beyond the base policy pricing. L7 DDoS protection is billed per policy per month plus per million requests. A misconfigured rule causing a traffic spike can also spike your bill. Set a budget alert at 150% of expected spend.

Gotcha #6: Terraform’s google_compute_security_policy resource will destroy and recreate the policy if you change certain immutable attributes (like type). A security policy deletion momentarily leaves your backend unprotected. Use lifecycle { prevent_destroy = true } on your policy resource and be deliberate about updates.
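The guard mentioned above is a one-line lifecycle block on the policy resource, sketched here against the same resource name used throughout:

```hcl
# Sketch: make Terraform refuse any plan that would destroy the policy.
resource "google_compute_security_policy" "main" {
  name = "prod-security-policy"

  lifecycle {
    prevent_destroy = true
  }

  # ... rules ...
}
```

With this set, a plan that needs to replace the policy fails with an error instead of proceeding, which forces the change to be made deliberately.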

Putting It All Together

Here’s the rough priority ladder for a typical public web application:

Priority	Rule	Action
50	Internal IPs + health checks	Allow
100	Known bad IPs blocklist	Deny 403
200	Low reCAPTCHA score on sensitive paths	Deny 403
210	Suspicious login (mid reCAPTCHA score)	Redirect to challenge
500	Rate limit — 100 req/min per IP	Throttle
1000	SQLi WAF rule (sensitivity 1)	Deny 403
1001	XSS WAF rule (sensitivity 1)	Deny 403
1002	RCE WAF rule (sensitivity 1)	Deny 403
1003	LFI WAF rule (sensitivity 1)	Deny 403
1004	Scanner detection (sensitivity 1)	Deny 403
2147483647	Default	Allow

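Adaptive Protection is configured at the policy level rather than as a rung on this ladder. Its auto-deployed mitigations are enforced through a rule whose match expression is evaluateAdaptiveProtectionAutoDeploy(), so they take effect at whatever priority you give that rule.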

Lower priority number = evaluated first. This ordering matters: your allowlist needs to sit at a lower number than your deny rules, or your internal monitoring gets blocked alongside the attackers.

The temptation is to crank sensitivity to 3 or 4 across all rules and feel secure. That’s how you end up with users unable to search for O'Brien because the apostrophe trips an SQLi rule. Start conservative, tune with real traffic data, and treat your WAF as a living configuration rather than a one-time deploy.