CloudFront + WAFv2: Edge front door

Status: Active in prod since 2026-04-23 (Phase 8) and in staging since 2026-04-21 (Phase 6).

Purpose: SOC 2 CC6.1 (logical access controls) / CC6.6 (boundary protection), HIPAA §164.308(a)(1)(ii)(B) (risk management) + §164.312(b) (audit controls), NIST 800-53 R4 SC-7 (boundary protection) + SI-4 (system monitoring), CMS EDE Phase 3 perimeter defense.

Summary

Every viewer request to askflorence.health, www.askflorence.health, or stage.askflorence.health lands on CloudFront first. CloudFront enforces TLS 1.2+ at the viewer edge, attaches a response-headers policy for security + IP-opacity headers, and runs every request through a WAFv2 web ACL before forwarding to the origin ALB. WAF logs ship to a CloudWatch log group in each environment account; the prod log group is on a 90-day hot retention with a planned cross-account export to log-archive S3 (object-lock COMPLIANCE 7-year) at Phase 11.

The configuration is Terraform-managed via infra/modules/cloudfront-waf/, wired into each environment from infra/envs/{staging,prod}/cloudfront.tf. There is exactly one place to change rule structure for both environments — the module — which is how scoped exemptions stay consistent across envs.

Resources

| Resource | Staging | Prod |
|---|---|---|
| Web ACL name | askflorence-staging-web-acl | askflorence-prod-web-acl |
| Web ACL ID | 4d7e1072-04b4-466b-b67a-5ce03036757d | e05c650b-4dec-456a-af42-3ec0a7c3dcdc |
| Account | 549136075525 | 039624954211 |
| Region (CLOUDFRONT scope) | us-east-1 | us-east-1 |
| WAF log group | aws-waf-logs-askflorence-staging-web-acl | aws-waf-logs-askflorence-prod-web-acl |
| Log retention (CloudWatch) | 14 days | 90 days |
| KMS encryption (logs) | alias/askflorence-staging-data | alias/askflorence-prod-data |
| Default action | Allow | Allow |
| Aliases served | stage.askflorence.health | askflorence.health, www.askflorence.health, prod-canary.askflorence.health |

Rule stack (priority order)

| Priority | Rule | Vendor | Mode | Scope-down |
|---|---|---|---|---|
| 0 | AWSManagedRulesCommonRuleSet | AWS managed | Enforce (vendor default Block) | URI does NOT start with /ingest/ (see Scoped exemptions) |
| 10 | AWSManagedRulesKnownBadInputsRuleSet | AWS managed | Enforce | None; runs on all requests |
| 20 | AWSManagedRulesSQLiRuleSet | AWS managed | Enforce | URI does NOT start with /ingest/ (see Scoped exemptions) |
| 30 | AWSManagedRulesAmazonIpReputationList | AWS managed | Enforce | User-Agent does NOT match the documented social-crawler allowlist (see Scoped exemptions) |
| 40 | AWSManagedRulesAnonymousIpList | AWS managed | Enforce | Same UA-allowlist exemption as priority 30 |
| 100 | RateBasedBlanket | Custom | Block | None; 2000 req/5min/IP applies to ALL requests, including traffic exempted from managed rules |

The default WAF action is Allow. Every rule that matches its statement returns the rule action (Block for managed groups in Enforce mode, Block for the rate-based rule). Anything that doesn't match any rule passes through.

Scoped exemptions

Two scoped exemptions remediate documented false positives observed after the Phase 10 cutover. Both are narrow, documented, IaC-managed, and preserve audit logging, which is the model the migration plan calls for under "documented, risk-based deviations from default managed-rule posture."

Exemption 1: PostHog analytics proxy /ingest/*

Rules exempted: AWSManagedRulesCommonRuleSet (priority 0) and AWSManagedRulesSQLiRuleSet (priority 20).

Scope: request URI starts with /ingest/.

Why: the path is a first-party Next.js rewrite to PostHog (/ingest/static/* → us-assets.i.posthog.com/static/* and /ingest/* → us.i.posthog.com/*). Browsers POST gzip-compressed event payloads to it. The compressed body pattern-matches Common (size/format/encoding) and SQLi signatures, returning HTTP 403 to every legitimate analytics emit. The endpoint does not surface SQL or user-controllable input — it forwards opaque event blobs to PostHog. Managed-rule inspection adds no value here.

Residual coverage on /ingest/* requests:

  • AWSManagedRulesKnownBadInputsRuleSet (priority 10) — still active.
  • AWSManagedRulesAmazonIpReputationList (priority 30) — still active.
  • AWSManagedRulesAnonymousIpList (priority 40) — still active.
  • RateBasedBlanket (priority 100) — still active. 2000 req/5min/IP cap applies to /ingest/* traffic the same as everywhere else.

Exemption 2: Social-media link-unfurl crawlers

Rules exempted: AWSManagedRulesAmazonIpReputationList (priority 30) and AWSManagedRulesAnonymousIpList (priority 40).

Scope: User-Agent contains (case-insensitive) any of:

| User-Agent substring | Crawler |
|---|---|
| telegrambot | Telegram link previews |
| facebookexternalhit | Facebook + Instagram OG fetch |
| facebookcatalog | Facebook product catalog |
| linkedinbot | LinkedIn link previews |
| slackbot | Slack link unfurl (matches Slackbot-LinkExpanding) |
| discordbot | Discord link previews |
| twitterbot | Twitter/X card validator |
| whatsapp | WhatsApp link previews |
| skypeuripreview | Skype/Teams link previews |
| redditbot | Reddit link previews |
| applebot | Apple Spotlight/Siri previews (partial iMessage coverage) |

The allowlist is in infra/modules/cloudfront-waf/variables.tf under social_crawler_user_agents. Variable validation enforces a minimum of 2 entries (WAFv2 or_statement requires ≥2 sub-statements).

Why: these crawlers operate from cloud datacenter CIDR ranges (Telegram 149.154.0.0/16, Meta AS32934, Microsoft Azure ranges for LinkedIn/Skype, etc.) that the AWS-managed IP-reputation feeds flag wholesale based on activity from other actors in the same range. Pre-fix, every social share of an askflorence.health link returned a broken preview — material funnel drag for the consumer + agent acquisition flows.

Residual coverage on crawler-UA requests:

  • AWSManagedRulesCommonRuleSet (priority 0) — still active.
  • AWSManagedRulesKnownBadInputsRuleSet (priority 10) — still active.
  • AWSManagedRulesSQLiRuleSet (priority 20) — still active.
  • RateBasedBlanket (priority 100) — still active.

A UA-spoofing attacker from a flagged IP must therefore still bypass payload-inspection rules and the rate-based cap to make any progress. The exemption only neuters IP-reputation gating for the documented allowlist, not the rest of the rule stack.
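
The residual coverage for both exemptions can be worked through with a small shell model. This is only a sketch of the documented rule stack, not the Terraform logic itself: the function names `is_social_crawler_ua` and `inspecting_rules` are hypothetical, and the matching is a plain case-insensitive substring test on the allowlist from the table above.

```bash
# Sketch only: models which rules in the documented stack inspect a request.
# Function names are hypothetical; the real scope-downs live in the Terraform module.

# Returns 0 if the User-Agent contains any allowlisted substring (case-insensitive).
is_social_crawler_ua() {
  ua_lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$ua_lower" in
    *telegrambot*|*facebookexternalhit*|*facebookcatalog*|*linkedinbot*) return 0 ;;
    *slackbot*|*discordbot*|*twitterbot*|*whatsapp*) return 0 ;;
    *skypeuripreview*|*redditbot*|*applebot*) return 0 ;;
  esac
  return 1
}

# Prints the rules that still inspect a request, in priority order.
# $1 = request URI path, $2 = User-Agent header value.
inspecting_rules() {
  uri=$1
  ua=$2
  ingest=0
  crawler=0
  case "$uri" in /ingest/*) ingest=1 ;; esac
  if is_social_crawler_ua "$ua"; then crawler=1; fi

  if [ "$ingest" -eq 0 ]; then echo "0 AWSManagedRulesCommonRuleSet"; fi
  echo "10 AWSManagedRulesKnownBadInputsRuleSet"
  if [ "$ingest" -eq 0 ]; then echo "20 AWSManagedRulesSQLiRuleSet"; fi
  if [ "$crawler" -eq 0 ]; then echo "30 AWSManagedRulesAmazonIpReputationList"; fi
  if [ "$crawler" -eq 0 ]; then echo "40 AWSManagedRulesAnonymousIpList"; fi
  echo "100 RateBasedBlanket"
}

# Example: a browser posting analytics events to the PostHog proxy.
inspecting_rules "/ingest/e/" "Mozilla/5.0"
```

The example call lists priorities 10, 30, 40, and 100, which matches the residual-coverage list for Exemption 1. Under this model, a request that matches both scope-downs at once (a crawler User-Agent on an /ingest/ path) is left with priorities 10 and 100 only, which may be worth keeping in mind when reviewing the exemptions together.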

Compliance posture (both exemptions)

| Framework | Control | How this configuration satisfies it |
|---|---|---|
| HIPAA Security Rule | §164.308(a)(1)(ii)(B) Risk management | Documented risk-based exception with compensating controls. |
| HIPAA Security Rule | §164.312(b) Audit controls | All /ingest/* and crawler-UA requests are still logged to the WAF log group, with the action field showing whether the exempted rule(s) were skipped. Forensics intact. |
| SOC 2 TSC | CC6.1 Logical access controls | Boundary controls remain in BLOCK mode for all rules; exemptions are payload-class- and identity-scoped, not blanket allows. |
| SOC 2 TSC | CC6.6 Boundary protection | Defense in depth preserved: a request that matches one exemption still passes through 4 enforcement layers (a request that matches both still passes through KnownBadInputs and the rate-based rule). |
| SOC 2 TSC | CC7.1 / CC7.2 System monitoring | CloudWatch metrics on every managed rule group expose pre/post-exemption block rates. CloudTrail records wafv2:UpdateWebACL for every IaC change. |
| SOC 2 TSC | CC8.1 Change management | Terraform-managed; commit history + reviewed PR + dated change-log entry. |
| NIST 800-53 R4 (MARS-E 2.2) | SC-7 Boundary protection | "Risk-commensurate" boundary protection on a public-data path. |
| NIST 800-53 R4 (MARS-E 2.2) | SI-4 System monitoring | WAF logs to CloudWatch Logs (cross-account S3 export planned for Phase 11) + CloudWatch metrics unchanged. |
| NIST 800-53 R4 (MARS-E 2.2) | AU-2 / AU-3 Audit events | Audit records are still generated for every request, including the action, terminatingRuleId, and ruleGroupList fields that show the exemption fired. |
| CMS EDE Phase 3 | MARS-E 2.2 inheritance | Both exemptions apply only to public consumer + first-party analytics paths that carry no PHI / PII / FTI / application / cms_hub data class today. |

When to re-evaluate

| Trigger | Action |
|---|---|
| Phase 5 cutover (agent portal + member dashboard ship) | Confirm authenticated/PHI-bearing routes are not addressable from the social-crawler UA allowlist (crawlers can't authenticate, so this is naturally self-limiting). Confirm /ingest/* is not carrying any new property-class data; the SDK property allowlist + outbound-egress wire-level guard are the actual defenses there, not WAF. |
| EDE Phase 3 audit prep (~Sept 2026) | Include both exemptions in the NIST 800-53 control mapping document under SC-7. Each gets a one-page entry citing this section. |
| PostHog vendor decision (Phase 11) | If PostHog migrates to self-hosted in our AWS FedRAMP Moderate env or is replaced with CloudWatch RUM, re-assess whether the /ingest/* exemption is still needed. |
| Crawler list drift | Add new entries to social_crawler_user_agents only when (a) a real partner/share path is breaking AND (b) the crawler's documentation lists a stable User-Agent string. Avoid generic browser strings. |

Response-headers policy

Attached to both the default and /_next/static/* cache behaviors. It does two jobs:

  1. Security headers:
    • Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
    • Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; font-src 'self' data:; connect-src 'self' https://us.i.posthog.com https://us-assets.i.posthog.com; frame-ancestors 'none'; base-uri 'self'; form-action 'self'
    • X-Content-Type-Options: nosniff
    • X-Frame-Options: DENY
    • Referrer-Policy: strict-origin-when-cross-origin
  2. IP opacity (per migration plan):
    • Server: AskFlorence (override)
    • Strip X-Powered-By, X-AspNet-Version, X-AspNetMvc-Version
    • Via is intentionally not stripped — CloudFront refuses to suppress it via RemoveHeaders; CloudFront's own Via identifies the CDN, not the origin stack, so this is OK for IP-opacity goals.

CSP unsafe-inline + unsafe-eval will tighten in Phase 11 once Next.js inline scripts move to nonces or hashes.
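
One way to spot-check the policy is to pipe response headers through a small filter. The sketch below is an assumption-laden helper, not something from the repo: `check_security_headers` is a hypothetical name, it checks header presence only (not values), and it treats the three stripped headers as leaks if they appear.

```bash
# Sketch only: reads raw response headers on stdin and reports deviations from
# the documented response-headers policy. check_security_headers is a
# hypothetical helper name. It checks header names only, not their values.
check_security_headers() {
  # Normalise line endings and case so matching is independent of the server.
  headers=$(tr -d '\r' | tr '[:upper:]' '[:lower:]')
  failed=0
  # Headers the policy is documented to add.
  for name in strict-transport-security content-security-policy \
              x-content-type-options x-frame-options referrer-policy; do
    if ! printf '%s\n' "$headers" | grep -q "^$name:"; then
      echo "MISSING: $name"
      failed=1
    fi
  done
  # Headers the policy is documented to strip must be absent.
  for name in x-powered-by x-aspnet-version x-aspnetmvc-version; do
    if printf '%s\n' "$headers" | grep -q "^$name:"; then
      echo "LEAKED: $name"
      failed=1
    fi
  done
  return "$failed"
}

# Example wiring (needs network access, so it is left commented out):
# curl -sI "https://askflorence.health/" | check_security_headers && echo "headers OK"
```

The function prints nothing and returns 0 when every expected header is present and none of the stripped headers leak, so it can gate a post-deploy script directly.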

Verification

Run after any rule change:

```bash
# A. PostHog /ingest/e/: should NOT be 403 (scope-down working)
curl -s -o /dev/null -w "%{http_code}\n" -X POST -H "Content-Type: application/json" \
  -d '{"api_key":"x","event":"test"}' \
  "https://askflorence.health/ingest/e/?ip=0&_=test&ver=1.367.0"
# Expect: 400 from PostHog (rejecting test body), NOT 403.

# B. SQLi probe on a NON-/ingest path: should STILL be 403
curl -s -o /dev/null -w "%{http_code}\n" \
  "https://askflorence.health/api/counties?zip=84094%27%20UNION%20SELECT%201,2,3--"
# Expect: 403. Confirms SQLi rule still enforces general traffic.

# C. Crawler UA: should be 200 (exemption + normal serve)
curl -s -o /dev/null -A "TelegramBot (like TwitterBot)" -w "%{http_code}\n" \
  "https://askflorence.health/"
# Expect: 200. Repeat for facebookexternalhit, LinkedInBot, Slackbot, etc.

# D. Health endpoint
curl -s "https://askflorence.health/api/health"
# Expect: {"status":"ok","commit":"...","env":"prod"}
```
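
If these checks are run from a script rather than by eye, the expected status codes can be turned into pass/fail assertions. The helpers below are a minimal sketch; `expect_status` and `expect_not_status` are hypothetical names, and the curl wiring is left commented out because it needs network access.

```bash
# Sketch only: pass/fail wrappers for the status codes documented above.
# expect_status and expect_not_status are hypothetical helper names.
# Arguments for both: $1 = label, $2 = reference status code, $3 = observed status code.

# Passes when the observed code equals the reference code.
expect_status() {
  if [ "$3" = "$2" ]; then
    echo "PASS $1 (got $3)"
  else
    echo "FAIL $1 (expected $2, got $3)"
    return 1
  fi
}

# Passes when the observed code differs from the reference code.
expect_not_status() {
  if [ "$3" != "$2" ]; then
    echo "PASS $1 (got $3)"
  else
    echo "FAIL $1 (must not be $2)"
    return 1
  fi
}

# Example wiring for checks B and C (commented out: needs network access):
# code=$(curl -s -o /dev/null -w "%{http_code}" \
#   "https://askflorence.health/api/counties?zip=84094%27%20UNION%20SELECT%201,2,3--")
# expect_status "SQLi probe blocked" 403 "$code"
# code=$(curl -s -o /dev/null -A "TelegramBot (like TwitterBot)" -w "%{http_code}" \
#   "https://askflorence.health/")
# expect_status "crawler UA served" 200 "$code"
```

Check A fits `expect_not_status` with 403 as the reference code, since the documented expectation is "anything PostHog returns, as long as the WAF did not block it".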

Operational runbook

Inspect what the WAF blocked recently

```bash
# Recent BLOCK actions (last 30 minutes), prod
aws --profile askflorence-prod logs filter-log-events \
  --log-group-name aws-waf-logs-askflorence-prod-web-acl \
  --filter-pattern '{ $.action = "BLOCK" }' \
  --start-time $(($(date +%s)*1000 - 1800000)) \
  --max-items 20 \
  --query 'events[].message' \
  --output text | jq '{
    timestamp: (.timestamp | tonumber | (. / 1000) | strftime("%Y-%m-%dT%H:%M:%SZ")),
    uri: .httpRequest.uri,
    ip: .httpRequest.clientIp,
    ua: ([.httpRequest.headers[] | select(.name | ascii_downcase == "user-agent") | .value] | first),
    rule: .terminatingRuleId,
    group: .ruleGroupList[0].ruleGroupId
  }'
```

Confirm an exempted crawler UA is no longer being blocked

```bash
# Count BLOCKs in the last hour matching a crawler UA; should be 0 after the fix
aws --profile askflorence-prod logs filter-log-events \
  --log-group-name aws-waf-logs-askflorence-prod-web-acl \
  --filter-pattern '{ $.action = "BLOCK" && ($.httpRequest.headers[*].value = "*TelegramBot*" || $.httpRequest.headers[*].value = "*facebookexternalhit*" || $.httpRequest.headers[*].value = "*LinkedInBot*" || $.httpRequest.headers[*].value = "*Slackbot*" || $.httpRequest.headers[*].value = "*Discordbot*" || $.httpRequest.headers[*].value = "*Twitterbot*" || $.httpRequest.headers[*].value = "*WhatsApp*") }' \
  --start-time $(($(date +%s)*1000 - 3600000)) \
  --query 'length(events[])' \
  --output text
```

Add a new crawler to the allowlist

  1. Edit social_crawler_user_agents in infra/modules/cloudfront-waf/variables.tf.
  2. Append a row to Exemption 2's allowlist table above with the crawler name + UA substring + reason.
  3. terraform apply against staging, then prod.
  4. Append an entry to change-log.md with timestamp + commit SHA.
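
Before step 3, a proposed list can be sanity-checked locally. The helper below is a sketch under stated assumptions: `validate_crawler_allowlist` is a hypothetical name, the lowercase rule is inferred from the allowlist table rather than confirmed by the module, and the set of "generic browser strings" is an illustrative guess. The Terraform variable validation remains the authoritative check.

```bash
# Sketch only: pre-flight check for a proposed social_crawler_user_agents list.
# validate_crawler_allowlist is a hypothetical helper, not part of the repo.
# Arguments: one allowlist entry per argument.
# Note: this treats an empty list as invalid; disabling the exemption with an
# empty list is a separate runbook path and is out of scope here.
validate_crawler_allowlist() {
  # WAFv2 or_statement needs at least two sub-statements.
  if [ "$#" -lt 2 ]; then
    echo "ERROR: need at least 2 entries, got $#"
    return 1
  fi
  status=0
  for entry in "$@"; do
    case "$entry" in
      "")
        echo "ERROR: empty entry"
        status=1 ;;
      *[[:upper:]]*)
        # Assumption: entries are stored lowercase, as in the allowlist table.
        echo "ERROR: '$entry' is not lowercase"
        status=1 ;;
      mozilla|chrome|safari|firefox|edge|applewebkit|gecko)
        # Illustrative list: generic browser tokens would exempt ordinary browsers too.
        echo "ERROR: '$entry' is a generic browser string"
        status=1 ;;
    esac
  done
  # Duplicate entries add sub-statements without adding coverage.
  dupes=$(printf '%s\n' "$@" | sort | uniq -d)
  if [ -n "$dupes" ]; then
    echo "ERROR: duplicate entries: $dupes"
    status=1
  fi
  return "$status"
}

# Example: the current list plus one hypothetical new entry.
validate_crawler_allowlist telegrambot facebookexternalhit facebookcatalog linkedinbot \
  slackbot discordbot twitterbot whatsapp skypeuripreview redditbot applebot \
  pinterestbot && echo "allowlist OK"
```

`pinterestbot` in the example is a placeholder for whatever entry is being proposed, not a recommendation to add it.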

Remove a scope-down (re-enforce a rule on the exempted scope)

Set the controlling variable to its disabling value (empty string for posthog_proxy_uri_prefix, empty list for social_crawler_user_agents) in the relevant env's cloudfront.tf. Plan + apply. The dynamic block disappears from the resource, and the managed rule resumes inspecting the previously-exempted requests.

Rotate the rate-based limit

Edit web_acl_rate_limit_per_5min on the relevant module.cloudfront_* block in infra/envs/{staging,prod}/cloudfront.tf. Default is 2000.

Related

  • infra/modules/cloudfront-waf/ — Terraform module
  • infra/envs/staging/cloudfront.tf — staging instantiation
  • infra/envs/prod/cloudfront.tf — prod instantiation
  • next.config.ts — /ingest/* rewrite to PostHog
  • Change Log — every WAF change is recorded here
  • Issue #47 — AWS migration parent issue (where the WAF false-positive observations originated)

AskFlorence Internal Documentation. Not for public distribution.
