AWS Setup Runbook

Purpose: the day-to-day operator's guide to the AskFlorence AWS environment. How we authenticate, how we deploy, how we read logs, how we roll things back, and which AWS surface area belongs to which environment. Paired with the deeper cloudtrail-setup.md, guardduty-setup.md, and phase-specific network docs under this directory.


Account topology

| Account name | Account ID | Purpose | SSO permission sets available |
| --- | --- | --- | --- |
| askflorence-management | 778477254880 | AWS Organizations root; billing; Terraform state backend; the pre-existing askflorence-data S3 bucket | AdminAccess, PowerUserAccess, BillingReadOnly, SecurityAudit |
| askflorence-prod | 039624954211 | Production workloads (ECS, ALB, CloudFront, WAF, prod Secrets Manager). Provisioned at Phase 8; Phase 1/2/2.5 baseline only as of Phase 5 | AdminAccess (break-glass), PowerUserAccess, SecurityAudit |
| askflorence-staging | 549136075525 | Pre-prod validation; Phase 5 staging stack reachable at stage.askflorence.health | AdminAccess, PowerUserAccess, SecurityAudit |
| askflorence-log-archive | 754660694122 | Immutable audit trail destination (CloudTrail, Config, WAF, VPC Flow Logs). Object-lock COMPLIANCE 7-year retention on tfstate + CloudTrail buckets | SecurityAudit, PowerUserAccess (limited; no workload resources live here) |

Details + provenance: account-inventory.md.


Access + profiles

Day-to-day human access is via AWS IAM Identity Center (SSO). Long-lived IAM users, access keys, and static credentials are not used anywhere. CI/CD uses GitHub Actions OIDC federation to assume short-lived STS roles, one per environment.

Local ~/.aws/config profiles

These are the canonical profile names referenced throughout the runbooks:

ini
[profile askflorence]
sso_session     = askflorence
sso_account_id  = 778477254880
sso_role_name   = AdministratorAccess
region          = us-east-1

[profile askflorence-prod]
sso_session     = askflorence
sso_account_id  = 039624954211
sso_role_name   = AdministratorAccess
region          = us-east-1

[profile askflorence-staging]
sso_session     = askflorence
sso_account_id  = 549136075525
sso_role_name   = AdministratorAccess
region          = us-east-1

[profile askflorence-log-archive]
sso_session     = askflorence
sso_account_id  = 754660694122
sso_role_name   = AdministratorAccess
region          = us-east-1

[sso-session askflorence]
sso_start_url   = https://askflorence.awsapps.com/start
sso_region      = us-east-1
sso_registration_scopes = sso:account:access
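
A mistyped sso_account_id silently points a profile at the wrong account, so it can be worth checking the local config against the account topology table. The following is a minimal sketch, not part of the repo: the function name check_profiles is hypothetical, and the expected IDs are copied from the table above.

```python
import configparser

# Account IDs copied from the account topology table in this runbook.
EXPECTED_ACCOUNTS = {
    "askflorence": "778477254880",
    "askflorence-prod": "039624954211",
    "askflorence-staging": "549136075525",
    "askflorence-log-archive": "754660694122",
}


def check_profiles(config_text: str) -> list[str]:
    """Return human-readable problems; an empty list means every profile matches."""
    # interpolation=None so a literal '%' in any value is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    problems = []
    for name, account_id in EXPECTED_ACCOUNTS.items():
        section = f"profile {name}"
        if not parser.has_section(section):
            problems.append(f"missing [{section}]")
            continue
        actual = parser.get(section, "sso_account_id", fallback=None)
        if actual != account_id:
            problems.append(
                f"[{section}] sso_account_id is {actual}, expected {account_id}"
            )
    return problems
```

To run it against the real file, pass the contents of ~/.aws/config, for example `check_profiles(pathlib.Path.home().joinpath(".aws", "config").read_text())`.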

Authenticating

bash
aws sso login --profile askflorence
# Browser opens → approve → tokens cached for ~8h
aws sts get-caller-identity --profile askflorence-staging   # sanity check

Tokens expire. If you see Error loading SSO Token or credentials not found, re-run aws sso login. The rest of this doc assumes you're logged in and have set AWS_PROFILE=askflorence-<env> before each command.
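
Because every later command trusts AWS_PROFILE, a script can make the sanity check above explicit before it touches anything. This sketch wraps aws sts get-caller-identity and compares the returned Account field with the ID you expect. The function name and the injectable run parameter are illustrative assumptions, added so the check can be exercised without AWS access.

```python
import json
import subprocess


def assert_profile_account(profile: str, expected_account: str, run=subprocess.run) -> str:
    """Return the account ID the profile resolves to; raise if it is not the expected one."""
    result = run(
        ["aws", "sts", "get-caller-identity", "--profile", profile, "--output", "json"],
        capture_output=True,
        text=True,
        check=True,  # an expired SSO token surfaces here as CalledProcessError
    )
    account = json.loads(result.stdout)["Account"]
    if account != expected_account:
        raise RuntimeError(
            f"{profile} resolved to account {account}, expected {expected_account}"
        )
    return account
```

For example, `assert_profile_account("askflorence-staging", "549136075525")` at the top of a staging script fails fast if the profile is misconfigured or the session has expired.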


Deploy a change to prod

Prod is on a manual workflow_dispatch trigger — no deploys happen without an explicit click. GitHub Team plan doesn't support required-reviewers on private-repo environments, so workflow_dispatch is the approval surrogate.

bash
# Browser:
#   https://github.com/askflorencehealth/ask-florence/actions/workflows/deploy-prod.yml
#   → Run workflow → leave ref as `main` → Run workflow

# CLI equivalent:
gh workflow run deploy-prod.yml --ref main

The workflow:

  1. OIDC-assumes arn:aws:iam::039624954211:role/GitHubActionsDeployRole
  2. Builds + pushes image to prod ECR (immutable tags, GHA cache backend — no :latest on prod)
  3. Renders a fresh task-def revision with the new :<sha> image
  4. Deploys the revision via the amazon-ecs-deploy-task-definition action with wait-for-service-stability (up to 15 min)
  5. On first deploy only: scales desired 0 → 2
  6. Smoke GET /api/health against origin.askflorence.health (direct ALB, bypassing CloudFront + WAF — avoids GitHub runner IP false-positives in WAF's AnonymousIpList/AmazonIpReputationList)
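
The smoke step at the end of the list can be reproduced locally when a deploy looks healthy in ECS but you want an independent check. This is a rough sketch rather than the workflow's actual implementation: it assumes the health endpoint answers HTTP 200 when healthy, and the retry count and delay are arbitrary illustrative values.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of a GET, or 0 if the request could not complete."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses arrive as HTTPError; the status code is still useful.
        return err.code
    except OSError:
        # DNS failures, refused connections, and timeouts all derive from OSError.
        return 0


def smoke(url: str, attempts: int = 5, delay: float = 3.0,
          fetch=fetch_status, sleep=time.sleep) -> bool:
    """Poll the health endpoint until it returns 200 or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if fetch(url) == 200:
            return True
        if attempt < attempts:
            sleep(delay)
    return False
```

Calling `smoke("https://origin.askflorence.health/api/health")` mirrors step 6 by hitting the ALB origin directly rather than going through CloudFront + WAF.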

Watch live:

bash
gh run watch $(gh run list --workflow=deploy-prod.yml --limit 1 --json databaseId -q '.[0].databaseId')

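Rollback: re-run the workflow against the previous commit from the CLI: gh workflow run deploy-prod.yml --ref <sha>. The workflow checks out that ref, builds its image, and deploys. Note that gh documents --ref as a branch or tag name, so if a bare SHA is rejected, tag the commit and pass the tag instead.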

Prod-only caveats:

  • Immutable tags: every tag pushed to prod ECR is permanent, and each deploy pins its own :<sha>. Rollback is a separate build from the same commit, not a tag move.
  • :latest is NOT pushed on prod. If anything references :latest for prod, it's looking at a stale image. Always reference the SHA.
  • Deletion protection ON: the prod ALB cannot be deleted by terraform destroy without first setting enable_deletion_protection = false and re-applying. This is intentional.
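
The "always reference the SHA" rule above is easy to enforce in tooling. The sketch below builds a prod ECR image reference and refuses anything that is not a commit SHA. The repository name is a hypothetical parameter (this runbook does not name the prod repository), and accepting 7 to 40 hex characters is an assumption about how the workflow abbreviates SHAs.

```python
import re

# Assumption: tags are lowercase git SHAs, abbreviated (7+) or full (40) length.
_SHA_RE = re.compile(r"[0-9a-f]{7,40}")


def prod_image_uri(repository: str, sha: str,
                   account_id: str = "039624954211",
                   region: str = "us-east-1") -> str:
    """Return the ECR image reference for a commit, rejecting mutable tags such as 'latest'."""
    if not _SHA_RE.fullmatch(sha):
        raise ValueError(f"refusing non-SHA tag {sha!r}; prod images are pinned by commit")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{sha}"
```

With a placeholder repository, `prod_image_uri("example-app", "0a1b2c3d")` yields a pinned reference, while `prod_image_uri("example-app", "latest")` raises ValueError.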

Before any invocation: confirm that prod-canary.askflorence.health (the canary hostname) and origin.askflorence.health both resolve correctly through Cloudflare. If DNS breaks, the workflow's smoke step will fail even with a healthy deploy.

Atlas gotcha carried over from Phase 7: if the prod Atlas IP allowlist gets tightened and an already-running ECS task holds stale DNS state, force a new deployment (aws ecs update-service --force-new-deployment). This does not apply during normal deploys; it only comes up after networking changes.

Deploy a change to staging

bash
# 1. Land the code change on main, then fast-forward staging to main.
git checkout main && git pull
# ... make changes, commit ...
git push origin main
git checkout staging
git merge --ff-only main
git push origin staging         # triggers .github/workflows/deploy-staging.yml

The deploy-staging.yml workflow:

  1. OIDC-assumes GitHubActionsDeployRole in staging
  2. Logs in to ECR
  3. Runs docker buildx build --push (amd64, with PostHog build args + DEPLOY_ENV=staging)
  4. Pulls the current task def
  5. Renders it with the new image via amazon-ecs-render-task-definition
  6. Registers + deploys via amazon-ecs-deploy-task-definition (wait-for-stability, 10-min cap)
  7. Smoke-tests GET https://stage.askflorence.health/api/health
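
To make the "render with new image" step concrete, here is a small sketch of what rendering amounts to: copy the current task definition and swap the image on one container. It illustrates the idea only and is not the amazon-ecs-render-task-definition action's code. The container name is passed in because this runbook does not state it outright.

```python
import copy


def render_image(task_def: dict, container_name: str, image: str) -> dict:
    """Return a copy of the task definition with one container's image replaced."""
    rendered = copy.deepcopy(task_def)  # leave the caller's dict untouched
    matches = [
        c for c in rendered.get("containerDefinitions", [])
        if c.get("name") == container_name
    ]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one container named {container_name!r}, found {len(matches)}"
        )
    matches[0]["image"] = image
    return rendered
```

Everything else in the definition (env vars, secrets, CPU and memory) carries over unchanged, which is why out-of-band edits to the task definition survive the next deploy.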

Watch a run:

bash
gh run watch --exit-status $(gh run list --workflow=deploy-staging.yml --limit 1 --json databaseId -q '.[0].databaseId')

The prod deploy flow mirrors this against askflorence-prod via deploy-prod.yml (landed in Phase 8). It is gated by a manual workflow_dispatch trigger rather than a protected-environment approval; see "Deploy a change to prod" above.


Reading application logs

bash
export AWS_PROFILE=askflorence-staging
aws logs tail /aws/ecs/askflorence-staging-app --region us-east-1 --since 5m --format short --follow

Per-container streams named app/app/<task-id>. Retention 14 days (staging). Prod retention will be 90 days hot + long-term archive to log-archive S3.
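
When several tasks are running, tailing the whole group interleaves their output. Given the app/app/<task-id> stream naming above, a single task can be followed with the --log-stream-name-prefix option of aws logs tail. The helper below only builds the argument list; its name and defaults are illustrative.

```python
LOG_GROUP = "/aws/ecs/askflorence-staging-app"


def tail_task_command(task_id: str, since: str = "5m", follow: bool = True) -> list[str]:
    """Build an `aws logs tail` command restricted to one task's log stream."""
    if not task_id or "/" in task_id:
        raise ValueError("task_id must be the bare ECS task ID, without slashes")
    cmd = [
        "aws", "logs", "tail", LOG_GROUP,
        "--region", "us-east-1",
        "--since", since,
        "--format", "short",
        "--log-stream-name-prefix", f"app/app/{task_id}",
    ]
    if follow:
        cmd.append("--follow")
    return cmd
```

Pass the result to subprocess.run with AWS_PROFILE=askflorence-staging set in the environment.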


Updating secrets

Secrets live in the workload account (staging/* and prod/* namespaces), CMK-encrypted (staging: alias/askflorence-staging-data; prod will be alias/askflorence-prod-data). Never pass secret values on the command line — always via a mode-600 temp file:

bash
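export AWS_PROFILE=askflorence-staging
# NEW_VALUE must already be set in the shell, e.g. via: read -rs NEW_VALUE
TMP=$(mktemp -t secret-XXXX)      # mktemp creates the file with mode 600
trap 'rm -f "$TMP"' EXIT          # remove the temp file when the shell exits
chmod 600 "$TMP"                  # re-assert the mode before anything is written
printf '%s' "$NEW_VALUE" > "$TMP"
aws secretsmanager put-secret-value \
  --secret-id staging/<secret-name> \
  --secret-string "file://$TMP" \
  --region us-east-1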

Secret values are injected into ECS task containers via the task execution role, which holds secretsmanager:GetSecretValue + kms:Decrypt on the specific ARNs. Secrets are only re-read when a new task starts, so for a changed secret to reach running tasks, force a new deployment:

bash
aws ecs update-service --cluster askflorence-staging --service askflorence-staging-app --force-new-deployment --region us-east-1

The end-to-end script for populating all staging Mongo URIs from .env.staging.local lives at scripts/aws/populate-staging-secrets.sh.
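
For scripted updates, the same temp-file discipline can be expressed in Python. This is a sketch under the assumption of a POSIX host, and it is not how populate-staging-secrets.sh is implemented: tempfile.mkstemp creates the file readable and writable only by the current user, and the context manager deletes it afterwards. The function names are hypothetical.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def secret_file(value: str):
    """Write a secret to an owner-only temp file and delete it on exit."""
    fd, path = tempfile.mkstemp(prefix="secret-")  # created with mode 0600
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(value)
        yield path
    finally:
        os.remove(path)


def put_secret_command(secret_id: str, path: str, region: str = "us-east-1") -> list[str]:
    """Build the put-secret-value call; the value itself never appears in argv."""
    return [
        "aws", "secretsmanager", "put-secret-value",
        "--secret-id", secret_id,
        "--secret-string", f"file://{path}",
        "--region", region,
    ]
```

Inside `with secret_file(value) as path:`, run `subprocess.run(put_secret_command("staging/<secret-name>", path), check=True)` with the real secret name substituted.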


Registering a task definition revision out of band

The Terraform ecs-service module sets lifecycle { ignore_changes = [container_definitions] }, which means terraform apply alone will not push env-var changes onto the running task. GitHub Actions CI/CD is the intended path. For one-off env changes between deploys, register a fresh revision manually:

bash
export AWS_PROFILE=askflorence-staging
aws ecs describe-task-definition \
  --task-definition askflorence-staging-app-task \
  --region us-east-1 \
  --query taskDefinition > /tmp/td.json

# Mutate /tmp/td.json (e.g. add/remove env vars), then:
python3 -c '
import json
# Drop the read-only fields that register-task-definition rejects.
with open("/tmp/td.json") as f:
    td = json.load(f)
for k in ("taskDefinitionArn","revision","status","requiresAttributes","compatibilities","registeredAt","registeredBy"):
    td.pop(k, None)
with open("/tmp/td-new.json","w") as f:
    json.dump(td, f)
'

aws ecs register-task-definition --cli-input-json file:///tmp/td-new.json --region us-east-1
aws ecs update-service --cluster askflorence-staging --service askflorence-staging-app \
  --task-definition askflorence-staging-app-task \
  --region us-east-1 --force-new-deployment

Next GH Actions deploy picks up the new revision as its base and layers the new image on top.
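
The "mutate /tmp/td.json" step is left open in the procedure above. One way to do it without hand-editing JSON is a small helper that sets or removes a single environment variable on a named container. This is a sketch only; the container and variable names in the usage note are placeholders.

```python
def set_env(task_def: dict, container_name: str, name: str, value) -> None:
    """Set one env var (value is a str) or remove it (value is None), in place."""
    for container in task_def["containerDefinitions"]:
        if container["name"] == container_name:
            # Drop any existing entry first so the variable never appears twice.
            env = [e for e in container.get("environment", []) if e["name"] != name]
            if value is not None:
                env.append({"name": name, "value": value})
            container["environment"] = env
            return
    raise KeyError(f"no container named {container_name!r} in task definition")
```

A typical use is to json.load the file, call `set_env(td, "app", "EXAMPLE_FLAG", "1")`, and json.dump it back before the field-stripping step runs.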


Rollback

  • Code revert — git revert <sha>, push to staging (or main + fast-forward staging); GH Actions then builds and deploys an image with the change reverted.
  • Image revert — point the ECS service at an older task def revision:
    bash
    aws ecs update-service --cluster askflorence-staging --service askflorence-staging-app \
      --task-definition askflorence-staging-app-task:<old-revision> --region us-east-1
  • Infra revert — terraform apply against a prior commit under infra/envs/staging/. State backend handles locking via askflorence-tfstate-locks DynamoDB.
  • DNS revert (if Phase 10 cutover is live and something breaks on AWS prod) — in Cloudflare, flip the askflorence.health CNAME back from the prod CloudFront distribution to the preserved Vercel deployment. TTL on that record is kept low (300s) for the first 72h after cutover for exactly this reason.
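
For the image revert above, the <old-revision> number has to come from somewhere. Assuming you have the ARNs returned by aws ecs list-task-definitions --family-prefix askflorence-staging-app-task, this sketch picks the newest revision older than the one currently deployed. The function names are illustrative.

```python
def revision_of(arn: str) -> int:
    """Extract the revision from an ARN ending in ...:task-definition/<family>:<revision>."""
    return int(arn.rsplit(":", 1)[1])


def previous_revision(arns: list[str], current: int) -> int:
    """Return the highest revision number strictly below the current one."""
    older = sorted(r for r in map(revision_of, arns) if r < current)
    if not older:
        raise LookupError(f"no revision older than {current} is available")
    return older[-1]
```

The result slots into the update-service call as askflorence-staging-app-task:<old-revision>. It is worth confirming with describe-task-definition that the chosen revision carries the image you expect.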

What lives where

| Concern | Source of truth |
| --- | --- |
| Infra definitions | infra/envs/<env>/ + infra/modules/ |
| Terraform state | S3 askflorence-tfstate-778477254880 (mgmt), DynamoDB locks askflorence-tfstate-locks |
| Secrets | AWS Secrets Manager, staging/* in 549136075525, prod/* in 039624954211 |
| CI/CD | .github/workflows/deploy-staging.yml (Phase 5) + deploy-prod.yml (Phase 8) |
| Organizations baseline | aws-organizations.md, cloudtrail-setup.md, guardduty-setup.md, security-hub-setup.md, config-setup.md |
| Networking | phase-4-staging-dns-records.md + per-env infra/envs/<env>/network.tf |

Further reading

  • aws-organizations.md — Org, SCPs, SSO permission sets, budgets.
  • cloudtrail-setup.md — org trail, S3 object lock, Insights.
  • guardduty-setup.md — detector config, feature plans.
  • security-hub-setup.md — NIST 800-53 Rev 5 standard + delegated admin.
  • config-setup.md — Config aggregator + recording scope.
  • account-inventory.md — accounts + root email + BAA + SSO state.
  • change-log.md — canonical timestamped record of every infra change.

This document will keep growing as phases land. Phase 6 adds the CloudFront + WAF front-door; Phase 7 adds Atlas VPC peering; Phase 8 mirrors this into prod; Phase 10 covers the Vercel → AWS cutover mechanics.

AskFlorence Internal Documentation. Not for public distribution.
