S3 data bucket migration runbook

Status: planned, not executed. Target window: Phase 11 (after the 48-hour Phase 10 bake).

Why this exists: the current askflorence-data bucket in the management account (778477254880) predates AWS Organizations and holds two unrelated payloads: PUF source CSVs for the ingest pipeline, and agent-uploaded templates from the runtime app. Splitting them into purpose-scoped, environment-scoped buckets gives each payload clean SOC 2 / EDE audit lines.


Target end-state architecture

Management account (778477254880)
└── askflorence-data-archive                  ← Immutable backup. Object Lock
                                                 COMPLIANCE mode, 7-year retention.
                                                 Replicated-to, never read by runtime.

Staging account (549136075525)
├── askflorence-staging-data                  ← Authoritative PUF source for review.
│   └── puf/<year>/                            Each year's new PUF validated here
│       ├── plan-attributes-puf.csv            BEFORE promotion to prod.
│       ├── benefits-and-cost-sharing-puf.csv
│       └── ...
└── askflorence-staging-agent-uploads         ← Staging-app agent template uploads.
    └── agent-survey-uploads/

Prod account (039624954211)
├── askflorence-prod-data                     ← Authoritative PUF source for runtime.
│   └── puf/<year>/                            Same layout as staging. Populated via
│                                              explicit promotion after staging validates.
└── askflorence-prod-agent-uploads            ← Prod-app agent template uploads.
    └── agent-survey-uploads/
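
The layout above follows a regular per-environment pattern. As a minimal sketch, the mapping could be captured as data so that scripts resolve bucket names rather than hard-coding them. The `BUCKETS` table and `bucket_for` helper are hypothetical names used for illustration only; the account IDs and bucket names are the ones shown in the diagram.

```python
# Hypothetical lookup table mirroring the target layout above.
# Bucket names and account IDs come from the diagram; the structure
# and helper are illustrative assumptions, not existing repo code.
BUCKETS = {
    "management": {
        "account": "778477254880",
        "archive": "askflorence-data-archive",
    },
    "staging": {
        "account": "549136075525",
        "data": "askflorence-staging-data",
        "agent-uploads": "askflorence-staging-agent-uploads",
    },
    "prod": {
        "account": "039624954211",
        "data": "askflorence-prod-data",
        "agent-uploads": "askflorence-prod-agent-uploads",
    },
}


def bucket_for(env: str, purpose: str) -> str:
    """Return the bucket name for an environment and purpose.

    Raises KeyError when the environment has no bucket for that purpose.
    """
    if purpose == "account":
        raise KeyError("'account' is metadata, not a bucket purpose")
    return BUCKETS[env][purpose]
```

A lookup that fails loudly (for example `bucket_for("management", "data")`) makes it harder for a script to fall back silently to the legacy bucket.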

Why this shape

| Concern | Current (mgmt bucket) | Target |
| --- | --- | --- |
| Blast radius | mgmt compromise reaches prod agent uploads + PUF | Each env owns its own bucket; compromise stays local |
| Year-over-year PUF release cycle | New PUF ingested directly to prod with no review env | Staging ingests first → audit → promote → prod |
| Backup/retention | No separate archive; primary == backup | Immutable archive in mgmt with Object Lock; primary is mutable |
| Auditor framing | "Why is production PHI in the management account?" | Clean per-environment isolation |
| Lifecycle independence | One bucket, conflicting policies (PUF-indefinite vs agent-7yr) | One policy per bucket, per purpose |
| Cross-account IAM complexity | Every write needs mgmt bucket policy + prod task role | Same-account writes, simpler IAM |
| Cost allocation | Prod app costs billed to mgmt account | Prod S3 costs stay in prod account |

Why staging-first for PUF

CMS releases updated PUF data each year in Q3/Q4 for the next plan year's open enrollment. The current workflow ingests that data directly into the prod Atlas cluster via scripts/db/ingest-puf-augment.js, followed by a sanity check using the tier 1-5 audits. No review environment sits between "CMS drops new data" and "prod serves it to real consumers."

The staging-first pattern adds a review gate:

  1. Upload the new CMS PUF CSVs to the staging bucket at askflorence-staging-data/puf/<year>/
  2. Run ingest-puf-augment.js against staging Atlas with the new year; the data is stored as year: 2027 (or whichever year applies)
  3. Run the tier 1-5 audit harness against staging to catch rate drift, shape changes, missing fields, and new carrier formats
  4. Review manually at stage.askflorence.health with the new year's data
  5. Only once staging is clean: aws s3 sync the staging bucket to the prod bucket, run ingest against prod Atlas, run the final audit
  6. Confirm the new year has reached the archive bucket for the permanent historical record. Once Phase 11.5 replication is in place this copy happens automatically, and the archive's replication-only bucket policy would reject a manual aws s3 sync

This mirrors the code promotion flow (main → staging → prod) for data. Any breaking change in CMS's PUF format gets caught on staging, not in production.
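
The essential property of the review gate is that a failure at any step prevents every later step from running. A minimal sketch of that control flow follows; `run_gated` and the step names are hypothetical, and the lambdas stand in for real checks that would invoke the ingest and audit scripts.

```python
from typing import Callable, List, Tuple

# A step is a (name, check) pair; check returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_gated(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first one that reports failure.

    Returns the names of the steps that completed successfully.
    """
    completed: List[str] = []
    for name, check in steps:
        if not check():
            break  # later steps (e.g. prod promotion) never run
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Stand-in checks only; a simulated audit failure blocks promotion.
    demo = [
        ("upload-to-staging", lambda: True),
        ("ingest-staging", lambda: True),
        ("audit-staging", lambda: False),
        ("promote-to-prod", lambda: True),
    ]
    print(run_gated(demo))  # ['upload-to-staging', 'ingest-staging']
```

Returning the completed step names, rather than a bare boolean, leaves a record of how far a run got that can be pasted into the session log.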


Migration plan

Phase 11 — Agent-uploads bucket split (low-risk, no data migration) ​

Why first: the agent-uploads prefix in the current mgmt bucket has exactly one real object today (a smoke-test PDF from the Phase 10 cutover validation). Vercel-era uploads, if any exist, stay where they are as history. This step migrates no user data.

Prod steps:

  1. New file infra/envs/prod/s3-agent-uploads.tf:

    • aws_s3_bucket prod_agent_uploads with name askflorence-prod-agent-uploads
    • Bucket encryption: SSE-KMS using module.kms_prod.key_arn
    • Versioning: enabled
    • Public access block: all 4 flags true
    • Bucket policy: DenyNonSSLRequests (mirror of mgmt bucket pattern)
    • Lifecycle: agent-survey-uploads/ → 7-year retention then delete (HIPAA minimum with buffer)
  2. Update infra/envs/prod/ecs.tf:

    • Change S3_AGENT_SURVEY_BUCKET=askflorence-data to S3_AGENT_SURVEY_BUCKET=askflorence-prod-agent-uploads
    • Remove the S3AgentSurveyUploadsWrite inline policy's cross-account resource ARN; replace with same-account arn:aws:s3:::askflorence-prod-agent-uploads/agent-survey-uploads/*
  3. GuardDuty Malware Protection for S3: add the new bucket to the protected-resources list.

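  4. aws s3 sync s3://askflorence-data/agent-survey-uploads/ s3://askflorence-prod-agent-uploads/agent-survey-uploads/ to copy any accumulated objects (expected: 0 user data, maybe smoke tests). The destination bucket must already exist, so run this only after the bucket from step 1 has been applied.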

  5. terraform apply + register new ECS task def revision + force-new-deployment.

  6. Smoke test: POST /api/agents/discovery/upload from prod with a real PDF → verify object lands in askflorence-prod-agent-uploads, NOT in askflorence-data.

  7. Update infra/envs/management/s3-askflorence-data.tf: remove the AllowProdEcsTaskRolePutAgentSurveyUploads statement. Prod task role loses the cross-account grant. Mgmt bucket's agent-survey-uploads/ prefix becomes read-only history.
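
For reference, the two policy documents that steps 1 and 2 describe might look roughly like the following. This is a sketch only: the real definitions live in Terraform, the helper names are hypothetical, and the `s3:PutObject` action is an assumption inferred from the policy name S3AgentSurveyUploadsWrite rather than copied from the repo.

```python
import json


def deny_non_ssl_policy(bucket: str) -> dict:
    """Bucket policy that rejects any request not made over TLS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyNonSSLRequests",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


def task_role_write_policy(bucket: str, prefix: str) -> dict:
    """Same-account inline policy letting the ECS task role write one prefix.

    The action list is an assumption; adjust to what the app actually calls.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "S3AgentSurveyUploadsWrite",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }


if __name__ == "__main__":
    bucket = "askflorence-prod-agent-uploads"
    print(json.dumps(deny_non_ssl_policy(bucket), indent=2))
    print(json.dumps(task_role_write_policy(bucket, "agent-survey-uploads/"), indent=2))
```

Because the bucket uses SSE-KMS, the task role would likely also need `kms:GenerateDataKey` on the prod key for uploads to succeed; that grant is omitted here.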

Staging steps (mirror of prod but scoped to staging):

  1. New file infra/envs/staging/s3-agent-uploads.tf with askflorence-staging-agent-uploads.
  2. Update infra/envs/staging/ecs.tf to point S3_AGENT_SURVEY_BUCKET there.
  3. GuardDuty Malware Protection added.
  4. Smoke test via stage.askflorence.health.

Rollback: aws s3 sync in reverse (new bucket → mgmt bucket) + revert env var + revert task role policy. No user impact during rollback because the active env var is the primary switch.


Phase 11.5 — Mgmt immutable archive bucket ​

Purpose: receive replicated copies from both staging and prod data buckets, retain immutably for 7 years. No runtime process reads from this bucket.

Steps:

  1. New file infra/envs/management/s3-data-archive.tf:

    • aws_s3_bucket data_archive with name askflorence-data-archive
    • Object Lock enabled at create-time (the simplest path; AWS now also allows enabling it on an existing versioned bucket, but creating the bucket locked avoids a second step)
    • Default retention: 7 years COMPLIANCE mode
    • Versioning: enabled (required for Object Lock)
    • SSE-KMS with mgmt CMK alias/askflorence-data
    • Public access block: all 4 flags true
    • Bucket policy: DenyNonSSLRequests + DenyDeleteObject + only-allow-replication-writes
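    • Destination bucket policy accepting replication writes from the staging + prod replication roles
  2. Cross-account replication IAM:

    • Replication role (e.g. askflorence-data-replicator) in each source account (staging, prod), trusted by s3.amazonaws.com. The role named in a replication configuration has to live in the source bucket's account, not in mgmt
    • Role policy: read object versions from the source bucket (s3:GetObjectVersionForReplication and related) + s3:ReplicateObject / s3:ReplicateTags on askflorence-data-archive
    • Destination-account grants: the askflorence-data-archive bucket policy, and the mgmt CMK key policy since the archive uses SSE-KMS, allow the staging + prod replicator roles to write replicas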
  3. Lifecycle on archive: Standard → Standard-IA after 30 days → Glacier Deep Archive after 90 days. Deep Archive is ~$0.00099/GB-month; effectively free for PUF volumes.
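
S3 replication uses its own set of IAM actions, so it may help to see the replicator role's policy spelled out. The sketch below follows the action names in AWS's replication documentation; the helper names are hypothetical, KMS permissions for SSE-KMS objects are omitted, and the final action set the team settles on may differ.

```python
def replication_trust_policy() -> dict:
    """Trust policy letting the S3 service assume the replication role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def replication_role_policy(source_bucket: str, archive_bucket: str) -> dict:
    """Permissions for replicating object versions from source to archive."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadSourceBucketConfig",
                "Effect": "Allow",
                "Action": ["s3:GetReplicationConfiguration", "s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{source_bucket}",
            },
            {
                "Sid": "ReadSourceObjectVersions",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObjectVersionForReplication",
                    "s3:GetObjectVersionAcl",
                    "s3:GetObjectVersionTagging",
                ],
                "Resource": f"arn:aws:s3:::{source_bucket}/*",
            },
            {
                "Sid": "WriteReplicasToArchive",
                "Effect": "Allow",
                # Delete replication is deliberately left out: the archive
                # is meant to be append-only.
                "Action": ["s3:ReplicateObject", "s3:ReplicateTags"],
                "Resource": f"arn:aws:s3:::{archive_bucket}/*",
            },
        ],
    }
```

Calling `replication_role_policy("askflorence-staging-data", "askflorence-data-archive")` and again with the prod bucket would yield one policy per source bucket.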

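Rollback: destroy in reverse order, with one caveat. Under Object Lock COMPLIANCE mode, no one (including the root user) can delete an object version or shorten its retention until the 7-year period expires, so anything written during testing stays, and the bucket cannot be emptied or destroyed while locked versions remain. Test this step in a disposable bucket with a short retention period before committing to askflorence-data-archive.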
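
To make the retention and tiering settings concrete, here is a sketch of the two configuration documents in the shape the S3 API accepts, plus two small helpers. The helper names and the rule ID are hypothetical; the 7-year COMPLIANCE retention, the 30/90-day transitions, and the ~$0.00099/GB-month price come from this runbook. The date helper is an approximation, since S3 computes the actual retain-until timestamp.

```python
from datetime import date

# Default retention applied to every new object version in the archive.
OBJECT_LOCK_CONFIG = {
    "ObjectLockEnabled": "Enabled",
    "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Years": 7}},
}

# Standard -> Standard-IA at 30 days -> Glacier Deep Archive at 90 days.
ARCHIVE_LIFECYCLE = {
    "Rules": [
        {
            "ID": "archive-tiering",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # apply to the whole bucket
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }
    ]
}


def earliest_delete_date(written: date, years: int = 7) -> date:
    """Approximate first date a version written on `written` can be deleted."""
    try:
        return written.replace(year=written.year + years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year.
        return written.replace(year=written.year + years, day=28)


def deep_archive_monthly_cost(gigabytes: float, price_per_gb: float = 0.00099) -> float:
    """Monthly storage cost in USD at the Deep Archive rate quoted above."""
    return gigabytes * price_per_gb
```

At the ~200 MB total mentioned in the effort estimate, `deep_archive_monthly_cost(0.2)` comes to roughly $0.0002 per month, which is why the cost is treated as negligible.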


Phase 11.75 — Staging PUF data bucket + staging-first ingest validation ​

Purpose: establish staging as the PUF review environment.

Steps:

  1. New file infra/envs/staging/s3-puf-data.tf:

    • aws_s3_bucket staging_data with name askflorence-staging-data
    • SSE-KMS staging CMK
    • Versioning enabled
    • Public access block all flags true
    • Replication to askflorence-data-archive in mgmt account (for historical backup)
    • Lifecycle: same as prod bucket (Standard → IA after 90d)
  2. aws s3 sync s3://askflorence-data/ s3://askflorence-staging-data/ --exclude 'agent-survey-uploads/*' — copies all PUF years to staging.

  3. Update scripts/db/ingest-*.js env var handling:

    • Current scripts read from askflorence-data hard-coded or via env
    • Add S3_PUF_SOURCE_BUCKET env with staging bucket default when running locally against staging Mongo
    • Document the env var contract in docs/infrastructure/mongodb-setup.md
  4. Ingest sanity check (critical step — do NOT skip):

    • MONGODB_WRITE_URI=<staging> S3_PUF_SOURCE_BUCKET=askflorence-staging-data node scripts/db/ingest-puf-augment.js --dry-run --year 2026
    • Confirm the dry-run output matches the existing prod Atlas state for year 2026 — no drift introduced by the bucket change
    • Run the same command in --apply mode against a throwaway staging collection; verify that the resulting documents match, field for field, what is already in prod Atlas
  5. If step 4 passes cleanly, staging data bucket is the authoritative review environment going forward.
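
The env var contract from step 3 can be summarized in a few lines. The real ingest scripts are Node.js; the Python below is only a language-neutral sketch of the intended behavior, with hypothetical helper names. The `--dry-run`, `--apply`, and `--year` flags are the ones named in step 4.

```python
import os
from typing import List, Mapping, Optional

ENV_VAR = "S3_PUF_SOURCE_BUCKET"
STAGING_BUCKET = "askflorence-staging-data"  # default for local staging runs


def resolve_puf_bucket(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the PUF source bucket: explicit env var wins, else staging."""
    env = os.environ if env is None else env
    value = env.get(ENV_VAR, "").strip()
    return value or STAGING_BUCKET


def ingest_argv(year: int, apply: bool = False) -> List[str]:
    """Build the ingest command; dry-run unless apply is requested."""
    mode = "--apply" if apply else "--dry-run"
    return ["node", "scripts/db/ingest-puf-augment.js", mode, "--year", str(year)]
```

Under this contract, the rollback described for this phase amounts to setting `S3_PUF_SOURCE_BUCKET=askflorence-data`, with no code change.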

Rollback: if the ingest scripts break, revert S3_PUF_SOURCE_BUCKET to point at the mgmt bucket (unchanged at this point). Scripts resume working. The staging data bucket remains as a synced copy, not the authoritative source.


Phase 12 — Prod PUF data bucket + promotion flow ​

Steps:

  1. New file infra/envs/prod/s3-puf-data.tf:

    • aws_s3_bucket prod_data with name askflorence-prod-data
    • SSE-KMS prod CMK
    • Versioning enabled
    • Public access block all flags true
    • Replication to askflorence-data-archive in mgmt account
    • Lifecycle: Standard → IA after 90d
  2. Initial population: aws s3 sync s3://askflorence-staging-data/ s3://askflorence-prod-data/ (staging → prod for the first time)

  3. Update ingest scripts' prod-pointing env:

    • S3_PUF_SOURCE_BUCKET=askflorence-prod-data in any prod-run context
    • This is a scripts-only change; the serving app does not read S3 for PUF data (it reads Atlas)
  4. Re-run full audit tier 1-5 against prod Atlas after the first ingest from the new bucket — confirm no regression in serving data.

  5. Document the PUF promotion workflow at docs/runbooks/puf-year-promotion.md (new file):

    New PUF year arrives from CMS:
    1. Upload to staging bucket: aws s3 cp plan-attributes-puf.csv s3://askflorence-staging-data/puf/<year>/
    2. Run ingest against staging Atlas: S3_PUF_SOURCE_BUCKET=askflorence-staging-data ... --year <year>
    3. Run audit tier 1-5 against staging: scripts/audit/*.js with staging URI
    4. Manual review at stage.askflorence.health
    5. aws s3 sync s3://askflorence-staging-data/puf/<year>/ s3://askflorence-prod-data/puf/<year>/
    6. Run ingest against prod Atlas: S3_PUF_SOURCE_BUCKET=askflorence-prod-data ... --year <year>
    7. Run audit tier 1-5 against prod: scripts/audit/*.js with prod URI
    8. Replication to archive bucket happens automatically (both staging + prod replicate to mgmt archive)
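
Step 5 of the workflow above is the only command that moves data into prod, so it is worth generating rather than typing. A minimal sketch, assuming hypothetical helper names and an arbitrary sanity range on the year:

```python
from typing import List

STAGING_BUCKET = "askflorence-staging-data"
PROD_BUCKET = "askflorence-prod-data"


def puf_uri(bucket: str, year: int) -> str:
    """Return the s3:// URI of one plan year's PUF prefix."""
    # The bounds are an illustrative guard against typos, not a CMS rule.
    if not 2000 <= year <= 2100:
        raise ValueError(f"implausible plan year: {year}")
    return f"s3://{bucket}/puf/{year}/"


def promotion_sync_argv(year: int) -> List[str]:
    """Build the staging-to-prod sync command for a single plan year."""
    return [
        "aws", "s3", "sync",
        puf_uri(STAGING_BUCKET, year),
        puf_uri(PROD_BUCKET, year),
    ]
```

Scoping the sync to a single `puf/<year>/` prefix keeps a promotion from touching earlier years that are already serving in prod.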

Phase 12.5 — Deprecate mgmt askflorence-data bucket ​

Steps:

  1. Verify all runtime flows have moved off askflorence-data:

    • Prod ECS task def env: S3_AGENT_SURVEY_BUCKET points at prod bucket
    • Staging ECS task def env: points at staging bucket
    • Prod + staging ingest scripts point at their respective new buckets
    • No code path reads from askflorence-data directly
  2. Make the bucket read-only:

    • Replace the bucket policy with three parts only: DenyNonSSLRequests, an explicit Deny on all write actions, and an Allow for Get/List so the history can still be browsed
    • Revoke any IAM user creds that still have write access (Vercel-era IAM user; see Phase 11 Resend retirement + related static-creds retirement)
  3. Leave the existing objects in place as the pre-migration historical record. Do NOT delete.

  4. Optionally, add a lifecycle rule that transitions everything in the bucket to Glacier Deep Archive after 90 days, with no expiration (keep forever).
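
The read-only posture from step 2 might be expressed as follows. This is a sketch under assumptions: the list of write actions is illustrative rather than exhaustive, the Allow statement for Get/List is left out because the intended principals are not specified here, and `explicitly_denied` is a naive exact-match check, not an IAM policy evaluator.

```python
WRITE_ACTIONS = [
    "s3:PutObject",
    "s3:PutObjectAcl",
    "s3:DeleteObject",
    "s3:DeleteObjectVersion",
]


def read_only_policy(bucket: str) -> dict:
    """Bucket policy that keeps TLS enforcement and blocks object writes."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyNonSSLRequests",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                "Sid": "DenyObjectWrites",
                "Effect": "Deny",
                "Principal": "*",
                "Action": list(WRITE_ACTIONS),
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


def explicitly_denied(policy: dict, action: str) -> bool:
    """True if an unconditional Deny statement lists this exact action."""
    for stmt in policy["Statement"]:
        if stmt["Effect"] != "Deny" or "Condition" in stmt:
            continue
        actions = stmt["Action"]
        if isinstance(actions, str):
            actions = [actions]
        if action in actions:
            return True
    return False
```

Because the Deny applies to every principal, re-granting write access later means editing this policy, which matches the one-edit recovery path described under the rollback philosophy.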


Documentation trail (evidence for auditors)

Every step of the migration generates evidence. Log locations:

| Evidence type | Where it lives |
| --- | --- |
| Timestamped change record | docs/infrastructure/change-log.md (one entry per phase step) |
| Session-level narrative | docs/session-log/<date>-s3-data-migration-phase-<n>.md (the chronological "what happened") |
| Terraform state diffs | infra/envs/<env>/terraform.tfstate.backup (auto) + git log on infra/ |
| Ingest sanity check output | scripts/audit/audit-tier-*-results.json snapshots before + after the migration |
| Data-level confirmation | scripts/audit/audit-parity-check.js output at each phase transition |
| CloudTrail | All bucket creation, policy changes, replication config changes captured in org trail in log-archive account (Phase 2 setup) |
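
The contents of audit-parity-check.js are not shown in this runbook. As a rough illustration of what a data-level parity confirmation could compute, the sketch below derives an order-independent digest of a collection's documents; two environments hold the same data when their digests match. The function name and the choice to ignore `_id` are assumptions, not a description of the existing script.

```python
import hashlib
import json
from typing import Any, Iterable, Mapping, Sequence


def collection_digest(
    docs: Iterable[Mapping[str, Any]],
    ignore: Sequence[str] = ("_id",),
) -> str:
    """Return a SHA-256 digest that is stable across document order.

    Top-level fields named in `ignore` are dropped first, since generated
    identifiers normally differ between environments.
    """
    per_doc = []
    for doc in docs:
        cleaned = {k: v for k, v in doc.items() if k not in ignore}
        # Canonical JSON: sorted keys, no whitespace, str() for odd types.
        canonical = json.dumps(
            cleaned, sort_keys=True, separators=(",", ":"), default=str
        )
        per_doc.append(hashlib.sha256(canonical.encode("utf-8")).hexdigest())
    combined = "\n".join(sorted(per_doc))
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()
```

Recording the digest before and after each phase transition gives a single short value to paste into the change log as evidence.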

Rollback philosophy

At every phase, the previous phase's state is preserved. No step makes the PRIOR state unreachable. This means:

  • Phase 11 agent-uploads split: mgmt bucket's agent-survey-uploads/ prefix stays in place; flipping S3_AGENT_SURVEY_BUCKET env var back restores the old behavior
  • Phase 11.5 archive bucket: writes are one-directional (replication INTO archive); no workload depends on reading FROM archive yet
  • Phase 11.75 staging data bucket: scripts fall back to mgmt bucket via env-var change
  • Phase 12 prod data bucket: scripts fall back to mgmt bucket via env-var change
  • Phase 12.5 mgmt deprecation: read-only posture, not destruction; re-granting write access is one bucket policy edit if needed

The mgmt askflorence-data bucket is never deleted during this migration. It becomes read-only cold storage but remains intact. This is the recovery floor.


Effort estimate

| Phase | Effort | Risk | Data migration |
| --- | --- | --- | --- |
| 11: Agent-uploads split | ~1 hour | Low | ~0 objects (fresh prefix) |
| 11.5: Archive bucket | ~30 min | Low (isolated, no downstream yet) | None |
| 11.75: Staging data bucket + ingest validation | ~2 hours | Medium (ingest scripts touch prod DB if not careful) | ~100 MB PUF copy to staging |
| 12: Prod data bucket + promotion flow | ~2 hours | Medium (same as 11.75 on prod side) | ~100 MB staging-to-prod sync |
| 12.5: Mgmt bucket deprecation | ~30 min | Very low | None |
| Total | ~6 hours focused | n/a | ~200 MB total network transfer |

All five phases can happen in a single focused session, or spread across 2-3 sessions. All are post-48h-bake, and all are compatible with the existing Phase 11/12 hardening + compliance closeout work.


Related work

  • Session log 2026-04-23 — Phase 10 cutover — context on why this migration surfaced
  • aws-setup runbook — account topology + general AWS operations
  • infra/envs/management/s3-askflorence-data.tf — current state of the mgmt bucket policy; gets reduced in Phase 12.5
  • infra/envs/prod/ecs.tf — S3AgentSurveyUploadsWrite inline policy + S3_AGENT_SURVEY_BUCKET env var that Phase 11 updates
  • GuardDuty setup — Malware Protection for S3 config; will be extended to the new buckets in Phase 11

AskFlorence Internal Documentation. Not for public distribution.
