Skip to content
AskFlorence
Main Navigation ArchitectureFlorence AIAgentsMembersAgent PlatformValidationInfrastructure

Appearance

Sidebar Navigation

Overview

Home

Glossary

System Architecture

Consumer & Agent Flow

Florence AI

Overview

Principles

Runtime

Tool surface

Adding a tool

Tool registry

Knowledge: SBC scenarios & CSR

Voice

Evals & observability

Provider risk & portability

Outage playbook

Roadmap

Build plan

Agents

Overview

Workflows & pain points

Members

Overview

Medicaid coverage gap

Carriers

Overview

Marketplaces

Overview

Agency

Overview

Regulations

Overview

Agent Platform

Overview

Auth Architecture

MongoDB Permissioning

Compliance Model

Data Models

Data Sources

Overview

CMS Marketplace API

CMS dependency map

PUF Data

State Subsidies

SBE Ingestion Playbook

SBE State Watchouts + Decisions

CA Phase C/D Playbook

NY Phase C/D Playbook

Validation

Overview

Methodology

APTC Formula

California 2026

New York 2026

CAPS Formula

Scenario Results

Infrastructure

Account Inventory

AWS Setup Runbook

AWS Organizations

CloudTrail

GuardDuty

Security Hub

Config

CloudFront + WAFv2

Data sources & ingest

Phase 4 DNS

Change Log

Vulnerability Management

MongoDB Setup

Access Control

Data Classification

Documentation Hosting

Post-deploy Smoke

Development

Preflight (local CI mirror)

Testing strategy

Compliance

Overview (auditor entry point)

SOC 2 Control Mapping

HIPAA Control Mapping

CMS EDE Appendix A Mapping

Risk Assessment

Encryption Policy

Data Retention Policy

Privacy Impact Assessment

Consent Capture & Versioning

Incident Response Plan

Access Control Policy

Marketing vs. Portal Analytics

Vendor / Subprocessor Register

Dependency Vulnerability Policy

BAA / Compliance Evidence

Compliance-Automation Integration

Compliance-Automation Vendor Evaluation

Penetration Test Reports

Architecture

Portal entry handoff

Mobile app strategy

Deferred architecture decisions

Session cookie architecture

Share flows

Decisions (ADRs)

Index

0001 — Atlas project isolation

0002 — Append-only audit log

0003 — Narrow-scoped Mongo users

0004 — Cross-cluster Atlas PrivateLink

0005 — Delayed-job architecture

0006 — Mongo user simplification

0007 — Terraform owns ECS task def

0008 — E2E testing strategy

0009 — Self-hosted analytics + observability (superseded)

0010 — PostHog HIPAA Cloud (supersedes 0009)

Runbooks

Security Incident Response

Break-Glass Root Login

Onboard Team Member

Offboard Team Member

Atlas user provisioning

Deploy via Terraform (ENG-277)

Rollback via Terraform (ENG-277)

S3 data bucket migration (planned Phase 11)

Access Reviews

2026-Q2 Review

Session log

Index

2026-04-23 — Phase 10 DNS cutover

2026-04-22 — Phase 8 prod AWS mirror

2026-04-22 — Phase 7 Atlas VPC peering

2026-04-22 — Phase 6 CloudFront + WAF

2026-04-21 — Phase 5 staging go-live

2026-04-17 — Atlas staging

Briefs

Index

Member portal plan (ENG-187)

2026-04-16/17 handoff

2026-04-17 Atlas handoff

System briefing (2026-04-17)

Creative AdBundance proposal brief

Creative AdBundance analytics brief

ElevenLabs RN integration research

Policies

Overview

On this page

Tier 0.5 federal ZIP completeness audit — fresh-session entry brief ​

STATUS: SESSION COMPLETE 2026-05-01. Outcome captured below; full reports linked.

Worktree: /Users/tahaabbasi/Developer/ask-florence-tier-0-5-federal Branch: tier-0-5-federal-completeness-audit Issue (execution tracker): #80 (final close-out comment here) Parent context issue: #79 Reference plan (read for context, do NOT execute verbatim): /Users/tahaabbasi/.claude/plans/hey-so-okay-that-delightful-eich.md Session plan (executed): /Users/tahaabbasi/.claude/plans/users-tahaabbasi-developer-ask-florence-fizzy-gray.md

Session outcome (2026-05-01) ​

4,364 docs landed on prod across 4 user-greenlighted batches (3 Tier 0.5 + 1 Tier 1 tuple-completeness). All under verified mongodump backups + per-batch greenlights per Constraints 1+2.

Final auditResult
Tier 0.5 re-run0 gap zips remaining (was 4,347)
Tier 1 (federal zip-county)22,302/22,302 = 100.00% exact match
Tier 1.5 (SBE zip-county)13,055/13,055 = 100.00% exact match
Calculator baseline diffZERO DIFFS on all 12 scenarios
85001 + 50613 user-reported gapsBoth resolve correctly on prod

Where to read the full story:

  • docs/validation/tier-0-5-federal-uspscompleteness.md — canonical Tier 0.5 audit report
  • docs/validation/audit-operations-runbook.md — operational lessons captured for any future audit
  • docs/infrastructure/change-log.md — top two entries cover this session

Original entry brief follows below — preserved verbatim as the historical handoff that this session was given.


What you're solving ​

User report 2026-05-01: co-founder entered ZIP 85001 (downtown Phoenix) on prod calculator → got 404 / "Oops". CMS says 85001 is valid AZ/04013/Maricopa. Our zip_county collection has zero docs for it.

Root cause: the original 2026-04-13 federal-30 ingest (commit 330871e) AND the recent Tier 0 federal completeness audit (commit 2b24f2c) both used U.S. Census 2020 ZCTA-County as the universe. Census ZCTA only catalogs ZIPs with significant residential population. USPS recognizes a broader class — PO-Box-only ZIPs in business districts, single-building ZIPs (universities, military bases, large corporate campuses), industrial-zone ZIPs. CMS Marketplace API (which uses USPS data) recognizes them; Census doesn't. Tier 0 was structurally blind to them.

The single-doc fix for 85001 is staged in commit 843bdf7 (scripts/db/fix-federal-zip-gaps.js extended) but not applied — apply pending fresh AWS SSO. You may apply that fix as a Phase 0 quick-win for 85001 only, but only after honoring Constraint 1 (BACKUP) below. Or you may roll it into the broader Tier 0.5 sweep.

Goal: build Tier 0.5 — same audit pipeline as Tier 0 but with a USPS-derived universe — to systematically catch every PO-Box-only / business-only / USPS-only federal+NY ZIP missing from zip_county.


TWO HARD CONSTRAINTS (load-bearing — the user wrote these specifically for this work) ​

CONSTRAINT 1 — BACKUP PROD BEFORE ANYTHING ​

Before you write a single doc to the prod zip_county collection (any apply path: Tier 0.5 batch, single-doc 85001 patch, anything), take a fresh mongodump of the collection.

Required:

  • Tag the snapshot with ISO-UTC timestamp + the planned _seedSource marker
  • Store at a versioned, retrievable location (S3 in the prod account preferred — bucket askflorence-data works; local with documented path acceptable for fast iteration)
  • Verify the snapshot by file size + record count + spot-check that mongorestore works against a scratch DB
  • Document the restore command in your plan file
  • Each batched apply gets its own immediately-preceding fresh backup

No apply step proceeds without a verified backup. Non-negotiable. If you find yourself about to call bulkWrite / insertOne / --apply on prod without a backup taken in the last few minutes, STOP.

Recommended pattern:

bash
# 1. Get prod read URI from Secrets Manager
aws sso login --profile askflorence-prod
PROD_READ_URI=$(aws --profile askflorence-prod secretsmanager get-secret-value \
  --secret-id prod/mongodb/app-read --query SecretString --output text)

# 2. Dump zip_county to local with timestamp tag
TS=$(date -u +%Y%m%dT%H%M%SZ)
TAG="pre-tier-0-5-${TS}"
mkdir -p /tmp/backups/${TAG}
mongodump --uri="${PROD_READ_URI}" --collection=zip_county \
  --out=/tmp/backups/${TAG}/dump

# 3. Verify
ls -lh /tmp/backups/${TAG}/dump/*/zip_county.bson
mongorestore --uri="<scratch-uri>" --nsFrom='*.zip_county' --nsTo='restored.zip_county_test' \
  --dryRun /tmp/backups/${TAG}/dump

# 4. Upload to S3 (prod-account bucket)
aws --profile askflorence-prod s3 sync /tmp/backups/${TAG}/ \
  s3://askflorence-data/db-backups/zip_county/${TAG}/

# 5. Document restore command in your plan file

CONSTRAINT 2 — ANALYZE → REPORT → PHASED DECISIONS — never auto-update ​

The audit DOES NOT auto-insert. Flow:

  1. Dry-run audit → produce a structured gap report:
    • Per-state counts (insertable / needs-PUF / discrepancy / cms-error)
    • Per-class counts
    • Per-conflict samples (if a state has > 50 gaps, that's anomalous — sample 10)
    • Full machine-readable JSON at scripts/db/data/federal-tier-0-5-gap-report-<date>.json
    • Human-readable markdown summary in your status comment on issue #80
  2. STOP. Surface the report to the user via issue comment + a clear "awaiting user review" message in the chat. Do not proceed.
  3. User makes phased decisions explicitly. Examples:
    • "Apply the AZ + TX classes only, hold the rest"
    • "Hold the discrepancy class entirely until I review samples"
    • "State X has 50 gaps — investigate before any apply"
  4. You execute ONLY what the user explicitly greenlights, in the batches they approve.
  5. Each batch gets a fresh backup per Constraint 1.
  6. After each batch: post a status comment on #80 with batch summary + verification matrix.

If at any point you're unsure whether the user has greenlighted a specific apply, ask — do not assume.


Scope ​

  • PROD ONLY this run. Staging is locked out (massive in-flight data task there). Don't touch staging Atlas — no allowlist add, no script run, no DB write.
  • States in scope: federal-30 ∪ {NY}. SBE states are out of scope (Tier 1.5 covers those at #70).
  • Pure data layer. No app code changes, no route changes, no STATE_BASED_MARKETPLACES changes.
  • No app deploy needed. Vercel + AWS apex unchanged.

Step-by-step shape (your starting framework — adapt during planning) ​

This is the original 10-phase blueprint from the reference plan. Treat it as a starting reference. Your own plan will diverge per the two constraints — phase 5 becomes the explicit user-decision gate.

#PhaseWhat
0Recon + create your own plan fileRead this prompt + reference plan + #80 + #79. Write your own plan file at ~/.claude/plans/<your-session-tag>.md (or use the existing one — your call). Bake Constraints 1+2 into every apply path.
1HUD download + sanityClick-through HUD account at huduser.gov; download ZIP_COUNTY_<YYYYqQ>.csv. Verify 85001 + 5 known PO-Box ZIPs are present. Fallback: zipcodes npm package if HUD auth is friction.
2Build USPS snapshotscripts/db/build-usps-snapshot.js — filter HUD CSV to federal-30 + NY; write scripts/db/data/usps-zip-state-<date>.csv (committed).
3Build Tier 0.5 audit scriptscripts/db/audit-federal-completeness-tier-0-5.js — mirrors Tier 0 (scripts/db/audit-federal-completeness.js); only universe input differs. Same 4-way classification (insertable / needs-PUF / discrepancy / cms-error).
4Run dry-run audit against PROD READ replicaGenerates federal-tier-0-5-gap-report-<date>.json. ~10 min runtime.
5HARD STOP — TRIAGESurface report on #80. Wait for user phased-decision. No apply path runs in this phase.
6Build seed script (per user direction)scripts/db/seed-federal-tier-0-5.js (or extend seed-federal-completeness.js) — supports per-class / per-state filters so user can apply subsets. Marker: _seedSource: "federal-tier-0-5-audit-<date>". Three modes: --dry-run / --apply / --rollback.
7Apply user-greenlighted batches (each with backup)For each batch: BACKUP → dry-run → user approves → apply → smoke matrix → status comment on #80.
8Validation tierCalculator baseline diff (ZERO DIFFS gate), Tier 1 audit, multi-county integrity check, smoke matrix on user-approved sample of newly-inserted ZIPs.
9Docsdocs/validation/tier-0-5-federal-uspscompleteness.md, docs/infrastructure/data-sources.md (annual refresh playbook), docs/infrastructure/change-log.md (dated entry per apply batch).
10Status comments + closeUpdate #80 with final state. Update #79 / #47 Phase 11 checklist.

Realistic actual end-to-end estimate: 30-60 min of script writing + dry-run, then potentially several user-paced phases for actual applies depending on what the report surfaces.


Critical files ​

To create:

  • scripts/db/build-usps-snapshot.js
  • scripts/db/data/usps-zip-state-<date>.csv (committed snapshot)
  • scripts/db/audit-federal-completeness-tier-0-5.js
  • scripts/db/data/federal-tier-0-5-gap-report-<date>.json (committed audit-trail report)
  • scripts/db/seed-federal-tier-0-5.js — must support --state=<list>, --class=<list> filters so user can apply subsets per Constraint 2
  • docs/validation/tier-0-5-federal-uspscompleteness.md

To modify:

  • docs/infrastructure/data-sources.md — add HUD ZIP-County crosswalk; extend annual refresh playbook
  • docs/infrastructure/change-log.md — dated entry per apply batch
  • docs/security-compliance/vendor-register.md — note HUD as a data source (no BAA needed; public federal data)

To leave UNTOUCHED (hard guardrails):

  • All 30,326 pre-existing federal+NY county docs (never modify)
  • All 17,537 SBE-side docs with _seedSource: "cms-2026-04-30" (out of scope)
  • All 3 docs with _seedSource: "federal-gap-fix-2026-05-01" (Path 1 fixes)
  • All 366 docs with _seedSource: "federal-completeness-audit-2026-05-01" (Tier 0 fixes)
  • src/app/api/counties/route.ts (route logic correct)
  • src/lib/hooks/use-calculator.ts (hook logic correct)
  • STATE_BASED_MARKETPLACES, FEDERAL_STATES, SBE_STATES constants
  • STAGING ATLAS entirely

Patterns to reuse (don't re-invent) ​

  • Audit pipeline — scripts/db/audit-federal-completeness.js (Tier 0). Copy structure, swap universe loader.
  • Seed pattern — scripts/db/seed-federal-completeness.js. Copy structure, change marker tag, add per-state / per-class filter flags for Constraint 2.
  • CMS query — same lookupZipViaCms() pattern used in scripts/db/build-cms-snapshot.js (5 concurrent, 25 req/sec validated).
  • regionId lookup — same in-memory map pattern from audit-federal-completeness.js.
  • 3-mode CLI — scripts/db/ingest-puf-augment.js.
  • FEDERAL_PLUS_NY set — same constant as Tier 0 audit.
  • Calculator baseline diff — scripts/audit/calculator-baseline-diff.ts runs against prod URI in .env.local; ZERO DIFFS gate.

Tooling + auth expectations ​

You'll need:

  • AWS SSO refreshed: aws sso login --profile askflorence-prod (and askflorence-staging if you ever need it — but you shouldn't).
  • Atlas CLI auth: atlas auth login if you need to add/remove your laptop IP from the prod Atlas project allowlist.
  • MongoDB tools: mongodump, mongorestore, mongosh (Homebrew: brew install mongodb-database-tools mongosh).
  • Node 22+: should already be installed.
  • HUD account: register at https://www.huduser.gov/portal/datasets/usps_crosswalk.html if you go that route. Free.
  • gh CLI: for posting status comments on #80.

Prod Mongo URIs from Secrets Manager (only fetch when you need them; keep them out of .env.local since they rotate):

bash
PROD_READ_URI=$(aws --profile askflorence-prod secretsmanager get-secret-value \
  --secret-id prod/mongodb/app-read --query SecretString --output text)
PROD_WRITE_URI=$(aws --profile askflorence-prod secretsmanager get-secret-value \
  --secret-id prod/mongodb/app-write --query SecretString --output text)

PROD_READ_URI is sufficient for Phases 0-5 (everything until apply). PROD_WRITE_URI only needed at apply time (Phase 7), and only for the specific batch you're applying.


Style + protocol reminders (from CLAUDE.md / AGENTS.md) ​

  • No em dashes (—) or en dashes (–) in any new docs/comments. Hyphens only.
  • This Next.js version has breaking changes from typical training data — don't touch app code in this session anyway, but if you ever need to, read node_modules/next/dist/docs/ first.
  • Don't push to main without explicit user sign-off. This work lives on the tier-0-5-federal-completeness-audit branch.
  • Don't deploy. No vercel --prod runs from this session.
  • All commits in this worktree use the standard message format with Co-Authored-By: Claude Opus 4.7 (1M context).
  • Track every batch boundary as a status comment on #80, not just in chat.

What "done" looks like ​

  • Tier 0.5 audit run end-to-end at 100% match (every USPS-known federal+NY ZIP either accounted for in DB or documented as an exclusion)
  • Every apply was preceded by a verified backup with documented restore command
  • Every apply was explicitly user-greenlighted via the analyze → report → phased flow on #80
  • Calculator baseline diff: ZERO DIFFS post-apply
  • Tier 1 + Tier 1.5 audits: 100% post-apply
  • 85001 specifically resolves on prod (curl https://askflorence.health/api/counties?zip=85001 returns Maricopa County response)
  • Markdown report committed at docs/validation/tier-0-5-federal-uspscompleteness.md
  • Annual refresh playbook in docs/infrastructure/data-sources.md extended
  • #80 closed with final summary comment

Your first action when this session starts ​

  1. Read this entire file.
  2. Read /Users/tahaabbasi/.claude/plans/hey-so-okay-that-delightful-eich.md (the original session's plan — comprehensive context).
  3. Read gh issue view 80 and gh issue view 79.
  4. Read scripts/db/audit-federal-completeness.js and scripts/db/seed-federal-completeness.js to internalize the Tier 0 pattern you'll mirror.
  5. Write your own plan file at ~/.claude/plans/<your-session-tag>.md. The plan must explicitly bake in Constraints 1 + 2 — every phase that touches prod has a backup precondition, and Phase 5 is a hard stop with no apply path.
  6. Enter plan mode, share the plan via ExitPlanMode for user approval.
  7. Then execute Phase 0 through Phase 4 (recon → dry-run audit → report).
  8. Stop at Phase 5. Surface report on #80. Wait for user.
  9. Execute user-greenlighted batches one at a time, each with its own backup + post-apply verification + status comment.
Pager
Next pageHome

AskFlorence Internal Documentation. Not for public distribution.

AskFlorence

Internal Documentation

Access restricted. Not for public distribution.