Incident Response Plan

Status: Active. Effective 2026-05-11. Owner: Taha Abbasi (Incident Commander) + Asad Khalid (Compliance Liaison) + Ian Friend (Comms Lead). Reviewed: annually, and after every incident rated SEV-2 or more severe, as part of the post-mortem.

Purpose

Defines roles, escalation paths, regulatory notification timelines, and the operational playbook for security and privacy incidents. Required artifact for:

  • HIPAA §164.308(a)(6) — security incident procedures
  • HIPAA Breach Notification Rule (45 CFR §164.400-414) — 60-day window from discovery; HHS + affected individuals + (if >500) media notification
  • CMS EDE Phase 3 / MARS-E 2.2 IR-1 through IR-8 — incident response policy + procedures + handling + monitoring + reporting + assistance + plan + spillage response
  • SOC 2 CC7.3 — incident response; CC7.4 — disclosure
  • NIST 800-53 R4 Moderate IR-family controls
  • State breach notification statutes (deadlines vary; default to the strictest applicable deadline, e.g. CA = 30 days for affected residents)

The operational playbook lives at docs/runbooks/security-incident-response.md. This document defines roles, criteria, and timelines; the runbook is the moment-of-incident first-responder checklist.

Roles

| Role | Person | Backup | Responsibilities |
| --- | --- | --- | --- |
| Incident Commander | Taha Abbasi | Asad Khalid (until a second technical admin is provisioned, see risk R-001) | Owns the incident end-to-end. Triggers containment, drives RCA, owns the post-mortem. |
| Compliance Liaison | Asad Khalid | Taha Abbasi | Owns the regulatory clock: calculates Breach Notification Rule timing, drafts HHS / state-AG / affected-individual notifications. Owns vendor BAA + sub-processor coordination. |
| Comms Lead | Ian Friend | Asad Khalid | Owns external comms: affected-individual emails, public statement (if needed), agent + investor comms. |
| Engineering Responder | Whoever is most relevant to the incident's blast radius | All engineers | Hands-on remediation, evidence preservation, technical RCA. |
| Customer + Agent Liaison | Ian Friend | Asad Khalid | Direct contact with affected agents / future members. Owns the "what to tell them" decision matrix. |

Severity classification

| Severity | Definition | Notification window (internal) | Examples |
| --- | --- | --- | --- |
| SEV-0 | Confirmed PHI breach OR active exploit OR site fully down to customers | Immediate (within 15 min of detection; page the IC) | Database publicly accessible; ransomware; credential dump on the dark web |
| SEV-1 | Suspected PHI breach OR significant data integrity issue OR auth/MFA bypass OR vendor BAA breach | Within 1 hour | Suspected lateral movement; suspected SQL injection; vendor reports BAA violation |
| SEV-2 | Limited data exposure OR availability degradation OR misconfiguration with potential breach implication | Within 4 hours | A GET /api/waitlist triggers spurious emails; PostHog free-tier captures unexpected data; an admin login from an unrecognized IP |
| SEV-3 | Operational defect with potential compliance implication but no immediate data exposure | Within 24 hours | Quarterly access review missed; BAA expiration not renewed in time; CI guard fails on a PR |

PHI involvement upgrades the severity by one tier minimum. When in doubt, classify higher.
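
The table and the PHI rule above can be read as a small lookup plus one adjustment. The following is a minimal sketch under that reading, not an existing AskFlorence module; the names `Severity`, `classify`, and `page_deadline` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from enum import IntEnum


class Severity(IntEnum):
    """Lower number = more severe, matching SEV-0 through SEV-3."""
    SEV0 = 0
    SEV1 = 1
    SEV2 = 2
    SEV3 = 3


# Internal notification windows, copied from the severity table.
INTERNAL_WINDOW = {
    Severity.SEV0: timedelta(minutes=15),
    Severity.SEV1: timedelta(hours=1),
    Severity.SEV2: timedelta(hours=4),
    Severity.SEV3: timedelta(hours=24),
}


def classify(base: Severity, phi_involved: bool) -> Severity:
    """Apply 'PHI upgrades the severity by one tier minimum'; SEV-0 is the ceiling."""
    if phi_involved:
        return Severity(max(base - 1, Severity.SEV0))
    return base


def page_deadline(detected_at: datetime, severity: Severity) -> datetime:
    """Latest time by which the IC must be paged for this severity."""
    return detected_at + INTERNAL_WINDOW[severity]
```

For example, a misconfiguration that would otherwise be SEV-2 becomes SEV-1 once PHI is involved, which shortens the paging window from 4 hours to 1 hour.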

Detection sources

The incident-response practice depends on these signals being monitored:

| Source | What it surfaces | Status today |
| --- | --- | --- |
| AWS GuardDuty | Account-level threat detection (anomalous API patterns, credential misuse, malware) | Active, org-wide, log-archive aggregation |
| AWS Security Hub | Cross-service findings (CIS, FSBP, NIST 800-53 standards) | Active, org-wide |
| AWS CloudTrail | All AWS API activity, 7-year retention in log-archive S3 | Active, org-wide |
| Atlas database audit | DB-layer auth attempts + admin actions | Active on M10 HIPAA tier (prod) + M30 (staging) |
| agent_audit_log collection | App-layer auth + admin actions + future per-record PHI access | Live (append-only enforced at DB layer) |
| staging-cluster-drift workflow | Nightly drift check on cross-cluster reader role; opens P1 issue on drift | Active, daily 08:00 UTC |
| staging-collections-guard workflow | Per-PR static guard on cross-cluster data classification | Active |
| validate-secrets workflow | Per-PR check for malformed secrets (the bug class that broke Resend) | Active |
| Vercel / AWS WAF logs | L7 attack patterns, blocked requests | Active (separate WAF rule-exclusion item pending for PostHog crawlers) |
| Customer / agent report | External party emails security@askflorence.health or taha@ | Always active; no formal triage queue yet |
| Founder direct observation | "Email's not sending," "the cost spiked," "this CTA does nothing" | The 2026-04-30 home CTA no-op + 2026-04-10 Resend incident were detected this way |

Procedure (5-step lifecycle)

The runbook is the moment-of-incident checklist; this is the lifecycle framework.

Step 1 — Detect ​

  • Trigger source: any of the detection signals above. Page the IC within the internal notification window for the incident's severity.
  • The IC acknowledges within the same window; a text or call from any team member is an acceptable way to reach the IC.
  • Open a private incident channel (Slack DM thread or Google Chat space) named 🚨 sev-N incident <date> <short slug>.
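
A consistent channel name makes incidents easier to find later. The sketch below builds a name in the 🚨 sev-N incident <date> <short slug> shape; the function name, the ISO date format, and the 40-character slug cap are assumptions made for illustration, not conventions stated in this plan.

```python
import re
from datetime import date


def incident_channel_name(sev: int, opened: date, summary: str) -> str:
    """Build a channel name from severity, open date, and a free-text summary."""
    if sev not in (0, 1, 2, 3):
        raise ValueError("severity must be 0-3")
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", summary.lower()).strip("-")
    # Cap the slug length (40 is an arbitrary choice for this sketch).
    slug = slug[:40].rstrip("-")
    return f"🚨 sev-{sev} incident {opened.isoformat()} {slug}"
```

Calling `incident_channel_name(2, date(2026, 4, 30), "Home CTA no-op!")` yields `🚨 sev-2 incident 2026-04-30 home-cta-no-op`.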

Step 2 — Contain ​

  • Block the immediate vector. Examples:
    • Revoke compromised credentials (Secrets Manager update-secret + IAM key rotation)
    • Disable the affected route (/api/... 503 toggle or feature flag)
    • Quarantine the affected ECS task / Lambda function
    • Block the source IP at WAF or Atlas allowlist
  • Preserve evidence — do NOT delete logs, do NOT clean up. Snapshot the relevant Atlas cluster + S3 bucket BEFORE remediating if there is any chance of post-mortem need.

Step 3 — Assess ​

  • IC + Engineering Responder: what data was accessed? Which data subjects are affected? What is the scope of exposure?
  • Compliance Liaison: is this a HIPAA breach (45 CFR §164.402 definition)? If yes, the 60-day clock starts at discovery (not at "investigation complete"); log the discovery timestamp.
  • Time-bound the assessment. SEV-0/1 assessment in ≤ 24h; SEV-2 in ≤ 72h.
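
The assessment time-bounds can be computed directly from the logged discovery timestamp. This is a minimal sketch; `assessment_due` is a hypothetical helper, and treating SEV-3 as having no assessment deadline is an assumption, since the plan only states limits for SEV-0 through SEV-2.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assessment limits from Step 3: SEV-0/1 within 24h, SEV-2 within 72h.
ASSESSMENT_LIMIT = {
    0: timedelta(hours=24),
    1: timedelta(hours=24),
    2: timedelta(hours=72),
}


def assessment_due(discovered_at: datetime, sev: int) -> Optional[datetime]:
    """Return when the assessment must be complete, or None if no limit is defined."""
    if discovered_at.tzinfo is None:
        # A naive timestamp is ambiguous once regulators ask "when exactly?".
        raise ValueError("log discovery time as a timezone-aware timestamp")
    limit = ASSESSMENT_LIMIT.get(sev)
    return None if limit is None else discovered_at + limit
```

Requiring a timezone-aware timestamp is a design choice of the sketch: the same logged value also starts the 60-day HIPAA clock, so it should be unambiguous.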

Step 4 — Notify ​

| Recipient | Trigger | Timeline | Channel |
| --- | --- | --- | --- |
| Affected individuals | HIPAA breach involving their PHI | Within 60 days of discovery; state laws may require earlier (CA = 30 days for affected residents) | Letter or email per individual preference + HHS-mandated content |
| HHS Office for Civil Rights | HIPAA breach affecting any individual | Within 60 days of discovery (500 or more affected); otherwise no later than 60 days after the end of the calendar year of discovery (fewer than 500); via OCR breach portal | Online submission |
| Media | HIPAA breach affecting >500 individuals in a state | Within 60 days; "prominent media outlet" in the state | Press release |
| State Attorney General | Per state-specific law (CA, NY, etc.) | Varies; default to 30 days | Per state-specific procedure |
| CMS EDE program contact | EDE program-eligibility-relevant incident | Per EDE Phase 3 program requirements (once submitted) | EDE program portal |
| Vendor BAA partners | Incident involving their data flows | Per BAA terms (typically 30 days) | Per vendor contract |
| Investors + advisors | SEV-0 customer-facing incident | Same business day | Email + scheduled brief |
| Internal team | Any SEV-0/1 | Immediate | Private Google Chat / Slack |

Compliance Liaison owns the regulatory clock. A spreadsheet template lives in the runbook for tracking notification deadlines per breach.
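
The federal deadlines are mechanical once the discovery date and affected counts are known. The sketch below covers only the HIPAA outer limits (45 CFR §§164.404, 164.406, 164.408); `hipaa_deadlines` and its parameters are hypothetical names. It does not model state statutes, BAA terms, or the "without unreasonable delay" standard, all of which can require earlier notice.

```python
from datetime import date, timedelta
from typing import Dict

HIPAA_OUTER_LIMIT = timedelta(days=60)


def hipaa_deadlines(discovered: date, affected: int, max_in_one_state: int) -> Dict[str, date]:
    """Latest permissible federal notification dates for one breach.

    affected: total individuals whose PHI was involved.
    max_in_one_state: largest count of affected residents in any single state.
    """
    sixty_days = discovered + HIPAA_OUTER_LIMIT
    deadlines = {"individuals": sixty_days}
    if affected >= 500:
        # 500 or more: HHS is notified on the same clock as individuals.
        deadlines["hhs"] = sixty_days
    else:
        # Fewer than 500: within 60 days after the end of the discovery year.
        deadlines["hhs"] = date(discovered.year, 12, 31) + HIPAA_OUTER_LIMIT
    if max_in_one_state > 500:
        # Media notice applies only above 500 residents of one state.
        deadlines["media"] = sixty_days
    return deadlines
```

A breach discovered on 2026-05-11 affecting 120 people gives an individual-notice limit of 2026-07-10 and an HHS limit of 2027-03-01, with no media entry.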

Step 5 — Remediate + post-mortem ​

  • Implement the fix. Document the fix.
  • Within 5 business days of incident close, the IC files a post-mortem at docs/session-log/<date>-incident-<slug>.md covering:
    • Timeline (detection → containment → notification → remediation)
    • Root cause analysis
    • Contributing factors
    • What worked
    • What didn't
    • Preventive measures (with owners + due dates)
    • Status of regulatory notifications
  • Post-mortem is reviewed at the next quarterly access review; preventive-measure follow-ups are tracked to closure.
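
The "5 business days" post-mortem deadline can be computed as follows. This is a sketch that counts Monday through Friday only; ignoring public holidays is an assumption, and `add_business_days` is a hypothetical helper.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start` (weekends skipped)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current
```

An incident closed on Thursday 2026-04-30 would have its post-mortem due by Thursday 2026-05-07.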

Documented incident history (worked examples)

These are the incidents AskFlorence has documented to date. They serve as the IRP's worked examples; drilling against these scenarios builds the muscle for real ones.

2026-04-10 — Resend transactional email outage (SEV-2 in retrospect) ​

  • Detection: founder-side test send showed no delivery; downstream Vercel logs showed Resend 401s
  • Root cause: a literal \n character embedded in the RESEND_API_KEY Vercel env var, plus Resend domain DKIM CNAMEs that were never published; together these caused the updates.askflorence.health domain to fail Resend's status check
  • Impact: 3 weeks of broken transactional email on Vercel-hosted prod (waitlist confirmations + ops notifications). No external party complained because the volume was low pre-AWS-cutover; impact = ~30-40 lost confirmation emails to early waitlist signups
  • Resolution: decision to retire Resend in favor of AWS SES (v0.33.0 commits, 2026-04-30)
  • Preventive measure: validate-secrets CI workflow (now live)
  • What the IRP would have done differently: had this been classified SEV-1, affected individuals would have been notified (waitlist signups never received their confirmation). The retrospective decision was to absorb the customer-trust impact rather than re-notify
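
The bug class behind this incident (a literal \n inside a single-line API key) can be caught with a simple static check. The sketch below is illustrative only and is not the actual validate-secrets workflow; `find_malformed_secrets` is a hypothetical function, and it assumes every value is a single-line secret, so multi-line values such as PEM keys would need an exemption.

```python
from typing import Dict, List, Tuple


def find_malformed_secrets(env: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (name, reason) pairs. Secret values are never echoed back."""
    problems: List[Tuple[str, str]] = []
    for name, value in sorted(env.items()):
        if "\\n" in value or "\\r" in value:
            # Two characters, backslash plus letter: the Resend failure mode.
            problems.append((name, "literal \\n or \\r escape sequence"))
        elif "\n" in value or "\r" in value:
            problems.append((name, "embedded line break"))
        elif value != value.strip():
            problems.append((name, "leading or trailing whitespace"))
        elif len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            problems.append((name, "value is wrapped in quotes"))
    return problems
```

Reporting only the variable name and a reason keeps the check safe to run in CI logs.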

2026-04-10 — GET /api/waitlist crawler-triggered SES sends (SEV-3) ​

  • Detection: founder observed unexpected SES Send metric values + spam-folder mail to a hardcoded address
  • Root cause: GET handler triggered a real SES send for every request; Googlebot / unfurlers / monitoring tools hit the URL ~15-25 times over 30 days
  • Impact: ~15-25 spurious emails to a hardcoded ops address. No PII / PHI exposure (no external recipient leak). Documented at commit 4422ca8
  • Preventive measure: engineering rule documented in CLAUDE.md — "no side-effect-triggering code in a GET handler unless gated on auth or NODE_ENV != production"
  • IRP role this exercised: detection + containment + post-mortem. No regulatory notification was needed (no PHI / PII involved beyond a single internal address).
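
The engineering rule above reduces to a small predicate that a handler can consult before doing anything with side effects. This is a language-agnostic sketch written in Python; `side_effect_allowed` is a hypothetical name, and extending the rule from GET to the other HTTP safe methods (HEAD, OPTIONS) is an assumption beyond what the rule states.

```python
# Methods that crawlers, unfurlers, and monitors issue without user intent.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def side_effect_allowed(method: str, authenticated: bool, node_env: str) -> bool:
    """Decide whether a handler may send email, write records, and so on.

    Non-safe methods (POST, PUT, ...) are always allowed here; safe methods
    are allowed only when authenticated or outside production.
    """
    if method.upper() not in SAFE_METHODS:
        return True
    return authenticated or node_env != "production"
```

Under this predicate, an unauthenticated crawler hitting GET /api/waitlist in production returns False, so no send would be triggered.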

2026-04-30 — Home CTA no-op (SEV-2) ​

  • Detection: founder noticed signup count was zero in HubSpot post-deploy
  • Root cause: v0.29.0 home swap shipped a fake-success CTA handler that didn't actually call /api/waitlist
  • Impact: every click between v0.29.0 deploy and the fix produced no record (no Mongo row, no SES email, no PostHog event). Estimated 5-15 lost signups based on landing-page traffic
  • Preventive measure: post-deploy smoke testing of conversion-critical CTAs
  • IRP exercise: detection by founder observation; remediation in same session

2026-05-06 — CMS ingest cost spike (SEV-3) ​

  • Detection: monthly Atlas billing email; cost on M60 ~$2,800/mo
  • Root cause: full re-ingest pattern on M60 instead of delta-aware refresh
  • Impact: financial only — $2,000+ unbudgeted Atlas spend
  • Resolution: delta-aware refresh cadence per decisions/2026-05-09-refresh-cadence.md
  • Preventive measure: budget alarms on AWS + Atlas cost (planned)
  • IRP exercise: assessment + remediation + post-mortem (in the decision doc)
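
To illustrate the difference between a full re-ingest and a delta-aware refresh, the sketch below compares content hashes and plans writes only for records that changed. It is a generic illustration, not the approach recorded in the decision doc; `record_hash` and `plan_delta` are hypothetical names, and keying records by a string ID is an assumption.

```python
import hashlib
import json
from typing import Dict, List, Tuple


def record_hash(record: dict) -> str:
    """Stable hash of a record: key order and spacing do not affect it."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_delta(stored_hashes: Dict[str, str],
               incoming: Dict[str, dict]) -> Tuple[List[str], List[str]]:
    """Return (ids to upsert, ids to delete) instead of rewriting everything."""
    upserts = [rid for rid, rec in incoming.items()
               if stored_hashes.get(rid) != record_hash(rec)]
    deletes = [rid for rid in stored_hashes if rid not in incoming]
    return upserts, deletes
```

When most source records are unchanged between refreshes, the upsert list stays small, so write volume (and the cluster tier needed to absorb it) scales with the change rate rather than the dataset size.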

2026-05-09 — HubSpot GDPR-delete blocklist (SEV-2) ​

  • Detection: founder noticed a real email address (taha@askflorence.health) was in the test-data cleanup list of a HubSpot script run
  • Root cause: the test cleanup script called POST /crm/v3/objects/contacts/gdpr-delete indiscriminately; that endpoint applies an irreversible portal-level blocklist
  • Impact: primary work email of the founder permanently blocked from HubSpot's UI / CSV-import paths (auto-write via API still works); HubSpot Support confirmed irreversible
  • Preventive measure: engineering convention captured in CLAUDE.md — use +alias@ test addresses for HubSpot data; archive endpoint for soft-delete; never gdpr-delete a real address
  • IRP exercise: assessment. No regulatory notification was needed (no PHI was affected; the only impact was on internal workflow ergonomics)
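
A cleanup script can enforce the convention above by refusing any address that is not plus-aliased before it calls a destructive endpoint. This is a minimal sketch that makes no HubSpot API calls; `is_test_address` and `split_cleanup_list` are hypothetical helpers, and treating "+alias@" as "any non-empty plus-suffix in the local part" is an assumption.

```python
from typing import Iterable, List, Tuple


def is_test_address(email: str) -> bool:
    """True only for plus-aliased addresses such as qa+signup1@example.com."""
    local, at, domain = email.strip().lower().rpartition("@")
    if not at or not local or not domain:
        return False
    base, plus, alias = local.partition("+")
    return bool(base and plus and alias)


def split_cleanup_list(emails: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split candidates into (safe to clean up, refused as possibly real)."""
    deletable: List[str] = []
    refused: List[str] = []
    for email in emails:
        (deletable if is_test_address(email) else refused).append(email)
    return deletable, refused
```

Anything in the refused list should be reviewed by a person rather than deleted; the guard fails closed, so a malformed address is treated as real.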

Tabletop exercise

The IRP is only as effective as the team's practice against it. The first tabletop is scheduled for the Q2 2026 access review (July 2026): a 60-minute walkthrough of a SEV-1 PHI-breach scenario, role-playing IC + Compliance Liaison + Comms Lead. Tabletop outcomes are documented in the same access-review file.

Reference

  • Operational playbook: docs/runbooks/security-incident-response.md
  • Break-glass procedure: docs/runbooks/break-glass-root-login.md
  • Risk Assessment — incident-relevant risks
  • Data Retention Policy — erasure flow that may be triggered by an incident
  • Privacy Impact Assessment — data flows that determine incident scope
  • Access Control Policy — credential rotation as remediation
  • Vendor / Subprocessor Register — vendor BAA contact info
  • HIPAA Breach Notification Rule: 45 CFR §§164.400-414
  • HHS OCR Breach Portal: https://ocrportal.hhs.gov/ocr/breach/wizard_breach.jsf

AskFlorence Internal Documentation. Not for public distribution.
