Incident Response Plan
Status: Active. Effective 2026-05-11. Owner: Taha Abbasi (Incident Commander) + Asad Khalid (Compliance Liaison) + Ian Friend (Comms Lead). Reviewed: Annually + after every incident of SEV-2 or higher severity (SEV-0 through SEV-2) as part of post-mortem.
Purpose
Defines roles, escalation paths, regulatory notification timelines, and the operational playbook for security and privacy incidents. Required artifact for:
- HIPAA §164.308(a)(6) — security incident procedures
- HIPAA Breach Notification Rule (45 CFR §164.400-414) — 60-day window from discovery; HHS + affected individuals + (if >500) media notification
- CMS EDE Phase 3 / MARS-E 2.2 IR-1 through IR-8 — incident response policy + procedures + handling + monitoring + reporting + assistance + plan + spillage response
- SOC 2 CC7.3 — incident response; CC7.4 — disclosure
- NIST 800-53 R4 Moderate IR-family controls
- State breach notification statutes (deadlines vary by state; default to the shortest applicable deadline, e.g. CA = 30 days for affected residents)
The operational playbook lives at docs/runbooks/security-incident-response.md. This document defines roles, criteria, and timelines; the runbook is the moment-of-incident first-responder checklist.
Roles
| Role | Person | Backup | Responsibilities |
|---|---|---|---|
| Incident Commander | Taha Abbasi | Asad Khalid (until a second technical admin is provisioned, see risk R-001) | Owns the incident end-to-end. Triggers containment, drives RCA, owns the post-mortem. |
| Compliance Liaison | Asad Khalid | Taha Abbasi | Owns regulatory clock — calculates Breach Notification Rule timing, drafts HHS / state-AG / affected-individual notifications. Owns vendor BAA + sub-processor coordination. |
| Comms Lead | Ian Friend | Asad Khalid | Owns external comms — affected-individual emails, public statement (if needed), agent + investor comms. |
| Engineering Responder | Whoever is most-relevant to the incident's blast radius | All engineers | Hands-on remediation, evidence preservation, technical RCA |
| Customer + Agent Liaison | Ian Friend | Asad Khalid | Direct contact with affected agents / future members. Owns the "what to tell them" decision matrix. |
Severity classification
| Severity | Definition | Notification window (internal) | Examples |
|---|---|---|---|
| SEV-0 | Confirmed PHI breach OR active exploit OR site fully down to customers | Immediate (within 15 min of detection — page the IC) | Database publicly accessible; ransomware; credential dump on the dark web |
| SEV-1 | Suspected PHI breach OR significant data integrity issue OR auth/MFA bypass OR vendor BAA breach | Within 1 hour | Suspect lateral movement; suspected SQL injection; vendor reports BAA violation |
| SEV-2 | Limited data exposure OR availability degradation OR misconfiguration with potential breach implication | Within 4 hours | A GET /api/waitlist triggers spurious emails; PostHog free-tier captures unexpected data; an admin login from an unrecognized IP |
| SEV-3 | Operational defect with potential compliance implication but no immediate data exposure | Within 24 hours | Quarterly access review missed; BAA expiration not renewed in time; CI guard fails on a PR |
PHI involvement upgrades the severity by one tier minimum. When in doubt, classify higher.
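The upgrade rule and the internal notification windows from the table can be sketched in code. This is a minimal illustration, not the team's tooling: the names `Severity`, `INTERNAL_WINDOW`, and `effective_severity` are hypothetical, and the function applies only the one-tier minimum upgrade (a human may still classify higher).

```python
from datetime import timedelta
from enum import IntEnum


class Severity(IntEnum):
    """Lower number = more severe, matching the SEV-0..SEV-3 table."""
    SEV0 = 0
    SEV1 = 1
    SEV2 = 2
    SEV3 = 3


# Internal notification windows copied from the severity table.
INTERNAL_WINDOW = {
    Severity.SEV0: timedelta(minutes=15),
    Severity.SEV1: timedelta(hours=1),
    Severity.SEV2: timedelta(hours=4),
    Severity.SEV3: timedelta(hours=24),
}


def effective_severity(initial: Severity, phi_involved: bool) -> Severity:
    """Apply the minimum one-tier upgrade for PHI; SEV-0 is the ceiling."""
    if not phi_involved:
        return initial
    return Severity(max(initial - 1, Severity.SEV0))
```

For example, an incident first triaged as SEV-3 that turns out to involve PHI is handled as at least SEV-2, so the IC is paged within 4 hours rather than 24.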
Detection sources
The incident-response practice depends on these signals being monitored:
| Source | What it surfaces | Status today |
|---|---|---|
| AWS GuardDuty | Account-level threat detection (anomalous API patterns, credential misuse, malware) | Active, org-wide, log-archive aggregation |
| AWS Security Hub | Cross-service findings (CIS, FSBP, NIST 800-53 standards) | Active, org-wide |
| AWS CloudTrail | All AWS API activity, 7-year retention in log-archive S3 | Active, org-wide |
| Atlas database audit | DB-layer auth attempts + admin actions | Active on M10 HIPAA tier (prod) + M30 (staging) |
| `agent_audit_log` collection | App-layer auth + admin actions + future per-record PHI access | Live (append-only enforced at DB layer) |
| `staging-cluster-drift` workflow | Nightly drift check on cross-cluster reader role; opens P1 issue on drift | Active, daily 08:00 UTC |
| `staging-collections-guard` workflow | Per-PR static guard on cross-cluster data classification | Active |
| `validate-secrets` workflow | Per-PR check for malformed secrets (the bug class that broke Resend) | Active |
| Vercel / AWS WAF logs | L7 attack patterns, blocked requests | Active (separate WAF rule-exclusion item pending for PostHog crawlers) |
| Customer / agent report | External party emails security@askflorence.health or taha@ | Always active; no formal triage queue yet |
| Founder direct observation | "Email's not sending," "the cost spiked," "this CTA does nothing" | The 2026-04-30 home CTA no-op + 2026-04-10 Resend incident were detected this way |
Procedure (5-step lifecycle)
The runbook is the moment-of-incident checklist; this is the lifecycle framework.
Step 1 — Detect
- Trigger source: any of the detection signals above. Page the IC within the notification window above.
- IC ack within the window — text or call from any team member is acceptable.
- Open a private incident channel (Slack DM thread or Google Chat space: `🚨 sev-N incident <date> <short slug>`).
Step 2 — Contain
- Block the immediate vector. Examples:
  - Revoke compromised credentials (Secrets Manager `update-secret` + IAM key rotation)
  - Disable the affected route (`/api/...` 503 toggle or feature flag)
  - Quarantine the affected ECS task / Lambda function
  - Block the source IP at WAF or Atlas allowlist
- Preserve evidence — do NOT delete logs, do NOT clean up. Snapshot the relevant Atlas cluster + S3 bucket BEFORE remediating if there is any chance of post-mortem need.
Step 3 — Assess
- IC + Engineering Responder: what data was accessed? Which subjects affected? Scope of exposure?
- Compliance Liaison: is this a HIPAA breach (45 CFR §164.402 definition)? If yes, the 60-day clock starts at discovery (not "investigation complete") — log the timestamp.
- Time-bound the assessment. SEV-0/1 assessment in ≤ 24h; SEV-2 in ≤ 72h.
Step 4 — Notify
| Recipient | Trigger | Timeline | Channel |
|---|---|---|---|
| Affected individuals | HIPAA breach involving their PHI | Within 60 days of discovery; state laws may require earlier (CA = 30 days for affected residents) | Letter or email per individual preference + HHS-mandated content |
| HHS Office for Civil Rights | HIPAA breach affecting any individual | Within 60 days of discovery (500 or more affected); within 60 days after the end of the calendar year of discovery (fewer than 500); via OCR breach portal | Online submission |
| Media | HIPAA breach affecting >500 individuals in a state | Within 60 days; "prominent media outlet" in the state | Press release |
| State Attorney General | Per state-specific law (CA, NY, etc.) | Varies — default to 30 days | Per state-specific procedure |
| CMS EDE program contact | EDE program-eligibility-relevant incident | Per EDE Phase 3 program requirements (once submitted) | EDE program portal |
| Vendor BAA partners | Incident involving their data flows | Per BAA terms (typically 30 days) | Per vendor contract |
| Investors + advisors | SEV-0 customer-facing incident | Same business day | Email + scheduled brief |
| Internal team | Any SEV-0/1 | Immediate | Private Google Chat / Slack |
Compliance Liaison owns the regulatory clock. A spreadsheet template lives in the runbook for tracking notification deadlines per breach.
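The deadline arithmetic behind the regulatory clock can be sketched as follows. This is an illustration under stated assumptions, not the runbook's spreadsheet: the function names are hypothetical, the only state rule modelled is the 30-day California figure quoted in this plan, and the HHS branch follows 45 CFR §164.408 (500 or more individuals: 60 days from discovery; fewer than 500: 60 days after the end of the calendar year of discovery).

```python
from datetime import date, timedelta

HIPAA_WINDOW_DAYS = 60  # 45 CFR 164.404: counted from discovery
CA_WINDOW_DAYS = 30     # California figure quoted in this plan


def individual_notice_deadline(discovered: date, states: set[str]) -> date:
    """Earliest applicable deadline for notifying affected individuals."""
    days = HIPAA_WINDOW_DAYS
    if "CA" in states:
        days = min(days, CA_WINDOW_DAYS)
    return discovered + timedelta(days=days)


def hhs_notice_deadline(discovered: date, affected: int) -> date:
    """HHS OCR deadline: 60 days from discovery for 500+ individuals,
    otherwise 60 days after the end of the calendar year of discovery."""
    if affected >= 500:
        return discovered + timedelta(days=HIPAA_WINDOW_DAYS)
    return date(discovered.year, 12, 31) + timedelta(days=60)
```

With a discovery date of 2026-05-11, a breach touching California residents gives an individual-notice deadline of 2026-06-10, while a 12-person breach must be reported to HHS by 2027-03-01.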
Step 5 — Remediate + post-mortem
- Implement the fix. Document the fix.
- Within 5 business days of incident close, the IC files a post-mortem at `docs/session-log/<date>-incident-<slug>.md` covering:
  - Timeline (detection → containment → notification → remediation)
  - Root cause analysis
  - Contributing factors
  - What worked
  - What didn't
  - Preventive measures (with owners + due dates)
  - Status of regulatory notifications
- Post-mortem is reviewed at the next quarterly access review; preventive-measure follow-ups are tracked to closure.
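A small generator can keep post-mortems from omitting a required section. The sketch below is hypothetical: it assumes `<date>` is an ISO `YYYY-MM-DD` date and that each required item becomes a second-level Markdown heading, neither of which this plan specifies.

```python
from datetime import date

# Required sections, in the order listed in Step 5.
SECTIONS = [
    "Timeline (detection → containment → notification → remediation)",
    "Root cause analysis",
    "Contributing factors",
    "What worked",
    "What didn't",
    "Preventive measures (with owners + due dates)",
    "Status of regulatory notifications",
]


def postmortem_path(incident_date: date, slug: str) -> str:
    """Build the file path, assuming an ISO date in the <date> slot."""
    return f"docs/session-log/{incident_date.isoformat()}-incident-{slug}.md"


def postmortem_skeleton(title: str) -> str:
    """Return a Markdown skeleton with one empty section per required item."""
    lines = [f"# {title}", ""]
    for section in SECTIONS:
        lines += [f"## {section}", "", "TODO", ""]
    return "\n".join(lines)
```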
Documented incident history (worked examples)
These are the incidents AskFlorence has documented to date. They serve as the IRP's worked examples: drilling against these scenarios builds the muscle for real ones.
2026-04-10 — Resend transactional email outage (SEV-2 in retrospect)
- Detection: founder-side test send showed no delivery; downstream Vercel logs showed Resend 401s
- Root cause: a literal `\n` character embedded in the `RESEND_API_KEY` Vercel env var, plus Resend domain DKIM CNAMEs that were never published; together these caused the `updates.askflorence.health` domain to fail Resend's status check
- Impact: 3 weeks of broken transactional email on Vercel-hosted prod (waitlist confirmations + ops notifications). No external party complained because the volume was low pre-AWS-cutover; impact = ~30-40 lost confirmation emails to early waitlist signups
- Resolution: decision to retire Resend in favor of AWS SES (v0.33.0 commits, 2026-04-30)
- Preventive measure: `validate-secrets` CI workflow (now live)
- What IRP would have done differently: had this been classified SEV-1, affected individuals would have been notified (waitlist signups didn't receive their confirmation); the retrospective decision was to absorb the customer-trust impact rather than re-notify
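The bug class behind this incident is easy to check mechanically. The sketch below is not the actual `validate-secrets` workflow, whose contents are not shown here; `malformed_secrets` and its rule set are assumptions. It flags empty values, any whitespace (including a real newline), and the two-character sequences backslash-n and backslash-r, since the incident description does not say which form the stray `\n` took.

```python
import re

# Assumed rule set: whitespace, or a literal backslash-n / backslash-r pair.
_SUSPECT = re.compile(r"\s|\\n|\\r")


def malformed_secrets(env: dict[str, str]) -> list[str]:
    """Return the sorted names of variables whose values look malformed."""
    return sorted(
        name for name, value in env.items()
        if value == "" or _SUSPECT.search(value)
    )
```

A CI step could call this on the deployed environment's variables and fail the build when the returned list is non-empty, reporting names only so that secret values never reach the logs.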
2026-04-10 — GET /api/waitlist crawler-triggered SES sends (SEV-3)
- Detection: founder observed unexpected SES `Send` metric values + spam-folder mail to a hardcoded address
- Root cause: GET handler triggered a real SES send for every request; Googlebot / unfurlers / monitoring tools hit the URL ~15-25 times over 30 days
- Impact: ~15-25 spurious emails to a hardcoded ops address. No PII / PHI exposure (no external recipient leak). Documented at commit `4422ca8`
- Preventive measure: engineering rule documented in `CLAUDE.md`: "no side-effect-triggering code in a GET handler unless gated on auth or NODE_ENV != production"
- IRP role this exercised: detection + containment + post-mortem. No regulatory notification was needed (no PHI / PII involved beyond a single internal address).
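The GET-handler rule reduces to a single boolean gate. The sketch below expresses that logic in Python purely for illustration; the function names and the `send_email` callback are hypothetical, and the real handler would apply the same check in the application's own stack.

```python
from typing import Callable


def get_side_effects_allowed(authenticated: bool, node_env: str) -> bool:
    """The rule as written: gated on auth, or NODE_ENV != production."""
    return authenticated or node_env != "production"


def handle_waitlist_get(authenticated: bool, node_env: str,
                        send_email: Callable[[], None]) -> int:
    """Hypothetical GET handler: always answers 200, sends only when gated."""
    if get_side_effects_allowed(authenticated, node_env):
        send_email()
    return 200
```

Under this gate an unauthenticated crawler hitting the production URL still receives a 200 response but no longer triggers a send.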
2026-04-30 — Home CTA no-op (SEV-2)
- Detection: founder noticed signup count was zero in HubSpot post-deploy
- Root cause: v0.29.0 home swap shipped a fake-success CTA handler that didn't actually call `/api/waitlist`
- Impact: every click between v0.29.0 deploy and the fix produced no record (no Mongo row, no SES email, no PostHog event). Estimated 5-15 lost signups based on landing-page traffic
- Preventive measure: post-deploy smoke testing of conversion-critical CTAs
- IRP exercise: detection by founder observation; remediation in same session
2026-05-06 — CMS ingest cost spike (SEV-3)
- Detection: monthly Atlas billing email; cost on M60 ~$2,800/mo
- Root cause: full re-ingest pattern on M60 instead of delta-aware refresh
- Impact: financial only — $2,000+ unbudgeted Atlas spend
- Resolution: delta-aware refresh cadence per `decisions/2026-05-09-refresh-cadence.md`
- Preventive measure: budget alarms on AWS + Atlas cost (planned)
- IRP exercise: assessment + remediation + post-mortem (in the decision doc)
2026-05-09 — HubSpot GDPR-delete blocklist (SEV-2)
- Detection: founder noticed a real email address (`taha@askflorence.health`) was in the test-data cleanup list of a HubSpot script run
- Root cause: test cleanup script used `POST /crm/v3/objects/contacts/gdpr-delete` indiscriminately; that endpoint is an irreversible portal-level blocklist
- Impact: primary work email of the founder permanently blocked from HubSpot's UI / CSV-import paths (auto-write via API still works); HubSpot Support confirmed irreversible
- Preventive measure: engineering convention captured in `CLAUDE.md`: use `+alias@` test addresses for HubSpot data; `archive` endpoint for soft-delete; never `gdpr-delete` a real address
- IRP exercise: assessment, no regulatory notification needed (no PHI affected; only impacted internal workflow ergonomics)
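A cleanup script can enforce the convention before it touches any irreversible endpoint. The sketch below is an assumption-laden illustration: `is_test_address` and `split_cleanup_list` are hypothetical helpers, a plus sign in the local part is taken as the marker of a test address, and no HubSpot call is made.

```python
def is_test_address(email: str) -> bool:
    """True only for plus-aliased addresses such as qa+run1@example.com."""
    local, sep, domain = email.strip().rpartition("@")
    return bool(sep) and bool(domain) and "+" in local


def split_cleanup_list(emails: list[str]) -> tuple[list[str], list[str]]:
    """Partition candidates into (eligible_for_cleanup, needs_human_review)."""
    eligible = [e for e in emails if is_test_address(e)]
    review = [e for e in emails if not is_test_address(e)]
    return eligible, review
```

Anything in the second list should go to a person for review, and at most to the reversible `archive` path, rather than being deleted automatically.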
Tabletop exercise
The IRP becomes effective when the team has practiced against it. First tabletop scheduled for Q2 2026 access review (July 2026) — a 60-minute walkthrough of a SEV-1 PHI-breach scenario, role-playing IC + Compliance Liaison + Comms Lead. Tabletop outcomes are documented in the same access-review file.
Reference
- Operational playbook: `docs/runbooks/security-incident-response.md`
- Break-glass procedure: `docs/runbooks/break-glass-root-login.md`
- Risk Assessment: incident-relevant risks
- Data Retention Policy: erasure flow that may be triggered by an incident
- Privacy Impact Assessment: data flows that determine incident scope
- Access Control Policy: credential rotation as remediation
- Vendor / Subprocessor Register: vendor BAA contact info
- HIPAA Breach Notification Rule: 45 CFR §§164.400-414
- HHS OCR Breach Portal: https://ocrportal.hhs.gov/ocr/breach/wizard_breach.jsf