Data Retention Policy

Status: Active. Effective 2026-05-11. Owner: Taha Abbasi (technical implementation) + Asad Khalid (legal / regulatory). Reviewed: Annually + whenever a new collection / data store / regulatory obligation is added.

Purpose

Defines how long AskFlorence retains each class of data, by what mechanism it is deleted, and how the retention claim is auditable. Required artifact for:

  • HIPAA §164.316 (documentation requirements; 6-year minimum retention for the documented policies + procedures themselves)
  • HIPAA §164.530(j) (record retention for HIPAA-related decisions)
  • HIPAA §164.312(b) (audit controls — implicit retention period)
  • CMS EDE Phase 3 / MARS-E 2.2 AU-11 (audit record retention)
  • SOC 2 CC2.3 (information used to support objectives — retention as part of information lifecycle)
  • State breach notification laws (varies; default to longest applicable)

Data classification: source of truth

This policy aligns with the data classification taxonomy. When a row below references a "data class," it refers to the classes defined there: Public / Internal / PII / PHI plus the EDE-introduced classes FTI (Federal Tax Information) and cms_hub (data fetched from the CMS Marketplace hub or HealthCare.gov FFE).

Retention schedule

| Data class | Examples | Retention period | Deletion mechanism | Notes |
| --- | --- | --- | --- | --- |
| Public | Plan names, premium amounts, ZIP→county mappings, NPI provider directory, RxCUI formulary tier mappings | Indefinite (current plan year + prior years) | None; kept indefinitely as reference data | Refreshed per ingest cadence; superseded versions retained for historical comparison |
| Internal: application telemetry | API access logs, ingest manifests, deploy logs (CloudWatch) | 90 days hot in CloudWatch; 7 years in log-archive S3 (CloudTrail org-trail) | CloudWatch Logs retention policy + S3 lifecycle to Glacier after 1 year | Org-wide CloudTrail trail captures all AWS API events |
| Internal: audit log (DB-layer) | agent_audit_log collection: every auth event, admin action, data change | 7 years minimum (HIPAA §164.312(b)); target 10 years (EDE-safer) | TTL index at the Mongo collection level (set at Phase 5 collection creation alongside the append-only role binding) | Append-only enforced at DB permission layer (ADR 0002); aged-out records cannot be selectively purged before TTL fires |
| Internal: Mongo audit logs (Atlas-side) | Atlas database audit logs (atlasAdmin-level audit) | Atlas-managed retention (90 days default; 12 months on paid tier; confirm tier) | Atlas-managed | Used for incident-response post-mortem reconstruction |
| PII: waitlist / agent waitlist | agent_waitlist_submissions (email, name, phone, NPN, role); consumer-side waitlist (email only) | 6 years from last activity | Manual review at quarterly access review; planned automated TTL post-Phase-5 | The CAN-SPAM "unsubscribe" path triggers a soft-archive within 10 business days (#59). GDPR / CCPA "right to erasure" requests trigger immediate purge (within 30 days) with audit-log row written. |
| PII: agent discovery survey responses | agent_survey_responses (NPN, agent profile + free-text fields) | 6 years from collection | Manual review + planned automated TTL | Same erasure-on-request path as waitlist. Consent capture is per-record per agent platform compliance. |
| PII: Google Workspace email + Drive content | Founder + ops @askflorence.health mail; team documents | Per Google Workspace Vault retention rules, to be configured before SOC 2 evidence window starts | Google Vault retention rules (default: indefinite) | Vault rules to be applied: mail = 7 years; chat = 1 year; Drive content = until role changes |
| PII: HubSpot CRM | Agent waitlist + survey mirrors (member data never touches HubSpot by design) | 7 years from last contact (HubSpot Marketing Hub default) | HubSpot platform deletion (HubSpot data retention controls) | GDPR-delete endpoint use is restricted to +test* aliases only, per the 2026-05-09 incident learning |
| PHI: consumers, enrollments (not yet created) | SSN, DOB, plan-enrollment records | 10 years from last activity (HIPAA 6-year minimum; EDE-safer 10) | TTL index at collection level + CSFLE-encrypted blobs become unrecoverable when KMS CMK is rotated past retention boundary | These collections do NOT exist today. Pre-launch checklist includes: (1) CSFLE + KMS-per-field, (2) TTL index, (3) audit-log row written on insert/update/delete, (4) GDPR / state-AG erasure procedure documented. |
| PHI: files | Income verification PDFs, ID proofing artifacts (not yet collected) | 10 years from collection + immediate redaction on close-of-enrollment for non-PHI fields | S3 Object Lock + lifecycle to Glacier Deep Archive after 1 year + permanent deletion at TTL | S3 bucket policy + Object Lock retention applied at creation time; pre-launch checklist mirrors PHI collections above. |
| FTI: Federal Tax Information (not yet collected) | Income data from IRS Data Hub (FTI as defined by IRS Publication 1075) | Per IRS 1075: typically until purpose-served, then secure-destroy | IRS-1075-aligned destruction procedure (not yet documented; required before any FTI is collected) | FTI is collected only at enrollment with explicit consent + audit log; never logged in application telemetry; storage path is purpose-bound (eligibility determination) |
| cms_hub: CMS Marketplace API / FFE data | Eligibility determinations, FFM plan inventories, public marketplace data | Indefinite for public; 10 years for any identified-individual-bound determinations | Per CMS EDE program requirements + same TTL as PHI for identified records | Public marketplace data refreshed at ingest cadence; identified records (Phase 5+) follow PHI retention. |
| Secrets: credentials, API keys | AWS Secrets Manager entries; Atlas connection strings; SES domain identities | Until secret value rotation (annual or on-incident); old versions retained 30 days for rollback | Secrets Manager has a 30-day default recovery window; explicit force-delete-without-recovery only with ADR | Rotation cadence in access-control-policy |
| Backups: S3 versioning, Atlas snapshots | Versioned objects in tfstate / data buckets; Atlas continuous snapshots | S3: 90 days for tfstate, lifecycle thereafter; Atlas: per-tier (M10 = 7-day point-in-time + daily snapshots for 30 days) | S3 lifecycle policies + Atlas snapshot retention config | Backups inherit the encryption + classification of the source data |
| Source code (GitHub) | All repositories | Indefinite (commit history is the audit trail) | Branch deletion does not remove history; force-pushes are blocked on main | No PHI / secrets in repo by .gitignore + GitHub secret scanning |
| Compliance documentation (this directory) | Policies, control mappings, runbooks, ADRs, access reviews | 6 years minimum (HIPAA §164.316); preserved indefinitely as part of git history | Never deleted; superseded versions retained as git history; quarterly access-review documents stamped + archived in-tree | Versioned via git; each annual policy review appends a row to the change-log, never overwrites prior versions |
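
As an illustration only, a few rows of the schedule can be expressed as data so that a deletion-due date is computed rather than worked out by hand. This is a minimal sketch under stated assumptions: the RETENTION_YEARS mapping, its keys, and the deletion_due helper are hypothetical names that do not come from the existing codebase, and each record is assumed to carry a single anchor date (last activity or collection date, as the table specifies per class).

```python
from datetime import date
from typing import Optional

# Hypothetical encoding of a few rows of the retention schedule.
# Values are whole years counted from the anchor event named in the table;
# None means "kept indefinitely".
RETENTION_YEARS: dict[str, Optional[int]] = {
    "public": None,
    "audit_log": 10,     # 7-year minimum, 10-year target
    "pii_waitlist": 6,   # from last activity
    "pii_survey": 6,     # from collection
    "phi": 10,           # from last activity
}


def deletion_due(data_class: str, anchor: date) -> Optional[date]:
    """Return the first date on which a record may be deleted, or None."""
    years = RETENTION_YEARS[data_class]
    if years is None:
        return None
    try:
        return anchor.replace(year=anchor.year + years)
    except ValueError:
        # Anchor is 29 February and the target year is not a leap year:
        # roll forward to 1 March so the record is never deleted early.
        return anchor.replace(year=anchor.year + years, month=3, day=1)


print(deletion_due("phi", date(2026, 5, 11)))           # 2036-05-11
print(deletion_due("pii_waitlist", date(2024, 2, 29)))  # 2030-03-01
print(deletion_due("public", date(2026, 1, 1)))         # None
```

Rolling a 29 February anchor forward rather than back is a deliberate choice here: a retention minimum is safer to overshoot by a day than to undershoot.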

Deletion procedures

Routine deletion (TTL-driven)

  • Mongo TTL indexes on PHI / PII collections: configured at collection creation (still planned for the PII collections; the PHI collections do not exist yet, per the retention schedule); verified at quarterly access review.
  • S3 lifecycle rules: configured at bucket creation; verified at quarterly access review (aws s3api get-bucket-lifecycle-configuration).
  • CloudWatch Log retention: set per log group at creation.
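
The two TTL-style mechanisms above both reduce to a number of seconds or days. The sketch below shows one way those values might be derived. The helper names, the rule ID, and the 366-day year are assumptions made for illustration; nothing here is taken from the actual Terraform or application code.

```python
SECONDS_PER_DAY = 24 * 60 * 60
# Assumption: round every year up to 366 days so that nothing expires
# before the calendar retention period has fully elapsed.
DAYS_PER_YEAR = 366


def ttl_seconds(years: int) -> int:
    """Retention period in seconds, suitable for a TTL index's expireAfterSeconds."""
    return years * DAYS_PER_YEAR * SECONDS_PER_DAY


def lifecycle_rule(rule_id: str, glacier_after_days: int, expire_after_days: int) -> dict:
    """Build one S3 lifecycle rule: transition to Glacier, then expire."""
    if expire_after_days <= glacier_after_days:
        raise ValueError("expiration must come after the Glacier transition")
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # empty prefix applies the rule to the whole bucket
        "Transitions": [{"Days": glacier_after_days, "StorageClass": "GLACIER"}],
        "Expiration": {"Days": expire_after_days},
    }


print(ttl_seconds(10))  # 316224000
rule = lifecycle_rule("log-archive-7y", 365, 7 * DAYS_PER_YEAR)  # hypothetical rule ID
print(rule["Expiration"])  # {'Days': 2562}
```

If used, the integer would be passed as expireAfterSeconds when creating the index on a date field, and the rule dict would go inside a {"Rules": [...]} lifecycle configuration. A MongoDB TTL index only acts on fields holding date values, so the indexed field has to be chosen accordingly.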

Erasure on request (GDPR / CCPA / HIPAA right-of-access-and-amendment)

When a data subject requests erasure:

  1. Validate the request — confirm identity using the contact email on file + any additional identifier (NPN for agents).
  2. Document the request — write an audit-log row to agent_audit_log (action: "erasure_request").
  3. Scope — identify every collection + system holding the subject's data. Default scope: Atlas (agent_waitlist_submissions, agent_survey_responses, future consumers / enrollments); HubSpot CRM; SES suppression list (if marketing send history exists); CloudWatch Logs (if request mentions a session ID, scrub via PII-redaction script).
  4. Execute — within 30 days:
    • Atlas: hard-delete the record AND write an erasure_complete audit-log row.
    • HubSpot: use the archive (soft-delete) endpoint, NOT the gdpr-delete endpoint, unless the address is unambiguously synthetic. The 2026-05-09 incident with taha@askflorence.health is the negative example: gdpr-delete permanently blocklists the address at the portal level, and that blocklist is irreversible; even HubSpot Support cannot lift it.
    • SES: add to suppression list to prevent any future sends.
  5. Confirm — email confirmation to the requester (using a fresh thread, not the suppression-listed address).
  6. Retain the audit-log entries: the erasure request + completion rows remain in agent_audit_log for the full 7-10 year retention period. The audit-log entries are NOT subject to the erasure (a regulatory-permitted exception), but they are minimized: they record that the erasure occurred, not the erased content.
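
To make steps 2, 4, and 6 concrete, the sketch below builds minimized audit-log rows for an erasure. Only the action values erasure_request and erasure_complete come from the procedure above. The field names (subject_ref, systems, at, due_by) and the choice to store a SHA-256 digest of the contact email instead of the email itself are assumptions for illustration. Whether a hash counts as sufficiently minimized is a question for legal review, not something this sketch settles.

```python
import hashlib
from datetime import datetime, timedelta, timezone

ERASURE_SLA = timedelta(days=30)  # execution window from step 4


def subject_ref(email: str) -> str:
    """One-way reference to the data subject, so the row holds no raw email."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def erasure_row(action: str, email: str, systems: list[str], at: datetime) -> dict:
    """Build a minimized audit-log row for an erasure event (hypothetical schema)."""
    if action not in ("erasure_request", "erasure_complete"):
        raise ValueError(f"unexpected action: {action}")
    row = {
        "action": action,
        "subject_ref": subject_ref(email),
        "systems": sorted(systems),
        "at": at,
    }
    if action == "erasure_request":
        # Record the deadline so an overdue request is easy to query for.
        row["due_by"] = at + ERASURE_SLA
    return row


received = datetime(2026, 6, 1, tzinfo=timezone.utc)
request = erasure_row("erasure_request", "Agent@Example.com", ["hubspot", "atlas"], received)
print(request["due_by"].date())             # 2026-07-01
print("Agent@Example.com" in str(request))  # False
```

Normalizing the email before hashing means the request and completion rows for the same subject carry the same reference, so the pair can be matched later without storing the address.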

Decommissioning / migration

When a collection / data store is decommissioned (e.g., Phase 5 schema migration):

  1. Capture a backup snapshot dated + named with the migration session.
  2. Migrate live readers + writers to the new collection (getReferenceDb pattern, etc.).
  3. Verify the new collection is operating correctly + the old collection has zero read/write traffic for 30 days minimum.
  4. Drop the old collection. Audit-log the drop.
  5. Retain the dated backup snapshot for the full retention period of the data class involved (e.g., if the collection held PHI, retain the snapshot 10 years).
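
The gate between steps 3 and 4 can be written as a small predicate. This is a sketch under stated assumptions: the function name and its inputs are hypothetical, and how the last traffic date is actually obtained (Atlas metrics, profiler, application logs) is outside what this document specifies.

```python
from datetime import date, timedelta

QUIET_PERIOD = timedelta(days=30)  # minimum zero-traffic window from step 3


def may_drop(last_traffic: date, snapshot_taken: bool, today: date) -> bool:
    """True only when a snapshot exists and the collection has been idle long enough."""
    return snapshot_taken and (today - last_traffic) >= QUIET_PERIOD


print(may_drop(date(2026, 6, 1), True, date(2026, 6, 20)))  # False: only 19 idle days
print(may_drop(date(2026, 6, 1), True, date(2026, 7, 1)))   # True: 30 idle days
print(may_drop(date(2026, 6, 1), False, date(2026, 7, 1)))  # False: no snapshot
```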

Vendor-side data

Each vendor BAA (see vendor register) commits the vendor to deletion-on-termination procedures. At vendor retirement:

  1. Trigger the contract-termination deletion procedure with the vendor.
  2. Collect a written confirmation of deletion + scheduled-purge date.
  3. Move the vendor to the "retired" section of the vendor register; preserve the BAA + deletion-confirmation in docs/infrastructure/evidence/ for the full retention period (6 years HIPAA minimum, 10 years EDE-safer).

Verification

| Cadence | What | How |
| --- | --- | --- |
| Quarterly access review | Confirm TTL indexes are in place + S3 lifecycle rules are configured | aws s3api get-bucket-lifecycle-configuration per bucket; Atlas db.runCommand({listCollections: 1}) to enumerate collections, then db.getCollection(name).getIndexes() on each to confirm expireAfterSeconds (listCollections alone does not report indexes) |
| Annually (audit prep) | Sample a deleted record + confirm it is irretrievable (subject to backup retention) | One synthetic erasure exercise during Q3 review |
| At every vendor retirement | Collect deletion-confirmation; archive in docs/infrastructure/evidence/ | Documented in vendor register row |
| At every collection drop | Audit-log row + dated snapshot | Decommissioning procedure above |
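
The quarterly TTL check could be scripted along the following lines. This is a hedged sketch, not the team's tooling: the helper names are hypothetical, the created_at field in the sample is an assumption, and the input is assumed to be shaped like the dictionaries returned by pymongo's Collection.index_information(), keyed by collection name.

```python
from typing import Optional


def ttl_of(index_info: dict) -> Optional[int]:
    """Return expireAfterSeconds of the first TTL index found, else None."""
    for spec in index_info.values():
        if "expireAfterSeconds" in spec:
            return int(spec["expireAfterSeconds"])
    return None


def missing_ttl(indexes_by_collection: dict, required: set) -> list:
    """List required collections that have no TTL index (or are absent entirely)."""
    return sorted(
        name
        for name in required
        if ttl_of(indexes_by_collection.get(name, {})) is None
    )


# Sample data shaped like index_information() output; values are illustrative.
sample = {
    "agent_audit_log": {
        "_id_": {"v": 2, "key": [("_id", 1)]},
        "created_at_1": {
            "v": 2,
            "key": [("created_at", 1)],
            "expireAfterSeconds": 316224000,
        },
    },
    "agent_waitlist_submissions": {"_id_": {"v": 2, "key": [("_id", 1)]}},
}

print(missing_ttl(sample, {"agent_audit_log", "agent_waitlist_submissions"}))
# ['agent_waitlist_submissions']
```

A collection that is required but not present in the input is also reported as missing, which errs on the side of flagging a gap for the reviewer.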

Reference

  • Data Classification Policy — source-of-truth for what data is in each class
  • Encryption Policy — encryption posture per data class
  • Access Control Policy — credential rotation + access cadence
  • Incident Response Plan — handling of inadvertent retention-policy violations
  • Vendor / Subprocessor Register — vendor-side deletion commitments
  • SOC 2 Control Mapping — CC2.3 row
  • HIPAA Control Mapping — §164.316 (documentation retention), §164.312(b) (audit retention)
  • CMS EDE Appendix A Mapping — §9 (Access Control Logging retention)

AskFlorence Internal Documentation. Not for public distribution.
