Sensitive data handling — Member portal
Status: Draft. Phase A deliverable per ENG-187. Must be reviewed + signed off before Phase B sections that collect SSN / immigration documents / payment data ship code.
Owner: Taha (founder, CTO-of-record). Reviewer: Asad (CFO, compliance owner).
Scope: every field the member portal collects that, if leaked, would create an identity-theft, fraud, or HIPAA breach risk. Concretely: SSN, immigration document numbers, full DOB, full address, full income detail, payment account / card numbers. The control set below is the floor — Phase B sections may add controls but cannot remove any.
This doc lives under docs/security-compliance/ alongside encryption-policy.md, data-retention-policy.md, and access-control-policy.md. It's referenced from ENG-187 under the "Sensitive data handling" plan section.
1. Storage at rest
| Field | At-rest treatment | Rationale |
|---|---|---|
| SSN | CSFLE field-level encryption with KMS-CMK-derived data keys, AES-256-CBC, deterministic algorithm so equality-match queries work for SSA verification re-runs | Highest-value identity theft vector. AWS BAA covers KMS. Driver-layer encryption means even direct Atlas queries by app_write return ciphertext |
| Immigration document numbers (I-551, I-94, I-766, etc.) | CSFLE field-level encryption, same key family as SSN | Same risk class. SAVE verification re-uses the value |
| Full DOB | CSFLE field-level encryption | Combined with name + ZIP this is a re-identification vector |
| Full home address | Plaintext in main collection, encrypted backups | Already widely-handled in agent flows; not a unique re-id vector on its own |
| Phone, email | Plaintext | Standard contact info |
| Income per source ($amount, frequency, employer name) | Plaintext in main collection | Member sees this in their portal as the editable record; FFM submission requires plaintext |
| Bank routing + account number | CSFLE field-level encryption, separate key from SSN family for blast-radius isolation | Direct fraud vector |
| Card primary account number (PAN) | Never stored in member_applications. PAN tokenized via payment vendor (Stripe / Square / chosen Phase B vendor) at moment of entry; we store only vendor-side token + last-4 + brand | PCI-DSS: storing PANs requires Level 1 attestation we don't want to assume |
CSFLE is enforced at the MongoDB driver layer (autoEncryption with mongocryptd). Application code SETs values as plaintext; driver encrypts on write. Reads through the driver decrypt automatically. Direct Atlas queries (e.g. an Atlas admin running a query in the UI) return ciphertext for encrypted fields. This is the property we want for tamper / leak / insider-access containment.
Key management: KMS-CMK in the prod AWS account, alias alias/prod-member-portal-csfle-master, rotation enabled (annual). Data Encryption Keys (DEKs) are stored in a dedicated __keyVault collection per MongoDB CSFLE convention. DEKs are wrapped by the CMK.
Key rotation cadence: annual rotation of the master CMK (automatic). Re-encryption of existing data is not automatic but is also not required for envelope encryption — only the new wrappings use the new key version. Member-portal application data has a 7-year retention horizon (per data-retention-policy.md); re-encryption job is deferred unless a compromise indicator forces it.
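As a rough illustration of how the at-rest table could translate into driver configuration, the sketch below builds a CSFLE schema map from a small field list. The field names, the `portal.member_applications` namespace, the key-family labels, and the choice of the Random algorithm for every field other than SSN are assumptions made for the example; only the deterministic treatment of SSN and the separate banking key come from the table above. The placeholder `keyId` values stand in for the DEK UUIDs stored in the key vault.

```typescript
// Algorithm names as defined by MongoDB client-side field level encryption.
type CsfleAlgorithm =
  | "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
  | "AEAD_AES_256_CBC_HMAC_SHA_512-Random";

type KeyFamily = "identity" | "banking"; // hypothetical labels for the two DEK families

interface EncryptedFieldSpec {
  path: string; // hypothetical top-level field name
  keyFamily: KeyFamily;
  algorithm: CsfleAlgorithm;
}

// SSN is deterministic so equality matches work; the rest are assumed Random.
const ENCRYPTED_FIELDS: EncryptedFieldSpec[] = [
  { path: "ssn", keyFamily: "identity", algorithm: "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic" },
  { path: "immigration_document_number", keyFamily: "identity", algorithm: "AEAD_AES_256_CBC_HMAC_SHA_512-Random" },
  { path: "date_of_birth", keyFamily: "identity", algorithm: "AEAD_AES_256_CBC_HMAC_SHA_512-Random" },
  { path: "bank_routing_number", keyFamily: "banking", algorithm: "AEAD_AES_256_CBC_HMAC_SHA_512-Random" },
  { path: "bank_account_number", keyFamily: "banking", algorithm: "AEAD_AES_256_CBC_HMAC_SHA_512-Random" },
];

interface EncryptRule {
  encrypt: { bsonType: "string"; algorithm: CsfleAlgorithm; keyId: unknown[] };
}

interface SchemaMap {
  [namespace: string]: { bsonType: "object"; properties: Record<string, EncryptRule> };
}

// Builds the schemaMap object that would be passed to the driver's autoEncryption options.
function buildSchemaMap(
  namespace: string,
  fields: EncryptedFieldSpec[],
  keyIds: Record<KeyFamily, unknown>, // in practice: BSON UUIDs of the DEKs
): SchemaMap {
  const properties: Record<string, EncryptRule> = {};
  for (const field of fields) {
    properties[field.path] = {
      encrypt: {
        bsonType: "string",
        algorithm: field.algorithm,
        keyId: [keyIds[field.keyFamily]],
      },
    };
  }
  return { [namespace]: { bsonType: "object", properties } };
}
```

Keeping the field list as data rather than a hand-written schema makes it harder for a Phase B section to add an encrypted field without picking a key family explicitly.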
2. Transport
- TLS 1.2 minimum on every hop: viewer → CloudFront, CloudFront → ALB (custom-origin), ALB → ECS task. The ACM cert covering `*.askflorence.health` is used for both the CloudFront viewer cert AND the origin-side TLS handshake to ALB
- Internal VPC traffic is TLS too: no plaintext on the wire even within our VPC. ALB → task uses HTTPS (port 443 / certified at the task per ECS service config)
- MongoDB Atlas connections use TLS 1.2 minimum, certificate-validated
3. Presentation to the member
For SSN and immigration document numbers — the two highest-value re-identification fields — apply this rule:
Default presentation: masked. When a portal page renders a previously-captured value, the value is masked: ***-**-1234 for SSN, A123 4567 **** style for document numbers. The full value is never sent to the browser on the standard read path.
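The masking rule above could be implemented server-side along these lines. This is a minimal sketch: the function names are hypothetical, and the document-number format (twelve characters, last four hidden) is inferred only from the `A123 4567 ****` example, so real document types would need their own patterns.

```typescript
// Returns the masked SSN form used on the standard read path.
// Anything that is not exactly nine digits is fully masked rather than echoed back.
function maskSsn(raw: string): string {
  const digits = raw.replace(/\D/g, "");
  if (digits.length !== 9) {
    return "***-**-****";
  }
  return `***-**-${digits.slice(5)}`;
}

// Masks a document number in the "A123 4567 ****" style:
// the first eight characters stay visible, everything after is replaced.
// Assumption: values shorter than nine characters are fully masked.
function maskDocNumber(raw: string): string {
  const compact = raw.replace(/\s+/g, "");
  if (compact.length < 9) {
    return "****";
  }
  return `${compact.slice(0, 4)} ${compact.slice(4, 8)} ****`;
}
```

Masking on the server, before serialization, is what keeps the full value off the wire; masking in the browser would not satisfy the rule.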
Edit path: step-up verification. When the member clicks "edit my SSN" or "edit my immigration document":
- The page presents a step-up verification challenge (re-enter password OR re-do magic link OR TOTP if enabled)
- On successful challenge, server returns the unmasked value to the browser ONE TIME, in the edit form
- After submit (or cancel), the page reverts to masked display
- The unmasked-value response includes a `Cache-Control: no-store` + `Clear-Site-Data: "cache"` header to prevent retention in browser cache
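The edit-path steps above could produce a response shaped like the sketch below. The `RevealResponse` type and `buildRevealResponse` function are hypothetical and framework-agnostic; the 403 status for a failed challenge is an assumption, and only the two header values come from this doc.

```typescript
interface RevealResponse {
  status: number;
  headers: Record<string, string>;
  body: { value?: string; error?: string };
}

// Headers that tell the browser not to retain the unmasked value.
// Note the Clear-Site-Data directive is a quoted string inside the header value.
const NO_RETENTION_HEADERS: Record<string, string> = {
  "Cache-Control": "no-store",
  "Clear-Site-Data": '"cache"',
};

// Returns the unmasked value only when the step-up challenge has already passed.
function buildRevealResponse(stepUpPassed: boolean, unmaskedValue: string): RevealResponse {
  if (!stepUpPassed) {
    // Generic copy only: never echo the value or explain which check failed.
    return {
      status: 403,
      headers: { ...NO_RETENTION_HEADERS },
      body: { error: "Verification required" },
    };
  }
  return {
    status: 200,
    headers: { ...NO_RETENTION_HEADERS },
    body: { value: unmaskedValue },
  };
}
```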
No "always-visible" mode, ever. Members who want to verify their SSN with a third party (employer, bank) must reveal it through the step-up path.
For DOB: shown in full (it's already on the calculator and on every form section the member filled). For income: shown in full (the member wrote it). For address: shown in full. For payment fields: vendor-tokenized; we only show **** 1234 · Visa (last-4 + brand).
4. Step-up verification before reveal
The challenge required to unmask a sensitive field:
| Phase | Challenge |
|---|---|
| MVP-1 (single-factor) | Re-enter email + receive magic link + click within 10 min |
| Phase F (MFA on) | Magic link + TOTP / hardware-key challenge |
The challenge response token is single-use, 10-min TTL, bound to the specific field reveal (scope: "reveal_ssn" in the JWT claim). Re-using the token for a different reveal action fails server-side validation.
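The three token properties (single-use, TTL, scope binding) can be checked with logic like the following sketch. It assumes the JWT signature has already been verified upstream and operates on the decoded claims; the `jti` claim name, the in-memory used-token set, and the verdict strings are illustrative assumptions, and a real deployment would need a store shared across tasks.

```typescript
interface RevealClaims {
  sub: string;   // account the token was issued to
  scope: string; // e.g. "reveal_ssn"
  jti: string;   // unique token id, used for single-use tracking (assumed claim)
  exp: number;   // expiry as seconds since the epoch
}

type Verdict = "ok" | "expired" | "wrong_scope" | "already_used";

class RevealTokenValidator {
  // In-memory for illustration only; a real store would be shared and would expire entries.
  private readonly usedTokenIds = new Set<string>();

  validate(claims: RevealClaims, expectedScope: string, nowSeconds: number): Verdict {
    if (nowSeconds >= claims.exp) {
      return "expired";
    }
    if (claims.scope !== expectedScope) {
      return "wrong_scope";
    }
    if (this.usedTokenIds.has(claims.jti)) {
      return "already_used";
    }
    // Mark as consumed only after every other check has passed.
    this.usedTokenIds.add(claims.jti);
    return "ok";
  }
}
```

A token that fails the scope check is not consumed here, so a mistaken request does not burn the member's valid challenge; whether that is the desired behavior is a policy choice.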
5. Logging discipline
Application logs MUST NEVER include sensitive field values. The deny-list is enforced at the structured-logger layer (src/lib/logger.ts — to be created Phase A). Deny-listed property names:
ssn, ssn_last_four, immigration_document_number, immigration_doc_number,
date_of_birth, dob, bank_routing_number, bank_account_number,
card_pan, card_number, card_cvv, card_exp
The logger redacts these property names recursively in any object passed to logger.{info,warn,error} calls: the value becomes [REDACTED]. A CI check (Phase B) scans logger.* call sites for hand-built strings that include deny-listed substrings.
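Since src/lib/logger.ts does not exist yet, the recursive redaction could start from a sketch like this one. The `redact` function name and the case-insensitive key match are assumptions; the sketch handles plain JSON-like data only and does not guard against circular references or special objects such as Date.

```typescript
// Property names from the deny-list in this section.
const LOG_DENY_LIST: ReadonlySet<string> = new Set([
  "ssn", "ssn_last_four", "immigration_document_number", "immigration_doc_number",
  "date_of_birth", "dob", "bank_routing_number", "bank_account_number",
  "card_pan", "card_number", "card_cvv", "card_exp",
]);

const REDACTED = "[REDACTED]";

// Returns a copy of the input with every deny-listed property replaced,
// walking nested objects and arrays. The input itself is not mutated.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = LOG_DENY_LIST.has(key.toLowerCase()) ? REDACTED : redact(inner);
    }
    return out;
  }
  return value;
}
```

Redacting by property name does nothing for values interpolated into message strings, which is why the CI scan of logger call sites is still needed.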
Stack traces are stripped of all query, body, headers.cookie, headers.authorization fields before being shipped to CloudWatch / observability backends.
No PHI in error messages returned to clients. Server errors that bubble from validation must use generic copy ("We couldn't save your changes — please try again or contact support").
6. Backup + restore
Encrypted Atlas backups remain encrypted at rest. The MongoDB CSFLE master key (KMS-CMK) is also backed up — losing it means permanent loss of all sensitive data.
Dual-control restore: restoring from backup requires (a) Atlas project admin (Taha) + (b) AWS KMS-CMK access (Taha; Asad as backup). Both controls are documented in access-control-policy.md. The restore runbook lives alongside break-glass-root-login.md and atlas-user-provisioning.md — to be created Phase A as member-portal-restore.md in the same runbook directory.
7. Egress controls
Sensitive fields NEVER appear in:
- HubSpot sync: member-portal data has zero HubSpot egress. The HubSpot sync worker explicitly skips the member_applications collection. Codified in `src/lib/hubspot-sync.ts` allowlist (collection-scoped, not field-scoped, for defense-in-depth)
- First-party analytics events (OpenPanel): event names + properties use only sanitized fields (`step_key`, `section_completed`, `submission_status`, bucketed values). NEVER include identity values, SSN, doc numbers, raw income, etc. (ADR 0009)
- SES email content: email templates pull only the member's first name, the plan name, and the application ID. No SSN, DOB, income, or document numbers in email bodies
- Outbound webhooks (Phase E+ FFM ack handler): payloads sanitized via a separate `outboundPayloadFilter` before send
A CI check scans for hand-built JSON bodies that include sensitive field names.
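Two of the egress controls above lend themselves to a short sketch: the collection-scoped HubSpot allowlist and the outbound payload filter. The allowlisted collection name and the permitted payload keys below are hypothetical placeholders, and the allowlist-based behavior of `outboundPayloadFilter` is an assumption, since this doc names the filter but does not specify how it works.

```typescript
// Collections the HubSpot sync worker may read. Hypothetical entry;
// member_applications is deliberately absent, so it is skipped by default.
const HUBSPOT_SYNC_ALLOWLIST: ReadonlySet<string> = new Set(["marketing_leads"]);

function canSyncToHubspot(collection: string): boolean {
  return HUBSPOT_SYNC_ALLOWLIST.has(collection);
}

// Keys permitted in outbound webhook payloads (hypothetical set).
const OUTBOUND_ALLOWED_KEYS: ReadonlySet<string> = new Set([
  "applicationId",
  "submission_status",
]);

// Copies only allowlisted top-level keys; everything else is dropped.
function outboundPayloadFilter(payload: Record<string, unknown>): Record<string, unknown> {
  const filtered: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(payload)) {
    if (OUTBOUND_ALLOWED_KEYS.has(key)) {
      filtered[key] = value;
    }
  }
  return filtered;
}
```

An allowlist fails closed: a new collection or payload field added in Phase B stays blocked until someone explicitly permits it.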
8. What we present vs what we hide
| Field | Member can see in portal | Pattern |
|---|---|---|
| Full name | Yes | Always visible |
| Date of birth | Yes | Always visible |
| Sex | Yes | Always visible |
| Home address | Yes | Always visible |
| Phone | Yes | Always visible |
| Email | Yes | Always visible |
| SSN | Masked default; step-up to reveal | ***-**-1234 |
| Immigration doc number | Masked default; step-up to reveal | A123 4567 **** |
| Citizenship status | Yes | Always visible |
| Income detail (per source) | Yes | Always visible |
| Employer name + EIN | Yes | Always visible |
| Bank routing | Masked | ***0123 |
| Bank account | Masked default; step-up to reveal | *****1234 |
| Card | Tokenized; last-4 + brand only | **** 1234 · Visa |
For household members: same rules apply to each member's sensitive fields. The primary applicant can see masked summaries of all household members' fields (since they entered them); step-up reveal is per-member.
9. Audit trail
Every read of a sensitive field — even by the member themselves — appends an entry to agent_audit_log:
{
event: "member_sensitive_field_accessed",
actor: { type: "member" | "agent" | "system", accountId, sessionId },
applicationId,
fieldPath, // generic name, never the value
revealedTo: "member_ui" | "agent_review" | "ffm_submission",
timestamp
}
Auditor (with audit_reader role) can reconstruct: who saw what, when. The audit log is append-only per ADR 0002 and tamper-evident.
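The entry shape above could be expressed as a TypeScript type with a small builder, as sketched here. The string types for the ID fields, the ISO 8601 timestamp format, and the `buildAccessEvent` name are assumptions; the field names and union values are taken from the entry shown.

```typescript
type ActorType = "member" | "agent" | "system";
type RevealTarget = "member_ui" | "agent_review" | "ffm_submission";

interface SensitiveFieldAccessEvent {
  event: "member_sensitive_field_accessed";
  actor: { type: ActorType; accountId: string; sessionId: string };
  applicationId: string;
  fieldPath: string; // generic name, never the value
  revealedTo: RevealTarget;
  timestamp: string; // ISO 8601 (assumed format)
}

// Builds the audit entry. The function takes no parameter for the field value,
// so the sensitive value cannot be logged here by accident.
function buildAccessEvent(
  actor: SensitiveFieldAccessEvent["actor"],
  applicationId: string,
  fieldPath: string,
  revealedTo: RevealTarget,
  now: Date = new Date(),
): SensitiveFieldAccessEvent {
  return {
    event: "member_sensitive_field_accessed",
    actor,
    applicationId,
    fieldPath,
    revealedTo,
    timestamp: now.toISOString(),
  };
}
```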
10. Data retention
Per data-retention-policy.md, member-portal data retention windows:
| State | Retention |
|---|---|
| Abandoned drafts (no email captured) | 90 days from last update, then hard-delete |
| Abandoned drafts (email captured) | 18 months from last update, then hard-delete |
| Submitted-but-not-yet-active | Through coverage year + 7 years (tax retention) |
| Active member | Through coverage year + 7 years post-termination |
| Florence conversation logs | Same as parent application doc |
| agent_audit_log | 7 years minimum (HIPAA + EDE Year 9 default), 10 years for safety |
| Backups | 90 days rolling; encrypted; subject to dual-control restore |
Deletion is hard-delete (not soft-delete) at TTL boundary. CSFLE-encrypted fields go to ciphertext-shredding at TTL — the DEK for that range is destroyed, rendering the ciphertext unrecoverable even with the master key.
A right-to-be-forgotten request that comes BEFORE the retention TTL fires triggers an immediate ciphertext-shredding path on the affected document, EXCEPT for the agent_audit_log entries which must be retained for compliance — those rows have the personallyIdentifiable: false invariant (only IDs + event types, no names/SSNs/DOBs).
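For the two abandoned-draft rows in the retention table, the hard-delete date could be computed as in this sketch. The state names and the `hardDeleteAt` function are hypothetical, and dates are handled in UTC as an assumption. Note that calendar-month arithmetic rolls forward when the target month is shorter (for example, 31 August plus 18 months lands in early March).

```typescript
type DraftState = "abandoned_no_email" | "abandoned_with_email";

// Returns the moment at which an abandoned draft becomes due for hard-delete:
// 90 days after the last update without a captured email, 18 months with one.
function hardDeleteAt(lastUpdated: Date, state: DraftState): Date {
  const due = new Date(lastUpdated.getTime()); // copy so the input is not mutated
  if (state === "abandoned_no_email") {
    due.setUTCDate(due.getUTCDate() + 90);
  } else {
    due.setUTCMonth(due.getUTCMonth() + 18);
  }
  return due;
}

// True once the TTL boundary has been reached.
function isDueForHardDelete(lastUpdated: Date, state: DraftState, now: Date): boolean {
  return now.getTime() >= hardDeleteAt(lastUpdated, state).getTime();
}
```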
11. Plan-of-record for Phase B section integration
Every Phase B section that collects a sensitive field must:
- Tag the field in the per-section Zod schema with `.brand<"sensitive">()`
- Add an entry to `SENSITIVE_FIELDS` registry in `src/lib/portal/sensitive-fields.ts` (drives the masking + step-up + logger deny-list)
- Write an integration test asserting (a) masked display on read, (b) step-up challenge on edit, (c) audit-log entry on reveal, (d) deny-list scrub in logs
- Document the field in a Phase B addendum to this doc
- Document the field in a Phase B addendum to this doc
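One possible shape for the registry that the checklist above refers to is sketched below. The entry fields, the scope string for immigration documents, and the helper names are all hypothetical, since src/lib/portal/sensitive-fields.ts is not yet written; the point is only to show a single list driving both the logger deny-list and the step-up decision.

```typescript
interface SensitiveFieldEntry {
  section: number;     // Phase B section that collects the field
  fieldPath: string;   // generic field name, as used in audit entries
  logKeys: string[];   // property names the logger must redact
  stepUpScope: string; // scope a reveal token must carry
}

// Two illustrative entries; real entries are added per Phase B section.
const SENSITIVE_FIELDS: SensitiveFieldEntry[] = [
  { section: 1, fieldPath: "ssn", logKeys: ["ssn", "ssn_last_four"], stepUpScope: "reveal_ssn" },
  {
    section: 3,
    fieldPath: "immigration_document_number",
    logKeys: ["immigration_document_number", "immigration_doc_number"],
    stepUpScope: "reveal_immigration_document", // hypothetical scope name
  },
];

// Derives the logger deny-list from the registry, without duplicates.
function loggerDenyList(registry: SensitiveFieldEntry[]): string[] {
  const seen = new Set<string>();
  for (const entry of registry) {
    for (const key of entry.logKeys) {
      seen.add(key);
    }
  }
  return Array.from(seen).sort();
}

// Looks up the scope required to reveal a field, or undefined if it is not registered.
function requiredStepUpScope(registry: SensitiveFieldEntry[], fieldPath: string): string | undefined {
  for (const entry of registry) {
    if (entry.fieldPath === fieldPath) {
      return entry.stepUpScope;
    }
  }
  return undefined;
}
```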
Sections that include sensitive fields (per the canonical 9-section scope):
- Section 1 (primary applicant identity): SSN
- Section 2 (household composition): per-member SSN, DOB
- Section 3 (citizenship / immigration): immigration document numbers
- Section 9 (payment): bank or card details
Section 4 (income) does NOT collect a directly-sensitive field per this doc's definition (income amount is not a re-identification vector on its own), but it IS subject to the HubSpot-egress block and the first-party-analytics no-PHI rule (OpenPanel: bucketed income only, never raw).
12. Open questions for sign-off
- [ ] Approve the masked-by-default + step-up-reveal pattern for SSN and document numbers, OR opt-in to a different pattern (e.g. "never show, even on edit — always require re-entry")
- [ ] Approve the 18-month retention for abandoned drafts with captured email — too long? too short?
- [ ] Approve the dual-control restore designation: Taha + Asad. Need a third on-call backup before Asad is fully onboarded
- [ ] Approve the payment vendor approach (Stripe / Square / other) — separate vendor decision, but it locks the PCI scope downstream
Sign-off
| Reviewer | Role | Status | Date |
|---|---|---|---|
| Taha Abbasi | CTO-of-record | Pending | — |
| Asad Khalid | CFO / compliance owner | Pending | — |
Once both sign off, link this doc from the Phase A PR and from the ENG-187 Linear issue.