New York Phase C/D ingestion playbook

Companion to ca-phase-c-d-ingestion-playbook.md and sbe-state-watchouts.md. This is the NY-specific execution plan for bringing NY to end-to-end doctor + Rx coverage parity (ENG-412). Phase 0 (source discovery) results are captured here; Phases 1-4 execute against them.

Goal

A NY consumer enters a ZIP → gets plans (✅ already works) → searches their doctor by name → sees which plans cover them → searches medications → sees which plans cover them. Same flow as FFM + the completed CA work (ENG-395/408).

Verified starting state (2026-05-28)

| Capability | Status |
| --- | --- |
| ZIP → county, plans, pricing, subsidy | ✅ works (282 NY 2026 plans, calculateNyEligibility) |
| Doctor coverage | ❌ 0 NY providers in providers_staging |
| Drug coverage | ❌ 0 of 282 NY plans cover atorvastatin; formularies not ingested |

The NY advantage over CA

NY is NPPES-NPI-native. Provider identity across NY's directory (PNDS), §1311 MRFs, and our /api/providers/autocomplete (NPPES) is the SAME 10-digit NPI. So once NY providers land in providers_staging keyed by _id: npi (FFM-style), the autocomplete → coverage join works with no bridge gap — the CA limitation (ENG-410, NPPES↔Symphony) does not exist for NY. NY becomes the first fully-complete SBE provider surface.

Also: NY puf is populated (CA's is empty — see CA decision #1). So NY provider-plan mapping can use puf.networkId for per-network precision instead of CA's HIOS-prefix coarsening.

No route changes needed. NY is in OWNED_COVERAGE_STATES (ENG-411); /api/{drugs,providers}/covered already dispatch NY to the owned-data path. The flow lights up the moment data lands.
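The direct NPI join described above can be sketched as follows. This is a minimal illustration, assuming providers_staging documents carry a plans[] array whose entries have a plan_id field; the field names, the sample NPI, and the plan ids are hypothetical, and an in-memory dict stands in for the collection so the example runs without a database.

```python
import re

# An NPI is a 10-digit identifier. Per the plan above, the same value keys
# NPPES autocomplete results and NY documents in providers_staging.
NPI_PATTERN = re.compile(r"\d{10}")


def covered_plan_ids(npi, providers_staging):
    """Return sorted plan ids for a provider, looked up directly by NPI."""
    if not NPI_PATTERN.fullmatch(npi):
        raise ValueError(f"not a 10-digit NPI: {npi!r}")
    doc = providers_staging.get(npi)  # _id is the NPI itself, so no bridge table
    if doc is None:
        return []
    return sorted({entry["plan_id"] for entry in doc.get("plans", [])})


# Hypothetical sample data keyed by _id (the NPI).
sample = {
    "1234567893": {
        "_id": "1234567893",
        "plans": [
            {"plan_id": "25303NY0010001"},
            {"plan_id": "91237NY0020002"},
        ],
    }
}

print(covered_plan_ids("1234567893", sample))
```

The point of the sketch is that the autocomplete result needs no translation step before the lookup, which is the step CA currently lacks (ENG-410).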

Phase 0: source discovery (DONE; decisions locked)

Provider directory

| Candidate | Verdict |
| --- | --- |
| PNDS: pndslookup.health.ny.gov (Provider Network Data System, NY DOH, operated by IPRO) | Confirms the data exists, is NPI-native, and is statewide (the NY analog of CA's Symphony: enter a provider, see which plans cover them; updated every 3 months). BUT: it is a reCAPTCHA-gated jQuery form-POST tool, NOT a clean anonymous JSON API. Per our security rules we do NOT bypass CAPTCHA, so it is not a scrape target. Potential future licensed data feed from NY DOH / IPRO (the Symphony-license analog). |
| §1311 Transparency-in-Coverage MRFs (per NY carrier) | ✅ CHOSEN. Federally mandated, machine-readable, NPI-keyed, no CAPTCHA. NY carriers publish these regardless of marketplace type. Same pipeline we already run for FFM (scripts/db/ingest-mrf-providers.js): NY providers slot into providers_staging keyed by NPPES NPI exactly like FFM, with per-plan network membership. This is the clean technical + legal path. |
| Per-carrier provider-directory portals | Fallback for carriers whose MRF is unusable; higher per-carrier effort. |

Decision: ingest NY providers from §1311 MRFs, reusing the FFM MRF provider pipeline. _id: npi (NOT a namespaced ny-sym: id — NY is NPI-native, unlike CA). PNDS is the consumer-facing proof + a future licensed-feed option, not the ingest source.

Drug formularies

Per-carrier formulary PDFs/files — same approach as CA (CA carrier-PDF parser playbook is the template). Confirmed sources for major NY medical carriers (verify + expand the full ~13-issuer list during Phase 1):

| Carrier (HIOS) | Formulary source |
| --- | --- |
| Fidelis / Ambetter (25303) | fideliscare.org/Portals/0/Formularies/QHP-2026-formulary-Fidelis-Care.pdf (QHP) + EP-2026-Formulary-Fidelis-Care.pdf (Essential Plan) |
| Healthfirst (91237) | healthfirst.org/formularies (landing page → per-plan PDFs) |
| MetroPlus (11177) | metroplus.org/wp-content/uploads/... per-plan PDFs |
| (various) | fm.formularynavigator.com/FBO/... (MMIT-hosted, SAME host as CA Anthem; e.g. NY_Essential_Formulary.pdf, 2026_QHP_Formulary.pdf). The CA FormularyNavigator handling (browser UA + backoff on 429) applies. |
| Excellus/Highmark (78124), MVP (56184), EmblemHealth (88582), CDPHP (94788), Oscar (74289), UnitedHealthcare (54235), Anthem (41046/44113) | TBD: harvest in Phase 1; same per-carrier-PDF approach. |
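
The FormularyNavigator handling noted in the table (browser User-Agent plus backoff on HTTP 429) could look roughly like the following. This is a sketch using only the Python standard library, not the CA downloader itself; the function name, the User-Agent string, the retry count, and the delays are all assumptions.

```python
import time
import urllib.error
import urllib.request

# Assumed browser-like User-Agent; any realistic browser string would serve.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0 Safari/537.36"
)


def fetch_with_backoff(url, opener=urllib.request.urlopen, sleep=time.sleep,
                       max_attempts=5, base_delay=2.0):
    """Download url, retrying with exponential backoff when the host returns 429."""
    for attempt in range(max_attempts):
        request = urllib.request.Request(url, headers={"User-Agent": BROWSER_UA})
        try:
            with opener(request, timeout=60) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Only 429 is retried; other HTTP errors and the final attempt propagate.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            headers = err.headers
            retry_after = headers.get("Retry-After") if headers is not None else None
            if retry_after is not None and retry_after.strip().isdigit():
                delay = float(retry_after)  # honor the server's hint when it is plain seconds
            else:
                delay = base_delay * (2 ** attempt)  # 2s, 4s, 8s, ...
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

The `opener` and `sleep` parameters exist only so the retry logic can be exercised without network access or real waiting; a real run would call `fetch_with_backoff(url)` with the defaults.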

Decision: parse NY carrier formulary PDFs → resolve to RxCUI (reuse scripts/db/data/rxcui-resolution-cache.json + CMS autocomplete for misses) → upsert to formularies_staging keyed _id: "<rxcui>:<year>", plans[] entries for NY plan_ids, source: "ny_<carrier>_2026_marketplace_formulary". Note NY's Essential Plan (EP) is a NY-specific program with its own formulary — include EP plan_ids where applicable.
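
As an illustration of the document shape in this decision, the sketch below builds the filter and update for one drug. It assumes the source string lives on each plans[] entry and that entries are identified by a plan_id field; neither is confirmed by this playbook, and the RxCUI, carrier slug, and plan ids are placeholders.

```python
def build_formulary_upsert(rxcui, year, carrier_slug, plan_ids):
    """Return (filter, update) for an additive upsert into formularies_staging."""
    source = f"ny_{carrier_slug}_{year}_marketplace_formulary"
    # De-duplicate and sort so repeated runs build identical entries.
    entries = [{"plan_id": pid, "source": source} for pid in sorted(set(plan_ids))]
    doc_filter = {"_id": f"{rxcui}:{year}"}
    # $addToSet with $each only appends entries not already in plans[],
    # so existing FFM and CA entries on the same document are left alone.
    update = {"$addToSet": {"plans": {"$each": entries}}}
    return doc_filter, update


doc_filter, update = build_formulary_upsert(
    rxcui="12345",            # placeholder RxCUI
    year=2026,
    carrier_slug="fidelis",   # hypothetical carrier slug
    plan_ids=["25303NY0010002", "25303NY0010001", "25303NY0010001"],
)
print(doc_filter)
print(update)
```

With PyMongo, the pair would be passed to `collection.update_one(doc_filter, update, upsert=True)`. Note that `$addToSet` treats an embedded document as a duplicate only when fields, field order, and values all match exactly, so entries need to be built the same way on every run.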

Phase 1: drug formulary ingest → formularies_staging

Mirror scripts/db/ingest-ca-formularies.py + ca-phase-cd-runner.cjs:

  • [ ] Harvest NY carrier formulary URLs (full ~13-issuer inventory) → scripts/db/data/ny-carrier-formularies-2026.ts
  • [ ] Download + parse (pdfplumber table-aware; positional fallback for single-column tier-digit layouts — both parsers from the CA work are reusable)
  • [ ] Resolve RxCUIs (reuse FFM cache)
  • [ ] Map carrier → NY plan_ids by HIOS prefix (or per-network via puf.networkId since NY puf is populated)
  • [ ] Upsert (additive $addToSet, source: "ny_*", cluster guard, pre/post FFM+CA count assertion, dry-run default)
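
The carrier-to-plan mapping and the dry-run default from the checklist above can be sketched as follows. The grouping relies on the HIOS convention that the first five digits of a plan id identify the issuer; the function names, the `write` callback, and the sample plan ids are hypothetical and do not describe the existing runner.

```python
from collections import defaultdict


def group_plans_by_issuer(plan_ids):
    """Group plan ids by their 5-digit HIOS issuer prefix."""
    grouped = defaultdict(list)
    for plan_id in plan_ids:
        grouped[plan_id[:5]].append(plan_id)
    return dict(grouped)


def run_ingest(operations, write, apply=False):
    """Send operations to `write` only when apply=True; otherwise just report."""
    written = 0
    for op in operations:
        if apply:
            write(op)
            written += 1
    return {"planned": len(operations), "written": written, "dry_run": not apply}


plans = ["25303NY0010001", "25303NY0010002", "91237NY0020001"]
print(group_plans_by_issuer(plans))

sink = []
print(run_ingest(["op-a", "op-b"], sink.append))              # dry run: nothing written
print(run_ingest(["op-a", "op-b"], sink.append, apply=True))  # explicit opt-in to write
```

Making `apply=False` the default means a forgotten flag results in a report rather than a write, which matches the dry-run-default covenant.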

Phase 2: provider directory ingest → providers_staging

  • [ ] Identify each NY carrier's §1311 MRF index URL (CMS TiC index → per-issuer provider-reference files). NY carriers publish at their TiC disclosure URLs.
  • [ ] Reuse the FFM MRF provider pipeline (scripts/db/ingest-mrf-providers.js) — _id: npi, plans[] keyed by NY plan_id + network_tier, source: "ny_1311_mrf", af_state_scope: ["NY"].
  • [ ] Run via AWS Fargate RunTask in-VPC (reuse the ENG-408 ecs-smoke-runner task-family + bundle pattern). MRFs are large, so running in-VPC keeps the ingest from depending on a local Starlink uplink.
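
The target document shape for NY providers can be sketched as below. This is an illustration, not the behavior of scripts/db/ingest-mrf-providers.js: the input row format, the network_tier values, and the function name are assumptions, and the sketch does not handle an NPI that already exists in providers_staging from another cohort.

```python
def build_provider_docs(rows):
    """Fold (npi, plan_id, network_tier) rows into one document per NPI."""
    docs = {}
    for row in rows:
        doc = docs.setdefault(
            row["npi"],
            {
                "_id": row["npi"],          # NPI-native: no namespaced id
                "plans": [],
                "source": "ny_1311_mrf",
                "af_state_scope": ["NY"],
            },
        )
        entry = {"plan_id": row["plan_id"], "network_tier": row["network_tier"]}
        if entry not in doc["plans"]:       # keep plans[] free of duplicates
            doc["plans"].append(entry)
    return docs


# Hypothetical rows as they might come out of an MRF provider-reference file.
rows = [
    {"npi": "1234567893", "plan_id": "25303NY0010001", "network_tier": "PREFERRED"},
    {"npi": "1234567893", "plan_id": "25303NY0010001", "network_tier": "PREFERRED"},
    {"npi": "1234567893", "plan_id": "91237NY0020001", "network_tier": "PREFERRED"},
]
docs = build_provider_docs(rows)
print(docs["1234567893"])
```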

Phase 3: safety + verify (every covenant from ENG-395/408)

  • [ ] Atlas snapshot before any write; log snapshot ID.
  • [ ] FFM and CA cohorts byte-identical (pre/post counts: FFM _id-range + CA ca-sym: range both unchanged; NY adds new _id: npi provider docs + new formularies_staging plan entries).
  • [ ] Cluster identity guard (refuse non-staging) + collection allowlist (formularies_staging + providers_staging only).
  • [ ] Apex smoke: NY ZIP 10001 / 11201 + a real NY doctor by name → correct per-plan coverage (NPPES NPI joins directly — no bridge); NY Lipitor → correct per-plan coverage.
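
The guard and count-assertion items above can be sketched as two small checks run around every write. The cluster name "staging", the cohort labels, and the function names are assumptions for illustration; the real harness from ENG-395/408 may identify the cluster and the cohorts differently.

```python
ALLOWED_COLLECTIONS = frozenset({"formularies_staging", "providers_staging"})


def assert_write_allowed(cluster_name, collection, staging_cluster="staging"):
    """Refuse any write that is not to an allowlisted collection on staging."""
    if cluster_name != staging_cluster:
        raise RuntimeError(f"refusing to write to non-staging cluster {cluster_name!r}")
    if collection not in ALLOWED_COLLECTIONS:
        raise RuntimeError(f"collection {collection!r} is not on the allowlist")


def assert_cohorts_unchanged(before, after, protected=("ffm", "ca")):
    """Compare pre/post document counts for cohorts that must not change."""
    changed = {
        name: (before[name], after[name])
        for name in protected
        if before[name] != after[name]
    }
    if changed:
        raise AssertionError(f"protected cohort counts changed: {changed}")


# Hypothetical counts: NY grows, FFM and CA stay put.
assert_write_allowed("staging", "providers_staging")
assert_cohorts_unchanged(
    before={"ffm": 1000, "ca": 200, "ny": 0},
    after={"ffm": 1000, "ca": 200, "ny": 50},
)
print("guards passed")
```

A count comparison only detects added or removed documents; it would not notice an in-place modification, so it complements the snapshot rather than replacing it.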

Phase 4: docs

  • [ ] Update sbe-state-watchouts.md NY section: locked decisions + verified E2E matrix (replace the Phase-0 "broken" matrix with the post-ingest "works" matrix).
  • [ ] Florence tooling: NY "just works" because it is NPPES-native (autocomplete NPI = stored NPI). Confirm this, and note in the docs/florence-ai/tool-surface.md SBE-vs-FFM matrix that NY check_provider works WITHOUT the CA caveat. ENG-410's CA-only gap does NOT apply to NY.

Standing covenants (same as ENG-395/408)

Snapshot before writes · FFM + CA byte-identical · cluster identity guard · collection allowlist · $addToSet additive · dry-run default · Fargate in-VPC for bulk crawl/ingest · no prod-cluster writes for coverage data (staging via PrivateLink, ADR 0004).

Reusable-for-other-SBE notes

The PNDS-is-CAPTCHA-gated → use-§1311-MRF decision is likely the common SBE pattern: most states have a consumer provider-lookup tool (often CAPTCHA-protected) AND federally-mandated §1311 MRFs. Default to MRFs for ingestion; treat the state tool as proof-of-data + a potential licensed feed. NPI-native states (NY and most non-CA SBEs) avoid CA's Symphony-providerId bridge problem entirely.

Cross-references

  • ENG-412 (this work) · ENG-395 (CA Phase C/D template) · ENG-408 (CA provider ingest harness + Fargate pattern) · ENG-407 (state-aware route dispatch) · ENG-411 (coverage-dispatch predicate — NY rides on OWNED_COVERAGE_STATES) · ENG-410 (CA-only NPI bridge — N/A for NY)
  • scripts/db/ingest-mrf-providers.js (FFM MRF provider pipeline — reused for NY) · scripts/db/ingest-ca-formularies.py + parse-ca-formulary*.py (CA formulary parsers — reused for NY)

AskFlorence Internal Documentation. Not for public distribution.
