ADR 0010 — PostHog HIPAA Cloud as unified analytics + error tracking + session replay (supersedes ADR 0009)
Status
Accepted — 2026-05-26. Supersedes ADR 0009. Tracked under the mobile-v1 build initiative (per docs/architecture/mobile-app-strategy.md, Decision 10).
Context
ADR 0009, accepted ten days ago, picked self-hosted OpenPanel + GlitchTip as the cross-surface analytics + error-tracking stack. The compliance argument that motivated it (BAA does not override a portal CSP that architecturally blocks third-party scripts; first-party self-hosted is the only stack that can see every surface including the PHI portal) is still factually correct. What changed in the ten days is the speed/quality calculus on mobile:
- Florence AI shipped on web (ENG-356, /florence, v0.52.0 - v0.57.0) with substantial product validation. Mobile v1 is now pivoted to a Florence-first scope per docs/architecture/mobile-app-strategy.md (founder direction, 2026-05-26 late) - the WOW interactivity loop that native gives us and mobile web cannot.
- PostHog now ships Error Tracking as a GA product (since 2024), consolidating what ADR 0009 split between OpenPanel (analytics) and GlitchTip (errors). It also bundles session replay, feature flags, surveys, and LLM observability in one workspace.
- PostHog HIPAA Cloud BAA setup is hours, not days (founder direction). A self-hosted Postgres + ClickHouse + Redis OpenPanel stack plus a Sentry-compatible GlitchTip stack are multi-week builds that v1 cannot wait for.
- The mobile-v1 demo posture uses test data only (no real PHI yet per docs/architecture/mobile-app-strategy.md Decision 4). The CSP-on-portal compliance concern that motivated ADR 0009's hard line is not load-bearing for v1's first ship; real-PHI mode is a follow-up where the CSP question can be re-evaluated with real signal.
ADR 0009's strongest argument - that self-hosted is the only stack that can sit inside a portal CSP - remains real for the portal surface specifically. This decision accepts that constraint by carving the surfaces:
Decision
Adopt PostHog HIPAA Cloud as the v1 analytics + error tracking + session replay + feature flags + surveys + LLM observability layer. Single workspace, single BAA, single SDK family, surface-by-surface integration:
| Surface | Integration | CSP / compliance note |
|---|---|---|
| Mobile app v1 (Florence-first, native) | posthog-react-native SDK direct | Native app, CSP not applicable. Ships from day one of the build. |
| Web apex (askflorence.health, marketing tier) | posthog-js direct | No PHI on apex. Standard browser SDK. Migrates off OpenPanel (which never fully shipped per #75). |
| Web portal (app.askflorence.health, PHI tier) | Deferred: reverse-proxy via the portal subdomain OR keep the current minimal instrumentation. CSP allows first-party scripts only; PostHog supports a reverse-proxy pattern that makes its endpoints look first-party. Evaluation is deferred until member-portal work surfaces concrete instrumentation needs. | This is the surface where ADR 0009's CSP argument still bites. Carving it out of v1 buys us time to evaluate; the alternative (jamming PostHog into the portal via a CSP loosening) would dilute ADR 0009's compliance posture without delivering corresponding value for v1. |
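
The surface carve in the table can be sketched as a small config function. This is an illustration under stated assumptions, not the shipped configuration: the `Surface` labels, both host constants, the `/ingest` proxy path, and the `portalProxyApproved` switch are hypothetical. The only SDK facts it leans on are that posthog-js takes the ingestion host as `api_host` and posthog-react-native takes it as `host`.

```typescript
// Surface names are hypothetical labels for the three rows of the table above.
type Surface = "mobile" | "web-apex" | "web-portal";

interface SurfaceAnalyticsConfig {
  enabled: boolean;
  // Where the SDK sends events. posthog-js names this option `api_host`;
  // posthog-react-native names it `host`.
  ingestHost?: string;
}

// Assumption: placeholder values. The real HIPAA Cloud host comes from PostHog,
// and the "/ingest" proxy path is illustrative only.
const POSTHOG_CLOUD_HOST = "https://posthog-cloud.example.invalid";
const PORTAL_PROXY_HOST = "https://app.askflorence.health/ingest";

function analyticsConfigFor(
  surface: Surface,
  portalProxyApproved: boolean = false,
): SurfaceAnalyticsConfig {
  switch (surface) {
    case "mobile":
    case "web-apex":
      // No portal CSP in play: the SDK talks to PostHog directly.
      return { enabled: true, ingestHost: POSTHOG_CLOUD_HOST };
    case "web-portal":
      // Deferred surface: stays off unless the reverse-proxy option is chosen,
      // in which case events go to a first-party path the CSP already allows.
      return portalProxyApproved
        ? { enabled: true, ingestHost: PORTAL_PROXY_HOST }
        : { enabled: false };
  }
}
```

With the default arguments the portal stays uninstrumented on the client, which matches the deferred status in the table; flipping the flag models the reverse-proxy option without touching the portal CSP.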
The mobile-v1 event model + super-property model are spec'd in docs/architecture/mobile-app-strategy.md §"Analytics for v1 (PostHog HIPAA, lean event model)" and are not reproduced here.
What this decision covers (one workspace, one BAA)
- Product analytics (events, funnels, retention, cohorts) - the OpenPanel job in ADR 0009.
- Error tracking (sourcemaps, error grouping, JS + RN) - the GlitchTip job in ADR 0009. PostHog Error Tracking is newer than Sentry but good enough for our scale; Sentry can be added later if PostHog Error Tracking proves insufficient.
- Session replay (web + mobile) - new capability ADR 0009 did not contemplate.
- Feature flags + experiments - new capability ADR 0009 did not contemplate; matters for the Florence-WOW-demo flag pattern already in use.
- Surveys - new capability.
- LLM observability (cost / latency / prompt version tracking for Florence) - matters for Florence iteration.
Identity model
Each device keeps an anonymous distinct_id from install until the florence_email_captured event fires identify(emailLowercase). This is the same canonical lowercased-email identity ADR 0009 specified. Cross-surface stitching (visitor → device → member) is preserved in spirit; the join key remains emailLowercase so PostHog's identify/alias model maps cleanly onto our accounts collection when member-mode flips post-v1.
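
A minimal sketch of the identify step, assuming a thin client interface rather than a direct SDK import. `IdentifyClient`, `canonicalEmail`, and `onFlorenceEmailCaptured` are hypothetical names; the `identify(distinctId, properties)` shape matches what posthog-js and posthog-react-native expose. Trimming whitespace before lowercasing is an added assumption, not something this ADR specifies.

```typescript
// Minimal slice of the analytics client used at identify time.
interface IdentifyClient {
  identify(distinctId: string, properties?: Record<string, string>): void;
}

// Assumption: the canonical form is trim + lowercase. The ADR only requires lowercase.
function canonicalEmail(raw: string): string {
  return raw.trim().toLowerCase();
}

// Called when the florence_email_captured event fires. Before this point the SDK
// has been using its own anonymous distinct_id for the device.
function onFlorenceEmailCaptured(client: IdentifyClient, rawEmail: string): string {
  const emailLowercase = canonicalEmail(rawEmail);
  client.identify(emailLowercase);
  return emailLowercase;
}
```

Keeping the canonicalization in one function means every surface derives the same join key from the same input, which is what the cross-surface stitching depends on.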
Server-side spine preserved
ADR 0009's architectural invariant ("server-side event spine is the through-line; swapping vendors is changing an ingest target, not re-instrumenting") carries forward unchanged. Business events fire from the shared API routes; the PostHog SDK is the client transport. This decision is reversible without re-instrumenting: if PostHog HIPAA Cloud falls short of our scale or compliance posture, switching back to self-hosted (OpenPanel + GlitchTip per ADR 0009, or PostHog self-hosted, or a new stack) is an ingest-target swap.
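
The ingest-target swap can be sketched as follows. This is an illustration of the invariant, not the actual API-route code: `BusinessEvent`, `EventSink`, `RecordingSink`, `recordPlanSelected`, and the `plan_selected` event name are all hypothetical, and `send` is synchronous here only to keep the sketch short (a real sink would make a network call).

```typescript
interface BusinessEvent {
  name: string;
  distinctId: string;
  properties: Record<string, unknown>;
}

// The only vendor-facing seam: each ingest target implements this interface.
interface EventSink {
  send(event: BusinessEvent): void;
}

// Stand-in sink that records events in memory instead of calling a vendor.
class RecordingSink implements EventSink {
  readonly label: string;
  readonly received: BusinessEvent[] = [];

  constructor(label: string) {
    this.label = label;
  }

  send(event: BusinessEvent): void {
    this.received.push(event);
  }
}

// Route-level helper: call sites depend on EventSink, never on a vendor SDK.
function recordPlanSelected(sink: EventSink, emailLowercase: string, planId: string): void {
  sink.send({
    name: "plan_selected",
    distinctId: emailLowercase,
    properties: { planId },
  });
}

// Swapping vendors changes which sink is constructed; the call site is unchanged.
const cloudSink = new RecordingSink("posthog-hipaa-cloud");
const selfHostedSink = new RecordingSink("self-hosted");
recordPlanSelected(cloudSink, "pat@example.com", "plan-123");
recordPlanSelected(selfHostedSink, "pat@example.com", "plan-123");
```

Both sinks receive an identical event from the same call, which is the property that keeps a vendor change from turning into re-instrumentation.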
Consequences
Accepted:
- Vendor dependency on PostHog. Mitigated by the server-side spine invariant + the fact that PostHog SDKs are MIT-licensed open source (no lock-in on the data shape). Migration cost away from PostHog Cloud is bounded.
- Cost is contact-sales tier on HIPAA Cloud. Founder accepts the spend for v1 timing; specific monthly cost is a procurement-track number (#57 vendor register update on first invoice).
- ADR 0009's "no analytics-vendor BAA" stance is reversed for v1. PostHog Cloud is a HIPAA-tier third-party processor under BAA. We accept the BAA-coverage path for v1 and concede the self-hosted-from-day-one moat ADR 0009 framed (we own the data path); that argument re-emerges if/when we need to repatriate analytics in-house.
- Portal-side analytics deferred (not blocked, deferred). Portal currently has minimal instrumentation; this stays true until the member-portal build forces the question. At that point either: (a) PostHog reverse-proxy pattern via the portal subdomain, OR (b) revert to self-hosted for the portal only, OR (c) accept server-side-only events on portal. Decision when the question is concrete.
Gained:
- Speed. Hours, not weeks. Mobile v1 is unblocked from day one on analytics.
- Mature mobile SDK. posthog-react-native is well-maintained and used by many production health/voice apps. ADR 0009's OpenPanel mobile SDK was selected on paper; PostHog's is selected on production track record.
- Consolidation. One tool, one workspace, one SDK family across analytics + errors + replay + flags + surveys + LLM obs. ADR 0009 split this across two tools (OpenPanel + GlitchTip) and would have needed a third for LLM observability.
- Session replay + LLM observability. Features OpenPanel doesn't offer. Replay matters for diagnosing the Florence interactivity-loop UX; LLM observability matters for iterating Florence's persona + tool cost.
Alternatives considered (this round)
| Alternative | Why rejected |
|---|---|
| Stay on ADR 0009 (OpenPanel + GlitchTip self-hosted) for v1 | Multi-week build (Postgres + ClickHouse + Redis OpenPanel stack + Sentry-compatible GlitchTip stack). Mobile v1 cannot wait. The "self-hosted from day one" moat is real but does not pay back inside the v1 timeframe. |
| Mobile on PostHog HIPAA Cloud, web stays on the OpenPanel + GlitchTip path (parallel stacks) | Creates two separate funnels with no canonical identity graph. The cross-surface moat ADR 0009 named is exactly what this decision protects by unifying. Two stacks = double the ops work + zero of the unified-funnel value. |
| PostHog self-hosted from day one | Heaviest stack of all options (ClickHouse + Kafka + Redis + Postgres, 4 vCPU / 16 GB floor). Cloud-to-self-hosted migration direction is documented as deprecated/unreliable by PostHog. Slow build + future-fragile path. |
Revisit triggers
Reopen this decision if ANY of these fire:
- PostHog HIPAA Cloud pricing inflates past the "founder accepts" line - move portion(s) of the load to self-hosted PostHog or back to OpenPanel + GlitchTip per ADR 0009.
- The portal CSP blocks meaningful portal analytics - that means the surface carve we made is wrong, and we need to either commit to the reverse proxy or repatriate to self-hosted for the portal.
- EDE / SOC 2 audit feedback names PostHog Cloud as a finding - repatriate.
- PostHog Cloud deprecates a feature we depend on (session replay, LLM observability, HIPAA tier itself) - re-evaluate.
- Cross-surface identity stitching breaks at scale (PostHog's identify/alias model fails our visitor → member graph in observed production) - same set of alternatives reopen.
- Self-hosting becomes faster or cheaper at our scale (very unlikely in v1 timeframe; possible at later stages).
References
- ADR 0009 - the superseded decision; rationale still sound for the portal surface, accepted as a constraint here.
- docs/architecture/mobile-app-strategy.md §"Analytics for v1 (PostHog HIPAA, lean event model)" - the v1 event + property model.
- docs/briefs/elevenlabs-react-native-integration-research.md - parallel research that locks the mobile stack assumptions this ADR rides on.
- docs/florence-ai/index.md - Florence runtime model (LLM observability target).
- PostHog HIPAA: https://posthog.com/docs/privacy/hipaa-compliance
- PostHog Error Tracking: https://posthog.com/docs/error-tracking
- PostHog reverse proxy (portal-CSP path): https://posthog.com/docs/advanced/proxy