Tool surface
Florence interacts with the deterministic AskFlorence platform exclusively via LLM tool use. Every factual claim in a Florence response comes from a tool call; Florence does not compute or recall facts. The tool surface is therefore the single most important contract in the architecture.
This document defines the shape of that contract, the two tool families (api_* and ui_*), the auth/classification model, and the pattern every deterministic endpoint follows when it becomes a Florence tool.
Two families of tools
api_* — read / write against the deterministic platform
Server-side tools that invoke deterministic endpoints. Examples: api_search_plans, api_check_drug_coverage, api_get_member_plan_details. Result is a structured JSON payload that becomes part of Florence's conversation context.
ui_* — drive the frontend state
Server-side tools whose result is delivered to the client as a message over the SSE stream, triggering state changes in the UI. Examples: ui_set_plan_filter, ui_open_plan, ui_highlight_field, ui_start_intake_section. No direct deterministic-API call; the UI reflects Florence's intent.
ui_* tools are what make Florence a co-pilot rather than a chat widget. "Show me plans my kid's pediatrician takes" → Florence calls api_check_provider_network + ui_set_plan_filter + narrates. The user sees the filter change.
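As a rough illustration of how a ui_* result might travel to the client, the sketch below frames a tool result as a Server-Sent Events message and parses it back. The event name ui_action, the UiAction payload shape, and both function names are assumptions made for this example; only the SSE wire format (event: and data: fields, blank-line terminator) is standard.

```typescript
// Hypothetical payload for a ui_* tool result; the real shape is not specified in this document.
interface UiAction {
  tool: string;                  // e.g. "ui_set_plan_filter"
  args: Record<string, unknown>; // tool-specific arguments
}

// Frame one action as a Server-Sent Events message.
function toSseMessage(action: UiAction): string {
  // JSON.stringify escapes newlines inside strings, so a single data: line suffices.
  return `event: ui_action\ndata: ${JSON.stringify(action)}\n\n`;
}

// Client-side counterpart: recover the action from a received frame.
function parseSseMessage(frame: string): UiAction {
  const dataLine = frame.split("\n").find((line) => line.startsWith("data: "));
  if (dataLine === undefined) throw new Error("SSE frame has no data line");
  return JSON.parse(dataLine.slice("data: ".length)) as UiAction;
}

const frame = toSseMessage({ tool: "ui_set_plan_filter", args: { metal: "silver" } });
```

In a browser the parsing half would normally be handled by EventSource; it is written out here only to show that the frame round-trips.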
The tool definition contract
Every Florence tool is defined in one place with a uniform contract:
```ts
// src/lib/florence/tools/types.ts
import type { ZodSchema } from "zod";

type DataClass = "Public" | "PHI" | "PII" | "FTI" | "ApplicationPayload";
type AuthContext = "anonymous" | "authenticated_member" | "authenticated_agent" | "authenticated_admin";

interface FlorenceTool<Input, Output> {
  name: string;                      // e.g. "api_check_drug_coverage"
  description: string;               // shown to the LLM
  inputSchema: ZodSchema<Input>;     // LLM-visible schema
  outputSchema: ZodSchema<Output>;   // runtime-validated

  // Classification contract
  acceptsAuthContexts: readonly AuthContext[];
  inputClass: DataClass;             // highest class an input field may carry
  outputClass: DataClass;            // highest class the output may carry

  // Execution
  execute(input: Input, ctx: ToolExecutionContext): Promise<Output>;

  // Observability
  cacheKey?(input: Input): string;   // opt-in result cache (see TTL rules)
  cacheTtlSeconds?: number;

  // Evolution
  version: string;                   // semver; tools are versioned (see lifecycle)
  status: "stable" | "beta" | "deprecated";
}
```

The ToolExecutionContext carries the auth context, the conversation ID, the turn ID, the member ID (if authenticated), and the parent Florence system role (member-mode vs. agent-mode). The wrapper enforces acceptsAuthContexts before the underlying endpoint is called: unauthorized tool calls are rejected at the tool layer, not the prompt.
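A minimal sketch of that tool-layer check, assuming a hypothetical checkAuthGate helper and a hypothetical AUTH_CONTEXT_DENIED error shape (neither name comes from the codebase). The accepted contexts given to the sample tool are illustrative only.

```typescript
type AuthContext = "anonymous" | "authenticated_member" | "authenticated_agent" | "authenticated_admin";

// Only the fields the gate needs; the full FlorenceTool carries more.
interface GatedTool {
  name: string;
  acceptsAuthContexts: readonly AuthContext[];
}

// Hypothetical structured decision, so a denial can be logged and audited.
type AuthDecision =
  | { allowed: true }
  | { allowed: false; error: { code: "AUTH_CONTEXT_DENIED"; tool: string; authContext: AuthContext } };

// Runs before the deterministic endpoint is touched.
function checkAuthGate(tool: GatedTool, authContext: AuthContext): AuthDecision {
  if (tool.acceptsAuthContexts.includes(authContext)) return { allowed: true };
  return { allowed: false, error: { code: "AUTH_CONTEXT_DENIED", tool: tool.name, authContext } };
}

// Illustrative tool entry; the accepted contexts here are an assumption.
const memberTool: GatedTool = {
  name: "api_get_member_plan_details",
  acceptsAuthContexts: ["authenticated_member", "authenticated_agent"],
};
```

Returning a decision object rather than throwing keeps the denial path explicit, which makes it straightforward to record the auth decision alongside the rest of the call.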
The deterministic-integration pattern
Every Florence api_* tool wraps a deterministic endpoint. The pattern is uniform so that adding a new API (drug lookup, provider lookup, appointment booking, claims, renewal analysis, etc.) is a routine operation, not an architectural decision.
Steps every wrapper performs, in order
- Input validation against the Zod schema. Malformed inputs from the LLM are rejected with a structured error the LLM can read and retry against.
- Auth gate. Compare ctx.authContext against tool.acceptsAuthContexts. Deny unambiguously if mismatched. Log the denial.
- Classification gate. If the tool's outputClass is FTI or ApplicationPayload, the destination adapter sink (see data classification) must allow that class. This is a compile-time check via branded types + a runtime belt.
- Result cache check. If the tool declares a cacheKey and a fresh result exists within cacheTtlSeconds, return the cached value. Plan data, drug coverage, and provider network are prime cache candidates. Member-specific tools generally are not.
- Execute the underlying deterministic endpoint. Standard retry policy (3 attempts, exponential backoff, circuit breaker at 5 consecutive failures).
- Output validation against the Zod schema. Deterministic responses are strict — a schema mismatch is a deployment bug, not a user-visible error.
- Serialize into a compact, LLM-friendly shape. See serialization below.
- Audit emit. Every tool call produces an audit-log row (tool name + version, input hash, output summary hash, auth context, auth decision, cache hit/miss, latency, errors).
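Three of the steps above carry state or arithmetic that is easy to get subtly wrong: cache freshness, backoff delays, and the circuit breaker. The sketch below shows one way they could look. ToolResultCache, backoffDelayMs, CircuitBreaker, the injected clock, and the 200 ms base delay are all assumptions for illustration; only the numbers 3 attempts and 5 consecutive failures come from the policy stated above.

```typescript
// Hypothetical in-memory result cache; the clock is injected so freshness is testable.
interface CacheEntry {
  value: unknown;
  storedAtMs: number;
}

class ToolResultCache {
  private readonly entries = new Map<string, CacheEntry>();

  set(key: string, value: unknown, nowMs: number): void {
    this.entries.set(key, { value, storedAtMs: nowMs });
  }

  // Returns the cached value only while it is younger than ttlSeconds.
  get(key: string, ttlSeconds: number, nowMs: number): unknown {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    return nowMs - entry.storedAtMs < ttlSeconds * 1000 ? entry.value : undefined;
  }
}

// Delay to wait after failed attempt number `attempt` (1-based). Base delay is assumed.
function backoffDelayMs(attempt: number, baseMs = 200): number {
  return baseMs * 2 ** (attempt - 1);
}

// Opens after `threshold` consecutive failures; any success resets the count.
class CircuitBreaker {
  private consecutiveFailures = 0;
  private readonly threshold: number;

  constructor(threshold = 5) {
    this.threshold = threshold;
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
  }

  get isOpen(): boolean {
    return this.consecutiveFailures >= this.threshold;
  }
}
```

With three attempts there are at most two waits, so under these assumptions a call that fails every time spends 200 ms + 400 ms in backoff before surfacing the error.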
Wrappers live in src/lib/florence/tools/
```
src/lib/florence/tools/
  index.ts                        # exports the tool registry
  types.ts                        # FlorenceTool, ToolExecutionContext, DataClass, AuthContext
  registry.ts                     # imports every tool module; builds the registry
  helpers/
    execute.ts                    # the uniform wrapper (auth, classification, cache, audit)
    cache.ts                      # tool-result cache
    serializer.ts                 # serialization helpers
  api/                            # one file per api_* tool
    search-plans.ts
    check-drug-coverage.ts
    check-provider-network.ts
    get-member-plan-details.ts
    ...
  ui/                             # one file per ui_* tool
    set-plan-filter.ts
    open-plan.ts
    ...
```

One tool per file. Adding a tool is a single PR that touches: the tool file, the registry, the tool-registry doc, and the eval set. See adding a tool.
Serialization
LLM context is expensive; serialized outputs should be compact, well-typed, and self-describing.
- Prefer stable field names over positional arrays: { premium: 234.50, deductible: 1500 } not [234.50, 1500].
- Include a _meta.tool field with the tool name and version in every serialized result, so the grounding check can tie factual claims back to a specific call.
- Include a _meta.callId UUID so audit logs and grounding checks can cross-reference. This UUID is how Florence "cites" her claims internally.
- Truncate wisely: a 50-plan list returned as full detail is ~30 k tokens. A 50-plan list returned as { planId, issuer, metal, premiumMonthly, rating, topBenefits } is ~3 k tokens. Florence asks for detail on a specific plan via a follow-up tool call.
- Never include raw PII/FTI fields that Florence does not need. A member-profile tool returns medication-and-provider context to Florence; it does not return SSN or DOB just because the record has them.
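The rules above can be sketched as a small serializer that trims each plan to the summary fields and attaches _meta. The PlanRecord shape, the name@version encoding of _meta.tool, the data wrapper key, and the cap of three benefits are assumptions for illustration; the summary field names are the ones listed above.

```typescript
// Hypothetical full plan record; a real record would carry many more fields.
interface PlanRecord {
  planId: string;
  issuer: string;
  metal: string;
  premiumMonthly: number;
  rating: number;
  benefits: string[];
  formularyUrl: string; // example of detail the summary leaves out
}

interface PlanSummary {
  planId: string;
  issuer: string;
  metal: string;
  premiumMonthly: number;
  rating: number;
  topBenefits: string[];
}

interface SerializedResult<T> {
  _meta: { tool: string; callId: string };
  data: T;
}

// Copy only the allow-listed fields; anything not named here never reaches the LLM.
function summarizePlan(plan: PlanRecord, maxBenefits = 3): PlanSummary {
  return {
    planId: plan.planId,
    issuer: plan.issuer,
    metal: plan.metal,
    premiumMonthly: plan.premiumMonthly,
    rating: plan.rating,
    topBenefits: plan.benefits.slice(0, maxBenefits),
  };
}

// callId is passed in so the same UUID can be written to the audit log by the caller.
function serializePlanList(
  plans: readonly PlanRecord[],
  tool: { name: string; version: string },
  callId: string,
): SerializedResult<PlanSummary[]> {
  return {
    _meta: { tool: `${tool.name}@${tool.version}`, callId },
    data: plans.map((plan) => summarizePlan(plan)),
  };
}
```

Building the summary from an explicit allow-list, rather than deleting unwanted fields from the full record, means a field added to the record later stays out of Florence's context until someone opts it in.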
The current tool surface
See tool registry for the living inventory. At the time of this writing:
- Stable: api_search_plans (wraps /api/plans), api_check_eligibility (wraps /api/eligibility)
- Coming soon (Phase C, #17): api_check_drug_coverage
- Coming soon (Phase D, #18): api_check_provider_network
- Coming soon (Phase 5 member auth): api_get_member_profile, api_get_member_plan_details, api_initiate_sep_workflow
- Coming soon (Phase 5 agent auth): api_list_my_assigned_members, api_get_member_full_history, api_draft_sep_letter, api_compose_member_message, api_assign_escalation
- Always present: ui_set_plan_filter, ui_open_plan, ui_highlight_field, ui_start_intake_section, ui_show_comparison
- Escalation: api_escalate_to_human (v1: email + Mongo row; Phase 5: admin-dashboard queue)
SBE vs FFM data lookup — which backend each tool hits per state
Read this before touching any coverage tool. The deterministic endpoints behind Florence's coverage tools do NOT all use the same data source. The backend is chosen by the plan's state, not by plan-ID format (NY uses 14-char FFM-format IDs; CA uses 16-char — only state disambiguates). The canonical fork is isOwnedDataState(state) in @askflorence/shared (today: NY + CA). See docs/data-sources/sbe-state-watchouts.md for the full per-state decision log.
Two data planes:
- FFM states (~30): CMS Marketplace API is authoritative for plans, drug coverage, and provider coverage. Provider identity = NPPES NPI.
- Owned-data states (NY, CA): CMS does not index these plans. We serve them from our own collections via getReferenceDb() (staging cluster over PrivateLink, ADR 0004). Drug coverage from formularies_staging; provider coverage from providers_staging. CA provider identity = Symphony providerId, NOT NPPES NPI (the CalHEERS/Symphony directory does not expose NPI; external_ids.npi is null on every CA provider doc).
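A minimal sketch of the state-based fork, for orientation only. The isOwnedDataState below is a local stand-in for the real helper in @askflorence/shared, whose implementation may differ; DataPlane and dataPlaneFor are hypothetical names.

```typescript
// Stand-in for the shared constant; today's members per the text above.
const OWNED_DATA_STATES: readonly string[] = ["NY", "CA"];

// Stand-in for isOwnedDataState from @askflorence/shared (assumed behavior: exact match).
function isOwnedDataState(state: string): boolean {
  return OWNED_DATA_STATES.includes(state);
}

type DataPlane = "cms_marketplace_api" | "owned_collections";

// The backend is chosen from the state alone. Plan-ID length is deliberately not
// consulted, since NY plan IDs look like FFM-format IDs.
function dataPlaneFor(state: string): DataPlane {
  return isOwnedDataState(state) ? "owned_collections" : "cms_marketplace_api";
}
```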
What this means for each tool (verified on apex 2026-05-28):
| Tool | Endpoint | Lookup key | FFM | NY | CA |
|---|---|---|---|---|---|
| find_plans | /api/plans | state | ✅ CMS | ✅ owned | ✅ owned |
| check_drug | /api/drugs/find | rxcui + plan_id against formularies_staging | ✅ | ✅ | ✅ |
| find_doctors (suggest) | /api/providers/suggest (passes state) | zip + Symphony for CA | ✅ NPPES | ✅ | ✅ Symphony |
| check_provider | /api/providers/autocomplete (NPPES) → /api/providers/covered | NPPES NPI | ✅ | ✅ | ❌ see below |
Why check_drug is state-agnostic and just works: drugs are identified by RxCUI, a single national identifier. /api/drugs/find queries formularies_staging by (rxcui, plan_id) regardless of state — and CA drugs were ingested there (ENG-395). Search AND coverage use the same key, so it's fully end-to-end for every state. No state param needed.
Why check_provider is broken for CA (known gap, [ENG-410]): the discovery step (/api/providers/autocomplete) hits NPPES and returns an NPPES NPI. But CA provider data in providers_staging is keyed by Symphony providerId (_id: "ca-sym:<id>", npi: null). There is no NPPES↔Symphony bridge, so /api/providers/covered finds no match → NotCovered for every CA doctor a user names, even in-network ones. coverAcrossAllPlans() also does not thread state today. find_doctors/suggest is unaffected — it discovers via Symphony, so its providerIds match. Until ENG-410 lands (route CA provider discovery through /api/providers/search-ca → Symphony providerIds), prefer the suggest path for CA provider questions and do not trust a check_provider "not covered" for a CA plan.
Rule for any new or modified coverage tool:
- Thread the user's state (from store.location.state) into the deterministic call. Owned-data routes need it to dispatch off CMS.
- Know your identifier: national (RxCUI) → state-agnostic; provider (NPI vs Symphony providerId) → state-specific, and discovery + coverage MUST use the same ID system or they won't join.
- If you add a state to OWNED_DATA_STATES, re-verify every coverage tool against a real plan from that state before announcing the tool as working there.
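The "same ID system" rule can be expressed as a small compatibility check. This is a sketch under stated assumptions: ProviderIdSystem, DiscoveryRoute, coverageIdSystemFor, and canJoin are hypothetical names, and the mapping encodes only what this section states (CA coverage data is keyed by Symphony providerId, other states by NPPES NPI).

```typescript
type ProviderIdSystem = "nppes_npi" | "symphony_provider_id";

// ID system used by the coverage data for a given state, per the section above.
function coverageIdSystemFor(state: string): ProviderIdSystem {
  return state === "CA" ? "symphony_provider_id" : "nppes_npi";
}

// Hypothetical description of a discovery route and the ID system it returns.
interface DiscoveryRoute {
  path: string;
  returns: ProviderIdSystem;
}

const AUTOCOMPLETE: DiscoveryRoute = { path: "/api/providers/autocomplete", returns: "nppes_npi" };
// Route planned under ENG-410; listed here only to show the matching case.
const SEARCH_CA: DiscoveryRoute = { path: "/api/providers/search-ca", returns: "symphony_provider_id" };

// True only when discovery and coverage use the same ID system for this state.
function canJoin(discovery: DiscoveryRoute, state: string): boolean {
  return discovery.returns === coverageIdSystemFor(state);
}
```

A check like this would flag the CA gap described above before a "not covered" answer reaches the user. Any state added to OWNED_DATA_STATES would need its own entry in the mapping once its provider ID system is known.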
Versioning
Tools are versioned because they will evolve.
- Minor versions (additive fields, relaxed input validation): no breaking change, no eval reset, old calls still valid.
- Major versions (input rename, output shape change, behavior change): register as api_foo_v2, keep api_foo available during deprecation, run both through evals, retire the old version after the next successful audit window.
The registry tracks version + status per tool. The system prompt includes only stable + beta tools; deprecated tools are still callable (for in-flight conversations) but not announced to the model.
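The announce-versus-callable split could look roughly like the following. RegisteredTool, announcedTools, resolveTool, and isBreakingBump are hypothetical names, and the two registry entries are made-up data using the api_foo placeholder from above.

```typescript
type ToolStatus = "stable" | "beta" | "deprecated";

interface RegisteredTool {
  name: string;
  version: string; // semver
  status: ToolStatus;
}

// Tools listed in the system prompt: stable and beta only.
function announcedTools(registry: readonly RegisteredTool[]): RegisteredTool[] {
  return registry.filter((tool) => tool.status !== "deprecated");
}

// Every registered tool stays callable, deprecated ones included, so in-flight
// conversations that already saw the old name do not break.
function resolveTool(registry: readonly RegisteredTool[], name: string): RegisteredTool | undefined {
  return registry.find((tool) => tool.name === name);
}

// A bump is breaking when the semver major component increases.
function isBreakingBump(from: string, to: string): boolean {
  const major = (version: string): number => Number(version.split(".")[0]);
  return major(to) > major(from);
}

// Made-up registry contents for illustration.
const registry: RegisteredTool[] = [
  { name: "api_foo", version: "1.4.0", status: "deprecated" },
  { name: "api_foo_v2", version: "2.0.0", status: "stable" },
];
```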
Extensibility — tool surface grows with the platform
The integration pattern is deliberately uniform so that drug lookup, provider lookup, appointment booking, renewal analysis, claims, bills, and anything else the deterministic platform grows to support can be exposed to Florence as routine work:
- Build the deterministic endpoint as usual.
- Write a tool wrapper in src/lib/florence/tools/api/ declaring input, output, classification, auth contexts.
- Register it in registry.ts.
- Add it to the tool registry doc.
- Write eval coverage (at minimum: three factual cases, one adversarial, one auth-boundary).
- Ship behind a feature flag; monitor the cost + latency + routing-mix impact for a week; graduate to stable.
Full playbook: adding a tool.