Data Models

Status: Schemas locked. Collections are created in Phase 2 (survey responses and unsubscribes) and Phase 5 (everything else).

Collections overview

Collection	Phase	Purpose	Written by	Read by
`agent_survey_responses`	2	11-screen research survey results	`app_writer_survey`	`app_writer_agents`, `app_admin_agents`
`agents`	5	Agent accounts and profile	`app_writer_agents`	`app_writer_agents`, `app_admin_agents`
`agencies`	5	Agency orgs (agent team parents)	`app_writer_agents`	`app_writer_agents`, `app_admin_agents`
`agent_sessions`	5	Auth tokens (magic links + sessions)	`app_writer_agents`	`app_writer_agents`
`admins`	5	Admin / super-admin role registry	`app_admin_agents`	`app_admin_agents`
`super_admins`	5	Super-admin credentials (password hash, TOTP)	`app_admin_agents`	`app_admin_agents`
`agent_audit_log`	5	Append-only audit trail	`app_writer_agents`, `app_admin_agents`	`audit_reader`
`agent_nipr_records`	5	NIPR PDB validation results	`app_writer_agents`	`app_writer_agents`, `app_admin_agents`
`agent_id_verifications`	5	KYC vendor results	`app_writer_agents`	`app_writer_agents`, `app_admin_agents`
`agent_unsubscribes`	2	CAN-SPAM opt-out records (pre-Phase-5 only)	`app_writer_survey`	`app_writer_agents`
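
The table above can also be encoded as data, so that a test or a startup check can assert that each MongoDB user only touches the collections it is meant to. The following is a minimal sketch: the names `ACCESS`, `lookup`, `canWrite`, and `canRead` are hypothetical and not taken from the codebase, while the entries themselves are copied from the table.

```typescript
// Hypothetical encoding of the overview table. ACCESS, lookup, canWrite, and
// canRead are illustrative names; the entries are copied from the table.
type DbUser =
  | "app_writer_survey"
  | "app_writer_agents"
  | "app_admin_agents"
  | "audit_reader";

interface CollectionAccess {
  phase: 2 | 5;
  writtenBy: DbUser[];
  readBy: DbUser[];
}

const agentsAndAdmins: DbUser[] = ["app_writer_agents", "app_admin_agents"];

const ACCESS: Record<string, CollectionAccess> = {
  agent_survey_responses: { phase: 2, writtenBy: ["app_writer_survey"], readBy: agentsAndAdmins },
  agents: { phase: 5, writtenBy: ["app_writer_agents"], readBy: agentsAndAdmins },
  agencies: { phase: 5, writtenBy: ["app_writer_agents"], readBy: agentsAndAdmins },
  agent_sessions: { phase: 5, writtenBy: ["app_writer_agents"], readBy: ["app_writer_agents"] },
  admins: { phase: 5, writtenBy: ["app_admin_agents"], readBy: ["app_admin_agents"] },
  super_admins: { phase: 5, writtenBy: ["app_admin_agents"], readBy: ["app_admin_agents"] },
  agent_audit_log: { phase: 5, writtenBy: agentsAndAdmins, readBy: ["audit_reader"] },
  agent_nipr_records: { phase: 5, writtenBy: ["app_writer_agents"], readBy: agentsAndAdmins },
  agent_id_verifications: { phase: 5, writtenBy: ["app_writer_agents"], readBy: agentsAndAdmins },
  agent_unsubscribes: { phase: 2, writtenBy: ["app_writer_survey"], readBy: ["app_writer_agents"] },
};

// Own-property lookup so that names like "constructor" never match.
function lookup(collection: string): CollectionAccess | undefined {
  return Object.prototype.hasOwnProperty.call(ACCESS, collection)
    ? ACCESS[collection]
    : undefined;
}

function canWrite(user: DbUser, collection: string): boolean {
  const entry = lookup(collection);
  return entry !== undefined && entry.writtenBy.indexOf(user) !== -1;
}

function canRead(user: DbUser, collection: string): boolean {
  const entry = lookup(collection);
  return entry !== undefined && entry.readBy.indexOf(user) !== -1;
}
```

A map like this does not replace the MongoDB role definitions; it only gives application code and tests one place to check against.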

`agent_survey_responses` (Phase 2)

Ships first. Forward-compatible with Phase 5 agent accounts — no breaking changes needed when accounts arrive, just a backfill script to link responses to their eventual agentId.

{
  _id: ObjectId,
  submittedAt: Date,

  // Identity (no account yet when Phase 2 ships)
  email: string,          // normalized lowercase, indexed
  fullName: string,
  companyName: string,

  // OPTIONAL — backfilled by Phase 5 migration script
  // (scripts/db/backfill-agent-survey-ids.js) when an agent creates
  // an account and email matches
  agentId?: ObjectId,

  // Survey payload — structured blob, schema will evolve
  responses: {
    oepVolume, sepVolume,
    leadSources: string[], leadCost, conversionRate,
    selfEmployedPct, medicaidRejectedPct,
    topClientReasons: string[], ageRange,
    enrollmentTools: string[], enrollmentTime,
    timeSavingsRating, biggestFriction,
    consentDocs: string[], docStorage, clawbackExperience,
    retentionRate, proactiveRenewal, autoRenewalHarm,
    hardestPart, wishlist,
    featureRatings: {
      prequalLeads, brandedPage, memberPortal,
      renewalAlerts, dashboard, aiAssistant
    },
    pmpmRange, carriersWorkedWith: string[],
    adjacentInterests: string[],
    idealPartnership, partnershipModelPref,
    willingToHelp, canRefer, otherNotes
  },

  // Consent — see compliance doc
  consent: ConsentSubdocument,

  // Metadata
  schemaVersion: "v1",
  ip?: string,
  userAgent?: string,
  referrer?: string
}

Indexes: email, submittedAt, agentId. Retention: indefinite (research data).
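
Because the Phase 5 backfill matches on email, the normalization applied at write time matters. The sketch below shows one way a submission could be turned into a stored document. `normalizeEmail` and `toSurveyResponseDoc` are hypothetical names, trimming whitespace is an assumption beyond the "normalized lowercase" rule in the schema, ObjectIds are shown as strings, and the consent sub-document and request metadata are omitted for brevity.

```typescript
// Hypothetical helper types; the field names follow the schema above.
interface SurveyResponseDraft {
  email: string;
  fullName: string;
  companyName: string;
  responses: Record<string, unknown>; // structured blob, schema evolves
}

interface SurveyResponseDoc extends SurveyResponseDraft {
  submittedAt: Date;
  schemaVersion: "v1";
  agentId?: string; // left unset until the Phase 5 backfill links it
}

// ASSUMPTION: trimming is added here; the schema only states "lowercase".
function normalizeEmail(raw: string): string {
  return raw.trim().toLowerCase();
}

function toSurveyResponseDoc(draft: SurveyResponseDraft, now: Date): SurveyResponseDoc {
  return {
    email: normalizeEmail(draft.email),
    fullName: draft.fullName,
    companyName: draft.companyName,
    responses: draft.responses,
    submittedAt: now,
    schemaVersion: "v1",
  };
}
```

Applying the same `normalizeEmail` when agent accounts are created would keep the later email match exact.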

`agents` (Phase 5)

Core agent account record. NPN is the immutable primary identifier.

{
  _id: ObjectId,
  npn: string,                // unique index, immutable
  email: string,               // unique index, lowercase, mutable via flow
  fullName: string,
  phone: string,
  agencyId?: ObjectId,         // unset for solo agents

  licensedStates: string[],    // two-letter codes
  status: "pending_review"     // just signed up
        | "id_verification_required"  // NIPR passed, need ID verify
        | "active"             // cleared, can receive leads
        | "rejected"
        | "suspended",
  partnershipModel?: "full-service" | "submit-ready",

  orgSize: "solo" | "2-5" | "6-15" | "16-50" | "50+",
  currentAcaMembers: "0-50" | "51-200" | "201-500" | "501-1000" | "1001-5000" | "5000+",
  referralSource: string,

  emailVerified: boolean,
  totpSetupCompletedAt?: Date,  // required for Tier 2 access

  createdAt: Date,
  updatedAt: Date,
  activatedAt?: Date,
  notes?: string,              // internal ops notes

  // Consent carried forward from waitlist / onboarding
  consent: ConsentSubdocument
}

Indexes: npn (unique), email (unique), status, licensedStates, agencyId, createdAt. Retention: indefinite while active. After rejection or suspension, retain for the HIPAA period (6 years minimum, 10 preferred), then hard delete.
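
The comments on the status field suggest a lifecycle, which could be enforced with a small transition table. The sketch below is an illustration only: the allowed transitions are an assumption inferred from those comments (the schema itself does not define them), `canTransition` and `applyStatus` are hypothetical names, and treating activatedAt as "first activation only" is also an assumption.

```typescript
type AgentStatus =
  | "pending_review"
  | "id_verification_required"
  | "active"
  | "rejected"
  | "suspended";

// ASSUMPTION: inferred from the inline comments on `status`, not from a spec.
const ALLOWED_TRANSITIONS: Record<AgentStatus, AgentStatus[]> = {
  pending_review: ["id_verification_required", "rejected"],
  id_verification_required: ["active", "rejected"],
  active: ["suspended"],
  suspended: ["active"],
  rejected: [],
};

interface AgentStatusFields {
  status: AgentStatus;
  updatedAt: Date;
  activatedAt?: Date;
}

function canTransition(from: AgentStatus, to: AgentStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// Returns an updated copy; throws when the transition is not in the table.
function applyStatus(agent: AgentStatusFields, to: AgentStatus, now: Date): AgentStatusFields {
  if (!canTransition(agent.status, to)) {
    throw new Error("Illegal status transition: " + agent.status + " -> " + to);
  }
  const next: AgentStatusFields = { ...agent, status: to, updatedAt: now };
  // ASSUMPTION: activatedAt records the first activation and is kept afterwards.
  if (to === "active" && agent.activatedAt === undefined) {
    next.activatedAt = now;
  }
  return next;
}
```

Each accepted transition would normally also produce a "status_changed" entry in agent_audit_log.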

`agencies` (Phase 5)

{
  _id: ObjectId,
  name: string,
  primaryAgentId: ObjectId,    // agency creator / owner
  teamSize: string,             // carried forward from onboarding
  createdAt: Date
}

Agency members are the agents documents whose agencyId points to this record. The agency owner has additional permissions within the agency scope (inviting and removing team members); this is a Phase 6+ feature and is not in the MVP.

`agent_sessions` (Phase 5)

Auth tokens. TTL-indexed in MongoDB so old tokens auto-expire.

{
  _id: ObjectId,
  agentId: ObjectId,
  token: string,               // opaque random 32-byte hex, indexed
  type: "magic_link" | "session",
  expiresAt: Date,             // 15min for magic_link, 8hr for session
  usedAt?: Date,               // magic links are single-use
  createdAt: Date,
  userAgent?: string,
  ip?: string
}

Indexes: token (unique), agentId, expiresAt (TTL — auto-cleans expired entries). Retention: auto-expires per the expiresAt TTL. Audit trail of auth events lives in agent_audit_log separately.
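
MongoDB removes TTL-expired documents with a background task that runs periodically (about every 60 seconds), so an expired token can remain readable for a short time. The application therefore still needs its own expiry and single-use checks. The sketch below shows those checks as pure functions; `expiryFor` and `isUsable` are hypothetical names, ObjectIds are shown as strings, and the lifetimes are the ones given in the schema comment.

```typescript
type SessionType = "magic_link" | "session";

interface AgentSession {
  agentId: string; // ObjectId in the real collection
  token: string;
  type: SessionType;
  expiresAt: Date;
  usedAt?: Date;
  createdAt: Date;
}

// Lifetimes from the schema comment: 15 minutes and 8 hours.
const LIFETIME_MS: Record<SessionType, number> = {
  magic_link: 15 * 60 * 1000,
  session: 8 * 60 * 60 * 1000,
};

function expiryFor(type: SessionType, createdAt: Date): Date {
  return new Date(createdAt.getTime() + LIFETIME_MS[type]);
}

// App-layer check, independent of when the TTL task deletes the document.
function isUsable(s: AgentSession, now: Date): boolean {
  if (now.getTime() >= s.expiresAt.getTime()) return false; // expired
  if (s.type === "magic_link" && s.usedAt !== undefined) return false; // single-use
  return true;
}
```

For the TTL index itself, the usual pattern for a field holding an absolute expiry time is an index on expiresAt with `expireAfterSeconds: 0`; whether this project configures it that way is not stated here.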

`admins` (Phase 5)

Role registry for platform admins.

{
  _id: ObjectId,
  email: string,               // unique, lowercase
  role: "super_admin" | "admin" | "support",
  addedBy?: ObjectId,          // super_admin who granted (unset for seed)
  addedAt: Date,
  revokedAt?: Date,            // soft-revoke, preserves audit trail
  lastLoginAt?: Date,
  totpSetupCompletedAt?: Date  // required before any access
}

Indexes: email (unique), role. Write access: only app_admin_agents MongoDB user. Retention: indefinite; soft-revoked rows never hard-deleted.
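
Two rules in this schema determine whether an admin record grants access: revocation is a soft flag, and TOTP setup is required before any access. A minimal sketch of how application code might apply them follows; `hasAccess` and `revoke` are hypothetical names, and the real login flow presumably involves more checks than shown.

```typescript
type AdminRole = "super_admin" | "admin" | "support";

interface AdminRecord {
  email: string;
  role: AdminRole;
  addedAt: Date;
  revokedAt?: Date;
  lastLoginAt?: Date;
  totpSetupCompletedAt?: Date;
}

// Access requires completed TOTP setup and no revocation in effect.
function hasAccess(admin: AdminRecord, now: Date): boolean {
  const revoked =
    admin.revokedAt !== undefined && admin.revokedAt.getTime() <= now.getTime();
  const totpReady = admin.totpSetupCompletedAt !== undefined;
  return !revoked && totpReady;
}

// Soft-revoke: returns a copy with revokedAt set; the record is never deleted.
function revoke(admin: AdminRecord, now: Date): AdminRecord {
  return { ...admin, revokedAt: now };
}
```

Keeping the revoked record in place is what lets audit entries that reference the admin remain resolvable.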

`super_admins` (Phase 5)

Super-admin credentials live in their own collection, separate from admins, so that a compromise of the regular admin collection does not immediately grant super-admin access.

{
  _id: ObjectId,
  email: string,               // unique
  passwordHash: string,         // argon2id
  totpSecret: string,          // AES-encrypted at app layer
  recoveryCodes: string[],      // hashed, single-use
  allowedIps: string[],         // CIDR ranges for IP allowlist
  lastLoginAt?: Date,
  lastLoginIp?: string,
  createdAt: Date
}

Written by: app_admin_agents only, and only via the super-admin management flow (or break-glass scripts). Seeded by: scripts/admin/seed-super-admin.js (one-time bootstrap).
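
The allowedIps field holds CIDR ranges, so a login check has to test whether the caller's address falls inside any of them. The sketch below illustrates that test for IPv4 only. `ipv4ToInt`, `inCidr`, and `isIpAllowed` are hypothetical names, IPv6 addresses are simply rejected here, and a production implementation would more likely rely on a vetted library.

```typescript
// Parse a dotted-quad IPv4 address into a number, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when `ip` lies inside the IPv4 range `cidr` (for example "10.0.0.0/8").
function inCidr(ip: string, cidr: string): boolean {
  const pieces = cidr.split("/");
  if (pieces.length !== 2 || !/^\d{1,2}$/.test(pieces[1])) return false;
  const prefix = Number(pieces[1]);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(pieces[0]);
  if (ipInt === null || baseInt === null || prefix > 32) return false;
  // Divide away the host bits and compare the remaining network bits.
  const hostSize = Math.pow(2, 32 - prefix);
  return Math.floor(ipInt / hostSize) === Math.floor(baseInt / hostSize);
}

// ASSUMPTION: an empty allowlist denies everyone (fail closed).
function isIpAllowed(ip: string, allowedIps: string[]): boolean {
  return allowedIps.some((cidr) => inCidr(ip, cidr));
}
```

Plain arithmetic is used instead of bitwise operators because JavaScript bitwise operations work on signed 32-bit values, which complicates addresses at or above 128.0.0.0.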

`agent_audit_log` (Phase 5)

Append-only. The source of truth for every auth event, admin action, and data change. Required for SOC 2 / HIPAA / EDE.

{
  _id: ObjectId,
  actorType: "agent" | "admin" | "super_admin" | "system",
  actorId?: ObjectId,
  actorEmail?: string,

  action: string,              // e.g. "login_succeeded", "status_changed", "admin_role_granted"
  resourceType?: string,       // "agent" | "admin" | "super_admin" | "session"
  resourceId?: ObjectId,

  changes?: {
    before: any,
    after: any
  },

  ip?: string,
  userAgent?: string,
  outcome: "success" | "failure",
  timestamp: Date
}

Indexes: timestamp, actorId, action, resourceId. Retention: 6 years minimum (HIPAA, 45 CFR § 164.316(b)(2)(i)), 10 years preferred for EDE margin. Write access: app_writer_agents and app_admin_agents (append only; the app layer enforces no update/delete). Read access: audit_reader (read-only MongoDB user used by the compliance-export script).

See Auth Architecture for the full list of events logged.
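
Since the append-only rule is enforced at the app layer, one way to enforce it is to route every write through a wrapper that exposes nothing but an append operation. The sketch below illustrates the idea. `AuditLog` is a hypothetical class, the `insert` callback stands in for the real driver call (which would be asynchronous), and ObjectIds are shown as strings.

```typescript
interface AuditEntry {
  actorType: "agent" | "admin" | "super_admin" | "system";
  actorId?: string;
  actorEmail?: string;
  action: string;
  resourceType?: string;
  resourceId?: string;
  changes?: { before: unknown; after: unknown };
  ip?: string;
  userAgent?: string;
  outcome: "success" | "failure";
  timestamp: Date;
}

// Hypothetical wrapper: the only operation it offers is append.
class AuditLog {
  private readonly insert: (entry: AuditEntry) => void;
  private readonly clock: () => Date;

  constructor(insert: (entry: AuditEntry) => void, clock: () => Date = () => new Date()) {
    this.insert = insert;
    this.clock = clock;
  }

  // The timestamp is stamped here so callers cannot backdate an entry.
  append(entry: Omit<AuditEntry, "timestamp">): AuditEntry {
    const stamped: AuditEntry = { ...entry, timestamp: this.clock() };
    this.insert(stamped);
    return stamped;
  }
}
```

A usage example, with an in-memory array standing in for the collection:

```typescript
const sink: AuditEntry[] = [];
const log = new AuditLog((e) => { sink.push(e); });
log.append({ actorType: "system", action: "login_succeeded", outcome: "success" });
```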

`agent_nipr_records` (Phase 5)

NIPR PDB Detail Report responses, retained for the audit trail.

{
  _id: ObjectId,
  agentId: ObjectId,
  checkedAt: Date,
  npnValid: boolean,
  licenseStates: string[],
  niprResponseId: string,
  rawResponse: any,            // full API response for audit
  cost: number                  // e.g. 1.30
}

Indexes: agentId, checkedAt. Retention: 10 years (EDE audit margin).

`agent_id_verifications` (Phase 5)

KYC vendor results.

{
  _id: ObjectId,
  agentId: ObjectId,
  vendor: "persona" | "stripe_identity" | "plaid" | "veriff",
  vendorRecordId: string,       // vendor's ID for this verification
  status: "pending" | "passed" | "failed" | "manual_review",
  verifiedAt?: Date,
  rawResponse: any              // full vendor response for audit
}

Indexes: agentId, status, verifiedAt. Retention: 10 years (EDE audit margin).

`agent_unsubscribes` (Phase 2-only)

Minimal unsubscribe log for the period before agent_audit_log exists. Phase 5 migration copies these into the audit log.

{
  _id: ObjectId,
  email: string,
  unsubscribedAt: Date,
  optInTypes: string[],         // e.g. ["marketingEmail"]
  ip: string,
  userAgent: string
}

Indexes: email, unsubscribedAt.

The `consent` sub-document

Used by agent_survey_responses, by waitlist entries with interest: "agent", by future agents records, and anywhere else we capture an email address with intent to communicate.

consent: {
  versions: {
    privacyPolicy: "v2026.04",        // version in effect when consent given
    termsOfService: "v2026.04",
    consentStatement: "<exact text shown to user>"
  },
  optIns: {
    platformContact: boolean,          // "contact me about becoming an agent"
    marketingEmail: boolean,           // "product updates and research"
    marketingSms?: boolean             // future-reserved
  },
  capturedAt: Date,
  ip: string,
  userAgent: string,
  referrer?: string,
  pageUrl: string,
  method: "checkbox" | "implicit" | "verbal_recorded"
}

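This structure is designed to satisfy HubSpot / Salesforce import requirements and GDPR / CCPA consent proof by construction: every record carries the exact statement shown, the policy versions in effect, and when, where, and how consent was captured. See Compliance Model.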
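
A consent record only works as proof if its evidence fields are actually filled in, so a write-time check can be useful. The sketch below is one possible shape for it. `consentProblems` and `grantedOptIns` are hypothetical names, and the specific rules (non-empty versions, statement, ip, and pageUrl) are assumptions rather than documented requirements.

```typescript
type ConsentMethod = "checkbox" | "implicit" | "verbal_recorded";

interface ConsentSubdocument {
  versions: { privacyPolicy: string; termsOfService: string; consentStatement: string };
  optIns: { platformContact: boolean; marketingEmail: boolean; marketingSms?: boolean };
  capturedAt: Date;
  ip: string;
  userAgent: string;
  referrer?: string;
  pageUrl: string;
  method: ConsentMethod;
}

// ASSUMPTION: these rules are illustrative. Returns an empty list when valid.
function consentProblems(c: ConsentSubdocument): string[] {
  const problems: string[] = [];
  if (c.versions.privacyPolicy.trim() === "") problems.push("privacyPolicy version missing");
  if (c.versions.termsOfService.trim() === "") problems.push("termsOfService version missing");
  if (c.versions.consentStatement.trim() === "") problems.push("consentStatement text missing");
  if (c.ip.trim() === "") problems.push("ip missing");
  if (c.pageUrl.trim() === "") problems.push("pageUrl missing");
  return problems;
}

// Names of the opt-ins the user actually granted, in schema order.
function grantedOptIns(c: ConsentSubdocument): string[] {
  const granted: string[] = [];
  if (c.optIns.platformContact) granted.push("platformContact");
  if (c.optIns.marketingEmail) granted.push("marketingEmail");
  if (c.optIns.marketingSms === true) granted.push("marketingSms");
  return granted;
}
```

The strings returned by `grantedOptIns` use the same keys as the optInTypes array in agent_unsubscribes, which keeps opt-in and opt-out records comparable.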

Cross-phase migration: survey backfill

When Phase 5 ships and agents create real accounts, run once:

scripts/db/backfill-agent-survey-ids.js

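For every agent_survey_responses document where agentId is absent, the script finds the agent with a matching email, sets agentId, and leaves submittedAt unchanged. Documents that already have an agentId are skipped, which makes the script idempotent and safe to re-run.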
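
The matching logic can be illustrated in memory, independent of the database driver. This is a sketch under stated assumptions and not the contents of the actual script: `backfillAgentIds` is a hypothetical name, ObjectIds are shown as strings, and emails are lowercased again defensively even though both collections already store them lowercase.

```typescript
interface SurveyRow {
  _id: string;
  email: string;
  submittedAt: Date;
  agentId?: string;
}

interface AgentRow {
  _id: string;
  email: string;
}

// Links unlinked survey responses to agents by email; returns how many changed.
function backfillAgentIds(responses: SurveyRow[], agents: AgentRow[]): number {
  // Prototype-free object used as a plain email -> agent id lookup.
  const agentIdByEmail: Record<string, string> = Object.create(null);
  for (const agent of agents) {
    agentIdByEmail[agent.email.toLowerCase()] = agent._id;
  }

  let updated = 0;
  for (const response of responses) {
    if (response.agentId !== undefined) continue; // already linked: skip
    const match = agentIdByEmail[response.email.toLowerCase()];
    if (match === undefined) continue; // no account yet: leave as-is
    response.agentId = match; // submittedAt is deliberately not touched
    updated++;
  }
  return updated;
}
```

Running it a second time over the same data changes nothing, because every linked response is skipped by the first check. Against MongoDB, the same effect would come from filtering the update on agentId being absent.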

Overview — phase roadmap
Auth Architecture — how agent_sessions, agent_audit_log, and TOTP tie together
MongoDB Permissioning — which DB user can touch which collection
Compliance Model — retention rationale and regulatory mapping