ElevenLabs Conversational AI on React Native - feasibility research

Research date: 2026-05-26
Owner: open
Status: research-complete (no decisions made yet; feeds into mobile-v1 build planning per docs/architecture/mobile-app-strategy.md)

TL;DR

Yes - ElevenLabs ships an official @elevenlabs/react-native SDK (v1.2.3, published 2026-05-13, ~14.6K weekly downloads, MIT licensed, elevenlabs/packages monorepo) that re-exports the same @elevenlabs/react hook surface (ConversationProvider, useConversation, etc.) and auto-configures WebRTC polyfills and the native iOS/Android AudioSession. Our existing /api/florence/agent-session server route returns exactly the { conversationToken, connectionType: "webrtc" } shape the RN SDK expects, so the server is RN-portable as-is. The SDK requires an Expo development build (no Expo Go) but works inside the managed workflow via config plugins (@livekit/react-native-expo-plugin + @config-plugins/react-native-webrtc) - no manual ejection. Biggest known risks: (1) WebSocket transport is not supported on RN - WebRTC only (which is what we use); (2) background-audio behavior is undocumented and tracked open - lock-screen audio session continuity needs a spike; (3) onAudioAlignment does not fire on the WebRTC transport (#789 open) - our greeting-stagger logic will need to fall back to the mode-change debounce path we already have.

Question 1: Does an official @elevenlabs/react-native SDK exist?

Yes. Package: @elevenlabs/react-native, latest 1.2.3 (2026-05-13). Source: elevenlabs/packages/packages/react-native, MIT, 104 stars, last pushed 2026-05-25 (yesterday). ~14,639 weekly downloads (npm point/last-week). Announced by ElevenLabs Aug 2025 (tweet) with "WebRTC and first-class @expo support built in." Actively maintained - CHANGELOG shows weekly patch releases through May 2026.

What it exports (verified by reading src/index.react-native.ts): registers a RN-specific session-setup strategy that (1) polyfills WebRTC globals via @livekit/react-native's registerGlobals(), (2) configures + starts a native AudioSession (speaker output, communication audio type on Android), (3) creates the WebRTC connection, (4) attaches native volume processors. Then re-exports the entire @elevenlabs/react API via export * from "./index.js" so consumers get identical ConversationProvider, useConversation, useConversationControls, useConversationStatus, useConversationInput, useConversationMode, useConversationFeedback, useConversationClientTool, useRawConversation hooks.

Peer dependencies (from package.json): @livekit/react-native ^2.9.2, @livekit/react-native-webrtc ^137.0.2, react >=17.0.0, react-native >=0.70.0. Internal deps: @elevenlabs/client + @elevenlabs/react (same packages the web app uses).

Question 2: If no official SDK, what's the integration path?

N/A - the official SDK exists. For completeness:

The web @elevenlabs/client package is NOT directly usable from RN. Its MediaDeviceInput.create calls new window.AudioContext(...) (Web Audio API) and livekit-client reaches for document, HTMLAudioElement, Event, CloseEvent, navigator.mediaDevices.addEventListener. See issue #766 for the failure trace. This is precisely why @elevenlabs/react-native exists - it swaps the session-setup strategy to a LiveKit-native WebRTC path.
react-native-webrtc raw integration is not the path. The SDK uses @livekit/react-native-webrtc (a LiveKit-tuned fork) because the ElevenLabs Convai WebRTC backend speaks LiveKit's protocol. Rolling our own with react-native-webrtc would require reverse-engineering the LiveKit handshake the SDK does for us.

Question 3: How does the existing web implementation hand off to native?

Read of apps/web/src/lib/florence/agent/provision.ts and apps/web/src/components/florence/useFlorenceConversation.ts:

Server response shape (createFlorenceSession):

interface FlorenceSession {
  agentId: string;
  conversationToken: string;
  connectionType: "webrtc";
  voiceId: string;
}

This is portable as-is. Per the SDK's Options type (@elevenlabs/client), startSession accepts the same { conversationToken, connectionType: "webrtc" } shape on both web and RN. The RN SDK's session-setup strategy explicitly checks options.connectionType === "websocket" and throws, but accepts "webrtc" cleanly (verified in index.react-native.ts:31-38). The example app in examples/react-native-expo/App.tsx passes agentId directly, which only works for public agents; for private agents (our case) the docs confirm that conversationToken is the path and that the token is valid for 10 minutes. Our existing route already meets this requirement: it returns a conversationToken with connectionType "webrtc".
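A minimal sketch of how an RN client might check the route's response before handing it to startSession. The FlorenceSession interface mirrors the one above; isFlorenceSession and toStartSessionOptions are hypothetical helper names, not part of the SDK or of our codebase.

```typescript
// Mirrors the server response shape shown above.
interface FlorenceSession {
  agentId: string;
  conversationToken: string;
  connectionType: "webrtc";
  voiceId: string;
}

// Hypothetical runtime guard: the response crosses a network boundary,
// so verify it before trusting the static type.
function isFlorenceSession(value: unknown): value is FlorenceSession {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const token = v["conversationToken"];
  return (
    typeof v["agentId"] === "string" &&
    typeof token === "string" &&
    token.length > 0 &&
    v["connectionType"] === "webrtc" &&
    typeof v["voiceId"] === "string"
  );
}

// Hypothetical helper: keep only the two fields this document says
// startSession needs for a private agent.
function toStartSessionOptions(session: FlorenceSession): {
  conversationToken: string;
  connectionType: "webrtc";
} {
  return {
    conversationToken: session.conversationToken,
    connectionType: session.connectionType,
  };
}
```

Rejecting anything other than "webrtc" on the client keeps a misconfigured server response from reaching the RN SDK's websocket check, which throws.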

Client lifecycle - what's portable:

The web hook uses useConversation with callbacks: onConnect, onDisconnect, onModeChange, onMessage, onInterruption, onAudioAlignment, onError, onUnhandledClientToolCall. Per the RN SDK docs:

Callback	RN-supported?	Notes
`onConnect`	yes	confirmed in example + docs
`onDisconnect`	yes	confirmed
`onModeChange`	yes	confirmed (`speaking` <-> `listening`)
`onMessage`	yes	confirmed
`onError`	yes	confirmed
`onStatusChange`	yes	RN-only convenience
`onAudioAlignment`	partial - WebRTC broken	Issue #789 open: WebRTC transport drops alignment events. Our hook already has a mode-change fallback path (`schedulePitchStaggerFallback`) that handles this exact case - direct port.
`onInterruption`	uncertain - needs spike	Not in RN docs explicitly. Likely re-exported from `@elevenlabs/react` but not verified in example.
`onUnhandledClientToolCall`	uncertain - needs spike	Same as above.
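For the onAudioAlignment row, the fallback idea can be sketched as a one-shot trigger: fire on the first alignment event if one arrives, otherwise a fixed delay after the first switch to speaking. This is an illustration only, not the existing schedulePitchStaggerFallback implementation; createStaggerTrigger, Timers, and the delay parameter are hypothetical names.

```typescript
type Mode = "speaking" | "listening";

// Timer functions are injected so the logic has no host dependency;
// in the app, pass thin wrappers around setTimeout / clearTimeout.
interface Timers {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

interface StaggerTrigger {
  onModeChange(mode: Mode): void;
  onAudioAlignment(): void;
  dispose(): void;
}

// Hypothetical sketch: run `fire` exactly once, either on the first
// alignment event or, failing that, `fallbackDelayMs` after the first
// change to "speaking".
function createStaggerTrigger(
  fire: () => void,
  fallbackDelayMs: number,
  timers: Timers,
): StaggerTrigger {
  let done = false;
  let armed = false;
  let handle: unknown;

  const cancelTimer = (): void => {
    if (armed) {
      timers.clear(handle);
      armed = false;
    }
  };

  const fireOnce = (): void => {
    if (done) return;
    done = true;
    cancelTimer();
    fire();
  };

  return {
    onModeChange(mode: Mode): void {
      // Arm the fallback only once, on the first "speaking" transition.
      if (mode === "speaking" && !done && !armed) {
        armed = true;
        handle = timers.set(fireOnce, fallbackDelayMs);
      }
    },
    onAudioAlignment(): void {
      // Precise path: on transports where alignment fires, it wins.
      fireOnce();
    },
    dispose(): void {
      done = true;
      cancelTimer();
    },
  };
}
```

On the WebRTC transport, where alignment events are dropped per #789, only the timer path would run, which is the looser timing the risks section mentions.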

Methods (verified in example useConversationControls): startSession, endSession, sendUserMessage, sendContextualUpdate, sendUserActivity, getId, sendFeedback, setMuted. These cover every method our hook depends on. Volume getters (getInputVolume/getOutputVolume) are not in the RN docs, but the SDK attaches native volume processors and the example uses a <VolumeBar direction="input"/> component, so the data is exposed; the exact API name needs a quick spike.

Tap-session + scene reducer logic in useFlorenceConversation (the useTapSession, sceneReducer, pickerBatch debounce, composeTapStartMessage) is all pure React state machinery with no DOM dependencies - directly portable.

Not portable as-is: the window.addEventListener("unhandledrejection", ...) SDK-crash guard in useFlorenceConversation (lines 575-612) - RN uses ErrorUtils.setGlobalHandler instead. Whether the underlying SDK crash even reproduces on RN is unknown (different transport, different code paths). Treat as "verify on first spike, port the pattern if needed."
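If the guard does need porting, one possible shape is a wrapper around the global handler that swallows only the known SDK failure and forwards everything else. ErrorUtils.getGlobalHandler and ErrorUtils.setGlobalHandler are the real React Native globals; installSdkCrashGuard, ErrorUtilsLike, and the caller-supplied predicate are hypothetical. Whether unhandled promise rejections reach this handler at all on Hermes is an open assumption for the spike to check.

```typescript
type GlobalErrorHandler = (error: unknown, isFatal?: boolean) => void;

// Minimal structural type for React Native's global ErrorUtils object,
// declared locally so the sketch does not depend on react-native types.
interface ErrorUtilsLike {
  getGlobalHandler(): GlobalErrorHandler;
  setGlobalHandler(handler: GlobalErrorHandler): void;
}

// Hypothetical helper: intercept errors matching `isSdkCrash`, report them
// through `onSdkCrash`, and pass everything else to the previous handler.
// Returns a function that restores the previous handler.
function installSdkCrashGuard(
  errorUtils: ErrorUtilsLike,
  isSdkCrash: (error: unknown) => boolean,
  onSdkCrash: (error: unknown) => void,
): () => void {
  const previous = errorUtils.getGlobalHandler();

  errorUtils.setGlobalHandler((error, isFatal) => {
    if (isSdkCrash(error)) {
      onSdkCrash(error);
      return; // swallow only the known SDK failure
    }
    previous(error, isFatal);
  });

  return () => errorUtils.setGlobalHandler(previous);
}
```

In the app this would be called once with the global ErrorUtils object and a predicate that recognises the specific SDK error; the predicate is left to the caller because the RN failure signature has not been observed yet.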

Question 4: Expo managed-workflow compatibility

Yes, managed workflow works - but requires an Expo development build (custom dev client), not Expo Go. This is well-documented and standard for any WebRTC-using Expo app. No ejection required.

Setup (per the Expo + ElevenLabs integration guide and the working example):

npx expo install @elevenlabs/react-native @livekit/react-native \
  @livekit/react-native-webrtc @config-plugins/react-native-webrtc \
  @livekit/react-native-expo-plugin livekit-client expo-dev-client

app.json plugins:

"plugins": ["@livekit/react-native-expo-plugin", "@config-plugins/react-native-webrtc"]

Info.plist: NSMicrophoneUsageDescription required. Android: RECORD_AUDIO, ACCESS_NETWORK_STATE, INTERNET, MODIFY_AUDIO_SETTINGS, WAKE_LOCK, BLUETOOTH, among others (the docs list 8 permissions in total; only six are named here).
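Pulled together, the plugin and permission requirements might look like this in an app.config.ts. This is a sketch under assumptions: the name, slug, and usage string are placeholders, only the six permissions named above are listed, and the config plugins may already add some of them. A real file would normally use the ExpoConfig type from "expo/config" rather than the local type here.

```typescript
// Local structural type so the sketch stands alone.
interface AppConfigSketch {
  name: string;
  slug: string;
  plugins: string[];
  ios: { infoPlist: Record<string, string> };
  android: { permissions: string[] };
}

const config: AppConfigSketch = {
  name: "florence-mobile-spike", // placeholder
  slug: "florence-mobile-spike", // placeholder
  // The two config plugins named in the setup notes.
  plugins: [
    "@livekit/react-native-expo-plugin",
    "@config-plugins/react-native-webrtc",
  ],
  ios: {
    infoPlist: {
      // Placeholder copy; iOS requires some user-facing reason string.
      NSMicrophoneUsageDescription: "Used for voice conversations.",
    },
  },
  android: {
    // Only the six permissions named in this document.
    permissions: [
      "android.permission.RECORD_AUDIO",
      "android.permission.ACCESS_NETWORK_STATE",
      "android.permission.INTERNET",
      "android.permission.MODIFY_AUDIO_SETTINGS",
      "android.permission.WAKE_LOCK",
      "android.permission.BLUETOOTH",
    ],
  },
};

// In app.config.ts, finish with: export default config;
```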

Verified compatibility in the official example: Expo SDK 55, React Native 0.83.6, React 19.2.0, Hermes engine. Builds via expo prebuild --platform ios / expo prebuild --platform android then runs in a dev client. This is the standard managed-workflow path for any app using native modules - the same path mobile v1 will need anyway per docs/architecture/mobile-app-strategy.md.

Audio session caveats (the founder's "premium feel" concern):

The SDK auto-configures AudioSession (speaker output, communication audio type on Android) on startSession and stops it on endSession. Good - we don't have to write the iOS audio session code ourselves.
Lock-screen audio + background audio behavior is undocumented and explicitly tracked open as issue #715 ("Document React Native conversation behavior when apps are backgrounded"). The issue explicitly says conversations may drop when backgrounded and that SDK-level guidance is still pending.
What this means for "premium feel": if the user locks the phone mid-conversation, behavior is unknown without a spike. iOS allows continued audio when the app declares the audio entry under UIBackgroundModes in Info.plist and holds an active AVAudioSession with the playAndRecord category. The SDK's auto-configured AudioSession likely handles the category; whether the config plugins also add the background-mode entry is unverified. Both need explicit verification on device.
Interruption handling (phone call comes in, AirPods disconnect): not documented for the RN SDK. Standard LiveKit behavior is to pause + resume on AVAudioSessionInterruptionTypeBegan / Ended - again, device spike required.
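On the Info.plist side of the lock-screen question, the background-mode declaration is a small, mechanical change. A sketch of a helper that adds the audio entry to UIBackgroundModes without disturbing existing entries; withBackgroundAudio is a hypothetical name, and adding the entry does not by itself show that a conversation survives the lock screen.

```typescript
type InfoPlist = Record<string, unknown>;

// Hypothetical helper: return a copy of the Info.plist dictionary with
// "audio" present in UIBackgroundModes, keeping any existing modes.
function withBackgroundAudio(infoPlist: InfoPlist): InfoPlist {
  const raw = infoPlist["UIBackgroundModes"];
  const existing: string[] = Array.isArray(raw)
    ? raw.filter((m): m is string => typeof m === "string")
    : [];
  const modes =
    existing.indexOf("audio") === -1 ? [...existing, "audio"] : existing;
  return { ...infoPlist, UIBackgroundModes: modes };
}
```

The result would be passed as ios.infoPlist in the Expo config, assuming neither config plugin already sets the key.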

Question 5: Production readiness signals

Mixed - real production usage but real RN-specific bugs still open.

Positive signals:

Active maintenance: 11 releases in 2026 alone, last push 2026-05-25, weekly patch cadence.
~14.6K weekly downloads suggests real adoption beyond demo apps.
Expo published their own integration guide (expo.dev blog: "How to build universal app voice agents with Expo & ElevenLabs") - this is a first-party Expo endorsement.
A working example app (examples/react-native-expo) lives in the monorepo with expo-dev-client, expo-image-picker, full hook usage including useConversationFeedback, image upload, frequency bands, volume bars.

Negative signals (all open issues on RN-specific code paths):

#766 (2026-05-10, open) - 1.2.0 unusable on a fresh RN install from npm (browser-globals leakage). v1.2.2's changelog says "Fix React Native SDK imports so native builds no longer pull in DOM/Web-only APIs from the client package" - so 1.2.3 likely resolves this, but the issue is still open with users reporting they need a ~150-line polyfill workaround. First spike must verify 1.2.3 works clean from npm.
#789 (2026-05-16, open) - onAudioAlignment not surfaced on WebRTC transport. Our greeting-stagger has a fallback already; degraded but not blocking.
#715 (2026-04-30, open) - backgrounded-app conversation behavior undocumented.
#641 (2026-03-31, open) - WebRTC TrackSubscribed handler uses web-only DOM APIs. May be the same root cause as #766.
#658 - WebSocket volume metering returns 0 (does not affect us - we use WebRTC).
#598 - Scribe (separate feature) not supported on RN. Does not affect Florence.
One Medium post by an indie dev describing bridging the Kotlin SDK into RN as an Android-only workaround - signals that some devs have hit roadblocks bad enough to take a Kotlin-bridge detour, but the date and context suggest pre-@elevenlabs/react-native.

Risks + open questions

1. Verify 1.2.3 installs clean from npm (not workspace) - reproduce the #766 repro path with npm install @elevenlabs/react-native@1.2.3 and confirm a startSession({connectionType: "webrtc", conversationToken}) reaches connected without polyfills. Gating spike before committing to the architecture.
2. Lock-screen + background audio behavior on iOS. Test on a real device: start a session, lock the phone, does audio continue? Does the call resume after unlocking? Same test with phone call interruption. The mobile-strategy doc demands a "tight" interactivity loop with audio session handling that mobile web cannot deliver - this is the killer feature for the native build, so verifying it works is the spike that justifies the whole effort.
3. onAudioAlignment on WebRTC. Our web useFlorenceConversation uses per-character alignment for the precise pitch-stagger trigger. Per #789, this doesn't fire on RN. The fallback (schedulePitchStaggerFallback using time-based delays after first speak) is already in the hook - confirm it ports cleanly and the founder accepts slightly looser timing on the first version.
4. onInterruption + onUnhandledClientToolCall callback availability. The RN docs don't list them explicitly. Verify in the SDK's exported types before relying on them.
5. The SDK-crash window guard (unhandledrejection listener for the error_type SDK bug). Different host runtime on RN (Hermes). Verify whether the SDK crash reproduces; if yes, port to ErrorUtils.setGlobalHandler.
6. Voice quota preflight + 503 response handling. Our /api/florence/agent-session route returns 503 with error: "voice_quota_exhausted" when over cap. The RN client needs to handle that branch in the begin() flow exactly like the web does - direct port of useFlorenceConversation.ts:660-697.
7. Origin/Referer guard on /api/florence/agent-session. applyApiGuard in the route currently allowlists web origins. A native RN app sends no Origin header. This will block the route from working on mobile unless we add a "from native client" auth path (signed token in a header, or a separate route, or relaxation under a feature flag). Same problem will hit every mobile API call - tracked by mobile-strategy doc but worth flagging as a P0.
8. Scene reducer + tap-session logic in packages/shared/. Per ENG-349 M0, framework-agnostic logic should live in packages/shared/. The scene reducer (director.ts, steps.ts, tools.ts) is currently in apps/web/src/lib/florence/. Pre-mobile-v1 refactor: move pure logic to packages/shared/, leave useFlorenceConversation as the web binding, write a parallel useFlorenceConversation.native.ts in apps/mobile/.
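For the voice-quota item above, the branch the RN client has to handle can be sketched as a pure classifier over the HTTP status and parsed body. This is not the web implementation at useFlorenceConversation.ts:660-697; SessionOutcome and classifySessionResponse are hypothetical names, and only the 503 status and the "voice_quota_exhausted" error value come from this document.

```typescript
type SessionOutcome =
  | { kind: "ok"; body: unknown }
  | { kind: "quota_exhausted" }
  | { kind: "failed"; status: number };

// Hypothetical helper: map the agent-session route's response onto the
// three branches a begin() flow would need to distinguish.
function classifySessionResponse(status: number, body: unknown): SessionOutcome {
  if (status >= 200 && status < 300) {
    return { kind: "ok", body };
  }
  const errorCode =
    typeof body === "object" && body !== null
      ? (body as Record<string, unknown>)["error"]
      : undefined;
  if (status === 503 && errorCode === "voice_quota_exhausted") {
    return { kind: "quota_exhausted" };
  }
  // Any other non-2xx response, including a 503 without the quota code.
  return { kind: "failed", status };
}
```

Keeping the classifier free of fetch and React makes it a candidate for packages/shared/, so web and native could share one definition of the quota branch.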

Recommended integration approach

Use the official @elevenlabs/react-native SDK on Expo managed workflow with a dev client.

Day-1 spike (target: 1 day): create a barebones Expo dev-client app, npm install @elevenlabs/react-native@1.2.3 (from npm, not workspace), wire ConversationProvider + useConversation to hit our existing /api/florence/agent-session (with a temporary auth bypass for the spike), and confirm WebRTC connect + voice loop + lock-screen behavior on a real iOS device. That single spike answers risks 1, 2, and 5 and resolves whether the SDK is production-ready for our quality bar.

If yes, the full porting plan is straightforward: move pure scene/tool logic to packages/shared/, rebuild useFlorenceConversation as a thin RN binding (~80% identical, swap WebRTC plumbing for SDK callbacks), build out the picker/sheet/voice-presence UI components in RN, and ship behind a feature flag.

If the lock-screen / background-audio spike fails, that's the moment to evaluate whether mobile v1 lives on a different surface (native iOS Swift SDK exists, Kotlin SDK exists - both could be bridged). Even so, the official RN SDK is the right first bet given Expo's first-party endorsement and our existing server contract being SDK-native.