Pipecat’s RTVI (Real-Time Voice Interaction) protocol provides a standardized communication layer between clients and servers for building real-time voice and multimodal applications. It handles the synchronization of user and bot interactions, transcriptions, LLM processing, and text-to-speech delivery. This page provides an overview of RTVI from the server’s perspective and how to use it in your bot applications.

RTVI Protocol

A complete specification of the RTVI protocol for client-server communication.

Architecture

RTVI operates with two primary components:
  1. RTVIProcessor - A frame processor residing in the pipeline that serves as the entry point for sending and receiving messages to/from the client.
  2. RTVIObserver - An observer that monitors pipeline events and translates them into client-compatible messages, handling:
    • Speaking state changes
    • Transcription updates
    • LLM responses
    • TTS events
    • Performance metrics
RTVI is enabled by default. When you create a PipelineTask, it automatically adds RTVIProcessor to the start of your pipeline and registers an RTVIObserver. The default on_client_ready handler calls set_bot_ready() automatically.
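
To make the observer's role concrete, here is a minimal, library-free sketch of the idea: internal pipeline events go in, client-facing messages come out, and anything the client does not need is dropped. This is not Pipecat's RTVIObserver implementation. The class name `SketchObserver`, the event names on the left side of the mapping, and the `{"type", "data"}` message shape are assumptions made for illustration; only the message type strings on the right side are taken from the protocol flow described on this page.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical internal event names (left) mapped to RTVI message types (right).
# Only the right-hand strings come from the protocol flow on this page.
EVENT_TO_MESSAGE_TYPE = {
    "user_started_speaking": "user-started-speaking",
    "user_stopped_speaking": "user-stopped-speaking",
    "bot_started_speaking": "bot-started-speaking",
    "bot_stopped_speaking": "bot-stopped-speaking",
    "transcription": "user-transcription",
    "llm_text": "bot-llm-text",
    "tts_text": "bot-tts-text",
}


@dataclass
class SketchObserver:
    """Illustrative stand-in for an observer; not the real RTVIObserver."""

    send: Callable[[dict], None]  # callback that delivers a message to the client

    def on_event(self, name: str, data: Optional[dict] = None) -> bool:
        """Translate one pipeline event; return True if a message was sent."""
        message_type = EVENT_TO_MESSAGE_TYPE.get(name)
        if message_type is None:
            # Events with no client-facing equivalent are silently ignored.
            return False
        self.send({"type": message_type, "data": data or {}})
        return True


# Usage: collect outgoing messages in a list instead of a real transport.
outbox = []
observer = SketchObserver(send=outbox.append)
observer.on_event("user_started_speaking")
observer.on_event("transcription", {"text": "hello", "final": True})
observer.on_event("internal_heartbeat")  # hypothetical event with no mapping
```

Keeping the translation in an observer rather than in the pipeline's processors means the pipeline itself does not need to know which events the client cares about.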

Basic Example

With automatic RTVI setup, your pipeline code can focus on core functionality:
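from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask

# transport, stt, llm, tts, and context_aggregator are assumed to be
# created earlier in your bot code.
pipeline = Pipeline(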
    [
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ]
)

# Create the task; RTVIProcessor and RTVIObserver are added automatically
task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True,
    ),
)

# Access the RTVI processor via task.rtvi
@task.rtvi.event_handler("on_client_ready")
async def on_client_ready(rtvi):
    # set_bot_ready() is called automatically; add custom logic here
    await task.queue_frames([LLMRunFrame()])

# Handle participant disconnection
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
    await task.cancel()

# Run the pipeline
runner = PipelineRunner()
await runner.run(task)

Customizing RTVI

You can customize RTVI behavior through PipelineTask parameters:
from pipecat.processors.frameworks.rtvi import RTVIProcessor, RTVIObserverParams

task = PipelineTask(
    pipeline,
    rtvi_processor=RTVIProcessor(),  # Provide your own processor
    rtvi_observer_params=RTVIObserverParams(...),  # Customize observer
)
To disable RTVI entirely:
task = PipelineTask(pipeline, enable_rtvi=False)

Protocol Flow

  1. Client connects and sends a client-ready message
  2. Server responds with bot-ready and initial configuration
  3. Client and server exchange real-time events:
    • Speaking state changes (user-started-speaking, user-stopped-speaking, bot-started-speaking, bot-stopped-speaking)
    • Transcriptions (user-transcription, bot-output)
    • LLM processing (bot-llm-started, bot-llm-stopped, bot-llm-text, llm-function-call)
    • TTS events (bot-tts-started, bot-tts-stopped, bot-tts-text, bot-tts-audio)
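
The handshake in steps 1 and 2 can be sketched as a small state machine. This is a simplified illustration and not the Pipecat implementation. The message envelope (`label`, `type`, `id`, `data`), the `"rtvi-ai"` label value, the `SketchServer` class, and the empty `config` payload are assumptions made for this example; consult the RTVI protocol specification for the actual message format.

```python
import uuid
from typing import List, Optional

RTVI_LABEL = "rtvi-ai"  # assumed envelope label, see the lead-in above


def make_message(msg_type: str, data: Optional[dict] = None) -> dict:
    """Build a message in the assumed envelope shape."""
    return {
        "label": RTVI_LABEL,
        "type": msg_type,
        "id": uuid.uuid4().hex[:8],
        "data": data or {},
    }


class SketchServer:
    """Hypothetical server side of the handshake; not Pipecat's RTVIProcessor."""

    def __init__(self) -> None:
        self.client_ready = False

    def handle(self, message: dict) -> List[dict]:
        """Process one client message and return the replies to send."""
        if message["type"] == "client-ready":
            self.client_ready = True
            # Step 2: answer with bot-ready plus a placeholder configuration.
            return [make_message("bot-ready", {"config": []})]
        return []

    def emit(self, msg_type: str, data: Optional[dict] = None) -> Optional[dict]:
        """Step 3: real-time events are only sent after the handshake."""
        if not self.client_ready:
            return None
        return make_message(msg_type, data)


# Usage: events emitted before client-ready are withheld.
server = SketchServer()
early = server.emit("bot-llm-text", {"text": "hi"})       # None, client not ready
replies = server.handle(make_message("client-ready"))     # step 1 -> step 2
late = server.emit("bot-llm-text", {"text": "hi"})        # now delivered
```

Gating events on the handshake reflects the ordering in the list above: the client announces readiness first, and only then does the server begin streaming events.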

Key Components

Client Integration

RTVI is implemented in Pipecat client SDKs, providing a high-level API to interact with the protocol. Visit the Pipecat Client SDKs documentation:

Client SDKs

Learn how to implement RTVI on the client side with our JavaScript, React, and mobile SDKs.