RTVI (Real-Time Voice Interaction)

Pipecat’s RTVI (Real-Time Voice Interaction) protocol provides a standardized communication layer between clients and servers for building real-time voice and multimodal applications. It handles the synchronization of user and bot interactions, transcriptions, LLM processing, and text-to-speech delivery. This page gives an overview of RTVI from the server’s perspective and shows how to use it in your bot applications.

RTVI Protocol

A complete specification of the RTVI protocol for client-server communication.

Architecture

RTVI operates with two primary components:

RTVIProcessor - A frame processor residing in the pipeline that serves as the entry point for sending and receiving messages to/from the client.
RTVIObserver - An observer that monitors pipeline events and translates them into client-compatible messages, handling:
- Speaking state changes
- Transcription updates
- LLM responses
- TTS events
- Performance metrics

RTVI is enabled by default. When you create a PipelineTask, it automatically adds RTVIProcessor to the start of your pipeline and registers an RTVIObserver. The default on_client_ready handler calls set_bot_ready() automatically.
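The split between a pipeline that moves frames and an observer that reports on them can be sketched without Pipecat at all. The toy model below is only an illustration of the observer idea: `ToyObserver`, `run_toy_pipeline`, and the two frame classes are hypothetical names, and the message payload shapes are assumptions rather than the RTVI wire format. Only the message type strings are taken from the protocol event names listed on this page.

```python
from dataclasses import dataclass
from typing import Callable, List


# Hypothetical stand-ins for pipeline frames; these are NOT Pipecat classes.
@dataclass
class UserStartedSpeaking:
    pass


@dataclass
class Transcription:
    text: str
    final: bool = True


@dataclass
class ToyObserver:
    """Watches frames and emits client-style messages without altering them."""

    send: Callable[[dict], None]

    def on_frame(self, frame) -> None:
        if isinstance(frame, UserStartedSpeaking):
            self.send({"type": "user-started-speaking"})
        elif isinstance(frame, Transcription):
            # The "data" layout here is an assumption made for illustration.
            self.send(
                {
                    "type": "user-transcription",
                    "data": {"text": frame.text, "final": frame.final},
                }
            )
        # Any other frame type is ignored by this toy observer.


def run_toy_pipeline(frames: List[object], observer: ToyObserver) -> List[object]:
    """Pass frames through unchanged while notifying the observer of each one."""
    passed = []
    for frame in frames:
        observer.on_frame(frame)
        passed.append(frame)
    return passed


sent: List[dict] = []
observer = ToyObserver(send=sent.append)
frames = [UserStartedSpeaking(), Transcription("hello")]
passed = run_toy_pipeline(frames, observer)
```

The point of the sketch is that the observer never modifies or consumes frames: `passed` holds the same frames that went in, while `sent` collects the client-facing messages derived from them.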

Basic Example

With automatic RTVI setup, your pipeline code can focus on core functionality:

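from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask

# transport, stt, llm, tts, and context_aggregator are assumed to be
# created earlier in your bot code.
pipeline = Pipeline(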
    [
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ]
)

# Create the task; RTVIProcessor and RTVIObserver are added automatically
task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True,
    ),
)

# Access the RTVI processor via task.rtvi
@task.rtvi.event_handler("on_client_ready")
async def on_client_ready(rtvi):
    # set_bot_ready() is called automatically; add custom logic here
    await task.queue_frames([LLMRunFrame()])

# Handle participant disconnection
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
    await task.cancel()

# Run the pipeline
runner = PipelineRunner()
await runner.run(task)

Customizing RTVI

You can customize RTVI behavior through PipelineTask parameters:

from pipecat.processors.frameworks.rtvi import RTVIProcessor, RTVIObserverParams

task = PipelineTask(
    pipeline,
    rtvi_processor=RTVIProcessor(),  # Provide your own processor
    rtvi_observer_params=RTVIObserverParams(...),  # Customize observer
)

To disable RTVI entirely:

task = PipelineTask(pipeline, enable_rtvi=False)

Protocol Flow

1. Client connects and sends a client-ready message
2. Server responds with bot-ready and initial configuration
3. Client and server exchange real-time events:
- Speaking state changes (user/bot-started/stopped-speaking)
- Transcriptions (user-transcription/bot-output)
- LLM processing (bot-llm-started/stopped, bot-llm-text, llm-function-call)
- TTS events (bot-tts-started/stopped, bot-tts-text, bot-tts-audio)
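The handshake and event exchange above can be sketched as a small, self-contained simulation. This is not Pipecat code: `ToyServer` and `make_message` are hypothetical helpers, the envelope fields (`label`, `type`, `id`, `data`), the `"rtvi-ai"` label value, and the `bot-ready` payload are assumptions about the message shape, and holding events back until the client is ready is a simplification chosen for the example. Consult the RTVI protocol specification for the actual format.

```python
import json
import uuid
from typing import List, Optional

RTVI_LABEL = "rtvi-ai"  # assumed envelope label, used here for illustration


def make_message(msg_type: str, data: Optional[dict] = None,
                 msg_id: Optional[str] = None) -> dict:
    """Build a message envelope (field names are assumptions)."""
    return {
        "label": RTVI_LABEL,
        "type": msg_type,
        "id": msg_id or uuid.uuid4().hex[:8],
        "data": data or {},
    }


class ToyServer:
    """Hypothetical server side of the handshake; not a Pipecat class."""

    def __init__(self) -> None:
        self.client_ready = False

    def handle(self, raw: str) -> List[str]:
        """Process one client message and return any serialized replies."""
        msg = json.loads(raw)
        if msg["type"] == "client-ready":
            self.client_ready = True
            # Reply with bot-ready, echoing the id so the client can match it.
            reply = make_message("bot-ready", {"config": []}, msg["id"])
            return [json.dumps(reply)]
        return []

    def emit_event(self, event_type: str,
                   data: Optional[dict] = None) -> Optional[str]:
        """Serialize a real-time event, or return None before the handshake."""
        if not self.client_ready:
            return None
        return json.dumps(make_message(event_type, data))


server = ToyServer()
early = server.emit_event("bot-started-speaking")  # handshake not done yet

# Step 1 and 2: client-ready in, bot-ready out.
replies = server.handle(json.dumps(make_message("client-ready", msg_id="1")))
reply = json.loads(replies[0])

# Step 3: real-time events flow once the client is ready.
event = json.loads(server.emit_event("bot-llm-text", {"text": "Hi"}))
```

In a Pipecat bot this exchange is handled by RTVIProcessor and RTVIObserver; the sketch only makes the ordering of the three steps concrete.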

Key Components

RTVIProcessor

Configure and manage RTVI services, actions, and client communication

RTVIObserver

Translate internal pipeline events to standardized client messages

Client Integration

RTVI is implemented in Pipecat client SDKs, providing a high-level API to interact with the protocol. Visit the Pipecat Client SDKs documentation:

Client SDKs

Learn how to implement RTVI on the client side with our JavaScript, React, and mobile SDKs
