Pipecat

Overview

Pipecat is an open-source Python framework for realtime voice agents. Cartesia is available as a first-party provider plugin for TTS and STT services in the Pipecat repo.

Prerequisites

Pipecat’s examples require a recent Python installation; see the root-level README in the Pipecat repo for current prerequisites. Install the pipecat-ai Python package with the cartesia extra (bracket syntax), which provides the TTS and STT services:

pip install "pipecat-ai[cartesia]"

You’ll also need the transport extras for your sample. Match whatever the upstream README lists for that example.

Getting Started

CartesiaTurnsSTTService requires pipecat-ai[cartesia]>=1.3.0. We strongly recommend it over the older CartesiaSTTService for improved turn detection.

To integrate Cartesia, import the Cartesia services and plug them into your agent:

import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.cartesia.turns.stt import CartesiaTurnsSTTService

stt = CartesiaTurnsSTTService(
    api_key=os.environ.get("CARTESIA_API_KEY"),
)

tts = CartesiaTTSService(
    api_key=os.environ.get("CARTESIA_API_KEY"),
)

# Add Cartesia STT and TTS to your existing pipeline.
# transport, llm, and the two aggregators come from your own agent setup.
pipeline = Pipeline(
    [
        transport.input(),
        stt,
        user_aggregator,
        llm,
        tts,
        transport.output(),
        assistant_aggregator,
    ]
)
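The snippet above reads CARTESIA_API_KEY with os.environ.get, which returns None when the variable is unset, so a missing key may only surface later as a connection or authentication error. A minimal sketch of a fail-fast check, assuming you want an explicit error at startup; require_env is a hypothetical helper written for this page, not part of Pipecat or the Cartesia plugin:

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return a non-empty environment variable or raise a clear error.

    Hypothetical helper for illustration; not a Pipecat API.
    """
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

You could then pass api_key=require_env("CARTESIA_API_KEY") to both services instead of calling os.environ.get directly.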

Basic Example

See examples/voice/voice-cartesia-turns.py in the pipecat-ai/pipecat repo for a fully working voice agent.

# clone and set up pipecat-ai/pipecat
git clone git@github.com:pipecat-ai/pipecat.git
cd pipecat
uv sync

# run with required API keys:
# - CARTESIA_API_KEY
# - OPENAI_API_KEY
uv run examples/voice/voice-cartesia-turns.py

Advanced Example

You can use Ink’s turn.eager_end events to start generating an agent response slightly earlier than normal. This can cut around half a second off your latency, which makes the conversation feel more natural.

Speculative User Aggregator

How to use on_turn_eager_end from CartesiaTurnsSTTService
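The idea behind speculative generation can be sketched independently of Pipecat. The example below is a plain asyncio illustration, not the Speculative User Aggregator itself and not the signature of on_turn_eager_end: the class SpeculativeResponder, its method names, and the fake_llm stand-in are all hypothetical. It starts a reply when an eager end-of-turn signal arrives, discards it if the user keeps talking, and reuses it when the final transcript matches.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class SpeculativeResponder:
    """Hypothetical helper illustrating eager, cancellable reply generation."""

    def __init__(self, generate: Callable[[str], Awaitable[str]]) -> None:
        self._generate = generate
        self._task: Optional[asyncio.Task] = None
        self._transcript: Optional[str] = None

    def on_eager_end(self, transcript: str) -> None:
        # Start generating early, replacing any stale speculative attempt.
        self.cancel()
        self._transcript = transcript
        self._task = asyncio.create_task(self._generate(transcript))

    def cancel(self) -> None:
        # Call when the user resumes speaking before the turn really ends.
        if self._task is not None:
            self._task.cancel()
        self._task = None
        self._transcript = None

    async def on_turn_end(self, transcript: str) -> str:
        # Reuse the speculative reply only if the transcript did not change.
        task, speculative = self._task, self._transcript
        self._task = self._transcript = None
        if task is not None and speculative == transcript:
            return await task
        if task is not None:
            task.cancel()
        return await self._generate(transcript)


async def demo() -> tuple:
    calls = []

    async def fake_llm(text: str) -> str:
        # Stand-in for a real LLM call; records what it was asked.
        calls.append(text)
        await asyncio.sleep(0.01)
        return f"reply to: {text}"

    responder = SpeculativeResponder(fake_llm)

    # Eager end was right: the early generation is reused.
    responder.on_eager_end("book a table")
    first = await responder.on_turn_end("book a table")

    # Eager end was wrong: the user kept talking, so the attempt is dropped.
    responder.on_eager_end("book a")
    responder.cancel()
    second = await responder.on_turn_end("book a table for two")
    return first, second, calls
```

Running asyncio.run(demo()) exercises both paths. In a real agent, the latency saving comes from the first path, where generation has already started by the time the turn is confirmed; the linked example shows how this is wired into an actual Pipecat pipeline.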
