Stream maintains Vision Agents, an open-source Python framework for building voice- and vision-driven agents with realtime media carried over Stream’s WebRTC edge. Cartesia is supported as a TTS provider; Stream’s Cartesia integration page covers install steps, environment variables, and parameters. You need a Stream developer account for realtime transport and a Cartesia API key for speech. The “Simple Agent” example on GitHub and the voice and video intros are good starting points.
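Because the agent depends on credentials from two services, it can help to check for them before starting a session. The following is a minimal sketch, not part of Vision Agents or the Cartesia SDK: the helper `missing_credentials` is hypothetical, and the variable names `STREAM_API_KEY`, `STREAM_API_SECRET`, and `CARTESIA_API_KEY` are assumptions that should be confirmed against Stream’s Cartesia integration page.

```python
import os

# Assumed variable names; confirm them in Stream's Cartesia integration docs.
REQUIRED_VARS = ("STREAM_API_KEY", "STREAM_API_SECRET", "CARTESIA_API_KEY")


def missing_credentials(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or blank.

    `env` defaults to the process environment; pass a dict to check
    some other source of settings (hypothetical helper, for illustration).
    """
    env = os.environ if env is None else env
    return [name for name in required if not str(env.get(name, "")).strip()]


# Example with an explicit mapping: the Cartesia key is absent.
example = {"STREAM_API_KEY": "key", "STREAM_API_SECRET": "secret"}
print(missing_credentials(example))
```

Calling `missing_credentials()` with no arguments checks the real process environment, so an agent script could refuse to start if the returned list is not empty.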

Demo

Vision Agents Cartesia Demo

Try out the Simple Agent Cartesia demo.