Introduction
CARTER is the unhinged meme mode of Cartesia, showing what’s possible when you remove all guardrails from Cartesia’s Sonic model. This guide shows you how to build voice AI with personality, not corporate speak.
Why Build with Cartesia?
Real Emotions
Express genuine emotions through voice with fine-grained control
Ultra-Low Latency
Sub-6ms response times for natural conversations
Production Ready
Enterprise-grade reliability and scalability
Simple Integration
Clean APIs and SDKs for rapid development
Cartesia Sonic Model
The Sonic model powers CARTER’s voice capabilities:
- Emotional Expression: Control tone, pitch, and emotion in real time
- Multiple Voices: Choose from stable voices or emotive character voices
- Low Latency: 6ms average response time
- High Quality: Natural-sounding speech with proper pronunciation
- Streaming Support: Real-time audio generation
Sonic is Cartesia’s latest text-to-speech model, offering unprecedented emotional range and responsiveness.
Getting Started
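A first request might be sketched as follows. This is a minimal sketch, not Cartesia's official quickstart: the endpoint URL, header names, version string, `model_id` value, and payload fields are assumptions modeled on Cartesia's public REST API and should be checked against the official documentation, and `build_tts_request` is a hypothetical helper. The function only builds the request; nothing is sent over the network.

```python
import json
import urllib.request

# Assumed values; verify against the official Cartesia docs before use.
CARTESIA_TTS_URL = "https://api.cartesia.ai/tts/bytes"
CARTESIA_VERSION = "2024-06-10"


def build_tts_request(api_key: str, transcript: str, voice_id: str) -> urllib.request.Request:
    """Build (but do not send) a text-to-speech HTTP request."""
    payload = {
        "model_id": "sonic-2",  # assumed model identifier
        "transcript": transcript,
        "voice": {"mode": "id", "id": voice_id},
        "output_format": {
            "container": "wav",
            "encoding": "pcm_s16le",
            "sample_rate": 44100,
        },
    }
    return urllib.request.Request(
        CARTESIA_TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Key": api_key,
            "Cartesia-Version": CARTESIA_VERSION,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen` and write the response bytes to a `.wav` file.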
Core Capabilities
Text-to-Speech
Generate natural-sounding speech with emotional control.
Streaming Voice
Real-time voice generation for conversational AI.
Voice Cloning
Clone voices for custom characters.
Integration Patterns
CARTER uses several key patterns you can implement:
WebSocket Connection
Maintain persistent connections for low-latency streaming
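The persistent-connection pattern amounts to opening one socket and reusing it for many utterances, tagging each message with a context ID. The sketch below only builds the connection URL and the JSON messages; it does not open a socket. The URL, query parameters, and message fields (`context_id`, `continue`) are assumptions modeled on Cartesia's WebSocket API, and `build_ws_url` and `build_ws_message` are hypothetical helpers.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; verify against the official Cartesia docs.
WS_BASE_URL = "wss://api.cartesia.ai/tts/websocket"


def build_ws_url(api_key: str, version: str = "2024-06-10") -> str:
    """Return the URL for a single long-lived streaming connection."""
    query = urlencode({"api_key": api_key, "cartesia_version": version})
    return f"{WS_BASE_URL}?{query}"


def build_ws_message(transcript: str, voice_id: str, context_id: str, is_final: bool) -> str:
    """Serialize one chunk of text to send over the open connection."""
    return json.dumps({
        "model_id": "sonic-2",  # assumed model identifier
        "transcript": transcript,
        "voice": {"mode": "id", "id": voice_id},
        "output_format": {
            "container": "raw",
            "encoding": "pcm_s16le",
            "sample_rate": 16000,
        },
        # Chunks sharing a context_id are treated as one utterance.
        "context_id": context_id,
        # True while more text for this context is still coming.
        "continue": not is_final,
    })
```

With a WebSocket client library of your choice, connect once to `build_ws_url(...)` and send each message on that same connection instead of reconnecting per utterance.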
Emotion Control
Dynamically adjust voice emotions
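One way to adjust emotion per utterance is to attach emotion tags to the voice settings of each request. A minimal sketch, assuming a `name:level` tag format and an `__experimental_controls` field modeled on Cartesia's experimental voice controls; the allowed names and levels below are illustrative and may not match the current API, and both helper functions are hypothetical.

```python
from typing import Dict, Optional

# Illustrative sets; check the official docs for the supported values.
ALLOWED_EMOTIONS = {"anger", "positivity", "surprise", "sadness", "curiosity"}
ALLOWED_LEVELS = {"lowest", "low", "high", "highest"}


def emotion_tag(name: str, level: Optional[str] = None) -> str:
    """Format one emotion as 'name' or 'name:level', validating both parts."""
    if name not in ALLOWED_EMOTIONS:
        raise ValueError(f"unknown emotion: {name!r}")
    if level is None:
        return name  # no level means the default intensity
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    return f"{name}:{level}"


def voice_with_emotions(voice_id: str, emotions: Dict[str, Optional[str]]) -> dict:
    """Build the 'voice' part of a request with emotion tags attached."""
    return {
        "mode": "id",
        "id": voice_id,
        "__experimental_controls": {  # assumed field name
            "emotion": [emotion_tag(n, lvl) for n, lvl in emotions.items()],
        },
    }
```

Because the tags are rebuilt for every request, the emotion can change from one sentence to the next without switching voices.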
Context Management
Maintain conversation context for natural flow
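Context management can be as simple as a rolling window of recent turns that is handed to whatever generates the next reply. A minimal sketch; `Turn` and `ConversationContext` are hypothetical names for illustration, not part of any Cartesia SDK.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    speaker: str
    text: str


class ConversationContext:
    """Keep only the most recent turns of a conversation."""

    def __init__(self, max_turns: int = 6) -> None:
        if max_turns < 1:
            raise ValueError("max_turns must be at least 1")
        # A bounded deque drops the oldest turn automatically when full.
        self._turns = deque(maxlen=max_turns)

    def add(self, speaker: str, text: str) -> None:
        self._turns.append(Turn(speaker, text.strip()))

    def as_prompt(self) -> str:
        """Render the retained turns as 'speaker: text' lines, oldest first."""
        return "\n".join(f"{t.speaker}: {t.text}" for t in self._turns)

    def __len__(self) -> int:
        return len(self._turns)
```

Capping the window keeps prompts short, which helps latency, at the cost of forgetting older turns.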
Next Steps
Integration Guide
Step-by-step integration
Voice API Reference
Full API documentation
Code Examples
Working code samples
Cartesia Docs
Official Cartesia documentation
