# How Opal Works

> Our end-to-end speech-to-speech AI model powering natural voice conversations

Opal is Deepslate's proprietary end-to-end speech-to-speech (S2S) model. Unlike traditional voice AI systems that chain together separate components, Opal processes audio input and generates audio output in a single unified model.

## What Makes Opal Different

* Direct speech processing means faster responses and better context awareness, with no transcription errors to compound.
* Sub-300ms first-byte latency enables natural turn-taking that feels human, not robotic.
* Advanced reasoning covers complex instruction following, context retention, and task completion.
* Opal understands emotional cues and responds with appropriate tone and inflection.

## Core Architecture

Unlike traditional voice AI that chains separate speech-to-text (STT, also called ASR), LLM, and text-to-speech (TTS) components, Opal understands speech directly. This eliminates the latency penalties and error propagation that arise between stages.

### Traditional Cascaded Approach

```mermaid theme={null}
flowchart LR
    A[Audio In] --> B[STT]
    B --> C[Text]
    C --> D[LLM]
    D --> E[Text]
    E --> F[TTS]
    F --> G[Audio Out]
```

Each stage introduces latency, and transcription errors compound through the pipeline. Total response time is the sum of all components.

### Opal End-to-End Approach

Opal supports two output modes depending on your use case. The first generates speech directly:

```mermaid theme={null}
flowchart LR
    A[Audio In] --> B[Speech Encoder]
    B --> C[Embedding]
    C --> D[LLM]
    D --> E[Embedding]
    E --> F[Speech Decoder]
    F --> G[Audio Out]
```

The model operates entirely in embedding space, preserving acoustic information that would be lost in text-based intermediate representations. No transcription step means no transcription errors.

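
To make the latency argument for the cascaded approach concrete, here is a minimal sketch of how sequential stage latencies add up. The `Stage` type, the `total_latency_ms` helper, and every per-stage figure are hypothetical values chosen for illustration; they are not measurements of Opal or of any specific STT, LLM, or TTS product.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Stage:
    name: str
    latency_ms: float  # hypothetical figure, for illustration only


def total_latency_ms(stages: Iterable[Stage]) -> float:
    """Sequential stages: the response is ready only after every stage has run."""
    return sum(stage.latency_ms for stage in stages)


# Assumed per-stage latencies for a cascaded pipeline (not real measurements).
cascade = [Stage("STT", 300.0), Stage("LLM", 400.0), Stage("TTS", 250.0)]

cascade_total = total_latency_ms(cascade)
slowest = max(cascade, key=lambda stage: stage.latency_ms)
print(f"cascade total: {cascade_total:.0f} ms (slowest stage: {slowest.name})")
```

Under these assumed numbers, speeding up one stage only shaves that stage's share off the total, which is why removing the hand-offs between stages matters more than tuning any single component.
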
The second mode returns text instead of audio:

```mermaid theme={null}
flowchart LR
    A[Audio In] --> B[Speech Encoder]
    B --> C[Embedding]
    C --> D[LLM]
    D --> E[Text Out]
```

Use this mode when speech synthesis is not required (e.g. transcription workflows) or when you want to use an external TTS provider such as **ElevenLabs** or **Cartesia** for voice generation.

## Performance Comparison

| Metric                | Opal            | Traditional Cascade     |
| --------------------- | --------------- | ----------------------- |
| First byte latency    | **Under 300ms** | 800-1500ms              |
| Turn-taking gap       | **Natural**     | Noticeable delay        |
| Interruption handling | **Native**      | Often problematic       |
| Error propagation     | **None**        | Compounds across stages |

## Key Capabilities

Opal combines speech understanding with advanced reasoning:

* **Complex instruction following**: Handles multi-step requests and nuanced instructions
* **Context retention**: Maintains conversation context across long interactions
* **Domain adaptation**: Quickly adapts to specialized terminology and workflows
* **Task completion**: Drives conversations toward defined goals while handling edge cases

Opal maintains consistent voice characteristics throughout a conversation, or it can adopt a custom voice profile. This enables branded voice experiences that match your organization's identity.

The model understands emotional cues in speech and responds appropriately:

* Detecting caller frustration, confusion, or satisfaction
* Adjusting response tone to match the situation
* Conveying empathy, urgency, or reassurance as needed

Opal supports multiple languages and accents, enabling global deployment without requiring separate models for each locale.

Opal supports streaming in both directions:

* **Input streaming**: Begins processing before the speaker finishes
* **Output streaming**: Starts speaking while still generating the response

This enables natural interruption handling and reduces perceived latency.

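
The effect of output streaming on perceived latency and interruption handling can be sketched with a plain Python generator. This is an illustrative model only: `generate_reply`, `play`, `time_to_first_audio_ms`, and the chunk timings are hypothetical names and numbers, not part of any Deepslate API.

```python
from typing import Iterator, List, Optional


def generate_reply(n_chunks: int, produced: List[int]) -> Iterator[str]:
    """Lazily yield reply chunks; `produced` records which chunks were generated."""
    for i in range(n_chunks):
        produced.append(i)
        yield f"chunk-{i}"


def play(chunks: Iterator[str], interrupted_after: Optional[int] = None) -> List[str]:
    """Play chunks as they arrive; stop pulling new ones once the caller interrupts."""
    played: List[str] = []
    while interrupted_after is None or len(played) < interrupted_after:
        chunk = next(chunks, None)
        if chunk is None:  # the reply finished before any interruption
            break
        played.append(chunk)
    return played


def time_to_first_audio_ms(n_chunks: int, chunk_ms: float, streaming: bool) -> float:
    """Time until the listener hears anything, if each chunk takes chunk_ms to generate."""
    return chunk_ms if streaming else n_chunks * chunk_ms


# Assumed values: a 5-chunk reply, 40 ms of generation time per chunk.
produced: List[int] = []
played = play(generate_reply(5, produced), interrupted_after=2)

streamed_wait = time_to_first_audio_ms(5, 40.0, streaming=True)
buffered_wait = time_to_first_audio_ms(5, 40.0, streaming=False)
print(played, produced, streamed_wait, buffered_wait)
```

Because the generator is lazy, an interruption after two chunks means the remaining three are never produced, and the listener waits for one chunk instead of the whole reply before hearing anything.
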
## Integration with Deepslate Realtime

Opal powers both **Assistants** (inbound) and **Agents** (outbound) on the Deepslate platform. When you configure an assistant or agent, you define its behavior, knowledge, and goals, and Opal handles the real-time voice interaction.

* **Assistants**: Handle inbound calls with AI-powered voice conversations
* **Agents**: Make outbound calls for proactive customer outreach