The deepslate-pipecat package provides a DeepslateRealtimeLLMService implementation for the Pipecat framework, enabling seamless integration with Deepslate’s unified voice AI infrastructure.
This plugin lives in the deepslate-sdks monorepo. We welcome contributions — feel free to open issues or pull requests there.

Prerequisites

  • A Deepslate account with API credentials
  • Python 3.11+
  • A Pipecat-compatible transport (e.g. Daily.co, Twilio, generic WebSocket)
  • (Optional) ElevenLabs API key for server-side TTS

Installation

pip install deepslate-pipecat

Environment Variables

Set up your credentials as environment variables:
  • DEEPSLATE_VENDOR_ID (required): Your Deepslate vendor ID
  • DEEPSLATE_ORGANIZATION_ID (required): Your Deepslate organization ID
  • DEEPSLATE_API_KEY (required): Your Deepslate API key
  • ELEVENLABS_API_KEY (optional): ElevenLabs API key for server-side TTS
  • ELEVENLABS_VOICE_ID (optional): ElevenLabs voice ID
  • ELEVENLABS_MODEL_ID (optional): ElevenLabs model ID (e.g., eleven_turbo_v2)
Never expose your Deepslate or ElevenLabs API keys to clients. This plugin is for server-side use only.

Quick Start

import asyncio
import os

import aiohttp
from dotenv import load_dotenv

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.transports.daily.transport import DailyParams, DailyTransport

from deepslate.pipecat import DeepslateOptions, DeepslateRealtimeLLMService, ElevenLabsTtsConfig

load_dotenv()

async def main():
    async with aiohttp.ClientSession() as session:
        room_name = os.environ["DAILY_ROOM_URL"].split("/")[-1]
        async with session.post(
            "https://api.daily.co/v1/meeting-tokens",
            headers={"Authorization": f"Bearer {os.environ['DAILY_API_KEY']}"},
            json={"properties": {"room_name": room_name}},
        ) as r:
            token = (await r.json())["token"]

    transport = DailyTransport(
        room_url=os.environ["DAILY_ROOM_URL"],
        token=token,
        bot_name="Deepslate Bot",
        params=DailyParams(
            audio_in_enabled=True,
            audio_out_enabled=True,
            vad_enabled=False,  # VAD is handled server-side by Deepslate
        ),
    )

    llm = DeepslateRealtimeLLMService(
        options=DeepslateOptions.from_env(
            system_prompt="You are a friendly and helpful AI assistant."
        ),
        tts_config=ElevenLabsTtsConfig.from_env(),
    )

    pipeline = Pipeline([transport.input(), llm, transport.output()])
    task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))

    @transport.event_handler("on_participant_left")
    async def on_participant_left(transport, participant, reason):
        await task.cancel()

    await PipelineRunner().run(task)

if __name__ == "__main__":
    asyncio.run(main())

Configuration Reference

DeepslateOptions

The main configuration class for connecting to the Deepslate API. Use DeepslateOptions.from_env() to load credentials from environment variables, with optional keyword overrides.
  • vendor_id (str; default: env DEEPSLATE_VENDOR_ID): Your Deepslate vendor ID
  • organization_id (str; default: env DEEPSLATE_ORGANIZATION_ID): Your Deepslate organization ID
  • api_key (str; default: env DEEPSLATE_API_KEY): Your Deepslate API key
  • base_url (str; default: https://app.deepslate.eu): Base URL for the Deepslate API
  • system_prompt (str; default: "You are a helpful assistant."): System prompt for the model
  • temperature (float; default: 1.0): Sampling temperature (0.0–2.0)
  • generate_reply_timeout (float; default: 30.0): Timeout in seconds waiting for a model reply (0 = no timeout)
  • ws_url (str | None; default: None): Direct WebSocket URL override, useful for local development
  • max_retries (int; default: 3): Maximum reconnection attempts before emitting an ErrorFrame
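For example, credentials can come from the environment while individual fields are overridden inline. A sketch (the keyword names follow the parameter list above; the local ws_url value is purely illustrative):

```python
from deepslate.pipecat import DeepslateOptions

# Load credentials from the DEEPSLATE_* environment variables,
# overriding the prompt and sampling temperature.
opts = DeepslateOptions.from_env(
    system_prompt="You are a concise booking assistant.",
    temperature=0.7,
)

# Or construct explicitly, e.g. for local development against a dev stack:
opts_local = DeepslateOptions(
    vendor_id="my-vendor",
    organization_id="my-org",
    api_key="dev-key",
    ws_url="ws://localhost:8080/realtime",  # hypothetical local endpoint
)
```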
VadConfig

Pass a VadConfig to DeepslateRealtimeLLMService to tune server-side Voice Activity Detection. Disable client-side VAD on your transport, since Deepslate handles it.
  • confidence_threshold (float; default: 0.5): Minimum confidence to classify audio as speech (0.0–1.0)
  • min_volume (float; default: 0.01): Minimum volume threshold (0.0–1.0)
  • start_duration_ms (int; default: 200): Consecutive speech required to detect a turn start (ms)
  • stop_duration_ms (int; default: 500): Silence required to detect a turn end (ms)
  • backbuffer_duration_ms (int; default: 1000): Audio buffered before the detection window (ms)
from deepslate.pipecat import VadConfig, DeepslateRealtimeLLMService

llm = DeepslateRealtimeLLMService(
    options=opts,
    vad_config=VadConfig(
        confidence_threshold=0.3,
        stop_duration_ms=300,
    ),
)
HostedTtsConfig

Use a voice cloned and hosted within Deepslate, with no external TTS provider credentials required. Pass an instance to DeepslateRealtimeLLMService(tts_config=...) to enable PCM audio output.
  • voice_id (str; required): The ID of the hosted (cloned) voice to use for synthesis
  • mode (HostedTtsMode; default: HostedTtsMode.HIGH_QUALITY): Quality/latency tradeoff for synthesis
HostedTtsMode values:
  • HIGH_QUALITY: Best output quality at still relatively low latency. Recommended for most use cases (default).
  • LOW_LATENCY: Fastest possible generation; output quality may be significantly reduced.
from deepslate.pipecat import HostedTtsConfig, HostedTtsMode, DeepslateRealtimeLLMService

# Default — high quality
llm = DeepslateRealtimeLLMService(
    options=opts,
    tts_config=HostedTtsConfig(voice_id="c3dfa73f-a1ab-4aad-b48a-0e9b9fe4a69f"),
)

# Explicit low latency mode
llm = DeepslateRealtimeLLMService(
    options=opts,
    tts_config=HostedTtsConfig(
        voice_id="c3dfa73f-a1ab-4aad-b48a-0e9b9fe4a69f",
        mode=HostedTtsMode.LOW_LATENCY,
    ),
)
ElevenLabsTtsConfig

Configure server-side text-to-speech with ElevenLabs via Deepslate. Pass an instance to DeepslateRealtimeLLMService(tts_config=...) to enable PCM audio output. Use ElevenLabsTtsConfig.from_env() to create a config from environment variables.
  • api_key (str): ElevenLabs API key (env: ELEVENLABS_API_KEY)
  • voice_id (str): Voice ID (env: ELEVENLABS_VOICE_ID)
  • model_id (str | None): Model ID, e.g., eleven_turbo_v2 (env: ELEVENLABS_MODEL_ID)
  • location (ElevenLabsLocation): API endpoint region: US (default), EU, or INDIA
  • voice_settings (ElevenLabsVoiceSettingsConfig | None): Fine-grained voice control (see below)
ElevenLabsVoiceSettingsConfig gives fine-grained control over the synthesized voice:
  • stability (float | None): Voice consistency (0.0–1.0); higher = more stable
  • similarity_boost (float | None): Clarity and similarity to the original voice (0.0–1.0)
  • style (float | None): Style exaggeration (0.0–1.0)
  • use_speaker_boost (bool | None): Boost similarity to the original speaker
  • speed (float | None): Speaking speed multiplier
from deepslate.pipecat import (
    ElevenLabsTtsConfig,
    ElevenLabsVoiceSettingsConfig,
    ElevenLabsLocation,
)

tts_config = ElevenLabsTtsConfig.from_env(
    location=ElevenLabsLocation.EU,
    voice_settings=ElevenLabsVoiceSettingsConfig(
        stability=0.7,
        similarity_boost=0.85,
        speed=1.1,
    ),
)
Server-side TTS enables automatic interruption handling. When the user interrupts, Deepslate tracks exactly what was spoken and truncates the context accordingly. Without server-side TTS, the service emits LLMTextFrame for a downstream Pipecat TTS service, but this interruption context tracking will not be available.
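In that downstream-TTS mode, the wiring could look like the following sketch. It assumes Pipecat's own ElevenLabsTTSService (the import path varies across Pipecat versions) and reuses transport from the Quick Start:

```python
import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService

from deepslate.pipecat import DeepslateOptions, DeepslateRealtimeLLMService

# No tts_config: the Deepslate service emits LLMTextFrame instead of audio.
llm = DeepslateRealtimeLLMService(options=DeepslateOptions.from_env())

# A downstream Pipecat TTS service converts the text frames to speech.
tts = ElevenLabsTTSService(
    api_key=os.environ["ELEVENLABS_API_KEY"],
    voice_id=os.environ["ELEVENLABS_VOICE_ID"],
)

pipeline = Pipeline([transport.input(), llm, tts, transport.output()])
```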

Features

  • Real-time Voice Streaming: Low-latency bidirectional PCM audio streaming over WebSockets for natural conversations
  • Server-side VAD: Voice activity detection handled server-side for reliable, configurable speech detection
  • Function Calling: Full tool/function calling support using OpenAI JSON schema format with async handlers
  • ElevenLabs TTS: Server-side TTS with regional endpoints and fine-grained voice settings
  • Low Latency Mode: Hosted voice TTS supports a low-latency mode for the fastest possible response at the cost of some output quality
  • Direct Speech: Speak text directly via TTS without routing through the LLM
  • Conversation Queries: Run one-shot side-channel inference without affecting the main conversation
  • Chat History Export: Export the full structured conversation history on demand
  • Dynamic Context Injection: Inject user or system messages mid-conversation via LLMMessagesAppendFrame
  • Automatic Reconnection: Exponential-backoff reconnection with a configurable retry limit
  • Transport Agnostic: Works with any Pipecat transport: Daily.co, Twilio, generic WebSocket, and more

Session Initialized Frame

DeepslateRealtimeLLMService emits a DeepslateSessionInitializedFrame exactly once, when the WebSocket session is fully initialized and ready to accept messages.
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from deepslate.pipecat.frames import DeepslateSessionInitializedFrame, DeepslateDirectSpeechFrame

class WelcomeProcessor(FrameProcessor):
    async def process_frame(self, frame, direction):
        await self.push_frame(frame, direction)
        if isinstance(frame, DeepslateSessionInitializedFrame):
            await self.push_frame(
                DeepslateDirectSpeechFrame(text="Hello! How can I help you today?"),
                FrameDirection.DOWNSTREAM,
            )

pipeline = Pipeline([transport.input(), llm, WelcomeProcessor(), transport.output()])

Function Calling

Define tools in OpenAI JSON schema format, register async handlers on the service, and queue the tool definitions on the task; they are synced to Deepslate once the pipeline is running:
import random
from pipecat.frames.frames import LLMSetToolsFrame
from pipecat.services.llm_service import FunctionCallParams

TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "lookup_weather",
            "description": "Get the current weather for a given location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "The city to look up."}
                },
                "required": ["location"],
            },
        },
    },
]

async def lookup_weather(params: FunctionCallParams):
    result = {
        "location": params.arguments.get("location", "unknown"),
        "temperature_celsius": random.randint(10, 35),
    }
    await params.result_callback(result)

# Register the handler on the service
llm.register_function("lookup_weather", lookup_weather)

# Queue tool definitions — synced to Deepslate after the pipeline starts
await task.queue_frame(LLMSetToolsFrame(tools=TOOLS))

Dynamic Context Injection

Inject messages into the live conversation context without restarting the session. This is useful for passing user profile data, injecting tool results from external systems, or priming the model with background context.
from pipecat.frames.frames import LLMMessagesAppendFrame

# Inject a user message mid-conversation
await task.queue_frame(
    LLMMessagesAppendFrame(
        messages=[{"role": "user", "content": "My name is Alice and I prefer short answers."}]
    )
)
Use LLMMessagesUpdateFrame to resync the full context and optionally trigger an immediate model reply.
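A full resync could look like the following sketch. LLMMessagesUpdateFrame is a standard Pipecat frame; whether a model reply is triggered automatically, and by which flag, depends on your Pipecat version, so treat that part as an assumption:

```python
from pipecat.frames.frames import LLMMessagesUpdateFrame

# Replace the entire conversation context in one shot.
await task.queue_frame(
    LLMMessagesUpdateFrame(
        messages=[
            {"role": "system", "content": "You are a terse support agent."},
            {"role": "user", "content": "I'd like to change my booking."},
        ]
    )
)
```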

Direct Speech

Push a DeepslateDirectSpeechFrame to synthesize and play text directly — bypassing the LLM entirely. Useful for scripted prompts, confirmations, or fallback messages.
from deepslate.pipecat.frames import DeepslateDirectSpeechFrame

await task.queue_frame(
    DeepslateDirectSpeechFrame(
        text="Welcome back! How can I help you today?",
        include_in_history=True,  # Records as an assistant turn (default: True)
    )
)
Set include_in_history=False to speak without adding the text to the conversation context — ideal for system-level announcements.

Conversation Queries

A DeepslateConversationQueryFrame runs a one-shot inference call on a side channel. The result arrives as a DeepslateConversationQueryResultFrame and does not affect the main conversation history or trigger any audio output.
from deepslate.pipecat.frames import (
    DeepslateConversationQueryFrame,
    DeepslateConversationQueryResultFrame,
)

# Send the query
await task.queue_frame(
    DeepslateConversationQueryFrame(
        prompt="Summarize the conversation so far in one sentence.",
        instructions="Respond in plain text only, no formatting.",
    )
)

# Receive the result downstream in your pipeline
# DeepslateConversationQueryResultFrame.text contains the model's reply
This is useful for background analysis, logging summaries, or deciding on the next action without affecting the user-facing conversation.
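Consuming the result works like the session-initialized handler above: a small processor placed after the service. A sketch (QueryLogger is a hypothetical name; Pipeline, transport, and llm are as in the Quick Start):

```python
from pipecat.pipeline.pipeline import Pipeline
from pipecat.processors.frame_processor import FrameProcessor
from deepslate.pipecat.frames import DeepslateConversationQueryResultFrame

class QueryLogger(FrameProcessor):
    # Logs side-channel query results as they flow through the pipeline.
    async def process_frame(self, frame, direction):
        await self.push_frame(frame, direction)
        if isinstance(frame, DeepslateConversationQueryResultFrame):
            print(f"Query result: {frame.text}")

pipeline = Pipeline([transport.input(), llm, QueryLogger(), transport.output()])
```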

Chat History Export

Push a DeepslateExportChatHistoryFrame to request the full conversation history. The result arrives as a DeepslateChatHistoryFrame downstream in the pipeline.
from deepslate.pipecat.frames import (
    DeepslateExportChatHistoryFrame,
    DeepslateChatHistoryFrame,
)

# Request the export
await task.queue_frame(
    DeepslateExportChatHistoryFrame(
        await_pending=False,  # Set True to wait for any in-flight operations first
    )
)

# Handle the result in a downstream processor
# DeepslateChatHistoryFrame.messages is a list[ChatMessageDict]
Each ChatMessageDict has role, delivery_status, ephemeral, and a content list of typed blocks (text, input_audio, tool_call, tool_result, and more).
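For downstream processing, the exported structure can be flattened without the SDK. A sketch in plain Python, assuming text blocks carry {"type": "text", "text": ...} (shape inferred from the description above):

```python
def transcript_lines(messages):
    """Flatten exported chat history into '<role>: <text>' lines.

    `messages` is the list of message dicts from DeepslateChatHistoryFrame;
    each has a "role" and a "content" list of typed blocks. Only text
    blocks are kept; audio, tool calls, etc. are skipped.
    """
    lines = []
    for msg in messages:
        texts = [
            block.get("text", "")
            for block in msg.get("content", [])
            if block.get("type") == "text"
        ]
        if texts:
            lines.append(f"{msg['role']}: {' '.join(texts)}")
    return lines

history = [
    {"role": "user", "content": [{"type": "text", "text": "Hi there."}]},
    {"role": "assistant", "content": [{"type": "text", "text": "Hello!"}]},
    {"role": "user", "content": [{"type": "input_audio", "data": "..."}]},
]
print(transcript_lines(history))  # → ['user: Hi there.', 'assistant: Hello!']
```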

Custom Frames Reference

In addition to standard Pipecat frames, deepslate-pipecat exposes the following frames for controlling and observing Deepslate-specific behaviour.

Input Frames (push into the pipeline)

  • DeepslateExportChatHistoryFrame: Request a full chat history export. await_pending: bool, wait for in-flight ops before exporting.
  • DeepslateDirectSpeechFrame: Speak text directly via TTS, bypassing the LLM. text: str, include_in_history: bool.
  • DeepslateConversationQueryFrame: One-shot side-channel inference. prompt: str | None, instructions: str | None.

Output Frames (emitted by the service)

  • DeepslateSessionInitializedFrame: Emitted once when the session is fully initialized and ready to accept messages.
  • DeepslateChatHistoryFrame: Chat history export result. messages: list[ChatMessageDict].
  • DeepslateConversationQueryResultFrame: Side-channel query result. text: str.
  • DeepslateUserTranscriptionFrame: User speech-to-text transcription from Deepslate.
  • DeepslateModelTranscriptionFrame: Word-aligned transcription of the model's TTS audio. text: str.

Transport Examples

The Deepslate service is transport-agnostic. Swap the transport to suit your deployment.
Daily.co

from pipecat.transports.daily.transport import DailyTransport, DailyParams

transport = DailyTransport(
    room_url=daily_room_url,
    token=token,
    bot_name="My Voice Bot",
    params=DailyParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
        vad_enabled=False,  # Deepslate handles VAD
    ),
)

pipeline = Pipeline([transport.input(), llm, transport.output()])

Twilio

from pipecat.transports.services.twilio import TwilioTransport

transport = TwilioTransport(
    account_sid=twilio_account_sid,
    auth_token=twilio_auth_token,
    from_number=twilio_from_number,
)

pipeline = Pipeline([transport.input(), llm, transport.output()])

Generic WebSocket

from pipecat.transports.network.websocket import WebsocketTransport, WebsocketParams

transport = WebsocketTransport(
    host="0.0.0.0",
    port=8765,
    params=WebsocketParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
    ),
)

pipeline = Pipeline([transport.input(), llm, transport.output()])

Contributing

This plugin is open source. Visit the deepslate-sdks monorepo to:
  • Report issues
  • Submit pull requests
  • Request features

Next Steps

  • API Reference: Full message schemas and configuration options
  • Pipecat Docs: Pipecat framework documentation
  • GitHub Repository: Source code, issues, and contributions