Media Streams
A media stream begins when MiniVoice executes start_stream in the action list returned by your answer_url. The same call flow can later execute stop_stream to end the stream.
Streaming is controlled by JSON actions returned from answer_url. There is no separate customer REST endpoint for starting a stream. MiniVoice validates the action payload before execution: start_stream requires a secure wss URL, and track may be inbound, outbound, or both.
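As a minimal sketch of how an answer_url handler might assemble this action list, the Python helper below builds the JSON body and applies the two rules described above before returning it. The function name build_start_stream_response and the default track of both are assumptions made for illustration. They are not part of MiniVoice.

```python
import json

# Track values named in this section.
ALLOWED_TRACKS = {"inbound", "outbound", "both"}


def build_start_stream_response(stream_url: str, track: str = "both") -> str:
    """Return a JSON body that an answer_url handler could send back.

    Hypothetical helper: the local checks mirror the documented rules so that
    obvious mistakes are caught before the response leaves your application.
    """
    if not stream_url.startswith("wss://"):
        raise ValueError("stream_url must use the wss scheme")
    if track not in ALLOWED_TRACKS:
        raise ValueError(f"track must be one of {sorted(ALLOWED_TRACKS)}")
    action = {"type": "start_stream", "stream_url": stream_url, "track": track}
    return json.dumps({"actions": [action]})
```

Calling build_start_stream_response("wss://media.example.com/minivoice") produces an action list with one start_stream action and track set to both.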
Real Request
{
  "actions": [
    {
      "type": "start_stream",
      "stream_url": "wss://media.example.com/minivoice",
      "track": "both"
    }
  ]
}
Real Response
{
  "result": "stream action accepted by the call flow"
}
Stop Stream Example
{
  "actions": [
    {
      "type": "stop_stream",
      "stream_url": "wss://media.example.com/minivoice",
      "track": "both"
    }
  ]
}
Real Webhook Example
{
  "event": "call.completed",
  "created_at": "2026-05-29T12:10:00Z",
  "call": {
    "id": "call_123",
    "customer_id": "cust_123",
    "application_id": "app_123",
    "status": "completed",
    "direction": "outbound",
    "from": "+15551230001",
    "to": "+15551230002"
  },
  "variables": {},
  "data": {}
}
Common Use Case
Use streaming when your application needs live audio during the call, such as real-time transcription, agent assist, or live monitoring. Keep stream processing separate from webhook processing so a stream receiver problem does not block normal call lifecycle handling.
Common Failure Case
{
  "actions": [
    {
      "type": "start_stream",
      "stream_url": "https://media.example.com/minivoice",
      "track": "speaker"
    }
  ]
}
This action is invalid for two reasons: stream_url uses https instead of wss, and track is set to speaker, which is not one of the supported values (inbound, outbound, both). Log answer_url responses during testing so malformed actions can be corrected quickly.
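A pre-flight check in your own test suite can catch both mistakes before MiniVoice sees them. The Python sketch below is one way to do that. The function name find_stream_action_problems is hypothetical, and the sketch assumes that a missing track is acceptable, which this section does not state either way.

```python
from urllib.parse import urlsplit

# Track values named in this section.
SUPPORTED_TRACKS = ("inbound", "outbound", "both")


def find_stream_action_problems(payload: dict) -> list[str]:
    """Return human-readable problems found in start_stream actions.

    Hypothetical helper for local testing. It does not reproduce
    MiniVoice's own validation, only the two rules described here.
    """
    problems = []
    for index, action in enumerate(payload.get("actions", [])):
        if action.get("type") != "start_stream":
            continue
        url = str(action.get("stream_url") or "")
        if urlsplit(url).scheme != "wss":
            problems.append(f"actions[{index}]: stream_url must use wss, got {url!r}")
        # Assumption: track is only checked when it is present.
        track = action.get("track")
        if track is not None and track not in SUPPORTED_TRACKS:
            problems.append(f"actions[{index}]: unsupported track {track!r}")
    return problems
```

Run against the payload in this failure case, the helper reports one problem for the https URL and one for the speaker track. Writing the returned list to the answer_url log makes a malformed action easy to spot.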
Receiver Design
Design the WebSocket receiver as a real-time service with simple responsibilities: accept the connection, authenticate the source if your environment requires it, process or forward media quickly, and close resources when the stream ends. Keep expensive analysis, database writes, and third-party API calls out of the hot path when possible.
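One way to keep slow work out of the hot path is to hand each media frame to a bounded queue and let a separate worker do the analysis. The Python sketch below shows that pattern with asyncio. It is transport-agnostic: frames stands in for whatever async iterator your WebSocket library exposes, and the names receive_stream, analyze_frames, and run_demo are illustrative assumptions. The MiniVoice frame format is not described in this section, so frames are treated as opaque bytes.

```python
import asyncio
from collections.abc import AsyncIterator, Awaitable, Callable


async def receive_stream(frames: AsyncIterator[bytes], queue: asyncio.Queue) -> int:
    """Hot path: pass each frame to the queue without waiting on analysis."""
    accepted = 0
    try:
        async for frame in frames:
            try:
                queue.put_nowait(frame)
                accepted += 1
            except asyncio.QueueFull:
                # Drop the frame rather than stall the live stream.
                pass
    finally:
        # Sentinel tells the worker that the stream has ended.
        await queue.put(None)
    return accepted


async def analyze_frames(
    queue: asyncio.Queue, handle: Callable[[bytes], Awaitable[None]]
) -> None:
    """Slow path: database writes or third-party calls would happen here."""
    while True:
        frame = await queue.get()
        if frame is None:
            break
        await handle(frame)


async def run_demo(chunks: list[bytes]) -> tuple[int, list[bytes]]:
    """Drive both halves with fake frames instead of a real WebSocket."""
    processed: list[bytes] = []

    async def fake_frames() -> AsyncIterator[bytes]:
        for chunk in chunks:
            yield chunk

    async def slow_handler(frame: bytes) -> None:
        await asyncio.sleep(0)  # stands in for expensive work
        processed.append(frame)

    queue: asyncio.Queue = asyncio.Queue(maxsize=100)
    worker = asyncio.create_task(analyze_frames(queue, slow_handler))
    accepted = await receive_stream(fake_frames(), queue)
    await worker
    return accepted, processed
```

The bounded queue is a deliberate choice in this sketch: when the worker falls behind, frames are dropped instead of accumulating in memory. A receiver that cannot tolerate loss would need a different policy.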
Your application should keep separate logs for the answer_url response, the stream receiver, and MiniVoice webhooks. Those logs answer different questions. The answer_url log shows whether the action was valid. The receiver log shows whether media arrived. The webhook log shows the final call result.
Track Choice
Choose the narrowest track that solves the problem. Use inbound for caller-only analysis, outbound for agent audio, and both when the receiver needs the full conversation.
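The rule above can be written as a small decision function. This Python sketch is illustrative only. The name choose_track and its two boolean parameters are assumptions, not part of MiniVoice.

```python
def choose_track(needs_caller_audio: bool, needs_agent_audio: bool) -> str:
    """Pick the narrowest track value that covers the audio the receiver needs."""
    if needs_caller_audio and needs_agent_audio:
        return "both"
    if needs_caller_audio:
        return "inbound"
    if needs_agent_audio:
        return "outbound"
    # Nothing to stream, so the caller should not return start_stream at all.
    raise ValueError("no audio is needed; do not start a stream")
```

For example, a transcription service that only scores what the caller says would call choose_track(True, False) and stream inbound.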
Testing
Use a WebSocket receiver that logs connection open, media received, disconnect, and errors. Place a test call, return start_stream, speak for a few seconds, then stop the stream or end the call. Confirm that call.completed still arrives even if the stream receiver disconnects.
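A test receiver mainly needs to record the four events listed above. The Python sketch below counts and logs them for any async source of frames. The name log_stream_events is an assumption, and frames stands in for the message iterator of whichever WebSocket library you use.

```python
import asyncio
import logging
from collections.abc import AsyncIterator

log = logging.getLogger("stream_receiver_test")


async def log_stream_events(frames: AsyncIterator[bytes]) -> dict:
    """Log connection open, media received, errors, and disconnect for one stream."""
    counts = {"open": 1, "media": 0, "disconnect": 0, "error": 0}
    log.info("connection open")
    try:
        async for frame in frames:
            counts["media"] += 1
            log.debug("media received: %d bytes", len(frame))
    except Exception:
        # Record the failure but let the test continue to the disconnect step.
        counts["error"] += 1
        log.exception("stream error")
    finally:
        counts["disconnect"] += 1
        log.info("disconnect after %d media frames", counts["media"])
    return counts
```

After a test call, compare the media count with how long you spoke, then check the webhook log separately for call.completed.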