<Stream>

The <Stream> instruction sends the raw audio of a running phone call over a WebSocket connection to a specified URL, in near real time. The audio frames are base64 encoded and embedded in a JSON string, together with other information such as a sequence number and a timestamp. The feature can be used with speech-to-text systems and similar real-time audio services.

Attributes

An example of how to use <Stream>:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Start>
    <Stream url="wss://your-application.com/audiostream" />
  </Start>
</Response>

This cXML instructs SignalWire to make a copy of the audio frames of the current call and send them in near real time over WebSocket to wss://your-application.com/audiostream.

<Stream> starts the audio stream asynchronously and continues with the next cXML instruction at once. If there is no further instruction, SignalWire disconnects the call.

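The same kind of document can be generated with the helper libraries. The examples below are, in order, JavaScript, C#, Python, and Ruby: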

const { RestClient } = require("@signalwire/compatibility-api");
const response = new RestClient.LaML.VoiceResponse();

const start = response.start();
start.stream({
  name: "Example Audio Stream",
  url: "wss://your-application.com/audiostream",
});

console.log(response.toString());

using System;
using Twilio.TwiML;
using Twilio.TwiML.Voice;


class Example
{
    static void Main()
    {
        var response = new VoiceResponse();
        var start = new Start();
        start.Stream(name: "Example Audio Stream", url: "wss://your-application.com/audiostream");
        response.Append(start);

        Console.WriteLine(response.ToString());
    }
}

from twilio.twiml.voice_response import Parameter, VoiceResponse, Start, Stream

response = VoiceResponse()
start = Start()
stream = Stream(url='wss://your-application.com/audiostream')
stream.parameter(name='FirstName', value='Jane')
stream.parameter(name='LastName', value='Doe')
start.append(stream)
response.append(start)

print(response)

require 'signalwire/sdk'

response = Signalwire::Sdk::VoiceResponse.new
response.start do |start|
  start.stream(url: 'wss://your-application.com/audiostream') do |stream|
    stream.parameter(name: 'FirstName', value: 'Jane')
    stream.parameter(name: 'LastName', value: 'Doe')
  end
end

puts response

Attribute	Description
`url`	Absolute or relative URL. A WebSocket connection is established to this URL and audio starts flowing to the WebSocket server. The only supported protocol is `wss`; for security reasons, `ws` is NOT supported.
`name` optional	Unique name for the Stream, per Call. It is used to stop a Stream by name.
`track` optional	One of `inbound_track`, `outbound_track`, or `both_tracks`. Defaults to `inbound_track`. For `both_tracks`, both `inbound_track` and `outbound_track` events are sent.
`statusCallback` optional	Absolute or relative URL. SignalWire makes an HTTP GET or POST request to this URL when a Stream is started, is stopped, or encounters an error.
`statusCallbackMethod` optional	GET or POST. The type of HTTP request to use when requesting the statusCallback. Default is POST.
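
The attributes above can also be assembled without a helper library. The following is a minimal sketch using only the Python standard library; the function name `build_start_stream` and its keyword arguments are illustrative, not part of any SignalWire SDK. It rejects `ws` URLs and unknown `track` values, following the table above.

```python
import xml.etree.ElementTree as ET

# Track values listed in the attribute table.
VALID_TRACKS = {"inbound_track", "outbound_track", "both_tracks"}


def build_start_stream(url, name=None, track=None,
                       status_callback=None, status_callback_method=None):
    """Return a cXML string with <Start><Stream .../></Start> (hypothetical helper)."""
    if url.lower().startswith("ws://"):
        raise ValueError("only wss is supported, ws is not")
    if track is not None and track not in VALID_TRACKS:
        raise ValueError(f"unknown track: {track!r}")

    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")

    # Only emit the optional attributes that were actually provided.
    attributes = {"url": url}
    optional = {
        "name": name,
        "track": track,
        "statusCallback": status_callback,
        "statusCallbackMethod": status_callback_method,
    }
    attributes.update({k: v for k, v in optional.items() if v is not None})
    ET.SubElement(start, "Stream", attributes)
    return ET.tostring(response, encoding="unicode")


print(build_start_stream("wss://your-application.com/audiostream",
                         name="mystream", track="both_tracks"))
```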

`StatusCallback` Parameters

For a statusCallback, SignalWire will send a request with the following parameters:

Parameter
`AccountSid` string	The unique ID of the Account this call is associated with.
`CallSid` string	A unique identifier for the call. May be used to later retrieve this call from the REST API.
`StreamSid` string	The unique identifier for this Stream.
`StreamName` string	If defined, this is the unique name of the Stream. Defaults to the StreamSid.
`StreamEvent` string	One of `stream-started`, `stream-stopped`, or `stream-error`.
`StreamError` string	If an error has occurred, this will contain a detailed error message.
`Timestamp` string	The time of the event in ISO 8601 format.
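
A status callback handler might read these parameters as sketched below. This page does not state how the parameters are encoded, so the sketch assumes a form-encoded POST body; the function name `parse_status_callback` and the sample values are made up for illustration.

```python
from urllib.parse import parse_qs

# Event names listed in the parameter table.
STREAM_EVENTS = {"stream-started", "stream-stopped", "stream-error"}


def parse_status_callback(body):
    """Parse a form-encoded callback body into a flat dict (assumed encoding)."""
    fields = {key: values[0] for key, values in parse_qs(body).items()}
    if fields.get("StreamEvent") not in STREAM_EVENTS:
        raise ValueError(f"unexpected StreamEvent: {fields.get('StreamEvent')!r}")
    return fields


# Hypothetical request body, for illustration only.
example_body = (
    "AccountSid=123abc"
    "&CallSid=a30d16a5-0368-4104-afbf-14247e76a63d"
    "&StreamSid=c0c7d59b-df06-435e-afbc-9217ce318390"
    "&StreamName=mystream"
    "&StreamEvent=stream-started"
)
fields = parse_status_callback(example_body)
print(fields["StreamName"], fields["StreamEvent"])
```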

Custom Parameters

To pass parameters to the WebSocket server, include additional key-value pairs using the nested <Parameter> cXML noun. These parameters are added to the Start message as JSON.

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Start>
    <Stream url="wss://your-application.com/audiostream">
      <Parameter name="Cookie" value="948f9938-299a-d43e-0df4-af3a7eccb0ac" />
      <Parameter name="Type" value="SIP" />
    </Stream>
  </Start>
</Response>
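
When the key-value pairs come from application data, the nested <Parameter> nouns can be generated from a dictionary. This is a sketch with the Python standard library only; `stream_with_parameters` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def stream_with_parameters(url, parameters):
    """Build <Start><Stream> cXML with one <Parameter> per dictionary entry."""
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    stream = ET.SubElement(start, "Stream", {"url": url})
    for name, value in parameters.items():
        # Attribute values must be strings in ElementTree.
        ET.SubElement(stream, "Parameter", {"name": name, "value": str(value)})
    return ET.tostring(response, encoding="unicode")


print(stream_with_parameters(
    "wss://your-application.com/audiostream",
    {"Cookie": "948f9938-299a-d43e-0df4-af3a7eccb0ac", "Type": "SIP"},
))
```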

Stopping a Stream

A stream can be stopped at any time by name. For instance, if you name the Stream "mystream", you can later use that unique name to stop it.

<Start>
    <Stream name="mystream" url="wss://mystream.ngrok.io/audiostream" />
</Start>

<Stop>
    <Stream name="mystream" />
</Stop>

Bidirectional Stream

The <Stream> instruction can also carry audio into the call. In this case, the stream must be bidirectional. The external service (e.g., an AI agent) is then able to both hear the call and play audio into it.

To initialize a bidirectional stream, wrap the <Stream> instruction in <Connect> instead of <Start>:

<Connect>
    <Stream url="wss://mystream.ngrok.io/audiostream" />
</Connect>

WebSocket Messages

Five types of events occur during the Stream's life cycle, each represented by a WebSocket message: Connected, Start, Media, DTMF, and Stop. Each message is a JSON string, and the `event` property of the JSON object identifies the type of event.
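
On the receiving side, each incoming text frame can be routed on its `event` property. The sketch below uses only the standard `json` module and leaves the WebSocket server itself out; `dispatch` and the handler functions are illustrative names.

```python
import json


def dispatch(raw_message, handlers):
    """Decode one JSON message and call the handler registered for its event."""
    message = json.loads(raw_message)
    handler = handlers.get(message.get("event"))
    if handler is None:
        return None  # Ignore event types this application does not handle.
    return handler(message)


seen = []
handlers = {
    "connected": lambda m: seen.append(("connected", m["protocol"])),
    "stop": lambda m: seen.append(("stop", m["sequenceNumber"])),
}

dispatch('{"event": "connected", "protocol": "Call", "version": "0.2.0"}', handlers)
dispatch('{"event": "stop", "sequenceNumber": "5"}', handlers)
print(seen)
```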

Connected Message

The first message sent once a WebSocket connection is established is the Connected event. This message describes the protocol to expect in the following messages.

Event
event	The string value of "connected"
protocol	Defines the protocol for the lifetime of the WebSocket connection, e.g. "Call".
version	Semantic version of the protocol.

Example Connected Message

{
  "event": "connected",
  "protocol": "Call",
  "version": "0.2.0"
}

Start Message

This message contains important information about the Stream and is sent immediately after the Connected message. It is only sent once at the start of the Stream.

Event
event	The string value of `start`.
sequenceNumber	Number used to keep track of message sending order. First message starts with number "1" and then is incremented.
start	An object containing Stream metadata.
start.streamSid	The unique identifier of the Stream.
start.accountSid	The Account identifier that created the Stream.
start.callSid	The Call identifier from where the Stream was started.
start.tracks	An array of values that indicates what media flows to expect in subsequent messages. Values are one of "inbound" or "outbound" or both.
start.customParameters	An object that represents the Custom Parameters that were set when defining the Stream.
start.mediaFormat	An object containing the format of the payload in the Media Messages.
start.mediaFormat.encoding	The encoding of the data in the upcoming payload. Default is "audio/x-mulaw".
start.mediaFormat.sampleRate	The Sample Rate in Hertz of the upcoming audio data. Default value is 8000, which is the rate of PCMU.
start.mediaFormat.channels	The number of channels in the input audio data. Default value is 1. For `both_tracks` it will be 2.

Example Start Message

{
  "event": "start",
  "sequenceNumber": "2",
  "start": {
    "streamSid": "c0c7d59b-df06-435e-afbc-9217ce318390",
    "accountSid": "123abc",
    "callSid": "a30d16a5-0368-4104-afbf-14247e76a63d",
    "tracks": ["inbound", "outbound"],
    "customParameters": {
      "FirstName": "Jane",
      "LastName": "Doe",
      "RemoteParty": "Bob"
    },
    "mediaFormat": {
      "encoding": "audio/x-mulaw",
      "sampleRate": 8000,
      "channels": 1
    }
  }
}
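
The Start message is the natural place to capture per-stream state. The following sketch parses it into a small dataclass, falling back to the documented defaults for `mediaFormat`; `StreamInfo` and `parse_start` are hypothetical names.

```python
import json
from dataclasses import dataclass, field


@dataclass
class StreamInfo:
    stream_sid: str
    call_sid: str
    tracks: list
    encoding: str
    sample_rate: int
    channels: int
    custom_parameters: dict = field(default_factory=dict)


def parse_start(raw_message):
    """Extract Stream metadata from a Start message."""
    message = json.loads(raw_message)
    if message.get("event") != "start":
        raise ValueError("not a start message")
    start = message["start"]
    media_format = start.get("mediaFormat", {})
    return StreamInfo(
        stream_sid=start["streamSid"],
        call_sid=start["callSid"],
        tracks=list(start.get("tracks", [])),
        # Defaults below are the ones documented in the table above.
        encoding=media_format.get("encoding", "audio/x-mulaw"),
        sample_rate=int(media_format.get("sampleRate", 8000)),
        channels=int(media_format.get("channels", 1)),
        custom_parameters=dict(start.get("customParameters", {})),
    )


raw = json.dumps({
    "event": "start",
    "sequenceNumber": "2",
    "start": {
        "streamSid": "c0c7d59b-df06-435e-afbc-9217ce318390",
        "accountSid": "123abc",
        "callSid": "a30d16a5-0368-4104-afbf-14247e76a63d",
        "tracks": ["inbound", "outbound"],
        "customParameters": {"FirstName": "Jane", "LastName": "Doe"},
        "mediaFormat": {"encoding": "audio/x-mulaw", "sampleRate": 8000, "channels": 1},
    },
})
info = parse_start(raw)
print(info.stream_sid, info.sample_rate, info.custom_parameters["FirstName"])
```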

Media Message

This message type encapsulates the raw audio data.

Event
event	The string value of `media`.
sequenceNumber	Number used to keep track of message sending order. First message starts with number "1" and then is incremented for each message.
media	An object containing media metadata and payload.
media.track	One of the strings `inbound` or `outbound`.
media.chunk	The chunk for the message. The first message will begin with number "1" and increment with each subsequent message.
media.timestamp	Presentation timestamp, in milliseconds from the start of the stream.
media.payload	Raw audio encoded in base64.

Example Media Messages

Outbound

{
  "event": "media",
  "sequenceNumber": "3",
  "media": {
    "track": "outbound",
    "chunk": "1",
    "payload": "iY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//jw=="
  }
}

Inbound

{
  "event": "media",
  "sequenceNumber": "4",
  "media": {
    "track": "inbound",
    "chunk": "1",
    "timestamp": "5",
    "payload": "/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JDw=="
  }
}
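
To work with the audio itself, the payload has to be base64-decoded and, for the default `audio/x-mulaw` encoding, expanded to linear PCM. The sketch below assumes the documented defaults (mu-law, 8000 Hz, one channel) and implements the standard G.711 mu-law expansion by hand so that it needs only the standard library; the function names are illustrative.

```python
import base64
import json


def ulaw_to_pcm16(byte):
    """Expand one G.711 mu-law byte to a signed 16-bit linear sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def decode_media(raw_message):
    """Return (track, list of PCM samples) for one Media message."""
    media = json.loads(raw_message)["media"]
    audio = base64.b64decode(media["payload"])
    return media["track"], [ulaw_to_pcm16(b) for b in audio]


def duration_ms(sample_count, sample_rate=8000):
    """Duration of a chunk; mu-law carries one sample per byte."""
    return sample_count * 1000 / sample_rate


# Four hand-picked mu-law bytes, not real call audio.
payload = base64.b64encode(bytes([0xFF, 0x7F, 0x00, 0x80])).decode("ascii")
raw = json.dumps({
    "event": "media",
    "sequenceNumber": "4",
    "media": {"track": "inbound", "chunk": "1", "timestamp": "5", "payload": payload},
})
track, samples = decode_media(raw)
print(track, samples, duration_ms(len(samples)))
```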

Stop Message

A stop message is sent when either the Stream is stopped or the Call has ended.

Example Stop Message

{
  "event": "stop",
  "sequenceNumber": "5"
}

Event
event	The string value of `stop`.
sequenceNumber	Number used to keep track of message sending order. First message starts with number "1" and then is incremented for each message.

DTMF Message

A DTMF message will be sent when the Stream receives a DTMF tone.

Event
event	The string value of `dtmf`.
sequence_number	Number, as a string, used to keep track of message-sending order. The first message starts with "1" and then is incremented for each message.
streamSid	The unique identifier of the stream as a string.
dtmf	An object containing the details of the detected DTMF.
dtmf.duration	The duration of the DTMF in milliseconds.
dtmf.digit	The digit, as a string, that corresponds to the DTMF.

Example DTMF Message

{
  "event": "dtmf",
  "sequence_number": "1",
  "streamSid": "c0c7d59b-df06-435e-afbc-9217ce318390",
  "dtmf": {
    "duration": 2700,
    "digit": "8"
  }
}
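
Note that the DTMF message, as documented above, uses `sequence_number` while the other messages use `sequenceNumber`. A handler can read either key and accumulate the digits, as in this sketch (the helper names are hypothetical):

```python
import json


def sequence_of(message):
    """Return the sequence number as an int, whichever key the message uses."""
    value = message.get("sequenceNumber", message.get("sequence_number"))
    return int(value) if value is not None else None


def collect_digit(raw_message, digits=""):
    """Append the digit of a DTMF message to the digits collected so far."""
    message = json.loads(raw_message)
    if message.get("event") != "dtmf":
        return digits
    return digits + message["dtmf"]["digit"]


raw = json.dumps({
    "event": "dtmf",
    "sequence_number": "1",
    "streamSid": "c0c7d59b-df06-435e-afbc-9217ce318390",
    "dtmf": {"duration": 2700, "digit": "8"},
})
digits = collect_digit(raw, "12")
print(digits, sequence_of(json.loads(raw)))
```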

Clear Message

Send the clear event message to interrupt audio that has already been sent in media event messages. This empties all buffered audio.

Event
event	The string value of `clear`.
streamSid	The unique identifier of the stream as a string.

Example Clear Message

{
  "event": "clear",
  "streamSid": "c0c7d59b-df06-435e-afbc-9217ce318390"
}
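
Building the message is a one-liner with the standard `json` module. The sketch below only produces the JSON string; how it is written to the WebSocket depends on the server library in use, which is not shown here. `clear_message` is an illustrative name.

```python
import json


def clear_message(stream_sid):
    """Return the JSON text of a clear event for the given stream."""
    return json.dumps({"event": "clear", "streamSid": stream_sid})


text = clear_message("c0c7d59b-df06-435e-afbc-9217ce318390")
print(text)
```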

Notes on Usage

The url does not support query string parameters. To pass custom key-value pairs to the WebSocket server, use Custom Parameters instead.
There is a one-to-one mapping between a stream and a WebSocket connection, so at most one call is streamed over a single WebSocket connection. The messages carry enough information for you to handle multiple inbound connections and manage the association between the unique stream identifier (StreamSid) and the connection.
On any given call there are inbound and outbound tracks: inbound represents the audio SignalWire receives from the call, and outbound represents the audio generated by SignalWire for the call.
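
One way to manage the association between connections and streams is a small registry that learns the stream identifier from the Start message and forgets it on Stop. This is a sketch under the assumption that the server assigns each WebSocket connection some identifier of its own (`connection_id` here); the class name `StreamRegistry` is hypothetical.

```python
import json


class StreamRegistry:
    """Tracks which streamSid belongs to which WebSocket connection."""

    def __init__(self):
        self._stream_by_connection = {}

    def on_message(self, connection_id, raw_message):
        """Update the mapping and return the streamSid for this connection."""
        message = json.loads(raw_message)
        event = message.get("event")
        if event == "start":
            self._stream_by_connection[connection_id] = message["start"]["streamSid"]
        elif event == "stop":
            # The stream is over; drop the association.
            return self._stream_by_connection.pop(connection_id, None)
        return self._stream_by_connection.get(connection_id)

    def connection_for(self, stream_sid):
        """Reverse lookup: find the connection carrying a given stream."""
        for connection_id, sid in self._stream_by_connection.items():
            if sid == stream_sid:
                return connection_id
        return None


registry = StreamRegistry()
sid = "c0c7d59b-df06-435e-afbc-9217ce318390"
registry.on_message("conn-1", json.dumps({"event": "start", "start": {"streamSid": sid}}))
print(registry.connection_for(sid))
```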
