Stream Audio via WebSocket - Web Audio

I'm so close to getting audio chat working via WebSockets. The idea of the application I'm building is to have a group voice chat working in the browser.

I'm using a socket.io server to relay this information.

The audio is transmitting fine. This code is used:

let hasHeader = false 
export function emitAudioStream(mic, sock, room) {
    console.log('beginning record')
    const recorder = new MediaRecorder(mic)
    recorder.ondataavailable = (evt) => {
        // fetch the header
        if (!hasHeader) {
            console.log('header:', evt.data)
            sock.emit('header:audio', evt.data)
            hasHeader = true
        }
        // console.log(evt.data.size)
        sock.emit('stream:audio', {room, streamData: evt.data})
    }
    recorder.start()
    console.log(`Recording begin. (State: "${recorder.state}")`)

    // ask the recorder to flush a chunk roughly 60 times per second
    setInterval(() => {
        recorder.requestData()
    }, 1e3/60)
}

There are rooms of 'participants' (connected individuals). The server handles the requests like this:

    sock.on('header:audio', (packet) => {
        console.log(`setting audio header for ${sock.id}`)
        sock.__audioHeader = packet
    })

    sock.on('stream:audio', ({room, streamData}) => {
        const participants = rooms[room]
        if (!participants) {
            console.log(`not found ${room} room`)
            return
        } 
        // create a getParticipants to handle not found
        // add flag to include current socket
        participants.forEach(participant => {
            // if (participant.id === sock.id) return 
            participant.emit('stream:audio:packet', {header: sock.__audioHeader, streamData})
        })
    })

Back on the client, where I'm trying to play the audio (and where this is all failing), it looks like this. I've likely misinterpreted the Web Audio docs. Can anyone point me in the right direction or explain why this isn't the right approach?

sck.on('stream:audio:packet', ({header, streamData}) => {
  playSound(streamData)
})

function playSound(buffer) {
  const context = new AudioContext()
  var source = context.createBufferSource()
  source.buffer = buffer
  source.connect(context.destination)
  source.start(0)
}

Another decoding attempt I've used:

    sck.on('stream:audio:packet', async ({header, streamData}) => {
        if (streamData === 'data:') return
        const b64ToBuffer = (data) => fetch(data).then(r => r.blob())
        const buff = await b64ToBuffer(streamData)
        playSound(await buff.arrayBuffer())
    })

    let context = new AudioContext()

    async function playSound(buffer) {
        try {
            const buff = await context.decodeAudioData(buffer)
            let source = context.createBufferSource()
            source.connect(context.destination)
            console.log(buff)
            source.buffer = buff
            source.start(0)
        } catch (err) {
            console.warn('error decoding data:', err)
        }
    }


Solution 1:[1]

The reason your current solution doesn't work is that a MediaRecorder is not required to emit chunks which can be decoded on their own. All the chunks need to be stitched together after stopping the MediaRecorder in order to get a valid file. Furthermore, the Web Audio API can only decode complete files with its decodeAudioData() method.
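
For illustration, a minimal sketch of the pattern that does work with decodeAudioData(): collect every chunk, stop the recorder, stitch the chunks into one Blob and decode the complete file. It assumes mic is the getUserMedia stream from the question; it doesn't give you live streaming, it just shows what decodeAudioData() expects.

    // Gather all chunks and decode once recording has stopped.
    const chunks = []
    const recorder = new MediaRecorder(mic)

    recorder.ondataavailable = (evt) => chunks.push(evt.data)

    recorder.onstop = async () => {
        const completeFile = new Blob(chunks, { type: recorder.mimeType })
        const arrayBuffer = await completeFile.arrayBuffer()
        const context = new AudioContext()
        const audioBuffer = await context.decodeAudioData(arrayBuffer)
        const source = context.createBufferSource()
        source.buffer = audioBuffer
        source.connect(context.destination)
        source.start(0)
    }

    recorder.start()
    // ...sometime later: recorder.stop()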

As said in the comments above, WebRTC is the API that is specifically made for this use case. If you want separate rooms, you could make sure that your signaling process only connects clients which belong to the same room.
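
For example, the signaling could be relayed through the existing socket.io connection and scoped to the room. This is only a rough sketch of a single peer-to-peer connection; the event names ('webrtc:offer' etc.) are made up for illustration, and the server would simply forward them to the other sockets in the same room.

    const pc = new RTCPeerConnection({ iceServers: [{ urls: 'stun:stun.l.google.com:19302' }] })

    // send the microphone track to the remote peer
    mic.getTracks().forEach(track => pc.addTrack(track, mic))

    // relay ICE candidates through socket.io, scoped to the room
    pc.onicecandidate = ({ candidate }) => {
        if (candidate) sock.emit('webrtc:candidate', { room, candidate })
    }

    // play whatever the remote peer sends
    pc.ontrack = ({ streams: [remoteStream] }) => {
        const audio = new Audio()
        audio.srcObject = remoteStream
        audio.play()
    }

    async function call() {
        const offer = await pc.createOffer()
        await pc.setLocalDescription(offer)
        sock.emit('webrtc:offer', { room, offer })
    }

    sock.on('webrtc:answer', ({ answer }) => pc.setRemoteDescription(answer))
    sock.on('webrtc:candidate', ({ candidate }) => pc.addIceCandidate(candidate))

For a group room you would need one RTCPeerConnection per remote participant (a mesh) or an SFU, but the signaling idea stays the same.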

If you want to avoid WebRTC, you could try a library that I wrote which adds WAVE support to the MediaRecorder. The library is called extendable-media-recorder. When asked to emit chunks, those chunks are likewise not valid WAVE files on their own, but decoding partial WAVE files by hand is much easier than decoding compressed files. Apart from the very first 44 bytes, which comprise the header, it's just raw PCM data.
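
A rough sketch of how that could be wired up, assuming the extendable-media-recorder and extendable-media-recorder-wav-encoder packages (check the README for the exact API):

    import { MediaRecorder, register } from 'extendable-media-recorder'
    import { connect } from 'extendable-media-recorder-wav-encoder'

    // register the WAVE encoder once, then record uncompressed audio
    await register(await connect())

    const recorder = new MediaRecorder(mic, { mimeType: 'audio/wav' })
    recorder.ondataavailable = (evt) => {
        // evt.data is a WAVE chunk; only the very first chunk contains
        // the 44-byte header, everything after that is raw PCM
        sock.emit('stream:audio', { room, streamData: evt.data })
    }
    recorder.start()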

You could also do the opposite: keep the native MediaRecorder and combine it with a custom decoder on the receiving end. If you configure the MediaRecorder to encode Opus, opus-stream-decoder should be able to decode the chunks.
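
Whichever decoder you pick, the receiving end follows the same pattern: turn each decoded chunk into an AudioBuffer and schedule it back-to-back against the AudioContext clock. A sketch, where decodeChunkToFloat32() is a placeholder (not a real API) for your PCM parsing or Opus decoding step:

    const context = new AudioContext()
    let playhead = 0

    sck.on('stream:audio:packet', async ({ streamData }) => {
        // decodeChunkToFloat32() stands in for whatever decoder you use;
        // assume it returns { samples: Float32Array, sampleRate: number }
        const { samples, sampleRate } = await decodeChunkToFloat32(streamData)

        const buffer = context.createBuffer(1, samples.length, sampleRate)
        buffer.copyToChannel(samples, 0)

        const source = context.createBufferSource()
        source.buffer = buffer
        source.connect(context.destination)

        // schedule chunks contiguously so they neither overlap nor gap
        playhead = Math.max(playhead, context.currentTime)
        source.start(playhead)
        playhead += buffer.duration
    })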

Solution 2:[2]

Use streams. You can get a stream with navigator.mediaDevices.getUserMedia(constraints) (see the MDN reference), add socket.io-stream to socket.io, and use an Audio element or Video element to play them.
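
For the playback part, a MediaStream can be attached straight to an audio element without any manual decoding. A minimal sketch (it plays the local microphone back only to show the mechanism; in a real app the stream attached to srcObject would be the remote one):

    const mic = await navigator.mediaDevices.getUserMedia({ audio: true })

    const audio = new Audio()
    audio.srcObject = mic
    audio.autoplay = true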

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

[1] Solution 1
[2] Solution 2: Mohamad the master