How to stream audio from microphone to Google Speech-to-Text in Python using WebRTC
I'm looking for documentation on transcribing streaming audio coming from WebRTC with Google Cloud Speech-to-Text. I'm using aiortc as a Python library to handle the video and audio streams coming from a client web app.
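For context, the incoming track is attached roughly like this (signalling and offer/answer handling omitted; this handler is a simplified sketch that wraps the track in the AudioTransformTrack class shown below):

import asyncio
from aiortc import RTCPeerConnection

pc = RTCPeerConnection()

@pc.on("track")
def on_track(track):
    if track.kind == "audio":
        # Wrap the browser's audio track; recv() on the wrapper only runs
        # once something consumes it (here it is simply sent back, but a
        # MediaBlackhole/MediaRecorder would work as well).
        pc.addTrack(AudioTransformTrack(track))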
Here is a snippet of the class that I'm using to process the audio data.
import librosa
from aiortc import MediaStreamTrack


class AudioTransformTrack(MediaStreamTrack):
    kind = "audio"

    def __init__(self, track):
        super().__init__()
        self.track = track

    async def recv(self):
        frame = await self.track.recv()
        # Flatten the incoming 48 kHz frame to a 1-D float array (1920 interleaved samples).
        data_np = frame.to_ndarray().astype(dtype='float32').reshape(1920, )
        # print("data_np.shape:", data_np.shape)
        # Downsample to the 16 kHz rate expected by Speech-to-Text.
        y_16k = librosa.resample(data_np, orig_sr=48000, target_sr=16000)
        # Raw LINEAR16 bytes that should go to the recognizer.
        audio_data = y_16k.astype(dtype='int16').tobytes()
        return frame
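What I can't work out is how to get audio_data from recv() into the streaming recognition API. My current idea is to push the bytes onto a queue from recv() and drain the queue from a worker thread that calls streaming_recognize. Below is a rough sketch of that idea; the names audio_queue, request_generator and run_recognizer are placeholders of mine, and I haven't verified the channel/mono handling:

import queue
import threading

from google.cloud import speech

audio_queue = queue.Queue()  # raw LINEAR16 chunks produced in recv()

def request_generator():
    # Drain the queue and wrap each chunk in a streaming request.
    while True:
        chunk = audio_queue.get()
        if chunk is None:  # sentinel to end the stream
            return
        yield speech.StreamingRecognizeRequest(audio_content=chunk)

def run_recognizer():
    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
    )
    streaming_config = speech.StreamingRecognitionConfig(
        config=config, interim_results=True
    )
    responses = client.streaming_recognize(
        config=streaming_config, requests=request_generator()
    )
    for response in responses:
        for result in response.results:
            print(result.alternatives[0].transcript)

# The client is blocking, so run it off the asyncio event loop.
threading.Thread(target=run_recognizer, daemon=True).start()

In recv() I would then call audio_queue.put(audio_data) instead of discarding it, but I'm not sure this is the right way to bridge the asyncio-based track with the blocking Speech-to-Text client, so any pointers to documentation or examples would be appreciated.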
Source: Stack Overflow, licensed under CC BY-SA 3.0.