Stereo to Mono wav in Python
I am loading a wav file with the SciPy function wavfile.read(), which gives me the sample rate and the audio data.
I know that this audio data, if stereo, is stored as a two-dimensional array with one row per sample, such as
audiodata = [[left right]
             [left right]
             ...
             [left right]]
I am then using this function to create a new array of mono audio data by taking (right+left)/2:
def stereoToMono(audiodata):
    newaudiodata = []
    for i in range(len(audiodata)):
        d = (audiodata[i][0] + audiodata[i][1])/2
        newaudiodata.append(d)
    return np.array(newaudiodata, dtype='int16')
I then write this to a file using
wavfile.write(newfilename, sr, newaudiodata)
This produces a mono wav file, but the sound is dirty and has clicks etc. throughout.
What am I doing wrong?
Solution 1:[1]
First, what is the datatype of audiodata? I assume it's some fixed-width integer format and you therefore get overflow when adding the two channels. If you convert it to a floating point format before processing, it will work fine:
audiodata = audiodata.astype(float)
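To illustrate the overflow, here is a minimal sketch, assuming the samples are 16-bit integers (the sample values are made up for the demonstration). Two loud samples of 30000 sum to 60000, which does not fit in int16 and wraps around to a negative number; that kind of wrap-around would be heard as a click.

```python
import numpy as np

# Hypothetical loud samples on both channels, stored as int16.
left = np.array([30000], dtype=np.int16)
right = np.array([30000], dtype=np.int16)

# The sum 60000 exceeds the int16 maximum (32767) and wraps to -5536
# before the division happens.
wrapped = (left + right) / 2

# Converting to float first leaves enough headroom for the sum.
safe = (left.astype(float) + right.astype(float)) / 2

print(wrapped[0], safe[0])
```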
Second, don't write your Python code element by element; vectorize it:
d = (audiodata[:,0] + audiodata[:,1]) / 2
or even better
d = audiodata.sum(axis=1) / 2
This will be vastly faster than the element-by-element loop you wrote.
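Putting both points together, a complete version might look like the sketch below. The function name stereo_to_mono and the decision to round and cast back to the input dtype are my own assumptions, not something from the question.

```python
import numpy as np

def stereo_to_mono(audiodata):
    """Average the channels of an (n_samples, n_channels) array.

    Hypothetical helper: averages in float64 to avoid integer overflow,
    then rounds and casts back to the original dtype.
    """
    if audiodata.ndim == 1:
        # Already mono, nothing to do.
        return audiodata
    mono = audiodata.astype(np.float64).mean(axis=1)
    return np.round(mono).astype(audiodata.dtype)

# Made-up int16 samples, including values that would overflow if summed as int16.
stereo = np.array([[30000, 30000], [-30000, -30000], [100, -100]], dtype=np.int16)
mono = stereo_to_mono(stereo)
```

The result has the same dtype as the input, so it can be passed to wavfile.write directly.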
Solution 2:[2]
It turns out all I had to change was
(right+left)/2
to
(right/2) + (left/2)
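The reason this helps is that each halved term is at most half the int16 range, so their sum always fits. A small sketch with made-up sample values, assuming Python 3 (where / on an integer array returns floats, so a cast back to int16 is still needed before writing):

```python
import numpy as np

# Hypothetical int16 samples near the top and bottom of the range.
left = np.array([30000, -30000, 4], dtype=np.int16)
right = np.array([30000, -30000, 2], dtype=np.int16)

# Halve each channel before adding, so the sum never leaves the int16 range.
mono = (right / 2) + (left / 2)

# Cast back to 16-bit integers for writing.
mono_int16 = mono.astype(np.int16)
```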
Solution 3:[3]
After taking the mean, you have to save the data to the file as int16:
wavfile.write(newfilename, sr, np.int16(newaudiodata))
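As far as I know, wavfile.write picks the WAV sample format from the array's dtype, so float data is written as a floating-point WAV rather than 16-bit PCM. Here is a small round-trip sketch; the 440 Hz test tone, the sample rate, and the temporary file path are assumptions made for the demonstration.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

sr = 8000  # assumed sample rate for the demo
t = np.arange(sr) / sr

# One second of a 440 Hz tone as float64, scaled to half of the int16 range.
newaudiodata = 0.5 * 32767 * np.sin(2 * np.pi * 440 * t)

# Cast to int16 so the file is written as 16-bit PCM.
path = os.path.join(tempfile.mkdtemp(), "mono.wav")
wavfile.write(path, sr, np.int16(newaudiodata))

# Read the file back to check what was stored.
sr_back, data_back = wavfile.read(path)
```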
Solution 4:[4]
This should work. You take only the first channel from the stereo data (note that this discards the second channel rather than mixing the two):
audiodata = [s[0] for s in audiodata]
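The list comprehension returns a plain Python list, which would need converting back to a NumPy array before writing. A sketch of the same idea using a NumPy slice, which keeps the original dtype (the sample values are made up):

```python
import numpy as np

# Hypothetical stereo data: column 0 is left, column 1 is right.
audiodata = np.array([[1, 10], [2, 20], [3, 30]], dtype=np.int16)

# Keep only the left channel; the result is still an int16 array.
left_only = audiodata[:, 0]
```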
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | cfh |
Solution 2 | user2145312 |
Solution 3 | Mauro Gentile |
Solution 4 | user299472 |