How to convert amplitude to dB in Python using librosa?
I have a few closely related questions. The main problem is converting the amplitude of an audio file to the dB scale. I am doing it as shown below, but I am not sure it is correct:
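import numpy as np
import librosa

# Load the audio as a 1-D array of time-domain samples
y, sr = librosa.load('audio.wav')
# Magnitude spectrogram, shape (frequency bins, frames)
S = np.abs(librosa.stft(y))
# dB relative to the max, median, and min magnitude
db_max = librosa.amplitude_to_db(S, ref=np.max)
db_median = librosa.amplitude_to_db(S, ref=np.median)
db_min = librosa.amplitude_to_db(S, ref=np.min)
# Average over frequency bins: one value per frame
db_max_AVG = np.mean(db_max, axis=0)
db_median_AVG = np.mean(db_median, axis=0)
db_min_AVG = np.mean(db_min, axis=0)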
My question is how I can convert 'y' to the dB scale. Isn't 'y' the amplitude? Also, 'y' and 'db_max_AVG' do not have the same shape: 'db_max_AVG' has 9137 elements, while 'y' has 4678128. Another question: my audio file is 3 minutes and 32 seconds long, and the shape of 'y' is:
print(y.shape)
(4678128,)
I do not know what this number represents because it obviously does not represent milliseconds or microseconds. Below you can see two plots of 'y' using different methods:
plt.plot(y)
plt.show()
librosa.display.waveplot(y, sr=22050, x_axis='time')
Solution 1:[1]
If you just want to convert time-domain amplitude readings from linear values in the range -1 to 1 to dB, this will do it:
import numpy as np
amps = [1, 0.5, 0.25, 0]
dbs = 20 * np.log10(np.abs(amps))
print(amps, 'in dB', dbs)
Should output:
[1, 0.5, 0.25, 0] in dB [  0.          -6.02059991 -12.04119983         -inf]
Note that maximum amplitude (1) goes to 0 dB, half amplitude (0.5) goes to about -6 dB, and quarter amplitude (0.25) goes to about -12 dB.
The zero amplitude makes NumPy emit a "divide by zero encountered in log10" RuntimeWarning and return -inf, because the dB scale cannot cope with silence :)
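If the -inf for silent samples is a problem, one common workaround is to clamp very small amplitudes to a floor before taking the logarithm. The following is a minimal sketch, not part of the original answer; the helper name `amplitude_to_dbfs` and the -120 dB floor are illustrative assumptions.

```python
import numpy as np

def amplitude_to_dbfs(samples, floor_db=-120.0):
    """Convert linear amplitudes (full scale = 1.0) to dB, clamping silence.

    Hypothetical helper: the name and the default floor are assumptions.
    """
    magnitudes = np.abs(np.asarray(samples, dtype=float))
    # Linear amplitude corresponding to floor_db; quieter values are raised to it
    floor_amp = 10.0 ** (floor_db / 20.0)
    return 20.0 * np.log10(np.maximum(magnitudes, floor_amp))

amps = [1, 0.5, 0.25, 0]
dbs = amplitude_to_dbfs(amps)
print(amps, 'in dB', dbs)
```

With this floor, the zero amplitude maps to -120 dB instead of -inf, and no warning is raised. librosa.amplitude_to_db guards against the same problem through its amin and top_db parameters.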
Here is a reference to a 1971 Audio Engineering Society paper for the well-known 20 * log10(amp) equation:
https://www.aes.org/e-lib/browse.cfm?elib=2157 (see equation 8)
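On the side questions about shapes: 'y' holds one value per audio sample, so its length is the duration multiplied by the sample rate, while 'db_max_AVG' holds one value per STFT frame. The arithmetic below is a rough sketch using the numbers from the question. It assumes the defaults that apply when no arguments are passed: sr=22050 for librosa.load, and hop_length=512 with centered frames for librosa.stft.

```python
n_samples = 4678128  # y.shape[0] reported in the question
sr = 22050           # assumed: librosa.load default sample rate
hop_length = 512     # assumed: librosa.stft default (n_fft=2048, n_fft // 4)

# Each element of y is one sample, so length / sample rate = seconds
duration_s = n_samples / sr
minutes, seconds = divmod(duration_s, 60)
print(f"{duration_s:.2f} s is {int(minutes)} min {seconds:.2f} s")

# With centered frames (assumed default), the STFT yields one frame
# per hop, plus one
n_frames = 1 + n_samples // hop_length
print(n_frames)
```

Under those assumptions, the duration comes out at about 212.16 seconds (3 minutes 32 seconds) and the frame count at 9137, which matches the sizes reported in the question. To get dB values with the same shape as 'y', apply the 20 * log10 formula to 'y' directly rather than to the spectrogram.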
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |