Classify audio data based on thresholding in R
I have a number of audio files (~5 minutes long). All I need to do is detect whether someone is speaking or not. There is only one speaker in each file, but there is a static noise signal. It is consistent, but it means the baseline amplitude of the audio data is not 0. I am trying to do this in R.
I am considering whether it would be better to (a) estimate the dB level of the static noise and use threshold detection, (b) classify directly on amplitude, or (c) use a low-pass filter to remove the noise altogether and then assume a baseline of 0.
Here's what I've set up so far:
library(seewave)
library(tuneR)
library(bioacoustics)
wave <- read_audio("~/Desktop/68_merge.wav") # read audio data
env(wave, fastdisp = TRUE) # plot the amplitude envelope
# One method is to use threshold_detection(), but I don't know how to identify the appropriate threshold for my data...
out <- threshold_detection(wave, threshold = 5) # dB threshold; unsure what value to set
# The other method would be to classify directly on the amplitude envelope
wave.dat <- env(wave, norm = TRUE, plot = FALSE)
time <- ...
speaking <- ifelse(wave.dat > ..., 1, 0)
df.speaking <- data.frame(time = time, speaking = speaking)
# The last method would be to apply a filter, e.g. a low-pass that removes everything above 1500 Hz
b <- ffilter(wave, to = 1500)
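One data-driven way to pick the threshold for the second method (my assumption, not something established in the post) is to take a high quantile of the envelope over a stretch known to contain only the static noise, plus some headroom. A minimal sketch in base R, with a synthetic envelope standing in for the output of `env(wave, norm = TRUE, plot = FALSE)`:

```r
# Sketch: derive a threshold from the noise floor of a known-silent stretch.
# The synthetic envelope below is illustrative; in practice use the real
# envelope returned by env() and a stretch you know contains no speech.
set.seed(1)
noise  <- abs(rnorm(1000, sd = 0.05))   # static noise baseline
speech <- abs(rnorm(200,  sd = 0.40))   # louder speech segment
envelope <- c(noise, speech, noise[1:300])

# Threshold = 99th percentile of the silent stretch, with 50% headroom
silent_part <- envelope[1:500]
thr <- quantile(silent_part, 0.99) * 1.5

# Classify each envelope point as speaking (1) or not (0)
speaking <- ifelse(envelope > thr, 1, 0)
```

With this construction, almost no noise-only frames exceed the threshold while most speech frames do; the quantile and headroom factor are tuning knobs, not fixed values.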
Solution 1:
You could consider frequency filtering your signal to remove the static noise, which is likely concentrated at a certain frequency. Once removed, the signals that you do want to detect will have a greater signal-to-noise ratio and be easier to identify using an amplitude-based threshold detector.
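Putting that advice together with the question's code, a filter-then-threshold pipeline in seewave/tuneR might look like the sketch below. The cutoff frequencies, the smoothing window, and the assumption that the opening second of the file is speech-free are all illustrative choices on my part, not part of the answer:

```r
library(tuneR)
library(seewave)

wave <- readWave("~/Desktop/68_merge.wav")

# Band-pass to roughly the speech band; pick cutoffs after inspecting
# a spectrogram of a noise-only stretch (values here are placeholders)
filtered <- ffilter(wave, from = 100, to = 4000, output = "Wave")

# Smoothed, normalized amplitude envelope
# (msmooth = c(window length in points, overlap %))
e <- env(filtered, msmooth = c(512, 90), norm = TRUE, plot = FALSE)

# Threshold from the assumed-silent first second, with headroom
pts_per_sec <- length(e) / duration(filtered)
thr <- 1.5 * quantile(e[1:round(pts_per_sec)], 0.99)

speaking <- as.integer(e > thr)
time <- seq(0, duration(filtered), length.out = length(e))
df.speaking <- data.frame(time = time, speaking = speaking)
```

Filtering first raises the signal-to-noise ratio of the envelope, so the subsequent amplitude threshold separates speech from the residual baseline more cleanly than thresholding the raw waveform would.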
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Martijn Pieters |