Classify audio data based on thresholding in R

I have a number of audio files (each ~5 minutes long). All I need to do is detect whether someone is speaking or not. There is only one speaker in each file, but there is a constant static noise signal (it is consistent, but it means that the baseline amplitude of the audio data is not 0). I am trying to do this in R.

I am considering three options: estimate the dB level of the static noise and use threshold detection, simply classify based on amplitude, or apply a low-pass filter to remove the static altogether and then assume a baseline of 0.

Here's what I've set up so far:

library(seewave)
library(tuneR)
library(bioacoustics)

wave <- read_audio("~/Desktop/68_merge.wav") # read audio data (read_audio is from bioacoustics)
env(wave, fastdisp = TRUE) # plot the amplitude envelope

#one method is to use threshold_detection, but I don't know how to identify the appropriate threshold for my data...
    out <- threshold_detection(wave, threshold = 5) #dB threshold above background noise, but I don't know what value to set it to

#the other method would be to classify directly on the amplitude envelope
    wave.dat <- env(wave, norm = TRUE, plot = FALSE)
    time <- ...
    speaking <- ifelse(wave.dat > ..., 1, 0)

    df.speaking <- data.frame(time = time, speaking = speaking)

#last method would be to apply a low-pass filter, something like
b <- ffilter(wave, to = 1500)
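To make the amplitude approach concrete, here is a self-contained sketch on synthetic data of what I mean by picking a threshold from the noise floor: estimate the floor as the median of the smoothed envelope (most of the recording is silence/static), then set the threshold a fixed multiple above it. The factor of 4 and the smoothing window are guesses to be tuned, not tested values:

```r
library(seewave)
library(tuneR)

set.seed(1)
# synthetic stand-in: 3 s of static noise, then 1 s of louder "speech", at 8 kHz
sr <- 8000
noise  <- rnorm(3 * sr, sd = 0.05)  # constant static noise
speech <- rnorm(sr, sd = 0.5)       # higher-amplitude segment
wave <- Wave(left = c(noise, speech), samp.rate = sr, bit = 16)

# smoothed, normalized amplitude envelope
# msmooth = c(window length in samples, overlap in %)
e <- env(wave, msmooth = c(512, 50), norm = TRUE, plot = FALSE)

# estimate the noise floor as the median envelope value (valid when most of
# the file is non-speech), then set the threshold a factor above it
floor_est <- median(e)
threshold <- 4 * floor_est          # factor of 4 is an assumption to tune
speaking  <- ifelse(e > threshold, 1, 0)

# time axis aligned with the smoothed envelope
time <- seq(0, duration(wave), length.out = length(e))
df.speaking <- data.frame(time = time, speaking = as.vector(speaking))
```

Plotting `e` first (`plot = TRUE`) makes it easy to sanity-check whether the chosen multiple separates the two amplitude levels.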


Solution 1:

You could consider frequency filtering your signal to remove the static noise, which is likely concentrated at a certain frequency. Once removed, the signals that you do want to detect will have a greater signal-to-noise ratio and be easier to identify using an amplitude-based threshold detector.
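A sketch of that pipeline in R with seewave, on synthetic data: filter out the band where the static sits, then threshold the envelope of the filtered signal. The 100 Hz hum location, the 300–3400 Hz speech band, and the 0.2 threshold are all assumptions about the recording, to be adjusted after inspecting a spectrum of the actual file:

```r
library(seewave)
library(tuneR)

set.seed(1)
sr <- 8000
t  <- seq(0, 2, length.out = 2 * sr)
hum    <- 0.3 * sin(2 * pi * 100 * t)          # static concentrated at 100 Hz (assumed)
speech <- c(rep(0, sr), rnorm(sr, sd = 0.5))   # "speech" in the second half only
wave <- Wave(left = hum + speech, samp.rate = sr, bit = 16)

# band-pass away the hum, keeping an assumed speech band of 300-3400 Hz
filtered <- ffilter(wave, from = 300, to = 3400, output = "Wave")

# amplitude threshold on the filtered, normalized envelope
e <- env(filtered, msmooth = c(512, 50), norm = TRUE, plot = FALSE)
speaking <- ifelse(e > 0.2, 1, 0)   # 0.2 is an assumed threshold; plot e to tune
```

Checking `spec()` or `meanspec()` on a silent stretch of the real recording will show where the static's energy actually is, so the `from`/`to` band can be chosen to exclude it.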

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Martijn Pieters