A multi-class method for detecting audio events in news broadcasts
Sergios Petridis, Theodoros Giannakopoulos and Stavros Perantonis
In: SETN 2010, 4-7 May 2010, Athens, Greece.
We propose a method for audio event detection in video streams from news. Apart from detecting speech, which is obviously the major class in such content, the proposed method detects five non-speech audio classes. The major difficulty of the particular task lies in the fact that most of the non-speech audio events are actually background sounds, with speech as the primary sound. We have adopted a set of 21 statistics computed on a mid-term basis over 7 audio features. A variation of the One Vs All classification architecture has been adopted and each binary classification problem is modeled using a separate probabilistic Support Vector Machine. Experiments have shown that the proposed method can achieve high precision rates for most of the audio events of interest.