Ask HN: Best method to classify short audio events in real time?

7 points by desertraven a year ago

I don't have too much experience with statistics (or ML), but a lot of the articles I've found are quite complicated for something I expected to be simple.

There are four distinct sounds I need to detect in real time with an embedded device. Think a clap sensor, but with 4 different sounding claps.

How might I go about this? How much training data (if any) do I need to collect? Is there an off-the-shelf method to just classify a few different audio events to a high degree of accuracy, and then embed that to a microcontroller (even a computer at this point)?

Thanks!

simne a year ago

I think this task is a lot more about DSP and very little about ML, just Bayes classification (I will write about it later).

Best book I know on DSP:

The Scientist and Engineer's Guide to Digital Signal Processing, by Steven W. Smith, Ph.D.

  • simne a year ago

    There are at least two possible approaches:

    1. Just sample with the ADC and work with the raw time series. This is easy to do on modern software/hardware, but slow.

    2. Compute a Fourier transform, or use digital filters, to filter out noise (usually by removing some frequencies) and get some sort of time-frequency table. Then it is easy to match your sound with Bayes, using a simple formula, something like: result = k[0]*f[0] + k[1]*f[1] + ... + k[n]*f[n], where f[i] is the amplitude of frequency bin i from the FFT and k[i] are the coefficients. Then just: if (result > 0.8): matched = true. You only need to find the right coefficients. You may need to do 2-dimensional matching over a sequence of consecutive FFTs; this is not much harder.
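
A rough sketch of approach 2 in Python, in case it helps to see the formula as code. Everything here is an assumption made for illustration: the function names, the 64-sample frame, the test tone and the hand-picked weights are hypothetical, and a naive DFT stands in for a real FFT so the example needs only the standard library. Real coefficients would have to be found from recordings of the actual sounds.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Naive DFT of one frame; returns magnitudes for bins 0..N/2, divided by N."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc) / n)
    return mags

def score(frame, weights):
    """result = k[0]*f[0] + k[1]*f[1] + ... + k[n]*f[n]"""
    f = dft_magnitudes(frame)
    return sum(k * a for k, a in zip(weights, f))

def matched(frame, weights, threshold=0.8):
    """Threshold test from the comment: matched if result > 0.8."""
    return score(frame, weights) > threshold

def score_sequence(frames, weight_rows):
    """2-dimensional variant: one weight row per consecutive frame, summed."""
    return sum(score(fr, w) for fr, w in zip(frames, weight_rows))

# Hypothetical test signal: a pure tone that falls exactly on bin 8.
N = 64
TONE_BIN = 8
frame = [math.sin(2 * math.pi * TONE_BIN * i / N) for i in range(N)]

# Hand-picked weights (an assumption, not learned): only bin 8 counts.
# A unit-amplitude sine has magnitude 0.5 in its bin after dividing by N,
# so a weight of 2.0 maps it to a score of about 1.0.
weights = [0.0] * (N // 2 + 1)
weights[TONE_BIN] = 2.0

# A different tone (bin 20) that should not match.
other = [math.sin(2 * math.pi * 20 * i / N) for i in range(N)]

print(matched(frame, weights), matched(other, weights))
```

For four sounds you would keep four weight vectors and pick the one with the highest score above the threshold. On a microcontroller the naive DFT would be replaced by a proper FFT routine or a handful of band-pass filters.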