
> Fine for the time dimension of the spectrogram, but rather dubious for the frequency axis.

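MFCCs[1] do essentially that: they apply a transform along the frequency axis of a Fourier transform (a discrete cosine transform of the log mel spectrum, which acts like a bank of fixed filters across frequency), and they are highly apt features for music classification tasks.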
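To make the "transform along the frequency axis" part concrete, here is a rough sketch of only the last step of an MFCC computation: a DCT-II taken across mel bands for a single frame. The `log_mel` values are made up for illustration, and the mel filterbank and windowing steps are left out, so treat this as a toy and not a full MFCC implementation.

```python
import math

def dct2(x):
    """Unnormalized DCT-II of a sequence, taken across its index (here: mel band)."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
        for k in range(n)
    ]

# Hypothetical log mel-band energies for one spectrogram frame (made-up numbers).
log_mel = [2.0, 3.5, 3.0, 1.5, 1.0, 0.5, 0.2, 0.1]

# The same frame with every band raised by a constant in the log domain,
# i.e. the same spectral shape played louder.
louder = [v + 1.0 for v in log_mel]

coeffs = dct2(log_mel)
coeffs_louder = dct2(louder)

# Only coefficient 0 absorbs the overall level; the higher coefficients,
# which describe the shape of the spectrum across frequency, are unchanged.
shape_unchanged = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(coeffs[1:], coeffs_louder[1:])
)
```

Each output coefficient correlates the whole frame with one cosine pattern across frequency, which is why the higher coefficients summarize spectral shape independently of overall level.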

It makes sense if you think of timbre as a time-varying relationship between the harmonics of a single pitch. Translation invariance along the frequency axis can tell you that there are partials typical of, e.g., a guitar or a flute, without caring what particular pitch those instruments are playing. And timbre is a bigger source of variety in popular music than, e.g., the particular notes used.
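A small sketch of why translation along the frequency axis lines up with "same instrument, different pitch". It assumes a log-spaced frequency axis (as in a mel or constant-Q representation; on a linear axis the spacing between harmonics scales with the pitch instead of staying fixed) and idealized, exactly harmonic partials. The function names and the `bins_per_octave` parameter are hypothetical, and the relative amplitudes of the partials, which carry most of the timbre, are ignored here.

```python
import math

def partial_positions(f0, n_partials=6, bins_per_octave=12):
    """Log-frequency position of each harmonic partial k * f0 (f0 in Hz)."""
    return [bins_per_octave * math.log2(k * f0) for k in range(1, n_partials + 1)]

def pattern(positions):
    """Positions relative to the fundamental: the pitch-independent layout."""
    return [p - positions[0] for p in positions]

a3 = partial_positions(220.0)    # partials of a note at 220 Hz
e4 = partial_positions(329.63)   # partials of a note at about 329.63 Hz

# Changing the pitch moves every partial by the same amount on the log axis...
shift = e4[0] - a3[0]

# ...so the layout of the partials is identical up to that translation.
same_shape = all(
    math.isclose(x, y, abs_tol=1e-9)
    for x, y in zip(pattern(a3), pattern(e4))
)
```

A filter that slides along this axis sees the same harmonic layout wherever the note sits, which is the invariance the comment describes.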

[1] https://en.wikipedia.org/wiki/Mel-frequency_cepstrum


