
This is interesting.

My first thought was to wonder how an LSTM would do. One might think it would be a better representation for music? There are some models which use convolutional layers along with an LSTM for video representation (e.g. [1]), and it would be interesting to see if convolutions are useful for capturing similar themes in music.

I wonder if one could build a music embedding (word2vec style) and use similarities in the embedding space as recommendations? The obvious objective function would be skip-gram, but there might be more interesting objectives there too.
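To make that idea a bit more concrete, here is a rough sketch of the two pieces involved: generating skip-gram (center, context) training pairs, and recommending by cosine similarity in the embedding space. Everything in it is an assumption on my part: the tokens are treated as songs in a playlist (they could just as well be notes or bars), the names `skipgram_pairs`, `recommend`, and the `song_*` IDs are made up, and the toy vectors are hand-written stand-ins for what a trained model would produce. The training step itself is left out.

```python
import math


def skipgram_pairs(sequence, window=2):
    """Return (center, context) pairs for items within `window` positions."""
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(query, embeddings, k=2):
    """Return the k items whose vectors are closest to the query's vector."""
    scores = [
        (cosine(embeddings[query], vec), name)
        for name, vec in embeddings.items()
        if name != query
    ]
    return [name for _, name in sorted(scores, reverse=True)[:k]]


# Hypothetical playlist: each song ID plays the role of a "word".
playlist = ["song_a", "song_b", "song_c", "song_d"]
pairs = skipgram_pairs(playlist, window=1)

# Hand-written toy vectors standing in for learned embeddings.
toy_embeddings = {
    "song_a": [1.0, 0.0],
    "song_b": [0.9, 0.1],
    "song_c": [0.0, 1.0],
}
top = recommend("song_a", toy_embeddings, k=1)
```

In a real setup the pairs would feed a skip-gram model (with negative sampling, say), and the learned vectors would replace `toy_embeddings`.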

[1] https://github.com/loliverhennigh/Convolutional-LSTM-in-Tens...



An architecture like WaveNet could also be interesting here: https://deepmind.com/blog/wavenet-generative-model-raw-audio... (HN thread: https://news.ycombinator.com/item?id=12455510)


I could be totally off on this, but his encoding is an image and an LSTM is for time series, which would require a different representation.

I completely agree that an LSTM would be useful, since it would by default require a different representation. I think most commenters agree this representation is overly simplistic. Amazed it works as well as it does!



