
When they released Stable Audio 2.0, I tried to create "unusual" songs with prompts like "roaring dragon tumbling rocks stormy morning". The results are quite interesting:

https://www.youtube.com/@MarekGibney/videos

I find it fascinating that you can put all the information needed to recreate a whole complex song into a string like:

     rough stormy morning car rocks hammering
     drum solo roaring dragon downtempo
     audiosparx-v2-0 seed 5

This means a whole album of these songs could easily fit into a single TCP/IP packet.
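
A rough back-of-the-envelope check of that claim, as a sketch only. The 1500-byte Ethernet MTU and the 20-byte IPv4 and TCP headers (no options) are assumptions about a typical link, and the one-byte newline separator between songs is made up for illustration:

```python
# Assumed sizes: a standard Ethernet MTU, IPv4 and TCP headers without options.
MTU = 1500
IPV4_HEADER = 20
TCP_HEADER = 20
PAYLOAD = MTU - IPV4_HEADER - TCP_HEADER  # bytes left for song prompts

# The prompt, model name and seed from the example above, joined into one line.
song = ("rough stormy morning car rocks hammering "
        "drum solo roaring dragon downtempo "
        "audiosparx-v2-0 seed 5")

# +1 for a hypothetical newline separating one song from the next.
song_bytes = len(song.encode("utf-8")) + 1
songs_per_packet = PAYLOAD // song_bytes

print(f"{song_bytes} bytes per song, {songs_per_packet} songs per packet")
```

So with prompts of about this length, an album-sized list of songs fits in one packet with room to spare.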

If a music genre evolves in which each song is completely defined by its title, maybe it will be called "promptmusic".

I will try the new model with the same prompts and upload the results.



That's a great example of the fact that information about something, say a song, isn't encoded entirely in the medium you use to transfer it - it's partially there, and partially in the device you're using to read it! An MP3 file is just gibberish without a program that can decode it.

In this case, the whole album could indeed fit into a single TCP/IP packet - because the bulk of the information that makes up those songs is contained in the model, which weighs however many gigabytes it does. The packet carrying your album is meaningless until the recipient also procures the model.
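
A loose analogy in code, not how Stable Audio works: zlib's preset-dictionary feature lets sender and receiver share data up front, so the message on the wire gets smaller and becomes undecodable without that shared data. The `shared` vocabulary below is a hypothetical stand-in for the model, and `encode`/`decode` are helper names invented for this sketch:

```python
import zlib

# Hypothetical shared vocabulary standing in for the model both sides hold.
shared = (b"downtempo roaring dragon drum solo hammering rocks car "
          b"stormy morning rough tumbling")
message = (b"rough stormy morning car rocks hammering "
           b"drum solo roaring dragon downtempo")

def encode(data, zdict=None):
    # Compress, optionally priming the compressor with the shared dictionary.
    comp = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return comp.compress(data) + comp.flush()

def decode(blob, zdict=None):
    # Decompress; the same dictionary must be supplied if one was used.
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()

plain = encode(message)
with_dict = encode(message, shared)
print("raw:", len(message), "zlib:", len(plain), "zlib + shared dict:", len(with_dict))

# The receiver recovers the message only if it holds the same dictionary.
assert decode(with_dict, shared) == message
try:
    decode(with_dict)
except zlib.error as exc:
    print("cannot decode without the shared dictionary:", exc)
```

The bytes saved on the wire have not disappeared; they have moved into the dictionary, just as the songs' information lives mostly in the model.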

(Tangent: this observation was my first mind. blown. experience when reading GEB over a decade ago.)


Also note that it is the same for language. The meaning of these words is not in this text. The words are merely codes which point to things in the reader's databank. And hopefully the reader's associations with the words are similar enough to mine that the decoded message is close to what I attempted to encode...



