Riffusion - Generative AI for Music | Research Scientist, Research Engineer | San Francisco | Full-time
Riffusion is a small team training foundation models for music generation and building products that create more musicians in the world. We strive to create and deploy models that are expressive, fast, controllable, and inspiring at scale.
We’re establishing our founding research team and looking for individuals who love music and are excited to build a more creative future with us. Experience with large-scale generative model training and diffusion architectures is preferred. Very strong software engineering and computer science fundamentals are required. We’re backed by top investors and have substantial compute at the ready.
You can make music with us https://riffusion.com and reach me at { seth at riffusion dot com }.
Nice! It's best at pronouncing English, but we've had a bunch of fun trying to get other languages too. Sometimes you can make things happen by spelling them phonetically.
Even for English words that it doesn't get right the first time haha
It's still really early innings for this technology, so we're happy to be learning and building fun technology that helps people do creative things. The main thing we're focused on is turning it from a toy you enjoy once into something you come back to and dive deeper into (even if it's still just for fun).
Author here: fwiw we are running the app on A10G GPUs, which can generally turn around a 512x512 image in 3.5s with 50 inference steps. That time includes converting the image into audio, which should also be done on the GPU for real-time purposes. We did some optimization, such as tracing the UNet, using fp16, and removing autocast. There are lots of ways it could be sped up further I'm sure!
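For a rough sense of what those numbers mean for real-time playback, here is a back-of-the-envelope sketch. The 3.5s and 50 steps come from the comment above; the clip length per 512x512 image is an assumption made here for illustration, and the function names are hypothetical.

```python
def per_step_budget(total_seconds: float, steps: int) -> float:
    """Upper bound on seconds per inference step.

    It is only an upper bound because the total also covers the
    image-to-audio conversion.
    """
    if steps <= 0:
        raise ValueError("steps must be positive")
    return total_seconds / steps


def realtime_margin(clip_seconds: float, gen_seconds: float) -> float:
    """Slack (in seconds) if the next clip is generated while the current one plays.

    A positive value means generation keeps up with playback.
    """
    return clip_seconds - gen_seconds


# Figures quoted in the comment above.
GEN_SECONDS = 3.5
STEPS = 50
# ASSUMPTION: seconds of audio encoded by one 512x512 image; not stated above.
ASSUMED_CLIP_SECONDS = 5.0

print(f"per-step budget: {per_step_budget(GEN_SECONDS, STEPS) * 1000:.0f} ms")
print(f"slack per clip: {realtime_margin(ASSUMED_CLIP_SECONDS, GEN_SECONDS):.1f} s")
```

If the assumed clip length is roughly right, generation finishes before the current clip ends, which is what lets playback run continuously; a shorter clip or a slower GPU would shrink that slack.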