Hugging Face released Speech To Speech, an effort to build an open-source, modular GPT-4o. With it, you can create a voice assistant that replies in under 500 ms. Its modularity and Apache 2.0 license make it well suited for integration into any project that needs a powerful voice assistant. It can run locally on a MacBook or be deployed on a server. It supports multiple languages and can switch languages in under 100 ms after detecting that the user is speaking a different one.