Oftentimes on fal.ai I'm trying a new text-to-image model and I just get an error. I try again, it fails, I try again and it works. It's sad watching it take $0.05 and hand me an error, but most of the time I pull the one-armed bandit again.
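If you'd rather automate the slot pull, a simple retry with backoff does it. This is just a sketch: the fal_client call at the bottom is illustrative, swap in whatever client call you actually use (and note each attempt may still be billed):

```python
import time

def call_with_retries(generate, attempts=3, base_delay=2.0):
    """Retry a flaky generation call with exponential backoff.

    `generate` is any zero-arg callable that raises on transient API
    errors; narrow the except clause to your client's error class.
    """
    for attempt in range(attempts):
        try:
            return generate()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ...

# Hypothetical usage with the fal.ai Python client:
# import fal_client
# result = call_with_retries(lambda: fal_client.subscribe(
#     "fal-ai/flux/dev", arguments={"prompt": "a red fox at dusk"}))
```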
If you sandbox it and address the main security issues (and let's not dwell on prompt injection, since practically every LLM suffers from that vulnerability), OpenClaw can be fun to tinker with. The whole sky-is-falling mainstream media message is boring; look into it yourself, be smart and responsible, and you can enjoy it.
I added a comment about GSD here and it was nice to see yours too... I'm just a user of GSD, but boy has it fixed the context rot I used to experience. The system one-shots everything I ask, from basic to complex, and it finally handles things without making stuff up, breaking things, or going in circles...
You should try out the GSD system (search "GSD github repo")... it runs everything as phases and steps and clears context for you so you never compact, plus tons of other features...
The way "Phases" are handled is incredible: research, then planning, then execution, with no context rot because behind the scenes everything is saved to a State.md file...
I'm on Phase 41 of my own project and the reliability and near-total absence of errors is amazing. Investigate and see if it's a fit for you. You can also set up the PAL MCP to have Gemini, with its large context window, review what Claude codes.
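I don't know GSD's internals, but the core idea is easy to sketch: persist each phase's outcome to a state file so the next phase can start with a fresh context. A minimal illustration (the State.md path and field names here are made up, not GSD's actual format):

```python
from datetime import datetime, timezone
from pathlib import Path

STATE = Path("State.md")  # hypothetical location; GSD's real layout may differ

def record_phase(phase: int, summary: str, next_steps: str) -> None:
    """Append a phase summary so a fresh session can pick up where we left off."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    entry = (
        f"\n## Phase {phase} ({stamp})\n"
        f"- Outcome: {summary}\n"
        f"- Next: {next_steps}\n"
    )
    STATE.touch(exist_ok=True)
    STATE.write_text(STATE.read_text() + entry)

record_phase(41, "Auth flow refactored, tests green", "Wire up rate limiting")
```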
If the context window is managed properly, that offsets almost all of these "random drill holes". I see so many people just filling up the buffer, compacting, and then complaining, all while making huge, ambitious task requests with no system that breaks the job into multiple mini tasks.
Context rot is real, but as for people complaining about AIs hallucinating and running wild: I don't see it when the context window is managed properly.
For better lip sync you could try using Rhubarb to extract mouth shapes from the mp3.
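Rough sketch of the offline flow, assuming the Rhubarb Lip Sync CLI and ffmpeg are on PATH (Rhubarb wants WAV/OGG input, so convert the mp3 first):

```python
import json
import subprocess

# Rhubarb Lip Sync reads WAV/OGG, so convert the mp3 first.
subprocess.run(["ffmpeg", "-y", "-i", "speech.mp3", "speech.wav"], check=True)

# Emit mouth cues as JSON; -f json is Rhubarb's machine-readable output format.
subprocess.run(
    ["rhubarb", "-f", "json", "-o", "mouth.json", "speech.wav"],
    check=True,
)

with open("mouth.json") as f:
    cues = json.load(f)["mouthCues"]  # [{"start": ..., "end": ..., "value": "A"}, ...]

for cue in cues:
    print(f'{cue["start"]:.2f}-{cue["end"]:.2f}s: mouth shape {cue["value"]}')
```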
What are you using as the backend speech processor to get the real-time streaming response?
Rhubarb would add a bit of latency for sure.
For real-time: we use WebRTC for streaming. Input is streaming STT, then a low-latency LLM, then TTS, then we drive Live2D parameters on the client.
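In sketch form, the hop-by-hop streaming looks something like this. Every service call below is a hypothetical stub, not our actual stack; the point is that each stage consumes and yields incrementally instead of waiting for a full response:

```python
import asyncio
from typing import AsyncIterator

async def stt_stream(audio: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Stub streaming STT: yields a partial transcript per audio chunk."""
    async for chunk in audio:
        yield f"<transcript of {len(chunk)} bytes>"

async def llm_stream(text: str) -> AsyncIterator[str]:
    """Stub low-latency LLM: yields tokens as they are generated."""
    for token in ("Hi", " there", "!"):
        yield token

async def tts_stream(tokens: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """Stub streaming TTS: yields an audio frame per text fragment."""
    async for token in tokens:
        yield token.encode()  # stand-in for synthesized PCM

def send_to_client(frame: bytes) -> None:
    print(f"-> client: {len(frame)} bytes")  # stand-in for the WebRTC track push

async def pipeline(mic: AsyncIterator[bytes]) -> None:
    # Real code would wait for utterance boundaries before calling the LLM;
    # here every partial transcript kicks off a response to keep the sketch short.
    async for transcript in stt_stream(mic):
        async for frame in tts_stream(llm_stream(transcript)):
            send_to_client(frame)

async def demo() -> None:
    async def mic() -> AsyncIterator[bytes]:
        yield b"\x00" * 320  # one fake audio chunk

    await pipeline(mic())

asyncio.run(demo())
```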
Lip sync: we currently use simple phoneme/amplitude-based sync and are testing viseme extraction. Rhubarb is on our list, but we're cautious about the added latency.
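For the amplitude-based part, the usual trick is RMS energy per animation frame mapped onto the mouth-open parameter. A minimal sketch, assuming 16-bit mono PCM and Live2D's standard ParamMouthOpenY (the gain is a guess you'd tune by ear):

```python
import wave
import numpy as np

def mouth_open_curve(path: str, fps: int = 60, gain: float = 4.0) -> np.ndarray:
    """RMS amplitude per animation frame, mapped to a 0..1 mouth-open value."""
    with wave.open(path, "rb") as wav:
        assert wav.getsampwidth() == 2 and wav.getnchannels() == 1, "expects 16-bit mono PCM"
        rate = wav.getframerate()
        pcm = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)
    samples_per_frame = rate // fps
    n_frames = len(pcm) // samples_per_frame
    frames = pcm[: n_frames * samples_per_frame].reshape(n_frames, samples_per_frame)
    rms = np.sqrt((frames.astype(np.float64) ** 2).mean(axis=1)) / 32768.0
    return np.clip(rms * gain, 0.0, 1.0)  # feed each value to ParamMouthOpenY

curve = mouth_open_curve("speech.wav")
```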