zuzuen_1's comments

zuzuen_1 · 2025-11-16T11:37:58 1763293078

I've been building Modulo AI for the past year - an AI system that fixes GitHub issues.

Early versions took 5+ minutes to analyze a single issue.

After months of optimization, we're now sub-60 seconds with better accuracy. This presentation encapsulates what we learned about the performance characteristics of production LLM systems that nobody talks about.

- Strategies for faster token throughput.

- Strategies for quick time to first token.

- Effective context window management.

- Model routing strategies.

If you're interested in building AI agents, I'm sure you'll find some interesting insights in it!

Install and try out our GitHub application: https://github.com/apps/solve-bug

Try Modulo via browser at: https://moduloware.ai

Here are the code examples for the presentation: https://github.com/kirtivr/pydelhi-talk

What performance issues have you been seeing in your AI agents? And how did you tackle them?

zuzuen_1 · 2025-08-14T09:38:02 1755164282

Perhaps when LLMs introduce a lot more primitives for modifying behavior, such a programming language would be necessary.

As it stands, anyone working with LLMs knows that most of the work happens before and after the LLM call: making REST calls, saving to a database, etc. Conventional programming languages work well for that purpose.

Personally, I like JSON when the data is not too large. It's easy to read (since it is hierarchical, like most declarative formats) and easy to parse.
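To illustrate the point about where the work happens, here is a minimal sketch in Python. The function name `handle_issue`, the `call_llm` callback, the prompt wording, and the `results` table are all hypothetical, made up for this example rather than taken from any real system. The LLM call is one line; the rest is ordinary JSON handling and a database write.

```python
import json
import sqlite3
from typing import Callable


def handle_issue(issue: dict, call_llm: Callable[[str], str], db: sqlite3.Connection) -> dict:
    """Hypothetical pipeline step: prepare input, call the model, persist output."""
    # Before the call: serialize the structured input into the prompt as JSON.
    prompt = (
        "Summarize this issue as JSON with keys 'title' and 'severity':\n"
        + json.dumps(issue, indent=2)
    )

    # The LLM call itself (call_llm is an assumed callback, e.g. a REST client).
    raw = call_llm(prompt)

    # After the call: parse the JSON reply and save it to the database.
    result = json.loads(raw)
    db.execute("CREATE TABLE IF NOT EXISTS results (issue_id INTEGER, body TEXT)")
    db.execute("INSERT INTO results VALUES (?, ?)", (issue["id"], json.dumps(result)))
    db.commit()
    return result
```

Passing `call_llm` in as a plain function keeps the surrounding code testable without a network call.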

zuzuen_1 · 2025-08-14T09:59:30 1755165570

One pain point such a PL could address is encoding tribal knowledge about optimal prompting strategies for various LLMs, which changes with each new model release.

zuzuen_1 · 2025-08-12T14:42:50 1755009770

I would be more interested in Qodo's performance on the SWE-bench Multilingual benchmark. SWE-bench Verified only includes bugs related to Python breakages.

The best submission on SWE-bench Multilingual is Claude 3.7 Sonnet, which solves ~43% of the issues in the dataset.

zuzuen_1 · 2025-08-12T14:37:27 1755009447

Does anyone have a benchmark comparing embeddings against extensive grepping for mapping bug reports to code files? Qodo, Cursor, and a number of other tools I use rely on grepping to localize faults.
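For anyone who wants to measure this themselves, a rough sketch of what such a benchmark could look like in Python. Everything here is an assumption for illustration: `recall_at_k`, `grep_ranker`, and the `Ranker` type are hypothetical names, and the grep-style ranker is a crude word-overlap stand-in, not what Qodo or Cursor actually do. An embedding-based ranker would plug in as another `Ranker` and be scored on the same cases.

```python
from typing import Callable, Dict, List

# A ranker takes a bug report and returns file paths, most likely first.
Ranker = Callable[[str], List[str]]


def recall_at_k(ranker: Ranker, cases: Dict[str, str], k: int = 5) -> float:
    """Fraction of bug reports whose known faulty file appears in the top k."""
    if not cases:
        return 0.0
    hits = sum(1 for report, faulty in cases.items() if faulty in ranker(report)[:k])
    return hits / len(cases)


def grep_ranker(files: Dict[str, str]) -> Ranker:
    """Rank files by how many distinct report words occur in them as substrings."""
    def rank(report: str) -> List[str]:
        words = set(report.lower().split())
        scores = {
            path: sum(word in text.lower() for word in words)
            for path, text in files.items()
        }
        return sorted(files, key=lambda path: scores[path], reverse=True)
    return rank
```

Here `cases` maps each bug report to the file that was actually changed in the fix, so the same labeled set can score any localization strategy.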