Very interesting how Singapore ranks 2nd in terms of token volume. I wonder if this is potentially Chinese usage via VPN, or if Singaporean consumers and firms are dominating in AI adoption.
Also interesting how the 'roleplaying' category is so dominant. It makes me wonder whether Google's classifier sees a system prompt with "Act as a X" and classifies that as roleplay rather than the specific industry the roleplay was intended to serve.
Almost certainly VPN traffic. Most major LLMs block both China and Hong Kong (surprisingly, not the other way around), so Singapore ends up being the fastest nearby endpoint that isn't restricted.
Ah, you're right. Still, I wonder if it's because of Chinese people and companies using Singaporean bank accounts. It just seems odd that such a small country is so overrepresented here.
These were their own decisions, made long before the controls and pressure. Besides being in bed with the US government, the people who run big AI shops tend to be fervently nationalistic and politically ambitious in their own right. Leopold Aschenbrenner's dystopian rant [1] or Dario Amodei's [2] [3] are pretty representative.
Early on there was apparently a lot of distillation going on. Note that OpenAI introduced ID verification for high-volume accounts, and I think it was for that reason. It does raise questions about how much of Chinese models' performance is entirely home-grown. At least historically, it was quite hard to crawl the English web from behind the Great Firewall.
Bidaya AI | Senior Software Engineer (Full Stack) | Calgary, Canada | HYBRID | Full-time | Up to $140k CAD + Generous Equity
We're an early-stage (pre-seed) VC-backed startup automating RFP proposals for the AEC (Architecture, Engineering, Construction) industry.
We are building an agentic AI platform that embeds directly into Microsoft Word, helping firms find and win more work, while serving as their knowledge management hub for all business development.
You would be the first full-time engineering hire working directly with the founders (I'm the technical co-founder). We need someone who can ship production code across the whole stack.
Deep Word Integration: Building a high-performance, "Cursor-like" experience within the constraints of Office.js.
Agentic Workflows: Orchestrating AI agents that can read complex government requirements, reason about compliance, and generate winning output autonomously.
Evolving Knowledge Graph: Architecting a library system that doesn't just store files, but learns from project history and feedback loops.
If you want a chance to work on a hard problem, in an exciting space, with a strong team that has validated the market and de-risked the business, with major upside, let's talk!
Langfuse and Helicone work well for traditional LLM operations, but AI agents are different. We discovered that agents require fundamentally different tooling; here are some examples.
First, while LLMs simply respond to prompts, agents often get stuck in behavioral loops where they repeat the same actions; to address this, we built a graph visualization that automatically detects when an agent reaches the same state multiple times and groups these occurrences together, making loops immediately visible.
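The loop-detection idea above can be sketched roughly as follows. This is a minimal illustration, not their actual implementation: I'm assuming each agent step can be reduced to a hashable "state signature" (here, tool name plus arguments), so that repeated signatures mark the nodes a graph view would group together.

```python
# Hypothetical sketch: detect behavioral loops in an agent trace by
# grouping steps that share the same state signature.
from collections import defaultdict

def detect_loops(steps):
    """Map each (tool, args) signature to the step indices where it
    occurred; signatures seen more than once are candidate loops."""
    seen = defaultdict(list)
    for i, step in enumerate(steps):
        signature = (step["tool"], step["args"])
        seen[signature].append(i)
    return {sig: idxs for sig, idxs in seen.items() if len(idxs) > 1}

trace = [
    {"tool": "search", "args": "pricing page"},
    {"tool": "fetch", "args": "https://example.com"},
    {"tool": "search", "args": "pricing page"},  # same state again -> loop
]
print(detect_loops(trace))  # {('search', 'pricing page'): [0, 2]}
```

A real system would normalize the arguments (and probably include context hashes) before comparing, but the grouping step is the same.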
Second, our evaluations are much more tailored to AI agents. LLM-ops evaluations usually happen at the per-prompt level (e.g. hallucination, QA correctness), which makes sense for those use cases, but agent evaluations are usually per session or per run. Often a single prompt in isolation didn't cause the issue; rather, a downstream memory problem or an earlier action caused the current tool call to fail. So we spent a lot of time building a way for you to define a rubric. Then, to evaluate against the rubric without context overload, we created an agentic pipeline with tools like viewing rubric examples, zooming "in and out" of a session, and referencing previous examples.
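To make the per-session (rather than per-prompt) distinction concrete, here is a toy sketch of my own, not their pipeline: each rubric criterion is a function over the whole session, so it can catch failures (like an earlier tool error) that no single prompt would reveal. The criterion names and session fields are illustrative assumptions.

```python
# Hypothetical sketch: score a *whole* agent session against a rubric,
# instead of scoring individual prompts in isolation.
def evaluate_session(session, rubric):
    """Return a dict of criterion name -> pass/fail over the full session."""
    return {name: check(session) for name, check in rubric}

session = {
    "steps": [
        {"tool": "search", "ok": True},
        {"tool": "fetch", "ok": False},  # earlier failure poisons later steps
    ],
    "memory_reads": 1,
}

rubric = [
    ("all_tools_succeeded", lambda s: all(step["ok"] for step in s["steps"])),
    ("memory_used", lambda s: s["memory_reads"] > 0),
]
print(evaluate_session(session, rubric))
# {'all_tools_succeeded': False, 'memory_used': True}
```

Their actual evaluator is itself agentic (with zoom in/out tools), but the unit of evaluation is the same: the session, not the prompt.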
Third, time travel and clustering of similar responses. LLM debugging is straightforward because prompts are stateless and independent of one another, but agents maintain complex state through tools, context, and memory management. We addressed this with a "time travel" feature that captures the complete agent state at any point, letting developers modify variables like context or tool availability, replay from that exact moment, simulate the run 20-30 times, and group similar responses with our clustering algorithm.
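The snapshot-and-replay mechanic can be sketched in a few lines. This is a minimal stand-in, assuming the agent state is a plain dict and `toy_agent` is a placeholder for a real agent step loop; counting identical responses here stands in for their real clustering.

```python
# Hypothetical sketch of "time travel": deep-copy the agent state at a
# step, then replay from that snapshot with overridden variables and
# bucket the replayed responses.
import copy
from collections import Counter

def replay_from(snap, agent_fn, n=20, **overrides):
    """Re-run the agent n times from a saved snapshot, optionally
    overriding state (context, available tools, ...)."""
    outcomes = Counter()
    for _ in range(n):
        state = copy.deepcopy(snap)   # never mutate the saved snapshot
        state.update(overrides)
        outcomes[agent_fn(state)] += 1
    return outcomes

# Toy agent whose behavior depends on tool availability:
def toy_agent(state):
    return "uses_calculator" if "calculator" in state["tools"] else "guesses"

snap = {"context": "2 + 2?", "tools": ["calculator", "search"]}
print(replay_from(snap, toy_agent, n=5))                    # all 'uses_calculator'
print(replay_from(snap, toy_agent, n=5, tools=["search"]))  # all 'guesses'
```

With a real (stochastic) agent, the replays would diverge, which is where grouping similar responses becomes useful.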
Fourth, agents exhibit far more non-deterministic behavior than LLMs because a single tool call can completely change their trajectory; to handle this complexity, we developed workflow trajectory clustering that groups similar execution paths together, helping developers identify patterns and edge cases that would be impossible to spot in traditional LLM systems.
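One plausible way to cluster execution paths (purely my sketch; their algorithm isn't described) is to treat each run as a sequence of tool names and group runs whose normalized edit distance falls under a threshold:

```python
# Hypothetical sketch: greedy trajectory clustering by Levenshtein
# distance over tool-call sequences.
def edit_distance(a, b):
    """Levenshtein distance between two sequences, O(len(b)) memory."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (x != y))
    return dp[-1]

def cluster_trajectories(runs, threshold=0.3):
    """Assign each run to the first cluster whose representative is
    within the normalized-distance threshold, else start a new cluster."""
    clusters = []
    for run in runs:
        for cluster in clusters:
            rep = cluster[0]
            if edit_distance(run, rep) / max(len(run), len(rep)) <= threshold:
                cluster.append(run)
                break
        else:
            clusters.append([run])
    return clusters

runs = [
    ["search", "fetch", "answer"],
    ["search", "fetch", "answer"],
    ["search", "search", "fetch", "answer"],  # one extra search: close enough
    ["calc", "answer"],                       # entirely different path
]
print(cluster_trajectories(runs))  # two clusters: the search runs, and the calc run
```

A production version would likely embed trajectories rather than edit-distance them, but the idea of surfacing a handful of representative paths out of hundreds of runs is the same.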
As there is only one bottle of this wine in the world, I think what matters is the total dose, not the concentration, of Pb.
In addition, the 0.14 mg/L figure reported in the paper is at a similar level to the current safety standard. The International Organisation of Vine and Wine (OIV), an intergovernmental agency composed of 45 member states, has a current maximum acceptable limit of 0.15 mg/L for Pb in wine, starting from the 2007 harvest year.