My bad then. I meant that it's "Crazy Good" as in that the free tier gave me a tremendous amount of tokens.
What I didn't realize though, is that the limit doesn't reset each 5 hours as is the case for claude. I hit the limit of the free tier about 2 hours in, and while I was expecting to be able to continue later today, it alerts me that I can continue in a week.
So my hype for the amount of tokens one gets compared to claude was a bit too eager. Hitting the limit and having to wait a week probably means that we get a comparable token amount vs the $20 claude plan. I wonder how much more I'd get when buying the $20 plus package. The pricing page doesn't make that clear (since there was no free plan before yesterday I guess): https://developers.openai.com/codex/pricing/
Is anyone a researcher here that has studied the proven ability to sneak malicious behavior into an LLM's weights (somewhat poisoning weights but I think the malicious behavior can go beyond that).
As I recall reading in 2025, it has been proven that an actor can inject a small number of carefully crafted, malicious examples into a training dataset. The model learns to associate a specific 'trigger' (e.g. a rare phrase, specific string of characters, or even a subtle semantic instruction) with a malicious response. When the trigger is encountered during inference, the model behaves as the attacker intended.You can also directly modify a small number of model parameters to efficiently implement backdoors while preserving overall performance and still make the backdoor more difficult to detect through standard analysis. Further, can do tokenizer manipulation and modify the tokenizer files to cause unexpected behavior, such as inflating API costs, degrading service, or weakening safety filters, without altering the model weights themselves. Not saying any of that is being done here, but seems like a good place to have that discussion.
> The model learns to associate a specific 'trigger' (e.g. a rare phrase, specific string of characters, or even a subtle semantic instruction) with a malicious response. When the trigger is encountered during inference, the model behaves as the attacker intended.
Reminiscent of the plot of 'The Manchurian Candidate' ("A political thriller about soldiers brainwashed through hypnosis to become assassins triggered by a specific key phrase"). Apropos given the context.
It tells you how bad their product management and engineering team is that they haven’t just decided to kill Siri and start from scratch. Siri is utterly awful and that’s an understatement, for at least half a decade.