I just proved that constraint-solving problems can be encoded as p-adic linear regression problems, and therefore that we can use machine learning optimisation techniques to get exact answers.
So of course no journal or conference is in the least bit interested, and I'm now reformatting it for another obscure low-tier journal that no-one will ever read.
Otherwise:
- automating the translation of a Byzantine Greek work that has never been translated into English before. https://stephanos.symmachus.org
- also preparing evidence for a case against the university I sometimes work for.
Stephanos of Byzantium wrote what we would now call an encyclopedia of people and places. Most of it has been lost, but shortened versions still exist. I've set up a bot to scan a version of what we have, translate it, extract the proper nouns, and try to figure out what was lost. https://stephanos.symmachus.org/ I'm also linking it to the translation of Pausanias https://pausanias.symmachus.org that another bot is doing.
Also, trying to finish a PhD on how to do machine learning when you want to minimise a p-adic loss.
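For anyone wondering what a p-adic loss even looks like: the p-adic absolute value of an integer is p^(-v), where v is how many times p divides it, so a residual divisible by a high power of p counts as tiny even when it is numerically huge. A toy Python sketch (my own illustration, not the thesis's actual formulation):

    # Toy illustration of a p-adic loss on integer residuals.
    # My own sketch for illustration, not the formulation from the thesis above.

    def p_adic_valuation(x: int, p: int) -> float:
        """v_p(x): how many times p divides x; infinity for x == 0."""
        if x == 0:
            return float("inf")
        v = 0
        while x % p == 0:
            x //= p
            v += 1
        return v

    def p_adic_abs(x: int, p: int) -> float:
        """|x|_p = p ** (-v_p(x)); defined as 0 for x == 0."""
        v = p_adic_valuation(x, p)
        return 0.0 if v == float("inf") else p ** (-v)

    def p_adic_loss(predictions, targets, p=2):
        """Sum of the p-adic sizes of the residuals.
        A residual of 1024 is tiny 2-adically (|1024|_2 = 2**-10),
        while a residual of 1 is as large as it gets (|1|_2 = 1)."""
        return sum(p_adic_abs(y_hat - y, p) for y_hat, y in zip(predictions, targets))

    print(p_adic_loss([1024, 7], [0, 6], p=2))  # 2**-10 + 1 = 1.0009765625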
I create a separate Linux user (which doesn't have sudo rights) for each project. I have to log each user in to Claude Code or Codex, but then I can use ordinary Unix permissions to keep the bots under control and isolated.
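Roughly what that looks like scripted, as a sketch (Python for concreteness; the project name, shell and paths are placeholders, and you'd run it as root):

    # Sketch of the per-project user setup described above (run as root).
    # Project name, shell and paths are placeholders; adjust to taste.
    import os
    import subprocess

    project = "stephanos"            # hypothetical project name
    user = f"bot-{project}"

    # Dedicated user with its own home directory; never added to sudoers.
    subprocess.run(["useradd", "--create-home", "--shell", "/bin/bash", user],
                   check=True)

    # Lock the home directory down so the other bot users can't read it.
    os.chmod(f"/home/{user}", 0o700)

    # The agent (Claude Code, Codex, ...) is then logged in from a shell
    # started as that user, e.g.  su - bot-stephanos
    # after which ordinary Unix permissions keep it inside its own home.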
> The volume of cargo carried by sailing vessels in the old days was orders of magnitude lower.
Surprisingly, no, it wasn't. I'll fudge the numbers slightly and talk in terms of the proportion of world trade that was carried by ocean-going vessels (because if you double the population, it's reasonable to talk about doubling the number of ships).
The world economy was very globalised in 1913. That level of globalisation in trade wasn't matched again until the 1990s.
We're only a little more global now than we were in the age of sail.
The British navy and merchant fleet were a wonder of their era.
Show your work. Without numbers, those are all just assertions. And the assertion that the world's economies were more globalized before WW1 than after the Cold War is particularly dubious.
Writing a course for a customer on how to use Claude Code well, especially around brownfield development (working on existing code bases, rather than vibe-coding something new).
If the "outcompeting" is possible because of Chinese government subsidies, then it's important to protect local industry from unfair competition.
It's similar to the logic behind anti-trust actions against monopolists. If the playing field isn't level, then the USA government steps in to level it.
(Whether BYD is subsidised or not is another question, but the above is the logic of protecting local industry.)
> If the playing field isn't level, then the USA government steps in to level it.
More recently though, it kind of seems like if the playing field isn't tipped strongly towards the US, then the US government will step in to tip it their way.
Not sure why this is downvoted. The Chinese government has been quite transparent about its aim of globally dominating several industries, including EVs, through heavy government support.
It would make no sense to destroy your own industry because it can’t compete with a heavily subsidized foreign industry.
Indeed. The obvious counter-example to the claim is "rainbows", which were definitely the topic of heated scientific arguments for hundreds of years (and non-scientific ones before that).
I think of it as trying to encourage the LLM to give answers from a particular part of the phase space. You can do that by fine-tuning it to be more likely to return values from there, or by prompting it into that part of the phase space. Either works, but fiddling around with prompts doesn't require much in the way of MLOps or compute power.
That said, fine-tuning small models because you have to power through vast amounts of data where a larger model would be cost-ineffective -- that's completely sensible, and not really mentioned in the article.
My understanding of model distillation is quite different, in that it trains another (typically smaller) model using the error between the new model's output and that of the existing one - effectively capturing the existing model's embedded knowledge and encoding it (ideally more densely) into the new one.
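Concretely, that version amounts to matching the two models' softened output distributions. A toy numpy sketch (function names, the temperature and the random logits are all my own placeholders, not anyone's actual training code):

    # Toy sketch of that version of distillation: push the student's softened
    # output distribution towards the teacher's. Random logits stand in for
    # real model outputs here.
    import numpy as np

    def softmax(logits, T=1.0):
        z = logits / T
        z = z - z.max(axis=-1, keepdims=True)   # numerical stability
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def distillation_loss(student_logits, teacher_logits, T=2.0):
        """KL(teacher || student) on temperature-softened distributions."""
        p_teacher = softmax(teacher_logits, T)
        p_student = softmax(student_logits, T)
        kl = np.sum(p_teacher * np.log(p_teacher / p_student), axis=-1)
        return float(np.mean(kl) * T * T)    # T**2 is the usual rescaling

    rng = np.random.default_rng(0)
    teacher = rng.normal(size=(8, 32000))    # pretend vocabulary-sized logits
    student = rng.normal(size=(8, 32000))
    print(distillation_loss(student, teacher))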
What I was referring to is similar in concept, but I've seen both described in papers as distillation. What I meant was that you take the output of a large model like GPT-4 and use it as training data to fine-tune a smaller model.
Yes, that does sound very similar. To my knowledge, isn't that (effectively) how the latest DeepSeek breakthroughs were made? (i.e. by leveraging ChatGPT outputs to provide feedback for training the likes of R1)
> That said, fine-tuning small models because you have to power through vast amounts of data where a larger model would be cost-ineffective -- that's completely sensible, and not really mentioned in the article.
...which I thought was arguably the most popular use case for fine-tuning these days.
Not sure I agree that it's either/or. In-person assessments are still pretty robust. I think an ideal university will teach both, with a clear division between them (e.g. whether a particular assessment or module allows AI). What I'm currently struggling with is how to design an assessment in which the student is allowed to use AI - how do I actually assess it? Where should the bar actually be? Can it be relative to peers? Does this reward students willing to pay for more advanced AI?