Yeah the "High frame rate understanding" feature caught my eye, actual real time analysis of live video feeds seems really cool. Also wondering what they mean by "video reasoning/thinking"?
> 3. Turning long videos into action: Gemini 3 Pro bridges the gap between video and code. It can extract knowledge from long-form content and immediately translate it into functioning apps or structured code
I'm curious as to how close these models are to achieving that once long-ago mocked claim (by Microsoft I think?) that AIs could view gameplay video of long lost games and produce the code to emulate them.
Somewhat related: a while ago I was working on a project and wanted to use an RS485 to TTL conversion board which came with badly translated instructions. However, somebody had reverse engineered the design and uploaded an EasyEDA schematic. I shoved the raw JSON for the schematic (which looked quite cryptic to me) into Gemini 2.5 Pro and asked it if it could understand it, and it cheerfully responded with:
> Of course, Jack. I can understand the schematic from the provided JSON file. It describes an RS485 to TTL Converter Module.
> Here is a detailed breakdown of the circuit's design and functionality
...followed by an absolutely reasonable description of the whole board. It was imprecise, but with some guidance (and by putting together my basic skills with Gemini's vast but unreliable knowledge) I was able to figure out a few things I needed to know about the board. Quite impressive.
I had a really similar experience, which is a big reason why I built this. Uploading my own schematics to the usual web LLMs gave a mix of useful notes and some pretty big misunderstandings. I really believe this tool is set up to deliver better results than the general-purpose GPT/Gemini/Claude interfaces for this kind of task. Hoping others try it and have a much better experience too!
Also good call on processing EasyEDA schematics. I hadn’t considered that initially, but I’m definitely going to add support for it.
In general, there are always "better" solutions to any problem, but finding the right balance for your budget is the key.
If you're doing industrial work, then consumer-grade workmanship / LLM-slop is usually unacceptable. Start with the FTDI firmware tool and an isolation chip App-note...
One thing to note about these two APIs is that they affect how the session history (the back/forward stack) behaves, but the global browser history (entries shown in the History tab) is separate.
Most browsers record every change in the global history regardless of whether `history.pushState` or `history.replaceState` is used. The HTML Spec[0] is explicit about session history but does not define how global history should behave.
I can understand why the spec makes no mention of this -- global history is a user-facing UI feature, similar to address bar autocomplete, and it makes sense for browsers to control this behavior. That said, I'm always annoyed when I look into my history tab after visiting a page like this (e.g. Vercel Domains[1]), and see my global history flooded with entries for each individual keystroke I've made, all in the name of "user experience".
In this particular case, it's just a fun gimmick, but for everyday websites I'd much prefer if they just debounced the updates to the URL to avoid cluttering the global history.
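To make the session-history half of this concrete, here is a toy model of how the two calls treat the back/forward stack. It's only an illustration: `SessionHistoryModel` is a made-up class, not the browser API, and it says nothing about the global history, which (as noted above) browsers handle however they like.

```javascript
// Toy model of a tab's session history (the back/forward stack).
// NOT the real browser API; the class name and methods are hypothetical.
class SessionHistoryModel {
  constructor(initialUrl) {
    this.entries = [initialUrl];
    this.index = 0;
  }

  // Like history.pushState: drop any forward entries, then append a new one.
  pushState(url) {
    this.entries = this.entries.slice(0, this.index + 1);
    this.entries.push(url);
    this.index += 1;
  }

  // Like history.replaceState: overwrite the current entry in place.
  replaceState(url) {
    this.entries[this.index] = url;
  }

  // Like the back button: step to the previous entry, if there is one.
  back() {
    if (this.index > 0) this.index -= 1;
    return this.entries[this.index];
  }
}

const tab = new SessionHistoryModel('/search');
tab.pushState('/search?q=a');      // adds a second entry
tab.replaceState('/search?q=ab');  // rewrites that entry
tab.replaceState('/search?q=abc'); // rewrites it again
console.log(tab.entries.length);   // 2
```

So with `replaceState` the back button still takes you straight to `/search`, even though the address bar changed three times.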
Thanks for the feedback. Vercel Domains uses nuqs [1] (I'm the author) for URL state, and I agree that flooding the browser history is a bad experience.
Is there a way to update the URL (i.e., keeping it reactive in the address bar) without creating those history entries, or to ask the browser to squash the last entry it created into the previous one?
I am not aware of any approaches that work consistently across all major browsers. This matter is nothing new -- there's a Bugzilla report[0] from 13 years ago about this behavior that remains open.
Since there's no spec for global history and it's unlikely one will be introduced, the most practical solution to avoid flooding the browser history would be to debounce the changes.
This is the approach taken by Google Maps -- with maps being a well-known case where URL updates would clutter the history, as noted in the Bugzilla report.
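For anyone who wants to try the debounce route, a minimal sketch could look like the following. `createDebouncedUrlWriter` is a hypothetical helper (it is not how nuqs or Google Maps actually do it), the 300 ms delay is an arbitrary choice, and the `timers` parameter exists only so the logic can be exercised without a real clock.

```javascript
// Hypothetical helper: collapses a burst of URL updates into a single write.
// `write` is whatever actually touches the URL, e.g. a history.replaceState call.
function createDebouncedUrlWriter(
  write,
  delayMs = 300,
  // Wrapped in arrow functions so the browser's setTimeout keeps its expected `this`.
  timers = {
    set: (fn, ms) => setTimeout(fn, ms),
    clear: (id) => clearTimeout(id),
  }
) {
  let pending = null; // handle of the scheduled write, if any
  let latest = null;  // most recent URL requested

  return {
    // Remember the newest URL and restart the quiet-period timer.
    update(url) {
      latest = url;
      if (pending !== null) timers.clear(pending);
      pending = timers.set(() => {
        pending = null;
        write(latest);
      }, delayMs);
    },
    // Write immediately, e.g. right before the user navigates away.
    flush() {
      if (pending === null) return;
      timers.clear(pending);
      pending = null;
      write(latest);
    },
  };
}

// Browser usage (assumes an <input> element in a variable named `input`):
// const writer = createDebouncedUrlWriter((url) => history.replaceState(null, '', url));
// input.addEventListener('input', (e) => {
//   writer.update('?q=' + encodeURIComponent(e.target.value));
// });
```

This doesn't stop a browser from recording the final URL in its global history; it only means one entry per pause in typing instead of one per keystroke.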
function drawWorld() {
  var hash = '#|' + gridString() + '|[score:' + currentScore() + ']';
  if (urlRevealed) {
    // Use the original game representation on the on-DOM view, as there are no
    // escaping issues there.
    $('#url').textContent = location.href.replace(/#.*$/, '') + hash;
  }
  // Modern browsers escape whitespace characters on the address bar URL for
  // security reasons. In case this browser does that, replace the empty Braille
  // character with a non-whitespace (and hopefully non-intrusive) symbol.
  if (whitespaceReplacementChar) {
    hash = hash.replace(/\u2800/g, whitespaceReplacementChar);
  }
  history.replaceState(null, null, hash);
  // Some browsers have a rate limit on history.replaceState() calls, resulting
  // in the URL not updating at all for a couple of seconds. In those cases,
  // location.hash is updated directly, which is unfortunate, as it causes a new
  // navigation entry to be created each time, effectively hijacking the user's
  // back button.
  if (decodeURIComponent(location.hash) !== hash) {
    console.warn(
      'history.replaceState() throttling detected. Using location.hash fallback'
    );
    location.hash = hash;
  }
}
This seems to be done through the accessibility services, which makes sense: they're built for apps like screen readers that must by nature be able to read the content of other apps. Each app has to be manually authorized by going to the accessibility menu, and there's a big scary pop up that says "enabling this app will allow it to see and interact with everything you do."
This technique was also used in mid-to-late non-MotionPlus Wii games to smooth out the pointer movement! Early games had an incredibly twitchy pointer because they were simply mapping the IR data 1:1 to cursor movement, whereas later ones have an invisible circle around the cursor and only move the cursor itself once the circle edges start "dragging" it.
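The "invisible circle" idea is simple enough to sketch in a few lines. This is just my reading of the behavior described above, not actual code from any Wii game; `dragCursor` and the radius value are made up for illustration.

```javascript
// Sketch of a "dragged" cursor: the cursor sits at the center of an invisible
// circle and only moves when the raw pointer leaves that circle.
// `dragCursor` is a hypothetical name; positions are plain {x, y} objects.
function dragCursor(cursor, pointer, radius) {
  const dx = pointer.x - cursor.x;
  const dy = pointer.y - cursor.y;
  const dist = Math.hypot(dx, dy);

  // Pointer is still inside the circle: ignore the jitter entirely.
  if (dist <= radius) return cursor;

  // Pointer is outside: pull the cursor along just far enough that the
  // pointer ends up on the circle's edge again.
  const pull = (dist - radius) / dist;
  return { x: cursor.x + dx * pull, y: cursor.y + dy * pull };
}

// Small jitter around the cursor does nothing...
let cursor = { x: 0, y: 0 };
cursor = dragCursor(cursor, { x: 3, y: 4 }, 10);
// ...but a deliberate move drags the cursor along behind the pointer.
cursor = dragCursor(cursor, { x: 30, y: 40 }, 10);
```

The tradeoff is that the cursor always trails the pointer by up to `radius`, so a bigger circle means steadier but laggier aiming.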
It is the best version of fuzzy search I have ever seen: the ultimate "tip of my tongue" assistant. I can ask super vague things like "Hey, I remember seeing a tool that allows you to put actual code in your files to do codegen, what could it be?" and it instantly gives me a list of possible answers, including the thing I'm looking for: Cog.
I know that a whole bunch of people will respond with the exact set of words that will make it show up right away on Google, but that's not the point: I couldn't remember what language it used, or any other detail beyond what I wrote and that it had been shared on Hacker News at some point, and the first couple Google searches returned a million other similar but incorrect things. With an LLM I found it right away.
The training cutoff comes into play here a bit, but 95% of the time I'm fuzzy searching like that I'm happy with projects that have been around for a few years and hence are both more mature and happen to fall into the training data.
Technically speaking, Claude Code is an agent, for example. It's just a fancy term for an LLM that can call tools in a loop until it thinks it's done with whatever it was tasked to do.
ChatGPT's Deep Research mode is also an agent: it will keep crawling the web and refining things until it feels it has enough material to write a good response.
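To show how little is behind the term, here's roughly what "an LLM calling tools in a loop" boils down to. Everything here is a stand-in: `runAgent`, the reply shape, and `fakeModel` are hypothetical and don't match any vendor's real API, and a real agent would be async and talk to an actual model.

```javascript
// Minimal agent loop: ask the model, run whatever tool it requests, feed the
// result back, and repeat until it says it's done (or we hit a step limit).
// `callModel` is a hypothetical function: transcript in, reply object out.
function runAgent(callModel, tools, task, maxSteps = 10) {
  const transcript = [{ role: 'user', content: task }];

  for (let step = 0; step < maxSteps; step++) {
    const reply = callModel(transcript);

    // The model decided it has enough to answer.
    if (reply.type === 'final') return reply.content;

    // Otherwise it asked for a tool; run it and record the outcome.
    const tool = tools[reply.tool];
    const result = tool ? tool(reply.input) : `unknown tool: ${reply.tool}`;
    transcript.push({ role: 'assistant', content: reply });
    transcript.push({ role: 'tool', content: String(result) });
  }

  return null; // step limit reached without a final answer
}

// Scripted fake "model": requests one tool call, then answers with its result.
function fakeModel(transcript) {
  const last = transcript[transcript.length - 1];
  if (last.role === 'tool') {
    return { type: 'final', content: `The answer is ${last.content}` };
  }
  return { type: 'tool_call', tool: 'add', input: [2, 3] };
}

const answer = runAgent(fakeModel, { add: ([a, b]) => a + b }, 'What is 2 + 3?');
console.log(answer); // The answer is 5
```

The `maxSteps` cap is the boring but important part: without it, a model that never decides it's done would loop forever.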
I have very fond memories of using Opera Mini on my N-Gage and on a bunch of later feature phones that only had J2ME support. It felt revolutionary at the time, especially since data plans were still extremely restrictive.
Wifi on phones was also not quite a thing yet, so I had found an application for S60 phones that allowed them to share a computer's connection via Bluetooth. The range was extremely limited, but enough for me to browse the internet from my bed!
I only remember doing the reverse: when the wireless AP at home broke down, I created a Bluetooth PAN to use the cellular Internet connection on my computer. Didn't need to install any J2ME app. (At that time desktop OSes had basically no background processes that would use the Internet silently.)