LLMs are what they are; calling them "AGI" won't make them any more useful or exciting than they are, it'll just devalue the term "AGI", which has revolutionary, disease-curing, humanity-saving connotations. What exactly are you looking for us to say?
1. We aren't even close to AGI and it's unclear that we'll ever get there, but it would change the course of humanity in a significant way if we ever do.
2. Wow we've reached AGI but now I'm realizing that AGI is lame, we need a new term for the humanity-saving sales pitch that we were promised!
I think getting out of the binary is good for the long run. We have something which is artificial, intelligent, and general in scope. We're there. Is it perfect? No. Is it even good? Sometimes! Do airplanes flap their wings? Also no, but they do a lot of stuff nonetheless.
That's where we disagree: I do not consider a system that isn't capable of learning, improving, or reasoning to be generally intelligent. My most basic criterion for "AGI" is a system that can absorb and integrate new knowledge through repetition and experience in real time, just like a human would.
Further, their statements, knowledge, and "beliefs" should be reasonably self-consistent. That's where I'm usually told that humans aren't self-consistent either, which is true! But if I ever met a human that was as inconsistent as LLMs usually are, I'd recommend that they get checked for brain damage.
Of course the value of LLMs isn't binary, they're useful tools in many ways, but the sales pitch was always AGI == human-like, and not AGI == human-sounding, and that's quite clearly not where we are right now.
Yeah, this is in 'flies like a plane, not like a bird' territory. But I think it's closer than you think.
The systems do learn and have improved rapidly over the last year. Humans have two learning modes - short-term in-context learning, and then longer-term learning that occurs with practice and across sleep cycles. In particular, humans tend to suck at new tasks until they've gotten in some practice and then slept on it (unless the new task is a minor deviation from a task they are already familiar with).
This is true for LLMs as well. They have some ability to adapt to the context of the current conversation, but they don't perform model weight updates at this stage. Weight updates happen over a longer period, as pre-training and fine-tuning data are updated. That longer-phase training is where we get the integration of new knowledge through repetition.
In terms of reasoning, what we've got now is somewhere between a small child and a math prodigy, apparently, depending how much cash you're willing to burn on the results. But a small child is still a human.
I presume you're referring to the recent METR study. One aspect of the study population, which seems like an important causal factor in the results, is that they were working in large, mature codebases with specific standards for code style, which libraries to use, etc. LLMs are much better at producing "generic" results than matching a very specific and idiosyncratic set of requirements. The study involved the latter (specific) situation; helping people learn mainstream material seems more like the former (generic) situation.
(Qualifications: I was a reviewer on the METR study.)
For people who prefer reading to watching videos, I wrote a detailed account of my process for solving one of last year's IMO problems, along with thoughts on how this relates to AI:
1. I didn't see it stated explicitly, but I presume the neural net is on the far end of a radio link somewhere, not running on hardware physically mounted on the drone?
2. After viewing the FPV video on the linked page: how the hell do human pilots even come close to this pace? Insane (even assuming that the video they're seeing is higher quality than what's shown on YouTube – is it?)
3. The control software has access to an IMU. This seems to represent some degree of unfair advantage? I presume the human pilots don't have that – unless the IMU data is somehow overlaid onto their FPV view (but even then, I can't imagine how much practice would be needed to learn to make use of that in realtime).
1) No, this is interesting specifically because it's all onboard: the drone has a Jetson Orin NX on it.
2) No, the video the pilot sees is usually quite bad. Racing pilots usually use either HDZero (mid resolution video with weird pixel artifacts sometimes) or analog video (looks like a broken 1980s VCR). It’s amazing what they can fly through. These DCP spec drones are also slow by racing standards. Look up MultiGP racing, it’s even faster.
3) It can be overlaid, but it's useless. The human pilot's control sticks are the input to an outer rate-regulation loop, which in turn wraps an inner stabilization loop fed by the gyro, so the IMU is still in the mix for human control.
2. The video they're seeing is worse. Spectators typically see the frames saved directly from the camera, but the pilot will be seeing them compressed and beamed over the air to their headset. See vid.
3. The human pilots do actually have access to it. Not directly, but the flight controller translates their inputs and makes use of the IMU to do so.
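A rough sketch of that stick-to-rate-to-gyro arrangement (this is a hand-wavy illustration, not any real flight controller's code; the function names, the 600 deg/s max rate, and the gain are all made up):

```python
def stick_to_rate_setpoint(stick, max_rate_dps=600.0):
    """Outer loop: map a centered stick position in [-1, 1] to a
    desired angular rate in degrees per second (assumed linear map)."""
    return stick * max_rate_dps

def rate_controller(setpoint_dps, gyro_dps, kp=0.02):
    """Inner loop (P-only for simplicity; real controllers use PID):
    correction proportional to the error between the requested rate
    and what the gyro actually measures."""
    return kp * (setpoint_dps - gyro_dps)

# Full right roll stick while the quad is already rotating at 100 deg/s:
sp = stick_to_rate_setpoint(1.0)    # pilot requests 600 deg/s
corr = rate_controller(sp, 100.0)   # positive correction: speed the roll up
```

The point of the sketch is that the pilot never sees raw IMU numbers; the gyro only shows up as feedback inside the inner loop.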
I’m reminded of when the US military figured out it should just replace all its proprietary field drone controllers with Xbox controllers because every single grunt that enlisted already had 10,000 hours on the things. If the future of warfare is drones, Christ, that video is terrifying.
Funny you should say that. Gamepads are not quite what you want for drone piloting for three main reasons:
1. Less precise. Gimbal size matters.
2. All inputs are sprung. This is exactly what you want for your three rotational axes, but you absolutely do not want your throttle resetting to 50% when you let go. You can fix this using 3D mode, where the zero setting is in the middle, but then you lose even more precision.
3. Circular inputs. This means at low or high throttle you have less roll available.
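The circular-input limit is simple geometry: if the combined stick vector is clamped to a circle, the lateral axis's range shrinks as the other axis moves away from center (a toy illustration assuming a unit-radius gate and a centered throttle axis, as in 3D mode; real gamepad gates vary):

```python
import math

def available_lateral(throttle):
    """Max lateral deflection when the combined stick vector is clamped
    to the unit circle. throttle in [-1, 1], 0 = stick center."""
    return math.sqrt(max(0.0, 1.0 - throttle * throttle))

available_lateral(0.0)   # full lateral authority at mid-throttle
available_lateral(1.0)   # none at all at full throttle
```

A square gate, as on a dedicated radio's gimbals, gives full range on both axes simultaneously.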
The main reason you'd want a gamepad is the size and shape. They do make gamepad-style radios, like the Radiomaster Pocket, which combine the best of both worlds.
You can pick up a simulator for $10-20 if anyone wants to give it a whirl, and many are even on Steam, but the general recommendation is to pick up a dedicated radio as soon as possible.
Note that this mainly applies to FPV quadcopters, due to how sensitive and twitchy they can be. When it comes to controlling pretty much anything else (I'd argue even most planes) these advantages are no longer relevant.
The US military is not limited to using stock COTS hardware. They have imitated the form factor and general feel of those controls, but custom built and ruggedized.
The numbering jump is because there was "Claude 3.5" and then "Claude 3.5 (new)", and they decided to retroactively stop the madness and rename the latter to 3.6 (which is what everyone was calling it anyway).
This one line was like 90% of the original implementation of Writely (the startup that became Google Docs; source: I was one of the founders).
The other 90% was all the backend code we had to write to properly synchronize edits across different browsers, each with their own bizarre suite of bugs in their contenteditable implementations :-)
Not OP, but common solutions in this space represent the state as conflict-free replicated data types (CRDTs). Some popular browser-based libraries for that are Y.js[0] and Automerge[1].
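The core "conflict-free" idea can be shown with the simplest CRDT, a grow-only counter. This is a toy illustration of the merge semantics, not Y.js or Automerge's actual API (real collaborative-text CRDTs are far more involved):

```python
class GCounter:
    """Grow-only counter CRDT: each replica tracks its own increments."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> that replica's increment total

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Per-replica max is commutative, associative, and idempotent,
        # so replicas converge no matter the order updates arrive in.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

a, b = GCounter("a"), GCounter("b")
a.increment(2)          # concurrent edits on two replicas
b.increment(3)
a.merge(b); b.merge(a)  # sync in either direction; both now read 5
```

Collaborative text editing uses the same convergence property, just over much richer structures (sequences with stable position identifiers) instead of counters.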
I'm not sure exactly what you mean by "dynamic text box" but it was just a contenteditable div. There have been at least two complete rewrites since I was involved, nowadays I believe it's a canvas with all of the editing, formatting, layout, and rendering done "by hand" in JavaScript.
The kind of text box used in PHP forums ... it's called a textarea, I think. It would be hidden every time focus went away, with an HTML-based layout presented in its place. That seemed clunky, but I assumed it was what made Writely possible. contenteditable would have been such a breath of fresh air had I known about it. I wonder whether IE6 supported it.
The article clearly states that providing enough power to run the heaters was one of the challenges that led to the death of the probe. Satellites are rarely in the shade for an extended period.
There isn't a uniform temperature across the entire exchanger. There's a smooth gradient extending from one end to the other. If the outside is hotter, then the inbound air gradually cools as it gives up heat to the outbound air which is gradually warming.
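That gradient shows up even in a toy steady-state model of a counterflow exchanger (all numbers below, including the per-segment exchange fraction, are made up for illustration; a real design would use proper heat-transfer correlations):

```python
N = 50                       # segments along the exchanger
hot_in, cold_in = 30.0, 0.0  # outside air hotter than inside, per the comment
k = 0.1                      # per-segment exchange fraction (assumed)

hot = [hot_in] * N           # inbound stream, flows index 0 -> N-1
cold = [cold_in] * N         # outbound stream, flows index N-1 -> 0

for _ in range(5000):        # relax toward steady state
    new_hot, new_cold = hot[:], cold[:]
    for i in range(N):
        upstream_h = hot_in if i == 0 else hot[i - 1]
        upstream_c = cold_in if i == N - 1 else cold[i + 1]
        q = k * (upstream_h - upstream_c)  # heat moved hot -> cold locally
        new_hot[i] = upstream_h - q
        new_cold[i] = upstream_c + q
    hot, cold = new_hot, new_cold

# hot[] now falls smoothly from inlet to outlet, cold[] mirrors it, and at
# every point along the exchanger the hot side stays warmer than the cold.
```

The counterflow arrangement is what keeps a usable temperature difference along the whole length; run the streams the same direction and the difference collapses partway through.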