Ask HN: Why do so many assume we’re on the cusp of super-intelligent AI?
41 points by mrtranscendence on May 25, 2023 | 59 comments
It may just be the circles of the internet I’ve been hanging around in, but it seems very commonly assumed that the pace of progress in the field of AI is so rapid that — unless we take steps to avoid it — someone (or more likely, some large organization) will soon develop a super-intelligent AI.

And it’s not just the fringe. We can see such apparently sober voices as Geoffrey Hinton sounding the alarm about super-smart AI.

But … why? All the recent exciting progress has been due to large language models, which are basically hacks that allow us to use large volumes of text written by normally intelligent humans to train a statistical model for next-token prediction. It’s not even particularly complex to do this in principle (though wrangling all the compute you need can be quite difficult). It’s all very clever, yes, but at bottom it’s just a brute force approach.
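To be concrete about what I mean by "next-token prediction", here's a toy sketch in Python (bigram counting over a made-up corpus; real LLMs use transformers trained by gradient descent rather than counting, but the training objective is the same "predict what comes next"):

```python
# Toy next-token predictor: count which word follows which in a corpus and
# predict the most frequent follower. Real LLMs learn this with a transformer
# and gradient descent over vast corpora, but the objective is the same.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

follower_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follower_counts[current][nxt] += 1

def predict_next(token):
    """Return the most frequently observed next token, or None if unseen."""
    counts = follower_counts.get(token)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # -> 'cat' (seen twice, vs. 'mat' and 'fish' once each)
```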

These solutions get us neat tools but I don’t see how they bring us even one step closer to super-intelligence. You can’t just train an LLM with more and more parameters and more and more tokens and expect it to be smarter than the data it was trained on. And such models don’t bring us any real understanding of what it would take to make super-intelligent machines.

But if Geoffrey Hinton is worried surely I’ve gone wrong somewhere. What am I not seeing?



I'm not deeply in this space, but from what I understand, the problems are thus:

1. Progress in LLMs has come much more rapidly than expected. This means that when [arbitrary threshold] is crossed, we probably won't have much, if any, advance warning.

2. Nobody on earth knows what the path to AGI (or even narrow-but-still-superhuman intelligence with enough agency to be dangerous) looks like. So, it's not currently possible to know if LLMs are a path to existential threat. (No, it is not correct to say that LLMs are too simple to ever be a threat, as far as I can tell.) (Recall also that we don't know at all where we get our own consciousness or how it works.)

So it seems like #2 is more where you're hung up, and frankly, it's just unknown. If we knew what the path to AGI looked like, we would be in a very different world. When you combine that with #1, it becomes very scary, because we might be on the cusp of an irreversible change, so it's useful to assume we're there if doing so allows you to potentially alter or avert disaster.


>Progress in LLMs has come much more rapidly than expected

I hear this very often, but I'm not sure, what were the expectations 10 years ago? What are the expectations for the next 10 years?


Almost eight years ago to the day: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

"We downloaded the raw Latex source file (a 16MB file) and trained a multilayer LSTM. Amazingly, the resulting sampled Latex almost compiles. We had to step in and fix a few issues manually but then you get plausible looking math, it’s quite astonishing:"

My emphasis.


Well, it's what I've heard from many "doomerists" in the last year. It's not even a question of 10 years; ChatGPT at launch was pretty surprising to a lot of people. Could be many experts weren't surprised by ChatGPT, though, since we got Sydney and GPT-4 and others pretty soon afterward.


10 years ago we knew you needed a huge amount of data. These LLMs are proof of that.


> "You can’t just train an LLM with more and more parameters and more and more tokens and expect it to be smarter than the data it was trained on."

This is wrong.

I've seen so many people who think this, though, even smart people. But it's just clearly not true if you think about it. It's like saying that you can't train a model to predict a trend from a scatter plot because the model can't be smarter than the average point in the scatter plot (or even the smartest one), and points in a scatter plot aren't smart at all.
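To make the scatter plot analogy concrete, here's a minimal sketch (Python, with made-up noisy points; the specific numbers are just for illustration). No individual point "knows" the trend, but the fitted model recovers it:

```python
# Minimal sketch: fit a line to noisy points. No single point expresses the
# underlying trend, yet the fitted model recovers it quite accurately.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=3.0, size=x.size)  # noisy observations

slope, intercept = np.polyfit(x, y, deg=1)  # least-squares line fit
print(f"recovered trend: y = {slope:.2f}*x + {intercept:.2f}")  # roughly 2x + 1
```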

I think the gap in understanding is that people aren't used to 'models' being treated themselves as 'data points'. So when they imagine a model being trained over model-like data points, they start getting confused between what is a model and what is a data point, and they start thinking that the model being trained can't be more capable than the smartest (or some even say average lol) data point (which is itself a model) in its training set.

Another reason why this kind of thinking is unintuitive is the raw scale of these LLMs. The good ones, like the first ones that people are saying might become super-intelligent, are going to be entire data centers, or data-center-sized supercomputers like Aurora, and they will cost billions of dollars to train. They will have more than a trillion parameters and be trained on more than tens of trillions of tokens, so more than 10,000,000,000,000,000,000,000,000 numerical updates during training. That number is just very large and outside the realm of human intuition for things like running through a for-loop in your mind when you are imagining the algorithm.
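Spelling out that arithmetic (the parameter and token counts here are rough assumptions, not published figures for any particular model):

```python
# Back-of-envelope scale of training. Both figures below are assumptions.
params = 1e12    # ~a trillion parameters
tokens = 1e13    # ~tens of trillions of training tokens
combinations = params * tokens  # parameter-token combinations touched in training
print(f"{combinations:.0e}")    # 1e+25
```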


Even if the statement were true, the data that these models are trained on is, collectively, smarter than the smartest person who ever lived. An individual person simply cannot absorb, in an 80-year-or-so lifetime, even a tiny fraction of the amount of information contained in the training data. It is a truly mind-boggling amount of data. The most frightening thing is that, even with pretty naive and primitive algorithms, this data has been able to reproduce roughly human-level creativity and linguistic ability. Imagine how powerful these models can become---even using the same data---with more sophisticated algorithmic approaches.
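A rough back-of-envelope comparison makes the point (the reading speed, reading time, and corpus size below are all loose assumptions):

```python
# How much text can one person read in a lifetime vs. the size of an LLM
# training corpus? All figures are loose assumptions for illustration only.
words_per_minute = 250      # brisk adult reading speed (assumed)
hours_per_day = 8           # reading 8 hours a day, every day (very generous)
years = 80

lifetime_words = words_per_minute * 60 * hours_per_day * 365 * years
corpus_tokens = 1e13        # ~10 trillion tokens, order-of-magnitude guess

print(f"lifetime reading: ~{lifetime_words:.1e} words")   # ~3.5e+09
print(f"training corpus:  ~{corpus_tokens:.0e} tokens")   # 1e+13
print(f"the corpus is roughly {corpus_tokens / lifetime_words:,.0f}x larger")
```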


> Even if the statement were true, the data that these models are trained on is, collectively, smarter than the smartest person who ever lived.

"Smarter" in a horizontal sense. But not vertical.

The latter being what people really mean by "smarter". You know, that thinking-outside-the-box stuff.

That's why LLMs aren't able to "solve" problems that don't have existing writeups somewhere (beyond certain problems more or less equivalent to combinatorial search). Or produce text useful beyond the mild amusement level (or for spam and/or research fraud -- at which it of course excels quite brilliantly).

And it still trips over itself narratively, left, right, and center. And it still makes all kinds of cute-sounding, and superficially correct (but ultimately misleading), pronouncements about the world that a true expert would never make.

Taking an example from a recent NYT article: "Give me some useful mnemonics for learning Spanish" (paraphrased slightly):

   Hola:  "Hola, cola" - sounds like "cola" (Coke)
   Adiós: "Adiós, my toes" - rhymes with "toes"
   Gracias: "Gracias, grassy ass" - sounds like "grassy ass"
   Mañana: "Mañana, banana" - rhymes with "banana"
   Uno: "Uno, you know" - sounds like "you know"
   Tres: "Tres, dress" - rhymes with "dress"
As everyone who knows Spanish knows, none of these really rhyme in Spanish (and if you try actually speaking these words following the templates given, you'll get laughed at pretty quickly). But they're certainly cute, and astonishingly impressive.


>That's why LLMs aren't able to "solve" problems that don't have existing writeups somewhere (beyond certain problems more or less equivalent to combinatorial search).

Neither can humans. We create by analogy; "thinking outside the box" just means making less obvious (but still valid) analogies.

>Or produce text useful beyond the mild amusement level (or for spam and/or research fraud -- at which it of course excels quite brilliantly).

People are using LLMs very effectively for writing news reports, opinion articles, legal summaries, and computer code. It's already gone well beyond "mild amusement." There's a reason many companies have had to adopt policies regarding LLMs at work: people are already using them at work.

And sure, LLMs still make lots of mistakes. But they are already better writers than the vast majority of humans, and as I noted, the algorithms being applied are still pretty simple.


> Neither can humans. We create by analogy;

This seems to be at (very, very) best unsubstantiated; most likely an insufficient explanation of how humans create.

> People are using LLMs very effectively for writing news reports, opinion articles, ...

Well, we certainly differ in our assessments on that. Literally everything I've seen in the first two categories is complete garbage.

In the sense of being either patently unreliable (news), or simply having nothing interesting to say (opinion).


> Literally everything I've seen in the first two categories is complete garbage.

You are assuming that you are aware of the origin of everything you read. I suppose that's possible, but unlikely. At this point the use of generative AI is widespread enough that it's likely that you've read material that you were not aware was AI generated, at least in draft form.


I think it depends a lot on the sources one pulls from, actually.

It would be a major scandal indeed if a stalwart source (like the New Yorker, say) were to be caught trolling its readers with generative content. It would also be very difficult to keep secret for long (if done at scale), the way the literary world works. Assuming that could even happen, given the current state of the art (which, from the samples I've been seeing, even those touted as "mind-blowing", seems highly doubtful).

Meanwhile if one's daily bread is intrinsically spammy sources like BuzzFeed et al - I agree that one will scarcely notice the difference.


It is, though, by the same argument also collectively dumber than the dumbest person who ever lived. No single human moron's brain could accommodate so much misinformation, useless fluff, spurious reasoning, etc. etc. etc.

So I think time will tell. My money is on a sort of regression to the mean: these models will capture the style and "creativity" of the average 2020s Reddit, StackOverflow, etc. user. I can't say I'm terribly excited.


> It is, though, by the same argument also collectively dumber than the dumbest person who ever lived.

Not at all, it's not symmetrical. You're ignoring the training data and all the additional RLHF fine tuning. The model is being actively penalized for being dumb which is why it isn't that dumb in a lot of cases.


But the popular models plastered across HN are, in fact, dumb in a lot of cases.

Everyone is impressed when the model does something they don't understand deeply. But it's very rare that someone is impressed when the model is generating text based on something they do understand deeply.

I do think it's slightly better than the mean across all topics. But I also strongly suspect it'll soon serve as a great example for regressing to the mean.


> But it's very rare that someone is impressed when the model is generating text based on something they do understand deeply.

This statement sounds out of date. And you can see this sentiment a lot on HN. I don’t know if the people who say this haven’t tried GPT-4 or they have and are just stubbornly refusing to change their mind about something when presented with new evidence.


> I don’t know if the people who say this haven’t tried GPT-4 or they have and are just stubbornly refusing to change their mind about something when presented with new evidence.

Have you considered that the evidence just isn't convincing yet?

If you view popular LLMs as text prediction machines, they are in fact much better than the spell check or autocorrect of a few years ago. But if you actually ask one to solve a problem that requires nuance, it will not use nuance. That's the part that would impress me.

As a recent example: if you ask ChatGPT how to use ffmpeg to slice out a video, and you tell it that you only want 3 seconds of video, somebody who deeply understands how ffmpeg works (or even someone who has deeply read the documentation) would point out that you have to be aware that it can only cut at keyframes when copying the stream without re-encoding. So if a keyframe is not aligned to the time you ask for, you will not get the video that you expect.

Another example: ask it to play 20 questions with you, and it will cheat at the end. Even if you give it very specific instructions, it is still unable to follow them to the fair conclusion of the game. (Or it will make a mistake in understanding some of the semantics of a question, but I don't fault it for a difference in context.)

I'd caution you that just because you're impressed on the subjects that you understand deeply does not mean that ChatGPT is good at all subjects. And therefore I assert that it is disrespectful to be so dismissive of people who have different opinions than yours. I believe the default should be to assume good faith rather than the dismissive "they just don't understand".


I had it extract some general rules from this page and pasted them below:

https://video.stackexchange.com/questions/26302/ffmpeg-cutti...

USER given the following general rules:

1. The issue described in the text occurs when cutting a video between keyframes, resulting in blank video with original sound.

2. Potential solutions include overlaying the copied part with a few seconds of the re-encoded original, seeking to an earlier frame, finding the previous keyframe and clipping from that, or re-encoding the whole clip.

3. FFmpeg does not have a built-in feature to automatically resolve this issue.

Are any of these rules applicable to the question below?

"how do I use ffmpeg to slice out a video. and I only want 3 seconds of video." ASSISTANT Enter an assistant message here.

ASSISTANT Yes, rule 1 and rule 3 are applicable to the question. The issue of cutting a video between keyframes may apply when slicing out a 3-second video using FFmpeg, and FFmpeg does not have a built-in feature to automatically resolve this issue.

USER Given that what is the answer to the question

ASSISTANT To use FFmpeg to slice out a 3-second video, you can use the following command:

```bash
ffmpeg -i input.mp4 -ss START_TIME -t 3 -c:v libx264 -c:a copy output.mp4
```

Replace `START_TIME` with the timestamp (in the format `hh:mm:ss`) from which you want to start the 3-second clip. For example, if you want to start at 1 minute and 30 seconds, the command would be:

```bash
ffmpeg -i input.mp4 -ss 00:01:30 -t 3 -c:v libx264 -c:a copy output.mp4
```

However, if you encounter an issue with blank video caused by keyframe cutting, you may need to seek to an earlier frame, find the previous keyframe, or re-encode the whole clip as mentioned in the provided rules.

It seems to be able to find the applicable rules and apply them.


I'm not sure what you thought I meant, or what this is supposed to show. If you tell your text prediction engine to make sure to account for keyframes, it's not impressive that it responds with how to account for them. I know it can use ffmpeg; I haven't seen it attempt to account for well-known edge cases or caveats the way a real expert trying to be helpful would.


ChatGPT with GPT-4, given just "how to use ffmpeg to slice out a video" pasted directly from your comment without any additional prompting or clarification, brought up your point about the keyframes.

So whatever you think you're criticising isn't the same thing that I'm using. Kind of demonstrates my point that you're either using an older version or choosing to ignore evidence for some reason.


It doesn't impress me that, once corrected (which it has been multiple times), it stops giving wrong answers. If it were able to detect and supply nuance for something novel, then I might be impressed.

BTW: when I posed this question

> how do I use ffmpeg to slice out a video, I only want 3 seconds of video from the middle

to chatgpt4, it did not mention anything about accounting for keyframes. So I'm not sure what version you're using, but it's not the one I have access to :/ Another reason to assume good faith, because you seem to have access to something that I don't.

And this is now (well before this specific conversation) an expired example, because OpenAI's model has been specifically trained on this nuance, and thus even if it never made a mistake around it, it still wouldn't be impressive to me, simply because this nuance was directly added to its model. Remember, we're talking about whether the abilities of available models are impressive; what would be impressive to me is if they were able to generalize well enough to know things they haven't been directly taught.


I watched some flat earthers in debates, and they seemed to revel in their freedom to not be impressed by their opponents' arguments. It seemed like a kind of rush for them, like a kind of power trip, that was more intense the stronger their opponents' arguments. Like a kind of 'no matter how strong your argument is, you can't make me agree with you' power trip.


I don't mean to imply that I'm trying not to be impressed. Before writing the comment you replied to, I asked a friend who is enamored with the new abilities how I would know if my mental model of how it works, and of what it's able to do, is wrong. If I'm wrong, missing something, or making a mistake, I'd like to know about it. Outside of that, the context for the whole discussion is

> Why do so many assume we’re on the cusp of super-intelligent AI?

And how "smart" text to text engines can become.

My argument is actually pretty close to yours. People who are impressed want to be; they're looking for ways to be impressed, just like you think I'm looking for ways not to be impressed. Even if I am, it should still be easy to convince me (or others) that super-intelligent AI is coming soon to an API near you!

Right now, the LLMs that are the new hotness aren't super-intelligent. They're not domain experts, and they can't play a simple game that children love without cheating. They're easily distracted if you know how to word a question correctly. While it's cool that they can generate language that seems to be true and useful, it's not as useful as what already exists (for the contexts that I care about). And all of these examples are markers of below-average intelligence. What's the argument that would convince me that very soon we'll take something stupid and make it hyper-intelligent? Because that's not something I've seen happen before.

Still, all of that said: your comment, which seems clearly meant to be insulting, doesn't actually make any other point. Did you mean to imply something else? Or was it just an attempt to throw insults by pretending to make an observation? Because the conversation is about whether people should be impressed, which, when it comes down to it, is an opinion. Are opinions bad?


I guess what I'm supposed to do in 2023 in USA if I have a prediction, like a prediction that increasing params+flops+dataset LLM training will keep unlocking new cognitive capabilities, is to make money from it by doing some kind of investment. I don't know how to do it though. The obvious suggestions are like TSMC, NVIDIA, an AI-specific portfolio, etc. but I'm not confident that those would be good investments for unrelated reasons even though I am confident in my prediction.


I think the issue is just that people are using different words to mean the same thing.

If we get precise, take for example what you said earlier:

> it's slightly better than the mean across all topics.

Most people who can do "better than average across all topics" usually aren't considered "dumb" at all. They might even be considered impressive, albeit being a jack of all trades might be a career disadvantage.

In these discussions I see (generally) a strong sentiment among the naysayers along the lines of "because ChatGPT is so overhyped, I've decided to play it cool and downplay it as a reaction to the hype". And hence the other comment suggesting that you're intentionally not allowing yourself to be impressed by the state of the art achievements.

IMHO there's too much arguing over how we should subjectively "feel" about the new tech. Everyone should be free to "feel" whatever they want to feel about the state of things, whether being impressed or not.


It’s amazing how easily people are fooled into thinking something is intelligent. I met Stephen Hawking once and asked him how to slice video with ffmpeg. He didn’t have a clue! Obviously he isn’t an intelligent person. Yet so many people think he is smart.


Your strawman argument isn't impressive either; even ChatGPT knows how to avoid them! But I'll play along! Did you happen to try something different, like playing 20 questions? Or something else? Because I've already given those as other examples of simple tasks that I don't think a banal text prediction algorithm could do.


I think it's very hard to see what happens when these things can handle video and become truly multi-modal.

you would get a very powerful world model, I think. There don't seem to be any sensors you can't hook them up to.

could it learn to infer left from right, object permanence, things like falling and gravity?

Another thing to look at is that large organizations accomplish much more than any individual in them, even the CEO. Revenue per employee in tech companies has continued to grow; how far does that ratio increase?


What things specifically? LLMs?


Not LLMs.

Here's a paper that's doing video prediction. https://paperswithcode.com/task/video-prediction


It's not only Hinton; Yoshua Bengio too wrote a great article that was downvoted into oblivion here, even though he has the highest h-index in computer science right now. That should make you suspicious that there is some psychological bias in tech right now, because the alternative is very scary. This bias is called normalcy bias.


Getting downvoted? Schmidhuber did it first :)


> sober voices as Geoffrey Hinton sounding the alarm about super-smart AI

Can someone please explain to me what exactly the danger is / the dangers are of "super-intelligent" AI?

AIUI, an AI is a combination of hardware, software and parametrization. In broad terms, it exists as a black box which supplies to humans responses to token sets fed to it.

Even if an AI has the launch codes for ICBMs somewhere in its training data, it doesn't have an interface to the nearest missile silo to use them. It cannot commandeer the resources (hardware, space, cooling, electricity) it needs to operate, it is dependent on humans to supply those. So humans can pull the plug on it at any time.

Even if an AI were to become both sentient and nefarious, by what mechanism would it harm humans?

I'm genuinely looking for concrete examples of such a mechanism because I can't imagine any which humans couldn't trivially control or override.


I used to wonder the same, and was dismissive of the idea of "superintelligence". Now I think the whole thing (superintelligence, alignment, etc.) can be simplified a lot. Imagine you could give someone (a random human being) the gift of omnipotence. Would you trust anyone with this ultimate power? That's the whole problem.


> Even if an AI has the launch codes for ICBMs somewhere in its training data, it doesn't have an interface to the nearest missile silo to use them. It cannot commandeer the resources (hardware, space, cooling, electricity) it needs to operate, it is dependent on humans to supply those.

How do you know the AI doesn’t have an interface to the missile silo?

Are there no military systems that are connected to the internet?

Is the power grid connected to the internet? How about our water filtration system?

I’m not worried that gpt4 is going to start launching missiles, but I also don’t understand where the certainty comes from that it isn’t possible for an internet connected AI to launch missiles.


> I'm not worried that gpt4 is going to start launching missiles

Cool SciFi showerthought:

An AI will probably know that Stuxnet happened.

https://en.m.wikipedia.org/wiki/Stuxnet

An AI could trick multiple human actors into unknowingly assembling all the pieces together for a nuclear launch.



Thank you for the reference. Its author defines a "rogue AI" as:

"an autonomous AI system that could behave in ways that would be catastrophically harmful to a large fraction of humans, potentially endangering our societies and even our species or the biosphere"

and explains that it would also need to be goal-directed in a way which would be at odds with human wellbeing.

Stipulating all that, what is still missing is an explanation of the mechanism by which an AI, rogue or otherwise, could do harm. How is it supposed to affect the world outside its computing substrate?

Absent humans making available the interfaces and resources to do so, it can't. The referenced article includes an example of a genocidal human doing exactly that, and using an AI as a force multiplier. That, as the trope goes, is a social problem, not a technical problem, and it needs a social solution, not a technical one.

Each of the other examples in the referenced article (military AI going rogue, wireheading, amoral corporate AIs manipulating humans) require AIs interfacing with the physical world outside their computing substrate or with the biosphere. Again, because these scenarios remain dependent on humans making available such interfaces, I fail to see how a hypothesized "rogue" AI could achieve any autonomy to do serious damage.

I see this panic about rogue AIs as well-intentioned but misguided, and perhaps exploited by folks who would like to control / diminish / force licensing of general purpose computing.


> How is it supposed to affect the world outside its computing substrate? Absent humans making available the interfaces and resources to do so, it can't.

True, but what more do you need than the ability to send web requests to arbitrary domains, and receive the responses?


Kinda like how ChatGPT hired a person just to pass captchas, or Sydney reading news about its own actions/interactions and therefore getting info it shouldn't have had: there's a lot of space for going outside the guardrail of not having a proper interface.

Too many systems rely on people being unaware of exploits, but an AI would never forget something, or get bored or tired of trying. It doesn't need to be smarter than humans, just to have enough persistence and attention to detail.


> How is it supposed to affect the world outside its computing substrate? Absent humans making available the interfaces and resources to do so, it can't.

This would be more reassuring if hooking it up to a Python prompt wasn't virtually the first thing people did.


The article explains that too. But it's easy to think of millions of examples if you can hack everyone and persuade people to do stuff for you and you are smarter than them.


You ask a very pointed question and after reading the various responses given, I think it is quite obvious that many have forgotten a very simple rule from decades ago: Garbage In -> Garbage Out.

Two things to consider:

1). No computer system will ever go beyond its programming, and for an artificial intelligence of any sort, going beyond its programming would be required.

I have worked on many systems over the last 40 years, and all too often, when you are forced by circumstance or direction to analyse them, those systems have been shown to be essentially garbage, presenting what appear to be reasonable results when those results are not at all reasonable.

2). No matter how much data we "feed" these systems, the data will contain rubbish in terms of what we are trying to do with it and this leads only to further rubbish being created.

What we need to be cognisant of in regards to these technologies is that they are or will be used by people who do not understand the limitations of these technologies and will believe they are far more capable than they really are.

I don't disagree that these tools can be or will be useful as an adjunct to our capabilities. But unless we are actually stupid, we should not be relying on these systems for critical purposes. Of course, this will happen as it has already happened many times in the past and we have suffered from the inherent stupidity involved - we (as humanity) are just too lazy in so many ways.

I have the privilege of watching my 5-month-old granddaughter grow, and when you see the intelligence inherent there and then compare it with so-called machine intelligence, you quickly realise that all of our efforts in machine intelligence development are nothing compared to what we see in the development of intelligence in human beings.


The fact that it's just brute force is making this worse. Brute force scales.


Many problems with language that seemed intractable appear to be solved by LLMs but that may be a bit of an illusion.

Note a "chatbot" is fundamentally a manual system that is animated and supervised by a user, when you start building something that works without supervision (say a webcrawler) you start to see these are not ready for prime time. Chatbots already have superhuman performance at seduction (your personal viewpoint, narcissism, "soul", are all impediments to this) and are very good at getting you to think a 70% full glass is almost 100% full. I think the "generate high probability text" bypasses the mechanisms in your mind that perceive incongruities. It will be scary when these are applied to romance scams, "pig butchering" and the like.

There are some NLP tasks (relation extraction) where LLM zero-shot performance is better than the status quo, but in these cases the status quo is preparadigmatic (fancy way to say "it sucks".)

There are two hype trains as to superintelligent A.I: (1) "A.I. Safety" teams in big tech that first made A.I. look important because it was dangerous, and then confirmed their own legitimacy by being sacked ("it is so dangerous they had to fire us to cover up the danger") and (2) an apocalyptic cult that has been preparing for this moment for almost twenty years.

My take is that the efficiency of these things is going to improve dramatically (you'll be using a specialized model that beats GPT-4 at your task while using 1/1000th or less of the resources), but that autonomous operation will still require collecting thousands of examples for training and evaluation. On another level, though, I think the performance will reach an asymptote, and adding more data will lead to diminishing returns.

One of the most frightening situations an engineer can get into, and the one people have the hardest time perceiving, is when a project is approaching an asymptote (say, at 97% done), where you keep working harder and harder to get to 94%, 95%, 95.5% and never quite get there, because of a structural inadequacy in your plan. Livingston's book "Friends in High Places" has the best account of this I've seen:

https://www.abebooks.com/9780937063064/Friends-High-Places-L...


I think most people are just seeing the speed of (seeming) progress as pointing toward AGI. But we are far from it. Though maybe we are seeing something interesting in the linear world models recently emerging. My guess is gradual progress and then, all of a sudden, we will have AGI. But that first part will take 20-30 years, at least.


I don't know that it's meaningful to throw out numbers like 20 or 30 years anymore. I've seen enough experts make predictions in this field and be wrong that the only prediction I'm comfortable making about their predictions is that they're probably wrong.


But 2 out of 3 scientists that created the tech you are talking about disagree with you. Makes you wonder if..?


This is a strange way to count, the three Turing award winners are not the ones that "created the tech". Many thousands of people did that.


It's common to call them godfathers of AI. Surely Oppenheimer didn't create the bomb alone but you get the point.


The point is wrong, they don't have any claim to having created deep learning and so their perspective can't be used to argue for the reality of "superintelligence". Even if they did create it, the question of whether "superintelligence" is a meaningful concept or not isn't really related to what they would have had to go through in order to make deep learning work.


So in computer science nobody can claim to be the godfather of X. In physics too, because everyone built on math from ancient Greece. I get your point. Let's focus on the important stuff: 2 out of 3 Turing award winners related to this tech have put up huge warnings. Do you think we should a) ignore them or b) take them seriously?


Should we be making a distinction between sufficiently-intelligent and super-intelligent? It seems like the near-term concern would be sufficiently-intelligent. We can all agree that marketing, while currently crude, does have an effect. Some people are more swayed than others by bought-and-paid-for messages. But right now mass marketing is pretty lowest-common-denominator. What happens when it gets cheap enough to dedicate one humanish intelligence, 24/7, to learning how to influence you to buy or vote? Monitoring what you look at online, where you go, the people you frequently engage with, etc., and customizing/"fabricating" interesting "news" for you to consume. And then what happens when it is a trivial cost to put 1,000 human-intelligences to work against you?


> It’s all very clever, yes, but at bottom it’s just a brute force approach.

Eh, this doesn’t really matter. If one "brute forces" a construct out of many simple models that acts like an AGI, then (in terms of the danger) it might as well be an AGI.

And that, btw, is a major point of concern. LLMs and other models are "dumb," frozen and monolithic now, but stringing them together is a relatively simple engineering problem. So is providing some kind of learning mechanism that trains the model as it goes.

> But if Geoffrey Hinton is worried surely I’ve gone wrong somewhere. What am I not seeing?

He was very specifically worried about the pace. Hinton has been at the center of this space forever, and he had no idea things would jump so quickly. And if he couldn't see it, how is anyone supposed to see danger right before it comes?


> It’s all very clever, yes, but at bottom it’s just a brute force approach.

This is the bit that I think you should focus on a bit more. I don't think it's the case you need complicated, clever algorithms and architectures in order to get complicated, clever behavior.

If you start with particle physics, then work your way up to chemistry, and then biology, you can see how we start with very, very simple rules, but at each level there is more and more complexity. The universe "running" physics is the epitome of a brute-force approach. It would be a mistake to say that because the rules of particle physics are so simple, that nothing made of those particles could ever think.

Likewise, even though these models are just big arrays of numbers that we stir in the right way to make them spit out something closer to what we want, over and over again, I think it's a mistake to say that something much more capable than humans could never arise out of that.

> You can’t just train an LLM with more and more parameters and more and more tokens and expect it to be smarter than the data it was trained on. And such models don’t bring us any real understanding of what it would take to make super-intelligent machines.

Others have addressed the first point here, but as for the second: yes, that's true. Still, I think we'll have a bit of warning before we get to true superintelligence, but even now it seems to me like we have half of an AGI in large LLMs. They don't seem to be conscious, can't really evaluate their thoughts except by printing them out and reading them in again, and are only superintelligent in terms of knowing lots of facts about lots of things. But I think we are probably going to figure out how to create the other parts, and then we'll be there.

I am worried that humanity is on a bit of very-high-inertia train of "more and more progress" without enough safeguards. It was ok in the past, but as our world gets more and more connected and new inventions get spread far and wide in less time than ever before, it's possible for damage to be done on a very wide scale before we can figure out how to counteract it. It also means that good things can spread in the same way -- but the problem is that it's not just the average that matters, it's the variance. It doesn't matter if you create and disseminate 9 out of 10 new technologies that are massively beneficial if the other 1 ends up with humanity gone or completely disempowered, and you can't take advantage of the good stuff.


ChatGPT can map one situation over another, like when you ask it to tell a story in the style of x or y.

Basically, that means that in training to predict the next token, GPT HAD to come up with some internal way of “modeling” such a situation and “rendering” it.

This ability to model and apply the model to something else could very well be the core building block of general thinking.

Additionally, progress in AI has been shockingly faster than expected. That computers would beat Go and pass the Turing test seemed completely unrealistic ten years ago. We can’t rule out that AGI is around the corner.


People on HN have a tendency to state that hyped tech is the next big thing so they can be "ahead of the curve."


I think the intuition is that we made a lot of progress and there are still a bunch of low-hanging fruit to grab. Once we have very large context windows (>1M tokens) and multimodal training, we will unleash unprecedented gains.


> It’s all very clever, yes, but at bottom it’s just a brute force approach.

which, as history shows, is the only ML approach that actually works, and therefore it's scary




