We are talking about Lean proofs. Given a formal statement and a proof, Lean can verify whether the proof is correct or not. It's like generating computer programs to solve a problem - the hard part lies in generating useful solutions/sub-solutions so that the search is effective. They achieve this by using Gemini as a Lean proof generator, i.e. a world-model LLM fine-tuned to generate Lean proofs more effectively.
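For a sense of what the verifier checks, here's a toy Lean 4 example (my own illustration, not from their system): the kernel type-checks the proof term against the statement and either accepts or rejects it, which is the pass/fail signal the search optimizes.

```lean
-- Toy example: a formal statement plus a candidate proof.
-- If the proof type-checks against the statement, Lean accepts it;
-- otherwise compilation fails.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```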
Humans are even better at this, as you mention - but effectively the approach is similar: come up with lots of ideas and see which ones prove it.
I think what we need to look at is that AI systems were not able to do this before, and now they are. Soon these systems, with millions of dollars of compute, will just scale the algorithms and beat humans at this as well - just like they did for chess, Go, shogi, etc.
That's also a fear I have. But having grown up on the internet helps in adjusting to these changes fast. Secondly, I have a degree in CS and have been a developer for a decade, so I instinctively know a video can be a deepfake. It's harder to scam us than a generation that had already retired when deepfakes/deep voice cloning arrived and may never have had any CS exposure.
So, my reasoning is a little stupid, but it goes like this. I am in my 30s. Even if I lose all my money in a scam, I have time to earn it back. Whereas our elders (my parents are 60+) rely on their savings, can't really earn now, and have a lot of medical bills to pay at their age - for them it's like a one-shot ending, right? That's why I considered them a special case.
That's not stupid reasoning, and I don't know you or your parents. But in the short snippet you wrote, I was interpreting your question as being based on the supposition that your parents are somehow too naive or inexperienced to understand and protect themselves. That's the part that I question.
Particularly since the answer to your question is no different than if you asked the question about anybody of any age.
I guess I need to tell them not to send any personal details / bank details to anyone, including me. I sort of felt like maybe there is a better solution than just trusting them to follow this. Like we have life insurance and medical insurance... some sort of systemic protection would have been better.
Recently someone I know got a call where scammers mimicked their child's voice using AI and demanded immediate money, otherwise the child would be in danger. This got me thinking - my parents would also do the same for me, and everyone's would. Add deepfakes into it - how the fuck are we supposed to assume people won't give an emotional response? Forget about personal details, people who care about you might directly do a transfer.
> my parents would also do the same for me, and everyone's would
Mine wouldn't? That's an obvious phone scam, they would call me directly before wiring $500 to some unproven weirdo over the phone.
> how the fuck are we supposed to assume people won't give an emotional response?
For the same reason you assume people won't immediately become greedy when a Nigerian Prince emails them about an excellent opportunity to earn a few million dollars. Common sense has to play a part or else you're going to be manipulated with or without AI.
The same attack used to get SMS 2FA tokens can be used to capture incoming calls from your parents or grandparents. How is calling "you" going to protect you?
So my grandparents call me, get their call intercepted by a malicious MITM. The worst-case scenario! My AI counterpart is screaming over the line, saying that my legs are only moments away from being sawed off by the mysterious phone-operator that would only identify himself as "the Butcher".
They ask me what my middle name is, and my AI counterpart stops. 10-15 seconds go by, with the Halloween chain-rattling CD spinning away in the background. The AI answers correctly in stunted breath, and my grandparents hang up so they can move on with their day. They've been getting calls like this for the past 30 years trying to convince them a family member desperately needs their Social Security number over the landline. If they fall for an AI-generated voice from an unknown caller I'd be genuinely surprised. These are decades-old social engineering scripts, people.
The current architecture of LLMs focuses mainly on the retrieval part, and the learned weights just converge toward the best outcome for next-token prediction. The ability to put this data into a logical system should also have been a training goal, IMO. Next-token prediction + formal verification of knowledge during the training phase itself would give an LLM the ability to keep its knowledge generation consistent and to see the right hallucinations (which I like to call imagination).
The process could look like this:
1. Use existing large models to convert the same dataset they were trained on into formal logical relationships. Let them generate multiple candidate solutions.
2. Take this enriched dataset and train a new LLM which outputs not only the next token but also the formal relationships between prior knowledge and the newly generated text.
3. The network can optimize its weights until the generated formal code scores highly on a proof checker, alongside the token-prediction accuracy objective (see the sketch after this list).
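A rough sketch of what that combined objective might look like, with hypothetical helpers (`model.generate_formal`, `check_proofs`) rather than any real training recipe; note the proof checker gives no gradients, so in practice its score would drive an RL-style update:

```python
import torch.nn.functional as F

def training_step(model, tokens, check_proofs, alpha=0.1):
    # Standard next-token prediction loss (inputs and targets shifted by one).
    logits = model(tokens[:, :-1])                     # (batch, seq-1, vocab)
    lm_loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        tokens[:, 1:].reshape(-1),
    )

    # Decode the formal relationships the model emitted for this batch and
    # score them with the external proof checker: the fraction of emitted
    # statements that verify, a value in [0, 1].
    formal_output = model.generate_formal(tokens)      # hypothetical helper
    verified_fraction = check_proofs(formal_output)

    # Penalize the unverified fraction alongside the language-modeling loss.
    logic_penalty = 1.0 - verified_fraction
    return lm_loss + alpha * logic_penalty
```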
In my own mind I feel language is secondary - it's not the base of my intelligence. The base seems more like a dreamy simulation where things are consistent with each other, and language is just what I use to describe it.
This suggestion revisits the classic "formal top-down" vs "informal bottom-up" approaches to building a semantic knowledge management system. Top-down was tried extensively in the pre-big-data, pre-probabilistic-models era, but required extensive manual human curation while being starved for knowledge. The rise of big data offered no cure for the curation problem: because curation can't be automated, larger scale just made the problem worse. AI's transition to probability (in the ~1990s) paved the way for the associative probabilistic models in vogue today, and there's no sign that a more-curated, more-formal approach has any hope of outcompeting them.
How to extend LLMs to add mechanisms for reasoning, causality, etc (Type 2 thinking)? However that will eventually be done, the implementation must continue to be probabilistic, informal, and bottom-up. Manual human curation of logical and semantic relations into knowledge models has proven itself _not_ to be sufficiently scalable or anti-brittle to do what's needed.
> How to extend LLMs to add mechanisms for reasoning, causality, etc (Type 2 thinking)?
We could just use RAG to create a new dataset. Take each known concept or named entity, search it inside the training set (1), search it on the web (2), and generate it with a bunch of models in closed-book mode (3).
Now you've got three sets of text; put all of them in a prompt and ask for a Wikipedia-style article. If the topic is controversial, note the controversy and the distribution of opinions. If it is settled, note that too.
By contrasting web search results with closed-book generations we can detect biases in the model and missing knowledge or skills. If they don't appear in the training set, you know what is needed in the next iteration. This approach combines self-testing with topic-focused research to integrate information scattered across many sources.
I think of this approach as "machine study", where AI models interact with the text corpus to synthesize new examples, doing a kind of "review paper" or "wiki" reporting. This can be scaled to billions of articles, making a 1000x larger AI Wikipedia.
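A rough sketch of that loop (my own illustration; `search_training_set`, `search_web`, and the `llm` client are hypothetical stand-ins, not real APIs):

```python
def build_article(topic, llm, search_training_set, search_web):
    # 1. What does the training corpus already say about the topic?
    corpus_hits = search_training_set(topic)

    # 2. Fresh external evidence from the web.
    web_hits = search_web(topic)

    # 3. Several closed-book drafts, exposing what the weights alone contain.
    closed_book = [llm.generate(f"Write what you know about {topic}.")
                   for _ in range(3)]

    prompt = (
        f"Topic: {topic}\n\n"
        f"Training-set excerpts:\n{corpus_hits}\n\n"
        f"Web search results:\n{web_hits}\n\n"
        f"Closed-book drafts:\n{closed_book}\n\n"
        "Write a Wikipedia-style article. If sources disagree, describe the "
        "controversy and the distribution of opinions; if they agree, say the "
        "topic is settled. Flag claims that appear only in the closed-book "
        "drafts, since those may be hallucinations or gaps in training data."
    )
    return llm.generate(prompt)
```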
Interacting with search engines is just one way to create data with LLMs. Interacting with code execution and with humans are two more. Human-AI interaction alone generates over a billion sessions per month, where LLM outputs meet implicit human feedback. Now that most organic sources of text have been used, LLMs will learn from feedback, task outcomes, and corpus study.
Yes, that's why there was no human in the loop and I was using LLMs as a proxy for the bottom-up approach in step 1. But hallucinations can creep into the knowledge graph too, as another commenter mentioned.
Yann LeCun said something to the effect that you cannot get reasoning with a fixed computation budget, which I found to be a simple way to explain and understand a hypothesized limitation.
Logic has its own problems. See "Gödel, Escher, Bach", or ask why OWL has been around for 20 years and has almost no market share, why people have tried every answer to managing asynchronous code other than RETE, or why "complex event processing" is an obscure specialty and not a competitor to Celery and other task runners. Or, for that matter, why Drools can't give error messages that make any sense.
As a computational biologist, I've used ontologies quite a bit. They have utility, but there is a bit of an economic mismatch between their useful application and the energy required to curate them. You have some experience in this space. Do you think LLMs could speed up ontology / knowledge graph curation with expert review? Or, do you think structured knowledge has a fundamental problem limiting its use?
LLMs right now don't employ any logic. There could always be corners of "I don't know" or "I can't do that" - unlike the current system, which is 100% confident in its answer because it's not actually trying to satisfy any constraint at all. So at some point the system will apply logic, though maybe not as formally as we do in pure math.
But the problem is with the new stuff it hasn't seen, and questions humans don't know the answers to. It feels like this whole hallucinations thing is just the halting problem with extra steps. Maybe we should ask ChatGPT whether P=NP :)
Haha, asking ChatGPT surely won't work. Everything can "feel" like a halting problem if you want perfect results with zero error while uncertain and ambiguous new data keeps arriving.
My take - hallucinations can never be reduced to a perfect zero, but they can be reduced to the point where, 99.99% of the time, these systems hallucinate less than humans, and more often than not their divergences turn out to be creative thought experiments (which I call healthy imagination). If it hallucinates less than a top human does - I say we win :)
OP most likely means "weed" as in "pest" or "annoyance", i.e. a category of undesirable plants that tend to appear unbidden alongside desirable plants. The distinction isn't biological; it's just that when you create a space for growing things, not everything that grows will be what you want.
(The term "weed" for marijuana is just a joke derived from that sense of the word.)
Yeah, but when you come to halting problems at that level of complexity, multi-hierarchical emergent phenomena occur aperiodically and chaotically; that is to say, in the frequency domain the aperiodicity is fractal-like, discrete, and mappable.
For the first step, CYC [1] could be a valid solution. From my experience I would call it a meaningful relation schema for DAGs. There is also an open-source version available [2], but it is no longer maintained by the company itself.
Interesting. I haven't really looked into this space much. But anything that can provably represent concepts and relationships without losing information could work. The devil might be in the details; nothing is as simple as it looks at first sight.
Formal verification of knowledge/logical relationships? How would you formally verify a sci-fi novel or a poem? What about the paradoxes that exist in nature, or contradictory theories that are each logically correct? This is easier said than done. What you are proposing is essentially "let's solve this NP-hard problem that we don't know how to solve, and then it will work".
Oh, exactly. But let me know your thoughts on this - say you have a graph representing an existing sci-fi novel. Rather than the current model blindly generating text from statistical probabilities, would it not help to have the model's output also try to fit into this admittedly imperfect sci-fi novel KG, and flag where it doesn't fit logically? Depending on how strict your logic requirements are, the system could range from least creative to most creative (rough sketch below).
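Something like this sketch, with all names hypothetical and candidate scores assumed normalized to [0, 1]: candidate continuations are re-ranked by how well their extracted triples fit the novel's KG, with a strictness knob trading logical fit against creativity.

```python
def rerank_by_kg_fit(candidates, kg_triples, extract_triples, strictness=0.5):
    # candidates: list of (continuation_text, normalized_lm_score) pairs.
    # kg_triples: set of (subject, predicate, object) triples from the novel.
    scored = []
    for text, lm_score in candidates:
        triples = extract_triples(text)          # hypothetical extractor
        if triples:
            supported = sum(1 for t in triples if t in kg_triples)
            fit = supported / len(triples)       # fraction consistent with KG
        else:
            fit = 1.0                            # nothing to contradict
        # strictness = 0 -> purely the model's score (most creative);
        # strictness = 1 -> purely KG consistency (least creative).
        scored.append(((1 - strictness) * lm_score + strictness * fit, text))
    return [text for _, text in sorted(scored, reverse=True)]
```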
I was not actually aware that building a KG from text is an NP-hard problem; I will check it out. I thought it was a time-consuming problem when done manually without LLMs, but didn't think it was THAT hard. Hence I was trying to introduce an LLM into the flow. Thanks, will read more about all this!
E.g., KGs (RDF, PGs, ...) are logical, but when constructed automatically they are not semantic in the sense of the ground domain of NLP, and when constructed manually the ontology is tiny. Conversely, fancy powerful logics like modal ones are even less semantic in NLP domains. Code is more expressive, but brings its own issues.
I had automatically constructed KGs in mind, ones that can improve and converge during the training phase. I was hypothesizing that if we add an incentive during training to also construct KGs, and bootstrap the initial KGs from existing LLMs, then convergence towards semantically correct KG extensions at inference time could be achieved. What do you think?
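The bootstrap step could look something like this sketch (the `llm.generate` client is a hypothetical stand-in; the extracted triples would only seed an initial graph for later refinement and verification):

```python
import json

def bootstrap_triples(text_chunk, llm):
    prompt = (
        "Extract factual relationships from the text below as a JSON list of "
        "[subject, predicate, object] triples.\n\n" + text_chunk
    )
    raw = llm.generate(prompt)
    try:
        triples = json.loads(raw)
    except json.JSONDecodeError:
        return []  # discard malformed extractions
    # Keep only well-formed triples; a later verification pass could check
    # them for consistency against the rest of the graph.
    return [tuple(t) for t in triples if isinstance(t, list) and len(t) == 3]
```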
LLMs generate responses based on statistical probabilities derived from their training data. They do not inherently understand or store an "absolute source of truth." Thus, any KG bootstrapped from an LLM might inherit not only the model's insights but also its inaccuracies and biases (hallucinations). These hallucinations are not errors of logic; they are artifacts of the model's training on vast, diverse datasets and reflect the statistical patterns in that data.
Maybe you could build a retrieval model that way, but not a generative model.
I thought adding "logical" constraints to the existing training loop, using KGs and logical validation, would help reduce wrong semantic formation during training itself. But your point stands: what if the whole knowledge graph is hallucinated during training?
I don't have an answer to that. I felt there would be fewer KG representations that fit a logical world than what fits into the current vast vector space of the network's weights and biases. But that's just an idea. This whole thing stems from an internal intuition that language is secondary to my thought process; internally I feel I can just play around with concepts without language - what kind of Large X model would meet that kind of capability, I don't know!
I believe language allows having more than one layer, i.e. it enables complexity of representation. For example, if I want to think about the person thinking inside, these kinds of recursive concepts need some sort of symbolic nesting to even materialise the idea.