The most important technical problem of our time
Ilya Sutskever thinks "building safe superintelligence is the most important technical problem of our time".
I disagree. I think there’s a more pressing technical problem, one that needs to be solved first – superknowledge.
The world is far shorter on knowledge than intelligence right now. We’ll soon have near-AGI intelligences (GPT-5) relying on knowledge systems built for humans in the late 1900s (Google).
This is an absurd situation, even a dangerous one.
We need to build superknowledge before superintelligence. Let’s explore why.
Intelligence is different from knowledge.
Intelligence is reasoning over an input. Knowledge is retrieving from a data repository.
All the recent advanced AI models have high intelligence, but surprisingly limited knowledge.
For example, GPT-4 can nail any high school physics problem, but if you ask it to retrieve a list of physics PhDs in NYC – a comparatively simple request – you get this:
GPT-4 does have some knowledge of the world, but it isn’t anywhere close to knowing everything – every PhD webpage, every news article, blog post, YouTube video, tweet, Reddit post, meme, etc.
That's why LLMs are often combined with a search engine. The LLM brings the intelligence, and the search engine brings the knowledge. At least, in theory. Unfortunately, today’s search engines can't handle simple knowledge requests either:
Knowledge systems like Google haven't improved much over the past decade (arguably, they’ve gotten worse). In contrast, intelligence systems improve every month.
That means intelligence is increasingly bottlenecked by knowledge.
Luckily, we now have technology like transformers, which enable radically new knowledge systems. That's what our team at Exa is working on. I believe we're only a few years away from building superknowledge.
Superintelligence is a system that can handle extremely complex reasoning requests.
Superknowledge is a system that can handle extremely complex retrieval requests.
We've achieved superknowledge when there exists an API that can handle any knowledge request over available information, no matter how complex.
Superknowledge would handle requests like:
In short, superknowledge gives everyone comprehensive knowledge of anything as quickly as they want.
I believe we urgently need this comprehensive knowledge, both to progress society and to safeguard it.
If you want to accelerate human progress, superknowledge is perhaps the most overlooked way to do it.
Progress is a constant cycle of learning what's out there and trying something new. Superknowledge eliminates any bottlenecks to the first step so that all energy can be focused on the second.
Superknowledge would make us all superproductive and superinformed.
In our personal lives, much of our time is wasted searching -- for apartments, events, clothing, interesting articles, solutions to personal problems, etc. Superknowledge gathers all information for you in 2 seconds, not 2 days.
Sometimes we even waste not days, but months or years of our lives because we didn’t learn something existed until later – the perfect job opportunity, the right medical treatment. With superknowledge, you’d have a smart alert system so that you're fully in the know about any topic. No more "I wish I knew that earlier", for anything.
Progress will accelerate most from combining superknowledge with an intelligence like GPT-5. GPT-5 can handle the planning and processing while superknowledge handles the retrieval.
Let’s say you want help finishing a research paper. GPT-5 + superknowledge would take each paragraph in your paper and find all the similar ideas from across the web (papers, blog posts, tweets, videos, etc). Then it would find the counterarguments to each of those ideas. Then the counterarguments to the counterarguments, and so on. It would feel as if a week-long academic conference had analyzed your paper, but in 2 seconds.
On the other hand, GPT-5 + Google would get stuck because Google can’t handle queries like finding similar ideas or counterarguments.
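The research-paper loop above can be sketched in a few lines. This is only an illustration under stated assumptions: `Retriever`, `expand`, `analyze`, and `toy_retrieve` are hypothetical names, not an Exa or Google API, and the toy retriever simply fabricates placeholder strings so the sketch runs offline. It shows the shape of the recursion: similar ideas first, then counterarguments, then counterarguments to those.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical retrieval interface: takes a natural-language request and
# returns matching text snippets. This is an assumption, not a real API.
Retriever = Callable[[str], List[str]]


@dataclass
class Node:
    """One idea in the debate tree, plus the ideas that argue against it."""
    text: str
    counters: List["Node"] = field(default_factory=list)


def expand(idea: str, retrieve: Retriever, depth: int) -> Node:
    """Recursively attach counterarguments, up to `depth` levels deep."""
    node = Node(idea)
    if depth > 0:
        for counter in retrieve(f"counterarguments to: {idea}"):
            node.counters.append(expand(counter, retrieve, depth - 1))
    return node


def analyze(paragraphs: List[str], retrieve: Retriever, depth: int = 2) -> List[Node]:
    """For each paragraph, find similar ideas, then grow a counterargument tree under each."""
    trees = []
    for paragraph in paragraphs:
        for similar in retrieve(f"ideas similar to: {paragraph}"):
            trees.append(expand(similar, retrieve, depth))
    return trees


def count_nodes(node: Node) -> int:
    """Total number of ideas in a tree, including the root."""
    return 1 + sum(count_nodes(child) for child in node.counters)


def toy_retrieve(request: str) -> List[str]:
    """Stand-in retriever: one similar idea, two rebuttals per idea."""
    kind, _, subject = request.partition(": ")
    if kind == "ideas similar to":
        return [f"similar<{subject}>"]
    return [f"rebuttal A of <{subject}>", f"rebuttal B of <{subject}>"]


trees = analyze(["my first paragraph"], toy_retrieve, depth=2)
```

Note that the tree grows geometrically with depth, so a real system would likely need to deduplicate ideas and cap the depth. The hard part is the retriever itself: it has to answer requests like "counterarguments to X", which is the kind of query keyword search struggles with.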
It’s difficult for us to fathom how quickly progress will accelerate when every intelligence – whether human or AI – is unblocked by all the knowledge that’s out there.
Superknowledge doesn’t just accelerate us toward an advanced future, it also accelerates us toward a safer one.
When people list the biggest threats to humanity, they rarely put the state of our knowledge at the top, but that is where it belongs.
That’s because our knowledge underlies everything in our society – what problems we care about, how we act toward others, which politicians we choose, etc. Every societal malfunction is downstream from bad knowledge.
Unfortunately, our current knowledge ecosystem is a mess. Knowledge is scattered across billions of webpages with no tool powerful enough to organize it all. That makes it extremely hard to become truly well-informed on any issue – you never know what knowledge you’re missing.
When people aren’t well-informed, they make the wrong decisions, elect the wrong leaders, and cause inefficiencies throughout society. This is causing real problems, from inane housing laws to actual war.
The rise of agentic AI systems multiplies this problem dramatically. If AIs are stuck with the same knowledge tools as humans, then we’ll just have thousands more intelligences operating over the same incomplete knowledge. These AIs will interact with billions of people daily and perform actions on their behalf. They will be highly intelligent but misinformed, a dangerous combination.
Our society deserves something better. Building superknowledge is the solution.
Superknowledge advances safety because it lets people or AIs quickly become well-informed on any topic – from the technologies related to carbon removal to the laws that should govern AI itself. I’d much rather take advice from an AI that analyzed the 10,000 relevant arguments on the web over one that read the first 10 links of a Google search.
We're now entering the most volatile decade in human history. It’s essential that humans and AIs can rely on a mature knowledge ecosystem that guides us through the chaos.
We had just better build it before superintelligence arrives.
It’s no accident that the Bible begins with a story about the tree of knowledge. For 5,000 years, humans have dreamed of knowing everything. We’re going to achieve that dream in about 3 years, and I think it’ll be powered by Exa. This is a historic mission, biblical even.
I’ve personally dreamt of knowing everything for two decades, since I was a little kid lying prone on my 4-foot-tall outer-space book wondering what it all meant. We're finally almost there.
It's interesting that no one else is working on this. While there are dozens of labs working on superintelligence, as far as I'm aware there's only one organization in the world working on superknowledge – Exa.
That’s partly because building superknowledge requires an organization with the right incentives. Organizations with ad-based revenue models will not build it. Exa, in contrast, has a usage-based revenue model. We’re highly incentivized to give users full control to retrieve whatever knowledge they need. Turns out users would pay a lot for superknowledge.
Another reason no one's building superknowledge is that it’s hard. We need to design novel ML architectures in a novel research field while building a novel search business. And that’s not even counting the massive infrastructure required for crawling, storing, processing, and serving petabytes of web data.
Yet superknowledge seems more attainable than superintelligence. It requires fewer magical breakthroughs. We have a pretty clear roadmap to get there.
The clock is ticking. To safely navigate the next decade, we need to build superknowledge before SSI, OpenAI, or some other organization builds superintelligence. For the Exa team, this is the most important technical problem of our time.