Is the state of AI disappointing?

Not when you look at how little data we've given it.

Aug 24, 2024

[Image: cartoon of a frustrated man watching a pot of water that refuses to boil.] A watched AI never boils???

Pundits seem as eager to announce imminent sci-fi utopias as they are to tear down the amazing things we’ve already accomplished. The truth is, it’s early days.

Maybe it’s better if we look at how far we’ve come, or haven’t, and why that might be. Then we’ll look at how much further AI can go.

"By far, the greatest danger of Artificial Intelligence is that people
conclude too early that they understand it."
- Eliezer Yudkowsky

Like Prometheus, we've kindled a spark of intelligence with our hot-to-trot GPUs. But that spark is still a small candle next to the infinite richness of everyday human experience.

The progenitors of ML and DL have done a massive amount of extremely boring foundation work in math, statistics, and the PhD-level inquiries that have resulted in neural networks, transformers, and more. We owe them a ton. They toil in nerdy areas, speaking a language that, for most people, is more opaque than ancient Greek. And so most people have no idea what’s involved in AI, or what pioneers are working on right now.

The news (network, or even nichey business news) talks about things like ChatGPT and Claude and whether they’ve changed the world of work, or failed to do so.

AI has much potential! But Wall Street is impatient. People are unrealistic about scale. Users have no idea how it works. Even explanations of LLM transformers like next-word prediction are kindergarten approximations, and they don’t even apply to image-generating AIs, which work through a different method (usually diffusion).

What I think is fascinating is that all of the AI advances so far have come from very little of our humanity’s data.

If you’re not impressed by AI, relax: it’s just getting started.
If you think AI is amazing, be inspired: it will go much further.

"The vast majority of the world's data is still unstructured and unanalyzed,
and this is where the real opportunity for AI lies."
- Andrew Ng

We’ve only processed 5-25% of the best data.
75-95% of the highest quality data is excluded for ethical and legal reasons.
Some of that excluded data will only ever be analyzed via personal opt-in and within private walled gardens like insurance companies or governments.

It's as if we're trying to build a new Library of Congress, but have filled it with only a few hundred books, and some people walk in and say, “Eh, what’s the fuss about?”

But the data to be ground up in this mill and fed to an awakening intelligence will only grow. The remaining and growing data is a giant feast that will nurture a robot toddler into a giant insight machine (or a thousand)…

The web contains about 500 trillion documents and is estimated to grow by 50% within 5 years.
The average webpage has 600 words. That’s 300 quadrillion words now, growing 50% every five years?
As the world sees more CCTVs, and weather sensors, and other ways of turning organic reality into data, the amount and types of data will multiply.
Industries like healthcare will always produce new and evolving data.

ChatGPT and other LLMs are impressive compared to the almost-nothing that came before. And GPT-4 is rumored to have been trained on over 1 trillion words of data. If we assume the above figures are even remotely accurate, GPT-4 was trained on roughly 1/300,000th of the web’s words. Imagine ChatGPT or Llama or Claude being 100x better. Or 1,000x. Who knows what that looks like.
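If you want to check the napkin math yourself, here is a minimal Python sketch. Every input is one of the rough figures quoted in this post (500 trillion documents, 600 words per page, a rumored 1 trillion training words, 50% growth per five years), so treat the outputs as ballpark illustrations, not measurements. The function and constant names are made up for this example, and the smooth compounding in the projection is an assumption of the sketch.

```python
# Back-of-envelope check of the figures quoted in this post.
# All inputs are the post's own rough estimates, not measured values.

WEB_DOCUMENTS = 500 * 10**12     # "about 500 trillion documents"
WORDS_PER_PAGE = 600             # "the average webpage has 600 words"
GPT4_TRAINING_WORDS = 10**12     # rumored "over 1 trillion words"
GROWTH_PER_FIVE_YEARS = 0.5      # "grow by 50% within 5 years"


def web_words() -> int:
    """Estimated number of words on the web today."""
    return WEB_DOCUMENTS * WORDS_PER_PAGE


def training_ratio() -> int:
    """How many times larger the web estimate is than the training set."""
    return web_words() // GPT4_TRAINING_WORDS


def projected_web_words(years: float) -> float:
    """Projected word count, assuming growth compounds smoothly at
    50% per five-year period (an assumption of this sketch)."""
    return web_words() * (1 + GROWTH_PER_FIVE_YEARS) ** (years / 5)


if __name__ == "__main__":
    print(f"Words on the web today: {web_words():,}")
    print(f"Training set is about 1/{training_ratio():,} of that")
    print(f"Words on the web in 5 years: {projected_web_words(5):,.0f}")
```

Swap in your own estimates for any of the four constants and the ratio updates accordingly; the point is the order of magnitude, not the exact number.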

We haven’t even mentioned video yet. The data we’ve used so far doesn’t include any understanding of what’s happening in videos beyond the transcribed words.

There’s no good estimate of how much video content has been analyzed so far (there’s no good source), and even then, the video that has been analyzed may have been used only for simple things like classification (what’s the genre? what’s the text content?), not analysis of a video’s visual experience, sentiment, tone of voice, and the other things humans experience every time they watch one. Maybe AI can recognize the faces in a crowd, but it has yet to hear their stories or understand their emotions, let alone reliably detect sarcasm.

This is the barest beginning. We’ve analyzed so little. It’s amazing how good it is already, and unfathomable how insightful it will become.
