If you’re in finance, the news around Cerebras, which is going for an IPO, is all about how the company’s stock ticker will fare on the NASDAQ.
However, if you’re into technology, the story is a bit different. In fact, if you’re only looking at how these companies are going to compete, you’re really missing the big picture.
Lots of people know how the beginning of the AI revolution led to the GPU displacing the CPU for this kind of work: a processor specialized for the massively parallel arithmetic that machine learning and related workloads demand.
At that time, we were following a pretty common prescribed method: feed the system a ton of training data, often gathered through large-scale web scraping, and then evaluate how well the trained system performs.
All of that work required a lot of processing power, and the GPUs were built for these big workloads.
Now, the industry seems to be moving even further, toward something called inference, which is a different sort of task, and the hardware is going to have to be even more specialized.
So what is inference? Every time someone starts to talk about it, you can see people's eyes just kind of glaze over. We don't like these kinds of words, in general, outside of a highly scientific context.
Put simply, inference is what a trained AI model does when it's put to work: it takes live data, runs it through the model, and produces results in real time.
In other words, the trained AI is applying what it learned during its training sessions to data it has never seen before.
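To make the training-versus-inference distinction concrete, here is a minimal sketch using a toy logistic-regression model. The weights are invented for illustration, not taken from any real system; the point is that during inference the weights stay fixed, and the model simply maps new input to an output.

```python
import math

# Weights "learned" during a prior training phase (made up for this example).
# During inference they are frozen; no learning happens here.
TRAINED_WEIGHTS = [0.8, -0.4]
TRAINED_BIAS = 0.1

def infer(features):
    """Run live data through the trained model and return a probability."""
    score = sum(w * x for w, x in zip(TRAINED_WEIGHTS, features)) + TRAINED_BIAS
    return 1.0 / (1.0 + math.exp(-score))  # sigmoid squashes score to (0, 1)

# A new, unseen input arrives; the model applies what it already learned.
prob = infer([2.0, 1.0])
```

Training, by contrast, is the expensive loop that repeatedly adjusts those weights against a large dataset; inference is just this forward pass, which is why it rewards hardware tuned for low-latency throughput rather than bulk number-crunching alone.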
So this type of activity is going to require some hardware with some heft: to that end, Cerebras has unveiled the Wafer Scale Engine (WSE) that, for tech gearheads, has some pretty impressive specs. (It may come as no surprise that these monster chips are physically produced by Taiwan Semiconductor Manufacturing Company.)
WSE: Under the Hood
Cerebras’ WSE-3 has 4 trillion transistors and a staggering 44 GB of on-chip memory. It packs roughly 900,000 AI-optimized cores, for a peak capacity of around 125 petaflops.
We reported a while ago on these types of enormous multicore engines, where the hardware is physically large – measured in inches, rather than centimeters.
“Lower latencies drive higher user engagement,” notes Perplexity CTO Denis Yarats in a press statement. “With Cerebras’ 20x speed advantage over traditional GPUs, we believe user interaction with search and intelligent answer engines will be fundamentally transformed.”
It’s not hard to see how that power is going to super-charge AI efforts in so many industries.
Use Cases for AI Inference
One way to think about this news is that we simply want more speed and power for an ever more sophisticated set of processes. But you can also think about the role that inference is going to play during this part of our AI evolution. In other words, we’re moving from more supervised types of learning to less supervised ones: from the kind of deterministic machine learning we did 10 years ago to a new type of neural network activity, where we trust the system to learn on its own a lot more.
So the story of Cerebras’ new challenge, not to mention Groq, another company that is jumping on the bandwagon, is the story of hardware playing catch-up.
The hardware itself is impressive, and these new Cadillac systems turn heads, but what we should really be looking at is what they are built to do, because that is what will disrupt business.
“As AI becomes integrated into more aspects of daily life and business operations, the importance of efficient and accurate AI inference grows,” writes an author at Run:AI. “Accurate inference is especially critical in sensitive use cases like healthcare, fraud detection, and autonomous driving.”
Those are just some top-level examples: we have yet to really discover some of the more hidden uses of deeper inference models. What is AI going to look like in ten years? Is it still going to seem like it’s coming out of a computer? Or are things going to be really different?
Forbes Technology Council Member Nir Kaldero gave us this list a couple of years ago: some of these, admittedly, have a lot of staying power, although it’s interesting to think about a few of them. The cloud, for example: of course, cloud adoption continues, but now we have a competing idea for many workloads, processing on the edge, on-device, at the margin of the network. And that’s gaining ground, too.
Anyway, the hardware battle is really a harbinger of the next generation of tech systems. And they’re going to be spectacular.
Source: www.forbes.com…