A food fight erupted at the AI HW Summit earlier this year, where three companies all claimed to offer the fastest AI processing. All were faster than GPUs. Now Cerebras has claimed insanely fast AI performance with its latest software running on the company's Wafer-Scale Engine.
What did Cerebras Announce?
Recently, Cerebras updated its inference processing capabilities to an astonishing 2,100 tokens per second running Llama 3.1-70B. That is about four pages of text per second. While humans can't read that fast, computers can. According to the company, the result is three times faster than its previously announced performance, 16 times faster than the fastest available GPU, and eight times faster than GPUs running Llama 3.1-3B, a far smaller model.
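The "four pages per second" figure can be sanity-checked with rough conversion factors. The words-per-token and words-per-page values below are assumptions commonly used for English text, not numbers from the article:

```python
# Back-of-envelope check of the "four pages of text per second" claim.
# ASSUMPTIONS (not from the article): ~0.75 words per token and
# ~400 words per page, both common rough conversion factors.
TOKENS_PER_SECOND = 2_100   # Llama 3.1-70B throughput claimed by Cerebras
WORDS_PER_TOKEN = 0.75      # assumed average for English text
WORDS_PER_PAGE = 400        # assumed word count of a typical page

words_per_second = TOKENS_PER_SECOND * WORDS_PER_TOKEN
pages_per_second = words_per_second / WORDS_PER_PAGE
print(f"{pages_per_second:.1f} pages per second")  # ~3.9, i.e. about four
```

Under these assumed conversion factors, the arithmetic lands at roughly four pages per second, consistent with the company's characterization.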
The company also said that, since announcing inference performance, thousands of companies have contacted Cerebras to better understand what is possible with fast inference services. Securing additional customers is critical as Cerebras continues to grow.
What’s next for Cerebras?
Now that the company has added inference processing to its existing AI model training capabilities, Cerebras is a far more attractive alternative to GPUs for large-scale AI creation and usage.
What we still lack, however, is cost data, which would tell us whether Cerebras is also the most cost-effective solution. Cost data is difficult to obtain for all AI hardware vendors, though, and it varies significantly with contracted volume. We are also eager to learn how fast Cerebras performs on larger models such as Llama 3.1-405B. So far, the results are extremely impressive and position the company as the undisputed leader in inference processing speed for 70B-parameter models.
Disclosures: This article expresses the author’s opinions and
should not be taken as advice to purchase from or invest in the companies mentioned. Cambrian-AI Research is fortunate to have many, if not most, semiconductor firms as our clients, including Blaize, BrainChip, Cadence Design, Cerebras, D-Matrix, Eliyan, Esperanto, Flex, GML, Groq, IBM, Intel, NVIDIA, Qualcomm Technologies, Si-Five, SiMa.ai, Synopsys, Ventana Microsystems, Tenstorrent and scores of investment clients. We have no investment positions in any of the companies mentioned in this article and do not plan to initiate any in the near future. For more information, please visit our website at https://cambrian-AI.com.
Source: www.forbes.com…