Google has really emptied its coffers this time! Gemma 4 has made a splash in the open-source community

Published: 2026-04-03 16:23

In the early hours of the morning, Google DeepMind dropped the first bombshell of 2026 in the open-source world: the official release of Gemma 4. It released four models at once, covering the full size range from a 2B model that fits on a phone to a 31B model that runs at full speed on a single GPU, all built on the same technology as the closed-source flagship Gemini 3.

A year on, Gemma has not only made an epic leap but also rewritten the rules of the game for open-source large models.

The most explosive number: the 31B Dense model ranks third among open-source models on the Arena AI text leaderboard, with an Elo score of 1452. The two models ahead of it have over 60 billion and over 100 billion parameters respectively, yet Gemma 4, with a mere 31 billion, has earned a seat at the same table.

Even more striking is the 26B MoE model: it has 25.2 billion parameters in total, but only 3.8 billion are activated during inference, and its Elo score still reaches 1441, ranking sixth among open-source models.

A glance at the report card shows that this is not an incremental iteration but a release that outclasses the previous generation across the board.
In addition, Gemma 4 achieved a 40% improvement on benchmarks for multilingual reasoning and knowledge-based question answering.

What is striking is that the compact 31B model actually outperformed a closed-source model 20 times its size. A Mac Mini can now run Gemma 4, and some people have even run it offline on their phones.

Hugging Face CEO Clément Delangue summed it up in one sentence: "This is a huge milestone."

01 Four models enable seamless integration across edge, cloud, and mobile devices

The Gemma 4 family offers a base version and an instruction-tuned version at each size, covering the full range of use cases.
It is worth mentioning that the entire series supports Google's latest TurboQuant compression algorithm, which further reduces memory usage with almost no loss of quality.

02 Small models that perform like large ones

Gemma 4 has no obvious weaknesses and outperforms its predecessor on almost every benchmark.
Even the compact E4B scores 42.5% on AIME and 52% on LiveCodeBench, results that only flagship-class large models could reach a year ago.

03 Make the most of every parameter

Gemma 4 did not pile on fancy new concepts; instead, it combined proven techniques and pushed them to their full potential. Google even proactively cut components such as AltUp, whose effects were "uncertain."
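To put the parameter figures quoted earlier in perspective, the short Python sketch below works out what share of the 26B MoE model's weights is active per token and how its Elo score compares with the 31B Dense model. The numbers are the ones reported in this article; the function and variable names are illustrative only and do not come from any Gemma tooling.

```python
# Figures as quoted in this article (parameters in billions, Arena Elo scores).
MOE_TOTAL_B = 25.2    # 26B MoE: total parameters
MOE_ACTIVE_B = 3.8    # 26B MoE: parameters activated during inference
DENSE_B = 31.0        # 31B Dense: a dense model uses all of its parameters
DENSE_ELO = 1452
MOE_ELO = 1441


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of a model's parameters that is active per token."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


moe_fraction = active_fraction(MOE_ACTIVE_B, MOE_TOTAL_B)
dense_to_active = DENSE_B / MOE_ACTIVE_B   # dense parameters per active MoE parameter
elo_gap = DENSE_ELO - MOE_ELO

print(f"MoE active share: {moe_fraction:.1%}")
print(f"Dense parameters per active MoE parameter: {dense_to_active:.1f}x")
print(f"Elo gap (Dense minus MoE): {elo_gap}")
```

On these figures, the MoE model activates roughly 15% of its weights for each token while trailing the dense model by 11 Elo points, which is the efficiency argument this section makes.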
04 One model that sees images, hears audio, and watches video

The entire Gemma 4 series supports image and video input, and E2B and E4B also accept audio, bringing all of these modalities together in a single model family.
05 Apache 2.0

The biggest non-technical news in this release is that Gemma 4 is the first in the series to adopt the Apache 2.0 open-source license. Previous Gemma models used a custom Google license with various restrictions and attribution requirements, and corporate legal departments had to review each clause before the models could be used commercially. Apache 2.0, on the other hand, settles everything in one step:

✅ No custom terms
✅ No commercial restrictions
✅ Can be freely modified, distributed, and packaged into products
✅ No gray areas

Since its initial release, Gemma has been downloaded over 400 million times, with over 100,000 community-derived versions. With Apache 2.0 in place, these numbers are bound to grow explosively.

The release of Gemma 4 has fully solidified Google's two-pronged strategy. The top layer consists of the closed-source Gemini models, which hold the performance ceiling and are monetized through APIs; the bottom layer consists of the open-source Gemma models, which feed the developer ecosystem with the same technology and secure the entry points for local deployment, edge inference, and agent development. One focuses on generating revenue, while the other focuses on building an ecosystem. They do not conflict with each other; on the contrary, they amplify each other's impact.

For developers, the choice is now crystal clear.
With Gemma 4, Google has shown that parameter efficiency is the future of open-source models: the 31B model outperforms competitors 20 times its size, and the 2B model fits on a phone in your pocket. Starting today, the competition among open-source large models has entered a new era.

Nebula Data, headquartered in Singapore, has branches in Jakarta, Guangzhou, Shanghai, and Hong Kong. The company independently developed Nebula Lab, a one-stop AI content generation and model aggregation platform that includes an enterprise-grade AI Agent and aggregates general-purpose large models from around the world alongside industry-specific vertical models. It has also launched the Nebula AIoT hardware ecosystem (including smart interactive terminals, IoT gateways, and other products), forming an end-to-end intelligent solution from cloud to edge to device. This provides integrated services to customers in e-commerce, manufacturing, retail, and other fields, from cloud computing power and AI-driven decision-making to deployment in on-device scenarios. Furthermore, it offers global AIDC (AI Intelligent Computing Center) and low-latency network services, giving enterprises a technological foundation to adopt AI, connect to the physical world, and expand their global business.

Statement: This is an original article by Nebula Data (Hong Kong) Limited (星雲數據(香港)有限公司). When reposting, please cite the source link: https://www.nebula-data.com/en/sys-nd/279.html