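The Decoder · April 3, 2026, 03:06 · about a 1-minute read

Google's Gemma 4 is available under the Apache 2.0 license for the first time

#OpenSourceLLM #Apache 2.0 #EdgeAI #Google #ModelDistribution

TL;DR

Google has released Gemma 4, its most capable open model family to date. The models run on devices ranging from smartphones to workstations and ship under the fully open Apache 2.0 license for the first time.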

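AI in-depth analysis · April 3, 2026, 04:41

Importance: / 5

Depth: 40%

Key points

Gemma 4 release

Google has released Gemma 4, its most capable open model family to date.

Adoption of the Apache 2.0 license

Gemma models are offered under the fully open Apache 2.0 license for the first time.

Broad device support

The four new models run on a wide range of devices, from smartphones to workstations.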


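Impact analysis

This release further advances the open-sourcing of high-performance AI models and substantially lowers the barriers to customization and commercial use for developers and companies. The adoption of the Apache 2.0 license is likely to accelerate the growth of the open-source AI ecosystem and the deployment of AI on a wide variety of devices.

Editorial comment

The adoption of the Apache 2.0 license permits flexible use, including commercial use, and can be considered an important milestone in putting open-source AI models to practical use.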



This article, "Google's Gemma 4 is available under the Apache 2.0 license for the first time," was first published on The Decoder.


Google is releasing Gemma 4, its most capable open model family yet. The four new models run on everything from smartphones to workstations and ship under a fully open Apache 2.0 license for the first time.

The models are based on the same technology as Google's proprietary Gemini 3 and are published under the commercially permissive Apache 2.0 license, giving developers full control over their data, infrastructure, and models. Earlier Gemma versions shipped under a more restrictive Google proprietary license.

All Gemma 4 models bring significant improvements in multi-step reasoning and math tasks, according to Google. For agentic workflows, they natively support function calling, structured JSON output, and system instructions, letting autonomous agents tap into various tools and APIs.
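To make the idea of function calling with structured JSON output concrete, here is a minimal sketch of how an agent might dispatch a tool call. It assumes the model emits a JSON object with `name` and `arguments` fields; that shape, the `get_weather` tool, and the `dispatch` helper are hypothetical illustrations and are not taken from Google's documentation.

```python
import json

# Hypothetical tool; the name and signature are illustrative only.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Parse a JSON tool call emitted by a model and run the matching tool."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))

# Stand-in for a model response; a real response may be shaped differently.
example_output = '{"name": "get_weather", "arguments": {"city": "Zurich"}}'
result = dispatch(example_output)
print(result)
```

Keeping the tools in a registry means that only functions listed there can be called, so an unexpected tool name raises an error rather than being executed.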

Four model sizes cover everything from edge devices to workstations

Gemma 4 comes in four sizes: Effective 2B (E2B), Effective 4B (E4B), a 26B Mixture-of-Experts (MoE) model, and a 31B Dense model. All four go beyond simple chat and handle complex logic and agentic workflows.

Models: E2B, E4B, 26B MoE, 31B Dense

Active parameters: "effective" 2 billion (E2B); "effective" 4 billion (E4B); 3.8 billion active (26B MoE)

Architecture: MoE (26B MoE); Dense (31B Dense)

Context window: 128K tokens (E2B, E4B); up to 256K tokens (26B MoE, 31B Dense)

Target hardware: Smartphones, Raspberry Pi, Jetson Orin Nano (E2B, E4B); personal computers, consumer GPUs (quantized), workstations, accelerators (26B MoE, 31B Dense)

Offline operation: ✅

Vision (images/video): ✅

Audio input: ✅ (E2B, E4B)

Quantized on consumer GPU: ✅ (26B MoE, 31B Dense)

Arena AI ranking (open): 6th (26B MoE); 3rd (31B Dense)

Special feature: Compute and memory efficiency on edge devices (E2B, E4B); optimized for latency, 3.8 billion active parameters, fast token generation (26B MoE); maximum quality, base for fine-tuning (31B Dense)

The 31B model currently sits at 3rd place among all open models worldwide on the Arena AI Text Leaderboard, while the 26B MoE model ranks 6th. Google says Gemma 4 outperforms models 20 times its size. For developers, that translates to high-performance results with significantly lower hardware requirements.

| Benchmark | Gemma 4 31B IT Thinking | Gemma 4 26B A4B IT Thinking | Gemma 4 E4B IT Thinking | Gemma 4 E2B IT Thinking | Gemma 3 27B IT |
| --- | --- | --- | --- | --- | --- |
| Arena AI (text), as of 4/2/26 | 1452 | 1441 | | | 1365 |
| MMLU (Multilingual Q&A), no tools | 85.2% | 82.6% | 69.4% | 60.0% | 67.6% |
| MMMU Pro (Multimodal reasoning) | 76.9% | 73.8% | 52.6% | 44.2% | 49.7% |
| AIME 2026 (Mathematics), no tools | 89.2% | 88.3% | 42.5% | 37.5% | 20.8% |
| LiveCodeBench v6 (Competitive coding problems) | 80.0% | 77.1% | 52.0% | 44.0% | 29.1% |
| GPQA Diamond (Scientific knowledge), no tools | 84.3% | 82.3% | 58.6% | 43.4% | 42.4% |
| τ2-bench (Agentic tool use), Retail | 86.4% | 85.5% | 57.5% | 29.4% | 6.6% |
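As a quick way to read the benchmark figures above, the following sketch computes the percentage-point gain of Gemma 4 31B IT Thinking over Gemma 3 27B IT. The scores are copied from the table; the dictionary layout and the `gain_points` helper are illustrative choices, not part of the original article.

```python
# Scores in percent: (Gemma 4 31B IT Thinking, Gemma 3 27B IT), copied from the table.
scores = {
    "MMLU": (85.2, 67.6),
    "MMMU Pro": (76.9, 49.7),
    "AIME 2026": (89.2, 20.8),
    "LiveCodeBench v6": (80.0, 29.1),
    "GPQA Diamond": (84.3, 42.4),
    "tau2-bench (Retail)": (86.4, 6.6),
}

def gain_points(new: float, old: float) -> float:
    """Difference in percentage points, rounded to one decimal place."""
    return round(new - old, 1)

gains = {name: gain_points(new, old) for name, (new, old) in scores.items()}

# Print benchmarks from largest to smallest gain.
for name, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: +{gain} points")
```

Percentage points are used rather than relative change because several of the Gemma 3 baselines are very low, which would inflate ratio-based comparisons.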

The two larger models target workstations and servers. The unquantized bfloat16 weights of the 31B model fit on a single 80 GB NVIDIA H100 GPU, and quantized versions should run on consumer graphics cards too.
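The single-GPU claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes a nominal 31 billion parameters (the exact count may differ), counts only the weights, and ignores the KV cache and activations; the 8-bit and 4-bit rows are illustrative quantization levels, not formats the article confirms.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9

# bfloat16 uses 16 bits per parameter; 8 and 4 bits stand in for quantized variants.
for bits in (16, 8, 4):
    print(f"31B at {bits}-bit: ~{weight_memory_gb(31, bits):.1f} GB")
```

Under these assumptions the 16-bit weights come to roughly 62 GB, which is consistent with fitting on an 80 GB card, while a 4-bit variant would need roughly 15.5 GB for weights alone.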

The 26B MoE model only activates 3.8 billion of its parameters during inference, which should make for especially fast token generation. The 31B dense model aims for maximum quality instead and is meant to serve as a foundation for fine-tuning.
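To illustrate why a Mixture-of-Experts model touches only part of its parameters per token, here is a toy sketch of top-k expert routing. The article does not describe Gemma 4's router, expert count, or value of k, so the four scalar "experts", the logits, and `k=2` below are all made-up assumptions for illustration.

```python
import math

def top_k_route(router_logits: list[float], k: int) -> dict[int, float]:
    """Select the k highest-scoring experts and softmax-normalize their weights."""
    order = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    peak = max(router_logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(router_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def moe_forward(x: float, experts, router_logits: list[float], k: int) -> float:
    """Run only the selected experts and combine their outputs by router weight."""
    weights = top_k_route(router_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Toy experts: each one just scales its input by a different factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 2.0]  # hypothetical router scores for one token
out = moe_forward(1.0, experts, logits, k=2)
print(out)

# Share of parameters active per token, using the figures quoted in the article.
print(f"{3.8 / 26:.1%}")
```

Only the selected experts are evaluated for a given token, which is why the compute per token tracks the active parameter count rather than the total.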

Google's Gemma 4 models score above 1,440 Elo on the Arena AI Leaderboard despite having just 26B and 31B parameters, far smaller than many competitors with hundreds of billions of parameters. | Image: Google

The smaller E2B and E4B models are purpose-built for mobile devices and IoT hardware. They only activate two and four billion parameters, respectively, during inference to save memory and battery life. Both edge models natively process images, video, and audio input for speech recognition. Their context window covers 128,000 tokens, while the larger models can handle up to 256,000 tokens.

Independent benchmarks from Artificial Analysis back up the numbers for the larger Gemma 4 models. On the GPQA Diamond benchmark for scientific reasoning, Gemma 4 31B scores 85.7 percent in reasoning mode. According to Artificial Analysis, that's the second-best result among all open models with fewer than 40 billion parameters, just behind Qwen3.5 27B at 85.8 percent. At around 1.2 million output tokens, Gemma 4 31B likely also needs less compute than Qwen3.5 27B (1.5 million) and Qwen3.5 35B A3B (1.6 million).

On the GPQA Diamond benchmark, the Gemma 4 models land in the top performance tier with 26B and 31B parameters, outperforming significantly larger models like gpt-oss-120B. | Image: Artificial Analysis

The 26B MoE model scores 79.2 percent on the same benchmark, putting it ahead of OpenAI's gpt-oss-120B at 76.2 percent but behind Qwen3.5 9B at 80.6 percent. Artificial Analysis notes that both evaluated models run on a single H100 GPU. The full evaluation of all four Gemma 4 models in the Artificial Analysis Intelligence Index is still pending. As always, benchmark numbers only go so far when it comes to predicting real-world performance.

Where to get Gemma 4 and what platforms it supports

Gemma 4 is available now on Hugging Face, Kaggle, and Ollama. Google AI Studio supports the 31B and 26B models, while Google AI Edge Gallery handles the E4B and E2B variants.

At launch, the models work with a wide range of frameworks and platforms, including Hugging Face Transformers, vLLM, llama.cpp, MLX, Ollama, NVIDIA NIM and NeMo, LM Studio, Unsloth, SGLang, Keras, and others. Fine-tuning works through Google Colab, Vertex AI, or local gaming GPUs. For production deployments, the models scale to Google Cloud via Vertex AI, Cloud Run, and GKE.

On the hardware side, Google says Gemma 4 supports NVIDIA hardware from the Jetson Orin Nano all the way up to Blackwell GPUs, AMD GPUs through the ROCm stack, and Google's own Trillium and Ironwood TPUs.
