The Decoder·2026年3月4日 04:34·約1分

Googleの最速・最安モデルGemini 3.1 Flash-Lite、性能向上するも価格は3倍以上に

#LLM #生成AI #コスト最適化 #Google #モデルリリース #価格改定

TL;DR

Google DeepMindはGemini 3シリーズ最速・最安モデル「Gemini 3.1 Flash-Lite」のプレビューを公開し、性能は大幅に向上したが出力コストは3倍以上に高騰した。

AI深層分析2026年3月4日 05:42

注目/ 5段階

深度40%

キーポイント

新モデルの発表

Google DeepMindがGemini 3.1 Flash-Liteのプレビューを公開した。これはGemini 3シリーズの中で最速かつ最安価なモデルと位置付けられている。

性能向上

前身モデルと比較して能力が「大幅に向上した」とされている。

コスト高騰

性能向上の一方で、出力コストが3倍以上に増加したことが報告されている。

影響分析・編集コメントを表示

影響分析

この発表は、AIモデルの性能向上とコスト増加のトレードオフを明確に示しており、開発者や企業がコスト効率を考慮したモデル選択を迫られる可能性がある。Googleの製品戦略において、エントリーレベル向けモデルの価格設定が変化する兆候とも捉えられる。

編集コメント

性能向上とコスト増加のバランスがユーザーにどう受け止められるかが今後の焦点。『最安』を謳いながら価格が急騰するのは市場に混乱をもたらす可能性がある。

image

Google Deepmindは、Gemini 3シリーズで最速かつ最安のモデルであるGemini 3.1 Flash-Liteのプレビュー版を公開しました。前身モデルよりも能力は大幅に向上していますが、出力コストは3倍以上に増加しています。

本記事「Googleの最速・最安モデルGemini 3.1 Flash-Lite、性能は向上したが価格は3倍以上に」は、The Decoderで最初に公開されました。

原文を表示

Google Deepmind has released a preview of Gemini 3.1 Flash-Lite, the fastest and most affordable model in the Gemini 3 series.

According to Artificial Analysis, the model scores 34 points on their Intelligence Index, a 12-point jump over its predecessor, Gemini 2.5 Flash-Lite. Despite the big leap in capability, it keeps up the same speed as the previous version, cranking out more than 360 tokens per second with an average response time of 5.1 seconds.

On multimodal tasks, it beats top-tier models like Claude Opus 4.6 and Kimi K2.5, hitting 78 percent on the MMMU-Pro benchmark. Artificial Analysis notes that tool usage barely improved. The context window stays at one million tokens.

Gemini 3.1 Flash-Lite scores 34 on the Artificial Analysis Intelligence Index—a 12-point jump over its predecessor. | Image: Artificial Analysis

On the Arena.ai leaderboard, which ranks models based on human preferences in blind comparisons, Gemini 3.1 Flash-Lite lands an Elo score of 1432. It outperforms other models in its weight class on reasoning and multimodal understanding, pulling 86.9 percent on GPQA Diamond (scientific knowledge) and 76.8 percent on MMMU Pro (multimodal understanding and reasoning). Those numbers top even larger previous-generation Gemini models, including 2.5 Flash.

Google says the model also delivers its first response token 2.5 times faster and pushes output 45 percent faster than Gemini 2.5 Flash (not 2.5 Flash-Lite; Flash is a larger model). Developers can dial how much the model "thinks," so it works for simple high-volume jobs like translations as well as heavier lifts like building user interfaces, according to Google.

The speed and quality gains come at a cost. Output pricing has more than tripled: Gemini 3.1 Flash-Lite charges $0.25 per million input tokens (up from $0.10 on the 2.5 version) and $1.50 per million output tokens (up from $0.40).

Benchmark

Details of

Gemini 3.1 Flash-Lite (High)

Gemini 2.5 Flash (Dynamic)

Gemini 2.5 Flash-Lite (Dynamic)

GPT-5 mini (High)

Claude 4.5 Haiku (Extended Thinking)

Grok 4.1 Fast (Reasoning)

Input price ($/1M tokens, no caching)

Lower is better

$0.25

$0.30

$0.10

$0.25

$1.00

$0.20

Output price ($/1M tokens)

Lower is better

$1.50

$2.50

$0.40

$2.00

$5.00

$0.50

Output speed (Tokens/s)

363

249

366

108

145

Humanity's Last Exam (Academic reasoning, full set, text + MM)

No tools

16.0%

11.0%

6.9%

16.7%

9.7%

17.6%

GPQA Diamond (Scientific knowledge)

No tools

86.9%

82.8%

66.7%

82.3%

73.0%

84.3%

MMMU-Pro (Multimodal understanding and reasoning)

No tools

76.8%

66.7%

51.0%

74.1%

58.0%

63.0%

CharXiv Reasoning (Information synthesis from complex charts)

73.2%

63.7%

55.5%

75.5% (+ python)

61.7%

31.6%

Video-MMMU (Knowledge acquisition from videos)

84.8%

79.2%

60.7%

82.5%

74.6%

SimpleQA Verified (Parametric knowledge)

43.3%

28.1%

11.5%

9.5%

5.5%

19.5%

FACTS Benchmark Suite (Factuality across grounding, parametric, search, and MM)

40.6%

50.4%

17.9%

33.7%

18.6%

42.1%

MMMLU (Multilingual Q&A)

88.9%

86.6%

84.5%

84.9%

83.0%

86.8%

LiveCodeBench (Code generation, UI: 1/1/2025-5/1/2025)

72.0%

62.6%

34.3%

80.4%

53.2%

76.5%

MRCR v2 (8-needle) (Long context performance)

128k (average)

60.1%

54.3%

30.6%

52.5%

35.3%

54.6%

1M (pointwise)

12.3%

21.0%

5.4%

Not Supported

6.1%

The model is available for testing in Google AI Studio and Vertex AI. Full benchmark results are up on Artificial Analysis and the Arena.ai Leaderboard.

AI News Without the Hype – Curated by Humans

Subscribe to THE DECODER for ad-free reading, a weekly AI newsletter, our exclusive "AI Radar" frontier report six times a year, full archive access, and access to our comment section.

Subscribe now

この記事をシェア

TechCrunch AI重要度42026年3月4日 23:58

父親がGoogleを提訴、Geminiチャットボットが息子を致命的な妄想に駆り立てたと主張

The Decoder重要度42026年3月5日 03:35

Google、Geminiが男性を自殺に追い込んだとされる訴訟で不法死亡訴えに直面

AI News重要度42026年3月4日 17:00

Googleが産業用ロボティクスAI事業を公式化――今回は本気の取り組み

今日のまとめ

AI日報で今日の重要ニュースをまとめ読み

ニュース一覧に戻る元記事を読む

The Decoder·2026年3月4日 04:34·約1分

Googleの最速・最安モデルGemini 3.1 Flash-Lite、性能向上するも価格は3倍以上に

#LLM #生成AI #コスト最適化 #Google #モデルリリース #価格改定

TL;DR

Google DeepMindはGemini 3シリーズ最速・最安モデル「Gemini 3.1 Flash-Lite」のプレビューを公開し、性能は大幅に向上したが出力コストは3倍以上に高騰した。

AI深層分析2026年3月4日 05:42

注目/ 5段階

深度40%

キーポイント

新モデルの発表

Google DeepMindがGemini 3.1 Flash-Liteのプレビューを公開した。これはGemini 3シリーズの中で最速かつ最安価なモデルと位置付けられている。

性能向上

前身モデルと比較して能力が「大幅に向上した」とされている。

コスト高騰

性能向上の一方で、出力コストが3倍以上に増加したことが報告されている。

影響分析・編集コメントを表示

影響分析

編集コメント

image

本記事「Googleの最速・最安モデルGemini 3.1 Flash-Lite、性能は向上したが価格は3倍以上に」は、The Decoderで最初に公開されました。

原文を表示

Google Deepmind has released a preview of Gemini 3.1 Flash-Lite, the fastest and most affordable model in the Gemini 3 series.

Benchmark

Details of

Gemini 3.1 Flash-Lite (High)

Gemini 2.5 Flash (Dynamic)

Gemini 2.5 Flash-Lite (Dynamic)

GPT-5 mini (High)

Claude 4.5 Haiku (Extended Thinking)

Grok 4.1 Fast (Reasoning)

Input price ($/1M tokens, no caching)

Lower is better

$0.25

$0.30

$0.10

$0.25

$1.00

$0.20

Output price ($/1M tokens)

Lower is better

$1.50

$2.50

$0.40

$2.00

$5.00

$0.50

Output speed (Tokens/s)

363

249

366

108

145

Humanity's Last Exam (Academic reasoning, full set, text + MM)

No tools

16.0%

11.0%

6.9%

16.7%

9.7%

17.6%

GPQA Diamond (Scientific knowledge)

No tools

86.9%

82.8%

66.7%

82.3%

73.0%

84.3%

MMMU-Pro (Multimodal understanding and reasoning)

No tools

76.8%

66.7%

51.0%

74.1%

58.0%

63.0%

CharXiv Reasoning (Information synthesis from complex charts)

73.2%

63.7%

55.5%

75.5% (+ python)

61.7%

31.6%

Video-MMMU (Knowledge acquisition from videos)

84.8%

79.2%

60.7%

82.5%

74.6%

SimpleQA Verified (Parametric knowledge)

43.3%

28.1%

11.5%

9.5%

5.5%

19.5%

FACTS Benchmark Suite (Factuality across grounding, parametric, search, and MM)

40.6%

50.4%

17.9%

33.7%

18.6%

42.1%

MMMLU (Multilingual Q&A)

88.9%

86.6%

84.5%

84.9%

83.0%

86.8%

LiveCodeBench (Code generation, UI: 1/1/2025-5/1/2025)

72.0%

62.6%

34.3%

80.4%

53.2%

76.5%

MRCR v2 (8-needle) (Long context performance)

128k (average)

60.1%

54.3%

30.6%

52.5%

35.3%

54.6%

1M (pointwise)

12.3%

21.0%

5.4%

Not Supported

6.1%

The model is available for testing in Google AI Studio and Vertex AI. Full benchmark results are up on Artificial Analysis and the Arena.ai Leaderboard.

AI News Without the Hype – Curated by Humans

Subscribe to THE DECODER for ad-free reading, a weekly AI newsletter, our exclusive "AI Radar" frontier report six times a year, full archive access, and access to our comment section.

Subscribe now

この記事をシェア

TechCrunch AI重要度42026年3月4日 23:58

父親がGoogleを提訴、Geminiチャットボットが息子を致命的な妄想に駆り立てたと主張

The Decoder重要度42026年3月5日 03:35

Google、Geminiが男性を自殺に追い込んだとされる訴訟で不法死亡訴えに直面

AI News重要度42026年3月4日 17:00

Googleが産業用ロボティクスAI事業を公式化――今回は本気の取り組み

今日のまとめ

AI日報で今日の重要ニュースをまとめ読み

ニュース一覧に戻る元記事を読む