Googleの最速・最安モデルGemini 3.1 Flash-Lite、性能向上するも価格は3倍以上に
Google DeepMindはGemini 3シリーズ最速・最安モデル「Gemini 3.1 Flash-Lite」のプレビューを公開し、性能は大幅に向上したが出力コストは3倍以上に高騰した。
キーポイント
新モデルの発表
Google DeepMindがGemini 3.1 Flash-Liteのプレビューを公開した。これはGemini 3シリーズの中で最速かつ最安価なモデルと位置付けられている。
性能向上
前身モデルと比較して能力が「大幅に向上した」とされている。
コスト高騰
性能向上の一方で、出力コストが3倍以上に増加したことが報告されている。
影響分析・編集コメントを表示
影響分析
この発表は、AIモデルの性能向上とコスト増加のトレードオフを明確に示しており、開発者や企業がコスト効率を考慮したモデル選択を迫られる可能性がある。Googleの製品戦略において、エントリーレベル向けモデルの価格設定が変化する兆候とも捉えられる。
編集コメント
性能向上とコスト増加のバランスがユーザーにどう受け止められるかが今後の焦点。『最安』を謳いながら価格が急騰するのは市場に混乱をもたらす可能性がある。

Google Deepmindは、Gemini 3シリーズで最速かつ最安のモデルであるGemini 3.1 Flash-Liteのプレビュー版を公開しました。前身モデルよりも能力は大幅に向上していますが、出力コストは3倍以上に増加しています。
本記事「Googleの最速・最安モデルGemini 3.1 Flash-Lite、性能は向上したが価格は3倍以上に」は、The Decoderで最初に公開されました。
原文を表示
Google Deepmind has released a preview of Gemini 3.1 Flash-Lite, the fastest and most affordable model in the Gemini 3 series.
According to Artificial Analysis, the model scores 34 points on their Intelligence Index, a 12-point jump over its predecessor, Gemini 2.5 Flash-Lite. Despite the big leap in capability, it keeps up the same speed as the previous version, cranking out more than 360 tokens per second with an average response time of 5.1 seconds.
On multimodal tasks, it beats top-tier models like Claude Opus 4.6 and Kimi K2.5, hitting 78 percent on the MMMU-Pro benchmark. Artificial Analysis notes that tool usage barely improved. The context window stays at one million tokens.

On the Arena.ai leaderboard, which ranks models based on human preferences in blind comparisons, Gemini 3.1 Flash-Lite lands an Elo score of 1432. It outperforms other models in its weight class on reasoning and multimodal understanding, pulling 86.9 percent on GPQA Diamond (scientific knowledge) and 76.8 percent on MMMU Pro (multimodal understanding and reasoning). Those numbers top even larger previous-generation Gemini models, including 2.5 Flash.
Google says the model also delivers its first response token 2.5 times faster and pushes output 45 percent faster than Gemini 2.5 Flash (not 2.5 Flash-Lite; Flash is a larger model). Developers can dial how much the model "thinks," so it works for simple high-volume jobs like translations as well as heavier lifts like building user interfaces, according to Google.
The speed and quality gains come at a cost. Output pricing has more than tripled: Gemini 3.1 Flash-Lite charges $0.25 per million input tokens (up from $0.10 on the 2.5 version) and $1.50 per million output tokens (up from $0.40).
Benchmark
Details of
Gemini 3.1 Flash-Lite (High)
Gemini 2.5 Flash (Dynamic)
Gemini 2.5 Flash-Lite (Dynamic)
GPT-5 mini (High)
Claude 4.5 Haiku (Extended Thinking)
Grok 4.1 Fast (Reasoning)
Input price ($/1M tokens, no caching)
Lower is better
$0.25
$0.30
$0.10
$0.25
$1.00
$0.20
Output price ($/1M tokens)
Lower is better
$1.50
$2.50
$0.40
$2.00
$5.00
$0.50
Output speed (Tokens/s)
363
249
366
71
108
145
Humanity's Last Exam (Academic reasoning, full set, text + MM)
No tools
16.0%
11.0%
6.9%
16.7%
9.7%
17.6%
GPQA Diamond (Scientific knowledge)
No tools
86.9%
82.8%
66.7%
82.3%
73.0%
84.3%
MMMU-Pro (Multimodal understanding and reasoning)
No tools
76.8%
66.7%
51.0%
74.1%
58.0%
63.0%
CharXiv Reasoning (Information synthesis from complex charts)
73.2%
63.7%
55.5%
75.5% (+ python)
61.7%
31.6%
Video-MMMU (Knowledge acquisition from videos)
84.8%
79.2%
60.7%
82.5%
-
74.6%
SimpleQA Verified (Parametric knowledge)
43.3%
28.1%
11.5%
9.5%
5.5%
19.5%
FACTS Benchmark Suite (Factuality across grounding, parametric, search, and MM)
40.6%
50.4%
17.9%
33.7%
18.6%
42.1%
MMMLU (Multilingual Q&A)
88.9%
86.6%
84.5%
84.9%
83.0%
86.8%
LiveCodeBench (Code generation, UI: 1/1/2025-5/1/2025)
72.0%
62.6%
34.3%
80.4%
53.2%
76.5%
MRCR v2 (8-needle) (Long context performance)
128k (average)
60.1%
54.3%
30.6%
52.5%
35.3%
54.6%
1M (pointwise)
12.3%
21.0%
5.4%
Not Supported
Not Supported
6.1%
The model is available for testing in Google AI Studio and Vertex AI. Full benchmark results are up on Artificial Analysis and the Arena.ai Leaderboard.
AI News Without the Hype – Curated by Humans
Subscribe to THE DECODER for ad-free reading, a weekly AI newsletter, our exclusive "AI Radar" frontier report six times a year, full archive access, and access to our comment section.
関連記事
今日のまとめ
AI日報で今日の重要ニュースをまとめ読み