InfoQ · April 14, 2026, 06:00 · about a 4-minute read

Google Releases Gemma 4 with a Focus on Local-First, On-Device AI Inference

#OnDeviceAI #LocalInference #AndroidDevelopment #AIAgents #SoftwareLifecycle #Gemma

TL;DR

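Google has released the Gemma 4 model family, which supports the entire software lifecycle from coding to production, with the aim of enabling local-first, agentic AI for Android development.

AI In-Depth Analysis · April 14, 2026, 11:41

Importance (5-point scale)

Depth: 40%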

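Key points

Focus on local-first AI inference

Gemma 4 emphasizes a "local-first" approach that reduces cloud dependence and enables AI inference on the device.

Enabling agentic AI for Android development

The release is intended to let Android developers build autonomous, context-aware AI agents.

Support for the entire software lifecycle

The Gemma 4 model family is designed to cover the whole development process, from coding through testing and deployment to production operation.

Strengthening Google's on-device AI strategy

The announcement indicates that Google is accelerating its shift from cloud-centric to on-device AI processing.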

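Impact analysis

This announcement is an important step in moving the AI inference paradigm from the cloud to the device, and it may contribute to higher developer productivity, stronger user privacy protection, and better real-time responsiveness. In particular, it is likely to encourage the democratization and acceleration of AI application development in the Android ecosystem.

Editorial comment

A strategic release that signals a clear shift from cloud AI to on-device AI. It could improve both the developer experience and the end-user experience, and it looks likely to influence industry trends.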

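With the release of Gemma 4, Google aims to enable local, agentic AI for Android development through a family of models designed to support the entire software lifecycle, from coding to production.

Gemma 4 models cover a spectrum of capabilities, from efficient on-device variants that power Android apps via the ML Kit GenAI Prompt API, to more powerful models designed to deliver AI-powered coding assistance in Android Studio on the desktop.

Gemma 4 includes three models: Gemma E2B, which requires 8GB of RAM and 2GB of storage; Gemma E4B, which requires 12GB of RAM and 4GB of storage; and Gemma 26B MoE, which requires 24GB of RAM and 17GB of storage. The most powerful model is recommended for use as a coding agent on a development machine, while the two smaller variants are suitable for on-device integration.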
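The requirements above can be read as a small lookup table. The following is a minimal sketch, not part of Google's tooling: `GemmaVariant`, `gemma4Variants`, and `largestVariantThatFits` are hypothetical names introduced for illustration, and the machine sizes in `main` are made-up inputs. Only the RAM and storage figures come from the article.

```kotlin
// Hypothetical helper types: none of these names come from Gemma or Android tooling.
// Only the RAM and storage figures are taken from the article.
data class GemmaVariant(val name: String, val ramGb: Int, val storageGb: Int)

val gemma4Variants = listOf(
    GemmaVariant("Gemma E2B", ramGb = 8, storageGb = 2),
    GemmaVariant("Gemma E4B", ramGb = 12, storageGb = 4),
    GemmaVariant("Gemma 26B MoE", ramGb = 24, storageGb = 17),
)

// Returns the most demanding variant whose stated requirements fit the
// given machine, or null if even the smallest variant does not fit.
fun largestVariantThatFits(ramGb: Int, freeStorageGb: Int): GemmaVariant? =
    gemma4Variants
        .filter { it.ramGb <= ramGb && it.storageGb <= freeStorageGb }
        .maxByOrNull { it.ramGb }

fun main() {
    // 16GB RAM rules out the 26B MoE model, so E4B is the largest fit.
    println(largestVariantThatFits(ramGb = 16, freeStorageGb = 50)?.name)
    // Enough RAM for 26B MoE, but only 10GB of free storage, so E4B again.
    println(largestVariantThatFits(ramGb = 32, freeStorageGb = 10)?.name)
    // Below the 8GB minimum: no variant fits.
    println(largestVariantThatFits(ramGb = 4, freeStorageGb = 50)?.name)
}
```

The sketch only checks whether a variant fits. It does not encode the recommendation that the 26B MoE model targets development machines while E2B and E4B target on-device use.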

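Gemma 26B MoE (Mixture of Experts) enables local, agentic coding without requiring code to be shared with cloud-based AI providers, making it especially valuable for developers working under strict data privacy requirements or in secure enterprise environments. According to Google, it runs efficiently on modern hardware by leveraging local GPU and RAM resources. Additionally, its usage is not constrained by token quotas or network latency. Gemma 26B MoE can be used to design new features or an entire app, refactor existing code, and resolve build and lint errors, Google says.

The two smaller models, Gemma E2B and Gemma E4B, are designed for on-device inference. Specifically, E4B offers stronger reasoning power and is better suited for complex tasks, while E2B is optimized for maximum speed, delivering 3x faster inference than Gemma E4B, along with lower latency.

Google says the new models are up to 4x faster than previous versions and use up to 60% less battery. In addition, they deliver higher-quality results for chain-of-thought prompts and conditional reasoning, with better math skills, temporal reasoning, and image processing, for use cases such as chart interpretation, visual data extraction, and handwriting recognition.

Gemma 4 provides the foundation for the next generation of Gemini Nano, which powers AI features on Android devices. Developers can already use it to prototype their apps and prepare them for Gemini Nano 4, which is expected to become available on supported devices later this year. To access Gemma 4 models on Android devices, developers can join the AICore Developer Preview program.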

The following code snippet shows how to use the models in Kotlin:

// Define the configuration with a specific track and preference
val previewFullConfig = generationConfig {
    modelConfig = ModelConfig {
        releaseTrack = ModelReleaseTrack.PREVIEW
        preference = ModelPreference.FULL
    }
}

// Initialize the GenerativeModel with the configuration
val previewModel = GenerativeModel.getClient(previewFullConfig)

// Verify that the specific preview model is available
val previewModelStatus = previewModel.checkStatus()

if (previewModelStatus == FeatureStatus.AVAILABLE) {
    // Proceed with inference
    val response = previewModel.generateContent("If I get 26 paychecks per year, how much I should contribute each paycheck to reach my savings goal of $10k over the course of a year? Return only the amount.")
} else {
    // Handle the case where the preview model is not available
    // (e.g., print out log statements)
}
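The sample prompt has a deterministic answer, so a prototype can sanity-check the model's reply against plain arithmetic. The following is a minimal sketch under stated assumptions: `perPaycheckContribution` is a hypothetical helper that is not part of any Google API, and rounding up to the next cent is an assumption chosen so that the yearly total does not fall short of the goal.

```kotlin
import java.math.BigDecimal
import java.math.RoundingMode

// Hypothetical helper: divides the yearly goal evenly across paychecks,
// rounding up to the next cent (an assumption, so the total reaches the goal).
fun perPaycheckContribution(goal: BigDecimal, paychecksPerYear: Int): BigDecimal =
    goal.divide(BigDecimal(paychecksPerYear), 2, RoundingMode.CEILING)

fun main() {
    val amount = perPaycheckContribution(BigDecimal("10000"), 26)
    println(amount)                          // 384.62
    println(amount.multiply(BigDecimal(26))) // 10000.12
}
```

A reply close to $384.62 is consistent with this calculation. A model that rounds down would answer $384.61, which leaves the yearly total slightly under $10,000.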

Gemma 4 models can also be installed from Ollama or LM Studio.

About the Author

Sergio De Simone is a software engineer. Sergio has been working as a software engineer for over twenty-five years across a range of different projects and companies, including such different work environments as Siemens, HP, and small startups. For the last 10+ years, his focus has been on development for mobile platforms and related technologies. He is currently working for BigML, Inc., where he leads iOS and macOS development.

Original text

With the release of Gemma 4, Google aims to enable local, agentic AI for Android development through a family of models designed to support the entire software lifecycle, from coding to production.

Gemma 4 models cover a spectrum of capabilities, from efficient on-device variants that power Android apps via the ML Kit GenAI Prompt API, to more powerful models designed to deliver AI-powered coding assistance in Android Studio on the desktop.

Gemma 4 includes three models: Gemma E2B, which requires 8GB of RAM and 2GB of storage; Gemma E4B, which requires 12GB of RAM and 4GB of storage; and Gemma 26B MoE, which requires 24GB of RAM and 17GB of storage. The most powerful model is recommended for use as a coding agent on a development machine, while the two smaller variants are suitable for on-device integration.

Gemma 26B MoE enables local, agentic coding without requiring code to be shared with cloud-based AI providers, making it especially valuable for developers working under strict data privacy requirements or in secure enterprise environments. According to Google, it runs efficiently on modern hardware by leveraging local GPU and RAM resources. Additionally, its usage is not constrained by token quotas or network latency. Gemma 26B MoE can be used to design new features or an entire app, refactor existing code, and resolve build/lint errors, Google says.

The two smaller models, Gemma E2B and Gemma E4B, are designed for on-device inference. Specifically, E4B offers stronger reasoning power and is better suited for complex tasks, while E2B is optimized for maximum speed, delivering 3x faster inference than Gemma E4B, along with lower latency.

Google says the new models are up to 4x faster than previous versions and use up to 60% less battery. In addition, they deliver higher-quality results for chain-of-thought prompts and conditional reasoning, with better math skills, temporal reasoning, and image processing, for use cases such as chart interpretation, visual data extraction, and handwriting recognition.

Gemma 4 provides the foundation for the next generation of Gemini Nano, which powers AI features on Android devices. Developers can already use it to prototype their apps and prepare them for Gemini Nano 4, which is expected to become available on supported devices later this year. To access Gemma 4 models on Android devices, developers can join the AICore Developer Preview program.

The following code snippet shows how to use the models in Kotlin:

// Define the configuration with a specific track and preference
val previewFullConfig = generationConfig {
    modelConfig = ModelConfig {
        releaseTrack = ModelReleaseTrack.PREVIEW
        preference = ModelPreference.FULL
    }
}

// Initialize the GenerativeModel with the configuration
val previewModel = GenerativeModel.getClient(previewFullConfig)

// Verify that the specific preview model is available
val previewModelStatus = previewModel.checkStatus()

if (previewModelStatus == FeatureStatus.AVAILABLE) {
    // Proceed with inference
} else {
    // Handle the case where the preview model is not available
    // (e.g., print out log statements)
}

Gemma 4 models can also be installed from Ollama or LM Studio.

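Hugging Face Blog · ★4 · April 23, 2026, 09:00

How to Use Transformers.js in a Chrome Extension

Developers can embed Transformers.js in a Chrome extension to run machine learning models in the browser. The article walks through an implementation that removes the server dependency and provides privacy protection and low latency.

InfoQ · ★3 · April 24, 2026, 00:00

Google Announces Room 3.0: A Kotlin-First, Asynchronous, Multiplatform Persistence Library

Google has announced Room 3.0. This version introduces breaking changes, strengthens Kotlin Multiplatform support, and adds support for JS and Wasm.

Simon Willison Blog · April 16, 2026, 01:41

A Natural Speech Synthesis Tool Built on Google's Gemini 3.1 Flash TTS Model

Google has released the "Gemini 3.1 Flash TTS" model, which supports single-speaker and multi-speaker conversation modes and accepts tags that direct vocal delivery. The tool makes it possible to generate natural-sounding speech from text and download it.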
