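Google's Gemma 4 puts free agentic AI on your phone, and no data ever leaves the device
Google's new open-source model, Gemma 4, processes text, images, and audio entirely on-device, and its agent capabilities let it autonomously use tools such as Wikipedia and interactive maps without the cloud.
Key points
Fully on-device processing
Gemma 4 handles text, image, and audio processing entirely on the device, so data does not leave it.
Autonomous tool use through agent capabilities
Using agent skills, the AI can directly use external tools such as Wikipedia and interactive maps without going through the cloud.
Open-source model
Gemma 4 is provided as an open-source AI model developed by Google.
Stronger privacy protection
Because data never leaves the device by design, user privacy protection is substantially strengthened.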
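Impact analysis
This technology could address a long-standing challenge in the AI industry: moving away from cloud dependence while protecting privacy. If fully on-device processing becomes practical, AI use that does not depend on network connectivity and a lower risk of data leaks would follow, likely accelerating growth of the edge AI market.
Editorial comment
A groundbreaking approach that combines independence from the cloud with privacy protection. It is an important announcement with the potential to move edge AI rapidly toward practical use.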

This article, "Google's Gemma 4 puts free agentic AI on your phone and no data ever leaves the device," was first published on The Decoder.
Apr 11, 2026
Screenshots by THE DECODER
Google's new open-source model, Gemma 4, processes text, images, and audio completely on-device. Using agent skills, the AI can independently tap into tools like Wikipedia or interactive maps, no cloud required.
The Google AI Edge Gallery app needed to run the model is free on Android and iOS. Since Gemma 4 dropped, the app has shot up to fourth place among the most-downloaded free productivity apps in the iOS App Store, sitting right behind Claude, Gemini, and ChatGPT.
Gemma 4 is built on the same research as Google's proprietary Gemini 3 model but ships under the commercially friendly Apache 2.0 license. Google says the Gemma family has racked up over 400 million downloads since the first generation launched. All models handle text, images, and audio across more than 140 languages.
Four model sizes cover everything from phones to servers
The latest release comes in four variants. E2B and E4B are built specifically for smartphones. The "E" stands for "effective parameters," meaning the number of parameters actually active during inference. Quantized, E2B takes up about 1.3 GB on-device, while E4B needs roughly 2.5 GB.
The bigger 26B and 31B variants target servers and high-performance hardware. The 26B version uses a mixture-of-experts architecture with 128 experts, so only 3.8 billion parameters are active at any given time. The dense 31B model offers a context window of up to 256,000 tokens.
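To make the mixture-of-experts idea concrete, here is a minimal sketch of top-k expert routing in plain Python. The article gives only the expert count (128) and the active parameter count (3.8 billion); the value of `TOP_K`, the random router scores, and the function names are illustrative assumptions, not Gemma 4's actual configuration.

```python
import math
import random

NUM_EXPERTS = 128  # from the article
TOP_K = 8          # assumption: the article does not say how many experts fire per token


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_scores, top_k=TOP_K):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    mass = sum(probs[i] for i in chosen)
    return {i: probs[i] / mass for i in chosen}


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for the scores a learned router would produce for one token.
    scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    weights = route(scores)
    print(f"{len(weights)} of {NUM_EXPERTS} experts active for this token")
    # Share of parameters active per token, using the article's figures.
    print(f"active share: {3.8 / 26:.1%}")
```

Only the selected experts run for a given token, which is why the per-token compute tracks the 3.8 billion active parameters (roughly 15 percent of the nominal 26 billion) rather than the full model size.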
Google also teamed up with Arm and Qualcomm to optimize the phone variants for current mobile chips. According to Google, Gemma 4 on Android runs up to four times faster than the previous generation while cutting battery drain by up to 60 percent. Arm's own benchmarks show even bigger gains: an average 5.5x speedup in processing, provided the device packs a newer Arm chip with the SME2 instruction set, an extension that accelerates matrix math for AI models directly in silicon.
Agent skills bring tool use to on-device AI
The app requires Android 12 or iOS 17. The two phone-sized variants differ in RAM requirements: E2B uses about 1.3 GB quantized and runs on devices with 6 GB of RAM, while E4B needs around 2.5 GB of model memory and at least 8 GB of RAM.
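As a rough illustration, the RAM figures above can be turned into a simple variant picker. This is a sketch under assumptions: the function and dictionary names are hypothetical, and treating 6 GB as a hard floor for E2B is a reading of the article's wording rather than a documented requirement.

```python
from typing import Optional

# Figures from the article: approximate quantized model size and device RAM, in GB.
VARIANTS = {
    "E4B": {"model_gb": 2.5, "min_ram_gb": 8},
    "E2B": {"model_gb": 1.3, "min_ram_gb": 6},  # assumption: 6 GB treated as the minimum
}


def pick_variant(device_ram_gb: float) -> Optional[str]:
    """Return the largest variant whose RAM floor the device meets, or None."""
    candidates = [
        name for name, spec in VARIANTS.items()
        if device_ram_gb >= spec["min_ram_gb"]
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda name: VARIANTS[name]["model_gb"])


if __name__ == "__main__":
    for ram in (4, 6, 8, 12):
        print(ram, "GB ->", pick_variant(ram))
```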
Agent skills can be toggled and managed individually. On the right, Gemma 4 generates a QR code directly on-device using a JavaScript skill.
Beyond basic chat, image recognition, and audio transcription, the app ships with what Google calls "agent skills": Wikipedia search, interactive maps, auto-generated summaries, and flashcards. Gemma 4 can also describe photos, turn spoken input into diagrams and visualizations, and even team up with other local models for things like text-to-speech or image generation. Google shows this off with a demo skill that describes and plays animal calls.
Image recognition got a solid upgrade too, according to Google. OCR tasks, pulling text from images, diagrams, or handwriting, now deliver noticeably better results. The model also handles time-related information more reliably, which is important for calendars, reminders, and alarms.
Gemma 4 picks up on the intent behind user input and automatically fires up the right skill: here, a mood tracker with a history chart.
Individually, none of these features break new ground compared to what cloud providers already offer. What stands out is that a demo app running a purely local model on a phone can now use these tools on its own. Developers can build custom skills through GitHub and share them with the community. The built-in tools do need an internet connection, but the model itself runs locally, and chats never get saved.
Gemma 4 sets the stage for the next Gemini Nano
According to Google, Gemma 4 E2B and E4B serve as the foundation for Gemini Nano 4, the next generation of Android's system-wide on-device model. Code written for Gemma 4 today will work with Gemini Nano 4 out of the box when it ships on new flagship devices later this year. Gemini Nano already runs on over 140 million Android devices, powering features like Smart Replies and audio summaries.
Back in December, Google previewed this direction with FunctionGemma, a tiny local model with just 270 million parameters that can route commands to other phone apps. It translates natural language into structured function calls: toggling the flashlight, creating contacts, sending emails, adding calendar entries, pulling up locations on a map, or opening Wi-Fi settings.
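A minimal sketch of the receiving side of such a setup: app code that takes a structured function call, validates it against a registry, and runs the matching handler. The JSON shape, the function names `set_flashlight` and `create_calendar_event`, and the registry layout are all hypothetical; the article does not describe FunctionGemma's actual output format.

```python
import json


# Hypothetical handlers standing in for real phone actions.
def set_flashlight(on: bool) -> str:
    return f"flashlight {'on' if on else 'off'}"


def create_calendar_event(title: str, date: str) -> str:
    return f"event '{title}' on {date}"


# Registry: function name -> (handler, required argument names and types).
REGISTRY = {
    "set_flashlight": (set_flashlight, {"on": bool}),
    "create_calendar_event": (create_calendar_event, {"title": str, "date": str}),
}


def dispatch(call_json: str) -> str:
    """Validate a structured call emitted by the model, then run its handler."""
    call = json.loads(call_json)
    name = call.get("name")
    args = call.get("arguments", {})
    if name not in REGISTRY:
        raise ValueError(f"unknown function: {name!r}")
    handler, schema = REGISTRY[name]
    if set(args) != set(schema):
        raise ValueError(f"expected arguments {sorted(schema)}, got {sorted(args)}")
    for key, expected in schema.items():
        if not isinstance(args[key], expected):
            raise TypeError(f"{key} must be {expected.__name__}")
    return handler(**args)


if __name__ == "__main__":
    # Pretend the model turned "turn on the flashlight" into this call.
    print(dispatch('{"name": "set_flashlight", "arguments": {"on": true}}'))
```

Checking the name and argument types before calling anything keeps a malformed or unexpected model output from triggering an action the app never registered.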
How much on-device AI matters strategically became clear earlier this year with the billion-dollar deal between Apple and Google. Since January, we've known that the next generation of Apple's Foundation Models will be built on Google's Gemini technology, powering a sweeping Siri upgrade over the course of 2026.