Vercel Blog · February 28, 2026, 22:00 · about 8 min read

Gamma builds a design-focused agent on Vercel

#AI agents #Generative AI #Multimodal #Vercel AI SDK #Presentation AI #Conversational editing

TL;DR

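Gamma used the Vercel AI SDK to build Gamma Agent, a design-focused AI agent that moved the product from automatic presentation generation to conversational editing. The company also streamlined its image generation pipeline.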

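Key points

A design-focused company culture

Gamma has valued designers since its earliest days: three of its first ten hires were designers, and blending design with engineering is part of the culture.

The evolution of Gamma Agent

Launched in October 2025, Gamma Agent extended the product's AI from one-shot generation to conversational editing, changing the relationship between the user and the product.

A leaner architecture with the AI SDK

By adopting the Vercel AI SDK, Gamma evolves agent behavior without writing custom orchestration code, while still managing conversation state and complex multi-step interactions.

An efficient image generation pipeline

Gamma's image pipeline has generated more than 1.5 billion images across 60 models and 20 providers, and its standardized architecture lets a new model be added in about 30 lines of code.

Fast model adoption by separating model strategy

Gamma treats the model layer as configuration and keeps model-specific strategies in feature code, so it can ship a new model in hours.

Efficient deployment on Vercel

Using Vercel's Preview Deployments and Instant Rollbacks, Gamma's small team ships more than 250 deployments per day with a 99 percent success rate.

The importance and layering of context

Making an agent more valuable depends on efficiently supplying the model with three layers of context: the session, the user's history, and organizational assets.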

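Impact analysis

This article shows a practical approach to combining design thinking with technical efficiency in AI tool development, and it gives a concrete example of the shift from generative AI to conversational AI. By demonstrating the Vercel AI SDK in production use, it may influence how small and mid-sized companies and startups streamline their AI development.

Editorial comment

A good example of design and AI engineering together becoming a source of product competitiveness. As a concrete case of the Vercel AI SDK in practical use, it is a useful reference for the AI development community.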

Title: Gamma builds a design-led AI agent on Vercel

Gamma began with a simple idea: what if your presentation could design itself?

With a single sentence, users can generate a complete presentation that respects layout, spacing, and hierarchy. Columns reflow automatically. Diagrams adjust when new layers are added. The product handles the formatting so teams can stay focused on the ideas.

That philosophy reflects the company's DNA. Of Gamma's first ten hires, three were designers. "The attention to detail and value placed on design has been baked into the culture from the very, very beginning," says Sherwin Yu, Head of AI and Product Engineering. "Our designers at Gamma are fantastic. They ship code, they're technical. They'll push to production."

"There's a lot of discussion about how do we, whenever possible, elevate the user experience," Sherwin says.

As adoption grew, the team realized generation was only the beginning. Real presentation work happens in iteration. Teams outline, restructure, refine tone, and polish visuals. In October 2025, Gamma launched Gamma Agent, a conversational editing feature that dramatically expanded the product's AI capabilities.

Evolving complex agent architectures with AI SDK

The first version of Gamma generated decks from a prompt. Gamma Agent introduced dialogue, and with it, a new relationship between the user and the product.

As the team started prototyping more powerful agents, that simplicity broke down. They needed finer control and more persistence over conversation state. They needed the ability to pass context from one agent to another, manage message history across sessions, and orchestrate more complex multi-step interactions than a simple request-response loop.

The decisions a user made early in a workflow, the reasoning behind the structure, the tone they'd settled on… all of that was valuable context that couldn't just live in a disposable chat window.

By building on the AI SDK rather than custom orchestration code, Gamma can evolve agent behavior without re-architecting its backend.
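To make the persistence requirement concrete, here is a minimal sketch of conversation state that outlives a single chat window and can be handed from one agent to another. It is not Gamma's code and it does not use the AI SDK's API: `SessionStore`, `handOff`, and the message shape are hypothetical names chosen for illustration, and the in-memory `Map` stands in for a real database.

```typescript
// Hypothetical sketch: persistent conversation state with agent handoff.
// None of these names come from Gamma or the AI SDK.

type Role = "user" | "assistant";

interface Message {
  role: Role;
  agent: string; // which agent sent or received this message
  text: string;
}

interface Session {
  id: string;
  messages: Message[];
  // Durable decisions (tone, structure) kept apart from the raw transcript.
  decisions: Map<string, string>;
}

class SessionStore {
  // In-memory stand-in for a database table keyed by session ID.
  private sessions = new Map<string, Session>();

  load(id: string): Session {
    let session = this.sessions.get(id);
    if (!session) {
      session = { id, messages: [], decisions: new Map<string, string>() };
      this.sessions.set(id, session);
    }
    return session;
  }

  append(id: string, message: Message): void {
    this.load(id).messages.push(message);
  }

  decide(id: string, key: string, value: string): void {
    this.load(id).decisions.set(key, value);
  }
}

// Build the context one agent passes to the next: every recorded decision
// plus only the most recent messages, so the handoff stays small.
function handOff(session: Session, recentCount: number): string {
  const decisions: string[] = [];
  session.decisions.forEach((value, key) => decisions.push(`${key}: ${value}`));
  const recent =
    recentCount > 0
      ? session.messages.slice(-recentCount).map((m) => `${m.role}: ${m.text}`)
      : [];
  return [...decisions, ...recent].join("\n");
}

const store = new SessionStore();
store.append("deck-1", { role: "user", agent: "outline", text: "Make it a 5-slide pitch." });
store.decide("deck-1", "tone", "confident, plain language");
store.append("deck-1", { role: "assistant", agent: "outline", text: "Drafted 5 slides." });
store.append("deck-1", { role: "user", agent: "design", text: "Tighten slide 2." });

// Context a second agent would receive when it picks up the session.
const handoffContext = handOff(store.load("deck-1"), 2);
```

Keeping decisions separate from the transcript is one possible design choice: the transcript can be trimmed aggressively while the tone and structure a user settled on survive every handoff.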

Gamma's investment in composable, model-agnostic architecture extends beyond text. The company's image pipeline, which has generated more than 1.5 billion images across 60 models and 20 providers, has gone through its own architectural reckoning.

Image generation

Staying on the frontier of image generation means integrating new models fast… sometimes within days of launch. When the Vercel AI SDK introduced ImageModelV3, a standard interface for image generation with a composable middleware layer, Gamma's team saw it as yet another opportunity.

Today, adding a new image model to Gamma is about 30 lines of code: just a model ID, cost formula, supported sizes, and capability flags. Tracing, cost tracking, and image preprocessing are handled automatically by shared middleware that wraps every model. Engineers never think about that plumbing; they just declare what a model can do. This pays off in the product.
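The declarative shape described above might look roughly like the following sketch. It is not the AI SDK's `ImageModelV3` interface or Gamma's registry: `ImageModelSpec`, `withTracking`, the model ID, and the price are all made up for illustration, and generation is synchronous here only to keep the example short (a real provider call would be asynchronous). The point is that a model is declared as data, while cost tracking and validation live in one wrapper applied to every model.

```typescript
// Hypothetical sketch; this is not the real ImageModelV3 interface.

interface ImageSize {
  width: number;
  height: number;
}

// Everything an engineer declares when adding a model.
interface ImageModelSpec {
  id: string;
  supportedSizes: ImageSize[];
  capabilities: { styleReference: boolean; transparentBackground: boolean };
  cost: (size: ImageSize) => number; // USD for one image at this size
}

interface ImageRequest {
  prompt: string;
  size: ImageSize;
}

interface ImageResult {
  modelId: string;
  url: string;
}

type Generate = (request: ImageRequest) => ImageResult;

interface UsageRecord {
  modelId: string;
  costUsd: number;
}

// Shared middleware: validates the request against the declared sizes and
// records cost, so individual model declarations never repeat this logic.
function withTracking(spec: ImageModelSpec, generate: Generate, log: UsageRecord[]): Generate {
  return (request) => {
    const supported = spec.supportedSizes.some(
      (s) => s.width === request.size.width && s.height === request.size.height
    );
    if (!supported) {
      throw new Error(`${spec.id} does not support ${request.size.width}x${request.size.height}`);
    }
    const result = generate(request);
    log.push({ modelId: spec.id, costUsd: spec.cost(request.size) });
    return result;
  };
}

// Declaring a new model: an ID, a cost formula, sizes, and capability flags.
const exampleModel: ImageModelSpec = {
  id: "example-provider/example-image-1", // made-up ID
  supportedSizes: [
    { width: 1024, height: 1024 },
    { width: 1536, height: 1024 },
  ],
  capabilities: { styleReference: false, transparentBackground: true },
  cost: (size) => ((size.width * size.height) / (1024 * 1024)) * 0.04, // made-up price
};

// Stub standing in for a provider call.
const fakeGenerate: Generate = (request) => ({
  modelId: exampleModel.id,
  url: "https://example.invalid/" + encodeURIComponent(request.prompt),
});

const usageLog: UsageRecord[] = [];
const trackedGenerate = withTracking(exampleModel, fakeGenerate, usageLog);
const firstImage = trackedGenerate({ prompt: "a red kite", size: { width: 1024, height: 1024 } });
```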

Infographics

When the team shipped AI infographics, Gemini needed multimodal style references (actual images showing the target aesthetic), while Flux worked best with concise, text-only prompts. Because the model layer is just configuration, those per-model strategies live in the feature code, not buried in infrastructure. New model, new capability, new feature: each is independent.
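As an illustration of per-model strategies living in feature code, the sketch below keeps one prompt builder per model family inside a hypothetical infographics feature. The type names, prompt wording, and the `buildInfographicPrompt` helper are assumptions made for this example, not Gamma's implementation.

```typescript
// Hypothetical sketch: prompt strategies owned by the feature, keyed by model family.

type ModelFamily = "gemini" | "flux";

interface InfographicBrief {
  topic: string;
  styleNotes: string;
  styleReferenceUrls: string[]; // images showing the target aesthetic
}

interface ImagePrompt {
  text: string;
  referenceImages: string[];
}

type PromptStrategy = (brief: InfographicBrief) => ImagePrompt;

const infographicStrategies: Record<ModelFamily, PromptStrategy> = {
  // Multimodal model: attach reference images and let them carry the style.
  gemini: (brief) => ({
    text: `Infographic about ${brief.topic}. Match the style of the attached references.`,
    referenceImages: brief.styleReferenceUrls,
  }),
  // Text-only prompting: keep the prompt concise and describe the style in words.
  flux: (brief) => ({
    text: `${brief.topic} infographic, ${brief.styleNotes}`,
    referenceImages: [],
  }),
};

function buildInfographicPrompt(family: ModelFamily, brief: InfographicBrief): ImagePrompt {
  return infographicStrategies[family](brief);
}

const brief: InfographicBrief = {
  topic: "quarterly revenue",
  styleNotes: "flat icons, muted palette",
  styleReferenceUrls: ["https://example.invalid/ref-1.png", "https://example.invalid/ref-2.png"],
};

const geminiPrompt = buildInfographicPrompt("gemini", brief);
const fluxPrompt = buildInfographicPrompt("flux", brief);
```

Because the strategy table belongs to the feature, supporting another model family in this sketch means adding one entry here and leaves the shared model layer untouched.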

The result: Gamma ships new models in hours, not weeks, and every model automatically gets production-grade observability from its first request.

Shipping continuously with preview deployments

Gamma applies the same philosophy to its deployment workflow: pick stable foundations, then move fast on top of them. Instead of building its own release system, the team relies on Vercel's Preview Deployments, production deployments, and Instant Rollbacks.

"We try not to reinvent infrastructure we don't have to," Sherwin says. "We'd rather spend that engineering energy on the product."

With a team of just 20 or so engineers, Gamma averages more than 250 deployments per day across preview and production. The median deploy completes in just over 7 minutes, with a 99 percent success rate.

Preview deployments make it safe to experiment with agent behavior on every pull request. Instant Rollbacks provide confidence when shipping changes that affect model logic or orchestration.

Scaling the AI content pipeline on Vercel

Gamma's AI outputs raw HTML, but a presentation is more than markup: it's a structured document with layout rules, resolved images, live charts, and editable diagrams. Every generated card passes through a conversion layer that bridges that gap in real time.

Gamma runs this critical conversion layer as Vercel Functions. Each AI-generated card goes to a serverless endpoint that instantiates the complete Tiptap editor schema inside JSDOM, parses the LLM's HTML output into structured editor content, and resolves async assets.
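The last step, resolving assets, can be sketched without any editor library. The example below assumes the HTML has already been parsed into a tree of nodes and shows only how placeholders might be collected and then patched once their URLs are known. `EditorNode`, the `pendingQuery` attribute, and both helper functions are hypothetical; they are not Tiptap's API or Gamma's schema, and the actual fetching (which would be asynchronous) is left to the caller.

```typescript
// Hypothetical sketch of asset resolution over already-parsed editor content.

interface EditorNode {
  type: string;
  attrs?: { [key: string]: string };
  content?: EditorNode[];
}

// Walk the tree and collect every image placeholder that still needs a URL.
function collectPendingAssets(node: EditorNode, found: string[] = []): string[] {
  if (node.type === "image" && node.attrs && node.attrs["pendingQuery"]) {
    found.push(node.attrs["pendingQuery"]);
  }
  for (const child of node.content || []) {
    collectPendingAssets(child, found);
  }
  return found;
}

// Return a new tree in which resolved placeholders carry a real src.
// Unresolved placeholders are left as they were; the input is not mutated.
function applyResolvedAssets(node: EditorNode, resolved: Map<string, string>): EditorNode {
  const query = node.attrs ? node.attrs["pendingQuery"] : undefined;
  const url = query !== undefined ? resolved.get(query) : undefined;
  const attrs = node.type === "image" && url !== undefined ? { src: url } : node.attrs;
  const out: EditorNode = { type: node.type };
  if (attrs) out.attrs = attrs;
  if (node.content) {
    out.content = node.content.map((child) => applyResolvedAssets(child, resolved));
  }
  return out;
}

const card: EditorNode = {
  type: "card",
  content: [
    { type: "heading", content: [{ type: "text", attrs: { value: "Q3 results" } }] },
    { type: "image", attrs: { pendingQuery: "rising bar chart" } },
  ],
};

const pendingAssets = collectPendingAssets(card);
// In a real pipeline these URLs would come from concurrent lookups.
const resolvedUrls = new Map<string, string>([
  ["rising bar chart", "https://example.invalid/chart.png"],
]);
const resolvedCard = applyResolvedAssets(card, resolvedUrls);
```

Splitting collection from patching lets all lookups for a card run as one batch before the tree is rebuilt, which is one way to keep the step fast.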

Other serverless functions handle the reverse direction (serializing editor content into AI-readable HTML) and generate theme preview images on the fly.

Taken together, these serverless functions keep presentations loading quickly and AI-powered editing responsive for users worldwide.

Designing for what’s next

As agents across the industry get more capable, the limiting factor shifts from intelligence to information.

"An agent that knows your brand guidelines, your previous presentations, and your company's tone of voice is infinitely more valuable than a generic model," Sherwin says. "Right now, context is what separates a useful agent from a generic chat bot."

He sees context operating at three levels: the immediate session, the user's history across projects, and the organizational layer (brand assets, templates, and knowledge bases). Getting all three into the model's context window, efficiently and at the right moment, is the architectural challenge every company building agents is wrestling with.
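One way to picture that challenge is as a packing problem: items from the three layers compete for a fixed window. The sketch below is an assumption-heavy illustration and does not describe Gamma's approach. The `relevance` score is presumed to come from some retrieval step, the four-characters-per-token estimate is a crude stand-in for a real tokenizer, and `packContext` is a hypothetical name.

```typescript
// Hypothetical sketch: pack three context layers into a fixed token budget.

type Layer = "session" | "userHistory" | "organization";

interface ContextItem {
  layer: Layer;
  text: string;
  relevance: number; // 0..1, assumed to be supplied by a retrieval step
}

// Crude estimate (about 4 characters per token); a real system would use a tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedy packing: take the most relevant items first and skip any that
// would push the total past the budget.
function packContext(items: ContextItem[], budgetTokens: number): ContextItem[] {
  const ranked = items.slice().sort((a, b) => b.relevance - a.relevance);
  const chosen: ContextItem[] = [];
  let used = 0;
  for (const item of ranked) {
    const cost = estimateTokens(item.text);
    if (used + cost > budgetTokens) continue;
    chosen.push(item);
    used += cost;
  }
  return chosen;
}

const candidates: ContextItem[] = [
  { layer: "session", text: "Shorten the intro slide.", relevance: 0.9 },
  { layer: "organization", text: "Brand color is #0055FF; use Inter font.", relevance: 0.8 },
  { layer: "userHistory", text: "Prefers dark themes.", relevance: 0.5 },
];

const packed = packContext(candidates, 12);
```

Greedy selection is the simplest possible policy. Reserving a minimum share of the budget for each layer would be a natural refinement if one layer tends to crowd out the others.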

It's the same vision Gamma has been building toward from day one: making it effortless to turn ideas into polished, compelling communication. First through intelligent layout and design. Then through conversational editing. And now, through a context layer that understands what you're building and why.

What hasn't changed is how Gamma builds: pick the right abstractions, stay model-agnostic, keep enough flexibility to rebuild when the landscape moves, and ship before the window closes.

In a space that reinvents itself every six months, that adaptability is the real moat.
