LangChain Blog·2026年5月4日 22:11·約17分

Open SWE：社内コーディングエージェントのためのオープンソースフレームワーク

#コーディングエージェント #Deep Agents #LangGraph #オープンソース #ソフトウェア開発

TL;DR

LangChain は、Stripe や Ramp などの主要企業が採用する内部コーディングエージェントの共通アーキテクチャを抽出し、Deep Agents と LangGraph を基盤としたオープンソースフレームワーク「Open SWE」を公開した。

AI深層分析2026年7月5日 08:10

重要/ 5段階

深度40%

キーポイント

業界標準のアーキテクチャの可視化

Stripe (Minions)、Ramp (Inspect)、Coinbase (Cloudbot) が独自に開発した内部エージェントが、サンドボックス環境やツールセット、サブエージェントのオーケストレーションにおいて共通のパターンに収束していることを示す。

Open SWE フレームワークの公開

LangChain はこれらの成功事例から得られた知見を統合し、組織が自社の開発ワークフローに合わせてカスタマイズ可能なオープンソースフレームワーク「Open SWE」をリリースした。

既存ワークフローへのシームレスな統合

エンジニアが新しいインターフェースを学習する必要はなく、Slack、Linear、GitHub などの既存ツールを通じてエージェントと連携できる設計思想が採用されている。

影響分析・編集コメントを表示

影響分析

この発表は、AI エージェントの実用化において「個別最適」から「産業標準」への転換点となる重要な出来事です。LangChain が先行企業の知見を集約してオープンソース化したことで、中小企業を含む多くの組織が、高品質な内部コーディングエージェントを低リスク・短時間で導入できる道が開かれました。これにより、ソフトウェア開発プロセスにおける AI 支援の普及速度が加速すると予想されます。

編集コメント

LangChain が自社のフレームワークを軸に、業界の先行事例から得られた知見を体系的にパッケージ化し、オープンソースとして提供した点は非常に戦略的です。開発現場における AI エージェントの実装ハードルを劇的に下げる一石となるニュースです。

image

過去 1 年間、私たちは複数のエンジニアリング組織が、開発チームと共に動作する内部向けコーディングエージェントを構築しているのを観察してきました。Stripe は Minions を開発し、Ramp は Inspect を構築し、Coinbase は Cloudbot を作成しました。これらのシステムは、エンジニアが新しいインターフェースを採用する必要はなく、Slack、Linear、GitHub 経由でアクセス可能な既存のワークフローに統合されています。

これらのシステムは独立して開発されましたが、類似したアーキテクチャパターンに収束しています：隔離されたクラウドサンドボックス（cloud sandbox）、厳選されたツールセット、サブエージェントのオーケストレーション、そして開発者ワークフローとの統合です。この収束は、生産環境におけるエンジニアリング分野で AI エージェントをデプロイする際に共通して求められる要件があることを示唆しています。

本日、私たちはこれらのパターンをカスタマイズ可能な形式で捉えるオープンソースフレームワーク「Open SWE」をリリースします。Deep Agents と LangGraph を基盤として構築された Open SWE は、これらの実装全体で観察される中核的なアーキテクチャコンポーネントを提供します。貴社が社内コーディングエージェントの導入を検討されている場合、これは出発点として活用できます。

本番環境からのパターン

Stripe、Ramp、Coinbase はいずれも独自の社内コーディングエージェントを構築しました。Kishan Dahya は、これらのコーディングエージェントが下した異なるアーキテクチャ上の意思決定について素晴らしい投稿を行っています。以下にそれらを要約し、OpenSWE がこれらの次元においてどのように比較されるかを詳しく掘り下げていきます。

隔離された実行環境: タスクは、厳格な境界内での完全な権限を持つ専用クラウドサンドボックスで実行されます。これにより、ミスの影響範囲（ブラスト半径）が生産システムから隔離されると同時に、エージェントが各アクションごとに承認プロンプトを必要とせずにコマンドを実行できるようになります。

厳選されたツールセット: Stripe のエンジニアリングチームによると、彼らのエージェントは約 500 のツールにアクセスできますが、これらは時間とともに蓄積されたものではなく、慎重に選定・維持管理されています。ツールの数よりも、ツールの選別（キュレーション）の方が重要であることが示唆されます。

Slack-first invocation: 3 つのシステムすべてが Slack を主要なインターフェースとして統合しており、開発者が新しいアプリケーションへのコンテキストスイッチを必要とせず、既存のコミュニケーションワークフローの中で開発者と対話します。

Rich context at startup: これらのエージェントは作業を開始する前に、Linear のイシュー、Slack のスレッド、または GitHub の PR から完全な文脈を取得し、ツール呼び出しを通じて要件を発見するためのオーバーヘッドを削減します。

Subagent orchestration: 複雑なタスクは分解され、それぞれが孤立したコンテキストと焦点を絞った責任を持つ専門的な子エージェントに委任されます。

これらのアーキテクチャ上の選択は、複数の本番環境での展開において効果的であることが証明されていますが、組織は特定のコンポーネントを自社の環境や要件に合わせて適応させる必要があるでしょう。

Open SWE's Architecture

Open SWE は、同様のアーキテクチャパターンに対するオープンソースの実装を提供します。以下に、このフレームワークが私たちが観察した内容とどのように対応するかを示します。

1. Agent Harness: Composed on Deep Agents

既存のエージェントをフォークしたり、ゼロから構築したりするのではなく、Open SWE は Deep Agents フレームワーク上で構成されています。このアプローチは、Ramp のチームが OpenCode 上に Inspect を構築した方法に類似しています。

コンポジションには 2 つの利点があります：

アップグレードパス: Deep Agents が改善された場合（より優れたコンテキスト管理、より効率的なプランニング、最適化されたトークン使用など）、カスタマイズを再構築することなくそれらの改善を取り込むことができます。

フォークなしのカスタマイズ: 組織固有のツール、プロンプト、ワークフローを、コアエージェントロジックへの修正ではなく、設定として維持できます。

create_deep_agent(

model="anthropic:claude-opus-4-6",

system_prompt=construct_system_prompt(repo_dir, ...),

tools=[

http_request,

fetch_url,

commit_and_open_pr,

linear_comment,

slack_thread_reply

backend=sandbox_backend,

middleware=[

ToolErrorMiddleware(),

check_message_queue_before_model,

...

)

Deep Agents は、これらのパターンをサポートするインフラストラクチャを提供します：write_todos による組み込みのプランニング、ファイルベースのコンテキスト管理、タスクツールによるネイティブなサブエージェントの起動、および決定論的なオーケストレーションのためのミドルウェアフックです。

2. サンボックス：隔離されたクラウド環境

各タスクは、独立したクラウドサンボックス（リモート Linux 環境で完全なシェルアクセス権を持つ）内で実行されます。リポジトリがクローンされ、エージェントに完全な権限が付与され、エラーはその環境内に限定されます。

Open SWE は、複数のサンボックスプロバイダーを標準でサポートしています：

Modal
Daytona
Runloop
LangSmith

独自のサンバックエンドを実装することも可能です。

これは私たちが観察してきたパターンに従っています：まず境界を隔離し、その内部に完全な権限を与えます。

主要な動作:

各会話スレッドには永続的なサンドボックスが割り当てられ、フォローアップメッセージ間でも再利用されます
サンドボックスは到達不能になった場合に自動的に再作成されます
複数のタスクが並列で実行され、それぞれが独自のサンドボックス内で動作します

3. ツール：蓄積ではなく厳選されたセット

Open SWE は焦点を絞ったツールセットを搭載して提供されています:

Tool

Purpose

execute

Sandbox 内のシェルコマンド実行

fetch_url

Web ページを Markdown 形式で取得

http_request

API 呼び出し（GET、POST など）

commit_and_open_pr

Git コミットと GitHub ドラフト PR の作成

linear_comment

Linear チケットへの更新投稿

slack_thread_reply

Slack スレッド内での返信

さらに、組み込みの Deep Agents ツールとして、read_file、write_file、edit_file、ls、glob、grep、write_todos、task（サブエージェントの起動）が利用可能です。

小さく厳選されたツールセットは、テストや保守、推論において扱いやすい場合があります。組織に必要な追加ツール（内部 API、カスタムデプロイシステム、専門的なテストフレームワークなど）がある場合は、明示的に追加できます。

4. コンテキストエンジニアリング：AGENTS.md とソースコンテキスト

Open SWE は 2 つのソースからコンテキストを収集します：

AGENTS.md ファイル: リポジトリのルートに AGENTS.md ファイルが存在する場合、サンドボックスから読み込まれ、システムプロンプトに注入されます。このファイルには、すべてのエージェント実行で遵守すべき規約、テスト要件、アーキテクチャ上の決定事項、チーム固有のパターンなどを記述できます。

ソースコンテキスト: Linear の完全なイシュー（タイトル、説明、コメント）または Slack スレッドの履歴が組み合わされ、追加のツール呼び出しなしにタスク固有の文脈を提供するために、エージェント開始前に渡されます。

この 2 レベルのアプローチは、リポジトリ全体の知識とタスク固有の情報とのバランスを取ります。

5. オーケストレーション：サブエージェント + ミドルウェア

Open SWE のオーケストレーションは、以下の 2 つのメカニズムを組み合わせています。

サブエージェント: Deep Agents フレームワークでは、タスクツールを通じて子エージェントを起動する機能をサポートしています。メインエージェントは、独立したサブタスクを孤立したサブエージェントに委譲でき、各サブエージェントには独自のミドルウェアスタック、ToDo リスト、およびファイル操作が用意されます。

ミドルウェア: 決定論的なミドルウェアフックがエージェントループの周囲で実行されます：

check_message_queue_before_model: 次のモデル呼び出しの前に、実行中に到着するフォローアップメッセージ（Linear のコメントや Slack メッセージ）を注入します。これにより、エージェントが作業中であってもユーザーが追加の入力を提供できるようになります。
open_pr_if_needed: エージェントがこのステップを完了しなかった場合にコミットして PR を開くための安全網として機能します。これにより、重要なステップが確実に実行されるように保証されます。
ToolErrorMiddleware: ツールエラーを捕捉し、適切に処理します。

このアジェンシー（モデル駆動型）と決定論的（ミドルウェア駆動型）のオーケストレーションの分離は、信頼性と柔軟性のバランスを取るのに役立ちます。

6. 呼び出し：Slack, Linear, および GitHub

多くのチームが Slack を主要な呼び出しインターフェースとして採用していることを観察しています。Open SWE も同様のパターンに従っています：

Slack: 任意のスレッドでボットをメンションしてください。repo:owner/name 構文をサポートし、作業対象のレポジトリを指定できます。エージェントはスレッド内で応答し、ステータスの更新や PR リンクを提供します。

Linear: 任意のイシューに @openswe とコメントしてください。エージェントはイシューの全文コンテキストを読み取り、👀リアクションで承認を示し、結果をコメントとして投稿します。

GitHub: エージェントが作成した PR のコメントで @openswe をタグ付けすると、レビューフィードバックに対応して修正を同じブランチにプッシュします。

各呼び出しは決定論的なスレッド ID を生成するため、同じイシューやスレッド内のフォローアップメッセージは、常に同じ実行中のエージェントにルーティングされます。

7. バリデーション：プロンプト駆動型 + セーフティネット

エージェントはコミット前にリンター、フォーマッター、およびテストを実行するように指示されます。open_pr_if_needed ミドルウェアはバックストップとして機能し、エージェントがプルリクエスト（PR）を開かずに完了した場合でも、ミドルウェアが自動的に処理します。

このバリデーション層を拡張するには、追加のミドルウェアとして決定論的な CI チェック、視覚的検証、またはレビューゲートを追加できます。

なぜディープエージェントなのか

Deep Agents は、このアーキテクチャを構成可能かつ保守可能にする基盤を提供します。

コンテキスト管理: 長時間実行されるコーディングタスクでは、大量の中間データ（ファイル内容、コマンド出力、検索結果など）が生成されることがあります。Deep Agents はファイルベースのメモリを通じてこれを処理し、大きな結果を会話履歴に保持するのではなくオフロードします。これにより、大規模なコードベースで作業する際のコンテキストオーバーフローを防ぐのに役立ちます。

プランニングプリミティブ: 組み込みの write_todos ツールは、複雑な作業を構造化して分解し、進捗を追跡し、新しい情報が得られた際に計画を適応させるための構造化された方法を提供します。私たちは、この機能が長期にわたる多段階タスクにおいて特に有用であると発見しています。

サブエージェントの分離: メインエージェントが task ツールを通じて子エージェントを起動した場合、そのサブエージェントは独自の隔離されたコンテキストを取得します。異なるサブタスクはお互いの会話履歴を汚染しないため、複雑で多面的な作業に対する推論がより明確になります。

ミドルウェアフック: Deep Agents のミドルウェアシステムを使用すると、エージェントループの特定のポイントに決定論的なロジックを注入できます。これが Open SWE においてメッセージの注入や自動 PR（Pull Request）作成が実現される仕組みであり、これらは確実に実行される必要がある振る舞いです。

アップグレードパス: Deep Agents はスタンドアロンのライブラリとして積極的に開発されているため、コンテキスト圧縮、プロンプトキャッシュ、計画効率、サブエージェントのオーケストレーションに関する改善は、カスタマイズを再構築することなく Open SWE に反映させることができます。

この組み合わせ可能性は、Ramp のチームがOpenCode 上で構築する際に説明したのと同様の利点を提供します。維持され改善される基盤の恩恵を受けつつ、組織固有のレイヤーに対する制御を保持できるのです。

組織向けのカスタマイズ

Open SWE は完成品ではなく、カスタマイズ可能な基盤として意図されています。主要なコンポーネントはすべてプラグイン可能です:

サンドボックスプロバイダー: Modal、Daytona、Runloop、LangSmith の間で切り替えが可能です。内部インフラの要件がある場合は、独自のサンドバックエンドを実装できます。

モデル: 任意の LLM（大規模言語モデル）プロバイダーを使用できます。デフォルトは Claude Opus 4 ですが、異なるサブタスクに対して異なるモデルを構成することも可能です。

ツール: 内部 API、デプロイシステム、テストフレームワーク、監視プラットフォーム用のツールを追加できます。不要なツールは削除してください。

トリガー: Slack、Linear、GitHub の統合ロジックを変更します。メール、Webフック、カスタム UI などの新しいトリガー表面を追加します。

システムプロンプト: ベースとなるプロンプトと AGENTS.md ファイルを組み込むためのロジックをカスタマイズします。組織固有の指示、制約、または規約を追加します。

ミドルウェア: バリデーション、承認ゲート、ログ記録、または安全性チェック用の独自のミドルウェアフックを追加します。

カスタマイズガイドでは、これらの拡張ポイントそれぞれを例とともに解説しています。

内部実装との比較

Stripe、Ramp、Coinbase の内部システムにおける Open SWE の比較は、公開情報に基づいています:

Decision | Open SWE | Stripe (Minions) | Ramp (Inspect) | Coinbase (Cloudbot)

---|---|---|---|---

Harness | Composed (Deep Agents/LangGraph) | Forked (Goose) | Composed (OpenCode) | Built from scratch

サンドボックス

プラグ可能（Modal, Daytona, Runloop など）

AWS EC2 開発用ボックス（事前ウォームアップ済み）

Modal コンテナ（事前ウォームアップ済み）

自社製

ツール

約15 個、厳選されたもの

エージェントごとに約500 個、厳選されたもの

OpenCode SDK + 拡張機能

MCPs + カスタムスキル

コンテキスト

Agents.md + イシュー/スレッド

ルールファイル + プリハイドレーション

OpenCode 組み込み

Linear-fear + MCPs

オーケストレーション

サブエージェント + ミドルウェア

ブループリント（決定論的 + エージェント型）

セッション + 子セッション

3 つのモード

呼び出し

Slack, Linear, GitHub

Slack + 埋め込みボタン

Slack + Web + Chrome拡張機能

Slack ネイティブ

検証

プロンプト駆動 + PR セーフティネット

3 レイヤー（ローカル + CI + リトライ1 回）

ビジュアル DOM 検証

エージェント評議会 + オートマージ

コアとなるパターンは類似しています。違いは実装の詳細、内部統合、および組織固有のツールリングにあります。これは、フレームワークを異なる環境に適応させる際にまさに期待されることです。

はじめに

Open SWE は現在、GitHub で利用可能です。

インストールガイド: GitHub アプリの作成、LangSmith のセットアップ、Linear/Slack/GitHub によるトリガー設定、および本番環境へのデプロイまでを順を追って解説しています。

カスタマイズガイド: お使いの組織に合わせて、サンドボックス（sandbox）、モデル、ツール、トリガー、システムプロンプト、ミドルウェアをどのように差し替えるかを示しています。

本フレームワークは MIT ライセンスの下で公開されています。フォークしてカスタマイズし、社内環境にデプロイすることが可能です。この上に何か面白いものを構築された場合は、ぜひご報告ください。

複数のエンジニアリング組織が、すでに本番環境において内部向けコーディングエージェントの導入を成功させています。Open SWE は、異なるコードベースやワークフローに合わせてカスタマイズ可能に設計された、同様のアーキテクチャパターンを実装したオープンソース版です。さまざまな文脈で何が有効かについてはまだ研究中ですが、このフレームワークはこのようなアプローチを検討しているチームにとっての出発点となります。

Open SWE を試す: github.com/langchain-ai/open-swe

Deep Agents について学ぶ: docs.langchain.com/oss/python/deepagents

LangSmith Sandboxes のウェイトリストに登録する: https://www.langchain.com/langsmith-sandboxes-waitlist

ドキュメントを読む: Open SWE ドキュメンテーション

異なるモデルと連携させるためのディープエージェントのチューニング

image

V. Trivedy,

M. Daugherty

2026 年 4 月 29 日

image

5 分

image.png)

エージェントアーキテクチャ

LangSmith

オープンソース

EU AI 法（EU Artificial Intelligence Act）の要件を満たすための LangSmith と LangChain OSS の活用

image

J. Talbot,

B. Weng

2026 年 4 月 27 日

image

7 分

image

コンセプトガイド

ディープエージェント

本番環境のディープエージェントを支えるランタイム

image

S. Runkle,

V. Trivedy

2026 年 4 月 20 日

image

24 分

image

エージェントが実際に何をしているかを確認する

LangSmith は、開発者がエージェントのすべての意思決定をデバッグし、変更の評価を行い、ワンクリックでデプロイできるためのエージェントエンジニアリングプラットフォームです。

原文を表示

Over the past year, we've observed several engineering organizations building internal coding agents that operate alongside their development teams. Stripe developed Minions, Ramp built Inspect, and Coinbase created Cloudbot. These systems integrate into existing workflows (accessible through Slack, Linear, and GitHub) rather than requiring engineers to adopt new interfaces.

While these systems were developed independently, they've converged on similar architectural patterns: isolated cloud sandboxes, curated toolsets, subagent orchestration, and integration with developer workflows. This convergence suggests some common requirements for deploying AI agents in production engineering environments.

Today, we're releasing Open SWE, an open-source framework that captures these patterns in a customizable form. Built on Deep Agents and LangGraph, Open SWE provides the core architectural components we've observed across these implementations. If your organization is exploring internal coding agents, this can serve as a starting point.

Patterns from Production Deployments

Stripe, Ramp, and Coinbase have all built their own internal coding agents. Kishan Dahya wrote a great post on the different architectural decisions these coding agents made. We summarize them below and then dive into how OpenSWE compares on those dimensions.

Isolated execution environments: Tasks run in dedicated cloud sandboxes with full permissions inside strict boundaries. This isolates the blast radius of any mistake from production systems while allowing agents to execute commands without approval prompts for each action.

Curated toolsets: According to Stripe's engineering team, their agents have access to around 500 tools, but these are carefully selected and maintained rather than accumulated over time. Tool curation appears to matter more than tool quantity.

Slack-first invocation: All three systems integrate with Slack as a primary interface, meeting developers in their existing communication workflows rather than requiring context switches to new applications.

Rich context at startup: These agents pull full context from Linear issues, Slack threads, or GitHub PRs before beginning work, reducing the overhead of discovering requirements through tool calls.

Subagent orchestration: Complex tasks get decomposed and delegated to specialized child agents, each with isolated context and focused responsibilities.

These architectural choices have proven effective across multiple production deployments, though organizations will likely need to adapt specific components to their own environments and requirements.

Open SWE's Architecture

Open SWE provides an open-source implementation of similar architectural patterns. Here's how the framework maps to what we've observed:

1. Agent Harness: Composed on Deep Agents

Rather than forking an existing agent or building from scratch, Open SWE composes on the Deep Agents framework. This approach is similar to how Ramp's team built Inspect on top of OpenCode.

Composition provides two advantages:

Upgrade path: When Deep Agents improves (better context management, more efficient planning, optimized token usage), you can incorporate those improvements without rebuilding your customizations.

Customization without forking: You can maintain org-specific tools, prompts, and workflows as configuration rather than as modifications to core agent logic.

code

create_deep_agent(
    model="anthropic:claude-opus-4-6",
    system_prompt=construct_system_prompt(repo_dir, ...),
    tools=[
        http_request,
        fetch_url,
        commit_and_open_pr,
        linear_comment,
        slack_thread_reply
    ],
    backend=sandbox_backend,
    middleware=[
        ToolErrorMiddleware(),
        check_message_queue_before_model,
        ...
    ],
)

Deep Agents provides infrastructure that can support these patterns: built-in planning via write_todos, file-based context management, native subagent spawning via the task tool, and middleware hooks for deterministic orchestration.

2. Sandbox: Isolated Cloud Environments

Each task runs in its own isolated cloud sandbox, a remote Linux environment with full shell access. The repository is cloned in, the agent receives complete permissions, and any errors are contained within that environment.

Open SWE supports multiple sandbox providers out of the box:

Modal
Daytona
Runloop
LangSmith

You can also implement your own sandbox backend.

This follows a pattern we've observed: isolate first, then grant full permissions inside the boundary.

Key behaviors:

Each conversation thread gets a persistent sandbox, reused across follow-up messages
Sandboxes automatically recreate if they become unreachable
Multiple tasks run in parallel, each in its own sandbox

3. Tools: Curated, Not Accumulated

Open SWE ships with a focused toolset:

Tool

Purpose

execute

Shell commands in the sandbox

fetch_url

Fetch web pages as markdown

http_request

API calls (GET, POST, etc.)

commit_and_open_pr

Git commit and open a GitHub draft PR

linear_comment

Post updates to Linear tickets

slack_thread_reply

Reply in Slack threads

Plus the built-in Deep Agents tools: read_file, write_file, edit_file, ls, glob, grep, write_todos, and task (subagent spawning).

A smaller, curated toolset can be easier to test, maintain, and reason about. When you need additional tools for your organization (internal APIs, custom deployment systems, specialized testing frameworks), you can add them explicitly.

4. Context Engineering: AGENTS.md + Source Context

Open SWE gathers context from two sources:

AGENTS.md file: If your repository contains an AGENTS.md file at the root, it's read from the sandbox and injected into the system prompt. This file can encode conventions, testing requirements, architectural decisions, and team-specific patterns that every agent run should follow.

Source context: The full Linear issue (title, description, comments) or Slack thread history is assembled and passed to the agent before it starts, providing task-specific context without additional tool calls.

This two-layer approach balances repository-wide knowledge with task-specific information.

5. Orchestration: Subagents + Middleware

Open SWE's orchestration combines two mechanisms:

Subagents: The Deep Agents framework supports spawning child agents via the task tool. The main agent can delegate independent subtasks to isolated subagents, each with its own middleware stack, todo list, and file operations.

Middleware: Deterministic middleware hooks run around the agent loop:

check_message_queue_before_model: Injects follow-up messages (Linear comments or Slack messages that arrive mid-run) before the next model call. This allows users to provide additional input while the agent is working.
open_pr_if_needed: Acts as a safety net that commits and opens a PR if the agent didn't complete this step. This ensures critical steps happen reliably.
ToolErrorMiddleware: Catches and handles tool errors gracefully.

This separation between agentic (model-driven) and deterministic (middleware-driven) orchestration can help balance reliability with flexibility.

6. Invocation: Slack, Linear, and GitHub

We've observed that many teams converge on Slack as a primary invocation surface. Open SWE follows a similar pattern:

Slack: Mention the bot in any thread. Supports

code

repo:owner/name

syntax to specify which repository to work on. The agent replies in-thread with status updates and PR links.

Linear: Comment @openswe on any issue. The agent reads the full issue context, reacts with 👀 to acknowledge, and posts results back as comments.

GitHub: Tag @openswe in PR comments on agent-created PRs to have it address review feedback and push fixes to the same branch.

Each invocation creates a deterministic thread ID, so follow-up messages on the same issue or thread route to the same running agent.

7. Validation: Prompt-Driven + Safety Nets

The agent is instructed to run linters, formatters, and tests before committing. The open_pr_if_needed middleware acts as a backstop—if the agent finishes without opening a PR, the middleware handles it automatically.

You can extend this validation layer by adding deterministic CI checks, visual verification, or review gates as additional middleware.

Why Deep Agents

Deep Agents provides the foundation that makes this architecture composable and maintainable.

Context management: Long-running coding tasks can produce large amounts of intermediate data (file contents, command outputs, search results). Deep Agents handles this through file-based memory, offloading large results instead of keeping everything in the conversation history. This can help prevent context overflow when working on larger codebases.

Planning primitives: The built-in write_todos tool provides a structured way to break down complex work, track progress, and adapt plans as new information emerges. We've found this particularly helpful for multi-step tasks that span extended periods.

Subagent isolation: When the main agent spawns a child agent via the task tool, that subagent gets its own isolated context. Different subtasks don't pollute each other's conversation history, which can lead to clearer reasoning on complex, multi-faceted work.

Middleware hooks: Deep Agents' middleware system allows you to inject deterministic logic at specific points in the agent loop. This is how Open SWE implements message injection and automatic PR creation—behaviors that need to happen reliably.

Upgrade path: Because Deep Agents is actively developed as a standalone library, improvements to context compression, prompt caching, planning efficiency, and subagent orchestration can flow to Open SWE without requiring you to rebuild your customizations.

This composability offers similar advantages to what Ramp's team described when building on OpenCode: you get the benefits of a maintained, improving foundation while retaining control over your org-specific layer.

Customization for Your Organization

Open SWE is intended as a customizable foundation rather than a finished product. Every major component is pluggable:

Sandbox provider: Swap between Modal, Daytona, Runloop, or LangSmith. Implement your own sandbox backend if you have internal infrastructure requirements.

Model: Use any LLM provider. The default is Claude Opus 4, but you can configure different models for different subtasks.

Tools: Add tools for your internal APIs, deployment systems, testing frameworks, or monitoring platforms. Remove tools you don't need.

Triggers: Modify the Slack, Linear, and GitHub integration logic. Add new trigger surfaces like email, webhooks, or custom UIs.

System prompt: Customize the base prompt and the logic for incorporating AGENTS.md files. Add org-specific instructions, constraints, or conventions.

Middleware: Add your own middleware hooks for validation, approval gates, logging, or safety checks.

The Customization Guide walks through each of these extension points with examples.

Comparison to Internal Implementations

Here's how Open SWE compares to the internal systems at Stripe, Ramp, and Coinbase based on publicly available information:

Decision

Open SWE

Stripe (Minions)

Ramp (Inspect)

Coinbase (Cloudbot)

Harness

Composed (Deep Agents/LangGraph)

Forked (Goose)

Composed (OpenCode)

Built from scratch

Sandbox

Pluggable (Modal, Daytona, Runloop, etc.)

AWS EC2 devboxes (pre-warmed)

Modal containers (pre-warmed)

In-house

Tools

~15, curated

~500, curated per-agent

OpenCode SDK + extensions

MCPs + custom Skills

Context

Agents.md + issue/thread

Rule files + pre-hydration

OpenCode built-in

Linear-fear + MCPs

Orchestration

Subagents + middleware

Blueprints (deterministic + agentic)

Sessions + child sessions

Three modes

Invocation

Slack, Linear, GitHub

Slack + embedded buttons

Slack + web + Chrome extension

Slack-native

Validation

Prompt-driven + PR safety net

3-layer (local + CI + 1 retry)

Visual DOM verification

Agent councils + auto-merge

The core patterns are similar. The differences lie in implementation details, internal integrations, and org-specific tooling—which is exactly what you'd expect when adapting a framework to different environments.

Getting Started

Open SWE is available now on GitHub.

Installation Guide: Walks through GitHub App creation, LangSmith setup, Linear/Slack/GitHub triggers, and production deployment.

Customization Guide: Shows how to swap the sandbox, model, tools, triggers, system prompt, and middleware for your organization.

The framework is MIT-licensed. You can fork it, customize it, and deploy it internally. If you build something interesting on top of it, we'd be interested to hear about it.

Several engineering organizations have successfully deployed internal coding agents in production. Open SWE provides an open-source implementation of similar architectural patterns, designed to be customized for different codebases and workflows. While we're still learning what works across different contexts, this framework offers a starting point for teams exploring this approach.

Try Open SWE: github.com/langchain-ai/open-swe

Learn about Deep Agents: docs.langchain.com/oss/python/deepagents

Sign up for the LangSmith Sandboxes Waitlist: https://www.langchain.com/langsmith-sandboxes-waitlist

Read the docs: Open SWE Documentation

Tuning Deep Agents to Work Well with Different Models

V. Trivedy,

M. Daugherty

April 29, 2026

min

.png)

Agent Architecture

LangSmith

Open Source

How LangSmith and LangChain OSS Help You Meet EU AI Act Requirements

J. Talbot,

B. Weng

April 27, 2026

min

Conceptual Guide

Deep Agents

The runtime behind production deep agents

S. Runkle,

V. Trivedy

April 20, 2026

min

See what your agent is really doing

LangSmith, our agent engineering platform, helps developers debug every agent decision, eval changes, and deploy in one click.

この記事をシェア

Simon Willison Blog重要度42026年7月3日 02:07

参加するには理解せよ：コーディングエージェントとの協働における認知負荷の課題

MarkTechPost重要度42026年7月2日 17:46

Google Health API に CLI ツール「ghealth」登場：Fitbit データを AI エージェントへ

Smol AI News重要度42026年7月2日 14:44

今日は何も大きな出来事はありませんでした

今日のまとめ

AI日報で今日の重要ニュースをまとめ読み

ニュース一覧に戻る元記事を読む

create_deep_agent( model="anthropic:claude-opus-4-6", system_prompt=construct_system_prompt(repo_dir, ...), tools=[ http_request, fetch_url, commit_and_open_pr, linear_comment, slack_thread_reply ], backend=sandbox_backend, middleware=[ ToolErrorMiddleware(), check_message_queue_before_model, ... ], )

キーポイント

影響分析

編集コメント

本番環境からのパターン

Open SWE's Architecture

1. Agent Harness: Composed on Deep Agents

2. サンボックス：隔離されたクラウド環境

3. ツール：蓄積ではなく厳選されたセット

4. コンテキストエンジニアリング：AGENTS.md とソースコンテキスト

5. オーケストレーション：サブエージェント + ミドルウェア

6. 呼び出し：Slack, Linear, および GitHub

7. バリデーション：プロンプト駆動型 + セーフティネット

なぜディープエージェントなのか

組織向けのカスタマイズ

内部実装との比較

はじめに

関連コンテンツ

異なるモデルと連携させるためのディープエージェントのチューニング

EU AI 法（EU Artificial Intelligence Act）の要件を満たすための LangSmith と LangChain OSS の活用

本番環境のディープエージェントを支えるランタイム

エージェントが実際に何をしているかを確認する

Patterns from Production Deployments

Open SWE's Architecture

1. Agent Harness: Composed on Deep Agents

2. Sandbox: Isolated Cloud Environments

3. Tools: Curated, Not Accumulated

4. Context Engineering: AGENTS.md + Source Context

5. Orchestration: Subagents + Middleware

6. Invocation: Slack, Linear, and GitHub

7. Validation: Prompt-Driven + Safety Nets

Why Deep Agents

Customization for Your Organization

Comparison to Internal Implementations

Getting Started

Related content

Tuning Deep Agents to Work Well with Different Models

How LangSmith and LangChain OSS Help You Meet EU AI Act Requirements

The runtime behind production deep agents

See what your agent is really doing

関連記事

キーポイント

影響分析

編集コメント

本番環境からのパターン

Open SWE's Architecture

1. Agent Harness: Composed on Deep Agents

2. サンボックス：隔離されたクラウド環境

3. ツール：蓄積ではなく厳選されたセット

4. コンテキストエンジニアリング：AGENTS.md とソースコンテキスト

5. オーケストレーション：サブエージェント + ミドルウェア

6. 呼び出し：Slack, Linear, および GitHub

7. バリデーション：プロンプト駆動型 + セーフティネット

なぜディープエージェントなのか

組織向けのカスタマイズ

内部実装との比較

はじめに

関連コンテンツ

異なるモデルと連携させるためのディープエージェントのチューニング

EU AI 法（EU Artificial Intelligence Act）の要件を満たすための LangSmith と LangChain OSS の活用

本番環境のディープエージェントを支えるランタイム

エージェントが実際に何をしているかを確認する

Patterns from Production Deployments

Open SWE's Architecture

1. Agent Harness: Composed on Deep Agents

2. Sandbox: Isolated Cloud Environments

3. Tools: Curated, Not Accumulated

4. Context Engineering: AGENTS.md + Source Context

5. Orchestration: Subagents + Middleware

6. Invocation: Slack, Linear, and GitHub

7. Validation: Prompt-Driven + Safety Nets

Why Deep Agents

Customization for Your Organization

Comparison to Internal Implementations

Getting Started

Related content

Tuning Deep Agents to Work Well with Different Models

How LangSmith and LangChain OSS Help You Meet EU AI Act Requirements

The runtime behind production deep agents

See what your agent is really doing

関連記事