AWS Machine Learning Blog · April 9, 2026, 23:47 · About a 7-minute read

Introducing stateful MCP client capabilities on Amazon Bedrock AgentCore Runtime

#MCP #AIAgents #StatefulArchitecture #AWS Bedrock #LLMIntegration #ConversationalAI

TL;DR

AWS has introduced stateful MCP client capabilities on Amazon Bedrock AgentCore Runtime, enabling interactive, multi-turn AI agent workflows.

AI deep analysis · April 10, 2026, 00:42


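Key points

Stateful MCP becomes possible

Stateful MCP client capabilities enable interactive, multi-turn agent workflows that were not possible with the earlier stateless implementation.

Three main capabilities

The release introduces three capabilities: elicitation (requesting user input mid-execution), sampling (requesting LLM-generated content from the client), and progress notifications (streaming real-time updates).

Bidirectional communication

One-way tool execution becomes a bidirectional conversation between the MCP server and its clients, enabling richer agent interactions.

Technical implementation

Setting stateless_http=False provisions a dedicated microVM per session, and a session can last up to 8 hours.

Interactive expense-entry tool

Using FastMCP, the example implements an interactive tool that collects an expense from the user in four steps: amount, description, category, and confirmation.

Using stateful sessions

Each await ctx.elicit() call suspends the tool and sends an elicitation/create request over the active session, which is how the dialogue keeps its state.

Client-side response handling

Registering an elicitation_handler on fastmcp.Client both wires in the handler and advertises elicitation support to the server.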


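Impact analysis

This advance makes more complex, interactive AI agents practical to build and strengthens the competitiveness of AWS's AI platform. It could substantially lower the barrier to building practical AI agents, particularly for long-running tasks and use cases that require user interaction.

Editorial comment

An important update in which AWS deepens its implementation of the MCP specification and makes AI agent development more practical. The introduction of stateful capabilities lays the groundwork for supporting more complex workflows.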


Stateful MCP client capabilities on Amazon Bedrock AgentCore Runtime now enable interactive, multi-turn agent workflows that were previously impossible with stateless implementations. Developers building AI agents often struggle when their workflows must pause mid-execution to ask users for clarification, request large language model (LLM)-generated content, or provide real-time progress updates during long-running operations. Stateless MCP servers can't handle these scenarios. Stateful mode addresses these limitations by introducing three client capabilities from the MCP specification:

Elicitation (request user input mid-execution)

Sampling (request LLM-generated content from the client)

Progress notification (stream real-time updates)

These capabilities transform one-way tool execution into bidirectional conversations between your MCP server and clients.

Model Context Protocol (MCP) is an open standard defining how LLM applications connect with external tools and data sources. The specification defines server capabilities (tools, prompts, and resources that servers expose) and client capabilities (features clients offer back to servers). While our previous release focused on hosting stateless MCP servers on AgentCore Runtime, this new capability completes the bidirectional protocol implementation. Clients connecting to AgentCore-hosted MCP servers can now respond to server-initiated requests. In this post, you will learn how to build stateful MCP servers that request user input during execution, invoke LLM sampling for dynamic content generation, and stream progress updates for long-running tasks. You will see code examples for each capability and deploy a working stateful MCP server to Amazon Bedrock AgentCore Runtime.

From stateless to stateful MCP

The original MCP server support on AgentCore used stateless mode: each incoming HTTP request was independent, with no shared context between calls. This model is straightforward to deploy and reason about, and it works well for tool servers that receive inputs and return outputs. However, it has a fundamental constraint. The server can’t maintain a conversation thread across requests, ask the user for clarification in the middle of a tool call, or report progress back to the client as work happens.

Stateful mode removes that constraint. When you run your MCP server with stateless_http=False, AgentCore Runtime provisions a dedicated microVM for each user session. The microVM persists for the session’s lifetime (up to 8 hours, or 15 minutes of inactivity per idleRuntimeSessionTimeout setting), with CPU, memory, and filesystem isolation between sessions. The protocol maintains continuity through a Mcp-Session-Id header: the server returns this identifier during the initialize handshake, and the client includes it in every subsequent request to route back to the same session.

The following table summarizes the key differences:

| | Stateless mode | Stateful mode |
| --- | --- | --- |
| stateless_http setting | True | False |
| Session isolation | | Dedicated microVM per session |
| Session lifetime | | Up to 8 hours; 15-minute idle timeout |
| Client capabilities | Not supported | Elicitation, sampling, progress notifications |
| Recommended for | Simple tool serving | Interactive, multi-turn workflows |

When a session expires or the server is restarted, subsequent requests with the earlier session ID return a 404. At that point, clients must re-initialize the connection to obtain a new session ID and start a fresh session.

The configuration change to enable stateful mode is a single flag in your server startup:

code

mcp.run(
    transport="streamable-http",
    host="0.0.0.0",
    port=8000,
    stateless_http=False  # Enable stateful mode
)

Beyond this flag, the three client capabilities become available automatically once the MCP client declares support for them during the initialization handshake.
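The session lifecycle described in this section can be modeled in a few lines. The following is a minimal sketch, not AgentCore code: `SessionStore` and `call_with_reinit` are hypothetical names, the store is an in-memory stand-in for the runtime, and only the 8-hour lifetime, the 15-minute idle timeout, and the 404 on an expired session ID come from the post.

```python
import uuid

MAX_LIFETIME_S = 8 * 60 * 60   # maximum session lifetime from the post (8 hours)
IDLE_TIMEOUT_S = 15 * 60       # idle timeout from the post (15 minutes)


class SessionStore:
    """Hypothetical in-memory model of Mcp-Session-Id routing (not AgentCore code)."""

    def __init__(self):
        self._sessions = {}  # session_id -> (created_at, last_seen)

    def initialize(self, now: float) -> str:
        """Model the initialize handshake: issue a fresh session ID."""
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = (now, now)
        return session_id

    def handle(self, session_id: str, now: float) -> int:
        """Return an HTTP-style status code for a request carrying session_id."""
        entry = self._sessions.get(session_id)
        if entry is None:
            return 404
        created_at, last_seen = entry
        if now - created_at > MAX_LIFETIME_S or now - last_seen > IDLE_TIMEOUT_S:
            del self._sessions[session_id]  # expired: forget the session
            return 404
        self._sessions[session_id] = (created_at, now)  # refresh idle timer
        return 200


def call_with_reinit(store: SessionStore, session_id: str, now: float):
    """Client-side pattern: on 404, re-initialize and retry once with the new ID."""
    status = store.handle(session_id, now)
    if status == 404:
        session_id = store.initialize(now)
        status = store.handle(session_id, now)
    return session_id, status
```

A real client would carry the identifier in the Mcp-Session-Id header on every request; the retry helper only illustrates the re-initialization step that a 404 calls for.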

The three new client capabilities

Stateful mode brings three client capabilities from the MCP specification. Each addresses a different interaction pattern that agents encounter in production workflows.

Elicitation allows a server to pause execution and request structured input from the user through the client. The tool can ask targeted questions at the right moment in its workflow, gathering a preference, confirming a decision, or collecting a value that depends on earlier results. The server sends an elicitation/create request with a message and an optional JSON schema describing the expected response structure. The client renders an appropriate input interface, and the user can accept (providing the data), decline, or cancel.

Sampling allows a server to request an LLM-generated completion from the client through sampling/createMessage. This is the mechanism that makes it possible for tool logic on the server to use language model capabilities without holding its own model credentials. The server provides a prompt and optional model preferences; the client forwards the request to its connected LLM and returns the generated response. Practical uses include generating personalized summaries, creating natural-language explanations of structured data, or producing recommendations based on earlier conversation context.

Progress notifications allow a server to report incremental progress during long-running operations. Using ctx.report_progress(progress, total), the server emits updates that clients can display as a progress bar or status indicator. For operations that span multiple steps, such as searching across data sources, this keeps users informed rather than leaving them watching a blank screen.
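As a rough illustration of the progress pattern, the sketch below reports one update per completed step. To stay runnable without a server, it uses a hypothetical `RecordingContext` stand-in that mimics only the `report_progress(progress, total)` call named above; the `search_sources` function and the source names are likewise made up for the example.

```python
import asyncio


class RecordingContext:
    """Stand-in for the FastMCP Context; records progress instead of sending it."""

    def __init__(self):
        self.updates = []

    async def report_progress(self, progress: float, total: float) -> None:
        self.updates.append((progress, total))


async def search_sources(sources: list[str], ctx) -> list[str]:
    """Search each source in turn, reporting progress after every step."""
    results = []
    total = len(sources)
    for done, source in enumerate(sources, start=1):
        await asyncio.sleep(0)  # placeholder for the real per-source work
        results.append(f"results from {source}")
        await ctx.report_progress(done, total)
    return results


ctx = RecordingContext()
found = asyncio.run(search_sources(["flights", "hotels", "cars"], ctx))
print(ctx.updates)  # [(1, 3), (2, 3), (3, 3)]
```

In a real tool, the same loop would sit inside an `@mcp.tool()` function and the injected `ctx` would forward each update to the client.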

All three capabilities are opt-in at the client level: a client declares which capabilities it supports during initialization, and the server must only use capabilities the client has advertised.
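One way to respect that rule is to check the client's declared capabilities before choosing an interaction pattern. This is a minimal sketch under stated assumptions: the capabilities object is modeled as a plain dict keyed by capability name, and `client_supports` and `choose_confirmation_strategy` are hypothetical helpers, not part of FastMCP or AgentCore.

```python
def client_supports(client_capabilities: dict, name: str) -> bool:
    """True if the client advertised the named capability during initialization."""
    return name in client_capabilities


# Hypothetical capabilities object taken from a client's initialize request.
declared = {"elicitation": {}, "sampling": {}}


def choose_confirmation_strategy(client_capabilities: dict) -> str:
    """Pick how a tool should confirm an action, given what the client supports."""
    if client_supports(client_capabilities, "elicitation"):
        return "ask-user"          # safe to send elicitation/create
    return "require-parameter"     # fall back to an explicit tool argument
```

The fallback branch matters in practice: a tool that assumes elicitation is available will fail against clients that never registered a handler.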

Elicitation: server-initiated user input

Elicitation is the mechanism by which an MCP server pauses mid-execution and asks the client to collect specific information from the user. The server sends an elicitation/create JSON-RPC request containing a human-readable message and a requestedSchema that describes the expected response. The client presents this as a form or prompt, and the user’s response (or explicit decline) is returned to the server so execution can continue.

The MCP specification supports two elicitation modes:

Form mode: structured data collection directly through the MCP client. Suitable for preferences, configuration inputs, and confirmations that don’t involve sensitive data.

URL mode: directs the user to an external URL for interactions that must not pass through the MCP client, such as OAuth flows, payment processing, or credential entry.
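For form mode, the request on the wire might look like the output of the sketch below. It is an illustration only: `build_elicitation_request` is a hypothetical helper, the request ID is arbitrary, and the schema is hand-written to show a flat object with one primitive property rather than generated by any library.

```python
import json


def build_elicitation_request(request_id: int, message: str, schema: dict) -> dict:
    """Assemble an elicitation/create JSON-RPC request for form mode."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "elicitation/create",
        "params": {"message": message, "requestedSchema": schema},
    }


# Flat object schema with a single primitive property.
amount_schema = {
    "type": "object",
    "properties": {"amount": {"type": "number"}},
    "required": ["amount"],
}

request = build_elicitation_request(1, "How much did you spend?", amount_schema)
print(json.dumps(request, indent=2))
```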

The response uses a three-action model: accept (user provided data), decline (user explicitly rejected the request), or cancel (user dismissed without choosing). Servers should handle each case appropriately. The following example implements an add_expense_interactive tool that collects a new expense through four sequential elicitation steps: amount, description, category, and a final confirmation before writing to DynamoDB. Each step defines its expected input as a Pydantic model, which FastMCP converts to the JSON Schema sent in the elicitation/create request.

Server

The add_expense_interactive tool walks a user through four sequential questions before writing to Amazon DynamoDB. Each step defines its expected input as a separate Pydantic model, because the form mode schema must be a flat object. You could collect all four fields in a single model with four properties, but splitting them here gives the user one focused question at a time, which is the interactive pattern elicitation is designed for.

**agents/mcp_client_features.py**

code

import os
from pydantic import BaseModel
from fastmcp import FastMCP, Context
from fastmcp.server.elicitation import AcceptedElicitation
from dynamo_utils import FinanceDB

mcp = FastMCP(name='ElicitationMCP')

_region = os.environ.get('AWS_REGION') or os.environ.get('AWS_DEFAULT_REGION') or 'us-east-1'
db = FinanceDB(region_name=_region)

class AmountInput(BaseModel):
    amount: float

class DescriptionInput(BaseModel):
    description: str

class CategoryInput(BaseModel):
    category: str  # one of: food, transport, bills, entertainment, other

class ConfirmInput(BaseModel):
    confirm: str  # Yes or No

@mcp.tool()
async def add_expense_interactive(user_alias: str, ctx: Context) -> str:
    """Interactively add a new expense using elicitation.

    Args:
        user_alias: User identifier
    """
    # Step 1: Ask for the amount
    result = await ctx.elicit('How much did you spend?', AmountInput)
    if not isinstance(result, AcceptedElicitation):
        return 'Expense entry cancelled.'
    amount = result.data.amount

    # Step 2: Ask for a description
    result = await ctx.elicit('What was it for?', DescriptionInput)
    if not isinstance(result, AcceptedElicitation):
        return 'Expense entry cancelled.'
    description = result.data.description

    # Step 3: Select a category
    result = await ctx.elicit(
        'Select a category (food, transport, bills, entertainment, other):',
        CategoryInput
    )
    if not isinstance(result, AcceptedElicitation):
        return 'Expense entry cancelled.'
    category = result.data.category

    # Step 4: Confirm before saving
    confirm_msg = (
        f'Confirm: add expense of ${amount:.2f} for {description}'
        f' (category: {category})? Reply Yes or No'
    )
    result = await ctx.elicit(confirm_msg, ConfirmInput)
    if not isinstance(result, AcceptedElicitation) or result.data.confirm != 'Yes':
        return 'Expense entry cancelled.'

    return db.add_transaction(user_alias, 'expense', -abs(amount), description, category)

if __name__ == '__main__':
    mcp.run(
        transport="streamable-http",
        host="0.0.0.0",
        port=8000,
        stateless_http=False
    )

Each await ctx.elicit() suspends the tool and sends an elicitation/create request over the active session. The isinstance(result, AcceptedElicitation) check handles decline and cancel uniformly at every step.
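If a tool needs to treat decline and cancel differently, it can branch on the action instead. The sketch below works on the wire-level result, assuming it is a dict with an `action` field and, on accept, a `content` field; `interpret_elicitation_result` and the outcome labels it returns are hypothetical.

```python
from typing import Optional, Tuple


def interpret_elicitation_result(result: dict) -> Tuple[str, Optional[dict]]:
    """Map an elicitation result onto (outcome, data) for the calling tool."""
    action = result.get("action")
    if action == "accept":
        return "continue", result.get("content", {})
    if action == "decline":
        return "skip", None      # user explicitly rejected the request
    if action == "cancel":
        return "abort", None     # user dismissed the prompt without choosing
    raise ValueError(f"unexpected elicitation action: {action!r}")
```

A tool could, for example, skip an optional question on decline but stop the whole workflow on cancel.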

Client

Registering an elicitation_handler on fastmcp.Client is both how the handler is wired in and how the client advertises elicitation support to the server during initialization.

code

import asyncio
from fastmcp import Client
from fastmcp.client.transports import StreamableHttpTransport

# Pre-loaded responses simulate the user answering each question in sequence
_responses = iter([
    {'amount': 45.50},
    {'description': 'Lunch at the office'},
    {'category': 'food'},
    {'confirm': 'Yes'},
])

async def elicit_handler(message, response_type, params, context):
    # In production: render a form and return the user's input
    response = next(_responses)
    print(f'  Server asks: {message}')
    print(f'  Responding:  {response}\n')
    return response

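async def main():
    # mcp_url and headers are not defined in this excerpt; they are expected to hold
    # the deployed server's endpoint URL and the request headers it requires.
    transport = StreamableHttpTransport(url=mcp_url, headers=headers)

    async with Client(transport, elicitation_handler=elicit_handler) as client:
        await asyncio.sleep(2)  # allow session initialization
        result = await client.call_tool('add_expense_interactive', {'user_alias': 'me'})

    print(result.content[0].text)

asyncio.run(main())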

Running this against the deployed server:

code

Server asks: How much did you spend?
Responding:  {'amount': 45.5}

Server asks: What was it for?
Responding:  {'description': 'Lunch at the office'}

Server asks: Select a category (food, transport, bills, entertainment, other):
Responding:  {'category': 'food'}

Server asks: Confirm: add expense of $45.50 for Lunch at the office (category: food)? Reply Yes or No
Responding:  {'confirm': 'Yes'}

Expense of $45.50 added for me

The complete working example, including DynamoDB setup and AgentCore deployment, is available in the GitHub sample repository.

Use elicitation when your tool needs information that depends on earlier results, is better collected interactively than upfront, or varies across users in ways that cannot be parameterized in advance. A travel booking tool that first searches destinations and then asks the user to choose among them is a natural fit. A financial workflow that confirms a transaction amount before submitting is another. Elicitation isn’t appropriate for sensitive inputs like passwords or API keys; use URL mode or a secure out-of-band channel for those.

Sampling: server-initiated LLM generation

Sampling is the mechanism by which an MCP server requests an LLM completion from the client. The server sends a sampling/createMessage request containing a list of conversation messages, a system prompt, and optional model preferences. The client forwards the request to its connected language model (subject to user approval) and returns the generated response. The server receives a structured result containing the generated text, the model used, and the stop reason.

This capability inverts the typical flow: instead of the client asking the server for tool results, the server asks the client for model output. The benefit is that the server doesn’t need API keys or a direct model integration. The client retains full control over which model is used, and the MCP specification calls for a human-in-the-loop step where users can review and approve sampling requests before they are forwarded.

Servers can express model preferences using capability priorities (costPriority, speedPriority, intelligencePriority) and optional model hints. These are advisory; the client makes the final selection based on what models it has access to.
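How a client weighs those preferences is up to the client. The following sketch shows one possible policy, assuming hints are matched as substrings of model names and the priorities are used as weights. The model catalogue, its scores, and `select_model` are all hypothetical.

```python
# Hypothetical catalogue: per-model scores in [0, 1] for each priority dimension.
AVAILABLE_MODELS = {
    "small-fast": {"cost": 0.9, "speed": 0.9, "intelligence": 0.4},
    "large-smart": {"cost": 0.2, "speed": 0.4, "intelligence": 0.95},
}


def select_model(preferences: dict, available: dict = AVAILABLE_MODELS) -> str:
    """Pick a model: honor the first matching hint, else maximize weighted score."""
    for hint in preferences.get("hints", []):
        for name in available:
            # Treat a non-empty hint name as a substring of the model name.
            if hint.get("name", "") and hint["name"] in name:
                return name
    weights = {
        "cost": preferences.get("costPriority", 0.0),
        "speed": preferences.get("speedPriority", 0.0),
        "intelligence": preferences.get("intelligencePriority", 0.0),
    }
    return max(
        available,
        key=lambda name: sum(weights[k] * available[name][k] for k in weights),
    )
```

Because the preferences are advisory, a client is free to ignore them entirely, for instance when only one model is available.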

Server

The analyze_spending tool fetches transactions from DynamoDB, builds a prompt from the structured data, and delegates the analysis to the client’s LLM via ctx.sample().

agents/mcp_client_features.py (added tool, same file as elicitation)

code

@mcp.tool()
async def analyze_spending(user_alias: str, ctx: Context) -> str:
    """Fetch expenses from DynamoDB and ask the client's LLM to analyse them.

    Args:
        user_alias: User identifier
    """
    transactions = db.get_transactions(user_alias)
    if not transactions:
        return f'No transactions found for {user_alias}.'

    lines = '\n'.join(
        f"- {t['description']} (${abs(float(t['amount'])):.2f}, {t['category']})"
        for t in transactions
    )

    prompt = (
        f'Here are the recent expenses for a user:\n{lines}\n\n'
        f'Please analyse the spending patterns and give 3 concise, '
        f'actionable recommendations to improve their finances. '
        f'Keep the response under 120 words.'
    )

    ai_analysis = 'Analysis unavailable.'
    try:
        response = await ctx.sample(messages=prompt, max_tokens=300)
        if hasattr(response, 'text') and response.text:
            ai_analysis = response.text
    except Exception:
        pass

    return f'Spending Analysis for {user_alias}:\n\n{ai_analysis}'

The tool calls await ctx.sample() and suspends. The server sends a sampling/createMessage request to the client over the open session. When the client returns the LLM response, execution resumes.

Client

The sampling_handler receives the prompt from the server and forwards it to a language model. In this example, that’s Claude Haiku on Amazon Bedrock. Registering the handler is also how the client declares sampling support to the server during initialization.

code

import json
import asyncio
import boto3
from mcp.types import CreateMessageResult, TextContent
from fastmcp import Client
from fastmcp.client.transports import StreamableHttpTransport

MODEL_ID = 'us.anthropic.claude-haiku-4-5-20251001-v1:0'
bedrock = boto3.client('bedrock-runtime', region_name=region)

def _invoke_bedrock(prompt: str, max_tokens: int) -> str:
    body = json.dumps({
        'anthropic_version': 'bedrock-2023-05-31',
        'max_tokens': max_tokens,
        'messages': [{'role': 'user', 'content': prompt}]
    })
    resp = bedrock.invoke_model(modelId=MODEL_ID, body=body)
    return json.loads(resp['body'].read())['content'][0]['text']

async def sampling_handler(messages, params, ctx):
    """Called by fastmcp.Client when the server issues ctx.sample()."""
    prompt = messages if isinstance(messages, str) else ' '.join(
        m.content.text for m in messages if hasattr(m.content, 'text')
    )
    max_tokens = params.maxTokens if params and hasattr(params, 'maxTokens') and params.maxTokens else 300
    text = await asyncio.to_thread(_invoke_bedrock, prompt, max_tokens)
    return CreateMessageResult(
        role='assistant',
        content=TextContent(type='text', text=text),
        model=MODEL_ID,
        stopReason='endTurn'
    )

transport = StreamableHttpTransport(url=mcp_url, headers=headers)

async with Client(transport, sampling_handler=sampling_handler) as client:
    result = await cli
