MarkTechPost · July 5, 2026 16:50 · approx. 12 min read

LlamaIndex releases "legal-kb": an introduction to agentic retrieval built on Index v2

#RAG #Agentic AI #LlamaIndex #LLM Orchestration #Knowledge Base

TL;DR

legal-kb, released by LlamaIndex, is a practical reference application that demonstrates an agentic retrieval pattern built on tools resembling filesystem operations.

AI in-depth analysis · July 5, 2026 17:03

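Key points

A new pattern for agentic retrieval

Rather than relying on a single embedding search, legal-kb implements a "Retrieval Harness" in which the agent explores the knowledge base with tools modeled on filesystem operations (listing, reading, grep).

Four standardized agent tools

Four tools, retrieve (hybrid search), findFiles (file-name search), readFile (content reading), and grepFile (pattern matching), map to the Index v2 API and enable a structured retrieval flow.

Version control and a persistent data pipeline

Built on LlamaCloud Index v2, files are versioned (v1, v2, ...) per (project, filename) pair, and a persistent data pipeline indexes them automatically on upload.

Released as a practical reference application

It is published on GitHub as a TanStack Start web app rather than a library, and offers an end-to-end workflow from sign-in through file upload to chatting with the agent.

Vercel AI SDK and LlamaCloud integration

It uses ToolLoopAgent, lets users choose OpenAI or Anthropic models, and streams reasoning (extended thinking for Claude models, medium reasoning effort for OpenAI reasoning models).

Advanced search and metadata filters

In addition to semantic search, the retrieve tool accepts a file-name custom filter and reranking as parameters.

Visual citations and visible sources

Each retrieved chunk gets a unique ID that the UI shows as a clickable chip, linking to the matching location on the source page with a bounding box.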


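Impact analysis

This release is a notable marker of RAG systems moving from simple information retrieval toward agents that decide and act autonomously. The filesystem-style tool design in particular lets developers apply infrastructure knowledge they already have when implementing complex RAG patterns, which helps standardize how production LLM applications are built.

Editorial comment

What stands out is the "harness" concept: rather than merely strengthening search, it gives the agent a way to carry out file operations on its own. Building version control into RAG is especially important because it bears directly on reliability in production use.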


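LlamaIndex's legal-kb project demonstrates agentic retrieval over Index v2 using retrieve, find, read, and grep tools. It shows an approach for working through complex documents such as legal texts and contracts and extracting the needed information accurately.

Conventional search systems have tended to rely on simple keyword matching, whereas this project answers natural-language questions by identifying the relevant sections in context. Given a question such as "What are the termination conditions in this contract?", it can locate the applicable clause, summarize it, and present it.

The retrieve tool fetches relevant chunks from the index, and the find tool locates files by exact name or substring. The read tool reads file content so the agent can take in the surrounding context, and the grep tool matches patterns with regular expressions to pin down exact wording.

Combining these tools, the agent lets users obtain the information they need from complex legal documents quickly and accurately. The benefit is most pronounced for large contract sets and collections of case law.

The project is built on Index v2, and chaining the tools enables a multi-step retrieval process.

As for future development, extensions to more domains and integration with AI models for more advanced reasoning are planned. Use is therefore expected beyond law, in other specialized fields such as medicine and finance.

MarkTechPost plans to publish detailed technical explanations and implementation examples of this project on an ongoing basis, with the aim of being a useful resource for developers and researchers.

legal-kb is an important step in showing what agentic retrieval can do. It addresses practical problems in processing legal documents and supports more efficient and more reliable access to information.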


LlamaIndex has published legal-kb, a public reference application on GitHub. It is described as a knowledge base for legal documents, powered by LlamaIndex Index v2 (the LlamaParse Platform). The project demonstrates a pattern the team calls a Retrieval Harness for agentic retrieval.

The approach differs from single-shot retrieval. Instead of one embedding search per query, an agent is given filesystem-style tools. It can then crawl a large, evolving knowledge base to solve a task. The tools mirror operations engineers already know: semantic and keyword search, regex grep, file search, and read.

What is legal-kb?

legal-kb is a working TanStack Start web app, not a library. You sign in, create a project, upload files, and chat with an agent. Each project is mirrored as a managed LlamaCloud Index v2. Uploaded files are parsed and indexed automatically in the background. The chat agent then queries that index live during each turn.

The Retrieval Harness, in plain terms

The harness provides a persistent data pipeline over your documents. It connects to a data source, indexes it, and keeps it updated. On top of that pipeline, it exposes a set of tools to the agent.

Those tools are deliberately close to filesystem operations. An agent can list files, read a file, grep inside a file, or run hybrid search. Because the tools are generic, you can plug the harness into your own agents.
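The filesystem analogy can be made concrete with a small in-memory stand-in. The sketch below is not the legal-kb code and does not call the Index v2 API; `KbFile`, `findFiles`, and `readFile` are hypothetical local functions that only mimic the shape of the operations described above (name lookup and windowed reads), and the two sample documents are invented for illustration.

```typescript
// Hypothetical in-memory stand-in for a knowledge base; not the Index v2 API.
interface KbFile {
  id: string
  name: string
  content: string
}

const files: KbFile[] = [
  { id: 'f1', name: 'msa.pdf', content: 'Either party may terminate on 30 days written notice.' },
  { id: 'f2', name: 'nda.pdf', content: 'Confidential information must not be disclosed.' },
]

// Find files by exact name or by substring; with no filter, list everything.
function findFiles(query: { fileName?: string; fileNameContains?: string }): KbFile[] {
  return files.filter((f) => {
    if (query.fileName !== undefined) return f.name === query.fileName
    if (query.fileNameContains !== undefined) return f.name.indexOf(query.fileNameContains) !== -1
    return true
  })
}

// Read a window of raw content so a large file never has to be loaded whole.
function readFile(fileId: string, offset: number, maxLength: number): string {
  const file = files.filter((f) => f.id === fileId)[0]
  if (!file) throw new Error('unknown file id: ' + fileId)
  return file.content.slice(offset, offset + maxLength)
}

const inventory = findFiles({}) // list step: both files
const excerpt = readFile('f1', 0, 12) // read step: first 12 characters of msa.pdf
```

Because each function takes plain arguments and returns plain data, the same shapes could be wrapped as tools for any agent loop, which is the sense in which the harness is described as generic.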

The four agent tools

The agent in src/lib/agent.ts is given four tools. Each maps to an Index v2 retrieval API. The table below lists them as implemented.

Tool | Backing API | Key parameters | What it does
---|---|---|---
retrieve | beta.retrieval.retrieve | query, top_k, score_threshold, rerank_top_n, file_name, file_version | Runs hybrid semantic search; optional reranking; returns chunks plus citations
findFiles | beta.retrieval.find | file_name, file_name_contains | Searches files by exact name or substring; paginates automatically
readFile | beta.retrieval.read | file_id, offset, max_length | Reads raw file content, with offset and length windows
grepFile | beta.retrieval.grep | file_id, pattern, context_chars, limit | Matches a pattern in one file; returns character positions
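To illustrate what "returns character positions" with `context_chars` and `limit` could look like, here is a minimal local sketch. It assumes a case-insensitive regular-expression match over a single string; `grepText` and `GrepMatch` are hypothetical names, and the real grep endpoint may behave differently.

```typescript
// One grep hit: where the match starts and ends, plus surrounding text.
interface GrepMatch {
  start: number
  end: number
  context: string
}

// Hypothetical local helper; parameter names echo the table, not a real client call.
function grepText(text: string, pattern: string, contextChars: number, limit: number): GrepMatch[] {
  const regex = new RegExp(pattern, 'gi')
  const matches: GrepMatch[] = []
  let hit = regex.exec(text)
  while (hit !== null && matches.length < limit) {
    const start = hit.index
    const end = start + hit[0].length
    matches.push({
      start,
      end,
      context: text.slice(Math.max(0, start - contextChars), end + contextChars),
    })
    // Guard against zero-length matches looping forever.
    if (hit[0].length === 0) regex.lastIndex++
    hit = regex.exec(text)
  }
  return matches
}

const clause = 'Notice: either party may terminate with 30 days notice.'
const hits = grepText(clause, 'notice', 10, 5)
```

Returning positions rather than only snippets lets a caller follow up with a windowed read around `start`, which is how grep and read complement each other.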

The system prompt enforces an order. The agent must call findFiles first to establish the document inventory. It then narrows with retrieve, and confirms exact wording with readFile or grepFile before citing.
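In legal-kb this order is stated in the system prompt. As an illustration only, the same rule can be written as a check over a trace of tool calls; `checkToolOrder` is a hypothetical helper and is not described as part of the app.

```typescript
type ToolName = 'findFiles' | 'retrieve' | 'readFile' | 'grepFile'

// Hypothetical trace check: returns the ways a run departs from the prescribed order.
function checkToolOrder(calls: ToolName[], answerHasCitation: boolean): string[] {
  const problems: string[] = []
  // The document inventory must be established before anything else.
  if (calls.length > 0 && calls[0] !== 'findFiles') {
    problems.push('first call must be findFiles')
  }
  // Exact wording must be confirmed with readFile or grepFile before citing.
  const verified = calls.some((c) => c === 'readFile' || c === 'grepFile')
  if (answerHasCitation && !verified) {
    problems.push('citation without readFile or grepFile confirmation')
  }
  return problems
}

const goodTrace = checkToolOrder(['findFiles', 'retrieve', 'grepFile'], true) // no problems
const badTrace = checkToolOrder(['retrieve'], true) // both rules violated
```

A check like this could run in tests or logging to see how often the model actually follows the prompt.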

How it works under the hood

Uploads follow a clear pipeline in src/lib/files.ts. Bytes are pushed to the project’s LlamaCloud source directory. A File and ProjectFile row are written to PostgreSQL via Prisma. An index sync is triggered but not awaited; the UI polls status until ready.
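The "triggered but not awaited" step is the part that is easy to misread, so here is a deliberately simplified, synchronous simulation. Everything in it is an assumption for illustration: `uploads` and `pendingSyncs` stand in for the database rows and the remote index, and `runOneSync` plays the role of background indexing that the upload handler never waits for.

```typescript
type IndexState = 'syncing' | 'ready'

interface UploadRecord {
  fileName: string
  state: IndexState
}

// Hypothetical in-memory stand-ins for the database rows and the sync queue.
const uploads: Record<string, UploadRecord> = {}
const pendingSyncs: string[] = []

// Upload step: store the record, queue the sync, and return without waiting for it.
function uploadFile(fileId: string, fileName: string): UploadRecord {
  const record: UploadRecord = { fileName, state: 'syncing' }
  uploads[fileId] = record
  pendingSyncs.push(fileId) // fire-and-forget: nothing here waits for indexing
  return record
}

// Simulates the background indexer completing one queued sync.
function runOneSync(): void {
  const fileId = pendingSyncs.shift()
  if (fileId === undefined) return
  const record = uploads[fileId]
  if (record) record.state = 'ready'
}

// What a status check polled by the UI might return.
function getIndexState(fileId: string): IndexState | undefined {
  const record = uploads[fileId]
  return record ? record.state : undefined
}

const stateRightAfterUpload = uploadFile('f1', 'nda.pdf').state // still 'syncing'
runOneSync()
const stateAfterSync = getIndexState('f1') // now 'ready'
```

The upload returns while the file is still indexing, which is why the UI has to poll until the state flips to ready.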

Versioning is scoped to the (project, filename) pair. Re-uploading nda.pdf to the same project produces v1, v2, v3 side by side. The retrieval layer filters on the version metadata field. This gives version control over the knowledge base itself.
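A minimal sketch of versioning scoped to the (project, filename) pair follows. It assumes versions are consecutive integers starting at 1; `FileVersion`, `addVersion`, and `selectVersion` are hypothetical names, and the actual metadata field used by the retrieval layer is not shown in the article.

```typescript
interface FileVersion {
  projectId: string
  fileName: string
  version: number
}

const versionStore: FileVersion[] = []

// Re-uploading the same name in the same project appends the next version number.
function addVersion(projectId: string, fileName: string): FileVersion {
  const existing = versionStore.filter(
    (v) => v.projectId === projectId && v.fileName === fileName,
  )
  const next: FileVersion = { projectId, fileName, version: existing.length + 1 }
  versionStore.push(next)
  return next
}

// Retrieval-side filter: restrict candidates to one version of one file.
function selectVersion(projectId: string, fileName: string, version: number): FileVersion[] {
  return versionStore.filter(
    (v) => v.projectId === projectId && v.fileName === fileName && v.version === version,
  )
}

addVersion('p1', 'nda.pdf') // v1
addVersion('p1', 'nda.pdf') // v2
const otherProject = addVersion('p2', 'nda.pdf') // v1 again: different project
```

Older versions stay in the store next to newer ones, so a query can target v1 even after v2 exists.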

The agent uses the ToolLoopAgent from Vercel AI SDK 6. You pick OpenAI or Anthropic per turn and bring your own keys. Reasoning is streamed: Claude models use extended thinking; OpenAI reasoning models use a medium reasoning effort.

Here is a condensed but faithful view of the retrieve tool and the agent.


import { LlamaCloud } from '@llamaindex/llama-cloud'
import { tool, ToolLoopAgent, type LanguageModel } from 'ai'
import { z } from 'zod'
import { makeCitationId } from './citations'

// One tool closure per index. Wraps Index v2 retrieval APIs.
function createLlamaParseTools(apiKey: string, projectId: string, indexId: string) {
  const client = new LlamaCloud({ apiKey })

  const retrieve = tool({
    description: 'Run a semantic retrieval query against an index.',
    inputSchema: z.object({
      query: z.string(),
      top_k: z.number().nullable(),
      score_threshold: z.number().nullable(),
      rerank_top_n: z.number().nullable(), // set to enable reranking
      file_name: z.string().nullable(), // metadata filter
      file_version: z.number().nullable(), // accepted here; its filter is not shown in this condensed view
    }),
    execute: async ({ query, top_k, score_threshold, rerank_top_n, file_name }) => {
      // Only build a metadata filter when a file name was supplied.
      const custom_filters = file_name
        ? { file_name: { operator: 'eq' as const, value: file_name } }
        : undefined

      const response = await client.beta.retrieval.retrieve({
        index_id: indexId,
        project_id: projectId,
        query,
        top_k,
        score_threshold,
        rerank: rerank_top_n != null ? { enabled: true, top_n: rerank_top_n } : undefined,
        custom_filters,
      })

      // Return a model-readable list plus citations that drive the UI chips.
      const citations = response.results.map((r) => ({
        id: makeCitationId(), // e.g. "c7f2qa"
        fileName: r.metadata?.file_name,
        score: r.rerank_score ?? r.score ?? null, // prefer the rerank score when present
        preview: r.content.slice(0, 500),
      }))

      const formatted = response.results
        .map((r, i) => `### Result #${i + 1}\n\n${r.content.slice(0, 600)}`)
        .join('\n\n---\n\n')

      return { formatted, citations }
    },
  })

  // findFiles / readFile / grepFile follow the same shape, backed by
  // client.beta.retrieval.find / .read / .grep
  return { retrieve /* , findFiles, readFile, grepFile */ }
}

export function buildAgent(model: LanguageModel, apiKey: string, projectId: string, indexId: string) {
  return new ToolLoopAgent({
    model,
    tools: createLlamaParseTools(apiKey, projectId, indexId),
    instructions:
      'Always call findFiles first, ground every answer in the documents, ' +
      'and cite ids inline as cite:<id>.',
  })
}

Answers carry visual citations. Each retrieved chunk gets a short id, such as cite:c7f2qa. The agent references that id inline, and the UI renders a clickable citation chip. Clicking it opens the source page screenshot with bounding-box rectangles over the cited text.
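The step from inline markers to chips can be sketched as a small parser. This is an illustration under assumptions, not the app's UI code: it assumes ids are lowercase alphanumeric (as in the c7f2qa example), and `Citation`, `extractCitationIds`, and `resolveCitations` are hypothetical names.

```typescript
interface Citation {
  id: string
  fileName: string
}

// Pull every cite:<id> marker out of an answer, in order of appearance.
function extractCitationIds(answer: string): string[] {
  const pattern = /cite:([a-z0-9]+)/g
  const ids: string[] = []
  let m = pattern.exec(answer)
  while (m !== null) {
    const id = m[1]
    if (id) ids.push(id)
    m = pattern.exec(answer)
  }
  return ids
}

// Resolve ids against the citations returned by retrieval; unknown ids are dropped.
function resolveCitations(answer: string, known: Citation[]): Citation[] {
  const resolved: Citation[] = []
  extractCitationIds(answer).forEach((id) => {
    const found = known.filter((c) => c.id === id)[0]
    if (found) resolved.push(found)
  })
  return resolved
}

const sources: Citation[] = [{ id: 'c7f2qa', fileName: 'msa.pdf' }]
const chips = resolveCitations('Notice is 30 days (cite:c7f2qa). See also cite:zzz999.', sources)
```

Dropping ids that do not match a retrieved chunk is one way to keep a chip from pointing at a source the agent never actually fetched.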

Naive RAG vs the agentic Retrieval Harness

The harness is a different execution model from single-shot RAG. The comparison below focuses on behavior.

Dimension | Naive / single-shot RAG | Agentic Retrieval Harness (Index v2)
---|---|---
Retrieval flow | One vector search per query | Multi-step tool loop: find → retrieve → read/grep
Search modes | Vector similarity only | Hybrid semantic search, keyword, and regex grep
Context | Fixed top-k chunks | Agent reads full files or windows on demand
Freshness | Static index | Persistent pipeline with sync and versioning
Precision control | Mostly hidden | top_k, score_threshold, rerank_top_n exposed
Citations | Chunk ids | Visual citations with page screenshots and bounding boxes
Best fit | Short question answering | Long-horizon document tasks

Use cases, with examples

The design targets domains where agents navigate large document sets. Legal and fintech are the stated examples.

Consider a contract question: ‘What notice is needed to terminate the MSA?’ The agent lists files, runs retrieve, then greps the exact clause. It answers with a citation to the specific page.

Consider due diligence across a data room: An agent can findFiles by name, then readFile each candidate. It cross-checks clauses without a human opening every PDF.

Consider a versioned policy base: Because retrieve accepts a file_version filter, an agent can query a specific version. This supports change tracking over time.


Key Takeaways

legal-kb is a public reference app showing agentic retrieval on LlamaIndex Index v2.

The agent gets four filesystem-style tools: retrieve (hybrid search), findFiles, readFile, and grepFile.

A persistent pipeline handles parsing, indexing, sync, and per-file version control.

Answers include visual citations: page screenshots with bounding boxes over the cited text.

The stack is TanStack Start, AI SDK 6, Prisma, and WorkOS, with per-user encrypted keys.

