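TLDR AI · June 5, 2026 09:00 · about 3 min read

Ollama Model Tester (GitHub Repo)

#LLM #Ollama #LocalExecution #ModelEvaluation #CLITool

TL;DR

Ollama Model Tester is a lightweight, dependency-free CLI tool for comparing and evaluating local Ollama models. It is meant to streamline developers' model-validation workflows.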

AI Deep Analysis · June 11, 2026 21:17


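Key Points

Lightweight, dependency-free tool

A CLI tool that uses only the Python standard library, so no extra installation step such as pip install is needed.

Model comparison and repeated runs

Runs the same prompt against different models, or several times against one model, and saves the results to disk for side-by-side comparison.

Flexible command-line operation

Interactive input is the default, but flags such as --model and --prompt-file allow scripting and batch processing.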


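Impact Analysis

The tool supports a common workflow for local LLM developers: choosing a model and tuning generation parameters such as temperature. Because it has no dependencies, there is no environment setup to do, which should make quick prototyping and A/B tests easier to run more often.

Editorial Comment

As local LLM use grows, this is a practical addition for making model selection and parameter tuning more efficient. It is a solid open-source tool with a direct payoff for developer productivity.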

Ollama Model Tester

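A small, dependency-free CLI for running the same prompt against your local Ollama models and saving every response to disk, so you can compare models (or repeated runs of one model) side by side.

It uses only the Python standard library: no pip install required.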

Requirements

Python 3.7 or newer

Ollama running locally (the default is http://localhost:11434)

At least one model pulled, e.g. ollama pull llama3.1:8b
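
One way to check these requirements from Python is to ask the Ollama server which models it has. The sketch below is not taken from the tool. It assumes Ollama's /api/tags endpoint, which returns a JSON object with a models list, and the helper names parse_model_names and list_local_models are hypothetical.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address from the requirements


def parse_model_names(payload):
    """Extract model names from an /api/tags style response body."""
    return sorted(m["name"] for m in payload.get("models", []))


def list_local_models(base_url=OLLAMA_URL, timeout=5):
    """Return installed model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(base_url + "/api/tags", timeout=timeout) as resp:
            return parse_model_names(json.load(resp))
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, and similar failures all land here.
        return None
```

Calling list_local_models() returns None when nothing answers at the address, and an empty list when the server is up but no model has been pulled yet.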

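Quick start

Make sure Ollama is running, then:

python3 ollama_model_test.py

You'll be asked, in order:

Which model to use (pick a number from your installed models)

The prompt: type as many lines as you like, then put /done on its own line to finish

How many times to run the prompt

Temperature (0.0–2.0), or press Enter to use Ollama's default

Whether to stream the responses live to the terminal

It then runs the prompt the requested number of times and writes the results under ollama-runs/.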
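
The multi-line prompt and the optional temperature are the two questions with non-obvious input handling. The following is a minimal sketch of how such input could be parsed, not the tool's actual code. The function names read_prompt and ask_temperature are hypothetical, and whether the real tool trims whitespace the same way is an assumption.

```python
def read_prompt(lines, sentinel="/done"):
    """Collect lines until one consists solely of the sentinel."""
    collected = []
    for line in lines:
        text = line.rstrip("\n")
        if text.strip() == sentinel:
            break  # anything after the sentinel line is ignored
        collected.append(text)
    return "\n".join(collected).strip()


def ask_temperature(raw):
    """Parse a temperature answer; empty input means 'use Ollama's default'."""
    raw = raw.strip()
    if not raw:
        return None
    value = float(raw)  # raises ValueError on non-numeric input
    if not 0.0 <= value <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    return value
```

In an interactive session, read_prompt(sys.stdin) would consume typed lines until /done, and ask_temperature(input("Temperature: ")) would return None when the user just presses Enter.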

Command-line flags (optional)

Every prompt above can be supplied up front, which makes the tool scriptable. Anything you omit is still asked interactively.

| Flag | Description |
| --- | --- |
| --model NAME | Local model to use (must already be installed) |
| --runs N | Number of generations to run |
| --temperature T | Temperature, 0.0–2.0 |
| --prompt-file PATH | Read the prompt from a UTF-8 text file |
| --stream / --no-stream | Stream responses live, or don't |

Example: run a saved prompt three times, fully non-interactive:

python3 ollama_model_test.py \
  --model llama3.1:8b \
  --prompt-file prompt.txt \
  --runs 3 \
  --temperature 0.7 \
  --no-stream
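
To illustrate how flags like these could map onto a request, here is a sketch using argparse. It is an illustration under assumptions, not the tool's source: build_parser and build_generate_request are hypothetical names, and the use of Ollama's POST /api/generate endpoint (rather than /api/chat) is a guess. Each flag defaults to None so that a missing value can be detected and asked for interactively.

```python
import argparse


def build_parser():
    """Parser mirroring the documented flags; None means 'not supplied'."""
    parser = argparse.ArgumentParser(description="Sketch of the documented flags")
    parser.add_argument("--model", metavar="NAME")
    parser.add_argument("--runs", type=int, metavar="N")
    parser.add_argument("--temperature", type=float, metavar="T")
    parser.add_argument("--prompt-file", metavar="PATH")
    # Two flags share one destination: True, False, or None (ask the user).
    parser.add_argument("--stream", dest="stream", action="store_true", default=None)
    parser.add_argument("--no-stream", dest="stream", action="store_false", default=None)
    return parser


def build_generate_request(model, prompt, temperature=None, stream=False):
    """Body for an assumed POST /api/generate call."""
    body = {"model": model, "prompt": prompt, "stream": stream}
    if temperature is not None:
        # Leaving "options" out keeps Ollama's default temperature.
        body["options"] = {"temperature": temperature}
    return body
```

The paired store_true/store_false actions are used instead of argparse.BooleanOptionalAction because the latter needs Python 3.9, while the stated requirement is Python 3.7.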

Output

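Results are grouped into one folder per prompt:

ollama-runs/
  what-are-the-main-tradeoffs-between_835562a4/
    prompt.md         # the prompt, with its hash and timestamp
    metadata.json     # every run against this prompt (model, timing, options)
    llama3.1-8b.md    # responses + Ollama metadata for this model
    gemma3-1b.md

The folder name is the first few words of the prompt plus a short hash of the full prompt. Because the folder is keyed on the prompt, running the same prompt against a different model drops its output into the same folder, which makes model-to-model comparison easy. Each model's file records every run's response alongside Ollama's run metadata (token counts, timings, and so on).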
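
The naming scheme described above can be sketched as follows. The README does not say which hash function or how many words the tool uses, so SHA-256, the 8-character digest, and the six-word limit are assumptions inferred from the example folder name, and the function names are hypothetical. The hash produced here will not necessarily match 835562a4.

```python
import hashlib
import re


def prompt_folder_name(prompt, max_words=6, hash_len=8):
    """First few words of the prompt plus a short hash of the full prompt."""
    words = re.findall(r"[a-z0-9]+", prompt.lower())[:max_words]
    slug = "-".join(words) or "prompt"  # fall back if the prompt has no words
    # Assumption: SHA-256 truncated to hash_len hex characters.
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:hash_len]
    return f"{slug}_{digest}"


def model_file_name(model):
    """Turn a model tag such as 'llama3.1:8b' into 'llama3.1-8b.md'."""
    return re.sub(r"[^A-Za-z0-9._-]+", "-", model) + ".md"
```

Because the name depends only on the prompt text, two runs with the same prompt and different models resolve to the same folder, while model_file_name keeps each model's responses in a separate file.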
