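TLDR AI · June 15, 2026 09:00 · About a 10-minute read

Kimi K2.7 Code (Hugging Face repository)

#LLM #MoE #Coding Agent #Moonshot AI #Hugging Face

TL;DR

Kimi K2.7 Code, a coding-focused agentic model released by Moonshot AI, uses a 1T-parameter MoE architecture and substantially improves task completion and token efficiency in complex software development workflows.

AI in-depth analysis · June 16, 2026 04:07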

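Key points

Evolution as a coding-focused agent

Built on Kimi K2.6, the model strengthens end-to-end completion of real-world, long-horizon coding tasks and can handle complex software engineering workflows.

1T-parameter MoE architecture

The model has 1 trillion (1T) total parameters but activates only 32 billion (32B) per token. Its Mixture-of-Experts (MoE) design selects 8 of 384 experts for each token.

Efficiency through fewer thinking tokens

Compared with the previous-generation K2.6, it uses about 30% fewer thinking tokens, which makes it more practical in terms of both compute cost and response speed.

Open-source availability

The model is published in a Hugging Face repository and is available under the Modified MIT License.

MoE architecture and long context

Kimi K2.7 Code uses a Mixture-of-Experts (MoE) design with 1 trillion total parameters and activates 32 billion of them per token for efficiency. It also supports a 256K-token context window.

Improved coding and agent performance

Compared with the previous-generation K2.6, scores rise on the coding benchmarks Kimi Code Bench v2 (50.9 to 62.0) and Program Bench (48.3 to 53.6), and on the agentic benchmarks MCP Atlas (69.4 to 76.0) and MCP Mark Verified (72.8 to 81.1).

Comparison with the latest models

Kimi K2.7 Code trails both GPT-5.5 and Claude Opus 4.8 on every listed coding benchmark, although on MLS Bench Lite it nearly matches GPT-5.5 (35.1 vs. 35.5). On MCP Mark Verified it scores 81.1, ahead of Claude Opus 4.8 (76.4) but behind GPT-5.5 (92.9).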

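Impact analysis

The release gives the open-source community access to a capable coding agent and widens the scope for automating large, complex software development tasks. Because the MoE design activates only 32B of the 1T parameters per token, advanced code generation and debugging assistance becomes more realistic in resource-constrained environments, and a direct effect on developer productivity is expected.

Editorial comment

For a coding-focused model, the explicit emphasis on reducing thinking tokens is an attractive step for developers who watch operating costs. Activating 32B parameters out of 1T suggests a new reference point for making large models practical.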

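1. Model Introduction

Kimi K2.7 Code is a coding-focused agentic model built on Kimi K2.6. It improves substantially on real-world, long-horizon coding tasks, strengthening end-to-end task completion across complex software engineering workflows while improving token efficiency: it uses approximately 30% fewer thinking tokens than Kimi K2.6.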

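2. Model Summary

Architecture | Mixture-of-Experts (MoE)

Total Parameters | 1T

Activated Parameters | 32B

Number of Layers (Dense layer included) | (not shown)

Number of Dense Layers | (not shown)

Attention Hidden Dimension | 7168

MoE Hidden Dimension (per Expert) | 2048

Number of Attention Heads | (not shown)

Number of Experts | 384

Selected Experts per Token | 8

Number of Shared Experts | (not shown)

Vocabulary Size | 160K

Context Length | 256K

Attention Mechanism | MLA (Multi-head Latent Attention)

Activation Function | SwiGLU

Vision Encoder | MoonViT

Parameters of Vision Encoder | 400M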
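The routing figures in the summary (384 experts, 8 selected per token, as stated in the key points) can be illustrated with a generic top-k gate. This is a minimal sketch, not Moonshot AI's implementation: the softmax over the selected logits, the random router logits, and the names route_token, NUM_EXPERTS and TOP_K are assumptions made for illustration. The source does not describe the gating function or how shared experts are combined.

```python
import math
import random

NUM_EXPERTS = 384  # number of experts, from the model summary
TOP_K = 8          # experts selected per token, from the key points


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    # Indices of the top_k largest router logits.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Softmax over the selected logits only, so the weights sum to 1.
    # (Assumed normalization; real MoE gates differ in detail.)
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Hypothetical router output for one token.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)
print(len(selected), "experts active out of", NUM_EXPERTS)
```

Only the selected experts run for a given token, which is why the activated parameter count (32B) is far below the total (1T).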

3. Evaluation Results

Benchmark | Kimi K2.6 | Kimi K2.7 Code | GPT-5.5 | Claude Opus 4.8

Coding

Kimi Code Bench v2 | 50.9 | 62.0 | 69.0 | 67.4

Program Bench | 48.3 | 53.6 | 69.1 | 63.8

MLS Bench Lite | 26.7 | 35.1 | 35.5 | 42.8

Agentic

Kimi Claw 24/7 Bench | 42.9 | 46.9 | 52.8 | 50.4

MCP Atlas | 69.4 | 76.0 | 79.4 | 81.3

MCP Mark Verified | 72.8 | 81.1 | 92.9 | 76.4

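Footnotes

General Testing Details

Unless stated otherwise, Kimi K2.7 Code and K2.6 were tested with thinking mode enabled via Kimi Code CLI at temperature = 1.0, top-p = 0.95, and a context length of 262,144 tokens. GPT-5.5 ran in Codex with xhigh mode, and Opus 4.8 ran in Claude Code with xhigh mode. Aside from these differences, all benchmarks were evaluated under the same conditions.

Coding Benchmarks

Kimi Code Bench V2 is our in-house benchmark designed to evaluate coding agents on realistic tasks. It contains diverse software engineering tasks across 10+ mainstream programming languages and a full production tech stack, drawn from internal engineering use cases, production incidents, and real-world open-source projects. It emphasizes backend services, infrastructure, performance engineering, systems programming, security, frontend development, and ML/data engineering.

Program Bench evaluates code-generation agents by asking them to recreate a program's behavior from only a compiled binary and its documentation. It spans 200 tasks, from small CLI tools to large systems such as FFmpeg and SQLite. Submissions are judged against more than 248,000 fuzz-generated behavioral tests. In each task, the agent is given an executable and its documentation, but no source code, decompilation, or internet access. It must choose its own implementation language, build the full program from scratch, and pass a behavioral test suite that compares its output against the original binary.

MLS-Bench evaluates whether AI systems can invent generalizable and scalable ML methods. MLS-Bench-Lite is the official 30-task subset of MLS-Bench, covering LLM pretraining and post-training, robotics, world models, computer vision, reinforcement learning, optimization, ML systems, AI for Science, and more. Agents are given 5 hours to explore before submitting their solutions. Opus 4.8 is evaluated with the max effort setting in Claude Code.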

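Agentic Benchmarks

Kimi Claw 24/7 Bench is our in-house benchmark for evaluating long-horizon agentic performance in persistent, multi-day coworking tasks. It spans 17 professional scenarios with 610 evaluation points, covering domains such as software engineering, ML research, recruiting, trading, and marketing. All tasks are executed through the OpenClaw harness. The final score is the average pass rate across all evaluation points, averaged over 3 runs.

MCP-Atlas evaluates LLM performance on realistic tool-use tasks through scalable MCPs. We followed the official MCP-Atlas evaluation configuration, with a budget of 100 tool calls and a maximum of 32k tokens per step. The final result is averaged over 3 runs.

MCPMark-Verified is a human-verified edition of MCPMark, a benchmark for evaluating MCP tool use across five real server environments: Notion, GitHub, Filesystem, Postgres, and Playwright. Each task has been re-checked by our team and the official benchmark maintainers, and the benchmark will be open-sourced soon. We followed the official MCPMark evaluation configuration, with a budget of 100 tool-call steps and a maximum of 32k tokens per step. The final result is averaged over 3 runs.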

4. Native INT4 Quantization

Kimi-K2.7-Code adopts the same native int4 quantization method as Kimi-K2-Thinking.

5. Deployment

The Kimi-K2.7-Code API is available at https://platform.moonshot.ai, with OpenAI- and Anthropic-compatible APIs.

Kimi-K2.7-Code is currently recommended to run on the following inference engines:

vLLM

SGLang

KTransformers

Kimi-K2.7-Code has the same architecture as Kimi-K2.5/Kimi-K2.6, so their deployment methods can be reused directly.

The required transformers version is >=4.57.1, <5.0.0.
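The version constraint can be checked before loading the model. The following is a rough sketch using only the standard library. The helper names parse_version and transformers_version_ok are hypothetical, and the parser is deliberately simple: it compares numeric components only and ignores pre-release suffixes such as rc1, so it is not a full PEP 440 implementation.

```python
def parse_version(text):
    """Turn '4.57.1' into (4, 57, 1); stops at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def transformers_version_ok(installed):
    """True when the version string satisfies >=4.57.1 and <5.0.0."""
    return (4, 57, 1) <= parse_version(installed) < (5, 0, 0)


for candidate in ("4.57.0", "4.57.1", "4.58.0.dev0", "5.0.0"):
    print(candidate, transformers_version_ok(candidate))
```

In a real environment the installed version string could be obtained with importlib.metadata.version("transformers") and passed to transformers_version_ok.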

Deployment examples can be found in the Model Deployment Guide.

6. Model Usage

The usage demos below show how to call the official API. Note that Kimi-K2.7-Code forces thinking and preserve_thinking to True.

For third-party APIs deployed with vLLM or SGLang, please note:

Chat with video content is an experimental feature and is currently supported only in the official API.

The recommended temperature for Thinking mode is 1.0.

The recommended top_p is 0.95.

Instant mode is not supported.
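The recommended sampling settings can be kept in one place and merged into each request. This is a minimal sketch under stated assumptions: build_request and RECOMMENDED_SAMPLING are hypothetical names, the model identifier "kimi-k2.7-code" is a placeholder, and the resulting dictionary is meant to be passed to an OpenAI-compatible client as client.chat.completions.create(**params).

```python
# Recommended values from the notes above.
RECOMMENDED_SAMPLING = {"temperature": 1.0, "top_p": 0.95}


def build_request(model_name, messages, max_tokens=4096, **overrides):
    """Assemble keyword arguments for an OpenAI-compatible chat completion call."""
    params = {
        "model": model_name,
        "messages": messages,
        "stream": False,
        "max_tokens": max_tokens,
        **RECOMMENDED_SAMPLING,
    }
    # Explicit overrides win over the recommended defaults.
    params.update(overrides)
    return params


params = build_request(
    "kimi-k2.7-code",  # placeholder model name, not confirmed by the source
    [{"role": "user", "content": "Summarize this repository."}],
)
print(params["temperature"], params["top_p"])
```

Keeping the defaults in a single dictionary makes it harder for one call site to drift away from the recommended temperature and top_p.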

Chat Completion

This simple chat completion script shows how to call the K2.7-Code API in Thinking mode.

import openai


def simple_chat(client: openai.OpenAI, model_name: str):
    # A system prompt followed by a single user turn.
    messages = [
        {'role': 'system', 'content': 'You are Kimi, an AI assistant created by Moonshot AI.'},
        {
            'role': 'user',
            'content': [
                {'type': 'text', 'text': 'which one is bigger, 9.11 or 9.9? think carefully.'}
            ],
        },
    ]
    response = client.chat.completions.create(
        model=model_name, messages=messages, stream=False, max_tokens=4096
    )
    # Thinking mode returns the reasoning separately from the final answer.
    print('====== Below is reasoning content in Thinking Mode ======')
    print(f'reasoning content: {response.choices[0].message.reasoning}')
    print('====== Below is response in Thinking Mode ======')
    print(f'response: {response.choices[0].message.content}')

Chat Completion with visual content

K2.7-Code supports image and video input.

The following example demonstrates how to call the K2.7-Code API with image input:

import openai
import base64
import requests


def chat_with_image(client: openai.OpenAI, model_name: str):
    # Download the image and embed it in the request as a base64 data URL.
    url = 'https://huggingface.co/moonshotai/Kimi-K2.7-Code/resolve/main/figures/kimi-logo.png'
    image_base64 = base64.b64encode(requests.get(url).content).decode()
    messages = [
        {
            'role': 'user',
            'content': [
                {'type': 'text', 'text': 'Describe this image in detail.'},
                {
                    'type': 'image_url',
                    'image_url': {'url': f'data:image/png;base64,{image_base64}'},
                },
            ],
        }
    ]

    response = client.chat.completions.create(
        model=model_name, messages=messages, stream=False, max_tokens=8192
    )
    print('====== Below is reasoning content in Thinking Mode ======')
    print(f'reasoning content: {response.choices[0].message.reasoning}')
    print('====== Below is response in Thinking Mode ======')
    print(f'response: {response.choices[0].message.content}')

The following example demonstrates how to call the K2.7-Code API with video input:

import openai
import base64
import requests


def chat_with_video(client: openai.OpenAI, model_name: str):
    # Download the video and embed it in the request as a base64 data URL.
    url = 'https://huggingface.co/moonshotai/Kimi-K2.7-Code/resolve/main/figures/demo_video.mp4'
    video_base64 = base64.b64encode(requests.get(url).content).decode()
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the video in detail."},
                {
                    "type": "video_url",
                    "video_url": {"url": f"data:video/mp4;base64,{video_base64}"},
                },
            ],
        }
    ]

    response = client.chat.completions.create(model=model_name, messages=messages)
    print('====== Below is reasoning content in Thinking Mode ======')
    print(f'reasoning content: {response.choices[0].message.reasoning}')
    print('====== Below is response in Thinking Mode ======')
    print(f'response: {response.choices[0].message.content}')
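Both visual examples build a base64 data URL by hand and hard-code the MIME type. A small helper can derive the MIME type from the file name instead. This is an illustrative sketch: to_data_url and media_part are hypothetical helpers, not part of any Moonshot AI or OpenAI library, and the content-part layout simply mirrors the image_url and video_url dictionaries used above.

```python
import base64
import mimetypes


def to_data_url(payload: bytes, filename: str) -> str:
    """Encode raw bytes as a data: URL, guessing the MIME type from the file name."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        raise ValueError(f"cannot guess MIME type for {filename!r}")
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def media_part(payload: bytes, filename: str) -> dict:
    """Build an image_url or video_url content part for a chat message."""
    url = to_data_url(payload, filename)
    if url.startswith("data:video/"):
        kind = "video_url"
    elif url.startswith("data:image/"):
        kind = "image_url"
    else:
        raise ValueError(f"unsupported media type for {filename!r}")
    return {"type": kind, kind: {"url": url}}


# Tiny stand-in payload so the sketch runs without network access.
part = media_part(b"abc", "kimi-logo.png")
print(part["type"], part["image_url"]["url"])
```

The returned dictionary can be placed directly in the content list of a user message alongside a text part.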

Preserve Thinking

Kimi K2.7 Code forces preserve_thinking mode, which retains the full reasoning content across multi-turn interactions and improves performance in coding agent scenarios.

This feature is enabled by default and cannot be disabled. The following example demonstrates how to call the K2.7-Code API in preserve_thinking mode:

import openai


def chat_with_preserve_thinking(client: openai.OpenAI, model_name: str):
    messages = [
        {
            "role": "user",
            "content": "Tell me three random numbers."
        },
        {
            "role": "assistant",
            "reasoning_content": "I'll start by listing five numbers: 473, 921, 235, 215, 222, and I'll tell you the first three.",
            # Some API (e.g. vLLM) may not support reasoning_content, you can try reasoning instead
            "content": "473, 921, 235"
        },
        {
            "role": "user",
            "content": "What are the other two numbers you have in mind?"
        }
    ]

    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        stream=False,
        max_tokens=4096,
    )
    # the assistant should mention 215 and 222 that appear in the prior reasoning content
    print(f"reasoning content: {response.choices[0].message.reasoning}")
    return response.choices[0].message.content
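In a real multi-turn session the assistant turn is not written by hand as in the example; it comes from the previous response. One way to carry the reasoning forward is sketched below. The helper append_assistant_turn is hypothetical, and the SimpleNamespace object only stands in for response.choices[0].message so the sketch runs offline. The field parameter reflects the note in the example that some servers expect reasoning rather than reasoning_content.

```python
from types import SimpleNamespace


def append_assistant_turn(history, message, field="reasoning_content"):
    """Append an assistant reply to history, keeping its reasoning text."""
    # Read whichever attribute the server populated.
    reasoning = (getattr(message, "reasoning_content", None)
                 or getattr(message, "reasoning", None))
    turn = {"role": "assistant", "content": message.content}
    if reasoning:
        turn[field] = reasoning
    history.append(turn)
    return history


# Stand-in for response.choices[0].message from an earlier call.
fake_message = SimpleNamespace(
    content="473, 921, 235",
    reasoning="I'll list five numbers: 473, 921, 235, 215, 222.",
)

history = [{"role": "user", "content": "Tell me three random numbers."}]
append_assistant_turn(history, fake_message)
history.append({"role": "user",
                "content": "What are the other two numbers you have in mind?"})
print([turn["role"] for turn in history])
```

Sending history back in the next request gives the model access to its earlier reasoning, which is what the follow-up question in the example relies on.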

Interleaved Thinking and Multi-Step Tool Call

K2.7-Code shares the same Interleaved Thinking and Multi-Step Tool Call design as K2 Thinking. For a usage example, please refer to the K2 Thinking documentation.
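The general shape of a multi-step tool-call loop with preserved reasoning can be sketched as follows. This is not the K2 Thinking reference code: run_tool_loop, the scripted stand-in client, and the add tool are all hypothetical, and the message layout assumes the OpenAI-compatible tool-calling format. The max_calls default of 100 echoes the tool-call budget mentioned in the benchmark footnotes and is otherwise arbitrary.

```python
import json
from types import SimpleNamespace as NS


def run_tool_loop(create, messages, tools, handlers, max_calls=100):
    """Call the model until it answers without requesting a tool.

    `create` is any callable shaped like client.chat.completions.create.
    """
    calls = 0
    while True:
        message = create(messages=messages, tools=tools).choices[0].message
        turn = {"role": "assistant", "content": message.content}
        reasoning = getattr(message, "reasoning_content", None)
        if reasoning:
            turn["reasoning_content"] = reasoning  # keep thinking between steps
        tool_calls = getattr(message, "tool_calls", None) or []
        if not tool_calls:
            messages.append(turn)
            return message.content
        turn["tool_calls"] = [
            {"id": c.id, "type": "function",
             "function": {"name": c.function.name,
                          "arguments": c.function.arguments}}
            for c in tool_calls
        ]
        messages.append(turn)
        for c in tool_calls:
            calls += 1
            if calls > max_calls:
                raise RuntimeError("tool-call budget exhausted")
            # Run the local handler and feed its result back to the model.
            result = handlers[c.function.name](**json.loads(c.function.arguments))
            messages.append({"role": "tool", "tool_call_id": c.id,
                             "content": json.dumps(result)})


def scripted(replies):
    """Offline stand-in for the API: returns the given messages in order."""
    it = iter(replies)
    return lambda **kwargs: NS(choices=[NS(message=next(it))])


call = NS(id="call_1", function=NS(name="add", arguments='{"a": 2, "b": 3}'))
create = scripted([
    NS(content=None, reasoning_content="I should add the numbers.", tool_calls=[call]),
    NS(content="The sum is 5.", reasoning_content=None, tool_calls=None),
])
history = [{"role": "user", "content": "What is 2 + 3?"}]
answer = run_tool_loop(create, history, tools=[], handlers={"add": lambda a, b: a + b})
print(answer)
```

With a real client, create could be functools.partial(client.chat.completions.create, model=model_name), and tools would hold the JSON schema of each function.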

Coding Agent Framework

Kimi K2.7-Code works best with Kimi Code CLI as its agent framework; try it at https://www.kimi.com/code.

7. License

Both the code repository and the model weights are released under the Modified MIT License.

8. Third Party Notices

See THIRD PARTY NOTICES.

9. Contact Us

If you have any questions, please reach out at support@moonshot.ai.

Downloads last month: 56,750

Kimi K2.7 Code

1. Model Introduction

Kimi K2.7 Code is a coding-focused agentic model built upon Kimi K2.6. With substantial improvements on real-world long-horizon coding tasks, it strengthens end-to-end task completion across complex software engineering workflows while improving token efficiency, reducing thinking-token usage by approximately 30% compared with Kimi K2.6.

2. Model Summary

Architecture | Mixture-of-Experts (MoE)

Total Parameters | 1T

Activated Parameters | 32B

Number of Layers (Dense layer included) | (not shown)

Number of Dense Layers | (not shown)

Attention Hidden Dimension | 7168

MoE Hidden Dimension (per Expert) | 2048

Number of Attention Heads | (not shown)

Number of Experts | 384

Selected Experts per Token | 8

Number of Shared Experts | (not shown)

Vocabulary Size | 160K

Context Length | 256K

Attention Mechanism | MLA

Activation Function | SwiGLU

Vision Encoder | MoonViT

Parameters of Vision Encoder | 400M

3. Evaluation Results

Benchmark | Kimi K2.6 | Kimi K2.7 Code | GPT-5.5 | Claude Opus 4.8

Coding

Kimi Code Bench v2 | 50.9 | 62.0 | 69.0 | 67.4

Program Bench | 48.3 | 53.6 | 69.1 | 63.8

MLS Bench Lite | 26.7 | 35.1 | 35.5 | 42.8

Agentic

Kimi Claw 24/7 Bench | 42.9 | 46.9 | 52.8 | 50.4

MCP Atlas | 69.4 | 76.0 | 79.4 | 81.3

MCP Mark Verified | 72.8 | 81.1 | 92.9 | 76.4

Footnotes

General Testing Details

Unless stated otherwise, Kimi K2.7 Code and K2.6 were tested with thinking mode enabled via Kimi Code CLI at temperature = 1.0, top-p = 0.95, and a 262,144-token context length; GPT-5.5 ran in Codex with xhigh mode, and Opus 4.8 in Claude Code with xhigh mode. Aside from these differences, all benchmarks were evaluated under the same conditions.

Coding Benchmarks

Kimi Code Bench V2 is our in-house benchmark designed to evaluate coding agents on realistic tasks. It contains diverse software engineering tasks across 10+ mainstream programming languages and a full production tech stack, drawn from internal engineering use cases, production incidents, and real-world open-source projects. It emphasizes backend services, infrastructure, performance engineering, systems programming, security, frontend development, and ML/data engineering.

Program Bench evaluates code-generation agents by asking them to recreate a program’s behavior from only a compiled binary and its documentation. It spans 200 tasks, from small CLI tools to large systems like FFmpeg and SQLite. Submissions are judged against over 248,000 fuzz-generated behavioral tests. In each task, the agent is given an executable and its documentation, but no source code, decompilation, or internet access. It must choose its own implementation language, build the full program from scratch, and pass a behavioral test suite comparing its output against the original binary.
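The idea of judging a reimplementation purely by observable behavior can be sketched in a few lines. This is an illustration of the comparison principle only, not Program Bench's harness: plain Python callables stand in for the original binary and the agent's rebuild, the inputs are hand-picked rather than fuzz-generated, and behavioral_pass_rate is a hypothetical name.

```python
def behavioral_pass_rate(reference, candidate, inputs):
    """Fraction of inputs on which candidate reproduces reference's behavior."""
    def observe(program, arg):
        # Record either the result or the exception type, so that failures
        # are compared as part of the observable behavior.
        try:
            return ("ok", program(arg))
        except Exception as exc:
            return ("error", type(exc).__name__)

    matches = sum(observe(reference, x) == observe(candidate, x) for x in inputs)
    return matches / len(inputs)


def reference(text):
    """Stand-in for the original program: Python's own integer parser."""
    return int(text, 10)


def candidate(text):
    """Stand-in for a rebuild that mishandles signs and surrounding whitespace."""
    if not text.isdigit():
        raise ValueError(text)
    return int(text)


inputs = ["0", "42", "-7", "abc", " 5 "]
rate = behavioral_pass_rate(reference, candidate, inputs)
print(rate)  # 0.6: the rebuild disagrees on "-7" and " 5 "
```

A real harness would run both executables as subprocesses and compare exit codes and output streams, but the pass criterion has the same form: agreement with the reference on each generated input.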

MLS-Bench evaluates whether AI systems can invent generalizable and scalable ML methods. MLS-Bench-Lite is the official 30-task subset of MLS-Bench, covering LLM pretraining and post-training, robotics, world models, computer vision, reinforcement learning, optimization, ML systems, AI for Science, and more. Agents are given 5 hours to explore before submitting their solutions. Opus 4.8 is evaluated with the max effort setting in Claude Code.

Agentic Benchmarks

Kimi Claw 24/7 Bench is our in-house benchmark for evaluating long-horizon agentic performance in persistent, multi-day coworking tasks. It spans 17 professional scenarios with 610 evaluation points, covering domains such as software engineering, ML research, recruiting, trading, and marketing. All tasks are executed through the OpenClaw harness. The final score is the average pass rate across all evaluation points, averaged over 3 runs.
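The scoring rule described here (pass rate over evaluation points, then a mean over runs) amounts to the arithmetic below. This is a sketch with made-up data: four evaluation points per run instead of 610, equal weighting of points (the source does not say whether points are weighted), and hypothetical function names.

```python
def run_pass_rate(points):
    """Pass rate for one run: fraction of evaluation points marked True."""
    return sum(points) / len(points)


def final_score(runs):
    """Average the per-run pass rates and express the result as a percentage."""
    return 100.0 * sum(run_pass_rate(r) for r in runs) / len(runs)


# Three illustrative runs over the same four evaluation points.
runs = [
    [True, True, False, False],   # 50% pass rate
    [True, False, False, False],  # 25% pass rate
    [True, True, True, False],    # 75% pass rate
]
score = final_score(runs)
print(score)  # 50.0
```

Averaging over several runs reduces the effect of sampling noise, which matters because the models are evaluated at temperature 1.0.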

MCP-Atlas evaluates LLM performance on realistic tool-use tasks through scalable MCPs. We followed the official MCP-Atlas evaluation configuration, with a budget of 100 tool calls and a maximum of 32k tokens per step. The final result is averaged over 3 runs.

MCPMark-Verified is a human-verified edition of MCPMark, a benchmark for evaluating MCP tool use across five real server environments: Notion, GitHub, Filesystem, Postgres, and Playwright. Each task has been re-checked by our team and the official benchmark maintainers, and the benchmark will be open-sourced soon. We followed the official MCPMark evaluation configuration, with a budget of 100 tool-call steps and a maximum of 32k tokens per step. The final result is averaged over 3 runs.

4. Native INT4 Quantization

Kimi-K2.7-Code adopts the same native int4 quantization method as Kimi-K2-Thinking.

5. Deployment

The Kimi-K2.7-Code API is available at https://platform.moonshot.ai, with OpenAI- and Anthropic-compatible APIs.

Kimi-K2.7-Code is currently recommended to run on the following inference engines:

vLLM

SGLang

KTransformers

Kimi-K2.7-Code has the same architecture as Kimi-K2.5/Kimi-K2.6, and the deployment method can be directly reused.

The version requirement for transformers is >=4.57.1, <5.0.0.

Deployment examples can be found in the Model Deployment Guide.

6. Model Usage

The usage demos below demonstrate how to call our official API. Note that Kimi-K2.7-Code forces thinking and preserve_thinking as True.

For third-party APIs deployed with vLLM or SGLang, please note that:

Chat with video content is an experimental feature and is only supported in our official API for now.
The recommended temperature for Thinking mode is 1.0.
The recommended top_p is 0.95.
Instant mode is not supported.

Chat Completion

This is a simple chat completion script which shows how to call K2.7-Code API in Thinking mode.

import openai
import base64
import requests
def simple_chat(client: openai.OpenAI, model_name: str):
    messages = [
        {'role': 'system', 'content': 'You are Kimi, an AI assistant created by Moonshot AI.'},
        {
            'role': 'user',
            'content': [
                {'type': 'text', 'text': 'which one is bigger, 9.11 or 9.9? think carefully.'}
            ],
        },
    ]
    response = client.chat.completions.create(
        model=model_name, messages=messages, stream=False, max_tokens=4096
    )
    print('====== Below is reasoning content in Thinking Mode ======')
    print(f'reasoning content: {response.choices[0].message.reasoning}')
    print('====== Below is response in Thinking Mode ======')
    print(f'response: {response.choices[0].message.content}')

Chat Completion with visual content

K2.7-Code supports Image and Video input.

The following example demonstrates how to call K2.7-Code API with image input:

import openai
import base64
import requests

def chat_with_image(client: openai.OpenAI, model_name: str):
    url = 'https://huggingface.co/moonshotai/Kimi-K2.7-Code/resolve/main/figures/kimi-logo.png'
    image_base64 = base64.b64encode(requests.get(url).content).decode()
    messages = [
        {
            'role': 'user',
            'content': [
                {'type': 'text', 'text': 'Describe this image in detail.'},
                {
                    'type': 'image_url',
                    'image_url': {'url': f'data:image/png;base64,{image_base64}'},
                },
            ],
        }
    ]

    response = client.chat.completions.create(
        model=model_name, messages=messages, stream=False, max_tokens=8192
    )
    print('====== Below is reasoning content in Thinking Mode ======')
    print(f'reasoning content: {response.choices[0].message.reasoning}')
    print('====== Below is response in Thinking Mode ======')
    print(f'response: {response.choices[0].message.content}')

The following example demonstrates how to call K2.7-Code API with video input:

import openai
import base64
import requests

def chat_with_video(client: openai.OpenAI, model_name:str):
    url = 'https://huggingface.co/moonshotai/Kimi-K2.7-Code/resolve/main/figures/demo_video.mp4'
    video_base64 = base64.b64encode(requests.get(url).content).decode()
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text","text": "Describe the video in detail."},
                {
                    "type": "video_url",
                    "video_url": {"url": f"data:video/mp4;base64,{video_base64}"},
                },
            ],
        }
    ]

    response = client.chat.completions.create(model=model_name, messages=messages)
    print('====== Below is reasoning content in Thinking Mode ======')
    print(f'reasoning content: {response.choices[0].message.reasoning}')
    print('====== Below is response in Thinking Mode ======')
    print(f'response: {response.choices[0].message.content}')

Preserve Thinking

Kimi K2.7 Code forces preserve_thinking mode, which retains full reasoning content across multi-turn interactions and enhances performance in coding agent scenarios.

This feature is enabled by default and can't be disabled. The following example demonstrates how to call K2.7-Code API in preserve_thinking mode:

def chat_with_preserve_thinking(client: openai.OpenAI, model_name: str):
    messages = [
        {
            "role": "user",
            "content": "Tell me three random numbers."
        },
        {
            "role": "assistant",
            "reasoning_content": "I'll start by listing five numbers: 473, 921, 235, 215, 222, and I'll tell you the first three.",
            # Some API (e.g. vLLM) may not support reasoning_content, you can try reasoning instead
            "content": "473, 921, 235"
        },
        {
            "role": "user",
            "content": "What are the other two numbers you have in mind?"
        }
    ]

    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        stream=False,
        max_tokens=4096,
    )
    # the assistant should mention 215 and 222 that appear in the prior reasoning content
    print(f"reasoning content: {response.choices[0].message.reasoning}")
    return response.choices[0].message.content

Interleaved Thinking and Multi-Step Tool Call

K2.7-Code shares the same Interleaved Thinking and Multi-Step Tool Call design as K2 Thinking. For a usage example, please refer to the K2 Thinking documentation.

Coding Agent Framework

Kimi K2.7-Code works best with Kimi Code CLI as its agent framework; try it at https://www.kimi.com/code.

7. License

Both the code repository and the model weights are released under the Modified MIT License.

8. Third Party Notices

See THIRD PARTY NOTICES

9. Contact Us

If you have any questions, please reach out at support@moonshot.ai.

Downloads last month 56,750
