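State of Open Source on Hugging Face: Spring 2026
Hugging Face's Spring 2026 report shows rapid growth in the open source AI ecosystem and expanding enterprise adoption, while pointing to an extreme concentration of downloads and the importance of niche communities.
Key Points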
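Rapid Ecosystem Growth and a Shift in Development Practice
The numbers of users, public models, and public datasets roughly doubled, and participation is shifting from simply consuming pre-trained models toward active derivative work such as fine-tuning and creating adapters.
Pareto Distribution of Downloads and Sub-Ecosystem Structure
Half of all models have fewer than 200 downloads, while the top 0.01% account for about half of all downloads. The market is not uniform; it consists of overlapping sub-ecosystems in which communities specialized in particular domains or languages generate sustained reuse.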
Enterprises and Startups Standardizing on Open Source
More than 30% of Fortune 500 companies maintain verified accounts on Hugging Face, startups increasingly adopt open models as default components, and popular IDEs support open models alongside closed ones.
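Big Tech Investment and the Economic Advantage of Open Models
As Big Tech companies such as NVIDIA step up repository creation, open models create downstream value that exceeds their production cost through reuse and adaptation, and they offer more flexibility and cost efficiency than closed systems.
Structural Shifts in Geography and Developer Composition
Chinese models overtook U.S. models in both monthly and overall downloads, and activity shifted from industry toward independent developers and individuals, who accounted for about 39% of all downloads in 2025.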
Open Source and National Sovereignty
Open weight models enable governments to fine-tune systems on local data, deploy on domestic hardware, and ensure transparency for regulatory review, directly supporting national AI sovereignty.
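Impact Analysis
This report provides quantitative evidence that open source AI is moving from a mere technical repository to standard industrial infrastructure. While aware of the concentration risk of relying on a single hub, companies treat a presence on Hugging Face and the use of open models as an essential strategy for competing with startups and participating in the ecosystem. As the field polarizes between specialized niche models and large foundation models, choosing the right architecture and working with the community will be key for developers to stay competitive.
Editorial Comment
This is an official analysis by Hugging Face, and its value lies in quantifying and visualizing structural change in the ecosystem. However, its point about download concentration reaffirms existing knowledge, and it offers limited direct guidance on concrete technical implementation or competitive strategy.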







State of Open Source on Hugging Face: Spring 2026
This post examines how the open source AI landscape has shifted across competition, geography, technical trends, and emerging communities over the past year. We primarily examine community activity on Hugging Face across many types of metrics to give a holistic view of the ecosystem.
This post builds on an earlier analysis conducted mid-2025, available here, which examined what the Hugging Face Community is building. We recommend reading additional perspectives on the open source ecosystem in and outside of Hugging Face from the Data Provenance Initiative, Interconnects, OpenRouter and a16z, and MIT and the Linux Foundation. As the Hugging Face ecosystem is distributed, analyses are a combination of Hugging Face and community members' work, each of which is appropriately credited.
Activity in the open source AI ecosystem has grown rapidly, with the numbers of users, model repositories, and dataset repositories all close to doubling. In 2025, Hugging Face grew to 11 million users, more than 2 million public models, and over 500,000 public datasets. This growth signals more than increased interest in open source; it reflects a shift toward active participation, with users increasingly creating derivative artifacts such as fine-tuned models, adapters, benchmarks, and applications rather than only consuming pre-trained systems.

Data from Hugging Face | Hugging Face's two million models and counting: Graph and story by AI World
The ecosystem remains highly concentrated. Approximately half of the models on Hugging Face have fewer than 200 total downloads, and the 200 most downloaded models, or 0.01% of all models, account for 49.6% of all downloads.
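To make the concentration figures concrete, here is a minimal Python sketch of how a top-k download share could be computed. The function name `top_k_share` and the toy download counts are hypothetical and purely illustrative; they are not Hugging Face data. The only figures taken from the text are the top 200 models and the roughly 2 million public models.

```python
def top_k_share(downloads, k):
    """Fraction of all downloads captured by the k most-downloaded models."""
    if k < 0:
        raise ValueError("k must be non-negative")
    total = sum(downloads)
    if total == 0:
        return 0.0
    top = sorted(downloads, reverse=True)[:k]
    return sum(top) / total


# Figures from the text: the top 200 of roughly 2 million public models.
top_fraction_of_models = 200 / 2_000_000  # 0.0001, i.e. 0.01% of models

# Hypothetical toy catalogue (NOT real data): 3 heavy hitters plus a long tail.
toy_downloads = [1_000_000, 500_000, 250_000] + [100] * 997
toy_share = top_k_share(toy_downloads, k=3)

print(f"top 200 models are {top_fraction_of_models:.2%} of all models")
print(f"toy top-3 share of downloads: {toy_share:.1%}")
```

Applied to real per-model download counts, the same function with `k=200` would be one way to check a figure such as the 49.6% share reported above.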
Specialized communities form around particular domains, languages, or problem areas, and often show sustained engagement and reuse even when their overall download counts are modest. Open source AI is best understood as a collection of overlapping sub-ecosystems rather than a single uniform market.
Open Source in Competition
More companies, both large and small, are building on open source. Over 30% of the Fortune 500 now maintain verified accounts on Hugging Face. Startups frequently use open models as default components: Thinking Machines built its Tinker model options entirely on open weights, while popular IDEs such as VSCode and Cursor support both open and closed models. Established American companies such as Airbnb have increased their engagement with the open ecosystem, and Hugging Face has seen more legacy companies upgrading their organizational subscriptions over the course of 2025.
Big Tech companies are frequently creating new repositories on Hugging Face Hub; visualized side-by-side, the strong increase in repository growth shows investment over time. NVIDIA has emerged as the strongest contributor.

Data from Hugging Face | Big Tech Is All-In On Open-Source AI, Graph and story by AI World
Studies of open software more broadly suggest that the downstream value created by open artifacts far exceeds the cost of producing them. Similar dynamics are emerging in AI, where open models are reused, adapted, and specialized across thousands of downstream applications. Organizations that rely exclusively on closed systems often incur higher costs and face reduced flexibility in deployment and customization.
The Geography of Open Source
All-time downloads over the past four years show clear frontrunner regions in model popularity. The U.S. and China have historically been top contributors, with the UK, Germany, and France as secondary in popularity. Models developed by individual users or distributed organizations without a clear geographic base account for about half of all platform downloads.
Data from Hugging Face | Graph and Research from Longpre et al. “Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem”
The geographic composition of the open source ecosystem has fundamentally changed. Hugging Face data shows China surpassing the U.S. in both monthly and overall downloads. In the past year, Chinese models quickly came to account for a plurality (41%) of downloads.

Data and Graph from Hugging Face
Industry's share of overall development fell from around 70% before 2022 to roughly 37% in 2025. Meanwhile, independent or unaffiliated developers rose from 17% to 39% of all downloads over the same period, at times accounting for more than half of total usage. Individuals and small collectives focused on quantizing, adapting, and redistributing base models. These intermediaries now steer a meaningful portion of what typical users can run and how innovations spread through the ecosystem.
Data from Hugging Face | Graph and Research from Longpre et al. “Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem”
Different regions contribute in different ways. The United States and Western Europe have historically dominated through large industry labs (Google, Meta, OpenAI, Stability AI), while China has increasingly led on both releases and adoption. France, Germany, and the UK continue to contribute through research organizations, national AI initiatives, and specialized model families. Ecosystems supporting a variety of contributors and organizational forms tend to produce more widely adopted artifacts.
Countries, Organizations, and Individual Users
Popular models from startups were more widespread, and France and South Korea were competitive countries. Notably, the fourth most popular source of new trending models was individual users, not organizations. Creating competitive models as an individual user is more accessible than ever before.
Data and Graph from Hugging Face
Between the U.S. and China
Of the newly created models in 2025, the majority of trending models were either developed in China or derivative of a model developed in China. The most popular models were developed by large organizations, predominantly from the U.S. and China. For more on the Chinese AI ecosystem, read our three part series reflecting on the changes in one year since the "DeepSeek Moment", with one on strategic changes, two on architectural changes, and three on organizations and the future.
In 2025, China’s AI ecosystem steered heavily into open source, following the viral release of DeepSeek’s R1 model in January. The number of competitive Chinese organizations releasing models and the number of repositories on Hugging Face skyrocketed. Baidu went from zero releases on the Hub in 2024 to over 100 in 2025. ByteDance and Tencent each increased releases by eight to nine times. Organizations that had previously favored closed approaches, including Baidu and MiniMax, shifted decisively toward open releases.

Data and Graph from Hugging Face
A similar number of popular U.S. organizations have consistently contributed a higher volume of repositories over time. Meta and its former Facebook research organization account for a significant proportion of open releases, as does Google to a lesser extent.

Data and Graph from Hugging Face
Viewed side by side, the steep upward trajectory of repository growth among popular Chinese organizations emerges as a key strategic difference.

Data and Graph from Hugging Face
Global Open Source and Sovereignty
Open source AI is increasingly tied to questions of sovereignty. Open weight models allow governments and public institutions to fine-tune systems on local data under national legal frameworks. Models that can be deployed on domestic hardware reduce reliance on foreign-controlled cloud infrastructure. Transparency around model architecture, training processes, and evaluation supports regulatory review and public accountability. Read more about the open source approach to sovereignty here.
At the national level, governments are taking action. South Korea's National Sovereign AI Initiative, launched in mid-2025, named national champions LG AI Research, SK Telecom, Naver Cloud, NC AI, and Upstage to produce competitive domestic models. Three models from South Korea trended simultaneously on Hugging Face Hub in February 2026. In March 2026, South Korea and U.S. startup Reflection AI announced a data center partnership, also bringing frontier open weight models to South Korea.
Switzerland's Swiss AI initiative and various EU-funded projects reflect similar priorities. The UK's principle of "public money, public code" has influenced several government-backed AI initiatives.

Hugging Face Trending Page February 2026
These investments in open source and open weight AI are already paying dividends for countries with thriving AI training ecosystems of their own: models and datasets are typically used most in the regions where they are developed, with developers often turning to the models that best represent their languages and reflect similar technical and application requirements.

Data and Graph from Hugging Face
Model Popularity
The most liked models on the Hub indicate community attention, whether users like a model to find it again later or simply to signal its popularity. While this metric does not always reflect usage, the attention collected over time can signal interest. In one year, the most liked models went from predominantly U.S.-developed models from Meta’s Llama family to an international mix with China’s DeepSeek-R1 at the top.

Data and Graphic from Hugging Face
Papers and Scientific Contributions
While the value of scientific contributions can be measured by many metrics, the upvote feature on the Hub shows that papers from large AI organizations are widely appreciated by community members. Notably, the most upvoted papers are from large organizations, mostly from the U.S. and China. The majority of the top organizations are Chinese Big Tech companies, with ByteDance sharing a high volume of high-impact papers.
Space by Hugging Face | PaperVerse Explorer
Among Hugging Face's Daily Papers, a set of papers curated by Hugging Face's AK, the papers that reference model and dataset creation, and so show the most open source adoption, are generally diverse. A prominent takeaway is that medical papers are influential, while Big Tech's influence is sparse.

Data from Hugging Face | Graphic and story by AI World
Derivative Models
How our community members choose to build on models, whether via fine-tuning, merging, or other methods, reflects model popularity and usability. Alibaba as an organization has more derivative models than both Google and Meta combined, with the Qwen family constituting more than 113,000 derivative models. When including all models that tag Qwen, that number balloons to over 200,000 models.

Data and Graph from Hugging Face
Adoption and Accessibility
Model development has increasingly emphasized accessibility alongside scale. Smaller models are downloaded and deployed at far higher rates than very large systems, reflecting practical constraints around cost, latency, and hardware availability.
This small-model dominance occurs in part because far more models are released at that size. But even when normalizing for this, the data from the ATOM Project's Relative Adoption Metric shows that the median top-10 models from 1-9B parameters are only downloaded about 4x more than models above 100B. Automated systems and CI pipelines further inflate small model download counts, but the trend toward smaller, deployable models is real.

Data from Hugging Face | Graph and Article by ATOM
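The normalization idea described above can be illustrated with a small Python sketch. This is not the ATOM Project's actual Relative Adoption Metric, whose definition is not given here. It is a simplified stand-in that compares the median downloads of the top 10 models in each size bucket, instead of comparing bucket totals. The helper names and all download numbers are hypothetical.

```python
from statistics import median


def median_top_n(downloads, n=10):
    """Median download count among the n most-downloaded models in a bucket."""
    top = sorted(downloads, reverse=True)[:n]
    return median(top)


def adoption_ratio(small_bucket, large_bucket, n=10):
    """How many times more a typical top small model is downloaded than a top large one."""
    return median_top_n(small_bucket, n) / median_top_n(large_bucket, n)


# Hypothetical toy data (NOT real Hub numbers).
# The small bucket has many more releases, which inflates its total.
small_models = [900, 800, 700, 600, 500, 400, 300, 200, 100, 90] + [50] * 1000
large_models = [200, 150, 125, 100, 100, 100, 75, 50, 25]

total_ratio = sum(small_models) / sum(large_models)
per_model_ratio = adoption_ratio(small_models, large_models)

print(f"ratio of bucket totals:        {total_ratio:.1f}x")
print(f"ratio of top-10 median models: {per_model_ratio:.1f}x")
```

In this toy data, the ratio of bucket totals is much larger than the ratio of typical top models, because the small bucket simply contains far more releases. That is the effect the normalization is meant to remove.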
Engagement with open models tends to peak almost immediately after release, then slow. Mean engagement duration is approximately 6 weeks. Continuous improvement and frequent updates have become critical for maintaining relevance. DeepSeek's successive releases (V3, R1, V3.2) kept it competitive even as challengers emerged. Organizations that stagnate in development tend to lose share quickly to those with frequent updates or domain-specific fine-tunes.

Data from Hugging Face | Graph and Research from Choksi et al. "The Brief and Wondrous Life of Open Models"
The mean size of downloaded open models rose from 827M parameters in 2023 to 20.8B in 2025, driven largely by quantization and mixture-of-experts architectures. The median, however, increased only marginally, from 326M to 406M parameters. This divergence indicates that high-end LLM users are pulling up the mean while underlying small-model usage remains stable.

Data from Hugging Face | Graph and Research from Longpre et al. "Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem"
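The divergence between mean and median can be checked with a few lines of Python. The growth factors below use the figures quoted in the text. The two toy samples are hypothetical and only show how a handful of very large models can pull the mean up while the median barely moves.

```python
from statistics import mean, median

# Growth factors from the figures in the text (sizes in millions of parameters).
mean_growth = 20_800 / 827   # mean: 827M -> 20.8B
median_growth = 406 / 326    # median: 326M -> 406M

# Hypothetical toy samples of downloaded model sizes (NOT real data).
earlier = [100, 200, 300, 400, 500]
later = earlier + [70_000, 400_000]  # a few very large models join the mix

print(f"mean grew {mean_growth:.1f}x, median grew {median_growth:.2f}x")
print(f"toy earlier: mean={mean(earlier):.0f}, median={median(earlier)}")
print(f"toy later:   mean={mean(later):.0f}, median={median(later)}")
```

In the toy sample, two large entries multiply the mean many times over while the median moves by a single step, the same pattern as the reported figures.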
Performance differences between frontier models and smaller systems often narrow rapidly through fine-tuning and task-specific adaptation. On the Hub, models with hundreds of millions of parameters support search, tagging, and document processing workflows, while models in the single-digit billions are widely used for coding, reasoning, and multimodal tasks. As a result, most major model developers now release families of models spanning a range of sizes. The rise of capable small models shifts autonomy closer to the edge, reducing dependency on centralized cloud providers.
Compute, Hardware, and Open Source
Open source AI development is closely linked to hardware trends. Most models are optimized for NVIDIA GPUs, but support for AMD hardware continues to expand. Stability AI model collections now optimize for both NVIDIA and AMD platforms. Libraries increasingly target both, and tooling has improved to make cross-hardware deployment more straightforward. In 2025 Hugging Face launched the Kernel Hub to load and run kernels optimized for NVIDIA and AMD GPUs.
In parallel, Chinese open models are being released with explicit support for domestically developed chips. Alibaba has invested in inference-focused chip architectures designed to fill Chinese data centers with hardware capable of running open source models locally.
While access to compute remains a core necessity across the board for the development and deployment of AI models, open source and open weight models are helping break away from an ecosystem where compute is the be-all and end-all: a growing number of models at all levels of performance push efficiency to costs 10x to 1000x lower than those of the flagship AI models from the largest developers.

Data and Graphic from Hugging Face
Still, the question of infrastructure investment for open source remains urgent. Public funding for data centers capable of training and serving open models has become a growing policy discussion, particularly in Europe and the UK. The gap between the compute resources available to large closed-model companies and those accessible to the open source community continues to shape what is feasible in open development.
Sub-Communities: Robotics
Robotics has emerged as one of the fastest-growing sub-communities on Hugging Face. The numbers are striking: robotics datasets grew from 1,145 in 2024 to 26,991 in 2025, climbing from rank 44 to the single largest dataset category on the Hub in just three years. For comparison, text generation, the second-largest category, had only around 5,000 datasets in 2025.
Data from Hugging Face | Graph and Story by AI World
Community-contributed datasets span everything from household manipulation tasks to autonomous driving. The largest multimodal dataset for spatial intelligence, Learning to Drive (L2D), was released through a LeRobot collaboration with Yaak. Datasets like RoboMIND, with over 107,000 real-world trajectories across 479 distinct tasks and multiple robot embodiments, provide the kind of scale and diversity needed for training generalizable robotic policies.
Hugging Face's acquisition of Pollen Robotics opened sales of open source robots to industry and academic labs as well as everyday hobbyists. LeRobot, Hugging Face's open source robotics library, experienced rapid growth. It provides models, datasets, and tools for real-world robotics in PyTorch, covering imitation learning, reinforcement learning, and vision-language-action models. Over the past year, its GitHub repository stars nearly tripled.

Data from GitHub | Graphic from star-history.com
Sub-Communities: AI for Science
Scientific research has become another particularly active area. Open models and datasets are increasingly used for protein folding, molecular dynamics, drug discovery, and scientific data analysis. All frontier AI companies now have dedicated science teams, though much current focus remains on literature discovery rather than direct experimentation.

Space by Hugging Face | Science Release Heatmap
Community-led projects have formed around shared research goals, often involving hundreds of contributors working across institutions and disciplines. These efforts highlight the role of open source as a mechanism for coordinating large-scale, interdisciplinary work that would be difficult to organize through traditional academic or corporate structures alone.
Looking Forward
The open source AI ecosystem continues to evolve through a combination of global participation, technical specialization, and institutional adoption. Several trends are likely to define the next phase.
The geographic rebalancing of power is accelerating. Western organizations increasingly seek commercially deployable alternatives to Chinese models, creating urgency around efforts like OpenAI's GPT-OSS, AI2's OLMo, and Google's Gemma to offer competitive open options from US and European developers. Whether these efforts can match the adoption momentum of Qwen and DeepSeek will be a defining question of 2026.
The growth of sub-communities in robotics and science suggests that open source AI is expanding beyond language and image generation into the physical and experimental domains. The infrastructure, norms, and coordination mechanisms developed around text and image models are being adapted for new modalities and use cases.
For researchers, developers, companies, and governments, open source remains a foundational layer for building, evaluating, and governing AI systems. With increasing agent deployments, open-source and its interoperability will be key for agents to thrive. Its trajectory over the past year makes one thing clear: the open source ecosystem is where much of the practical work of AI development, adaptation, and deployment takes place, and its influence on the broader AI landscape continues to grow.
Thank you to the Hugging Face community for continuing to build the foundation of the AI ecosystem 🤗






