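Perplexity and CoreWeave Partner to Scale Inference
Perplexity and CoreWeave have signed a multiyear deal under which Perplexity will migrate its next-generation inference workloads to CoreWeave Cloud, running on Nvidia GB200 NVL72 clusters. The deal is another sign that the AI market is shifting from training to inference.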
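Key Points
Strategic partnership between Perplexity and CoreWeave
AI search vendor Perplexity has signed a multiyear agreement with neocloud provider CoreWeave and announced that it will migrate its next-generation AI inference workloads to CoreWeave Cloud. Financial terms were not disclosed.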
High-performance hardware and platform services
Perplexity will run its AI model, Sonar, and its Search API on Nvidia GB200 NVL72 clusters, and will use CoreWeave Kubernetes Services (CKS) and W&B Models for model management and deployment.
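The AI market's shift toward inference
The partnership shows the AI industry moving its center of gravity from model training to inference. OpenAI and Meta are making similar moves: OpenAI has committed to 2 gigawatts of capacity on AWS' Trainium3 and Trainium4 chips, and Meta plans to deploy millions of Nvidia Blackwell and Rubin GPUs for high-volume inference and agentic workloads.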
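Practical value for enterprise customers
CoreWeave CIO Sandy Venugopal says inference matters to enterprise customers because their users expect quick, real-time responses. Omdia analyst Mike Leone adds that vendors such as Perplexity are making AI experiences broadly accessible, which drives a surge in "real-world usage" and raises demand for purpose-built AI computing.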
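Impact Analysis
This news shows that the AI industry's investment focus is shifting from model training to inference operations. As application-layer companies such as Perplexity partner directly with infrastructure-layer providers such as CoreWeave and Nvidia, an ecosystem for large-scale, real-time inference is taking shape. The competition over infrastructure that guarantees service quality, measured in response speed, is intensifying, and the deal highlights how hard it is for small and midsize AI companies to build such infrastructure on their own.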
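Editorial Comment
The Perplexity and CoreWeave partnership is more than an infrastructure purchase. It indicates that the competitiveness of an AI application depends on how fast it can run inference. Going forward, architecture design that optimizes inference cost and latency will matter more than how a trained model is put to use.
The partnership reflects the continuing emphasis on inference and gives CoreWeave a chance to prove itself as an inference provider through Perplexity.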
Original Text
Neocloud provider CoreWeave and AI search vendor Perplexity have agreed to a multiyear deal to scale Perplexity's AI search and inference capabilities. The agreement, financial terms of which were not disclosed, underscores the broad applicability of inferencing and the ongoing shift from AI training to AI inference.

The vendors revealed on March 4 that Perplexity will migrate its next AI inference workloads to CoreWeave Cloud. The partnership relies on Nvidia's GB200 NVL72 clusters to power Perplexity's AI model, Sonar, and its Search API ecosystem. Perplexity will also use CKS (CoreWeave Kubernetes Services) and W&B (Weights & Biases) Models for model management and deployment. CKS and W&B are core components of CoreWeave's AI cloud platform: CKS is a managed service optimized for computationally intensive AI workloads, and W&B Models is a specialized "system of record" for managing the lifecycle of machine learning models.

Perplexity and CoreWeave's deal further shows the shift in the AI market toward inference, the process by which an AI model uses the knowledge or data it acquired during training. Most recently, there have been deals in which vendors are partnering solely for inference. For instance, OpenAI recently committed to using 2 gigawatts of capacity on AWS' Trainium3 and Trainium4 chips, following an expansion of its partnership with the cloud provider. Meta also plans to deploy millions of Nvidia Blackwell and Rubin GPUs to run high-volume inference and agentic workloads.

More Inferencing

"Inference is an ongoing, continuous workload," said Nick Patience, an analyst at Futurum Group, adding that inference is not nonstop. "Everybody in the whole ecosystem believes inference is the bigger opportunity by quite some scale."

While it may seem that an emphasis on inference only benefits vendors -- including AI labs, model makers, and hardware providers -- there is a benefit for enterprise customers using AI in the applications consumers will use, according to Sandy Venugopal, CoreWeave's CIO.

"For enterprise customers, they want to make sure when they're building AI features or products or capabilities for their platforms, for their customers, inference does matter," Venugopal said in an interview. "When somebody comes in and uses AI on your product or platform, they want quick responses. They want to see it in real time."

The new emphasis on inference is also due to AI vendors such as Perplexity making AI-powered experiences accessible to a broader group of people and companies, leading to a surge in "real-world usage," said Mike Leone, an analyst at Omdia, a division of Informa TechTarget. This growth provides an opportunity for vendors like CoreWeave to offer purpose-built AI computing.

Benefits and Challenges

Leone added that CoreWeave's partnership with Perplexity is about the neocloud vendor diversifying its customer base, as its revenue is heavily concentrated in contracts with Microsoft, OpenAI, and Meta.

"Landing an AI application company like Perplexity shows the platform can attract a broader mix of customers with different workload profiles," Leone said. He added that Perplexity, for its part, gets to secure high-performance infrastructure that it does not have to build itself.

CoreWeave is also trying to make itself a more credible inference platform, Patience said.

"Winning a customer like Perplexity is quite a big deal because Perplexity is quite a demanding customer," he said. He added that the APIs will run continuously in production, and if Perplexity is satisfied with CoreWeave, it is less likely to switch to another provider.

"Inference is even more important to Perplexity because that's essentially its business," Patience continued.

However, the challenge for CoreWeave is to prove itself in a market in which hyperscalers can compete with their own in-house custom chips.

"CoreWeave needs to keep proving that a purpose-built AI cloud delivers better performance and economics than what the hyperscalers offer natively," Leone said.

About the Author

Esther Shittu, News Writer, AI Business

Esther Shittu brings four years of expertise covering artificial intelligence technologies and industry trends. As co-host of the "Targeting AI" podcast, she talks to thought leaders and practitioners exploring critical AI developments. Before joining AI Business, she wrote for several publications including the New York Daily News, Bklyner and the Brooklyn Daily Eagle. When she's not diving deep into the world of AI, she spends her time on passion projects and raising her three daughters.