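Neocloud Pioneer CoreWeave Goes All In on Inference
This article reports that CoreWeave, known as a GPU-as-a-service vendor, is evolving again, this time toward a strategy centered on inference.
Key points
Shift in business strategy
CoreWeave, which made its name providing GPU resources as a service, is moving into a new business phase.
Focus on inference
Under the banner "All In on Inference," the company says it will concentrate on the post-training phase of AI models: running them in production and serving inference.
Evolution of cloud infrastructure
The source article describes the company's continued evolution as a neocloud pioneer as "evolving -- again."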
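Impact analysis
The move suggests that, as generative AI moves into practical use and wider adoption, demand is growing for infrastructure optimized for running inference and not only for training models. In the cloud services market, vendors may increasingly differentiate by shifting from general-purpose GPU provisioning to offerings specialized for particular use cases.
Editorial comment
The article body is very short and thin on specific technical content and business-plan details, which limits how deep the analysis can go. Further detailed announcements will be worth watching.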
After making its name as a GPU-as-a-service vendor, CoreWeave is evolving -- again.
Inference is everything.
That aphorism and way of looking at AI infrastructure have been appearing frequently in AI circles lately.
Now, CoreWeave, the cryptocurrency startup turned major neocloud player, with a close relationship with AI chip giant Nvidia, has started to pivot toward one of the fastest-growing trends in AI -- inference.
The vendor operates some 40 AI data centers -- largely populated by Nvidia GPUs -- and serves dozens of major customers, including generative AI vendors OpenAI, Cohere and ElevenLabs; enterprises and tech vendors such as Siemens, Mercado Libre, Salesforce and Databricks; and AI platforms Perplexity, Cursor and Runway.
Putting Inference to Use
"Inference is the way to monetize AI," Chen Goldberg, executive vice president of product and engineering at CoreWeave, said during an online media roundtable earlier this week. "We are seeing that with our customer base, no matter if it's enterprise AI, AI labs or AI platforms, customers are looking for different methods to run inference. That's what we've been doing."
Propelling the demand for inference is the dramatic surge in agentic AI interest. Many AI users are interested in using autonomous agents that lean heavily on the reasoning capabilities of large language models. And reasoning largely relies on inference, with agents drawing new conclusions and acting independently rather than regurgitating information from huge, pretrained LLMs.
"Instead of a single query … we have a new category of agents, which [do] a long-running task. [Agents] can complete more complicated tasks, maybe with multiple queries," Goldberg said.
Applications that are increasingly using agentic AI and inference include coding, engineering, physical AI, call centers and drug discovery, she noted.
Speed and Older GPUs
Meanwhile, CoreWeave is touting recent top results in speed benchmarks on the independent MLPerf Inference benchmark suite from the MLCommons consortium, using Nvidia Grace Blackwell architectures to run two popular, powerful reasoning models: DeepSeek-R1 and OpenAI's smaller open-weight gpt-oss-120b.
That speed is important for extracting the most performance from earlier-generation GPUs, said Shadi Saba, senior director of AI/ML infrastructure at CoreWeave, during the roundtable.
With Nvidia and other chip vendors rapidly releasing newer generations of GPUs, industry observers have raised financial concerns about depreciating GPUs as faster, more capable chips arrive on the market.
"Compared with older generations, the same model will squeeze the most from whatever Nvidia is giving between generations," Saba said, noting that CoreWeave uses its own software stack to optimize performance from GPUs and CPUs, which are becoming more popular for inference tasks.
CoreWeave's strategy of wringing usable production from older GPUs, while also upgrading to the latest chips, is effective, said Steven Dickens, an analyst at HyperFrame Research.
"You've got to look at it as a sort of portfolio construction, in the same way you do your stock portfolio. You want some things that earn you money from dividends, and then you want some high growth stocks," Dickens said, adding that the vendor can provide reliable inference compute with older chips. "The same thing with CoreWeave. They have some H100 chips that are probably three or four years old. Those are still in the portfolio and still earning money."
The strategy, however, isn't unique and is also employed by neocloud competitors including Nebius, Lambda, OVH and QumulusAI.
The Neocloud Market
Dickens said the ability of neocloud vendors to use their software stacks to optimize the performance of older chips and to move workloads to the most cost-effective GPUs and other chips is the vendors' specialty.
"That's the secret sauce of a neocloud, their ability to portfolio manage their GPU fleet and then be able to move workloads to optimize," he said. "Everybody's going to say they want their stuff to run on the latest and greatest. Very few workloads actually need to work on the latest and greatest."
As for the neocloud market landscape, Dickens said it is starting to shake out to a handful of major players.
While there were some 150 neocloud startups 18 months ago, he said he sees that number winnowing down to 10 or so dominant players over the next five years.
"Winner-takes-most is how I see this industry panning out, not winner-takes-all," Dickens said. "It's not going to be that there's no more business for Lambda, Nebius and OVH. There's obviously going to be business for those guys, and CoreWeave is going to be one of those names."
About the Author
Shaun Sutner, Senior News Director, AI Business
Shaun Sutner, a journalist with more than 25 years of daily newspaper experience and 11 years at Informa TechTarget as an editor and writer, directs news coverage for AI Business. He was previously a senior news and features writer covering health IT and HR software at TechTarget and a senior news director overseeing coverage of AI, business analytics, data management and government tech regulation.
Sutner's newspaper career included investigative reporting and covering the Massachusetts State House and politics for the Worcester Telegram & Gazette. He has written about snow sports as a T&G columnist and correspondent for 20 years. Sutner's interests also include tennis, standup paddleboarding, cooking and popular music.