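After HappyHorse, another Alibaba model tops an authoritative benchmark leaderboard
Topping WorldArena
ABot-PhysWorld, a world model from Alibaba, has taken first place on WorldArena, an authoritative benchmark for world models. In less than half a month, two new Alibaba models have topped world-class evaluation leaderboards.
In early April, Alibaba's HappyHorse appeared seemingly out of nowhere and went straight to the top of Artificial Analysis, an authoritative AI model evaluation leaderboard. This time, ABot-PhysWorld has outperformed well-known models including GigaWorld and Google Veo to claim first place on WorldArena.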

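WorldArena is widely regarded as an authoritative benchmark for world models. It was built jointly by eight leading institutions: Tsinghua University, Princeton University, the National University of Singapore, Peking University, the University of Hong Kong, the Chinese Academy of Sciences, Shanghai Jiao Tong University, and the University of Science and Technology of China. It provides a multi-dimensional evaluation framework comprising 16 fine-grained core metrics and 3 real-world application tasks, designed to stress-test embodied world models on perception accuracy, understanding of physical laws, 3D spatial cognition, and the ability to predict actions and carry them out in practice.
On capability metrics, ABot-PhysWorld's lead shows in how deeply it has internalized physical laws and in its long-horizon dynamics prediction. Unlike most models, which can only generate short, static, or merely decorative video clips, ABot-PhysWorld can accurately anticipate how objects move under complex interactions, such as sliding, toppling, stacking, and fluid changes, while keeping multi-step causal logic consistent.
This "reasoning-capable" generation gives ABot-PhysWorld practical value in task planning, anomaly anticipation, and autonomous decision-making, rather than leaving it at the level of a visual demo.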