Sheau Pei's AI Journal

Manus Agentic AI

Manus AI, a state-of-the-art agentic AI, was recently launched. I wanted to learn more about the AI model and the company behind it. Below is an analysis of Manus AI provided by Grok.

Key Points

Overview

Manus, announced recently by a Chinese team, is making waves as a fully autonomous AI agent capable of executing complex, real-world tasks independently. This capability marks it as a significant step forward in AI development, potentially challenging the dominance of Western AI companies.

Revolutionary Features

Manus stands out for its ability to think, plan, and execute tasks without human intervention, such as creating custom websites, devising travel itineraries, and conducting in-depth stock analyses. This level of autonomy is a leap beyond traditional AI assistants, which typically provide information or perform simple actions but struggle with multi-step, dynamic tasks. Its versatility suggests it could be a milestone towards artificial general intelligence (AGI), where AI can perform any task a human can.

Benchmark Comparison

On the GAIA benchmark, which tests AI on real-world tasks requiring reasoning and tool-use, Manus is reported to have achieved state-of-the-art performance, likely surpassing OpenAI's Deep Research (scoring around 67% to 72.57%) and H2O.ai's h2oGPTe Agent (65% accuracy). While exact scores for Manus are not publicly disclosed, industry reports indicate it leads, highlighting its competitive edge in this critical measure of AI capability.

Unexpected Detail

Interestingly, Manus's development in China, backed by local investors, underscores a shift in the global AI landscape, with potential economic implications as seen in recent market reactions, such as a 6% increase in China's AI sector following its announcement.


Survey Note: Detailed Analysis of Manus AI and Its Implications

Introduction

On March 6, 2025, Manus, a general-purpose AI agent developed by the Chinese team Monica.im, was unveiled, quickly capturing attention for its autonomous capabilities and benchmark performance. This survey note provides a comprehensive analysis of Manus, exploring its revolutionary features, benchmark comparisons, and broader implications, based on recent reports and official statements.

Background and Announcement

Manus was introduced as the world's first fully autonomous AI agent, available initially through an invitation-only web preview at manus.im. The launch drew significant interest, with users seeking access codes, and was covered by media outlets such as Newsweek and The Express Tribune. Its development by a low-profile team, backed by Chinese investors, positions it as a notable contender in the global AI race, especially following China's earlier AI breakthrough with DeepSeek in January 2025, described as a "Sputnik moment" for U.S. AI development.

Revolutionary Features of Manus

Manus's revolutionary nature lies in its autonomy and versatility. Unlike traditional AI assistants that provide suggestions or answers, Manus is designed to deliver complete task results through independent execution. Its reported capabilities include creating custom websites, devising travel itineraries, and conducting in-depth stock analyses.

This autonomy is enabled by a multi-signature (multisig) approach, powered by multiple independent models, with plans to open-source parts of the model later in 2025, as reported by Pandaily. The technical philosophy of "less structure more intelligence" suggests that with sufficient data quality, model power, and flexible architecture, capabilities like computer use, deep research, and coding emerge naturally, as noted in the same source.

Benchmark Performance and Comparison

Manus's performance was evaluated using the GAIA (General AI Assistants) benchmark, introduced in 2023 by research teams from Meta AI, Hugging Face, and others, to test AI on real-world tasks requiring reasoning, multi-modality handling, web browsing, and tool-use proficiency. The benchmark consists of 466 questions across three difficulty levels. Human respondents achieved 92% accuracy, while GPT-4 with plugins scored only 15%, as detailed in the GAIA paper on arXiv.
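To make the "accuracy per difficulty level" idea concrete, here is a minimal sketch of how a GAIA-style score could be tallied. It assumes a simplified scoring rule (case- and whitespace-insensitive exact match) and uses made-up sample records; the function names and the data are hypothetical and are not taken from the official GAIA evaluation code.

```python
from collections import defaultdict


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(answer.strip().lower().split())


def accuracy_by_level(records):
    """Return {level: fraction correct} for (level, predicted, expected) records."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for level, predicted, expected in records:
        totals[level] += 1
        if normalize(predicted) == normalize(expected):
            correct[level] += 1
    return {level: correct[level] / totals[level] for level in sorted(totals)}


# Hypothetical sample: (difficulty level, agent answer, reference answer).
sample = [
    (1, "Paris", "paris"),
    (1, "42", "41"),
    (2, " blue  whale", "Blue whale"),
    (3, "unknown", "1969"),
]

print(accuracy_by_level(sample))
```

A headline figure such as "65% accuracy" is this kind of ratio computed over the benchmark's questions, which is why undisclosed per-level scores make a claim hard to verify.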

Reports indicate Manus achieved state-of-the-art (SOTA) performance across all GAIA levels, surpassing OpenAI's models, as claimed on its website (manus.im) and supported by Chinadaily.com.cn. While exact scores are not publicly available, industry analyses suggest it outperforms H2O.ai's h2oGPTe Agent (65% accuracy) and OpenAI's Deep Research, which scored between 67% and 72.57% on GAIA, according to TechRadar and Ars Technica. This positions Manus as a leader, though the lack of specific scores introduces some uncertainty.

Other notable AI agents include Google's Deep Research, Perplexity Deep Research, and Hugging Face's Open Deep Research, but detailed GAIA scores for these were not found in recent reports. DeepSeek, another Chinese model, focuses on LLMs and is not directly comparable as an agent, with its R1 model competing on cost and performance but not autonomy, as per Reuters.

Comparative Analysis Table

To organize the comparison, here is a table summarizing key AI agents and their GAIA benchmark performance where available:

AI Agent           | Developer         | GAIA Score Range    | Notes
-------------------|-------------------|---------------------|------------------------------------------------------------------
Manus              | Monica.im (China) | >72.57% (estimated) | SOTA, autonomous, handles complex tasks, exact score not disclosed
Deep Research      | OpenAI            | 67%-72.57%          | High performance, multi-step research, scores vary by mode
h2oGPTe Agent      | H2O.ai            | 65%                 | Strong competitor, detailed in industry reports
GPT-4 with plugins | OpenAI            | 15%                 | Baseline for comparison, lower on GAIA, as per arXiv
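As a rough illustration of how the figures in this table relate to the 92% human baseline mentioned earlier, the sketch below ranks only the agents with disclosed scores and computes each one's gap to human accuracy. The variable and function names are hypothetical, and Manus is deliberately left without a number because its exact score has not been published.

```python
HUMAN_BASELINE = 92.0  # human accuracy on GAIA, as cited above

# (low, high) reported GAIA score in percent; None means not publicly disclosed.
reported = {
    "Manus": None,
    "Deep Research": (67.0, 72.57),
    "h2oGPTe Agent": (65.0, 65.0),
    "GPT-4 with plugins": (15.0, 15.0),
}


def rank_disclosed(scores):
    """Sort agents with a disclosed score by their best reported figure, highest first."""
    disclosed = {name: rng for name, rng in scores.items() if rng is not None}
    return sorted(disclosed, key=lambda name: disclosed[name][1], reverse=True)


def gap_to_human(scores, baseline=HUMAN_BASELINE):
    """Percentage points between the human baseline and each agent's best reported score."""
    return {
        name: round(baseline - rng[1], 2)
        for name, rng in scores.items()
        if rng is not None
    }


print(rank_disclosed(reported))
print(gap_to_human(reported))
```

Even the best disclosed figure leaves a gap of roughly 19 percentage points to human performance, which is useful context for any claim of state-of-the-art results.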

Economic and Industry Impact

The announcement of Manus has had notable economic implications, with China's AI sector seeing a 6% increase, reflecting investor optimism, as reported by EC Innovations. Analysts predict the global AI agent market could reach $130 billion by 2033, underscoring the potential scale of impact. This development also raises questions about U.S. dominance in AI, especially given the cost-effective approaches of Chinese models like DeepSeek, as noted by CNBC.

Challenges and Future Outlook

While Manus shows promise, challenges include the lack of public leaderboard scores, which limits transparency, and the need for further real-world validation beyond benchmarks. The open-sourcing of parts of Manus later in 2025 could foster collaboration but also raise concerns about control and security. The AI community's focus on agentic workflows, as discussed on Medium, suggests Manus could play a pivotal role in shaping future productivity tools, potentially automating tasks like tax filing and research.

Conclusion

Manus represents a significant advancement in AI, with its autonomous capabilities and leading GAIA benchmark performance positioning it as a potential game-changer. Its comparison to other leading models highlights its competitive edge, though exact metrics remain partially undisclosed. As the AI landscape evolves, Manus's impact on industry and global competition will be closely watched, especially given its Chinese origin and the broader implications for AGI development.

#AI #Agentic AI #China #Manus