Manus Agentic AI
Manus AI, a state-of-the-art agentic AI, was recently launched. I wanted to learn more about the model and the company behind it. Below is an analysis of Manus AI provided by Grok.
Key Points
- Research suggests Manus, a new AI agent from China, is revolutionary for its autonomy in handling complex tasks like website creation and stock analysis, potentially advancing towards AGI.
- It seems likely that Manus outperforms leading models like OpenAI's Deep Research on the GAIA benchmark, though exact scores are not publicly available as of March 7, 2025.
- The evidence leans toward Manus setting a new standard, with industry reports claiming it surpasses previous top performers, amid growing competition in AI.
Overview
Manus, announced recently by a Chinese team, is making waves as a fully autonomous AI agent capable of executing complex, real-world tasks independently. This capability marks it as a significant step forward in AI development, potentially challenging the dominance of Western AI companies.
Revolutionary Features
Manus stands out for its ability to think, plan, and execute tasks without human intervention, such as creating custom websites, devising travel itineraries, and conducting in-depth stock analyses. This level of autonomy is a leap beyond traditional AI assistants, which typically provide information or perform simple actions but struggle with multi-step, dynamic tasks. Its versatility suggests it could be a milestone towards artificial general intelligence (AGI), where AI can perform any task a human can.
Benchmark Comparison
On the GAIA benchmark, which tests AI on real-world tasks requiring reasoning and tool-use, Manus is reported to have achieved state-of-the-art performance, likely surpassing OpenAI's Deep Research (scoring around 67% to 72.57%) and H2O.ai's h2oGPTe Agent (65% accuracy). While exact scores for Manus are not publicly disclosed, industry reports indicate it leads, highlighting its competitive edge in this critical measure of AI capability.
Unexpected Detail
Interestingly, Manus was developed in China and backed by local investors, which underscores a shift in the global AI landscape. The economic implications are already visible in market reactions, such as a reported 6% increase in China's AI sector following the announcement.
Survey Note: Detailed Analysis of Manus AI and Its Implications
Introduction
On March 6, 2025, Manus, a general-purpose AI agent developed by the Chinese team Monica.im, was unveiled, quickly capturing attention for its autonomous capabilities and benchmark performance. This survey note provides a comprehensive analysis of Manus, exploring its revolutionary features, benchmark comparisons, and broader implications, based on recent reports and official statements.
Background and Announcement
Manus was introduced as the world's first fully autonomous AI agent, available initially through an invitation-only web preview at manus.im. The launch coincided with significant interest, with users seeking access codes, and was highlighted in media outlets like Newsweek and The Express Tribune. Its development by a low-profile team, backed by Chinese investors, positions it as a notable contender in the global AI race, especially following China's earlier AI breakthrough with DeepSeek in January 2025, described as a "Sputnik moment" for U.S. AI development.
Revolutionary Features of Manus
Manus's revolutionary nature lies in its autonomy and versatility. Unlike traditional AI assistants that provide suggestions or answers, Manus is designed to deliver complete task results through independent execution. Its capabilities include:
- Creating custom websites through a step-by-step process, as demonstrated in a video on its website.
- Devising itineraries for trips, such as to Japan, and providing in-depth analyses, like Tesla's stock performance.
- Assisting in educational content creation, such as interactive courses for middle-school teachers.
- Comparing insurance policies and aiding in business-to-business supplier sourcing.
According to Pandaily, this autonomy comes from what it describes as a multi-signature (multisig) system powered by multiple independent models, and the team plans to open-source parts of the model later in 2025. The same source reports the team's technical philosophy of "less structure, more intelligence": with sufficient data quality, model power, and a flexible architecture, capabilities like computer use, deep research, and coding are expected to emerge naturally.
Benchmark Performance and Comparison
Manus's performance was evaluated using the GAIA (General AI Assistants) benchmark, introduced in 2023 by research teams from Meta AI, Hugging Face, and others to test AI on real-world tasks requiring reasoning, multi-modality handling, web browsing, and tool-use proficiency. The benchmark consists of 466 questions across three difficulty levels. Human respondents achieved 92% accuracy, while GPT-4 with plugins scored only 15%, as detailed in the GAIA paper on arXiv.
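To make the headline percentages concrete, here is a minimal sketch of how accuracy on a benchmark split into difficulty levels can be computed. It is not GAIA's official scorer: the `normalize` helper, the record layout, and the toy answers are all assumptions made for illustration, and the real benchmark applies its own answer-matching rules.

```python
from collections import defaultdict

def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(answer.strip().lower().split())

def accuracy_by_level(records):
    """Compute per-level and overall accuracy.

    records: iterable of (level, predicted_answer, expected_answer) tuples.
    Returns a dict mapping each level to its accuracy, plus an "overall" key.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for level, predicted, expected in records:
        total[level] += 1
        if normalize(predicted) == normalize(expected):
            correct[level] += 1
    scores = {level: correct[level] / total[level] for level in total}
    scores["overall"] = sum(correct.values()) / sum(total.values())
    return scores

# Hypothetical toy data: (difficulty level, agent answer, reference answer).
sample = [
    (1, "Paris", "paris"),
    (1, "42", "42"),
    (2, "blue whale", "Blue  Whale"),
    (2, "1999", "2001"),
    (3, "unknown", "Tokyo"),
]

scores = accuracy_by_level(sample)
for key, value in scores.items():
    print(f"{key}: {value:.0%}")
```

A single overall figure, such as the 92% human score or the 15% GPT-4 score, hides how results are distributed across the three levels, which is why claims of leading "across all GAIA levels" are stronger than a claim about the average alone.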
Reports indicate Manus achieved state-of-the-art (SOTA) performance across all GAIA levels, surpassing OpenAI's models, as claimed on its website (manus.im) and supported by Chinadaily.com.cn. While exact scores are not publicly available, industry analyses suggest it outperforms H2O.ai's h2oGPTe Agent (65% accuracy) and OpenAI's Deep Research, which scored between 67% and 72.57% on GAIA, according to TechRadar and Ars Technica. This positions Manus as a leader, though the lack of specific scores introduces some uncertainty.
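One plausible reason a single agent is quoted with a score range, such as the 67% to 72.57% above, is that the result depends on how answers are sampled: scoring one attempt per question usually gives a lower figure than taking a majority vote over many attempts. The sketch below illustrates that difference under stated assumptions. The function names and the toy answers are hypothetical, and this is not a description of how OpenAI or Manus actually ran their evaluations.

```python
from collections import Counter

def majority_vote(samples):
    """Return the most common answer among several sampled attempts."""
    return Counter(samples).most_common(1)[0][0]

def single_attempt_accuracy(runs, expected):
    """Score only the first sampled answer for each question."""
    hits = sum(1 for samples, truth in zip(runs, expected) if samples[0] == truth)
    return hits / len(expected)

def consensus_accuracy(runs, expected):
    """Score the majority-vote answer for each question."""
    hits = sum(1 for samples, truth in zip(runs, expected) if majority_vote(samples) == truth)
    return hits / len(expected)

# Hypothetical toy data: three sampled answers per question.
runs = [
    ["A", "A", "B"],  # first attempt and majority both correct
    ["C", "D", "D"],  # first attempt wrong, majority correct
    ["E", "F", "G"],  # no attempt is correct
    ["X", "X", "X"],  # all attempts correct
]
expected = ["A", "D", "H", "X"]

single = single_attempt_accuracy(runs, expected)
consensus = consensus_accuracy(runs, expected)
print(f"single attempt: {single:.0%}, consensus: {consensus:.0%}")
```

When comparing agents, it matters that both scores were produced in the same mode. A consensus score for one system set against a single-attempt score for another would overstate the gap.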
Other notable AI agents include Google's Deep Research, Perplexity Deep Research, and Hugging Face's Open Deep Research, but detailed GAIA scores for these were not found in recent reports. DeepSeek, another Chinese AI developer, focuses on large language models (LLMs) rather than agents, so it is not directly comparable: its R1 model competes on cost and performance, not autonomy, as per Reuters.
Comparative Analysis Table
To organize the comparison, here is a table summarizing key AI agents and their GAIA benchmark performance where available:
| AI Agent | Developer | GAIA Score Range | Notes |
|---|---|---|---|
| Manus | Monica.im (China) | >72.57% (Estimated) | SOTA, autonomous, handles complex tasks, exact score not disclosed |
| Deep Research | OpenAI | 67%-72.57% | High performance, multi-step research, scores vary by mode |
| h2oGPTe Agent | H2O.ai | 65% | Strong competitor, detailed in industry reports |
| GPT-4 with plugins | OpenAI | 15% | Baseline for comparison, lower on GAIA, as per arXiv |
Economic and Industry Impact
The announcement of Manus has had notable economic implications: China's AI sector saw a 6% increase, reflecting investor optimism, as reported by EC Innovations. Analysts predict the global AI agent market could reach $130 billion by 2033, underscoring the potential scale of impact. This development also raises questions about U.S. dominance in AI, especially given the cost-effective approaches of Chinese models like DeepSeek, as noted by CNBC.
Challenges and Future Outlook
While Manus shows promise, two challenges stand out: the lack of public leaderboard scores, which limits transparency, and the need for real-world validation beyond benchmarks. Open-sourcing parts of Manus later in 2025 could foster collaboration but also raise concerns about control and security. The AI community's focus on agentic workflows, as discussed in a Medium article, suggests Manus could play a pivotal role in shaping future productivity tools, potentially automating tasks like tax filing and research.
Conclusion
Manus represents a significant advancement in AI, with its autonomous capabilities and leading GAIA benchmark performance positioning it as a potential game-changer. Its comparison to other leading models highlights its competitive edge, though exact metrics remain partially undisclosed. As the AI landscape evolves, Manus's impact on industry and global competition will be closely watched, especially given its Chinese origin and the broader implications for AGI development.
Key Citations
- What is Manus? China's World-First Fully Autonomous AI Agent Explained - Newsweek
- Chinese AI Agent Manus unveiled, first fully autonomous AI agent - The Express Tribune
- China releases new AI agent Manus; claims to outperform OpenAI models - Investing.com
- Manus AI launched in China, challenging GPT and DeepSeek - Chinadaily.com.cn
- Chinese Team Unveils AI Agent, Manus - Pandaily
- New Chinese AI Agent Manus Is Said to Beat OpenAI's Deep Research - Yicai Global
- 'Another DeepSeek moment'? Chinese start-up launches new AI agent, sparking widespread attention - Global Times
- Another DeepSeek moment? General AI agent Manus shows ability to handle complex tasks - South China Morning Post
- Global First Universal AI Agent: Manus, a Chinese AI Product, Launches Internal Testing and Becomes a Hit - Aibase
- GAIA: a benchmark for General AI Assistants - arXiv
- OpenAI's Deep Research smashes records for the world's hardest AI exam, with ChatGPT o3-mini and DeepSeek left in its wake - TechRadar
- Hugging Face clones OpenAI’s Deep Research in 24 hours - Ars Technica
- DeepSeek rushes to launch new AI model as China goes all in - Reuters
- What is DeepSeek - and why is everyone talking about it? - BBC
- DeepSeek - Wikipedia
- The Battle of AI Agents: Comparing Real World Performance Using Benchmarking - Medium