Chinese technology giant Baidu is challenging leading AI models with its most recent releases, ERNIE 4.5 and ERNIE X1.
Key points
- ERNIE 4.5 is Baidu’s latest multimodal foundation model. According to Baidu, it surpasses GPT-4.5 on multiple benchmarks while costing only 1% of GPT-4.5’s price.
- ERNIE X1 is an advanced reasoning model with multimodal capabilities, offering performance comparable to DeepSeek-R1 at only half the price.
- ERNIE Bot will be made freely available to the public starting April 1, ahead of its original release schedule.
Both models are available to all ERNIE Bot users through the chatbot interface on its official website (in Chinese). For enterprises and developers, ERNIE 4.5 is available through APIs on Baidu AI Cloud’s Qianfan platform, with ERNIE X1 to follow soon. Both models will also be integrated into Baidu’s ecosystem, including Baidu Search and the Wenxiaoyan app. Currently, there is no option to download the models for local use.
Baidu announced that the ERNIE 4.5 series will be open-sourced on June 30, 2025, with source code and model weights available to developers. However, the company has not made any similar announcements or comments regarding the ERNIE X1 model.
ERNIE 4.5: advancing multimodal AI
ERNIE 4.5 is Baidu’s latest multimodal foundation model, designed to efficiently process and combine text, images, audio, and video. It offers strong abilities in language understanding, content generation, reasoning, and memory, making it useful for various applications.
Baidu has refined ERNIE 4.5 to reduce errors, improve logical reasoning, and strengthen its coding skills. The model can also interpret internet memes and satire, which shows its grasp of context.
According to Baidu, ERNIE 4.5 outperforms GPT-4.5 in several benchmarks while being offered at only 1% of GPT-4.5’s cost (see the next picture).

The next picture illustrates its text capabilities.

ERNIE 4.5’s low price makes it an attractive choice for developers and businesses seeking powerful AI solutions at minimal cost.
Based on the information available, ERNIE 4.5’s advanced capabilities come from a combination of technological improvements:
- FlashMask: A dynamic attention-masking method that controls which parts of the input the model attends to, helping it focus on the most relevant input data.
- Heterogeneous multimodal mixture-of-experts: Specialized sub-models for processing text, images, audio, and video efficiently.
- Spatiotemporal compression: Compactly represents data across space and time.
- Knowledge-centric training data: Focuses on high-quality, accurate data to enhance reasoning and understanding.
- Self-feedback post-training: Improves performance by refining its outputs through its own feedback.
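To make the mixture-of-experts idea in the list above more concrete, here is a minimal sketch of generic top-k expert routing. It is not Baidu's architecture, which this article does not describe in detail. The names `route_top_k`, `moe_forward`, and `EXPERTS` are hypothetical, and the "experts" are toy functions standing in for specialized sub-models.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

# Hypothetical experts: toy functions standing in for specialized sub-models.
EXPERTS = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x - 3,
    lambda x: x ** 2,
]

def moe_forward(x, gate_scores, k=2):
    """Combine the outputs of the selected experts, weighted by the gate."""
    return sum(w * EXPERTS[i](x) for i, w in route_top_k(gate_scores, k))

if __name__ == "__main__":
    scores = [0.0, 2.0, 1.0, -1.0]  # illustrative gate scores, one per expert
    print(route_top_k(scores, k=2))
    print(moe_forward(3.0, scores, k=2))
```

Only the selected experts run for a given input, which is why this kind of design can add capacity without a matching increase in compute per request.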
ERNIE X1: the deep-thinking model
ERNIE X1 is Baidu’s first multimodal reasoning model, focusing on advanced reasoning and problem-solving. It excels in tasks like Chinese knowledge-based Q&A, manuscript writing, and logical reasoning. Its capabilities also include image analysis, code interpretation, and academic research, making it a powerful tool for professionals and researchers across diverse fields.
Built with methods such as Progressive Reinforcement Learning and End-to-End Training, ERNIE X1 offers performance comparable to DeepSeek-R1 at about half the price, according to Baidu.
Comparing leading AI models
A precise, comprehensive comparison of ERNIE 4.5, ERNIE X1, DeepSeek-R1, and GPT-4.5 is difficult because the field moves quickly and standardized benchmark data is limited. The table below summarizes the key attributes of these models.
Feature | ERNIE 4.5 | ERNIE X1 | DeepSeek-R1 | GPT-4.5 |
---|---|---|---|---|
Multimodal | Yes | Yes | No | Yes |
Key strengths | General purpose, optimized for document analysis, complex reasoning and mathematical tasks | Deep thinking reasoning and complex problem-solving | General purpose, strong reasoning and coding capabilities | General purpose, advanced multimodal capabilities, widely adopted |
Pricing | Input: ~$0.55/1M tokens; Output: ~$2.20/1M tokens | Input: ~$0.28/1M tokens; Output: ~$1.10/1M tokens | Varies by model and usage; estimated to be lower than GPT-4 | Relatively high |
Reasoning Abilities | Improved logical reasoning | Designed for advanced logical inference and problem-solving | Strong reasoning capabilities | Advanced reasoning capabilities |
Availability | Available via Qianfan MaaS platform; Integration into Baidu products | Coming soon to Qianfan MaaS platform; Integration into Baidu products | Available on major cloud platforms | Available via APIs and various platforms |
Geographic focus | Primarily focused on the Chinese market, but Baidu is expanding internationally | Primarily focused on the Chinese market, but Baidu is expanding internationally | Focused on the Chinese market, but its open-source nature gives it broader accessibility | Global |
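The per-token prices in the table translate into workload costs with simple arithmetic. The sketch below uses only the approximate ERNIE figures from the table. The function name `estimate_cost` and the workload of 10 million input tokens and 2 million output tokens are hypothetical choices made for illustration.

```python
# Approximate prices in USD per 1M tokens, copied from the comparison table.
PRICES = {
    "ERNIE 4.5": {"input": 0.55, "output": 2.20},
    "ERNIE X1": {"input": 0.28, "output": 1.10},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of a workload from per-1M-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000

# Hypothetical monthly workload, chosen only for illustration.
WORKLOAD = {"input_tokens": 10_000_000, "output_tokens": 2_000_000}

if __name__ == "__main__":
    for model in PRICES:
        print(f"{model}: ${estimate_cost(model, **WORKLOAD):.2f}")
```

For this workload the estimate comes to about $9.90 for ERNIE 4.5 and about $5.00 for ERNIE X1, reflecting that the table lists ERNIE X1 at roughly half of ERNIE 4.5's per-token price.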
Conclusion
Baidu’s launch of ERNIE 4.5 and ERNIE X1 brings advanced AI to a wider audience of developers, researchers, and AI enthusiasts. By offering these high-performance models at accessible prices, Baidu positions itself as a strong competitor to U.S. companies like OpenAI and Google.