Microsoft invests in Mistral AI and brings Mistral Large to Azure

Microsoft and Mistral AI, a French startup, have announced a multi-year partnership to accelerate AI innovation. The collaboration brings Mistral Large, Mistral AI's flagship LLM, to Microsoft's Azure cloud computing platform.

As part of the agreement, Microsoft is investing €15 million in Mistral AI to help it grow globally and pursue new business opportunities. The deal faces EU scrutiny over competition in the AI market.

Through this partnership, Mistral AI gains access to Microsoft’s Azure AI infrastructure for developing and deploying its next-generation large language models (LLMs).

Mistral AI is an AI company that specializes in generative AI and LLMs. It was founded in April 2023 by former employees of Meta Platforms and Google DeepMind, and by December 2023 it was valued at more than $2 billion, according to the Financial Times.

The company developed Mistral Large, an LLM that handles a wide range of language-based tasks involving reasoning, knowledge, coding, and multilingual communication. Unlike earlier Mistral AI releases, Mistral Large isn’t open-source. The model is available through the Mistral AI platform (la Plateforme) and Microsoft Azure.

Mistral Large wants to compete with GPT-4

Mistral Large is a powerful text generation model, capable of performing multilingual tasks such as text comprehension, transformation, and code generation. It is based on the latest breakthroughs in natural language processing (NLP) and deep learning, and has been trained on a massive corpus of data from various domains and languages.

As the comparison below shows, Mistral Large outperforms most models on MMLU and ranks second among models generally available through an API, behind only GPT-4.

Comparison of GPT-4, Mistral Large (pre-trained), Claude 2, Gemini Pro 1.0, GPT-3.5 and LLaMA 2 70B on MMLU (Measuring Massive Multitask Language Understanding) (source: Mistral AI blog)
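
For readers unfamiliar with how such a comparison is scored: MMLU is a multiple-choice benchmark, and the reported number is the share of questions answered correctly. The sketch below illustrates that calculation only. The function name `mmlu_style_accuracy` and the sample records are invented for illustration; they are not benchmark data and not Mistral AI's evaluation code.

```python
from collections import defaultdict


def mmlu_style_accuracy(records):
    """Score multiple-choice answers.

    records: iterable of (subject, predicted_letter, correct_letter).
    Returns (overall_accuracy, {subject: accuracy}).
    """
    per_subject = defaultdict(lambda: [0, 0])  # subject -> [correct, total]
    for subject, predicted, correct in records:
        per_subject[subject][1] += 1
        # Compare answer letters, ignoring case and stray whitespace.
        if predicted.strip().upper() == correct.strip().upper():
            per_subject[subject][0] += 1

    subject_acc = {s: c / t for s, (c, t) in per_subject.items()}
    total_correct = sum(c for c, _ in per_subject.values())
    total = sum(t for _, t in per_subject.values())
    overall = total_correct / total if total else 0.0
    return overall, subject_acc


# Invented sample answers, purely illustrative.
sample = [
    ("history", "A", "A"),
    ("history", " c", "C"),
    ("physics", "B", "D"),
    ("physics", "d", "D"),
]
overall, by_subject = mmlu_style_accuracy(sample)
print(overall)     # 0.75
print(by_subject)  # {'history': 1.0, 'physics': 0.5}
```

Published results may aggregate differently (for example, averaging per-subject scores rather than pooling all questions), so treat this as a rough picture of what the percentages mean.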

Mistral Large capabilities

Mistral Large comes with new capabilities and strengths:

  • It is natively fluent in English, French, Spanish, German, and Italian.
  • It can retrieve accurate information from large documents thanks to its 32K-token context window.
  • It can follow instructions precisely, which allows developers to design their own moderation policies. Mistral AI used this feature to set up system-level moderation for le Chat.
  • It supports native function calling: the model can request calls to functions defined in your code. Combined with the “constrained output mode” available on la Plateforme, this makes it easier to build and update applications, even for large projects.
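
To make the function calling capability concrete, here is a minimal sketch of the application side: the model returns a structured request naming a function and its arguments, and your code runs it. This does not call any Mistral AI or Azure API. The JSON shape of the request, the `get_order_status` function, and the `dispatch_tool_call` helper are all assumptions made for illustration.

```python
import json


# Hypothetical function the application exposes to the model.
def get_order_status(order_id: str) -> dict:
    orders = {"A100": "shipped", "B200": "processing"}
    return {"order_id": order_id, "status": orders.get(order_id, "unknown")}


# Registry mapping function names to Python callables.
TOOLS = {"get_order_status": get_order_status}


def dispatch_tool_call(raw_call: str) -> str:
    """Parse a model-issued function call (JSON text) and run the matching function."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    # In this assumed format, arguments arrive as a JSON-encoded string.
    args = json.loads(call["arguments"])
    result = TOOLS[name](**args)
    # The JSON result would be sent back to the model as the function's output.
    return json.dumps(result)


# Stand-in for a model response; built locally, no API request is made.
model_output = json.dumps(
    {"name": "get_order_status", "arguments": json.dumps({"order_id": "A100"})}
)
print(dispatch_tool_call(model_output))  # {"order_id": "A100", "status": "shipped"}
```

Keeping an explicit registry means the model can only trigger functions the application has chosen to expose, and unknown names fail loudly instead of being executed.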

Mistral Small

Alongside Mistral Large, Mistral AI released Mistral Small, a model optimized for low latency and cost. Mistral Small outperforms the earlier Mixtral 8x7B model.

Le Chat – the new AI chatbot

Le Chat is a new AI chatbot from Mistral AI that lets users interact with the company's different models, such as Mistral Large, Mistral Small, and Mistral Next.

It supports five languages (English, French, Spanish, German, and Italian) and can answer questions, tell jokes, play games, and generate content such as poems, stories, and code.

Le Chat is available online.

Evaluation results

Mistral Large’s performance was compared with that of other leading LLMs on the following tasks:

I. Reasoning and knowledge: Mistral Large has strong reasoning abilities. The next figure shows the performance of the pretrained models on standard benchmarks.

Mistral Large performance on widespread common sense, reasoning and knowledge benchmarks (source: Mistral AI blog)

II. Multi-lingual capacities: Mistral Large handles five languages natively and outperforms LLaMA 2 70B on the HellaSwag, ARC Challenge and MMLU benchmarks in French, German, Spanish and Italian.

Mistral Large multi-lingual capacities (source: Mistral AI blog)

III. Math & Coding: Mistral Large does well in coding and math tasks. The table below compares it with some of the leading LLMs on a set of popular coding and math benchmarks.

Mistral Large performance on popular coding and math benchmarks (source: Mistral AI blog)

Conclusion

The partnership between Microsoft and Mistral AI is a significant development for the AI industry: it makes Mistral Large, which ranks just behind GPT-4 on MMLU among API-available models, accessible to developers, researchers, and customers through Azure.
