Multimodal
-
Qwen3 by Alibaba, a new open-source model with hybrid reasoning
Released on April 28, 2025, Qwen3 is an open-source LLM with hybrid reasoning that extends…
-
Meta’s Llama 4, advanced multimodal models with long context
Meta released Llama 4, a new suite of AI models that offers advanced…
-
InfiniteYou, photo customization with identity preservation
ByteDance introduced InfiniteYou (InfU), a powerful model that allows flexible photo modifications based…
-
Gemma 3 matches 98% of DeepSeek-R1 and runs on a single GPU or TPU
Gemma 3, Google’s latest AI model, offers multimodal capabilities and achieves 98% of…
-
Baidu released two advanced LLMs, ERNIE 4.5 and ERNIE X1
Chinese technology giant Baidu is challenging leading AI models with its most recent…
-
Cosmos simulates physical worlds for training AI systems
NVIDIA has released the Cosmos World Foundation Model Platform, an advanced AI toolkit…
-
Alibaba released Qwen2.5 with more than 100 open-source AI models
Alibaba Cloud recently announced the release of over 100 open-source Qwen 2.5 multimodal…
-
LLaMA-Omni lets you speak to LLMs and get instant responses
LLaMA-Omni is an open-source AI tool designed for real-time voice interaction with large…
-
Transfusion, a multimodal model for text and image generation
Transfusion is a multimodal AI tool designed to handle both text and images…
-
TinyChart, a powerful AI that understands charts
TinyChart is an open-source multimodal large language model specifically designed for chart understanding…