Multimodal
-
TinyChart, a powerful AI that understands charts
TinyChart is an open-source multimodal large language model specifically designed for chart understanding.…
-
Idefics2 by Hugging Face, a strong multimodal model with 8B parameters
Hugging Face has launched Idefics2, an 8B-parameter multimodal model that rivals the…
-
AniPortrait generates animations from portraits and audio
AniPortrait is a new framework that creates dynamic and expressive animated portraits from…
-
Google DeepMind’s SIMA, a generalist AI gaming partner
Google DeepMind’s new Scalable Instructable Multiworld Agent (SIMA) is a cutting-edge AI that…
-
NVIDIA Canary 1B, a speech recognition and translation model
Canary is a new multilingual speech-to-text recognition and translation model from the NVIDIA…
-
StreamDiffusion is a new AI model for real-time image generation
StreamDiffusion is a new diffusion pipeline specifically tailored for real-time image generation. It…
-
TaskWeaver is a smart planning agent for data analytics
Microsoft’s TaskWeaver is a code-first framework for planning and executing complex data analytics…
-
Google launches Gemini, its most advanced AI model
On December 6, 2023, Google launched Gemini, a cutting-edge multimodal AI model that…
-
DiagrammerGPT generates better diagrams using LLMs
DiagrammerGPT is a new framework that uses large language models (LLMs) to generate…
-
Introducing Lemur-70B and Lemur-70B-Chat: the open-source models harmonizing text and code for powerful language agents
Lemur and Lemur-Chat are openly accessible language models optimized for both natural language…