Reinforcement Learning
-

DeepSeekMath-V2, the AI that can check its own math reasoning
DeepSeekMath-V2 (paper, model) is a self-verifiable mathematical reasoning model developed by DeepSeek-AI. It…
-

Absolute Zero – AI training without any human data
Imagine an AI that improves without human-labeled data or curated datasets. Just pure…
-

SWE-RL enhances LLMs' coding capabilities
Meta introduces SWE-RL, marking the first time reinforcement learning has been used to…
-

Google DeepMind’s SIMA, a generalist AI gaming partner
Google DeepMind’s new Scalable Instructable Multiworld Agent (SIMA) is a cutting-edge AI that…
-

TaskWeaver is a smart planning agent for data analytics
Microsoft’s TaskWeaver is a code-first framework for planning and executing complex data analytics…
-

Google launches Gemini, its most advanced AI model
On December 6, 2023, Google launched Gemini, a cutting-edge multimodal AI model that…
-

AlphaDev: how deep reinforcement learning can find better sorting algorithms than humans
Google DeepMind introduces AlphaDev, a new deep reinforcement learning (DRL) agent that can…
-

Stability AI launches StableVicuna, the first open-source chatbot based on human feedback
Stability AI has introduced StableVicuna, the first large-scale open-source chatbot that has been…
-

Metacognitive reinforcement learning: do humans learn the way that machines do?
Researchers at the Max Planck Institute for Intelligent Systems in Stuttgart, Germany, have…
-

DeepMind’s latest AI research reduces energy usage for cooling buildings
A research team from DeepMind and Google proposes a new approach involving ML…