The Technology Innovation Institute (TII) launched Falcon 180B, a scaled-up version of Falcon 40B. With 180 billion parameters, it is the largest openly available large language model (LLM) to date, performing on par with Google’s PaLM 2 Large and typically landing somewhere between OpenAI’s GPT-3.5 and GPT-4, depending on the benchmark.
The model is available for both research and commercial use on the Hugging Face Hub (both base and chat variants), and you can try it out in the Falcon Chat Demo Space. The full license text is hosted on the Hub.
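For reference, here is a minimal sketch of loading the base model with the transformers library. The model id matches the Hub release; the dtype, device placement, and toy prompt are illustrative choices, and running the full 180B model in practice requires multiple high-memory GPUs plus the accelerate package for the `device_map="auto"` sharding shown here.

```python
# A minimal sketch of loading Falcon 180B with transformers. It assumes you
# have accepted the license on the Hub and have substantial multi-GPU memory;
# the dtype and generation settings are illustrative, not prescribed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-180B"  # chat variant: "tiiuae/falcon-180B-chat"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory relative to float32
    device_map="auto",           # shard layers across available GPUs (needs accelerate)
)

inputs = tokenizer("Falcon 180B is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```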
Falcon 180B is the highest-performing open pre-trained LLM on the Hugging Face Open LLM Leaderboard. It outperforms other open models across the leaderboard’s reasoning, knowledge, and question-answering benchmarks.
Falcon 180B inherits the multiquery attention technique from Falcon 40B, but also introduces some other improvements, such as:
- FlashAttention (a memory-efficient algorithm that computes exact attention with far fewer reads and writes to GPU memory)
- Grouped-query attention (a compromise between multiquery and multi-head attention that uses an intermediate number of key-value heads; see the sketch after this list)
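To make the idea concrete, here is a minimal, self-contained sketch of grouped-query attention in PyTorch. The layer sizes and random weights are illustrative placeholders, not Falcon 180B’s actual configuration, and the broadcast-based implementation is just one way to express the computation.

```python
# Grouped-query attention (GQA) sketch: n_heads query heads share a smaller
# set of n_kv_heads key/value heads. Sizes below are toy values.
import torch
import torch.nn.functional as F

def grouped_query_attention(x, wq, wk, wv, n_heads, n_kv_heads):
    """x: (batch, seq, dim). n_heads query heads share n_kv_heads key/value heads."""
    b, s, d = x.shape
    head_dim = d // n_heads
    group = n_heads // n_kv_heads                # query heads per key/value head

    q = (x @ wq).view(b, s, n_heads, head_dim).transpose(1, 2)     # (b, H,   s, hd)
    k = (x @ wk).view(b, s, n_kv_heads, head_dim).transpose(1, 2)  # (b, Hkv, s, hd)
    v = (x @ wv).view(b, s, n_kv_heads, head_dim).transpose(1, 2)

    # Each key/value head is shared by a whole group of query heads.
    k = k.repeat_interleave(group, dim=1)        # (b, H, s, hd)
    v = v.repeat_interleave(group, dim=1)

    attn = F.softmax(q @ k.transpose(-2, -1) / head_dim ** 0.5, dim=-1)
    return (attn @ v).transpose(1, 2).reshape(b, s, d)

# Toy usage: 8 query heads sharing 2 key/value heads.
dim, n_heads, n_kv_heads = 64, 8, 2
head_dim = dim // n_heads
x = torch.randn(1, 10, dim)
wq = torch.randn(dim, n_heads * head_dim)
wk = torch.randn(dim, n_kv_heads * head_dim)
wv = torch.randn(dim, n_kv_heads * head_dim)
out = grouped_query_attention(x, wq, wk, wv, n_heads, n_kv_heads)  # (1, 10, 64)
```

With `n_kv_heads = 1` this reduces to multiquery attention; with `n_kv_heads = n_heads` it becomes ordinary multi-head attention, which is why it is described as a compromise between the two.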
Using Amazon SageMaker, Falcon 180B was trained on 3.5 trillion tokens, drawn mostly from TII’s RefinedWeb dataset, in the longest single-epoch pretraining run of any openly documented model.
How good is Falcon 180B?
Falcon 180B surpasses Llama 2 and GPT-3.5 on a range of natural language understanding and reasoning benchmarks. It also matches the performance of PaLM 2 Large, a proprietary LLM from Google that powers Bard, on several challenging evaluations.
With a score of 68.74 on the Hugging Face Open LLM Leaderboard, Falcon 180B is the highest-scoring openly released pre-trained LLM. It beats Google’s PaLM as well as Meta’s Llama 2 70B, which scores 67.35 (see the table below).
| Model | Size | Leaderboard score |
|---|---|---|
| Falcon | 180B | 68.74 |
| Llama 2 | 70B | 67.35 |
| LLaMA | 65B | 64.23 |
| Falcon | 40B | 61.48 |
| MPT | 30B | 56.15 |
Here is a table summarizing the key differences between Falcon 180B and Llama 2 70B:

| Feature | Falcon 180B | Llama 2 70B |
|---|---|---|
| Number of parameters | 180 billion | 70 billion |
| Dataset size | 3.5 trillion tokens | 2 trillion tokens |
| Pre-training compute | ~7 million GPU hours | ~1.7 million GPU hours |
| Number of GPUs used | Up to 4096 | Up to 1024 |
| License | Falcon-180B TII License (Apache 2.0-based, with use restrictions) | Llama 2 Community License |
As we can see, Falcon 180B was trained with roughly 4 times the compute (~7 million vs ~1.7 million GPU hours, on up to 4096 GPUs) and about 1.75 times the data of Llama 2 70B. The result is a larger model that outperforms Llama 2 on open benchmarks.
Conclusion
Falcon 180B is a powerful open-access LLM that can generate high-quality text and answer complex questions.
It can be used for research and commercial purposes without licensing fees, subject to the terms of the Falcon-180B TII License, making it a more accessible option than API-only models such as GPT-4.
The model is still evolving, and the community can further improve Falcon 180B by fine-tuning it on specific domains and datasets, as sketched below.
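One common way to do such fine-tuning without updating all 180B parameters is parameter-efficient fine-tuning; the sketch below attaches LoRA adapters with the peft library. LoRA, the target module name, and the hyperparameters are our own illustrative choices rather than anything the Falcon release prescribes, and the training loop itself is omitted.

```python
# A hedged sketch of attaching LoRA adapters for domain fine-tuning via peft.
# Rank, alpha, and the target module name are illustrative guesses; verify
# module names against the actual Falcon model before training.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-180B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,
    target_modules=["query_key_value"],   # fused attention projection in Falcon-style models
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # only the small adapter weights are trainable
```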
Learn more:
Release announcement: “Spread Your Wings: Falcon 180B is here” (on Hugging Face)