Google AI researcher Tal Schuster has announced a new technology called “Confident Adaptive Language Modeling” (CALM) for speeding up language models.
CALM is a framework that dynamically allocates compute during text generation, using confidence estimates to predict how much computation each output actually needs.
The research team tested the new system on several NLP tasks, including text summarization, machine translation, and question answering, and showed that the framework can cut computation time by up to a factor of three while preserving output quality. The results were presented at the NeurIPS 2022 conference.

LLM and CALM
The most recent advancements in AI have made it possible to build intelligent systems with a deeper and more sophisticated understanding of language than ever before.
The performance of Large Language Models (LLMs) such as GPT-3, T5, and PaLM has improved dramatically. These models have begun to mimic human abilities, learning to read, summarize, and generate text.

Numerous studies have demonstrated that LLMs tend to perform better at larger scale. However, a larger model, trained on vast amounts of data, can be too expensive to run and wasteful for many tasks. In this context, CALM introduces the idea of allocating computational resources according to the difficulty of each prediction and the amount of computation it actually requires.
Methodology
The team employed an eight-layer T5 encoder-decoder architecture to conduct experiments on the tasks above.
The encoder first builds a dense representation of the input text; the decoder then produces the output tokens one after another. Each transformer layer combines attention and feedforward modules, both dominated by matrix multiplications. CALM can commit to the next word before all decoder layers have run, skipping the remaining computation whenever it is sufficiently confident.
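The early-exit idea above can be sketched in a few lines. This is a simplified illustration, not the actual CALM implementation: `layers`, `lm_head`, and the confidence threshold are hypothetical stand-ins, and the confidence score here is the gap between the top two softmax probabilities.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def early_exit_decode_step(layers, lm_head, hidden, threshold=0.9):
    """Run decoder layers one at a time and stop as soon as the
    confidence (gap between the two most likely tokens) crosses the
    threshold, skipping the remaining layers for this token.
    All names here are illustrative stand-ins for model components."""
    probs = None
    for layer in layers:
        hidden = layer(hidden)             # one transformer layer
        probs = softmax(lm_head(hidden))   # intermediate next-token prediction
        top = sorted(probs, reverse=True)
        if top[0] - top[1] >= threshold:
            break                          # confident enough: exit early
    return max(range(len(probs)), key=probs.__getitem__)
```

With toy "layers" that simply shift the hidden state, the loop exits as soon as one token clearly dominates, returning its index without running the remaining layers.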
To decide when to exit early, three estimated confidence measures were explored: the softmax score, state propagation, and an early-exit classifier.
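The three measures can be sketched as follows. These are simplified interpretations, not the paper's exact formulations: the softmax score as the gap between the top two token probabilities, state propagation as the similarity between consecutive layers' hidden states, and the early-exit classifier as a small learned head (here a stand-in linear scorer with hypothetical weights).

```python
import math

def softmax_score(probs):
    """Gap between the two most likely tokens; higher means more confident."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]

def state_propagation(prev_hidden, curr_hidden):
    """Cosine similarity between consecutive layers' hidden states;
    values near 1.0 suggest further layers would change little."""
    dot = sum(a * b for a, b in zip(prev_hidden, curr_hidden))
    na = math.sqrt(sum(a * a for a in prev_hidden))
    nb = math.sqrt(sum(b * b for b in curr_hidden))
    return dot / (na * nb)

def early_exit_classifier(hidden, weights, bias=0.0):
    """Stand-in for a small learned head that scores whether exiting
    now is safe; returns a sigmoid 'exit' probability."""
    score = sum(w * h for w, h in zip(weights, hidden)) + bias
    return 1.0 / (1.0 + math.exp(-score))
```

In each case, generation exits a layer early once the chosen measure crosses a calibrated threshold; the measures trade off accuracy against the cost of computing the confidence estimate itself.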

Results
According to the findings, Confident Adaptive Language Modeling is a clear step forward for language modeling: it maintains high output quality while noticeably increasing text generation speed.
It also reduces the computational burden of running the model, making it a very effective solution.
Learn more:
- Research paper: “Confident Adaptive Language Modeling” (on arXiv)
- Story source: “Accelerating Text Generation with Confident Adaptive Language Modeling (CALM)” (on Google Research)
