A research team from the University of California, the Georgia Institute of Technology, and the Allen Institute for AI proposes a novel approach called PROMPTPG to help ML models like GPT-3 solve complex mathematical tasks.
How it operates
Mathematical reasoning is a core ability of human intelligence, underpinned by our capacity for logical reasoning, intuition, and abstract thinking. The researchers aim to answer a critical question: can machines be taught to reason mathematically the way humans do?

It’s difficult to give a definite answer. Compared to human intelligence, machines still struggle to comprehend the context and meaning behind mathematical concepts and problems.
Recently, machines have made significant strides in mathematical reasoning. Large pre-trained language models such as GPT-3 have successfully performed mathematical reasoning tasks written in text form.
Limitations
Language models still rely on pre-programmed rules and algorithms, and it is unclear whether they can handle more complex inputs such as tabular data.
Much math-based data is not presented as written text: documents such as financial reports, health records, and invoices contain tables and other structured data that are distinct from unstructured text.
Solving Math Word Problems (MWPs) in a tabular context is significantly more difficult than on existing MWP benchmarks.
The model
To address this gap in knowledge, the researchers have proposed a new framework, following three main steps:
First, they created a new dataset called Tabular Math Word Problems (TABMWP), which contains over 38,000 open-domain, grade-level problems combining textual and tabular data. Each question in TABMWP is associated with a tabular context, presented as an image, semi-structured text, and a structured table.
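To make the pairing of question and tabular context concrete, here is a minimal sketch of how a TABMWP-style record might look and how its table could be flattened into semi-structured text for a prompt. The field names and the example problem are illustrative assumptions, not the dataset's actual schema.

```python
# Hypothetical TABMWP-style record; the released dataset's field names may differ.
example = {
    "table_title": "Coin collection",
    "table": [["Coin", "Number"], ["Quarters", 24], ["Dimes", 18]],
    "question": "How many more quarters than dimes are in the collection?",
    "answer": "6",
}

def linearize_table(table):
    """Flatten a row-major table into semi-structured text, one row per line."""
    return "\n".join(" | ".join(str(cell) for cell in row) for row in table)

# Assemble the tabular context and the question into a single prompt string.
prompt = (
    f"Table: {example['table_title']}\n"
    f"{linearize_table(example['table'])}\n"
    f"Question: {example['question']}\n"
    f"Answer:"
)
print(prompt)
```

Linearizing the table this way lets a text-only model such as GPT-3 read the structured context alongside the question.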
Second, the authors evaluated different pre-trained models on TABMWP, including GPT-3, and found that GPT-3's performance is unstable on complex problems like those in TABMWP.
Finally, to address this issue, they proposed a novel approach called PROMPTPG, which learns to select in-context examples from a small amount of training data and constructs the corresponding prompt for each test example.
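The core idea can be sketched as a REINFORCE-style policy gradient over candidate examples: the policy samples an in-context example, receives a reward depending on whether the resulting prompt yields a correct answer, and updates its parameters toward higher-reward choices. The toy "accuracy table" below stands in for actually querying GPT-3, and all names and numbers are assumptions for illustration, not the paper's implementation.

```python
import math
import random

random.seed(0)

# Toy pool of candidate in-context examples. The per-example success rates
# are made up; they stand in for the language model's actual behavior.
candidates = ["ex_A", "ex_B", "ex_C", "ex_D"]
true_accuracy = {"ex_A": 0.2, "ex_B": 0.9, "ex_C": 0.4, "ex_D": 0.1}

logits = [0.0] * len(candidates)  # learnable policy parameters
lr = 0.5
baseline = 0.0  # running-average reward, reduces gradient variance

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

for step in range(2000):
    probs = softmax(logits)
    # Sample one in-context example from the current policy.
    i = random.choices(range(len(candidates)), weights=probs)[0]
    # Reward 1 if the (simulated) model answers correctly with this example.
    reward = 1.0 if random.random() < true_accuracy[candidates[i]] else 0.0
    advantage = reward - baseline
    baseline = 0.9 * baseline + 0.1 * reward
    # REINFORCE update: d log pi(i) / d logit_j = 1[j == i] - p_j
    for j in range(len(logits)):
        grad = (1.0 if j == i else 0.0) - probs[j]
        logits[j] += lr * advantage * grad

# After training, the policy should concentrate on the highest-reward example.
best = candidates[max(range(len(logits)), key=lambda j: logits[j])]
print(best)
```

The paper's policy network scores real training problems with learned representations rather than a lookup table, but the update rule follows this same policy-gradient pattern.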

Experiments show that this method outperforms the best baseline by 5.31% in accuracy, proving more effective at selecting in-context examples.

TABMWP includes free-text problems with numerical answers as well as multiple-choice problems with textual answers.
Conclusion and future research
The research introduces a new dataset, TABMWP, and a novel approach, PROMPTPG, which together help ML models perform tasks involving abstract thinking and logical reasoning.
Overall, while machines are making significant progress in mathematical reasoning, they still have a long way to go before they can match the level of flexibility, creativity, and intuition of human intelligence.