GSM8K (Grade School Math 8K)

GSM8K is a benchmark for evaluating the multi-step mathematical reasoning capabilities of LLMs.

Description

It contains about 8.5K high-quality, human-written grade school math word problems, split into roughly 7.5K training problems and 1K test problems. Each problem takes 2 to 8 steps of basic arithmetic to solve.

Key Metrics

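  • Exact Match (EM): The fraction of problems for which the model's final numerical answer equals the reference answer. Intermediate reasoning steps are not scored.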
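
As a rough illustration of how Exact Match could be computed, the sketch below pulls the final number out of each model output and compares it to the reference. It relies on the GSM8K convention that reference solutions end with a line of the form "#### <answer>". The function names (extract_final_number, exact_match), the "last number in the text" fallback for model outputs, and the sample strings are assumptions made for this example, not part of any official evaluation script.

```python
import re
from decimal import Decimal

# A number with optional sign, thousands separators, and decimal part.
NUMBER = r"-?\d[\d,]*(?:\.\d+)?"


def extract_final_number(text):
    """Return the final numeric answer in `text` as a Decimal, or None."""
    # Prefer the "#### <answer>" marker used by GSM8K reference solutions.
    marker = re.search(r"####\s*(" + NUMBER + r")", text)
    if marker:
        raw = marker.group(1)
    else:
        # Assumption: otherwise treat the last number in the text as the answer.
        numbers = re.findall(NUMBER, text)
        if not numbers:
            return None
        raw = numbers[-1]
    # Drop thousands separators so "1,000" and "1000" compare equal.
    return Decimal(raw.replace(",", ""))


def exact_match(predictions, references):
    """Fraction of predictions whose final number equals the reference's."""
    correct = 0
    for pred, ref in zip(predictions, references):
        p = extract_final_number(pred)
        r = extract_final_number(ref)
        if p is not None and p == r:
            correct += 1
    return correct / len(references)


# Made-up strings for illustration only (not taken from the dataset).
references = ["16 - 3 - 4 = 9 eggs, 9 * 2 = 18 dollars\n#### 18", "#### 1,000"]
predictions = ["The answer is 18.", "So there are 999 in total."]
score = exact_match(predictions, references)
```

Comparing parsed Decimal values rather than raw strings means "18" and "18.0" count as the same answer. The fallback is deliberately simple and can misread outputs that contain expressions such as "2-3", so a real harness would likely constrain the output format instead.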

Alternatives

Backlog

  • Add info on Chain-of-Thought (CoT) prompting impact on GSM8K.