MBPP (Mostly Basic Python Problems) is a coding benchmark introduced by Google in Austin et al. (2021), containing 974 crowd-sourced basic Python tasks. Each item ships with a natural-language description, a reference solution, and unit tests, and measures how well a model translates natural language into code. It is positioned as a complement to HumanEval: HumanEval is smaller (164 hand-written problems), while MBPP is broader and more pedestrian. Modern models have largely saturated MBPP at 90%+ pass@1, but it still appears in reports as a quick sanity check.
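To make the item structure and the pass@1 figure concrete, here is a minimal sketch of how an MBPP-style item could be scored. The item below is hypothetical, not an actual record from the dataset. The field names `text`, `code`, and `test_list` are an assumption about the layout, and `passes_all_tests` and `pass_at_1` are illustrative helpers rather than part of any official harness.

```python
# Hypothetical MBPP-style item (illustrative only, not a real dataset record).
item = {
    "text": "Write a function to find the n-th triangular number.",
    "code": "def triangular(n):\n    return n * (n + 1) // 2",
    "test_list": [
        "assert triangular(1) == 1",
        "assert triangular(4) == 10",
        "assert triangular(10) == 55",
    ],
}


def passes_all_tests(candidate_source, tests):
    """Return True if the candidate runs and satisfies every assert statement.

    Warning: exec runs arbitrary code. A real harness would isolate this in a
    sandboxed subprocess with a timeout; this sketch does neither.
    """
    namespace = {}
    try:
        exec(candidate_source, namespace)  # define the candidate function
        for test in tests:
            exec(test, namespace)  # raises AssertionError on a failed test
    except Exception:
        return False
    return True


def pass_at_1(samples):
    """Fraction of problems solved when each problem gets exactly one sample.

    `samples` is a list of (candidate_source, tests) pairs, one per problem.
    """
    results = [passes_all_tests(source, tests) for source, tests in samples]
    return sum(results) / len(results)


good_candidate = item["code"]
bad_candidate = "def triangular(n):\n    return n * n"  # wrong formula

score = pass_at_1([
    (good_candidate, item["test_list"]),
    (bad_candidate, item["test_list"]),
])
print(f"pass@1 = {score:.2f}")
```

A candidate counts as correct only if every assert passes, so a reported pass@1 of 90%+ means roughly nine in ten problems are solved by the first generated sample.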
MEVZU · N°124 · ISTANBUL · YEAR I, VOL. III
Glossary · Intermediate · 2021
MBPP
A Google coding benchmark of nearly 1,000 basic Python problems.
- EN (English term): MBPP
- TR (Turkish term): MBPP