Top-P, also called nucleus sampling, was proposed by Holtzman et al. in the 2019 paper 'The Curious Case of Neural Text Degeneration'. Unlike top-k, it doesn't fix the candidate count — instead it forms the smallest set of tokens whose cumulative probability crosses a threshold P (say 0.9) and samples from that 'nucleus'. This adapts to the model's confidence: where the next-token distribution is sharp, the nucleus is small; where it is flat, the nucleus widens, which usually produces more natural-feeling output. In practice most teams tune temperature and top-p together while leaving top-k disabled.
MEVZU N°124 · ISTANBUL · YEAR I — VOL. III
Glossary · Intermediate · 2019
Top-P (Nucleus) Sampling
A sampling method that draws from the smallest set of candidates whose cumulative probability exceeds P.
- EN — Top-P (Nucleus) Sampling
- TR — Top-P (Nucleus) Örnekleme
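The nucleus selection described above can be sketched in a few lines. This is an illustrative implementation, not the one from the paper; the function name `top_p_sample` and its parameters are assumptions for this example.

```python
import numpy as np

def top_p_sample(logits, p=0.9, temperature=1.0, rng=None):
    """Sample a token index via top-p (nucleus) sampling.

    Illustrative sketch: names and defaults are this example's, not a library's.
    """
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature
    # Softmax (shifted by max for numerical stability)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Sort tokens by probability, descending
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    # Smallest prefix whose cumulative probability reaches p
    # (+1 so the token that crosses the threshold is included)
    cutoff = int(np.searchsorted(cum, p)) + 1
    nucleus = order[:cutoff]
    # Renormalize over the nucleus and sample
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=nucleus_probs))
```

Note how the nucleus size adapts: with sharp logits like `[10, 0, 0, 0]` the nucleus collapses to a single token, while a flat distribution keeps all candidates in play.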