MEVZU N°124 · ISTANBUL
Glossary · Advanced · 2017

RLHF — Reinforcement Learning from Human Feedback

An alignment technique that collects human preference judgements over model outputs, fits a reward model to those preferences, and then optimises the LLM against that reward model with reinforcement learning (typically PPO), penalising divergence from the reference policy so the model does not drift into reward hacking.

EN — English term
RLHF (Reinforcement Learning from Human Feedback)
TR — Turkish term
RLHF — İnsan Geri Bildirimiyle Pekiştirmeli Öğrenme
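The definition above compresses a two-stage pipeline: first fit a reward model to human preference comparisons, then use reinforcement learning to push the LLM toward higher reward while staying close to a reference policy. A minimal PyTorch sketch of both stages follows; the RewardModel class, tensor shapes, and the beta value are illustrative assumptions, not any particular library's API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical toy reward model: scores a fixed-size feature vector.
# In a real RLHF pipeline this is a transformer with a scalar head.
class RewardModel(nn.Module):
    def __init__(self, dim: int = 16):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.score(x).squeeze(-1)

rm = RewardModel()
opt = torch.optim.Adam(rm.parameters(), lr=1e-3)

# Stage 1: fit the reward model on preference pairs. These tensors are
# toy stand-ins for the (prompt, response) features of the response a
# human preferred ("chosen") and the one they rejected.
chosen = torch.randn(32, 16)
rejected = torch.randn(32, 16)

# Bradley-Terry pairwise loss: raise r(chosen) above r(rejected).
opt.zero_grad()
loss = -F.logsigmoid(rm(chosen) - rm(rejected)).mean()
loss.backward()
opt.step()

# Stage 2: shape the RL reward with a KL penalty so the policy stays
# close to the reference model. logp_policy / logp_ref stand in for the
# log-probabilities each model assigns to a sampled response.
beta = 0.1  # strength of the KL penalty (illustrative value)
logp_policy = torch.randn(32)
logp_ref = torch.randn(32)
features = torch.randn(32, 16)
shaped_reward = rm(features).detach() - beta * (logp_policy - logp_ref)
# shaped_reward would then drive a policy-gradient update (typically PPO).
```

The KL term is the design choice that distinguishes RLHF from naive reward maximisation: without it, the policy can exploit blind spots in the learned reward model and produce degenerate text that scores well but reads badly.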