Prototypical Reward Network for Data-Efficient RLHF

Abstract
We propose a prototypical reward network that enables data-efficient reinforcement learning from human feedback (RLHF) for large language models.
Publication
Annual Meeting of the Association for Computational Linguistics (ACL) 2024
Keywords
Reinforcement Learning, Human Feedback, Large Language Models, Data Efficiency