LASER: An Adaptive Method for Selecting Reward Models (RMs) and Iteratively Training LLMs Using Multiple RMs
A major challenge in aligning large language models (LLMs) with human preferences is selecting the right reward model (RM) to guide their training. A single…