Exploring In-Context Reinforcement Learning in LLMs with Sparse Autoencoders
Large language models (LLMs) have demonstrated remarkable in-context learning capabilities across various domains, including translation, function learning, and reinforcement learning. However, the underlying mechanisms of these abilities, particularly in reinforcement…