AMPLIFY: Leveraging Data Quality Over Scale for Efficient Protein Language Model Development
Protein language models (pLMs), trained on protein sequence databases, aim to capture the fitness landscape for property prediction and design tasks. While scaling these models has become common, it assumes…