Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models

February 14, 2024 | Francesca-Zhoufan Li, Ava P. Amini, Yisong Yue, Kevin K. Yang, Alex X. Lu
This paper investigates the effectiveness of transfer learning with protein language models (PLMs) for downstream tasks in protein biology. The authors conduct 370 experiments across downstream tasks, model sizes, depths, and pretraining durations to understand how the features learned during pretraining relate to, and are useful for, downstream tasks. They find that while most downstream tasks benefit from pretrained models compared to naive sequence representations, performance does not consistently scale with pretraining; instead, it often relies on low-level features learned early in pretraining. This points to a mismatch between current PLM pretraining paradigms and many applications of these models, and hence a need for better pretraining methods.

The study examines several hypotheses for why transfer learning works, including feature reuse, inductive biases, weight statistics, and the reuse of low-level features. For some tasks, transfer learning improves performance, but the improvement is not necessarily due to feature reuse or to scaling with pretraining. Secondary structure prediction, for example, benefits from transfer learning because it is well aligned with masked language modeling (MLM) pretraining. Many other tasks do not benefit, suggesting that pretraining is not effective for all protein biology applications. For some tasks, performance does not improve as PLMs improve, indicating that these tasks may rely on low-level features learned early in pretraining.

The authors conclude that scaling PLMs under current pretraining paradigms may not improve performance on many protein function prediction tasks, and that new, better-aligned pretraining tasks are needed. The study highlights the importance of understanding the factors that influence transfer learning in protein biology and the need for improved evaluation standards for PLMs. The results have implications for future work in protein engineering and bioinformatics, emphasizing the need for diversified pretraining strategies to better serve aspects of protein biology not well served by current PLMs.
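To make the kind of comparison described above concrete, here is a minimal sketch (not the authors' code): train a simple probe on a naive one-hot sequence representation versus frozen embeddings from an early layer and the final layer of a pretrained PLM. The ESM-2 checkpoint name, mean pooling, ridge probe, and toy data are illustrative assumptions, not the paper's exact models, tasks, or datasets.

```python
# Sketch: probe a downstream property from (a) a one-hot baseline,
# (b) an early-layer PLM embedding, (c) a final-layer PLM embedding.
import numpy as np
import torch
from transformers import AutoTokenizer, EsmModel
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MAX_LEN = 128  # pad/truncate length for the one-hot baseline


def one_hot(seq: str) -> np.ndarray:
    """Naive baseline: flattened one-hot encoding, padded/truncated to MAX_LEN."""
    mat = np.zeros((MAX_LEN, len(AMINO_ACIDS)), dtype=np.float32)
    for i, aa in enumerate(seq[:MAX_LEN]):
        if aa in AMINO_ACIDS:
            mat[i, AMINO_ACIDS.index(aa)] = 1.0
    return mat.ravel()


@torch.no_grad()
def plm_embedding(seq: str, tokenizer, model, layer: int) -> np.ndarray:
    """Mean-pooled hidden state from a chosen transformer layer (frozen model)."""
    inputs = tokenizer(seq, return_tensors="pt")
    out = model(**inputs, output_hidden_states=True)
    hidden = out.hidden_states[layer]  # shape: (1, num_tokens, dim)
    return hidden.mean(dim=1).squeeze(0).numpy()


def probe_score(X: np.ndarray, y: np.ndarray) -> float:
    """Cross-validated R^2 of a ridge-regression probe on fixed features."""
    return cross_val_score(Ridge(alpha=1.0), X, y, cv=5).mean()


if __name__ == "__main__":
    # Toy stand-in data: random targets, so scores are meaningless here.
    # Replace with a real downstream dataset (sequence -> measured property).
    seqs = ["MKTAYIAKQR", "GSHMLEDPVA", "MVLSPADKTN", "MNIFEMLRID"] * 5
    y = np.random.default_rng(0).normal(size=len(seqs))

    name = "facebook/esm2_t6_8M_UR50D"  # assumed small ESM-2 checkpoint
    tok = AutoTokenizer.from_pretrained(name)
    plm = EsmModel.from_pretrained(name).eval()

    X_naive = np.stack([one_hot(s) for s in seqs])
    X_early = np.stack([plm_embedding(s, tok, plm, layer=1) for s in seqs])
    X_final = np.stack([plm_embedding(s, tok, plm, layer=-1) for s in seqs])

    for label, X in [("one-hot", X_naive), ("early layer", X_early), ("final layer", X_final)]:
        print(f"{label:12s} probe R^2: {probe_score(X, y):.3f}")
```

Comparing the early-layer and final-layer probes against the one-hot baseline is one simple way to see whether a task draws on low-level features learned early in pretraining or on representations that continue to improve with scale.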