Protein language models are biased by unequal sequence sampling across the tree of life

Publication
In bioRxiv, 2024