Hugo Talibart, who has just joined the Atelier de BioInformatique, will present his work recently published in BMC Bioinformatics.





  • View the seminar presentation HERE


Towards homology search with coevolution information: optimal alignments of protein Potts models

To assign structural and functional annotations to the ever-increasing number of sequenced proteins, the main approach relies on sequence-based homology search methods, e.g. BLAST or the current state-of-the-art methods based on profile Hidden Markov Models, which rely on significant alignments of query sequences to annotated proteins or protein families. While powerful, these approaches do not take coevolution between residues into account. Taking advantage of recent advances in the field of contact prediction, we propose to represent proteins by Potts models, which model direct couplings between positions in addition to positional composition, and we introduce a method to compare proteins by aligning these models: PPalign. This method relies on an Integer Linear Programming formulation to compute the optimal solution in tractable time. We assessed this approach on a low sequence identity reference benchmark and found that PPalign yields a better mean F1 score with respect to the reference alignments and in some cases finds significantly better alignments than its pHMM-based equivalent HHalign. This suggests that pairwise couplings can improve the alignment of remotely related protein sequences.
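
To make concrete what "direct couplings in addition to positional composition" means, here is a minimal sketch. It assumes the standard Potts formulation, in which the unnormalized log-probability of a sequence is the sum of per-position fields and pairwise couplings. The function name `potts_score`, the parameters `v` and `w`, and the toy values are hypothetical illustrations and are not taken from PPalign.

```python
# Toy Potts model: 3 positions, 2-letter alphabet (letters encoded as 0 and 1).
# All names and values below are hypothetical, for illustration only.
# v[i][a]         : field for letter a at position i (positional composition)
# w[(i, j)][a][b] : coupling between letter a at position i and letter b at j

def potts_score(seq, v, w):
    """Unnormalized log-probability: sum of fields plus sum of pairwise couplings."""
    fields = sum(v[i][a] for i, a in enumerate(seq))
    couplings = sum(w[(i, j)][seq[i]][seq[j]] for (i, j) in w)
    return fields + couplings

v = [[1, 0], [0, 1], [0, 0]]      # position 2 has no letter preference on its own
w = {(0, 2): [[2, 0], [0, 2]]}    # positions 0 and 2 favor carrying the same letter

score_same = potts_score([0, 1, 0], v, w)  # positions 0 and 2 agree
score_diff = potts_score([0, 1, 1], v, w)  # positions 0 and 2 disagree
```

In this toy model the two sequences have identical field contributions, so a purely positional model (fields only) cannot tell them apart. The coupling term is what separates them, which is the kind of coevolution signal the abstract refers to.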
Published: 28/08/2021 20:47 - Updated: 26/01/2023 14:45
