IIB   20738
INSTITUTO DE INVESTIGACIONES BIOLOGICAS
Executing Unit - UE
Conferences and scientific meetings
Title:
Phylogenetic and Functional Classification of the Plant Cytochrome P450 superfamily by using SEQrutinator and HMMERCTTER inception
Author(s):
AMALFITANO, A; STOCCHI, N; ATENCIO, HM; VILLARREAL, F; TEN HAVE, ARJEN
Location:
Mendoza
Meeting:
Conference; X Argentinian Congress of Bioinformatics and Computational Biology 10CAB2C 2019; 2019
Organizing institution:
A2B2C
Abstract:
Background
Plant secondary metabolism is formed by a complex network of reactions catalyzed largely by proteins from a few superfamilies, among them the cytochrome P450 (CYP) superfamily. Many of the enzymes involved have not been identified, and many of the superfamily sequences have not been annotated. Although CYP is a paradigm in structure-function prediction, CYP classification is based on sequence identity, which is known to be inherently error-prone. Hence, there is no method or platform that classifies CYP sequences.

Results
We applied SEQrutinator to eight sequence sets, obtained by running hmmsearch on eight complete plant proteomes with its sensitive inclusion threshold. To set the scrutiny parameters, we used the plant SwissProt sequence set. The final dataset contained 2392 sequences, including three sequences from a protein with a resolved structure. We then applied HMMERCTTER clustering and classification of existing CYP classification sets (see http://metabolomics.jp/wiki/Category:P450). Surprisingly, the sequences clustered perfectly into four major monophyletic clades with 100% precision and recall. The major clade corresponds to clan 71, with approximately 30 classes. The second clade corresponds to clan 85, with all 16 described classes, and includes the formerly independent CYP51 and CYP710 families. The third clade combines clans 72 and 86. The minor clade forms CYP74, another independent family.

A more functional classification can be obtained by further hierarchical classification, which we performed following the inception principle: the four clades were treated as independently evolving superfamilies, which is strictly correct, and each was subjected to further HMMERCTTER clustering using the complete sequence set as sequence space, so that all identified clusters have 100% precision and recall. As such, we demonstrate that clans 72 and 86 are monophyletic and that the formerly independent families 97 and 711 belong to clan 86. This was guided by existing biochemical annotation. For instance, we identified a small monophyletic cluster containing two subclusters with flavonoid 3′-hydroxylase and flavonoid 3′,5′-hydroxylase, respectively.

Conclusions
1. Inception, based on independent evolution of clades, appears important since realignment improves downstream clustering.
2. Also for complex superfamilies, phylogenetic clustering can correspond with functional clustering.
3. There are many clades without functional annotation.
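
The "100% precision and recall" criterion reported for the clusters can be illustrated with a minimal sketch. It assumes the general set-based definitions of precision and recall, comparing the sequences retrieved for a cluster with the sequences that belong to it. The function name and the sequence identifiers are hypothetical and are not taken from HMMERCTTER, whose internal procedure is not described in this abstract.

```python
def precision_recall(retrieved, members):
    """Return (precision, recall) of a retrieved sequence set against true cluster members.

    Hypothetical helper for illustration only; not part of HMMERCTTER.
    """
    retrieved, members = set(retrieved), set(members)
    true_positives = len(retrieved & members)
    # Precision: fraction of retrieved sequences that really belong to the cluster.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: fraction of cluster members that were retrieved.
    recall = true_positives / len(members) if members else 0.0
    return precision, recall


# Toy identifiers (made up): a cluster is "100% P&R" when both values equal 1.0.
cluster = {"seqA", "seqB", "seqC"}
perfect = precision_recall({"seqA", "seqB", "seqC"}, cluster)
contaminated = precision_recall({"seqA", "seqB", "seqC", "seqX"}, cluster)
```

In this toy case the first search retrieves exactly the cluster members, whereas the second also retrieves one foreign sequence, which lowers precision while recall stays complete.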