
Unearthing Planet’s Plant Virus Modulome: Exploring Plant Virus Proteome Modularity for Taxonomic Classification and Biological Predictions

Rachid Tahzima: Flanders Research Institute for Agriculture, Fisheries and Food (ILVO)


<div>Current plant virus systematics are often designed for specific, well-defined plant viral taxa and can be cumbersome to apply to unassigned or newly discovered plant viruses. Plant virology could therefore benefit directly from comparative analyses that characterize newly sequenced plant virus isolates, both to aid their taxonomic placement and to make predictions about their biology. We hypothesized that the information content of the virus genome, or more specifically the virus proteome, is directly linked not only to its taxonomic classification but also to its biology (e.g. vector, host plant). To explore these hypotheses, we characterized all available plant viral genome sequences from RefSeq by searching their open reading frames for protein families and domains using existing Hidden Markov Model profiles. The resulting dataset includes 1319 different plant viruses belonging to 26 families and 120 genera. For all these accessions, we evaluated the presence/absence and abundance of different features, for example Protein Families (PFAMs) and Superfamilies (SFs). By mining this dataset, we found features that correlate strongly with the current taxonomic classification of the virus or with its biology (e.g. vector, host plant). Hence, we can identify several features of interest for improved taxonomic classification, and we may be able to make predictions about virus biology. This approach could be particularly useful for exploring the biology of the new viruses recently identified by next-generation sequencing (NGS), and for guiding further research while providing information for preliminary risk analysis after virus discovery.</div>
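The feature-mining step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistic was used, so mutual information between domain presence/absence and the family label is an assumption made here for illustration only. The accession names, domain names (`dom_a`, `dom_b`, `dom_c`), and family labels are hypothetical toy data, not values from the study.

```python
from collections import Counter
from math import log2


def abundance(hits):
    """Map each accession to a Counter of domain -> number of hits."""
    return {acc: Counter(domains) for acc, domains in hits.items()}


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def rank_features(hits, labels):
    """Rank domains by how informative their presence/absence is about the label."""
    counts = abundance(hits)
    accessions = sorted(counts)
    features = sorted({d for c in counts.values() for d in c})
    ys = [labels[a] for a in accessions]
    scores = {
        f: mutual_information([int(counts[a][f] > 0) for a in accessions], ys)
        for f in features
    }
    # Highest score first; ties broken alphabetically for a stable order
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy input: domain hits per accession (repeats = abundance)
hits = {
    "virus_A": ["dom_a", "dom_c"],
    "virus_B": ["dom_a", "dom_c", "dom_c"],
    "virus_C": ["dom_b"],
    "virus_D": ["dom_b", "dom_c"],
}
# Hypothetical taxonomic labels for the same accessions
family = {"virus_A": "Fam1", "virus_B": "Fam1", "virus_C": "Fam2", "virus_D": "Fam2"}

ranking = rank_features(hits, family)
for feature, score in ranking:
    print(f"{feature}\t{score:.3f}")
```

In this toy case, `dom_a` and `dom_b` each occur in only one family and therefore score highest, while `dom_c` occurs in both families and scores lower. The same ranking could be run against a vector or host-plant label instead of the family label.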