Yong-Hwan Lee, professor of plant pathology at Seoul National University, and Seogchan Kang, professor of plant pathology at Pennsylvania State University, have built an innovative and comprehensive cyberinfrastructure to support the preservation, sharing, integration, and utilization of data from diverse plant pathology-related research.
Yong-Hwan Lee was born in Korea and obtained his B.S. and M.S. degrees in plant pathology in 1983 and 1985, respectively, from Seoul National University. After obtaining a Ph.D. degree in plant pathology at Louisiana State University in 1991, he worked as a visiting assistant professor at Clemson University and as a senior research scientist at LG Chemicals. Since 1995, he has been on the faculty of agricultural biotechnology at Seoul National University. His research group has taken a multipronged approach to studying the molecular basis of rice blast and has developed several functional and comparative fungal genomics platforms and associated informatics tools.
Seogchan Kang was born in Korea and obtained his B.S. and M.S. degrees in chemistry in 1983 and 1985, respectively, from Seoul National University. He completed his Ph.D. degree in physiological chemistry in 1991 at the University of Wisconsin. Kang was a visiting scientist at DuPont, followed by post-doctoral appointments at Purdue University and the University of New Mexico. In 1997, he joined the faculty of the Department of Plant Pathology at Penn State. His research interests center on the genetic and cellular mechanisms underlying plant–pathogen interactions, with a focus on fungal vascular wilt diseases; the ecological and biological functions of microbial volatile organic compounds; and the development of informatics platforms supporting fungal genomics and taxonomy.
Rapid increases in the amount of data, fueled in large part by genomics, have greatly expanded our understanding of the genetic basis of many fundamental biological processes and of the diversity and evolution of organisms. However, the resulting “data tsunami” has also presented multiple challenges to data preservation, sharing, integration, and utilization and has hampered the rapid translation of these data into advances in fundamental knowledge and solutions to practical problems. As stated in a Phytopathology Letter to the Editor coauthored by Lee, Kang, and their colleagues, “because science builds on existing knowledge, the lack of establishing proper links between what has been done and what will be done is a poor scientific practice and frequently forces us to reinvent the wheel.” The following contributions illustrate how their work has helped build such links by establishing effective means for data preservation, sharing, integration, and utilization.
Lee and Kang have developed a large number of informatics platforms that facilitate the use and analysis of three kinds of data: genome sequences of all sequenced fungi and oomycetes; precomputed data for specific gene families or functional groups across kingdoms; and phylogenetic data and phenotypic characteristics derived from culture collections of major pathogen genera. They have also developed data analysis and visualization tools to support versatile uses of the data archived in these platforms.
Microbial germplasm collections are irreplaceable resources that connect discoveries of the present with established knowledge of the past, help us understand microbial diversity, and supply materials for bioprospecting. Despite their importance, however, the survival of many collections has been increasingly threatened. Kang, Lee, and their collaborators implemented an innovative model to protect and enhance the value of culture collections. Many culture collections are not catalogued in a format that allows easy browsing of their content, which makes them mostly invisible and inaccessible to the broad research community. In addition, the genotypic and phenotypic characteristics of most collections have not been systematically characterized. To address these problems, they have characterized two major pathogen groups and archived the resulting data in web platforms to share them with the community. The Phytophthora Database (PD; www.phytophthoradb.org) archives phylogenetic and phenotypic data from more than 130 Phytophthora species and provides tools for sequence-based strain identification, visualization of the spatial origins of characterized isolates, and phylogenetic analyses. The Cyberinfrastructure for Fusarium (CiF; www.fusariumdb.org) consists of three platforms: a comparative genomics platform (supporting comparative analysis of Fusarium genomes), Fusarium-ID (supporting multigene-based Fusarium identification), and a community-networking platform based on a Web 2.0 model. The PD currently has more than 600 registered users from 55 countries, and the CiF maintains a slightly larger user pool. Both platforms average approximately 2,500 uses per month. Lee and Kang are also building similar platforms for Verticillium and Pythium.
Genome sequences of diverse fungal and oomycete species have been rapidly accumulating, enabling comparative functional and evolutionary genomic analyses at multiple taxon levels. With rapid advances in sequencing technology, the rate of genome sequencing continues to accelerate, highlighting an urgent need for robust mechanisms and tools for data curation and utilization so that the full benefit of these data can be reaped. Because most researchers and students are not trained to manage huge amounts of genomics data, user-friendly informatics platforms that link data with versatile analysis tools are essential if they are to conduct multifaceted analyses of genome sequences efficiently. To meet this need, Lee and Kang have built a series of comparative genomics platforms and associated tools. The Comparative Fungal Genomics Platform (CFGP; http://cfgp.snu.ac.kr/) archives fungal and oomycete genome sequences (283 genomes from 152 species) and averages approximately 1,300 uses per month. This platform has greatly facilitated studies on the evolution and diversification of traits of practical significance within and across kingdoms and has also enabled the development of online platforms specialized for comparative genomics of specific gene families or functional groups. Three published platforms based on CFGP are the Fungal Cytochrome P450 Database (http://p450.riceblast.snu.ac.kr/), the Fungal Transcription Factor Database (http://ftfd.snu.ac.kr/), and the Fungal Secretome Database (http://fsd.snu.ac.kr/). In addition, Kang and Lee are currently building a platform for Fungal Calcium Signaling Proteins (http://fcsd.ifungi.org/).
Lee and Kang have integrated a diverse array of data and informatics tools to enhance research and education in plant pathology and mycology. Their work has clearly demonstrated how informatics tools can help integrate fragmented community research efforts into a globally networked endeavor. Considering the increasing importance of team approaches and data integration in solving complex plant disease problems, their work represents an exemplary model. It has clearly caused, and will continue to cause, a major paradigm shift in plant pathology.