
Metacoder and taxa: R packages for visualization and manipulation of community taxonomic diversity data

Zachary Foster: USDA ARS


<div>Metabarcoding is revolutionizing microbial ecology by circumventing the limits of traditional culture-based techniques, but the massive hierarchical (e.g. taxonomic) datasets it produces are difficult to plot and manipulate with current tools. Hierarchical data are more challenging to subset and otherwise manipulate than typical tabular data. We have developed the “taxa” package to provide a standard for storing and manipulating any data associated with a taxonomy, modelled after the popular dplyr data-manipulation philosophy. In addition, we have developed “metacoder” for parsing, analyzing, and visualizing hierarchical data associated with metabarcoding research. Stacked bar charts and pie graphs rely on color to depict taxa, which limits the number of taxa displayed to the number of discernible colors. Metacoder implements a novel visualization called the heat tree, which uses the color and size of nodes and edges on a taxonomic tree to quantitatively depict up to 4 statistics distributed over a hierarchy. This allows for rapid exploration of data and produces information-dense, publication-ready graphics. Metacoder also provides tools for reading common file formats and for evaluating primers and barcode loci using simulated PCR. The metacoder and taxa packages are already being adopted by the community and have been applied to diverse projects including research on gut microbiota, soil microbiota, wastewater communities, and mycorrhizal associations.</div>