5 Discussion

5.4 The potential control agent for invasive treefrog: a perspective from gut microbiota

In Hawaii, helminth parasites have been evaluated as potential biological control agents for the invasive frog Eleutherodactylus coqui. Some helminths can produce mucin-like molecules and attach invasively to host cells to evade host immune processes (Theodoropoulos, Hicks et al. 2001). This helps the helminths degrade the host mucus barrier. A thinner mucus barrier can increase permeability and result in inflammation (Liévin-Le Moal and Servin 2006, McGuckin, Eri et al. 2009), which may increase the risk of lethal infection. In AH treefrogs, pathogenic infections increase and the microbial composition of the intestine is restructured. Our study demonstrated an increased relative abundance of Akkermansia in AH treefrogs. Therefore, manipulating the gut microbiota may be a potential biological control strategy for the invasive treefrog.

5.5 Regression-based models provide better performance than correlation-based methods

A wide range of computational tools have been used to infer microbial interactions from microbiome datasets. These tools differ in speed, computational requirements, and accuracy. In this dissertation, I compared two widely used approaches, correlation-based and regression-based network inference, to balance these trade-offs and choose the better tool for inferring microbial interactions in the treefrog microbiome.

Although simple pairwise correlation-based methods, such as the Pearson correlation that I tested in this study, are faster, using correlation coefficients to detect dependencies between OTUs has limitations. For example, the use of relative abundance can lead to the detection of spurious correlations among low-abundance OTUs in zero-inflated data. This can strongly affect performance by inflating the number of correlated pairs that arise solely from the transformation to relative abundance.

Therefore, in the correlation-based method, a null model is also included in the calculation to reduce the chance of detecting correlations that arise only from sensitivity to compositionality.
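The compositional effect and one possible null-model correction can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names, the permutation scheme (shuffling each OTU independently across samples and then renormalizing), and the number of permutations are all assumptions made for illustration.

```python
import numpy as np

def relative_abundance(counts):
    """Normalize each sample (row) so that its OTU abundances sum to 1."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

def pearson_matrix(profile):
    """Pairwise Pearson correlations between OTUs (columns)."""
    return np.corrcoef(profile, rowvar=False)

def permutation_null_pvalues(counts, n_perm=200, seed=0):
    """Return observed correlations and two-sided empirical p-values.

    Assumed null model: shuffle every OTU's counts independently across
    samples, then renormalize, so that the null correlations still carry
    the effect of converting to relative abundance.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    observed = pearson_matrix(relative_abundance(counts))
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        shuffled = np.column_stack([rng.permutation(col) for col in counts.T])
        null = pearson_matrix(relative_abundance(shuffled))
        exceed += np.abs(null) >= np.abs(observed)
    # Add-one correction keeps every p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)

# Two independent OTUs become perfectly anti-correlated once normalized,
# because their relative abundances are x and 1 - x.
rng = np.random.default_rng(1)
two_otus = rng.poisson(lam=[100, 50], size=(20, 2)) + 1
print(pearson_matrix(relative_abundance(two_otus))[0, 1])
```

Only pairs whose observed correlation is extreme relative to the renormalized permutations would be kept as edges. The sketch assumes every sample has a nonzero total count and every OTU varies across samples.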

Because correlation-based methods can only process pairwise correlations, I also included a regression-based method that evaluates multiple OTUs simultaneously. The generalized Lotka-Volterra (gLV) equation is the most commonly used regression-based model due to its high efficiency. The gLV model has also been applied to many microbiome datasets to infer microbial interactions (Faust and Raes 2012, Gao, Huynh et al. 2018). Computational tools based on the gLV model can generate microbial interaction networks from microbiome data. Studies that used these tools usually focus on a few dominant OTUs to decipher the relationships among them and to identify the key members that regulate the community (Stein, Bucci et al. 2013, Fisher and Mehta 2014, Marino, Baxter et al. 2014, Buffie, Bucci et al. 2015, Bucci, Tzen et al. 2016). However, the rare biosphere has been reported to play an important role in the community. MetaMIS is a recent network inference tool designed to capture the importance of rare OTUs interacting with other members of the community (Shaw, Pao et al. 2016). Compared with other gLV-based tools, MetaMIS can provide more insight into the interactions between community members, including dominant-dominant, dominant-rare, and rare-rare OTU interactions.
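To make the regression idea concrete, the gLV model can be written as dx_i/dt = x_i (mu_i + sum_j M_ij x_j), where x_i is the abundance of OTU i, mu_i its growth rate, and M_ij the effect of OTU j on OTU i. Discretizing in time gives a linear regression of the log-abundance change on the current abundances. The following is a minimal sketch of that idea using ordinary least squares. It is not the algorithm implemented in MetaMIS or in any of the cited tools, and the function names, parameter values, and time step are hypothetical.

```python
import numpy as np

def simulate_glv(x0, mu, M, dt, n_steps):
    """Simulate a discretized gLV system (exact in log space per step)."""
    x = np.empty((n_steps, len(x0)))
    x[0] = x0
    for t in range(n_steps - 1):
        x[t + 1] = x[t] * np.exp(dt * (mu + M @ x[t]))
    return x

def fit_glv(x, dt):
    """Estimate growth rates and the interaction matrix by least squares.

    Regresses (ln x_i(t+1) - ln x_i(t)) / dt on [1, x_1(t), ..., x_n(t)].
    Assumes strictly positive abundances; real data with zeros would need
    a pseudocount or filtering, which this sketch does not handle.
    """
    y = np.diff(np.log(x), axis=0) / dt
    design = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    mu_hat = coef[0]       # intercepts: growth rates
    M_hat = coef[1:].T     # M_hat[i, j]: effect of OTU j on OTU i
    return mu_hat, M_hat

# Hypothetical three-OTU community with known parameters.
mu = np.array([0.8, 0.5, 0.3])
M = np.array([[-1.0, -0.2,  0.0],
              [ 0.1, -0.8, -0.3],
              [ 0.0,  0.2, -0.6]])
x = simulate_glv(np.array([0.1, 0.2, 0.3]), mu, M, dt=0.1, n_steps=40)
mu_hat, M_hat = fit_glv(x, dt=0.1)
print(np.round(M_hat, 3))
```

Because every OTU enters the regression for every other OTU, the fit accounts for multiple OTUs at once, in contrast to pairwise correlation. With noisy or short time series, regularized regression is usually preferred over plain least squares.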

5.5.1 Performance comparison between regression-based and correlation-based methods

To evaluate the performance of the different network inference tools, I generated a simulated time series dataset consisting of 50 OTUs with relative abundance profiles across 10 time points. An absolute abundance profile is free of compositional bias because it reflects the actual count of each OTU in the community, so no normalization is required. In most cases, however, analysis pipelines normalize OTUs from absolute to relative abundance within each sample to avoid meaningless comparisons, such as those caused by unequal sequencing depth across samples. This normalization step can introduce biases, because computing correlations between the relative abundances of OTUs across samples can distort the results and generate artificial correlations. For example, artificial correlations may be observed between low-abundance members of the community. In my simulated time series dataset, I used relative abundance to evaluate both the correlation-based and regression-based methods. To avoid this bias in the correlation-based method, I introduced a null model to correct the Pearson correlation coefficient matrix. For the regression-based method, using a relative abundance profile carries an assumption: the time series sequencing dataset maintains a constant population size if no additional measurements are available to obtain absolute abundances. A recent study by Remien et al. showed that using a relative abundance dataset has only a minor influence on the performance of network inference (Remien, Eckwright et al. 2018). They suggest that the assumption is not strictly necessary because the information in the relative abundance profile is sufficient to infer interactions. Therefore, the final simulated dataset used to evaluate the performance of both the correlation-based and regression-based methods is a relative abundance OTU profile.
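The normalization step described above can be sketched as follows. The dimensions (50 OTUs, 10 time points) come from the text, but the generator itself is hypothetical: the random-walk dynamics, the sequencing depths, and all function names are assumptions for illustration and do not reproduce how the simulated dataset in this study was built.

```python
import numpy as np

N_OTUS, N_TIMEPOINTS = 50, 10  # sizes taken from the text

def simulate_counts(n_otus, n_timepoints, depths, seed=0):
    """Draw read counts per time point at the given sequencing depths.

    Assumed generator: log-abundances follow a random walk over time,
    and reads are sampled from the resulting proportions.
    """
    rng = np.random.default_rng(seed)
    baseline = rng.normal(0.0, 1.5, size=n_otus)
    steps = rng.normal(0.0, 0.3, size=(n_timepoints, n_otus))
    abundance = np.exp(baseline + np.cumsum(steps, axis=0))
    proportions = abundance / abundance.sum(axis=1, keepdims=True)
    counts = np.vstack([rng.multinomial(int(d), p)
                        for d, p in zip(depths, proportions)])
    return counts

def to_relative(counts):
    """Convert counts to relative abundance within each sample (row)."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

# Unequal sequencing depth across the 10 time points.
depths = np.random.default_rng(42).integers(5_000, 50_000, size=N_TIMEPOINTS)
counts = simulate_counts(N_OTUS, N_TIMEPOINTS, depths)
profile = to_relative(counts)

print(counts.sum(axis=1))   # differs per time point (the depths)
print(profile.sum(axis=1))  # every row sums to 1 after normalization
```

After normalization every time point sums to one, so differences in sequencing depth no longer dominate comparisons between samples, but information about the total population size is lost. This is the constant-population-size assumption discussed above.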

There are a variety of network algorithms and tools to decipher microbial interactions. In this study, we compare the accuracy of the two most popular network inference tools using an in silico simulated dataset in which all microbial interactions are already known. Alshawaqfeh et al. (Alshawaqfeh, Serpedin et al. 2017) compared different network inference tools and algorithms to assess the performance of each method using synthetic datasets, which contained 10 bacteria with different levels of noise in the simulation. The tools they discussed included the Pearson correlation coefficient (PCC) and local similarity analysis (LSA), as well as algorithms including the gradient-free Nelder-Mead algorithm, the regression-based algorithm by Stein et al. (Stein, Bucci et al. 2013), and the algorithm by Fisher et al. (Fisher and Mehta 2014) called learning