Bayesian phylogenetic analysis supports an agricultural origin of Japonic languages.
Lee, Sean; Hasegawa, Toshikazu
2011-12-22
Languages, like genes, evolve by a process of descent with modification. This striking similarity between biological and linguistic evolution allows us to apply phylogenetic methods to explore how languages, as well as the people who speak them, are related to one another through evolutionary history. Language phylogenies constructed with lexical data have so far revealed population expansions of Austronesian, Indo-European and Bantu speakers. However, how robustly a phylogenetic approach can chart the history of language evolution and what language phylogenies reveal about human prehistory must be investigated more thoroughly on a global scale. Here we report a phylogeny of 59 Japonic languages and dialects. We used this phylogeny to estimate time depth of its root and compared it with the time suggested by an agricultural expansion scenario for Japanese origin. In agreement with the scenario, our results indicate that Japonic languages descended from a common ancestor approximately 2182 years ago. Together with archaeological and biological evidence, our results suggest that the first farmers of Japan had a profound impact on the origins of both people and languages. On a broader level, our results are consistent with a theory that agricultural expansion is the principal factor for shaping global linguistic diversity.
Estimating Bayesian Phylogenetic Information Content
Lewis, Paul O.; Chen, Ming-Hui; Kuo, Lynn; Lewis, Louise A.; Fučíková, Karolina; Neupane, Suman; Wang, Yu-Bo; Shi, Daoyuan
2016-01-01
Measuring the phylogenetic information content of data has a long history in systematics. Here we explore a Bayesian approach to information content estimation. The entropy of the posterior distribution compared with the entropy of the prior distribution provides a natural way to measure information content. If the data have no information relevant to ranking tree topologies beyond the information supplied by the prior, the posterior and prior will be identical. Information in data discourages consideration of some hypotheses allowed by the prior, resulting in a posterior distribution that is more concentrated (has lower entropy) than the prior. We focus on measuring information about tree topology using marginal posterior distributions of tree topologies. We show that both the accuracy and the computational efficiency of topological information content estimation improve with use of the conditional clade distribution, which also allows topological information content to be partitioned by clade. We explore two important applications of our method: providing a compelling definition of saturation and detecting conflict among data partitions that can negatively affect analyses of concatenated data. [Bayesian; concatenation; conditional clade distribution; entropy; information; phylogenetics; saturation.] PMID:27155008
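The entropy comparison described in this abstract can be illustrated with a minimal sketch. It assumes a uniform prior over unrooted binary topologies and estimates posterior entropy directly from sampled topology frequencies; it does not use the conditional clade distribution that the authors recommend. The function names and the sample counts are hypothetical.

```python
import math
from collections import Counter

def num_unrooted_topologies(n_taxa):
    """Number of unrooted binary trees on n_taxa labelled tips: (2n-5)!!."""
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

def entropy(probabilities):
    """Shannon entropy in nats, skipping zero-probability outcomes."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)

def topological_information(sampled_topologies, n_taxa):
    """Prior entropy minus posterior entropy, assuming a uniform topology prior.

    sampled_topologies is any list of hashable topology labels (hypothetical
    stand-ins for trees sampled by MCMC). Returns (information, fraction of
    the maximum possible information).
    """
    prior_entropy = math.log(num_unrooted_topologies(n_taxa))
    counts = Counter(sampled_topologies)
    total = sum(counts.values())
    posterior_entropy = entropy([c / total for c in counts.values()])
    information = prior_entropy - posterior_entropy
    fraction = information / prior_entropy if prior_entropy > 0 else 0.0
    return information, fraction

# Hypothetical posterior sample for 4 taxa (3 possible topologies).
sample = ["((A,B),(C,D))"] * 90 + ["((A,C),(B,D))"] * 5 + ["((A,D),(B,C))"] * 5
info, frac = topological_information(sample, n_taxa=4)
print(f"information = {info:.3f} nats ({100 * frac:.1f}% of maximum)")
```

A posterior spread evenly over all topologies yields zero information, and a posterior concentrated on one topology yields the full prior entropy, matching the interpretation given in the abstract.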
Bibi, F; Vrba, E; Fack, F
2012-09-01
Given that most species that have ever existed on Earth are extinct, no evolutionary history can ever be complete without the inclusion of fossil taxa. Bovids (antelopes and relatives) are one of the most diverse clades of large mammals alive today, with over a hundred living species and hundreds of documented fossil species. With the advent of molecular phylogenetics, major advances have been made in the phylogeny of this clade; however, there has been little attempt to integrate the fossil record into the developing phylogenetic picture. We here describe a new large fossil caprin species from ca. 1.9-Ma deposits from the Middle Awash, Ethiopia. To place the new species phylogenetically, we perform a Bayesian analysis of a combined molecular (cytochrome b) and morphological (osteological) character supermatrix. We include all living species of Caprini, the new fossil species, a fossil takin from the Pliocene of Ethiopia (Budorcas churcheri), and the insular subfossil Myotragus balearicus. The combined analysis demonstrates successful incorporation of both living and fossil species within a single phylogeny based on both molecular and morphological evidence. Analysis of the combined supermatrix produces better resolution than either the molecular or the morphological data set considered alone. Parsimony and Bayesian analyses of the data set are also compared and shown to produce similar results. The combined phylogenetic analysis indicates that the new fossil species is nested within Capra, making it one of the earliest representatives of this clade, with implications for molecular clock calibration. Geographical optimization indicates no fewer than four independent dispersals into Africa by caprins since the Pliocene.
Afreen, Nazia; Naqvi, Irshad H.; Broor, Shobha; Ahmed, Anwar; Kazim, Syed Naqui; Dohare, Ravins; Kumar, Manoj; Parveen, Shama
2016-01-01
Dengue fever is the most important arboviral disease in the tropical and sub-tropical countries of the world. Delhi, the metropolitan capital state of India, has reported many dengue outbreaks, with the last outbreak occurring in 2013. We have recently reported predominance of dengue virus serotype 2 during 2011–2014 in Delhi. In the present study, we report molecular characterization and evolutionary analysis of dengue serotype 2 viruses which were detected in 2011–2014 in Delhi. Envelope genes of 42 DENV-2 strains were sequenced in the study. All DENV-2 strains grouped within the Cosmopolitan genotype and further clustered into three lineages; Lineage I, II and III. Lineage III replaced lineage I during dengue fever outbreak of 2013. Further, a novel mutation Thr404Ile was detected in the stem region of the envelope protein of a single DENV-2 strain in 2014. Nucleotide substitution rate and time to the most recent common ancestor were determined by molecular clock analysis using Bayesian methods. A change in effective population size of Indian DENV-2 viruses was investigated through Bayesian skyline plot. The study will be a vital road map for investigation of epidemiology and evolutionary pattern of dengue viruses in India. PMID:26977703
Bayesian phylogenetic estimation of fossil ages
Drummond, Alexei J.; Stadler, Tanja
2016-01-01
Recent advances have allowed for both morphological fossil evidence and molecular sequences to be integrated into a single combined inference of divergence dates under the rule of Bayesian probability. In particular, the fossilized birth–death tree prior and the Lewis-Mk model of discrete morphological evolution allow for the estimation of both divergence times and phylogenetic relationships between fossil and extant taxa. We exploit this statistical framework to investigate the internal consistency of these models by producing phylogenetic estimates of the age of each fossil in turn, within two rich and well-characterized datasets of fossil and extant species (penguins and canids). We find that the estimation accuracy of fossil ages is generally high with credible intervals seldom excluding the true age and median relative error in the two datasets of 5.7% and 13.2%, respectively. The median relative standard error (RSD) was 9.2% and 7.2%, respectively, suggesting good precision, although with some outliers. In fact, in the two datasets we analyse, the phylogenetic estimate of fossil age is on average less than 2 Myr from the mid-point age of the geological strata from which it was excavated. The high level of internal consistency found in our analyses suggests that the Bayesian statistical model employed is an adequate fit for both the geological and morphological data, and provides evidence from real data that the framework used can accurately model the evolution of discrete morphological traits coded from fossil and extant taxa. We anticipate that this approach will have diverse applications beyond divergence time dating, including dating fossils that are temporally unconstrained, testing of the ‘morphological clock’, and for uncovering potential model misspecification and/or data errors when controversial phylogenetic hypotheses are obtained based on combined divergence dating analyses. This article is part of the themed issue ‘Dating species divergences
Shafer, Aaron B A; Stewart, Donald T
2007-07-01
The field of molecular systematics has relied heavily on mitochondrial DNA (mtDNA) analysis since its inception. Despite the obvious utility of mtDNA, such data inevitably only presents a limited (i.e., single genome) perspective on species evolution. A combination of mitochondrial and nuclear markers is essential for reconstructing more robust phylogenetic trees. To evaluate the utility of one category of nuclear marker (short interspersed elements or SINEs) for resolving phylogenetic relationships, we constructed an inter-SINE fingerprint for nine putative species of the genus Sorex. In addition, we analyzed 1011 nucleotides of the cytochrome b gene. Traditional neighbor-joining and maximum parsimony analyses were applied to the individual cytochrome b and inter-SINE fingerprint data sets, along with Bayesian analysis to the combined data sets. We found inter-SINE fingerprinting to be an effective species level marker; however, we were unable to reconstruct deeper branching patterns within the Sorex genus using these data. The combined data analyzed under a Bayesian analysis showed higher levels of structuring within the Otisorex subgenus, most notably recognizing a monophyletic group consisting of sister-taxa S. palustris and S. monticolus, S. cinereus and S. haydeni, and S. hoyi. An additional noteworthy result was the detection of an historic mitochondrial introgression event between S. monticolus and S. palustris. When combining disparate data sets, we emphasize researcher diligence as certain types of data and processes may overly influence the analysis. However, there is considerable phylogenetic potential stemming from inter-SINE fingerprinting.
Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.
Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A
2017-01-18
Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd.
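For readers unfamiliar with the clustering prior named in this abstract, the following is a minimal sketch of the standard (non-phylogenetic) Chinese restaurant process. It does not implement the authors' phylogenetic extension, which additionally ties cluster membership to the tree. The function name, the concentration parameter `alpha`, and the example values are assumptions made for illustration.

```python
import random

def sample_crp(n_items, alpha, rng):
    """Draw one partition of n_items from a standard Chinese restaurant process.

    Item i joins existing table k with probability proportional to the number
    of items already seated there, or opens a new table with probability
    proportional to alpha (alpha > 0).
    """
    assignments = []  # table index chosen by each item, in arrival order
    table_sizes = []  # current number of items at each table
    for _ in range(n_items):
        weights = table_sizes + [alpha]  # last entry stands for a new table
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)
        else:
            table_sizes[table] += 1
        assignments.append(table)
    return assignments, table_sizes

# Example: cluster 20 hypothetical strains with concentration alpha = 1.0.
clusters, sizes = sample_crp(20, alpha=1.0, rng=random.Random(42))
print("cluster label per strain:", clusters)
print("cluster sizes:", sizes)
```

Larger values of `alpha` favour more clusters; because the number of clusters is not fixed in advance, the prior is called nonparametric.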
Yuan, Ying; MacKinnon, David P.
2009-01-01
In this article, we propose Bayesian analysis of mediation effects. Compared with conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian…
Angeletti, Silvia; Lo Presti, Alessandra; Cella, Eleonora; Dicuonzo, Giordano; Crea, Francesca; Palazzotti, Bernardetta; Dedej, Etleva; Ciccozzi, Massimo; De Florio, Lucia
2015-12-01
Clinical Candida isolates from two different hospitals in Rome were identified and clustered by MALDI-TOF MS system and their origin and evolution estimated by Bayesian phylogenetic analysis. The different species of Candida were correctly identified and clustered separately, confirming the ability of these techniques to discriminate between different Candida species. Focusing MALDI-TOF analysis on a single Candida species, Candida albicans and Candida parapsilosis strains clustered differently by hospital setting and by period of isolation than Candida glabrata and Candida tropicalis isolates. The evolutionary rates of C. albicans and C. parapsilosis (1.93×10⁻² and 1.17×10⁻² substitutions/site/year, respectively) were in agreement with a higher rate of mutation of these species, even in a narrow period, than what was observed in C. glabrata and C. tropicalis strains (6.99×10⁻⁴ and 7.52×10⁻³ substitutions/site/year, respectively). C. albicans was the species with the highest between-clade and within-clade genetic distances, in agreement with the temporal clustering found by MALDI-TOF and with its high evolutionary rate of 1.93×10⁻² substitutions/site/year.
Phylogenetic Analyses: A Toolbox Expanding towards Bayesian Methods
Aris-Brosou, Stéphane; Xia, Xuhua
2008-01-01
The reconstruction of phylogenies is becoming an increasingly simple activity. This is mainly due to two reasons: the democratization of computing power and the increased availability of sophisticated yet user-friendly software. This review describes some of the latest additions to the phylogenetic toolbox, along with some of their theoretical and practical limitations. It is shown that Bayesian methods are under heavy development, as they offer the possibility to solve a number of long-standing issues and to integrate several steps of the phylogenetic analyses into a single framework. Specific topics include not only phylogenetic reconstruction, but also the comparison of phylogenies, the detection of adaptive evolution, and the estimation of divergence times between species. PMID:18483574
Golemba, Marcelo D.; Di Lello, Federico A.; Bessone, Fernando; Fay, Fabian; Benetti, Silvina; Jones, Leandro R.; Campos, Rodolfo H.
2010-01-01
Previous studies in Argentina have documented a general prevalence of Hepatitis C Virus (HCV) infection close to 2%. In addition, a high prevalence of HCV has been recently reported in different Argentinean small rural communities. In this work, we performed a study aimed at analyzing the origins and diversification patterns of an HCV outbreak in Wheelwright, a small rural town located in Santa Fe province (Argentina). A total of 89 out of 1814 blood samples collected from people living in Wheelwright were positive for HCV infection. The highest prevalence (4.9%) was observed in people older than 50 years, with the highest level for the group aged between 70–79 years (22%). The RFLP analyses showed that 91% of the positive samples belonged to the HCV-1b genotype. The E1/E2 and NS5B genes were sequenced, and their phylogenetic analysis showed that the HCV-1b sequences from Wheelwright were monophyletic. Bayesian coalescent-based methods were used to estimate substitution rates and time of the most recent common ancestor (tMRCA). The mean estimated substitution rates and the tMRCA for E1/E2 with and without HVR1 and NS5B were 7.41E-03 s/s/y and 61 years, 5.05E-03 s/s/y and 58 years and 3.24E-03 s/s/y and 53 years, respectively. In summary, the tMRCA values, the demographic model with constant population size, and the fact that the highest prevalence of infection was observed in elderly people support the hypothesis that the HCV-1b introduction in Wheelwright initially occurred at least five decades ago and that the early epidemic was characterized by a fast rate of virus transmission. The epidemic seems to have been controlled later on down to the standard transmission rates observed elsewhere. PMID:20090919
Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty.
Baele, Guy; Lemey, Philippe; Suchard, Marc A
2016-03-01
Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of "working distributions" to facilitate--or shorten--the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a "working" distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different "working" distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses.
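The stepping-stone idea mentioned in this abstract can be illustrated on a toy problem. The sketch below is not the generalized stepping-stone method of the article and involves no trees: it assumes a conjugate normal model (data with unit variance, a normal prior on the mean) chosen because every power posterior can be sampled exactly and the true marginal likelihood is known in closed form. The data values, the number of stones, and the spacing of the powers are all assumptions made for illustration.

```python
import math
import random

def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def log_likelihood(mu, data):
    """Log-likelihood of data under Normal(mu, 1)."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (y - mu) ** 2 for y in data)

def exact_log_marginal(data, tau):
    """Closed-form log marginal likelihood with prior mu ~ Normal(0, tau^2)."""
    n, s = len(data), sum(data)
    ss = sum(y * y for y in data)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * math.log(1 + n * tau ** 2)
            - 0.5 * (ss - tau ** 2 * s ** 2 / (1 + n * tau ** 2)))

def stepping_stone(data, tau, n_stones=32, n_samples=2000, seed=1):
    """Stepping-stone estimate of the log marginal likelihood.

    Power posterior at beta is prior * likelihood^beta, which here is
    Normal with precision 1/tau^2 + beta*n, so it is sampled directly
    instead of by MCMC.
    """
    rng = random.Random(seed)
    n, s = len(data), sum(data)
    # Powers concentrated near 0, where the power posterior changes fastest.
    betas = [(k / n_stones) ** (1 / 0.3) for k in range(n_stones + 1)]
    log_z = 0.0
    for b_prev, b_next in zip(betas, betas[1:]):
        precision = 1 / tau ** 2 + b_prev * n
        mean = b_prev * s / precision
        sd = math.sqrt(1 / precision)
        draws = [rng.gauss(mean, sd) for _ in range(n_samples)]
        # Each stone estimates the ratio of adjacent normalizing constants.
        log_z += log_mean_exp(
            [(b_next - b_prev) * log_likelihood(mu, data) for mu in draws])
    return log_z

data = [1.2, 0.7, 2.1, 1.5, 0.9]  # hypothetical observations
print("stepping-stone:", stepping_stone(data, tau=10.0))
print("exact:         ", exact_log_marginal(data, tau=10.0))
```

In a phylogenetic setting each power posterior would require its own MCMC run over trees and parameters, which is the source of the computational demands the abstract refers to.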
Bayesian Exploratory Factor Analysis
Conti, Gabriella; Frühwirth-Schnatter, Sylvia; Heckman, James J.; Piatek, Rémi
2014-01-01
This paper develops and applies a Bayesian approach to Exploratory Factor Analysis that improves on ad hoc classical approaches. Our framework relies on dedicated factor models and simultaneously determines the number of factors, the allocation of each measurement to a unique factor, and the corresponding factor loadings. Classical identification criteria are applied and integrated into our Bayesian procedure to generate models that are stable and clearly interpretable. A Monte Carlo study confirms the validity of the approach. The method is used to produce interpretable low dimensional aggregates from a high dimensional set of psychological measurements. PMID:25431517
Bayesian modelling of compositional heterogeneity in molecular phylogenetics.
Heaps, Sarah E; Nye, Tom M W; Boys, Richard J; Williams, Tom A; Embley, T Martin
2014-10-01
In molecular phylogenetics, standard models of sequence evolution generally assume that sequence composition remains constant over evolutionary time. However, this assumption is violated in many datasets which show substantial heterogeneity in sequence composition across taxa. We propose a model which allows compositional heterogeneity across branches, and formulate the model in a Bayesian framework. Specifically, the root and each branch of the tree is associated with its own composition vector whilst a global matrix of exchangeability parameters applies everywhere on the tree. We encourage borrowing of strength between branches by developing two possible priors for the composition vectors: one in which information can be exchanged equally amongst all branches of the tree and another in which more information is exchanged between neighbouring branches than between distant branches. We also propose a Markov chain Monte Carlo (MCMC) algorithm for posterior inference which uses data augmentation of substitutional histories to yield a simple complete data likelihood function that factorises over branches and allows Gibbs updates for most parameters. Standard phylogenetic models are not informative about the root position. Therefore a significant advantage of the proposed model is that it allows inference about rooted trees. The position of the root is fundamental to the biological interpretation of trees, both for polarising trait evolution and for establishing the order of divergence among lineages. Furthermore, unlike some other related models from the literature, inference in the model we propose can be carried out through a simple MCMC scheme which does not require problematic dimension-changing moves. We investigate the performance of the model and priors in analyses of two alignments for which there is strong biological opinion about the tree topology and root position.
Tracing the roots of syntax with Bayesian phylogenetics.
Maurits, Luke; Griffiths, Thomas L
2014-09-16
The ordering of subject, verb, and object is one of the fundamental components of the syntax of natural languages. The distribution of basic word orders across the world's languages is highly nonuniform, with the majority of languages being either subject-object-verb (SOV) or subject-verb-object (SVO). Explaining this fact using psychological accounts of language acquisition or processing requires understanding how the present distribution has resulted from ancestral distributions and the rates of change between orders. We show that Bayesian phylogenetics can provide quantitative answers to three important questions: how word orders are likely to change over time, which word orders were dominant historically, and whether strong inferences about the origins of syntax can be drawn from modern languages. We find that SOV to SVO change is more common than the reverse and VSO to SVO change is more common than VSO to SOV, and that if the seven language families we consider share a common ancestor then that common ancestor likely had SOV word order, but also that there are limits on how confidently we can make inferences about ancestral word order based on modern-day observations. These results shed new light on old questions from historical linguistics and provide clear targets for psychological explanations of word-order distributions.
BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC
Satija, Rahul; Novák, Ádám; Miklós, István; Lyngsø, Rune; Hein, Jotun
2009-01-01
Background: We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden Markov model algorithms to analyze up to four sequences. Results: We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC) approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the α-globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. Conclusion: BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from PMID:19715598
Dembo, Mana; Radovčić, Davorka; Garvin, Heather M; Laird, Myra F; Schroeder, Lauren; Scott, Jill E; Brophy, Juliet; Ackermann, Rebecca R; Musiba, Chares M; de Ruiter, Darryl J; Mooers, Arne Ø; Collard, Mark
2016-08-01
Homo naledi is a recently discovered species of fossil hominin from South Africa. A considerable amount is already known about H. naledi but some important questions remain unanswered. Here we report a study that addressed two of them: "Where does H. naledi fit in the hominin evolutionary tree?" and "How old is it?" We used a large supermatrix of craniodental characters for both early and late hominin species and Bayesian phylogenetic techniques to carry out three analyses. First, we performed a dated Bayesian analysis to generate estimates of the evolutionary relationships of fossil hominins including H. naledi. Then we employed Bayes factor tests to compare the strength of support for hypotheses about the relationships of H. naledi suggested by the best-estimate trees. Lastly, we carried out a resampling analysis to assess the accuracy of the age estimate for H. naledi yielded by the dated Bayesian analysis. The analyses strongly supported the hypothesis that H. naledi forms a clade with the other Homo species and Australopithecus sediba. The analyses were more ambiguous regarding the position of H. naledi within the (Homo, Au. sediba) clade. A number of hypotheses were rejected, but several others were not. Based on the available craniodental data, Homo antecessor, Asian Homo erectus, Homo habilis, Homo floresiensis, Homo sapiens, and Au. sediba could all be the sister taxon of H. naledi. According to the dated Bayesian analysis, the most likely age for H. naledi is 912 ka. This age estimate was supported by the resampling analysis. Our findings have a number of implications. Most notably, they support the assignment of the new specimens to Homo, cast doubt on the claim that H. naledi is simply a variant of H. erectus, and suggest H. naledi is younger than has been previously proposed.
Calibrated birth-death phylogenetic time-tree priors for bayesian inference.
Heled, Joseph; Drummond, Alexei J
2015-05-01
Here we introduce a general class of multiple calibration birth-death tree priors for use in Bayesian phylogenetic inference. All tree priors in this class separate ancestral node heights into a set of "calibrated nodes" and "uncalibrated nodes" such that the marginal distribution of the calibrated nodes is user-specified whereas the density ratio of the birth-death prior is retained for trees with equal values for the calibrated nodes. We describe two formulations, one in which the calibration information informs the prior on ranked tree topologies, through the (conditional) prior, and the other which factorizes the prior on divergence times and ranked topologies, thus allowing uniform, or any arbitrary prior distribution on ranked topologies. Although the first of these formulations has some attractive properties, the algorithm we present for computing its prior density is computationally intensive. However, the second formulation is always faster and computationally efficient for up to six calibrations. We demonstrate the utility of the new class of multiple-calibration tree priors using both small simulations and a real-world analysis and compare the results to existing schemes. The two new calibrated tree priors described in this article offer greater flexibility and control of prior specification in calibrated time-tree inference and divergence time dating, and will remove the need for indirect approaches to the assessment of the combined effect of calibration densities and tree priors in Bayesian phylogenetic inference.
Höhna, Sebastian; Landis, Michael J; Heath, Tracy A; Boussau, Bastien; Lartillot, Nicolas; Moore, Brian R; Huelsenbeck, John P; Ronquist, Fredrik
2016-07-01
Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com. [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.] PMID:27235697
Höhna, Sebastian; Landis, Michael J.
2016-01-01
Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com. [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.] PMID:27235697
Phylogenetic analysis of honey bee behavioral evolution.
Raffiudin, Rika; Crozier, Ross H
2007-05-01
DNA sequences from three mitochondrial (rrnL, cox2, nad2) and one nuclear gene (itpr) from all 9 known honey bee species (Apis), a 10th possible species, Apis dorsata binghami, and three outgroup species (Bombus terrestris, Melipona bicolor and Trigona fimbriata) were used to infer Apis phylogenetic relationships using Bayesian analysis. The dwarf honey bees were confirmed as basal, and the giant and cavity-nesting species to be monophyletic. All nodes were strongly supported except that grouping Apis cerana with A. nigrocincta. Two thousand post-burnin trees from the phylogenetic analysis were used in a Bayesian comparative analysis to explore the evolution of dance type, nest structure, comb structure and dance sound within Apis. The ancestral honey bee species was inferred with high support to have nested in the open, and to have more likely than not had a silent vertical waggle dance and a single comb. The common ancestor of the giant and cavity-dwelling bees is strongly inferred to have had a buzzing vertical directional dance. All pairwise combinations of characters showed strong association, but the multiple comparisons problem reduces the ability to infer associations between states between characters. Nevertheless, a buzzing dance is significantly associated with cavity-nesting, several vertical combs, and dancing vertically, a horizontal dance is significantly associated with a nest with a single comb wrapped around the support, and open nesting with a single pendant comb and a silent waggle dance.
Brandley, Matthew C; Schmitz, Andreas; Reeder, Tod W
2005-06-01
Partitioned Bayesian analyses of approximately 2.2 kb of nucleotide sequence data (mtDNA) were used to elucidate phylogenetic relationships among 30 scincid lizard genera. Few partitioned Bayesian analyses exist in the literature, resulting in a lack of methods to determine the appropriate number of and identity of partitions. Thus, a criterion, based on the Bayes factor, for selecting among competing partitioning strategies is proposed and tested. Improvements in both mean -lnL and estimated posterior probabilities were observed when specific models and parameter estimates were assumed for partitions of the total data set. This result is expected given that the 95% credible intervals of model parameter estimates for numerous partitions do not overlap and it reveals that different data partitions may evolve quite differently. We further demonstrate that how one partitions the data (by gene, codon position, etc.) is a greater concern than simply the overall number of partitions. Using the criterion of the 2 ln Bayes factor > 10, the phylogenetic analysis employing the largest number of partitions was decisively better than all other strategies. Strategies that partitioned the ND1 gene by codon position performed better than other partition strategies, regardless of the overall number of partitions. Scincidae, Acontinae, Lygosominae, east Asian and North American "Eumeces" + Neoseps; North African Eumeces, Scincus, and Scincopus, and a large group primarily from sub-Saharan Africa, Madagascar, and neighboring islands are monophyletic. Feylinia, a limbless group of previously uncertain relationships, is nested within a "scincine" clade from sub-Saharan Africa. We reject the hypothesis that the nearly limbless dibamids are derived from within the Scincidae, but cannot reject the hypothesis that they represent the sister taxon to skinks. Amphiglossus, Chalcides, the acontines Acontias and Typhlosaurus, and Scincinae are paraphyletic. The globally widespread
Matthews, Luke J.; Tehrani, Jamie J.; Jordan, Fiona M.; Collard, Mark; Nunn, Charles L.
2011-01-01
Background: Archaeologists and anthropologists have long recognized that different cultural complexes may have distinct descent histories, but they have lacked analytical techniques capable of easily identifying such incongruence. Here, we show how Bayesian phylogenetic analysis can be used to identify incongruent cultural histories. We employ the approach to investigate Iranian tribal textile traditions. Methods: We used Bayes factor comparisons in a phylogenetic framework to test two models of cultural evolution: the hierarchically integrated system hypothesis and the multiple coherent units hypothesis. In the hierarchically integrated system hypothesis, a core tradition of characters evolves through descent with modification and characters peripheral to the core are exchanged among contemporaneous populations. In the multiple coherent units hypothesis, a core tradition does not exist. Rather, there are several cultural units consisting of sets of characters that have different histories of descent. Results: For the Iranian textiles, the Bayesian phylogenetic analyses supported the multiple coherent units hypothesis over the hierarchically integrated system hypothesis. Our analyses suggest that pile-weave designs represent a distinct cultural unit that has a different phylogenetic history compared to other textile characters. Conclusions: The results from the Iranian textiles are consistent with the available ethnographic evidence, which suggests that the commercial rug market has influenced pile-rug designs but not the techniques or designs incorporated in the other textiles produced by the tribes. We anticipate that Bayesian phylogenetic tests for inferring cultural units will be of great value for researchers interested in studying the evolution of cultural traits including language, behavior, and material culture. PMID:21559083
Bayesian analysis of rare events
NASA Astrophysics Data System (ADS)
Straub, Daniel; Papaioannou, Iason; Betz, Wolfgang
2016-06-01
In many areas of engineering and science there is an interest in predicting the probability of rare events, in particular in applications related to safety and security. Increasingly, such predictions are made through computer models of physical systems in an uncertainty quantification framework. Additionally, with advances in IT, monitoring and sensor technology, an increasing amount of data on the performance of the systems is collected. This data can be used to reduce uncertainty, improve the probability estimates and consequently enhance the management of rare events and associated risks. Bayesian analysis is the ideal method to include the data into the probabilistic model. It ensures a consistent probabilistic treatment of uncertainty, which is central in the prediction of rare events, where extrapolation from the domain of observation is common. We present a framework for performing Bayesian updating of rare event probabilities, termed BUS. It is based on a reinterpretation of the classical rejection-sampling approach to Bayesian analysis, which enables the use of established methods for estimating probabilities of rare events. By drawing upon these methods, the framework makes use of their computational efficiency. These methods include the First-Order Reliability Method (FORM), tailored importance sampling (IS) methods and Subset Simulation (SuS). In this contribution, we briefly review these methods in the context of the BUS framework and investigate their applicability to Bayesian analysis of rare events in different settings. We find that, for some applications, FORM can be highly efficient and is surprisingly accurate, enabling Bayesian analysis of rare events with just a few model evaluations. In a general setting, BUS implemented through IS and SuS is more robust and flexible.
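The rejection-sampling idea behind BUS can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the standard-normal prior, the single noisy observation, the failure threshold, and the function names are all assumptions chosen so that an exact answer exists for comparison. The event is also deliberately not very rare, so that plain Monte Carlo suffices; the abstract's point is that FORM, IS, or SuS would replace the plain sampling loop for truly rare events.

```python
import math
import random


def bus_rejection(n_prior, y_obs=1.0, sigma=0.5, threshold=1.5, seed=0):
    """Estimate P(theta > threshold | y_obs) by rejection sampling.

    Assumed toy model: theta ~ N(0, 1), y_obs = theta + N(0, sigma^2) noise.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_prior):
        theta = rng.gauss(0.0, 1.0)  # draw from the prior
        # Likelihood scaled so that its maximum is 1, hence the constant c = 1.
        likelihood = math.exp(-0.5 * ((y_obs - theta) / sigma) ** 2)
        # Accept when an auxiliary uniform variable falls below c * L(theta).
        if rng.random() <= likelihood:
            accepted.append(theta)
    failures = sum(1 for t in accepted if t > threshold)
    return failures / len(accepted), len(accepted)


def exact_posterior_failure(y_obs=1.0, sigma=0.5, threshold=1.5):
    """Closed-form answer for the conjugate normal toy model above."""
    post_var = 1.0 / (1.0 + 1.0 / sigma**2)
    post_mean = post_var * y_obs / sigma**2
    z = (threshold - post_mean) / math.sqrt(post_var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


estimate, n_accepted = bus_rejection(200_000)
print(f"estimate={estimate:.4f}, exact={exact_posterior_failure():.4f}, accepted={n_accepted}")
```

The accepted samples are draws from the posterior, so the fraction that exceeds the threshold estimates the updated event probability.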
Bayesian Model Averaging for Propensity Score Analysis
ERIC Educational Resources Information Center
Kaplan, David; Chen, Jianshen
2013-01-01
The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…
Geometric ergodicity of a hybrid sampler for Bayesian inference of phylogenetic branch lengths.
Spade, David A; Herbei, Radu; Kubatko, Laura S
2015-10-01
One of the fundamental goals in phylogenetics is to make inferences about the evolutionary pattern among a group of individuals, such as genes or species, using present-day genetic material. This pattern is represented by a phylogenetic tree, and as computational methods have caught up to the statistical theory, Bayesian methods of making inferences about phylogenetic trees have become increasingly popular. Bayesian inference of phylogenetic trees requires sampling from intractable probability distributions. Common methods of sampling from these distributions include Markov chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) methods, and one way that both of these methods can proceed is by first simulating a tree topology and then taking a sample from the posterior distribution of the branch lengths given the tree topology and the data set. In many MCMC methods, it is difficult to verify that the underlying Markov chain is geometrically ergodic, and thus, it is necessary to rely on output-based convergence diagnostics in order to assess convergence on an ad hoc basis. These diagnostics suffer from several important limitations, so in an effort to circumvent these limitations, this work establishes geometric convergence for a particular Markov chain that is used to sample branch lengths under a fairly general class of nucleotide substitution models and provides a numerical method for estimating the time this Markov chain takes to converge.
Phylogenetic analysis of adenovirus sequences.
Harrach, Balázs; Benko, Mária
2007-01-01
Members of the family Adenoviridae have been isolated from a large variety of hosts, including representatives from every major vertebrate class from fish to mammals. The high prevalence, together with the fairly conserved organization of the central part of their genomes, make the adenoviruses one of (if not the) best models for studying viral evolution on a larger time scale. Phylogenetic calculation can infer the evolutionary distance among adenovirus strains on serotype, species, and genus levels, thus helping the establishment of a correct taxonomy on the one hand, and speeding up the process of typing new isolates on the other. Initially, four major lineages corresponding to four genera were recognized. Later, the demarcation criteria of lower taxon levels, such as species or types, could also be defined with phylogenetic calculations. A limited number of possible host switches have been hypothesized and convincingly supported. Application of the web-based BLAST and MultAlin programs and the freely available PHYLIP package, along with the TreeView program, enables everyone to make correct calculations. In addition to step-by-step instruction on how to perform phylogenetic analysis, critical points where typical mistakes or misinterpretation of the results might occur will be identified and hints for their avoidance will be provided.
RWTY (R We There Yet): An R package for examining convergence of Bayesian phylogenetic analyses.
Warren, Dan L; Geneva, Anthony J; Lanfear, Robert
2017-01-12
Bayesian inference using Markov chain Monte Carlo (MCMC) has become one of the primary methods used to infer phylogenies from sequence data. Assessing convergence is a crucial component of these analyses, as it establishes the reliability of the posterior distribution estimates of the tree topology and model parameters sampled from the MCMC. Numerous tests and visualizations have been developed for this purpose, but many of the most popular methods are implemented in ways that make them inconvenient to use for large data sets. RWTY is an R package that implements established and new methods for diagnosing phylogenetic MCMC convergence in a single convenient interface.
Bayesian Statistics for Biological Data: Pedigree Analysis
ERIC Educational Resources Information Center
Stanfield, William D.; Carlton, Matthew A.
2004-01-01
Bayes' formula is applied to the biological problem of pedigree analysis to show that Bayes' formula and non-Bayesian or "classical" methods of probability calculation give different answers. First-year college students of biology can be introduced to Bayesian statistics.
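A small worked example shows how the two calculations can diverge. The scenario is a textbook-style assumption, not taken from the article: a woman whose mother is a known carrier of an X-linked recessive allele has a prior carrier probability of 1/2, and she has two unaffected sons. The function name is hypothetical.

```python
from fractions import Fraction


def carrier_posterior(prior_carrier, unaffected_sons):
    """Posterior probability of being a carrier given unaffected sons."""
    # Each son of a carrier is unaffected with probability 1/2.
    like_carrier = Fraction(1, 2) ** unaffected_sons
    # Sons of a non-carrier are always unaffected (new mutations ignored).
    like_noncarrier = Fraction(1)
    numerator = prior_carrier * like_carrier
    denominator = numerator + (1 - prior_carrier) * like_noncarrier
    return numerator / denominator


prior = Fraction(1, 2)
posterior = carrier_posterior(prior, unaffected_sons=2)
print(prior, posterior)
```

Under these assumptions the pedigree-only answer stays at 1/2, whereas conditioning on the two unaffected sons lowers the carrier probability to 1/5.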
Bayesian phylogeny analysis via stochastic approximation Monte Carlo.
Cheon, Sooyoung; Liang, Faming
2009-11-01
Monte Carlo methods have received much attention in the recent literature of phylogeny analysis. However, the conventional Markov chain Monte Carlo algorithms, such as the Metropolis-Hastings algorithm, tend to get trapped in a local mode in simulating from the posterior distribution of phylogenetic trees, rendering the inference ineffective. In this paper, we apply an advanced Monte Carlo algorithm, the stochastic approximation Monte Carlo (SAMC) algorithm, to Bayesian phylogeny analysis. Our method is compared with two popular Bayesian phylogeny software packages, BAMBE and MrBayes, on simulated and real datasets. The numerical results indicate that our method outperforms BAMBE and MrBayes. Among the three methods, SAMC produces the consensus trees that have the highest similarity to the true trees and the model parameter estimates that have the smallest mean square errors, while costing the least CPU time.
Molecular phylogenetic analysis of the Papionina using concatenation and species tree methods.
Guevara, Elaine E; Steiper, Michael E
2014-01-01
The Papionina is a geographically widespread subtribe of African cercopithecid monkeys whose evolutionary history is of particular interest to anthropologists. The phylogenetic relationships among arboreal mangabeys (Lophocebus), baboons (Papio), and geladas (Theropithecus) remain unresolved. Molecular phylogenetic analyses have revealed marked gene tree incongruence for these taxa, and several recent concatenated phylogenetic analyses of multilocus datasets have supported different phylogenetic hypotheses. To address this issue, we investigated the phylogeny of the Lophocebus + Papio + Theropithecus group using concatenation methods, as well as alternative methods that incorporate gene tree heterogeneity to estimate a 'species tree.' Our compiled DNA sequence dataset was ∼56 kb pairs long and included 57 independent partitions. All analyses of concatenated alignments strongly supported a Lophocebus + Papio clade and a basal position for Theropithecus. The Bayesian concordance analysis supported the same phylogeny. A coalescent-based Bayesian method resulted in a very poorly resolved species tree. The topological agreement between concatenation and the Bayesian concordance analysis offers considerable support for a Lophocebus + Papio clade as the dominant relationship across the genome. However, the results of the Bayesian concordance analysis indicate that almost half the genome has an alternative history. As such, our results offer a well-supported phylogenetic hypothesis for the Papio/Lophocebus/Theropithecus trichotomy, while at the same time providing evidence for a complex evolutionary history that likely includes hybridization among lineages.
Bayesian robust principal component analysis.
Ding, Xinghao; He, Lihan; Carin, Lawrence
2011-12-01
A hierarchical Bayesian model is considered for decomposing a matrix into low-rank and sparse components, assuming the observed matrix is a superposition of the two. The matrix is assumed noisy, with unknown and possibly non-stationary noise statistics. The Bayesian framework infers an approximate representation for the noise statistics while simultaneously inferring the low-rank and sparse-outlier contributions; the model is robust to a broad range of noise levels, without having to change model hyperparameter settings. In addition, the Bayesian framework allows exploitation of additional structure in the matrix. For example, in video applications each row (or column) corresponds to a video frame, and we introduce a Markov dependency between consecutive rows in the matrix (corresponding to consecutive frames in the video). The properties of this Markov process are also inferred based on the observed matrix, while simultaneously denoising and recovering the low-rank and sparse components. We compare the Bayesian model to a state-of-the-art optimization-based implementation of robust PCA; considering several examples, we demonstrate competitive performance of the proposed model.
A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research
ERIC Educational Resources Information Center
van de Schoot, Rens; Kaplan, David; Denissen, Jaap; Asendorpf, Jens B.; Neyer, Franz J.; van Aken, Marcel A. G.
2014-01-01
Bayesian statistical methods are becoming ever more popular in applied and fundamental research. In this study a gentle introduction to Bayesian analysis is provided. It is shown under what circumstances it is attractive to use Bayesian estimation, and how to interpret properly the results. First, the ingredients underlying Bayesian methods are…
Bayesian analysis for kaon photoproduction
Marsainy, T.; Mart, T.
2014-09-25
We have investigated the contribution of the nucleon resonances in the kaon photoproduction process by using an established statistical decision-making method, i.e. the Bayesian method. This method not only evaluates the model over its entire parameter space, but also takes the prior information and experimental data into account. The result indicates that certain resonances have larger probabilities to contribute to the process.
Bayesian Meta-Analysis of Coefficient Alpha
ERIC Educational Resources Information Center
Brannick, Michael T.; Zhang, Nanhua
2013-01-01
The current paper describes and illustrates a Bayesian approach to the meta-analysis of coefficient alpha. Alpha is the most commonly used estimate of the reliability or consistency (freedom from measurement error) for educational and psychological measures. The conventional approach to meta-analysis uses inverse variance weights to combine…
An Integrated Bayesian Model for DIF Analysis
ERIC Educational Resources Information Center
Soares, Tufi M.; Goncalves, Flavio B.; Gamerman, Dani
2009-01-01
In this article, an integrated Bayesian model for differential item functioning (DIF) analysis is proposed. The model is integrated in the sense of modeling the responses along with the DIF analysis. This approach allows DIF detection and explanation in a simultaneous setup. Previous empirical studies and/or subjective beliefs about the item…
Mesoamerican tree squirrels evolution (Rodentia: Sciuridae): a molecular phylogenetic analysis.
Villalobos, Federico; Gutierrez-Espeleta, Gustavo
2014-06-01
The tribe Sciurini comprises the genera Sciurus, Syntheosciurus, Microsciurus, Tamiasciurus and Rheithrosciurus. The phylogenetic relationships within Sciurus have been only partially resolved, and the relationship between Mesoamerican species remains unsolved. The phylogenetic relationships of the Mesoamerican tree squirrels were examined using molecular data. Publicly available sequence data (12S, 16S, CYTB mitochondrial genes and IRBP nuclear gene) and cytochrome B gene sequences of four previously unsampled Mesoamerican Sciurus species were analyzed under a Bayesian multispecies coalescence model. Phylogenetic analysis of the multilocus data set showed the neotropical tree squirrels as a monophyletic clade. The genus Sciurus was paraphyletic due to the inclusion of Microsciurus species (M. alfari and M. flaviventer). The South American species S. aestuans and S. stramineus showed a sister-taxon relationship. Single-locus analysis based on the most compact and complete data set (i.e. CYTB gene sequences) supported the monophyly of the South American species and recovered a Mesoamerican clade including S. aureogaster, S. granatensis and S. variegatoides. These results corroborated previous findings based on cladistic analysis of cranial and post-cranial characters. Our data support a close relationship between Mesoamerican Sciurus species and a sister relationship with South American species, and corroborate previous findings regarding the polyphyly of Microsciurus and the paraphyly of Syntheosciurus.
Bayesian Correlation Analysis for Sequence Count Data
Lau, Nelson; Perkins, Theodore J.
2016-01-01
Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities’ measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low—especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities’ signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset. PMID:27701449
Bayesian Correlation Analysis for Sequence Count Data.
Sánchez-Taltavull, Daniel; Ramachandran, Parameswaran; Lau, Nelson; Perkins, Theodore J
2016-01-01
Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities' measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low, especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities' signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset.
[Analysis phylogenetic relationship of Gynostemma (Cucurbitaceae)].
Qin, Shuang-shuang; Li, Hai-tao; Wang, Zhou-yong; Cui, Zhan-hu; Yu, Li-ying
2015-05-01
The ITS, matK, rbcL and psbA-trnH sequences of 38 samples representing 9 Gynostemma species or varieties were compared and analyzed by molecular phylogenetic methods. Hemsleya macrosperma was designated as the outgroup. MP and NJ phylogenetic trees of Gynostemma were built based on the ITS sequence. The PAUP phylogenetic analysis showed the following results: (1) The eight individuals of G. pentaphyllum var. pentaphyllum were not supported as monophyletic in the strict consensus trees and NJ trees. (2) It is doubtful whether G. longipes and G. laxum should be classified as independent species. (3) The classification of the subgenus units of Gynostemma plants is supported.
A Comprehensive Phylogenetic Analysis of Deadenylases
Pavlopoulou, Athanasia; Vlachakis, Dimitrios; Balatsos, Nikolaos A.A.; Kossida, Sophia
2013-01-01
Deadenylases catalyze the shortening of the poly(A) tail at the messenger ribonucleic acid (mRNA) 3′-end in eukaryotes. Therefore, these enzymes influence mRNA decay, and constitute a major emerging group of promising anti-cancer pharmacological targets. Herein, we conducted full phylogenetic analyses of the deadenylase homologs in all available genomes in an effort to investigate evolutionary relationships between the deadenylase families and to identify invariant residues, which probably play key roles in the function of deadenylation across species. Our study includes both major Asp-Glu-Asp-Asp (DEDD) and exonuclease-endonuclease-phosphatase (EEP) deadenylase superfamilies. The phylogenetic analysis has provided us with important information regarding conserved and invariant deadenylase amino acids across species. Knowledge of the phylogenetic properties and evolution of the domain of deadenylases provides the foundation for the targeted drug design in the pharmaceutical industry and modern exonuclease anti-cancer scientific research. PMID:24348009
A SAS Interface for Bayesian Analysis with WinBUGS
ERIC Educational Resources Information Center
Zhang, Zhiyong; McArdle, John J.; Wang, Lijuan; Hamagami, Fumiaki
2008-01-01
Bayesian methods are becoming very popular despite some practical difficulties in implementation. To assist in the practical application of Bayesian methods, we show how to implement Bayesian analysis with WinBUGS as part of a standard set of SAS routines. This implementation procedure is first illustrated by fitting a multiple regression model…
Posada, David; Buckley, Thomas R
2004-10-01
Model selection is a topic of special relevance in molecular phylogenetics that affects many, if not all, stages of phylogenetic inference. Here we discuss some fundamental concepts and techniques of model selection in the context of phylogenetics. We start by reviewing different aspects of the selection of substitution models in phylogenetics from a theoretical, philosophical and practical point of view, and summarize this comparison in table format. We argue that the most commonly implemented model selection approach, the hierarchical likelihood ratio test, is not the optimal strategy for model selection in phylogenetics, and that approaches like the Akaike Information Criterion (AIC) and Bayesian methods offer important advantages. In particular, the latter two methods are able to simultaneously compare multiple nested or nonnested models, assess model selection uncertainty, and allow for the estimation of phylogenies and model parameters using all available models (model-averaged inference or multimodel inference). We also describe how the relative importance of the different parameters included in substitution models can be depicted. To illustrate some of these points, we have applied AIC-based model averaging to 37 mitochondrial DNA sequences from the subgenus Ohomopterus (genus Carabus) ground beetles described by Sota and Vogler (2001).
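AIC-based model averaging rests on Akaike weights, which can be sketched as follows. The three candidate models, their log-likelihoods, and their parameter counts are invented for illustration and do not come from the Ohomopterus analysis; the "includes gamma rate variation" flag is likewise a hypothetical stand-in for any substitution-model parameter whose relative importance is of interest.

```python
import math

# Hypothetical candidates:
# name -> (maximized log-likelihood, free parameters, includes gamma rate variation?)
candidates = {
    "M1": (-1010.0, 1, False),
    "M2": (-1005.0, 5, True),
    "M3": (-1004.5, 9, True),
}


def akaike_weights(models):
    """Return the Akaike weight of each model (weights sum to 1)."""
    # AIC = 2k - 2 ln L
    aic = {name: 2 * k - 2 * lnl for name, (lnl, k, _) in models.items()}
    best = min(aic.values())
    # Relative likelihood of each model given its AIC difference from the best.
    rel = {name: math.exp(-0.5 * (a - best)) for name, a in aic.items()}
    total = sum(rel.values())
    return {name: r / total for name, r in rel.items()}


def parameter_importance(models, weights):
    """Sum the weights of all models that include the flagged parameter."""
    return sum(weights[name] for name, (_, _, has_param) in models.items() if has_param)


weights = akaike_weights(candidates)
importance = parameter_importance(candidates, weights)
print(weights, importance)
```

A model-averaged estimate of any quantity is then the weight-weighted sum of the per-model estimates, which is how uncertainty about the model itself enters the inference.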
Integrative bayesian network analysis of genomic data.
Ni, Yang; Stingo, Francesco C; Baladandayuthapani, Veerabhadran
2014-01-01
Rapid development of genome-wide profiling technologies has made it possible to conduct integrative analysis on genomic data from multiple platforms. In this study, we develop a novel integrative Bayesian network approach to investigate the relationships between genetic and epigenetic alterations as well as how these mutations affect a patient's clinical outcome. We take a Bayesian network approach that admits a convenient decomposition of the joint distribution into local distributions. Exploiting the prior biological knowledge about regulatory mechanisms, we model each local distribution as linear regressions. This allows us to analyze multi-platform genome-wide data in a computationally efficient manner. We illustrate the performance of our approach through simulation studies. Our methods are motivated by and applied to a multi-platform glioblastoma dataset, from which we reveal several biologically relevant relationships that have been validated in the literature as well as new genes that could potentially be novel biomarkers for cancer progression.
On the analysis of phylogenetically paired designs
Funk, Jennifer L; Rakovski, Cyril S; Macpherson, J Michael
2015-01-01
As phylogenetically controlled experimental designs become increasingly common in ecology, the need arises for a standardized statistical treatment of these datasets. Phylogenetically paired designs circumvent the need for resolved phylogenies and have been used to compare species groups, particularly in the areas of invasion biology and adaptation. Despite the widespread use of this approach, the statistical analysis of paired designs has not been critically evaluated. We propose a mixed model approach that includes random effects for pair and species. These random effects introduce a “two-layer” compound symmetry variance structure that captures both the correlations between observations on related species within a pair as well as the correlations between the repeated measurements within species. We conducted a simulation study to assess the effect of model misspecification on Type I and II error rates. We also provide an illustrative example with data containing taxonomically similar species and several outcome variables of interest. We found that a mixed model with species and pair as random effects performed better in these phylogenetically explicit simulations than two commonly used reference models (no or single random effect) by optimizing Type I error rates and power. The proposed mixed model produces acceptable Type I and II error rates despite the absence of a phylogenetic tree. This design can be generalized to a variety of datasets to analyze repeated measurements in clusters of related subjects/species. PMID:25750719
Bayesian Analysis of Individual Level Personality Dynamics
Cripps, Edward; Wood, Robert E.; Beckmann, Nadin; Lau, John; Beckmann, Jens F.; Cripps, Sally Ann
2016-01-01
A Bayesian technique with analyses of within-person processes at the level of the individual is presented. The approach is used to examine whether the patterns of within-person responses on a 12-trial simulation task are consistent with the predictions of ITA theory (Dweck, 1999). ITA theory states that the performance of an individual with an entity theory of ability is more likely to spiral down following a failure experience than the performance of an individual with an incremental theory of ability. This is because entity theorists interpret failure experiences as evidence of a lack of ability which they believe is largely innate and therefore relatively fixed; whilst incremental theorists believe in the malleability of abilities and interpret failure experiences as evidence of more controllable factors such as poor strategy or lack of effort. The results of our analyses support ITA theory at both the within- and between-person levels of analyses and demonstrate the benefits of Bayesian techniques for the analysis of within-person processes. These include more formal specification of the theory and the ability to draw inferences about each individual, which allows for more nuanced interpretations of individuals within a personality category, such as differences in the individual probabilities of spiraling. While Bayesian techniques have many potential advantages for the analyses of processes at the level of the individual, ease of use is not one of them for psychologists trained in traditional frequentist statistical techniques. PMID:27486415
Bayesian model selection analysis of WMAP3
Parkinson, David; Mukherjee, Pia; Liddle, Andrew R.
2006-06-15
We present a Bayesian model selection analysis of WMAP3 data using our code CosmoNest. We focus on the density perturbation spectral index n_S and the tensor-to-scalar ratio r, which define the plane of slow-roll inflationary models. We find that while the Bayesian evidence supports the conclusion that n_S ≠ 1, the data are not yet powerful enough to do so at a strong or decisive level. If tensors are assumed absent, the current odds are approximately 8 to 1 in favor of n_S ≠ 1 under our assumptions, when WMAP3 data is used together with external data sets. WMAP3 data on its own is unable to distinguish between the two models. Further, inclusion of r as a parameter weakens the conclusion against the Harrison-Zel'dovich case (n_S = 1, r = 0), albeit in a prior-dependent way. In appendices we describe the CosmoNest code in detail, noting its ability to supply posterior samples as well as to accurately compute the Bayesian evidence. We make a first public release of CosmoNest, now available at www.cosmonest.org.
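To put the quoted odds of "approximately 8 to 1" on a more familiar scale, the short sketch below converts odds between two models into a posterior probability and a log-odds value. It assumes the quoted figure can be read as posterior odds between exactly two competing models; the function name `odds_to_probability` is illustrative and is not part of CosmoNest.

```python
import math

def odds_to_probability(posterior_odds):
    """Convert odds of model A over model B into P(A), assuming only A and B compete."""
    return posterior_odds / (1.0 + posterior_odds)

odds = 8.0  # "approximately 8 to 1", taken from the abstract
prob = odds_to_probability(odds)
log_odds = math.log(odds)  # natural log, the scale on which evidence is often quoted

print(f"posterior probability of the favored model: {prob:.3f}")
print(f"ln(odds): {log_odds:.2f}")
```

Under these assumptions the favored model has a posterior probability of roughly 0.89, which leaves about one chance in nine for the alternative and is in line with the abstract's statement that the preference is not yet strong or decisive.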
Klebsiella pneumoniae blaKPC-3 nosocomial epidemic: Bayesian and evolutionary analysis.
Angeletti, Silvia; Presti, Alessandra Lo; Cella, Eleonora; Fogolari, Marta; De Florio, Lucia; Dedej, Etleva; Blasi, Aletheia; Milano, Teresa; Pascarella, Stefano; Incalzi, Raffaele Antonelli; Coppola, Roberto; Dicuonzo, Giordano; Ciccozzi, Massimo
2016-12-01
K. pneumoniae isolates carrying the blaKPC-3 gene were collected to perform Bayesian phylogenetic and selective pressure analysis and to apply homology modeling to the KPC-3 protein. A dataset of 44 blaKPC-3 gene sequences from clinical isolates of K. pneumoniae was used for Bayesian phylogenetic analysis, selective pressure analysis and homology modeling. The mean evolutionary rate for the blaKPC-3 gene was 2.67×10^-3 substitutions/site/year (95% HPD: 3.4×10^-4 to 5.59×10^-3). The root of the Bayesian tree dated back to the year 2011 (95% HPD: 2007-2012). Two main clades (I and II) were identified. The population dynamics analysis showed exponential growth from 2011 to 2013, followed by a plateau. The phylogeographic reconstruction showed that the root of the tree had a probable common ancestor in the general surgery ward. Selective pressure analysis revealed twelve positively selected sites. Structural analysis of the KPC-3 protein predicted that the amino acid mutations are destabilizing for the protein and could alter the substrate specificity. Phylogenetic analysis and homology modeling of the blaKPC-3 gene could represent a useful tool to follow KPC spread in the nosocomial setting and to identify amino acid substitutions altering the substrate specificity.
Detecting Network Communities: An Application to Phylogenetic Analysis
Andrade, Roberto F. S.; Rocha-Neto, Ivan C.; Santos, Leonardo B. L.; de Santana, Charles N.; Diniz, Marcelo V. C.; Lobão, Thierry Petit; Goés-Neto, Aristóteles; Pinho, Suani T. R.; El-Hani, Charbel N.
2011-01-01
This paper proposes a new method to identify communities in generally weighted complex networks and applies it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be analyzed to recover phylogenetically useful information from its properties. The analyses discussed here are mainly based on the modular character of protein similarity networks, explored through the Newman-Girvan algorithm, with the help of the neighborhood matrix. The most relevant networks are found when the network topology changes abruptly, revealing distinct modules related to the sets of organisms to which the proteins belong. Sound biological information can be retrieved by the computational routines used in the network approach, without using biological assumptions other than those incorporated by BLAST. Usually, all the main bacterial phyla and, in some cases, also some bacterial classes corresponded totally (100%) or to a great extent (>70%) to the modules. We checked for internal consistency in the obtained results, and we scored close to 84% of matches for community pertinence when comparisons between the results were performed. To illustrate how to use the network-based method, we employed data for enzymes involved in the chitin metabolic pathway that are present in more than 100 organisms from an original data set containing 1,695 organisms, downloaded from GenBank on May 19, 2007. A preliminary comparison between the outcomes of the network-based method and the results of methods based on Bayesian, distance, likelihood, and parsimony criteria suggests that the former is as reliable as these commonly used methods. We conclude that the network-based method can be used as a powerful tool for retrieving modularity information from weighted networks, which is useful for phylogenetic analysis. PMID:21573202
A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research
van de Schoot, Rens; Kaplan, David; Denissen, Jaap; Asendorpf, Jens B; Neyer, Franz J; van Aken, Marcel AG
2014-01-01
Bayesian statistical methods are becoming ever more popular in applied and fundamental research. In this study a gentle introduction to Bayesian analysis is provided. It is shown under what circumstances it is attractive to use Bayesian estimation, and how to interpret properly the results. First, the ingredients underlying Bayesian methods are introduced using a simplified example. Thereafter, the advantages and pitfalls of the specification of prior knowledge are discussed. To illustrate Bayesian methods explained in this study, in a second example a series of studies that examine the theoretical framework of dynamic interactionism are considered. In the Discussion the advantages and disadvantages of using Bayesian statistics are reviewed, and guidelines on how to report on Bayesian statistics are provided. PMID:24116396
Phylogenetic analysis of cubilin (CUBN) gene
Shaik, Abjal Pasha; Alsaeed, Abbas H; Kiranmayee, S; Bammidi, VK; Sultana, Asma
2013-01-01
Cubilin (CUBN; also known as intrinsic factor-cobalamin receptor [Homo sapiens Entrez Pubmed ref NM_001081.3; NG_008967.1; GI: 119606627]), located in the epithelium of the intestine and kidney, acts as a receptor for intrinsic factor-vitamin B12 complexes. Mutations in CUBN may play a role in autosomal recessive megaloblastic anemia. The current study investigated the possible role of CUBN in evolution using phylogenetic testing. A total of 588 BLAST hits were found for the cubilin query sequence, and these hits showed a putative conserved domain, the CUB superfamily (as of 27th Nov 2012). A first-pass phylogenetic tree was constructed to identify the taxa which most often contained the CUBN sequences. Following this, we narrowed down the search by manually deleting sequences which were not CUBN. A repeat phylogenetic analysis of 25 taxa was performed using the PhyML, RAxML and TreeDyn software to confirm that CUBN is a conserved protein, emphasizing its importance as an extracellular domain present in proteins mostly known to be involved in development in many chordate taxa but not found in prokaryotes, plants and yeast. No horizontal gene transfers have been found between different taxa. PMID:23390341
Bayesian Nonparametric Models for Multiway Data Analysis.
Xu, Zenglin; Yan, Feng; Qi, Yuan
2015-02-01
Tensor decomposition is a powerful computational tool for multiway data analysis. Many popular tensor decomposition approaches-such as the Tucker decomposition and CANDECOMP/PARAFAC (CP)-amount to multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g., missing data and binary data), and (iii) noisy observations and outliers. To address these issues, we propose tensor-variate latent nonparametric Bayesian models for multiway data analysis. We name these models InfTucker. These new models essentially conduct Tucker decomposition in an infinite feature space. Unlike classical tensor decomposition models, our new approaches handle both continuous and binary data in a probabilistic framework. Unlike previous Bayesian models on matrices and tensors, our models are based on latent Gaussian or t processes with nonlinear covariance functions. Moreover, on network data, our models reduce to nonparametric stochastic blockmodels and can be used to discover latent groups and predict missing interactions. To learn the models efficiently from data, we develop a variational inference technique and explore properties of the Kronecker product for computational efficiency. Compared with a classical variational implementation, this technique reduces both time and space complexities by several orders of magnitude. On real multiway and network data, our new models achieved significantly higher prediction accuracy than state-of-art tensor decomposition methods and blockmodels.
Waddell, Peter J; Kishino, Hirohisa; Ota, Rissa
2002-01-01
Evolutionary trees sit at the core of all realistic models describing a set of related sequences, including alignment, homology search, ancestral protein reconstruction and 2D/3D structural change. It is important to assess the stochastic error when estimating a tree, including models using the most realistic likelihood-based optimizations, yet computation times may be many days or weeks. If so, the bootstrap is computationally prohibitive. Here we show that the extremely fast "resampling of estimated log likelihoods" or RELL method behaves well under more general circumstances than previously examined. RELL approximates the bootstrap proportions (BP) of trees better than some bootstrap methods that rely on fast heuristics to search the tree space. The BIC approximation of the Bayesian posterior probability (BPP) of trees is made more accurate by including an additional term related to the determinant of the information matrix (which may also be obtained as a product of gradient or score vectors). Such estimates are shown to be very close to MCMC chain values. Our analysis of mammalian mitochondrial amino acid sequences suggests that when model breakdown occurs, as it typically does for sequences separated by more than a few million years, the BPP values are far too peaked and the real fluctuations in the likelihood of the data are many times larger than expected. Accordingly, several ways to incorporate the bootstrap and other types of direct resampling with MCMC procedures are outlined. Genes evolve by a process which involves some sites following a tree close to, but not identical with, the species tree. It is seen that under such a likelihood model BP and BPP estimates may still be reasonable estimates of the species tree. Since many of the methods studied are very fast computationally, there is no reason to ignore stochastic error even with the slowest ML or likelihood-based methods.
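The general idea behind "resampling of estimated log likelihoods" can be sketched in a few lines: per-site log-likelihoods are computed once for each candidate tree, and bootstrap replicates are then formed by resampling those stored values rather than re-optimizing each tree. The sketch below is a minimal illustration of that idea only, not the authors' implementation; the function name `rell_proportions`, its parameters, and the toy numbers are all hypothetical.

```python
import random

def rell_proportions(site_loglik, n_replicates=1000, seed=0):
    """Approximate bootstrap proportions by resampling stored per-site log-likelihoods.

    site_loglik: one list per candidate tree, each holding that tree's
    log-likelihood at every alignment site (all lists the same length).
    """
    rng = random.Random(seed)
    n_trees = len(site_loglik)
    n_sites = len(site_loglik[0])
    wins = [0] * n_trees
    for _ in range(n_replicates):
        # Draw sites with replacement, as in an ordinary nonparametric bootstrap.
        idx = [rng.randrange(n_sites) for _ in range(n_sites)]
        # Sum the already-estimated site log-likelihoods; no re-optimization.
        totals = [sum(tree[i] for i in idx) for tree in site_loglik]
        best = max(range(n_trees), key=totals.__getitem__)
        wins[best] += 1
    return [w / n_replicates for w in wins]

# Toy data (made up): two candidate trees, four sites.
toy = [
    [-1.0, -1.2, -0.9, -1.1],  # tree 0
    [-1.5, -1.0, -1.4, -1.3],  # tree 1
]
props = rell_proportions(toy)
print(props)
```

Because each replicate only re-sums numbers that are already in memory, the cost per replicate is negligible compared with a full likelihood search, which is the speed advantage the abstract refers to.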
Tanner, Alastair R.; Fleming, James F.; Tarver, James E.; Pisani, Davide
2017-01-01
Morphological data provide the only means of classifying the majority of life's history, but the choice between competing phylogenetic methods for the analysis of morphology is unclear. Traditionally, parsimony methods have been favoured but recent studies have shown that these approaches are less accurate than the Bayesian implementation of the Mk model. Here we expand on these findings in several ways: we assess the impact of tree shape and maximum-likelihood estimation using the Mk model, as well as analysing data composed of both binary and multistate characters. We find that all methods struggle to correctly resolve deep clades within asymmetric trees, and when analysing small character matrices. The Bayesian Mk model is the most accurate method for estimating topology, but with lower resolution than other methods. Equal weights parsimony is more accurate than implied weights parsimony, and maximum-likelihood estimation using the Mk model is the least accurate method. We conclude that the Bayesian implementation of the Mk model should be the default method for phylogenetic estimation from phenotype datasets, and we explore the implications of our simulations in reanalysing several empirical morphological character matrices. A consequence of our finding is that high levels of resolution or the ability to classify species or groups with much confidence should not be expected when using small datasets. It is now necessary to depart from the traditional parsimony paradigms of constructing character matrices, towards datasets constructed explicitly for Bayesian methods. PMID:28077778
Bayesian analysis of multiple direct detection experiments
NASA Astrophysics Data System (ADS)
Arina, Chiara
2014-12-01
Bayesian methods offer a coherent and efficient framework for implementing uncertainties into induction problems. In this article, we review how this approach applies to the analysis of dark matter direct detection experiments. In particular we discuss the exclusion limit of XENON100 and the debated hints of detection under the hypothesis of a WIMP signal. Within parameter inference, marginalizing consistently over uncertainties to extract robust posterior probability distributions, we find that the claimed tension between XENON100 and the other experiments can be partially alleviated in an isospin-violating scenario, while the elastic scattering model appears to be compatible with the frequentist statistical approach. We then move to model comparison, for which Bayesian methods are particularly well suited. Firstly, we investigate the annual modulation seen in CoGeNT data, finding that there is weak evidence for a modulation. Modulation models due to other physics compare unfavorably with the WIMP models, paying the price for their excessive complexity. Secondly, we confront several coherent scattering models to determine the current best physical scenario compatible with the experimental hints. We find that exothermic and inelastic dark matter are moderately disfavored against the elastic scenario, while the isospin-violating model has similar evidence. Lastly, the Bayes factor gives inconclusive evidence for an incompatibility between the data sets of XENON100 and the hints of detection. The same question assessed with goodness of fit would indicate a 2σ discrepancy. This suggests that more data are needed to settle this question.
Bayesian Model Averaging for Propensity Score Analysis.
Kaplan, David; Chen, Jianshen
2014-01-01
This article considers Bayesian model averaging as a means of addressing uncertainty in the selection of variables in the propensity score equation. We investigate an approximate Bayesian model averaging approach based on the model-averaged propensity score estimates produced by the R package BMA but that ignores uncertainty in the propensity score. We also provide a fully Bayesian model averaging approach via Markov chain Monte Carlo sampling (MCMC) to account for uncertainty in both parameters and models. A detailed study of our approach examines the differences in the causal estimate when incorporating noninformative versus informative priors in the model averaging stage. We examine these approaches under common methods of propensity score implementation. In addition, we evaluate the impact of changing the size of Occam's window used to narrow down the range of possible models. We also assess the predictive performance of both Bayesian model averaging propensity score approaches and compare it with the case without Bayesian model averaging. Overall, results show that both Bayesian model averaging propensity score approaches recover the treatment effect estimates well and generally provide larger uncertainty estimates, as expected. Both Bayesian model averaging approaches offer slightly better prediction of the propensity score compared with the Bayesian approach with a single propensity score equation. Covariate balance checks for the case study show that both Bayesian model averaging approaches offer good balance. The fully Bayesian model averaging approach also provides posterior probability intervals of the balance indices.
Bayesian analysis of the solar neutrino anomaly
Bhat, C.M.
1998-02-01
We present an analysis of the recent solar neutrino data from the five experiments using a Bayesian approach. We extract quantitative and easily understandable information pertaining to the solar neutrino problem. The probability distributions for the individual neutrino fluxes and the discrepancy distribution for the B and Be fluxes, which include theoretical and experimental uncertainties, have been extracted. The analysis, carried out assuming that the neutrinos are unaltered during their passage from the sun to earth, clearly indicates that the observed PP flux is consistent with the 1995 standard solar model predictions of Bahcall and Pinsonneault within 2σ (standard deviations), whereas the ⁸B flux is down by more than 12σ and the ⁷Be flux is maximally suppressed. We also deduce the experimental survival probability for the solar neutrinos as a function of their energy in a model-independent way. We find that the shape of that distribution is in qualitative agreement with the MSW oscillation predictions.
Bayesian analysis on gravitational waves and exoplanets
NASA Astrophysics Data System (ADS)
Deng, Xihao
Attempts to detect gravitational waves using a pulsar timing array (PTA), i.e., a collection of pulsars in our Galaxy, have become more organized over the last several years. PTAs act to detect gravitational waves generated from very distant sources by observing the small and correlated effect the waves have on pulse arrival times at the Earth. In this thesis, I present advanced Bayesian analysis methods that can be used to search for gravitational waves in pulsar timing data. These methods were also applied to analyze a set of radial velocity (RV) data collected by the Hobby-Eberly Telescope on observing a K0 giant star. They confirmed the presence of two Jupiter mass planets around a K0 giant star and also characterized the stellar p-mode oscillation. The first part of the thesis investigates the effect of wavefront curvature on a pulsar's response to a gravitational wave. In it we show that we can assume the gravitational wave phasefront is planar across the array only if the source luminosity distance ≫ 2πL²/λ, where L is the pulsar distance to the Earth (~ kpc) and λ is the radiation wavelength (~ pc) in the PTA waveband. Correspondingly, for a point gravitational wave source closer than ~ 100 Mpc, we should take into account the effect of wavefront curvature across the pulsar-Earth line of sight, which depends on the luminosity distance to the source, when evaluating the pulsar timing response. As a consequence, if a PTA can detect a gravitational wave from a source closer than ~ 100 Mpc, the effects of wavefront curvature on the response allows us to determine the source luminosity distance. The second and third parts of the thesis propose a new analysis method based on Bayesian nonparametric regression to search for gravitational wave bursts and a gravitational wave background in PTA data. Unlike the conventional Bayesian analysis that introduces a signal model with a fixed number of parameters, Bayesian nonparametric regression sets
Bayesian Logical Data Analysis for the Physical Sciences
NASA Astrophysics Data System (ADS)
Gregory, Phil
2010-05-01
Preface; Acknowledgements; 1. Role of probability theory in science; 2. Probability theory as extended logic; 3. The how-to of Bayesian inference; 4. Assigning probabilities; 5. Frequentist statistical inference; 6. What is a statistic?; 7. Frequentist hypothesis testing; 8. Maximum entropy probabilities; 9. Bayesian inference (Gaussian errors); 10. Linear model fitting (Gaussian errors); 11. Nonlinear model fitting; 12. Markov Chain Monte Carlo; 13. Bayesian spectral analysis; 14. Bayesian inference (Poisson sampling); Appendix A. Singular value decomposition; Appendix B. Discrete Fourier transforms; Appendix C. Difference in two samples; Appendix D. Poisson ON/OFF details; Appendix E. Multivariate Gaussian from maximum entropy; References; Index.
Optimal sequential Bayesian analysis for degradation tests.
Rodríguez-Narciso, Silvia; Christen, J Andrés
2016-07-01
Degradation tests are especially difficult to conduct for items with high reliability. Test costs, caused mainly by prolonged item duration and item destruction costs, establish the necessity of sequential degradation test designs. We propose a methodology that sequentially selects the optimal observation times to measure the degradation, using a convenient rule that maximizes the inference precision and minimizes test costs. In particular our objective is to estimate a quantile of the time to failure distribution, where the degradation process is modelled as a linear model using Bayesian inference. The proposed sequential analysis is based on an index that measures the expected discrepancy between the estimated quantile and its corresponding prediction, using Monte Carlo methods. The procedure was successfully implemented for simulated and real data.
A Distance Measure for Genome Phylogenetic Analysis
NASA Astrophysics Data System (ADS)
Cao, Minh Duc; Allison, Lloyd; Dix, Trevor
Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirements. Another class of tree construction methods, namely distance-based methods, requires a measure of distance between any two genomes. Some measures, such as evolutionary edit distance of gene order and gene content, are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
Yamanoue, Yusuke; Miya, Masaki; Matsuura, Keiichi; Yagishita, Naoki; Mabuchi, Kohji; Sakai, Harumi; Katoh, Masaya; Nishida, Mutsumi
2007-10-01
Tetraodontiformes includes approximately 350 species assigned to nine families, sharing several reduced morphological features of higher teleosts. The order has been accepted as a monophyletic group by many authors, although several alternative hypotheses exist regarding its phylogenetic position within the higher teleosts. To date, acanthuroids, zeiforms, and lophiiforms have been proposed as sister-groups of the tetraodontiforms. The monophyly and sister-group status was investigated using whole mitochondrial genome (mitogenome) sequences from 44 purposefully-chosen species (26 sequences newly-determined during the study) that fully represent the major tetraodontiform lineages plus all the groups that have been hypothesized as being close relatives. Partitioned Bayesian analyses were conducted with the three datasets that comprised concatenated nucleotide sequences from 13 protein-coding genes (with and without, or with RY-coding, 3rd codon positions), plus 22 transfer RNA and two ribosomal RNA genes. The resultant trees were well resolved and largely congruent, with most internal branches being supported by high posterior probabilities. Mitogenomic data strongly supported the monophyly of tetraodontiform fishes, placing them as a sister-group of either Lophiiformes plus Caproidei or Caproidei only. The sister-group relationship between Acanthuroidei and Tetraodontiformes was statistically rejected using Bayes factors. These results were confirmed by a reanalysis of the previously published nuclear RAG1 gene sequences using the Bayesian method. Within the Tetraodontiformes, however, monophylies of the three superfamilies were not recovered and further taxonomic sampling and subsequent efforts should clarify these relationships.
Bayesian analysis. II. Signal detection and model selection
NASA Astrophysics Data System (ADS)
Bretthorst, G. Larry
In the preceding paper, Bayesian analysis was applied to the parameter estimation problem, given quadrature NMR data. Here Bayesian analysis is extended to the problem of selecting the model which is most probable in view of the data and all the prior information. In addition to the analytic calculation, two examples are given. The first example demonstrates how to use Bayesian probability theory to detect small signals in noise. The second example uses Bayesian probability theory to compute the probability of the number of decaying exponentials in simulated T1 data. The Bayesian answer to this question is essentially a microcosm of the scientific method and a quantitative statement of Ockham's razor: theorize about possible models, compare these to experiment, and select the simplest model that "best" fits the data.
Bayesian data analysis in population ecology: motivations, methods, and benefits
Dorazio, Robert
2016-01-01
During the 20th century ecologists largely relied on the frequentist system of inference for the analysis of their data. However, in the past few decades ecologists have become increasingly interested in the use of Bayesian methods of data analysis. In this article I provide guidance to ecologists who would like to decide whether Bayesian methods can be used to improve their conclusions and predictions. I begin by providing a concise summary of Bayesian methods of analysis, including a comparison of differences between Bayesian and frequentist approaches to inference when using hierarchical models. Next I provide a list of problems where Bayesian methods of analysis may arguably be preferred over frequentist methods. These problems are usually encountered in analyses based on hierarchical models of data. I describe the essentials required for applying modern methods of Bayesian computation, and I use real-world examples to illustrate these methods. I conclude by summarizing what I perceive to be the main strengths and weaknesses of using Bayesian methods to solve ecological inference problems.
Ockham's razor and Bayesian analysis. [statistical theory for systems evaluation
NASA Technical Reports Server (NTRS)
Jefferys, William H.; Berger, James O.
1992-01-01
'Ockham's razor', the ad hoc principle enjoining the greatest possible simplicity in theoretical explanations, is presently shown to be justifiable as a consequence of Bayesian inference; Bayesian analysis can, moreover, clarify the nature of the 'simplest' hypothesis consistent with the given data. By choosing the prior probabilities of hypotheses, it becomes possible to quantify the scientific judgment that simpler hypotheses are more likely to be correct. Bayesian analysis also shows that a hypothesis with fewer adjustable parameters intrinsically possesses an enhanced posterior probability, due to the clarity of its predictions.
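The claim that a hypothesis with fewer adjustable parameters gains posterior support can be illustrated with a toy coin-flip comparison. This example is not taken from the paper; it is a minimal sketch that assumes a "simple" model with no free parameter (a fair coin) and a "flexible" model whose bias has a uniform prior on [0, 1]. All function names are hypothetical.

```python
from math import factorial

def evidence_fair(k, n):
    """Probability of one specific sequence with k heads in n flips under a fair coin."""
    return 0.5 ** n

def evidence_free_bias(k, n):
    """Marginal likelihood of the same sequence when the bias theta is uniform on [0, 1].

    The integral of theta^k * (1 - theta)^(n - k) over [0, 1] equals k! (n - k)! / (n + 1)!.
    """
    return factorial(k) * factorial(n - k) / factorial(n + 1)

def bayes_factor_simple_vs_flexible(k, n):
    """Ratio of evidences; values above 1 favor the parameter-free model."""
    return evidence_fair(k, n) / evidence_free_bias(k, n)

bf_balanced = bayes_factor_simple_vs_flexible(5, 10)  # data the simple model predicts well
bf_skewed = bayes_factor_simple_vs_flexible(9, 10)    # data the simple model predicts poorly
print(f"5 heads in 10 flips: Bayes factor = {bf_balanced:.2f}")
print(f"9 heads in 10 flips: Bayes factor = {bf_skewed:.2f}")
```

With 5 heads in 10 flips the Bayes factor is about 2.7 in favor of the fair coin, even though the flexible model can fit the data equally well at its best parameter value: it is penalized for spreading its predictions over many outcomes. With 9 heads in 10 flips the factor drops to about 0.11, so the flexible model wins once the sharp prediction of the simple model fails.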
Enhancing the Modeling of PFOA Pharmacokinetics with Bayesian Analysis
The detail sufficient to describe the pharmacokinetics (PK) for perfluorooctanoic acid (PFOA) and the methods necessary to combine information from multiple data sets are both subjects of ongoing investigation. Bayesian analysis provides tools to accommodate these goals. We exa...
Bayesian Analysis of the Cosmic Microwave Background
NASA Technical Reports Server (NTRS)
Jewell, Jeffrey
2007-01-01
There is a wealth of cosmological information encoded in the spatial power spectrum of temperature anisotropies of the cosmic microwave background! Experiments designed to map the microwave sky are returning a flood of data (time streams of instrument response as a beam is swept over the sky) at several different frequencies (from 30 to 900 GHz), all with different resolutions and noise properties. The resulting analysis challenge is to estimate, and quantify our uncertainty in, the spatial power spectrum of the cosmic microwave background given the complexities of "missing data", foreground emission, and complicated instrumental noise. Bayesian formulation of this problem allows consistent treatment of many complexities including complicated instrumental noise and foregrounds, and can be numerically implemented with Gibbs sampling. Gibbs sampling has now been validated as an efficient, statistically exact, and practically useful method for low-resolution analysis (as demonstrated on WMAP 1 and 3 year temperature and polarization data). Development continues for Planck, where the goal is to exploit the unique capabilities of Gibbs sampling to directly propagate uncertainties in both foreground and instrument models to total uncertainty in cosmological parameters.
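The alternating structure of Gibbs sampling mentioned above can be sketched on a one-parameter toy problem. This is an illustrative assumption, not the pipeline described in the report: each datum is modelled as signal plus noise, d_i = s_i + n_i, with s_i ~ N(0, C), known noise variance, and a 1/C prior on the signal variance C (standing in for a single power spectrum coefficient). The sampler alternates between drawing the signal given C and drawing C given the signal.

```python
import random


def gibbs_signal_variance(data, noise_var, n_iter=2000, seed=0):
    """Toy Gibbs sampler for the signal variance C in d_i = s_i + n_i.

    Hypothetical model: s_i ~ N(0, C), n_i ~ N(0, noise_var), prior p(C) ~ 1/C.
    Returns the list of sampled C values, one per iteration.
    """
    rng = random.Random(seed)
    m = len(data)
    c = sum(d * d for d in data) / m  # crude positive starting value
    draws = []
    for _ in range(n_iter):
        # Step 1: s_i | C, d_i is Gaussian (Wiener-filter mean plus fluctuation).
        weight = c / (c + noise_var)
        cond_sd = (weight * noise_var) ** 0.5
        signal = [rng.gauss(weight * d, cond_sd) for d in data]
        # Step 2: C | s is inverse-gamma with shape m/2 and scale sum(s^2)/2.
        scale = 0.5 * sum(s * s for s in signal)
        c = scale / rng.gammavariate(0.5 * m, 1.0)
        draws.append(c)
    return draws


if __name__ == "__main__":
    gen = random.Random(1)
    # Simulated data with true signal variance 4 and noise variance 1.
    data = [gen.gauss(0.0, 2.0) + gen.gauss(0.0, 1.0) for _ in range(300)]
    draws = gibbs_signal_variance(data, noise_var=1.0, n_iter=1000, seed=2)
    kept = draws[200:]  # discard burn-in
    print(sum(kept) / len(kept))
```

The spread of the retained draws is a direct estimate of the uncertainty in C, which is the property the abstract highlights for propagating uncertainties into downstream parameters.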
Bayesian analysis of the backreaction models
Kurek, Aleksandra; Bolejko, Krzysztof; Szydlowski, Marek
2010-03-15
We present a Bayesian analysis of four different types of backreaction models, which are based on the Buchert equations. In this approach, one considers a solution to the Einstein equations for a general matter distribution and then an average of various observable quantities is taken. Such an approach became of considerable interest when it was shown that it could lead to agreement with observations without resorting to dark energy. In this paper we compare the ΛCDM model and the backreaction models with type Ia supernovae, baryon acoustic oscillations, and cosmic microwave background data, and find that the former is favored. However, the tested models were based on some particular assumptions about the relation between the average spatial curvature and the backreaction, as well as the relation between the curvature and curvature index. In this paper we modified the latter assumption, leaving the former unchanged. We find that, by varying the relation between the curvature and curvature index, we can obtain a better fit. Therefore, some further work is still needed; in particular, the relation between the backreaction and the curvature should be revisited in order to fully determine the feasibility of the backreaction models to mimic dark energy.
Asymptotic analysis of Bayesian generalization error with Newton diagram.
Yamazaki, Keisuke; Aoyagi, Miki; Watanabe, Sumio
2010-01-01
Statistical learning machines that have singularities in the parameter space, such as hidden Markov models, Bayesian networks, and neural networks, are widely used in the field of information engineering. Singularities in the parameter space determine the accuracy of estimation in the Bayesian scenario. The Newton diagram in algebraic geometry is recognized as an effective method by which to investigate a singularity. The present paper proposes a new technique for applying the diagram to Bayesian analysis. The proposed technique allows the generalization error to be clarified and provides a foundation for efficient model selection. We apply the proposed technique to mixtures of binomial distributions.
Bayesian analysis of a disability model for lung cancer survival.
Armero, C; Cabras, S; Castellanos, M E; Perra, S; Quirós, A; Oruezábal, M J; Sánchez-Rubio, J
2016-02-01
Bayesian reasoning, survival analysis and multi-state models are used to assess survival times for Stage IV non-small-cell lung cancer patients and the evolution of the disease over time. Bayesian estimation is done using minimum informative priors for the Weibull regression survival model, leading to an automatic inferential procedure. Markov chain Monte Carlo methods have been used for approximating posterior distributions and the Bayesian information criterion has been considered for covariate selection. In particular, the posterior distribution of the transition probabilities, resulting from the multi-state model, constitutes a very interesting tool which could be useful to help oncologists and patients make efficient and effective decisions.
Bayesian Analysis of Perceived Eye Level
Orendorff, Elaine E.; Kalesinskas, Laurynas; Palumbo, Robert T.; Albert, Mark V.
2016-01-01
To accurately perceive the world, people must efficiently combine internal beliefs and external sensory cues. We introduce a Bayesian framework that explains the role of internal balance cues and visual stimuli on perceived eye level (PEL)—a self-reported measure of elevation angle. This framework provides a single, coherent model explaining a set of experimentally observed PEL over a range of experimental conditions. Further, it provides a parsimonious explanation for the additive effect of low fidelity cues as well as the averaging effect of high fidelity cues, as also found in other Bayesian cue combination psychophysical studies. Our model accurately estimates the PEL and explains the form of previous equations used in describing PEL behavior. Most importantly, the proposed Bayesian framework for PEL is more powerful than previous behavioral modeling; it permits behavioral estimation in a wider range of cue combination and perceptual studies than models previously reported. PMID:28018204
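Bayesian cue combination of the kind referred to above is commonly formulated as precision-weighted averaging of Gaussian sources. The sketch below shows that standard textbook rule, not the authors' PEL model; the function name and the example numbers (a prior centred on 0 degrees and a visual cue at 10 degrees) are assumptions for illustration.

```python
def combine_gaussian_cues(means, variances):
    """Fuse independent Gaussian cues (or a prior and cues) about one quantity.

    Each source is weighted by its precision (1 / variance). Returns the
    posterior mean and posterior variance of the combined estimate.
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    mean = sum(m * p for m, p in zip(means, precisions)) / total_precision
    return mean, 1.0 / total_precision


if __name__ == "__main__":
    # Hypothetical values: broad internal prior at 0 deg, sharper visual cue at 10 deg.
    mean, variance = combine_gaussian_cues([0.0, 10.0], [25.0, 4.0])
    print(mean, variance)
```

The combined estimate sits closer to the more reliable source, and its variance is smaller than that of either source alone, which is the averaging behaviour expected for high-fidelity cues.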
Bayesian analysis of MEG visual evoked responses
NASA Astrophysics Data System (ADS)
Schmidt, David M.; George, John S.; Wood, C. C.
1999-05-01
We have developed a method for analyzing neural electromagnetic data that allows probabilistic inferences to be drawn about regions of activation. The method involves the generation of a large number of possible solutions which both fit the data and prior expectations about the nature of probable solutions made explicit by a Bayesian formalism. In addition, we have introduced a model for the current distributions that produce MEG (and EEG) data that allows extended regions of activity, and can easily incorporate prior information such as anatomical constraints from MRI. To evaluate the feasibility and utility of the Bayesian approach with actual data, we analyzed MEG data from a visual evoked response experiment. We compared Bayesian analyses of MEG responses to visual stimuli in the left and right visual fields, in order to examine the sensitivity of the method to detect known features of human visual cortex organization. We also examined the changing pattern of cortical activation as a function of time.
Bayesian analysis of MEG visual evoked responses
Schmidt, D.M.; George, J.S.; Wood, C.C.
1999-04-01
The authors developed a method for analyzing neural electromagnetic data that allows probabilistic inferences to be drawn about regions of activation. The method involves the generation of a large number of possible solutions which both fit the data and prior expectations about the nature of probable solutions made explicit by a Bayesian formalism. In addition, they have introduced a model for the current distributions that produce MEG (and EEG) data that allows extended regions of activity, and can easily incorporate prior information such as anatomical constraints from MRI. To evaluate the feasibility and utility of the Bayesian approach with actual data, they analyzed MEG data from a visual evoked response experiment. They compared Bayesian analyses of MEG responses to visual stimuli in the left and right visual fields, in order to examine the sensitivity of the method to detect known features of human visual cortex organization. They also examined the changing pattern of cortical activation as a function of time.
Open Reading Frame Phylogenetic Analysis on the Cloud
2013-01-01
Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus. PMID:23671843
Dediu, Dan
2011-02-07
Language is a hallmark of our species and understanding linguistic diversity is an area of major interest. Genetic factors influencing the cultural transmission of language provide a powerful and elegant explanation for aspects of the present day linguistic diversity and a window into the emergence and evolution of language. In particular, it has recently been proposed that linguistic tone-the usage of voice pitch to convey lexical and grammatical meaning-is biased by two genes involved in brain growth and development, ASPM and Microcephalin. This hypothesis predicts that tone is a stable characteristic of language because of its 'genetic anchoring'. The present paper tests this prediction using a Bayesian phylogenetic framework applied to a large set of linguistic features and language families, using multiple software implementations, data codings, stability estimations, linguistic classifications and outgroup choices. The results of these different methods and datasets show a large agreement, suggesting that this approach produces reliable estimates of the stability of linguistic data. Moreover, linguistic tone is found to be stable across methods and datasets, providing suggestive support for the hypothesis of genetic influences on its distribution.
Veeramah, Krishna R.; Woerner, August E.; Johnstone, Laurel; Gut, Ivo; Gut, Marta; Marques-Bonet, Tomas; Carbone, Lucia; Wall, Jeff D.; Hammer, Michael F.
2015-01-01
Gibbons are believed to have diverged from the larger great apes ∼16.8 MYA and today reside in the rainforests of Southeast Asia. Based on their diploid chromosome number, the family Hylobatidae is divided into four genera, Nomascus, Symphalangus, Hoolock, and Hylobates. Genetic studies attempting to elucidate the phylogenetic relationships among gibbons using karyotypes, mitochondrial DNA (mtDNA), the Y chromosome, and short autosomal sequences have been inconclusive. To examine the relationships among gibbon genera in more depth, we performed second-generation whole genome sequencing (WGS) to a mean of ∼15× coverage in two individuals from each genus. We developed a coalescent-based approximate Bayesian computation (ABC) method incorporating a model of sequencing error generated by high coverage exome validation to infer the branching order, divergence times, and effective population sizes of gibbon taxa. Although Hoolock and Symphalangus are likely sister taxa, we could not confidently resolve a single bifurcating tree despite the large amount of data analyzed. Instead, our results support the hypothesis that all four gibbon genera diverged at approximately the same time. Assuming an autosomal mutation rate of 1 × 10^-9/site/year, this speciation process occurred ∼5 MYA during a period in the Early Pliocene characterized by climatic shifts and fragmentation of the Sunda shelf forests. Whole genome sequencing of additional individuals will be vital for inferring the extent of gene flow among species after the separation of the gibbon genera. PMID:25769979
Veeramah, Krishna R; Woerner, August E; Johnstone, Laurel; Gut, Ivo; Gut, Marta; Marques-Bonet, Tomas; Carbone, Lucia; Wall, Jeff D; Hammer, Michael F
2015-05-01
Gibbons are believed to have diverged from the larger great apes ∼16.8 MYA and today reside in the rainforests of Southeast Asia. Based on their diploid chromosome number, the family Hylobatidae is divided into four genera, Nomascus, Symphalangus, Hoolock, and Hylobates. Genetic studies attempting to elucidate the phylogenetic relationships among gibbons using karyotypes, mitochondrial DNA (mtDNA), the Y chromosome, and short autosomal sequences have been inconclusive. To examine the relationships among gibbon genera in more depth, we performed second-generation whole genome sequencing (WGS) to a mean of ∼15× coverage in two individuals from each genus. We developed a coalescent-based approximate Bayesian computation (ABC) method incorporating a model of sequencing error generated by high coverage exome validation to infer the branching order, divergence times, and effective population sizes of gibbon taxa. Although Hoolock and Symphalangus are likely sister taxa, we could not confidently resolve a single bifurcating tree despite the large amount of data analyzed. Instead, our results support the hypothesis that all four gibbon genera diverged at approximately the same time. Assuming an autosomal mutation rate of 1 × 10^-9/site/year, this speciation process occurred ∼5 MYA during a period in the Early Pliocene characterized by climatic shifts and fragmentation of the Sunda shelf forests. Whole genome sequencing of additional individuals will be vital for inferring the extent of gene flow among species after the separation of the gibbon genera.
[Phylogenetic analysis of bacteria of extreme ecosystems].
Romanovskaia, V A; Parfenova, V V; Bel'kova, N L; Sukhanova, E V; Gladka, G V; Tashireva, A A
2014-01-01
Phylogenetic analysis of aerobic chemoorganotrophic bacteria from two extreme regions (the Dead Sea and West Antarctic) was performed on the basis of the nucleotide sequences of the 16S rRNA gene. The thermotolerant and halotolerant spore-forming bacteria 7t1 and 7t3 from terrestrial ecosystems of the Dead Sea were identified as Bacillus licheniformis and B. subtilis subsp. subtilis, respectively. Because the thermotolerant strain 6t1 lies far from closely related strains within the Staphylococcus cluster, it can be regarded as Staphylococcus sp. Taxonomically diverse psychrotolerant bacteria were detected in the terrestrial ecosystems of Galindez Island (Antarctic). Micrococcus luteus O-1 and Microbacterium trichothecenolyticum O-3 were isolated from ornithogenic soil. Strains 4r5, 5r5 and 40r5, isolated from grass and lichens, can be assigned to the genus Frondihabitans. These strains are taxonomically and ecologically isolated, and on the tree diagram they form a joint cluster with three Frondihabitans sp. isolates from lichen of the Austrian Alps and with the psychrotolerant, plant-associated F. cladoniiphilus CafT13(T). Isolates from black lichen at different stationary observation points on the south side of a vertical cliff were identified as Rhodococcus fascians 181n3, Sporosarcina aquimarina O-7 and Staphylococcus sp. 0-10. Arthrobacter sp. 28r5g1 was isolated from an orange fouling biofilm on top of the vertical cliff, and Serratia sp. 6r1g from moss. According to the results, Frondihabitans strains are the most frequently encountered chemoorganotrophic aerobic bacteria in the Antarctic phytocenoses.
Dealing with Reflection Invariance in Bayesian Factor Analysis.
Erosheva, Elena A; Curtis, S McKay
2017-03-13
This paper considers the reflection unidentifiability problem in confirmatory factor analysis (CFA) and the associated implications for Bayesian estimation. We note a direct analogy between the multimodality in CFA models that is due to all possible column sign changes in the matrix of loadings and the multimodality in finite mixture models that is due to all possible relabelings of the mixture components. Drawing on this analogy, we derive and present a simple approach for dealing with reflection invariance in Bayesian factor analysis. We recommend fitting Bayesian factor analysis models without rotational constraints on the loadings (allowing Markov chain Monte Carlo algorithms to explore the full posterior distribution) and then using a relabeling algorithm to pick a factor solution that corresponds to one mode. We demonstrate our approach on the case of a bifactor model; however, the relabeling algorithm is straightforward to generalize for handling multimodalities due to sign invariance in the likelihood in other factor analysis models.
Wang, Wei; Xia, Minxuan; Chen, Jie; Deng, Fenni; Yuan, Rui; Zhang, Xiaopei; Shen, Fafu
2016-12-01
The data presented in this paper support the research article "Genome-Wide Analysis of Superoxide Dismutase Gene Family in Gossypium raimondii and G. arboreum" [1]. In this data article, we present a phylogenetic tree showing dichotomy with two different clusters of SODs inferred by the Bayesian method of MrBayes (version 3.2.4), "Bayesian phylogenetic inference under mixed models" [2], Ramachandran plots of G. raimondii and G. arboreum SODs, the protein sequence used to generate 3D structure of proteins and the template accession via SWISS-MODEL server, "SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information." [3] and motif sequences of SODs identified by InterProScan (version 4.8) with the Pfam database, "Pfam: the protein families database" [4].
Bayesian linkage and segregation analysis: factoring the problem.
Matthysse, S
2000-01-01
Complex segregation analysis and linkage methods are mathematical techniques for the genetic dissection of complex diseases. They are used to delineate complex modes of familial transmission and to localize putative disease susceptibility loci to specific chromosomal locations. The computational problem of Bayesian linkage and segregation analysis is one of integration in high-dimensional spaces. In this paper, three available techniques for Bayesian linkage and segregation analysis are discussed: Markov Chain Monte Carlo (MCMC), importance sampling, and exact calculation. The contribution of each to the overall integration will be explicitly discussed.
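The contrast between importance sampling and exact calculation as ways of evaluating a Bayesian integral can be shown on a deliberately small problem. The sketch below is not a linkage or segregation model; it assumes a binomial likelihood with a uniform prior (where the integral is known to equal 1 / (n + 1)) and an arbitrarily chosen Beta(5, 3) proposal, all of which are illustrative choices.

```python
import math
import random


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def evidence_exact(n):
    """Exact integral of a binomial likelihood over a uniform prior on [0, 1]."""
    return 1.0 / (n + 1)


def evidence_importance_sampling(k, n, a=5.0, b=3.0, n_draws=20000, seed=0):
    """Estimate the same integral by averaging likelihood * prior / proposal.

    Draws come from a Beta(a, b) proposal (an assumed choice); the uniform
    prior density is 1 everywhere on [0, 1].
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        theta = rng.betavariate(a, b)
        likelihood = math.comb(n, k) * theta**k * (1 - theta) ** (n - k)
        total += likelihood / beta_pdf(theta, a, b)
    return total / n_draws


if __name__ == "__main__":
    print(evidence_exact(10))
    print(evidence_importance_sampling(7, 10))
```

In one dimension the exact answer is trivially available; the point of the sampling estimate is that the same recipe still applies when the integral has too many dimensions for exact calculation.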
TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics
Jobb, Gangolf; von Haeseler, Arndt; Strimmer, Korbinian
2004-01-01
Background: Most analysis programs for inferring molecular phylogenies are difficult to use, in particular for researchers with little programming experience. Results: TREEFINDER is an easy-to-use integrative platform-independent analysis environment for molecular phylogenetics. In this paper the main features of TREEFINDER (version of April 2004) are described. TREEFINDER is written in ANSI C and Java and implements powerful statistical approaches for inferring gene trees and related analyses. In addition, it provides a user-friendly graphical interface and a phylogenetic programming language. Conclusions: TREEFINDER is a versatile framework for analyzing phylogenetic data across different platforms that is suited both for exploratory as well as advanced studies. PMID:15222900
A phylogenetic analysis of the phylum Fibrobacteres.
Jewell, Kelsea A; Scott, Jarrod J; Adams, Sandra M; Suen, Garret
2013-09-01
Members of the phylum Fibrobacteres are highly efficient cellulolytic bacteria, best known for their role in rumen function and as potential sources of novel enzymes for bioenergy applications. Despite being key members of ruminants and other digestive microbial communities, our knowledge of this phylum remains incomplete, as much of our understanding is focused on two recognized species, Fibrobacter succinogenes and F. intestinalis. As a result, we lack insights regarding the environmental niche, host range, and phylogenetic organization of this phylum. Here, we analyzed over 1000 16S rRNA Fibrobacteres sequences available from public databases to establish a phylogenetic framework for this phylum. We identify both species- and genus-level clades that are suggestive of previously unknown taxonomic relationships between Fibrobacteres in addition to their putative lifestyles as host-associated or free-living. Our results shed light on this poorly understood phylum and will be useful for elucidating the function, distribution, and diversity of these bacteria in their niches.
Jiang, Xianhuan; Gao, Jun; Ni, Liju; Hu, Jianhua; Li, Kai; Sun, Fengping; Xie, Jianyun; Bo, Xiong; Gao, Chen; Xiao, Junhua; Zhou, Yuxun
2012-05-01
Microtus fortis is a special rodent resource in China. It is a promising experimental animal model for studying the mechanism of resistance to Schistosoma japonicum. The first complete mitochondrial genome sequence for Microtus fortis calamorum, a subspecies of M. fortis (Arvicolinae, Rodentia), is reported in this study. The mitochondrial genome sequence of M. f. calamorum (Genbank: JF261175) showed a typical vertebrate pattern with 13 protein coding genes, 2 ribosomal RNAs, 22 transfer RNAs and one major noncoding region (CR region). The extended termination associated sequences (ETAS-1 and ETAS-2) and conserved sequence block 1 (CSB-1) were found in the CR region. The putative origin of replication for the light strand (O(L)) of M. f. calamorum was 35 bp long and showed high conservation in the stem and adjacent sequences, but differences existed in the loop region among three species of the genus Microtus. In order to investigate the phylogenetic position of M. f. calamorum, phylogenetic trees (maximum likelihood and Bayesian methods) were constructed based on 12 protein-coding genes (all except the ND6 gene) on the H strand from 16 rodent species. M. f. calamorum was classified into the genus Microtus, Arvicolinae, because of its close phylogenetic relationship with Microtus kikuchii (Taiwan vole). Further phylogenetic analysis based on the cytochrome b gene placed M. f. calamorum among the subspecies of M. fortis, which formed a sister group of Microtus middendorfii in the genus Microtus.
Aberer, Andre J; Stamatakis, Alexandros; Ronquist, Fredrik
2016-01-01
Sampling tree space is the most challenging aspect of Bayesian phylogenetic inference. The sheer number of alternative topologies is problematic by itself. In addition, the complex dependency between branch lengths and topology increases the difficulty of moving efficiently among topologies. Current tree proposals are fast but sample new trees using primitive transformations or re-mappings of old branch lengths. This reduces acceptance rates and presumably slows down convergence and mixing. Here, we explore branch proposals that do not rely on old branch lengths but instead are based on approximations of the conditional posterior. Using a diverse set of empirical data sets, we show that most conditional branch posteriors can be accurately approximated via a [Formula: see text] distribution. We empirically determine the relationship between the logarithmic conditional posterior density, its derivatives, and the characteristics of the branch posterior. We use these relationships to derive an independence sampler for proposing branches with an acceptance ratio of ~90% on most data sets. This proposal samples branches between 2× and 3× more efficiently than traditional proposals with respect to the effective sample size per unit of runtime. We also compare the performance of standard topology proposals with hybrid proposals that use the new independence sampler to update those branches that are most affected by the topological change. Our results show that hybrid proposals can sometimes noticeably decrease the number of generations necessary for topological convergence. Inconsistent performance gains indicate that branch updates are not the limiting factor in improving topological convergence for the currently employed set of proposals. However, our independence sampler might be essential for the construction of novel tree proposals that apply more radical topology changes.
Steinfartz, Sebastian; Vicario, Saverio; Arntzen, J W; Caccone, Adalgisa
2007-03-15
The monophyly of European newts of the genus Triturus within the family Salamandridae has for decades rested on presumably homologous behavioral and morphological characters. Molecular data challenge this hypothesis, but the phylogenetic position of Triturus within the Salamandridae has not yet been convincingly resolved. We addressed this issue and the temporal divergence of Triturus within the Salamandridae with novel Bayesian approaches applied to DNA sequence data from three mitochondrial genes (12S, 16S and cytb). We included 38 salamandrid species comprising all 13 recognized species of Triturus and 16 out of 17 salamandrid genera. A clade comprising all the "Newts" can be separated from the "True Salamanders" and Salamandrina clades. Within the "Newts" well-supported clades are: Tylototriton-Pleurodeles, the "New World Newts" (Notophthalmus-Taricha), and the "Modern Eurasian Newts" (Cynops, Pachytriton, Paramesotriton=together the "Modern Asian Newts", Calotriton, Euproctus, Neurergus and Triturus species). We found that Triturus is a non-monophyletic species assemblage, which includes four groups that are themselves monophyletic: (i) the "Large-Bodied Triturus" (six species), (ii) the "Small-Bodied Triturus" (five species), (iii) T. alpestris and (iv) T. vittatus. We estimated that the last common ancestor of Triturus existed around 64 million years ago (mya) while the root of the Salamandridae dates back to 95 mya. This was estimated using a fossil-based molecular dating approach and an explicit framework to select calibration points that least underestimated their corresponding nodes. Using the molecular phylogeny we mapped the evolution of life history and courtship traits in Triturus and found that several Triturus-specific courtship traits evolved independently.
Spatiotemporal Bayesian inference dipole analysis for MEG neuroimaging data.
Jun, Sung C; George, John S; Paré-Blagoev, Juliana; Plis, Sergey M; Ranken, Doug M; Schmidt, David M; Wood, C C
2005-10-15
Recently, we described a Bayesian inference approach to the MEG/EEG inverse problem that used numerical techniques to estimate the full posterior probability distributions of likely solutions upon which all inferences were based [Schmidt, D.M., George, J.S., Wood, C.C., 1999. Bayesian inference applied to the electromagnetic inverse problem. Human Brain Mapping 7, 195; Schmidt, D.M., George, J.S., Ranken, D.M., Wood, C.C., 2001. Spatial-temporal bayesian inference for MEG/EEG. In: Nenonen, J., Ilmoniemi, R. J., Katila, T. (Eds.), Biomag 2000: 12th International Conference on Biomagnetism. Espoo, Norway, p. 671]. Schmidt et al. (1999) focused on the analysis of data at a single point in time employing an extended region source model. They subsequently extended their work to a spatiotemporal Bayesian inference analysis of the full spatiotemporal MEG/EEG data set. Here, we formulate spatiotemporal Bayesian inference analysis using a multi-dipole model of neural activity. This approach is faster than the extended region model, does not require use of the subject's anatomical information, does not require prior determination of the number of dipoles, and yields quantitative probabilistic inferences. In addition, we have incorporated the ability to handle much more complex and realistic estimates of the background noise, which may be represented as a sum of Kronecker products of temporal and spatial noise covariance components. This reduces the effects of undermodeling noise. In order to reduce the rigidity of the multi-dipole formulation which commonly causes problems due to multiple local minima, we treat the given covariance of the background as uncertain and marginalize over it in the analysis. Markov Chain Monte Carlo (MCMC) was used to sample the many possible likely solutions. The spatiotemporal Bayesian dipole analysis is demonstrated using simulated and empirical whole-head MEG data.
Alfaro, Michael E; Zoller, Stefan; Lutzoni, François
2003-02-01
Bayesian Markov chain Monte Carlo sampling has become increasingly popular in phylogenetics as a method for both estimating the maximum likelihood topology and for assessing nodal confidence. Despite the growing use of posterior probabilities, the relationship between the Bayesian measure of confidence and the most commonly used confidence measure in phylogenetics, the nonparametric bootstrap proportion, is poorly understood. We used computer simulation to investigate the behavior of three phylogenetic confidence methods: Bayesian posterior probabilities calculated via Markov chain Monte Carlo sampling (BMCMC-PP), maximum likelihood bootstrap proportion (ML-BP), and maximum parsimony bootstrap proportion (MP-BP). We simulated the evolution of DNA sequence on 17-taxon topologies under 18 evolutionary scenarios and examined the performance of these methods in assigning confidence to correct monophyletic and incorrect monophyletic groups, and we examined the effects of increasing character number on support value. BMCMC-PP and ML-BP were often strongly correlated with one another but could provide substantially different estimates of support on short internodes. In contrast, BMCMC-PP correlated poorly with MP-BP across most of the simulation conditions that we examined. For a given threshold value, more correct monophyletic groups were supported by BMCMC-PP than by either ML-BP or MP-BP. When threshold values were chosen that fixed the rate of accepting incorrect monophyletic relationship as true at 5%, all three methods recovered most of the correct relationships on the simulated topologies, although BMCMC-PP and ML-BP performed better than MP-BP. BMCMC-PP was usually a less biased predictor of phylogenetic accuracy than either bootstrapping method. BMCMC-PP provided high support values for correct topological bipartitions with fewer characters than was needed for nonparametric bootstrap.
Uncertainties in ozone concentrations predicted with a Lagrangian photochemical air quality model have been estimated using Bayesian Monte Carlo (BMC) analysis. Bayesian Monte Carlo analysis provides a means of combining subjective "prior" uncertainty estimates developed ...
Bayesian Analysis of the Pattern Informatics Technique
NASA Astrophysics Data System (ADS)
Cho, N.; Tiampo, K.; Klein, W.; Rundle, J.
2007-12-01
Pattern informatics (PI) [Rundle et al., 2000; Tiampo et al., 2002; Holliday et al., 2005] is a technique that uses phase dynamics to quantify temporal variations in seismicity patterns. This technique has shown interesting results for forecasting earthquakes with magnitude greater than or equal to 5 in southern California from 2000 to 2010 [Rundle et al., 2002]. In this work, a Bayesian approach is used to obtain a modified, updated version of the PI called Bayesian pattern informatics (BPI). This alternative method uses the PI result as a prior probability and models such as ETAS [Ogata, 1988, 2004; Helmstetter and Sornette, 2002] or BASS [Turcotte et al., 2007] to obtain the likelihood. Its result is similar to the one obtained by the PI: the determination of regions, known as hotspots, that are most susceptible to the occurrence of events with M=5 and larger during the forecast period. As an initial test, retrospective forecasts for the southern California region from 1990 to 2000 were made with both the BPI and the PI techniques, and the results are discussed in this work.
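The idea of treating one forecast map as a prior and a second model as the likelihood reduces, per grid cell, to a normalized product. The sketch below shows only that generic Bayes update; how the BPI actually derives its likelihood from ETAS or BASS is not described in the abstract, so the per-cell likelihood values, the threshold rule, and the function names here are hypothetical.

```python
def bayes_update(prior, likelihood):
    """Combine per-cell prior probabilities with per-cell likelihoods.

    Returns the posterior over cells: prior * likelihood, renormalized to sum to 1.
    """
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("prior and likelihood have no overlapping support")
    return [u / total for u in unnormalized]


def hotspots(posterior, threshold):
    """Indices of cells whose posterior probability reaches the threshold."""
    return [i for i, p in enumerate(posterior) if p >= threshold]


if __name__ == "__main__":
    # Hypothetical three-cell region: prior from a PI-style map, likelihood from a second model.
    prior = [0.5, 0.3, 0.2]
    likelihood = [0.1, 0.4, 0.4]
    posterior = bayes_update(prior, likelihood)
    print(posterior)
    print(hotspots(posterior, 0.3))
```

In this toy case the cell with the highest prior drops out of the hotspot list because the second model assigns it a low likelihood, which is the kind of revision a prior-plus-likelihood scheme is meant to allow.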
Bayesian networks as a tool for epidemiological systems analysis
NASA Astrophysics Data System (ADS)
Lewis, F. I.
2012-11-01
Bayesian network analysis is a form of probabilistic modeling which derives from empirical data a directed acyclic graph (DAG) describing the dependency structure between random variables. Bayesian networks are increasingly finding application in areas such as computational and systems biology, and more recently in epidemiological analyses. The key distinction between standard empirical modeling approaches, such as generalised linear modeling, and Bayesian network analyses is that the latter attempts not only to identify statistically associated variables, but to additionally, and empirically, separate these into those directly and indirectly dependent with one or more outcome variables. Such discrimination is vastly more ambitious but has the potential to reveal far more about key features of complex disease systems. Applying Bayesian network modeling to biological and medical data has considerable computational demands, combined with the need to ensure robust model selection given the vast model space of possible DAGs. These challenges require the use of approximation techniques, such as the Laplace approximation, Markov chain Monte Carlo simulation and parametric bootstrapping, along with computational parallelization. A case study in structure discovery - identification of an optimal DAG for given data - is presented which uses additive Bayesian networks to explore veterinary disease data of industrial and medical relevance.
bamr: Bayesian analysis of mass and radius observations
NASA Astrophysics Data System (ADS)
Steiner, Andrew W.
2014-08-01
bamr is an MPI implementation of a Bayesian analysis of neutron star mass and radius data that determines the mass versus radius curve and the equation of state of dense matter. Written in C++, bamr provides some EOS models. This code requires O2scl (ascl:1408.019) be installed before compilation.
A Comparison of Imputation Methods for Bayesian Factor Analysis Models
ERIC Educational Resources Information Center
Merkle, Edgar C.
2011-01-01
Imputation methods are popular for the handling of missing data in psychology. The methods generally consist of predicting missing data based on observed data, yielding a complete data set that is amenable to standard statistical analyses. In the context of Bayesian factor analysis, this article compares imputation under an unrestricted…
Exploration of phylogenetic data using a global sequence analysis method
Chapus, Charles; Dufraigne, Christine; Edwards, Scott; Giron, Alain; Fertil, Bernard; Deschavanne, Patrick
2005-01-01
Background Molecular phylogenetic methods are based on alignments of nucleic or peptidic sequences. The tremendous increase in molecular data permits phylogenetic analyses of very long sequences and of many species, but also requires methods to help manage large datasets. Results Here we explore the phylogenetic signal present in molecular data by genomic signatures, defined as the set of frequencies of short oligonucleotides present in DNA sequences. Although violating many of the standard assumptions of traditional phylogenetic analyses – in particular explicit statements of homology inherent in character matrices – the use of the signature does permit the analysis of very long sequences, even those that are unalignable, and is therefore most useful in cases where alignment is questionable. We compare the results obtained by traditional phylogenetic methods to those inferred by the signature method for two genes: RAG1, which is easily alignable, and 18S RNA, where alignments are often ambiguous for some regions. We also apply this method to a multigene data set of 33 genes for 9 bacteria and one archaeal species as well as to the whole genome of a set of 16 γ-proteobacteria. In addition to delivering phylogenetic results comparable to traditional methods, the comparison of signatures for the sequences involved in the bacterial example identified putative candidates for horizontal gene transfers. Conclusion The signature method is therefore a fast tool for exploring phylogenetic data, providing not only a pretreatment for discovering new sequence relationships, but also for identifying cases of sequence evolution that could confound traditional phylogenetic analysis. PMID:16280081
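A genomic signature as defined above (the set of frequencies of short oligonucleotides in a DNA sequence) needs no alignment and can be computed directly. The sketch below is illustrative only: the word length `k=4` and the Euclidean distance between signatures are assumptions, not choices stated in the abstract.

```python
from collections import Counter
from itertools import product

def genomic_signature(seq, k=4):
    """Frequencies of all 4**k DNA words of length k, in a fixed (sorted) order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Words containing ambiguous bases (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]

def signature_distance(a, b):
    """Euclidean distance between two signatures (an assumed choice of metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```

A matrix of pairwise `signature_distance` values could then be passed to any distance-based tree-building method.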
ERIC Educational Resources Information Center
Franklin, Wilfred A.
2010-01-01
In a flexible multisession laboratory, students investigate concepts of phylogenetic analysis at both the molecular and the morphological level. Students finish by conducting their own analysis on a collection of skeletons representing the major phyla of vertebrates, a collection of primate skulls, or a collection of hominid skulls.
A phylogenetic analysis of the megadiverse Chalcidoidea (Hymenoptera)
Technology Transfer Automated Retrieval System (TEKTRAN)
Chalcidoidea (Hymenoptera) are extremely diverse with an estimated 500,000 species. We present the first phylogenetic analysis of the superfamily based on a cladistic analysis of both morphological and molecular data. A total of 233 morphological characters were scored for 300 taxa and 265 genera, a...
Phylogenetic Analysis and Epidemic History of Hepatitis C Virus Genotype 2 in Tunisia, North Africa.
Rajhi, Mouna; Ghedira, Kais; Chouikha, Anissa; Djebbi, Ahlem; Cheikh, Imed; Ben Yahia, Ahlem; Sadraoui, Amel; Hammami, Walid; Azouz, Msaddek; Ben Mami, Nabil; Triki, Henda
2016-01-01
HCV genotype 2 (HCV-2) has a worldwide distribution with prevalence rates that vary from country to country. High genetic diversity and long-term endemicity were suggested in West African countries. A global dispersal of HCV-2 would have occurred during the 20th century, especially in European countries. In Tunisia, genotype 2 was the second most prevalent genotype after genotype 1 and most isolates belong to subtypes 2c and 2k. In this study, phylogenetic analyses based on the NS5B genomic sequences of 113 Tunisian HCV isolates from subtypes 2c and 2k were carried out. A Bayesian coalescent-based framework was used to estimate the origin and the spread of these subtypes circulating in Tunisia. Phylogenetic analyses of HCV-2c sequences suggest the absence of country-specific or time-specific variants. In contrast, the phylogenetic grouping of HCV-2k sequences shows the existence of two major genetic clusters that may represent two distinct circulating variants. Coalescent analysis indicated a most recent common ancestor (tMRCA) of Tunisian HCV-2c around 1886 (1869-1902) before the introduction of HCV-2k in 1901 (1867-1931). Our findings suggest that the introduction of HCV-2c in Tunisia is possibly a result of population movements between Tunisian and European populations following the French colonization.
A Bayesian Analysis of Scale-Invariant Processes
2012-01-01
Nieves, Veronica; Wang, Jingfeng; Bras, Rafael L. (Georgia Tech Research Corporation, Office of Sponsored Programs). Citation: AIP Conf. Proc. 1443, 56 (2012); doi: 10.1063/1.3703620.
An Overview of Bayesian Methods for Neural Spike Train Analysis
2013-01-01
Neural spike train analysis is an important task in computational neuroscience which aims to understand neural mechanisms and gain insights into neural circuits. With the advancement of multielectrode recording and imaging technologies, it has become increasingly demanding to develop statistical tools for analyzing large neuronal ensemble spike activity. Here we present a tutorial overview of Bayesian methods and their representative applications in neural spike train analysis, at both single neuron and population levels. On the theoretical side, we focus on various approximate Bayesian inference techniques as applied to latent state and parameter estimation. On the application side, the topics include spike sorting, tuning curve estimation, neural encoding and decoding, deconvolution of spike trains from calcium imaging signals, and inference of neuronal functional connectivity and synchrony. Some research challenges and opportunities for neural spike train analysis are discussed. PMID:24348527
Molak, Martyna; Suchard, Marc A; Ho, Simon Y W; Beilman, David W; Shapiro, Beth
2015-01-01
Studies of DNA from ancient samples provide a valuable opportunity to gain insight into past evolutionary and demographic processes. Bayesian phylogenetic methods can estimate evolutionary rates and timescales from ancient DNA sequences, with the ages of the samples acting as calibrations for the molecular clock. Sample ages are often estimated using radiocarbon dating, but the associated measurement error is rarely taken into account. In addition, the total uncertainty quantified by converting radiocarbon dates to calendar dates is typically ignored. Here, we present a tool for incorporating both of these sources of uncertainty into Bayesian phylogenetic analyses of ancient DNA. This empirical calibrated radiocarbon sampler (ECRS) integrates the age uncertainty for each ancient sequence over the calibrated probability density function estimated for its radiocarbon date and associated error. We use the ECRS to analyse three ancient DNA data sets. Accounting for radiocarbon-dating and calibration error appeared to have little impact on estimates of evolutionary rates and related parameters for these data sets. However, analyses of other data sets, particularly those with few or only very old radiocarbon dates, might be more sensitive to using artificially precise sample ages and should benefit from use of the ECRS.
A Deliberate Practice Approach to Teaching Phylogenetic Analysis
ERIC Educational Resources Information Center
Hobbs, F. Collin; Johnson, Daniel J.; Kearns, Katherine D.
2013-01-01
One goal of postsecondary education is to assist students in developing expert-level understanding. Previous attempts to encourage expert-level understanding of phylogenetic analysis in college science classrooms have largely focused on isolated, or "one-shot," in-class activities. Using a deliberate practice instructional approach, we…
A Bayesian Analysis of the Flood Frequency Hydrology Concept
2016-02-01
By Brian E. Skahill, Alberto Viglione, and Aaron Byrd. PURPOSE: The purpose of this document is to demonstrate a Bayesian analysis of the...flood frequency hydrology concept as a formal probabilistic-based means by which to coherently combine and also evaluate the worth of different types...and development. INTRODUCTION: Merz and Blöschl (2008a,b) proposed the concept of flood frequency hydrology, which emphasizes the importance of
Fusarium culmorum is a single phylogenetic species based on multilocus sequence analysis.
Obanor, Friday; Erginbas-Orakci, G; Tunali, B; Nicol, J M; Chakraborty, S
2010-09-01
Fusarium culmorum is a major pathogen of wheat and barley causing head blight and crown rot in cooler temperate climates of Australia, Europe, West Asia and North Africa. To better understand its evolutionary history we partially sequenced single copy nuclear genes encoding translation elongation factor 1-α (TEF), reductase (RED) and phosphate permease (PHO) in 100 F. culmorum isolates with 11 isolates of Fusarium crookwellense, Fusarium graminearum and Fusarium pseudograminearum. Phylogenetic analysis of multilocus sequence (MLS) data using Bayesian inference and maximum parsimony analysis showed that F. culmorum from wheat is a single phylogenetic species with no significant linkage disequilibrium and little or no lineage development along geographic origin. Both MLS and TEF and RED gene sequence analysis separated the four Fusarium species used and delineated three to four groups within the F. culmorum clade. But the PHO gene could not completely resolve isolates into their respective species. Fixation index and gene flow suggest significant genetic exchange between the isolates from distant geographic regions. A lack of strong lineage structure despite the geographic separation of the three collections indicates a frequently recombining species and/or widespread distribution of genotypes due to international trade, tourism and long-range dispersal of macroconidia. Moreover, the two mating type genes were present in equal proportion among the F. culmorum collection used in this study, leaving open the possibility of sexual reproduction.
Phylogenetic and evolutionary analysis of influenza A H7N9 virus.
Babakir-Mina, Muhammed; Dimonte, Slavatore; Lo Presti, Alessandra; Cella, Eleonora; Perno, Carlo Federico; Ciotti, Marco; Ciccozzi, Massimo
2014-07-01
Recently, human infections with the novel avian-origin influenza A H7N9 virus have been reported from various provinces in China. Human infections with avian influenza A viruses are rare and may cause a wide spectrum of clinical symptoms. This is the first time that human infection with a low pathogenic avian influenza A virus has been associated with a fatal outcome. Here, a phylogenetic and positive selective pressure analysis of haemagglutinin (HA), neuraminidase (NA), and matrix protein (MP) genes of the novel reassortant H7N9 virus was carried out. The analysis showed that both structural genes of this reassortant virus likely originated from Euro-Asiatic birds, while NA was more likely to have originated from South Korean birds. The Bayesian phylogenetic tree of the MP showed a main clade and an outside cluster including four sequences from China. The United States and Guatemala classical H7N9 isolates appeared homogeneous and clustered together, although they are distinct from other classical Euro-Asiatic and novel H7N9 viruses. Selective pressure analysis did not reveal any site under statistically significant positive selective pressure in any of the three genes analyzed. Certain unknown intermediate hosts might be involved, so extensive global surveillance and bird-to-person transmission should be closely considered in the future.
Bayesian analysis of structural equation models with dichotomous variables.
Lee, Sik-Yum; Song, Xin-Yuan
2003-10-15
Structural equation modelling has been used extensively in the behavioural and social sciences for studying interrelationships among manifest and latent variables. Recently, its uses have been well recognized in medical research. This paper introduces a Bayesian approach to analysing general structural equation models with dichotomous variables. In the posterior analysis, the observed dichotomous data are augmented with the hypothetical missing values, which involve the latent variables in the model and the unobserved continuous measurements underlying the dichotomous data. An algorithm based on the Gibbs sampler is developed for drawing the parameter values and the hypothetical missing values from the joint posterior distributions. Useful statistics, such as the Bayesian estimates and their standard error estimates, and the highest posterior density intervals, can be obtained from the simulated observations. A posterior predictive p-value is used to test the goodness-of-fit of the posited model. The methodology is applied to a study of hypertensive patient non-adherence to medication.
Bayesian tomography and integrated data analysis in fusion diagnostics
NASA Astrophysics Data System (ADS)
Li, Dong; Dong, Y. B.; Deng, Wei; Shi, Z. B.; Fu, B. Z.; Gao, J. M.; Wang, T. B.; Zhou, Yan; Liu, Yi; Yang, Q. W.; Duan, X. R.
2016-11-01
In this article, a Bayesian tomography method using non-stationary Gaussian process for a prior has been introduced. The Bayesian formalism allows quantities which bear uncertainty to be expressed in the probabilistic form so that the uncertainty of a final solution can be fully resolved from the confidence interval of a posterior probability. Moreover, a consistency check of that solution can be performed by checking whether the misfits between predicted and measured data are reasonably within an assumed data error. In particular, the accuracy of reconstructions is significantly improved by using the non-stationary Gaussian process that can adapt to the varying smoothness of emission distribution. The implementation of this method to a soft X-ray diagnostics on HL-2A has been used to explore relevant physics in equilibrium and MHD instability modes. This project is carried out within a large size inference framework, aiming at an integrated analysis of heterogeneous diagnostics.
BAYESIAN ANALYSIS OF MULTIPLE HARMONIC OSCILLATIONS IN THE SOLAR CORONA
Arregui, I.; Asensio Ramos, A.; Diaz, A. J.
2013-03-01
The detection of multiple mode harmonic kink oscillations in coronal loops enables us to obtain information on coronal density stratification and magnetic field expansion using seismology inversion techniques. The inference is based on the measurement of the period ratio between the fundamental mode and the first overtone and theoretical results for the period ratio under the hypotheses of coronal density stratification and magnetic field expansion of the wave guide. We present a Bayesian analysis of multiple mode harmonic oscillations for the inversion of the density scale height and magnetic flux tube expansion under each of the hypotheses. The two models are then compared using a Bayesian model comparison scheme to assess how plausible each one is given our current state of knowledge.
Bayesian methods for the analysis of inequality constrained contingency tables.
Laudy, Olav; Hoijtink, Herbert
2007-04-01
A Bayesian methodology for the analysis of inequality constrained models for contingency tables is presented. The problem of interest lies in obtaining the estimates of functions of cell probabilities subject to inequality constraints, testing hypotheses and selection of the best model. Constraints on conditional cell probabilities and on local, global, continuation and cumulative odds ratios are discussed. A Gibbs sampler to obtain a discrete representation of the posterior distribution of the inequality constrained parameters is used. Using this discrete representation, the credibility regions of functions of cell probabilities can be constructed. Posterior model probabilities are used for model selection and hypotheses are tested using posterior predictive checks. The Bayesian methodology proposed is illustrated in two examples.
A Deliberate Practice Approach to Teaching Phylogenetic Analysis
Hobbs, F. Collin; Johnson, Daniel J.; Kearns, Katherine D.
2013-01-01
One goal of postsecondary education is to assist students in developing expert-level understanding. Previous attempts to encourage expert-level understanding of phylogenetic analysis in college science classrooms have largely focused on isolated, or “one-shot,” in-class activities. Using a deliberate practice instructional approach, we designed a set of five assignments for a 300-level plant systematics course that incrementally introduces the concepts and skills used in phylogenetic analysis. In our assignments, students learned the process of constructing phylogenetic trees through a series of increasingly difficult tasks; thus, skill development served as a framework for building content knowledge. We present results from 5 yr of final exam scores, pre- and postconcept assessments, and student surveys to assess the impact of our new pedagogical materials on student performance related to constructing and interpreting phylogenetic trees. Students improved in their ability to interpret relationships within trees and improved in several aspects related to between-tree comparisons and tree construction skills. Student feedback indicated that most students believed our approach prepared them to engage in tree construction and gave them confidence in their abilities. Overall, our data confirm that instructional approaches implementing deliberate practice address student misconceptions, improve student experiences, and foster deeper understanding of difficult scientific concepts. PMID:24297294
Martínez-Salazar, Elizabeth A; Rosas-Valdez, Rogelio; Gregory, T Ryan; Violante-González, Juan
2016-08-01
Infidum similis Travassos, 1916 (Dicrocoeliidae: Leipertrematinae) was found in the gall bladder of Leptophis diplotropis Günther, 1872 from El Podrido, Acapulco, Guerrero, Mexico. A phylogenetic analysis based on partial sequences of the 28S ribosomal RNA using maximum likelihood (ML) and Bayesian inference (BI) analyses was carried out to assess its phylogenetic position within suborder Xiphidiata, alongside members of the superfamilies Gorgoderoidea and Plagiorchoidea. The phylogenetic trees showed that the genus is most-closely related to the Plagiorchoidea rather than to the Gorgoderoidea, in keeping with previous taxonomic designations. Phylogenies obtained from ML and BI analysis of the 28S rDNA gene revealed a well supported clade in which Choledocystus hepaticus (Lutz, 1928) Sullivan, 1977 is sister to I. similis. On the other hand, a tree obtained using a partial sequence of the cytochrome c oxidase subunit 1 (cox1) mtDNA gene (ML and BI analysis), with species supposed to be closely related to I. similis according to 28S, does not support this relatedness. Based on the independence of Infidum from the subfamily Leipertrematinae Yamaguti, 1958, our results clearly demonstrated that the genus corresponds to a different family and with species closely related to C. hepaticus within Plagiorchoidea. New data are presented about the tegumental surface of I. similis by scanning electron microscopy as well as the estimation of its haploid genome size using Feulgen Image Analysis Densitometry of sperm nuclei as part of the characterization of this species. This is the first genome size estimated for a member of Plagiorchiida, and these data will provide a new source of knowledge on helminth diversity and evolutionary studies. This constitutes the first host record, and new geographical distribution, for this species in Mexico.
Cao, Y; Hao, J S; Sun, X Y; Zheng, B; Yang, Q
2016-12-02
Pieridae is a butterfly family whose evolutionary history is poorly understood. Due to the difficulties in identifying morphological synapomorphies within the group and the scarcity of the fossil records, only a few studies on higher phylogeny of Pieridae have been reported to date. In this study, we describe the complete mitochondrial genomes of four pierid butterfly species (Aporia martineti, Aporia hippia, Aporia bieti, and Mesapia peloria), in order to better characterize the pierid butterfly mitogenomes and perform the phylogenetic analyses using all available mitogenomic sequence data (13PCGs, rRNAs, and tRNAs) from the 18 pierid butterfly species comprising the three main subfamilies (Dismorphiinae, Coliadinae and Pierinae). Our analysis shows that the four new mitogenomes share similar features with other known pierid mitogenomes in gene order and organization. Phylogenetic analyses by maximum likelihood and Bayesian inference show that the pierid higher-level relationship is: Dismorphiinae + (Coliadinae + Pierinae), which corroborates the results of some previous molecular and morphological studies. However, we found that the Hebomoia and Anthocharis make a sister group, supporting the traditional tribe Anthocharidini; in addition, the Mesapia peloria was shown to be clustered within the Aporia group, suggesting that the genus Mesapia should be reduced to the taxonomic status of subgenus. Our molecular dating analysis indicates that the family Pieridae began to diverge during the Late Cretaceous about 92 million years ago (mya), while the subfamily Pierinae diverged from the Coliadinae at about 86 mya (Late Cretaceous).
2010-01-01
Background Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service. PMID:21034504
A Bayesian Nonparametric Meta-Analysis Model
ERIC Educational Resources Information Center
Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G.
2015-01-01
In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall…
Risk analysis using a hybrid Bayesian-approximate reasoning methodology.
Bott, T. F.; Eisenhawer, S. W.
2001-01-01
Analysts are sometimes asked to make frequency estimates for specific accidents in which the accident frequency is determined primarily by safety controls. Under these conditions, frequency estimates use considerable expert belief in determining how the controls affect the accident frequency. To evaluate and document beliefs about control effectiveness, we have modified a traditional Bayesian approach by using approximate reasoning (AR) to develop prior distributions. Our method produces accident frequency estimates that separately express the probabilistic results produced in Bayesian analysis and possibilistic results that reflect uncertainty about the prior estimates. Based on our experience using traditional methods, we feel that the AR approach better documents beliefs about the effectiveness of controls than if the beliefs are buried in Bayesian prior distributions. We have performed numerous expert elicitations in which probabilistic information was sought from subject matter experts not trained in probability. We find it much easier to elicit the linguistic variables and fuzzy set membership values used in AR than to obtain the probability distributions used in prior distributions directly from these experts because it better captures their beliefs and better expresses their uncertainties.
Structure-Based Phylogenetic Analysis of the Lipocalin Superfamily
Lakshmi, Balasubramanian; Mishra, Madhulika; Srinivasan, Narayanaswamy; Archunan, Govindaraju
2015-01-01
Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study with 39 protein domains from the lipocalin superfamily suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with the functional diversity. The detailed analysis on each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster based on their mode of ligand binding though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single member families. Also structure-based phylogenetic approach has provided pointers to assign putative function for the domains of unknown function in lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families where members show poor sequence similarity but high structural similarity. PMID:26263546
Acute Abdominal Pain: Bayesian Analysis in the Emergency Room
Harvey, A. C.; Moodie, P. F.
1982-01-01
A non-sequential Bayesian analysis was deemed a suitable approach to the important clinical problem of analysis of acute abdominal pain in the Emergency Room. Using series reported in the literature as a data source complemented by expert clinical estimates of probabilities of clinical data a program has been established in St. Boniface, Canada. Prior to implementing the program as an online, quickly available diagnostic aid, a prospective preliminary study has shown that the performance of computer plus clinician is significantly better than either clinician or computer alone. A major emphasis has been developing the acceptability of the program in real-life diagnoses in the Emergency Room.
Spectral Analysis of B Stars: An Application of Bayesian Statistics
NASA Astrophysics Data System (ADS)
Mugnes, J.-M.; Robert, C.
2012-12-01
To better understand the processes involved in stellar physics, it is necessary to obtain accurate stellar parameters (effective temperature, surface gravity, abundances…). Spectral analysis is a powerful tool for investigating stars, but it is also vital to reduce uncertainties at a decent computational cost. Here we present a spectral analysis method based on a combination of Bayesian statistics and grids of synthetic spectra obtained with TLUSTY. This method simultaneously constrains the stellar parameters by using all the lines accessible in observed spectra and thus greatly reduces uncertainties and improves the overall spectrum fitting. Preliminary results are shown using spectra from the Observatoire du Mont-Mégantic.
Ari, Eszter; Ittzés, Péter; Podani, János; Thi, Quynh Chi Le; Jakó, Eena
2012-04-01
Boolean analysis (or BOOL-AN; Jakó et al., 2009. BOOL-AN: A method for comparative sequence analysis and phylogenetic reconstruction. Mol. Phylogenet. Evol. 52, 887-97.), a recently developed method for sequence comparison uses the Iterative Canonical Form of Boolean functions. It considers sequence information in a way entirely different from standard phylogenetic methods (i.e. Maximum Parsimony, Maximum-Likelihood, Neighbor-Joining, and Bayesian analysis). The performance and reliability of Boolean analysis were tested and compared with the standard phylogenetic methods, using artificially evolved - simulated - nucleotide sequences and the 22 mitochondrial tRNA genes of the great apes. At the outset, we assumed that the phylogeny of Hominidae is generally well established, and the guide tree of artificial sequence evolution can also be used as a benchmark. These offer a possibility to compare and test the performance of different phylogenetic methods. Trees were reconstructed by each method from 2500 simulated sequences and 22 mitochondrial tRNA sequences. We also introduced a special re-sampling method for Boolean analysis on permuted sequence sites, the P-BOOL-AN procedure. Considering the reliability values (branch support values of consensus trees and Robinson-Foulds distances) we used for simulated sequence trees produced by different phylogenetic methods, BOOL-AN appeared as the most reliable method. Although the mitochondrial tRNA sequences of great apes are relatively short (59-75 bases long) and the ratio of their constant characters is about 75%, BOOL-AN, P-BOOL-AN and the Bayesian approach produced the same tree-topology as the established phylogeny, while the outcomes of Maximum Parsimony, Maximum-Likelihood and Neighbor-Joining methods were equivocal. We conclude that Boolean analysis is a promising alternative to existing methods of sequence comparison for phylogenetic reconstruction and congruence analysis.
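The abstract above uses Robinson-Foulds distances as one of its reliability values. As a minimal sketch of what that metric counts, the Python below compares two trees by their non-trivial splits. The nested-tuple tree encoding, the function names, and the five-taxon example trees are illustrative assumptions and are not taken from the study.

```python
def leaf_set(tree):
    """Return the frozenset of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions induced by the internal edges."""
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)  # reference leaf: each split is stored as the side without it
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        below = leaf_set(node)
        side = below if anchor not in below else all_leaves - below
        # Skip trivial splits that separate nothing or a single leaf.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must share the same leaf set")
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical example trees (not the study's data).
reference = (((("human", "chimp"), "gorilla"), "orangutan"), "gibbon")
alternative = (((("human", "gorilla"), "chimp"), "orangutan"), "gibbon")

print(robinson_foulds(reference, reference))    # 0
print(robinson_foulds(reference, alternative))  # 2
```

A distance of 0 means the two topologies agree on every split; each disagreement on one internal branch typically adds 2 because the split is missing from one tree and replaced by a conflicting one in the other.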
Phylogenetic and Recombination Analysis of Tomato Spotted Wilt Virus
Yu, Jisuk; Kim, Mi-Kyeong; Choi, Hong-Soo; Kim, Kook-Hyung
2013-01-01
Tomato spotted wilt virus (TSWV) severely damages and reduces the yield of many economically important plants worldwide. In this study, we determined the whole-genome sequences of 10 TSWV isolates recently identified from various regions and hosts in Korea. Phylogenetic analysis of these 10 isolates as well as the three previously sequenced isolates indicated that the 13 Korean TSWV isolates could be divided into two groups reflecting either two different origins or divergences of Korean TSWV isolates. In addition, the complete nucleotide sequences for the 13 Korean TSWV isolates along with previously sequenced TSWV RNA segments from Korea and other countries were subjected to phylogenetic and recombination analysis. The phylogenetic analysis indicated that both the RNA L and RNA M segments of most Korean isolates might have originated in Western Europe and North America but that the RNA S segments for all Korean isolates might have originated in China and Japan. Recombination analysis identified a total of 12 recombination events among all isolates and segments and five recombination events among the 13 Korea isolates; among the five recombinants from Korea, three contained the whole RNA L segment, suggesting reassortment rather than recombination. Our analyses provide evidence that both recombination and reassortment have contributed to the molecular diversity of TSWV. PMID:23696821
Leaché, Adam D; Crews, Sarah C; Hickerson, Michael J
2007-12-22
Many species inhabiting the Peninsular Desert of Baja California demonstrate a phylogeographic break at the mid-peninsula, and previous researchers have attributed this shared pattern to a single vicariant event, a mid-peninsular seaway. However, previous studies have not explicitly considered the inherent stochasticity associated with the gene-tree coalescence for species preceding the time of the putative mid-peninsular divergence. We use a Bayesian analysis of a hierarchical model to test for simultaneous vicariance across co-distributed sister lineages sharing a genealogical break at the mid-peninsula. This Bayesian method is advantageous over traditional phylogenetic interpretations of biogeography because it considers the genetic variance associated with the coalescent and mutational processes, as well as the among-lineage demographic differences that affect gene-tree coalescent patterns. Mitochondrial DNA data from six small mammals and six squamate reptiles do not support the perception of a shared vicariant history among lineages exhibiting a north-south divergence at the mid-peninsula, and instead support two events differentially structuring genetic diversity in this region.
2011-01-01
Background The universal common ancestry (UCA) of all known life is a fundamental component of modern evolutionary theory, supported by a wide range of qualitative molecular evidence. Nevertheless, recently both the status and nature of UCA has been questioned. In earlier work I presented a formal, quantitative test of UCA in which model selection criteria overwhelmingly choose common ancestry over independent ancestry, based on a dataset of universally conserved proteins. These model-based tests are founded in likelihoodist and Bayesian probability theory, in opposition to classical frequentist null hypothesis tests such as Karlin-Altschul E-values for sequence similarity. In a recent comment, Koonin and Wolf (K&W) claim that the model preference for UCA is "a trivial consequence of significant sequence similarity". They support this claim with a computational simulation, derived from universally conserved proteins, which produces similar sequences lacking phylogenetic structure. The model selection tests prefer common ancestry for this artificial data set. Results For the real universal protein sequences, hierarchical phylogenetic structure (induced by genealogical history) is the overriding reason for why the tests choose UCA; sequence similarity is a relatively minor factor. First, for cases of conflicting phylogenetic structure, the tests choose independent ancestry even with highly similar sequences. Second, certain models, like star trees and K&W's profile model (corresponding to their simulation), readily explain sequence similarity yet lack phylogenetic structure. However, these are extremely poor models for the real proteins, even worse than independent ancestry models, though they explain K&W's artificial data well. Finally, K&W's simulation is an implementation of a well-known phylogenetic model, and it produces sequences that mimic homologous proteins. Therefore the model selection tests work appropriately with the artificial data. Conclusions For K
A Bayesian analysis of pentaquark signals from CLAS data
David Ireland; Bryan McKinnon; Dan Protopopescu; Pawel Ambrozewicz; Marco Anghinolfi; G. Asryan; Harutyun Avakian; H. Bagdasaryan; Nathan Baillie; Jacques Ball; Nathan Baltzell; V. Batourine; Marco Battaglieri; Ivan Bedlinski; Ivan Bedlinskiy; Matthew Bellis; Nawal Benmouna; Barry Berman; Angela Biselli; Lukasz Blaszczyk; Sylvain Bouchigny; Sergey Boyarinov; Robert Bradford; Derek Branford; William Briscoe; William Brooks; Volker Burkert; Cornel Butuceanu; John Calarco; Sharon Careccia; Daniel Carman; Liam Casey; Shifeng Chen; Lu Cheng; Philip Cole; Patrick Collins; Philip Coltharp; Donald Crabb; Volker Crede; Natalya Dashyan; Rita De Masi; Raffaella De Vita; Enzo De Sanctis; Pavel Degtiarenko; Alexandre Deur; Richard Dickson; Chaden Djalali; Gail Dodge; Joseph Donnelly; David Doughty; Michael Dugger; Oleksandr Dzyubak; Hovanes Egiyan; Kim Egiyan; Lamiaa Elfassi; Latifa Elouadrhiri; Paul Eugenio; Gleb Fedotov; Gerald Feldman; Ahmed Fradi; Herbert Funsten; Michel Garcon; Gagik Gavalian; Nerses Gevorgyan; Gerard Gilfoyle; Kevin Giovanetti; Francois-Xavier Girod; John Goetz; Wesley Gohn; Atilla Gonenc; Ralf Gothe; Keith Griffioen; Michel Guidal; Nevzat Guler; Lei Guo; Vardan Gyurjyan; Kawtar Hafidi; Hayk Hakobyan; Charles Hanretty; Neil Hassall; F. Hersman; Ishaq Hleiqawi; Maurik Holtrop; Charles Hyde; Yordanka Ilieva; Boris Ishkhanov; Eugeny Isupov; D. Jenkins; Hyon-Suk Jo; John Johnstone; Kyungseon Joo; Henry Juengst; Narbe Kalantarians; James Kellie; Mahbubul Khandaker; Wooyoung Kim; Andreas Klein; Franz Klein; Mikhail Kossov; Zebulun Krahn; Laird Kramer; Valery Kubarovsky; Joachim Kuhn; Sergey Kuleshov; Viacheslav Kuznetsov; Jeff Lachniet; Jean Laget; Jorn Langheinrich; D. 
Lawrence; Kenneth Livingston; Haiyun Lu; Marion MacCormick; Nikolai Markov; Paul Mattione; Bernhard Mecking; Mac Mestayer; Curtis Meyer; Tsutomu Mibe; Konstantin Mikhaylov; Marco Mirazita; Rory Miskimen; Viktor Mokeev; Brahim Moreno; Kei Moriya; Steven Morrow; Maryam Moteabbed; Edwin Munevar Espitia; Gordon Mutchler; Pawel Nadel-Turonski; Rakhsha Nasseripour; Silvia Niccolai; Gabriel Niculescu; Maria-Ioana Niculescu; Bogdan Niczyporuk; Megh Niroula; Rustam Niyazov; Mina Nozar; Mikhail Osipenko; Alexander Ostrovidov; Kijun Park; Evgueni Pasyuk; Craig Paterson; Sergio Pereira; Joshua Pierce; Nikolay Pivnyuk; Oleg Pogorelko; Sergey Pozdnyakov; John Price; Sebastien Procureur; Yelena Prok; Brian Raue; Giovanni Ricco; Marco Ripani; Barry Ritchie; Federico Ronchetti; Guenther Rosner; Patrizia Rossi; Franck Sabatie; Julian Salamanca; Carlos Salgado; Joseph Santoro; Vladimir Sapunenko; Reinhard Schumacher; Vladimir Serov; Youri Sharabian; Dmitri Sharov; Nikolay Shvedunov; Elton Smith; Lee Smith; Daniel Sober; Daria Sokhan; Aleksey Stavinskiy; Samuel Stepanyan; Stepan Stepanyan; Burnham Stokes; Paul Stoler; Steffen Strauch; Mauro Taiuti; David Tedeschi; Ulrike Thoma; Avtandil Tkabladze; Svyatoslav Tkachenko; Clarisse Tur; Maurizio Ungaro; Michael Vineyard; Alexander Vlassov; Daniel Watts; Lawrence Weinstein; Dennis Weygand; M. Williams; Elliott Wolin; M.H. Wood; Amrit Yegneswaran; Lorenzo Zana; Jixie Zhang; Bo Zhao; Zhiwen Zhao
2008-02-01
We examine the results of two measurements by the CLAS collaboration, one of which claimed evidence for a $\Theta^{+}$ pentaquark, whilst the other found no such evidence. The unique feature of these two experiments was that they were performed with the same experimental setup. Using a Bayesian analysis we find that the results of the two experiments are in fact compatible with each other, but that the first measurement did not contain sufficient information to determine unambiguously the existence of a $\Theta^{+}$. Further, we suggest a means by which the existence of a new candidate particle can be tested in a rigorous manner.
Analysis of diversification: combining phylogenetic and taxonomic data.
Paradis, Emmanuel
2003-01-01
The estimation of diversification rates using phylogenetic data has attracted a lot of attention in the past decade. In this context, the analysis of incomplete phylogenies (e.g. phylogenies resolved at the family level but unresolved at the species level) has remained difficult. I present here a likelihood-based method to combine partly resolved phylogenies with taxonomic (species-richness) data to estimate speciation and extinction rates. This method is based on fitting a birth-and-death model to both phylogenetic and taxonomic data. Some examples of the method are presented with data on birds and on mammals. The method is compared with existing approaches that deal with incomplete phylogenies. Some applications and generalizations of the approach introduced in this paper are further discussed. PMID:14667342
Bayesian analysis of inflationary features in Planck and SDSS data
NASA Astrophysics Data System (ADS)
Benetti, Micol; Alcaniz, Jailson S.
2016-07-01
We perform a Bayesian analysis to study possible features in the primordial inflationary power spectrum of scalar perturbations. In particular, we analyze the possibility of detecting the imprint of these primordial features in the anisotropy temperature power spectrum of the cosmic microwave background (CMB) and also in the matter power spectrum P(k). We use the most recent CMB data provided by the Planck Collaboration and P(k) measurements from the 11th data release of the Sloan Digital Sky Survey. We focus our analysis on a class of potentials whose features are localized at different intervals of angular scales, corresponding to multipoles in the ranges 10 < ℓ < 60 (Oscill-1) and 150 < ℓ < 300 (Oscill-2). Our results show that one of the step potentials (Oscill-1) provides a better fit to the CMB data than does the featureless ΛCDM scenario, with moderate Bayesian evidence in favor of the former. Adding the P(k) data to the analysis weakens the evidence of the Oscill-1 potential relative to the standard model and strengthens the evidence of this latter scenario with respect to the Oscill-2 model.
Implementation of a Bayesian Engine for Uncertainty Analysis
Leng Vang; Curtis Smith; Steven Prescott
2014-08-01
In probabilistic risk assessment, it is important to have an environment where analysts have access to a shared and secured high performance computing and a statistical analysis tool package. As part of the advanced small modular reactor probabilistic risk analysis framework implementation, we have identified the need for advanced Bayesian computations. However, in order to make this technology available to non-specialists, there is also a need for a simplified tool that allows users to author models and evaluate them within this framework. As a proof-of-concept, we have implemented an advanced open source Bayesian inference tool, OpenBUGS, within the browser-based cloud risk analysis framework that is under development at the Idaho National Laboratory. This development, the “OpenBUGS Scripter”, has been implemented as a client-side, visual, web-based integrated development environment for creating OpenBUGS language scripts. It depends on the shared server environment to execute the generated scripts and to transmit results back to the user. The visual models are in the form of linked diagrams, from which we automatically create the applicable OpenBUGS script that matches the diagram. These diagrams can be saved locally or stored on the server environment to be shared with other users.
[A phylogenetic analysis of plant communities of Teberda Biosphere Reserve].
Shulakov, A A; Egorov, A V; Onipchenko, V G
2016-01-01
Phylogenetic analysis of communities is based on the comparison of distances on the phylogenetic tree between species of a community under study and those distances in random samples taken out of local flora. It makes it possible to determine to what extent a community composition is formed by more closely related species (i.e., "clustered") or, on the contrary, is more even and includes species that are less related to each other. The first case is usually interpreted as a result of strong influence of abiotic factors, due to which species with similar ecology, a priori more closely related, would remain. In the second case, biotic factors, such as competition, may come to the fore and lead to forming a community out of distant clades due to divergence of their ecological niches. The aim of this study was to explore the phylogenetic structure in communities of the northwestern Caucasus at two spatial scales - the scale of area from 4 to 100 m² and the smaller scale within a community. The list of local flora of the alpine belt has been composed using the database of geobotanic descriptions carried out in Teberda Biosphere Reserve at true altitudes exceeding 1800 m. It includes 585 species of flowering plants belonging to 57 families. Basal groups of flowering plants are not represented in the list. At the scale of communities, three classes, namely Thlaspietea rotundifolii - communities formed on screes and pebbles, Calluno-Ulicetea - alpine meadows, and Mulgedio-Aconitetea - subalpine meadows, have not demonstrated significant distinction of phylogenetic structure. At intra level, for alpine meadows the larger share of closely related species (clustered community) is detected. Significantly clustered happen to be those communities developing on rocks (class Asplenietea trichomanis) and alpine (class Juncetea trifidi). At the same time, alpine lichen proved to have even phylogenetic structure at the small scale. Alpine (class Salicetea herbaceae) that
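The comparison described at the start of this abstract, observed phylogenetic distances within a community versus distances in random samples from the local flora, can be sketched as a standardized effect size of the mean pairwise distance. The Python below is only an illustration under stated assumptions: the six-species flora, its two clades, the distance values, and the function names are all hypothetical, and the study's actual index and null model are not specified in the abstract.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def mean_pairwise_distance(species, dist):
    """Average tree distance over all species pairs in one community."""
    return mean(dist[frozenset(pair)] for pair in combinations(sorted(species), 2))


def standardized_effect_size(community, flora, dist, draws=999, seed=0):
    """(observed - null mean) / null sd, with the null drawn from the local flora."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance(community, dist)
    null = [
        mean_pairwise_distance(rng.sample(flora, len(community)), dist)
        for _ in range(draws)
    ]
    return (observed - mean(null)) / pstdev(null)


# Hypothetical flora: two clades, short distances within a clade, long between.
clade = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B", "b3": "B"}
flora = sorted(clade)
dist = {
    frozenset(pair): (2.0 if clade[pair[0]] == clade[pair[1]] else 10.0)
    for pair in combinations(flora, 2)
}

ses_clustered = standardized_effect_size(["a1", "a2", "a3"], flora, dist)
ses_even = standardized_effect_size(["a1", "a2", "b1"], flora, dist)
print(ses_clustered < 0, ses_even > 0)
```

In this sketch a negative value indicates a community of more closely related species than random draws ("clustered"), and a positive value indicates a more even community.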
Regional Frequency Analysis Based on Scaling Properties and Bayesian Models
NASA Astrophysics Data System (ADS)
Kwon, Hyun-Han; Lee, Jeong-Ju; Moon, Young-Il
2010-05-01
A regional frequency analysis based on a Hierarchical Bayesian Network (HBN) and scaling theory was developed. Many recording rain gauges over South Korea were used for the analysis. First, a scaling approach combined with an extreme distribution was employed to derive a regional formula for frequency analysis. Second, the HBN model was used to represent additional information about the regional structure of the scaling parameters, especially the location parameter and shape parameter. The location and shape parameters of the extreme distribution were estimated by utilizing scaling properties in a regression framework, and the scaling parameters linking the parameters (location and shape) to various duration times were simultaneously estimated. It was found that the regional frequency analysis combined with HBN and scaling properties shows promising results in terms of establishing regional IDF curves.
Bayesian principal geodesic analysis for estimating intrinsic diffeomorphic image variability.
Zhang, Miaomiao; Fletcher, P Thomas
2015-10-01
In this paper, we present a generative Bayesian approach for estimating the low-dimensional latent space of diffeomorphic shape variability in a population of images. We develop a latent variable model for principal geodesic analysis (PGA) that provides a probabilistic framework for factor analysis in the space of diffeomorphisms. A sparsity prior in the model results in automatic selection of the number of relevant dimensions by driving unnecessary principal geodesics to zero. To infer model parameters, including the image atlas, principal geodesic deformations, and the effective dimensionality, we introduce an expectation maximization (EM) algorithm. We evaluate our proposed model on 2D synthetic data and the 3D OASIS brain database of magnetic resonance images, and show that the automatically selected latent dimensions from our model are able to reconstruct unobserved testing images with lower error than both linear principal component analysis (LPCA) in the image space and tangent space principal component analysis (TPCA) in the diffeomorphism space.
Jacquemin, Stephen J.; Doll, Jason C.
2014-01-01
We combine evolutionary biology and community ecology to test whether two species traits, body size and geographic range, explain long term variation in local scale freshwater stream fish assemblages. Body size and geographic range are expected to influence several aspects of fish ecology, via relationships with niche breadth, dispersal, and abundance. These traits are expected to scale inversely with niche breadth or current abundance, and to scale directly with dispersal potential. However, their utility to explain long term temporal patterns in local scale abundance is not known. Comparative methods employing an existing molecular phylogeny were used to incorporate evolutionary relatedness in a test for covariation of body size and geographic range with long term (1983 – 2010) local scale population variation of fishes in West Fork White River (Indiana, USA). The Bayesian model incorporating phylogenetic uncertainty and correlated predictors indicated that neither body size nor geographic range explained significant variation in population fluctuations over a 28 year period. Phylogenetic signal data indicated that body size and geographic range were less similar among taxa than expected if trait evolution followed a purely random walk. We interpret this as evidence that local scale population variation may be influenced less by species-level traits such as body size or geographic range, and instead may be influenced more strongly by a taxon’s local scale habitat and biotic assemblages. PMID:24691075
Dolz, Roser; Valle, Rosa; Perera, Carmen L.; Bertran, Kateri; Frías, Maria T.; Majó, Natàlia; Ganges, Llilianne; Pérez, Lester J.
2013-01-01
Background Infectious bursal disease is a highly contagious and acute viral disease caused by the infectious bursal disease virus (IBDV); it affects all major poultry producing areas of the world. The current study was designed to rigorously measure the global phylogeographic dynamics of IBDV strains to gain insight into viral population expansion as well as the emergence, spread and pattern of the geographical structure of very virulent IBDV (vvIBDV) strains. Methodology/Principal Findings Sequences of the hyper-variable region of the VP2 (HVR-VP2) gene from IBDV strains isolated from diverse geographic locations were obtained from the GenBank database; Cuban sequences were obtained in the current work. All sequences were analysed by Bayesian phylogeographic analysis, implemented in the Bayesian Evolutionary Analysis Sampling Trees (BEAST), Bayesian Tip-association Significance testing (BaTS) and Spatial Phylogenetic Reconstruction of Evolutionary Dynamics (SPREAD) software packages. Selection pressure on the HVR-VP2 was also assessed. The phylogeographic association-trait analysis showed that viruses sampled from individual countries tend to cluster together, suggesting a geographic pattern for IBDV strains. Spatial analysis from this study revealed that strains carrying sequences that were linked to increased virulence of IBDV appeared in Iran in 1981 and spread to Western Europe (Belgium) in 1987, Africa (Egypt) around 1990, East Asia (China and Japan) in 1993, the Caribbean Region (Cuba) by 1995 and South America (Brazil) around 2000. Selection pressure analysis showed that several codons in the HVR-VP2 region were under purifying selection. Conclusions/Significance To our knowledge, this work is the first study applying the Bayesian phylogeographic reconstruction approach to analyse the emergence and spread of vvIBDV strains worldwide. PMID:23805195
Phylogenetic analysis reveals a scattered distribution of autumn colours
Archetti, Marco
2009-01-01
Background and Aims Leaf colour in autumn is rarely considered informative for taxonomy, but there is now growing interest in the evolution of autumn colours and different hypotheses are debated. Research efforts are hindered by the lack of basic information: the phylogenetic distribution of autumn colours. It is not known when and how autumn colours evolved. Methods Data are reported on the autumn colours of 2368 tree species belonging to 400 genera of the temperate regions of the world, and an analysis is made of their phylogenetic relationships in order to reconstruct the evolutionary origin of red and yellow in autumn leaves. Key Results Red autumn colours are present in at least 290 species (70 genera), and evolved independently at least 25 times. Yellow is present independently from red in at least 378 species (97 genera) and evolved at least 28 times. Conclusions The phylogenetic reconstruction suggests that autumn colours have been acquired and lost many times during evolution. This scattered distribution could be explained by hypotheses involving some kind of coevolutionary interaction or by hypotheses that rely on the need for photoprotection. PMID:19126636
Phylogenetic analysis of uroporphyrinogen III synthase (UROS) gene.
Shaik, Abjal Pasha; Alsaeed, Abbas H; Sultana, Asma
2012-01-01
The uroporphyrinogen III synthase (UROS) enzyme (also known as hydroxymethylbilane hydro-lyase) catalyzes the cyclization of hydroxymethylbilane to uroporphyrinogen III during heme biosynthesis. A deficiency of this enzyme is associated with the very rare Gunther's disease or congenital erythropoietic porphyria, an autosomal recessive inborn error of metabolism. The current study investigated the possible role of UROS (Homo sapiens [EC: 4.2.1.75; 265 aa; 1371 bp mRNA; Entrez Pubmed ref NP_000366.1, NM_000375.2]) in evolution by studying the phylogenetic relationship and divergence of this gene using computational methods. The UROS protein sequences from various taxa were retrieved from the GenBank database and were compared using Clustal-W (multiple sequence alignment) with defaults, and a first-pass phylogenetic tree was built using the neighbor-joining method as in DELTA BLAST 2.2.27+ version. A total of 163 BLAST hits were found for the uroporphyrinogen III synthase query sequence and these hits showed a putative conserved domain, HemD superfamily (as of 14th Nov 2012). We then narrowed down the search by manually deleting the proteins that were not UROS sequences and the sequences belonging to phyla other than Chordata. A repeat phylogenetic analysis of 39 taxa was performed using PhyML and TreeDyn software to confirm that UROS is a highly conserved protein with approximately 85% conserved sequences in almost all chordate taxa, emphasizing its importance in heme synthesis.
Bayesian analysis of physiologically based toxicokinetic and toxicodynamic models.
Hack, C Eric
2006-04-17
Physiologically based toxicokinetic (PBTK) and toxicodynamic (TD) models of bromate in animals and humans would improve our ability to accurately estimate the toxic doses in humans based on available animal studies. These mathematical models are often highly parameterized and must be calibrated in order for the model predictions of internal dose to adequately fit the experimentally measured doses. Highly parameterized models are difficult to calibrate and it is difficult to obtain accurate estimates of uncertainty or variability in model parameters with commonly used frequentist calibration methods, such as maximum likelihood estimation (MLE) or least squared error approaches. The Bayesian approach called Markov chain Monte Carlo (MCMC) analysis can be used to successfully calibrate these complex models. Prior knowledge about the biological system and associated model parameters is easily incorporated in this approach in the form of prior parameter distributions, and the distributions are refined or updated using experimental data to generate posterior distributions of parameter estimates. The goal of this paper is to give the non-mathematician a brief description of the Bayesian approach and Markov chain Monte Carlo analysis, how this technique is used in risk assessment, and the issues associated with this approach.
Node Augmentation Technique in Bayesian Network Evidence Analysis and Marshaling
Keselman, Dmitry; Tompkins, George H; Leishman, Deborah A
2010-01-01
Given a Bayesian network, sensitivity analysis is an important activity. This paper begins by describing a network augmentation technique which can simplify the analysis. Next, we present two techniques which allow the user to determine the probability distribution of a hypothesis node under conditions of uncertain evidence; i.e. the state of an evidence node or nodes is described by a user specified probability distribution. Finally, we conclude with a discussion of three criteria for ranking evidence nodes based on their influence on a hypothesis node. All of these techniques have been used in conjunction with a commercial software package. A Bayesian network based on a directed acyclic graph (DAG) G is a graphical representation of a system of random variables that satisfies the following Markov property: any node (random variable) is independent of its non-descendants given the state of all its parents (Neapolitan, 2004). For simplicity's sake, we consider only discrete variables with a finite number of states, though most of the conclusions may be generalized.
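To make "uncertain evidence" concrete, the sketch below treats a two-node discrete network (hypothesis node as parent of one evidence node) and mixes the hard-evidence posteriors using a user-specified distribution over the evidence states, which is Jeffrey's rule. This is a generic textbook approach chosen for illustration; it is not claimed to be either of the two techniques the paper presents, and the node names and probabilities are hypothetical.

```python
def posterior_given_evidence(prior_h, likelihood, e):
    """P(H | E = e) by Bayes' rule for a network H -> E with discrete states."""
    joint = {h: prior_h[h] * likelihood[h][e] for h in prior_h}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}


def jeffrey_update(prior_h, likelihood, evidence_dist):
    """Weight each hard-evidence posterior by the stated probability of that state."""
    result = {h: 0.0 for h in prior_h}
    for e, q in evidence_dist.items():
        post = posterior_given_evidence(prior_h, likelihood, e)
        for h in result:
            result[h] += q * post[h]
    return result


# Hypothetical numbers for illustration only.
prior_h = {"fault": 0.2, "ok": 0.8}
likelihood = {
    "fault": {"alarm": 0.9, "quiet": 0.1},  # P(E | H = fault)
    "ok": {"alarm": 0.1, "quiet": 0.9},     # P(E | H = ok)
}

hard = posterior_given_evidence(prior_h, likelihood, "alarm")
soft = jeffrey_update(prior_h, likelihood, {"alarm": 0.7, "quiet": 0.3})
print(round(hard["fault"], 4), round(soft["fault"], 4))
```

When the evidence distribution puts all its mass on one state, the result reduces to the ordinary hard-evidence posterior.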
Phylogenetic analysis of a transfusion-transmitted hepatitis A outbreak.
Hettmann, Andrea; Juhász, Gabriella; Dencs, Ágnes; Tresó, Bálint; Rusvai, Erzsébet; Barabás, Éva; Takács, Mária
2017-02-01
A transfusion-associated hepatitis A outbreak was found for the first time in Hungary. The outbreak involved five cases. Parenteral transmission of hepatitis A is rare, but may occur during viraemia. Direct sequencing of nested PCR products was performed, and all the examined samples were identical in the VP1/2A region of the hepatitis A virus genome. HAV sequences found in recent years were compared, and phylogenetic analysis showed that the strain which caused these cases is the same as the one that had spread in Hungary recently, causing several hepatitis A outbreaks throughout the country.
A Bayesian Framework for Reliability Analysis of Spacecraft Deployments
NASA Technical Reports Server (NTRS)
Evans, John W.; Gallo, Luis; Kaminsky, Mark
2012-01-01
Deployable subsystems are essential to mission success of most spacecraft. These subsystems enable critical functions including power, communications and thermal control. The loss of any of these functions will generally result in loss of the mission. These subsystems and their components often consist of unique designs and applications for which various standardized data sources are not applicable for estimating reliability and for assessing risks. In this study, a two stage sequential Bayesian framework for reliability estimation of spacecraft deployment was developed for this purpose. This process was then applied to the James Webb Space Telescope (JWST) Sunshield subsystem, a unique design intended for thermal control of the Optical Telescope Element. Initially, detailed studies of NASA deployment history, "heritage information", were conducted, extending over 45 years of spacecraft launches. This information was then coupled to a non-informative prior and a binomial likelihood function to create a posterior distribution for deployments of various subsystems using Markov chain Monte Carlo sampling. Select distributions were then coupled to a subsequent analysis, using test data and anomaly occurrences on successive ground test deployments of scale model test articles of JWST hardware, to update the NASA heritage data. This allowed for a realistic prediction for the reliability of the complex Sunshield deployment, with credibility limits, within this two stage Bayesian framework.
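The two-stage structure described above (a non-informative prior updated with heritage data, then updated again with ground-test data) can be illustrated with a conjugate Beta-binomial update. Note the substitution: the study reports Markov chain Monte Carlo sampling, whereas this sketch uses the closed-form conjugate result for simplicity. The uniform Beta(1, 1) prior, the function names, and all success and failure counts are hypothetical and are not the study's data.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate update of a Beta(alpha, beta) prior with binomial outcomes."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Posterior mean of the deployment success probability."""
    return alpha / (alpha + beta)


# Stage 1: uniform Beta(1, 1) prior (assumed) plus hypothetical heritage counts.
stage1 = update_beta(1.0, 1.0, successes=48, failures=2)

# Stage 2: the stage-1 posterior becomes the prior for hypothetical test results.
stage2 = update_beta(*stage1, successes=10, failures=0)

print(stage1, round(beta_mean(*stage1), 4))  # (49.0, 3.0) 0.9423
print(stage2, round(beta_mean(*stage2), 4))  # (59.0, 3.0) 0.9516
```

Credibility limits would come from quantiles of the final Beta distribution, which this sketch leaves out because the Python standard library has no Beta quantile function.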
A Bayesian subgroup analysis using collections of ANOVA models.
Liu, Jinzhong; Sivaganesan, Siva; Laud, Purushottam W; Müller, Peter
2017-03-20
We develop a Bayesian approach to subgroup analysis using ANOVA models with multiple covariates, extending an earlier work. We assume a two-arm clinical trial with normally distributed response variable. We also assume that the covariates for subgroup finding are categorical and are a priori specified, and parsimonious easy-to-interpret subgroups are preferable. We represent the subgroups of interest by a collection of models and use a model selection approach to finding subgroups with heterogeneous effects. We develop suitable priors for the model space and use an objective Bayesian approach that yields multiplicity adjusted posterior probabilities for the models. We use a structured algorithm based on the posterior probabilities of the models to determine which subgroup effects to report. Frequentist operating characteristics of the approach are evaluated using simulation. While our approach is applicable in more general cases, we mainly focus on the 2 × 2 case of two covariates each at two levels for ease of presentation. The approach is illustrated using a real data example.
Bayesian analysis of U.S. hurricane climate
Elsner, James B.; Bossak, Brian H.
2001-01-01
Predictive climate distributions of U.S. landfalling hurricanes are estimated from observational records over the period 1851–2000. The approach is Bayesian, combining the reliable records of hurricane activity during the twentieth century with the less precise accounts of activity during the nineteenth century to produce a best estimate of the posterior distribution on the annual rates. The methodology provides a predictive distribution of future activity that serves as a climatological benchmark. Results are presented for the entire coast as well as for the Gulf Coast, Florida, and the East Coast. Statistics on the observed annual counts of U.S. hurricanes, both for the entire coast and by region, are similar within each of the three consecutive 50-yr periods beginning in 1851. However, evidence indicates that the records during the nineteenth century are less precise. Bayesian theory provides a rational approach for defining hurricane climate that uses all available information and that makes no assumption about whether the 150-yr record of hurricanes has been adequately or uniformly monitored. The analysis shows that the number of major hurricanes expected to reach the U.S. coast over the next 30 yr is 18 and the number of hurricanes expected to hit Florida is 20.
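One way to picture how a less precise early record can still inform the annual rate is a conjugate gamma-Poisson update. This is a simplified sketch and not the authors' method: the discount factor, the counts, and the idea of down-weighting the early record are all assumptions made for illustration.

```python
def gamma_poisson_update(shape: float, rate: float, count: int, years: int) -> tuple[float, float]:
    """Update a Gamma(shape, rate) prior on an annual Poisson rate with `count` events in `years` years."""
    return shape + count, rate + years


# Hypothetical early record: 20 events in 50 years, trusted at half weight.
discount = 0.5
prior_shape, prior_rate = discount * 20, discount * 50

# Hypothetical later, more reliable record: 45 events in 100 years.
post_shape, post_rate = gamma_poisson_update(prior_shape, prior_rate, count=45, years=100)

annual_rate = post_shape / post_rate   # posterior mean of the annual rate
expected_30yr = 30 * annual_rate       # expected count over the next 30 years
print(f"posterior mean rate: {annual_rate:.2f} per year; expected in 30 yr: {expected_30yr:.1f}")
```

Under this model the full predictive distribution of the 30-year count is negative binomial; the sketch reports only its mean.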
Phylogenetic Analysis of Human Immunodeficiency Virus Type 2 Group B
Cella, Eleonora; Lo Presti, Alessandra; Giovanetti, Marta; Veo, Carla; Lai, Alessia; Dicuonzo, Giordano; Angeletti, Silvia; Ciotti, Marco; Zehender, Gianguglielmo; Ciccozzi, Massimo
2016-01-01
Context: Human immunodeficiency virus type 2 (HIV-2) infections are mainly restricted to West Africa; however, in recent years, the prevalence of HIV-2 has become a growing concern in some European countries and the Southwestern region of India. Despite the presence of different HIV-2 groups, only Groups A and B have established human-to-human transmission chains. Aims: This work aimed to evaluate the phylogeographic inference of HIV-2 Group B worldwide to estimate its date of origin and population dynamics. Materials and Methods: The evolutionary rates, the demographic history of the HIV-2 Group B dataset, and the phylogeographic analysis were estimated using a Bayesian approach. The viral gene flow analysis was used to count viral gene out/in flow among different locations. Results: The root of the Bayesian maximum clade credibility tree of HIV-2 Group B dated back to 1957. The demographic history of HIV-2 Group B showed that the epidemic remained constant up to 1970, when exponential growth began. From 1985 to the early 2000s, the epidemic reached a plateau, and then it was characterized by two bottlenecks and a new plateau at the end of the 2000s. Phylogeographic reconstruction showed that the most probable location for the root of the tree was Ghana. Regarding the viral gene flow of HIV-2 Group B, the only observed viral gene flow was from Africa to France, Belgium, and Luxembourg. Conclusions: The study gives insights into the origin, history, and phylogeography of the HIV-2 Group B epidemic. The growing number of HIV-2 infections worldwide indicates the need for strengthening surveillance. PMID:27621561
Developing and Testing a Bayesian Analysis of Fluorescence Lifetime Measurements
Needleman, Daniel J.
2017-01-01
FRET measurements can provide dynamic spatial information on length scales smaller than the diffraction limit of light. Several methods exist to measure FRET between fluorophores, including Fluorescence Lifetime Imaging Microscopy (FLIM), which relies on the reduction of fluorescence lifetime when a fluorophore is undergoing FRET. FLIM measurements take the form of histograms of photon arrival times, containing contributions from a mixed population of fluorophores both undergoing and not undergoing FRET, with the measured distribution being a mixture of exponentials of different lifetimes. Here, we present an analysis method based on Bayesian inference that rigorously takes into account several experimental complications. We test the precision and accuracy of our analysis on controlled experimental data and verify that we can faithfully extract model parameters, both in the low-photon and low-fraction regimes. PMID:28060890
Risk analysis of dust explosion scenarios using Bayesian networks.
Yuan, Zhi; Khakzad, Nima; Khan, Faisal; Amyotte, Paul
2015-02-01
In this study, a methodology is proposed for risk analysis of dust explosion scenarios based on a Bayesian network. Our methodology also benefits from a bow-tie diagram to better represent the logical relationships existing among contributing factors and consequences of dust explosions. The risks of dust explosion scenarios are evaluated, taking into account common cause failures and dependencies among root events and possible consequences. Using a diagnostic analysis, dust particle properties, oxygen concentration, and safety training of staff are identified as the most critical root events leading to dust explosions. The probability adaptation concept is also used for sequential updating and thus learning from past dust explosion accidents, which is of great importance in dynamic risk assessment and management. We also apply the proposed methodology to a case study to model dust explosion scenarios, to estimate the envisaged risks, and to identify the vulnerable parts of the system that need additional safety measures.
BaTMAn: Bayesian Technique for Multi-image Analysis
NASA Astrophysics Data System (ADS)
Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.
2016-12-01
Bayesian Technique for Multi-image Analysis (BaTMAn) characterizes any astronomical dataset containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e. identical signal within the errors). The output segmentations successfully adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. BaTMAn identifies (and keeps) all the statistically-significant information contained in the input multi-image (e.g. an IFS datacube). The main aim of the algorithm is to characterize spatially-resolved data prior to their analysis.
A Bayesian Hierarchical Approach to Regional Frequency Analysis of Extremes
NASA Astrophysics Data System (ADS)
Renard, B.
2010-12-01
Rainfall and runoff frequency analysis is a major issue for the hydrological community. The distribution of hydrological extremes varies in space and possibly in time. Describing and understanding this spatiotemporal variability are primary challenges to improve hazard quantification and risk assessment. This presentation proposes a general approach based on a Bayesian hierarchical model, following previous work by Cooley et al. [2007], Micevski [2007], Aryal et al. [2009] or Lima and Lall [2009; 2010]. Such a hierarchical model is made up of two levels: (1) a data level modeling the distribution of observations, and (2) a process level describing the fluctuation of the distribution parameters in space and possibly in time. At the first level of the model, at-site data (e.g., annual maxima series) are modeled with a chosen distribution (e.g., a GEV distribution). Since data from several sites are considered, the joint distribution of a vector of (spatial) observations needs to be derived. This is challenging because data are in general not spatially independent, especially for nearby sites. An elliptical copula is therefore used to formally account for spatial dependence between at-site data. This choice might be questionable in the context of extreme value distributions. However, it is motivated by its applicability in spatial highly dimensional problems, where the joint pdf of a vector of n observations is required to derive the likelihood function (with n possibly amounting to hundreds of sites). At the second level of the model, parameters of the chosen at-site distribution are then modeled by a Gaussian spatial process, whose mean may depend on covariates (e.g. elevation, distance to sea, weather pattern, time). In particular, this spatial process allows estimating parameters at ungauged sites, and deriving the predictive distribution of rainfall/runoff at every pixel/catchment of the studied domain. An application to extreme rainfall series from the French
Molecular detection and phylogenetic analysis of bovine astrovirus in Brazil.
Candido, Marcelo; Alencar, Anna Luiza Farias; Almeida-Queiroz, Sabrina R; Buzinaro, Maria da Glória; Munin, Flavia Simone; de Godoy, Silvia Helena Seraphin; Livonesi, Marcia Cristina; Fernandes, Andrezza Maria; de Sousa, Ricardo Luiz Moro
2015-06-01
Bovine astrovirus (BoAstV) is associated with gastroenterical disorders such as diarrhea, particularly in neonates and immunocompromised animals. Its prevalence is >60 % in the first five weeks of the animal's life. The aim of this study was to detect and perform a phylogenetic analysis of BoAstV in Brazilian cattle. A prevalence of 14.3 % of BoAstV in fecal samples from 272 head of cattle from different Brazilian states was detected, and 11 samples were analyzed by nucleotide sequencing. The majority of positive samples were obtained from diarrheic animals (p < 0.01). Phylogenetic analysis revealed that Brazilian samples were grouped in clades along with other BoAstV isolates. There was 74.3 %-96.5 % amino acid sequence similarity between the samples in this study and >74.8 % when compared with reference samples for enteric BoAstV. Our results indicate, for the first time, the occurrence of BoAstV circulation in cattle from different regions of Brazil, prevalently in diarrheic calves.
Phylogenetic analysis of diprotodontian marsupials based on complete mitochondrial genomes.
Munemasa, Maruo; Nikaido, Masato; Donnellan, Stephen; Austin, Christopher C; Okada, Norihiro; Hasegawa, Masami
2006-06-01
Australidelphia is the cohort, originally named by Szalay, of all Australian marsupials and the South American Dromiciops. Many mitochondrial and nuclear genome studies support the monophyly of Australidelphia, but some familial relationships in Australidelphia are still unclear. In particular, the familial relationships within the order Diprotodontia (koala, wombat, kangaroos and possums) are ambiguous. These Diprotodontian families are largely grouped into two suborders, Vombatiformes, which contains Phascolarctidae (koala) and Vombatidae (wombat), and Phalangerida, which contains Macropodidae, Potoroidae, Phalangeridae, Petauridae, Pseudocheiridae, Acrobatidae, Tarsipedidae and Burramyidae. Morphological evidence and some molecular analyses strongly support monophyly of the two families in Vombatiformes. The monophyly of Phalangerida as well as the phylogenetic relationships of families in Phalangerida remain uncertain, however, despite searches for morphological synapomorphy and mitochondrial DNA sequence analyses. Moreover, phylogenetic relationships among possum families (Phalangeridae, Petauridae, Pseudocheiridae, Acrobatidae, Tarsipedidae and Burramyidae) as well as a sister group of Macropodoidea (Macropodidae and Potoroidae) remain unclear. To evaluate familial relationships among Dromiciops and Australian marsupials as well as the familial relationships in Diprotodontia, we determined the complete mitochondrial sequence of six Diprotodontian species. We used Maximum Likelihood analyses with concatenated amino acid and codon sequences of 12 mitochondrial protein genes. Our analysis of mitochondrial amino acid sequences supports monophyly of Australian marsupials+Dromiciops and monophyly of Phalangerida. The close relatedness between Macropodidae and Phalangeridae is also weakly supported by our analysis.
BASE-9: Bayesian Analysis for Stellar Evolution with nine variables
NASA Astrophysics Data System (ADS)
Robinson, Elliot; von Hippel, Ted; Stein, Nathan; Stenning, David; Wagner-Kaiser, Rachel; Si, Shijing; van Dyk, David
2016-08-01
The BASE-9 (Bayesian Analysis for Stellar Evolution with nine variables) software suite recovers star cluster and stellar parameters from photometry and is useful for analyzing single-age, single-metallicity star clusters, binaries, or single stars, and for simulating such systems. BASE-9 uses a Markov chain Monte Carlo (MCMC) technique along with brute force numerical integration to estimate the posterior probability distribution for the age, metallicity, helium abundance, distance modulus, line-of-sight absorption, and parameters of the initial-final mass relation (IFMR) for a cluster, and for the primary mass, secondary mass (if a binary), and cluster probability for every potential cluster member. The MCMC technique is used for the cluster quantities (the first six items listed above) and numerical integration is used for the stellar quantities (the last three items in the above list).
Objective Bayesian Comparison of Constrained Analysis of Variance Models.
Consonni, Guido; Paroli, Roberta
2016-10-04
In the social sciences we are often interested in comparing models specified by parametric equality or inequality constraints. For instance, when examining three group means [Formula: see text] through an analysis of variance (ANOVA), a model may specify that [Formula: see text], while another one may state that [Formula: see text], and finally a third model may instead suggest that all means are unrestricted. This is a challenging problem, because it involves a combination of nonnested models, as well as nested models having the same dimension. We adopt an objective Bayesian approach, requiring no prior specification from the user, and derive the posterior probability of each model under consideration. Our method is based on the intrinsic prior methodology, suitably modified to accommodate equality and inequality constraints. Focussing on normal ANOVA models, a comparative assessment is carried out through simulation studies. We also present an application to real data collected in a psychological experiment.
Bayesian Analysis of Peak Ground Acceleration Attenuation Relationship
Mu Heqing; Yuen Kaveng
2010-05-21
Estimation of peak ground acceleration is one of the main issues in civil and earthquake engineering practice. The Boore-Joyner-Fumal empirical formula is well known for this purpose. In this paper we propose to use the Bayesian probabilistic model class selection approach to obtain the most suitable prediction model class for the seismic attenuation formula. The optimal model class is robust in the sense that it has balance between the data fitting capability and the sensitivity to noise. A database of strong-motion records is utilized for the analysis. It turns out that the optimal model class is simpler than the full order attenuation model suggested by Boore, Joyner and Fumal (1993).
Bayesian analysis of factors associated with fibromyalgia syndrome subjects
NASA Astrophysics Data System (ADS)
Jayawardana, Veroni; Mondal, Sumona; Russek, Leslie
2015-01-01
Factors contributing to movement-related fear were assessed by Russek et al. (2014) for subjects with fibromyalgia (FM), based on data collected through a national internet survey of community-based individuals. The study focused on the following variables: the Activities-Specific Balance Confidence scale (ABC), the Primary Care Post-Traumatic Stress Disorder screen (PC-PTSD), the Tampa Scale of Kinesiophobia (TSK), a Joint Hypermobility Syndrome screen (JHS), the Vertigo Symptom Scale (VSS-SF), Obsessive-Compulsive Personality Disorder (OCPD), and pain, work status and physical activity derived from the "Revised Fibromyalgia Impact Questionnaire" (FIQR). The study presented in this paper revisits the same data with a Bayesian analysis in which appropriate priors were introduced for the variables selected in the paper by Russek et al.
Bayesian Library for the Analysis of Neutron Diffraction Data
NASA Astrophysics Data System (ADS)
Ratcliff, William; Lesniewski, Joseph; Quintana, Dylan
During this talk, I will introduce the Bayesian Library for the Analysis of Neutron Diffraction Data. In this library we use the DREAM algorithm to effectively sample parameter space. This offers several advantages over traditional least squares fitting approaches. It gives us more robust estimates of the fitting parameters, their errors, and their correlations. It is also more stable than least squares methods and provides more confidence in finding a global minimum. I will discuss the algorithm and its application to several materials. I will show applications to both structural and magnetic diffraction patterns. I will present examples of fitting both powder and single crystal data. We would like to acknowledge support from the Department of Commerce and the NSF.
Discrete Dynamic Bayesian Network Analysis of fMRI Data
Burge, John; Lane, Terran; Link, Hamilton; Qiu, Shibin; Clark, Vincent P.
2010-01-01
We examine the efficacy of using discrete Dynamic Bayesian Networks (dDBNs), a data-driven modeling technique employed in machine learning, to identify functional correlations among neuroanatomical regions of interest. Unlike many neuroimaging analysis techniques, this method is not limited by linear and/or Gaussian noise assumptions. It achieves this by modeling the time series of neuroanatomical regions as discrete, as opposed to continuous, random variables with multinomial distributions. We demonstrate this method using an fMRI dataset collected from healthy and demented elderly subjects and identify correlates based on a diagnosis of dementia. The results are validated in three ways. First, the elicited correlates are shown to be robust over leave-one-out cross-validation and, via a Fourier bootstrapping method, unlikely to be due to random chance. Second, the dDBNs identified correlates that would be expected given the experimental paradigm. Third, the dDBN's ability to predict dementia is competitive with two commonly employed machine-learning classifiers: the support vector machine and the Gaussian naïve Bayesian network. We also verify that the dDBN selects correlates based on non-linear criteria. Finally, we provide a brief analysis of the correlates elicited from Buckner et al.'s data that suggests that demented elderly subjects have reduced involvement of entorhinal and occipital cortex and greater involvement of the parietal lobe and amygdala in brain activity compared with healthy elderly (as measured via functional correlations among BOLD measurements). Limitations and extensions to the dDBN method are discussed. PMID:17990301
A Bayesian Seismic Hazard Analysis for the city of Naples
NASA Astrophysics Data System (ADS)
Faenza, Licia; Pierdominici, Simona; Hainzl, Sebastian; Cinti, Francesca R.; Sandri, Laura; Selva, Jacopo; Tonini, Roberto; Perfetti, Paolo
2016-04-01
In recent years many studies have focused on determining and defining the seismic, volcanic and tsunamigenic hazard in the city of Naples. The reason is that the town of Naples with its neighboring area is one of the most densely populated places in Italy. In addition, the risk is also increased by the type and condition of buildings and monuments in the city. It is therefore crucial to assess which active faults in Naples and the surrounding area could trigger an earthquake able to shake and damage the urban area. We collect data from the most reliable and complete databases of macroseismic intensity records (from 79 AD to present). Each seismic event has been associated with an active tectonic structure. Furthermore, a set of active faults located around the study area, well known from geological investigations and able to shake the city but not associated with any earthquake, has been taken into account in our study. This geological framework is the starting point for our Bayesian seismic hazard analysis for the city of Naples. We show the feasibility of formulating the hazard assessment procedure to include the information of past earthquakes into the probabilistic seismic hazard analysis. This strategy allows, on the one hand, enlarging the information used in the evaluation of the hazard, from alternative models for the earthquake generation process to past shaking, and, on the other hand, explicitly accounting for all kinds of information and their uncertainties. The Bayesian scheme we propose is applied to evaluate the seismic hazard of Naples. We implement five different spatio-temporal models to parameterize the occurrence of earthquakes potentially dangerous for Naples. Subsequently we combine these hazard curves with ShakeMap of past earthquakes that have been felt in Naples. The results are posterior hazard assessments for three exposure times (50, 10 and 5 years), in a dense grid that covers the municipality of Naples, considering bedrock soil
ERIC Educational Resources Information Center
Chung, Gregory K. W. K.; Dionne, Gary B.; Kaiser, William J.
2006-01-01
Our research question was whether we could develop a feasible technique, using Bayesian networks, to diagnose gaps in student knowledge. Thirty-four college-age participants completed tasks designed to measure conceptual knowledge, procedural knowledge, and problem-solving skills related to circuit analysis. A Bayesian network was used to model…
Li, Xiaoxu; Liu, Cheng; Li, Wei; Zhang, Zenglin; Gao, Xiaoming; Zhou, Hui; Guo, Yongfeng
2016-05-01
Members of the plant-specific WOX transcription factor family have been reported to play important roles in cell to cell communication as well as other physiological and developmental processes. In this study, ten members of the WOX transcription factor family were identified in Solanum lycopersicum with HMMER. Neighbor-joining phylogenetic tree, maximum-likelihood tree and Bayesian-inference tree were constructed and similar topologies were shown using the protein sequences of the homeodomain. Phylogenetic study revealed that the 25 WOX family members from Arabidopsis and tomato fall into three clades and nine subfamilies. The patterns of exon-intron structures and organization of conserved domains in Arabidopsis and tomato were consistent based on the phylogenetic results. Transcriptome analysis showed that the expression patterns of SlWOXs were different in different tissue types. Gene Ontology (GO) analysis suggested that, as transcription factors, the SlWOX family members could be involved in a number of biological processes including cell to cell communication and tissue development. Our results are useful for future studies on WOX family members in tomato and other plant species.
Chao, Q J; Li, Y D; Geng, X X; Zhang, L; Dai, X; Zhang, X; Li, J; Zhang, H J
2014-04-14
This is the first report of a complete mitochondrial genome sequence from Himalayan marmot (Marmota himalayana, class Marmota). We determined the M. himalayana mitochondrial (mt) genome sequence by using long-PCR methods and a primer-walking sequencing strategy with genus-specific primers. The complete mt genome of M. himalayana was 16,443 bp in length and comprised 13 protein-coding genes, 2 ribosomal RNA (rRNA) genes, 22 transfer RNA (tRNA) genes, and a typical control region (CR). Gene order and orientation were identical to those in mt genomes of most vertebrates. The heavy strand showed an overall A+T content of 63.49%. AT and GC skews for the mt genome of the M. himalayana were 0.012 and -0.300, respectively, indicating a nucleotide bias against T and G. The control region was 997 bp in size and displayed some unusual features, including absence of repeated motifs and two conserved sequence blocks (CSB2 and CSB3), which is consistent with observations from two other rodent species, Sciurus vulgaris and Myoxus glis. Phylogenetic analysis of complete mt DNA sequences without the control region including 30 taxa of Rodentia was performed with Maximum-Likelihood (ML) and Bayesian Inference (BI) methods and provided strong support for Sciurognathi polyphyly and Hystricognathi monophyly. This analysis also provided evidence that M. himalayana mt DNA was closely related to that from Sciurus vulgaris (Sciuridae) and was similar to mt DNA from Myoxus glis.
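The skew values quoted in this abstract can be read more easily with the formulas in view. The abstract does not state them, so the sketch below assumes the commonly used definitions, AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C). The function name and the toy sequence are illustrative only.

```python
from collections import Counter


def skews(seq: str) -> tuple[float, float]:
    """Return (AT skew, GC skew) for a DNA sequence, using the assumed standard definitions."""
    n = Counter(seq.upper())
    at_skew = (n["A"] - n["T"]) / (n["A"] + n["T"])
    gc_skew = (n["G"] - n["C"]) / (n["G"] + n["C"])
    return at_skew, gc_skew


# Toy sequence for illustration only, not M. himalayana data.
at, gc = skews("AAATTGCCC")
print(f"AT skew = {at:.3f}, GC skew = {gc:.3f}")
```

Under these definitions a positive AT skew means more A than T and a negative GC skew means more C than G, which matches the reported bias against T and G.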
Priya, R; Siva, Ramamoorthy
2015-07-01
Under different environmental stress conditions, plant growth is regulated by the hormone abscisic acid (an apocarotenoid). In the biosynthesis of abscisic acid, the oxidative cleavage of cis-epoxycarotenoid catalyzed by 9-cis-epoxycarotenoid dioxygenase (NCED) is the crucial step. NCED genes have been isolated from numerous plant species, and those genes were phylogenetically investigated to understand the evolution of NCED genes in various plant lineages comprising lycophytes, gymnosperms, dicots and monocots. A total of 93 genes were obtained from 48 plant species to statistically estimate their sequence conservation and functional divergence. The NCED genes of Selaginella moellendorffii appeared to be evolutionarily distinct from those of the angiosperms, indicating a substantial influence of natural selection pressure on NCED genes. Further, using exon-intron structure analysis, the gene structures of NCED were found to be conserved across some species. In addition, the ratio of non-synonymous (Ka) to synonymous (Ks) substitution rates, estimated using a Bayesian inference approach, identified the critical amino acid residues for functional divergence. A significant functional divergence was found between some subgroups through the coefficient of type-I functional divergence. Our results suggest that the evolution of NCED genes occurred by duplication, diversification and exon-intron loss events. The site-specific profile and functional divergence analysis revealed that NCED genes might facilitate the tissue-specific functional divergence in NCED sub-families, which could combat different environmental stress conditions, aiding plant survival.
Fresia, Pablo; Azeredo-Espin, Ana Maria L; Lyra, Mariana L
2013-01-01
Insect pest phylogeography might be shaped both by biogeographic events and by human influence. Here, we conducted an approximate Bayesian computation (ABC) analysis to investigate the phylogeography of the New World screwworm fly, Cochliomyia hominivorax, with the aim of understanding its population history and its order and time of divergence. Our ABC analysis supports that populations spread from North to South in the Americas, in at least two different moments. The first split occurred between the North/Central American and South American populations in the end of the Last Glacial Maximum (15,300-19,000 YBP). The second split occurred between the North and South Amazonian populations in the transition between the Pleistocene and the Holocene eras (9,100-11,000 YBP). The species also experienced population expansion. Phylogenetic analysis likewise suggests this north to south colonization and Maxent models suggest an increase in the number of suitable areas in South America from the past to present. We found that the phylogeographic patterns observed in C. hominivorax cannot be explained only by climatic oscillations and can be connected to host population histories. Interestingly we found these patterns are very coincident with general patterns of ancient human movements in the Americas, suggesting that humans might have played a crucial role in shaping the distribution and population structure of this insect pest. This work presents the first hypothesis test regarding the processes that shaped the current phylogeographic structure of C. hominivorax and represents an alternate perspective on investigating the problem of insect pests. PMID:24098436
Phylogenetic analysis and characterization of Korean bovine viral diarrhea viruses.
Oem, Jae-Ku; Hyun, Bang-Hun; Cha, Sang-Ho; Lee, Kyoung-Ki; Kim, Seong-Hee; Kim, Hye-Ryoung; Park, Choi-Kyu; Joo, Yi-Seok
2009-11-18
Thirty-six bovine viral diarrhea viruses (BVDVs) were identified in bovine feces (n=16), brains (n=2), and aborted fetuses (n=18) in Korea. To reveal the genetic diversity and characteristics of these Korean strains, the sequences of their 5'-untranslated regions (5'-UTRs) were determined and then compared with published reference sequences. Neighbor-joining phylogenetic analysis revealed that most of the Korean viruses were of the BVDV subtypes 1a (n=17) or 2a (n=17). The remaining strains were of subtypes 1b (n=1) and 1n (n=1). This analysis indicates that the 1a and 2a BVDV subtypes are predominant and widespread in Korea. In addition, the prevalence of BVDV-2 was markedly higher in aborted fetuses than in other samples and was more often associated with reproductive problems and significant mortality in cattle.
Phylogenetic analysis of cichlid fishes using nuclear DNA markers.
Sültmann, H; Mayer, W E; Figueroa, F; Tichy, H; Klein, J
1995-11-01
The recent explosive adaptive radiation of cichlids in the great lakes of Africa has attracted the attention of both morphologists and molecular biologists. To decipher the phylogenetic relationships among the various taxa within the family Cichlidae is a prerequisite for answering some fundamental questions about the nature of the speciation process. In the present study, we used the random amplification of polymorphic DNA (RAPD) technique to obtain sequence differences between selected cichlid species. We then designed specific primers based on these sequences and used them to amplify template DNA from a large number of species by the polymerase chain reaction (PCR). We sequenced the amplified products and searched the sequences for indels and shared substitutions. We identified a number of such characters at three loci--DXTU1, DXTU2, and DXTU3--and used them for phylogenetic and cladistic analysis of the relationships among the various cichlid groups. Our studies assign an outgroup position to Neotropical cichlids in relation to African cichlids, provide evidence for a sister-group relationship of tilapiines to the haplochromines, group Cyphotilapia frontosa with the lamprologines of Lake Tanganyika, place Astatoreochromis alluaudi to an outgroup position with respect to other haplochromines of Lakes Victoria and Malawi, and provide additional support for the monophyly of the remaining Lake Victoria haplochromines and the Lake Malawi haplochromines. The described approach holds great promise for further resolution of cichlid phylogeny.
A phylogenetic transform enhances analysis of compositional microbiota data
Silverman, Justin D; Washburne, Alex D; Mukherjee, Sayan; David, Lawrence A
2017-01-01
Surveys of microbial communities (microbiota), typically measured as relative abundance of species, have illustrated the importance of these communities in human health and disease. Yet, statistical artifacts commonly plague the analysis of relative abundance data. Here, we introduce the PhILR transform, which incorporates microbial evolutionary models with the isometric log-ratio transform to allow off-the-shelf statistical tools to be safely applied to microbiota surveys. We demonstrate that analyses of community-level structure can be applied to PhILR transformed data with performance on benchmarks rivaling or surpassing standard tools. Additionally, by decomposing distance in the PhILR transformed space, we identified neighboring clades that may have adapted to distinct human body sites. Decomposing variance revealed that covariation of bacterial clades within human body sites increases with phylogenetic relatedness. Together, these findings illustrate how the PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges and enable evolutionary insights relevant to microbial communities. DOI: http://dx.doi.org/10.7554/eLife.21887.001 PMID:28198697
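The isometric log-ratio idea behind this abstract can be illustrated with a minimal sketch. It computes the standard unweighted ILR "balance" between the two groups of taxa that descend from one internal node of a tree. The four-taxon community, the tree shape ((A,B),(C,D)), and the function names are assumptions made for illustration; the taxon and branch-length weights that PhILR can apply are omitted, so this is not the package's implementation.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def balance(left, right):
    """Unweighted ILR balance between two groups of abundances.

    left, right: abundances of the taxa below each child of a tree node.
    """
    r, s = len(left), len(right)
    scale = math.sqrt(r * s / (r + s))
    return scale * math.log(geometric_mean(left) / geometric_mean(right))

# Hypothetical relative abundances; the tree ((A,B),(C,D)) is assumed.
abundances = {"A": 0.4, "B": 0.4, "C": 0.1, "D": 0.1}
root_balance = balance(
    [abundances["A"], abundances["B"]],
    [abundances["C"], abundances["D"]],
)
print(root_balance)
```

Because a balance depends only on ratios, rescaling every abundance by the same factor leaves it unchanged, which is the property that makes such coordinates suitable for relative abundance data.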
Phylogenetic analysis of the Argonaute protein family in platyhelminths.
Zheng, Yadong
2013-03-01
Argonaute proteins (AGOs) are mediators of gene silencing via recruitment of small regulatory RNAs to induce translational repression or degradation of targeted molecules. Platyhelminths have been reported to express microRNAs but the diversity of AGOs in the phylum has not been explored. Phylogenetic relationships of members of this protein family were studied using data from six platyhelminth genomes. Phylogenetic analysis showed that all cestode and trematode AGOs, along with some triclad planarian AGOs, were grouped into the Ago subfamily and its novel sister clade, here referred to as Cluster 1. These were very distant from Piwi and Class 3 subfamilies. By contrast, a number of planarian Piwi-like AGOs formed a novel sister clade to the Piwi subfamily. Extensive sequence searching revealed the presence of an additional locus for AGO2 in the cestode Echinococcus granulosus and exon expansion in this species and E. multilocularis. The current study suggests the absence of the Piwi subfamily and Class 3 AGOs in cestodes and trematodes and the Piwi-like AGO expansion in a free-living triclad planarian and the occurrence of exon expansion prior to or during the evolution of the most-recent common ancestor of the Echinococcus species studied.
We use Bayesian uncertainty analysis to explore how to estimate pollutant exposures from biomarker concentrations. The growing number of national databases with exposure data makes such an analysis possible. They contain datasets of pharmacokinetic biomarkers for many polluta...
A Bayesian Analysis of Finite Mixtures in the LISREL Model.
ERIC Educational Resources Information Center
Zhu, Hong-Tu; Lee, Sik-Yum
2001-01-01
Proposes a Bayesian framework for estimating finite mixtures of the LISREL model. The model augments the observed data of the manifest variables with the latent variables and allocation variables and uses the Gibbs sampler to obtain the Bayesian solution. Discusses other associated statistical inferences. (SLD)
Ungvári, Ildikó; Hullám, Gábor; Antal, Péter; Kiszel, Petra Sz; Gézsi, András; Hadadi, Éva; Virág, Viktor; Hajós, Gergely; Millinghoffer, András; Nagy, Adrienne; Kiss, András; Semsei, Ágnes F; Temesi, Gergely; Melegh, Béla; Kisfali, Péter; Széll, Márta; Bikov, András; Gálffy, Gabriella; Tamási, Lilla; Falus, András; Szalai, Csaba
2012-01-01
Genetic studies indicate a high number of potential factors related to asthma. Based on earlier linkage analyses we selected the 11q13 and 14q22 asthma susceptibility regions, for which we designed a partial genome screening study using 145 SNPs in 1201 individuals (436 asthmatic children and 765 controls). The results were evaluated with traditional frequentist methods and we applied a new statistical method, called Bayesian network based Bayesian multilevel analysis of relevance (BN-BMLA). This method uses Bayesian network representation to provide detailed characterization of the relevance of factors, such as joint significance, the type of dependency, and multi-target aspects. We estimated posteriors for these relations within the Bayesian statistical framework, in order to assess whether a variable is directly relevant or its association is only mediated. With frequentist methods one SNP (rs3751464 in the FRMD6 gene) provided evidence for an association with asthma (OR = 1.43 (1.2-1.8); p = 3×10^-4). The possible role of the FRMD6 gene in asthma was also confirmed in an animal model and human asthmatics. In the BN-BMLA analysis altogether 5 SNPs in 4 genes were found relevant in connection with asthma phenotype: PRPF19 on chromosome 11, and FRMD6, PTGER2 and PTGDR on chromosome 14. In a subsequent step a partial dataset containing rhinitis and further clinical parameters was used, which allowed the analysis of relevance of SNPs for asthma and multiple targets. These analyses suggested that SNPs in the AHNAK and MS4A2 genes were indirectly associated with asthma. This paper indicates that BN-BMLA explores the relevant factors more comprehensively than traditional statistical methods and extends the scope of strong relevance based methods to include partial relevance, global characterization of relevance and multi-target relevance.
A Bayesian Analysis of the Cepheid Distance Scale
NASA Astrophysics Data System (ADS)
Barnes, Thomas G., III; Jefferys, W. H.; Berger, J. O.; Mueller, Peter J.; Orr, K.; Rodriguez, R.
2003-07-01
We develop and describe a Bayesian statistical analysis to solve the surface brightness equations for Cepheid distances and stellar properties. Our analysis provides a mathematically rigorous and objective solution to the problem, including immunity from Lutz-Kelker bias. We discuss the choice of priors, show the construction of the likelihood distribution, and give sampling algorithms in a Markov chain Monte Carlo approach for efficiently and completely sampling the posterior probability distribution. Our analysis averages over the probabilities associated with several models rather than attempting to pick the ``best model'' from several possible models. Using a sample of 13 Cepheids we demonstrate the method. We discuss diagnostics of the analysis and the effects of the astrophysical choices going into the model. We show that we can objectively model the order of Fourier polynomial fits to the light and velocity data. By comparison with theoretical models of Bono et al. we find that EU Tau and SZ Tau are overtone pulsators, most likely without convective overshoot. The period-radius and period-luminosity relations we obtain are shown to be compatible with those in the recent literature. Specifically, we find $\log\langle R\rangle = (0.693 \pm 0.037)[\log(P) - 1.2] + (2.042 \pm 0.047)$ and $\langle M_V\rangle = -(2.690 \pm 0.169)[\log(P) - 1.2] - (4.699 \pm 0.216)$.
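As a worked illustration of the two fitted relations quoted at the end of this abstract, the sketch below evaluates them for a single period. It assumes that the left-hand sides are the base-10 logarithm of the mean radius and the mean absolute V magnitude (our reading of "period-radius" and "period-luminosity"), that P is in days, and that all logarithms are base 10. It uses only the central coefficient values and ignores the quoted uncertainties; the function names are hypothetical.

```python
import math

def log_mean_radius(period_days):
    """Central-value period-radius relation from the abstract (assumed log10)."""
    return 0.693 * (math.log10(period_days) - 1.2) + 2.042

def mean_abs_magnitude(period_days):
    """Central-value period-luminosity relation from the abstract."""
    return -2.690 * (math.log10(period_days) - 1.2) - 4.699

# A 10-day period is chosen only as an example input.
period = 10.0
print(log_mean_radius(period))     # log10 of the mean radius
print(mean_abs_magnitude(period))  # mean absolute magnitude
```

For a 10-day period the bracketed term is 1 - 1.2 = -0.2, so the relations give a log radius of about 1.903 and a mean magnitude of about -4.161.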
Bayesian Analysis of Evolutionary Divergence with Genomic Data Under Diverse Demographic Models.
Chung, Yujin; Hey, Jody
2017-02-25
We present a new Bayesian method for estimating demographic and phylogenetic history using population genomic data. Several key innovations are introduced that allow the study of diverse models within an Isolation with Migration framework. The new method implements a 2-step analysis, with an initial Markov chain Monte Carlo (MCMC) phase that samples simple coalescent trees, followed by the calculation of the joint posterior density for the parameters of a demographic model. In step 1, the MCMC sampling phase, the method uses a reduced state space, consisting of coalescent trees without migration paths, and a simple importance sampling distribution without the demography of interest. Once obtained, a single sample of trees can be used in step 2 to calculate the joint posterior density for model parameters under multiple diverse demographic models, without having to repeat MCMC runs. Because migration paths are not included in the state space of the MCMC phase, but rather are handled by analytic integration in step 2 of the analysis, the method is scalable to a large number of loci with excellent MCMC mixing properties. With an implementation of the new method in the computer program MIST, we demonstrate the method's accuracy, scalability and other advantages using simulated data and DNA sequences of two common chimpanzee subspecies: Pan troglodytes troglodytes (P. t.) and P. t. verus.
2013-01-01
Background Dendropsophus is a monophyletic anuran genus with a diploid number of 30 chromosomes as an important synapomorphy. However, the internal phylogenetic relationships of this genus are poorly understood. Interestingly, an intriguing interspecific variation in the telocentric chromosome number has been useful in species identification. To address certain uncertainties related to one of the species groups of Dendropsophus, the D. microcephalus group, we carried out a cytogenetic analysis combined with phylogenetic inferences based on mitochondrial sequences, which aimed to aid in the analysis of chromosomal characters. Populations of Dendropsophus nanus, Dendropsophus walfordi, Dendropsophus sanborni, Dendropsophus jimi and Dendropsophus elianeae, ranging from the extreme south to the north of Brazil, were cytogenetically compared. A mitochondrial region of the ribosomal 12S gene from these populations, as well as from 30 other species of Dendropsophus, was used for the phylogenetic inferences. Phylogenetic relationships were inferred using maximum parsimony and Bayesian analyses. Results The species D. nanus and D. walfordi exhibited identical karyotypes (2n = 30; FN = 52), with four pairs of telocentric chromosomes and a NOR located on metacentric chromosome pair 13. In all of the phylogenetic hypotheses, the paraphyly of D. nanus and D. walfordi was inferred. D. sanborni from Botucatu-SP and Torres-RS showed the same karyotype as D. jimi, with 5 pairs of telocentric chromosomes (2n = 30; FN = 50) and a terminal NOR in the long arm of the telocentric chromosome pair 12. Despite their karyotypic similarity, these species were not found to compose a monophyletic group. Finally, the phylogenetic and cytogenetic analyses did not cluster the specimens of D. elianeae according to their geographical occurrence or recognized morphotypes. Conclusions We suggest that a taxonomic revision of the taxa D. nanus and D. walfordi is quite necessary. We also
Buckley, Christopher D.
2012-01-01
The warp ikat method of making decorated textiles is one of the most geographically widespread in southeast Asia, being used by Austronesian peoples in Indonesia, Malaysia and the Philippines, and Daic peoples on the Asian mainland. In this study a dataset consisting of the decorative characters of 36 of these warp ikat weaving traditions is investigated using Bayesian and Neighbornet techniques, and the results are used to construct a phylogenetic tree and taxonomy for warp ikat weaving in southeast Asia. The results and analysis show that these diverse traditions have a common ancestor amongst neolithic cultures of the Asian mainland, and parallels exist between the patterns of textile weaving descent and linguistic phylogeny for the Austronesian group. Ancestral state analysis is used to reconstruct some of the features of the ancestral weaving tradition. The widely held theory that weaving motifs originated in the late Bronze Age Dong-Son culture is shown to be inconsistent with the data. PMID:23272211
Zhou, Tai-Cheng; Sha, Tao; Irwin, David M; Zhang, Ya-Ping
2015-01-01
Pavo cristatus, known as the Indian peafowl, is endemic to India and Sri Lanka and has been domesticated for its ornamental and food value. However, its phylogenetic status is still debated. Here, to clarify the phylogenetic status of P. cristatus within Phasianidae, we analyzed its mitochondrial genome (mtDNA). The complete mtDNA genome was determined using 34 pairs of primers. Our data show that the mtDNA genome of P. cristatus is 16,686 bp in length. Molecular phylogenetic analyses of P. cristatus were performed along with 22 complete mtDNA genomes belonging to other species in Phasianidae using Bayesian and maximum likelihood methods, with Aythya americana and Anas platyrhynchos used as outgroups. Our results show that P. cristatus has its closest genetic affinity with Pavo muticus and belongs to the clade that contains Gallus, Bambusicola and Francolinus.
Phylogenetic analysis of dissimilatory Fe(III)-reducing bacteria
Lonergan, D.J.; Jenter, H.L.; Coates, J.D.; Phillips, E.J.P.; Schmidt, T.M.; Lovley, D.R.
1996-01-01
Evolutionary relationships among strictly anaerobic dissimilatory Fe(III)-reducing bacteria obtained from a diversity of sedimentary environments were examined by phylogenetic analysis of 16S rRNA gene sequences. Members of the genera Geobacter, Desulfuromonas, Pelobacter, and Desulfuromusa formed a monophyletic group within the delta subdivision of the class Proteobacteria. On the basis of their common ancestry and the shared ability to reduce Fe(III) and/or S0, we propose that this group be considered a single family, Geobacteraceae. Bootstrap analysis, characteristic nucleotides, and higher-order secondary structures support the division of Geobacteraceae into two subgroups, designated the Geobacter and Desulfuromonas clusters. The genus Desulfuromusa and Pelobacter acidigallici make up a distinct branch with the Desulfuromonas cluster. Several members of the family Geobacteraceae, none of which reduce sulfate, were found to contain the target sequences of probes that have been previously used to define the distribution of sulfate-reducing bacteria and sulfate-reducing bacterium-like microorganisms. The recent isolations of Fe(III)-reducing microorganisms distributed throughout the domain Bacteria suggest that development of 16S rRNA probes that would specifically target all Fe(III) reducers may not be feasible. However, all of the evidence suggests that if a 16S rRNA sequence falls within the family Geobacteraceae, then the organism has the capacity for Fe(III) reduction. The suggestion, based on geological evidence, that Fe(III) reduction was the first globally significant process for oxidizing organic matter back to carbon dioxide is consistent with the finding that acetate-oxidizing Fe(III) reducers are phylogenetically diverse.
Phylogenetic and Structural Analysis of Polyketide Synthases in Aspergilli
Bhetariya, Preetida J.; Prajapati, Madhvi; Bhaduri, Asani; Mandal, Rahul Shubhra; Varma, Anupam; Madan, Taruna; Singh, Yogendra; Sarma, P. Usha
2016-01-01
Polyketide synthases (PKSs) of Aspergillus species are multidomain and multifunctional megaenzymes that play an important role in the synthesis of diverse polyketide compounds. Putative PKS protein sequences from medically, agriculturally, and industrially important Aspergillus species were chosen and screened for in silico studies. Six candidate Aspergillus species, Aspergillus fumigatus Af293, Aspergillus flavus NRRL3357, Aspergillus niger CBS 513.88, Aspergillus terreus NIH2624, Aspergillus oryzae RIB40, and Aspergillus clavatus NRRL1, were selected to study the PKS phylogeny. Full-length PKS proteins and ketosynthase (KS) domain sequences alone were retrieved for independent phylogenetic analysis from the aforementioned species, and phylogenetic analysis was performed with characterized fungal PKS. This resulted in the grouping of Aspergilli PKSs into nonreducing (NR), partially reducing (PR), and highly reducing (HR) PKS enzymes. Eight distinct clades with unique domain arrangements were classified based on homology with functionally characterized PKS enzymes. Conserved motif signatures corresponding to each type of PKS were observed. Three proteins from Protein Data Bank corresponding to NR, PR, and HR type of PKS (XP_002384329.1, XP_753141.2, and XP_001402408.2, respectively) were selected for mapping of conserved motifs on three-dimensional structures of KS domain. Structural variations were found at the active sites on modeled NR, PR, and HR enzymes of Aspergillus. It was observed that the number of iteration cycles was dependent on the size of the cavity in the active site of the PKS enzyme correlating with a type with reducing or NR products, such as pigment, 6MSA, and lovastatin. The current study reports the grouping and classification of PKS proteins of Aspergilli for possible exploration of novel polyketides based on sequence homology; this information can be useful for selection of PKS for polyketide exploration and
Hörandl, Elvira; Paun, Ovidiu; Johansson, Jan T; Lehnebach, Carlos; Armstrong, Tristan; Chen, Lixue; Lockhart, Peter
2005-08-01
Ranunculus is a large genus with a worldwide distribution. Phylogenetic analyses of c. 200 species of Ranunculus s.l. based on sequences of the nrITS using maximum parsimony and Bayesian inference yielded high congruence with previous cpDNA restriction site analyses, but strongly contradict previous classifications. A large core clade including Ranunculus subg. Ranunculus, subg. Batrachium, subg. Crymodes p.p., Ceratocephala, Myosurus, and Aphanostemma is separated from R. subg. Ficaria, subg. Pallasiantha, subg. Coptidium, subg. Crymodes p.p., Halerpestes, Peltocalathos, Callianthemoides, and Arcteranthis. Within the core clade, 19 clades can be described with morphological and karyological features. Several sections are not monophyletic. Parallel evolution of morphological characters in adaptation to climatic conditions may be a reason for incongruence of molecular data and morphology-based classifications. In some mountainous regions, groups of closely related species may have originated from adaptive radiation and rapid speciation. Split decomposition analysis indicated complex patterns of relationship and suggested hybridization in the apomictic R. auricomus complex, R. subg. Batrachium, and the white-flowering European alpines. The evolutionary success of the genus might be due to a combination of morphological plasticity and adaptations, hybridization and polyploidy as important factors for regional diversification, and a broad range of reproductive strategies.
Rex, Martina; Schulte, Katharina; Zizka, Georg; Peters, Jule; Vásquez, Roberto; Ibisch, Pierre L; Weising, Kurt
2009-06-01
The approximately 31 species of Fosterella L.B. Sm. (Bromeliaceae) are terrestrial herbs with a centre of diversity in the central South American Andes. To resolve infra- and intergeneric relationships among Fosterella and their putative allies, we conducted a phylogenetic analysis based on sequence data from four chloroplast DNA regions (matK gene, rps16 intron, atpB-rbcL and psbB-psbH intergenic spacers). Sequences were generated for 96 accessions corresponding to 60 species from 18 genera. Among these, 57 accessions represented 22 of the 31 recognized Fosterella species and one undescribed morphospecies. Maximum parsimony and Bayesian inference methods yielded well-resolved phylogenies. The monophyly of Fosterella was strongly supported, as was its sister relationship with a clade comprising Deuterocohnia, Dyckia and Encholirium. Six distinct evolutionary lineages were distinguished within Fosterella. Character mapping indicated that parallel evolution of identical character states is common in the genus. Relationships between species and lineages are discussed in the context of morphological, ecological and biogeographical data as well as the results of a previous amplified fragment length polymorphism (AFLP) study.
Bayesian survival analysis in clinical trials: What methods are used in practice?
Brard, Caroline; Le Teuff, Gwénaël; Le Deley, Marie-Cécile; Hampson, Lisa V
2017-02-01
Background Bayesian statistics are an appealing alternative to the traditional frequentist approach to designing, analysing, and reporting of clinical trials, especially in rare diseases. Time-to-event endpoints are widely used in many medical fields. There are additional complexities to designing Bayesian survival trials which arise from the need to specify a model for the survival distribution. The objective of this article was to critically review the use and reporting of Bayesian methods in survival trials. Methods A systematic review of clinical trials using Bayesian survival analyses was performed through PubMed and Web of Science databases. This was complemented by a full text search of the online repositories of pre-selected journals. Cost-effectiveness, dose-finding studies, meta-analyses, and methodological papers using clinical trials were excluded. Results In total, 28 articles met the inclusion criteria, 25 were original reports of clinical trials and 3 were re-analyses of a clinical trial. Most trials were in oncology (n = 25), were randomised controlled (n = 21) phase III trials (n = 13), and half considered a rare disease (n = 13). Bayesian approaches were used for monitoring in 14 trials and for the final analysis only in 14 trials. In the latter case, Bayesian survival analyses were used for the primary analysis in four cases, for the secondary analysis in seven cases, and for the trial re-analysis in three cases. Overall, 12 articles reported fitting Bayesian regression models (semi-parametric, n = 3; parametric, n = 9). Prior distributions were often incompletely reported: 20 articles did not define the prior distribution used for the parameter of interest. Over half of the trials used only non-informative priors for monitoring and the final analysis (n = 12) when it was specified. Indeed, no articles fitting Bayesian regression models placed informative priors on the parameter of interest. The prior for the treatment
BEAST 2: A Software Platform for Bayesian Evolutionary Analysis
Bouckaert, Remco; Heled, Joseph; Kühnert, Denise; Vaughan, Tim; Wu, Chieh-Hsi; Xie, Dong; Suchard, Marc A.; Rambaut, Andrew; Drummond, Alexei J.
2014-01-01
We present a new open source, extensible and flexible software platform for Bayesian evolutionary analysis called BEAST 2. This software platform is a re-design of the popular BEAST 1 platform to correct structural deficiencies that became evident as the BEAST 1 software evolved. Key among those deficiencies was the lack of post-deployment extensibility. BEAST 2 now has a fully developed package management system that allows third party developers to write additional functionality that can be directly installed to the BEAST 2 analysis platform via a package manager without requiring a new software release of the platform. This package architecture is showcased with a number of recently published new models encompassing birth-death-sampling tree priors, phylodynamics and model averaging for substitution models and site partitioning. A second major improvement is the ability to read/write the entire state of the MCMC chain to/from disk allowing it to be easily shared between multiple instances of the BEAST software. This facilitates checkpointing and better support for multi-processor and high-end computing extensions. Finally, the functionality in new packages can be easily added to the user interface (BEAUti 2) by a simple XML template-based mechanism because BEAST 2 has been re-designed to provide greater integration between the analysis engine and the user interface so that, for example BEAST and BEAUti use exactly the same XML file format. PMID:24722319
Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis
NASA Technical Reports Server (NTRS)
Dezfuli, Homayoon; Kelly, Dana; Smith, Curtis; Vedros, Kurt; Galyean, William
2009-01-01
This document, Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis, is intended to provide guidelines for the collection and evaluation of risk and reliability-related data. It is aimed at scientists and engineers familiar with risk and reliability methods and provides a hands-on approach to the investigation and application of a variety of risk and reliability data assessment methods, tools, and techniques. This document provides both a broad perspective on data analysis collection and evaluation issues and a narrow focus on the methods to implement a comprehensive information repository. The topics addressed herein cover the fundamentals of how data and information are to be used in risk and reliability analysis models and their potential role in decision making. Understanding these topics is essential to attaining a risk informed decision making environment that is being sought by NASA requirements and procedures such as 8000.4 (Agency Risk Management Procedural Requirements), NPR 8705.05 (Probabilistic Risk Assessment Procedures for NASA Programs and Projects), and the System Safety requirements of NPR 8715.3 (NASA General Safety Program Requirements).
2016-01-01
The Fayum Depression of Egypt has yielded fossils of hystricognathous rodents from multiple Eocene and Oligocene horizons that range in age from ∼37 to ∼30 Ma and document several phases in the early evolution of crown Hystricognathi and one of its major subclades, Phiomorpha. Here we describe two new genera and species of basal phiomorphs, Birkamys korai and Mubhammys vadumensis, based on rostra and maxillary and mandibular remains from the terminal Eocene (∼34 Ma) Fayum Locality 41 (L-41). Birkamys is the smallest known Paleogene hystricognath, has very simple molars, and, like derived Oligocene-to-Recent phiomorphs (but unlike contemporaneous and older taxa) apparently retained dP4∕4 late into life, with no evidence for P4∕4 eruption or formation. Mubhammys is very similar in dental morphology to Birkamys, and also shows no evidence for P4∕4 formation or eruption, but is considerably larger. Though parsimony analysis with all characters equally weighted places Birkamys and Mubhammys as sister taxa of extant Thryonomys to the exclusion of much younger relatives of that genus, all other methods (standard Bayesian inference, Bayesian “tip-dating,” and parsimony analysis with scaled transitions between “fixed” and polymorphic states) place these species in more basal positions within Hystricognathi, as sister taxa of Oligocene-to-Recent phiomorphs. We also employ tip-dating as a means for estimating the ages of early hystricognath-bearing localities, many of which are not well-constrained by geological, geochronological, or biostratigraphic evidence. By simultaneously taking into account phylogeny, evolutionary rates, and uniform priors that appropriately encompass the range of possible ages for fossil localities, dating of tips in this Bayesian framework allows paleontologists to move beyond vague and assumption-laden “stage of evolution” arguments in biochronology to provide relatively rigorous age assessments of poorly-constrained faunas
Phylogenetic analysis of bovine astrovirus in Korean cattle.
Oem, Jae-Ku; An, Dong-Jun
2014-04-01
Bovine astrovirus (BAstV) belongs to a genetically divergent lineage within the genus Mamastrovirus. The present study showed that BAstV was associated with the gastroenteric tracts of cattle in nine positive fecal samples from 115 cattle, whereas no positive samples were found in the brain tissues of 14 downer cattle. Interestingly, the positive diarrheal samples were obtained mainly from calves aged 14 days-3 months. Bayesian inference tree analysis of the partial ORF1ab and capsid (ORF2) gene sequences of BAstVs identified four divergent groups. Eleven BAstVs, four porcine astroviruses, and two deer astroviruses (DAstVs; CcAstV-1 and -2) belonged to group 1; group 2 contained two BAstVs (BAstK08-51 and BAstK10-96) with another two in group 3 (BAstK08-2 and BAstK08-53); and group 4 comprised the BAstV-NeuroS1 strain derived from a cattle brain tissue sample and an ovine astrovirus. The same divergent groups were obtained when the pairwise alignments were produced using both amino acid and nucleotide sequences. The Korean BAstVs isolated from infected cattle had a nationwide distribution and they belonged to groups 1, 2, and 3.
Dembo, Mana; Matzke, Nicholas J; Mooers, Arne Ø; Collard, Mark
2015-08-07
The phylogenetic relationships of several hominin species remain controversial. Two methodological issues contribute to the uncertainty: use of partial, inconsistent datasets and reliance on phylogenetic methods that are ill-suited to testing competing hypotheses. Here, we report a study designed to overcome these issues. We first compiled a supermatrix of craniodental characters for all widely accepted hominin species. We then took advantage of recently developed Bayesian methods for building trees of serially sampled tips to test among hypotheses that have been put forward in three of the most important current debates in hominin phylogenetics--the relationship between Australopithecus sediba and Homo, the taxonomic status of the Dmanisi hominins, and the place of the so-called hobbit fossils from Flores, Indonesia, in the hominin tree. Based on our results, several published hypotheses can be statistically rejected. For example, the data do not support the claim that Dmanisi hominins and all other early Homo specimens represent a single species, nor that the hobbit fossils are the remains of small-bodied modern humans, one of whom had Down syndrome. More broadly, our study provides a new baseline dataset for future work on hominin phylogeny and illustrates the promise of Bayesian approaches for understanding hominin phylogenetic relationships.
Bayesian Model Selection with Network Based Diffusion Analysis
Whalen, Andrew; Hoppitt, William J. E.
2016-01-01
A number of recent studies have used Network Based Diffusion Analysis (NBDA) to detect the role of social transmission in the spread of a novel behavior through a population. In this paper we present a unified framework for performing NBDA in a Bayesian setting, and demonstrate how the Watanabe Akaike Information Criteria (WAIC) can be used for model selection. We present a specific example of applying this method to Time to Acquisition Diffusion Analysis (TADA). To examine the robustness of this technique, we performed a large scale simulation study and found that NBDA using WAIC could recover the correct model of social transmission under a wide range of cases, including under the presence of random effects, individual level variables, and alternative models of social transmission. This work suggests that NBDA is an effective and widely applicable tool for uncovering whether social transmission underpins the spread of a novel behavior, and may still provide accurate results even when key model assumptions are relaxed. PMID:27092089
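The WAIC-based model selection mentioned in this abstract can be sketched generically. The code below applies the usual definition, WAIC = -2 (lppd - p_WAIC), to a table of pointwise log-likelihoods evaluated at posterior draws. The input layout, the function names, and the toy numbers are assumptions for illustration; nothing here is specific to NBDA or taken from the authors' code, and the -2 scaling is one common convention.

```python
import math
from statistics import variance

def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def waic(log_lik):
    """WAIC on the deviance scale.

    log_lik[s][i] is the log-likelihood of observation i under
    posterior draw s (at least two draws are required).
    """
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters
    for i in range(n_obs):
        column = [draw[i] for draw in log_lik]
        lppd += log_mean_exp(column)
        p_waic += variance(column)  # sample variance across draws
    return -2.0 * (lppd - p_waic)

# Toy log-likelihood tables for two hypothetical models (2 draws, 2 observations).
model_a = [[-1.0, -2.0], [-1.0, -2.0]]
model_b = [[-2.0, -3.0], [-2.0, -3.0]]
print(waic(model_a), waic(model_b))
```

Under this convention the model with the lower WAIC is preferred, so the first toy model would be selected here.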
Bayesian network models in brain functional connectivity analysis
Zhang, Sheng; Li, Chiang-shan R.
2013-01-01
Much effort has been made to better understand the complex integration of distinct parts of the human brain using functional magnetic resonance imaging (fMRI). Altered functional connectivity between brain regions is associated with many neurological and mental illnesses, such as Alzheimer and Parkinson diseases, addiction, and depression. In computational science, Bayesian networks (BN) have been used in a broad range of studies to model complex data sets in the presence of uncertainty and when expert prior knowledge is needed. However, little has been done to explore the use of BN in connectivity analysis of fMRI data. In this paper, we present an up-to-date literature review and methodological details of connectivity analyses using BN, while highlighting caveats in a real-world application. We present a BN model of an fMRI dataset obtained from sixty healthy subjects performing the stop-signal task (SST), a paradigm widely used to investigate response inhibition. Connectivity results are validated with the extant literature including our previous studies. By exploring the link strength of the learned BNs and correlating them to behavioral performance measures, this novel use of BN in connectivity analysis provides new insights into the functional neural pathways underlying response inhibition. PMID:24319317
Using Bayesian analysis in repeated preclinical in vivo studies for a more effective use of animals.
Walley, Rosalind; Sherington, John; Rastrick, Joe; Detrait, Eric; Hanon, Etienne; Watt, Gillian
2016-05-01
Whilst innovative Bayesian approaches are increasingly used in clinical studies, in the preclinical area Bayesian methods appear to be rarely used in the reporting of pharmacology data. This is particularly surprising in the context of regularly repeated in vivo studies where there is a considerable amount of data from historical control groups, which has potential value. This paper describes our experience with introducing Bayesian analysis for such studies using a Bayesian meta-analytic predictive approach. This leads naturally either to an informative prior for a control group as part of a full Bayesian analysis of the next study or using a predictive distribution to replace a control group entirely. We use quality control charts to illustrate study-to-study variation to the scientists and describe informative priors in terms of their approximate effective numbers of animals. We describe two case studies of animal models: the lipopolysaccharide-induced cytokine release model used in inflammation and the novel object recognition model used to screen cognitive enhancers, both of which show the advantage of a Bayesian approach over the standard frequentist analysis. We conclude that using Bayesian methods in stable repeated in vivo studies can result in a more effective use of animals, either by reducing the total number of animals used or by increasing the precision of key treatment differences. This will lead to clearer results and supports the "3Rs initiative" to Refine, Reduce and Replace animals in research. Copyright © 2016 John Wiley & Sons, Ltd.
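The idea of expressing an informative prior as an approximate effective number of animals can be sketched with a normal-normal model. This is only an illustration under simplifying assumptions that are not taken from the paper: a known within-study standard deviation, a normal prior on the control-group mean, and hypothetical function names.

```python
def prior_effective_n(sampling_sd, prior_sd):
    """Approximate number of animals the prior is 'worth'.

    Assumes a normal prior with standard deviation prior_sd on the control
    mean and a known per-animal standard deviation sampling_sd.
    """
    return (sampling_sd / prior_sd) ** 2

def posterior_control_mean(prior_mean, prior_sd, sampling_sd, observations):
    """Conjugate normal update of the control mean; returns (mean, sd)."""
    n = len(observations)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / sampling_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    ybar = sum(observations) / n if n else 0.0
    post_mean = post_var * (prior_precision * prior_mean + data_precision * ybar)
    return post_mean, post_var ** 0.5
```

For example, with a per-animal standard deviation of 2 and a prior standard deviation of 1, the prior carries roughly the weight of 4 control animals, so a new study with 4 controls would give the prior and the data equal influence on the posterior mean.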
Guidance on the implementation and reporting of a drug safety Bayesian network meta-analysis.
Ohlssen, David; Price, Karen L; Xia, H Amy; Hong, Hwanhee; Kerman, Jouni; Fu, Haoda; Quartey, George; Heilmann, Cory R; Ma, Haijun; Carlin, Bradley P
2014-01-01
The Drug Information Association Bayesian Scientific Working Group (BSWG) was formed in 2011 with a vision to ensure that Bayesian methods are well understood and broadly utilized for design and analysis and throughout the medical product development process, and to improve industrial, regulatory, and economic decision making. The group, composed of individuals from academia, industry, and regulatory, has as its mission to facilitate the appropriate use and contribute to the progress of Bayesian methodology. In this paper, the safety sub-team of the BSWG explores the use of Bayesian methods when applied to drug safety meta-analysis and network meta-analysis. Guidance is presented on the conduct and reporting of such analyses. We also discuss different structural model assumptions and provide discussion on prior specification. The work is illustrated through a case study involving a network meta-analysis related to the cardiovascular safety of non-steroidal anti-inflammatory drugs.
Hepatitis E Virus Circulation in Italy: Phylogenetic and Evolutionary Analysis
Montesano, Carla; Giovanetti, Marta; Ciotti, Marco; Cella, Eleonora; Lo Presti, Alessandra; Grifoni, Alba; Zehender, Gianguglielmo; Angeletti, Silvia; Ciccozzi, Massimo
2016-01-01
Background Hepatitis E virus (HEV), a major cause of acute viral hepatitis in developing countries, has been classified into four main genotypes and a number of subtypes. New genotypes have been recently identified in various mammals, including HEV genotype 3, which has a worldwide distribution. It is widespread among pigs in developed countries. Objectives This study investigated the genetic diversity of HEV among humans and swine in Italy. The date of origin and the demographic history of the HEV were also estimated. Materials and Methods A total of 327 HEV sequences of swine and humans from Italy were downloaded from the National Center for Biotechnology Information. Three different data sets were constructed. The first and the second data set were used to confirm the genotype of the sequences analyzed. The third data set was used to estimate the mean evolutionary rate and to determine the time-scaled phylogeny and demographic history. Results The Bayesian maximum clade credibility tree and the time to the most recent common ancestor estimates showed that the root of the tree dated back to the year 1907 (95% HPD: 1811 - 1975). Two main clades were found, divided into two subclades. Skyline plot analysis, performed separately for human and swine sequences, demonstrated the presence of a bottleneck only in the skyline plot from the swine sequences. Selective pressure analysis revealed only negatively selected sites. Conclusions This study provides support for the hypothesis that humans are probably infected after contact with swine sources. The findings emphasize the importance of checking the country of origin of swine and of improving sanitary control measures from the veterinary standpoint to prevent the spread of HEV infection in Italy. PMID:27226798
A Bayesian analysis of plutonium exposures in Sellafield workers.
Puncher, M; Riddell, A E
2016-03-01
The joint Russian (Mayak Production Association) and British (Sellafield) plutonium worker epidemiological analysis, undertaken as part of the European Union Framework Programme 7 (FP7) SOLO project, aims to investigate potential associations between cancer incidence and occupational exposures to plutonium using estimates of organ/tissue doses. The dose reconstruction protocol derived for the study makes best use of the most recent biokinetic models derived by the International Commission on Radiological Protection (ICRP) including a recent update to the human respiratory tract model (HRTM). This protocol was used to derive the final point estimates of absorbed doses for the study. Although uncertainties on the dose estimates were not included in the final epidemiological analysis, a separate Bayesian analysis has been performed for each of the 11 808 Sellafield plutonium workers included in the study in order to assess: A. The reliability of the point estimates provided to the epidemiologists and B. The magnitude of the uncertainty on dose estimates. This analysis, which accounts for uncertainties in biokinetic model parameters, intakes and measurement uncertainties, is described in the present paper. The results show that there is excellent agreement between the point estimates of dose and posterior mean values of dose. However, it is also evident that there are significant uncertainties associated with these dose estimates: the geometric range of the 97.5%:2.5% posterior values are a factor of 100 for lung dose, 30 for doses to liver and red bone marrow, and 40 for intakes: these uncertainties are not reflected in estimates of risk when point doses are used to assess them. It is also shown that better estimates of certain key HRTM absorption parameters could significantly reduce the uncertainties on lung dose in future studies.
Nuclear stockpile stewardship and Bayesian image analysis (DARHT and the BIE)
Carroll, James L
2011-01-11
Since the end of nuclear testing, the reliability of our nation's nuclear weapon stockpile has been assessed using sub-critical hydrodynamic testing. These tests involve some pretty 'extreme' radiography. We will be discussing the challenges and solutions to these problems provided by DARHT (the world's premier hydrodynamic testing facility) and the BIE or Bayesian Inference Engine (a powerful radiography analysis software tool). We will discuss the application of Bayesian image analysis techniques to this important and difficult problem.
Kinetic and phylogenetic analysis of plant polyamine uptake transporters.
Mulangi, Vaishali; Chibucos, Marcus C; Phuntumart, Vipaporn; Morris, Paul F
2012-10-01
The rice gene Polyamine Uptake Transporter1 (PUT1) was originally identified based on its homology to the polyamine uptake transporters LmPOT1 and TcPAT12 in Leishmania major and Trypanosoma cruzi, respectively. Here we show that five additional transporters from rice and Arabidopsis that cluster in the same clade as PUT1 all function as high affinity spermidine uptake transporters. Yeast expression assays of these genes confirmed that uptake of spermidine was minimally affected by 166 fold or greater concentrations of amino acids. Characterized polyamine transporters from both Arabidopsis thaliana and Oryza sativa along with the two polyamine transporters from L. major and T. cruzi were aligned and used to generate a hidden Markov model. This model was used to identify significant matches to proteins in other angiosperms, bryophytes, chlorophyta, discicristates, excavates, stramenopiles and amoebozoa. No significant matches were identified in fungal or metazoan genomes. Phylogenetic analysis showed that some sequences from the haptophyte, Emiliania huxleyi, as well as sequences from oomycetes and diatoms clustered closer to sequences from plant genomes than from a homologous sequence in the red algal genome Galdieria sulphuraria, consistent with the hypothesis that these polyamine transporters were acquired by horizontal transfer from green algae. Leishmania and Trypanosoma formed a separate cluster with genes from other Discicristates and two Entamoeba species. We surmise that the genes in Entamoeba species were acquired by phagotrophy of Discicristates. In summary, phylogenetic and functional analysis has identified two clades of genes that are predictive of polyamine transport activity.
Phylogenetic analysis of the evolution of lactose digestion in adults.
Holden, C; Mace, R
1997-10-01
In most of the world's population the ability to digest lactose declines sharply after infancy. High lactose digestion capacity in adults is common only in populations of European and circum-Mediterranean origin and is thought to be an evolutionary adaptation to millennia of drinking milk from domestic livestock. Milk can also be consumed in a processed form, such as cheese or soured milk, which has a reduced lactose content. Two other selective pressures for drinking fresh milk with a high lactose content have been proposed: promotion of calcium uptake in high-latitude populations prone to vitamin-D deficiency and maintenance of water and electrolytes in the body in highly arid environments. These three hypotheses are all supported by the geographic distribution of high lactose digestion capacity in adults. However, the relationships between environmental variables and adult lactose digestion capacity are highly confounded by the shared ancestry of many populations whose lactose digestion capacity has been tested. The three hypotheses for the evolution of high adult lactose digestion capacity are tested here using a comparative method of analysis that takes the problem of phylogenetic confounding into account. This analysis supports the hypothesis that high adult lactose digestion capacity is an adaptation to dairying but does not support the hypotheses that lactose digestion capacity is additionally selected for either at high latitudes or in highly arid environments. Furthermore, methods using maximum likelihood are used to show that the evolution of milking preceded the evolution of high lactose digestion.
Molecular analysis and phylogenetic characterization of HIV in Iran.
Sarrami-Forooshani, Ramin; Das, Suman Ranjan; Sabahi, Farzaneh; Adeli, Ahmad; Esmaeili, Rezvan; Wahren, Britta; Mohraz, Minoo; Haji-Abdolbaghi, Mahboubeh; Rasoolinejad, Mehrnaz; Jameel, Shahid; Mahboudi, Fereidoun
2006-07-01
The rate of human immunodeficiency virus type 1 (HIV-1) infection in Iran has increased dramatically in the last few years. While the earliest cases were found in hemophiliacs, intravenous drug users are now fueling the outbreak. In this study, both the 122 clones of HIV-1 gag p17 and the 131 clones of env V1-V5 region were obtained from 61 HIV-1 seropositives belonging to these two groups in Iran. HIV-1 subtyping and phylogenetic analysis was done by heteroduplex mobility assays (HMA) and multiple clone sequencing. The result indicated all hemophiliacs are infected with HIV-1 subtype B and all intravenous drug users are infected with HIV-1 subtype A. Since intravenous drug abuse is the major transmission route in Iran, HIV-1 subtype A is likely to be the dominant viral subtype circulating in the country. The analysis of genetic distances showed subtype B viruses in Iran to be twice as heterogeneous as the subtype A viruses. In conclusion, this first molecular study of HIV-1 genotypes in Iran suggests two parallel outbreaks in distinct high-risk populations and may offer clues to the origin and spread of infection in Iran.
STUDIES IN ASTRONOMICAL TIME SERIES ANALYSIS. VI. BAYESIAN BLOCK REPRESENTATIONS
Scargle, Jeffrey D.; Norris, Jay P.; Jackson, Brad; Chiang, James
2013-02-20
This paper addresses the problem of detecting and characterizing local variability in time series and other forms of sequential data. The goal is to identify and characterize statistically significant variations, at the same time suppressing the inevitable corrupting observational errors. We present a simple nonparametric modeling technique and an algorithm implementing it (an improved and generalized version of Bayesian Blocks) that finds the optimal segmentation of the data in the observation interval. The structure of the algorithm allows it to be used in either a real-time trigger mode, or a retrospective mode. Maximum likelihood or marginal posterior functions to measure model fitness are presented for events, binned counts, and measurements at arbitrary times with known error distributions. Problems addressed include those connected with data gaps, variable exposure, extension to piecewise linear and piecewise exponential representations, multivariate time series data, analysis of variance, data on the circle, other data modes, and dispersed data. Simulations provide evidence that the detection efficiency for weak signals is close to a theoretical asymptotic limit derived by Arias-Castro et al. In the spirit of Reproducible Research all of the code and data necessary to reproduce all of the figures in this paper are included as supplementary material.
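The optimal-segmentation idea summarized above can be sketched for the simplest data mode, a list of event times. The following is a minimal dynamic-programming illustration, not the authors' released code: it assumes distinct event times, uses the block fitness N(log N - log T) for a block holding N events over width T, and treats the per-block penalty `ncp_prior` as a user-chosen constant (the default of 4.0 is an arbitrary assumption).

```python
import math

def bayesian_blocks_events(times, ncp_prior=4.0):
    """Return block edges giving the optimal piecewise-constant rate model.

    times     : event times, assumed distinct, at least two of them
    ncp_prior : penalty charged per block (assumed constant here)
    """
    t = sorted(times)
    n = len(t)
    if n < 2 or any(t[i] == t[i + 1] for i in range(n - 1)):
        raise ValueError("need at least two distinct event times")

    # Cell edges: data start, midpoints between consecutive events, data end.
    edges = [t[0]] + [0.5 * (t[i] + t[i + 1]) for i in range(n - 1)] + [t[-1]]

    best = [0.0] * n  # best[r]: optimal total fitness for cells 0..r
    last = [0] * n    # last[r]: first cell of the final block in that optimum

    for r in range(n):
        best_val, best_k = -math.inf, 0
        for k in range(r + 1):
            width = edges[r + 1] - edges[k]  # candidate block spans cells k..r
            count = r + 1 - k                # one event per cell
            fit = count * (math.log(count) - math.log(width)) - ncp_prior
            if k > 0:
                fit += best[k - 1]
            if fit > best_val:
                best_val, best_k = fit, k
        best[r], last[r] = best_val, best_k

    # Backtrack through the stored block starts to recover the change points.
    starts = []
    idx = n
    while idx > 0:
        k = last[idx - 1]
        starts.append(k)
        idx = k
    starts.reverse()
    return [edges[k] for k in starts] + [edges[n]]
```

This version runs in O(N^2) time for N events, which is adequate for a retrospective analysis of modest data sets. A production analysis would more likely use an established implementation and a penalty calibrated to a desired false-positive rate.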
Studies in Astronomical Time Series Analysis. VI. Bayesian Block Representations
NASA Technical Reports Server (NTRS)
Scargle, Jeffrey D.; Norris, Jay P.; Jackson, Brad; Chiang, James
2013-01-01
This paper addresses the problem of detecting and characterizing local variability in time series and other forms of sequential data. The goal is to identify and characterize statistically significant variations, at the same time suppressing the inevitable corrupting observational errors. We present a simple nonparametric modeling technique and an algorithm implementing it (an improved and generalized version of Bayesian Blocks [Scargle 1998]) that finds the optimal segmentation of the data in the observation interval. The structure of the algorithm allows it to be used in either a real-time trigger mode, or a retrospective mode. Maximum likelihood or marginal posterior functions to measure model fitness are presented for events, binned counts, and measurements at arbitrary times with known error distributions. Problems addressed include those connected with data gaps, variable exposure, extension to piecewise linear and piecewise exponential representations, multivariate time series data, analysis of variance, data on the circle, other data modes, and dispersed data. Simulations provide evidence that the detection efficiency for weak signals is close to a theoretical asymptotic limit derived by [Arias-Castro, Donoho and Huo 2003]. In the spirit of Reproducible Research [Donoho et al. (2008)] all of the code and data necessary to reproduce all of the figures in this paper are included as auxiliary material.
A Bayesian model for the analysis of transgenerational epigenetic variation.
Varona, Luis; Munilla, Sebastián; Mouresan, Elena Flavia; González-Rodríguez, Aldemar; Moreno, Carlos; Altarriba, Juan
2015-01-23
Epigenetics has become one of the major areas of biological research. However, the degree of phenotypic variability that is explained by epigenetic processes still remains unclear. From a quantitative genetics perspective, the estimation of variance components is achieved by means of the information provided by the resemblance between relatives. In a previous study, this resemblance was described as a function of the epigenetic variance component and a reset coefficient that indicates the rate of dissipation of epigenetic marks across generations. Given these assumptions, we propose a Bayesian mixed model methodology that allows the estimation of epigenetic variance from a genealogical and phenotypic database. The methodology is based on the development of a matrix T of epigenetic relationships that depends on the reset coefficient. In addition, we present a simple procedure for the calculation of the inverse of this matrix (T^{-1}) and a Gibbs sampler algorithm that obtains posterior estimates of all the unknowns in the model. The new procedure was used with two simulated data sets and with a beef cattle database. In the simulated populations, the results of the analysis provided marginal posterior distributions that included the population parameters in the regions of highest posterior density. In the case of the beef cattle dataset, the posterior estimate of transgenerational epigenetic variability was very low and a model comparison test indicated that a model that did not include it was the most plausible.
Cepheid light curve demography via Bayesian functional data analysis
NASA Astrophysics Data System (ADS)
Loredo, Thomas J.; Hendry, Martin; Kowal, Daniel; Ruppert, David
2016-01-01
Synoptic time-domain surveys provide astronomers, not simply more data, but a different kind of data: large ensembles of multivariate, irregularly and asynchronously sampled light curves. We describe a statistical framework for light curve demography—optimal accumulation and extraction of information, not only along individual light curves as conventional methods do, but also across large ensembles of related light curves. We build the framework using tools from functional data analysis (FDA), a rapidly growing area of statistics that addresses inference from datasets that sample ensembles of related functions. Our Bayesian FDA framework builds hierarchical models that describe light curve ensembles using multiple levels of randomness: upper levels describe the source population, and lower levels describe the observation process, including measurement errors and selection effects. Roughly speaking, a particular object's light curve is modeled as the sum of a parameterized template component (modeling population-averaged behavior) and a peculiar component (modeling variability across the population), subsequently subjected to an observation model. A functional shrinkage adjustment to individual light curves emerges—an adaptive, functional generalization of the kind of adjustments made for Eddington or Malmquist bias in single-epoch photometric surveys. We describe ongoing work applying the framework to improved estimation of Cepheid variable star luminosities via FDA-based refinement and generalization of the Cepheid period-luminosity relation.
Using Bayesian Population Viability Analysis to Define Relevant Conservation Objectives
Green, Adam W.; Bailey, Larissa L.
2015-01-01
Adaptive management provides a useful framework for managing natural resources in the face of uncertainty. An important component of adaptive management is identifying clear, measurable conservation objectives that reflect the desired outcomes of stakeholders. A common objective is to have a sustainable population, or metapopulation, but it can be difficult to quantify a threshold above which such a population is likely to persist. We performed a Bayesian metapopulation viability analysis (BMPVA) using a dynamic occupancy model to quantify the characteristics of two wood frog (Lithobates sylvatica) metapopulations resulting in sustainable populations, and we demonstrate how the results could be used to define meaningful objectives that serve as the basis of adaptive management. We explored scenarios involving metapopulations with different numbers of patches (pools) using estimates of breeding occurrence and successful metamorphosis from two study areas to estimate the probability of quasi-extinction and calculate the proportion of vernal pools producing metamorphs. Our results suggest that ≥50 pools are required to ensure long-term persistence with approximately 16% of pools producing metamorphs in stable metapopulations. We demonstrate one way to incorporate the BMPVA results into a utility function that balances the trade-offs between ecological and financial objectives, which can be used in an adaptive management framework to make optimal, transparent decisions. Our approach provides a framework for using a standard method (i.e., PVA) and available information to inform a formal decision process to determine optimal and timely management policies. PMID:26658734
Light curve demography via Bayesian functional data analysis
NASA Astrophysics Data System (ADS)
Loredo, Thomas; Budavari, Tamas; Hendry, Martin A.; Kowal, Daniel; Ruppert, David
2015-08-01
Synoptic time-domain surveys provide astronomers, not simply more data, but a different kind of data: large ensembles of multivariate, irregularly and asynchronously sampled light curves. We describe a statistical framework for light curve demography—optimal accumulation and extraction of information, not only along individual light curves as conventional methods do, but also across large ensembles of related light curves. We build the framework using tools from functional data analysis (FDA), a rapidly growing area of statistics that addresses inference from datasets that sample ensembles of related functions. Our Bayesian FDA framework builds hierarchical models that describe light curve ensembles using multiple levels of randomness: upper levels describe the source population, and lower levels describe the observation process, including measurement errors and selection effects. Schematically, a particular object's light curve is modeled as the sum of a parameterized template component (modeling population-averaged behavior) and a peculiar component (modeling variability across the population), subsequently subjected to an observation model. A functional shrinkage adjustment to individual light curves emerges—an adaptive, functional generalization of the kind of adjustments made for Eddington or Malmquist bias in single-epoch photometric surveys. We are applying the framework to a variety of problems in synoptic time-domain survey astronomy, including optimal detection of weak sources in multi-epoch data, and improved estimation of Cepheid variable star luminosities from detailed demographic modeling of ensembles of Cepheid light curves.
Bayesian Angular Power Spectrum Analysis of Interferometric Data
NASA Astrophysics Data System (ADS)
Sutter, P. M.; Wandelt, Benjamin D.; Malu, Siddarth S.
2012-09-01
We present a Bayesian angular power spectrum and signal map inference engine which can be adapted to interferometric observations of anisotropies in the cosmic microwave background (CMB), 21 cm emission line mapping of galactic brightness fluctuations, or 21 cm absorption line mapping of neutral hydrogen in the dark ages. The method uses Gibbs sampling to generate a sampled representation of the angular power spectrum posterior and the posterior of signal maps given a set of measured visibilities in the uv-plane. We use a mock interferometric CMB observation to demonstrate the validity of this method in the flat-sky approximation when adapted to take into account arbitrary coverage of the uv-plane, mode-mode correlations due to observations on a finite patch, and heteroscedastic visibility errors. The computational requirements scale as O(n_p log n_p), where n_p measures the ratio of the size of the detector array to the inter-detector spacing, meaning that Gibbs sampling is a promising technique for meeting the data analysis requirements of future cosmology missions.
Spatial Hierarchical Bayesian Analysis of the Historical Extreme Streamflow
NASA Astrophysics Data System (ADS)
Najafi, M. R.; Moradkhani, H.
2012-04-01
Analysis of the climate change impact on extreme hydro-climatic events is crucial for future hydrologic/hydraulic designs and water resources decision making. The purpose of this study is to investigate the changes of the extreme value distribution parameters with respect to time to reflect upon the impact of climate change. We develop a statistical model using the observed streamflow data of the Columbia River Basin in USA to estimate the changes of high flows as a function of time as well as other variables. Generalized Pareto Distribution (GPD) is used to model the upper 95% flows during December through March for 31 gauge stations. In the process layer of the model the covariates including time, latitude, longitude, elevation and basin area are considered to assess the sensitivity of the model to each variable. Markov Chain Monte Carlo (MCMC) method is used to estimate the parameters. The Spatial Hierarchical Bayesian technique models the GPD parameters spatially and borrows strength from other locations by pooling data together, while providing an explicit estimation of the uncertainties in all stages of modeling.
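To make the peaks-over-threshold model in the abstract above concrete, here is a small sketch of the Generalized Pareto log-likelihood for flows exceeding a threshold, together with a log-linear dependence of the scale parameter on time. The link function and the names `scale_at`, `beta0` and `beta1` are hypothetical illustrations; the abstract does not say how the covariates enter the parameters.

```python
import math

def gpd_log_likelihood(flows, threshold, scale, shape):
    """Log-likelihood of exceedances over `threshold` under a GPD.

    Flows at or below the threshold are ignored. Returns -inf when the
    parameters are outside the support implied by the data.
    """
    if scale <= 0.0:
        return -math.inf
    total = 0.0
    for x in flows:
        y = x - threshold
        if y <= 0.0:
            continue
        if abs(shape) < 1e-12:
            # Exponential limit of the GPD as the shape tends to zero.
            total += -math.log(scale) - y / scale
        else:
            z = 1.0 + shape * y / scale
            if z <= 0.0:
                return -math.inf
            total += -math.log(scale) - (1.0 + 1.0 / shape) * math.log(z)
    return total

def scale_at(time, beta0, beta1):
    """Hypothetical log-linear trend in the GPD scale parameter."""
    return math.exp(beta0 + beta1 * time)
```

In an MCMC scheme, a function like `gpd_log_likelihood` would be evaluated at each proposed parameter value, with `scale_at` supplying a time-varying scale, and combined with the priors and spatial structure of the hierarchical model.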
A Bayesian Analysis of Regularised Source Inversions in Gravitational Lensing
Suyu, Sherry H.; Marshall, P.J.; Hobson, M.P.; Blandford, R.D.; /Caltech /KIPAC, Menlo Park
2006-01-25
Strong gravitational lens systems with extended sources are of special interest because they provide additional constraints on the models of the lens systems. To use a gravitational lens system for measuring the Hubble constant, one would need to determine the lens potential and the source intensity distribution simultaneously. A linear inversion method to reconstruct a pixellated source distribution of a given lens potential model was introduced by Warren and Dye. In the inversion process, a regularization on the source intensity is often needed to ensure a successful inversion with a faithful resulting source. In this paper, we use Bayesian analysis to determine the optimal regularization constant (strength of regularization) of a given form of regularization and to objectively choose the optimal form of regularization given a selection of regularizations. We consider and compare quantitatively three different forms of regularization previously described in the literature for source inversions in gravitational lensing: zeroth-order, gradient and curvature. We use simulated data with the exact lens potential to demonstrate the method. We find that the preferred form of regularization depends on the nature of the source distribution.
Bayesian analysis of a reduced-form air quality model.
Foley, Kristen M; Reich, Brian J; Napelenok, Sergey L
2012-07-17
Numerical air quality models are being used for assessing emission control strategies for improving ambient pollution levels across the globe. This paper applies probabilistic modeling to evaluate the effectiveness of emission reduction scenarios aimed at lowering ground-level ozone concentrations. A Bayesian hierarchical model is used to combine air quality model output and monitoring data in order to characterize the impact of emissions reductions while accounting for different degrees of uncertainty in the modeled emissions inputs. The probabilistic model predictions are weighted based on population density in order to better quantify the societal benefits/disbenefits of four hypothetical emission reduction scenarios in which domain-wide NO(x) emissions from various sectors are reduced individually and then simultaneously. Cross validation analysis shows the statistical model performs well compared to observed ozone levels. Accounting for the variability and uncertainty in the emissions and atmospheric systems being modeled is shown to impact how emission reduction scenarios would be ranked, compared to standard methodology.
Erosion of phylogenetic signal in tunicate mitochondrial genomes on different levels of analysis.
Stach, Thomas; Braband, Anke; Podsiadlowski, Lars
2010-06-01
The molecular phylogenetic position of Tunicata and the internal interrelationships of higher tunicate taxa are controversial. High substitution rates and extreme gene order variability hamper phylogenetic analyses. We describe the sequence and organization of the mitochondrial genome of the aplousobranch ascidian Clavelina lepadiformis and use mitochondrial genomes to investigate phylogenetic information content on different molecular levels of comparison. Despite agreement in phylogenetic analyses of nucleotide and amino acid sequences, split analyses revealed little phylogenetic signal. Split analyses on molecular data sets deemed increasingly conservative demonstrated that the lack of signal pervades all levels and that it is Tunicata, the taxon of interest, that introduces noise in the data sets. The strongest signal present in our molecular data sets as revealed by split analyses is not present in the optimal cladograms and supports a sister group relationship between cephalochordates and craniates. Phylogenetic analysis of gene order using common interval algorithms shows that phylogenetic signal is also eroded with respect to gene positions. Even functional constraints, such as partial gene overlap as exemplified in the case of the commonly observed adjacency between cox2 and cytb, are subject to homoplasy. However, rare phylogenetic events like this hold some promise to retain phylogenetic information even in such cases of extreme variability. We therefore caution against relying on sequence analysis alone and recommend investigation into the signal content of molecular data sets in order to assess the strength of phylogenetic signal.
Molecular phylogenetic analysis of mango mealybug, Drosicha mangiferae from Punjab.
Banta, Geetika; Jindal, Vikas; Mohindru, Bharathi; Sharma, Sachin; Kaur, Jaimeet; Gupta, V K
2016-01-01
Mealybugs (Hemiptera: Pseudococcidae) are major pests of a wide range of crops and ornamental plants worldwide. Their high degree of morphological similarity makes them difficult to identify and limits their study and management. In the present study, four Indian populations of mango mealybug (mango, litchi, guava from Gurdaspur and mango from Jalandhar) were analyzed. The mtCOI region was amplified and cloned, and the nucleotide sequences were determined and analysed. All four populations were found to be D. mangiferae. The populations from litchi and mango from Gurdaspur showed 100% homologous sequences. The populations from Guava-Gurdaspur and Mango-Jalandhar showed a single mutation of 'C' instead of 'T' at the 18th and 196th positions, respectively. Indian populations were compared with populations from Pakistan (21) and Japan (1). The phylogenetic tree resulted in two main clusters. Cluster 1 represents all 4 populations of Punjab, India, and 20 of Pakistan (Punjab, Sind, Lahore, Multan, Faisalabad and Karak districts) with homologous sequences. The two populations collected from Faisalabad district of Pakistan and from Japan formed a separate cluster 2 because the gene sequence used in the analysis was from the COI-3p region. However, all the other sequences of D. mangiferae samples under study showed a low nucleotide divergence. The homologous mtCOI sequences of the Indian and Pakistani populations indicate that the genetic diversity in the mealybug population was quite low over a large geographical area.
Inventory and Phylogenetic Analysis of Meiotic Genes in Monogonont Rotifers
2013-01-01
A long-standing question in evolutionary biology is how sexual reproduction has persisted in eukaryotic lineages. As cyclical parthenogens, monogonont rotifers are a powerful model for examining this question, yet the molecular nature of sexual reproduction in this lineage is currently understudied. To examine genes involved in meiosis, we generated partial genome assemblies for 2 distantly related monogonont species, Brachionus calyciflorus and B. manjavacas. Here we present an inventory of 89 meiotic genes, of which 80 homologs were identified and annotated from these assemblies. Using phylogenetic analysis, we show that several meiotic genes have undergone relatively recent duplication events that appear to be specific to the monogonont lineage. Further, we compare the expression of “meiosis-specific” genes involved in recombination and all annotated copies of the cell cycle regulatory gene CDC20 between obligate parthenogenetic (OP) and cyclical parthenogenetic (CP) strains of B. calyciflorus. We show that “meiosis-specific” genes are expressed in both CP and OP strains, whereas the expression of one of the CDC20 genes is specific to cyclical parthenogenesis. The data presented here provide insights into mechanisms of cyclical parthenogenesis and establish expectations for studies of obligate asexual relatives of monogononts, the bdelloid rotifer lineage. PMID:23487324
2011-01-01
Background With nearly 1,100 species, the fish family Characidae represents more than half of the species of Characiformes, and is a key component of Neotropical freshwater ecosystems. The composition, phylogeny, and classification of Characidae is currently uncertain, despite significant efforts based on analysis of morphological and molecular data. No consensus about the monophyly of this group or its position within the order Characiformes has been reached, challenged by the fact that many key studies to date have non-overlapping taxonomic representation and focus only on subsets of this diversity. Results In the present study we propose a new definition of the family Characidae and a hypothesis of relationships for the Characiformes based on phylogenetic analysis of DNA sequences of two mitochondrial and three nuclear genes (4,680 base pairs). The sequences were obtained from 211 samples representing 166 genera distributed among all 18 recognized families in the order Characiformes, all 14 recognized subfamilies in the Characidae, plus 56 of the genera so far considered incertae sedis in the Characidae. The phylogeny obtained is robust, with most lineages significantly supported by posterior probabilities in Bayesian analysis, and high bootstrap values from maximum likelihood and parsimony analyses. Conclusion A monophyletic assemblage strongly supported in all our phylogenetic analysis is herein defined as the Characidae and includes the characiform species lacking a supraorbital bone and with a derived position of the emergence of the hyoid artery from the anterior ceratohyal. To recognize this and several other monophyletic groups within characiforms we propose changes in the limits of several families to facilitate future studies in the Characiformes and particularly the Characidae. This work presents a new phylogenetic framework for a speciose and morphologically diverse group of freshwater fishes of significant ecological and evolutionary importance
Bayesian analysis of anisotropic cosmologies: Bianchi VIIh and WMAP
NASA Astrophysics Data System (ADS)
McEwen, J. D.; Josset, T.; Feeney, S. M.; Peiris, H. V.; Lasenby, A. N.
2013-12-01
We perform a definitive analysis of Bianchi VIIh cosmologies with Wilkinson Microwave Anisotropy Probe (WMAP) observations of the cosmic microwave background (CMB) temperature anisotropies. Bayesian analysis techniques are developed to study anisotropic cosmologies using full-sky and partial-sky masked CMB temperature data. We apply these techniques to analyse the full-sky internal linear combination (ILC) map and a partial-sky masked W-band map of WMAP 9 yr observations. In addition to the physically motivated Bianchi VIIh model, we examine phenomenological models considered in previous studies, in which the Bianchi VIIh parameters are decoupled from the standard cosmological parameters. In the two phenomenological models considered, Bayes factors of 1.7 and 1.1 units of log-evidence favouring a Bianchi component are found in full-sky ILC data. The corresponding best-fitting Bianchi maps recovered are similar for both phenomenological models and are very close to those found in previous studies using earlier WMAP data releases. However, no evidence for a phenomenological Bianchi component is found in the partial-sky W-band data. In the physical Bianchi VIIh model, we find no evidence for a Bianchi component: WMAP data thus do not favour Bianchi VIIh cosmologies over the standard Λ cold dark matter (ΛCDM) cosmology. It is not possible to discount Bianchi VIIh cosmologies in favour of ΛCDM completely, but we are able to constrain the vorticity of physical Bianchi VIIh cosmologies at (ω/H)_0 < 8.6 × 10^(-10) with 95 per cent confidence.
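As a reading aid for the Bayes factors quoted above, the following minimal sketch converts a difference in log-evidence into posterior odds and a posterior model probability. It assumes that the log-evidence is in natural-log units and that the two models have equal prior odds; neither assumption is stated in the abstract, and the function names are hypothetical.

```python
import math


def posterior_odds(delta_log_evidence, prior_odds=1.0):
    """Posterior odds of model A over model B.

    delta_log_evidence is ln(Z_A) - ln(Z_B), assumed to be in natural-log
    units. prior_odds is the prior ratio P(A) / P(B); 1.0 is an assumption.
    """
    return prior_odds * math.exp(delta_log_evidence)


def posterior_probability(delta_log_evidence, prior_odds=1.0):
    """Posterior probability of model A when only A and B are considered."""
    odds = posterior_odds(delta_log_evidence, prior_odds)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # The two log-evidence differences reported for the phenomenological models.
    for delta in (1.7, 1.1):
        odds = posterior_odds(delta)
        prob = posterior_probability(delta)
        print(f"delta ln Z = {delta}: odds = {odds:.2f}, probability = {prob:.2f}")
```

Under these assumptions, differences of this size correspond to odds of only a few to one, which is in line with the cautious wording of the abstract.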
A New Orchid Genus, Danxiaorchis, and Phylogenetic Analysis of the Tribe Calypsoeae
Zhai, Jun-Wen; Zhang, Guo-Qiang; Chen, Li-Jun; Xiao, Xin-Ju; Liu, Ke-Wei; Tsai, Wen-Chieh; Hsiao, Yu-Yun; Tian, Huai-Zhen; Zhu, Jia-Qiang; Wang, Mei-Na; Wang, Fa-Guo; Xing, Fu-Wu; Liu, Zhong-Jian
2013-01-01
Background Orchids have numerous species, and their speciation rates are presumed to be exceptionally high, suggesting that orchids are continuously and actively evolving. The wide diversity of orchids has attracted the interest of evolutionary biologists. In this study, a new orchid was discovered on Danxia Mountain in Guangdong, China. However, the phylogenetic clarification of this new orchid requires further molecular, morphological, and phytogeographic analyses. Methodology/Principal Findings A new orchid possesses a labellum with a large Y-shaped callus and two sacs at the base, and cylindrical, fleshy seeds, which make it distinct from all known orchid genera. Phylogenetic methods were applied to a matrix of morphological and molecular characters based on the fragments of the nuclear internal transcribed spacer, chloroplast matK, and rbcL genes of Orchidaceae (74 genera) and Calypsoeae (13 genera). The strict consensus Bayesian inference phylogram strongly supports the division of the Calypsoeae alliance (not including Dactylostalix and Ephippianthus) into seven clades with 11 genera. The sequence data of each species and the morphological characters of each genus were combined into a single dataset. The inferred Bayesian phylogram supports the division of the 13 genera of Calypsoeae into four clades with 13 subclades (genera). Based on the results of our phylogenetic analyses, Calypsoeae, under which the new orchid is classified, represents an independent lineage in the Epidendroideae subfamily. Conclusions Analyses of the combined datasets using Bayesian methods revealed strong evidence that Calypsoeae is a monophyletic tribe consisting of eight well-supported clades with 13 subclades (genera), which are all in agreement with the phytogeography of Calypsoeae. The Danxia orchid represents an independent lineage under the tribe Calypsoeae of the subfamily Epidendroideae. This lineage should be treated as a new genus, which we have named Danxiaorchis, that is
Wu, Yu-Peng; Zhao, Jin-Liang; Su, Tian-Juan; Luo, A-Rong; Zhu, Chao-Dong
2016-10-10
To better understand the diversity and phylogeny of Lepidoptera, the complete mitochondrial genome of Choristoneura longicellana (=Hoshinoa longicellana) was determined. It is a typical circular duplex molecule 15,759 bp in length, containing the standard metazoan set of 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and an A+T-rich region. All of the inferred tRNA secondary structures show the common cloverleaf pattern, with the exception of trnS1(AGN), which lacks the DHU arm. The rrnL of C. longicellana is the longest among sequenced lepidopterans. C. longicellana has the same gene order as all lepidopteran species currently available in GenBank. There are 5 overlapping regions ranging from 1 bp to 8 bp and 14 intergenic spacers ranging from 1 bp to 48 bp. In addition, there are four similar tandem macro-satellite regions with lengths of 101 bp, 98 bp, 92 bp, and 92 bp, respectively, in the A+T-rich region of C. longicellana. We sampled 89 species representing 13 superfamilies and reconstructed their relationships within Lepidoptera by Bayesian inference and maximum likelihood analysis. The topologies of the two trees are roughly identical, except for the placement of Cossoidea; the positions of Cossoidea, Copromorphoidea, Gelechioidea, and Zygaenoidea could not be determined based on the limited sampling. (Geometroidea+(Noctuoidea+Bombycoidea)) form the Macrolepidoptera "core". Pyraloidea group with the "core" Macrolepidoptera. Papilionoidea are not Macrolepidoptera. The Hesperiidae (representing Hesperioidea) are nested in the Papilionoidea, and closely related to Pieridae and Papilionidae. The well-known relationship of (Nymphalidae+(Riodinidae+Lycaenidae)) is recovered in this paper.
Dembo, Mana; Matzke, Nicholas J.; Mooers, Arne Ø.; Collard, Mark
2015-01-01
The phylogenetic relationships of several hominin species remain controversial. Two methodological issues contribute to the uncertainty—use of partial, inconsistent datasets and reliance on phylogenetic methods that are ill-suited to testing competing hypotheses. Here, we report a study designed to overcome these issues. We first compiled a supermatrix of craniodental characters for all widely accepted hominin species. We then took advantage of recently developed Bayesian methods for building trees of serially sampled tips to test among hypotheses that have been put forward in three of the most important current debates in hominin phylogenetics—the relationship between Australopithecus sediba and Homo, the taxonomic status of the Dmanisi hominins, and the place of the so-called hobbit fossils from Flores, Indonesia, in the hominin tree. Based on our results, several published hypotheses can be statistically rejected. For example, the data do not support the claim that Dmanisi hominins and all other early Homo specimens represent a single species, nor that the hobbit fossils are the remains of small-bodied modern humans, one of whom had Down syndrome. More broadly, our study provides a new baseline dataset for future work on hominin phylogeny and illustrates the promise of Bayesian approaches for understanding hominin phylogenetic relationships. PMID:26202999
Huang, Jie; Yang, Bo; Yan, Chaochao; Yang, Chengzhong; Tu, Feiyun; Zhang, Xiuyue; Yue, Bisong
2014-08-01
The mountain weasel (Mustela altaica) belongs to the family Mustelidae and is listed as Near Threatened on the IUCN Red List. In this study, the complete mitochondrial genome of M. altaica was sequenced and characterized. The genome is 16,521 bases in length (GenBank accession no. KC815122). The nucleotide sequence data of 12 heavy-strand protein-coding genes of M. altaica and 20 other Mustelidae species were used for phylogenetic analyses. Trees constructed using Bayesian inference, maximum parsimony and maximum likelihood demonstrated that M. altaica was close to Mustela nivalis and that together they were sister to Mustela putorius and Mustela sibirica.
Detrano, R; Leatherman, J; Salcedo, E E; Yiannikas, J; Williams, G
1986-05-01
Both Bayesian analysis assuming independence and discriminant function analysis have been used to estimate probabilities of coronary disease. To compare their relative accuracy, we submitted 303 subjects referred for coronary angiography to stress electrocardiography, thallium scintigraphy, and cine fluoroscopy. Severe angiographic disease was defined as at least one greater than 50% occlusion of a major vessel. Four calculations were done: (1) Bayesian analysis using literature estimates of pretest probabilities, sensitivities, and specificities was applied to the clinical and test data of a randomly selected subgroup (group I, 151 patients) to calculate posttest probabilities. (2) Bayesian analysis using literature estimates of pretest probabilities (but with sensitivities and specificities derived from the remaining 152 subjects [group II]) was applied to group I data to estimate posttest probabilities. (3) A discriminant function with logistic regression coefficients derived from the clinical and test variables of group II was used to calculate posttest probabilities of group I. (4) A discriminant function derived with the use of test results from group II and pretest probabilities from the literature was used to calculate posttest probabilities of group I. Receiver operating characteristic curve analysis showed that all four calculations could equivalently rank the disease probabilities for our patients. A goodness-of-fit analysis suggested the following relationship between the accuracies of the four calculations: (1) less than (2) approximately equal to (4) less than (3). Our results suggest that data-based discriminant functions are more accurate than literature-based Bayesian analysis assuming independence in predicting severe coronary disease based on clinical and noninvasive test results.
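The "Bayesian analysis assuming independence" described above can be sketched as a chain of likelihood-ratio updates: the pretest probability is converted to odds, each test result multiplies the odds by its likelihood ratio, and the final odds are converted back to a probability. The sketch below is illustrative only. The function name is hypothetical and the sensitivities, specificities, and pretest probability in the usage lines are invented placeholders, not values from the study.

```python
def posttest_probability(pretest, test_results):
    """Posttest disease probability under conditional independence of tests.

    pretest: prior (pretest) probability of disease, strictly between 0 and 1.
    test_results: iterable of (sensitivity, specificity, is_positive) tuples.
    """
    odds = pretest / (1.0 - pretest)
    for sensitivity, specificity, is_positive in test_results:
        if is_positive:
            # Positive likelihood ratio: P(T+ | disease) / P(T+ | no disease)
            likelihood_ratio = sensitivity / (1.0 - specificity)
        else:
            # Negative likelihood ratio: P(T- | disease) / P(T- | no disease)
            likelihood_ratio = (1.0 - sensitivity) / specificity
        odds *= likelihood_ratio
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Placeholder inputs for illustration; not taken from the paper.
    results = [
        (0.80, 0.90, True),   # hypothetical positive test
        (0.80, 0.90, False),  # hypothetical negative test
    ]
    print(round(posttest_probability(0.50, results), 3))
```

Multiplying likelihood ratios is valid only if the tests are conditionally independent given disease status, which is the assumption the abstract contrasts with data-based discriminant functions.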
Carapelli, Antonio; Liò, Pietro; Nardi, Francesco; van der Wath, Elizabeth; Frati, Francesco
2007-01-01
Background The phylogeny of Arthropoda is still a matter of harsh debate among systematists, and significant disagreement exists between morphological and molecular studies. In particular, while the taxon joining hexapods and crustaceans (the Pancrustacea) is now widely accepted among zoologists, the relationships among its basal lineages, and particularly the supposed reciprocal paraphyly of Crustacea and Hexapoda, continues to represent a challenge. Several genes, as well as different molecular markers, have been used to tackle this problem in molecular phylogenetic studies, with the mitochondrial DNA being one of the molecules of choice. In this study, we have assembled the largest data set available so far for Pancrustacea, consisting of 100 complete (or almost complete) sequences of mitochondrial genomes. After removal of unalignable sequence regions and highly rearranged genomes, we used nucleotide and inferred amino acid sequences of the 13 protein coding genes to reconstruct the phylogenetic relationships among major lineages of Pancrustacea. The analysis was performed with Bayesian inference, and for the amino acid sequences a new, Pancrustacea-specific, matrix of amino acid replacement was developed and used in this study. Results Two largely congruent trees were obtained from the analysis of nucleotide and amino acid datasets. In particular, the best tree obtained based on the new matrix of amino acid replacement (MtPan) was preferred over those obtained using previously available matrices (MtArt and MtRev) because of its higher likelihood score. The most remarkable result is the reciprocal paraphyly of Hexapoda and Crustacea, with some lineages of crustaceans (namely the Malacostraca, Cephalocarida and, possibly, the Branchiopoda) being more closely related to the Insecta s.s. (Ectognatha) than two orders of basal hexapods, Collembola and Diplura. Our results confirm that the mitochondrial genome, unlike analyses based on morphological data or nuclear
Zhang, Honghai; Chen, Lei
2011-03-01
The dhole (Cuon alpinus) is the only extant species in the genus Cuon (Carnivora: Canidae). In the present study, the complete mitochondrial genome of the dhole was sequenced. The total length is 16672 base pairs, which is the shortest in Canidae. Sequence analysis revealed that most mitochondrial genomic functional regions were highly consistent among canid animals except the CSB domain of the control region. The difference in length among the Canidae mitochondrial genome sequences is mainly due to the number of short tandemly repeated segments in the CSB domain. Phylogenetic analysis was performed on the concatenated data set of 14 mitochondrial genes of 8 canid animals using maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI) methods. The genera Vulpes and Nyctereutes formed a sister group and split first within Canidae, followed by Cuon. The divergence within the genus Canis was the latest. The divergence of domestic dogs after that of Canis lupus laniger is completely supported by all three topologies. Pairwise sequence divergence data of different mitochondrial genes among canid animals were also determined. Except for the synonymous substitutions in protein-coding genes, the control region exhibits the highest sequence divergences. The synonymous rates are approximately two to six times higher than those of the non-synonymous sites except for a slightly higher rate in the non-synonymous substitution between Cuon alpinus and Vulpes vulpes. 16S rRNA genes have a slightly faster sequence divergence than 12S rRNA and tRNA genes. Based on nucleotide substitutions of tRNA genes and rRNA genes, the times since divergence between dhole and other canid animals, and between domestic dogs and three subspecies of wolves, were evaluated. The result indicates that Vulpes and Nyctereutes have a close phylogenetic relationship and that the divergence of Nyctereutes is a little earlier. The Tibetan wolf may be an archaic
Bayesian Analysis of Multiple Populations in Galactic Globular Clusters
NASA Astrophysics Data System (ADS)
Wagner-Kaiser, Rachel A.; Sarajedini, Ata; von Hippel, Ted; Stenning, David; Piotto, Giampaolo; Milone, Antonino; van Dyk, David A.; Robinson, Elliot; Stein, Nathan
2016-01-01
We use GO 13297 Cycle 21 Hubble Space Telescope (HST) observations and archival GO 10775 Cycle 14 HST ACS Treasury observations of Galactic Globular Clusters to find and characterize multiple stellar populations. Determining how globular clusters are able to create and retain enriched material to produce several generations of stars is key to understanding how these objects formed and how they have affected the structural, kinematic, and chemical evolution of the Milky Way. We employ a sophisticated Bayesian technique with an adaptive MCMC algorithm to simultaneously fit the age, distance, absorption, and metallicity for each cluster. At the same time, we also fit unique helium values to two distinct populations of the cluster and determine the relative proportions of those populations. Our unique numerical approach allows objective and precise analysis of these complicated clusters, providing posterior distribution functions for each parameter of interest. We use these results to gain a better understanding of multiple populations in these clusters and their role in the history of the Milky Way. Support for this work was provided by NASA through grant numbers HST-GO-10775 and HST-GO-13297 from the Space Telescope Science Institute, which is operated by AURA, Inc., under NASA contract NAS5-26555. This material is based upon work supported by the National Aeronautics and Space Administration under Grant NNX11AF34G issued through the Office of Space Science. This project was supported by the National Aeronautics & Space Administration through the University of Central Florida's NASA Florida Space Grant Consortium.
JBASE: Joint Bayesian Analysis of Subphenotypes and Epistasis
Colak, Recep; Kim, TaeHyung; Kazan, Hilal; Oh, Yoomi; Cruz, Miguel; Valladares-Salgado, Adan; Peralta, Jesus; Escobedo, Jorge; Parra, Esteban J.; Kim, Philip M.; Goldenberg, Anna
2016-01-01
Motivation: Rapid advances in genotyping and genome-wide association studies have enabled the discovery of many new genotype–phenotype associations at the resolution of individual markers. However, these associations explain only a small proportion of theoretically estimated heritability of most diseases. In this work, we propose an integrative mixture model called JBASE: joint Bayesian analysis of subphenotypes and epistasis. JBASE explores two major reasons of missing heritability: interactions between genetic variants, a phenomenon known as epistasis and phenotypic heterogeneity, addressed via subphenotyping. Results: Our extensive simulations in a wide range of scenarios repeatedly demonstrate that JBASE can identify true underlying subphenotypes, including their associated variants and their interactions, with high precision. In the presence of phenotypic heterogeneity, JBASE has higher Power and lower Type 1 Error than five state-of-the-art approaches. We applied our method to a sample of individuals from Mexico with Type 2 diabetes and discovered two novel epistatic modules, including two loci each, that define two subphenotypes characterized by differences in body mass index and waist-to-hip ratio. We successfully replicated these subphenotypes and epistatic modules in an independent dataset from Mexico genotyped with a different platform. Availability and implementation: JBASE is implemented in C++, supported on Linux and is available at http://www.cs.toronto.edu/∼goldenberg/JBASE/jbase.tar.gz. The genotype data underlying this study are available upon approval by the ethics review board of the Medical Centre Siglo XXI. Please contact Dr Miguel Cruz at mcruzl@yahoo.com for assistance with the application. Contact: anna.goldenberg@utoronto.ca Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26411870
Bayesian inference approach to room-acoustic modal analysis
NASA Astrophysics Data System (ADS)
Henderson, Wesley; Goggans, Paul; Xiang, Ning; Botts, Jonathan
2013-08-01
Spectrum estimation is a problem common to many fields of physics, science, and engineering, and it has thus received a great deal of attention from the Bayesian data analysis community. In room acoustics, the modal or frequency response of a room is important for diagnosing and remedying acoustical defects. The physics of a sound field in a room dictates a model composed of exponentially decaying sinusoids. Continuing in the tradition of the seminal work of Bretthorst and Jaynes, this work contributes an approach to analyzing the modal responses of rooms with a time-domain model. Room acoustic spectra are constructed of damped sinusoids, and the model-based approach allows estimation of the number of sinusoids in the signal as well as their frequencies, amplitudes, damping constants, and phase delays. The frequency-amplitude spectrum may be most useful for characterizing a room, but in some settings the damping constants are of primary interest. This is the case for measuring the absorptive properties of materials, for example. A further challenge of the room acoustic spectrum problem is that modal density increases quadratically with frequency. At a point called the Schroeder frequency, adjacent modes overlap enough that the spectrum - particularly when estimated with the discrete Fourier transform - can be treated as a continuum. The time-domain, model-based approach can resolve overlapping modes and in some cases be used to estimate the Schroeder frequency. The proposed approach addresses the issue of filtering and preprocessing in order for the sampling to accurately identify all present room modes with their quadratically increasing density.
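The time-domain model named above, a sum of exponentially decaying sinusoids, can be written down in a few lines. The sketch below evaluates only the forward model for a given set of modes; it does not implement the Bayesian estimation of the number of modes or their parameters. The parameterisation (amplitude, frequency in Hz, damping constant in 1/s, phase in radians), the function name, and the example values are assumptions made for illustration, not the authors' notation.

```python
import math


def modal_response(t, modes):
    """Evaluate a sum of exponentially decaying sinusoids at time t (seconds).

    modes: iterable of (amplitude, frequency_hz, damping_per_s, phase_rad).
    Each mode contributes A * exp(-d * t) * cos(2 * pi * f * t + phase).
    """
    return sum(
        amplitude
        * math.exp(-damping * t)
        * math.cos(2.0 * math.pi * frequency * t + phase)
        for amplitude, frequency, damping, phase in modes
    )


if __name__ == "__main__":
    # Two hypothetical room modes; the values are illustrative only.
    example_modes = [
        (1.0, 50.0, 3.0, 0.0),
        (0.5, 120.0, 5.0, 0.0),
    ]
    sample_rate_hz = 1000.0
    samples = [modal_response(n / sample_rate_hz, example_modes) for n in range(5)]
    print(samples)
```

In a model-based analysis of the kind the abstract describes, a function like this would serve as the signal model inside the likelihood, with the mode parameters, and the number of modes, treated as unknowns.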
Liu, Mengyao; Kang, Chunlan; Yan, Chaochao; Huang, Ting; Song, Xuhao; Zhang, Xiuyue; Yue, Bisong; Zeng, Tao
2016-01-01
The Black Stork, Ciconia nigra, belongs to the family Ciconiidae and is evaluated as Least Concern by the IUCN. In this study, the complete mitochondrial genome of C. nigra, 17,795 bp in length, was sequenced and characterized for the first time. The mt-genome has tandem repeats of 80 bp and 78 bp repeat units, and AAACAAC and AAACAAACAAC tandem repeats in the D-loop region. Notably, a single extra base "C" was inserted at position 174 in the ND3 gene. Bayesian inference and maximum likelihood methods were used to construct phylogenetic trees based on 12 heavy-strand protein-coding genes. Phylogenetic analyses showed that Ardeidae diverged earlier than Ciconiidae, Cathartidae and Threskiornithidae, and that Ciconiidae was most closely related to Cathartidae. C. nigra diverged first among the three Ciconia birds.
A Bayesian analysis of the solar neutrino problem
Bhat, C.M.; Bhat, P.C.; Paterno, M.; Prosper, H.B.
1996-09-01
We illustrate how the Bayesian approach can be used to provide a simple but powerful way to analyze data from solar neutrino experiments. The data are analyzed assuming that the neutrinos are unaltered during their passage from the Sun to the Earth. We derive quantitative and easily understood information pertaining to the solar neutrino problem.
Bayesian Analysis of Order-Statistics Models for Ranking Data.
ERIC Educational Resources Information Center
Yu, Philip L. H.
2000-01-01
Studied the order-statistics models, extending the usual normal order-statistics model into one in which the underlying random variables followed a multivariate normal distribution. Used a Bayesian approach and the Gibbs sampling technique. Applied the proposed method to analyze presidential election data from the American Psychological…
Incorporating Prior Theory in Covariance Structure Analysis: A Bayesian Approach.
ERIC Educational Resources Information Center
Fornell, Claes; Rust, Roland T.
1989-01-01
A Bayesian approach to the testing of competing covariance structures is developed. Approximate posterior probabilities are easily obtained from the chi square values and other known constants. The approach is illustrated using an example that demonstrates how the prior probabilities can alter results concerning the preferred model specification.…
A Hierarchical Bayesian Procedure for Two-Mode Cluster Analysis
ERIC Educational Resources Information Center
DeSarbo, Wayne S.; Fong, Duncan K. H.; Liechty, John; Saxton, M. Kim
2004-01-01
This manuscript introduces a new Bayesian finite mixture methodology for the joint clustering of row and column stimuli/objects associated with two-mode asymmetric proximity, dominance, or profile data. That is, common clusters are derived which partition both the row and column stimuli/objects simultaneously into the same derived set of clusters.…
Semiparametric Thurstonian Models for Recurrent Choices: A Bayesian Analysis
ERIC Educational Resources Information Center
Ansari, Asim; Iyengar, Raghuram
2006-01-01
We develop semiparametric Bayesian Thurstonian models for analyzing repeated choice decisions involving multinomial, multivariate binary or multivariate ordinal data. Our modeling framework has multiple components that together yield considerable flexibility in modeling preference utilities, cross-sectional heterogeneity and parameter-driven…
Carvalho, Pedro; Marques, Rui Cunha
2016-02-15
This study searches for economies of size and scope in the Portuguese water sector, applying Bayesian and classical statistics to make inference in stochastic frontier analysis (SFA). It demonstrates the usefulness and advantages of applying Bayesian statistics for inference in SFA over traditional SFA, which uses only classical statistics. The resulting Bayesian methods overcome some problems that arise in the application of traditional SFA, such as bias in small samples and skewness of residuals. In the present case study of the water sector in Portugal, these Bayesian methods provide more plausible and acceptable results. Based on the results obtained, we found that there are important economies of output density, economies of size, economies of vertical integration and economies of scope in the Portuguese water sector, pointing to the huge advantages of undertaking mergers by joining the retail and wholesale components and by joining the drinking water and wastewater services.
Dust biasing of damped Lyman alpha systems: a Bayesian analysis
NASA Astrophysics Data System (ADS)
Pontzen, Andrew; Pettini, Max
2009-02-01
If damped Lyman alpha systems (DLAs) contain even modest amounts of dust, the ultraviolet luminosity of the background quasar can be severely diminished. When the spectrum is redshifted, this leads to a bias in optical surveys for DLAs. Previous estimates of the magnitude of this effect are in some tension; in particular, the distribution of DLAs in the (N_HI, Z) (i.e. column density-metallicity) plane has led to claims that we may be missing a considerable fraction of metal-rich, high column density DLAs, whereas radio surveys do not unveil a substantial population of otherwise hidden systems. Motivated by this tension, we perform a Bayesian parameter estimation analysis of a simple dust obscuration model. We include radio and optical observations of DLAs in our overall likelihood analysis and show that these do not, in fact, constitute conflicting constraints. Our model gives statistical limits on the biasing effects of dust, predicting that only 7 per cent of DLAs are missing from optical samples due to dust obscuration; at 2σ confidence, this figure takes a maximum value of 17 per cent. This contrasts with recent claims that DLA incidence rates are underestimated by 30-50 per cent. Optical measures of the mean metallicities of DLAs are found to underestimate the true value by just 0.1 dex (or at most 0.4 dex, 2σ confidence limit), in agreement with the radio survey results of Akerman et al. As an independent test, we use our model to make a rough prediction for dust reddening of the background quasar. We find a mean reddening in the DLA rest frame of log10
Johnson, James R; Owens, Krista L; Clabots, Connie R; Weissman, Scott J; Cannon, Steven B
2006-06-01
The evolutionary origins of extraintestinal pathogenic Escherichia coli (ExPEC) remain uncertain despite these organisms' relevance to human disease. A valid understanding of ExPEC phylogeny is needed as a framework against which the observed distribution of virulence factors and clinical associations can be analyzed. Accordingly, phylogenetic relationships were defined by multi-locus sequence analysis among 44 representatives of selected ExPEC clonal groups and the E. coli Reference (ECOR) collection. Recombination, which significantly obscured the phylogenetic signal for several strains, was dealt with by excluding strains or specific sequences. Conflicting overall phylogenies, and internal phylogenies for virulence-associated phylogenetic group B2, were inferred depending on the specific dataset (i.e., how extensively purged of recombination), outgroup (Salmonella enterica and/or Escherichia fergusonii), and analysis method (neighbor joining, maximum parsimony, maximum likelihood, or Bayesian likelihood). Nonetheless, the major E. coli phylogenetic groups A, B1, and B2 were consistently well resolved, as was a major sub-component of group D and an ECOR 37-O157:H7 clade. Moreover, nine important ExPEC clonal groups within groups B2 and D, characterized by serotypes O6:K2:H1, O18:K1:H7, O6:H31, and O4:K+:H+ (from group B2), and O1:K1:H-, O7:K1:H-, O157:K+:H (non-7), O15:K52:H1, and O11/17/77:K52:H18 ("clonal group A") (from group D), were consistently well resolved, regardless of clinical background (cystitis, pyelonephritis, neonatal meningitis, sepsis, or fecal), host group, geographical origin, and virulence profile. Among the group B2-derived clonal groups the O6:K2:H1 clade appeared basal. Within group D, "clonal group A" and the O15:K52:H1 clonal group were consistently placed with ECOR 47 and ECOR 44, respectively, as nearest neighbors. These findings clarify phylogenetic relationships among key ExPEC clonal groups but also emphasize that recombination
A Bayesian approach to meta-analysis of plant pathology studies.
Mila, A L; Ngugi, H K
2011-01-01
Bayesian statistical methods are used for meta-analysis in many disciplines, including medicine, molecular biology, and engineering, but have not yet been applied for quantitative synthesis of plant pathology studies. In this paper, we illustrate the key concepts of Bayesian statistics and outline the differences between Bayesian and classical (frequentist) methods in the way parameters describing population attributes are considered. We then describe a Bayesian approach to meta-analysis and present a plant pathological example based on studies evaluating the efficacy of plant protection products that induce systemic acquired resistance for the management of fire blight of apple. In a simple random-effects model assuming a normal distribution of effect sizes and no prior information (i.e., a noninformative prior), the results of the Bayesian meta-analysis are similar to those obtained with classical methods. Implementing the same model with a Student's t distribution and a noninformative prior for the effect sizes, instead of a normal distribution, yields similar results for all but acibenzolar-S-methyl (Actigard) which was evaluated only in seven studies in this example. Whereas both the classical (P = 0.28) and the Bayesian analysis with a noninformative prior (95% credibility interval [CRI] for the log response ratio: -0.63 to 0.08) indicate a nonsignificant effect for Actigard, specifying a t distribution resulted in a significant, albeit variable, effect for this product (CRI: -0.73 to -0.10). These results confirm the sensitivity of the analytical outcome (i.e., the posterior distribution) to the choice of prior in Bayesian meta-analyses involving a limited number of studies. We review some pertinent literature on more advanced topics, including modeling of among-study heterogeneity, publication bias, analyses involving a limited number of studies, and methods for dealing with missing data, and show how these issues can be approached in a Bayesian framework
2011-01-01
Background Chemosensory receptors, which are all G-protein-coupled receptors (GPCRs), come in four types: odorant receptors (ORs), vomeronasal receptors, trace-amine associated receptors and formyl peptide receptor-like proteins. The ORs are the most important receptors for detecting a wide range of environmental chemicals in daily life. Most fish OR genes have been identified from genome databases following the completion of the genome sequencing projects of many fishes. However, it remains unclear whether these OR genes from the genome databases are actually expressed in the fish olfactory epithelium. Thus, it is necessary to clone the OR mRNAs directly from the olfactory epithelium and to examine their expression status. Results Eighty-nine full-length and 22 partial OR cDNA sequences were isolated from the olfactory epithelium of the large yellow croaker, Larimichthys crocea. Bayesian phylogenetic analysis classified the vertebrate OR genes into two types, with several clades within each type, and showed that the L. crocea OR genes of each type are more closely related to those of fugu, pufferfish and stickleback than they are to those of medaka, zebrafish and frog. The reconciled tree showed 178 duplications and 129 losses. The evolutionary relationships among OR genes in these fishes accord with their evolutionary history. The fish OR genes have experienced functional divergence, and the different clades of OR genes have evolved different functions. The result of real-time PCR shows that different clades of ORs have distinct expression levels. Conclusion We have shown about 100 OR genes to be expressed in the olfactory epithelial tissues of L. crocea. The OR genes of modern fishes duplicated from their common ancestor, and were expanded over evolutionary time. The OR genes of L. crocea are closely related to those of fugu, pufferfish and stickleback, which is consistent with its evolutionary position. The different expression levels of OR genes of large
Phylogenetic analysis of β-defensin-like genes of Bothrops, Crotalus and Lachesis snakes.
Correa, Poliana G; Oguiura, Nancy
2013-07-01
Defensins are components of the vertebrate innate immune system; they comprise a diverse group of small cationic antimicrobial peptides. Among them, β-defensins have a characteristic β-sheet-rich fold plus six conserved cysteines with particular spacing and intramolecular bonds. They have been fully studied in mammals, but there is little information about them in snakes. Using a PCR approach, we described 13 β-defensin-like sequences in Bothrops and Lachesis snakes. The genes are organized in three exons and two introns, with exception of B.atrox_defensinB_01 which has only two exons. They show high similarities in exon 1, intron 1 and intron 2, but exons 2 and 3 have undergone accelerated evolution. The theoretical translated sequences encode a pre-β-defensin-like molecule with a conserved signal peptide and a mature peptide. The signal peptides are leucine-rich and the mature β-defensin-like molecules have a size around 4.5 kDa, a net charge from +2 to +11, and the conserved cysteine motif. Phylogenetic analysis was done using maximum parsimony, maximum likelihood and Bayesian analyses, and all resulted in similar topologies with slight differences. The genus Bothrops displayed two separate lineages. The reconciliation of gene trees and species tree indicated eight to nine duplications and 23 to 29 extinctions depending on the gene tree used. Our results together with previously published data indicate that the ancestral β-defensin-like gene may have three exons in vertebrates and that their evolution occurred according to a birth-and-death model.
Knight, Sarah; Gordon, Dennis P; Lavery, Shane D
2011-11-01
Phylogenetic relationships within the bryozoan order Cheilostomata are currently uncertain, with many morphological hypotheses proposed but scarcely tested by independent means of molecular analysis. This research uses DNA sequence data across five loci of both mitochondrial and nuclear origin from 91 species of cheilostome Bryozoa (34 species newly sequenced). This vastly improved the taxonomic coverage and number of loci used in a molecular analysis of this order and allowed a more in-depth look into the evolutionary history of Cheilostomata. Maximum likelihood and Bayesian analyses of individual loci were carried out along with a partitioned multi-locus approach, plus a range of topology tests based on morphological hypotheses. Together, these provide a comprehensive set of phylogenetic analyses of the order Cheilostomata. From these results inferences are made about the evolutionary history of this order and proposed morphological hypotheses are discussed in light of the independent evidence gained from the molecular data. Infraorder Ascophorina was demonstrated to be non-monophyletic, and there appear to be multiple origins of the ascus and associated structures involved in lophophore extension. This was further supported by the lack of monophyly within each of the four ascophoran grades (acanthostegomorph/spinocystal, hippothoomorph/gymnocystal, umbonulomorph/umbonuloid, lepraliomorph/lepralioid) defined by frontal-shield morphology. Chorizopora, currently classified in the ascophoran grade Hippothoomorpha, is phylogenetically distinct from Hippothoidae, providing strong evidence for multiple origins of the gymnocystal frontal shield type. Further evidence is produced to support the morphological hypothesis of multiple umbonuloid origins of lepralioid frontal shields, using a step-wise set of topological hypothesis tests combined with examination of multi-locus phylogenies.
Pu, De-qiang; Liu, Hong-ling; Gong, Yi-yun; Ji, Pei-cheng; Li, Yue-jian; Mou, Fang-sheng; Wei, Shu-jun
2017-01-01
The hoverflies Episyrphus balteatus and Eupeodes corollae (Diptera: Muscomorpha: Syrphidae) are important natural aphid predators. We obtained mitochondrial genome sequences from these two species using methods of PCR amplification and sequencing. The complete Episyrphus mitochondrial genome is 16,175 bp long while the incomplete one of Eupeodes is 15,326 bp long. All 37 typical mitochondrial genes are present in both species and arranged in ancestral positions and directions. The two mitochondrial genomes showed a biased A/T usage versus G/C. The cox1, cox2, cox3, cob and nad1 genes showed relatively low levels of nucleotide diversity among protein-coding genes, while trnM was the most conserved, without any nucleotide variation in stem regions within Muscomorpha. Phylogenetic relationships among the major lineages of Muscomorpha were reconstructed using a complete set of mitochondrial genes. Bayesian and maximum likelihood analyses generated congruent topologies. Our results supported the monophyly of five species within the Syrphidae (Syrphoidea). The Platypezoidea was sister to all other species of Muscomorpha in our phylogeny. Our study demonstrated the power of the complete mitochondrial gene set for phylogenetic analysis in Muscomorpha. PMID:28276531
Serologic and hexon phylogenetic analysis of ruminant adenoviruses
Technology Transfer Automated Retrieval System (TEKTRAN)
The objectives of this study were to determine the antigenic relationship among ruminant adenoviruses and determine their phylogenetic relationship based on the deduced hexon gene amino acid sequence. Results of reciprocal cross-neutralization tests demonstrated antigenic relationships in either on...
A Bayesian Solution for Two-Way Analysis of Variance. ACT Technical Bulletin No. 8.
ERIC Educational Resources Information Center
Lindley, Dennis V.
The standard statistical analysis of data classified in two ways (say into rows and columns) is through an analysis of variance that splits the total variation of the data into the main effect of rows, the main effect of columns, and the interaction between rows and columns. This paper presents an alternative Bayesian analysis of the same…
Yang, Jingjing; Cox, Dennis D; Lee, Jong Soo; Ren, Peng; Choi, Taeryon
2017-04-10
Functional data are defined as realizations of random functions (mostly smooth functions) varying over a continuum, which are usually collected on discretized grids with measurement errors. In order to accurately smooth noisy functional observations and deal with the issue of high-dimensional observation grids, we propose a novel Bayesian method based on the Bayesian hierarchical model with a Gaussian-Wishart process prior and basis function representations. We first derive an induced model for the basis-function coefficients of the functional data, and then use this model to conduct posterior inference through Markov chain Monte Carlo methods. Compared to the standard Bayesian inference that suffers serious computational burden and instability in analyzing high-dimensional functional data, our method greatly improves the computational scalability and stability, while inheriting the advantage of simultaneously smoothing raw observations and estimating the mean-covariance functions in a nonparametric way. In addition, our method can naturally handle functional data observed on random or uncommon grids. Simulation and real studies demonstrate that our method produces similar results to those obtainable by the standard Bayesian inference with low-dimensional common grids, while efficiently smoothing and estimating functional data with random and high-dimensional observation grids when the standard Bayesian inference fails. In conclusion, our method can efficiently smooth and estimate high-dimensional functional data, providing one way to resolve the curse of dimensionality for Bayesian functional data analysis with Gaussian-Wishart processes.
Bayesian time-series analysis of a repeated-measures poisson outcome with excess zeroes.
Murphy, Terrence E; Van Ness, Peter H; Araujo, Katy L B; Pisani, Margaret A
2011-12-01
In this article, the authors demonstrate a time-series analysis based on a hierarchical Bayesian model of a Poisson outcome with an excessive number of zeroes. The motivating example for this analysis comes from the intensive care unit (ICU) of an urban university teaching hospital (New Haven, Connecticut, 2002-2004). Studies of medication use among older patients in the ICU are complicated by statistical factors such as an excessive number of zero doses, periodicity, and within-person autocorrelation. Whereas time-series techniques adjust for autocorrelation and periodicity in outcome measurements, Bayesian analysis provides greater precision for small samples and the flexibility to conduct posterior predictive simulations. By applying elements of time-series analysis within both frequentist and Bayesian frameworks, the authors evaluate differences in shift-based dosing of medication in a medical ICU. From a small sample and with adjustment for excess zeroes, linear trend, autocorrelation, and clinical covariates, both frequentist and Bayesian models provide evidence of a significant association between a specific nursing shift and dosing level of a sedative medication. Furthermore, the posterior distributions from a Bayesian random-effects Poisson model permit posterior predictive simulations of related results that are potentially difficult to model.
Bayesian analysis of the flutter margin method in aeroelasticity
NASA Astrophysics Data System (ADS)
Khalil, Mohammad; Poirel, Dominique; Sarkar, Abhijit
2016-12-01
A Bayesian statistical framework is presented for the Zimmerman and Weissenburger flutter margin method which considers the uncertainties in aeroelastic modal parameters. The proposed methodology overcomes the limitations of the previously developed least-square based estimation technique which relies on the Gaussian approximation of the flutter margin probability density function (pdf). Using the measured free-decay responses at subcritical (preflutter) airspeeds, the joint non-Gaussian posterior pdf of the modal parameters is sampled using the Metropolis-Hastings (MH) Markov chain Monte Carlo (MCMC) algorithm. The posterior MCMC samples of the modal parameters are then used to obtain the flutter margin pdfs and finally the flutter speed pdf. The usefulness of the Bayesian flutter margin method is demonstrated using synthetic data generated from a two-degree-of-freedom pitch-plunge aeroelastic model. The robustness of the statistical framework is demonstrated using different sets of measurement data. It will be shown that the probabilistic (Bayesian) approach reduces the number of test points required in providing a flutter speed estimate for a given accuracy and precision.
Bayesian analysis of the flutter margin method in aeroelasticity
Khalil, Mohammad; Poirel, Dominique; Sarkar, Abhijit
2016-08-27
A Bayesian statistical framework is presented for the Zimmerman and Weissenburger flutter margin method which considers the uncertainties in aeroelastic modal parameters. The proposed methodology overcomes the limitations of the previously developed least-square based estimation technique which relies on the Gaussian approximation of the flutter margin probability density function (pdf). Using the measured free-decay responses at subcritical (preflutter) airspeeds, the joint non-Gaussian posterior pdf of the modal parameters is sampled using the Metropolis–Hastings (MH) Markov chain Monte Carlo (MCMC) algorithm. The posterior MCMC samples of the modal parameters are then used to obtain the flutter margin pdfs and finally the flutter speed pdf. The usefulness of the Bayesian flutter margin method is demonstrated using synthetic data generated from a two-degree-of-freedom pitch-plunge aeroelastic model. The robustness of the statistical framework is demonstrated using different sets of measurement data. In conclusion, it will be shown that the probabilistic (Bayesian) approach reduces the number of test points required in providing a flutter speed estimate for a given accuracy and precision.
Genotyping and phylogenetic analysis of Acanthamoeba isolates associated with keratitis.
Risler, Arnaud; Coupat-Goutaland, Bénédicte; Pélandakis, Michel
2013-11-01
We examined a partial SSU-rDNA sequence from 20 Acanthamoeba isolates associated with keratitis infections. The phylogenetic tree inferred from this partial sequence allowed isolates to be assigned to genotypes. Among the 20 isolates examined, 16 were found to be of the T4 genotype, 2 were T3, 1 was a T5, and 1 was a T2, confirming the predominance of T4 in infections. However, the study highlighted other genotypes more rarely associated with infections, particularly the T2 genotype. Our study is the second one to detect that this genotype is associated with keratitis. Additionally, the phylogenetic analyses showed five main emerging clusters, T4/T3/T11, T2/T6, T10/T12/T14, T13/T16, and T7/T8/T9/T17, regularly obtained whichever method was used. A similar branching pattern was found when the full rDNA sequence was investigated.
Exploring fast computational strategies for probabilistic phylogenetic analysis.
Rodrigue, Nicolas; Philippe, Hervé; Lartillot, Nicolas
2007-10-01
In recent years, the advent of Markov chain Monte Carlo (MCMC) techniques, coupled with modern computational capabilities, has enabled the study of evolutionary models without a closed form solution of the likelihood function. However, current Bayesian MCMC applications can incur significant computational costs, as they are based on a full sampling from the posterior probability distribution of the parameters of interest. Here, we draw attention to how MCMC techniques can be embedded within normal approximation strategies for more economical statistical computation. The overall procedure is based on an estimate of the first and second moments of the likelihood function, as well as a maximum likelihood estimate. Through examples, we review several MCMC-based methods used in the statistical literature for such estimation, applying the approaches to constructing posterior distributions under non-analytical evolutionary models relaxing the assumptions of rate homogeneity, and of independence between sites. Finally, we use the procedures for conducting Bayesian model selection, based on Laplace approximations of Bayes factors, which we find to be accurate and computationally advantageous. Altogether, the methods we expound here, as well as other related approaches from the statistical literature, should prove useful when investigating increasingly complex descriptions of molecular evolution, alleviating some of the difficulties associated with nonanalytical models.
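As a reminder of what a Laplace approximation of a Bayes factor looks like, the standard textbook form is sketched below. The symbols (D for data, M_k for model k, θ for its d_k parameters, θ̂ and Σ̂ for the mode and curvature-based covariance) are assumed here for illustration; the abstract's own procedure builds on a maximum likelihood estimate and MCMC-estimated moments, so its exact expressions may differ.

```latex
% Marginal likelihood of model M_k, approximated around the posterior mode:
p(D \mid M_k) = \int p(D \mid \theta, M_k)\, p(\theta \mid M_k)\, d\theta
  \;\approx\; (2\pi)^{d_k/2}\, \lvert \hat{\Sigma}_k \rvert^{1/2}\,
  p(D \mid \hat{\theta}_k, M_k)\, p(\hat{\theta}_k \mid M_k)
% where \hat{\Sigma}_k is the inverse of the negative Hessian of
% \log\left[ p(D \mid \theta, M_k)\, p(\theta \mid M_k) \right] evaluated at \hat{\theta}_k.
% The Bayes factor comparing two models is the ratio of the two approximations:
B_{12} = \frac{p(D \mid M_1)}{p(D \mid M_2)}
```

The approximation is exact when the integrand is proportional to a Gaussian in θ, which is why it pairs naturally with the normal approximation strategies the abstract describes.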
Bayesian uncertainty analysis compared with the application of the GUM and its supplements
NASA Astrophysics Data System (ADS)
Elster, Clemens
2014-08-01
The Guide to the Expression of Uncertainty in Measurement (GUM) has proven to be a major step towards the harmonization of uncertainty evaluation in metrology. Its procedures contain elements from both classical and Bayesian statistics. The recent supplements 1 and 2 to the GUM appear to move the guidelines towards the Bayesian point of view, and they produce a probability distribution that shall encode one's state of knowledge about the measurand. In contrast to a Bayesian uncertainty analysis, however, Bayes' theorem is not applied explicitly. Instead, a distribution is assigned for the input quantities which is then ‘propagated’ through a model that relates the input quantities to the measurand. The resulting distribution for the measurand may coincide with a distribution obtained by the application of Bayes' theorem, but this is not true in general. The relation between a Bayesian uncertainty analysis and the application of the GUM and its supplements is investigated. In terms of a simple example, similarities and differences in the approaches are illustrated. Then a general class of models is considered and conditions are specified for which the distribution obtained by supplement 1 to the GUM is equivalent to a posterior distribution resulting from the application of Bayes' theorem. The corresponding prior distribution is identified and assessed. Finally, we briefly compare the GUM approach with a Bayesian uncertainty analysis in the context of regression problems.
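The "propagation" step described above, in which distributions assigned to the input quantities are pushed through the measurement model, can be illustrated with a small Monte Carlo sketch. Everything specific here is hypothetical and chosen only for illustration: the helper name `propagate`, the model Y = V / I, and the Gaussian input distributions. It is not a reproduction of the GUM supplement's full procedure.

```python
import random
import statistics


def propagate(model, samplers, n_draws, seed=0):
    """Draw each input quantity from its assigned distribution and
    evaluate the measurement model once per joint draw."""
    rng = random.Random(seed)
    return [model(*(draw(rng) for draw in samplers)) for _ in range(n_draws)]


# Hypothetical input quantities: a voltage and a current, each assigned
# a Gaussian distribution (mean, standard uncertainty).
def voltage(rng):
    return rng.gauss(10.0, 0.1)


def current(rng):
    return rng.gauss(2.0, 0.02)


# Hypothetical measurement model: resistance Y = V / I.
draws = propagate(lambda v, i: v / i, [voltage, current], n_draws=50_000)

estimate = statistics.fmean(draws)        # estimate of the measurand
std_uncertainty = statistics.stdev(draws)  # its standard uncertainty

# Approximate 95 % coverage interval from the empirical quantiles.
ordered = sorted(draws)
lower = ordered[int(0.025 * len(ordered))]
upper = ordered[int(0.975 * len(ordered)) - 1]
```

Note that no likelihood or Bayes' theorem appears anywhere in this sketch, which is the contrast the abstract draws with a Bayesian uncertainty analysis.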
A Gibbs sampler for Bayesian analysis of site-occupancy data
Dorazio, Robert M.; Rodriguez, Daniel Taylor
2012-01-01
1. A Bayesian analysis of site-occupancy data containing covariates of species occurrence and species detection probabilities is usually completed using Markov chain Monte Carlo methods in conjunction with software programs that can implement those methods for any statistical model, not just site-occupancy models. Although these software programs are quite flexible, considerable experience is often required to specify a model and to initialize the Markov chain so that summaries of the posterior distribution can be estimated efficiently and accurately. 2. As an alternative to these programs, we develop a Gibbs sampler for Bayesian analysis of site-occupancy data that include covariates of species occurrence and species detection probabilities. This Gibbs sampler is based on a class of site-occupancy models in which probabilities of species occurrence and detection are specified as probit-regression functions of site- and survey-specific covariate measurements. 3. To illustrate the Gibbs sampler, we analyse site-occupancy data of the blue hawker, Aeshna cyanea (Odonata, Aeshnidae), a common dragonfly species in Switzerland. Our analysis includes a comparison of results based on Bayesian and classical (non-Bayesian) methods of inference. We also provide code (based on the R software program) for conducting Bayesian and classical analyses of site-occupancy data.
Phylogenetic analysis of mammalian maximal oxygen consumption during exercise.
Dlugosz, Elizabeth M; Chappell, Mark A; Meek, Thomas H; Szafranska, Paulina A; Zub, Karol; Konarzewski, Marek; Jones, James H; Bicudo, J Eduardo P W; Nespolo, Roberto F; Careau, Vincent; Garland, Theodore
2013-12-15
We compiled published values of mammalian maximum oxygen consumption during exercise (V̇O2max) and supplemented these data with new measurements of V̇O2max for the largest rodent (capybara), 20 species of smaller-bodied rodents, two species of weasels and one small marsupial. Many of the new data were obtained with running-wheel respirometers instead of the treadmill systems used in most previous measurements of mammalian V̇O2max. We used both conventional and phylogenetically informed allometric regression models to analyze V̇O2max of 77 'species' (including subspecies or separate populations within species) in relation to body size, phylogeny, diet and measurement method. Both body mass and allometrically mass-corrected V̇O2max showed highly significant phylogenetic signals (i.e. related species tended to resemble each other). The Akaike information criterion corrected for sample size was used to compare 27 candidate models predicting V̇O2max (all of which included body mass). In addition to mass, the two best-fitting models (cumulative Akaike weight=0.93) included dummy variables coding for three species previously shown to have high V̇O2max (pronghorn, horse and a bat), and incorporated a transformation of the phylogenetic branch lengths under an Ornstein-Uhlenbeck model of residual variation (thus indicating phylogenetic signal in the residuals). We found no statistical difference between wheel- and treadmill-elicited values, and diet had no predictive ability for V̇O2max. Averaged across all models, the allometric scaling exponent was 0.839, with 95% confidence limits of 0.795 and 0.883, which does not provide support for a scaling exponent of 0.67, 0.75 or unity.
Staggemeier, Vanessa Graziele; Diniz-Filho, José Alexandre Felizola; Forest, Félix; Lucas, Eve
2015-01-01
Background and Aims Myrcia section Aulomyrcia includes ∼120 species that are endemic to the Neotropics and disjunctly distributed in the moist Amazon and Atlantic coastal forests of Brazil. This paper presents the first comprehensive phylogenetic study of this group and this phylogeny is used as a basis to evaluate recent classification systems and to test alternative hypotheses associated with the history of this clade. Methods Fifty-three taxa were sampled out of the 120 species currently recognized, plus 40 outgroup taxa, for one nuclear marker (ribosomal internal transcribed spacer) and four plastid markers (psbA-trnH, trnL-trnF, trnQ-rpS16 and ndhF). The relationships were reconstructed based on Bayesian and maximum likelihood analyses. Additionally, a likelihood approach, ‘geographic state speciation and extinction’, was used to estimate region- dependent rates of speciation, extinction and dispersal, comparing historically climatic stable areas (refugia) and unstable areas. Key Results Maximum likelihood and Bayesian inferences indicate that Myrcia and Marlierea are polyphyletic, and the internal groupings recovered are characterized by combinations of morphological characters. Phylogenetic relationships support a link between Amazonian and north-eastern species and between north-eastern and south-eastern species. Lower extinction rates within glacial refugia suggest that these areas were important in maintaining diversity in the Atlantic forest biodiversity hotspot. Conclusions This study provides a robust phylogenetic framework to address important ecological questions for Myrcia s.l. within an evolutionary context, and supports the need to unite taxonomically the two traditional genera Myrcia and Marlierea in an expanded Myrcia s.l. Furthermore, this study offers valuable insights into the diversification of plant species in the highly impacted Atlantic forest of South America; evidence is presented that the lowest extinction rates are found inside
ERIC Educational Resources Information Center
Stakhovych, Stanislav; Bijmolt, Tammo H. A.; Wedel, Michel
2012-01-01
In this article, we present a Bayesian spatial factor analysis model. We extend previous work on confirmatory factor analysis by including geographically distributed latent variables and accounting for heterogeneity and spatial autocorrelation. The simulation study shows excellent recovery of the model parameters and demonstrates the consequences…
Bayesian Factor Analysis When Only a Sample Covariance Matrix Is Available
ERIC Educational Resources Information Center
Hayashi, Kentaro; Arav, Marina
2006-01-01
In traditional factor analysis, the variance-covariance matrix or the correlation matrix has often been a form of inputting data. In contrast, in Bayesian factor analysis, the entire data set is typically required to compute the posterior estimates, such as Bayes factor loadings and Bayes unique variances. We propose a simple method for computing…
Analysis of Climate Change on Hydrologic Components by using Bayesian Neural Networks
NASA Astrophysics Data System (ADS)
Kang, K.
2012-12-01
Representing hydrologic processes in climate change analysis is a challenging task. Hydrologic outputs of regional climate models (RCMs) driven by general circulation models (GCMs) are difficult to represent because of several uncertainties in the hydrologic impacts of climate change. To address this problem, this research presents practical options for hydrological climate change analysis, using Bayesian and neural network approaches for regional adaptation to climate change. Bayesian and neural network analysis of hydrologic components is a new research frontier in anticipating climate change. A strong advantage of Bayesian neural networks is their ability to detect time-series patterns in hydrologic components, a task complicated by uncertainty in the data, parameters, and model hypotheses under climate change scenarios; they do so through stepwise changes that remove and add connections in the neural network, combined with the Bayesian cycle of parameter estimation, prediction, and updating. The Mekong River Watershed, which is surrounded by four countries (Myanmar, Laos, Thailand and Cambodia), is selected as an example study. Results will show trends in hydrologic components in climate model simulations, as identified through Bayesian neural networks.
Li, Shi; Mukherjee, Bhramar; Batterman, Stuart; Ghosh, Malay
2013-12-01
Case-crossover designs are widely used to study short-term exposure effects on the risk of acute adverse health events. While the frequentist literature on this topic is vast, there is no Bayesian work in this general area. The contribution of this paper is twofold. First, the paper establishes Bayesian equivalence results that require characterization of the set of priors under which the posterior distributions of the risk ratio parameters based on a case-crossover and time-series analysis are identical. Second, the paper studies inferential issues under case-crossover designs in a Bayesian framework. Traditionally, a conditional logistic regression is used for inference on risk-ratio parameters in case-crossover studies. We consider instead a more general full likelihood-based approach which makes less restrictive assumptions on the risk functions. Formulation of a full likelihood leads to growth in the number of parameters proportional to the sample size. We propose a semi-parametric Bayesian approach using a Dirichlet process prior to handle the random nuisance parameters that appear in a full likelihood formulation. We carry out a simulation study to compare the Bayesian methods based on full and conditional likelihood with the standard frequentist approaches for case-crossover and time-series analysis. The proposed methods are illustrated through the Detroit Asthma Morbidity, Air Quality and Traffic study, which examines the association between acute asthma risk and ambient air pollutant concentrations.
FABADA: Fitting Algorithm for Bayesian Analysis of DAta
NASA Astrophysics Data System (ADS)
Pardo, L.; Sala, G.
2014-07-01
The extraction of physical information from data has generally been carried out by fitting the data through a χ^2 minimization procedure. However, as pointed out by the pioneering work of Sivia D. S. et al., another way to analyze the data is possible using a probabilistic approach based on Bayes' theorem. Expressed in a practical way, the main difference between the classical (χ^2 minimization) and the Bayesian approach is the way of expressing the final results of the fitting procedure: in the first case the result is expressed by values of parameters and a figure of merit such as χ^2, while in the second case results are presented as probability distribution functions (PDFs) of both. In the method presented here we obtain the final probability distribution functions by exploring the combinations of parameters compatible with the experimental error, i.e. allowing the fitting procedure to wander in the parameter space with a probability of visiting a certain point P=exp(-χ^2/2), the so-called Gibbs sampling. Among the advantages of this method, we would like to emphasize three. First of all, correlation between parameters is automatically taken into account with the Bayesian method. This implies, for example, that parameter errors are correctly calculated, correlations show up in a natural way and ill-defined parameters are immediately recognized from their PDF (i.e. parameters for which data only support the calculation of lower or upper bounds). Secondly, it is possible to calculate the likelihood of a given physical model, and therefore to select the one which best fits the data with the minimum number of parameters, in a correctly defined probabilistic way. Finally, and last but not least, in the case of a low count rate, where the usual rule error=√(counts) fails because the Poisson distribution can no longer be approximated as a Gaussian, the Bayesian method can also be used by simply redefining χ^2, which is not possible with the usual fitting procedure.
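The idea of letting the fit wander through parameter space so that each point is visited with probability proportional to exp(-χ^2/2) can be sketched with a random-walk Metropolis rule. This is a minimal illustration under stated assumptions, not FABADA's actual algorithm: the straight-line model, the synthetic data, the step size and the function names `chi2` and `metropolis` are all hypothetical.

```python
import math
import random


def chi2(params, xs, ys, sigmas):
    """Chi-square of a straight-line model y = a * x + b (assumed model)."""
    a, b = params
    return sum(((y - (a * x + b)) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))


def metropolis(xs, ys, sigmas, start, step, n_steps, seed=0):
    """Random-walk Metropolis chain whose stationary density is
    proportional to exp(-chi2 / 2)."""
    rng = random.Random(seed)
    current = list(start)
    current_chi2 = chi2(current, xs, ys, sigmas)
    samples = []
    for _ in range(n_steps):
        # Propose a small Gaussian jump in every parameter.
        proposal = [p + rng.gauss(0.0, step) for p in current]
        proposal_chi2 = chi2(proposal, xs, ys, sigmas)
        delta = proposal_chi2 - current_chi2
        # Always accept improvements; accept a worse fit with
        # probability exp(-delta / 2).
        if delta <= 0 or rng.random() < math.exp(-delta / 2.0):
            current, current_chi2 = proposal, proposal_chi2
        samples.append(tuple(current))
    return samples


# Synthetic, noise-free data on the line y = 2x + 1 with error bars of 0.5.
xs = [float(x) for x in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
sigmas = [0.5] * len(xs)

samples = metropolis(xs, ys, sigmas, start=(0.0, 0.0), step=0.05, n_steps=20_000)
kept = samples[5_000:]  # discard the initial approach to the minimum
mean_a = sum(a for a, _ in kept) / len(kept)
mean_b = sum(b for _, b in kept) / len(kept)
```

Histograms of the kept `a` and `b` values play the role of the parameter PDFs described in the abstract, and a scatter plot of `a` against `b` would show their correlation directly.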
Buckley, Thomas R; James, Sam; Allwood, Julia; Bartlam, Scott; Howitt, Robyn; Prada, Diana
2011-01-01
We have constructed the first ever phylogeny for the New Zealand earthworm fauna (Megascolecinae and Acanthodrilinae) including representatives from other major continental regions. Bayesian and maximum likelihood phylogenetic trees were constructed from 427 base pairs from the mitochondrial large subunit (16S) rRNA gene and 661 base pairs from the nuclear large subunit (28S) rRNA gene. Within the Acanthodrilinae we were able to identify a number of well-supported clades that were restricted to continental landmasses. Estimates of nodal support for these major clades were generally high, but relationships among clades were poorly resolved. The phylogenetic analyses revealed several independent lineages in New Zealand, some of which had a comparable phylogenetic depth to monophyletic groups sampled from Madagascar, Africa, North America and Australia. These results are consistent with at least some of these clades having inhabited New Zealand since rifting from Gondwana in the Late Cretaceous. Within the New Zealand Acanthodrilinae, major clades tended to be restricted to specific regions of New Zealand, with the central North Island and Cook Strait representing major biogeographic boundaries. Our field surveys of New Zealand and subsequent identification have also revealed extensive cryptic taxonomic diversity with approximately 48 new species sampled in addition to the 199 species recognized by previous authors. Our results indicate that further survey and taxonomic work is required to establish a foundation for future biogeographic and ecological research on this vitally important component of the New Zealand biota.
Bayesian analysis of the dynamic structure in China's economic growth
NASA Astrophysics Data System (ADS)
Kyo, Koki; Noda, Hideo
2008-11-01
To analyze the dynamic structure in China's economic growth during the period 1952-1998, we introduce a model of the aggregate production function for the Chinese economy that considers total factor productivity (TFP) and output elasticities as time-varying parameters. Specifically, this paper is concerned with the relationship between the rate of economic growth in China and the trend in TFP. Here, we consider the time-varying parameters as random variables and introduce smoothness priors to construct a set of Bayesian linear models for parameter estimation. The results of the estimation are in agreement with the movements in China's social economy, thus illustrating the validity of the proposed methods.
Bayesian analysis of truncation errors in chiral effective field theory
NASA Astrophysics Data System (ADS)
Melendez, J.; Furnstahl, R. J.; Klco, N.; Phillips, D. R.; Wesolowski, S.
2016-09-01
In the Bayesian approach to effective field theory (EFT) expansions, truncation errors are derived from degree-of-belief (DOB) intervals for EFT predictions. By encoding expectations about the naturalness of EFT expansion coefficients for observables, this framework provides a statistical interpretation of the standard EFT procedure where truncation errors are estimated using the order-by-order convergence of the expansion. We extend and test previous calculations of DOB intervals for chiral EFT observables, examine correlations between contributions at different orders and energies, and explore methods to validate the statistical consistency of the EFT expansion parameter. Supported in part by the NSF and the DOE.
Multigene analysis of lophophorate and chaetognath phylogenetic relationships.
Helmkampf, Martin; Bruchhaus, Iris; Hausdorf, Bernhard
2008-01-01
Maximum likelihood and Bayesian inference analyses of seven concatenated fragments of nuclear-encoded housekeeping genes indicate that Lophotrochozoa is monophyletic, i.e., the lophophorate groups Bryozoa, Brachiopoda and Phoronida are more closely related to molluscs and annelids than to Deuterostomia or Ecdysozoa. Lophophorates themselves, however, form a polyphyletic assemblage. The hypotheses that they are monophyletic and more closely allied to Deuterostomia than to Protostomia can be ruled out with both the approximately unbiased test and the expected likelihood weights test. The existence of Phoronozoa, a putative clade including Brachiopoda and Phoronida, has also been rejected. According to our analyses, phoronids instead share a more recent common ancestor with bryozoans than with brachiopods. Platyhelminthes is the sister group of Lophotrochozoa. Together these two constitute Spiralia. Although Chaetognatha appears as the sister group of Priapulida within Ecdysozoa in our analyses, alternative hypotheses concerning chaetognath relationships could not be rejected.
Dornburg, Alex; Friedman, Matt; Near, Thomas J
2015-08-01
Elopomorpha is one of the three main clades of living teleost fishes and includes a range of disparate lineages including eels, tarpons, bonefishes, and halosaurs. Elopomorphs were among the first groups of fishes investigated using Hennigian phylogenetic methods and continue to be the object of intense phylogenetic scrutiny due to their economic significance, diversity, and crucial evolutionary status as the sister group of all other teleosts. While portions of the phylogenetic backbone for Elopomorpha are consistent between studies, the relationships among Albula, Pterothrissus, Notacanthiformes, and Anguilliformes remain contentious and difficult to evaluate. This lack of phylogenetic resolution is problematic as fossil lineages are often described and placed taxonomically based on an assumed sister group relationship between Albula and Pterothrissus. In addition, phylogenetic studies using morphological data that sample elopomorph fossil lineages often do not include notacanthiform or anguilliform lineages, potentially introducing a bias toward interpreting fossils as members of the common stem of Pterothrissus and Albula. Here we provide a phylogenetic analysis of DNA sequences sampled from multiple nuclear genes that include representative taxa from Albula, Pterothrissus, Notacanthiformes and Anguilliformes. We integrate our molecular dataset with a morphological character matrix that spans both living and fossil elopomorph lineages. Our results reveal substantial uncertainty in the placement of Pterothrissus as well as all sampled fossil lineages, questioning the stability of the taxonomy of fossil Elopomorpha. However, despite topological uncertainty, our integration of fossil lineages into a Bayesian time calibrated framework provides divergence time estimates for the clade that are consistent with previously published age estimates based on the elopomorph fossil record and molecular estimates resulting from traditional node-dating methods.
Phylogenetic Analysis of an Anaerobic, Trichlorobenzene-Transforming Microbial Consortium
von Wintzingerode, Friedrich; Selent, Burkhard; Hegemann, Werner; Göbel, Ulf B.
1999-01-01
A culture-independent phylogenetic survey for an anaerobic trichlorobenzene-transforming microbial community was carried out. Small-subunit rRNA genes were PCR amplified from community DNA by using primers specific for Bacteria or Euryarchaeota and were subsequently cloned. Application of a new hybridization-based screening approach revealed 51 bacterial clone families, one of which was closely related to dechlorinating Dehalobacter species. Several clone sequences clustered to rDNA sequences obtained from a molecular study of an anaerobic aquifer contaminated with hydrocarbons and chlorinated solvents (Dojka et al., Appl. Env. Microbiol. 64:3869–3877, 1998). PMID:9872791
A Bayesian geostatistical transfer function approach to tracer test analysis
NASA Astrophysics Data System (ADS)
Fienen, Michael N.; Luo, Jian; Kitanidis, Peter K.
2006-07-01
Reactive transport modeling is often used in support of bioremediation and chemical treatment planning and design. There remains a pressing need for practical and efficient models that do not require (or assume attainable) the high level of characterization needed by complex numerical models. We focus on a linear systems or transfer function approach to the problem of reactive tracer transport in a heterogeneous saprolite aquifer. Transfer functions are obtained through the Bayesian geostatistical inverse method applied to tracer injection histories and breakthrough curves. We employ nonparametric transfer functions, which require minimal assumptions about shape and structure. The resulting flexibility empowers the data to determine the nature of the transfer function with minimal prior assumptions. Nonnegativity is enforced through a reflected Brownian motion stochastic model. The inverse method enables us to quantify uncertainty and to generate conditional realizations of the transfer function. Complex information about a hydrogeologic system is distilled into a relatively simple but rigorously obtained function that describes the transport behavior of the system between two wells. The resulting transfer functions are valuable in reactive transport models based on traveltime and streamline methods. The information contained in the data, particularly in the case of strong heterogeneity, is not overextended but is fully used. This is the first application of Bayesian geostatistical inversion to transfer functions in hydrogeology but the methodology can be extended to any linear system.
OBJECTIVE BAYESIAN ANALYSIS OF "ON/OFF" MEASUREMENTS
Casadei, Diego
2015-01-01
In high-energy astrophysics, it is common practice to account for the background overlaid with counts from the source of interest with the help of auxiliary measurements carried out by pointing off-source. In this "on/off" measurement, one knows the number of photons detected while pointing toward the source, the number of photons collected while pointing away from the source, and how to estimate the background counts in the source region from the flux observed in the auxiliary measurements. For very faint sources, the number of photons detected is so low that the approximations that hold asymptotically are not valid. On the other hand, an analytical solution exists for the Bayesian statistical inference, which is valid at low and high counts. Here we illustrate the objective Bayesian solution based on the reference posterior and compare the result with the approach very recently proposed by Knoetig, and discuss its most delicate points. In addition, we propose to compute the significance of the excess with respect to the background-only expectation with a method that is able to account for any uncertainty on the background and is valid for any photon count. This method is compared to the widely used significance formula by Li and Ma, which is based on asymptotic properties.
Objective Bayesian Analysis of "on/off" Measurements
NASA Astrophysics Data System (ADS)
Casadei, Diego
2015-01-01
In high-energy astrophysics, it is common practice to account for the background overlaid with counts from the source of interest with the help of auxiliary measurements carried out by pointing off-source. In this "on/off" measurement, one knows the number of photons detected while pointing toward the source, the number of photons collected while pointing away from the source, and how to estimate the background counts in the source region from the flux observed in the auxiliary measurements. For very faint sources, the number of photons detected is so low that the approximations that hold asymptotically are not valid. On the other hand, an analytical solution exists for the Bayesian statistical inference, which is valid at low and high counts. Here we illustrate the objective Bayesian solution based on the reference posterior and compare the result with the approach very recently proposed by Knoetig, and discuss its most delicate points. In addition, we propose to compute the significance of the excess with respect to the background-only expectation with a method that is able to account for any uncertainty on the background and is valid for any photon count. This method is compared to the widely used significance formula by Li & Ma, which is based on asymptotic properties.
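For orientation, the asymptotic Li & Ma significance that the abstract uses as its point of comparison can be sketched as below. This is the likelihood-ratio expression usually quoted from Li & Ma (1983), not the objective Bayesian method of the paper; the function name and the example counts are assumptions made for illustration, and alpha denotes the ratio of on-source to off-source exposure.

```python
import math


def li_ma_significance(n_on, n_off, alpha):
    """Asymptotic Li & Ma significance of an on/off excess, in standard deviations.

    n_on  : counts while pointing toward the source (must be > 0 here)
    n_off : counts while pointing away from the source (must be > 0 here)
    alpha : on-source exposure divided by off-source exposure
    """
    if n_on <= 0 or n_off <= 0 or alpha <= 0:
        raise ValueError("this sketch needs positive counts and a positive alpha")
    total = n_on + n_off
    term_on = n_on * math.log((1.0 + alpha) / alpha * n_on / total)
    term_off = n_off * math.log((1.0 + alpha) * n_off / total)
    # The sum is non-negative in exact arithmetic; clamp tiny rounding errors.
    return math.sqrt(2.0 * max(0.0, term_on + term_off))


# Hypothetical counts: 20 on-source, 10 off-source, equal exposures.
significance = li_ma_significance(20, 10, 1.0)
```

Because the expression rests on asymptotic properties, it should not be trusted for the very faint sources the abstract is concerned with; the sketch returns only the magnitude and does not distinguish an excess from a deficit.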
A Bayesian Analysis of the Ages of Four Open Clusters
NASA Astrophysics Data System (ADS)
Jeffery, Elizabeth J.; von Hippel, Ted; van Dyk, David A.; Stenning, David C.; Robinson, Elliot; Stein, Nathan; Jefferys, William H.
2016-09-01
In this paper we apply a Bayesian technique to determine the best fit of stellar evolution models to find the main sequence turn-off age and other cluster parameters of four intermediate-age open clusters: NGC 2360, NGC 2477, NGC 2660, and NGC 3960. Our algorithm utilizes a Markov chain Monte Carlo technique to fit these various parameters, objectively finding the best-fit isochrone for each cluster. The result is a high-precision isochrone fit. We compare these results with those of traditional “by-eye” isochrone fitting methods. By applying this Bayesian technique to NGC 2360, NGC 2477, NGC 2660, and NGC 3960, we determine the ages of these clusters to be 1.35 ± 0.05, 1.02 ± 0.02, 1.64 ± 0.04, and 0.860 ± 0.04 Gyr, respectively. The results of this paper continue our effort to determine cluster ages to a higher precision than that offered by these traditional methods of isochrone fitting.
Evolution of climatic niche specialization: a phylogenetic analysis in amphibians.
Bonetti, Maria Fernanda; Wiens, John J
2014-11-22
The evolution of climatic niche specialization has important implications for many topics in ecology, evolution and conservation. The climatic niche reflects the set of temperature and precipitation conditions where a species can occur. Thus, specialization to a limited set of climatic conditions can be important for understanding patterns of biogeography, species richness, community structure, allopatric speciation, spread of invasive species and responses to climate change. Nevertheless, the factors that determine climatic niche width (level of specialization) remain poorly explored. Here, we test whether species that occur in more extreme climates are more highly specialized for those conditions, and whether there are trade-offs between niche widths on different climatic niche axes (e.g. do species that tolerate a broad range of temperatures tolerate only a limited range of precipitation regimes?). We test these hypotheses in amphibians, using phylogenetic comparative methods and global-scale datasets, including 2712 species with both climatic and phylogenetic data. Our results do not support either hypothesis. Rather than finding narrower niches in more extreme environments, niches tend to be narrower on one end of a climatic gradient but wider on the other. We also find that temperature and precipitation niche breadths are positively related, rather than showing trade-offs. Finally, our results suggest that most amphibian species occur in relatively warm and dry environments and have relatively narrow climatic niche widths on both of these axes. Thus, they may be especially imperilled by anthropogenic climate change.
NASA Astrophysics Data System (ADS)
Gutiérrez, Jose Manuel; San Martín, Daniel; Herrera, Sixto; Santiago Cofiño, Antonio
2016-04-01
The growing availability of spatial datasets (observations, reanalysis, and regional and global climate models) demands efficient multivariate spatial modeling techniques for many problems of interest (e.g. teleconnection analysis, multi-site downscaling, etc.). Complex networks have been recently applied in this context using graphs built from pairwise correlations between the different stations (or grid boxes) forming the dataset. However, this analysis does not take into account the full dependence structure underlying the data, given by all possible marginal and conditional dependencies among the stations, and does not allow a probabilistic analysis of the dataset. In this talk we introduce Bayesian networks as an alternative data-driven technique for multivariate analysis and modeling, which allows building a joint probability distribution of the stations including all relevant dependencies in the dataset. Bayesian networks are a sound machine learning technique that uses a graph to 1) encode the main dependencies among the variables and 2) obtain a factorization of the joint probability distribution of the stations given by a reduced number of parameters. For a particular problem, the resulting graph provides a qualitative analysis of the spatial relationships in the dataset (alternative to complex network analysis), and the resulting model allows for a probabilistic analysis of the dataset. Bayesian networks have been widely applied in many fields, but their use in climate problems is hampered by the large number of variables (stations) involved in this field, since the complexity of the existing algorithms to learn the graphical structure from data grows nonlinearly with the number of variables. In this contribution we present a modified local learning algorithm for Bayesian networks adapted to this problem, which allows inferring the graphical structure for thousands of stations (from observations) and/or gridboxes (from model simulations) thus providing new
Xu, Chengcheng; Wang, Wei; Liu, Pan; Li, Zhibin
2015-12-01
This study aimed to develop a real-time crash risk model with limited data in China by using Bayesian meta-analysis and a Bayesian inference approach. A systematic review was first conducted by using three different Bayesian meta-analyses: the fixed effect meta-analysis, the random effect meta-analysis, and the meta-regression. The meta-analyses provided a numerical summary of the effects of traffic variables on crash risks by quantitatively synthesizing results from previous studies. The random effect meta-analysis and the meta-regression produced a more conservative estimate for the effects of traffic variables compared with the fixed effect meta-analysis. Then, the meta-analysis results were used as informative priors for developing crash risk models with limited data. The choice among the three meta-analyses significantly affected model fit and prediction accuracy. The model based on meta-regression can increase the prediction accuracy by about 15% as compared to the model that was directly developed with limited data. Finally, the Bayesian predictive densities analysis was used to identify the outliers in the limited data. It can further improve the prediction accuracy by 5.0%.
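The step of turning pooled results from earlier studies into an informative prior can be illustrated with a deliberately simplified sketch. It assumes normal likelihoods, uses inverse-variance pooling as a stand-in for the fixed effect meta-analysis, and applies a conjugate normal update; the function names and all numbers are hypothetical and do not come from the study.

```python
def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance (fixed effect) pooling of study-level effect estimates.

    Returns the pooled mean and its variance.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    mean = sum(w * e for w, e in zip(weights, estimates)) / total
    return mean, 1.0 / total


def update_normal(prior_mean, prior_var, estimate, std_error):
    """Conjugate normal update: combine a normal prior with one new estimate."""
    prior_precision = 1.0 / prior_var
    data_precision = 1.0 / std_error ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * estimate)
    return post_mean, post_var


# Hypothetical effect estimates from two earlier studies.
prior_mean, prior_var = pool_fixed_effect([1.0, 2.0], [1.0, 1.0])

# Hypothetical estimate from the small local data set.
post_mean, post_var = update_normal(prior_mean, prior_var, estimate=3.0, std_error=1.0)
```

The posterior variance is smaller than both the prior variance and the variance of the local estimate alone, which is the sense in which an informative prior helps when local data are limited.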
The PNarec method for detection of ancient recombinations through phylogenetic network analysis.
Saitou, Naruya; Kitano, Takashi
2013-02-01
Recombinations are known to disrupt the bifurcating tree structure of gene genealogies. Although recently occurred recombinations are easily detectable by using conventional methods, recombinations may have occurred at any time. We devised a new method for detecting ancient recombinations through phylogenetic network analysis, and detected five ancient recombinations in gibbon ABO blood group genes [Kitano et al., 2009. Mol. Phylogenet. Evol., 51, 465-471]. We present applications of this method, now named "PNarec", to various virus sequences as well as HLA genes.
Pontarp, Mikael; Canbäck, Björn; Tunlid, Anders; Lundberg, Per
2012-07-01
The phylogenetic structure and community composition were analysed in an existing data set of marine bacterioplankton communities to elucidate the evolutionary and ecological processes dictating the assembly. The communities were sampled from coastal waters at nine locations distributed worldwide and were examined through the use of comprehensive clone libraries of 16S ribosomal RNA genes. The analyses show that the local communities are phylogenetically different from each other and that a majority of them are phylogenetically clustered, i.e. the species (operational taxonomic units) were more related to each other than expected by chance. Accordingly, the local communities were assembled non-randomly from the global pool of available bacterioplankton. Further, the phylogenetic structures of the communities were related to the water temperature at the locations. In agreement with similar studies, including both macroorganisms and bacteria, these results suggest that marine bacterial communities are structured by “habitat filtering”, i.e. through non-random colonization and invasion determined by environmental characteristics. Different bacterial types seem to have different ecological niches that dictate their survival in different habitats. Other eco-evolutionary processes that may contribute to the observed phylogenetic patterns are discussed. The results also imply a mapping between phenotype and phylogenetic relatedness which facilitates the use of community phylogenetic structure analysis to infer ecological and evolutionary assembly processes.
McCandless, Lawrence C; Gustafson, Paul; Austin, Peter C; Levy, Adrian R
2009-09-10
Regression adjustment for the propensity score is a statistical method that reduces confounding from measured variables in observational data. A Bayesian propensity score analysis extends this idea by using simultaneous estimation of the propensity scores and the treatment effect. In this article, we conduct an empirical investigation of the performance of Bayesian propensity scores in the context of an observational study of the effectiveness of beta-blocker therapy in heart failure patients. We study the balancing properties of the estimated propensity scores. Traditional Frequentist propensity scores focus attention on balancing covariates that are strongly associated with treatment. In contrast, we demonstrate that Bayesian propensity scores can be used to balance the association between covariates and the outcome. This balancing property has the effect of reducing confounding bias because it reduces the degree to which covariates are outcome risk factors.
Abanto-Valle, C. A.; Bandyopadhyay, D.; Lachos, V. H.; Enriquez, I.
2009-01-01
A Bayesian analysis of stochastic volatility (SV) models using the class of symmetric scale mixtures of normal (SMN) distributions is considered. In the face of non-normality, this provides an appealing robust alternative to the routine use of the normal distribution. Specific distributions examined include the normal, Student-t, slash and variance gamma distributions. Using a Bayesian paradigm, an efficient Markov chain Monte Carlo (MCMC) algorithm is introduced for parameter estimation. Moreover, the mixing parameters obtained as a by-product of the scale mixture representation can be used to identify outliers. The methods developed are applied to analyze daily stock returns data on the S&P500 index. Bayesian model selection criteria as well as out-of-sample forecasting results reveal that the SV models based on heavy-tailed SMN distributions provide significant improvement in model fit as well as prediction to the S&P500 index data over the usual normal model. PMID:20730043
Reeves, Patrick A; Friedman, Philip H; Richards, Christopher M
2005-01-01
wolfPAC is an AppleScript-based software package that facilitates the use of numerous, remotely located Macintosh computers to perform computationally-intensive phylogenetic analyses using the popular application PAUP* (Phylogenetic Analysis Using Parsimony). It has been designed to utilise readily available, inexpensive processors and to encourage sharing of computational resources within the worldwide phylogenetics community.
ERIC Educational Resources Information Center
Tsiouris, John; Mann, Rachel; Patti, Paul; Sturmey, Peter
2004-01-01
Clinicians need to know the likelihood of a condition given a positive or negative diagnostic test. In this study a Bayesian analysis of the Clinical Behavior Checklist for Persons with Intellectual Disabilities (CBCPID) to predict depression in people with intellectual disability was conducted. The CBCPID was administered to 92 adults with…
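The quantity clinicians need, the probability of a condition given a test result, follows from Bayes' theorem once prevalence, sensitivity and specificity are known. The sketch below shows the calculation; the function name and the numbers are hypothetical and are not the CBCPID values reported in the study.

```python
def post_test_probability(prevalence, sensitivity, specificity, positive=True):
    """Probability of the condition given a positive or negative test result."""
    if positive:
        # True positives over all positives.
        hit = sensitivity * prevalence
        false_alarm = (1.0 - specificity) * (1.0 - prevalence)
        return hit / (hit + false_alarm)
    # Missed cases over all negatives.
    miss = (1.0 - sensitivity) * prevalence
    true_negative = specificity * (1.0 - prevalence)
    return miss / (miss + true_negative)


# Hypothetical checklist: 10% prevalence, 80% sensitivity, 90% specificity.
p_given_positive = post_test_probability(0.10, 0.80, 0.90, positive=True)
p_given_negative = post_test_probability(0.10, 0.80, 0.90, positive=False)
```

With these illustrative inputs a positive result raises the probability from 10% to a little under 50%, which shows why a test that looks accurate can still yield many false positives when the condition is uncommon.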
Bayesian analysis of cross-prefectural production function with time varying structure in Japan
NASA Astrophysics Data System (ADS)
Kyo, Koki; Noda, Hideo
2006-11-01
A cross-prefectural production function (CPPF) in Japan is constructed in a set of Bayesian models to examine the performance of Japan's post-war economy. The parameters in the model are estimated by using the procedure of a Monte Carlo filter together with the method of maximum likelihood. The estimated results are applied to regional and historical analysis of the Japanese economy.
Bayesian Meta-Analysis of Cronbach's Coefficient Alpha to Evaluate Informative Hypotheses
ERIC Educational Resources Information Center
Okada, Kensuke
2015-01-01
This paper proposes a new method to evaluate informative hypotheses for meta-analysis of Cronbach's coefficient alpha using a Bayesian approach. The coefficient alpha is one of the most widely used reliability indices. In meta-analyses of reliability, researchers typically form specific informative hypotheses beforehand, such as "alpha of…
Monte Carlo Algorithms for a Bayesian Analysis of the Cosmic Microwave Background
NASA Technical Reports Server (NTRS)
Jewell, Jeffrey B.; Eriksen, H. K.; ODwyer, I. J.; Wandelt, B. D.; Gorski, K.; Knox, L.; Chu, M.
2006-01-01
A viewgraph presentation is given that reviews the Bayesian approach to Cosmic Microwave Background (CMB) analysis and its numerical implementation with Gibbs sampling, summarizes the application to WMAP I, and describes work in progress on generalizations to polarization, foregrounds, asymmetric beams, and 1/f noise.
Technology Transfer Automated Retrieval System (TEKTRAN)
In this paper, the Genetic Algorithms (GA) and Bayesian model averaging (BMA) were combined to simultaneously conduct calibration and uncertainty analysis for the Soil and Water Assessment Tool (SWAT). In this hybrid method, several SWAT models with different structures are first selected; next GA i...
Bayesian Factor Analysis as a Variable-Selection Problem: Alternative Priors and Consequences.
Lu, Zhao-Hua; Chow, Sy-Miin; Loken, Eric
2016-01-01
Factor analysis is a popular statistical technique for multivariate data analysis. Developments in the structural equation modeling framework have enabled the use of hybrid confirmatory/exploratory approaches in which factor-loading structures can be explored relatively flexibly within a confirmatory factor analysis (CFA) framework. Recently, Muthén & Asparouhov proposed a Bayesian structural equation modeling (BSEM) approach to explore the presence of cross loadings in CFA models. We show that the issue of determining factor-loading patterns may be formulated as a Bayesian variable selection problem in which Muthén and Asparouhov's approach can be regarded as a BSEM approach with ridge regression prior (BSEM-RP). We propose another Bayesian approach, denoted herein as the Bayesian structural equation modeling with spike-and-slab prior (BSEM-SSP), which serves as a one-stage alternative to the BSEM-RP. We review the theoretical advantages and disadvantages of both approaches and compare their empirical performance relative to two modification indices-based approaches and exploratory factor analysis with target rotation. A teacher stress scale data set is used to demonstrate our approach.
ERIC Educational Resources Information Center
Zwick, Rebecca; Lenaburg, Lubella
2009-01-01
In certain data analyses (e.g., multiple discriminant analysis and multinomial log-linear modeling), classification decisions are made based on the estimated posterior probabilities that individuals belong to each of several distinct categories. In the Bayesian network literature, this type of classification is often accomplished by assigning…
Application of a data-mining method based on Bayesian networks to lesion-deficit analysis
NASA Technical Reports Server (NTRS)
Herskovits, Edward H.; Gerring, Joan P.
2003-01-01
Although lesion-deficit analysis (LDA) has provided extensive information about structure-function associations in the human brain, LDA has suffered from the difficulties inherent to the analysis of spatial data, i.e., there are many more variables than subjects, and data may be difficult to model using standard distributions, such as the normal distribution. We herein describe a Bayesian method for LDA; this method is based on data-mining techniques that employ Bayesian networks to represent structure-function associations. These methods are computationally tractable, and can represent complex, nonlinear structure-function associations. When applied to the evaluation of data obtained from a study of the psychiatric sequelae of traumatic brain injury in children, this method generates a Bayesian network that demonstrates complex, nonlinear associations among lesions in the left caudate, right globus pallidus, right side of the corpus callosum, right caudate, and left thalamus, and subsequent development of attention-deficit hyperactivity disorder, confirming and extending our previous statistical analysis of these data. Furthermore, analysis of simulated data indicates that methods based on Bayesian networks may be more sensitive and specific for detecting associations among categorical variables than methods based on chi-square and Fisher exact statistics.
ERIC Educational Resources Information Center
Wang, Qiu; Diemer, Matthew A.; Maier, Kimberly S.
2013-01-01
This study integrated Bayesian hierarchical modeling and receiver operating characteristic analysis (BROCA) to evaluate how interest strength (IS) and interest differentiation (ID) predicted low–socioeconomic status (SES) youth's interest-major congruence (IMC). Using large-scale Kuder Career Search online-assessment data, this study fit three…
Bayesian Network Meta-Analysis for Unordered Categorical Outcomes with Incomplete Data
ERIC Educational Resources Information Center
Schmid, Christopher H.; Trikalinos, Thomas A.; Olkin, Ingram
2014-01-01
We develop a Bayesian multinomial network meta-analysis model for unordered (nominal) categorical outcomes that allows for partially observed data in which exact event counts may not be known for each category. This model properly accounts for correlations of counts in mutually exclusive categories and enables proper comparison and ranking of…
Lee, Hyejung; Song, Woogeun; Kwak, Hae-Ryun; Kim, Jae-Deok; Park, Jungan; Auh, Chung-Kyoon; Kim, Dae-Hyun; Lee, Kyeong-Yeoll; Lee, Sukchan; Choi, Hong-Soo
2010-11-01
Tomato yellow leaf curl virus (TYLCV) is a member of the genus Begomovirus of the family Geminiviridae, members of which are characterized by closed circular single-stranded DNA genomes of 2.7-2.8 kb in length, and include viruses transmitted by the Bemisia tabaci whitefly. No reports of TYLCV in Korea are available prior to 2008, after which TYLCV spread rapidly to most regions of the southern Korean peninsula (Gyeongsang-Do, Jeolla-Do and Jeju-Do). Fifty full sequences of TYLCV were analyzed in this study, and the AC1, AV1, IR, and full sequences were analyzed with the MUSCLE program and Bayesian analysis. Phylogenetic analysis demonstrated that the Korea TYLCVs were divided into two subgroups. The TYLCV Korea 1 group (Masan) originated from TYLCV Japan (Miyazaki) and the TYLCV Korea 2 group (Jeju/Jeonju) from TYLCV Japan (Tosa/Haruno). A B. tabaci phylogenetic tree was constructed with 16S rRNA and mitochondrial cytochrome oxidase I (MtCOI) sequences using the MUSCLE program and the neighbor-joining algorithm in MEGA 4.0. The sequence data of 16S rRNA revealed that Korea B. tabaci was closely aligned to B. tabaci isolated in Iran and Nigeria. The Q type of B. tabaci, which was originally identified as a viruliferous insect in 2008, was initially isolated in Korea as a non-viruliferous insect in 2005. Therefore, we suggest that two TYLCV Japan isolates were introduced to Korea via different routes, and then transmitted by native B. tabaci.
Meerow, Alan W.; Noblick, Larry; Borrone, James W.; Couvreur, Thomas L. P.; Mauro-Herrera, Margarita; Hahn, William J.; Kuhn, David N.; Nakamura, Kyoko; Oleas, Nora H.; Schnell, Raymond J.
2009-01-01
Background: The Cocoseae is one of 13 tribes of Arecaceae subfam. Arecoideae, and contains a number of palms with significant economic importance, including the monotypic and pantropical Cocos nucifera L., the coconut, the origins of which have been one of the “abominable mysteries” of palm systematics for decades. Previous studies with predominantly plastid genes weakly supported American ancestry for the coconut but ambiguous sister relationships. In this paper, we use multiple single copy nuclear loci to address the phylogeny of the Cocoseae subtribe Attaleinae, and resolve the closest extant relative of the coconut. Methodology/Principal Findings: We present the results of combined analysis of DNA sequences of seven WRKY transcription factor loci across 72 samples of Arecaceae tribe Cocoseae subtribe Attaleinae, representing all genera classified within the subtribe, and three outgroup taxa with maximum parsimony, maximum likelihood, and Bayesian approaches, producing highly congruent and well-resolved trees that robustly identify the genus Syagrus as sister to Cocos and resolve novel and well-supported relationships among the other genera of the Attaleinae. We also address incongruence among the gene trees with gene tree reconciliation analysis, and assign estimated ages to the nodes of our tree. Conclusions/Significance: This study represents the as yet most extensive phylogenetic analyses of Cocoseae subtribe Attaleinae. We present a well-resolved and supported phylogeny of the subtribe that robustly indicates a sister relationship between Cocos and Syagrus. This is not only of biogeographic interest, but will also open fruitful avenues of inquiry regarding evolution of functional genes useful for crop improvement. Establishment of two major clades of American Attaleinae occurred in the Oligocene (ca. 37 MYBP) in Eastern Brazil. The divergence of Cocos from Syagrus is estimated at 35 MYBP. The biogeographic and morphological congruence that we see for
Variational Bayesian causal connectivity analysis for fMRI.
Luessi, Martin; Babacan, S Derin; Molina, Rafael; Booth, James R; Katsaggelos, Aggelos K
2014-01-01
The ability to accurately estimate effective connectivity among brain regions from neuroimaging data could help answer many open questions in neuroscience. We propose a method which uses causality to obtain a measure of effective connectivity from fMRI data. The method uses a vector autoregressive model for the latent variables describing neuronal activity in combination with a linear observation model based on a convolution with a hemodynamic response function. Due to the employed modeling, it is possible to efficiently estimate all latent variables of the model using a variational Bayesian inference algorithm. The computational efficiency of the method enables us to apply it to large scale problems with high sampling rates and several hundred regions of interest. We use a comprehensive empirical evaluation with synthetic and real fMRI data to evaluate the performance of our method under various conditions.
FABADA: a Fitting Algorithm for Bayesian Analysis of DAta
NASA Astrophysics Data System (ADS)
Pardo, L. C.; Rovira-Esteva, M.; Busch, S.; Ruiz-Martin, M. D.; Tamarit, J. Ll
2011-10-01
The fit of data using a mathematical model is the standard way to know if the model describes data correctly and to obtain parameters that describe the physical processes hidden behind the experimental results. This is usually done by means of a χ2 minimization procedure. Although this procedure is fast and quite reliable for simple models, it has many drawbacks when dealing with complicated problems such as models with many or correlated parameters. We present here a Bayesian method to explore the parameter space guided only by the probability laws underlying the χ2 figure of merit. The presented method does not get stuck in local minima of the χ2 landscape as it usually happens with classical minimization procedures. Moreover correlations between parameters are taken into account in a natural way. Finally, parameters are obtained as probability distribution functions so that all the complexity of the parameter space is shown.
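The idea of exploring parameter space with the probability implied by χ2 can be illustrated with a small random-walk Metropolis sampler. This is a minimal sketch and not the FABADA implementation: the straight-line model, the toy data, the step sizes, and every function name are assumptions chosen for illustration, and the likelihood is assumed to be proportional to exp(-χ2/2).

```python
import math
import random


def chi2(params, xs, ys, sigma):
    """Chi-square of a straight line y = slope * x + intercept (toy model)."""
    slope, intercept = params
    return sum(((y - (slope * x + intercept)) / sigma) ** 2 for x, y in zip(xs, ys))


def metropolis(xs, ys, sigma, start, steps, n_iter, seed=0):
    """Random-walk Metropolis: accept a move with probability min(1, exp(-delta_chi2 / 2))."""
    rng = random.Random(seed)
    current = list(start)
    current_chi2 = chi2(current, xs, ys, sigma)
    chain = []
    for _ in range(n_iter):
        # Propose a Gaussian step in each parameter (step sizes are assumptions).
        proposal = [p + rng.gauss(0.0, s) for p, s in zip(current, steps)]
        proposal_chi2 = chi2(proposal, xs, ys, sigma)
        delta = proposal_chi2 - current_chi2
        if delta <= 0 or rng.random() < math.exp(-0.5 * delta):
            current, current_chi2 = proposal, proposal_chi2
        chain.append(tuple(current))
    return chain


# Hypothetical data: y = 2x + 1 plus small fixed residuals, with sigma = 0.5.
XS = list(range(10))
RESIDUALS = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.0]
YS = [2.0 * x + 1.0 + r for x, r in zip(XS, RESIDUALS)]

chain = metropolis(XS, YS, sigma=0.5, start=(0.0, 0.0), steps=(0.03, 0.15), n_iter=20000)
samples = chain[2000:]  # discard burn-in
mean_slope = sum(s[0] for s in samples) / len(samples)
mean_intercept = sum(s[1] for s in samples) / len(samples)
print(f"slope ~ {mean_slope:.2f}, intercept ~ {mean_intercept:.2f}")
```

Because the chain keeps every visited point, the retained samples approximate the joint distribution of slope and intercept, so their correlation can be read directly from the samples rather than from a single best-fit point.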
Variational Bayesian causal connectivity analysis for fMRI
Luessi, Martin; Babacan, S. Derin; Molina, Rafael; Booth, James R.; Katsaggelos, Aggelos K.
2014-01-01
The ability to accurately estimate effective connectivity among brain regions from neuroimaging data could help answer many open questions in neuroscience. We propose a method which uses causality to obtain a measure of effective connectivity from fMRI data. The method uses a vector autoregressive model for the latent variables describing neuronal activity in combination with a linear observation model based on a convolution with a hemodynamic response function. Due to the employed modeling, it is possible to efficiently estimate all latent variables of the model using a variational Bayesian inference algorithm. The computational efficiency of the method enables us to apply it to large scale problems with high sampling rates and several hundred regions of interest. We use a comprehensive empirical evaluation with synthetic and real fMRI data to evaluate the performance of our method under various conditions. PMID:24847244
Evolutionary ecology of specialization: insights from phylogenetic analysis
Vamosi, Jana C.; Armbruster, W. Scott; Renner, Susanne S.
2014-01-01
In this Special feature, we assemble studies that illustrate phylogenetic approaches to studying salient questions regarding the effect of specialization on lineage diversification. The studies use an array of techniques involving a wide-ranging collection of biological systems (plants, butterflies, fish and amphibians are all represented). Their results reveal that macroevolutionary examination of specialization provides insight into the patterns of trade-offs in specialized systems; in particular, the genetic mechanisms of trade-offs appear to extend to very different aspects of life history in different groups. In turn, because a species may be a specialist from one perspective and a generalist in others, these trade-offs influence whether we perceive specialization to have effects on the evolutionary success of a lineage when we examine specialization only along a single axis. Finally, how geographical range influences speciation and extinction of specialist lineages remains a question offering much potential for further insight. PMID:25274367
Assigning protein functions by comparative genome analysis protein phylogenetic profiles
Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.
2003-05-13
A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
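The phylogenetic profile idea described above can be sketched in a few lines. This is an illustrative example under stated assumptions, not the patented method: the genome names, protein labels, and the rule "identical profiles suggest a functional link" are hypothetical simplifications.

```python
from collections import defaultdict


def build_profiles(genome_contents):
    """Map each protein to a presence/absence tuple over the genomes (sorted by name)."""
    genomes = sorted(genome_contents)
    proteins = sorted(set().union(*genome_contents.values()))
    return {
        protein: tuple(int(protein in genome_contents[g]) for g in genomes)
        for protein in proteins
    }


def group_by_profile(profiles):
    """Group proteins whose profiles are identical; a shared profile hints at a functional link."""
    groups = defaultdict(list)
    for protein, profile in profiles.items():
        groups[profile].append(protein)
    return {profile: names for profile, names in groups.items() if len(names) > 1}


# Hypothetical genomes and protein families, for illustration only.
GENOMES = {
    "genome_A": {"P1", "P2", "P3"},
    "genome_B": {"P1", "P2"},
    "genome_C": {"P3", "P4"},
    "genome_D": {"P1", "P2", "P4"},
}

profiles = build_profiles(GENOMES)
linked = group_by_profile(profiles)
print(profiles["P1"])  # (1, 1, 0, 1)
print(linked)          # {(1, 1, 0, 1): ['P1', 'P2']}
```

Here P1 and P2 are present in exactly the same genomes, so they share a profile and would be flagged as candidates for a functional link. A less strict comparison (for example a Hamming distance threshold) is a natural variation.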
Micronutrients in HIV: A Bayesian Meta-Analysis
Carter, George M.; Indyk, Debbie; Johnson, Matthew; Andreae, Michael; Suslov, Kathryn; Busani, Sudharani; Esmaeili, Aryan; Sacks, Henry S.
2015-01-01
Background Approximately 28.5 million people living with HIV are eligible for treatment (CD4<500), but currently have no access to antiretroviral therapy. Reduced serum level of micronutrients is common in HIV disease. Micronutrient supplementation (MNS) may mitigate disease progression and mortality. Objectives We synthesized evidence on the effect of micronutrient supplementation on mortality and rate of disease progression in HIV disease. Methods We searched MEDLINE, EMBASE, the Cochrane Central, AMED and CINAHL databases through December 2014, without language restriction, for studies of greater than 3 micronutrients versus any or no comparator. We built a hierarchical Bayesian random effects model to synthesize results. Inferences are based on the posterior distribution of the population effects; posterior distributions were approximated by Markov chain Monte Carlo in OpenBugs. Principal Findings From 2166 initial references, we selected 49 studies for full review and identified eight reporting on disease progression and/or mortality. Bayesian synthesis of data from 2,249 adults in three studies estimated the relative risk of disease progression in subjects on MNS vs. control as 0.62 (95% credible interval, 0.37, 0.96). Median number needed to treat is 8.4 (4.8, 29.9) and the Bayes Factor 53.4. Based on data reporting on 4,095 adults reporting mortality in 7 randomized controlled studies, the RR was 0.84 (0.38, 1.85), NNT is 25 (4.3, ∞). Conclusions MNS significantly and substantially slows disease progression in HIV+ adults not on ARV, and possibly reduces mortality. Micronutrient supplements are effective in reducing progression with a posterior probability of 97.9%. Considering MNS low cost and lack of adverse effects, MNS should be standard of care for HIV+ adults not yet on ARV. PMID:25830916
Phylogenetic analysis and development of probes for differentiating methylotrophic bacteria.
Brusseau, G A; Bulygina, E S; Hanson, R S
1994-01-01
Fifteen small-subunit rRNAs from methylotrophic bacteria have been sequenced. Comparisons of these sequences with 22 previously published sequences further defined the phylogenetic relationships among these bacteria and illustrated the agreement between phylogeny and physiological characteristics of the bacteria. Phylogenetic trees were constructed with 16S rRNA sequences from methylotrophic bacteria and representative organisms from subdivisions within the class Proteobacteria on the basis of sequence similarities by using a weighted least-mean-square difference method. The methylotrophs have been separated into coherent clusters in which bacteria shared physiological characteristics. The clusters distinguished bacteria which used either the ribulose monophosphate or serine pathway for carbon assimilation. In addition, methanotrophs and methylotrophs which do not utilize methane were found to form distinct clusters within these groups. Five new deoxyoligonucleotide probes were designed, synthesized, labelled with digoxigenin-11-ddUTP, and tested for the ability to hybridize to RNA extracted from the bacteria represented in the unique clusters and for the ability to detect RNAs purified from soils enriched for methanotrophs by exposure to a methane-air atmosphere for one month. The 16S rRNA purified from soil hybridized to the probe which was complementary to sequences present in 16S rRNA from serine pathway methanotrophs and hybridized to a lesser extent with a probe complementary to sequences in 16S rRNAs of ribulose monophosphate pathway methanotrophs. The nonradioactive detection system used performed reliably at amounts of RNA from pure cultures as small as 10 ng. PMID:7510941
NASA Astrophysics Data System (ADS)
Figueira, P.; Faria, J. P.; Adibekyan, V. Zh.; Oshagh, M.; Santos, N. C.
2016-11-01
We apply the Bayesian framework to assess the presence of a correlation between two quantities. To do so, we estimate the probability distribution of the parameter of interest, ρ, characterizing the strength of the correlation. We provide an implementation of these ideas and concepts using python programming language and the pyMC module in a very short (~130 lines of code, heavily commented) and user-friendly program. We used this tool to assess the presence and properties of the correlation between planetary surface gravity and stellar activity level as measured by the $\log(R'_{\mathrm{HK}})$ indicator. The results of the Bayesian analysis are qualitatively similar to those obtained via p-value analysis, and support the presence of a correlation in the data. The results are more robust in their derivation and more informative, revealing interesting features such as asymmetric posterior distributions or markedly different credible intervals, and allowing for a deeper exploration. We encourage the reader interested in this kind of problem to apply our code to his/her own scientific problems. The full understanding of what the Bayesian framework is can only be gained through the insight that comes by handling priors, assessing the convergence of Monte Carlo runs, and a multitude of other practical problems. We hope to contribute so that Bayesian analysis becomes a tool in the toolkit of researchers, and they understand by experience its advantages and limitations.
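Estimating a posterior distribution for ρ can be sketched without MCMC by evaluating the posterior on a grid. This is not the authors' pyMC program: it is a simplified illustration that assumes standardized data, a bivariate normal likelihood with zero means and unit variances, and a flat prior on ρ over (-1, 1). The data and all function names are hypothetical.

```python
import math


def standardize(values):
    """Center and scale values to zero mean and unit (population) variance."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]


def rho_posterior(x, y):
    """Grid posterior of rho under a flat prior and a standardized bivariate normal likelihood."""
    zx, zy = standardize(x), standardize(y)
    n = len(zx)
    sxx = sum(a * a for a in zx)
    syy = sum(b * b for b in zy)
    sxy = sum(a * b for a, b in zip(zx, zy))
    grid = [i / 100 for i in range(-99, 100)]  # rho from -0.99 to 0.99
    log_post = []
    for rho in grid:
        one_minus = 1.0 - rho * rho
        log_post.append(
            -0.5 * n * math.log(one_minus)
            - (sxx - 2.0 * rho * sxy + syy) / (2.0 * one_minus)
        )
    peak = max(log_post)  # subtract the peak before exponentiating, for stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]


def summarize(grid, post, mass=0.95):
    """Posterior mean and an equal-tailed credible interval read off the grid."""
    mean = sum(r * p for r, p in zip(grid, post))
    tail = (1.0 - mass) / 2.0
    cumulative, lower, upper = 0.0, None, None
    for r, p in zip(grid, post):
        cumulative += p
        if lower is None and cumulative >= tail:
            lower = r
        if upper is None and cumulative >= 1.0 - tail:
            upper = r
    return mean, lower, upper


# Hypothetical, strongly correlated toy data.
X = [1, 2, 3, 4, 5, 6, 7, 8]
Y = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.8, 8.2]

grid, post = rho_posterior(X, Y)
mean, lower, upper = summarize(grid, post)
print(f"posterior mean of rho ~ {mean:.2f}, 95% interval [{lower}, {upper}]")
```

The whole posterior is available, so asymmetry of the distribution around its peak is visible directly, which a single p-value does not convey.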
Buddhavarapu, Prasad; Smit, Andre F; Prozzi, Jorge A
2015-07-01
Permeable friction course (PFC), a porous hot-mix asphalt, is typically applied to improve wet weather safety on high-speed roadways in Texas. In order to warrant expensive PFC construction, a statistical evaluation of its safety benefits is essential. Generally, the literature on the effectiveness of porous mixes in reducing wet-weather crashes is limited and often inconclusive. In this study, the safety effectiveness of PFC was evaluated using a fully Bayesian before-after safety analysis. First, two groups of road segments overlaid with PFC and non-PFC material were identified across Texas; the non-PFC or reference road segments selected were similar to their PFC counterparts in terms of site specific features. Second, a negative binomial data generating process was assumed to model the underlying distribution of crash counts of PFC and reference road segments to perform Bayesian inference on the safety effectiveness. A data-augmentation based computationally efficient algorithm was employed for a fully Bayesian estimation. The statistical analysis shows that PFC is not effective in reducing wet weather crashes. It should be noted that the findings of this study are in agreement with the existing literature, although these studies were not based on a fully Bayesian statistical analysis. Our study suggests that the safety effectiveness of PFC road surfaces, or any other safety infrastructure, largely relies on its interrelationship with the road user. The results suggest that the safety infrastructure must be properly used to reap the benefits of the substantial investments.
Schmidt, Paul; Schmid, Volker J; Gaser, Christian; Buck, Dorothea; Bührlen, Susanne; Förschler, Annette; Mühlau, Mark
2013-01-01
Aiming at iron-related T2-hypointensity, which is related to normal aging and neurodegenerative processes, we here present two practicable approaches, based on Bayesian inference, for preprocessing and statistical analysis of a complex set of structural MRI data. In particular, Markov Chain Monte Carlo methods were used to simulate posterior distributions. First, we rendered a segmentation algorithm that uses outlier detection based on model checking techniques within a Bayesian mixture model. Second, we rendered an analytical tool comprising a Bayesian regression model with smoothness priors (in the form of Gaussian Markov random fields) mitigating the necessity to smooth data prior to statistical analysis. For validation, we used simulated data and MRI data of 27 healthy controls (age: [Formula: see text]; range, [Formula: see text]). We first observed robust segmentation of both simulated T2-hypointensities and gray-matter regions known to be T2-hypointense. Second, simulated data and images of segmented T2-hypointensity were analyzed. We found not only robust identification of simulated effects but also a biologically plausible age-related increase of T2-hypointensity primarily within the dentate nucleus but also within the globus pallidus, substantia nigra, and red nucleus. Our results indicate that fully Bayesian inference can successfully be applied for preprocessing and statistical analysis of structural MRI data.
Figueira, P; Faria, J P; Adibekyan, V Zh; Oshagh, M; Santos, N C
2016-11-01
We apply the Bayesian framework to assess the presence of a correlation between two quantities. To do so, we estimate the probability distribution of the parameter of interest, ρ, characterizing the strength of the correlation. We provide an implementation of these ideas and concepts using python programming language and the pyMC module in a very short (~130 lines of code, heavily commented) and user-friendly program. We used this tool to assess the presence and properties of the correlation between planetary surface gravity and stellar activity level as measured by the $\log(R'_{\mathrm{HK}})$ indicator. The results of the Bayesian analysis are qualitatively similar to those obtained via p-value analysis, and support the presence of a correlation in the data. The results are more robust in their derivation and more informative, revealing interesting features such as asymmetric posterior distributions or markedly different credible intervals, and allowing for a deeper exploration. We encourage the reader interested in this kind of problem to apply our code to his/her own scientific problems. The full understanding of what the Bayesian framework is can only be gained through the insight that comes by handling priors, assessing the convergence of Monte Carlo runs, and a multitude of other practical problems. We hope to contribute so that Bayesian analysis becomes a tool in the toolkit of researchers, and they understand by experience its advantages and limitations.
Kwon, Deukwoo; Hoffman, F Owen; Moroz, Brian E; Simon, Steven L
2016-02-10
Most conventional risk analysis methods rely on a single best estimate of exposure per person, which does not allow for adjustment for exposure-related uncertainty. Here, we propose a Bayesian model averaging method to properly quantify the relationship between radiation dose and disease outcomes by accounting for shared and unshared uncertainty in estimated dose. Our Bayesian risk analysis method utilizes multiple realizations of sets (vectors) of doses generated by a two-dimensional Monte Carlo simulation method that properly separates shared and unshared errors in dose estimation. The exposure model used in this work is taken from a study of the risk of thyroid nodules among a cohort of 2376 subjects who were exposed to fallout from nuclear testing in Kazakhstan. We assessed the performance of our method through an extensive series of simulations and comparisons against conventional regression risk analysis methods. When the estimated doses contain relatively small amounts of uncertainty, the Bayesian method using multiple a priori plausible draws of dose vectors gave similar results to the conventional regression-based methods of dose-response analysis. However, when large and complex mixtures of shared and unshared uncertainties are present, the Bayesian method using multiple dose vectors had significantly lower relative bias than conventional regression-based risk analysis methods and better coverage, that is, a markedly increased capability to include the true risk coefficient within the 95% credible interval of the Bayesian-based risk estimate. An evaluation of the dose-response using our method is presented for an epidemiological study of thyroid disease following radiation exposure.
Ling, Cheng; Hamada, Tsuyoshi; Gao, Jingyang; Zhao, Guoguang; Sun, Donghong; Shi, Weifeng
2016-01-01
MrBayes is a widespread phylogenetic inference tool harnessing empirical evolutionary models and Bayesian statistics. However, the computational cost on the likelihood estimation is very expensive, resulting in undesirably long execution time. Although a number of multi-threaded optimizations have been proposed to speed up MrBayes, there are bottlenecks that severely limit the GPU thread-level parallelism of likelihood estimations. This study proposes a high performance and resource-efficient method for GPU-oriented parallelization of likelihood estimations. Instead of having to rely on empirical programming, the proposed novel decomposition storage model implements high performance data transfers implicitly. In terms of performance improvement, a speedup factor of up to 178 can be achieved on the analysis of simulated datasets by four Tesla K40 cards. In comparison to the other publicly available GPU-oriented MrBayes, the tgMC(3)++ method (proposed herein) outperforms the tgMC(3) (v1.0), nMC(3) (v2.1.1) and oMC(3) (v1.00) methods by speedup factors of up to 1.6, 1.9 and 2.9, respectively. Moreover, tgMC(3)++ supports more evolutionary models and gamma categories, which previous GPU-oriented methods fail to take into analysis.
Application of Principal Component Analysis and Bayesian Decomposition to Relaxographic Imaging
Ochs, M.F.; Stoyanova, R.S.; Brown, T.R.; Rooney, W.D.; Li, X.; Lee, J.H.; Springer, C.S.
1999-05-22
Recent developments in high field imaging have made possible the acquisition of high quality, low noise relaxographic data in reasonable imaging times. The datasets comprise a huge amount of information (>>1 million points) which makes rigorous analysis daunting. Here, the authors present results demonstrating that Principal Component Analysis (PCA) and Bayesian Decomposition (BD) provide powerful methods for relaxographic analysis of T1 recovery curves and editing of tissue type in resulting images.
NASA Astrophysics Data System (ADS)
Iskandar, Ismed; Satria Gondokaryono, Yudi
2016-02-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that systems are described and explained as simply functioning or failed. In many real situations, failures may arise from many causes, depending upon the age and the environment of the system and its components. Another problem in reliability theory is that of estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are quantitatively inadequate, especially in engineering design and maintenance systems. Bayesian analyses are more beneficial than classical ones in such cases. Bayesian estimation allows us to combine past knowledge or experience, in the form of an a priori distribution, with life test data to make inferences about the parameter of interest. In this paper, we have investigated the application of Bayesian estimation analyses to competing risk systems. The cases are limited to models with independent causes of failure, using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample sizes. The simulation data are analyzed using Bayesian and maximum likelihood analyses. The simulation results show that a change in the true value of one parameter relative to another changes the value of the standard deviation in the opposite direction. With perfect information on the prior distribution, the Bayesian estimation methods perform better than maximum likelihood. The sensitivity analyses show some sensitivity to shifts of the prior locations. They also show the robustness of the Bayesian analysis within the range
Toward an ecological analysis of Bayesian inferences: how task characteristics influence responses
Hafenbrädl, Sebastian; Hoffrage, Ulrich
2015-01-01
In research on Bayesian inferences, the specific tasks, with their narratives and characteristics, are typically seen as exchangeable vehicles that merely transport the structure of the problem to research participants. In the present paper, we explore whether, and possibly how, task characteristics that are usually ignored influence participants’ responses in these tasks. We focus on both quantitative dimensions of the tasks, such as their base rates, hit rates, and false-alarm rates, as well as qualitative characteristics, such as whether the task involves a norm violation or not, whether the stakes are high or low, and whether the focus is on the individual case or on the numbers. Using a data set of 19 different tasks presented to 500 different participants who provided a total of 1,773 responses, we analyze these responses in two ways: first, on the level of the numerical estimates themselves, and second, on the level of various response strategies, Bayesian and non-Bayesian, that might have produced the estimates. We identified various contingencies, and most of the task characteristics had an influence on participants’ responses. Typically, this influence has been stronger when the numerical information in the tasks was presented in terms of probabilities or percentages, compared to natural frequencies – and this effect cannot be fully explained by a higher proportion of Bayesian responses when natural frequencies were used. One characteristic that did not seem to influence participants’ response strategy was the numerical value of the Bayesian solution itself. Our exploratory study is a first step toward an ecological analysis of Bayesian inferences, and highlights new avenues for future research. PMID:26300791
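The two presentation formats compared in such tasks lead to the same Bayesian solution, which a short calculation makes concrete. The sketch below is an illustration only: the numbers (a 1% base rate, an 80% hit rate, a 10% false-alarm rate, and a reference population of 1,000) are hypothetical and not taken from the study's tasks, and the function names are assumptions.

```python
def posterior_from_probabilities(base_rate, hit_rate, false_alarm_rate):
    """Bayes' rule with probabilities: P(H | D) = P(H) P(D | H) / P(D)."""
    true_positives = base_rate * hit_rate
    false_positives = (1.0 - base_rate) * false_alarm_rate
    return true_positives / (true_positives + false_positives)


def posterior_from_frequencies(hits, false_alarms):
    """Same quantity with natural frequencies: hits out of all positive cases."""
    return hits / (hits + false_alarms)


# Probability format (hypothetical task): 1% base rate, 80% hit rate, 10% false alarms.
p_prob = posterior_from_probabilities(0.01, 0.80, 0.10)

# Natural frequency format of the same task: of 1,000 cases, 10 have the condition
# and 8 of them test positive; of the 990 without it, 99 test positive.
p_freq = posterior_from_frequencies(8, 99)

print(f"probability format:       {p_prob:.4f}")
print(f"natural frequency format: {p_freq:.4f}")
```

Both routes give 8/107, roughly 0.075. The frequency version needs only one division over counts that are already stated, which is one common explanation for why that format is easier for participants.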
Molecular phylogenetic analysis of Fasciola flukes from eastern India.
Hayashi, Kei; Ichikawa-Seki, Madoka; Mohanta, Uday Kumar; Singh, T Shantikumar; Shoriki, Takuya; Sugiyama, Hiromu; Itagaki, Tadashi
2015-10-01
Fasciola flukes from eastern India were characterized on the basis of spermatogenesis status and nuclear ITS1. Both Fasciola gigantica and aspermic Fasciola flukes were detected in Imphal, Kohima, and Gantoku districts. The sequences of mitochondrial nad1 were analyzed to infer their phylogenetical relationship with neighboring countries. The haplotypes of aspermic Fasciola flukes were identical or showed a single nucleotide substitution compared to those from populations in the neighboring countries, corroborating the previous reports that categorized them in the same lineage. However, the prevalence of aspermic Fasciola flukes in eastern India was lower than those in the neighboring countries, suggesting that they have not dispersed throughout eastern India. In contrast, F. gigantica was predominant and well diversified, and the species was thought to be distributed in the area for a longer time than the aspermic Fasciola flukes. Fasciola gigantica populations from eastern India were categorized into two distinct haplogroups A and B. The level of their genetic diversity suggests that populations belonging to haplogroup A have dispersed from the west side of the Indian subcontinent to eastern India with the artificial movement of domestic cattle, Bos indicus, whereas populations belonging to haplogroup B might have spread from Myanmar to eastern India with domestic buffaloes, Bubalus bubalis.
[Phylogenetic analysis and expression patterns of tropomyosin in amphioxus].
Li, Xin-Yi; Lin, Yu-Shuang; Zhang, Hong-Wei
2012-08-01
In amphioxus, we found a mesoderm-related gene, tropomyosin, which encodes a protein comprising 284 amino acid residues and shares high identity with other known Tropomyosin proteins in both vertebrates and invertebrates. Phylogenetically, amphioxus Tropomyosin fell outside the invertebrate clade and was at the base of the vertebrate protein family clade, indicating that it may represent an independent branch. From the early neurula to the larval stage, whole-mount in situ hybridization and histological sections detected transcripts of the amphioxus tropomyosin gene. Weak tropomyosin expression was first detected in the wall of the archenteron at about the 10 hours-post-fertilization neurula stage, while intense expression was revealed in the differentiating presumptive notochord and the muscle. Transcripts of tropomyosin were then expressed in the formed notochord and somites. Gene expression seemed to continue in these developing organs throughout the neurula stages and remained until 72 hours, during the early larval stages. In situ hybridization also showed that tropomyosin was expressed in the neural tube, hepatic diverticulum, notochord and the spaces between myotomes in adult amphioxus. Our results indicate that tropomyosin may play an important role in both embryonic development and adult life.
Molecular characterization and phylogenetic analysis of Fasciola hepatica from Peru.
Ichikawa-Seki, Madoka; Ortiz, Pedro; Cabrera, Maria; Hobán, Cristian; Itagaki, Tadashi
2016-06-01
The causative agent of fasciolosis in South America is thought to be Fasciola hepatica. In this study, Fasciola flukes from Peru were analyzed to investigate their genetic structure and phylogenetic relationships with those from other countries. Fasciola flukes were collected from the three definitive host species: cattle, sheep, and pigs. They were identified as F. hepatica because mature sperms were observed in their seminal vesicles, and also they displayed Fh type, which has an identical fragment pattern to F. hepatica in the nuclear internal transcribed spacer 1. Eight haplotypes were obtained from the mitochondrial NADH dehydrogenase subunit 1 (nad1) sequences of Peruvian F. hepatica; however, no special difference in genetic structure was observed between the three host species. Its extremely low genetic diversity suggests that the Peruvian population was introduced from other regions. Nad1 haplotypes identical to those of Peruvian F. hepatica were detected in China, Uruguay, Italy, Iran, and Australia. Our results indicate that F. hepatica rapidly expanded its range due to human migration. Future studies are required to elucidate dispersal route of F. hepatica from Europe, its probable origin, to other areas, including Peru.
Phylogenetic Analysis of Canine Parvovirus VP2 Gene in China.
Yi, L; Tong, M; Cheng, Y; Song, W; Cheng, S
2016-04-01
In this study, a total of 37 samples (58.0%) were found through PCR assay to be positive for canine parvovirus (CPV) of 66 suspected faecal samples of dogs collected from various cities throughout China. Eight CPV isolates could be obtained in the CRFK cell line. The sequencing of the VP2 gene of CPV identified the predominant CPV strain as CPV-2a (Ser297Ala), with two CPV-2b (Ser297Ala). Sequence comparison revealed homologies of 99.3-99.9%, 99.9% and 99.3-99.7% within the CPV 2a isolates, within the CPV 2b isolates and between the CPV 2a and 2b isolates, respectively. In addition, several non-synonymous and synonymous mutations were also recorded. The phylogenetic tree revealed that most of the CPV strains from different areas in China were located in the formation of a large branch, which were grouped together along with the KU143-09 strain from Thailand and followed the same evolution. In this study, we provide an updated molecular characterization of CPV 2 circulation in China.
Bayesian inversion analysis of nonlinear dynamics in surface heterogeneous reactions
NASA Astrophysics Data System (ADS)
Omori, Toshiaki; Kuwatani, Tatsu; Okamoto, Atsushi; Hukushima, Koji
2016-09-01
It is essential to extract nonlinear dynamics from time-series data as an inverse problem in natural sciences. We propose a Bayesian statistical framework for extracting nonlinear dynamics of surface heterogeneous reactions from sparse and noisy observable data. Surface heterogeneous reactions are chemical reactions with conjugation of multiple phases, and they have the intrinsic nonlinearity of their dynamics caused by the effect of surface-area between different phases. We adapt a belief propagation method and an expectation-maximization (EM) algorithm to partial observation problem, in order to simultaneously estimate the time course of hidden variables and the kinetic parameters underlying dynamics. The proposed belief propagation method is performed by using sequential Monte Carlo algorithm in order to estimate nonlinear dynamical system. Using our proposed method, we show that the rate constants of dissolution and precipitation reactions, which are typical examples of surface heterogeneous reactions, as well as the temporal changes of solid reactants and products, were successfully estimated only from the observable temporal changes in the concentration of the dissolved intermediate product.
Bayesian Analysis of the Mass Distribution of Neutron Stars
NASA Astrophysics Data System (ADS)
Valentim, Rodolfo; Horvath, Jorge E.; Rangel, Eraldo M.
The distribution of neutron star masses is analyzed using Bayesian statistical inference, evaluating the likelihood of an a priori distribution with two Gaussian peaks using fifty-five measured points obtained in a variety of systems. The results strongly suggest the existence of a bimodal distribution of the masses, with the first peak around 1.35M⊙ ± 0.06M⊙ and a much wider second peak at 1.73M⊙ ± 0.36M⊙. We compared the two-Gaussian model centered at 1.35M⊙ and 1.55M⊙ against a "single Gaussian" model with 1.50M⊙ ± 0.11M⊙ using 3σ, which provided a wide peak covering the full range of observed masses. To compare the models, the BIC (Bayesian Information Criterion) was used, and strong evidence for the two-distribution model over the one-peak model was found. The results support earlier views related to the different evolutionary histories of the members of the first two peaks, which produces a natural separation (even though no attempt to "label" the systems has been made). However, the recently claimed low-mass group, possibly related to O-Mg-Ne core collapse events, has a monotonically decreasing likelihood and has not been identified within this sample.
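The BIC comparison between a one-peak and a two-peak model can be sketched as follows, using BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of data points, and L the maximized likelihood (lower BIC is preferred). The example is an assumption-laden illustration: the data are synthetic toy values rather than neutron star masses, and the two-Gaussian parameters come from a hard split at a hand-picked threshold, which is a stand-in for a proper mixture fit.

```python
import math


def gaussian_logpdf(x, mean, sd):
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - (x - mean) ** 2 / (2.0 * sd * sd)


def mle_gaussian(values):
    """Maximum likelihood mean and standard deviation of a single Gaussian."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, sd


def loglik_single(values):
    mean, sd = mle_gaussian(values)
    return sum(gaussian_logpdf(v, mean, sd) for v in values)


def loglik_two(values, split):
    """Two-Gaussian mixture whose components are estimated from a hard split (illustrative shortcut)."""
    low = [v for v in values if v < split]
    high = [v for v in values if v >= split]
    weight = len(low) / len(values)
    (m1, s1), (m2, s2) = mle_gaussian(low), mle_gaussian(high)
    total = 0.0
    for v in values:
        density = weight * math.exp(gaussian_logpdf(v, m1, s1))
        density += (1.0 - weight) * math.exp(gaussian_logpdf(v, m2, s2))
        total += math.log(density)
    return total


def bic(loglik, n_params, n_points):
    return n_params * math.log(n_points) - 2.0 * loglik


# Synthetic bimodal toy data (not measured masses).
TOY = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2]

bic_one = bic(loglik_single(TOY), n_params=2, n_points=len(TOY))            # mean, sd
bic_two = bic(loglik_two(TOY, split=3.0), n_params=5, n_points=len(TOY))    # 2 means, 2 sds, 1 weight
print(f"BIC single Gaussian: {bic_one:.1f}")
print(f"BIC two Gaussians:   {bic_two:.1f}")
```

For clearly bimodal data the penalty for the three extra parameters is outweighed by the gain in likelihood, so the two-Gaussian model has the lower BIC.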
Multi-Class Sparse Bayesian Regression for Neuroimaging Data Analysis
NASA Astrophysics Data System (ADS)
Michel, Vincent; Eger, Evelyn; Keribin, Christine; Thirion, Bertrand
The use of machine learning tools is gaining popularity in neuroimaging, as it provides a sensitive assessment of the information conveyed by brain images. In particular, finding regions of the brain whose functional signal reliably predicts some behavioral information makes it possible to better understand how this information is encoded or processed in the brain. However, such a prediction is performed through regression or classification algorithms that suffer from the curse of dimensionality, because a huge number of features (i.e. voxels) are available to fit some target, with very few samples (i.e. scans) to learn the informative regions. A commonly used solution is to regularize the weights of the parametric prediction function. However, model specification needs a careful design to balance adaptiveness and sparsity. In this paper, we introduce a novel method, Multi-Class Sparse Bayesian Regression (MCBR), that generalizes classical approaches such as Ridge regression and Automatic Relevance Determination. Our approach is based on a grouping of the features into several classes, where each class is regularized with specific parameters. We apply our algorithm to the prediction of a behavioral variable from brain activation images. The method presented here achieves prediction accuracies similar to those of reference methods, and yields more interpretable feature loadings.
Heterogeneous multimodal biomarkers analysis for Alzheimer's disease via Bayesian network.
Jin, Yan; Su, Yi; Zhou, Xiao-Hua; Huang, Shuai
2016-12-01
By 2050, it is estimated that the number of worldwide Alzheimer's disease (AD) patients will quadruple from the current number of 36 million, while no proven disease-modifying treatments are available. At present, the underlying disease mechanisms remain under investigation, and recent studies suggest that the disease involves multiple etiological pathways. To better understand the disease and develop treatment strategies, a number of ongoing studies including the Alzheimer's Disease Neuroimaging Initiative (ADNI) enroll many study participants and acquire a large number of biomarkers from various modalities including demographic, genotyping, fluid biomarkers, neuroimaging, neuropsychometric tests, and clinical assessments. However, a systematic approach that can integrate all the collected data is lacking. The overarching goal of our study is to use machine learning techniques to understand the relationships among different biomarkers and to establish a system-level model that can better describe the interactions among biomarkers and provide superior diagnostic and prognostic information. In this pilot study, we use a Bayesian network (BN) to analyze multimodal data from ADNI, including demographics, volumetric MRI, PET, genotypes, and neuropsychometric measurements, and demonstrate that our approach has superior prediction accuracy.
Li, Yangwei; Shi, Yuhua; Lu, Jiqi; Ji, Weihong; Wang, Zhenlong
2016-11-30
The Mandarin vole (Lasiopodomys mandarinus) is a subterranean rodent that is often used as a model for studying subterranean hypoxic stress in mammals. However, the taxonomy of this species is still in dispute. Mitochondrial DNA (mtDNA) has long been used for phylogenetic reconstruction and, in this study, the complete mitochondrial genome of L. mandarinus mandarinus was sequenced. Our results showed that the mitochondrial genome of L. m. mandarinus is a circular molecule of 16,367 bp, which contains 13 protein-coding genes, 22 tRNA and 2 rRNA genes. Except for the 8 tRNA and ND6 genes, all other mitochondrial genes are encoded on the heavy strand. We also analyzed the phylogenetic position of L. mandarinus with respect to the tribe Arvicolini using the sequences of the complete Cytb gene, 2 rRNA genes and 12 protein-coding genes, with maximum likelihood and Bayesian methods. Our results gave further support to the species status of L. mandarinus and the generic status of Lasiopodomys.
Auguste, Albert J.; Volk, Sara M.; Arrigo, Nicole C.; Martinez, Raymond; Ramkissoon, Vernie; Adams, A. Paige; Thompson, Nadin N.; Adesiyun, Abiodun A.; Chadee, Dave D.; Foster, Jerome E.; Travassos Da Rosa, Amelia P.A.; Tesh, Robert B.; Weaver, Scott C.; Carrington, Christine V. F.
2009-01-01
In the 1950s and 1960s, alphaviruses in the Venezuelan equine encephalitis (VEE) antigenic complex were the most frequently isolated arboviruses in Trinidad. Since then, there has been very little research performed with these viruses. Herein, we report on the isolation, sequencing, and phylogenetic analyses of Mucambo virus (MUCV; VEE complex subtype IIIA), including 6 recently isolated from Culex (Melanoconion) portesi mosquitoes and 11 previously isolated in Trinidad and Brazil. Results show that nucleotide and amino acid identities across the complete structural polyprotein for the MUCV isolates were 96.6 – 100% and 98.7 – 100%, respectively, and the phylogenetic tree inferred for MUCV was highly geographically and temporally structured. Bayesian analyses suggest the sampled MUCV lineages have a recent common ancestry of approximately 198 years (with a 95% highest posterior density (HPD) interval of 63 – 448 years) prior to 2007, and an overall rate of evolution of 1.28 × 10⁻⁴ substitutions/site/yr. PMID:19631956
Ito, Akira; Ishihara, Miki; Imai, Soichi
2014-04-01
Bozasella gracilis n. sp. in the order Entodiniomorphida was found in fecal samples of an Asian elephant kept in a zoo. The ciliate has general and infraciliary similarities to the families Ophryoscolecidae and Cycloposthiidae. Phylogenetic trees were inferred from 18S rRNA gene sequences of B. gracilis, 45 entodiniomorphids, 10 vestibuliferids, 5 macropodiniids, and an outgroup, using maximum likelihood, Bayesian inference, and neighbor joining analyses. Of them, there were 32 new sequences; 26 entodiniomorphid species in the genera, Bozasella, Triplumaria, Gassovskiella, Ditoxum, Spirodinium, Triadinium, Tetratoxum, Pseudoentodinium, Ochoterenaia, Circodinium, Blepharocorys, Sulcoarcus, Didesmis, Alloiozona, Blepharoconus, Hemiprorodon, and Prorodonopsis, and 6 vestibuliferid species in the genera, Buxtonella, Balantidium, Helicozoster, Latteuria, and Paraisotricha. Thirty additional sequences were retrieved from the GenBank database. Phylogenetic trees revealed non-monophylies of the orders Entodiniomorphida and Vestibuliferida, the suborders Entodiniomorphina and Blepharocorythina, and the families Cycloposthiidae and Paraisotrichidae. Bozasella gracilis was sister to Triplumaria. In addition, to avoid homonymy, we propose Gilchristinidae nom. nov., Gilchristina nom. nov. and Gilchristina artemis (Ito, Van Hoven, Miyazaki & Imai, 2006) comb. nov.
A genomic schism in birds revealed by phylogenetic analysis of DNA strings.
Edwards, Scott V; Fertil, Bernard; Giron, Alain; Deschavanne, Patrick J
2002-08-01
The molecular systematics of vertebrates has been based entirely on alignments of primary structures of macromolecules; however, higher order features of DNA sequences not used in traditional studies also contain valuable phylogenetic information. Recent molecular data sets conflict over the phylogenetic placement of flightless birds (ratites - paleognaths), but placement of this clade critically influences interpretation of character change in birds. To help resolve this issue, we applied a new bioinformatics approach to the largest molecular data set currently available. We distilled nearly one megabase (1 million base pairs) of heterogeneous avian genomic DNA from 20 birds and an alligator into genomic signatures, defined as the complete set of frequencies of short sequence motifs (strings), thereby providing a way to directly compare higher order features of nonhomologous DNA sequences. Phylogenetic analysis and principal component analysis of the signatures strongly support the traditional hypothesis of basal ratites and monophyly of the nonratite birds (neognaths) and imply that ratite genomes are linguistically primitive within birds, despite their base compositional similarity to neognath genomes. Our analyses show further that the phylogenetic signal of genomic signatures is strongest among deep splits within vertebrates. Despite clear problems with phylogenetic analysis of genomic signatures, our study raises intriguing issues about the biological and genomic differences that fundamentally differentiate paleognaths and neognaths.
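The "genomic signature" defined above, the complete set of frequencies of short sequence motifs, can be sketched as a k-mer frequency table. This is a minimal illustration, not the authors' pipeline: the abstract does not state the word length or the distance measure, so `k=3`, the Euclidean distance, and the function names are assumptions.

```python
from collections import Counter
from itertools import product

def genomic_signature(sequence, k=3):
    """Return the relative frequency of every DNA word of length k."""
    seq = sequence.upper()
    # Count all overlapping windows of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate the complete set of 4**k motifs so every signature
    # has the same keys; windows with other symbols (e.g. N) are ignored.
    motifs = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in motifs)
    if total == 0:
        raise ValueError("sequence contains no valid words of length k")
    return {m: counts[m] / total for m in motifs}

def signature_distance(sig_a, sig_b):
    """Euclidean distance between two signatures with identical keys."""
    return sum((sig_a[m] - sig_b[m]) ** 2 for m in sig_a) ** 0.5

# Usage: signatures do not require the two sequences to be aligned.
sig_1 = genomic_signature("ACGTACGT", k=2)
sig_2 = genomic_signature("AAAACCCC", k=2)
dist = signature_distance(sig_1, sig_2)
```

Because a signature is a fixed-length vector, distances between signatures can be fed to standard clustering or principal component analysis even when the underlying sequences are nonhomologous.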
Phylogenetic and phylogeographic analysis of Iberian lynx populations.
Johnson, W E; Godoy, J A; Palomares, F; Delibes, M; Fernandes, M; Revilla, E; O'Brien, S J
2004-01-01
The Iberian lynx (Lynx pardinus), one of the world's most endangered cat species, is vulnerable due to habitat loss, increased fragmentation of populations, and precipitous demographic reductions. An understanding of Iberian lynx evolutionary history is necessary to develop rational management plans for the species. Our objectives were to assess Iberian lynx genetic diversity at three evolutionary timescales. First we analyzed mitochondrial DNA (mtDNA) sequence variation to position the Iberian lynx relative to other species of the genus Lynx. We then assessed the pattern of mtDNA variation of isolated populations across the Iberian Peninsula. Finally we estimated levels of gene flow between two of the most important remaining lynx populations (Doñana National Park and the Sierra Morena Mountains) and characterized the extent of microsatellite locus variation in these populations. Phylogenetic analyses of 1613 bp of mtDNA sequence variation support the hypothesis that the Iberian lynx, Eurasian lynx, and Canadian lynx diverged within a short time period around 1.53-1.68 million years ago, and that the Iberian lynx and Eurasian lynx are sister taxa. Relative to most other felid species, genetic variation in mtDNA genes and nuclear microsatellites was reduced in Iberian lynx, suggesting that they experienced a fairly severe demographic bottleneck. In addition, the effects of more recent reductions in gene flow and population size are being manifested in local patterns of molecular genetic variation. These data, combined with recent studies modeling the viability of Iberian lynx populations, should provide greater urgency for the development and implementation of rational in situ and ex situ conservation plans.
Genotyping and phylogenetic analysis of Pneumocystis jirovecii isolates from India.
Gupta, Rashmi; Mirdha, Bijay Ranjan; Guleria, Randeep; Agarwal, Sanjay Kumar; Samantaray, Jyotish Chandra; Kumar, Lalit; Kabra, Sushil Kumar; Luthra, Kalpana; Sreenivas, Vishnubhatla
2010-08-01
Pneumocystis jirovecii is the cause of Pneumocystis pneumonia (PCP) in immuno-compromised individuals. The aim of this study was to describe the genotypes/haplotypes of P. jirovecii in immuno-compromised individuals with positive polymerase chain reaction (PCR) result for PCP. The typing was based on sequence polymorphism at internal transcribed spacer (ITS) regions of rRNA operon. Phylogenetic relationship between Indian and global haplotypes was also studied. Between January 2005 and October 2008, 43 patients were found to be positive for Pneumocystis using PCR targeting mitochondrial large subunit rRNA (mt LSU rRNA) and ITS region. Genotyping of all the positive samples was performed at the ITS locus by direct sequencing. Nine ITS1 alleles (all previously known) and 11 ITS2 alleles (nine previously defined and two new) were observed. A total of 19 ITS haplotypes, including five novel haplotypes (DEL1r, Edel2, Hr, Adel3 and SYD1a), were observed. The most prevalent type was SYD1g (16.3%), followed by types Ea (11.6%), Ec (9.3%), Eg (6.9%), DEL1r (6.9%), Ne (6.9%) and Ai (6.9%). To detect mixed infection, 30% of the positive isolates were cloned and 4-5 clones were sequenced from each specimen. Cloning and sequencing identified two more haplotypes in addition to the 19 types. Mixed infection was identified in 3 of the 13 cloned samples (23.1%). Upon construction of a haplotype network of 21 haplotypes, type Eg was identified as the most probable ancestral type. The present study is the first study that describes the haplotypes of P. jirovecii based on the ITS gene from India. The study suggests a high diversity of P. jirovecii haplotypes in the population.
Sato, Jun J.; Ohdachi, Satoshi D.; Echenique-Diaz, Lazaro M.; Borroto-Páez, Rafael; Begué-Quiala, Gerardo; Delgado-Labañino, Jorge L.; Gámez-Díez, Jorgelino; Alvarez-Lemus, José; Nguyen, Son Truong; Yamaguchi, Nobuyuki; Kita, Masaki
2016-01-01
The Cuban solenodon (Solenodon cubanus) is one of the most enigmatic mammals and is an extremely rare species with a distribution limited to a small part of the island of Cuba. Despite its rarity, in 2012 seven individuals of S. cubanus were captured and sampled successfully for DNA analysis, providing new insights into the evolutionary origin of this species and into the origins of the Caribbean fauna, which remain controversial. We conducted molecular phylogenetic analyses of five nuclear genes (Apob, Atp7a, Bdnf, Brca1 and Rag1; total, 4,602 bp) from 35 species of the mammalian order Eulipotyphla. Based on Bayesian relaxed molecular clock analyses, the family Solenodontidae diverged from other eulipotyphlans in the Paleocene, after the bolide impact on the Yucatan Peninsula, and S. cubanus diverged from the Hispaniolan solenodon (S. paradoxus) in the Early Pliocene. The strikingly recent divergence time estimates suggest that S. cubanus and its ancestral lineage originated via over-water dispersal rather than vicariance events, as had previously been hypothesised. PMID:27498968
PFG NMR and Bayesian analysis to characterise non-Newtonian fluids
NASA Astrophysics Data System (ADS)
Blythe, Thomas W.; Sederman, Andrew J.; Stitt, E. Hugh; York, Andrew P. E.; Gladden, Lynn F.
2017-01-01
Many industrial flow processes are sensitive to changes in the rheological behaviour of process fluids, and there is therefore a need for methods that provide online, or inline, rheological characterisation necessary for process control and optimisation over timescales of minutes or less. Nuclear magnetic resonance (NMR) offers a non-invasive technique for this application, without limitation on optical opacity. We present a Bayesian analysis approach using pulsed field gradient (PFG) NMR to enable estimation of the rheological parameters of Herschel-Bulkley fluids in a pipe flow geometry, characterised by a flow behaviour index n, yield stress τ0, and consistency factor k, by analysis of the signal in q-space. This approach eliminates the need for velocity image acquisition and expensive gradient hardware. We investigate the robustness of the proposed Bayesian NMR approach to noisy data and reduced sampling using simulated NMR data and show that even with a signal-to-noise ratio (SNR) of 100, only 16 points are required to be sampled to provide rheological parameters accurate to within 2% of the ground truth. Experimental validation is provided through an experimental case study on Carbopol 940 solutions (model Herschel-Bulkley fluids) using PFG NMR at a 1H resonance frequency of 85.2 MHz; for SNR > 1000, only 8 points are required to be sampled. This corresponds to a total acquisition time of <60 s and represents an 88% reduction in acquisition time when compared to MR flow imaging. Comparison of the shear stress-shear rate relationship, quantified using Bayesian NMR, with non-Bayesian NMR methods demonstrates that the Bayesian NMR approach is in agreement with MR flow imaging to within the accuracy of the measurement. Furthermore, as we increase the concentration of Carbopol 940 we observe a change in rheological characteristics, probably due to shear history-dependent behaviour and the different geometries used. This behaviour highlights the need for
Puncher, M; Birchall, A; Bull, R K
2014-12-01
In Bayesian inference, the initial knowledge regarding the value of a parameter, before additional data are considered, is represented as a prior probability distribution. This paper describes the derivation of a prior distribution of intake that was used for the Bayesian analysis of plutonium and uranium worker doses in a recent epidemiology study. The chosen distribution is log-normal with a geometric standard deviation of 6 and a median value that is derived for each worker based on the duration of the work history and the number of reported acute intakes. The median value is a function of the work history and a constant related to activity in air concentration, M, which is derived separately for uranium and plutonium. The value of M is based primarily on measurements of plutonium and uranium in air derived from historical personal air sampler (PAS) data. However, there is significant uncertainty in the value of M that results from the paucity of PAS data and from extrapolating these measurements to actual intakes. This paper compares posterior and prior distributions of intake and investigates the sensitivity of the Bayesian analyses to the assumed value of M. It is found that varying M by a factor of 10 results in a much smaller factor of 2 variation in mean intake and lung dose for both plutonium and uranium. It is concluded that if a log-normal distribution is considered to adequately represent worker intakes, then the Bayesian posterior distribution of dose is relatively insensitive to the assumed value of M.
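To give a feel for how broad a log-normal prior with a geometric standard deviation (GSD) of 6 is, the sketch below converts a median and GSD into the parameters of the underlying normal distribution (mu = ln median, sigma = ln GSD) and computes the central 95% interval. The median of 1.0 is an arbitrary placeholder in unspecified units, not a value from the study, and the function names are hypothetical.

```python
import math
import random

def lognormal_params(median, gsd):
    # Parameters of the normal distribution of ln(intake).
    return math.log(median), math.log(gsd)

def lognormal_interval(median, gsd, z=1.959964):
    # Central interval: median divided and multiplied by gsd**z
    # (z = 1.959964 gives the central 95%).
    factor = gsd ** z
    return median / factor, median * factor

def sample_intakes(median, gsd, n, seed=0):
    # Draw n prior samples with a fixed seed for reproducibility.
    mu, sigma = lognormal_params(median, gsd)
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]

# Placeholder median of 1.0 (arbitrary units) with the GSD of 6 from the text.
mu, sigma = lognormal_params(1.0, 6.0)
low, high = lognormal_interval(1.0, 6.0)
samples = sample_intakes(1.0, 6.0, n=1000)
```

With a GSD of 6 the central 95% interval spans roughly a factor of 33 on either side of the median, which shows how weakly informative such a prior is about any individual intake.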
Lavoué, Sébastien; Sullivan, John P
2004-10-01
Fishes of the Superorder Osteoglossomorpha (the "bonytongues") constitute a morphologically heterogeneous group of basal teleosts, including highly derived subgroups such as African electric fishes, the African butterfly fish, and Old World knifefishes. Lack of consensus among hypotheses of osteoglossomorph relationships advanced during the past 30 years may be due in part to the difficulty of identifying shared derived characters among the morphologically differentiated extant families of this group. In this study, we present a novel phylogenetic hypothesis for this group, based on the analysis of more than 4000 characters from five molecular markers (the mitochondrial cytochrome b, 12S and 16S rRNA genes, and the nuclear genes RAG2 and MLL). Our taxonomic sampling includes one representative of each extant non-mormyrid osteoglossomorph genus, one representative for the monophyletic family Mormyridae, and four outgroup taxa within the basal Teleostei. Maximum parsimony analysis of combined and equally weighted characters from the five molecular markers and Bayesian analysis provide a single, well-supported, hypothesis of osteoglossomorph interrelationships and show the group to be monophyletic. The tree topology is the following: (Hiodon alosoides, (Pantodon buchholzi, (((Osteoglossum bicirrhosum, Scleropages sp.), (Arapaima gigas, Heterotis niloticus)), ((Gymnarchus niloticus, Ivindomyrus opdenboschi), ((Notopterus notopterus, Chitala ornata), (Xenomystus nigri, Papyrocranus afer)))))). We compare our results with previously published phylogenetic hypotheses based on morpho-anatomical data. Additionally, we explore the consequences of the long terminal branch length for the taxon Pantodon buchholzi in our phylogenetic reconstruction and we use the obtained phylogenetic tree to reconstruct the evolutionary history of electroreception in the Notopteroidei.
A problem in particle physics and its Bayesian analysis
NASA Astrophysics Data System (ADS)
Landon, Joshua
An up and coming field in contemporary nuclear and particle physics is "Lattice Quantum Chromodynamics", henceforth Lattice QCD. Indeed the 2004 Nobel Prize in Physics went to the developers of equations that describe QCD. In this dissertation, following a layperson's introduction to the structure of matter, we outline the statistical aspects of a problem in Lattice QCD faced by particle physicists, and point out the difficulties encountered by them in trying to address the problem. The difficulties stem from the fact that one is required to estimate a large -- conceptually infinite -- number of parameters based on a finite number of non-linear equations, each of which is a sum of exponential functions. We then present a plausible approach for solving the problem. Our approach is Bayesian and is driven by a computationally intensive Markov Chain Monte Carlo based solution. However, in order to invoke our approach we first look at the underlying anatomy of the problem and synthesize its essentials. These essentials reveal a pattern that can be harnessed via some assumptions, and this in turn enables us to outline a pathway towards a solution. We demonstrate the viability of our approach via simulated data, followed by its validation against real data provided to us by our physicist colleagues. Our approach yields results that in the past were not obtainable via alternate approaches. The contribution of this dissertation is two-fold. The first is a use of computationally intensive statistical technology to produce results in physics that could not be obtained using physics based techniques. Since the statistical architecture of the problem considered here can arise in other contexts as well, the second contribution of this dissertation is to indicate a plausible approach for addressing a generic class of problems wherein the number of parameters to be estimated exceeds the number of constraints, each constraint being a non-linear equation that is the sum of
UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE
Sanders, N. E.; Soderberg, A. M.; Betancourt, M.
2015-02-10
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST.
Phylogenetic analysis of human immunodeficiency virus type 2 isolated from Cuban individuals.
Machado, Liuber Y; Díaz, Héctor M; Noa, Enrique; Martín, Dayamí; Blanco, Madeline; Díaz, Dervel F; Sánchez, Yordank R; Nibot, Carmen; Sánchez, Lourdes; Dubed, Marta
2014-08-01
The presence of infection by human immunodeficiency virus type 2 (HIV-2) in Cuba has been previously documented. However, genetic information on the strains that circulate in the Cuban people is still unknown. The present work constitutes the first study concerning the phylogenetic relationship of HIV-2 Cuban isolates conducted on 13 Cuban patients who were diagnosed with HIV-2. The env sequences were analyzed for the construction of a phylogenetic tree with reference sequences of HIV-2. Phylogenetic analysis of the env gene showed that all the Cuban sequences clustered in group A of HIV-2. The analysis indicated several independent introductions of HIV-2 into Cuba. The results of the study will reinforce the program on the epidemiological surveillance of the infection in Cuba and make possible further molecular evolutionary studies.
Martínez, Alexander A.; Zaldívar, Yamitzel; Arteaga, Griselda; de Castillo, Zoila; Ortiz, Alma; Mendoza, Yaxelis; Castillero, Omar; Castillo, Juan A.; Cristina, Juan; Pascale, Juan M.
2015-01-01
The Hepatitis B Virus (HBV) can cause acute or chronic infection and is also associated with the development of liver cancer. Thousands of new infections occur every year, and many of these cases are located in certain areas of the Caribbean and Latin America. In these areas, the HBV prevalence is still high, which makes this virus a serious public health concern for the entire region. Studies performed in Panama suggest a complex pattern in the distribution of HBV among the country’s different risk groups. We used phylogenetic analysis to determine which HBV genotypes were circulating in these specific groups; for this we used a fragment of the PreS2/2 region of the HBV genome. Subsequently, whole HBV genome sequences were used for Bayesian analysis of phylodynamics and phylogeography. Two main genotypes were found: genotype A (54.5%) and genotype F (45.5%). There was a difference in the distribution of genotypes according to risk groups: 72.9% of high risk groups were associated with genotype A, and 55.0% of samples of genotype F were associated with the low risk group (p<0.002). The Bayesian analysis of phylogeny-traits association revealed a statistically significant geographical association (p<0.0001) with both genotypes and different regions of the country. The Bayesian time of most recent common ancestor analysis (tMRCA) revealed a recent tMRCA for genotype A2 circulating in Panama (1997, 95% HPD: 1986-2005), compared with Panamanian genotype F1c sequences (1930, 95% HPD: 1810-2005). These results suggest a possible change in the distribution of HBV genotypes in Panama and Latin America as a whole. They also serve to encourage the implementation of vaccination programs in high-risk groups, in order to prevent an increase in the number of new HBV cases in Latin America and worldwide. PMID:26230260
Kuramoto, Tae; Nishihara, Hidenori; Watanabe, Maiko; Okada, Norihiro
2015-01-01
Despite many studies on avian phylogenetics in recent decades that used morphology, mitochondrial genomes, and/or nuclear genes, the phylogenetic positions of several birds (e.g., storks) remain unsettled. In addition to the aforementioned approaches, analysis of retroposon insertions, which are nearly homoplasy-free phylogenetic markers, has also been used in avian phylogenetics. However, the first step in the analysis of retroposon insertions, that is, isolation of retroposons from genomic libraries, is a costly and time-consuming procedure. Therefore, we developed a high-throughput and cost-effective protocol to collect retroposon insertion information based on next-generation sequencing technology, which we call here the STRONG (Screening of Transposons Obtained by Next Generation Sequencing) method, and applied it to 3 waterbird species, for which we identified 35,470 loci containing chicken repeat 1 retroposons (CR1). Our analysis of the presence/absence of 30 CR1 insertions demonstrated the intra- and interordinal phylogenetic relationships in the waterbird assemblage, namely 1) loons diverged first among the waterbirds, 2) penguins (Sphenisciformes) and petrels (Procellariiformes) diverged next, and 3) among the remaining families of waterbirds traditionally classified in Ciconiiformes/Pelecaniformes, storks (Ciconiidae) diverged first. Furthermore, our genome-scale, in silico retroposon analysis based on published genome data uncovered a complex divergence history among pelican, heron, and ibis lineages, presumably involving ancient interspecies hybridization between the heron and ibis lineages. Thus, our retroposon-based waterbird phylogeny and the established phylogenetic position of storks will help to understand the evolutionary processes of aquatic adaptation and related morphological convergent evolution. PMID:26527652
Phylogenetic analysis of Wheat dwarf virus isolates from Iran.
Parizipour, Mohamad Hamed Ghodoum; Schubert, Jörg; Behjatnia, Seyed Ali Akbar; Afsharifar, Alireza; Habekuß, Antje; Wu, Beilei
2017-04-01
Wheat dwarf virus (WDV) adversely affects cereal production in Asia, Europe, and North Africa. In this study, sequences of several WDV isolates from Iran, which is located in the Fertile Crescent, were analyzed. Analysis revealed a new geographic cluster for WDV-Wheat from Iran. Recombination analysis demonstrated the existence of several breakpoints in different regions of the viral genome. Data analysis demonstrated that WDV-Barley has an older history and lower diversity than WDV-Wheat. Sequence analysis identified a rare case of co-infection of wheat with WDV-Wheat and WDV-Barley.
Application of Bayesian graphs to SN Ia data analysis and compression
NASA Astrophysics Data System (ADS)
Ma, Cong; Corasaniti, Pier-Stefano; Bassett, Bruce A.
2016-12-01
Bayesian graphical models are an efficient tool for modelling complex data and deriving self-consistent expressions of the posterior distribution of model parameters. We apply Bayesian graphs to perform statistical analyses of Type Ia supernova (SN Ia) luminosity distance measurements from the joint light-curve analysis (JLA) data set. In contrast to the χ² approach used in previous studies, the Bayesian inference allows us to fully account for the standard-candle parameter dependence of the data covariance matrix. Compared with χ² analysis results, we find a systematic offset of the marginal model parameter bounds. We demonstrate that the bias is statistically significant in the case of the SN Ia standardization parameters with a maximal 6σ shift of the SN light-curve colour correction. In addition, we find that the evidence for a host galaxy correction is now only 2.4σ. Systematic offsets on the cosmological parameters remain small, but may increase by combining constraints from complementary cosmological probes. The bias of the χ² analysis is due to neglecting the parameter-dependent log-determinant of the data covariance, which gives more statistical weight to larger values of the standardization parameters. We find a similar effect on compressed distance modulus data. To this end, we implement a fully consistent compression method of the JLA data set that uses a Gaussian approximation of the posterior distribution for fast generation of compressed data. Overall, the results of our analysis emphasize the need for a fully consistent Bayesian statistical approach in the analysis of future large SN Ia data sets.
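The effect of dropping the log-determinant term can be illustrated with a toy sketch. It is not the authors' model: the diagonal covariance, the function names, and the way the variance grows with a single parameter `alpha` are all assumptions made for illustration, and the residuals are held fixed even though in a real analysis they would also depend on the parameters.

```python
import math

def chi2(residuals, variances):
    """Plain chi-square for a diagonal covariance: sum of r^2 / sigma^2."""
    return sum(r * r / v for r, v in zip(residuals, variances))

def neg2_log_like(residuals, variances):
    """-2 ln(Gaussian likelihood) up to a constant: chi^2 + ln det C."""
    return chi2(residuals, variances) + sum(math.log(v) for v in variances)

def toy_variances(alpha, n=5):
    """Hypothetical variance that grows with a standardization parameter."""
    return [0.01 + 0.04 * alpha ** 2] * n

# Hypothetical residuals, identical for every value of alpha.
residuals = [0.1] * 5

for alpha in (0.0, 1.0, 3.0):
    v = toy_variances(alpha)
    print(alpha, round(chi2(residuals, v), 3), round(neg2_log_like(residuals, v), 3))
```

In this toy setting the plain chi-square keeps falling as `alpha` inflates the variance, so minimizing it alone rewards larger `alpha`; the log-determinant term penalizes that inflation, which is the direction of bias the abstract describes.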
Rodríguez-Ramilo, Silvia T; Wang, Jinliang
2012-09-01
The inference of population genetic structures is essential in many research areas in population genetics, conservation biology and evolutionary biology. Recently, unsupervised Bayesian clustering algorithms have been developed to detect a hidden population structure from genotypic data, assuming among others that individuals taken from the population are unrelated. Under this assumption, markers in a sample taken from a subpopulation can be considered to be in Hardy-Weinberg and linkage equilibrium. However, close relatives might be sampled from the same subpopulation, and consequently, might cause Hardy-Weinberg and linkage disequilibrium and thus bias a population genetic structure analysis. In this study, we used simulated and real data to investigate the impact of close relatives in a sample on Bayesian population structure analysis. We also showed that, when close relatives were identified by a pedigree reconstruction approach and removed, the accuracy of a population genetic structure analysis can be greatly improved. The results indicate that unsupervised Bayesian clustering algorithms cannot be used blindly to detect genetic structure in a sample with closely related individuals. Rather, when closely related individuals are suspected to be frequent in a sample, these individuals should be first identified and removed before conducting a population structure analysis.
Perandini, Simone; Soardi, Gian Alberto; Motton, Massimiliano; Augelli, Raffaele; Dallaserra, Chiara; Puntel, Gino; Rossi, Arianna; Sala, Giuseppe; Signorini, Manuel; Spezia, Laura; Zamboni, Federico; Montemezzi, Stefania
2016-01-01
The aim of this study was to prospectively assess the accuracy gain of Bayesian analysis-based computer-aided diagnosis (CAD) vs human judgment alone in characterizing solitary pulmonary nodules (SPNs) at computed tomography (CT). The study included 100 randomly selected SPNs with a definitive diagnosis. Nodule features at first and follow-up CT scans as well as clinical data were rated individually on a 1-to-5-point risk chart by 7 radiologists, first blinded to and then aware of Bayesian Inference Malignancy Calculator (BIMC) model predictions. Raters’ predictions were evaluated by means of receiver operating characteristic (ROC) curve analysis and decision analysis. Overall ROC area under the curve was 0.758 before and 0.803 after the disclosure of CAD predictions (P = 0.003). A net gain in diagnostic accuracy was found in 6 out of 7 readers. Mean risk class of benign nodules dropped from 2.48 to 2.29, while mean risk class of malignancies rose from 3.66 to 3.92. Awareness of CAD predictions also produced a significant drop in the mean number of indeterminate SPNs (15 vs 23.86 SPNs) and raised the mean number of correct and confident diagnoses (mean 39.57 vs 25.71 SPNs). This study provides evidence supporting the integration of the Bayesian analysis-based BIMC model in SPN characterization. PMID:27648166
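As a rough illustration of how an ROC area under the curve can be obtained from ordinal risk ratings such as the 1-to-5 classes above, the sketch below uses the pairwise (Mann-Whitney) formulation. The function name and the ratings are hypothetical; they are not data from the study, and the study's own ROC software is not specified here.

```python
def roc_auc(scores_malignant, scores_benign):
    """AUC as the probability that a malignant case is rated above a benign
    one, counting ties as one half (Mann-Whitney formulation)."""
    wins = 0.0
    for m in scores_malignant:
        for b in scores_benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(scores_malignant) * len(scores_benign))

# Hypothetical 1-5 risk ratings from a single reader.
malignant = [4, 5, 3, 5]
benign = [1, 2, 3, 2]
print(roc_auc(malignant, benign))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of malignant from benign nodules.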
Empirical Markov Chain Monte Carlo Bayesian analysis of fMRI data.
de Pasquale, F; Del Gratta, C; Romani, G L
2008-08-01
In this work an Empirical Markov Chain Monte Carlo Bayesian approach to analyse fMRI data is proposed. The Bayesian framework is appealing since complex models can be adopted in the analysis both for the image and noise model. Here, the noise autocorrelation is taken into account by adopting an AutoRegressive model of order one and a versatile non-linear model is assumed for the task-related activation. Model parameters include the noise variance and autocorrelation, activation amplitudes and the hemodynamic response function parameters. These are estimated at each voxel from samples of the Posterior Distribution. Prior information is included by means of a 4D spatio-temporal model for the interaction between neighbouring voxels in space and time. The results show that this model can provide smooth estimates from low SNR data while important spatial structures in the data can be preserved. A simulation study is presented in which the accuracy and bias of the estimates are addressed. Furthermore, some results on convergence diagnostic of the adopted algorithm are presented. To validate the proposed approach a comparison of the results with those from a standard GLM analysis, spatial filtering techniques and a Variational Bayes approach is provided. This comparison shows that our approach outperforms the classical analysis and is consistent with other Bayesian techniques. This is investigated further by means of the Bayes Factors and the analysis of the residuals. The proposed approach applied to Blocked Design and Event Related datasets produced reliable maps of activation.
Bayesian Statistical Analysis Applied to NAA Data for Neutron Flux Spectrum Determination
NASA Astrophysics Data System (ADS)
Chiesa, D.; Previtali, E.; Sisti, M.
2014-04-01
In this paper, we present a statistical method, based on Bayesian statistics, to evaluate the neutron flux spectrum from the activation data of different isotopes. The experimental data were acquired during a neutron activation analysis (NAA) experiment [A. Borio di Tigliole et al., Absolute flux measurement by NAA at the Pavia University TRIGA Mark II reactor facilities, ENC 2012 - Transactions Research Reactors, ISBN 978-92-95064-14-0, 22 (2012)] performed at the TRIGA Mark II reactor of Pavia University (Italy). In order to evaluate the neutron flux spectrum, subdivided in energy groups, we must solve a system of linear equations containing the grouped cross sections and the activation rate data. We solve this problem with Bayesian statistical analysis, including the uncertainties of the coefficients and the a priori information about the neutron flux. A program for the analysis of Bayesian hierarchical models, based on Markov Chain Monte Carlo (MCMC) simulations, is used to define the problem statistical model and solve it. The energy group fluxes and their uncertainties are then determined with great accuracy and the correlations between the groups are analyzed. Finally, the dependence of the results on the prior distribution choice and on the group cross section data is investigated to confirm the reliability of the analysis.
The mitochondrial DNA of Xenoturbella bocki: genomic architecture and phylogenetic analysis.
Perseke, Marleen; Hankeln, Thomas; Weich, Bettina; Fritzsch, Guido; Stadler, Peter F; Israelsson, Olle; Bernhard, Detlef; Schlegel, Martin
2007-08-01
The phylogenetic position of Xenoturbella bocki has been a matter of controversy since its description in 1949. We sequenced a second complete mitochondrial genome of this species and performed phylogenetic analyses based on the amino acid sequences of all 13 mitochondrial protein-coding genes and on its gene order. Our results confirm the deuterostome relationship of Xenoturbella. However, in contrast to a recently published study (Bourlat et al. in Nature 444:85-88, 2006), our data analysis suggests a more basal branching of Xenoturbella within the deuterostomes, rather than a sister-group relationship to the Ambulacraria (Hemichordata and Echinodermata).
McMurdie, Paul J; Holmes, Susan
2012-01-01
We present a detailed description of a new Bioconductor package, phyloseq, for integrated data and analysis of taxonomically-clustered phylogenetic sequencing data in conjunction with related data types. The phyloseq package integrates abundance data, phylogenetic information and covariates so that exploratory transformations, plots, and confirmatory testing and diagnostic plots can be carried out seamlessly. The package is built following the S4 object-oriented framework of the R language so that once the data have been input the user can easily transform, plot and analyze the data. We present some examples that highlight the methods and the ease with which we can leverage existing packages.
Semerikova, S A; Semerikov, V L
2014-01-01
A phylogenetic study of firs (Abies Mill.) was conducted using nucleotide sequences of several chloroplast DNA regions with a total length of 5580 bp. The analysis included 37 taxa, which represented the main evolutionary lineages of the genus, and Keteleeria davidiana. According to phylogenetic reconstruction the Abies species were subdivided into six main groups, generally corresponding to their geographic distribution. The phylogenetic tree had three basal clades. All of these clades contained American species, and only one of them contained Eurasian species. The divergence time calibrations, based on paleobotanical data and the chloroplast DNA mutation rate estimates in Pinaceae, produced similar results. The age of diversification among the clades of the present-day Abies was estimated as the end of the Oligocene to the beginning of the Miocene. The age of the separation of Mediterranean firs from the Asian-North American branch corresponds to the Miocene. The age of diversification within the young groups of Mediterranean, Asian, and boreal American firs (A. lasiocarpa, A. balsamea, A. fraseri) was estimated as the Pliocene-Pleistocene. Based on the phylogenetic reconstruction obtained, the most plausible biogeographic scenarios were suggested. It is noted that the existing systematic classification of the genus Abies strongly contradicts the phylogenetic reconstruction and requires revision.
Phylogenetic analysis of bacteria preserved in a permafrost ice wedge for 25,000 years.
Katayama, Taiki; Tanaka, Michiko; Moriizumi, Jun; Nakamura, Toshio; Brouchkov, Anatoli; Douglas, Thomas A; Fukuda, Masami; Tomita, Fusao; Asano, Kozo
2007-04-01
Phylogenetic analysis of bacteria preserved within an ice wedge from the Fox permafrost tunnel was undertaken by cultivation and molecular techniques. The radiocarbon age of the ice wedge was determined. Our results suggest that the bacteria in the ice wedge adapted to the frozen conditions have survived for 25,000 years.
EthoSeq: a tool for phylogenetic analysis and data mining in behavioral sequences.
Japyassú, Hilton F; Alberts, Carlos C; Izar, Patrícia; Sato, Takechi
2006-11-01
This article introduces the software program called EthoSeq, which is designed to extract probabilistic behavioral sequences (tree-generated sequences, or TGSs) from observational data and to prepare a TGS-species matrix for phylogenetic analysis. The program uses Graph Theory algorithms to automatically detect behavioral patterns within the observational sessions. It includes filtering tools to adjust the search procedure to user-specified statistical needs. Preliminary analyses of data sets, such as grooming sequences in birds and foraging tactics in spiders, uncover a large number of TGSs which together yield single phylogenetic trees. An example of the use of the program is our analysis of felid grooming sequences, in which we have obtained 1,386 felid grooming TGSs for seven species, resulting in a single phylogeny. These results show that behavior is definitely useful in phylogenetic analysis. EthoSeq simplifies and automates such analyses, uncovers much of the hidden patterns of long behavioral sequences, and prepares this data for further analysis with standard phylogenetic programs. We hope it will encourage many empirical studies on the evolution of behavior.
Caruso, Claudio; Dondo, Alessandro; Cerutti, Francesco; Masoero, Loretta; Rosamilia, Alfonso; Zoppi, Simona; D'Errico, Valeria; Grattarola, Carla; Acutis, Pier Luigi; Peletto, Simone
2014-07-01
We describe Aujeszky's disease in a female red fox (Vulpes vulpes). Although wild boar (Sus scrofa) would be the expected source of infection, phylogenetic analysis suggested a domestic rather than a wild source of virus, underscoring the importance of biosecurity measures in pig farms to prevent contact with wild animals.
Technology Transfer Automated Retrieval System (TEKTRAN)
Sarcocystis nesbitti was first described by Mandour in 1969 from rhesus monkey muscle. Its definitive host remains unknown. 18SrRNA gene of Sarcocystis nesbitti was amplified, sequenced, and subjected to phylogenetic analysis. Among those congeners available for comparison, it shares closest affinit...
Kopec, D; Shagas, G; Reinharth, D; Tamang, S
2004-01-01
The use and development of software in the medical field offers tremendous opportunities for making health care delivery more efficient, more effective, and less error-prone. We discuss and explore the use of clinical pathways analysis with Adaptive Bayesian Networks and Data Mining Techniques to perform such analyses. The computation of "lift" (a measure of completed pathways improvement potential) leads us to optimism regarding the potential for this approach.
Bayesian Switching Factor Analysis for Estimating Time-varying Functional Connectivity in fMRI.
Taghia, Jalil; Ryali, Srikanth; Chen, Tianwen; Supekar, Kaustubh; Cai, Weidong; Menon, Vinod
2017-03-03
There is growing interest in understanding the dynamical properties of functional interactions between distributed brain regions. However, robust estimation of temporal dynamics from functional magnetic resonance imaging (fMRI) data remains challenging due to limitations in extant multivariate methods for modeling time-varying functional interactions between multiple brain areas. Here, we develop a Bayesian generative model for fMRI time-series within the framework of hidden Markov models (HMMs). The model is a dynamic variant of the static factor analysis model (Ghahramani and Beal, 2000). We refer to this model as Bayesian switching factor analysis (BSFA) as it integrates factor analysis into a generative HMM in a unified Bayesian framework. In BSFA, brain dynamic functional networks are represented by latent states which are learnt from the data. Crucially, BSFA is a generative model which estimates the temporal evolution of brain states and transition probabilities between states as a function of time. An attractive feature of BSFA is the automatic determination of the number of latent states via Bayesian model selection arising from penalization of excessively complex models. Key features of BSFA are validated using extensive simulations on carefully designed synthetic data. We further validate BSFA using fingerprint analysis of multisession resting-state fMRI data from the Human Connectome Project (HCP). Our results show that modeling temporal dependencies in the generative model of BSFA results in improved fingerprinting of individual participants. Finally, we apply BSFA to elucidate the dynamic functional organization of the salience, central-executive, and default mode networks-three core neurocognitive systems with central role in cognitive and affective information processing (Menon, 2011). Across two HCP sessions, we demonstrate a high level of dynamic interactions between these networks and determine that the salience network has the highest temporal
Phylogenetic analysis of the genus Microbacterium based on 16S rRNA gene sequences.
Takeuchi, M; Yokota, A
1994-11-15
16S rRNA gene (rDNA) studies of the six species of the genus Microbacterium, M. lacticum, M. laevaniformans, M. dextranolyticum, M. imperiale, M. arborescens and M. aurum, were performed and the primary structures were compared with those of 29 representative actinobacteria and related organisms. Phylogenetic analysis indicated that the six species of the genus Microbacterium and four representative species of the genus Aureobacterium appear to be phylogenetically coherent, as was suggested by Rainey et al., although the peptidoglycan types of these two genera are different (peptidoglycan type B1 or B2). Thus, the phylogenetic analyses revealed that actinobacteria with group B-peptidoglycan do not cluster according to their peptidoglycan types, but form a compact cluster distinct from actinobacteria or actinomycetes with group A-peptidoglycan.
Soft-tissue anatomy of the extant hominoids: a review and phylogenetic analysis
Gibbs, S; Collard, M; Wood, B
2002-01-01
This paper reports the results of a literature search for information about the soft-tissue anatomy of the extant non-human hominoid genera, Pan, Gorilla, Pongo and Hylobates, together with the results of a phylogenetic analysis of these data plus comparable data for Homo. Information on the four extant non-human hominoid genera was located for 240 out of the 1783 soft-tissue structures listed in the Nomina Anatomica. Numerically these data are biased so that information about some systems (e.g. muscles) and some regions (e.g. the forelimb) are over-represented, whereas other systems and regions (e.g. the veins and the lymphatics of the vascular system, the head region) are either under-represented or not represented at all. Screening to ensure that the data were suitable for use in a phylogenetic analysis reduced the number of eligible soft-tissue structures to 171. These data, together with comparable data for modern humans, were converted into discontinuous character states suitable for phylogenetic analysis and then used to construct a taxon-by-character matrix. This matrix was used in two tests of the hypothesis that soft-tissue characters can be relied upon to reconstruct hominoid phylogenetic relationships. In the first, parsimony analysis was used to identify cladograms requiring the smallest number of character state changes. In the second, the phylogenetic bootstrap was used to determine the confidence intervals of the most parsimonious clades. The parsimony analysis yielded a single most parsimonious cladogram that matched the molecular cladogram. Similarly the bootstrap analysis yielded clades that were compatible with the molecular cladogram; a (Homo, Pan) clade was supported by 95% of the replicates, and a (Gorilla, Pan, Homo) clade by 96%. These are the first hominoid morphological data to provide statistically significant support for the clades favoured by the molecular evidence. PMID:11833653
treespace: statistical exploration of landscapes of phylogenetic trees.
Jombart, Thibaut; Kendall, Michelle; Almagro-Garcia, Jacob; Colijn, Caroline
2017-04-04
The increasing availability of large genomic datasets as well as the advent of Bayesian phylogenetics facilitate the investigation of phylogenetic incongruence, which can result in the impossibility of representing phylogenetic relationships using a single tree. While sometimes considered as a nuisance, phylogenetic incongruence can also reflect meaningful biological processes as well as relevant statistical uncertainty, both of which can yield valuable insights in evolutionary studies. We introduce a new tool for investigating phylogenetic incongruence through the exploration of phylogenetic tree landscapes. Our approach, implemented in the R package treespace, combines tree metrics and multivariate analysis to provide low dimensional representations of the topological variability in a set of trees, which can be used for identifying clusters of similar trees and group-specific consensus phylogenies. treespace also provides a user-friendly web interface for interactive data analysis. treespace is integrated alongside existing standards for phylogenetics and is easily accessible through a web interface. It fills a gap in the current phylogenetics toolbox in R and will facilitate the investigation of phylogenetic results.
Wen, Dingqiao; Yu, Yun; Hahn, Matthew W; Nakhleh, Luay
2016-06-01
The role of hybridization and subsequent introgression has been demonstrated in an increasing number of species. Recently, Fontaine et al. (Science, 347, 2015, 1258524) conducted a phylogenomic analysis of six members of the Anopheles gambiae species complex. Their analysis revealed a reticulate evolutionary history and pointed to extensive introgression on all four autosomal arms. The study further highlighted the complex evolutionary signals that the co-occurrence of incomplete lineage sorting (ILS) and introgression can give rise to in phylogenomic analyses. While tree-based methodologies were used in the study, phylogenetic networks provide a more natural model to capture reticulate evolutionary histories. In this work, we reanalyse the Anopheles data using a recently devised framework that combines the multispecies coalescent with phylogenetic networks. This framework allows us to capture ILS and introgression simultaneously, and forms the basis for statistical methods for inferring reticulate evolutionary histories. The new analysis reveals a phylogenetic network with multiple hybridization events, some of which differ from those reported in the original study. To elucidate the extent and patterns of introgression across the genome, we devise a new method that quantifies the use of reticulation branches in the phylogenetic network by each genomic region. Applying the method to the mosquito data set reveals the evolutionary history of all the chromosomes. This study highlights the utility of 'network thinking' and the new insights it can uncover, in particular in phylogenomic analyses of large data sets with extensive gene tree incongruence.
Faith, Daniel P
2008-12-01
New species conservation strategies, including the EDGE of Existence (EDGE) program, have expanded threatened species assessments by integrating information about species' phylogenetic distinctiveness. Distinctiveness has been measured through simple scores that assign shared credit among species for evolutionary heritage represented by the deeper phylogenetic branches. A species with a high score combined with a high extinction probability receives high priority for conservation efforts. Simple hypothetical scenarios for phylogenetic trees and extinction probabilities demonstrate how such scoring approaches can provide inefficient priorities for conservation. An existing probabilistic framework derived from the phylogenetic diversity measure (PD) properly captures the idea of shared responsibility for the persistence of evolutionary history. It avoids static scores, takes into account the status of close relatives through their extinction probabilities, and allows for the necessary updating of priorities in light of changes in species threat status. A hypothetical phylogenetic tree illustrates how changes in extinction probabilities of one or more species translate into changes in expected PD. The probabilistic PD framework provided a range of strategies that moved beyond expected PD to better consider worst-case PD losses. In another example, risk aversion gave higher priority to a conservation program that provided a smaller, but less risky, gain in expected PD. The EDGE program could continue to promote a list of top species conservation priorities through application of probabilistic PD and simple estimates of current extinction probability. The list might be a dynamic one, with all the priority scores updated as extinction probabilities change. Results of recent studies suggest that estimation of extinction probabilities derived from the red list criteria linked to changes in species range sizes may provide estimated probabilities for many different species
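The expected-PD idea described above can be sketched in a few lines, under the common formulation in which a branch contributes its length multiplied by the probability that at least one descendant species survives, with extinctions assumed independent. The tree, branch lengths, extinction probabilities, and function name below are hypothetical and chosen only for illustration.

```python
from math import prod

def expected_pd(branches, p_ext):
    """Expected phylogenetic diversity.

    branches: list of (branch_length, [descendant species names])
    p_ext:    dict mapping species name to extinction probability
    A branch is lost only if all of its descendants go extinct.
    """
    return sum(length * (1.0 - prod(p_ext[s] for s in tips))
               for length, tips in branches)

# Hypothetical rooted tree ((A,B),C) with made-up branch lengths.
branches = [
    (1.0, ["A"]),
    (1.0, ["B"]),
    (2.0, ["A", "B"]),   # internal branch shared by A and B
    (3.0, ["C"]),
]

def gain_from_securing(species, p_ext):
    """Increase in expected PD if one species' extinction risk drops to zero."""
    secured = dict(p_ext, **{species: 0.0})
    return expected_pd(branches, secured) - expected_pd(branches, p_ext)

print(gain_from_securing("A", {"A": 0.5, "B": 0.5, "C": 0.1}))
print(gain_from_securing("A", {"A": 0.5, "B": 0.9, "C": 0.1}))
```

In this toy tree, securing species A is worth more when its sister B is more threatened, because the shared internal branch then depends more heavily on A. This is the sense in which the priority of a species changes with the status of its close relatives rather than being a static score.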
LEEBENS-MACK, JIM; VISION, TODD; BRENNER, ERIC; BOWERS, JOHN E.; CANNON, STEVEN; CLEMENT, MARK J.; CUNNINGHAM, CLIFFORD W.; dePAMPHILIS, CLAUDE; deSALLE, ROB; DOYLE, JEFF J.; EISEN, JONATHAN A.; GU, XUN; HARSHMAN, JOHN; JANSEN, ROBERT K.; KELLOGG, ELIZABETH A.; KOONIN, EUGENE V.; MISHLER, BRENT D.; PHILIPPE, HERVÉ; PIRES, J. CHRIS; QIU, YIN-LONG; RHEE, SEUNG Y.; SJÖLANDER, KIMMEN; SOLTIS, DOUGLAS E.; SOLTIS, PAMELA S.; STEVENSON, DENNIS W.; WALL, KERR; WARNOW, TANDY; ZMASEK, CHRISTIAN
2011-01-01
In the eight years since phylogenomics was introduced as the intersection of genomics and phylogenetics, the field has provided fundamental insights into gene function, genome history and organismal relationships. The utility of phylogenomics is growing with the increase in the number and diversity of taxa for which whole genome and large transcriptome sequence sets are being generated. We assert that the synergy between genomic and phylogenetic perspectives in comparative biology would be enhanced by the development and refinement of minimal reporting standards for phylogenetic analyses. Encouraged by the development of the Minimum Information About a Microarray Experiment (MIAME) standard, we propose a similar roadmap for the development of a Minimal Information About a Phylogenetic Analysis (MIAPA) standard. Key in the successful development and implementation of such a standard will be broad participation by developers of phylogenetic analysis software, phylogenetic database developers, practitioners of phylogenomics, and journal editors. PMID:16901231
Bayesian estimation of dynamic matching function for U-V analysis in Japan
NASA Astrophysics Data System (ADS)
Kyo, Koki; Noda, Hideo; Kitagawa, Genshiro
2012-05-01
In this paper we propose a Bayesian method for analyzing unemployment dynamics. We derive a Beveridge curve for unemployment and vacancy (U-V) analysis from a Bayesian model based on a labor market matching function. In our framework, the efficiency of matching and the elasticities of new hiring with respect to unemployment and vacancy are regarded as time varying parameters. To construct a flexible model and obtain reasonable estimates in an underdetermined estimation problem, we treat the time varying parameters as random variables and introduce smoothness priors. The model is then described in a state space representation, enabling the parameter estimation to be carried out using Kalman filter and fixed interval smoothing. In such a representation, dynamic features of the cyclic unemployment rate and the structural-frictional unemployment rate can be accurately captured.
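A heavily simplified sketch of the state space idea follows: a single time-varying parameter is given a random-walk (first-order smoothness) prior and tracked with a scalar Kalman filter. This is not the authors' matching-function model; the function name, the noise variances, and the data are assumptions for illustration, and the fixed-interval smoothing step mentioned in the abstract is omitted.

```python
def kalman_filter_local_level(ys, obs_var, state_var, m0=0.0, p0=1e6):
    """Filtered means of a random-walk parameter observed with noise.

    State:       theta_t = theta_{t-1} + w_t,  w_t ~ N(0, state_var)
    Observation: y_t     = theta_t + v_t,      v_t ~ N(0, obs_var)
    A large p0 encodes a diffuse prior on the initial state.
    """
    means = []
    m, p = m0, p0
    for y in ys:
        p = p + state_var          # prediction: uncertainty grows
        k = p / (p + obs_var)      # Kalman gain
        m = m + k * (y - m)        # update the mean with the new observation
        p = (1.0 - k) * p          # update the variance
        means.append(m)
    return means

# Hypothetical noisy observations of a slowly drifting parameter.
ys = [1.0, 1.2, 0.9, 1.4, 1.5, 1.3, 1.8, 1.7]
print(kalman_filter_local_level(ys, obs_var=0.1, state_var=0.01))
```

A smaller `state_var` relative to `obs_var` corresponds to a stronger smoothness prior, so the filtered path changes more slowly from one period to the next.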
MorePower 6.0 for ANOVA with relational confidence intervals and Bayesian analysis.
Campbell, Jamie I D; Thompson, Valerie A
2012-12-01
MorePower 6.0 is a flexible freeware statistical calculator that computes sample size, effect size, and power statistics for factorial ANOVA designs. It also calculates relational confidence intervals for ANOVA effects based on formulas from Jarmasz and Hollands (Canadian Journal of Experimental Psychology 63:124-138, 2009), as well as Bayesian posterior probabilities for the null and alternative hypotheses based on formulas in Masson (Behavior Research Methods 43:679-690, 2011). The program is unique in affording direct comparison of these three approaches to the interpretation of ANOVA tests. Its high numerical precision and ability to work with complex ANOVA designs could facilitate researchers' attention to issues of statistical power, Bayesian analysis, and the use of confidence intervals for data interpretation. MorePower 6.0 is available at https://wiki.usask.ca/pages/viewpageattachments.action?pageId=420413544 .
Bayesian Propensity Score Analysis: Simulation and Case Study
ERIC Educational Resources Information Center
Kaplan, David; Chen, Cassie J. S.
2011-01-01
Propensity score analysis (PSA) has been used in a variety of settings, such as education, epidemiology, and sociology. Most typically, propensity score analysis has been implemented within the conventional frequentist perspective of statistics. This perspective, as is well known, does not account for uncertainty in either the parameters of the…
Time-varying nonstationary multivariate risk analysis using a dynamic Bayesian copula
NASA Astrophysics Data System (ADS)
Sarhadi, Ali; Burn, Donald H.; Concepción Ausín, María; Wiper, Michael P.
2016-03-01
A time-varying risk analysis is proposed for an adaptive design framework in nonstationary conditions arising from climate change. A Bayesian, dynamic conditional copula is developed for modeling the time-varying dependence structure between mixed continuous and discrete multiattributes of multidimensional hydrometeorological phenomena. Joint Bayesian inference is carried out to fit the marginals and copula in an illustrative example using an adaptive, Gibbs Markov Chain Monte Carlo (MCMC) sampler. Posterior mean estimates and credible intervals are provided for the model parameters and the Deviance Information Criterion (DIC) is used to select the model that best captures different forms of nonstationarity over time. This study also introduces a fully Bayesian, time-varying joint return period for multivariate time-dependent risk analysis in nonstationary environments. The results demonstrate that the nature and the risk of extreme-climate multidimensional processes are changed over time under the impact of climate change, and accordingly the long-term decision making strategies should be updated based on the anomalies of the nonstationary environment.
NASA Astrophysics Data System (ADS)
Fox, Neil I.; Micheas, Athanasios C.; Peng, Yuqiang
2016-07-01
This paper introduces the use of Bayesian full Procrustes shape analysis in object-oriented meteorological applications. In particular, the Procrustes methodology is used to generate mean forecast precipitation fields from a set of ensemble forecasts. This approach has advantages over other ensemble averaging techniques in that it can produce a forecast that retains the morphological features of the precipitation structures and present the range of forecast outcomes represented by the ensemble. The production of the ensemble mean avoids the problems of smoothing that result from simple pixel or cell averaging, while producing credible sets that retain information on ensemble spread. Also in this paper, the full Bayesian Procrustes scheme is used as an object verification tool for precipitation forecasts. This is an extension of a previously presented Procrustes shape analysis based verification approach into a full Bayesian format designed to handle the verification of precipitation forecasts that match objects from an ensemble of forecast fields to a single truth image. The methodology is tested on radar reflectivity nowcasts produced in the Warning Decision Support System - Integrated Information (WDSS-II) by varying parameters in the K-means cluster tracking scheme.
A phylogenetic analysis of the myxobacteria: basis for their classification
NASA Technical Reports Server (NTRS)
Shimkets, L.; Woese, C. R.
1992-01-01
The primary sequence and secondary structural features of the 16S rRNA were compared for 12 different myxobacteria representing all the known cultivated genera. Analysis of these data shows the myxobacteria to form a monophyletic grouping consisting of three distinct families, which lies within the delta subdivision of the purple bacterial phylum. The composition of the families is consistent with differences in cell and spore morphology, cell behavior, and pigment and secondary metabolite production but is not correlated with the morphological complexity of the fruiting bodies. The Nannocystis exedens lineage has evolved at an unusually rapid pace and its rRNA shows numerous primary and secondary structural idiosyncrasies.
Heydari, Shahram; Miranda-Moreno, Luis F; Lord, Dominique; Fu, Liping
2014-03-01
In road safety studies, decision makers must often cope with limited data conditions. In such circumstances, the maximum likelihood estimation (MLE), which relies on asymptotic theory, is unreliable and prone to bias. Moreover, it has been reported in the literature that (a) Bayesian estimates might be significantly biased when using non-informative prior distributions under limited data conditions, and that (b) the calibration of limited data is plausible when existing evidence in the form of proper priors is introduced into analyses. Although the Highway Safety Manual (2010) (HSM) and other research studies provide calibration and updating procedures, the data requirements can be very taxing. This paper presents a practical and sound Bayesian method to estimate and/or update safety performance function (SPF) parameters combining the information available from limited data with the SPF parameters reported in the HSM. The proposed Bayesian updating approach has the advantage of requiring fewer observations to get reliable estimates. This paper documents this procedure. The adopted technique is validated by conducting a sensitivity analysis through an extensive simulation study with 15 different models, which include various prior combinations. This sensitivity analysis contributes to our understanding of the comparative aspects of a large number of prior distributions. Furthermore, the proposed method contributes to unification of the Bayesian updating process for SPFs. The results demonstrate the accuracy of the developed methodology. Therefore, the suggested approach offers considerable promise as a methodological tool to estimate and/or update baseline SPFs and to evaluate the efficacy of road safety countermeasures under limited data conditions.
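As a rough illustration of why an informative prior helps under limited data, the sketch below uses a conjugate gamma-Poisson update for a crash rate. This is a simplified stand-in, not the SPF updating procedure of the paper; the function names, the prior values, and the crash counts are all hypothetical.

```python
def gamma_poisson_update(prior_shape, prior_rate, counts):
    """Return the posterior Gamma(shape, rate) of a Poisson rate.

    With a Gamma(shape, rate) prior and Poisson counts, the posterior is
    Gamma(shape + sum(counts), rate + number of observations).
    """
    return prior_shape + sum(counts), prior_rate + len(counts)


def gamma_mean(shape, rate):
    """Mean of a Gamma distribution in the shape/rate parameterization."""
    return shape / rate


# Hypothetical prior (mean 2 crashes per year) standing in for published evidence.
prior_shape, prior_rate = 4.0, 2.0
# Hypothetical local data: only three years of observed crash counts.
counts = [3, 1, 4]

posterior = gamma_poisson_update(prior_shape, prior_rate, counts)
posterior_mean = gamma_mean(*posterior)
sample_mean = sum(counts) / len(counts)
```

With these made-up numbers the posterior mean falls between the prior mean and the sample mean, which is the shrinkage that lets a small sample borrow strength from existing evidence.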
Blattner, F R; Weising, K; Bänfer, G; Maschwitz, U; Fiala, B
2001-06-01
Many species of the paleotropical pioneer tree genus Macaranga Thou. (Euphorbiaceae) live in association with ants. Various types of mutualistic interactions exist, ranging from the attraction of unspecific ant visitors to obligate myrmecophytism. In the latter, nesting space and food bodies are exchanged for protection by highly specific ant partners (mainly species of the myrmicine genus Crematogaster). As a first step toward elucidating the coevolution of ant-plant interactions in the Macaranga-Crematogaster system, we have initiated a molecular investigation of the plant partners' phylogeny. Nuclear ribosomal DNA internal transcribed spacer (ITS) sequences were analyzed for 73 accessions from 47 Macaranga species, representing 17 sections or informally described species groups. Three accessions from the putative sister taxon Mallotus Lour. were included as outgroups. Cladograms of the ITS data revealed Macaranga to be nested within Mallotus. ITS sequences are highly similar within section Pachystemon s.str., suggesting a relatively recent and rapid radiation of obligate myrmecophytes within this section. Forty-three accessions, mainly of ant-inhabited species, were additionally investigated by random amplified polymorphic DNA (RAPD) and microsatellite-primed PCR (MP-PCR) techniques. Phenetic analysis of RAPD and MP-PCR banding profiles generally confirmed the ITS results. Best resolutions for individual clades were obtained when ITS and RAPD/MP-PCR data were combined into a single matrix and analyzed phenetically. The combined analysis suggests multiple (four) rather than a single evolutionary origin of myrmecophytism, at least one reversal from obligate myrmecophytism to nonmyrmecophytism, and one loss of mutualistic specificity.
Phylogenetic analysis of Mexican Babesia bovis isolates using msa and ssrRNA gene sequences.
Genis, Alma D; Mosqueda, Juan J; Borgonio, Verónica M; Falcón, Alfonso; Alvarez, Antonio; Camacho, Minerva; de Lourdes Muñoz, Maria; Figueroa, Julio V
2008-12-01
Variable merozoite surface antigens of Babesia bovis are exposed glycoproteins having a role in erythrocyte invasion. Members of this gene family include msa-1 and msa-2 (msa-2c, msa-2a(1), msa-2a(2), and msa-2b). Small subunit ribosomal (ssr)RNA gene is subject to evolutive pressure and has been used in phylogenetic studies. To determine the phylogenetic relationship among B. bovis Mexican isolates using different genetic markers, PCR amplicons, corresponding to msa-1, msa-2c, msa-2b, and ssrRNA genes, were cloned and plasmids carrying the corresponding inserts were sequenced. Comparative analysis of nucleotide and deduced amino acid sequences revealed distinct degrees of variability and identity among the coding gene sequences obtained from 12 geographically different B. bovis isolates and a reference strain. Overall sequence identities of 47.7%, 72.3%, 87.7%, and 94% were determined for msa-1, msa-2b, msa-2c, and ssrRNA, respectively. A robust phylogenetic tree was obtained with msa-2b sequences. The phylogenetic analysis suggests that Mexican B. bovis isolates group in clades not concordant with the Mexican geography. However, the Mexican isolates group together in an American clade separated from the Australian clade. Sequence heterogeneity in msa-1, msa-2b, and msa-2c coding regions of Mexican B. bovis isolates present in different geographical regions can be a result of either differential evolutive pressure or cattle movement from commercial trade.
Phylogenetic Analysis of Brassica rapa MATH-Domain Proteins
Zhao, Liming; Huang, Yong; Hu, Yan; He, Xiaoli; Shen, Wenhui; Liu, Chunlin; Ruan, Ying
2013-01-01
The MATH (meprin and TRAF-C homology) domain is a fold of seven anti-parallel β-helices involved in protein-protein interaction. Here, we report the identification and characterization of 90 MATH-domain proteins from the Brassica rapa genome. By sequence analysis together with MATH-domain proteins from other species, the B. rapa MATH-domain proteins can be grouped into 6 classes. Class-I protein has one or several MATH domains without any other recognizable domain; Class-II protein contains a MATH domain together with a conserved BTB (Broad Complex, Tramtrack, and Bric-a-Brac) domain; Class-III protein belongs to the MATH/Filament domain family; Class-IV protein contains a MATH domain frequently combined with some other domains; Class-V protein has a relatively long sequence but contains only one MATH domain; Class-VI protein is characterized by the presence of Peptidase and UBQ (Ubiquitinylation) domains together with one MATH domain. As part of our study regarding seed development of B. rapa, six genes were screened by SSH (Suppression Subtractive Hybridization) and their expression levels were analyzed in combination with seed developmental stages. The expression patterns suggested that Bra001786, Bra03578 and Bra036572 may be seed development specific genes, while Bra001787, Bra020541 and Bra040904 may be involved in seed and flower organ development. This study provides the first characterization of the MATH domain proteins in B. rapa. PMID:24179444
Phylogenetic analysis and homology modelling of Paracentrotus lividus nectin.
Costa, Caterina; Cavalcante, Carmela; Zito, Francesca; Yokota, Yukio; Matranga, Valeria
2010-11-01
The extracellular matrix protein Pl-nectin, a 210-kDa homodimer originally purified from sea urchin eggs, plays a crucial role in cell adhesion and embryonic morphogenesis. The compiled cDNA sequence, obtained by RT-PCR primer walking and 3' RACE, identified a 984aa product containing a 23aa signal peptide and including all six internal peptides identified by protein microsequencing. The protein is a new member of the galactose-binding protein superfamily as it consists of six 151-156aa-long tandemly repeated domains (D1-D6), homologous to the discoidin-like domains, also known as F5/8-type C domains. Based on homology modelling, we present a three-dimensional structure (3D) for D5, identified as the prototype domain. The molecular modelling of the assembled Pl-nectin homodimer accounts for a Pl-nectin quaternary structure composed of two 105-kDa C-shaped monomers linked by a S-S bridge. The presence of an LDT motif between the first and the second exposed loops of the D2 domain suggests the binding of Pl-nectin to an integrin receptor. Altogether, the in silico analysis described here is consistent with previous biochemical reports and offers a basis for predictions to be experimentally tested.
Kerr, Tovah; Roalson, Eric H; Rodgers, Buel D
2005-01-01
The myostatin (MSTN)-null phenotype in mammals is characterized by extreme gains in skeletal muscle mass or "double muscling" as the cytokine negatively regulates skeletal muscle growth. Recent attempts, however, to reproduce a comparable phenotype in zebrafish have failed. Several aspects of MSTN biology in the fishes differ significantly from those in mammals and at least two distinct paralogs have been identified in some species, which possibly suggests functional divergence between the different vertebrate classes or between fish paralogs. We therefore conducted a phylogenetic analysis of the entire MSTN gene sub-family. Maximum likelihood, Bayesian inference, and bootstrap analyses indicated a monophyletic distribution of all MSTN genes with two distinct fish clades: MSTN-1 and -2. These analyses further indicated that all Salmonid genes described are actually MSTN-1 orthologs and that additional MSTN-2 paralogs may be present in most, if not all, teleosts. An additional zebrafish homolog was identified by BLAST searches of the zebrafish Hierarchical Tets Generation System database and was subsequently cloned. Comparative sequence analysis of both genes (zebrafish MSTN (zfMSTN)-1 and -2) revealed many differences, primarily within the latency-associated peptide regions, but also within the bioactive domains. The 2-kb promoter region of zfMSTN-2 contained many putative cis regulatory elements that are active during myogenesis, but are lacking in the zfMSTN-1 promoter. In fact, zfMSTN-2 expression was limited to the early stages of somitogenesis, whereas zfMSTN-1 was expressed throughout embryogenesis. These data suggest that zfMSTN-2 may be more closely associated with skeletal muscle growth and development. They also resolve the previous ambiguity in classification of fish MSTN genes.
Kikuchi, Shingo; Onuki, Yoshinori; Yasuda, Akihito; Hayashi, Yoshihiro; Takayama, Kozo
2011-03-01
A latent structure analysis of pharmaceutical formulations was performed using Kohonen's self-organizing map (SOM) and a Bayesian network. A hydrophilic matrix tablet containing diltiazem hydrochloride (DTZ), a highly water-soluble model drug, was used as a model formulation. Nonlinear relationship correlations among formulation factors (oppositely charged dextran derivatives and hydroxypropyl methylcellulose), latent variables (turbidity and viscosity of the polymer mixtures and binding affinity of DTZ to polymers), and release properties [50% dissolution times (t50s) and similarity factor] were clearly visualized by self organizing feature maps. The quantities of dextran derivatives forming polyion complexes were strongly related to the binding affinity of DTZ to polymers and t50s. The latent variables were classified into five characteristic clusters with similar properties by SOM clustering. The probabilistic graphical model of the latent structure was successfully constructed using a Bayesian network. The causal relationships among the factors were quantitatively estimated by inferring conditional probability distributions. Moreover, these causal relationships estimated by the Bayesian network coincided well with estimations by SOM clustering, and the probabilistic graphical model was reflected in the characteristics of SOM clusters. These techniques provide a better understanding of the latent structure between formulation factors and responses in DTZ hydrophilic matrix tablet formulations.
Bayesian GWAS and network analysis revealed new candidate genes for number of teats in pigs.
Verardo, L L; Silva, F F; Varona, L; Resende, M D V; Bastiaansen, J W M; Lopes, P S; Guimarães, S E F
2015-02-01
The genetic improvement of reproductive traits such as the number of teats is essential to the success of the pig industry. In contrast to most SNP association studies, which consider continuous phenotypes under Gaussian assumptions, this trait is characterized as a discrete variable, which could potentially follow other distributions, such as the Poisson. Therefore, in order to address the complexity of a count random regression that considers all SNPs simultaneously as covariates under a GWAS model, Bayesian inference tools become necessary. Another point that currently deserves to be highlighted in GWAS is the genetic dissection of complex phenotypes through candidate gene networks derived from significant SNPs. We present a full Bayesian treatment of SNP association analysis for number of teats assuming alternatively Gaussian and Poisson distributions for this trait. Under this framework, significant SNP effects were identified by hypothesis tests using 95% highest posterior density intervals. These SNPs were used to construct an associated candidate gene network aiming to explain the genetic mechanism behind this reproductive trait. The Bayesian model comparisons based on the deviance posterior distribution indicated the superiority of the Gaussian model. In general, our results suggest the presence of 19 significant SNPs, which mapped to 13 genes. In addition, we predicted gene interactions through networks that are consistent with known mammalian breast biology (e.g., development of prolactin receptor signaling, and cell proliferation), captured known regulation binding sites, and provided candidate genes for that trait (e.g., TINAGL1 and ICK).
Bayesian analysis of fingerprint, face and signature evidences with automatic biometric systems.
Gonzalez-Rodriguez, Joaquin; Fierrez-Aguilar, Julian; Ramos-Castro, Daniel; Ortega-Garcia, Javier
2005-12-20
The Bayesian approach provides a unified and logical framework for the analysis of evidence and to provide results in the form of likelihood ratios (LR) from the forensic laboratory to court. In this contribution we want to clarify how the biometric scientist or laboratory can adapt their conventional biometric systems or technologies to work according to this Bayesian approach. Forensic systems providing their results in the form of LR will be assessed through Tippett plots, which give a clear representation of the LR-based performance both for targets (the suspect is the author/source of the test pattern) and non-targets. However, the computation procedures of the LR values, especially with biometric evidences, are still an open issue. Reliable estimation techniques showing good generalization properties for the estimation of the between- and within-source variabilities of the test pattern are required, as variance restriction techniques in the within-source density estimation to stand for the variability of the source with the course of time. Fingerprint, face and on-line signature recognition systems will be adapted to work according to this Bayesian approach showing both the likelihood ratios range in each application and the adequacy of these biometric techniques to the daily forensic work.
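The role of the likelihood ratio can be sketched in a few lines: the laboratory reports LR = p(score | same source) / p(score | different source), and the court combines it with prior odds. The sketch below assumes, purely for illustration, that similarity scores under each hypothesis are Gaussian; the abstract itself notes that density estimation for real biometric scores is an open issue, and all names and parameter values here are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def likelihood_ratio(score, target, non_target):
    """LR of a similarity score.

    target and non_target are (mean, std) pairs describing the assumed
    Gaussian score models for same-source and different-source comparisons.
    """
    return gaussian_pdf(score, *target) / gaussian_pdf(score, *non_target)


def posterior_odds(prior_odds, lr):
    """Bayes' rule in odds form: posterior odds = prior odds * LR."""
    return prior_odds * lr


# Hypothetical score models: (mean, std) for targets and non-targets.
TARGET = (2.0, 1.0)
NON_TARGET = (0.0, 1.0)

lr_midpoint = likelihood_ratio(1.0, TARGET, NON_TARGET)  # equally likely under both
lr_at_target_mean = likelihood_ratio(2.0, TARGET, NON_TARGET)
```

A score halfway between the two assumed means gives an LR of 1 (the evidence favors neither hypothesis), while a score at the target mean gives an LR above 1. A Tippett plot summarizes how such LR values are distributed over many target and non-target comparisons.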
Alverson, Andrew J; Jansen, Robert K; Theriot, Edward C
2007-10-01
Salinity imposes a significant barrier to the distribution of many organisms, including diatoms. Diatoms are ancestrally marine, and the number of times they have independently colonized fresh waters and the physiological adaptations that facilitated these transitions remain outstanding questions in diatom evolution. The colonization of fresh waters by diatoms has been compared to "crossing the Rubicon," implying that successful colonization events are rare, irreversible, and lead to substantial species diversification. To test these hypotheses, we reconstructed the phylogeny of Thalassiosirales, a diatom lineage with high diversity in both marine and fresh waters. We collected approximately 5.3kb of DNA sequence data from the nuclear (SSU and partial LSU rDNA) and chloroplast genomes (psbC and rbcL) and reconstructed the phylogeny using parsimony and Bayesian methods. Alternative topology tests strongly reject all previous colonization hypotheses, including monophyly of the predominantly freshwater Stephanodiscaceae. Results showed at least three independent colonizations of fresh waters, and whereas previous accounts of freshwater-to-marine transitions have been discounted, these results provide compelling evidence for as many as three independent re-colonizations of the marine habitat, two of which led to speciation events. This study adds valuable phylogenetic context to previous debate about the nature of the salinity barrier in diatoms and provides compelling evidence that, at least for Thalassiosirales, the salinity barrier might be less formidable than previously thought.
Ma, Li; Dong, Wan-Wei; Jiang, Guo-Fang; Wang, Xing
2016-01-01
The sweet potato leaf folder, Brachmia macroscopa, is an important pest in China. The complete mitogenome, which consists of 13 protein-coding genes (PCGs), 22 transfer RNA genes, two ribosomal RNA genes, and an A + T-rich region, was sequenced and found to be 15,394 bp in length (GenBank no. KT354968). The gene order and orientation of the B. macroscopa mitogenome were similar to those of other sequenced lepidopteran species. All of the PCGs started with ATN as the canonical start codon except for cox1, which started with CGA. In regard to stop codons, most PCGs stopped at TAA except for cox2, which stopped at TA, and nad4, which stopped at a single T. Thirteen PCGs of the available species (33 taxa) were used to demonstrate phylogenetic relationships. The ditrysian cluster was supported as a monophyletic clade at high levels by using maximum likelihood and Bayesian methods. The apoditrysian group, covering the Gelechioidea, formed a monophyletic clade with a bootstrap value of 88% and a posterior probability of 1.00. The superfamily Gelechioidea was supported as a monophyletic lineage by a posterior probability of 1.00. PMID:26810560
Bayesian latent variable models for the analysis of experimental psychology data.
Merkle, Edgar C; Wang, Ting
2016-03-18
In this paper, we address the use of Bayesian factor analysis and structural equation models to draw inferences from experimental psychology data. While such application is non-standard, the models are generally useful for the unified analysis of multivariate data that stem from, e.g., subjects' responses to multiple experimental stimuli. We first review the models and the parameter identification issues inherent in the models. We then provide details on model estimation via JAGS and on Bayes factor estimation. Finally, we use the models to re-analyze experimental data on risky choice, comparing the approach to simpler, alternative methods.
Tomono, Takayoshi; Kojima, Hisao; Fukuchi, Satoshi; Tohsato, Yukako; Ito, Masahiro
2015-01-01
Glycans play important roles in such cell-cell interactions as signaling and adhesion, including processes involved in pathogenic infections, cancers, and neurological diseases. Glycans are biosynthesized by multiple glycosyltransferases (GTs), which function sequentially. Excluding mucin-type O-glycosylation, the non-reducing terminus of glycans is biosynthesized in the Golgi apparatus after the reducing terminus is biosynthesized in the ER. In the present study, we performed genome-wide analyses of human GTs by investigating the degree of conservation of homologues in other organisms, as well as by elucidating the phylogenetic relationship between cephalochordates and urochordates, which has long been controversial in deuterostome phylogeny. We analyzed 173 human GTs and functionally linked glycan synthesis enzymes by phylogenetic profiling and clustering, compiled orthologous genes from the genomes of other organisms, and converted them into a binary sequence based on the presence (1) or absence (0) of orthologous genes in the genomes. Our results suggest that the non-reducing terminus of glycans is biosynthesized by newly evolved GTs. According to our analysis, the phylogenetic profiles of GTs resemble the phylogenetic tree of life, where deuterostomes, metazoans, and eukaryotes are resolved into separate branches. Lineage-specific GTs appear to play essential roles in the divergence of these particular lineages. We suggest that urochordates lose several genes that are conserved among metazoans, such as those expressing sialyltransferases, and that the Golgi apparatus acquires the ability to synthesize glycans after the ER acquires this function. PMID:27493855
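The phylogenetic profiling step described above (a binary string per gene, 1 for presence and 0 for absence of an ortholog in each genome) can be sketched as follows. The gene names, genome list, and ortholog sets are invented for illustration, and Hamming distance is only one plausible way to compare profiles before clustering; the paper's own distance measure is not stated in the abstract.

```python
def build_profile(present_in, genomes):
    """Binary presence (1) / absence (0) profile of one gene across genomes."""
    return tuple(1 if genome in present_in else 0 for genome in genomes)


def hamming(profile_a, profile_b):
    """Number of genomes in which two profiles disagree."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Hypothetical genome panel and ortholog occurrences (not data from the study).
genomes = ["human", "mouse", "ciona", "fly", "yeast"]
orthologs = {
    "GT_A": {"human", "mouse", "ciona", "fly", "yeast"},  # conserved everywhere
    "GT_B": {"human", "mouse", "fly"},                    # absent in the urochordate
    "GT_C": {"human", "mouse"},                           # lineage-specific
}

profiles = {gene: build_profile(found, genomes) for gene, found in orthologs.items()}
```

Genes with small pairwise distances share an evolutionary distribution and would fall into the same cluster. In this toy panel, GT_B and GT_C differ in one genome, whereas GT_A and GT_C differ in three.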
Phylogenetic Analysis of Petunia sensu Jussieu (Solanaceae) using Chloroplast DNA RFLP
ANDO, TOSHIO; KOKUBUN, HISASHI; WATANABE, HITOSHI; TANAKA, NORIO; YUKAWA, TOMOHISA; HASHIMOTO, GORO; MARCHESI, EDUARDO; SUÁREZ, ENRIQUE; BASUALDO, ISABEL L.
2005-01-01
• Background and Aims The phylogenetic relationships of Petunia sensu Jussieu (Petunia sensu Wijsman plus Calibrachoa) are unclear. This study aimed to resolve this uncertainty using molecular evidence. • Methods Phylogenetic trees of 52 taxa of Petunia sensu Jussieu were constructed using restriction fragment length polymorphism (RFLP) of chloroplast DNA digested with 19 restriction enzymes and hybridized with 12 cloned Nicotiana chloroplast DNA fragments as probes. • Key Results In all, 89 phylogenetically informative RFLPs were detected and one 50 % majority consensus tree was obtained, using the maximum parsimony method, and one distance matrix tree, using the neighbour joining method. Petunia sensu Wijsman and Calibrachoa were monophyletic sister clades in both trees. Calibrachoa parviflora and C. pygmaea, previously thought to differ from the other species in terms of their cross-compatibility, seed morphology, and nuclear DNA content, formed a basal clade that was sister to the remainder of Calibrachoa. Several clades found in the phylogenetic trees corresponded to their distribution ranges, suggesting that recent speciation in the genus Petunia sensu Jussieu occurred independently in several different regions. • Conclusions The separation of Petunia sensu Wijsman and Calibrachoa was supported by chloroplast DNA analysis. Two groups in the Calibrachoa were also recognized with a high degree of confidence. PMID:15944177
Phylogenetic analysis of ALAD and MGP genes related to lead toxicity.
Shaik, A P; Khan, M; Jamil, K
2009-07-01
Experimental studies in our laboratory have established the role of delta-aminolevulinic acid dehydratase (ALAD) and matrix gamma-carboxyglutamic acid (MGP) gene polymorphisms in the etiology of lead toxicity. Polymorphisms in these genes influenced the levels of lead in subjects exposed to this metal. In extension to our studies, we aimed to investigate the possible role of these proteins in evolution by studying the phylogenetic relationship and divergence of ALAD and MGP genes using computational phylogenetic methods. The human ALAD and MGP protein sequences from various species were retrieved from Swiss-Prot database and were compared using Basic Local Alignment Search Tool. Multiple sequence alignment was carried out using ClustalW with defaults, and phylogenetic trees for both the genes were built using neighbor-joining method as in Mega software. Our study indicated that ALAD is a highly conserved protein with the same metal binding site distributed in all the phyla (from archaea to chordates). Phylogenetic analysis of MGP gene revealed that it had an important role in the evolution of endogenous skeleton in contrast to exoskeleton of insects. Occurrence of these genes in evolution with conserved metal binding sites strengthens the role of ALAD and MGP genes in regulating heme biosynthesis and mineralization, respectively, in evolution and helps in better understanding of lead poisoning.
NASA Astrophysics Data System (ADS)
Hidayat, Topik; Priyandoko, Didik; Islami, Dina Karina; Wardiny, Putri Yunitha
2016-02-01
Solanaceae is one of the largest families of angiosperms and is highly diverse in morphological characters. In Indonesia, this group of plants is very popular because of its usefulness as food, ornamental, and medicinal plants. However, the phylogenetic relationships among members of this family in Indonesia have received little attention. The purpose of this study was to evaluate the phylogenetic relationships of the family, especially of species distributed in Indonesia. DNA sequences of the Internal Transcribed Spacer (ITS) region of 19 species of Solanaceae and three outgroup species, belonging to the families Convolvulaceae, Apocynaceae, and Plantaginaceae, were isolated, amplified, and sequenced. Phylogenetic tree analysis based on the parsimony method was conducted using data derived from ITS-1, 5.8S, and ITS-2 separately and from the combination of all three. Results indicated that the phylogenetic tree derived from the combined data established a better pattern of relationships than the separate data. Three major groups were revealed. Group 1 consists of the tribes Datureae, Cestreae, and Petunieae, whereas group 2 comprises members of the tribe Physaleae. Group 3 belongs to the tribe Solaneae. The use of the ITS region as a molecular marker, in general, supports the global Solanaceae relationships that have been previously reported.
Phylogenetic Analysis of Selected Menthol-Producing Species Belonging to the Lamiaceae Family.
Mirzaei, Motahareh; Mirzaei, Hamed; Sahebkar, Amirhossein; Bagherian, Ali; Masoud Khoi, Mohammad Jaber; Reza Mirzaei, Hamid; Salehi, Rasoul; Reza Jaafari, Mahmoud; Kazemi Oskuee, Reza
2015-01-01
Menthol is an organic compound with diverse medicinal and commercial applications, and is made either synthetically or through extraction from mint oils. The aim of the present study was to investigate menthol levels in selected menthol-producing species belonging to the Lamiaceae family, and to determine phylogenetic relationships of the menthol dehydrogenase gene sequence among these species. Three genera of Lamiaceae, namely Mentha, Salvia, and Micromeria, were selected for phytochemical and phylogenetic analyses. After identification of each species based on the menthol dehydrogenase gene in NCBI, BLAST software was used for the sequence alignment. MEGA4 software was used to draw a phylogenetic tree for the various species. Phytochemical analysis revealed that the highest and lowest amounts of both essential oil and menthol belonged to Mentha spicata and Micromeria hyssopifolia, respectively. The species Mentha spicata and Mentha piperita, which were assigned to one cluster in the dendrogram, contained the highest amounts of essential oil and menthol, while the Micromeria species, which was in a distinct cluster and placed at a greater evolutionary distance, contained the lowest amount of essential oil and menthol. Phylogenetic and phytochemical analyses showed that essential oil and menthol contents of menthol-producing species are associated with the menthol dehydrogenase gene sequence.
NASA Technical Reports Server (NTRS)
Sheridan, Peter P.; Miteva, Vanya I.; Brenchley, Jean E.
2003-01-01
The examination of microorganisms in glacial ice cores allows the phylogenetic relationships of organisms frozen for thousands of years to be compared with those of current isolates. We developed a method for aseptically sampling a sediment-containing portion of a Greenland ice core that had remained at -9 degrees C for over 100,000 years. Epifluorescence microscopy and flow cytometry results showed that the ice sample contained over 6 x 10^7 cells/ml. Anaerobic enrichment cultures inoculated with melted ice were grown and maintained at -2 degrees C. Genomic DNA extracted from these enrichments was used for the PCR amplification of 16S rRNA genes with bacterial and archaeal primers and the preparation of clone libraries. Approximately 60 bacterial inserts were screened by restriction endonuclease analysis and grouped into 27 unique restriction fragment length polymorphism types, and 24 representative sequences were compared phylogenetically. Diverse sequences representing major phylogenetic groups including alpha, beta, and gamma Proteobacteria as well as relatives of the Thermus, Bacteroides, Eubacterium, and Clostridium groups were found. Sixteen clone sequences were closely related to those from known organisms, with four possibly representing new species. Seven sequences may reflect new genera and were most closely related to sequences obtained only by PCR amplification. One sequence was over 12% distant from its closest relative and may represent a novel order or family. These results show that phylogenetically diverse microorganisms have remained viable within the Greenland ice core for at least 100,000 years.
Sheridan, Peter P.; Miteva, Vanya I.; Brenchley, Jean E.
2003-01-01
The examination of microorganisms in glacial ice cores allows the phylogenetic relationships of organisms frozen for thousands of years to be compared with those of current isolates. We developed a method for aseptically sampling a sediment-containing portion of a Greenland ice core that had remained at −9°C for over 100,000 years. Epifluorescence microscopy and flow cytometry results showed that the ice sample contained over 6 × 10^7 cells/ml. Anaerobic enrichment cultures inoculated with melted ice were grown and maintained at −2°C. Genomic DNA extracted from these enrichments was used for the PCR amplification of 16S rRNA genes with bacterial and archaeal primers and the preparation of clone libraries. Approximately 60 bacterial inserts were screened by restriction endonuclease analysis and grouped into 27 unique restriction fragment length polymorphism types, and 24 representative sequences were compared phylogenetically. Diverse sequences representing major phylogenetic groups including alpha, beta, and gamma Proteobacteria as well as relatives of the Thermus, Bacteroides, Eubacterium, and Clostridium groups were found. Sixteen clone sequences were closely related to those from known organisms, with four possibly representing new species. Seven sequences may reflect new genera and were most closely related to sequences obtained only by PCR amplification. One sequence was over 12% distant from its closest relative and may represent a novel order or family. These results show that phylogenetically diverse microorganisms have remained viable within the Greenland ice core for at least 100,000 years. PMID:12676695
Hierarchical Bayesian Modeling, Estimation, and Sampling for Multigroup Shape Analysis
Yu, Yen-Yun; Fletcher, P. Thomas; Awate, Suyash P.
2016-01-01
This paper proposes a novel method for the analysis of anatomical shapes present in biomedical image data. Motivated by the natural organization of population data into multiple groups, this paper presents a novel hierarchical generative statistical model on shapes. The proposed method represents shapes using pointsets and defines a joint distribution on the population’s (i) shape variables and (ii) object-boundary data. The proposed method solves for optimal (i) point locations, (ii) correspondences, and (iii) model-parameter values as a single optimization problem. The optimization uses expectation maximization relying on a novel Markov-chain Monte-Carlo algorithm for sampling in Kendall shape space. Results on clinical brain images demonstrate advantages over the state of the art. PMID:25320776
Phylogenetic concordance analysis shows an emerging pathogen is novel and endemic.
Storfer, Andrew; Alfaro, Michael E; Ridenhour, Benjamin J; Jancovich, James K; Mech, Stephen G; Parris, Matthew J; Collins, James P
2007-11-01
Distinguishing whether pathogens are novel or endemic is critical for controlling emerging infectious diseases, an increasing threat to wildlife and human health. To test the endemic vs. novel pathogen hypothesis, we present a unique analysis of intraspecific host-pathogen phylogenetic concordance of tiger salamanders and an emerging Ranavirus throughout Western North America. There is significant non-concordance of host and virus gene trees, suggesting pathogen novelty. However, non-concordance has likely resulted from virus introductions by human movement of infected salamanders. When human-associated viral introductions are excluded, host and virus gene trees are identical, strongly supporting coevolution and endemism. A laboratory experiment showed an introduced virus strain is significantly more virulent than endemic strains, likely due to artificial selection for high virulence. Thus, our analysis of intraspecific phylogenetic concordance revealed that human introduction of viruses is the mechanism underlying tree non-concordance and possibly disease emergence via artificial selection.
Mikkelsen, Deirdre; Milinovich, Gabriel J; Burrell, Paul C; Huynh, Sharnan C; Pettett, Lyndall M; Blackall, Linda L; Trott, Darren J; Bird, Philip S
2008-09-01
Porphyromonas species are frequently isolated from the oral cavity and are associated with periodontal disease in both animals and humans. Black, pigmented Porphyromonas spp. isolated from the gingival margins of selected wild and captive Australian marsupials with varying degrees of periodontal disease (brushtail possums, koalas and macropods) were compared phylogenetically to Porphyromonas strains from non-marsupials (bear, wolf, coyote, cats and dogs) and Porphyromonas gingivalis strains from humans using 16S rRNA gene sequence analysis. The results of the phylogenetic analysis identified three distinct groups of strains. A monophyletic P. gingivalis group (Group 1) contained only strains isolated from humans and a Porphyromonas gulae group (Group 2) was divided into three distinct subclades, each containing both marsupial and non-marsupial strains. Group 3, which contained only marsupial strains, including all six strains isolated from captive koalas, was genetically distinct from P. gulae and may constitute a new Porphyromonas species.
K-mer natural vector and its application to the phylogenetic analysis of genetic sequences
Wen, Jia; Chan, Raymond H.; Yau, Shek-Chung; He, Rong L.; Yau, Stephen S. T.
2014-01-01
Based on the well-known k-mer model, we propose a k-mer natural vector model for representing a genetic sequence based on the numbers and distributions of k-mers in the sequence. We show that there exists a one-to-one correspondence between a genetic sequence and its associated k-mer natural vector. The k-mer natural vector method can be easily and quickly used to perform phylogenetic analysis of genetic sequences without requiring evolutionary models or human intervention. Whole or partial genomes can be handled more effectively with our proposed method. It is applied to the phylogenetic analysis of genetic sequences, and the results obtained demonstrate that the k-mer natural vector method is a very powerful tool for analysing and annotating genetic sequences and determining evolutionary relationships, in terms of both accuracy and efficiency. PMID:24858075
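The idea of summarizing a sequence by the numbers and distributions of its k-mers can be illustrated with a minimal Python sketch. It assumes the usual natural-vector convention of three descriptors per k-mer (count, mean position, and a normalized second central moment of the positions); the exact normalization, the 1-based positions, and the function names are assumptions made for illustration and are not taken from the abstract.

```python
from collections import defaultdict
from itertools import product
import math


def kmer_natural_vector(seq, k, alphabet="ACGT"):
    """Return counts, mean positions and second moments for every k-mer.

    The result is a flat list of length 3 * len(alphabet)**k, ordered as
    [all counts] + [all mean positions] + [all normalized second moments].
    The normalization by (count * number of windows) is an assumption.
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    positions = defaultdict(list)
    for i in range(n_windows):
        positions[seq[i:i + k]].append(i + 1)  # 1-based start position

    counts, means, moments = [], [], []
    for letters in product(alphabet, repeat=k):
        pos = positions.get("".join(letters), [])
        n = len(pos)
        mu = sum(pos) / n if n else 0.0
        d2 = sum((p - mu) ** 2 for p in pos) / (n * n_windows) if n else 0.0
        counts.append(n)
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def euclidean_distance(u, v):
    """Distance between two vectors; usable as input to a distance-based tree method."""
    return math.dist(u, v)
```

Pairwise distances between such vectors could then be passed to any distance-based tree-building method, which is one way an alignment-free comparison of sequences might proceed.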
Sequence and phylogenetic analysis of M-class genome segments of novel duck reovirus NP03
Wang, Shao; Chen, Shilong; Cheng, Xiaoxia; Chen, Shaoying; Lin, FengQiang; Jiang, Bing; Zhu, Xiaoli; Li, Zhaolong; Wang, Jinxiang
2015-01-01
We report the sequence and phylogenetic analysis of the entire M1, M2, and M3 genome segments of the novel duck reovirus (NDRV) NP03. Alignment between the newly determined nucleotide sequences as well as their deduced amino acid sequences and the published sequences of avian reovirus (ARV) was carried out with DNASTAR software. Sequence comparison showed that the M2 gene had the most variability among the M-class genes of DRV. Phylogenetic analysis of the M-class genes of ARV strains revealed different lineages and clusters within DRVs. The 5 NDRV strains used in this study fall into a well-supported lineage that includes chicken ARV strains, whereas Muscovy DRV (MDRV) strains are separate from NDRV strains and form a distinct genetic lineage in the M2 gene tree. However, the MDRV and NDRV strains are closely related and located in a common lineage in the M1 and M3 gene trees, respectively. PMID:25852231
Ortega, Alonso; Labrenz, Stephan; Markowitsch, Hans J; Piefke, Martina
2013-01-01
In the last decade, different statistical techniques have been introduced to improve assessment of malingering-related poor effort. In this context, we have recently shown preliminary evidence that a Bayesian latent group model may help to optimize classification accuracy using a simulation research design. In the present study, we conducted two analyses. Firstly, we evaluated how accurately this Bayesian approach can distinguish between participants answering in an honest way (honest response group) and participants feigning cognitive impairment (experimental malingering group). Secondly, we tested the accuracy of our model in the differentiation between patients who had real cognitive deficits (cognitively impaired group) and participants who belonged to the experimental malingering group. All Bayesian analyses were conducted using the raw scores of a visual recognition forced-choice task (2AFC), the Test of Memory Malingering (TOMM, Trial 2), and the Word Memory Test (WMT, primary effort subtests). The first analysis showed 100% accuracy for the Bayesian model in distinguishing participants of both groups with all effort measures. The second analysis showed outstanding overall accuracy of the Bayesian model when estimates were obtained from the 2AFC and the TOMM raw scores. Diagnostic accuracy of the Bayesian model diminished when using the WMT total raw scores. Despite this, overall diagnostic accuracy can still be considered excellent. The most plausible explanation for this decrement is the low performance in verbal recognition and fluency tasks of some patients of the cognitively impaired group. Additionally, the Bayesian model provides individual estimates, p(z_i | D), of examinees' effort levels. In conclusion, both high classification accuracy levels and Bayesian individual estimates of effort may be very useful for clinicians when assessing for effort in medico-legal settings.
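An individual estimate such as p(z_i | D) can be illustrated with a deliberately simplified two-group sketch in Python. It assumes a binomial likelihood for a forced-choice raw score and fixed, hypothetical success rates for the two groups (0.95 and 0.50) and a hypothetical prior; the model in the study infers its group-level parameters from the data, so this is not the authors' model, only the Bayes-rule step for one examinee.

```python
import math


def log_binomial_pmf(k, n, theta):
    """Log probability of k correct answers out of n with success rate theta."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(theta) + (n - k) * math.log(1.0 - theta))


def posterior_poor_effort(k, n, theta_honest=0.95, theta_poor=0.50, prior_poor=0.5):
    """Posterior probability that a score of k out of n comes from the poor-effort group.

    theta_honest, theta_poor and prior_poor are illustrative assumptions.
    """
    log_poor = math.log(prior_poor) + log_binomial_pmf(k, n, theta_poor)
    log_honest = math.log(1.0 - prior_poor) + log_binomial_pmf(k, n, theta_honest)
    shift = max(log_poor, log_honest)  # subtract the maximum for numerical stability
    w_poor = math.exp(log_poor - shift)
    w_honest = math.exp(log_honest - shift)
    return w_poor / (w_poor + w_honest)
```

A score near chance on a 50-item task pushes the posterior toward the poor-effort group, while a near-perfect score pushes it toward the honest group; the output is a probability per examinee rather than a hard cut-off classification.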
Hierarchical models and Bayesian analysis of bird survey information
Sauer, J.R.; Link, W.A.; Royle, J. Andrew; Ralph, C. John; Rich, Terrell D.
2005-01-01
Summary of bird survey information is a critical component of conservation activities, but often our summaries rely on statistical methods that do not accommodate the limitations of the information. Prioritization of species requires ranking and analysis of species by magnitude of population trend, but often magnitude of trend is a misleading measure of actual decline when trend is poorly estimated. Aggregation of population information among regions is also complicated by varying quality of estimates among regions. Hierarchical models provide a reasonable means of accommodating concerns about aggregation and ranking of quantities of varying precision. In these models the need to consider multiple scales is accommodated by placing distributional assumptions on collections of parameters. For collections of species trends, this allows probability statements to be made about the collections of species-specific parameters, rather than about the estimates. We define and illustrate hierarchical models for two commonly encountered situations in bird conservation: (1) Estimating attributes of collections of species estimates, including ranking of trends, estimating number of species with increasing populations, and assessing population stability with regard to predefined trend magnitudes; and (2) estimation of regional population change, aggregating information from bird surveys over strata. User-friendly computer software makes hierarchical models readily accessible to scientists.
2014-01-01
Background: Odd traits found in only a few plant species often point to potential biological significance in plant evolution. The genus Helwingia Willd, a dioecious medicinal shrub in the order Aquifoliales, has an odd floral architecture: an epiphyllous inflorescence. The potential significance and possible evolutionary origin of this species are not well understood because few biological and genetic data are available. In addition, the advent of genomics-based technologies has revolutionized the study of plant species with unknown genomic information. Results: Morphological and biological patterns were detailed via anatomical and pollination analyses. An RNA sequencing-based transcriptomic analysis was undertaken, and a high-resolution phylogenetic analysis was conducted based on single-copy genes in more than 80 species of seed plants, including H. japonica. It was verified that a potential fusion of the rachis to the leaf midvein facilitates insect pollination. RNA sequencing yielded a total of 111450 unigenes; half of them had significant similarity with proteins in the public database, and 20281 unigenes were mapped to 119 pathways. As deduced from the phylogenetic analysis based on single-copy genes, Helwingia is closer to Euasterids II rather than Euasterids, congruent with previous reports using plastid sequences. Conclusions: The odd flower architecture adapts Helwingia Willd to insect pollination by hosting insects larger than the flower via the leaf, a character shared by few other insect-pollinated plants. Furthermore, the present transcriptome greatly enriches the genomic information for Helwingia species, and the phylogenetic analysis based on nuclear genes greatly improves the resolution and robustness of phylogenetic reconstruction for H. japonica. PMID:24969969
Genotyping and Phylogenetic Analysis of Giardia duodenalis Isolates from Turkish Children
Tamer, Gulden Sonmez; Kasap, Murat; Er, Doganhan Kadir
2015-01-01
Background: Giardiasis is caused by the intestinal protozoan parasite Giardia duodenalis (synonyms: G. lamblia, G. intestinalis), which is one of the most frequent parasites that infect Turkish children. However, molecular characterization of G. duodenalis in Turkey is relatively scarce. The present work aimed at genotyping G. duodenalis isolates from Turkey using molecular techniques. Material/Methods: In the present study, 145 fecal samples from children were collected to search for the presence of Giardia by microscopy and PCR screening. PCR generated a 384 bp fragment for β-giardin. The PCR products were sequenced and the sequences were subjected to phylogenetic analysis by using PHYLIP. Results: Based on the phylogenetic analysis of the sequences, assemblage A, B, and mixed subtypes were determined. Of 22 isolates, 11 were identified as assemblage A (50%), 7 were assemblage B (31.8%), and 4 were assemblage AB (18.2%). Association between G. duodenalis assemblages and the epidemiological data was analyzed. No correlation was found between symptoms and infection with specific assemblages (P>0.05), but we found a statistically significant association between age and assemblage AB (P=0.001). Conclusions: The association between G. duodenalis and the epidemiologic data was analyzed. Since assemblage A is the more prevalent subgroup compared with assemblage B, this subgroup might be responsible for common Giardia infections in Turkey. This is the first study that included a detailed phylogenetic analysis of Giardia strains from Turkey. PMID:25689970
Distribution and genetic analysis of TTV and TTMV major phylogenetic groups in French blood donors.
Biagini, Philippe; Gallian, Pierre; Cantaloube, Jean-François; Attoui, Houssam; de Micco, Philippe; de Lamballerie, Xavier
2006-02-01
TTV and TTMV (recently assigned to the floating genus Anellovirus) infect human populations (including healthy individuals) at high prevalence (>80%). They display notably high levels of genetic diversity, but very little is known regarding the distribution of Anellovirus genetic groups in human populations. We analyzed the distribution of the major genetic groups of TTV and TTMV in healthy voluntary blood donors using group-independent and group-specific PCR amplification systems, combined with sequence determination and phylogenetic analysis. Analysis of Anellovirus groups revealed a non-random pattern of group distribution with a predominant prevalence of TTV phylogenetic groups 1, 3, and 5, and of TTMV group 1. Multiple co-infections were observed. In addition, TTMV sequences exhibiting a high genetic divergence from reference sequences were identified. This study provided the first picture of the genetic distribution of the major phylogenetic groups of members of the genus Anellovirus in a cohort of French voluntary blood donors. Obtaining such data from a reference population comprising healthy individuals was an essential step that will allow the subsequent comparative analysis of cohorts including patients with well-characterized diseases, in order to identify any possible relationship between Anellovirus infection and human diseases.
Aisen, Santiago; Ramírez, Martín J
2015-08-06
We review the spider genus Oxysoma Nicolet, with most of its species endemic to the southern temperate forests in Chile and Argentina, and present a phylogenetic analysis including seven species, of which three are newly described in this study (O. macrocuspis new species, O. kuni new species, and O. losruiles new species, all from Chile), together with 107 other representatives of Anyphaenidae. New geographical records and distribution maps are provided for all species, with illustrations and reviewed diagnoses for the genus and the four previously known species (O. punctatum Nicolet, O. saccatum (Tullgren), O. longiventre (Nicolet) and O. itambezinho Ramírez). The phylogenetic analysis using cladistic methods is based on 264 previously defined characters plus one character that arises from this study. The three new species are closely related to Oxysoma longiventre, and these four species compose what we define as the Oxysoma longiventre species group. The phylogenetic analysis did not retrieve the monophyly of Oxysoma, which should be reevaluated in the future, together with the genus Tasata.
NASA Astrophysics Data System (ADS)
Hobson, Michael P.; Jaffe, Andrew H.; Liddle, Andrew R.; Mukherjee, Pia; Parkinson, David
2009-12-01
Preface; Part I. Methods: 1. Foundations and algorithms John Skilling; 2. Simple applications of Bayesian methods D. S. Sivia and Steve Rawlings; 3. Parameter estimation using Monte Carlo sampling Antony Lewis and Sarah Bridle; 4. Model selection and multi-model inference Andrew R. Liddle, Pia Mukherjee and David Parkinson; 5. Bayesian experimental design and model selection forecasting Roberto Trotta, Martin Kunz, Pia Mukherjee and David Parkinson; 6. Signal separation in cosmology M. P. Hobson, M. A. J. Ashdown and V. Stolyarov; Part II. Applications: 7. Bayesian source extraction M. P. Hobson, Graça Rocha and R. Savage; 8. Flux measurement Daniel Mortlock; 9. Gravitational wave astronomy Neil Cornish; 10. Bayesian analysis of cosmic microwave background data Andrew H. Jaffe; 11. Bayesian multilevel modelling of cosmological populations Thomas J. Loredo and Martin A. Hendry; 12. A Bayesian approach to galaxy evolution studies Stefano Andreon; 13. Photometric redshift estimation: methods and applications Ofer Lahav, Filipe B. Abdalla and Manda Banerji; Index.
NASA Astrophysics Data System (ADS)
Hobson, Michael P.; Jaffe, Andrew H.; Liddle, Andrew R.; Mukherjee, Pia; Parkinson, David
2014-02-01
Preface; Part I. Methods: 1. Foundations and algorithms John Skilling; 2. Simple applications of Bayesian methods D. S. Sivia and Steve Rawlings; 3. Parameter estimation using Monte Carlo sampling Antony Lewis and Sarah Bridle; 4. Model selection and multi-model inference Andrew R. Liddle, Pia Mukherjee and David Parkinson; 5. Bayesian experimental design and model selection forecasting Roberto Trotta, Martin Kunz, Pia Mukherjee and David Parkinson; 6. Signal separation in cosmology M. P. Hobson, M. A. J. Ashdown and V. Stolyarov; Part II. Applications: 7. Bayesian source extraction M. P. Hobson, Graça Rocha and R. Savage; 8. Flux measurement Daniel Mortlock; 9. Gravitational wave astronomy Neil Cornish; 10. Bayesian analysis of cosmic microwave background data Andrew H. Jaffe; 11. Bayesian multilevel modelling of cosmological populations Thomas J. Loredo and Martin A. Hendry; 12. A Bayesian approach to galaxy evolution studies Stefano Andreon; 13. Photometric redshift estimation: methods and applications Ofer Lahav, Filipe B. Abdalla and Manda Banerji; Index.
Crash risk analysis for Shanghai urban expressways: A Bayesian semi-parametric modeling approach.
Yu, Rongjie; Wang, Xuesong; Yang, Kui; Abdel-Aty, Mohamed
2016-10-01
Urban expressway systems have developed rapidly in China in recent years; they have become a key part of city roadway networks, carrying large traffic volumes and providing high travel speeds. Along with the increase in traffic volume, traffic safety has become a major issue for Chinese urban expressways because of frequent crashes and the non-recurrent congestion they cause. To unveil crash occurrence mechanisms and to further develop Active Traffic Management (ATM) control strategies that improve traffic safety, this study developed disaggregate crash risk analysis models with loop detector traffic data and historical crash data. Bayesian random effects logistic regression models were used because they can account for the unobserved heterogeneity among crashes. However, previous crash risk analysis studies formulated random effects distributions parametrically, assuming that they follow normal distributions. Because little is known about random effects distributions, a subjective parametric setting may be incorrect. In order to construct more flexible and robust random effects to capture the unobserved heterogeneity, a Bayesian semi-parametric inference technique was introduced to crash risk analysis in this study. Models with both inference techniques were developed for total crashes; the semi-parametric models provided substantially better goodness-of-fit, while the two models yielded consistent coefficient estimates. Bayesian semi-parametric random effects logistic regression models were then developed for weekday peak hour crashes, weekday non-peak hour crashes, and weekend non-peak hour crashes to investigate different crash occurrence scenarios. Significant factors that affect crash risk were identified and crash mechanisms were inferred.
Tetreau, Guillaume; Cao, Xiaolong; Chen, Yun-Ru; Muthukrishnan, Subbaratnam; Jiang, Haobo; Blissard, Gary W; Kanost, Michael R; Wang, Ping
2015-07-01
Chitin is one of the most abundant biomaterials in nature. The biosynthesis and degradation of chitin in insects are complex and dynamically regulated to cope with insect growth and development. Chitin metabolism in insects is known to involve numerous enzymes, including chitin synthases (synthesis of chitin), chitin deacetylases (modification of chitin by deacetylation) and chitinases (degradation of chitin by hydrolysis). In this study, we conducted a genome-wide search and analysis of genes encoding these chitin metabolism enzymes in Manduca sexta. Our analysis confirmed that only two chitin synthases are present in M. sexta as in most other arthropods. Eleven chitin deacetylases (encoded by nine genes) were identified, with at least one representative in each of the five phylogenetic groups that have been described for chitin deacetylases to date. Eleven genes encoding for family 18 chitinases (GH18) were found in the M. sexta genome. Based on the presence of conserved sequence motifs in the catalytic sequences and phylogenetic relationships, two of the M. sexta chitinases did not cluster with any of the current eight phylogenetic groups of chitinases: two new groups were created (groups IX and X) and their characteristics are described. The result of the analysis of the Lepidoptera-specific chitinase-h (group h) is consistent with its proposed bacterial origin. By analyzing chitinases from fourteen species that belong to seven different phylogenetic groups, we reveal that the chitinase genes appear to have evolved sequentially in the arthropod lineage to achieve the current high level of diversity observed in M. sexta. Based on the sequence conservation of the catalytic domains and on their developmental stage- and tissue-specific expression, we propose putative functions for each group in each category of enzymes.
Bayesian design and analysis of computer experiments: Use of derivatives in surface prediction
Morris, M.D.; Mitchell, T.J.; Ylvisaker, D.
1991-06-01
The work of Currin et al. and others in developing "fast predictive approximations" of computer models is extended for the case in which derivatives of the output variable of interest with respect to input variables are available. In addition to describing the calculations required for the Bayesian analysis, the issue of experimental design is also discussed, and an algorithm is described for constructing "maximin distance" designs. An example is given based on a demonstration model of eight inputs and one output, in which predictions based on a maximin design, a Latin hypercube design, and two "compromise" designs are evaluated and compared. 12 refs., 2 figs., 6 tabs.
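The maximin distance criterion mentioned above scores a design by the smallest distance between any two of its points and prefers designs where that smallest distance is as large as possible. The Python sketch below illustrates the criterion on random Latin hypercube designs; the best-of-many random search is a stand-in chosen for brevity and is not the construction algorithm described in the report, and all function names are hypothetical.

```python
import itertools
import math
import random


def latin_hypercube(n_points, n_dims, rng):
    """Random Latin hypercube on [0, 1]^n_dims with points at cell centres."""
    columns = []
    for _ in range(n_dims):
        levels = list(range(n_points))
        rng.shuffle(levels)  # each level is used exactly once per dimension
        columns.append([(level + 0.5) / n_points for level in levels])
    return [tuple(column[i] for column in columns) for i in range(n_points)]


def min_pairwise_distance(design):
    """Maximin criterion value: the smallest Euclidean distance between two points."""
    return min(math.dist(p, q) for p, q in itertools.combinations(design, 2))


def best_of_random_lhs(n_points, n_dims, n_tries=200, seed=0):
    """Keep the random Latin hypercube with the largest minimum distance."""
    rng = random.Random(seed)
    best_design, best_score = None, -1.0
    for _ in range(n_tries):
        design = latin_hypercube(n_points, n_dims, rng)
        score = min_pairwise_distance(design)
        if score > best_score:
            best_design, best_score = design, score
    return best_design, best_score
```

For a model with eight inputs, as in the demonstration example, a call such as best_of_random_lhs(20, 8) would return a 20-run design together with its criterion value; the number of runs here is an arbitrary choice.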
Nonparametric Bayesian Dictionary Learning for Analysis of Noisy and Incomplete Images
2010-04-01
[Table excerpts from the report: PSNR results of KSVD and BPFA by noise level σ for the test images C.man, House, Peppers, Lena, Barbara, Boats, F.print, Couple and Hill; interpolation PSNR results using patch size 8 × 8 and BPFA RGB image interpolation PSNR results using patch size 7 × 7, by data ratio. Cited reference: T. Ferguson. A Bayesian analysis of some nonparametric problems. Annals of Statistics, 1:209–230.]
Integrated Data Analysis for Fusion: A Bayesian Tutorial for Fusion Diagnosticians
Dinklage, Andreas; Dreier, Heiko; Preuss, Roland; Fischer, Rainer; Gori, Silvio; Toussaint, Udo von
2008-03-12
Integrated Data Analysis (IDA) offers a unified way of combining information relevant to fusion experiments. Thereby, IDA meets with typical issues arising in fusion data analysis. In IDA, all information is consistently formulated as probability density functions quantifying uncertainties in the analysis within the Bayesian probability theory. For a single diagnostic, IDA allows the identification of faulty measurements and improvements in the setup. For a set of diagnostics, IDA gives joint error distributions allowing the comparison and integration of different diagnostics results. Validation of physics models can be performed by model comparison techniques. Typical data analysis applications benefit from IDA capabilities of nonlinear error propagation, the inclusion of systematic effects and the comparison of different physics models. Applications range from outlier detection, background discrimination, model assessment and design of diagnostics. In order to cope with next step fusion device requirements, appropriate techniques are explored for fast analysis applications.
Integrated Data Analysis for Fusion: A Bayesian Tutorial for Fusion Diagnosticians
NASA Astrophysics Data System (ADS)
Dinklage, Andreas; Dreier, Heiko; Fischer, Rainer; Gori, Silvio; Preuss, Roland; Toussaint, Udo von
2008-03-01
Integrated Data Analysis (IDA) offers a unified way of combining information relevant to fusion experiments. Thereby, IDA meets with typical issues arising in fusion data analysis. In IDA, all information is consistently formulated as probability density functions quantifying uncertainties in the analysis within the Bayesian probability theory. For a single diagnostic, IDA allows the identification of faulty measurements and improvements in the setup. For a set of diagnostics, IDA gives joint error distributions allowing the comparison and integration of different diagnostics results. Validation of physics models can be performed by model comparison techniques. Typical data analysis applications benefit from IDA capabilities of nonlinear error propagation, the inclusion of systematic effects and the comparison of different physics models. Applications range from outlier detection, background discrimination, model assessment and design of diagnostics. In order to cope with next step fusion device requirements, appropriate techniques are explored for fast analysis applications.
Graf, Daniel L; Jones, Hugh; Geneva, Anthony J; Pfeiffer, John M; Klunzinger, Michael W
2015-04-01
The freshwater mussel family Hyriidae (Mollusca: Bivalvia: Unionida) has a disjunct trans-Pacific distribution in Australasia and South America. Previous phylogenetic analyses have estimated the evolutionary relationships of the family and the major infra-familial taxa (Velesunioninae and Hyriinae: Hyridellini in Australia; Hyriinae: Hyriini, Castaliini, and Rhipidodontini in South America), but taxon and character sampling have been too incomplete to support a predictive classification or allow testing of biogeographical hypotheses. We sampled 30 freshwater mussel individuals representing the aforementioned hyriid taxa, as well as outgroup species representing the five other freshwater mussel families and their marine sister group (order Trigoniida). Our ingroup included representatives of all Australian genera. Phylogenetic relationships were estimated from three gene fragments (nuclear 28S, COI and 16S mtDNA) using maximum parsimony, maximum likelihood, and Bayesian inference, and we applied a Bayesian relaxed clock model calibrated with fossil dates to estimate node ages. Our analyses found good support for monophyly of the Hyriidae and the subfamilies and tribes, as well as the paraphyly of the Australasian taxa (Velesunioninae, (Hyridellini, (Rhipidodontini, (Castaliini, Hyriini)))). The Hyriidae was recovered as sister to a clade comprised of all other Recent freshwater mussel families. Our molecular date estimation supported Cretaceous origins of the major hyriid clades, pre-dating the Tertiary isolation of South America from Antarctica/Australia. We hypothesize that early diversification of the Hyriidae was driven by terrestrial barriers on Gondwana rather than marine barriers following disintegration of the super-continent.
Phylogenetic and Diversity Analysis of Dactylis glomerata Subspecies Using SSR and IT-ISJ Markers.
Yan, Defei; Zhao, Xinxin; Cheng, Yajuan; Ma, Xiao; Huang, Linkai; Zhang, Xinquan
2016-10-31
The genus Dactylis, an important forage crop, has a wide geographical distribution in temperate regions. While this genus is thought to include a single species, Dactylis glomerata, this species encompasses many subspecies whose relationships have not been fully characterized. In this study, the genetic diversity and phylogenetic relationships of nine representative Dactylis subspecies were examined using SSR and IT-ISJ markers. In total, 21 pairs of SSR primers and 15 pairs of IT-ISJ primers were used to amplify 295 polymorphic bands with polymorphic rates of 100%. The average polymorphic information contents (PICs) of SSR and IT-ISJ markers were 0.909 and 0.780, respectively. The combined data of the two markers indicated a high level of genetic diversity among the nine D. glomerata subspecies, with a Nei's gene diversity index value of 0.283 and Shannon's diversity of 0.448. Preliminary phylogenetic analysis results revealed that the 20 accessions could be divided into three groups (A, B, C). Furthermore, they could be divided into five clusters, which is similar to the structure analysis with K = 5. Phylogenetic placement in these three groups may be related to the distribution ranges and the climate types of the subspecies in each group. Group A contained eight accessions of four subspecies, originating from the west Mediterranean, while Group B contained seven accessions of three subspecies, originating from the east Mediterranean.
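The three summary statistics quoted above (PIC, Nei's gene diversity and Shannon's diversity) are all functions of allele or band frequencies at a locus. The Python sketch below uses the standard textbook forms for a single locus with frequencies that sum to one; how the study scored bands and averaged over loci is not stated in the abstract, so the per-locus treatment and the function names are assumptions.

```python
import math
from itertools import combinations


def nei_gene_diversity(freqs):
    """Nei's gene diversity for one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def shannon_index(freqs):
    """Shannon's diversity index for one locus: -sum(p_i * ln p_i)."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def polymorphic_information_content(freqs):
    """PIC for one locus: 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term
```

For two equally frequent alleles these functions give 0.5, ln 2 and 0.375 respectively; study-level values would typically be obtained by averaging such per-locus values over all markers.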
Pan, Ting Shuang; Nie, Pin
2013-07-01
Acanthocephalans are a small group of obligate endoparasites. They and rotifers have recently been placed in a group called Syndermata. However, phylogenetic relationships within classes of acanthocephalans, and between them and rotifers, have not been well resolved, possibly due to the lack of molecular data suitable for such analysis. In this study, the mitochondrial (mt) genome was sequenced from Pallisentis celatus (Van Cleave, 1928), an acanthocephalan in the class Eoacanthocephala and an intestinal parasite of the rice-field eel, Monopterus albus (Zuiew, 1793), in China. The complete mt genome sequence of P. celatus is 13 855 bp long, containing 36 genes including 12 protein-coding genes, 22 transfer RNAs (tRNAs) and 2 ribosomal RNAs (rRNAs) as reported for other acanthocephalan species. All genes are encoded on the same strand and in the same direction. Phylogenetic analysis indicated that acanthocephalans are closely related to a clade containing bdelloids, which then correlates with the clade containing monogononts. The class Eoacanthocephala, containing P. celatus and Paratenuisentis ambiguus (Van Cleave, 1921), was closely related to the Palaeacanthocephala. This indicates that acanthocephalans may simply cluster among groups of rotifers. However, resolving the phylogenetic relationships among all classes of acanthocephalans and between them and rotifers may require further sampling and more molecular data.
YAMASAKI, Masahiro; TSUBOI, Yoshihiro; TANIYAMA, Yusuke; UCHIDA, Naohiro; SATO, Reeko; NAKAMURA, Kensuke; OHTA, Hiroshi; TAKIGUCHI, Mitsuyoshi
2016-01-01
The Babesia gibsoni heat shock protein 90 (BgHSP90) gene was cloned and sequenced. The length of the gene was 2,610 bp with two introns. This gene was amplified from cDNA corresponding to the full-length coding sequence (CDS) with an open reading frame of 2,148 bp. A phylogenetic analysis of the CDS of the HSP90 gene showed that B. gibsoni was most closely related to B. bovis and Babesia sp. BQ1/Lintan and lies within a phylogenetic cluster of protozoa. Moreover, mRNA transcription profiles for BgHSP90 in parasites exposed to high temperature were examined by quantitative real-time reverse transcription-polymerase chain reaction. BgHSP90 levels were elevated when the parasites were incubated at 43°C for 1 hr. PMID:27149891
Maĭor, T Iu; Sheveleva, N G; Sukhanova, L V; Timoshkin, O A; Kiril'chik, S V
2010-11-01
Baikalian cyclopoids represent one of the richest endemic faunas of freshwater cyclopoid copepods. The genus Diacyclops Kiefer, 1927 is the most species-rich in the lake. In this work, a molecular phylogenetic analysis of 14 species and 1 subspecies from Lake Baikal and its catchment basin was performed. Regions of the mitochondrial cytochrome oxidase I (COI) gene and of the nuclear small-subunit 18S rRNA gene were used as evolutionary markers. In the obtained set of nucleotide sequences of the COI gene, an effect of synonymous substitution saturation was revealed. In the phylogenetic trees for both markers, Baikalian representatives of the genus Diacyclops form a monophyletic group, which suggests their origin from a common ancestral form. A preliminary estimate of the age of this group is 20-25 My.
Phylogenetic analysis and evolutionary origins of DNA polymerase X-family members
Bienstock, Rachelle J.; Beard, William A.; Wilson, Samuel H.
2014-01-01
Mammalian DNA polymerase (pol) β is the founding member of a large group of DNA polymerases now termed the X-family. DNA polymerase β has been kinetically, structurally, and biologically well characterized and can serve as a phylogenetic reference. Accordingly, we have performed a phylogenetic analysis to understand the relationship between pol β and other members of the X-family of DNA polymerases. The bacterial X-family DNA polymerases, Saccharomyces cerevisiae pol IV, and four mammalian X-family polymerases appear to be directly related. These enzymes originated from an ancient common ancestor characterized in two Bacillus species. Understanding distinct functions for each of the X-family polymerases, evolving from a common bacterial ancestor is of significant interest in light of the specialized roles of these enzymes in DNA metabolism. PMID:25112931
Garcia-Barrera, Ali A; Del Valle, Alberto; Montaño-Hirose, Juan A; Barrón, Blanca Lilia; Salinas-Trujano, Juana; Torres-Flores, Jesus
2017-02-09
We report the complete genome sequences of four neurovirulent isolates of porcine rubulavirus (PorPV) from 2015 and one historical PorPV isolate from 1984 obtained by next-generation sequencing. A phylogenetic tree constructed using the individual sequences of the complete HN genes of the 2015 isolates and other historical sequences deposited in the GenBank database revealed that several recent neurovirulent isolates of PorPV (2008-2015) cluster together in a separate clade. Phylogenetic analysis of the complete genome sequences revealed that the neurovirulent strains of PorPV that circulated in Mexico during 2015 are genetically different from the PorPV strains that circulated during the 1980s.
Fuzzy Bayesian Network-Bow-Tie Analysis of Gas Leakage during Biomass Gasification
Yan, Fang; Xu, Kaili; Yao, Xiwen; Li, Yang
2016-01-01
Biomass gasification technology has developed rapidly in recent years. However, fire and poisoning accidents caused by gas leakage restrict the development and promotion of biomass gasification. Therefore, probabilistic safety assessment (PSA) is necessary for biomass gasification systems. Bayesian network-bow-tie (BN-bow-tie) analysis was proposed by mapping bow-tie analysis into a Bayesian network (BN). Causes of gas leakage and the accidents triggered by gas leakage can be obtained by bow-tie analysis, and the BN was used to identify the critical nodes of accidents by introducing three corresponding importance measures. PSA also requires definite occurrence probabilities of failure. In view of the insufficient failure data for biomass gasification, occurrence probabilities of failure that cannot be obtained from standard reliability data sources were determined by fuzzy methods based on expert judgment. An improved approach that uses expert weighting to aggregate fuzzy numbers, including triangular and trapezoidal numbers, was proposed, and the occurrence probability of failure was obtained. Finally, safety measures were indicated based on the critical nodes obtained. With these safety measures, the theoretical one-year occurrence probabilities of gas leakage and of the accidents caused by it were reduced to 1/10.3 of their original values. PMID:27463975
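The idea of weighting expert judgments when aggregating fuzzy numbers can be illustrated with a minimal sketch. It assumes triangular fuzzy numbers only, a plain weighted average as the aggregation rule, and centroid defuzzification; the abstract does not specify the authors' improved approach, so the function names, the expert opinions and the weights here are all hypothetical.

```python
from typing import List, Tuple

# A triangular fuzzy number as (lower, modal, upper); an illustrative convention.
Triangular = Tuple[float, float, float]


def aggregate_triangular(opinions: List[Triangular], weights: List[float]) -> Triangular:
    """Weighted average of triangular fuzzy numbers (assumed aggregation rule)."""
    if not opinions or len(opinions) != len(weights):
        raise ValueError("need one weight per expert opinion")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalise so the weights sum to 1
    lower, modal, upper = (
        sum(w * opinion[i] for w, opinion in zip(norm, opinions)) for i in range(3)
    )
    return (lower, modal, upper)


def centroid(tfn: Triangular) -> float:
    """Centroid defuzzification of a triangular fuzzy number: (a + b + c) / 3."""
    return sum(tfn) / 3.0


# Hypothetical judgments from three experts about one failure possibility.
opinions = [(0.1, 0.2, 0.3), (0.2, 0.3, 0.4), (0.3, 0.4, 0.5)]
weights = [0.5, 0.3, 0.2]  # hypothetical expert weights

aggregated = aggregate_triangular(opinions, weights)
crisp = centroid(aggregated)
print(aggregated, crisp)
```

Trapezoidal numbers could be handled the same way with four components per opinion; converting the crisp value into a failure probability would need a further step that the abstract does not describe.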
Drug-drug interaction prediction: a Bayesian meta-analysis approach.
Li, Lang; Yu, Menggang; Chin, Raymond; Lucksiri, Aroonrut; Flockhart, David A; Hall, Stephen D
2007-09-10
In drug-drug interaction (DDI) research, a two-drug interaction is usually predicted from individual drug pharmacokinetics (PK). Although subject-specific drug concentration data from clinical PK studies on inhibitor/inducer or substrate PK are not usually published, sample mean plasma drug concentrations and their standard deviations have been routinely reported. In this paper, an innovative DDI prediction method based on a three-level hierarchical Bayesian meta-analysis model is developed. The first level model is a study-specific sample mean model; the second level model is a random effect model connecting different PK studies; and all priors of PK parameters are specified in the third level model. A Markov chain Monte Carlo (MCMC) PK parameter estimation procedure is developed, and DDI prediction for a future study is conducted based on the PK models of two drugs and posterior distributions of the PK parameters. The performance of Bayesian meta-analysis in DDI prediction is demonstrated through a ketoconazole-midazolam example. The biases of DDI prediction are evaluated through statistical simulation studies. The DDI marker, the ratio of areas under the concentration curves, is predicted with little bias (less than 5 per cent), and its 90 per cent credible interval coverage rate is close to the nominal level. Sensitivity analysis is conducted to justify prior distribution selections.
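The DDI marker named in the abstract, a ratio of areas under the concentration curves (AUC), can be illustrated with a small worked sketch. It assumes the linear trapezoidal rule for AUC and uses invented concentration values; it is not the hierarchical Bayesian model of the paper, only the arithmetic behind the marker.

```python
from typing import Sequence


def auc_trapezoid(times: Sequence[float], conc: Sequence[float]) -> float:
    """Area under a concentration-time curve by the linear trapezoidal rule."""
    if len(times) != len(conc) or len(times) < 2:
        raise ValueError("need at least two matching time/concentration points")
    return sum(
        (t1 - t0) * (c0 + c1) / 2.0
        for t0, t1, c0, c1 in zip(times, times[1:], conc, conc[1:])
    )


# Hypothetical mean plasma concentrations of a substrate (arbitrary units).
times = [0.0, 1.0, 2.0, 4.0]                 # hours
substrate_alone = [0.0, 4.0, 2.0, 0.0]       # without the inhibitor
substrate_inhibited = [0.0, 8.0, 6.0, 2.0]   # with the inhibitor

auc_alone = auc_trapezoid(times, substrate_alone)
auc_inhibited = auc_trapezoid(times, substrate_inhibited)
auc_ratio = auc_inhibited / auc_alone  # values above 1 suggest increased exposure
print(auc_alone, auc_inhibited, auc_ratio)
```

In a Bayesian setting the same ratio would be computed for each posterior draw of the PK parameters, which yields a posterior distribution and credible interval for the marker rather than a single number.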
Harrigan, George G; Harrison, Jay M
2012-01-01
New transgenic (GM) crops are subjected to extensive safety assessments that include compositional comparisons with conventional counterparts as a cornerstone of the process. The influence of germplasm, location, environment, and agronomic treatments on compositional variability is, however, often obscured in these pair-wise comparisons. Furthermore, classical statistical significance testing can often provide an incomplete and over-simplified summary of highly responsive variables such as crop composition. In order to more clearly describe the influence of the numerous sources of compositional variation we present an introduction to two alternative but complementary approaches to data analysis and interpretation. These include i) exploratory data analysis (EDA) with its emphasis on visualization and graphics-based approaches and ii) Bayesian statistical methodology that provides easily interpretable and meaningful evaluations of data in terms of probability distributions. The EDA case-studies include analyses of herbicide-tolerant GM soybean and insect-protected GM maize and soybean. Bayesian approaches are presented in an analysis of herbicide-tolerant GM soybean. Advantages of these approaches over classical frequentist significance testing include the more direct interpretation of results in terms of probabilities pertaining to quantities of interest and no confusion over the application of corrections for multiple comparisons. It is concluded that a standardized framework for these methodologies could provide specific advantages through enhanced clarity of presentation and interpretation in comparative assessments of crop composition.
Stanojević, Boban; Osiowy, Carla; Schaefer, Stephan; Bojović, Ksenija; Blagojević, Jelena; Nešić, Milica; Yamashita, Shunichi; Stamenković, Gorana
2011-08-01
Hepatitis B virus (HBV) is classified into 8 genotypes with distinct geographical distribution. Genotype D (HBV/D) has the widest distribution area and comprises 7 subgenotypes. Subgenotypes D1, D2 and D3 appear worldwide, while D4-D7 have a more restricted distribution. Within the Mediterranean area, HBV/D and subgenotype D3 are the most prevalent. The purpose of this study was to characterize the full genome of Serbian HBV/D3 isolates by comparison and phylogenetic analysis with HBV/D3 sequences (66 samples) from different parts of the world found in the GenBank/DDBJ databases. Isolates were obtained from three patients diagnosed with chronic hepatitis B (HBsAg+). All three isolates have two very rare nucleotide substitutions, A929T and T150A, which indicate the same ancestor. Phylogenetic analysis of HBV/D3 genome sequences throughout the world follows the ethno-geographical origin of isolates with rare exceptions, which could be explained by human travel and migration. The geographically close but ethnically different Serbian and Italian isolates clustered in the same subnode, and on a common branch with strains from Northern Canada. To test the apparently close HBV phylogenetic relationship between completely separated patients from Serbia and Northern Canada, we analyzed in depth a 440 bp region of the HBsAg from Canadian (n=73) and Serbian (n=70) isolates. The constructed parsimony tree revealed that strains from Serbia and Northern Canada fell along the same branch, which indicates independent evolution within regions of each country. Considering that the HBsAg sequence has limited variability for phylogenetic analyses, our hypothesis needs further confirmation with more HBV complete genome sequences.
Phylogenetic analysis of proteins involved in the stringent response in plant cells.
Ito, Doshun; Ihara, Yuta; Nishihara, Hidenori; Masuda, Shinji
2017-03-16
The nucleotide (p)ppGpp is a second messenger that controls the stringent response in bacteria. The stringent response modifies expression of a large number of genes and metabolic processes and allows bacteria to survive under fluctuating environmental conditions. Recent genome sequencing analyses have revealed that genes responsible for the stringent response are also found in plants. These include (p)ppGpp synthases and hydrolases, RelA/SpoT homologs (RSHs), and the pppGpp-specific phosphatase GppA/Ppx. However, the phylogenetic relationship between enzymes involved in bacterial and plant stringent responses remains generally unclear. Here, we investigated the origin and evolution of genes involved in the stringent response in plants. Phylogenetic analysis and primary structures of RSH homologs from different plant phyla (including Embryophyta, Charophyta, Chlorophyta, Rhodophyta and Glaucophyta) indicate that RSH gene families were introduced into plant cells by at least two independent lateral gene transfers from the bacterial Deinococcus-Thermus phylum and an unidentified bacterial phylum; alternatively, they were introduced into a proto-plant cell by a lateral gene transfer from the endosymbiotic cyanobacterium followed by loss of an ancestral RSH gene in the cyanobacterial lineage. Phylogenetic analysis of gppA/ppx families indicated that plant gppA/ppx homologs form an individual cluster in the phylogenetic tree and show a sister relationship with some bacterial gppA/ppx homologs. Although RSHs contain a plastidial transit peptide at the N terminus, GppA/Ppx homologs do not, suggesting that plant GppA/Ppx homologs function in the cytosol. These results reveal that a proto-plant cell obtained genes for the stringent response by lateral gene transfer events from different bacterial phyla and has utilized them to control metabolism in plastids and the cytosol.
A Two-Step Bayesian Approach for Propensity Score Analysis: Simulations and Case Study
ERIC Educational Resources Information Center
Kaplan, David; Chen, Jianshen
2012-01-01
A two-step Bayesian propensity score approach is introduced that incorporates prior information in the propensity score equation and outcome equation without the problems associated with simultaneous Bayesian propensity score approaches. The corresponding variance estimators are also provided. The two-step Bayesian propensity score is provided for…
A Flexible Hierarchical Bayesian Modeling Technique for Risk Analysis of Major Accidents.
Yu, Hongyang; Khan, Faisal; Veitch, Brian
2017-02-28
Safety analysis of rare events with potentially catastrophic consequences is challenged by data scarcity and uncertainty. Traditional causation-based approaches, such as fault trees and event trees (used to model rare events), suffer from a number of weaknesses. These include the static structure of the event causation, lack of event occurrence data, and need for reliable prior information. In this study, a new hierarchical Bayesian modeling-based technique is proposed to overcome these drawbacks. The proposed technique can be used as a flexible technique for risk analysis of major accidents. It enables both forward and backward analysis in quantitative reasoning and the treatment of interdependence among the model parameters. Source-to-source variability in data sources is also taken into account through a robust probabilistic safety analysis. The applicability of the proposed technique has been demonstrated through a case study in the marine and offshore industry.
Phylogenetic and recombination analysis of rice black-streaked dwarf virus segment 9 in China.
Zhou, Yu; Weng, Jian-Feng; Chen, Yan-Ping; Liu, Chang-Lin; Han, Xiao-Hua; Hao, Zhuan-Fang; Li, Ming-Shun; Yong, Hong-Jun; Zhang, De-Gui; Zhang, Shi-Huang; Li, Xin-Hai
2015-04-01
Rice black-streaked dwarf virus (RBSDV) is an economically important virus that causes maize rough dwarf disease and rice black-streaked dwarf disease in East Asia. To study RBSDV variation and recombination, we examined the segment 9 (S9) sequences of 49 RBSDV isolates from maize and rice in China. Three S9 recombinants were detected in Baoding, Jinan, and Jining, China. Phylogenetic analysis showed that Chinese RBSDV isolates could be classified into two groups based on their S9 sequences, regardless of host or geographical origin. Further analysis suggested that S9 has undergone negative and purifying selection.
BAYESIAN SEMIPARAMETRIC ANALYSIS FOR TWO-PHASE STUDIES OF GENE-ENVIRONMENT INTERACTION
Ahn, Jaeil; Mukherjee, Bhramar; Gruber, Stephen B.; Ghosh, Malay
2013-01-01
The two-phase sampling design is a cost-efficient way of collecting expensive covariate information on a judiciously selected sub-sample. It is natural to apply such a strategy for collecting genetic data in a sub-sample enriched for exposure to environmental factors for gene-environment interaction (G × E) analysis. In this paper, we consider two-phase studies of G × E interaction where phase I data are available on exposure, covariates and disease status. Stratified sampling is done to prioritize individuals for genotyping at phase II conditional on disease and exposure. We consider a Bayesian analysis based on the joint retrospective likelihood of phase I and phase II data. We address several important statistical issues: (i) we consider a model with multiple genes, environmental factors and their pairwise interactions. We employ a Bayesian variable selection algorithm to reduce the dimensionality of this potentially high-dimensional model; (ii) we use the assumption of gene-gene and gene-environment independence to trade-off between bias and efficiency for estimating the interaction parameters through use of hierarchical priors reflecting this assumption; (iii) we posit a flexible model for the joint distribution of the phase I categorical variables using the non-parametric Bayes construction of Dunson and Xing (2009). We carry out a small-scale simulation study to compare the proposed Bayesian method with weighted likelihood and pseudo likelihood methods that are standard choices for analyzing two-phase data. The motivating example originates from an ongoing case-control study of colorectal cancer, where the goal is to explore the interaction between the use of statins (a drug used for lowering lipid levels) and 294 genetic markers in the lipid metabolism/cholesterol synthesis pathway. The sub-sample of cases and controls on which these genetic markers were measured is enriched in terms of statin users. The example and simulation results illustrate that the
A Bayesian analysis of uncertainties on lung doses resulting from occupational exposures to uranium.
Puncher, M; Birchall, A; Bull, R K
2013-09-01
In a recent epidemiological study, Bayesian estimates of lung doses were calculated in order to determine a possible association between lung dose and lung cancer incidence resulting from occupational exposures to uranium. These calculations, which produce probability distributions of doses, used the human respiratory tract model (HRTM) published by the International Commission on Radiological Protection (ICRP) with a revised particle transport clearance model. In addition to the Bayesian analyses, point estimates (PEs) of doses were also provided for that study using the existing HRTM as it is described in ICRP Publication 66. The PEs are to be used in a preliminary analysis of risk. To explain the differences between the PEs and Bayesian analysis, in this paper the methodology was applied to former UK nuclear workers who constituted a subset of the study cohort. The resulting probability distributions of lung doses calculated using the Bayesian methodology were compared with the PEs obtained for each worker. Mean posterior lung doses were on average 8-fold higher than PEs and the uncertainties on doses varied over a wide range, being greater than two orders of magnitude for some lung tissues. It is shown that it is the prior distributions of the parameters describing absorption from the lungs to blood that are responsible for the large difference between posterior mean doses and PEs. Furthermore, it is the large prior uncertainties on these parameters that are mainly responsible for the large uncertainties on lung doses. It is concluded that accurate determination of the chemical form of inhaled uranium, as well as the absorption parameter values for these materials, is important for obtaining unbiased estimates of lung doses from occupational exposures to uranium for epidemiological studies. Finally, it should be noted that the inferences regarding the PEs described here apply only to the assessments of cases provided for the epidemiological study, where central
Kim, Byoung-Jun; Kim, Bo-Ram; Lee, So-Young; Kim, Ga-Na; Kook, Yoon-Hoh; Kim, Bum-Joon
2016-01-01
Recently, we introduced a distinct Mycobacterium intracellulare INT-5 genotype, distantly related to other genotypes of M. intracellulare (INT-1 to -4). The aim of this study is to determine the exact taxonomic status of the M. intracellulare INT-5 genotype via genome-based phylogenetic analysis. To this end, genome sequences of the two INT-5 strains, MOTT-H4Y and MOTT-36Y, were compared with M. intracellulare ATCC 13950T and Mycobacterium yongonense DSM 45126T. Our phylogenetic analysis based on complete genome sequences, multi-locus sequence typing (MLST) of 35 target genes, and single nucleotide polymorphism (SNP) analysis indicated that the two INT-5 strains were more closely related to M. yongonense DSM 45126T than to the M. intracellulare strains. These results suggest their taxonomic transfer from M. intracellulare into M. yongonense. Finally, we selected 5 target genes (argH, dnaA, deaD, hsp65, and recF) and used SNPs to distinguish M. yongonense strains from other M. avium complex (MAC) strains. The application of the SNP analysis to 14 MAC clinical isolates enabled the selective identification of 4 M. yongonense clinical isolates from the other MACs. In conclusion, our genome-based phylogenetic analysis showed that the taxonomic status of two INT-5 strains, MOTT-H4Y and MOTT-36Y, should be revised to M. yongonense. Our results also suggest that M. yongonense could be divided into 2 distinct genotypes (the Type I genotype with the M. parascrofulaceum rpoB gene and the Type II genotype with the M. intracellulare rpoB gene) depending on the presence of the lateral gene transfer of rpoB from M. parascrofulaceum. PMID:27031100
Korsgaard, Inge Riis; Lund, Mogens Sandø; Sorensen, Daniel; Gianola, Daniel; Madsen, Per; Jensen, Just
2003-01-01
A fully Bayesian analysis using Gibbs sampling and data augmentation in a multivariate model of Gaussian, right censored, and grouped Gaussian traits is described. The grouped Gaussian traits are either ordered categorical traits (with more than two categories) or binary traits, where the grouping is determined via thresholds on the underlying Gaussian scale, the liability scale. Allowances are made for unequal models, unknown covariance matrices and missing data. Having outlined the theory, strategies for implementation are reviewed. These include joint sampling of location parameters; efficient sampling from the fully conditional posterior distribution of augmented data, a multivariate truncated normal distribution; and sampling from the conditional inverse Wishart distribution, the fully conditional posterior distribution of the residual covariance matrix. Finally, a simulated dataset was analysed to illustrate the methodology. This paper concentrates on a model where residuals associated with liabilities of the binary traits are assumed to be independent. A Bayesian analysis using Gibbs sampling is outlined for the model where this assumption is relaxed.
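The data augmentation step for a binary trait can be sketched in a much simplified form. The example assumes a single binary trait, a unit residual variance and a threshold fixed at zero on the liability scale, so the fully conditional distribution of each liability is a univariate truncated normal sampled by the inverse-CDF method; the paper itself works with a multivariate truncated normal, and the function name and values below are hypothetical.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal for cdf / inverse cdf
_EPS = 1e-12                # keeps probabilities strictly inside (0, 1)


def sample_liability(mean: float, observed: int, rng: random.Random) -> float:
    """Draw a liability from N(mean, 1) truncated by the observed binary trait.

    observed == 1 restricts the liability to (0, inf); observed == 0 restricts
    it to (-inf, 0]. The zero threshold and unit variance are assumptions.
    """
    p_below = _STD_NORMAL.cdf(-mean)  # P(liability <= 0) under N(mean, 1)
    if observed == 1:
        u = rng.uniform(p_below, 1.0)
    else:
        u = rng.uniform(0.0, p_below)
    u = min(max(u, _EPS), 1.0 - _EPS)  # inv_cdf is undefined at 0 and 1
    return mean + _STD_NORMAL.inv_cdf(u)


rng = random.Random(0)
draws_pos = [sample_liability(0.3, 1, rng) for _ in range(1000)]
draws_neg = [sample_liability(0.3, 0, rng) for _ in range(1000)]
print(min(draws_pos), max(draws_neg))
```

Within a full Gibbs sampler this draw would alternate with updates of the location parameters and covariance matrices conditional on the current liabilities. The clamp on `u` is a numerical safeguard and can distort draws when the mean lies many standard deviations from the threshold.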
Pestes, Lynsey R; Peterman, Randall M; Bradford, Michael J; Wood, Chris C
2008-04-01
The endangered population of sockeye salmon (Oncorhynchus nerka) in Cultus Lake, British Columbia, Canada, migrates through commercial fishing areas along with other, much more abundant sockeye salmon populations, but it is not feasible to selectively harvest only the latter, abundant populations. This situation creates controversial trade-offs between recovery actions and economic revenue. We conducted a Bayesian decision analysis to evaluate options for recovery of Cultus Lake sockeye salmon. We used a stochastic population model that included 2 sources of uncertainty that are often omitted from such analyses: structural uncertainty in the magnitude of a potential Allee effect and implementation uncertainty (the deviation between targets and actual outcomes of management actions). Numerous state-dependent, time-independent management actions meet recovery objectives. These actions prescribe limitations on commercial harvest rates as a function of abundance of Cultus Lake sockeye salmon. We also quantified how much reduction in economic value of commercial harvests of the more abundant sockeye salmon populations would be expected for a given increase in the probability of recovery of the Cultus population. Such results illustrate how Bayesian decision analysis can rank options for dealing with conservation risks and can help inform trade-off discussions among decision makers and among groups that have competing objectives.
Risk analysis of emergent water pollution accidents based on a Bayesian Network.
Tang, Caihong; Yi, Yujun; Yang, Zhifeng; Sun, Jie
2016-01-01
To guarantee the security of water quality in water transfer channels, especially in open channels, analysis of potential emergent pollution sources in the water transfer process is critical. It is also indispensable for forewarning of and protection from emergent pollution accidents. Bridges above open channels with large amounts of truck traffic are the main locations where emergent accidents could occur. A Bayesian Network model, which consists of six root nodes and three middle layer nodes, was developed in this paper and was employed to identify the possibility of potential pollution risk. Dianbei Bridge is reviewed as a typical bridge on an open channel of the Middle Route of the South to North Water Transfer Project where emergent traffic accidents could occur. This study focuses on the risk of water pollution caused by leakage of pollutants into the water. The risk of potential traffic accidents at the Dianbei Bridge implies a risk of water pollution in the canal. Based on survey data, statistical analysis, and domain specialist knowledge, a Bayesian Network model was established. The human factor in emergent accidents has been considered in this model. Additionally, this model has been employed to describe the probability of accidents and the risk level. The factors to which pollution accidents are most sensitive have been deduced. The case in which the sensitive factors are in the states most likely to lead to accidents has also been simulated.
A Bayesian framework for cell-level protein network analysis for multivariate proteomics image data
NASA Astrophysics Data System (ADS)
Kovacheva, Violet N.; Sirinukunwattana, Korsuk; Rajpoot, Nasir M.
2014-03-01
The recent development of multivariate imaging techniques, such as the Toponome Imaging System (TIS), has facilitated the analysis of multiple co-localisation of proteins. This could hold the key to understanding complex phenomena such as protein-protein interaction in cancer. In this paper, we propose a Bayesian framework for cell level network analysis allowing the identification of several protein pairs having significantly higher co-expression levels in cancerous tissue samples when compared to normal colon tissue. It involves segmenting the DAPI-labeled image into cells and determining the cell phenotypes according to their protein-protein dependence profile. The cells are phenotyped using Gaussian Bayesian hierarchical clustering (GBHC) after feature selection is performed. The phenotypes are then analysed using Difference in Sums of Weighted cO-dependence Profiles (DiSWOP), which detects differences in the co-expression patterns of protein pairs. We demonstrate that the pairs highlighted by the proposed framework have high concordance with recent results using a different phenotyping method. This demonstrates that the results are independent of the clustering method used. In addition, the highlighted protein pairs are further analysed via protein interaction pathway databases and by considering the localization of high protein-protein dependence within individual samples. This suggests that the proposed approach could identify potentially functional protein complexes active in cancer progression and cell differentiation.
Bayesian meta-analysis of Cronbach's coefficient alpha to evaluate informative hypotheses.
Okada, Kensuke
2015-12-01
This paper proposes a new method to evaluate informative hypotheses for meta-analysis of Cronbach's coefficient alpha using a Bayesian approach. The coefficient alpha is one of the most widely used reliability indices. In meta-analyses of reliability, researchers typically form specific informative hypotheses beforehand, such as 'alpha of this test is greater than 0.8' or 'alpha of one form of a test is greater than the others.' The proposed method enables direct evaluation of these informative hypotheses. To this end, a Bayes factor is calculated to evaluate the informative hypothesis against its complement. It allows researchers to summarize the evidence provided by previous studies in favor of their informative hypothesis. The proposed approach can be seen as a natural extension of the Bayesian meta-analysis of coefficient alpha recently proposed in this journal (Brannick and Zhang, 2013). The proposed method is illustrated through two meta-analyses of real data that evaluate different kinds of informative hypotheses on superpopulation: one is that alpha of a particular test is above the criterion value, and the other is that alphas among different test versions have ordered relationships. Informative hypotheses are supported from the data in both cases, suggesting that the proposed approach is promising for application.
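The Bayes factor of an informative hypothesis against its complement can be illustrated with a short sketch. It assumes the hypothesis is a region of the parameter space under one common prior, in which case the Bayes factor equals the posterior odds of the region divided by its prior odds; the function name and the numerical inputs are hypothetical, and the paper's own computation may differ in detail.

```python
def bayes_factor_vs_complement(prior_prob: float, posterior_prob: float) -> float:
    """Bayes factor of a hypothesis against its complement.

    prior_prob and posterior_prob are the prior and posterior probabilities
    that the parameter lies in the region defined by the hypothesis.
    """
    for p in (prior_prob, posterior_prob):
        if not 0.0 < p < 1.0:
            raise ValueError("probabilities must lie strictly between 0 and 1")
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    return posterior_odds / prior_odds


# Hypothesis: alpha > 0.8. Under an assumed uniform(0, 1) prior on alpha,
# the prior probability of the hypothesis is 0.2.
prior_prob = 0.2
# Hypothetical posterior probability, e.g. the share of MCMC draws above 0.8.
posterior_prob = 0.9

bf = bayes_factor_vs_complement(prior_prob, posterior_prob)
print(bf)  # roughly 36: the data shift the odds strongly toward alpha > 0.8
```

An order hypothesis such as 'alpha of one form is greater than the others' fits the same function once its prior and posterior probabilities have been estimated, for instance as proportions of prior and posterior draws that satisfy the ordering.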
BayGO: Bayesian analysis of ontology term enrichment in microarray data
Vêncio, Ricardo ZN; Koide, Tie; Gomes, Suely L; de B Pereira, Carlos A
2006-01-01
Background: The search for enriched (aka over-represented or enhanced) ontology terms in a list of genes obtained from microarray experiments is becoming a standard procedure for a system-level analysis. This procedure tries to summarize the information focussing on classification designs such as Gene Ontology, KEGG pathways, and so on, instead of focussing on individual genes. Although it is well known in statistics that association and significance are distinct concepts, only the former approach has been used to deal with the ontology term enrichment problem. Results: BayGO implements a Bayesian approach to search for enriched terms from microarray data. The R source-code is freely available at in three versions: Linux, which can be easily incorporated into pre-existent pipelines; Windows, to be controlled interactively; and as a web-tool. The software was validated using a bacterial heat shock response dataset, since this stress triggers known system-level responses. Conclusion: The Bayesian model accounts for the fact that, eventually, not all the genes from a given category are observable in microarray data due to low intensity signal, quality filters, genes that were not spotted and so on. Moreover, BayGO allows one to measure the statistical association between generic ontology terms and differential expression, instead of working only with the common significance analysis. PMID:16504085
Integrative Bayesian Analysis of Neuroimaging-Genetic Data with Application to Cocaine Dependence
Azadeh, Shabnam; Hobbs, Brian P.; Ma, Liangsuo; Nielsen, David A.; Moeller, F. Gerard; Baladandayuthapani, Veerabhadran
2016-01-01
Neuroimaging and genetic studies provide distinct and complementary information about the structural and biological aspects of a disease. Integrating the two sources of data facilitates the investigation of the links between genetic variability and brain mechanisms among different individuals for various medical disorders. This article presents a general statistical framework for integrative Bayesian analysis of neuroimaging-genetic (iBANG) data, which is motivated by a neuroimaging-genetic study in cocaine dependence. Statistical inference necessitated the integration of spatially dependent voxel-level measurements with various patient-level genetic and demographic characteristics under an appropriate probability model to account for the multiple inherent sources of variation. Our framework uses Bayesian model averaging to integrate genetic information into the analysis of voxel-wise neuroimaging data, accounting for spatial correlations in the voxels. Using multiplicity controls based on the false discovery rate, we delineate voxels associated with genetic and demographic features that may impact diffusion as measured by fractional anisotropy (FA) obtained from DTI images. We demonstrate the benefits of accounting for model uncertainties in both model fit and prediction. Our results suggest that cocaine consumption is associated with FA reduction in most white matter regions of interest in the brain. Additionally, gene polymorphisms associated with GABAergic, serotonergic and dopaminergic neurotransmitters and receptors were associated with FA. PMID:26484829
Roalson, Eric H; Friar, Elizabeth A
2004-12-01
We analyzed sequence variation for the alcohol dehydrogenase (Adh) gene family in Carex section Acrocystis (Cyperaceae) to reconstruct Adh gene trees for Acrocystis species and to characterize the structure of the Adh gene family in Carex. Two Adh loci were included with ITS and ETS sequences in a combined Bayesian inference analysis of Carex section Acrocystis to gain a better understanding of species relationships in the section. In addition, we comment on how the results presented here contribute to our knowledge of the birth-death process of the Adh gene family in angiosperms. It appears that the structure of the Adh gene family in Carex is complex with possibly six loci present in the gene family. Additionally, variation among Acrocystis species within loci is quite low, and there is little phylogenetic resolution in the individual datasets. Bayesian inference analysis of the combined ITS, ETS, Adh1, and Adh2 datasets resulted in a moderately well-supported phylogenetic hypothesis of relationships in the section which is discussed in relation to previous hypotheses of relationships.
Shamsi, Hamid; Mardani, Karim; Ownagh, Abdolghaffar
2017-01-01
Escherichia coli isolates from chickens with colibacillosis were assigned to phylogenetic groups based on multiplex polymerase chain reaction (PCR) and antibacterial resistance of E. coli belonging to these groups was examined. Furthermore, the gyrA gene of isolates was sequenced and a phylogenetic tree was generated. A total of 84 E. coli isolates were grouped using multiplex PCR of TSPE4.C2, chuA, yjaA, and gadA molecular markers. Four phylogenetic groups were identified with strains divided as follows: 16 in group A (19.05%), 17 in group B1 (20.24%), 23 in group B2 (27.38%), and 28 in group D (33.33%). Escherichia coli isolates belonging to phylogenetic groups B2 and D were resistant to Soltrim and Flumequine unlike the majority of E. coli isolates that belonged to groups A and B1, and which were susceptible to these antibiotics. The phylogenetic results based on gyrA gene sequences from multiplex PCR revealed that E. coli phylogenetic grouping was in accordance with the clusters obtained in the phylogenetic tree. In conclusion, the comparative sequence analysis of gyrA sequences provides a firm framework for an accurate classification of E. coli and related taxa and may constitute a pertinent phylogenetic marker for E. coli.
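The group percentages reported above follow directly from the isolate counts, as a short check shows. The counts are taken from the abstract; the variable names are illustrative only.

```python
# Isolate counts per phylogenetic group, as reported in the abstract.
group_counts = {"A": 16, "B1": 17, "B2": 23, "D": 28}

total = sum(group_counts.values())  # should match the 84 isolates examined

# Share of each group among all isolates, rounded to two decimal places.
percentages = {
    group: round(100.0 * count / total, 2) for group, count in group_counts.items()
}

for group, pct in percentages.items():
    print(f"group {group}: {group_counts[group]} isolates, {pct:.2f}%")
```

The four counts sum to 84, and the rounded shares reproduce the percentages given in the text.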
Diversification of land plants: insights from a family-level phylogenetic analysis
2011-01-01
Background: Some of the evolutionary history of land plants has been documented based on the fossil record and a few broad-scale phylogenetic analyses, especially focusing on angiosperms and ferns. Here, we reconstructed phylogenetic relationships among all 706 families of land plants using molecular data. We dated the phylogeny using multiple fossils and a molecular clock technique. Applying various tests of diversification that take into account topology, branch length, numbers of extant species as well as extinction, we evaluated diversification rates through time. We also compared these diversification profiles against the distribution of the climate modes of the Phanerozoic. Results: We found evidence for the radiations of ferns and mosses in the shadow of angiosperms coinciding with the rather warm Cretaceous global climate. In contrast, gymnosperms and liverworts show a signature of declining diversification rates during geological time periods of cool global climate. Conclusions: This broad-scale phylogenetic analysis helps to reveal the successive waves of diversification that made up the diversity of land plants we see today. Both warm temperatures and wet climate may have been necessary for the rise of the diversity under a successive lineage replacement scenario. PMID:22103931
Isolation and phylogenetic analysis of novel γ-gliadin genes in genus Dasypyrum.
Li, G R; Liu, C; Yang, E N; Yang, Z J
2013-03-13
As the most ancient member of the wheat gluten family, the γ-gliadin genes are suitable for phylogenetic analysis among wheat and related species. Species in the grass genus Dasypyrum have been widely used for wheat cross breeding. However, the genomic relationships among Dasypyrum species have been little studied. We isolated 22 novel γ-gliadin gene sequences, among which 10 are putatively functional. The open reading frame lengths of these sequences range from 642 to 933 bp, and these putative proteins consist of five domains. Phylogenetic analyses showed that all Dasypyrum γ-gliadin gene sequences clustered in a large group; D. villosum and tetraploid D. breviaristatum γ-gliadin gene sequences clustered in a subgroup, while diploid D. breviaristatum γ-gliadin gene sequences clustered at the edge of the subgroup. All of the Dasypyrum γ-gliadin gene sequences were absent in three major T cell-stimulatory epitopes binding to HLA-DQ2/8 in celiac disease patients. Based on the phylogenetic analyses, we suggest that D. villosum and tetraploid D. breviaristatum evolved in parallel from a diploid ancestor D. breviaristatum.
Phylogenetic analysis of the Trypanosoma genus based on the heat-shock protein 70 gene.
Fraga, Jorge; Fernández-Calienes, Aymé; Montalvo, Ana Margarita; Maes, Ilse; Deborggraeve, Stijn; Büscher, Philippe; Dujardin, Jean-Claude; Van der Auwera, Gert
2016-09-01
Trypanosome evolution has so far been studied essentially on the basis of phylogenetic analyses of small subunit ribosomal RNA (SSU-rRNA) and glycosomal glyceraldehyde-3-phosphate dehydrogenase (gGAPDH) genes. We used for the first time the 70kDa heat-shock protein gene (hsp70) to investigate the phylogenetic relationships among 11 Trypanosoma species on the basis of 1380 nucleotides from 76 sequences corresponding to 65 strains. We also constructed a phylogeny based on combined datasets of SSU-rDNA, gGAPDH and hsp70 sequences. The obtained clusters can be correlated with the sections and subgenus classifications of mammal-infecting trypanosomes except for Trypanosoma theileri and Trypanosoma rangeli. Our analysis supports the classification of Trypanosoma species into clades rather than into sections and subgenera, some of which are polyphyletic. Nine clades were recognized: Trypanosoma carassi, Trypanosoma congolense, Trypanosoma cruzi, Trypanosoma grayi, Trypanosoma lewisi, T. rangeli, T. theileri, Trypanosoma vivax and Trypanozoon. These results are consistent with existing knowledge of the genus' phylogeny. Within the T. cruzi clade, three groups of T. cruzi discrete typing units could be clearly distinguished, corresponding to TcI, TcIII, and TcII+V+VI, while support for TcIV was lacking. Phylogenetic analyses based on hsp70 demonstrated that this molecular marker can be applied for discriminating most of the Trypanosoma species and clades.
Multi-locus phylogenetic analysis reveals the pattern and tempo of bony fish evolution
Broughton, Richard E.; Betancur-R., Ricardo; Li, Chenhong; Arratia, Gloria; Ortí, Guillermo
2013-01-01
Over half of all vertebrates are “fishes”, which exhibit enormous diversity in morphology, physiology, behavior, reproductive biology, and ecology. Investigation of fundamental areas of vertebrate biology depends critically on a robust phylogeny of fishes, yet evolutionary relationships among the major actinopterygian and sarcopterygian lineages have not been conclusively resolved. Although a consensus phylogeny of teleosts has been emerging recently, it has been based on analyses of various subsets of actinopterygian taxa, but not on a full sample of all bony fishes. Here we conducted a comprehensive phylogenetic study on a broad taxonomic sample of 61 actinopterygian and sarcopterygian lineages (with a chondrichthyan outgroup) using a molecular data set of 21 independent loci. These data yielded a resolved phylogenetic hypothesis for extant Osteichthyes, including 1) reciprocally monophyletic Sarcopterygii and Actinopterygii, as currently understood, with polypteriforms as the first diverging lineage within Actinopterygii; 2) a monophyletic group containing gars and bowfin (= Holostei) as sister group to teleosts; and 3) the earliest diverging lineage among teleosts being Elopomorpha, rather than Osteoglossomorpha. Relaxed-clock dating analysis employing a set of 24 newly applied fossil calibrations reveals divergence times that are more consistent with paleontological estimates than previous studies. Establishing a new phylogenetic pattern with accurate divergence dates for bony fishes illustrates several areas where the fossil record is incomplete and provides critical new insights on diversification of this important vertebrate group. PMID:23788273
The Green Clade grows: A phylogenetic analysis of Aplastodiscus (Anura; Hylidae).
Berneck, Bianca V M; Haddad, Célio F B; Lyra, Mariana L; Cruz, Carlos A G; Faivovich, Julián
2016-04-01
Green tree frogs of the genus Aplastodiscus occur in the Atlantic Forest and Cerrado biomes of South America. The genus comprises 15 medium-sized species placed in three species groups diagnosed mainly by cloacal morphology. A phylogenetic analysis was conducted to: (1) test the monophyly of these species groups; (2) explore the phylogenetic relationships among putative species; and (3) investigate species boundaries. The dataset included eight mitochondrial and nuclear gene fragments for up to 6642 bp per specimen. The results strongly support the monophyly of Aplastodiscus and of the A. albofrenatus and A. perviridis groups. Aplastodiscus sibilatus is the sister taxon of all other species of Aplastodiscus, making the A. albosignatus Group non-monophyletic as currently defined. At least six unnamed species are recognized for Aplastodiscus, increasing the diversity of the genus by 40%. A fourth species group, the A. sibilatus Group, is recognized. Aplastodiscus musicus is transferred from the A. albofrenatus Group to the A. albosignatus Group, and A. callipygius is considered a junior synonym of A. albosignatus. Characters related to external cloacal morphology reveal an interesting evolutionary pattern of parallelisms and reversions, suggesting an undocumented level of complexity. We analyze, in light of our phylogenetic results, the evolution of reproductive biology and chromosome morphology in Aplastodiscus.
Liu, Yang; Lai, Qiliang; Dong, Chunming; Sun, Fengqin; Wang, Liping; Li, Guangyu; Shao, Zongze
2013-01-01
Bacteria closely related to Bacillus pumilus cannot be distinguished from such other species as B. safensis, B. stratosphericus, B. altitudinis and B. aerophilus simply by 16S rRNA gene sequence. In this report, 76 marine strains were subjected to phylogenetic analysis based on 7 housekeeping genes to understand the phylogeny and biogeography in comparison with other origins. A phylogenetic tree based on the 7 housekeeping genes concatenated in the order of gyrB-rpoB-pycA-pyrE-mutL-aroE-trpB was constructed and compared with trees based on the single genes. All these trees exhibited a similar topology with small variations. Our 79 strains were divided into 6 groups from A to F; Group A was the largest and contained 49 strains close to B. altitudinis. Two additional large groups were represented by B. safensis and B. pumilus, respectively. Among the housekeeping genes, gyrB and pyrE showed comparatively better resolution power and may serve as molecular markers to distinguish these closely related strains. Furthermore, a recombinant phylogenetic tree based on the gyrB gene and containing 73 terrestrial and our isolates was constructed to detect the relationship between marine and other sources. The tree clearly showed that the bacteria of marine origin were clustered together in all the large groups. In contrast, the cluster belonging to B. safensis was mainly composed of bacteria of terrestrial origin. Interestingly, nearly all the marine isolates were at the top of the tree, indicating the possibility of the recent divergence of this bacterial group in marine environments. We conclude that B. altitudinis bacteria are the most widely spread of the B. pumilus group in marine environments. In summary, this report provides the first evidence regarding the systematic evolution of this bacterial group, and knowledge of their phylogenetic diversity will help in the understanding of their ecological role and distribution in marine environments.
Bayesian sensitivity analysis of incomplete data: bridging pattern-mixture and selection models.
Kaciroti, Niko A; Raghunathan, Trivellore
2014-11-30
Pattern-mixture models (PMM) and selection models (SM) are alternative approaches for statistical analysis when faced with incomplete data and a nonignorable missing-data mechanism. Both models make empirically unverifiable assumptions and need additional constraints to identify the parameters. Here, we first introduce intuitive parameterizations to identify PMM for different types of outcome with distribution in the exponential family; then we translate these to their equivalent SM approach. This provides a unified framework for performing sensitivity analysis under either setting. These new parameterizations are transparent, easy-to-use, and provide dual interpretation from both the PMM and SM perspectives. A Bayesian approach is used to perform sensitivity analysis, deriving inferences using informative prior distributions on the sensitivity parameters. These models can be fitted using software that implements Gibbs sampling.
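The idea of placing an informative prior on a sensitivity parameter can be illustrated with a minimal sketch. It assumes a simple delta-adjustment pattern-mixture model for a continuous outcome, in which the mean among non-respondents equals the observed mean plus a shift `delta`. This is a common textbook device, not the parameterization proposed in the paper, and the function names, the normal prior on `delta`, and all numbers are hypothetical. For brevity the observed mean is treated as fixed rather than given its own posterior.

```python
import random

def pmm_overall_mean(mean_observed, frac_missing, delta):
    """Population mean under a delta-adjusted pattern-mixture model.

    Respondents have mean `mean_observed`; non-respondents are assumed
    to have mean `mean_observed + delta` (delta is not identifiable
    from the data and must come from a prior or expert judgment).
    """
    mean_missing = mean_observed + delta
    return (1.0 - frac_missing) * mean_observed + frac_missing * mean_missing

def sensitivity_draws(mean_observed, frac_missing, delta_mean, delta_sd,
                      n_draws=10000, seed=0):
    """Propagate an informative Normal prior on delta to the overall mean."""
    rng = random.Random(seed)
    return [
        pmm_overall_mean(mean_observed, frac_missing,
                         rng.gauss(delta_mean, delta_sd))
        for _ in range(n_draws)
    ]

# Hypothetical inputs: 25% missing, non-respondents believed to score lower.
draws = sensitivity_draws(mean_observed=10.0, frac_missing=0.25,
                          delta_mean=-2.0, delta_sd=1.0)
mean_of_draws = sum(draws) / len(draws)
print(round(mean_of_draws, 2))
```

Setting `delta_mean=0` and `delta_sd=0` recovers the ignorable-missingness answer, so varying the prior on `delta` shows how far conclusions move as that assumption is relaxed.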
Bayesian analysis for OPC modeling with film stack properties and posterior predictive checking
NASA Astrophysics Data System (ADS)
Burbine, Andrew; Fenger, Germain; Sturtevant, John; Fryer, David
2016-10-01
The use of optical proximity correction (OPC) demands increasingly accurate models of the photolithographic process. Model building and analysis techniques in the data science community have seen great strides in the past two decades which make better use of available information. This paper expands upon Bayesian analysis methods for parameter selection in lithographic models by increasing the parameter set and employing posterior predictive checks. Work continues with a Markov chain Monte Carlo (MCMC) search algorithm to generate posterior distributions of parameters. Models now include wafer film stack refractive indices, n and k, as parameters, recognizing the uncertainties associated with these values. Posterior predictive checks are employed as a method to validate parameter vectors discovered by the analysis, akin to cross validation.
ERIC Educational Resources Information Center
Hsieh, Chueh-An; Maier, Kimberly S.
2009-01-01
The capacity of Bayesian methods in estimating complex statistical models is undeniable. Bayesian data analysis is seen as having a range of advantages, such as an intuitive probabilistic interpretation of the parameters of interest, the efficient incorporation of prior information to empirical data analysis, model averaging and model selection.…
Application of phylogenetic microarray analysis to discriminate sources of fecal pollution.
Dubinsky, Eric A; Esmaili, Laleh; Hulls, John R; Cao, Yiping; Griffith, John F; Andersen, Gary L
2012-04-17
Conventional methods for fecal source tracking typically use single biomarkers to systematically identify or exclude sources. High-throughput DNA sequence analysis can potentially identify all sources of microbial contaminants in a single test by measuring the total diversity of fecal microbial communities. In this study, we used phylogenetic microarray analysis to determine the comprehensive suite of bacteria that define major sources of fecal contamination in coastal California. Fecal wastes were collected from 42 different populations of humans, birds, cows, horses, elk, and pinnipeds. We characterized bacterial community composition using a DNA microarray that probes for 16S rRNA genes of 59,316 different bacterial taxa. Cluster analysis revealed strong differences in community composition among fecal wastes from human, birds, pinnipeds, and grazers. Actinobacteria, Bacilli, and many Gammaproteobacteria taxa discriminated birds from mammalian sources. Diverse families within the Clostridia and Bacteroidetes taxa discriminated human wastes, grazers, and pinnipeds from each other. We found 1058 different bacterial taxa that were unique to either human, grazing mammal, or bird fecal wastes. These OTUs can serve as specific identifier taxa for these sources in environmental waters. Two field tests in marine waters demonstrate the capacity of phylogenetic microarray analysis to track multiple sources with one test.
Malyarchuk, B A; Derenko, M V; Denisova, G A; Litvinov, A N
2015-08-01
Phylogenetic analysis of different regions of the mitochondrial genome of the sable showed the presence of several topologies of phylogenetic trees, but the most statistically significant topology is A-BC, which was obtained as a result of the analysis of the mitochondrial genome as a whole, as well as of the individual CO1, ND4, and ND5 genes. Analysis of the intergroup divergence of the mtDNA haplotypes (Dxy) indicated that the maximum Dxy values between A and BC groups were accompanied by minimum differences between B and C groups only for six genes showing the A-BC topology (12S rRNA; CO1, CO2, ND4, ND5, and CYTB). It is assumed that the topological conflicts observed in the analysis of individual sable mtDNA genes are associated with the uneven distribution of mutations along the mitochondrial genome and the mitochondrial tree. This may be due to random causes, as well as the nonuniform effect of selection.
Cao, Kai; Yang, Kun; Wang, Chao; Guo, Jin; Tao, Lixin; Liu, Qingrong; Gehendra, Mahara; Zhang, Yingjie; Guo, Xiuhua
2016-01-01
Objective: To explore the spatial-temporal interaction effect within a Bayesian framework and to probe the ecological influential factors for tuberculosis. Methods: Six different statistical models containing parameters of time, space, spatial-temporal interaction and their combination were constructed based on a Bayesian framework. The optimum model was selected according to the deviance information criterion (DIC) value. Coefficients of climate variables were then estimated using the best fitting model. Results: The model containing spatial-temporal interaction parameter was the best fitting one, with the smallest DIC value (−4,508,660). Ecological analysis results showed the relative risks (RRs) of average temperature, rainfall, wind speed, humidity, and air pressure were 1.00324 (95% CI, 1.00150–1.00550), 1.01010 (95% CI, 1.01007–1.01013), 0.83518 (95% CI, 0.93732–0.96138), 0.97496 (95% CI, 0.97181–1.01386), and 1.01007 (95% CI, 1.01003–1.01011), respectively. Conclusions: The spatial-temporal interaction was statistically meaningful and the prevalence of tuberculosis was influenced by the time and space interaction effect. Average temperature, rainfall, wind speed, and air pressure influenced tuberculosis. Average humidity had no influence on tuberculosis. PMID:27164117
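The model-selection step described in the Methods can be sketched as follows, using the standard definition DIC = mean deviance + pD, where pD is the mean deviance minus the deviance evaluated at the posterior mean of the parameters. The model names and deviance values below are invented for illustration and are not taken from the study.

```python
def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, pD) from MCMC deviance draws.

    deviance_draws: deviance (-2 * log-likelihood) at each posterior draw.
    deviance_at_posterior_mean: deviance evaluated at the posterior mean
    of the parameters.
    """
    mean_deviance = sum(deviance_draws) / len(deviance_draws)
    p_d = mean_deviance - deviance_at_posterior_mean  # effective number of parameters
    return mean_deviance + p_d, p_d

def best_model(models):
    """Pick the candidate with the smallest DIC.

    models: {name: (deviance_draws, deviance_at_posterior_mean)}
    """
    scores = {name: dic(draws, d_hat)[0] for name, (draws, d_hat) in models.items()}
    return min(scores, key=scores.get), scores

# Hypothetical deviance summaries for two candidate models.
toy_models = {
    "time_only": ([210.0, 214.0, 212.0], 209.0),
    "space_time_interaction": ([180.0, 186.0, 183.0], 176.0),
}
chosen, scores = best_model(toy_models)
print(chosen, scores)
```

In this toy input the interaction model has the larger complexity penalty but still wins because its fit is much better, which is the trade-off DIC is meant to capture.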
Busschaert, P; Geeraerd, A H; Uyttendaele, M; Van Impe, J F
2011-06-01
Microbiological contamination data are often censored because of the presence of non-detects or because measurement outcomes are known only to be smaller than, greater than, or between certain boundary values imposed by the laboratory procedures. Therefore, it is not straightforward to fit distributions that summarize contamination data for use in quantitative microbiological risk assessment, especially when variability and uncertainty are to be characterized separately. In this paper, distributions are fit using Bayesian analysis, and results are compared to results obtained with a methodology based on maximum likelihood estimation and the non-parametric bootstrap method. The Bayesian model is also extended hierarchically to estimate the effects of the individual elements of a covariate such as, for example, on a national level, the food processing company where the analyzed food samples were processed, or, on an international level, the geographical origin of contamination data. Including this extra information allows a risk assessor to differentiate between several scenarios and increase the specificity of the estimate of risk of illness, or compare different scenarios to each other. Furthermore, inference is made on the predictive importance of several different covariates while taking into account uncertainty, which makes it possible to indicate which covariates are influential factors determining contamination.
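How censored observations enter a likelihood can be shown with a short sketch. It assumes, purely for illustration, that log10 concentrations follow a normal distribution (the abstract does not name the distribution used) and encodes each measurement as a hypothetical `(lower, upper)` pair of bounds. In a Bayesian fit this log-likelihood would be combined with priors on `mu` and `sigma`.

```python
import math

def norm_cdf(x, mu, sigma):
    """Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def norm_logpdf(x, mu, sigma):
    """Log density of the normal distribution."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))

def censored_loglik(data, mu, sigma):
    """Log-likelihood for exact, left-, right- and interval-censored data.

    data: list of (lower, upper) bounds on the log10 concentration.
      lower == upper -> exactly observed value
      lower is None  -> left-censored (e.g. non-detect below a limit)
      upper is None  -> right-censored (e.g. too numerous to count)
      otherwise      -> interval-censored between the two bounds
    """
    total = 0.0
    for lower, upper in data:
        if lower is not None and lower == upper:
            total += norm_logpdf(lower, mu, sigma)
        else:
            cdf_low = 0.0 if lower is None else norm_cdf(lower, mu, sigma)
            cdf_high = 1.0 if upper is None else norm_cdf(upper, mu, sigma)
            total += math.log(cdf_high - cdf_low)
    return total

# Hypothetical sample: two non-detects below 1.0, one exact, one interval, one high count.
sample = [(None, 1.0), (None, 1.0), (1.8, 1.8), (2.0, 3.0), (4.0, None)]
print(censored_loglik(sample, mu=2.0, sigma=1.0))
```

Each censored value contributes the probability mass between its bounds instead of a density, so non-detects inform the fit without being replaced by an arbitrary substitute value.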
Critically evaluating the theory and performance of Bayesian analysis of macroevolutionary mixtures
Moore, Brian R.; Höhna, Sebastian; May, Michael R.; Rannala, Bruce; Huelsenbeck, John P.
2016-01-01
Bayesian analysis of macroevolutionary mixtures (BAMM) has recently taken the study of lineage diversification by storm. BAMM estimates the diversification-rate parameters (speciation and extinction) for every branch of a study phylogeny and infers the number and location of diversification-rate shifts across branches of a tree. Our evaluation of BAMM reveals two major theoretical errors: (i) the likelihood function (which estimates the model parameters from the data) is incorrect, and (ii) the compound Poisson process prior model (which describes the prior distribution of diversification-rate shifts across branches) is incoherent. Using simulation, we demonstrate that these theoretical issues cause statistical pathologies; posterior estimates of the number of diversification-rate shifts are strongly influenced by the assumed prior, and estimates of diversification-rate parameters are unreliable. Moreover, the inability to correctly compute the likelihood or to correctly specify the prior for rate-variable trees precludes the use of Bayesian approaches for testing hypotheses regarding the number and location of diversification-rate shifts using BAMM. PMID:27512038
A Bayesian analysis of the 69 highest energy cosmic rays detected by the Pierre Auger Observatory
NASA Astrophysics Data System (ADS)
Khanin, Alexander; Mortlock, Daniel J.
2016-08-01
The origins of ultrahigh energy cosmic rays (UHECRs) remain an open question. Several attempts have been made to cross-correlate the arrival directions of the UHECRs with catalogues of potential sources, but no definite conclusion has been reached. We report a Bayesian analysis of the 69 events, from the Pierre Auger Observatory (PAO), that aims to determine the fraction of the UHECRs that originate from known AGNs in the Veron-Cetty & Veron (VCV) catalogue, as well as AGNs detected with the Swift Burst Alert Telescope (Swift-BAT), galaxies from the 2MASS Redshift Survey (2MRS), and an additional volume-limited sample of 17 nearby AGNs. The study makes use of a multilevel Bayesian model of UHECR injection, propagation and detection. We find that for reasonable ranges of prior parameters the Bayes factors disfavour a purely isotropic model. For fiducial values of the model parameters, we report 68 per cent credible intervals for the fraction of source originating UHECRs of 0.09^{+0.05}_{-0.04}, 0.25^{+0.09}_{-0.08}, 0.24^{+0.12}_{-0.10}, and 0.08^{+0.04}_{-0.03} for the VCV, Swift-BAT and 2MRS catalogues, and the sample of 17 AGNs, respectively.
Bayesian analysis of response to selection: a case study using litter size in Danish Yorkshire pigs.
Sorensen, D; Vernersen, A; Andersen, S
2000-01-01
Implementation of a Bayesian analysis of a selection experiment is illustrated using litter size [total number of piglets born (TNB)] in Danish Yorkshire pigs. Other traits studied include average litter weight at birth (WTAB) and proportion of piglets born dead (PRBD). Response to selection for TNB was analyzed with a number of models, which differed in their level of hierarchy, in their prior distributions, and in the parametric form of the likelihoods. A model assessment study favored a particular form of an additive genetic model. With this model, the Monte Carlo estimate of the 95% probability interval of response to selection was (0.23; 0.60), with a posterior mean of 0.43 piglets. WTAB showed a correlated response of -7.2 g, with a 95% probability interval equal to (-33.1; 18.9). The posterior mean of the genetic correlation between TNB and WTAB was -0.23 with a 95% probability interval equal to (-0.46; -0.01). PRBD was studied informally; it increases with larger litters, when litter size is >7 piglets born. A number of methodological issues related to the Bayesian model assessment study are discussed, as well as the genetic consequences of inferring response to selection using additive genetic models. PMID:10978292
Assessing State Nuclear Weapons Proliferation: Using Bayesian Network Analysis of Social Factors
Coles, Garill A.; Brothers, Alan J.; Olson, Jarrod; Whitney, Paul D.
2010-04-16
A Bayesian network (BN) model of social factors can support proliferation assessments by estimating the likelihood that a state will pursue a nuclear weapon. Social factors including political, economic, nuclear capability, security, and national identity and psychology factors may play as important a role in whether a State pursues nuclear weapons as more physical factors. This paper will show how using Bayesian reasoning on a generic case of a would-be proliferator State can be used to combine evidence that supports proliferation assessment. Theories and analysis by political scientists can be leveraged in a quantitative and transparent way to indicate proliferation risk. BN models facilitate diagnosis and inference in a probabilistic environment by using a network of nodes and acyclic directed arcs between the nodes whose connections, or absence of, indicate probabilistic relevance, or independence. We propose a BN model that would use information from both traditional safeguards and the strengthened safeguards associated with the Additional Protocol to indicate countries with a high risk of proliferating nuclear weapons. This model could be used in a variety of applications such as a prioritization tool and as a component of state safeguards evaluations. This paper will discuss the benefits of BN reasoning, the development of Pacific Northwest National Laboratory’s (PNNL) BN state proliferation model and how it could be employed as an analytical tool.
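How a Bayesian network combines evidence can be sketched with the simplest possible structure: one hypothesis node with evidence nodes that are conditionally independent given it. This is an assumption made for illustration only; the node names and every probability below are hypothetical and are not drawn from the PNNL model.

```python
def posterior_pursuit(prior, likelihoods, observed):
    """Posterior probability of the hypothesis given binary evidence.

    prior: P(pursuit) before seeing evidence.
    likelihoods: {factor: (P(present | pursuit), P(present | no pursuit))}
    observed: {factor: True/False} for the factors that were assessed.
    Evidence nodes are assumed conditionally independent given the hypothesis.
    """
    weight_yes = prior
    weight_no = 1.0 - prior
    for name, present in observed.items():
        p_yes, p_no = likelihoods[name]
        weight_yes *= p_yes if present else 1.0 - p_yes
        weight_no *= p_no if present else 1.0 - p_no
    return weight_yes / (weight_yes + weight_no)

# Hypothetical conditional probability entries for two social factors.
toy_likelihoods = {
    "security_threat": (0.8, 0.3),
    "nuclear_capability": (0.6, 0.2),
}
result = posterior_pursuit(
    prior=0.1,
    likelihoods=toy_likelihoods,
    observed={"security_threat": True, "nuclear_capability": True},
)
print(round(result, 3))
```

Factors left out of `observed` simply drop out of the product, which mirrors how a network can still be queried when only part of the evidence is available.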
Fast Bayesian whole-brain fMRI analysis with spatial 3D priors.
Sidén, Per; Eklund, Anders; Bolin, David; Villani, Mattias
2017-02-01
Spatial whole-brain Bayesian modeling of task-related functional magnetic resonance imaging (fMRI) is a great computational challenge. Most of the currently proposed methods therefore do inference in subregions of the brain separately or do approximate inference without comparison to the true posterior distribution. A popular such method, which is now the standard method for Bayesian single subject analysis in the SPM software, is introduced in Penny et al. (2005b). The method processes the data slice-by-slice and uses an approximate variational Bayes (VB) estimation algorithm that enforces posterior independence between activity coefficients in different voxels. We introduce a fast and practical Markov chain Monte Carlo (MCMC) scheme for exact inference in the same model, both slice-wise and for the whole brain using a 3D prior on activity coefficients. The algorithm exploits sparsity and uses modern techniques for efficient sampling from high-dimensional Gaussian distributions, leading to speed-ups without which MCMC would not be a practical option. Using MCMC, we are for the first time able to evaluate the approximate VB posterior against the exact MCMC posterior, and show that VB can lead to spurious activation. In addition, we develop an improved VB method that drops the assumption of independent voxels a posteriori. This algorithm is shown to be much faster than both MCMC and the original VB for large datasets, with negligible error compared to the MCMC posterior.
White, Amanda M.; Gastelum, Zoe N.; Whitney, Paul D.
2014-05-13
Under the auspices of Pacific Northwest National Laboratory’s Signature Discovery Initiative (SDI), the research team developed a series of Bayesian Network models to assess multi-source signatures of nuclear programs. A Bayesian network is a mathematical model that can be used to marshal evidence to assess competing hypotheses. The purpose of the models was to allow non-expert analysts to benefit from the use of expert-informed mathematical models to assess nuclear programs, because such assessments require significant technical expertise ranging from the nuclear fuel cycle, construction and engineering, imagery analysis, and so forth. One such model developed under this research was aimed at assessing the consistency of open-source information about a nuclear facility with the facility’s declared use. The model incorporates factors such as location, security and safety features among others identified by subject matter experts as crucial to their assessments. The model includes key features, observables and their relationships. The model also provides documentation, which serves as training materials for the non-experts.
Bayesian analysis of a multivariate null intercept errors-in-variables regression model.
Aoki, Reiko; Bolfarine, Heleno; Achcar, Jorge A; Dorival, Leão P Júnior
2003-11-01
Longitudinal data are of great interest in analysis of clinical trials. In many practical situations the covariate cannot be measured precisely and a natural alternative model is the errors-in-variables regression model. In this paper we study a null intercept errors-in-variables regression model with a structure of dependency between the response variables within the same group. We apply the model to real data presented in Hadgu and Koch (Hadgu, A., Koch, G. (1999). Application of generalized estimating equations to a dental randomized clinical trial. J. Biopharmaceutical Statistics 9(1):161-178). In that study volunteers with preexisting dental plaque were randomized to two experimental mouth rinses (A and B) or a control mouth rinse with double blinding. The dental plaque index was measured for each subject in the beginning of the study and at two follow-up times, which leads to the presence of an interclass correlation. We propose the use of a Bayesian approach to model a multivariate null intercept errors-in-variables regression model to the longitudinal data. The proposed Bayesian approach accommodates the correlated measurements and incorporates the restriction that the slopes must lie in the (0, 1) interval. A Gibbs sampler is used to perform the computations.
Bayesian analysis of an admixture model with mutations and arbitrarily linked markers.
Excoffier, Laurent; Estoup, Arnaud; Cornuet, Jean-Marie
2005-03-01
We introduce here a Bayesian analysis of a classical admixture model in which all parameters are simultaneously estimated. Our approach follows the approximate Bayesian computation (ABC) framework, relying on massive simulations and a rejection-regression algorithm. Although computationally intensive, this approach can easily deal with complex mutation models and partially linked loci, and it can be thoroughly validated without much additional computation cost. Compared to a recent maximum-likelihood (ML) method, the ABC approach leads to similarly accurate estimates of admixture proportions in the case of recent admixture events, but it is found superior when the admixture is more ancient. All other parameters of the admixture model such as the divergence time between parental populations, the admixture time, and the population sizes are also well estimated, unlike the ML method. The use of partially linked markers does not introduce any particular bias in the estimation of admixture, but ML confidence intervals are found too narrow if linkage is not specifically accounted for. The application of our method to an artificially admixed domestic bee population from northwest Italy suggests that the admixture occurred in the last 10-40 generations and that the parental Apis mellifera and A. ligustica populations were completely separated since the last glacial maximum.
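The rejection step of approximate Bayesian computation can be illustrated with a deliberately small sketch. It keeps only plain rejection sampling (the regression adjustment used in the paper is omitted) and replaces the full admixture model with a single-locus toy in which the hybrid allele frequency is a mixture of two known parental frequencies. All names, the uniform prior, the tolerance, and the numbers are assumptions for illustration.

```python
import random

def simulate_count(admixture, p1, p2, n_genes, rng):
    """Simulate the count of a reference allele among n_genes sampled genes.

    The hybrid population's allele frequency is assumed to be
    admixture * p1 + (1 - admixture) * p2.
    """
    p_hybrid = admixture * p1 + (1.0 - admixture) * p2
    return sum(1 for _ in range(n_genes) if rng.random() < p_hybrid)

def abc_rejection(observed, p1, p2, n_genes, n_sims=5000, tol=2, seed=1):
    """Keep prior draws whose simulated summary lies within tol of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_sims):
        admixture = rng.random()  # prior: Uniform(0, 1)
        simulated = simulate_count(admixture, p1, p2, n_genes, rng)
        if abs(simulated - observed) <= tol:
            accepted.append(admixture)
    return accepted

# Hypothetical data: 65 reference alleles among 100 genes in the hybrid population.
accepted = abc_rejection(observed=65, p1=0.9, p2=0.2, n_genes=100)
posterior_mean = sum(accepted) / len(accepted)
print(len(accepted), round(posterior_mean, 2))
```

The accepted draws approximate the posterior of the admixture proportion without ever evaluating a likelihood, which is why the approach extends to mutation models and linked loci where the likelihood is intractable.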
Bayesian analysis of radiocarbon chronologies: examples from the European Late-glacial
NASA Astrophysics Data System (ADS)
Blockley, S. P. E.; Lowe, J. J.; Walker, M. J. C.; Asioli, A.; Trincardi, F.; Coope, G. R.; Donahue, R. E.
2004-02-01
Although there are many Late-glacial (ca. 15 000-11 000 cal. yr BP) proxy climate records from northwest Europe, some analysed at a very high temporal resolution (decadal to century scale), attempts to establish time-stratigraphical correlations between sequences are constrained by problems of radiocarbon dating. In an attempt to overcome some of these difficulties, we have used a Bayesian approach to the analysis of radiocarbon chronologies for two Late-glacial sites in the British Isles and one in the Adriatic Sea. The palaeoclimatic records from the three sites were then compared with that from the GRIP Greenland ice-core. Although there are some apparent differences in the timing of climatic events during the early part of the Late-glacial (pre-14 000 cal. yr BP), the results suggest that regional climatic changes appear to have been broadly comparable between Greenland, the British Isles and the Adriatic during the major part of the Late-glacial (i.e. between 14 000 and 11 000 cal. yr BP). The advantage of using the Bayesian approach is that it provides a means of testing the reliability of Late-glacial radiocarbon chronologies that is independent of regional chronostratigraphical (climatostratigraphical) frameworks. It also uses the full radiocarbon inventory available for each sequence and makes explicit any data selection applied. Potentially, therefore, it offers a more objective basis for comparing regional radiocarbon chronologies than the conventional approaches that have been used hitherto.
BASiCS: Bayesian Analysis of Single-Cell Sequencing Data.
Vallejos, Catalina A; Marioni, John C; Richardson, Sylvia
2015-06-01
Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of unexplained technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model where: (i) cell-specific normalisation constants are estimated as part of the model parameters, (ii) technical variability is quantified based on spike-in genes that are artificially introduced to each analysed cell's lysate and (iii) the total variability of the expression counts is decomposed into technical and biological components. BASiCS also provides an intuitive detection criterion for highly (or lowly) variable genes within the population of cells under study. This is formalised by means of tail posterior probabilities associated to high (or low) biological cell-to-cell variance contributions, quantities that can be easily interpreted by users. We demonstrate our method using gene expression measurements from mouse Embryonic Stem Cells. Cross-validation and meaningful enrichment of gene ontology categories within genes classified as highly (or lowly) variable supports the efficacy of our approach. PMID:26107944
Shipham, Ashlee; Schmidt, Daniel J; Joseph, Leo; Hughes, Jane M
2015-10-01
Relationships and species limits among the colourful Australian parrots known as rosellas (Platycercus) are contentious because of poorly understood patterns of parapatry, sympatry and hybridization as well as complex patterns of geographical replacement of phenotypic forms. Two subgenera are, however, conventionally recognised: Platycercus comprises the blue-cheeked crimson rosella complex (Crimson Rosella P. elegans and Green Rosella P. caledonicus), and Violania contains the remaining four currently recognised species (Pale-headed Rosella P. adscitus, Eastern Rosella P. eximius, Northern Rosella P. venustus, and Western Rosella P. icterotis). We used phylogenetic analysis of ten loci (one mitochondrial, eight autosomal and one Z-linked) and several individuals per nominal species primarily to examine relationships within the subgenera, especially the relationships and species limits within Violania. Of these, P. adscitus and P. eximius have long been considered sister species or conspecific due to a morphology-based hybrid zone and an early phylogenetic analysis of mitochondrial DNA restriction fragment length polymorphisms. The multilocus phylogenetic analysis presented here supports an alternative hypothesis aligning P. adscitus and P. venustus as sister species. Using divergence rates published in other avian studies, we estimated the divergence between P. venustus and P. adscitus at 0.0148-0.6124 MYA and that between the P. adscitus/P. venustus ancestor and P. eximius earlier at 0.1617-1.0816 MYA, both within the Pleistocene. Discordant topologies among gene and species trees are discussed and proposed to be the result of historical gene flow and/or incomplete lineage sorting (ILS). In particular, we suggest that discordance between mitochondrial and nuclear data may be the result of asymmetrical mitochondrial introgression from P. adscitus into P. eximius. The biogeographical implications of our findings are discussed relative to similarly distributed groups
Onisko, Agnieszka; Druzdzel, Marek J.; Austin, R. Marshall
2016-01-01
Background: Classical statistics is a well-established approach in the analysis of medical data. While the medical community seems to be familiar with the concept of a statistical analysis and its interpretation, the Bayesian approach, argued by many of its proponents to be superior to the classical frequentist approach, is still not well-recognized in the analysis of medical data. Aim: The goal of this study is to encourage data analysts to use the Bayesian approach, such as modeling with graphical probabilistic networks, as an insightful alternative to classical statistical analysis of medical data. Materials and Methods: This paper offers a comparison of two approaches to analysis of medical time series data: (1) classical statistical approach, such as the Kaplan–Meier estimator and the Cox proportional hazards regression model, and (2) dynamic Bayesian network modeling. Our comparison is based on time series cervical cancer screening data collected at Magee-Womens Hospital, University of Pittsburgh Medical Center over 10 years. Results: The main outcomes of our comparison are cervical cancer risk assessments produced by the three approaches. However, our analysis also discusses several aspects of the comparison, such as modeling assumptions, model building, dealing with incomplete data, individualized risk assessment, results interpretation, and model validation. Conclusion: Our study shows that the Bayesian approach (1) is much more flexible in terms of modeling effort, and (2) offers an individualized risk assessment, which is more cumbersome for classical statistical approaches. PMID:28163973
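As a point of reference for the classical side of this comparison, the Kaplan–Meier product-limit estimator can be sketched in a few lines. This is a minimal illustration, not the study's code: the function name `kaplan_meier` and the toy follow-up data are assumptions introduced here, and no screening data from the paper are used.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Product-limit survival estimate (hypothetical helper, for illustration).

    times  -- follow-up time for each subject
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    observed = Counter(t for t, e in zip(times, events) if e)  # events per time
    leaving = Counter(times)                                   # events + censorings
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who had an event or was censored at t leaves the risk set.
        at_risk -= leaving[t]
    return curve

# Toy data (made up): five subjects, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is the population-level behaviour the abstract contrasts with individualized risk assessment.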
Phylogenetic analysis of Gansu sheeppox virus isolates based on P32, GPCR, and RPO30 genes.
Su, H L; Jia, H J; Yin, C; Jing, Z Z; Luo, X N; Chen, Y X
2015-03-13
Two outbreaks of sheeppox in sheep have occurred in Gansu Province, China. The P32, GPCR, and RPO30 genes were used as markers for differential diagnosis. We confirmed that the outbreaks were caused by sheeppox virus. Sequence and phylogenetic analysis of the P32, GPCR, and RPO30 genes revealed a close relationship between the 2 isolates and Chinese sheeppox viruses. Because ill sheep were imported from Jingyuan, another county of Gansu Province, our results strongly suggest the importance of veterinary surveillance prior to transportation.
Genotyping and phylogenetic analysis of bovine viral diarrhea virus (BVDV) isolates in Kosovo.
Goga, Izedin; Berxholi, Kristaq; Hulaj, Beqe; Sylejmani, Driton; Yakobson, Boris; Stram, Yehuda
2014-01-01
Three serum samples that tested positive in a BVDV antigen ELISA were analysed to characterise the genetic diversity of bovine viral diarrhea virus (BVDV) in Kosovo. Samples were obtained in 2011 from heifers and were amplified by reverse transcription-polymerase chain reaction, sequenced and analysed by computer-assisted phylogenetic analysis. Amplified products and nucleotide sequences showed that all 3 isolates belonged to the BVDV 1 genotype and the 1b subgenotype. These results enrich the extant knowledge of BVDV and represent the first documented data about Kosovo BVDV isolates.
BaalChIP: Bayesian analysis of allele-specific transcription factor binding in cancer genomes.
de Santiago, Ines; Liu, Wei; Yuan, Ke; O'Reilly, Martin; Chilamakuri, Chandra Sekhar Reddy; Ponder, Bruce A J; Meyer, Kerstin B; Markowetz, Florian
2017-02-24
Allele-specific measurements of transcription factor binding from ChIP-seq data are key to dissecting the allelic effects of non-coding variants and their contribution to phenotypic diversity. However, most methods of detecting an allelic imbalance assume diploid genomes. This assumption severely limits their applicability to cancer samples with frequent DNA copy-number changes. Here we present a Bayesian statistical approach called BaalChIP to correct for the effect of background allele frequency on the observed ChIP-seq read counts. BaalChIP allows the joint analysis of multiple ChIP-seq samples across a single variant and outperforms competing approaches in simulations. Using 548 ENCODE ChIP-seq and six targeted FAIRE-seq samples, we show that BaalChIP effectively corrects allele-specific analysis for copy-number variation and increases the power to detect putative cis-acting regulatory variants in cancer genomes.
A Bayesian Approach for Instrumental Variable Analysis with Censored Time-to-Event Outcome
Li, Gang; Lu, Xuyang
2014-01-01
Instrumental variable (IV) analysis has been widely used in economics, epidemiology, and other fields to estimate the causal effects of covariates on outcomes, in the presence of unobserved confounders and/or measurement errors in covariates. However, IV methods for time-to-event outcome with censored data remain underdeveloped. This paper proposes a Bayesian approach for IV analysis with censored time-to-event outcome by using a two-stage linear model. A Markov Chain Monte Carlo sampling method is developed for parameter estimation for both normal and non-normal linear models with elliptically contoured error distributions. Performance of our method is examined by simulation studies. Our method largely reduces bias and greatly improves coverage probability of the estimated causal effect, compared to the method that ignores the unobserved confounders and measurement errors. We illustrate our method on the Women's Health Initiative Observational Study and the Atherosclerosis Risk in Communities Study. PMID:25393617
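To make the role of the instrument concrete, the sketch below contrasts a naive regression slope with the classical single-instrument ratio estimator cov(z, y) / cov(z, x). It is a deliberately simplified stand-in: it is not the Bayesian two-stage model of the paper, it ignores censoring, and the function names and toy numbers are assumptions made up for illustration.

```python
def _cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def ols_slope(x, y):
    """Naive slope of y on x; biased when x shares a confounder with y."""
    return _cov(x, y) / _cov(x, x)

def iv_ratio(z, x, y):
    """Classical IV (Wald ratio) estimate of the effect of x on y using instrument z."""
    return _cov(z, y) / _cov(z, x)

# Toy data (made up): u is an unobserved confounder, uncorrelated with z.
z = [0, 0, 1, 1]                               # instrument
u = [1, -1, 1, -1]                             # confounder (not available to the analyst)
x = [zi + ui for zi, ui in zip(z, u)]          # exposure depends on z and u
y = [2 * xi + 3 * ui for xi, ui in zip(x, u)]  # true causal effect of x is 2

naive = ols_slope(x, y)     # pulled away from 2 by the confounder
iv = iv_ratio(z, x, y)      # recovers the causal slope in this toy example
```

Because the instrument affects the outcome only through the exposure, the ratio isolates the causal slope, whereas the naive slope absorbs the confounder's contribution.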
Detection and phylogenetic analysis of Orf virus from sheep in Brazil: a case report
Abrahão, Jônatas S; Campos, Rafael K; Trindade, Giliane S; Guedes, Maria IM; Lobato, Zélia IP; Mazur, Carlos; Ferreira, Paulo CP; Bonjardim, Cláudio A; Kroon, Erna G
2009-01-01
Background: Orf virus (ORFV), the prototype of the genus Parapoxvirus (PPV), is the etiological agent of contagious ecthyma, a severe exanthematic dermatitis that afflicts domestic and wild small ruminants. Although South American ORFV outbreaks have occurred and been diagnosed, there are no South American PPV major membrane glycoprotein B2L gene nucleotide sequences available. Case presentation: An outbreak of ovine contagious ecthyma in Midwest Brazil was investigated. The diagnosis was based on clinical examinations and molecular biology techniques. The molecular characterization of the virus was done using PCR amplification, cloning and DNA sequencing of the B2L gene. The phylogenetic analysis demonstrated a high degree of identity with ORFV strains, and the isolate was closest to the ORFV-India 82/04 isolate. Another Brazilian ORFV isolate, NE1, was sequenced for comparative analysis and also showed a high degree of identity with an Asian ORFV strain. Conclusion: Distinct ORFV strains are circulating in Brazil. This is the first report on the phylogenetic analysis of an ORFV in South America. PMID:19413907
Genome-wide identification and phylogenetic analysis of the SBP-box gene family in melons.
Ma, Y; Guo, J W; Bade, R; Men, Z H; Hasi, A
2014-10-27
The SBP-box gene family is specific to plants and encodes a class of zinc finger-containing transcription factors with a broad range of functions. Although SBP-box genes have been identified in numerous plants, including green algae, moss, silver birch, snapdragon, Arabidopsis, rice, and maize, there is little information concerning the function of SBP-box genes, or of the corresponding miR156/157, in melon. Using the highly conserved sequence of the Arabidopsis thaliana SBP-box domain protein as a query sequence, the genome-wide protein database of melon was searched to obtain 13 SBP-box protein sequences, which were further divided into 4 groups based on phylogenetic analysis. A further analysis centered on the melon SBP-box gene family's phylogenetic evolution, sequence similarities, gene structure, and miR156 target sequence was also conducted. Analysis of the expression patterns of all melon SBP-box family genes showed that the SBP-box genes were detected in 7 kinds of tissue, and fruit had the highest expression level. CmSBP11 tended to be expressed specifically in melon fruit and root. CmSBP09 expression was the highest in flower. Overall, the molecular evolution and expression pattern of the melon SBP-box gene family revealed by these results suggest functional differentiation following gene duplication.
Phylogenetic analysis reveals the evolution and diversification of cyclins in eukaryotes.
Ma, Zhaowu; Wu, Yuliang; Jin, Jialu; Yan, Jun; Kuang, Shuzhen; Zhou, Mi; Zhang, Yuexuan; Guo, An-Yuan
2013-03-01
Cyclins are a family of diverse proteins that play fundamental roles in regulating cell cycle progression in Eukaryotes. Cyclins have been identified from protists to higher Eukaryotes, but their evolution remains unclear and published findings are controversial. The current classification of cyclins is mainly based on their functions, which may not be appropriate for systematic evolutionary analysis. In this work, we performed comparative and phylogenetic analysis of cyclins to investigate their classification, origin and evolution. Cyclins originated in early Eukaryotes and evolved from protists to plants, fungi and animals. Based on the phylogenetic tree, cyclins can be divided into three major groups, designated groups I, II and III, with different functions and features. Group I plays key roles in the cell cycle; group II varies in its actions and is kingdom (plant, fungi and animal) specific; and group III functions in transcription regulation. Our results showed that the dominant cyclins (group I) diverged from protists to plants, fungi and animals, while divergence of the other cyclins (groups II and III) occurred in protists. We also discussed the evolutionary relationships between cyclins and cyclin-dependent kinases (CDKs) and found that the cyclins underwent divergence in protists before the divergence of animal CDKs. This reclassification and evolutionary analysis of cyclins might facilitate understanding of eukaryotic cell cycle control.
Guo, Zhong-Long; Wang, Juan; Shen, Yu-Ying
2015-01-01
Insect mitochondrial genomes (mitogenomes) are the most extensively used source of genetic information for molecular evolution, phylogenetics and population genetics. Pentatomomorpha (>14,000 species) is the second largest infraorder of Heteroptera and of great economic importance. To better understand the diversity and phylogeny within Pentatomomorpha, we sequenced and annotated the complete mitogenome of Corizus tetraspilus (Hemiptera: Rhopalidae), an important pest of alfalfa in China. We analyzed the main features of the C. tetraspilus mitogenome, and provided a comparative analysis with four other Coreoidea species. Our results reveal that gene content, gene arrangement, nucleotide composition, codon usage, rRNA structures and sequences of mitochondrial transcription termination factor are conserved in Coreoidea. Comparative analysis shows that different protein-coding genes have been subject to different evolutionary rates correlated with the G+C content. All the transfer RNA genes found in Coreoidea have the typical clover leaf secondary structure, except for trnS1 (AGN), which lacks the dihydrouridine (DHU) arm and possesses an unusual anticodon stem (9 bp vs. the normal 5 bp). The control regions (CRs) among Coreoidea are highly variable in size, of which the CR of C. tetraspilus is the smallest (440 bp), making the C. tetraspilus mitogenome the smallest (14,989 bp) within all completely sequenced Coreoidea mitogenomes. No conserved motifs are found in the CRs of Coreoidea. In addition, the A+T content (60.68%) of the CR of C. tetraspilus is much lower than that of the entire mitogenome (74.88%), and is lowest among Coreoidea. Phylogenetic analyses based on mitogenomic data support the monophyly of each superfamily within Pentatomomorpha, and recognize a phylogenetic relationship of (Aradoidea + (Pentatomoidea + (Lygaeoidea + (Pyrrhocoroidea + Coreoidea)))). PMID:26042898
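The A+T percentages quoted for the control region and the whole mitogenome are simple base-composition statistics. A minimal sketch of that calculation follows; the function name `at_content` and the short example strings are hypothetical, and ambiguous bases such as N are simply excluded, which is one possible convention rather than the one the authors necessarily used.

```python
def at_content(sequence):
    """Percentage of A and T among the unambiguous bases of a DNA sequence.

    Hypothetical helper for illustration; bases other than A, C, G, T
    (for example N) are ignored rather than counted in the denominator.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at = sum(1 for b in bases if b in "AT")
    return 100.0 * at / len(bases)

# Toy fragments (made up, not taken from the C. tetraspilus mitogenome).
region_a = at_content("AATTGC")   # 4 of 6 bases are A or T
region_b = at_content("atgcn")    # lower case and N are handled
```

Applying the same function to a control region and to the full genome sequence would give the kind of comparison reported in the abstract.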
2016-01-01
Aetosauria is an early-diverging clade of pseudosuchians (crocodile-line archosaurs) that had a global distribution and high species diversity as a key component of various Late Triassic terrestrial faunas. It is one of only two Late Triassic clades of large herbivorous archosaurs, and thus served a critical ecological role. Nonetheless, aetosaur phylogenetic relationships are still poorly understood, owing to an overreliance on osteoderm characters, which are often poorly constructed and suspected to be highly homoplastic. A new phylogenetic analysis of the Aetosauria, comprising 27 taxa and 83 characters, includes more than 40 new characters that focus on better sampling the cranial and endoskeletal regions, and represents the most comprehensive phylogeny of the clade to date. Parsimony analysis recovered three most parsimonious trees; the strict consensus of these trees finds an Aetosauria that is divided into two main clades: Desmatosuchia, which includes the Desmatosuchinae and the Stagonolepidinae, and Aetosaurinae, which includes the Typothoracinae. As defined, Desmatosuchinae now contains Neoaetosauroides engaeus and several taxa that were previously referred to the genus Stagonolepis, and a new clade, Desmatosuchini, is erected for taxa more closely related to Desmatosuchus. Overall support for some clades is still weak, and Partitioned Bremer Support (PBS) is applied for the first time to a strictly morphological dataset, demonstrating that this weak support is in part because of conflict in the phylogenetic signals of cranial versus postcranial characters. PBS helps identify homoplasy among characters from various body regions, presumably the result of convergent evolution within discrete anatomical modules. It is likely that at least some of this character conflict results from different body regions evolving at different rates, which may have been under different selective pressures. PMID:26819845
Geib, Scott M.; Scully, Erin D.; Jimenez-Gasco, Maria del Mar; Carlson, John E.; Tien, Ming; Hoover, Kelli
2012-01-01
Culture-independent analysis of the gut of a wood-boring insect, Anoplophora glabripennis (Coleoptera: Cerambycidae), revealed a consistent association between members of the fungal Fusarium solani species complex and the larval stage of both colony-derived and wild A. glabripennis populations. Using the translation elongation factor 1-alpha region for culture-independent phylogenetic and operational taxonomic unit (OTU)-based analyses, only two OTUs were detected, suggesting that genetic variance at this locus was low among A. glabripennis-associated isolates. To better survey the genetic variation of F. solani associated with A. glabripennis, and establish its phylogenetic relationship with other members of the F. solani species complex, single spore isolates were created from different populations and multi-locus phylogenetic analysis was performed using a combination of the translation elongation factor alpha-1, internal transcribed spacer, and large subunit rDNA regions. These analyses revealed that colony-derived larvae reared in three different tree species or on artificial diet, as well as larvae from wild populations collected from three additional tree species in New York City and from a single tree species in Worcester, MA, consistently harbored F. solani within their guts. While there is some genetic variation in the F. solani carried between populations, within-population variation is low. We speculate that F. solani is able to fill a broad niche in the A. glabripennis gut, providing it with fungal lignocellulases to allow the larvae to grow and develop on woody tissue. However, it is likely that many F. solani genotypes could potentially fill this niche, so the relationship may not be limited to a single member of the F. solani species complex. While little is known about the role of filamentous fungi and their symbiotic associations with insects, this report suggests that larval A. glabripennis has developed an intimate relationship with F. solani
Shin, Junha; Lee, Insuk
2015-01-01
Phylogenetic profiling, a network inference method based on gene inheritance profiles, has been widely used to construct functional gene networks in microbes. However, its utility for network inference in higher eukaryotes has been limited. An improved algorithm with an in-depth understanding of pathway evolution may overcome this limitation. In this study, we investigated the effects of taxonomic structures on co-inheritance analysis using 2,144 reference species in four query species: Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, and Homo sapiens. We observed three clusters of reference species based on a principal component analysis of the phylogenetic profiles, which correspond to the three domains of life (Archaea, Bacteria, and Eukaryota), suggesting that pathways inherit primarily within specific domains or lower-ranked taxonomic groups during speciation. Hence, the co-inheritance pattern within a taxonomic group may be eroded by confounding inheritance patterns from irrelevant taxonomic groups. We demonstrated that co-inheritance analysis within domains substantially improved network inference not only in microbe species but also in the higher eukaryotes, including humans. Although we observed two sub-domain clusters of reference species within Eukaryota, co-inheritance analysis within these sub-domain taxonomic groups only marginally improved network inference. Therefore, we conclude that co-inheritance analysis within domains is the optimal approach to network inference with the given reference species. The construction of a series of human gene networks with increasing sample sizes of the reference species for each domain revealed that the size of the high-accuracy networks increased as additional reference species genomes were included, suggesting that within-domain co-inheritance analysis will continue to expand human gene networks as genomes of additional species are sequenced. Taken together, we propose that co-inheritance analysis
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected
Yu, H; Liu, T X; Wang, D
2016-09-23
The complete genomic RNA of the Chinese sacbrood virus (CSBV) strain that infects honeybees in the Loess Plateau was sequenced and analyzed. The CSBV-SX strain contains 8705 nucleotides, which include a single large open reading frame (nucleotides 99-8681) encoding 2860 amino acids. A novel efficient identification method was used to investigate the samples infected by CSBV. The putative amino acid sequence alignment analysis showed that, in addition to well-characterized domains such as the RNA helicase, RNA protease, and RNA-dependent RNA polymerase domains, a calicivirus coat protein domain was identified at amino acids 493-564. Phylogenetic analysis indicated that CSBV-SX was closely related to CSBV-BJ, and this result was supported by nucleotide multiple sequence alignment and protein multiple sequence alignment analysis results. These differences in the CSBV-SX strain may be related to virus adaptation to the xerothermic, low relative humidity, and strong ultraviolet radiation conditions in the Loess Plateau.
Comparative Analysis of Begonia Plastid Genomes and Their Utility for Species-Level Phylogenetics.
Harrison, Nicola; Harrison, Richard J; Kidner, Catherine A
2016-01-01
Recent, rapid radiations make species-level phylogenetics difficult to resolve. We used a multiplexed, high-throughput sequencing approach to identify informative genomic regions to resolve phylogenetic relationships at low taxonomic levels in Begonia from a survey of sixteen species. A long-range PCR method was used to generate draft plastid genomes to provide a strong phylogenetic backbone, identify fast evolving regions and provide informative molecular markers for species-level phylogenetic studies in Begonia. PMID:27058864
Kovács, Endre R; Benko, Mária
2009-03-01
Partial genome characterisation of a novel adenovirus, found recently in organ samples of multiple species of dead birds of prey, was carried out by sequence analysis of PCR-amplified DNA fragments. The virus, named raptor adenovirus 1 (RAdV-1), was originally detected by a nested PCR method with consensus primers targeting the adenoviral DNA polymerase gene. Phylogenetic analysis with the deduced amino acid sequence of the small PCR product implied that a new siadenovirus type was present in the samples. Since virus isolation attempts remained unsuccessful, further characterisation of this putative novel siadenovirus was carried out with the use of PCR on the infected organ samples. The DNA sequence of the central genome part of RAdV-1, encompassing nine full (pTP, 52K, pIIIa, III, pVII, pX, pVI, hexon, protease) and two partial (DNA polymerase and DBP) genes and exceeding 12 kb pairs in size, was determined. Phylogenetic tree reconstructions, based on several genes, unambiguously confirmed the preliminary classification of RAdV-1 as a new species within the genus Siadenovirus. Further study of RAdV-1 is of interest since it represents a rare adenovirus genus of yet undetermined host origin.
Analysis of Domain Architecture and Phylogenetics of Family 2 Glycoside Hydrolases (GH2).
Talens-Perales, David; Górska, Anna; Huson, Daniel H; Polaina, Julio; Marín-Navarro, Julia
2016-01-01
In this work we report a detailed analysis of the topology and phylogenetics of family 2 glycoside hydrolases (GH2). We distinguish five topologies or domain architectures based on the presence and distribution of protein domains defined in Pfam and Interpro databases. All of them share a central TIM barrel (catalytic module) with two β-sandwich domains (non-catalytic) at the N-terminal end, but differ in the occurrence and nature of additional non-catalytic modules at the C-terminal region. Phylogenetic analysis was based on the sequence of the Pfam Glyco_hydro_2_C catalytic module present in most GH2 proteins. Our results led us to propose a model in which evolutionary diversity of GH2 enzymes is driven by the addition of different non-catalytic domains at the C-terminal region. This model accounts for the divergence of β-galactosidases from β-glucuronidases, the diversification of β-galactosidases with different transglycosylation specificities, and the emergence of bicistronic β-galactosidases. This study also allows the identification of groups of functionally uncharacterized protein sequences with potential biotechnological interest.
Phylogenetic analysis and antimicrobial activities of Streptomyces isolates from mangrove sediment.
Satheeja, Santhi V; Jebakumar, Solomon R D
2011-02-01
The phylogeny of members of Streptomyces bacteria isolated from mangrove sediments in the Manakudi estuary near the Arabian Sea, India, was analyzed in the present study. Among the 35 different isolates, five organisms, JS-9, JS-11, JS-12, JS-13 and JS-20, exhibited potent antimicrobial effects against methicillin-resistant Staphylococcus aureus (clinical isolate), methicillin-susceptible S. aureus MTCC 3160 and Salmonella typhi MTCC 733; all other isolates displayed intermediate antimicrobial effects. RFLP analysis of HaeIII and BstUI double-digested 16S rRNA gene fragments distinguished the isolates into 20 distinct RFLP types, with the genetic similarity coefficient varying from 0.57 to 0.97. On average, 17 RFLP markers were observed, ranging from approximately 50 to 350 bp in size, and all the RFLP types showed significant genetic polymorphism by clustering into three major clusters. Phylogenetic analysis showed that the 20-member Streptomyces isolates were divided into three major clusters and that they shared 97.2-99.8% sequence identity to the 16S rRNA gene sequences of Streptomyces taxa of marine origin. The distribution of the isolates revealed that the distinct Streptomyces groups were clustered in the phylogenetic tree and there was a good correlation between the diversity of the antimicrobial phenotype and that of the 16S rRNA gene.
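The abstract reports genetic similarity coefficients between RFLP types but does not say which coefficient was used. As one common possibility, the sketch below computes the Dice (Nei-Li) coefficient from shared restriction fragments. The choice of coefficient, the function name `dice_similarity`, and the fragment sizes are all assumptions for illustration, not values from the study.

```python
def dice_similarity(bands_a, bands_b):
    """Dice (Nei-Li) coefficient between two RFLP band patterns.

    Each pattern is given as a collection of fragment sizes in bp;
    bands are treated as shared when their sizes match exactly
    (a simplification, since real gels need a size tolerance).
    """
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        raise ValueError("both band patterns are empty")
    return 2 * len(a & b) / (len(a) + len(b))

# Made-up fragment sizes (bp) for two hypothetical isolates.
isolate_1 = [50, 120, 200, 350]
isolate_2 = [50, 120, 300]
similarity = dice_similarity(isolate_1, isolate_2)  # 2 shared bands out of 4 + 3
```

A matrix of such pairwise values is the usual input to the clustering step that groups RFLP types.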
Phylogenetic analysis of rabies virus isolated from canids in North and Northeast Brazil.
de Souza, Débora Nunes; Carnieli, Pedro; Macedo, Carla Isabel; de Novaes Oliveira, Rafael; de Carvalho Ruthner Batista, Helena Beatriz; Rodrigues, Adriana Candido; Pereira, Patricia Mariano Cruz; Achkar, Samira Maria; Vieira, Luiz Fernando Pereira; Kawai, Juliana Galera Castilho
2017-01-01
Cases of canine rabies continue to occur in North and Northeast Brazil, and the number of notifications of rabies cases in wild canids has increased as a result of the expansion of urban areas at the expense of areas with native vegetation. In light of this, we performed molecular characterization of rabies virus isolates from dogs and Cerdocyon thous from various states in North and Northeast Brazil. In all, 102 samples from dogs (n = 56) and Cerdocyon thous (n = 46) collected between 2006 and 2012 were used. The nucleotide sequences obtained for the N gene of rabies virus were analyzed, and phylogenetic analysis revealed the presence of two distinct genetic lineages, one associated with canids and one with bats, and, within the canid cluster, two distinct sublineages circulating among dogs and Cerdocyon thous. In addition, phylogenetic groups associated with geographic region and fourteen cases of interspecific infection were observed among the isolates from canids. Our findings show that analysis of rabies virus lineages isolated from reservoirs such as canids must be constantly evaluated because the mutation rate is high.
Phylogenetic Analysis of Cryptosporidium Parasites Based on the Small-Subunit rRNA Gene Locus
Xiao, Lihua; Escalante, Lillian; Yang, Chunfu; Sulaiman, Irshad; Escalante, Anannias A.; Montali, Richard J.; Fayer, Ronald; Lal, Altaf A.
1999-01-01
Biological data support the hypothesis that there are multiple species in the genus Cryptosporidium, but a recent analysis of the available genetic data suggested that there is insufficient evidence for species differentiation. In order to resolve the controversy in the taxonomy of this parasite genus, we characterized the small-subunit rRNA genes of Cryptosporidium parvum, Cryptosporidium baileyi, Cryptosporidium muris, and Cryptosporidium serpentis and performed a phylogenetic analysis of the genus Cryptosporidium. Our study revealed that the genus Cryptosporidium contains the phylogenetically distinct species C. parvum, C. muris, C. baileyi, and C. serpentis, which is consistent with the biological characteristics and host specificity data. The Cryptosporidium species formed two clades, with C. parvum and C. baileyi belonging to one clade and C. muris and C. serpentis belonging to the other clade. Within C. parvum, human genotype isolates and guinea pig isolates (known as Cryptosporidium wrairi) each differed from bovine genotype isolates by the nucleotide sequence in four regions. A C. muris isolate from cattle was also different from parasites isolated from a rock hyrax and a Bactrian camel. Minor differences were also detected between C. serpentis isolates from snakes and lizards. Based on the genetic information, a species- and strain-specific PCR-restriction fragment length polymorphism diagnostic tool was developed. PMID:10103253
Analysis of Domain Architecture and Phylogenetics of Family 2 Glycoside Hydrolases (GH2)
Talens-Perales, David; Górska, Anna; Huson, Daniel H.; Polaina, Julio
2016-01-01
In this work we report a detailed analysis of the topology and phylogenetics of family 2 glycoside hydrolases (GH2). We distinguish five topologies or domain architectures based on the presence and distribution of protein domains defined in Pfam and Interpro databases. All of them share a central TIM barrel (catalytic module) with two β-sandwich domains (non-catalytic) at the N-terminal end, but differ in the occurrence and nature of additional non-catalytic modules at the C-terminal region. Phylogenetic analysis was based on the sequence of the Pfam Glyco_hydro_2_C catalytic module present in most GH2 proteins. Our results led us to propose a model in which evolutionary diversity of GH2 enzymes is driven by the addition of different non-catalytic domains at the C-terminal region. This model accounts for the divergence of β-galactosidases from β-glucuronidases, the diversification of β-galactosidases with different transglycosylation specificities, and the emergence of bicistronic β-galactosidases. This study also allows the identification of groups of functionally uncharacterized protein sequences with potential biotechnological interest. PMID:27930742
Identification and phylogenetic analysis of novel cytochrome P450 1A genes from ungulate species.
Darwish, Wageh Sobhy; Kawai, Yusuke; Ikenaka, Yoshinori; Yamamoto, Hideaki; Muroya, Tarou; Ishizuka, Mayumi
2010-09-01
As part of an ongoing effort to understand the biological response of wild and domestic ungulates to environmental pollutants such as dioxin-like compounds, cDNAs encoding CYP1A1 and CYP1A2 were cloned and characterized. Four novel CYP1A cDNA fragments from the livers of four wild ungulates (elephant, hippopotamus, tapir and deer) were identified. Three fragments, from hippopotamus, tapir and deer, were classified as CYP1A2, and the fragment from elephant was designated CYP1A1/2. The deduced amino acid sequences of these CYP1A fragments showed identities ranging from 76 to 97% with other animal CYP1As. Phylogenetic analysis of these fragments showed that the elephant and hippopotamus CYP1As formed separate branches, while the tapir and deer CYP1As were located beside those of horse and cattle, respectively, in the phylogenetic tree. Analysis of the dN/dS ratio among the identified CYP1As indicated that odd-toed ungulate CYP1A2s were exposed to different selection pressure.
Phylogenetic analysis of the NS5 gene of dengue viruses isolated in Ecuador.
Regato, Mary; Recarey, Ricardo; Moratorio, Gonzalo; de Mora, Domenica; Garcia-Aguirre, Laura; Gónzalez, Manuel; Mosquera, Carlos; Alava, Aracely; Fajardo, Alvaro; Alvarez, Macarena; D' Andrea, Lucia; Dubra, Ana; Martínez, Mariela; Khan, Baldip; Cristina, Juan
2008-03-01
Dengue virus (DENV) is a member of the genus Flavivirus of the family Flaviviridae. DENV causes a wide range of diseases in humans, from the acute febrile illness dengue fever (DF) to life-threatening dengue hemorrhagic fever/dengue shock syndrome (DHF/DSS). Nothing is known about the genetic relations among DENV strains circulating in Ecuador. Given the emerging behaviour of DENV, a single-tube RT-PCR assay using a pair of consensus primers to target the NS5 coding region has recently been validated for rapid detection of flaviviruses. In order to gain insight into the degree of genetic variation of DENV strains isolated in Ecuador, DENV NS5 sequences from 23 patients were obtained by direct sequencing of PCR fragments using this one-step RT-PCR assay. Phylogenetic analysis carried out using the 23 Ecuadorian DENV NS5 sequences, as well as 56 comparable sequences from DENV strains isolated elsewhere, revealed a close genetic relation between Ecuadorian strains and DENV isolates of Caribbean origin. The use of partial NS5 gene sequences may represent a useful alternative for rapid phylogenetic analysis of DENV outbreaks.
Cella, Eleonora; Ceccarelli, Giancarlo; Vita, Serena; Lai, Alessia; Presti, Alessandra Lo; Blasi, Aletheia; Palco, Maurizio Lo; Guarino, Michele Pier Luca; Zehender, Gianguglielmo; Angeletti, Silvia; Ciccozzi, Massimo
2017-04-01
The armed conflict in Mali has caused a migration crisis since 2012. Most Malian refugees were in Italy. In Sub-Saharan Africa, the seroprevalence of anti-HBV antibodies is particularly high. Genotype E is the most prevalent throughout a crescent-shaped area from Angola to Senegal, including Mali. We report 16 HBV-positive individuals among 136 Malian asylum seekers in order to investigate the genetic diversity of HBV in this population. Sequencing and phylogenetic analysis were used. The HBV genotype E isolates from Mali did not cluster together but were intermixed with the other African sequences. Only three supported clades were evident, and these were closely related to sequences from Burkina Faso. The estimated evolutionary rate was 9.29 × 10^-4. The root of the tree dated back to February 2008 (95% HPD: 2006-2011). From this ancestor, six main statistically supported clusters (pp > 0.80) were identified. The most recent clade dated back to May 2015. The BSP showed that the effective number of infections increased slightly from 2011 to 2015. Phylogenetic analysis helped in understanding how two of the sixteen individuals had been infected in Italy, and can make an important contribution to prevention campaigns and to monitoring of the viral infection in migrants. J. Med. Virol. 89:639-646, 2017. © 2016 Wiley Periodicals, Inc.
Duan, Chaohui; Liao, Meiying; Wang, Han; Luo, Xiaohong; Shao, Jing; Xu, Ying; Li, Wei; Hao, Wenbo; Luo, Shuhong
2015-01-25
Infection with the orf virus (ORFV) leads to contagious ecthyma, also called contagious pustular dermatitis, which usually affects sheep, goats and other small ruminants. It is widely distributed throughout the world and has also been reported to infect humans. Although many strains have been isolated from different parts of mainland China, strains have rarely been reported from the southern provinces of China. We studied a case of orf virus infection that occurred in Qingyuan City, Guangdong Province, in southern China. An orf virus strain, GDQY, was successfully isolated and identified through cell culture techniques and transmission electron microscopy. The complete genes ORFV011, ORFV059, ORFV106 and ORFV107 were amplified for sequence analysis at the nucleotide or amino acid level. To assess genetic variation, the sequences were compared with those of reference strains isolated from different districts or countries. Phylogenetic trees based on those strains were constructed, and evolutionary distances were calculated from the alignment of their complete sequences. The typical structure of the orf virus was observed in cell-culture suspensions inoculated with GDQY, and the full lengths of the four genes were amplified and sequenced. Phylogenetic analysis indicated that GDQY is homologous to FJ-DS and CQ/WZ in ORFV011 nucleotide sequence. ORFV059 may be more variable than ORFV011, based on the comparison between GDQY and other isolates. Genetic studies of ORFV106 and ORFV107 are reported for the first time in the present study.
Using genes as characters and a parsimony analysis to explore the phylogenetic position of turtles.
Lu, Bin; Yang, Weizhao; Dai, Qiang; Fu, Jinzhong
2013-01-01
The phylogenetic position of turtles within the vertebrate tree of life remains controversial. Conflicting conclusions from different studies are likely a consequence of systematic error in the tree construction process, rather than random error from small amounts of data. Using genomic data, we evaluate the phylogenetic position of turtles with both conventional concatenated data analysis and a "genes as characters" approach. Two datasets were constructed, one with seven species (human, opossum, zebra finch, chicken, green anole, Chinese pond turtle, and western clawed frog) and 4584 orthologous genes, and the second with four additional species (soft-shelled turtle, Nile crocodile, royal python, and tuatara) but only 1638 genes. Our concatenated data analysis strongly supported turtle as the sister-group to archosaurs (the archosaur hypothesis), similar to several recent genomic data based studies using similar methods. When using genes as characters and gene trees as character-state trees with equal weighting for each gene, however, our parsimony analysis suggested that turtles are possibly sister-group to diapsids, archosaurs, or lepidosaurs. None of these resolutions were strongly supported by bootstraps. Furthermore, our incongruence analysis clearly demonstrated that there is a large amount of inconsistency among genes and most of the conflict relates to the placement of turtles. We conclude that the uncertain placement of turtles is a reflection of the true state of nature. Concatenated data analysis of large and heterogeneous datasets likely suffers from systematic error and over-estimates of confidence as a consequence of a large number of characters. Using genes as characters offers an alternative for phylogenomic analysis. It has potential to reduce systematic error, such as data heterogeneity and long-branch attraction, and it can also avoid problems associated with computation time and model selection. Finally, treating genes as characters provides a
Pereira, Felipe B; Luque, José L
2017-02-01
Genetic and morphological variations in two component populations of Raphidascaris (Sprentascaris) lanfrediae collected in the intestine of Geophagus argyrosticus and G. proximus (Cichlidae) from the States of Pará and Amapá, Brazil, respectively, were explored for the first time. A phylogenetic study including two genes (18S and 28S of the rDNA) plus morphological and life history traits of "anisakid-related" nematodes (Anisakidae, Raphidascarididae) was also performed in order to clarify taxonomic and systematic issues related to these taxa. Gene alignments were subjected to maximum likelihood (ML) and Bayesian inference (BI), and the combined genetic and morphological datasets were subjected to maximum parsimony (MP) analysis. Despite subtle differences in morphology (mainly in male caudal papillae) and morphometry between specimens of R. (S.) lanfrediae from the two different hosts and the type material of the species, no genetic variation was found among representatives of the newly collected material. This finding may represent an example of gene-environment interactions, similar to that recently observed for Raphidascaroides brasiliensis. Phylogenetic reconstructions indicated the paraphyly of Anisakidae, represented by two subfamilies, i.e., Anisakinae and Contracaecinae, and the monophyly of Raphidascarididae. Analysis of the combined datasets revealed that some morphological traits may represent apomorphic characters of Raphidascarididae and Anisakidae, whereas others are highly homoplastic and some must be interpreted with care to avoid errors. The results support the premise that taxonomists should consider Anisakidae and Raphidascarididae as separate families, and recognize only two subfamilies of Anisakidae, i.e., Anisakinae and Contracaecinae.
Fermi's paradox, extraterrestrial life and the future of humanity: a Bayesian analysis
NASA Astrophysics Data System (ADS)
Verendel, Vilhelm; Häggström, Olle
2017-01-01
The Great Filter interpretation of Fermi's great silence asserts that Npq is not a very large number, where N is the number of potentially life-supporting planets in the observable universe, p is the probability that a randomly chosen such planet develops intelligent life to the level of present-day human civilization, and q is the conditional probability that it then goes on to develop a technological supercivilization visible all over the observable universe. Evidence suggests that N is huge, which implies that pq is very small. Hanson (1998) and Bostrom (2008) have argued that the discovery of extraterrestrial life would point towards p not being small and therefore a very small q, which can be seen as bad news for humanity's prospects of colonizing the universe. Here we investigate whether a Bayesian analysis supports their argument, and the answer turns out to depend critically on the choice of prior distribution.
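The argument summarized above can be illustrated numerically. The following is only a sketch under stated assumptions, not the authors' analysis: it assumes independent log-uniform priors on p and q over a hypothetical grid, a hypothetical N = 1e22, and a Poisson-style likelihood exp(-N p q) for observing no visible supercivilization. Discovery of extraterrestrial life is crudely modelled as restricting p to values at or above a hypothetical threshold `p_min`.

```python
import math


def posterior_mean_log10_q(N, grid, p_min=None):
    """Posterior mean of log10(q) on a grid, given the 'great silence'.

    Assumptions (all illustrative): p and q have independent log-uniform
    priors on `grid`; the probability that none of N planets produces a
    visible supercivilization is approximated by exp(-N * p * q).
    If p_min is given, only grid points with p >= p_min are kept.
    """
    num = 0.0
    den = 0.0
    for lp in grid:
        if p_min is not None and lp < math.log10(p_min):
            continue
        for lq in grid:
            likelihood = math.exp(-N * 10.0 ** (lp + lq))
            num += lq * likelihood
            den += likelihood
    return num / den


N = 1e22                                   # hypothetical planet count
grid = [-30 + 0.5 * i for i in range(61)]  # log10 values from -30 to 0

before = posterior_mean_log10_q(N, grid)
after = posterior_mean_log10_q(N, grid, p_min=1e-3)
print(before, after)
```

Under this particular prior, conditioning on a non-small p pushes the posterior for q toward smaller values, which is the direction of the Hanson and Bostrom argument. As the abstract stresses, the conclusion depends critically on the prior, so a different prior (for example, one in which p and q are not independent) need not behave the same way.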
NASA Astrophysics Data System (ADS)
Takamizawa, Hisashi; Itoh, Hiroto; Nishiyama, Yutaka
2016-10-01
In order to understand neutron irradiation embrittlement in high fluence regions, statistical analysis using the Bayesian nonparametric (BNP) method was performed for the Japanese surveillance and material test reactor irradiation database. The BNP method is essentially expressed as an infinite summation of normal distributions, with input data being subdivided into clusters with identical statistical parameters, such as mean and standard deviation, for each cluster to estimate shifts in ductile-to-brittle transition temperature (DBTT). The clusters typically depend on chemical compositions, irradiation conditions, and the irradiation embrittlement. Specific variables contributing to the irradiation embrittlement include the content of Cu, Ni, P, Si, and Mn in the pressure vessel steels, neutron flux, neutron fluence, and irradiation temperatures. It was found that the measured shifts of DBTT correlated well with the calculated ones. Data associated with the same materials were subdivided into the same clusters even if neutron fluences were increased.
Manzo, Carlo; van Zanten, Thomas S.; Saha, Suvrajit; Torreno-Pina, Juan A.; Mayor, Satyajit; Garcia-Parajo, Maria F.
2014-01-01
The spatial organization of membrane receptors at the nanoscale has major implications in cellular function and signaling. The advent of super-resolution techniques has greatly contributed to our understanding of the cellular membrane. Yet, despite the increased resolution, unbiased quantification of highly dense features, such as molecular aggregates, remains challenging. Here we describe an algorithm based on Bayesian inference of the marker intensity distribution that improves the determination of molecular positions inside dense nanometer-scale molecular aggregates. We tested the performance of the method on synthetic images representing a broad range of experimental conditions, demonstrating its wide applicability. We further applied this approach to STED images of GPI-anchored and model transmembrane proteins expressed in mammalian cells. The analysis revealed subtle differences in the organization of these receptors, emphasizing the role of cortical actin in the compartmentalization of the cell membrane. PMID:24619088
Bayesian network meta-analysis for unordered categorical outcomes with incomplete data.
Schmid, Christopher H; Trikalinos, Thomas A; Olkin, Ingram
2014-06-01
We develop a Bayesian multinomial network meta-analysis model for unordered (nominal) categorical outcomes that allows for partially observed data in which exact event counts may not be known for each category. This model properly accounts for correlations of counts in mutually exclusive categories and enables proper comparison and ranking of treatment effects across multiple treatments and multiple outcome categories. We apply the model to analyze 17 trials, each of which compares two of three treatments (high and low dose statins and standard care/control) for three outcomes for which data are complete: cardiovascular death, non-cardiovascular death and no death. We also analyze the cardiovascular death category divided into the three subcategories (coronary heart disease, stroke and other cardiovascular diseases) that are not completely observed. The multinomial and network representations show that high dose statins are effective in reducing the risk of cardiovascular disease.
A Bayesian approach to probabilistic sensitivity analysis in structured benefit-risk assessment.
Waddingham, Ed; Mt-Isa, Shahrul; Nixon, Richard; Ashby, Deborah
2016-01-01
Quantitative decision models such as multiple criteria decision analysis (MCDA) can be used in benefit-risk assessment to formalize trade-offs between benefits and risks, providing transparency to the assessment process. There is however no well-established method for propagating uncertainty of treatment effects data through such models to provide a sense of the variability of the benefit-risk balance. Here, we present a Bayesian statistical method that directly models the outcomes observed in randomized placebo-controlled trials and uses this to infer indirect comparisons between competing active treatments. The resulting treatment effects estimates are suitable for use within the MCDA setting, and it is possible to derive the distribution of the overall benefit-risk balance through Markov Chain Monte Carlo simulation. The method is illustrated using a case study of natalizumab for relapsing-remitting multiple sclerosis.
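The propagation step described above, pushing posterior draws of treatment effects through an MCDA model to obtain a distribution of the benefit-risk balance, can be sketched as follows. This is a minimal illustration, not the authors' model: the criteria, weights, value-function ranges, and the Gaussian draws standing in for MCMC output are all hypothetical.

```python
import random

# Hypothetical criteria: weight, and the outcome levels mapped to value 0 and 1.
CRITERIA = {
    "relapse_rate": {"weight": 0.6, "worst": 2.0, "best": 0.0},
    "serious_event_risk": {"weight": 0.4, "worst": 0.2, "best": 0.0},
}


def linear_value(x, worst, best):
    """Linear partial value function, clipped to the interval [0, 1]."""
    v = (x - worst) / (best - worst)
    return min(1.0, max(0.0, v))


def benefit_risk_score(draw, criteria=CRITERIA):
    """Weighted sum of partial values for one posterior draw."""
    return sum(
        c["weight"] * linear_value(draw[name], c["worst"], c["best"])
        for name, c in criteria.items()
    )


def simulate_draws(means, sds, n, rng):
    """Stand-in for MCMC output: independent normal draws per criterion."""
    return [{k: rng.gauss(means[k], sds[k]) for k in means} for _ in range(n)]


rng = random.Random(0)
n = 5000
active = simulate_draws(
    {"relapse_rate": 0.3, "serious_event_risk": 0.05},
    {"relapse_rate": 0.05, "serious_event_risk": 0.02}, n, rng)
placebo = simulate_draws(
    {"relapse_rate": 0.8, "serious_event_risk": 0.03},
    {"relapse_rate": 0.08, "serious_event_risk": 0.02}, n, rng)

scores_active = [benefit_risk_score(d) for d in active]
scores_placebo = [benefit_risk_score(d) for d in placebo]
prob_active_better = sum(a > p for a, p in zip(scores_active, scores_placebo)) / n
print(prob_active_better)
```

Because each draw yields one overall score, the collection of scores approximates the distribution of the benefit-risk balance, and summaries such as the probability that one treatment scores higher follow directly.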
Chow, Sy-Miin; Tang, Niansheng; Yuan, Ying; Song, Xinyuan; Zhu, Hongtu
2011-02-01
Parameters in time series and other dynamic models often show complex range restrictions and their distributions may deviate substantially from multivariate normal or other standard parametric distributions. We use the truncated Dirichlet process (DP) as a non-parametric prior for such dynamic parameters in a novel nonlinear Bayesian dynamic factor analysis model. This is equivalent to specifying the prior distribution to be a mixture distribution composed of an unknown number of discrete point masses (or clusters). The stick-breaking prior and the blocked Gibbs sampler are used to enable efficient simulation of posterior samples. Using a series of empirical and simulation examples, we illustrate the flexibility of the proposed approach in approximating distributions of very diverse shapes.
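The truncated stick-breaking construction mentioned above can be sketched in a few lines. This is an illustrative sketch only, assuming a concentration parameter `alpha`, a truncation level `K`, and a standard normal base distribution, none of which are taken from the paper; it draws from the prior and does not implement the blocked Gibbs sampler.

```python
import random


def truncated_stick_breaking(alpha, K, rng):
    """Return K mixture weights from a stick-breaking prior truncated at K.

    Each break fraction is Beta(1, alpha); the final weight takes whatever
    stick length remains, so the weights sum to 1.
    """
    weights = []
    remaining = 1.0
    for _ in range(K - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


def sample_truncated_dp(alpha, K, n, rng, base_sampler):
    """Draw n parameter values from one realisation of a truncated DP prior."""
    weights = truncated_stick_breaking(alpha, K, rng)
    atoms = [base_sampler(rng) for _ in range(K)]  # point-mass locations
    draws = rng.choices(atoms, weights=weights, k=n)
    return weights, atoms, draws


rng = random.Random(1)
weights, atoms, draws = sample_truncated_dp(
    alpha=2.0, K=20, n=100, rng=rng,
    base_sampler=lambda r: r.gauss(0.0, 1.0),  # hypothetical base distribution
)
print(len(set(draws)), "distinct clusters among", len(draws), "draws")
```

The draws concentrate on a small number of atoms, which is the "mixture of an unknown number of discrete point masses" behaviour the abstract describes. Larger values of `alpha` spread the weight over more atoms.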
Phylogenetic analysis of the N8 neuraminidase gene of influenza A viruses.
Saito, T; Kawaoka, Y; Webster, R G
1993-04-01
Phylogenetic analysis of the N8 neuraminidase (NA) genes from 18 influenza A viruses, representing equine and avian hosts in different geographic locations, revealed three major lineages: (i) currently circulating equine 2 viruses; (ii) avian viruses isolated in the Eurasian region, including A/Equine/Jilin/1/89, a recent avian-like N8 isolate found in horses in China; and (iii) avian viruses isolated in North America. Comparison of mutation rates indicated that avian N8 genes have evolved more slowly than their equine counterparts. That is, in both avian lineages,