Maximum-Likelihood Detection Of Noncoherent CPM
NASA Technical Reports Server (NTRS)
Divsalar, Dariush; Simon, Marvin K.
1993-01-01
Simplified detectors proposed for use in maximum-likelihood-sequence detection of symbols in alphabet of size M transmitted by uncoded, full-response continuous phase modulation over radio channel with additive white Gaussian noise. Structures of receivers derived from particular interpretation of maximum-likelihood metrics. Receivers include front ends, structures of which depend only on M, analogous to those in receivers of coherent CPM. Parts of receivers following front ends have structures, complexity of which would depend on N.
High-Dimensional Exploratory Item Factor Analysis by a Metropolis-Hastings Robbins-Monro Algorithm
ERIC Educational Resources Information Center
Cai, Li
2010-01-01
A Metropolis-Hastings Robbins-Monro (MH-RM) algorithm for high-dimensional maximum marginal likelihood exploratory item factor analysis is proposed. The sequence of estimates from the MH-RM algorithm converges with probability one to the maximum likelihood solution. Details on the computer implementation of this algorithm are provided. The…
Stochastic control system parameter identifiability
NASA Technical Reports Server (NTRS)
Lee, C. H.; Herget, C. J.
1975-01-01
The parameter identification problem of general discrete time, nonlinear, multiple input/multiple output dynamic systems with Gaussian white distributed measurement errors is considered. The system parameterization was assumed to be known. Concepts of local parameter identifiability and local constrained maximum likelihood parameter identifiability were established. A set of sufficient conditions for the existence of a region of parameter identifiability was derived. A computation procedure employing interval arithmetic was provided for finding the regions of parameter identifiability. If the vector of the true parameters is locally constrained maximum likelihood (CML) identifiable, then with probability one, the vector of true parameters is a unique maximal point of the maximum likelihood function in the region of parameter identifiability and the constrained maximum likelihood estimation sequence will converge to the vector of true parameters.
Khairuzzaman, Md; Zhang, Chao; Igarashi, Koji; Katoh, Kazuhiro; Kikuchi, Kazuro
2010-03-01
We describe a successful introduction of maximum-likelihood-sequence estimation (MLSE) into digital coherent receivers together with finite-impulse response (FIR) filters in order to equalize both linear and nonlinear fiber impairments. The MLSE equalizer based on the Viterbi algorithm is implemented in the offline digital signal processing (DSP) core. We transmit 20-Gbit/s quadrature phase-shift keying (QPSK) signals through a 200-km-long standard single-mode fiber. The bit-error rate performance shows that the MLSE equalizer outperforms the conventional adaptive FIR filter, especially when nonlinear impairments are predominant.
USDA-ARS's Scientific Manuscript database
The phylogeny of Amaryllidaceae tribe Hippeastreae was inferred using chloroplast (3’ycf1, ndhF, trnL-F) and nuclear (ITS rDNA) sequence data under maximum parsimony and maximum likelihood frameworks. Network analyses were applied to resolve conflicting signals among data sets and putative scenarios...
Cao, Y; Adachi, J; Yano, T; Hasegawa, M
1994-07-01
Graur et al.'s (1991) hypothesis that the guinea pig-like rodents have an evolutionary origin within mammals that is separate from that of other rodents (the rodent-polyphyly hypothesis) was reexamined by the maximum-likelihood method for protein phylogeny, as well as by the maximum-parsimony and neighbor-joining methods. The overall evidence does not support Graur et al.'s hypothesis, which radically contradicts the traditional view of rodent monophyly. This work demonstrates that we must be careful in choosing a proper method for phylogenetic inference and that an argument based on a small data set (with respect to the length of the sequence and especially the number of species) may be unstable.
GASP: Gapped Ancestral Sequence Prediction for proteins
Edwards, Richard J; Shields, Denis C
2004-01-01
Background The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199
Inferring Phylogenetic Networks Using PhyloNet.
Wen, Dingqiao; Yu, Yun; Zhu, Jiafan; Nakhleh, Luay
2018-07-01
PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyses and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.
Maximum-likelihood soft-decision decoding of block codes using the A* algorithm
NASA Technical Reports Server (NTRS)
Ekroot, L.; Dolinar, S.
1994-01-01
The A* algorithm finds the path in a finite depth binary tree that optimizes a function. Here, it is applied to maximum-likelihood soft-decision decoding of block codes where the function optimized over the codewords is the likelihood function of the received sequence given each codeword. The algorithm considers codewords one bit at a time, making use of the most reliable received symbols first and pursuing only the partially expanded codewords that might be maximally likely. A version of the A* algorithm for maximum-likelihood decoding of block codes has been implemented for block codes up to 64 bits in length. The efficiency of this algorithm makes simulations of codes up to length 64 feasible. This article details the implementation currently in use, compares the decoding complexity with that of exhaustive search and Viterbi decoding algorithms, and presents performance curves obtained with this implementation of the A* algorithm for several codes.
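The exhaustive-search baseline that the abstract compares against can be sketched as follows. This is a minimal illustration and not the A* decoder described above: the (7,4) Hamming generator matrix, the BPSK mapping (0 -> +1, 1 -> -1), and the names `encode` and `ml_decode` are all assumptions made for the example. Under additive white Gaussian noise, maximizing the likelihood over codewords is equivalent to maximizing the correlation between the received values and the modulated codeword.

```python
from itertools import product

# Generator matrix of a (7,4) Hamming code in systematic form.
# Illustrative choice only; the abstract does not name a specific code.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def encode(message):
    """Multiply a 4-bit message by G over GF(2) and return the 7-bit codeword."""
    return tuple(
        sum(m * g for m, g in zip(message, column)) % 2
        for column in zip(*G)
    )


# Exhaustive search needs every codeword: 2**4 = 16 of them for this code.
CODEWORDS = [encode(m) for m in product((0, 1), repeat=4)]


def ml_decode(received):
    """Return the codeword with the largest correlation to the received values.

    Assumes BPSK (bit 0 -> +1, bit 1 -> -1) over an AWGN channel, where the
    maximum-likelihood codeword is the one maximizing this correlation.
    """
    def correlation(codeword):
        return sum(r * (1 - 2 * bit) for r, bit in zip(received, codeword))

    return max(CODEWORDS, key=correlation)
```

The cost of this search grows as 2^k in the number of information bits k, which is why a search that expands only promising partial codewords becomes attractive for longer codes.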
Al-Atiyat, R M; Aljumaah, R S
2014-08-27
This study aimed to estimate evolutionary distances and to reconstruct phylogeny trees between different Awassi sheep populations. Thirty-two sheep individuals from three different geographical areas of Jordan and the Kingdom of Saudi Arabia (KSA) were randomly sampled. DNA was extracted from the tissue samples and sequenced using the T7 promoter universal primer. Different phylogenetic trees were reconstructed from 0.64-kb DNA sequences using the MEGA software with the best general time reversible distance model. Three methods of distance estimation were then used. The maximum composite likelihood test was considered for reconstructing maximum likelihood, neighbor-joining and UPGMA trees. The maximum likelihood tree indicated three major clusters separated by cytosine (C) and thymine (T). The greatest distance was shown between the South sheep and North sheep. On the other hand, the KSA sheep as an outgroup showed shorter evolutionary distance to the North sheep population than to the others. The neighbor-joining and UPGMA trees showed quite reliable clusters of evolutionary differentiation of Jordan sheep populations from the Saudi population. The overall results support geographical information and ecological types of the sheep populations studied. Summing up, the resulting phylogeny trees may contribute to the limited information about the genetic relatedness and phylogeny of Awassi sheep in nearby Arab countries.
NASA Technical Reports Server (NTRS)
Lin, Shu; Fossorier, Marc
1998-01-01
The Viterbi algorithm is indeed a very simple and efficient method of implementing the maximum likelihood decoding. However, if we take advantage of the structural properties in a trellis section, other efficient trellis-based decoding algorithms can be devised. Recently, an efficient trellis-based recursive maximum likelihood decoding (RMLD) algorithm for linear block codes has been proposed. This algorithm is more efficient than the conventional Viterbi algorithm in both computation and hardware requirements. Most importantly, the implementation of this algorithm does not require the construction of the entire code trellis, only some special one-section trellises of relatively small state and branch complexities are needed for constructing path (or branch) metric tables recursively. At the end, there is only one table which contains only the most likely code-word and its metric for a given received sequence r = (r_1, r_2, ..., r_n). This algorithm basically uses the divide and conquer strategy. Furthermore, it allows parallel/pipeline processing of received sequences to speed up decoding.
The phylogenetic relationship of Alexandrium monilatum to other Alexandrium spp. was explored using 18S rDNA sequences. Maximum likelihood phylogenetic analysis of the combined rDNA sequences established that A. monilatum paired with Alexandrium taylori and that the pair was the ...
Phylogenetically marking the limits of the genus Fusarium for post-Article 59 usage
USDA-ARS's Scientific Manuscript database
Fusarium (Hypocreales, Nectriaceae) is one of the most important and systematically challenging groups of mycotoxigenic, plant pathogenic, and human pathogenic fungi. We conducted maximum likelihood (ML), maximum parsimony (MP) and Bayesian (B) analyses on partial nucleotide sequences of genes encod...
Accuracy of maximum likelihood estimates of a two-state model in single-molecule FRET
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gopich, Irina V.
2015-01-21
Photon sequences from single-molecule Förster resonance energy transfer (FRET) experiments can be analyzed using a maximum likelihood method. Parameters of the underlying kinetic model (FRET efficiencies of the states and transition rates between conformational states) are obtained by maximizing the appropriate likelihood function. In addition, the errors (uncertainties) of the extracted parameters can be obtained from the curvature of the likelihood function at the maximum. We study the standard deviations of the parameters of a two-state model obtained from photon sequences with recorded colors and arrival times. The standard deviations can be obtained analytically in a special case when the FRET efficiencies of the states are 0 and 1 and in the limiting cases of fast and slow conformational dynamics. These results are compared with the results of numerical simulations. The accuracy and, therefore, the ability to predict model parameters depend on how fast the transition rates are compared to the photon count rate. In the limit of slow transitions, the key parameters that determine the accuracy are the number of transitions between the states and the number of independent photon sequences. In the fast transition limit, the accuracy is determined by the small fraction of photons that are correlated with their neighbors. The relative standard deviation of the relaxation rate has a “chevron” shape as a function of the transition rate in the log-log scale. The location of the minimum of this function dramatically depends on how well the FRET efficiencies of the states are separated.
Accuracy of maximum likelihood estimates of a two-state model in single-molecule FRET
Gopich, Irina V.
2015-01-01
Photon sequences from single-molecule Förster resonance energy transfer (FRET) experiments can be analyzed using a maximum likelihood method. Parameters of the underlying kinetic model (FRET efficiencies of the states and transition rates between conformational states) are obtained by maximizing the appropriate likelihood function. In addition, the errors (uncertainties) of the extracted parameters can be obtained from the curvature of the likelihood function at the maximum. We study the standard deviations of the parameters of a two-state model obtained from photon sequences with recorded colors and arrival times. The standard deviations can be obtained analytically in a special case when the FRET efficiencies of the states are 0 and 1 and in the limiting cases of fast and slow conformational dynamics. These results are compared with the results of numerical simulations. The accuracy and, therefore, the ability to predict model parameters depend on how fast the transition rates are compared to the photon count rate. In the limit of slow transitions, the key parameters that determine the accuracy are the number of transitions between the states and the number of independent photon sequences. In the fast transition limit, the accuracy is determined by the small fraction of photons that are correlated with their neighbors. The relative standard deviation of the relaxation rate has a “chevron” shape as a function of the transition rate in the log-log scale. The location of the minimum of this function dramatically depends on how well the FRET efficiencies of the states are separated. PMID:25612692
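The idea of reading parameter uncertainty off the curvature of the likelihood at its maximum can be illustrated with a much simpler case than the two-state model studied above. The sketch below assumes a single state with a fixed FRET efficiency, so each detected photon is an acceptor photon with probability E and photon colors are binomially distributed; the function names and the finite-difference step are assumptions of this example, not part of the paper.

```python
import math


def log_likelihood(efficiency, acceptor_photons, total_photons):
    """Binomial log-likelihood of the photon colors (constant term omitted)."""
    donor_photons = total_photons - acceptor_photons
    return (acceptor_photons * math.log(efficiency)
            + donor_photons * math.log(1.0 - efficiency))


def standard_error_from_curvature(acceptor_photons, total_photons, step=1e-4):
    """Return the MLE of E and its standard deviation from the curvature.

    The curvature is the negative second derivative of the log-likelihood at
    the maximum, estimated here with a central finite difference. Requires
    0 < acceptor_photons < total_photons so the MLE is interior.
    """
    mle = acceptor_photons / total_photons

    def f(e):
        return log_likelihood(e, acceptor_photons, total_photons)

    curvature = -(f(mle + step) - 2.0 * f(mle) + f(mle - step)) / step ** 2
    return mle, 1.0 / math.sqrt(curvature)
```

For this toy model the curvature is also available analytically, N / (E (1 - E)), which gives the familiar standard deviation sqrt(E (1 - E) / N) and provides a check on the numerical estimate.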
Exploiting Non-sequence Data in Dynamic Model Learning
2013-10-01
For our experiments here and in Section 3.5, we implement the proposed algorithms in MATLAB and use the maximum directed spanning tree solver...embarrassingly parallelizable, whereas PM’s maximum directed spanning tree procedure is harder to parallelize. In this experiment, our MATLAB ...some estimation problems, this approach is able to give unique and consistent estimates while the maximum-likelihood method gets entangled in
A 3D approximate maximum likelihood localization solver
DOE Office of Scientific and Technical Information (OSTI.GOV)
2016-09-23
A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with acoustic transmitters and vocalizing marine mammals to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives and support Marine Renewable Energy. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.
Maximum likelihood estimates, from censored data, for mixed-Weibull distributions
NASA Astrophysics Data System (ADS)
Jiang, Siyuan; Kececioglu, Dimitri
1992-06-01
A new algorithm for estimating the parameters of mixed-Weibull distributions from censored data is presented. The algorithm follows the principle of maximum likelihood estimate (MLE) through the expectation and maximization (EM) algorithm, and it is derived for both postmortem and nonpostmortem time-to-failure data. It is concluded that the concept of the EM algorithm is easy to understand and apply (only elementary statistics and calculus are required). The log-likelihood function cannot decrease after an EM sequence; this important feature was observed in all of the numerical calculations. The MLEs of the nonpostmortem data were obtained successfully for mixed-Weibull distributions with up to 14 parameters in a 5-subpopulation, mixed-Weibull distribution. Numerical examples indicate that some of the log-likelihood functions of the mixed-Weibull distributions have multiple local maxima; therefore, the algorithm should start at several initial guesses of the parameter set.
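The monotonicity property mentioned above (the log-likelihood cannot decrease from one EM iteration to the next) can be checked on a simplified case. The sketch below is not the authors' algorithm: it assumes uncensored data and fixes every Weibull shape parameter at 1, so each subpopulation is exponential and the M-step has a closed form. The function name and arguments are hypothetical.

```python
import math
import random


def em_exponential_mixture(data, weights, rates, iterations=50):
    """Run EM on a mixture of exponentials (Weibull components with shape 1).

    Returns the final mixing weights, the final rates, and the log-likelihood
    evaluated at the parameters held at the start of each iteration.
    """
    weights, rates = list(weights), list(rates)
    history = []
    for _ in range(iterations):
        # E-step: posterior probability that each point came from each component.
        responsibilities = []
        loglik = 0.0
        for x in data:
            dens = [w * r * math.exp(-r * x) for w, r in zip(weights, rates)]
            total = sum(dens)
            loglik += math.log(total)
            responsibilities.append([d / total for d in dens])
        history.append(loglik)
        # M-step: closed-form updates of the mixing weights and rates.
        for k in range(len(weights)):
            mass = sum(row[k] for row in responsibilities)
            weights[k] = mass / len(data)
            rates[k] = mass / sum(row[k] * x
                                  for row, x in zip(responsibilities, data))
    return weights, rates, history
```

Because the likelihood surface can have several local maxima, as the abstract notes, a run like this would normally be repeated from several initial guesses and the solution with the highest final log-likelihood kept.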
Ishikawa, Sohta A; Inagaki, Yuji; Hashimoto, Tetsuo
2012-01-01
In phylogenetic analyses of nucleotide sequences, 'homogeneous' substitution models, which assume the stationarity of base composition across a tree, are widely used, albeit individual sequences may bear distinctive base frequencies. In the worst-case scenario, a homogeneous model-based analysis can yield an artifactual union of two distantly related sequences that achieved similar base frequencies in parallel. Such potential difficulty can be countered by two approaches, 'RY-coding' and 'non-homogeneous' models. The former approach converts four bases into purine and pyrimidine to normalize base frequencies across a tree, while the heterogeneity in base frequency is explicitly incorporated in the latter approach. The two approaches have been applied to real-world sequence data; however, their basic properties have not been fully examined by pioneering simulation studies. Here, we assessed the performances of the maximum-likelihood analyses incorporating RY-coding and a non-homogeneous model (RY-coding and non-homogeneous analyses) on simulated data with parallel convergence to similar base composition. Both RY-coding and non-homogeneous analyses showed superior performances compared with homogeneous model-based analyses. Curiously, the performance of RY-coding analysis appeared to be significantly affected by a setting of the substitution process for sequence simulation relative to that of non-homogeneous analysis. The performance of a non-homogeneous analysis was also validated by analyzing a real-world sequence data set with significant base heterogeneity.
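RY-coding as described above reduces the four bases to two classes, purines (A, G -> R) and pyrimidines (C, T -> Y). A minimal sketch follows; the function name `ry_code`, the handling of U as a pyrimidine, and the choice to pass gaps and ambiguity codes through unchanged are assumptions of this example rather than details given in the abstract.

```python
# Purines (A, G) map to R; pyrimidines (C, T, and U for RNA) map to Y.
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"}


def ry_code(sequence):
    """Recode a nucleotide sequence into the two-state R/Y alphabet.

    Characters outside A, C, G, T, U (gaps, ambiguity codes) are left as-is.
    """
    return "".join(RY_MAP.get(base, base) for base in sequence.upper())
```

Two sequences with very different G+C content, such as "GGCC" and "AATT", recode to the same string "RRYY", which is how the recoding removes compositional differences within the purine and pyrimidine classes.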
Correcting for sequencing error in maximum likelihood phylogeny inference.
Kuhner, Mary K; McGill, James
2014-11-04
Accurate phylogenies are critical to taxonomy as well as studies of speciation processes and other evolutionary patterns. Accurate branch lengths in phylogenies are critical for dating and rate measurements. Such accuracy may be jeopardized by unacknowledged sequencing error. We use simulated data to test a correction for DNA sequencing error in maximum likelihood phylogeny inference. Over a wide range of data polymorphism and true error rate, we found that correcting for sequencing error improves recovery of the branch lengths, even if the assumed error rate is up to twice the true error rate. Low error rates have little effect on recovery of the topology. When error is high, correction improves topological inference; however, when error is extremely high, using an assumed error rate greater than the true error rate leads to poor recovery of both topology and branch lengths. The error correction approach tested here was proposed in 2004 but has not been widely used, perhaps because researchers do not want to commit to an estimate of the error rate. This study shows that correction with an approximate error rate is generally preferable to ignoring the issue. Copyright © 2014 Kuhner and McGill.
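One common way to build an assumed sequencing-error rate into a likelihood calculation is to soften the conditional likelihoods at the tips of the tree, so that an observed base no longer rules out the other three. The sketch below shows that idea under the assumption that errors are equally likely to produce any of the three other bases; it may not match the exact form of the correction tested in the study, and the function name is hypothetical.

```python
BASES = "ACGT"


def tip_likelihoods(observed_base, error_rate=0.0):
    """Return P(observed base | true base) for each possible true base.

    With error_rate = 0 this is the usual 0/1 tip vector. With a positive
    rate, the observed base gets 1 - error_rate and each other base gets
    error_rate / 3 (errors assumed uniform over the three alternatives).
    """
    if observed_base not in BASES:
        raise ValueError(f"unsupported base: {observed_base!r}")
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie between 0 and 1")
    return {
        base: (1.0 - error_rate) if base == observed_base else error_rate / 3.0
        for base in BASES
    }
```

These vectors would replace the 0/1 tip vectors in a standard pruning computation of the tree likelihood; the rest of the calculation is unchanged.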
Liu, Xiaoming; Fu, Yun-Xin; Maxwell, Taylor J.; Boerwinkle, Eric
2010-01-01
It is known that sequencing error can bias estimation of evolutionary or population genetic parameters. This problem is more prominent in deep resequencing studies because of their large sample size n, and a higher probability of error at each nucleotide site. We propose a new method based on the composite likelihood of the observed SNP configurations to infer population mutation rate θ = 4Neμ, population exponential growth rate R, and error rate ɛ, simultaneously. Using simulation, we show the combined effects of the parameters, θ, n, ɛ, and R on the accuracy of parameter estimation. We compared our maximum composite likelihood estimator (MCLE) of θ with other θ estimators that take into account the error. The results show the MCLE performs well when the sample size is large or the error rate is high. Using parametric bootstrap, composite likelihood can also be used as a statistic for testing the model goodness-of-fit of the observed DNA sequences. The MCLE method is applied to sequence data on the ANGPTL4 gene in 1832 African American and 1045 European American individuals. PMID:19952140
Basal jawed vertebrate phylogeny inferred from multiple nuclear DNA-coded genes
Kikugawa, Kanae; Katoh, Kazutaka; Kuraku, Shigehiro; Sakurai, Hiroshi; Ishida, Osamu; Iwabe, Naoyuki; Miyata, Takashi
2004-01-01
Background Phylogenetic analyses of jawed vertebrates based on mitochondrial sequences often result in confusing inferences which are obviously inconsistent with generally accepted trees. In particular, in a hypothesis by Rasmussen and Arnason based on mitochondrial trees, cartilaginous fishes have a terminal position in a paraphyletic cluster of bony fishes. No previous analysis based on nuclear DNA-coded genes could significantly reject the mitochondrial trees of jawed vertebrates. Results We have cloned and sequenced seven nuclear DNA-coded genes from 13 vertebrate species. These sequences, together with sequences available from databases including 13 jawed vertebrates from eight major groups (cartilaginous fishes, bichir, chondrosteans, gar, bowfin, teleost fishes, lungfishes and tetrapods) and an outgroup (a cyclostome and a lancelet), have been subjected to phylogenetic analyses based on the maximum likelihood method. Conclusion Cartilaginous fishes have been inferred to be basal to other jawed vertebrates, which is consistent with the generally accepted view. The minimum log-likelihood difference between the maximum likelihood tree and trees not supporting the basal position of cartilaginous fishes is 18.3 ± 13.1. The hypothesis by Rasmussen and Arnason has been significantly rejected with the minimum log-likelihood difference of 123 ± 23.3. Our tree has also shown that living holosteans, comprising bowfin and gar, form a monophyletic group which is the sister group to teleost fishes. This is consistent with a formerly prevalent view of vertebrate classification, although inconsistent with both of the current morphology-based and mitochondrial sequence-based trees. Furthermore, the bichir has been shown to be the basal ray-finned fish. Tetrapods and lungfish have formed a monophyletic cluster in the tree inferred from the concatenated alignment, being consistent with the currently prevalent view. 
It also remains possible that tetrapods are more closely related to ray-finned fishes than to lungfishes. PMID:15070407
Tamura, Koichiro; Peterson, Daniel; Peterson, Nicholas; Stecher, Glen; Nei, Masatoshi; Kumar, Sudhir
2011-01-01
Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is a user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven to make it easier for the use of both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net. PMID:21546353
TOWARD A MOLECULAR PHYLOGENY FOR PEROMYSCUS: EVIDENCE FROM MITOCHONDRIAL CYTOCHROME-b SEQUENCES
Bradley, Robert D.; Durish, Nevin D.; Rogers, Duke S.; Miller, Jacqueline R.; Engstrom, Mark D.; Kilpatrick, C. William
2009-01-01
One hundred DNA sequences from the mitochondrial cytochrome-b gene of 44 species of deer mice (Peromyscus (sensu stricto)), 1 of Habromys, 1 of Isthmomys, 2 of Megadontomys, and the monotypic genera Neotomodon, Osgoodomys, and Podomys were used to develop a molecular phylogeny for Peromyscus. Phylogenetic analyses (maximum parsimony, maximum likelihood, and Bayesian inference) were conducted to evaluate alternative hypotheses concerning taxonomic arrangements (sensu stricto versus sensu lato) of the genus. In all analyses, monophyletic clades were obtained that corresponded to species groups proposed by previous authors; however, relationships among species groups generally were poorly resolved. The concept of the genus Peromyscus based on molecular data differed significantly from the most current taxonomic arrangement. Maximum-likelihood and Bayesian trees depicted strong support for a clade placing Habromys, Megadontomys, Neotomodon, Osgoodomys, and Podomys within Peromyscus. If Habromys, Megadontomys, Neotomodon, Osgoodomys, and Podomys are regarded as genera, then several species groups within Peromyscus (sensu stricto) should be elevated to generic rank. Isthmomys was associated with the genus Reithrodontomys; in turn this clade was sister to Baiomys, indicating a distant relationship of Isthmomys to Peromyscus. A formal taxonomic revision awaits synthesis of additional sequence data from nuclear markers together with inclusion of available allozymic and karyotypic data. PMID:19924266
NASA Astrophysics Data System (ADS)
Xu, Jiajie; Jiang, Bo; Chai, Sanming; He, Yuan; Zhu, Jianyi; Shen, Zonggen; Shen, Songdong
2016-09-01
Filamentous Bangia, which are distributed extensively throughout the world, have simple and similar morphological characteristics. Scientists can classify these organisms using molecular markers in combination with morphology. We successfully sequenced the complete nuclear ribosomal DNA, approximately 13 kb in length, from a marine Bangia population. We further analyzed the small subunit ribosomal DNA gene (nrSSU) and the internal transcribed spacer (ITS) sequence regions along with nine other marine and two freshwater Bangia samples from China. Pairwise distances of the nrSSU and 5.8S ribosomal DNA gene sequences show the marine samples grouping together with low divergences (0-0.003; 0-0.006, respectively) from each other, but high divergences (0.123-0.126; 0.198, respectively) from freshwater samples. An exception is the marine sample collected from Weihai, which shows high divergence from both other marine samples (0.063-0.065; 0.129, respectively) and the freshwater samples (0.097; 0.120, respectively). A maximum likelihood phylogenetic tree based on a combined SSU-ITS dataset shows the samples divided into three clades: two marine clades containing Bangia spp. from North America, Europe, Asia, and Australia, and one freshwater clade containing Bangia atropurpurea from North America and China.
2010-01-01
Background Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service. PMID:21034504
NASA Astrophysics Data System (ADS)
Kojima, Yohei; Takeda, Kazuaki; Adachi, Fumiyuki
Frequency-domain equalization (FDE) based on the minimum mean square error (MMSE) criterion can provide better downlink bit error rate (BER) performance of direct sequence code division multiple access (DS-CDMA) than the conventional rake combining in a frequency-selective fading channel. FDE requires accurate channel estimation. In this paper, we propose a new 2-step maximum likelihood channel estimation (MLCE) for DS-CDMA with FDE in a very slow frequency-selective fading environment. The 1st step uses the conventional pilot-assisted MMSE-CE and the 2nd step carries out the MLCE using decision feedback from the 1st step. The BER performance improvement achieved by 2-step MLCE over pilot assisted MMSE-CE is confirmed by computer simulation.
Estimation of submarine mass failure probability from a sequence of deposits with age dates
Geist, Eric L.; Chaytor, Jason D.; Parsons, Thomas E.; ten Brink, Uri S.
2013-01-01
The empirical probability of submarine mass failure is quantified from a sequence of dated mass-transport deposits. Several different techniques are described to estimate the parameters for a suite of candidate probability models. The techniques, previously developed for analyzing paleoseismic data, include maximum likelihood and Type II (Bayesian) maximum likelihood methods derived from renewal process theory and Monte Carlo methods. The estimated mean return time from these methods, unlike estimates from a simple arithmetic mean of the center age dates and standard likelihood methods, includes the effects of age-dating uncertainty and of open time intervals before the first and after the last event. The likelihood techniques are evaluated using Akaike’s Information Criterion (AIC) and Akaike’s Bayesian Information Criterion (ABIC) to select the optimal model. The techniques are applied to mass transport deposits recorded in two Integrated Ocean Drilling Program (IODP) drill sites located in the Ursa Basin, northern Gulf of Mexico. Dates of the deposits were constrained by regional bio- and magnetostratigraphy from a previous study. Results of the analysis indicate that submarine mass failures in this location occur primarily according to a Poisson process in which failures are independent and return times follow an exponential distribution. However, some of the model results suggest that submarine mass failures may occur quasiperiodically at one of the sites (U1324). The suite of techniques described in this study provides quantitative probability estimates of submarine mass failure occurrence, for any number of deposits and age uncertainty distributions.
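As a rough illustration of the Poisson-process case mentioned above, the sketch below fits an exponential return-time model by maximum likelihood and computes its AIC. The inter-event times are made-up values, not data from the study. The sketch also ignores age-dating uncertainty and open intervals, both of which the methods in the abstract are said to include.

```python
import math

def fit_exponential(intervals):
    """Maximum likelihood fit of an exponential return-time model.

    The MLE of the rate is n / sum(t); the mean return time is its inverse.
    Returns (rate, maximized log-likelihood).
    """
    n = len(intervals)
    total = sum(intervals)
    rate = n / total
    log_lik = n * math.log(rate) - rate * total
    return rate, log_lik

def aic(log_lik, n_params):
    """Akaike's Information Criterion: lower values indicate a preferred model."""
    return 2 * n_params - 2 * log_lik

# Hypothetical times between successive deposits (arbitrary units).
intervals = [2.0, 5.0, 1.0, 4.0]
rate, log_lik = fit_exponential(intervals)
mean_return_time = 1.0 / rate
model_aic = aic(log_lik, n_params=1)
```

Comparing `model_aic` with the AIC of a competing renewal model fitted to the same intervals would mirror the model-selection step, again only under these simplifying assumptions.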
Krishnan, Neeraja M; Seligmann, Hervé; Stewart, Caro-Beth; De Koning, A P Jason; Pollock, David D
2004-10-01
Reconstruction of ancestral DNA and amino acid sequences is an important means of inferring information about past evolutionary events. Such reconstructions suggest changes in molecular function and evolutionary processes over the course of evolution and are used to infer adaptation and convergence. Maximum likelihood (ML) is generally thought to provide relatively accurate reconstructed sequences compared to parsimony, but both methods lead to the inference of multiple directional changes in nucleotide frequencies in primate mitochondrial DNA (mtDNA). To better understand this surprising result, as well as to better understand how parsimony and ML differ, we constructed a series of computationally simple "conditional pathway" methods that differed in the number of substitutions allowed per site along each branch, and we also evaluated the entire Bayesian posterior frequency distribution of reconstructed ancestral states. We analyzed primate mitochondrial cytochrome b (Cyt-b) and cytochrome oxidase subunit I (COI) genes and found that ML reconstructs ancestral frequencies that are often more different from tip sequences than are parsimony reconstructions. In contrast, frequency reconstructions based on the posterior ensemble more closely resemble extant nucleotide frequencies. Simulations indicate that these differences in ancestral sequence inference are probably due to deterministic bias caused by high uncertainty in the optimization-based ancestral reconstruction methods (parsimony, ML, Bayesian maximum a posteriori). In contrast, ancestral nucleotide frequencies based on an average of the Bayesian set of credible ancestral sequences are much less biased. 
The methods involving simpler conditional pathway calculations have slightly reduced likelihood values compared to full likelihood calculations, but they can provide fairly unbiased nucleotide reconstructions and may be useful in more complex phylogenetic analyses than considered here due to their speed and flexibility. To determine whether biased reconstructions using optimization methods might affect inferences of functional properties, ancestral primate mitochondrial tRNA sequences were inferred and helix-forming propensities for conserved pairs were evaluated in silico. For ambiguously reconstructed nucleotides at sites with high base composition variability, ancestral tRNA sequences from Bayesian analyses were more compatible with canonical base pairing than were those inferred by other methods. Thus, nucleotide bias in reconstructed sequences apparently can lead to serious bias and inaccuracies in functional predictions.
Zhi-Bin Wen; Ming-Li Zhang; Ge-Lin Zhu; Stewart C. Sanderson
2010-01-01
To reconstruct phylogeny and verify the monophyly of major subgroups, a total of 52 species representing almost all species of Salsoleae s.l. in China were sampled, with analysis based on three molecular markers (nrDNA ITS, cpDNA psbB-psbH and rbcL), using maximum parsimony, maximum likelihood, and Bayesian inference methods. Our molecular evidence provides strong...
Approximated mutual information training for speech recognition using myoelectric signals.
Guo, Hua J; Chan, A D C
2006-01-01
A new training algorithm called the approximated maximum mutual information (AMMI) is proposed to improve the accuracy of myoelectric speech recognition using hidden Markov models (HMMs). Previous studies have demonstrated that automatic speech recognition can be performed using myoelectric signals from articulatory muscles of the face. Classification of facial myoelectric signals can be performed using HMMs that are trained using the maximum likelihood (ML) algorithm; however, this algorithm maximizes the likelihood of the observations in the training sequence, which is not directly associated with optimal classification accuracy. The AMMI training algorithm attempts to maximize the mutual information, thereby training the HMMs to optimize their parameters for discrimination. Our results show that AMMI training consistently reduces the error rates compared to those of ML training, increasing the accuracy by approximately 3% on average.
Fast and accurate estimation of the covariance between pairwise maximum likelihood distances.
Gil, Manuel
2014-01-01
Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take this covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error.
Fast and accurate estimation of the covariance between pairwise maximum likelihood distances
2014-01-01
Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take this covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error. PMID:25279263
NASA Astrophysics Data System (ADS)
Humpula, James F.; Ostrom, Peggy H.; Gandhi, Hasand; Strahler, John R.; Walker, Angela K.; Stafford, Thomas W.; Smith, James J.; Voorhies, Michael R.; George Corner, R.; Andrews, Phillip C.
2007-12-01
Ancient DNA sequences offer an extraordinary opportunity to unravel the evolutionary history of ancient organisms. Protein sequences offer another reservoir of genetic information that has recently become tractable through the application of mass spectrometric techniques. The extent to which ancient protein sequences resolve phylogenetic relationships, however, has not been explored. We determined the osteocalcin amino acid sequence from the bone of an extinct Camelid (21 ka, Camelops hesternus) excavated from Isleta Cave, New Mexico and three bones of extant camelids: bactrian camel (Camelus bactrianus), dromedary camel (Camelus dromedarius) and guanaco (Llama guanacoe) for a diagenetic and phylogenetic assessment. There was no difference in sequence among the four taxa. Structural attributes observed in both modern and ancient osteocalcin include a post-translation modification, Hyp 9, deamidation of Gln 35 and Gln 39, and oxidation of Met 36. Carbamylation of the N-terminus in ancient osteocalcin may result in blockage and explain previous difficulties in sequencing ancient proteins via Edman degradation. A phylogenetic analysis using osteocalcin sequences of 25 vertebrate taxa was conducted to explore osteocalcin protein evolution and the utility of osteocalcin sequences for delineating phylogenetic relationships. The maximum likelihood tree closely reflected generally recognized taxonomic relationships. For example, maximum likelihood analysis recovered rodents, birds and, within hominins, the Homo-Pan-Gorilla trichotomy. Within Artiodactyla, character state analysis showed that a substitution of Pro 4 for His 4 defines the Capra-Ovis clade within Artiodactyla. Homoplasy in our analysis indicated that osteocalcin evolution is not a perfect indicator of species evolution. Limited sequence availability prevented assigning functional significance to sequence changes.
Our preliminary analysis of osteocalcin evolution represents an initial step towards a complete character analysis aimed at determining the evolutionary history of this functionally significant protein. We emphasize that ancient protein sequencing and phylogenetic analyses using amino acid sequences must pay close attention to post-translational modifications, amino acid substitutions due to diagenetic alteration and the impacts of isobaric amino acids on mass shifts and sequence alignments.
Varela, Eduardo S; Lima, João P M S; Galdino, Alexsandro S; Pinto, Luciano da S; Bezerra, Walderly M; Nunes, Edson P; Alves, Maria A O; Grangeiro, Thalles B
2004-01-01
The complete sequences of nuclear ribosomal DNA (nrDNA) internal transcribed spacer regions (ITS/5.8S) were determined for species belonging to six genera from the subtribe Diocleinae as well as for the anomalous genera Calopogonium and Pachyrhizus. Phylogenetic trees constructed by distance matrix, maximum parsimony and maximum likelihood methods showed that Calopogonium and Pachyrhizus were outside the clade Diocleinae (Canavalia, Camptosema, Cratylia, Dioclea, Cymbosema, and Galactia). This finding supports previous morphological, phytochemical, and molecular evidence that Calopogonium and Pachyrhizus do not belong to the subtribe Diocleinae. Within the true Diocleinae clade, the clustering of genera and species were congruent with morphology-based classifications, suggesting that ITS/5.8S sequences can provide enough informative sites to allow resolution below the genus level. This is the first evidence of the phylogeny of subtribe Diocleinae based on nuclear DNA sequences.
Maximum likelihood sequence estimation for optical complex direct modulation.
Che, Di; Yuan, Feng; Shieh, William
2017-04-17
Semiconductor lasers are versatile optical transmitters in nature. Through the direct modulation (DM), the intensity modulation is realized by the linear mapping between the injection current and the light power, while various angle modulations are enabled by the frequency chirp. Limited by the direct detection, DM lasers used to be exploited only as 1-D (intensity or angle) transmitters by suppressing or simply ignoring the other modulation. Nevertheless, through the digital coherent detection, simultaneous intensity and angle modulations (namely, 2-D complex DM, CDM) can be realized by a single laser diode. The crucial technique of CDM is the joint demodulation of intensity and differential phase with the maximum likelihood sequence estimation (MLSE), supported by a closed-form discrete signal approximation of frequency chirp to characterize the MLSE transition probability. This paper proposes a statistical method for the transition probability to significantly enhance the accuracy of the chirp model. Using the statistical estimation, we demonstrate the first single-channel 100-Gb/s PAM-4 transmission over 1600-km fiber with only 10G-class DM lasers.
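To make the MLSE step more concrete, here is a minimal Viterbi sketch for a generic two-tap channel with binary symbols and white Gaussian noise, where the maximum likelihood sequence is the one with the smallest cumulative squared error. This is an illustrative stand-in only: the channel taps, symbol alphabet, and function names are assumptions, and the chirp-based transition probabilities proposed in the paper are not modeled.

```python
def channel(symbols_sent, taps=(1.0, 0.5), start=-1.0):
    """Noiseless two-tap channel: r[k] = h0*s[k] + h1*s[k-1], with s[-1] = start."""
    h0, h1 = taps
    out, prev = [], start
    for s in symbols_sent:
        out.append(h0 * s + h1 * prev)
        prev = s
    return out

def mlse_viterbi(received, taps=(1.0, 0.5), symbols=(-1.0, 1.0)):
    """Most likely symbol sequence under white Gaussian noise.

    The trellis state is the previous symbol; the branch metric is the
    squared error between the received sample and the channel output.
    Assumes the symbol before the first one is symbols[0].
    """
    h0, h1 = taps
    cost = {s: (0.0 if s == symbols[0] else float("inf")) for s in symbols}
    paths = {s: [] for s in symbols}
    for r in received:
        new_cost, new_paths = {}, {}
        for cur in symbols:
            # Keep only the best predecessor for each current symbol.
            best_prev = min(
                symbols, key=lambda p: cost[p] + (r - h0 * cur - h1 * p) ** 2
            )
            new_cost[cur] = cost[best_prev] + (r - h0 * cur - h1 * best_prev) ** 2
            new_paths[cur] = paths[best_prev] + [cur]
        cost, paths = new_cost, new_paths
    best_end = min(symbols, key=lambda s: cost[s])
    return paths[best_end]

sent = [1.0, -1.0, -1.0, 1.0, 1.0]
# Add a small fixed perturbation in place of random noise.
received = [r + 0.1 for r in channel(sent)]
decoded = mlse_viterbi(received)
```

The survivor-path bookkeeping keeps the search linear in sequence length instead of exponential, which is the general reason Viterbi-style recursions are used for MLSE.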
Alam, M S; Bognar, J G; Cain, S; Yasuda, B J
1998-03-10
During the process of microscanning a controlled vibrating mirror typically is used to produce subpixel shifts in a sequence of forward-looking infrared (FLIR) images. If the FLIR is mounted on a moving platform, such as an aircraft, uncontrolled random vibrations associated with the platform can be used to generate the shifts. Iterative techniques such as the expectation-maximization (EM) approach by means of the maximum-likelihood algorithm can be used to generate high-resolution images from multiple randomly shifted aliased frames. In the maximum-likelihood approach the data are considered to be Poisson random variables and an EM algorithm is developed that iteratively estimates an unaliased image that is compensated for known imager-system blur while it simultaneously estimates the translational shifts. Although this algorithm yields high-resolution images from a sequence of randomly shifted frames, it requires significant computation time and cannot be implemented for real-time applications that use the currently available high-performance processors. The new image shifts are iteratively calculated by evaluation of a cost function that compares the shifted and interlaced data frames with the corresponding values in the algorithm's latest estimate of the high-resolution image. We present a registration algorithm that estimates the shifts in one step. The shift parameters provided by the new algorithm are accurate enough to eliminate the need for iterative recalculation of translational shifts. Using this shift information, we apply a simplified version of the EM algorithm to estimate a high-resolution image from a given sequence of video frames. The proposed modified EM algorithm has been found to reduce significantly the computational burden when compared with the original EM algorithm, thus making it more attractive for practical implementation. Both simulation and experimental results are presented to verify the effectiveness of the proposed technique.
Schuster, Tanja M.; Setaro, Sabrina D.; Tibbits, Josquin F. G.; Batty, Erin L.; Fowler, Rachael M.; McLay, Todd G. B.; Wilcox, Stephen; Ades, Peter K.
2018-01-01
Previous molecular phylogenetic analyses have resolved the Australian bloodwood eucalypt genus Corymbia (~100 species) as either monophyletic or paraphyletic with respect to Angophora (9–10 species). Here we assess relationships of Corymbia and Angophora using a large dataset of chloroplast DNA sequences (121,016 base pairs; from 90 accessions representing 55 Corymbia and 8 Angophora species, plus 33 accessions of related genera), skimmed from high throughput sequencing of genomic DNA, and compare results with new analyses of nuclear ITS sequences (119 accessions) from previous studies. Maximum likelihood and maximum parsimony analyses of cpDNA resolve well supported trees with most nodes having >95% bootstrap support. These trees strongly reject monophyly of Corymbia, its two subgenera (Corymbia and Blakella), most taxonomic sections (Abbreviatae, Maculatae, Naviculares, Septentrionales), and several species. ITS trees weakly indicate paraphyly of Corymbia (bootstrap support <50% for maximum likelihood, and 71% for parsimony), but are highly incongruent with the cpDNA analyses, in that they support monophyly of both subgenera and some taxonomic sections of Corymbia. The striking incongruence between cpDNA trees and both morphological taxonomy and ITS trees is attributed largely to chloroplast introgression between taxa, because of geographic sharing of chloroplast clades across taxonomic groups. Such introgression has been widely inferred in studies of the related genus Eucalyptus. This is the first report of its likely prevalence in Corymbia and Angophora, but this is consistent with previous morphological inferences of hybridisation between species. 
Our findings (based on continent-wide sampling) highlight a need for more focussed studies to assess the extent of hybridisation and introgression in the evolutionary history of these genera, and that critical testing of the classification of Corymbia and Angophora requires additional sequence data from nuclear genomes. PMID:29668710
Speech processing using conditional observable maximum likelihood continuity mapping
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, John; Nix, David
A computer implemented method enables the recognition of speech and speech characteristics. Parameters are initialized of first probability density functions that map between the symbols in the vocabulary of one or more sequences of speech codes that represent speech sounds and a continuity map. Parameters are also initialized of second probability density functions that map between the elements in the vocabulary of one or more desired sequences of speech transcription symbols and the continuity map. The parameters of the probability density functions are then trained to maximize the probabilities of the desired sequences of speech-transcription symbols. A new sequence of speech codes is then input to the continuity map having the trained first and second probability function parameters. A smooth path is identified on the continuity map that has the maximum probability for the new sequence of speech codes. The probability of each speech transcription symbol for each input speech code can then be output.
Stamatakis, Alexandros; Ott, Michael
2008-12-27
The continuous accumulation of sequence data, for example, due to novel wet-laboratory techniques such as pyrosequencing, coupled with the increasing popularity of multi-gene phylogenies and emerging multi-core processor architectures that face problems of cache congestion, poses new challenges with respect to the efficient computation of the phylogenetic maximum-likelihood (ML) function. Here, we propose two approaches that can significantly speed up likelihood computations that typically represent over 95 per cent of the computational effort conducted by current ML or Bayesian inference programs. Initially, we present a method and an appropriate data structure to efficiently compute the likelihood score on 'gappy' multi-gene alignments. By 'gappy' we denote sampling-induced gaps owing to missing sequences in individual genes (partitions), i.e. not real alignment gaps. A first proof-of-concept implementation in RAXML indicates that this approach can accelerate inferences on large and gappy alignments by approximately one order of magnitude. Moreover, we present insights and initial performance results on multi-core architectures obtained during the transition from an OpenMP-based to a Pthreads-based fine-grained parallelization of the ML function.
Analysis of Sequence Data Under Multivariate Trait-Dependent Sampling.
Tao, Ran; Zeng, Donglin; Franceschini, Nora; North, Kari E; Boerwinkle, Eric; Lin, Dan-Yu
2015-06-01
High-throughput DNA sequencing allows for the genotyping of common and rare variants for genetic association studies. At the present time and for the foreseeable future, it is not economically feasible to sequence all individuals in a large cohort. A cost-effective strategy is to sequence those individuals with extreme values of a quantitative trait. We consider the design under which the sampling depends on multiple quantitative traits. Under such trait-dependent sampling, standard linear regression analysis can result in bias of parameter estimation, inflation of type I error, and loss of power. We construct a likelihood function that properly reflects the sampling mechanism and utilizes all available data. We implement a computationally efficient EM algorithm and establish the theoretical properties of the resulting maximum likelihood estimators. Our methods can be used to perform separate inference on each trait or simultaneous inference on multiple traits. We pay special attention to gene-level association tests for rare variants. We demonstrate the superiority of the proposed methods over standard linear regression through extensive simulation studies. We provide applications to the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study and the National Heart, Lung, and Blood Institute Exome Sequencing Project.
On the error probability of general tree and trellis codes with applications to sequential decoding
NASA Technical Reports Server (NTRS)
Johannesson, R.
1973-01-01
An upper bound on the average error probability for maximum-likelihood decoding of the ensemble of random binary tree codes is derived and shown to be independent of the length of the tree. An upper bound on the average error probability for maximum-likelihood decoding of the ensemble of random L-branch binary trellis codes of rate R = 1/n is derived which separates the effects of the tail length T and the memory length M of the code. It is shown that the bound is independent of the length L of the information sequence. This implication is investigated by computer simulations of sequential decoding utilizing the stack algorithm. These simulations confirm the implication and further suggest an empirical formula for the true undetected decoding error probability with sequential decoding.
Parallel implementation of D-Phylo algorithm for maximum likelihood clusters.
Malik, Shamita; Sharma, Dolly; Khatri, Sunil Kumar
2017-03-01
This study explains a newly developed parallel algorithm for phylogenetic analysis of DNA sequences. The newly designed D-Phylo is a more advanced algorithm for phylogenetic analysis using the maximum likelihood approach. While exploiting the search capacity of k-means, D-Phylo avoids its main limitation of getting stuck at locally conserved motifs. The authors have tested the behaviour of D-Phylo on an Amazon Linux Amazon Machine Image (Hardware Virtual Machine) i2.4xlarge instance (six central processing units, 122 GiB memory, 8 × 800 solid-state drive Elastic Block Store volumes, high network performance) with up to 15 processors for several real-life datasets. Distributing the clusters evenly across all the processors makes it possible to achieve near-linear speedup when a large number of processors is used.
Johnson, Rebecca N; Agapow, Paul-Michael; Crozier, Ross H
2003-11-01
The ant subfamily Formicinae is a large assemblage (2458 species; J. Nat. Hist. 29 (1995) 1037), including species that weave leaf nests together with larval silk and in which the metapleural gland (the ancestrally defining ant character) has been secondarily lost. We used sequences from two mitochondrial genes (cytochrome b and cytochrome oxidase 2) from 18 formicine and 4 outgroup taxa to derive a robust phylogeny, employing a search for tree islands using 10000 randomly constructed trees as starting points and deriving a maximum likelihood consensus tree from the ML tree and those not significantly different from it. Non-parametric bootstrapping showed that the ML consensus tree fit the data significantly better than three scenarios based on morphology, with that of Bolton (Identification Guide to the Ant Genera of the World, Harvard University Press, Cambridge, MA) being the best among these alternative trees. Trait mapping showed that weaving had arisen at least four times and possibly been lost once. A maximum likelihood analysis showed that loss of the metapleural gland is significantly associated with the weaver life-pattern. The graph of the frequencies with which trees were discovered versus their likelihood indicates that trees with high likelihoods have much larger basins of attraction than those with lower likelihoods. While this result indicates that single searches are more likely to find high- than low-likelihood tree islands, it also indicates that searching only for the single best tree may lose important information.
Fragment assignment in the cloud with eXpress-D
2013-01-01
Background Probabilistic assignment of ambiguously mapped fragments produced by high-throughput sequencing experiments has been demonstrated to greatly improve accuracy in the analysis of RNA-Seq and ChIP-Seq, and is an essential step in many other sequence census experiments. A maximum likelihood method using the expectation-maximization (EM) algorithm for optimization is commonly used to solve this problem. However, batch EM-based approaches do not scale well with the size of sequencing datasets, which have been increasing dramatically over the past few years. Thus, current approaches to fragment assignment rely on heuristics or approximations for tractability. Results We present an implementation of a distributed EM solution to the fragment assignment problem using Spark, a data analytics framework that can scale by leveraging compute clusters within datacenters–“the cloud”. We demonstrate that our implementation easily scales to billions of sequenced fragments, while providing the exact maximum likelihood assignment of ambiguous fragments. The accuracy of the method is shown to be an improvement over the most widely used tools available and can be run in a constant amount of time when cluster resources are scaled linearly with the amount of input data. Conclusions The cloud offers one solution for the difficulties faced in the analysis of massive high-throughput sequencing data, which continue to grow rapidly. Researchers in bioinformatics must follow developments in distributed systems–such as new frameworks like Spark–for ways to port existing methods to the cloud and help them scale to the datasets of the future. Our software, eXpress-D, is freely available at: http://github.com/adarob/express-d. PMID:24314033
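The batch EM approach referred to above can be sketched on a toy problem as follows. This is a single-machine illustration only: it ignores target lengths, alignment probabilities, and bias terms, it does not use Spark, and the function name and input data are hypothetical and not taken from eXpress-D.

```python
def em_abundances(compat, n_targets, n_iter=200):
    """Estimate relative target abundances from ambiguously mapped fragments.

    compat: one list per fragment, holding the indices of the targets
            that the fragment is compatible with.
    Returns a list of abundances that sums to 1.
    """
    theta = [1.0 / n_targets] * n_targets
    for _ in range(n_iter):
        counts = [0.0] * n_targets
        for targets in compat:
            total = sum(theta[t] for t in targets)
            for t in targets:
                # E-step: fractional assignment of this fragment to target t.
                counts[t] += theta[t] / total
        # M-step: abundances proportional to the expected fragment counts.
        theta = [c / len(compat) for c in counts]
    return theta

# Three fragments unique to target 0, one unique to target 1,
# and two fragments that map to both targets.
compat = [[0], [0], [0], [1], [0, 1], [0, 1]]
theta = em_abundances(compat, n_targets=2)
```

In this toy case the ambiguous fragments carry no information about the split, so the estimate settles at the ratio implied by the uniquely mapped fragments. Each fragment's E-step is independent of the others, which is the property a distributed implementation can exploit.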
Vink, Cor J; Paterson, Adrian M
2003-09-01
Datasets from the mitochondrial gene regions NADH dehydrogenase subunit I (ND1) and cytochrome c oxidase subunit I (COI) of the 20 species in the New Zealand wolf spider (Lycosidae) genus Anoteropsis were generated. Sequence data were phylogenetically analysed using parsimony and maximum likelihood analyses. The phylogenies generated from the ND1 and COI sequence data and a previously generated morphological dataset were significantly congruent (p<0.001). Sequence data were combined with morphological data and phylogenetically analysed using parsimony. The ND1 region sequenced included part of tRNA(Leu(CUN)), which appears to have an unstable amino-acyl arm and no TpsiC arm in lycosids. Analyses supported the existence of five species groups within Anoteropsis and the monophyly of species represented by multiple samples. A radiation of Anoteropsis species within the last five million years is inferred from the ND1 and COI likelihood phylograms, habitat and geological data, which also indicates that Anoteropsis arrived in New Zealand some time after it separated from Gondwana.
Krajewski, C; Fain, M G; Buckley, L; King, D G
1999-11-01
Debates over whether molecular sequence data should be partitioned for phylogenetic analysis often confound two types of heterogeneity among partitions. We distinguish historical heterogeneity (i.e., different partitions have different evolutionary relationships) from dynamic heterogeneity (i.e., different partitions show different patterns of sequence evolution) and explore the impact of the latter on phylogenetic accuracy and precision with a two-gene, mitochondrial data set for cranes. The well-established phylogeny of cranes allows us to contrast tree-based estimates of relevant parameter values with estimates based on pairwise comparisons and to ascertain the effects of incorporating different amounts of process information into phylogenetic estimates. We show that codon positions in the cytochrome b and NADH dehydrogenase subunit 6 genes are dynamically heterogenous under both Poisson and invariable-sites + gamma-rates versions of the F84 model and that heterogeneity includes variation in base composition and transition bias as well as substitution rate. Estimates of transition-bias and relative-rate parameters from pairwise sequence comparisons were comparable to those obtained as tree-based maximum likelihood estimates. Neither rate-category nor mixed-model partitioning strategies resulted in a loss of phylogenetic precision relative to unpartitioned analyses. We suggest that weighted-average distances provide a computationally feasible alternative to direct maximum likelihood estimates of phylogeny for mixed-model analyses of large, dynamically heterogenous data sets. Copyright 1999 Academic Press.
Johnston, Christine; Magaret, Amalia; Roychoudhury, Pavitra; Greninger, Alexander L; Cheng, Anqi; Diem, Kurt; Fitzgibbon, Matthew P; Huang, Meei-Li; Selke, Stacy; Lingappa, Jairam R; Celum, Connie; Jerome, Keith R; Wald, Anna; Koelle, David M
2017-10-01
Understanding the variability in circulating herpes simplex virus type 2 (HSV-2) genomic sequences is critical to the development of HSV-2 vaccines. Genital lesion swabs containing ≥ 10^7 log10 copies HSV DNA collected from Africa, the USA, and South America underwent next-generation sequencing, followed by K-mer based filtering and de novo genomic assembly. Sites of heterogeneity within coding regions in unique long and unique short (UL_US) regions were identified. Phylogenetic trees were created using maximum likelihood reconstruction. Among 46 samples from 38 persons, 1468 intragenic base-pair substitutions were identified. The maximum nucleotide distance between strains for concatenated UL_US segments was 0.4%. Phylogeny did not reveal geographic clustering. The most variable proteins had non-synonymous mutations in < 3% of amino acids. Unenriched HSV-2 DNA can undergo next-generation sequencing to identify intragenic variability. The use of clinical swabs for sequencing expands the information that can be gathered directly from these specimens. Copyright © 2017 Elsevier Inc. All rights reserved.
Phylogenetic study on Shiraia bambusicola by rDNA sequence analyses.
Cheng, Tian-Fan; Jia, Xiao-Ming; Ma, Xiao-Hang; Lin, Hai-Ping; Zhao, Yu-Hua
2004-01-01
In this study, 18S rDNA and ITS-5.8S rDNA regions of four Shiraia bambusicola isolates collected from different species of bamboos were amplified by PCR with universal primer pairs NS1/NS8 and ITS5/ITS4, respectively, and sequenced. Phylogenetic analyses were conducted on three selected datasets of rDNA sequences. Maximum parsimony, distance and maximum likelihood criteria were used to infer trees. Morphological characteristics were also observed. The positioning of Shiraia in the order Pleosporales was well supported by bootstrap, which agreed with the placement by Amano (1980) according to their morphology. We did not find significant inter-hostal differences among these four isolates from different species of bamboos. From the results of analyses and comparison of their rDNA sequences, we conclude that Shiraia should be classified into Pleosporales as Amano (1980) proposed and suggest that it might be positioned in the family Phaeosphaeriaceae. Copyright 2004 WILEY-VCH Verlag GmbH & Co.
Li, Xinya; Deng, Z. Daniel; USA, Richland Washington; ...
2014-11-27
Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.
NASA Astrophysics Data System (ADS)
Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; Carlson, Thomas J.
2014-11-01
Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.
Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; Carlson, Thomas J.
2014-01-01
Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature. PMID:25427517
Two stochastic models useful in petroleum exploration
NASA Technical Reports Server (NTRS)
Kaufman, G. M.; Bradley, P. G.
1972-01-01
A model of the petroleum exploration process is proposed that empirically tests the hypothesis that, at an early stage in the exploration of a basin, the process behaves like sampling without replacement, along with a model of the spatial distribution of petroleum reservoirs that conforms to observed facts. In developing the model of discovery, the following topics are discussed: probabilistic proportionality, the likelihood function, and maximum likelihood estimation. In addition, the spatial model is described, which is defined as a stochastic process generating values of a sequence of random variables in a way that simulates the frequency distribution of areal extent, the geographic location, and shape of oil deposits.
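The "sampling without replacement" idea in this record can be illustrated with a small simulation. The sketch below is not taken from the report: it assumes that "probabilistic proportionality" means each remaining deposit is discovered with probability proportional to its size, and the function names and deposit sizes are hypothetical.

```python
import math
import random


def simulate_discovery_order(sizes, rng):
    """Draw deposits one at a time without replacement.

    Assumption (not from the report): at each step a remaining deposit is
    chosen with probability proportional to its size.
    """
    remaining = list(sizes)
    order = []
    while remaining:
        threshold = rng.random() * sum(remaining)
        acc = 0.0
        for i, size in enumerate(remaining):
            acc += size
            if threshold < acc:
                break
        # If rounding prevents the break, i is the last index, which is fine.
        order.append(remaining.pop(i))
    return order


def order_log_likelihood(order):
    """Log-likelihood of an observed discovery order under the same model."""
    remaining_total = sum(order)
    ll = 0.0
    for size in order:
        ll += math.log(size / remaining_total)
        remaining_total -= size
    return ll
```

Under this assumed model large deposits tend to be found early, which is the kind of pattern a likelihood function for the discovery order can test against data.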
Atibalentja, N; Noel, G R; Domier, L L
2000-03-01
A 1341 bp sequence of the 16S rDNA of an undescribed species of Pasteuria that parasitizes the soybean cyst nematode, Heterodera glycines, was determined and then compared with a homologous sequence of Pasteuria ramosa, a parasite of cladoceran water fleas of the family Daphnidae. The two Pasteuria sequences, which diverged from each other by a dissimilarity index of 7%, also were compared with the 16S rDNA sequences of 30 other bacterial species to determine the phylogenetic position of the genus Pasteuria among the Gram-positive eubacteria. Phylogenetic analyses using maximum-likelihood, maximum-parsimony and neighbour-joining methods showed that the Heterodera glycines-infecting Pasteuria and its sister species, P. ramosa, form a distinct line of descent within the Alicyclobacillus group of the Bacillaceae. These results are consistent with the view that the genus Pasteuria is a deeply rooted member of the Clostridium-Bacillus-Streptococcus branch of the Gram-positive eubacteria, neither related to the actinomycetes nor closely related to true endospore-forming bacteria.
Xylopsora canopeorum (Umbilicariaceae), a new lichen species from the canopy of Sequoia sempervirens
Bendiksby, Mika; Næsborg, Rikke Reese; Timdal, Einar
2018-01-01
Xylopsora canopeorum Timdal, Reese Næsborg & Bendiksby is described as a new species occupying the crowns of large Sequoia sempervirens trees in California, USA. The new species is supported by morphology, anatomy, secondary chemistry and DNA sequence data. While similar in external appearance to X. friesii, it is distinguished by forming smaller, partly coralloid squamules, by the occurrence of soralia and, in some specimens, by the presence of thamnolic acid in addition to friesiic acid in the thallus. Molecular phylogenetic results are based on nuclear (ITS and LSU) as well as mitochondrial (SSU) ribosomal DNA sequence alignments. Phylogenetic hypotheses obtained using Bayesian Inference, Maximum Likelihood and Maximum Parsimony all support X. canopeorum as a distinct evolutionary lineage belonging to the X. caradocensis–X. friesii clade. PMID:29559828
Maximum Likelihood and Restricted Likelihood Solutions in Multiple-Method Studies
Rukhin, Andrew L.
2011-01-01
A formulation of the problem of combining data from several sources is discussed in terms of random effects models. The unknown measurement precision is assumed not to be the same for all methods. We investigate maximum likelihood solutions in this model. By representing the likelihood equations as simultaneous polynomial equations, the exact form of the Groebner basis for their stationary points is derived when there are two methods. A parametrization of these solutions which allows their comparison is suggested. A numerical method for solving likelihood equations is outlined, and an alternative to the maximum likelihood method, the restricted maximum likelihood, is studied. In the situation when methods variances are considered to be known an upper bound on the between-method variance is obtained. The relationship between likelihood equations and moment-type equations is also discussed. PMID:26989583
Maximum-likelihood estimation of recent shared ancestry (ERSA).
Huff, Chad D; Witherspoon, David J; Simonson, Tatum S; Xing, Jinchuan; Watkins, W Scott; Zhang, Yuhua; Tuohy, Therese M; Neklason, Deborah W; Burt, Randall W; Guthery, Stephen L; Woodward, Scott R; Jorde, Lynn B
2011-05-01
Accurate estimation of recent shared ancestry is important for genetics, evolution, medicine, conservation biology, and forensics. Established methods estimate kinship accurately for first-degree through third-degree relatives. We demonstrate that chromosomal segments shared by two individuals due to identity by descent (IBD) provide much additional information about shared ancestry. We developed a maximum-likelihood method for the estimation of recent shared ancestry (ERSA) from the number and lengths of IBD segments derived from high-density SNP or whole-genome sequence data. We used ERSA to estimate relationships from SNP genotypes in 169 individuals from three large, well-defined human pedigrees. ERSA is accurate to within one degree of relationship for 97% of first-degree through fifth-degree relatives and 80% of sixth-degree and seventh-degree relatives. We demonstrate that ERSA's statistical power approaches the maximum theoretical limit imposed by the fact that distant relatives frequently share no DNA through a common ancestor. ERSA greatly expands the range of relationships that can be estimated from genetic data and is implemented in a freely available software package.
2010-06-01
GMKPF represents a better and more flexible alternative to the Gaussian Maximum Likelihood (GML), and Exponential Maximum Likelihood (EML ... accurate results relative to GML and EML when the network delays are modeled in terms of a single non-Gaussian/non-exponential distribution or as a ... to the Gaussian Maximum Likelihood (GML), and Exponential Maximum Likelihood (EML) estimators for clock offset estimation in non-Gaussian or non
MXLKID: a maximum likelihood parameter identifier. [In LRLTRAN for CDC 7600
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gavel, D.T.
MXLKID (MaXimum LiKelihood IDentifier) is a computer program designed to identify unknown parameters in a nonlinear dynamic system. Using noisy measurement data from the system, the maximum likelihood identifier computes a likelihood function (LF). Identification of system parameters is accomplished by maximizing the LF with respect to the parameters. The main body of this report briefly summarizes the maximum likelihood technique and gives instructions and examples for running the MXLKID program. MXLKID is implemented in LRLTRAN on the CDC 7600 computer at LLNL. A detailed mathematical description of the algorithm is given in the appendices. 24 figures, 6 tables.
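The general idea of identifying a parameter by maximizing a likelihood function over noisy measurements can be sketched in a few lines. This is a toy illustration only, not the MXLKID algorithm: the scalar system, the Gaussian noise model, the golden-section search, and all names and constants below are assumptions chosen for brevity.

```python
import math
import random


def simulate(a, n, noise_sd, rng, x0=1.0, u=0.5):
    """Simulate x[k+1] = a*x[k] + u with noisy measurements y[k] = x[k] + v[k]."""
    x = x0
    ys = []
    for _ in range(n):
        ys.append(x + rng.gauss(0.0, noise_sd))
        x = a * x + u
    return ys


def log_likelihood(a, ys, noise_sd, x0=1.0, u=0.5):
    """Gaussian log-likelihood of the measurements for a candidate parameter a."""
    x = x0
    ll = 0.0
    for y in ys:
        r = y - x  # residual between measurement and model prediction
        ll += -0.5 * math.log(2.0 * math.pi * noise_sd ** 2) - 0.5 * (r / noise_sd) ** 2
        x = a * x + u
    return ll


def golden_section_max(f, lo, hi, tol=1e-8):
    """Maximize a function assumed to be unimodal on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - g * (hi - lo)
        d = lo + g * (hi - lo)
        if f(c) > f(d):
            hi = d  # the maximum lies in [lo, d]
        else:
            lo = c  # the maximum lies in [c, hi]
    return 0.5 * (lo + hi)


def estimate_a(ys, noise_sd):
    """Return the value of a in [0, 0.99] that maximizes the log-likelihood."""
    return golden_section_max(lambda a: log_likelihood(a, ys, noise_sd), 0.0, 0.99)
```

A one-dimensional search is used only because the toy model has a single unknown; a program for general nonlinear systems would need a multivariate optimizer.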
NASA Technical Reports Server (NTRS)
Walker, H. F.
1976-01-01
The likelihood equations determined by the two types of samples, which are necessary conditions for a maximum-likelihood estimate, were considered. These equations suggest certain successive-approximation iterative procedures for obtaining maximum likelihood estimates. The procedures, which are generalized steepest ascent (deflected gradient) procedures, contain those of Hosmer as a special case.
MultiPhyl: a high-throughput phylogenomics webserver using distributed computing
Keane, Thomas M.; Naughton, Thomas J.; McInerney, James O.
2007-01-01
With the number of fully sequenced genomes increasing steadily, there is greater interest in performing large-scale phylogenomic analyses from large numbers of individual gene families. Maximum likelihood (ML) has been shown repeatedly to be one of the most accurate methods for phylogenetic construction. Recently, there have been a number of algorithmic improvements in maximum-likelihood-based tree search methods. However, it can still take a long time to analyse the evolutionary history of many gene families using a single computer. Distributed computing refers to a method of combining the computing power of multiple computers in order to perform some larger overall calculation. In this article, we present the first high-throughput implementation of a distributed phylogenetics platform, MultiPhyl, capable of using the idle computational resources of many heterogeneous non-dedicated machines to form a phylogenetics supercomputer. MultiPhyl allows a user to upload hundreds or thousands of amino acid or nucleotide alignments simultaneously and perform computationally intensive tasks such as model selection, tree searching and bootstrapping of each of the alignments using many desktop machines. The program implements a set of 88 amino acid models and 56 nucleotide maximum likelihood models and a variety of statistical methods for choosing between alternative models. A MultiPhyl webserver is available for public use at: http://www.cs.nuim.ie/distributed/multiphyl.php. PMID:17553837
Finite mixture model: A maximum likelihood estimation approach on time series data
NASA Astrophysics Data System (ADS)
Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad
2014-09-01
Recently, statisticians have emphasized fitting finite mixture models by maximum likelihood estimation because of its asymptotic properties. In particular, the estimator is consistent as the sample size increases to infinity, which indicates that maximum likelihood estimation is asymptotically unbiased. Moreover, the parameter estimates obtained by maximum likelihood estimation have the smallest variance compared with other statistical methods as the sample size increases. Thus, maximum likelihood estimation is adopted in this paper to fit a two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, the Philippines and Indonesia. The results show a negative relationship between rubber price and exchange rate for all selected countries.
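Maximum likelihood fitting of a two-component mixture is commonly done with the EM algorithm. The sketch below assumes Gaussian components and synthetic data; the record does not say which component family or optimizer the authors used, so the function names, initialization, and iteration count are illustrative only.

```python
import math
import random


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def mixture_log_likelihood(data, w, mu1, sd1, mu2, sd2):
    return sum(
        math.log(w * normal_pdf(x, mu1, sd1) + (1.0 - w) * normal_pdf(x, mu2, sd2))
        for x in data
    )


def em_two_gaussians(data, n_iter=100):
    """Fit a two-component Gaussian mixture by EM (illustrative sketch)."""
    ordered = sorted(data)
    half = len(ordered) // 2
    # Crude initialization: means of the lower and upper halves of the data.
    mu1 = sum(ordered[:half]) / half
    mu2 = sum(ordered[half:]) / (len(ordered) - half)
    sd1 = sd2 = (ordered[-1] - ordered[0]) / 4.0 or 1.0
    w = 0.5
    history = []
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 1.
        resp = []
        for x in data:
            p1 = w * normal_pdf(x, mu1, sd1)
            p2 = (1.0 - w) * normal_pdf(x, mu2, sd2)
            resp.append(p1 / (p1 + p2))
        # M-step: weighted maximum likelihood updates of all parameters.
        n1 = sum(resp)
        n2 = len(data) - n1
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        sd1 = math.sqrt(sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / n1)
        sd2 = math.sqrt(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / n2)
        w = n1 / len(data)
        history.append(mixture_log_likelihood(data, w, mu1, sd1, mu2, sd2))
    return (w, mu1, sd1, mu2, sd2), history
```

Each EM iteration cannot decrease the log-likelihood, so the returned history gives a simple convergence check.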
Neandertal Admixture in Eurasia Confirmed by Maximum-Likelihood Analysis of Three Genomes
Lohse, Konrad; Frantz, Laurent A. F.
2014-01-01
Although there has been much interest in estimating histories of divergence and admixture from genomic data, it has proved difficult to distinguish recent admixture from long-term structure in the ancestral population. Thus, recent genome-wide analyses based on summary statistics have sparked controversy about the possibility of interbreeding between Neandertals and modern humans in Eurasia. Here we derive the probability of full mutational configurations in nonrecombining sequence blocks under both admixture and ancestral structure scenarios. Dividing the genome into short blocks gives an efficient way to compute maximum-likelihood estimates of parameters. We apply this likelihood scheme to triplets of human and Neandertal genomes and compare the relative support for a model of admixture from Neandertals into Eurasian populations after their expansion out of Africa against a history of persistent structure in their common ancestral population in Africa. Our analysis allows us to conclusively reject a model of ancestral structure in Africa and instead reveals strong support for Neandertal admixture in Eurasia at a higher rate (3.4−7.3%) than suggested previously. Using analysis and simulations we show that our inference is more powerful than previous summary statistics and robust to realistic levels of recombination. PMID:24532731
Phylogenetic evidence for cladogenetic polyploidization in land plants.
Zhan, Shing H; Drori, Michal; Goldberg, Emma E; Otto, Sarah P; Mayrose, Itay
2016-07-01
Polyploidization is a common and recurring phenomenon in plants and is often thought to be a mechanism of "instant speciation". Whether polyploidization is associated with the formation of new species (cladogenesis) or simply occurs over time within a lineage (anagenesis), however, has never been assessed systematically. We tested this hypothesis using phylogenetic and karyotypic information from 235 plant genera (mostly angiosperms). We first constructed a large database of combined sequence and chromosome number data sets using an automated procedure. We then applied likelihood models (ClaSSE) that estimate the degree of synchronization between polyploidization and speciation events in maximum likelihood and Bayesian frameworks. Our maximum likelihood analysis indicated that 35 genera supported a model that includes cladogenetic transitions over a model with only anagenetic transitions, whereas three genera supported a model that incorporates anagenetic transitions over one with only cladogenetic transitions. Furthermore, the Bayesian analysis supported a preponderance of cladogenetic change in four genera but did not support a preponderance of anagenetic change in any genus. Overall, these phylogenetic analyses provide the first broad confirmation that polyploidization is temporally associated with speciation events, suggesting that it is indeed a major speciation mechanism in plants, at least in some genera. © 2016 Botanical Society of America.
Determining the accuracy of maximum likelihood parameter estimates with colored residuals
NASA Technical Reports Server (NTRS)
Morelli, Eugene A.; Klein, Vladislav
1994-01-01
An important part of building high fidelity mathematical models based on measured data is calculating the accuracy associated with statistical estimates of the model parameters. Indeed, without some idea of the accuracy of parameter estimates, the estimates themselves have limited value. In this work, an expression based on theoretical analysis was developed to properly compute parameter accuracy measures for maximum likelihood estimates with colored residuals. This result is important because experience from the analysis of measured data reveals that the residuals from maximum likelihood estimation are almost always colored. The calculations involved can be appended to conventional maximum likelihood estimation algorithms. Simulated data runs were used to show that the parameter accuracy measures computed with this technique accurately reflect the quality of the parameter estimates from maximum likelihood estimation without the need for analysis of the output residuals in the frequency domain or heuristically determined multiplication factors. The result is general, although the application studied here is maximum likelihood estimation of aerodynamic model parameters from flight test data.
Speech processing using maximum likelihood continuity mapping
Hogden, John E.
2000-01-01
Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.
Rouhani, Soheila; Raeghi, Saber; Spotin, Adel
2017-01-01
Fascioliasis, caused by Fasciola hepatica and Fasciola gigantica, is economically important to the livestock industry. The objective of this study was to identify these two species using nuclear and mitochondrial markers (ITS1, ND1 and CO1) and to analyze intraspecific phylogenetic relations of Fasciola spp. Approximately 150 Fasciola specimens were collected, stained with haematoxylin-carmine dye and examined under an optical microscope for the presence of sperm. The ITS1 marker was used to identify the Fasciola species, and phylogenetic analyses based on ND1 and CO1 sequence data were conducted with a maximum likelihood algorithm. Fasciola samples were separated into 2 groups: almost all specimens had many sperm in the seminal vesicle (spermic flukes), and one fluke did not contain any sperm in the seminal vesicle. The aspermic sample had the F. gigantica RFLP pattern for the ITS1 gene. Maximum likelihood phylogenetic analyses based on ND1 and CO1 sequence data yielded trees with similar topologies, particularly for F. hepatica and F. gigantica. This study demonstrated that the aspermic Fasciola found in this region of Iran has the same genetic structure as the spermic F. gigantica populations according to the phylogenetic tree.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1975-01-01
A general iterative procedure is given for determining the consistent maximum likelihood estimates of normal distributions. In addition, a local maximum of the log-likelihood function, Newton's method, a method of scoring, and modifications of these procedures are discussed.
Callejón, Rocío; Robles, María Del Rosario; Panei, Carlos Javier; Cutillas, Cristina
2016-08-01
A molecular phylogenetic hypothesis is presented for the genus Trichuris based on sequence data from mitochondrial cytochrome c oxidase 1 (cox1) and cytochrome b (cob). The taxa consisted of nine populations of whipworm from five species of Sigmodontinae rodents from Argentina. Bayesian Inference, Maximum Parsimony, and Maximum Likelihood methods were used to infer phylogenies for each gene separately but also for the combined mitochondrial data and the combined mitochondrial and nuclear dataset. Phylogenetic results based on cox1 and cob mitochondrial DNA (mtDNA) revealed three strongly resolved clades corresponding to three different species (Trichuris navonae, Trichuris bainae, and Trichuris pardinasi) showing phylogeographic variation, but relationships among Trichuris species were poorly resolved. Phylogenetic reconstruction based on concatenated sequences had greater phylogenetic resolution for delimiting species and intraspecific populations of Trichuris than those based on partitioned genes. Thus, populations of T. bainae and T. pardinasi could be affected by geographical factors and parasite-host co-divergence.
Bit Error Probability for Maximum Likelihood Decoding of Linear Block Codes
NASA Technical Reports Server (NTRS)
Lin, Shu; Fossorier, Marc P. C.; Rhee, Dojun
1996-01-01
In this paper, the bit error probability P(sub b) for maximum likelihood decoding of binary linear codes is investigated. The contribution of each information bit to P(sub b) is considered. For randomly generated codes, it is shown that the conventional approximation at high SNR P(sub b) is approximately equal to (d(sub H)/N)P(sub s), where P(sub s) represents the block error probability, holds for systematic encoding only. Also systematic encoding provides the minimum P(sub b) when the inverse mapping corresponding to the generator matrix of the code is used to retrieve the information sequence. The bit error performances corresponding to other generator matrix forms are also evaluated. Although derived for codes with a generator matrix randomly generated, these results are shown to provide good approximations for codes used in practice. Finally, for decoding methods which require a generator matrix with a particular structure such as trellis decoding or algebraic-based soft decision decoding, equivalent schemes that reduce the bit error probability are discussed.
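In symbols, the high-SNR approximation quoted above can be written as follows. The abstract does not define d(sub H) and N; reading them as the minimum Hamming distance and the code length is an assumption made here.

```latex
P_b \approx \frac{d_H}{N}\, P_s
```

Under that reading, a code with N = 24 and d_H = 8 (the parameters of the extended binary Golay code) would give P_b ≈ P_s/3.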
A Comparison of a Bayesian and a Maximum Likelihood Tailored Testing Procedure.
ERIC Educational Resources Information Center
McKinley, Robert L.; Reckase, Mark D.
A study was conducted to compare tailored testing procedures based on a Bayesian ability estimation technique and on a maximum likelihood ability estimation technique. The Bayesian tailored testing procedure selected items so as to minimize the posterior variance of the ability estimate distribution, while the maximum likelihood tailored testing…
Ogura, Kohei; Watanabe, Shinya; Kirikae, Teruo; Miyoshi-Akiyama, Tohru
2017-01-01
Epidemiologic typing of Streptococcus pyogenes (GAS) is frequently based on the genotype of the emm gene, which encodes M/Emm protein. In this study, the complete genome sequence of GAS emm3 strain M3-b, isolated from a patient with streptococcal toxic shock syndrome (STSS), was determined. This strain exhibited 99% identity with other complete genome sequences of emm3 strains MGAS315, SSI-1, and STAB902. The complete genomes of five additional strains isolated from Japanese patients with and without STSS were also sequenced. Maximum-likelihood phylogenetic analysis showed that strains M3-b, M3-e, and SSI-1, all of which were isolated from STSS patients, were relatively close.
Maximum likelihood solution for inclination-only data in paleomagnetism
NASA Astrophysics Data System (ADS)
Arason, P.; Levi, S.
2010-08-01
We have developed a new robust maximum likelihood method for estimating the unbiased mean inclination from inclination-only data. In paleomagnetic analysis, the arithmetic mean of inclination-only data is known to introduce a shallowing bias. Several methods have been introduced to estimate the unbiased mean inclination of inclination-only data together with measures of the dispersion. Some inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all the methods require various assumptions and approximations that are often inappropriate. For some steep and dispersed data sets, these methods provide estimates that are significantly displaced from the peak of the likelihood function to systematically shallower inclination. The problem of locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest, because some elements of the likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study, we succeeded in analytically cancelling exponential elements from the log-likelihood function, and we are now able to calculate its value anywhere in the parameter space and for any inclination-only data set. Furthermore, we can now calculate the partial derivatives of the log-likelihood function with desired accuracy, and locate the maximum likelihood without the assumptions required by previous methods. To assess the reliability and accuracy of our method, we generated large numbers of random Fisher-distributed data sets, for which we calculated mean inclinations and precision parameters. The comparisons show that our new robust Arason-Levi maximum likelihood method is the most reliable, and the mean inclination estimates are the least biased towards shallow values.
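The numerical issue described here, exponentially growing terms that overflow at large precision parameters, can be illustrated on the normalizing constant of the full Fisher distribution on the sphere, kappa / (4 pi sinh kappa). The sketch below is not the Arason-Levi formula for the marginal (inclination-only) likelihood; it only shows the general trick of cancelling the exponential analytically before taking the logarithm, and the function names are hypothetical.

```python
import math


def log_sinh(kappa):
    """log(sinh(kappa)) for kappa > 0, computed without forming exp(kappa).

    Uses sinh(k) = exp(k) * (1 - exp(-2k)) / 2, so the large exponential
    appears only as the additive term kappa.
    """
    return kappa + math.log1p(-math.exp(-2.0 * kappa)) - math.log(2.0)


def fisher_log_density(theta, kappa):
    """Log-density (per unit solid angle) of the Fisher distribution at
    angular distance theta (radians) from the mean direction."""
    return (
        math.log(kappa)
        - math.log(4.0 * math.pi)
        - log_sinh(kappa)
        + kappa * math.cos(theta)
    )
```

With this form the log-density stays finite for very large kappa, whereas evaluating math.sinh(kappa) directly raises an OverflowError once kappa exceeds roughly 710.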
The recursive maximum likelihood proportion estimator: User's guide and test results
NASA Technical Reports Server (NTRS)
Vanrooy, D. L.
1976-01-01
Implementation of the recursive maximum likelihood proportion estimator is described. A user's guide to programs as they currently exist on the IBM 360/67 at LARS, Purdue is included, and test results on LANDSAT data are described. On Hill County data, the algorithm yields results comparable to the standard maximum likelihood proportion estimator.
New applications of maximum likelihood and Bayesian statistics in macromolecular crystallography.
McCoy, Airlie J
2002-10-01
Maximum likelihood methods are well known to macromolecular crystallographers as the methods of choice for isomorphous phasing and structure refinement. Recently, the use of maximum likelihood and Bayesian statistics has extended to the areas of molecular replacement and density modification, placing these methods on a stronger statistical foundation and making them more accurate and effective.
On the existence of maximum likelihood estimates for presence-only data
Hefley, Trevor J.; Hooten, Mevin B.
2015-01-01
It is important to identify conditions for which maximum likelihood estimates are unlikely to be identifiable from presence-only data. In data sets where the maximum likelihood estimates do not exist, penalized likelihood and Bayesian methods will produce coefficient estimates, but these are sensitive to the choice of estimation procedure and prior or penalty term. When sample size is small or it is thought that habitat preferences are strong, we propose a suite of estimation procedures researchers can consider using.
A polyphasic taxonomic approach in isolated strains of Cyanobacteria from thermal springs of Greece.
Bravakos, Panos; Kotoulas, Georgios; Skaraki, Katerina; Pantazidou, Adriani; Economou-Amilli, Athena
2016-05-01
Strains of Cyanobacteria isolated from mats of 9 thermal springs of Greece have been studied for their taxonomic evaluation. A polyphasic taxonomic approach was employed which included: morphological observations by light microscopy and scanning electron microscopy, maximum parsimony, maximum likelihood and Bayesian analysis of 16S rDNA sequences, secondary structural comparisons of 16S-23S rRNA Internal Transcribed Spacer sequences, and finally environmental data. The 17 cyanobacterial isolates formed a diverse group that contained filamentous, coccoid and heterocytous strains. These included representatives of the polyphyletic genera of Synechococcus and Phormidium, and the orders Oscillatoriales, Spirulinales, Chroococcales and Nostocales. After analysis, at least 6 new taxa at the genus level provide new evidence in the taxonomy of Cyanobacteria and highlight the abundant diversity of thermal spring environments with many potential endemic species or ecotypes. Copyright © 2016 Elsevier Inc. All rights reserved.
Mugleston, Joseph D; Song, Hojun; Whiting, Michael F
2013-12-01
The phylogenetic relationships of Tettigoniidae (katydids and bush-crickets) were inferred using molecular sequence data. Six genes (18S rDNA, 28S rDNA, Cytochrome Oxidase II, Histone 3, Tubulin Alpha I, and Wingless) were sequenced for 135 ingroup taxa representing 16 of the 19 extant katydid subfamilies. Five subfamilies (Tettigoniinae, Pseudophyllinae, Mecopodinae, Meconematinae, and Listroscelidinae) were found to be paraphyletic under various tree reconstruction methods (Maximum Likelihood, Bayesian Inference and Maximum Parsimony). Seven subfamilies - Conocephalinae, Hetrodinae, Hexacentrinae, Saginae, Phaneropterinae, Phyllophorinae, and Lipotactinae - were each recovered as well-supported monophyletic groups. We mapped the small and exposed thoracic auditory spiracle (a defining character of the subfamily Pseudophyllinae) and found it to be homoplasious. We also found the leaf-like wings of katydids have been derived independently in at least six lineages. Copyright © 2013 Elsevier Inc. All rights reserved.
Padial, José M; Grant, Taran; Frost, Darrel R
2014-06-26
Brachycephaloidea is a monophyletic group of frogs with more than 1000 species distributed throughout the New World tropics, subtropics, and Andean regions. Recently, the group has been the target of multiple molecular phylogenetic analyses, resulting in extensive changes in its taxonomy. Here, we test previous hypotheses of phylogenetic relationships for the group by combining available molecular evidence (sequences of 22 genes representing 431 ingroup and 25 outgroup terminals) and performing a tree-alignment analysis under the parsimony optimality criterion using the program POY. To elucidate the effects of alignment and optimality criterion on phylogenetic inferences, we also used the program MAFFT to obtain a similarity-alignment for analysis under both parsimony and maximum likelihood using the programs TNT and GARLI, respectively. Although all three analytical approaches agreed on numerous points, there was also extensive disagreement. Tree-alignment under parsimony supported the monophyly of the ingroup and the sister group relationship of the monophyletic marsupial frogs (Hemiphractidae), while maximum likelihood and parsimony analyses of the MAFFT similarity-alignment did not. All three methods differed with respect to the position of Ceuthomantis smaragdinus (Ceuthomantidae), with tree-alignment using parsimony recovering this species as the sister of Pristimantis + Yunganastes. All analyses rejected the monophyly of Strabomantidae and Strabomantinae as originally defined, and the tree-alignment analysis under parsimony further rejected the recently redefined Craugastoridae and Pristimantinae. Despite the greater emphasis in the systematics literature placed on the choice of optimality criterion for evaluating trees than on the choice of method for aligning DNA sequences, we found that the topological differences attributable to the alignment method were as great as those caused by the optimality criterion. 
Further, the optimal tree-alignment indicates that insertions and deletions occurred in twice as many aligned positions as implied by the optimal similarity-alignment, confirming previous findings that sequence turnover through insertion and deletion events plays a greater role in molecular evolution than indicated by similarity-alignments. Our results also provide a clear empirical demonstration of the different effects of wildcard taxa produced by missing data in parsimony and maximum likelihood analyses. Specifically, maximum likelihood analyses consistently (81% bootstrap frequency) provided spurious resolution despite a lack of evidence, whereas parsimony correctly depicted the ambiguity due to missing data by collapsing unsupported nodes. We provide a new taxonomy for the group that retains previously recognized Linnaean taxa except for Ceuthomantidae, Strabomantidae, and Strabomantinae. A phenotypically diagnosable superfamily is recognized formally as Brachycephaloidea, with the informal, unranked name terrarana retained as the standard common name for these frogs. We recognize three families within Brachycephaloidea that are currently diagnosable solely on molecular grounds (Brachycephalidae, Craugastoridae, and Eleutherodactylidae), as well as five subfamilies (Craugastorinae, Eleutherodactylinae, Holoadeninae, Phyzelaphryninae, and Pristimantinae) corresponding in large part to previous families and subfamilies. Our analyses upheld the monophyly of all tested genera, but we found numerous subgeneric taxa to be non-monophyletic and modified the taxonomy accordingly.
Cloning and sequence analysis of chitin synthase gene fragments of Demodex mites.
Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying
2012-10-01
To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi'an China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was successful only in three D. canis isolates and one D. brevis isolate out of the nine Demodex isolates. The obtained fragments were sequenced to be 339 bp for D. canis and 338 bp for D. brevis. The CHS gene sequence similarities between the three Xi'an D. canis isolates and one Japanese D. canis isolate ranged from 99.7% to 100.0%, and those between four D. canis isolates and one D. brevis isolate were 99.1%-99.4%. Phylogenetic trees based on maximum parsimony (MP) and maximum likelihood (ML) methods shared the same clusters, according with the traditional classification. Two open reading frames (ORFs) were identified in each CHS gene sequenced, and their corresponding amino acid sequences were located at the catalytic domain. The relatively conserved sequences could be deduced to be a CHS class A gene, which is associated with chitin synthesis in the integument of Demodex mites.
Cloning and sequence analysis of chitin synthase gene fragments of Demodex mites
Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying
2012-01-01
To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi’an China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was successful only in three D. canis isolates and one D. brevis isolate out of the nine Demodex isolates. The obtained fragments were sequenced to be 339 bp for D. canis and 338 bp for D. brevis. The CHS gene sequence similarities between the three Xi’an D. canis isolates and one Japanese D. canis isolate ranged from 99.7% to 100.0%, and those between four D. canis isolates and one D. brevis isolate were 99.1%–99.4%. Phylogenetic trees based on maximum parsimony (MP) and maximum likelihood (ML) methods shared the same clusters, according with the traditional classification. Two open reading frames (ORFs) were identified in each CHS gene sequenced, and their corresponding amino acid sequences were located at the catalytic domain. The relatively conserved sequences could be deduced to be a CHS class A gene, which is associated with chitin synthesis in the integument of Demodex mites. PMID:23024043
A novel gammaherpesvirus in a large flying fox (Pteropus vampyrus) with blepharitis.
Paige Brock, A; Cortés-Hinojosa, Galaxia; Plummer, Caryn E; Conway, Julia A; Roff, Shannon R; Childress, April L; Wellehan, James F X
2013-05-01
A novel gammaherpesvirus was identified in a large flying fox (Pteropus vampyrus) with conjunctivitis, blepharitis, and meibomianitis by nested polymerase chain reaction and sequencing. Polymerase chain reaction amplification and sequencing of 472 base pairs of the DNA-dependent DNA polymerase gene were used to identify a novel herpesvirus. Bayesian and maximum likelihood phylogenetic analyses indicated that the virus is a member of the genus Percavirus in the subfamily Gammaherpesvirinae. Additional research is needed regarding the association of this virus with conjunctivitis and other ocular pathology. This virus may be useful as a biomarker of stress and may be a useful model of virus recrudescence in Pteropus spp.
Alić, Nikola; Papen, George; Saperstein, Robert; Milstein, Laurence; Fainman, Yeshaiahu
2005-06-13
Exact signal statistics for fiber-optic links containing a single optical pre-amplifier are calculated and applied to sequence estimation for electronic dispersion compensation. The performance is evaluated and compared with results based on the approximate chi-square statistics. We show that detection in existing systems based on exact statistics can be improved relative to using a chi-square distribution for realistic filter shapes. In contrast, for high-spectral efficiency systems the difference between the two approaches diminishes, and performance tends to be less dependent on the exact shape of the filter used.
NASA Technical Reports Server (NTRS)
Walker, H. F.
1976-01-01
Likelihood equations determined by the two types of samples, which are necessary conditions for a maximum-likelihood estimate, are considered. These equations suggest certain successive-approximation iterative procedures for obtaining maximum-likelihood estimates. These are generalized steepest ascent (deflected gradient) procedures. It is shown that, with probability 1 as N sub 0 approaches infinity (regardless of the relative sizes of N sub 0 and N sub i, i=1,...,m), these procedures converge locally to the strongly consistent maximum-likelihood estimates whenever the step size is between 0 and 2. Furthermore, the value of the step size which yields optimal local convergence rates is bounded from below by a number which always lies between 1 and 2.
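One simple way to picture a step size between 0 and 2 is as a relaxation factor applied to a fixed-point update of the likelihood equations. The sketch below is an illustration under stated assumptions, not the procedure of the report: it estimates only the mixing proportion of two normal components whose means and standard deviations are taken as known, uses synthetic data, and all names are hypothetical. A step of 1 gives the plain fixed-point update; a step of 1.5 over-relaxes it.

```python
import math
import random

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def relaxed_proportion_mle(xs, step, p0=0.5, iters=200):
    """Estimate the weight p of the first of two known normal components.

    Each pass computes the fixed-point update G(p) of the likelihood equation
    and then moves p by step * (G(p) - p).
    """
    f1 = [normal_pdf(x, 0.0, 1.0) for x in xs]  # assumed known component 1
    f2 = [normal_pdf(x, 3.0, 1.0) for x in xs]  # assumed known component 2
    p = p0
    for _ in range(iters):
        g = sum(p * a / (p * a + (1.0 - p) * b) for a, b in zip(f1, f2)) / len(xs)
        p = p + step * (g - p)
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep p a valid proportion
    return p

# Synthetic sample: 30% from N(0, 1), 70% from N(3, 1).
random.seed(0)
xs = [random.gauss(0.0, 1.0) if random.random() < 0.3 else random.gauss(3.0, 1.0)
      for _ in range(2000)]

p_plain = relaxed_proportion_mle(xs, step=1.0)
p_over = relaxed_proportion_mle(xs, step=1.5)
```

Both step sizes settle on the same estimate in this toy setting, which is consistent with the local convergence statement above; the sketch says nothing about which step size is optimal.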
Computation of nonparametric convex hazard estimators via profile methods.
Jankowski, Hanna K; Wellner, Jon A
2009-05-01
This paper proposes a profile likelihood algorithm to compute the nonparametric maximum likelihood estimator of a convex hazard function. The maximisation is performed in two steps: First the support reduction algorithm is used to maximise the likelihood over all hazard functions with a given point of minimum (or antimode). Then it is shown that the profile (or partially maximised) likelihood is quasi-concave as a function of the antimode, so that a bisection algorithm can be applied to find the maximum of the profile likelihood, and hence also the global maximum. The new algorithm is illustrated using both artificial and real data, including lifetime data for Canadian males and females.
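The second step relies on the profile likelihood rising and then falling as a function of the antimode, so an interval-shrinking search can locate its maximum. The sketch below uses a ternary search as a simple stand-in for the bisection step and a toy profile function in place of the support reduction algorithm; both are assumptions made for illustration, and the names are hypothetical.

```python
def maximise_quasi_concave(profile, lo, hi, tol=1e-8):
    """Locate the maximiser of a function that strictly rises then falls on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if profile(m1) < profile(m2):
            lo = m1  # the maximum cannot lie to the left of m1
        else:
            hi = m2  # the maximum cannot lie to the right of m2
    return 0.5 * (lo + hi)

def toy_profile(antimode):
    """Toy stand-in for a profile log-likelihood, peaked at 2.5."""
    return -abs(antimode - 2.5) ** 1.5

antimode_hat = maximise_quasi_concave(toy_profile, 0.0, 10.0)
```

In a real application each call to the profile function would itself be an inner maximisation over hazard functions with the given antimode, so the number of outer evaluations matters; the search above needs on the order of log((hi - lo) / tol) iterations.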
Wellehan, James F.X.; Pessier, Allan P.; Archer, Linda L.; Childress, April L.; Jacobson, Elliott R.; Tesh, Robert B.
2012-01-01
Rhabdoviruses infect a variety of hosts, including non-avian reptiles. Consensus PCR techniques were used to obtain partial RNA-dependent RNA polymerase gene sequence from five rhabdoviruses of South American lizards; Marco, Chaco, Timbo, Sena Madureira, and a rhabdovirus from a caiman lizard (Dracaena guianensis). The caiman lizard rhabdovirus formed inclusions in erythrocytes, which may be a route for infecting hematophagous insects. This is the first information on behavior of a rhabdovirus in squamates. We also obtained sequence from two rhabdoviruses of Australian lizards, confirming previous Charleville virus sequence and finding that, unlike a previous sequence report but in agreement with serologic reports, Almpiwar virus is clearly distinct from Charleville virus. Bayesian and maximum likelihood phylogenetic analysis revealed that most known rhabdoviruses of squamates cluster in the Almpiwar subgroup. The exception is Marco virus, which is found in the Hart Park group. PMID:22397930
Lampe, David J; Witherspoon, David J; Soto-Adames, Felipe N; Robertson, Hugh M
2003-04-01
We report the isolation and sequencing of genomic copies of mariner transposons involved in recent horizontal transfers into the genomes of the European earwig, Forficula auricularia; the European honey bee, Apis mellifera; the Mediterranean fruit fly, Ceratitis capitata; and a blister beetle, Epicauta funebris, insects from four different orders. These elements are in the mellifera subfamily and are the second documented example of full-length mariner elements involved in this kind of phenomenon. We applied maximum likelihood methods to the coding sequences and determined that the copies in each genome were evolving neutrally, whereas reconstructed ancestral coding sequences appeared to be under selection, which strengthens our previous hypothesis that the primary selective constraint on mariner sequence evolution is the act of horizontal transfer between genomes.
A maximum likelihood map of chromosome 1.
Rao, D C; Keats, B J; Lalouel, J M; Morton, N E; Yee, S
1979-01-01
Thirteen loci are mapped on chromosome 1 from genetic evidence. The maximum likelihood map presented permits confirmation that Scianna (SC) and a fourteenth locus, phenylketonuria (PKU), are on chromosome 1, although the location of the latter on the PGM1-AMY segment is uncertain. Eight other controversial genetic assignments are rejected, providing a practical demonstration of the resolution which maximum likelihood theory brings to mapping. PMID:293128
ERIC Educational Resources Information Center
Mahmud, Jumailiyah; Sutikno, Muzayanah; Naga, Dali S.
2016-01-01
The aim of this study is to determine the difference in variance between maximum likelihood and expected a posteriori estimation methods as a function of the number of test items in an aptitude test. The variance reflects the accuracy achieved by both maximum likelihood and Bayes estimation methods. The test consists of three subtests, each with 40 multiple-choice…
Maximum likelihood estimation of signal-to-noise ratio and combiner weight
NASA Technical Reports Server (NTRS)
Kalson, S.; Dolinar, S. J.
1986-01-01
An algorithm for estimating signal to noise ratio and combiner weight parameters for a discrete time series is presented. The algorithm is based upon the joint maximum likelihood estimate of the signal and noise power. The discrete-time series are the sufficient statistics obtained after matched filtering of a biphase modulated signal in additive white Gaussian noise, before maximum likelihood decoding is performed.
Changren Weng; Thomas L. Kubisiak; C. Dana Nelson; James P. Geaghan; Michael Stine
1999-01-01
Single marker regression and single marker maximum likelihood estimation were tried to detect quantitative trait loci (QTLs) controlling the early height growth of longleaf pine and slash pine using a ((longleaf pine x slash pine) x slash pine) BC1 population consisting of 83 progeny. Maximum likelihood estimation was found to be more powerful than regression and could...
PyEvolve: a toolkit for statistical modelling of molecular evolution.
Butterfield, Andrew; Vedagiri, Vivek; Lang, Edward; Lawrence, Cath; Wakefield, Matthew J; Isaev, Alexander; Huttley, Gavin A
2004-01-05
Examining the distribution of variation has proven an extremely profitable technique in the effort to identify sequences of biological significance. Most approaches in the field, however, evaluate only the conserved portions of sequences - ignoring the biological significance of sequence differences. A suite of sophisticated likelihood based statistical models from the field of molecular evolution provides the basis for extracting the information from the full distribution of sequence variation. The number of different problems to which phylogeny-based maximum likelihood calculations can be applied is extensive. Available software packages that can perform likelihood calculations suffer from a lack of flexibility and scalability, or employ error-prone approaches to model parameterisation. Here we describe the implementation of PyEvolve, a toolkit for the application of existing, and development of new, statistical methods for molecular evolution. We present the object architecture and design schema of PyEvolve, which includes an adaptable multi-level parallelisation schema. The approach for defining new methods is illustrated by implementing a novel dinucleotide model of substitution that includes a parameter for mutation of methylated CpG's, which required 8 lines of standard Python code to define. Benchmarking was performed using either a dinucleotide or codon substitution model applied to an alignment of BRCA1 sequences from 20 mammals, or a 10 species subset. Up to five-fold parallel performance gains over serial were recorded. Compared to leading alternative software, PyEvolve exhibited significantly better real world performance for parameter rich models with a large data set, reducing the time required for optimisation from approximately 10 days to approximately 6 hours. PyEvolve provides flexible functionality that can be used either for statistical modelling of molecular evolution, or the development of new methods in the field. 
The toolkit can be used interactively or by writing and executing scripts. The toolkit uses efficient processes for specifying the parameterisation of statistical models, and implements numerous optimisations that make highly parameter rich likelihood functions solvable within hours on multi-cpu hardware. PyEvolve can be readily adapted in response to changing computational demands and hardware configurations to maximise performance. PyEvolve is released under the GPL and can be downloaded from http://cbis.anu.edu.au/software.
Shahin, Arwa; Smulders, Marinus J. M.; van Tuyl, Jaap M.; Arens, Paul; Bakker, Freek T.
2014-01-01
Next Generation Sequencing (NGS) may enable estimating relationships among genotypes using allelic variation of multiple nuclear genes simultaneously. We explored the potential and caveats of this strategy in four genetically distant Lilium cultivars to estimate their genetic divergence from transcriptome sequences using three approaches: POFAD (Phylogeny of Organisms from Allelic Data, uses allelic information of sequence data), RAxML (Randomized Accelerated Maximum Likelihood, tree building based on concatenated consensus sequences) and Consensus Network (constructing a network summarizing among gene tree conflicts). Twenty six gene contigs were chosen based on the presence of orthologous sequences in all cultivars, seven of which also had an orthologous sequence in Tulipa, used as out-group. The three approaches generated the same topology. Although the resolution offered by these approaches is high, in this case there was no extra benefit in using allelic information. We conclude that these 26 genes can be widely applied to construct a species tree for the genus Lilium. PMID:25368628
Maximum likelihood estimation of finite mixture model for economic data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-06-01
A finite mixture model is a mixture model of finite dimension. Such models provide a natural representation of heterogeneity in a finite number of latent classes. Finite mixture models are also known as latent class models or unsupervised learning models. Recently, fitting finite mixture models by maximum likelihood estimation has drawn considerable attention from statisticians, mainly because maximum likelihood estimation is a powerful statistical method that provides consistent estimates as the sample size increases to infinity. In the present paper, maximum likelihood estimation is therefore used to fit a finite mixture model in order to explore the relationship between nonlinear economic data. A two-component normal mixture model is fitted by maximum likelihood estimation to investigate the relationship between stock market price and rubber price for the sampled countries. The results show a negative effect between rubber price and stock market price for Malaysia, Thailand, the Philippines and Indonesia.
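A two-component normal mixture is commonly fitted by maximum likelihood with the expectation-maximization (EM) algorithm. The abstract does not say which optimiser the authors used, so the following is only a minimal univariate sketch on synthetic data; the starting values, sample and function names are assumptions made for illustration.

```python
import math
import random

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def fit_two_normal_mixture(xs, iters=200):
    """EM for w * N(mu1, sd1) + (1 - w) * N(mu2, sd2)."""
    n = len(xs)
    ordered = sorted(xs)
    mu1, mu2 = ordered[n // 4], ordered[3 * n // 4]  # crude start: quartiles
    centre = sum(xs) / n
    sd1 = sd2 = math.sqrt(sum((x - centre) ** 2 for x in xs) / n)
    w = 0.5
    for _ in range(iters):
        # E-step: posterior probability that each point came from component 1.
        resp = []
        for x in xs:
            a = w * normal_pdf(x, mu1, sd1)
            b = (1.0 - w) * normal_pdf(x, mu2, sd2)
            resp.append(a / (a + b))
        # M-step: weighted maximum likelihood updates.
        n1 = sum(resp)
        n2 = n - n1
        w = n1 / n
        mu1 = sum(r * x for r, x in zip(resp, xs)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / n2
        sd1 = math.sqrt(sum(r * (x - mu1) ** 2 for r, x in zip(resp, xs)) / n1)
        sd2 = math.sqrt(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, xs)) / n2)
    return w, (mu1, sd1), (mu2, sd2)

# Synthetic sample: 40% from N(-2, 1) and 60% from N(3, 1.5).
random.seed(42)
sample = [random.gauss(-2.0, 1.0) if random.random() < 0.4 else random.gauss(3.0, 1.5)
          for _ in range(1500)]
weight, comp1, comp2 = fit_two_normal_mixture(sample)
```

EM only guarantees a local maximum of the likelihood, so in practice several starting values are usually tried and the fit with the highest likelihood is kept.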
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1975-01-01
New results and insights concerning a previously published iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions were discussed. It was shown that the procedure converges locally to the consistent maximum likelihood estimate as long as a specified parameter is bounded between two limits. Bound values were given to yield optimal local convergence.
NASA Technical Reports Server (NTRS)
Hoffbeck, Joseph P.; Landgrebe, David A.
1994-01-01
Many analysis algorithms for high-dimensional remote sensing data require that the remotely sensed radiance spectra be transformed to approximate reflectance to allow comparison with a library of laboratory reflectance spectra. In maximum likelihood classification, however, the remotely sensed spectra are compared to training samples, thus a transformation to reflectance may or may not be helpful. The effect of several radiance-to-reflectance transformations on maximum likelihood classification accuracy is investigated in this paper. We show that the empirical line approach, LOWTRAN7, flat-field correction, single spectrum method, and internal average reflectance are all non-singular affine transformations, and that non-singular affine transformations have no effect on discriminant analysis feature extraction and maximum likelihood classification accuracy. (An affine transformation is a linear transformation with an optional offset.) Since the Atmosphere Removal Program (ATREM) and the log residue method are not affine transformations, experiments with Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data were conducted to determine the effect of these transformations on maximum likelihood classification accuracy. The average classification accuracy of the data transformed by ATREM and the log residue method was slightly less than the accuracy of the original radiance data. Since the radiance-to-reflectance transformations allow direct comparison of remotely sensed spectra with laboratory reflectance spectra, they can be quite useful in labeling the training samples required by maximum likelihood classification, but these transformations have only a slight effect or no effect at all on discriminant analysis and maximum likelihood classification accuracy.
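The invariance claim can be checked numerically: if both the training samples and the pixels to be classified pass through the same non-singular affine map, a Gaussian maximum likelihood classifier assigns the same labels. The sketch below does this in two dimensions with synthetic data, equal class priors and a hypothetical transformation matrix; it illustrates the property and is not the authors' experiment.

```python
import math
import random

def fit_gaussian(points):
    """ML estimates of the mean and 2x2 covariance of a list of (x, y) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)

def log_density(p, model):
    """Gaussian log-density up to the class-independent constant -log(2*pi)."""
    (mx, my), (sxx, sxy, syy) = model
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mx, p[1] - my
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (math.log(det) + maha)

def classify(p, models):
    """Index of the class with the largest likelihood (equal priors assumed)."""
    return max(range(len(models)), key=lambda k: log_density(p, models[k]))

def affine(p):
    """Hypothetical non-singular map: matrix [[2, 1], [0.5, 3]] plus an offset."""
    return (2.0 * p[0] + 1.0 * p[1] + 10.0, 0.5 * p[0] + 3.0 * p[1] - 4.0)

random.seed(1)
train = [
    [(random.gauss(0.0, 1.0), random.gauss(0.0, 2.0)) for _ in range(200)],
    [(random.gauss(2.0, 1.5), random.gauss(1.0, 1.0)) for _ in range(200)],
]
pixels = [(random.uniform(-3.0, 5.0), random.uniform(-4.0, 4.0)) for _ in range(300)]

models_raw = [fit_gaussian(c) for c in train]
models_aff = [fit_gaussian([affine(p) for p in c]) for c in train]
labels_raw = [classify(p, models_raw) for p in pixels]
labels_aff = [classify(affine(p), models_aff) for p in pixels]
```

The two label lists coincide because the map leaves each Mahalanobis distance unchanged and shifts every class's log-determinant term by the same constant, so the ranking of the class likelihoods is preserved.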
The complete chloroplast genome sequence of Euonymus japonicus (Celastraceae).
Choi, Kyoung Su; Park, SeonJoo
2016-09-01
The complete chloroplast (cp) genome sequence of Euonymus japonicus, the first sequenced in the genus Euonymus, is reported in this study. The total length was 157 637 bp, containing a pair of 26 678 bp inverted repeat (IR) regions, which were separated by a small single copy (SSC) region and a large single copy (LSC) region of 18 340 bp and 85 941 bp, respectively. This genome contains 107 unique genes, including 74 coding genes, four rRNA genes, and 29 tRNA genes. Seventeen genes of E. japonicus contain introns, of which three genes (clpP, ycf3, and rps12) include two introns. The maximum likelihood (ML) phylogenetic analysis revealed that E. japonicus was closely related to Manihot and Populus.
Chen, Jing; Jiang, Li-Yun; Qiao, Ge-Xia
2011-01-01
The taxonomic position of Hormaphis similibetulae Qiao & Zhang, 2004 has been reexamined. The phylogenetic position of Hormaphis similibetulae was inferred by maximum parsimony, maximum likelihood and Bayesian analyses on the basis of partial nuclear elongation factor-1α and mitochondrial tRNA leucine/cytochrome oxidase II sequences. The results showed that this species fell into the clade of Hamamelistes species, occupying a basal position, and was clearly distinct from other Hormaphis species. A closer relationship between Hormaphis similibetulae and Hamamelistes species was also revealed by life cycle analysis. Therefore, we conclude that Hormaphis similibetulae should be transferred to the genus Hamamelistes as Hamamelistes similibetulae (Qiao & Zhang), comb. n. PMID:21852935
SubspaceEM: A Fast Maximum-a-posteriori Algorithm for Cryo-EM Single Particle Reconstruction
Dvornek, Nicha C.; Sigworth, Fred J.; Tagare, Hemant D.
2015-01-01
Single particle reconstruction methods based on the maximum-likelihood principle and the expectation-maximization (E–M) algorithm are popular because of their ability to produce high resolution structures. However, these algorithms are computationally very expensive, requiring a network of computational servers. To overcome this computational bottleneck, we propose a new mathematical framework for accelerating maximum-likelihood reconstructions. The speedup is by orders of magnitude and the proposed algorithm produces similar quality reconstructions compared to the standard maximum-likelihood formulation. Our approach uses subspace approximations of the cryo-electron microscopy (cryo-EM) data and projection images, greatly reducing the number of image transformations and comparisons that are computed. Experiments using simulated and actual cryo-EM data show that speedup in overall execution time compared to traditional maximum-likelihood reconstruction reaches factors of over 300. PMID:25839831
ILP-based maximum likelihood genome scaffolding
2014-01-01
Background Interest in de novo genome assembly has been renewed in the past decade due to rapid advances in high-throughput sequencing (HTS) technologies which generate relatively short reads resulting in highly fragmented assemblies consisting of contigs. Additional long-range linkage information is typically used to orient, order, and link contigs into larger structures referred to as scaffolds. Due to library preparation artifacts and erroneous mapping of reads originating from repeats, scaffolding remains a challenging problem. In this paper, we provide a scalable scaffolding algorithm (SILP2) employing a maximum likelihood model capturing read mapping uncertainty and/or non-uniformity of contig coverage which is solved using integer linear programming. A Non-Serial Dynamic Programming (NSDP) paradigm is applied to render our algorithm useful in the processing of larger mammalian genomes. To compare scaffolding tools, we employ novel quantitative metrics in addition to the extant metrics in the field. We have also expanded the set of experiments to include scaffolding of low-complexity metagenomic samples. Results SILP2 achieves better scalability through a more efficient NSDP algorithm than the previous release of SILP. The results show that SILP2 compares favorably to previous methods OPERA and MIP in both scalability and accuracy for scaffolding single genomes of up to human size, and significantly outperforms them on scaffolding low-complexity metagenomic samples. Conclusions Equipped with NSDP, SILP2 is able to scaffold large mammalian genomes, resulting in the longest and most accurate scaffolds. The ILP formulation for the maximum likelihood model is shown to be flexible enough to handle metagenomic samples. PMID:25253180
NASA Technical Reports Server (NTRS)
Scholz, D.; Fuhs, N.; Hixson, M.
1979-01-01
The overall objective of this study was to apply and evaluate several of the currently available classification schemes for crop identification. The approaches examined were: (1) a per point Gaussian maximum likelihood classifier, (2) a per point sum of normal densities classifier, (3) a per point linear classifier, (4) a per point Gaussian maximum likelihood decision tree classifier, and (5) a texture sensitive per field Gaussian maximum likelihood classifier. Three agricultural data sets were used in the study: areas from Fayette County, Illinois, and Pottawattamie and Shelby Counties in Iowa. The segments were located in two distinct regions of the Corn Belt to sample variability in soils, climate, and agricultural practices.
Kang, Hae Ji; Bennett, Shannon N.; Dizney, Laurie; Sumibcay, Laarni; Arai, Satoru; Ruedas, Luis A.; Song, Jin-Won; Yanagihara, Richard
2009-01-01
A genetically distinct hantavirus, designated Oxbow virus (OXBV), was detected in tissues of an American shrew mole (Neurotrichus gibbsii), captured in Gresham, Oregon, in September 2003. Pairwise analysis of full-length S- and M- and partial L-segment nucleotide and amino acid sequences of OXBV indicated low sequence similarity with rodent-borne hantaviruses. Phylogenetic analyses using maximum-likelihood and Bayesian methods, and host-parasite evolutionary comparisons, showed that OXBV and Asama virus, a hantavirus recently identified from the Japanese shrew mole (Urotrichus talpoides), were related to soricine shrew-borne hantaviruses from North America and Eurasia, respectively, suggesting parallel evolution associated with cross-species transmission. PMID:19394994
Saarela, Jeffery M.; Wysocki, William P.; Barrett, Craig F.; Soreng, Robert J.; Davis, Jerrold I.; Clark, Lynn G.; Kelchner, Scot A.; Pires, J. Chris; Edger, Patrick P.; Mayfield, Dustin R.; Duvall, Melvin R.
2015-01-01
Whole plastid genomes are being sequenced rapidly from across the green plant tree of life, and phylogenetic analyses of these are increasing resolution and support for relationships that have varied among or been unresolved in earlier single- and multi-gene studies. Pooideae, the cool-season grass lineage, is the largest of the 12 grass subfamilies and includes important temperate cereals, turf grasses and forage species. Although numerous studies of the phylogeny of the subfamily have been undertaken, relationships among some ‘early-diverging’ tribes conflict among studies, and some relationships among subtribes of Poeae have not yet been resolved. To address these issues, we newly sequenced 25 whole plastomes, which showed rearrangements typical of Poaceae. These plastomes represent 9 tribes and 11 subtribes of Pooideae, and were analysed with 20 existing plastomes for the subfamily. Maximum likelihood (ML), maximum parsimony (MP) and Bayesian inference (BI) robustly resolve most deep relationships in the subfamily. Complete plastome data provide increased nodal support compared with protein-coding data alone at nodes that are not maximally supported. Following the divergence of Brachyelytrum, Phaenospermateae, Brylkinieae–Meliceae and Ampelodesmeae–Stipeae are the successive sister groups of the rest of the subfamily. Ampelodesmeae are nested within Stipeae in the plastome trees, consistent with its hybrid origin between a phaenospermatoid and a stipoid grass (the maternal parent). The core Pooideae are strongly supported and include Brachypodieae, a Bromeae–Triticeae clade and Poeae. Within Poeae, a novel sister group relationship between Phalaridinae and Torreyochloinae is found, and the relative branching order of this clade and Aveninae, with respect to an Agrostidinae–Brizinae clade, are discordant between MP and ML/BI trees. 
Maximum likelihood and Bayesian analyses strongly support Airinae and Holcinae as the successive sister groups of a Dactylidinae–Loliinae clade. PMID:25940204
Laguardia-Nascimento, Mateus; de Oliveira, Ana Paula Ferreira; Fernandes, Fernanda Rodas Pires; Rivetti, Anselmo Vasconcelos; Camargos, Marcelo Fernandes; Fonseca Júnior, Antônio Augusto
2017-12-01
Parapoxviruses are zoonotic viruses that infect cattle, goats and sheep; there have also been reports of infections in camels, domestic cats and seals. The objective of this report was to describe a case of vesicular disease caused by pseudocowpox virus (PCPV) in water buffalo (Bubalus bubalis) in Brazil. Sixty buffalo less than 6 months old exhibited ulcers and widespread peeling of the tongue epithelium. There were no cases of vesicular disease in pigs or horses on the same property. Samples were analysed by PCR and sequencing. A phylogeny was reconstructed in MEGA 7.01 from major envelope protein (B2L) sequences using the Tamura three-parameter nucleotide substitution model and the maximum likelihood and neighbor-joining methods, both with 1000 bootstrap replicates. The genetic distance between the groups was analysed in MEGA using the maximum composite likelihood model. The rate variation among sites was modeled using a gamma distribution. The presence of PCPV in the buffalo herd could be demonstrated in epithelium and serum. The minimum genetic distance between the isolated PCPV strain (262-2016) and orf virus and bovine papular stomatitis virus was 6.7% and 18.4%, respectively. The maximum genetic distance calculated was 4.6% when compared with a PCPV detected in a camel. Conclusions/Clinical Importance: The peculiar position of the isolated strain in the phylogenetic trees does not necessarily indicate a different kind of PCPV that infects buffalo. More samples from cattle and buffalo in Brazil must be sequenced and compared to verify if PCPV from buffalo are genetically different from samples derived from cattle.
Murdock, Andrew G
2008-05-01
Closely related outgroups are optimal for rooting phylogenetic trees; however, such ideal outgroups are not always available. A phylogeny of the marattioid ferns (Marattiaceae), an ancient lineage with no close relatives, was reconstructed using nucleotide sequences of multiple chloroplast regions (rps4 + rps4-trnS spacer, trnS-trnG spacer + trnG intron, rbcL, atpB), from 88 collections, selected to cover the broadest possible range of morphologies and geographic distributions within the extant taxa. Because marattioid ferns are phylogenetically isolated from other lineages, and internal branches are relatively short, rooting was problematic. Root placement was strongly affected by long-branch attraction under maximum parsimony and by model choice under maximum likelihood. A multifaceted approach to rooting was employed to isolate the sources of bias and produce a consensus root position. In a statistical comparison of all possible root positions with three different outgroups, most root positions were not significantly less optimal than the maximum likelihood root position, including the consensus root position. This phylogeny has several important taxonomic implications for marattioid ferns: Marattia in the broad sense is paraphyletic; the Hawaiian endemic Marattia douglasii is most closely related to tropical American taxa; and Angiopteris is monophyletic only if Archangiopteris and Macroglossum are included.
Wang, Yan; Liu, Guo-Hua; Li, Jia-Yuan; Xu, Min-Jun; Ye, Yong-Gang; Zhou, Dong-Hui; Song, Hui-Qun; Lin, Rui-Qing; Zhu, Xing-Quan
2013-02-01
This study examined sequence variation in three mitochondrial DNA (mtDNA) regions, namely cytochrome c oxidase subunit 1 (cox1), NADH dehydrogenase subunit 5 (nad5) and cytochrome b (cytb), among Trichuris ovis isolates from different hosts in Guangdong Province, China. A portion of the cox1 (pcox1), nad5 (pnad5) and cytb (pcytb) genes was amplified separately from individual whipworms by PCR, and was subjected to sequencing from both directions. The size of the sequences of pcox1, pnad5 and pcytb was 618, 240 and 464 bp, respectively. Although the intra-specific sequence variations within T. ovis were 0-0.8% for pcox1, 0-0.8% for pnad5 and 0-1.9% for pcytb, the inter-specific sequence differences among members of the genus Trichuris were significantly higher, being 24.3-26.5% for pcox1, 33.7-56.4% for pnad5 and 24.8-26.1% for pcytb, respectively. Phylogenetic analyses using combined sequences of pcox1, pnad5 and pcytb, with three different computational algorithms (maximum likelihood, maximum parsimony and Bayesian inference), indicated that all of the T. ovis isolates grouped together with high statistical support. These findings demonstrated the existence of intra-specific variation in mtDNA sequences among T. ovis isolates from different hosts, and have implications for studying molecular epidemiology and population genetics of T. ovis.
Huang, Chih-Wei; Lin, Si-Min; Wu, Wen-Lung
2016-07-01
The first mitochondrial genome sequences of Aegista and Dolicheulota belonging to Bradybaenidae are described in this report. Mitogenomic sequences were generated from Illumina paired-end sequencing. The complete mitogenome of Aegista diversifamilia was 14,039 bp in length and the nearly complete mitogenome of Dolicheulota formosensis was 14,237 bp. Both mitogenomes consisted of 13 protein-coding genes (PCGs), 2 ribosomal RNA genes, and 22 transfer RNA genes. Most genes overlapped with neighboring genes, with overlapping regions ranging from 2 to 64 bp in A. diversifamilia and from 1 to 45 bp in D. formosensis. A novel gene arrangement, tRNA-Tyr-ND3-tRNA-Trp, was identified in A. diversifamilia, whereas D. formosensis showed identical gene order to other Bradybaenidae mitogenomes. The maximum likelihood phylogenetic tree suggested Aegista as a sister clade to Euhadra and Dolicheulota. Bradybaenidae is a monophyletic sister clade to Camaenidae.
Espin‐Garcia, Osvaldo; Craiu, Radu V.
2017-01-01
ABSTRACT We evaluate two‐phase designs to follow‐up findings from genome‐wide association study (GWAS) when the cost of regional sequencing in the entire cohort is prohibitive. We develop novel expectation‐maximization‐based inference under a semiparametric maximum likelihood formulation tailored for post‐GWAS inference. A GWAS‐SNP (where SNP is single nucleotide polymorphism) serves as a surrogate covariate in inferring association between a sequence variant and a normally distributed quantitative trait (QT). We assess test validity and quantify efficiency and power of joint QT‐SNP‐dependent sampling and analysis under alternative sample allocations by simulations. Joint allocation balanced on SNP genotype and extreme‐QT strata yields significant power improvements compared to marginal QT‐ or SNP‐based allocations. We illustrate the proposed method and evaluate the sensitivity of sample allocation to sampling variation using data from a sequencing study of systolic blood pressure. PMID:29239496
Armstrong, Miles R; Husmeier, Dirk; Phillips, Mark S; Blok, Vivian C
2007-06-01
The discovery that the potato cyst nematode Globodera pallida has a multipartite mitochondrial DNA (mtDNA) composed, at least in part, of six small circular mtDNAs (scmtDNAs) raised a number of questions concerning the population-level processes that might act on such a complex genome. Here we report our observations on the distribution of some scmtDNAs among a sample of European and South American G. pallida populations. The occurrence of sequence variants of scmtDNA IV in population P4A from South America, and that particular sequence variants are common to the individuals within a single cyst, is described. Evidence for recombination of sequence variants of scmtDNA IV in P4A is also reported. The mosaic structure of P4A scmtDNA IV sequences was revealed using several detection methods and recombination breakpoints were independently detected by maximum likelihood and Bayesian MCMC methods.
Applications of non-standard maximum likelihood techniques in energy and resource economics
NASA Astrophysics Data System (ADS)
Moeltner, Klaus
Two important types of non-standard maximum likelihood techniques, Simulated Maximum Likelihood (SML) and Pseudo-Maximum Likelihood (PML), have only recently found consideration in the applied economic literature. The objective of this thesis is to demonstrate how these methods can be successfully employed in the analysis of energy and resource models. Chapter I focuses on SML. It constitutes the first application of this technique in the field of energy economics. The framework is as follows: Surveys on the cost of power outages to commercial and industrial customers usually capture multiple observations on the dependent variable for a given firm. The resulting pooled data set is censored and exhibits cross-sectional heterogeneity. We propose a model that addresses these issues by allowing regression coefficients to vary randomly across respondents and by using the Geweke-Hajivassiliou-Keane simulator and Halton sequences to estimate high-order cumulative distribution terms. This adjustment requires the use of SML in the estimation process. Our framework allows for a more comprehensive analysis of outage costs than existing models, which rely on the assumptions of parameter constancy and cross-sectional homogeneity. Our results strongly reject both of these restrictions. The central topic of the second Chapter is the use of PML, a robust estimation technique, in count data analysis of visitor demand for a system of recreation sites. PML has been popular with researchers in this context, since it guards against many types of mis-specification errors. We demonstrate, however, that estimation results will generally be biased even if derived through PML if the recreation model is based on aggregate, or zonal data. To countervail this problem, we propose a zonal model of recreation that captures some of the underlying heterogeneity of individual visitors by incorporating distributional information on per-capita income into the aggregate demand function. 
This adjustment eliminates the unrealistic constraint of constant income across zonal residents, and thus reduces the risk of aggregation bias in estimated macro-parameters. The corrected aggregate specification reinstates the applicability of PML. It also increases model efficiency, and allows for the generation of welfare estimates for population subgroups.
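The Halton sequences mentioned in Chapter I are a standard quasi-random construction and can be sketched in a few lines. This is a minimal sketch of the generic radical-inverse construction, not the thesis's code; the function names and the choice of bases 2 and 3 are assumptions for illustration.

```python
def radical_inverse(index, base):
    """Reflect the base-`base` digits of `index` about the radix point."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result

def halton(n_points, bases=(2, 3)):
    """Return the first n_points Halton points, one coordinate per (coprime) base."""
    return [tuple(radical_inverse(i, b) for b in bases)
            for i in range(1, n_points + 1)]

points = halton(4)
for p in points:
    print(p)
```

In simulated maximum likelihood, draws like these typically stand in for pseudo-random uniforms when averaging over the random coefficients, because they cover the unit cube more evenly for a given number of draws.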
Combining Ratio Estimation for Low Density Parity Check (LDPC) Coding
NASA Technical Reports Server (NTRS)
Mahmoud, Saad; Hi, Jianjun
2012-01-01
The Low Density Parity Check (LDPC) code decoding algorithm makes use of a scaled received signal derived from maximizing the log-likelihood ratio of the received signal. The scaling factor (often called the combining ratio) in an AWGN channel is a ratio between signal amplitude and noise variance. Accurately estimating this ratio has been shown to yield as much as 0.6 dB of decoding performance gain. This presentation briefly describes three methods for estimating the combining ratio: a Pilot-Guided estimation method, a Blind estimation method, and a Simulation-Based Look-Up table. For the Pilot-Guided estimation method, the maximum likelihood estimate of signal amplitude is the mean inner product of the received sequence and the known sequence, the attached synchronization marker (ASM), and signal variance is the difference of the mean of the squared received sequence and the square of the signal amplitude. This method has the advantage of simplicity at the expense of latency, since several frames' worth of ASMs are needed. The Blind estimation method's maximum likelihood estimator is the average of the product of the received signal with the hyperbolic tangent of the product of the combining ratio and the received signal. The root of this equation can be determined by an iterative binary search between 0 and 1 after normalizing the received sequence. This method has the benefit of requiring one frame of data to estimate the combining ratio, which is good for faster changing channels compared to the previous method; however, it is computationally expensive. The final method uses a look-up table based on prior simulated results to determine signal amplitude and noise variance. In this method the received mean signal strength is controlled to a constant soft decision value. The magnitude of the deviation is averaged over a predetermined number of samples. 
This value is referenced in a look up table to determine the combining ratio that prior simulation associated with the average magnitude of the deviation. This method is more complicated than the Pilot-Guided Method due to the gain control circuitry, but does not have the real-time computation complexity of the Blind Estimation method. Each of these methods can be used to provide an accurate estimation of the combining ratio, and the final selection of the estimation method depends on other design constraints.
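The Pilot-Guided estimator described above reduces to two sample averages, which the following minimal sketch makes explicit. It assumes a known marker of ±1 symbols and takes the combining ratio as amplitude divided by variance, as the abstract states; the function name and the sample values are hypothetical, and any additional scaling used in a real decoder is not covered here.

```python
def pilot_guided_estimate(received, known):
    """Estimate amplitude, variance and combining ratio from known marker symbols."""
    n = len(received)
    # Amplitude: mean inner product of the received and known sequences.
    amplitude = sum(r * s for r, s in zip(received, known)) / n
    # Variance: mean of the squared received sequence minus the squared amplitude.
    variance = sum(r * r for r in received) / n - amplitude ** 2
    return amplitude, variance, amplitude / variance

# Hypothetical marker symbols and received samples (amplitude 2 plus small offsets).
known = [1, -1, 1, -1]
received = [2.5, -1.5, 1.5, -2.5]
amplitude, variance, ratio = pilot_guided_estimate(received, known)
print(amplitude, variance, ratio)
```

Averaging over more marker symbols tightens both estimates, which is the source of the latency cost noted in the abstract.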
Cramer-Rao Bound, MUSIC, and Maximum Likelihood. Effects of Temporal Phase Difference
1990-11-01
Technical Report 1373, November 1990. Cramer-Rao Bound, MUSIC, and Maximum Likelihood: Effects of Temporal Phase Difference. C. V. Tran. ... MUSIC, and Maximum Likelihood (ML) asymptotic variances corresponding to the two-source direction-of-arrival estimation where sources were modeled as ... |p| = 1.00, SNR = 20 dB ... MUSIC for two equipowered signals impinging on a 5-element ULA (a) |p| = 0.50, SNR
Are humans the initial source of canine mange?
Andriantsoanirina, Valérie; Fang, Fang; Ariey, Frédéric; Izri, Arezki; Foulet, Françoise; Botterel, Françoise; Bernigaud, Charlotte; Chosidow, Olivier; Huang, Weiyi; Guillot, Jacques; Durand, Rémy
2016-03-25
Scabies, or mange as it is called in animals, is an ectoparasitic contagious infestation caused by the mite Sarcoptes scabiei. Sarcoptic mange is an important veterinary disease leading to significant morbidity and mortality in wild and domestic animals. A widely accepted hypothesis, though never substantiated by factual data, suggests that humans were the initial source of the animal contamination. In this study we performed phylogenetic analyses of populations of S. scabiei from humans and from canids to test the hypothesis of a human origin of the mites infecting domestic dogs. Mites from dogs and foxes were obtained from three French sites and from other countries. Part of the cytochrome c oxidase subunit 1 (cox1) gene was amplified and directly sequenced. Other sequences corresponding to mites from humans, raccoon dogs, foxes, jackal and dogs from various geographical areas were retrieved from GenBank. Phylogenetic analyses were performed using the Otodectes cynotis cox1 sequence as outgroup. Maximum Likelihood and Bayesian Inference analysis approaches were used. To visualize the relationship between the haplotypes, a median joining haplotype network was constructed using Network v4.6 according to host. Twenty-one haplotypes were observed among mites collected from five different host species, including humans and canids from nine geographical areas. The phylogenetic trees based on Maximum Likelihood and Bayesian Inference analyses showed similar topologies with few differences in node support values. The results were not consistent with a human origin of S. scabiei mites in dogs and, on the contrary, did not exclude the opposite hypothesis of a host switch from dogs to humans. Phylogenetic relatedness may have an impact in terms of epidemiological control strategy. Our results and other recent studies suggest that the level of transmission between domestic dogs and humans should be re-evaluated.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leebens-Mack, Jim; Raubeson, Linda A.; Cui, Liying
2005-05-27
While there has been strong support for Amborella and Nymphaeales (water lilies) as branching from basal-most nodes in the angiosperm phylogeny, this hypothesis has recently been challenged by phylogenetic analyses of 61 protein-coding genes extracted from the chloroplast genome sequences of Amborella, Nymphaea and 12 other available land plant chloroplast genomes. These character-rich analyses placed the monocots, represented by three grasses (Poaceae), as sister to all other extant angiosperm lineages. We have extracted protein-coding regions from draft sequences for six additional chloroplast genomes to test whether this surprising result could be an artifact of long-branch attraction due to limited taxon sampling. The added taxa include three monocots (Acorus, Yucca and Typha), a water lily (Nuphar), a ranunculid (Ranunculus), and a gymnosperm (Ginkgo). Phylogenetic analyses of the expanded DNA and protein datasets together with microstructural characters (indels) provided unambiguous support for Amborella and the Nymphaeales as branching from the basal-most nodes in the angiosperm phylogeny. However, their relative positions proved to be dependent on method of analysis, with parsimony favoring Amborella as sister to all other angiosperms, and maximum likelihood and neighbor-joining methods favoring an Amborella + Nymphaeales clade as sister. The maximum likelihood phylogeny supported the latter hypothesis, but the likelihood for the former hypothesis was not significantly different. Parametric bootstrap analysis, single gene phylogenies, estimated divergence dates and conflicting indel characters all help to illuminate the nature of the conflict in resolution of the most basal nodes in the angiosperm phylogeny. Molecular dating analyses provided median age estimates of 161 mya for the most recent common ancestor of all extant angiosperms and 145 mya for the most recent common ancestor of monocots, magnoliids and eudicots. 
Whereas long sequences reduce variance in branch lengths and molecular dating estimates, the impact of improved taxon sampling on the rooting of the angiosperm phylogeny together with the results of parametric bootstrap analyses demonstrate how long-branch attraction can mislead genome-scale phylogenetic analyses.
Recreating a functional ancestral archosaur visual pigment.
Chang, Belinda S W; Jönsson, Karolina; Kazmi, Manija A; Donoghue, Michael J; Sakmar, Thomas P
2002-09-01
The ancestors of the archosaurs, a major branch of the diapsid reptiles, originated more than 240 MYA near the dawn of the Triassic Period. We used maximum likelihood phylogenetic ancestral reconstruction methods and explored different models of evolution for inferring the amino acid sequence of a putative ancestral archosaur visual pigment. Three different types of maximum likelihood models were used: nucleotide-based, amino acid-based, and codon-based models. Where possible, within each type of model, likelihood ratio tests were used to determine which model best fit the data. Ancestral reconstructions of the ancestral archosaur node using the best-fitting models of each type were found to be in agreement, except for three amino acid residues at which one reconstruction differed from the other two. To determine if these ancestral pigments would be functionally active, the corresponding genes were chemically synthesized and then expressed in a mammalian cell line in tissue culture. The expressed artificial genes were all found to bind to 11-cis-retinal to yield stable photoactive pigments with lambda(max) values of about 508 nm, which is slightly redshifted relative to that of extant vertebrate pigments. The ancestral archosaur pigments also activated the retinal G protein transducin, as measured in a fluorescence assay. Our results show that ancestral genes from ancient organisms can be reconstructed de novo and tested for function using a combination of phylogenetic and biochemical methods.
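For nested models, a likelihood ratio test of the kind mentioned in this abstract is usually written in the standard textbook form below. The abstract does not say which specific model pairs or significance thresholds were used, so the symbols here are illustrative: $L(\hat{\theta}_0)$ and $L(\hat{\theta}_1)$ are the maximized likelihoods of the simpler and the more general model, and $k_0$, $k_1$ are their numbers of free parameters.

```latex
\Lambda = 2\left[\ln L(\hat{\theta}_1) - \ln L(\hat{\theta}_0)\right],
\qquad
\Lambda \;\overset{\text{approx.}}{\sim}\; \chi^2_{\,k_1 - k_0}
\quad \text{when the simpler model is true.}
```

A large value of $\Lambda$ relative to the chi-square reference distribution indicates that the extra parameters of the more general model improve the fit by more than chance alone would explain.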
A general methodology for maximum likelihood inference from band-recovery data
Conroy, M.J.; Williams, B.K.
1984-01-01
A numerical procedure is described for obtaining maximum likelihood estimates and associated maximum likelihood inference from band-recovery data. The method is used to illustrate previously developed one-age-class band-recovery models, and is extended to new models, including the analysis with a covariate for survival rates and variable-time-period recovery models. Extensions to R-age-class band-recovery, mark-recapture models, and twice-yearly marking are discussed. A FORTRAN program provides computations for these models.
DiMeglio, Laura M.; Yu, Hongrun; Davis, Thomas M.
2014-01-01
The genus Fragaria encompasses species at ploidy levels ranging from diploid to decaploid. The cultivated strawberry, Fragaria×ananassa, and its two immediate progenitors, F. chiloensis and F. virginiana, are octoploids. To elucidate the ancestries of these octoploid species, we performed a phylogenetic analysis using intron-containing sequences of the nuclear ADH-1 gene from 39 germplasm accessions representing nineteen Fragaria species and one outgroup species, Dasiphora fruticosa. All trees from Maximum Parsimony and Maximum Likelihood analyses showed two major clades, Clade A and Clade B. Each of the sampled octoploids contributed alleles to both major clades. All octoploid-derived alleles in Clade A clustered with alleles of diploid F. vesca, with the exception of one octoploid allele that clustered with the alleles of diploid F. mandshurica. All octoploid-derived alleles in clade B clustered with the alleles of only one diploid species, F. iinumae. When gaps encoded as binary characters were included in the Maximum Parsimony analysis, tree resolution was improved with the addition of six nodes, and the bootstrap support was generally higher, rising above the 50% threshold for an additional nine branches. These results, coupled with the congruence of the sequence data and the coded gap data, validate and encourage the employment of sequence sets containing gaps for phylogenetic analysis. Our phylogenetic conclusions, based upon sequence data from the ADH-1 gene located on F. vesca linkage group II, complement and generally agree with those obtained from analyses of protein-encoding genes GBSSI-2 and DHAR located on F. vesca linkage groups V and VII, respectively, but differ from a previous study that utilized rDNA sequences and did not detect the ancestral role of F. iinumae. PMID:25078607
Speiser, Daniel I; Pankey, M Sabrina; Zaharoff, Alexander K; Battelle, Barbara A; Bracken-Grissom, Heather D; Breinholt, Jesse W; Bybee, Seth M; Cronin, Thomas W; Garm, Anders; Lindgren, Annie R; Patel, Nipam H; Porter, Megan L; Protas, Meredith E; Rivera, Ajna S; Serb, Jeanne M; Zigler, Kirk S; Crandall, Keith A; Oakley, Todd H
2014-11-19
Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families. We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes on to our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. Our new trees for LIT genes are available on the Bitbucket public repository (http://bitbucket.org/osiris_phylogenetics/pia/) and we demonstrate PIA on a publicly-accessible web server (http://galaxy-dev.cnsi.ucsb.edu/pia/). 
Our new trees for LIT genes will be a valuable resource for researchers studying the evolution of eyes or other light-interacting structures. We also introduce PIA, a high throughput method for using phylogenetic relationships to identify LIT genes in transcriptomes from non-model organisms. With simple modifications, our methods may be used to search for different sets of genes or to annotate data sets from taxa outside of Metazoa.
Robles, María del Rosario; Cutillas, Cristina; Panei, Carlos Javier; Callejón, Rocío
2014-01-01
Populations of Trichuris spp. isolated from six species of sigmodontine rodents from Argentina were analyzed based on morphological characteristics and ITS2 (rDNA) region sequences. Molecular data provided an opportunity to discuss the phylogenetic relationships among the Trichuris spp. from North and South America (mainly from Argentina). Trichuris specimens were identified morphologically as Trichuris pardinasi, T. navonae, Trichuris sp. and Trichuris new species, described in this paper. Sequences analyzed by Maximum Parsimony, Maximum Likelihood and Bayesian inference methods showed four main clades corresponding with the four different species regardless of geographical origin and host species. These four species from sigmodontine rodents clustered together and separated from Trichuris species isolated from murine and arvicoline rodents (outgroup). Different genetic lineages were observed among Trichuris species from sigmodontine rodents, which supported the proposal of a new species. Moreover, host distribution showed correspondence with the different tribes within the subfamily Sigmodontinae. PMID:25393618
NASA Astrophysics Data System (ADS)
Simmonds, Sara E.; Chou, Vincent; Cheng, Samantha H.; Rachmawati, Rita; Calumpong, Hilconida P.; Ngurah Mahardika, G.; Barber, Paul H.
2018-06-01
We studied how host-associations and geography shape the genetic structure of sister species of marine snails Coralliophila radula (A. Adams, 1853) and C. violacea (Kiener, 1836). These obligate ectoparasites prey upon corals and are sympatric throughout much of their ranges in coral reefs of the tropical and subtropical Indo-Pacific. We tested for population genetic structure of snails in relation to geography and their host corals using mtDNA (COI) sequences in minimum spanning trees and AMOVAs. We also examined the evolutionary relationships of their Porites host coral species using maximum likelihood trees of RAD-seq (restriction site-associated DNA sequencing) loci mapped to a reference transcriptome. A maximum likelihood tree of host corals revealed three distinct clades. Coralliophila radula showed a pronounced genetic break across the Sunda Shelf (Φ_CT = 0.735) but exhibited no genetic structure with respect to host. C. violacea exhibited significant geographic structure (Φ_CT = 0.427), with divergence among Hawaiian populations, the Coral Triangle and the Indian Ocean. Notably, C. violacea showed evidence of ecological divergence; two lineages were associated with different groups of host coral species, one widespread found at all sites, and the other restricted to the Coral Triangle. Sympatric populations of C. violacea found on different suites of coral species were highly divergent (Φ_CT = 0.561, d = 5.13%), suggesting that symbiotic relationships may contribute to lineage diversification in the Coral Triangle.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1978-01-01
This paper addresses the problem of obtaining numerically maximum-likelihood estimates of the parameters for a mixture of normal distributions. In recent literature, a certain successive-approximations procedure, based on the likelihood equations, was shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, we introduce a general iterative procedure, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. We show that, with probability 1 as the sample size grows large, this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. We also show that the step-size which yields optimal local convergence rates for large samples is determined in a sense by the 'separation' of the component normal densities and is bounded below by a number between 1 and 2.
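The role of the step-size can be illustrated with a deliberately simplified sketch. It assumes that the "successive-approximations procedure" acts as a fixed-point map G on the parameters and that the general procedure takes the form θ ← θ + ε(G(θ) − θ), so ε = 1 recovers the known procedure; this reading, the restriction to two unit-variance components with equal known weights and unknown means, the data, and the function names are all assumptions for illustration, not the paper's formulation.

```python
import math

def successive_approximation(means, data):
    """One fixed-point update of the two component means (unit variances, equal weights)."""
    sums = [0.0, 0.0]
    weights = [0.0, 0.0]
    for x in data:
        dens = [math.exp(-0.5 * (x - m) ** 2) for m in means]
        total = dens[0] + dens[1]
        for k in range(2):
            r = dens[k] / total          # posterior weight of component k for x
            weights[k] += r
            sums[k] += r * x
    return [sums[k] / weights[k] for k in range(2)]

def relaxed_iteration(data, start, step_size, n_iter=200):
    """Move a fraction `step_size` of the way toward the fixed-point update each pass."""
    means = list(start)
    for _ in range(n_iter):
        target = successive_approximation(means, data)
        means = [m + step_size * (t - m) for m, t in zip(means, target)]
    return means

# Hypothetical, well-separated sample.
data = [-2.2, -1.9, -2.0, -1.8, -2.1, 1.9, 2.1, 2.0, 2.2, 1.8]
plain = relaxed_iteration(data, start=(-1.0, 1.0), step_size=1.0)
relaxed = relaxed_iteration(data, start=(-1.0, 1.0), step_size=1.5)
print(plain, relaxed)
```

On this sample both step-sizes settle on the same pair of means, in line with the stated result that local convergence holds for any step-size between 0 and 2.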
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1976-01-01
The problem of obtaining numerically maximum likelihood estimates of the parameters for a mixture of normal distributions is addressed. In recent literature, a certain successive approximations procedure, based on the likelihood equations, is shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, a general iterative procedure is introduced, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. With probability 1 as the sample size grows large, it is shown that this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. The step-size which yields optimal local convergence rates for large samples is determined in a sense by the separation of the component normal densities and is bounded below by a number between 1 and 2.
AUC-Maximized Deep Convolutional Neural Fields for Protein Sequence Labeling.
Wang, Sheng; Sun, Siqi; Xu, Jinbo
2016-09-01
Deep Convolutional Neural Networks (DCNN) have shown excellent performance in a variety of machine learning tasks. This paper presents Deep Convolutional Neural Fields (DeepCNF), an integration of DCNN with Conditional Random Field (CRF), for sequence labeling with an imbalanced label distribution. The widely-used training methods, such as maximum-likelihood and maximum labelwise accuracy, do not work well on imbalanced data. To handle this, we present a new training algorithm called maximum-AUC for DeepCNF. That is, we train DeepCNF by directly maximizing the empirical Area Under the ROC Curve (AUC), which is an unbiased measurement for imbalanced data. To fulfill this, we formulate AUC in a pairwise ranking framework, approximate it by a polynomial function and then apply a gradient-based procedure to optimize it. Our experimental results confirm that maximum-AUC greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction since their label distributions are highly imbalanced and also has similar performance as the other two training methods on solvent accessibility prediction, which has three equally-distributed labels. Furthermore, our experimental results show that our AUC-trained DeepCNF models greatly outperform existing popular predictors of these three tasks. The data and software related to this paper are available at https://github.com/realbigws/DeepCNF_AUC.
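The pairwise ranking view of AUC mentioned in the abstract can be illustrated with a small sketch: the empirical AUC is the fraction of (positive, negative) example pairs in which the positive example receives the higher score, with ties counted as one half. The code below computes only this exact empirical quantity; it does not reproduce the polynomial approximation or the DeepCNF training procedure, and the function name and toy scores are hypothetical.

```python
def pairwise_auc(scores, labels):
    """Empirical AUC as the fraction of correctly ordered (positive, negative) pairs.

    labels are 1 for positive and 0 for negative examples; ties count as 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four (positive, negative) pairs are ordered correctly.
auc_mixed = pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
auc_perfect = pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
print(auc_mixed, auc_perfect)  # 0.75 1.0
```

Because the count is normalized by the number of positive-negative pairs, the value does not depend on how many examples each class contributes, which is why it is attractive for imbalanced labels.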
AUC-Maximized Deep Convolutional Neural Fields for Protein Sequence Labeling
Wang, Sheng; Sun, Siqi
2017-01-01
Deep Convolutional Neural Networks (DCNN) have shown excellent performance in a variety of machine learning tasks. This paper presents Deep Convolutional Neural Fields (DeepCNF), an integration of DCNN with Conditional Random Field (CRF), for sequence labeling with an imbalanced label distribution. The widely-used training methods, such as maximum-likelihood and maximum labelwise accuracy, do not work well on imbalanced data. To handle this, we present a new training algorithm called maximum-AUC for DeepCNF. That is, we train DeepCNF by directly maximizing the empirical Area Under the ROC Curve (AUC), which is an unbiased measurement for imbalanced data. To fulfill this, we formulate AUC in a pairwise ranking framework, approximate it by a polynomial function and then apply a gradient-based procedure to optimize it. Our experimental results confirm that maximum-AUC greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction since their label distributions are highly imbalanced and also has similar performance as the other two training methods on solvent accessibility prediction, which has three equally-distributed labels. Furthermore, our experimental results show that our AUC-trained DeepCNF models greatly outperform existing popular predictors of these three tasks. The data and software related to this paper are available at https://github.com/realbigws/DeepCNF_AUC. PMID:28884168
ERIC Educational Resources Information Center
Wothke, Werner; Burket, George; Chen, Li-Sue; Gao, Furong; Shu, Lianghua; Chia, Mike
2011-01-01
It has been known for some time that item response theory (IRT) models may exhibit a likelihood function of a respondent's ability which may have multiple modes, flat modes, or both. These conditions, often associated with guessing of multiple-choice (MC) questions, can introduce uncertainty and bias to ability estimation by maximum likelihood…
Wellehan, James F X; Pessier, Allan P; Archer, Linda L; Childress, April L; Jacobson, Elliott R; Tesh, Robert B
2012-08-17
Rhabdoviruses infect a variety of hosts, including non-avian reptiles. Consensus PCR techniques were used to obtain partial RNA-dependent RNA polymerase gene sequence from five rhabdoviruses of South American lizards; Marco, Chaco, Timbo, Sena Madureira, and a rhabdovirus from a caiman lizard (Dracaena guianensis). The caiman lizard rhabdovirus formed inclusions in erythrocytes, which may be a route for infecting hematophagous insects. This is the first information on behavior of a rhabdovirus in squamates. We also obtained sequence from two rhabdoviruses of Australian lizards, confirming previous Charleville virus sequence and finding that, unlike a previous sequence report but in agreement with serologic reports, Almpiwar virus is clearly distinct from Charleville virus. Bayesian and maximum likelihood phylogenetic analysis revealed that most known rhabdoviruses of squamates cluster in the Almpiwar subgroup. The exception is Marco virus, which is found in the Hart Park group. Copyright © 2012 Elsevier B.V. All rights reserved.
Dual phylogenetic origins of Nigerian lions (Panthera leo).
Tende, Talatu; Bensch, Staffan; Ottosson, Ulf; Hansson, Bengt
2014-07-01
Lion fecal DNA extracts from four individuals each from Yankari Game Reserve and Kainji-Lake National Park (central northeast and west Nigeria, respectively) were Sanger-sequenced for the mitochondrial cytochrome b gene. The sequences were aligned against 61 lion reference sequences from other parts of Africa and India. The sequence data were analyzed further for the construction of phylogenetic trees using the maximum-likelihood approach to depict phylogenetic patterns of distribution among sequences. Our results show that Nigerian lions grouped together with lions from West and Central Africa. At the smaller geographical scale, lions from Kainji-Lake National Park in western Nigeria grouped with lions from Benin (located west of Nigeria), whereas lions from Yankari Game Reserve in central northeastern Nigeria grouped with the lion populations in Cameroon (located east of Nigeria). The finding that the two remaining lion populations in Nigeria have different phylogenetic origins is an important aspect to consider in future decisions regarding management and conservation of rapidly shrinking lion populations in West Africa.
Dual phylogenetic origins of Nigerian lions (Panthera leo)
Tende, Talatu; Bensch, Staffan; Ottosson, Ulf; Hansson, Bengt
2014-01-01
Lion fecal DNA extracts from four individuals each from Yankari Game Reserve and Kainji-Lake National Park (central northeast and west Nigeria, respectively) were Sanger-sequenced for the mitochondrial cytochrome b gene. The sequences were aligned against 61 lion reference sequences from other parts of Africa and India. The sequence data were analyzed further for the construction of phylogenetic trees using the maximum-likelihood approach to depict phylogenetic patterns of distribution among sequences. Our results show that Nigerian lions grouped together with lions from West and Central Africa. At the smaller geographical scale, lions from Kainji-Lake National Park in western Nigeria grouped with lions from Benin (located west of Nigeria), whereas lions from Yankari Game Reserve in central northeastern Nigeria grouped with the lion populations in Cameroon (located east of Nigeria). The finding that the two remaining lion populations in Nigeria have different phylogenetic origins is an important aspect to consider in future decisions regarding management and conservation of rapidly shrinking lion populations in West Africa. PMID:25077018
Estimating residual fault hitting rates by recapture sampling
NASA Technical Reports Server (NTRS)
Lee, Larry; Gupta, Rajan
1988-01-01
For the recapture debugging design introduced by Nayak (1988) the problem of estimating the hitting rates of the faults remaining in the system is considered. In the context of a conditional likelihood, moment estimators are derived and are shown to be asymptotically normal and fully efficient. Fixed sample properties of the moment estimators are compared, through simulation, with those of the conditional maximum likelihood estimators. Properties of the conditional model are investigated such as the asymptotic distribution of linear functions of the fault hitting frequencies and a representation of the full data vector in terms of a sequence of independent random vectors. It is assumed that the residual hitting rates follow a log linear rate model and that the testing process is truncated when the gaps between the detection of new errors exceed a fixed amount of time.
Structural analysis of the α subunit of Na(+)/K(+) ATPase genes in invertebrates.
Thabet, Rahma; Rouault, J-D; Ayadi, Habib; Leignel, Vincent
2016-01-01
The Na(+)/K(+) ATPase is a ubiquitous pump coordinating the transport of Na(+) and K(+) across the membrane of cells and its role is fundamental to cellular functions. It is a heteromer in eukaryotes comprising two or three subunits (α, β and γ, the last being specific to the vertebrates). The catalytic functions of the enzyme have been attributed to the α subunit. Several complete α protein sequences are available, but only a few gene structures have been characterized. We identified the genomic sequences coding the α-subunit of the Na(+)/K(+) ATPase, from the whole-genome shotgun contigs (WGS), NCBI Genomes (chromosome), Genomic Survey Sequences (GSS) and High Throughput Genomic Sequences (HTGS) databases across distinct phyla. One copy of the α subunit gene was found in Annelida, Arthropoda, Cnidaria, Echinodermata, Hemichordata, Mollusca, Placozoa, Porifera, Platyhelminthes, Urochordata, but the nematodes seem to possess 2 to 4 copies. The number of introns varied from 0 (Platyhelminthes) to 26 (Porifera), and their localization and length are also highly variable. Molecular phylogenies (Maximum Likelihood and Maximum Parsimony methods) showed some clusters constituted by (Chordata/(Echinodermata/Hemichordata)) or (Platyhelminthes/(Annelida/Mollusca)) and a basal position for Porifera. These structural analyses increase our knowledge about the evolutionary events of the α subunit genes in the invertebrates. Copyright © 2016 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Jones, Douglas H.
The progress of modern mental test theory depends very much on the techniques of maximum likelihood estimation, and many popular applications make use of likelihoods induced by logistic item response models. While, in reality, item responses are nonreplicate within a single examinee and the logistic models are only ideal, practitioners make…
Bias Correction for the Maximum Likelihood Estimate of Ability. Research Report. ETS RR-05-15
ERIC Educational Resources Information Center
Zhang, Jinming
2005-01-01
Lord's bias function and the weighted likelihood estimation method are effective in reducing the bias of the maximum likelihood estimate of an examinee's ability under the assumption that the true item parameters are known. This paper presents simulation studies to determine the effectiveness of these two methods in reducing the bias when the item…
Estimating parameter of Rayleigh distribution by using Maximum Likelihood method and Bayes method
NASA Astrophysics Data System (ADS)
Ardianti, Fitri; Sutarman
2018-01-01
In this paper, we use Maximum Likelihood estimation and the Bayes method under several loss functions to estimate the parameter of the Rayleigh distribution and to determine which method performs best. The prior used in the Bayes method is Jeffreys' non-informative prior. Maximum likelihood estimation and the Bayes method under the precautionary loss function, the entropy loss function, and the L1 loss function are compared. We compare these methods by bias and MSE value using an R program. The results are then displayed in tables to facilitate the comparisons.
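For the maximum likelihood side of this comparison, the Rayleigh density f(x; σ) = (x/σ²) exp(−x²/(2σ²)) gives the closed-form estimate σ̂² = Σ xᵢ² / (2n). The sketch below simulates a sample and applies that formula; it is written in Python rather than the R used in the study, the scale value and sample size are arbitrary assumptions, and the Bayes estimators are not reproduced because their exact forms are not given in the abstract.

```python
import math
import random


def rayleigh_sample(sigma, n, rng):
    """Draw n Rayleigh(sigma) variates by inverse-transform sampling."""
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return [sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random())) for _ in range(n)]


def rayleigh_mle(data):
    """Maximum likelihood estimate of the Rayleigh scale parameter sigma."""
    return math.sqrt(sum(x * x for x in data) / (2.0 * len(data)))


rng = random.Random(42)
true_sigma = 2.0  # hypothetical true scale
sample = rayleigh_sample(true_sigma, 5000, rng)
sigma_hat = rayleigh_mle(sample)
print(sigma_hat)
```

Repeating the simulation over many samples and averaging (σ̂ − σ) and (σ̂ − σ)² would give the bias and MSE figures that such a comparison relies on.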
Martin, Donald S; Wright, André-Denis G; Barta, John R; Desser, Sherwin S
2002-06-01
Phylogenetic relationships within the kinetoplastid flagellates were inferred from comparisons of small-subunit ribosomal RNA gene sequences. These included 5 new gene sequences, Trypanosoma fallisi (2,239 bp), Trypanosoma chattoni (2,180 bp), Trypanosoma mega (2,211 bp), Trypanosoma neveulemairei (2,197 bp), and Trypanosoma ranarum (2,203 bp). Trees produced using maximum-parsimony and distance-matrix methods (least-squares, neighbor-joining, and maximum-likelihood), supported by strong bootstrap and quartet-puzzle analyses, indicated that the trypanosomes are a monophyletic group that divides into 2 major lineages, the salivarian trypanosomes and the nonsalivarian trypanosomes. The nonsalivarian trypanosomes further divide into 2 lineages, 1 containing trypanosomes of birds, mammals, and reptiles and the other containing trypanosomes of fish, reptiles, and anurans. Among the giant trypanosomes, T. chattoni is clearly shown to be distantly related to all the other anuran trypanosome species. Trypanosoma mega is closely associated with T. fallisi and T. ranarum, whereas T. neveulemairei and Trypanosoma rotatorium are sister taxa. The branching order of the anuran trypanosomes suggests that some toad trypanosomes may have evolved by host switching from frogs to toads.
Closed-loop carrier phase synchronization techniques motivated by likelihood functions
NASA Technical Reports Server (NTRS)
Tsou, H.; Hinedi, S.; Simon, M.
1994-01-01
This article reexamines the notion of closed-loop carrier phase synchronization motivated by the theory of maximum a posteriori phase estimation with emphasis on the development of new structures based on both maximum-likelihood and average-likelihood functions. The criterion of performance used for comparison of all the closed-loop structures discussed is the mean-squared phase error for a fixed-loop bandwidth.
Fast maximum likelihood estimation of mutation rates using a birth-death process.
Wu, Xiaowei; Zhu, Hongxiao
2015-02-07
Since fluctuation analysis was first introduced by Luria and Delbrück in 1943, it has been widely used to make inference about spontaneous mutation rates in cultured cells. Under certain model assumptions, the probability distribution of the number of mutants that appear in a fluctuation experiment can be derived explicitly, which provides the basis of mutation rate estimation. It has been shown that, among various existing estimators, the maximum likelihood estimator usually demonstrates some desirable properties such as consistency and lower mean squared error. However, its application in real experimental data is often hindered by slow computation of likelihood due to the recursive form of the mutant-count distribution. We propose a fast maximum likelihood estimator of mutation rates, MLE-BD, based on a birth-death process model with non-differential growth assumption. Simulation studies demonstrate that, compared with the conventional maximum likelihood estimator derived from the Luria-Delbrück distribution, MLE-BD achieves substantial improvement on computational speed and is applicable to arbitrarily large number of mutants. In addition, it still retains good accuracy on point estimation. Published by Elsevier Ltd.
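The computational bottleneck mentioned above can be seen in the conventional estimator. A commonly used recursion for the Luria-Delbrück mutant-count distribution (due to Ma, Sandri and Sarkar) is p₀ = e^(−m) and pₙ = (m/n) Σᵢ₌₀ⁿ⁻¹ pᵢ/(n − i + 1), where m is the expected number of mutations per culture, so evaluating the likelihood costs time quadratic in the largest mutant count. The sketch below implements this conventional approach, not MLE-BD; the mutant counts, search bounds, and function names are hypothetical, and the golden-section search assumes the log-likelihood is unimodal in m.

```python
import math


def ld_pmf(m, n_max):
    """Luria-Delbruck probabilities p_0..p_n_max via the recursive formula."""
    p = [math.exp(-m)]
    for n in range(1, n_max + 1):
        p.append(m / n * sum(p[i] / (n - i + 1) for i in range(n)))
    return p


def log_likelihood(m, counts):
    """Log-likelihood of m given the observed mutant counts per culture."""
    p = ld_pmf(m, max(counts))
    return sum(math.log(p[c]) for c in counts)


def mle_m(counts, lo=1e-3, hi=20.0, tol=1e-6):
    """Maximize the log-likelihood over m by golden-section search on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = log_likelihood(c, counts), log_likelihood(d, counts)
    while b - a > tol:
        if fc > fd:
            # The maximum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = log_likelihood(c, counts)
        else:
            # The maximum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = log_likelihood(d, counts)
    return (a + b) / 2.0


# Hypothetical mutant counts from ten parallel cultures.
counts = [0, 1, 0, 3, 2, 0, 15, 1, 0, 4]
m_hat = mle_m(counts)
print(m_hat)
```

Each likelihood evaluation rebuilds the whole probability table up to the largest observed count, which is the cost that grows quickly when cultures contain many mutants.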
Low-complexity approximations to maximum likelihood MPSK modulation classification
NASA Technical Reports Server (NTRS)
Hamkins, Jon
2004-01-01
We present a new approximation to the maximum likelihood classifier to discriminate between M-ary and M'-ary phase-shift-keying transmitted on an additive white Gaussian noise (AWGN) channel and received noncoherently, partially coherently, or coherently.
Cho, Myong-Suk; Hyun Cho, Chung; Yeon Kim, Su; Su Yoon, Hwan; Kim, Seung-Chul
2016-09-01
The complete chloroplast genome sequence of the wild flowering cherry, Prunus yedoensis Matsum., which is native and endemic to Jeju Island, Korea, is reported in this study. The genome is 157 786 bp in length with 36.7% GC content and is composed of an LSC region of 85 908 bp, an SSC region of 19 120 bp and two IR copies of 26 379 bp each. The cp genome contains 131 genes, including 86 coding genes, 8 rRNA genes and 37 tRNA genes. A maximum likelihood analysis was conducted to verify the phylogenetic position of the newly sequenced cp genome of P. yedoensis using 11 representatives of complete cp genome sequences within the family Rosaceae. The genus Prunus exhibited monophyly and the result of the phylogenetic relationship agreed with the previous phylogenetic analyses within Rosaceae.
Maximum likelihood decoding analysis of accumulate-repeat-accumulate codes
NASA Technical Reports Server (NTRS)
Abbasfar, A.; Divsalar, D.; Yao, K.
2004-01-01
In this paper, the performance of the repeat-accumulate codes with maximum likelihood (ML) decoding is analyzed and compared to random codes by very tight bounds. Some simple codes are shown that perform very close to the Shannon limit with maximum likelihood decoding.
NASA Technical Reports Server (NTRS)
Thadani, S. G.
1977-01-01
The Maximum Likelihood Estimation of Signature Transformation (MLEST) algorithm is used to obtain maximum likelihood estimates (MLE) of affine transformation. The algorithm has been evaluated for three sets of data: simulated (training and recognition segment pairs), consecutive-day (data gathered from Landsat images), and geographical-extension (large-area crop inventory experiment) data sets. For each set, MLEST signature extension runs were made to determine MLE values and the affine-transformed training segment signatures were used to classify the recognition segments. The classification results were used to estimate wheat proportions at 0 and 1% threshold values.
Maximum-likelihood block detection of noncoherent continuous phase modulation
NASA Technical Reports Server (NTRS)
Simon, Marvin K.; Divsalar, Dariush
1993-01-01
This paper examines maximum-likelihood block detection of uncoded full response CPM over an additive white Gaussian noise (AWGN) channel. Both the maximum-likelihood metrics and the bit error probability performances of the associated detection algorithms are considered. The special and popular case of minimum-shift-keying (MSK) corresponding to h = 0.5 and constant amplitude frequency pulse is treated separately. The many new receiver structures that result from this investigation can be compared to the traditional ones that have been used in the past both from the standpoint of simplicity of implementation and optimality of performance.
Design of simplified maximum-likelihood receivers for multiuser CPM systems.
Bing, Li; Bai, Baoming
2014-01-01
A class of simplified maximum-likelihood receivers designed for continuous phase modulation based multiuser systems is proposed. The presented receiver is built upon a front end employing mismatched filters and a maximum-likelihood detector defined in a low-dimensional signal space. The performance of the proposed receivers is analyzed and compared to some existing receivers. Some schemes are designed to implement the proposed receivers and to reveal the roles of different system parameters. Analysis and numerical results show that the proposed receivers can approach the optimum multiuser receivers with significantly (even exponentially in some cases) reduced complexity and marginal performance degradation.
Maximum likelihood clustering with dependent feature trees
NASA Technical Reports Server (NTRS)
Chittineni, C. B. (Principal Investigator)
1981-01-01
The decomposition of mixture density of the data into its normal component densities is considered. The densities are approximated with first order dependent feature trees using criteria of mutual information and distance measures. Expressions are presented for the criteria when the densities are Gaussian. By defining different types of nodes in a general dependent feature tree, maximum likelihood equations are developed for the estimation of parameters using fixed point iterations. The field structure of the data is also taken into account in developing maximum likelihood equations. Experimental results from the processing of remotely sensed multispectral scanner imagery data are included.
Zhang, Min; Jia, Dijing; Li, Hanping; Gui, Tao; Jia, Lei; Wang, Xiaolin; Li, Tianyi; Liu, Yongjian; Bao, Zuoyi; Liu, Siyang; Zhuang, Daomin; Li, Jingyun; Li, Lin
2017-10-01
CRF07_BC was originally formed in Yunnan province of China in the 1980s and spread quickly in injecting drug users (IDUs). In recent years, it has been introduced into men who have sex with men (MSM) and become the most dominant strain in China. In this study, we performed a comprehensive phylodynamic analysis of CRF07_BC sequences from China. All CRF07_BC sequences identified in China were retrieved from database. More sequences obtained in our laboratory were added to make the dataset more representative. A maximum-likelihood (ML) tree was constructed with PhyML3.0. Maximum clade credibility (MCC) tree and effective population size were predicted by using the Markov Chain Monte Carlo sampling method with Beast software. A total of 610 CRF07_BC sequences covering 1,473 bp of the gag gene (from 817 to 2,289 according to HXB2 calculator) were included in the dataset. Three epidemic clusters were identified; two clusters comprised sequences from IDUs, while one cluster mainly contained sequences from MSMs. The time of the most recent common ancestor of the cluster composed of sequences from MSMs was estimated to be in 2000. Two rapid spreading waves of effective population size of CRF07_BC infections were identified in the skyline plot. The second wave coincided with the expansion of the MSM cluster. The results indicated that the control of CRF07_BC infections in MSMs would help to decrease its epidemic in China.
NASA Astrophysics Data System (ADS)
Chu, A.
2016-12-01
Modern earthquake catalogs are often analyzed using spatial-temporal point process models such as the epidemic-type aftershock sequence (ETAS) models of Ogata (1998). My work implements three of the homogeneous ETAS models described in Ogata (1998). With a model's log-likelihood function, my software finds the Maximum-Likelihood Estimates (MLEs) of the model's parameters to estimate the homogeneous background rate and the temporal and spatial parameters that govern triggering effects. The EM algorithm is employed for its advantages of stability and robustness (Veen and Schoenberg, 2008). My work also presents comparisons among the three models in robustness, convergence speed, and implementations from theory to computing practice. Up-to-date regional seismic data of seismically active areas such as Southern California and Japan are used to demonstrate the comparisons. Data analysis has been done using the computer languages Java and R. Java has the advantages of being strongly typed and making memory resources easy to control, while R has the advantage of having numerous available functions for statistical computing. Comparisons are also made between the two programming languages in convergence and stability, computational speed, and ease of implementation. Issues that may affect convergence such as spatial shapes are discussed.
Andersen, Heidi L; Ekman, Stefan
2005-01-01
The phylogeny of the family Micareaceae and the genus Micarea was studied using mitochondrial small subunit ribosomal DNA sequences. Phylogenetic reconstructions were performed using Bayesian MCMC tree sampling and a maximum likelihood approach. The Micareaceae in its current sense is highly heterogeneous, and Helocarpon, Psilolechia, and Scutula, all thought to be close relatives of Micarea, are shown to be only distantly related. The genus Micarea is paraphyletic unless the entire Pilocarpaceae and Ectolechiaceae are included, as also indicated by an expected likelihood weights test. It is suggested that the Micareaceae is reduced to synonymy with the Pilocarpaceae, which also includes the Ectolechiaceae, and that Micarea may have to be divided into a series of smaller genera in the future. Micarea species with a 'non-micareoid' photobiont group with Psora and the Ramalinaceae, whereas Micarea intrusa appears to belong in Scoliciosporum. Three species fall inside the paraphyletic Micarea: Szczawinskia tsugae, Catillaria contristans, and Fellhaneropsis vezdae. Tropical foliicolous taxa are nested within groups of mainly temperate and arctic-alpine distribution. A 'micareoid' photobiont appears to be plesiomorphic in the Pilocarpaceae but has been lost a few times.
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles
2010-01-01
In this article the authors focus on the issue of the nonuniqueness of the maximum likelihood (ML) estimator of proficiency level in item response theory (with special attention to logistic models). The usual maximum a posteriori (MAP) method offers a good alternative within that framework; however, this article highlights some drawbacks of its…
Treetrimmer: a method for phylogenetic dataset size reduction.
Maruyama, Shinichiro; Eveleigh, Robert J M; Archibald, John M
2013-04-12
With rapid advances in genome sequencing and bioinformatics, it is now possible to generate phylogenetic trees containing thousands of operational taxonomic units (OTUs) from a wide range of organisms. However, use of rigorous tree-building methods on such large datasets is prohibitive and manual 'pruning' of sequence alignments is time consuming and raises concerns over reproducibility. There is a need for bioinformatic tools with which to objectively carry out such pruning procedures. Here we present 'TreeTrimmer', a bioinformatics procedure that removes unnecessary redundancy in large phylogenetic datasets, alleviating the size effect on more rigorous downstream analyses. The method identifies and removes user-defined 'redundant' sequences, e.g., orthologous sequences from closely related organisms and 'recently' evolved lineage-specific paralogs. Representative OTUs are retained for more rigorous re-analysis. TreeTrimmer reduces the OTU density of phylogenetic trees without sacrificing taxonomic diversity while retaining the original tree topology, thereby speeding up downstream computer-intensive analyses, e.g., Bayesian and maximum likelihood tree reconstructions, in a reproducible fashion.
Espin-Garcia, Osvaldo; Craiu, Radu V; Bull, Shelley B
2018-02-01
We evaluate two-phase designs to follow-up findings from genome-wide association study (GWAS) when the cost of regional sequencing in the entire cohort is prohibitive. We develop novel expectation-maximization-based inference under a semiparametric maximum likelihood formulation tailored for post-GWAS inference. A GWAS-SNP (where SNP is single nucleotide polymorphism) serves as a surrogate covariate in inferring association between a sequence variant and a normally distributed quantitative trait (QT). We assess test validity and quantify efficiency and power of joint QT-SNP-dependent sampling and analysis under alternative sample allocations by simulations. Joint allocation balanced on SNP genotype and extreme-QT strata yields significant power improvements compared to marginal QT- or SNP-based allocations. We illustrate the proposed method and evaluate the sensitivity of sample allocation to sampling variation using data from a sequencing study of systolic blood pressure. © 2017 The Authors. Genetic Epidemiology Published by Wiley Periodicals, Inc.
Djordjevic, Ivan B; Vasic, Bane
2006-05-29
A maximum a posteriori probability (MAP) symbol decoding supplemented with iterative decoding is proposed as an effective means for suppression of intrachannel nonlinearities. The MAP detector, based on the Bahl-Cocke-Jelinek-Raviv algorithm, operates on the channel trellis, a dynamical model of intersymbol interference, and provides soft-decision outputs processed further in an iterative decoder. A dramatic performance improvement is demonstrated. The main reason is that the conventional maximum-likelihood sequence detector based on the Viterbi algorithm provides hard-decision outputs only, hence preventing the soft iterative decoding. The proposed scheme operates very well in the presence of strong intrachannel intersymbol interference, when other advanced forward error correction schemes fail, and it is also suitable for 40 Gb/s upgrade over existing 10 Gb/s infrastructure.
Johnson, Tania Aspasia; Iyengar, Arati
2015-01-01
Sturgeons and paddlefish are freshwater fish which are highly valued for their caviar. Despite the fact that every single species of sturgeon and paddlefish is listed under CITES, there are reports of illegal trade in caviar where products are deliberately mislabeled. Three samples of caviar purchased in the United Kingdom were investigated for accurate CITES labeling using COI and cyt b sequencing. Initial species identification was carried out using BLAST followed by phylogenetic analyses using both maximum parsimony and maximum likelihood methods. Results showed no evidence for mislabeling with respect to CITES labels in any of the three samples, but we observed clear evidence for a case of misleading the customer in one sample. © 2014 American Academy of Forensic Sciences.
Statistical alignment: computational properties, homology testing and goodness-of-fit.
Hein, J; Wiuf, C; Knudsen, B; Møller, M B; Wibling, G
2000-09-08
The model of insertions and deletions in biological sequences, first formulated by Thorne, Kishino, and Felsenstein in 1991 (the TKF91 model), provides a basis for performing alignment within a statistical framework. Here we investigate this model. Firstly, we show how to accelerate the statistical alignment algorithms several orders of magnitude. The main innovations are to confine likelihood calculations to a band close to the similarity based alignment, to get good initial guesses of the evolutionary parameters and to apply an efficient numerical optimisation algorithm for finding the maximum likelihood estimate. In addition, the recursions originally presented by Thorne, Kishino and Felsenstein can be simplified. Two proteins, about 1500 amino acids long, can be analysed with this method in less than five seconds on a fast desktop computer, which makes this method practical for actual data analysis. Secondly, we propose a new homology test based on this model, where homology means that an ancestor to a sequence pair can be found finitely far back in time. This test has statistical advantages relative to the traditional shuffle test for proteins. Finally, we describe a goodness-of-fit test, that allows testing the proposed insertion-deletion (indel) process inherent to this model and find that real sequences (here globins) probably experience indels longer than one, contrary to what is assumed by the model. Copyright 2000 Academic Press.
Concept for estimating mitochondrial DNA haplogroups using a maximum likelihood approach (EMMA)
Röck, Alexander W.; Dür, Arne; van Oven, Mannis; Parson, Walther
2013-01-01
The assignment of haplogroups to mitochondrial DNA haplotypes contributes substantial value for quality control, not only in forensic genetics but also in population and medical genetics. The availability of Phylotree, a widely accepted phylogenetic tree of human mitochondrial DNA lineages, led to the development of several (semi-)automated software solutions for haplogrouping. However, currently existing haplogrouping tools only make use of haplogroup-defining mutations, whereas private mutations (beyond the haplogroup level) can be additionally informative allowing for enhanced haplogroup assignment. This is especially relevant in the case of (partial) control region sequences, which are mainly used in forensics. The present study makes three major contributions toward a more reliable, semi-automated estimation of mitochondrial haplogroups. First, a quality-controlled database consisting of 14,990 full mtGenomes downloaded from GenBank was compiled. Together with Phylotree, these mtGenomes serve as a reference database for haplogroup estimates. Second, the concept of fluctuation rates, i.e. a maximum likelihood estimation of the stability of mutations based on 19,171 full control region haplotypes for which raw lane data is available, is presented. Finally, an algorithm for estimating the haplogroup of an mtDNA sequence based on the combined database of full mtGenomes and Phylotree, which also incorporates the empirically determined fluctuation rates, is brought forward. On the basis of examples from the literature and EMPOP, the algorithm is not only validated, but both the strength of this approach and its utility for quality control of mitochondrial haplotypes is also demonstrated. PMID:23948335
Otto, Wolfgang; Stadler, Peter F.; López-Giraldéz, Francesc; Townsend, Jeffrey P.; Lynch, Vincent J.
2009-01-01
A major mode of gene expression evolution is based on changes in cis-regulatory elements (CREs) whose function critically depends on the presence of transcription factor–binding sites (TFBS). Because CREs experience extensive TFBS turnover even with conserved function, alignment-based studies of CRE sequence evolution are limited to very closely related species. Here, we propose an alternative approach based on a stochastic model of TFBS turnover. We implemented a maximum likelihood model that permits variable turnover rates in different parts of the species tree. This model can be used to detect changes in turnover rate as a proxy for differences in the selective pressures acting on TFBS in different clades. We applied this method to five TFBS in the fungi methionine biosynthesis pathway and three TFBS in the HoxA clusters of vertebrates. We find that the estimated turnover rate is generally high, with half-life ranging between ∼5 and 150 My and a mode around tens of millions of years. This rate is consistent with the finding that even functionally conserved enhancers can show very low sequence similarity. We also detect statistically significant differences in the equilibrium densities of estrogen- and progesterone-response elements in the HoxA clusters between mammal and nonmammal vertebrates. Even more extreme clade-specific differences were found in the fungal data. We conclude that stochastic models of TFBS turnover enable the detection of shifts in the selective pressures acting on CREs in different organisms. The analysis tool, called CRETO (Cis-Regulatory Element Turn-Over) can be downloaded from http://www.bioinf.uni-leipzig.de/Software/creto/. PMID:20333180
Cosmic shear measurement with maximum likelihood and maximum a posteriori inference
NASA Astrophysics Data System (ADS)
Hall, Alex; Taylor, Andy
2017-06-01
We investigate the problem of noise bias in maximum likelihood and maximum a posteriori estimators for cosmic shear. We derive the leading and next-to-leading order biases and compute them in the context of galaxy ellipticity measurements, extending previous work on maximum likelihood inference for weak lensing. We show that a large part of the bias on these point estimators can be removed using information already contained in the likelihood when a galaxy model is specified, without the need for external calibration. We test these bias-corrected estimators on simulated galaxy images similar to those expected from planned space-based weak lensing surveys, with promising results. We find that the introduction of an intrinsic shape prior can help with mitigation of noise bias, such that the maximum a posteriori estimate can be made less biased than the maximum likelihood estimate. Second-order terms offer a check on the convergence of the estimators, but are largely subdominant. We show how biases propagate to shear estimates, demonstrating in our simple set-up that shear biases can be reduced by orders of magnitude and potentially to within the requirements of planned space-based surveys at mild signal-to-noise ratio. We find that second-order terms can exhibit significant cancellations at low signal-to-noise ratio when Gaussian noise is assumed, which has implications for inferring the performance of shear-measurement algorithms from simplified simulations. We discuss the viability of our point estimators as tools for lensing inference, arguing that they allow for the robust measurement of ellipticity and shear.
Kelly, S; Wickstead, B; Gull, K
2011-04-07
We have developed a machine-learning approach to identify 3537 discrete orthologue protein sequence groups distributed across all available archaeal genomes. We show that treating these orthologue groups as binary detection/non-detection data is sufficient to capture the majority of archaeal phylogeny. We subsequently use the sequence data from these groups to infer a method and substitution-model-independent phylogeny. By holding this phylogeny constrained and interrogating the intersection of this large dataset with both the Eukarya and the Bacteria using Bayesian and maximum-likelihood approaches, we propose and provide evidence for a methanogenic origin of the Archaea. By the same criteria, we also provide evidence in support of an origin for Eukarya either within or as sisters to the Thaumarchaea.
NASA Technical Reports Server (NTRS)
Paradella, W. R. (Principal Investigator); Vitorello, I.; Monteiro, M. D.
1984-01-01
Enhancement techniques and thematic classifications were applied to the metasediments of Bambui Super Group (Upper Proterozoic) in the Region of Serra do Ramalho, SW of the state of Bahia. Linear contrast stretch, band-ratios with contrast stretch, and color-composites allow lithological discriminations. The effects of human activities and of vegetation cover mask and limit, in several ways, the lithological discrimination with digital MSS data. Principal component images, and color composites of linear contrast stretches of these products, show lithological discrimination through tonal gradations. This set of products allows the delineation of several metasedimentary sequences to a level superior to reconnaissance mapping. Supervised (maximum likelihood classifier) and nonsupervised (K-Means classifier) classifications of the limestone sequence, host to fluorite mineralization, show satisfactory results.
The complete mitochondrial genome structure of the jaguar (Panthera onca).
Caragiulo, Anthony; Dougherty, Eric; Soto, Sofia; Rabinowitz, Salisa; Amato, George
2016-01-01
The jaguar (Panthera onca) is the largest felid in the Western hemisphere, and the only member of the Panthera genus in the New World. The jaguar inhabits most countries within Central and South America, and is considered near threatened by the International Union for the Conservation of Nature. This study represents the first sequence of the entire jaguar mitogenome, which was the only Panthera mitogenome that had not been sequenced. The jaguar mitogenome is 17,049 bases and possesses the same molecular structure as other felid mitogenomes. Bayesian inference (BI) and maximum likelihood (ML) were used to determine the phylogenetic placement of the jaguar within the Panthera genus. Both BI and ML analyses revealed the jaguar to be sister to the tiger/leopard/snow leopard clade.
On the synchronizability and detectability of random PPM sequences
NASA Technical Reports Server (NTRS)
Georghiades, Costas N.; Lin, Shu
1987-01-01
The problem of synchronization and detection of random pulse-position-modulation (PPM) sequences is investigated under the assumption of perfect slot synchronization. Maximum-likelihood PPM symbol synchronization and receiver algorithms are derived that make decisions based both on soft as well as hard data; these algorithms are seen to be easily implementable. Bounds derived on the symbol error probability as well as the probability of false synchronization indicate the existence of a rather severe performance floor, which can easily be the limiting factor in the overall system performance. The performance floor is inherent in the PPM format and random data and becomes more serious as the PPM alphabet size Q is increased. A way to eliminate the performance floor is suggested by inserting special PPM symbols in the random data stream.
On the synchronizability and detectability of random PPM sequences
NASA Technical Reports Server (NTRS)
Georghiades, Costas N.
1987-01-01
The problem of synchronization and detection of random pulse-position-modulation (PPM) sequences is investigated under the assumption of perfect slot synchronization. Maximum likelihood PPM symbol synchronization and receiver algorithms are derived that make decisions based both on soft as well as hard data; these algorithms are seen to be easily implementable. Bounds were derived on the symbol error probability as well as the probability of false synchronization that indicate the existence of a rather severe performance floor, which can easily be the limiting factor in the overall system performance. The performance floor is inherent in the PPM format and random data and becomes more serious as the PPM alphabet size Q is increased. A way to eliminate the performance floor is suggested by inserting special PPM symbols in the random data stream.
NASA Technical Reports Server (NTRS)
Brumfield, J. O.; Bloemer, H. H. L.; Campbell, W. J.
1981-01-01
Two unsupervised classification procedures for analyzing Landsat data used to monitor land reclamation in a surface mining area in east central Ohio are compared for agreement with data collected from the corresponding locations on the ground. One procedure is based on a traditional unsupervised-clustering/maximum-likelihood algorithm sequence that assumes spectral groupings in the Landsat data in n-dimensional space; the other is based on a nontraditional unsupervised-clustering/canonical-transformation/clustering algorithm sequence that not only assumes spectral groupings in n-dimensional space but also includes an additional feature-extraction technique. It is found that the nontraditional procedure provides an appreciable improvement in spectral groupings and apparently increases the level of accuracy in the classification of land cover categories.
Adaptive decoding of convolutional codes
NASA Astrophysics Data System (ADS)
Hueske, K.; Geldmacher, J.; Götze, J.
2007-06-01
Convolutional codes, which are frequently used as error correction codes in digital transmission systems, are generally decoded using the Viterbi Decoder. On the one hand, the Viterbi Decoder is an optimum maximum likelihood decoder, i.e. the most probable transmitted code sequence is obtained. On the other hand, the mathematical complexity of the algorithm depends only on the code used, not on the number of transmission errors. To reduce the complexity of the decoding process for good transmission conditions, an alternative syndrome-based decoder is presented. The reduction of complexity is realized by two different approaches, the syndrome zero sequence deactivation and the path metric equalization. The two approaches enable an easy adaptation of the decoding complexity for different transmission conditions, which results in a trade-off between decoding complexity and error correction performance.
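As an illustration of the baseline the abstract refers to, the following is a minimal sketch of hard-decision Viterbi decoding. It assumes a rate-1/2, constraint-length-3 code with generators (7, 5) in octal; that code, and the names `step`, `encode` and `viterbi`, are chosen here for illustration and are not taken from the paper. The syndrome-based decoder the authors propose is not shown.

```python
# Hard-decision Viterbi decoding for an assumed rate-1/2 (7, 5) convolutional code.
G = (0b111, 0b101)          # generator polynomials, constraint length 3 (assumption)
K = 3
N_STATES = 1 << (K - 1)

def parity(x):
    return bin(x).count("1") & 1

def step(state, bit):
    """Return (next_state, output_bits) for one input bit."""
    reg = (bit << (K - 1)) | state            # newest bit sits in the MSB
    out = tuple(parity(reg & g) for g in G)
    return reg >> 1, out

def encode(bits):
    state, out = 0, []
    for b in bits:
        state, o = step(state, b)
        out.extend(o)
    return out

def viterbi(received):
    """Return the input sequence whose codeword is closest in Hamming distance."""
    inf = float("inf")
    metric = [0] + [inf] * (N_STATES - 1)     # encoder starts in state 0
    paths = [[] for _ in range(N_STATES)]
    for i in range(0, len(received), 2):
        r = received[i:i + 2]
        new_metric = [inf] * N_STATES
        new_paths = [None] * N_STATES
        for s in range(N_STATES):
            if metric[s] == inf:
                continue
            for b in (0, 1):
                ns, o = step(s, b)
                m = metric[s] + sum(x != y for x, y in zip(o, r))
                if m < new_metric[ns]:        # keep only the survivor per state
                    new_metric[ns] = m
                    new_paths[ns] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    best = min(range(N_STATES), key=lambda s: metric[s])
    return paths[best]

message = [1, 0, 1, 1, 0, 1, 0, 0]            # last two zeros flush the encoder
coded = encode(message)
coded[3] ^= 1                                  # inject a single channel error
decoded = viterbi(coded)
```

Note that the work per trellis step is the same whether or not errors occurred, which is the fixed complexity the abstract contrasts with its adaptive approach.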
Some Small Sample Results for Maximum Likelihood Estimation in Multidimensional Scaling.
ERIC Educational Resources Information Center
Ramsay, J. O.
1980-01-01
Some aspects of the small sample behavior of maximum likelihood estimates in multidimensional scaling are investigated with Monte Carlo techniques. In particular, the chi square test for dimensionality is examined and a correction for bias is proposed and evaluated. (Author/JKS)
ATAC Autocuer Modeling Analysis.
1981-01-01
the analysis of the simple rectangular segmentation (1) is based on detection and estimation theory (2). This approach uses the concept of maximum ...continuous wave forms. In order to develop the principles of maximum likelihood, it is convenient to develop the principles for the "classical...the concept of maximum likelihood is significant in that it provides the optimum performance of the detection/estimation problem. With a knowledge of
Campos-Filho, N; Franco, E L
1989-02-01
A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
The Maximum Likelihood Solution for Inclination-only Data
NASA Astrophysics Data System (ADS)
Arason, P.; Levi, S.
2006-12-01
The arithmetic means of inclination-only data are known to introduce a shallowing bias. Several methods have been proposed to estimate unbiased means of the inclination along with measures of the precision. Most of the inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all these methods require various assumptions and approximations that are inappropriate for many data sets. For some steep and dispersed data sets, the estimates provided by these methods are significantly displaced from the peak of the likelihood function to systematically shallower inclinations. The problem in locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest. This is because some elements of the log-likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study we succeeded in analytically cancelling exponential elements from the likelihood function, and we are now able to calculate its value for any location in the parameter space and for any inclination-only data set, with full accuracy. Furthermore, we can now calculate the partial derivatives of the likelihood function with desired accuracy. Locating the maximum likelihood without the assumptions required by previous methods is now straightforward. The information to separate the mean inclination from the precision parameter will be lost for very steep and dispersed data sets. It is worth noting that the likelihood function always has a maximum value. However, for some dispersed and steep data sets with few samples, the likelihood function takes its highest value on the boundary of the parameter space, i.e. at inclinations of +/- 90 degrees, but with relatively well defined dispersion.
Our simulations indicate that this occurs quite frequently for certain data sets, and relatively small perturbations in the data will drive the maxima to the boundary. We interpret this to indicate that, for such data sets, the information needed to separate the mean inclination and the precision parameter is permanently lost. To assess the reliability and accuracy of our method we generated a large number of random Fisher-distributed data sets and used seven methods to estimate the mean inclination and precision parameter. These comparisons are described by Levi and Arason at the 2006 AGU Fall meeting. The results of the various methods are very favourable to our new robust maximum likelihood method, which, on average, is the most reliable, and the mean inclination estimates are the least biased toward shallow values. Further information on our inclination-only analysis can be obtained from: http://www.vedur.is/~arason/paleomag
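The kind of numerical instability described above can be sketched with a simpler case than the authors' marginal likelihood. The example below is an assumption-laden illustration using the full Fisher density in colatitude, f(θ) = κ/(2 sinh κ) · exp(κ cos θ) · sin θ, not the inclination-only likelihood of the abstract. It shows how rewriting log(2 sinh κ) as κ + log(1 − e^(−2κ)) removes the exponentially growing term; the function names are hypothetical.

```python
import math

def log_fisher_density_naive(theta, kappa):
    """Direct evaluation; math.sinh overflows for large kappa."""
    return (math.log(kappa) - math.log(2.0 * math.sinh(kappa))
            + kappa * math.cos(theta) + math.log(math.sin(theta)))

def log_fisher_density_stable(theta, kappa):
    """Same quantity with the exponential growth cancelled analytically."""
    log_2sinh = kappa + math.log1p(-math.exp(-2.0 * kappa))
    # kappa * (cos(theta) - 1) stays bounded above by 0, so nothing blows up.
    return (math.log(kappa) - math.log1p(-math.exp(-2.0 * kappa))
            + kappa * (math.cos(theta) - 1.0) + math.log(math.sin(theta))
            + (kappa - log_2sinh + math.log1p(-math.exp(-2.0 * kappa))))

value_small = log_fisher_density_stable(0.3, 2.0)
value_large = log_fisher_density_stable(0.01, 1000.0)   # naive version fails here
```

The last term in the stable version is identically zero and is kept only to make the algebraic cancellation explicit; it can be dropped in practice.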
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
NASA Astrophysics Data System (ADS)
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and are capable of forecasting a full probability distribution. In order to estimate the corresponding regression coefficients, CRPS minimization has been performed in many meteorological post-processing studies over the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield more calibrated forecasts. Theoretically, both scoring rules used as an optimization score should be able to locate a similar and unknown optimum. Discrepancies might result from a wrong distributional assumption of the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation for different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield similar regression coefficients. The log-likelihood estimator is slightly more efficient. A real world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
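The comparison of the two scores can be sketched for the simplest case. The example below assumes a Gaussian predictive distribution with known mean 0 and estimates only the standard deviation by a grid search over each score; the synthetic data, the grid and all function names are illustrative assumptions, not the study's setup. It uses the standard closed form of the Gaussian CRPS, σ[z(2Φ(z) − 1) + 2φ(z) − 1/√π] with z = (y − μ)/σ.

```python
import math
import random

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def neg_log_lik(mu, sigma, y):
    """Negative Gaussian log-likelihood (log score) of one observation."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2.0 * sigma ** 2)

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a Gaussian forecast N(mu, sigma^2) for observation y."""
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * norm_cdf(z) - 1.0) + 2.0 * norm_pdf(z) - 1.0 / math.sqrt(math.pi))

def fit_sigma(score, ys, grid):
    """Pick the sigma on the grid that minimizes the summed score (mu fixed at 0)."""
    return min(grid, key=lambda s: sum(score(0.0, s, y) for y in ys))

random.seed(0)
true_sigma = 2.0
ys = [random.gauss(0.0, true_sigma) for _ in range(2000)]   # synthetic observations
grid = [0.5 + 0.05 * i for i in range(71)]                   # 0.5 ... 4.0

sigma_ml = fit_sigma(neg_log_lik, ys, grid)
sigma_crps = fit_sigma(crps_gaussian, ys, grid)
```

With a correctly specified distribution, both estimates should land close to the true value, which is consistent with the synthetic result reported in the abstract.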
Kang, Seokha; Sultana, Tahera; Eom, Keeseon S; Park, Yung Chul; Soonthornpong, Nathan; Nadler, Steven A; Park, Joong-Ki
2009-01-15
The complete mitochondrial genome sequence was determined for the human pinworm Enterobius vermicularis (Oxyurida: Nematoda) and used to infer its phylogenetic relationship to other major groups of chromadorean nematodes. The E. vermicularis genome is a 14,010-bp circular DNA molecule that encodes 36 genes (12 proteins, 22 tRNAs, and 2 rRNAs). This mtDNA genome lacks atp8, as reported for almost all other nematode species investigated. Phylogenetic analyses (maximum parsimony, maximum likelihood, neighbor joining, and Bayesian inference) of nucleotide sequences for the 12 protein-coding genes of 25 nematode species placed E. vermicularis, a representative of the order Oxyurida, as sister to the main Ascaridida+Rhabditida group. Tree topology comparisons using statistical tests rejected an alternative hypothesis favoring a closer relationship among Ascaridida, Spirurida, and Oxyurida, which has been supported by most studies based on nuclear ribosomal DNA sequences. Unlike the relatively conserved gene arrangement found for most chromadorean taxa, the E. vermicularis mtDNA gene order is unique, sharing no similarity with that of any other nematode species reported to date. This lack of gene order similarity may represent idiosyncratic gene rearrangements unique to this specific lineage of the oxyurids. To more fully understand the extent of gene rearrangement and its evolutionary significance within the nematode phylogenetic framework, additional mitochondrial genomes representing a greater evolutionary diversity of species must be characterized.
Ni, Lianghong; Zhao, Zhili; Xu, Hongxi; Chen, Shilin; Dorje, Gaawe
2016-02-15
Endemic to the Sino-Himalayan subregion, the medicinal alpine plant Gentiana straminea is a threatened species. The genetic and molecular data about it is deficient. Here we report the complete chloroplast (cp) genome sequence of G. straminea, as the first sequenced member of the family Gentianaceae. The cp genome is 148,991bp in length, including a large single copy (LSC) region of 81,240bp, a small single copy (SSC) region of 17,085bp and a pair of inverted repeats (IRs) of 25,333bp. It contains 112 unique genes, including 78 protein-coding genes, 30 tRNAs and 4 rRNAs. The rps16 gene lacks exon2 between trnK-UUU and trnQ-UUG, which is the first rps16 pseudogene found in the nonparasitic plants of Asterids clade. Sequence analysis revealed the presence of 13 forward repeats, 13 palindrome repeats and 39 simple sequence repeats (SSRs). An entire cp genome comparison study of G. straminea and four other species in Gentianales was carried out. Phylogenetic analyses using maximum likelihood (ML) and maximum parsimony (MP) were performed based on 69 protein-coding genes from 36 species of Asterids. The results strongly supported the position of Gentianaceae as one member of the order Gentianales. The complete chloroplast genome sequence will provide intragenic information for its conservation and contribute to research on the genetic and phylogenetic analyses of Gentianales and Asterids. Copyright © 2015 Elsevier B.V. All rights reserved.
Yamaguchi, M; Miya, M; Okiyama, M; Nishida, M
2000-04-01
Larvae of the deep-sea lanternfish genus Hygophum (Myctophidae) exhibit a remarkable morphological diversity that is quite unexpected, considering their homogeneous adult morphology. In an attempt to elucidate the evolutionary patterns of such larval morphological diversity, nucleotide sequences of a portion of the mitochondrially encoded 16S ribosomal RNA gene were determined for seven Hygophum species and three outgroup taxa. Secondary structure-based alignment resulted in a character matrix consisting of 1172 bp of unambiguously aligned sequences, which were subjected to phylogenetic analyses using maximum-parsimony, maximum-likelihood, and neighbor-joining methods. The resultant tree topologies from the three methods were congruent, with most nodes, including that of the genus Hygophum, being strongly supported by various tree statistics. The most parsimonious reconstruction of the three previously recognized, distinct larval morphs onto the molecular phylogeny revealed that one of the morphs had originated as the common ancestor of the genus, the other two having diversified separately in two subsequent major clades. The patterns of such diversification are discussed in terms of the unusual larval eye morphology and geographic distribution. Copyright 2000 Academic Press.
Driving style recognition method using braking characteristics based on hidden Markov model
Wu, Chaozhong; Lyu, Nengchao; Huang, Zhen
2017-01-01
To exploit the advantages of the hidden Markov model in dealing with time series data, three driving styles (aggressive, moderate and mild) are modeled with hidden Markov models based on driver braking characteristics, with the aim of recognizing driving style efficiently. Firstly, braking impulse and the maximum braking unit area of the vacuum booster within a certain time are collected from braking operation, and then general braking and emergency braking characteristics are extracted to code the braking characteristics. Secondly, the braking behavior observation sequence is used to describe the initial parameters of the hidden Markov model, and the generation of the hidden Markov models used for differentiation, which are trained and evaluated on observation sequences for each driving style, is introduced. Thirdly, the maximum log-likelihood can be inferred from the observable parameters. The recognition accuracy of the algorithm is verified through experiments and comparison with two common pattern recognition algorithms. The results showed that driving style discrimination based on the hidden Markov model algorithm can effectively discriminate driving styles. PMID:28837580
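The recognition step described above, scoring an observation sequence under one hidden Markov model per driving style and choosing the style with the highest log-likelihood, can be sketched as follows. Everything numeric here is hypothetical: the two-state models, the coding of observations (0 for general braking, 1 for emergency braking) and the probabilities are invented for illustration and are not the paper's trained parameters.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    loglik = 0.0
    for o in obs[1:]:
        scale = sum(alpha)                    # rescale to avoid underflow
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
        alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][o]
                 for s in range(n)]
    return loglik + math.log(sum(alpha))

# Hypothetical models: (start, transition, emission) with two hidden states each.
MODELS = {
    "mild":       ([0.6, 0.4], [[0.8, 0.2], [0.3, 0.7]], [[0.9, 0.1], [0.7, 0.3]]),
    "moderate":   ([0.5, 0.5], [[0.7, 0.3], [0.4, 0.6]], [[0.6, 0.4], [0.5, 0.5]]),
    "aggressive": ([0.4, 0.6], [[0.6, 0.4], [0.2, 0.8]], [[0.4, 0.6], [0.1, 0.9]]),
}

def classify(obs):
    """Return the style whose model gives the observation sequence the highest log-likelihood."""
    return max(MODELS, key=lambda name: log_likelihood(obs, *MODELS[name]))

style_hard = classify([1, 1, 1, 1, 1])   # mostly emergency braking
style_soft = classify([0, 0, 0, 0, 0])   # only general braking
```

In a real system the model parameters would be estimated from training sequences (for example with the Baum-Welch algorithm) rather than set by hand.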
Algorithms of maximum likelihood data clustering with applications
NASA Astrophysics Data System (ADS)
Giada, Lorenzo; Marsili, Matteo
2002-12-01
We address the problem of data clustering by introducing an unsupervised, parameter-free approach based on the maximum likelihood principle. Starting from the observation that data sets belonging to the same cluster share common information, we construct an expression for the likelihood of any possible cluster structure. The likelihood in turn depends only on the Pearson's coefficient of the data. We discuss clustering algorithms that provide a fast and reliable approximation to maximum likelihood configurations. Compared to standard clustering methods, our approach has the advantages that (i) it is parameter free, (ii) the number of clusters need not be fixed in advance and (iii) the interpretation of the results is transparent. In order to test our approach and compare it with standard clustering algorithms, we analyze two very different data sets: time series of financial market returns and gene expression data. We find that different maximization algorithms produce similar cluster structures whereas the outcome of standard algorithms has a much wider variability.
Naushad, Sohail; Barkema, Herman W.; Luby, Christopher; Condas, Larissa A. Z.; Nobrega, Diego B.; Carson, Domonique A.; De Buck, Jeroen
2016-01-01
Non-aureus staphylococci (NAS), a heterogeneous group of a large number of species and subspecies, are the most frequently isolated pathogens from intramammary infections in dairy cattle. Phylogenetic relationships among bovine NAS species are controversial and have mostly been determined based on single-gene trees. Herein, we analyzed phylogeny of bovine NAS species using whole-genome sequencing (WGS) of 441 distinct isolates. In addition, evolutionary relationships among bovine NAS were estimated from multilocus data of 16S rRNA, hsp60, rpoB, sodA, and tuf genes and sequences from these and numerous other single genes/proteins. All phylogenies were created with FastTree, Maximum-Likelihood, Maximum-Parsimony, and Neighbor-Joining methods. Regardless of methodology, WGS-trees clearly separated bovine NAS species into five monophyletic coherent clades. Furthermore, there were consistent interspecies relationships within clades in all WGS phylogenetic reconstructions. Except for the Maximum-Parsimony tree, multilocus data analysis similarly produced five clades. There were large variations in determining clades and interspecies relationships in single gene/protein trees, under different methods of tree constructions, highlighting limitations of using single genes for determining bovine NAS phylogeny. However, based on WGS data, we established a robust phylogeny of bovine NAS species, unaffected by method or model of evolutionary reconstructions. Therefore, it is now possible to determine associations between phylogeny and many biological traits, such as virulence, antimicrobial resistance, environmental niche, geographical distribution, and host specificity. PMID:28066335
Kumar, Girish; Kocour, Martin; Kunal, Swaraj Priyaranjan
2016-05-01
In order to assess the DNA sequence variation and phylogenetic relationship among five tuna species (Auxis thazard, Euthynnus affinis, Katsuwonus pelamis, Thunnus tonggol, and T. albacares) out of all four tuna genera, partial sequences of the mitochondrial DNA (mtDNA) D-loop region were analyzed. The estimate of intra-specific sequence variation in the studied species was low, ranging from 0.027 to 0.080 [Kimura's two parameter distance (K2P)], whereas values of inter-specific variation ranged from 0.049 to 0.491. The longtail tuna (T. tonggol) and yellowfin tuna (T. albacares) were found to share a close relationship (K2P = 0.049), while skipjack tuna (K. pelamis) was the most divergent species studied. Phylogenetic analysis using Maximum-Likelihood (ML) and Neighbor-Joining (NJ) methods supported the monophyletic origin of Thunnus species. Similarly, the phylogeny of Auxis and Euthynnus species substantiates their monophyly. However, results showed a distinct origin of K. pelamis from genus Thunnus as well as Auxis and Euthynnus. Thus, the mtDNA D-loop region sequence data support the polyphyletic origin of tuna species.
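For readers unfamiliar with the K2P values quoted above, the following minimal sketch computes Kimura's two-parameter distance for a pair of aligned sequences using the standard formula d = −½ ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of transitions and transversions. The function name and the toy sequences are illustrative; sites with gaps or ambiguous bases are simply skipped, which is an assumption of this sketch rather than the study's procedure.

```python
import math

PURINES = {"A", "G"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]          # skip gaps and ambiguity codes
    n = len(pairs)
    # Transition: purine <-> purine or pyrimidine <-> pyrimidine.
    transitions = sum(a != b and (a in PURINES) == (b in PURINES) for a, b in pairs)
    # Transversion: purine <-> pyrimidine.
    transversions = sum(a != b and (a in PURINES) != (b in PURINES) for a, b in pairs)
    p, q = transitions / n, transversions / n
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))

d_same = k2p_distance("ACGTACGTAC", "ACGTACGTAC")
d_one_transition = k2p_distance("ACGTACGTAC", "GCGTACGTAC")   # A -> G at the first site
```

The distance exceeds the raw proportion of differing sites (0.1 in the second call) because the model corrects for multiple substitutions at the same site.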
NASA Technical Reports Server (NTRS)
Mccallister, R. D.; Crawford, J. J.
1981-01-01
It is pointed out that the NASA 30/20 GHz program will place in geosynchronous orbit a technically advanced communication satellite which can process time-division multiple access (TDMA) information bursts with a data throughput in excess of 4 GBPS. To guarantee acceptable data quality during periods of signal attenuation it will be necessary to provide a significant forward error correction (FEC) capability. Convolutional decoding (utilizing the maximum-likelihood techniques) was identified as the most attractive FEC strategy. Design trade-offs regarding a maximum-likelihood convolutional decoder (MCD) in a single-chip CMOS implementation are discussed.
PAMLX: a graphical user interface for PAML.
Xu, Bo; Yang, Ziheng
2013-12-01
This note announces pamlX, a graphical user interface/front end for the paml (for Phylogenetic Analysis by Maximum Likelihood) program package (Yang Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555-556; Yang Z. 2007. PAML 4: Phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586-1591). pamlX is written in C++ using the Qt library and communicates with paml programs through files. It can be used to create, edit, and print control files for paml programs and to launch paml runs. The interface is available for free download at http://abacus.gene.ucl.ac.uk/software/paml.html.
Saarela, Jeffery M; Wysocki, William P; Barrett, Craig F; Soreng, Robert J; Davis, Jerrold I; Clark, Lynn G; Kelchner, Scot A; Pires, J Chris; Edger, Patrick P; Mayfield, Dustin R; Duvall, Melvin R
2015-05-04
Whole plastid genomes are being sequenced rapidly from across the green plant tree of life, and phylogenetic analyses of these are increasing resolution and support for relationships that have varied among or been unresolved in earlier single- and multi-gene studies. Pooideae, the cool-season grass lineage, is the largest of the 12 grass subfamilies and includes important temperate cereals, turf grasses and forage species. Although numerous studies of the phylogeny of the subfamily have been undertaken, relationships among some 'early-diverging' tribes conflict among studies, and some relationships among subtribes of Poeae have not yet been resolved. To address these issues, we newly sequenced 25 whole plastomes, which showed rearrangements typical of Poaceae. These plastomes represent 9 tribes and 11 subtribes of Pooideae, and were analysed with 20 existing plastomes for the subfamily. Maximum likelihood (ML), maximum parsimony (MP) and Bayesian inference (BI) robustly resolve most deep relationships in the subfamily. Complete plastome data provide increased nodal support compared with protein-coding data alone at nodes that are not maximally supported. Following the divergence of Brachyelytrum, Phaenospermateae, Brylkinieae-Meliceae and Ampelodesmeae-Stipeae are the successive sister groups of the rest of the subfamily. Ampelodesmeae are nested within Stipeae in the plastome trees, consistent with its hybrid origin between a phaenospermatoid and a stipoid grass (the maternal parent). The core Pooideae are strongly supported and include Brachypodieae, a Bromeae-Triticeae clade and Poeae. Within Poeae, a novel sister group relationship between Phalaridinae and Torreyochloinae is found, and the relative branching order of this clade and Aveninae, with respect to an Agrostidinae-Brizinae clade, are discordant between MP and ML/BI trees. 
Maximum likelihood and Bayesian analyses strongly support Airinae and Holcinae as the successive sister groups of a Dactylidinae-Loliinae clade. Published by Oxford University Press on behalf of the Annals of Botany Company.
Maximum Likelihood Estimation of Nonlinear Structural Equation Models.
ERIC Educational Resources Information Center
Lee, Sik-Yum; Zhu, Hong-Tu
2002-01-01
Developed an EM type algorithm for maximum likelihood estimation of a general nonlinear structural equation model in which the E-step is completed by a Metropolis-Hastings algorithm. Illustrated the methodology with results from a simulation study and two real examples using data from previous studies. (SLD)
ERIC Educational Resources Information Center
Hamaker, Ellen L.; Dolan, Conor V.; Molenaar, Peter C. M.
2003-01-01
Demonstrated, through simulation, that stationary autoregressive moving average (ARMA) models may be fitted readily when T>N, using normal theory raw maximum likelihood structural equation modeling. Also provides some illustrations based on real data. (SLD)
Maximum likelihood phase-retrieval algorithm: applications.
Nahrstedt, D A; Southwell, W H
1984-12-01
The maximum likelihood estimator approach is shown to be effective in determining the wave front aberration in systems involving laser and flow field diagnostics and optical testing. The robustness of the algorithm enables convergence even in cases of severe wave front error and real, nonsymmetrical, obscured amplitude distributions.
Population Synthesis of Radio and Gamma-ray Pulsars using the Maximum Likelihood Approach
NASA Astrophysics Data System (ADS)
Billman, Caleb; Gonthier, P. L.; Harding, A. K.
2012-01-01
We present the results of a pulsar population synthesis of normal pulsars from the Galactic disk using a maximum likelihood method. We seek to maximize the likelihood of a set of parameters in a Monte Carlo population statistics code to better understand their uncertainties and the confidence region of the model's parameter space. The maximum likelihood method allows for the use of more applicable Poisson statistics in the comparison of distributions of small numbers of detected gamma-ray and radio pulsars. Our code simulates pulsars at birth using Monte Carlo techniques and evolves them to the present assuming initial spatial, kick velocity, magnetic field, and period distributions. Pulsars are spun down to the present and given radio and gamma-ray emission characteristics. We select measured distributions of radio pulsars from the Parkes Multibeam survey and Fermi gamma-ray pulsars to perform a likelihood analysis of the assumed model parameters such as initial period and magnetic field, and radio luminosity. We present the results of a grid search of the parameter space as well as a search for the maximum likelihood using a Markov Chain Monte Carlo method. We express our gratitude for the generous support of the Michigan Space Grant Consortium, of the National Science Foundation (REU and RUI), the NASA Astrophysics Theory and Fundamental Program and the NASA Fermi Guest Investigator Program.
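The role of Poisson statistics in comparing small numbers of detected pulsars with simulated ones can be illustrated with a minimal sketch. This is not the authors' population synthesis code: the function names, the toy three-bin model and the parameter grid are hypothetical, chosen only to show how a binned Poisson log-likelihood is evaluated and maximized over a grid.

```python
import math


def poisson_log_likelihood(observed, expected):
    """Sum of log Poisson probabilities of observed bin counts given expected means."""
    total = 0.0
    for n, mu in zip(observed, expected):
        if mu <= 0.0:
            raise ValueError("expected counts must be positive")
        # log P(n | mu) = n*log(mu) - mu - log(n!)
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total


def grid_search(observed, model, grid):
    """Return the grid value whose predicted counts maximize the log-likelihood."""
    return max(grid, key=lambda theta: poisson_log_likelihood(observed, model(theta)))


# Hypothetical toy model: one overall scale spread over three bins with fixed weights.
def toy_model(scale):
    return [scale * w for w in (0.25, 0.6, 0.15)]


observed_counts = [3, 7, 2]  # made-up detected counts per bin
best_scale = grid_search(observed_counts, toy_model, [6, 8, 10, 12, 14, 16])
```

Because each bin contributes its own Poisson term, bins with only a few detections are handled without the Gaussian approximation that a chi-square comparison would assume. A grid search of this kind is only practical for a few parameters, which is consistent with the abstract's additional use of a Markov Chain Monte Carlo search.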
Wu, Yufeng
2012-03-01
Incomplete lineage sorting can cause incongruence between the phylogenetic history of genes (the gene tree) and that of the species (the species tree), which can complicate the inference of phylogenies. In this article, I present a new coalescent-based algorithm for species tree inference with maximum likelihood. I first describe an improved method for computing the probability of a gene tree topology given a species tree, which is much faster than an existing algorithm by Degnan and Salter (2005). Based on this method, I develop a practical algorithm that takes a set of gene tree topologies and infers species trees with maximum likelihood. This algorithm searches for the best species tree by starting from initial species trees and performing heuristic search to obtain better trees with higher likelihood. This algorithm, called STELLS (which stands for Species Tree InfErence with Likelihood for Lineage Sorting), has been implemented in a program that is downloadable from the author's web page. The simulation results show that the STELLS algorithm is more accurate than an existing maximum likelihood method for many datasets, especially when there is noise in gene trees. I also show that the STELLS algorithm is efficient and can be applied to real biological datasets. © 2011 The Author. Evolution© 2011 The Society for the Study of Evolution.
Subbotin, S A; Vierstraete, A; De Ley, P; Rowe, J; Waeyenberge, L; Moens, M; Vanfleteren, J R
2001-10-01
The ITS1, ITS2, and 5.8S gene sequences of nuclear ribosomal DNA from 40 taxa of the family Heteroderidae (including the genera Afenestrata, Cactodera, Heterodera, Globodera, Punctodera, Meloidodera, Cryphodera, and Thecavermiculatus) were sequenced and analyzed. The ITS regions displayed high levels of sequence divergence within Heteroderinae and compared to outgroup taxa. Unlike recent findings in root knot nematodes, ITS sequence polymorphism does not appear to complicate phylogenetic analysis of cyst nematodes. Phylogenetic analyses with maximum-parsimony, minimum-evolution, and maximum-likelihood methods were performed with a range of computer alignments, including elision and culled alignments. All multiple alignments and phylogenetic methods yielded similar basic structure for phylogenetic relationships of Heteroderidae. The cyst-forming nematodes are represented by six main clades corresponding to morphological characters and host specialization, with certain clades assuming different positions depending on alignment procedure and/or method of phylogenetic inference. Hypotheses of monophyly of Punctoderinae and Heteroderinae are, respectively, strongly and moderately supported by the ITS data across most alignments. Close relationships were revealed between the Avenae and the Sacchari groups and between the Humuli group and the species H. salixophila within Heteroderinae. The Goettingiana group occupies a basal position within this subfamily. The validity of the genera Afenestrata and Bidera was tested and is discussed based on molecular data. We conclude that ITS sequence data are appropriate for studies of relationships within the different species groups and less so for recovery of more ancient speciations within Heteroderidae. Copyright 2001 Academic Press.
A study of parameter identification
NASA Technical Reports Server (NTRS)
Herget, C. J.; Patterson, R. E., III
1978-01-01
A set of definitions for deterministic parameter identifiability was proposed. Deterministic parameter identifiability properties are presented based on four system characteristics: direct parameter recoverability, properties of the system transfer function, properties of output distinguishability, and uniqueness properties of a quadratic cost functional. Stochastic parameter identifiability was defined in terms of the existence of an estimation sequence for the unknown parameters which is consistent in probability. Stochastic parameter identifiability properties are presented based on the following characteristics: convergence properties of the maximum likelihood estimate, properties of the joint probability density functions of the observations, and properties of the information matrix.
Estimation and classification by sigmoids based on mutual information
NASA Technical Reports Server (NTRS)
Baram, Yoram
1994-01-01
An estimate of the probability density function of a random vector is obtained by maximizing the mutual information between the input and the output of a feedforward network of sigmoidal units with respect to the input weights. Classification problems can be solved by selecting the class associated with the maximal estimated density. Newton's method, applied to an estimated density, yields a recursive maximum likelihood estimator, consisting of a single internal layer of sigmoids, for a random variable or a random sequence. Applications to diamond classification and to the prediction of a sunspot process are demonstrated.
NASA Astrophysics Data System (ADS)
Gao, Fengtao; Wei, Min; Zhu, Ying; Guo, Hua; Chen, Songlin; Yang, Guanpin
2017-06-01
This study presents the complete mitochondrial genome of the hybrid Epinephelus moara♀× Epinephelus lanceolatus♂. The genome is 16886 bp in length, and contains 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes, a light-strand replication origin and a control region. Additionally, phylogenetic analysis based on the nucleotide sequences of 13 conserved protein-coding genes using the maximum likelihood method indicated that the mitochondrial genome is maternally inherited. This study presents genomic data for studying phylogenetic relationships and breeding of hybrid Epinephelinae.
Villano, Umbertina; Lo Presti, Alessandra; Equestre, Michele; Cella, Eleonora; Pisani, Giulio; Giovanetti, Marta; Bruni, Roberto; Tritarelli, Elena; Amicosante, Massimo; Grifoni, Alba; Scarcella, Carmelo; El-Hamad, Issa; Pezzoli, Maria Chiara; Angeletti, Silvia; Silvia, Angeletti; Ciccaglione, Anna Rita; Ciccozzi, Massimo
2015-07-25
Hepatitis B virus (HBV) infection is widespread and is considered a major health problem worldwide. The global distribution of HBV varies significantly between countries and between regions of the world. Among the many factors contributing to the changing epidemiology of viral hepatitis, the movement of people within and between countries is a potentially important one. In Italy, the number of migrant individuals has been increasing during the past 25 years. HBV genotype D has been found throughout the world, although its highest prevalence is in the Mediterranean area, the Middle East and southern Asia. We describe the molecular epidemiology of HBV in a chronically infected population of migrants living in Italy, using phylogenetic analysis. HBV-DNA was amplified and sequenced from 43 chronically infected patients. Phylogenetic and evolutionary analyses were performed using both maximum likelihood and Bayesian methods. Of the 43 HBV S gene isolates from migrants, 25 (58.1 %) were classified as genotype D. Maximum likelihood analysis showed that Moldavian sequences intermixed mostly with other foreign sequences rather than with Italian ones. Italian sequences clustered mostly together in a main clade, separate from all others. The estimated time of the tree's root had a mean value of 17 years ago, placing the origin of the tree in 1992. The skyline plot showed that the number of infections increased gradually until the early 2005s, after which it reached a plateau. Comparison of the phylogenetic data with the migrants' dates of arrival in Italy suggests that the migrants may have arrived in Italy already infected in their country of origin. In conclusion, this is the first paper in which phylogenetic analysis and genetic evolution have been used to characterize the circulation of HBV subgenotype D1 in a selected and homogeneous group of migrants from a restricted area of the Balkans, and to approximately define the period of infection in addition to the migration date.
Ali, Akhtar; Ali, Ijaz
2015-01-01
Dengue virus serotype 2 (DENV-2) isolates have been implicated in deadly outbreaks of dengue fever (DF) and dengue hemorrhagic fever (DHF) in several regions of the world. Phylogenetic analysis of DENV-2 isolates collected from particular countries has been performed using partial or individual genes but only a few studies have examined complete whole-genome sequences collected worldwide. Herein, 50 complete genome sequences of DENV-2 isolates, reported over the past 70 years from 19 different countries, were downloaded from GenBank. Phylogenetic analysis was conducted and evolutionary distances of the 50 DENV-2 isolates were determined using maximum likelihood (ML) trees or Bayesian phylogenetic analysis created from complete genome nucleotide (nt) and amino acid (aa) sequences or individual gene sequences. The results showed that all DENV-2 isolates fell into seven main groups containing five previously defined genotypes. A Cosmopolitan genotype showed further division into three groups (C-I, C-II, and C-III) with the C-I group containing two subgroups (C-IA and C-IB). Comparison of the aa sequences showed specific mutations among the various groups of DENV-2 isolates. A maximum number of aa mutations was observed in the NS5 gene, followed by the NS2A, NS3 and NS1 genes, while the smallest number of aa substitutions was recorded in the capsid gene, followed by the PrM/M, NS4A, and NS4B genes. Maximum evolutionary distances were found in the NS2A gene, followed by the NS4A and NS4B genes. Based on these results, we propose that genotyping of DENV-2 isolates in future studies should be performed on entire genome sequences in order to gain a complete understanding of the evolution of various isolates reported from different geographical locations around the world. PMID:26414178
Estimating the variance for heterogeneity in arm-based network meta-analysis.
Piepho, Hans-Peter; Madden, Laurence V; Roger, James; Payne, Roger; Williams, Emlyn R
2018-04-19
Network meta-analysis can be implemented by using arm-based or contrast-based models. Here we focus on arm-based models and fit them using generalized linear mixed model procedures. Full maximum likelihood (ML) estimation leads to biased trial-by-treatment interaction variance estimates for heterogeneity. Thus, our objective is to investigate alternative approaches to variance estimation that reduce bias compared with full ML. Specifically, we use penalized quasi-likelihood/pseudo-likelihood and hierarchical (h) likelihood approaches. In addition, we consider a novel model modification that yields estimators akin to the residual maximum likelihood estimator for linear mixed models. The proposed methods are compared by simulation, and 2 real datasets are used for illustration. Simulations show that penalized quasi-likelihood/pseudo-likelihood and h-likelihood reduce bias and yield satisfactory coverage rates. Sum-to-zero restriction and baseline contrasts for random trial-by-treatment interaction effects, as well as a residual ML-like adjustment, also reduce bias compared with an unconstrained model when ML is used, but coverage rates are not quite as good. Penalized quasi-likelihood/pseudo-likelihood and h-likelihood are therefore recommended. Copyright © 2018 John Wiley & Sons, Ltd.
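The bias of full ML variance estimates, and the effect of a residual-ML-type adjustment, can be seen in the simplest textbook case. The derivation below assumes independent observations $y_1,\dots,y_n \sim N(\mu,\sigma^2)$, which is far simpler than the arm-based generalized linear mixed models of the abstract; it is offered only as an illustration of why the adjustment matters, not as the authors' estimator.

```latex
\hat{\sigma}^2_{\mathrm{ML}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2,
\qquad
\mathrm{E}\!\left[\hat{\sigma}^2_{\mathrm{ML}}\right] = \frac{n-1}{n}\,\sigma^2 < \sigma^2,
\\[6pt]
\hat{\sigma}^2_{\mathrm{REML}} = \frac{1}{n-1}\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2,
\qquad
\mathrm{E}\!\left[\hat{\sigma}^2_{\mathrm{REML}}\right] = \sigma^2 .
```

Full ML divides by $n$ and so ignores the degree of freedom spent on estimating $\mu$; the residual likelihood restores it. With many fixed effects relative to the number of observations, the downward bias of ML grows accordingly.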
On Muthen's Maximum Likelihood for Two-Level Covariance Structure Models
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Hayashi, Kentaro
2005-01-01
Data in social and behavioral sciences are often hierarchically organized. Special statistical procedures that take into account the dependence of such observations have been developed. Among procedures for 2-level covariance structure analysis, Muthen's maximum likelihood (MUML) has the advantage of easier computation and faster convergence. When…
Maximum Likelihood Estimation of Nonlinear Structural Equation Models with Ignorable Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum; Song, Xin-Yuan; Lee, John C. K.
2003-01-01
The existing maximum likelihood theory and its computer software in structural equation modeling are established on the basis of linear relationships among latent variables with fully observed data. However, in social and behavioral sciences, nonlinear relationships among the latent variables are important for establishing more meaningful models…
Mixture Rasch Models with Joint Maximum Likelihood Estimation
ERIC Educational Resources Information Center
Willse, John T.
2011-01-01
This research provides a demonstration of the utility of mixture Rasch models. Specifically, a model capable of estimating a mixture partial credit model using joint maximum likelihood is presented. Like the partial credit model, the mixture partial credit model has the beneficial feature of being appropriate for analysis of assessment data…
Consistency of Rasch Model Parameter Estimation: A Simulation Study.
ERIC Educational Resources Information Center
van den Wollenberg, Arnold L.; And Others
1988-01-01
The unconditional--simultaneous--maximum likelihood (UML) estimation procedure for the one-parameter logistic model produces biased estimators. The UML method is inconsistent and is not a good alternative to conditional maximum likelihood method, at least with small numbers of items. The minimum Chi-square estimation procedure produces unbiased…
Model uncertainty estimation and risk assessment is essential to environmental management and informed decision making on pollution mitigation strategies. In this study, we apply a probabilistic methodology, which combines Bayesian Monte Carlo simulation and Maximum Likelihood e...
ERIC Educational Resources Information Center
Casabianca, Jodi M.; Lewis, Charles
2015-01-01
Loglinear smoothing (LLS) estimates the latent trait distribution while making fewer assumptions about its form and maintaining parsimony, thus leading to more precise item response theory (IRT) item parameter estimates than standard marginal maximum likelihood (MML). This article provides the expectation-maximization algorithm for MML estimation…
A Study of Item Bias for Attitudinal Measurement Using Maximum Likelihood Factor Analysis.
ERIC Educational Resources Information Center
Mayberry, Paul W.
A technique for detecting item bias that is responsive to attitudinal measurement considerations is a maximum likelihood factor analysis procedure comparing multivariate factor structures across various subpopulations, often referred to as SIFASP. The SIFASP technique allows for factorial model comparisons in the testing of various hypotheses…
The Effects of Model Misspecification and Sample Size on LISREL Maximum Likelihood Estimates.
ERIC Educational Resources Information Center
Baldwin, Beatrice
The robustness of LISREL computer program maximum likelihood estimates under specific conditions of model misspecification and sample size was examined. The population model used in this study contains one exogenous variable; three endogenous variables; and eight indicator variables, two for each latent variable. Conditions of model…
An EM Algorithm for Maximum Likelihood Estimation of Process Factor Analysis Models
ERIC Educational Resources Information Center
Lee, Taehun
2010-01-01
In this dissertation, an Expectation-Maximization (EM) algorithm is developed and implemented to obtain maximum likelihood estimates of the parameters and the associated standard error estimates characterizing temporal flows for the latent variable time series following stationary vector ARMA processes, as well as the parameters defining the…
Cheng, Tian; Liu, Guo-Hua; Song, Hui-Qun; Lin, Rui-Qing; Zhu, Xing-Quan
2016-03-01
Hymenolepis nana, commonly known as the dwarf tapeworm, is one of the most common tapeworms of humans and rodents and can cause hymenolepiasis. Although this zoonotic tapeworm is of socio-economic significance in many countries of the world, its genetics, systematics, epidemiology, and biology are poorly understood. In the present study, we sequenced and characterized the complete mitochondrial (mt) genome of H. nana. The mt genome is 13,764 bp in size and encodes 36 genes, including 12 protein-coding genes, 2 ribosomal RNA, and 22 transfer RNA genes. All genes are transcribed in the same direction. The gene order and genome content are completely identical with their congener Hymenolepis diminuta. Phylogenetic analyses based on concatenated amino acid sequences of 12 protein-coding genes by Bayesian inference, Maximum likelihood, and Maximum parsimony showed the division of class Cestoda into two orders, supported the monophylies of both the orders Cyclophyllidea and Pseudophyllidea. Analyses of mt genome sequences also support the monophylies of the three families Taeniidae, Hymenolepididae, and Diphyllobothriidae. This novel mt genome provides a useful genetic marker for studying the molecular epidemiology, systematics, and population genetics of the dwarf tapeworm and should have implications for the diagnosis, prevention, and control of hymenolepiasis in humans.
Cross-Border Sexual Transmission of the Newly Emerging HIV-1 Clade CRF51_01B
Cheong, Hui Ting; Ng, Kim Tien; Ong, Lai Yee; Chook, Jack Bee; Chan, Kok Gan; Takebe, Yutaka; Kamarulzaman, Adeeba; Tee, Kok Keng
2014-01-01
A novel HIV-1 recombinant clade (CRF51_01B) was recently identified among men who have sex with men (MSM) in Singapore. As cases of sexually transmitted HIV-1 infection increase concurrently in two socioeconomically intimate countries such as Malaysia and Singapore, cross transmission of HIV-1 between said countries is highly probable. In order to investigate the timeline for the emergence of HIV-1 CRF51_01B in Singapore and its possible introduction into Malaysia, 595 HIV-positive subjects recruited in Kuala Lumpur from 2008 to 2012 were screened. Phylogenetic relationship of 485 amplified polymerase gene sequences was determined through neighbour-joining method. Next, near-full length sequences were amplified for genomic sequences inferred to be CRF51_01B and subjected to further analysis implemented through Bayesian Markov chain Monte Carlo (MCMC) sampling and maximum likelihood methods. Based on the near full length genomes, two isolates formed a phylogenetic cluster with CRF51_01B sequences of Singapore origin, sharing identical recombination structure. Spatial and temporal information from Bayesian MCMC coalescent and maximum likelihood analysis of the protease, gp120 and gp41 genes suggest that Singapore is probably the country of origin of CRF51_01B (as early as in the mid-1990s) and featured a Malaysian who acquired the infection through heterosexual contact as host for its ancestral lineages. CRF51_01B then spread rapidly among the MSM in Singapore and Malaysia. Although the importation of CRF51_01B from Singapore to Malaysia is supported by coalescence analysis, the narrow timeframe of the transmission event indicates a closely linked epidemic. Discrepancies in the estimated divergence times suggest that CRF51_01B may have arisen through multiple recombination events from more than one parental lineage. We report the cross transmission of a novel CRF51_01B lineage between countries that involved different sexual risk groups. 
Understanding the cross-border transmission of HIV-1 involving sexual networks is crucial for effective intervention strategies in the region. PMID:25340817
NASA Technical Reports Server (NTRS)
1979-01-01
The computer program Linear SCIDNT which evaluates rotorcraft stability and control coefficients from flight or wind tunnel test data is described. It implements the maximum likelihood method to maximize the likelihood function of the parameters based on measured input/output time histories. Linear SCIDNT may be applied to systems modeled by linear constant-coefficient differential equations. This restriction in scope allows the application of several analytical results which simplify the computation and improve its efficiency over the general nonlinear case.
Remarkable convergent evolution in specialized parasitic Thecostraca (Crustacea)
Pérez-Losada, Marcos; Høeg, Jens T; Crandall, Keith A
2009-01-01
Background: The Thecostraca are arguably the most morphologically and biologically variable group within the Crustacea, including both suspension feeders (Cirripedia: Thoracica and Acrothoracica) and parasitic forms (Cirripedia: Rhizocephala, Ascothoracida and Facetotecta). Similarities between the metamorphosis found in the Facetotecta and Rhizocephala suggest a common evolutionary origin, but until now no comprehensive study has looked at the basic evolution of these thecostracan groups. Results: To this end, we collected DNA sequences from three nuclear genes [18S rRNA (2,305), 28S rRNA (2,402), Histone H3 (328)] and 41 larval characters in seven facetotectans, five ascothoracidans, three acrothoracicans, 25 rhizocephalans and 39 thoracicans (ingroup) and 12 Malacostraca and 10 Copepoda (outgroup). Maximum parsimony, maximum likelihood and Bayesian analyses showed the Facetotecta, Ascothoracida and Cirripedia each as monophyletic. The better resolved and highly supported DNA maximum likelihood and morphological-DNA Bayesian analysis trees depicted the main phylogenetic relationships within the Thecostraca as (Facetotecta, (Ascothoracida, (Acrothoracica, (Rhizocephala, Thoracica)))). Conclusion: Our analyses indicate a convergent evolution of the very similar and highly reduced slug-shaped stages found during metamorphosis of both the Rhizocephala and the Facetotecta. This provides a remarkable case of convergent evolution and implies that the advanced endoparasitic mode of life known from the Rhizocephala and strongly indicated for the Facetotecta had no common origin. Future analyses are needed to determine whether the most recent common ancestor of the Thecostraca was free-living or some primitive form of ectoparasite. PMID:19374762
NASA Astrophysics Data System (ADS)
Kilany, Mona
2017-11-01
The potentially deleterious effects of methylene blue (MB) on human health have driven interest in its prompt removal. Bioremediation is an effective and eco-friendly approach for removing MB. Soil bacteria were isolated and examined for their potential to remove MB. The most potent bacterial candidate was characterized and identified using the 16S rRNA sequencing technique. The evolutionary history of the isolate was inferred by the maximum likelihood method. Some physicochemical parameters were optimized for maximum decolorization. The decolorization mechanism and the microbial toxicity of MB (100 mg/l) and its by-products were investigated. The contribution of heat-killed bacteria to color adsorption was also investigated. The bacterial isolate was identified as Stenotrophomonas maltophilia strain Kilany_MB 16S ribosomal RNA gene with 99% sequence similarity. The sequence was submitted to NCBI (Accession number = KU533726). The phylogeny depicted the relationships between the 16S ribosomal RNA gene, partial sequence (1442 bp), of the isolated strain and other strains related to Stenotrophomonas maltophilia in the GenBank database. The optimal conditions were found to be pH 5 at 30 °C after 24 h with 5 mg/l MB, giving a maximum decolorization of 61.3%. The microbial toxicity study demonstrated a relative reduction in the toxicity of MB decolorized products on test bacteria. Color removal was shown to occur by both biosorption and biodegradation, with heat-killed and live cells showing maximum decolorization of 43 and 52%, respectively, after 24-h incubation. It was demonstrated that the mechanism of color removal is by adsorption. Therefore, the good performance of S. maltophilia in MB color removal supports the use of these bacteria in environmental clean-up and restoration of the ecosystem.
Spotorno O, Angel E; Córdova, Luis; Solari I, Aldo
2008-12-01
To identify and characterize Chilean samples of Trypanosoma cruzi and their association with hosts, the first 516 bp of the mitochondrial cytochrome b gene were sequenced from eight biological samples, and phylogenetically compared with 20 other known American sequences. The molecular characterization of these 28 sequences in a maximum likelihood phylogram (-lnL = 1255.12, tree length = 180, consistency index = 0.79) allowed the robust identification (bootstrap % > 99) of three previously known discrete typing units (DTU): DTU IIb, IIa, and I. An apparently undescribed new sequence found in four new Chilean samples was detected and designated as DTU Ib; they were separated by 24.7 differences, but robustly related (bootstrap % = 97 in 500 replicates) to those of DTU I by sharing 12 substitutions, among which four were nonsynonymous ones. This new DTU Ib was also robust (bootstrap % = 100), and characterized by 10 unambiguous substitutions, with a single nonsynonymous G to T change at site 409. The fact that two of these new sequences were found in parasites from a Chilean endemic caviomorph rodent, Octodon degus, and that they were closely related to the ancient DTU I suggested old origins and a long association with caviomorph hosts.
Liu, Guo-Hua; Li, Chun; Li, Jia-Yuan; Zhou, Dong-Hui; Xiong, Rong-Chuan; Lin, Rui-Qing; Zou, Feng-Cai; Zhu, Xing-Quan
2012-01-01
Sparganosis, caused by the plerocercoid larvae of members of the genus Spirometra, can cause significant public health problem and considerable economic losses. In the present study, the complete mitochondrial DNA (mtDNA) sequence of Spirometra erinaceieuropaei from China was determined, characterized and compared with that of S. erinaceieuropaei from Japan. The gene arrangement in the mt genome sequences of S. erinaceieuropaei from China and Japan is identical. The identity of the mt genomes was 99.1% between S. erinaceieuropaei from China and Japan, and the complete mtDNA sequence of S. erinaceieuropaei from China is slightly shorter (2 bp) than that from Japan. Phylogenetic analysis of S. erinaceieuropaei with other representative cestodes using two different computational algorithms [Bayesian inference (BI) and maximum likelihood (ML)] based on concatenated amino acid sequences of 12 protein-coding genes, revealed that S. erinaceieuropaei is closely related to Diphyllobothrium spp., supporting classification based on morphological features. The present study determined the complete mtDNA sequences of S. erinaceieuropaei from China that provides novel genetic markers for studying the population genetics and molecular epidemiology of S. erinaceieuropaei in humans and animals. PMID:22553464
Analysis of the cytochrome c oxidase subunit II (COX2) gene in giant panda, Ailuropoda melanoleuca.
Ling, S S; Zhu, Y; Lan, D; Li, D S; Pang, H Z; Wang, Y; Li, D Y; Wei, R P; Zhang, H M; Wang, C D; Hu, Y D
2017-01-23
The giant panda, Ailuropoda melanoleuca (Ursidae), has a unique bamboo-based diet; however, this low-energy intake has been sufficient to maintain the metabolic processes of this species since the fourth ice age. As mitochondria are the main sites for energy metabolism in animals, the protein-coding genes involved in mitochondrial respiratory chains, particularly cytochrome c oxidase subunit II (COX2), which is the rate-limiting enzyme in electron transfer, could play an important role in giant panda metabolism. Therefore, the present study aimed to isolate, sequence, and analyze the COX2 DNA from individuals kept at the Giant Panda Protection and Research Center, China, and compare these sequences with those of the other Ursidae family members. Multiple sequence alignment showed that the COX2 gene had three point mutations that defined three haplotypes, with 60% of the sequences corresponding to haplotype I. The neutrality tests revealed that the COX2 gene was conserved throughout evolution, and the maximum likelihood phylogenetic analysis, using homologous sequences from other Ursidae species, showed clustering of the COX2 sequences of giant pandas, suggesting that this gene evolved differently in them.
An evaluation of percentile and maximum likelihood estimators of Weibull parameters
Stanley J. Zarnoch; Tommy R. Dell
1985-01-01
Two methods of estimating the three-parameter Weibull distribution were evaluated by computer simulation and field data comparison. Maximum likelihood estimators (MLB) with bias correction were calculated with the computer routine FITTER (Bailey 1974); percentile estimators (PCT) were those proposed by Zanakis (1979). The MLB estimators had superior smaller bias and...
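As a rough illustration of what a maximum likelihood fit of a Weibull distribution involves, the sketch below solves the likelihood equation for the shape parameter by bisection. It is a simplified two-parameter version that assumes the location parameter is zero; it is not the FITTER routine or the bias-corrected MLB estimator evaluated in the study, and the function name, the bracket for the shape, and the simulated sample are all illustrative assumptions.

```python
import math
import random


def weibull_mle(data, iterations=100):
    """Two-parameter Weibull MLE (shape, scale); location is ASSUMED to be 0."""
    top = max(data)
    y = [x / top for x in data]            # rescale so all powers stay in (0, 1]
    logs = [math.log(v) for v in y]
    mean_log = sum(logs) / len(y)

    def score(k):
        # Likelihood equation for the shape k; it is increasing in k,
        # and its root is the maximum likelihood estimate.
        powers = [v ** k for v in y]
        weighted = sum(p * l for p, l in zip(powers, logs)) / sum(powers)
        return weighted - 1.0 / k - mean_log

    lo, hi = 0.05, 50.0                    # assumed bracket for the shape
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    # Given the shape, the scale has a closed form (undo the rescaling by `top`).
    scale = top * (sum(v ** shape for v in y) / len(y)) ** (1.0 / shape)
    return shape, scale


# Simulated sample with scale 3 and shape 2 (hypothetical values).
rng = random.Random(0)
sample = [rng.weibullvariate(3.0, 2.0) for _ in range(5000)]
shape_hat, scale_hat = weibull_mle(sample)
print(round(shape_hat, 2), round(scale_hat, 2))
```

A percentile estimator would instead match selected sample quantiles to the Weibull quantile function, which avoids the iterative solve at some cost in efficiency.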
ERIC Educational Resources Information Center
Klein, Andreas G.; Muthen, Bengt O.
2007-01-01
In this article, a nonlinear structural equation model is introduced and a quasi-maximum likelihood method for simultaneous estimation and testing of multiple nonlinear effects is developed. The focus of the new methodology lies on efficiency, robustness, and computational practicability. Monte-Carlo studies indicate that the method is highly…
Maximum Likelihood Analysis of Nonlinear Structural Equation Models with Dichotomous Variables
ERIC Educational Resources Information Center
Song, Xin-Yuan; Lee, Sik-Yum
2005-01-01
In this article, a maximum likelihood approach is developed to analyze structural equation models with dichotomous variables that are common in behavioral, psychological and social research. To assess nonlinear causal effects among the latent variables, the structural equation in the model is defined by a nonlinear function. The basic idea of the…
Unclassified Publications of Lincoln Laboratory, 1 January - 31 December 1990. Volume 16
1990-12-31
Apr. 1990 ADA223419 Hopped Communication Systems with Nonuniform Hopping Distributions 880 Bistatic Radar Cross Section of a Fenn, A.J. 2 May1990...EXPERIMENT JA-6241 MS-8424 LUNAR PERTURBATION MAXIMUM LIKELIHOOD ALGORITHM JA-6241 JA-6467 LWIR SPECTRAL BAND MAXIMUM LIKELIHOOD ESTIMATOR JA-6476 MS-8466
Expected versus Observed Information in SEM with Incomplete Normal and Nonnormal Data
ERIC Educational Resources Information Center
Savalei, Victoria
2010-01-01
Maximum likelihood is the most common estimation method in structural equation modeling. Standard errors for maximum likelihood estimates are obtained from the associated information matrix, which can be estimated from the sample using either expected or observed information. It is known that, with complete data, estimates based on observed or…
ERIC Educational Resources Information Center
Yang, Xiangdong; Poggio, John C.; Glasnapp, Douglas R.
2006-01-01
The effects of five ability estimators, that is, maximum likelihood estimator, weighted likelihood estimator, maximum a posteriori, expected a posteriori, and Owen's sequential estimator, on the performances of the item response theory-based adaptive classification procedure on multiple categories were studied via simulations. The following…
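To make the difference between some of these ability estimators concrete, here is a minimal grid-based sketch of the maximum likelihood (ML), maximum a posteriori (MAP) and expected a posteriori (EAP) estimates. It assumes a two-parameter logistic (2PL) model, a standard normal prior and a grid on [-4, 4]; the item parameters and response pattern are hypothetical, and the weighted likelihood and Owen's sequential estimators are not shown.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def ability_estimates(responses, items, n_points=8001):
    """Grid-based ML, MAP and EAP ability estimates with an assumed N(0, 1) prior."""
    grid = [-4.0 + 8.0 * i / (n_points - 1) for i in range(n_points)]
    like, post = [], []
    for theta in grid:
        value = 1.0
        for u, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            value *= p if u == 1 else 1.0 - p
        like.append(value)
        post.append(value * math.exp(-0.5 * theta * theta))  # unnormalised posterior
    ml = grid[max(range(n_points), key=like.__getitem__)]    # likelihood mode
    map_ = grid[max(range(n_points), key=post.__getitem__)]  # posterior mode
    eap = sum(t * w for t, w in zip(grid, post)) / sum(post)  # posterior mean
    return ml, map_, eap


items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]   # hypothetical (a, b) pairs
ml, map_, eap = ability_estimates([1, 1, 0], items)
print(round(ml, 2), round(map_, 2), round(eap, 2))
```

With a prior centred at zero, the MAP and EAP estimates are pulled toward zero relative to the ML estimate. For all-correct or all-incorrect patterns the ML estimate in this sketch simply lands on the edge of the grid, because the likelihood has no finite maximum.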
Bias and Efficiency in Structural Equation Modeling: Maximum Likelihood versus Robust Methods
ERIC Educational Resources Information Center
Zhong, Xiaoling; Yuan, Ke-Hai
2011-01-01
In the structural equation modeling literature, the normal-distribution-based maximum likelihood (ML) method is most widely used, partly because the resulting estimator is claimed to be asymptotically unbiased and most efficient. However, this may not hold when data deviate from normal distribution. Outlying cases or nonnormally distributed data,…
Five Methods for Estimating Angoff Cut Scores with IRT
ERIC Educational Resources Information Center
Wyse, Adam E.
2017-01-01
This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a priori (EAP), modal a priori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
John Hogland; Nedret Billor; Nathaniel Anderson
2013-01-01
Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...
NASA Technical Reports Server (NTRS)
Grove, R. D.; Bowles, R. L.; Mayhew, S. C.
1972-01-01
A maximum likelihood parameter estimation procedure and program were developed for the extraction of the stability and control derivatives of aircraft from flight test data. Nonlinear six-degree-of-freedom equations describing aircraft dynamics were used to derive sensitivity equations for quasilinearization. The maximum likelihood function with quasilinearization was used to derive the parameter change equations, the covariance matrices for the parameters and measurement noise, and the performance index function. The maximum likelihood estimator was mechanized into an iterative estimation procedure utilizing a real time digital computer and graphic display system. This program was developed for 8 measured state variables and 40 parameters. Test cases were conducted with simulated data for validation of the estimation procedure and program. The program was applied to a V/STOL tilt wing aircraft, a military fighter airplane, and a light single engine airplane. The particular nonlinear equations of motion, derivation of the sensitivity equations, addition of accelerations into the algorithm, operational features of the real time digital system, and test cases are described.
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.
2017-11-01
This paper uses matrix calculus techniques to obtain Nonlinear Least Squares Estimator (NLSE), Maximum Likelihood Estimator (MLE) and Linear Pseudo model for nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However the present research paper introduces an innovative method to compute the NLSE using principles in multivariate calculus. This study is concerned with very new optimization techniques used to compute MLE and NLSE. Anh [2] derived NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure to get linear pseudo model for nonlinear regression model. In this research article a new technique is developed to get the linear pseudo model for nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. David Pollard et al. used empirical process techniques to study the asymptotics of the LSE (Least-squares estimation) for the fitting of nonlinear regression function in 2006. In Jae Myung [13] provided a go conceptual for Maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation
Can, Seda; van de Schoot, Rens; Hox, Joop
2015-06-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated in within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influence of the size of the intraclass correlation coefficient (ICC) and of the estimation method (maximum likelihood estimation with robust chi-squares and standard errors, or Bayesian estimation) on the convergence rate is investigated. The other variables of interest were rate of inadmissible solutions and the relative parameter and standard error bias on the between level. The results showed that inadmissible solutions were obtained when there was between level collinearity and the estimation method was maximum likelihood. In the within level multicollinearity condition, all of the solutions were admissible but the bias values were higher compared with the between level collinearity condition. Bayesian estimation appeared to be robust in obtaining admissible parameters but the relative bias was higher than for maximum likelihood estimation. Finally, as expected, high ICC produced less biased results compared to medium ICC conditions.
Maximum Likelihood Estimation with Emphasis on Aircraft Flight Data
NASA Technical Reports Server (NTRS)
Iliff, K. W.; Maine, R. E.
1985-01-01
Accurate modeling of flexible space structures is an important field that is currently under investigation. Parameter estimation, using methods such as maximum likelihood, is one of the ways that the model can be improved. The maximum likelihood estimator has been used to extract stability and control derivatives from flight data for many years. Most of the literature on aircraft estimation concentrates on new developments and applications, assuming familiarity with basic estimation concepts. Some of these basic concepts are presented. The maximum likelihood estimator and the aircraft equations of motion that the estimator uses are briefly discussed. The basic concepts of minimization and estimation are examined for a simple computed aircraft example. The cost functions that are to be minimized during estimation are defined and discussed. Graphic representations of the cost functions are given to help illustrate the minimization process. Finally, the basic concepts are generalized, and estimation from flight data is discussed. Specific examples of estimation of structural dynamics are included. Some of the major conclusions for the computed example are also developed for the analysis of flight data.
Using nearly full-genome HIV sequence data improves phylogeny reconstruction in a simulated epidemic
Yebra, Gonzalo; Hodcroft, Emma B.; Ragonnet-Cronin, Manon L.; Pillay, Deenan; Brown, Andrew J. Leigh; Fraser, Christophe; Kellam, Paul; de Oliveira, Tulio; Dennis, Ann; Hoppe, Anne; Kityo, Cissy; Frampton, Dan; Ssemwanga, Deogratius; Tanser, Frank; Keshani, Jagoda; Lingappa, Jairam; Herbeck, Joshua; Wawer, Maria; Essex, Max; Cohen, Myron S.; Paton, Nicholas; Ratmann, Oliver; Kaleebu, Pontiano; Hayes, Richard; Fidler, Sarah; Quinn, Thomas; Novitsky, Vladimir; Haywards, Andrew; Nastouli, Eleni; Morris, Steven; Clark, Duncan; Kozlakidis, Zisis
2016-01-01
HIV molecular epidemiology studies analyse viral pol gene sequences due to their availability, but whole genome sequencing allows the use of other genes. We aimed to determine which gene(s) provide(s) the best approximation to the real phylogeny by analysing a simulated epidemic (created as part of the PANGEA_HIV project) with a known transmission tree. We sub-sampled a simulated dataset of 4662 sequences into different combinations of genes (gag-pol-env, gag-pol, gag, pol, env and partial pol) and sampling depths (100%, 60%, 20% and 5%), generating 100 replicates for each case. We built maximum-likelihood trees for each combination using RAxML (GTR + Γ), and compared their topologies to the corresponding true tree’s using CompareTree. The accuracy of the trees was significantly proportional to the length of the sequences used, with the gag-pol-env datasets showing the best performance and gag and partial pol sequences showing the worst. The lowest sampling depths (20% and 5%) greatly reduced the accuracy of tree reconstruction and showed high variability among replicates, especially when using the shortest gene datasets. In conclusion, using longer sequences derived from nearly whole genomes will improve the reliability of phylogenetic reconstruction. With low sample coverage, results can be highly variable, particularly when based on short sequences. PMID:28008945
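Comparing an inferred topology with a known true tree can be illustrated by counting shared bipartitions (splits). The sketch below is a generic version of that idea, not a description of how CompareTree is implemented; trees are written as nested tuples of leaf names, and the two five-taxon trees are hypothetical.

```python
def leaf_sets(tree):
    """Return (leaves under this node, leaf sets of all internal nodes below it)."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    leaves = frozenset()
    clades = []
    for child in tree:
        child_leaves, child_clades = leaf_sets(child)
        leaves |= child_leaves
        clades.extend(child_clades)
    clades.append(leaves)
    return leaves, clades


def splits(tree):
    """Non-trivial bipartitions of the leaf set implied by the tree."""
    all_leaves, clades = leaf_sets(tree)
    found = set()
    for clade in clades:
        other = all_leaves - clade
        if len(clade) >= 2 and len(other) >= 2:
            # A split is an unordered pair of complementary leaf sets.
            found.add(frozenset((clade, other)))
    return found


def shared_fraction(true_tree, estimated_tree):
    """Fraction of the true tree's splits that the estimated tree recovers."""
    true_splits = splits(true_tree)
    return len(true_splits & splits(estimated_tree)) / len(true_splits)


true_tree = ((("A", "B"), "C"), ("D", "E"))   # hypothetical true topology
estimated = ((("A", "C"), "B"), ("D", "E"))   # hypothetical inferred topology
print(shared_fraction(true_tree, estimated))
```

Here the estimated tree recovers the split separating D and E from the rest but misplaces the A/B grouping, so half of the true splits are shared.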
Tyrer, Jonathan P; Guo, Qi; Easton, Douglas F; Pharoah, Paul D P
2013-06-06
The development of genotyping arrays containing hundreds of thousands of rare variants across the genome and advances in high-throughput sequencing technologies have made feasible empirical genetic association studies to search for rare disease susceptibility alleles. As single variant testing is underpowered to detect associations, the development of statistical methods to combine analysis across variants - so-called "burden tests" - is an area of active research interest. We previously developed a method, the admixture maximum likelihood test, to test multiple, common variants for association with a trait of interest. We have extended this method, called the rare admixture maximum likelihood test (RAML), for the analysis of rare variants. In this paper we compare the performance of RAML with six other burden tests designed to test for association of rare variants. We used simulation testing over a range of scenarios to test the power of RAML compared to the other rare variant association testing methods. These scenarios modelled differences in effect variability, the average direction of effect and the proportion of associated variants. We evaluated the power for all the different scenarios. RAML tended to have the greatest power for most scenarios where the proportion of associated variants was small, whereas SKAT-O performed a little better for the scenarios with a higher proportion of associated variants. The RAML method makes no assumptions about the proportion of variants that are associated with the phenotype of interest or the magnitude and direction of their effect. The method is flexible and can be applied to both dichotomous and quantitative traits and allows for the inclusion of covariates in the underlying regression model. The RAML method performed well compared to the other methods over a wide range of scenarios. Generally power was moderate in most of the scenarios, underlining the need for large sample sizes in any form of association testing.
Aktas, Munir; Ozübek, Sezayi; Ipek, Duygu Neval Sayın
2013-06-01
The occurrence and distribution of Hepatozoon species in stray dogs, and the developmental stages of Rhipicephalus sanguineus detached from the same dogs in Diyarbakır Province, Turkey is reported. A total of 328 ticks, including 133 adults (55 males and 78 females, the latter consisting of 63 partially engorged and 15 fully engorged) and 195 nymphs (91 partially engorged and 104 fully engorged), were detached from the dogs. Fully engorged nymphs and females were incubated at 27 °C and relative humidity of 85 % to molt to adult stage and recover eggs. The ticks were pooled according to sex and developmental stage. No Hepatozoon gamonts were found, whereas, by PCR, 15.87 % (10/63) of the dogs were infected with Hepatozoon canis. Of the 68 tick pools tested, 14 (20.58 %) pools were infected with Hepatozoon spp., an overall maximum likelihood estimation of prevalence of 4.9 % (95 % confidence intervals (CI) = 2.85-7.93 %) per 100 ticks. Maximum likelihood estimation of the infection rate varied by tick sex and developmental categories, ranging from 1.75 % (95 % CI = 0.11-8.11 %) in fed males to 6.81 % (95 % CI = 2.07-17.46 %) in unfed females. One amplicon from a fed adult female was 99 % identical to the sequence for Hepatozoon felis. The remaining sequences isolated from both dogs and ticks shared 99-100 % similarity with the corresponding H. canis isolates. This is the first detection of H. canis and H. felis in the tick R. sanguineus in Turkey.
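The way an infection rate is estimated from pooled samples can be sketched as follows. A pool tests positive when at least one of its members is infected, so a pool of m ticks is positive with probability 1 - (1 - p)^m, and the maximum likelihood estimate of p is the root of the score equation. The sketch assumes a perfect test and independent ticks; the pool sizes and results are hypothetical because the abstract does not list them, and this is not necessarily the software the authors used.

```python
def pooled_prevalence_mle(pool_sizes, positives, iterations=200):
    """MLE of the per-individual infection probability from pooled test results."""
    if not any(positives):
        return 0.0
    if all(positives):
        return 1.0

    def score(p):
        # Derivative of the log-likelihood multiplied by (1 - p);
        # it is strictly decreasing in p, so bisection finds its root.
        q = 1.0 - p
        total = 0.0
        for m, is_positive in zip(pool_sizes, positives):
            if is_positive:
                qm = q ** m
                total += m * qm / (1.0 - qm)
            else:
                total -= m
        return total

    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: 10 pools of 5 ticks each, 3 of them PCR-positive.
sizes = [5] * 10
results = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
p_hat = pooled_prevalence_mle(sizes, results)
print(round(100 * p_hat, 2), "% per tick")
```

When all pools have the same size m the estimate reduces to the closed form 1 - (1 - x/n)^(1/m), with x positive pools out of n; the iterative version is only needed for unequal pool sizes.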
Lu, Xin; Zhou, Haijian; Du, Xiaoli; Liu, Sha; Xu, Jialiang; Cui, Zhigang; Pang, Bo; Kan, Biao
2016-11-01
Vibrio parahaemolyticus is a common seafood-borne pathogenic bacterium which causes gastroenteritis in humans. Continuous surveillance of the molecular characteristics of clinical and environmental V. parahaemolyticus strains needs to be conducted for epidemiological and genetic purposes. To generate a picture of the population distribution of V. parahaemolyticus in eastern China isolated from clinical cases of gastroenteritis and environmental samples, we investigated the genetic and evolutionary relationships of the strains using the commonly used multi-locus sequence typing (MLST) protocol, in which seven house-keeping genes are used. High genetic diversity within the V. parahaemolyticus population was observed, but ST3 was still dominant in the clinical strains, and 103 new sequence types (STs) were found in the clinical strains by searching the global V. parahaemolyticus MLST database. With these genetically diverse strains, we estimated the recombination rates of the loci in MLST analysis. The locus recA was found to be subject to an exceptionally high rate of recombination, and the recombinant single nucleotide polymorphisms (SNPs) were also identified within the seven loci. The phylogenetic tree of the strains was re-constructed using the maximum likelihood method by removing the recombination SNPs of the seven loci, and the minimum spanning tree was re-constructed with the six loci without recA. Some changes were observed in comparison with the previously used methods, suggesting that homologous recombination has a role in shaping the clonal structure of V. parahaemolyticus. We propose the recombination-free SNPs strategy in the clonality analysis of V. parahaemolyticus, especially when using the maximum likelihood method. Copyright © 2016. Published by Elsevier B.V.
Phalee, Anawat; Wongsawad, Chalobol
2014-03-01
To investigate the infection of Fasciola gigantica (F. gigantica) in domestic cattle from Chiang Mai province and to confirm the species molecularly using the ITS-2 region. The liver and gall bladder of Bubalus bubalis (B. bubalis) and Bos taurus (B. taurus) from slaughterhouses were examined for adult worms to determine prevalence. Species confirmation by phylogenetic analysis of ITS-2 sequences was performed using maximum likelihood and UPGMA methods. The total prevalences of infection in B. bubalis and B. taurus were 67.27% and 52.94%, respectively. The prevalences in the Doi-Saket, Muang, and Sanpatong districts were 81.25%, 62.50% and 60.00% for B. bubalis and 62.50%, 50.00% and 47.06% for B. taurus, respectively. In the species confirmation of F. gigantica and some related species, the maximum likelihood and UPGMA methods generated four groups of trematodes: first, the F. gigantica group, including the specimens from Chiang Mai; second, two samples of F. hepatica; third, a group of three rumen flukes (Orthocoelium streptocoelium, F. elongatus and Paramphistomum epiclitum); and fourth, a group of three minute intestinal flukes (Haplorchis taichui, Stellantchasmus falcatus and Haplorchoides sp.) together with the liver fluke Opisthorchis viverrini. These results confirm that the giant liver fluke that mainly causes fascioliasis in Chiang Mai is F. gigantica, and that the specimens were the same as those of F. gigantica recorded in other countries. The nucleotide sequence of the ITS-2 region has proven to be an effective diagnostic tool for the identification of F. gigantica. Copyright © 2014 Hainan Medical College. Published by Elsevier B.V. All rights reserved.
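For readers unfamiliar with UPGMA, the clustering step can be sketched in a few lines: the two closest clusters are merged repeatedly, and the distance from the merged cluster to every other cluster is the size-weighted average of the old distances. The taxon labels and pairwise distances below are hypothetical and are not the ITS-2 data of the study; branch lengths are omitted for brevity.

```python
def upgma(names, distances):
    """Return the UPGMA topology as nested tuples.

    `distances` maps a pair of leaf names to their distance (hypothetical input format).
    """
    sizes = {name: 1 for name in names}          # cluster -> number of leaves
    d = {frozenset(pair): value for pair, value in distances.items()}
    while len(sizes) > 1:
        closest = min(d, key=d.get)              # pair of clusters with smallest distance
        a, b = sorted(closest, key=str)          # fixed order for a reproducible tuple
        merged = (a, b)
        updated = {}
        for c in sizes:
            if c == a or c == b:
                continue
            # Size-weighted average of the distances from a and b to c.
            dac = d[frozenset((a, c))]
            dbc = d[frozenset((b, c))]
            updated[frozenset((merged, c))] = (
                sizes[a] * dac + sizes[b] * dbc
            ) / (sizes[a] + sizes[b])
        # Drop every distance that involved a or b, then add the new ones.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        d.update(updated)
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


pairwise = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 10.0,
    ("B", "C"): 6.0, ("B", "D"): 10.0, ("C", "D"): 10.0,
}
tree = upgma(["A", "B", "C", "D"], pairwise)
print(tree)
```

UPGMA assumes a constant rate of change along all lineages, which is one reason it is usually reported alongside a model-based method such as maximum likelihood rather than on its own.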
DeChaine, Eric G.; Anderson, Stacy A.; McNew, Jennifer M.; Wendling, Barry M.
2013-01-01
Arctic-alpine plants in the genus Saxifraga L. (Saxifragaceae Juss.) provide an excellent system for investigating the process of diversification in northern regions. Yet, sect. Trachyphyllum (Gaud.) Koch, which is comprised of about 8 to 26 species, has still not been explored by molecular systematists even though taxonomists concur that the section needs to be thoroughly re-examined. Our goals were to use chloroplast trnL-F and nuclear ITS DNA sequence data to circumscribe the section phylogenetically, test models of geographically-based population divergence, and assess the utility of morphological characters in estimating evolutionary relationships. To do so, we sequenced both genetic markers for 19 taxa within the section. The phylogenetic inferences of sect. Trachyphyllum using maximum likelihood and Bayesian analyses showed that the section is polyphyletic, with S. aspera L. and S. bryoides L. falling outside the main clade. In addition, the analyses supported several taxonomic re-classifications to prior names. We used two approaches to test biogeographic hypotheses: i) a coalescent approach in Mesquite to test the fit of our reconstructed gene trees to geographically-based models of population divergence and ii) a maximum likelihood inference in Lagrange. These tests uncovered strong support for an origin of the clade in the Southern Rocky Mountains of North America followed by dispersal and divergence episodes across refugia. Finally we adopted a stochastic character mapping approach in SIMMAP to investigate the utility of morphological characters in estimating evolutionary relationships among taxa. We found that few morphological characters were phylogenetically informative and many were misleading. Our molecular analyses provide a foundation for the diversity and evolutionary relationships within sect. Trachyphyllum and hypotheses for better understanding the patterns and processes of divergence in this section, other saxifrages, and plants inhabiting the North Pacific Rim. PMID:23922810
Pyron, R Alexander; Hendry, Catriona R; Chou, Vincent M; Lemmon, Emily M; Lemmon, Alan R; Burbrink, Frank T
2014-12-01
Next-generation genomic sequencing promises to quickly and cheaply resolve remaining contentious nodes in the Tree of Life, and facilitates species-tree estimation while taking into account stochastic genealogical discordance among loci. Recent methods for estimating species trees bypass full likelihood-based estimates of the multi-species coalescent, and approximate the true species-tree using simpler summary metrics. These methods converge on the true species-tree with sufficient genomic sampling, even in the anomaly zone. However, no studies have yet evaluated their efficacy on a large-scale phylogenomic dataset, and compared them to previous concatenation strategies. Here, we generate such a dataset for Caenophidian snakes, a group with >2500 species that contains several rapid radiations that were poorly resolved with fewer loci. We generate sequence data for 333 single-copy nuclear loci with ∼100% coverage (∼0% missing data) for 31 major lineages. We estimate phylogenies using neighbor joining, maximum parsimony, maximum likelihood, and three summary species-tree approaches (NJst, STAR, and MP-EST). All methods yield similar resolution and support for most nodes. However, not all methods support monophyly of Caenophidia, with Acrochordidae placed as the sister taxon to Pythonidae in some analyses. Thus, phylogenomic species-tree estimation may occasionally disagree with well-supported relationships from concatenated analyses of small numbers of nuclear or mitochondrial genes, a consideration for future studies. In contrast for at least two diverse, rapid radiations (Lamprophiidae and Colubridae), phylogenomic data and species-tree inference do little to improve resolution and support. Thus, certain nodes may lack strong signal, and larger datasets and more sophisticated analyses may still fail to resolve them. Copyright © 2014 Elsevier Inc. All rights reserved.
Kelly, S.; Wickstead, B.; Gull, K.
2011-01-01
We have developed a machine-learning approach to identify 3537 discrete orthologue protein sequence groups distributed across all available archaeal genomes. We show that treating these orthologue groups as binary detection/non-detection data is sufficient to capture the majority of archaeal phylogeny. We subsequently use the sequence data from these groups to infer a method and substitution-model-independent phylogeny. By holding this phylogeny constrained and interrogating the intersection of this large dataset with both the Eukarya and the Bacteria using Bayesian and maximum-likelihood approaches, we propose and provide evidence for a methanogenic origin of the Archaea. By the same criteria, we also provide evidence in support of an origin for Eukarya either within or as sisters to the Thaumarchaea. PMID:20880885
Asexual-sexual morph connection in the type species of Berkleasmium.
Tanney, Joey; Miller, Andrew N
2017-06-01
Berkleasmium is a polyphyletic genus comprising 37 dematiaceous hyphomycetous species. In this study, independent collections of the type species, B. concinnum, were made from Eastern North America. Nuclear internal transcribed spacer rDNA (ITS) and partial nuc 28S large subunit rDNA (LSU) sequences obtained from collections and subsequent cultures showed that Berkleasmium concinnum is the asexual morph of Neoacanthostigma septoconstrictum (Tubeufiaceae, Tubeufiales). Phylogenies inferred from Bayesian inference and maximum likelihood analyses of ITS-LSU sequence data confirmed this asexual-sexual morph connection and a re-examination of fungarium reference specimens also revealed the co-occurrence of N. septoconstrictum ascomata and B. concinnum sporodochia. Neoacanthostigma septoconstrictum is therefore synonymized under B. concinnum on the basis of priority. A specimen identified as N. septoconstrictum from Thailand is described as N. thailandicum sp. nov., based on morphological and genetic distinctiveness.
Masters, J C; Anthony, N M; de Wit, M J; Mitchell, A
2005-08-01
Major aspects of lorisid phylogeny and systematics remain unresolved, despite several studies (involving morphology, histology, karyology, immunology, and DNA sequencing) aimed at elucidating them. Our study is the first to investigate the evolution of this enigmatic group using molecular and morphological data for all four well-established genera: Arctocebus, Loris, Nycticebus, and Perodicticus. Data sets consisting of 386 bp of 12S rRNA, 535 bp of 16S rRNA, and 36 craniodental characters were analyzed separately and in combination, using maximum parsimony and maximum likelihood. Outgroups, consisting of two galagid taxa (Otolemur and Galagoides) and a lemuroid (Microcebus), were also varied. The morphological data set yielded a paraphyletic lorisid clade with the robust Nycticebus and Perodicticus grouped as sister taxa, and the galagids allied with Arctocebus. All molecular analyses, whether maximum parsimony (MP) or maximum likelihood (ML), that included Microcebus as an outgroup rendered a paraphyletic lorisid clade, with one exception: the 12S + 16S data set analyzed with ML. The position of the galagids in these paraphyletic topologies was inconsistent, however, and bootstrap values were low. Exclusion of Microcebus generated a monophyletic Lorisidae with Asian and African subclades; bootstrap values for all three clades in the total evidence tree were over 90%. We estimated mean genetic distances for lemuroids vs. lorisoids, lorisids vs. galagids, and Asian vs. African lorisids as a guide to relative divergence times. We present information regarding a temporary land bridge that linked the two now widely separated regions inhabited by lorisids that may explain their distribution. Finally, we make taxonomic recommendations based on our results. (c) 2005 Wiley-Liss, Inc.
Helaers, Raphaël; Milinkovitch, Michel C
2010-07-15
The development, in the last decade, of stochastic heuristics implemented in robust application softwares has made large phylogeny inference a key step in most comparative studies involving molecular sequences. Still, the choice of a phylogeny inference software is often dictated by a combination of parameters not related to the raw performance of the implemented algorithm(s) but rather by practical issues such as ergonomics and/or the availability of specific functionalities. Here, we present MetaPIGA v2.0, a robust implementation of several stochastic heuristics for large phylogeny inference (under maximum likelihood), including a Simulated Annealing algorithm, a classical Genetic Algorithm, and the Metapopulation Genetic Algorithm (metaGA) together with complex substitution models, discrete Gamma rate heterogeneity, and the possibility to partition data. MetaPIGA v2.0 also implements the Likelihood Ratio Test, the Akaike Information Criterion, and the Bayesian Information Criterion for automated selection of substitution models that best fit the data. Heuristics and substitution models are highly customizable through manual batch files and command line processing. However, MetaPIGA v2.0 also offers an extensive graphical user interface for parameters setting, generating and running batch files, following run progress, and manipulating result trees. MetaPIGA v2.0 uses standard formats for data sets and trees, is platform independent, runs in 32 and 64-bits systems, and takes advantage of multiprocessor and multicore computers. The metaGA resolves the major problem inherent to classical Genetic Algorithms by maintaining high inter-population variation even under strong intra-population selection. Implementation of the metaGA together with additional stochastic heuristics into a single software will allow rigorous optimization of each heuristic as well as a meaningful comparison of performances among these algorithms. 
MetaPIGA v2.0 gives access both to high customization for the phylogeneticist, as well as to an ergonomic interface and functionalities assisting the non-specialist for sound inference of large phylogenetic trees using nucleotide sequences. MetaPIGA v2.0 and its extensive user-manual are freely available to academics at http://www.metapiga.org.
2010-01-01
Background The development, in the last decade, of stochastic heuristics implemented in robust application softwares has made large phylogeny inference a key step in most comparative studies involving molecular sequences. Still, the choice of a phylogeny inference software is often dictated by a combination of parameters not related to the raw performance of the implemented algorithm(s) but rather by practical issues such as ergonomics and/or the availability of specific functionalities. Results Here, we present MetaPIGA v2.0, a robust implementation of several stochastic heuristics for large phylogeny inference (under maximum likelihood), including a Simulated Annealing algorithm, a classical Genetic Algorithm, and the Metapopulation Genetic Algorithm (metaGA) together with complex substitution models, discrete Gamma rate heterogeneity, and the possibility to partition data. MetaPIGA v2.0 also implements the Likelihood Ratio Test, the Akaike Information Criterion, and the Bayesian Information Criterion for automated selection of substitution models that best fit the data. Heuristics and substitution models are highly customizable through manual batch files and command line processing. However, MetaPIGA v2.0 also offers an extensive graphical user interface for parameters setting, generating and running batch files, following run progress, and manipulating result trees. MetaPIGA v2.0 uses standard formats for data sets and trees, is platform independent, runs in 32 and 64-bits systems, and takes advantage of multiprocessor and multicore computers. Conclusions The metaGA resolves the major problem inherent to classical Genetic Algorithms by maintaining high inter-population variation even under strong intra-population selection. Implementation of the metaGA together with additional stochastic heuristics into a single software will allow rigorous optimization of each heuristic as well as a meaningful comparison of performances among these algorithms. 
MetaPIGA v2.0 gives access both to high customization for the phylogeneticist, as well as to an ergonomic interface and functionalities assisting the non-specialist for sound inference of large phylogenetic trees using nucleotide sequences. MetaPIGA v2.0 and its extensive user-manual are freely available to academics at http://www.metapiga.org. PMID:20633263
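The information criteria named in this abstract can be illustrated with a minimal sketch. It assumes the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L; the model names are common substitution models, but the log-likelihoods, parameter counts, and alignment length are made-up values for illustration, and nothing here reflects how MetaPIGA itself is implemented.

```python
import math

def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical fits: model -> (maximized log-likelihood, free substitution parameters).
# Branch-length parameters are ignored here because they are shared by all models.
fits = {"JC69": (-5230.4, 0), "HKY85": (-5105.2, 4), "GTR": (-5101.9, 8)}
n_sites = 1200  # hypothetical alignment length, used as the sample size in BIC

best_aic = min(fits, key=lambda m: aic(*fits[m]))
best_bic = min(fits, key=lambda m: bic(*fits[m], n_sites))
print(best_aic, best_bic)
```

With these invented numbers both criteria prefer the intermediate model: the richer model gains too little likelihood to pay for its extra parameters.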
Maximum Likelihood Implementation of an Isolation-with-Migration Model for Three Species.
Dalquen, Daniel A; Zhu, Tianqi; Yang, Ziheng
2017-05-01
We develop a maximum likelihood (ML) method for estimating migration rates between species using genomic sequence data. A species tree is used to accommodate the phylogenetic relationships among three species, allowing for migration between the two sister species, while the third species is used as an out-group. A Markov chain characterization of the genealogical process of coalescence and migration is used to integrate out the migration histories at each locus analytically, whereas Gaussian quadrature is used to integrate over the coalescent times on each genealogical tree numerically. This is an extension of our early implementation of the symmetrical isolation-with-migration model for three species to accommodate arbitrary loci with two or three sequences per locus and to allow asymmetrical migration rates. Our implementation can accommodate tens of thousands of loci, making it feasible to analyze genome-scale data sets to test for gene flow. We calculate the posterior probabilities of gene trees at individual loci to identify genomic regions that are likely to have been transferred between species due to gene flow. We conduct a simulation study to examine the statistical properties of the likelihood ratio test for gene flow between the two in-group species and of the ML estimates of model parameters such as the migration rate. Inclusion of data from a third out-group species is found to increase dramatically the power of the test and the precision of parameter estimation. We compiled and analyzed several genomic data sets from the Drosophila fruit flies. Our analyses suggest no migration from D. melanogaster to D. simulans, and a significant amount of gene flow from D. simulans to D. melanogaster, at the rate of ~0.02 migrant individuals per generation. We discuss the utility of the multispecies coalescent model for species tree estimation, accounting for incomplete lineage sorting and migration. © The Author(s) 2016. 
Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Williamson, Scott; Fledel-Alon, Adi; Bustamante, Carlos D
2004-09-01
We develop a Poisson random-field model of polymorphism and divergence that allows arbitrary dominance relations in a diploid context. This model provides a maximum-likelihood framework for estimating both selection and dominance parameters of new mutations using information on the frequency spectrum of sequence polymorphisms. This is the first DNA sequence-based estimator of the dominance parameter. Our model also leads to a likelihood-ratio test for distinguishing nongenic from genic selection; simulations indicate that this test is quite powerful when a large number of segregating sites are available. We also use simulations to explore the bias in selection parameter estimates caused by unacknowledged dominance relations. When inference is based on the frequency spectrum of polymorphisms, genic selection estimates of the selection parameter can be very strongly biased even for minor deviations from the genic selection model. Surprisingly, however, when inference is based on polymorphism and divergence (McDonald-Kreitman) data, genic selection estimates of the selection parameter are nearly unbiased, even for completely dominant or recessive mutations. Further, we find that weak overdominant selection can increase, rather than decrease, the substitution rate relative to levels of polymorphism. This nonintuitive result has major implications for the interpretation of several popular tests of neutrality.
Optimum quantum receiver for detecting weak signals in PAM communication systems
NASA Astrophysics Data System (ADS)
Sharma, Navneet; Rawat, Tarun Kumar; Parthasarathy, Harish; Gautam, Kumar
2017-09-01
This paper deals with the modeling of an optimum quantum receiver for pulse amplitude modulator (PAM) communication systems. The information bearing sequence {I_k}_{k=0}^{N-1} is estimated using the maximum likelihood (ML) method. The ML method is based on quantum mechanical measurements of an observable X in the Hilbert space of the quantum system at discrete times, when the Hamiltonian of the system is perturbed by an operator obtained by modulating a potential V with a PAM signal derived from the information bearing sequence {I_k}_{k=0}^{N-1}. The measurement process at each time instant causes collapse of the system state to an observable eigenstate. All probabilities of getting different outcomes from an observable are calculated using the perturbed evolution operator combined with the collapse postulate. For given probability densities, calculation of the mean square error evaluates the performance of the receiver. Finally, we present an example involving estimating an information bearing sequence that modulates a quantum electromagnetic field incident on a quantum harmonic oscillator.
Convergence among cave catfishes: long-branch attraction and a Bayesian relative rates test.
Wilcox, T P; García de León, F J; Hendrickson, D A; Hillis, D M
2004-06-01
Convergence has long been of interest to evolutionary biologists. Cave organisms appear to be ideal candidates for studying convergence in morphological, physiological, and developmental traits. Here we report apparent convergence in two cave-catfishes that were described on morphological grounds as congeners: Prietella phreatophila and Prietella lundbergi. We collected mitochondrial DNA sequence data from 10 species of catfishes, representing five of the seven genera in Ictaluridae, as well as seven species from a broad range of siluriform outgroups. Analysis of the sequence data under parsimony supports a monophyletic Prietella. However, both maximum-likelihood and Bayesian analyses support polyphyly of the genus, with P. lundbergi sister to Ictalurus and P. phreatophila sister to Ameiurus. The topological difference between parsimony and the other methods appears to result from long-branch attraction between the Prietella species. Similarly, the sequence data do not support several other relationships within Ictaluridae supported by morphology. We develop a new Bayesian method for examining variation in molecular rates of evolution across a phylogeny.
The complete mitochondrial genome of Papilio glaucus and its phylogenetic implications.
Shen, Jinhui; Cong, Qian; Grishin, Nick V
2015-09-01
Due to the intriguing morphology, lifecycle, and diversity of butterflies and moths, Lepidoptera are emerging as model organisms for the study of genetics, evolution and speciation. The progress of these studies relies on decoding Lepidoptera genomes, both nuclear and mitochondrial. Here we describe a protocol to obtain mitogenomes from Next Generation Sequencing reads performed for whole-genome sequencing and report the complete mitogenome of Papilio (Pterourus) glaucus. The circular mitogenome is 15,306 bp in length and rich in A and T. It contains 13 protein-coding genes (PCGs), 22 transfer-RNA-coding genes (tRNA), and 2 ribosomal-RNA-coding genes (rRNA), with a gene order typical for mitogenomes of Lepidoptera. We performed phylogenetic analyses based on PCG and RNA-coding genes or protein sequences using Bayesian Inference and Maximum Likelihood methods. The phylogenetic trees consistently show that among species with available mitogenomes Papilio glaucus is the closest to Papilio (Agehana) maraho from Asia.
Hajduskova, Eva; Literak, Ivan; Papousek, Ivo; Costa, Francisco B; Novakova, Marketa; Labruna, Marcelo B; Zdrazilova-Dubska, Lenka
2016-04-01
A novel rickettsial sequence in the citrate synthase gltA gene indicating a novel Rickettsia species has been detected in 7 out of 4524 Ixodes ricinus ticks examined within several surveys performed in the Czech Republic from 2005 to 2009. This new Candidatus Rickettsia sp. sequence has been found in 2 nymphs feeding on wild birds (Luscinia megarhynchos and Erithacus rubecula), in a male tick from vegetation, and 4 ticks feeding on a dog (3 males, 1 female tick). Portions of the ompA, ompB, sca4, and htrA genes were not amplifiable in these samples. A maximum likelihood tree of rickettsiae based on comparisons of partial amino acid sequences of citrate synthase and nucleotide sequences of 16S rDNA genes and phylogenetic analysis revealed a basal position of the novel species in the proximity of R. bellii and R. canadensis. The novel species has been named 'Candidatus Rickettsia mendelii' after the founder of genetics, Gregor Mendel. Copyright © 2016 Elsevier GmbH. All rights reserved.
Jiang, Fan; Pan, Xubin; Li, Xuankun; Yu, Yanxue; Zhang, Junhua; Jiang, Hongshan; Dou, Liduo; Zhu, Shuifang
2016-01-01
The genus Dacus is one of the most economically important groups of tephritid fruit flies. The first complete mitochondrial genome (mitogenome) of a Dacus species, D. longicornis, was sequenced by next-generation sequencing in order to develop mitogenome data for this genus. The circular 16,253 bp mitogenome has the typical set and arrangement of 37 genes present in the ancestral insect. The mitogenome data of D. longicornis were compared to all the published homologous sequences of other tephritid species. We discovered that the subgenera Bactrocera, Daculus and Tetradacus differed from the subgenus Zeugodacus and the genera Dacus, Ceratitis and Procecidochares in possessing a TA instead of a TAA stop codon for the COI gene. It is possible that the TA stop codon in COI is a synapomorphy of the Bactrocera group within the genus Bactrocera relative to other Tephritidae species. Phylogenetic analyses of the Tephritidae mitogenome data, inferred by Bayesian and maximum-likelihood methods, strongly supported the sister relationship between Zeugodacus and Dacus. PMID:27812024
DNA Barcode for Identifying Folium Artemisiae Argyi from Counterfeits.
Mei, Quanxi; Chen, Xiaolu; Xiang, Li; Liu, Yue; Su, Yanyan; Gao, Yuqiao; Dai, Weibo; Dong, Pengpeng; Chen, Shilin
2016-01-01
Folium Artemisiae Argyi is an important herb in traditional Chinese medicine. It is commonly used in moxibustion, medicine, etc. However, identifying Artemisia argyi is difficult because this herb exhibits similar morphological characteristics to closely related species and counterfeits. To verify the applicability of DNA barcoding, ITS2 and psbA-trnH were used to identify A. argyi from 15 closely related species and counterfeits. Results indicated that total DNA was easily extracted from all the samples and that both ITS2 and psbA-trnH fragments can be easily amplified. ITS2 was a more ideal barcode than psbA-trnH and ITS2+psbA-trnH to identify A. argyi from closely related species and counterfeits on the basis of sequence character, genetic distance, and tree methods. The sequence length was 225 bp for the 56 ITS2 sequences of A. argyi, and no variable site was detected. For the ITS2 sequences, A. capillaris, A. anomala, A. annua, A. igniaria, A. maximowicziana, A. princeps, Dendranthema vestitum, and D. indicum had single nucleotide polymorphisms (SNPs). The intraspecific Kimura 2-Parameter distance was zero, which is lower than the minimum interspecific distance (0.005). A. argyi, the closely related species, and counterfeits, except for Artemisia maximowicziana and Artemisia sieversiana, were separated into pairs of divergent clusters by using the neighbor joining, maximum parsimony, and maximum likelihood tree methods. Thus, the ITS2 sequence was an ideal barcode to identify A. argyi from closely related species and counterfeits to ensure the safe use of this plant.
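The Kimura 2-Parameter distance mentioned in this abstract can be sketched as follows. This is a minimal illustration of the textbook formula d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions; the function name and the example sequences are hypothetical and are not taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

d_same = k2p_distance("ACGTACGTAC", "ACGTACGTAC")   # identical sequences
d_one = k2p_distance("ACGTACGTAC", "GCGTACGTAC")    # one A->G transition in 10 sites
```

An intraspecific distance of zero, as reported for the ITS2 sequences, corresponds to the first call: identical sequences give P = Q = 0.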
García-Varela, Martín; García-Prieto, Luís; Rodríguez, Rodolfo Pérez
2011-12-01
The morphology of the males of Neoechinorhynchus schmidti (Acanthocephala: Neoechinorhynchidae) is unknown, because this species was described based exclusively on females. However, recently we collected 2 common slider turtles Trachemys scripta in Centla swamps, Tabasco, Mexico, parasitized by 27 specimens of an acanthocephalan whose females were morphologically identical to N. schmidti. The domains D2 and D3 of the large subunit of the nuclear ribosomal RNA (LSU) of 3 males and 2 females of this material were sequenced. The sequences of both sexes were identical, and based on this result, we described for the first time the morphology of the males of N. schmidti. In addition, 6 sequences of a congeneric species, also parasite of turtles (Neoechinorhynchus emyditoides) were generated in the current research. The 11 sequences of these 2 species were aligned with 13 sequences of another 4 species of the same genus, producing a data set of 24 taxa with 674 nucleotides. The genetic divergence between N. schmidti and N. emyditoides was 4% and intraspecific differences ranged from 0.01 to 0.02%. Pairwise differences between either of these species and 4 other congeners parasitic in fresh and brackish water fishes (Neoechinorhynchus golvani, Neoechinorhynchus roseum, Neoechinorhynchus saginatus, and Neoechinorhynchus sp.) varied from 9.5 to 33%. Maximum likelihood and maximum parsimony analyses show that N. schmidti and N. emyditoides are sister taxa. Bootstrap analysis also indicates that the sister relationship is reliably supported. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Approximated maximum likelihood estimation in multifractal random walks
NASA Astrophysics Data System (ADS)
Løvsletten, O.; Rypdal, M.
2012-04-01
We present an approximated maximum likelihood method for the multifractal random walk processes of [E. Bacry et al., Phys. Rev. E 64, 026103 (2001)]. The likelihood is computed using a Laplace approximation and a truncation in the dependency structure for the latent volatility. The procedure is implemented as a package in the R computer language. Its performance is tested on synthetic data and compared to an inference approach based on the generalized method of moments. The method is applied to estimate parameters for various financial stock indices.
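The Laplace approximation referred to here replaces an integral of exp(h(x)) by a Gaussian integral centered at the mode of h. The sketch below is a one-dimensional illustration only, on an integrand chosen because its exact value is known (the integral of x^n e^(-x) over x > 0 equals n!); the paper's latent-volatility integral is multi-dimensional, and the function name and starting point are assumptions made for this example.

```python
import math

def laplace_integral(h, dh, d2h, x0, iters=50):
    """Approximate the integral of exp(h(x)) by exp(h(m)) * sqrt(2*pi / -h''(m)),
    where m is the mode of h, located by Newton's method on h'(x) = 0."""
    x = x0
    for _ in range(iters):
        x -= dh(x) / d2h(x)  # Newton step toward the stationary point
    return math.exp(h(x)) * math.sqrt(2.0 * math.pi / -d2h(x)), x

n = 10
approx, mode = laplace_integral(
    h=lambda x: n * math.log(x) - x,   # log of the integrand x**n * exp(-x)
    dh=lambda x: n / x - 1.0,
    d2h=lambda x: -n / x ** 2,
    x0=5.0,
)
exact = math.factorial(n)
ratio = approx / exact
print(mode, ratio)
```

For this integrand the approximation reproduces Stirling's formula, so the ratio to the exact value is slightly below one and approaches one as n grows.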
Maximum Likelihood Analysis of a Two-Level Nonlinear Structural Equation Model with Fixed Covariates
ERIC Educational Resources Information Center
Lee, Sik-Yum; Song, Xin-Yuan
2005-01-01
In this article, a maximum likelihood (ML) approach for analyzing a rather general two-level structural equation model is developed for hierarchically structured data that are very common in educational and/or behavioral research. The proposed two-level model can accommodate nonlinear causal relations among latent variables as well as effects…
12-mode OFDM transmission using reduced-complexity maximum likelihood detection.
Lobato, Adriana; Chen, Yingkan; Jung, Yongmin; Chen, Haoshuo; Inan, Beril; Kuschnerov, Maxim; Fontaine, Nicolas K; Ryf, Roland; Spinnler, Bernhard; Lankl, Berthold
2015-02-01
We report 163-Gb/s MDM-QPSK-OFDM and 245-Gb/s MDM-8QAM-OFDM transmission over 74 km of few-mode fiber supporting 12 spatial and polarization modes. A low-complexity maximum likelihood detector is employed to enhance the performance of a system impaired by mode-dependent loss.
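For context, maximum likelihood detection over a known linear channel with additive white Gaussian noise amounts to picking the candidate symbol vector that minimizes the squared distance between the received vector and H times the candidate. The sketch below is the exhaustive-search baseline, not the reduced-complexity detector of the paper; the 2x2 channel matrix, the function name, and the noise-free test vector are hypothetical.

```python
import itertools
import math

# Unit-energy QPSK constellation
QPSK = [complex(re, im) / math.sqrt(2) for re in (1, -1) for im in (1, -1)]

def ml_detect(H, y, constellation=QPSK):
    """Exhaustive ML detection: minimize ||y - H x||^2 over all symbol vectors x."""
    n_tx = len(H[0])
    best, best_metric = None, float("inf")
    for cand in itertools.product(constellation, repeat=n_tx):
        metric = 0.0
        for row, y_i in zip(H, y):
            predicted = sum(h * x for h, x in zip(row, cand))
            metric += abs(y_i - predicted) ** 2
        if metric < best_metric:
            best, best_metric = cand, metric
    return best

# Hypothetical 2x2 channel with cross-talk between the two modes
H = [[1.0 + 0j, 0.5j], [0.2 + 0j, 1.0 + 0j]]
sent = (QPSK[0], QPSK[3])
received = [sum(h * x for h, x in zip(row, sent)) for row in H]  # noise-free
detected = ml_detect(H, received)
```

The search visits M^n candidates for n modes and constellation size M, which is why a 12-mode system motivates reduced-complexity variants.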
ERIC Educational Resources Information Center
Han, Kyung T.; Guo, Fanmin
2014-01-01
The full-information maximum likelihood (FIML) method makes it possible to estimate and analyze structural equation models (SEM) even when data are partially missing, enabling incomplete data to contribute to model estimation. The cornerstone of FIML is the missing-at-random (MAR) assumption. In (unidimensional) computerized adaptive testing…
Constrained Maximum Likelihood Estimation for Two-Level Mean and Covariance Structure Models
ERIC Educational Resources Information Center
Bentler, Peter M.; Liang, Jiajuan; Tang, Man-Lai; Yuan, Ke-Hai
2011-01-01
Maximum likelihood is commonly used for the estimation of model parameters in the analysis of two-level structural equation models. Constraints on model parameters could be encountered in some situations such as equal factor loadings for different factors. Linear constraints are the most common ones and they are relatively easy to handle in…
Maximum Likelihood Item Easiness Models for Test Theory without an Answer Key
ERIC Educational Resources Information Center
France, Stephen L.; Batchelder, William H.
2015-01-01
Cultural consensus theory (CCT) is a data aggregation technique with many applications in the social and behavioral sciences. We describe the intuition and theory behind a set of CCT models for continuous type data using maximum likelihood inference methodology. We describe how bias parameters can be incorporated into these models. We introduce…
ERIC Educational Resources Information Center
Kelderman, Henk
1992-01-01
Describes algorithms used in the computer program LOGIMO for obtaining maximum likelihood estimates of the parameters in loglinear models. These algorithms are also useful for the analysis of loglinear item-response theory models. Presents modified versions of the iterative proportional fitting and Newton-Raphson algorithms. Simulated data…
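Iterative proportional fitting, one of the two algorithms named here, can be sketched in its textbook form for a two-way table: cells are rescaled alternately until the fitted row and column sums match the target margins. This is not the modified version used in LOGIMO; the function name and the example counts are hypothetical.

```python
def ipf(seed, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Classical iterative proportional fitting for a two-way table."""
    table = [row[:] for row in seed]
    for _ in range(max_iter):
        # Scale each row to its target total
        for i, row in enumerate(table):
            total = sum(row)
            if total > 0:
                table[i] = [x * row_targets[i] / total for x in row]
        # Scale each column to its target total
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns now match exactly; stop once the rows match too
        row_err = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if row_err < tol:
            break
    return table

observed = [[10.0, 20.0], [30.0, 40.0]]
rows = [sum(r) for r in observed]            # [30, 70]
cols = [sum(c) for c in zip(*observed)]      # [40, 60]
# A seed of ones yields the ML fit of the independence loglinear model
fitted = ipf([[1.0, 1.0], [1.0, 1.0]], rows, cols)
```

Starting from a uniform seed, the fitted cells equal (row total x column total) / N, the maximum likelihood expected counts under independence.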
ERIC Educational Resources Information Center
Penfield, Randall D.; Bergeron, Jennifer M.
2005-01-01
This article applies a weighted maximum likelihood (WML) latent trait estimator to the generalized partial credit model (GPCM). The relevant equations required to obtain the WML estimator using the Newton-Raphson algorithm are presented, and a simulation study is described that compared the properties of the WML estimator to those of the maximum…
ERIC Educational Resources Information Center
Kieftenbeld, Vincent; Natesan, Prathiba
2012-01-01
Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…
Maximum Likelihood Dynamic Factor Modeling for Arbitrary "N" and "T" Using SEM
ERIC Educational Resources Information Center
Voelkle, Manuel C.; Oud, Johan H. L.; von Oertzen, Timo; Lindenberger, Ulman
2012-01-01
This article has 3 objectives that build on each other. First, we demonstrate how to obtain maximum likelihood estimates for dynamic factor models (the direct autoregressive factor score model) with arbitrary "T" and "N" by means of structural equation modeling (SEM) and compare the approach to existing methods. Second, we go beyond standard time…
NASA Technical Reports Server (NTRS)
Kelly, D. A.; Fermelia, A.; Lee, G. K. F.
1990-01-01
An adaptive Kalman filter design that utilizes recursive maximum likelihood parameter identification is discussed. At the center of this design is the Kalman filter itself, which has the responsibility for attitude determination. At the same time, the identification algorithm is continually identifying the system parameters. The approach is applicable to nonlinear, as well as linear systems. This adaptive Kalman filter design has much potential for real time implementation, especially considering the fast clock speeds, cache memory and internal RAM available today. The recursive maximum likelihood algorithm is discussed in detail, with special attention directed towards its unique matrix formulation. The procedure for using the algorithm is described along with comments on how this algorithm interacts with the Kalman filter.
Maximum Likelihood Compton Polarimetry with the Compton Spectrometer and Imager
NASA Astrophysics Data System (ADS)
Lowell, A. W.; Boggs, S. E.; Chiu, C. L.; Kierans, C. A.; Sleator, C.; Tomsick, J. A.; Zoglauer, A. C.; Chang, H.-K.; Tseng, C.-H.; Yang, C.-Y.; Jean, P.; von Ballmoos, P.; Lin, C.-H.; Amman, M.
2017-10-01
Astrophysical polarization measurements in the soft gamma-ray band are becoming more feasible as detectors with high position and energy resolution are deployed. Previous work has shown that the minimum detectable polarization (MDP) of an ideal Compton polarimeter can be improved by ~21% when an unbinned, maximum likelihood method (MLM) is used instead of the standard approach of fitting a sinusoid to a histogram of azimuthal scattering angles. Here we outline a procedure for implementing this maximum likelihood approach for real, nonideal polarimeters. As an example, we use the recent observation of GRB 160530A with the Compton Spectrometer and Imager. We find that the MDP for this observation is reduced by 20% when the MLM is used instead of the standard method.
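The idea of an unbinned fit can be sketched for an idealized polarimeter. The example assumes a simplified azimuthal density p(eta) = [1 + mu cos(2(eta - eta0))] / (2 pi), with a sign and phase convention chosen for convenience, and maximizes the event-by-event log-likelihood on a coarse grid. It ignores the instrument response that a real, nonideal polarimeter requires; all function names, the sample size, and the true parameter values are assumptions made for illustration.

```python
import math
import random

def sample_angles(n, mu, eta0, rng):
    """Rejection-sample n angles from (1 + mu*cos(2*(eta - eta0))) / (2*pi)."""
    angles = []
    while len(angles) < n:
        eta = rng.uniform(0.0, 2.0 * math.pi)
        if rng.random() * (1.0 + mu) <= 1.0 + mu * math.cos(2.0 * (eta - eta0)):
            angles.append(eta)
    return angles

def log_likelihood(angles, mu, eta0):
    """Unbinned log-likelihood: one term per event, no histogram."""
    two_pi = 2.0 * math.pi
    return sum(math.log((1.0 + mu * math.cos(2.0 * (eta - eta0))) / two_pi)
               for eta in angles)

def fit_unbinned(angles, mu_steps=50, angle_steps=36):
    """Grid search over mu in [0, 1) and eta0 in [0, pi) for the maximum."""
    best = (-math.inf, 0.0, 0.0)
    for i in range(mu_steps):
        mu = i / mu_steps
        for j in range(angle_steps):
            eta0 = math.pi * j / angle_steps
            ll = log_likelihood(angles, mu, eta0)
            if ll > best[0]:
                best = (ll, mu, eta0)
    return best[1], best[2]

rng = random.Random(0)  # fixed seed so the simulated data are reproducible
angles = sample_angles(2000, mu=0.4, eta0=0.5, rng=rng)
mu_hat, eta0_hat = fit_unbinned(angles)
```

A grid search keeps the sketch dependency-free; a numerical optimizer would normally replace it, and the likelihood would be extended with the measured detector response.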
Caridha, Rozina; Ha, Tran Thi Thanh; Gaseitsiwe, Simani; Hung, Pham Viet; Anh, Nguyen Mai; Bao, Nguyen Huy; Khang, Dinh Duy; Hien, Nguyen Tran; Cam, Phung Dac; Chiodi, Francesca
2012-01-01
Characterization of HIV-1 strains is important for surveillance of the HIV-1 epidemic. In Vietnam HIV-1-infected pregnant women often fail to receive the care they are entitled to. Here, we analyzed phylogenetically HIV-1 env sequences from 37 HIV-1-infected pregnant women from Ha Noi (n=22) and Hai Phong (n=15), where they delivered in 2005–2007. All carried CRF01_AE in the gp120 V3 region. In 21 women CRF01_AE was also found in the reverse transcriptase gene. We compared their env gp120 V3 sequences phylogenetically in a maximum likelihood tree to those of 198 other CRF01_AE sequences in Vietnam and 229 from neighboring countries, predominantly Thailand, from the HIV-1 database. Altogether 464 sequences were analyzed. All but one of the maternal sequences colocalized with sequences from northern Vietnam. The maternal sequences had evolved the least when compared to sequences collected in Ha Noi in 2002, as shown by analysis of synonymous and nonsynonymous changes, than to other Vietnamese sequences collected earlier and/or elsewhere. Since the HIV-1 epidemic in women in Vietnam may still be underestimated, characterization of HIV-1 in pregnant women is important to observe how HIV-1 has evolved and follow its molecular epidemiology. PMID:21936713
Fonseca, Luiz Henrique M; Lohmann, Lúcia G
2018-06-01
Combining high-throughput sequencing data with amplicon sequences allows the reconstruction of robust phylogenies based on comprehensive sampling of characters and taxa. Here, we combine Next Generation Sequencing (NGS) and Sanger sequencing data to infer the phylogeny of the "Adenocalymma-Neojobertia" clade (Bignonieae, Bignoniaceae), a diverse lineage of Neotropical plants, using Maximum Likelihood and Bayesian approaches. We used NGS to obtain complete or nearly-complete plastomes of members of this clade, leading to a final dataset with 54 individuals, representing 44 members of ingroup and 10 outgroups. In addition, we obtained Sanger sequences of two plastid markers (ndhF and rpl32-trnL) for 44 individuals (43 ingroup and 1 outgroup) and the nuclear PepC for 64 individuals (63 ingroup and 1 outgroup). Our final dataset includes 87 individuals of members of the "Adenocalymma-Neojobertia" clade, representing 66 species (ca. 90% of the diversity), plus 11 outgroups. Plastid and nuclear datasets recovered congruent topologies and were combined. The combined analysis recovered a monophyletic "Adenocalymma-Neojobertia" clade and a paraphyletic Adenocalymma that also contained a monophyletic Neojobertia plus Pleonotoma albiflora. Relationships are strongly supported in all analyses, with most lineages within the "Adenocalymma-Neojobertia" clade receiving maximum posterior probabilities. Ancestral character state reconstructions using Bayesian approaches identified six morphological synapomorphies of clades namely, prophyll type, petiole and petiolule articulation, tendril ramification, inflorescence ramification, calyx shape, and fruit wings. Other characters such as habit, calyx cupular trichomes, corolla color, and corolla shape evolved multiple times. These characters are putatively related with the clade diversification and can be further explored in diversification studies. Copyright © 2018 Elsevier Inc. All rights reserved.
DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.
Kelly, Steven; Maini, Philip K
2013-01-01
The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly-used distance based methods though not as accurate as maximum likelihood methods from good quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analysis of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/.
The First Mitochondrial Genome for Caddisfly (Insecta: Trichoptera) with Phylogenetic Implications
Wang, Yuyu; Liu, Xingyue; Yang, Ding
2014-01-01
The Trichoptera (caddisflies) is a holometabolous insect order with 14,300 described species forming the second most species-rich monophyletic group of animals in freshwater. Hitherto, there is no mitochondrial genome reported of this order. Herein, we describe the complete mitochondrial (mt) genome of a caddisfly species, Eubasilissa regina (McLachlan, 1871). A phylogenomic analysis was carried out based on the mt genomic sequences of 13 mt protein coding genes (PCGs) and two rRNA genes of 24 species belonging to eight holometabolous orders. Both maximum likelihood and Bayesian inference analyses highly support the sister relationship between Trichoptera and Lepidoptera. PMID:24391451
Kang, Jong-Soo; Lee, Byoung Yoon; Kwak, Myounghai
2017-01-01
The complete chloroplast genomes of Lychnis wilfordii and Silene capitata were determined and compared with ten previously reported Caryophyllaceae chloroplast genomes. The chloroplast genome sequences of L. wilfordii and S. capitata contain 152,320 bp and 150,224 bp, respectively. The gene contents and orders among 12 Caryophyllaceae species are consistent, but several microstructural changes have occurred. Expansion of the inverted repeat (IR) regions at the large single copy (LSC)/IRb and small single copy (SSC)/IR boundaries led to partial or entire gene duplications. Additionally, rearrangements of the LSC region were caused by gene inversions and/or transpositions. The 18 kb inversions, which occurred three times in different lineages of tribe Sileneae, were thought to be facilitated by the intermolecular duplicated sequences. Sequence analyses of the L. wilfordii and S. capitata genomes revealed 39 and 43 repeats, respectively, including forward, palindromic, and reverse repeats. In addition, a total of 67 and 56 simple sequence repeats were discovered in the L. wilfordii and S. capitata chloroplast genomes, respectively. Finally, we constructed phylogenetic trees of the 12 Caryophyllaceae species and two Amaranthaceae species based on 73 protein-coding genes using both maximum parsimony and likelihood methods.
Torres-Cruz, Terry J.; Billingsley Tobias, Terri L.; Almatruk, Maryam; ...
2017-08-08
Illumina amplicon sequencing of soil in a temperate pine forest in the southeastern United States detected an abundant, nitrogen (N)-responsive fungal genotype of unknown phylogenetic affiliation. Two isolates with ribosomal sequences consistent with that genotype were subsequently obtained. Examination of records in GenBank revealed that a genetically similar fungus had been isolated previously as an endophyte of moss in a pine forest in the southwestern United States. The three isolates were characterized using morphological, genomic, and multilocus molecular data (18S, internal transcribed spacer [ITS], and 28S rRNA sequences). Phylogenetic and maximum likelihood phylogenomic reconstructions revealed that the taxon represents a novel lineage in Mucoromycotina, only preceded by Calcarisporiella, the earliest diverging lineage in the subphylum. Sequences for the novel taxon are frequently detected in environmental sequencing studies, and it is currently part of UNITE’s dynamic list of most wanted fungi. The fungus is dimorphic, grows best at room temperature, and is associated with a wide variety of bacteria. In this paper, a new monotypic genus, Bifiguratus, is proposed, typified by Bifiguratus adelaidae.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Torres-Cruz, Terry J.; Billingsley Tobias, Terri L.; Almatruk, Maryam
Illumina amplicon sequencing of soil in a temperate pine forest in the southeastern United States detected an abundant, nitrogen (N)-responsive fungal genotype of unknown phylogenetic affiliation. Two isolates with ribosomal sequences consistent with that genotype were subsequently obtained. Examination of records in GenBank revealed that a genetically similar fungus had been isolated previously as an endophyte of moss in a pine forest in the southwestern United States. The three isolates were characterized using morphological, genomic, and multilocus molecular data (18S, internal transcribed spacer [ITS], and 28S rRNA sequences). Phylogenetic and maximum likelihood phylogenomic reconstructions revealed that the taxon represents a novel lineage in Mucoromycotina, only preceded by Calcarisporiella, the earliest diverging lineage in the subphylum. Sequences for the novel taxon are frequently detected in environmental sequencing studies, and it is currently part of UNITE’s dynamic list of most wanted fungi. The fungus is dimorphic, grows best at room temperature, and is associated with a wide variety of bacteria. In this paper, a new monotypic genus, Bifiguratus, is proposed, typified by Bifiguratus adelaidae.
Maximum likelihood estimation for Cox's regression model under nested case-control sampling.
Scheike, Thomas H; Juul, Anders
2004-04-01
Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin-like growth factor I was associated with ischemic heart disease. The study was based on a population of 3784 Danes and 231 cases of ischemic heart disease where controls were matched on age and gender. We illustrate the use of the MLE for these data and show how the maximum likelihood framework can be used to obtain information additional to the relative risk estimates of covariates.
Hu, Chao; Tian, Huaizhen; Li, Hongqing; Hu, Aiqun; Xing, Fuwu; Bhattacharjee, Avishek; Hsu, Tianchuan; Kumar, Pankaj; Chung, Shihwen
2016-01-01
A molecular phylogeny of Asiatic species of Goodyera (Orchidaceae, Cranichideae, Goodyerinae) based on the nuclear ribosomal internal transcribed spacer (ITS) region and two chloroplast loci (matK and trnL-F) is presented. Thirty-five species represented by 132 samples of Goodyera were analyzed, along with 27 other genera (48 species), using Pterostylis longifolia and Chloraea gaudichaudii as outgroups. Bayesian inference, maximum parsimony and maximum likelihood methods were used to reveal the intrageneric relationships of Goodyera and its intergeneric relationships to related genera. The results indicate that: 1) Goodyera is not monophyletic; 2) Goodyera could be divided into four sections, viz., Goodyera, Otosepalum, Reticulum and a new section; 3) sect. Reticulum can be further divided into two subsections, viz., Reticulum and Foliosum, whereas sect. Goodyera can in turn be divided into subsections Goodyera and a new subsection. PMID:26927946
Chen, Weicai; Zhang, Wei; Zhou, Shichu; Li, Ning; Huang, Yong; Mo, Yunming
2013-01-01
Leptobrachium guangxiense Fei, Mo, Ye and Jiang, 2009 (Anura: Megophryidae), is presently thought to be endemic to Shangsi, Guangxi Province, China. A molecular phylogenetic analysis was performed and morphological data were examined to gain insight into the phylogenetic position of this species. Maximum parsimony, maximum likelihood, and Bayesian inference methods were employed to reconstruct phylogenetic relationships, using 1914 bp of sequences from the mtDNA genes 12S rRNA, tRNAVal and 16S rRNA. Topologies revealed that L. guangxiense and the Tam Dao (Vietnam) L. chapaense lineage (3A) formed a monophyletic group with well-supported values. The uncorrected p-distance of the ~1.4k bp 16S rRNA data sets between the Tam Dao L. chapaense lineage (3A) and L. guangxiense is only 0.1%. Morphologically, L. guangxiense and the Tam Dao L. chapaense lineage (3A) shared the same characters, and are distinguishable from "true" L. chapaense from the type locality in Sa Pa, Vietnam. Based on morphological characters and mitochondrial DNA, we suggest that the Tam Dao lineages of L. chapaense are conspecific with L. guangxiense. This represents a range extension for L. guangxiense, and a new country record for Vietnam.
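The uncorrected p-distance quoted in this record is simply the proportion of aligned sites at which two sequences differ. A minimal sketch follows, assuming the sequences are already aligned to equal length and that gap or ambiguity symbols ("-", "N", "?") are skipped; the function name and the skip rule are illustrative assumptions, not taken from the study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance: differing sites / compared sites.

    Assumption: sites where either sequence has a gap or an
    ambiguity symbol are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N?" or b in "-N?":
            continue  # skip sites that cannot be compared
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


if __name__ == "__main__":
    # One difference in ten compared sites gives a distance of 0.1 (10%).
    print(p_distance("ACGTACGTAC", "ACGTACGTAT"))
```

No substitution model is applied, which is why the distance is called uncorrected; it tends to underestimate divergence once multiple substitutions per site become likely.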
A Distance Measure for Genome Phylogenetic Analysis
NASA Astrophysics Data System (ADS)
Cao, Minh Duc; Allison, Lloyd; Dix, Trevor
Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirements. Another class of tree construction methods, namely distance-based methods, requires a measure of distance between any two genomes. Some measures, such as evolutionary edit distance of gene order and gene content, are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
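To illustrate the general idea of a compression-based distance, the sketch below computes the normalized compression distance (NCD) with the general-purpose zlib compressor. This is a swapped-in stand-in, not the expert-model measure described in the record: the formula, the compressor, and all function names here are assumptions chosen only to show how pairwise distances for a distance-based tree method could be obtained from compressed sizes.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data (maximum compression)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed size. Values near 0 suggest shared
    information; values near 1 suggest little shared information.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(genomes: dict) -> dict:
    """Pairwise NCD for a {name: sequence_bytes} mapping (hypothetical helper)."""
    names = sorted(genomes)
    return {
        (a, b): ncd(genomes[a], genomes[b])
        for a in names
        for b in names
        if a < b
    }
```

The resulting pairwise distances could be passed to any distance-based method such as neighbor joining. Note that zlib only looks back 32 KB, so this toy version is unsuitable for whole genomes; a compressor designed for biological sequences would be needed in practice.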
Molecular phylogeny of the armored catfish family Callichthyidae (Ostariophysi, Siluriformes).
Shimabukuro-Dias, Cristiane Kioko; Oliveira, Claudio; Reis, Roberto E; Foresti, Fausto
2004-07-01
The family Callichthyidae comprises eight genera of fishes widely distributed across the Neotropical region. In the present study, sequences of the mitochondrial genes 12S rRNA, 16S rRNA, ND4, tRNAHis, and tRNASer were obtained from 28 callichthyid specimens. The sample included 12 species of Corydoras, three species of Aspidoras, two species of Brochis, Dianema, Lepthoplosternum, and Megalechis, and two local populations of Callichthys and Hoplosternum. Sequences of Nematogenys inermis (Nematogenyidae), Trichomycterus areolatus, and Henonemus punctatus (Trichomycteridae), Astroblepus sp. (Astroblepidae), and Neoplecostomus paranensis, Delturus parahybae, and Hemipsilichthys nimius (Loricariidae) were included as the outgroup. Phylogenetic analyses were performed by using the methods of maximum parsimony and maximum likelihood. The results of almost all analyses were very similar. The family Callichthyidae is monophyletic and comprises two natural groups: the subfamilies Corydoradinae (Aspidoras, Brochis, and Corydoras) and Callichthyinae (Callichthys, Dianema, Hoplosternum, Lepthoplosternum, and Megalechis), as previously demonstrated by morphological studies. The relationships observed within these subfamilies are in several ways different from those previously proposed on the basis of morphological data. Molecular results were compared with the morphologic and cytogenetic data available on the family. Copyright 2003 Elsevier Inc.
Molecular and Clinical Characterization of Chikungunya Virus Infections in Southeast Mexico.
Galán-Huerta, Kame A; Martínez-Landeros, Erik; Delgado-Gallegos, Juan L; Caballero-Sosa, Sandra; Malo-García, Iliana R; Fernández-Salas, Ildefonso; Ramos-Jiménez, Javier; Rivas-Estilla, Ana M
2018-05-09
Chikungunya fever is an arthropod-borne infection caused by Chikungunya virus (CHIKV). Even though clinical features of Chikungunya fever in the Mexican population have been described before, there is no detailed information. The aim of this study was to perform a full description of the clinical features in confirmed Chikungunya-infected patients and describe the molecular epidemiology of CHIKV. We evaluated febrile patients who sought medical assistance in Tapachula, Chiapas, Mexico, from June through July 2015. Infection was confirmed with molecular and serological methods. Viruses were isolated and the E1 gene was sequenced. Phylogeny reconstruction was inferred using maximum-likelihood and maximum clade credibility approaches. We studied 52 patients with confirmed CHIKV infection. They were more likely to have wrist, metacarpophalangeal, and knee arthralgia. Two combinations of clinical features were obtained to differentiate between Chikungunya fever and acute undifferentiated febrile illness. We obtained 10 CHIKV E1 sequences that grouped with the Asian lineage. Seven strains diverged from those formerly reported. Patients infected with the divergent CHIKV strains showed a broader spectrum of clinical manifestations. We defined the complete clinical features of Chikungunya fever in patients from Southeastern Mexico. Our results demonstrate co-circulation of different CHIKV strains in the state of Chiapas.
Yuri, Tamaki; Kimball, Rebecca T.; Harshman, John; Bowie, Rauri C. K.; Braun, Michael J.; Chojnowski, Jena L.; Han, Kin-Lan; Hackett, Shannon J.; Huddleston, Christopher J.; Moore, William S.; Reddy, Sushma; Sheldon, Frederick H.; Steadman, David W.; Witt, Christopher C.; Braun, Edward L.
2013-01-01
Insertion/deletion (indel) mutations, which are represented by gaps in multiple sequence alignments, have been used to examine phylogenetic hypotheses for some time. However, most analyses combine gap data with the nucleotide sequences in which they are embedded, probably because most phylogenetic datasets include few gap characters. Here, we report analyses of 12,030 gap characters from an alignment of avian nuclear genes using maximum parsimony (MP) and a simple maximum likelihood (ML) framework. Both trees were similar, and they exhibited almost all of the strongly supported relationships in the nucleotide tree, although neither gap tree supported many relationships that have proven difficult to recover in previous studies. Moreover, independent lines of evidence typically corroborated the nucleotide topology instead of the gap topology when they disagreed, although the number of conflicting nodes with high bootstrap support was limited. Filtering to remove short indels did not substantially reduce homoplasy or reduce conflict. Combined analyses of nucleotides and gaps resulted in the nucleotide topology, but with increased support, suggesting that gap data may prove most useful when analyzed in combination with nucleotide substitutions. PMID:24832669
Bootstrap Standard Errors for Maximum Likelihood Ability Estimates When Item Parameters Are Unknown
ERIC Educational Resources Information Center
Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi
2014-01-01
When item parameter estimates are used to estimate the ability parameter in item response models, the standard error (SE) of the ability estimate must be corrected to reflect the error carried over from item calibration. For maximum likelihood (ML) ability estimates, a corrected asymptotic SE is available, but it requires a long test and the…
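For context, the uncorrected quantity this record refers to can be sketched as follows: a maximum likelihood ability estimate under a two-parameter logistic (2PL) model with item parameters treated as known, together with the usual asymptotic SE from the test information. The choice of the 2PL model, the Newton-Raphson solver, and all names are assumptions for illustration; the bootstrap correction studied in the record is not implemented here.

```python
import math


def _score_and_info(theta, responses, a, b):
    """First derivative of the log-likelihood and the test information at theta."""
    score = 0.0
    info = 0.0
    for u, ai, bi in zip(responses, a, b):
        p = 1.0 / (1.0 + math.exp(-ai * (theta - bi)))  # 2PL response probability
        score += ai * (u - p)
        info += ai * ai * p * (1.0 - p)
    return score, info


def ml_ability(responses, a, b, theta=0.0, max_iter=100, tol=1e-10):
    """ML ability estimate and asymptotic SE, treating item parameters as known.

    responses: 0/1 item scores; a: discriminations; b: difficulties.
    The SE returned ignores item calibration error, which is the
    shortcoming the record says must be corrected.
    """
    if len(set(responses)) < 2:
        raise ValueError("no finite ML estimate for all-correct or all-incorrect patterns")
    for _ in range(max_iter):
        score, info = _score_and_info(theta, responses, a, b)
        step = score / info  # Newton-Raphson step (equals Fisher scoring for the 2PL)
        theta += step
        if abs(step) < tol:
            break
    _, info = _score_and_info(theta, responses, a, b)
    return theta, 1.0 / math.sqrt(info)


if __name__ == "__main__":
    theta_hat, se = ml_ability([1, 1, 0], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
    print(theta_hat, se)
```

With only a handful of items the information is small and the SE is large, which is consistent with the record's remark that the asymptotic result needs a long test.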
NASA Technical Reports Server (NTRS)
Benjauthrit, B.; Mulhall, B.; Madsen, B. D.; Alberda, M. E.
1976-01-01
The DSN telemetry system performance with convolutionally coded data using the operational maximum-likelihood convolutional decoder (MCD) being implemented in the Network is described. Data rates from 80 bps to 115.2 kbps and both S- and X-band receivers are reported. The results of both one- and two-way radio losses are included.
ERIC Educational Resources Information Center
Wollack, James A.; Bolt, Daniel M.; Cohen, Allan S.; Lee, Young-Sun
2002-01-01
Compared the quality of item parameter estimates for marginal maximum likelihood (MML) and Markov Chain Monte Carlo (MCMC) with the nominal response model using simulation. The quality of item parameter recovery was nearly identical for MML and MCMC, and both methods tended to produce good estimates. (SLD)
ERIC Educational Resources Information Center
Khattab, Ali-Maher; And Others
1982-01-01
A causal modeling system, using confirmatory maximum likelihood factor analysis with the LISREL IV computer program, evaluated the construct validity underlying the higher order factor structure of a given correlation matrix of 46 structure-of-intellect tests emphasizing the product of transformations. (Author/PN)
NASA Astrophysics Data System (ADS)
Sutawanir
2015-12-01
Mortality tables play an important role in actuarial studies such as life annuities, premium determination, premium reserves, pension plan valuation, and pension funding. Well-known mortality tables include the CSO mortality table, the Indonesian Mortality Table, the Bowers mortality table, and the Japan Mortality Table. For actuarial applications, tables are constructed for different environments such as single decrement, double decrement, and multiple decrement. There are two approaches to mortality table construction: the mathematical approach and the statistical approach. Distribution models and estimation theory are the statistical concepts used in mortality table construction. This article discusses the statistical approach to mortality table construction. The distributional assumptions are the uniform distribution of deaths (UDD) and constant force (exponential). Moment estimation and maximum likelihood are used to estimate the mortality parameter. Moment estimation methods are easier to manipulate than maximum likelihood estimation (MLE). However, the moment estimation method does not use the complete mortality data, whereas maximum likelihood exploits all available information in mortality estimation. Some MLE equations are complicated and must be solved using numerical methods. The article focuses on single decrement estimation using moment and maximum likelihood estimation. Some extensions to double decrement are introduced. A simple dataset is used to illustrate mortality estimation and mortality table construction.
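Under the constant force (exponential) assumption, the single decrement maximum likelihood estimate has a closed form: the estimated force of mortality is the number of observed deaths divided by the total time at risk. The sketch below shows that calculation for a one-year age interval; the data layout, function name, and toy numbers are assumptions for illustration and do not come from the article.

```python
import math


def constant_force_mle(exposures, died):
    """MLE of the force of mortality under a constant force assumption.

    exposures: time at risk (in years) each life contributed within the interval
    died:      True if that life's observation ended in death, else False
    Returns (mu_hat, q_hat) where q_hat = 1 - exp(-mu_hat) is the implied
    one-year probability of death.
    """
    total_exposure = sum(exposures)
    if total_exposure <= 0:
        raise ValueError("total exposure must be positive")
    deaths = sum(1 for d in died if d)
    mu_hat = deaths / total_exposure   # deaths per unit of exact exposure
    q_hat = 1.0 - math.exp(-mu_hat)    # one-year death probability
    return mu_hat, q_hat


if __name__ == "__main__":
    # Hypothetical data: four lives, one death halfway through the year,
    # one withdrawal halfway through the year.
    exposures = [1.0, 1.0, 0.5, 0.5]
    died = [False, False, True, False]
    print(constant_force_mle(exposures, died))
```

Every life contributes its exact time at risk, including lives that withdraw before the end of the interval, which is the sense in which maximum likelihood uses all available information.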
Maximum-likelihood methods in wavefront sensing: stochastic models and likelihood functions
Barrett, Harrison H.; Dainty, Christopher; Lara, David
2008-01-01
Maximum-likelihood (ML) estimation in wavefront sensing requires careful attention to all noise sources and all factors that influence the sensor data. We present detailed probability density functions for the output of the image detector in a wavefront sensor, conditional not only on wavefront parameters but also on various nuisance parameters. Practical ways of dealing with nuisance parameters are described, and final expressions for likelihoods and Fisher information matrices are derived. The theory is illustrated by discussing Shack–Hartmann sensors, and computational requirements are discussed. Simulation results show that ML estimation can significantly increase the dynamic range of a Shack–Hartmann sensor with four detectors and that it can reduce the residual wavefront error when compared with traditional methods. PMID:17206255
On non-parametric maximum likelihood estimation of the bivariate survivor function.
Prentice, R L
The likelihood function for the bivariate survivor function F, under independent censorship, is maximized to obtain a non-parametric maximum likelihood estimator F̂. F̂ may or may not be unique depending on the configuration of singly- and doubly-censored pairs. The likelihood function can be maximized by placing all mass on the grid formed by the uncensored failure times, or half lines beyond the failure time grid, or in the upper right quadrant beyond the grid. By accumulating the mass along lines (or regions) where the likelihood is flat, one obtains a partially maximized likelihood as a function of parameters that can be uniquely estimated. The score equations corresponding to these point mass parameters are derived, using a Lagrange multiplier technique to ensure unit total mass, and a modified Newton procedure is used to calculate the parameter estimates in some limited simulation studies. Some considerations for the further development of non-parametric bivariate survivor function estimators are briefly described.
Bayesian logistic regression approaches to predict incorrect DRG assignment.
Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural
2018-05-07
Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and to classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification performance by 6% compared to maximum likelihood and by 34% compared to random classification. We found that the original DRG, coder and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already led to improved audit efficiency in an operational capacity.
Tartar, Aurélien; Boucias, Drion G; Becnel, James J; Adams, Byron J
2003-11-01
The Helicosporidia are invertebrate pathogens that have recently been identified as non-photosynthetic green algae (Chlorophyta). In order to confirm the algal nature of the genus Helicosporidium, the presence of a retained chloroplast genome in Helicosporidia cells was investigated. Fragments homologous to plastid 16S rRNA (rrn16) genes were amplified successfully from cellular DNA extracted from two different Helicosporidium isolates. The fragment sequences are 1269 and 1266 bp long, are very AT-rich (60.7 %) and are similar to homologous genes sequenced from non-photosynthetic green algae. Maximum-parsimony, maximum-likelihood and neighbour-joining methods were used to infer phylogenetic trees from an rrn16 sequence alignment. All trees depicted the Helicosporidia as sister taxa to the non-photosynthetic, pathogenic alga Prototheca zopfii. Moreover, the trees identified Helicosporidium spp. as members of a clade that included the heterotrophic species Prototheca spp. and the mesotrophic species Chlorella protothecoides. The clade is always strongly supported by bootstrap values, suggesting that all these organisms share a most recent common ancestor. Phylogenetic analyses inferred from plastid 16S rRNA genes confirmed that the Helicosporidia are non-photosynthetic green algae, close relatives of the genus Prototheca (Chlorophyta, Trebouxiophyceae). Such phylogenetic affinities suggest that Helicosporidium spp. are likely to possess Prototheca-like organelles and organelle genomes.
Wang, Li; Yokoyama, Koji; Miyaji, Makoto; Nishimura, Kazuko
2001-01-01
We analyzed a 402-bp sequence of the mitochondrial cytochrome b gene of 34 strains of Exophiala jeanselmei and 16 strains representing 12 related species. The strains of E. jeanselmei were classified into 20 DNA types and 17 amino acid types. The differences between these strains were found in 1 to 60 nucleotides and 1 to 17 amino acids. On the basis of the identities and similarities of nucleotide and amino acid sequences, some strains were reidentified: i.e., two strains of E. jeanselmei var. heteromorpha and one strain of E. castellanii as E. dermatitidis (including the type strain), three strains of E. jeanselmei as E. jeanselmei var. lecanii-corni (including the type strain), three strains of E. jeanselmei as E. bergeri (including the type strain), seven strains of E. jeanselmei as E. pisciphila (including the type strain), seven strains of E. jeanselmei as E. jeanselmei var. jeanselmei (including the type strain), one strain of E. jeanselmei as Fonsecaea pedrosoi (including the type strain), and one strain of E. jeanselmei as E. spinifera (including the type strain). Some E. jeanselmei strains showed distinct nucleotide and amino acid sequences. The amino-acid-based UPGMA (unweighted pair group method with the arithmetic mean) tree exhibited nearly the same topology as those of the DNA-based trees obtained by neighbor joining, maximum parsimony, and maximum likelihood methods. PMID:11724862
Maximum Likelihood Compton Polarimetry with the Compton Spectrometer and Imager
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lowell, A. W.; Boggs, S. E.; Chiu, C. L.
2017-10-20
Astrophysical polarization measurements in the soft gamma-ray band are becoming more feasible as detectors with high position and energy resolution are deployed. Previous work has shown that the minimum detectable polarization (MDP) of an ideal Compton polarimeter can be improved by ∼21% when an unbinned, maximum likelihood method (MLM) is used instead of the standard approach of fitting a sinusoid to a histogram of azimuthal scattering angles. Here we outline a procedure for implementing this maximum likelihood approach for real, nonideal polarimeters. As an example, we use the recent observation of GRB 160530A with the Compton Spectrometer and Imager. We find that the MDP for this observation is reduced by 20% when the MLM is used instead of the standard method.
Zagata, Patrycja; Greczek-Stachura, Magdalena; Tarcz, Sebastian; Rautian, Maria
2015-01-01
Paramecium bursaria is composed of five syngens that are morphologically indistinguishable but sexually isolated. The aim of the present study was to confirm by molecular methods (analyses of mitochondrial COI) the identification of P. bursaria syngens originating from different geographical locations. Phylograms constructed using both the neighbor-joining and maximum-likelihood methods based on a comparison of 34 sequences of P. bursaria strains and P. multimicronucleatum, P. caudatum and P. calkinsi strains used as outgroups revealed five clusters which correspond to results obtained previously by mating reaction. Our analysis shows the existence of 24 haplotypes for the COI gene sequence in the studied strains. The interspecies haplotype diversity was Hd = 0.967. We confirmed genetic differentiation between strains of P. bursaria and the occurrence of a correlation between geographical distribution and the corresponding syngen.
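Haplotype diversity (Hd) of the kind reported in this record is commonly computed with Nei's estimator, Hd = n/(n-1) * (1 - sum of squared haplotype frequencies), where n is the number of sampled sequences. The sketch below assumes that estimator and uses made-up haplotype labels; the record does not give the per-haplotype counts, so its value of 0.967 is not reproduced here.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity for a list of haplotype labels.

    Hd = n / (n - 1) * (1 - sum(p_i ** 2)), where p_i is the sample
    frequency of haplotype i and n is the number of sequences.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


if __name__ == "__main__":
    # Hypothetical sample: four sequences, three distinct haplotypes.
    print(haplotype_diversity(["h1", "h1", "h2", "h3"]))
```

The value approaches 1 when almost every sampled sequence carries a different haplotype, and equals 0 when all sequences are identical.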
Hofman, Sebastian; Pabijan, Maciej; Osikowski, Artur; Litvinchuk, Spartak N; Szymura, Jacek M
2016-09-01
We present the full-length mitogenome sequences of four European water frog species: Pelophylax cypriensis, P. epeiroticus, P. kurtmuelleri and P. shqipericus. The mtDNA size varied from 17,363 to 17,895 bp, and its organization with the LPTF tRNA gene cluster preceding the 12S rRNA gene displayed the typical Neobatrachian arrangement. Maximum likelihood and Bayesian inference revealed a well-resolved mtDNA phylogeny of seven European Pelophylax species. The uncorrected p-distance among Pelophylax mitogenomes was 9.6 (range 0.01-0.13). Most divergent was the P. shqipericus mitogenome, clustering with the "P. lessonae" group, in contrast to the other three new Pelophylax mitogenomes related to the "P. bedriagae/ridibundus" lineage. The new mitogenomes resolve ambiguities of the phylogenetic placement of P. cretensis and P. epeiroticus.
Stephen, Alexa A; Leone, Angelique M; Toplon, David E; Archer, Linda L; Wellehan, James F X
2016-12-01
A juvenile female bald eagle (Haliaeetus leucocephalus) was presented with emaciation and proliferative periocular lesions. The eagle did not respond to supportive therapy and was euthanatized. Histopathologic examination of the skin lesions revealed plaques of marked epidermal hyperplasia, parakeratosis, marked acanthosis and spongiosis, and eosinophilic intracytoplasmic inclusion bodies. Novel polymerase chain reaction (PCR) assays were done to amplify and sequence DNA polymerase and rpo147 genes. The 4b gene was also analyzed by a previously developed assay. Bayesian and maximum likelihood phylogenetic analyses of the obtained sequences found the virus to be a poxvirus of the genus Avipoxvirus that clustered with other raptor isolates. Better phylogenetic resolution was found with rpo147 than with the commonly used DNA polymerase. The novel consensus rpo147 PCR assay will create more accurate phylogenetic trees and allow better insight into poxvirus history.
Iterative Code-Aided ML Phase Estimation and Phase Ambiguity Resolution
NASA Astrophysics Data System (ADS)
Wymeersch, Henk; Moeneclaey, Marc
2005-12-01
As many coded systems operate at very low signal-to-noise ratios, synchronization becomes a very difficult task. In many cases, conventional algorithms will either require long training sequences or result in large BER degradations. By exploiting code properties, these problems can be avoided. In this contribution, we present several iterative maximum-likelihood (ML) algorithms for joint carrier phase estimation and ambiguity resolution. These algorithms operate on coded signals by accepting soft information from the MAP decoder. Issues of convergence and initialization are addressed in detail. Simulation results are presented for turbo codes, and are compared to performance results of conventional algorithms. Performance comparisons are carried out in terms of BER performance and mean square estimation error (MSEE). We show that the proposed algorithm reduces the MSEE and, more importantly, the BER degradation. Additionally, phase ambiguity resolution can be performed without resorting to a pilot sequence, thus improving the spectral efficiency.
Lod scores for gene mapping in the presence of marker map uncertainty.
Stringham, H M; Boehnke, M
2001-07-01
Multipoint lod scores are typically calculated for a grid of locus positions, moving the putative disease locus across a fixed map of genetic markers. Changing the order of a set of markers and/or the distances between the markers can make a substantial difference in the resulting lod score curve and the location and height of its maximum. The typical approach of using the best maximum likelihood marker map is not easily justified if other marker orders are nearly as likely and give substantially different lod score curves. To deal with this problem, we propose three weighted multipoint lod score statistics that make use of information from all plausible marker orders. In each of these statistics, the information conditional on a particular marker order is included in a weighted sum, with weight equal to the posterior probability of that order. We evaluate the type 1 error rate and power of these three statistics on the basis of results from simulated data, and compare these results to those obtained using the best maximum likelihood map and the map with the true marker order. We find that the lod score based on a weighted sum of maximum likelihoods improves on using only the best maximum likelihood map, having a type 1 error rate and power closest to that of using the true marker order in the simulation scenarios we considered. Copyright 2001 Wiley-Liss, Inc.
On the Existence and Uniqueness of JML Estimates for the Partial Credit Model
ERIC Educational Resources Information Center
Bertoli-Barsotti, Lucio
2005-01-01
A necessary and sufficient condition is given in this paper for the existence and uniqueness of the maximum likelihood (the so-called joint maximum likelihood) estimate of the parameters of the Partial Credit Model. This condition is stated in terms of a structural property of the pattern of the data matrix that can be easily verified on the basis…
ERIC Educational Resources Information Center
Paek, Insu; Wilson, Mark
2011-01-01
This study elaborates the Rasch differential item functioning (DIF) model formulation under the marginal maximum likelihood estimation context. Also, the Rasch DIF model performance was examined and compared with the Mantel-Haenszel (MH) procedure in small sample and short test length conditions through simulations. The theoretically known…
Robust analysis of semiparametric renewal process models
Lin, Feng-Chang; Truong, Young K.; Fine, Jason P.
2013-01-01
Summary A rate model is proposed for a modulated renewal process comprising a single long sequence, where the covariate process may not capture the dependencies in the sequence as in standard intensity models. We consider partial likelihood-based inferences under a semiparametric multiplicative rate model, which has been widely studied in the context of independent and identical data. Under an intensity model, gap times in a single long sequence may be used naively in the partial likelihood with variance estimation utilizing the observed information matrix. Under a rate model, the gap times cannot be treated as independent and studying the partial likelihood is much more challenging. We employ a mixing condition in the application of limit theory for stationary sequences to obtain consistency and asymptotic normality. The estimator's variance is quite complicated owing to the unknown gap times dependence structure. We adapt block bootstrapping and cluster variance estimators to the partial likelihood. Simulation studies and an analysis of a semiparametric extension of a popular model for neural spike train data demonstrate the practical utility of the rate approach in comparison with the intensity approach. PMID:24550568
Estimating Divergence Parameters With Small Samples From a Large Number of Loci
Wang, Yong; Hey, Jody
2010-01-01
Most methods for studying divergence with gene flow rely upon data from many individuals at few loci. Such data can be useful for inferring recent population history but they are unlikely to contain sufficient information about older events. However, the growing availability of genome sequences suggests a different kind of sampling scheme, one that may be more suited to studying relatively ancient divergence. Data sets extracted from whole-genome alignments may represent very few individuals but contain a very large number of loci. To take advantage of such data we developed a new maximum-likelihood method for genomic data under the isolation-with-migration model. Unlike many coalescent-based likelihood methods, our method does not rely on Monte Carlo sampling of genealogies, but rather provides a precise calculation of the likelihood by numerical integration over all genealogies. We demonstrate that the method works well on simulated data sets. We also consider two models for accommodating mutation rate variation among loci and find that the model that treats mutation rates as random variables leads to better estimates. We applied the method to the divergence of Drosophila melanogaster and D. simulans and detected a low, but statistically significant, signal of gene flow from D. simulans to D. melanogaster. PMID:19917765
Gifford, Robert J.; Rhee, Soo-Yon; Eriksson, Nicolas; Liu, Tommy F.; Kiuchi, Mark; Das, Amar K.; Shafer, Robert W.
2008-01-01
Design: Promiscuous guanine (G) to adenine (A) substitutions catalysed by apolipoprotein B RNA-editing catalytic component (APOBEC) enzymes are observed in a proportion of HIV-1 sequences in vivo and can introduce artifacts into some genetic analyses. The potential impact of undetected lethal editing on genotypic estimation of transmitted drug resistance was assessed. Methods: Classifiers of lethal, APOBEC-mediated editing were developed by analysis of lentiviral pol gene sequence variation and evaluated using control sets of HIV-1 sequences. The potential impact of sequence editing on genotypic estimation of drug resistance was assessed in sets of sequences obtained from 77 studies of 25 or more therapy-naive individuals, using mixture modelling approaches to determine the maximum likelihood classification of sequences as lethally edited as opposed to viable. Results: Analysis of 6437 protease and reverse transcriptase sequences from therapy-naive individuals using a novel classifier of lethal, APOBEC3G-mediated sequence editing, the ‘polypeptide-like 3G (APOBEC3G)-mediated defectives (A3GD) index’, detected lethal editing in association with spurious ‘transmitted drug resistance’ in nearly 3% of proviral sequences obtained from whole blood and 0.2% of samples obtained from plasma. Conclusion: Screening for lethally edited sequences in datasets containing a proportion of proviral DNA, such as those likely to be obtained for epidemiological surveillance of transmitted drug resistance in the developing world, can eliminate rare but potentially significant errors in genotypic estimation of transmitted drug resistance. PMID:18356601
Yang, Li; Wang, Guobao; Qi, Jinyi
2016-04-01
Detecting cancerous lesions is a major clinical application of emission tomography. In previous work, we studied penalized maximum-likelihood (PML) image reconstruction for lesion detection in static PET. Here we extend our theoretical analysis of static PET reconstruction to dynamic PET. We study both the conventional indirect reconstruction and direct reconstruction for Patlak parametric image estimation. In indirect reconstruction, Patlak parametric images are generated by first reconstructing a sequence of dynamic PET images, and then performing Patlak analysis on the time activity curves (TACs) pixel-by-pixel. In direct reconstruction, Patlak parametric images are estimated directly from raw sinogram data by incorporating the Patlak model into the image reconstruction procedure. PML reconstruction is used in both the indirect and direct reconstruction methods. We use a channelized Hotelling observer (CHO) to assess lesion detectability in Patlak parametric images. Simplified expressions for evaluating the lesion detectability have been derived and applied to the selection of the regularization parameter value to maximize detection performance. The proposed method is validated using computer-based Monte Carlo simulations. Good agreement between the theoretical predictions and the Monte Carlo results is observed. Both theoretical predictions and Monte Carlo simulation results show the benefit of the indirect and direct methods under optimized regularization parameters in dynamic PET reconstruction for lesion detection, when compared with the conventional static PET reconstruction.
Kim, Tane; Hao, Weilong
2014-09-27
The study of discrete characters is crucial for the understanding of evolutionary processes. Even though great advances have been made in the analysis of nucleotide sequences, computer programs for non-DNA discrete characters are often dedicated to specific analyses and lack flexibility. Discrete characters often have different transition rate matrices, variable rates among sites and sometimes contain unobservable states. To obtain the ability to accurately estimate a variety of discrete characters, programs with sophisticated methodologies and flexible settings are desired. DiscML performs maximum likelihood estimation for evolutionary rates of discrete characters on a provided phylogeny with the options that correct for unobservable data, rate variations, and unknown prior root probabilities from the empirical data. It gives users options to customize the instantaneous transition rate matrices, or to choose pre-determined matrices from models such as birth-and-death (BD), birth-death-and-innovation (BDI), equal rates (ER), symmetric (SYM), general time-reversible (GTR) and all rates different (ARD). Moreover, we show application examples of DiscML on gene family data and on intron presence/absence data. DiscML was developed as a unified R program for estimating evolutionary rates of discrete characters with no restriction on the number of character states, and with flexibility to use different transition models. DiscML is ideal for the analyses of binary (1s/0s) patterns, multi-gene families, and multistate discrete morphological characteristics.
Bayesian image reconstruction for improving detection performance of muon tomography.
Wang, Guobao; Schultz, Larry J; Qi, Jinyi
2009-05-01
Muon tomography is a novel technology that is being developed for detecting high-Z materials in vehicles or cargo containers. Maximum likelihood methods have been developed for reconstructing the scattering density image from muon measurements. However, the instability of maximum likelihood estimation often results in noisy images and low detectability of high-Z targets. In this paper, we propose using regularization to improve the image quality of muon tomography. We formulate the muon reconstruction problem in a Bayesian framework by introducing a prior distribution on scattering density images. An iterative shrinkage algorithm is derived to maximize the log posterior distribution. At each iteration, the algorithm obtains the maximum a posteriori update by shrinking an unregularized maximum likelihood update. Inverse quadratic shrinkage functions are derived for generalized Laplacian priors and inverse cubic shrinkage functions are derived for generalized Gaussian priors. Receiver operating characteristic studies using simulated data demonstrate that the Bayesian reconstruction can greatly improve the detection performance of muon tomography.
Inferring HIV-1 Transmission Dynamics in Germany From Recently Transmitted Viruses.
Pouran Yousef, Kaveh; Meixenberger, Karolin; Smith, Maureen R; Somogyi, Sybille; Gromöller, Silvana; Schmidt, Daniel; Gunsenheimer-Bartmeyer, Barbara; Hamouda, Osamah; Kücherer, Claudia; von Kleist, Max
2016-11-01
Although HIV continues to spread globally, novel intervention strategies such as treatment as prevention (TasP) may bring the epidemic to a halt. However, their effective implementation requires a profound understanding of the underlying transmission dynamics. We analyzed parameters of the German HIV epidemic based on phylogenetic clustering of viral sequences from recently infected seroconverters with known infection dates. Viral baseline and follow-up pol sequences (n = 1943) from 1159 drug-naïve individuals were selected from a nationwide long-term observational study initiated in 1997. Putative transmission clusters were computed based on a maximum likelihood phylogeny. Using individual follow-up sequences, we optimized our clustering threshold to maximize the likelihood of co-clustering individuals connected by direct transmission. The sizes of putative transmission clusters scaled inversely with their abundance and their distribution exhibited a heavy tail. Clusters based on the optimal clustering threshold were significantly more likely to contain members of the same or bordering German federal states. Interinfection times between co-clustered individuals were significantly shorter (26 weeks; interquartile range: 13-83) than in a null model. Viral intraindividual evolution may be used to select criteria that maximize co-clustering of transmission pairs in the absence of strong adaptive selection pressure. Interinfection times of co-clustered individuals may then be an indicator of the typical time to onward transmission. Our analysis suggests that onward transmission may have occurred early after infection, when individuals are typically unaware of their serological status. The latter argues that TasP should be combined with HIV testing campaigns to reduce the possibility of transmission before TasP initiation.
Comparison of wheat classification accuracy using different classifiers of the image-100 system
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Chen, S. C.; Moreira, M. A.; Delima, A. M.
1981-01-01
Classification results using single-cell and multi-cell signature acquisition options, a point-by-point Gaussian maximum-likelihood classifier, and K-means clustering of the Image-100 system are presented. Conclusions reached are that: a better indication of correct classification can be provided by using a test area which contains various cover types of the study area; classification accuracy should be evaluated considering both the percentages of correct classification and error of commission; supervised classification approaches are better than K-means clustering; the Gaussian maximum-likelihood classifier is better than the single-cell and multi-cell signature acquisition options of the Image-100 system; and in order to obtain a high classification accuracy in a large and heterogeneous crop area using the Gaussian maximum-likelihood classifier, homogeneous spectral subclasses of the study crop should be created to derive training statistics.
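The point-by-point Gaussian maximum-likelihood rule mentioned above can be sketched as follows. This is a minimal illustration, not the Image-100 implementation: it assumes independent bands (a diagonal covariance) to stay dependency-free, and the class names, band values, and function names are hypothetical.

```python
import math

def fit_class_stats(samples):
    """Per-band mean and variance for one class (assumes independent bands)."""
    n = len(samples)
    bands = len(samples[0])
    means = [sum(s[b] for s in samples) / n for b in range(bands)]
    variances = [sum((s[b] - means[b]) ** 2 for s in samples) / n
                 for b in range(bands)]
    return means, variances

def log_likelihood(pixel, means, variances):
    """Gaussian log-density of one pixel under one class."""
    return sum(-0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
               for x, m, v in zip(pixel, means, variances))

def classify(pixel, stats):
    """Assign the pixel to the class with the highest likelihood."""
    return max(stats, key=lambda label: log_likelihood(pixel, *stats[label]))

# Hypothetical two-band training pixels for two cover types.
training = {
    "wheat": [(30.0, 80.0), (32.0, 78.0), (29.0, 83.0), (31.0, 79.0)],
    "soil": [(60.0, 40.0), (62.0, 43.0), (58.0, 39.0), (61.0, 41.0)],
}
stats = {label: fit_class_stats(s) for label, s in training.items()}
print(classify((31.0, 80.0), stats))
print(classify((59.0, 41.0), stats))
```

Splitting a heterogeneous crop into spectral subclasses, as the abstract recommends, would correspond to adding one entry per subclass to `training` and merging the subclass labels after classification.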
Donato, David I.
2012-01-01
This report presents the mathematical expressions and the computational techniques required to compute maximum-likelihood estimates for the parameters of the National Descriptive Model of Mercury in Fish (NDMMF), a statistical model used to predict the concentration of methylmercury in fish tissue. The expressions and techniques reported here were prepared to support the development of custom software capable of computing NDMMF parameter estimates more quickly and using less computer memory than is currently possible with available general-purpose statistical software. Computation of maximum-likelihood estimates for the NDMMF by numerical solution of a system of simultaneous equations through repeated Newton-Raphson iterations is described. This report explains the derivation of the mathematical expressions required for computational parameter estimation in sufficient detail to facilitate future derivations for any revised versions of the NDMMF that may be developed.
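The repeated Newton-Raphson solution of the likelihood equations can be sketched as below. The NDMMF likelihood itself is not reproduced here; as a stand-in, the sketch assumes a toy two-parameter Poisson log-linear model, and the function name and data are hypothetical.

```python
import math

def newton_raphson_poisson(xs, ys, tol=1e-10, max_iter=50):
    """Maximum-likelihood fit of y ~ Poisson(exp(a + b*x)) by Newton-Raphson."""
    a, b = math.log(sum(ys) / len(ys)), 0.0  # start from the intercept-only fit
    for _ in range(max_iter):
        mus = [math.exp(a + b * x) for x in xs]
        # Score vector: gradient of the log-likelihood with respect to (a, b).
        g0 = sum(y - mu for y, mu in zip(ys, mus))
        g1 = sum((y - mu) * x for x, y, mu in zip(xs, ys, mus))
        # Negative Hessian of the log-likelihood (a 2x2 symmetric matrix).
        h00 = sum(mus)
        h01 = sum(mu * x for x, mu in zip(xs, mus))
        h11 = sum(mu * x * x for x, mu in zip(xs, mus))
        # Solve the 2x2 linear system for the Newton step.
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Responses set to their exact means, so the likelihood equations are
# satisfied at a = 0.5, b = 0.3.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [math.exp(0.5 + 0.3 * x) for x in xs]
a_hat, b_hat = newton_raphson_poisson(xs, ys)
print(a_hat, b_hat)
```

For a model with more parameters, the explicit 2x2 solve would be replaced by a general linear solver, which is where custom code can save time and memory over general-purpose software.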
Nagelkerke, Nico; Fidler, Vaclav
2015-01-01
The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples, when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are incorrectly classified/labeled as healthy controls. We show that this leads to a zero-inflated binomial model with a defective logistic regression or discrimination function, whose parameters can be estimated using standard statistical methods such as maximum likelihood. These parameters can be used to estimate the probability of true group membership among those, possibly erroneously, classified as controls. Two examples are analyzed and discussed. A simulation study explores properties of the maximum likelihood parameter estimates and the estimates of the number of mislabeled observations.
Liu, Kevin; Warnow, Tandy J; Holder, Mark T; Nelesen, Serita M; Yu, Jiaye; Stamatakis, Alexandros P; Linder, C Randal
2012-01-01
Highly accurate estimation of phylogenetic trees for large data sets is difficult, in part because multiple sequence alignments must be accurate for phylogeny estimation methods to be accurate. Coestimation of alignments and trees has been attempted but currently only SATé estimates reasonably accurate trees and alignments for large data sets in practical time frames (Liu K., Raghavan S., Nelesen S., Linder C.R., Warnow T. 2009b. Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science. 324:1561-1564). Here, we present a modification to the original SATé algorithm that improves upon SATé (which we now call SATé-I) in terms of speed and of phylogenetic and alignment accuracy. SATé-II uses a different divide-and-conquer strategy than SATé-I and so produces smaller more closely related subsets than SATé-I; as a result, SATé-II produces more accurate alignments and trees, can analyze larger data sets, and runs more efficiently than SATé-I. Generally, SATé is a metamethod that takes an existing multiple sequence alignment method as an input parameter and boosts the quality of that alignment method. SATé-II-boosted alignment methods are significantly more accurate than their unboosted versions, and trees based upon these improved alignments are more accurate than trees based upon the original alignments. Because SATé-I used maximum likelihood (ML) methods that treat gaps as missing data to estimate trees and because we found a correlation between the quality of tree/alignment pairs and ML scores, we explored the degree to which SATé's performance depends on using ML with gaps treated as missing data to determine the best tree/alignment pair. We present two lines of evidence that using ML with gaps treated as missing data to optimize the alignment and tree produces very poor results. 
First, we show that the optimization problem where a set of unaligned DNA sequences is given and the output is the tree and alignment of those sequences that maximize likelihood under the Jukes-Cantor model is uninformative in the worst possible sense. For all inputs, all trees optimize the likelihood score. Second, we show that a greedy heuristic that uses GTR+Gamma ML to optimize the alignment and the tree can produce very poor alignments and trees. Therefore, the excellent performance of SATé-II and SATé-I is not because ML is used as an optimization criterion for choosing the best tree/alignment pair but rather due to the particular divide-and-conquer realignment techniques employed.
Viterbi equalization for long-distance, high-speed underwater laser communication
NASA Astrophysics Data System (ADS)
Hu, Siqi; Mi, Le; Zhou, Tianhua; Chen, Weibiao
2017-07-01
In long-distance, high-speed underwater laser communication, because of the strong absorption and scattering processes, the laser pulse is stretched with the increase in communication distance and the decrease in water clarity. The maximum communication bandwidth is limited by laser-pulse stretching. Improving the communication rate increases the intersymbol interference (ISI). To reduce the effect of ISI, the Viterbi equalization (VE) algorithm is used to estimate the maximum-likelihood receiving sequence. The Monte Carlo method is used to simulate the stretching of the received laser pulse and the maximum communication rate at a wavelength of 532 nm in Jerlov IB and Jerlov II water channels with communication distances of 80, 100, and 130 m, respectively. The high-data rate communication performance for the VE and hard-decision algorithms is compared. The simulation results show that the VE algorithm can be used to reduce the ISI by selecting the minimum error path. The trade-off between the high-data rate communication performance and minor bit-error rate performance loss makes VE a promising option for applications in long-distance, high-speed underwater laser communication systems.
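The idea of selecting the minimum-error path can be sketched with a small Viterbi equalizer. This is an illustrative sketch only: it assumes on-off keyed symbols, a two-tap channel in which each pulse leaks into the next symbol slot, and a known starting symbol; the tap values and sample data are hypothetical and not taken from the paper.

```python
def viterbi_equalize(received, taps, symbols=(0, 1)):
    """Most likely symbol sequence for r[k] = h0*s[k] + h1*s[k-1] + noise."""
    h0, h1 = taps
    # cost[s]: smallest accumulated squared error over paths ending in symbol s.
    # The symbol before the first sample is assumed to be symbols[0].
    cost = {s: (0.0 if s == symbols[0] else float("inf")) for s in symbols}
    paths = {s: [] for s in symbols}
    for r in received:
        new_cost, new_paths = {}, {}
        for cur in symbols:
            # Keep only the best predecessor for each current symbol.
            best_prev = min(
                symbols,
                key=lambda prev: cost[prev] + (r - (h0 * cur + h1 * prev)) ** 2,
            )
            new_cost[cur] = cost[best_prev] + (r - (h0 * cur + h1 * best_prev)) ** 2
            new_paths[cur] = paths[best_prev] + [cur]
        cost, paths = new_cost, new_paths
    return paths[min(symbols, key=lambda s: cost[s])]

sent = [1, 0, 1, 1, 0, 0, 1]
# Samples after a (1.0, 0.6) channel plus small hand-picked noise.
received = [1.05, 0.55, 0.9, 1.7, 0.65, -0.05, 1.1]
equalized = viterbi_equalize(received, taps=(1.0, 0.6))
hard = [1 if r > 0.5 else 0 for r in received]
print(equalized)
print(hard)
```

In this toy case the symbol-by-symbol threshold mistakes the leaked energy of a preceding pulse for a transmitted 1, while the sequence search attributes it to the channel memory.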
Peters, Ralph S; Meusemann, Karen; Petersen, Malte; Mayer, Christoph; Wilbrandt, Jeanne; Ziesmann, Tanja; Donath, Alexander; Kjer, Karl M; Aspöck, Ulrike; Aspöck, Horst; Aberer, Andre; Stamatakis, Alexandros; Friedrich, Frank; Hünefeld, Frank; Niehuis, Oliver; Beutel, Rolf G; Misof, Bernhard
2014-03-20
Despite considerable progress in systematics, a comprehensive scenario of the evolution of phenotypic characters in the mega-diverse Holometabola based on a solid phylogenetic hypothesis was still missing. We addressed this issue by de novo sequencing transcriptome libraries of representatives of all orders of holometabolan insects (13 species in total) and by using a previously published extensive morphological dataset. We tested competing phylogenetic hypotheses by analyzing various specifically designed sets of amino acid sequence data, using maximum likelihood (ML) based tree inference and Four-cluster Likelihood Mapping (FcLM). By maximum parsimony-based mapping of the morphological data on the phylogenetic relationships we traced evolutionary transformations at the phenotypic level and reconstructed the groundplan of Holometabola and of selected subgroups. In our analysis of the amino acid sequence data of 1,343 single-copy orthologous genes, Hymenoptera are placed as sister group to all remaining holometabolan orders, i.e., to a clade Aparaglossata, comprising two monophyletic subunits Mecopterida (Amphiesmenoptera + Antliophora) and Neuropteroidea (Neuropterida + Coleopterida). The monophyly of Coleopterida (Coleoptera and Strepsiptera) remains ambiguous in the analyses of the transcriptome data, but appears likely based on the morphological data. Highly supported relationships within Neuropterida and Antliophora are Raphidioptera + (Neuroptera + monophyletic Megaloptera), and Diptera + (Siphonaptera + Mecoptera). ML tree inference and FcLM yielded largely congruent results. However, FcLM, which was applied here for the first time to large phylogenomic supermatrices, displayed additional signal in the datasets that was not identified in the ML trees. Our phylogenetic results imply that an orthognathous larva belongs to the groundplan of Holometabola, with compound eyes and well-developed thoracic legs, externally feeding on plants or fungi. 
Ancestral larvae of Aparaglossata were prognathous, equipped with single larval eyes (stemmata), and possibly agile and predacious. Ancestral holometabolan adults likely resembled in their morphology the groundplan of adult neopteran insects. Within Aparaglossata, the adult's flight apparatus and ovipositor underwent strong modifications. We show that the combination of well-resolved phylogenies obtained by phylogenomic analyses and well-documented extensive morphological datasets is an appropriate basis for reconstructing complex morphological transformations and for the inference of evolutionary histories.
ERIC Educational Resources Information Center
Molenaar, Peter C. M.; Nesselroade, John R.
1998-01-01
Pseudo-Maximum Likelihood (p-ML) and Asymptotically Distribution Free (ADF) estimation methods for estimating dynamic factor model parameters within a covariance structure framework were compared through a Monte Carlo simulation. Both methods appear to give consistent model parameter estimates, but only ADF gives standard errors and chi-square…
Statistical Bias in Maximum Likelihood Estimators of Item Parameters.
1982-04-01
... the bias in the maximum likelihood ... in ...
ERIC Educational Resources Information Center
Beauducel, Andre; Herzberg, Philipp Yorck
2006-01-01
This simulation study compared maximum likelihood (ML) estimation with weighted least squares means and variance adjusted (WLSMV) estimation. The study was based on confirmatory factor analyses with 1, 2, 4, and 8 factors, based on 250, 500, 750, and 1,000 cases, and on 5, 10, 20, and 40 variables with 2, 3, 4, 5, and 6 categories. There was no…
Zeng, Chan; Newcomer, Sophia R; Glanz, Jason M; Shoup, Jo Ann; Daley, Matthew F; Hambidge, Simon J; Xu, Stanley
2013-12-15
The self-controlled case series (SCCS) method is often used to examine the temporal association between vaccination and adverse events using only data from patients who experienced such events. Conditional Poisson regression models are used to estimate incidence rate ratios, and these models perform well with large or medium-sized case samples. However, in some vaccine safety studies, the adverse events studied are rare and the maximum likelihood estimates may be biased. Several bias correction methods have been examined in case-control studies using conditional logistic regression, but none of these methods have been evaluated in studies using the SCCS design. In this study, we used simulations to evaluate 2 bias correction approaches (the Firth penalized maximum likelihood method and Cordeiro and McCullagh's bias reduction after maximum likelihood estimation) with small sample sizes in studies using the SCCS design. The simulations showed that the bias under the SCCS design with a small number of cases can be large and is also sensitive to a short risk period. The Firth correction method provides finite and less biased estimates than the maximum likelihood method and Cordeiro and McCullagh's method. However, limitations still exist when the risk period in the SCCS design is short relative to the entire observation period.
Gritsun, T S; Venugopal, K; Zanotto, P M; Mikhailov, M V; Sall, A A; Holmes, E C; Polkinghorne, I; Frolova, T V; Pogodina, V V; Lashkevich, V A; Gould, E A
1997-05-01
The complete nucleotide sequences of two tick-transmitted flaviviruses, Vasilchenko (Vs) from Siberia and louping ill (LI) from the UK, have been determined. The genomes were 10928 and 10871 nucleotides (nt) in length, respectively. The coding strategy and functional protein sequence motifs of tick-borne flaviviruses are presented in both Vs and LI viruses. The phylogenies based on maximum likelihood, maximum parsimony and distance analysis of the polyproteins identified Vs virus as a member of the tick-borne encephalitis virus subgroup within the tick-borne serocomplex, genus Flavivirus, family Flaviviridae. Comparative alignment of the 3'-untranslated regions revealed deletions of different lengths essentially at the same position downstream of the stop codon for all tick-borne viruses. Two direct 27 nucleotide repeats at the 3'-end were found only for Vs and LI virus. Immediately following the deletions a region of 332-334 nt with relatively conserved primary structure (67-94% identity) was observed at the 3'-non-coding end of the virus genome. Pairwise comparisons of the nucleotide sequence data revealed similar levels of variation between the coding region, and the 5' and 3'-termini of the genome, implying an equivalent strong selective control for translated and untranslated regions. Indeed, the predicted folding of the 5' and 3'-untranslated regions revealed patterns of stem and loop structures conserved for all tick-borne flaviviruses, suggesting a purifying selection for preservation of essential RNA secondary structures which could be involved in translational control and replication. The possible implications of these findings are discussed.
Agaricus section Xanthodermatei: a phylogenetic reconstruction with commentary on taxa.
Kerrigan, Richard W; Callac, Philippe; Guinberteau, Jacques; Challen, Michael P; Parra, Luis A
2005-01-01
Agaricus section Xanthodermatei comprises a group of species allied to A. xanthodermus and generally characterized by basidiomata having phenolic odors, transiently yellowing discolorations in some parts of the basidiome, Schaeffer's reaction negative, and mild to substantial toxicity. The section has a global distribution, while most included species have distributions restricted to regions of single continents. Using specimens and cultures from Europe, North America, and Hawaii, we analyzed DNA sequences from the ITS1+2 region of the nuclear rDNA to identify and characterize phylogenetically distinct entities and to construct a hypothesis of relationships, both among members of the section and with representative taxa from other sections of the genus. 61 sequences from affiliated taxa, plus 20 from six (or seven) other sections of Agaricus, and one Micropsalliota sequence, were evaluated under distance, maximum parsimony and maximum likelihood methods. We recognized 21 discrete entities in Xanthodermatei, including 14 established species and 7 new ones, three of which are described elsewhere. Four species from California, New Mexico, and France deserve further study before they are described. Type studies of American taxa are particularly emphasized, and a lectotype is designated for A. californicus. Section Xanthodermatei formed a single clade in most analyses, indicating that the traditional sectional characters noted above are good unifying characters that appear to have arisen only once within Agaricus. Deep divisions within the sequence-derived structure of the section could be interpreted as subsections in Xanthodermatei; however, various considerations led us to refrain from proposing new supraspecific taxa. The nearest neighbors of section Xanthodermatei are putatively in section Duploannulati.
Inference from Samples of DNA Sequences Using a Two-Locus Model
Griffiths, Robert C.
2011-01-01
Performing inference on contemporary samples of DNA sequence data is an important and challenging task. Computationally intensive methods such as importance sampling (IS) are attractive because they make full use of the available data, but in the presence of recombination the large state space of genealogies can be prohibitive. In this article, we make progress by developing an efficient IS proposal distribution for a two-locus model of sequence data. We show that the proposal developed here leads to much greater efficiency, outperforming existing IS methods that could be adapted to this model. Among several possible applications, the algorithm can be used to find maximum likelihood estimates for mutation and crossover rates, and to perform ancestral inference. We illustrate the method on previously reported sequence data covering two loci either side of the well-studied TAP2 recombination hotspot. The two loci are themselves largely non-recombining, so we obtain a gene tree at each locus and are able to infer in detail the effect of the hotspot on their joint ancestry. We summarize this joint ancestry by introducing the gene graph, a summary of the well-known ancestral recombination graph. PMID:21210733
Barr, Norman; Ruiz-Arce, Raul; Obregón, Oscar; De Leon, Rosita; Foster, Nelson; Reuter, Chris; Boratynski, Theodore; Vacek, Don
2013-02-01
The utility of the cytochrome oxidase I (COI) DNA sequence used for DNA barcoding and a Sequence Characterized Amplified Region for diagnosing boll weevil, Anthonomus grandis Boheman, variants was evaluated. Maximum likelihood analysis of COI DNA sequences from 154 weevils collected from the United States and Mexico supports previous evidence for limited gene flow between weevil populations on wild cotton and commercial cotton in northern Mexico and southern United States. The wild cotton populations represent a variant of the species called the thurberia weevil, which is not regarded as a significant pest. The 31 boll weevil COI haplotypes observed in the study form two distinct haplogroups (A and B) that are supported by five fixed nucleotide differences and a phylogenetic analysis. Although wild and commercial cotton populations are closely associated with specific haplogroups, there is not a fixed difference between the thurberia weevil variant and other populations. The Sequence Characterized Amplified Region marker generated a larger number of inconclusive results than the COI gene but also supported evidence of shared genotypes between wild and commercial cotton weevil populations. These methods provide additional markers that can assist in the identification of pest weevil populations but not definitively diagnose samples.
Haplogroup relationships between domestic and wild sheep resolved using a mitogenome panel.
Meadows, J R S; Hiendleder, S; Kijas, J W
2011-04-01
Five haplogroups have been identified in domestic sheep through global surveys of mitochondrial (mt) sequence variation; however, these group classifications are often based on small fragments of the complete mtDNA sequence: the partial control region or the cytochrome B gene. This study presents the complete mitogenome from representatives of each haplogroup identified in domestic sheep, plus a sample of their wild relatives. Comparison of the sequence successfully resolved the relationships between each haplogroup and provided insight into the relationship with wild sheep. The five haplogroups were characterised as branching independently, a radiation that shared a common ancestor 920,000 ± 190,000 years ago based on protein coding sequence. The utility of various mtDNA components to inform the true relationship between sheep was also examined with Bayesian, maximum likelihood and partitioned Bremer support analyses. The control region was found to be the mtDNA component that contributed the highest amount of support to the tree generated using the complete data set. This study provides the nucleus of a mtDNA mitogenome panel, which can be used to assess additional mitogenomes and serve as a reference set to evaluate small fragments of the mtDNA. PMID:20940734
Downie, John D; Hurley, Jason; Mauro, Yihong
2008-09-29
We experimentally demonstrate uncompensated 8-channel wavelength division multiplexing (WDM) and single channel transmission at 10.7 Gb/s over a 470 km hybrid fiber link with in-line semiconductor optical amplifiers (SOAs). Two different forms of the duobinary modulation format are investigated and compared. Maximum Likelihood Sequence Estimation (MLSE) receiver technology is found to significantly mitigate nonlinear effects from the SOAs and to enable the long transmission, especially for optical duobinary signals derived from differential phase shift keying (DPSK) signals directly detected after narrowband optical filter demodulation. The MLSE also helps to compensate for a non-optimal Fabry-Perot optical filter demodulator.
Arai, Satoru; Gu, Se Hun; Baek, Luck Ju; Tabara, Kenji; Bennett, Shannon; Oh, Hong-Shik; Takada, Nobuhiro; Kang, Hae Ji; Tanaka-Taya, Keiko; Morikawa, Shigeru; Okabe, Nobuhiko; Yanagihara, Richard; Song, Jin-Won
2012-01-01
Spurred by the recent isolation of a novel hantavirus, named Imjin virus (MJNV), from the Ussuri white-toothed shrew (Crocidura lasiura), targeted trapping was conducted for the phylogenetically related Asian lesser white-toothed shrew (Crocidura shantungensis). Pair-wise alignment and comparison of the S, M and L segments of a newfound hantavirus, designated Jeju virus (JJUV), indicated remarkably low nucleotide and amino acid sequence similarity with MJNV. Phylogenetic analyses, using maximum likelihood and Bayesian methods, showed divergent ancestral lineages for JJUV and MJNV, despite the close phylogenetic relationship of their reservoir soricid hosts. Also, no evidence of host switching was apparent in tanglegrams, generated by TreeMap 2.0β. PMID:22230701
Layered classification techniques for remote sensing applications
NASA Technical Reports Server (NTRS)
Swain, P. H.; Wu, C. L.; Landgrebe, D. A.; Hauska, H.
1975-01-01
The single-stage method of pattern classification utilizes all available features in a single test which assigns the unknown to a category according to a specific decision strategy (such as the maximum likelihood strategy). The layered classifier classifies the unknown through a sequence of tests, each of which may be dependent on the outcome of previous tests. Although the layered classifier was originally investigated as a means of improving classification accuracy and efficiency, it was found that in the context of remote sensing data analysis, other advantages also accrue due to many of the special characteristics of both the data and the applications pursued. The layered classifier method and several of the diverse applications of this approach are discussed.
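The contrast with a single-stage test can be sketched as a short sequence of dependent tests. This is a hypothetical two-layer example: the band names, thresholds, class labels, and the vegetation index used in the second layer are illustrative assumptions, not details from the report.

```python
def layered_classify(pixel):
    """Classify a pixel through a sequence of tests; later tests run only
    when earlier ones do not settle the class."""
    # Layer 1: a single near-infrared test separates water from land,
    # so no other feature needs to be examined for water pixels.
    if pixel["nir"] < 0.1:
        return "water"
    # Layer 2: reached only for land pixels; a ratio of two bands
    # splits vegetation from bare soil.
    index = (pixel["nir"] - pixel["red"]) / (pixel["nir"] + pixel["red"])
    return "vegetation" if index > 0.4 else "bare soil"

for sample in ({"nir": 0.05, "red": 0.04},
               {"nir": 0.50, "red": 0.10},
               {"nir": 0.30, "red": 0.25}):
    print(layered_classify(sample))
```

Each layer could equally apply a maximum likelihood decision over its own subset of features; simple thresholds are used here only to keep the control flow visible.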
Huang, Chiung-Yu; Qin, Jing
2013-01-01
The Canadian Study of Health and Aging (CSHA) employed a prevalent cohort design to study survival after onset of dementia, where patients with dementia were sampled and the onset time of dementia was determined retrospectively. The prevalent cohort sampling scheme favors individuals who survive longer. Thus, the observed survival times are subject to length bias. In recent years, there has been a rising interest in developing estimation procedures for prevalent cohort survival data that not only account for length bias but also actually exploit the incidence distribution of the disease to improve efficiency. This article considers semiparametric estimation of the Cox model for the time from dementia onset to death under a stationarity assumption with respect to the disease incidence. Under the stationarity condition, the semiparametric maximum likelihood estimation is expected to be fully efficient yet difficult to perform for statistical practitioners, as the likelihood depends on the baseline hazard function in a complicated way. Moreover, the asymptotic properties of the semiparametric maximum likelihood estimator are not well-studied. Motivated by the composite likelihood method (Besag 1974), we develop a composite partial likelihood method that retains the simplicity of the popular partial likelihood estimator and can be easily performed using standard statistical software. When applied to the CSHA data, the proposed method estimates a significant difference in survival between the vascular dementia group and the possible Alzheimer’s disease group, while the partial likelihood method for left-truncated and right-censored data yields a greater standard error and a 95% confidence interval covering 0, thus highlighting the practical value of employing a more efficient methodology. To check the assumption of stable disease for the CSHA data, we also present new graphical and numerical tests in the article. 
The R code used to obtain the maximum composite partial likelihood estimator for the CSHA data is available in the online Supplementary Material, posted on the journal web site. PMID:24000265
Gottschling, Marc; Soehner, Sylvia; Zinssmeister, Carmen; John, Uwe; Plötner, Jörg; Schweikert, Michael; Aligizaki, Katerina; Elbrächter, Malte
2012-01-01
The phylogenetic relationships of the Dinophyceae (Alveolata) are not sufficiently resolved at present. The Thoracosphaeraceae (Peridiniales) are the only group of the Alveolata that include members with calcareous coccoid stages; this trait is considered apomorphic. Although the coccoid stage apparently is not calcareous, Bysmatrum has been assigned to the Thoracosphaeraceae based on thecal morphology. We tested the monophyly of the Thoracosphaeraceae using large sets of ribosomal RNA sequence data of the Alveolata including the Dinophyceae. Phylogenetic analyses were performed using Maximum Likelihood and Bayesian approaches. The Thoracosphaeraceae were monophyletic, but included also a number of non-calcareous dinophytes (such as Pentapharsodinium and Pfiesteria) and even parasites (such as Duboscquodinium and Tintinnophagus). Bysmatrum had an isolated and uncertain phylogenetic position outside the Thoracosphaeraceae. The phylogenetic relationships among calcareous dinophytes appear complex, and the assumption of the single origin of the potential to produce calcareous structures is challenged. The application of concatenated ribosomal RNA sequence data may prove promising for phylogenetic reconstructions of the Dinophyceae in future. Copyright © 2011 Elsevier GmbH. All rights reserved.
Chen, Rui; Hyrien, Ollivier
2011-01-01
This article deals with quasi- and pseudo-likelihood estimation in a class of continuous-time multi-type Markov branching processes observed at discrete points in time. “Conventional” and conditional estimation are discussed for both approaches. We compare their properties and identify situations where they lead to asymptotically equivalent estimators. Both approaches possess robustness properties, and coincide with maximum likelihood estimation in some cases. Quasi-likelihood functions involving only linear combinations of the data may be unable to estimate all model parameters. Remedial measures exist, including the resort either to non-linear functions of the data or to conditioning the moments on appropriate sigma-algebras. The method of pseudo-likelihood may also resolve this issue. We investigate the properties of these approaches in three examples: the pure birth process, the linear birth-and-death process, and a two-type process that generalizes the previous two examples. Simulation studies are conducted to evaluate performance in finite samples. PMID:21552356
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem addressed by the other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study. PMID:20376286
Lirio, R B; Dondériz, I C; Pérez Abalo, M C
1992-08-01
The methodology of Receiver Operating Characteristic curves based on the signal detection model is extended to evaluate the accuracy of two-stage diagnostic strategies. A computer program is developed for the maximum likelihood estimation of parameters that characterize the sensitivity and specificity of two-stage classifiers according to this extended methodology. Its use is briefly illustrated with data collected in a two-stage screening for auditory defects.
ERIC Educational Resources Information Center
Kelderman, Henk
In this paper, algorithms are described for obtaining the maximum likelihood estimates of the parameters in log-linear models. Modified versions of the iterative proportional fitting and Newton-Raphson algorithms are described that work on the minimal sufficient statistics rather than on the usual counts in the full contingency table. This is…
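As a point of reference for the abstract above, the following is a minimal sketch of classical iterative proportional fitting for the independence log-linear model on a two-way table. It is not the modified algorithm the paper describes; it only illustrates that the fit can be computed from the margins (the sufficient statistics of this model) without the full table of counts. The function name `ipf_two_way` and the example margins are assumptions made for illustration.

```python
def ipf_two_way(row_totals, col_totals, tol=1e-10, max_iter=100):
    """Classical IPF for the independence model, driven only by the margins.

    Illustrative sketch: row_totals and col_totals are the observed margins,
    which are the sufficient statistics of the independence log-linear model.
    """
    n_rows, n_cols = len(row_totals), len(col_totals)
    fitted = [[1.0] * n_cols for _ in range(n_rows)]
    for _ in range(max_iter):
        # Rescale every row so that it reproduces the observed row total.
        for i in range(n_rows):
            row_sum = sum(fitted[i])
            fitted[i] = [v * row_totals[i] / row_sum for v in fitted[i]]
        # Rescale every column so that it reproduces the observed column total.
        max_change = 0.0
        for j in range(n_cols):
            col_sum = sum(fitted[i][j] for i in range(n_rows))
            factor = col_totals[j] / col_sum
            max_change = max(max_change, abs(factor - 1.0))
            for i in range(n_rows):
                fitted[i][j] *= factor
        # Stop once a full cycle leaves the column scaling unchanged.
        if max_change < tol:
            break
    return fitted


# Hypothetical margins of a 2 x 2 table with 100 observations.
fitted = ipf_two_way([30.0, 70.0], [40.0, 60.0])
```

For the independence model the fitted values equal (row total × column total) / N, so the loop settles after a single cycle; models with higher-order terms need several cycles over their fitted margins.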
Maximum Likelihood Item Easiness Models for Test Theory Without an Answer Key
Batchelder, William H.
2014-01-01
Cultural consensus theory (CCT) is a data aggregation technique with many applications in the social and behavioral sciences. We describe the intuition and theory behind a set of CCT models for continuous type data using maximum likelihood inference methodology. We describe how bias parameters can be incorporated into these models. We introduce two extensions to the basic model in order to account for item rating easiness/difficulty. The first extension is a multiplicative model and the second is an additive model. We show how the multiplicative model is related to the Rasch model. We describe several maximum-likelihood estimation procedures for the models and discuss issues of model fit and identifiability. We describe how the CCT models could be used to give alternative consensus-based measures of reliability. We demonstrate the utility of both the basic and extended models on a set of essay rating data and give ideas for future research. PMID:29795812
NASA Technical Reports Server (NTRS)
Chittineni, C. B.
1979-01-01
The problem of estimating label imperfections and the use of the estimation in identifying mislabeled patterns is presented. Expressions for the maximum likelihood estimates of classification errors and a priori probabilities are derived from the classification of a set of labeled patterns. Expressions also are given for the asymptotic variances of probability of correct classification and proportions. Simple models are developed for imperfections in the labels and for classification errors and are used in the formulation of a maximum likelihood estimation scheme. Schemes are presented for the identification of mislabeled patterns in terms of threshold on the discriminant functions for both two-class and multiclass cases. Expressions are derived for the probability that the imperfect label identification scheme will result in a wrong decision and are used in computing thresholds. The results of practical applications of these techniques in the processing of remotely sensed multispectral data are presented.
Bayesian structural equation modeling in sport and exercise psychology.
Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beer, M.
1980-12-01
The maximum likelihood method for the multivariate normal distribution is applied to the case of several individual eigenvalues. Correlated Monte Carlo estimates of the eigenvalue are assumed to follow this prescription and aspects of the assumption are examined. Monte Carlo cell calculations using the SAM-CE and VIM codes for the TRX-1 and TRX-2 benchmark reactors, and SAM-CE full core results are analyzed with this method. Variance reductions of a few percent to a factor of 2 are obtained from maximum likelihood estimation as compared with the simple average and the minimum variance individual eigenvalue. The numerical results verify that the use of sample variances and correlation coefficients in place of the corresponding population statistics still leads to nearly minimum variance estimation for a sufficient number of histories and aggregates.
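The combination step in the abstract above can be illustrated with a short sketch. It assumes the standard result that, for several correlated estimates of one quantity that are jointly normal with known covariance matrix Σ, the maximum likelihood estimate of the common mean is the weighted average with weights proportional to Σ⁻¹1, and its variance is 1 / (1ᵀΣ⁻¹1). The helper names and the numbers are hypothetical and are not taken from the report.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ml_combine(estimates, cov):
    """ML estimate of a common mean from correlated normal estimates."""
    weights = solve_linear(cov, [1.0] * len(estimates))  # Sigma^{-1} 1
    norm = sum(weights)                                   # 1' Sigma^{-1} 1
    mean = sum(w * x for w, x in zip(weights, estimates)) / norm
    return mean, 1.0 / norm


# Hypothetical example: two correlated eigenvalue estimates.
k_values = [1.000, 1.004]
covariance = [[1.0, 0.5],
              [0.5, 2.0]]
k_ml, var_ml = ml_combine(k_values, covariance)
```

In this made-up example the combined variance (0.875) falls below the smaller individual variance (1.0), which is the kind of variance reduction the abstract reports. In practice the covariance matrix is replaced by sample estimates, as the abstract notes.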
A Maximum Likelihood Approach to Functional Mapping of Longitudinal Binary Traits
Wang, Chenguang; Li, Hongying; Wang, Zhong; Wang, Yaqun; Wang, Ningtao; Wang, Zuoheng; Wu, Rongling
2013-01-01
Despite their importance in biology and biomedicine, genetic mapping of binary traits that change over time has not been well explored. In this article, we develop a statistical model for mapping quantitative trait loci (QTLs) that govern longitudinal responses of binary traits. The model is constructed within the maximum likelihood framework by which the association between binary responses is modeled in terms of conditional log odds-ratios. With this parameterization, the maximum likelihood estimates (MLEs) of marginal mean parameters are robust to the misspecification of time dependence. We implement an iterative procedure to obtain the MLEs of QTL genotype-specific parameters that define longitudinal binary responses. The usefulness of the model was validated by analyzing a real example in rice. Simulation studies were performed to investigate the statistical properties of the model, showing that the model has power to identify and map specific QTLs responsible for the temporal pattern of binary traits. PMID:23183762
A Gateway for Phylogenetic Analysis Powered by Grid Computing Featuring GARLI 2.0
Bazinet, Adam L.; Zwickl, Derrick J.; Cummings, Michael P.
2014-01-01
We introduce molecularevolution.org, a publicly available gateway for high-throughput, maximum-likelihood phylogenetic analysis powered by grid computing. The gateway features a GARLI 2.0 web service that enables a user to quickly and easily submit thousands of maximum likelihood tree searches or bootstrap searches that are executed in parallel on distributed computing resources. The GARLI web service allows one to easily specify partitioned substitution models using a graphical interface, and it performs sophisticated post-processing of phylogenetic results. Although the GARLI web service has been used by the research community for over three years, here we formally announce the availability of the service, describe its capabilities, highlight new features and recent improvements, and provide details about how the grid system efficiently delivers high-quality phylogenetic results. [GARLI, gateway, grid computing, maximum likelihood, molecular evolution portal, phylogenetics, web service.] PMID:24789072
A clustering package for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Model.
Bruneau, Marine; Mottet, Thierry; Moulin, Serge; Kerbiriou, Maël; Chouly, Franz; Chretien, Stéphane; Guyeux, Christophe
2018-02-01
In this article, a new Python package for nucleotide sequences clustering is proposed. This package, freely available on-line, implements a Laplacian eigenmap embedding and a Gaussian Mixture Model for DNA clustering. It takes nucleotide sequences as input, and produces the optimal number of clusters along with a relevant visualization. Despite the fact that we did not optimise the computational speed, our method still performs reasonably well in practice. Our focus was mainly on data analytics and accuracy and as a result, our approach outperforms the state of the art, even in the case of divergent sequences. Furthermore, an a priori knowledge on the number of clusters is not required here. For the sake of illustration, this method is applied on a set of 100 DNA sequences taken from the mitochondrially encoded NADH dehydrogenase 3 (ND3) gene, extracted from a collection of Platyhelminthes and Nematoda species. The resulting clusters are tightly consistent with the phylogenetic tree computed using a maximum likelihood approach on gene alignment. They are coherent too with the NCBI taxonomy. Further test results based on synthesized data are then provided, showing that the proposed approach is better able to recover the clusters than the most widely used software, namely Cd-hit-est and BLASTClust. Copyright © 2017 Elsevier Ltd. All rights reserved.
Peyretaillade, E; Broussolle, V; Peyret, P; Méténier, G; Gouy, M; Vivarès, C P
1998-06-01
An intronless gene encoding a protein of 592 amino acid residues with similarity to 70-kDa heat shock proteins (HSP70s) has been cloned and sequenced from the amitochondrial protist Encephalitozoon cuniculi (phylum Microsporidia). Southern blot analyses show the presence of a single gene copy located on chromosome XI. The encoded protein exhibits an N-terminal hydrophobic leader sequence and two motifs shared by proteobacterial and mitochondrially expressed HSP70 homologs. Phylogenetic analysis using maximum likelihood and evolutionary distances place the E. cuniculi sequence in the cluster of mitochondrially expressed HSP70s, with a higher evolutionary rate than those of homologous sequences. Similar results were obtained after cloning a fragment of the homologous gene in the closely related species E. hellem. The presence of a nuclear targeting signal-like sequence supports a role of the Encephalitozoon HSP70 as a molecular chaperone of nuclear proteins. No evidence for cytosolic or endoplasmic reticulum forms of HSP70 was obtained through PCR amplification. These data suggest that Encephalitozoon species have evolved from an ancestor bearing mitochondria, which is in disagreement with the postulated presymbiotic origin of Microsporidia. The specific role and intracellular localization of the mitochondrial HSP70-like protein remain to be elucidated.
Estrada-Bárcenas, Daniel Alfonso; Vite-Garín, Tania; Navarro-Barranco, Hortensia; de la Torre-Arciniega, Raúl; Pérez-Mejía, Amelia; Rodríguez-Arellanes, Gabriela; Ramirez, Jose Antonio; Humberto Sahaza, Jorge; Taylor, Maria Lucia; Toriello, Conchita
2014-01-01
The high sensitivity and specificity of molecular biology techniques have proven useful for the detection, identification and typing of different pathogens. The ITS (Internal Transcribed Spacer) regions of the ribosomal DNA are highly conserved non-coding regions, and have been widely used in different studies including the determination of the genetic diversity of human fungal pathogens. This article aims to contribute to the understanding of the intra- and interspecific genetic diversity of isolates of the Histoplasma capsulatum and Sporothrix schenckii species complexes by an analysis of the available sequences of the ITS regions from different sequence databases. ITS1-5.8S-ITS2 sequences of each fungus, either deposited in GenBank, or from our research groups (registered in the Fungi Barcode of Life Database), were analyzed using the maximum likelihood (ML) method. ML analysis of the ITS sequences discriminated isolates from distant geographic origins and particular wild hosts, depending on the fungal species analyzed. This manuscript is part of the series of works presented at the "V International Workshop: Molecular genetic approaches to the study of human pathogenic fungi" (Oaxaca, Mexico, 2012). Copyright © 2013 Revista Iberoamericana de Micología. Published by Elsevier Espana. All rights reserved.
NASA Astrophysics Data System (ADS)
Yang, Zhongyu
This thesis describes the design, experimental performance, and theoretical simulation of a novel time-of-flight analyzer that was integrated into a high resolution electron energy loss spectrometer (TOF-HREELS). First we examined the use of an interleaved comb chopper for chopping a continuous electron beam. Both static and dynamic behaviors were simulated theoretically and measured experimentally, with very good agreement. The finite penetration of the field beyond the plane of the chopper leads to non-ideal chopper response, which is characterized in terms of an "energy corruption" effect and a lead or lag in the time at which the beam responds to the chopper potential. Second we considered the recovery of spectra from pseudo-random binary sequence (PRBS) modulated TOF-HREELS data. The effects of the Poisson noise distribution and the non-ideal behavior of the "interleaved comb" chopper were simulated. We showed, for the first time, that maximum likelihood methods can be combined with PRBS modulation to achieve resolution enhancement, while properly accounting for the Poisson noise distribution and artifacts introduced by the chopper. Our results indicate that meV resolution, similar to that of modern high resolution electron energy loss spectrometers, can be achieved with a dramatic performance advantage over conventional, serial detection analyzers. To demonstrate the capabilities of the TOF-HREELS instrument, we made measurements on a highly oriented thin film polytetrafluoroethylene (PTFE) sample. We demonstrated that the TOF-HREELS can achieve a throughput advantage of a factor of 85 compared to the conventional HREELS instrument. Comparisons were made between the experimental results and theoretical simulations. We discuss various factors which affect inversion of PRBS modulated Time of Flight (TOF) data with the Lucy algorithm. Using simulations, we conclude that the convolution assumption was good under the conditions of our experiment. 
The chopper rise time, Poisson noise, and artifacts of the chopper response are evaluated. Finally, we conclude that the maximum likelihood algorithms are able to gain a multiplex advantage in PRBS modulation, despite the Poisson noise in the detector.
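The Lucy algorithm mentioned in the abstract above is commonly known as Richardson-Lucy iteration, a multiplicative update that increases the Poisson likelihood at each step. The sketch below applies it to a toy problem under the convolution assumption: the recorded counts are modeled as the circular convolution of the underlying spectrum with the modulation sequence. The length-7 sequence, the toy spectrum, and all function names are assumptions chosen for illustration; nothing here reproduces the thesis's instrument model or its treatment of chopper artifacts.

```python
def circular_convolve(signal, kernel):
    """y[i] = sum_k kernel[k] * signal[(i - k) mod n]."""
    n = len(signal)
    return [sum(kernel[k] * signal[(i - k) % n] for k in range(n))
            for i in range(n)]


def circular_correlate(signal, kernel):
    """z[j] = sum_k kernel[k] * signal[(j + k) mod n] (adjoint of the above)."""
    n = len(signal)
    return [sum(kernel[k] * signal[(j + k) % n] for k in range(n))
            for j in range(n)]


def lucy_deconvolve(counts, kernel, iterations=200):
    """Richardson-Lucy iteration for Poisson-distributed counts."""
    n = len(counts)
    kernel_sum = sum(kernel)
    # Start from a flat, strictly positive estimate with the right total.
    estimate = [sum(counts) / (n * kernel_sum)] * n
    for _ in range(iterations):
        predicted = circular_convolve(estimate, kernel)
        ratio = [c / p if p > 0 else 0.0 for c, p in zip(counts, predicted)]
        correction = circular_correlate(ratio, kernel)
        estimate = [e * c / kernel_sum for e, c in zip(estimate, correction)]
    return estimate


# Toy example: a length-7 maximal-length binary sequence as the modulation.
prbs = [1, 1, 1, 0, 1, 0, 0]
true_spectrum = [0.0, 5.0, 0.0, 0.0, 2.0, 0.0, 0.0]
measured = circular_convolve(true_spectrum, prbs)  # noise-free for clarity
recovered = lucy_deconvolve(measured, prbs)
```

The multiplicative form keeps the estimate non-negative and preserves the total number of counts, two properties that make it a natural fit for counting data. With real data the counts carry Poisson noise and the iteration is usually stopped early rather than run to convergence.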
Profile-Likelihood Approach for Estimating Generalized Linear Mixed Models with Factor Structures
ERIC Educational Resources Information Center
Jeon, Minjeong; Rabe-Hesketh, Sophia
2012-01-01
In this article, the authors suggest a profile-likelihood approach for estimating complex models by maximum likelihood (ML) using standard software and minimal programming. The method works whenever setting some of the parameters of the model to known constants turns the model into a standard model. An important class of models that can be…
2014-01-01
Background: Nematodirus spp. are among the most common nematodes of ruminants worldwide. N. oiratianus and N. spathiger are distributed worldwide as highly prevalent gastrointestinal nematodes, which cause emerging health problems and economic losses. Accurate identification of Nematodirus species is essential to develop effective control strategies for Nematodirus infection in ruminants. Mitochondrial DNA (mtDNA) could provide powerful genetic markers for identifying these closely related species and resolving phylogenetic relationships at different taxonomic levels. Methods: In the present study, the complete mitochondrial (mt) genomes of N. oiratianus and N. spathiger from small ruminants in China were obtained using Long-range PCR and sequencing. Results: The complete mt genomes of N. oiratianus and N. spathiger were 13,765 bp and 13,519 bp in length, respectively. Both mt genomes were circular and consisted of 36 genes, including 12 genes encoding proteins, 2 genes encoding rRNA, and 22 genes encoding tRNA. Phylogenetic analyses based on the concatenated amino acid sequence data of all 12 protein-coding genes by Bayesian inference (BI), Maximum likelihood (ML) and Maximum parsimony (MP) showed that the two Nematodirus species (Molineidae) were closely related to Dictyocaulidae. Conclusions: The availability of the complete mtDNA sequences of N. oiratianus and N. spathiger not only provides new mtDNA sources for a better understanding of nematode mt genomics and phylogeny, but also provides novel and useful genetic markers for studying diagnosis, population genetics and molecular epidemiology of Nematodirus spp. in small ruminants. PMID:25015379
Zhao, Guang-Hui; Jia, Yan-Qing; Cheng, Wen-Yu; Zhao, Wen; Bian, Qing-Qing; Liu, Guo-Hua
2014-07-11
Nematodirus spp. are among the most common nematodes of ruminants worldwide. N. oiratianus and N. spathiger are distributed worldwide as highly prevalent gastrointestinal nematodes, which cause emerging health problems and economic losses. Accurate identification of Nematodirus species is essential to develop effective control strategies for Nematodirus infection in ruminants. Mitochondrial DNA (mtDNA) could provide powerful genetic markers for identifying these closely related species and resolving phylogenetic relationships at different taxonomic levels. In the present study, the complete mitochondrial (mt) genomes of N. oiratianus and N. spathiger from small ruminants in China were obtained using Long-range PCR and sequencing. The complete mt genomes of N. oiratianus and N. spathiger were 13,765 bp and 13,519 bp in length, respectively. Both mt genomes were circular and consisted of 36 genes, including 12 genes encoding proteins, 2 genes encoding rRNA, and 22 genes encoding tRNA. Phylogenetic analyses based on the concatenated amino acid sequence data of all 12 protein-coding genes by Bayesian inference (BI), Maximum likelihood (ML) and Maximum parsimony (MP) showed that the two Nematodirus species (Molineidae) were closely related to Dictyocaulidae. The availability of the complete mtDNA sequences of N. oiratianus and N. spathiger not only provides new mtDNA sources for a better understanding of nematode mt genomics and phylogeny, but also provides novel and useful genetic markers for studying diagnosis, population genetics and molecular epidemiology of Nematodirus spp. in small ruminants.
Liu, Guo-Hua; Wang, Yan; Xu, Min-Jun; Zhou, Dong-Hui; Ye, Yong-Gang; Li, Jia-Yuan; Song, Hui-Qun; Lin, Rui-Qing; Zhu, Xing-Quan
2012-12-01
For many years, whipworms (Trichuris spp.) have been described with a relatively narrow range of both morphological and biometrical features. Moreover, there has been insufficient discrimination between congeners (or closely related species). In the present study, we determined the complete mitochondrial (mt) genomes of two whipworms, Trichuris ovis and Trichuris discolor, compared them and then tested the hypothesis that T. ovis and T. discolor are distinct species by phylogenetic analyses (using Bayesian inference, maximum likelihood and maximum parsimony) based on the deduced amino acid sequences of the mt protein-coding genes. The complete mt genomes of T. ovis and T. discolor were 13,946 bp and 13,904 bp in size, respectively. Both mt genomes are circular, and consist of 37 genes, including 13 genes coding for proteins, 2 genes for rRNA, and 22 genes for tRNA. The gene content and arrangement are identical to that of human and pig whipworms Trichuris trichiura and Trichuris suis. Taken together, these analyses showed genetic distinctiveness and strongly supported the recent proposal that T. ovis and T. discolor are distinct species using nuclear ribosomal DNA and a portion of the mtDNA sequence dataset. The availability of the complete mtDNA sequences of T. ovis and T. discolor provides novel genetic markers for studying the population genetics, diagnostics and molecular epidemiology of T. ovis and T. discolor. Copyright © 2012 Elsevier B.V. All rights reserved.
Livistona palms in Australia: ancient relics or opportunistic immigrants?
Crisp, Michael D; Isagi, Yuji; Kato, Yohei; Cook, Lyn G; Bowman, David M J S
2010-02-01
Eighteen of the 34 species of the fan palm genus Livistona (Arecaceae) are restricted to Australia and southern New Guinea, east of Wallace's Line, an ancient biogeographic boundary between the former supercontinents Laurasia and Gondwana. The remaining species extend from SE Asia to Africa, west of Wallace's Line. Competing hypotheses contend that Livistona is (a) ancient, its current distribution a relict of the supercontinents, or (b) a Miocene immigrant from the north into Australia as it drifted towards Asia. We have tested these hypotheses using Bayesian and penalized likelihood molecular dating based on 4 kb of nuclear and chloroplast DNA sequences with multiple fossil calibration points. Ancestral areas and biomes were reconstructed using parsimony and maximum likelihood. We found strong support for the second hypothesis, that a single Livistona ancestor colonized Australia from the north about 10-17 Ma. Spread and diversification of the genus within Australia was likely favoured by a transition from the aseasonal wet to monsoonal biome, to which it could have been preadapted by fire-tolerance. Copyright (c) 2009 Elsevier Inc. All rights reserved.
Evolutionary genetic analyses of MEF2C gene: implications for learning and memory in Homo sapiens.
Kalmady, Sunil V; Venkatasubramanian, Ganesan; Arasappa, Rashmi; Rao, Naren P
2013-02-01
MEF2C facilitates context-dependent fear conditioning (CFC) which is a salient aspect of hippocampus-dependent learning and memory. CFC might have played a crucial role in human evolution because of its advantageous influence on survival of species. In this study, we analyzed 23 orthologous mammalian gene sequences of MEF2C gene to examine the evidence for positive selection on this gene in Homo sapiens using Phylogenetic Analysis by Maximum Likelihood (PAML) and HyPhy software. Both PAML Bayes Empirical Bayes (BEB) and HyPhy Fixed Effects Likelihood (FEL) analyses supported significant positive selection on 4 codon sites in H. sapiens. Also, haplotter analysis revealed significant ongoing positive selection on this gene in Central European population. The study findings suggest that adaptive selective pressure on this gene might have influenced human evolution. Further research on this gene might unravel the potential role of this gene in learning and memory as well as its pathogenetic effect in certain hippocampal disorders with evolutionary basis like schizophrenia. Copyright © 2012 Elsevier B.V. All rights reserved.
Molecular and Clinical Characterization of Chikungunya Virus Infections in Southeast Mexico
Martínez-Landeros, Erik; Delgado-Gallegos, Juan L.; Caballero-Sosa, Sandra; Malo-García, Iliana R.
2018-01-01
Chikungunya fever is an arthropod-borne infection caused by Chikungunya virus (CHIKV). Even though clinical features of Chikungunya fever in the Mexican population have been described before, there is no detailed information. The aim of this study was to perform a full description of the clinical features in confirmed Chikungunya-infected patients and describe the molecular epidemiology of CHIKV. We evaluated febrile patients who sought medical assistance in Tapachula, Chiapas, Mexico, from June through July 2015. Infection was confirmed with molecular and serological methods. Viruses were isolated and the E1 gene was sequenced. Phylogeny reconstruction was inferred using maximum-likelihood and maximum clade credibility approaches. We studied 52 patients with confirmed CHIKV infection. They were more likely to have wrist, metacarpophalangeal, and knee arthralgia. Two combinations of clinical features were obtained to differentiate between Chikungunya fever and acute undifferentiated febrile illness. We obtained 10 CHIKV E1 sequences that grouped with the Asian lineage. Seven strains diverged from those formerly reported. Patients infected with the divergent CHIKV strains showed a broader spectrum of clinical manifestations. We defined the complete clinical features of Chikungunya fever in patients from Southeastern Mexico. Our results demonstrate co-circulation of different CHIKV strains in the state of Chiapas. PMID:29747416
von Konrat, Matt; de Lange, Peter; Greif, Matt; Strozier, Lynika; Hentschel, Jörn; Heinrichs, Jochen
2012-01-01
Frullania is a large and taxonomically complex genus. A new liverwort species, Frullania knightbridgei sp. nov. from southern New Zealand, is described and illustrated. The new species, and its placement in Frullania subg. Microfrullania, is based on an integrated evidence-based approach derived from morphology, ecology, experimental growth studies of plasticity, as well as sequence data. Diagnostic characters associated with the leaf and lobule cell-wall anatomy, oil bodies, and spore ultra-structure distinguish it from all other New Zealand species of Frullania. A critical comparison is also made between Frullania knightbridgei and morphologically allied species of botanical regions outside the New Zealand region and an artificial key is provided. The new species is similar to some forms of the widespread Australasian species, Frullania rostrata, but has unique characters associated with the lobule and oil bodies. Frullania knightbridgei is remarkably interesting in comparison with the majority of Frullania species, and indeed liverworts in general, in that it is at least partially halotolerant. Maximum parsimony and maximum likelihood analyses of nuclear ribosomal ITS2 and plastidic trnL-trnF sequences from purported related species confirm its independent taxonomic status and corroborate its placement within Frullania subg. Microfrullania. PMID:22287928
Yuan, Le-Yang; Liu, Xiao-Xiang; Zhang, E
2015-12-21
Sequences from the mitochondrial control region of 14 putative species of Acrossocheilus (Cyprinidae) were examined to elucidate phylogenetic relationships within species of the barred group in that genus. Phylogenetic reconstructions were generated using three tree-building methods: maximum parsimony, maximum likelihood, and Bayesian inference. The resultant phylogenies were consistent with monophyly of the majority of the morphologically recognized species. However, mitochondrial DNA sequence evidence is incongruent with monophyly of A. fasciatus, as currently conceived. This species occurs only in the upper Qiantang-Jiang basin in Zhejiang and Anhui provinces, and coastal rivers in the Zhejiang Province. The species formerly recognized as A. paradoxus from Zhejiang Province is A. fasciatus. The specimens previously reported as A. fasciatus from river basins in Fujian Province are misidentified A. wuyiensis. The barred group of Acrossocheilus is shown to be polyphyletic. Acrossocheilus is restricted to the barred species here placed in "Clade II," containing A. paradoxus and relatives. Separate generic status is recommended for A. monticola and for A. longipinnis and their closest relatives, although more information on phylogenetic relationships based on multiple genes is required to develop robust phylogenetic hypotheses and diagnoses. Masticbarbus Tang, 1942 is available for A. longipinnis and three allied species (A. iridescens, A. microstomus and A. lamus).
Love, Jeffrey J.; Rigler, E. Joshua; Pulkkinen, Antti; Riley, Pete
2015-01-01
An examination is made of the hypothesis that the statistics of magnetic-storm-maximum intensities are the realization of a log-normal stochastic process. Weighted least-squares and maximum-likelihood methods are used to fit log-normal functions to −Dst storm-time maxima for years 1957-2012; bootstrap analysis is used to establish confidence limits on forecasts. Both methods provide fits that are reasonably consistent with the data; both methods also provide fits that are superior to those that can be made with a power-law function. In general, the maximum-likelihood method provides forecasts having tighter confidence intervals than those provided by weighted least-squares. From extrapolation of maximum-likelihood fits: a magnetic storm with intensity exceeding that of the 1859 Carrington event, −Dst≥850 nT, occurs about 1.13 times per century, with a wide 95% confidence interval of [0.42,2.41] times per century; a 100-yr magnetic storm is identified as having a −Dst≥880 nT (greater than Carrington), but with a wide 95% confidence interval of [490,1187] nT.
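The fitting-plus-bootstrap idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' procedure: the synthetic maxima, the threshold of 400, and the function names `fit_lognormal_ml`, `exceedance_prob`, and `bootstrap_interval` are all assumptions made for illustration. It relies only on the standard fact that the maximum-likelihood estimates of a log-normal are the mean and (1/n) standard deviation of the logged data.

```python
import math
import random

def fit_lognormal_ml(samples):
    """Closed-form maximum-likelihood estimates (mu, sigma) of a log-normal."""
    logs = [math.log(x) for x in samples]
    mu = sum(logs) / len(logs)
    # The ML variance estimate divides by n, not n - 1.
    var = sum((v - mu) ** 2 for v in logs) / len(logs)
    return mu, math.sqrt(var)

def exceedance_prob(x, mu, sigma):
    """P(X >= x) for a log-normal with parameters mu and sigma."""
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * math.erfc(z)

def bootstrap_interval(samples, x, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the exceedance probability at x."""
    rng = random.Random(seed)
    n = len(samples)
    probs = []
    for _ in range(n_boot):
        # Resample with replacement, refit, and record the forecast.
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        probs.append(exceedance_prob(x, *fit_lognormal_ml(resample)))
    probs.sort()
    lo = probs[int((1 - level) / 2 * n_boot)]
    hi = probs[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi

# Hypothetical synthetic "storm maxima"; real -Dst data are not reproduced here.
demo_rng = random.Random(42)
demo_maxima = [demo_rng.lognormvariate(4.5, 0.7) for _ in range(200)]
demo_mu, demo_sigma = fit_lognormal_ml(demo_maxima)
print("fitted mu, sigma:", demo_mu, demo_sigma)
print("P(X >= 400):", exceedance_prob(400.0, demo_mu, demo_sigma))
print("95% bootstrap interval:", bootstrap_interval(demo_maxima, 400.0))
```

Converting an exceedance probability into a rate per century would additionally require the number of storms per unit time, which this sketch does not model.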
Maximum likelihood convolutional decoding (MCD) performance due to system losses
NASA Technical Reports Server (NTRS)
Webster, L.
1976-01-01
A model for predicting the computational performance of a maximum likelihood convolutional decoder (MCD) operating in a noisy carrier reference environment is described. This model is used to develop a subroutine that will be utilized by the Telemetry Analysis Program to compute the MCD bit error rate. When this computational model is averaged over noisy reference phase errors using a high-rate interpolation scheme, the results are found to agree quite favorably with experimental measurements.
Maximum Likelihood Shift Estimation Using High Resolution Polarimetric SAR Clutter Model
NASA Astrophysics Data System (ADS)
Harant, Olivier; Bombrun, Lionel; Vasile, Gabriel; Ferro-Famil, Laurent; Gay, Michel
2011-03-01
This paper deals with a Maximum Likelihood (ML) shift estimation method in the context of High Resolution (HR) Polarimetric SAR (PolSAR) clutter. Texture modeling is exposed and the generalized ML texture tracking method is extended to the merging of various sensors. Some results on displacement estimation on the Argentiere glacier in the Mont Blanc massif using dual-pol TerraSAR-X (TSX) and quad-pol RADARSAT-2 (RS2) sensors are finally discussed.
Zhao, Fang; Huang, Dun-Yuan; Sun, Xiao-Yan; Shi, Qing-Hui; Hao, Jia-Sheng; Zhang, Lan-Lan; Yang, Qun
2013-10-01
The Riodinidae is one of the lepidopteran butterfly families. This study describes the complete mitochondrial genome of the butterfly species Abisara fylloides, the first mitochondrial genome of the Riodinidae family. The results show that the entire mitochondrial genome of A. fylloides is 15 301 bp in length, and contains 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and a 423 bp A+T-rich region. The gene content, orientation and order are identical to the majority of other lepidopteran insects. Phylogenetic reconstruction was conducted using the concatenated 13 protein-coding gene (PCG) sequences of 19 available butterfly species covering all five butterfly families (Papilionidae, Nymphalidae, Pieridae, Lycaenidae and Riodinidae). Both maximum likelihood and Bayesian inference analyses highly supported the monophyly of Lycaenidae+Riodinidae, which stood as the sister group of Nymphalidae. In addition, we propose that the riodinids be categorized into the family Lycaenidae as a subfamilial taxon.
Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.
2015-01-01
This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
A Single Early Introduction of HIV-1 Subtype B into Central America Accounts for Most Current Cases
Murillo, Wendy; Veras, Nazle; Prosperi, Mattia; de Rivera, Ivette Lorenzana; Paz-Bailey, Gabriela; Morales-Miranda, Sonia; Juarez, Sandra I.; Yang, Chunfu; DeVos, Joshua; Marín, José Pablo; Mild, Mattias; Albert, Jan
2013-01-01
Human immunodeficiency virus type 1 (HIV-1) variants show considerable geographical separation across the world, but there is limited information from Central America. We provide the first detailed investigation of the genetic diversity and molecular epidemiology of HIV-1 in six Central American countries. Phylogenetic analysis was performed on 625 HIV-1 pol gene sequences collected between 2002 and 2010 in Honduras, El Salvador, Nicaragua, Costa Rica, Panama, and Belize. Published sequences from neighboring countries (n = 57) and the rest of the world (n = 740) were included as controls. Maximum likelihood methods were used to explore phylogenetic relationships. Bayesian coalescence-based methods were used to time HIV-1 introductions. Nearly all (98.9%) Central American sequences were of subtype B. Phylogenetic analysis revealed that 437 (70%) sequences clustered within five significantly supported monophyletic clades formed essentially by Central American sequences. One clade contained 386 (62%) sequences from all six countries; the other four clades were smaller and more country specific, suggesting discrete subepidemics. The existence of one large well-supported Central American clade provides evidence that a single introduction of HIV-1 subtype B in Central America accounts for most current cases. An introduction during the early phase of the HIV-1 pandemic may explain its epidemiological success. Moreover, the smaller clades suggest a subsequent regional spread related to specific transmission networks within each country. PMID:23616665
Molecular delineation of the Agave Red Worm Comadia redtenbacheri (Lepidoptera: Cossidae).
Cárdenas-Aquino, María Del Rosario; Alarcón-Rodríguez, Norma Marina; Rivas-Medrano, Mario; González-Hernández, Héctor; Vargas-Hernández, Mateo; Sánchez-Arroyo, Hussein; Llanderal-Cázares, Celina
2018-01-25
Comadia redtenbacheri (Hammerschmidt) (Agave Red Worm) is the only member of the family Cossidae that has been described as a phytophagous specialist of the plant genus Agave, which is mainly distributed in México. A new extraction protocol adapted from Stewart Via (1993) has been implemented for sequencing the COI gene from samples collected in five states of the North Central (Querétaro and Zacatecas), South Central (Estado de México) and East Central (Hidalgo and Tlaxcala) regions of México with the purpose of contributing to delineation of the species. A Maximum Likelihood (ML) tree based on these COI sequences as well as COI sequences from other Cossinae species was developed to complement the existing morphological and taxonomic approaches to delineation of this species. As expected, our Comadia samples cluster together within a monophyletic clade that includes four C. redtenbacheri sequences previously reported. This group seems to be consistent with our reconstruction, which is supported by a bootstrap value of over 99%. The closely related branches associated with the latter group include organisms known to be the plant and tree borers of the Cossinae subfamily. The COI sequences from our samples were analyzed to determine the percentage of identity among the C. redtenbacheri in a first attempt to detect differences in the sequence that matches a particular region of México.
Detection and phylogenetic analysis of bacteriophage WO in spiders (Araneae).
Yan, Qian; Qiao, Huping; Gao, Jin; Yun, Yueli; Liu, Fengxiang; Peng, Yu
2015-11-01
Phage WO is a bacteriophage found in Wolbachia. Herein, we present the first phylogenetic study of WOs that infect spiders (Araneae). Seven species of spiders (Araneus alternidens, Nephila clavata, Hylyphantes graminicola, Prosoponoides sinensis, Pholcus crypticolens, Coleosoma octomaculatum, and Nurscia albofasciata) from six families were infected by Wolbachia and WO, followed by comprehensive sequence analysis. Interestingly, WO could only be detected in Wolbachia-infected spiders. The relative infection rates of those seven species of spiders were 75, 100, 88.9, 100, 62.5, 72.7, and 100%, respectively. Our results indicated that both Wolbachia and WO were found in three different body parts of N. clavata, and WO could be passed to the next generation of H. graminicola by vertical transmission. There were three different WO sequences in A. alternidens and two different WO sequences in C. octomaculatum. Only one sequence of WO was found for the other five species of spiders. The discovered WO sequences ranged from 239 to 311 bp. A phylogenetic tree was generated using maximum likelihood (ML) based on the orf7 gene sequences. According to the phylogenetic tree, WOs in N. clavata and H. graminicola were clustered in the same group. WOs from A. alternidens (WAlt1) and C. octomaculatum (WOct2) were closely related to another clade, whereas WO in P. sinensis was classified as a sole cluster.
Simple Penalties on Maximum-Likelihood Estimates of Genetic Parameters to Reduce Sampling Variation
Meyer, Karin
2016-01-01
Multivariate estimates of genetic parameters are subject to substantial sampling variation, especially for smaller data sets and more than a few traits. A simple modification of standard, maximum-likelihood procedures for multivariate analyses to estimate genetic covariances is described, which can improve estimates by substantially reducing their sampling variances. This is achieved by maximizing the likelihood subject to a penalty. Borrowing from Bayesian principles, we propose a mild, default penalty—derived assuming a Beta distribution of scale-free functions of the covariance components to be estimated—rather than laboriously attempting to determine the stringency of penalization from the data. An extensive simulation study is presented, demonstrating that such penalties can yield very worthwhile reductions in loss, i.e., the difference from population values, for a wide range of scenarios and without distorting estimates of phenotypic covariances. Moreover, mild default penalties tend not to increase loss in difficult cases and, on average, achieve reductions in loss of similar magnitude to computationally demanding schemes to optimize the degree of penalization. Pertinent details required for the adaptation of standard algorithms to locate the maximum of the likelihood function are outlined. PMID:27317681
Maximum Likelihood Estimations and EM Algorithms with Length-biased Data
Qin, Jing; Ning, Jing; Liu, Hao; Shen, Yu
2012-01-01
SUMMARY Length-biased sampling has been well recognized in economics, industrial reliability, etiology applications, epidemiological, genetic and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimations and inference methods for traditional survival data are not directly applicable for length-biased right-censored data. We propose new expectation-maximization algorithms for estimations based on full likelihoods involving infinite dimensional parameters under three settings for length-biased data: estimating nonparametric distribution function, estimating nonparametric hazard function under an increasing failure rate constraint, and jointly estimating baseline hazards function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semi-parametric maximum likelihood estimators under the Cox model using modern empirical processes theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online. PMID:22323840
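As background for the abstract above, the simplest special case (length-biased data with no censoring) has a closed-form nonparametric maximum-likelihood estimate that needs no EM iteration: each observation x_i receives mass proportional to 1/x_i, and the estimated mean is the harmonic mean of the sample. The sketch below shows only that uncensored case. It is not the authors' EM algorithm for right-censored data, and the function names and the simulated Gamma data are assumptions for illustration.

```python
import random

def length_biased_npmle(samples):
    """NPMLE of the underlying (unbiased) distribution from uncensored
    length-biased observations: mass at x_i is proportional to 1 / x_i."""
    inv = [1.0 / x for x in samples]
    total = sum(inv)
    weights = [v / total for v in inv]
    mean = len(samples) / total  # harmonic mean of the biased sample
    return weights, mean

def unbiased_cdf(samples, weights, t):
    """Estimated P(T <= t) for the underlying distribution."""
    return sum(w for x, w in zip(samples, weights) if x <= t)

# Hypothetical simulation: if the underlying durations are Gamma(shape=3, scale=1)
# with mean 3, the length-biased observations follow Gamma(shape=4, scale=1).
sim_rng = random.Random(7)
observed = [sim_rng.gammavariate(4.0, 1.0) for _ in range(5000)]
sim_weights, corrected_mean = length_biased_npmle(observed)
naive_mean = sum(observed) / len(observed)
print("naive mean of biased sample:", naive_mean)
print("bias-corrected mean:", corrected_mean)
```

The naive sample mean overstates the underlying mean because longer durations are more likely to be sampled; the inverse-length weights undo that tilt.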
Molecular phylogeny of the Achatinoidea (Mollusca: Gastropoda).
Fontanilla, Ian Kendrich; Naggs, Fred; Wade, Christopher Mark
2017-09-01
This study presents a multi-gene phylogenetic analysis of the Achatinoidea and provides an initial basis for a taxonomic re-evaluation of family level groups within the superfamily. A total of 5028 nucleotides from the nuclear rRNA, actin and histone 3 genes and the 1st and 2nd codon positions of the mitochondrial cytochrome c oxidase subunit I gene were sequenced from 24 species, representing six currently recognised families. Results from maximum likelihood, neighbour joining, maximum parsimony and Bayesian inference trees revealed that, of currently recognised families, only the Achatinidae are monophyletic. For the Ferussaciidae, Ferussacia folliculus fell separately to Cecilioides gokweanus and formed a sister taxon to the rest of the Achatinoidea. For the Coeliaxidae, Coeliaxis blandii and Pyrgina umbilicata did not group together. The Subulinidae was not resolved, with some subulinids clustering with the Coeliaxidae and Thyrophorellidae. Three subfamilies currently included within the Subulinidae based on current taxonomy likewise did not form monophyletic groups. Copyright © 2017 Elsevier Inc. All rights reserved.
Swart, Belinda L; von der Heyden, Sophie; Bester-van der Merwe, Aletta; Roodt-Wilding, Rouvay
2015-12-01
The genus Seriola includes several important commercially exploited species and has a disjunct distribution globally; yet phylogenetic relationships within this genus have not been thoroughly investigated. This study reports the first comprehensive molecular phylogeny for this genus based on mitochondrial (Cytb) and nuclear gene (RAG1 and Rhod) DNA sequence data for all extant Seriola species (nine species, n=27). All species were found to be monophyletic based on Maximum parsimony, Maximum likelihood and Bayesian inference. The closure of the Tethys Sea (12-20 MYA) coincides with the divergence of a clade containing ((S. fasciata and S. peruana), S. carpenteri) from the rest of the Seriola species, while the formation of the Isthmus of Panama (±3 MYA) played an important role in the divergence of S. fasciata and S. peruana. Furthermore, factors such as climate and water temperature fluctuations during the Pliocene played important roles during the divergence of the remaining Seriola species. Copyright © 2015 Elsevier Inc. All rights reserved.
D'Angelino, Rubens Henrique Ramos; Pituco, Edviges Maristela; Villalobos, Eliana Monteforte Cassaro; Harakava, Ricardo; Gregori, Fábio
2013-01-01
Bovine leukemia virus (BLV) was investigated in the central nervous system (CNS) of cattle with neurological syndrome. A total of 269 CNS samples were submitted to nested-PCR (BLV env gene gp51), and the viral genotypes were identified. The nested-PCR was positive in 4.8% (13/269) of CNS samples: 2.7% (2/74) of the samples presenting lesions of nonpurulent meningoencephalitis (NPME) at histological examination and 5.6% (11/195) of the samples not presenting NPME (P > 0.05). No samples presented lymphosarcoma. The PCR products (437 bp) were sequenced and submitted to phylogenetic analysis by neighbor-joining and maximum composite likelihood methods, and genotypes 1, 5, and 6 were detected, corroborating other South American studies. Genotype 6, barely described in Brazil and Argentina, was more frequently detected in this study. The identity matrices showed maximum similarity (100%) among some samples of this study and one from Argentina (FJ808582), recovered from GenBank. There was no association between the genotypes and NPME lesions. PMID:23710448
Canedo, Clarissa; Haddad, Célio F B
2012-11-01
We present a phylogenetic hypothesis of the anuran clade Terrarana based on partial sequences of nuclear (Tyr and RAG1) and mitochondrial (12S, tRNA-Val, and 16S) genes, testing the monophyly of Ischnocnema and its species series. We performed maximum parsimony, maximum likelihood, and Bayesian inference analyses on 364 terminals: 11 outgroup terminals and 353 ingroup Terrarana terminals, including 139 Ischnocnema terminals (accounting for 29 of the 35 named Ischnocnema species) and 214 other Terrarana terminals within the families Brachycephalidae, Ceuthomantidae, Craugastoridae, and Eleutherodactylidae. Different optimality criteria produced similar results and mostly recovered the currently accepted families and genera. According to these topologies, Ischnocnema is not a monophyletic group. We propose new combinations for three species, relocating them to Pristimantis, and render Eleutherodactylus bilineatus Bokermann, 1975 incertae sedis status within Holoadeninae. The rearrangements in Ischnocnema place it outside the northernmost Brazilian Atlantic rainforest, where the fauna of Terrarana comprises typical Amazonian genera. Copyright © 2012 Elsevier Inc. All rights reserved.
Models and analysis for multivariate failure time data
NASA Astrophysics Data System (ADS)
Shih, Joanna Huang
The goal of this research is to develop and investigate models and analytic methods for multivariate failure time data. We compare models in terms of direct modeling of the margins, flexibility of dependency structure, local vs. global measures of association, and ease of implementation. In particular, we study copula models, and models produced by right neutral cumulative hazard functions and right neutral hazard functions. We examine the changes of association over time for families of bivariate distributions induced from these models by displaying their density contour plots, conditional density plots, correlation curves of Doksum et al., and local cross ratios of Oakes. We know that bivariate distributions with the same margins might exhibit quite different dependency structures. In addition to modeling, we study estimation procedures. For copula models, we investigate three estimation procedures. The first procedure is full maximum likelihood. The second procedure is two-stage maximum likelihood. At stage 1, we estimate the parameters in the margins by maximizing the marginal likelihood. At stage 2, we estimate the dependency structure by fixing the margins at the estimated ones. The third procedure is two-stage partially parametric maximum likelihood. It is similar to the second procedure, but we estimate the margins by the Kaplan-Meier estimate. We derive asymptotic properties for these three estimation procedures and compare their efficiency by Monte-Carlo simulations and direct computations. For models produced by right neutral cumulative hazards and right neutral hazards, we derive the likelihood and investigate the properties of the maximum likelihood estimates. Finally, we develop goodness of fit tests for the dependency structure in the copula models. We derive a test statistic and its asymptotic properties based on the test of homogeneity of Zelterman and Chen (1988), and a graphical diagnostic procedure based on the empirical Bayes approach. We study the performance of these two methods using actual and computer generated data.
A Likelihood-Based Framework for Association Analysis of Allele-Specific Copy Numbers.
Hu, Y J; Lin, D Y; Sun, W; Zeng, D
2014-10-01
Copy number variants (CNVs) and single nucleotide polymorphisms (SNPs) co-exist throughout the human genome and jointly contribute to phenotypic variations. Thus, it is desirable to consider both types of variants, as characterized by allele-specific copy numbers (ASCNs), in association studies of complex human diseases. Current SNP genotyping technologies capture the CNV and SNP information simultaneously via fluorescent intensity measurements. The common practice of calling ASCNs from the intensity measurements and then using the ASCN calls in downstream association analysis has important limitations. First, the association tests are prone to false-positive findings when differential measurement errors between cases and controls arise from differences in DNA quality or handling. Second, the uncertainties in the ASCN calls are ignored. We present a general framework for the integrated analysis of CNVs and SNPs, including the analysis of total copy numbers as a special case. Our approach combines the ASCN calling and the association analysis into a single step while allowing for differential measurement errors. We construct likelihood functions that properly account for case-control sampling and measurement errors. We establish the asymptotic properties of the maximum likelihood estimators and develop EM algorithms to implement the corresponding inference procedures. The advantages of the proposed methods over the existing ones are demonstrated through realistic simulation studies and an application to a genome-wide association study of schizophrenia. Extensions to next-generation sequencing data are discussed.
Yang, Ji; Gu, Hongya; Yang, Ziheng
2004-01-01
Chalcone synthase (CHS) is a key enzyme in the biosynthesis of flavonoids, which are important for the pigmentation of flowers and act as attractants to pollinators. Genes encoding CHS constitute a multigene family in which the copy number varies among plant species and functional divergence appears to have occurred repeatedly. In morning glories (Ipomoea), five functional CHS genes (A-E) have been described. Phylogenetic analysis of the Ipomoea CHS gene family revealed that CHS A, B, and C experienced accelerated rates of amino acid substitution relative to CHS D and E. To examine whether the CHS genes of the morning glories underwent adaptive evolution, maximum-likelihood models of codon substitution were used to analyze the functional sequences in the Ipomoea CHS gene family. These models used the nonsynonymous/synonymous rate ratio (omega = dN/dS) as an indicator of selective pressure and allowed the ratio to vary among lineages or sites. Likelihood ratio tests suggested significant variation in selection pressure among amino acid sites, with a small proportion of them detected to be under positive selection along the branches ancestral to CHS A, B, and C. Positive Darwinian selection appears to have promoted the divergence of subfamily ABC and subfamily DE and is at least partially responsible for a rate increase following gene duplication.
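The likelihood ratio test mentioned in the abstract above compares two nested models through the statistic 2(lnL1 − lnL0), referred to a chi-square distribution whose degrees of freedom equal the difference in the number of free parameters. The sketch below is a generic illustration under that standard asymptotic assumption. The log-likelihood values and the function names are hypothetical and are not taken from the study, and for simplicity the chi-square tail is implemented only for 1 or an even number of degrees of freedom, where it has a simple closed form.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or even df only)."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{j<df/2} (x/2)^j / j!
        half = x / 2.0
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    raise ValueError("this sketch supports df = 1 or even df only")

def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (statistic, p-value) for nested models; lnl_alt is the richer model."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)

# Hypothetical log-likelihoods for a null model and a model that adds
# two free parameters (for example, an extra site class).
lrt_stat, lrt_p = likelihood_ratio_test(-2500.0, -2493.0, df=2)
print("LRT statistic:", lrt_stat, "p-value:", lrt_p)
```

A small p-value indicates that the richer model fits significantly better; it does not by itself identify which sites or branches drive the improvement.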
Waits, L P; Sullivan, J; O'Brien, S J; Ward, R H
1999-10-01
The bear family (Ursidae) presents a number of phylogenetic ambiguities as the evolutionary relationships of the six youngest members (ursine bears) are largely unresolved. Recent mitochondrial DNA analyses have produced conflicting results with respect to the phylogeny of ursine bears. In an attempt to resolve these issues, we obtained 1916 nucleotides of mitochondrial DNA sequence data from six gene segments for all eight bear species and conducted maximum likelihood and maximum parsimony analyses on all fragments separately and combined. All six single-region gene trees gave different phylogenetic estimates; however, only for control region data was this significantly incongruent with the results from the combined data. The optimal phylogeny for the combined data set suggests that the giant panda is most basal followed by the spectacled bear. The sloth bear is the basal ursine bear, and there is weak support for a sister taxon relationship of the American and Asiatic black bears. The sun bear is sister taxon to the youngest clade containing brown bears and polar bears. Statistical analyses of alternate hypotheses revealed a lack of strong support for many of the relationships. We suggest that the difficulties surrounding the resolution of the evolutionary relationships of the Ursidae are linked to the existence of sequential rapid radiation events in bear evolution. Thus, unresolved branching orders during these time periods may represent an accurate representation of the evolutionary history of bear species. Copyright 1999 Academic Press.
Parsons, Tom
2008-01-01
Paleoearthquake observations often lack enough events at a given site to directly define a probability density function (PDF) for earthquake recurrence. Sites with fewer than 10-15 intervals do not provide enough information to reliably determine the shape of the PDF using standard maximum-likelihood techniques [e.g., Ellsworth et al., 1999]. In this paper I present a method that attempts to fit wide ranges of distribution parameters to short paleoseismic series. From repeated Monte Carlo draws, it becomes possible to quantitatively estimate most likely recurrence PDF parameters, and a ranked distribution of parameters is returned that can be used to assess uncertainties in hazard calculations. In tests on short synthetic earthquake series, the method gives results that cluster around the mean of the input distribution, whereas maximum likelihood methods return the sample means [e.g., NIST/SEMATECH, 2006]. For short series (fewer than 10 intervals), sample means tend to reflect the median of an asymmetric recurrence distribution, possibly leading to an overestimate of the hazard should they be used in probability calculations. Therefore a Monte Carlo approach may be useful for assessing recurrence from limited paleoearthquake records. Further, the degree of functional dependence among parameters like mean recurrence interval and coefficient of variation can be established. The method is described for use with time-independent and time-dependent PDFs, and results from 19 paleoseismic sequences on strike-slip faults throughout the state of California are given.
Parsons, T.
2008-01-01
Paleoearthquake observations often lack enough events at a given site to directly define a probability density function (PDF) for earthquake recurrence. Sites with fewer than 10-15 intervals do not provide enough information to reliably determine the shape of the PDF using standard maximum-likelihood techniques (e.g., Ellsworth et al., 1999). In this paper I present a method that attempts to fit wide ranges of distribution parameters to short paleoseismic series. From repeated Monte Carlo draws, it becomes possible to quantitatively estimate most likely recurrence PDF parameters, and a ranked distribution of parameters is returned that can be used to assess uncertainties in hazard calculations. In tests on short synthetic earthquake series, the method gives results that cluster around the mean of the input distribution, whereas maximum likelihood methods return the sample means (e.g., NIST/SEMATECH, 2006). For short series (fewer than 10 intervals), sample means tend to reflect the median of an asymmetric recurrence distribution, possibly leading to an overestimate of the hazard should they be used in probability calculations. Therefore a Monte Carlo approach may be useful for assessing recurrence from limited paleoearthquake records. Further, the degree of functional dependence among parameters like mean recurrence interval and coefficient of variation can be established. The method is described for use with time-independent and time-dependent PDFs, and results from 19 paleoseismic sequences on strike-slip faults throughout the state of California are given.
Merz, Clayton; Catchen, Julian M; Hanson-Smith, Victor; Emerson, Kevin J; Bradshaw, William E; Holzapfel, Christina M
2013-01-01
Herein we tested the repeatability of phylogenetic inference based on high throughput sequencing by increased taxon sampling using our previously published techniques in the pitcher-plant mosquito, Wyeomyia smithii in North America. We sampled 25 natural populations drawn from different localities nearby 21 previous collection localities and used these new data to construct a second, independent phylogeny, expressly to test the reproducibility of phylogenetic patterns. Comparison of trees between the two data sets based on both maximum parsimony and maximum likelihood with Bayesian posterior probabilities showed close correspondence in the grouping of the most southern populations into clear clades. However, discrepancies emerged, particularly in the middle of W. smithii's current range near the previous maximum extent of the Laurentide Ice Sheet, especially concerning the most recent common ancestor to mountain and northern populations. Combining all 46 populations from both studies into a single maximum parsimony tree and taking into account the post-glacial historical biogeography of associated flora provided an improved picture of W. smithii's range expansion in North America. In a more general sense, we propose that extensive taxon sampling, especially in areas of known geological disruption is key to a comprehensive approach to phylogenetics that leads to biologically meaningful phylogenetic inference.
Huang, Wei-Yi; Zhao, Guang-Hui; Wei, Shu-Jun; Song, Hui-Qun; Xu, Min-Jun; Lin, Rui-Qing; Zhou, Dong-Hui; Zhu, Xing-Quan
2012-01-01
Complete mitochondrial (mt) genomes and the gene rearrangements are increasingly used as molecular markers for investigating phylogenetic relationships. Contributing to the complete mt genomes of Gastropoda, especially Pulmonata, we determined the mt genome of the freshwater snail Galba pervia, which is an important intermediate host for Fasciola spp. in China. The complete mt genome of G. pervia is 13,768 bp in length. Its genome is circular, and consists of 37 genes, including 13 genes for proteins, 2 genes for rRNA, and 22 genes for tRNA. The mt gene order of G. pervia showed a novel arrangement (tRNA-His, tRNA-Gly and tRNA-Tyr change positions and directions) when compared with mt genomes of Pulmonata species sequenced to date, indicating divergence among different species within the Pulmonata. The 13 protein-coding genes were deduced to encode a total of 3655 amino acids. The most frequently used amino acid is Leu (15.05%), followed by Phe (11.24%), Ser (10.76%) and Ile (8.346%). Phylogenetic analyses using the concatenated amino acid sequences of the 13 protein-coding genes, with three different computational algorithms (maximum parsimony, maximum likelihood and Bayesian analysis), all revealed that the families Lymnaeidae and Planorbidae are two closely related snail families, consistent with previous classifications based on morphological and molecular studies. The complete mt genome sequence of G. pervia showed a novel gene arrangement and it represents the first sequenced high quality mt genome of the family Lymnaeidae. These novel mtDNA data provide additional genetic markers for studying the epidemiology, population genetics and phylogeographics of freshwater snails, as well as for understanding interplay between the intermediate snail hosts and the intra-mollusca stages of Fasciola spp. PMID:22844544
Vector Antenna and Maximum Likelihood Imaging for Radio Astronomy
2016-03-05
Maximum Likelihood Imaging for Radio Astronomy Mary Knapp1, Frank Robey2, Ryan Volz3, Frank Lind3, Alan Fenn2, Alex Morris2, Mark Silver2, Sarah Klein2...haystack.mit.edu Abstract1— Radio astronomy using frequencies less than ~100 MHz provides a window into non-thermal processes in objects ranging from planets...observational astronomy . Ground-based observatories including LOFAR [1], LWA [2], [3], MWA [4], and the proposed SKA-Low [5], [6] are improving access to
A maximum pseudo-profile likelihood estimator for the Cox model under length-biased sampling
Huang, Chiung-Yu; Qin, Jing; Follmann, Dean A.
2012-01-01
This paper considers semiparametric estimation of the Cox proportional hazards model for right-censored and length-biased data arising from prevalent sampling. To exploit the special structure of length-biased sampling, we propose a maximum pseudo-profile likelihood estimator, which can handle time-dependent covariates and is consistent under covariate-dependent censoring. Simulation studies show that the proposed estimator is more efficient than its competitors. A data analysis illustrates the methods and theory. PMID:23843659
The effect of lossy image compression on image classification
NASA Technical Reports Server (NTRS)
Paola, Justin D.; Schowengerdt, Robert A.
1995-01-01
We have classified four different images, under various levels of JPEG compression, using the following classification algorithms: minimum-distance, maximum-likelihood, and neural network. The training site accuracy and percent difference from the original classification were tabulated for each image compression level, with maximum-likelihood showing the poorest results. In general, as compression ratio increased, the classification retained its overall appearance, but much of the pixel-to-pixel detail was eliminated. We also examined the effect of compression on spatial pattern detection using a neural network.
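The difference between the minimum-distance and maximum-likelihood classifiers compared in this abstract can be illustrated with a minimal sketch. It assumes each class is modelled as a Gaussian with independent bands (a diagonal covariance, which is a simplification; the abstract does not state the covariance model used), and all function names and data are hypothetical.

```python
import math

def fit_class(samples):
    """Per-band mean and (population) variance from training pixels."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    variances = [sum((s[d] - means[d]) ** 2 for s in samples) / n
                 for d in range(dims)]
    return means, variances

def min_distance_label(pixel, models):
    """Assign the class whose mean is nearest in Euclidean distance."""
    def sq_dist(name):
        means, _ = models[name]
        return sum((x - m) ** 2 for x, m in zip(pixel, means))
    return min(models, key=sq_dist)

def max_likelihood_label(pixel, models):
    """Assign the class with the highest Gaussian log-likelihood.

    Assumes independent bands and strictly positive variances.
    """
    def log_lik(name):
        means, variances = models[name]
        return sum(-0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
                   for x, m, v in zip(pixel, means, variances))
    return max(models, key=log_lik)

# Hypothetical one-band training data: a tight class and a broad class.
models = {
    "tight": fit_class([(-1.0,), (1.0,)]),
    "broad": fit_class([(0.0,), (20.0,)]),
}
```

With these toy classes, a pixel at 4.0 is closer to the mean of the tight class but far more plausible under the broad class, so the two rules disagree. The maximum-likelihood rule depends on the estimated class variances, whereas the minimum-distance rule uses only the class means.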
EMG-based speech recognition using hidden markov models with global control variables.
Lee, Ki-Seung
2008-03-01
It is well known that a strong relationship exists between human voices and the movement of articulatory facial muscles. In this paper, we utilize this knowledge to implement an automatic speech recognition scheme which uses solely surface electromyogram (EMG) signals. The sequence of EMG signals for each word is modelled by a hidden Markov model (HMM) framework. The main objective of the work involves building a model for state observation density when multichannel observation sequences are given. The proposed model reflects the dependencies between each of the EMG signals, which are described by introducing a global control variable. We also develop an efficient model training method, based on a maximum likelihood criterion. In a preliminary study, 60 isolated words were used as recognition variables. EMG signals were acquired from three articulatory facial muscles. The findings indicate that such a system may have the capacity to recognize speech signals with an accuracy of up to 87.07%, which is superior to the independent probabilistic model.
Accurate and sensitive quantification of protein-DNA binding affinity.
Rastogi, Chaitanya; Rube, H Tomas; Kribelbauer, Judith F; Crocker, Justin; Loker, Ryan E; Martini, Gabriella D; Laptenko, Oleg; Freed-Pastor, William A; Prives, Carol; Stern, David L; Mann, Richard S; Bussemaker, Harmen J
2018-04-17
Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. Copyright © 2018 the Author(s). Published by PNAS.
Accurate and sensitive quantification of protein-DNA binding affinity
Rastogi, Chaitanya; Rube, H. Tomas; Kribelbauer, Judith F.; Crocker, Justin; Loker, Ryan E.; Martini, Gabriella D.; Laptenko, Oleg; Freed-Pastor, William A.; Prives, Carol; Stern, David L.; Mann, Richard S.; Bussemaker, Harmen J.
2018-01-01
Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. PMID:29610332
Saini, Harsh; Raicar, Gaurav; Dehzangi, Abdollah; Lal, Sunil; Sharma, Alok
2015-12-07
Protein subcellular localization is an important topic in proteomics since it is related to a protein's overall function, helps in the understanding of metabolic pathways, and in drug design and discovery. In this paper, a basic approximation technique from natural language processing called the linear interpolation smoothing model is applied for predicting protein subcellular localizations. The proposed approach extracts features from syntactical information in protein sequences to build probabilistic profiles using dependency models, which are used in linear interpolation to determine how likely a sequence is to belong to a particular subcellular location. This technique builds a statistical model based on maximum likelihood. It is able to deal effectively with high dimensionality that hinders other traditional classifiers such as Support Vector Machines or k-Nearest Neighbours without sacrificing performance. This approach has been evaluated by predicting subcellular localizations of Gram positive and Gram negative bacterial proteins. Copyright © 2015 Elsevier Ltd. All rights reserved.
Stochastic processes constrain the within and between host evolution of influenza virus.
McCrone, John T; Woods, Robert J; Martin, Emily T; Malosh, Ryan E; Monto, Arnold S; Lauring, Adam S
2018-05-03
The evolutionary dynamics of influenza virus ultimately derive from processes that take place within and between infected individuals. Here we define influenza virus dynamics in human hosts through sequencing of 249 specimens from 200 individuals collected over 6290 person-seasons of observation. Because these viruses were collected from individuals in a prospective community-based cohort, they are broadly representative of natural infections with seasonal viruses. Consistent with a neutral model of evolution, sequence data from 49 serially sampled individuals illustrated the dynamic turnover of synonymous and nonsynonymous single nucleotide variants and provided little evidence for positive selection of antigenic variants. We also identified 43 genetically-validated transmission pairs in this cohort. Maximum likelihood optimization of multiple transmission models estimated an effective transmission bottleneck of 1-2 genomes. Our data suggest that positive selection is inefficient at the level of the individual host and that stochastic processes dominate the host-level evolution of influenza viruses. © 2018, McCrone et al.
Genetic characterization of Enterovirus 71 strains circulating in Vietnam in 2012.
Donato, Celeste; Hoi, Le Thi; Hoa, Nguyen Thi; Hoa, Tran Mai; Van Duyet, Le; Dieu Ngan, Ta Thi; Van Kinh, Nguyen; Vu Trung, Nguyen; Vijaykrishna, Dhanasekaran
2016-08-01
Enterovirus 71 subgenogroup C4 caused the largest outbreak of Hand, Foot and Mouth Disease (HFMD) in Vietnam during 2011-2012, resulting in over 200,000 hospitalisations and 207 fatalities. A total of 1917 samples with adequate volume for RT-PCR analysis were collected from patients hospitalised with HFMD throughout Vietnam and 637 were positive for EV71. VP1 gene (n=87) and complete genome (n=9) sequencing was performed. Maximum-likelihood phylogenetic analysis was performed to characterise the B5, C4 and C5 strains detected. Sequence analyses revealed that the dominant subgenogroup associated with the 2012 outbreak was C4, with B5 and C5 strains representing a small proportion of these cases. Numerous countries in the region including Malaysia, Taiwan and China have a large influence on strain diversity in Vietnam and understanding the transmission of EV71 throughout Southeast Asia is vital to inform preventative public health measures and vaccine development efforts. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Identification of simple objects in image sequences
NASA Astrophysics Data System (ADS)
Geiselmann, Christoph; Hahn, Michael
1994-08-01
We present an investigation into the identification and location of simple objects in color image sequences. As an example, the identification of traffic signs is discussed. Three aspects are of special interest. First, regions that may contain the object have to be detected. The separation of those regions from the background can be based on color, motion, and contours. In the experiments all three possibilities are investigated. The second aspect focuses on the extraction of suitable features for the identification of the objects. For that purpose the border line of the region of interest is used. For planar objects a sufficient approximation of perspective projection is affine mapping. Consequently, it is natural to extract affine-invariant features from the border line. The investigation includes invariant features based on Fourier descriptors and moments. Finally, the object is identified by maximum likelihood classification. In the experiments all three basic object types are correctly identified. The probabilities for misclassification have been found to be below 1%.
Full-length genomic characterization and molecular evolution of canine parvovirus in China.
Zhou, Ling; Tang, Qinghai; Shi, Lijun; Kong, Miaomiao; Liang, Lin; Mao, Qianqian; Bu, Bin; Yao, Lunguang; Zhao, Kai; Cui, Shangjin; Leal, Élcio
2016-06-01
Canine parvovirus type 2 (CPV-2) can cause acute haemorrhagic enteritis in dogs and myocarditis in puppies. This disease has become one of the most serious infectious diseases of dogs. During 2014 in China, there were many cases of acute infectious diarrhoea in dogs. Some faecal samples were negative for the CPV-2 antigen based on a colloidal gold test strip but were positive based on PCR, and a viral strain was isolated from one such sample. The cytopathic effect on susceptible cells and the results of the immunoperoxidase monolayer assay, PCR, and sequencing indicated that the pathogen was CPV-2. The strain was named CPV-NY-14, and the full-length genome was sequenced and analysed. A maximum likelihood tree was constructed using the full-length genome and all available CPV-2 genomes. New strains have replaced the original strain in Taiwan and Italy, although the CPV-2a strain is still predominant there. However, CPV-2a still causes many cases of acute infectious diarrhoea in dogs in China.
Dor, Roi; Carling, Matthew D; Lovette, Irby J; Sheldon, Frederick H; Winkler, David W
2012-10-01
The New World swallow genus Tachycineta comprises nine species that collectively have a wide geographic distribution and remarkable variation both within- and among-species in ecologically important traits. Existing phylogenetic hypotheses for Tachycineta are based on mitochondrial DNA sequences, thus they provide estimates of a single gene tree. In this study we sequenced multiple individuals from each species at 16 nuclear intron loci. We used gene concatenated approaches (Bayesian and maximum likelihood) as well as coalescent-based species tree inference to reconstruct phylogenetic relationships of the genus. We examined the concordance and conflict between the nuclear and mitochondrial trees and between concatenated and coalescent-based inferences. Our results provide an alternative phylogenetic hypothesis to the existing mitochondrial DNA estimate of phylogeny. This new hypothesis provides a more accurate framework in which to explore trait evolution and examine the evolution of the mitochondrial genome in this group. Copyright © 2012 Elsevier Inc. All rights reserved.
Kumar, Sudhir; Stecher, Glen; Peterson, Daniel; Tamura, Koichiro
2012-10-15
There is a growing need in the research community to apply the molecular evolutionary genetics analysis (MEGA) software tool for batch processing a large number of datasets and to integrate it into analysis workflows. Therefore, we now make available the computing core of the MEGA software as a stand-alone executable (MEGA-CC), along with an analysis prototyper (MEGA-Proto). MEGA-CC provides users with access to all the computational analyses available through MEGA's graphical user interface version. This includes methods for multiple sequence alignment, substitution model selection, evolutionary distance estimation, phylogeny inference, substitution rate and pattern estimation, tests of natural selection and ancestral sequence inference. Additionally, we have upgraded the source code for phylogenetic analysis using the maximum likelihood methods for parallel execution on multiple processors and cores. Here, we describe MEGA-CC and outline the steps for using MEGA-CC in tandem with MEGA-Proto for iterative and automated data analysis. http://www.megasoftware.net/.
Maba, Dao Lamèga; Guelly, Atsu K; Yorou, Nourou S; Verbeken, Annemieke; Agerer, Reinhard
2015-06-01
Despite the crucial ecological role of lactarioid taxa (Lactifluus, Lactarius) as common ectomycorrhiza formers in tropical African seasonal forests, their current diversity is not yet adequately assessed. During the last few years, numerous lactarioid specimens have been sampled in various ecosystems from Togo (West Africa). We generated 48 ITS sequences and aligned them against lactarioid taxa from other tropical African ecozones (Guineo-Congolean evergreen forests, Zambezian miombo). A Maximum Likelihood phylogenetic tree was inferred from a dataset of 109 sequences. The phylogenetic placement of the specimens, combined with morpho-anatomical data, supported the description of four new species from Togo within the monophyletic genus Lactifluus: within subgen. Lactifluus (L. flavellus), subgen. Russulopsis (L. longibasidius and L. pectinatus), and subgen. Edules (L. melleus). This demonstrates that the current species richness of the genus is considerably higher than hitherto estimated for African species and, in addition, a need to redefine the subgenera and sections within it.
Genetic Diversity of HIV-1 in Tunisia.
El Moussi, Awatef; Thomson, Michael M; Delgado, Elena; Cuevas, María Teresa; Nasr, Majda; Abid, Salma; Ben Hadj Kacem, Mohamed Ali; Benaissa Tiouiri, Hanene; Letaief, Amel; Chakroun, Mohamed; Ben Jemaa, Mounir; Hamdouni, Hayet; Tej Dellagi, Rafla; Kheireddine, Khaled; Boutiba, Ilhem; Pérez-Álvarez, Lucía; Slim, Amine
2017-01-01
In this study, the genetic diversity of HIV-1 in Tunisia was analyzed. For this, 193 samples were collected in different regions of Tunisia between 2012 and 2015. A protease and reverse transcriptase fragment were amplified and sequenced. Phylogenetic analyses were performed through maximum likelihood and recombination was analyzed by bootscanning. Six HIV-1 subtypes (B, A1, G, D, C, and F2), 5 circulating recombinant forms (CRF02_AG, CRF25_cpx, CRF43_02G, CRF06_cpx, and CRF19_cpx), and 11 unique recombinant forms were identified. Subtype B (46.4%) and CRF02_AG (39.4%) were the predominant genetic forms. A group of 44 CRF02_AG sequences formed a distinct Tunisian cluster, which also included four viruses from western Europe. Nine viruses were closely related to isolates collected in other African or in European countries. In conclusion, a high HIV-1 genetic diversity is observed in Tunisia and the local spread of CRF02_AG is first documented in this country.
Jones, Christopher M; Stres, Blaz; Rosenquist, Magnus; Hallin, Sara
2008-09-01
Denitrification is a facultative respiratory pathway in which nitrite (NO2-), nitric oxide (NO), and nitrous oxide (N2O) are successively reduced to nitrogen gas (N2), effectively closing the nitrogen cycle. The ability to denitrify is widely dispersed among prokaryotes, and this polyphyletic distribution has raised the possibility of horizontal gene transfer (HGT) having a substantial role in the evolution of denitrification. Comparisons of 16S rRNA and denitrification gene phylogenies in recent studies support this possibility; however, these results remain speculative as they are based on visual comparisons of phylogenies from partial sequences. We reanalyzed publicly available nirS, nirK, norB, and nosZ partial sequences using Bayesian and maximum likelihood phylogenetic inference. Concomitant analysis of denitrification genes with 16S rRNA sequences from the same organisms showed substantial differences between the trees, which were supported by examining the posterior probability of monophyletic constraints at different taxonomic levels. Although these differences suggest HGT of denitrification genes, the presence of structural variants for nirK, norB, and nosZ makes it difficult to determine HGT from other evolutionary events. Additional analysis using phylogenetic networks and likelihood ratio tests of phylogenies based on full-length sequences retrieved from genomes also revealed significant differences in tree topologies among denitrification and 16S rRNA gene phylogenies, with the exception of the nosZ gene phylogeny within the data set of the nirK-harboring genomes. However, inspection of codon usage and G + C content plots from complete genomes gave no evidence for recent HGT. Instead, the close proximity of denitrification gene copies in the genomes of several denitrifying bacteria suggests duplication. 
Although HGT cannot be ruled out as a factor in the evolution of denitrification genes, our analysis suggests that other phenomena, such as gene duplication/divergence and lineage sorting, may have differently influenced the evolution of each denitrification gene.
THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures
Theobald, Douglas L.; Wuttke, Deborah S.
2008-01-01
Summary THESEUS is a command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While conventional superpositioning methods use ordinary least-squares (LS) as the optimization criterion, ML superpositions provide substantially improved accuracy by down-weighting variable structural regions and by correcting for correlations among atoms. ML superpositioning is robust and insensitive to the specific atoms included in the analysis, and thus it does not require subjective pruning of selected variable atomic coordinates. Output includes both likelihood-based and frequentist statistics for accurate evaluation of the adequacy of a superposition and for reliable analysis of structural similarities and differences. THESEUS performs principal components analysis for analyzing the complex correlations found among atoms within a structural ensemble. PMID:16777907
Synchronization for Optical PPM with Inter-Symbol Guard Times
NASA Astrophysics Data System (ADS)
Rogalin, R.; Srinivasan, M.
2017-05-01
Deep space optical communications promises orders of magnitude growth in communication capacity, supporting high data rate applications such as video streaming and high-bandwidth science instruments. Pulse position modulation is the modulation format of choice for deep space applications, and by inserting inter-symbol guard times between the symbols, the signal carries the timing information needed by the demodulator. Accurately extracting this timing information is crucial to demodulating and decoding this signal. In this article, we propose a number of timing and frequency estimation schemes for this modulation format, and in particular highlight a low complexity maximum likelihood timing estimator that significantly outperforms the prior art in this domain. This method does not require an explicit synchronization sequence, freeing up channel resources for data transmission.
A complex valued radial basis function network for equalization of fast time varying channels.
Gan, Q; Saratchandran, P; Sundararajan, N; Subramanian, K R
1999-01-01
This paper presents a complex valued radial basis function (RBF) network for equalization of fast time varying channels. A new method for calculating the centers of the RBF network is given. The method allows fixing the number of RBF centers even as the equalizer order is increased so that a good performance is obtained by a high-order RBF equalizer with small number of centers. Simulations are performed on time varying channels using a Rayleigh fading channel model to compare the performance of our RBF with an adaptive maximum-likelihood sequence estimator (MLSE) consisting of a channel estimator and a MLSE implemented by the Viterbi algorithm. The results show that the RBF equalizer produces superior performance with less computational complexity.
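The MLSE baseline named in this abstract can be illustrated with a minimal Viterbi sketch. It assumes a known, static, real-valued two-tap channel with BPSK symbols and white Gaussian noise, which is far simpler than the time-varying Rayleigh channel with a channel estimator used in the paper; the function name `mlse_viterbi` and the tap values are hypothetical.

```python
def mlse_viterbi(received, taps, symbols=(-1.0, 1.0)):
    """Most likely symbol sequence for r[k] = h0*s[k] + h1*s[k-1] + noise.

    Under white Gaussian noise the ML path metric is the accumulated
    squared error. Assumes the symbol before the first one is symbols[0].
    """
    h0, h1 = taps
    n_states = len(symbols)
    inf = float("inf")
    # cost[j]: best metric of any path whose latest symbol is symbols[j]
    cost = [0.0 if j == 0 else inf for j in range(n_states)]
    back = []  # back[k][cur] = best previous state at step k
    for r in received:
        new_cost = [inf] * n_states
        choice = [0] * n_states
        for cur in range(n_states):
            for prev in range(n_states):
                if cost[prev] == inf:
                    continue  # unreachable state
                expected = h0 * symbols[cur] + h1 * symbols[prev]
                metric = cost[prev] + (r - expected) ** 2
                if metric < new_cost[cur]:
                    new_cost[cur] = metric
                    choice[cur] = prev
        back.append(choice)
        cost = new_cost
    # Trace back from the best final state.
    state = min(range(n_states), key=lambda j: cost[j])
    decoded = []
    for choice in reversed(back):
        decoded.append(symbols[state])
        state = choice[state]
    decoded.reverse()
    return decoded
```

The trellis here has one state per possible previous symbol, so complexity grows with the alphabet size raised to the channel memory; that growth is the cost an equalizer with a fixed number of centers tries to avoid.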
Super-Nyquist shaping and processing technologies for high-spectral-efficiency optical systems
NASA Astrophysics Data System (ADS)
Jia, Zhensheng; Chien, Hung-Chang; Zhang, Junwen; Dong, Ze; Cai, Yi; Yu, Jianjun
2013-12-01
The implementations of super-Nyquist pulse generation, either in the digital domain using a digital-to-analog converter (DAC) or with an optical filter at the transmitter side, are introduced. Three corresponding signal processing algorithms at the receiver are presented and compared for high spectral-efficiency (SE) optical systems employing spectral prefiltering. These algorithms are designed to mitigate the inter-symbol-interference (ISI) and inter-channel-interference (ICI) impairments caused by the bandwidth constraint, and include 1-tap constant modulus algorithm (CMA) with 3-tap maximum likelihood sequence estimation (MLSE), regular CMA and digital filter with 2-tap MLSE, and constant multi-modulus algorithm (CMMA) with 2-tap MLSE. The principles and prefiltering tolerance are given through numerical and experimental results.
A step-up test procedure to find the minimum effective dose.
Wang, Weizhen; Peng, Jianan
2015-01-01
It is of great interest to find the minimum effective dose (MED) in dose-response studies. A sequence of decreasing null hypotheses to find the MED is formulated under the assumption of nondecreasing dose response means. A step-up multiple test procedure that controls the familywise error rate (FWER) is constructed based on the maximum likelihood estimators for the monotone normal means. When the MED is equal to one, the proposed test is uniformly more powerful than Hsu and Berger's test (1999). Also, a simulation study shows a substantial power improvement for the proposed test over four competitors. Three R-codes are provided in Supplemental Materials for this article. Go to the publisher's online edition of Journal of Biopharmaceutical Statistics to view the files.
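The maximum likelihood estimators for monotone normal means mentioned in this abstract can be sketched as follows, assuming independent normal responses with a common variance, in which case the order-restricted MLE of the dose means is the weighted isotonic regression of the sample means and can be computed by the pool-adjacent-violators algorithm. This covers only the estimation step, not the step-up test itself; the function name and the example numbers are hypothetical.

```python
def isotonic_means(means, weights):
    """Weighted nondecreasing fit to per-dose sample means (PAVA).

    means   : sample mean response at each dose, in increasing dose order
    weights : sample size at each dose
    """
    # Each block holds [pooled mean, total weight, number of doses pooled].
    blocks = []
    for m, w in zip(means, weights):
        blocks.append([m, w, 1])
        # Pool adjacent blocks while they violate the nondecreasing order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for m, _, count in blocks:
        fitted.extend([m] * count)
    return fitted
```

For example, sample means of 1.0, 3.0, 2.0 and 4.0 with equal sample sizes violate monotonicity at the second and third doses, so those two are pooled to their average.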
Poletto, S; Gambetta, Jay M; Merkel, Seth T; Smolin, John A; Chow, Jerry M; Córcoles, A D; Keefe, George A; Rothwell, Mary B; Rozen, J R; Abraham, D W; Rigetti, Chad; Steffen, M
2012-12-14
We report a system where fixed interactions between noncomputational levels make bright the otherwise forbidden two-photon |00⟩→|11⟩ transition. The system is formed by hand selection and assembly of two discrete component transmon-style superconducting qubits inside a rectangular microwave cavity. The application of a monochromatic drive tuned to this transition induces two-photon Rabi-like oscillations between the ground and doubly excited states via the Bell basis. The system therefore allows all-microwave two-qubit universal control with the same techniques and hardware required for single qubit control. We report Ramsey-like and spin echo sequences with the generated Bell states, and measure a two-qubit gate fidelity of F_g = 90% (unconstrained) and 86% (maximum likelihood estimator).
Phylogenetic tree and community structure from a Tangled Nature model.
Canko, Osman; Taşkın, Ferhat; Argın, Kamil
2015-10-07
In evolutionary biology, the taxonomy and origination of species are widely studied subjects. An estimation of the evolutionary tree can be done via available DNA sequence data. The calculation of the tree is made by well-known and frequently used methods such as maximum likelihood and neighbor-joining. In order to examine the results of these methods, an evolutionary tree is pursued computationally by a mathematical model, called Tangled Nature. A relatively small genome space is investigated due to computational burden and it is found that the actual and predicted trees are in reasonably good agreement in terms of shape. Moreover, the speciation and the resulting community structure of the food-web are investigated by modularity. Copyright © 2015 Elsevier Ltd. All rights reserved.
Computational Software for Fitting Seismic Data to Epidemic-Type Aftershock Sequence Models
NASA Astrophysics Data System (ADS)
Chu, A.
2014-12-01
Modern earthquake catalogs are often analyzed using spatial-temporal point process models such as the epidemic-type aftershock sequence (ETAS) models of Ogata (1998). My work introduces software to implement two of the ETAS models described in Ogata (1998). To find the maximum-likelihood estimates (MLEs), my software provides estimates of the homogeneous background rate parameter and the temporal and spatial parameters that govern triggering effects by applying the Expectation-Maximization (EM) algorithm introduced in Veen and Schoenberg (2008). Although other computer programs exist for similar data modeling purposes, using the EM algorithm has the benefits of stability and robustness (Veen and Schoenberg, 2008). Spatial shapes that are very long and narrow cause difficulties in optimization convergence, and problems with flat or multi-modal log-likelihood functions encounter similar issues. My program uses a robust method to preset a parameter to overcome the non-convergence computational issue. In addition to model fitting, the software is equipped with useful tools for examining model fitting results, for example, visualization of the estimated conditional intensity and estimation of the expected number of triggered aftershocks. A simulation generator is also given with flexible spatial shapes that may be defined by the user. This open-source software has a very simple user interface. The user may execute it on a local computer, and the program also has the potential to be hosted online. The Java language is used for the software's core computing part and an optional interface to the statistical package R is provided.
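As a rough illustration of the conditional intensity and the expected number of triggered aftershocks mentioned in this abstract, the sketch below uses one common temporal-only ETAS parameterization, with intensity mu + sum over past events of K * exp(alpha * (m_i - m0)) * (t - t_i + c)^(-p). The spatial kernels of the models in Ogata (1998) are omitted, and this is not the described software's API; all names and parameter values are hypothetical.

```python
import math

def etas_intensity(t, events, mu, K, alpha, c, p, m0):
    """Temporal ETAS conditional intensity at time t.

    events : iterable of (time, magnitude) pairs; only events before t count
    mu     : homogeneous background rate
    K, alpha, c, p : triggering parameters; m0 : reference magnitude
    """
    rate = mu
    for t_i, m_i in events:
        if t_i < t:
            productivity = K * math.exp(alpha * (m_i - m0))
            rate += productivity * (t - t_i + c) ** (-p)
    return rate

def expected_direct_aftershocks(m, K, alpha, c, p, m0):
    """Expected number of first-generation aftershocks of a magnitude-m event.

    This is the integral of the triggering kernel over all future times,
    which is finite only when p > 1.
    """
    if p <= 1:
        raise ValueError("the kernel integral diverges unless p > 1")
    return K * math.exp(alpha * (m - m0)) * c ** (1 - p) / (p - 1)
```

With no past events the intensity reduces to the background rate `mu`, which is the homogeneous term the abstract says is estimated alongside the triggering parameters.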
Mikaeili, F; Mirhendi, H; Mohebali, M; Hosseini, M; Sharbatkhori, M; Zarei, Z; Kia, E B
2015-07-01
The study was conducted to determine the sequence variation in two mitochondrial genes, namely cytochrome c oxidase 1 (pcox1) and NADH dehydrogenase 1 (pnad1) within and among isolates of Toxocara cati, Toxocara canis and Toxascaris leonina. Genomic DNA was extracted from 32 isolates of T. cati, 9 isolates of T. canis and 19 isolates of T. leonina collected from cats and dogs in different geographical areas of Iran. Mitochondrial genes were amplified by polymerase chain reaction (PCR) and sequenced. Sequence data were aligned using the BioEdit software and compared with published sequences in GenBank. Phylogenetic analysis was performed using Bayesian inference and maximum likelihood methods. Based on pairwise comparison, intra-species genetic diversity within Iranian isolates of T. cati, T. canis and T. leonina amounted to 0-2.3%, 0-1.3% and 0-1.0% for pcox1 and 0-2.0%, 0-1.7% and 0-2.6% for pnad1, respectively. Inter-species sequence variation among the three ascaridoid nematodes was significantly higher, being 9.5-16.6% for pcox1 and 11.9-26.7% for pnad1. Sequence and phylogenetic analysis of the pcox1 and pnad1 genes indicated that there is significant genetic diversity within and among isolates of T. cati, T. canis and T. leonina from different areas of Iran, and these genes can be used for studying genetic variation of ascaridoid nematodes.
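The pairwise percentages reported in this abstract are the kind of quantity produced by a simple proportion-of-differing-sites calculation on aligned sequences. The sketch below shows that calculation under the assumption of an uncorrected p-distance that skips gapped or ambiguous sites; the abstract does not state the exact distance measure used, and the function name and sequences are hypothetical.

```python
def p_distance_percent(seq_a, seq_b, skip_chars="-N?"):
    """Percentage of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differing / compared
```

Applying such a function to every pair of isolates within a species gives an intra-species range, and to pairs drawn from different species an inter-species range.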
Ruhlman, Tracey; Lee, Seung-Bum; Jansen, Robert K; Hostetler, Jessica B; Tallon, Luke J; Town, Christopher D; Daniell, Henry
2006-08-31
Carrot (Daucus carota) is a major food crop in the US and worldwide. Its capacity for storage and its lifecycle as a biennial make it an attractive species for the introduction of foreign genes, especially for oral delivery of vaccines and other therapeutic proteins. Until recently efforts to express recombinant proteins in carrot have had limited success in terms of protein accumulation in the edible tap roots. Plastid genetic engineering offers the potential to overcome this limitation, as demonstrated by the accumulation of BADH in chromoplasts of carrot taproots to confer exceedingly high levels of salt resistance. The complete plastid genome of carrot provides essential information required for genetic engineering. Additionally, the sequence data add to the rapidly growing database of plastid genomes for assessing phylogenetic relationships among angiosperms. The complete carrot plastid genome is 155,911 bp in length, with 115 unique genes and 21 duplicated genes within the IR. There are four ribosomal RNAs, 30 distinct tRNA genes and 18 intron-containing genes. Repeat analysis reveals 12 direct and 2 inverted repeats ≥ 30 bp with a sequence identity ≥ 90%. Phylogenetic analysis of nucleotide sequences for 61 protein-coding genes using both maximum parsimony (MP) and maximum likelihood (ML) was performed for 29 angiosperms. Phylogenies from both methods provide strong support for the monophyly of several major angiosperm clades, including monocots, eudicots, rosids, asterids, eurosids II, euasterids I, and euasterids II. The carrot plastid genome contains a number of dispersed direct and inverted repeats scattered throughout coding and non-coding regions. This is the first sequenced plastid genome of the family Apiaceae and only the second published genome sequence of the species-rich euasterid II clade. Both MP and ML trees provide very strong support (100% bootstrap) for the sister relationship of Daucus with Panax in the euasterid II clade. 
These results provide the best taxon sampling of complete chloroplast genomes and the strongest support yet for the sister relationship of Caryophyllales to the asterids. The availability of the complete plastid genome sequence should facilitate improved transformation efficiency and foreign gene expression in carrot through utilization of endogenous flanking sequences and regulatory elements.
Maximum Likelihood Analysis in the PEN Experiment
NASA Astrophysics Data System (ADS)
Lehman, Martin
2013-10-01
The experimental determination of the π⁺ → e⁺ν(γ) decay branching ratio currently provides the most accurate test of lepton universality. The PEN experiment at PSI, Switzerland, aims to improve the present world average experimental precision of 3.3 × 10⁻³ to 5 × 10⁻⁴ using a stopped beam approach. During runs in 2008-10, PEN has acquired over 2 × 10⁷ πe2 events. The experiment includes active beam detectors (degrader, mini TPC, target), central MWPC tracking with plastic scintillator hodoscopes, and a spherical pure CsI electromagnetic shower calorimeter. The final branching ratio will be calculated using a maximum likelihood analysis. This analysis assigns each event a probability for 5 processes (π⁺ → e⁺ν, π⁺ → μ⁺ν, decay-in-flight, pile-up, and hadronic events) using Monte Carlo verified probability distribution functions of our observables (energies, times, etc). A progress report on the PEN maximum likelihood analysis will be presented. Work supported by NSF grant PHY-0970013.
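The abstract does not say how the likelihood is maximized, so the following is only a minimal sketch of the general idea: given per-event probability densities for each process, estimate the process fractions that maximize the total log-likelihood. It assumes two processes instead of five, Gaussian densities instead of Monte Carlo verified ones, and an expectation-maximization update; the names `fit_fractions` and `gauss` are hypothetical and not taken from the PEN analysis.

```python
import math
import random


def fit_fractions(pdf_values, n_iter=50):
    """Estimate mixture fractions by EM.

    pdf_values[i][k] is the density of event i under process k.
    Returns the fractions f_k that (locally) maximize
    sum_i log(sum_k f_k * pdf_values[i][k]).
    """
    n_events = len(pdf_values)
    n_proc = len(pdf_values[0])
    fractions = [1.0 / n_proc] * n_proc  # start from equal fractions
    for _ in range(n_iter):
        totals = [0.0] * n_proc
        for row in pdf_values:
            # E-step: probability that this event belongs to each process
            weighted = [f * p for f, p in zip(fractions, row)]
            norm = sum(weighted)
            for k in range(n_proc):
                totals[k] += weighted[k] / norm
        # M-step: new fraction is the mean membership probability
        fractions = [t / n_events for t in totals]
    return fractions


def gauss(x, mu, sigma):
    """Gaussian density, standing in for a Monte Carlo derived PDF."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Toy data: 30% of events from process 0, 70% from process 1 (assumed values).
random.seed(1)
true_fraction = 0.3
events = [
    random.gauss(0.0, 1.0) if random.random() < true_fraction else random.gauss(4.0, 1.0)
    for _ in range(2000)
]
pdf_values = [[gauss(x, 0.0, 1.0), gauss(x, 4.0, 1.0)] for x in events]
fractions = fit_fractions(pdf_values)
print(fractions)
```

In a sketch like this a branching ratio would follow from the ratio of two fitted fractions after efficiency corrections, which are not modeled here.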
The Extended-Image Tracking Technique Based on the Maximum Likelihood Estimation
NASA Technical Reports Server (NTRS)
Tsou, Haiping; Yan, Tsun-Yee
2000-01-01
This paper describes an extended-image tracking technique based on the maximum likelihood estimation. The target image is assumed to have a known profile covering more than one element of a focal plane detector array. It is assumed that the relative position between the imager and the target is changing with time and the received target image has each of its pixels disturbed by an independent additive white Gaussian noise. When a rotation-invariant movement between imager and target is considered, the maximum likelihood based image tracking technique described in this paper is a closed-loop structure capable of providing iterative update of the movement estimate by calculating the loop feedback signals from a weighted correlation between the currently received target image and the previously estimated reference image in the transform domain. The movement estimate is then used to direct the imager to closely follow the moving target. This image tracking technique has many potential applications, including free-space optical communications and astronomy where accurate and stabilized optical pointing is essential.
Reyes-Valdés, M H; Stelly, D M
1995-01-01
Frequencies of meiotic configurations in cytogenetic stocks are dependent on chiasma frequencies in segments defined by centromeres, breakpoints, and telomeres. The expectation maximization algorithm is proposed as a general method to perform maximum likelihood estimations of the chiasma frequencies in the intervals between such locations. The estimates can be translated via mapping functions into genetic maps of cytogenetic landmarks. One set of observational data was analyzed to exemplify application of these methods, results of which were largely concordant with other comparable data. The method was also tested by Monte Carlo simulation of frequencies of meiotic configurations from a monotelodisomic translocation heterozygote, assuming six different sample sizes. The estimate averages were always close to the values given initially to the parameters. The maximum likelihood estimation procedures can be extended readily to other kinds of cytogenetic stocks and allow the pooling of diverse cytogenetic data to collectively estimate lengths of segments, arms, and chromosomes. PMID:7568226
Comparisons of neural networks to standard techniques for image classification and correlation
NASA Technical Reports Server (NTRS)
Paola, Justin D.; Schowengerdt, Robert A.
1994-01-01
Neural network techniques for multispectral image classification and spatial pattern detection are compared to the standard techniques of maximum-likelihood classification and spatial correlation. The neural network produced a more accurate classification than maximum-likelihood of a Landsat scene of Tucson, Arizona. Some of the errors in the maximum-likelihood classification are illustrated using decision region and class probability density plots. As expected, the main drawback to the neural network method is the long time required for the training stage. The network was trained using several different hidden layer sizes to optimize both the classification accuracy and training speed, and it was found that one node per class was optimal. The performance improved when 3x3 local windows of image data were entered into the net. This modification introduces texture into the classification without explicit calculation of a texture measure. Larger windows were successfully used for the detection of spatial features in Landsat and Magellan synthetic aperture radar imagery.
Schminkey, Donna L; von Oertzen, Timo; Bullock, Linda
2016-08-01
With increasing access to population-based data and electronic health records for secondary analysis, missing data are common. In the social and behavioral sciences, missing data frequently are handled with multiple imputation methods or full information maximum likelihood (FIML) techniques, but healthcare researchers have not embraced these methodologies to the same extent and more often use either traditional imputation techniques or complete case analysis, which can compromise power and introduce unintended bias. This article is a review of options for handling missing data, concluding with a case study demonstrating the utility of multilevel structural equation modeling using full information maximum likelihood (MSEM with FIML) to handle large amounts of missing data. MSEM with FIML is a parsimonious and hypothesis-driven strategy to cope with large amounts of missing data without compromising power or introducing bias. This technique is relevant for nurse researchers faced with ever-increasing amounts of electronic data and decreasing research budgets. © 2016 Wiley Periodicals, Inc.
Methods for estimating drought streamflow probabilities for Virginia streams
Austin, Samuel H.
2014-01-01
Maximum likelihood logistic regression model equations used to estimate drought flow probabilities for Virginia streams are presented for 259 hydrologic basins in Virginia. Winter streamflows were used to estimate the likelihood of streamflows during the subsequent drought-prone summer months. The maximum likelihood logistic regression models identify probable streamflows from 5 to 8 months in advance. More than 5 million streamflow daily values collected over the period of record (January 1, 1900 through May 16, 2012) were compiled and analyzed over a minimum 10-year (maximum 112-year) period of record. The analysis yielded the 46,704 equations with statistically significant fit statistics and parameter ranges published in two tables in this report. These model equations produce summer month (July, August, and September) drought flow threshold probabilities as a function of streamflows during the previous winter months (November, December, January, and February). Example calculations are provided, demonstrating how to use the equations to estimate probable streamflows as much as 8 months in advance.
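As a hedged illustration of how a fitted logistic equation of this kind is evaluated, the sketch below maps a winter streamflow to a summer drought-flow probability. The function name, the intercept and slope values, and the use of the raw (untransformed) flow as the predictor are all assumptions made for illustration; the report's own tables give the actual coefficients and predictor definitions.

```python
import math


def drought_probability(winter_flow, intercept, slope):
    """Logistic model: P = 1 / (1 + exp(-(intercept + slope * winter_flow)))."""
    z = intercept + slope * winter_flow
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, NOT taken from the report: a negative slope
# means that higher winter flow lowers the probability of summer drought flow.
INTERCEPT = 4.0
SLOPE = -0.05

for flow in (40.0, 80.0, 120.0):
    p = drought_probability(flow, INTERCEPT, SLOPE)
    print(f"winter flow {flow:6.1f} -> drought probability {p:.3f}")
```

With these made-up coefficients the probability crosses 0.5 at a winter flow of 80, where the linear predictor is zero.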
DECONV-TOOL: An IDL based deconvolution software package
NASA Technical Reports Server (NTRS)
Varosi, F.; Landsman, W. B.
1992-01-01
There are a variety of algorithms for deconvolution of blurred images, each having its own criteria or statistic to be optimized in order to estimate the original image data. Using the Interactive Data Language (IDL), we have implemented the Maximum Likelihood, Maximum Entropy, Maximum Residual Likelihood, and sigma-CLEAN algorithms in a unified environment called DeConv_Tool. Most of the algorithms have as their goal the optimization of statistics such as standard deviation and mean of residuals. Shannon entropy, log-likelihood, and chi-square of the residual auto-correlation are computed by DeConv_Tool for the purpose of determining the performance and convergence of any particular method and comparisons between methods. DeConv_Tool allows interactive monitoring of the statistics and the deconvolved image during computation. The final results, and optionally, the intermediate results, are stored in a structure convenient for comparison between methods and review of the deconvolution computation. The routines comprising DeConv_Tool are available via anonymous FTP through the IDL Astronomy User's Library.
F-8C adaptive flight control laws
NASA Technical Reports Server (NTRS)
Hartmann, G. L.; Harvey, C. A.; Stein, G.; Carlson, D. N.; Hendrick, R. C.
1977-01-01
Three candidate digital adaptive control laws were designed for NASA's F-8C digital fly-by-wire aircraft. Each design used the same control laws but adjusted the gains with a different adaptive algorithm. The three adaptive concepts were: high-gain limit cycle, Liapunov-stable model tracking, and maximum likelihood estimation. Sensors were restricted to conventional inertial instruments (rate gyros and accelerometers) without use of air-data measurements. Performance, growth potential, and computer requirements were used as criteria for selecting the most promising of these candidates for further refinement. The maximum likelihood concept was selected primarily because it offers the greatest potential for identifying several aircraft parameters and hence for improved control performance in future aircraft application. In terms of identification and gain adjustment accuracy, the MLE design is slightly superior to the other two, but this has no significant effects on the control performance achievable with the F-8C aircraft. The maximum likelihood design is recommended for flight test, and several refinements to that design are proposed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Washeleski, Robert L.; Meyer, Edmond J. IV; King, Lyon B.
2013-10-15
Laser Thomson scattering (LTS) is an established plasma diagnostic technique that has seen recent application to low density plasmas. It is difficult to perform LTS measurements when the scattered signal is weak as a result of low electron number density, poor optical access to the plasma, or both. Photon counting methods are often implemented in order to perform measurements in these low signal conditions. However, photon counting measurements performed with photo-multiplier tubes are time consuming and multi-photon arrivals are incorrectly recorded. In order to overcome these shortcomings a new data analysis method based on maximum likelihood estimation was developed. The key feature of this new data processing method is the inclusion of non-arrival events in determining the scattered Thomson signal. Maximum likelihood estimation and its application to Thomson scattering at low signal levels is presented and application of the new processing method to LTS measurements performed in the plume of a 2-kW Hall-effect thruster is discussed.
Washeleski, Robert L; Meyer, Edmond J; King, Lyon B
2013-10-01
Laser Thomson scattering (LTS) is an established plasma diagnostic technique that has seen recent application to low density plasmas. It is difficult to perform LTS measurements when the scattered signal is weak as a result of low electron number density, poor optical access to the plasma, or both. Photon counting methods are often implemented in order to perform measurements in these low signal conditions. However, photon counting measurements performed with photo-multiplier tubes are time consuming and multi-photon arrivals are incorrectly recorded. In order to overcome these shortcomings a new data analysis method based on maximum likelihood estimation was developed. The key feature of this new data processing method is the inclusion of non-arrival events in determining the scattered Thomson signal. Maximum likelihood estimation and its application to Thomson scattering at low signal levels is presented and application of the new processing method to LTS measurements performed in the plume of a 2-kW Hall-effect thruster is discussed.
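One way to see why non-arrival events carry information is the sketch below. It assumes, as a simplification not stated in the abstract, that photon arrivals per laser shot are Poisson with mean λ and that the detector records only whether at least one photon arrived. The probability of an empty shot is then exp(-λ), and the maximum likelihood estimate is λ = -ln(n_empty / n_shots). The function names are hypothetical.

```python
import math


def estimate_mean_photons(n_shots, n_empty):
    """ML estimate of the mean photons per shot from the empty-shot count.

    Assumes Poisson arrivals and a detector that cannot distinguish
    one photon from several within a single shot.
    """
    if not 0 < n_empty <= n_shots:
        raise ValueError("need 0 < n_empty <= n_shots")
    return -math.log(n_empty / n_shots)


def naive_mean_photons(n_shots, n_empty):
    """Counts every non-empty shot as exactly one photon (biased low)."""
    return (n_shots - n_empty) / n_shots


shots, empty = 1000, 500
print(estimate_mean_photons(shots, empty))  # about ln(2)
print(naive_mean_photons(shots, empty))     # 0.5
```

The gap between the two estimates grows with signal level, which is consistent with the abstract's remark that multi-photon arrivals are recorded incorrectly by plain counting.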
NASA Technical Reports Server (NTRS)
Lei, Ning; Chiang, Kwo-Fu; Oudrari, Hassan; Xiong, Xiaoxiong
2011-01-01
Optical sensors aboard Earth orbiting satellites such as the next generation Visible/Infrared Imager/Radiometer Suite (VIIRS) assume that the sensor's radiometric response in the Reflective Solar Bands (RSB) is described by a quadratic polynomial, in relating the aperture spectral radiance to the sensor Digital Number (DN) readout. For VIIRS Flight Unit 1, the coefficients are to be determined before launch by an attenuation method, although the linear coefficient will be further determined on-orbit through observing the Solar Diffuser. In determining the quadratic polynomial coefficients by the attenuation method, a Maximum Likelihood approach is applied in carrying out the least-squares procedure. Crucial to the Maximum Likelihood least-squares procedure is the computation of the weight. The weight not only has a contribution from the noise of the sensor's digital count, with an important contribution from digitization error, but also is affected heavily by the mathematical expression used to predict the value of the dependent variable, because both the independent and the dependent variables contain random noise. In addition, model errors have a major impact on the uncertainties of the coefficients. The Maximum Likelihood approach demonstrates the inadequacy of the attenuation method model with a quadratic polynomial for the retrieved spectral radiance. We show that using the inadequate model dramatically increases the uncertainties of the coefficients. We compute the coefficient values and their uncertainties, considering both measurement and model errors.
Evaluation of properties over phylogenetic trees using stochastic logics.
Requeno, José Ignacio; Colom, José Manuel
2016-06-14
Model checking has been recently introduced as an integrated framework for extracting information of the phylogenetic trees using temporal logics as a querying language, an extension of modal logics that imposes restrictions of a boolean formula along a path of events. The phylogenetic tree is considered a transition system modeling the evolution as a sequence of genomic mutations (we understand mutation as different ways that DNA can be changed), while this kind of logics are suitable for traversing it in a strict and exhaustive way. Given a biological property that we desire to inspect over the phylogeny, the verifier returns true if the specification is satisfied or a counterexample that falsifies it. However, this approach has been only considered over qualitative aspects of the phylogeny. In this paper, we repair the limitations of the previous framework for including and handling quantitative information such as explicit time or probability. To this end, we apply current probabilistic continuous-time extensions of model checking to phylogenetics. We reinterpret a catalog of qualitative properties in a numerical way, and we also present new properties that couldn't be analyzed before. For instance, we obtain the likelihood of a tree topology according to a mutation model. As case of study, we analyze several phylogenies in order to obtain the maximum likelihood with the model checking tool PRISM. In addition, we have adapted the software for optimizing the computation of maximum likelihoods. We have shown that probabilistic model checking is a competitive framework for describing and analyzing quantitative properties over phylogenetic trees. This formalism adds soundness and readability to the definition of models and specifications. Besides, the existence of model checking tools hides the underlying technology, omitting the extension, upgrade, debugging and maintenance of a software tool to the biologists. 
A set of benchmarks justifies the feasibility of our approach.
Dutheil, Julien; Gaillard, Sylvain; Bazin, Eric; Glémin, Sylvain; Ranwez, Vincent; Galtier, Nicolas; Belkhir, Khalid
2006-04-04
A large number of bioinformatics applications in the fields of bio-sequence analysis, molecular evolution and population genetics typically share input/output methods, data storage requirements and data analysis algorithms. Such common features may be conveniently bundled into re-usable libraries, which enable the rapid development of new methods and robust applications. We present Bio++, a set of Object Oriented libraries written in C++. Available components include classes for data storage and handling (nucleotide/amino-acid/codon sequences, trees, distance matrices, population genetics datasets), various input/output formats, basic sequence manipulation (concatenation, transcription, translation, etc.), phylogenetic analysis (maximum parsimony, markov models, distance methods, likelihood computation and maximization), population genetics/genomics (diversity statistics, neutrality tests, various multi-locus analyses) and various algorithms for numerical calculus. Implementation of methods aims at being both efficient and user-friendly. A special concern was given to the library design to enable easy extension and new methods development. We defined a general hierarchy of classes that allow the developer to implement its own algorithms while remaining compatible with the rest of the libraries. Bio++ source code is distributed free of charge under the CeCILL general public licence from its website http://kimura.univ-montp2.fr/BioPP.
Zhang, Yanhong; Pham, Nancy Kim; Zhang, Huixian; Lin, Junda; Lin, Qiang
2014-01-01
The population genetics of seahorses is influenced by their species-specific ecological requirements and life-history traits. In the present study, partial sequences of mitochondrial cytochrome b (cytb) and control region (CR) were obtained from 50 Hippocampus mohnikei and 92 H. trimaculatus from four zoogeographical zones. A total of 780 base pairs of cytb gene were sequenced to characterize mitochondrial DNA (mtDNA) diversity. The mtDNA marker revealed high haplotype diversity, low nucleotide diversity, and a lack of population structure across both populations of H. mohnikei and H. trimaculatus. A neighbour-joining (NJ) tree of cytb gene sequences showed that H. mohnikei haplotypes formed one cluster. A maximum likelihood (ML) tree of cytb gene sequences showed that H. trimaculatus belonged to one lineage. The star-like pattern median-joining network of cytb and CR markers indicated a previous demographic expansion of H. mohnikei and H. trimaculatus. The cytb and CR data sets exhibited a unimodal mismatch distribution, which may have resulted from population expansion. Mismatch analysis suggested that the expansion was initiated about 276,000 years ago for H. mohnikei and about 230,000 years ago for H. trimaculatus during the middle Pleistocene period. This study indicates a possible signature of genetic variation and population expansion in two seahorses under complex marine environments.
Regression estimators for generic health-related quality of life and quality-adjusted life years.
Basu, Anirban; Manca, Andrea
2012-01-01
To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and account for features typical of such data such as a skewed distribution, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution. First, both a single equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One and 2-part Beta regression models provide flexible approaches to regress the outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcomes distribution. This work will provide applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.
Dong, Yi; Mihalas, Stefan; Russell, Alexander; Etienne-Cummings, Ralph; Niebur, Ernst
2012-01-01
When a neuronal spike train is observed, what can we say about the properties of the neuron that generated it? A natural way to answer this question is to make an assumption about the type of neuron, select an appropriate model for this type, and then to choose the model parameters as those that are most likely to generate the observed spike train. This is the maximum likelihood method. If the neuron obeys simple integrate and fire dynamics, Paninski, Pillow, and Simoncelli (2004) showed that its negative log-likelihood function is convex and that its unique global minimum can thus be found by gradient descent techniques. The global minimum property requires independence of spike time intervals. Lack of history dependence is, however, an important constraint that is not fulfilled in many biological neurons which are known to generate a rich repertoire of spiking behaviors that are incompatible with history independence. Therefore, we expanded the integrate and fire model by including one additional variable, a variable threshold (Mihalas & Niebur, 2009) allowing for history-dependent firing patterns. This neuronal model produces a large number of spiking behaviors while still being linear. Linearity is important as it maintains the distribution of the random variables and still allows for maximum likelihood methods to be used. In this study we show that, although convexity of the negative log-likelihood is not guaranteed for this model, the minimum of the negative log-likelihood function yields a good estimate for the model parameters, in particular if the noise level is treated as a free parameter. Furthermore, we show that a nonlinear function minimization method (r-algorithm with space dilation) frequently reaches the global minimum. PMID:21851282
Phylogeny of Kinorhyncha Based on Morphology and Two Molecular Loci
Sørensen, Martin V.; Dal Zotto, Matteo; Rho, Hyun Soo; Herranz, Maria; Sánchez, Nuria; Pardos, Fernando; Yamasaki, Hiroshi
2015-01-01
The phylogeny of Kinorhyncha was analyzed using morphology and the molecular loci 18S rRNA and 28S rRNA. The different datasets were analyzed separately and in combination, using maximum likelihood and Bayesian Inference. Bayesian inference of molecular sequence data in combination with morphology supported the division of Kinorhyncha into two major clades: Cyclorhagida comb. nov. and Allomalorhagida nom. nov. The latter clade represents a new kinorhynch class, and accommodates Dracoderes, Franciscideres, a yet undescribed genus which is closely related with Franciscideres, and the traditional homalorhagid genera. Homalorhagid monophyly was not supported by any analyses with molecular sequence data included. Analysis of the combined molecular and morphological data furthermore supported a cyclorhagid clade which included all traditional cyclorhagid taxa, except Dracoderes that no longer should be considered a cyclorhagid genus. Accordingly, Cyclorhagida is divided into three main lineages: Echinoderidae, Campyloderidae, and a large clade, ‘Kentrorhagata’, which except for species of Campyloderes, includes all species with a midterminal spine present in adult individuals. Maximum likelihood analysis of the combined datasets produced a rather unresolved tree that was not regarded in the following discussion. Results of the analyses with only molecular sequence data included were incongruent at different points. However, common for all analyses was the support of several major clades, i.e., Campyloderidae, Kentrorhagata, Echinoderidae, Dracoderidae, Pycnophyidae, and a clade with Paracentrophyes + New Genus and Franciscideres (in those analyses where the latter was included). All molecular analyses including 18S rRNA sequence data furthermore supported monophyly of Allomalorhagida. 
Cyclorhagid monophyly was only supported in analyses of combined 18S rRNA and 28S rRNA (both ML and BI), and only in a restricted dataset where taxa with incomplete information from 28S rRNA had been omitted. Analysis of the morphological data produced results that were similar with those from the combined molecular and morphological analysis. E.g., the morphological data also supported exclusion of Dracoderes from Cyclorhagida. The main differences between the morphological analysis and analyses based on the combined datasets include: 1) Homalorhagida appears as monophyletic in the morphological tree only, 2) the morphological analyses position Franciscideres and the new genus within Cyclorhagida near Zelinkaderidae and Cateriidae, whereas analyses including molecular data place the two genera inside Allomalorhagida, and 3) species of Campyloderes appear in a basal trichotomy within Kentrorhagata in the morphological tree, whereas analysis of the combined datasets places species of Campyloderes as a sister clade to Echinoderidae and Kentrorhagata. PMID:26200115
Dollet, Michel; Sturm, Nancy R; Campbell, David A
2012-03-01
The distinction between plant trypanosomatids and opportunistic monoxenous insect trypanosomatids has not been demarcated clearly due to the mass placement of all trypanosomatids isolated from plants into the arbitrary genus Phytomonas spp. The advent of molecular markers has been useful in distinguishing plant trypanosomatids from the rest of the Trypanosomatidae family. Here we have examined the internal transcribed spacer (ITS) region of the ribosomal RNA (rRNA) locus for classification purposes. This region contains two distinct ITSs flanked by the small subunit and large subunit of ribosomal RNA genes and separated by the 5.8S ribosomal RNA gene. Sequences within the 5.8S ribosomal RNA gene and in the ITS sequences can serve as specific markers for several of the Phytomonas groups. Microsatellite sequences were identified in Phytomonas spp. in both ITS regions. Several classes of microsatellites were seen, with inter-isolate variation that has potential for future use. Maximum Likelihood analysis of the ITS sequences of 20 Phytomonas isolates representing the eight defined groups and a few unclassified isolates revealed a total of 10 distinct subgroups within our collection, of which two are new. The ITS region, which includes the 5.8S sequence, is a robust marker for the subdivisions within the genus Phytomonas spp. Copyright © 2011 Elsevier B.V. All rights reserved.
Ayyagari, Vijaya Sai; Sreerama, Krupanidhi
2017-08-01
Achatina fulica (Lissachatina fulica) is one of the most invasive species found across the globe causing a significant damage to crops, vegetables, and horticultural plants. This terrestrial snail is native to east Africa and spread to different parts of the world by introductions. India, a hot spot for biodiversity of several endemic gastropods, has witnessed an outburst of this snail population in several parts of the country posing a serious threat to crop loss and also to human health. With an objective to evaluate the genetic diversity of this snail, we have sampled this snail from different parts of India and analyzed its haplotype diversity by means of 16S rDNA sequence information. Apart from this, we have studied the phylogenetic relationships of the isolates sequenced in the present study in relation with other global populations by Bayesian and Maximum-likelihood approaches. Of the isolates sequenced, haplotype 'C' is the predominant one. A new haplotype 'S' from the state of Odisha was observed. The isolates sequenced in the present study clustered with its conspecifics from the Indian sub-continent. Haplotype network analyses were also carried out for studying the evolution of different haplotypes. It was observed that haplotype 'S' was associated with a Mauritius haplotype 'H', indicating the possibility of multiple introductions of A. fulica to India.
Feldman, Sanford H; Ntenda, Abraham M
2011-01-01
We used high-fidelity PCR to amplify 2 overlapping regions of the ribosomal gene complex from the rodent fur mite Myobia musculi. The amplicons encompassed a large portion of the mite's ribosomal gene complex spanning 3128 nucleotides containing the entire 18S rRNA, internal transcribed spacer (ITS) 1, 5.8S rRNA, ITS2, and a portion of the 5′-end of the 28S rRNA. M. musculi’s 179-nucleotide 5.8S rRNA nucleotide sequence was not conserved, so this region was identified by conservation of rRNA secondary structure. Maximum likelihood and Bayesian inference phylogenetic analyses were performed by using multiple sequence alignment consisting of 1524 nucleotides of M. musculi 18S rRNA and homologous sequences from 42 prostigmatid mites and the tick Dermacentor andersoni. The phylograms produced by both methods were in agreement regarding terminal, secondary, and some tertiary phylogenetic relationships among mites. Bayesian inference discriminated most infraordinal relationships between Eleutherengona and Parasitengona mites in the suborder Anystina. Basal relationships between suborders Anystina and Eupodina historically determined by comparing differences in anatomic characteristics were less well-supported by our molecular analysis. Our results recapitulated similar 18S rRNA sequence analyses recently reported. Our study supports M. musculi as belonging to the suborder Anystina, infraorder Eleutherengona, and superfamily Cheyletoidea. PMID:22330574
Falade, Mofolusho O.; Opene, Anthony J.; Benson, Otarigho
2016-01-01
DNA barcoding has been adopted as a gold standard rapid, precise and unifying identification system for animal species and provides a database of genetic sequences that can be used as a tool for universal species identification. In this study, we employed mitochondrial genes 16S rRNA (16S) and cytochrome oxidase subunit I (COI) for the identification of some Nigerian freshwater catfish and Tilapia species. Approximately 655 bp were amplified from the 5′ region of the mitochondrial cytochrome C oxidase subunit I (COI) gene whereas 570 bp were amplified for the 16S rRNA gene. Nucleotide divergences among sequences were estimated based on Kimura 2-parameter distances and the genetic relationships were assessed by constructing phylogenetic trees using the neighbour-joining (NJ) and maximum likelihood (ML) methods. Analyses of consensus barcode sequences for each species, and alignment of individual sequences from within a given species revealed highly consistent barcodes (99% similarity on average), which could be compared with deposited sequences in public databases. The nucleotide distance between species belonging to different genera based on COI ranged from 0.17% between Sarotherodon melanotheron and Coptodon zillii to 0.49% between Clarias gariepinus and C. zillii, indicating that S. melanotheron and C. zillii are closely related. Based on the data obtained, the utility of COI gene was confirmed in accurate identification of three fish species from Southwest Nigeria. PMID:27990256
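The Kimura 2-parameter distance mentioned above can be sketched as follows. With P the proportion of sites that differ by a transition and Q the proportion that differ by a transversion, the distance is d = -1/2 · ln((1 - 2P - Q) · sqrt(1 - 2Q)). The function name and the toy sequences are illustrative assumptions and not data from the study; sites with gaps or ambiguous bases are skipped here, which is one common convention among several.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))


# Toy alignment: 20 sites, 2 transitions (A->G) and 1 transversion (A->C).
ref = "AAAAAAAAAAAAAAAAAAAA"
alt = "GGCAAAAAAAAAAAAAAAAA"
print(round(k2p_distance(ref, alt), 4))
```

For this toy pair P = 0.10 and Q = 0.05, so the corrected distance (about 0.17) is somewhat larger than the raw proportion of differing sites (0.15).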
Accurate Structural Correlations from Maximum Likelihood Superpositions
Theobald, Douglas L; Wuttke, Deborah S
2008-01-01
The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the major modes of structural correlation in macromolecules using likelihood-based statistical analysis of sets of structures. This method is generally applicable to any ensemble of related molecules, including families of nuclear magnetic resonance (NMR) models, different crystal forms of a protein, and structural alignments of homologous proteins, as well as molecular dynamics trajectories. Dominant modes of structural correlation are determined using principal components analysis (PCA) of the maximum likelihood estimate of the correlation matrix. The correlations we identify are inherently independent of the statistical uncertainty and dynamic heterogeneity associated with the structural coordinates. We additionally present an easily interpretable method (“PCA plots”) for displaying these positional correlations by color-coding them onto a macromolecular structure. Maximum likelihood PCA of structural superpositions, and the structural PCA plots that illustrate the results, will facilitate the accurate determination of dynamic structural correlations analyzed in diverse fields of structural biology. PMID:18282091
Li, Dongming; Sun, Changming; Yang, Jinhua; Liu, Huan; Peng, Jiaqi; Zhang, Lijuan
2017-04-06
An adaptive optics (AO) system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods.
Li, Dongming; Sun, Changming; Yang, Jinhua; Liu, Huan; Peng, Jiaqi; Zhang, Lijuan
2017-01-01
An adaptive optics (AO) system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods. PMID:28383503
Nelson, Randin; Cañate, Raul; Pascale, Juan Miguel; Dragoo, Jerry W; Armien, Blas; Armien, Anibal G; Koster, Frederick
2010-09-01
Choclo virus (CHOV) was described in sigmodontine rodents, Oligoryzomys fulvescens, and humans during an outbreak of hantavirus cardiopulmonary syndrome (HCPS) in 1999-2000 in western Panama. Although HCPS is rare, hantavirus-specific serum antibody prevalence among the general population is high suggesting that CHOV may cause many mild or asymptomatic infections. The goals of this study were to confirm the role of CHOV in HCPS and in the frequently detected serum antibody and to establish the phylogenetic relationship with other New World hantaviruses. CHOV was cultured to facilitate the sequencing of the small (S) and medium (M) segments and to perform CHOV-specific serum neutralization antibody assays. Sequences of the S and M segments found a close relationship to other Oligoryzomys-borne hantaviruses in the Americas, highly conserved terminal nucleotides, and no evidence for recombination events. The maximum likelihood and maximum parsimony analyses of complete M segment nucleotide sequences indicate a close relationship to Maporal and Laguna Negra viruses, found at the base of the South American clade. In a focus neutralization assay acute and convalescent sera from six Panamanian HCPS patients neutralized CHOV in dilutions from 1:200 to 1:6,400. In a sample of antibody-positive adults without a history of HCPS, 9 of 10 sera neutralized CHOV in dilutions ranging from 1:100 to 1:6,400. Although cross-neutralization with other sympatric hantaviruses not yet associated with human disease is possible, CHOV appears to be the causal agent for most of the mild or asymptomatic hantavirus infections, as well as HCPS, in Panama.
Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics
Kolaczkowski, Bryan; Thornton, Joseph W.
2009-01-01
Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias—which is apparent under both controlled simulation conditions and in analyses of empirical sequence data—also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages—that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis. PMID:20011052
Long-branch attraction bias and inconsistency in Bayesian phylogenetics.
Kolaczkowski, Bryan; Thornton, Joseph W
2009-12-09
Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias--which is apparent under both controlled simulation conditions and in analyses of empirical sequence data--also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages--that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.
Li, Chunmei; Yu, Zhilong; Fu, Yusi; Pang, Yuhong; Huang, Yanyi
2017-04-26
We develop a novel single-cell-based platform through digital counting of amplified genomic DNA fragments, named multifraction amplification (mfA), to detect the copy number variations (CNVs) in a single cell. Amplification is required to acquire genomic information from a single cell, but it introduces unavoidable bias. Unlike prevalent methods that directly infer CNV profiles from the pattern of sequencing depth, our mfA platform denatures and separates the DNA molecules from a single cell into multiple fractions of a reaction mix before amplification. By examining the sequencing result of each fraction for a specific fragment and applying a segment-merge maximum likelihood algorithm to the calculation of copy number, we digitize the sequencing-depth-based CNV identification and thus provide a method that is less sensitive to the amplification bias. In this paper, we demonstrate an mfA platform through multiple displacement amplification (MDA) chemistry. When the mfA platform is used, the noise of MDA is reduced; therefore, the resolution of single-cell CNV identification can be improved to 100 kb. We can also determine the genomic region free of allelic drop-out with the mfA platform, which is impossible for conventional single-cell amplification methods.
Hurtado, Luis A; Santamaria, Carlos A; Fitzgerald, Lee A
2014-05-06
The phylogenetic position of the critically endangered Saint Croix ground lizard Ameiva polops is presently unknown and several hypotheses have been proposed. We investigated the phylogenetic position of this species using molecular phylogenetic methods. We obtained sequences of DNA fragments of the mitochondrial ribosomal genes 12S rDNA and 16S rDNA for this species. We aligned these sequences with published sequences of other Ameiva species, which include most of the Ameiva species from the West Indies, three Ameiva species from Central America and South America, and one from the teiid lizard Tupinambis teguixin, which was used as outgroup. We conducted Maximum Likelihood and Bayesian phylogenetic analyses. The phylogenetic reconstructions among the different methods were very similar, supporting the monophyly of West Indian Ameiva and showing within this lineage, a basal polytomy of four clades that are separated geographically. Ameiva polops grouped in a cluster that included the other two Ameiva species found in the Puerto Rican Bank: A. wetmorei and A. exsul. A sister relationship between A. polops and A. wetmorei is suggested by our analyses. We compare our results with a previous study on molecular systematics of West Indian Ameiva.
Higher-level phylogeny of paraneopteran insects inferred from mitochondrial genome sequences
Li, Hu; Shao, Renfu; Song, Nan; Song, Fan; Jiang, Pei; Li, Zhihong; Cai, Wanzhi
2015-01-01
Mitochondrial (mt) genome data have been proven to be informative for animal phylogenetic studies but may also suffer from systematic errors, due to the effects of accelerated substitution rate and compositional heterogeneity. We analyzed the mt genomes of 25 insect species from the four paraneopteran orders, aiming to better understand how accelerated substitution rate and compositional heterogeneity affect the inferences of the higher-level phylogeny of this diverse group of hemimetabolous insects. We found substantial heterogeneity in base composition and contrasting rates in nucleotide substitution among these paraneopteran insects, which complicate the inference of higher-level phylogeny. The phylogenies inferred with concatenated sequences of mt genes using maximum likelihood and Bayesian methods and homogeneous models failed to recover Psocodea and Hemiptera as monophyletic groups but grouped, instead, the taxa that had accelerated substitution rates together, including Sternorrhyncha (a suborder of Hemiptera), Thysanoptera, Phthiraptera and Liposcelididae (a family of Psocoptera). Bayesian inference with nucleotide sequences and heterogeneous models (CAT and CAT + GTR), however, recovered Psocodea, Thysanoptera and Hemiptera each as a monophyletic group. Within Psocodea, Liposcelididae is more closely related to Phthiraptera than to other species of Psocoptera. Furthermore, Thysanoptera was recovered as the sister group to Hemiptera. PMID:25704094
Wesener, Thomas; Voigtländer, Karin; Decker, Peter; Oeyen, Jan Philip; Spelda, Jörg; Lindner, Norman
2015-01-01
Abstract As part of the German Barcode of Life (GBOL) Myriapoda program, which aims to sequence the COI barcoding fragment for 2000 specimens of Germany’s 200 myriapod species in the near future, 44 sequences of the centipede order Geophilomorpha are analyzed. The analyses are limited to the genera Geophilus Leach, 1814 and Stenotaenia Koch, 1847 and include a total of six species. A special focus is Stenotaenia, of which 19 specimens from southern, western and eastern Germany could be successfully sequenced. The Stenotaenia data shows the presence of three to four vastly different (13.7–16.7% p-distance) lineages of the genus in Germany. At least two of the three lineages show a wide distribution across Germany, only the lineage including topotypes of Stenotaenia linearis shows a more restricted distribution in southern Germany. In a maximum likelihood phylogenetic analysis the Italian species Stenotaenia ‘sorrentina’ (Attems, 1903) groups with the different German Stenotaenia linearis clades. The strongly different Stenotaenia linearis lineages within Germany, independent of geography, are a strong hint for the presence of additional, cryptic Stenotaenia species in Germany. PMID:26257532
Maximum-Likelihood Methods for Processing Signals From Gamma-Ray Detectors
Barrett, Harrison H.; Hunter, William C. J.; Miller, Brian William; Moore, Stephen K.; Chen, Yichun; Furenlid, Lars R.
2009-01-01
In any gamma-ray detector, each event produces electrical signals on one or more circuit elements. From these signals, we may wish to determine the presence of an interaction; whether multiple interactions occurred; the spatial coordinates in two or three dimensions of at least the primary interaction; or the total energy deposited in that interaction. We may also want to compute listmode probabilities for tomographic reconstruction. Maximum-likelihood methods provide a rigorous and in some senses optimal approach to extracting this information, and the associated Fisher information matrix provides a way of quantifying and optimizing the information conveyed by the detector. This paper will review the principles of likelihood methods as applied to gamma-ray detectors and illustrate their power with recent results from the Center for Gamma-ray Imaging. PMID:20107527
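As a rough illustration of maximum-likelihood position estimation from detector signals, the sketch below assumes independent Poisson-distributed sensor outputs and searches a one-dimensional grid for the position that maximizes the log-likelihood. The Gaussian mean-response model, the sensor layout, and all function names are hypothetical and are not taken from the paper.

```python
import math


def mean_signals(x, energy, sensor_positions, width=1.0):
    """Hypothetical mean sensor response for an interaction at position x."""
    return [
        energy * math.exp(-0.5 * ((x - s) / width) ** 2) + 0.01  # small floor
        for s in sensor_positions
    ]


def poisson_loglik(signals, means):
    """Poisson log-likelihood, dropping the term that does not depend on x."""
    return sum(g * math.log(m) - m for g, m in zip(signals, means))


def ml_position(signals, energy, sensor_positions, grid):
    """Grid-search ML estimate of the interaction position."""
    return max(
        grid,
        key=lambda x: poisson_loglik(
            signals, mean_signals(x, energy, sensor_positions)
        ),
    )
```

In practice the mean response would come from calibration data, and the same likelihood can be maximized jointly over position and deposited energy.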
E6 and E7 Gene Polymorphisms in Human Papillomavirus Types-58 and 33 Identified in Southwest China
Wen, Qiang; Wang, Tao; Mu, Xuemei; Chenzhang, Yuwei; Cao, Man
2017-01-01
Cancer of the cervix is associated with infection by certain types of human papillomavirus (HPV). The gene variants differ in immune responses and oncogenic potential. The E6 and E7 proteins encoded by high-risk HPV play a key role in cellular transformation. HPV-33 and HPV-58 types are highly prevalent among Chinese women. To study the gene intratypic variations, polymorphisms and positive selections of HPV-33 and HPV-58 E6/E7 in southwest China, HPV-33 (E6, E7: n = 216) and HPV-58 (E6, E7: n = 405) E6 and E7 genes were sequenced and compared to others submitted to GenBank. Phylogenetic trees were constructed by the Maximum-likelihood and Kimura 2-parameter methods in MEGA 6 (Molecular Evolutionary Genetics Analysis version 6.0). The diversity of secondary structure was analyzed by PSIPred software. The selection pressures acting on the E6/E7 genes were estimated by PAML 4.8 (Phylogenetic Analyses by Maximum Likelihood version 4.8) software. The positive sites of HPV-33 and HPV-58 E6/E7 were contrasted by ClustalX 2.1. Among 216 HPV-33 E6 sequences, 8 single nucleotide mutations were observed with 6/8 non-synonymous and 2/8 synonymous mutations. The 216 HPV-33 E7 sequences showed 3 single nucleotide mutations that were non-synonymous. The 405 HPV-58 E6 sequences revealed 8 single nucleotide mutations with 4/8 non-synonymous and 4/8 synonymous mutations. Among 405 HPV-58 E7 sequences, 13 single nucleotide mutations were observed with 10/13 non-synonymous mutations and 3/13 synonymous mutations. The selective pressure analysis showed that all HPV-33 and 4/6 HPV-58 E6/E7 major non-synonymous mutations were sites of positive selection. All variations were observed in sites belonging to major histocompatibility complex and/or B-cell predicted epitopes. K93N and R145 (I/N) were observed in both HPV-33 and HPV-58 E6. PMID:28141822
Association of Bartonella Species with Wild and Synanthropic Rodents in Different Brazilian Biomes
Gonçalves, Luiz Ricardo; Favacho, Alexsandra Rodrigues de Mendonça; Roque, André Luiz Rodrigues; Mendes, Natalia Serra; Fidelis Junior, Otávio Luiz; Benevenute, Jyan Lucas; Herrera, Heitor Miraglia; D'Andrea, Paulo Sérgio; de Lemos, Elba Regina Sampaio; Machado, Rosangela Zacarias
2016-01-01
ABSTRACT Bartonella spp. comprise an ecologically successful group of microorganisms that infect erythrocytes and have adapted to different hosts, which include a wide range of mammals, besides humans. Rodents are reservoirs of about two-thirds of Bartonella spp. described to date; and some of them have been implicated as causative agents of human diseases. In our study, we performed molecular and phylogenetic analyses of Bartonella spp. infecting wild rodents from five different Brazilian biomes. In order to characterize the genetic diversity of Bartonella spp., we performed a robust analysis based on three target genes, followed by sequencing, Bayesian inference, and maximum likelihood analysis. Bartonella spp. were detected in 25.6% (117/457) of rodent spleen samples analyzed, and this occurrence varied among different biomes. The diversity analysis of gltA sequences showed the presence of 15 different haplotypes. Analysis of the phylogenetic relationship of gltA sequences performed by Bayesian inference and maximum likelihood showed that the Bartonella species detected in rodents from Brazil was closely related to the phylogenetic group A detected in other cricetid rodents from North America, probably constituting only one species. Last, the Bartonella species genogroup identified in the present study formed a monophyletic group that included Bartonella samples from seven different rodent species distributed in three distinct biomes. In conclusion, our study showed that the occurrence of Bartonella bacteria in rodents is much more frequent and widespread than previously recognized. IMPORTANCE In the present study, we reported the occurrence of Bartonella spp. in some sites in Brazil. The identification and understanding of the distribution of this important group of bacteria may allow the Brazilian authorities to recognize potential regions with the risk of transmission of these pathogens among wild and domestic animals and humans. 
In addition, our study accessed important gaps in the biology of this group of bacteria in Brazil, such as its low host specificity, high genetic diversity, and relationship with other Bartonella spp. detected in rodents trapped in America. Considering the diversity of newly discovered Bartonella species and the great ecological plasticity of these bacteria, new studies with the aim of revealing the biological aspects unknown until now are needed and must be performed around the world. In this context, the impact of Bartonella spp. associated with rodents in human health should be assessed in future studies. PMID:27736785
Shen, Yi; Dai, Wei; Richards, Virginia M
2015-03-01
A MATLAB toolbox for the efficient estimation of the threshold, slope, and lapse rate of the psychometric function is described. The toolbox enables the efficient implementation of the updated maximum-likelihood (UML) procedure. The toolbox uses an object-oriented architecture for organizing the experimental variables and computational algorithms, which provides experimenters with flexibility in experimental design and data management. Descriptions of the UML procedure and the UML Toolbox are provided, followed by toolbox use examples. Finally, guidelines and recommendations of parameter configurations are given.
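To make the three estimated parameters concrete, here is a minimal Python sketch of a maximum-likelihood fit of a logistic psychometric function by grid search. It is not the UML Toolbox API (which is MATLAB) and it omits the adaptive stimulus placement of the UML procedure; the logistic form, the fixed guess rate `gamma`, and all names are assumptions made for illustration.

```python
import math
from itertools import product


def psychometric(x, alpha, beta, lam, gamma=0.5):
    """Probability correct at stimulus level x.

    alpha: threshold, beta: slope, lam: lapse rate, gamma: guess rate.
    """
    return gamma + (1.0 - gamma - lam) / (1.0 + math.exp(-beta * (x - alpha)))


def log_likelihood(trials, alpha, beta, lam, gamma=0.5):
    """Sum of Bernoulli log-likelihoods over (level, correct) trials."""
    total = 0.0
    for x, correct in trials:
        p = psychometric(x, alpha, beta, lam, gamma)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += math.log(p if correct else 1.0 - p)
    return total


def ml_fit(trials, alphas, betas, lams, gamma=0.5):
    """Return the (alpha, beta, lam) grid point with the highest likelihood."""
    return max(
        product(alphas, betas, lams),
        key=lambda theta: log_likelihood(trials, *theta, gamma),
    )
```

In an adaptive procedure the fit would be refreshed after every trial and the next stimulus level chosen from the current estimate; here the fit is run once on a fixed set of trials.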
A maximum likelihood convolutional decoder model vs experimental data comparison
NASA Technical Reports Server (NTRS)
Chen, R. Y.
1979-01-01
This article describes the comparison of a maximum likelihood convolutional decoder (MCD) prediction model and the actual performance of the MCD at the Madrid Deep Space Station. The MCD prediction model is used to develop a subroutine that has been utilized by the Telemetry Analysis Program (TAP) to compute the MCD bit error rate for a given signal-to-noise ratio. The results indicate that the TAP predictions agree quite well with the experimental measurements. An optimal modulation index can also be found through TAP.
Salje, Ekhard K H; Planes, Antoni; Vives, Eduard
2017-10-01
Crackling noise can be initiated by competing or coexisting mechanisms. These mechanisms can combine to generate an approximate scale invariant distribution that contains two or more contributions. The overall distribution function can be analyzed, to a good approximation, using maximum-likelihood methods and assuming that it follows a power law although with nonuniversal exponents depending on a varying lower cutoff. We propose that such distributions are rather common and originate from a simple superposition of crackling noise distributions or exponential damping.
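The dependence of the fitted exponent on the lower cutoff can be sketched with the standard maximum-likelihood estimator for a continuous power law, alpha = 1 + n / sum(ln(x_i / x_min)). The Python below is a minimal illustration under that assumption (a pure power law above the cutoff with no upper cutoff); the function names are illustrative and not from the paper.

```python
import math


def powerlaw_mle(samples, x_min):
    """ML exponent of p(x) ~ x**(-alpha) for x >= x_min.

    Returns (alpha, approximate standard error, number of samples used).
    """
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0.0:
        raise ValueError("need samples strictly above x_min")
    n = len(tail)
    alpha = 1.0 + n / log_sum
    stderr = (alpha - 1.0) / math.sqrt(n)  # large-n approximation
    return alpha, stderr, n


def exponent_vs_cutoff(samples, cutoffs):
    """Fitted exponent for each candidate lower cutoff."""
    return [(c, powerlaw_mle(samples, c)[0]) for c in cutoffs]
```

For a single pure power law the exponent returned by `exponent_vs_cutoff` stays roughly flat as the cutoff increases; a systematic drift with the cutoff is the signature of mixed contributions that the abstract describes.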
Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu
2009-01-01
Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae. PMID:17534593
Sato, Mitsuharu; Miyazaki, Kentaro
2017-01-01
Horizontal gene transfer (HGT) is a ubiquitous genetic event in bacterial evolution, but it seldom occurs for genes involved in highly complex supramolecules (or biosystems), which consist of many gene products. The ribosome is one such supramolecule, but several bacteria harbor dissimilar and/or chimeric 16S rRNAs in their genomes, suggesting the occurrence of HGT of this gene. However, we know little about whether the genes actually experience HGT and, if so, the frequency of such a transfer. This is primarily because the methods currently employed for phylogenetic analysis (e.g., neighbor-joining, maximum likelihood, and maximum parsimony) of 16S rRNA genes assume point mutation-driven tree-shape evolution as an evolutionary model, which is intrinsically inappropriate to decipher the evolutionary history for genes driven by recombination. To address this issue, we applied a phylogenetic network analysis, which has been used previously for detection of genetic recombination in homologous alleles, to the 16S rRNA gene. We focused on the genus Enterobacter, whose phylogenetic relationships inferred by multi-locus sequence alignment analysis and 16S rRNA sequences are incompatible. All 10 complete genomic sequences were retrieved from the NCBI database, in which 71 16S rRNA genes were included. Neighbor-joining analysis demonstrated that the genes residing in the same genomes clustered, indicating the occurrence of intragenomic recombination. However, as suggested by the low bootstrap values, evolutionary relationships between the clusters were uncertain. We then applied phylogenetic network analysis to representative sequences from each cluster. We found three ancestral 16S rRNA groups; the others were likely created through recursive recombination between the ancestors and chimeric descendants. Despite the large sequence changes caused by the recombination events, the RNA secondary structures were conserved. 
Successive intergenomic and intragenomic recombination thus shaped the evolution of 16S rRNA genes in the genus Enterobacter. PMID:29180992
Negrisolo, Enrico; Kuhl, Heiner; Forcato, Claudio; Vitulo, Nicola; Reinhardt, Richard; Patarnello, Tomaso; Bargelloni, Luca
2010-12-01
Comparative genomics holds the promise to magnify the information obtained from individual genome sequencing projects, revealing common features conserved across genomes and identifying lineage-specific characteristics. To implement such a comparative approach, a robust phylogenetic framework is required to accurately reconstruct evolution at the genome level. Among vertebrate taxa, teleosts represent the second best characterized group, with high-quality draft genome sequences for five model species (Danio rerio, Gasterosteus aculeatus, Oryzias latipes, Takifugu rubripes, and Tetraodon nigroviridis), and several others are in the finishing lane. However, the relationships among the acanthomorph teleost model fishes remain an unresolved taxonomic issue. Here, a genomic region spanning over 1.2 million base pairs was sequenced in the teleost fish Dicentrarchus labrax. Together with genomic data available for the above fish models, the new sequence was used to identify unique orthologous genomic regions shared across all target taxa. Different strategies were applied to produce robust multiple gene and genomic alignments spanning from 11,802 to 186,474 amino acid/nucleotide positions. Ten data sets were analyzed according to Bayesian inference, maximum likelihood, maximum parsimony, and neighbor joining methods. Extensive analyses were performed to explore the influence of several factors (e.g., alignment methodology, substitution model, data set partitions, and long-branch attraction) on the tree topology. Although a general consensus was observed for a closer relationship between G. aculeatus (Gasterosteidae) and Di. labrax (Moronidae) with the atherinomorph O. latipes (Beloniformes) sister taxon of this clade, with the tetraodontiform group Ta. rubripes and Te. nigroviridis (Tetraodontiformes) representing a more distantly related taxon among acanthomorph model fish species, conflicting results were obtained between data sets and methods, especially with respect to the choice of alignment methodology applied to noncoding parts of the genomic region under study. This may limit the use of intergenic/noncoding sequences in phylogenomics until more robust alignment algorithms are developed.
2014-01-01
Background Limited available sequence information has greatly impeded population genetics, phylogenetics and systematics studies in the subclass Acari (mites and ticks). Mitochondrial (mt) DNA is well known to provide genetic markers for investigations in these areas, but complete mt genomic data have been lacking for many Acari species. Herein, we present the complete mt genome of the scab mite Psoroptes cuniculi. Methods P. cuniculi was collected from a naturally infected New Zealand white rabbit from China and identified by morphological criteria. The complete mt genome of P. cuniculi was amplified by PCR and then sequenced. The relationships of this scab mite with selected members of the Acari were assessed by phylogenetic analysis of concatenated amino acid sequence datasets by Bayesian inference (BI), maximum likelihood (ML) and maximum parsimony (MP). Results This mt genome (14,247 bp) is circular and consists of 37 genes, including 13 genes for proteins, 22 genes for tRNA, and 2 genes for rRNA. The gene arrangement in the mt genome of P. cuniculi is the same as those of Dermatophagoides farinae (Pyroglyphidae) and Aleuroglyphus ovatus (Acaridae), but distinct from those of Steganacarus magnus (Steganacaridae) and Panonychus citri (Tetranychidae). Phylogenetic analyses using concatenated amino acid sequences of 12 protein-coding genes, with three different computational algorithms (BI, ML and MP), showed the division of subclass Acari into two superorders and supported the monophyly of both superorders, Parasitiformes and Acariformes, and of the three orders Ixodida, Mesostigmata and Astigmata, but rejected the monophyly of the order Prostigmata. Conclusions The mt genome of P. cuniculi represents the first mt genome of any member of the family Psoroptidae. Analysis of mt genome sequences in the present study has provided new insights into the phylogenetic relationships among several major lineages of Acari species. PMID:25052180
Likelihood-based modification of experimental crystal structure electron density maps
Terwilliger, Thomas C [Santa Fe, NM]
2005-04-16
A maximum-likelihood method improves an electron density map of an experimental crystal structure. A likelihood of a set of structure factors {F.sub.h } is formed for the experimental crystal structure as (1) the likelihood of having obtained an observed set of structure factors {F.sub.h.sup.OBS } if structure factor set {F.sub.h } was correct, and (2) the likelihood that an electron density map resulting from {F.sub.h } is consistent with selected prior knowledge about the experimental crystal structure. The set of structure factors {F.sub.h } is then adjusted to maximize the likelihood of {F.sub.h } for the experimental crystal structure. An improved electron density map is constructed with the maximized structure factors.
Wendel, Andreas F; Meyer, Sebastian; Deenen, René; Köhrer, Karl; Kolbe-Busch, Susanne; Pfeffer, Klaus; Willmann, Matthias; Kaasch, Achim J; MacKenzie, Colin R
2018-05-11
Enterobacter cloacae complex is a common cause of hospital outbreaks. A retrospective and prospective molecular analysis of carbapenem-resistant clinical isolates in a tertiary care center demonstrated an outbreak of a German-imipenemase-1 (GIM-1) metallo-beta-lactamase-producing Enterobacter hormaechei ssp. steigerwaltii affecting 23 patients between 2009 and 2016. Thirty-three isolates were sequence type 89 by conventional multilocus sequence typing (MLST) and displayed a maximum difference of 49 out of 3,643 targets in the ad-hoc core-genome MLST (cgMLST) scheme (SeqSphere+ software; Ridom, Münster, Germany). The relatedness of all isolates was confirmed by further maximum-likelihood phylogeny. One clonal complex of highly related isolates (≤15 allele difference in cgMLST) contained 17 patients, but epidemiological data only suggested five transmission events. The bla GIM-1 -gene was embedded in a class-1-integron (In770) and the Tn21-subgroup transposon Tn6216 (KC511628) on a 25-kb plasmid. Environmental screening detected one colonized sink trap in a service room. The outbreak was self-limited as no further bla GIM-1 -positive E. hormaechei has been isolated since 2016. Routine molecular screening of carbapenem-nonsusceptible gram-negative isolates detected a long-term, low-frequency outbreak of a GIM-1-producing E. hormaechei ssp. steigerwaltii clone. This highlights the necessity of molecular surveillance.
Liu, Luxian; Jin, Xinjie; Chen, Nan; Li, Xian; Li, Pan; Fu, Chengxin
2015-01-01
Phylogenetic relationships among Chinese species of Morella (Myricaceae) are unresolved. Here, we use restriction site-associated DNA sequencing (RAD-seq) to identify candidate loci that will help in determining phylogenetic relationships among Morella rubra, M. adenophora, M. nana and M. esculenta. Three methods for inferring phylogeny, maximum parsimony (MP), maximum likelihood (ML) and Bayesian concordance, were applied to data sets including as many as 4253 RAD loci with 8360 parsimony informative variable sites. All three methods significantly favored the topology of (((M. rubra, M. adenophora), M. nana), M. esculenta). Two species from North America (M. cerifera and M. pensylvanica) were placed as sister to the four Chinese species. According to BEAST analysis, we deduced speciation of M. rubra to be at about the Miocene-Pliocene boundary (5.28 Ma). Intraspecific divergence in M. rubra occurred in the late Pliocene (3.39 Ma). From pooled data, we assembled 29378, 21902 and 23552 de novo contigs with an average length of 229, 234 and 234 bp for M. rubra, M. nana and M. esculenta respectively. The contigs were used to investigate functional classification of RAD tags in a BLASTX search. Additionally, we identified 3808 unlinked SNP sites across the four populations of M. rubra and discovered genes associated with fruit ripening and senescence, fruit quality and disease/defense metabolism based on KEGG database. PMID:26431030
Petrovskaya, Lada E; Novototskaya-Vlasova, Ksenia A; Spirina, Elena V; Durdenko, Ekaterina V; Lomakina, Galina Yu; Zavialova, Maria G; Nikolaev, Evgeny N; Rivkina, Elizaveta M
2016-05-01
As a result of construction and screening of a metagenomic library prepared from a permafrost-derived microcosm, we have isolated a novel gene coding for a putative lipolytic enzyme that belongs to the hormone-sensitive lipase family. It encodes a polypeptide of 343 amino acid residues whose amino acid sequence displays maximum likelihood with uncharacterized proteins from Sphingomonas species. A putative catalytic serine residue of PMGL2 resides in a new variant of a recently discovered GTSAG sequence in which a Thr residue is replaced by a Cys residue (GCSAG). The recombinant PMGL2 was produced in Escherichia coli cells and purified by Ni-affinity chromatography. The resulting protein preferably utilizes short-chain p-nitrophenyl esters (C4 and C8) and therefore is an esterase. It possesses maximum activity at 45°C in slightly alkaline conditions and has limited thermostability at higher temperatures. Activity of PMGL2 is stimulated in the presence of 0.25-1.5 M NaCl indicating the good salt tolerance of the new enzyme. Mass spectrometric analysis demonstrated that N-terminal methionine in PMGL2 is processed and cysteine residues do not form a disulfide bond. The results of the study demonstrate the significance of the permafrost environment as a unique genetic reservoir and its potential for metagenomic exploration. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Ding, Hui-Hui; Chao, Yi-Shan; Callado, John Rey; Dong, Shi-Yong
2014-11-01
In this study we provide a phylogeny for the pantropical fern genus Tectaria, with emphasis on the Old World species, based on sequences of five plastid regions (atpB, ndhF plus ndhF-trnL, rbcL, rps16-matK plus matK, and trnL-F). Maximum parsimony, maximum likelihood, and Bayesian inference are used to analyze 115 individuals, representing ca. 56 species of Tectaria s.l. and 36 species of ten related genera. The results strongly support the monophyly of Tectaria in a broad sense, in which Ctenitopsis, Hemigramma, Heterogonium, Psomiocarpa, Quercifilix, Stenosemia, and Tectaridium should be submerged. Such broadly circumscribed Tectaria is supported by the arising pattern of veinlets and the base chromosome number (x=40). Four primary clades are well resolved within Tectaria, one from the Neotropics (T. trifoliata clade) and three from the Old World (T. subtriphylla clade, Ctenitopsis clade, and T. crenata clade). The T. crenata clade is the largest, comprising six subclades. Of the genera previously recognized as tectarioid ferns, Ctenitis, Lastreopsis, and Pleocnemia are confirmed to be members of Dryopteridaceae, while Pteridrys and Triplophyllum are supported in Tectariaceae. To infer morphological evolution, 13 commonly used characters are optimized on the resulting phylogenetic trees and, as a result, all are shown to be homoplastic in Tectaria. Copyright © 2014 Elsevier Inc. All rights reserved.
Automation and Evaluation of the SOWH Test with SOWHAT.
Church, Samuel H; Ryan, Joseph F; Dunn, Casey W
2015-11-01
The Swofford-Olsen-Waddell-Hillis (SOWH) test evaluates statistical support for incongruent phylogenetic topologies. It is commonly applied to determine if the maximum likelihood tree in a phylogenetic analysis is significantly different from an alternative hypothesis. The SOWH test compares the observed difference in log-likelihood between two topologies to a null distribution of differences in log-likelihood generated by parametric resampling. The test is a well-established phylogenetic method for topology testing, but it is sensitive to model misspecification, it is computationally burdensome to perform, and its implementation requires the investigator to make several decisions that each have the potential to affect the outcome of the test. We analyzed the effects of multiple factors using seven data sets to which the SOWH test was previously applied. These factors include the number of sample replicates, likelihood software, the introduction of gaps to simulated data, the use of distinct models of evolution for data simulation and likelihood inference, and a suggested test correction wherein an unresolved "zero-constrained" tree is used to simulate sequence data. To facilitate these analyses and future applications of the SOWH test, we wrote SOWHAT, a program that automates the SOWH test. We find that inadequate bootstrap sampling can change the outcome of the SOWH test. The results also show that using a zero-constrained tree for data simulation can result in a wider null distribution and higher p-values, but does not change the outcome of the SOWH test for most of the data sets tested here. These results will help others implement and evaluate the SOWH test and allow us to provide recommendations for future applications of the SOWH test. SOWHAT is available for download from https://github.com/josephryan/SOWHAT. © The Author(s) 2015. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
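The comparison step described in the abstract above (an observed log-likelihood difference set against a simulated null distribution) can be illustrated with a minimal sketch. This is not SOWHAT's code: the function name, the made-up numbers, and the add-one Monte Carlo correction are all assumptions chosen for illustration, and the tree searches and sequence simulations that would produce the differences are taken as already done.

```python
def sowh_p_value(observed_delta, null_deltas):
    """One-sided Monte Carlo p-value for a log-likelihood difference.

    observed_delta: log-likelihood difference between the two topologies
                    on the real data.
    null_deltas:    the same difference computed on each data set simulated
                    under the alternative (constrained) topology.
    The add-one correction is an assumption of this sketch; it keeps the
    p-value above zero when no simulated value reaches the observed one.
    """
    if not null_deltas:
        raise ValueError("need at least one simulated replicate")
    exceed = sum(1 for d in null_deltas if d >= observed_delta)
    return (exceed + 1) / (len(null_deltas) + 1)


# Hypothetical values, invented purely to exercise the function.
hypothetical_null = [0.5, 1.2, 0.1, 2.0, 0.8, 3.1, 0.3, 1.7, 0.9]
p = sowh_p_value(1.9, hypothetical_null)
print(f"p = {p:.2f}")
```

With so few replicates the p-value can only take coarse steps of 1/(n+1), which is one way to see why the number of replicates can change the outcome of the test.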
Task Performance with List-Mode Data
NASA Astrophysics Data System (ADS)
Caucci, Luca
This dissertation investigates the application of list-mode data to detection, estimation, and image reconstruction problems, with an emphasis on emission tomography in medical imaging. We begin by introducing a theoretical framework for list-mode data and we use it to define two observers that operate on list-mode data. These observers are applied to the problem of detecting a signal (known in shape and location) buried in a random lumpy background. We then consider maximum-likelihood methods for the estimation of numerical parameters from list-mode data, and we characterize the performance of these estimators via the so-called Fisher information matrix. Reconstruction from PET list-mode data is then considered. In a process we called "double maximum-likelihood" reconstruction, we consider a simple PET imaging system and we use maximum-likelihood methods to first estimate a parameter vector for each pair of gamma-ray photons that is detected by the hardware. The collection of these parameter vectors forms a list, which is then fed to another maximum-likelihood algorithm for volumetric reconstruction over a grid of voxels. Efficient parallel implementation of the algorithms discussed above is then presented. In this work, we take advantage of two low-cost, mass-produced computing platforms that have recently appeared on the market, and we provide some details on implementing our algorithms on these devices. We conclude this dissertation work by elaborating on a possible application of list-mode data to X-ray digital mammography. We argue that today's CMOS detectors and computing platforms have become fast enough to make X-ray digital mammography list-mode data acquisition and processing feasible.
NASA Astrophysics Data System (ADS)
Perlovsky, Leonid I.; Webb, Virgil H.; Bradley, Scott R.; Hansen, Christopher A.
1998-07-01
An advanced detection and tracking system is being developed for the U.S. Navy's Relocatable Over-the-Horizon Radar (ROTHR) to provide improved tracking performance against small aircraft typically used in drug-smuggling activities. The development is based on the Maximum Likelihood Adaptive Neural System (MLANS), a model-based neural network that combines advantages of neural network and model-based algorithmic approaches. The objective of the MLANS tracker development effort is to address user requirements for increased detection and tracking capability in clutter and improved track position, heading, and speed accuracy. The MLANS tracker is expected to outperform other approaches to detection and tracking for the following reasons. It incorporates adaptive internal models of target return signals, target tracks and maneuvers, and clutter signals, which leads to concurrent clutter suppression, detection, and tracking (track-before-detect). It is not combinatorial and thus does not require any thresholding or peak picking and can track in low signal-to-noise conditions. It incorporates superresolution spectrum estimation techniques exceeding the performance of conventional maximum likelihood and maximum entropy methods. The unique spectrum estimation method is based on the Einsteinian interpretation of the ROTHR received energy spectrum as a probability density of signal frequency. The MLANS neural architecture and learning mechanism are founded on spectrum models and maximization of the "Einsteinian" likelihood, allowing knowledge of the physical behavior of both targets and clutter to be injected into the tracker algorithms. The paper describes the addressed requirements and expected improvements, theoretical foundations, engineering methodology, and results of the development effort to date.
Curiale, Ariel H; Vegas-Sánchez-Ferrero, Gonzalo; Bosch, Johan G; Aja-Fernández, Santiago
2015-08-01
The strain and strain-rate measures are commonly used for the analysis and assessment of regional myocardial function. In echocardiography (EC), the strain analysis became possible using Tissue Doppler Imaging (TDI). Unfortunately, this modality shows an important limitation: the angle between the myocardial movement and the ultrasound beam should be small to provide reliable measures. This constraint makes it difficult to provide strain measures of the entire myocardium. Alternative non-Doppler techniques such as Speckle Tracking (ST) can provide strain measures without angle constraints. However, the spatial resolution and the noisy appearance of speckle still make the strain estimation a challenging task in EC. Several maximum likelihood approaches have been proposed to statistically characterize the behavior of speckle, which results in a better performance of speckle tracking. However, those models do not consider common transformations to achieve the final B-mode image (e.g. interpolation). This paper proposes a new maximum likelihood approach for speckle tracking which effectively characterizes speckle of the final B-mode image. Its formulation provides a diffeomorphic scheme that can be efficiently optimized with a second-order method. The novelty of the method is threefold: First, the statistical characterization of speckle generalizes conventional speckle models (Rayleigh, Nakagami and Gamma) to a more versatile model for real data. Second, the formulation includes local correlation to increase the efficiency of frame-to-frame speckle tracking. Third, a probabilistic myocardial tissue characterization is used to automatically identify more reliable myocardial motions. The accuracy and agreement assessment was evaluated on a set of 16 synthetic image sequences for three different scenarios: normal, acute ischemia and acute dyssynchrony. The proposed method was compared to six speckle tracking methods. Results revealed that the proposed method is the most accurate method to measure the motion and strain with an average median motion error of 0.42 mm and a median strain error of 2.0 ± 0.9%, 2.1 ± 1.3% and 7.1 ± 4.9% for circumferential, longitudinal and radial strain respectively. It also showed its capability to identify abnormal segments with reduced cardiac function and timing differences for the dyssynchrony cases. These results indicate that the proposed diffeomorphic speckle tracking method provides robust and accurate motion and strain estimation. Copyright © 2015. Published by Elsevier B.V.
Larridon, Isabel; Walter, Helmut E; Guerrero, Pablo C; Duarte, Milén; Cisternas, Mauricio A; Hernández, Carol Peña; Bauters, Kenneth; Asselman, Pieter; Goetghebeur, Paul; Samain, Marie-Stéphanie
2015-09-01
Species of the endemic Chilean cactus genus Copiapoa have cylindrical or (sub)globose stems that are solitary or form (large) clusters and typically yellow flowers. Many species are threatened with extinction. Despite being icons of the Atacama Desert and well loved by cactus enthusiasts, the evolution and diversity of Copiapoa has not yet been studied using a molecular approach. Sequence data of three plastid DNA markers (rpl32-trnL, trnH-psbA, ycf1) of 39 Copiapoa taxa were analyzed using maximum likelihood and Bayesian inference approaches. Species distributions were modeled based on geo-referenced localities and climatic data. Evolution of character states of four characters (root morphology, stem branching, stem shape, and stem diameter) as well as ancestral areas were reconstructed using a Bayesian and maximum likelihood framework, respectively. Clades of species are revealed. Though 32 morphologically defined species can be recognized, genetic diversity between some species and infraspecific taxa is too low to delimit their boundaries using plastid DNA markers. Recovered relationships are often supported by morphological and biogeographical patterns. The origin of Copiapoa likely lies between southern Peru and the extreme north of Chile. The Copiapó Valley limited colonization between two biogeographical areas. Copiapoa is here defined to include 32 species and five heterotypic subspecies. Thirty species are classified into four sections and two subsections, while two species remain unplaced. A better understanding of evolution and diversity of Copiapoa will allow allocating conservation resources to the most threatened lineages and focusing conservation action on real biodiversity. © 2015 Botanical Society of America.
Effects of control inputs on the estimation of stability and control parameters of a light airplane
NASA Technical Reports Server (NTRS)
Cannaday, R. L.; Suit, W. T.
1977-01-01
The maximum likelihood parameter estimation technique was used to determine the values of stability and control derivatives from flight test data for a low-wing, single-engine, light airplane. Several input forms were used during the tests to investigate the consistency of parameter estimates as it relates to inputs. These consistencies were compared by using the ensemble variance and estimated Cramer-Rao lower bound. In addition, the relationship between inputs and parameter correlations was investigated. Results from the stabilator inputs are inconclusive but the sequence of rudder input followed by aileron input or aileron followed by rudder gave more consistent estimates than did rudder or ailerons individually. Also, square-wave inputs appeared to provide slightly improved consistency in the parameter estimates when compared to sine-wave inputs.
NASA Astrophysics Data System (ADS)
Poletto, S.; Gambetta, Jay M.; Merkel, Seth T.; Smolin, John A.; Chow, Jerry M.; Córcoles, A. D.; Keefe, George A.; Rothwell, Mary B.; Rozen, J. R.; Abraham, D. W.; Rigetti, Chad; Steffen, M.
2012-12-01
We report a system where fixed interactions between noncomputational levels make bright the otherwise forbidden two-photon |00⟩→|11⟩ transition. The system is formed by hand selection and assembly of two discrete component transmon-style superconducting qubits inside a rectangular microwave cavity. The application of a monochromatic drive tuned to this transition induces two-photon Rabi-like oscillations between the ground and doubly excited states via the Bell basis. The system therefore allows all-microwave two-qubit universal control with the same techniques and hardware required for single qubit control. We report Ramsey-like and spin echo sequences with the generated Bell states, and measure a two-qubit gate fidelity of Fg=90% (unconstrained) and 86% (maximum likelihood estimator).
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook is re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated.
Maximum-likelihood fitting of data dominated by Poisson statistical uncertainties
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stoneking, M.R.; Den Hartog, D.J.
1996-06-01
The fitting of data by χ²-minimization is valid only when the uncertainties in the data are normally distributed. When analyzing spectroscopic or particle counting data at very low signal level (e.g., a Thomson scattering diagnostic), the uncertainties are distributed with a Poisson distribution. The authors have developed a maximum-likelihood method for fitting data that correctly treats the Poisson statistical character of the uncertainties. This method maximizes the total probability that the observed data are drawn from the assumed fit function using the Poisson probability function to determine the probability for each data point. The algorithm also returns uncertainty estimates for the fit parameters. They compare this method with a χ²-minimization routine applied to both simulated and real data. Differences in the returned fits are greater at low signal level (less than ~20 counts per measurement). The maximum-likelihood method is found to be more accurate and robust, returning a narrower distribution of values for the fit parameters with fewer outliers.
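The contrast drawn in the abstract above can be seen in the simplest possible case, a constant fit function. The sketch below is not the authors' algorithm: it assumes a single-parameter constant model, assumes the χ² variant that weights each point by its own count (variance taken equal to the observed count), and uses invented counts. Under those assumptions both estimates have closed forms: the Poisson maximum-likelihood estimate is the arithmetic mean, while the count-weighted χ² estimate is the harmonic mean, which is never larger.

```python
import math


def poisson_neg_log_likelihood(mu, counts):
    """Negative log of the product of Poisson probabilities P(n | mu)."""
    return sum(mu - n * math.log(mu) + math.lgamma(n + 1) for n in counts)


def fit_constant_ml(counts):
    """Poisson maximum-likelihood estimate of a constant rate (arithmetic mean)."""
    return sum(counts) / len(counts)


def fit_constant_weighted_chi2(counts):
    """Minimizer of sum((n - mu)**2 / n): the harmonic mean.

    Assumes every count is positive, because each point is weighted by 1/n.
    """
    return len(counts) / sum(1.0 / n for n in counts)


# Hypothetical low-signal counts, invented for illustration only.
counts = [3, 5, 2, 8, 4, 6]
mu_ml = fit_constant_ml(counts)
mu_chi2 = fit_constant_weighted_chi2(counts)
print(f"ML estimate: {mu_ml:.3f}, weighted chi-square estimate: {mu_chi2:.3f}")
```

The gap between the two estimates shrinks as the counts grow, which is consistent with the abstract's remark that the fits differ most at low signal level.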
Land cover mapping after the tsunami event over Nanggroe Aceh Darussalam (NAD) province, Indonesia
NASA Astrophysics Data System (ADS)
Lim, H. S.; MatJafri, M. Z.; Abdullah, K.; Alias, A. N.; Mohd. Saleh, N.; Wong, C. J.; Surbakti, M. S.
2008-03-01
Remote sensing offers an important means of detecting and analyzing temporal changes occurring in our landscape. This research used remote sensing to quantify land use/land cover changes in the Nanggroe Aceh Darussalam (NAD) province, Indonesia, on a regional scale. The objective of this paper is to assess the changes produced from the analysis of Landsat TM data. A Landsat TM image was used to develop a land cover classification map for 27 March 2005. Four supervised classification techniques (Maximum Likelihood, Minimum Distance-to-Mean, Parallelepiped, and Parallelepiped with Maximum Likelihood Classifier Tiebreaker) were applied to the satellite image. Training sites and accuracy assessment were needed for the supervised classification techniques. The training sites were established using polygons based on the colour image. High detection accuracy (>80%) and overall Kappa (>0.80) were achieved by the Parallelepiped with Maximum Likelihood Classifier Tiebreaker classifier in this study. This preliminary study has produced a promising result, indicating that land cover mapping can be carried out using remote sensing classification of satellite digital imagery.
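As a rough illustration of what a maximum likelihood classifier does with training sites, the sketch below fits a Gaussian to each class and assigns a pixel to the class with the highest log-likelihood. It is a simplification and not the study's procedure: independent bands (diagonal covariance) are assumed, class priors are ignored, and the two-band training values and class names are invented.

```python
import math


def train(samples_by_class):
    """Estimate per-band mean and variance for each class from training pixels."""
    stats = {}
    for label, pixels in samples_by_class.items():
        n = len(pixels)
        bands = len(pixels[0])
        means = [sum(p[b] for p in pixels) / n for b in range(bands)]
        # A small floor keeps the variance positive for near-constant bands.
        variances = [
            max(sum((p[b] - means[b]) ** 2 for p in pixels) / n, 1e-6)
            for b in range(bands)
        ]
        stats[label] = (means, variances)
    return stats


def log_likelihood(pixel, means, variances):
    """Gaussian log-likelihood of a pixel, treating bands as independent."""
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for x, m, v in zip(pixel, means, variances)
    )


def classify(pixel, stats):
    """Return the class label whose Gaussian gives the pixel the highest likelihood."""
    return max(stats, key=lambda label: log_likelihood(pixel, *stats[label]))


# Hypothetical two-band training sites, invented for illustration only.
training = {
    "water": [(20, 10), (22, 12), (18, 9)],
    "vegetation": [(40, 90), (45, 95), (38, 85)],
}
stats = train(training)
print(classify((21, 11), stats), classify((42, 92), stats))
```

A full implementation would normally use the complete band covariance matrix for each class; the diagonal form is used here only to keep the example dependency-free.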
Lehmann, A; Scheffler, Ch; Hermanussen, M
2010-02-01
Recent progress in modelling individual growth has been achieved by combining the principal component analysis and the maximum likelihood principle. This combination models growth even in incomplete sets of data and in data obtained at irregular intervals. We re-analysed late 18th century longitudinal growth of German boys from the boarding school Carlsschule in Stuttgart. The boys, aged 6-23 years, were measured at irregular 3-12 monthly intervals during the period 1771-1793. At the age of 18 years, mean height was 1652 mm, but height variation was large. The shortest boy reached 1474 mm, the tallest 1826 mm. Measured height closely paralleled modelled height, with mean difference of 4 mm, SD 7 mm. Seasonal height variation was found. Low growth rates occurred in spring and high growth rates in summer and autumn. The present study demonstrates that combining the principal component analysis and the maximum likelihood principle enables growth modelling in historic height data also. Copyright (c) 2009 Elsevier GmbH. All rights reserved.
Collinear Latent Variables in Multilevel Confirmatory Factor Analysis
van de Schoot, Rens; Hox, Joop
2014-01-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated in within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influence of the size of the intraclass correlation coefficient (ICC) and of the estimation method (maximum likelihood estimation with robust chi-squares and standard errors versus Bayesian estimation) on the convergence rate is investigated. The other variables of interest were rate of inadmissible solutions and the relative parameter and standard error bias on the between level. The results showed that inadmissible solutions were obtained when there was between level collinearity and the estimation method was maximum likelihood. In the within level multicollinearity condition, all of the solutions were admissible but the bias values were higher compared with the between level collinearity condition. Bayesian estimation appeared to be robust in obtaining admissible parameters but the relative bias was higher than for maximum likelihood estimation. Finally, as expected, high ICC produced less biased results compared to medium ICC conditions. PMID:29795827
Testing students’ e-learning via Facebook through Bayesian structural equation modeling
Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students’ intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook is re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods’ results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated. PMID:28886019
Fuzzy multinomial logistic regression analysis: A multi-objective programming approach
NASA Astrophysics Data System (ADS)
Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan
2017-05-01
Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large well-balanced datasets, Maximum Likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, especially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate parameters of the multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus the ML approach. Results show that the new proposed model outperforms ML in cases of small datasets.
NASA Astrophysics Data System (ADS)
Love, J. J.; Rigler, E. J.; Pulkkinen, A. A.; Riley, P.
2015-12-01
An examination is made of the hypothesis that the statistics of magnetic-storm-maximum intensities are the realization of a log-normal stochastic process. Weighted least-squares and maximum-likelihood methods are used to fit log-normal functions to -Dst storm-time maxima for years 1957-2012; bootstrap analysis is used to establish confidence limits on forecasts. Both methods provide fits that are reasonably consistent with the data; both methods also provide fits that are superior to those that can be made with a power-law function. In general, the maximum-likelihood method provides forecasts having tighter confidence intervals than those provided by weighted least-squares. From extrapolation of maximum-likelihood fits: a magnetic storm with intensity exceeding that of the 1859 Carrington event, -Dst > 850 nT, occurs about 1.13 times per century, with a wide 95% confidence interval of [0.42, 2.41] times per century; a 100-yr magnetic storm is identified as having a -Dst > 880 nT (greater than Carrington), but with a wide 95% confidence interval of [490, 1187] nT. This work is partially motivated by United States National Science and Technology Council and Committee on Space Research and International Living with a Star priorities and strategic plans for the assessment and mitigation of space-weather hazards.
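A maximum-likelihood log-normal fit of the kind mentioned in the abstract above has a closed form, sketched below under simplifying assumptions: the sample is treated as complete and unweighted, the intensities are invented numbers rather than Dst data, and the function names are illustrative. The fitted parameters are the mean and (population) standard deviation of the logarithms, and the probability of exceeding a threshold follows from the normal tail.

```python
import math


def fit_lognormal_ml(values):
    """Maximum-likelihood log-normal parameters: mean and std of log(values)."""
    logs = [math.log(v) for v in values]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return mu, sigma


def exceedance_probability(threshold, mu, sigma):
    """P(X > threshold) for a log-normal variable with parameters mu, sigma."""
    z = (math.log(threshold) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical storm-maximum intensities in nT, invented for illustration only.
intensities = [60.0, 85.0, 120.0, 150.0, 210.0, 300.0, 420.0]
mu, sigma = fit_lognormal_ml(intensities)
p_exceed = exceedance_probability(850.0, mu, sigma)
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}, P(X > 850) = {p_exceed:.4f}")
```

Turning such a per-storm probability into a rate per century would additionally require the number of storms per unit time, and confidence limits would require resampling such as the bootstrap analysis the abstract describes; neither is attempted here.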
NASA Astrophysics Data System (ADS)
Omi, Takahiro; Ogata, Yosihiko; Hirata, Yoshito; Aihara, Kazuyuki
2015-04-01
Because aftershock occurrences can cause significant seismic risks for a considerable time after the main shock, prospective forecasting of the intermediate-term aftershock activity as soon as possible is important. The epidemic-type aftershock sequence (ETAS) model with the maximum likelihood estimate effectively reproduces general aftershock activity including secondary or higher-order aftershocks and can be employed for the forecasting. However, because we cannot always expect the accurate parameter estimation from incomplete early aftershock data where many events are missing, such forecasting using only a single estimated parameter set (plug-in forecasting) can frequently perform poorly. Therefore, we here propose Bayesian forecasting that combines the forecasts by the ETAS model with various probable parameter sets given the data. By conducting forecasting tests of 1 month period aftershocks based on the first 1 day data after the main shock as an example of the early intermediate-term forecasting, we show that the Bayesian forecasting performs better than the plug-in forecasting on average in terms of the log-likelihood score. Furthermore, to improve forecasting of large aftershocks, we apply a nonparametric (NP) model using magnitude data during the learning period and compare its forecasting performance with that of the Gutenberg-Richter (G-R) formula. We show that the NP forecast performs better than the G-R formula in some cases but worse in other cases. Therefore, robust forecasting can be obtained by employing an ensemble forecast that combines the two complementary forecasts. Our proposed method is useful for a stable unbiased intermediate-term assessment of aftershock probabilities.
IDEA: Interactive Display for Evolutionary Analyses.
Egan, Amy; Mahurkar, Anup; Crabtree, Jonathan; Badger, Jonathan H; Carlton, Jane M; Silva, Joana C
2008-12-08
The availability of complete genomic sequences for hundreds of organisms promises to make obtaining genome-wide estimates of substitution rates, selective constraints and other molecular evolution variables of interest an increasingly important approach to addressing broad evolutionary questions. Two of the programs most widely used for this purpose are codeml and baseml, parts of the PAML (Phylogenetic Analysis by Maximum Likelihood) suite. A significant drawback of these programs is their lack of a graphical user interface, which can limit their user base and considerably reduce their efficiency. We have developed IDEA (Interactive Display for Evolutionary Analyses), an intuitive graphical input and output interface which interacts with PHYLIP for phylogeny reconstruction and with codeml and baseml for molecular evolution analyses. IDEA's graphical input and visualization interfaces eliminate the need to edit and parse text input and output files, reducing the likelihood of errors and improving processing time. Further, its interactive output display gives the user immediate access to results. Finally, IDEA can process data in parallel on a local machine or computing grid, allowing genome-wide analyses to be completed quickly. IDEA provides a graphical user interface that allows the user to follow a codeml or baseml analysis from parameter input through to the exploration of results. Novel options streamline the analysis process, and post-analysis visualization of phylogenies, evolutionary rates and selective constraint along protein sequences simplifies the interpretation of results. The integration of these functions into a single tool eliminates the need for lengthy data handling and parsing, significantly expediting access to global patterns in the data.
IDEA: Interactive Display for Evolutionary Analyses
Egan, Amy; Mahurkar, Anup; Crabtree, Jonathan; Badger, Jonathan H; Carlton, Jane M; Silva, Joana C
2008-01-01
Background: The availability of complete genomic sequences for hundreds of organisms promises to make obtaining genome-wide estimates of substitution rates, selective constraints and other molecular evolution variables of interest an increasingly important approach to addressing broad evolutionary questions. Two of the programs most widely used for this purpose are codeml and baseml, parts of the PAML (Phylogenetic Analysis by Maximum Likelihood) suite. A significant drawback of these programs is their lack of a graphical user interface, which can limit their user base and considerably reduce their efficiency. Results: We have developed IDEA (Interactive Display for Evolutionary Analyses), an intuitive graphical input and output interface which interacts with PHYLIP for phylogeny reconstruction and with codeml and baseml for molecular evolution analyses. IDEA's graphical input and visualization interfaces eliminate the need to edit and parse text input and output files, reducing the likelihood of errors and improving processing time. Further, its interactive output display gives the user immediate access to results. Finally, IDEA can process data in parallel on a local machine or computing grid, allowing genome-wide analyses to be completed quickly. Conclusion: IDEA provides a graphical user interface that allows the user to follow a codeml or baseml analysis from parameter input through to the exploration of results. Novel options streamline the analysis process, and post-analysis visualization of phylogenies, evolutionary rates and selective constraint along protein sequences simplifies the interpretation of results. The integration of these functions into a single tool eliminates the need for lengthy data handling and parsing, significantly expediting access to global patterns in the data. PMID:19061522
NASA Technical Reports Server (NTRS)
Clark, R. T.; Mccallister, R. D.
1982-01-01
The particular coding option identified as providing the best level of coding gain performance in an LSI-efficient implementation was the optimal constraint length five, rate one-half convolutional code. To determine the specific set of design parameters which optimally matches this decoder to the LSI constraints, a breadboard MCD (maximum-likelihood convolutional decoder) was fabricated and used to generate detailed performance trade-off data. The extensive performance testing data gathered during this design tradeoff study are summarized, and the functional and physical MCD chip characteristics are presented.
Gyro-based Maximum-Likelihood Thruster Fault Detection and Identification
NASA Technical Reports Server (NTRS)
Wilson, Edward; Lages, Chris; Mah, Robert; Clancy, Daniel (Technical Monitor)
2002-01-01
When building smaller, less expensive spacecraft, there is a need for intelligent fault tolerance vs. increased hardware redundancy. If fault tolerance can be achieved using existing navigation sensors, cost and vehicle complexity can be reduced. A maximum likelihood-based approach to thruster fault detection and identification (FDI) for spacecraft is developed here and applied in simulation to the X-38 space vehicle. The system uses only gyro signals to detect and identify hard, abrupt, single and multiple jet on- and off-failures. Faults are detected within one second and identified within one to five seconds.
Maximum likelihood estimation for life distributions with competing failure modes
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1979-01-01
Systems which are placed on test at time zero, function for a period and die at some random time were studied. Failure may be due to one of several causes or modes. The parameters of the life distribution may depend upon the levels of various stress variables the item is subject to. Maximum likelihood estimation methods are discussed. Specific methods are reported for the smallest extreme-value distributions of life. Monte-Carlo results indicate the methods to be promising. Under appropriate conditions, the location parameter estimates are nearly unbiased, the scale parameter estimate is slightly biased, and the asymptotic covariances are rapidly approached.
Gyre and gimble: a maximum-likelihood replacement for Patterson correlation refinement.
McCoy, Airlie J; Oeffner, Robert D; Millán, Claudia; Sammito, Massimo; Usón, Isabel; Read, Randy J
2018-04-01
Descriptions are given of the maximum-likelihood gyre method implemented in Phaser for optimizing the orientation and relative position of rigid-body fragments of a model after the orientation of the model has been identified, but before the model has been positioned in the unit cell, and also the related gimble method for the refinement of rigid-body fragments of the model after positioning. Gyre refinement helps to lower the root-mean-square atomic displacements between model and target molecular-replacement solutions for the test case of antibody Fab(26-10) and improves structure solution with ARCIMBOLDO_SHREDDER.
Richards, V. M.; Dai, W.
2014-01-01
A MATLAB toolbox for the efficient estimation of the threshold, slope, and lapse rate of the psychometric function is described. The toolbox enables the efficient implementation of the updated maximum-likelihood (UML) procedure. The toolbox uses an object-oriented architecture for organizing the experimental variables and computational algorithms, which provides experimenters with flexibility in experimental design and data management. Descriptions of the UML procedure and the UML Toolbox are provided, followed by toolbox use examples. Finally, guidelines and recommendations of parameter configurations are given. PMID:24671826
F-8C adaptive flight control extensions. [for maximum likelihood estimation
NASA Technical Reports Server (NTRS)
Stein, G.; Hartmann, G. L.
1977-01-01
An adaptive concept which combines gain-scheduled control laws with explicit maximum likelihood estimation (MLE) identification to provide the scheduling values is described. The MLE algorithm was improved by incorporating attitude data, estimating gust statistics for setting filter gains, and improving parameter tracking during changing flight conditions. A lateral MLE algorithm was designed to improve true air speed and angle of attack estimates during lateral maneuvers. Relationships between the pitch axis sensors inherent in the MLE design were examined and used for sensor failure detection. Design details and simulation performance are presented for each of the three areas investigated.
NASA Technical Reports Server (NTRS)
Battin, R. H.; Croopnick, S. R.; Edwards, J. A.
1977-01-01
The formulation of a recursive maximum likelihood navigation system employing reference position and velocity vectors as state variables is presented. Convenient forms of the required variational equations of motion are developed together with an explicit form of the associated state transition matrix needed to refer measurement data from the measurement time to the epoch time. Computational advantages accrue from this design in that the usual forward extrapolation of the covariance matrix of estimation errors can be avoided without incurring unacceptable system errors. Simulation data for earth orbiting satellites are provided to substantiate this assertion.
Eisenhauer, Philipp; Heckman, James J.; Mosso, Stefano
2015-01-01
We compare the performance of maximum likelihood (ML) and simulated method of moments (SMM) estimation for dynamic discrete choice models. We construct and estimate a simplified dynamic structural model of education that captures some basic features of educational choices in the United States in the 1980s and early 1990s. We use estimates from our model to simulate a synthetic dataset and assess the ability of ML and SMM to recover the model parameters on this sample. We investigate the performance of alternative tuning parameters for SMM. PMID:26494926
NASA Astrophysics Data System (ADS)
Abbasi, R. U.; Abu-Zayyad, T.; Amann, J. F.; Archbold, G.; Atkins, R.; Bellido, J. A.; Belov, K.; Belz, J. W.; Ben-Zvi, S. Y.; Bergman, D. R.; Boyer, J. H.; Burt, G. W.; Cao, Z.; Clay, R. W.; Connolly, B. M.; Dawson, B. R.; Deng, W.; Farrar, G. R.; Fedorova, Y.; Findlay, J.; Finley, C. B.; Hanlon, W. F.; Hoffman, C. M.; Holzscheiter, M. H.; Hughes, G. A.; Hüntemeyer, P.; Jui, C. C. H.; Kim, K.; Kirn, M. A.; Knapp, B. C.; Loh, E. C.; Maestas, M. M.; Manago, N.; Mannel, E. J.; Marek, L. J.; Martens, K.; Matthews, J. A. J.; Matthews, J. N.; O'Neill, A.; Painter, C. A.; Perera, L.; Reil, K.; Riehle, R.; Roberts, M. D.; Sasaki, M.; Schnetzer, S. R.; Seman, M.; Simpson, K. M.; Sinnis, G.; Smith, J. D.; Snow, R.; Sokolsky, P.; Song, C.; Springer, R. W.; Stokes, B. T.; Thomas, J. R.; Thomas, S. B.; Thomson, G. B.; Tupa, D.; Westerhoff, S.; Wiencke, L. R.; Zech, A.
2005-04-01
We present the results of a search for cosmic-ray point sources at energies in excess of 4.0×10^19 eV in the combined data sets recorded by the Akeno Giant Air Shower Array and High Resolution Fly's Eye stereo experiments. The analysis is based on a maximum likelihood ratio test using the probability density function for each event rather than requiring an a priori choice of a fixed angular bin size. No statistically significant clustering of events consistent with a point source is found.
The Equivalence of Two Methods of Parameter Estimation for the Rasch Model.
ERIC Educational Resources Information Center
Blackwood, Larry G.; Bradley, Edwin L.
1989-01-01
Two methods of estimating parameters in the Rasch model are compared. The equivalence of likelihood estimations from the model of G. J. Mellenbergh and P. Vijn (1981) and from usual unconditional maximum likelihood (UML) estimation is demonstrated. Mellenbergh and Vijn's model is a convenient method of calculating UML estimates. (SLD)
Using the β-binomial distribution to characterize forest health
S.J. Zarnoch; R.L. Anderson; R.M. Sheffield
1995-01-01
The β-binomial distribution is suggested as a model for describing and analyzing the dichotomous data obtained from programs monitoring the health of forests in the United States. Maximum likelihood estimation of the parameters is given as well as asymptotic likelihood ratio tests. The procedure is illustrated with data on dogwood anthracnose infection (caused...
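As an illustration of maximum likelihood estimation for the β-binomial model, the following minimal sketch evaluates the log-likelihood with log-gamma functions and picks the best parameter pair from a coarse grid. The grid search, the function names, and the example counts are assumptions made for illustration; they are not the procedure or the data of the study, and a real analysis would use a numerical optimizer.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def betabinom_logpmf(k, n, alpha, beta):
    """Log-probability of k positives out of n under a beta-binomial."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)

def log_likelihood(counts, alpha, beta):
    """Sum of log-probabilities over (k, n) pairs, one pair per plot."""
    return sum(betabinom_logpmf(k, n, alpha, beta) for k, n in counts)

def grid_mle(counts, grid):
    """Return the (alpha, beta) pair on the grid with the highest likelihood."""
    candidates = ((a, b) for a in grid for b in grid)
    return max(candidates, key=lambda ab: log_likelihood(counts, ab[0], ab[1]))

# Hypothetical data: (infected trees, trees examined) per plot.
plots = [(0, 10), (2, 10), (9, 10), (1, 10), (10, 10), (0, 10)]
grid = [0.1 * i for i in range(1, 51)]
print(grid_mle(plots, grid))
```

A likelihood ratio test against a plain binomial model could reuse `log_likelihood`, since the binomial is the limiting case of very large `alpha` and `beta` with a fixed ratio.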
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
A Note on Three Statistical Tests in the Logistic Regression DIF Procedure
ERIC Educational Resources Information Center
Paek, Insu
2012-01-01
Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…
Contributions to the Underlying Bivariate Normal Method for Factor Analyzing Ordinal Data
ERIC Educational Resources Information Center
Xi, Nuo; Browne, Michael W.
2014-01-01
A promising "underlying bivariate normal" approach was proposed by Jöreskog and Moustaki for use in the factor analysis of ordinal data. This was a limited information approach that involved the maximization of a composite likelihood function. Its advantage over full-information maximum likelihood was that very much less computation was…
Investigating the Impact of Uncertainty about Item Parameters on Ability Estimation
ERIC Educational Resources Information Center
Zhang, Jinming; Xie, Minge; Song, Xiaolan; Lu, Ting
2011-01-01
Asymptotic expansions of the maximum likelihood estimator (MLE) and weighted likelihood estimator (WLE) of an examinee's ability are derived while item parameter estimators are treated as covariates measured with error. The asymptotic formulae present the amount of bias of the ability estimators due to the uncertainty of item parameter estimators.…
Estimation of Complex Generalized Linear Mixed Models for Measurement and Growth
ERIC Educational Resources Information Center
Jeon, Minjeong
2012-01-01
Maximum likelihood (ML) estimation of generalized linear mixed models (GLMMs) is technically challenging because of the intractable likelihoods that involve high dimensional integrations over random effects. The problem is magnified when the random effects have a crossed design and thus the data cannot be reduced to small independent clusters. A…
A time series intervention analysis (TSIA) of dendrochronological data to infer the tree growth-climate-disturbance relations and forest disturbance history is described. Maximum likelihood is used to estimate the parameters of a structural time series model with components for ...
Zhang, Honghai; Chen, Lei
2011-03-01
The dhole (Cuon alpinus) is the only extant species in the genus Cuon (Carnivora: Canidae). In the present study, the complete mitochondrial genome of the dhole was sequenced. The total length is 16672 base pairs, which is the shortest in Canidae. Sequence analysis revealed that most mitochondrial genomic functional regions were highly consistent among canid animals except the CSB domain of the control region. The difference in length among the Canidae mitochondrial genome sequences is mainly due to the number of short tandem repeats in the CSB domain. Phylogenetic analysis was performed based on the concatenated data set of 14 mitochondrial genes of 8 canid animals by using maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI) methods. The genera Vulpes and Nyctereutes formed a sister group and split first within Canidae, followed by the genus Cuon. The divergence in the genus Canis was the latest. The divergence of domestic dogs after that of Canis lupus laniger is completely supported by all three topologies. Pairwise sequence divergence data of different mitochondrial genes among canid animals were also determined. Except for the synonymous substitutions in protein-coding genes, the control region exhibits the highest sequence divergences. The synonymous rates are approximately two to six times higher than those of the non-synonymous sites except for a slightly higher rate in the non-synonymous substitution between Cuon alpinus and Vulpes vulpes. 16S rRNA genes have a slightly faster sequence divergence than 12S rRNA and tRNA genes. Based on nucleotide substitutions of tRNA genes and rRNA genes, the times since divergence between dhole and other canid animals, and between domestic dogs and three subspecies of wolves were evaluated. The result indicates that Vulpes and Nyctereutes have a close phylogenetic relationship and the divergence of Nyctereutes is a little earlier. The Tibetan wolf may be an archaic lineage within wolf subspecies. The genetic distance between wolves and domestic dogs is less than that among different subspecies of wolves. The domestication of dogs was about 1.56-1.92 million years ago or even earlier.
Maximum parsimony, substitution model, and probability phylogenetic trees.
Weng, J F; Thomas, D A; Mareels, I
2011-01-01
The problem of inferring phylogenies (phylogenetic trees) is one of the main problems in computational biology. There are three main methods for inferring phylogenies-Maximum Parsimony (MP), Distance Matrix (DM) and Maximum Likelihood (ML), of which the MP method is the most well-studied and popular method. In the MP method the optimization criterion is the number of substitutions of the nucleotides computed by the differences in the investigated nucleotide sequences. However, the MP method is often criticized as it only counts the substitutions observable at the current time and all the unobservable substitutions that really occur in the evolutionary history are omitted. In order to take into account the unobservable substitutions, some substitution models have been established and they are now widely used in the DM and ML methods but these substitution models cannot be used within the classical MP method. Recently the authors proposed a probability representation model for phylogenetic trees and the reconstructed trees in this model are called probability phylogenetic trees. One of the advantages of the probability representation model is that it can include a substitution model to infer phylogenetic trees based on the MP principle. In this paper we explain how to use a substitution model in the reconstruction of probability phylogenetic trees and show the advantage of this approach with examples.
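The classical MP criterion described above, counting the minimum number of substitutions a tree requires, can be sketched with Fitch's algorithm for a fixed rooted binary tree. This illustrates only the standard parsimony count, not the authors' probability representation model; the tree encoding (nested tuples of taxon names) and the toy alignment are assumptions made for the example.

```python
def fitch_site(tree, states):
    """Return (candidate state set, substitutions) for one alignment site.

    `tree` is a taxon name (leaf) or a pair (left, right) of subtrees.
    `states` maps each taxon name to its nucleotide at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra substitution.
        return common, left_cost + right_cost
    # The children disagree: one substitution is needed on this node.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_length(tree, alignment):
    """Total minimum number of substitutions over all sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_site(tree, site)[1]
    return total

# Toy alignment with hypothetical taxa t1..t4.
alignment = {"t1": "AC", "t2": "AC", "t3": "GC", "t4": "GT"}
print(parsimony_length((("t1", "t2"), ("t3", "t4")), alignment))  # 2
print(parsimony_length((("t1", "t3"), ("t2", "t4")), alignment))  # 3
```

The first topology needs fewer substitutions and would be preferred under MP. As the abstract notes, this count sees only the differences observable in present-day sequences.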
Baazizi, Ratiba; Mahapatra, Mana; Clarke, Brian Donald; Ait-Oudhia, Khatima; Khelef, Djamel; Parida, Satya
2017-01-01
Peste des petits ruminants (PPR) is a contagious disease listed by the World Organisation for Animal Health (OIE) as being a specific hazard. It affects sheep, goats, and wild ungulates, and is prevalent throughout the developing world, particularly Asia, the Middle East, and Africa. PPR has been targeted for eradication by 2030 by the Food and Agriculture Organization of the United Nations (FAO) and the OIE, after the successful eradication of the related disease, rinderpest in cattle. PPR was first reported in 1942 in the Ivory Coast in Western Africa and has since extended its range in Asia, the Middle East, and Africa, posing an immediate threat of incursion into Europe, South East Asia and South Africa. Although robust vaccines are available, the use of these vaccines in a systematic and rational manner is not widespread, resulting in this devastating disease becoming an important neglected tropical disease in the developing world. We isolated and characterized the PPR virus from an outbreak in Cheraga, northern Algeria, during October 2015 by analyzing the partial N-gene sequence in comparison with other viruses from the Maghreb region, as well as by sequencing the full-length viral genome and performing real-time RT-PCR on clinical samples. Maximum-likelihood and Bayesian temporal and phylogeographic analyses were performed to assess the persistence and spread of PPRV circulation from Eastern Africa in the Maghreb region of North Africa. Recent PPR outbreaks in Cheraga, in the northern part of Algiers (October 2015) and North-West Morocco (June 2015) highlight that PPRV has spread to the northern border of North Africa and may pose a threat of introduction to Europe. Phylogeographic analysis suggests that lineage IV PPRV has spread from Eastern Africa, most likely from the Sudan 2000 outbreak, into Northern Africa, resulting in the 2008 Moroccan outbreak. Maximum-likelihood and Bayesian analyses show that these North African viruses cluster closely together, suggesting the existence of continual regional circulation. Considering that the same virus is circulating in Algeria, Morocco and Tunisia, implementation of a common Maghreb PPR eradication strategy would be beneficial for the region.
Fourment, Mathieu; Holmes, Edward C
2014-07-24
Early methods for estimating divergence times from gene sequence data relied on the assumption of a molecular clock. More sophisticated methods were created to model rate variation and used auto-correlation of rates, local clocks, or the so called "uncorrelated relaxed clock" where substitution rates are assumed to be drawn from a parametric distribution. In the case of Bayesian inference methods the impact of the prior on branching times is not clearly understood, and if the amount of data is limited the posterior could be strongly influenced by the prior. We develop a maximum likelihood method--Physher--that uses local or discrete clocks to estimate evolutionary rates and divergence times from heterochronous sequence data. Using two empirical data sets we show that our discrete clock estimates are similar to those obtained by other methods, and that Physher outperformed some methods in the estimation of the root age of an influenza virus data set. A simulation analysis suggests that Physher can outperform a Bayesian method when the real topology contains two long branches below the root node, even when evolution is strongly clock-like. These results suggest it is advisable to use a variety of methods to estimate evolutionary rates and divergence times from heterochronous sequence data. Physher and the associated data sets used here are available online at http://code.google.com/p/physher/.
Wu, F Z; Ma, J; Hu, X N; Zeng, L
2015-02-01
The mealybug species Phenacoccus solenopsis (P. solenopsis) has caused much agricultural damage since its recent invasion in China. However, the source of this invasion remains unclear. This study uses molecular methods to clarify the relationships among different populations of P. solenopsis from China, USA, Pakistan, India, and Vietnam to determine the geographic origin of the introduction of this species into China. P. solenopsis samples were collected from 25 different locations in three provinces of Southern China. Samples from the USA, Pakistan, and Vietnam were also obtained. Parts of the mitochondrial genes for cytochrome oxidase I (COI) were sequenced for each sample. Homologous DNA sequences of the samples from the USA and India were downloaded from GenBank. Two haplotypes were found in China. The first was found in most samples from the Guangdong, Guangxi, and Hainan populations in the China and Pakistan groups, and the second in a few samples from the Guangdong, Guangxi, and Hainan populations in the China, Pakistan, India, and Vietnam groups. As shown in the maximum likelihood trees constructed using the COI sequences, these samples belonged to two clades. Phylogenetic analysis suggested that most P. solenopsis mealybugs in Southern China are probably closely related to populations in Pakistan. The variation, relationship, expansion, and probable geographic origin of P. solenopsis mealybugs in Southern China are also discussed.
Zhang, Yanhong; Pham, Nancy Kim; Zhang, Huixian; Lin, Junda; Lin, Qiang
2014-01-01
The population genetics of seahorses is likely influenced by their species-specific ecological requirements and life-history traits. In the present study, partial sequences of mitochondrial cytochrome b (cytb) and control region (CR) were obtained from 50 Hippocampus mohnikei and 92 H. trimaculatus from four zoogeographical zones. A total of 780 base pairs of cytb gene were sequenced to characterize mitochondrial DNA (mtDNA) diversity. The mtDNA marker revealed high haplotype diversity, low nucleotide diversity, and a lack of population structure across both populations of H. mohnikei and H. trimaculatus. A neighbour-joining (NJ) tree of cytb gene sequences showed that H. mohnikei haplotypes formed one cluster. A maximum likelihood (ML) tree of cytb gene sequences showed that H. trimaculatus belonged to one lineage. The star-like pattern median-joining network of cytb and CR markers indicated a previous demographic expansion of H. mohnikei and H. trimaculatus. The cytb and CR data sets exhibited a unimodal mismatch distribution, which may have resulted from population expansion. Mismatch analysis suggested that the expansion was initiated about 276,000 years ago for H. mohnikei and about 230,000 years ago for H. trimaculatus during the middle Pleistocene period. This study indicates a possible signature of genetic variation and population expansion in two seahorses under complex marine environments. PMID:25144384
A RAD-based phylogenetics for Orestias fishes from Lake Titicaca.
Takahashi, Tetsumi; Moreno, Edmundo
2015-12-01
The fish genus Orestias is endemic to the Andes highlands, and Lake Titicaca is the centre of the species diversity of the genus. Previous phylogenetic studies based on a single locus of mitochondrial and nuclear DNA strongly support the monophyly of a group composed of many of the species endemic to the Lake Titicaca basin (the Lake Titicaca radiation), but the relationships among the species in the radiation remain unclear. Recently, restriction site-associated DNA (RAD) sequencing, which can produce a vast number of short sequences from various loci of nuclear DNA, has emerged as a useful way to resolve complex phylogenetic problems. To propose a new phylogenetic hypothesis of Orestias fishes of the Lake Titicaca radiation, we conducted a cluster analysis based on morphological similarities among fish samples and a molecular phylogenetic analysis based on RAD sequencing. From a morphological cluster analysis, we recognised four species groups in the radiation, and three of the four groups were resolved as monophyletic groups in maximum-likelihood trees based on RAD sequencing data. The other morphology-based group was not resolved as a monophyletic group in molecular phylogenies, and some members of the group diverged from its sister group close to the root of the Lake Titicaca radiation. The evolution of these fishes is discussed from the phylogenetic relationships. Copyright © 2015 Elsevier Inc. All rights reserved.
Lee, Soohyun; Seo, Chae Hwa; Alver, Burak Han; Lee, Sanghyuk; Park, Peter J
2015-09-03
RNA-seq has been widely used for genome-wide expression profiling. RNA-seq data typically consists of tens of millions of short sequenced reads from different transcripts. However, due to sequence similarity among genes and among isoforms, the source of a given read is often ambiguous. Existing approaches for estimating expression levels from RNA-seq reads tend to compromise between accuracy and computational cost. We introduce a new approach for quantifying transcript abundance from RNA-seq data. EMSAR (Estimation by Mappability-based Segmentation And Reclustering) groups reads according to the set of transcripts to which they are mapped and finds maximum likelihood estimates using a joint Poisson model for each optimal set of segments of transcripts. The method uses nearly all mapped reads, including those mapped to multiple genes. With an efficient transcriptome indexing based on modified suffix arrays, EMSAR minimizes the use of CPU time and memory while achieving accuracy comparable to the best existing methods. EMSAR is a method for quantifying transcripts from RNA-seq data with high accuracy and low computational cost. EMSAR is available at https://github.com/parklab/emsar.
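The idea of grouping reads by the set of transcripts they map to and then finding maximum likelihood abundances can be illustrated with a generic expectation-maximization (EM) sketch. This is a plain multinomial mixture over compatibility classes, swapped in for illustration; it is not EMSAR's joint Poisson model over segments, it ignores transcript length and mappability, and all names and counts are hypothetical.

```python
def em_abundance(class_counts, n_iter=200):
    """Estimate transcript read fractions from compatibility-class counts.

    `class_counts` maps a tuple of transcript names (the set a group of
    reads maps to) to the number of reads in that group.
    """
    transcripts = sorted({t for cls in class_counts for t in cls})
    theta = {t: 1.0 / len(transcripts) for t in transcripts}
    total = sum(class_counts.values())
    for _ in range(n_iter):
        expected = dict.fromkeys(transcripts, 0.0)
        # E-step: split each class's reads in proportion to current theta.
        for cls, count in class_counts.items():
            if count == 0:
                continue
            denom = sum(theta[t] for t in cls)
            for t in cls:
                expected[t] += count * theta[t] / denom
        # M-step: renormalize expected read counts into fractions.
        theta = {t: expected[t] / total for t in transcripts}
    return theta

# Hypothetical example: 60 reads are ambiguous between two transcripts.
counts = {("tx1",): 30, ("tx2",): 10, ("tx1", "tx2"): 60}
print(em_abundance(counts))
```

In this example the ambiguous reads are shared according to the evidence from uniquely mapped reads, so the estimate converges to fractions of about 0.75 and 0.25.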
Phylogeny and active ingredients of artificial Ophiocordyceps lanpingensis ascomata
NASA Astrophysics Data System (ADS)
Chen, Zihong; Xu, Ling; Yu, Hong; Zeng, Wenbo; Dai, Yongdong; Wang, Yuanbing
2018-04-01
The morphological characters, phylogeny and functional components of artificial Ophiocordyceps lanpingensis, a species related to O. sinensis, were evaluated. The ascomata of O. lanpingensis were induced with its asexual strain, HLANY0707, and their microscopic features were described. Phylogeny was analyzed with ITS-5.8S sequences of HLANY0707, its cultured stroma, and 39 related sequences of Hirsutella and Ophiocordyceps based on the maximum likelihood tree. Six nucleosides of artificial O. lanpingensis, natural O. lanpingensis and natural O. sinensis were compared with HPLC analysis. Artificial ascomata of O. lanpingensis could be massively produced with HLANY0707 and had microscopic features similar to those of the natural specimens. Phylogenetic analysis showed that both the artificial and natural O. lanpingensis had a closer relationship with O. sinensis, O. xuefengensis, H. uncinata and O. robertsii, species for which mass-cultured ascomata have not been reported. Nucleosides of artificial O. lanpingensis were very similar to those of natural O. sinensis, implying a promising application prospect of artificial O. lanpingensis as an alternative to O. sinensis. The results indicate a promising way to develop artificial O. lanpingensis and conserve the rare and endangered species O. sinensis.
Einstein X-ray survey of the Pleiades - The dependence of X-ray emission on stellar age
NASA Technical Reports Server (NTRS)
Micela, G.; Sciortino, S.; Serio, S.; Vaiana, G. S.; Bookbinder, J.; Golub, L.; Harnden, F. R., Jr.; Rosner, R.
1985-01-01
The data obtained with two pointed observations of 1 deg by 1 deg fields of the Pleiades region have been analyzed, and the results are presented. The maximum-likelihood X-ray luminosity functions for the Pleiades G and K stars in the cluster are derived, and it is shown that, for the G stars, the Pleiades X-ray luminosity function is significantly brighter than the corresponding function for Hyades G dwarf stars. This finding indicates a dependence of X-ray luminosity on stellar age, which is confirmed by comparison of the same data with median X-ray luminosities of pre-main sequence and local disk population dwarf G stars. It is suggested that the significantly larger number of bright X-ray sources associated with G stars than with K stars, the lack of detection of M stars, and the relatively rapid rotation of the Pleiades K stars can be explained in terms of the onset of internal differential rotation near the convective envelope-radiative core interface after the spin-up phase during evolution to the main sequence.
Baum, D A; Small, R L; Wendel, J F
1998-06-01
The phylogeny of baobab trees was analyzed using four data sets: chloroplast DNA restriction sites, sequences of the chloroplast rpl16 intron, sequences of the internal transcribed spacer (ITS) region of nuclear ribosomal DNA, and morphology. We sampled each of the eight species of Adansonia plus three outgroup taxa from tribe Adansonieae. These data were analyzed singly and in combination using parsimony. ITS and morphology provided the greatest resolution and were largely concordant. The two chloroplast data sets showed concordance with one another but showed significant conflict with ITS and morphology. A possible explanation for the conflict is genealogical discordance within the Malagasy Longitubae, perhaps due to introgression events. A maximum-likelihood analysis of branching times shows that the dispersal between Africa and Australia occurred well after the fragmentation of Gondwana and therefore involved overwater dispersal. The phylogeny does not permit unambiguous reconstruction of floral evolution but suggests the plausible hypothesis that hawkmoth pollination was ancestral in Adansonia and that there were two parallel switches to pollination by mammals in the genus.
Wu, Hai-Yan; Ji, Xiao-Yu; Yu, Wei-Wei; Du, Yu-Zhou
2014-03-10
We present the complete mitogenome of a stonefly, Cryptoperla stilifera Sivec (Plecoptera; Peltoperlidae). The mitogenome was a circular molecule consisting of 15,633 nucleotides, 37 genes and an A+T-rich region. The C. stilifera mitogenome was similar to the Pteronarcys princeps mitogenome (Plecoptera; Pteronarcyidae). All transfer RNA genes (tRNAs) had typical cloverleaf secondary structures except for trnSer (AGN), where the stem-loop structure of the dihydrouridine (DHU) arm was missing. The A+T-rich region of C. stilifera had two stem-loops and each had two interlink. Three conserved sequence blocks (CSBs) were present in the A+T-rich regions of C. stilifera, Peltoperla tarteri and Peltoperla arcuata. Moreover, many polynucleotide stretches (Poly N, N=A, T and C) were present in the A+T-rich region of C. stilifera. Phylogenetic relationships of Polyneopteran species were constructed based on the nucleotide sequences of 13 protein coding genes (PCGs). Both maximum likelihood (ML) and Bayesian inference (BI) analyses supported Grylloblattodea as the sister group to Plecoptera+Dermaptera and Embiidina and Phasmatodea as sister groups. Copyright © 2014 Elsevier B.V. All rights reserved.
Zhao, Fang; Ma, Jun-Ying; Cai, Hui-Xia; Su, Jian-Ping; Hou, Zhi-Bin; Zhang, Tong-Zuo; Lin, Gong-Hua
2014-07-01
Cestode larvae spend one phase of their two-phase life cycle in the viscera of rodents, but cases of cestodes infecting subterranean rodents have only rarely been observed. To gain some experimental insight into this phenomenon, we captured approximately 300 plateau zokors (Eospalax baileyi), a typical subterranean rodent inhabiting the Qinghai-Tibet Plateau, and examined their livers for the presence of cysts. In total, we collected five cysts, and using a mitochondrial gene (cox1) and two nuclear genes (pepck and pold) as genetic markers, we were able to analyze the taxonomy of the cysts. Both the maximum likelihood and Bayesian methods showed that the cysts form a monophyletic group with Taenia mustelae, while Kimura 2-parameter distances and the number of different sites between our sequences and T. mustelae were far smaller than those found between the examined sequences and other Taeniidae species. These results, alongside supporting paraffin section histology, imply that the cysts found in plateau zokors can be regarded as larvae of T. mustelae, illustrating that zokors are a newly discovered intermediate host record of this parasite.
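The Kimura 2-parameter distance used above has a standard closed form, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion. The following is a minimal Python sketch of that formula, assuming two pre-aligned DNA strings; the function name and the choice to skip gaps and ambiguous bases are illustrative and are not taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (an assumption of this sketch)
        compared += 1
        if a == b:
            continue
        # A transition stays within purines or within pyrimidines.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For two sequences of ten sites that differ by a single transition, `k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")` evaluates -0.5 ln(0.8), slightly above the uncorrected proportion of 0.1.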
Karabanov, Dmitry P; Bekker, Eugeniya I; Shiel, Russell J; Kotov, Alexey A
2018-03-27
We found a Holarctic microcrustacean, Daphnia galeata Sars, 1863 (Cladocera: Daphniidae), in the Lower Lakes of South Australia. This taxon had never been detected in continental Australia before. Its identity was confirmed by the sequences of mitochondrial COI, 12S and 16S and nuclear 18S and 28S genes. A maximum likelihood tree from a dataset combining 12S + 16S mitochondrial sequences and a split network of the COI haplotypes are provided, but the resolution of both genes is not sufficient to reveal the exact region of the Holarctic from which D. galeata was introduced to Australia; the vector of its invasion is also unknown. We hypothesize that the appearance of D. galeata in the Lower Lakes of the Murray River is related to a recent anthropogenic eutrophication of water bodies in this region, keeping in mind that examples of successful invasion of some European lakes by D. galeata after their eutrophication are well known. We also hypothesize that the establishment of populations of this non-indigenous taxon in Australia might have a strong negative impact on native lake biota.
ZHAO, Fang; ZHANG, Ming-Xia; MA, Jun-Ying; CAI, Hui-Xia; SU, Jian-Ping; HOU, Zhi-Bin; ZHANG, Tong-Zuo; LIN, Gong-Hua
2014-01-01
Cestode larvae spend one phase of their two-phase life cycle in the viscera of rodents, but cases of cestodes infecting subterranean rodents have only rarely been observed. To gain some experimental insight into this phenomenon, we captured approximately 300 plateau zokors (Eospalax baileyi), a typical subterranean rodent inhabiting the Qinghai-Tibet Plateau, and examined their livers for the presence of cysts. In total, we collected five cysts, and using a mitochondrial gene (cox1) and two nuclear genes (pepck and pold) as genetic markers, we were able to analyze the taxonomy of the cysts. Both the maximum likelihood and Bayesian methods showed that the cysts form a monophyletic group with Taenia mustelae, while Kimura 2-parameter distances and the number of different sites between our sequences and T. mustelae were far smaller than those found between the examined sequences and other Taeniidae species. These results, alongside supporting paraffin section histology, imply that the cysts found in plateau zokors can be regarded as larvae of T. mustelae, illustrating that zokors are a newly discovered intermediate host record of this parasite. PMID:25017751
bcgTree: automatized phylogenetic tree building from bacterial core genomes.
Ankenbrand, Markus J; Keller, Alexander
2016-10-01
The need for multi-gene analyses in scientific fields such as phylogenetics and DNA barcoding has increased in recent years. In particular, these approaches are increasingly important for differentiating bacterial species, where reliance on the standard 16S rDNA marker can result in poor resolution. Additionally, the assembly of bacterial genomes has become a standard task due to advances in next-generation sequencing technologies. We created a bioinformatic pipeline, bcgTree, which uses assembled bacterial genomes, either from databases or from the user's own sequencing results, to reconstruct their phylogenetic history. The pipeline automatically extracts 107 essential single-copy core genes, found in a majority of bacteria, using hidden Markov models and performs a partitioned maximum-likelihood analysis. Here, we describe the workflow of bcgTree and, as a proof-of-concept, its usefulness in resolving the phylogeny of 293 publicly available bacterial strains of the genus Lactobacillus. We also evaluate its performance in both low- and high-level taxonomy test sets. The tool is freely available at GitHub ( https://github.com/iimog/bcgTree ) and our institutional homepage ( http://www.dna-analytics.biozentrum.uni-wuerzburg.de ).
Yu, Yi-Kuo; Capra, John A.; Stojmirović, Aleksandar; Landsman, David; Altschul, Stephen F.
2015-01-01
Motivation: DNA and protein patterns are usefully represented by sequence logos. However, the methods for logo generation in common use lack a proper statistical basis, and are non-optimal for recognizing functionally relevant alignment columns. Results: We redefine the information at a logo position as a per-observation multiple alignment log-odds score. Such scores are positive or negative, depending on whether a column’s observations are better explained as arising from relatedness or chance. Within this framework, we propose distinct normalized maximum likelihood and Bayesian measures of column information. We illustrate these measures on High Mobility Group B (HMGB) box proteins and a dataset of enzyme alignments. Particularly in the context of protein alignments, our measures improve the discrimination of biologically relevant positions. Availability and implementation: Our new measures are implemented in an open-source Web-based logo generation program, which is available at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/logoddslogo/index.html. A stand-alone version of the program is also available from this site. Contact: altschul@ncbi.nlm.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25294922
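For context, the conventional logo height that the abstract contrasts its new measures with is commonly computed as the relative entropy of a column's observed letter frequencies against a background distribution. The sketch below shows only that conventional baseline, not the normalized maximum likelihood or Bayesian measures the authors propose; the function name and the absence of pseudocounts or small-sample corrections are simplifying assumptions.

```python
import math
from collections import Counter


def column_information(column, background):
    """Relative entropy (in bits) of a column's letter frequencies vs. a background.

    column: string of letters observed in one alignment column.
    background: dict mapping each letter to its background probability.
    """
    counts = Counter(column)
    n = len(column)
    info = 0.0
    for letter, count in counts.items():
        f = count / n  # observed frequency; no pseudocounts in this sketch
        info += f * math.log2(f / background[letter])
    return info


uniform_dna = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
conserved = column_information("AAAA", uniform_dna)  # fully conserved column
mixed = column_information("ACGT", uniform_dna)      # matches the background
```

With a uniform DNA background, a fully conserved column reaches the maximum of 2 bits and a column that matches the background scores 0 bits. Note that this quantity is never negative, whereas the abstract's log-odds scores can be.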
A Maximum-Likelihood Approach to Force-Field Calibration.
Zaborowski, Bartłomiej; Jagieła, Dawid; Czaplewski, Cezary; Hałabis, Anna; Lewandowska, Agnieszka; Żmudzińska, Wioletta; Ołdziej, Stanisław; Karczyńska, Agnieszka; Omieczynski, Christian; Wirecki, Tomasz; Liwo, Adam
2015-09-28
A new approach to the calibration of the force fields is proposed, in which the force-field parameters are obtained by maximum-likelihood fitting of the calculated conformational ensembles to the experimental ensembles of training system(s). The maximum-likelihood function is composed of logarithms of the Boltzmann probabilities of the experimental conformations, calculated with the current energy function. Because the theoretical distribution is given in the form of the simulated conformations only, the contributions from all of the simulated conformations, with Gaussian weights in the distances from a given experimental conformation, are added to give the contribution to the target function from this conformation. In contrast to earlier methods for force-field calibration, the approach does not suffer from the arbitrariness of dividing the decoy set into native-like and non-native structures; however, if such a division is made instead of using Gaussian weights, application of the maximum-likelihood method results in the well-known energy-gap maximization. The computational procedure consists of cycles of decoy generation and maximum-likelihood-function optimization, which are iterated until convergence is reached. The method was tested with Gaussian distributions and then applied to the physics-based coarse-grained UNRES force field for proteins. The NMR structures of the tryptophan cage, a small α-helical protein, determined at three temperatures (T = 280, 305, and 313 K) by Hałabis et al. (J. Phys. Chem. B 2012, 116, 6898-6907), were used. Multiplexed replica-exchange molecular dynamics was used to generate the decoys. The iterative procedure exhibited steady convergence.
Three variants of optimization were tried: optimization of the energy-term weights alone and use of the experimental ensemble of the folded protein only at T = 280 K (run 1); optimization of the energy-term weights and use of experimental ensembles at all three temperatures (run 2); and optimization of the energy-term weights and the coefficients of the torsional and multibody energy terms and use of experimental ensembles at all three temperatures (run 3). The force fields were subsequently tested with a set of 14 α-helical and two α + β proteins. Optimization run 1 resulted in better agreement with the experimental ensemble at T = 280 K compared with optimization run 2 and in comparable performance on the test set but poorer agreement of the calculated folding temperature with the experimental folding temperature. Optimization run 3 resulted in the best fit of the calculated ensembles to the experimental ones for the tryptophan cage but in much poorer performance on the training set, suggesting that use of a small α-helical protein for extensive force-field calibration resulted in overfitting of the data for this protein at the expense of transferability. The optimized force field resulting from run 2 was found to fold 13 of the 14 tested α-helical proteins and one small α + β protein with the correct topologies; the average structures of 10 of them were predicted with accuracies of about 5 Å C(α) root-mean-square deviation or better. Test simulations with an additional set of 12 α-helical proteins demonstrated that this force field performed better on α-helical proteins than the previous parametrizations of UNRES. The proposed approach is applicable to any problem of maximum-likelihood parameter estimation when the contributions to the maximum-likelihood function cannot be evaluated at the experimental points and the dimension of the configurational space is too high to construct histograms of the experimental distributions.
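One possible reading of the target function described in this abstract can be sketched as follows: each experimental conformation contributes the logarithm of a sum over simulated conformations (decoys) of their Boltzmann probabilities, weighted by a Gaussian in the distance to that experimental conformation. This is a toy illustration only. The function name, the parameters `kT` and `sigma`, and the normalization over a fixed decoy list are assumptions of the sketch and are not taken from the paper.

```python
import math


def log_likelihood(distances, energies, kT=1.0, sigma=1.0):
    """Toy Gaussian-weighted log-likelihood over a fixed set of decoys.

    distances[i][j]: distance between experimental conformation i and decoy j.
    energies[j]: energy of decoy j under the current (hypothetical) parameters.
    """
    e_min = min(energies)  # shift energies for numerical stability
    boltz = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(boltz)
    probs = [b / z for b in boltz]  # Boltzmann probabilities of the decoys
    total = 0.0
    for row in distances:
        weights = [math.exp(-d * d / (2.0 * sigma * sigma)) for d in row]
        total += math.log(sum(w * p for w, p in zip(weights, probs)))
    return total


# One experimental conformation: decoy 0 is close to it, decoy 1 is far away.
dist = [[0.0, 3.0]]
good = log_likelihood(dist, energies=[0.0, 2.0])  # nearby decoy has low energy
bad = log_likelihood(dist, energies=[2.0, 0.0])   # distant decoy has low energy
```

In this toy setting the likelihood is higher when the energy function favors the decoy that lies near the experimental conformation, which is the behavior a parameter optimizer would be rewarded for.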
The discovery of Halictivirus resolves the Sinaivirus phylogeny.
Bigot, Diane; Dalmon, Anne; Roy, Bronwen; Hou, Chunsheng; Germain, Michèle; Romary, Manon; Deng, Shuai; Diao, Qingyun; Weinert, Lucy A; Cook, James M; Herniou, Elisabeth A; Gayral, Philippe
2017-11-01
By providing pollination services, bees are among the most important insects, both in ecological and economical terms. Combined next-generation and classical sequencing approaches were applied to discover and study new insect viruses potentially harmful to bees. A bioinformatics virus discovery pipeline was used on individual Illumina transcriptomes of 13 wild bees from three species from the genus Halictus and 30 ants from six species of the genera Messor and Aphaenogaster. This allowed the discovery and description of three sequences of a new virus termed Halictus scabiosae Adlikon virus (HsAV). Phylogenetic analyses of ORF1, RNA-dependent RNA-polymerase (RdRp) and capsid genes showed that HsAV is closely related to (+)ssRNA viruses of the unassigned Sinaivirus genus but distant enough to belong to a different new genus we called Halictivirus. In addition, our study of ant transcriptomes revealed the first four sinaivirus sequences from ants (Messor barbarus, M. capitatus and M. concolor). Maximum likelihood phylogenetic analyses were performed on a 594 nt fragment of the ORF1/RdRp region from 84 sinaivirus sequences, including 31 new Lake Sinai viruses (LSVs) from honey bees collected in five countries across the globe and the four ant viral sequences. The phylogeny revealed four main clades potentially representing different viral species infecting honey bees. Moreover, the ant viruses belonged to the LSV4 clade, suggesting a possible cross-species transmission between bees and ants. Lastly, wide honey bee screening showed that all four LSV clades have worldwide distributions with no obvious geographical segregation.
Statistical inference of the generation probability of T-cell receptors from sequence repertoires.
Murugan, Anand; Mora, Thierry; Walczak, Aleksandra M; Callan, Curtis G
2012-10-02
Stochastic rearrangement of germline V-, D-, and J-genes to create variable coding sequence for certain cell surface receptors is at the origin of immune system diversity. This process, known as "VDJ recombination", is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Because any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on nonproductive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.
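The core difficulty named here, that one observed sequence may be compatible with several hidden generative events, can be illustrated with a toy expectation-maximization (EM) loop for a single categorical distribution. This sketch assumes EM as the fitting procedure and a deliberately simplified model with one hidden event per sequence; the abstract does not specify the algorithm, and the event labels and function name are hypothetical.

```python
def em_hidden_events(observations, events, n_iter=200):
    """Maximum-likelihood event probabilities from ambiguous observations.

    observations: list of sets; each set holds the hidden events that could
                  have produced that observation.
    events: list of all possible hidden events.
    """
    theta = {e: 1.0 / len(events) for e in events}  # uniform starting point
    for _ in range(n_iter):
        expected = {e: 0.0 for e in events}
        # E-step: split each observation among its compatible events
        # in proportion to the current probabilities.
        for compatible in observations:
            norm = sum(theta[e] for e in compatible)
            for e in compatible:
                expected[e] += theta[e] / norm
        # M-step: renormalize the expected counts.
        total = sum(expected.values())
        theta = {e: expected[e] / total for e in events}
    return theta


obs = [{"a"}] * 4 + [{"a", "b"}] * 4 + [{"c"}] * 2
theta = em_hidden_events(obs, ["a", "b", "c"])
```

Here the likelihood is proportional to a^4 (a + b)^4 c^2, which is maximized at a = 0.8, b = 0, c = 0.2, so the ambiguous observations end up attributed to event "a".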
Aliisedimentitalea scapharcae gen. nov., sp. nov., isolated from ark shell Scapharca broughtonii.
Kim, Young-Ok; Park, Sooyeon; Nam, Bo-Hye; Kim, Dong-Gyun; Won, Sung-Min; Park, Ji-Min; Yoon, Jung-Hoon
2015-08-01
A Gram-negative, aerobic, non-spore-forming, motile and ovoid or rod-shaped bacterial strain, designated MA2-16(T), was isolated from ark shell (Scapharca broughtonii) collected from the South Sea, South Korea. Strain MA2-16(T) was found to grow optimally at 30°C, at pH 7.0-8.0 and in the presence of 2.0% (w/v) NaCl. Neighbour-joining, maximum-likelihood and maximum-parsimony phylogenetic trees based on 16S rRNA gene sequences revealed that strain MA2-16(T) clustered with the type strain of Sedimentitalea nanhaiensis. The novel strain exhibited a 16S rRNA gene sequence similarity value of 97.1% to the type strain of S. nanhaiensis. In the neighbour-joining phylogenetic tree based on gyrB sequences, strain MA2-16(T) formed an evolutionary lineage independent of those of other taxa. Strain MA2-16(T) contained Q-10 as the predominant ubiquinone and C18:1 ω7c and 11-methyl C18:1 ω7c as the major fatty acids. The major polar lipids of strain MA2-16(T) were phosphatidylcholine, phosphatidylglycerol, phosphatidylethanolamine, an unidentified aminolipid and an unidentified lipid. The DNA G+C content of strain MA2-16(T) was 57.7 mol% and its DNA-DNA relatedness values with the type strains of S. nanhaiensis and some phylogenetically related species of the genera Leisingera and Phaeobacter were 13-24%. On the basis of the data presented, strain MA2-16(T) is considered to represent a novel genus and novel species within the family Rhodobacteraceae, for which the name Aliisedimentitalea scapharcae gen. nov., sp. nov. is proposed. The type strain is MA2-16(T) (=KCTC 42119(T) =CECT 8598(T)).
Hochbach, Anne; Schneider, Julia; Röser, Martin
2015-06-01
To investigate phylogenetic relationships within the grass subfamily Pooideae we studied about 50 taxa covering all recognized tribes, using one plastid DNA (cpDNA) marker (matK gene-3'trnK exon) and for the first time four nuclear single copy gene loci. DNA sequence information from two parts of the nuclear genes topoisomerase 6 (Topo6) spanning the exons 8-13 and 17-19, the exons 9-13 encoding plastid acetyl-CoA-carboxylase (Acc1) and the partial exon 1 of phytochrome B (PhyB) were generated. Individual and nuclear combined data were evaluated using maximum parsimony, maximum likelihood and Bayesian methods. All of the phylogenetic results show Brachyelytrum and the tribe Nardeae as earliest diverging lineages within the subfamily. The 'core' Pooideae (Hordeeae and the Aveneae/Poeae tribe complex) are also strongly supported, as well as the monophyly of the tribes Brachypodieae, Meliceae and Stipeae (except PhyB). The beak grass tribe Diarrheneae and the tribe Duthieeae are not monophyletic in some of the analyses. However, the combined nuclear DNA (nDNA) tree yields the highest resolution and the best delimitation of the tribes, and provides the following evolutionary hypothesis for the tribes: Brachyelytrum, Nardeae, Duthieeae, Meliceae, Stipeae, Diarrheneae, Brachypodieae and the 'core' Pooideae. Within the individual datasets, the phylogenetic trees obtained from Topo6 exon 8-13 show the most interesting results. The divergent positions of some clone sequences of Ampelodesmos mauritanicus and Trikeraia pappiformis, for instance, may indicate a hybrid origin of these stipoid taxa. Copyright © 2015 Elsevier Inc. All rights reserved.
Amphritea ceti sp. nov., isolated from faeces of Beluga whale (Delphinapterus leucas).
Kim, Young-Ok; Park, Sooyeon; Kim, Doo Nam; Nam, Bo-Hye; Won, Sung-Min; An, Du Hae; Yoon, Jung-Hoon
2014-12-01
A Gram-stain-negative, aerobic, non-spore-forming, non-flagellated and rod-shaped or ovoid bacterial strain, designated RA1(T), was isolated from faeces collected from Beluga whale (Delphinapterus leucas) in Yeosu aquarium, South Korea. Strain RA1(T) grew optimally at 25 °C, at pH 7.0-8.0 and in the presence of 2.0 % (w/v) NaCl. Neighbour-joining, maximum-likelihood and maximum-parsimony phylogenetic trees based on 16S rRNA gene sequences revealed that strain RA1(T) joins the cluster comprising the type strains of three species of the genus Amphritea, with which it exhibited 95.8-96.0 % sequence similarity. Sequence similarities to the type strains of other recognized species were less than 94.3 %. Strain RA1(T) contained Q-8 as the predominant ubiquinone and summed feature 3 (C16 : 1ω7c and/or C16 : 1ω6c), C18 : 1ω7c and C16 : 0 as the major fatty acids. The major polar lipids of strain RA1(T) were phosphatidylethanolamine, phosphatidylglycerol, two unidentified lipids and one unidentified aminolipid. The DNA G+C content of strain RA1(T) was 47.4 mol%. The differential phenotypic properties, together with the phylogenetic distinctiveness, revealed that strain RA1(T) is separated from other species of the genus Amphritea. On the basis of the data presented, strain RA1(T) is considered to represent a novel species of the genus Amphritea, for which the name Amphritea ceti sp. nov. is proposed. The type strain is RA1(T) ( = KCTC 42154(T) = NBRC 110551(T)). © 2014 IUMS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pražnikar, Jure; Turk, Dušan
2014-12-01
The maximum-likelihood free-kick target, which calculates model error estimates from the work set and a randomly displaced model, proved superior in the accuracy and consistency of refinement of crystal structures compared with the maximum-likelihood cross-validation target, which calculates error estimates from the test set and the unperturbed model. The refinement of a molecular model is a computational procedure by which the atomic model is fitted to the diffraction data. The commonly used target in the refinement of macromolecular structures is the maximum-likelihood (ML) function, which relies on the assessment of model errors. The current ML functions rely on cross-validation. They utilize phase-error estimates that are calculated from a small fraction of diffraction data, called the test set, that are not used to fit the model. An approach has been developed that uses the work set to calculate the phase-error estimates in the ML refinement from simulating the model errors via the random displacement of atomic coordinates. It is called ML free-kick refinement as it uses the ML formulation of the target function and is based on the idea of freeing the model from the model bias imposed by the chemical energy restraints used in refinement. This approach for the calculation of error estimates is superior to the cross-validation approach: it reduces the phase error and increases the accuracy of molecular models, is more robust, provides clearer maps and may use a smaller portion of data for the test set for the calculation of R_free or may leave it out completely.
Marginal Maximum A Posteriori Item Parameter Estimation for the Generalized Graded Unfolding Model
ERIC Educational Resources Information Center
Roberts, James S.; Thompson, Vanessa M.
2011-01-01
A marginal maximum a posteriori (MMAP) procedure was implemented to estimate item parameters in the generalized graded unfolding model (GGUM). Estimates from the MMAP method were compared with those derived from marginal maximum likelihood (MML) and Markov chain Monte Carlo (MCMC) procedures in a recovery simulation that varied sample size,…
THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures.
Theobald, Douglas L; Wuttke, Deborah S
2006-09-01
THESEUS is a command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While conventional superpositioning methods use ordinary least-squares (LS) as the optimization criterion, ML superpositions provide substantially improved accuracy by down-weighting variable structural regions and by correcting for correlations among atoms. ML superpositioning is robust and insensitive to the specific atoms included in the analysis, and thus it does not require subjective pruning of selected variable atomic coordinates. Output includes both likelihood-based and frequentist statistics for accurate evaluation of the adequacy of a superposition and for reliable analysis of structural similarities and differences. THESEUS performs principal components analysis for analyzing the complex correlations found among atoms within a structural ensemble. ANSI C source code and selected binaries for various computing platforms are available under the GNU open source license from http://monkshood.colorado.edu/theseus/ or http://www.theseus3d.org.
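The idea of down-weighting variable regions can be illustrated with an inverse-variance weighted RMSD between two coordinate sets that are assumed to be already superposed. This is only a sketch of the weighting idea and not the THESEUS algorithm, which also estimates the superposition and the correlations among atoms; the function name and the per-atom variances are hypothetical inputs.

```python
import math


def weighted_rmsd(coords_a, coords_b, variances=None):
    """RMSD between paired 3D points, optionally weighted by 1/variance per atom."""
    n = len(coords_a)
    if variances is None:
        weights = [1.0] * n  # ordinary least-squares style: all atoms equal
    else:
        weights = [1.0 / v for v in variances]  # variable atoms count less
    num = 0.0
    for (xa, ya, za), (xb, yb, zb), w in zip(coords_a, coords_b, weights):
        num += w * ((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2)
    return math.sqrt(num / sum(weights))


a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 0.0)]  # last atom is displaced
plain = weighted_rmsd(a, b)
damped = weighted_rmsd(a, b, variances=[1.0, 1.0, 9.0])
```

Assigning the displaced atom a large variance reduces its influence, so the weighted value is smaller than the unweighted one even though the coordinates are unchanged.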