Computation of nonparametric convex hazard estimators via profile methods.
Jankowski, Hanna K; Wellner, Jon A
2009-05-01
This paper proposes a profile likelihood algorithm to compute the nonparametric maximum likelihood estimator of a convex hazard function. The maximisation is performed in two steps: First the support reduction algorithm is used to maximise the likelihood over all hazard functions with a given point of minimum (or antimode). Then it is shown that the profile (or partially maximised) likelihood is quasi-concave as a function of the antimode, so that a bisection algorithm can be applied to find the maximum of the profile likelihood, and hence also the global maximum. The new algorithm is illustrated using both artificial and real data, including lifetime data for Canadian males and females.
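To make the two-step structure concrete, the following is a minimal Python sketch using a toy V-shaped convex hazard h(t) = a + b|t - m| with antimode m, rather than the paper's general support-reduction solver; the inner step maximizes the likelihood over (a, b) for a fixed antimode, and the outer step exploits the quasi-concavity of the profile with a bounded one-dimensional search.

import numpy as np
from scipy.optimize import minimize, minimize_scalar

rng = np.random.default_rng(0)
t = 2.0 * rng.weibull(1.5, size=200)                 # toy lifetime data

def neg_loglik(params, m, t):
    # toy convex hazard h(t) = a + b*|t - m|; cumulative hazard in closed form
    a, b = params
    h = a + b * np.abs(t - m)
    H = np.where(t <= m,
                 a * t + b * (m * t - 0.5 * t**2),
                 a * t + b * (0.5 * m**2 + 0.5 * (t - m)**2))
    return -(np.sum(np.log(h)) - np.sum(H))

def profile_loglik(m, t):
    # inner step: maximize the likelihood over hazards with antimode m
    res = minimize(neg_loglik, x0=[0.5, 0.5], args=(m, t),
                   bounds=[(1e-8, None), (0.0, None)])
    return -res.fun

# outer step: the profile is quasi-concave in m, so a bounded 1-D search finds its maximum
best_m = minimize_scalar(lambda m: -profile_loglik(m, t),
                         bounds=(0.0, float(t.max())), method="bounded").x
print("estimated antimode:", best_m)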
Profile-Likelihood Approach for Estimating Generalized Linear Mixed Models with Factor Structures
ERIC Educational Resources Information Center
Jeon, Minjeong; Rabe-Hesketh, Sophia
2012-01-01
In this article, the authors suggest a profile-likelihood approach for estimating complex models by maximum likelihood (ML) using standard software and minimal programming. The method works whenever setting some of the parameters of the model to known constants turns the model into a standard model. An important class of models that can be…
Hock, Sabrina; Hasenauer, Jan; Theis, Fabian J
2013-01-01
Diffusion is a key component of many biological processes such as chemotaxis, developmental differentiation and tissue morphogenesis. Recently, it has become possible to assess the spatial gradients caused by diffusion in vitro and in vivo using microscopy-based imaging techniques. The resulting time series of two-dimensional, high-resolution images, in combination with mechanistic models, enable the quantitative analysis of the underlying mechanisms. However, such a model-based analysis is still challenging due to measurement noise and sparse observations, which result in uncertainties in the model parameters. We introduce a likelihood function for image-based measurements with log-normally distributed noise. Based upon this likelihood function we formulate the maximum likelihood estimation problem, which is solved using PDE-constrained optimization methods. To assess the uncertainty and practical identifiability of the parameters we introduce profile likelihoods for diffusion processes. As a proof of concept, we model certain aspects of the guidance of dendritic cells towards lymphatic vessels, an example of haptotaxis. Using a realistic set of artificial measurement data, we estimate the five kinetic parameters of this model and compute profile likelihoods. Our novel approach for the estimation of model parameters from image data, as well as the proposed identifiability analysis approach, is widely applicable to diffusion processes. The profile likelihood based method provides more rigorous uncertainty bounds than local approximation methods.
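As a rough sketch of the measurement model described above, the negative log-likelihood for intensities with log-normally distributed noise around a model prediction can be written as follows (a standalone Python illustration; the paper couples this to a PDE-constrained diffusion model, which is not reproduced here).

import numpy as np

def lognormal_negloglik(y, mu, sigma):
    # observed intensities y assumed log-normal around model predictions mu, with log-scale sd sigma
    return np.sum(np.log(y) + np.log(sigma) + 0.5 * np.log(2 * np.pi)
                  + 0.5 * ((np.log(y) - np.log(mu)) / sigma) ** 2)

# toy usage: a one-dimensional exponential gradient standing in for the model image
x = np.linspace(0.0, 1.0, 50)
mu = 5.0 * np.exp(-3.0 * x) + 0.1
y = mu * np.exp(0.2 * np.random.default_rng(1).standard_normal(50))
print(lognormal_negloglik(y, mu, sigma=0.2))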
Statistical inference methods for sparse biological time series data.
Ndukum, Juliet; Fonseca, Luís L; Santos, Helena; Voit, Eberhard O; Datta, Susmita
2011-04-25
Comparing metabolic profiles under different biological perturbations has become a powerful approach to investigating the functioning of cells. The profiles can be taken as single snapshots of a system, but more information is gained if they are measured longitudinally over time. The results are short time series consisting of relatively sparse data that cannot be analyzed effectively with standard time series techniques, such as autocorrelation and frequency domain methods. In this work, we study longitudinal time series profiles of glucose consumption in the yeast Saccharomyces cerevisiae under different temperatures and preconditioning regimens, which we obtained with methods of in vivo nuclear magnetic resonance (NMR) spectroscopy. For the statistical analysis we first fit several nonlinear mixed-effects regression models to the longitudinal profiles and then used an ANOVA likelihood ratio method to test for significant differences between the profiles. The proposed methods are capable of distinguishing metabolic time trends resulting from different treatments and of assigning significance levels to these differences. Among several nonlinear mixed-effects regression models tested, a three-parameter logistic function represents the data with the highest accuracy. ANOVA and likelihood ratio tests suggest that there are significant differences between the glucose consumption rate profiles for cells that had been--or had not been--preconditioned by heat during growth. Furthermore, pair-wise t-tests reveal significant differences in the longitudinal profiles for glucose consumption rates between optimal conditions and heat stress, optimal and recovery conditions, and heat stress and recovery conditions (p-values <0.0001). We have developed a nonlinear mixed-effects model that is appropriate for the analysis of sparse metabolic and physiological time profiles. The model permits sound statistical inference procedures, based on ANOVA likelihood ratio tests, for testing the significance of differences between short time course data under different biological perturbations.
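A hedged sketch of the comparison step: fit a three-parameter logistic curve to two toy time-course profiles, once with parameters shared across groups and once with group-specific parameters, and compare the fits with a likelihood-ratio test. This is a fixed-effects simplification with a known error standard deviation; the paper itself uses nonlinear mixed-effects models.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def logistic3(t, A, k, t0):
    return A / (1.0 + np.exp(-k * (t - t0)))

def negloglik(params, t, y, sigma=0.3):
    # Gaussian errors with a known sigma; additive constants cancel in the likelihood ratio
    A, k, t0 = params
    return 0.5 * np.sum(((y - logistic3(t, A, k, t0)) / sigma) ** 2)

rng = np.random.default_rng(2)
t = np.tile(np.linspace(0.0, 10.0, 12), 2)
grp = np.repeat([0, 1], 12)
y = logistic3(t, 10.0, 1.0 + 0.5 * grp, 5.0 - grp) + 0.3 * rng.standard_normal(24)

reduced = minimize(negloglik, [8.0, 1.0, 4.0], args=(t, y)).fun          # one shared curve
full = sum(minimize(negloglik, [8.0, 1.0, 4.0], args=(t[grp == g], y[grp == g])).fun
           for g in (0, 1))                                              # one curve per group
lr = 2.0 * (reduced - full)
print("LRT statistic:", lr, "p-value:", chi2.sf(lr, df=3))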
Program for Weibull Analysis of Fatigue Data
NASA Technical Reports Server (NTRS)
Krantz, Timothy L.
2005-01-01
A Fortran computer program has been written for performing statistical analyses of fatigue-test data that are assumed to be adequately represented by a two-parameter Weibull distribution. This program calculates the following: (1) Maximum-likelihood estimates of the Weibull-distribution parameters; (2) Data for contour plots of relative likelihood for two parameters; (3) Data for contour plots of joint confidence regions; (4) Data for the profile likelihood of the Weibull-distribution parameters; (5) Data for the profile likelihood of any percentile of the distribution; and (6) Likelihood-based confidence intervals for parameters and/or percentiles of the distribution. The program can account for tests that are suspended without failure (the statistical term for such suspension of tests is "censoring"). The analytical approach followed in this program is valid for type-I censoring, which is the removal of unfailed units at pre-specified times. Confidence regions and intervals are calculated by use of the likelihood-ratio method.
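For illustration, items (4) and (6) can be reproduced in a few lines for the shape parameter: under type-I censoring the scale that maximizes the likelihood at a fixed shape has a closed form, and a likelihood-ratio confidence interval collects the shape values whose profile deviance stays under the chi-square cutoff. This Python sketch mirrors the general approach, not the Fortran program itself.

import numpy as np
from scipy.stats import chi2

def weibull_loglik(shape, scale, t, failed):
    # right-censored Weibull log-likelihood: failures contribute log f, suspensions log S
    z = (t / scale) ** shape
    return np.sum(failed * (np.log(shape / scale) + (shape - 1.0) * np.log(t / scale))) - np.sum(z)

def profile_loglik(shape, t, failed):
    # for a fixed shape, the maximizing scale is available in closed form under type-I censoring
    scale_hat = (np.sum(t ** shape) / failed.sum()) ** (1.0 / shape)
    return weibull_loglik(shape, scale_hat, t, failed)

rng = np.random.default_rng(3)
t = 100.0 * rng.weibull(2.0, size=50)                # toy fatigue lives
failed = (t < 150.0).astype(float)                   # type-I censoring (suspension) at 150
t = np.minimum(t, 150.0)

shapes = np.linspace(0.5, 5.0, 400)
prof = np.array([profile_loglik(k, t, failed) for k in shapes])
cutoff = prof.max() - 0.5 * chi2.ppf(0.95, df=1)
inside = shapes[prof >= cutoff]
print("shape MLE ~", shapes[prof.argmax()],
      "95% profile-likelihood CI ~", (inside.min(), inside.max()))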
The Fecal Microbiota Profile and Bronchiolitis in Infants
Linnemann, Rachel W.; Mansbach, Jonathan M.; Ajami, Nadim J.; Espinola, Janice A.; Petrosino, Joseph F.; Piedra, Pedro A.; Stevenson, Michelle D.; Sullivan, Ashley F.; Thompson, Amy D.; Camargo, Carlos A.
2016-01-01
BACKGROUND: Little is known about the association of gut microbiota, a potentially modifiable factor, with bronchiolitis in infants. We aimed to determine the association of fecal microbiota with bronchiolitis in infants. METHODS: We conducted a case–control study. As part of a multicenter prospective study, we collected stool samples from 40 infants hospitalized with bronchiolitis. We concurrently enrolled 115 age-matched healthy controls. By applying 16S rRNA gene sequencing and an unbiased clustering approach to these 155 fecal samples, we identified microbiota profiles and determined the association of microbiota profiles with the likelihood of bronchiolitis. RESULTS: Overall, the median age was 3 months, 55% were male, and 54% were non-Hispanic white. Unbiased clustering of fecal microbiota identified 4 distinct profiles: Escherichia-dominant profile (30%), Bifidobacterium-dominant profile (21%), Enterobacter/Veillonella-dominant profile (22%), and Bacteroides-dominant profile (28%). The proportion of bronchiolitis was lowest in infants with the Enterobacter/Veillonella-dominant profile (15%) and highest in the Bacteroides-dominant profile (44%), corresponding to an odds ratio of 4.59 (95% confidence interval, 1.58–15.5; P = .008). In the multivariable model, the significant association between the Bacteroides-dominant profile and a greater likelihood of bronchiolitis persisted (odds ratio for comparison with the Enterobacter/Veillonella-dominant profile, 4.24; 95% confidence interval, 1.56–12.0; P = .005). In contrast, the likelihood of bronchiolitis in infants with the Escherichia-dominant or Bifidobacterium-dominant profile was not significantly different compared with those with the Enterobacter/Veillonella-dominant profile. CONCLUSIONS: In this case–control study, we identified 4 distinct fecal microbiota profiles in infants. The Bacteroides-dominant profile was associated with a higher likelihood of bronchiolitis. PMID:27354456
A maximum pseudo-profile likelihood estimator for the Cox model under length-biased sampling
Huang, Chiung-Yu; Qin, Jing; Follmann, Dean A.
2012-01-01
This paper considers semiparametric estimation of the Cox proportional hazards model for right-censored and length-biased data arising from prevalent sampling. To exploit the special structure of length-biased sampling, we propose a maximum pseudo-profile likelihood estimator, which can handle time-dependent covariates and is consistent under covariate-dependent censoring. Simulation studies show that the proposed estimator is more efficient than its competitors. A data analysis illustrates the methods and theory. PMID:23843659
A New Maximum Likelihood Approach for Free Energy Profile Construction from Molecular Simulations
Lee, Tai-Sung; Radak, Brian K.; Pabis, Anna; York, Darrin M.
2013-01-01
A novel variational method for construction of free energy profiles from molecular simulation data is presented. The variational free energy profile (VFEP) method uses the maximum likelihood principle applied to the global free energy profile based on the entire set of simulation data (e.g., from multiple biased simulations) that spans the free energy surface. The new method addresses two common obstacles of traditional methods for estimating free energy surfaces: the need for overlap in the re-weighting procedure and the problem of data representation. Test cases demonstrate that VFEP outperforms other methods in terms of the amount and sparsity of the data needed to construct the overall free energy profiles. For typical chemical reactions, only ~5 windows and ~20-35 independent data points per window are sufficient to obtain an overall qualitatively correct free energy profile with sampling errors an order of magnitude smaller than the free energy barrier. The proposed approach thus provides a feasible mechanism to quickly construct the global free energy profile and identify free energy barriers and basins in free energy simulations via a robust, variational procedure that determines an analytic representation of the free energy profile without the requirement of numerically unstable histograms or binning procedures. It can serve as a new framework for biased simulations and is suitable for use together with other methods to tackle the free energy estimation problem. PMID:23457427
Han, Jubong; Lee, K B; Lee, Jong-Man; Park, Tae Soon; Oh, J S; Oh, Pil-Jei
2016-03-01
We discuss a new method to incorporate Type B uncertainty into least-squares procedures. The new method is based on an extension of the likelihood function from which a conventional least-squares function is derived. The extended likelihood function is the product of the original likelihood function with additional PDFs (Probability Density Functions) that characterize the Type B uncertainties. The PDFs describe one's incomplete knowledge of correction factors, which are treated as nuisance parameters. We use the extended likelihood function to make point and interval estimates of parameters in essentially the same way as the least-squares function is used in the conventional least-squares method. Since the nuisance parameters are not of interest and should not appear in the final result, we eliminate them by using the profile likelihood. As an example, we present a case study for a linear regression analysis with a common component of Type B uncertainty. In this example we compare the analysis results obtained using our procedure with those from conventional methods. Copyright © 2015. Published by Elsevier Ltd.
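A minimal sketch of the construction under stated assumptions: a straight-line fit in which all y-values share one multiplicative correction factor phi (the Type B component), the extended likelihood multiplies the measurement likelihood by a Gaussian PDF for phi, and phi is profiled out when the slope is the parameter of interest. Variable names and values here are hypothetical.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
x = np.arange(1.0, 11.0)
y = 2.0 * x + 1.0 + 0.3 * rng.standard_normal(10)
sig_y, phi0, sig_phi = 0.3, 1.0, 0.02               # Type A spread, nominal correction, Type B spread

def neg_ext_loglik(a, b, phi):
    # extended likelihood = measurement likelihood times the PDF describing the Type B factor
    chi2_meas = np.sum(((phi * y - (a + b * x)) / sig_y) ** 2)
    chi2_typeb = ((phi - phi0) / sig_phi) ** 2
    return 0.5 * (chi2_meas + chi2_typeb)

def profile_slope(b):
    # eliminate the intercept and the nuisance correction factor by profiling
    return minimize(lambda p: neg_ext_loglik(p[0], b, p[1]), [1.0, 1.0]).fun

b_grid = np.linspace(1.8, 2.2, 81)
prof = np.array([profile_slope(b) for b in b_grid])
print("slope estimate ~", b_grid[prof.argmin()])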
Nasal Airway Microbiota Profile and Severe Bronchiolitis in Infants: A Case-control Study.
Hasegawa, Kohei; Linnemann, Rachel W; Mansbach, Jonathan M; Ajami, Nadim J; Espinola, Janice A; Petrosino, Joseph F; Piedra, Pedro A; Stevenson, Michelle D; Sullivan, Ashley F; Thompson, Amy D; Camargo, Carlos A
2017-11-01
Little is known about the relationship of airway microbiota with bronchiolitis in infants. We aimed to identify nasal airway microbiota profiles and to determine their association with the likelihood of bronchiolitis in infants. A case-control study was conducted. As a part of a multicenter prospective study, we collected nasal airway samples from 40 infants hospitalized with bronchiolitis. We concurrently enrolled 110 age-matched healthy controls. By applying 16S ribosomal RNA gene sequencing and an unbiased clustering approach to these 150 nasal samples, we identified microbiota profiles and determined the association of microbiota profiles with the likelihood of bronchiolitis. Overall, the median age was 3 months and 56% were male. Unbiased clustering of airway microbiota identified 4 distinct profiles: Moraxella-dominant profile (37%), Corynebacterium/Dolosigranulum-dominant profile (27%), Staphylococcus-dominant profile (15%) and mixed profile (20%). The proportion of bronchiolitis was lowest in infants with the Moraxella-dominant profile (14%) and highest in those with the Staphylococcus-dominant profile (57%), corresponding to an odds ratio of 7.80 (95% confidence interval, 2.64-24.9; P < 0.001). In the multivariable model, the association between the Staphylococcus-dominant profile and a greater likelihood of bronchiolitis persisted (odds ratio for comparison with the Moraxella-dominant profile, 5.16; 95% confidence interval, 1.26-22.9; P = 0.03). By contrast, the Corynebacterium/Dolosigranulum-dominant profile group had a low proportion of infants with bronchiolitis (17%); the likelihood of bronchiolitis in this group did not significantly differ from that of infants with the Moraxella-dominant profile in either unadjusted or adjusted analyses. In this case-control study, we identified 4 distinct nasal airway microbiota profiles in infants. Moraxella-dominant and Corynebacterium/Dolosigranulum-dominant profiles were associated with a low likelihood of bronchiolitis, while the Staphylococcus-dominant profile was associated with a high likelihood of bronchiolitis.
NASA Astrophysics Data System (ADS)
Pan, Zhen; Anderes, Ethan; Knox, Lloyd
2018-05-01
One of the major targets for next-generation cosmic microwave background (CMB) experiments is the detection of the primordial B-mode signal. Planning is under way for Stage-IV experiments that are projected to have instrumental noise small enough to make lensing and foregrounds the dominant sources of uncertainty for estimating the tensor-to-scalar ratio r from polarization maps. This makes delensing a crucial part of future CMB polarization science. In this paper we present a likelihood method for estimating the tensor-to-scalar ratio r from CMB polarization observations, which combines the benefits of a full-scale likelihood approach with the tractability of the quadratic delensing technique. This method is a pixel-space, all-order likelihood analysis of the quadratic delensed B modes, and it essentially builds upon the quadratic delenser by taking into account all-order lensing and pixel-space anomalies. Its tractability relies on a crucial factorization of the pixel-space covariance matrix of the polarization observations which allows one to compute the full Gaussian approximate likelihood profile, as a function of r, at the same computational cost as a single likelihood evaluation.
Pascazio, Vito; Schirinzi, Gilda
2002-01-01
In this paper, a technique that is able to reconstruct highly sloped and discontinuous terrain height profiles, starting from multifrequency wrapped phase acquired by interferometric synthetic aperture radar (SAR) systems, is presented. We propose an innovative unwrapping method, based on a maximum likelihood estimation technique, which uses multifrequency independent phase data, obtained by filtering the interferometric SAR raw data pair through nonoverlapping band-pass filters, and approximating the unknown surface by means of local planes. Since the method does not exploit the phase gradient, it assures the uniqueness of the solution, even in the case of highly sloped or piecewise continuous elevation patterns with strong discontinuities.
Maximal likelihood correspondence estimation for face recognition across pose.
Li, Shaoxin; Liu, Xin; Chai, Xiujuan; Zhang, Haihong; Lao, Shihong; Shan, Shiguang
2014-10-01
Due to the misalignment of image features, the performance of many conventional face recognition methods degrades considerably in the across-pose scenario. To address this problem, many image matching-based methods have been proposed to estimate semantic correspondence between faces in different poses. In this paper, we aim to solve two critical problems in previous image matching-based correspondence learning methods: 1) they fail to fully exploit face-specific structure information in correspondence estimation and 2) they fail to learn personalized correspondence for each probe image. To this end, we first build a model, termed the morphable displacement field (MDF), to encode face-specific structure information of semantic correspondence from a set of real samples of correspondences calculated from 3D face models. Then, we propose a maximal likelihood correspondence estimation (MLCE) method to learn personalized correspondence based on a maximal likelihood frontal face assumption. After obtaining the semantic correspondence encoded in the learned displacement, we can synthesize virtual frontal images of the profile faces for subsequent recognition. Using a linear discriminant analysis method with pixel-intensity features, state-of-the-art performance is achieved on three multipose benchmarks, i.e., the CMU-PIE, FERET, and MultiPIE databases. Owing to the rational MDF regularization and the use of the novel maximal likelihood objective, the proposed MLCE method can reliably learn correspondence between faces in different poses even in a complex wild environment, i.e., the Labeled Faces in the Wild database.
Lun, Aaron T L; Chen, Yunshun; Smyth, Gordon K
2016-01-01
RNA sequencing (RNA-seq) is widely used to profile transcriptional activity in biological systems. Here we present an analysis pipeline for differential expression analysis of RNA-seq experiments using the Rsubread and edgeR software packages. The basic pipeline includes read alignment and counting, filtering and normalization, modelling of biological variability and hypothesis testing. For hypothesis testing, we describe particularly the quasi-likelihood features of edgeR. Some more advanced downstream analysis steps are also covered, including complex comparisons, gene ontology enrichment analyses and gene set testing. The code required to run each step is described, along with an outline of the underlying theory. The chapter includes a case study in which the pipeline is used to study the expression profiles of mammary gland cells in virgin, pregnant and lactating mice.
Comparison of statistical sampling methods with ScannerBit, the GAMBIT scanning module
NASA Astrophysics Data System (ADS)
Martinez, Gregory D.; McKay, James; Farmer, Ben; Scott, Pat; Roebber, Elinore; Putze, Antje; Conrad, Jan
2017-11-01
We introduce ScannerBit, the statistics and sampling module of the public, open-source global fitting framework GAMBIT. ScannerBit provides a standardised interface to different sampling algorithms, enabling the use and comparison of multiple computational methods for inferring profile likelihoods, Bayesian posteriors, and other statistical quantities. The current version offers random, grid, raster, nested sampling, differential evolution, Markov Chain Monte Carlo (MCMC) and ensemble Monte Carlo samplers. We also announce the release of a new standalone differential evolution sampler, Diver, and describe its design, usage and interface to ScannerBit. We subject Diver and three other samplers (the nested sampler MultiNest, the MCMC GreAT, and the native ScannerBit implementation of the ensemble Monte Carlo algorithm T-Walk) to a battery of statistical tests. For this we use a realistic physical likelihood function, based on the scalar singlet model of dark matter. We examine the performance of each sampler as a function of its adjustable settings, and the dimensionality of the sampling problem. We evaluate performance on four metrics: optimality of the best fit found, completeness in exploring the best-fit region, number of likelihood evaluations, and total runtime. For Bayesian posterior estimation at high resolution, T-Walk provides the most accurate and timely mapping of the full parameter space. For profile likelihood analysis in less than about ten dimensions, we find that Diver and MultiNest score similarly in terms of best fit and speed, outperforming GreAT and T-Walk; in ten or more dimensions, Diver substantially outperforms the other three samplers on all metrics.
Leveraging cues from person-generated health data for peer matching in online communities
Hartzler, Andrea L; Taylor, Megan N; Park, Albert; Griffiths, Troy; Backonja, Uba; McDonald, David W; Wahbeh, Sam; Brown, Cory; Pratt, Wanda
2016-01-01
Objective Online health communities offer a diverse peer support base, yet users can struggle to identify suitable peer mentors as these communities grow. To facilitate mentoring connections, we designed a peer-matching system that automatically profiles and recommends peer mentors to mentees based on person-generated health data (PGHD). This study examined the profile characteristics that mentees value when choosing a peer mentor. Materials and Methods Through a mixed-methods user study, in which cancer patients and caregivers evaluated peer mentor recommendations, we examined the relative importance of four possible profile elements: health interests, language style, demographics, and sample posts. Playing the role of mentees, the study participants ranked mentors, then rated both the likelihood that they would hypothetically contact each mentor and the helpfulness of each profile element in helping them make that decision. We analyzed the participants' ratings with linear regression and qualitatively analyzed participants' feedback for emerging themes about choosing mentors and improving profile design. Results Of the four profile elements, only sample posts were a significant predictor of the likelihood of a mentee contacting a mentor. Communication cues embedded in posts were critical for helping the participants choose a compatible mentor. Qualitative themes offer insight into the interpersonal characteristics that mentees sought in peer mentors, including being knowledgeable, sociable, and articulate. Additionally, the participants emphasized the need for streamlined profiles that minimize the time required to choose a mentor. Conclusion Peer-matching systems in online health communities offer a promising approach for leveraging PGHD to connect patients. Our findings point to interpersonal communication cues embedded in PGHD that could prove critical for building mentoring relationships among the growing membership of online health communities. PMID:26911825
Harbert, Robert S; Nixon, Kevin C
2015-08-01
• Plant distributions have long been understood to be correlated with the environmental conditions to which species are adapted. Climate is one of the major components driving species distributions. Therefore, it is expected that the plants coexisting in a community are reflective of the local environment, particularly climate. • Presented here is a method for the estimation of climate from local plant species coexistence data. The method, Climate Reconstruction Analysis using Coexistence Likelihood Estimation (CRACLE), is a likelihood-based method that employs specimen collection data at a global scale for the inference of species climate tolerance. CRACLE calculates the maximum joint likelihood of coexistence given individual species climate tolerance characterization to estimate the expected climate. • Plant distribution data for more than 4000 species were used to show that this method accurately infers expected climate profiles for 165 sites with diverse climatic conditions. Estimates differ from the WorldClim global climate model by less than 1.5°C on average for mean annual temperature and less than ∼250 mm for mean annual precipitation. This is a significant improvement upon other plant-based climate-proxy methods. • CRACLE validates long hypothesized interactions between climate and local associations of plant species. Furthermore, CRACLE successfully estimates climate that is consistent with the widely used WorldClim model and therefore may be applied to the quantitative estimation of paleoclimate in future studies. © 2015 Botanical Society of America, Inc.
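The joint-likelihood step at the heart of this approach can be sketched briefly, assuming each coexisting species' climate tolerance has already been characterized; here tolerances are idealized as normal distributions over mean annual temperature (MAT), which simplifies CRACLE's actual tolerance estimation from specimen data.

import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

# hypothetical tolerance characterizations (mean, sd of MAT in degrees C) for co-occurring species
tolerances = [(12.0, 4.0), (15.0, 3.0), (14.0, 5.0), (11.0, 2.5)]

def neg_joint_loglik(mat):
    # joint likelihood that all of these species coexist at a site with this MAT
    return -sum(norm.logpdf(mat, mu, sd) for mu, sd in tolerances)

estimate = minimize_scalar(neg_joint_loglik, bounds=(-10.0, 35.0), method="bounded")
print("maximum joint likelihood estimate of site MAT:", estimate.x)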
NASA Technical Reports Server (NTRS)
Sheridan P. J.; Andrews, E.; Ogren, J A.; Tackett, J. L.; Winker, D. M.
2012-01-01
Between June 2006 and September 2009, an instrumented light aircraft measured over 400 vertical profiles of aerosol and trace gas properties over eastern and central Illinois. The primary objectives of this program were to (1) measure the in situ aerosol properties and determine their vertical and temporal variability and (2) relate these aircraft measurements to concurrent surface and satellite measurements. Underflights of the CALIPSO satellite show reasonable agreement in a majority of retrieved profiles between aircraft-measured extinction at 532 nm (adjusted to ambient relative humidity) and CALIPSO-retrieved extinction, and suggest that routine aircraft profiling programs can be used to better understand and validate satellite retrieval algorithms. CALIPSO tended to overestimate the aerosol extinction at this location in some boundary layer flight segments when scattered or broken clouds were present, which could be related to problems with CALIPSO cloud screening methods. The in situ aircraft-collected aerosol data suggest extinction thresholds for the likelihood of aerosol layers being detected by the CALIOP lidar. These statistical data offer guidance as to the likelihood of CALIPSO's ability to retrieve aerosol extinction at various locations around the globe.
Childhood Sports Participation and Adolescent Sport Profile.
Gallant, François; O'Loughlin, Jennifer L; Brunet, Jennifer; Sabiston, Catherine M; Bélanger, Mathieu
2017-12-01
We aimed to increase understanding of the link between sport specialization during childhood and adolescent physical activity (PA). The objectives were as follows: (1) describe the natural course of sport participation over 5 years among children who are early sport samplers or early sport specializers and (2) determine if a sport participation profile in childhood predicts the sport profile in adolescence. Participants ( n = 756, ages 10-11 years at study inception) reported their participation in organized and unorganized PA during in-class questionnaires administered every 4 months over 5 years. They were categorized as early sport samplers, early sport specializers, or nonparticipants in year 1 and as recreational sport participants, performance sport participants, or nonparticipants in years 2 to 5. The likelihood that a childhood sport profile would predict the adolescent profile was computed as relative risks. Polynomial logistic regression was used to identify predictors of an adolescent sport profile. Compared with early sport specialization and nonparticipation, early sport sampling in childhood was associated with a higher likelihood of recreational participation (relative risk, 95% confidence interval: 1.55, 1.18-2.03) and a lower likelihood of nonparticipation (0.69, 0.51-0.93) in adolescence. Early sport specialization was associated with a higher likelihood of performance participation (1.65, 1.19-2.28) but not of nonparticipation (1.01, 0.70-1.47) in adolescence. Nonparticipation in childhood was associated with nearly doubling the likelihood of nonparticipation in adolescence (1.88, 1.36-2.62). Sport sampling should be promoted in childhood because it may be linked to higher PA levels during adolescence. Copyright © 2017 by the American Academy of Pediatrics.
NASA Technical Reports Server (NTRS)
Smith, Greg
2003-01-01
Schedule risk assessments determine the likelihood of finishing on time. Each task in a schedule has a varying degree of probability of being finished on time. A schedule risk assessment quantifies these probabilities by assigning values to each task. This viewgraph presentation contains a flow chart for conducting a schedule risk assessment and profiles several applicable methods of data analysis.
Li, Guanghui; Luo, Jiawei; Xiao, Qiu; Liang, Cheng; Ding, Pingjian
2018-05-12
Interactions between microRNAs (miRNAs) and diseases can yield important information for uncovering novel prognostic markers. Since experimental determination of disease-miRNA associations is time-consuming and costly, attention has been given to designing efficient and robust computational techniques for identifying undiscovered interactions. In this study, we present a label propagation model with linear neighborhood similarity, called LPLNS, to predict unobserved miRNA-disease associations. Additionally, a preprocessing step is performed to derive new interaction likelihood profiles that will contribute to the prediction since new miRNAs and diseases lack known associations. Our results demonstrate that the LPLNS model based on the known disease-miRNA associations could achieve impressive performance with an AUC of 0.9034. Furthermore, we observed that the LPLNS model based on new interaction likelihood profiles could improve the performance to an AUC of 0.9127. This was better than other comparable methods. In addition, case studies also demonstrated our method's outstanding performance for inferring undiscovered interactions between miRNAs and diseases, especially for novel diseases. Copyright © 2018. Published by Elsevier Inc.
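A rough sketch of the propagation step, assuming a precomputed miRNA similarity matrix; for brevity the paper's linear neighborhood similarity (nonnegative reconstruction weights over k nearest neighbors) is replaced by a simple row-normalized kernel, and the new interaction likelihood profiles for unobserved nodes are not modeled.

import numpy as np

def label_propagation(W, Y, alpha=0.8, n_iter=200):
    # propagate known associations Y (miRNAs x diseases, 0/1) over the normalized similarity graph,
    # keeping a (1 - alpha) pull toward the original labels at every step
    S = W / W.sum(axis=1, keepdims=True)
    F = Y.astype(float)
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1.0 - alpha) * Y
    return F                                          # scores used to rank candidate associations

rng = np.random.default_rng(5)
X = rng.random((30, 8))                               # toy miRNA feature profiles
W = np.exp(-np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2))
np.fill_diagonal(W, 0.0)
Y = (rng.random((30, 5)) < 0.1).astype(float)         # sparse known miRNA-disease associations
print(label_propagation(W, Y).shape)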
Dziak, John J.; Bray, Bethany C.; Zhang, Jieting; Zhang, Minqiang; Lanza, Stephanie T.
2016-01-01
Several approaches are available for estimating the relationship of latent class membership to distal outcomes in latent profile analysis (LPA). A three-step approach is commonly used, but has problems with estimation bias and confidence interval coverage. Proposed improvements include the correction method of Bolck, Croon, and Hagenaars (BCH; 2004), Vermunt’s (2010) maximum likelihood (ML) approach, and the inclusive three-step approach of Bray, Lanza, & Tan (2015). These methods have been studied in the related case of latent class analysis (LCA) with categorical indicators, but not as well studied for LPA with continuous indicators. We investigated the performance of these approaches in LPA with normally distributed indicators, under different conditions of distal outcome distribution, class measurement quality, relative latent class size, and strength of association between latent class and the distal outcome. The modified BCH implemented in Latent GOLD had excellent performance. The maximum likelihood and inclusive approaches were not robust to violations of distributional assumptions. These findings broadly agree with and extend the results presented by Bakk and Vermunt (2016) in the context of LCA with categorical indicators. PMID:28630602
Halo-independent determination of the unmodulated WIMP signal in DAMA: the isotropic case
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gondolo, Paolo; Scopel, Stefano, E-mail: paolo.gondolo@utah.edu, E-mail: scopel@sogang.ac.kr
2017-09-01
We present a halo-independent determination of the unmodulated signal corresponding to the DAMA modulation if interpreted as due to dark matter weakly interacting massive particles (WIMPs). First we show how a modulated signal gives information on the WIMP velocity distribution function in the Galactic rest frame from which the unmodulated signal descends. Then we describe a mathematically-sound profile likelihood analysis in which the likelihood is profiled over a continuum of nuisance parameters (namely, the WIMP velocity distribution). As a first application of the method, which is very general and valid for any class of velocity distributions, we restrict the analysis to velocity distributions that are isotropic in the Galactic frame. In this way we obtain halo-independent maximum-likelihood estimates and confidence intervals for the DAMA unmodulated signal. We find that the estimated unmodulated signal is in line with expectations for a WIMP-induced modulation and is compatible with the DAMA background+signal rate. Specifically, for the isotropic case we find that the modulated amplitude ranges between a few percent and about 25% of the unmodulated amplitude, depending on the WIMP mass.
Planck intermediate results. XVI. Profile likelihoods for cosmological parameters
NASA Astrophysics Data System (ADS)
Planck Collaboration; Ade, P. A. R.; Aghanim, N.; Arnaud, M.; Ashdown, M.; Aumont, J.; Baccigalupi, C.; Banday, A. J.; Barreiro, R. B.; Bartlett, J. G.; Battaner, E.; Benabed, K.; Benoit-Lévy, A.; Bernard, J.-P.; Bersanelli, M.; Bielewicz, P.; Bobin, J.; Bonaldi, A.; Bond, J. R.; Bouchet, F. R.; Burigana, C.; Cardoso, J.-F.; Catalano, A.; Chamballu, A.; Chiang, H. C.; Christensen, P. R.; Clements, D. L.; Colombi, S.; Colombo, L. P. L.; Couchot, F.; Cuttaia, F.; Danese, L.; Davies, R. D.; Davis, R. J.; de Bernardis, P.; de Rosa, A.; de Zotti, G.; Delabrouille, J.; Dickinson, C.; Diego, J. M.; Dole, H.; Donzelli, S.; Doré, O.; Douspis, M.; Dupac, X.; Enßlin, T. A.; Eriksen, H. K.; Finelli, F.; Forni, O.; Frailis, M.; Franceschi, E.; Galeotta, S.; Galli, S.; Ganga, K.; Giard, M.; Giraud-Héraud, Y.; González-Nuevo, J.; Górski, K. M.; Gregorio, A.; Gruppuso, A.; Hansen, F. K.; Harrison, D. L.; Henrot-Versillé, S.; Hernández-Monteagudo, C.; Herranz, D.; Hildebrandt, S. R.; Hivon, E.; Hobson, M.; Holmes, W. A.; Hornstrup, A.; Hovest, W.; Huffenberger, K. M.; Jaffe, A. H.; Jaffe, T. R.; Jones, W. C.; Juvela, M.; Keihänen, E.; Keskitalo, R.; Kisner, T. S.; Kneissl, R.; Knoche, J.; Knox, L.; Kunz, M.; Kurki-Suonio, H.; Lagache, G.; Lähteenmäki, A.; Lamarre, J.-M.; Lasenby, A.; Lawrence, C. R.; Leonardi, R.; Liddle, A.; Liguori, M.; Lilje, P. B.; Linden-Vørnle, M.; López-Caniego, M.; Lubin, P. M.; Macías-Pérez, J. F.; Maffei, B.; Maino, D.; Mandolesi, N.; Maris, M.; Martin, P. G.; Martínez-González, E.; Masi, S.; Massardi, M.; Matarrese, S.; Mazzotta, P.; Melchiorri, A.; Mendes, L.; Mennella, A.; Migliaccio, M.; Mitra, S.; Miville-Deschênes, M.-A.; Moneti, A.; Montier, L.; Morgante, G.; Munshi, D.; Murphy, J. A.; Naselsky, P.; Nati, F.; Natoli, P.; Noviello, F.; Novikov, D.; Novikov, I.; Oxborrow, C. A.; Pagano, L.; Pajot, F.; Paoletti, D.; Pasian, F.; Perdereau, O.; Perotto, L.; Perrotta, F.; Pettorino, V.; Piacentini, F.; Piat, M.; Pierpaoli, E.; Pietrobon, D.; Plaszczynski∗, S.; Pointecouteau, E.; Polenta, G.; Popa, L.; Pratt, G. W.; Puget, J.-L.; Rachen, J. P.; Rebolo, R.; Reinecke, M.; Remazeilles, M.; Renault, C.; Ricciardi, S.; Riller, T.; Ristorcelli, I.; Rocha, G.; Rosset, C.; Roudier, G.; Rouillé d'Orfeuil, B.; Rubiño-Martín, J. A.; Rusholme, B.; Sandri, M.; Savelainen, M.; Savini, G.; Spencer, L. D.; Spinelli, M.; Starck, J.-L.; Sureau, F.; Sutton, D.; Suur-Uski, A.-S.; Sygnet, J.-F.; Tauber, J. A.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Tucci, M.; Umana, G.; Valenziano, L.; Valiviita, J.; Van Tent, B.; Vielva, P.; Villa, F.; Wade, L. A.; Wandelt, B. D.; White, M.; Yvon, D.; Zacchei, A.; Zonca, A.
2014-06-01
We explore the 2013 Planck likelihood function with a high-precision multi-dimensional minimizer (Minuit). This allows a refinement of the ΛCDM best-fit solution with respect to previously-released results, and the construction of frequentist confidence intervals using profile likelihoods. The agreement with the cosmological results from the Bayesian framework is excellent, demonstrating the robustness of the Planck results to the statistical methodology. We investigate the inclusion of neutrino masses, where more significant differences may appear due to the non-Gaussian nature of the posterior mass distribution. By applying the Feldman-Cousins prescription, we again obtain results very similar to those of the Bayesian methodology. However, the profile-likelihood analysis of the cosmic microwave background (CMB) combination (Planck+WP+highL) reveals a minimum well within the unphysical negative-mass region. We show that inclusion of the Planck CMB-lensing information regularizes this issue, and provide a robust frequentist upper limit ∑ mν ≤ 0.26 eV (95% confidence) from the CMB+lensing+BAO data combination.
Gengsheng Qin; Davis, Angela E; Jing, Bing-Yi
2011-06-01
For a continuous-scale diagnostic test, it is often of interest to find the range of the sensitivity of the test at the cut-off that yields a desired specificity. In this article, we first define a profile empirical likelihood ratio for the sensitivity of a continuous-scale diagnostic test and show that its limiting distribution is a scaled chi-square distribution. We then propose two new empirical likelihood-based confidence intervals for the sensitivity of the test at a fixed level of specificity by using the scaled chi-square distribution. Simulation studies are conducted to compare the finite sample performance of the newly proposed intervals with the existing intervals for the sensitivity in terms of coverage probability. A real example is used to illustrate the application of the recommended methods.
Application of random match probability calculations to mixed STR profiles.
Bille, Todd; Bright, Jo-Anne; Buckleton, John
2013-03-01
Mixed DNA profiles are being encountered more frequently as laboratories analyze increasing amounts of touch evidence. If it is determined that an individual could be a possible contributor to the mixture, it is necessary to perform a statistical analysis to allow an assignment of weight to the evidence. Currently, the combined probability of inclusion (CPI) and the likelihood ratio (LR) are the most commonly used methods to perform the statistical analysis. A third method, random match probability (RMP), is available. This article compares the advantages and disadvantages of the CPI and LR methods to the RMP method. We demonstrate that although the LR method is still considered the most powerful of the binary methods, the RMP and LR methods make similar use of the observed data such as peak height, assumed number of contributors, and known contributors where the CPI calculation tends to waste information and be less informative. © 2013 American Academy of Forensic Sciences.
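For orientation only, the contrast can be seen with textbook single-locus quantities computed from hypothetical allele frequencies; these are the standard definitions rather than the specific mixture-interpretation procedure evaluated in the paper, which also uses peak heights and an assumed number of contributors.

# hypothetical frequencies of the alleles observed in a mixed profile at one locus
freqs = {"12": 0.10, "13": 0.22, "15": 0.08, "16": 0.30}

# combined probability of inclusion: any genotype built from observed alleles is "included",
# so peak-height information is effectively discarded
cpi_locus = sum(freqs.values()) ** 2

# a random-match-probability-style term for one specific assumed genotype (heterozygote 12,15)
rmp_locus = 2 * freqs["12"] * freqs["15"]

print(f"per-locus CPI = {cpi_locus:.4f}, per-locus RMP for genotype (12,15) = {rmp_locus:.4f}")
# multi-locus figures follow by multiplying the per-locus terms across independent loci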
Profile-likelihood Confidence Intervals in Item Response Theory Models.
Chalmers, R Philip; Pek, Jolynn; Liu, Yang
2017-01-01
Confidence intervals (CIs) are fundamental inferential devices which quantify the sampling variability of parameter estimates. In item response theory, CIs have been primarily obtained from large-sample Wald-type approaches based on standard error estimates, derived from the observed or expected information matrix, after parameters have been estimated via maximum likelihood. An alternative approach to constructing CIs is to quantify sampling variability directly from the likelihood function with a technique known as profile-likelihood confidence intervals (PL CIs). In this article, we introduce PL CIs for item response theory models, compare PL CIs to classical large-sample Wald-type CIs, and demonstrate important distinctions among these CIs. CIs are then constructed for parameters directly estimated in the specified model and for transformed parameters which are often obtained post-estimation. Monte Carlo simulation results suggest that PL CIs perform consistently better than Wald-type CIs for both non-transformed and transformed parameters.
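The construction is generic: the profile deviance is inverted numerically to find where it crosses the chi-square cutoff. The sketch below uses a one-parameter Poisson-rate example (so the profile coincides with the ordinary likelihood) purely to show the mechanics and the contrast with a symmetric Wald interval; an IRT likelihood would slot into the same recipe, with nuisance parameters re-maximized at each step.

import numpy as np
from scipy.stats import chi2
from scipy.optimize import brentq

counts = np.array([1, 0, 2, 1, 0, 3, 1, 0])            # small-sample toy data

def negloglik(lam):
    return -(counts.sum() * np.log(lam) - lam * counts.size)

lam_hat = counts.mean()
cut = 0.5 * chi2.ppf(0.95, df=1)
dev = lambda lam: (negloglik(lam) - negloglik(lam_hat)) - cut

# profile-likelihood CI: solve for the crossing points on each side of the MLE
lower = brentq(dev, 1e-6, lam_hat)
upper = brentq(dev, lam_hat, 50.0)

# Wald CI for comparison: symmetric, and can extend below zero in small samples
se = np.sqrt(lam_hat / counts.size)
print("PL CI:", (lower, upper), " Wald CI:", (lam_hat - 1.96 * se, lam_hat + 1.96 * se))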
Passport examination by a confocal-type laser profile microscope.
Sugawara, Shigeru
2008-06-10
The author proposes a nondestructive and highly precise method of measuring the thickness of a film pasted on a passport using a confocal-type laser profile microscope. The effectiveness of this method in passport examination is demonstrated. A confocal-type laser profile microscope is used to create profiles of the film surface and film-paper interface; these profiles are used to calculate the film thickness by employing an algorithm developed by the author. The film thicknesses of the passport samples--35 genuine and 80 counterfeit Japanese passports--are measured nondestructively. The intra-sample standard deviation of the film thicknesses of the genuine and counterfeit Japanese passports was of the order of 1 µm. The intersample standard deviations of the film thicknesses of passports forged using the same tools and techniques are expected to be of the order of 1 µm. The thickness values of the films on the machine-readable genuine passports ranged between 31.95 µm and 36.95 µm. The likelihood ratio of this method in the authentication of machine-readable Japanese genuine passports is 11.7. Therefore, this method is effective for the authentication of genuine passports. Since the distribution of the film thickness of all forged passports was considerably larger than the accuracy of this method, this method is also considered effective for revealing the relationships among forged passports and acquiring proof of the crime.
Ryan, K; Williams, D Gareth; Balding, David J
2016-11-01
Many DNA profiles recovered from crime scene samples are of a quality that does not allow them to be searched against, nor entered into, databases. We propose a method for the comparison of profiles arising from two DNA samples, one or both of which can have multiple donors and be affected by low DNA template or degraded DNA. We compute likelihood ratios to evaluate the hypothesis that the two samples have a common DNA donor, and hypotheses specifying the relatedness of two donors. Our method uses a probability distribution for the genotype of the donor of interest in each sample. This distribution can be obtained from a statistical model, or we can exploit the ability of trained human experts to assess genotype probabilities, thus extracting much information that would be discarded by standard interpretation rules. Our method is compatible with established methods in simple settings, but is more widely applicable and can make better use of information than many current methods for the analysis of mixed-source, low-template DNA profiles. It can accommodate uncertainty arising from relatedness instead of or in addition to uncertainty arising from noisy genotyping. We describe a computer program GPMDNA, available under an open source licence, to calculate LRs using the method presented in this paper. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Robust statistical reconstruction for charged particle tomography
Schultz, Larry Joe; Klimenko, Alexei Vasilievich; Fraser, Andrew Mcleod; Morris, Christopher; Orum, John Christopher; Borozdin, Konstantin N; Sossong, Michael James; Hengartner, Nicolas W
2013-10-08
Systems and methods for charged particle detection including statistical reconstruction of object volume scattering density profiles from charged particle tomographic data to determine the probability distribution of charged particle scattering using a statistical multiple scattering model and to determine a substantially maximum likelihood estimate of object volume scattering density using an expectation maximization (ML/EM) algorithm to reconstruct the object volume scattering density. The presence of and/or type of object occupying the volume of interest can be identified from the reconstructed volume scattering density profile. The charged particle tomographic data can be cosmic ray muon tomographic data from a muon tracker for scanning packages, containers, vehicles or cargo. The method can be implemented using a computer program which is executable on a computer.
Semiparametric time-to-event modeling in the presence of a latent progression event.
Rice, John D; Tsodikov, Alex
2017-06-01
In cancer research, interest frequently centers on factors influencing a latent event that must precede a terminal event. In practice it is often impossible to observe the latent event precisely, making inference about this process difficult. To address this problem, we propose a joint model for the unobserved time to the latent and terminal events, with the two events linked by the baseline hazard. Covariates enter the model parametrically as linear combinations that multiply, respectively, the hazard for the latent event and the hazard for the terminal event conditional on the latent one. We derive the partial likelihood estimators for this problem assuming the latent event is observed, and propose a profile likelihood-based method for estimation when the latent event is unobserved. The baseline hazard in this case is estimated nonparametrically using the EM algorithm, which allows for closed-form Breslow-type estimators at each iteration, bringing improved computational efficiency and stability compared with maximizing the marginal likelihood directly. We present simulation studies to illustrate the finite-sample properties of the method; its use in practice is demonstrated in the analysis of a prostate cancer data set. © 2016, The International Biometric Society.
Searching mixed DNA profiles directly against profile databases.
Bright, Jo-Anne; Taylor, Duncan; Curran, James; Buckleton, John
2014-03-01
DNA databases have revolutionised forensic science. They are a powerful investigative tool as they have the potential to identify persons of interest in criminal investigations. Routinely, a DNA profile generated from a crime sample could only be searched for in a database of individuals if the stain was from a single contributor (single source) or if a contributor could unambiguously be determined from a mixed DNA profile. This meant that a significant number of samples were unsuitable for database searching. The advent of continuous methods for the interpretation of DNA profiles offers an advanced way to draw inferential power from the considerable investment made in DNA databases. Using these methods, each profile on the database may be considered a possible contributor to a mixture and a likelihood ratio (LR) can be formed. Those profiles which produce a sufficiently large LR can serve as an investigative lead. In this paper empirical studies are described to determine what constitutes a large LR. We investigate the effect on a database search of complex mixed DNA profiles with contributors in equal proportions with dropout as a consideration, and also the effect of an incorrect assignment of the number of contributors to a profile. In addition, we give, as a demonstration of the method, the results using two crime samples that were previously unsuitable for database comparison. We show that effective management of the selection of samples for searching and the interpretation of the output can be highly informative. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Nikolskiy, Igor; Siuzdak, Gary; Patti, Gary J
2015-06-15
The goal of large-scale metabolite profiling is to compare the relative concentrations of as many metabolites extracted from biological samples as possible. This is typically accomplished by measuring the abundances of thousands of ions with high-resolution and high mass accuracy mass spectrometers. Although the data from these instruments provide a comprehensive fingerprint of each sample, identifying the structures of the thousands of detected ions is still challenging and time intensive. An alternative, less-comprehensive approach is to use triple quadrupole (QqQ) mass spectrometry to analyze predetermined sets of metabolites (typically fewer than several hundred). This is done using authentic standards to develop QqQ experiments that specifically detect only the targeted metabolites, with the advantage that the need for ion identification after profiling is eliminated. Here, we propose a framework to extend the application of QqQ mass spectrometers to large-scale metabolite profiling. We aim to provide a foundation for designing QqQ multiple reaction monitoring (MRM) experiments for each of the 82 696 metabolites in the METLIN metabolite database. First, we identify common fragmentation products from the experimental fragmentation data in METLIN. Then, we model the likelihoods of each precursor structure in METLIN producing each common fragmentation product. With these likelihood estimates, we select ensembles of common fragmentation products that minimize our uncertainty about metabolite identities. We demonstrate encouraging performance and, based on our results, we suggest how our method can be integrated with future work to develop large-scale MRM experiments. Our predictions, Supplementary results, and the code for estimating likelihoods and selecting ensembles of fragmentation reactions are made available on the lab website at http://pattilab.wustl.edu/FragPred. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
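A toy sketch of the ensemble-selection idea under simplified assumptions: each candidate precursor is reduced to a 0/1 profile over common fragments, and fragments are chosen greedily so that the expected number of candidates sharing the same observed pattern shrinks fastest. The paper's actual objective is a likelihood-based uncertainty measure over the METLIN candidates, which this does not reproduce.

import numpy as np

rng = np.random.default_rng(6)
n_cand, n_frag = 12, 20
# hypothetical indicator of whether candidate i is expected to yield common fragment j
produces = (rng.random((n_cand, n_frag)) < 0.3).astype(int)

def expected_ambiguity(cols):
    # expected number of candidates consistent with the observed pattern on the chosen
    # fragments, assuming each candidate is equally likely a priori
    patterns = [tuple(row) for row in produces[:, cols]]
    return sum(patterns.count(p) ** 2 for p in set(patterns)) / n_cand

chosen = []
for _ in range(5):                                     # build a small ensemble greedily
    best = min((f for f in range(n_frag) if f not in chosen),
               key=lambda f: expected_ambiguity(chosen + [f]))
    chosen.append(best)
print("selected fragments:", chosen, "expected ambiguity:", expected_ambiguity(chosen))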
Religiosity profiles of American youth in relation to substance use, violence, and delinquency.
Salas-Wright, Christopher P; Vaughn, Michael G; Hodge, David R; Perron, Brian E
2012-12-01
Relatively little is known in terms of the relationship between religiosity profiles and adolescents' involvement in substance use, violence, and delinquency. Using a diverse sample of 17,705 (49 % female) adolescents from the 2008 National Survey on Drug Use and Health, latent profile analysis and multinomial regression are employed to examine the relationships between latent religiosity classes and substance use, violence, and delinquency. Results revealed a five class solution. Classes were identified as religiously disengaged (10.76 %), religiously infrequent (23.59 %), privately religious (6.55 %), religious regulars (40.85 %), and religiously devoted (18.25 %). Membership in the religiously devoted class was associated with the decreased likelihood of participation in a variety of substance use behaviors as well as decreases in the likelihood of fighting and theft. To a lesser extent, membership in the religious regulars class was also associated with the decreased likelihood of substance use and fighting. However, membership in the religiously infrequent and privately religious classes was only associated with the decreased likelihood of marijuana use. Findings suggest that private religiosity alone does not serve to buffer youth effectively against involvement in problem behavior, but rather that it is the combination of intrinsic and extrinsic adolescent religiosity factors that is associated with participation in fewer problem behaviors.
Factors Associated with Young Adults’ Pregnancy Likelihood
Kitsantas, Panagiota; Lindley, Lisa L.; Wu, Huichuan
2014-01-01
OBJECTIVES While progress has been made to reduce adolescent pregnancies in the United States, rates of unplanned pregnancy among young adults (18–29 years) remain high. In this study, we assessed factors associated with perceived likelihood of pregnancy (likelihood of getting pregnant/getting partner pregnant in the next year) among sexually experienced young adults who were not trying to get pregnant and had ever used contraceptives. METHODS We conducted a secondary analysis of 660 young adults, 18–29 years old in the United States, from the cross-sectional National Survey of Reproductive and Contraceptive Knowledge. Logistic regression and classification tree analyses were conducted to generate profiles of young adults most likely to report anticipating a pregnancy in the next year. RESULTS Nearly one-third (32%) of young adults indicated they believed they had at least some likelihood of becoming pregnant in the next year. Young adults who believed that avoiding pregnancy was not very important were most likely to report pregnancy likelihood (odds ratio [OR], 5.21; 95% CI, 2.80–9.69), as were young adults for whom avoiding a pregnancy was important but not satisfied with their current contraceptive method (OR, 3.93; 95% CI, 1.67–9.24), attended religious services frequently (OR, 3.0; 95% CI, 1.52–5.94), were uninsured (OR, 2.63; 95% CI, 1.31–5.26), and were likely to have unprotected sex in the next three months (OR, 1.77; 95% CI, 1.04–3.01). DISCUSSION These results may help guide future research and the development of pregnancy prevention interventions targeting sexually experienced young adults. PMID:25782849
Hulsegge, Gerben; van der Schouw, Yvonne T; Daviglus, Martha L; Smit, Henriëtte A; Verschuren, W M Monique
2016-02-01
While maintenance of a low cardiovascular risk profile is essential for cardiovascular disease (CVD) prevention, few people maintain a low CVD risk profile throughout their life. We studied the association of demographic, lifestyle, psychological factors and family history of CVD with attainment and maintenance of a low risk profile over three subsequent 5-year periods. Measurements of 6390 adults aged 26-65 years at baseline were completed from 1993 to 97 and subsequently at 5-year intervals until 2013. At each wave, participants were categorized into low risk profile (ideal levels of blood pressure, cholesterol and body mass index, non-smoking and no diabetes) and medium/high risk profile (all others). Multivariable-adjusted modified Poisson regression analyses were used to examine determinants of attainment and maintenance of low risk; risk ratios (RR) and 95% confidence intervals (95% CI) were obtained. Generalized estimating equations were used to combine multiple 5-year comparisons. Younger age, female gender and high educational level were associated with higher likelihood of both maintaining and attaining low risk profile (P < 0.05). In addition, likelihood of attaining low risk was 9% higher with each 1-unit increment in Mediterranean diet score (RR: 1.09, 95% CI: 1.02-1.16), twice as high with any physical activity versus none (RR: 2.17, 95% CI: 1.16-4.04) and 35% higher with moderate alcohol consumption versus heavy consumption (RR: 1.35, 95% CI: 1.06-1.73). Healthy lifestyle factors such as adherence to a Mediterranean diet, physical activity and moderate as opposed to heavy alcohol consumption were associated with a higher likelihood of attaining a low risk profile. © The Author 2015. Published by Oxford University Press on behalf of the European Public Health Association. All rights reserved.
Flassig, Robert J; Migal, Iryna; der Zalm, Esther van; Rihko-Struckmann, Liisa; Sundmacher, Kai
2015-01-16
Understanding the dynamics of biological processes can substantially be supported by computational models in the form of nonlinear ordinary differential equations (ODE). Typically, this model class contains many unknown parameters, which are estimated from inadequate and noisy data. Depending on the ODE structure, predictions based on unmeasured states and associated parameters are highly uncertain, even undetermined. For given data, profile likelihood analysis has been proven to be one of the most practically relevant approaches for analyzing the identifiability of an ODE structure, and thus model predictions. In case of highly uncertain or non-identifiable parameters, rational experimental design based on various approaches has shown to significantly reduce parameter uncertainties with minimal amount of effort. In this work we illustrate how to use profile likelihood samples for quantifying the individual contribution of parameter uncertainty to prediction uncertainty. For the uncertainty quantification we introduce the profile likelihood sensitivity (PLS) index. Additionally, for the case of several uncertain parameters, we introduce the PLS entropy to quantify individual contributions to the overall prediction uncertainty. We show how to use these two criteria as an experimental design objective for selecting new, informative readouts in combination with intervention site identification. The characteristics of the proposed multi-criterion objective are illustrated with an in silico example. We further illustrate how an existing practically non-identifiable model for the chlorophyll fluorescence induction in a photosynthetic organism, D. salina, can be rendered identifiable by additional experiments with new readouts. Having data and profile likelihood samples at hand, the here proposed uncertainty quantification based on prediction samples from the profile likelihood provides a simple way for determining individual contributions of parameter uncertainties to uncertainties in model predictions. The uncertainty quantification of specific model predictions allows identifying regions, where model predictions have to be considered with care. Such uncertain regions can be used for a rational experimental design to render initially highly uncertain model predictions into certainty. Finally, our uncertainty quantification directly accounts for parameter interdependencies and parameter sensitivities of the specific prediction.
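The abstract does not spell out the PLS index formula, so the following is only a loose illustration of the underlying idea: evaluate a model prediction along each parameter's profile-likelihood sample path and compare the induced prediction spreads. The toy model, the names and the normalization below are assumptions, not the paper's definitions.

import numpy as np

def prediction(theta, t=5.0):
    # hypothetical model prediction, here an exponential decay observed at time t
    k_deg, scale = theta
    return scale * np.exp(-k_deg * t)

# hypothetical profile-likelihood samples: parameter vectors collected while each
# parameter in turn is varied along its profile within the confidence threshold
profile_samples = {
    "k_deg": np.column_stack([np.linspace(0.08, 0.12, 25), np.full(25, 2.0)]),
    "scale": np.column_stack([np.full(25, 0.10), np.linspace(1.6, 2.4, 25)]),
}

spreads = {name: np.ptp([prediction(th) for th in samples])
           for name, samples in profile_samples.items()}
total = sum(spreads.values())
for name, s in spreads.items():
    print(f"{name}: prediction spread {s:.3f}, share of total spread {s / total:.2f}")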
NASA Astrophysics Data System (ADS)
Lundberg, J.; Conrad, J.; Rolke, W.; Lopez, A.
2010-03-01
A C++ class was written for the calculation of frequentist confidence intervals using the profile likelihood method. Seven combinations of Binomial, Gaussian and Poissonian uncertainties are implemented. The package provides routines for the calculation of upper and lower limits, sensitivity and related properties. It also supports hypothesis tests which take uncertainties into account. It can be used in compiled C++ code, in Python or interactively via the ROOT analysis framework. Program summary: Program title: TRolke version 2.0. Catalogue identifier: AEFT_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEFT_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: MIT license. No. of lines in distributed program, including test data, etc.: 3431. No. of bytes in distributed program, including test data, etc.: 21 789. Distribution format: tar.gz. Programming language: ISO C++. Computer: Unix, GNU/Linux, Mac. Operating system: Linux 2.6 (Scientific Linux 4 and 5, Ubuntu 8.10), Darwin 9.0 (Mac-OS X 10.5.8). RAM: ~20 MB. Classification: 14.13. External routines: ROOT (http://root.cern.ch/drupal/). Nature of problem: Calculation of a frequentist confidence interval on the parameter of a Poisson process with statistical or systematic uncertainties in signal efficiency or background. Solution method: Profile likelihood method, analytical. Running time: <10 seconds per extracted limit.
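The construction behind such limits can be made concrete with a small sketch. The following Python code is not the TRolke API; it is a hedged, minimal illustration of a profile-likelihood interval for one of the simplest cases the package covers (a Poisson count with a Gaussian-constrained background). The function names, grid, and threshold choices are assumptions of this sketch.

```python
import numpy as np
from scipy.stats import poisson, norm, chi2
from scipy.optimize import minimize_scalar

def neg2_profile_loglik(s, n_obs, b_hat, sigma_b):
    """-2 log L at signal strength s, profiled over the background b.

    Model: n_obs ~ Poisson(s + b), with an auxiliary Gaussian measurement
    b_hat ~ N(b, sigma_b) constraining the nuisance background."""
    def neg2ll(b):
        if s + b <= 0:
            return np.inf
        return -2.0 * (poisson.logpmf(n_obs, s + b) +
                       norm.logpdf(b_hat, loc=b, scale=sigma_b))
    res = minimize_scalar(neg2ll, bounds=(1e-9, b_hat + 10 * sigma_b),
                          method="bounded")
    return res.fun

def profile_likelihood_interval(n_obs, b_hat, sigma_b, cl=0.90):
    """Scan s and keep the region where the profile likelihood ratio
    stays below the chi-square threshold for the chosen level."""
    s_grid = np.linspace(0.0, n_obs + 10 * np.sqrt(n_obs + 1), 400)
    curve = np.array([neg2_profile_loglik(s, n_obs, b_hat, sigma_b)
                      for s in s_grid])
    threshold = curve.min() + chi2.ppf(cl, df=1)
    allowed = s_grid[curve <= threshold]
    return allowed.min(), allowed.max()

print(profile_likelihood_interval(n_obs=10, b_hat=3.0, sigma_b=1.0))
```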
Expectation maximization for hard X-ray count modulation profiles
NASA Astrophysics Data System (ADS)
Benvenuto, F.; Schwartz, R.; Piana, M.; Massone, A. M.
2013-07-01
Context. This paper is concerned with the image reconstruction problem when the measured data are solar hard X-ray modulation profiles obtained from the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI) instrument. Aims: Our goal is to demonstrate that a statistical iterative method classically applied to the image deconvolution problem is very effective when utilized to analyze count modulation profiles in solar hard X-ray imaging based on rotating modulation collimators. Methods: The algorithm described in this paper solves the maximum likelihood problem iteratively and encodes a positivity constraint into the iterative optimization scheme. The result is therefore a classical expectation maximization method this time applied not to an image deconvolution problem but to image reconstruction from count modulation profiles. The technical reason that makes our implementation particularly effective in this application is the use of a very reliable stopping rule which is able to regularize the solution providing, at the same time, a very satisfactory Cash-statistic (C-statistic). Results: The method is applied to both reproduce synthetic flaring configurations and reconstruct images from experimental data corresponding to three real events. In this second case, the performance of expectation maximization, when compared to Pixon image reconstruction, shows a comparable accuracy and a notably reduced computational burden; when compared to CLEAN, shows a better fidelity with respect to the measurements with a comparable computational effectiveness. Conclusions: If optimally stopped, expectation maximization represents a very reliable method for image reconstruction in the RHESSI context when count modulation profiles are used as input data.
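As a hedged illustration of the kind of positivity-preserving expectation maximization update described above (and not the authors' RHESSI-specific implementation), the following Python sketch applies the standard ML-EM (Richardson-Lucy-type) iteration to Poisson counts under a generic linear response model; the matrix A, the initialization, and the fixed iteration count are placeholders.

```python
import numpy as np

def ml_em(A, y, n_iter=200, eps=1e-12):
    """Expectation-maximization iteration for a Poisson model
    y ~ Poisson(A @ f), with the reconstruction f kept non-negative.

    A : (m, n) response matrix mapping the image f to expected counts.
    y : (m,) observed counts (e.g. a count modulation profile)."""
    f = np.full(A.shape[1], y.mean() / max(A.sum(axis=0).mean(), eps))
    col_sums = A.sum(axis=0) + eps            # A^T 1
    for _ in range(n_iter):
        y_pred = A @ f + eps
        # Multiplicative update: preserves positivity of f at every step.
        f = f * (A.T @ (y / y_pred)) / col_sums
        # A practical stopping rule would monitor a Poisson fit statistic,
        # such as the Cash statistic (C-statistic), and stop the iteration
        # once it no longer improves meaningfully.
    return f
```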
Identifying Patterns of Situational Antecedents to Heavy Drinking among College Students
Lau-Barraco, Cathy; Linden-Carmichael, Ashley N.; Braitman, Abby L.; Stamates, Amy L.
2016-01-01
Background Emerging adults have the highest prevalence of heavy drinking as compared to all other age groups. Given the negative consequences associated with such drinking, additional research efforts focused on at-risk consumption are warranted. The current study sought to identify patterns of situational antecedents to drinking and to examine their associations with drinking motivations, alcohol involvement, and mental health functioning in a sample of heavy drinking college students. Method Participants were 549 (65.8% women) college student drinkers. Results Latent profile analysis identified three classes based on likelihood of heavy drinking across eight situational precipitants. The “High Situational Endorsement” group reported the greatest likelihood of heavy drinking in most situations assessed. This class experienced the greatest level of alcohol-related harms as compared to the “Low Situational Endorsement” and “Moderate Situational Endorsement” groups. The Low Situational Endorsement class was characterized by the lowest likelihood of heavy drinking across all situational antecedents and they experienced the fewest alcohol-related harms, relative to the other classes. Class membership was related to drinking motivations with the “High Situational Endorsement” class endorsing the highest coping- and conformity-motivated drinking. The “High Situational Endorsement” class also reported experiencing more mental health symptoms than other groups. Conclusions The current study contributed to the larger drinking literature by identifying profiles that may signify a particularly risky drinking style. Findings may help guide intervention work with college heavy drinkers. PMID:28163666
Maximum Likelihood and Restricted Likelihood Solutions in Multiple-Method Studies
Rukhin, Andrew L.
2011-01-01
A formulation of the problem of combining data from several sources is discussed in terms of random effects models. The unknown measurement precision is assumed not to be the same for all methods. We investigate maximum likelihood solutions in this model. By representing the likelihood equations as simultaneous polynomial equations, the exact form of the Groebner basis for their stationary points is derived when there are two methods. A parametrization of these solutions which allows their comparison is suggested. A numerical method for solving likelihood equations is outlined, and an alternative to the maximum likelihood method, the restricted maximum likelihood, is studied. In the situation when methods variances are considered to be known an upper bound on the between-method variance is obtained. The relationship between likelihood equations and moment-type equations is also discussed. PMID:26989583
Elashoff, Robert M.; Li, Gang; Li, Ning
2009-01-01
Summary In this article we study a joint model for longitudinal measurements and competing risks survival data. Our joint model provides a flexible approach to handle possible nonignorable missing data in the longitudinal measurements due to dropout. It is also an extension of previous joint models with a single failure type, offering a possible way to model informatively censored events as a competing risk. Our model consists of a linear mixed effects submodel for the longitudinal outcome and a proportional cause-specific hazards frailty submodel (Prentice et al., 1978, Biometrics 34, 541-554) for the competing risks survival data, linked together by some latent random effects. We propose to obtain the maximum likelihood estimates of the parameters by an expectation maximization (EM) algorithm and estimate their standard errors using a profile likelihood method. The developed method works well in our simulation studies and is applied to a clinical trial for the scleroderma lung disease. PMID:18162112
Leveraging cues from person-generated health data for peer matching in online communities.
Hartzler, Andrea L; Taylor, Megan N; Park, Albert; Griffiths, Troy; Backonja, Uba; McDonald, David W; Wahbeh, Sam; Brown, Cory; Pratt, Wanda
2016-05-01
Online health communities offer a diverse peer support base, yet users can struggle to identify suitable peer mentors as these communities grow. To facilitate mentoring connections, we designed a peer-matching system that automatically profiles and recommends peer mentors to mentees based on person-generated health data (PGHD). This study examined the profile characteristics that mentees value when choosing a peer mentor. Through a mixed-methods user study, in which cancer patients and caregivers evaluated peer mentor recommendations, we examined the relative importance of four possible profile elements: health interests, language style, demographics, and sample posts. Playing the role of mentees, the study participants ranked mentors, then rated both the likelihood that they would hypothetically contact each mentor and the helpfulness of each profile element in helping them make that decision. We analyzed the participants' ratings with linear regression and qualitatively analyzed participants' feedback for emerging themes about choosing mentors and improving profile design. Of the four profile elements, only sample posts were a significant predictor of the likelihood of a mentee contacting a mentor. Communication cues embedded in posts were critical for helping the participants choose a compatible mentor. Qualitative themes offer insight into the interpersonal characteristics that mentees sought in peer mentors, including being knowledgeable, sociable, and articulate. Additionally, the participants emphasized the need for streamlined profiles that minimize the time required to choose a mentor. Peer-matching systems in online health communities offer a promising approach for leveraging PGHD to connect patients. Our findings point to interpersonal communication cues embedded in PGHD that could prove critical for building mentoring relationships among the growing membership of online health communities. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
NGS-based likelihood ratio for identifying contributors in two- and three-person DNA mixtures.
Chan Mun Wei, Joshua; Zhao, Zicheng; Li, Shuai Cheng; Ng, Yen Kaow
2018-06-01
DNA fingerprinting, also known as DNA profiling, serves as a standard procedure in forensics to identify a person by the short tandem repeat (STR) loci in their DNA. By comparing the STR loci between DNA samples, practitioners can calculate a probability of match to identify the contributors of a DNA mixture. Most existing methods are based on 13 core STR loci which were identified by the Federal Bureau of Investigation (FBI). Analyses of DNA mixtures based on these loci for forensic purposes are highly variable in their procedures and suffer from subjectivity as well as bias in complex mixture interpretation. With the emergence of next-generation sequencing (NGS) technologies, the sequencing of billions of DNA molecules can be parallelized, thus greatly increasing throughput and reducing the associated costs. This allows the creation of new techniques that incorporate more loci to enable complex mixture interpretation. In this paper, we propose a likelihood ratio computation that uses NGS data for DNA testing on mixed samples. We have applied the method to 4480 simulated DNA mixtures, which consist of various mixture proportions of eight unrelated whole-genome sequencing datasets. The results confirm the feasibility of utilizing NGS data in DNA mixture interpretations. We observed an average likelihood ratio as high as 285,978 for two-person mixtures. Using our method, all 224 identity tests for two-person and three-person mixtures were correctly identified. Copyright © 2018 Elsevier Ltd. All rights reserved.
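For readers unfamiliar with how such ratios are reported, the following Python sketch shows only the generic step of combining per-locus likelihood ratios across independent loci; the per-locus probabilities are invented placeholders, and the sketch does not reproduce the NGS-based computation proposed in the paper.

```python
import math

def combined_likelihood_ratio(per_locus):
    """Combine per-locus likelihood ratios over independent loci.

    per_locus : iterable of (p_evidence_given_h1, p_evidence_given_h2)
                pairs, one per locus. The overall LR is the product of
                the per-locus ratios; log10 is reported for readability."""
    log10_lr = sum(math.log10(p1) - math.log10(p2) for p1, p2 in per_locus)
    return 10 ** log10_lr, log10_lr

# Hypothetical per-locus probabilities under H1 (suspect contributed) and
# H2 (unknown, unrelated contributor); illustrative numbers only.
loci = [(0.12, 0.010), (0.30, 0.045), (0.08, 0.002)]
lr, log10_lr = combined_likelihood_ratio(loci)
print(f"LR = {lr:.1f}  (log10 LR = {log10_lr:.2f})")
```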
Impurity profiling of trinitrotoluene using vacuum-outlet gas chromatography-mass spectrometry.
Brust, Hanneke; Willemse, Sander; Zeng, Tuoyu; van Asten, Arian; Koeberg, Mattijs; van der Heijden, Antoine; Bolck, Annabel; Schoenmakers, Peter
2014-12-29
In this work, a reliable and robust vacuum-outlet gas chromatography-mass spectrometry (GC-MS) method is introduced for the identification and quantification of impurities in trinitrotoluene (TNT). Vacuum-outlet GC-MS allows for short analysis times; the analysis of impurities in TNT was performed in 4 min. This study shows that impurity profiling of TNT can be used to investigate relations between TNT samples encountered in forensic casework. A wide variety of TNT samples were analyzed with the developed method. Dinitrobenzene, dinitrotoluene, trinitrotoluene and amino-dinitrotoluene isomers were detected at very low levels (<1 wt.%) by applying the MS in selected-ion monitoring (SIM) mode. Limits of detection ranged from 6 ng/mL for 2,6-dinitrotoluene to 43 ng/mL for 4-amino-2,6-dinitrotoluene. Major impurities in TNT were 2,4-dinitrotoluene and 2,3,4-trinitrotoluene. Impurity profiles based on seven compounds proved useful for distinguishing TNT samples from different sources. Statistical analysis of these impurity profiles using likelihood ratios demonstrated the potential to investigate whether two questioned TNT samples encountered in forensic casework are from the same source. Copyright © 2014 Elsevier B.V. All rights reserved.
Experimental Design for Parameter Estimation of Gene Regulatory Networks
Timmer, Jens
2012-01-01
Systems biology aims for building quantitative models to address unresolved issues in molecular biology. In order to describe the behavior of biological cells adequately, gene regulatory networks (GRNs) are intensively investigated. As the validity of models built for GRNs depends crucially on the kinetic rates, various methods have been developed to estimate these parameters from experimental data. For this purpose, it is favorable to choose the experimental conditions yielding maximal information. However, existing experimental design principles often rely on unfulfilled mathematical assumptions or become computationally demanding with growing model complexity. To solve this problem, we combined advanced methods for parameter and uncertainty estimation with experimental design considerations. As a showcase, we optimized three simulated GRNs in one of the challenges from the Dialogue for Reverse Engineering Assessment and Methods (DREAM). This article presents our approach, which was awarded the best performing procedure at the DREAM6 Estimation of Model Parameters challenge. For fast and reliable parameter estimation, local deterministic optimization of the likelihood was applied. We analyzed identifiability and precision of the estimates by calculating the profile likelihood. Furthermore, the profiles provided a way to uncover a selection of most informative experiments, from which the optimal one was chosen using additional criteria at every step of the design process. In conclusion, we provide a strategy for optimal experimental design and show its successful application on three highly nonlinear dynamic models. Although presented in the context of the GRNs to be inferred for the DREAM6 challenge, the approach is generic and applicable to most types of quantitative models in systems biology and other disciplines. PMID:22815723
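The profile likelihood calculation referred to here follows a generic recipe: fix one parameter on a grid and re-optimize all remaining parameters at each grid point. The Python sketch below illustrates that recipe under stated assumptions (a user-supplied negative log-likelihood and Nelder-Mead re-optimization); it is not the DREAM6 pipeline or any particular package.

```python
import numpy as np
from scipy.optimize import minimize

def profile_likelihood(neg_loglik, theta_hat, index, grid):
    """Profile the negative log-likelihood over one parameter.

    neg_loglik : callable taking the full parameter vector.
    theta_hat  : maximum likelihood estimate (starting point).
    index      : position of the parameter to be profiled.
    grid       : values at which that parameter is fixed.

    Returns the profiled -log L at each grid value; flat profiles indicate
    practical non-identifiability, sharply curved ones a well-determined
    parameter."""
    start = np.array(theta_hat, dtype=float)
    free = [i for i in range(start.size) if i != index]
    profile = []
    for value in grid:
        def restricted(sub):
            full = start.copy()
            full[index] = value
            full[free] = sub
            return neg_loglik(full)
        res = minimize(restricted, start[free], method="Nelder-Mead")
        profile.append(res.fun)
        start[free] = res.x          # warm-start the next grid point
    return np.array(profile)
```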
Modeling Bivariate Longitudinal Hormone Profiles by Hierarchical State Space Models
Liu, Ziyue; Cappola, Anne R.; Crofford, Leslie J.; Guo, Wensheng
2013-01-01
The hypothalamic-pituitary-adrenal (HPA) axis is crucial in coping with stress and maintaining homeostasis. Hormones produced by the HPA axis exhibit both complex univariate longitudinal profiles and complex relationships among different hormones. Consequently, modeling these multivariate longitudinal hormone profiles is a challenging task. In this paper, we propose a bivariate hierarchical state space model, in which each hormone profile is modeled by a hierarchical state space model, with both population-average and subject-specific components. The bivariate model is constructed by concatenating the univariate models based on the hypothesized relationship. Because of the flexible framework of state space form, the resultant models not only can handle complex individual profiles, but also can incorporate complex relationships between two hormones, including both concurrent and feedback relationship. Estimation and inference are based on marginal likelihood and posterior means and variances. Computationally efficient Kalman filtering and smoothing algorithms are used for implementation. Application of the proposed method to a study of chronic fatigue syndrome and fibromyalgia reveals that the relationships between adrenocorticotropic hormone and cortisol in the patient group are weaker than in healthy controls. PMID:24729646
Maximum likelihood solution for inclination-only data in paleomagnetism
NASA Astrophysics Data System (ADS)
Arason, P.; Levi, S.
2010-08-01
We have developed a new robust maximum likelihood method for estimating the unbiased mean inclination from inclination-only data. In paleomagnetic analysis, the arithmetic mean of inclination-only data is known to introduce a shallowing bias. Several methods have been introduced to estimate the unbiased mean inclination of inclination-only data together with measures of the dispersion. Some inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all the methods require various assumptions and approximations that are often inappropriate. For some steep and dispersed data sets, these methods provide estimates that are significantly displaced from the peak of the likelihood function to systematically shallower inclinations. The problem of locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest, because some elements of the likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study, we succeeded in analytically cancelling exponential elements from the log-likelihood function, and we are now able to calculate its value anywhere in the parameter space and for any inclination-only data set. Furthermore, we can now calculate the partial derivatives of the log-likelihood function with desired accuracy, and locate the maximum likelihood without the assumptions required by previous methods. To assess the reliability and accuracy of our method, we generated large numbers of random Fisher-distributed data sets, for which we calculated mean inclinations and precision parameters. The comparisons show that our new robust Arason-Levi maximum likelihood method is the most reliable, and the mean inclination estimates are the least biased towards shallow values.
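The analytic cancellation described here is specific to the marginal Fisher likelihood, but the numerical difficulty it addresses is a familiar one. As a loosely related, hedged illustration only, the following Python snippet shows the standard log-sum-exp device for evaluating the log of a sum of exponentially large likelihood terms without overflow.

```python
import numpy as np

def log_sum_exp(log_terms):
    """Evaluate log(sum_i exp(a_i)) without overflow by factoring out the
    largest exponent, a common device when individual likelihood terms
    grow exponentially with a precision parameter."""
    log_terms = np.asarray(log_terms, dtype=float)
    m = log_terms.max()
    return m + np.log(np.exp(log_terms - m).sum())

# Terms like exp(kappa * x) overflow for large precision kappa,
# but their log-sum is perfectly well behaved:
print(log_sum_exp([1000.0, 1001.0, 999.5]))   # ~1001.46
```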
Paninski, Liam; Haith, Adrian; Szirtes, Gabor
2008-02-01
We recently introduced likelihood-based methods for fitting stochastic integrate-and-fire models to spike train data. The key component of this method involves the likelihood that the model will emit a spike at a given time t. Computing this likelihood is equivalent to computing a Markov first passage time density (the probability that the model voltage crosses threshold for the first time at time t). Here we detail an improved method for computing this likelihood, based on solving a certain integral equation. This integral equation method has several advantages over the techniques discussed in our previous work: in particular, the new method has fewer free parameters and is easily differentiable (for gradient computations). The new method is also easily adaptable for the case in which the model conductance, not just the input current, is time-varying. Finally, we describe how to incorporate large deviations approximations to very small likelihoods.
Motivation and Self-Perception Profiles and Links with Physical Activity in Adolescent Girls
ERIC Educational Resources Information Center
Biddle, Stuart J. H.; Wang, C. K. John
2003-01-01
Research shows a decline in participation in physical activity across the teenage years. It is important, therefore, to examine factors that might influence adolescent girl's likelihood of being physically active. This study used contemporary theoretical perspectives from psychology to assess a comprehensive profile of motivational and…
Influence analysis in quantitative trait loci detection.
Dou, Xiaoling; Kuriki, Satoshi; Maeno, Akiteru; Takada, Toyoyuki; Shiroishi, Toshihiko
2014-07-01
This paper presents systematic methods for the detection of influential individuals that affect the log odds (LOD) score curve. We derive general formulas of influence functions for profile likelihoods and introduce them into two standard quantitative trait locus detection methods-the interval mapping method and single marker analysis. Besides influence analysis on specific LOD scores, we also develop influence analysis methods on the shape of the LOD score curves. A simulation-based method is proposed to assess the significance of the influence of the individuals. These methods are shown useful in the influence analysis of a real dataset of an experimental population from an F2 mouse cross. By receiver operating characteristic analysis, we confirm that the proposed methods show better performance than existing diagnostics. © 2014 The Author. Biometrical Journal published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Predicting workplace aggression and violence.
Barling, Julian; Dupré, Kathryne E; Kelloway, E Kevin
2009-01-01
Consistent with the relative recency of research on workplace aggression and the considerable media attention given to high-profile incidents, numerous myths about the nature of workplace aggression have emerged. In this review, we examine these myths from an evidence-based perspective, bringing greater clarity to our understanding of the predictors of workplace aggression. We conclude by pointing to the need for more research focusing on construct validity and prevention issues as well as for methodologies that minimize the likelihood of mono-method bias and that strengthen the ability to make causal inferences.
Armour, Cherie; Műllerová, Jana; Fletcher, Shelley; Lagdon, Susan; Burns, Carol Rhonda; Robinson, Martin; Robinson, Jake
2016-03-01
Previous research suggests that childhood maltreatment is associated with the onset of eating disorders (EDs). In turn, EDs are associated with other psychopathologies such as depression and posttraumatic stress disorder (PTSD), and with suicidality. Moreover, it has been reported that various ED profiles may exist. The aim of the current study was to examine profiles of disordered eating and their associations with childhood maltreatment and with mental health psychopathology. The current study utilised a representative sample of English females (N = 4206) and assessed for the presence of disordered eating profiles using Latent Class Analysis. Multinomial logistic regression was implemented to examine the associations of childhood sexual and physical abuse with the disordered eating profiles, and the associations of these profiles with PTSD, depression and suicidality. Results supported previous findings: five latent classes were identified, of which three were regarded as disordered eating classes. Significant relationships were found between these classes and measures of childhood trauma and mental health outcomes. Childhood sexual and physical abuse increased the likelihood of membership in the disordered eating classes, and membership in these classes in turn increased the likelihood of adverse mental health and suicidal outcomes.
Whitty, Jennifer A; Rundle-Thiele, Sharyn R; Scuffham, Paul A
2012-03-01
Discrete choice experiments (DCEs) and the Juster scale are accepted methods for the prediction of individual purchase probabilities. Nevertheless, these methods have seldom been applied to a social decision-making context. To gain an overview of social decisions for a decision-making population through data triangulation, these two methods were used to understand purchase probability in a social decision-making context. We report an exploratory social decision-making study of pharmaceutical subsidy in Australia. A DCE and selected Juster scale profiles were presented to current and past members of the Australian Pharmaceutical Benefits Advisory Committee and its Economic Subcommittee. Across 66 observations derived from 11 respondents for 6 different pharmaceutical profiles, there was a small overall median difference of 0.024 in the predicted probability of public subsidy (p = 0.003), with the Juster scale predicting the higher likelihood. While consistency was observed at the extremes of the probability scale, the funding probability differed over the mid-range of profiles. There was larger variability in the DCE than Juster predictions within each individual respondent, suggesting the DCE is better able to discriminate between profiles. However, large variation was observed between individuals in the Juster scale but not DCE predictions. It is important to use multiple methods to obtain a complete picture of the probability of purchase or public subsidy in a social decision-making context until further research can elaborate on our findings. This exploratory analysis supports the suggestion that the mixed logit model, which was used for the DCE analysis, may fail to adequately account for preference heterogeneity in some contexts.
On the Relation between the Linear Factor Model and the Latent Profile Model
ERIC Educational Resources Information Center
Halpin, Peter F.; Dolan, Conor V.; Grasman, Raoul P. P. P.; De Boeck, Paul
2011-01-01
The relationship between linear factor models and latent profile models is addressed within the context of maximum likelihood estimation based on the joint distribution of the manifest variables. Although the two models are well known to imply equivalent covariance decompositions, in general they do not yield equivalent estimates of the…
Fast automated analysis of strong gravitational lenses with convolutional neural networks.
Hezaveh, Yashar D; Levasseur, Laurence Perreault; Marshall, Philip J
2017-08-30
Quantifying image distortions caused by strong gravitational lensing-the formation of multiple images of distant sources due to the deflection of their light by the gravity of intervening structures-and estimating the corresponding matter distribution of these structures (the 'gravitational lens') has primarily been performed using maximum likelihood modelling of observations. This procedure is typically time- and resource-consuming, requiring sophisticated lensing codes, several data preparation steps, and finding the maximum likelihood model parameters in a computationally expensive process with downhill optimizers. Accurate analysis of a single gravitational lens can take up to a few weeks and requires expert knowledge of the physical processes and methods involved. Tens of thousands of new lenses are expected to be discovered with the upcoming generation of ground and space surveys. Here we report the use of deep convolutional neural networks to estimate lensing parameters in an extremely fast and automated way, circumventing the difficulties that are faced by maximum likelihood methods. We also show that the removal of lens light can be made fast and automated using independent component analysis of multi-filter imaging data. Our networks can recover the parameters of the 'singular isothermal ellipsoid' density profile, which is commonly used to model strong lensing systems, with an accuracy comparable to the uncertainties of sophisticated models but about ten million times faster: 100 systems in approximately one second on a single graphics processing unit. These networks can provide a way for non-experts to obtain estimates of lensing parameters for large samples of data.
NASA Technical Reports Server (NTRS)
Carson, John M., III; Bayard, David S.
2006-01-01
G-SAMPLE is an in-flight dynamical method for use by sample collection missions to identify the presence and quantity of collected sample material. The G-SAMPLE method implements a maximum-likelihood estimator to identify the collected sample mass, based on onboard force sensor measurements, thruster firings, and a dynamics model of the spacecraft. With G-SAMPLE, sample mass identification becomes a computation rather than an extra hardware requirement; the added cost of cameras or other sensors for sample mass detection is avoided. Realistic simulation examples are provided for a spacecraft configuration with a sample collection device mounted on the end of an extended boom. In one representative example, a 1000 gram sample mass is estimated to within 110 grams (95% confidence) under realistic assumptions of thruster profile error, spacecraft parameter uncertainty, and sensor noise. For convenience to future mission design, an overall sample-mass estimation error budget is developed to approximate the effect of model uncertainty, sensor noise, data rate, and thrust profile error on the expected estimate of collected sample mass.
Exclusion probabilities and likelihood ratios with applications to kinship problems.
Slooten, Klaas-Jan; Egeland, Thore
2014-05-01
In forensic genetics, DNA profiles are compared in order to make inferences, paternity cases being a standard example. The statistical evidence can be summarized and reported in several ways. For example, in a paternity case, the likelihood ratio (LR) and the probability of not excluding a random man as father (RMNE) are two common summary statistics. There has been a long debate on the merits of the two statistics, also in the context of DNA mixture interpretation, and no general consensus has been reached. In this paper, we show that the RMNE is a certain weighted average of inverse likelihood ratios. This is true in any forensic context. We show that the likelihood ratio in favor of the correct hypothesis is, in expectation, bigger than the reciprocal of the RMNE probability. However, with the exception of pathological cases, it is also possible to obtain smaller likelihood ratios. We illustrate this result for paternity cases. Moreover, some theoretical properties of the likelihood ratio for a large class of general pairwise kinship cases, including expected value and variance, are derived. The practical implications of the findings are discussed and exemplified.
Assessment of parametric uncertainty for groundwater reactive transport modeling.
Shi, Xiaoqing; Ye, Ming; Curtis, Gary P.; Miller, Geoffery L.; Meyer, Philip D.; Kohler, Matthias; Yabusaki, Steve; Wu, Jichun
2014-01-01
The validity of using Gaussian assumptions for model residuals in uncertainty quantification of a groundwater reactive transport model was evaluated in this study. Least squares regression methods explicitly assume Gaussian residuals, and the assumption leads to Gaussian likelihood functions, model parameters, and model predictions. While the Bayesian methods do not explicitly require the Gaussian assumption, Gaussian residuals are widely used. This paper shows that the residuals of the reactive transport model are non-Gaussian, heteroscedastic, and correlated in time; characterizing them requires using a generalized likelihood function such as the formal generalized likelihood function developed by Schoups and Vrugt (2010). For the surface complexation model considered in this study for simulating uranium reactive transport in groundwater, parametric uncertainty is quantified using the least squares regression methods and Bayesian methods with both Gaussian and formal generalized likelihood functions. While the least squares methods and Bayesian methods with Gaussian likelihood function produce similar Gaussian parameter distributions, the parameter distributions of Bayesian uncertainty quantification using the formal generalized likelihood function are non-Gaussian. In addition, predictive performance of formal generalized likelihood function is superior to that of least squares regression and Bayesian methods with Gaussian likelihood function. The Bayesian uncertainty quantification is conducted using the differential evolution adaptive metropolis (DREAM(zs)) algorithm; as a Markov chain Monte Carlo (MCMC) method, it is a robust tool for quantifying uncertainty in groundwater reactive transport models. For the surface complexation model, the regression-based local sensitivity analysis and Morris- and DREAM(ZS)-based global sensitivity analysis yield almost identical ranking of parameter importance. The uncertainty analysis may help select appropriate likelihood functions, improve model calibration, and reduce predictive uncertainty in other groundwater reactive transport and environmental modeling.
On Bayesian Testing of Additive Conjoint Measurement Axioms Using Synthetic Likelihood
ERIC Educational Resources Information Center
Karabatsos, George
2017-01-01
This article introduces a Bayesian method for testing the axioms of additive conjoint measurement. The method is based on an importance sampling algorithm that performs likelihood-free, approximate Bayesian inference using a synthetic likelihood to overcome the analytical intractability of this testing problem. This new method improves upon…
Chen, Yunshun; Lun, Aaron T L; Smyth, Gordon K
2016-01-01
In recent years, RNA sequencing (RNA-seq) has become a very widely used technology for profiling gene expression. One of the most common aims of RNA-seq profiling is to identify genes or molecular pathways that are differentially expressed (DE) between two or more biological conditions. This article demonstrates a computational workflow for the detection of DE genes and pathways from RNA-seq data by providing a complete analysis of an RNA-seq experiment profiling epithelial cell subsets in the mouse mammary gland. The workflow uses R software packages from the open-source Bioconductor project and covers all steps of the analysis pipeline, including alignment of read sequences, data exploration, differential expression analysis, visualization and pathway analysis. Read alignment and count quantification is conducted using the Rsubread package and the statistical analyses are performed using the edgeR package. The differential expression analysis uses the quasi-likelihood functionality of edgeR.
Jeon, Jihyoun; Hsu, Li; Gorfine, Malka
2012-07-01
Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low, however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.
Semiparametric Time-to-Event Modeling in the Presence of a Latent Progression Event
Rice, John D.; Tsodikov, Alex
2017-01-01
Summary In cancer research, interest frequently centers on factors influencing a latent event that must precede a terminal event. In practice it is often impossible to observe the latent event precisely, making inference about this process difficult. To address this problem, we propose a joint model for the unobserved time to the latent and terminal events, with the two events linked by the baseline hazard. Covariates enter the model parametrically as linear combinations that multiply, respectively, the hazard for the latent event and the hazard for the terminal event conditional on the latent one. We derive the partial likelihood estimators for this problem assuming the latent event is observed, and propose a profile likelihood–based method for estimation when the latent event is unobserved. The baseline hazard in this case is estimated nonparametrically using the EM algorithm, which allows for closed-form Breslow-type estimators at each iteration, bringing improved computational efficiency and stability compared with maximizing the marginal likelihood directly. We present simulation studies to illustrate the finite-sample properties of the method; its use in practice is demonstrated in the analysis of a prostate cancer data set. PMID:27556886
Johnson, Timothy R; Kuhn, Kristine M
2015-12-01
This paper introduces the ltbayes package for R. This package includes a suite of functions for investigating the posterior distribution of latent traits of item response models. These include functions for simulating realizations from the posterior distribution, profiling the posterior density or likelihood function, calculation of posterior modes or means, Fisher information functions and observed information, and profile likelihood confidence intervals. Inferences can be based on individual response patterns or sets of response patterns such as sum scores. Functions are included for several common binary and polytomous item response models, but the package can also be used with user-specified models. This paper introduces some background and motivation for the package, and includes several detailed examples of its use.
Two new methods to fit models for network meta-analysis with random inconsistency effects.
Law, Martin; Jackson, Dan; Turner, Rebecca; Rhodes, Kirsty; Viechtbauer, Wolfgang
2016-07-28
Meta-analysis is a valuable tool for combining evidence from multiple studies. Network meta-analysis is becoming more widely used as a means to compare multiple treatments in the same analysis. However, a network meta-analysis may exhibit inconsistency, whereby the treatment effect estimates do not agree across all trial designs, even after taking between-study heterogeneity into account. We propose two new estimation methods for network meta-analysis models with random inconsistency effects. The model we consider is an extension of the conventional random-effects model for meta-analysis to the network meta-analysis setting and allows for potential inconsistency using random inconsistency effects. Our first new estimation method uses a Bayesian framework with empirically-based prior distributions for both the heterogeneity and the inconsistency variances. We fit the model using importance sampling and thereby avoid some of the difficulties that might be associated with using Markov Chain Monte Carlo (MCMC). However, we confirm the accuracy of our importance sampling method by comparing the results to those obtained using MCMC as the gold standard. The second new estimation method we describe uses a likelihood-based approach, implemented in the metafor package, which can be used to obtain (restricted) maximum-likelihood estimates of the model parameters and profile likelihood confidence intervals of the variance components. We illustrate the application of the methods using two contrasting examples. The first uses all-cause mortality as an outcome, and shows little evidence of between-study heterogeneity or inconsistency. The second uses "ear discharge" as an outcome, and exhibits substantial between-study heterogeneity and inconsistency. Both new estimation methods give results similar to those obtained using MCMC. The extent of heterogeneity and inconsistency should be assessed and reported in any network meta-analysis. Our two new methods can be used to fit models for network meta-analysis with random inconsistency effects. They are easily implemented using the accompanying R code in the Additional file 1. Using these estimation methods, the extent of inconsistency can be assessed and reported.
ERIC Educational Resources Information Center
Mahmud, Jumailiyah; Sutikno, Muzayanah; Naga, Dali S.
2016-01-01
The aim of this study is to determine the difference in variance between the maximum likelihood and expected a posteriori estimation methods as a function of the number of test items in an aptitude test. The variance reflects the accuracy achieved by the maximum likelihood and Bayes estimation methods. The test consists of three subtests, each with 40 multiple-choice…
New applications of maximum likelihood and Bayesian statistics in macromolecular crystallography.
McCoy, Airlie J
2002-10-01
Maximum likelihood methods are well known to macromolecular crystallographers as the methods of choice for isomorphous phasing and structure refinement. Recently, the use of maximum likelihood and Bayesian statistics has extended to the areas of molecular replacement and density modification, placing these methods on a stronger statistical foundation and making them more accurate and effective.
Measuring coherence of computer-assisted likelihood ratio methods.
Haraksim, Rudolf; Ramos, Daniel; Meuwly, Didier; Berger, Charles E H
2015-04-01
Measuring the performance of forensic evaluation methods that compute likelihood ratios (LRs) is relevant for both the development and the validation of such methods. A framework of performance characteristics categorized as primary and secondary is introduced in this study to help achieve such development and validation. Ground-truth labelled fingerprint data is used to assess the performance of an example likelihood ratio method in terms of those performance characteristics. Discrimination, calibration, and especially the coherence of this LR method are assessed as a function of the quantity and quality of the trace fingerprint specimen. Assessment of the coherence revealed a weakness of the comparison algorithm in the computer-assisted likelihood ratio method used. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Zero-inflated Poisson model based likelihood ratio test for drug safety signal detection.
Huang, Lan; Zheng, Dan; Zalkikar, Jyoti; Tiwari, Ram
2017-02-01
In recent decades, numerous methods have been developed for data mining of large drug safety databases, such as the Food and Drug Administration's (FDA's) Adverse Event Reporting System, where data matrices are formed with drugs as columns and adverse events as rows. Often, a large number of cells in these data matrices have zero counts; some of them are "true zeros", indicating that the drug-adverse event pair cannot occur, and these are distinguished from the remaining zero counts, which are modeled zeros and simply indicate that the drug-adverse event pair has not occurred, or has not been reported, yet. In this paper, a zero-inflated Poisson model based likelihood ratio test method is proposed to identify drug-adverse event pairs that have disproportionately high reporting rates, which are also called signals. The maximum likelihood estimates of the model parameters of the zero-inflated Poisson model based likelihood ratio test are obtained using the expectation-maximization algorithm. The zero-inflated Poisson model based likelihood ratio test is also modified to handle stratified analyses for binary and categorical covariates (e.g. gender and age) in the data. The proposed zero-inflated Poisson model based likelihood ratio test method is shown to asymptotically control the type I error and false discovery rate, and its finite sample performance for signal detection is evaluated through a simulation study. The simulation results show that the zero-inflated Poisson model based likelihood ratio test method performs similarly to the Poisson model based likelihood ratio test method when the estimated percentage of true zeros in the database is small. Both the zero-inflated Poisson model based likelihood ratio test and the likelihood ratio test methods are applied to six selected drugs from the 2006 to 2011 Adverse Event Reporting System database, with varying percentages of observed zero-count cells.
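A minimal sketch of the zero-inflated Poisson machinery may help fix ideas. The Python code below fits a ZIP model to a single vector of counts by maximum likelihood and compares it with a plain Poisson fit via a likelihood ratio statistic; it is a generic illustration under assumed parameterizations, not the paper's EM algorithm, its stratified extensions, or an FDA implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln
from scipy.stats import chi2

def zip_negloglik(params, counts):
    """Negative log-likelihood of the zero-inflated Poisson model.

    params = (logit_pi, log_lam): pi is the extra-zero probability and
    lam the Poisson mean; both are transformed to stay in range."""
    pi = 1.0 / (1.0 + np.exp(-params[0]))
    lam = np.exp(params[1])
    log_pois = counts * np.log(lam) - lam - gammaln(counts + 1)
    ll_zero = np.log(pi + (1 - pi) * np.exp(-lam))   # zero cells
    ll_pos = np.log(1 - pi) + log_pois               # positive cells
    return -np.sum(np.where(counts == 0, ll_zero, ll_pos))

def zip_vs_poisson_lrt(counts):
    """Fit ZIP and plain Poisson by ML and return the LRT statistic."""
    zip_fit = minimize(zip_negloglik, x0=np.array([0.0, 0.0]),
                       args=(counts,), method="Nelder-Mead")
    lam_hat = counts.mean()
    pois_ll = np.sum(counts * np.log(lam_hat) - lam_hat - gammaln(counts + 1))
    lrt = 2.0 * (-zip_fit.fun - pois_ll)
    # The null (pi = 0) lies on the boundary of the parameter space, so the
    # chi2(1) p-value below is conservative and serves only as a rough guide.
    return lrt, chi2.sf(lrt, df=1)

counts = np.array([0, 0, 0, 0, 0, 1, 2, 0, 3, 0, 0, 1, 0, 0, 4])
print(zip_vs_poisson_lrt(counts))
```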
DNA typing for the identification of old skeletal remains from Korean War victims.
Lee, Hwan Young; Kim, Na Young; Park, Myung Jin; Sim, Jeong Eun; Yang, Woo Ick; Shin, Kyoung-Jin
2010-11-01
The identification of missing casualties of the Korean War (1950-1953) has been performed using mitochondrial DNA (mtDNA) profiles, but recent advances in DNA extraction techniques and approaches using smaller amplicons have significantly increased the possibility of obtaining DNA profiles from highly degraded skeletal remains. Therefore, 21 skeletal remains of Korean War victims and 24 samples from biological relatives of the supposed victims were selected based on circumstantial evidence and/or mtDNA-matching results and were analyzed to confirm the alleged relationship. Cumulative likelihood ratios were obtained from autosomal short tandem repeat, Y-chromosomal STR, and mtDNA-genotyping results, and mainly confirmed the alleged relationship with values over 10⁵. The present analysis emphasizes the value of mini- and Y-STR systems as well as an efficient DNA extraction method in DNA testing for the identification of old skeletal remains. © 2010 American Academy of Forensic Sciences.
Estimating parameter of Rayleigh distribution by using Maximum Likelihood method and Bayes method
NASA Astrophysics Data System (ADS)
Ardianti, Fitri; Sutarman
2018-01-01
In this paper, we use maximum likelihood estimation and the Bayes method under several risk functions to estimate the parameter of the Rayleigh distribution and to determine which method performs best. The prior used in the Bayes method is the Jeffreys non-informative prior. Maximum likelihood estimation and the Bayes method under the precautionary loss function, the entropy loss function, and the L1 loss function are compared. We compare these methods by their bias and MSE values using the R program. The results are then displayed in tables to facilitate the comparisons.
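The maximum likelihood estimator of the Rayleigh scale parameter has a simple closed form, which makes the bias/MSE comparison easy to outline. The sketch below (in Python rather than the R used in the study, and covering only the MLE, not the Bayes estimators under the various loss functions) is an illustrative, assumption-laden version of that comparison.

```python
import numpy as np

rng = np.random.default_rng(1)

def rayleigh_mle(x):
    """Closed-form ML estimate of the Rayleigh scale parameter sigma:
    sigma_hat = sqrt( sum(x_i^2) / (2 n) )."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.sum(x ** 2) / (2 * x.size))

# Monte Carlo check of bias and MSE of the MLE (illustrative only).
sigma_true, n, reps = 2.0, 30, 5000
estimates = np.array([rayleigh_mle(rng.rayleigh(scale=sigma_true, size=n))
                      for _ in range(reps)])
print("bias:", estimates.mean() - sigma_true)
print("MSE :", np.mean((estimates - sigma_true) ** 2))
```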
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1975-01-01
A general iterative procedure is given for determining the consistent maximum likelihood estimates of the parameters of normal distributions. In addition, methods for locating a local maximum of the log-likelihood function, including Newton's method, a method of scoring, and modifications of these procedures, are discussed.
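For concreteness, the generic update rules behind the iteration schemes mentioned here can be written as follows (notation is ours, not the report's):

```latex
\[
\text{Newton: } \theta^{(t+1)} = \theta^{(t)} - \left[\nabla^2 \ell\!\left(\theta^{(t)}\right)\right]^{-1} \nabla \ell\!\left(\theta^{(t)}\right),
\qquad
\text{Scoring: } \theta^{(t+1)} = \theta^{(t)} + \mathcal{I}\!\left(\theta^{(t)}\right)^{-1} \nabla \ell\!\left(\theta^{(t)}\right),
\]
where $\ell$ is the log-likelihood and $\mathcal{I}(\theta) = -\mathbb{E}\!\left[\nabla^2 \ell(\theta)\right]$ is the Fisher information.
```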
Li, Chunmei; Yu, Zhilong; Fu, Yusi; Pang, Yuhong; Huang, Yanyi
2017-04-26
We develop a novel single-cell-based platform through digital counting of amplified genomic DNA fragments, named multifraction amplification (mfA), to detect the copy number variations (CNVs) in a single cell. Amplification is required to acquire genomic information from a single cell, while introducing unavoidable bias. Unlike prevalent methods that directly infer CNV profiles from the pattern of sequencing depth, our mfA platform denatures and separates the DNA molecules from a single cell into multiple fractions of a reaction mix before amplification. By examining the sequencing result of each fraction for a specific fragment and applying a segment-merge maximum likelihood algorithm to the calculation of copy number, we digitize the sequencing-depth-based CNV identification and thus provide a method that is less sensitive to the amplification bias. In this paper, we demonstrate a mfA platform through multiple displacement amplification (MDA) chemistry. When performing the mfA platform, the noise of MDA is reduced; therefore, the resolution of single-cell CNV identification can be improved to 100 kb. We can also determine the genomic region free of allelic drop-out with mfA platform, which is impossible for conventional single-cell amplification methods.
The Maximum Likelihood Solution for Inclination-only Data
NASA Astrophysics Data System (ADS)
Arason, P.; Levi, S.
2006-12-01
The arithmetic means of inclination-only data are known to introduce a shallowing bias. Several methods have been proposed to estimate unbiased means of the inclination along with measures of the precision. Most of the inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all these methods require various assumptions and approximations that are inappropriate for many data sets. For some steep and dispersed data sets, the estimates provided by these methods are significantly displaced from the peak of the likelihood function to systematically shallower inclinations. The problem in locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest. This is because some elements of the log-likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study we succeeded in analytically cancelling exponential elements from the likelihood function, and we are now able to calculate its value for any location in the parameter space and for any inclination-only data set, with full accuracy. Furthermore, we can now calculate the partial derivatives of the likelihood function with the desired accuracy. Locating the maximum likelihood without the assumptions required by previous methods is now straightforward. The information to separate the mean inclination from the precision parameter will be lost for very steep and dispersed data sets. It is worth noting that the likelihood function always has a maximum value. However, for some dispersed and steep data sets with few samples, the likelihood function takes its highest value on the boundary of the parameter space, i.e. at inclinations of +/- 90 degrees, but with relatively well defined dispersion. Our simulations indicate that this occurs quite frequently for certain data sets, and relatively small perturbations in the data will drive the maxima to the boundary. We interpret this to indicate that, for such data sets, the information needed to separate the mean inclination and the precision parameter is permanently lost. To assess the reliability and accuracy of our method we generated a large number of random Fisher-distributed data sets and used seven methods to estimate the mean inclination and precision parameter. These comparisons are described by Levi and Arason at the 2006 AGU Fall meeting. The results of the various methods are very favourable to our new robust maximum likelihood method, which, on average, is the most reliable, and the mean inclination estimates are the least biased toward shallow values. Further information on our inclination-only analysis can be obtained from: http://www.vedur.is/~arason/paleomag
Decomposition odour profiling in the air and soil surrounding vertebrate carrion.
Forbes, Shari L; Perrault, Katelynn A
2014-01-01
Chemical profiling of decomposition odour is conducted in the environmental sciences to detect malodourous target sources in air, water or soil. More recently decomposition odour profiling has been employed in the forensic sciences to generate a profile of the volatile organic compounds (VOCs) produced by decomposed remains. The chemical profile of decomposition odour is still being debated with variations in the VOC profile attributed to the sample collection technique, method of chemical analysis, and environment in which decomposition occurred. To date, little consideration has been given to the partitioning of odour between different matrices and the impact this has on developing an accurate VOC profile. The purpose of this research was to investigate the decomposition odour profile surrounding vertebrate carrion to determine how VOCs partition between soil and air. Four pig carcasses (Sus scrofa domesticus L.) were placed on a soil surface to decompose naturally and their odour profile monitored over a period of two months. Corresponding control sites were also monitored to determine the VOC profile of the surrounding environment. Samples were collected from the soil below and the air (headspace) above the decomposed remains using sorbent tubes and analysed using gas chromatography-mass spectrometry. A total of 249 compounds were identified but only 58 compounds were common to both air and soil samples. This study has demonstrated that soil and air samples produce distinct subsets of VOCs that contribute to the overall decomposition odour. Sample collection from only one matrix will reduce the likelihood of detecting the complete spectrum of VOCs, which further confounds the issue of determining a complete and accurate decomposition odour profile. Confirmation of this profile will enhance the performance of cadaver-detection dogs that are tasked with detecting decomposition odour in both soil and air to locate victim remains.
Perez-Brena, Norma J.; Cookston, Jeffrey T.; Fabricius, William V.; Saenz, Delia
2013-01-01
A mixed-method study identified profiles of fathers who mentioned key dimensions of their parenting and linked profile membership to adolescents’ adjustment using data from 337 European American, Mexican American and Mexican immigrant fathers and their early adolescent children. Father narratives about what fathers do well as parents were thematically coded for the presence of five fathering dimensions: emotional quality (how well father and child get along), involvement (amount of time spent together), provisioning (the amount of resources provided), discipline (the amount and success in parental control), and role modeling (teaching life lessons through example). Next, latent class analysis was used to identify three patterns of the likelihood of mentioning certain fathering dimensions: an emotionally-involved group mentioned emotional quality and involvement; an affective-control group mentioned emotional quality, involvement, discipline and role modeling; and an affective-model group mentioned emotional quality and role modeling. Profiles were significantly associated with subsequent adolescents’ reports of adjustment such that adolescents of affective-control fathers reported significantly more externalizing behaviors than adolescents of emotionally-involved fathers. PMID:24883049
Benschop, Corina C G; van de Merwe, Linda; de Jong, Jeroen; Vanvooren, Vanessa; Kempenaers, Morgane; Kees van der Beek, C P; Barni, Filippo; Reyes, Eusebio López; Moulin, Léa; Pene, Laurent; Haned, Hinda; Sijen, Titia
2017-07-01
Searching a national DNA database with complex and incomplete profiles usually yields very large numbers of possible matches that can present many candidate suspects to be further investigated by the forensic scientist and/or police. Current practice in most forensic laboratories consists of ordering these 'hits' based on the number of matching alleles with the searched profile. Thus, candidate profiles that share the same number of matching alleles are not differentiated, and because other ranking criteria for the candidate list are lacking, it may be difficult to discern a true match from the false positives or to notice that all candidates are in fact false positives. SmartRank was developed to put forward only relevant candidates and rank them accordingly. The SmartRank software computes a likelihood ratio (LR) for the searched profile and each profile in the DNA database and ranks database entries above a defined LR threshold according to the calculated LR. In this study, we examined for mixed DNA profiles of variable complexity whether the true donors are retrieved, what the number of false positives above an LR threshold is, and the ranking position of the true donors. Using 343 mixed DNA profiles, over 750 SmartRank searches were performed. In addition, the performance of SmartRank and CODIS in DNA database searches was compared, and SmartRank was found to be complementary to CODIS. We also describe the applicable domain of SmartRank and provide guidelines. The SmartRank software is open-source and freely available. Using the best practice guidelines, SmartRank enables obtaining investigative leads in criminal cases lacking a suspect. Copyright © 2017 Elsevier B.V. All rights reserved.
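As a rough illustration of the ranking step only, the following Python sketch scores every database entry with a likelihood ratio and keeps the candidates above a threshold. The compute_lr callable is a hypothetical placeholder; the actual SmartRank LR model for mixed profiles is considerably more involved than anything shown here.

def rank_candidates(searched_profile, database, compute_lr, lr_threshold=1000.0):
    # database maps an entry identifier to a candidate profile; compute_lr
    # stands in for a mixture LR model comparing "candidate is a donor"
    # against "candidate is unrelated".
    scored = []
    for entry_id, candidate_profile in database.items():
        lr = compute_lr(searched_profile, candidate_profile)
        if lr > lr_threshold:
            scored.append((entry_id, lr))
    # Highest LR first: the most strongly supported candidates lead the list.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)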
Rasch, Elizabeth K.; Huynh, Minh; Ho, Pei-Shu; Heuser, Aaron; Houtenville, Andrew; Chan, Leighton
2014-01-01
Background: Given the complexity of the adjudication process and volume of applications to Social Security Administration’s (SSA) disability programs, many individuals with serious medical conditions die while awaiting an application decision. Limitations of traditional survival methods called for a new empirical approach to identify conditions resulting in rapid mortality. Objective: To identify health conditions associated with significantly higher mortality than a key reference group among applicants for SSA disability programs. Research design: We identified mortality patterns and generated a survival surface for a reference group using conditions already designated for expedited processing. We identified conditions associated with significantly higher mortality than the reference group and prioritized them by the expected likelihood of death during the adjudication process. Subjects: Administrative records of 29 million Social Security disability applicants, who applied for benefits from 1996 – 2007, were analyzed. Measures: We computed survival spells from time of onset of disability to death, and from date of application to death. Survival data were organized by entry cohort. Results: In our sample, we observed that approximately 42,000 applicants died before a decision was made on their disability claims. We identified 24 conditions with survival profiles comparable to the reference group. Applicants with these conditions were not likely to survive adjudication. Conclusions: Our approach facilitates ongoing revision of the conditions SSA designates for expedited awards and has applicability to other programs where survival profiles are a consideration. PMID:25310524
ERIC Educational Resources Information Center
Chung, Hwan; Anthony, James C.
2013-01-01
This article presents a multiple-group latent class-profile analysis (LCPA) by taking a Bayesian approach in which a Markov chain Monte Carlo simulation is employed to achieve more robust estimates for latent growth patterns. This article describes and addresses a label-switching problem that involves the LCPA likelihood function, which has…
Estimating Function Approaches for Spatial Point Processes
NASA Astrophysics Data System (ADS)
Deng, Chong
Spatial point pattern data consist of locations of events that are often of interest in biological and ecological studies. Such data are commonly viewed as a realization from a stochastic process called a spatial point process. To fit a parametric spatial point process model to such data, likelihood-based methods have been widely studied. However, while maximum likelihood estimation is often too computationally intensive for Cox and cluster processes, pairwise likelihood methods such as composite likelihood and Palm likelihood usually suffer from a loss of information because they ignore the correlation among pairs. For many types of correlated data other than spatial point processes, when likelihood-based approaches are not desirable, estimating functions have been widely used for model fitting. In this dissertation, we explore estimating function approaches for fitting spatial point process models. These approaches, which are based on asymptotic optimal estimating function theories, can be used to incorporate the correlation among data and yield more efficient estimators. We conducted a series of studies to demonstrate that these estimating function approaches are good alternatives that balance the trade-off between computational complexity and estimation efficiency. First, we propose a new estimating procedure that improves the efficiency of the pairwise composite likelihood method in estimating clustering parameters. Our approach combines estimating functions derived from pairwise composite likelihood estimation and estimating functions that account for correlations among the pairwise contributions. Our method can be used to fit a variety of parametric spatial point process models and can yield more efficient estimators for the clustering parameters than pairwise composite likelihood estimation. We demonstrate its efficacy through a simulation study and an application to the longleaf pine data. Second, we further explore the quasi-likelihood approach for fitting the second-order intensity function of spatial point processes. However, the original second-order quasi-likelihood is barely feasible due to the intense computation and high memory requirements needed to solve a large linear system. Motivated by the existence of geometric regular patterns in stationary point processes, we find a lower-dimensional representation of the optimal weight function and propose a reduced second-order quasi-likelihood approach. Through a simulation study, we show that the proposed method not only demonstrates superior performance in fitting the clustering parameter but also relaxes the constraint on the tuning parameter, H. Third, we study the quasi-likelihood type estimating function that is optimal in a certain class of first-order estimating functions for estimating the regression parameter in spatial point process models. Then, by using a novel spectral representation, we construct an implementation that is computationally much more efficient and can be applied to a more general setup than the original quasi-likelihood method.
New prior sampling methods for nested sampling - Development and testing
NASA Astrophysics Data System (ADS)
Stokes, Barrie; Tuyl, Frank; Hudson, Irene
2017-06-01
Nested Sampling is a powerful algorithm for fitting models to data in the Bayesian setting, introduced by Skilling [1]. The nested sampling algorithm proceeds by carrying out a series of compressive steps, involving successively nested iso-likelihood boundaries, starting with the full prior distribution of the problem parameters. The "central problem" of nested sampling is to draw at each step a sample from the prior distribution whose likelihood is greater than the current likelihood threshold, i.e., a sample falling inside the current likelihood-restricted region. For both flat and informative priors this ultimately requires uniform sampling restricted to the likelihood-restricted region. We present two new methods of carrying out this sampling step, and illustrate their use with the lighthouse problem [2], a bivariate likelihood used by Gregory [3] and a trivariate Gaussian mixture likelihood. All the algorithm development and testing reported here has been done with Mathematica® [4].
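A minimal Python sketch of the step described above, drawing a prior sample that falls inside the current likelihood-restricted region. Plain rejection sampling is used here purely for clarity; it is not one of the two new methods of the abstract, and production nested-sampling codes replace it with constrained MCMC walks, slice sampling, or bounding ellipsoids as the restricted region shrinks. The sample_prior and log_likelihood callables are hypothetical placeholders.

import numpy as np

rng = np.random.default_rng(0)

def sample_above_threshold(sample_prior, log_likelihood, log_l_min, max_tries=100000):
    # Draw one prior sample whose log-likelihood exceeds the current
    # threshold log_l_min, i.e. a point inside the likelihood-restricted region.
    for _ in range(max_tries):
        theta = sample_prior(rng)
        if log_likelihood(theta) > log_l_min:
            return theta
    raise RuntimeError("no prior sample found inside the likelihood-restricted region")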
Kück, Patrick; Meusemann, Karen; Dambach, Johannes; Thormann, Birthe; von Reumont, Björn M; Wägele, Johann W; Misof, Bernhard
2010-03-31
Methods of alignment masking, which refers to the technique of excluding alignment blocks prior to tree reconstructions, have been successful in improving the signal-to-noise ratio in sequence alignments. However, the lack of formally well defined methods to identify randomness in sequence alignments has prevented a routine application of alignment masking. In this study, we compared the effects on tree reconstructions of the most commonly used profiling method (GBLOCKS), which uses a predefined set of rules in combination with alignment masking, with a new profiling approach (ALISCORE) based on Monte Carlo resampling within a sliding window, using different data sets and alignment methods. While the GBLOCKS approach excludes variable sections above a certain threshold whose choice is left arbitrary, the ALISCORE algorithm is free of a priori rating of parameter space and therefore more objective. ALISCORE was successfully extended to amino acids using a proportional model and empirical substitution matrices to score randomness in multiple sequence alignments. A complex bootstrap resampling leads to an even distribution of scores of randomly similar sequences to assess randomness of the observed sequence similarity. Testing performance on real data, both masking methods, GBLOCKS and ALISCORE, helped to improve tree resolution. The sliding window approach was less sensitive to different alignments of identical data sets and performed equally well on all data sets. Concurrently, ALISCORE is capable of dealing with different substitution patterns and heterogeneous base composition. ALISCORE and the most relaxed GBLOCKS gap parameter setting performed best on all data sets. Correspondingly, Neighbor-Net analyses showed the greatest decrease in conflict. Alignment masking improves the signal-to-noise ratio in multiple sequence alignments prior to phylogenetic reconstruction. Given the robust performance of alignment profiling, alignment masking should routinely be used to improve tree reconstructions. Parametric methods of alignment profiling can easily be extended to more complex likelihood-based models of sequence evolution, which opens the possibility of further improvements.
NASA Technical Reports Server (NTRS)
Schmahl, Edward J.; Kundu, Mukul R.
1998-01-01
We have continued our previous efforts in studies of Fourier imaging methods applied to hard X-ray flares. We have performed physical and theoretical analysis of rotating collimator grids submitted to GSFC (Goddard Space Flight Center) for the High Energy Solar Spectroscopic Imager (HESSI). We have produced simulation algorithms which are currently being used to test imaging software and hardware for HESSI. We have developed Maximum-Entropy, Maximum-Likelihood, and "CLEAN" methods for reconstructing HESSI images from count-rate profiles. This work is expected to continue through the launch of HESSI in July 2000. Section 1 shows a poster presentation "Image Reconstruction from HESSI Photon Lists" at the Solar Physics Division Meeting, June 1998; Section 2 shows the text and viewgraphs prepared for "Imaging Simulations" at HESSI's Preliminary Design Review on July 30, 1998.
Synthesizing Regression Results: A Factored Likelihood Method
ERIC Educational Resources Information Center
Wu, Meng-Jia; Becker, Betsy Jane
2013-01-01
Regression methods are widely used by researchers in many fields, yet methods for synthesizing regression results are scarce. This study proposes using a factored likelihood method, originally developed to handle missing data, to appropriately synthesize regression models involving different predictors. This method uses the correlations reported…
Maximum likelihood estimation for Cox's regression model under nested case-control sampling.
Scheike, Thomas H; Juul, Anders
2004-04-01
Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin-like growth factor I was associated with ischemic heart disease. The study was based on a population of 3784 Danes and 231 cases of ischemic heart disease where controls were matched on age and gender. We illustrate the use of the MLE for these data and show how the maximum likelihood framework can be used to obtain information additional to the relative risk estimates of covariates.
Vexler, Albert; Tanajian, Hovig; Hutson, Alan D
In practice, parametric likelihood-ratio techniques are powerful statistical tools. In this article, we propose and examine novel and simple distribution-free test statistics that efficiently approximate parametric likelihood ratios to analyze and compare distributions of K groups of observations. Using the density-based empirical likelihood methodology, we develop a Stata package that applies to a test for symmetry of data distributions and compares K -sample distributions. Recognizing that recent statistical software packages do not sufficiently address K -sample nonparametric comparisons of data distributions, we propose a new Stata command, vxdbel, to execute exact density-based empirical likelihood-ratio tests using K samples. To calculate p -values of the proposed tests, we use the following methods: 1) a classical technique based on Monte Carlo p -value evaluations; 2) an interpolation technique based on tabulated critical values; and 3) a new hybrid technique that combines methods 1 and 2. The third, cutting-edge method is shown to be very efficient in the context of exact-test p -value computations. This Bayesian-type method considers tabulated critical values as prior information and Monte Carlo generations of test statistic values as data used to depict the likelihood function. In this case, a nonparametric Bayesian method is proposed to compute critical values of exact tests.
Chaikriangkrai, Kongkiat; Jhun, Hye Yeon; Shantha, Ghanshyam Palamaner Subash; Abdulhak, Aref Bin; Tandon, Rudhir; Alqasrawi, Musab; Klappa, Anthony; Pancholy, Samir; Deshmukh, Abhishek; Bhama, Jay; Sigurdsson, Gardar
2018-07-01
In aortic stenosis patients referred for surgical and transcatheter aortic valve replacement (AVR), evidence on the diagnostic accuracy of coronary computed tomography angiography (CCTA) has been limited. The objective of this study was to investigate the diagnostic accuracy of CCTA for significant coronary artery disease (CAD) in patients referred for AVR, using invasive coronary angiography (ICA) as the gold standard. We searched databases for all diagnostic studies of CCTA in patients referred for AVR which reported the diagnostic testing characteristics on patient-based analysis required to pool summary sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio. Significant CAD on both CCTA and ICA was defined by >50% stenosis in any coronary artery, coronary stent, or bypass graft. Thirteen studies evaluated 1498 patients (mean age, 74 y; 47% men; 76% transcatheter AVR). The pooled prevalence of significant stenosis determined by ICA was 43%. Hierarchical summary receiver-operating characteristic analysis demonstrated a summary area under the curve of 0.96. The pooled sensitivity, specificity, and positive and negative likelihood ratios of CCTA in identifying significant stenosis determined by ICA were 95%, 79%, 4.48, and 0.06, respectively. In subgroup analysis, the diagnostic profiles of CCTA were comparable between surgical and transcatheter AVR. Despite the higher prevalence of significant CAD in patients with aortic stenosis than with other valvular heart diseases, our meta-analysis has shown that CCTA has a suitable diagnostic accuracy profile as a gatekeeper test for ICA. Our study illustrates a need for further study of the potential role of CCTA in preoperative planning for AVR.
Bias Correction for the Maximum Likelihood Estimate of Ability. Research Report. ETS RR-05-15
ERIC Educational Resources Information Center
Zhang, Jinming
2005-01-01
Lord's bias function and the weighted likelihood estimation method are effective in reducing the bias of the maximum likelihood estimate of an examinee's ability under the assumption that the true item parameters are known. This paper presents simulation studies to determine the effectiveness of these two methods in reducing the bias when the item…
A Game Theoretical Approach to Hacktivism: Is Attack Likelihood a Product of Risks and Payoffs?
Bodford, Jessica E; Kwan, Virginia S Y
2018-02-01
The current study examines hacktivism (i.e., hacking to convey a moral, ethical, or social justice message) through a general game theoretic framework-that is, as a product of costs and benefits. Given the inherent risk of carrying out a hacktivist attack (e.g., legal action, imprisonment), it would be rational for the user to weigh these risks against perceived benefits of carrying out the attack. As such, we examined computer science students' estimations of risks, payoffs, and attack likelihood through a game theoretic design. Furthermore, this study aims at constructing a descriptive profile of potential hacktivists, exploring two predicted covariates of attack decision making, namely, peer prevalence of hacking and sex differences. Contrary to expectations, results suggest that participants' estimations of attack likelihood stemmed solely from expected payoffs, rather than subjective risks. Peer prevalence significantly predicted increased payoffs and attack likelihood, suggesting an underlying descriptive norm in social networks. Notably, we observed no sex differences in the decision to attack, nor in the factors predicting attack likelihood. Implications for policymakers and the understanding and prevention of hacktivism are discussed, as are the possible ramifications of widely communicated payoffs over potential risks in hacking communities.
Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les
2008-01-01
To compare three predictive models based on logistic regression to estimate adjusted likelihood ratios allowing for interdependency between diagnostic variables (tests). This study was a review of the theoretical basis, assumptions, and limitations of published models, and a statistical extension of methods and application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. The Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependency of test errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.
Wienke, B R; O'Leary, T R
2008-05-01
Linking model and data, we detail the LANL diving reduced gradient bubble model (RGBM), dynamical principles, and correlation with data in the LANL Data Bank. Table, profile, and meter risks are obtained from likelihood analysis and quoted for air, nitrox, and helitrox no-decompression time limits, repetitive dive tables, and selected mixed gas and repetitive profiles. Application analyses include the EXPLORER decompression meter algorithm, NAUI tables, University of Wisconsin Seafood Diver tables, comparative NAUI, PADI, and Oceanic NDLs and repetitive dives, comparative nitrogen and helium mixed gas risks, a USS Perry deep rebreather (RB) exploration dive, a world record open circuit (OC) dive, and Woodville Karst Plain Project (WKPP) extreme cave exploration profiles. The algorithm has seen extensive and utilitarian application in mixed gas diving, both in recreational and technical sectors, and forms the basis for released tables and decompression meters used by scientific, commercial, and research divers. The LANL Data Bank is described, and the methods used to deduce risk are detailed. Risk functions for dissolved gas and bubbles are summarized. Parameters that can be used to estimate profile risk are tallied. To fit data, a modified Levenberg-Marquardt routine is employed with an L2 error norm. Appendices sketch the numerical methods and list reports from field testing for (real) mixed gas diving. A Monte Carlo-like sampling scheme for fast numerical analysis of the data is also detailed, as a coupled variance reduction technique and additional check on the canonical approach to estimating diving risk. The method suggests alternatives to the canonical approach. This work represents a first-time correlation effort linking a dynamical bubble model with deep stop data. Supercomputing resources are requisite to connect model and data in application.
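As a generic illustration of the fitting step mentioned above (a Levenberg-Marquardt routine with an L2 error norm), the Python sketch below fits the parameters of a user-supplied risk model to observed outcomes with SciPy's least_squares. The risk_model callable and the data arrays are hypothetical placeholders; this is not the LANL code or its actual risk functions.

import numpy as np
from scipy.optimize import least_squares

def fit_risk_parameters(initial_params, profiles, observed_risks, risk_model):
    # Residuals between model-predicted and observed risks for each dive profile;
    # least_squares with method="lm" minimises their L2 norm (Levenberg-Marquardt).
    def residuals(params):
        predicted = np.array([risk_model(params, p) for p in profiles])
        return predicted - np.asarray(observed_risks, dtype=float)

    fit = least_squares(residuals, x0=np.asarray(initial_params, dtype=float), method="lm")
    return fit.x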
Global population structure and adaptive evolution of aflatoxin-producing fungi
USDA-ARS?s Scientific Manuscript database
We employed interspecific principal component analyses for six different categories (geography, species, precipitation, temperature, aflatoxin chemotype profile, and mating type) and inferred maximum likelihood phylogenies for six combined loci, including two aflatoxin cluster regions (aflM/alfN and...
Dynamic Method for Identifying Collected Sample Mass
NASA Technical Reports Server (NTRS)
Carson, John
2008-01-01
G-Sample is designed for sample collection missions to identify the presence and quantity of sample material gathered by spacecraft equipped with end effectors. The software method uses a maximum-likelihood estimator to identify the collected sample's mass based on onboard force-sensor measurements, thruster firings, and a dynamics model of the spacecraft. This makes sample mass identification a computation rather than a process requiring additional hardware. Simulation examples of G-Sample are provided for spacecraft model configurations with a sample collection device mounted on the end of an extended boom. In the absence of thrust knowledge errors, the results indicate that G-Sample can identify the amount of collected sample mass to within 10 grams (with 95-percent confidence) by using a force sensor with a noise and quantization floor of 50 micrometers. These results hold even in the presence of realistic parametric uncertainty in actual spacecraft inertia, center-of-mass offset, and first flexibility modes. Thrust profile knowledge is shown to be a dominant sensitivity for G-Sample, entering in a nearly one-to-one relationship with the final mass estimation error. This means thrust profiles should be well characterized with onboard accelerometers prior to sample collection. An overall sample-mass estimation error budget has been developed to approximate the effect of model uncertainty, sensor noise, data rate, and thrust profile error on the expected estimate of collected sample mass.
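A toy reduction of the estimation idea, assuming each force measurement obeys F_i = m * a_i plus independent Gaussian noise, in which case the maximum-likelihood mass is a simple least-squares ratio. G-Sample itself works with thruster firings and a full spacecraft dynamics model, so this Python sketch is only a schematic of the statistical step.

import numpy as np

def ml_mass_estimate(forces, accelerations):
    # Maximum-likelihood mass under F_i = m * a_i + Gaussian noise: the
    # least-squares slope through the origin, m = sum(F*a) / sum(a*a).
    forces = np.asarray(forces, dtype=float)
    accelerations = np.asarray(accelerations, dtype=float)
    return np.dot(forces, accelerations) / np.dot(accelerations, accelerations)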
NASA Astrophysics Data System (ADS)
Laing, Kevin J. C.; Russamono, Thais
2013-02-01
The likelihood of trained astronauts developing a life threatening cardiac event during spaceflight is relatively rare, whilst the incidence in untrained individuals is unknown. Space tourists who live a sedentary lifestyle have reduced cardiovascular function, but the associated danger of sudden cardiac arrest (SCA) during a suborbital spaceflight (SOSF) is unclear. Risk during SOSF was examined by reviewing several microgravity studies and methods of determining poor cardiovascular condition. Accurately assessing cardiovascular function and improving baroreceptor sensitivity through exercise is suggested to reduce the incidence of SCA during future SOSFs. Future studies will benefit from past participants sharing medical history; allowing creation of risk profiles and suitable guidelines.
Shih, Weichung Joe; Li, Gang; Wang, Yining
2016-03-01
Sample size plays a crucial role in clinical trials. Flexible sample-size designs, as part of the more general category of adaptive designs that utilize interim data, have been a popular topic in recent years. In this paper, we give a comparative review of four related methods for such a design. The likelihood method uses the likelihood ratio test with an adjusted critical value. The weighted method adjusts the test statistic with given weights rather than the critical value. The dual test method requires both the likelihood ratio statistic and the weighted statistic to be greater than the unadjusted critical value. The promising zone approach uses the likelihood ratio statistic with the unadjusted value and other constraints. All four methods preserve the type-I error rate. In this paper we explore their properties and compare their relationships and merits. We show that the sample size rules for the dual test are in conflict with the rules of the promising zone approach. We delineate what is necessary to specify in the study protocol to ensure the validity of the statistical procedure and what can be kept implicit in the protocol so that more flexibility can be attained for confirmatory phase III trials in meeting regulatory requirements. We also prove that under mild conditions, the likelihood ratio test still preserves the type-I error rate when the actual sample size is larger than the re-calculated one. Copyright © 2015 Elsevier Inc. All rights reserved.
In African-American adolescents with persistent asthma, allergic profile predicted the likelihood of having poorly controlled asthma despite guidelines-directed therapies. Our results suggest that tree and weed pollen sensitization are independent risk factors for poorly controll...
González-Andrade, Fabricio; Sánchez, Dora
2005-10-01
We present individual body identification efforts to identify skeletal remains and relatives of missing persons from an explosion that took place inside one of the munitions recesses of the Armoured Brigade of the Galapagos Armoured Cavalry, in the city of Riobamba, Ecuador, on Wednesday, November 20, 2002. Nineteen samples of bone remains and two tissue samples (a blood stain on a piece of fabric) from the zero zone were analysed. DNA extraction was performed with phenol-chloroform-isoamyl alcohol and proteinase K. To recover DNA from bones, we increased the number of PCR cycles to 35 in some cases. An ABI 310 sequencer was used. Determination of the fragment size and the allelic designation of the different loci was carried out by comparison with the allelic ladders of the PowerPlex 16 kit and the GeneScan Analysis software. Five possible family groups were established and were compared with the profiles found. Classical Bayesian methods were used to calculate the likelihood ratios, and it was possible to identify five different genetic profiles in our country. This paper is important because it describes a novel experience for our forensic services: this was the first time DNA had been used as an identification method in a disaster in our country, and the method was accepted by the Ecuadorian justice system as highly effective.
Liu, Peigui; Elshall, Ahmed S.; Ye, Ming; ...
2016-02-05
Evaluating marginal likelihood is the most critical and computationally expensive task when conducting Bayesian model averaging to quantify parametric and model uncertainties. The evaluation is commonly done by using Laplace approximations to evaluate semianalytical expressions of the marginal likelihood or by using Monte Carlo (MC) methods to evaluate the arithmetic or harmonic mean of a joint likelihood function. This study introduces a new MC method, i.e., thermodynamic integration, which has not been attempted in environmental modeling. Instead of using samples only from the prior parameter space (as in arithmetic mean evaluation) or the posterior parameter space (as in harmonic mean evaluation), the thermodynamic integration method uses samples generated gradually from the prior to the posterior parameter space. This is done through a path sampling that conducts Markov chain Monte Carlo simulation with different power coefficient values applied to the joint likelihood function. The thermodynamic integration method is evaluated using three analytical functions by comparing the method with two variants of the Laplace approximation method and three MC methods, including the nested sampling method that was recently introduced into environmental modeling. The thermodynamic integration method outperforms the other methods in terms of accuracy, convergence, and consistency. The thermodynamic integration method is also applied to a synthetic case of groundwater modeling with four alternative models. The application shows that model probabilities obtained using the thermodynamic integration method improve the predictive performance of Bayesian model averaging. As a result, the thermodynamic integration method is mathematically rigorous, and its MC implementation is computationally general for a wide range of environmental problems.
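A minimal Python sketch of the final integration step, assuming that MCMC samples from the power posteriors (the joint likelihood raised to each power coefficient beta, times the prior) are already available. The log marginal likelihood equals the integral over beta in [0, 1] of the expected log-likelihood under the corresponding power posterior, approximated here with the trapezoid rule; the MCMC sampling itself is not shown.

import numpy as np

def log_marginal_likelihood_ti(betas, loglik_samples_per_beta):
    # betas is an increasing grid from 0 (prior) to 1 (posterior);
    # loglik_samples_per_beta[i] holds log-likelihood values evaluated at
    # MCMC samples drawn from the power posterior with exponent betas[i].
    expected_loglik = [np.mean(samples) for samples in loglik_samples_per_beta]
    return np.trapz(expected_loglik, betas)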
Risk Factors for Hearing Decrement Among U.S. Air Force Aviation-Related Personnel.
Greenwell, Brandon M; Tvaryanas, Anthony P; Maupin, Genny M
2018-02-01
The purpose of this study was to analyze historical hearing sensitivity data to determine factors associated with an occupationally significant change in hearing sensitivity in U.S. Air Force aviation-related personnel. This study was a longitudinal, retrospective cohort analysis of audiogram records for Air Force aviation-related personnel on active duty during calendar year 2013 without a diagnosis of non-noise-related hearing loss. The outcomes of interest were raw change in hearing sensitivity from initial baseline to 2013 audiogram and initial occurrence of a significant threshold shift (STS) and non-H1 audiogram profile. Potential predictor variables included age and elapsed time in cohort for each audiogram, gender, and Air Force Specialty Code. Random forest analyses conducted on a learning sample were used to identify relevant predictor variables. Mixed effects models were fitted to a separate validation sample to make statistical inferences. The final dataset included 167,253 nonbaseline audiograms on 10,567 participants. Only the interaction between time since baseline audiogram and age was significantly associated with raw change in hearing sensitivity by the STS metric. None of the potential predictors were associated with the likelihood for an STS. Time since baseline audiogram, age, and their interaction were significantly associated with the likelihood for a non-H1 hearing profile. In this study population, age and elapsed time since baseline audiogram were modestly associated with decreased hearing sensitivity and increased likelihood for a non-H1 hearing profile. Aircraft type, as determined from Air Force Specialty Code, was not associated with changes in hearing sensitivity by the STS metric. Greenwell BM, Tvaryanas AP, Maupin GM. Risk factors for hearing decrement among U.S. Air Force aviation-related personnel. Aerosp Med Hum Perform. 2018; 89(2):80-86.
Dose-finding designs using a novel quasi-continuous endpoint for multiple toxicities
Ezzalfani, Monia; Zohar, Sarah; Qin, Rui; Mandrekar, Sumithra J; Deley, Marie-Cécile Le
2013-01-01
The aim of a phase I oncology trial is to identify a dose with an acceptable safety profile. Most phase I designs use the dose-limiting toxicity, a binary endpoint, to assess the unacceptable level of toxicity. The dose-limiting toxicity might be incomplete for investigating molecularly targeted therapies as much useful toxicity information is discarded. In this work, we propose a quasi-continuous toxicity score, the total toxicity profile (TTP), to measure quantitatively and comprehensively the overall severity of multiple toxicities. We define the TTP as the Euclidean norm of the weights of toxicities experienced by a patient, where the weights reflect the relative clinical importance of each grade and toxicity type. We propose a dose-finding design, the quasi-likelihood continual reassessment method (CRM), incorporating the TTP score into the CRM, with a logistic model for the dose–toxicity relationship in a frequentist framework. Using simulations, we compared our design with three existing designs for quasi-continuous toxicity score (the Bayesian quasi-CRM with an empiric model and two nonparametric designs), all using the TTP score, under eight different scenarios. All designs using the TTP score to identify the recommended dose had good performance characteristics for most scenarios, with good overdosing control. For a sample size of 36, the percentage of correct selection for the quasi-likelihood CRM ranged from 80% to 90%, with similar results for the quasi-CRM design. These designs with TTP score present an appealing alternative to the conventional dose-finding designs, especially in the context of molecularly targeted agents. PMID:23335156
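The TTP defined above is simply the Euclidean norm of the per-patient toxicity weights, which the following Python sketch computes. The numeric weights in the example are illustrative placeholders; in the design they are elicited to reflect the relative clinical importance of each grade and toxicity type.

import numpy as np

def total_toxicity_profile(weights):
    # Total toxicity profile (TTP): Euclidean norm of the weights of the
    # toxicities experienced by one patient.
    return float(np.sqrt(np.sum(np.square(weights))))

# Hypothetical weights for three observed toxicities of one patient:
print(total_toxicity_profile([0.5, 1.0, 0.25]))  # ~1.15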
DOE Office of Scientific and Technical Information (OSTI.GOV)
Craciunescu, Teddy, E-mail: teddy.craciunescu@jet.uk; Tiseanu, Ion; Zoita, Vasile
The Joint European Torus (JET) neutron profile monitor ensures 2D coverage of the gamma and neutron emissive region that enables tomographic reconstruction. Due to the availability of only two projection angles and to the coarse sampling, tomographic inversion is a limited data set problem. Several techniques have been developed for tomographic reconstruction of the 2-D gamma and neutron emissivity on JET, but the problem of evaluating the errors associated with the reconstructed emissivity profile is still open. The reconstruction technique based on the maximum likelihood principle, which has already proved to be a powerful tool for JET tomography, has been used to develop a method for the numerical evaluation of the statistical properties of the uncertainties in gamma and neutron emissivity reconstructions. The image covariance calculation takes into account the additional techniques introduced in the reconstruction process for tackling the limited data set (projection resampling, smoothness regularization depending on magnetic field). The method has been validated by numerical simulations and applied to JET data. Different sources of artefacts that may significantly influence the quality of reconstructions and the accuracy of the variance calculation have been identified.
Approximated maximum likelihood estimation in multifractal random walks
NASA Astrophysics Data System (ADS)
Løvsletten, O.; Rypdal, M.
2012-04-01
We present an approximated maximum likelihood method for the multifractal random walk processes of [E. Bacry et al., Phys. Rev. E 64, 026103 (2001)]. The likelihood is computed using a Laplace approximation and a truncation in the dependency structure for the latent volatility. The procedure is implemented as a package in the R computer language. Its performance is tested on synthetic data and compared to an inference approach based on the generalized method of moments. The method is applied to estimate parameters for various financial stock indices.
Draborg, Eva; Andersen, Christian Kronborg
2006-01-01
Health technology assessment (HTA) has been used as input in decision making worldwide for more than 25 years. However, no uniform definition of HTA or agreement on assessment methods exists, leaving open the question of what influences the choice of assessment methods in HTAs. The objective of this study is to analyze statistically a possible relationship between methods of assessment used in practical HTAs, type of assessed technology, type of assessors, and year of publication. A sample of 433 HTAs published by eleven leading institutions or agencies in nine countries was reviewed and analyzed by multiple logistic regression. The study shows that outsourcing of HTA reports to external partners is associated with a higher likelihood of using assessment methods such as meta-analysis, surveys, economic evaluations, and randomized controlled trials, and with a lower likelihood of using assessment methods such as literature reviews and "other methods". The year of publication was statistically related to the inclusion of economic evaluations, with a decreasing likelihood over the period studied. The type of assessed technology was related to economic evaluations (with a decreasing likelihood), to surveys, and to "other methods" (with a decreasing likelihood) when pharmaceuticals were the assessed type of technology. During the period from 1989 to 2002, no major developments in assessment methods used in practical HTAs were shown statistically in a sample of 433 HTAs worldwide. Outsourcing to external assessors had a statistically significant influence on the choice of assessment methods.
Muthuri, Stella G.; Kuh, Diana; Cooper, Rachel
2018-01-01
Abstract This study aimed to (1) characterise long-term profiles of back pain across adulthood and (2) examine whether childhood risk factors were associated with these profiles, using data from 3271 participants in the Medical Research Council National Survey of Health and Development. A longitudinal latent class analysis was conducted on binary outcomes of back pain at ages 31, 36, 43, 53, 60 to 64, and 68 years. Multinomial logistic regression models were used to examine associations between selected childhood risk factors and class membership; adjusted for sex, adult body size, health status and behaviours, socioeconomic position, and family history of back pain. Four profiles of back pain were identified: no or occasional pain (57.7%), early-adulthood only (16.1%), mid-adulthood onset (16.9%), and persistent (9.4%). The “no or occasional” profile was treated as the referent category in subsequent analyses. After adjustment, taller height at age 7 years was associated with a higher likelihood of early-adulthood only (relative risk ratio per 1 SD increase in height = 1.31 [95% confidence interval: 1.05-1.65]) and persistent pain (relative risk ratio = 1.33 [95% confidence interval: 1.01-1.74]) in women (P for sex interaction = 0.01). Factors associated with an increased risk of persistent pain in both sexes were abdominal pain, poorest care in childhood, and poorer maternal health. Abdominal pain and poorest housing quality were also associated with an increased likelihood of mid-adulthood onset pain. These findings suggest that there are different long-term profiles of back pain, each of which is associated with different early life risk factors. This highlights the potential importance of early life interventions for the prevention and management of back pain. PMID:29408834
Gusto, Gaelle; Schbath, Sophie
2005-01-01
We propose an original statistical method to estimate how the occurrences of a given process along a genome, genes or motifs for instance, may be influenced by the occurrences of a second process. More precisely, the aim is to detect avoided and/or favored distances between two motifs, for instance, suggesting possible interactions at a molecular level. For this, we consider occurrences along the genome as point processes and we use the so-called Hawkes model. In such a model, the intensity at position t depends linearly on the distances to past occurrences of both processes via two unknown profile functions to be estimated. We perform a nonparametric estimation of both profiles by using B-spline decompositions and a constrained maximum likelihood method. Finally, we use the AIC criterion for model selection. Simulations show the excellent behavior of our estimation procedure. We then apply it to study (i) the dependence between gene occurrences along the E. coli genome and the occurrences of a motif known to be part of the major promoter for this bacterium, and (ii) the dependence between the yeast S. cerevisiae genes and the occurrences of putative polyadenylation signals. The results are coherent with known biological properties or previous predictions, meaning this method can be of great interest for functional motif detection, or to improve knowledge of some biological mechanisms.
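A minimal Python sketch of the intensity just described: a baseline plus linear contributions from the distances to past occurrences of the two processes through two profile functions. In the paper the profiles are expanded on a B-spline basis and estimated by constrained maximum likelihood; here h_a and h_b are simply user-supplied callables, so the sketch only shows how such a model evaluates its intensity, not how it is fitted.

def hawkes_intensity(t, occurrences_a, occurrences_b, h_a, h_b, mu):
    # Linear Hawkes-type intensity at genomic position t: baseline mu plus
    # contributions from the distances to past occurrences of process A
    # (e.g. genes) and process B (e.g. a motif) via profile functions h_a, h_b.
    past_a = [s for s in occurrences_a if s < t]
    past_b = [s for s in occurrences_b if s < t]
    return mu + sum(h_a(t - s) for s in past_a) + sum(h_b(t - s) for s in past_b)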
NASA Astrophysics Data System (ADS)
Sun, Mei; Zhang, Xiaolin; Huo, Zailin; Feng, Shaoyuan; Huang, Guanhua; Mao, Xiaomin
2016-03-01
Quantitatively ascertaining and analyzing the effects of model uncertainty on model reliability is a focal point for agricultural-hydrological models because of the many uncertainties in inputs and processes. In this study, the generalized likelihood uncertainty estimation (GLUE) method with Latin hypercube sampling (LHS) was used to evaluate the uncertainty of the RZWQM-DSSAT (RZWQM2) model output responses and the sensitivity of 25 parameters related to soil properties, nutrient transport and crop genetics. To avoid the one-sided risk of model prediction caused by using a single calibration criterion, a combined likelihood (CL) function integrating information on water, nitrogen, and crop production was introduced in the GLUE analysis for the predictions of the following four model output responses: the total amount of water content (T-SWC) and the nitrate nitrogen (T-NIT) within the 1-m soil profile, and the seed yields of waxy maize (Y-Maize) and winter wheat (Y-Wheat). In the process of evaluating RZWQM2, measurements and meteorological data were obtained from a field experiment that involved a winter wheat and waxy maize crop rotation system conducted from 2003 to 2004 in southern Beijing. The calibration and validation results indicated that the RZWQM2 model can be used to simulate crop growth and water-nitrogen migration and transformation in a wheat-maize crop rotation planting system. The results of the uncertainty analysis using the GLUE method showed that T-NIT was sensitive to parameters related to the nitrification coefficient, maize growth characteristics during the seedling period, the wheat vernalization period, and the wheat photoperiod. Parameters governing soil saturated hydraulic conductivity, nitrogen nitrification and denitrification, and urea hydrolysis played an important role in the crop yield components. The prediction errors for RZWQM2 outputs with the CL function were lower and more uniform than those obtained with likelihood functions based on individual calibration criteria. This new and successful application of the GLUE method for determining the uncertainty and sensitivity of RZWQM2 could provide a reference for the optimization of model parameters with different emphases according to research interests.
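A schematic Python sketch of the GLUE workflow with Latin hypercube sampling, under the usual simplifications: run_model stands in for a call to RZWQM2 and combined_likelihood for the CL function scoring its outputs against observations, both hypothetical placeholders here. Parameter sets whose likelihood exceeds a behavioural threshold are retained for the uncertainty analysis.

import numpy as np
from scipy.stats import qmc

def glue_analysis(bounds, run_model, combined_likelihood,
                  n_samples=1000, threshold=0.1, seed=0):
    # bounds is a list of (low, high) pairs, one per parameter.
    lows, highs = np.array(bounds, dtype=float).T
    sampler = qmc.LatinHypercube(d=len(bounds), seed=seed)
    params = qmc.scale(sampler.random(n_samples), lows, highs)
    # Score every sampled parameter set with the combined likelihood.
    likelihoods = np.array([combined_likelihood(run_model(p)) for p in params])
    # Keep the "behavioural" sets; their spread characterises output uncertainty.
    behavioural = params[likelihoods > threshold]
    return behavioural, likelihoods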
Halo-independence with quantified maximum entropy at DAMA/LIBRA
NASA Astrophysics Data System (ADS)
Fowlie, Andrew
2017-10-01
Using the DAMA/LIBRA anomaly as an example, we formalise the notion of halo-independence in the context of Bayesian statistics and quantified maximum entropy. We consider an infinite set of possible profiles, weighted by an entropic prior and constrained by a likelihood describing noisy measurements of modulated moments by DAMA/LIBRA. Assuming an isotropic dark matter (DM) profile in the galactic rest frame, we find the most plausible DM profiles and predictions for unmodulated signal rates at DAMA/LIBRA. The entropic prior contains an a priori unknown regularisation factor, β, that describes the strength of our conviction that the profile is approximately Maxwellian. By varying β, we smoothly interpolate between a halo-independent and a halo-dependent analysis, thus exploring the impact of prior information about the DM profile.
Dahabreh, Issa J; Trikalinos, Thomas A; Lau, Joseph; Schmid, Christopher H
2017-03-01
To compare statistical methods for meta-analysis of sensitivity and specificity of medical tests (e.g., diagnostic or screening tests). We constructed a database of PubMed-indexed meta-analyses of test performance from which 2 × 2 tables for each included study could be extracted. We reanalyzed the data using univariate and bivariate random effects models fit with inverse variance and maximum likelihood methods. Analyses were performed using both normal and binomial likelihoods to describe within-study variability. The bivariate model using the binomial likelihood was also fit using a fully Bayesian approach. We use two worked examples (thoracic computerized tomography to detect aortic injury and rapid prescreening of Papanicolaou smears to detect cytological abnormalities) to highlight that different meta-analysis approaches can produce different results. We also present results from reanalysis of 308 meta-analyses of sensitivity and specificity. Models using the normal approximation produced sensitivity and specificity estimates closer to 50% and smaller standard errors compared to models using the binomial likelihood; absolute differences of 5% or greater were observed in 12% and 5% of meta-analyses for sensitivity and specificity, respectively. Results from univariate and bivariate random effects models were similar, regardless of estimation method. Maximum likelihood and Bayesian methods produced almost identical summary estimates under the bivariate model; however, Bayesian analyses indicated greater uncertainty around those estimates. Bivariate models produced imprecise estimates of the between-study correlation of sensitivity and specificity. Differences between methods were larger with increasing proportion of studies that were small or required a continuity correction. The binomial likelihood should be used to model within-study variability. Univariate and bivariate models give similar estimates of the marginal distributions for sensitivity and specificity. Bayesian methods fully quantify uncertainty and their ability to incorporate external evidence may be useful for imprecisely estimated parameters. Copyright © 2017 Elsevier Inc. All rights reserved.
Huang, Chiung-Yu; Qin, Jing
2013-01-01
The Canadian Study of Health and Aging (CSHA) employed a prevalent cohort design to study survival after onset of dementia, where patients with dementia were sampled and the onset time of dementia was determined retrospectively. The prevalent cohort sampling scheme favors individuals who survive longer. Thus, the observed survival times are subject to length bias. In recent years, there has been a rising interest in developing estimation procedures for prevalent cohort survival data that not only account for length bias but also actually exploit the incidence distribution of the disease to improve efficiency. This article considers semiparametric estimation of the Cox model for the time from dementia onset to death under a stationarity assumption with respect to the disease incidence. Under the stationarity condition, the semiparametric maximum likelihood estimation is expected to be fully efficient yet difficult to perform for statistical practitioners, as the likelihood depends on the baseline hazard function in a complicated way. Moreover, the asymptotic properties of the semiparametric maximum likelihood estimator are not well-studied. Motivated by the composite likelihood method (Besag 1974), we develop a composite partial likelihood method that retains the simplicity of the popular partial likelihood estimator and can be easily performed using standard statistical software. When applied to the CSHA data, the proposed method estimates a significant difference in survival between the vascular dementia group and the possible Alzheimer’s disease group, while the partial likelihood method for left-truncated and right-censored data yields a greater standard error and a 95% confidence interval covering 0, thus highlighting the practical value of employing a more efficient methodology. To check the assumption of stable disease for the CSHA data, we also present new graphical and numerical tests in the article. The R code used to obtain the maximum composite partial likelihood estimator for the CSHA data is available in the online Supplementary Material, posted on the journal web site. PMID:24000265
Validation of DNA-based identification software by computation of pedigree likelihood ratios.
Slooten, K
2011-08-01
Disaster victim identification (DVI) can be aided by DNA-evidence, by comparing the DNA-profiles of unidentified individuals with those of surviving relatives. The DNA-evidence is used optimally when such a comparison is done by calculating the appropriate likelihood ratios. Though conceptually simple, the calculations can be quite involved, especially with large pedigrees, precise mutation models etc. In this article we describe a series of test cases designed to check if software designed to calculate such likelihood ratios computes them correctly. The cases include both simple and more complicated pedigrees, among which inbred ones. We show how to calculate the likelihood ratio numerically and algebraically, including a general mutation model and possibility of allelic dropout. In Appendix A we show how to derive such algebraic expressions mathematically. We have set up these cases to validate new software, called Bonaparte, which performs pedigree likelihood ratio calculations in a DVI context. Bonaparte has been developed by SNN Nijmegen (The Netherlands) for the Netherlands Forensic Institute (NFI). It is available free of charge for non-commercial purposes (see www.dnadvi.nl for details). Commercial licenses can also be obtained. The software uses Bayesian networks and the junction tree algorithm to perform its calculations. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
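As an example of the kind of hand-checkable quantity such test cases rest on, the Python sketch below computes a single-locus likelihood ratio for the simplest pedigree, a parent-child duo, comparing "the alleged parent is the true parent" against "the two are unrelated". It assumes Hardy-Weinberg equilibrium and ignores mutation and allelic dropout, both of which the article's test cases and the Bonaparte software do cover; the genotypes and allele frequencies in the example are illustrative only.

def duo_likelihood_ratio(child, alleged_parent, allele_freq):
    # Genotypes are tuples of two alleles; allele_freq maps alleles to
    # population frequencies. No mutation, no dropout, Hardy-Weinberg assumed.
    a, b = child

    def transmit(allele):
        # Probability the alleged parent transmits the given allele (0, 0.5 or 1).
        return sum(0.5 for parental in alleged_parent if parental == allele)

    if a == b:
        numerator = transmit(a) * allele_freq[a]          # P(child | true parent)
        denominator = allele_freq[a] ** 2                  # P(child | unrelated)
    else:
        numerator = transmit(a) * allele_freq[b] + transmit(b) * allele_freq[a]
        denominator = 2 * allele_freq[a] * allele_freq[b]
    return numerator / denominator

# Heterozygous child (A, B), alleged parent (A, C), freq(A)=0.1, freq(B)=0.2:
# numerator = 0.5 * 0.2, denominator = 2 * 0.1 * 0.2, so LR = 2.5.
print(duo_likelihood_ratio(("A", "B"), ("A", "C"), {"A": 0.1, "B": 0.2, "C": 0.3}))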
Bully/Victim Profiles’ Differential Risk for Worsening Peer Acceptance: The Role of Friendship
Kochel, Karen P.; Ladd, Gary W.; Bagwell, Catherine L.; Yabko, Brandon A.
2015-01-01
Study aims were to: (1) evaluate the association between bully/victim profiles, derived via latent profile analysis (LPA), and changes in peer acceptance from the fall to spring of 7th grade, and (2) investigate the likelihood of friendlessness, and the protective function of mutual friendship, among identified profiles. Participants were 2,587 7th graders; peer nomination and rating-scale data were collected in the fall and spring. Four profiles, including bullies, victims, bully-victims, and uninvolved adolescents, were identified at each time point. Findings showed that for victims, more so than for bullies and uninvolved profiles, acceptance scores worsened over time. Results further revealed that bully-victim and victim profiles included a greater proportion of friendless youth relative to the bully profile, which, in turn, contained a greater proportion of friendless adolescents than the uninvolved profile. Findings also provided evidence for the buffering role of friendship among all bully/victim profiles and among bully-victims especially. PMID:26309346
On Bayesian Testing of Additive Conjoint Measurement Axioms Using Synthetic Likelihood.
Karabatsos, George
2018-06-01
This article introduces a Bayesian method for testing the axioms of additive conjoint measurement. The method is based on an importance sampling algorithm that performs likelihood-free, approximate Bayesian inference using a synthetic likelihood to overcome the analytical intractability of this testing problem. This new method improves upon previous methods because it provides an omnibus test of the entire hierarchy of cancellation axioms, beyond double cancellation. It does so while accounting for the posterior uncertainty that is inherent in the empirical orderings that are implied by these axioms, together. The new method is illustrated through a test of the cancellation axioms on a classic survey data set, and through the analysis of simulated data.
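The synthetic-likelihood idea itself can be illustrated generically: summaries of data simulated at a candidate parameter value are fitted with a multivariate normal, which then serves as a stand-in likelihood for the observed summaries. The sketch below uses an invented simulator and summary statistics, not the conjoint-measurement axiom test.

```python
# Generic synthetic-likelihood sketch (in the spirit of Wood 2010): replace the
# intractable likelihood of observed summaries with a multivariate normal fitted
# to summaries of simulated data. Simulator and "observed" values are invented.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
observed = np.array([0.1, 1.3])                  # hypothetical observed summaries (mean, sd)

def simulate_summaries(theta, n_sim=200, n_obs=50):
    data = rng.normal(theta, 1.2, size=(n_sim, n_obs))
    return np.column_stack([data.mean(axis=1), data.std(axis=1)])

def synthetic_loglik(theta):
    s = simulate_summaries(theta)
    mu, cov = s.mean(axis=0), np.cov(s, rowvar=False)
    return multivariate_normal(mu, cov).logpdf(observed)

for theta in (0.0, 0.1, 0.5):
    print(theta, synthetic_loglik(theta))
```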
Maximum-likelihood estimation of parameterized wavefronts from multifocal data
Sakamoto, Julia A.; Barrett, Harrison H.
2012-01-01
A method for determining the pupil phase distribution of an optical system is demonstrated. Coefficients in a wavefront expansion were estimated using likelihood methods, where the data consisted of multiple irradiance patterns near focus. Proof-of-principle results were obtained in both simulation and experiment. Large-aberration wavefronts were handled in the numerical study. Experimentally, we discuss the handling of nuisance parameters. Fisher information matrices, Cramér-Rao bounds, and likelihood surfaces are examined. ML estimates were obtained by simulated annealing to deal with numerous local extrema in the likelihood function. Rapid processing techniques were employed to reduce the computational time. PMID:22772282
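The role of simulated annealing here is to escape the many local extrema of the likelihood surface. A generic illustration with SciPy's dual_annealing on an invented multimodal objective (a stand-in for a wavefront-coefficient likelihood, not the authors' model):

```python
# Toy global optimization of a multimodal negative log-likelihood with simulated
# annealing; the objective is an invented stand-in, not the wavefront model.
import numpy as np
from scipy.optimize import dual_annealing

def neg_log_likelihood(c):
    # many local minima: a quadratic bowl plus an oscillatory term
    return np.sum(c ** 2) + 10.0 * np.sum(1.0 - np.cos(3.0 * c))

bounds = [(-5.0, 5.0)] * 4                 # four "aberration coefficients"
result = dual_annealing(neg_log_likelihood, bounds, seed=0)
print(result.x, result.fun)                # global minimum is at the origin
```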
Empirical Profiles of Alcohol and Marijuana Use, Drugged Driving, and Risk Perceptions.
Arterberry, Brooke J; Treloar, Hayley; McCarthy, Denis M
2017-11-01
The present study sought to inform models of risk for drugged driving through empirically identifying patterns of marijuana use, alcohol use, and related driving behaviors. Perceived dangerousness and consequences of drugged driving were evaluated as putative influences on risk patterns. We used latent profile analysis of survey responses from 897 college students to identify patterns of substance use and drugged driving. We tested the hypotheses that low perceived danger and low perceived likelihood of negative consequences of drugged driving would identify individuals with higher-risk patterns. Findings from the latent profile analysis indicated that a four-profile model provided the best model fit. Low-level engagers had low rates of substance use and drugged driving. Alcohol-centric engagers had higher rates of alcohol use but low rates of marijuana/simultaneous use and low rates of driving after substance use. Concurrent engagers had higher rates of marijuana and alcohol use, simultaneous use, and related driving behaviors, but marijuana-centric/simultaneous engagers had the highest rates of marijuana use, co-use, and related driving behaviors. Those with higher perceived danger of driving while high were more likely to be in the low-level, alcohol-centric, or concurrent engagers' profiles; individuals with higher perceived likelihood of consequences of driving while high were more likely to be in the low-level engagers group. Findings suggested that college students' perceived dangerousness of driving after using marijuana had greater influence on drugged driving behaviors than alcohol-related driving risk perceptions. These results support targeting marijuana-impaired driving risk perceptions in young adult intervention programs.
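On continuous indicators, latent profile analysis is closely related to a Gaussian mixture model; a rough analogue of the profile-enumeration step, with the number of profiles chosen by BIC on simulated indicators rather than the study's survey items, might look like this:

```python
# Rough analogue of latent profile analysis: fit Gaussian mixtures with varying
# numbers of components and pick the number of profiles by BIC. The three
# "indicator" dimensions and group labels below are simulated, not study data.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
grp1 = rng.normal([0, 0, 0], 1.0, size=(300, 3))   # low-level engagers
grp2 = rng.normal([2, 0, 0], 1.0, size=(150, 3))   # alcohol-centric engagers
grp3 = rng.normal([2, 2, 2], 1.0, size=(100, 3))   # concurrent engagers
X = np.vstack([grp1, grp2, grp3])

bics = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
        for k in range(1, 7)}
best_k = min(bics, key=bics.get)
profiles = GaussianMixture(n_components=best_k, random_state=0).fit_predict(X)
print("number of profiles selected by BIC:", best_k)
```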
Campos-Filho, N; Franco, E L
1989-02-01
A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
The Equivalence of Two Methods of Parameter Estimation for the Rasch Model.
ERIC Educational Resources Information Center
Blackwood, Larry G.; Bradley, Edwin L.
1989-01-01
Two methods of estimating parameters in the Rasch model are compared. The equivalence of likelihood estimations from the model of G. J. Mellenbergh and P. Vijn (1981) and from usual unconditional maximum likelihood (UML) estimation is demonstrated. Mellenbergh and Vijn's model is a convenient method of calculating UML estimates. (SLD)
On the shape and likelihood of oceanic rogue waves.
Benetazzo, Alvise; Ardhuin, Fabrice; Bergamasco, Filippo; Cavaleri, Luigi; Guimarães, Pedro Veras; Schwendeman, Michael; Sclavo, Mauro; Thomson, Jim; Torsello, Andrea
2017-08-15
We consider the observation and analysis of oceanic rogue waves collected within spatio-temporal (ST) records of 3D wave fields. This class of records, allowing a sea surface region to be retrieved, is appropriate for the observation of rogue waves, which appear as a random phenomenon that can occur at any time and location on the sea surface. To verify this aspect, we used three stereo wave imaging systems to gather ST records of the sea surface elevation under different sea conditions. The wave with the maximum ST elevation (in each case larger than the rogue threshold of 1.25Hs) was then isolated within each record, along with its temporal profile. The rogue waves show similar profiles, in agreement with the theory of extreme wave groups. We analyze the rogue wave probability of occurrence, also in the context of ST extreme value distributions, and we conclude that rogue waves are more likely than previously reported; the key point is coming across them, in space as well as in time. The dependence of the rogue wave profile and likelihood on the sea state conditions is also investigated. Results may prove useful in predicting extreme wave occurrence probability and strength during oceanic storms.
Chemical communication, sexual selection, and introgression in wall lizards.
MacGregor, Hannah E A; Lewandowsky, Rachel A M; d'Ettorre, Patrizia; Leroy, Chloé; Davies, Noel W; While, Geoffrey M; Uller, Tobias
2017-10-01
Divergence in communication systems should influence the likelihood that individuals from different lineages interbreed, and consequently shape the direction and rate of hybridization. Here, we studied the role of chemical communication in hybridization, and its contribution to asymmetric and sexually selected introgression between two lineages of the common wall lizard (Podarcis muralis). Males of the two lineages differed in the chemical composition of their femoral secretions. Chemical profiles provided information regarding male secondary sexual characters, but the associations were variable and inconsistent between lineages. In experimental contact zones, chemical composition was weakly associated with male reproductive success, and did not predict the likelihood of hybridization. Consistent with these results, introgression of chemical profiles in a natural hybrid zone resembled that of neutral nuclear genetic markers overall, but one compound in particular (tocopherol methyl ether) matched closely the introgression of visual sexual characters. These results imply that associations among male chemical profiles, sexual characters, and reproductive success largely reflect transient and environmentally driven effects, and that genetic divergence in chemical composition is largely neutral. We therefore suggest that femoral secretions in wall lizards primarily provide information about residency and individual identity rather than function as sexual signals. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.
Halo-independence with quantified maximum entropy at DAMA/LIBRA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fowlie, Andrew, E-mail: andrew.j.fowlie@googlemail.com
2017-10-01
Using the DAMA/LIBRA anomaly as an example, we formalise the notion of halo-independence in the context of Bayesian statistics and quantified maximum entropy. We consider an infinite set of possible profiles, weighted by an entropic prior and constrained by a likelihood describing noisy measurements of modulated moments by DAMA/LIBRA. Assuming an isotropic dark matter (DM) profile in the galactic rest frame, we find the most plausible DM profiles and predictions for unmodulated signal rates at DAMA/LIBRA. The entropic prior contains an a priori unknown regularisation factor, β, that describes the strength of our conviction that the profile is approximately Maxwellian. By varying β, we smoothly interpolate between a halo-independent and a halo-dependent analysis, thus exploring the impact of prior information about the DM profile.
Philanthropic Motivations of Community College Donors
ERIC Educational Resources Information Center
Carter, Linnie S.; Duggan, Molly H.
2011-01-01
This descriptive study surveyed current, lapsed, and major gift donors to explore the impact of college communications on donors' decisions to contribute to the college, the likelihood of donor financial support for various college projects, and the philanthropic motivation profiles of the donors of a midsized, multicampus community college in…
NASA Technical Reports Server (NTRS)
1979-01-01
The computer program Linear SCIDNT which evaluates rotorcraft stability and control coefficients from flight or wind tunnel test data is described. It implements the maximum likelihood method to maximize the likelihood function of the parameters based on measured input/output time histories. Linear SCIDNT may be applied to systems modeled by linear constant-coefficient differential equations. This restriction in scope allows the application of several analytical results which simplify the computation and improve its efficiency over the general nonlinear case.
Method and tool for network vulnerability analysis
Swiler, Laura Painton [Albuquerque, NM; Phillips, Cynthia A [Albuquerque, NM
2006-03-14
A computer system analysis tool and method that will allow for qualitative and quantitative assessment of security attributes and vulnerabilities in systems including computer networks. The invention is based on generation of attack graphs wherein each node represents a possible attack state and each edge represents a change in state caused by a single action taken by an attacker or unwitting assistant. Edges are weighted using metrics such as attacker effort, likelihood of attack success, or time to succeed. Generation of an attack graph is accomplished by matching information about attack requirements (specified in "attack templates") to information about computer system configuration (contained in a configuration file that can be updated to reflect system changes occurring during the course of an attack) and assumed attacker capabilities (reflected in "attacker profiles"). High risk attack paths, which correspond to those considered suited to application of attack countermeasures given limited resources for applying countermeasures, are identified by finding "epsilon optimal paths."
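A minimal sketch of the path-ranking idea: if each edge carries the negative log of its success probability, the shortest weighted path is the most likely attack path. The graph, node names, and probabilities below are invented and do not reproduce the patented tool.

```python
# Ranking attack paths in a weighted attack graph: with edge weights set to
# -log(success probability), the shortest path is the most likely attack path.
# Nodes, edges, and probabilities are invented for illustration.
import math
import networkx as nx

G = nx.DiGraph()
edges = [("internet", "web_server", 0.6),
         ("web_server", "db_server", 0.3),
         ("internet", "vpn_gateway", 0.2),
         ("vpn_gateway", "db_server", 0.7)]
for src, dst, p_success in edges:
    G.add_edge(src, dst, weight=-math.log(p_success))

path = nx.shortest_path(G, "internet", "db_server", weight="weight")
prob = math.exp(-nx.shortest_path_length(G, "internet", "db_server", weight="weight"))
print(path, f"success probability = {prob:.2f}")
```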
NASA Astrophysics Data System (ADS)
Aprile, E.; Aalbers, J.; Agostini, F.; Alfonsi, M.; Amaro, F. D.; Anthony, M.; Arneodo, F.; Barrow, P.; Baudis, L.; Bauermeister, B.; Benabderrahmane, M. L.; Berger, T.; Breur, P. A.; Brown, A.; Brown, E.; Bruenner, S.; Bruno, G.; Budnik, R.; Bütikofer, L.; Calvén, J.; Cardoso, J. M. R.; Cervantes, M.; Cichon, D.; Coderre, D.; Colijn, A. P.; Conrad, J.; Cussonneau, J. P.; Decowski, M. P.; de Perio, P.; di Gangi, P.; di Giovanni, A.; Diglio, S.; Eurin, G.; Fei, J.; Ferella, A. D.; Fieguth, A.; Fulgione, W.; Gallo Rosso, A.; Galloway, M.; Gao, F.; Garbini, M.; Geis, C.; Goetzke, L. W.; Greene, Z.; Grignon, C.; Hasterok, C.; Hogenbirk, E.; Itay, R.; Kaminsky, B.; Kazama, S.; Kessler, G.; Kish, A.; Landsman, H.; Lang, R. F.; Lellouch, D.; Levinson, L.; Lin, Q.; Lindemann, S.; Lindner, M.; Lombardi, F.; Lopes, J. A. M.; Manfredini, A.; Maris, I.; Marrodán Undagoitia, T.; Masbou, J.; Massoli, F. V.; Masson, D.; Mayani, D.; Messina, M.; Micheneau, K.; Molinario, A.; Morâ, K.; Murra, M.; Naganoma, J.; Ni, K.; Oberlack, U.; Pakarha, P.; Pelssers, B.; Persiani, R.; Piastra, F.; Pienaar, J.; Pizzella, V.; Piro, M.-C.; Plante, G.; Priel, N.; Rauch, L.; Reichard, S.; Reuter, C.; Rizzo, A.; Rosendahl, S.; Rupp, N.; Dos Santos, J. M. F.; Sartorelli, G.; Scheibelhut, M.; Schindler, S.; Schreiner, J.; Schumann, M.; Scotto Lavina, L.; Selvi, M.; Shagin, P.; Silva, M.; Simgen, H.; Sivers, M. V.; Stein, A.; Thers, D.; Tiseni, A.; Trinchero, G.; Tunnell, C.; Vargas, M.; Wang, H.; Wang, Z.; Wei, Y.; Weinheimer, C.; Wulf, J.; Ye, J.; Zhang., Y.; Farmer, B.; Xenon Collaboration
2017-08-01
We report on weakly interacting massive particle (WIMP) search results in the XENON100 detector using a nonrelativistic effective field theory approach. The data from science run II (34 kg × 224.6 live days) were reanalyzed, with an increased recoil energy interval compared to previous analyses, ranging from 6.6 to 240 keVnr. The data are found to be compatible with the background-only hypothesis. We present 90% confidence level exclusion limits on the coupling constants of WIMP-nucleon effective operators using a binned profile likelihood method. We also consider the case of inelastic WIMP scattering, where incident WIMPs may up-scatter to a higher mass state, and set exclusion limits on this model as well.
Unified framework to evaluate panmixia and migration direction among multiple sampling locations.
Beerli, Peter; Palczewski, Michal
2010-05-01
For many biological investigations, groups of individuals are genetically sampled from several geographic locations. These sampling locations often do not reflect the genetic population structure. We describe a framework using marginal likelihoods to compare and order structured population models, such as testing whether the sampling locations belong to the same randomly mating population or comparing unidirectional and multidirectional gene flow models. In the context of inferences employing Markov chain Monte Carlo methods, the accuracy of the marginal likelihoods depends heavily on the approximation method used to calculate the marginal likelihood. Two methods, modified thermodynamic integration and a stabilized harmonic mean estimator, are compared. With finite Markov chain Monte Carlo run lengths, the harmonic mean estimator may not be consistent. Thermodynamic integration, in contrast, delivers considerably better estimates of the marginal likelihood. The choice of prior distributions does not influence the order and choice of the better models when the marginal likelihood is estimated using thermodynamic integration, whereas with the harmonic mean estimator the influence of the prior is pronounced and the order of the models changes. The approximation of marginal likelihood using thermodynamic integration in MIGRATE allows the evaluation of complex population genetic models, not only of whether sampling locations belong to a single panmictic population, but also of competing complex structured population models.
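The contrast between estimators can be seen even in a one-parameter conjugate model, where the marginal likelihood is known exactly. The sketch below computes the thermodynamic-integration (power-posterior) estimate by simple quadrature rather than MCMC, so it illustrates the identity itself rather than the MIGRATE implementation; the data and prior are invented.

```python
# Thermodynamic integration for the log marginal likelihood of a 1-D conjugate
# model (normal likelihood with unit variance, normal prior on the mean),
# computed on a parameter grid and compared with the exact answer.
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(0.3, 1.0, size=20)            # data with known unit variance
mu0, tau0 = 0.0, 2.0                         # normal prior on the mean
n = len(y)

theta = np.linspace(-10, 10, 20001)
d_theta = theta[1] - theta[0]
log_prior = -0.5 * ((theta - mu0) / tau0) ** 2 - np.log(tau0 * np.sqrt(2 * np.pi))
log_lik = np.array([-0.5 * np.sum((y - t) ** 2) - 0.5 * n * np.log(2 * np.pi) for t in theta])

def expected_log_lik(beta):
    """E[log L] under the power posterior p_beta(theta) proportional to prior * L^beta."""
    log_w = log_prior + beta * log_lik
    w = np.exp(log_w - log_w.max())
    w /= w.sum() * d_theta
    return np.sum(w * log_lik) * d_theta

betas = np.linspace(0.0, 1.0, 101)
values = np.array([expected_log_lik(b) for b in betas])
log_Z_ti = np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(betas))  # trapezoid rule in beta

# exact log marginal likelihood for this conjugate model, for comparison
A = n + 1.0 / tau0 ** 2
B = y.sum() + mu0 / tau0 ** 2
C = np.sum(y ** 2) + mu0 ** 2 / tau0 ** 2
log_Z_exact = -0.5 * n * np.log(2 * np.pi) - np.log(tau0) - 0.5 * np.log(A) - 0.5 * C + B ** 2 / (2 * A)
print(log_Z_ti, log_Z_exact)
```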
Likelihood-based methods for evaluating principal surrogacy in augmented vaccine trials.
Liu, Wei; Zhang, Bo; Zhang, Hui; Zhang, Zhiwei
2017-04-01
There is growing interest in assessing immune biomarkers, which are quick to measure and potentially predictive of long-term efficacy, as surrogate endpoints in randomized, placebo-controlled vaccine trials. This can be done under a principal stratification approach, with principal strata defined using a subject's potential immune responses to vaccine and placebo (the latter may be assumed to be zero). In this context, principal surrogacy refers to the extent to which vaccine efficacy varies across principal strata. Because a placebo recipient's potential immune response to vaccine is unobserved in a standard vaccine trial, augmented vaccine trials have been proposed to produce the information needed to evaluate principal surrogacy. This article reviews existing methods based on an estimated likelihood and a pseudo-score (PS) and proposes two new methods based on a semiparametric likelihood (SL) and a pseudo-likelihood (PL), for analyzing augmented vaccine trials. Unlike the PS method, the SL method does not require a model for missingness, which can be advantageous when immune response data are missing by happenstance. The SL method is shown to be asymptotically efficient, and it performs similarly to the PS and PL methods in simulation experiments. The PL method appears to have a computational advantage over the PS and SL methods.
Handwriting individualization using distance and rarity
NASA Astrophysics Data System (ADS)
Tang, Yi; Srihari, Sargur; Srinivasan, Harish
2012-01-01
Forensic individualization is the task of associating observed evidence with a specific source. The likelihood ratio (LR) is a quantitative measure that expresses the degree of uncertainty in individualization, where the numerator represents the likelihood that the evidence corresponds to the known and the denominator the likelihood that it does not correspond to the known. Since the number of parameters needed to compute the LR is exponential with the number of feature measurements, a commonly used simplification is the use of likelihoods based on distance (or similarity) given the two alternative hypotheses. This paper proposes an intermediate method which decomposes the LR as the product of two factors, one based on distance and the other on rarity. It was evaluated using a data set of handwriting samples, by determining whether two writing samples were written by the same/different writer(s). The accuracy of the distance and rarity method, as measured by error rates, is significantly better than the distance method.
Fast integration-based prediction bands for ordinary differential equation models.
Hass, Helge; Kreutz, Clemens; Timmer, Jens; Kaschek, Daniel
2016-04-15
To gain a deeper understanding of biological processes and their relevance in disease, mathematical models are built upon experimental data. Uncertainty in the data leads to uncertainties of the model's parameters and in turn to uncertainties of predictions. Mechanistic dynamic models of biochemical networks are frequently based on nonlinear differential equation systems and feature a large number of parameters, sparse observations of the model components and lack of information in the available data. Due to the curse of dimensionality, classical and sampling approaches propagating parameter uncertainties to predictions are hardly feasible and insufficient. However, for experimental design and to discriminate between competing models, prediction and confidence bands are essential. To circumvent the hurdles of the former methods, an approach to calculate a profile likelihood on arbitrary observations for a specific time point has been introduced, which provides accurate confidence and prediction intervals for nonlinear models and is computationally feasible for high-dimensional models. In this article, reliable and smooth point-wise prediction and confidence bands to assess the model's uncertainty on the whole time-course are achieved via explicit integration with elaborate correction mechanisms. The corresponding system of ordinary differential equations is derived and tested on three established models for cellular signalling. An efficiency analysis is performed to illustrate the computational benefit compared with repeated profile likelihood calculations at multiple time points. The integration framework and the examples used in this article are provided with the software package Data2Dynamics, which is based on MATLAB and freely available at http://www.data2dynamics.org. Contact: helge.hass@fdm.uni-freiburg.de. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
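The repeated-optimization step that the integration framework avoids is the profile likelihood itself: fix the parameter of interest on a grid and re-optimize the nuisance parameters at each value. A tiny sketch for an invented exponential-decay model (not the Data2Dynamics implementation):

```python
# Profile likelihood for one parameter of a simple ODE model (exponential decay
# with unknown rate k and initial value y0); the decay model, data, and noise
# level are invented for illustration.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
t_obs = np.linspace(0.0, 5.0, 12)
y_obs = 3.0 * np.exp(-0.7 * t_obs) + rng.normal(0.0, 0.1, size=t_obs.size)

def rss(k, y0):
    sol = solve_ivp(lambda t, y: -k * y, (0.0, 5.0), [y0], t_eval=t_obs)
    return np.sum((sol.y[0] - y_obs) ** 2)

def profile_neg2loglik(k):
    # re-optimize the nuisance parameter y0 for each fixed decay rate k
    best = minimize_scalar(lambda y0: rss(k, y0), bounds=(0.1, 10.0), method="bounded")
    return t_obs.size * np.log(best.fun)   # -2 log L up to a constant (Gaussian errors)

k_grid = np.linspace(0.4, 1.0, 25)
profile = np.array([profile_neg2loglik(k) for k in k_grid])
k_hat = k_grid[np.argmin(profile)]
ci = k_grid[profile - profile.min() <= 3.84]   # approximate 95% profile interval
print(k_hat, ci.min(), ci.max())
```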
Vullo, Carlos M; Romero, Magdalena; Catelli, Laura; Šakić, Mustafa; Saragoni, Victor G; Jimenez Pleguezuelos, María Jose; Romanini, Carola; Anjos Porto, Maria João; Puente Prieto, Jorge; Bofarull Castro, Alicia; Hernandez, Alexis; Farfán, María José; Prieto, Victoria; Alvarez, David; Penacino, Gustavo; Zabalza, Santiago; Hernández Bolaños, Alejandro; Miguel Manterola, Irati; Prieto, Lourdes; Parsons, Thomas
2016-03-01
The GHEP-ISFG Working Group has recognized the importance of assisting DNA laboratories to gain expertise in handling DVI or missing persons identification (MPI) projects which involve the need for large-scale genetic profile comparisons. Eleven laboratories participated in a DNA matching exercise to identify victims from a hypothetical conflict with 193 missing persons. The post mortem database comprised 87 skeletal remains profiles from a secondary mass grave representing a minimum number of 58 individuals, with evidence of commingling. The reference database was represented by 286 family reference profiles with diverse pedigrees. The goal of the exercise was to correctly discover re-associations and family matches. The results of direct matching for commingled remains re-associations were correct and fully concordant among all laboratories. However, the kinship analysis for missing persons identifications showed variable results among the participants. There was a group of laboratories with correct, concordant results but nearly half of the others showed discrepant results exhibiting likelihood ratio differences of several orders of magnitude in some cases. Three main errors were detected: (a) some laboratories did not use the complete reference family genetic data to report the match with the remains, (b) the identity and/or non-identity hypotheses were sometimes wrongly expressed in the likelihood ratio calculations, and (c) many laboratories did not properly evaluate the prior odds for the event. The results suggest that large-scale profile comparisons for DVI or MPI are a challenge for forensic genetics laboratories and the statistical treatment of DNA matching and the Bayesian framework should be better standardized among laboratories. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Maximum-likelihood methods in wavefront sensing: stochastic models and likelihood functions
Barrett, Harrison H.; Dainty, Christopher; Lara, David
2008-01-01
Maximum-likelihood (ML) estimation in wavefront sensing requires careful attention to all noise sources and all factors that influence the sensor data. We present detailed probability density functions for the output of the image detector in a wavefront sensor, conditional not only on wavefront parameters but also on various nuisance parameters. Practical ways of dealing with nuisance parameters are described, and final expressions for likelihoods and Fisher information matrices are derived. The theory is illustrated by discussing Shack–Hartmann sensors, and computational requirements are discussed. Simulation results show that ML estimation can significantly increase the dynamic range of a Shack–Hartmann sensor with four detectors and that it can reduce the residual wavefront error when compared with traditional methods. PMID:17206255
Assessing compatibility of direct detection data: halo-independent global likelihood analyses
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gelmini, Graciela B.; Huh, Ji-Haeng; Witte, Samuel J.
2016-10-18
We present two different halo-independent methods to assess the compatibility of several direct dark matter detection data sets for a given dark matter model using a global likelihood consisting of at least one extended likelihood and an arbitrary number of Gaussian or Poisson likelihoods. In the first method we find the global best fit halo function (we prove that it is a unique piecewise constant function with a number of down steps smaller than or equal to a maximum number that we compute) and construct a two-sided pointwise confidence band at any desired confidence level, which can then be compared with those derived from the extended likelihood alone to assess the joint compatibility of the data. In the second method we define a “constrained parameter goodness-of-fit” test statistic, whose p-value we then use to define a “plausibility region” (e.g. where p≥10%). For any halo function not entirely contained within the plausibility region, the level of compatibility of the data is very low (e.g. p<10%). We illustrate these methods by applying them to CDMS-II-Si and SuperCDMS data, assuming dark matter particles with elastic spin-independent isospin-conserving interactions or exothermic spin-independent isospin-violating interactions.
Sensitivity Analysis for Atmospheric Infrared Sounder (AIRS) CO2 Retrieval
NASA Technical Reports Server (NTRS)
Gat, Ilana
2012-01-01
The Atmospheric Infrared Sounder (AIRS) is a thermal infrared sensor able to retrieve the daily atmospheric state globally for clear as well as partially cloudy fields of view. The AIRS spectrometer has 2378 channels sensing from 15.4 micrometers to 3.7 micrometers, of which a small subset in the 15 micrometers region has been selected, to date, for CO2 retrieval. To improve upon the current retrieval method, we extended the retrieval calculations to include a prior estimate component and developed a channel ranking system to optimize the channels and number of channels used. The channel ranking system uses a mathematical formalism to rapidly process and assess the retrieval potential of large numbers of channels. Implementing this system, we identified a larger optimized subset of AIRS channels that can decrease retrieval errors and minimize the overall sensitivity to other interfering contributors, such as water vapor, ozone, and atmospheric temperature. This methodology selects channels globally by accounting for the latitudinal, longitudinal, and seasonal dependencies of the subset. The new methodology increases accuracy in AIRS CO2 as well as other retrievals and enables the extension of retrieved CO2 vertical profiles to altitudes ranging from the lower troposphere to the upper stratosphere. The extended retrieval method estimates CO2 vertical profiles using a maximum-likelihood estimation method. We use model data to demonstrate the beneficial impact of the extended retrieval method using the new channel ranking system on CO2 retrieval.
Consistency of Rasch Model Parameter Estimation: A Simulation Study.
ERIC Educational Resources Information Center
van den Wollenberg, Arnold L.; And Others
1988-01-01
The unconditional--simultaneous--maximum likelihood (UML) estimation procedure for the one-parameter logistic model produces biased estimators. The UML method is inconsistent and is not a good alternative to the conditional maximum likelihood method, at least with small numbers of items. The minimum Chi-square estimation procedure produces unbiased…
Developmental Assets: Profile of Youth in a Juvenile Justice Facility
ERIC Educational Resources Information Center
Chew, Weslee; Osseck, Jenna; Raygor, Desiree; Eldridge-Houser, Jennifer; Cox, Carol
2010-01-01
Background: Possessing high numbers of developmental assets greatly reduces the likelihood of a young person engaging in health-risk behaviors. Since youth in the juvenile justice system seem to exhibit many high-risk behaviors, the purpose of this study was to assess the presence of external, internal, and social context areas of developmental…
Families and Schools Together: Building Relationships. Juvenile Justice Bulletin.
ERIC Educational Resources Information Center
McDonald, Lynn; Frey, Heather E.
This bulletin profiles a program, Families and Schools Together (FAST), that brings at-risk children and their families together in multifamily groups to strengthen families and increase the likelihood that children will succeed at home, at school, and in the community. Drawing on research and family therapy, FAST builds protective factors for…
Early Childhood Poverty: A Statistical Profile.
ERIC Educational Resources Information Center
Song, Younghwan; Lu, Hsien-Hen
Noting that young children in poverty face a greater likelihood of impaired development because of their increased exposure to a number of risk factors associated with poverty, this report presents statistical information on the incidence of poverty during early childhood. The report notes that the poverty rate for U.S. children under age 3…
A Social Psychological Model for Predicting Sexual Harassment.
ERIC Educational Resources Information Center
Pryor, John B.; And Others
1995-01-01
Presents a Person X Situation (PXS) model of sexual harassment suggesting that sexually harassing behavior may be predicted from an analysis of social situational and personal factors. Research on sexual harassment proclivities in men is reviewed, and a profile of men who have a high a likelihood to sexually harass is discussed. Possible PXS…
Ostrovnaya, Irina; Seshan, Venkatraman E; Olshen, Adam B; Begg, Colin B
2011-06-15
If a cancer patient develops multiple tumors, it is sometimes impossible to determine whether these tumors are independent or clonal based solely on pathological characteristics. Investigators have studied how to improve this diagnostic challenge by comparing the presence of loss of heterozygosity (LOH) at selected genetic locations of tumor samples, or by comparing genomewide copy number array profiles. We have previously developed statistical methodology to compare such genomic profiles for an evidence of clonality. We assembled the software for these tests in a new R package called 'Clonality'. For LOH profiles, the package contains significance tests. The analysis of copy number profiles includes a likelihood ratio statistic and reference distribution, as well as an option to produce various plots that summarize the results. Bioconductor (http://bioconductor.org/packages/release/bioc/html/Clonality.html) and http://www.mskcc.org/mskcc/html/13287.cfm.
A general diagnostic model applied to language testing data.
von Davier, Matthias
2008-11-01
Probabilistic models with one or more latent variables are designed to report on a corresponding number of skills or cognitive attributes. Multidimensional skill profiles offer additional information beyond what a single test score can provide, if the reported skills can be identified and distinguished reliably. Many recent approaches to skill profile models are limited to dichotomous data and have made use of computationally intensive estimation methods such as Markov chain Monte Carlo, since standard maximum likelihood (ML) estimation techniques were deemed infeasible. This paper presents a general diagnostic model (GDM) that can be estimated with standard ML techniques and applies to polytomous response variables as well as to skills with two or more proficiency levels. The paper uses one member of a larger class of diagnostic models, a compensatory diagnostic model for dichotomous and partial credit data. Many well-known models, such as univariate and multivariate versions of the Rasch model and the two-parameter logistic item response theory model, the generalized partial credit model, as well as a variety of skill profile models, are special cases of this GDM. In addition to an introduction to this model, the paper presents a parameter recovery study using simulated data and an application to real data from the field test for TOEFL Internet-based testing.
Clavel, Julien; Aristide, Leandro; Morlon, Hélène
2018-06-19
Working with high-dimensional phylogenetic comparative datasets is challenging because likelihood-based multivariate methods suffer from low statistical performance as the number of traits p approaches the number of species n and because some computational complications occur when p exceeds n. Alternative phylogenetic comparative methods have recently been proposed to deal with the large p, small n scenario but their use and performance are limited. Here we develop a penalized likelihood framework to deal with high-dimensional comparative datasets. We propose various penalizations and methods for selecting the intensity of the penalties. We apply this general framework to the estimation of parameters (the evolutionary trait covariance matrix and parameters of the evolutionary model) and model comparison for the high-dimensional multivariate Brownian (BM), Early-burst (EB), Ornstein-Uhlenbeck (OU) and Pagel's lambda models. We show using simulations that our penalized likelihood approach dramatically improves the estimation of evolutionary trait covariance matrices and model parameters when p approaches n, and allows for their accurate estimation when p equals or exceeds n. In addition, we show that penalized likelihood models can be efficiently compared using the Generalized Information Criterion (GIC). We implement these methods, as well as the related estimation of ancestral states and the computation of phylogenetic PCA, in the R packages RPANDA and mvMORPH. Finally, we illustrate the utility of the newly proposed framework by evaluating evolutionary model fit, analyzing integration patterns, and reconstructing evolutionary trajectories for a high-dimensional 3-D dataset of brain shape in the New World monkeys. We find clear support for an Early-burst model suggesting an early diversification of brain morphology during the ecological radiation of the clade. Penalized likelihood offers an efficient way to deal with high-dimensional multivariate comparative data.
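The core penalization idea can be sketched without the phylogenetic machinery: shrink the sample trait covariance toward its diagonal and pick the penalty intensity by held-out Gaussian log-likelihood. The example below ignores the tree structure handled by RPANDA/mvMORPH and uses simulated traits.

```python
# Penalized (linear shrinkage) estimation of a trait covariance matrix when the
# number of traits p approaches the number of samples n, with the shrinkage
# intensity chosen by held-out Gaussian log-likelihood. Data are simulated and
# no phylogenetic structure is modeled.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(3)
p, n = 20, 25
true_cov = 0.5 * np.eye(p) + 0.5 * np.ones((p, p))
X = rng.multivariate_normal(np.zeros(p), true_cov, size=n)
train, test = X[:15], X[15:]

def shrunk_cov(data, lam):
    S = np.cov(data, rowvar=False)
    return (1.0 - lam) * S + lam * np.diag(np.diag(S))

def heldout_loglik(lam):
    sigma = shrunk_cov(train, lam)
    return multivariate_normal(np.zeros(p), sigma).logpdf(test).sum()

lams = np.linspace(0.05, 0.95, 19)
best_lam = lams[np.argmax([heldout_loglik(l) for l in lams])]
print("selected shrinkage intensity:", best_lam)
```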
Estimating the variance for heterogeneity in arm-based network meta-analysis.
Piepho, Hans-Peter; Madden, Laurence V; Roger, James; Payne, Roger; Williams, Emlyn R
2018-04-19
Network meta-analysis can be implemented by using arm-based or contrast-based models. Here we focus on arm-based models and fit them using generalized linear mixed model procedures. Full maximum likelihood (ML) estimation leads to biased trial-by-treatment interaction variance estimates for heterogeneity. Thus, our objective is to investigate alternative approaches to variance estimation that reduce bias compared with full ML. Specifically, we use penalized quasi-likelihood/pseudo-likelihood and hierarchical (h) likelihood approaches. In addition, we consider a novel model modification that yields estimators akin to the residual maximum likelihood estimator for linear mixed models. The proposed methods are compared by simulation, and 2 real datasets are used for illustration. Simulations show that penalized quasi-likelihood/pseudo-likelihood and h-likelihood reduce bias and yield satisfactory coverage rates. Sum-to-zero restriction and baseline contrasts for random trial-by-treatment interaction effects, as well as a residual ML-like adjustment, also reduce bias compared with an unconstrained model when ML is used, but coverage rates are not quite as good. Penalized quasi-likelihood/pseudo-likelihood and h-likelihood are therefore recommended. Copyright © 2018 John Wiley & Sons, Ltd.
On the existence of maximum likelihood estimates for presence-only data
Hefley, Trevor J.; Hooten, Mevin B.
2015-01-01
It is important to identify conditions for which maximum likelihood estimates are unlikely to be identifiable from presence-only data. In data sets where the maximum likelihood estimates do not exist, penalized likelihood and Bayesian methods will produce coefficient estimates, but these are sensitive to the choice of estimation procedure and prior or penalty term. When sample size is small or it is thought that habitat preferences are strong, we propose a suite of estimation procedures researchers can consider using.
Likelihood-based modification of experimental crystal structure electron density maps
Terwilliger, Thomas C [Sante Fe, NM
2005-04-16
A maximum-likelihood method improves an electron density map of an experimental crystal structure. A likelihood of a set of structure factors {F_h} is formed for the experimental crystal structure as (1) the likelihood of having obtained an observed set of structure factors {F_h^OBS} if the structure factor set {F_h} was correct, and (2) the likelihood that an electron density map resulting from {F_h} is consistent with selected prior knowledge about the experimental crystal structure. The set of structure factors {F_h} is then adjusted to maximize the likelihood of {F_h} for the experimental crystal structure. An improved electron density map is constructed with the maximized structure factors.
The upgrade of the Thomson scattering system for measurement on the C-2/C-2U devices.
Zhai, K; Schindler, T; Kinley, J; Deng, B; Thompson, M C
2016-11-01
The C-2/C-2U Thomson scattering system has been substantially upgraded during the latter phase of the C-2/C-2U program. A Rayleigh channel has been added to each of the three polychromators of the C-2/C-2U Thomson scattering system. Onsite spectral calibration has been applied to avoid the issue of different channel responses at different spots on the photomultiplier tube surface. With the added Rayleigh channel, the absolute intensity response of the system is calibrated with Rayleigh scattering in argon gas from 0.1 to 4 Torr, where the Rayleigh scattering signal is comparable to the Thomson scattering signal at electron densities from 1 × 10^13 to 4 × 10^14 cm^-3. A new signal processing algorithm, using a maximum likelihood method and including detailed analysis of different noise contributions within the system, has been developed to obtain electron temperature and density profiles. The system setup, spectral and intensity calibration procedure and its outcome, data analysis, and the results of electron temperature/density profile measurements will be presented.
Population Synthesis of Radio and Gamma-ray Pulsars using the Maximum Likelihood Approach
NASA Astrophysics Data System (ADS)
Billman, Caleb; Gonthier, P. L.; Harding, A. K.
2012-01-01
We present the results of a pulsar population synthesis of normal pulsars from the Galactic disk using a maximum likelihood method. We seek to maximize the likelihood of a set of parameters in a Monte Carlo population statistics code to better understand their uncertainties and the confidence region of the model's parameter space. The maximum likelihood method allows for the use of more applicable Poisson statistics in the comparison of distributions of small numbers of detected gamma-ray and radio pulsars. Our code simulates pulsars at birth using Monte Carlo techniques and evolves them to the present assuming initial spatial, kick velocity, magnetic field, and period distributions. Pulsars are spun down to the present and given radio and gamma-ray emission characteristics. We select measured distributions of radio pulsars from the Parkes Multibeam survey and Fermi gamma-ray pulsars to perform a likelihood analysis of the assumed model parameters such as initial period and magnetic field, and radio luminosity. We present the results of a grid search of the parameter space as well as a search for the maximum likelihood using a Markov Chain Monte Carlo method. We express our gratitude for the generous support of the Michigan Space Grant Consortium, of the National Science Foundation (REU and RUI), the NASA Astrophysics Theory and Fundamental Program and the NASA Fermi Guest Investigator Program.
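The Poisson likelihood referred to above compares small observed counts per bin with model-predicted counts, the regime where Gaussian chi-square comparisons break down. A minimal version with invented bins and a single overall normalization parameter:

```python
# Binned Poisson log-likelihood comparing model-predicted counts with small
# observed counts. The bins, observed counts, and model shape are invented.
import numpy as np
from scipy.special import gammaln

observed = np.array([0, 1, 3, 5, 2, 0])             # detected objects per bin

def predicted_counts(scale):
    return scale * np.array([0.2, 0.8, 2.5, 4.0, 1.5, 0.4])

def poisson_loglik(scale):
    mu = predicted_counts(scale)
    return np.sum(observed * np.log(mu) - mu - gammaln(observed + 1))

scales = np.linspace(0.2, 3.0, 200)
best = scales[np.argmax([poisson_loglik(s) for s in scales])]
print("maximum-likelihood normalization:", best)
```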
Wu, Yufeng
2012-03-01
Incomplete lineage sorting can cause incongruence between the phylogenetic history of genes (the gene tree) and that of the species (the species tree), which can complicate the inference of phylogenies. In this article, I present a new coalescent-based algorithm for species tree inference with maximum likelihood. I first describe an improved method for computing the probability of a gene tree topology given a species tree, which is much faster than an existing algorithm by Degnan and Salter (2005). Based on this method, I develop a practical algorithm that takes a set of gene tree topologies and infers species trees with maximum likelihood. This algorithm searches for the best species tree by starting from initial species trees and performing heuristic search to obtain better trees with higher likelihood. This algorithm, called STELLS (which stands for Species Tree InfErence with Likelihood for Lineage Sorting), has been implemented in a program that is downloadable from the author's web page. The simulation results show that the STELLS algorithm is more accurate than an existing maximum likelihood method for many datasets, especially when there is noise in gene trees. I also show that the STELLS algorithm is efficient and can be applied to real biological datasets. © 2011 The Author. Evolution© 2011 The Society for the Study of Evolution.
The Extended-Image Tracking Technique Based on the Maximum Likelihood Estimation
NASA Technical Reports Server (NTRS)
Tsou, Haiping; Yan, Tsun-Yee
2000-01-01
This paper describes an extended-image tracking technique based on maximum likelihood estimation. The target image is assumed to have a known profile covering more than one element of a focal plane detector array. It is assumed that the relative position between the imager and the target is changing with time and the received target image has each of its pixels disturbed by independent additive white Gaussian noise. When a rotation-invariant movement between imager and target is considered, the maximum likelihood based image tracking technique described in this paper is a closed-loop structure capable of providing iterative update of the movement estimate by calculating the loop feedback signals from a weighted correlation between the currently received target image and the previously estimated reference image in the transform domain. The movement estimate is then used to direct the imager to closely follow the moving target. This image tracking technique has many potential applications, including free-space optical communications and astronomy where accurate and stabilized optical pointing is essential.
NASA Technical Reports Server (NTRS)
Hoffbeck, Joseph P.; Landgrebe, David A.
1994-01-01
Many analysis algorithms for high-dimensional remote sensing data require that the remotely sensed radiance spectra be transformed to approximate reflectance to allow comparison with a library of laboratory reflectance spectra. In maximum likelihood classification, however, the remotely sensed spectra are compared to training samples, thus a transformation to reflectance may or may not be helpful. The effect of several radiance-to-reflectance transformations on maximum likelihood classification accuracy is investigated in this paper. We show that the empirical line approach, LOWTRAN7, flat-field correction, single spectrum method, and internal average reflectance are all non-singular affine transformations, and that non-singular affine transformations have no effect on discriminant analysis feature extraction and maximum likelihood classification accuracy. (An affine transformation is a linear transformation with an optional offset.) Since the Atmosphere Removal Program (ATREM) and the log residue method are not affine transformations, experiments with Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data were conducted to determine the effect of these transformations on maximum likelihood classification accuracy. The average classification accuracy of the data transformed by ATREM and the log residue method was slightly less than the accuracy of the original radiance data. Since the radiance-to-reflectance transformations allow direct comparison of remotely sensed spectra with laboratory reflectance spectra, they can be quite useful in labeling the training samples required by maximum likelihood classification, but these transformations have only a slight effect or no effect at all on discriminant analysis and maximum likelihood classification accuracy.
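The invariance claim is easy to verify numerically: transform the features with any non-singular affine map and the Gaussian maximum-likelihood class assignments are unchanged, because the Jacobian factor cancels across classes. A small check on simulated (non-AVIRIS) data:

```python
# Numerical check that Gaussian maximum-likelihood class assignments are
# identical before and after a non-singular affine transform of the features.
# The two classes, transform, and data are invented for illustration.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(4)
train = {c: rng.multivariate_normal(mu, np.eye(2), size=50)
         for c, mu in {"soil": [0, 0], "vegetation": [3, 1]}.items()}
test = rng.multivariate_normal([1.5, 0.5], np.eye(2), size=20)

def classify(train_data, samples):
    dens = {c: multivariate_normal(d.mean(axis=0), np.cov(d, rowvar=False)).logpdf(samples)
            for c, d in train_data.items()}
    classes = list(dens)
    return np.array(classes)[np.argmax(np.column_stack([dens[c] for c in classes]), axis=1)]

A = np.array([[2.0, 0.3], [-0.5, 1.0]])   # non-singular affine "calibration" transform
b = np.array([0.7, -1.2])
train_t = {c: d @ A.T + b for c, d in train.items()}
test_t = test @ A.T + b

print(np.array_equal(classify(train, test), classify(train_t, test_t)))  # True
```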
Zeng, Chan; Newcomer, Sophia R; Glanz, Jason M; Shoup, Jo Ann; Daley, Matthew F; Hambidge, Simon J; Xu, Stanley
2013-12-15
The self-controlled case series (SCCS) method is often used to examine the temporal association between vaccination and adverse events using only data from patients who experienced such events. Conditional Poisson regression models are used to estimate incidence rate ratios, and these models perform well with large or medium-sized case samples. However, in some vaccine safety studies, the adverse events studied are rare and the maximum likelihood estimates may be biased. Several bias correction methods have been examined in case-control studies using conditional logistic regression, but none of these methods have been evaluated in studies using the SCCS design. In this study, we used simulations to evaluate 2 bias correction approaches-the Firth penalized maximum likelihood method and Cordeiro and McCullagh's bias reduction after maximum likelihood estimation-with small sample sizes in studies using the SCCS design. The simulations showed that the bias under the SCCS design with a small number of cases can be large and is also sensitive to a short risk period. The Firth correction method provides finite and less biased estimates than the maximum likelihood method and Cordeiro and McCullagh's method. However, limitations still exist when the risk period in the SCCS design is short relative to the entire observation period.
Kamneva, Olga K; Rosenberg, Noah A
2017-01-01
Hybridization events generate reticulate species relationships, giving rise to species networks rather than species trees. We report a comparative study of consensus, maximum parsimony, and maximum likelihood methods of species network reconstruction using gene trees simulated assuming a known species history. We evaluate the role of the divergence time between species involved in a hybridization event, the relative contributions of the hybridizing species, and the error in gene tree estimation. When gene tree discordance is mostly due to hybridization and not due to incomplete lineage sorting (ILS), most of the methods can detect even highly skewed hybridization events between highly divergent species. For recent divergences between hybridizing species, when the influence of ILS is sufficiently high, likelihood methods outperform parsimony and consensus methods, which erroneously identify extra hybridizations. The more sophisticated likelihood methods, however, are affected by gene tree errors to a greater extent than are consensus and parsimony. PMID:28469378
Approximate likelihood calculation on a phylogeny for Bayesian estimation of divergence times.
dos Reis, Mario; Yang, Ziheng
2011-07-01
The molecular clock provides a powerful way to estimate species divergence times. If information on some species divergence times is available from the fossil or geological record, it can be used to calibrate a phylogeny and estimate divergence times for all nodes in the tree. The Bayesian method provides a natural framework to incorporate different sources of information concerning divergence times, such as information in the fossil and molecular data. Current models of sequence evolution are intractable in a Bayesian setting, and Markov chain Monte Carlo (MCMC) is used to generate the posterior distribution of divergence times and evolutionary rates. This method is computationally expensive, as it involves the repeated calculation of the likelihood function. Here, we explore the use of Taylor expansion to approximate the likelihood during MCMC iteration. The approximation is much faster than conventional likelihood calculation. However, the approximation is expected to be poor when the proposed parameters are far from the likelihood peak. We explore the use of parameter transforms (square root, logarithm, and arcsine) to improve the approximation to the likelihood curve. We found that the new methods, particularly the arcsine-based transform, provided very good approximations under relaxed clock models and also under the global clock model when the global clock is not seriously violated. The approximation is poorer for analysis under the global clock when the global clock is seriously wrong and should thus not be used. The results suggest that the approximate method may be useful for Bayesian dating analysis using large data sets.
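The underlying idea, a second-order Taylor expansion of the log-likelihood around its maximum, can be shown on a one-dimensional binomial likelihood (not the phylogenetic likelihood used in the paper); the approximation is good near the MLE and degrades farther away, which is what motivates the parameter transforms.

```python
# Second-order Taylor (quadratic) approximation of a log-likelihood around its
# MLE, illustrated with a binomial likelihood. Counts are invented.
import numpy as np

n, k = 50, 12                      # 12 successes in 50 trials
p_hat = k / n                      # maximum likelihood estimate

def loglik(p):
    return k * np.log(p) + (n - k) * np.log(1 - p)

# observed information at the MLE (negative second derivative of the log-likelihood)
info = k / p_hat ** 2 + (n - k) / (1 - p_hat) ** 2

def taylor_loglik(p):
    return loglik(p_hat) - 0.5 * info * (p - p_hat) ** 2

for p in (0.22, 0.30, 0.45):
    print(p, loglik(p), taylor_loglik(p))   # approximation degrades far from the MLE
```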
ERIC Educational Resources Information Center
Klein, Andreas G.; Muthen, Bengt O.
2007-01-01
In this article, a nonlinear structural equation model is introduced and a quasi-maximum likelihood method for simultaneous estimation and testing of multiple nonlinear effects is developed. The focus of the new methodology lies on efficiency, robustness, and computational practicability. Monte-Carlo studies indicate that the method is highly…
Bias and Efficiency in Structural Equation Modeling: Maximum Likelihood versus Robust Methods
ERIC Educational Resources Information Center
Zhong, Xiaoling; Yuan, Ke-Hai
2011-01-01
In the structural equation modeling literature, the normal-distribution-based maximum likelihood (ML) method is most widely used, partly because the resulting estimator is claimed to be asymptotically unbiased and most efficient. However, this may not hold when data deviate from normal distribution. Outlying cases or nonnormally distributed data,…
Five Methods for Estimating Angoff Cut Scores with IRT
ERIC Educational Resources Information Center
Wyse, Adam E.
2017-01-01
This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a priori (EAP), modal a priori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
Self-consistent Bulge/Disk/Halo Galaxy Dynamical Modeling Using Integral Field Kinematics
NASA Astrophysics Data System (ADS)
Taranu, D. S.; Obreschkow, D.; Dubinski, J. J.; Fogarty, L. M. R.; van de Sande, J.; Catinella, B.; Cortese, L.; Moffett, A.; Robotham, A. S. G.; Allen, J. T.; Bland-Hawthorn, J.; Bryant, J. J.; Colless, M.; Croom, S. M.; D'Eugenio, F.; Davies, R. L.; Drinkwater, M. J.; Driver, S. P.; Goodwin, M.; Konstantopoulos, I. S.; Lawrence, J. S.; López-Sánchez, Á. R.; Lorente, N. P. F.; Medling, A. M.; Mould, J. R.; Owers, M. S.; Power, C.; Richards, S. N.; Tonini, C.
2017-11-01
We introduce a method for modeling disk galaxies designed to take full advantage of data from integral field spectroscopy (IFS). The method fits equilibrium models to simultaneously reproduce the surface brightness, rotation, and velocity dispersion profiles of a galaxy. The models are fully self-consistent 6D distribution functions for a galaxy with a Sérsic profile stellar bulge, exponential disk, and parametric dark-matter halo, generated by an updated version of GalactICS. By creating realistic flux-weighted maps of the kinematic moments (flux, mean velocity, and dispersion), we simultaneously fit photometric and spectroscopic data using both maximum-likelihood and Bayesian (MCMC) techniques. We apply the method to a GAMA spiral galaxy (G79635) with kinematics from the SAMI Galaxy Survey and deep g- and r-band photometry from the VST-KiDS survey, comparing parameter constraints with those from traditional 2D bulge-disk decomposition. Our method returns broadly consistent results for shared parameters while constraining the mass-to-light ratios of stellar components and reproducing the H I-inferred circular velocity well beyond the limits of the SAMI data. Although the method is tailored for fitting integral field kinematic data, it can use other dynamical constraints like central fiber dispersions and H I circular velocities, and is well-suited for modeling galaxies with a combination of deep imaging and H I and/or optical spectra (resolved or otherwise). Our implementation (MagRite) is computationally efficient and can generate well-resolved models and kinematic maps in under a minute on modern processors.
Hudson, H M; Ma, J; Green, P
1994-01-01
Many algorithms for medical image reconstruction adopt versions of the expectation-maximization (EM) algorithm. In this approach, parameter estimates are obtained which maximize a complete data likelihood or penalized likelihood, in each iteration. Implicitly (and sometimes explicitly) penalized algorithms require smoothing of the current reconstruction in the image domain as part of their iteration scheme. In this paper, we discuss alternatives to EM which adapt Fisher's method of scoring (FS) and other methods for direct maximization of the incomplete data likelihood. Jacobi and Gauss-Seidel methods for non-linear optimization provide efficient algorithms applying FS in tomography. One approach uses smoothed projection data in its iterations. We investigate the convergence of Jacobi and Gauss-Seidel algorithms with clinical tomographic projection data.
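For comparison with the scoring alternatives discussed above, the classic EM (MLEM) iteration for Poisson emission data is a simple multiplicative update; the system matrix and counts below are random toys rather than clinical projection data.

```python
# Classic EM (MLEM) update for Poisson emission tomography:
#   image <- image * A^T (counts / (A image)) / A^T 1
# The system matrix, true image, and counts are random toys.
import numpy as np

rng = np.random.default_rng(5)
n_bins, n_pixels = 60, 16
A = rng.uniform(0.0, 1.0, size=(n_bins, n_pixels))    # detection probabilities
true_image = rng.uniform(0.5, 2.0, size=n_pixels)
counts = rng.poisson(A @ true_image)                   # measured projection counts

image = np.ones(n_pixels)                              # flat initial estimate
sensitivity = A.sum(axis=0)
for _ in range(200):
    expected = A @ image
    image *= (A.T @ (counts / expected)) / sensitivity # multiplicative EM update

print(np.round(image, 2))
print(np.round(true_image, 2))
```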
Unexpected metastatic pheochromocytoma - an unusual presentation.
Birrenbach, Tanja; Stanga, Zeno; Cottagnoud, Philippe; Stucki, Armin
2008-01-01
The classic triad of pheochromocytoma consists of episodic headache, sweating, and tachycardia. General clinicians should be aware, however, that this rare entity might present with a wide spectrum of clinical symptoms. We recently observed a noteworthy case of malignant pheochromocytoma where there was a lack of specific symptoms despite an advanced tumor stage. Malignancy is an important cause of mortality. Reliable diagnosis of malignancy depends upon evidence of local invasion, distant metastases, or recurrence. As in our case, new scintigraphic methods, such as 111-In-pentetreotide scintigraphy (Octreoscan), may occasionally reveal 123-I-metaiodobenzylguanidine-negative distant metastases and help to establish an early diagnosis of malignancy. Tumor size, and perhaps even biochemical profile, may be factors increasing the likelihood of a malignant process and may contribute to early identification of patients at risk.
A class of Box-Cox transformation models for recurrent event data.
Sun, Liuquan; Tong, Xingwei; Zhou, Xian
2011-04-01
In this article, we propose a class of Box-Cox transformation models for recurrent event data, which includes the proportional means models as special cases. The new model offers great flexibility in formulating the effects of covariates on the mean functions of counting processes while leaving the stochastic structure completely unspecified. For the inference on the proposed models, we apply a profile pseudo-partial likelihood method to estimate the model parameters via estimating equation approaches and establish large sample properties of the estimators and examine its performance in moderate-sized samples through simulation studies. In addition, some graphical and numerical procedures are presented for model checking. An example of application on a set of multiple-infection data taken from a clinic study on chronic granulomatous disease (CGD) is also illustrated.
Frndak, Seth E; Smerbeck, Audrey M; Irwin, Lauren N; Drake, Allison S; Kordovski, Victoria M; Kunker, Katrina A; Khan, Anjum L; Benedict, Ralph H B
2016-10-01
We endeavored to clarify how distinct co-occurring symptoms relate to the presence of negative work events in employed multiple sclerosis (MS) patients. Latent profile analysis (LPA) was utilized to elucidate common disability patterns by isolating patient subpopulations. Samples of 272 employed MS patients and 209 healthy controls (HC) were administered neuroperformance tests of ambulation, hand dexterity, processing speed, and memory. Regression-based norms were created from the HC sample. LPA identified latent profiles using the regression-based z-scores. Finally, multinomial logistic regression tested for negative work event differences among the latent profiles. Four profiles were identified via LPA: a common profile (55%) characterized by slightly below average performance in all domains, a broadly low-performing profile (18%), a poor motor abilities profile with average cognition (17%), and a generally high-functioning profile (9%). Multinomial regression analysis revealed that the uniformly low-performing profile demonstrated a higher likelihood of reported negative work events. Employed MS patients with co-occurring motor, memory and processing speed impairments were most likely to report a negative work event, classifying them as uniquely at risk for job loss.
Bavarian, Niloofar; Duncan, Robert; Lewis, Kendra M.; Miao, Alicia; Washburn, Isaac J.
2014-01-01
Background We examined whether adolescents receiving a universal, school-based, drug-prevention program in grade 7 varied, by student profile, in substance use behaviors post-program implementation. Profiles were a function of recall of program receipt and substance use at baseline. Methods We analyzed data from the Adolescent Substance Abuse Prevention Study, a large, geographically diverse, longitudinal school-based cluster-randomized controlled trial of the Take Charge of Your Life drug-prevention program. Profiles were created using self-reported substance use (pre-intervention) and program recall (post-intervention) at Grade 7. We first examined characteristics of each of the four profiles of treatment students who varied by program recall and baseline substance use. Using multilevel logistic regression analyses, we examined differences in the odds of substance use (alcohol, tobacco, and marijuana) among student profiles at the six additional study waves (Time 2 (Grade 7) through Time 7 (Grade 11)). Results Pearson’s chi-square tests showed sample characteristics varied by student profile. Multilevel logistic regression results were consistent across all examined substance use behaviors at all time points. Namely, as compared to students who had no baseline substance use and had program recall (No Use, Recall), each of the remaining three profiles (No Use, No Recall; Use, Recall; Use, No Recall) were more likely to engage in substance use. Post-hoc analyses showed that for the two sub-profiles of baseline substance users, there were only two observed, and inconsistent, differences in the odds of subsequent substance use by recall status. Conclusions Findings suggest that for students who were not baseline substance users, program recall significantly decreased the likelihood of subsequent substance use. For students who were baseline substance users, program recall did not generally influence subsequent substance use. Implications for school-based drug prevention programs are discussed. PMID:25148566
Planning, Execution, and Assessment of Effects-Based Operations (EBO)
2006-05-01
…time of execution that would maximize the likelihood of achieving a desired effect. GMU (George Mason University) has developed a methodology named ECAD-EA (Effective Course of Action-Evolutionary Algorithm) for the planning, execution, and assessment of effects-based operations (EBO).
Evaluation of weighted regression and sample size in developing a taper model for loblolly pine
Kenneth L. Cormier; Robin M. Reich; Raymond L. Czaplewski; William A. Bechtold
1992-01-01
A stem profile model, fit using pseudo-likelihood weighted regression, was used to estimate merchantable volume of loblolly pine (Pinus taeda L.) in the southeast. The weighted regression increased model fit marginally, but did not substantially increase model performance. In all cases, the unweighted regression models performed as well as the...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shaffer, Richard, E-mail: rickyshaffer@yahoo.co.u; Department of Clinical Oncology, Imperial College London National Health Service Trust, London; Pickles, Tom
Purpose: Prior studies have derived low values of the alpha-beta ratio (α/β) for prostate cancer of approximately 1-2 Gy. These studies used poorly matched groups, differing definitions of biochemical failure, and insufficient follow-up. Methods and Materials: National Comprehensive Cancer Network low- or low-intermediate risk prostate cancer patients, treated with external beam radiotherapy or permanent prostate brachytherapy, were matched for prostate-specific antigen, Gleason score, T-stage, percentage of positive cores, androgen deprivation therapy, and era, yielding 118 patient pairs. The Phoenix definition of biochemical failure was used. The best-fitting value for α/β was found for up to 90-month follow-up using maximum likelihood analysis, and the 95% confidence interval using the profile likelihood method. Linear quadratic formalism was applied with the radiobiological parameters of relative biological effectiveness = 1.0, potential doubling time = 45 days, and repair half-time = 1 hour. Bootstrap analysis was performed to estimate uncertainties in outcomes, and hence in α/β. Sensitivity analysis was performed by varying the values of the radiobiological parameters to extreme values. Results: The value of α/β best fitting the outcomes data was >30 Gy, with lower 95% confidence limit of 5.2 Gy. This was confirmed on bootstrap analysis. Varying parameters to extreme values still yielded a best-fit α/β of >30 Gy, although the lower 95% confidence interval limit was reduced to 0.6 Gy. Conclusions: Using carefully matched groups, long follow-up, the Phoenix definition of biochemical failure, and well-established statistical methods, the best estimate of α/β for low and low-tier intermediate-risk prostate cancer is likely to be higher than that of normal tissues, although a low value cannot be excluded.
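For reference, a standard form of the linear-quadratic isoeffect relation underlying such α/β estimation is shown below (textbook notation, not quoted from the paper): two fractionation schedules are predicted to be isoeffective when their biologically effective doses (BED) are equal, and the α/β value that best reconciles the observed outcomes across schedules is then found by maximum likelihood.

\[
\mathrm{BED} = n\,d\left(1 + \frac{d}{\alpha/\beta}\right),
\]

where \(n\) is the number of fractions and \(d\) the dose per fraction.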
Exclusion probabilities and likelihood ratios with applications to mixtures.
Slooten, Klaas-Jan; Egeland, Thore
2016-01-01
The statistical evidence obtained from mixed DNA profiles can be summarised in several ways in forensic casework including the likelihood ratio (LR) and the Random Man Not Excluded (RMNE) probability. The literature has seen a discussion of the advantages and disadvantages of likelihood ratios and exclusion probabilities, and part of our aim is to bring some clarification to this debate. In a previous paper, we proved that there is a general mathematical relationship between these statistics: RMNE can be expressed as a certain average of the LR, implying that the expected value of the LR, when applied to an actual contributor to the mixture, is at least equal to the inverse of the RMNE. While the mentioned paper presented applications for kinship problems, the current paper demonstrates the relevance for mixture cases, and for this purpose, we prove some new general properties. We also demonstrate how to use the distribution of the likelihood ratio for donors of a mixture, to obtain estimates for exceedance probabilities of the LR for non-donors, of which the RMNE is a special case corresponding to LR > 0. In order to derive these results, we need to view the likelihood ratio as a random variable. In this paper, we describe how such a randomization can be achieved. The RMNE is usually invoked only for mixtures without dropout. In mixtures, artefacts like dropout and drop-in are commonly encountered and we address this situation too, illustrating our results with a basic but widely implemented model, a so-called binary model. The precise definitions, modelling and interpretation of the required concepts of dropout and drop-in are not entirely obvious, and we attempt to clarify them here in a general likelihood framework for a binary model.
Likelihood Methods for Adaptive Filtering and Smoothing. Technical Report #455.
ERIC Educational Resources Information Center
Butler, Ronald W.
The dynamic linear model or Kalman filtering model provides a useful methodology for predicting the past, present, and future states of a dynamic system, such as an object in motion or an economic or social indicator that is changing systematically with time. Recursive likelihood methods for adaptive Kalman filtering and smoothing are developed.…
ERIC Educational Resources Information Center
Han, Kyung T.; Guo, Fanmin
2014-01-01
The full-information maximum likelihood (FIML) method makes it possible to estimate and analyze structural equation models (SEM) even when data are partially missing, enabling incomplete data to contribute to model estimation. The cornerstone of FIML is the missing-at-random (MAR) assumption. In (unidimensional) computerized adaptive testing…
Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.
2016-06-30
Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.
Hey, Jody; Nielsen, Rasmus
2007-01-01
In 1988, Felsenstein described a framework for assessing the likelihood of a genetic data set in which all of the possible genealogical histories of the data are considered, each in proportion to their probability. Although not analytically solvable, several approaches, including Markov chain Monte Carlo methods, have been developed to find approximate solutions. Here, we describe an approach in which Markov chain Monte Carlo simulations are used to integrate over the space of genealogies, whereas other parameters are integrated out analytically. The result is an approximation to the full joint posterior density of the model parameters. For many purposes, this function can be treated as a likelihood, thereby permitting likelihood-based analyses, including likelihood ratio tests of nested models. Several examples, including an application to the divergence of chimpanzee subspecies, are provided. PMID:17301231
Challenges in Species Tree Estimation Under the Multispecies Coalescent Model
Xu, Bo; Yang, Ziheng
2016-01-01
The multispecies coalescent (MSC) model has emerged as a powerful framework for inferring species phylogenies while accounting for ancestral polymorphism and gene tree-species tree conflict. A number of methods have been developed in the past few years to estimate the species tree under the MSC. The full likelihood methods (including maximum likelihood and Bayesian inference) average over the unknown gene trees and accommodate their uncertainties properly but involve intensive computation. The approximate or summary coalescent methods are computationally fast and are applicable to genomic datasets with thousands of loci, but do not make an efficient use of information in the multilocus data. Most of them take the two-step approach of reconstructing the gene trees for multiple loci by phylogenetic methods and then treating the estimated gene trees as observed data, without accounting for their uncertainties appropriately. In this article we review the statistical nature of the species tree estimation problem under the MSC, and explore the conceptual issues and challenges of species tree estimation by focusing mainly on simple cases of three or four closely related species. We use mathematical analysis and computer simulation to demonstrate that large differences in statistical performance may exist between the two classes of methods. We illustrate that several counterintuitive behaviors may occur with the summary methods but they are due to inefficient use of information in the data by summary methods and vanish when the data are analyzed using full-likelihood methods. These include (i) unidentifiability of parameters in the model, (ii) inconsistency in the so-called anomaly zone, (iii) singularity on the likelihood surface, and (iv) deterioration of performance upon addition of more data. We discuss the challenges and strategies of species tree inference for distantly related species when the molecular clock is violated, and highlight the need for improving the computational efficiency and model realism of the likelihood methods as well as the statistical efficiency of the summary methods. PMID:27927902
Dong, Yi; Mihalas, Stefan; Russell, Alexander; Etienne-Cummings, Ralph; Niebur, Ernst
2012-01-01
When a neuronal spike train is observed, what can we say about the properties of the neuron that generated it? A natural way to answer this question is to make an assumption about the type of neuron, select an appropriate model for this type, and then to choose the model parameters as those that are most likely to generate the observed spike train. This is the maximum likelihood method. If the neuron obeys simple integrate and fire dynamics, Paninski, Pillow, and Simoncelli (2004) showed that its negative log-likelihood function is convex and that its unique global minimum can thus be found by gradient descent techniques. The global minimum property requires independence of spike time intervals. Lack of history dependence is, however, an important constraint that is not fulfilled in many biological neurons which are known to generate a rich repertoire of spiking behaviors that are incompatible with history independence. Therefore, we expanded the integrate and fire model by including one additional variable, a variable threshold (Mihalas & Niebur, 2009) allowing for history-dependent firing patterns. This neuronal model produces a large number of spiking behaviors while still being linear. Linearity is important as it maintains the distribution of the random variables and still allows for maximum likelihood methods to be used. In this study we show that, although convexity of the negative log-likelihood is not guaranteed for this model, the minimum of the negative log-likelihood function yields a good estimate for the model parameters, in particular if the noise level is treated as a free parameter. Furthermore, we show that a nonlinear function minimization method (r-algorithm with space dilation) frequently reaches the global minimum. PMID:21851282
A likelihood ratio test for evolutionary rate shifts and functional divergence among proteins
Knudsen, Bjarne; Miyamoto, Michael M.
2001-01-01
Changes in protein function can lead to changes in the selection acting on specific residues. This can often be detected as evolutionary rate changes at the sites in question. A maximum-likelihood method for detecting evolutionary rate shifts at specific protein positions is presented. The method determines significance values of the rate differences to give a sound statistical foundation for the conclusions drawn from the analyses. A statistical test for detecting slowly evolving sites is also described. The methods are applied to a set of Myc proteins for the identification of both conserved sites and those with changing evolutionary rates. Those positions with conserved and changing rates are related to the structures and functions of their proteins. The results are compared with an earlier Bayesian method, thereby highlighting the advantages of the new likelihood ratio tests. PMID:11734650
Matthews, Allison; Sutherland, Rachel; Peacock, Amy; Van Buskirk, Joe; Whittaker, Elizabeth; Burns, Lucinda; Bruno, Raimondo
2017-02-01
Over the past decade, monitoring systems have identified the rapid emergence of new psychoactive substances (NPS). While the use of many NPS is minimal and transitory, little is known about which products have potential for capturing the attention of significant proportions of the drug consuming market. The aim of this study was to explore self-reported experiences of three commonly used NPS classes within the Australian context (synthetic cathinones, hallucinogenic phenethylamines and hallucinogenic tryptamines) relative to traditional illicit drug counterparts. Frequent psychostimulant consumers interviewed for the Australian Ecstasy and related Drugs Reporting System (EDRS) (n=1208) provided subjective ratings of the pleasurable and negative (acute and longer-term) effects of substances used in the last six months on the last occasion of use, and the likelihood of future use. Stimulant-type NPS (e.g., mephedrone, methylone) were rated less favourably than ecstasy and cocaine in terms of pleasurable effects and likelihood of future use. DMT (a hallucinogenic tryptamine) showed a similar profile to LSD in terms of pleasurable effects and the likelihood of future use, but negative effects (acute and comedown) were rated lower. Hallucinogenic phenethylamines (e.g., 2C-B) showed a similar negative profile to LSD, but were rated as less pleasurable and less likely to be used again. The potential for expanded use of stimulant-type NPS may be lower compared to commonly used stimulants such as ecstasy and cocaine. In contrast, the potential of DMT may be higher relative to LSD given the comparative absence of negative effects. Copyright © 2016 Elsevier B.V. All rights reserved.
Patel, Manish M; Janssen, Alan P; Tardif, Richard R; Herring, Mark; Parashar, Umesh D
2007-10-18
In 2006, a new rotavirus vaccine (RotaTeq) was licensed in the US and recommended for routine immunization of all US infants. Because a previously licensed vaccine (Rotashield) was withdrawn from the US for safety concerns, identifying barriers to uptake of RotaTeq will help develop strategies to broaden vaccine coverage. We explored beliefs and attitudes of parents (n = 57) and providers (n = 10) towards rotavirus disease and vaccines through a qualitative assessment using focus groups and in-depth interviews. All physicians were familiar with safety concerns about rotavirus vaccines, but felt reassured by RotaTeq's safety profile. When asked about likelihood of using RotaTeq on a scale of one to seven (1 = "absolutely not;" 7 = "absolutely yes") the mean score was 5 (range = 3-6). Physicians expressed a high likelihood of adopting RotaTeq, particularly if recommended by their professional organizations and expressed specific interest in post-marketing safety data. Similarly, consumers found the RotaTeq safety profile to be favorable and would rely on their physician's recommendation for vaccination. However, when asked to rank likelihood of having their child vaccinated against rotavirus (1 = "definitely not get;" 7 = "definitely get"), 29% ranked 1 or 2, 36% 3 or 4, and 35% 5 to 7. Our qualitative assessment provides complementary data to recent quantitative surveys and suggests that physicians and parents are likely to adopt the newly licensed rotavirus vaccine. Increasing parental awareness of the rotavirus disease burden and providing physicians with timely post-marketing surveillance data will be integral to a successful vaccination program.
Estimating Model Probabilities using Thermodynamic Markov Chain Monte Carlo Methods
NASA Astrophysics Data System (ADS)
Ye, M.; Liu, P.; Beerli, P.; Lu, D.; Hill, M. C.
2014-12-01
Markov chain Monte Carlo (MCMC) methods are widely used to evaluate model probability for quantifying model uncertainty. In a general procedure, MCMC simulations are first conducted for each individual model, and the MCMC parameter samples are then used to approximate the marginal likelihood of the model by calculating the geometric mean of the joint likelihood of the model and its parameters. It has been found that this geometric-mean method suffers from the numerical problem of a low convergence rate. A simple test case shows that even millions of MCMC samples are insufficient to yield an accurate estimate of the marginal likelihood. To resolve this problem, a thermodynamic method is used in which multiple MCMC runs are performed with different values of a heating coefficient between zero and one. When the heating coefficient is zero, the MCMC run is equivalent to a random walk MC in the prior parameter space; when the heating coefficient is one, the MCMC run is the conventional one. For a simple case with an analytical form of the marginal likelihood, the thermodynamic method yields a more accurate estimate than the geometric-mean method. This is also demonstrated for a case of groundwater modeling in which four alternative models are postulated based on different conceptualizations of a confining layer. This groundwater example shows that model probabilities estimated using the thermodynamic method are more reasonable than those obtained using the geometric method. The thermodynamic method is general and can be used for a wide range of environmental problems for model uncertainty quantification.
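A minimal sketch of thermodynamic integration on a toy conjugate-normal model, where the power posterior at each heating coefficient can be sampled directly and the marginal likelihood is available analytically for comparison; the prior, data, temperature ladder, and sample sizes are illustrative assumptions, and this is not the authors' groundwater application.

import numpy as np
from scipy.stats import multivariate_normal, norm

rng = np.random.default_rng(1)
sigma, mu0, tau0 = 1.0, 0.0, 2.0                 # known noise sd, normal prior on theta
y = rng.normal(1.5, sigma, size=20)
n = y.size

def log_lik(theta):
    # log p(y | theta) evaluated for a vector of theta draws
    return norm.logpdf(y[:, None], loc=theta, scale=sigma).sum(axis=0)

betas = np.linspace(0.0, 1.0, 21)                # heating coefficients from 0 (prior) to 1 (posterior)
means = []
for b in betas:
    # power posterior p(theta | y, b) is normal for this conjugate model,
    # so it can be sampled directly instead of running MCMC at each temperature
    prec = b * n / sigma**2 + 1.0 / tau0**2
    mean = (b * y.sum() / sigma**2 + mu0 / tau0**2) / prec
    theta = rng.normal(mean, 1.0 / np.sqrt(prec), size=5000)
    means.append(log_lik(theta).mean())          # E_beta[log p(y | theta)]

log_ml_ti = 0.0                                  # trapezoid rule over the temperature ladder
for i in range(len(betas) - 1):
    log_ml_ti += 0.5 * (means[i] + means[i + 1]) * (betas[i + 1] - betas[i])

cov = sigma**2 * np.eye(n) + tau0**2 * np.ones((n, n))
log_ml_exact = multivariate_normal.logpdf(y, mean=np.full(n, mu0), cov=cov)
print(log_ml_ti, log_ml_exact)                   # the two values should be close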
Risk Indicators for Periodontitis in US Adults: NHANES 2009 to 2012.
Eke, Paul I; Wei, Liang; Thornton-Evans, Gina O; Borrell, Luisa N; Borgnakke, Wenche S; Dye, Bruce; Genco, Robert J
2016-10-01
Through the use of optimal surveillance measures and standard case definitions, it is now possible to more accurately determine population-average risk profiles for severe (SP) and non-severe periodontitis (NSP) in adults (aged 30 years and older) in the United States. Data from the 2009 to 2012 National Health and Nutrition Examination Survey were used, which, for the first time, used the "gold standard" full-mouth periodontitis surveillance protocol to classify severity of periodontitis following suggested Centers for Disease Control/American Academy of Periodontology case definitions. Probabilities of periodontitis by: 1) sociodemographics, 2) behavioral factors, and 3) comorbid conditions were assessed using prevalence ratios (PRs) estimated by predicted marginal probability from multivariable generalized logistic regression models. Analyses were further stratified by sex for each classification of periodontitis. Likelihood of total periodontitis (TP) increased with age for overall and NSP relative to non-periodontitis. Compared with non-Hispanic whites, TP was more likely in Hispanics (adjusted [a]PR = 1.38; 95% confidence interval 95% CI: 1.26 to 1.52) and non-Hispanic blacks (aPR = 1.35; 95% CI: 1.22 to 1.50), whereas SP was most likely in non-Hispanic blacks (aPR = 1.82; 95% CI: 1.44 to 2.31). There was at least a 50% greater likelihood of TP in current smokers compared with non-smokers. In males, likelihood of TP in adults aged 65 years and older was greater (aPR = 2.07; 95% CI: 1.76 to 2.43) than adults aged 30 to 44 years. This probability was even greater in women (aPR = 3.15; 95% CI: 2.63 to 3.77). Likelihood of TP was higher in current smokers relative to non-smokers regardless of sex and periodontitis classification. TP was more likely in men with uncontrolled diabetes mellitus (DM) compared with adults without DM. Assessment of risk profiles for periodontitis in adults in the United States based on gold standard periodontal measures show important differences by severity of disease and sex. Cigarette smoking, specifically current smoking, remains an important modifiable risk for all levels of periodontitis severity. Higher likelihood of TP in older adults and in males with uncontrolled DM is noteworthy. These findings could improve identification of target populations for effective public health interventions to improve periodontal health of adults in the United States.
Inferring the parameters of a Markov process from snapshots of the steady state
NASA Astrophysics Data System (ADS)
Dettmer, Simon L.; Berg, Johannes
2018-02-01
We seek to infer the parameters of an ergodic Markov process from samples taken independently from the steady state. Our focus is on non-equilibrium processes, where the steady state is not described by the Boltzmann measure, but is generally unknown and hard to compute, which prevents the application of established equilibrium inference methods. We propose a quantity we call propagator likelihood, which takes on the role of the likelihood in equilibrium processes. This propagator likelihood is based on fictitious transitions between those configurations of the system which occur in the samples. The propagator likelihood can be derived by minimising the relative entropy between the empirical distribution and a distribution generated by propagating the empirical distribution forward in time. Maximising the propagator likelihood leads to an efficient reconstruction of the parameters of the underlying model in different systems, both with discrete configurations and with continuous configurations. We apply the method to non-equilibrium models from statistical physics and theoretical biology, including the asymmetric simple exclusion process (ASEP), the kinetic Ising model, and replicator dynamics.
Statistical inference for time course RNA-Seq data using a negative binomial mixed-effect model.
Sun, Xiaoxiao; Dalpiaz, David; Wu, Di; S Liu, Jun; Zhong, Wenxuan; Ma, Ping
2016-08-26
Accurate identification of differentially expressed (DE) genes in time course RNA-Seq data is crucial for understanding the dynamics of transcriptional regulatory network. However, most of the available methods treat gene expressions at different time points as replicates and test the significance of the mean expression difference between treatments or conditions irrespective of time. They thus fail to identify many DE genes with different profiles across time. In this article, we propose a negative binomial mixed-effect model (NBMM) to identify DE genes in time course RNA-Seq data. In the NBMM, mean gene expression is characterized by a fixed effect, and time dependency is described by random effects. The NBMM is very flexible and can be fitted to both unreplicated and replicated time course RNA-Seq data via a penalized likelihood method. By comparing gene expression profiles over time, we further classify the DE genes into two subtypes to enhance the understanding of expression dynamics. A significance test for detecting DE genes is derived using a Kullback-Leibler distance ratio. Additionally, a significance test for gene sets is developed using a gene set score. Simulation analysis shows that the NBMM outperforms currently available methods for detecting DE genes and gene sets. Moreover, our real data analysis of fruit fly developmental time course RNA-Seq data demonstrates the NBMM identifies biologically relevant genes which are well justified by gene ontology analysis. The proposed method is powerful and efficient to detect biologically relevant DE genes and gene sets in time course RNA-Seq data.
Williamson, Ross S.; Sahani, Maneesh; Pillow, Jonathan W.
2015-01-01
Stimulus dimensionality-reduction methods in neuroscience seek to identify a low-dimensional space of stimulus features that affect a neuron’s probability of spiking. One popular method, known as maximally informative dimensions (MID), uses an information-theoretic quantity known as “single-spike information” to identify this space. Here we examine MID from a model-based perspective. We show that MID is a maximum-likelihood estimator for the parameters of a linear-nonlinear-Poisson (LNP) model, and that the empirical single-spike information corresponds to the normalized log-likelihood under a Poisson model. This equivalence implies that MID does not necessarily find maximally informative stimulus dimensions when spiking is not well described as Poisson. We provide several examples to illustrate this shortcoming, and derive a lower bound on the information lost when spiking is Bernoulli in discrete time bins. To overcome this limitation, we introduce model-based dimensionality reduction methods for neurons with non-Poisson firing statistics, and show that they can be framed equivalently in likelihood-based or information-theoretic terms. Finally, we show how to overcome practical limitations on the number of stimulus dimensions that MID can estimate by constraining the form of the non-parametric nonlinearity in an LNP model. We illustrate these methods with simulations and data from primate visual cortex. PMID:25831448
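In this model-based view, the spike count y_t in a small time bin Δ is Poisson with a rate given by a nonlinear function f of the stimulus projection onto the filter matrix K, so the log-likelihood that MID implicitly maximizes can be written in the familiar Poisson form (notation assumed here, not taken verbatim from the paper):

\[
\log L(\mathbf{K}, f) = \sum_{t}\left[\, y_t \log\!\big(f(\mathbf{K}^{\top}\mathbf{x}_t)\,\Delta\big) - f(\mathbf{K}^{\top}\mathbf{x}_t)\,\Delta \,\right] + \text{const}.
\]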
Laser beam complex amplitude measurement by phase diversity.
Védrenne, Nicolas; Mugnier, Laurent M; Michau, Vincent; Velluet, Marie-Thérèse; Bierent, Rudolph
2014-02-24
The control of the optical quality of a laser beam requires a complex amplitude measurement able to deal with strong modulus variations and potentially highly perturbed wavefronts. The method proposed here consists in an extension of phase diversity to complex amplitude measurements that is effective for highly perturbed beams. Named camelot for Complex Amplitude MEasurement by a Likelihood Optimization Tool, it relies on the acquisition and processing of few images of the beam section taken along the optical path. The complex amplitude of the beam is retrieved from the images by the minimization of a Maximum a Posteriori error metric between the images and a model of the beam propagation. The analytical formalism of the method and its experimental validation are presented. The modulus of the beam is compared to a measurement of the beam profile, the phase of the beam is compared to a conventional phase diversity estimate. The precision of the experimental measurements is investigated by numerical simulations.
Likelihoods for fixed rank nomination networks
HOFF, PETER; FOSDICK, BAILEY; VOLFOVSKY, ALEX; STOVEL, KATHERINE
2014-01-01
Many studies that gather social network data use survey methods that lead to censored, missing, or otherwise incomplete information. For example, the popular fixed rank nomination (FRN) scheme, often used in studies of schools and businesses, asks study participants to nominate and rank at most a small number of contacts or friends, leaving the existence of other relations uncertain. However, most statistical models are formulated in terms of completely observed binary networks. Statistical analyses of FRN data with such models ignore the censored and ranked nature of the data and could potentially result in misleading statistical inference. To investigate this possibility, we compare Bayesian parameter estimates obtained from a likelihood for complete binary networks with those obtained from likelihoods that are derived from the FRN scheme, and therefore accommodate the ranked and censored nature of the data. We show analytically and via simulation that the binary likelihood can provide misleading inference, particularly for certain model parameters that relate network ties to characteristics of individuals and pairs of individuals. We also compare these different likelihoods in a data analysis of several adolescent social networks. For some of these networks, the parameter estimates from the binary and FRN likelihoods lead to different conclusions, indicating the importance of analyzing FRN data with a method that accounts for the FRN survey design. PMID:25110586
The age profile of the location decision of Australian general practitioners.
Mu, Chunzhou
2015-10-01
The unbalanced distribution of general practitioners (GPs) across geographic areas has been acknowledged as a problem in many countries around the world. Quantitative information regarding GPs' location decision over their lifecycle is essential in developing effective initiatives to address the unbalanced distribution and retention of GPs. This paper describes the age profile of GPs' location decision and relates it to individual characteristics. I use the Medicine in Australia: Balancing Employment and Life (MABEL) survey of doctors (2008-2012) with a sample size of 5810 male and 5797 female GPs. I employ a mixed logit model to estimate GPs' location decision. The results suggest that younger GPs are more prepared to go to rural and remote areas but they tend to migrate back to urban areas as they age. Coming from a rural background increases the likelihood of choosing rural areas, but with heterogeneity: While male GPs from a rural background tend to stay in rural and remote areas regardless of age, female GPs from a rural background are willing to migrate to urban areas as they age. GPs who obtain basic medical degrees overseas are likely to move back to urban areas in the later stage of their careers. Completing a basic medical degree at an older age increases the likelihood of working outside major cities. I also examine factors influencing GPs' location transition patterns and the results further confirm the association of individual characteristics and GPs' location-age profile. The findings can help target GPs who are most likely to practise and remain in rural and remote areas, and tailor policy initiatives to address the undesirable distribution and movement of GPs according to the identified heterogeneous age profile of their location decisions. Copyright © 2015 Elsevier Ltd. All rights reserved.
Finite mixture model: A maximum likelihood estimation approach on time series data
NASA Astrophysics Data System (ADS)
Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad
2014-09-01
Recently, statisticians have emphasized fitting finite mixture models by maximum likelihood estimation because of its desirable asymptotic properties: the estimator is consistent as the sample size increases to infinity and is asymptotically unbiased. Moreover, the parameter estimates obtained by maximum likelihood have the smallest variance among standard statistical methods as the sample size increases. Maximum likelihood estimation is therefore adopted in this paper to fit a two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, the Philippines and Indonesia. The results show a negative relationship between rubber price and exchange rate for all selected countries.
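A minimal sketch of maximum likelihood fitting of a two-component normal mixture via the EM algorithm on synthetic data; the data, starting values, and iteration count are illustrative assumptions, and this is not the paper's rubber-price model.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-1, 0.5, 300), rng.normal(2, 1.0, 700)])

w, mu, sd = np.array([0.5, 0.5]), np.array([-2.0, 3.0]), np.array([1.0, 1.0])
for _ in range(200):
    # E-step: posterior responsibility of each component for each observation
    dens = w * norm.pdf(x[:, None], mu, sd)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: weighted maximum likelihood updates of weights, means, and sds
    nk = resp.sum(axis=0)
    w, mu = nk / x.size, (resp * x[:, None]).sum(axis=0) / nk
    sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

log_lik = np.log((w * norm.pdf(x[:, None], mu, sd)).sum(axis=1)).sum()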
Maximum-Likelihood Methods for Processing Signals From Gamma-Ray Detectors
Barrett, Harrison H.; Hunter, William C. J.; Miller, Brian William; Moore, Stephen K.; Chen, Yichun; Furenlid, Lars R.
2009-01-01
In any gamma-ray detector, each event produces electrical signals on one or more circuit elements. From these signals, we may wish to determine the presence of an interaction; whether multiple interactions occurred; the spatial coordinates in two or three dimensions of at least the primary interaction; or the total energy deposited in that interaction. We may also want to compute listmode probabilities for tomographic reconstruction. Maximum-likelihood methods provide a rigorous and in some senses optimal approach to extracting this information, and the associated Fisher information matrix provides a way of quantifying and optimizing the information conveyed by the detector. This paper will review the principles of likelihood methods as applied to gamma-ray detectors and illustrate their power with recent results from the Center for Gamma-ray Imaging. PMID:20107527
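In symbols (standard definitions rather than a quotation from the paper), the ML estimate of the interaction parameters θ, such as position and deposited energy, obtained from the vector of measured signals g, and the Fisher information matrix that quantifies the information conveyed by the detector, are:

\[
\hat{\boldsymbol{\theta}} = \arg\max_{\boldsymbol{\theta}} \, \ln p(\mathbf{g}\mid\boldsymbol{\theta}),
\qquad
F_{jk}(\boldsymbol{\theta}) = \mathrm{E}\!\left[\frac{\partial \ln p(\mathbf{g}\mid\boldsymbol{\theta})}{\partial \theta_j}\,
\frac{\partial \ln p(\mathbf{g}\mid\boldsymbol{\theta})}{\partial \theta_k}\right].
\]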
BAO from Angular Clustering: Optimization and Mitigation of Theoretical Systematics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crocce, M.; et al.
We study the theoretical systematics and optimize the methodology in Baryon Acoustic Oscillations (BAO) detections using the angular correlation function with tomographic bins. We calibrate and optimize the pipeline for the Dark Energy Survey Year 1 dataset using 1800 mocks. We compare the BAO fitting results obtained with three estimators: the Maximum Likelihood Estimator (MLE), Profile Likelihood, and Markov Chain Monte Carlo. The MLE method yields the least bias in the fit results (bias/spread $\sim 0.02$) and the error bar derived is the closest to the Gaussian results (1% from 68% Gaussian expectation). When there is a mismatch between the template and the data, due either to incorrect fiducial cosmology or to photo-$z$ error, the MLE again gives the least-biased results. The BAO angular shift that is estimated based on the sound horizon and the angular diameter distance agrees with the numerical fit. Various analysis choices are further tested: the number of redshift bins, cross-correlations, and angular binning. We propose two methods to correct the mock covariance when the final sample properties are slightly different from those used to create the mock. We show that the sample changes can be accommodated with the help of the Gaussian covariance matrix or more effectively using the eigenmode expansion of the mock covariance. The eigenmode expansion is significantly less susceptible to statistical fluctuations relative to the direct measurements of the covariance matrix because the number of free parameters is substantially reduced [$p$ parameters versus $p(p+1)/2$ from direct measurement].
Lee, Soohyun; Seo, Chae Hwa; Alver, Burak Han; Lee, Sanghyuk; Park, Peter J
2015-09-03
RNA-seq has been widely used for genome-wide expression profiling. RNA-seq data typically consists of tens of millions of short sequenced reads from different transcripts. However, due to sequence similarity among genes and among isoforms, the source of a given read is often ambiguous. Existing approaches for estimating expression levels from RNA-seq reads tend to compromise between accuracy and computational cost. We introduce a new approach for quantifying transcript abundance from RNA-seq data. EMSAR (Estimation by Mappability-based Segmentation And Reclustering) groups reads according to the set of transcripts to which they are mapped and finds maximum likelihood estimates using a joint Poisson model for each optimal set of segments of transcripts. The method uses nearly all mapped reads, including those mapped to multiple genes. With an efficient transcriptome indexing based on modified suffix arrays, EMSAR minimizes the use of CPU time and memory while achieving accuracy comparable to the best existing methods. EMSAR is a method for quantifying transcripts from RNA-seq data with high accuracy and low computational cost. EMSAR is available at https://github.com/parklab/emsar.
A segmentation/clustering model for the analysis of array CGH data.
Picard, F; Robin, S; Lebarbier, E; Daudin, J-J
2007-09-01
Microarray-CGH (comparative genomic hybridization) experiments are used to detect and map chromosomal imbalances. A CGH profile can be viewed as a succession of segments that represent homogeneous regions in the genome whose representative sequences share the same relative copy number on average. Segmentation methods constitute a natural framework for the analysis, but they do not provide a biological status for the detected segments. We propose a new model for this segmentation/clustering problem, combining a segmentation model with a mixture model. We present a new hybrid algorithm called dynamic programming-expectation maximization (DP-EM) to estimate the parameters of the model by maximum likelihood. This algorithm combines DP and the EM algorithm. We also propose a model selection heuristic to select the number of clusters and the number of segments. An example of our procedure is presented, based on publicly available data sets. We compare our method to segmentation methods and to hidden Markov models, and we show that the new segmentation/clustering model is a promising alternative that can be applied in the more general context of signal processing.
Evaluation of the ViSiGi™ Calibration System
2013-12-10
Enhance delineation of the stomach anatomy and the surgeon's appreciation of the extent of gastric volume to be removed; increase the safety profile of the patient (i.e., reduce the likelihood of accidental stapling of the orogastric tube or bougie); reduce the incidence of OR contamination/infection transmission; streamline OR workflow, resulting in reduced OR time; and ensure consistent and reproducible staple lines.
Choosing face: The curse of self in profile image selection.
White, David; Sutherland, Clare A M; Burton, Amy L
2017-01-01
People draw automatic social inferences from photos of unfamiliar faces and these first impressions are associated with important real-world outcomes. Here we examine the effect of selecting online profile images on first impressions. We model the process of profile image selection by asking participants to indicate the likelihood that images of their own face ("self-selection") and of an unfamiliar face ("other-selection") would be used as profile images on key social networking sites. Across two large Internet-based studies (n = 610), in line with predictions, image selections accentuated favorable social impressions and these impressions were aligned to the social context of the networking sites. However, contrary to predictions based on people's general expertise in self-presentation, other-selected images conferred more favorable impressions than self-selected images. We conclude that people make suboptimal choices when selecting their own profile pictures, such that self-perception places important limits on facial first impressions formed by others. These results underscore the dynamic nature of person perception in real-world contexts.
Cosmological parameter estimation using Particle Swarm Optimization
NASA Astrophysics Data System (ADS)
Prasad, J.; Souradeep, T.
2014-03-01
Constraining the parameters of a theoretical model from observational data is an important exercise in cosmology. There are many theoretically motivated models which demand a greater number of cosmological parameters than the standard model of cosmology uses, and these make the problem of parameter estimation challenging. It is common practice to employ the Bayesian formalism for parameter estimation, in which, in general, the likelihood surface is probed. For the standard cosmological model with six parameters, the likelihood surface is quite smooth and does not have local maxima, and sampling-based methods like the Markov Chain Monte Carlo (MCMC) method are quite successful. However, when there are a large number of parameters or the likelihood surface is not smooth, other methods may be more effective. In this paper, we demonstrate the application of another method, inspired by artificial intelligence and called Particle Swarm Optimization (PSO), for estimating cosmological parameters from Cosmic Microwave Background (CMB) data taken from the WMAP satellite.
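A minimal sketch of a particle swarm optimizer maximizing a toy two-parameter log-likelihood; the swarm size, inertia, and acceleration constants are illustrative assumptions, and this is not the authors' CMB pipeline.

import numpy as np

rng = np.random.default_rng(2)

def log_like(p):
    # toy smooth log-likelihood surface with a single maximum at (0.3, 0.7)
    return -0.5 * ((p[:, 0] - 0.3) ** 2 / 0.01 + (p[:, 1] - 0.7) ** 2 / 0.04)

n_part, n_iter, w, c1, c2 = 30, 200, 0.7, 1.5, 1.5
pos = rng.uniform(0, 1, size=(n_part, 2))        # particle positions in parameter space
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), log_like(pos)     # personal bests
gbest = pbest[np.argmax(pbest_val)].copy()       # global best

for _ in range(n_iter):
    r1, r2 = rng.uniform(size=(2, n_part, 1))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    val = log_like(pos)
    better = val > pbest_val
    pbest[better], pbest_val[better] = pos[better], val[better]
    gbest = pbest[np.argmax(pbest_val)].copy()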
The influence of women’s fear, attitudes and beliefs of childbirth on mode and experience of birth
2012-01-01
Background Women’s fears and attitudes to childbirth may influence the maternity care they receive and the outcomes of birth. This study aimed to develop profiles of women according to their attitudes regarding birth and their levels of childbirth related fear. The association of these profiles with mode and outcomes of birth was explored. Methods Prospective longitudinal cohort design with self report questionnaires containing a set of attitudinal statements regarding birth (Birth Attitudes Profile Scale) and a fear of birth scale (FOBS). Pregnant women responded at 18-20 weeks gestation and two months after birth from a regional area of Sweden (n = 386) and a regional area of Australia (n = 123). Cluster analysis was used to identify a set of profiles. Odds ratios (95% CI) were calculated, comparing cluster membership for country of care, pregnancy characteristics, birth experience and outcomes. Results Three clusters were identified – ‘Self determiners’ (clear attitudes about birth including seeing it as a natural process and no childbirth fear), ‘Take it as it comes’ (no fear of birth and low levels of agreement with any of the attitude statements) and ‘Fearful’ (afraid of birth, with concerns for the personal impact of birth including pain and control, safety concerns and low levels of agreement with attitudes relating to women’s freedom of choice or birth as a natural process). At 18 -20 weeks gestation, when compared to the ‘Self determiners’, women in the ‘Fearful’ cluster were more likely to: prefer a caesarean (OR = 3.3 CI: 1.6-6.8), hold less than positive feelings about being pregnant (OR = 3.6 CI: 1.4-9.0), report less than positive feelings about the approaching birth (OR = 7.2 CI: 4.4-12.0) and less than positive feelings about the first weeks with a newborn (OR = 2.0 CI 1.2-3.6). At two months post partum the ‘Fearful’ cluster had a greater likelihood of having had an elective caesarean (OR = 5.4 CI 2.1-14.2); they were more likely to have had an epidural if they laboured (OR = 1.9 CI 1.1-3.2) and to experience their labour pain as more intense than women in the other clusters. The ‘Fearful’ cluster were more likely to report a negative experience of birth (OR = 1.7 CI 1.02- 2.9). The ‘Take it as it comes’ cluster had a higher likelihood of an elective caesarean (OR 3.0 CI 1.1-8.0). Conclusions In this study three clusters of women were identified. Belonging to the ‘Fearful’ cluster had a negative effect on women’s emotional health during pregnancy and increased the likelihood of a negative birth experience. Both women in the ‘Take it as it comes’ and the ‘Fearful’ cluster had higher odds of having an elective caesarean compared to women in the ‘Self determiners’. Understanding women’s attitudes and level of fear may help midwives and doctors to tailor their interactions with women. PMID:22727217
Ng, Kristine P; Cambridge, Geraldine; Leandro, Maria J; Edwards, Jonathan C W; Ehrenstein, Michael; Isenberg, David A
2007-01-01
Objectives To describe the long‐term clinical outcome and safety profile of B cell depletion therapy (BCDT) in patients with systemic lupus erythematosus (SLE). It was also determined whether baseline parameters can predict the likelihood of disease flare. Methods 32 patients with refractory SLE were treated with BCDT using a combination protocol (rituximab and cyclophosphamide). Patients were assessed with the British Isles Lupus Assessment Group (BILAG) activity index, and baseline serology was measured. Flare was defined as a new BILAG 'A' or two new subsequent 'B's in any organ system. Results Of the 32 patients, 12 have remained well after one cycle of BCDT (median follow‐up 39 months). BCDT was followed by a decrease of median global BILAG scores from 13 to 5 at 6 months (p = 0.006). Baseline anti‐extractable nuclear antigen (ENA) was the only identified independent predictor of flare post‐BCDT (p = 0.034, odds ratio = 8, 95% CI 1.2 to 55) from multivariable analysis. Patients with low baseline serum C3 had a shorter time to flare post‐BCDT (p = 0.008). Four serious adverse events were observed. Conclusion Autoantibody profiling may help identify patients who will have a more sustained response. Although the long‐term safety profile of BCDT is favourable, ongoing vigilance is recommended. PMID:17412738
Less-Complex Method of Classifying MPSK
NASA Technical Reports Server (NTRS)
Hamkins, Jon
2006-01-01
An alternative to an optimal method of automated classification of signals modulated with M-ary phase-shift-keying (M-ary PSK or MPSK) has been derived. The alternative method is approximate, but it offers nearly optimal performance and entails much less complexity, which translates to much less computation time. Modulation classification is becoming increasingly important in radio-communication systems that utilize multiple data modulation schemes and include software-defined or software-controlled receivers. Such a receiver may "know" little a priori about an incoming signal but may be required to correctly classify its data rate, modulation type, and forward error-correction code before properly configuring itself to acquire and track the symbol timing, carrier frequency, and phase, and ultimately produce decoded bits. Modulation classification has long been an important component of military interception of initially unknown radio signals transmitted by adversaries. Modulation classification may also be useful for enabling cellular telephones to automatically recognize different signal types and configure themselves accordingly. The concept of modulation classification as outlined in the preceding paragraph is quite general. However, at the present early stage of development, and for the purpose of describing the present alternative method, the term "modulation classification" or simply "classification" signifies, more specifically, a distinction between M-ary and M'-ary PSK, where M and M' represent two different integer multiples of 2. Both the prior optimal method and the present alternative method require the acquisition of magnitude and phase values of a number (N) of consecutive baseband samples of the incoming signal + noise. The prior optimal method is based on a maximum-likelihood (ML) classification rule that requires a calculation of likelihood functions for the M and M' hypotheses: Each likelihood function is an integral, over a full cycle of carrier phase, of a complicated sum of functions of the baseband sample values, the carrier phase, the carrier-signal and noise magnitudes, and M or M'. Then the likelihood ratio, defined as the ratio between the likelihood functions, is computed, leading to the choice of whichever hypothesis - M or M' - is more likely. In the alternative method, the integral in each likelihood function is approximated by a sum over values of the integrand sampled at a number, l, of equally spaced values of carrier phase. Used in this way, l is a parameter that can be adjusted to trade computational complexity against the probability of misclassification. In the limit as l approaches infinity, one obtains the integral form of the likelihood function and thus recovers the ML classification. The present approximate method has been tested in comparison with the ML method by means of computational simulations. The results of the simulations have shown that the performance (as quantified by probability of misclassification) of the approximate method is nearly indistinguishable from that of the ML method (see figure).
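A minimal sketch of the approximate classifier described above, deciding between BPSK (M = 2) and QPSK (M' = 4) in additive white Gaussian noise by replacing the integral over carrier phase with an average over l equally spaced phase samples; the amplitude, noise level, number of samples, and value of l are illustrative assumptions, and this is not the NASA implementation.

import numpy as np

rng = np.random.default_rng(3)
M_true, N, snr_db, ell = 4, 512, 6.0, 16
A = 1.0
sigma2 = A**2 / (2 * 10 ** (snr_db / 10))                      # assumed noise variance per dimension
symbols = np.exp(1j * 2 * np.pi * rng.integers(0, M_true, N) / M_true)
noise = rng.normal(0, np.sqrt(sigma2 / 2), (N, 2)) @ np.array([1, 1j])
r = A * symbols * np.exp(1j * 0.3) + noise                     # received samples with unknown phase offset

def approx_log_like(r, M, ell):
    phases = 2 * np.pi * np.arange(ell) / ell                  # l equally spaced carrier-phase samples
    const = np.exp(1j * 2 * np.pi * np.arange(M) / M)          # M-PSK constellation
    ll_phi = []
    for phi in phases:
        pts = A * const * np.exp(1j * phi)
        d2 = np.abs(r[:, None] - pts[None, :]) ** 2
        # conditional log-likelihood given phi (constants common to both hypotheses dropped)
        ll_phi.append(np.log(np.mean(np.exp(-d2 / (2 * sigma2)), axis=1)).sum())
    # average the conditional likelihood over phase, done stably in the log domain
    return np.logaddexp.reduce(np.array(ll_phi)) - np.log(ell)

print("BPSK:", approx_log_like(r, 2, ell), "QPSK:", approx_log_like(r, 4, ell))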
Likelihood ratio and posterior odds in forensic genetics: Two sides of the same coin.
Caliebe, Amke; Walsh, Susan; Liu, Fan; Kayser, Manfred; Krawczak, Michael
2017-05-01
It has become widely accepted in forensics that, owing to a lack of sensible priors, the evidential value of matching DNA profiles in trace donor identification or kinship analysis is most sensibly communicated in the form of a likelihood ratio (LR). This restraint does not abate the fact that the posterior odds (PO) would be the preferred basis for returning a verdict. A completely different situation holds for Forensic DNA Phenotyping (FDP), which is aimed at predicting externally visible characteristics (EVCs) of a trace donor from DNA left behind at the crime scene. FDP is intended to provide leads to the police investigation, helping them to find unknown trace donors that are unidentifiable by DNA profiling. The statistical models underlying FDP typically yield posterior odds (PO) for an individual possessing a certain EVC. This apparent discrepancy has led to confusion as to when LR or PO is the appropriate outcome of forensic DNA analysis to be communicated to the investigating authorities. We thus set out to clarify the distinction between LR and PO in the context of forensic DNA profiling and FDP from a statistical point of view. In so doing, we also addressed the influence of population affiliation on LR and PO. In contrast to the well-known population dependency of the LR in DNA profiling, the PO as obtained in FDP may be widely population-independent. The actual degree of independence, however, is a matter of (i) how much of the causality of the respective EVC is captured by the genetic markers used for FDP and (ii) the extent to which non-genetic causal factors of the same EVC, such as environmental ones, are distributed equally throughout populations. The fact that an LR should be communicated in cases of DNA profiling whereas the PO are suitable for FDP does not conflict with theory, but rather reflects the immanent differences between these two forensic applications of DNA information. Copyright © 2017 Elsevier B.V. All rights reserved.
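The distinction rests on the standard Bayesian identity (not specific to this paper) that the posterior odds equal the likelihood ratio multiplied by the prior odds, so an LR can be reported without committing to priors, whereas a PO requires them, explicitly or implicitly:

\[
\underbrace{\frac{P(H_1 \mid E)}{P(H_2 \mid E)}}_{\text{posterior odds (PO)}}
=
\underbrace{\frac{P(E \mid H_1)}{P(E \mid H_2)}}_{\text{likelihood ratio (LR)}}
\times
\underbrace{\frac{P(H_1)}{P(H_2)}}_{\text{prior odds}} .
\]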
A mathematical model for ethanol fermentation from oil palm trunk sap using Saccharomyces cerevisiae
NASA Astrophysics Data System (ADS)
Sultana, S.; Jamil, Norazaliza Mohd; Saleh, E. A. M.; Yousuf, A.; Faizal, Che Ku M.
2017-09-01
This paper presents a mathematical model and solution strategy for ethanol fermentation of oil palm trunk (OPT) sap, considering the effects of substrate limitation, substrate inhibition, product inhibition, and cell death. To investigate the effect of the cell death rate on the fermentation process, we extended and improved the current mathematical model. The kinetic parameters of the model were determined by nonlinear regression using a maximum likelihood function. The temporal profiles of sugar, cell, and ethanol concentrations were modelled by a set of ordinary differential equations, which were solved numerically by the 4th-order Runge-Kutta method. The model was validated against the experimental data, and the agreement between the model and experimental results demonstrates that the model is reasonable for predicting the dynamic behaviour of the fermentation process.
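A minimal sketch of a Monod-type fermentation model with substrate inhibition, product inhibition, and cell death, integrated with a classical 4th-order Runge-Kutta scheme; all parameter values and initial conditions are illustrative assumptions and do not reproduce the authors' calibrated model.

import numpy as np

mu_max, Ks, Ki, Pmax, Yxs, Ypx, kd = 0.4, 2.0, 80.0, 90.0, 0.1, 0.45, 0.01

def rhs(t, y):
    X, S, P = y                                   # biomass, sugar, ethanol (g/L)
    mu = mu_max * S / (Ks + S + S**2 / Ki) * max(1.0 - P / Pmax, 0.0)
    dX = (mu - kd) * X                            # growth minus cell death
    dS = -mu * X / Yxs                            # substrate (sugar) consumption
    dP = Ypx * mu * X / Yxs                       # ethanol production
    return np.array([dX, dS, dP])

def rk4(rhs, y0, t):
    # classical 4th-order Runge-Kutta integration on a fixed time grid
    y = np.empty((t.size, y0.size)); y[0] = y0
    for i in range(t.size - 1):
        h = t[i + 1] - t[i]
        k1 = rhs(t[i], y[i])
        k2 = rhs(t[i] + h / 2, y[i] + h / 2 * k1)
        k3 = rhs(t[i] + h / 2, y[i] + h / 2 * k2)
        k4 = rhs(t[i + 1], y[i] + h * k3)
        y[i + 1] = y[i] + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

t = np.linspace(0.0, 48.0, 481)                   # hours
profiles = rk4(rhs, np.array([0.2, 50.0, 0.0]), t)  # temporal profiles of X, S, P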
Gill, P; Bleka, Ø; Egeland, T
2014-11-01
Likelihood ratio (LR) methods to interpret multi-contributor, low template, complex DNA mixtures are becoming standard practice. The next major development will be to introduce search engines based on the new methods to interrogate very large national DNA databases, such as those held by China, the USA and the UK. Here we describe a rapid method that was used to assign an LR to each individual member of a database of 5 million genotypes, which can be ranked in order. Previous authors have only considered database trawls in the context of binary match or non-match criteria. However, the concept of match/non-match no longer applies within the new paradigm introduced, since the distribution of resultant LRs is continuous for practical purposes. An English appeal court decision allows scientists to routinely report complex DNA profiles using nothing more than their subjective personal 'experience of casework' and 'observations' in order to apply an expression of the rarity of an evidential sample. This ruling must be considered in the context of a recent high-profile English case, where an individual was extracted from a database and wrongly accused of a serious crime. In this case the DNA evidence was used to negate the overwhelming exculpatory (non-DNA) evidence. Demonstrable confirmation bias, also known as the 'CSI effect', seriously affected the investigation. The case demonstrated that in practice, databases could be used to select and prosecute an individual, simply because he ranked high in the list of possible matches. We have identified this phenomenon as a cognitive error which we term 'the naïve investigator effect'. We take the opportunity to test the performance of database extraction strategies either by using a simple matching allele count (MAC) method or LR. The example heard by the appeal court is used as the exemplar case. It is demonstrated that the LR search method offers substantial benefits compared to searches based on simple matching allele count (MAC) methods. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Technical Note: Approximate Bayesian parameterization of a process-based tropical forest model
NASA Astrophysics Data System (ADS)
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2014-02-01
Inverse parameter estimation of process-based models is a long-standing problem in many scientific disciplines. A key question for inverse parameter estimation is how to define the metric that quantifies how well model predictions fit to the data. This metric can be expressed by general cost or objective functions, but statistical inversion methods require a particular metric, the probability of observing the data given the model parameters, known as the likelihood. For technical and computational reasons, likelihoods for process-based stochastic models are usually based on general assumptions about variability in the observed data, and not on the stochasticity generated by the model. Only in recent years have new methods become available that allow the generation of likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional Markov chain Monte Carlo (MCMC) sampler, performs well in retrieving known parameter values from virtual inventory data generated by the forest model. We analyze the results of the parameter estimation, examine its sensitivity to the choice and aggregation of model outputs and observed data (summary statistics), and demonstrate the application of this method by fitting the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss how this approach differs from approximate Bayesian computation (ABC), another method commonly used to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can be successfully applied to process-based models of high complexity. The methodology is particularly suitable for heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models.
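A minimal sketch of the general idea of a simulation-based, parametric likelihood approximation used inside a conventional Metropolis sampler: at each proposed parameter value the stochastic model is simulated repeatedly, a multivariate normal is fitted to the simulated summary statistics, and the observed summaries are scored under that normal. The toy simulator, summary statistics, flat prior, and proposal scale are illustrative assumptions, and this is not the FORMIND pipeline.

import numpy as np

rng = np.random.default_rng(4)

def simulate(theta, n_rep=100):
    # toy stochastic "model": two summary statistics, noisy functions of theta
    return rng.normal([theta, theta ** 2], 0.3, size=(n_rep, 2))

def approx_loglik(theta, s_obs):
    # parametric (normal) likelihood approximation built from simulations
    sims = simulate(theta)
    mu, cov = sims.mean(0), np.cov(sims.T) + 1e-6 * np.eye(2)
    diff = s_obs - mu
    return -0.5 * (diff @ np.linalg.solve(cov, diff) + np.log(np.linalg.det(cov)))

s_obs = np.array([1.0, 1.1])                       # "observed" summary statistics
theta, ll = 0.5, approx_loglik(0.5, s_obs)
chain = []
for _ in range(2000):                              # Metropolis random walk, flat prior assumed
    prop = theta + rng.normal(0, 0.1)
    ll_prop = approx_loglik(prop, s_obs)
    if np.log(rng.uniform()) < ll_prop - ll:
        theta, ll = prop, ll_prop
    chain.append(theta)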
Empirical likelihood-based confidence intervals for mean medical cost with censored data.
Jeyarajah, Jenny; Qin, Gengsheng
2017-11-10
In this paper, we propose empirical likelihood methods based on influence function and jackknife techniques for constructing confidence intervals for mean medical cost with censored data. We conduct a simulation study to compare the coverage probabilities and interval lengths of our proposed confidence intervals with those of the existing normal approximation-based confidence intervals and bootstrap confidence intervals. The proposed methods have better finite-sample performances than existing methods. Finally, we illustrate our proposed methods with a relevant example. Copyright © 2017 John Wiley & Sons, Ltd.
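For readers unfamiliar with the jackknife ingredient, the sketch below computes a jackknife standard error and a normal-approximation confidence interval for a plain (uncensored) mean. It only illustrates the leave-one-out mechanics; the paper's influence-function and jackknife empirical likelihood intervals for censored costs are considerably more involved.

```python
# Hedged sketch of the jackknife component only: leave-one-out estimates of a mean,
# a jackknife standard error, and a normal-approximation interval. Censoring is ignored.
import numpy as np
from scipy.stats import norm

def jackknife_ci(x, level=0.95):
    n = len(x)
    theta_hat = x.mean()
    loo = np.array([np.delete(x, i).mean() for i in range(n)])   # leave-one-out estimates
    se = np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2))  # jackknife standard error
    z = norm.ppf(0.5 + level / 2)
    return theta_hat - z * se, theta_hat + z * se

costs = np.random.default_rng(0).gamma(shape=2.0, scale=5000.0, size=120)  # synthetic costs
print(jackknife_ci(costs))
```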
DOE Office of Scientific and Technical Information (OSTI.GOV)
Washeleski, Robert L.; Meyer, Edmond J. IV; King, Lyon B.
2013-10-15
Laser Thomson scattering (LTS) is an established plasma diagnostic technique that has seen recent application to low density plasmas. It is difficult to perform LTS measurements when the scattered signal is weak as a result of low electron number density, poor optical access to the plasma, or both. Photon counting methods are often implemented in order to perform measurements in these low signal conditions. However, photon counting measurements performed with photo-multiplier tubes are time consuming and multi-photon arrivals are incorrectly recorded. In order to overcome these shortcomings a new data analysis method based on maximum likelihood estimation was developed. The key feature of this new data processing method is the inclusion of non-arrival events in determining the scattered Thomson signal. Maximum likelihood estimation and its application to Thomson scattering at low signal levels is presented and application of the new processing method to LTS measurements performed in the plume of a 2-kW Hall-effect thruster is discussed.
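A hedged, generic illustration of why including non-arrival events matters in photon-counting maximum likelihood: assume each laser shot produces a Poisson number of photons but the detector only registers "arrival" versus "no arrival". The closed-form MLE below uses the fraction of empty shots; it is not the full LTS estimator described in the abstract.

```python
# Generic photon-counting MLE that includes non-arrival events. Each shot yields a
# Poisson(lambda) number of photons, but the detector records only arrival / non-arrival,
# so multi-photon shots would otherwise be undercounted.
import numpy as np

rng = np.random.default_rng(42)
true_lambda = 0.3
n_shots = 10_000

photons = rng.poisson(true_lambda, size=n_shots)
arrivals = photons > 0                       # detector output: arrival / non-arrival

# Bernoulli likelihood: P(arrival) = 1 - exp(-lambda); the MLE has a closed form.
p_hat = arrivals.mean()
lambda_hat = -np.log(1.0 - p_hat)

# Naive estimate that treats every arrival as exactly one photon.
lambda_naive = arrivals.sum() / n_shots

print(f"true={true_lambda:.3f}  ML (with non-arrivals)={lambda_hat:.3f}  naive={lambda_naive:.3f}")
```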
Likelihood Estimation Method for Completely Separated and Quasi-Completely Separated Data for a Dose-Response Model
Park, Kyong H.; Lagan, Steven J.
2015-08-01
Antidepressant Prescribing by Pediatricians: A Mixed-Methods Analysis.
Tulisiak, Anne K; Klein, Jillian A; Harris, Emily; Luft, Marissa J; Schroeder, Heidi K; Mossman, Sarah A; Varney, Sara T; Keeshin, Brooks R; Cotton, Sian; Strawn, Jeffrey R
2017-01-01
Among pediatricians, perceived knowledge of the efficacy, tolerability, dosing, and side effects of antidepressants represents a significant source of variability in the use of these medications in youth with depressive and anxiety disorders. Importantly, the qualitative factors that relate to varying levels of comfort with antidepressants and willingness to prescribe are poorly understood. Using a mixed-methods approach, in-depth interviews were conducted with community-based and academic medical center-based pediatricians (N = 14). Interviews were audio recorded and iteratively coded; themes were then generated using inductive thematic analysis. The relationship between demographic factors, knowledge of antidepressants, dosing, and side effects, as well as prescribing likelihood scores for depressive disorders, anxiety disorders, or co-morbid anxiety and depressive disorders, was evaluated using mixed models. Pediatricians reported antidepressants to be effective and well-tolerated. However, the likelihood of individual physicians initiating an antidepressant was significantly lower for anxiety disorders relative to depressive disorders with similar functional impairment. Pediatricians considered symptom severity/functional impairment, age and the availability of psychotherapy as they considered prescribing antidepressants to individual patients. Antidepressant choice was related to the physician's perceived knowledge of and comfort with a particular antidepressant, financial factors, the disorder-specific evidence base for that particular medication, and consultation with mental health practitioners. Pediatricians noted similar efficacy and tolerability profiles for antidepressants in youth with depressive disorders and anxiety disorders, but tended to utilize "therapy first" approaches for anxiety disorders relative to depressive disorders. Parental and family factors that influenced prescribing of antidepressants by pediatricians included parental ambivalence, family-related dysfunction and impairment secondary to the child's psychopathology, as well as the child's psychosocial milieu. Pediatricians consider patient- and family-specific challenges when choosing to prescribe antidepressant medications and are, in general, less likely to prescribe antidepressants for youth with anxiety disorders compared with youth with depressive disorders. The lower likelihood of prescribing antidepressants for anxious youth is not related to perceptions of efficacy or tolerability, but rather to a perception that anxiety disorders are less impairing and more appropriately managed with psychotherapy. Copyright © 2016 Mosby, Inc. All rights reserved.
Scientific expertise and the Athlete Biological Passport: 3 years of experience.
Schumacher, Yorck Olaf; d'Onofrio, Giuseppe
2012-06-01
Expert evaluation of biological data is a key component of the Athlete Biological Passport approach in the fight against doping. The evaluation consists of a longitudinal assessment of biological variables to determine the probability of the data being physiological on the basis of the athlete's own previous values (performed by an automated software system using a Bayesian model) and a subjective evaluation of the results in view of possible causes (performed by experts). The role of the expert is therefore a key component in the process. Experts should be qualified to evaluate the data regarding possible explanations related to the influence of doping products and methods, analytical issues, and the influence of exercise or pathological conditions. The evaluation provides a scientific basis for the decision taken by a disciplinary panel. This evaluation should therefore encompass and balance all possible causes for a given blood profile and provide a likelihood for potential scenarios (pathology, normal variation, doping) that might have caused the pattern. It should comply with the standards for the evaluation of scientific evidence in forensics. On the basis of their evaluation of profiles, experts might provide assistance in planning appropriate target testing schemes.
The upgrade of the Thomson scattering system for measurement on the C-2/C-2U devices
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhai, K.; Schindler, T.; Kinley, J.
The C-2/C-2U Thomson scattering system has been substantially upgraded during the latter phase of the C-2/C-2U program. A Rayleigh channel has been added to each of the three polychromators of the C-2/C-2U Thomson scattering system. Onsite spectral calibration has been applied to avoid the issue of different channel responses at different spots on the photomultiplier tube surface. With the added Rayleigh channel, the absolute intensity response of the system is calibrated with Rayleigh scattering in argon gas from 0.1 to 4 Torr, where the Rayleigh scattering signal is comparable to the Thomson scattering signal at electron densities from 1 × 10^13 to 4 × 10^14 cm^-3. A new signal processing algorithm, using a maximum likelihood method and including detailed analysis of different noise contributions within the system, has been developed to obtain electron temperature and density profiles. The system setup, spectral and intensity calibration procedure and its outcome, data analysis, and the results of electron temperature/density profile measurements will be presented.
Physical activity and sleep profiles in Finnish men and women.
Wennman, Heini; Kronholm, Erkki; Partonen, Timo; Tolvanen, Asko; Peltonen, Markku; Vasankari, Tommi; Borodulin, Katja
2014-01-27
Physical activity (PA) and sleep are related to cardiovascular diseases (CVD) and their risk factors. The interrelationship between these behaviors has been studied, but questions remain regarding the association of different types of PA, such as occupational, commuting, and leisure-time PA, with sleep, including its quality, duration and sufficiency. It is also unclear to what extent sleep affects people's PA levels and patterns. Our aim is to investigate the interrelationship between PA and sleep behaviors in the Finnish population, taking employment status and gender into account. The study comprised population-based data from the FINRISK 2012 Study. A stratified random sample of 10,000 Finns aged 25 to 74 years was sent a questionnaire and an invitation to a health examination. The participation rate was 64% (n = 6,414). Latent class analysis was used to search for different underlying profiles of PA and sleep behavior, separately in men and women. Models with one through five latent profiles were fitted to the data. Based on fit indicators, a four-class model was selected as the best-fitting model for both men and women. Four different profiles of PA and sleep were thus found in both men and women. The most common profile comprised 45% of participants among men and 47% among women; these profiles were distinguished by high probabilities of high leisure-time PA and of sleep subjectively rated as sufficient, as well as a sleep duration of 7-7.9 hours. The least common profiles represented 5% (men) and 11% (women) of the population and were characterized by probabilities of physical inactivity, short sleep, and evening type for women and morning type for men. There was also one profile in both genders characterized by a likelihood of both high occupational PA and subjectively experienced insufficient sleep. The use of latent class analysis to investigate the interrelationship between PA and sleep is a novel perspective. The method provides information on the clustering of behaviors in people, and the profiles found suggest that leisure-time PA and better sleep tend to accumulate in the same individuals. Our data also suggest that high levels of occupational PA are associated with shorter and poorer sleep.
A mathematical approach to beam matching
Manikandan, A; Nandy, M; Gossman, M S; Sureka, C S; Ray, A; Sujatha, N
2013-01-01
Objective: This report provides the mathematical commissioning instructions for the evaluation of beam matching between two different linear accelerators. Methods: Test packages were first obtained including an open beam profile, a wedge beam profile and a depth–dose curve, each from a 10 × 10 cm² beam. From these plots, a spatial error (SE) and a percentage dose error were introduced to form new plots. These three test package curves and the associated error curves were then differentiated in space with respect to dose for a first and second derivative to determine the slope and curvature of each data set. The derivatives, also known as bandwidths, were analysed to determine the level of acceptability for the beam matching test described in this study. Results: The open and wedged beam profiles and depth–dose curve in the build-up region were determined to match within 1% dose error and 1-mm SE for 71.4% and 70.8% of all points, respectively. For the depth–dose analysis specifically, beam matching was achieved for 96.8% of all points at 1%/1 mm beyond the depth of maximum dose. Conclusion: To quantify the beam matching procedure in any clinic, the user merely needs to generate test packages from their reference linear accelerator. It then follows that if the bandwidths are smooth and continuous across the profile and depth, there is greater likelihood of beam matching. Differentiated spatial and percentage variation analysis is appropriate, ideal and accurate for this commissioning process. Advances in knowledge: We report a mathematically rigorous formulation for the qualitative evaluation of beam matching between linear accelerators. PMID:23995874
XENON100 exclusion limit without considering Leff as a nuisance parameter
NASA Astrophysics Data System (ADS)
Davis, Jonathan H.; Bœhm, Céline; Oppermann, Niels; Ensslin, Torsten; Lacroix, Thomas
2012-07-01
In 2011, the XENON100 experiment set unprecedented constraints on dark matter-nucleon interactions, excluding dark matter candidates with masses down to 6 GeV if the corresponding cross section is larger than 10^-39 cm^2. The dependence of the exclusion limit on the scintillation efficiency (Leff) has been debated at length. To overcome possible criticisms, XENON100 performed an analysis in which Leff was considered as a nuisance parameter and its uncertainties were profiled out by using a Gaussian likelihood in which the mean value corresponds to the best-fit Leff value (smoothly extrapolated to 0 below 3 keVnr). Although such a method seems fairly robust, it does not account for more extreme types of extrapolation nor does it enable us to anticipate how much the exclusion limit would vary if new data were to support a flat behavior for Leff below 3 keVnr, for example. Yet, such a question is crucial for light dark matter models which are close to the published XENON100 limit. To address this issue, we use a maximum likelihood ratio analysis, as done by the XENON100 Collaboration, but do not consider Leff as a nuisance parameter. Instead, Leff is obtained directly from the fits to the data. This enables us to define frequentist confidence intervals by marginalizing over Leff.
Berger, Christian; Batanova, Milena; Cance, Jessica Duncan
2015-12-01
The present study tests whether aggression and prosocial behavior can coexist as part of a socially functional and adaptive profile among early adolescents. Using a person-centered approach, the study examined early adolescents' likelihood of being classified into profiles involving aggressive and prosocial behavior, social status (popular, liked, cool), machiavellianism, and both affective and cognitive components of empathy (empathic concern and perspective taking, respectively). Participants were 1170 early adolescents (10-12 years of age; 52% male) from four schools in metropolitan Santiago, Chile. Through latent profile analysis, three profiles emerged (normative-low aggressive, high prosocial-low aggressive, and high aggressive-high popular status). Both empathic concern and perspective taking were higher in the high prosocial-low aggressive profile, whereas the high aggressive-high popular status profile had the lowest scores on both empathy components as well as machiavellianism. No profile emerged where aggressive and prosocial behaviors were found to co-exist, or to be significantly above the mean. The results underscore that aggressive behavior is highly contextual and likely culturally specific, and that the study of behavioral profiles should consider social status as well as socio-emotional adjustment indicators. These complex associations should be taken into consideration when planning prevention and intervention efforts to reduce aggression or school bullying and to promote positive peer relationships.
Olga Loseva; Mohamed Ibrahim; Mehmet Candas; C. Noah Koller; Leah S. Bauer; Lee A. Jr. Bulla
2002-01-01
Widespread commercial use of Bacillus thuringiensis Cry toxins to control pest insects has increased the likelihood for development of insect resistance to this entomopathogen. In this study, we investigated protease activity profiles and toxin-binding capacities in the midgut of a strain of Colorado potato beetle (CPB) that has developed resistance...
ERIC Educational Resources Information Center
Molenaar, Peter C. M.; Nesselroade, John R.
1998-01-01
Pseudo-Maximum Likelihood (p-ML) and Asymptotically Distribution Free (ADF) estimation methods for estimating dynamic factor model parameters within a covariance structure framework were compared through a Monte Carlo simulation. Both methods appear to give consistent model parameter estimates, but only ADF gives standard errors and chi-square…
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
NASA Astrophysics Data System (ADS)
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and are capable of forecasting a full probability distribution. In order to estimate the corresponding regression coefficients, CRPS minimization has been performed in many meteorological post-processing studies over the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield better calibrated forecasts. Theoretically, both scoring rules used as an optimization score should be able to locate a similar and unknown optimum. Discrepancies might result from a wrong distributional assumption for the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation for different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield similar regression coefficients. The log-likelihood estimator is slightly more efficient. A real-world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
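The following sketch contrasts the two estimation strategies for a simple Gaussian non-homogeneous regression (mean linear in one predictor, constant log-sigma), fitting the same synthetic data once by maximizing the likelihood and once by minimizing the mean CRPS via its closed form for the Gaussian. The data and model are placeholders, not the study's temperature forecasts.

```python
# Maximum likelihood vs minimum CRPS estimation for a toy Gaussian regression.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 1.0 + 2.0 * x + rng.normal(scale=1.5, size=500)

def unpack(p):
    a, b, log_s = p
    return a + b * x, np.exp(log_s)

def neg_loglik(p):
    mu, s = unpack(p)
    return -norm.logpdf(y, loc=mu, scale=s).sum()

def mean_crps(p):
    # Closed-form CRPS of a Gaussian forecast N(mu, s) evaluated at observation y.
    mu, s = unpack(p)
    z = (y - mu) / s
    crps = s * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))
    return crps.mean()

ml  = minimize(neg_loglik, x0=[0.0, 0.0, 0.0])
crp = minimize(mean_crps, x0=[0.0, 0.0, 0.0])
print("ML   coefficients:", ml.x)
print("CRPS coefficients:", crp.x)
```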
el Galta, Rachid; Uitte de Willige, Shirley; de Visser, Marieke C H; Helmer, Quinta; Hsu, Li; Houwing-Duistermaat, Jeanine J
2007-09-24
In this paper, we propose a one-degree-of-freedom test for association between a candidate gene and a binary trait. This method is a generalization of Terwilliger's likelihood ratio statistic and is especially powerful for the situation of one associated haplotype. As an alternative to the likelihood ratio statistic, we derive a score statistic, which has a tractable expression. For haplotype analysis, we assume that phase is known. By means of a simulation study, we compare the performance of the score statistic to Pearson's chi-square statistic and the likelihood ratio statistic proposed by Terwilliger. We illustrate the method on three candidate genes studied in the Leiden Thrombophilia Study. We conclude that the statistic follows a chi-square distribution under the null hypothesis and that the score statistic is more powerful than Terwilliger's likelihood ratio statistic when the associated haplotype has frequency between 0.1 and 0.4 and has a small impact on the studied disorder. Compared with Pearson's chi-square statistic, the score statistic has more power when the associated haplotype has frequency above 0.2 and the number of variants is above five.
Internal validation of STRmix™ for the interpretation of single source and mixed DNA profiles.
Moretti, Tamyra R; Just, Rebecca S; Kehl, Susannah C; Willis, Leah E; Buckleton, John S; Bright, Jo-Anne; Taylor, Duncan A; Onorato, Anthony J
2017-07-01
The interpretation of DNA evidence can entail analysis of challenging STR typing results. Genotypes inferred from low quality or quantity specimens, or mixed DNA samples originating from multiple contributors, can result in weak or inconclusive match probabilities when a binary interpretation method and necessary thresholds (such as a stochastic threshold) are employed. Probabilistic genotyping approaches, such as fully continuous methods that incorporate empirically determined biological parameter models, enable usage of more of the profile information and reduce subjectivity in interpretation. As a result, software-based probabilistic analyses tend to produce more consistent and more informative results regarding potential contributors to DNA evidence. Studies to assess and internally validate the probabilistic genotyping software STRmix™ for casework usage at the Federal Bureau of Investigation Laboratory were conducted using lab-specific parameters and more than 300 single-source and mixed contributor profiles. Simulated forensic specimens, including constructed mixtures that included DNA from two to five donors across a broad range of template amounts and contributor proportions, were used to examine the sensitivity and specificity of the system via more than 60,000 tests comparing hundreds of known contributors and non-contributors to the specimens. Conditioned analyses, concurrent interpretation of amplification replicates, and application of an incorrect contributor number were also performed to further investigate software performance and probe the limitations of the system. In addition, the results from manual and probabilistic interpretation of both prepared and evidentiary mixtures were compared. The findings support that STRmix™ is sufficiently robust for implementation in forensic laboratories, offering numerous advantages over historical methods of DNA profile analysis and greater statistical power for the estimation of evidentiary weight, and can be used reliably in human identification testing. With few exceptions, likelihood ratio results reflected intuitively correct estimates of the weight of the genotype possibilities and known contributor genotypes. This comprehensive evaluation provides a model in accordance with SWGDAM recommendations for internal validation of a probabilistic genotyping system for DNA evidence interpretation. Copyright © 2017. Published by Elsevier B.V.
A Non-parametric Cutout Index for Robust Evaluation of Identified Proteins*
Serang, Oliver; Paulo, Joao; Steen, Hanno; Steen, Judith A.
2013-01-01
This paper proposes a novel, automated method for evaluating sets of proteins identified using mass spectrometry. The remaining peptide-spectrum match score distributions of protein sets are compared to an empirical absent peptide-spectrum match score distribution, and a Bayesian non-parametric method reminiscent of the Dirichlet process is presented to accurately perform this comparison. Thus, for a given protein set, the process computes the likelihood that the proteins identified are correctly identified. First, the method is used to evaluate protein sets chosen using different protein-level false discovery rate (FDR) thresholds, assigning each protein set a likelihood. The protein set assigned the highest likelihood is used to choose a non-arbitrary protein-level FDR threshold. Because the method can be used to evaluate any protein identification strategy (and is not limited to mere comparisons of different FDR thresholds), we subsequently use the method to compare and evaluate multiple simple methods for merging peptide evidence over replicate experiments. The general statistical approach can be applied to other types of data (e.g. RNA sequencing) and generalizes to multivariate problems. PMID:23292186
Factors that predict the use or non-use of virtual dissection by high school biology teachers
NASA Astrophysics Data System (ADS)
Cockerham, William
2001-07-01
With the advent of computers into scholastic classrooms, virtual dissection has become a potential educational tool in high school biology lab settings. Utilizing non-experimental survey research methodology, this study attempted to identify factors that may influence high school biology teachers to use or not to use a virtual dissection. A 75-item research survey instrument consisting of both demographic background and Likert style questions was completed by 215 high school members of the National Association of Biology Teachers. The survey responses provided data to answer the research questions concerning the relationship between the likelihood of a high school biology teacher using a virtual dissection and a number of independent variables from the following three categories: (a) demographics, (b) attitude and experience, and (c) resources and support. These data also allowed for the determination of a demographic profile of the sample population. The demographic profile showed the sample population of high school biology teachers to be two-thirds female, mature, highly educated and very experienced. Analysis of variance and Pearson product moment correlational statistics were used to determine if there was a relationship between high school biology teachers' likelihood to use a virtual dissection and the independent variables. None of the demographic or resource and support independent variables demonstrated a strong relationship to the dependent variable of teachers' likelihood to use a virtual dissection. Three of the attitude and experience independent variables showed a statistically significant (p < .05) relationship to teachers' likelihood to use a virtual dissection: attitude toward virtual dissection, previous use of a virtual dissection and intention to use a real animal dissection. These findings may indicate that teachers are using virtual dissection as a supplement rather than a substitute. It appears that those concerned with promoting virtual dissection in high school biology classrooms will have to develop simulations that are more compelling to the teachers. Additionally, if science teacher organizations want to reduce the controversy surrounding dissection, they may need to re-visit their positions on the importance of real animal dissection.
Using pseudoalignment and base quality to accurately quantify microbial community composition
Novembre, John
2018-01-01
Pooled DNA from multiple unknown organisms arises in a variety of contexts, for example microbial samples from ecological or human health research. Determining the composition of pooled samples can be difficult, especially at the scale of modern sequencing data and reference databases. Here we propose a novel method for taxonomic profiling in pooled DNA that combines the speed and low-memory requirements of k-mer based pseudoalignment with a likelihood framework that uses base quality information to better resolve multiply mapped reads. We apply the method to the problem of classifying 16S rRNA reads using a reference database of known organisms, a common challenge in microbiome research. Using simulations, we show the method is accurate across a variety of read lengths, with different length reference sequences, at different sample depths, and when samples contain reads originating from organisms absent from the reference. We also assess performance in real 16S data, where we reanalyze previous genetic association data to show our method discovers a larger number of quantitative trait associations than other widely used methods. We implement our method in the software Karp, for k-mer based analysis of read pools, to provide a novel combination of speed and accuracy that is uniquely suited for enhancing discoveries in microbial studies. PMID:29659582
Yu, Peng; Shaw, Chad A
2014-06-01
The Dirichlet-multinomial (DMN) distribution is a fundamental model for multicategory count data with overdispersion. This distribution has many uses in bioinformatics including applications to metagenomics data, transcriptomics and alternative splicing. The DMN distribution reduces to the multinomial distribution when the overdispersion parameter ψ is 0. Unfortunately, numerical computation of the DMN log-likelihood function by conventional methods results in instability in the neighborhood of ψ = 0. An alternative formulation circumvents this instability, but it leads to long runtimes that make it impractical for large count data common in bioinformatics. We have developed a new method for computation of the DMN log-likelihood to solve the instability problem without incurring long runtimes. The new approach is composed of a novel formula and an algorithm to extend its applicability. Our numerical experiments show that this new method improves both the accuracy of log-likelihood evaluation and the runtime by several orders of magnitude, especially in high-count data situations that are common in deep sequencing data. Using real metagenomic data, our method achieves manyfold runtime improvement. Our method increases the feasibility of using the DMN distribution to model many high-throughput problems in bioinformatics. We have included in our work an R package giving access to this method and a vignette applying this approach to metagenomic data. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
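To make the model concrete, the sketch below evaluates the standard gamma-function form of the DMN log-likelihood with scipy's gammaln. This is the conventional formulation, not the paper's new numerically stable algorithm for the near-multinomial regime.

```python
# Generic Dirichlet-multinomial log-likelihood via log-gamma functions.
import numpy as np
from scipy.special import gammaln

def dmn_loglik(x, alpha):
    """DMN log-likelihood for one count vector x with Dirichlet parameters alpha."""
    x, alpha = np.asarray(x, float), np.asarray(alpha, float)
    n, A = x.sum(), alpha.sum()
    return (gammaln(n + 1) - gammaln(x + 1).sum()
            + gammaln(A) - gammaln(n + A)
            + (gammaln(x + alpha) - gammaln(alpha)).sum())

print(dmn_loglik([10, 3, 7], alpha=[2.0, 1.0, 1.5]))
```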
Cha, Kenny H.; Hadjiiski, Lubomir; Samala, Ravi K.; Chan, Heang-Ping; Caoili, Elaine M.; Cohan, Richard H.
2016-01-01
Purpose: The authors are developing a computerized system for bladder segmentation in CT urography (CTU) as a critical component for computer-aided detection of bladder cancer. Methods: A deep-learning convolutional neural network (DL-CNN) was trained to distinguish between the inside and the outside of the bladder using 160 000 regions of interest (ROI) from CTU images. The trained DL-CNN was used to estimate the likelihood of an ROI being inside the bladder for ROIs centered at each voxel in a CTU case, resulting in a likelihood map. Thresholding and hole-filling were applied to the map to generate the initial contour for the bladder, which was then refined by 3D and 2D level sets. The segmentation performance was evaluated using 173 cases: 81 cases in the training set (42 lesions, 21 wall thickenings, and 18 normal bladders) and 92 cases in the test set (43 lesions, 36 wall thickenings, and 13 normal bladders). The computerized segmentation accuracy using the DL likelihood map was compared to that using a likelihood map generated by Haar features and a random forest classifier, and that using our previous conjoint level set analysis and segmentation system (CLASS) without using a likelihood map. All methods were evaluated relative to the 3D hand-segmented reference contours. Results: With DL-CNN-based likelihood map and level sets, the average volume intersection ratio, average percent volume error, average absolute volume error, average minimum distance, and the Jaccard index for the test set were 81.9% ± 12.1%, 10.2% ± 16.2%, 14.0% ± 13.0%, 3.6 ± 2.0 mm, and 76.2% ± 11.8%, respectively. With the Haar-feature-based likelihood map and level sets, the corresponding values were 74.3% ± 12.7%, 13.0% ± 22.3%, 20.5% ± 15.7%, 5.7 ± 2.6 mm, and 66.7% ± 12.6%, respectively. With our previous CLASS with local contour refinement (LCR) method, the corresponding values were 78.0% ± 14.7%, 16.5% ± 16.8%, 18.2% ± 15.0%, 3.8 ± 2.3 mm, and 73.9% ± 13.5%, respectively. Conclusions: The authors demonstrated that the DL-CNN can overcome the strong boundary between two regions that have large difference in gray levels and provides a seamless mask to guide level set segmentation, which has been a problem for many gradient-based segmentation methods. Compared to our previous CLASS with LCR method, which required two user inputs to initialize the segmentation, DL-CNN with level sets achieved better segmentation performance while using a single user input. Compared to the Haar-feature-based likelihood map, the DL-CNN-based likelihood map could guide the level sets to achieve better segmentation. The results demonstrate the feasibility of our new approach of using DL-CNN in combination with level sets for segmentation of the bladder. PMID:27036584
Identifying the most likely contributors to a Y-STR mixture using the discrete Laplace method.
Andersen, Mikkel Meyer; Eriksen, Poul Svante; Mogensen, Helle Smidt; Morling, Niels
2015-03-01
In some crime cases, the male part of the DNA in a stain can only be analysed using Y chromosomal markers, e.g. Y-STRs. This may be the case in e.g. rape cases, where the male components can only be detected as Y-STR profiles, because the fraction of male DNA is much smaller than that of female DNA, which can mask the male results when autosomal STRs are investigated. Sometimes, mixtures of Y-STRs are observed, e.g. in rape cases with multiple offenders. In such cases, Y-STR mixture analysis is required, e.g. by mixture deconvolution, to deduce the most likely DNA profiles from the contributors. We demonstrate how the discrete Laplace method can be used to separate a two-person Y-STR mixture, where the Y-STR profiles of the true contributors are not present in the reference dataset, which is often the case for Y-STR profiles in real casework. We also briefly discuss how to calculate the weight of the evidence using the likelihood ratio principle when a suspect's Y-STR profile fits into a two-person mixture. We used three datasets with between 7 and 21 Y-STR loci: Denmark (n=181), Somalia (n=201) and Germany (n=3443). The Danish dataset with 21 loci was truncated to 15 and 10 loci to examine the effect of the number of loci. For each of these datasets, an out-of-sample simulation study was performed: A total of 550 mixtures were composed by randomly sampling two haplotypes, h1 and h2, from the dataset. We then used the discrete Laplace method on the remaining data (excluding h1 and h2) to rank the contributor pairs by the product of the contributors' estimated haplotype frequencies. Successful separation of mixtures (defined by the observation that the true contributor pair was among the 10 most likely contributor pairs) was found in 42-52% of the cases for 21 loci, 69-75% for 15 loci and 92-99% for 10 loci or fewer, depending on the dataset and how the discrete Laplace model was chosen. Y-STR mixtures with many loci are difficult to separate, but even haplotypes with 21 Y-STR loci can be separated. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
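A minimal sketch of the ranking step only, assuming per-haplotype frequency estimates are already available (however obtained, e.g. from a discrete Laplace model): enumerate candidate pairs that reproduce the observed two-person mixture and rank them by the product of the two estimated haplotype frequencies. The haplotypes, loci and frequencies below are hypothetical placeholders.

```python
# Rank candidate two-person contributor pairs by the product of their estimated frequencies.
from itertools import combinations

# hypothetical candidate haplotypes with estimated population frequencies
freq = {"h1": 0.012, "h2": 0.004, "h3": 0.020, "h4": 0.001}

# hypothetical 3-locus haplotypes and an observed mixture (allele set per locus)
haplotypes = {"h1": (14, 29, 10), "h2": (15, 30, 10), "h3": (14, 30, 11), "h4": (16, 29, 10)}
observed = [{14, 15}, {29, 30}, {10}]

def mixture_alleles(h_a, h_b):
    """Per-locus allele sets produced by mixing two haplotypes."""
    return [set(pair) for pair in zip(haplotypes[h_a], haplotypes[h_b])]

candidates = []
for a, b in combinations(haplotypes, 2):
    if mixture_alleles(a, b) == observed:            # pair fully explains the mixture
        candidates.append(((a, b), freq[a] * freq[b]))

for pair, score in sorted(candidates, key=lambda t: -t[1]):
    print(pair, score)
```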
Estimation of brood and nest survival: Comparative methods in the presence of heterogeneity
Manly, Bryan F.J.; Schmutz, Joel A.
2001-01-01
The Mayfield method has been widely used for estimating survival of nests and young animals, especially when data are collected at irregular observation intervals. However, this method assumes survival is constant throughout the study period, which often ignores biologically relevant variation and may lead to biased survival estimates. We examined the bias and accuracy of 1 modification to the Mayfield method that allows for temporal variation in survival, and we developed and similarly tested 2 additional methods. One of these 2 new methods is simply an iterative extension of Klett and Johnson's method, which we refer to as the Iterative Mayfield method and which bears similarity to Kaplan-Meier methods. The other method uses maximum likelihood techniques for estimation and is best applied to survival of animals in groups or families, rather than as independent individuals. We also examined how robust these estimators are to heterogeneity in the data, which can arise from such sources as dependent survival probabilities among siblings, inherent differences among families, and adoption. Testing of estimator performance with respect to bias, accuracy, and heterogeneity was done using simulations that mimicked a study of survival of emperor goose (Chen canagica) goslings. Assuming constant survival for inappropriately long periods of time or use of Klett and Johnson's methods resulted in large bias or poor accuracy (often >5% bias or root mean square error) compared to our Iterative Mayfield or maximum likelihood methods. Overall, estimator performance was slightly better with our Iterative Mayfield than our maximum likelihood method, but the maximum likelihood method provides a more rigorous framework for testing covariates and explicitly models a heterogeneity factor. We demonstrated use of all estimators with data from emperor goose goslings. We advocate that future studies use the new methods outlined here rather than the traditional Mayfield method or its previous modifications.
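As a sketch of the Mayfield-type idea of constant survival estimated from irregular observation intervals, the code below maximizes a simple likelihood in which a subject surviving an interval of t days contributes s^t and a failure within the interval contributes 1 - s^t. This is not the authors' Iterative Mayfield or group-level maximum likelihood method, just the underlying constant-daily-survival MLE with invented check intervals.

```python
# Constant daily-survival MLE for irregularly spaced checks (Mayfield-style setting).
import numpy as np
from scipy.optimize import minimize_scalar

# (interval length in days, survived_interval) for each observation interval
intervals = [(4, True), (7, True), (3, False), (5, True), (6, False), (4, True)]

def neg_loglik(s):
    ll = 0.0
    for t, survived in intervals:
        p = s ** t                       # probability of surviving the whole interval
        ll += np.log(p if survived else 1.0 - p)
    return -ll

res = minimize_scalar(neg_loglik, bounds=(1e-6, 1 - 1e-6), method="bounded")
print("estimated daily survival:", res.x)
```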
Technical Note: Approximate Bayesian parameterization of a complex tropical forest model
NASA Astrophysics Data System (ADS)
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2013-08-01
Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity towards the choice and aggregation of model outputs and observed data (summary statistics), and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences of this approach to Approximate Bayesian Computing (ABC), another commonly used method to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can successfully be applied to process-based models of high complexity. The methodology is particularly suited to heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models in ecology and evolution.
Schwartzkopf, Wade C; Bovik, Alan C; Evans, Brian L
2005-12-01
Traditional chromosome imaging has been limited to grayscale images, but recently a 5-fluorophore combinatorial labeling technique (M-FISH) was developed wherein each class of chromosomes binds with a different combination of fluorophores. This results in a multispectral image, where each class of chromosomes has distinct spectral components. In this paper, we develop new methods for automatic chromosome identification by exploiting the multispectral information in M-FISH chromosome images and by jointly performing chromosome segmentation and classification. We (1) develop a maximum-likelihood hypothesis test that uses multispectral information, together with conventional criteria, to select the best segmentation possibility; (2) use this likelihood function to combine chromosome segmentation and classification into a robust chromosome identification system; and (3) show that the proposed likelihood function can also be used as a reliable indicator of errors in segmentation, errors in classification, and chromosome anomalies, which can be indicators of radiation damage, cancer, and a wide variety of inherited diseases. We show that the proposed multispectral joint segmentation-classification method outperforms past grayscale segmentation methods when decomposing touching chromosomes. We also show that it outperforms past M-FISH classification techniques that do not use segmentation information.
NASA Astrophysics Data System (ADS)
Sutawanir
2015-12-01
Mortality tables play an important role in actuarial studies such as life annuities, premium determination, premium reserves, pension plan valuation and pension funding. Some well-known mortality tables are the CSO mortality table, the Indonesian Mortality Table, the Bowers mortality table and the Japan Mortality Table. For actuarial applications, tables are constructed under different environments such as single decrement, double decrement and multiple decrement. There are two approaches to mortality table construction: a mathematical approach and a statistical approach. Distribution models and estimation theory are the statistical concepts used in mortality table construction. This article discusses the statistical approach to mortality table construction. The distributional assumptions are the uniform distribution of deaths (UDD) and constant force (exponential). Moment estimation and maximum likelihood are used to estimate the mortality parameter. Moment estimation methods are easier to manipulate than maximum likelihood estimation (MLE), but they do not use the complete mortality data. Maximum likelihood exploits all available information in mortality estimation, although some MLE equations are complicated and must be solved numerically. The article focuses on single-decrement estimation using moment and maximum likelihood estimation. An extension to double decrement is also introduced. A simple dataset is used to illustrate the mortality estimation and the resulting mortality table.
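A minimal sketch of single-decrement estimation under the constant-force assumption: with D deaths and central exposure E person-years in an age interval, the MLE of the force of mortality is D/E and the implied one-year death probability is q = 1 - exp(-D/E). The deaths and exposures below are illustrative only.

```python
# Constant-force (exponential) MLE of mortality rates per age group.
import numpy as np

deaths = np.array([12, 18, 30, 52])                      # deaths per age group (illustrative)
exposure = np.array([9800.0, 9650.0, 9400.0, 9000.0])    # central exposure in person-years

mu_hat = deaths / exposure                               # MLE of the force of mortality
q_hat = 1.0 - np.exp(-mu_hat)                            # implied one-year death probabilities

for age, q in zip(["40-41", "41-42", "42-43", "43-44"], q_hat):
    print(age, round(q, 5))
```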
An alternative method to measure the likelihood of a financial crisis in an emerging market
NASA Astrophysics Data System (ADS)
Özlale, Ümit; Metin-Özcan, Kıvılcım
2007-07-01
This paper utilizes an early warning system in order to measure the likelihood of a financial crisis in an emerging market economy. We introduce a methodology with which we can both obtain a likelihood series and analyze the time-varying effects of several macroeconomic variables on this likelihood. Since the issue is analyzed in a non-linear state space framework, the extended Kalman filter emerges as the optimal estimation algorithm. Taking the Turkish economy as our laboratory, the results indicate that both the derived likelihood measure and the estimated time-varying parameters are meaningful and can successfully explain the path that the Turkish economy followed between 2000 and 2006. The estimated parameters also suggest that an overvalued domestic currency, a current account deficit and an increase in default risk raise the likelihood of an economic crisis. Overall, the findings suggest that the estimation methodology introduced in this paper can also be applied to other emerging market economies.
Tests for detecting overdispersion in models with measurement error in covariates.
Yang, Yingsi; Wong, Man Yu
2015-11-30
Measurement error in covariates can affect the accuracy in count data modeling and analysis. In overdispersion identification, the true mean-variance relationship can be obscured under the influence of measurement error in covariates. In this paper, we propose three tests for detecting overdispersion when covariates are measured with error: a modified score test and two score tests based on the proposed approximate likelihood and quasi-likelihood, respectively. The proposed approximate likelihood is derived under the classical measurement error model, and the resulting approximate maximum likelihood estimator is shown to have superior efficiency. Simulation results also show that the score test based on approximate likelihood outperforms the test based on quasi-likelihood and other alternatives in terms of empirical power. By analyzing a real dataset containing the health-related quality-of-life measurements of a particular group of patients, we demonstrate the importance of the proposed methods by showing that the analyses with and without measurement error correction yield significantly different results. Copyright © 2015 John Wiley & Sons, Ltd.
Efficient simulation and likelihood methods for non-neutral multi-allele models.
Joyce, Paul; Genz, Alan; Buzbas, Erkan Ozge
2012-06-01
Throughout the 1980s, Simon Tavaré made numerous significant contributions to population genetics theory. As genetic data, in particular DNA sequence, became more readily available, a need to connect population-genetic models to data became the central issue. The seminal work of Griffiths and Tavaré (1994a, 1994b, 1994c) was among the first to develop a likelihood method to estimate the population-genetic parameters using full DNA sequences. Now, we are in the genomics era where methods need to scale up to handle massive data sets, and Tavaré has led the way to new approaches. However, performing statistical inference under non-neutral models has proved elusive. In tribute to Simon Tavaré, we present an article in the spirit of his work that provides a computationally tractable method for simulating and analyzing data under a class of non-neutral population-genetic models. Computational methods for approximating likelihood functions and generating samples under a class of allele-frequency based non-neutral parent-independent mutation models were proposed by Donnelly, Nordborg, and Joyce (DNJ) (Donnelly et al., 2001). DNJ (2001) simulated samples of allele frequencies from non-neutral models using neutral models as the auxiliary distribution in a rejection algorithm. However, patterns of allele frequencies produced by neutral models are dissimilar to patterns of allele frequencies produced by non-neutral models, making the rejection method inefficient. For example, in some cases the methods in DNJ (2001) require 10^9 rejections before a sample from the non-neutral model is accepted. Our method simulates samples directly from the distribution of non-neutral models, making simulation methods a practical tool to study the behavior of the likelihood and to perform inference on the strength of selection.
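The sketch below illustrates the rejection idea (and its inefficiency) in a deliberately simplified setting: allele frequencies are proposed from a neutral Dirichlet distribution and accepted with probability proportional to a selection weight. The mutation parameters, weight function and selection intensity are placeholders, not the DNJ model; increasing the selection intensity makes the acceptance rate collapse, which is the inefficiency noted above.

```python
# Rejection sampling of allele frequencies from a toy non-neutral target using a
# neutral Dirichlet proposal.
import numpy as np

rng = np.random.default_rng(0)
theta = np.ones(4)            # symmetric mutation parameters (neutral Dirichlet proposal)
sigma = 5.0                   # selection intensity; larger values make acceptance collapse

def selection_weight(p):
    # favours low heterozygosity (high sum of squared frequencies), as an example
    return np.exp(sigma * np.sum(p ** 2))

w_max = np.exp(sigma)         # bound on the weight, since sum(p^2) <= 1

accepted, tries = [], 0
while len(accepted) < 100:
    tries += 1
    p = rng.dirichlet(theta)
    if rng.uniform() < selection_weight(p) / w_max:
        accepted.append(p)

print(f"accepted 100 samples in {tries} proposals (acceptance rate {100 / tries:.4f})")
```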
A general methodology for maximum likelihood inference from band-recovery data
Conroy, M.J.; Williams, B.K.
1984-01-01
A numerical procedure is described for obtaining maximum likelihood estimates and associated maximum likelihood inference from band-recovery data. The method is used to illustrate previously developed one-age-class band-recovery models, and is extended to new models, including the analysis with a covariate for survival rates and variable-time-period recovery models. Extensions to R-age-class band-recovery, mark-recapture models, and twice-yearly marking are discussed. A FORTRAN program provides computations for these models.
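A hedged sketch of a one-age-class band-recovery likelihood with constant annual survival S and recovery rate f (a simplified Brownie-type multinomial model, not the paper's full framework or its FORTRAN implementation). Each banding cohort contributes a multinomial term over its recovery years plus a "never recovered" cell; the counts below are invented.

```python
# Maximum likelihood for a simplified band-recovery model with constant S and f.
import numpy as np
from scipy.optimize import minimize

banded = np.array([1000, 1200])                    # birds banded in each cohort (years 1 and 2)
recoveries = np.array([[60, 40, 25],               # cohort 1: recovered 0, 1, 2 years after banding
                       [70, 45,  0]])              # cohort 2: only 0 and 1 years observed so far

def neg_loglik(params):
    S = 1 / (1 + np.exp(-params[0]))               # annual survival, on the logit scale
    f = 1 / (1 + np.exp(-params[1]))               # annual recovery rate, on the logit scale
    ll = 0.0
    for i, n in enumerate(banded):
        yrs = recoveries.shape[1] - i              # recovery years available to cohort i
        p = f * S ** np.arange(yrs)                # P(recovered j years after banding)
        if p.sum() >= 1:                           # guard against invalid parameter values
            return 1e10
        r = recoveries[i, :yrs]
        ll += np.sum(r * np.log(p)) + (n - r.sum()) * np.log(1 - p.sum())
    return -ll

res = minimize(neg_loglik, x0=[0.0, -2.0], method="Nelder-Mead")
S_hat, f_hat = 1 / (1 + np.exp(-res.x))
print("estimated survival S =", round(S_hat, 3), " recovery rate f =", round(f_hat, 3))
```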
Kretschmer, Tina; Sentse, Miranda; Meeus, Wim; Verhulst, Frank C; Veenstra, René; Oldehinkel, Albertine J
2016-09-01
Adolescents' peer experiences embrace behavior, relationship quality, status, and victimization, but studies that account for multiple dimensions are rare. Using latent profile modeling and measures of peer behavior, relationship quality, peer status, and victimization assessed from 1,677 adolescents, four profiles were identified: High Quality, Low Quality, Low Quality Victimized, and Deviant Peers. Multinomial logistic regressions showed that negative parent-child relationships in preadolescence reduced the likelihood of High Quality peer relations in mid-adolescence but only partly differentiated between the other three profiles. Moderation by gender was partly found with girls showing greater sensitivity to parent-child relationship quality with respect to peer experiences. Results underline the multifaceted nature of peer experiences, and practical and theoretical implications are discussed. © 2015 The Authors. Journal of Research on Adolescence © 2015 Society for Research on Adolescence.
Maximum Likelihood Compton Polarimetry with the Compton Spectrometer and Imager
NASA Astrophysics Data System (ADS)
Lowell, A. W.; Boggs, S. E.; Chiu, C. L.; Kierans, C. A.; Sleator, C.; Tomsick, J. A.; Zoglauer, A. C.; Chang, H.-K.; Tseng, C.-H.; Yang, C.-Y.; Jean, P.; von Ballmoos, P.; Lin, C.-H.; Amman, M.
2017-10-01
Astrophysical polarization measurements in the soft gamma-ray band are becoming more feasible as detectors with high position and energy resolution are deployed. Previous work has shown that the minimum detectable polarization (MDP) of an ideal Compton polarimeter can be improved by ˜21% when an unbinned, maximum likelihood method (MLM) is used instead of the standard approach of fitting a sinusoid to a histogram of azimuthal scattering angles. Here we outline a procedure for implementing this maximum likelihood approach for real, nonideal polarimeters. As an example, we use the recent observation of GRB 160530A with the Compton Spectrometer and Imager. We find that the MDP for this observation is reduced by 20% when the MLM is used instead of the standard method.
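For illustration, the sketch below performs an unbinned maximum likelihood fit of the azimuthal scattering-angle distribution for an idealized polarimeter, with pdf(phi) = (1 + a cos 2(phi - phi0)) / (2 pi). A real instrument requires the detector response treatment discussed above; the sample, modulation amplitude and fitting choices here are synthetic placeholders.

```python
# Unbinned MLM fit of an azimuthal modulation curve for an idealized Compton polarimeter.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
true_a, true_phi0 = 0.4, 0.6

# draw azimuthal angles by rejection sampling from the modulated pdf
phi = []
while len(phi) < 5000:
    cand = rng.uniform(0, 2 * np.pi, size=1000)
    keep = rng.uniform(0, 1 + true_a, size=1000) < 1 + true_a * np.cos(2 * (cand - true_phi0))
    phi.extend(cand[keep])
phi = np.array(phi[:5000])

def neg_loglik(params):
    a, phi0 = params
    if not 0 <= a < 1:                   # keep the amplitude in a valid range
        return 1e10
    return -np.sum(np.log((1 + a * np.cos(2 * (phi - phi0))) / (2 * np.pi)))

fit = minimize(neg_loglik, x0=[0.1, 0.0], method="Nelder-Mead")
print("modulation amplitude, angle:", fit.x)
```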
Patch-based image reconstruction for PET using prior-image derived dictionaries
NASA Astrophysics Data System (ADS)
Tahaei, Marzieh S.; Reader, Andrew J.
2016-09-01
In PET image reconstruction, regularization is often needed to reduce the noise in the resulting images. Patch-based image processing techniques have recently been successfully used for regularization in medical image reconstruction through a penalized likelihood framework. Re-parameterization within reconstruction is another powerful regularization technique in which the object in the scanner is re-parameterized using coefficients for spatially-extensive basis vectors. In this work, a method for extracting patch-based basis vectors from the subject’s MR image is proposed. The coefficients for these basis vectors are then estimated using the conventional MLEM algorithm. Furthermore, using the alternating direction method of multipliers, an algorithm for optimizing the Poisson log-likelihood while imposing sparsity on the parameters is also proposed. This novel method is then utilized to find sparse coefficients for the patch-based basis vectors extracted from the MR image. The results indicate the superiority of the proposed methods to patch-based regularization using the penalized likelihood framework.
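As background for the reconstruction step, the following is a minimal MLEM sketch for a toy Poisson emission problem, using the standard multiplicative update. The system matrix is a random placeholder rather than a PET scanner model, and no MR-derived patch basis or ADMM sparsity step is included.

```python
# MLEM for y ~ Poisson(A x): x <- x / (A^T 1) * A^T (y / (A x)).
import numpy as np

rng = np.random.default_rng(7)
n_pix, n_bins = 64, 200
A = rng.uniform(0, 1, size=(n_bins, n_pix))      # toy system matrix
x_true = rng.uniform(0, 5, size=n_pix)
y = rng.poisson(A @ x_true)                      # simulated measured counts

x = np.ones(n_pix)                               # non-negative initialization
sens = A.T @ np.ones(n_bins)                     # sensitivity image A^T 1
for _ in range(100):
    ratio = y / np.maximum(A @ x, 1e-12)
    x = x / sens * (A.T @ ratio)                 # multiplicative MLEM update

print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```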
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang Shulian; Li Yexiong, E-mail: yexiong@yahoo.com; Song Yongwen
2011-07-15
Purpose: To evaluate the prognostic value of determining estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) expression in node-positive breast cancer patients treated with mastectomy. Methods and Materials: The records of 835 node-positive breast cancer patients who had undergone mastectomy between January 2000 and December 2004 were analyzed retrospectively. Of these, 764 patients (91.5%) received chemotherapy; 68 of 398 patients (20.9%) with T1-2N1 disease and 352 of 437 patients (80.5%) with T3-4 or N2-3 disease received postoperative radiotherapy. Patients were classified into four subgroups according to hormone receptor (Rec+ or Rec-) and HER2 expression profiles: Rec-/HER2- (triple negative; n = 141), Rec-/HER2+ (n = 99), Rec+/HER2+ (n = 157), and Rec+/HER2- (n = 438). The endpoints were the duration of locoregional recurrence-free survival, distant metastasis-free survival, disease-free survival, and overall survival. Results: Patients with triple-negative, Rec-/HER2+, and Rec+/HER2+ expression profiles had a significantly lower 5-year locoregional recurrence-free survival than those with Rec+/HER2- profiles (86.5% vs. 93.6%, p = 0.002). Compared with those with Rec+/HER2+ and Rec+/HER2- profiles, patients with Rec-/HER2- and Rec-/HER2+ profiles had significantly lower 5-year distant metastasis-free survival (69.1% vs. 78.5%, p = 0.000), lower disease-free survival (66.6% vs. 75.6%, p = 0.000), and lower overall survival (71.4% vs. 84.2%, p = 0.000). Triple-negative or Rec-/HER2+ breast cancers had an increased likelihood of relapse and death within the first 3 years after treatment. Conclusions: Triple-negative and HER2-positive profiles are useful markers of prognosis for locoregional recurrence and survival in node-positive breast cancer patients treated with mastectomy.
Free energy reconstruction from steered dynamics without post-processing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Athenes, Manuel, E-mail: Manuel.Athenes@cea.f; Condensed Matter and Materials Division, Physics and Life Sciences Directorate, LLNL, Livermore, CA 94551; Marinica, Mihai-Cosmin
2010-09-20
Various methods achieving importance sampling in ensembles of nonequilibrium trajectories enable one to estimate free energy differences and, by maximum-likelihood post-processing, to reconstruct free energy landscapes. Here, based on Bayes theorem, we propose a more direct method in which a posterior likelihood function is used both to construct the steered dynamics and to infer the contribution to equilibrium of all the sampled states. The method is implemented with two steering schedules. First, using non-autonomous steering, we calculate the migration barrier of the vacancy in α-Fe. Second, using an autonomous scheduling related to metadynamics and equivalent to temperature-accelerated molecular dynamics, we accurately reconstruct the two-dimensional free energy landscape of the 38-atom Lennard-Jones cluster as a function of an orientational bond-order parameter and energy, down to the solid-solid structural transition temperature of the cluster and without maximum-likelihood post-processing.
2017-01-01
Introduction In South Africa, the rate of HIV in the sex worker (SW) population is exceedingly high, but critical gaps exist in our understanding of SWs and the factors that make them vulnerable to HIV. This study aimed to estimate HIV prevalence among female sex workers (FSWs) in Soweto, South Africa, and to describe their sexual behavior and other factors associated with HIV infection. Methods A cross-sectional, respondent-driven sampling (RDS) recruitment methodology was used to enroll 508 FSWs based in Soweto. Data were collected using a survey instrument, followed by two HIV rapid tests. Raw and RDS adjusted data were analyzed using a chi-squared test of association and multivariate logistic regression to show factors associated with HIV infection. Findings HIV prevalence among FSWs was 53.6% (95% CI 47.5–59.9). FSWs were almost exclusively based in taverns (85.6%) and hostels (52.0%). Less than a quarter (24.4%) were under 25 years of age. Non-partner violence was reported by 55.5%, 59.6% of whom were HIV-infected. Advancing age, incomplete secondary schooling, migrancy and multiple clients increased the likelihood of HIV acquisition: >30 years of age was associated with a 4.9 times (95% CI 2.6–9.3) increased likelihood of HIV; incomplete secondary schooling almost tripled the likelihood (AOR 2.8, 95% CI 1.6–5.0); being born outside of the Gauteng province increased the likelihood of HIV 2.3 times (95% CI 1.3–4.0); and having more than five clients per day almost doubled the likelihood (AOR 1.9, 95% CI 1.1–3.2). Conclusion Our findings highlight the extreme vulnerability of FSWs to HIV. Advancing age, limited education and multiple clients were risk factors associated with HIV, strongly driven by a combination of structural, biological and behavioral determinants. Evidence suggests that interventions need to be carefully tailored to the varying profiles of SW populations across South Africa. Soweto could be considered a microcosm of South Africa in terms of the epidemic of violence and HIV experienced by the SW population, which is influenced by factors often beyond an individual level of control. While describing a hitherto largely undocumented population of FSWs, our findings confirm the urgent need to scale up innovative HIV prevention and treatment programs for this population. PMID:28981511
Modelling small-area inequality in premature mortality using years of life lost rates
NASA Astrophysics Data System (ADS)
Congdon, Peter
2013-04-01
Analysis of premature mortality variations via standardized expected years of life lost (SEYLL) measures raises questions about suitable modelling for mortality data, especially when developing SEYLL profiles for areas with small populations. Existing fixed effects estimation methods take no account of correlations in mortality levels over ages, causes, socio-ethnic groups or areas. They also do not specify an underlying data generating process, or a likelihood model that can include trends or correlations, and are likely to produce unstable estimates for small areas. An alternative strategy involves a fully specified data generation process, and a random effects model which "borrows strength" to produce stable SEYLL estimates, allowing for correlations between ages, areas and socio-ethnic groups. The resulting modelling strategy is applied to gender-specific differences in SEYLL rates in small areas in NE London, and to cause-specific mortality for leading causes of premature mortality in these areas.
A Selective Overview of Variable Selection in High Dimensional Feature Space
Fan, Jianqing
2010-01-01
High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality. They have been widely applied for simultaneously selecting important variables and estimating their effects in high dimensional statistical inference. In this article, we present a brief account of the recent developments of theory, methods, and implementations for high dimensional variable selection. Questions of what limits of dimensionality such methods can handle, what role the penalty functions play, and what the statistical properties of these methods are rapidly drive the advances of the field. The properties of non-concave penalized likelihood and its roles in high dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high dimensional variable selection, with emphasis on independence screening and two-scale methods. PMID:21572976
Ye, Xin; Garikapati, Venu M.; You, Daehyun; ...
2017-11-08
Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basis of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.
Guindon, Stéphane; Dufayard, Jean-François; Lefort, Vincent; Anisimova, Maria; Hordijk, Wim; Gascuel, Olivier
2010-05-01
PhyML is a phylogeny software based on the maximum-likelihood principle. Early PhyML versions used a fast algorithm performing nearest neighbor interchanges to improve a reasonable starting tree topology. Since the original publication (Guindon S., Gascuel O. 2003. A simple, fast and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696-704), PhyML has been widely used (>2500 citations in ISI Web of Science) because of its simplicity and a fair compromise between accuracy and speed. In the meantime, research around PhyML has continued, and this article describes the new algorithms and methods implemented in the program. First, we introduce a new algorithm to search the tree space with user-defined intensity using subtree pruning and regrafting topological moves. The parsimony criterion is used here to filter out the least promising topology modifications with respect to the likelihood function. The analysis of a large collection of real nucleotide and amino acid data sets of various sizes demonstrates the good performance of this method. Second, we describe a new test to assess the support of the data for internal branches of a phylogeny. This approach extends the recently proposed approximate likelihood-ratio test and relies on a nonparametric, Shimodaira-Hasegawa-like procedure. A detailed analysis of real alignments sheds light on the links between this new approach and the more classical nonparametric bootstrap method. Overall, our tests show that the last version (3.0) of PhyML is fast, accurate, stable, and ready to use. A Web server and binary files are available from http://www.atgc-montpellier.fr/phyml/.
Ferreira, António Miguel; Marques, Hugo; Tralhão, António; Santos, Miguel Borges; Santos, Ana Rita; Cardoso, Gonçalo; Dores, Hélder; Carvalho, Maria Salomé; Madeira, Sérgio; Machado, Francisco Pereira; Cardim, Nuno; de Araújo Gonçalves, Pedro
2016-11-01
Current guidelines recommend the use of the Modified Diamond-Forrester (MDF) method to assess the pre-test likelihood of obstructive coronary artery disease (CAD). We aimed to compare the performance of the MDF method with two contemporary algorithms derived from multicenter trials that additionally incorporate cardiovascular risk factors: the calculator-based 'CAD Consortium 2' method, and the integer-based CONFIRM score. We assessed 1069 consecutive patients without known CAD undergoing coronary CT angiography (CCTA) for stable chest pain. Obstructive CAD was defined as the presence of coronary stenosis ≥50% on 64-slice dual-source CT. The three methods were assessed for calibration, discrimination, net reclassification, and changes in proposed downstream testing based upon calculated pre-test likelihoods. The observed prevalence of obstructive CAD was 13.8% (n=147). Overestimations of the likelihood of obstructive CAD were 140.1%, 9.8%, and 18.8%, respectively, for the MDF, CAD Consortium 2 and CONFIRM methods. The CAD Consortium 2 showed greater discriminative power than the MDF method, with a C-statistic of 0.73 vs. 0.70 (p<0.001), while the CONFIRM score did not (C-statistic 0.71, p=0.492). Reclassification of pre-test likelihood using the 'CAD Consortium 2' or CONFIRM scores resulted in a net reclassification improvement of 0.19 and 0.18, respectively, which would change the diagnostic strategy in approximately half of the patients. Newer risk factor-encompassing models allow for a more precise estimation of pre-test probabilities of obstructive CAD than the guideline-recommended MDF method. Adoption of these scores may improve disease prediction and change the diagnostic pathway in a significant proportion of patients. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Calculating the weight of evidence in low-template forensic DNA casework.
Lohmueller, Kirk E; Rudin, Norah
2013-01-01
Interpreting and assessing the weight of low-template DNA evidence presents a formidable challenge in forensic casework. This report describes a case in which a similar mixed DNA profile was obtained from four different bloodstains. The defense proposed that the low-level minor profile came from an alternate suspect, the defendant's mistress. The strength of the evidence was assessed using a probabilistic approach that employed likelihood ratios incorporating the probability of allelic drop-out. Logistic regression was used to model the probability of drop-out using empirical validation data from the government laboratory. The DNA profile obtained from the bloodstain described in this report is at least 47 billion times more likely if, in addition to the victim, the alternate suspect was the minor contributor, than if another unrelated individual was the minor contributor. This case illustrates the utility of the probabilistic approach for interpreting complex low-template DNA profiles. © 2012 American Academy of Forensic Sciences.
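To make the drop-out modelling step above concrete, the following minimal Python sketch fits a logistic regression of allelic drop-out on DNA template amount using simulated data; the variable names and numbers are illustrative, and this is not the case-specific model or the laboratory's calibration data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Hypothetical validation data: DNA template amount (pg) for each typed allele,
# and whether that allele dropped out of the profile.
template = rng.uniform(10, 200, size=300)
p_drop = 1.0 / (1.0 + np.exp(0.08 * (template - 60)))   # drop-out more likely at low template
dropout = rng.binomial(1, p_drop)

# Logistic regression of drop-out on template amount, as in the empirical
# calibration step; the fitted Pr(D) would then enter the likelihood ratio.
X = sm.add_constant(template)
fit = sm.Logit(dropout, X).fit(disp=False)
print(fit.params)
print("Pr(D) at 25 pg and 100 pg:", fit.predict(sm.add_constant(np.array([25.0, 100.0]))))
```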
Model-based clustering for RNA-seq data.
Si, Yaqing; Liu, Peng; Li, Pinghua; Brutnell, Thomas P
2014-01-15
RNA-seq technology has been widely adopted as an attractive alternative to microarray-based methods to study global gene expression. However, robust statistical tools to analyze these complex datasets are still lacking. By grouping genes with similar expression profiles across treatments, cluster analysis provides insight into gene functions and networks, and hence is an important technique for RNA-seq data analysis. In this manuscript, we derive clustering algorithms based on appropriate probability models for RNA-seq data. An expectation-maximization algorithm and another two stochastic versions of expectation-maximization algorithms are described. In addition, a strategy for initialization based on likelihood is proposed to improve the clustering algorithms. Moreover, we present a model-based hybrid-hierarchical clustering method to generate a tree structure that allows visualization of relationships among clusters as well as flexibility of choosing the number of clusters. Results from both simulation studies and analysis of a maize RNA-seq dataset show that our proposed methods provide better clustering results than alternative methods such as the K-means algorithm and hierarchical clustering methods that are not based on probability models. An R package, MBCluster.Seq, has been developed to implement our proposed algorithms. This R package provides fast computation and is publicly available at http://www.r-project.org
The orbital PDF: general inference of the gravitational potential from steady-state tracers
NASA Astrophysics Data System (ADS)
Han, Jiaxin; Wang, Wenting; Cole, Shaun; Frenk, Carlos S.
2016-02-01
We develop two general methods to infer the gravitational potential of a system using steady-state tracers, i.e. tracers with a time-independent phase-space distribution. Combined with the phase-space continuity equation, the time independence implies a universal orbital probability density function (oPDF) dP(λ|orbit) ∝ dt, where λ is the coordinate of the particle along the orbit. The oPDF is equivalent to Jeans theorem, and is the key physical ingredient behind most dynamical modelling of steady-state tracers. In the case of a spherical potential, we develop a likelihood estimator that fits analytical potentials to the system and a non-parametric method ('phase-mark') that reconstructs the potential profile, both assuming only the oPDF. The methods involve no extra assumptions about the tracer distribution function and can be applied to tracers with any arbitrary distribution of orbits, with possible extension to non-spherical potentials. The methods are tested on Monte Carlo samples of steady-state tracers in dark matter haloes to show that they are unbiased as well as efficient. A fully documented C/Python code implementing our method is freely available at a GitHub repository linked from http://icc.dur.ac.uk/data/#oPDF.
Extreme data compression for the CMB
NASA Astrophysics Data System (ADS)
Zablocki, Alan; Dodelson, Scott
2016-04-01
We apply the Karhunen-Loève methods to cosmic microwave background (CMB) data sets, and show that we can recover the input cosmology and obtain the marginalized likelihoods in Λ cold dark matter cosmologies in under a minute, much faster than Markov chain Monte Carlo methods. This is achieved by forming a linear combination of the power spectra at each multipole l, and solving a system of simultaneous equations such that the Fisher matrix is locally unchanged. Instead of carrying out a full likelihood evaluation over the whole parameter space, we need to evaluate the likelihood only for the parameter of interest, with the data compression effectively marginalizing over all other parameters. The weighting vectors contain insight about the physical effects of the parameters on the CMB anisotropy power spectrum Cl. The shape and amplitude of these vectors give an intuitive feel for the physics of the CMB, the sensitivity of the observed spectrum to cosmological parameters, and the relative sensitivity of different experiments to cosmological parameters. We test this method on exact theory Cl as well as on a Wilkinson Microwave Anisotropy Probe (WMAP)-like CMB data set generated from a random realization of a fiducial cosmology, comparing the compression results to those from a full likelihood analysis using CosmoMC. After showing that the method works, we apply it to the temperature power spectrum from the WMAP seven-year data release, and discuss the successes and limitations of our method as applied to a real data set.
Maximum Likelihood Shift Estimation Using High Resolution Polarimetric SAR Clutter Model
NASA Astrophysics Data System (ADS)
Harant, Olivier; Bombrun, Lionel; Vasile, Gabriel; Ferro-Famil, Laurent; Gay, Michel
2011-03-01
This paper deals with a Maximum Likelihood (ML) shift estimation method in the context of High Resolution (HR) Polarimetric SAR (PolSAR) clutter. Texture modeling is exposed and the generalized ML texture tracking method is extended to the merging of various sensors. Some results on displacement estimation on the Argentiere glacier in the Mont Blanc massif using dual-pol TerraSAR-X (TSX) and quad-pol RADARSAT-2 (RS2) sensors are finally discussed.
Online dating and conjugal bereavement.
Young, Dannagal Goldthwaite; Caplan, Scott E.
2010-08-01
This study examined self-presentation in the online dating profiles of 241 widowed and 280 divorced individuals between 18 and 40 years old. A content analysis of open-ended user-generated profiles assessed the presence or absence of various themes, including the user's marital status, the backstory of their lost relationship, and whether they engaged in sense-making regarding that lost relationship. Results indicated that about one-third of widowed individuals discussed their loss in their profiles. In addition, about one-third of the widowed profiles included explicit reference to a philosophy of life, and about 16% mentioned sense-making or cognitive reappraisals of their bereavement. Many profiles included some articulation of a vision of a future partnership. Results also revealed a significant correlation between widowed individuals including a backstory and their likelihood of exhibiting sense-making in their profiles. Finally, unlike the widowed users, divorcees provided much briefer mentions of their lost relationships, used less sense-making language, and were less likely to articulate an explicit vision of future partnerships. Overall, the results suggest that for widowed individuals, online dating sites may function as venues to explore their past experiences and engage in the construction of a post-loss identity or a post-loss "ideal self".
Robust Multipoint Water-Fat Separation Using Fat Likelihood Analysis
Yu, Huanzhou; Reeder, Scott B.; Shimakawa, Ann; McKenzie, Charles A.; Brittain, Jean H.
2016-01-01
Fat suppression is an essential part of routine MRI scanning. Multiecho chemical-shift based water-fat separation methods estimate and correct for B0 field inhomogeneity. However, they must contend with the intrinsic challenge of water-fat ambiguity that can result in water-fat swapping. This problem arises because the signals from two chemical species, when both are modeled as a single discrete spectral peak, may appear indistinguishable in the presence of B0 off-resonance. In conventional methods, the water-fat ambiguity is typically removed by enforcing field map smoothness using region growing based algorithms. In reality, the fat spectrum has multiple spectral peaks. Using this spectral complexity, we introduce a novel concept that identifies water and fat for multiecho acquisitions by exploiting the spectral differences between water and fat. A fat likelihood map is produced to indicate if a pixel is likely to be water-dominant or fat-dominant by comparing the fitting residuals of two different signal models. The fat likelihood analysis and field map smoothness provide complementary information, and we designed an algorithm (Fat Likelihood Analysis for Multiecho Signals) to exploit both mechanisms. It is demonstrated in a wide variety of data that the Fat Likelihood Analysis for Multiecho Signals algorithm offers highly robust water-fat separation for 6-echo acquisitions, particularly in some previously challenging applications. PMID:21842498
DOE Office of Scientific and Technical Information (OSTI.GOV)
Skjoth-Rasmussen, Jane, E-mail: jane@skjoeth-rasmussen.d; Roed, Henrik; Ohlhues, Lars
2010-06-01
Purpose: Gamma knife centers are predominant in publishing results on arteriovenous malformation (AVM) treatments, including reports on risk profile. However, many patients are treated using a linear accelerator, most of them at smaller centers. Because this setting is different from a large gamma knife center, the risk profile at Linac departments could be different from the reported experience. Prescribed radiation doses are dependent on AVM volume. This study details results from a medium-sized Linac department focusing on risk profiles. Method and Materials: A database was searched for all patients with AVMs. We included 50 consecutive patients with a minimum of 24 months follow-up (24-51 months). Results: AVM occlusion was verified in 78% of patients (39/50). AVM occlusion without new deficits (excellent outcome) was obtained in 44%. Good or fair outcome (AVM occlusion with mild or moderate new deficits) was seen in 30%. Severe complications after AVM occlusion occurred in 4% with a median interval of 15 months after treatment (range, 1-26 months). Conclusions: We applied an AVM grading score developed at the Mayo Clinic to predict probable outcome after radiosurgery in a large patient population treated with Gamma knife. A cutoff at a score of 1.5 could not discriminate the likelihood of having an excellent outcome (approximately 45% both above and below the cutoff). The chance of having an excellent or good outcome was slightly higher in patients with an AVM score below 1.5 (64% vs. 57%).
NASA Astrophysics Data System (ADS)
Peng, Juan-juan; Wang, Jian-qiang; Yang, Wu-E.
2017-01-01
In this paper, multi-criteria decision-making (MCDM) problems based on the qualitative flexible multiple criteria method (QUALIFLEX), in which the criteria values are expressed by multi-valued neutrosophic information, are investigated. First, multi-valued neutrosophic sets (MVNSs), which allow the truth-membership function, indeterminacy-membership function and falsity-membership function to have a set of crisp values between zero and one, are introduced. Then the likelihood of multi-valued neutrosophic number (MVNN) preference relations is defined and the corresponding properties are also discussed. Finally, an extended QUALIFLEX approach based on likelihood is explored to solve MCDM problems where the assessments of alternatives are in the form of MVNNs; furthermore, an example is provided to illustrate the application of the proposed method, together with a comparative analysis.
Maximum Likelihood Compton Polarimetry with the Compton Spectrometer and Imager
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lowell, A. W.; Boggs, S. E; Chiu, C. L.
2017-10-20
Astrophysical polarization measurements in the soft gamma-ray band are becoming more feasible as detectors with high position and energy resolution are deployed. Previous work has shown that the minimum detectable polarization (MDP) of an ideal Compton polarimeter can be improved by ∼21% when an unbinned, maximum likelihood method (MLM) is used instead of the standard approach of fitting a sinusoid to a histogram of azimuthal scattering angles. Here we outline a procedure for implementing this maximum likelihood approach for real, nonideal polarimeters. As an example, we use the recent observation of GRB 160530A with the Compton Spectrometer and Imager. We find that the MDP for this observation is reduced by 20% when the MLM is used instead of the standard method.
Estimation of parameters of dose volume models and their confidence limits
NASA Astrophysics Data System (ADS)
van Luijk, P.; Delvigne, T. C.; Schilstra, C.; Schippers, J. M.
2003-07-01
Predictions of the normal-tissue complication probability (NTCP) for the ranking of treatment plans are based on fits of dose-volume models to clinical and/or experimental data. In the literature several different fit methods are used. In this work frequently used methods and techniques to fit NTCP models to dose response data for establishing dose-volume effects, are discussed. The techniques are tested for their usability with dose-volume data and NTCP models. Different methods to estimate the confidence intervals of the model parameters are part of this study. From a critical-volume (CV) model with biologically realistic parameters a primary dataset was generated, serving as the reference for this study and describable by the NTCP model. The CV model was fitted to this dataset. From the resulting parameters and the CV model, 1000 secondary datasets were generated by Monte Carlo simulation. All secondary datasets were fitted to obtain 1000 parameter sets of the CV model. Thus the 'real' spread in fit results due to statistical spreading in the data is obtained and has been compared with estimates of the confidence intervals obtained by different methods applied to the primary dataset. The confidence limits of the parameters of one dataset were estimated using the methods, employing the covariance matrix, the jackknife method and directly from the likelihood landscape. These results were compared with the spread of the parameters, obtained from the secondary parameter sets. For the estimation of confidence intervals on NTCP predictions, three methods were tested. Firstly, propagation of errors using the covariance matrix was used. Secondly, the meaning of the width of a bundle of curves that resulted from parameters that were within the one standard deviation region in the likelihood space was investigated. Thirdly, many parameter sets and their likelihood were used to create a likelihood-weighted probability distribution of the NTCP. It is concluded that for the type of dose response data used here, only a full likelihood analysis will produce reliable results. The often-used approximations, such as the usage of the covariance matrix, produce inconsistent confidence limits on both the parameter sets and the resulting NTCP values.
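The role of the likelihood landscape described above can be illustrated with a small profile-likelihood sketch. The Python code below simulates binary dose-response data, fits a two-parameter logistic NTCP-style model (the parameter names TD50 and k are placeholders, not those of the paper), and reads a 95% confidence interval for TD50 off the profile log-likelihood; it is a conceptual sketch under these assumptions, not a reimplementation of the CV model analysis.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

rng = np.random.default_rng(1)

# Simulated dose-response data: complication probability rises with dose.
dose = rng.uniform(20, 80, size=120)
p_true = 1.0 / (1.0 + np.exp(-0.15 * (dose - 55.0)))   # "true" TD50 = 55, slope k = 0.15
response = rng.binomial(1, p_true)

def negloglik(td50, k):
    p = 1.0 / (1.0 + np.exp(-k * (dose - td50)))
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.sum(response * np.log(p) + (1 - response) * np.log(1 - p))

def profile_nll(td50):
    # Maximize the likelihood over the nuisance parameter k for a fixed TD50.
    return minimize_scalar(lambda k: negloglik(td50, k),
                           bounds=(1e-3, 1.0), method="bounded").fun

grid = np.linspace(40, 70, 121)
prof = np.array([profile_nll(t) for t in grid])
best = prof.min()

# Keep TD50 values whose profile log-likelihood is within chi2(1 df, 0.95)/2 of the maximum.
inside = grid[prof <= best + chi2.ppf(0.95, df=1) / 2]
print("Profile-likelihood 95%% CI for TD50: [%.1f, %.1f]" % (inside.min(), inside.max()))
```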
Incorrect likelihood methods were used to infer scaling laws of marine predator search behaviour.
Edwards, Andrew M; Freeman, Mervyn P; Breed, Greg A; Jonsen, Ian D
2012-01-01
Ecologists are collecting extensive data concerning movements of animals in marine ecosystems. Such data need to be analysed with valid statistical methods to yield meaningful conclusions. We demonstrate methodological issues in two recent studies that reached similar conclusions concerning movements of marine animals (Nature 451:1098; Science 332:1551). The first study analysed vertical movement data to conclude that diverse marine predators (Atlantic cod, basking sharks, bigeye tuna, leatherback turtles and Magellanic penguins) exhibited "Lévy-walk-like behaviour", close to a hypothesised optimal foraging strategy. By reproducing the original results for the bigeye tuna data, we show that the likelihood of tested models was calculated from residuals of regression fits (an incorrect method), rather than from the likelihood equations of the actual probability distributions being tested. This resulted in erroneous Akaike Information Criteria, and the testing of models that do not correspond to valid probability distributions. We demonstrate how this led to overwhelming support for a model that has no biological justification and that is statistically spurious because its probability density function goes negative. Re-analysis of the bigeye tuna data, using standard likelihood methods, overturns the original result and conclusion for that data set. The second study observed Lévy walk movement patterns by mussels. We demonstrate several issues concerning the likelihood calculations (including the aforementioned residuals issue). Re-analysis of the data rejects the original Lévy walk conclusion. We consequently question the claimed existence of scaling laws of the search behaviour of marine predators and mussels, since such conclusions were reached using incorrect methods. We discourage the suggested potential use of "Lévy-like walks" when modelling consequences of fishing and climate change, and caution that any resulting advice to managers of marine ecosystems would be problematic. For reproducibility and future work we provide R source code for all calculations.
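A minimal sketch of the central point, computing likelihoods from the probability distributions themselves rather than from regression residuals, is given below; the simulated step-length data, the bounded power-law form and the exponential alternative are illustrative assumptions, not the data or exact models of the original studies.

```python
import numpy as np

rng = np.random.default_rng(2)
xmin, mu_true, n = 1.0, 2.5, 500

# Simulated "step lengths" following a bounded-below power law f(x) = (mu-1) xmin^(mu-1) x^(-mu).
x = xmin * np.exp(rng.exponential(1.0 / (mu_true - 1.0), size=n))

# Analytical maximum-likelihood estimates (not regression on a histogram or on residuals).
mu_hat = 1.0 + n / np.sum(np.log(x / xmin))
lam_hat = 1.0 / np.mean(x - xmin)          # exponential alternative, shifted to xmin

# Log-likelihoods of the two candidate distributions at their MLEs, then AIC.
ll_pow = n * np.log(mu_hat - 1) + n * (mu_hat - 1) * np.log(xmin) - mu_hat * np.sum(np.log(x))
ll_exp = n * np.log(lam_hat) - lam_hat * np.sum(x - xmin)
aic_pow = 2 * 1 - 2 * ll_pow
aic_exp = 2 * 1 - 2 * ll_exp
print("mu_hat = %.2f,  AIC(power law) = %.1f,  AIC(exponential) = %.1f"
      % (mu_hat, aic_pow, aic_exp))
```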
NASA Astrophysics Data System (ADS)
Goodman, Steven N.
1989-11-01
This dissertation explores the use of a mathematical measure of statistical evidence, the log likelihood ratio, in clinical trials. The methods and thinking behind the use of an evidential measure are contrasted with traditional methods of analyzing data, which depend primarily on a p-value as an estimate of the statistical strength of an observed data pattern. It is contended that neither the behavioral dictates of Neyman-Pearson hypothesis testing methods, nor the coherency dictates of Bayesian methods are realistic models on which to base inference. The use of the likelihood alone is applied to four aspects of trial design or conduct: the calculation of sample size, the monitoring of data, testing for the equivalence of two treatments, and meta-analysis--the combining of results from different trials. Finally, a more general model of statistical inference, using belief functions, is used to see if it is possible to separate the assessment of evidence from our background knowledge. It is shown that traditional and Bayesian methods can be modeled as two ends of a continuum of structured background knowledge, methods which summarize evidence at the point of maximum likelihood assuming no structure, and Bayesian methods assuming complete knowledge. Both schools are seen to be missing a concept of ignorance- -uncommitted belief. This concept provides the key to understanding the problem of sampling to a foregone conclusion and the role of frequency properties in statistical inference. The conclusion is that statistical evidence cannot be defined independently of background knowledge, and that frequency properties of an estimator are an indirect measure of uncommitted belief. Several likelihood summaries need to be used in clinical trials, with the quantitative disparity between summaries being an indirect measure of our ignorance. This conclusion is linked with parallel ideas in the philosophy of science and cognitive psychology.
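As a toy illustration of the log likelihood ratio as an evidential summary, the following sketch compares two hypothesized response rates on hypothetical interim trial data; all numbers are invented, and the monitoring, equivalence and meta-analysis settings discussed above are omitted.

```python
import math

# Hypothetical interim data: 18 responses out of 40 patients on the new treatment.
n, k = 40, 18
p0, p1 = 0.30, 0.50          # null and alternative response rates being compared

def binom_loglik(p):
    # Binomial log-likelihood up to a constant; the binomial coefficient cancels in the ratio.
    return k * math.log(p) + (n - k) * math.log(1 - p)

log_lr = binom_loglik(p1) - binom_loglik(p0)
print("log likelihood ratio (H1 vs H0): %.2f" % log_lr)
print("likelihood ratio: %.1f" % math.exp(log_lr))
# The likelihood ratio says how many times more probable the data are under p = 0.50
# than under p = 0.30; no error rates or prior probabilities are invoked.
```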
SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.
Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman
2017-03-01
We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).
NASA Astrophysics Data System (ADS)
Aminah, Agustin Siti; Pawitan, Gandhi; Tantular, Bertho
2017-03-01
So far, most of the data published by Statistics Indonesia (BPS) as the provider of national statistics are still limited to the district level. Sample sizes at smaller area levels are insufficient, so direct estimation of poverty indicators produces high standard errors, and analyses based on such estimates are unreliable. To solve this problem, an estimation method that provides better accuracy by combining survey data with other auxiliary data is required. One approach often used for this purpose is Small Area Estimation (SAE). Among the many methods used in SAE is Empirical Best Linear Unbiased Prediction (EBLUP). The EBLUP method based on maximum likelihood (ML) procedures does not account for the loss of degrees of freedom due to estimating β with β̂. This drawback motivates the use of the restricted maximum likelihood (REML) procedure. This paper proposes EBLUP with the REML procedure for estimating poverty indicators by modeling the average household expenditure per capita, and implements a bootstrap procedure to calculate the MSE (mean square error) in order to compare the accuracy of the EBLUP method with that of direct estimation. Results show that the EBLUP method reduces the MSE in small area estimation.
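A minimal sketch of the REML-based mixed-model step underlying EBLUP is given below, using simulated area-level data and statsmodels; the variable names are placeholders, and the survey weighting and bootstrap MSE comparison described above are omitted.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)

# Simulated household expenditure data nested within small areas.
areas = np.repeat(np.arange(30), 20)
area_effect = rng.normal(0, 0.3, size=30)[areas]
aux = rng.normal(0, 1, size=areas.size)                 # auxiliary covariate
y = 2.0 + 0.5 * aux + area_effect + rng.normal(0, 0.5, size=areas.size)
df = pd.DataFrame({"log_expenditure": y, "aux": aux, "area": areas})

# Random-intercept model fitted by REML, avoiding the ML degrees-of-freedom loss noted above.
model = smf.mixedlm("log_expenditure ~ aux", df, groups=df["area"])
result = model.fit(reml=True)
print(result.summary())

# Predicted (EBLUP-style) area effects = estimated random intercepts for the first few areas.
re = result.random_effects
print({g: float(v.iloc[0]) for g, v in list(re.items())[:3]})
```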
Lopatka, Martin; Sigman, Michael E; Sjerps, Marjan J; Williams, Mary R; Vivó-Truyols, Gabriel
2015-07-01
Forensic chemical analysis of fire debris addresses the question of whether ignitable liquid residue is present in a sample and, if so, what type. Evidence evaluation regarding this question is complicated by interference from pyrolysis products of the substrate materials present in a fire. A method is developed to derive a set of class-conditional features for the evaluation of such complex samples. The use of a forensic reference collection allows characterization of the variation in complex mixtures of substrate materials and ignitable liquids even when the dominant feature is not specific to an ignitable liquid. Making use of a novel method for data imputation under complex mixing conditions, a distribution is modeled for the variation between pairs of samples containing similar ignitable liquid residues. Examining the covariance of variables within the different classes allows different weights to be placed on features more important in discerning the presence of a particular ignitable liquid residue. Performance of the method is evaluated using a database of total ion spectrum (TIS) measurements of ignitable liquid and fire debris samples. These measurements include 119 nominal masses measured by GC-MS and averaged across a chromatographic profile. Ignitable liquids are labeled using the American Society for Testing and Materials (ASTM) E1618 standard class definitions. Statistical analysis is performed in the class-conditional feature space wherein new forensic traces are represented based on their likeness to known samples contained in a forensic reference collection. The demonstrated method uses forensic reference data as the basis of probabilistic statements concerning the likelihood of the obtained analytical results given the presence of ignitable liquid residue of each of the ASTM classes (including a substrate only class). When prior probabilities of these classes can be assumed, these likelihoods can be connected to class probabilities. In order to compare the performance of this method to previous work, a uniform prior was assumed, resulting in an 81% accuracy for an independent test of 129 real burn samples. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Gill, P.; Gusmão, L.; Haned, H.; Mayr, W.R.; Morling, N.; Parson, W.; Prieto, L.; Prinz, M.; Schneider, H.; Schneider, P.M.; Weir, B.S.
2015-01-01
DNA profiling of biological material from scenes of crimes is often complicated because the amount of DNA is limited and the quality of the DNA may be compromised. Furthermore, the sensitivity of STR typing kits has been continuously improved to detect low level DNA traces. This may lead to (1) partial DNA profiles and (2) detection of additional alleles. There are two key phenomena to consider: allelic or locus ‘drop-out’, i.e. ‘missing’ alleles at one or more genetic loci, while ‘drop-in’ may explain alleles in the DNA profile that are additional to the assumed main contributor(s). The drop-in phenomenon is restricted to 1 or 2 alleles per profile. If multiple alleles are observed at more than two loci then these are considered as alleles from an extra contributor and analysis can proceed as a mixture of two or more contributors. Here, we give recommendations on how to estimate probabilities considering drop-out, Pr(D), and drop-in, Pr(C). For reasons of clarity, we have deliberately restricted the current recommendations considering drop-out and/or drop-in at only one locus. Furthermore, we offer recommendations on how to use Pr(D) and Pr(C) with the likelihood ratio principles that are generally recommended by the International Society of Forensic Genetics (ISFG) as measure of the weight of the evidence in forensic genetics. Examples of calculations are included. An Excel spreadsheet is provided so that scientists and laboratories may explore the models and input their own data. PMID:22864188
Methods to estimate the between‐study variance and its uncertainty in meta‐analysis†
Jackson, Dan; Viechtbauer, Wolfgang; Bender, Ralf; Bowden, Jack; Knapp, Guido; Kuss, Oliver; Higgins, Julian PT; Langan, Dean; Salanti, Georgia
2015-01-01
Meta‐analyses are typically used to estimate the overall mean of an outcome of interest. However, inference about between‐study variability, which is typically modelled using a between‐study variance parameter, is usually an additional aim. The DerSimonian and Laird method, currently widely used by default to estimate the between‐study variance, has been long challenged. Our aim is to identify known methods for estimation of the between‐study variance and its corresponding uncertainty, and to summarise the simulation and empirical evidence that compares them. We identified 16 estimators for the between‐study variance, seven methods to calculate confidence intervals, and several comparative studies. Simulation studies suggest that for both dichotomous and continuous data the estimator proposed by Paule and Mandel and for continuous data the restricted maximum likelihood estimator are better alternatives to estimate the between‐study variance. Based on the scenarios and results presented in the published studies, we recommend the Q‐profile method and the alternative approach based on a ‘generalised Cochran between‐study variance statistic’ to compute corresponding confidence intervals around the resulting estimates. Our recommendations are based on a qualitative evaluation of the existing literature and expert consensus. Evidence‐based recommendations require an extensive simulation study where all methods would be compared under the same scenarios. © 2015 The Authors. Research Synthesis Methods published by John Wiley & Sons Ltd. PMID:26332144
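The following sketch, with made-up study effect estimates and variances, illustrates two of the between-study variance estimators mentioned (DerSimonian-Laird and Paule-Mandel) together with a Q-profile confidence interval; it is a simplified illustration, not the simulation study behind the recommendations.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2

# Hypothetical study-level effect estimates and within-study variances.
y = np.array([0.05, 0.60, 0.20, 0.85, -0.20, 0.50, 0.15])
v = np.array([0.04, 0.09, 0.05, 0.12, 0.06, 0.10, 0.07])
k = len(y)

def gen_q(tau2):
    # Generalised Cochran Q statistic with weights 1 / (v_i + tau^2).
    w = 1.0 / (v + tau2)
    ybar = np.sum(w * y) / np.sum(w)
    return np.sum(w * (y - ybar) ** 2)

# DerSimonian-Laird (moment) estimator.
w0 = 1.0 / v
c = np.sum(w0) - np.sum(w0 ** 2) / np.sum(w0)
tau2_dl = max(0.0, (gen_q(0.0) - (k - 1)) / c)

# Paule-Mandel estimator: the tau^2 at which the generalised Q equals its degrees of freedom.
tau2_pm = 0.0 if gen_q(0.0) <= k - 1 else brentq(lambda t: gen_q(t) - (k - 1), 0.0, 10.0)

# Q-profile 95% CI: tau^2 values where gen_q hits the chi-square quantiles.
lo_q, hi_q = chi2.ppf([0.975, 0.025], df=k - 1)
ci_lo = 0.0 if gen_q(0.0) <= lo_q else brentq(lambda t: gen_q(t) - lo_q, 0.0, 10.0)
ci_hi = 0.0 if gen_q(0.0) <= hi_q else brentq(lambda t: gen_q(t) - hi_q, 0.0, 100.0)

print("tau^2 (DL) = %.4f, tau^2 (PM) = %.4f, Q-profile 95%% CI = [%.4f, %.4f]"
      % (tau2_dl, tau2_pm, ci_lo, ci_hi))
```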
Biasogram: Visualization of Confounding Technical Bias in Gene Expression Data
Krzystanek, Marcin; Szallasi, Zoltan; Eklund, Aron C.
2013-01-01
Gene expression profiles of clinical cohorts can be used to identify genes that are correlated with a clinical variable of interest such as patient outcome or response to a particular drug. However, expression measurements are susceptible to technical bias caused by variation in extraneous factors such as RNA quality and array hybridization conditions. If such technical bias is correlated with the clinical variable of interest, the likelihood of identifying false positive genes is increased. Here we describe a method to visualize an expression matrix as a projection of all genes onto a plane defined by a clinical variable and a technical nuisance variable. The resulting plot indicates the extent to which each gene is correlated with the clinical variable or the technical variable. We demonstrate this method by applying it to three clinical trial microarray data sets, one of which identified genes that may have been driven by a confounding technical variable. This approach can be used as a quality control step to identify data sets that are likely to yield false positive results. PMID:23613961
The Atacama Cosmology Telescope (ACT): Beam Profiles and First SZ Cluster Maps
NASA Technical Reports Server (NTRS)
Hincks, A. D.; Acquaviva, V.; Ade, P. A.; Aguirre, P.; Amiri, M.; Appel, J. W.; Barrientos, L. F.; Battistelli, E. S.; Bond, J. R.; Brown, B.;
2010-01-01
The Atacama Cosmology Telescope (ACT) is currently observing the cosmic microwave background with arcminute resolution at 148 GHz, 218 GHz, and 277 GHz. In this paper, we present ACT's first results. Data have been analyzed using a maximum-likelihood map-making method which uses B-splines to model and remove the atmospheric signal. It has been used to make high-precision beam maps from which we determine the experiment's window functions. This beam information directly impacts all subsequent analyses of the data. We also used the method to map a sample of galaxy clusters via the Sunyaev-Zel'dovich (SZ) effect, and show five clusters previously detected with X-ray or SZ observations. We provide integrated Compton-y measurements for each cluster. Of particular interest is our detection of the z = 0.44 component of A3128 and our current non-detection of the low-redshift part, providing strong evidence that the further cluster is more massive as suggested by X-ray measurements. This is a compelling example of the redshift-independent mass selection of the SZ effect.
Li, Dongming; Sun, Changming; Yang, Jinhua; Liu, Huan; Peng, Jiaqi; Zhang, Lijuan
2017-04-06
An adaptive optics (AO) system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods.
Maximum likelihood estimation for life distributions with competing failure modes
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1979-01-01
Systems that are placed on test at time zero, function for a period, and die at some random time were studied. Failure may be due to one of several causes or modes. The parameters of the life distribution may depend upon the levels of various stress variables the item is subject to. Maximum likelihood estimation methods are discussed. Specific methods are reported for the smallest extreme-value distributions of life. Monte-Carlo results indicate the methods to be promising. Under appropriate conditions, the location parameters are nearly unbiased, the scale parameter is slightly biased, and the asymptotic covariances are rapidly approached.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yorita, Kohei
2005-03-01
We have measured the top quark mass with the dynamical likelihood method (DLM) using the CDF II detector at the Fermilab Tevatron. The Tevatron produces top and anti-top pairs in pp̄ collisions at a center of mass energy of 1.96 TeV. The data sample used in this paper was accumulated from March 2002 through August 2003, which corresponds to an integrated luminosity of 162 pb⁻¹.
An evaluation of percentile and maximum likelihood estimators of Weibull parameters
Stanley J. Zarnoch; Tommy R. Dell
1985-01-01
Two methods of estimating the three-parameter Weibull distribution were evaluated by computer simulation and field data comparison. Maximum likelihood estimators (MLE) with bias correction were calculated with the computer routine FITTER (Bailey 1974); percentile estimators (PCT) were those proposed by Zanakis (1979). The MLE estimators had superior smaller bias and...
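A minimal sketch of the maximum-likelihood side of such a comparison is shown below, using scipy's three-parameter Weibull fit on simulated data; it does not include the bias correction of the FITTER routine or the percentile estimators, and all numbers are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Simulated "tree diameter" data from a three-parameter Weibull
# (shape c = 2.2, location = 5.0, scale = 12.0).
data = stats.weibull_min.rvs(2.2, loc=5.0, scale=12.0, size=200, random_state=rng)

# Maximum-likelihood fit of all three parameters (location can be fixed with floc= if unstable).
c_hat, loc_hat, scale_hat = stats.weibull_min.fit(data)
print("MLE: shape=%.2f location=%.2f scale=%.2f" % (c_hat, loc_hat, scale_hat))

# Log-likelihood at the fitted parameters, useful when comparing estimators.
ll = np.sum(stats.weibull_min.logpdf(data, c_hat, loc=loc_hat, scale=scale_hat))
print("log-likelihood at MLE: %.2f" % ll)
```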
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
A Note on Three Statistical Tests in the Logistic Regression DIF Procedure
ERIC Educational Resources Information Center
Paek, Insu
2012-01-01
Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…
Contributions to the Underlying Bivariate Normal Method for Factor Analyzing Ordinal Data
ERIC Educational Resources Information Center
Xi, Nuo; Browne, Michael W.
2014-01-01
A promising "underlying bivariate normal" approach was proposed by Jöreskog and Moustaki for use in the factor analysis of ordinal data. This was a limited information approach that involved the maximization of a composite likelihood function. Its advantage over full-information maximum likelihood was that very much less computation was…
Expected versus Observed Information in SEM with Incomplete Normal and Nonnormal Data
ERIC Educational Resources Information Center
Savalei, Victoria
2010-01-01
Maximum likelihood is the most common estimation method in structural equation modeling. Standard errors for maximum likelihood estimates are obtained from the associated information matrix, which can be estimated from the sample using either expected or observed information. It is known that, with complete data, estimates based on observed or…
Evaluation of Smoking Prevention Television Messages Based on the Elaboration Likelihood Model
ERIC Educational Resources Information Center
Flynn, Brian S.; Worden, John K.; Bunn, Janice Yanushka; Connolly, Scott W.; Dorwaldt, Anne L.
2011-01-01
Progress in reducing youth smoking may depend on developing improved methods to communicate with higher risk youth. This study explored the potential of smoking prevention messages based on the Elaboration Likelihood Model (ELM) to address these needs. Structured evaluations of 12 smoking prevention messages based on three strategies derived from…
Can, Seda; van de Schoot, Rens; Hox, Joop
2015-06-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated in the within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influence of the size of the intraclass correlation coefficient (ICC) and of the estimation method (maximum likelihood estimation with robust chi-squares and standard errors versus Bayesian estimation) on the convergence rate is investigated. The other variables of interest were the rate of inadmissible solutions and the relative parameter and standard error bias on the between level. The results showed that inadmissible solutions were obtained when there was between-level collinearity and the estimation method was maximum likelihood. In the within-level multicollinearity condition, all of the solutions were admissible but the bias values were higher compared with the between-level collinearity condition. Bayesian estimation appeared to be robust in obtaining admissible parameters but the relative bias was higher than for maximum likelihood estimation. Finally, as expected, high ICC produced less biased results compared to medium ICC conditions.
NASA Astrophysics Data System (ADS)
Zeng, X.
2015-12-01
A large number of model executions are required to obtain alternative conceptual models' predictions and their posterior probabilities in Bayesian model averaging (BMA). The posterior model probability is estimated through a model's marginal likelihood and prior probability. The heavy computational burden hinders the implementation of BMA prediction, especially for elaborate marginal likelihood estimators. To overcome the computational burden of BMA, an adaptive sparse grid (SG) stochastic collocation method is used to build surrogates for alternative conceptual models through a numerical experiment on a synthetic groundwater model. BMA predictions depend on model posterior weights (or marginal likelihoods), and this study also evaluated four marginal likelihood estimators, including the arithmetic mean estimator (AME), harmonic mean estimator (HME), stabilized harmonic mean estimator (SHME), and thermodynamic integration estimator (TIE). The results demonstrate that TIE is accurate in estimating conceptual models' marginal likelihoods, and BMA-TIE has better predictive performance than the other BMA predictions. TIE is highly stable for estimating a conceptual model's marginal likelihood: marginal likelihoods repeatedly estimated by TIE show significantly less variability than those estimated by the other estimators. In addition, the SG surrogates are efficient in facilitating BMA predictions, especially for BMA-TIE. The number of model executions needed for building surrogates is 4.13%, 6.89%, 3.44%, and 0.43% of the model executions required by BMA-AME, BMA-HME, BMA-SHME, and BMA-TIE, respectively.
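As a simplified illustration of how marginal likelihood estimators can differ, the sketch below contrasts an arithmetic mean estimator (over prior draws) and a harmonic mean estimator (over posterior draws) in a conjugate normal-mean toy model where the exact marginal likelihood is available for reference; it is not the groundwater model or the SG surrogate workflow of the study.

```python
import numpy as np
from scipy import stats
from scipy.special import logsumexp

rng = np.random.default_rng(5)

# Conjugate toy model: y_i ~ N(theta, sigma^2) with known sigma, prior theta ~ N(0, tau^2).
sigma, tau, n = 1.0, 2.0, 25
y = rng.normal(0.8, sigma, size=n)
ybar = y.mean()

def loglik(theta):
    theta = np.atleast_1d(theta)
    return np.sum(stats.norm.logpdf(y[:, None], loc=theta, scale=sigma), axis=0)

# Exact log marginal likelihood via log m(y) = log f(y|t) + log pi(t) - log pi(t|y), any t.
post_var = 1.0 / (n / sigma**2 + 1.0 / tau**2)
post_mean = post_var * n * ybar / sigma**2
t0 = post_mean
log_m_exact = (loglik(t0)[0] + stats.norm.logpdf(t0, 0.0, tau)
               - stats.norm.logpdf(t0, post_mean, np.sqrt(post_var)))

# Arithmetic mean estimator (AME): average the likelihood over prior draws.
prior_draws = rng.normal(0.0, tau, size=20000)
log_ame = logsumexp(loglik(prior_draws)) - np.log(prior_draws.size)

# Harmonic mean estimator (HME): harmonic mean of the likelihood over posterior draws.
post_draws = rng.normal(post_mean, np.sqrt(post_var), size=20000)
log_hme = -(logsumexp(-loglik(post_draws)) - np.log(post_draws.size))

print("exact log m(y) = %.3f, AME = %.3f, HME = %.3f" % (log_m_exact, log_ame, log_hme))
```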
Christensen, Ole F
2012-12-03
Single-step methods provide a coherent and conceptually simple approach to incorporate genomic information into genetic evaluations. An issue with single-step methods is compatibility between the marker-based relationship matrix for genotyped animals and the pedigree-based relationship matrix. Therefore, it is necessary to adjust the marker-based relationship matrix to the pedigree-based relationship matrix. Moreover, with data from routine evaluations, this adjustment should in principle be based on both observed marker genotypes and observed phenotypes, but until now this has been overlooked. In this paper, I propose a new method to address this issue by 1) adjusting the pedigree-based relationship matrix to be compatible with the marker-based relationship matrix instead of the reverse and 2) extending the single-step genetic evaluation using a joint likelihood of observed phenotypes and observed marker genotypes. The performance of this method is then evaluated using two simulated datasets. The method derived here is a single-step method in which the marker-based relationship matrix is constructed assuming all allele frequencies equal to 0.5 and the pedigree-based relationship matrix is constructed using the unusual assumption that animals in the base population are related and inbred with a relationship coefficient γ and an inbreeding coefficient γ / 2. Taken together, this γ parameter and a parameter that scales the marker-based relationship matrix can handle the issue of compatibility between marker-based and pedigree-based relationship matrices. The full log-likelihood function used for parameter inference contains two terms. The first term is the REML-log-likelihood for the phenotypes conditional on the observed marker genotypes, whereas the second term is the log-likelihood for the observed marker genotypes. Analyses of the two simulated datasets with this new method showed that 1) the parameters involved in adjusting marker-based and pedigree-based relationship matrices can depend on both observed phenotypes and observed marker genotypes and 2) a strong association between these two parameters exists. Finally, this method performed at least as well as a method based on adjusting the marker-based relationship matrix. Using the full log-likelihood and adjusting the pedigree-based relationship matrix to be compatible with the marker-based relationship matrix provides a new and interesting approach to handle the issue of compatibility between the two matrices in single-step genetic evaluation.
Billsten, Johan; Fridell, Mats; Holmberg, Robert; Ivarsson, Andréas
2018-01-01
Organizational climate and related factors are associated with outcome and are as such of vital interest for healthcare organizations. Organizational Readiness for Change (ORC) is the questionnaire used in the present study to assess the influence of organizational factors on implementation success. The respondents were employed in one of 203 Swedish municipalities within social work and psychiatric substance/abuse treatment services. They took part in a nationwide implementation project organized by the Swedish Association of Local Authorities and Regions (SALAR), commissioned by the Swedish National Board of Health and Welfare. The aims were: (a) to identify classes (clusters) of employees with different ORC profiles on the basis of data collected in 2011 and (b) to investigate ORC profiles which predicted the use of assessment instruments, therapy methods and collaborative activities in 2011 and 2013. The evaluation study applied a naturalistic design with registration of outcome at consecutive assessments. The participants were contacted via official e-mail addresses in their respective healthcare units and were encouraged by their officials to participate on a voluntary basis. Descriptive statistics were obtained using SPSS version 23. A latent profile analysis (LPA) using Mplus 7.3 was performed with a robust maximum likelihood estimator (MLR) to identify subgroups (clusters) based on the 18 ORC indexes. A total of 2402 employees responded to the survey, of whom 1794 (74.7%) completed the ORC scores. Descriptive analysis indicated that the respondents were a homogenous group of employees, where women (72.0%) formed the majority. Cronbach's alpha for the 18 ORC indexes ranged from α=0.67 to α=0.78. A principal component analysis yielded a four-factor solution explaining 62% of the variance in total ORC scores. The factors were: motivational readiness (α=0.64), institutional resources (α=0.52), staff attributes (α=0.76), and organizational climate (α=0.74). An LPA analysis of the four factors with their three distinct profiles provided the best data fit: Profile 3 (n=614), Profile 2 (n=934), and Profile 1 (n=246). Respondents with the most favorable ORC scores (Profile 3) used significantly more instruments and more treatment methods and had a better collaborating network in 2011 as well as in 2013 compared to members in Profile 1, the least successful profile. In a large sample of social work and healthcare professionals, ORC scores reflecting higher institutional resources, staff attributes and organizational climate and lower motivational readiness for change were associated with a successful implementation of good practice guidelines for the care and treatment of substance users in Sweden. Low motivational readiness as a construct may indicate satisfaction with the present situation. As ORC proved to be an indicator of successful dissemination of evidence-based guidelines into routine and specialist healthcare, it can be used to tailor interventions to individual employees or services and to improve the dissemination of and compliance with guidelines for the treatment of substance users. Copyright © 2017 Elsevier Inc. All rights reserved.
Maximum-likelihood fitting of data dominated by Poisson statistical uncertainties
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stoneking, M.R.; Den Hartog, D.J.
1996-06-01
The fitting of data by χ²-minimization is valid only when the uncertainties in the data are normally distributed. When analyzing spectroscopic or particle counting data at very low signal level (e.g., a Thomson scattering diagnostic), the uncertainties are distributed with a Poisson distribution. The authors have developed a maximum-likelihood method for fitting data that correctly treats the Poisson statistical character of the uncertainties. This method maximizes the total probability that the observed data are drawn from the assumed fit function, using the Poisson probability function to determine the probability for each data point. The algorithm also returns uncertainty estimates for the fit parameters. They compare this method with a χ²-minimization routine applied to both simulated and real data. Differences in the returned fits are greater at low signal level (less than approximately 20 counts per measurement). The maximum-likelihood method is found to be more accurate and robust, returning a narrower distribution of values for the fit parameters with fewer outliers.
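A minimal sketch of the idea (not the authors' code): instead of minimizing χ², minimize the negative Poisson log-likelihood of the model counts. The Gaussian line-plus-background model, parameter names, and simulated spectrum below are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def neg_poisson_loglike(params, x, counts, model):
    """Negative Poisson log-likelihood (data-only constant terms dropped)."""
    mu = model(x, *params)
    if np.any(mu <= 0):
        return np.inf
    return np.sum(mu - counts * np.log(mu))

def gaussian_line(x, amp, center, width, background):
    return background + amp * np.exp(-0.5 * ((x - center) / width) ** 2)

# illustrative low-count spectrum
rng = np.random.default_rng(0)
x = np.linspace(-5, 5, 60)
counts = rng.poisson(gaussian_line(x, 8.0, 0.3, 1.2, 0.5))

fit = minimize(neg_poisson_loglike, x0=[5.0, 0.0, 1.0, 1.0],
               args=(x, counts, gaussian_line), method="Nelder-Mead")
print(fit.x)   # amplitude, centre, width, background estimates
```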
Love, Jeffrey J.; Rigler, E. Joshua; Pulkkinen, Antti; Riley, Pete
2015-01-01
An examination is made of the hypothesis that the statistics of magnetic-storm-maximum intensities are the realization of a log-normal stochastic process. Weighted least-squares and maximum-likelihood methods are used to fit log-normal functions to −Dst storm-time maxima for years 1957-2012; bootstrap analysis is used to establish confidence limits on forecasts. Both methods provide fits that are reasonably consistent with the data; both methods also provide fits that are superior to those that can be made with a power-law function. In general, the maximum-likelihood method provides forecasts having tighter confidence intervals than those provided by weighted least-squares. From extrapolation of maximum-likelihood fits: a magnetic storm with intensity exceeding that of the 1859 Carrington event, −Dst≥850 nT, occurs about 1.13 times per century, with a wide 95% confidence interval of [0.42, 2.41] times per century; a 100-yr magnetic storm is identified as having −Dst≥880 nT (greater than Carrington), with a wide 95% confidence interval of [490, 1187] nT.
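A minimal sketch of the maximum-likelihood step and the extrapolation to exceedance rates, assuming the storm maxima, event count, and catalogue length shown are hypothetical placeholders. Bootstrap confidence limits, as described above, could be added by resampling the maxima and repeating the fit.

```python
import numpy as np
from scipy.stats import norm

def lognormal_mle(maxima):
    """MLE of a log-normal fit to storm-time -Dst maxima."""
    logs = np.log(maxima)
    return logs.mean(), logs.std(ddof=0)          # mu, sigma of log(-Dst)

def events_per_century(threshold, mu, sigma, n_events, n_years):
    """Expected number of storms per century exceeding `threshold` (nT)."""
    tail = norm.sf((np.log(threshold) - mu) / sigma)
    return 100.0 * (n_events / n_years) * tail

maxima = np.array([95, 120, 150, 210, 240, 330, 422, 589])   # hypothetical values
mu, sigma = lognormal_mle(maxima)
print(events_per_century(850.0, mu, sigma, len(maxima), 56))
```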
Zhao, Xing; Zhou, Xiao-Hua; Feng, Zijian; Guo, Pengfei; He, Hongyan; Zhang, Tao; Duan, Lei; Li, Xiaosong
2013-01-01
As a useful tool for geographical cluster detection of events, the spatial scan statistic is widely applied in many fields and plays an increasingly important role. The classic version of the spatial scan statistic for binary outcomes was developed by Kulldorff, based on the Bernoulli or the Poisson probability model. In this paper, we apply the Hypergeometric probability model to construct the likelihood function under the null hypothesis. Compared with existing methods, the likelihood function under the null hypothesis provides an alternative and indirect way to identify potential clusters, and the test statistic is the extreme value of this likelihood function. As in Kulldorff's methods, we adopt a Monte Carlo test for the assessment of significance. Both methods are applied for detecting spatial clusters of Japanese encephalitis in Sichuan province, China, in 2009, and the detected clusters are identical. Simulations with independent benchmark data indicate that the test statistic based on the Hypergeometric model outperforms Kulldorff's statistics for clusters of high population density or large size; otherwise, Kulldorff's statistics are superior.
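A hedged sketch of the general scheme described here: the test statistic is the extreme (smallest) value of the null-hypothesis hypergeometric likelihood over candidate windows, and significance is assessed by Monte Carlo redistribution of cases. The window construction, integer per-region counts, and function names are assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.stats import hypergeom

def scan_statistic(cases, population, windows):
    """Smallest null hypergeometric likelihood over candidate windows.

    cases, population: integer per-region counts; windows: lists of region indices.
    """
    C, N = cases.sum(), population.sum()
    liks = [hypergeom.pmf(cases[w].sum(), N, C, population[w].sum())
            for w in windows]
    return min(liks)

def monte_carlo_pvalue(cases, population, windows, n_sim=999, seed=1):
    rng = np.random.default_rng(seed)
    observed = scan_statistic(cases, population, windows)
    C = cases.sum()
    exceed = 0
    for _ in range(n_sim):
        # redistribute the C cases at random over individuals under the null
        sim = rng.multivariate_hypergeometric(population, C)
        if scan_statistic(sim, population, windows) <= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)
```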
A Poisson Log-Normal Model for Constructing Gene Covariation Network Using RNA-seq Data.
Choi, Yoonha; Coram, Marc; Peng, Jie; Tang, Hua
2017-07-01
Constructing expression networks using transcriptomic data is an effective approach for studying gene regulation. A popular approach for constructing such a network is based on the Gaussian graphical model (GGM), in which an edge between a pair of genes indicates that the expression levels of these two genes are conditionally dependent, given the expression levels of all other genes. However, GGMs are not appropriate for non-Gaussian data, such as those generated in RNA-seq experiments. We propose a novel statistical framework that maximizes a penalized likelihood, in which the observed count data follow a Poisson log-normal distribution. To overcome the computational challenges, we use Laplace's method to approximate the likelihood and its gradients, and apply the alternating direction method of multipliers to find the penalized maximum likelihood estimates. The proposed method is evaluated and compared with GGMs using both simulated and real RNA-seq data. The proposed method shows improved performance in detecting edges that represent covarying pairs of genes, particularly for edges connecting low-abundance genes and edges around regulatory hubs.
Superfast maximum-likelihood reconstruction for quantum tomography
NASA Astrophysics Data System (ADS)
Shang, Jiangwei; Zhang, Zhengyun; Ng, Hui Khoon
2017-06-01
Conventional methods for computing maximum-likelihood estimators (MLE) often converge slowly in practical situations, leading to a search for simplifying methods that rely on additional assumptions for their validity. In this work, we provide a fast and reliable algorithm for maximum-likelihood reconstruction that avoids this slow convergence. Our method utilizes the state-of-the-art convex optimization scheme, an accelerated projected-gradient method, that allows one to accommodate the quantum nature of the problem in a different way than in the standard methods. We demonstrate the power of our approach by comparing its performance with other algorithms for n -qubit state tomography. In particular, an eight-qubit situation that purportedly took weeks of computation time in 2005 can now be completed in under a minute for a single set of data, with far higher accuracy than previously possible. This refutes the common claim that MLE reconstruction is slow and reduces the need for alternative methods that often come with difficult-to-verify assumptions. In fact, recent methods assuming Gaussian statistics or relying on compressed sensing ideas are demonstrably inapplicable for the situation under consideration here. Our algorithm can be applied to general optimization problems over the quantum state space; the philosophy of projected gradients can further be utilized for optimization contexts with general constraints.
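A minimal sketch of a plain (non-accelerated) projected-gradient ascent on the tomographic log-likelihood, to show the projection idea the abstract refers to: after each gradient step the matrix is projected back onto the set of density matrices by projecting its eigenvalues onto the probability simplex. The POVM, counts, step size, and iteration count are illustrative assumptions; the paper's accelerated scheme is not reproduced here.

```python
import numpy as np

def project_to_density_matrix(H):
    """Project a Hermitian matrix onto {rho >= 0, tr(rho) = 1} in Frobenius norm."""
    w, V = np.linalg.eigh((H + H.conj().T) / 2)
    u = np.sort(w)[::-1]                                   # simplex projection of eigenvalues
    css = np.cumsum(u)
    k = np.nonzero(u + (1 - css) / np.arange(1, len(u) + 1) > 0)[0][-1]
    tau = (css[k] - 1) / (k + 1)
    lam = np.clip(w - tau, 0, None)
    return (V * lam) @ V.conj().T

def mle_projected_gradient(counts, povm, dim, step=0.1, n_iter=500):
    """Plain projected-gradient MLE for state tomography (illustrative only)."""
    counts = np.asarray(counts, dtype=float)
    rho = np.eye(dim) / dim
    for _ in range(n_iter):
        probs = np.real([np.trace(E @ rho) for E in povm])
        grad = sum(n / p * E for n, p, E in zip(counts, probs, povm))
        rho = project_to_density_matrix(rho + step * grad / counts.sum())
    return rho
```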
Langholz, Bryan; Thomas, Duncan C.; Stovall, Marilyn; Smith, Susan A.; Boice, John D.; Shore, Roy E.; Bernstein, Leslie; Lynch, Charles F.; Zhang, Xinbo; Bernstein, Jonine L.
2009-01-01
Methods for the analysis of individually matched case-control studies with location-specific radiation dose and tumor location information are described. These include likelihood methods for analyses that use only cases with precise tumor location information and methods that also include cases with imprecise tumor location information. The theory establishes that each of these likelihood-based methods estimates the same radiation rate ratio parameters, within the context of the appropriate model for location and subject-level covariate effects. The underlying assumptions are characterized and the potential strengths and limitations of each method are described. The methods are illustrated and compared using the WECARE study of radiation and asynchronous contralateral breast cancer. PMID:18647297
Jackson, Dan; White, Ian R; Riley, Richard D
2013-01-01
Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213
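For orientation, a short sketch of the univariate DerSimonian-Laird moment estimator that the proposed multivariate method reduces to in a single dimension; the inputs (effect estimates and within-study variances) are placeholders.

```python
import numpy as np

def dersimonian_laird(y, v):
    """Univariate DerSimonian-Laird estimate of between-study variance.

    y: study effect estimates; v: within-study variances.
    """
    w = 1.0 / v
    y_fixed = np.sum(w * y) / np.sum(w)
    Q = np.sum(w * (y - y_fixed) ** 2)                  # Cochran's Q
    k = len(y)
    denom = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (Q - (k - 1)) / denom)              # moment estimator
    w_star = 1.0 / (v + tau2)                           # random-effects weights
    mu = np.sum(w_star * y) / np.sum(w_star)            # pooled estimate
    se = np.sqrt(1.0 / np.sum(w_star))
    return tau2, mu, se
```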
Reyes-Valdés, M H; Stelly, D M
1995-01-01
Frequencies of meiotic configurations in cytogenetic stocks are dependent on chiasma frequencies in segments defined by centromeres, breakpoints, and telomeres. The expectation maximization algorithm is proposed as a general method to perform maximum likelihood estimations of the chiasma frequencies in the intervals between such locations. The estimates can be translated via mapping functions into genetic maps of cytogenetic landmarks. One set of observational data was analyzed to exemplify application of these methods, results of which were largely concordant with other comparable data. The method was also tested by Monte Carlo simulation of frequencies of meiotic configurations from a monotelodisomic translocation heterozygote, assuming six different sample sizes. The estimate averages were always close to the values given initially to the parameters. The maximum likelihood estimation procedures can be extended readily to other kinds of cytogenetic stocks and allow the pooling of diverse cytogenetic data to collectively estimate lengths of segments, arms, and chromosomes. PMID:7568226
DECONV-TOOL: An IDL based deconvolution software package
NASA Technical Reports Server (NTRS)
Varosi, F.; Landsman, W. B.
1992-01-01
There are a variety of algorithms for deconvolution of blurred images, each having its own criteria or statistic to be optimized in order to estimate the original image data. Using the Interactive Data Language (IDL), we have implemented the Maximum Likelihood, Maximum Entropy, Maximum Residual Likelihood, and sigma-CLEAN algorithms in a unified environment called DeConv_Tool. Most of the algorithms have as their goal the optimization of statistics such as standard deviation and mean of residuals. Shannon entropy, log-likelihood, and chi-square of the residual auto-correlation are computed by DeConv_Tool for the purpose of determining the performance and convergence of any particular method and comparisons between methods. DeConv_Tool allows interactive monitoring of the statistics and the deconvolved image during computation. The final results, and optionally, the intermediate results, are stored in a structure convenient for comparison between methods and review of the deconvolution computation. The routines comprising DeConv_Tool are available via anonymous FTP through the IDL Astronomy User's Library.
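DeConv_Tool itself is an IDL package; as a rough, language-independent illustration of the maximum-likelihood option for Poisson-noise images, the standard Richardson-Lucy iteration can be sketched as below (a generic sketch, not the DeConv_Tool code).

```python
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(image, psf, n_iter=50, eps=1e-12):
    """Maximum-likelihood (Richardson-Lucy) deconvolution for Poisson noise."""
    estimate = np.full_like(image, image.mean(), dtype=float)
    psf_mirror = psf[::-1, ::-1]
    for _ in range(n_iter):
        blurred = fftconvolve(estimate, psf, mode="same")
        ratio = image / np.maximum(blurred, eps)       # data / model
        estimate *= fftconvolve(ratio, psf_mirror, mode="same")
    return estimate
```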
English, Sangeeta B.; Shih, Shou-Ching; Ramoni, Marco F.; Smith, Lois E.; Butte, Atul J.
2014-01-01
Though genome-wide technologies, such as microarrays, are widely used, data from these methods are considered noisy; there is still varied success in downstream biological validation. We report a method that increases the likelihood of successfully validating microarray findings using real time RT-PCR, including genes at low expression levels and with small differences. We use a Bayesian network to identify the most relevant sources of noise based on the successes and failures in validation for an initial set of selected genes, and then improve our subsequent selection of genes for validation based on eliminating these sources of noise. The network displays the significant sources of noise in an experiment, and scores the likelihood of validation for every gene. We show how the method can significantly increase validation success rates. In conclusion, in this study, we have successfully added a new automated step to determine the contributory sources of noise that determine successful or unsuccessful downstream biological validation. PMID:18790084
Thaden, Joshua T; Mogno, Ilaria; Wierzbowski, Jamey; Cottarel, Guillaume; Kasif, Simon; Collins, James J; Gardner, Timothy S
2007-01-01
Machine learning approaches offer the potential to systematically identify transcriptional regulatory interactions from a compendium of microarray expression profiles. However, experimental validation of the performance of these methods at the genome scale has remained elusive. Here we assess the global performance of four existing classes of inference algorithms using 445 Escherichia coli Affymetrix arrays and 3,216 known E. coli regulatory interactions from RegulonDB. We also developed and applied the context likelihood of relatedness (CLR) algorithm, a novel extension of the relevance networks class of algorithms. CLR demonstrates an average precision gain of 36% relative to the next-best performing algorithm. At a 60% true positive rate, CLR identifies 1,079 regulatory interactions, of which 338 were in the previously known network and 741 were novel predictions. We tested the predicted interactions for three transcription factors with chromatin immunoprecipitation, confirming 21 novel interactions and verifying our RegulonDB-based performance estimates. CLR also identified a regulatory link providing central metabolic control of iron transport, which we confirmed with real-time quantitative PCR. The compendium of expression data compiled in this study, coupled with RegulonDB, provides a valuable model system for further improvement of network inference algorithms using experimental data. PMID:17214507
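A compact sketch of the CLR scoring step, assuming a precomputed gene-by-gene mutual-information matrix: each pairwise value is z-scored against the background distributions of both genes and the two (non-negative) z-scores are combined. This follows the usual description of CLR as an extension of relevance networks, but details may differ from the published implementation.

```python
import numpy as np

def clr_scores(mi):
    """Context likelihood of relatedness scores from a mutual-information matrix.

    mi: symmetric (genes x genes) mutual-information matrix.
    """
    mean = mi.mean(axis=1, keepdims=True)
    std = mi.std(axis=1, keepdims=True)
    z_row = np.clip((mi - mean) / std, 0, None)   # z-score against gene i's background
    z_col = z_row.T                               # z-score against gene j's background
    return np.sqrt(z_row ** 2 + z_col ** 2)       # combined CLR edge score
```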
Investigation into the performance of different models for predicting stutter.
Bright, Jo-Anne; Curran, James M; Buckleton, John S
2013-07-01
In this paper we have examined five possible models for the behaviour of the stutter ratio, SR. These were two log-normal models, two gamma models, and a two-component normal mixture model. A two-component normal mixture model was chosen with different behaviours of variance; at each locus SR was described with two distributions, both with the same mean. The distributions have different variances: one for the majority of the observations and a second for the less well-behaved ones. We apply each model to a set of known single source Identifiler™, NGM SElect™ and PowerPlex® 21 DNA profiles to show the applicability of our findings to different data sets. SR determined from the single source profiles were compared to the calculated SR after application of the models. The model performance was tested by calculating the log-likelihoods and comparing the difference in Akaike information criterion (AIC). The two-component normal mixture model systematically outperformed all others, despite the increase in the number of parameters. This model, as well as performing well statistically, has intuitive appeal for forensic biologists and could be implemented in an expert system with a continuous method for DNA interpretation. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
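A minimal sketch of the preferred model class, a two-component normal mixture with a shared mean and two variances, fitted by direct maximization of the log-likelihood and scored by AIC. The parameterization, starting values, and data array are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_loglike(params, sr):
    """Two-component normal mixture with a common mean and two variances."""
    mu, log_s1, log_s2, logit_w = params
    w = 1.0 / (1.0 + np.exp(-logit_w))            # mixing weight in (0, 1)
    pdf = (w * norm.pdf(sr, mu, np.exp(log_s1))
           + (1 - w) * norm.pdf(sr, mu, np.exp(log_s2)))
    return -np.sum(np.log(np.maximum(pdf, 1e-300)))

def fit_stutter_mixture(sr):
    start = [sr.mean(), np.log(sr.std()), np.log(2 * sr.std()), 0.0]
    fit = minimize(neg_loglike, start, args=(sr,), method="Nelder-Mead")
    aic = 2 * len(start) + 2 * fit.fun            # for comparison across models
    return fit.x, aic
```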
Ramirez-Gomez, Liliana; Zheng, Ling; Reed, Bruce; Kramer, Joel; Mungas, Dan; Zarow, Chris; Vinters, Harry; Ringman, John M.; Chui, Helena
2018-01-01
Background/Aims The aim of this study was to assess the ability of neuropsychological tests to differentiate autopsy-defined Alzheimer disease (AD) from subcortical ischemic vascular dementia (SIVD). Methods From a sample of 175 cases followed longitudinally that underwent autopsy, we selected 23 normal controls (NC), 20 SIVD, 69 AD, and 10 mixed cases of dementia. Baseline neuropsychological tests, including Memory Assessment Scale word list learning test, control oral word association test, and animal fluency, were compared between the three autopsy-defined groups. Results The NC, SIVD, and AD groups did not differ by age or education. The SIVD and AD groups did not differ by the Global Clinical Dementia Rating Scale. Subjects with AD performed worse on delayed recall (p < 0.01). A receiver operating characteristics analysis comparing the SIVD and AD groups including age, education, difference between categorical (animals) versus phonemic fluency (letter F), and the first recall from the word learning test distinguished the two groups with a sensitivity of 85%, specificity of 67%, and positive likelihood ratio of 2.57 (AUC = 0.789, 95% CI 0.69–0.88, p < 0.0001). Conclusion In neuropathologically defined subgroups, neuropsychological profiles have modest ability to distinguish patients with AD from those with SIVD. PMID:28595184
Extreme data compression for the CMB
Zablocki, Alan; Dodelson, Scott
2016-04-28
We apply Karhunen-Loève methods to cosmic microwave background (CMB) data sets, and show that we can recover the input cosmology and obtain the marginalized likelihoods in Λ cold dark matter cosmologies in under a minute, much faster than Markov chain Monte Carlo methods. This is achieved by forming a linear combination of the power spectra at each multipole l, and solving a system of simultaneous equations such that the Fisher matrix is locally unchanged. Instead of carrying out a full likelihood evaluation over the whole parameter space, we need to evaluate the likelihood only for the parameter of interest, with the data compression effectively marginalizing over all other parameters. The weighting vectors contain insight about the physical effects of the parameters on the CMB anisotropy power spectrum C_l. The shape and amplitude of these vectors give an intuitive feel for the physics of the CMB, the sensitivity of the observed spectrum to cosmological parameters, and the relative sensitivity of different experiments to cosmological parameters. We test this method on exact theory C_l as well as on a Wilkinson Microwave Anisotropy Probe (WMAP)-like CMB data set generated from a random realization of a fiducial cosmology, comparing the compression results to those from a full likelihood analysis using CosmoMC. Furthermore, after showing that the method works, we apply it to the temperature power spectrum from the WMAP seven-year data release, and discuss the successes and limitations of our method as applied to a real data set.
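As a generic, hedged sketch of Fisher-preserving linear compression (one weighted sum of the data vector per parameter): for a Gaussian likelihood with a parameter-independent covariance, weighting vectors proportional to C⁻¹ ∂μ/∂θ leave the Fisher matrix unchanged. Whether this matches the exact weighting vectors of the paper is an assumption; the covariance and derivative inputs are placeholders.

```python
import numpy as np

def compression_vectors(cov, dmu_dtheta):
    """One weighting vector per parameter: b_a = C^{-1} dmu/dtheta_a.

    cov: data covariance (n x n); dmu_dtheta: (n_params x n) model derivatives.
    """
    cinv = np.linalg.inv(cov)
    return dmu_dtheta @ cinv                     # (n_params x n) weights

def compress(data, weights):
    return weights @ data                        # one number per parameter

def fisher_matrix(weights, cov):
    # equals dmu C^{-1} dmu^T, i.e. the Fisher matrix of the uncompressed data
    return weights @ cov @ weights.T
```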
Mapping Quantitative Traits in Unselected Families: Algorithms and Examples
Dupuis, Josée; Shi, Jianxin; Manning, Alisa K.; Benjamin, Emelia J.; Meigs, James B.; Cupples, L. Adrienne; Siegmund, David
2009-01-01
Linkage analysis has been widely used to identify, from family data, genetic variants influencing quantitative traits. Common approaches have both strengths and limitations. Likelihood ratio tests typically computed in variance component analysis can accommodate large families but are highly sensitive to departure from normality assumptions. Regression-based approaches are more robust but their use has primarily been restricted to nuclear families. In this paper, we develop methods for mapping quantitative traits in moderately large pedigrees. Our methods are based on the score statistic, which, in contrast to the likelihood ratio statistic, can use nonparametric estimators of variability to achieve robustness of the false positive rate against departures from the hypothesized phenotypic model. Because the score statistic is easier to calculate than the likelihood ratio statistic, our basic mapping methods utilize relatively simple computer code that performs statistical analysis on output from any program that computes estimates of identity-by-descent. This simplicity also permits development and evaluation of methods to deal with multivariate and ordinal phenotypes, and with gene-gene and gene-environment interaction. We demonstrate our methods on simulated data and on fasting insulin, a quantitative trait measured in the Framingham Heart Study. PMID:19278016
Lee, E Henry; Wickham, Charlotte; Beedlow, Peter A; Waschmann, Ronald S; Tingey, David T
2017-10-01
A time series intervention analysis (TSIA) of dendrochronological data to infer the tree growth-climate-disturbance relations and forest disturbance history is described. Maximum likelihood is used to estimate the parameters of a structural time series model with components for climate and forest disturbances (i.e., pests, diseases, fire). The statistical method is illustrated with a tree-ring width time series for a mature closed-canopy Douglas-fir stand on the west slopes of the Cascade Mountains of Oregon, USA that is impacted by Swiss needle cast disease caused by the foliar fungus, Phaecryptopus gaeumannii (Rhode) Petrak. The likelihood-based TSIA method is proposed for the field of dendrochronology to understand the interaction of temperature, water, and forest disturbances that are important in forest ecology and climate change studies.
U.S. cannabis legalization and use of vaping and edible products among youth.
Borodovsky, Jacob T; Lee, Dustin C; Crosier, Benjamin S; Gabrielli, Joy L; Sargent, James D; Budney, Alan J
2017-08-01
Alternative methods for consuming cannabis (e.g., vaping and edibles) have become more popular in the wake of U.S. cannabis legalization. Specific provisions of legal cannabis laws (LCL) (e.g., dispensary regulations) may impact the likelihood that youth will use alternative methods and the age at which they first try the method - potentially magnifying or mitigating the developmental harms of cannabis use. This study examined associations between LCL provisions and how youth consume cannabis. An online cannabis use survey was distributed using Facebook advertising, and data were collected from 2630 cannabis-using youth (ages 14-18). U.S. states were coded for LCL status and various LCL provisions. Regression analyses tested associations among lifetime use and age of onset of cannabis vaping and edibles and LCL provisions. Longer LCL duration (OR_vaping = 2.82, 95% CI: 2.24-3.55; OR_edibles = 3.82, 95% CI: 2.96-4.94) and higher dispensary density (OR_vaping = 2.68, 95% CI: 2.12-3.38; OR_edibles = 3.31, 95% CI: 2.56-4.26) were related to higher likelihood of trying vaping and edibles. Permitting home cultivation was related to higher likelihood (OR = 1.93, 95% CI: 1.50-2.48) and younger age of onset (β = -0.30, 95% CI: -0.45 to -0.15) of edibles. Specific provisions of LCL appear to impact the likelihood, and age at which, youth use alternative methods to consume cannabis. These methods may carry differential risks for initiation and escalation of cannabis use. Understanding associations between LCL provisions and methods of administration can inform the design of effective cannabis regulatory strategies. Copyright © 2017 Elsevier B.V. All rights reserved.
Estimation of submarine mass failure probability from a sequence of deposits with age dates
Geist, Eric L.; Chaytor, Jason D.; Parsons, Thomas E.; ten Brink, Uri S.
2013-01-01
The empirical probability of submarine mass failure is quantified from a sequence of dated mass-transport deposits. Several different techniques are described to estimate the parameters for a suite of candidate probability models. The techniques, previously developed for analyzing paleoseismic data, include maximum likelihood and Type II (Bayesian) maximum likelihood methods derived from renewal process theory and Monte Carlo methods. The estimated mean return time from these methods, unlike estimates from a simple arithmetic mean of the center age dates and standard likelihood methods, includes the effects of age-dating uncertainty and of open time intervals before the first and after the last event. The likelihood techniques are evaluated using Akaike’s Information Criterion (AIC) and Akaike’s Bayesian Information Criterion (ABIC) to select the optimal model. The techniques are applied to mass transport deposits recorded in two Integrated Ocean Drilling Program (IODP) drill sites located in the Ursa Basin, northern Gulf of Mexico. Dates of the deposits were constrained by regional bio- and magnetostratigraphy from a previous study. Results of the analysis indicate that submarine mass failures in this location occur primarily according to a Poisson process in which failures are independent and return times follow an exponential distribution. However, some of the model results suggest that submarine mass failures may occur quasiperiodically at one of the sites (U1324). The suite of techniques described in this study provides quantitative probability estimates of submarine mass failure occurrence, for any number of deposits and age uncertainty distributions.
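For the simplest member of the suite described above, an exponential (Poisson-process) renewal model, the likelihood with open intervals has a closed-form maximum: closed inter-event times contribute the density and the open intervals before the first and after the last deposit contribute survival terms. The sketch below assumes exactly that model and ignores age-dating uncertainty.

```python
import numpy as np

def exponential_renewal_mle(closed_intervals, open_intervals):
    """MLE of the mean return time under an exponential renewal model.

    closed_intervals: times between successive dated deposits;
    open_intervals: censored times before the first and after the last event.
    """
    closed = np.asarray(closed_intervals, dtype=float)
    open_ = np.asarray(open_intervals, dtype=float)
    total_time = closed.sum() + open_.sum()
    rate = len(closed) / total_time                      # closed-form MLE
    loglike = len(closed) * np.log(rate) - rate * total_time
    aic = 2 * 1 - 2 * loglike                            # one free parameter
    return 1.0 / rate, aic
```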
Cross-validation to select Bayesian hierarchical models in phylogenetics.
Duchêne, Sebastián; Duchêne, David A; Di Giallonardo, Francesca; Eden, John-Sebastian; Geoghegan, Jemma L; Holt, Kathryn E; Ho, Simon Y W; Holmes, Edward C
2016-05-26
Recent developments in Bayesian phylogenetic models have increased the range of inferences that can be drawn from molecular sequence data. Accordingly, model selection has become an important component of phylogenetic analysis. Methods of model selection generally consider the likelihood of the data under the model in question. In the context of Bayesian phylogenetics, the most common approach involves estimating the marginal likelihood, which is typically done by integrating the likelihood across model parameters, weighted by the prior. Although this method is accurate, it is sensitive to the presence of improper priors. We explored an alternative approach based on cross-validation that is widely used in evolutionary analysis. This involves comparing models according to their predictive performance. We analysed simulated data and a range of viral and bacterial data sets using a cross-validation approach to compare a variety of molecular clock and demographic models. Our results show that cross-validation can be effective in distinguishing between strict- and relaxed-clock models and in identifying demographic models that allow growth in population size over time. In most of our empirical data analyses, the model selected using cross-validation was able to match that selected using marginal-likelihood estimation. The accuracy of cross-validation appears to improve with longer sequence data, particularly when distinguishing between relaxed-clock models. Cross-validation is a useful method for Bayesian phylogenetic model selection. This method can be readily implemented even when considering complex models where selecting an appropriate prior for all parameters may be difficult.
Method and system for diagnostics of apparatus
NASA Technical Reports Server (NTRS)
Gorinevsky, Dimitry (Inventor)
2012-01-01
Proposed is a method, implemented in software, for estimating fault state of an apparatus outfitted with sensors. At each execution period the method processes sensor data from the apparatus to obtain a set of parity parameters, which are further used for estimating fault state. The estimation method formulates a convex optimization problem for each fault hypothesis and employs a convex solver to compute fault parameter estimates and fault likelihoods for each fault hypothesis. The highest likelihoods and corresponding parameter estimates are transmitted to a display device or an automated decision and control system. The obtained accurate estimate of fault state can be used to improve safety, performance, or maintenance processes for the apparatus.
NASA Astrophysics Data System (ADS)
Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen
2018-07-01
Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ~10^4 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.
Kastorini, Christina-Maria; Panagiotakos, Demosthenes B; Chrysohoou, Christina; Georgousopoulou, Ekavi; Pitaraki, Evangelia; Puddu, Paolo Emilio; Tousoulis, Dimitrios; Stefanadis, Christodoulos; Pitsavos, Christos
2016-03-01
To better understand the metabolic syndrome (MS) spectrum through principal components analysis and to further evaluate the role of the Mediterranean diet in MS presence. During 2001-2002, 1514 men and 1528 women (>18 y) without any clinical evidence of CVD or any other chronic disease at baseline, living in the greater Athens area, Greece, were enrolled. In 2011-2012, the 10-year follow-up was performed in 2583 participants (15% of the participants were lost to follow-up). Incidence of fatal or non-fatal CVD was defined according to WHO-ICD-10 criteria. MS was defined by the National Cholesterol Education Program Adult Treatment Panel III (revised NCEP ATP III) definition. Adherence to the Mediterranean diet was assessed using the MedDietScore (range 0-55). Five principal components were derived, explaining 73.8% of the total variation, characterized by: a) body weight and lipid profile, b) blood pressure, c) lipid profile, d) glucose profile, e) inflammatory factors. All components were associated with higher likelihood of CVD incidence. After adjusting for various potential confounding factors, adherence to the Mediterranean dietary pattern, for each 10% increase in the MedDietScore, was associated with 15% lower odds of CVD incidence (95% CI: 0.71-1.06). For participants with low adherence to the Mediterranean diet, all five components were significantly associated with increased likelihood of CVD incidence. However, for those following the Mediterranean pattern closely, positive yet nonsignificant associations were observed. Results of the present work propose a wider MS definition, while highlighting the beneficial role of the Mediterranean dietary pattern. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Ou, Sai-Hong Ignatius; Tang, Yiyun; Polli, Anna; Wilner, Keith D; Schnell, Patrick
2016-04-01
Decreases in heart rate (HR) have been described in patients receiving crizotinib. We performed a large retrospective analysis of HR changes during crizotinib therapy. HRs from vital-sign data for patients with anaplastic lymphoma kinase (ALK)-positive nonsmall cell lung cancer enrolled in PROFILE 1005 and the crizotinib arm of PROFILE 1007 were analyzed. Sinus bradycardia (SB) was defined as HR <60 beats per minute (bpm). Magnitude and timing of HR changes were assessed. Potential risk factors for SB were investigated by logistic regression analysis. Progression-free survival (PFS) was evaluated according to HR decrease by <20 versus ≥ 20 bpm within the first 50 days of starting treatment. For the 1053 patients analyzed, the mean maximum postbaseline HR decrease was 25 bpm (standard deviation 15.8). Overall, 441 patients (41.9%) had at least one episode of postbaseline SB. The mean precrizotinib treatment HR was significantly lower among patients with versus without postbaseline SB (82.2 bpm vs. 92.6 bpm). The likelihood of experiencing SB was statistically significantly higher among patients with a precrizotinib treatment HR <70 bpm. PFS was comparable among patients with or without HR decrease of ≥ 20 bpm within the first 50 days of starting crizotinib. Decrease in HR is very common among patients on crizotinib. The likelihood of experiencing SB was statistically significantly higher among patients with a precrizotinib treatment HR <70 bpm. This is the first large-scale report investigating the association between treatment with a tyrosine kinase inhibitor and the development of bradycardia. HRs should be closely monitored during crizotinib treatment. © 2016 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
Atbasoglu, E Cem; Gumus-Akay, Guvem; Guloksuz, Sinan; Saka, Meram Can; Ucok, Alp; Alptekin, Koksal; Gullu, Sevim; van Os, Jim
2018-04-01
Type 2 diabetes (T2D) is more frequent in schizophrenia (Sz) than in the general population. This association is partly accounted for by shared susceptibility genetic variants. We tested the hypotheses that a genetic predisposition to Sz would be associated with higher likelihood of insulin resistance (IR), and that IR would be predicted by subthreshold psychosis phenotypes. Unaffected siblings of Sz patients (n = 101) were compared with a nonclinical sample (n = 305) in terms of IR, schizotypy (SzTy), and a behavioural experiment of "jumping to conclusions". The measures, respectively, were the Homeostatic Model Assessment of Insulin Resistance (HOMA-IR), the Structured Interview for Schizotypy-Revised (SIS-R), and the Beads Task (BT). The likelihood of IR was examined in multiple regression models that included sociodemographic, metabolic, and cognitive parameters alongside group status, SIS-R scores, and BT performance. Insulin resistance was less frequent in siblings (31.7%) compared to controls (43.3%) (p < 0.05), and negatively associated with SzTy, as compared among the tertile groups for the latter (p < 0.001). The regression model that examined all relevant parameters included the SzTy tertiles, TG and HDL-C levels, and BMI as significant predictors of IR. Lack of IR was predicted by the highest as compared to the lowest SzTy tertile [OR (95%CI): 0.43 (0.21-0.85), p = 0.015]. Higher dopaminergic activity may contribute to both schizotypal features and a favourable metabolic profile in the same individual. This is compatible with dopamine's regulatory role in glucose metabolism via indirect central actions and a direct action on pancreatic insulin secretion. The relationship between dopaminergic activity and metabolic profile in Sz must be examined in longitudinal studies with younger unaffected siblings.
Bleka, Øyvind; Storvik, Geir; Gill, Peter
2016-03-01
We have released a software package named EuroForMix to analyze STR DNA profiles in a user-friendly graphical user interface. The software implements a model to explain the allelic peak height on a continuous scale in order to carry out weight-of-evidence calculations for profiles which could be from a mixture of contributors. Through a properly parameterized model we are able to do inference on mixture proportions, the peak height properties, stutter proportion and degradation. In addition, EuroForMix includes models for allele drop-out, allele drop-in and sub-population structure. EuroForMix supports two inference approaches for likelihood ratio calculations. The first approach uses maximum likelihood estimation of the unknown parameters. The second approach is Bayesian, which requires prior distributions to be specified for the parameters involved. The user may specify any number of known and unknown contributors in the model; however, we find that there is a practical computing time limit which restricts the model to a maximum of four unknown contributors. EuroForMix is the first freely open source, continuous model (accommodating peak height, stutter, drop-in, drop-out, population substructure and degradation) to be reported in the literature. It therefore serves an important purpose to act as an unrestricted platform to compare different solutions that are available. The implementation of the continuous model used in the software showed close to identical results to the R-package DNAmixtures, which requires a HUGIN Expert license to be used. An additional feature in EuroForMix is the ability for the user to adapt the Bayesian inference framework by incorporating their own prior information. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Anatomically-Aided PET Reconstruction Using the Kernel Method
Hutchcroft, Will; Wang, Guobao; Chen, Kevin T.; Catana, Ciprian; Qi, Jinyi
2016-01-01
This paper extends the kernel method that was proposed previously for dynamic PET reconstruction, to incorporate anatomical side information into the PET reconstruction model. In contrast to existing methods that incorporate anatomical information using a penalized likelihood framework, the proposed method incorporates this information in the simpler maximum likelihood (ML) formulation and is amenable to ordered subsets. The new method also does not require any segmentation of the anatomical image to obtain edge information. We compare the kernel method with the Bowsher method for anatomically-aided PET image reconstruction through a simulated data set. Computer simulations demonstrate that the kernel method offers advantages over the Bowsher method in region of interest (ROI) quantification. Additionally the kernel method is applied to a 3D patient data set. The kernel method results in reduced noise at a matched contrast level compared with the conventional ML expectation maximization (EM) algorithm. PMID:27541810
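A schematic kernelized ML-EM update in the spirit of the method described: the image is represented as x = Kα, where K is a kernel matrix built from the anatomical prior image, and the usual EM update is applied to the kernel coefficients α. The system matrix, kernel construction, and variable names are assumptions, and details may differ from the authors' implementation.

```python
import numpy as np

def kernel_mlem(y, P, K, n_iter=50, eps=1e-12):
    """Schematic kernelized ML-EM: reconstruct alpha with image x = K @ alpha.

    y: measured sinogram counts; P: system matrix (bins x voxels);
    K: kernel matrix (voxels x voxels) from an anatomical prior image.
    """
    alpha = np.ones(K.shape[1])
    sens = K.T @ (P.T @ np.ones_like(y, dtype=float))   # sensitivity term
    for _ in range(n_iter):
        forward = P @ (K @ alpha)
        ratio = y / np.maximum(forward, eps)
        alpha *= (K.T @ (P.T @ ratio)) / np.maximum(sens, eps)
    return K @ alpha                                     # reconstructed image
```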
Optimal Methods for Classification of Digitally Modulated Signals
2013-03-01
Instead of using a ratio of likelihood functions, the proposed approach uses the Kullback-Leibler (KL) divergence. Blind demodulation was used to develop classification algorithms for a wider set of signal types. Two methodologies were used: the likelihood ratio test and the KL divergence.
ERIC Educational Resources Information Center
Kieftenbeld, Vincent; Natesan, Prathiba
2012-01-01
Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…
Maximum Likelihood Dynamic Factor Modeling for Arbitrary "N" and "T" Using SEM
ERIC Educational Resources Information Center
Voelkle, Manuel C.; Oud, Johan H. L.; von Oertzen, Timo; Lindenberger, Ulman
2012-01-01
This article has 3 objectives that build on each other. First, we demonstrate how to obtain maximum likelihood estimates for dynamic factor models (the direct autoregressive factor score model) with arbitrary "T" and "N" by means of structural equation modeling (SEM) and compare the approach to existing methods. Second, we go beyond standard time…
Rate of convergence of k-step Newton estimators to efficient likelihood estimators
Steve Verrill
2007-01-01
We make use of Cramér conditions together with the well-known local quadratic convergence of Newton's method to establish the asymptotic closeness of k-step Newton estimators to efficient likelihood estimators. In Verrill and Johnson [2007. Confidence bounds and hypothesis tests for normal distribution coefficients of variation. USDA Forest Products Laboratory Research...
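To illustrate the k-step Newton idea in a concrete case (an example of mine, not taken from the report): starting from a root-n consistent estimator such as the sample median, one or two Newton (Fisher-scoring) steps on the Cauchy location log-likelihood get close to the efficient MLE. The unit scale and the expected information n/2 are assumptions of this example.

```python
import numpy as np

def cauchy_score(theta, x):
    """Score of the Cauchy(location=theta, scale=1) log-likelihood."""
    r = x - theta
    return np.sum(2 * r / (1 + r ** 2))

def k_step_newton(x, k=1):
    """k-step Newton (Fisher scoring) from the sample median."""
    theta = np.median(x)                 # root-n consistent starting value
    info = len(x) / 2.0                  # expected information, unit-scale Cauchy
    for _ in range(k):
        theta = theta + cauchy_score(theta, x) / info
    return theta

rng = np.random.default_rng(0)
x = rng.standard_cauchy(500) + 3.0
print(k_step_newton(x, k=2))             # close to the true location 3.0
```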
Issues and strategies in the DNA identification of World Trade Center victims.
Brenner, C H; Weir, B S
2003-05-01
Identification of the nearly 3000 victims of the World Trade Center attack, represented by about 15,000 body parts, rests heavily on DNA. Reference DNA profiles are often from relatives rather than from the deceased themselves. With so large a set of victims, coincidental similarities between non-relatives abound. Therefore considerable care is necessary to succeed in correlating references with correct victims while avoiding spurious assignments. Typically multiple relatives are necessary to establish the identity of a victim. We describe a 3-stage paradigm--collapse, screen, test--to organize the work of sorting out the identities. Inter alia we present a simple and general formula for the likelihood ratio governing practically any potential relationship between two DNA profiles.
Chen, Rui; Hyrien, Ollivier
2011-01-01
This article deals with quasi- and pseudo-likelihood estimation in a class of continuous-time multi-type Markov branching processes observed at discrete points in time. "Conventional" and conditional estimation are discussed for both approaches. We compare their properties and identify situations where they lead to asymptotically equivalent estimators. Both approaches possess robustness properties, and coincide with maximum likelihood estimation in some cases. Quasi-likelihood functions involving only linear combinations of the data may be unable to estimate all model parameters. Remedial measures exist, including the resort either to non-linear functions of the data or to conditioning the moments on appropriate sigma-algebras. The method of pseudo-likelihood may also resolve this issue. We investigate the properties of these approaches in three examples: the pure birth process, the linear birth-and-death process, and a two-type process that generalizes the previous two examples. Simulation studies are conducted to evaluate performance in finite samples. PMID:21552356
Maximum likelihood estimation of finite mixture model for economic data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-06-01
A finite mixture model is a mixture model with a finite number of components. These models provide a natural representation of heterogeneity across a finite number of latent classes, and are also known as latent class models or unsupervised learning models. Recently, maximum likelihood estimation of finite mixture models has drawn considerable attention from statisticians, mainly because maximum likelihood estimation is a powerful statistical method that yields consistent estimates as the sample size increases to infinity. Maximum likelihood estimation is therefore used in the present paper to fit a finite mixture model in order to explore the relationship between nonlinear economic data. A two-component normal mixture model is fitted by maximum likelihood estimation to investigate the relationship between stock market prices and rubber prices for the sampled countries. The results indicate a negative relationship between rubber price and stock market price for Malaysia, Thailand, the Philippines, and Indonesia.
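A quick EM-based illustration of fitting a two-component bivariate normal mixture to paired series, using scikit-learn; the synthetic data below stand in for the stock-market and rubber price observations and are not the paper's data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# synthetic stand-ins for (stock market price, rubber price) pairs
X = np.vstack([rng.multivariate_normal([10, 5], [[1, -0.6], [-0.6, 1]], 200),
               rng.multivariate_normal([14, 2], [[1, -0.3], [-0.3, 1]], 200)])

gmm = GaussianMixture(n_components=2, covariance_type="full").fit(X)  # EM-based MLE
print(gmm.means_)          # component means
print(gmm.covariances_)    # off-diagonal terms carry the (negative) association
```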
Algorithms of maximum likelihood data clustering with applications
NASA Astrophysics Data System (ADS)
Giada, Lorenzo; Marsili, Matteo
2002-12-01
We address the problem of data clustering by introducing an unsupervised, parameter-free approach based on the maximum likelihood principle. Starting from the observation that data sets belonging to the same cluster share common information, we construct an expression for the likelihood of any possible cluster structure. The likelihood in turn depends only on the Pearson correlation coefficients of the data. We discuss clustering algorithms that provide a fast and reliable approximation to maximum likelihood configurations. Compared to standard clustering methods, our approach has the advantages that (i) it is parameter free, (ii) the number of clusters need not be fixed in advance and (iii) the interpretation of the results is transparent. In order to test our approach and compare it with standard clustering algorithms, we analyze two very different data sets: time series of financial market returns and gene expression data. We find that different maximization algorithms produce similar cluster structures whereas the outcome of standard algorithms has a much wider variability.
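A rough sketch of evaluating the likelihood of a candidate cluster structure from Pearson correlations, as described above. The functional form used here is my recollection of the Giada-Marsili likelihood and should be verified against the paper before use; the correlation matrix and labels are placeholders.

```python
import numpy as np

def cluster_log_likelihood(corr, labels):
    """Log-likelihood of a candidate cluster structure from Pearson correlations.

    corr: (N x N) correlation matrix; labels: cluster label per object.
    Functional form assumed from the Giada-Marsili model; verify against the paper.
    """
    total = 0.0
    for lab in np.unique(labels):
        idx = np.where(labels == lab)[0]
        n_s = len(idx)
        if n_s < 2:
            continue                        # singleton clusters contribute nothing
        c_s = corr[np.ix_(idx, idx)].sum()  # internal correlation (includes diagonal)
        total += 0.5 * (np.log(n_s / c_s)
                        + (n_s - 1) * np.log((n_s ** 2 - n_s) / (n_s ** 2 - c_s)))
    return total
```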
Estimating Divergence Parameters With Small Samples From a Large Number of Loci
Wang, Yong; Hey, Jody
2010-01-01
Most methods for studying divergence with gene flow rely upon data from many individuals at few loci. Such data can be useful for inferring recent population history but they are unlikely to contain sufficient information about older events. However, the growing availability of genome sequences suggests a different kind of sampling scheme, one that may be more suited to studying relatively ancient divergence. Data sets extracted from whole-genome alignments may represent very few individuals but contain a very large number of loci. To take advantage of such data we developed a new maximum-likelihood method for genomic data under the isolation-with-migration model. Unlike many coalescent-based likelihood methods, our method does not rely on Monte Carlo sampling of genealogies, but rather provides a precise calculation of the likelihood by numerical integration over all genealogies. We demonstrate that the method works well on simulated data sets. We also consider two models for accommodating mutation rate variation among loci and find that the model that treats mutation rates as random variables leads to better estimates. We applied the method to the divergence of Drosophila melanogaster and D. simulans and detected a low, but statistically significant, signal of gene flow from D. simulans to D. melanogaster. PMID:19917765
Quantitative PET Imaging in Drug Development: Estimation of Target Occupancy.
Naganawa, Mika; Gallezot, Jean-Dominique; Rossano, Samantha; Carson, Richard E
2017-12-11
Positron emission tomography, an imaging tool using radiolabeled tracers in humans and preclinical species, has been widely used in recent years in drug development, particularly in the central nervous system. One important goal of PET in drug development is assessing the occupancy of various molecular targets (e.g., receptors, transporters, enzymes) by exogenous drugs. The current linear mathematical approaches used to determine occupancy using PET imaging experiments are presented. These algorithms use results from multiple regions with different target content in two scans, a baseline (pre-drug) scan and a post-drug scan. New mathematical estimation approaches to determine target occupancy, using maximum likelihood, are presented. A major challenge in these methods is the proper definition of the covariance matrix of the regional binding measures, accounting for different variance of the individual regional measures and their nonzero covariance, factors that have been ignored by conventional methods. The novel methods are compared to standard methods using simulation and real human occupancy data. The simulation data showed the expected reduction in variance and bias using the proper maximum likelihood methods, when the assumptions of the estimation method matched those in simulation. Between-method differences for data from human occupancy studies were less obvious, in part due to small dataset sizes. These maximum likelihood methods form the basis for development of improved PET covariance models, in order to minimize bias and variance in PET occupancy studies.
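For context, a hedged sketch of a standard linear (occupancy-plot) estimate written as generalized least squares, so that a full regional covariance matrix can be supplied. The model form, variable names, and covariance input are assumptions; this is not the maximum-likelihood estimator proposed in the paper.

```python
import numpy as np

def occupancy_gls(vt_base, vt_post, cov):
    """Occupancy-plot estimate by generalized least squares across regions.

    Model: vt_base - vt_post = occ * (vt_base - V_ND).
    cov: covariance matrix of the regional differences (can be non-diagonal).
    """
    y = vt_base - vt_post
    X = np.column_stack([vt_base, np.ones_like(vt_base)])
    cinv = np.linalg.inv(cov)
    beta = np.linalg.solve(X.T @ cinv @ X, X.T @ cinv @ y)
    occ = beta[0]                    # slope = occupancy
    v_nd = -beta[1] / beta[0]        # x-intercept = nondisplaceable volume
    return occ, v_nd
```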
Safe semi-supervised learning based on weighted likelihood.
Kawakita, Masanori; Takeuchi, Jun'ichi
2014-05-01
We are interested in developing a safe semi-supervised learning method that works in any situation. Semi-supervised learning postulates that n′ unlabeled data are available in addition to n labeled data. However, almost all of the previous semi-supervised methods require additional assumptions (not only unlabeled data) to make improvements on supervised learning. If such assumptions are not met, then the methods possibly perform worse than supervised learning. Sokolovska, Cappé, and Yvon (2008) proposed a semi-supervised method based on a weighted likelihood approach. They proved that this method asymptotically never performs worse than supervised learning (i.e., it is safe) without any assumption. Their method is attractive because it is easy to implement and is potentially general. Moreover, it is deeply related to a certain statistical paradox. However, the method of Sokolovska et al. (2008) assumes a very limited situation, i.e., classification, discrete covariates, n′→∞, and a maximum likelihood estimator. In this paper, we extend their method by modifying the weight. We prove that our proposal is safe in a significantly wide range of situations as long as n≤n′. Further, we give a geometrical interpretation of the proof of safety through the relationship with the above-mentioned statistical paradox. Finally, we show that the above proposal is asymptotically safe even when n′
Puch-Solis, Roberto; Clayton, Tim
2014-07-01
The high sensitivity of the technology for producing profiles means that it has become routine to produce profiles from relatively small quantities of DNA. The profiles obtained from low template DNA (LTDNA) are affected by several phenomena which must be taken into consideration when interpreting and evaluating this evidence. Furthermore, many of the same phenomena affect profiles from higher amounts of DNA (e.g. where complex mixtures have been revealed). In this article we present a statistical model, which forms the basis of the software DNA LiRa, that is able to calculate likelihood ratios where one to four donors are postulated and for any number of replicates. The model can take into account drop-in and allelic drop-out for different contributors, template degradation and uncertain allele designations. In this statistical model unknown parameters are treated following the Empirical Bayesian paradigm. The performance of LiRa is tested using examples and the outputs are compared with those generated using two other statistical software packages, likeLTD and LRmix. The concept of ban efficiency is introduced as a measure for assessing model sensitivity. Copyright © 2014. Published by Elsevier Ireland Ltd.
Mubayi, Anuj; Castillo-Chavez, Carlos
2018-01-01
Background When attempting to statistically distinguish between a null and an alternative hypothesis, many researchers in the life and social sciences turn to binned statistical analysis methods, or methods that are simply based on the moments of a distribution (such as the mean, and variance). These methods have the advantage of simplicity of implementation, and simplicity of explanation. However, when null and alternative hypotheses manifest themselves in subtle differences in patterns in the data, binned analysis methods may be insensitive to these differences, and researchers may erroneously fail to reject the null hypothesis when in fact more sensitive statistical analysis methods might produce a different result when the null hypothesis is actually false. Here, with a focus on two recent conflicting studies of contagion in mass killings as instructive examples, we discuss how the use of unbinned likelihood methods makes optimal use of the information in the data; a fact that has been long known in statistical theory, but perhaps is not as widely appreciated amongst general researchers in the life and social sciences. Methods In 2015, Towers et al published a paper that quantified the long-suspected contagion effect in mass killings. However, in 2017, Lankford & Tomek subsequently published a paper, based upon the same data, that claimed to contradict the results of the earlier study. The former used unbinned likelihood methods, and the latter used binned methods, and comparison of distribution moments. Using these analyses, we also discuss how visualization of the data can aid in determination of the most appropriate statistical analysis methods to distinguish between a null and alternate hypothesis. We also discuss the importance of assessment of the robustness of analysis results to methodological assumptions made (for example, arbitrary choices of number of bins and bin widths when using binned methods); an issue that is widely overlooked in the literature, but is critical to analysis reproducibility and robustness. Conclusions When an analysis cannot distinguish between a null and alternate hypothesis, care must be taken to ensure that the analysis methodology itself maximizes the use of information in the data that can distinguish between the two hypotheses. The use of binned methods by Lankford & Tomek (2017), that examined how many mass killings fell within a 14 day window from a previous mass killing, substantially reduced the sensitivity of their analysis to contagion effects. The unbinned likelihood methods used by Towers et al (2015) did not suffer from this problem. While a binned analysis might be favorable for simplicity and clarity of presentation, unbinned likelihood methods are preferable when effects might be somewhat subtle. PMID:29742115
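As a generic illustration of the contrast drawn above (not the specific model of Towers et al.), an unbinned likelihood for a self-exciting point process uses the exact event dates rather than counts in fixed 14-day windows. The exponential kernel, parameterization, and optimizer call below are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglike_selfexciting(params, times, T):
    """Unbinned negative log-likelihood of a self-exciting (Hawkes-type) process.

    Intensity: lambda(t) = mu + a * b * sum_{t_j < t} exp(-b (t - t_j)).
    times: sorted event dates; T: length of the observation window.
    """
    mu, a, b = np.exp(params)                      # enforce positivity
    r = 0.0
    loglike = np.log(mu)                           # first event sees only background
    for i in range(1, len(times)):
        r = np.exp(-b * (times[i] - times[i - 1])) * (1.0 + r)
        loglike += np.log(mu + a * b * r)
    compensator = mu * T + a * np.sum(1.0 - np.exp(-b * (T - times)))
    return -(loglike - compensator)

# usage sketch: times = np.sort(event_dates); fit = minimize(
#     neg_loglike_selfexciting, x0=np.log([0.1, 0.3, 0.05]),
#     args=(times, T), method="Nelder-Mead")
```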
The complexities of DNA transfer during a social setting.
Goray, Mariya; van Oorschot, Roland A H
2015-03-01
When questions arise about how a touch DNA sample from a specific individual got to where it was sampled, one has limited data available to assess the likelihood of specific transfer events within a proposed scenario. This data is mainly related to the impact of some key variables affecting transfer that are derived from structured experiments. Here we consider the effects of unstructured social interactions on the transfer of touch DNA. Unscripted social exchanges of three individuals having a drink together while sitting at a table were video recorded, and DNA samples were collected and profiled from all relevant items touched during each sitting. Attempts were made to analyze when and how DNA was transferred from one object to another. The analyses demonstrate that simple, minor everyday interactions involving only a few items can in some instances lead to detectable DNA being transferred among individuals and objects that have not contacted each other, through secondary and further transfer. Transfer was also observed to be bi-directional. Furthermore, DNA of unknown source on hands or objects can be transferred and interfere with the interpretation of profiles generated from targeted touched surfaces. This study provides further insight into the transfer of DNA that may be useful when considering the likelihood of alternate scenarios of how a DNA sample got to where it was found. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.
Xie, Yanmei; Zhang, Biao
2017-04-20
Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and Nutrition Examination Survey (NHANES).
A long-term earthquake rate model for the central and eastern United States from smoothed seismicity
Moschetti, Morgan P.
2015-01-01
I present a long-term earthquake rate model for the central and eastern United States from adaptive smoothed seismicity. By employing pseudoprospective likelihood testing (L-test), I examined the effects of fixed and adaptive smoothing methods and the effects of catalog duration and composition on the ability of the models to forecast the spatial distribution of recent earthquakes. To stabilize the adaptive smoothing method for regions of low seismicity, I introduced minor modifications to the way that the adaptive smoothing distances are calculated. Across all smoothed seismicity models, the use of adaptive smoothing and the use of earthquakes from the recent part of the catalog optimizes the likelihood for tests with M≥2.7 and M≥4.0 earthquake catalogs. The smoothed seismicity models optimized by likelihood testing with M≥2.7 catalogs also produce the highest likelihood values for M≥4.0 likelihood testing, thus substantiating the hypothesis that the locations of moderate-size earthquakes can be forecast by the locations of smaller earthquakes. The likelihood test does not, however, maximize the fraction of earthquakes that are better forecast than a seismicity rate model with uniform rates in all cells. In this regard, fixed smoothing models perform better than adaptive smoothing models. The preferred model of this study is the adaptive smoothed seismicity model, based on its ability to maximize the joint likelihood of predicting the locations of recent small-to-moderate-size earthquakes across eastern North America. The preferred rate model delineates 12 regions where the annual rate of M≥5 earthquakes exceeds 2 × 10^-3. Although these seismic regions have been previously recognized, the preferred forecasts are more spatially concentrated than the rates from fixed smoothed seismicity models, with rate increases of up to a factor of 10 near clusters of high seismic activity.
Latent Profiles of Perceived Time Adequacy for Paid Work, Parenting, and Partner Roles
Lee, Soomi; Almeida, David M.; Davis, Kelly D.; King, Rosalind B.; Hammer, Leslie B.; Kelly, Erin L.
2015-01-01
This study examined feelings of having enough time (i.e., perceived time adequacy) in a sample of employed parents (N=880) in information technology and extended-care industries. Adopting a person-centered latent profile approach, we identified three profiles of perceived time adequacy for paid work, parenting, and partner roles: Family Time Protected, Family Time Sacrificed, and Time Balanced. Drawing upon the Conservation of Resources theory (Hobfoll, 1989), we examined the associations of stressors and resources with the time adequacy profiles. Parents in the Family Time Sacrificed profile were more likely to be younger, women, have younger children, work in the extended-care industry, and have nonstandard work schedules compared to those in the Family Time Protected profile. Results from multinomial logistic regression analyses revealed that, with the Time Balanced profile as the reference group, having fewer stressors and more resources in the family context (less parent-child conflict and more partner support), work context (longer company tenure, higher schedule control and job satisfaction), and work-family interface (lower work-to-family conflict) was linked to a higher probability of membership in the Family Time Protected profile. By contrast, having more stressors and fewer resources, in the forms of less partner support and higher work-to-family conflict, predicted a higher likelihood of being in the Family Time Sacrificed profile. Our findings suggest that low work-to-family conflict is the most critical predictor of membership in the Family Time Protected profile, whereas lack of partner support is the most important factor predicting membership in the Family Time Sacrificed profile. PMID:26075739
Salje, Ekhard K H; Planes, Antoni; Vives, Eduard
2017-10-01
Crackling noise can be initiated by competing or coexisting mechanisms. These mechanisms can combine to generate an approximate scale invariant distribution that contains two or more contributions. The overall distribution function can be analyzed, to a good approximation, using maximum-likelihood methods and assuming that it follows a power law although with nonuniversal exponents depending on a varying lower cutoff. We propose that such distributions are rather common and originate from a simple superposition of crackling noise distributions or exponential damping.
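A minimal sketch of the maximum-likelihood power-law fit described above, using the standard continuous-case estimator with a lower cutoff (the synthetic two-component mixture below only mimics the "superposition" idea and is not the authors' data):

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_power_law(x, x_min):
    """MLE of the exponent alpha for p(x) ~ x^(-alpha), x >= x_min (continuous case)."""
    tail = x[x >= x_min]
    n = tail.size
    alpha = 1.0 + n / np.sum(np.log(tail / x_min))
    stderr = (alpha - 1.0) / np.sqrt(n)   # asymptotic standard error
    return alpha, stderr, n

# Synthetic crackling-noise-like amplitudes: a superposition of two power laws.
x = np.concatenate([
    1.0 * (1.0 - rng.random(20000)) ** (-1.0 / 0.5),   # exponent 1.5, x >= 1
    5.0 * (1.0 - rng.random(5000)) ** (-1.0 / 1.0),    # exponent 2.0, x >= 5
])

# The fitted (effective) exponent drifts with the chosen lower cutoff, as expected
# when the distribution is a superposition rather than a single power law.
for x_min in (1.0, 2.0, 5.0, 10.0, 20.0):
    alpha, se, n = fit_power_law(x, x_min)
    print(f"x_min={x_min:5.1f}  n={n:6d}  alpha={alpha:.3f} +/- {se:.3f}")
```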
A survey of kernel-type estimators for copula and their applications
NASA Astrophysics Data System (ADS)
Sumarjaya, I. W.
2017-10-01
Copulas have been widely used to model nonlinear dependence structure. Main applications of copulas include areas such as finance, insurance, hydrology, and rainfall modeling, to name but a few. The flexibility of copulas allows researchers to model dependence structure beyond the Gaussian distribution. Basically, a copula is a function that couples a multivariate distribution function to its one-dimensional marginal distribution functions. In general, there are three methods to estimate a copula: parametric, nonparametric, and semiparametric. In this article we survey kernel-type estimators for copulas, such as the mirror-reflection kernel, the beta kernel, the transformation method, and the local likelihood transformation method. Then, we apply these kernel methods to three stock indexes in Asia. The results of our analysis suggest that, despite some variation in information criterion values, the local likelihood transformation method performs better than the other kernel methods.
Reducing weapon-carrying among urban American Indian young people.
Bearinger, Linda H; Pettingell, Sandra L; Resnick, Michael D; Potthoff, Sandra J
2010-07-01
To examine the likelihood of weapon-carrying among urban American Indian young people, given the presence of salient risk and protective factors. The study used data from a confidential, self-report Urban Indian Youth Health Survey with 200 forced-choice items examining risk and protective factors and social, contextual, and demographic information. Between 1995 and 1998, 569 American Indian youths, aged 9-15 years, completed surveys administered in public schools and an after-school program. Using logistic regression, probability profiles compared the likelihood of weapon-carrying, given the combinations of salient risk and protective factors. In the final models, weapon-carrying was associated significantly with one risk factor (substance use) and two protective factors (school connectedness, perceiving peers as having prosocial behavior attitudes/norms). With one risk factor and two protective factors, in various combinations in the models, the likelihood of weapon carrying ranged from 4% (with two protective factors and no risk factor in the model) to 80% of youth (with the risk factor and no protective factors in the model). Even in the presence of the risk factor, the two protective factors decreased the likelihood of weapon-carrying to 25%. This analysis highlights the importance of protective factors in comprehensive assessments and interventions for vulnerable youth. In that the risk factor and two protective factors significantly related to weapon-carrying are amenable to intervention at both individual and population-focused levels, study findings offer a guide for prioritizing strategies for decreasing weapon-carrying among urban American Indian young people. Copyright (c) 2010 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
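The probability-profile idea can be illustrated with a short sketch; the logistic-regression coefficients below are invented placeholders rather than the study's estimates:

```python
import math

# Hypothetical coefficients (log-odds scale): intercept, substance use (risk factor),
# school connectedness and prosocial peer norms (protective factors).
b0, b_risk, b_school, b_peers = -1.0, 2.5, -1.6, -1.4

def prob(risk, school, peers):
    """Predicted probability of weapon-carrying for a given factor combination."""
    logit = b0 + b_risk * risk + b_school * school + b_peers * peers
    return 1.0 / (1.0 + math.exp(-logit))

profiles = {
    "no risk, both protective": (0, 1, 1),
    "risk, both protective":    (1, 1, 1),
    "risk, no protective":      (1, 0, 0),
    "no risk, no protective":   (0, 0, 0),
}
for label, (r, s, p) in profiles.items():
    print(f"{label:26s} P(weapon-carrying) = {prob(r, s, p):.2f}")
```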
Discerning the clinical relevance of biomarkers in early stage breast cancer.
Ballinger, Tarah J; Kassem, Nawal; Shen, Fei; Jiang, Guanglong; Smith, Mary Lou; Railey, Elda; Howell, John; White, Carol B; Schneider, Bryan P
2017-07-01
Prior data suggest that breast cancer patients accept significant toxicity for small benefit. It is unclear whether personalized estimations of risk or benefit likelihood that could be provided by biomarkers alter treatment decisions in the curative setting. A choice-based conjoint (CBC) survey was conducted in 417 HER2-negative breast cancer patients who received chemotherapy in the curative setting. The survey presented pairs of treatment choices derived from common taxane- and anthracycline-based regimens, varying in degree of benefit by risk of recurrence and in toxicity profile, including peripheral neuropathy (PN) and congestive heart failure (CHF). Hypothetical biomarkers shifting benefit and toxicity risk were modeled to determine whether this knowledge alters choice. Previously identified biomarkers were evaluated using this model. Based on CBC analysis, a non-anthracycline regimen was the most preferred. Patients with prior PN had a similar preference for a taxane regimen as those who were PN naïve, but more dramatically shifted preference away from taxanes when PN was described as severe/irreversible. When modeled after hypothetical biomarkers, as the likelihood of PN increased, the preference for taxane-containing regimens decreased; similarly, as the likelihood of CHF increased, the preference for anthracycline regimens decreased. When evaluating validated biomarkers for PN and CHF, this knowledge did alter regimen preference. Patients faced with multi-faceted decisions consider personal experience and perceived risk of recurrent disease. Biomarkers providing information on likelihood of toxicity risk do influence treatment choices, and patients may accept reduced benefit when faced with higher risk of toxicity in the curative setting.
Towers, Sherry; Mubayi, Anuj; Castillo-Chavez, Carlos
2018-01-01
When attempting to statistically distinguish between a null and an alternative hypothesis, many researchers in the life and social sciences turn to binned statistical analysis methods, or methods that are simply based on the moments of a distribution (such as the mean, and variance). These methods have the advantage of simplicity of implementation, and simplicity of explanation. However, when null and alternative hypotheses manifest themselves in subtle differences in patterns in the data, binned analysis methods may be insensitive to these differences, and researchers may erroneously fail to reject the null hypothesis when in fact more sensitive statistical analysis methods might produce a different result when the null hypothesis is actually false. Here, with a focus on two recent conflicting studies of contagion in mass killings as instructive examples, we discuss how the use of unbinned likelihood methods makes optimal use of the information in the data; a fact that has been long known in statistical theory, but perhaps is not as widely appreciated amongst general researchers in the life and social sciences. In 2015, Towers et al published a paper that quantified the long-suspected contagion effect in mass killings. However, in 2017, Lankford & Tomek subsequently published a paper, based upon the same data, that claimed to contradict the results of the earlier study. The former used unbinned likelihood methods, and the latter used binned methods, and comparison of distribution moments. Using these analyses, we also discuss how visualization of the data can aid in determination of the most appropriate statistical analysis methods to distinguish between a null and alternate hypothesis. We also discuss the importance of assessment of the robustness of analysis results to methodological assumptions made (for example, arbitrary choices of number of bins and bin widths when using binned methods); an issue that is widely overlooked in the literature, but is critical to analysis reproducibility and robustness. When an analysis cannot distinguish between a null and alternate hypothesis, care must be taken to ensure that the analysis methodology itself maximizes the use of information in the data that can distinguish between the two hypotheses. The use of binned methods by Lankford & Tomek (2017), that examined how many mass killings fell within a 14 day window from a previous mass killing, substantially reduced the sensitivity of their analysis to contagion effects. The unbinned likelihood methods used by Towers et al (2015) did not suffer from this problem. While a binned analysis might be favorable for simplicity and clarity of presentation, unbinned likelihood methods are preferable when effects might be somewhat subtle.
Su, Jingjun; Du, Xinzhong; Li, Xuyong
2018-05-16
Uncertainty analysis is an important prerequisite for model application. However, existing phosphorus (P) loss indexes or indicators have rarely been evaluated in this respect. This study applied the generalized likelihood uncertainty estimation (GLUE) method to assess the uncertainty of the parameters and modeling outputs of a non-point source (NPS) P indicator constructed in the R language, and also examined how the subjective choices of likelihood formulation and acceptability threshold in GLUE influence the model outputs. The results indicated the following. (1) The parameters RegR2, RegSDR2, PlossDPfer, PlossDPman, DPDR, and DPR were highly sensitive to the overall TP simulation, and their value ranges could be reduced by GLUE. (2) The Nash efficiency likelihood (L1) appeared better able to accentuate high-likelihood simulations than the exponential function (L2). (3) The combined likelihood integrating the criteria of multiple outputs performed better than a single likelihood in model uncertainty assessment, in terms of reducing the uncertainty band widths while assuring the goodness of fit of all model outputs. (4) A value of 0.55 appeared to be a modest choice of threshold to balance high modeling efficiency against high bracketing efficiency. The results of this study could provide (1) an option for conducting NPS modeling on a single computing platform, (2) important references for parameter settings in NPS model development in similar regions, (3) useful suggestions for applying the GLUE method in studies with different emphases, and (4) important insights into watershed P management in similar regions.
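For readers unfamiliar with GLUE, the following toy sketch shows the basic machinery: Monte Carlo sampling of a prior range, an informal Nash-Sutcliffe likelihood, a behavioral threshold (0.55, echoing the value discussed above), and likelihood-weighted prediction bounds. It uses a one-parameter synthetic model, not the P indicator of the study:

```python
import numpy as np

rng = np.random.default_rng(2)

t = np.linspace(0.0, 10.0, 50)
def model(k):                       # toy model: exponential decay with rate k
    return np.exp(-k * t)

obs = model(0.35) + rng.normal(0.0, 0.03, size=t.size)   # synthetic "observations"

# 1. Monte Carlo sampling of the prior parameter range.
k_samples = rng.uniform(0.05, 1.0, size=5000)
sims = np.array([model(k) for k in k_samples])

# 2. Nash-Sutcliffe efficiency used as an informal likelihood measure.
nse = 1.0 - np.sum((sims - obs) ** 2, axis=1) / np.sum((obs - obs.mean()) ** 2)

# 3. Behavioral threshold (cf. the 0.55 acceptability threshold discussed above).
behavioral = nse > 0.55
weights = nse[behavioral] / nse[behavioral].sum()
print(f"behavioral parameter sets: {behavioral.sum()} of {k_samples.size}")

# 4. Likelihood-weighted 5-95% prediction bounds at each time step.
lower, upper = [], []
for j in range(t.size):
    s = sims[behavioral][:, j]
    idx = np.argsort(s)
    cdf = np.cumsum(weights[idx])
    lower.append(np.interp(0.05, cdf, s[idx]))
    upper.append(np.interp(0.95, cdf, s[idx]))
print("width of 5-95% band at t = 5:", upper[25] - lower[25])
```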
Regression estimators for generic health-related quality of life and quality-adjusted life years.
Basu, Anirban; Manca, Andrea
2012-01-01
To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and account for features typical of such data such as a skewed distribution, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution. First, both a single equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One and 2-part Beta regression models provide flexible approaches to regress the outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcomes distribution. This work will provide applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.
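A minimal sketch of a single-equation beta regression of the kind described, using the mean-precision parametrization with a logit link and maximum likelihood via scipy; the data are synthetic and strictly inside (0, 1), so the spike-at-1 two-part extension is not shown:

```python
import numpy as np
from scipy import optimize, special, stats

rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

# Simulate HRQoL-like outcomes strictly inside (0, 1).
true_beta, true_phi = np.array([0.5, 0.8]), 12.0
mu = special.expit(X @ true_beta)
y = rng.beta(mu * true_phi, (1.0 - mu) * true_phi)

def negloglik(params):
    beta, log_phi = params[:2], params[2]
    phi = np.exp(log_phi)                      # precision kept positive
    m = special.expit(X @ beta)                # logit link for the mean
    return -np.sum(stats.beta.logpdf(y, m * phi, (1.0 - m) * phi))

fit = optimize.minimize(negloglik, x0=np.zeros(3), method="Nelder-Mead")
print("beta-hat:", fit.x[:2], "phi-hat:", np.exp(fit.x[2]))
```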
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beer, M.
1980-12-01
The maximum likelihood method for the multivariate normal distribution is applied to the case of several individual eigenvalues. Correlated Monte Carlo estimates of the eigenvalue are assumed to follow this prescription and aspects of the assumption are examined. Monte Carlo cell calculations using the SAM-CE and VIM codes for the TRX-1 and TRX-2 benchmark reactors, and SAM-CE full core results are analyzed with this method. Variance reductions of a few percent to a factor of 2 are obtained from maximum likelihood estimation as compared with the simple average and the minimum variance individual eigenvalue. The numerical results verify that the use of sample variances and correlation coefficients in place of the corresponding population statistics still leads to nearly minimum variance estimation for a sufficient number of histories and aggregates.
Robbins, L G
2000-01-01
Graduate school programs in genetics have become so full that courses in statistics have often been eliminated. In addition, typical introductory statistics courses for the "statistics user" rather than the nascent statistician are laden with methods for analysis of measured variables while genetic data are most often discrete numbers. These courses are often seen by students and genetics professors alike as largely irrelevant cookbook courses. The powerful methods of likelihood analysis, although commonly employed in human genetics, are much less often used in other areas of genetics, even though current computational tools make this approach readily accessible. This article introduces the MLIKELY.PAS computer program and the logic of do-it-yourself maximum-likelihood statistics. The program itself, course materials, and expanded discussions of some examples that are only summarized here are available at http://www.unisi.it/ricerca/dip/bio_evol/sitomlikely/mlikely.html. PMID:10628965
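In the spirit of do-it-yourself likelihood analysis for discrete genetic counts, the sketch below carries out a likelihood-ratio (G) test of a 3:1 segregation ratio; the counts are invented for the example and the code is unrelated to MLIKELY.PAS:

```python
import numpy as np
from scipy import stats

counts = np.array([152, 38])          # hypothetical dominant : recessive counts
n = counts.sum()

p0 = np.array([0.75, 0.25])           # null hypothesis: a 3:1 Mendelian ratio
p_hat = counts / n                    # unrestricted MLE of the proportions

# G statistic = 2 * (log-likelihood at the MLE - log-likelihood under H0)
G = 2.0 * np.sum(counts * np.log(p_hat / p0))
p_value = stats.chi2.sf(G, df=1)
print(f"G = {G:.3f}, p = {p_value:.3f}")
```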
Krishnamoorthy, K; Oral, Evrim
2017-12-01
Standardized likelihood ratio test (SLRT) for testing the equality of means of several log-normal distributions is proposed. The properties of the SLRT and an available modified likelihood ratio test (MLRT) and a generalized variable (GV) test are evaluated by Monte Carlo simulation and compared. Evaluation studies indicate that the SLRT is accurate even for small samples, whereas the MLRT could be quite liberal for some parameter values, and the GV test is in general conservative and less powerful than the SLRT. Furthermore, a closed-form approximate confidence interval for the common mean of several log-normal distributions is developed using the method of variance estimate recovery, and compared with the generalized confidence interval with respect to coverage probabilities and precision. Simulation studies indicate that the proposed confidence interval is accurate and better than the generalized confidence interval in terms of coverage probabilities. The methods are illustrated using two examples.
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles
2010-01-01
In this article the authors focus on the issue of the nonuniqueness of the maximum likelihood (ML) estimator of proficiency level in item response theory (with special attention to logistic models). The usual maximum a posteriori (MAP) method offers a good alternative within that framework; however, this article highlights some drawbacks of its…
ERIC Educational Resources Information Center
Wollack, James A.; Bolt, Daniel M.; Cohen, Allan S.; Lee, Young-Sun
2002-01-01
Compared the quality of item parameter estimates for marginal maximum likelihood (MML) and Markov Chain Monte Carlo (MCMC) with the nominal response model using simulation. The quality of item parameter recovery was nearly identical for MML and MCMC, and both methods tended to produce good estimates. (SLD)
Empirical likelihood method for non-ignorable missing data problems.
Guan, Zhong; Qin, Jing
2017-01-01
The missing response problem is ubiquitous in survey sampling, medical, social science, and epidemiology studies. It is well known that non-ignorable missingness, in which the missingness of a response depends on its own value, is the most difficult missing data problem. In the statistical literature, unlike for the ignorable missing data problem, few papers on non-ignorable missing data are available apart from fully parametric model-based approaches. In this paper we study a semiparametric model for non-ignorable missing data in which the missingness probability is known up to some parameters but the underlying distributions are not specified. By employing Owen's (1988) empirical likelihood method we obtain constrained maximum empirical likelihood estimators of the parameters in the missingness probability and of the mean response, which are shown to be asymptotically normal. Moreover, the likelihood ratio statistic can be used to test whether the missingness of the responses is non-ignorable or completely at random. The theoretical results are confirmed by a simulation study. As an illustration, the analysis of data from a real AIDS trial shows that the missingness of CD4 counts at around two years is non-ignorable and that the sample mean based on the observed data alone is biased.
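The following sketch shows the core empirical-likelihood computation in the fully observed case, profiling the empirical likelihood for a mean by solving for the Lagrange multiplier; it is background for, not an implementation of, the non-ignorable-missingness estimator studied in the paper:

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(4)
x = rng.exponential(scale=2.0, size=80)        # synthetic data, true mean 2

def el_log_ratio(mu):
    """-2 log empirical likelihood ratio for H0: E[X] = mu."""
    d = x - mu
    if d.min() >= 0 or d.max() <= 0:           # mu outside the convex hull of the data
        return np.inf
    # Solve sum d_i / (1 + lam * d_i) = 0 for the Lagrange multiplier lam,
    # which must keep all implied weights 1 / (n * (1 + lam * d_i)) positive.
    lo = (-1.0 + 1e-8) / d.max()
    hi = (-1.0 + 1e-8) / d.min()
    lam = optimize.brentq(lambda l: np.sum(d / (1.0 + l * d)), lo, hi)
    return 2.0 * np.sum(np.log1p(lam * d))

for mu in (1.5, 2.0, 2.5):
    r = el_log_ratio(mu)
    print(f"mu={mu:.1f}  -2logELR={r:.3f}  p={stats.chi2.sf(r, df=1):.3f}")
```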
Chen, Yong; Liu, Yulun; Ning, Jing; Cormier, Janice; Chu, Haitao
2014-01-01
Systematic reviews of diagnostic tests often involve a mixture of case-control and cohort studies. The standard methods for evaluating diagnostic accuracy only focus on sensitivity and specificity and ignore the information on disease prevalence contained in cohort studies. Consequently, such methods cannot provide estimates of measures related to disease prevalence, such as population averaged or overall positive and negative predictive values, which reflect the clinical utility of a diagnostic test. In this paper, we propose a hybrid approach that jointly models the disease prevalence along with the diagnostic test sensitivity and specificity in cohort studies, and the sensitivity and specificity in case-control studies. In order to overcome the potential computational difficulties in the standard full likelihood inference of the proposed hybrid model, we propose an alternative inference procedure based on the composite likelihood. Such composite likelihood based inference does not suffer computational problems and maintains high relative efficiency. In addition, it is more robust to model mis-specifications compared to the standard full likelihood inference. We apply our approach to a review of the performance of contemporary diagnostic imaging modalities for detecting metastases in patients with melanoma. PMID:25897179
Maximum Likelihood Estimations and EM Algorithms with Length-biased Data
Qin, Jing; Ning, Jing; Liu, Hao; Shen, Yu
2012-01-01
Length-biased sampling is well recognized in economics, industrial reliability, etiology, epidemiological, genetic, and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimation and inference methods for traditional survival data are not directly applicable to length-biased right-censored data. We propose new expectation-maximization algorithms for estimation based on full likelihoods involving infinite-dimensional parameters under three settings for length-biased data: estimating the nonparametric distribution function, estimating the nonparametric hazard function under an increasing failure rate constraint, and jointly estimating the baseline hazard function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semi-parametric maximum likelihood estimators under the Cox model using modern empirical processes theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online. PMID:22323840
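One ingredient of the length-biased setting can be sketched compactly: for uncensored length-biased data the nonparametric MLE of the underlying distribution puts mass proportional to 1/t_i at each observation. The example below is synthetic; the censored and Cox-model settings of the paper require the full EM machinery:

```python
import numpy as np

rng = np.random.default_rng(5)

# Length-biased sampling from an Exponential(mean = 2): accept each draw t with
# probability proportional to t (simple acceptance-rejection for illustration).
raw = rng.exponential(2.0, size=200000)
keep = rng.random(raw.size) < raw / raw.max()
t = raw[keep][:2000]

w = (1.0 / t) / np.sum(1.0 / t)          # NPMLE weights on the observations
mean_unbiased = np.sum(w * t)            # estimate of E[T] under the unbiased F
print("naive mean of biased sample :", t.mean())          # too large
print("length-bias corrected mean  :", mean_unbiased)     # close to 2
```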
SubspaceEM: A Fast Maximum-a-posteriori Algorithm for Cryo-EM Single Particle Reconstruction
Dvornek, Nicha C.; Sigworth, Fred J.; Tagare, Hemant D.
2015-01-01
Single particle reconstruction methods based on the maximum-likelihood principle and the expectation-maximization (E–M) algorithm are popular because of their ability to produce high resolution structures. However, these algorithms are computationally very expensive, requiring a network of computational servers. To overcome this computational bottleneck, we propose a new mathematical framework for accelerating maximum-likelihood reconstructions. The speedup is by orders of magnitude and the proposed algorithm produces similar quality reconstructions compared to the standard maximum-likelihood formulation. Our approach uses subspace approximations of the cryo-electron microscopy (cryo-EM) data and projection images, greatly reducing the number of image transformations and comparisons that are computed. Experiments using simulated and actual cryo-EM data show that speedup in overall execution time compared to traditional maximum-likelihood reconstruction reaches factors of over 300. PMID:25839831
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods.
Zierke, Stephanie; Bakos, Jason D
2010-04-12
Maximum likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10x speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
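The per-node core of the PLF is the pruning recursion, sketched below with a Jukes-Cantor model (a simplification; tools such as MrBayes use richer substitution models, rate categories, and the numerical scaling needed for long sequences):

```python
import numpy as np

def jukes_cantor(branch_length):
    """4x4 transition probability matrix under the Jukes-Cantor model."""
    p_same = 0.25 + 0.75 * np.exp(-4.0 * branch_length / 3.0)
    p_diff = 0.25 - 0.25 * np.exp(-4.0 * branch_length / 3.0)
    return np.full((4, 4), p_diff) + np.eye(4) * (p_same - p_diff)

def plf_node(cl_left, cl_right, bl_left, bl_right):
    """Conditional likelihoods at a parent node, given its two children's arrays."""
    P_left, P_right = jukes_cantor(bl_left), jukes_cantor(bl_right)
    # For each site and parent state s:
    #   L_parent[site, s] = (sum_x P_left[s, x] * cl_left[site, x])
    #                     * (sum_y P_right[s, y] * cl_right[site, y])
    return (cl_left @ P_left.T) * (cl_right @ P_right.T)

# Two leaf sequences of 5 sites (one-hot conditional likelihoods at the tips).
states = {"A": 0, "C": 1, "G": 2, "T": 3}
def leaf(seq):
    cl = np.zeros((len(seq), 4))
    cl[np.arange(len(seq)), [states[c] for c in seq]] = 1.0
    return cl

parent = plf_node(leaf("ACGTA"), leaf("ACGTT"), 0.1, 0.2)
site_lik = parent @ np.full(4, 0.25)     # root with uniform base frequencies
print("log-likelihood:", np.log(site_lik).sum())
```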
A novel approach to making microstructure measurements in the ice-covered Arctic Ocean.
NASA Astrophysics Data System (ADS)
Guthrie, J.; Morison, J.; Fer, I.
2014-12-01
As part of the 2014 Field Season of the North Pole Environmental Observatory, a 7-day microstructure experiment was performed. A Rockland Scientific Microrider with 2 FP07 fast response thermistors and 2 SBE-7 micro-conductivity probes was attached to a Seabird 911+ Conductivity-Temperature-Depth unit to allow for calibration of the microstructure probes against the highly accurate Seabird temperature and conductivity sensors. From a heated hut, the instrument package was lowered through a 0.75-m hole in the sea ice down to 350 m depth using a lightweight winch powered with a 3-phase, frequency-controlled motor that produced a smooth, controlled lowering speed of 25 cm s^-1. Focusing on temperature and conductivity microstructure and using the special winch removed many of the complications involved with the use of free-fall microstructure profilers under the ice. The slow profiling speed permits calculation of Χ, the dissipation of thermal variance, without relying on fits to theoretical spectra to account for the unresolved variance. The dissipation rate of turbulent kinetic energy, ɛ, can then be estimated using the temperature gradient spectrum and the Ruddick et al. [2001] maximum likelihood method. Outside of a few turbulent patches, thermal diffusivity ranged between O(10^-7) and O(10^-6) m^2 s^-1, resulting in negligible turbulent heat fluxes. Estimated ɛ was often at or below the noise level of most shear-based microstructure profilers. The noise level of Χ is estimated at O(10^-11) °C^2 s^-1, revealing the utility and applicability of this technique in future Arctic field work.
An Improved Nested Sampling Algorithm for Model Selection and Assessment
NASA Astrophysics Data System (ADS)
Zeng, X.; Ye, M.; Wu, J.; WANG, D.
2017-12-01
The multimodel strategy is a general approach for treating model structure uncertainty in recent research. The unknown groundwater system is represented by several plausible conceptual models, and each alternative conceptual model is assigned a weight representing its plausibility. In the Bayesian framework, the posterior model weight is computed as the product of the prior model weight and the marginal likelihood (also termed model evidence). As a result, estimating marginal likelihoods is crucial for reliable model selection and assessment in multimodel analysis. The nested sampling estimator (NSE) is a newly proposed algorithm for marginal likelihood estimation. NSE works by searching the parameter space gradually from low-likelihood to high-likelihood regions, with this evolution carried out iteratively via a local sampling procedure. The efficiency of NSE is therefore dominated by the strength of the local sampling procedure. Currently, the Metropolis-Hastings (M-H) algorithm and its variants are often used for local sampling in NSE. However, M-H is not an efficient sampler for high-dimensional or complex likelihood functions. To improve the performance of NSE, a more efficient and elaborate sampling algorithm, DREAM(ZS), can be integrated into the local sampling step. In addition, to overcome the computational burden of the many repeated model executions required for marginal likelihood estimation, an adaptive sparse grid stochastic collocation method is used to build surrogates for the original groundwater model.
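A toy nested-sampling sketch, assuming a two-dimensional Gaussian likelihood, a uniform prior, and rejection sampling for the likelihood-constrained draw (transparent but inefficient; the NSE described above replaces this step with an MCMC sampler such as M-H or DREAM(ZS)):

```python
import numpy as np

rng = np.random.default_rng(6)
sigma = 0.5

def loglik(theta):
    """Log of a normalized 2-D Gaussian likelihood centered at the origin."""
    return -0.5 * np.sum(theta ** 2) / sigma ** 2 - np.log(2 * np.pi * sigma ** 2)

def sample_prior(n):
    return rng.uniform(-2.0, 2.0, size=(n, 2))     # uniform prior on [-2, 2]^2

n_live, n_iter = 200, 1500
live = sample_prior(n_live)
live_ll = np.array([loglik(t) for t in live])

log_Z_terms = []
for i in range(1, n_iter + 1):
    worst = np.argmin(live_ll)
    L_min = live_ll[worst]
    # Prior-volume shrinkage: X_i ~ exp(-i / n_live); weight w_i = X_{i-1} - X_i.
    log_w = np.log(np.exp(-(i - 1) / n_live) - np.exp(-i / n_live))
    log_Z_terms.append(L_min + log_w)
    # Replace the worst live point with a new prior draw above the threshold.
    while True:
        cand = sample_prior(1)[0]
        if loglik(cand) > L_min:
            live[worst], live_ll[worst] = cand, loglik(cand)
            break

# Add the contribution of the remaining live points over the leftover volume.
X_final = np.exp(-n_iter / n_live)
log_Z_terms.append(np.log(np.mean(np.exp(live_ll))) + np.log(X_final))

log_Z = np.logaddexp.reduce(log_Z_terms)
exact = np.log(1.0 / 16.0)   # evidence = integral of L * prior ~ (1) * (1/16)
print(f"nested-sampling log-evidence: {log_Z:.3f}   exact (approx.): {exact:.3f}")
```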
NASA Astrophysics Data System (ADS)
Alevizos, Evangelos; Snellen, Mirjam; Simons, Dick; Siemes, Kerstin; Greinert, Jens
2018-06-01
This study applies three classification methods exploiting the angular dependence of acoustic seafloor backscatter along with high resolution sub-bottom profiling for seafloor sediment characterization in the Eckernförde Bay, Baltic Sea, Germany. This area is well suited for acoustic backscatter studies due to its shallowness, its smooth bathymetry and the presence of a wide range of sediment types. Backscatter data were acquired using a Seabeam1180 (180 kHz) multibeam echosounder and sub-bottom profiler data were recorded using a SES-2000 parametric sonar transmitting 6 and 12 kHz. The high density of seafloor soundings allowed extracting backscatter layers for five beam angles over a large part of the surveyed area. A Bayesian probability method was employed for sediment classification based on the backscatter variability at a single incidence angle, whereas Maximum Likelihood Classification (MLC) and Principal Components Analysis (PCA) were applied to the multi-angle layers. The Bayesian approach was used for identifying the optimum number of acoustic classes because cluster validation is carried out prior to class assignment and class outputs are ordinal categorical values. The method is based on the principle that backscatter values from a single incidence angle follow a normal distribution for a particular sediment type. The resulting Bayesian classes were well correlated to median grain sizes and the percentage of coarse material. The MLC method uses angular response information from five layers of training areas extracted from the Bayesian classification map. The subsequent PCA is based on the transformation of these five layers into two principal components that comprise most of the data variability. These principal components were clustered into five classes after running an external cluster validation test. In general, both methods, MLC and PCA, separated the various sediment types effectively, showing good agreement (kappa > 0.7) with the Bayesian approach, which in turn correlates well with ground truth data (r^2 > 0.7). In addition, sub-bottom data were used in conjunction with the Bayesian classification results to characterize acoustic classes with respect to their geological and stratigraphic interpretation. The joint interpretation of seafloor and sub-seafloor data sets proved to be an efficient approach for better understanding seafloor backscatter patchiness and for discriminating acoustically similar classes in different geological/bathymetric settings.
Grummer, Jared A; Bryson, Robert W; Reeder, Tod W
2014-03-01
Current molecular methods of species delimitation are limited by the types of species delimitation models and scenarios that can be tested. Bayes factors allow for more flexibility in testing non-nested species delimitation models and hypotheses of individual assignment to alternative lineages. Here, we examined the efficacy of Bayes factors in delimiting species through simulations and empirical data from the Sceloporus scalaris species group. Marginal-likelihood scores of competing species delimitation models, from which Bayes factor values were compared, were estimated with four different methods: harmonic mean estimation (HME), smoothed harmonic mean estimation (sHME), path-sampling/thermodynamic integration (PS), and stepping-stone (SS) analysis. We also performed model selection using a posterior simulation-based analog of the Akaike information criterion through Markov chain Monte Carlo analysis (AICM). Bayes factor species delimitation results from the empirical data were then compared with results from the reversible-jump MCMC (rjMCMC) coalescent-based species delimitation method Bayesian Phylogenetics and Phylogeography (BP&P). Simulation results show that HME and sHME perform poorly compared with PS and SS marginal-likelihood estimators when identifying the true species delimitation model. Furthermore, Bayes factor delimitation (BFD) of species showed improved performance when species limits are tested by reassigning individuals between species, as opposed to either lumping or splitting lineages. In the empirical data, BFD through PS and SS analyses, as well as the rjMCMC method, each provide support for the recognition of all scalaris group taxa as independent evolutionary lineages. Bayes factor species delimitation and BP&P also support the recognition of three previously undescribed lineages. In both simulated and empirical data sets, harmonic and smoothed harmonic mean marginal-likelihood estimators provided much higher marginal-likelihood estimates than PS and SS estimators. The AICM displayed poor repeatability in both simulated and empirical data sets, and produced inconsistent model rankings across replicate runs with the empirical data. Our results suggest that species delimitation through the use of Bayes factors with marginal-likelihood estimates via PS or SS analyses provide a useful and complementary alternative to existing species delimitation methods.
A simulation study on Bayesian Ridge regression models for several collinearity levels
NASA Astrophysics Data System (ADS)
Efendi, Achmad; Effrihan
2017-12-01
When analyzing data with a multiple regression model, if collinearities are present then one or several predictor variables are usually omitted from the model. Sometimes, however, for medical or economic reasons, all of the predictors are important and should be included in the model. Ridge regression is commonly used to cope with such collinearity. In this approach, weights (penalties) on the predictor variables are used in estimating the parameters, and the estimation can follow the likelihood principle. A Bayesian version of the estimation is an alternative. Bayesian estimation has historically been less popular than likelihood-based estimation because of computational and related difficulties, but with recent improvements in computational methodology this is no longer a serious obstacle. This paper discusses a simulation study for evaluating the characteristics of Bayesian ridge regression parameter estimates. Several simulation settings are considered, based on a variety of collinearity levels and sample sizes. The results show that the Bayesian method performs better for relatively small sample sizes, while for the other settings it performs similarly to the likelihood method.
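The Bayesian reading of ridge regression can be made concrete with a small closed-form sketch, assuming a Gaussian prior on the coefficients and a known error variance, so that the posterior mean coincides with the ridge estimator; the collinear predictors below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(7)
n, sigma, tau = 30, 1.0, 1.0                      # small sample, as in the study

# Two highly collinear predictors plus one independent predictor.
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)          # correlation near 1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
beta_true = np.array([1.0, 1.0, -0.5])
y = X @ beta_true + rng.normal(scale=sigma, size=n)

# Prior beta ~ N(0, tau^2 I) and Gaussian errors give a Gaussian posterior whose
# mean is the ridge estimator with lambda = sigma^2 / tau^2.
lam = sigma ** 2 / tau ** 2
A = X.T @ X + lam * np.eye(3)
post_mean = np.linalg.solve(A, X.T @ y)           # ridge / posterior mean
post_cov = sigma ** 2 * np.linalg.inv(A)          # posterior covariance

ols = np.linalg.lstsq(X, y, rcond=None)[0]
print("OLS estimate             :", np.round(ols, 3))
print("Bayesian ridge post. mean:", np.round(post_mean, 3))
print("posterior sd             :", np.round(np.sqrt(np.diag(post_cov)), 3))
```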
GNSS Spoofing Detection and Mitigation Based on Maximum Likelihood Estimation
Li, Hong; Lu, Mingquan
2017-01-01
Spoofing attacks are threatening the global navigation satellite system (GNSS). The maximum likelihood estimation (MLE)-based positioning technique is a direct positioning method originally developed for multipath rejection and weak signal processing. We find this method also has a potential ability for GNSS anti-spoofing since a spoofing attack that misleads the positioning and timing result will cause distortion to the MLE cost function. Based on the method, an estimation-cancellation approach is presented to detect spoofing attacks and recover the navigation solution. A statistic is derived for spoofing detection with the principle of the generalized likelihood ratio test (GLRT). Then, the MLE cost function is decomposed to further validate whether the navigation solution obtained by MLE-based positioning is formed by consistent signals. Both formulae and simulations are provided to evaluate the anti-spoofing performance. Experiments with recordings in real GNSS spoofing scenarios are also performed to validate the practicability of the approach. Results show that the method works even when the code phase differences between the spoofing and authentic signals are much less than one code chip, which can improve the availability of GNSS service greatly under spoofing attacks. PMID:28665318
GNSS Spoofing Detection and Mitigation Based on Maximum Likelihood Estimation.
Wang, Fei; Li, Hong; Lu, Mingquan
2017-06-30
Spoofing attacks are threatening the global navigation satellite system (GNSS). The maximum likelihood estimation (MLE)-based positioning technique is a direct positioning method originally developed for multipath rejection and weak signal processing. We find this method also has a potential ability for GNSS anti-spoofing since a spoofing attack that misleads the positioning and timing result will cause distortion to the MLE cost function. Based on the method, an estimation-cancellation approach is presented to detect spoofing attacks and recover the navigation solution. A statistic is derived for spoofing detection with the principle of the generalized likelihood ratio test (GLRT). Then, the MLE cost function is decomposed to further validate whether the navigation solution obtained by MLE-based positioning is formed by consistent signals. Both formulae and simulations are provided to evaluate the anti-spoofing performance. Experiments with recordings in real GNSS spoofing scenarios are also performed to validate the practicability of the approach. Results show that the method works even when the code phase differences between the spoofing and authentic signals are much less than one code chip, which can improve the availability of GNSS service greatly under spoofing attacks.
Li, Xiang; Kuk, Anthony Y C; Xu, Jinfeng
2014-12-10
Human biomonitoring of exposure to environmental chemicals is important. Individual monitoring is not viable because of low individual exposure level or insufficient volume of materials and the prohibitive cost of taking measurements from many subjects. Pooling of samples is an efficient and cost-effective way to collect data. Estimation is, however, complicated as individual values within each pool are not observed but are only known up to their average or weighted average. The distribution of such averages is intractable when the individual measurements are lognormally distributed, which is a common assumption. We propose to replace the intractable distribution of the pool averages by a Gaussian likelihood to obtain parameter estimates. If the pool size is large, this method produces statistically efficient estimates, but regardless of pool size, the method yields consistent estimates as the number of pools increases. An empirical Bayes (EB) Gaussian likelihood approach, as well as its Bayesian analog, is developed to pool information from various demographic groups by using a mixed-effect formulation. We also discuss methods to estimate the underlying mean-variance relationship and to select a good model for the means, which can be incorporated into the proposed EB or Bayes framework. By borrowing strength across groups, the EB estimator is more efficient than the individual group-specific estimator. Simulation results show that the EB Gaussian likelihood estimates outperform a previous method proposed for the National Health and Nutrition Examination Surveys with much smaller bias and better coverage in interval estimation, especially after correction of bias. Copyright © 2014 John Wiley & Sons, Ltd.
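A hedged sketch of the basic Gaussian-likelihood idea for pooled lognormal data (synthetic pools, and without the empirical-Bayes extension): the intractable distribution of the pool averages is replaced by a normal distribution with matched mean and variance, whose likelihood is then maximized:

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(8)
pool_size, n_pools = 8, 150
mu_true, sigma_true = 1.0, 0.6

# Observed data: only the average of each pool of lognormal measurements.
individuals = rng.lognormal(mu_true, sigma_true, size=(n_pools, pool_size))
pool_avg = individuals.mean(axis=1)

def negloglik(params):
    mu, log_sigma = params
    s2 = np.exp(log_sigma) ** 2
    m = np.exp(mu + s2 / 2.0)                             # E[X] for a lognormal
    v = (np.exp(s2) - 1.0) * np.exp(2.0 * mu + s2)        # Var[X] for a lognormal
    # Gaussian approximation for the average of `pool_size` measurements.
    return -np.sum(stats.norm.logpdf(pool_avg, loc=m, scale=np.sqrt(v / pool_size)))

fit = optimize.minimize(negloglik, x0=[0.0, 0.0], method="Nelder-Mead")
print("estimated mu, sigma:", fit.x[0], np.exp(fit.x[1]))
```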
A strategy for improved computational efficiency of the method of anchored distributions
NASA Astrophysics Data System (ADS)
Over, Matthew William; Yang, Yarong; Chen, Xingyuan; Rubin, Yoram
2013-06-01
This paper proposes a strategy for improving the computational efficiency of model inversion using the method of anchored distributions (MAD) by "bundling" similar model parametrizations in the likelihood function. Inferring the likelihood function typically requires a large number of forward model (FM) simulations for each possible model parametrization; as a result, the process is quite expensive. To ease this prohibitive cost, we present an approximation for the likelihood function called bundling that relaxes the requirement for high quantities of FM simulations. This approximation redefines the conditional statement of the likelihood function as the probability of a set of similar model parametrizations "bundle" replicating field measurements, which we show is neither a model reduction nor a sampling approach to improving the computational efficiency of model inversion. To evaluate the effectiveness of these modifications, we compare the quality of predictions and computational cost of bundling relative to a baseline MAD inversion of 3-D flow and transport model parameters. Additionally, to aid understanding of the implementation we provide a tutorial for bundling in the form of a sample data set and script for the R statistical computing language. For our synthetic experiment, bundling achieved a 35% reduction in overall computational cost and had a limited negative impact on predicted probability distributions of the model parameters. Strategies for minimizing error in the bundling approximation, for enforcing similarity among the sets of model parametrizations, and for identifying convergence of the likelihood function are also presented.
NASA Astrophysics Data System (ADS)
Wang, Z.
2015-12-01
For decades, distributed and lumped hydrological models have furthered our understanding of hydrological systems. The development of large-scale, high-resolution hydrological simulation has refined the spatial description of hydrological behavior. This trend, however, has been accompanied by increases in model complexity and in the number of parameters, which bring new challenges for uncertainty quantification. Generalized Likelihood Uncertainty Estimation (GLUE), a Monte Carlo method coupled with Bayesian estimation, has been widely used for uncertainty analysis of hydrological models. However, the random sampling of prior parameters adopted by GLUE is inefficient, especially in high-dimensional parameter spaces. Heuristic optimization algorithms based on iterative evolution show better convergence speed and search performance. In light of these features, this study adopted a genetic algorithm, differential evolution, and the shuffled complex evolution algorithm to search the parameter space and obtain parameter sets with large likelihoods. Based on this multi-algorithm sampling, hydrological model uncertainty analysis is conducted within the typical GLUE framework. To demonstrate the merits of the new method, two hydrological models of different complexity are examined. The results show that the adaptive method tends to be efficient in sampling and effective in uncertainty analysis, providing an alternative path for uncertainty quantification.
NASA Astrophysics Data System (ADS)
Coakley, Kevin J.; Vecchia, Dominic F.; Hussey, Daniel S.; Jacobson, David L.
2013-10-01
At the NIST Neutron Imaging Facility, we collect neutron projection data for both the dry and wet states of a Proton-Exchange-Membrane (PEM) fuel cell. Transmitted thermal neutrons captured in a scintillator doped with lithium-6 produce scintillation light that is detected by an amorphous silicon detector. Based on joint analysis of the dry and wet state projection data, we reconstruct a residual neutron attenuation image with a Penalized Likelihood method with an edge-preserving Huber penalty function that has two parameters that control how well jumps in the reconstruction are preserved and how well noisy fluctuations are smoothed out. The choice of these parameters greatly influences the resulting reconstruction. We present a data-driven method that objectively selects these parameters, and study its performance for both simulated and experimental data. Before reconstruction, we transform the projection data so that the variance-to-mean ratio is approximately one. For both simulated and measured projection data, the Penalized Likelihood method reconstruction is visually sharper than a reconstruction yielded by a standard Filtered Back Projection method. In an idealized simulation experiment, we demonstrate that the cross validation procedure selects regularization parameters that yield a reconstruction that is nearly optimal according to a root-mean-square prediction error criterion.
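The edge-preserving Huber penalty and its two controls can be written down in a few lines; the sketch below shows only the penalty term applied to neighboring-pixel differences, not the full penalized-likelihood reconstruction:

```python
import numpy as np

def huber(t, delta):
    """Quadratic for |t| <= delta, linear beyond; continuously differentiable."""
    a = np.abs(t)
    return np.where(a <= delta, 0.5 * t ** 2, delta * (a - 0.5 * delta))

def penalty(image, weight, delta):
    """Sum of Huber-penalized differences between horizontal and vertical neighbors."""
    dx = np.diff(image, axis=1)
    dy = np.diff(image, axis=0)
    return weight * (huber(dx, delta).sum() + huber(dy, delta).sum())

# A noisy image with one sharp edge: the linear regime keeps the jump cheap to
# preserve, while the quadratic regime still smooths the small noisy fluctuations.
rng = np.random.default_rng(9)
img = np.zeros((32, 32))
img[:, 16:] = 1.0
noisy = img + rng.normal(scale=0.05, size=img.shape)
print("penalty, edge image :", penalty(noisy, weight=1.0, delta=0.1))
print("penalty, flat image :", penalty(noisy - img, weight=1.0, delta=0.1))
```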
Bayesian inference for OPC modeling
NASA Astrophysics Data System (ADS)
Burbine, Andrew; Sturtevant, John; Fryer, David; Smith, Bruce W.
2016-03-01
The use of optical proximity correction (OPC) demands increasingly accurate models of the photolithographic process. Model building and inference techniques in the data science community have seen great strides in the past two decades which make better use of available information. This paper aims to demonstrate the predictive power of Bayesian inference as a method for parameter selection in lithographic models by quantifying the uncertainty associated with model inputs and wafer data. Specifically, the method combines the model builder's prior information about each modelling assumption with the maximization of each observation's likelihood as a Student's t-distributed random variable. Through the use of a Markov chain Monte Carlo (MCMC) algorithm, a model's parameter space is explored to find the most credible parameter values. During parameter exploration, the parameters' posterior distributions are generated by applying Bayes' rule, using a likelihood function and the a priori knowledge supplied. The MCMC algorithm used, an affine invariant ensemble sampler (AIES), is implemented by initializing many walkers which semi-independently explore the space. The convergence of these walkers to global maxima of the likelihood volume determines the parameter values' highest density intervals (HDI) to reveal champion models. We show that this method of parameter selection provides insights into the data that traditional methods do not and outline continued experiments to vet the method.
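A compact sketch of this workflow, assuming a Student's-t likelihood, flat box priors, and the emcee package (one publicly available AIES implementation); the "model" here is just a straight line rather than a lithographic model, and all data are synthetic:

```python
import numpy as np
import emcee
from scipy import stats

rng = np.random.default_rng(10)
x = np.linspace(0.0, 1.0, 40)
y = 2.0 * x + 0.5 + 0.05 * rng.standard_t(df=4, size=x.size)   # synthetic measurements

def log_prob(theta):
    slope, intercept, log_scale = theta
    if not (-10 < slope < 10 and -10 < intercept < 10 and -6 < log_scale < 1):
        return -np.inf                               # flat priors on a box
    resid = y - (slope * x + intercept)
    return np.sum(stats.t.logpdf(resid, df=4, scale=np.exp(log_scale)))

ndim, nwalkers = 3, 32
p0 = np.array([1.0, 0.0, -2.0]) + 1e-3 * rng.normal(size=(nwalkers, ndim))
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
sampler.run_mcmc(p0, 2000, progress=False)

chain = sampler.get_chain(discard=500, flat=True)
for name, col in zip(["slope", "intercept", "log_scale"], chain.T):
    lo, hi = np.percentile(col, [2.5, 97.5])         # a simple credible interval
    print(f"{name:10s} posterior mean {col.mean():.3f}  95% interval [{lo:.3f}, {hi:.3f}]")
```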
Ling, Cheng; Hamada, Tsuyoshi; Gao, Jingyang; Zhao, Guoguang; Sun, Donghong; Shi, Weifeng
2016-01-01
MrBayes is a widespread phylogenetic inference tool harnessing empirical evolutionary models and Bayesian statistics. However, the computational cost of the likelihood estimation is very high, resulting in undesirably long execution times. Although a number of multi-threaded optimizations have been proposed to speed up MrBayes, there are bottlenecks that severely limit the GPU thread-level parallelism of likelihood estimations. This study proposes a high performance and resource-efficient method for GPU-oriented parallelization of likelihood estimations. Instead of having to rely on empirical programming, the proposed novel decomposition storage model implements high performance data transfers implicitly. In terms of performance improvement, a speedup factor of up to 178 can be achieved on the analysis of simulated datasets by four Tesla K40 cards. In comparison to the other publicly available GPU-oriented MrBayes, the tgMC3++ method (proposed herein) outperforms the tgMC3 (v1.0), nMC3 (v2.1.1) and oMC3 (v1.00) methods by speedup factors of up to 1.6, 1.9 and 2.9, respectively. Moreover, tgMC3++ supports more evolutionary models and gamma categories, which previous GPU-oriented methods fail to include in the analysis.
The effort to personalize treatment plans for cancer patients involves the identification of drug treatments that can effectively target the disease while minimizing the likelihood of adverse reactions. In this study, the gene-expression profile of 810 cancer cell lines and their response data to 368 small molecules from the Cancer Therapeutics Research Portal (CTRP) are analyzed to identify pathways with significant rewiring between genes, or differential gene dependency, between sensitive and non-sensitive cell lines.
Establishment of a center of excellence for applied mathematical and statistical research
NASA Technical Reports Server (NTRS)
Woodward, W. A.; Gray, H. L.
1983-01-01
The state of the art was assessed with regard to efforts in support of the crop production estimation problem, and alternative generic proportion estimation techniques were investigated. Topics covered include modeling the greenness profile (Badhwar's model), parameter estimation using mixture models such as CLASSY, and minimum distance estimation as an alternative to maximum likelihood estimation. Approaches to the problem of obtaining proportion estimates when the underlying distributions are asymmetric are examined, including the properties of the Weibull distribution.
Effect of Aeroallergen Sensitization on Asthma Control in ...
In African-American adolescents with persistent asthma, allergic profile predicted the likelihood of having poorly controlled asthma despite guidelines-directed therapies. Our results suggest that tree and weed pollen sensitization are independent risk factors for poorly controlled asthma in this at-risk population. The study examined African-American children with difficult to treat asthma. The findings suggest that in addition to guidelines-directed asthma therapies, targeting the allergic component, particularly tree and weed pollen, is critical to achieving optimal asthma control in this at-risk population.
Development of a Strain Rate Dependent Long Bone Injury Criterion for Use with the ATB Model.
1982-01-12
testing of this computer model and has applied it to the analysis of the response of pilots to ejection from jet aircraft. During these events the body is ... acceleration profiles, restraint systems and other variables as to their injury preventing potential. Currently these assessments must be made, in a very ... fractures, it is of particular interest to estimate the likelihood of long bone fracture. (It should be noted that a separate computer model, the
Effect of sampling rate and record length on the determination of stability and control derivatives
NASA Technical Reports Server (NTRS)
Brenner, M. J.; Iliff, K. W.; Whitman, R. K.
1978-01-01
Flight data from five aircraft were used to assess the effects of sampling rate and record length reductions on estimates of stability and control derivatives produced by a maximum likelihood estimation method. Derivatives could be extracted from flight data with the maximum likelihood estimation method even if there were considerable reductions in sampling rate and/or record length. Small amplitude pulse maneuvers showed greater degradation of the derivative estimates than large amplitude pulse maneuvers when these reductions were made. Reducing the sampling rate was found to be more desirable than reducing the record length as a method of lessening the total computation time required without greatly degrading the quality of the estimates.
Characterization, parameter estimation, and aircraft response statistics of atmospheric turbulence
NASA Technical Reports Server (NTRS)
Mark, W. D.
1981-01-01
A non-Gaussian three-component model of atmospheric turbulence is postulated that accounts for readily observable features of turbulence velocity records, their autocorrelation functions, and their spectra. Methods for computing probability density functions and mean exceedance rates of a generic aircraft response variable are developed using non-Gaussian turbulence characterizations readily extracted from velocity recordings. A maximum likelihood method is developed for optimal estimation of the integral scale and intensity of records possessing von Karman transverse or longitudinal spectra. Formulas for the variances of such parameter estimates are developed. The maximum likelihood and least-squares approaches are combined to yield a method for estimating the autocorrelation function parameters of a two-component model for turbulence.
Discoveries far from the lamppost with matrix elements and ranking
DOE Office of Scientific and Technical Information (OSTI.GOV)
Debnath, Dipsikha; Gainer, James S.; Matchev, Konstantin T.
2015-04-01
The prevalence of null results in searches for new physics at the LHC motivates the effort to make these searches as model-independent as possible. We describe procedures for adapting the Matrix Element Method for situations where the signal hypothesis is not known a priori. We also present general and intuitive approaches for performing analyses and presenting results, which involve the flattening of background distributions using likelihood information. The first flattening method involves ranking events by background matrix element, the second involves quantile binning with respect to likelihood (and other) variables, and the third method involves reweighting histograms by the inverse of the background distribution.
NASA Astrophysics Data System (ADS)
Li, Yan; Wu, Mingwei; Du, Xinwei; Xu, Zhuoran; Gurusamy, Mohan; Yu, Changyuan; Kam, Pooi-Yuen
2018-02-01
A novel soft-decision-aided maximum likelihood (SDA-ML) carrier phase estimation method and its simplified version, the decision-aided and soft-decision-aided maximum likelihood (DA-SDA-ML) methods are tested in a nonlinear phase noise-dominant channel. The numerical performance results show that both the SDA-ML and DA-SDA-ML methods outperform the conventional DA-ML in systems with constant-amplitude modulation formats. In addition, modified algorithms based on constellation partitioning are proposed. With partitioning, the modified SDA-ML and DA-SDA-ML are shown to be useful for compensating the nonlinear phase noise in multi-level modulation systems.
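For orientation only, the sketch below implements the conventional decision-aided ML (DA-ML) carrier phase estimator that this abstract uses as its baseline, not the authors' SDA-ML or DA-SDA-ML variants; the QPSK block length, phase offset and SNR are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(1)

    # Illustrative QPSK block with a constant carrier phase offset and AWGN
    N, phase, snr_db = 256, 0.35, 12.0
    symbols = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, N)))
    noise_std = 10 ** (-snr_db / 20) / np.sqrt(2)
    r = symbols * np.exp(1j * phase) + noise_std * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

    def qpsk_decision(z):
        # Map each received sample to the nearest QPSK constellation point
        return (np.sign(z.real) + 1j * np.sign(z.imag)) / np.sqrt(2)

    # Conventional DA-ML estimate: angle of the decision-aided correlation sum
    d = qpsk_decision(r)
    phase_hat = np.angle(np.sum(r * np.conj(d)))
    print(f"true phase {phase:.3f} rad, DA-ML estimate {phase_hat:.3f} rad")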
ERIC Educational Resources Information Center
Andersen, Erling B.
A computer program for solving the conditional likelihood equations arising in the Rasch model for questionnaires is described. The estimation method and the computational problems involved are described in a previous research report by Andersen, but a summary of those results is given in two sections of this paper. A working example is also…
Importance of target-mediated drug disposition for small molecules.
Smith, Dennis A; van Waterschoot, Robert A B; Parrott, Neil J; Olivares-Morales, Andrés; Lavé, Thierry; Rowland, Malcolm
2018-06-18
Target concentration is typically not considered in drug discovery. However, if targets are expressed at relatively high concentrations and compounds have high affinity, such that most of the drug is bound to its target, in vitro screens can give unreliable information on compound affinity. In vivo, a similar situation will generate pharmacokinetic (PK) profiles that deviate greatly from those normally expected, owing to target binding affecting drug distribution and clearance. Such target-mediated drug disposition (TMDD) effects on small molecules have received little attention and might only become apparent during clinical trials, with the potential for data misinterpretation. TMDD also confounds human microdosing approaches by providing therapeutically unrepresentative PK profiles. Being aware of these phenomena will improve the likelihood of successful drug discovery and development. Copyright © 2018. Published by Elsevier Ltd.
Likelihood-based confidence intervals for estimating floods with given return periods
NASA Astrophysics Data System (ADS)
Martins, Eduardo Sávio P. R.; Clarke, Robin T.
1993-06-01
This paper discusses aspects of the calculation of likelihood-based confidence intervals for T-year floods, with particular reference to (1) the two-parameter gamma distribution; (2) the Gumbel distribution; (3) the two-parameter log-normal distribution, and other distributions related to the normal by Box-Cox transformations. Calculation of the confidence limits is straightforward using the Nelder-Mead algorithm with a constraint incorporated, although care is necessary to ensure convergence either of the Nelder-Mead algorithm, or of the Newton-Raphson calculation of maximum-likelihood estimates. Methods are illustrated using records from 18 gauging stations in the basin of the River Itajai-Acu, State of Santa Catarina, southern Brazil. A small and restricted simulation compared likelihood-based confidence limits with those given by use of the central limit theorem; for the same confidence probability, the confidence limits of the simulation were wider than those of the central limit theorem, which failed more frequently to contain the true quantile being estimated. The paper discusses possible applications of likelihood-based confidence intervals in other areas of hydrological analysis.
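The following Python sketch illustrates the general recipe described here for one of the listed cases, the Gumbel distribution: the T-year quantile is profiled by constrained maximization of the log-likelihood, and the 95% limits are the quantile values at which the profile drops by half the chi-squared critical value. The synthetic record, bounds and optimizer choices are assumptions for the example, not the paper's Itajai-Acu analysis.

    import numpy as np
    from scipy.optimize import minimize_scalar, brentq
    from scipy.stats import gumbel_r, chi2

    rng = np.random.default_rng(2)
    x = gumbel_r.rvs(loc=100.0, scale=25.0, size=40, random_state=rng)  # synthetic annual maxima
    T = 100.0
    yT = -np.log(-np.log(1.0 - 1.0 / T))   # reduced Gumbel variate for the T-year event

    def loglik(mu, beta):
        z = (x - mu) / beta
        return np.sum(-np.log(beta) - z - np.exp(-z))

    def profile(q):
        # Maximize over beta with mu constrained so that the T-year quantile equals q
        obj = lambda lb: -loglik(q - np.exp(lb) * yT, np.exp(lb))
        return -minimize_scalar(obj, bounds=(-5, 10), method="bounded").fun

    # Unconstrained MLE of the quantile, obtained by maximizing the profile itself
    res = minimize_scalar(lambda q: -profile(q), bounds=(x.min(), x.max() + 10 * x.std()), method="bounded")
    q_hat, l_max = res.x, -res.fun
    cut = l_max - 0.5 * chi2.ppf(0.95, df=1)

    lower = brentq(lambda q: profile(q) - cut, x.min(), q_hat)
    upper = brentq(lambda q: profile(q) - cut, q_hat, q_hat + 20 * x.std())
    print(f"100-year flood MLE {q_hat:.1f}, 95% profile-likelihood CI ({lower:.1f}, {upper:.1f})")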
Estimating metallicities with isochrone fits to photometric data of open clusters
NASA Astrophysics Data System (ADS)
Monteiro, H.; Oliveira, A. F.; Dias, W. S.; Caetano, T. C.
2014-10-01
The metallicity is a critical parameter that affects the correct determination of stellar clusters' fundamental characteristics and has important implications in Galactic and stellar evolution research. Fewer than 10% of the 2174 currently catalogued open clusters have their metallicity determined in the literature. In this work we present a method for estimating the metallicity of open clusters via non-subjective isochrone fitting using the cross-entropy global optimization algorithm applied to UBV photometric data. The free parameters distance, reddening, age, and metallicity are simultaneously determined by the fitting method. The fitting procedure uses weights for the observational data based on the estimation of membership likelihood for each star, which considers the observational magnitude limit, the density profile of stars as a function of radius from the center of the cluster, and the density of stars in multi-dimensional magnitude space. We present results of [Fe/H] for well-studied open clusters based on distinct UBV data sets. The [Fe/H] values obtained in the ten cases for which spectroscopic determinations were available in the literature agree with those determinations, indicating that our method provides a good alternative for estimating [Fe/H] via objective isochrone fitting. Our results show that the typical precision is about 0.1 dex.
Laquale, Michele Giovanni; Coppola, Gabrielle; Cassibba, Rosalinda; Pasceri, Maria; Pietralunga, Susanna; Taurino, Alessandro; Semeraro, Cristina; Grattagliano, Ignazio
2018-04-16
The study aimed at investigating the role of confidence in attachment relationships and marital status as protective factors for incarcerated fathers' self-perceived parental role and in-person contacts with their children. Participants included 150 inmate fathers and 145 nonincarcerated control fathers who provided background sociodemographic information and completed two self-reports, the Attachment Style Questionnaire and the Self-Perception of Parental Role. A two-phased cluster analytic plan allowed us to highlight two profiles of self-perceived parental roles, with incarceration and low confidence in attachment relationships increasing the risk of the less optimal of the two profiles. Higher confidence in attachment relationships and having a stable romantic relationship increased the likelihood of incarcerated fathers engaging in frequent contacts with their children, while the profile of self-perceived parental role had no effect. Implications for practice are discussed, and suggestions for further research are provided. © 2018 American Academy of Forensic Sciences.
Hoyland, Meredith A; Rowatt, Wade C; Latendresse, Shawn J
2017-01-01
Prior research has demonstrated that adolescent delinquency and depression are prospectively related to adult alcohol use and that adolescent religiosity may influence these relationships. However, such associations have not been investigated using person-centered approaches that provide nuanced explorations of these constructs. Using data from the National Longitudinal Study of Adolescent to Adult Health, we examined whether adolescent delinquency and depression differentiated typologies of adult alcohol users and whether these relationships varied across religiosity profiles. Three typologies of self-identified Christian adolescents and 4 types of adult alcohol users were identified via latent profile analysis. Delinquency and depression were related to increased likelihood of membership in heavy drinking or problematic alcohol use profiles, but this relationship was most evident among those likely to be involved in religious practices. These results demonstrate the importance of person-centered approaches in characterizing the influences of internalizing and externalizing behaviors on subsequent patterns of alcohol use. PMID:28469423
Simulations of Foils Irradiated by Finite Laser Spots
NASA Astrophysics Data System (ADS)
Phillips, Lee
2006-10-01
Recently proposed designs (Obenschain et al., Phys. Plasmas 13 056320 (2006)) for direct-drive ICF targets for energy applications involve high implosion velocities with lower laser energies combined with higher irradiances. The use of high irradiances increases the likelihood of deleterious laser plasma instabilities (LPI) that may lead, for example, to the generation of fast electrons. The proposed use of a 248 nm KrF laser is expected to minimize LPI, and this is being studied by experiments on NRL's NIKE laser. Here we report on simulations aimed at designing and interpreting these experiments. The 2d simulations employ a modification of the FAST code to ablate plasma from CH and DT foils using laser pulses with arbitrary spatial and temporal profiles. These include the customary hypergaussian NIKE profile, gaussian profiles, and combinations of these. The simulations model the structure of the ablating plasma and the absorption of the laser light, providing parameters for design of the experiment and indicating where the relevant LPI (two-plasmon, Raman) may be observed.
Statistical inference of static analysis rules
NASA Technical Reports Server (NTRS)
Engler, Dawson Richards (Inventor)
2009-01-01
Various apparatus and methods are disclosed for identifying errors in program code. Respective numbers of observances of at least one correctness rule by different code instances that relate to the at least one correctness rule are counted in the program code. Each code instance has an associated counted number of observances of the correctness rule by the code instance. Also counted are respective numbers of violations of the correctness rule by different code instances that relate to the correctness rule. Each code instance has an associated counted number of violations of the correctness rule by the code instance. A respective likelihood of validity is determined for each code instance as a function of the counted number of observances and counted number of violations. The likelihood of validity indicates a relative likelihood that a related code instance is required to observe the correctness rule. The violations may be output in order of the likelihood of validity of a violated correctness rule.
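As one hedged illustration of ranking violations by a likelihood of validity computed from observance and violation counts, the scoring rule below is a simple beta-binomial posterior mean chosen for the example, not necessarily the patented formula; the rules and counts are invented.

    def validity_score(observed: int, violated: int, alpha: float = 1.0, beta: float = 1.0) -> float:
        """Posterior mean of the 'rule is observed' rate under a Beta(alpha, beta) prior.

        A high score means the code base almost always observes the rule, so the few
        violations are more likely to be real errors and are reported first.
        """
        return (observed + alpha) / (observed + violated + alpha + beta)

    # Illustrative counts: (rule, times observed, times violated)
    counts = [
        ("lock A before B", 120, 2),
        ("check return of foo()", 15, 14),
        ("free(p) after use", 300, 1),
    ]

    # Report violations ordered by the likelihood that the violated rule is a real invariant
    for rule, obs, vio in sorted(counts, key=lambda c: validity_score(c[1], c[2]), reverse=True):
        print(f"{rule:>25s}  score={validity_score(obs, vio):.3f}  ({vio} violations)")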
Quirós, Elia; Felicísimo, Angel M; Cuartero, Aurora
2009-01-01
This work proposes a new method to classify multi-spectral satellite images based on multivariate adaptive regression splines (MARS) and compares this classification system with the more common parallelepiped and maximum likelihood (ML) methods. We apply the classification methods to the land cover classification of a test zone located in southwestern Spain. The basis of the MARS method and its associated procedures are explained in detail, and the area under the ROC curve (AUC) is compared for the three methods. The results show that the MARS method provides better results than the parallelepiped method in all cases, and it provides better results than the maximum likelihood method in 13 cases out of 17. These results demonstrate that the MARS method can be used in isolation or in combination with other methods to improve the accuracy of soil cover classification. The improvement is statistically significant according to the Wilcoxon signed rank test.
Methods, apparatus and system for selective duplication of subtasks
Andrade Costa, Carlos H.; Cher, Chen-Yong; Park, Yoonho; Rosenburg, Bryan S.; Ryu, Kyung D.
2016-03-29
A method for selective duplication of subtasks in a high-performance computing system includes: monitoring a health status of one or more nodes in a high-performance computing system, where one or more subtasks of a parallel task execute on the one or more nodes; identifying one or more nodes as having a likelihood of failure which exceeds a first prescribed threshold; selectively duplicating the one or more subtasks that execute on the one or more nodes having a likelihood of failure which exceeds the first prescribed threshold; and notifying a messaging library that one or more subtasks were duplicated.
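A minimal sketch of the idea, with invented node names, a made-up health monitor and a print statement standing in for the messaging-library notification; none of this is the patented implementation.

    import random
    from typing import Callable, Dict, List

    FAILURE_THRESHOLD = 0.2  # illustrative "first prescribed threshold"

    def node_failure_likelihood(node: str) -> float:
        """Stand-in health monitor; a real system would read ECC counts, temperature, etc."""
        return random.random() * 0.3

    def schedule(subtasks: Dict[str, str], spare_nodes: List[str],
                 monitor: Callable[[str], float] = node_failure_likelihood) -> Dict[str, List[str]]:
        """Map each subtask to its node, plus a duplicate node when the failure risk is high."""
        placement = {}
        for task, node in subtasks.items():
            nodes = [node]
            if monitor(node) > FAILURE_THRESHOLD and spare_nodes:
                nodes.append(spare_nodes.pop())  # selectively duplicate the at-risk subtask
                print(f"notify messaging library: {task} duplicated")  # stand-in notification
            placement[task] = nodes
        return placement

    print(schedule({"rank0": "node-a", "rank1": "node-b"}, spare_nodes=["node-s1", "node-s2"]))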
Gyre and gimble: a maximum-likelihood replacement for Patterson correlation refinement.
McCoy, Airlie J; Oeffner, Robert D; Millán, Claudia; Sammito, Massimo; Usón, Isabel; Read, Randy J
2018-04-01
Descriptions are given of the maximum-likelihood gyre method implemented in Phaser for optimizing the orientation and relative position of rigid-body fragments of a model after the orientation of the model has been identified, but before the model has been positioned in the unit cell, and also the related gimble method for the refinement of rigid-body fragments of the model after positioning. Gyre refinement helps to lower the root-mean-square atomic displacements between model and target molecular-replacement solutions for the test case of antibody Fab(26-10) and improves structure solution with ARCIMBOLDO_SHREDDER.
Quantum state estimation when qubits are lost: a no-data-left-behind approach
Williams, Brian P.; Lougovski, Pavel
2017-04-06
We present an approach to Bayesian mean estimation of quantum states using hyperspherical parametrization and an experiment-specific likelihood which allows utilization of all available data, even when qubits are lost. With this method, we report the first closed-form Bayesian mean and maximum likelihood estimates for the ideal single qubit. Due to computational constraints, we utilize numerical sampling to determine the Bayesian mean estimate for a photonic two-qubit experiment in which our novel analysis reduces burdens associated with experimental asymmetries and inefficiencies. This method can be applied to quantum states of any dimension and experimental complexity.
Eisenhauer, Philipp; Heckman, James J.; Mosso, Stefano
2015-01-01
We compare the performance of maximum likelihood (ML) and simulated method of moments (SMM) estimation for dynamic discrete choice models. We construct and estimate a simplified dynamic structural model of education that captures some basic features of educational choices in the United States in the 1980s and early 1990s. We use estimates from our model to simulate a synthetic dataset and assess the ability of ML and SMM to recover the model parameters on this sample. We investigate the performance of alternative tuning parameters for SMM. PMID:26494926
Franco-Pedroso, Javier; Ramos, Daniel; Gonzalez-Rodriguez, Joaquin
2016-01-01
In forensic science, trace evidence found at a crime scene and on a suspect has to be evaluated from the measurements performed on it, usually in the form of multivariate data (for example, several chemical compounds or physical characteristics). In order to assess the strength of that evidence, the likelihood ratio framework is being increasingly adopted. Several methods have been derived in order to obtain likelihood ratios directly from univariate or multivariate data by modelling both the variation appearing between observations (or features) coming from the same source (within-source variation) and that appearing between observations coming from different sources (between-source variation). In the widely used multivariate kernel likelihood ratio, the within-source distribution is assumed to be normally distributed and constant among different sources, and the between-source variation is modelled through a kernel density function (KDF). In order to better fit the observed distribution of the between-source variation, this paper presents a different approach in which a Gaussian mixture model (GMM) is used instead of a KDF. As will be shown, this approach provides better-calibrated likelihood ratios as measured by the log-likelihood-ratio cost (Cllr) in experiments performed on freely available forensic datasets involving different types of trace evidence: inks, glass fragments and car paints. PMID:26901680
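A hedged sketch of the two-level likelihood-ratio computation with a GMM between-source model: the within-source covariance is assumed known and common, the between-source density is a GaussianMixture fitted to per-source means, and the source mean is integrated out by Monte Carlo. The data, dimensionality and component count are invented for the example, and the paper's exact formulation may differ.

    import numpy as np
    from scipy.stats import multivariate_normal as mvn
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(3)

    # Illustrative background database: one mean vector per source in a 2-D feature space
    source_means = rng.standard_normal((200, 2)) @ np.array([[2.0, 0.5], [0.5, 1.0]])
    W = 0.05 * np.eye(2)          # within-source covariance, assumed known and common

    between = GaussianMixture(n_components=3, random_state=0).fit(source_means)

    def likelihood_ratio(y1, y2, n_mc=50_000):
        """Same-source vs different-source LR with the GMM as the between-source density.

        The source mean theta is integrated out by Monte Carlo, using the identity
        N(y | theta, W) = N(theta | y, W) because W is shared by both measurements.
        """
        theta, _ = between.sample(n_mc)
        f1 = mvn.pdf(theta, mean=y1, cov=W)
        f2 = mvn.pdf(theta, mean=y2, cov=W)
        return np.mean(f1 * f2) / (np.mean(f1) * np.mean(f2))

    same = source_means[0]
    print("same source   LR ~", likelihood_ratio(same + 0.1, same - 0.1))
    print("different src LR ~", likelihood_ratio(source_means[0], source_means[1]))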
Latent profiles of perceived time adequacy for paid work, parenting, and partner roles.
Lee, Soomi; Almeida, David M; Davis, Kelly D; King, Rosalind B; Hammer, Leslie B; Kelly, Erin L
2015-10-01
This study examined feelings of having enough time (i.e., perceived time adequacy) in a sample of employed parents (N = 880) in information technology and extended-care industries. Adapting a person-centered latent profile approach, we identified 3 profiles of perceived time adequacy for paid work, parenting, and partner roles: family time protected, family time sacrificed, and time balanced. Drawing upon the conservation of resources theory (Hobfoll, 1989), we examined the associations of stressors and resources with the time adequacy profiles. Parents in the family time sacrificed profile were more likely to be younger, women, have younger children, work in the extended-care industry, and have nonstandard work schedules compared to those in the family time protected profile. Results from multinomial logistic regression analyses revealed that, with the time balanced profile as the reference group, having fewer stressors and more resources in the family context (less parent-child conflict and more partner support), work context (longer company tenure, higher schedule control and job satisfaction), and work-family interface (lower work-to-family conflict) was linked to a higher probability of membership in the family time protected profile. By contrast, having more stressors and fewer resources, in the forms of less partner support and higher work-to-family conflict, predicted a higher likelihood of being in the family time sacrificed profile. Our findings suggest that low work-to-family conflict is the most critical predictor of membership in the family time protected profile, whereas lack of partner support is the most important predictor of membership in the family time sacrificed profile. (c) 2015 APA, all rights reserved.
GRID-BASED EXPLORATION OF COSMOLOGICAL PARAMETER SPACE WITH SNAKE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mikkelsen, K.; Næss, S. K.; Eriksen, H. K., E-mail: kristin.mikkelsen@astro.uio.no
2013-11-10
We present a fully parallelized grid-based parameter estimation algorithm for investigating multidimensional likelihoods called Snake, and apply it to cosmological parameter estimation. The basic idea is to map out the likelihood grid-cell by grid-cell according to decreasing likelihood, and stop when a certain threshold has been reached. This approach improves vastly on the 'curse of dimensionality' problem plaguing standard grid-based parameter estimation simply by disregarding grid cells with negligible likelihood. The main advantages of this method compared to standard Metropolis-Hastings Markov Chain Monte Carlo methods include (1) trivial extraction of arbitrary conditional distributions; (2) direct access to Bayesian evidences; (3) better sampling of the tails of the distribution; and (4) nearly perfect parallelization scaling. The main disadvantage is, as in the case of brute-force grid-based evaluation, a dependency on the number of parameters, N_par. One of the main goals of the present paper is to determine how large N_par can be, while still maintaining reasonable computational efficiency; we find that N_par = 12 is well within the capabilities of the method. The performance of the code is tested by comparing cosmological parameters estimated using Snake and the WMAP-7 data with those obtained using CosmoMC, the current standard code in the field. We find fully consistent results, with similar computational expenses, but shorter wall time due to the perfect parallelization scheme.
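A toy sketch of the grid-cell-by-grid-cell idea in Python (not the Snake code itself): cells are expanded from a max-heap in order of decreasing log-likelihood, and exploration stops once the frontier falls a fixed amount below the running peak, so negligible-likelihood cells are never evaluated. Grid size, threshold and the Gaussian test likelihood are assumptions.

    import heapq
    import numpy as np

    def snake_explore(loglike, start, shape, stop_delta=10.0):
        """Map a log-likelihood grid cell-by-cell in order of decreasing likelihood."""
        best = loglike(start)
        visited = {start: best}
        heap = [(-best, start)]
        while heap:
            neg_ll, cell = heapq.heappop(heap)
            if -neg_ll < best - stop_delta:
                break                                   # everything left is negligible
            best = max(best, -neg_ll)
            for dim in range(len(shape)):
                for step in (-1, 1):
                    nxt = list(cell); nxt[dim] += step; nxt = tuple(nxt)
                    if all(0 <= nxt[d] < shape[d] for d in range(len(shape))) and nxt not in visited:
                        visited[nxt] = loglike(nxt)
                        heapq.heappush(heap, (-visited[nxt], nxt))
        return visited

    # Toy 2-parameter example: a Gaussian log-likelihood on a 101x101 grid
    grid = np.linspace(-5, 5, 101)
    ll = lambda c: -0.5 * (grid[c[0]] ** 2 + 4.0 * grid[c[1]] ** 2)
    cells = snake_explore(ll, start=(60, 45), shape=(101, 101))
    print(f"evaluated {len(cells)} of {101 * 101} cells")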
NASA Technical Reports Server (NTRS)
Lei, Ning; Chiang, Kwo-Fu; Oudrari, Hassan; Xiong, Xiaoxiong
2011-01-01
Optical sensors aboard Earth orbiting satellites such as the next generation Visible/Infrared Imager/Radiometer Suite (VIIRS) assume that the sensor's radiometric response in the Reflective Solar Bands (RSB) is described by a quadratic polynomial relating the aperture spectral radiance to the sensor Digital Number (DN) readout. For VIIRS Flight Unit 1, the coefficients are to be determined before launch by an attenuation method, although the linear coefficient will be further determined on-orbit through observing the Solar Diffuser. In determining the quadratic polynomial coefficients by the attenuation method, a Maximum Likelihood approach is applied in carrying out the least-squares procedure. Crucial to the Maximum Likelihood least-squares procedure is the computation of the weight. The weight not only has a contribution from the noise of the sensor's digital count, with an important contribution from digitization error, but is also affected heavily by the mathematical expression used to predict the value of the dependent variable, because both the independent and the dependent variables contain random noise. In addition, model errors have a major impact on the uncertainties of the coefficients. The Maximum Likelihood approach demonstrates the inadequacy of the attenuation method model with a quadratic polynomial for the retrieved spectral radiance. We show that using the inadequate model dramatically increases the uncertainties of the coefficients. We compute the coefficient values and their uncertainties, considering both measurement and model errors.
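As an illustration of the kind of weighting issue described, the following sketch fits a quadratic radiance-versus-DN curve by iteratively reweighted least squares, folding the DN noise into the weight through the local slope (an "effective variance" approximation to maximum likelihood when both variables are noisy). The coefficients, noise levels and DN range are invented; this is not the VIIRS calibration code.

    import numpy as np

    rng = np.random.default_rng(4)

    # Synthetic calibration points: a true quadratic response with noise in both
    # the radiance (y) and the digital-number (x) measurements; values illustrative
    c_true = np.array([5.0, 0.02, 2.0e-6])
    dn = np.linspace(500.0, 3500.0, 25)
    sigma_dn, sigma_L = 2.0, 0.05
    x = dn + sigma_dn * rng.standard_normal(dn.size)
    y = np.polyval(c_true[::-1], dn) + sigma_L * rng.standard_normal(dn.size)

    def effective_variance_fit(x, y, n_iter=5):
        """Quadratic fit by iteratively reweighted least squares.

        Each point's weight folds the DN noise into the radiance direction through
        the local slope of the current fit, then the weighted fit is repeated.
        """
        coef = np.polyfit(x, y, 2)[::-1]          # start from an ordinary fit, low order first
        A = np.column_stack([np.ones_like(x), x, x ** 2])
        for _ in range(n_iter):
            slope = coef[1] + 2.0 * coef[2] * x
            w = 1.0 / (sigma_L ** 2 + (slope * sigma_dn) ** 2)
            coef, *_ = np.linalg.lstsq(A * np.sqrt(w)[:, None], y * np.sqrt(w), rcond=None)
        return coef

    print("true  coefficients:", c_true)
    print("fitted coefficients:", effective_variance_fit(x, y))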
Multivariate Phylogenetic Comparative Methods: Evaluations, Comparisons, and Recommendations.
Adams, Dean C; Collyer, Michael L
2018-01-01
Recent years have seen increased interest in phylogenetic comparative analyses of multivariate data sets, but to date the varied proposed approaches have not been extensively examined. Here we review the mathematical properties required of any multivariate method, and specifically evaluate existing multivariate phylogenetic comparative methods in this context. Phylogenetic comparative methods based on the full multivariate likelihood are robust to levels of covariation among trait dimensions and are insensitive to the orientation of the data set, but display increasing model misspecification as the number of trait dimensions increases. This is because the expected evolutionary covariance matrix (V) used in the likelihood calculations becomes more ill-conditioned as trait dimensionality increases, and as evolutionary models become more complex. Thus, these approaches are only appropriate for data sets with few traits and many species. Methods that summarize patterns across trait dimensions treated separately (e.g., SURFACE) incorrectly assume independence among trait dimensions, resulting in nearly a 100% model misspecification rate. Methods using pairwise composite likelihood are highly sensitive to levels of trait covariation, the orientation of the data set, and the number of trait dimensions. The consequences of these debilitating deficiencies are that a user can arrive at differing statistical conclusions, and therefore biological inferences, simply from a dataspace rotation, like principal component analysis. By contrast, algebraic generalizations of the standard phylogenetic comparative toolkit that use the trace of covariance matrices are insensitive to levels of trait covariation, the number of trait dimensions, and the orientation of the data set. Further, when appropriate permutation tests are used, these approaches display acceptable Type I error and statistical power. We conclude that methods summarizing information across trait dimensions, as well as pairwise composite likelihood methods should be avoided, whereas algebraic generalizations of the phylogenetic comparative toolkit provide a useful means of assessing macroevolutionary patterns in multivariate data. Finally, we discuss areas in which multivariate phylogenetic comparative methods are still in need of future development; namely highly multivariate Ornstein-Uhlenbeck models and approaches for multivariate evolutionary model comparisons. © The Author(s) 2017. Published by Oxford University Press on behalf of the Systematic Biology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
The Tropical Convective Spectrum. Part 1: Archetypal Vertical Structures
NASA Technical Reports Server (NTRS)
Boccippio, Dennis J.; Petersen, Walter A.; Cecil, Daniel J.
2005-01-01
A taxonomy of tropical convective and stratiform vertical structures is constructed through cluster analysis of 3 yr of Tropical Rainfall Measuring Mission (TRMM) "warm-season" (surface temperature greater than 10 C) precipitation radar (PR) vertical profiles, their surface rainfall, and associated radar-based classifiers (convective/ stratiform and brightband existence). Twenty-five archetypal profile types are identified, including nine convective types, eight stratiform types, two mixed types, and six anvil/fragment types (nonprecipitating anvils and sheared deep convective profiles). These profile types are then hierarchically clustered into 10 similar families, which can be further combined, providing an objective and physical reduction of the highly multivariate PR data space that retains vertical structure information. The taxonomy allows for description of any storm or local convective spectrum by the profile types or families. The analysis provides a quasi-independent corroboration of the TRMM 2A23 convective/ stratiform classification. The global frequency of occurrence and contribution to rainfall for the profile types are presented, demonstrating primary rainfall contribution by midlevel glaciated convection (27%) and similar depth decaying/stratiform stages (28%-31%). Profiles of these types exhibit similar 37- and 85-GHz passive microwave brightness temperatures but differ greatly in their frequency of occurrence and mean rain rates, underscoring the importance to passive microwave rain retrieval of convective/stratiform discrimination by other means, such as polarization or texture techniques, or incorporation of lightning observations. Close correspondence is found between deep convective profile frequency and annualized lightning production, and pixel-level lightning occurrence likelihood directly tracks the estimated mean ice water path within profile types.
Hall, Eric William; Heneine, Walid; Sanchez, Travis; Sineath, Robert Craig; Sullivan, Patrick
2016-05-19
Preexposure prophylaxis (PrEP) is available as a daily pill for preventing infection with the human immunodeficiency virus (HIV). Innovative methods of administering PrEP systemically or topically are being discussed and developed. The objective of our study was to assess attitudes toward different experimental modalities of PrEP administration. From April to July 2015, we recruited 1106 HIV-negative men who have sex with men through online social media advertisements and surveyed them about their likelihood of using different PrEP modalities. Participants responded to 5-point Likert-scale items indicating how likely they were to use each of the following PrEP modalities: a daily oral pill, on-demand pills, periodic injection, penile gel (either before or after intercourse), rectal gel (before/after), and rectal suppository (before/after). We used Wilcoxon signed rank tests to determine whether the stated likelihood of using any modality differed from daily oral PrEP. Related items were combined to assess differences in likelihood of use based on tissue or time of administration. Participants also ranked their interest in using each modality, and we used the modified Borda count method to determine consensual rankings. Most participants indicated they would be somewhat likely or very likely to use PrEP as an on-demand pill (685/1105, 61.99%), daily oral pill (528/1036, 50.97%), injection (575/1091, 52.70%), or penile gel (438/755, 58.01% before intercourse; 408/751, 54.33% after). The stated likelihoods of using on-demand pills (median score 4) and of using a penile gel before intercourse (median 4) were both higher than that of using a daily oral pill (median 4, P<.001 and P=.001, respectively). Compared with a daily oral pill, participants reported a significantly lower likelihood of using any of the 4 rectal modalities (Wilcoxon signed rank test, all P<.001). On 10-point Likert scales created by combining application methods, the reported likelihood of using a penile gel (median 7) was higher than that of using a rectal gel (median 6, P<.001), which was higher than the likelihood of using a rectal suppository (median 6, P<.001). The modified Borda count ranked on-demand pills as the most preferred modality. There was no difference in likelihood of use of PrEP (gel or suppository) before or after intercourse. Participants typically prefer systemic PrEP and are less likely to use a modality that is administered rectally. Although most of these modalities are seen as favorable or neutral, attitudes may change as information about efficacy and application becomes available. Further data on modality preference across risk groups will better inform PrEP development.
Improving and Evaluating Nested Sampling Algorithm for Marginal Likelihood Estimation
NASA Astrophysics Data System (ADS)
Ye, M.; Zeng, X.; Wu, J.; Wang, D.; Liu, J.
2016-12-01
With the growing impacts of climate change and human activities on the water cycle, an increasing amount of research focuses on quantifying modeling uncertainty. Bayesian model averaging (BMA) provides a popular framework for quantifying conceptual model and parameter uncertainty. The ensemble prediction is generated by combining each plausible model's prediction, and each model is assigned a weight determined by the model's prior weight and marginal likelihood. Thus, the estimation of a model's marginal likelihood is crucial for reliable and accurate BMA prediction. The nested sampling estimator (NSE) is a newly proposed method for marginal likelihood estimation. NSE searches the parameter space gradually from the low-likelihood region to the high-likelihood region, and this evolution is carried out iteratively via a local sampling procedure. Thus, the efficiency of NSE is dominated by the strength of the local sampling procedure. Currently, the Metropolis-Hastings (M-H) algorithm is often used for local sampling. However, M-H is not an efficient sampling algorithm for high-dimensional or complicated parameter spaces. To improve the efficiency of NSE, it is natural to incorporate the robust and efficient DREAMzs sampling algorithm into the local sampling step of NSE. The comparison results demonstrate that the improved NSE increases the efficiency of marginal likelihood estimation significantly. However, both the improved and the original NSE suffer from heavy instability. In addition, the heavy computational cost of the huge number of model executions is overcome by using adaptive sparse grid surrogates.
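A minimal nested-sampling evidence estimator on a toy two-parameter problem, for intuition only: the worst live point is repeatedly replaced by a constrained prior draw (simple rejection sampling here, whereas the abstract's point is precisely that a stronger local sampler such as M-H or DREAMzs is needed in realistic cases), and the evidence is accumulated with the usual exponential shrinkage weights. All problem settings are assumptions.

    import numpy as np

    rng = np.random.default_rng(5)

    # Toy problem: unit-square uniform prior, 2-D Gaussian likelihood centred at (0.5, 0.5);
    # because the likelihood is a normalized density, the exact log-evidence is close to 0
    sigma = 0.1
    def loglike(theta):
        return -0.5 * np.sum((theta - 0.5) ** 2) / sigma ** 2 - np.log(2 * np.pi * sigma ** 2)

    def nested_sampling(n_live=100, n_iter=800):
        """Minimal nested sampling estimate of the marginal likelihood (evidence)."""
        live = rng.uniform(size=(n_live, 2))
        live_ll = np.array([loglike(p) for p in live])
        log_z, log_x_prev = -np.inf, 0.0
        for i in range(1, n_iter + 1):
            worst = np.argmin(live_ll)
            log_x = -i / n_live                  # expected log prior volume remaining
            log_w = np.log(np.exp(log_x_prev) - np.exp(log_x))
            log_z = np.logaddexp(log_z, live_ll[worst] + log_w)
            threshold, log_x_prev = live_ll[worst], log_x
            while True:                          # replacement draw with L > L_worst
                cand = rng.uniform(size=2)
                if loglike(cand) > threshold:
                    live[worst], live_ll[worst] = cand, loglike(cand)
                    break
        # add the contribution of the remaining live points
        log_z = np.logaddexp(log_z, np.log(np.mean(np.exp(live_ll - live_ll.max())))
                             + live_ll.max() + log_x_prev)
        return log_z

    print("log evidence estimate:", nested_sampling())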
An assessment of the information content of likelihood ratios derived from complex mixtures.
Marsden, Clare D; Rudin, Norah; Inman, Keith; Lohmueller, Kirk E
2016-05-01
With the increasing sensitivity of DNA typing methodologies, as well as increasing awareness by law enforcement of the perceived capabilities of DNA typing, complex mixtures consisting of DNA from two or more contributors are increasingly being encountered. However, insufficient research has been conducted to characterize the ability to distinguish a true contributor (TC) from a known non-contributor (KNC) in these complex samples, and under what specific conditions. In order to investigate this question, sets of six 15-locus Caucasian genotype profiles were simulated and used to create mixtures containing 2-5 contributors. Likelihood ratios were computed for various situations, including varying numbers of contributors and unknowns in the evidence profile, as well as comparisons of the evidence profile to TCs and KNCs. This work was intended to illustrate the best-case scenario, in which all alleles from the TC were detected in the simulated evidence samples. Therefore the possibility of drop-out was not modeled in this study. The computer program DNAMIX was then used to compute LRs comparing the evidence profile to TCs and KNCs. This resulted in 140,000 LRs for each of the two scenarios. These complex mixture simulations show that, even when all alleles are detected (i.e. no drop-out), TCs can generate LRs less than 1 across a 15-locus profile. However, this outcome was rare, 7 of 140,000 replicates (0.005%), and associated only with mixtures comprising 5 contributors in which the numerator hypothesis includes one or more unknown contributors. For KNCs, LRs were found to be greater than 1 in a small number of replicates (75 of 140,000 replicates, or 0.05%). These replicates were limited to 4 and 5 person mixtures with 1 or more unknowns in the numerator. Only 5 of these 75 replicates (0.004%) yielded an LR greater than 1,000. Thus, overall, these results imply that the weight of evidence that can be derived from complex mixtures containing up to 5 contributors, under a scenario in which no drop-out is required to explain any of the contributors, is remarkably high. This is a useful benchmark result on top of which to layer the effects of additional factors, such as drop-out, peak height, and other variables. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Coelho, Carlos A.; Marques, Filipe J.
2013-09-01
In this paper the authors combine the equicorrelation and equivariance test introduced by Wilks [13] with the likelihood ratio test (l.r.t.) for independence of groups of variables to obtain the l.r.t. of block equicorrelation and equivariance. This test or its single block version may find applications in many areas as in psychology, education, medicine, genetics and they are important "in many tests of multivariate analysis, e.g. in MANOVA, Profile Analysis, Growth Curve analysis, etc" [12, 9]. By decomposing the overall hypothesis into the hypotheses of independence of groups of variables and the hypothesis of equicorrelation and equivariance we are able to obtain the expressions for the overall l.r.t. statistic and its moments. From these we obtain a suitable factorization of the characteristic function (c.f.) of the logarithm of the l.r.t. statistic, which enables us to develop highly manageable and precise near-exact distributions for the test statistic.
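Schematically, and only as a restatement of the construction described above with per-block parameters shown (the exact formulation and the moment expressions are those derived in the paper), the null hypothesis and the resulting factorizations can be written as

    H_0:\ \Sigma = \operatorname{diag}(\Sigma_1,\ldots,\Sigma_k), \qquad
    \Sigma_j = \sigma_j^2\bigl[(1-\rho_j)\,I + \rho_j\,J\bigr], \quad j=1,\ldots,k,

    H_0 = H_{0b}\circ H_{0a} \;\Longrightarrow\; \Lambda = \Lambda_a\,\Lambda_b, \qquad
    \Phi_{-\log\Lambda}(t) = \Phi_{-\log\Lambda_a}(t)\,\Phi_{-\log\Lambda_b}(t),

where H_{0a} is the hypothesis of independence of the k groups of variables, H_{0b} is equicorrelation and equivariance within the groups, and the factorization of the characteristic function of -log Λ is what the near-exact distributions are built from.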
Effect of squeeze film damper land geometry on damper performance
NASA Astrophysics Data System (ADS)
Wang, Y. H.; Hahn, E. J.
1994-04-01
Variable axial land geometry dampers can significantly alter the unbalance response, and in particular the likelihood of undesirable jump behavior, of circular orbit-type squeeze film dampers. Assuming end feed, the pressure distribution, the fluid film forces, and the stiffness and damping coefficients are obtained for such variable axial land geometry dampers, as well as the jump-up propensity for vertical squeeze film damped rigid rotors. It is shown that variable land geometry dampers can reduce the variation of the stiffness and damping coefficients, thereby reducing the degree of damper force non-linearity, and presumably reducing the likelihood of undesirable bistable operation. However, it is also found that regardless of unbalance and regardless of the depth, width or shape of the profile, parallel land dampers are least likely to experience jump-up to undesirable operation modes. These conflicting conclusions may be accounted for by the reduction in damping. They will need to be qualified for practical dampers, which normally have oil hole feed rather than end feed.
Sousa, Carlos Augusto Moreira de; Bahia, Camila Alves; Constantino, Patrícia
2016-12-01
Brazil has the sixth largest bicycles fleet in the world and bicycle is the most used individual transport vehicle in the country. Few studies address the issue of cyclists' accidents and factors that contribute to or prevent this event. VIVA is a cross-sectional survey and is part of the Violence and Accidents Surveillance System, Brazilian Ministry of Health. We used complex sampling and subsequent data review through multivariate logistic regression and calculation of the respective odds ratios. Odds ratios showed greater likelihood of cyclists' accidents in males, people with less schooling and living in urban and periurban areas. People who were not using the bike to go to work were more likely to suffer an accident. The profile found in this study corroborates findings of other studies. They claim that the coexistence of cyclists and other means of transportation in the same urban space increases the likelihood of accidents. The construction of bicycle-exclusive spaces and educational campaigns are required.
O’Leary-Barrett, Maeve; Pihl, Robert O.; Artiges, Eric; Banaschewski, Tobias; Bokde, Arun L. W.; Büchel, Christian; Flor, Herta; Frouin, Vincent; Garavan, Hugh; Heinz, Andreas; Ittermann, Bernd; Mann, Karl; Paillère-Martinot, Marie-Laure; Nees, Frauke; Paus, Tomas; Pausova, Zdenka; Poustka, Luise; Rietschel, Marcella; Robbins, Trevor W.; Smolka, Michael N.; Ströhle, Andreas; Schumann, Gunter; Conrod, Patricia J.
2015-01-01
Objective To investigate the role of personality factors and attentional biases towards emotional faces in establishing concurrent and prospective risk for mental disorder diagnosis in adolescence. Method Data were obtained as part of the IMAGEN study, conducted across 8 European sites, with a community sample of 2257 adolescents. At 14 years, participants completed an emotional variant of the dot-probe task, as well as two personality measures, namely the Substance Use Risk Profile Scale and the revised NEO Personality Inventory. At 14 and 16 years, participants and their parents were interviewed to determine symptoms of mental disorders. Results Personality traits were general and specific risk indicators for mental disorders at 14 years. Increased specificity was obtained when investigating the likelihood of mental disorders over a 2-year period, with the Substance Use Risk Profile Scale showing incremental validity over the NEO Personality Inventory. Attentional biases to emotional faces did not characterise or predict mental disorders examined in the current sample. Discussion Personality traits can indicate concurrent and prospective risk for mental disorders in a community youth sample, and identify at-risk youth beyond the impact of baseline symptoms. This study does not support the hypothesis that attentional biases mediate the relationship between personality and psychopathology in a community sample. Task and sample characteristics that contribute to differing results among studies are discussed. PMID:26046352
Average Likelihood Methods for Code Division Multiple Access (CDMA)
2014-05-01
lengths in the range of 2^2 to 2^13 and possibly higher. Keywords: DS/CDMA signals, classification, balanced CDMA load, synchronous CDMA, decision ... likelihood ratio test (ALRT). We begin this classification problem by finding the size of the spreading matrix that generated the DS/CDMA signal. As ... Theoretical Background: The classification of DS/CDMA signals should not be confused with the problem of multiuser detection. The multiuser detection deals
Accurate Structural Correlations from Maximum Likelihood Superpositions
Theobald, Douglas L; Wuttke, Deborah S
2008-01-01
The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the major modes of structural correlation in macromolecules using likelihood-based statistical analysis of sets of structures. This method is generally applicable to any ensemble of related molecules, including families of nuclear magnetic resonance (NMR) models, different crystal forms of a protein, and structural alignments of homologous proteins, as well as molecular dynamics trajectories. Dominant modes of structural correlation are determined using principal components analysis (PCA) of the maximum likelihood estimate of the correlation matrix. The correlations we identify are inherently independent of the statistical uncertainty and dynamic heterogeneity associated with the structural coordinates. We additionally present an easily interpretable method (“PCA plots”) for displaying these positional correlations by color-coding them onto a macromolecular structure. Maximum likelihood PCA of structural superpositions, and the structural PCA plots that illustrate the results, will facilitate the accurate determination of dynamic structural correlations analyzed in diverse fields of structural biology. PMID:18282091
Modeling abundance effects in distance sampling
Royle, J. Andrew; Dawson, D.K.; Bates, S.
2004-01-01
Distance-sampling methods are commonly used in studies of animal populations to estimate population density. A common objective of such studies is to evaluate the relationship between abundance or density and covariates that describe animal habitat or other environmental influences. However, little attention has been focused on methods of modeling abundance covariate effects in conventional distance-sampling models. In this paper we propose a distance-sampling model that accommodates covariate effects on abundance. The model is based on specification of the distance-sampling likelihood at the level of the sample unit in terms of local abundance (for each sampling unit). This model is augmented with a Poisson regression model for local abundance that is parameterized in terms of available covariates. Maximum-likelihood estimation of detection and density parameters is based on the integrated likelihood, wherein local abundance is removed from the likelihood by integration. We provide an example using avian point-transect data of Ovenbirds (Seiurus aurocapillus) collected using a distance-sampling protocol and two measures of habitat structure (understory cover and basal area of overstory trees). The model yields a sensible description (positive effect of understory cover, negative effect on basal area) of the relationship between habitat and Ovenbird density that can be used to evaluate the effects of habitat management on Ovenbird populations.
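The following Python sketch illustrates this integrated-likelihood construction for a point-transect design with a half-normal detection function and a single log-linear abundance covariate; because local abundance is Poisson, integrating it out reduces the count likelihood to a Poisson with mean lambda_i times the average detection probability. The simulated covariate, parameter values and optimizer settings are assumptions for the example, not the Ovenbird analysis.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import poisson

    rng = np.random.default_rng(6)

    # Synthetic point-transect data: half-normal detection, Poisson abundance with one covariate
    n_points, w = 150, 1.0                      # number of survey points, truncation radius
    cover = rng.uniform(size=n_points)          # e.g. understory cover at each point
    beta_true, sigma_true = (0.2, 1.1), 0.4
    lam_true = np.exp(beta_true[0] + beta_true[1] * cover)
    counts, distances = [], []
    for i in range(n_points):
        r = w * np.sqrt(rng.uniform(size=rng.poisson(lam_true[i])))   # birds uniform in the circle
        det = rng.uniform(size=r.size) < np.exp(-r ** 2 / (2 * sigma_true ** 2))
        counts.append(det.sum()); distances.append(r[det])
    counts = np.array(counts); distances = np.concatenate(distances)

    def pbar(sigma):
        """Average detection probability within radius w for half-normal detection."""
        return 2 * sigma ** 2 / w ** 2 * (1 - np.exp(-w ** 2 / (2 * sigma ** 2)))

    def negloglik(par):
        b0, b1, log_sigma = par
        sigma = np.exp(log_sigma)
        lam = np.exp(b0 + b1 * cover)
        p = pbar(sigma)
        # Integrated likelihood: counts are Poisson(lambda_i * pbar), plus the conditional
        # density of the observed detection distances
        ll = poisson.logpmf(counts, lam * p).sum()
        ll += np.sum(-distances ** 2 / (2 * sigma ** 2) + np.log(2 * distances / w ** 2) - np.log(p))
        return -ll

    fit = minimize(negloglik, x0=[0.0, 0.0, np.log(0.5)], method="Nelder-Mead")
    print("beta0, beta1, sigma:", fit.x[0], fit.x[1], np.exp(fit.x[2]))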
A Review of Methods for Missing Data.
ERIC Educational Resources Information Center
Pigott, Therese D.
2001-01-01
Reviews methods for handling missing data in a research study. Model-based methods, such as maximum likelihood using the EM algorithm and multiple imputation, hold more promise than ad hoc methods. Although model-based methods require more specialized computer programs and assumptions about the nature of missing data, these methods are appropriate…
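As a small, hedged illustration of why model-based handling of missing data can outperform an ad hoc complete-case analysis: scikit-learn's IterativeImputer (a chained-equations-style regression imputer, used here only as a convenient stand-in for the model-based methods discussed) roughly recovers the mean of a variable whose missingness depends on another observed variable, whereas the complete-case mean is biased. All data and settings are invented.

    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer

    rng = np.random.default_rng(7)

    # Correlated bivariate data; x2 is missing more often when x1 is large (missing at random)
    n = 500
    x1 = rng.standard_normal(n)
    x2 = 2.0 + 0.8 * x1 + 0.5 * rng.standard_normal(n)
    X = np.column_stack([x1, x2])
    X_miss = X.copy()
    miss = (x1 > 0) & (rng.uniform(size=n) < 0.5)
    X_miss[miss, 1] = np.nan

    imputed = IterativeImputer(random_state=0).fit_transform(X_miss)

    print("true mean of x2          :", X[:, 1].mean())
    print("complete-case mean of x2 :", X_miss[~np.isnan(X_miss[:, 1]), 1].mean())
    print("imputed-data mean of x2  :", imputed[:, 1].mean())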
NASA Astrophysics Data System (ADS)
Moschetti, M. P.; Mueller, C. S.; Boyd, O. S.; Petersen, M. D.
2013-12-01
In anticipation of the update of the Alaska seismic hazard maps (ASHMs) by the U. S. Geological Survey, we report progress on the comparison of smoothed seismicity models developed using fixed and adaptive smoothing algorithms, and investigate the sensitivity of seismic hazard to the models. While fault-based sources, such as those for great earthquakes in the Alaska-Aleutian subduction zone and for the ~10 shallow crustal faults within Alaska, dominate the seismic hazard estimates for locations near to the sources, smoothed seismicity rates make important contributions to seismic hazard away from fault-based sources and where knowledge of recurrence and magnitude is not sufficient for use in hazard studies. Recent developments in adaptive smoothing methods and statistical tests for evaluating and comparing rate models prompt us to investigate the appropriateness of adaptive smoothing for the ASHMs. We develop smoothed seismicity models for Alaska using fixed and adaptive smoothing methods and compare the resulting models by calculating and evaluating the joint likelihood test. We use the earthquake catalog, and associated completeness levels, developed for the 2007 ASHM to produce fixed-bandwidth-smoothed models with smoothing distances varying from 10 to 100 km and adaptively smoothed models. Adaptive smoothing follows the method of Helmstetter et al. and defines a unique smoothing distance for each earthquake epicenter from the distance to the nth nearest neighbor. The consequence of the adaptive smoothing methods is to reduce smoothing distances, causing locally increased seismicity rates, where seismicity rates are high and to increase smoothing distances where seismicity is sparse. We follow guidance from previous studies to optimize the neighbor number (n-value) by comparing model likelihood values, which estimate the likelihood that the observed earthquake epicenters from the recent catalog are derived from the smoothed rate models. We compare likelihood values from all rate models to rank the smoothing methods. We find that adaptively smoothed seismicity models yield better likelihood values than the fixed smoothing models. Holding all other (source and ground motion) models constant, we calculate seismic hazard curves for all points across Alaska on a 0.1 degree grid, using the adaptively smoothed and fixed smoothed seismicity models separately. Because adaptively smoothed models concentrate seismicity near the earthquake epicenters where seismicity rates are high, the corresponding hazard values are higher, locally, but reduced with distance from observed seismicity, relative to the hazard from fixed-bandwidth models. We suggest that adaptively smoothed seismicity models be considered for implementation in the update to the ASHMs because of their improved likelihood estimates relative to fixed smoothing methods; however, concomitant increases in seismic hazard will cause significant changes in regions of high seismicity, such as near the subduction zone, northeast of Kotzebue, and along the NNE trending zone of seismicity in the Alaskan interior.
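A small numpy sketch of the adaptive-kernel idea described here (each event's Gaussian smoothing distance set to the distance to its n-th nearest neighbour), compared with a fixed bandwidth; coordinates are treated as planar and rates are left unnormalized, and the catalogue, bandwidth floor and neighbour number are illustrative assumptions rather than the ASHM inputs.

    import numpy as np

    rng = np.random.default_rng(8)

    # Illustrative epicentre catalogue (degrees): a dense cluster plus scattered background events
    cluster = rng.normal(loc=(-150.0, 61.0), scale=0.3, size=(300, 2))
    background = np.column_stack([rng.uniform(-165, -140, 100), rng.uniform(55, 68, 100)])
    events = np.vstack([cluster, background])

    def adaptive_bandwidths(events, n_neighbor=5, d_min=0.05):
        """Per-event smoothing distance = distance to the n-th nearest neighbour (Helmstetter-style)."""
        d = np.linalg.norm(events[:, None, :] - events[None, :, :], axis=-1)
        return np.maximum(np.sort(d, axis=1)[:, n_neighbor], d_min)

    def smoothed_rate(grid_xy, events, bandwidths):
        """Sum of 2-D Gaussian kernels, one per event, each with its own bandwidth."""
        d2 = np.sum((grid_xy[:, None, :] - events[None, :, :]) ** 2, axis=-1)
        k = np.exp(-0.5 * d2 / bandwidths[None, :] ** 2) / (2 * np.pi * bandwidths[None, :] ** 2)
        return k.sum(axis=1)

    gx, gy = np.meshgrid(np.linspace(-165, -140, 60), np.linspace(55, 68, 40))
    grid = np.column_stack([gx.ravel(), gy.ravel()])
    rate_adaptive = smoothed_rate(grid, events, adaptive_bandwidths(events))
    rate_fixed = smoothed_rate(grid, events, np.full(len(events), 0.5))
    print("peak adaptive rate:", rate_adaptive.max(), " peak fixed rate:", rate_fixed.max())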
Kretschmer, Tina; Barker, Edward D; Dijkstra, Jan Kornelis; Oldehinkel, Albertine J; Veenstra, René
2015-10-01
Peer victimization is a common and pervasive experience in childhood and adolescence and is associated with various maladjustment symptoms, including internalizing, externalizing, and somatic problems. This variety suggests that peer victimization is multifinal where exposure to the same risk leads to different outcomes. However, very little is known about the relative likelihood of each form of maladjustment. We used a latent profile approach to capture multiple possible outcomes and examined prediction by peer victimization. We also examined the role of peer victimization with regard to stability and change in maladjustment. Maladjustment symptoms and peer victimization were assessed from the participants of the large cohort study TRacking Adolescents' Individual Lives Survey in early and mid-adolescence. Latent profile and latent transition analyses were conducted to examine associations between victimization and maladjustment profile and to test the role of victimization in maladjustment profile transitions. Four maladjustment profiles were identified for early adolescence (Low, Internalizing, Externalizing, Comorbid) and three profiles (Low, Internalizing, Externalizing) were identified for mid-adolescence. Internalizing problems were more likely in victimized adolescents than low symptom levels or externalizing problems. Victimized adolescents were at greater risk to develop internalizing problems between early and mid-adolescence than non-victimized adolescents. Peer victimization is multifinal mostly when outcomes are examined separately. If multiple outcomes are tested simultaneously, internalizing problems seem to be the most likely outcome.
Assessing performance and validating finite element simulations using probabilistic knowledge
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dolin, Ronald M.; Rodriguez, E. A.
Two probabilistic approaches for assessing performance are presented. The first approach assesses probability of failure by simultaneously modeling all likely events. The probability that each event causes failure, along with the event's likelihood of occurrence, contributes to the overall probability of failure. The second assessment method is based on stochastic sampling using an influence diagram. Latin-hypercube sampling is used to stochastically assess events. The overall probability of failure is taken as the maximum probability of failure of all the events. The Likelihood of Occurrence simulation suggests failure does not occur while the Stochastic Sampling approach predicts failure. The Likelihood of Occurrence results are used to validate finite element predictions.
Janssen, Eveline P C J; de Vugt, Marjolein; Köhler, Sebastian; Wolfs, Claire; Kerpershoek, Liselot; Handels, Ron L H; Orrell, Martin; Woods, Bob; Jelley, Hannah; Stephan, Astrid; Bieber, Anja; Meyer, Gabriele; Engedal, Knut; Selbaek, Geir; Wimo, Anders; Irving, Kate; Hopper, Louise; Gonçalves-Pereira, Manuel; Portolani, Elisa; Zanetti, Orazio; Verhey, Frans R
2017-01-01
To identify caregiver profiles of persons with mild to moderate dementia and to investigate differences between identified caregiver profiles, using baseline data of the international prospective cohort study Actifcare. A latent class analysis was used to discover different caregiver profiles based on disease related characteristics of 453 persons with dementia and their 453 informal caregivers. These profiles were compared with regard to quality of life (CarerQoL score), depressive symptoms (HADS-D score) and perseverance time. A 5-class model was identified, with the best Bayesian Information Criterion value, significant likelihood ratio test (p < 0.001), high entropy score (0.88) and substantive interpretability. The classes could be differentiated on two axes: (i) caregivers' age, relationship with persons with dementia, severity of dementia, and (ii) tendency towards stress and difficulty adapting to stress. Classes showed significant differences with all dependent variables, and were labelled 'older low strain', 'older intermediate strain', 'older high strain', 'younger low strain' and 'younger high strain'. Differences exist between types of caregivers that explain variability in quality of life, depressive symptoms and perseverance time. Our findings may give direction for tailored interventions for caregivers of persons with dementia, which may improve social health and reduce health care costs.
Predicting Rotator Cuff Tears Using Data Mining and Bayesian Likelihood Ratios
Lu, Hsueh-Yi; Huang, Chen-Yuan; Su, Chwen-Tzeng; Lin, Chen-Chiang
2014-01-01
Objectives Rotator cuff tear is a common cause of shoulder diseases. Correct diagnosis of rotator cuff tears can save patients from further invasive, costly and painful tests. This study used predictive data mining and Bayesian theory to improve the accuracy of diagnosing rotator cuff tears by clinical examination alone. Methods In this retrospective study, 169 patients who had a preliminary diagnosis of rotator cuff tear on the basis of clinical evaluation followed by confirmatory MRI between 2007 and 2011 were identified. MRI was used as a reference standard to classify rotator cuff tears. The predictor variable was the clinical assessment results, which consisted of 16 attributes. This study employed 2 data mining methods (ANN and the decision tree) and a statistical method (logistic regression) to classify the rotator cuff diagnosis into “tear” and “no tear” groups. Likelihood ratio and Bayesian theory were applied to estimate the probability of rotator cuff tears based on the results of the prediction models. Results Our proposed data mining procedures outperformed the classic statistical method. The correction rate, sensitivity, specificity and area under the ROC curve for predicting a rotator cuff tear were statistically better in the ANN and decision tree models compared to logistic regression. Based on likelihood ratios derived from our prediction models, Fagan's nomogram could be constructed to assess the probability that a patient has a rotator cuff tear using a pretest probability and a prediction result (tear or no tear). Conclusions Our predictive data mining models, combined with likelihood ratios and Bayesian theory, appear to be good tools to classify rotator cuff tears as well as to determine the probability of the presence of the disease, enhancing diagnostic decision making for rotator cuff tears. PMID:24733553
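The arithmetic behind the Fagan nomogram step is just Bayes' theorem in odds form; a minimal sketch follows, with the pretest probability and likelihood ratio being invented illustration values rather than numbers from this study.

    def post_test_probability(pretest_prob: float, likelihood_ratio: float) -> float:
        """Bayes' theorem in odds form, the calculation a Fagan nomogram performs graphically."""
        pre_odds = pretest_prob / (1.0 - pretest_prob)
        post_odds = pre_odds * likelihood_ratio
        return post_odds / (1.0 + post_odds)

    # Illustrative numbers only: a 60% pretest probability of a rotator cuff tear combined
    # with a positive prediction whose positive likelihood ratio is 4.2
    print(post_test_probability(0.60, 4.2))   # ~0.86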
Using DNA fingerprints to infer familial relationships within NHANES III households
Katki, Hormuzd A.; Sanders, Christopher L.; Graubard, Barry I.; Bergen, Andrew W.
2009-01-01
Developing, targeting, and evaluating genomic strategies for population-based disease prevention require population-based data. In response to this urgent need, genotyping has been conducted within the Third National Health and Nutrition Examination (NHANES III), the nationally-representative household-interview health survey in the U.S. However, before these genetic analyses can occur, family relationships within households must be accurately ascertained. Unfortunately, reported family relationships within NHANES III households based on questionnaire data are incomplete and inconclusive with regard to actual biological relatedness of family members. We inferred family relationships within households using DNA fingerprints (Identifiler®) that contain the DNA loci used by law enforcement agencies for forensic identification of individuals. However, performance of these loci for relationship inference is not well understood. We evaluated two competing statistical methods for relationship inference on pairs of household members: an exact likelihood ratio relying on allele frequencies and an Identical By State (IBS) likelihood ratio that only requires matching alleles. We modified these methods to account for genotyping errors and population substructure. The two methods usually agree on the rankings of the most likely relationships. However, the IBS method underestimates the likelihood ratio by not accounting for the informativeness of matching rare alleles. The likelihood ratio is sensitive to estimates of population substructure, and parent-child relationships are sensitive to the specified genotyping error rate. These loci were unable to distinguish second-degree relationships and cousins from being unrelated. The genetic data is also useful for verifying reported relationships and identifying data quality issues. An important by-product is the first explicitly nationally-representative estimates of allele frequencies at these ubiquitous forensic loci. PMID:20664713
Application of the Bootstrap Methods in Factor Analysis.
ERIC Educational Resources Information Center
Ichikawa, Masanori; Konishi, Sadanori
1995-01-01
A Monte Carlo experiment was conducted to investigate the performance of bootstrap methods in normal theory maximum likelihood factor analysis when the distributional assumption was satisfied or unsatisfied. Problems arising with the use of bootstrap methods are highlighted. (SLD)
A Review of System Identification Methods Applied to Aircraft
NASA Technical Reports Server (NTRS)
Klein, V.
1983-01-01
Airplane identification, equation error method, maximum likelihood method, parameter estimation in frequency domain, extended Kalman filter, aircraft equations of motion, aerodynamic model equations, criteria for the selection of a parsimonious model, and online aircraft identification are addressed.
Hybrid pairwise likelihood analysis of animal behavior experiments.
Cattelan, Manuela; Varin, Cristiano
2013-12-01
The study of the determinants of fights between animals is an important issue in understanding animal behavior. For this purpose, tournament experiments among a set of animals are often used by zoologists. The results of these tournament experiments are naturally analyzed by paired comparison models. Proper statistical analysis of these models is complicated by the presence of dependence between the outcomes of fights because the same animal is involved in different contests. This paper discusses two different model specifications to account for between-fights dependence. Models are fitted through the hybrid pairwise likelihood method that iterates between optimal estimating equations for the regression parameters and pairwise likelihood inference for the association parameters. This approach requires the specification of means and covariances only. For this reason, the method can be applied also when the computation of the joint distribution is difficult or inconvenient. The proposed methodology is investigated by simulation studies and applied to real data about adult male Cape Dwarf Chameleons. © 2013, The International Biometric Society.
INFERRING THE ECCENTRICITY DISTRIBUTION
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogg, David W.; Bovy, Jo; Myers, Adam D., E-mail: david.hogg@nyu.ed
2010-12-20
Standard maximum-likelihood estimators for binary-star and exoplanet eccentricities are biased high, in the sense that the estimated eccentricity tends to be larger than the true eccentricity. As with most non-trivial observables, a simple histogram of estimated eccentricities is not a good estimate of the true eccentricity distribution. Here, we develop and test a hierarchical probabilistic method for performing the relevant meta-analysis, that is, inferring the true eccentricity distribution, taking as input the likelihood functions for the individual star eccentricities, or samplings of the posterior probability distributions for the eccentricities (under a given, uninformative prior). The method is a simple implementation of a hierarchical Bayesian model; it can also be seen as a kind of heteroscedastic deconvolution. It can be applied to any quantity measured with finite precision (other orbital parameters, or indeed any astronomical measurements of any kind, including magnitudes, distances, or photometric redshifts) so long as the measurements have been communicated as a likelihood function or a posterior sampling.
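A minimal sketch of the hierarchical idea described above: with per-object posterior samples drawn under an uninformative (here uniform) interim prior, the marginal likelihood of the population parameters is approximated by the product over objects of the average population density over those samples. The Beta population model, the noise level, and all data below are assumptions for illustration, not the authors' implementation.

```python
# Schematic hierarchical (importance-sampling) estimator on synthetic data.
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(5)
true_a, true_b = 0.9, 3.0
n_stars, n_samp = 60, 200
true_e = rng.beta(true_a, true_b, n_stars)
# Noisy per-star "posterior samples" (uniform interim prior), mimicking measurement error.
post_samples = np.clip(true_e[:, None] + rng.normal(0, 0.08, (n_stars, n_samp)),
                       1e-3, 1 - 1e-3)

def neg_log_marginal(params):
    a, b = np.exp(params)                                   # enforce positivity
    per_star = stats.beta.pdf(post_samples, a, b).mean(axis=1)
    return -np.sum(np.log(per_star))

res = optimize.minimize(neg_log_marginal, np.log([1.0, 1.0]), method="Nelder-Mead")
print(np.exp(res.x))    # recovered population shape parameters
```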
Order-restricted inference for means with missing values.
Wang, Heng; Zhong, Ping-Shou
2017-09-01
Missing values appear very often in many applications, but the problem of missing values has not received much attention in testing order-restricted alternatives. Under the missing at random (MAR) assumption, we impute the missing values nonparametrically using kernel regression. For data with imputation, the classical likelihood ratio test designed for testing the order-restricted means is no longer applicable since the likelihood does not exist. This article proposes a novel method for constructing test statistics for assessing means with an increasing order or a decreasing order based on jackknife empirical likelihood (JEL) ratio. It is shown that the JEL ratio statistic evaluated under the null hypothesis converges to a chi-bar-square distribution, whose weights depend on missing probabilities and nonparametric imputation. Simulation study shows that the proposed test performs well under various missing scenarios and is robust for normally and nonnormally distributed data. The proposed method is applied to an Alzheimer's disease neuroimaging initiative data set for finding a biomarker for the diagnosis of the Alzheimer's disease. © 2017, The International Biometric Society.
2009-01-01
Background Marginal posterior genotype probabilities need to be computed for genetic analyses such as geneticcounseling in humans and selective breeding in animal and plant species. Methods In this paper, we describe a peeling based, deterministic, exact algorithm to compute efficiently genotype probabilities for every member of a pedigree with loops without recourse to junction-tree methods from graph theory. The efficiency in computing the likelihood by peeling comes from storing intermediate results in multidimensional tables called cutsets. Computing marginal genotype probabilities for individual i requires recomputing the likelihood for each of the possible genotypes of individual i. This can be done efficiently by storing intermediate results in two types of cutsets called anterior and posterior cutsets and reusing these intermediate results to compute the likelihood. Examples A small example is used to illustrate the theoretical concepts discussed in this paper, and marginal genotype probabilities are computed at a monogenic disease locus for every member in a real cattle pedigree. PMID:19958551
NASA Astrophysics Data System (ADS)
Love, J. J.; Rigler, E. J.; Pulkkinen, A. A.; Riley, P.
2015-12-01
An examination is made of the hypothesis that the statistics of magnetic-storm-maximum intensities are the realization of a log-normal stochastic process. Weighted least-squares and maximum-likelihood methods are used to fit log-normal functions to -Dst storm-time maxima for years 1957-2012; bootstrap analysis is used to establish confidence limits on forecasts. Both methods provide fits that are reasonably consistent with the data; both methods also provide fits that are superior to those that can be made with a power-law function. In general, the maximum-likelihood method provides forecasts having tighter confidence intervals than those provided by weighted least-squares. From extrapolation of maximum-likelihood fits: a magnetic storm with intensity exceeding that of the 1859 Carrington event, -Dst > 850 nT, occurs about 1.13 times per century, with a wide 95% confidence interval of [0.42, 2.41] times per century; a 100-yr magnetic storm is identified as having a -Dst > 880 nT (greater than Carrington) but with a wide 95% confidence interval of [490, 1187] nT. This work is partially motivated by United States National Science and Technology Council and Committee on Space Research and International Living with a Star priorities and strategic plans for the assessment and mitigation of space-weather hazards.
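A sketch of the maximum-likelihood portion of such an analysis, assuming synthetic storm maxima rather than the real -Dst record: fit a log-normal distribution by MLE and convert the exceedance probability of a Carrington-class threshold into an expected number of events per century. The weighted least-squares fit and the bootstrap confidence limits are omitted.

```python
# Illustrative sketch, not the authors' code; sample values are synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
dst_maxima = rng.lognormal(mean=np.log(120.0), sigma=0.6, size=300)  # synthetic -Dst maxima (nT)
years_of_data = 56.0                                                  # e.g. 1957-2012

# MLE of the log-normal shape and scale (location fixed at zero).
shape, loc, scale = stats.lognorm.fit(dst_maxima, floc=0.0)

# Expected number of storms per century exceeding -Dst = 850 nT.
p_exceed = stats.lognorm.sf(850.0, shape, loc=loc, scale=scale)
events_per_century = p_exceed * (len(dst_maxima) / years_of_data) * 100.0
print(events_per_century)
```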
Cao, Y; Adachi, J; Yano, T; Hasegawa, M
1994-07-01
Graur et al.'s (1991) hypothesis that the guinea pig-like rodents have an evolutionary origin within mammals that is separate from that of other rodents (the rodent-polyphyly hypothesis) was reexamined by the maximum-likelihood method for protein phylogeny, as well as by the maximum-parsimony and neighbor-joining methods. The overall evidence does not support Graur et al.'s hypothesis, which radically contradicts the traditional view of rodent monophyly. This work demonstrates that we must be careful in choosing a proper method for phylogenetic inference and that an argument based on a small data set (with respect to the length of the sequence and especially the number of species) may be unstable.
A parametric method for determining the number of signals in narrow-band direction finding
NASA Astrophysics Data System (ADS)
Wu, Qiang; Fuhrmann, Daniel R.
1991-08-01
A novel and more accurate method to determine the number of signals in the multisource direction finding problem is developed. The information-theoretic criteria of Yin and Krishnaiah (1988) are applied to a set of quantities which are evaluated from the log-likelihood function. Based on proven asymptotic properties of the maximum likelihood estimation, these quantities have the properties required by the criteria. Since the information-theoretic criteria use these quantities instead of the eigenvalues of the estimated correlation matrix, this approach possesses the advantage of not requiring a subjective threshold, and also provides higher performance than when eigenvalues are used. Simulation results are presented and compared to those obtained from the nonparametric method given by Wax and Kailath (1985).
Load estimator (LOADEST): a FORTRAN program for estimating constituent loads in streams and rivers
Runkel, Robert L.; Crawford, Charles G.; Cohn, Timothy A.
2004-01-01
LOAD ESTimator (LOADEST) is a FORTRAN program for estimating constituent loads in streams and rivers. Given a time series of streamflow, additional data variables, and constituent concentration, LOADEST assists the user in developing a regression model for the estimation of constituent load (calibration). Explanatory variables within the regression model include various functions of streamflow, decimal time, and additional user-specified data variables. The formulated regression model then is used to estimate loads over a user-specified time interval (estimation). Mean load estimates, standard errors, and 95 percent confidence intervals are developed on a monthly and(or) seasonal basis. The calibration and estimation procedures within LOADEST are based on three statistical estimation methods. The first two methods, Adjusted Maximum Likelihood Estimation (AMLE) and Maximum Likelihood Estimation (MLE), are appropriate when the calibration model errors (residuals) are normally distributed. Of the two, AMLE is the method of choice when the calibration data set (time series of streamflow, additional data variables, and concentration) contains censored data. The third method, Least Absolute Deviation (LAD), is an alternative to maximum likelihood estimation when the residuals are not normally distributed. LOADEST output includes diagnostic tests and warnings to assist the user in determining the appropriate estimation method and in interpreting the estimated loads. This report describes the development and application of LOADEST. Sections of the report describe estimation theory, input/output specifications, sample applications, and installation instructions.
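A much-simplified sketch of the kind of regression LOADEST calibrates, not the program itself: ln(load) regressed on functions of streamflow and decimal time. Plain least squares on synthetic, uncensored data stands in for the AMLE/MLE/LAD machinery described above.

```python
# Simplified LOADEST-style calibration sketch on synthetic data.
import numpy as np

rng = np.random.default_rng(1)
n = 200
dectime = np.linspace(2004.0, 2006.0, n)                   # decimal time
lnQ = rng.normal(3.0, 0.8, n)                              # ln streamflow
ln_load = 1.5 + 0.9 * lnQ + 0.3 * np.sin(2 * np.pi * dectime) + rng.normal(0, 0.2, n)

# Design matrix: intercept, ln Q, (ln Q)^2, seasonal sine/cosine terms.
X = np.column_stack([np.ones(n), lnQ, lnQ**2,
                     np.sin(2 * np.pi * dectime), np.cos(2 * np.pi * dectime)])
coef, *_ = np.linalg.lstsq(X, ln_load, rcond=None)
print(coef)   # fitted regression coefficients
```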
Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; ...
2014-10-16
Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface.
Hopfer, Suellen; Tan, Xianming; Wylie, John L
2014-05-01
We assessed whether a meaningful set of latent risk profiles could be identified in an inner-city population through individual and network characteristics of substance use, sexual behaviors, and mental health status. Data came from 600 participants in Social Network Study III, conducted in 2009 in Winnipeg, Manitoba, Canada. We used latent class analysis (LCA) to identify risk profiles and, with covariates, to identify predictors of class. A 4-class model of risk profiles fit the data best: (1) solitary users reported polydrug use at the individual level, but low probabilities of substance use or concurrent sexual partners with network members; (2) social-all-substance users reported polydrug use at the individual and network levels; (3) social-noninjection drug users reported less likelihood of injection drug and solvent use; (4) low-risk users reported low probabilities across substances. Unstable housing, preadolescent substance use, age, and hepatitis C status predicted risk profiles. Incorporation of social network variables into LCA can distinguish important subgroups with varying patterns of risk behaviors that can lead to sexually transmitted and bloodborne infections.
Verbeke, Wim; Pérez-Cueto, Federico J A; Grunert, Klaus G
2011-08-01
This study uses pork consumption frequency and variety to identify and profile European pork consumer segments. Data (n=1931) were collected in January 2008 in Belgium, Denmark, Germany and Poland. "Non-pork eaters" are profiled as predominantly younger (<35 years) females, with a high likelihood of living single and being underweight (BMI<18.5 kg/m²). Three segments of pork eaters were identified. The "Low variety, Low frequency" segment (17.4%) has a similar profile as the non-pork eaters, though it is a largely non-Polish and non-German segment. The "High variety, High frequency" segment (18.6%) consists mainly of rural, lower educated and overweight or obese (BMI>30 kg/m²) males. The segment "High variety, Medium frequency" (50.1%) includes families and other non-single households, with a profile that matches the overall sample. Their pork consumption is balanced over a wide range of pork cuts and pork meat products. Each segment entails specific challenges for the industry and the public health sector. Copyright © 2011 Elsevier Ltd. All rights reserved.
Cancer genetic risk assessment and referral patterns in primary care.
Vig, Hetal S; Armstrong, Joanne; Egleston, Brian L; Mazar, Carla; Toscano, Michele; Bradbury, Angela R; Daly, Mary B; Meropol, Neal J
2009-12-01
This study was undertaken to describe cancer risk assessment practices among primary care providers (PCPs). An electronic survey was sent to PCPs affiliated with a single insurance carrier. Demographic and practice characteristics associated with cancer genetic risk assessment and testing activities were described. Latent class analysis supported by likelihood ratio tests was used to define PCP profiles with respect to the level of engagement in genetic risk assessment and referral activity based on demographic and practice characteristics. 860 physicians responded to the survey (39% family practice, 29% internal medicine, 22% obstetrics/gynecology (OB/GYN), 10% other). Most respondents (83%) reported that they routinely assess hereditary cancer risk; however, only 33% reported that they take a full, three-generation pedigree for risk assessment. OB/GYN specialty, female gender, and physician access to a genetic counselor were independent predictors of referral to cancer genetics specialists. Three profiles of PCPs, based upon referral practice and extent of involvement in genetics evaluation, were defined. Profiles of physician characteristics associated with varying levels of engagement with cancer genetic risk assessment and testing can be identified. These profiles may ultimately be useful in targeting decision support tools and services.
Models and analysis for multivariate failure time data
NASA Astrophysics Data System (ADS)
Shih, Joanna Huang
The goal of this research is to develop and investigate models and analytic methods for multivariate failure time data. We compare models in terms of direct modeling of the margins, flexibility of dependency structure, local vs. global measures of association, and ease of implementation. In particular, we study copula models, and models produced by right neutral cumulative hazard functions and right neutral hazard functions. We examine the changes of association over time for families of bivariate distributions induced from these models by displaying their density contour plots, conditional density plots, correlation curves of Doksum et al., and local cross ratios of Oakes. We know that bivariate distributions with the same margins might exhibit quite different dependency structures. In addition to modeling, we study estimation procedures. For copula models, we investigate three estimation procedures. The first procedure is full maximum likelihood. The second procedure is two-stage maximum likelihood. At stage 1, we estimate the parameters in the margins by maximizing the marginal likelihood. At stage 2, we estimate the dependency structure by fixing the margins at the estimated ones. The third procedure is two-stage partially parametric maximum likelihood. It is similar to the second procedure, but we estimate the margins by the Kaplan-Meier estimate. We derive asymptotic properties for these three estimation procedures and compare their efficiency by Monte-Carlo simulations and direct computations. For models produced by right neutral cumulative hazards and right neutral hazards, we derive the likelihood and investigate the properties of the maximum likelihood estimates. Finally, we develop goodness of fit tests for the dependency structure in the copula models. We derive a test statistic and its asymptotic properties based on the test of homogeneity of Zelterman and Chen (1988), and a graphical diagnostic procedure based on the empirical Bayes approach. We study the performance of these two methods using actual and computer generated data.
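The two-stage copula estimation procedure described above can be sketched on complete (uncensored) synthetic data: estimate the margins first, then maximize the copula likelihood with the margins held fixed. The exponential margins and Clayton copula below are illustrative assumptions, not the models analyzed in the thesis.

```python
# Illustrative two-stage maximum-likelihood sketch for a copula model.
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(2)
n, true_theta = 500, 2.0
# Simulate Clayton-dependent uniforms via conditional inversion.
u = rng.uniform(size=n)
w = rng.uniform(size=n)
v = ((w ** (-true_theta / (1 + true_theta)) - 1) * u ** (-true_theta) + 1) ** (-1 / true_theta)
t1 = stats.expon.ppf(u, scale=2.0)   # margin 1
t2 = stats.expon.ppf(v, scale=5.0)   # margin 2

# Stage 1: estimate each margin separately (exponential MLE = sample mean).
scale1, scale2 = t1.mean(), t2.mean()
u_hat = stats.expon.cdf(t1, scale=scale1)
v_hat = stats.expon.cdf(t2, scale=scale2)

# Stage 2: maximize the Clayton copula log-likelihood with margins fixed.
def neg_loglik(theta):
    s = u_hat ** (-theta) + v_hat ** (-theta) - 1.0
    logc = (np.log1p(theta) - (theta + 1) * (np.log(u_hat) + np.log(v_hat))
            - (2 + 1 / theta) * np.log(s))
    return -logc.sum()

res = optimize.minimize_scalar(neg_loglik, bounds=(0.01, 10.0), method="bounded")
print(res.x)   # estimated dependence parameter
```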
Jiang, Li; Edwards, Stefan M; Thomsen, Bo; Workman, Christopher T; Guldbrandtsen, Bernt; Sørensen, Peter
2014-09-24
Prioritizing genetic variants is a challenge because disease susceptibility loci are often located in genes of unknown function or the relationship with the corresponding phenotype is unclear. A global data-mining exercise on the biomedical literature can establish the phenotypic profile of genes with respect to their connection to disease phenotypes. The importance of protein-protein interaction networks in the genetic heterogeneity of common diseases or complex traits is becoming increasingly recognized. Thus, the development of a network-based approach combined with phenotypic profiling would be useful for disease gene prioritization. We developed a random-set scoring model and implemented it to quantify phenotype relevance in a network-based disease gene-prioritization approach. We validated our approach based on different gene phenotypic profiles, which were generated from PubMed abstracts, OMIM, and GeneRIF records. We also investigated the validity of several vocabulary filters and different likelihood thresholds for predicted protein-protein interactions in terms of their effect on the network-based gene-prioritization approach, which relies on text-mining of the phenotype data. Our method demonstrated good precision and sensitivity compared with those of two alternative complex-based prioritization approaches. We then conducted a global ranking of all human genes according to their relevance to a range of human diseases. The resulting accurate ranking of known causal genes supported the reliability of our approach. Moreover, these data suggest many promising novel candidate genes for human disorders that have a complex mode of inheritance. We have implemented and validated a network-based approach to prioritize genes for human diseases based on their phenotypic profile. We have devised a powerful and transparent tool to identify and rank candidate genes. Our global gene prioritization provides a unique resource for the biological interpretation of data from genome-wide association studies, and will help in the understanding of how the associated genetic variants influence disease or quantitative phenotypes.
Effects of time-shifted data on flight determined stability and control derivatives
NASA Technical Reports Server (NTRS)
Steers, S. T.; Iliff, K. W.
1975-01-01
Flight data were shifted in time by various increments to assess the effects of time shifts on estimates of stability and control derivatives produced by a maximum likelihood estimation method. Derivatives could be extracted from flight data with the maximum likelihood estimation method even if there was a considerable time shift in the data. Time shifts degraded the estimates of the derivatives, but the degradation was in a consistent rather than a random pattern. Time shifts in the control variables caused the most degradation, and the lateral-directional rotary derivatives were affected the most by time shifts in any variable.
Neonatal cytokine profiles associated with autism spectrum disorder
Tancredi, Daniel J.; Ashwood, Paul; Hansen, Robin L.; Hertz-Picciotto, Irva; Van de Water, Judy
2015-01-01
Background Autism spectrum disorder (ASD) is a complex neurodevelopmental condition that can be reliably diagnosed as early as 24 months. Immunological phenomena, including skewed cytokine production, have been observed among children with ASD. Little is known about whether immune dysregulation is present before diagnosis of ASD. Methods We utilized neonatal blood spots from 214 children with ASD (141 severe, 73 mild/moderate), 62 typically developing (TD), and 27 developmental delayed controls who participated in CHARGE (Childhood Autism Risks from Genetics and the Environment), a population-based case-control study. Levels of 17 cytokines/chemokines were compared across groups and in relation to developmental/behavioral domains. Results Interleukin (IL)-1β and IL-4 were independently associated with ASD vs. TD although these relationships varied by ASD symptom intensity. Elevated IL-4 associated with increased odds of severe ASD (ASDsev) (odds ratio[OR]=1.40, 95% confidence interval[CI] 1.03, 1.91) whereas IL-1β associated with increased odds of mild/moderate ASD (ASDmild) (OR=3.02, 95% CI 1.43, 6.38). Additionally, IL-4 was associated with a higher likelihood of ASDsev vs. ASDmild (OR=1.35, 95% CI 1.04, 1.75). In male ASD cases, IL-4 was negatively associated with non-verbal cognitive ability (β=−3.63, SE=1.33, P=0.04). Conclusions This study is part of a growing effort to identify early biological markers for ASD. We demonstrate that peripheral cytokine profiles at birth are associated with ASD later in childhood and that cytokine profiles vary depending on ASD severity. Cytokines have complex roles in neurodevelopment, and dysregulated levels may be indicative of genetic differences and environmental exposures or their interactions that relate to ASD. PMID:26392128
Cario, Gunnar; Izraeli, Shai; Teichert, Anja; Rhein, Peter; Skokowa, Julia; Möricke, Anja; Zimmermann, Martin; Schrauder, Andre; Karawajew, Leonid; Ludwig, Wolf-Dieter; Welte, Karl; Schünemann, Holger J; Schlegelberger, Brigitte; Schrappe, Martin; Stanulla, Martin
2007-10-20
Applying current diagnostic methods, overt CNS involvement is a rare event in childhood acute lymphoblastic leukemia (ALL). In contrast, CNS-directed therapy is essential for all patients with ALL because without it, the majority of patients eventually will experience relapse. To approach this discrepancy and to explore potential distinct biologic properties of leukemic cells that migrate into the CNS, we compared gene expression profiles of childhood ALL patients with initial CNS involvement with the profiles of CNS-negative patients. We evaluated leukemic gene expression profiles from the bone marrow of 17 CNS-positive patients and 26 CNS-negative patients who were frequency matched for risk factors associated with CNS involvement. Results were confirmed by real-time quantitative polymerase chain reaction analysis and validated using independent patient samples. Interleukin-15 (IL-15) expression was consistently upregulated in leukemic cells of CNS-positive patients compared with CNS-negative patients. In multivariate analysis, IL-15 expression levels greater than the median were associated with CNS involvement compared with expression equal to or less than the median (odds ratio [OR] = 10.70; 95% CI, 2.95 to 38.81). Diagnostic likelihood ratios for CNS positivity were 0.09 (95% CI, 0.01 to 0.65) for the first and 6.93 (95% CI, 2.55 to 18.83) for the fourth IL-15 expression quartiles. In patients who were CNS negative at diagnosis, IL-15 levels greater than the median were associated with subsequent CNS relapse compared with expression equal to or less than the median (OR = 13.80; 95% CI, 3.38 to 56.31). Quantification of leukemic IL-15 expression at diagnosis predicts CNS status and could be a new tool to further tailor CNS-directed therapy in childhood ALL.
Deckersbach, Thilo; Peters, Amy T.; Sylvia, Louisa G.; Gold, Alexandra K.; da Silva Magalhaes, Pedro Vieira; Henry, David B.; Frank, Ellen; Otto, Michael W.; Berk, Michael; Dougherty, Darin D.; Nierenberg, Andrew A.; Miklowitz, David J.
2016-01-01
Background We sought to address how predictors and moderators of psychotherapy for bipolar depression – identified individually in prior analyses – can inform the development of a metric for prospectively classifying treatment outcome in intensive psychotherapy (IP) versus collaborative care (CC) adjunctive to pharmacotherapy in the Systematic Treatment Enhancement Program (STEP-BD) study. Methods We conducted post-hoc analyses on 135 STEP-BD participants using cluster analysis to identify subsets of participants with similar clinical profiles and investigated this combined metric as a moderator and predictor of response to IP. We used agglomerative hierarchical cluster analyses and k-means clustering to determine the content of the clinical profiles. Logistic regression and Cox proportional hazard models were used to evaluate whether the resulting clusters predicted or moderated likelihood of recovery or time until recovery. Results The cluster analysis yielded a two-cluster solution: 1) “less-recurrent/severe” and 2) “chronic/recurrent.” Rates of recovery in IP were similar for less-recurrent/severe and chronic/recurrent participants. Less-recurrent/severe patients were more likely than chronic/recurrent patients to achieve recovery in CC (p = .040, OR = 4.56). IP yielded a faster recovery for chronic/recurrent participants, whereas CC led to recovery sooner in the less-recurrent/severe cluster (p = .034, OR = 2.62). Limitations Cluster analyses require list-wise deletion of cases with missing data so we were unable to conduct analyses on all STEP-BD participants. Conclusions A well-powered, parametric approach can distinguish patients based on illness history and provide clinicians with symptom profiles of patients that confer differential prognosis in CC vs. IP. PMID:27289316
Emura, Takeshi; Konno, Yoshihiko; Michimae, Hirofumi
2015-07-01
Doubly truncated data consist of samples whose observed values fall between the right- and left-truncation limits. With such samples, the distribution function of interest is estimated using the nonparametric maximum likelihood estimator (NPMLE) that is obtained through a self-consistency algorithm. Owing to the complicated asymptotic distribution of the NPMLE, the bootstrap method has been suggested for statistical inference. This paper proposes a closed-form estimator for the asymptotic covariance function of the NPMLE, which is a computationally attractive alternative to bootstrapping. Furthermore, we develop various statistical inference procedures, such as confidence intervals, goodness-of-fit tests, and confidence bands, to demonstrate the usefulness of the proposed covariance estimator. Simulations are performed to compare the proposed method with both the bootstrap and jackknife methods. The methods are illustrated using the childhood cancer dataset.
Villalobos-Gallegos, Luis; Marín-Navarrete, Rodrigo; Roncero, Calos; González-Cantú, Hugo
2017-01-01
To identify symptom-based subgroups within a sample of patients with co-occurring disorders (CODs) and to analyze intersubgroup differences in mental health services utilization. Two hundred and fifteen patients with COD from an addiction clinic completed the Symptom Checklist 90-Revised. Subgroups were determined using latent class profile analysis. Services utilization data were collected from electronic records during a 3-year span. The five-class model obtained the best fit (Bayesian information criteria [BIC] = 3,546.95; adjusted BIC = 3,363.14; bootstrapped likelihood ratio test p < 0.0001). Differences between classes were quantitative, and groups were labeled according to severity: mild (26%), mild-moderate (28.8%), moderate (18.6%), moderate-severe (17.2%), and severe (9.3%). A significant time by class interaction was obtained (chi-square [χ2[15
Bayesian logistic regression approaches to predict incorrect DRG assignment.
Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural
2018-05-07
Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions that implement DRG-based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and to classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification performance by 6% compared to maximum likelihood and by 34% compared to random classification. We found that the original DRG, the coder, and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already led to improved audit efficiency in an operational capacity.
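The study fits full Bayesian logistic regression models; the sketch below only illustrates the related maximum a posteriori (MAP) estimate under a weakly informative Gaussian prior, on synthetic audit data. Variable names and the prior scale are assumptions.

```python
# MAP estimate of logistic regression with a weakly informative Gaussian prior.
import numpy as np
from scipy import optimize
from scipy.special import expit

rng = np.random.default_rng(3)
n, p = 400, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])   # intercept + 3 predictors
true_beta = np.array([-1.0, 0.8, -0.5, 0.3])
y = rng.binomial(1, expit(X @ true_beta))                     # 1 = DRG revision required

prior_sd = 5.0
def neg_log_posterior(beta):
    eta = X @ beta
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))         # Bernoulli log-likelihood
    logprior = -0.5 * np.sum((beta / prior_sd) ** 2)          # Gaussian prior
    return -(loglik + logprior)

res = optimize.minimize(neg_log_posterior, np.zeros(p + 1), method="BFGS")
print(res.x)   # MAP coefficient estimates
```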
Application of permanents of square matrices for DNA identification in multiple-fatality cases
2013-01-01
Background DNA profiling is essential for individual identification. In forensic medicine, the likelihood ratio (LR) is commonly used to identify individuals. The LR is calculated by comparing two hypotheses for the sample DNA: that the sample DNA is identical or related to a reference DNA, and that it is randomly sampled from a population. For multiple-fatality cases, however, identification should be considered as an assignment problem, and a particular sample and reference pair should therefore be compared with other possibilities conditional on the entire dataset. Results We developed a new method to compute the probability via permanents of square matrices of nonnegative entries. As the exact permanent is known as a #P-complete problem, we applied the Huber–Law algorithm to approximate the permanents. We performed a computer simulation to evaluate the performance of our method via receiver operating characteristic curve analysis compared with LR under the assumption of a closed incident. Differences between the two methods were well demonstrated when references provided neither obligate alleles nor impossible alleles. The new method exhibited higher sensitivity (0.188 vs. 0.055) at a threshold value of 0.999, at which specificity was 1, and it exhibited higher area under a receiver operating characteristic curve (0.990 vs. 0.959, P = 9.6E-15). Conclusions Our method therefore offers a solution for a computationally intensive assignment problem and may be a viable alternative to LR-based identification for closed-incident multiple-fatality cases. PMID:23962363
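For intuition, the permanent of a small nonnegative matrix can be computed exactly by enumerating permutations, as below; the paper itself approximates large permanents with the Huber-Law algorithm because exact computation is #P-complete. The match matrix shown is hypothetical.

```python
# Exact permanent of a small nonnegative matrix by brute-force enumeration.
import itertools
import numpy as np

def permanent(a: np.ndarray) -> float:
    n = a.shape[0]
    return sum(np.prod(a[np.arange(n), perm])
               for perm in itertools.permutations(range(n)))

# Hypothetical 3x3 matrix of pairwise sample-vs-reference match likelihoods.
match = np.array([[0.9, 0.1, 0.2],
                  [0.2, 0.8, 0.1],
                  [0.1, 0.3, 0.7]])
print(permanent(match))
```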
NASA Astrophysics Data System (ADS)
Samulski, Maurice; Karssemeijer, Nico
2008-03-01
Most of the current CAD systems detect suspicious mass regions independently in single views. In this paper we present a method to match corresponding regions in mediolateral oblique (MLO) and craniocaudal (CC) mammographic views of the breast. For every possible combination of mass regions in the MLO view and CC view, a number of features are computed, such as the difference in distance of a region to the nipple, a texture similarity measure, the gray scale correlation and the likelihood of malignancy of both regions computed by single-view analysis. In previous research, Linear Discriminant Analysis was used to discriminate between correct and incorrect links. In this paper we investigate if the performance can be improved by employing a statistical method in which four classes are distinguished. These four classes are defined by the combinations of view (MLO/CC) and pathology (TP/FP) labels. We use distance-weighted k-Nearest Neighbor density estimation to estimate the likelihood of a region combination. Next, a correspondence score is calculated as the likelihood that the region combination is a TP-TP link. The method was tested on 412 cases with a malignant lesion visible in at least one of the views. In 82.4% of the cases a correct link could be established between the TP detections in both views. In future work, we will use the framework presented here to develop a context dependent region matching scheme, which takes the number and likelihood of possible alternatives into account. It is expected that more accurate determination of matching probabilities will lead to improved CAD performance.
Fang, Yun; Wu, Hulin; Zhu, Li-Xing
2011-07-01
We propose a two-stage estimation method for random coefficient ordinary differential equation (ODE) models. A maximum pseudo-likelihood estimator (MPLE) is derived based on a mixed-effects modeling approach and its asymptotic properties for population parameters are established. The proposed method does not require repeatedly solving ODEs, and is computationally efficient although it does pay a price with the loss of some estimation efficiency. However, the method does offer an alternative approach when the exact likelihood approach fails due to model complexity and high-dimensional parameter space, and it can also serve as a method to obtain the starting estimates for more accurate estimation methods. In addition, the proposed method does not need to specify the initial values of state variables and preserves all the advantages of the mixed-effects modeling approach. The finite sample properties of the proposed estimator are studied via Monte Carlo simulations and the methodology is also illustrated with application to an AIDS clinical data set.
Spatial resolution properties of motion-compensated tomographic image reconstruction methods.
Chun, Se Young; Fessler, Jeffrey A
2012-07-01
Many motion-compensated image reconstruction (MCIR) methods have been proposed to correct for subject motion in medical imaging. MCIR methods incorporate motion models to improve image quality by reducing motion artifacts and noise. This paper analyzes the spatial resolution properties of MCIR methods and shows that nonrigid local motion can lead to nonuniform and anisotropic spatial resolution for conventional quadratic regularizers. This undesirable property is akin to the known effects of interactions between heteroscedastic log-likelihoods (e.g., Poisson likelihood) and quadratic regularizers. This effect may lead to quantification errors in small or narrow structures (such as small lesions or rings) of reconstructed images. This paper proposes novel spatial regularization design methods for three different MCIR methods that account for known nonrigid motion. We develop MCIR regularization designs that provide approximately uniform and isotropic spatial resolution and that match a user-specified target spatial resolution. Two-dimensional PET simulations demonstrate the performance and benefits of the proposed spatial regularization design methods.
Control of Risks Through the Use of Procedures: A Method for Evaluating the Change in Risk
NASA Technical Reports Server (NTRS)
Praino, Gregory T.; Sharit, Joseph
2010-01-01
This paper considers how procedures can be used to control risks faced by an organization and proposes a means of recognizing if a particular procedure reduces risk or contributes to the organization's exposure. The proposed method was developed out of the review of work documents and the governing procedures performed in the wake of the Columbia accident by NASA and the Space Shuttle prime contractor, United Space Alliance, LLC. A technique was needed to understand the rules, or procedural controls, in place at the time in the context of how important the role of each rule was. The proposed method assesses procedural risks, the residual risk associated with a hazard after a procedure's influence is accounted for, by considering each clause of a procedure as a unique procedural control that may be beneficial or harmful. For procedural risks with consequences severe enough to threaten the survival of the organization, the method measures the characteristics of each risk on a scale that is an alternative to the traditional consequence/likelihood couple. The dual benefits of the substitute scales are that they eliminate both the need to quantify a relationship between different consequence types and the need for the extensive history a probabilistic risk assessment would require. Control Value is used as an analog for the consequence, where the value of a rule is based on how well the control reduces the severity of the consequence when operating successfully. This value is composed of two parts: the inevitability of the consequence in the absence of the control, and the opportunity to intervene before the consequence is realized. High value controls will be ones where there is minimal need for intervention but maximum opportunity to actively prevent the outcome. Failure Likelihood is used as the substitute for the conventional likelihood of the outcome. For procedural controls, a failure is considered to be any non-malicious violation of the rule, whether intended or not. The model used for describing the Failure Likelihood considers how well a task was established by evaluating that task on five components. The components selected to define a well-established task are: that it be defined, assigned to someone capable, that they be trained appropriately, that the actions be organized to enable proper completion and that some form of independent monitoring be performed. Validation of the method was based on the information provided by a group of experts in Space Shuttle ground processing when they were presented with 5 scenarios that identified a clause from a procedure. For each scenario, they recorded their perception of how important the associated rule was and how likely it was to fail. They then rated the components of Control Value and Failure Likelihood for all the scenarios. The order in which each reviewer ranked the scenarios' Control Value and Failure Likelihood was compared to the order in which they ranked the scenarios for each of the associated components: inevitability and opportunity for Control Value, and definition, assignment, training, organization, and monitoring for Failure Likelihood. This order comparison showed how the components contributed to a relative relationship to the substitute risk element. With the relationship established for Space Shuttle ground processing, this method can be used to gauge if the introduction or removal of a particular rule will increase or decrease the risk associated with the hazard it is intended to control.
THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures
Theobald, Douglas L.; Wuttke, Deborah S.
2008-01-01
Summary THESEUS is a command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While conventional superpositioning methods use ordinary least-squares (LS) as the optimization criterion, ML superpositions provide substantially improved accuracy by down-weighting variable structural regions and by correcting for correlations among atoms. ML superpositioning is robust and insensitive to the specific atoms included in the analysis, and thus it does not require subjective pruning of selected variable atomic coordinates. Output includes both likelihood-based and frequentist statistics for accurate evaluation of the adequacy of a superposition and for reliable analysis of structural similarities and differences. THESEUS performs principal components analysis for analyzing the complex correlations found among atoms within a structural ensemble. PMID:16777907
García-Hermoso, Antonio; Ramírez-Vélez, Robinson; Ramírez-Campillo, Rodrigo; Izquierdo, Mikel
2017-10-23
Health behaviors and risk factors are independently related to cognitive function in older adults. This study aimed to examine the prevalence and relationship between cognitive function and a number of ideal cardiovascular health (CVH) metrics in older adults from the 2009 to 2010 Chilean National Health Survey. Data from 460 older adults (mean age 73.5 years old, 59.3% women) from the 2009 to 2010 Chilean Health Survey were analyzed. Ideal CVH was defined as meeting the ideal levels of the following components: four behaviors (smoking, body mass index, physical activity, and diet adherence) and three factors (total cholesterol, blood pressure, and fasting glucose). Older adults were grouped into three categories according to their number of ideal CVH metrics: ideal (5-7 metrics), intermediate (3-4 metrics), and poor (0-2 metrics). Cognitive function was assessed by using the modified Mini-Mental Status Examination (mMMSE). Of the 460 participants, 2% had 0 ideal metrics, 11.3% had 1, 23.9% had 2, 32.2% had 3, 20.7% had 4, 9.6% had 5, 0.4% had 6, and 0% had 7. Cognitive function was greater in older adults who met the ideal smoking, physical activity, and fasting blood glucose criteria. Logistic regression analysis suggested that ideal physical activity (odds ratio [OR] = 0.411; 95% confidence interval [CI], 0.209-0.807) and smoking (OR = 0.429; 95% CI, 0.095-0.941) behaviors reduced the likelihood of cognitive impairment. Moreover, compared with a poor profile (0-2 metrics), an intermediate (3-4 metrics) (OR = 0.221; 95% CI, 0.024-0.911) and ideal CVH profile (5-7 metrics) (OR = 0.106; 95% CI, 0.013-0.864) reduced the likelihood of cognitive impairment. We found that intermediate and ideal profiles were associated with a similarly low prevalence of cognitive impairment in Chilean older adults.
Murray, Robin; Correll, Christoph U; Reynolds, Gavin P; Taylor, David
2017-03-01
Available evidence suggests that second-generation atypical antipsychotics are broadly similar to first-generation agents in terms of their efficacy, but may have a more favourable tolerability profile, primarily by being less likely to cause extrapyramidal symptoms. However, atypical antipsychotics are variably associated with disturbances in the cardiometabolic arena, including increased body weight and the development of metabolic syndrome, which may reflect differences in their receptor binding profiles. Effective management of schizophrenia must ensure that the physical health of patients is addressed together with their mental health. This should therefore involve consideration of the specific tolerability profiles of available agents and individualization of treatment to minimize the likelihood of adverse metabolic sequelae, thereby improving long-term adherence and optimizing overall treatment outcomes. Alongside this, modifiable risk factors (such as exercise, diet, obesity/body weight and smoking status) must be addressed, in order to optimize patients' overall health and quality of life (QoL). In addition to antipsychotic-induced side effects, the clinical management of early nonresponders and psychopharmacological approaches for patients with treatment-resistant schizophrenia remain important unmet needs. Evidence suggests that antipsychotic response starts early in the course of treatment and that early nonresponse accurately predicts nonresponse over the longer term. Early nonresponse therefore represents an important modifiable risk factor for poor efficacy and effectiveness outcomes, since switching or augmenting antipsychotic treatment in patients showing early nonresponse has been shown to improve the likelihood of subsequent treatment outcomes. Recent evidence has also demonstrated that patients showing early nonresponse to treatment with lurasidone at 2 weeks may benefit from an increase in dose at this timepoint without compromising tolerability/safety. However, further research is required to determine whether these findings are generalizable to other antipsychotic agents.
NASA Technical Reports Server (NTRS)
Switzer, Eric Ryan; Watts, Duncan J.
2016-01-01
The B-mode polarization of the cosmic microwave background provides a unique window into tensor perturbations from inflationary gravitational waves. Survey effects complicate the estimation and description of the power spectrum on the largest angular scales. The pixel-space likelihood yields parameter distributions without the power spectrum as an intermediate step, but it does not have the large suite of tests available to power spectral methods. Searches for primordial B-modes must rigorously reject and rule out contamination. Many forms of contamination vary or are uncorrelated across epochs, frequencies, surveys, or other data treatment subsets. The cross power and the power spectrum of the difference of subset maps provide approaches to reject and isolate excess variance. We develop an analogous joint pixel-space likelihood. Contamination not modeled in the likelihood produces parameter-dependent bias and complicates the interpretation of the difference map. We describe a null test that consistently weights the difference map. Excess variance should either be explicitly modeled in the covariance or be removed through reprocessing the data.
Likelihood ratio data to report the validation of a forensic fingerprint evaluation method.
Ramos, Daniel; Haraksim, Rudolf; Meuwly, Didier
2017-02-01
Data to which the authors refer throughout this article are likelihood ratios (LRs) computed from the comparison of 5-12 minutiae fingermarks with fingerprints. These LR data are used for the validation of a likelihood ratio (LR) method in forensic evidence evaluation. These data are a necessary asset for conducting validation experiments when validating LR methods used in forensic evidence evaluation and for setting up validation reports. These data can also be used as a baseline for comparing the fingermark evidence in the same minutiae configuration as presented in Meuwly, Ramos, and Haraksim [1], although the reader should keep in mind that different feature extraction algorithms and different AFIS systems used may produce different LR values. Moreover, these data may serve as a reproducibility exercise, in order to train the generation of validation reports of forensic methods, according to [1]. Alongside the data, a justification and motivation for the use of methods is given. These methods calculate LRs from the fingerprint/mark data and are subject to a validation procedure. The choice of using real forensic fingerprints in the validation and simulated data in the development is described and justified. Validation criteria are set for the purpose of validation of the LR methods, which are used to calculate the LR values from the data and the validation report. For privacy and data protection reasons, the original fingerprint/mark images cannot be shared. But these images do not constitute the core data for the validation, in contrast to the LRs, which are shared.
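As a generic illustration of how an LR can be computed from comparison scores (this is not the validated method of the article, whose feature extraction and models differ), the sketch below estimates the score densities under same-source and different-source propositions with kernel density estimates on synthetic scores and takes their ratio.

```python
# Generic score-based LR sketch on synthetic comparison scores.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(4)
same_source_scores = rng.normal(6.0, 1.0, 500)       # synthetic training scores
diff_source_scores = rng.normal(2.0, 1.5, 500)

kde_same = gaussian_kde(same_source_scores)
kde_diff = gaussian_kde(diff_source_scores)

def likelihood_ratio(score: float) -> float:
    return float(kde_same(score) / kde_diff(score))

print(likelihood_ratio(5.0))   # LR > 1 supports the same-source proposition
```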
MCMC multilocus lod scores: application of a new approach.
George, Andrew W; Wijsman, Ellen M; Thompson, Elizabeth A
2005-01-01
On extended pedigrees with extensive missing data, the calculation of multilocus likelihoods for linkage analysis is often beyond the computational bounds of exact methods. Growing interest therefore surrounds the implementation of Monte Carlo estimation methods. In this paper, we demonstrate the speed and accuracy of a new Markov chain Monte Carlo method for the estimation of linkage likelihoods through an analysis of real data from a study of early-onset Alzheimer's disease. For those data sets where comparison with exact analysis is possible, we achieved up to a 100-fold increase in speed. Our approach is implemented in the program lm_bayes within the framework of the freely available MORGAN 2.6 package for Monte Carlo genetic analysis (http://www.stat.washington.edu/thompson/Genepi/MORGAN/Morgan.shtml).
2010-01-01
Background Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service. PMID:21034504
Horsch, Karla; Pesce, Lorenzo L.; Giger, Maryellen L.; Metz, Charles E.; Jiang, Yulei
2012-01-01
Purpose: The authors developed scaling methods that monotonically transform the output of one classifier to the “scale” of another. Such transformations affect the distribution of classifier output while leaving the ROC curve unchanged. In particular, they investigated transformations between radiologists and computer classifiers, with the goal of addressing the problem of comparing and interpreting case-specific values of output from two classifiers. Methods: Using both simulated and radiologists’ rating data of breast imaging cases, the authors investigated a likelihood-ratio-scaling transformation, based on “matching” classifier likelihood ratios. For comparison, three other scaling transformations were investigated that were based on matching classifier true positive fraction, false positive fraction, or cumulative distribution function, respectively. The authors explored modifying the computer output to reflect the scale of the radiologist, as well as modifying the radiologist’s ratings to reflect the scale of the computer. They also evaluated how dataset size affects the transformations. Results: When ROC curves of two classifiers differed substantially, the four transformations were found to be quite different. The likelihood-ratio scaling transformation was found to vary widely from radiologist to radiologist. Similar results were found for the other transformations. Our simulations explored the effect of database sizes on the accuracy of the estimation of our scaling transformations. Conclusions: The likelihood-ratio-scaling transformation that the authors have developed and evaluated was shown to be capable of transforming computer and radiologist outputs to a common scale reliably, thereby allowing the comparison of the computer and radiologist outputs on the basis of a clinically relevant statistic. PMID:22559651
Objectively combining AR5 instrumental period and paleoclimate climate sensitivity evidence
NASA Astrophysics Data System (ADS)
Lewis, Nicholas; Grünwald, Peter
2018-03-01
Combining instrumental period evidence regarding equilibrium climate sensitivity with largely independent paleoclimate proxy evidence should enable a more constrained sensitivity estimate to be obtained. Previous, subjective Bayesian approaches involved selection of a prior probability distribution reflecting the investigators' beliefs about climate sensitivity. Here a recently developed approach employing two different statistical methods—objective Bayesian and frequentist likelihood-ratio—is used to combine instrumental period and paleoclimate evidence based on data presented and assessments made in the IPCC Fifth Assessment Report. Probabilistic estimates from each source of evidence are represented by posterior probability density functions (PDFs) of physically-appropriate form that can be uniquely factored into a likelihood function and a noninformative prior distribution. The three-parameter form is shown accurately to fit a wide range of estimated climate sensitivity PDFs. The likelihood functions relating to the probabilistic estimates from the two sources are multiplicatively combined and a prior is derived that is noninformative for inference from the combined evidence. A posterior PDF that incorporates the evidence from both sources is produced using a single-step approach, which avoids the order-dependency that would arise if Bayesian updating were used. Results are compared with an alternative approach using the frequentist signed root likelihood ratio method. Results from these two methods are effectively identical, and provide a 5-95% range for climate sensitivity of 1.1-4.05 K (median 1.87 K).
Robust Methods for Moderation Analysis with a Two-Level Regression Model.
Yang, Miao; Yuan, Ke-Hai
2016-01-01
Moderation analysis has many applications in social sciences. Most widely used estimation methods for moderation analysis assume that errors are normally distributed and homoscedastic. When these assumptions are not met, the results from a classical moderation analysis can be misleading. For more reliable moderation analysis, this article proposes two robust methods with a two-level regression model when the predictors do not contain measurement error. One method is based on maximum likelihood with Student's t distribution and the other is based on M-estimators with Huber-type weights. An algorithm for obtaining the robust estimators is developed. Consistent estimates of standard errors of the robust estimators are provided. The robust approaches are compared against normal-distribution-based maximum likelihood (NML) with respect to power and accuracy of parameter estimates through a simulation study. Results show that the robust approaches outperform NML under various distributional conditions. Application of the robust methods is illustrated through a real data example. An R program is developed and documented to facilitate the application of the robust methods.
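As a concrete illustration of the second robust approach described above, the sketch below fits a moderation (interaction) model with Huber-type weights via iteratively reweighted least squares and compares it to an ordinary least-squares fit; the simulated data, variable names, and single-level simplification are assumptions for illustration, not the authors' two-level implementation.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 300
    x = rng.normal(size=n)          # predictor
    z = rng.normal(size=n)          # moderator
    e = rng.standard_t(df=3, size=n)  # heavy-tailed errors violating normality
    y = 0.5 + 0.4 * x + 0.3 * z + 0.25 * x * z + e

    # design matrix including the interaction (moderation) term
    X = sm.add_constant(np.column_stack([x, z, x * z]))

    # Huber-type M-estimation as a robust alternative to normal-theory ML
    robust_fit = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()
    ols_fit = sm.OLS(y, X).fit()    # normal-distribution-based comparison

    print(robust_fit.params)        # const, x, z, x*z
    print(ols_fit.params)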
Preserving Flow Variability in Watershed Model Calibrations
Background/Question/Methods Although watershed modeling flow calibration techniques often emphasize a specific flow mode, ecological conditions that depend on flow-ecology relationships often emphasize a range of flow conditions. We used informal likelihood methods to investig...
A threshold method for immunological correlates of protection
2013-01-01
Background Immunological correlates of protection are biological markers such as disease-specific antibodies which correlate with protection against disease and which are measurable with immunological assays. It is common in vaccine research and in setting immunization policy to rely on threshold values for the correlate where the accepted threshold differentiates between individuals who are considered to be protected against disease and those who are susceptible. Examples where thresholds are used include development of a new generation 13-valent pneumococcal conjugate vaccine which was required in clinical trials to meet accepted thresholds for the older 7-valent vaccine, and public health decision making on vaccination policy based on long-term maintenance of protective thresholds for Hepatitis A, rubella, measles, Japanese encephalitis and others. Despite widespread use of such thresholds in vaccine policy and research, few statistical approaches have been formally developed which specifically incorporate a threshold parameter in order to estimate the value of the protective threshold from data. Methods We propose a 3-parameter statistical model called the a:b model which incorporates parameters for a threshold and constant but different infection probabilities below and above the threshold estimated using profile likelihood or least squares methods. Evaluation of the estimated threshold can be performed by a significance test for the existence of a threshold using a modified likelihood ratio test which follows a chi-squared distribution with 3 degrees of freedom, and confidence intervals for the threshold can be obtained by bootstrapping. The model also permits assessment of relative risk of infection in patients achieving the threshold or not. Goodness-of-fit of the a:b model may be assessed using the Hosmer-Lemeshow approach. The model is applied to 15 datasets from published clinical trials on pertussis, respiratory syncytial virus and varicella. Results Highly significant thresholds with p-values less than 0.01 were found for 13 of the 15 datasets. Considerable variability was seen in the widths of confidence intervals. Relative risks indicated around 70% or better protection in 11 datasets and relevance of the estimated threshold to imply strong protection. Goodness-of-fit was generally acceptable. Conclusions The a:b model offers a formal statistical method of estimation of thresholds differentiating susceptible from protected individuals which has previously depended on putative statements based on visual inspection of data. PMID:23448322
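To make the profile-likelihood idea concrete, the sketch below fits a simplified version of the a:b structure: for each candidate threshold, the infection probabilities below and above it are profiled out (estimated from the data) and the binomial log-likelihood is evaluated on a grid of thresholds. The simulated assay data and variable names are illustrative assumptions, not the published model code.

    import numpy as np

    rng = np.random.default_rng(1)
    titer = rng.lognormal(mean=1.0, sigma=0.8, size=400)   # assay values
    true_t = 3.0
    p_below, p_above = 0.4, 0.1                            # infection probabilities
    infected = rng.random(400) < np.where(titer < true_t, p_below, p_above)

    def profile_loglik(t, titer, infected):
        """Log-likelihood maximised over the two infection probabilities for a fixed threshold t."""
        below = titer < t
        ll = 0.0
        for mask in (below, ~below):
            k, n = infected[mask].sum(), mask.sum()
            if n == 0:
                continue
            p = k / n                                      # profile (ML) estimate
            ll += k * np.log(p + 1e-12) + (n - k) * np.log(1 - p + 1e-12)
        return ll

    grid = np.quantile(titer, np.linspace(0.05, 0.95, 200))
    ll = np.array([profile_loglik(t, titer, infected) for t in grid])
    print("estimated threshold:", grid[ll.argmax()])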
Pritikin, Joshua N; Brick, Timothy R; Neale, Michael C
2018-04-01
A novel method for the maximum likelihood estimation of structural equation models (SEM) with both ordinal and continuous indicators is introduced using a flexible multivariate probit model for the ordinal indicators. A full information approach ensures unbiased estimates for data missing at random. Exceeding the capability of prior methods, up to 13 ordinal variables can be included before integration time increases beyond 1 s per row. The method relies on the axiom of conditional probability to split apart the distribution of continuous and ordinal variables. Due to the symmetry of the axiom, two similar methods are available. A simulation study provides evidence that the two similar approaches offer equal accuracy. A further simulation is used to develop a heuristic to automatically select the most computationally efficient approach. Joint ordinal continuous SEM is implemented in OpenMx, free and open-source software.
Smolin, John A; Gambetta, Jay M; Smith, Graeme
2012-02-17
We provide an efficient method for computing the maximum-likelihood mixed quantum state (with density matrix ρ) given a set of measurement outcomes in a complete orthonormal operator basis subject to Gaussian noise. Our method works by first changing basis yielding a candidate density matrix μ which may have nonphysical (negative) eigenvalues, and then finding the nearest physical state under the 2-norm. Our algorithm takes at worst O(d^4) for the basis change plus O(d^3) for finding ρ where d is the dimension of the quantum state. In the special case where the measurement basis is strings of Pauli operators, the basis change takes only O(d^3) as well. The workhorse of the algorithm is a new linear-time method for finding the closest probability distribution (in Euclidean distance) to a set of real numbers summing to one.
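The core step, finding the closest physical state to a unit-trace candidate matrix, amounts to projecting its eigenvalues onto the probability simplex while keeping the eigenvectors. A minimal sketch under that assumption, using a standard Euclidean simplex projection rather than the authors' exact code:

    import numpy as np

    def nearest_density_matrix(mu):
        """Project a Hermitian, unit-trace matrix onto the set of physical states (2-norm)."""
        vals, vecs = np.linalg.eigh(mu)
        u = np.sort(vals)[::-1]                       # eigenvalues, descending
        css = np.cumsum(u)
        # largest j such that u_j + (1 - sum_{i<=j} u_i)/j > 0
        j = np.nonzero(u + (1.0 - css) / np.arange(1, len(u) + 1) > 0)[0][-1]
        theta = (1.0 - css[j]) / (j + 1)
        w = np.maximum(vals + theta, 0.0)             # shifted and clipped eigenvalues
        return (vecs * w) @ vecs.conj().T

    # candidate from noisy tomography: Hermitian, trace one, one negative eigenvalue
    mu = np.array([[0.8, 0.5], [0.5, 0.2]])
    rho = nearest_density_matrix(mu)
    print(np.linalg.eigvalsh(rho), np.trace(rho))     # nonnegative eigenvalues, trace one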
Bowers, Edmond P; Johnson, Sara K; Buckingham, Mary H; Gasca, Santiago; Warren, Daniel J A; Lerner, Jacqueline V; Lerner, Richard M
2014-06-01
Both parents and important non-parental adults have influential roles in promoting positive youth development (PYD). Little research, however, has examined the simultaneous effects of both parents and important non-parental adults for PYD. We assessed the relationships among youth-reported parenting profiles and important non-parental adult relationships in predicting the Five Cs of PYD (competence, confidence, connection, character, and caring) in four cross-sectional waves of data from the 4-H Study of PYD (Grade 9: N = 975, 61.1% female; Grade 10: N = 1,855, 63.4% female; Grade 11: N = 983, 67.9% female; Grade 12: N = 703, 69.3% female). The results indicated the existence of latent profiles of youth-reported parenting styles based on maternal warmth, parental school involvement, and parental monitoring that were consistent with previously identified profiles (authoritative, authoritarian, permissive, and uninvolved) as well as reflecting several novel profiles (highly involved, integrative, school-focused, controlling). Parenting profile membership predicted mean differences in the Five Cs at each wave, and also moderated the relationships between the presence of an important non-parental adult and the Five Cs. In general, authoritative and highly involved parenting predicted higher levels of PYD and a higher likelihood of being connected to an important non-parental adult. We discuss the implications of these findings for future research on adult influences of youth development and for programs that involve adults in attempts to promote PYD.
Identifying common donors in DNA mixtures, with applications to database searches.
Slooten, K
2017-01-01
Several methods exist to compute the likelihood ratio LR(M, g) evaluating the possible contribution of a person of interest with genotype g to a mixed trace M. In this paper we generalize this LR to a likelihood ratio LR(M1, M2) involving two possibly mixed traces M1 and M2, where the question is whether there is a donor in common to both traces. In case one of the traces is in fact a single genotype, then this likelihood ratio reduces to the usual LR(M, g). We explain how our method conceptually is a logical consequence of the fact that LR calculations of the form LR(M, g) can be equivalently regarded as a probabilistic deconvolution of the mixture. Based on simulated data, and using a semi-continuous mixture evaluation model, we derive ROC curves of our method applied to various types of mixtures. From these data we conclude that searches for a common donor are often feasible in the sense that a very small false positive rate can be combined with a high probability to detect a common donor if there is one. We also show how database searches comparing all traces to each other can be carried out efficiently, as illustrated by the application of the method to the mixed traces in the Dutch DNA database. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
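As an illustration of the deconvolution view described above, one common way to combine per-trace genotype weights into a common-donor likelihood ratio is to sum, over candidate genotypes, the product of their posterior probabilities given each trace divided by their population frequency. The sketch below assumes such weights are already available from a mixture deconvolution; it is a conceptual, single-locus illustration, not the semi-continuous model used in the paper.

    def common_donor_lr(post_m1, post_m2, prior):
        """LR for 'the two traces share a donor' vs 'they do not', given
        per-genotype posterior probabilities P(g | M1), P(g | M2) and
        population genotype frequencies P(g).  All dicts keyed by genotype."""
        lr = 0.0
        for g, p in prior.items():
            if p > 0:
                lr += post_m1.get(g, 0.0) * post_m2.get(g, 0.0) / p
        return lr

    # toy single-locus example with three genotypes
    prior   = {"AA": 0.25, "AB": 0.50, "BB": 0.25}
    post_m1 = {"AA": 0.70, "AB": 0.25, "BB": 0.05}   # deconvolution of trace M1
    post_m2 = {"AA": 0.60, "AB": 0.30, "BB": 0.10}   # deconvolution of trace M2
    print(common_donor_lr(post_m1, post_m2, prior))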
MultiPhyl: a high-throughput phylogenomics webserver using distributed computing
Keane, Thomas M.; Naughton, Thomas J.; McInerney, James O.
2007-01-01
With the number of fully sequenced genomes increasing steadily, there is greater interest in performing large-scale phylogenomic analyses from large numbers of individual gene families. Maximum likelihood (ML) has been shown repeatedly to be one of the most accurate methods for phylogenetic construction. Recently, there have been a number of algorithmic improvements in maximum-likelihood-based tree search methods. However, it can still take a long time to analyse the evolutionary history of many gene families using a single computer. Distributed computing refers to a method of combining the computing power of multiple computers in order to perform some larger overall calculation. In this article, we present the first high-throughput implementation of a distributed phylogenetics platform, MultiPhyl, capable of using the idle computational resources of many heterogeneous non-dedicated machines to form a phylogenetics supercomputer. MultiPhyl allows a user to upload hundreds or thousands of amino acid or nucleotide alignments simultaneously and perform computationally intensive tasks such as model selection, tree searching and bootstrapping of each of the alignments using many desktop machines. The program implements a set of 88 amino acid models and 56 nucleotide maximum likelihood models and a variety of statistical methods for choosing between alternative models. A MultiPhyl webserver is available for public use at: http://www.cs.nuim.ie/distributed/multiphyl.php. PMID:17553837
Chemical information obtained from Auger depth profiles by means of advanced factor analysis (MLCFA)
NASA Astrophysics Data System (ADS)
De Volder, P.; Hoogewijs, R.; De Gryse, R.; Fiermans, L.; Vennik, J.
1993-01-01
The advanced multivariate statistical technique "maximum likelihood common factor analysis (MLCFA)" is shown to be superior to "principal component analysis (PCA)" for decomposing overlapping peaks into their individual component spectra of which neither the number of components nor the peak shape of the component spectra is known. An examination of the maximum resolving power of both techniques, MLCFA and PCA, by means of artificially created series of multicomponent spectra confirms this finding unambiguously. Substantial progress in the use of AES as a chemical-analysis technique is accomplished through the implementation of MLCFA. Chemical information from Auger depth profiles is extracted by investigating the variation of the line shape of the Auger signal as a function of the changing chemical state of the element. In particular, MLCFA combined with Auger depth profiling has been applied to problems related to steelcord-rubber tyre adhesion. MLCFA allows one to elucidate the precise nature of the interfacial layer of reaction products between natural rubber vulcanized on a thin brass layer. This study reveals many interesting chemical aspects of the oxi-sulfidation of brass undetectable with classical AES.
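A minimal sketch contrasting maximum likelihood factor analysis with PCA on a matrix of depth-profile spectra (rows = sputter steps, columns = energy channels); the synthetic two-component Gaussian line shapes and the scikit-learn implementation stand in for MLCFA and are assumptions for illustration only.

    import numpy as np
    from sklearn.decomposition import PCA, FactorAnalysis

    rng = np.random.default_rng(2)
    energy = np.linspace(0, 1, 120)
    # two overlapping "component spectra" (Gaussian line shapes)
    comp = np.vstack([np.exp(-((energy - c) / 0.08) ** 2) for c in (0.45, 0.55)])
    # depth-dependent mixing: one chemical state grows while the other decays
    depth = np.linspace(0, 1, 60)[:, None]
    spectra = depth @ comp[:1] + (1 - depth) @ comp[1:] + 0.02 * rng.normal(size=(60, 120))

    fa = FactorAnalysis(n_components=2).fit(spectra)     # ML common factor analysis
    pca = PCA(n_components=2).fit(spectra)

    print(fa.components_.shape, pca.components_.shape)   # (2, 120) loadings each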
Flight Test Results of a Synthetic Vision Elevation Database Integrity Monitor
NASA Technical Reports Server (NTRS)
deHaag, Maarten Uijt; Sayre, Jonathon; Campbell, Jacob; Young, Steve; Gray, Robert
2001-01-01
This paper discusses the flight test results of a real-time Digital Elevation Model (DEM) integrity monitor for Civil Aviation applications. Providing pilots with Synthetic Vision (SV) displays containing terrain information has the potential to improve flight safety by improving situational awareness and thereby reducing the likelihood of Controlled Flight Into Terrain (CFIT). Utilization of DEMs, such as the digital terrain elevation data (DTED), requires a DEM integrity check and timely integrity alerts to the pilots when used for flight-critical terrain displays; otherwise the DEM may provide hazardous misleading terrain information. The discussed integrity monitor checks the consistency between a terrain elevation profile synthesized from sensor information and the profile given in the DEM. The synthesized profile is derived from DGPS and radar altimeter measurements. DEMs of various spatial resolutions are used to illustrate the dependency of the integrity monitor's performance on the DEM's spatial resolution. The paper will give a description of proposed integrity algorithms, the flight test setup, and the results of a flight test performed at the Ohio University airport and in the vicinity of Asheville, NC.
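A minimal sketch of the consistency check described above: a terrain profile synthesized from DGPS altitude minus radar-altimeter height is compared with the DEM profile along the flight path, and an alert is raised when a disparity statistic exceeds a threshold. The statistic, threshold, and data below are illustrative assumptions, not the flight-test algorithm.

    import numpy as np

    def integrity_alert(gps_alt_msl, radar_agl, dem_profile, threshold_m=30.0):
        """Return (disparity statistic, alert flag) for one segment of the flight path."""
        synthesized = gps_alt_msl - radar_agl          # terrain elevation from sensors
        disparity = synthesized - dem_profile
        stat = np.sqrt(np.mean(disparity ** 2))        # RMS disparity over the segment
        return stat, stat > threshold_m

    gps_alt = np.array([1510.0, 1525.0, 1540.0, 1560.0])   # metres MSL
    radar   = np.array([ 300.0,  310.0,  305.0,  315.0])   # metres AGL
    dem     = np.array([1205.0, 1210.0, 1230.0, 1248.0])   # DEM elevations, metres
    print(integrity_alert(gps_alt, radar, dem))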
[Accuracy of three methods for the rapid diagnosis of oral candidiasis].
Lyu, X; Zhao, C; Yan, Z M; Hua, H
2016-10-09
Objective: To explore a simple, rapid and efficient method for the diagnosis of oral candidiasis in clinical practice. Methods: In total, 124 consecutive patients with suspected oral candidiasis were enrolled from the Department of Oral Medicine, Peking University School and Hospital of Stomatology, Beijing, China. Exfoliated cells of the oral mucosa and saliva (or concentrated oral rinse) obtained from all participants were tested by three rapid smear methods (10% KOH smear, Gram-stained smear, Congo red stained smear). The diagnostic efficacy (sensitivity, specificity, Youden's index, likelihood ratio, consistency, predictive value and area under the curve (AUC)) of each of the three methods was assessed by comparing the results with the gold standard (a combination of clinical diagnosis, laboratory diagnosis and expert opinion). Results: The Gram-stained smear of saliva (or concentrated oral rinse) demonstrated the highest sensitivity (82.3%). The 10% KOH smear of exfoliated cells showed the highest specificity (93.5%). The Congo red stained smear of saliva (or concentrated oral rinse) displayed the highest overall diagnostic efficacy (79.0% sensitivity, 80.6% specificity, 0.60 Youden's index, 4.08 positive likelihood ratio, 0.26 negative likelihood ratio, 80% consistency, 80.3% positive predictive value, 79.4% negative predictive value and 0.80 AUC). Conclusions: The Congo red stained smear of saliva (or concentrated oral rinse) could be used as a point-of-care tool for the rapid diagnosis of oral candidiasis in clinical practice. Trial registration: Chinese Clinical Trial Registry, ChiCTR-DDD-16008118.
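For readers who want to reproduce this kind of evaluation, the sketch below computes the reported indices (sensitivity, specificity, Youden's index, likelihood ratios, predictive values, accuracy) from a 2x2 table of test result versus gold standard; the counts are made up for illustration and are not the study's data.

    def diagnostic_indices(tp, fp, fn, tn):
        """Standard accuracy indices from a 2x2 table against a gold standard."""
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        return {
            "sensitivity": sens,
            "specificity": spec,
            "youden_index": sens + spec - 1,
            "lr_positive": sens / (1 - spec),
            "lr_negative": (1 - sens) / spec,
            "ppv": tp / (tp + fp),
            "npv": tn / (tn + fn),
            "accuracy": (tp + tn) / (tp + fp + fn + tn),
        }

    # hypothetical counts for one smear method in 124 patients
    print(diagnostic_indices(tp=49, fp=12, fn=13, tn=50))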
Statistical modelling of thermal annealing of fission tracks in apatite
NASA Astrophysics Data System (ADS)
Laslett, G. M.; Galbraith, R. F.
1996-12-01
We develop an improved methodology for modelling the relationship between mean track length, temperature, and time in fission track annealing experiments. We consider "fanning Arrhenius" models, in which contours of constant mean length on an Arrhenius plot are straight lines meeting at a common point. Features of our approach are explicit use of subject matter knowledge, treating mean length as the response variable, modelling of the mean-variance relationship with two components of variance, improved modelling of the control sample, and using information from experiments in which no tracks are seen. This approach overcomes several weaknesses in previous models and provides a robust six parameter model that is widely applicable. Estimation is via direct maximum likelihood which can be implemented using a standard numerical optimisation package. Because the model is highly nonlinear, some reparameterisations are needed to achieve stable estimation and calculation of precisions. Experience suggests that precisions are more convincingly estimated from profile log-likelihood functions than from the information matrix. We apply our method to the B-5 and Sr fluorapatite data of Crowley et al. (1991) and obtain well-fitting models in both cases. For the B-5 fluorapatite, our model exhibits less fanning than that of Crowley et al. (1991), although fitted mean values above 12 μm are fairly similar. However, predictions can be different, particularly for heavy annealing at geological time scales, where our model is less retentive. In addition, the refined error structure of our model results in tighter prediction errors, and has components of error that are easier to verify or modify. For the Sr fluorapatite, our fitted model for mean lengths does not differ greatly from that of Crowley et al. (1991), but our error structure is quite different.
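The remark about precisions generalises: a profile-likelihood interval for one parameter is obtained by maximising the log-likelihood over the nuisance parameters at each fixed value of the parameter of interest and keeping the values within half a chi-square(1) quantile of the maximum. A generic sketch on a toy normal model (not the six-parameter annealing model), with made-up data:

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import chi2

    rng = np.random.default_rng(3)
    y = rng.normal(loc=5.0, scale=2.0, size=50)

    def negloglik(params):
        mu, log_sigma = params
        s = np.exp(log_sigma)
        return 0.5 * np.sum(((y - mu) / s) ** 2) + len(y) * log_sigma

    full = minimize(negloglik, x0=[0.0, 0.0])

    def profile_nll(mu):
        # maximise over the nuisance parameter (log sigma) with mu held fixed
        return minimize(lambda ls: negloglik([mu, ls[0]]), x0=[0.0]).fun

    grid = np.linspace(full.x[0] - 2, full.x[0] + 2, 200)
    prof = np.array([profile_nll(m) for m in grid])
    cutoff = full.fun + 0.5 * chi2.ppf(0.95, df=1)
    inside = grid[prof <= cutoff]
    print("95% profile interval for mu:", inside.min(), inside.max())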
Mass and Volume Optimization of Space Flight Medical Kits
NASA Technical Reports Server (NTRS)
Keenan, A. B.; Foy, Millennia Hope; Myers, Jerry
2014-01-01
Resource allocation is a critical aspect of space mission planning. All resources, including medical resources, are subject to a number of mission constraints such as maximum mass and volume. However, unlike many resources, there is often limited understanding of how to optimize medical resources for a mission. The Integrated Medical Model (IMM) is a probabilistic model that estimates medical event occurrences and mission outcomes for different mission profiles. IMM simulates outcomes and describes the impact of medical events in terms of lost crew time, medical resource usage, and the potential for medically required evacuation. Previously published work describes an approach that uses the IMM to generate optimized medical kits that maximize benefit to the crew subject to mass and volume constraints. We improve upon the results obtained previously and extend our approach to minimize mass and volume while meeting some benefit threshold. METHODS We frame the medical kit optimization problem as a modified knapsack problem and implement an algorithm utilizing dynamic programming. Using this algorithm, optimized medical kits were generated for 3 mission scenarios with the goal of minimizing the medical kit mass and volume for a specified likelihood of evacuation or Crew Health Index (CHI) threshold. The algorithm was expanded to generate medical kits that maximize likelihood of evacuation or CHI subject to mass and volume constraints. RESULTS AND CONCLUSIONS In maximizing benefit to crew health subject to certain constraints, our algorithm generates medical kits that more closely resemble the unlimited-resource scenario than previous approaches which leverage medical risk information generated by the IMM. Our work here demonstrates that this algorithm provides an efficient and effective means to objectively allocate medical resources for spaceflight missions and provides an effective means of addressing tradeoffs in medical resource allocations and crew mission success parameters.
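A minimal sketch of the dynamic-programming knapsack framing mentioned above, with a single discretised mass budget and integer benefit scores standing in for IMM outputs; the item names, scores, and single-constraint simplification are assumptions for illustration, not the IMM tool itself.

    def knapsack(items, capacity):
        """0/1 knapsack: items = [(name, weight, benefit)], integer weights."""
        best = [0] * (capacity + 1)
        keep = [[] for _ in range(capacity + 1)]
        for name, w, b in items:
            for c in range(capacity, w - 1, -1):       # iterate backwards for the 0/1 choice
                if best[c - w] + b > best[c]:
                    best[c] = best[c - w] + b
                    keep[c] = keep[c - w] + [name]
        return best[capacity], keep[capacity]

    # hypothetical medical-kit items: (name, mass in 10 g units, benefit score)
    items = [("analgesic", 3, 40), ("iv_kit", 8, 65), ("splint", 5, 30),
             ("antibiotic", 4, 50), ("suture_kit", 6, 45)]
    print(knapsack(items, capacity=15))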
Myasoedova, Elena; Gabriel, Sherine E.; Green, Abigail B.; Matteson, Eric L.; Crowson, Cynthia S.
2013-01-01
Objectives To examine lipid profiles among statin-naive patients with rheumatoid arthritis (RA) and those without RA before and after the initiation of statins. Methods Information regarding lipid measures and statin use was gathered in a population-based incident cohort of patients with RA (1987 ACR criteria first met between 1/1/1988 and 1/1/2008) and in a cohort of non-RA subjects from the same underlying population. Only patients with no prior history of statin use were included. Results The study included 161 patients with RA (mean age 56.3 years, 57% female) and 221 non-RA subjects (mean age 56.0 years, 66% female). Prior to the start of statins, the levels of total cholesterol (TC) and low-density lipoprotein cholesterol (LDL) were lower in RA vs non-RA cohort (p<0.001 and p=0.003, respectively). The absolute and percent change in LDL after at least 90 days of statin use tended to be smaller in RA vs non-RA cohort (p=0.03 and p=0.09). After at least 90 days of statin use patients with RA were less likely to achieve therapeutic goals for LDL than the non-RA subjects (p=0.046). Increased erythrocyte sedimentation rate (ESR) at baseline (OR 0.47; 95% CI 0.26, 0.85) was associated with lower likelihood of achieving therapeutic LDL goals. Conclusion Patients with RA had lower TC and LDL levels before statin initiation and lower likelihood of achieving therapeutic LDL goals following statin use than the non-RA subjects. Some RA disease characteristics, in particular ESR at baseline, may have an adverse impact on achieving therapeutic LDL goals. PMID:23592565
Contagion in Mass Killings and School Shootings
Towers, Sherry; Gomez-Lievano, Andres; Khan, Maryam; Mubayi, Anuj; Castillo-Chavez, Carlos
2015-01-01
Background Several past studies have found that media reports of suicides and homicides appear to subsequently increase the incidence of similar events in the community, apparently due to the coverage planting the seeds of ideation in at-risk individuals to commit similar acts. Methods Here we explore whether or not contagion is evident in more high-profile incidents, such as school shootings and mass killings (incidents with four or more people killed). We fit a contagion model to recent data sets related to such incidents in the US, with terms that take into account the fact that a school shooting or mass murder may temporarily increase the probability of a similar event in the immediate future, by assuming an exponential decay in contagiousness after an event. Conclusions We find significant evidence that mass killings involving firearms are incented by similar events in the immediate past. On average, this temporary increase in probability lasts 13 days, and each incident incites at least 0.30 new incidents (p = 0.0015). We also find significant evidence of contagion in school shootings, for which an incident is contagious for an average of 13 days, and incites an average of at least 0.22 new incidents (p = 0.0001). All p-values are assessed based on a likelihood ratio test comparing the likelihood of a contagion model to that of a null model with no contagion. On average, mass killings involving firearms occur approximately every two weeks in the US, while school shootings occur on average monthly. We find that state prevalence of firearm ownership is significantly associated with the state incidence of mass killings with firearms, school shootings, and mass shootings. PMID:26135941
Schwappach, David L. B.; Gehring, Katrin
2014-01-01
Purpose To investigate the likelihood of speaking up about patient safety in oncology and to clarify the effect of clinical and situational context factors on the likelihood of voicing concerns. Patients and Methods 1013 nurses and doctors in oncology rated four clinical vignettes describing coworkers’ errors and rule violations in a self-administered factorial survey (65% response rate). Multiple regression analysis was used to model the likelihood of speaking up as outcome of vignette attributes, responder’s evaluations of the situation and personal characteristics. Results Respondents reported a high likelihood of speaking up about patient safety but the variation between and within types of errors and rule violations was substantial. Staff without managerial function provided significantly higher levels of decision difficulty and discomfort to speak up. Based on the information presented in the vignettes, 74%−96% would speak up towards a supervisor failing to check a prescription, 45%−81% would point a coworker to a missed hand disinfection, 82%−94% would speak up towards nurses who violate a safety rule in medication preparation, and 59%−92% would question a doctor violating a safety rule in lumbar puncture. Several vignette attributes predicted the likelihood of speaking up. Perceived potential harm, anticipated discomfort, and decision difficulty were significant predictors of the likelihood of speaking up. Conclusions Clinicians’ willingness to speak up about patient safety is considerably affected by contextual factors. Physicians and nurses without managerial function report substantial discomfort with speaking up. Oncology departments should provide staff with clear guidance and trainings on when and how to voice safety concerns. PMID:25116338
Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty
Baele, Guy; Lemey, Philippe; Suchard, Marc A.
2016-01-01
Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of “working distributions” to facilitate—or shorten—the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a “working” distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different “working” distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses. PMID:26526428
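To fix ideas about the stepping-stone machinery that GSS builds on, the sketch below assembles a log marginal-likelihood estimate from per-rung log-likelihood samples drawn from power posteriors along a beta ladder; the data structure and ladder are assumptions, and the working distributions and genealogical integration that GSS adds are not reproduced.

    import numpy as np
    from scipy.special import logsumexp

    def stepping_stone_logml(loglik_by_rung, betas):
        """loglik_by_rung[k] = log-likelihood values of samples drawn from the
        power posterior with temperature betas[k]; betas runs from 0.0 to 1.0."""
        logml = 0.0
        for k in range(len(betas) - 1):
            ll = np.asarray(loglik_by_rung[k])
            delta = betas[k + 1] - betas[k]
            # log of the mean of L(theta)^delta over samples from rung k
            logml += logsumexp(delta * ll) - np.log(len(ll))
        return logml

    # toy illustration with made-up log-likelihood draws on a 5-rung ladder
    betas = np.linspace(0.0, 1.0, 5)
    rng = np.random.default_rng(4)
    fake = [rng.normal(-100 + 5 * b, 2.0, size=1000) for b in betas]
    print(stepping_stone_logml(fake, betas))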
Likelihood Ratios for Glaucoma Diagnosis Using Spectral Domain Optical Coherence Tomography
Lisboa, Renato; Mansouri, Kaweh; Zangwill, Linda M.; Weinreb, Robert N.; Medeiros, Felipe A.
2014-01-01
Purpose To present a methodology for calculating likelihood ratios for glaucoma diagnosis for continuous retinal nerve fiber layer (RNFL) thickness measurements from spectral domain optical coherence tomography (spectral-domain OCT). Design Observational cohort study. Methods 262 eyes of 187 patients with glaucoma and 190 eyes of 100 control subjects were included in the study. Subjects were recruited from the Diagnostic Innovations Glaucoma Study. Eyes with preperimetric and perimetric glaucomatous damage were included in the glaucoma group. The control group was composed of healthy eyes with normal visual fields from subjects recruited from the general population. All eyes underwent RNFL imaging with Spectralis spectral-domain OCT. Likelihood ratios for glaucoma diagnosis were estimated for specific global RNFL thickness measurements using a methodology based on estimating the tangents to the Receiver Operating Characteristic (ROC) curve. Results Likelihood ratios could be determined for continuous values of average RNFL thickness. Average RNFL thickness values lower than 86μm were associated with positive LRs, i.e., LRs greater than 1; whereas RNFL thickness values higher than 86μm were associated with negative LRs, i.e., LRs smaller than 1. A modified Fagan nomogram was provided to assist calculation of post-test probability of disease from the calculated likelihood ratios and pretest probability of disease. Conclusion The methodology allowed calculation of likelihood ratios for specific RNFL thickness values. By avoiding arbitrary categorization of test results, it potentially allows for an improved integration of test results into diagnostic clinical decision-making. PMID:23972303
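A minimal sketch of how such likelihood ratios feed into post-test probabilities, in the spirit of the Fagan nomogram mentioned above; here the LR for a given RNFL thickness is approximated by a ratio of kernel density estimates from the two groups, which is a stand-in assumption rather than the paper's ROC-tangent methodology, and the simulated thicknesses are not the study data.

    import numpy as np
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(5)
    rnfl_glaucoma = rng.normal(70, 12, 260)     # simulated average RNFL thickness (um)
    rnfl_control  = rng.normal(97, 10, 190)

    kde_g = gaussian_kde(rnfl_glaucoma)
    kde_c = gaussian_kde(rnfl_control)

    def likelihood_ratio(thickness):
        return kde_g(thickness)[0] / kde_c(thickness)[0]

    def post_test_probability(pretest_prob, lr):
        pre_odds = pretest_prob / (1 - pretest_prob)
        post_odds = pre_odds * lr
        return post_odds / (1 + post_odds)

    lr = likelihood_ratio(80.0)                 # LR for a measured thickness of 80 um
    print(lr, post_test_probability(0.20, lr))  # 20% pretest probability of glaucoma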
A comprehensive assessment of collision likelihood in Geosynchronous Earth Orbit
NASA Astrophysics Data System (ADS)
Oltrogge, D. L.; Alfano, S.; Law, C.; Cacioni, A.; Kelso, T. S.
2018-06-01
Knowing the likelihood of collision for satellites operating in Geosynchronous Earth Orbit (GEO) is of extreme importance and interest to the global community and the operators of GEO spacecraft. Yet for all of its importance, a comprehensive assessment of GEO collision likelihood is difficult to do and has never been done. In this paper, we employ six independent and diverse assessment methods to estimate GEO collision likelihood. Taken in aggregate, this comprehensive assessment offers new insights into GEO collision likelihood, with the estimates agreeing to within a factor of 3.5 of each other. These results are then compared to four collision and seven encounter rate estimates previously published. Collectively, these new findings indicate that collision likelihood in GEO is as much as four orders of magnitude higher than previously published by other researchers. Results indicate that a collision is likely to occur every 4 years for one satellite out of the entire GEO active satellite population against a 1 cm RSO catalogue, and every 50 years against a 20 cm RSO catalogue. Further, previous assertions that collision relative velocities are low (i.e., <1 km/s) in GEO are disproven, with some GEO relative velocities as high as 4 km/s identified. These new findings indicate that unless operators successfully mitigate this collision risk, the GEO orbital arc is and will remain at high risk of collision, with the potential for serious follow-on collision threats from post-collision debris when a substantial GEO collision occurs.
Ratmann, Oliver; Andrieu, Christophe; Wiuf, Carsten; Richardson, Sylvia
2009-06-30
Mathematical models are an important tool to explain and comprehend complex phenomena, and unparalleled computational advances enable us to easily explore them with little or no understanding of their global properties. In fact, the likelihood of the data under complex stochastic models is often analytically or numerically intractable in many areas of science. This makes it even more important to simultaneously investigate the adequacy of these models (in absolute terms, against the data, rather than relative to the performance of other models), but no such procedure has been formally discussed when the likelihood is intractable. We provide a statistical interpretation to current developments in likelihood-free Bayesian inference that explicitly accounts for discrepancies between the model and the data, termed Approximate Bayesian Computation under model uncertainty (ABCμ). We augment the likelihood of the data with unknown error terms that correspond to freely chosen checking functions, and provide Monte Carlo strategies for sampling from the associated joint posterior distribution without the need to evaluate the likelihood. We discuss the benefit of incorporating model diagnostics within an ABC framework, and demonstrate how this method diagnoses model mismatch and guides model refinement by contrasting three qualitative models of protein network evolution to the protein interaction datasets of Helicobacter pylori and Treponema pallidum. Our results make a number of model deficiencies explicit, and suggest that the T. pallidum network topology is inconsistent with evolution dominated by link turnover or lateral gene transfer alone.
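As background for readers new to likelihood-free inference, the sketch below shows plain rejection ABC for a toy model; the Poisson toy model, summary statistics, and tolerance are assumptions, and the full ABCμ machinery (error terms on multiple checking functions, model diagnostics) is not reproduced here.

    import numpy as np

    rng = np.random.default_rng(6)
    observed = rng.poisson(lam=4.0, size=200)          # pretend these are the data
    obs_summary = np.array([observed.mean(), observed.var()])

    def simulate_summary(lam, n=200):
        sim = rng.poisson(lam=lam, size=n)
        return np.array([sim.mean(), sim.var()])

    def abc_rejection(n_draws=20000, tol=0.5):
        accepted = []
        for _ in range(n_draws):
            lam = rng.uniform(0.1, 20.0)               # prior draw
            discrepancy = np.linalg.norm(simulate_summary(lam) - obs_summary)
            if discrepancy < tol:
                accepted.append(lam)
        return np.array(accepted)

    post = abc_rejection()
    print(len(post), post.mean())                      # approximate posterior for lambda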
Gaussian copula as a likelihood function for environmental models
NASA Astrophysics Data System (ADS)
Wani, O.; Espadas, G.; Cecinati, F.; Rieckermann, J.
2017-12-01
Parameter estimation of environmental models always comes with uncertainty. To formally quantify this parametric uncertainty, a likelihood function needs to be formulated, which is defined as the probability of observations given fixed values of the parameter set. A likelihood function allows us to infer parameter values from observations using Bayes' theorem. The challenge is to formulate a likelihood function that reliably describes the error-generating processes that lead to the observed monitoring data, such as rainfall and runoff. If the likelihood function is not representative of the error statistics, the parameter inference will give biased parameter values. Several uncertainty estimation methods that are currently being used employ Gaussian processes as a likelihood function, because of their favourable analytical properties. The Box-Cox transformation is suggested to deal with non-symmetric and heteroscedastic errors, e.g., for flow data, which are typically more uncertain at high flows than during periods of low flow. The problem with transformations is that the results are conditional on hyper-parameters, for which it is difficult to formulate the analyst's belief a priori. In an attempt to address this problem, in this research work we suggest learning the nature of the error distribution from the errors made by the model in "past" forecasts. We use a Gaussian copula to generate semiparametric error distributions. 1) We show that this copula can then be used as a likelihood function to infer parameters, breaking away from the practice of using multivariate normal distributions. 2) Based on the results from a didactical example of predicting rainfall runoff, we demonstrate that the copula captures the predictive uncertainty of the model. 3) Finally, we find that the properties of autocorrelation and heteroscedasticity of errors are captured well by the copula, eliminating the need to use transforms. In summary, our findings suggest that copulas are an interesting departure from the usage of fully parametric distributions as likelihood functions, and they could help us to better capture the statistical properties of errors and make more reliable predictions.
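A minimal sketch of evaluating a Gaussian copula log-density for a vector of model errors, assuming the marginals are summarised by probability integral transforms estimated from past errors; the empirical-CDF marginals and AR(1)-style correlation matrix are illustrative assumptions, not the authors' calibration procedure.

    import numpy as np
    from scipy.stats import norm, multivariate_normal

    def empirical_pit(past_errors, new_errors):
        """Probability integral transform of new errors via the empirical CDF of past errors."""
        past = np.sort(past_errors)
        ranks = np.searchsorted(past, new_errors, side="right")
        return (ranks + 0.5) / (len(past) + 1.0)        # keep u strictly inside (0, 1)

    def gaussian_copula_logdensity(u, corr):
        z = norm.ppf(u)                                  # normal scores
        joint = multivariate_normal(mean=np.zeros(len(z)), cov=corr).logpdf(z)
        return joint - norm.logpdf(z).sum()              # copula density only

    rng = np.random.default_rng(7)
    past = rng.normal(0, 1, 5000)                        # errors from "past" forecasts
    new = rng.normal(0, 1, 4)                            # errors at 4 consecutive time steps
    rho = 0.6
    corr = rho ** np.abs(np.subtract.outer(np.arange(4), np.arange(4)))  # AR(1) correlation
    u = empirical_pit(past, new)
    print(gaussian_copula_logdensity(u, corr))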
Liu, Xiaoming; Fu, Yun-Xin; Maxwell, Taylor J.; Boerwinkle, Eric
2010-01-01
It is known that sequencing error can bias estimation of evolutionary or population genetic parameters. This problem is more prominent in deep resequencing studies because of their large sample size n, and a higher probability of error at each nucleotide site. We propose a new method based on the composite likelihood of the observed SNP configurations to infer population mutation rate θ = 4Neμ, population exponential growth rate R, and error rate ɛ, simultaneously. Using simulation, we show the combined effects of the parameters, θ, n, ɛ, and R on the accuracy of parameter estimation. We compared our maximum composite likelihood estimator (MCLE) of θ with other θ estimators that take into account the error. The results show the MCLE performs well when the sample size is large or the error rate is high. Using parametric bootstrap, composite likelihood can also be used as a statistic for testing the model goodness-of-fit of the observed DNA sequences. The MCLE method is applied to sequence data on the ANGPTL4 gene in 1832 African American and 1045 European American individuals. PMID:19952140
A novel retinal vessel extraction algorithm based on matched filtering and gradient vector flow
NASA Astrophysics Data System (ADS)
Yu, Lei; Xia, Mingliang; Xuan, Li
2013-10-01
The microvasculature network of retina plays an important role in the study and diagnosis of retinal diseases (age-related macular degeneration and diabetic retinopathy for example). Although it is possible to noninvasively acquire high-resolution retinal images with modern retinal imaging technologies, non-uniform illumination, the low contrast of thin vessels and the background noises all make it difficult for diagnosis. In this paper, we introduce a novel retinal vessel extraction algorithm based on gradient vector flow and matched filtering to segment retinal vessels with different likelihood. Firstly, we use isotropic Gaussian kernel and adaptive histogram equalization to smooth and enhance the retinal images respectively. Secondly, a multi-scale matched filtering method is adopted to extract the retinal vessels. Then, the gradient vector flow algorithm is introduced to locate the edge of the retinal vessels. Finally, we combine the results of matched filtering method and gradient vector flow algorithm to extract the vessels at different likelihood levels. The experiments demonstrate that our algorithm is efficient and the intensities of vessel images exactly represent the likelihood of the vessels.
Topics in inference and decision-making with partial knowledge
NASA Technical Reports Server (NTRS)
Safavian, S. Rasoul; Landgrebe, David
1990-01-01
Two essential elements needed in the process of inference and decision-making are prior probabilities and likelihood functions. When both of these components are known accurately and precisely, the Bayesian approach provides a consistent and coherent solution to the problems of inference and decision-making. In many situations, however, either one or both of the above components may not be known, or at least may not be known precisely. This problem of partial knowledge about prior probabilities and likelihood functions is addressed. There are at least two ways to cope with this lack of precise knowledge: robust methods, and interval-valued methods. First, ways of modeling imprecision and indeterminacies in prior probabilities and likelihood functions are examined; then how imprecision in the above components carries over to the posterior probabilities is examined. Finally, the problem of decision making with imprecise posterior probabilities and the consequences of such actions are addressed. Application areas where the above problems may occur are in statistical pattern recognition problems, for example, the problem of classification of high-dimensional multispectral remote sensing image data.
Uncued Low SNR Detection with Likelihood from Image Multi Bernoulli Filter
NASA Astrophysics Data System (ADS)
Murphy, T.; Holzinger, M.
2016-09-01
Both SSA and SDA necessitate uncued, partially informed detection and orbit determination efforts for small space objects, which often produce only low-strength electro-optical signatures. General frame-to-frame detection and tracking of objects includes methods such as moving target indicator, multiple hypothesis testing, direct track-before-detect methods, and random finite set based multiobject tracking. This paper will apply the multi-Bernoulli filter to low signal-to-noise ratio (SNR), uncued detection of space objects for space domain awareness applications. The primary novel innovation in this paper is a detailed analysis of the existing state-of-the-art likelihood functions and a likelihood function, based on a binary hypothesis, previously proposed by the authors. The algorithm is tested on electro-optical imagery obtained from a variety of sensors at Georgia Tech, including the GT-SORT 0.5m Raven-class telescope and a twenty-degree field-of-view, high-frame-rate CMOS sensor. In particular, a data set of an extended pass of the Hitomi Astro-H satellite approximately 3 days after loss of communication and potential break-up is examined.
NASA Astrophysics Data System (ADS)
Núñez, M.; Robie, T.; Vlachos, D. G.
2017-10-01
Kinetic Monte Carlo (KMC) simulation provides insights into catalytic reactions unobtainable with either experiments or mean-field microkinetic models. Sensitivity analysis of KMC models assesses the robustness of the predictions to parametric perturbations and identifies rate determining steps in a chemical reaction network. Stiffness in the chemical reaction network, a ubiquitous feature, demands lengthy run times for KMC models and renders efficient sensitivity analysis based on the likelihood ratio method unusable. We address the challenge of efficiently conducting KMC simulations and performing accurate sensitivity analysis in systems with unknown time scales by employing two acceleration techniques: rate constant rescaling and parallel processing. We develop statistical criteria that ensure sufficient sampling of non-equilibrium steady state conditions. Our approach provides the twofold benefit of accelerating the simulation itself and enabling likelihood ratio sensitivity analysis, which provides further speedup relative to finite difference sensitivity analysis. As a result, the likelihood ratio method can be applied to real chemistry. We apply our methodology to the water-gas shift reaction on Pt(111).
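The likelihood ratio (score-function) sensitivity estimator mentioned above can be illustrated on a much simpler stochastic simulation than KMC: for exponentially distributed waiting times with rate k, the derivative of an expected observable with respect to k is estimated by weighting each realisation with the score d log p / dk. The toy observable and parameter values below are assumptions for illustration, not the Pt(111) water-gas shift model.

    import numpy as np

    rng = np.random.default_rng(8)
    k = 2.0                                   # rate constant
    n = 200000
    waiting_times = rng.exponential(scale=1.0 / k, size=n)

    observable = waiting_times ** 2           # some function of the realisation
    score = 1.0 / k - waiting_times           # d/dk log p(t; k) for an exponential density

    lr_sensitivity = np.mean(observable * score)   # likelihood ratio estimate of d E[f] / dk
    analytic = -4.0 / k ** 3                       # exact derivative of E[T^2] = 2/k^2
    print(lr_sensitivity, analytic)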
Collinear Latent Variables in Multilevel Confirmatory Factor Analysis
van de Schoot, Rens; Hox, Joop
2014-01-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated at the within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influence of the size of the intraclass correlation coefficient (ICC) and of the estimation method (maximum likelihood estimation with robust chi-squares and standard errors versus Bayesian estimation) on the convergence rate is investigated. The other variables of interest were the rate of inadmissible solutions and the relative parameter and standard error bias on the between level. The results showed that inadmissible solutions were obtained when there was between-level collinearity and the estimation method was maximum likelihood. In the within-level multicollinearity condition, all of the solutions were admissible but the bias values were higher compared with the between-level collinearity condition. Bayesian estimation appeared to be robust in obtaining admissible parameters but the relative bias was higher than for maximum likelihood estimation. Finally, as expected, high ICC produced less biased results compared to medium ICC conditions. PMID:29795827
Subjective global assessment of nutritional status in children.
Mahdavi, Aida Malek; Ostadrahimi, Alireza; Safaiyan, Abdolrasool
2010-10-01
This study aimed to compare subjective and objective nutritional assessments and to analyse the performance of subjective global assessment (SGA) of nutritional status in diagnosing undernutrition in paediatric patients. One hundred and forty children (aged 2-12 years) hospitalized consecutively in Tabriz Paediatric Hospital from June 2008 to August 2008 underwent subjective assessment using the SGA questionnaire and objective assessment, including anthropometric and biochemical measurements. Agreement between the two assessment methods was analysed by the kappa (κ) statistic. Statistical indicators (sensitivity, specificity, predictive values, error rates, accuracy, powers, likelihood ratios and odds ratio) comparing SGA with the objective assessment method were determined. The overall prevalence of undernutrition according to the SGA (70.7%) was higher than that by objective assessment of nutritional status (48.5%). Agreement between the two evaluation methods was only fair to moderate (κ = 0.336, P < 0.001). The sensitivity, specificity, positive and negative predictive value of the SGA method for screening undernutrition in this population were 88.235%, 45.833%, 60.606% and 80.487%, respectively. Accuracy, positive and negative power of the SGA method were 66.428%, 56.074% and 41.25%, respectively. The positive likelihood ratio, negative likelihood ratio and odds ratio of the SGA method were 1.628, 0.256 and 6.359, respectively. Our findings indicated that in assessing the nutritional status of children, there is not a good level of agreement between SGA and objective nutritional assessment. In addition, SGA is a highly sensitive tool for assessing nutritional status and could identify children at risk of developing undernutrition. © 2009 Blackwell Publishing Ltd.
Abstract: Inference and Interval Estimation for Indirect Effects With Latent Variable Models.
Falk, Carl F; Biesanz, Jeremy C
2011-11-30
Models specifying indirect effects (or mediation) and structural equation modeling are both popular in the social sciences. Yet relatively little research has compared methods that test for indirect effects among latent variables and provided precise estimates of the effectiveness of different methods. This simulation study provides an extensive comparison of methods for constructing confidence intervals and for making inferences about indirect effects with latent variables. We compared the percentile (PC) bootstrap, bias-corrected (BC) bootstrap, bias-corrected accelerated (BCa) bootstrap, likelihood-based confidence intervals (Neale & Miller, 1997), partial posterior predictive (Biesanz, Falk, and Savalei, 2010), and joint significance tests based on Wald tests or likelihood ratio tests. All models included three reflective latent variables representing the independent, dependent, and mediating variables. The design included the following fully crossed conditions: (a) sample size: 100, 200, and 500; (b) number of indicators per latent variable: 3 versus 5; (c) reliability per set of indicators: .7 versus .9; and (d) 16 different path combinations for the indirect effect (α = 0, .14, .39, or .59; and β = 0, .14, .39, or .59). Simulations were performed using a WestGrid cluster of 1680 3.06GHz Intel Xeon processors running R and OpenMx. Results based on 1,000 replications per cell and 2,000 resamples per bootstrap method indicated that the BC and BCa bootstrap methods have inflated Type I error rates. Likelihood-based confidence intervals and the PC bootstrap emerged as methods that adequately control Type I error and have good coverage rates.
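A minimal sketch of the percentile-bootstrap interval for an indirect effect a*b in the simple observed-variable case; the latent-variable models, OpenMx estimation, and the other interval methods compared in the abstract are not reproduced, and the simulated data are an assumption for illustration.

    import numpy as np

    rng = np.random.default_rng(9)
    n = 200
    x = rng.normal(size=n)
    m = 0.39 * x + rng.normal(size=n)                 # mediator
    y = 0.39 * m + rng.normal(size=n)                 # outcome

    def indirect_effect(x, m, y):
        a = np.polyfit(x, m, 1)[0]                    # slope of m on x
        b = np.linalg.lstsq(np.column_stack([m, x, np.ones_like(x)]), y, rcond=None)[0][0]
        return a * b                                  # b: slope of y on m, controlling for x

    boot = np.array([
        indirect_effect(x[idx], m[idx], y[idx])
        for idx in (rng.integers(0, n, n) for _ in range(2000))
    ])
    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(indirect_effect(x, m, y), (lo, hi))         # point estimate and 95% percentile CI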
Comparisons of Four Methods for Estimating a Dynamic Factor Model
ERIC Educational Resources Information Center
Zhang, Zhiyong; Hamaker, Ellen L.; Nesselroade, John R.
2008-01-01
Four methods for estimating a dynamic factor model, the direct autoregressive factor score (DAFS) model, are evaluated and compared. The first method estimates the DAFS model using a Kalman filter algorithm based on its state space model representation. The second one employs the maximum likelihood estimation method based on the construction of a…
Statistical methods for the beta-binomial model in teratology.
Yamamoto, E; Yanagimoto, T
1994-01-01
The beta-binomial model is widely used for analyzing teratological data involving littermates. Recent developments in statistical analyses of teratological data are briefly reviewed with emphasis on the model. For statistical inference of the parameters in the beta-binomial distribution, separation of the likelihood introduces a likelihood inference. This reduces the biases of estimators and also improves the accuracy of empirical significance levels of tests. Separate inference of the parameters can be conducted in a unified way. PMID:8187716
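For concreteness, a minimal sketch of fitting the beta-binomial model to littermate data by straightforward maximum likelihood; the litter data are simulated, and the direct joint optimisation shown here is not the likelihood-separation approach the abstract refers to.

    import numpy as np
    from scipy.stats import betabinom
    from scipy.optimize import minimize

    rng = np.random.default_rng(10)
    litter_sizes = rng.integers(6, 14, size=80)
    # simulate affected counts per litter from a beta-binomial with shapes a=2, b=6
    affected = np.array([betabinom.rvs(n, 2.0, 6.0, random_state=rng) for n in litter_sizes])

    def negloglik(log_params):
        a, b = np.exp(log_params)                    # keep shape parameters positive
        return -betabinom.logpmf(affected, litter_sizes, a, b).sum()

    fit = minimize(negloglik, x0=[0.0, 0.0], method="Nelder-Mead")
    a_hat, b_hat = np.exp(fit.x)
    print(a_hat, b_hat, a_hat / (a_hat + b_hat))     # shape estimates and mean response rate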
Ndhlovu, Andrew; Durand, Pierre M.; Hazelhurst, Scott
2015-01-01
The evolutionary rate at codon sites across protein-coding nucleotide sequences represents a valuable tier of information for aligning sequences, inferring homology and constructing phylogenetic profiles. However, a comprehensive resource for cataloguing the evolutionary rate at codon sites and their corresponding nucleotide and protein domain sequence alignments has not been developed. To address this gap in knowledge, EvoDB (an Evolutionary rates DataBase) was compiled. Nucleotide sequences and their corresponding protein domain data including the associated seed alignments from the PFAM-A (protein family) database were used to estimate evolutionary rate (ω = dN/dS) profiles at codon sites for each entry. EvoDB contains 98.83% of the gapped nucleotide sequence alignments and 97.1% of the evolutionary rate profiles for the corresponding information in PFAM-A. As the identification of codon sites under positive selection and their position in a sequence profile is usually the most sought after information for molecular evolutionary biologists, evolutionary rate profiles were determined under the M2a model using the CODEML algorithm in the PAML (Phylogenetic Analysis by Maximum Likelihood) suite of software. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a catalogue of the evolutionary rate profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and annotated accession identifier data. In addition, the database can be explored and queried using known evolutionary rate profiles to identify domains under similar evolutionary constraints and pressures. EvoDB is a resource for evolutionary, phylogenetic studies and presents a tier of information untapped by current databases. Database URL: http://www.bioinf.wits.ac.za/software/fire/evodb PMID:26140928
A New Online Calibration Method Based on Lord's Bias-Correction.
He, Yinhong; Chen, Ping; Li, Yong; Zhang, Shumei
2017-09-01
Online calibration technique has been widely employed to calibrate new items due to its advantages. Method A is the simplest online calibration method and has attracted much attention from researchers recently. However, a key assumption of Method A is that it treats person-parameter estimates θ̂_s (obtained by maximum likelihood estimation [MLE]) as their true values θ_s, thus the deviation of the estimated θ̂_s from their true values might yield inaccurate item calibration when the deviation is nonignorable. To improve the performance of Method A, a new method, MLE-LBCI-Method A, is proposed. This new method combines a modified Lord's bias-correction method (named as maximum likelihood estimation-Lord's bias-correction with iteration [MLE-LBCI]) with the original Method A in an effort to correct the deviation of θ̂_s which may adversely affect the item calibration precision. Two simulation studies were carried out to explore the performance of both MLE-LBCI and MLE-LBCI-Method A under several scenarios. Simulation results showed that MLE-LBCI could make a significant improvement over the ML ability estimates, and MLE-LBCI-Method A did outperform Method A in almost all experimental conditions.
Social Function in Multiple X and Y Chromosome Disorders: XXY, XYY, XXYY, XXXY
Visootsak, Jeannie; Graham, John M.
2014-01-01
Klinefelter syndrome (47,XXY) was initially described in the context of its endocrinologic and physical features; however, subsequent studies have revealed specific impairments in verbal skills and social functioning. Males with sex chromosomal aneuploidies are known to have variability in their developmental profile with the majority presenting with expressive language deficits. As a consequence of language delays, they have an increased likelihood of language-based learning disabilities and social-emotional problems that may persist through adulthood. Studies on males with 47,XXY have revealed unique behavioral and social profiles with possible vulnerability to autistic traits. The prevalence of males with more than one extra sex chromosome (e.g., 48,XXYY and 48,XXXY) and an additional Y (e.g., 47,XYY) is less common, but it is important to understand their social functioning as it provides insight into treatment implications. PMID:20014367
Missing the Party: Political Categorization and Reasoning in the Absence of Party Label Cues.
Heit, Evan; Nicholson, Stephen P
2016-07-01
This research addressed theoretical approaches in political science arguing that the American electorate is either poorly informed or dependent on party label cues, by assessing performance on political judgment tasks when party label information is missing. The research materials were created from the results of a national opinion survey held during a national election. The experiments themselves were run on nationally representative samples of adults, identified from another national electoral survey. Participants saw profiles of simulated individuals, including information about demographics and issue positions, but omitting party labels. In Experiment 1, participants successfully judged the likelihood of party membership based on the profiles. In Experiment 2, participants successfully voted based on their party interests. The results were mediated by participants' political knowledge. Conclusions are drawn with respect to theories from political science and issues in cognitive science regarding categorization and reasoning. Copyright © 2016 Cognitive Science Society, Inc.
Exponential series approaches for nonparametric graphical models
NASA Astrophysics Data System (ADS)
Janofsky, Eric
Markov Random Fields (MRFs) or undirected graphical models are parsimonious representations of joint probability distributions. This thesis studies high-dimensional, continuous-valued pairwise Markov Random Fields. We are particularly interested in approximating pairwise densities whose logarithm belongs to a Sobolev space. For this problem we propose the method of exponential series which approximates the log density by a finite-dimensional exponential family with the number of sufficient statistics increasing with the sample size. We consider two approaches to estimating these models. The first is regularized maximum likelihood. This involves optimizing the sum of the log-likelihood of the data and a sparsity-inducing regularizer. We then propose a variational approximation to the likelihood based on tree-reweighted, nonparametric message passing. This approximation allows for upper bounds on risk estimates, leverages parallelization and is scalable to densities on hundreds of nodes. We show how the regularized variational MLE may be estimated using a proximal gradient algorithm. We then consider estimation using regularized score matching. This approach uses an alternative scoring rule to the log-likelihood, which obviates the need to compute the normalizing constant of the distribution. For general continuous-valued exponential families, we provide parameter and edge consistency results. As a special case we detail a new approach to sparse precision matrix estimation which has statistical performance competitive with the graphical lasso and computational performance competitive with the state-of-the-art glasso algorithm. We then describe results for model selection in the nonparametric pairwise model using exponential series. The regularized score matching problem is shown to be a convex program; we provide scalable algorithms based on consensus alternating direction method of multipliers (ADMM) and coordinate-wise descent. We use simulations to compare our method to others in the literature as well as the aforementioned TRW estimator.
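The thesis benchmarks its estimators against the graphical lasso for sparse precision matrix estimation; the snippet below is a minimal sketch of that baseline (ℓ1-regularized Gaussian maximum likelihood) using scikit-learn, not of the exponential-series, tree-reweighted, or score-matching estimators themselves. The data are simulated.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))           # 200 samples, 10 variables

# Penalized Gaussian MLE: maximize log det(Theta) - tr(S Theta) - alpha * ||Theta||_1
model = GraphicalLasso(alpha=0.1).fit(X)
precision = model.precision_                  # sparse estimate of the inverse covariance
print(np.round(precision, 2))
```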
The logistic model for predicting the non-gonoactive Aedes aegypti females.
Reyes-Villanueva, Filiberto; Rodríguez-Pérez, Mario A
2004-01-01
To estimate, using logistic regression, the likelihood of occurrence of a non-gonoactive Aedes aegypti female, previously fed on human blood, in relation to body size and collection method. This study was conducted in Monterrey, Mexico, between 1994 and 1996. Ten samplings of 60 Ae. aegypti females each were carried out in three dengue endemic areas: six of biting females, two of emerging mosquitoes, and two of indoor resting females. Gravid females, as well as those with blood in the gut, were removed. Mosquitoes were taken to the laboratory and engorged on human blood. After 48 hours, ovaries were dissected to register whether they were gonoactive or non-gonoactive. Wing length in mm was the indicator of body size. The logistic regression model was used to assess the likelihood of non-gonoactivity, as a binary variable, in relation to wing length and collection method. Of the 600 females, 164 (27%) remained non-gonoactive, with a wing-length range of 1.9-3.2 mm, almost equal to that of all females (1.8-3.3 mm). The logistic regression model showed a significant likelihood of a female remaining non-gonoactive (Y=1). The collection method did not influence the binary response, but there was an inverse relationship between non-gonoactivity and wing length. Dengue vector populations from Monterrey, Mexico display a wide range of body sizes. Logistic regression was a useful tool to estimate the likelihood of an engorged female remaining non-gonoactive. The need for a second blood meal is present in any female, but small mosquitoes are more likely to bite again within a 2-day interval in order to attain egg maturation. The English version of this paper is also available at: http://www.insp.mx/salud/index.html.
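A minimal sketch of the kind of logistic fit described, with non-gonoactivity as the binary response and wing length plus collection method as predictors; the data below are simulated under an assumed inverse wing-length effect and are not the study's records.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 300
wing = rng.uniform(1.8, 3.3, n)                                  # wing length in mm
method = rng.choice(["biting", "emerging", "resting"], n)        # collection method
# assumed inverse relation between non-gonoactivity and wing length
p = 1.0 / (1.0 + np.exp(-(4.0 - 2.0 * wing)))
nongono = rng.binomial(1, p)                                     # 1 = remained non-gonoactive

df = pd.DataFrame({"nongono": nongono, "wing_mm": wing, "method": method})
fit = smf.logit("nongono ~ wing_mm + C(method)", data=df).fit(disp=0)
print(fit.summary())
print("odds ratio per mm of wing length:", np.exp(fit.params["wing_mm"]))
```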
Nadeau, Mélissa; Rosas-Arellano, M Patricia; Gurr, Kevin R; Bailey, Stewart I; Taylor, David C; Grewal, Ruby; Lawlor, D Kirk; Bailey, Chris S
2013-12-01
Intermittent claudication can be neurogenic or vascular. Physicians use a profile based on symptom attributes to differentiate the 2 types of claudication, and this guides their investigations for diagnosis of the underlying pathology. We evaluated the validity of these symptom attributes in differentiating neurogenic from vascular claudication. Patients with a diagnosis of lumbar spinal stenosis (LSS) or peripheral vascular disease (PVD) who reported claudication answered 14 questions characterizing their symptoms. We determined the sensitivity, specificity and positive and negative likelihood ratios (PLR and NLR) for neurogenic and vascular claudication for each symptom attribute. We studied 53 patients. The most sensitive symptom attribute to rule out LSS was the absence of "triggering of pain with standing alone" (sensitivity 0.97, NLR 0.050). Pain alleviators and symptom location data showed a weak clinical significance for LSS and PVD. Constellation of symptoms yielded the strongest associations: patients with a positive shopping cart sign whose symptoms were located above the knees, triggered with standing alone and relieved with sitting had a strong likelihood of neurogenic claudication (PLR 13). Patients with symptoms in the calf that were relieved with standing alone had a strong likelihood of vascular claudication (PLR 20.0). The classic symptom attributes used to differentiate neurogenic from vascular claudication are at best weakly valid independently. However, certain constellation of symptoms are much more indicative of etiology. These results can guide general practitioners in their evaluation of and investigation for claudication.
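As a reminder of how the reported quantities relate, the helper below computes sensitivity, specificity, and the positive/negative likelihood ratios from a 2x2 table; the counts are invented and do not reproduce the study's data.

```python
def diagnostic_measures(tp, fn, fp, tn):
    """Sensitivity, specificity and likelihood ratios from a 2x2 table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    plr = sens / (1 - spec)           # positive likelihood ratio
    nlr = (1 - sens) / spec           # negative likelihood ratio
    return sens, spec, plr, nlr

# hypothetical counts for one symptom attribute vs. a reference diagnosis of LSS
sens, spec, plr, nlr = diagnostic_measures(tp=28, fn=1, fp=14, tn=10)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} PLR={plr:.1f} NLR={nlr:.3f}")
```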
NASA Technical Reports Server (NTRS)
Klein, V.
1980-01-01
A frequency domain maximum likelihood method is developed for the estimation of airplane stability and control parameters from measured data. The model of an airplane is represented by a discrete-type steady state Kalman filter with time variables replaced by their Fourier series expansions. The likelihood function of innovations is formulated, and by its maximization with respect to unknown parameters the estimation algorithm is obtained. This algorithm is then simplified to the output error estimation method with the data in the form of transformed time histories, frequency response curves, or spectral and cross-spectral densities. The development is followed by a discussion on the equivalence of the cost function in the time and frequency domains, and on advantages and disadvantages of the frequency domain approach. The algorithm developed is applied in four examples to the estimation of longitudinal parameters of a general aviation airplane using computer generated and measured data in turbulent and still air. The cost functions in the time and frequency domains are shown to be equivalent; therefore, both approaches are complementary and not contradictory. Despite some computational advantages of parameter estimation in the frequency domain, this approach is limited to linear equations of motion with constant coefficients.
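A minimal illustration of output-error estimation on frequency-response data, in the spirit described but not the NASA algorithm itself: a simple first-order transfer function is fitted to simulated frequency-response points by least squares over the real and imaginary residuals. Model, parameters, and noise level are invented.

```python
import numpy as np
from scipy.optimize import least_squares

# hypothetical measured frequency response of a first-order system G(jw) = K / (1 + j w tau)
w = np.logspace(-1, 1, 40)
K_true, tau_true = 2.0, 0.5
rng = np.random.default_rng(0)
G_meas = K_true / (1 + 1j * w * tau_true)
G_meas = G_meas + 0.02 * (rng.standard_normal(40) + 1j * rng.standard_normal(40))

def residuals(p):
    K, tau = p
    G_model = K / (1 + 1j * w * tau)
    err = G_meas - G_model
    return np.concatenate([err.real, err.imag])   # output error in the frequency domain

fit = least_squares(residuals, x0=[1.0, 1.0])
print("estimated K, tau:", fit.x)
```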
Bayesian image reconstruction - The pixon and optimal image modeling
NASA Technical Reports Server (NTRS)
Pina, R. K.; Puetter, R. C.
1993-01-01
In this paper we describe the optimal image model, maximum residual likelihood method (OptMRL) for image reconstruction. OptMRL is a Bayesian image reconstruction technique for removing point-spread function blurring. OptMRL uses both a goodness-of-fit criterion (GOF) and an 'image prior', i.e., a function which quantifies the a priori probability of the image. Unlike standard maximum entropy methods, which typically reconstruct the image on the data pixel grid, OptMRL varies the image model in order to find the optimal functional basis with which to represent the image. We show how an optimal basis for image representation can be selected and in doing so, develop the concept of the 'pixon' which is a generalized image cell from which this basis is constructed. By allowing both the image and the image representation to be variable, the OptMRL method greatly increases the volume of solution space over which the image is optimized. Hence the likelihood of the final reconstructed image is greatly increased. For the goodness-of-fit criterion, OptMRL uses the maximum residual likelihood probability distribution introduced previously by Pina and Puetter (1992). This GOF probability distribution, which is based on the spatial autocorrelation of the residuals, has the advantage that it ensures spatially uncorrelated image reconstruction residuals.
Fu, P; Panneerselvam, A; Clifford, B; Dowlati, A; Ma, P C; Zeng, G; Halmos, B; Leidner, R S
2015-12-01
It is well known that non-small cell lung cancer (NSCLC) is a heterogeneous group of diseases. Previous studies have demonstrated genetic variation among different ethnic groups in the epidermal growth factor receptor (EGFR) in NSCLC. Research by our group and others has recently shown a lower frequency of EGFR mutations in African Americans with NSCLC, as compared to their White counterparts. In this study, we use our original study data of EGFR pathway genetics in African American NSCLC as an example to illustrate that univariate analyses based on aggregation versus partition of data leads to contradictory results, in order to emphasize the importance of controlling statistical confounding. We further investigate analytic approaches in logistic regression for data with separation, as is the case in our example data set, and apply appropriate methods to identify predictors of EGFR mutation. Our simulation shows that with separated or nearly separated data, penalized maximum likelihood (PML) produces estimates with smallest bias and approximately maintains the nominal value with statistical power equal to or better than that from maximum likelihood and exact conditional likelihood methods. Application of the PML method in our example data set shows that race and EGFR-FISH are independently significant predictors of EGFR mutation. © The Author(s) 2011.
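Penalized maximum likelihood for separated logistic data is commonly implemented with the Firth/Jeffreys-prior penalty; assuming that is the flavour of PML meant here, the sketch below maximizes the penalized log-likelihood l(beta) + 0.5*log|I(beta)| on a toy completely separated data set, where ordinary MLE would diverge. All data and names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

# toy data with complete separation: ordinary MLE diverges, the penalized fit does not
X = np.column_stack([np.ones(8), np.array([-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0])])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

def neg_penalized_loglik(beta):
    eta = X @ beta
    p = np.clip(1.0 / (1.0 + np.exp(-eta)), 1e-9, 1 - 1e-9)
    loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    W = p * (1 - p)
    info = X.T @ (X * W[:, None])                    # Fisher information
    penalty = 0.5 * np.log(np.linalg.det(info))      # Jeffreys-prior (Firth) penalty
    return -(loglik + penalty)

fit = minimize(neg_penalized_loglik, x0=np.zeros(2), method="Nelder-Mead")
print("penalized ML estimates:", fit.x)
```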
Holmes, T J; Liu, Y H
1989-11-15
A maximum likelihood based iterative algorithm adapted from nuclear medicine imaging for noncoherent optical imaging was presented in a previous publication with some initial computer-simulation testing. This algorithm is identical in form to that previously derived in a different way by W. H. Richardson in "Bayesian-Based Iterative Method of Image Restoration," J. Opt. Soc. Am. 62, 55-59 (1972) and by L. B. Lucy in "An Iterative Technique for the Rectification of Observed Distributions," Astron. J. 79, 745-765 (1974). Foreseen applications include superresolution and 3-D fluorescence microscopy. This paper presents further simulation testing of this algorithm and a preliminary experiment with a defocused camera. The simulations show quantified resolution improvement as a function of iteration number, and they show qualitatively the trend in limitations on restored resolution when noise is present in the data. Also shown are results of a simulation in restoring missing-cone information for 3-D imaging. Conclusions are in support of the feasibility of using these methods with real systems, while computational cost and timing estimates indicate that it should be realistic to implement these methods. It is suggested in the Appendix that future extensions to the maximum likelihood based derivation of this algorithm will address some of the limitations that are experienced with the nonextended form of the algorithm presented here.
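A minimal 1-D sketch of the Richardson-Lucy / maximum-likelihood iteration the paper builds on, assuming a known PSF; the authors' 3-D fluorescence-microscopy implementation and missing-cone handling are not reproduced, and the test signal is invented.

```python
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(observed, psf, n_iter=50):
    """Maximum-likelihood (Poisson) deblurring via the Richardson-Lucy iteration."""
    estimate = np.full_like(observed, observed.mean())
    psf_mirror = psf[::-1]
    for _ in range(n_iter):
        blurred = fftconvolve(estimate, psf, mode="same")
        ratio = observed / np.maximum(blurred, 1e-12)
        estimate *= fftconvolve(ratio, psf_mirror, mode="same")
    return estimate

# toy example: two point sources blurred by a Gaussian PSF
x = np.zeros(64); x[20] = 5.0; x[40] = 3.0
psf = np.exp(-0.5 * (np.arange(-7, 8) / 2.0) ** 2); psf /= psf.sum()
blurred = fftconvolve(x, psf, mode="same")
restored = richardson_lucy(blurred, psf)
print(restored.argmax(), np.round(restored.max(), 2))
```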
Woodhead, Jeffrey L; Paech, Franziska; Maurer, Martina; Engelhardt, Marc; Schmitt-Hoffmann, Anne H; Spickermann, Jochen; Messner, Simon; Wind, Mathias; Witschi, Anne-Therese; Krähenbühl, Stephan; Siler, Scott Q; Watkins, Paul B; Howell, Brett A
2018-06-07
Elevations of liver enzymes have been observed in clinical trials with BAL30072, a novel antibiotic. In vitro assays have identified potential mechanisms for the observed hepatotoxicity, including electron transport chain (ETC) inhibition and reactive oxygen species (ROS) generation. DILIsym, a quantitative systems pharmacology (QSP) model of drug-induced liver injury, has been used to predict the likelihood that each mechanism explains the observed toxicity. DILIsym was also used to predict the safety margin for a novel BAL30072 dosing scheme; it was predicted to be low. DILIsym was then used to recommend potential modifications to this dosing scheme; weight-adjusted dosing, together with a requirement to assay plasma alanine aminotransferase (ALT) daily and to stop dosing as soon as ALT increases were observed, improved the predicted safety margin of BAL30072 and decreased the predicted likelihood of severe injury. This research demonstrates a potential application for QSP modeling in improving the safety profile of candidate drugs. © 2018 The Authors. Clinical and Translational Science published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.
Vilchez Barreto, Percy M; Gamboa, Ricardo; Santivañez, Saul; O'Neal, Seth E; Muro, Claudio; Lescano, Andrés G; Moyano, Luz-Maria; Gonzálvez, Guillermo; García, Hector H
2017-08-01
Hymenolepis nana, the dwarf tapeworm, is a common intestinal infection of children worldwide. We evaluated infection and risk factor data that were previously collected from 14,761 children aged 2-15 years during a large-scale program in northern Peru. We found that 1,124 of 14,761 children (7.61%) had H. nana infection, a likely underestimate given that only a single stool sample was examined by microscopy for diagnosis. The strongest association with infection was lack of adequate water (adjusted prevalence ratio [aPR] 2.22, 95% confidence interval [CI] 1.82-2.48) and sanitation infrastructure in the house (aPR 1.94, 95% CI 1.64-2.29). One quarter of those tested did not have a bathroom or latrine at home, which doubled their likelihood of infection. Similarly, one quarter did not have piped public water to the house, which also increased the likelihood of infection. Continued efforts to improve access to basic water and sanitation services will likely reduce the burden of infection in children for this and other intestinal infections.
A wavelet-based Bayesian framework for 3D object segmentation in microscopy
NASA Astrophysics Data System (ADS)
Pan, Kangyu; Corrigan, David; Hillebrand, Jens; Ramaswami, Mani; Kokaram, Anil
2012-03-01
In confocal microscopy, target objects are labeled with fluorescent markers in the living specimen, and usually appear with irregular brightness in the observed images. Also, due to the existence of out-of-focus objects in the image, the segmentation of 3-D objects in the stack of image slices captured at different depth levels of the specimen still relies heavily on manual analysis. In this paper, a novel Bayesian model is proposed for segmenting 3-D synaptic objects from a given image stack. In order to solve the irregular brightness and out-of-focus problems, the segmentation model employs a likelihood using the luminance-invariant 'wavelet features' of image objects in the dual-tree complex wavelet domain, as well as a likelihood based on the vertical intensity profile of the image stack in 3-D. Furthermore, a smoothness 'frame' prior based on the a priori knowledge of the connections of the synapses is introduced to the model for enhancing the connectivity of the synapses. As a result, our model can successfully segment the in-focus target synaptic object from a 3-D image stack with irregular brightness.
Indirect detection constraints on s- and t-channel simplified models of dark matter
NASA Astrophysics Data System (ADS)
Carpenter, Linda M.; Colburn, Russell; Goodman, Jessica; Linden, Tim
2016-09-01
Recent Fermi-LAT observations of dwarf spheroidal galaxies in the Milky Way have placed strong limits on the gamma-ray flux from dark matter annihilation. In order to produce the strongest limit on the dark matter annihilation cross section, the observations of each dwarf galaxy have typically been "stacked" in a joint-likelihood analysis, utilizing optical observations to constrain the dark matter density profile in each dwarf. These limits have typically been computed only for singular annihilation final states, such as bb̄ or τ+τ-. In this paper, we generalize this approach by producing an independent joint-likelihood analysis to set constraints on models where the dark matter particle annihilates to multiple final-state fermions. We interpret these results in the context of the most popular simplified models, including those with s- and t-channel dark matter annihilation through scalar and vector mediators. We present our results as constraints on the minimum dark matter mass and the mediator sector parameters. Additionally, we compare our simplified model results to those of effective field theory contact interactions in the high-mass limit.
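A toy illustration of the "stacking" idea: the joint likelihood is the product of per-dwarf Poisson likelihoods evaluated at a common signal normalization, from which an upper limit can be profiled. The counts, backgrounds, and J-factor scalings below are invented and only sketch the structure of the analysis.

```python
import numpy as np
from scipy.stats import poisson

# hypothetical per-dwarf data: observed counts, expected background, relative J-factors
obs   = np.array([12, 7, 30, 5])
bkg   = np.array([11.0, 6.5, 29.0, 4.8])
jfact = np.array([1.0, 0.4, 2.3, 0.7])

def joint_loglike(signal_norm):
    """Joint log-likelihood over dwarfs for a common signal normalization (arbitrary units)."""
    expected = bkg + signal_norm * jfact          # signal scales linearly with <sigma v>
    return poisson.logpmf(obs, expected).sum()

grid = np.linspace(0.0, 10.0, 201)
ll = np.array([joint_loglike(s) for s in grid])
# one-sided 95% upper limit from the profile of the joint likelihood (delta logL = 1.35)
limit = grid[ll >= ll.max() - 1.35][-1]
print("upper limit on the signal normalization:", round(limit, 2))
```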
Bernstein Ratner, Nan; Brown, Barbara; Weber, Christine M.
2017-01-01
Purpose: Childhood stuttering is common but is often outgrown. Children whose stuttering persists experience significant life impacts, calling for a better understanding of what factors may underlie eventual recovery. In previous research, language ability has been shown to differentiate children who stutter (CWS) from children who do not stutter, yet there is an active debate in the field regarding what, if any, language measures may mark eventual recovery versus persistence. In this study, we examined whether growth in productive language performance may better predict the probability of recovery compared to static profiles taken from a single time point. Method: Productive syntax and vocabulary diversity growth rates were calculated for 50 CWS using random coefficient models. Logistic regression models were then used to determine whether growth rates uniquely predict likelihood of recovery, as well as if these rates were predictive over and above currently identified correlates of stuttering onset and recovery. Results: Different linguistic profiles emerged between children who went on to recover versus those who persisted. Children who had steeper productive syntactic growth, but not vocabulary diversity growth, were more likely to recover by study end. Moreover, this effect held after controlling for initial language ability at study onset as well as demographic covariates. Conclusions: Results are discussed in terms of how growth estimates can be incorporated in recommendations for fostering productive language skills among CWS. The need for additional research on language in early stuttering and recovery is suggested. PMID:29049493
NASA Astrophysics Data System (ADS)
Abdallah, H.; Abramowski, A.; Aharonian, F.; Ait Benkhali, F.; Akhperjanian, A. G.; Angüner, E.; Arrieta, M.; Aubert, P.; Backes, M.; Balzer, A.; Barnard, M.; Becherini, Y.; Becker Tjus, J.; Berge, D.; Bernhard, S.; Bernlöhr, K.; Birsin, E.; Blackwell, R.; Böttcher, M.; Boisson, C.; Bolmont, J.; Bordas, P.; Bregeon, J.; Brun, F.; Brun, P.; Bryan, M.; Bulik, T.; Capasso, M.; Carr, J.; Casanova, S.; Chakraborty, N.; Chalme-Calvet, R.; Chaves, R. C. G.; Chen, A.; Chevalier, J.; Chrétien, M.; Colafrancesco, S.; Cologna, G.; Condon, B.; Conrad, J.; Couturier, C.; Cui, Y.; Davids, I. D.; Degrange, B.; Deil, C.; deWilt, P.; Djannati-Ataï, A.; Domainko, W.; Donath, A.; Drury, L. O'C.; Dubus, G.; Dutson, K.; Dyks, J.; Dyrda, M.; Edwards, T.; Egberts, K.; Eger, P.; Ernenwein, J.-P.; Eschbach, S.; Farnier, C.; Fegan, S.; Fernandes, M. V.; Fiasson, A.; Fontaine, G.; Förster, A.; Funk, S.; Füßling, M.; Gabici, S.; Gajdus, M.; Gallant, Y. A.; Garrigoux, T.; Giavitto, G.; Giebels, B.; Glicenstein, J. F.; Gottschall, D.; Goyal, A.; Grondin, M.-H.; Grudzińska, M.; Hadasch, D.; Hahn, J.; Hawkes, J.; Heinzelmann, G.; Henri, G.; Hermann, G.; Hervet, O.; Hillert, A.; Hinton, J. A.; Hofmann, W.; Hoischen, C.; Holler, M.; Horns, D.; Ivascenko, A.; Jacholkowska, A.; Jamrozy, M.; Janiak, M.; Jankowsky, D.; Jankowsky, F.; Jingo, M.; Jogler, T.; Jouvin, L.; Jung-Richardt, I.; Kastendieck, M. A.; Katarzyński, K.; Katz, U.; Kerszberg, D.; Khélifi, B.; Kieffer, M.; King, J.; Klepser, S.; Klochkov, D.; Kluźniak, W.; Kolitzus, D.; Komin, Nu.; Kosack, K.; Krakau, S.; Kraus, M.; Krayzel, F.; Krüger, P. P.; Laffon, H.; Lamanna, G.; Lau, J.; Lees, J.-P.; Lefaucheur, J.; Lefranc, V.; Lemière, A.; Lemoine-Goumard, M.; Lenain, J.-P.; Leser, E.; Lohse, T.; Lorentz, M.; Lui, R.; Lypova, I.; Marandon, V.; Marcowith, A.; Mariaud, C.; Marx, R.; Maurin, G.; Maxted, N.; Mayer, M.; Meintjes, P. J.; Menzler, U.; Meyer, M.; Mitchell, A. M. W.; Moderski, R.; Mohamed, M.; Morâ, K.; Moulin, E.; Murach, T.; de Naurois, M.; Niederwanger, F.; Niemiec, J.; Oakes, L.; Odaka, H.; Ohm, S.; Öttl, S.; Ostrowski, M.; Oya, I.; Padovani, M.; Panter, M.; Parsons, R. D.; Paz Arribas, M.; Pekeur, N. W.; Pelletier, G.; Petrucci, P.-O.; Peyaud, B.; Pita, S.; Poon, H.; Prokhorov, D.; Prokoph, H.; Pühlhofer, G.; Punch, M.; Quirrenbach, A.; Raab, S.; Reimer, A.; Reimer, O.; Renaud, M.; de los Reyes, R.; Rieger, F.; Romoli, C.; Rosier-Lees, S.; Rowell, G.; Rudak, B.; Rulten, C. B.; Sahakian, V.; Salek, D.; Sanchez, D. A.; Santangelo, A.; Sasaki, M.; Schlickeiser, R.; Schüssler, F.; Schulz, A.; Schwanke, U.; Schwemmer, S.; Seyffert, A. S.; Shafi, N.; Simoni, R.; Sol, H.; Spanier, F.; Spengler, G.; Spieß, F.; Stawarz, L.; Steenkamp, R.; Stegmann, C.; Stinzing, F.; Stycz, K.; Sushch, I.; Tavernet, J.-P.; Tavernier, T.; Taylor, A. M.; Terrier, R.; Tluczykont, M.; Trichard, C.; Tuffs, R.; van der Walt, J.; van Eldik, C.; van Soelen, B.; Vasileiadis, G.; Veh, J.; Venter, C.; Viana, A.; Vincent, P.; Vink, J.; Voisin, F.; Völk, H. J.; Vuillaume, T.; Wadiasingh, Z.; Wagner, S. J.; Wagner, P.; Wagner, R. M.; White, R.; Wierzcholska, A.; Willmann, P.; Wörnlein, A.; Wouters, D.; Yang, R.; Zabalza, V.; Zaborov, D.; Zacharias, M.; Zdziarski, A. A.; Zech, A.; Zefi, F.; Ziegler, A.; Żywucka, N.; H. E. S. S. Collaboration
2016-09-01
The inner region of the Milky Way halo harbors a large amount of dark matter (DM). Given its proximity, it is one of the most promising targets to look for DM. We report on a search for the annihilations of DM particles using γ-ray observations towards the inner 300 pc of the Milky Way, with the H.E.S.S. array of ground-based Cherenkov telescopes. The analysis is based on a 2D maximum likelihood method using Galactic Center (GC) data accumulated by H.E.S.S. over the last 10 years (2004-2014), and does not show any significant γ-ray signal above background. Assuming Einasto and Navarro-Frenk-White DM density profiles at the GC, we derive upper limits on the annihilation cross section ⟨σv⟩. These constraints are the strongest obtained so far in the TeV DM mass range and improve upon previous limits by a factor 5. For the Einasto profile, the constraints reach ⟨σv⟩ values of 6 × 10⁻²⁶ cm³ s⁻¹ in the W+W- channel for a DM particle mass of 1.5 TeV, and 2 × 10⁻²⁶ cm³ s⁻¹ in the τ+τ- channel for a 1 TeV mass. For the first time, ground-based γ-ray observations have reached sufficient sensitivity to probe ⟨σv⟩ values expected from the thermal relic density for TeV DM particles.
NASA Technical Reports Server (NTRS)
1979-01-01
A nonlinear, maximum likelihood, parameter identification computer program (NLSCIDNT) is described which evaluates rotorcraft stability and control coefficients from flight test data. The optimal estimates of the parameters (stability and control coefficients) are determined (identified) by minimizing the negative log likelihood cost function. The minimization technique is the Levenberg-Marquardt method, which behaves like the steepest descent method when it is far from the minimum and behaves like the modified Newton-Raphson method when it is nearer the minimum. Twenty-one states and 40 measurement variables are modeled, and any subset may be selected. States which are not integrated may be fixed at an input value, or time history data may be substituted for the state in the equations of motion. Any aerodynamic coefficient may be expressed as a nonlinear polynomial function of selected 'expansion variables'.
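A minimal sketch of the estimation loop described, greatly reduced: a one-state linear model is identified from a simulated time history by minimizing the output-error residuals with a Levenberg-Marquardt solver. The 21-state rotorcraft model, fixed states, and polynomial expansions are not represented; all data and parameter values are invented.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.integrate import odeint

# toy output-error setup: identify damping a and control effectiveness b of a 1-state model
t = np.linspace(0, 10, 201)
u = (t > 1).astype(float)                        # step control input

def simulate(params):
    a, b = params
    def xdot(x, t_):
        return a * x + b * np.interp(t_, t, u)
    return odeint(xdot, 0.0, t)[:, 0]

true = (-0.8, 1.5)
measured = simulate(true) + 0.02 * np.random.default_rng(0).standard_normal(t.size)

# Levenberg-Marquardt on the output-error residuals (the negative log-likelihood
# for iid Gaussian measurement noise is proportional to this sum of squares)
fit = least_squares(lambda p: simulate(p) - measured, x0=(-0.2, 0.5), method="lm")
print("identified (a, b):", fit.x)
```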
Pointwise nonparametric maximum likelihood estimator of stochastically ordered survivor functions
Park, Yongseok; Taylor, Jeremy M. G.; Kalbfleisch, John D.
2012-01-01
In this paper, we consider estimation of survivor functions from groups of observations with right-censored data when the groups are subject to a stochastic ordering constraint. Many methods and algorithms have been proposed to estimate distribution functions under such restrictions, but none have completely satisfactory properties when the observations are censored. We propose a pointwise constrained nonparametric maximum likelihood estimator, which is defined at each time t by the estimates of the survivor functions subject to constraints applied at time t only. We also propose an efficient method to obtain the estimator. The estimator of each constrained survivor function is shown to be nonincreasing in t, and its consistency and asymptotic distribution are established. A simulation study suggests better small and large sample properties than for alternative estimators. An example using prostate cancer data illustrates the method. PMID:23843661
Peyre, Hugo; Leplège, Alain; Coste, Joël
2011-03-01
Missing items are common in quality of life (QoL) questionnaires and present a challenge for research in this field. It remains unclear which of the various methods proposed to deal with missing data performs best in this context. We compared personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques using various realistic simulation scenarios of item missingness in QoL questionnaires constructed within the framework of classical test theory. Samples of 300 and 1,000 subjects were randomly drawn from the 2003 INSEE Decennial Health Survey (of 23,018 subjects representative of the French population and having completed the SF-36) and various patterns of missing data were generated according to three different item non-response rates (3, 6, and 9%) and three types of missing data (Little and Rubin's "missing completely at random," "missing at random," and "missing not at random"). The missing data methods were evaluated in terms of accuracy and precision for the analysis of one descriptive and one association parameter for three different scales of the SF-36. For all item non-response rates and types of missing data, multiple imputation and full information maximum likelihood appeared superior to the personal mean score and especially to hot deck in terms of accuracy and precision; however, the use of personal mean score was associated with insignificant bias (relative bias <2%) in all studied situations. Whereas multiple imputation and full information maximum likelihood are confirmed as reference methods, the personal mean score appears nonetheless appropriate for dealing with items missing from completed SF-36 questionnaires in most situations of routine use. These results can reasonably be extended to other questionnaires constructed according to classical test theory.
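A minimal sketch of the personal mean score rule found adequate in the study: each respondent's scale score is the mean of their answered items, provided enough items were answered (a half-items threshold is assumed here; actual SF-36 scoring has its own rules). Item names and data are invented.

```python
import numpy as np
import pandas as pd

def personal_mean_score(items: pd.DataFrame, min_answered: float = 0.5) -> pd.Series:
    """Scale score = mean of answered items, if the respondent answered enough of them."""
    answered = items.notna().mean(axis=1)            # fraction of items answered
    score = items.mean(axis=1, skipna=True)          # personal mean over answered items
    return score.where(answered >= min_answered)     # otherwise leave the score missing

# three respondents on a hypothetical 4-item subscale (values 1-5), with item non-response
subscale = pd.DataFrame({
    "it1": [4, 2, np.nan],
    "it2": [5, np.nan, np.nan],
    "it3": [4, 3, np.nan],
    "it4": [np.nan, 2, 1],
})
print(personal_mean_score(subscale))
```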
Ringing Artefact Reduction By An Efficient Likelihood Improvement Method
NASA Astrophysics Data System (ADS)
Fuderer, Miha
1989-10-01
In MR imaging, the extent of the acquired spatial frequencies of the object is necessarily finite. The resulting image shows artefacts caused by "truncation" of its Fourier components. These are known as Gibbs artefacts or ringing artefacts. These artefacts are particularly visible when the time-saving reduced acquisition method is used, say, when scanning only the lowest 70% of the 256 data lines. Filtering the data results in loss of resolution. A method is described that estimates the high-frequency data from the low-frequency data lines, with the likelihood of the image as criterion. It is a computationally very efficient method, since it requires practically only two extra Fourier transforms in addition to the normal reconstruction. The results of this method on MR images of human subjects are promising. Evaluations on a 70% acquisition image show about a 20% decrease of the error energy after processing. "Error energy" is defined as the total power of the difference from a 256-data-line reference image. The elimination of ringing artefacts then appears almost complete.
Yang, Defu; Wang, Lin; Chen, Dongmei; Yan, Chenggang; He, Xiaowei; Liang, Jimin; Chen, Xueli
2018-05-17
The reconstruction of bioluminescence tomography (BLT) is severely ill-posed due to insufficient measurements and the diffuse nature of light propagation. A predefined permissible source region (PSR) combined with regularization terms is one common strategy to reduce such ill-posedness. However, the PSR is usually hard to determine and can easily be affected by subjective judgement. Hence, we theoretically developed a filtered maximum likelihood expectation maximization (fMLEM) method for BLT. Our method avoids predefining the PSR and provides a robust and accurate result for global reconstruction. In the method, the simplified spherical harmonics approximation (SP_N) was applied to characterize diffuse light propagation in the medium, and the statistical estimation-based MLEM algorithm combined with a filter function was used to solve the inverse problem. We systematically demonstrated the performance of our method with regular-geometry and digital-mouse simulations and a liver cancer-based in vivo experiment. Graphical abstract: the filtered MLEM-based global reconstruction method for BLT.
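The core MLEM update, stripped of the SP_N forward model and the filter step, for a generic Poisson linear inverse problem y ≈ Ax; the system matrix and source below are invented and stand in for the optical forward model.

```python
import numpy as np

def mlem(A, y, n_iter=100):
    """Maximum-likelihood expectation-maximization for Poisson inverse problems."""
    x = np.ones(A.shape[1])
    sens = A.sum(axis=0)                          # sensitivity term A^T 1
    for _ in range(n_iter):
        proj = A @ x                              # forward projection
        ratio = y / np.maximum(proj, 1e-12)
        x *= (A.T @ ratio) / np.maximum(sens, 1e-12)   # multiplicative EM update
    return x

rng = np.random.default_rng(0)
A = rng.uniform(0.0, 1.0, size=(40, 10))          # toy system matrix
x_true = np.zeros(10); x_true[3] = 20.0           # single bright source
y = rng.poisson(A @ x_true).astype(float)
print(np.round(mlem(A, y), 2))
```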
NASA Astrophysics Data System (ADS)
Mandal, Shyamapada; Santhi, B.; Sridhar, S.; Vinolia, K.; Swaminathan, P.
2017-06-01
In this paper, an online fault detection and classification method is proposed for thermocouples used in nuclear power plants. In the proposed method, fault data are detected by a classification method that separates fault data from normal data. A deep belief network (DBN), a deep learning technique, is applied to classify the fault data. The DBN has a multilayer feature extraction scheme, which is highly sensitive to small variations in the data. Since the classification step alone cannot identify which sensor is faulty, a technique is proposed to identify the faulty sensor from the fault data. Finally, a composite statistical hypothesis test, namely the generalized likelihood ratio test, is applied to compute the fault pattern of the faulty sensor signal based on the magnitude of the fault. The performance of the proposed method is validated with field data obtained from thermocouple sensors of the fast breeder test reactor.
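A minimal sketch of a generalized likelihood ratio test for a mean shift in a single sensor window with known Gaussian noise, as one concrete instance of the GLRT step; the DBN classification stage is not shown, and all numbers are invented.

```python
import numpy as np
from scipy.stats import chi2

def glrt_mean_shift(x, mu0, sigma):
    """2*log GLR for a mean shift away from mu0, Gaussian noise with known sigma."""
    n = x.size
    stat = n * (x.mean() - mu0) ** 2 / sigma ** 2   # MLE of the mean plugged into the LR
    pval = chi2.sf(stat, df=1)                      # chi-square(1) under the no-fault hypothesis
    return stat, pval

rng = np.random.default_rng(0)
normal_window = rng.normal(500.0, 2.0, 50)          # e.g. a healthy thermocouple reading
faulty_window = rng.normal(507.0, 2.0, 50)          # drifted sensor

print(glrt_mean_shift(normal_window, mu0=500.0, sigma=2.0))
print(glrt_mean_shift(faulty_window, mu0=500.0, sigma=2.0))
```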
Estimation Methods for One-Parameter Testlet Models
ERIC Educational Resources Information Center
Jiao, Hong; Wang, Shudong; He, Wei
2013-01-01
This study demonstrated the equivalence between the Rasch testlet model and the three-level one-parameter testlet model and explored the Markov Chain Monte Carlo (MCMC) method for model parameter estimation in WINBUGS. The estimation accuracy from the MCMC method was compared with those from the marginalized maximum likelihood estimation (MMLE)…
Determining the optimal forensic DNA analysis procedure following investigation of sample quality.
Hedell, Ronny; Hedman, Johannes; Mostad, Petter
2018-07-01
Crime scene traces of various types are routinely sent to forensic laboratories for analysis, generally with the aim of addressing questions about the source of the trace. The laboratory may choose to analyse the samples in different ways depending on the type and quality of the sample, the importance of the case and the cost and performance of the available analysis methods. Theoretically well-founded guidelines for the choice of analysis method are, however, lacking in most situations. In this paper, it is shown how such guidelines can be created using Bayesian decision theory. The theory is applied to forensic DNA analysis, showing how the information from the initial qPCR analysis can be utilized. The alternatives for analysis are assumed to be: using a standard short tandem repeat (STR) DNA analysis assay, using the standard assay together with a complementary assay, or cancelling the analysis following quantification. The decision is based on information about the DNA amount and level of DNA degradation of the forensic sample, as well as case circumstances and the cost for analysis. Semi-continuous electropherogram models are used for simulation of DNA profiles and for computation of likelihood ratios. It is shown how tables and graphs, prepared beforehand, can be used to quickly find the optimal decision in forensic casework.
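A toy illustration of the underlying decision rule: given qPCR-informed probabilities that each course of action yields a usable profile, choose the action with the largest expected utility. The probabilities, costs, and utility scale below are invented stand-ins for the paper's precomputed tables.

```python
# hypothetical utilities and probabilities of a usable profile, informed by qPCR results
actions = {
    # action: (cost, P(usable profile))
    "standard STR only":        (1.0, 0.55),
    "standard + complementary": (2.2, 0.80),
    "cancel analysis":          (0.0, 0.00),
}
value_of_profile = 10.0     # case-dependent value of obtaining a usable DNA profile

def expected_utility(cost, p_usable):
    return p_usable * value_of_profile - cost

best = max(actions, key=lambda a: expected_utility(*actions[a]))
for a, (c, p) in actions.items():
    print(f"{a:26s} EU = {expected_utility(c, p):5.2f}")
print("optimal decision:", best)
```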
Non-Contact Technique for Determining the Mechanical Stress in thin Films on Wafers by Profiler
NASA Astrophysics Data System (ADS)
Djuzhev, N. A.; Dedkova, A. A.; E Gusev, E.; Makhiboroda, M. A.; Glagolev, P. Y.
2017-04-01
This paper presents an algorithm, implemented as a Matlab software package, for analysing surface relief in order to calculate mechanical stresses along a selected direction on a wafer. The method allows measurement of the sample in a local area, provides a visual representation of the data, and yields the stress distribution over the wafer surface. Automating the analysis reduces the likelihood of researcher error and saves time when processing results. When several measurements are carried out, a wafer map can be constructed to predict crystal yield. The technique was applied to measure the mechanical stresses of a thermal silicon oxide film on a silicon substrate, and analysis of the results confirmed the objectivity and reliability of the calculations. The method can be used to select optimal deposition parameters for a material. Device-technological (TCAD) simulation was used to determine the process time, temperature and oxidation environment required to obtain a target dielectric film thickness. Thermal stresses in the silicon-silicon oxide system were calculated, and there is good agreement between the numerical simulations and the analytical calculation. It is shown that the origin of the mechanical stress is not limited to the difference between the thermal expansion coefficients of the materials.
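The abstract does not state which stress formula is used; a common choice for estimating film stress from wafer curvature is Stoney's relation, sketched below purely as an assumption, with hypothetical curvature values for a thermal oxide on silicon.

```python
# Stoney-type estimate of film stress from the change of wafer curvature.
# Assumption: the paper does not name this formula. Units are Pa and metres.
def stoney_stress(E_sub, nu_sub, t_sub, t_film, R_before, R_after):
    """Film stress from wafer curvature change, thin-film approximation."""
    curvature_change = 1.0 / R_after - 1.0 / R_before
    return E_sub * t_sub ** 2 * curvature_change / (6.0 * (1.0 - nu_sub) * t_film)

# silicon substrate with a 0.5 um thermal oxide, hypothetical radii of curvature
stress = stoney_stress(E_sub=169e9, nu_sub=0.22, t_sub=525e-6, t_film=0.5e-6,
                       R_before=1e9, R_after=45.0)
print(f"film stress ~ {stress / 1e6:.0f} MPa (sign convention depends on bow direction)")
```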
Viruses detected among sporadic cases of parotitis, United States, 2009-2011.
Barskey, Albert E; Juieng, Phalasy; Whitaker, Brett L; Erdman, Dean D; Oberste, M Steven; Chern, Shur-Wern Wang; Schmid, D Scott; Radford, Kay W; McNall, Rebecca J; Rota, Paul A; Hickman, Carole J; Bellini, William J; Wallace, Gregory S
2013-12-15
Sporadic cases of parotitis are generally assumed to be mumps, which often requires a resource-intensive public health response. This project surveyed the frequency of viruses detected among such cases. During 2009-2011, 8 jurisdictions throughout the United States investigated sporadic cases of parotitis. Epidemiologic information, serum, and buccal and oropharyngeal swabs were collected. Polymerase chain reaction methods were used to detect a panel of viruses. Anti-mumps virus immunoglobulin M (IgM) antibodies were detected using a variety of methods. Of 101 specimens, 38 were positive for a single virus: Epstein-Barr virus (23), human herpesvirus (HHV)-6B (10), human parainfluenza virus (HPIV)-2 (3), HPIV-3 (1), and human bocavirus (1). Mumps virus, enteroviruses (including human parechovirus), HHV-6A, HPIV-1, and adenoviruses were not detected. Early specimen collection did not improve viral detection rate. Mumps IgM was detected in 17% of available specimens. Patients in whom a virus was detected were younger, but no difference was seen by sex or vaccination profile. No seasonal patterns were identified. Considering the timing of specimen collection, serology results, patient vaccination status, and time of year may be helpful in assessing the likelihood that a sporadic case of parotitis without laboratory confirmation is mumps.
Fowler, Michael J.; Howard, Marylesa; Luttman, Aaron; ...
2015-06-03
One of the primary causes of blur in a high-energy X-ray imaging system is the shape and extent of the radiation source, or ‘spot’. It is important to be able to quantify the size of the spot as it provides a lower bound on the recoverable resolution for a radiograph, and penumbral imaging methods – which involve the analysis of blur caused by a structured aperture – can be used to obtain the spot’s spatial profile. We present a Bayesian approach for estimating the spot shape that, unlike variational methods, is robust to the initial choice of parameters. The posterior is obtained from a normal likelihood, which was constructed from a weighted least squares approximation to a Poisson noise model, and prior assumptions that enforce both smoothness and non-negativity constraints. A Markov chain Monte Carlo algorithm is used to obtain samples from the target posterior, and the reconstruction and uncertainty estimates are the computed mean and variance of the samples, respectively. Lastly, synthetic data-sets are used to demonstrate accurate reconstruction, while real data taken with high-energy X-ray imaging systems are used to demonstrate applicability and feasibility.
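A 1-D toy version of the stated ingredients: a weighted least squares (normal) approximation to Poisson counting noise as the likelihood, a non-negativity constraint with a weak prior, and a random-walk Metropolis sampler whose sample mean and variance give the estimate and its uncertainty. The single "width" parameter stands in for the full 2-D spot profile, and all numbers are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy 1-D "spot": Poisson counts around a Gaussian profile of unknown width
x = np.linspace(-5, 5, 41)
true_width = 1.3
counts = rng.poisson(200 * np.exp(-0.5 * (x / true_width) ** 2) + 5)

def log_post(width):
    if width <= 0:                                   # non-negativity constraint
        return -np.inf
    model = 200 * np.exp(-0.5 * (x / width) ** 2) + 5
    w = 1.0 / np.maximum(counts, 1.0)                # weighted least squares approx. to Poisson
    loglike = -0.5 * np.sum(w * (counts - model) ** 2)
    logprior = -0.5 * (width / 5.0) ** 2             # weak scale prior
    return loglike + logprior

# random-walk Metropolis sampler
samples, width = [], 1.0
lp = log_post(width)
for _ in range(5000):
    prop = width + 0.05 * rng.standard_normal()
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        width, lp = prop, lp_prop
    samples.append(width)

post = np.array(samples[1000:])
print(f"posterior width: {post.mean():.2f} +/- {post.std():.2f}")
```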
Tchetgen Tchetgen, Eric
2011-03-01
This article considers the detection and evaluation of genetic effects incorporating gene-environment interaction and independence. Whereas ordinary logistic regression cannot exploit the assumption of gene-environment independence, the proposed approach makes explicit use of the independence assumption to improve estimation efficiency. This method, which uses both cases and controls, fits a constrained retrospective regression in which the genetic variant plays the role of the response variable, and the disease indicator and the environmental exposure are the independent variables. The regression model constrains the association of the environmental exposure with the genetic variant among the controls to be null, thus explicitly encoding the gene-environment independence assumption, which yields substantial gain in accuracy in the evaluation of genetic effects. The proposed retrospective regression approach has several advantages. It is easy to implement with standard software, and it readily accounts for multiple environmental exposures of a polytomous or of a continuous nature, while easily incorporating extraneous covariates. Unlike the profile likelihood approach of Chatterjee and Carroll (Biometrika. 2005;92:399-418), the proposed method does not require a model for the association of a polytomous or continuous exposure with the disease outcome, and, therefore, it is agnostic to the functional form of such a model and completely robust to its possible misspecification.
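For a binary variant and a binary exposure, one simple way to encode the stated constraint is to fit the retrospective logit of G on D and a D-by-E interaction while omitting the main effect of E, so that G and E are independent among controls (D = 0). This is only an illustrative special case of the paper's more general formulation (which handles polytomous or continuous exposures and extra covariates); the data below are simulated.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
E = rng.binomial(1, 0.4, n)                     # environmental exposure
G = rng.binomial(1, 0.3, n)                     # genetic variant, independent of E
risk = 1.0 / (1.0 + np.exp(-(-2.0 + 0.7 * G + 0.5 * E + 0.6 * G * E)))
D = rng.binomial(1, risk)                       # disease indicator

df = pd.DataFrame({"G": G, "E": E, "D": D})

# retrospective regression: G is the response; omitting the main effect of E
# forces the G-E association among controls (D = 0) to be null
fit = smf.logit("G ~ D + D:E", data=df).fit(disp=0)
print(fit.params)
```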
Free energies from dynamic weighted histogram analysis using unbiased Markov state model.
Rosta, Edina; Hummer, Gerhard
2015-01-13
The weighted histogram analysis method (WHAM) is widely used to obtain accurate free energies from biased molecular simulations. However, WHAM free energies can exhibit significant errors if some of the biasing windows are not fully equilibrated. To account for the lack of full equilibration, we develop the dynamic histogram analysis method (DHAM). DHAM uses a global Markov state model to obtain the free energy along the reaction coordinate. A maximum likelihood estimate of the Markov transition matrix is constructed by joint unbiasing of the transition counts from multiple umbrella-sampling simulations along discretized reaction coordinates. The free energy profile is the stationary distribution of the resulting Markov matrix. For this matrix, we derive an explicit approximation that does not require the usual iterative solution of WHAM. We apply DHAM to model systems, a chemical reaction in water treated using quantum-mechanics/molecular-mechanics (QM/MM) simulations, and the Na(+) ion passage through the membrane-embedded ion channel GLIC. We find that DHAM gives accurate free energies even in cases where WHAM fails. In addition, DHAM provides kinetic information, which we here use to assess the extent of convergence in each of the simulation windows. DHAM may also prove useful in the construction of Markov state models from biased simulations in phase-space regions with otherwise low population.
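A minimal sketch of the final step described, assuming the transition counts along the discretized coordinate have already been unbiased: row-normalize the counts into a maximum-likelihood Markov matrix and convert its stationary distribution into a free energy profile, F = -kT ln π. The joint unbiasing across umbrella windows, which is the heart of DHAM, is omitted, and the counts are invented.

```python
import numpy as np

def free_energy_from_counts(counts, kT=1.0):
    """Row-normalize transition counts into a Markov matrix, then F = -kT ln(pi)."""
    T = counts / counts.sum(axis=1, keepdims=True)        # maximum-likelihood transition matrix
    evals, evecs = np.linalg.eig(T.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])      # stationary distribution (eigenvalue 1)
    pi = np.abs(pi) / np.abs(pi).sum()
    return -kT * np.log(pi)

# toy transition counts over 4 bins of a reaction coordinate (assumed already unbiased)
counts = np.array([[80, 20,  0,  0],
                   [15, 60, 25,  0],
                   [ 0, 30, 50, 20],
                   [ 0,  0, 10, 90]], dtype=float)
F = free_energy_from_counts(counts)
print(np.round(F - F.min(), 2))
```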
Campbell, Marnie L; Hewitt, Chad L
2011-07-01
Biofouling of vessels is implicated as a high risk transfer mechanism of non-indigenous marine species (NIMS). Biofouling on international vessels is managed through stringent border control policies, however, domestic biofouling transfers are managed under different policies and legislative arrangements as they cross internal borders. As comprehensive guidelines are developed and increased compliance of international vessels with 'clean hull' expectations increase, vessel movements from port to port will become the focus of biosecurity management. A semi-quantitative port to port biofouling risk assessment is presented that evaluates the presence of known NIMS in the source port and determines the likelihood of transfer based on the NIMS association with biofouling and environmental match between source and receiving ports. This risk assessment method was used to assess the risk profile of a single dredge vessel during three anticipated voyages within Australia, resulting in negligible to low risk outcomes. This finding is contrasted with expectations in the literature, specifically those that suggest slow moving vessels pose a high to extreme risk of transferring NIMS species.
A statistical analysis of murine incisional and excisional acute wound models.
Ansell, David M; Campbell, Laura; Thomason, Helen A; Brass, Andrew; Hardman, Matthew J
2014-01-01
Mice represent the most commonly used species for preclinical in vivo research. While incisional and excisional acute murine wound models are both frequently employed, there is little agreement on which model is optimum. Moreover, current lack of standardization of wounding procedure, analysis time point(s), method of assessment, and the use of individual wounds vs. individual animals as replicates makes it difficult to compare across studies. Here we have profiled secondary intention healing of incisional and excisional wounds within the same animal, assessing multiple parameters to determine the optimal methodology for future studies. We report that histology provides the least variable assessment of healing. Furthermore, histology alone (not planimetry) is able to detect accelerated healing in a castrated mouse model. Perhaps most importantly, we find virtually no correlation between wounds within the same animal, suggesting that use of wound (not animal) biological replicates is perfectly acceptable. Overall, these findings should guide and refine future studies, increasing the likelihood of detecting novel phenotypes while reducing the numbers of animals required for experimentation. © 2014 by the Wound Healing Society.