Sample records for statistical analysis demonstrated

  1. Propensity Score Analysis: An Alternative Statistical Approach for HRD Researchers

    ERIC Educational Resources Information Center

    Keiffer, Greggory L.; Lane, Forrest C.

    2016-01-01

    Purpose: This paper aims to introduce matching in propensity score analysis (PSA) as an alternative statistical approach for researchers looking to make causal inferences using intact groups. Design/methodology/approach: An illustrative example demonstrated the varying results of analysis of variance, analysis of covariance and PSA on a heuristic…

  2. The Importance of Statistical Modeling in Data Analysis and Inference

    ERIC Educational Resources Information Center

    Rollins, Derrick, Sr.

    2017-01-01

    Statistical inference simply means to draw a conclusion based on information that comes from data. Error bars are the most commonly used tool for data analysis and inference in chemical engineering data studies. This work demonstrates, using common types of data collection studies, the importance of specifying the statistical model for sound…

  3. The Shock and Vibration Digest. Volume 16, Number 3

    DTIC Science & Technology

    1984-03-01

    Fluid-induced Statistical Energy Analysis Method excitation, Wind tunnel testing V.R. Miller and L.L. Faulkner Flight Dynamics Lab., Air Force...84475 wall by the statistical energy analysis (SEA) method. The fuselage structure is represented as a series of curved, iso- Probabilistic Fracture...heavy are demonstrated in three-dimensional form. floor, a statistical energy analysis (SEA) model is presented. Only structural systems (i.e., no

  4. An information-theoretic approach to the modeling and analysis of whole-genome bisulfite sequencing data.

    PubMed

    Jenkinson, Garrett; Abante, Jordi; Feinberg, Andrew P; Goutsias, John

    2018-03-07

    DNA methylation is a stable form of epigenetic memory used by cells to control gene expression. Whole genome bisulfite sequencing (WGBS) has emerged as a gold-standard experimental technique for studying DNA methylation by producing high resolution genome-wide methylation profiles. Statistical modeling and analysis is employed to computationally extract and quantify information from these profiles in an effort to identify regions of the genome that demonstrate crucial or aberrant epigenetic behavior. However, the performance of most currently available methods for methylation analysis is hampered by their inability to directly account for statistical dependencies between neighboring methylation sites, thus ignoring significant information available in WGBS reads. We present a powerful information-theoretic approach for genome-wide modeling and analysis of WGBS data based on the 1D Ising model of statistical physics. This approach takes into account correlations in methylation by utilizing a joint probability model that encapsulates all information available in WGBS methylation reads and produces accurate results even when applied on single WGBS samples with low coverage. Using the Shannon entropy, our approach provides a rigorous quantification of methylation stochasticity in individual WGBS samples genome-wide. Furthermore, it utilizes the Jensen-Shannon distance to evaluate differences in methylation distributions between a test and a reference sample. Differential performance assessment using simulated and real human lung normal/cancer data demonstrate a clear superiority of our approach over DSS, a recently proposed method for WGBS data analysis. Critically, these results demonstrate that marginal methods become statistically invalid when correlations are present in the data. This contribution demonstrates clear benefits and the necessity of modeling joint probability distributions of methylation using the 1D Ising model of statistical physics and of quantifying methylation stochasticity using concepts from information theory. By employing this methodology, substantial improvement of DNA methylation analysis can be achieved by effectively taking into account the massive amount of statistical information available in WGBS data, which is largely ignored by existing methods.

  5. Comparing Methods for Item Analysis: The Impact of Different Item-Selection Statistics on Test Difficulty

    ERIC Educational Resources Information Center

    Jones, Andrew T.

    2011-01-01

    Practitioners often depend on item analysis to select items for exam forms and have a variety of options available to them. These include the point-biserial correlation, the agreement statistic, the B index, and the phi coefficient. Although research has demonstrated that these statistics can be useful for item selection, no research as of yet has…

  6. Demonstration of a software design and statistical analysis methodology with application to patient outcomes data sets

    PubMed Central

    Mayo, Charles; Conners, Steve; Warren, Christopher; Miller, Robert; Court, Laurence; Popple, Richard

    2013-01-01

    Purpose: With emergence of clinical outcomes databases as tools utilized routinely within institutions, comes need for software tools to support automated statistical analysis of these large data sets and intrainstitutional exchange from independent federated databases to support data pooling. In this paper, the authors present a design approach and analysis methodology that addresses both issues. Methods: A software application was constructed to automate analysis of patient outcomes data using a wide range of statistical metrics, by combining use of C#.Net and R code. The accuracy and speed of the code was evaluated using benchmark data sets. Results: The approach provides data needed to evaluate combinations of statistical measurements for ability to identify patterns of interest in the data. Through application of the tools to a benchmark data set for dose-response threshold and to SBRT lung data sets, an algorithm was developed that uses receiver operator characteristic curves to identify a threshold value and combines use of contingency tables, Fisher exact tests, Welch t-tests, and Kolmogorov-Smirnov tests to filter the large data set to identify values demonstrating dose-response. Kullback-Leibler divergences were used to provide additional confirmation. Conclusions: The work demonstrates the viability of the design approach and the software tool for analysis of large data sets. PMID:24320426

  7. Demonstration of a software design and statistical analysis methodology with application to patient outcomes data sets.

    PubMed

    Mayo, Charles; Conners, Steve; Warren, Christopher; Miller, Robert; Court, Laurence; Popple, Richard

    2013-11-01

    With emergence of clinical outcomes databases as tools utilized routinely within institutions, comes need for software tools to support automated statistical analysis of these large data sets and intrainstitutional exchange from independent federated databases to support data pooling. In this paper, the authors present a design approach and analysis methodology that addresses both issues. A software application was constructed to automate analysis of patient outcomes data using a wide range of statistical metrics, by combining use of C#.Net and R code. The accuracy and speed of the code was evaluated using benchmark data sets. The approach provides data needed to evaluate combinations of statistical measurements for ability to identify patterns of interest in the data. Through application of the tools to a benchmark data set for dose-response threshold and to SBRT lung data sets, an algorithm was developed that uses receiver operator characteristic curves to identify a threshold value and combines use of contingency tables, Fisher exact tests, Welch t-tests, and Kolmogorov-Smirnov tests to filter the large data set to identify values demonstrating dose-response. Kullback-Leibler divergences were used to provide additional confirmation. The work demonstrates the viability of the design approach and the software tool for analysis of large data sets.

  8. Measuring the statistical validity of summary meta-analysis and meta-regression results for use in clinical practice.

    PubMed

    Willis, Brian H; Riley, Richard D

    2017-09-20

    An important question for clinicians appraising a meta-analysis is: are the findings likely to be valid in their own practice-does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity-where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how we may test meta-analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

  9. Measuring the statistical validity of summary meta‐analysis and meta‐regression results for use in clinical practice

    PubMed Central

    Riley, Richard D.

    2017-01-01

    An important question for clinicians appraising a meta‐analysis is: are the findings likely to be valid in their own practice—does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity—where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple (‘leave‐one‐out’) cross‐validation technique, we demonstrate how we may test meta‐analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta‐analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta‐analysis and a tailored meta‐regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within‐study variance, between‐study variance, study sample size, and the number of studies in the meta‐analysis. Finally, we apply Vn to two published meta‐analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta‐analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28620945

  10. DNA Fingerprinting in a Forensic Teaching Experiment

    ERIC Educational Resources Information Center

    Wagoner, Stacy A.; Carlson, Kimberly A.

    2008-01-01

    This article presents an experiment designed to provide students, in a classroom laboratory setting, a hands-on demonstration of the steps used in DNA forensic analysis by performing DNA extraction, DNA fingerprinting, and statistical analysis of the data. This experiment demonstrates how DNA fingerprinting is performed and how long it takes. It…

  11. Statistics Poster Challenge for Schools

    ERIC Educational Resources Information Center

    Payne, Brad; Freeman, Jenny; Stillman, Eleanor

    2013-01-01

    The analysis and interpretation of data are important life skills. A poster challenge for schoolchildren provides an innovative outlet for these skills and demonstrates their relevance to daily life. We discuss our Statistics Poster Challenge and the lessons we have learned.

  12. Evaluation of adding item-response theory analysis for evaluation of the European Board of Ophthalmology Diploma examination.

    PubMed

    Mathysen, Danny G P; Aclimandos, Wagih; Roelant, Ella; Wouters, Kristien; Creuzot-Garcher, Catherine; Ringens, Peter J; Hawlina, Marko; Tassignon, Marie-José

    2013-11-01

    To investigate whether introduction of item-response theory (IRT) analysis, in parallel to the 'traditional' statistical analysis methods available for performance evaluation of multiple T/F items as used in the European Board of Ophthalmology Diploma (EBOD) examination, has proved beneficial, and secondly, to study whether the overall assessment performance of the current written part of EBOD is sufficiently high (KR-20≥ 0.90) to be kept as examination format in future EBOD editions. 'Traditional' analysis methods for individual MCQ item performance comprise P-statistics, Rit-statistics and item discrimination, while overall reliability is evaluated through KR-20 for multiple T/F items. The additional set of statistical analysis methods for the evaluation of EBOD comprises mainly IRT analysis. These analysis techniques are used to monitor whether the introduction of negative marking for incorrect answers (since EBOD 2010) has a positive influence on the statistical performance of EBOD as a whole and its individual test items in particular. Item-response theory analysis demonstrated that item performance parameters should not be evaluated individually, but should be related to one another. Before the introduction of negative marking, the overall EBOD reliability (KR-20) was good though with room for improvement (EBOD 2008: 0.81; EBOD 2009: 0.78). After the introduction of negative marking, the overall reliability of EBOD improved significantly (EBOD 2010: 0.92; EBOD 2011:0.91; EBOD 2012: 0.91). Although many statistical performance parameters are available to evaluate individual items, our study demonstrates that the overall reliability assessment remains the only crucial parameter to be evaluated allowing comparison. While individual item performance analysis is worthwhile to undertake as secondary analysis, drawing final conclusions seems to be more difficult. Performance parameters need to be related, as shown by IRT analysis. Therefore, IRT analysis has proved beneficial for the statistical analysis of EBOD. Introduction of negative marking has led to a significant increase in the reliability (KR-20 > 0.90), indicating that the current examination format can be kept for future EBOD examinations. © 2013 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.

  13. Is Quality/Effectiveness An Empirically Demonstrable School Attribute? Statistical Aids for Determining Appropriate Levels of Analysis.

    ERIC Educational Resources Information Center

    Griffith, James

    2002-01-01

    Describes and demonstrates analytical techniques used in organizational psychology and contemporary multilevel analysis. Using these analytic techniques, examines the relationship between educational outcomes and the school environment. Finds that at least some indicators might be represented as school-level phenomena. Results imply that the…

  14. Multivariate statistical analysis: Principles and applications to coorbital streams of meteorite falls

    NASA Technical Reports Server (NTRS)

    Wolf, S. F.; Lipschutz, M. E.

    1993-01-01

    Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 - 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that a totally different criterion, labile trace element contents - hence thermal histories - or 13 Cluster 1 meteorites are distinguishable from those of 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.

  15. Towards Solving the Mixing Problem in the Decomposition of Geophysical Time Series by Independent Component Analysis

    NASA Technical Reports Server (NTRS)

    Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)

    2000-01-01

    The use of the Principal Component Analysis technique for the analysis of geophysical time series has been questioned in particular for its tendency to extract components that mix several physical phenomena even when the signal is just their linear sum. We demonstrate with a data simulation experiment that the Independent Component Analysis, a recently developed technique, is able to solve this problem. This new technique requires the statistical independence of components, a stronger constraint, that uses higher-order statistics, instead of the classical decorrelation a weaker constraint, that uses only second-order statistics. Furthermore, ICA does not require additional a priori information such as the localization constraint used in Rotational Techniques.

  16. Statistical methods for quantitative mass spectrometry proteomic experiments with labeling.

    PubMed

    Oberg, Ann L; Mahoney, Douglas W

    2012-01-01

    Mass Spectrometry utilizing labeling allows multiple specimens to be subjected to mass spectrometry simultaneously. As a result, between-experiment variability is reduced. Here we describe use of fundamental concepts of statistical experimental design in the labeling framework in order to minimize variability and avoid biases. We demonstrate how to export data in the format that is most efficient for statistical analysis. We demonstrate how to assess the need for normalization, perform normalization, and check whether it worked. We describe how to build a model explaining the observed values and test for differential protein abundance along with descriptive statistics and measures of reliability of the findings. Concepts are illustrated through the use of three case studies utilizing the iTRAQ 4-plex labeling protocol.

  17. Atmospheric Tracer Inverse Modeling Using Markov Chain Monte Carlo (MCMC)

    NASA Astrophysics Data System (ADS)

    Kasibhatla, P.

    2004-12-01

    In recent years, there has been an increasing emphasis on the use of Bayesian statistical estimation techniques to characterize the temporal and spatial variability of atmospheric trace gas sources and sinks. The applications have been varied in terms of the particular species of interest, as well as in terms of the spatial and temporal resolution of the estimated fluxes. However, one common characteristic has been the use of relatively simple statistical models for describing the measurement and chemical transport model error statistics and prior source statistics. For example, multivariate normal probability distribution functions (pdfs) are commonly used to model these quantities and inverse source estimates are derived for fixed values of pdf paramaters. While the advantage of this approach is that closed form analytical solutions for the a posteriori pdfs of interest are available, it is worth exploring Bayesian analysis approaches which allow for a more general treatment of error and prior source statistics. Here, we present an application of the Markov Chain Monte Carlo (MCMC) methodology to an atmospheric tracer inversion problem to demonstrate how more gereral statistical models for errors can be incorporated into the analysis in a relatively straightforward manner. The MCMC approach to Bayesian analysis, which has found wide application in a variety of fields, is a statistical simulation approach that involves computing moments of interest of the a posteriori pdf by efficiently sampling this pdf. The specific inverse problem that we focus on is the annual mean CO2 source/sink estimation problem considered by the TransCom3 project. TransCom3 was a collaborative effort involving various modeling groups and followed a common modeling and analysis protocoal. As such, this problem provides a convenient case study to demonstrate the applicability of the MCMC methodology to atmospheric tracer source/sink estimation problems.

  18. A Statistical Analysis of Brain Morphology Using Wild Bootstrapping

    PubMed Central

    Ibrahim, Joseph G.; Tang, Niansheng; Rowe, Daniel B.; Hao, Xuejun; Bansal, Ravi; Peterson, Bradley S.

    2008-01-01

    Methods for the analysis of brain morphology, including voxel-based morphology and surface-based morphometries, have been used to detect associations between brain structure and covariates of interest, such as diagnosis, severity of disease, age, IQ, and genotype. The statistical analysis of morphometric measures usually involves two statistical procedures: 1) invoking a statistical model at each voxel (or point) on the surface of the brain or brain subregion, followed by mapping test statistics (e.g., t test) or their associated p values at each of those voxels; 2) correction for the multiple statistical tests conducted across all voxels on the surface of the brain region under investigation. We propose the use of new statistical methods for each of these procedures. We first use a heteroscedastic linear model to test the associations between the morphological measures at each voxel on the surface of the specified subregion (e.g., cortical or subcortical surfaces) and the covariates of interest. Moreover, we develop a robust test procedure that is based on a resampling method, called wild bootstrapping. This procedure assesses the statistical significance of the associations between a measure of given brain structure and the covariates of interest. The value of this robust test procedure lies in its computationally simplicity and in its applicability to a wide range of imaging data, including data from both anatomical and functional magnetic resonance imaging (fMRI). Simulation studies demonstrate that this robust test procedure can accurately control the family-wise error rate. We demonstrate the application of this robust test procedure to the detection of statistically significant differences in the morphology of the hippocampus over time across gender groups in a large sample of healthy subjects. PMID:17649909

  19. Effects of Heterogeniety on Spatial Pattern Analysis of Wild Pistachio Trees in Zagros Woodlands, Iran

    NASA Astrophysics Data System (ADS)

    Erfanifard, Y.; Rezayan, F.

    2014-10-01

    Vegetation heterogeneity biases second-order summary statistics, e.g., Ripley's K-function, applied for spatial pattern analysis in ecology. Second-order investigation based on Ripley's K-function and related statistics (i.e., L- and pair correlation function g) is widely used in ecology to develop hypothesis on underlying processes by characterizing spatial patterns of vegetation. The aim of this study was to demonstrate effects of underlying heterogeneity of wild pistachio (Pistacia atlantica Desf.) trees on the second-order summary statistics of point pattern analysis in a part of Zagros woodlands, Iran. The spatial distribution of 431 wild pistachio trees was accurately mapped in a 40 ha stand in the Wild Pistachio & Almond Research Site, Fars province, Iran. Three commonly used second-order summary statistics (i.e., K-, L-, and g-functions) were applied to analyse their spatial pattern. The two-sample Kolmogorov-Smirnov goodness-of-fit test showed that the observed pattern significantly followed an inhomogeneous Poisson process null model in the study region. The results also showed that heterogeneous pattern of wild pistachio trees biased the homogeneous form of K-, L-, and g-functions, demonstrating a stronger aggregation of the trees at the scales of 0-50 m than actually existed and an aggregation at scales of 150-200 m, while regularly distributed. Consequently, we showed that heterogeneity of point patterns may bias the results of homogeneous second-order summary statistics and we also suggested applying inhomogeneous summary statistics with related null models for spatial pattern analysis of heterogeneous vegetations.

  20. Application of Transformations in Parametric Inference

    ERIC Educational Resources Information Center

    Brownstein, Naomi; Pensky, Marianna

    2008-01-01

    The objective of the present paper is to provide a simple approach to statistical inference using the method of transformations of variables. We demonstrate performance of this powerful tool on examples of constructions of various estimation procedures, hypothesis testing, Bayes analysis and statistical inference for the stress-strength systems.…

  1. Statistical Analysis for Collision-free Boson Sampling.

    PubMed

    Huang, He-Liang; Zhong, Han-Sen; Li, Tan; Li, Feng-Guang; Fu, Xiang-Qun; Zhang, Shuo; Wang, Xiang; Bao, Wan-Su

    2017-11-10

    Boson sampling is strongly believed to be intractable for classical computers but solvable with photons in linear optics, which raises widespread concern as a rapid way to demonstrate the quantum supremacy. However, due to its solution is mathematically unverifiable, how to certify the experimental results becomes a major difficulty in the boson sampling experiment. Here, we develop a statistical analysis scheme to experimentally certify the collision-free boson sampling. Numerical simulations are performed to show the feasibility and practicability of our scheme, and the effects of realistic experimental conditions are also considered, demonstrating that our proposed scheme is experimentally friendly. Moreover, our broad approach is expected to be generally applied to investigate multi-particle coherent dynamics beyond the boson sampling.

  2. Rotation of EOFs by the Independent Component Analysis: Towards A Solution of the Mixing Problem in the Decomposition of Geophysical Time Series

    NASA Technical Reports Server (NTRS)

    Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)

    2001-01-01

    The Independent Component Analysis is a recently developed technique for component extraction. This new method requires the statistical independence of the extracted components, a stronger constraint that uses higher-order statistics, instead of the classical decorrelation, a weaker constraint that uses only second-order statistics. This technique has been used recently for the analysis of geophysical time series with the goal of investigating the causes of variability in observed data (i.e. exploratory approach). We demonstrate with a data simulation experiment that, if initialized with a Principal Component Analysis, the Independent Component Analysis performs a rotation of the classical PCA (or EOF) solution. This rotation uses no localization criterion like other Rotation Techniques (RT), only the global generalization of decorrelation by statistical independence is used. This rotation of the PCA solution seems to be able to solve the tendency of PCA to mix several physical phenomena, even when the signal is just their linear sum.

  3. Older and Younger Workers: The Equalling Effects of Health

    ERIC Educational Resources Information Center

    Beck, Vanessa; Quinn, Martin

    2012-01-01

    Purpose: The purpose of this paper is to consider the statistical evidence on the effects that ill health has on labour market participation and opportunities for younger and older workers in the East Midlands (UK). Design/methodology/approach: A statistical analysis of Labour Force Survey data was undertaken to demonstrate that health issues…

  4. Analysis of Doppler radar windshear data

    NASA Technical Reports Server (NTRS)

    Williams, F.; Mckinney, P.; Ozmen, F.

    1989-01-01

    The objective of this analysis is to process Lincoln Laboratory Doppler radar data obtained during FLOWS testing at Huntsville, Alabama, in the summer of 1986, to characterize windshear events. The processing includes plotting velocity and F-factor profiles, histogram analysis to summarize statistics, and correlation analysis to demonstrate any correlation between different data fields.

  5. Commentary: Using Potential Outcomes to Understand Causal Mediation Analysis

    ERIC Educational Resources Information Center

    Imai, Kosuke; Jo, Booil; Stuart, Elizabeth A.

    2011-01-01

    In this commentary, we demonstrate how the potential outcomes framework can help understand the key identification assumptions underlying causal mediation analysis. We show that this framework can lead to the development of alternative research design and statistical analysis strategies applicable to the longitudinal data settings considered by…

  6. The Application of Bayesian Analysis to Issues in Developmental Research

    ERIC Educational Resources Information Center

    Walker, Lawrence J.; Gustafson, Paul; Frimer, Jeremy A.

    2007-01-01

    This article reviews the concepts and methods of Bayesian statistical analysis, which can offer innovative and powerful solutions to some challenging analytical problems that characterize developmental research. In this article, we demonstrate the utility of Bayesian analysis, explain its unique adeptness in some circumstances, address some…

  7. Pathway analysis with next-generation sequencing data.

    PubMed

    Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao

    2015-04-01

    Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. Also the power of the SFPCA-based statistic and 22 additional existing statistics are evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic has much smaller P-values to identify pathway association than other existing methods.

  8. Demonstration of Wavelet Techniques in the Spectral Analysis of Bypass Transition Data

    NASA Technical Reports Server (NTRS)

    Lewalle, Jacques; Ashpis, David E.; Sohn, Ki-Hyeon

    1997-01-01

    A number of wavelet-based techniques for the analysis of experimental data are developed and illustrated. A multiscale analysis based on the Mexican hat wavelet is demonstrated as a tool for acquiring physical and quantitative information not obtainable by standard signal analysis methods. Experimental data for the analysis came from simultaneous hot-wire velocity traces in a bypass transition of the boundary layer on a heated flat plate. A pair of traces (two components of velocity) at one location was excerpted. A number of ensemble and conditional statistics related to dominant time scales for energy and momentum transport were calculated. The analysis revealed a lack of energy-dominant time scales inside turbulent spots but identified transport-dominant scales inside spots that account for the largest part of the Reynolds stress. Momentum transport was much more intermittent than were energetic fluctuations. This work is the first step in a continuing study of the spatial evolution of these scale-related statistics, the goal being to apply the multiscale analysis results to improve the modeling of transitional and turbulent industrial flows.

  9. Vitamin D and depression: a systematic review and meta-analysis comparing studies with and without biological flaws.

    PubMed

    Spedding, Simon

    2014-04-11

    Efficacy of Vitamin D supplements in depression is controversial, awaiting further literature analysis. Biological flaws in primary studies is a possible reason meta-analyses of Vitamin D have failed to demonstrate efficacy. This systematic review and meta-analysis of Vitamin D and depression compared studies with and without biological flaws. The systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The literature search was undertaken through four databases for randomized controlled trials (RCTs). Studies were critically appraised for methodological quality and biological flaws, in relation to the hypothesis and study design. Meta-analyses were performed for studies according to the presence of biological flaws. The 15 RCTs identified provide a more comprehensive evidence-base than previous systematic reviews; methodological quality of studies was generally good and methodology was diverse. A meta-analysis of all studies without flaws demonstrated a statistically significant improvement in depression with Vitamin D supplements (+0.78 CI +0.24, +1.27). Studies with biological flaws were mainly inconclusive, with the meta-analysis demonstrating a statistically significant worsening in depression by taking Vitamin D supplements (-1.1 CI -0.7, -1.5). Vitamin D supplementation (≥800 I.U. daily) was somewhat favorable in the management of depression in studies that demonstrate a change in vitamin levels, and the effect size was comparable to that of anti-depressant medication.

  10. Single-case research design in pediatric psychology: considerations regarding data analysis.

    PubMed

    Cohen, Lindsey L; Feinstein, Amanda; Masuda, Akihiko; Vowles, Kevin E

    2014-03-01

    Single-case research allows for an examination of behavior and can demonstrate the functional relation between intervention and outcome in pediatric psychology. This review highlights key assumptions, methodological and design considerations, and options for data analysis. Single-case methodology and guidelines are reviewed with an in-depth focus on visual and statistical analyses. Guidelines allow for the careful evaluation of design quality and visual analysis. A number of statistical techniques have been introduced to supplement visual analysis, but to date, there is no consensus on their recommended use in single-case research design. Single-case methodology is invaluable for advancing pediatric psychology science and practice, and guidelines have been introduced to enhance the consistency, validity, and reliability of these studies. Experts generally agree that visual inspection is the optimal method of analysis in single-case design; however, statistical approaches are becoming increasingly evaluated and used to augment data interpretation.

  11. Selecting the "Best" Factor Structure and Moving Measurement Validation Forward: An Illustration.

    PubMed

    Schmitt, Thomas A; Sass, Daniel A; Chappelle, Wayne; Thompson, William

    2018-04-09

    Despite the broad literature base on factor analysis best practices, research seeking to evaluate a measure's psychometric properties frequently fails to consider or follow these recommendations. This leads to incorrect factor structures, numerous and often overly complex competing factor models and, perhaps most harmful, biased model results. Our goal is to demonstrate a practical and actionable process for factor analysis through (a) an overview of six statistical and psychometric issues and approaches to be aware of, investigate, and report when engaging in factor structure validation, along with a flowchart for recommended procedures to understand latent factor structures; (b) demonstrating these issues to provide a summary of the updated Posttraumatic Stress Disorder Checklist (PCL-5) factor models and a rationale for validation; and (c) conducting a comprehensive statistical and psychometric validation of the PCL-5 factor structure to demonstrate all the issues we described earlier. Considering previous research, the PCL-5 was evaluated using a sample of 1,403 U.S. Air Force remotely piloted aircraft operators with high levels of battlefield exposure. Previously proposed PCL-5 factor structures were not supported by the data, but instead a bifactor model is arguably more statistically appropriate.

  12. Pyrotechnic Shock Analysis Using Statistical Energy Analysis

    DTIC Science & Technology

    2015-10-23

    SEA subsystems. A couple of validation examples are provided to demonstrate the new approach. KEY WORDS : Peak Ratio, phase perturbation...Ballistic Shock Prediction Models and Techniques for Use in the Crusader Combat Vehicle Program,” 11th Annual US Army Ground Vehicle Survivability

  13. Assessment of statistical uncertainty in the quantitative analysis of solid samples in motion using laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Cabalín, L. M.; González, A.; Ruiz, J.; Laserna, J. J.

    2010-08-01

    Statistical uncertainty in the quantitative analysis of solid samples in motion by laser-induced breakdown spectroscopy (LIBS) has been assessed. For this purpose, a LIBS demonstrator was designed and constructed in our laboratory. The LIBS system consisted of a laboratory-scale conveyor belt, a compact optical module and a Nd:YAG laser operating at 532 nm. The speed of the conveyor belt was variable and could be adjusted up to a maximum speed of 2 m s - 1 . Statistical uncertainty in the analytical measurements was estimated in terms of precision (reproducibility and repeatability) and accuracy. The results obtained by LIBS on shredded scrap samples under real conditions have demonstrated that the analytical precision and accuracy of LIBS is dependent on the sample geometry, position on the conveyor belt and surface cleanliness. Flat, relatively clean scrap samples exhibited acceptable reproducibility and repeatability; by contrast, samples with an irregular shape or a dirty surface exhibited a poor relative standard deviation.

  14. Exploratory Visual Analysis of Statistical Results from Microarray Experiments Comparing High and Low Grade Glioma

    PubMed Central

    Reif, David M.; Israel, Mark A.; Moore, Jason H.

    2007-01-01

    The biological interpretation of gene expression microarray results is a daunting challenge. For complex diseases such as cancer, wherein the body of published research is extensive, the incorporation of expert knowledge provides a useful analytical framework. We have previously developed the Exploratory Visual Analysis (EVA) software for exploring data analysis results in the context of annotation information about each gene, as well as biologically relevant groups of genes. We present EVA as a flexible combination of statistics and biological annotation that provides a straightforward visual interface for the interpretation of microarray analyses of gene expression in the most commonly occuring class of brain tumors, glioma. We demonstrate the utility of EVA for the biological interpretation of statistical results by analyzing publicly available gene expression profiles of two important glial tumors. The results of a statistical comparison between 21 malignant, high-grade glioblastoma multiforme (GBM) tumors and 19 indolent, low-grade pilocytic astrocytomas were analyzed using EVA. By using EVA to examine the results of a relatively simple statistical analysis, we were able to identify tumor class-specific gene expression patterns having both statistical and biological significance. Our interactive analysis highlighted the potential importance of genes involved in cell cycle progression, proliferation, signaling, adhesion, migration, motility, and structure, as well as candidate gene loci on a region of Chromosome 7 that has been implicated in glioma. Because EVA does not require statistical or computational expertise and has the flexibility to accommodate any type of statistical analysis, we anticipate EVA will prove a useful addition to the repertoire of computational methods used for microarray data analysis. EVA is available at no charge to academic users and can be found at http://www.epistasis.org. PMID:19390666

  15. Trend analysis and selected summary statistics of annual mean streamflow for 38 selected long-term U.S. Geological Survey streamgages in Texas, water years 1916-2012

    USGS Publications Warehouse

    Asquith, William H.; Barbie, Dana L.

    2014-01-01

    Selected summary statistics (L-moments) and estimates of respective sampling variances were computed for the 35 streamgages lacking statistically significant trends. From the L-moments and estimated sampling variances, weighted means or regional values were computed for each L-moment. An example application is included demonstrating how the L-moments could be used to evaluate the magnitude and frequency of annual mean streamflow.

  16. Statistical analysis for understanding and predicting battery degradations in real-life electric vehicle use

    NASA Astrophysics Data System (ADS)

    Barré, Anthony; Suard, Frédéric; Gérard, Mathias; Montaru, Maxime; Riu, Delphine

    2014-01-01

    This paper describes the statistical analysis of recorded data parameters of electrical battery ageing during electric vehicle use. These data permit traditional battery ageing investigation based on the evolution of the capacity fade and resistance raise. The measured variables are examined in order to explain the correlation between battery ageing and operating conditions during experiments. Such study enables us to identify the main ageing factors. Then, detailed statistical dependency explorations present the responsible factors on battery ageing phenomena. Predictive battery ageing models are built from this approach. Thereby results demonstrate and quantify a relationship between variables and battery ageing global observations, and also allow accurate battery ageing diagnosis through predictive models.

  17. The potential of statistical shape modelling for geometric morphometric analysis of human teeth in archaeological research

    PubMed Central

    Fernee, Christianne; Browne, Martin; Zakrzewski, Sonia

    2017-01-01

    This paper introduces statistical shape modelling (SSM) for use in osteoarchaeology research. SSM is a full field, multi-material analytical technique, and is presented as a supplementary geometric morphometric (GM) tool. Lower mandibular canines from two archaeological populations and one modern population were sampled, digitised using micro-CT, aligned, registered to a baseline and statistically modelled using principal component analysis (PCA). Sample material properties were incorporated as a binary enamel/dentin parameter. Results were assessed qualitatively and quantitatively using anatomical landmarks. Finally, the technique’s application was demonstrated for inter-sample comparison through analysis of the principal component (PC) weights. It was found that SSM could provide high detail qualitative and quantitative insight with respect to archaeological inter- and intra-sample variability. This technique has value for archaeological, biomechanical and forensic applications including identification, finite element analysis (FEA) and reconstruction from partial datasets. PMID:29216199

  18. Statistics on gene-based laser speckles with a small number of scatterers: implications for the detection of polymorphism in the Chlamydia trachomatis omp1 gene

    NASA Astrophysics Data System (ADS)

    Ulyanov, Sergey S.; Ulianova, Onega V.; Zaytsev, Sergey S.; Saltykov, Yury V.; Feodorova, Valentina A.

    2018-04-01

    The transformation mechanism for a nucleotide sequence of the Chlamydia trachomatis gene into a speckle pattern has been considered. The first and second-order statistics of gene-based speckles have been analyzed. It has been demonstrated that gene-based speckles do not obey Gaussian statistics and belong to the class of speckles with a small number of scatterers. It has been shown that gene polymorphism can be easily detected through analysis of the statistical characteristics of gene-based speckles.

  19. Biological Parametric Mapping: A Statistical Toolbox for Multi-Modality Brain Image Analysis

    PubMed Central

    Casanova, Ramon; Ryali, Srikanth; Baer, Aaron; Laurienti, Paul J.; Burdette, Jonathan H.; Hayasaka, Satoru; Flowers, Lynn; Wood, Frank; Maldjian, Joseph A.

    2006-01-01

    In recent years multiple brain MR imaging modalities have emerged; however, analysis methodologies have mainly remained modality specific. In addition, when comparing across imaging modalities, most researchers have been forced to rely on simple region-of-interest type analyses, which do not allow the voxel-by-voxel comparisons necessary to answer more sophisticated neuroscience questions. To overcome these limitations, we developed a toolbox for multimodal image analysis called biological parametric mapping (BPM), based on a voxel-wise use of the general linear model. The BPM toolbox incorporates information obtained from other modalities as regressors in a voxel-wise analysis, thereby permitting investigation of more sophisticated hypotheses. The BPM toolbox has been developed in MATLAB with a user friendly interface for performing analyses, including voxel-wise multimodal correlation, ANCOVA, and multiple regression. It has a high degree of integration with the SPM (statistical parametric mapping) software relying on it for visualization and statistical inference. Furthermore, statistical inference for a correlation field, rather than a widely-used T-field, has been implemented in the correlation analysis for more accurate results. An example with in-vivo data is presented demonstrating the potential of the BPM methodology as a tool for multimodal image analysis. PMID:17070709

  20. A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data

    PubMed Central

    Vinaixa, Maria; Samino, Sara; Saez, Isabel; Duran, Jordi; Guinovart, Joan J.; Yanes, Oscar

    2012-01-01

    Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest in the research sample, metabolites can be then unequivocally identified. This paper reports on a comprehensive overview of a workflow for statistical analysis to rank relevant metabolite features that will be selected for further MS/MS experiments. We focus on univariate data analysis applied in parallel on all detected features. Characteristics and challenges of this analysis are discussed and illustrated using four different real LC/MS untargeted metabolomic datasets. We demonstrate the influence of considering or violating mathematical assumptions on which univariate statistical test rely, using high-dimensional LC/MS datasets. Issues in data analysis such as determination of sample size, analytical variation, assumption of normality and homocedasticity, or correction for multiple testing are discussed and illustrated in the context of our four untargeted LC/MS working examples. PMID:24957762

  1. A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data.

    PubMed

    Vinaixa, Maria; Samino, Sara; Saez, Isabel; Duran, Jordi; Guinovart, Joan J; Yanes, Oscar

    2012-10-18

    Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest in the research sample, metabolites can be then unequivocally identified. This paper reports on a comprehensive overview of a workflow for statistical analysis to rank relevant metabolite features that will be selected for further MS/MS experiments. We focus on univariate data analysis applied in parallel on all detected features. Characteristics and challenges of this analysis are discussed and illustrated using four different real LC/MS untargeted metabolomic datasets. We demonstrate the influence of considering or violating mathematical assumptions on which univariate statistical test rely, using high-dimensional LC/MS datasets. Issues in data analysis such as determination of sample size, analytical variation, assumption of normality and homocedasticity, or correction for multiple testing are discussed and illustrated in the context of our four untargeted LC/MS working examples.

  2. DAnTE: a statistical tool for quantitative analysis of –omics data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Polpitiya, Ashoka D.; Qian, Weijun; Jaitly, Navdeep

    2008-05-03

    DAnTE (Data Analysis Tool Extension) is a statistical tool designed to address challenges unique to quantitative bottom-up, shotgun proteomics data. This tool has also been demonstrated for microarray data and can easily be extended to other high-throughput data types. DAnTE features selected normalization methods, missing value imputation algorithms, peptide to protein rollup methods, an extensive array of plotting functions, and a comprehensive ANOVA scheme that can handle unbalanced data and random effects. The Graphical User Interface (GUI) is designed to be very intuitive and user friendly.

  3. The neuronal correlates of intranasal trigeminal function – An ALE meta-analysis of human functional brain imaging data

    PubMed Central

    Albrecht, Jessica; Kopietz, Rainer; Frasnelli, Johannes; Wiesmann, Martin; Hummel, Thomas; Lundström, Johan N.

    2009-01-01

    Almost every odor we encounter in daily life has the capacity to produce a trigeminal sensation. Surprisingly, few functional imaging studies exploring human neuronal correlates of intranasal trigeminal function exist, and results are to some degree inconsistent. We utilized activation likelihood estimation (ALE), a quantitative voxel-based meta-analysis tool, to analyze functional imaging data (fMRI/PET) following intranasal trigeminal stimulation with carbon dioxide (CO2), a stimulus known to exclusively activate the trigeminal system. Meta-analysis tools are able to identify activations common across studies, thereby enabling activation mapping with higher certainty. Activation foci of nine studies utilizing trigeminal stimulation were included in the meta-analysis. We found significant ALE scores, thus indicating consistent activation across studies, in the brainstem, ventrolateral posterior thalamic nucleus, anterior cingulate cortex, insula, precentral gyrus, as well as in primary and secondary somatosensory cortices – a network known for the processing of intranasal nociceptive stimuli. Significant ALE values were also observed in the piriform cortex, insula, and the orbitofrontal cortex, areas known to process chemosensory stimuli, and in association cortices. Additionally, the trigeminal ALE statistics were directly compared with ALE statistics originating from olfactory stimulation, demonstrating considerable overlap in activation. In conclusion, the results of this meta-analysis map the human neuronal correlates of intranasal trigeminal stimulation with high statistical certainty and demonstrate that the cortical areas recruited during the processing of intranasal CO2 stimuli include those outside traditional trigeminal areas. Moreover, through illustrations of the considerable overlap between brain areas that process trigeminal and olfactory information; these results demonstrate the interconnectivity of flavor processing. PMID:19913573

  4. Performance of Modified Test Statistics in Covariance and Correlation Structure Analysis under Conditions of Multivariate Nonnormality.

    ERIC Educational Resources Information Center

    Fouladi, Rachel T.

    2000-01-01

    Provides an overview of standard and modified normal theory and asymptotically distribution-free covariance and correlation structure analysis techniques and details Monte Carlo simulation results on Type I and Type II error control. Demonstrates through the simulation that robustness and nonrobustness of structure analysis techniques vary as a…

  5. The use of decision analysis to evaluate the economic effects of heat mount detectors in two dairy herds.

    PubMed

    Williamson, N B

    1975-03-01

    This paper reports a decrease in the interval from calving to conception in two commercial dairy herds, associated with the use of KaMaR Heat Mount Detectors. An economic analysis of the results uses a neoclassical decision theory approach to demonstrate that the use of heat mount detectors is likely to be profitable, with an expected net return of $154.18 per 100 calvings. The analysis demonstrates the suitability of a decision-theoretic approach to the analysis of applied research, and illustrates some of the weaknesses of "Classical" statistical analysis in such circumstances.

  6. Vitamin D and Depression: A Systematic Review and Meta-Analysis Comparing Studies with and without Biological Flaws

    PubMed Central

    Spedding, Simon

    2014-01-01

    Efficacy of Vitamin D supplements in depression is controversial, awaiting further literature analysis. Biological flaws in primary studies is a possible reason meta-analyses of Vitamin D have failed to demonstrate efficacy. This systematic review and meta-analysis of Vitamin D and depression compared studies with and without biological flaws. The systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The literature search was undertaken through four databases for randomized controlled trials (RCTs). Studies were critically appraised for methodological quality and biological flaws, in relation to the hypothesis and study design. Meta-analyses were performed for studies according to the presence of biological flaws. The 15 RCTs identified provide a more comprehensive evidence-base than previous systematic reviews; methodological quality of studies was generally good and methodology was diverse. A meta-analysis of all studies without flaws demonstrated a statistically significant improvement in depression with Vitamin D supplements (+0.78 CI +0.24, +1.27). Studies with biological flaws were mainly inconclusive, with the meta-analysis demonstrating a statistically significant worsening in depression by taking Vitamin D supplements (−1.1 CI −0.7, −1.5). Vitamin D supplementation (≥800 I.U. daily) was somewhat favorable in the management of depression in studies that demonstrate a change in vitamin levels, and the effect size was comparable to that of anti-depressant medication. PMID:24732019

  7. Advanced building energy management system demonstration for Department of Defense buildings.

    PubMed

    O'Neill, Zheng; Bailey, Trevor; Dong, Bing; Shashanka, Madhusudana; Luo, Dong

    2013-08-01

    This paper presents an advanced building energy management system (aBEMS) that employs advanced methods of whole-building performance monitoring combined with statistical methods of learning and data analysis to enable identification of both gradual and discrete performance erosion and faults. This system assimilated data collected from multiple sources, including blueprints, reduced-order models (ROM) and measurements, and employed advanced statistical learning algorithms to identify patterns of anomalies. The results were presented graphically in a manner understandable to facilities managers. A demonstration of aBEMS was conducted in buildings at Naval Station Great Lakes. The facility building management systems were extended to incorporate the energy diagnostics and analysis algorithms, producing systematic identification of more efficient operation strategies. At Naval Station Great Lakes, greater than 20% savings were demonstrated for building energy consumption by improving facility manager decision support to diagnose energy faults and prioritize alternative, energy-efficient operation strategies. The paper concludes with recommendations for widespread aBEMS success. © 2013 New York Academy of Sciences.

  8. Analysis and meta-analysis of single-case designs with a standardized mean difference statistic: a primer and applications.

    PubMed

    Shadish, William R; Hedges, Larry V; Pustejovsky, James E

    2014-04-01

    This article presents a d-statistic for single-case designs that is in the same metric as the d-statistic used in between-subjects designs such as randomized experiments and offers some reasons why such a statistic would be useful in SCD research. The d has a formal statistical development, is accompanied by appropriate power analyses, and can be estimated using user-friendly SPSS macros. We discuss both advantages and disadvantages of d compared to other approaches such as previous d-statistics, overlap statistics, and multilevel modeling. It requires at least three cases for computation and assumes normally distributed outcomes and stationarity, assumptions that are discussed in some detail. We also show how to test these assumptions. The core of the article then demonstrates in depth how to compute d for one study, including estimation of the autocorrelation and the ratio of between case variance to total variance (between case plus within case variance), how to compute power using a macro, and how to use the d to conduct a meta-analysis of studies using single-case designs in the free program R, including syntax in an appendix. This syntax includes how to read data, compute fixed and random effect average effect sizes, prepare a forest plot and a cumulative meta-analysis, estimate various influence statistics to identify studies contributing to heterogeneity and effect size, and do various kinds of publication bias analyses. This d may prove useful for both the analysis and meta-analysis of data from SCDs. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  9. Intrabolus pressure on high-resolution manometry distinguishes fibrostenotic and inflammatory phenotypes of eosinophilic esophagitis.

    PubMed

    Colizzo, J M; Clayton, S B; Richter, J E

    2016-08-01

    The aim of this investigation was to determine the motility patterns of inflammatory and fibrostenotic phenotypes of eosinophilic esophagitis (EoE) utilizing high-resolution manometry (HRM). Twenty-nine patients with a confirmed diagnosis of EoE according to clinicopathological criteria currently being managed at the Joy McCann Culverhouse Swallowing Center at the University of South Florida were included in the retrospective analysis. Only patients who completed HRM studies were included in the analysis. Patients were classified into inflammatory or fibrostenotic subtypes based on baseline endoscopic evidence. Their baseline HRM studies prior to therapy were analyzed. Manometric data including distal contractile integral, integrated relaxation pressure, and intrabolus pressure (IBP) values were recorded. HRM results were interpreted according to the Chicago Classification system. Statistical analysis was performed with SPSS software (Version 22, IBM Co., Armonk, NY, USA). Data were compared utilizing Student's t-test, χ(2) test, Pearson correlation, and Spearman correlation tests. Statistical significance was set at P < 0.05. A total of 29 patients with EoE were included into the retrospective analysis. The overall average age among patients was 40 years. Male patients comprised 62% of the overall population. Both groups were similar in age, gender, and overall clinical presentation. Seventeen patients (58%) had fibrostenotic disease, and 12 (42%) displayed inflammatory disease. The average IBP for the fibrostenotic and inflammatory groups were 18.6 ± 6.0 mmHg and 12.6 ± 3.5 mmHg, respectively (P < 0.05). Strictures were only seen in the fibrostenotic group. Of the fibrostenotic group, 6 (35%) demonstrated proximal esophageal strictures, 7 (41%) had distal strictures, 3 (18%) had mid-esophageal strictures, and 1 (6%) patient had pan-esophageal strictures. There was no statistically significant correlation between the level of esophageal stricture and degree of IBP. Integrated relaxation pressure, distal contractile integral, and other HRM metrics did not demonstrate statistical significance between the two subtypes. There also appeared no statistically significant correlation between patient demographics and esophageal metrics. Patients with the fibrostenotic phenotype of EoE demonstrated an IBP that was significantly higher than that of the inflammatory group. © 2015 International Society for Diseases of the Esophagus.

  10. Preparing systems engineering and computing science students in disciplined methods, quantitative, and advanced statistical techniques to improve process performance

    NASA Astrophysics Data System (ADS)

    McCray, Wilmon Wil L., Jr.

    The research was prompted by a need to conduct a study that assesses process improvement, quality management and analytical techniques taught to students in U.S. colleges and universities undergraduate and graduate systems engineering and the computing science discipline (e.g., software engineering, computer science, and information technology) degree programs during their academic training that can be applied to quantitatively manage processes for performance. Everyone involved in executing repeatable processes in the software and systems development lifecycle processes needs to become familiar with the concepts of quantitative management, statistical thinking, process improvement methods and how they relate to process-performance. Organizations are starting to embrace the de facto Software Engineering Institute (SEI) Capability Maturity Model Integration (CMMI RTM) Models as process improvement frameworks to improve business processes performance. High maturity process areas in the CMMI model imply the use of analytical, statistical, quantitative management techniques, and process performance modeling to identify and eliminate sources of variation, continually improve process-performance; reduce cost and predict future outcomes. The research study identifies and provides a detail discussion of the gap analysis findings of process improvement and quantitative analysis techniques taught in U.S. universities systems engineering and computing science degree programs, gaps that exist in the literature, and a comparison analysis which identifies the gaps that exist between the SEI's "healthy ingredients " of a process performance model and courses taught in U.S. universities degree program. The research also heightens awareness that academicians have conducted little research on applicable statistics and quantitative techniques that can be used to demonstrate high maturity as implied in the CMMI models. The research also includes a Monte Carlo simulation optimization model and dashboard that demonstrates the use of statistical methods, statistical process control, sensitivity analysis, quantitative and optimization techniques to establish a baseline and predict future customer satisfaction index scores (outcomes). The American Customer Satisfaction Index (ACSI) model and industry benchmarks were used as a framework for the simulation model.

  11. Precipitate statistics in an Al-Mg-Si-Cu alloy from scanning precession electron diffraction data

    NASA Astrophysics Data System (ADS)

    Sunde, J. K.; Paulsen, Ø.; Wenner, S.; Holmestad, R.

    2017-09-01

    The key microstructural feature providing strength to age-hardenable Al alloys is nanoscale precipitates. Alloy development requires a reliable statistical assessment of these precipitates, in order to link the microstructure with material properties. Here, it is demonstrated that scanning precession electron diffraction combined with computational analysis enable the semi-automated extraction of precipitate statistics in an Al-Mg-Si-Cu alloy. Among the main findings is the precipitate number density, which agrees well with a conventional method based on manual counting and measurements. By virtue of its data analysis objectivity, our methodology is therefore seen as an advantageous alternative to existing routines, offering reproducibility and efficiency in alloy statistics. Additional results include improved qualitative information on phase distributions. The developed procedure is generic and applicable to any material containing nanoscale precipitates.

  12. Analysis and meta-analysis of single-case designs: an introduction.

    PubMed

    Shadish, William R

    2014-04-01

    The last 10 years have seen great progress in the analysis and meta-analysis of single-case designs (SCDs). This special issue includes five articles that provide an overview of current work on that topic, including standardized mean difference statistics, multilevel models, Bayesian statistics, and generalized additive models. Each article analyzes a common example across articles and presents syntax or macros for how to do them. These articles are followed by commentaries from single-case design researchers and journal editors. This introduction briefly describes each article and then discusses several issues that must be addressed before we can know what analyses will eventually be best to use in SCD research. These issues include modeling trend, modeling error covariances, computing standardized effect size estimates, assessing statistical power, incorporating more accurate models of outcome distributions, exploring whether Bayesian statistics can improve estimation given the small samples common in SCDs, and the need for annotated syntax and graphical user interfaces that make complex statistics accessible to SCD researchers. The article then discusses reasons why SCD researchers are likely to incorporate statistical analyses into their research more often in the future, including changing expectations and contingencies regarding SCD research from outside SCD communities, changes and diversity within SCD communities, corrections of erroneous beliefs about the relationship between SCD research and statistics, and demonstrations of how statistics can help SCD researchers better meet their goals. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  13. Controlling the joint local false discovery rate is more powerful than meta-analysis methods in joint analysis of summary statistics from multiple genome-wide association studies.

    PubMed

    Jiang, Wei; Yu, Weichuan

    2017-02-15

    In genome-wide association studies (GWASs) of common diseases/traits, we often analyze multiple GWASs with the same phenotype together to discover associated genetic variants with higher power. Since it is difficult to access data with detailed individual measurements, summary-statistics-based meta-analysis methods have become popular to jointly analyze datasets from multiple GWASs. In this paper, we propose a novel summary-statistics-based joint analysis method based on controlling the joint local false discovery rate (Jlfdr). We prove that our method is the most powerful summary-statistics-based joint analysis method when controlling the false discovery rate at a certain level. In particular, the Jlfdr-based method achieves higher power than commonly used meta-analysis methods when analyzing heterogeneous datasets from multiple GWASs. Simulation experiments demonstrate the superior power of our method over meta-analysis methods. Also, our method discovers more associations than meta-analysis methods from empirical datasets of four phenotypes. The R-package is available at: http://bioinformatics.ust.hk/Jlfdr.html . eeyu@ust.hk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  14. Single-Level and Multilevel Mediation Analysis

    ERIC Educational Resources Information Center

    Tofighi, Davood; Thoemmes, Felix

    2014-01-01

    Mediation analysis is a statistical approach used to examine how the effect of an independent variable on an outcome is transmitted through an intervening variable (mediator). In this article, we provide a gentle introduction to single-level and multilevel mediation analyses. Using single-level data, we demonstrate an application of structural…

  15. A General Approach to Causal Mediation Analysis

    ERIC Educational Resources Information Center

    Imai, Kosuke; Keele, Luke; Tingley, Dustin

    2010-01-01

    Traditionally in the social sciences, causal mediation analysis has been formulated, understood, and implemented within the framework of linear structural equation models. We argue and demonstrate that this is problematic for 3 reasons: the lack of a general definition of causal mediation effects independent of a particular statistical model, the…

  16. The High Cost of Complexity in Experimental Design and Data Analysis: Type I and Type II Error Rates in Multiway ANOVA.

    ERIC Educational Resources Information Center

    Smith, Rachel A.; Levine, Timothy R.; Lachlan, Kenneth A.; Fediuk, Thomas A.

    2002-01-01

    Notes that the availability of statistical software packages has led to a sharp increase in use of complex research designs and complex statistical analyses in communication research. Reports a series of Monte Carlo simulations which demonstrate that this complexity may come at a heavier cost than many communication researchers realize. Warns…

  17. Agriculture, population growth, and statistical analysis of the radiocarbon record.

    PubMed

    Zahid, H Jabran; Robinson, Erick; Kelly, Robert L

    2016-01-26

    The human population has grown significantly since the onset of the Holocene about 12,000 y ago. Despite decades of research, the factors determining prehistoric population growth remain uncertain. Here, we examine measurements of the rate of growth of the prehistoric human population based on statistical analysis of the radiocarbon record. We find that, during most of the Holocene, human populations worldwide grew at a long-term annual rate of 0.04%. Statistical analysis of the radiocarbon record shows that transitioning farming societies experienced the same rate of growth as contemporaneous foraging societies. The same rate of growth measured for populations dwelling in a range of environments and practicing a variety of subsistence strategies suggests that the global climate and/or endogenous biological factors, not adaptability to local environment or subsistence practices, regulated the long-term growth of the human population during most of the Holocene. Our results demonstrate that statistical analyses of large ensembles of radiocarbon dates are robust and valuable for quantitatively investigating the demography of prehistoric human populations worldwide.

  18. Statistical Approaches to Adjusting Weights for Dependent Arms in Network Meta-analysis.

    PubMed

    Su, Yu-Xuan; Tu, Yu-Kang

    2018-05-22

    Network meta-analysis compares multiple treatments in terms of their efficacy and harm by including evidence from randomized controlled trials. Most clinical trials use parallel design, where patients are randomly allocated to different treatments and receive only one treatment. However, some trials use within person designs such as split-body, split-mouth and cross-over designs, where each patient may receive more than one treatment. Data from treatment arms within these trials are no longer independent, so the correlations between dependent arms need to be accounted for within the statistical analyses. Ignoring these correlations may result in incorrect conclusions. The main objective of this study is to develop statistical approaches to adjusting weights for dependent arms within special design trials. In this study, we demonstrate the following three approaches: the data augmentation approach, the adjusting variance approach, and the reducing weight approach. These three methods could be perfectly applied in current statistic tools such as R and STATA. An example of periodontal regeneration was used to demonstrate how these approaches could be undertaken and implemented within statistical software packages, and to compare results from different approaches. The adjusting variance approach can be implemented within the network package in STATA, while reducing weight approach requires computer software programming to set up the within-study variance-covariance matrix. This article is protected by copyright. All rights reserved.

  19. Software for the Integration of Multiomics Experiments in Bioconductor.

    PubMed

    Ramos, Marcel; Schiffer, Lucas; Re, Angela; Azhar, Rimsha; Basunia, Azfar; Rodriguez, Carmen; Chan, Tiffany; Chapman, Phil; Davis, Sean R; Gomez-Cabrero, David; Culhane, Aedin C; Haibe-Kains, Benjamin; Hansen, Kasper D; Kodali, Hanish; Louis, Marie S; Mer, Arvind S; Riester, Markus; Morgan, Martin; Carey, Vince; Waldron, Levi

    2017-11-01

    Multiomics experiments are increasingly commonplace in biomedical research and add layers of complexity to experimental design, data integration, and analysis. R and Bioconductor provide a generic framework for statistical analysis and visualization, as well as specialized data classes for a variety of high-throughput data types, but methods are lacking for integrative analysis of multiomics experiments. The MultiAssayExperiment software package, implemented in R and leveraging Bioconductor software and design principles, provides for the coordinated representation of, storage of, and operation on multiple diverse genomics data. We provide the unrestricted multiple 'omics data for each cancer tissue in The Cancer Genome Atlas as ready-to-analyze MultiAssayExperiment objects and demonstrate in these and other datasets how the software simplifies data representation, statistical analysis, and visualization. The MultiAssayExperiment Bioconductor package reduces major obstacles to efficient, scalable, and reproducible statistical analysis of multiomics data and enhances data science applications of multiple omics datasets. Cancer Res; 77(21); e39-42. ©2017 AACR . ©2017 American Association for Cancer Research.

  20. Neural network approach in multichannel auditory event-related potential analysis.

    PubMed

    Wu, F Y; Slater, J D; Ramsay, R E

    1994-04-01

    Even though there are presently no clearly defined criteria for the assessment of P300 event-related potential (ERP) abnormality, it is strongly indicated through statistical analysis that such criteria exist for classifying control subjects and patients with diseases resulting in neuropsychological impairment such as multiple sclerosis (MS). We have demonstrated the feasibility of artificial neural network (ANN) methods in classifying ERP waveforms measured at a single channel (Cz) from control subjects and MS patients. In this paper, we report the results of multichannel ERP analysis and a modified network analysis methodology to enhance automation of the classification rule extraction process. The proposed methodology significantly reduces the work of statistical analysis. It also helps to standardize the criteria of P300 ERP assessment and facilitate the computer-aided analysis on neuropsychological functions.

  1. Association of bladder sensation measures and bladder diary in patients with urinary incontinence.

    PubMed

    King, Ashley B; Wolters, Jeff P; Klausner, Adam P; Rapp, David E

    2012-04-01

    Investigation suggests the involvement of afferent actions in the pathophysiology of urinary incontinence. Current diagnostic modalities do not allow for the accurate identification of sensory dysfunction. We previously reported urodynamic derivatives that may be useful in assessing bladder sensation. We sought to further investigate these derivatives by assessing for a relationship with 3-day bladder diary. Subset analysis was performed in patients without stress urinary incontinence (SUI) attempting to isolate patients with urgency symptoms. No association was demonstrated between bladder diary parameters and urodynamic derivatives (r coefficient range (-0.06 to 0.08)(p > 0.05)). However, subset analysis demonstrated an association between detrusor overactivity (DO) and bladder urgency velocity (BUV), with a lower BUV identified in patients without DO. Subset analysis of patients with isolated urgency/urge incontinence identified weak associations between voiding frequency and FSR (r = 0.39) and between daily incontinence episodes and BUV (r = 0.35). However, these associations failed to demonstrate statistical significance. No statistical association was seen between bladder diary and urodynamic derivatives. This is not unexpected, given that bladder diary parameters may reflect numerous pathologies including not only sensory dysfunction but also SUI and DO. However, weak associations were identified in patients without SUI and, further, a statistical relationship between DO and BUV was seen. Additional research is needed to assess the utility of FSR/BUV in characterizing sensory dysfunction, especially in patients without concurrent pathology (e.g. SUI, DO).

  2. The benefit of silicone stents in primary endonasal dacryocystorhinostomy: a systematic review and meta-analysis.

    PubMed

    Sarode, D; Bari, D A; Cain, A C; Syed, M I; Williams, A T

    2017-04-01

    To critically evaluate the evidence comparing success rates of endonasal dacryocystorhinostomy (EN-DCR) with and without silicone tubing and to thus determine whether silicone intubation is beneficial in primary EN-DCR. Systematic review and meta-analysis. A literature search was performed on AMED, EMBASE, HMIC, MEDLINE, PsycINFO, BNI, CINAHL, HEALTH BUSINESS ELITE, CENTRAL and Cochrane Ear, Nose and Throat disorders groups trials register using a combination of various MeSH. The date of last search was January 2016. This review was limited to randomised controlled trials (RCTs) in English language. Risk of bias was assessed using the Cochrane Collaboration's risk of bias tool. Chi-square and I 2 statistics were calculated to determine the presence and extent of statistical heterogeneity. Study selection, data extraction and risk of bias scoring were performed independently by two authors in concordance with the PRISMA statement. Five RCTs (447 primary EN-DCR procedures in 426 patients) were included for analysis. Moderate interstudy statistical heterogeneity was demonstrated (Chi 2 = 6.18; d.f. = 4; I 2 = 35%). Bicanalicular silicone stents were used in 229 and not used in 218 procedures. The overall success rate of EN-DCR was 92.8% (415/447). The success rate of EN-DCR was 93.4% (214/229) with silicone tubing and 92.2% (201/218) without silicone tubing. Meta-analysis using a random-effects model showed no statistically significant difference in outcomes between the two groups (P = 0.63; RR = 0.79; 95% CI = 0.3-2.06). Our review and meta-analysis did not demonstrate an additional advantage of silicone stenting. A high-quality well-powered prospective multicentre RCT is needed to further clarify on the benefit of silicone stents. © 2016 John Wiley & Sons Ltd.

  3. System Analysis for the Huntsville Operation Support Center, Distributed Computer System

    NASA Technical Reports Server (NTRS)

    Ingels, F. M.; Massey, D.

    1985-01-01

    HOSC as a distributed computing system, is responsible for data acquisition and analysis during Space Shuttle operations. HOSC also provides computing services for Marshall Space Flight Center's nonmission activities. As mission and nonmission activities change, so do the support functions of HOSC change, demonstrating the need for some method of simulating activity at HOSC in various configurations. The simulation developed in this work primarily models the HYPERchannel network. The model simulates the activity of a steady state network, reporting statistics such as, transmitted bits, collision statistics, frame sequences transmitted, and average message delay. These statistics are used to evaluate such performance indicators as throughout, utilization, and delay. Thus the overall performance of the network is evaluated, as well as predicting possible overload conditions.

  4. A two-point diagnostic for the H II galaxy Hubble diagram

    NASA Astrophysics Data System (ADS)

    Leaf, Kyle; Melia, Fulvio

    2018-03-01

    A previous analysis of starburst-dominated H II galaxies and H II regions has demonstrated a statistically significant preference for the Friedmann-Robertson-Walker cosmology with zero active mass, known as the Rh = ct universe, over Λcold dark matter (ΛCDM) and its related dark-matter parametrizations. In this paper, we employ a two-point diagnostic with these data to present a complementary statistical comparison of Rh = ct with Planck ΛCDM. Our two-point diagnostic compares, in a pairwise fashion, the difference between the distance modulus measured at two redshifts with that predicted by each cosmology. Our results support the conclusion drawn by a previous comparative analysis demonstrating that Rh = ct is statistically preferred over Planck ΛCDM. But we also find that the reported errors in the H II measurements may not be purely Gaussian, perhaps due to a partial contamination by non-Gaussian systematic effects. The use of H II galaxies and H II regions as standard candles may be improved even further with a better handling of the systematics in these sources.

  5. Decomposition of Ag-based soldering alloys used in space maintainers after intra-oral exposure. A retrieval analysis study.

    PubMed

    Soteriou, Despo; Ntasi, Argyro; Papagiannoulis, Lisa; Eliades, Theodore; Zinelis, Spiros

    2014-02-01

    The aim of this study was to evaluate the elemental alterations of Ag soldering alloys used in space maintainers after intra-oral exposure. Twenty devices were fabricated by using two different soldering alloys; US (Dentaurum Universal Silver Solder, n = 10) and OS (Leone Orthodontic Solder, n = 10). All devices were manufactured by the same technician. Surface morphology and elemental quantitative analysis of the soldering alloys before and after intra-oral placement in patients was determined by scanning electron microscopy and energy-dispersive X-ray microanalysis (SEM/EDX). Statistical analysis was performed by t-test, Mann Whitney tests and Pearson's correlation. For all tests a 95% confidence level was used (α = 0.05). Both soldering alloys demonstrated substantially increase in surface roughness after intra-oral aging. Statistical analysis illustrated a significant decrease in the Cu and Zn content after treatment. OS demonstrated higher Cu release than US (p < 0.05). The remaining relative concentrations of Cu and Zn after the treatment did not show any correlation (p > 0.05) with intra-oral exposure time, apart from Zn in OS (r = 0.840, p = 0.04). Both soldering alloys demonstrated a significant Cu and Zn reduction after intra-oral exposure that may raise biocompatibility concerns.

  6. A LISREL Model for the Analysis of Repeated Measures with a Patterned Covariance Matrix.

    ERIC Educational Resources Information Center

    Rovine, Michael J.; Molenaar, Peter C. M.

    1998-01-01

    Presents a LISREL model for the estimation of the repeated measures analysis of variance (ANOVA) with a patterned covariance matrix. The model is demonstrated for a 5 x 2 (Time x Group) ANOVA in which the data are assumed to be serially correlated. Similarities with the Statistical Analysis System PROC MIXED model are discussed. (SLD)

  7. A comparison of healing rates on two pressure-relieving systems.

    PubMed

    Russell, L; Reynolds, T; Carr, J; Evans, A; Holmes, M

    The authors have previously reported the preliminary results of a randomized-controlled trial comparing the relative efficacy of two pressure-relieving systems: Huntleigh Nimbus 3 and Aura Cushion, and Pegasus Cairwave Therapy System and ProActive Seating Cushion (Russell et al, 2000). Although both the mattresses and cushions were effective treatments for pressure ulcers, the Huntleigh equipment was demonstrated to be statistically more effective for heel ulcers, but no differences were demonstrated for sacral ulcers. This article gives a more detailed analysis of the 141 patients assessed using computerized-image analysis of the digital images of sacral ulcers captured during the trial and specifically discusses the healing rates and other patient characteristics. Ninety-eight per cent of ulcers examined were deemed superficial (Torrance grade 2a, 2b, 3). Precision of image analysis assessed by within- and between-batch coefficients of variation was excellent: calibration CV 0.93-1.84%; area CV 4.61-5.72%. The healing rates on the two mattresses were not shown to be statistically different from each other.

  8. Two-sample statistics for testing the equality of survival functions against improper semi-parametric accelerated failure time alternatives: an application to the analysis of a breast cancer clinical trial.

    PubMed

    Broët, Philippe; Tsodikov, Alexander; De Rycke, Yann; Moreau, Thierry

    2004-06-01

    This paper presents two-sample statistics suited for testing equality of survival functions against improper semi-parametric accelerated failure time alternatives. These tests are designed for comparing either the short- or the long-term effect of a prognostic factor, or both. These statistics are obtained as partial likelihood score statistics from a time-dependent Cox model. As a consequence, the proposed tests can be very easily implemented using widely available software. A breast cancer clinical trial is presented as an example to demonstrate the utility of the proposed tests.

  9. Preparing and Presenting Effective Research Posters

    PubMed Central

    Miller, Jane E

    2007-01-01

    Objectives Posters are a common way to present results of a statistical analysis, program evaluation, or other project at professional conferences. Often, researchers fail to recognize the unique nature of the format, which is a hybrid of a published paper and an oral presentation. This methods note demonstrates how to design research posters to convey study objectives, methods, findings, and implications effectively to varied professional audiences. Methods A review of existing literature on research communication and poster design is used to identify and demonstrate important considerations for poster content and layout. Guidelines on how to write about statistical methods, results, and statistical significance are illustrated with samples of ineffective writing annotated to point out weaknesses, accompanied by concrete examples and explanations of improved presentation. A comparison of the content and format of papers, speeches, and posters is also provided. Findings Each component of a research poster about a quantitative analysis should be adapted to the audience and format, with complex statistical results translated into simplified charts, tables, and bulleted text to convey findings as part of a clear, focused story line. Conclusions Effective research posters should be designed around two or three key findings with accompanying handouts and narrative description to supply additional technical detail and encourage dialog with poster viewers. PMID:17355594

  10. A close examination of double filtering with fold change and t test in microarray analysis

    PubMed Central

    2009-01-01

    Background Many researchers use the double filtering procedure with fold change and t test to identify differentially expressed genes, in the hope that the double filtering will provide extra confidence in the results. Due to its simplicity, the double filtering procedure has been popular with applied researchers despite the development of more sophisticated methods. Results This paper, for the first time to our knowledge, provides theoretical insight on the drawback of the double filtering procedure. We show that fold change assumes all genes to have a common variance while t statistic assumes gene-specific variances. The two statistics are based on contradicting assumptions. Under the assumption that gene variances arise from a mixture of a common variance and gene-specific variances, we develop the theoretically most powerful likelihood ratio test statistic. We further demonstrate that the posterior inference based on a Bayesian mixture model and the widely used significance analysis of microarrays (SAM) statistic are better approximations to the likelihood ratio test than the double filtering procedure. Conclusion We demonstrate through hypothesis testing theory, simulation studies and real data examples, that well constructed shrinkage testing methods, which can be united under the mixture gene variance assumption, can considerably outperform the double filtering procedure. PMID:19995439

  11. Introduction to Latent Class Analysis with Applications

    ERIC Educational Resources Information Center

    Porcu, Mariano; Giambona, Francesca

    2017-01-01

    Latent class analysis (LCA) is a statistical method used to group individuals (cases, units) into classes (categories) of an unobserved (latent) variable on the basis of the responses made on a set of nominal, ordinal, or continuous observed variables. In this article, we introduce LCA in order to demonstrate its usefulness to early adolescence…

  12. An Economic Analysis of the Demand for Scientific Journals

    ERIC Educational Resources Information Center

    Berg, Sanford V.

    1972-01-01

    The purpose of this study is to demonstrate that economic analysis can be useful in modeling the scientific journal market. Of particular interest is the efficiency of pricing and page policies. To calculate loses due to inefficiencies, demand parameters are statistically estimated and used in a discussion of market efficiency. (3 references)…

  13. Prototype demonstration of a geographic information system application for seasonal analysis of traffic data : development of seasonal factors and seasonal adjustment of roadways

    DOT National Transportation Integrated Search

    1999-08-15

    The Traffic Survey Unit plans to establish a methodology in which it can assign each Portable Traffic Counter (PTC) station a seasonal group profile through a means of statistical and geographical analysis. An ArcView Geographic Information Systems a...

  14. Applications of Nonlinear Principal Components Analysis to Behavioral Data.

    ERIC Educational Resources Information Center

    Hicks, Marilyn Maginley

    1981-01-01

    An empirical investigation of the statistical procedure entitled nonlinear principal components analysis was conducted on a known equation and on measurement data in order to demonstrate the procedure and examine its potential usefulness. This method was suggested by R. Gnanadesikan and based on an early paper of Karl Pearson. (Author/AL)

  15. Consequences of common data analysis inaccuracies in CNS trauma injury basic research.

    PubMed

    Burke, Darlene A; Whittemore, Scott R; Magnuson, David S K

    2013-05-15

    The development of successful treatments for humans after traumatic brain or spinal cord injuries (TBI and SCI, respectively) requires animal research. This effort can be hampered when promising experimental results cannot be replicated because of incorrect data analysis procedures. To identify and hopefully avoid these errors in future studies, the articles in seven journals with the highest number of basic science central nervous system TBI and SCI animal research studies published in 2010 (N=125 articles) were reviewed for their data analysis procedures. After identifying the most common statistical errors, the implications of those findings were demonstrated by reanalyzing previously published data from our laboratories using the identified inappropriate statistical procedures, then comparing the two sets of results. Overall, 70% of the articles contained at least one type of inappropriate statistical procedure. The highest percentage involved incorrect post hoc t-tests (56.4%), followed by inappropriate parametric statistics (analysis of variance and t-test; 37.6%). Repeated Measures analysis was inappropriately missing in 52.0% of all articles and, among those with behavioral assessments, 58% were analyzed incorrectly. Reanalysis of our published data using the most common inappropriate statistical procedures resulted in a 14.1% average increase in significant effects compared to the original results. Specifically, an increase of 15.5% occurred with Independent t-tests and 11.1% after incorrect post hoc t-tests. Utilizing proper statistical procedures can allow more-definitive conclusions, facilitate replicability of research results, and enable more accurate translation of those results to the clinic.

  16. UNITY: Confronting Supernova Cosmology's Statistical and Systematic Uncertainties in a Unified Bayesian Framework

    NASA Astrophysics Data System (ADS)

    Rubin, D.; Aldering, G.; Barbary, K.; Boone, K.; Chappell, G.; Currie, M.; Deustua, S.; Fagrelius, P.; Fruchter, A.; Hayden, B.; Lidman, C.; Nordin, J.; Perlmutter, S.; Saunders, C.; Sofiatti, C.; Supernova Cosmology Project, The

    2015-11-01

    While recent supernova (SN) cosmology research has benefited from improved measurements, current analysis approaches are not statistically optimal and will prove insufficient for future surveys. This paper discusses the limitations of current SN cosmological analyses in treating outliers, selection effects, shape- and color-standardization relations, unexplained dispersion, and heterogeneous observations. We present a new Bayesian framework, called UNITY (Unified Nonlinear Inference for Type-Ia cosmologY), that incorporates significant improvements in our ability to confront these effects. We apply the framework to real SN observations and demonstrate smaller statistical and systematic uncertainties. We verify earlier results that SNe Ia require nonlinear shape and color standardizations, but we now include these nonlinear relations in a statistically well-justified way. This analysis was primarily performed blinded, in that the basic framework was first validated on simulated data before transitioning to real data. We also discuss possible extensions of the method.

  17. Electrophysiological evidence of heterogeneity in visual statistical learning in young children with ASD.

    PubMed

    Jeste, Shafali S; Kirkham, Natasha; Senturk, Damla; Hasenstab, Kyle; Sugar, Catherine; Kupelian, Chloe; Baker, Elizabeth; Sanders, Andrew J; Shimizu, Christina; Norona, Amanda; Paparella, Tanya; Freeman, Stephanny F N; Johnson, Scott P

    2015-01-01

    Statistical learning is characterized by detection of regularities in one's environment without an awareness or intention to learn, and it may play a critical role in language and social behavior. Accordingly, in this study we investigated the electrophysiological correlates of visual statistical learning in young children with autism spectrum disorder (ASD) using an event-related potential shape learning paradigm, and we examined the relation between visual statistical learning and cognitive function. Compared to typically developing (TD) controls, the ASD group as a whole showed reduced evidence of learning as defined by N1 (early visual discrimination) and P300 (attention to novelty) components. Upon further analysis, in the ASD group there was a positive correlation between N1 amplitude difference and non-verbal IQ, and a positive correlation between P300 amplitude difference and adaptive social function. Children with ASD and a high non-verbal IQ and high adaptive social function demonstrated a distinctive pattern of learning. This is the first study to identify electrophysiological markers of visual statistical learning in children with ASD. Through this work we have demonstrated heterogeneity in statistical learning in ASD that maps onto non-verbal cognition and adaptive social function. © 2014 John Wiley & Sons Ltd.

  18. Analysis of statistical properties of laser speckles, forming in skin and mucous of colon: potential application in laser surgery

    NASA Astrophysics Data System (ADS)

    Rubtsov, Vladimir; Kapralov, Sergey; Chalyk, Iuri; Ulianova, Onega; Ulyanov, Sergey

    2013-02-01

    Statistical properties of laser speckles, formed in skin and mucous of colon have been analyzed and compared. It has been demonstrated that first and second order statistics of "skin" speckles and "mucous" speckles are quite different. It is shown that speckles, formed in mucous, are not Gaussian one. Layered structure of colon mucous causes formation of speckled biospeckles. First- and second- order statistics of speckled speckles have been reviewed in this paper. Statistical properties of Fresnel and Fraunhofer doubly scattered and cascade speckles are described. Non-gaussian statistics of biospeckles may lead to high localization of intensity of coherent light in human tissue during the laser surgery. Way of suppression of highly localized non-gaussian speckles is suggested.

  19. Root Cause Analysis of Quality Defects Using HPLC-MS Fingerprint Knowledgebase for Batch-to-batch Quality Control of Herbal Drugs.

    PubMed

    Yan, Binjun; Fang, Zhonghua; Shen, Lijuan; Qu, Haibin

    2015-01-01

    The batch-to-batch quality consistency of herbal drugs has always been an important issue. To propose a methodology for batch-to-batch quality control based on HPLC-MS fingerprints and process knowledgebase. The extraction process of Compound E-jiao Oral Liquid was taken as a case study. After establishing the HPLC-MS fingerprint analysis method, the fingerprints of the extract solutions produced under normal and abnormal operation conditions were obtained. Multivariate statistical models were built for fault detection and a discriminant analysis model was built using the probabilistic discriminant partial-least-squares method for fault diagnosis. Based on multivariate statistical analysis, process knowledge was acquired and the cause-effect relationship between process deviations and quality defects was revealed. The quality defects were detected successfully by multivariate statistical control charts and the type of process deviations were diagnosed correctly by discriminant analysis. This work has demonstrated the benefits of combining HPLC-MS fingerprints, process knowledge and multivariate analysis for the quality control of herbal drugs. Copyright © 2015 John Wiley & Sons, Ltd.

  20. Statistical analysis and interpolation of compositional data in materials science.

    PubMed

    Pesenson, Misha Z; Suram, Santosh K; Gregoire, John M

    2015-02-09

    Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.

  1. The level crossing rates and associated statistical properties of a random frequency response function

    NASA Astrophysics Data System (ADS)

    Langley, Robin S.

    2018-03-01

    This work is concerned with the statistical properties of the frequency response function of the energy of a random system. Earlier studies have considered the statistical distribution of the function at a single frequency, or alternatively the statistics of a band-average of the function. In contrast the present analysis considers the statistical fluctuations over a frequency band, and results are obtained for the mean rate at which the function crosses a specified level (or equivalently, the average number of times the level is crossed within the band). Results are also obtained for the probability of crossing a specified level at least once, the mean rate of occurrence of peaks, and the mean trough-to-peak height. The analysis is based on the assumption that the natural frequencies and mode shapes of the system have statistical properties that are governed by the Gaussian Orthogonal Ensemble (GOE), and the validity of this assumption is demonstrated by comparison with numerical simulations for a random plate. The work has application to the assessment of the performance of dynamic systems that are sensitive to random imperfections.

  2. Remote sensing of atmospheric water content from Bhaskara SAMIR data. [using statistical linear regression analysis

    NASA Technical Reports Server (NTRS)

    Gohil, B. S.; Hariharan, T. A.; Sharma, A. K.; Pandey, P. C.

    1982-01-01

    The 19.35 GHz and 22.235 GHz passive microwave radiometers (SAMIR) on board the Indian satellite Bhaskara have provided very useful data. From these data has been demonstrated the feasibility of deriving atmospheric and ocean surface parameters such as water vapor content, liquid water content, rainfall rate and ocean surface winds. Different approaches have been tried for deriving the atmospheric water content. The statistical and empirical methods have been used by others for the analysis of the Nimbus data. A simulation technique has been attempted for the first time for 19.35 GHz and 22.235 GHz radiometer data. The results obtained from three different methods are compared with radiosonde data. A case study of a tropical depression has been undertaken to demonstrate the capability of Bhaskara SAMIR data to show the variation of total water vapor and liquid water contents.

  3. HYPOTHESIS SETTING AND ORDER STATISTIC FOR ROBUST GENOMIC META-ANALYSIS.

    PubMed

    Song, Chi; Tseng, George C

    2014-01-01

    Meta-analysis techniques have been widely developed and applied in genomic applications, especially for combining multiple transcriptomic studies. In this paper, we propose an order statistic of p-values ( r th ordered p-value, rOP) across combined studies as the test statistic. We illustrate different hypothesis settings that detect gene markers differentially expressed (DE) "in all studies", "in the majority of studies", or "in one or more studies", and specify rOP as a suitable method for detecting DE genes "in the majority of studies". We develop methods to estimate the parameter r in rOP for real applications. Statistical properties such as its asymptotic behavior and a one-sided testing correction for detecting markers of concordant expression changes are explored. Power calculation and simulation show better performance of rOP compared to classical Fisher's method, Stouffer's method, minimum p-value method and maximum p-value method under the focused hypothesis setting. Theoretically, rOP is found connected to the naïve vote counting method and can be viewed as a generalized form of vote counting with better statistical properties. The method is applied to three microarray meta-analysis examples including major depressive disorder, brain cancer and diabetes. The results demonstrate rOP as a more generalizable, robust and sensitive statistical framework to detect disease-related markers.

  4. Modeling and replicating statistical topology and evidence for CMB nonhomogeneity

    PubMed Central

    Agami, Sarit

    2017-01-01

    Under the banner of “big data,” the detection and classification of structure in extremely large, high-dimensional, data sets are two of the central statistical challenges of our times. Among the most intriguing new approaches to this challenge is “TDA,” or “topological data analysis,” one of the primary aims of which is providing nonmetric, but topologically informative, preanalyses of data which make later, more quantitative, analyses feasible. While TDA rests on strong mathematical foundations from topology, in applications, it has faced challenges due to difficulties in handling issues of statistical reliability and robustness, often leading to an inability to make scientific claims with verifiable levels of statistical confidence. We propose a methodology for the parametric representation, estimation, and replication of persistence diagrams, the main diagnostic tool of TDA. The power of the methodology lies in the fact that even if only one persistence diagram is available for analysis—the typical case for big data applications—the replications permit conventional statistical hypothesis testing. The methodology is conceptually simple and computationally practical, and provides a broadly effective statistical framework for persistence diagram TDA analysis. We demonstrate the basic ideas on a toy example, and the power of the parametric approach to TDA modeling in an analysis of cosmic microwave background (CMB) nonhomogeneity. PMID:29078301

  5. Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison.

    PubMed

    Dai, Qi; Yang, Yanchun; Wang, Tianming

    2008-10-15

    Many proposed statistical measures can efficiently compare biological sequences to further infer their structures, functions and evolutionary information. They are related in spirit because all the ideas for sequence comparison try to use the information on the k-word distributions, Markov model or both. Motivated by adding k-word distributions to Markov model directly, we investigated two novel statistical measures for sequence comparison, called wre.k.r and S2.k.r. The proposed measures were tested by similarity search, evaluation on functionally related regulatory sequences and phylogenetic analysis. This offers the systematic and quantitative experimental assessment of our measures. Moreover, we compared our achievements with these based on alignment or alignment-free. We grouped our experiments into two sets. The first one, performed via ROC (receiver operating curve) analysis, aims at assessing the intrinsic ability of our statistical measures to search for similar sequences from a database and discriminate functionally related regulatory sequences from unrelated sequences. The second one aims at assessing how well our statistical measure is used for phylogenetic analysis. The experimental assessment demonstrates that our similarity measures intending to incorporate k-word distributions into Markov model are more efficient.

  6. Mapping Quantitative Traits in Unselected Families: Algorithms and Examples

    PubMed Central

    Dupuis, Josée; Shi, Jianxin; Manning, Alisa K.; Benjamin, Emelia J.; Meigs, James B.; Cupples, L. Adrienne; Siegmund, David

    2009-01-01

    Linkage analysis has been widely used to identify from family data genetic variants influencing quantitative traits. Common approaches have both strengths and limitations. Likelihood ratio tests typically computed in variance component analysis can accommodate large families but are highly sensitive to departure from normality assumptions. Regression-based approaches are more robust but their use has primarily been restricted to nuclear families. In this paper, we develop methods for mapping quantitative traits in moderately large pedigrees. Our methods are based on the score statistic which in contrast to the likelihood ratio statistic, can use nonparametric estimators of variability to achieve robustness of the false positive rate against departures from the hypothesized phenotypic model. Because the score statistic is easier to calculate than the likelihood ratio statistic, our basic mapping methods utilize relatively simple computer code that performs statistical analysis on output from any program that computes estimates of identity-by-descent. This simplicity also permits development and evaluation of methods to deal with multivariate and ordinal phenotypes, and with gene-gene and gene-environment interaction. We demonstrate our methods on simulated data and on fasting insulin, a quantitative trait measured in the Framingham Heart Study. PMID:19278016

  7. Statistical analysis of subjective preferences for video enhancement

    NASA Astrophysics Data System (ADS)

    Woods, Russell L.; Satgunam, PremNandhini; Bronstad, P. Matthew; Peli, Eli

    2010-02-01

    Measuring preferences for moving video quality is harder than for static images due to the fleeting and variable nature of moving video. Subjective preferences for image quality can be tested by observers indicating their preference for one image over another. Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999). Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for the items that are compared (e.g. enhancement levels). However, Thurstone scaling does not determine the statistical significance of the differences between items on that perceptual scale. Recent papers have provided inferential statistical methods that produce an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we demonstrate that binary logistic regression can analyze preferences for enhanced video.

  8. A Non-Intrusive Algorithm for Sensitivity Analysis of Chaotic Flow Simulations

    NASA Technical Reports Server (NTRS)

    Blonigan, Patrick J.; Wang, Qiqi; Nielsen, Eric J.; Diskin, Boris

    2017-01-01

    We demonstrate a novel algorithm for computing the sensitivity of statistics in chaotic flow simulations to parameter perturbations. The algorithm is non-intrusive but requires exposing an interface. Based on the principle of shadowing in dynamical systems, this algorithm is designed to reduce the effect of the sampling error in computing sensitivity of statistics in chaotic simulations. We compare the effectiveness of this method to that of the conventional finite difference method.

  9. Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data.

    PubMed

    Tintle, Nathan L; Sitarik, Alexandra; Boerema, Benjamin; Young, Kylie; Best, Aaron A; Dejongh, Matthew

    2012-08-08

    Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.

  10. A Simple Test of Class-Level Genetic Association Can Reveal Novel Cardiometabolic Trait Loci.

    PubMed

    Qian, Jing; Nunez, Sara; Reed, Eric; Reilly, Muredach P; Foulkes, Andrea S

    2016-01-01

    Characterizing the genetic determinants of complex diseases can be further augmented by incorporating knowledge of underlying structure or classifications of the genome, such as newly developed mappings of protein-coding genes, epigenetic marks, enhancer elements and non-coding RNAs. We apply a simple class-level testing framework, termed Genetic Class Association Testing (GenCAT), to identify protein-coding gene association with 14 cardiometabolic (CMD) related traits across 6 publicly available genome wide association (GWA) meta-analysis data resources. GenCAT uses SNP-level meta-analysis test statistics across all SNPs within a class of elements, as well as the size of the class and its unique correlation structure, to determine if the class is statistically meaningful. The novelty of findings is evaluated through investigation of regional signals. A subset of findings are validated using recently updated, larger meta-analysis resources. A simulation study is presented to characterize overall performance with respect to power, control of family-wise error and computational efficiency. All analysis is performed using the GenCAT package, R version 3.2.1. We demonstrate that class-level testing complements the common first stage minP approach that involves individual SNP-level testing followed by post-hoc ascribing of statistically significant SNPs to genes and loci. GenCAT suggests 54 protein-coding genes at 41 distinct loci for the 13 CMD traits investigated in the discovery analysis, that are beyond the discoveries of minP alone. An additional application to biological pathways demonstrates flexibility in defining genetic classes. We conclude that it would be prudent to include class-level testing as standard practice in GWA analysis. GenCAT, for example, can be used as a simple, complementary and efficient strategy for class-level testing that leverages existing data resources, requires only summary level data in the form of test statistics, and adds significant value with respect to its potential for identifying multiple novel and clinically relevant trait associations.

  11. Multidimensional Rasch Analysis of a Psychological Test with Multiple Subtests: A Statistical Solution for the Bandwidth-Fidelity Dilemma

    ERIC Educational Resources Information Center

    Cheng, Ying-Yao; Wang, Wen-Chung; Ho, Yi-Hui

    2009-01-01

    Educational and psychological tests are often composed of multiple short subtests, each measuring a distinct latent trait. Unfortunately, short subtests suffer from low measurement precision, which makes the bandwidth-fidelity dilemma inevitable. In this study, the authors demonstrate how a multidimensional Rasch analysis can be employed to take…

  12. The Evidence for Efficacy of HPV Vaccines: Investigations in Categorical Data Analysis

    ERIC Educational Resources Information Center

    Gibbs, Alison L.; Goossens, Emery T.

    2013-01-01

    Recent approval of HPV vaccines and their widespread provision to young women provide an interesting context to gain experience with the application of statistical methods in current research. We demonstrate how we have used data extracted from a meta-analysis examining the efficacy of HPV vaccines in clinical trials with students in applied…

  13. Passing the Test: Ecological Regression Analysis in the Los Angeles County Case and Beyond.

    ERIC Educational Resources Information Center

    Lichtman, Allan J.

    1991-01-01

    Statistical analysis of racially polarized voting prepared for the Garza v County of Los Angeles (California) (1990) voting rights case is reviewed to demonstrate that ecological regression is a flexible, robust technique that illuminates the reality of ethnic voting, and superior to the neighborhood model supported by the defendants. (SLD)

  14. Two-Sample Statistics for Testing the Equality of Survival Functions Against Improper Semi-parametric Accelerated Failure Time Alternatives: An Application to the Analysis of a Breast Cancer Clinical Trial

    PubMed Central

    BROËT, PHILIPPE; TSODIKOV, ALEXANDER; DE RYCKE, YANN; MOREAU, THIERRY

    2010-01-01

    This paper presents two-sample statistics suited for testing equality of survival functions against improper semi-parametric accelerated failure time alternatives. These tests are designed for comparing either the short- or the long-term effect of a prognostic factor, or both. These statistics are obtained as partial likelihood score statistics from a time-dependent Cox model. As a consequence, the proposed tests can be very easily implemented using widely available software. A breast cancer clinical trial is presented as an example to demonstrate the utility of the proposed tests. PMID:15293627

  15. Phenotypic mapping of metabolic profiles using self-organizing maps of high-dimensional mass spectrometry data.

    PubMed

    Goodwin, Cody R; Sherrod, Stacy D; Marasco, Christina C; Bachmann, Brian O; Schramm-Sapyta, Nicole; Wikswo, John P; McLean, John A

    2014-07-01

    A metabolic system is composed of inherently interconnected metabolic precursors, intermediates, and products. The analysis of untargeted metabolomics data has conventionally been performed through the use of comparative statistics or multivariate statistical analysis-based approaches; however, each falls short in representing the related nature of metabolic perturbations. Herein, we describe a complementary method for the analysis of large metabolite inventories using a data-driven approach based upon a self-organizing map algorithm. This workflow allows for the unsupervised clustering, and subsequent prioritization of, correlated features through Gestalt comparisons of metabolic heat maps. We describe this methodology in detail, including a comparison to conventional metabolomics approaches, and demonstrate the application of this method to the analysis of the metabolic repercussions of prolonged cocaine exposure in rat sera profiles.

  16. A scaling procedure for the response of an isolated system with high modal overlap factor

    NASA Astrophysics Data System (ADS)

    De Rosa, S.; Franco, F.

    2008-10-01

    The paper deals with a numerical approach that reduces some physical sizes of the solution domain to compute the dynamic response of an isolated system: it has been named Asymptotical Scaled Modal Analysis (ASMA). The proposed numerical procedure alters the input data needed to obtain the classic modal responses to increase the frequency band of validity of the discrete or continuous coordinates model through the definition of a proper scaling coefficient. It is demonstrated that the computational cost remains acceptable while the frequency range of analysis increases. Moreover, with reference to the flexural vibrations of a rectangular plate, the paper discusses the ASMA vs. the statistical energy analysis and the energy distribution approach. Some insights are also given about the limits of the scaling coefficient. Finally it is shown that the linear dynamic response, predicted with the scaling procedure, has the same quality and characteristics of the statistical energy analysis, but it can be useful when the system cannot be solved appropriately by the standard Statistical Energy Analysis (SEA).

  17. Comparative efficacy of two battery-powered toothbrushes on dental plaque removal.

    PubMed

    Ruhlman, C Douglas; Bartizek, Robert D; Biesbrock, Aaron R

    2002-01-01

    A number of clinical studies have consistently demonstrated that power toothbrushes deliver superior plaque removal compared to manual toothbrushes. Recently, a new power toothbrush (Crest SpinBrush) has been marketed with a design that fundamentally differs from other marketed power toothbrushes. Other power toothbrushes feature a small, round head designed to oscillate for enhanced cleaning between the teeth and below the gumline. The new power toothbrush incorporates a similar round oscillating head in conjunction with fixed bristles, which allows the user to brush with optimal manual brushing technique. The objective of this randomized, examiner-blind, parallel design study was to compare the plaque removal efficacy of a positive control power toothbrush (Colgate Actibrush) to an experimental toothbrush (Crest SpinBrush) following a single use among 59 subjects. Baseline plaque scores were 1.64 and 1.40 for the experimental toothbrush and control toothbrush treatment groups, respectively. With regard to all surfaces examined, the experimental toothbrush delivered an adjusted (via analysis of covariance) mean difference between baseline and post-brushing plaque scores of 0.47, while the control toothbrush delivered an adjusted mean difference of 0.33. On average, the difference between toothbrushes was statistically significant (p = 0.013). Because the covariate slope for the experimental group was statistically significantly greater (p = 0.001) than the slope for the control group, a separate slope model was used. Further analysis demonstrated that the experimental group had statistically significantly greater plaque removal than the control group for baseline plaque scores above 1.43. With respect to buccal surfaces, using a separate slope analysis of covariance, the experimental toothbrush delivered an adjusted mean difference between baseline and post-brushing plaque scores of 0.61, while the control toothbrush delivered an adjusted mean difference of 0.39. This difference between toothbrushes was also statistically significant (p = 0.002). On average, the results on lingual surfaces demonstrated similar directional scores favoring the experimental toothbrush; however these results did not achieve statistical significance. In conclusion, the experimental Crest SpinBrush, with its novel fixed and oscillating bristle design, was found to be more effective than the positive control Colgate Actibrush, which is designed with a small round oscillating cluster of bristles.

  18. Statistical assessment of the learning curves of health technologies.

    PubMed

    Ramsay, C R; Grant, A M; Wallace, S A; Garthwaite, P H; Monk, A F; Russell, I T

    2001-01-01

    (1) To describe systematically studies that directly assessed the learning curve effect of health technologies. (2) Systematically to identify 'novel' statistical techniques applied to learning curve data in other fields, such as psychology and manufacturing. (3) To test these statistical techniques in data sets from studies of varying designs to assess health technologies in which learning curve effects are known to exist. METHODS - STUDY SELECTION (HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW): For a study to be included, it had to include a formal analysis of the learning curve of a health technology using a graphical, tabular or statistical technique. METHODS - STUDY SELECTION (NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH): For a study to be included, it had to include a formal assessment of a learning curve using a statistical technique that had not been identified in the previous search. METHODS - DATA SOURCES: Six clinical and 16 non-clinical biomedical databases were searched. A limited amount of handsearching and scanning of reference lists was also undertaken. METHODS - DATA EXTRACTION (HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW): A number of study characteristics were abstracted from the papers such as study design, study size, number of operators and the statistical method used. METHODS - DATA EXTRACTION (NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH): The new statistical techniques identified were categorised into four subgroups of increasing complexity: exploratory data analysis; simple series data analysis; complex data structure analysis, generic techniques. METHODS - TESTING OF STATISTICAL METHODS: Some of the statistical methods identified in the systematic searches for single (simple) operator series data and for multiple (complex) operator series data were illustrated and explored using three data sets. The first was a case series of 190 consecutive laparoscopic fundoplication procedures performed by a single surgeon; the second was a case series of consecutive laparoscopic cholecystectomy procedures performed by ten surgeons; the third was randomised trial data derived from the laparoscopic procedure arm of a multicentre trial of groin hernia repair, supplemented by data from non-randomised operations performed during the trial. RESULTS - HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW: Of 4571 abstracts identified, 272 (6%) were later included in the study after review of the full paper. Some 51% of studies assessed a surgical minimal access technique and 95% were case series. The statistical method used most often (60%) was splitting the data into consecutive parts (such as halves or thirds), with only 14% attempting a more formal statistical analysis. The reporting of the studies was poor, with 31% giving no details of data collection methods. RESULTS - NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH: Of 9431 abstracts assessed, 115 (1%) were deemed appropriate for further investigation and, of these, 18 were included in the study. All of the methods for complex data sets were identified in the non-clinical literature. These were discriminant analysis, two-stage estimation of learning rates, generalised estimating equations, multilevel models, latent curve models, time series models and stochastic parameter models. In addition, eight new shapes of learning curves were identified. RESULTS - TESTING OF STATISTICAL METHODS: No one particular shape of learning curve performed significantly better than another. The performance of 'operation time' as a proxy for learning differed between the three procedures. Multilevel modelling using the laparoscopic cholecystectomy data demonstrated and measured surgeon-specific and confounding effects. The inclusion of non-randomised cases, despite the possible limitations of the method, enhanced the interpretation of learning effects. CONCLUSIONS - HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW: The statistical methods used for assessing learning effects in health technology assessment have been crude and the reporting of studies poor. CONCLUSIONS - NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH: A number of statistical methods for assessing learning effects were identified that had not hitherto been used in health technology assessment. There was a hierarchy of methods for the identification and measurement of learning, and the more sophisticated methods for both have had little if any use in health technology assessment. This demonstrated the value of considering fields outside clinical research when addressing methodological issues in health technology assessment. CONCLUSIONS - TESTING OF STATISTICAL METHODS: It has been demonstrated that the portfolio of techniques identified can enhance investigations of learning curve effects. (ABSTRACT TRUNCATED)

  19. The Student-to-Student Chemistry Initiative: Training High School Students To Perform Chemistry Demonstration Programs for Elementary School Students

    NASA Astrophysics Data System (ADS)

    Voegel, Phillip D.; Quashnock, Kathryn A.; Heil, Katrina M.

    2004-05-01

    The Student-to-Student Chemistry Initiative is an outreach program started in the fall of 2001 at Midwestern State University (MSU). The oncampus program trains high school science students to perform a series of chemistry demonstrations and subsequently provides kits containing necessary supplies and reagents for the high school students to perform demonstration programs at elementary schools. The program focuses on improving student perception of science. The program's impact on high school student perception is evaluated through statistical analysis of paired preparticipation and postparticipation surveys. The surveys focus on four areas of student perception: general attitude toward science, interest in careers in science, science awareness, and interest in attending MSU for postsecondary education. Increased scores were observed in all evaluation areas including a statistically significant increase in science awareness following participation.

  20. The coordinate-based meta-analysis of neuroimaging data.

    PubMed

    Samartsidis, Pantelis; Montagna, Silvia; Nichols, Thomas E; Johnson, Timothy D

    2017-01-01

    Neuroimaging meta-analysis is an area of growing interest in statistics. The special characteristics of neuroimaging data render classical meta-analysis methods inapplicable and therefore new methods have been developed. We review existing methodologies, explaining the benefits and drawbacks of each. A demonstration on a real dataset of emotion studies is included. We discuss some still-open problems in the field to highlight the need for future research.

  1. The coordinate-based meta-analysis of neuroimaging data

    PubMed Central

    Samartsidis, Pantelis; Montagna, Silvia; Nichols, Thomas E.; Johnson, Timothy D.

    2017-01-01

    Neuroimaging meta-analysis is an area of growing interest in statistics. The special characteristics of neuroimaging data render classical meta-analysis methods inapplicable and therefore new methods have been developed. We review existing methodologies, explaining the benefits and drawbacks of each. A demonstration on a real dataset of emotion studies is included. We discuss some still-open problems in the field to highlight the need for future research. PMID:29545671

  2. Statistical Learning Analysis in Neuroscience: Aiming for Transparency

    PubMed Central

    Hanke, Michael; Halchenko, Yaroslav O.; Haxby, James V.; Pollmann, Stefan

    2009-01-01

    Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods, neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires “neuroscience-aware” technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here, we review its features and applicability to various neural data modalities. PMID:20582270

  3. Statistical Analysis of Protein Ensembles

    NASA Astrophysics Data System (ADS)

    Máté, Gabriell; Heermann, Dieter

    2014-04-01

    As 3D protein-configuration data is piling up, there is an ever-increasing need for well-defined, mathematically rigorous analysis approaches, especially that the vast majority of the currently available methods rely heavily on heuristics. We propose an analysis framework which stems from topology, the field of mathematics which studies properties preserved under continuous deformations. First, we calculate a barcode representation of the molecules employing computational topology algorithms. Bars in this barcode represent different topological features. Molecules are compared through their barcodes by statistically determining the difference in the set of their topological features. As a proof-of-principle application, we analyze a dataset compiled of ensembles of different proteins, obtained from the Ensemble Protein Database. We demonstrate that our approach correctly detects the different protein groupings.

  4. Incorporating principal component analysis into air quality model evaluation

    EPA Science Inventory

    The efficacy of standard air quality model evaluation techniques is becoming compromised as the simulation periods continue to lengthen in response to ever increasing computing capacity. Accordingly, the purpose of this paper is to demonstrate a statistical approach called Princi...

  5. AGR-1 Thermocouple Data Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jeff Einerson

    2012-05-01

    This report documents an effort to analyze measured and simulated data obtained in the Advanced Gas Reactor (AGR) fuel irradiation test program conducted in the INL's Advanced Test Reactor (ATR) to support the Next Generation Nuclear Plant (NGNP) R&D program. The work follows up on a previous study (Pham and Einerson, 2010), in which statistical analysis methods were applied for AGR-1 thermocouple data qualification. The present work exercises the idea that, while recognizing uncertainties inherent in physics and thermal simulations of the AGR-1 test, results of the numerical simulations can be used in combination with the statistical analysis methods tomore » further improve qualification of measured data. Additionally, the combined analysis of measured and simulation data can generate insights about simulation model uncertainty that can be useful for model improvement. This report also describes an experimental control procedure to maintain fuel target temperature in the future AGR tests using regression relationships that include simulation results. The report is organized into four chapters. Chapter 1 introduces the AGR Fuel Development and Qualification program, AGR-1 test configuration and test procedure, overview of AGR-1 measured data, and overview of physics and thermal simulation, including modeling assumptions and uncertainties. A brief summary of statistical analysis methods developed in (Pham and Einerson 2010) for AGR-1 measured data qualification within NGNP Data Management and Analysis System (NDMAS) is also included for completeness. Chapters 2-3 describe and discuss cases, in which the combined use of experimental and simulation data is realized. A set of issues associated with measurement and modeling uncertainties resulted from the combined analysis are identified. This includes demonstration that such a combined analysis led to important insights for reducing uncertainty in presentation of AGR-1 measured data (Chapter 2) and interpretation of simulation results (Chapter 3). The statistics-based simulation-aided experimental control procedure described for the future AGR tests is developed and demonstrated in Chapter 4. The procedure for controlling the target fuel temperature (capsule peak or average) is based on regression functions of thermocouple readings and other relevant parameters and accounting for possible changes in both physical and thermal conditions and in instrument performance.« less

  6. Combined data preprocessing and multivariate statistical analysis characterizes fed-batch culture of mouse hybridoma cells for rational medium design.

    PubMed

    Selvarasu, Suresh; Kim, Do Yun; Karimi, Iftekhar A; Lee, Dong-Yup

    2010-10-01

    We present an integrated framework for characterizing fed-batch cultures of mouse hybridoma cells producing monoclonal antibody (mAb). This framework systematically combines data preprocessing, elemental balancing and statistical analysis technique. Initially, specific rates of cell growth, glucose/amino acid consumptions and mAb/metabolite productions were calculated via curve fitting using logistic equations, with subsequent elemental balancing of the preprocessed data indicating the presence of experimental measurement errors. Multivariate statistical analysis was then employed to understand physiological characteristics of the cellular system. The results from principal component analysis (PCA) revealed three major clusters of amino acids with similar trends in their consumption profiles: (i) arginine, threonine and serine, (ii) glycine, tyrosine, phenylalanine, methionine, histidine and asparagine, and (iii) lysine, valine and isoleucine. Further analysis using partial least square (PLS) regression identified key amino acids which were positively or negatively correlated with the cell growth, mAb production and the generation of lactate and ammonia. Based on these results, the optimal concentrations of key amino acids in the feed medium can be inferred, potentially leading to an increase in cell viability and productivity, as well as a decrease in toxic waste production. The study demonstrated how the current methodological framework using multivariate statistical analysis techniques can serve as a potential tool for deriving rational medium design strategies. Copyright © 2010 Elsevier B.V. All rights reserved.

  7. Network Meta-Analysis Using R: A Review of Currently Available Automated Packages

    PubMed Central

    Neupane, Binod; Richer, Danielle; Bonner, Ashley Joel; Kibret, Taddele; Beyene, Joseph

    2014-01-01

    Network meta-analysis (NMA) – a statistical technique that allows comparison of multiple treatments in the same meta-analysis simultaneously – has become increasingly popular in the medical literature in recent years. The statistical methodology underpinning this technique and software tools for implementing the methods are evolving. Both commercial and freely available statistical software packages have been developed to facilitate the statistical computations using NMA with varying degrees of functionality and ease of use. This paper aims to introduce the reader to three R packages, namely, gemtc, pcnetmeta, and netmeta, which are freely available software tools implemented in R. Each automates the process of performing NMA so that users can perform the analysis with minimal computational effort. We present, compare and contrast the availability and functionality of different important features of NMA in these three packages so that clinical investigators and researchers can determine which R packages to implement depending on their analysis needs. Four summary tables detailing (i) data input and network plotting, (ii) modeling options, (iii) assumption checking and diagnostic testing, and (iv) inference and reporting tools, are provided, along with an analysis of a previously published dataset to illustrate the outputs available from each package. We demonstrate that each of the three packages provides a useful set of tools, and combined provide users with nearly all functionality that might be desired when conducting a NMA. PMID:25541687

  8. Network meta-analysis using R: a review of currently available automated packages.

    PubMed

    Neupane, Binod; Richer, Danielle; Bonner, Ashley Joel; Kibret, Taddele; Beyene, Joseph

    2014-01-01

    Network meta-analysis (NMA)--a statistical technique that allows comparison of multiple treatments in the same meta-analysis simultaneously--has become increasingly popular in the medical literature in recent years. The statistical methodology underpinning this technique and software tools for implementing the methods are evolving. Both commercial and freely available statistical software packages have been developed to facilitate the statistical computations using NMA with varying degrees of functionality and ease of use. This paper aims to introduce the reader to three R packages, namely, gemtc, pcnetmeta, and netmeta, which are freely available software tools implemented in R. Each automates the process of performing NMA so that users can perform the analysis with minimal computational effort. We present, compare and contrast the availability and functionality of different important features of NMA in these three packages so that clinical investigators and researchers can determine which R packages to implement depending on their analysis needs. Four summary tables detailing (i) data input and network plotting, (ii) modeling options, (iii) assumption checking and diagnostic testing, and (iv) inference and reporting tools, are provided, along with an analysis of a previously published dataset to illustrate the outputs available from each package. We demonstrate that each of the three packages provides a useful set of tools, and combined provide users with nearly all functionality that might be desired when conducting a NMA.

  9. Time Series Expression Analyses Using RNA-seq: A Statistical Approach

    PubMed Central

    Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P.

    2013-01-01

    RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis. PMID:23586021

  10. Mediation analysis in nursing research: a methodological review.

    PubMed

    Liu, Jianghong; Ulrich, Connie

    2016-12-01

    Mediation statistical models help clarify the relationship between independent predictor variables and dependent outcomes of interest by assessing the impact of third variables. This type of statistical analysis is applicable for many clinical nursing research questions, yet its use within nursing remains low. Indeed, mediational analyses may help nurse researchers develop more effective and accurate prevention and treatment programs as well as help bridge the gap between scientific knowledge and clinical practice. In addition, this statistical approach allows nurse researchers to ask - and answer - more meaningful and nuanced questions that extend beyond merely determining whether an outcome occurs. Therefore, the goal of this paper is to provide a brief tutorial on the use of mediational analyses in clinical nursing research by briefly introducing the technique and, through selected empirical examples from the nursing literature, demonstrating its applicability in advancing nursing science.

  11. Structure-Specific Statistical Mapping of White Matter Tracts

    PubMed Central

    Yushkevich, Paul A.; Zhang, Hui; Simon, Tony; Gee, James C.

    2008-01-01

    We present a new model-based framework for the statistical analysis of diffusion imaging data associated with specific white matter tracts. The framework takes advantage of the fact that several of the major white matter tracts are thin sheet-like structures that can be effectively modeled by medial representations. The approach involves segmenting major tracts and fitting them with deformable geometric medial models. The medial representation makes it possible to average and combine tensor-based features along directions locally perpendicular to the tracts, thus reducing data dimensionality and accounting for errors in normalization. The framework enables the analysis of individual white matter structures, and provides a range of possibilities for computing statistics and visualizing differences between cohorts. The framework is demonstrated in a study of white matter differences in pediatric chromosome 22q11.2 deletion syndrome. PMID:18407524

  12. Time series expression analyses using RNA-seq: a statistical approach.

    PubMed

    Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P

    2013-01-01

    RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis.

  13. Multilevel Latent Class Analysis for Large-Scale Educational Assessment Data: Exploring the Relation between the Curriculum and Students' Mathematical Strategies

    ERIC Educational Resources Information Center

    Fagginger Auer, Marije F.; Hickendorff, Marian; Van Putten, Cornelis M.; Béguin, Anton A.; Heiser, Willem J.

    2016-01-01

    A first application of multilevel latent class analysis (MLCA) to educational large-scale assessment data is demonstrated. This statistical technique addresses several of the challenges that assessment data offers. Importantly, MLCA allows modeling of the often ignored teacher effects and of the joint influence of teacher and student variables.…

  14. Increasing URM Undergraduate Student Success through Assessment-Driven Interventions: A Multiyear Study Using Freshman-Level General Biology as a Model System

    PubMed Central

    Carmichael, Mary C.; St. Clair, Candace; Edwards, Andrea M.; Barrett, Peter; McFerrin, Harris; Davenport, Ian; Awad, Mohamed; Kundu, Anup; Ireland, Shubha Kale

    2016-01-01

    Xavier University of Louisiana leads the nation in awarding BS degrees in the biological sciences to African-American students. In this multiyear study with ∼5500 participants, data-driven interventions were adopted to improve student academic performance in a freshman-level general biology course. The three hour-long exams were common and administered concurrently to all students. New exam questions were developed using Bloom’s taxonomy, and exam results were analyzed statistically with validated assessment tools. All but the comprehensive final exam were returned to students for self-evaluation and remediation. Among other approaches, course rigor was monitored by using an identical set of 60 questions on the final exam across 10 semesters. Analysis of the identical sets of 60 final exam questions revealed that overall averages increased from 72.9% (2010) to 83.5% (2015). Regression analysis demonstrated a statistically significant correlation between high-risk students and their averages on the 60 questions. Additional analysis demonstrated statistically significant improvements for at least one letter grade from midterm to final and a 20% increase in the course pass rates over time, also for the high-risk population. These results support the hypothesis that our data-driven interventions and assessment techniques are successful in improving student retention, particularly for our academically at-risk students. PMID:27543637

  15. Comparison and Recovery of Escherichia coli and Thermotolerant Coliforms in Water with a Chromogenic Medium Incubated at 41 and 44.5°C

    PubMed Central

    Alonso, Jose L.; Soriano, Adela; Carbajo, Oscar; Amoros, Inmaculada; Garelick, Hemda

    1999-01-01

    This study compared the performance of a commercial chromogenic medium, CHROMagarECC (CECC), and CECC supplemented with sodium pyruvate (CECCP) with the membrane filtration lauryl sulfate-based medium (mLSA) for enumeration of Escherichia coli and non-E. coli thermotolerant coliforms (KEC). To establish that we could recover the maximum KEC and E. coli population, we compared two incubation temperature regimens, 41 and 44.5°C. Statistical analysis by the Fisher test of data did not demonstrate any statistically significant differences (P = 0.05) in the enumeration of E. coli for the different media (CECC and CECCP) and incubation temperatures. Variance analysis of data performed on KEC counts showed significant differences (P = 0.01) between KEC counts at 41 and 44.5°C on both CECC and CECCP. Analysis of variance demonstrated statistically significant differences (P = 0.05) in the enumeration of total thermotolerant coliforms (TTCs) on CECC and CECCP compared with mLSA. Target colonies were confirmed to be E. coli at a rate of 91.5% and KEC of likely fecal origin at a rate of 77.4% when using CECCP incubated at 41°C. The results of this study showed that CECCP agar incubated at 41°C is efficient for the simultaneous enumeration of E. coli and KEC from river and marine waters. PMID:10427079

  16. Comparison of the flexural strength of six reinforced restorative materials.

    PubMed

    Cohen, B I; Volovich, Y; Musikant, B L; Deutsch, A S

    2001-01-01

    This study calculated the flexural strength for six reinforced restorative materials and demonstrated that flexural strength values can be determined simply by using physical parameters (diametral tensile strength and Young's modulus values) that are easily determined experimentally. A one-way ANOVA analysis demonstrated a statistically significant difference between the two reinforced glass ionomers and the four composite resin materials, with the composite resin being stronger than the glass ionomers.

  17. Empirical Reference Distributions for Networks of Different Size

    PubMed Central

    Smith, Anna; Calder, Catherine A.; Browning, Christopher R.

    2016-01-01

    Network analysis has become an increasingly prevalent research tool across a vast range of scientific fields. Here, we focus on the particular issue of comparing network statistics, i.e. graph-level measures of network structural features, across multiple networks that differ in size. Although “normalized” versions of some network statistics exist, we demonstrate via simulation why direct comparison is often inappropriate. We consider normalizing network statistics relative to a simple fully parameterized reference distribution and demonstrate via simulation how this is an improvement over direct comparison, but still sometimes problematic. We propose a new adjustment method based on a reference distribution constructed as a mixture model of random graphs which reflect the dependence structure exhibited in the observed networks. We show that using simple Bernoulli models as mixture components in this reference distribution can provide adjusted network statistics that are relatively comparable across different network sizes but still describe interesting features of networks, and that this can be accomplished at relatively low computational expense. Finally, we apply this methodology to a collection of ecological networks derived from the Los Angeles Family and Neighborhood Survey activity location data. PMID:27721556

  18. Statistical analysis of solid waste composition data: Arithmetic mean, standard deviation and correlation coefficients.

    PubMed

    Edjabou, Maklawe Essonanawe; Martín-Fernández, Josep Antoni; Scheutz, Charlotte; Astrup, Thomas Fruergaard

    2017-11-01

    Data for fractional solid waste composition provide relative magnitudes of individual waste fractions, the percentages of which always sum to 100, thereby connecting them intrinsically. Due to this sum constraint, waste composition data represent closed data, and their interpretation and analysis require statistical methods, other than classical statistics that are suitable only for non-constrained data such as absolute values. However, the closed characteristics of waste composition data are often ignored when analysed. The results of this study showed, for example, that unavoidable animal-derived food waste amounted to 2.21±3.12% with a confidence interval of (-4.03; 8.45), which highlights the problem of the biased negative proportions. A Pearson's correlation test, applied to waste fraction generation (kg mass), indicated a positive correlation between avoidable vegetable food waste and plastic packaging. However, correlation tests applied to waste fraction compositions (percentage values) showed a negative association in this regard, thus demonstrating that statistical analyses applied to compositional waste fraction data, without addressing the closed characteristics of these data, have the potential to generate spurious or misleading results. Therefore, ¨compositional data should be transformed adequately prior to any statistical analysis, such as computing mean, standard deviation and correlation coefficients. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. The mediating effect of calling on the relationship between medical school students' academic burnout and empathy.

    PubMed

    Chae, Su Jin; Jeong, So Mi; Chung, Yoon-Sok

    2017-09-01

    This study is aimed at identifying the relationships between medical school students' academic burnout, empathy, and calling, and determining whether their calling has a mediating effect on the relationship between academic burnout and empathy. A mixed method study was conducted. One hundred twenty-seven medical students completed a survey. Scales measuring academic burnout, medical students' empathy, and calling were utilized. For statistical analysis, correlation analysis, descriptive statistics analysis, and hierarchical multiple regression analyses were conducted. For qualitative approach, eight medical students participated in a focus group interview. The study found that empathy has a statistically significant, negative correlation with academic burnout, while having a significant, positive correlation with calling. Sense of calling proved to be an effective mediator of the relationship between academic burnout and empathy. This result demonstrates that calling is a key variable that mediates the relationship between medical students' academic burnout and empathy. As such, this study provides baseline data for an education that could improve medical students' empathy skills.

  20. Quantifying the Energy Landscape Statistics in Proteins - a Relaxation Mode Analysis

    NASA Astrophysics Data System (ADS)

    Cai, Zhikun; Zhang, Yang

    Energy landscape, the hypersurface in the configurational space, has been a useful concept in describing complex processes that occur over a very long time scale, such as the multistep slow relaxations of supercooled liquids and folding of polypeptide chains into structured proteins. Despite extensive simulation studies, its experimental characterization still remains a challenge. To address this challenge, we developed a relaxation mode analysis (RMA) for liquids under a framework analogous to the normal mode analysis for solids. Using RMA, important statistics of the activation barriers of the energy landscape becomes accessible from experimentally measurable two-point correlation functions, e.g. using quasi-elastic and inelastic scattering experiments. We observed a prominent coarsening effect of the energy landscape. The results were further confirmed by direct sampling of the energy landscape using a metadynamics-like adaptive autonomous basin climbing computation. We first demonstrate RMA in a supercooled liquid when dynamical cooperativity emerges in the landscape-influenced regime. Then we show this framework reveals encouraging energy landscape statistics when applied to proteins.

  1. Statistical Methodologies to Integrate Experimental and Computational Research

    NASA Technical Reports Server (NTRS)

    Parker, P. A.; Johnson, R. T.; Montgomery, D. C.

    2008-01-01

    Development of advanced algorithms for simulating engine flow paths requires the integration of fundamental experiments with the validation of enhanced mathematical models. In this paper, we provide an overview of statistical methods to strategically and efficiently conduct experiments and computational model refinement. Moreover, the integration of experimental and computational research efforts is emphasized. With a statistical engineering perspective, scientific and engineering expertise is combined with statistical sciences to gain deeper insights into experimental phenomenon and code development performance; supporting the overall research objectives. The particular statistical methods discussed are design of experiments, response surface methodology, and uncertainty analysis and planning. Their application is illustrated with a coaxial free jet experiment and a turbulence model refinement investigation. Our goal is to provide an overview, focusing on concepts rather than practice, to demonstrate the benefits of using statistical methods in research and development, thereby encouraging their broader and more systematic application.

  2. Identification of the isomers using principal component analysis (PCA) method

    NASA Astrophysics Data System (ADS)

    Kepceoǧlu, Abdullah; Gündoǧdu, Yasemin; Ledingham, Kenneth William David; Kilic, Hamdi Sukur

    2016-03-01

    In this work, we have carried out a detailed statistical analysis for experimental data of mass spectra from xylene isomers. Principle Component Analysis (PCA) was used to identify the isomers which cannot be distinguished using conventional statistical methods for interpretation of their mass spectra. Experiments have been carried out using a linear TOF-MS coupled to a femtosecond laser system as an energy source for the ionisation processes. We have performed experiments and collected data which has been analysed and interpreted using PCA as a multivariate analysis of these spectra. This demonstrates the strength of the method to get an insight for distinguishing the isomers which cannot be identified using conventional mass analysis obtained through dissociative ionisation processes on these molecules. The PCA results dependending on the laser pulse energy and the background pressure in the spectrometers have been presented in this work.

  3. A Finite-Volume "Shaving" Method for Interfacing NASA/DAO''s Physical Space Statistical Analysis System to the Finite-Volume GCM with a Lagrangian Control-Volume Vertical Coordinate

    NASA Technical Reports Server (NTRS)

    Lin, Shian-Jiann; DaSilva, Arlindo; Atlas, Robert (Technical Monitor)

    2001-01-01

    Toward the development of a finite-volume Data Assimilation System (fvDAS), a consistent finite-volume methodology is developed for interfacing the NASA/DAO's Physical Space Statistical Analysis System (PSAS) to the joint NASA/NCAR finite volume CCM3 (fvCCM3). To take advantage of the Lagrangian control-volume vertical coordinate of the fvCCM3, a novel "shaving" method is applied to the lowest few model layers to reflect the surface pressure changes as implied by the final analysis. Analysis increments (from PSAS) to the upper air variables are then consistently put onto the Lagrangian layers as adjustments to the volume-mean quantities during the analysis cycle. This approach is demonstrated to be superior to the conventional method of using independently computed "tendency terms" for surface pressure and upper air prognostic variables.

  4. Illicit and pharmaceutical drug consumption estimated via wastewater analysis. Part B: placing back-calculations in a formal statistical framework.

    PubMed

    Jones, Hayley E; Hickman, Matthew; Kasprzyk-Hordern, Barbara; Welton, Nicky J; Baker, David R; Ades, A E

    2014-07-15

    Concentrations of metabolites of illicit drugs in sewage water can be measured with great accuracy and precision, thanks to the development of sensitive and robust analytical methods. Based on assumptions about factors including the excretion profile of the parent drug, routes of administration and the number of individuals using the wastewater system, the level of consumption of a drug can be estimated from such measured concentrations. When presenting results from these 'back-calculations', the multiple sources of uncertainty are often discussed, but are not usually explicitly taken into account in the estimation process. In this paper we demonstrate how these calculations can be placed in a more formal statistical framework by assuming a distribution for each parameter involved, based on a review of the evidence underpinning it. Using a Monte Carlo simulations approach, it is then straightforward to propagate uncertainty in each parameter through the back-calculations, producing a distribution for instead of a single estimate of daily or average consumption. This can be summarised for example by a median and credible interval. To demonstrate this approach, we estimate cocaine consumption in a large urban UK population, using measured concentrations of two of its metabolites, benzoylecgonine and norbenzoylecgonine. We also demonstrate a more sophisticated analysis, implemented within a Bayesian statistical framework using Markov chain Monte Carlo simulation. Our model allows the two metabolites to simultaneously inform estimates of daily cocaine consumption and explicitly allows for variability between days. After accounting for this variability, the resulting credible interval for average daily consumption is appropriately wider, representing additional uncertainty. We discuss possibilities for extensions to the model, and whether analysis of wastewater samples has potential to contribute to a prevalence model for illicit drug use. Copyright © 2014. Published by Elsevier B.V.

  5. Illicit and pharmaceutical drug consumption estimated via wastewater analysis. Part B: Placing back-calculations in a formal statistical framework

    PubMed Central

    Jones, Hayley E.; Hickman, Matthew; Kasprzyk-Hordern, Barbara; Welton, Nicky J.; Baker, David R.; Ades, A.E.

    2014-01-01

    Concentrations of metabolites of illicit drugs in sewage water can be measured with great accuracy and precision, thanks to the development of sensitive and robust analytical methods. Based on assumptions about factors including the excretion profile of the parent drug, routes of administration and the number of individuals using the wastewater system, the level of consumption of a drug can be estimated from such measured concentrations. When presenting results from these ‘back-calculations’, the multiple sources of uncertainty are often discussed, but are not usually explicitly taken into account in the estimation process. In this paper we demonstrate how these calculations can be placed in a more formal statistical framework by assuming a distribution for each parameter involved, based on a review of the evidence underpinning it. Using a Monte Carlo simulations approach, it is then straightforward to propagate uncertainty in each parameter through the back-calculations, producing a distribution for instead of a single estimate of daily or average consumption. This can be summarised for example by a median and credible interval. To demonstrate this approach, we estimate cocaine consumption in a large urban UK population, using measured concentrations of two of its metabolites, benzoylecgonine and norbenzoylecgonine. We also demonstrate a more sophisticated analysis, implemented within a Bayesian statistical framework using Markov chain Monte Carlo simulation. Our model allows the two metabolites to simultaneously inform estimates of daily cocaine consumption and explicitly allows for variability between days. After accounting for this variability, the resulting credible interval for average daily consumption is appropriately wider, representing additional uncertainty. We discuss possibilities for extensions to the model, and whether analysis of wastewater samples has potential to contribute to a prevalence model for illicit drug use. PMID:24636801

  6. GLIMMPSE Lite: Calculating Power and Sample Size on Smartphone Devices

    PubMed Central

    Munjal, Aarti; Sakhadeo, Uttara R.; Muller, Keith E.; Glueck, Deborah H.; Kreidler, Sarah M.

    2014-01-01

    Researchers seeking to develop complex statistical applications for mobile devices face a common set of difficult implementation issues. In this work, we discuss general solutions to the design challenges. We demonstrate the utility of the solutions for a free mobile application designed to provide power and sample size calculations for univariate, one-way analysis of variance (ANOVA), GLIMMPSE Lite. Our design decisions provide a guide for other scientists seeking to produce statistical software for mobile platforms. PMID:25541688

  7. The cutting edge - Micro-CT for quantitative toolmark analysis of sharp force trauma to bone.

    PubMed

    Norman, D G; Watson, D G; Burnett, B; Fenne, P M; Williams, M A

    2018-02-01

    Toolmark analysis involves examining marks created on an object to identify the likely tool responsible for creating those marks (e.g., a knife). Although a potentially powerful forensic tool, knife mark analysis is still in its infancy and the validation of imaging techniques as well as quantitative approaches is ongoing. This study builds on previous work by simulating real-world stabbings experimentally and statistically exploring quantitative toolmark properties, such as cut mark angle captured by micro-CT imaging, to predict the knife responsible. In Experiment 1 a mechanical stab rig and two knives were used to create 14 knife cut marks on dry pig ribs. The toolmarks were laser and micro-CT scanned to allow for quantitative measurements of numerous toolmark properties. The findings from Experiment 1 demonstrated that both knives produced statistically different cut mark widths, wall angle and shapes. Experiment 2 examined knife marks created on fleshed pig torsos with conditions designed to better simulate real-world stabbings. Eight knives were used to generate 64 incision cut marks that were also micro-CT scanned. Statistical exploration of these cut marks suggested that knife type, serrated or plain, can be predicted from cut mark width and wall angle. Preliminary results suggest that knives type can be predicted from cut mark width, and that knife edge thickness correlates with cut mark width. An additional 16 cut marks walls were imaged for striation marks using scanning electron microscopy with results suggesting that this approach might not be useful for knife mark analysis. Results also indicated that observer judgements of cut mark shape were more consistent when rated from micro-CT images than light microscopy images. The potential to combine micro-CT data, medical grade CT data and photographs to develop highly realistic virtual models for visualisation and 3D printing is also demonstrated. This is the first study to statistically explore simulated real-world knife marks imaged by micro-CT to demonstrate the potential of quantitative approaches in knife mark analysis. Findings and methods presented in this study are relevant to both forensic toolmark researchers as well as practitioners. Limitations of the experimental methodologies and imaging techniques are discussed, and further work is recommended. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. GIS and statistical analysis for landslide susceptibility mapping in the Daunia area, Italy

    NASA Astrophysics Data System (ADS)

    Mancini, F.; Ceppi, C.; Ritrovato, G.

    2010-09-01

    This study focuses on landslide susceptibility mapping in the Daunia area (Apulian Apennines, Italy) and achieves this by using a multivariate statistical method and data processing in a Geographical Information System (GIS). The Logistic Regression (hereafter LR) method was chosen to produce a susceptibility map over an area of 130 000 ha where small settlements are historically threatened by landslide phenomena. By means of LR analysis, the tendency to landslide occurrences was, therefore, assessed by relating a landslide inventory (dependent variable) to a series of causal factors (independent variables) which were managed in the GIS, while the statistical analyses were performed by means of the SPSS (Statistical Package for the Social Sciences) software. The LR analysis produced a reliable susceptibility map of the investigated area and the probability level of landslide occurrence was ranked in four classes. The overall performance achieved by the LR analysis was assessed by local comparison between the expected susceptibility and an independent dataset extrapolated from the landslide inventory. Of the samples classified as susceptible to landslide occurrences, 85% correspond to areas where landslide phenomena have actually occurred. In addition, the consideration of the regression coefficients provided by the analysis demonstrated that a major role is played by the "land cover" and "lithology" causal factors in determining the occurrence and distribution of landslide phenomena in the Apulian Apennines.

  9. Surface flaw reliability analysis of ceramic components with the SCARE finite element postprocessor program

    NASA Technical Reports Server (NTRS)

    Gyekenyesi, John P.; Nemeth, Noel N.

    1987-01-01

    The SCARE (Structural Ceramics Analysis and Reliability Evaluation) computer program on statistical fast fracture reliability analysis with quadratic elements for volume distributed imperfections is enhanced to include the use of linear finite elements and the capability of designing against concurrent surface flaw induced ceramic component failure. The SCARE code is presently coupled as a postprocessor to the MSC/NASTRAN general purpose, finite element analysis program. The improved version now includes the Weibull and Batdorf statistical failure theories for both surface and volume flaw based reliability analysis. The program uses the two-parameter Weibull fracture strength cumulative failure probability distribution model with the principle of independent action for poly-axial stress states, and Batdorf's shear-sensitive as well as shear-insensitive statistical theories. The shear-sensitive surface crack configurations include the Griffith crack and Griffith notch geometries, using the total critical coplanar strain energy release rate criterion to predict mixed-mode fracture. Weibull material parameters based on both surface and volume flaw induced fracture can also be calculated from modulus of rupture bar tests, using the least squares method with known specimen geometry and grouped fracture data. The statistical fast fracture theories for surface flaw induced failure, along with selected input and output formats and options, are summarized. An example problem to demonstrate various features of the program is included.

  10. Statistical Coupling Analysis-Guided Library Design for the Discovery of Mutant Luciferases.

    PubMed

    Liu, Mira D; Warner, Elliot A; Morrissey, Charlotte E; Fick, Caitlyn W; Wu, Taia S; Ornelas, Marya Y; Ochoa, Gabriela V; Zhang, Brendan S; Rathbun, Colin M; Porterfield, William B; Prescher, Jennifer A; Leconte, Aaron M

    2018-02-06

    Directed evolution has proven to be an invaluable tool for protein engineering; however, there is still a need for developing new approaches to continue to improve the efficiency and efficacy of these methods. Here, we demonstrate a new method for library design that applies a previously developed bioinformatic method, Statistical Coupling Analysis (SCA). SCA uses homologous enzymes to identify amino acid positions that are mutable and functionally important and engage in synergistic interactions between amino acids. We use SCA to guide a library of the protein luciferase and demonstrate that, in a single round of selection, we can identify luciferase mutants with several valuable properties. Specifically, we identify luciferase mutants that possess both red-shifted emission spectra and improved stability relative to those of the wild-type enzyme. We also identify luciferase mutants that possess a >50-fold change in specificity for modified luciferins. To understand the mutational origin of these improved mutants, we demonstrate the role of mutations at N229, S239, and G246 in altered function. These studies show that SCA can be used to guide library design and rapidly identify synergistic amino acid mutations from a small library.

  11. Characterizing microstructural features of biomedical samples by statistical analysis of Mueller matrix images

    NASA Astrophysics Data System (ADS)

    He, Honghui; Dong, Yang; Zhou, Jialing; Ma, Hui

    2017-03-01

    As one of the salient features of light, polarization contains abundant structural and optical information of media. Recently, as a comprehensive description of polarization property, the Mueller matrix polarimetry has been applied to various biomedical studies such as cancerous tissues detections. In previous works, it has been found that the structural information encoded in the 2D Mueller matrix images can be presented by other transformed parameters with more explicit relationship to certain microstructural features. In this paper, we present a statistical analyzing method to transform the 2D Mueller matrix images into frequency distribution histograms (FDHs) and their central moments to reveal the dominant structural features of samples quantitatively. The experimental results of porcine heart, intestine, stomach, and liver tissues demonstrate that the transformation parameters and central moments based on the statistical analysis of Mueller matrix elements have simple relationships to the dominant microstructural properties of biomedical samples, including the density and orientation of fibrous structures, the depolarization power, diattenuation and absorption abilities. It is shown in this paper that the statistical analysis of 2D images of Mueller matrix elements may provide quantitative or semi-quantitative criteria for biomedical diagnosis.

  12. The Impact of Measurement Noise in GPA Diagnostic Analysis of a Gas Turbine Engine

    NASA Astrophysics Data System (ADS)

    Ntantis, Efstratios L.; Li, Y. G.

    2013-12-01

    The performance diagnostic analysis of a gas turbine is accomplished by estimating a set of internal engine health parameters from available sensor measurements. No physical measuring instruments however can ever completely eliminate the presence of measurement uncertainties. Sensor measurements are often distorted by noise and bias leading to inaccurate estimation results. This paper explores the impact of measurement noise on Gas Turbine GPA analysis. The analysis is demonstrated with a test case where gas turbine performance simulation and diagnostics code TURBOMATCH is used to build a performance model of a model engine similar to Rolls-Royce Trent 500 turbofan engine, and carry out the diagnostic analysis with the presence of different levels of measurement noise. Conclusively, to improve the reliability of the diagnostic results, a statistical analysis of the data scattering caused by sensor uncertainties is made. The diagnostic tool used to deal with the statistical analysis of measurement noise impact is a model-based method utilizing a non-linear GPA.

  13. SU-F-I-10: Spatially Local Statistics for Adaptive Image Filtering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Iliopoulos, AS; Sun, X; Floros, D

    Purpose: To facilitate adaptive image filtering operations, addressing spatial variations in both noise and signal. Such issues are prevalent in cone-beam projections, where physical effects such as X-ray scattering result in spatially variant noise, violating common assumptions of homogeneous noise and challenging conventional filtering approaches to signal extraction and noise suppression. Methods: We present a computational mechanism for probing into and quantifying the spatial variance of noise throughout an image. The mechanism builds a pyramid of local statistics at multiple spatial scales; local statistical information at each scale includes (weighted) mean, median, standard deviation, median absolute deviation, as well asmore » histogram or dynamic range after local mean/median shifting. Based on inter-scale differences of local statistics, the spatial scope of distinguishable noise variation is detected in a semi- or un-supervised manner. Additionally, we propose and demonstrate the incorporation of such information in globally parametrized (i.e., non-adaptive) filters, effectively transforming the latter into spatially adaptive filters. The multi-scale mechanism is materialized by efficient algorithms and implemented in parallel CPU/GPU architectures. Results: We demonstrate the impact of local statistics for adaptive image processing and analysis using cone-beam projections of a Catphan phantom, fitted within an annulus to increase X-ray scattering. The effective spatial scope of local statistics calculations is shown to vary throughout the image domain, necessitating multi-scale noise and signal structure analysis. Filtering results with and without spatial filter adaptation are compared visually, illustrating improvements in imaging signal extraction and noise suppression, and in preserving information in low-contrast regions. Conclusion: Local image statistics can be incorporated in filtering operations to equip them with spatial adaptivity to spatial signal/noise variations. An efficient multi-scale computational mechanism is developed to curtail processing latency. Spatially adaptive filtering may impact subsequent processing tasks such as reconstruction and numerical gradient computations for deformable registration. NIH Grant No. R01-184173.« less

  14. Statistical methods for astronomical data with upper limits. II - Correlation and regression

    NASA Technical Reports Server (NTRS)

    Isobe, T.; Feigelson, E. D.; Nelson, P. I.

    1986-01-01

    Statistical methods for calculating correlations and regressions in bivariate censored data where the dependent variable can have upper or lower limits are presented. Cox's regression and the generalization of Kendall's rank correlation coefficient provide significant levels of correlations, and the EM algorithm, under the assumption of normally distributed errors, and its nonparametric analog using the Kaplan-Meier estimator, give estimates for the slope of a regression line. Monte Carlo simulations demonstrate that survival analysis is reliable in determining correlations between luminosities at different bands. Survival analysis is applied to CO emission in infrared galaxies, X-ray emission in radio galaxies, H-alpha emission in cooling cluster cores, and radio emission in Seyfert galaxies.

  15. The other half of the story: effect size analysis in quantitative research.

    PubMed

    Maher, Jessica Middlemis; Markey, Jonathan C; Ebert-May, Diane

    2013-01-01

    Statistical significance testing is the cornerstone of quantitative research, but studies that fail to report measures of effect size are potentially missing a robust part of the analysis. We provide a rationale for why effect size measures should be included in quantitative discipline-based education research. Examples from both biological and educational research demonstrate the utility of effect size for evaluating practical significance. We also provide details about some effect size indices that are paired with common statistical significance tests used in educational research and offer general suggestions for interpreting effect size measures. Finally, we discuss some inherent limitations of effect size measures and provide further recommendations about reporting confidence intervals.

  16. Fast and accurate imputation of summary statistics enhances evidence of functional enrichment

    PubMed Central

    Pasaniuc, Bogdan; Zaitlen, Noah; Shi, Huwenbo; Bhatia, Gaurav; Gusev, Alexander; Pickrell, Joseph; Hirschhorn, Joel; Strachan, David P.; Patterson, Nick; Price, Alkes L.

    2014-01-01

    Motivation: Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov models (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. Results: In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1–5%) variants [increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples] versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case–control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of χ2 association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Availability and implementation: Publicly available software package available at http://bogdan.bioinformatics.ucla.edu/software/. Contact: bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu Supplementary information: Supplementary materials are available at Bioinformatics online. PMID:24990607

  17. Explaining nitrate pollution pressure on the groundwater resource in Kinshasa using a multivariate statistical modelling approach

    NASA Astrophysics Data System (ADS)

    Mfumu Kihumba, Antoine; Vanclooster, Marnik

    2013-04-01

    Drinking water in Kinshasa, the capital of the Democratic Republic of Congo, is provided by extracting groundwater from the local aquifer, particularly in peripheral areas. The exploited groundwater body is mainly unconfined and located within a continuous detrital aquifer, primarily composed of sedimentary formations. However, the aquifer is subjected to an increasing threat of anthropogenic pollution pressure. Understanding the detailed origin of this pollution pressure is important for sustainable drinking water management in Kinshasa. The present study aims to explain the observed nitrate pollution problem, nitrate being considered as a good tracer for other pollution threats. The analysis is made in terms of physical attributes that are readily available using a statistical modelling approach. For the nitrate data, use was made of a historical groundwater quality assessment study, for which the data were re-analysed. The physical attributes are related to the topography, land use, geology and hydrogeology of the region. Prior to the statistical modelling, intrinsic and specific vulnerability for nitrate pollution was assessed. This vulnerability assessment showed that the alluvium area in the northern part of the region is the most vulnerable area. This area consists of urban land use with poor sanitation. Re-analysis of the nitrate pollution data demonstrated that the spatial variability of nitrate concentrations in the groundwater body is high, and coherent with the fragmented land use of the region and the intrinsic and specific vulnerability maps. For the statistical modeling use was made of multiple regression and regression tree analysis. The results demonstrated the significant impact of land use variables on the Kinshasa groundwater nitrate pollution and the need for a detailed delineation of groundwater capture zones around the monitoring stations. Key words: Groundwater , Isotopic, Kinshasa, Modelling, Pollution, Physico-chemical.

  18. Implemented Lomb-Scargle periodogram: a valuable tool for improving cyclostratigraphic research on unevenly sampled deep-sea stratigraphic sequences

    NASA Astrophysics Data System (ADS)

    Pardo-Iguzquiza, Eulogio; Rodríguez-Tovar, Francisco J.

    2011-12-01

    One important handicap when working with stratigraphic sequences is the discontinuous character of the sedimentary record, especially relevant in cyclostratigraphic analysis. Uneven palaeoclimatic/palaeoceanographic time series are common, their cyclostratigraphic analysis being comparatively difficult because most spectral methodologies are appropriate only when working with even sampling. As a means to solve this problem, a program for calculating the smoothed Lomb-Scargle periodogram and cross-periodogram, which additionally evaluates the statistical confidence of the estimated power spectrum through a Monte Carlo procedure (the permutation test), has been developed. The spectral analysis of a short uneven time series calls for assessment of the statistical significance of the spectral peaks, since a periodogram can always be calculated but the main challenge resides in identifying true spectral features. To demonstrate the effectiveness of this program, two case studies are presented: the one deals with synthetic data and the other with paleoceanographic/palaeoclimatic proxies. On a simulated time series of 500 data, two uneven time series (with 100 and 25 data) were generated by selecting data at random. Comparative analysis between the power spectra from the simulated series and from the two uneven time series demonstrates the usefulness of the smoothed Lomb-Scargle periodogram for uneven sequences, making it possible to distinguish between statistically significant and spurious spectral peaks. Fragmentary time series of Cd/Ca ratios and δ18O from core AII107-131 of SPECMAP were analysed as a real case study. The efficiency of the direct and cross Lomb-Scargle periodogram in recognizing Milankovitch and sub-Milankovitch signals related to palaeoclimatic/palaeoceanographic changes is demonstrated. As implemented, the Lomb-Scargle periodogram may be applied to any palaeoclimatic/palaeoceanographic proxies, including those usually recovered from contourites, and it holds special interest in the context of centennial- to millennial-scale climatic changes affecting contouritic currents.

  19. A PLSPM-Based Test Statistic for Detecting Gene-Gene Co-Association in Genome-Wide Association Study with Case-Control Design

    PubMed Central

    Zhang, Xiaoshuai; Yang, Xiaowei; Yuan, Zhongshang; Liu, Yanxun; Li, Fangyu; Peng, Bin; Zhu, Dianwen; Zhao, Jinghua; Xue, Fuzhong

    2013-01-01

    For genome-wide association data analysis, two genes in any pathway, two SNPs in the two linked gene regions respectively or in the two linked exons respectively within one gene are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which refers to the effects not only due to the traditional interaction under nearly independent condition but the correlation between two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic has better performance than single SNP-based logistic model, PCA-based logistic model, and other gene-based methods. PMID:23620809

  20. A phylogenetic transform enhances analysis of compositional microbiota data.

    PubMed

    Silverman, Justin D; Washburne, Alex D; Mukherjee, Sayan; David, Lawrence A

    2017-02-15

    Surveys of microbial communities (microbiota), typically measured as relative abundance of species, have illustrated the importance of these communities in human health and disease. Yet, statistical artifacts commonly plague the analysis of relative abundance data. Here, we introduce the PhILR transform, which incorporates microbial evolutionary models with the isometric log-ratio transform to allow off-the-shelf statistical tools to be safely applied to microbiota surveys. We demonstrate that analyses of community-level structure can be applied to PhILR transformed data with performance on benchmarks rivaling or surpassing standard tools. Additionally, by decomposing distance in the PhILR transformed space, we identified neighboring clades that may have adapted to distinct human body sites. Decomposing variance revealed that covariation of bacterial clades within human body sites increases with phylogenetic relatedness. Together, these findings illustrate how the PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges and enable evolutionary insights relevant to microbial communities.

  1. A PLSPM-based test statistic for detecting gene-gene co-association in genome-wide association study with case-control design.

    PubMed

    Zhang, Xiaoshuai; Yang, Xiaowei; Yuan, Zhongshang; Liu, Yanxun; Li, Fangyu; Peng, Bin; Zhu, Dianwen; Zhao, Jinghua; Xue, Fuzhong

    2013-01-01

    For genome-wide association data analysis, two genes in any pathway, two SNPs in the two linked gene regions respectively or in the two linked exons respectively within one gene are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which refers to the effects not only due to the traditional interaction under nearly independent condition but the correlation between two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic has better performance than single SNP-based logistic model, PCA-based logistic model, and other gene-based methods.

  2. Mediation analysis in nursing research: a methodological review

    PubMed Central

    Liu, Jianghong; Ulrich, Connie

    2017-01-01

    Mediation statistical models help clarify the relationship between independent predictor variables and dependent outcomes of interest by assessing the impact of third variables. This type of statistical analysis is applicable for many clinical nursing research questions, yet its use within nursing remains low. Indeed, mediational analyses may help nurse researchers develop more effective and accurate prevention and treatment programs as well as help bridge the gap between scientific knowledge and clinical practice. In addition, this statistical approach allows nurse researchers to ask – and answer – more meaningful and nuanced questions that extend beyond merely determining whether an outcome occurs. Therefore, the goal of this paper is to provide a brief tutorial on the use of mediational analyses in clinical nursing research by briefly introducing the technique and, through selected empirical examples from the nursing literature, demonstrating its applicability in advancing nursing science. PMID:26176804

  3. Effectiveness of perioperative antiepileptic drug prophylaxis for early and late seizures following oncologic neurosurgery: a meta-analysis.

    PubMed

    Joiner, Evan F; Youngerman, Brett E; Hudson, Taylor S; Yang, Jingyan; Welch, Mary R; McKhann, Guy M; Neugut, Alfred I; Bruce, Jeffrey N

    2018-04-27

    OBJECTIVE The purpose of this meta-analysis was to evaluate the impact of perioperative antiepileptic drug (AED) prophylaxis on short- and long-term seizure incidence among patients undergoing brain tumor surgery. It is the first meta-analysis to focus exclusively on perioperative AED prophylaxis among patients undergoing brain tumor surgery. METHODS The authors searched PubMed/MEDLINE, Embase, Cochrane Central Register of Controlled Trials, clinicaltrials.gov, and the System for Information on Gray Literature in Europe for records related to perioperative AED prophylaxis for patients with brain tumors. Risk of bias in the included studies was assessed using the Cochrane risk of bias tool. Incidence rates for early seizures (within the first postoperative week) and total seizures were estimated based on data from randomized controlled trials. A Mantel-Haenszel random-effects model was used to analyze pooled relative risk (RR) of early seizures (within the first postoperative week) and total seizures associated with perioperative AED prophylaxis versus control. RESULTS Four RCTs involving 352 patients met the criteria of inclusion. The results demonstrated that perioperative AED prophylaxis for patients undergoing brain tumor surgery provides a statistically significant reduction in risk of early postoperative seizures compared with control (RR = 0.352, 95% confidence interval 0.130-0.949, p = 0.039). AED prophylaxis had no statistically significant effect on the total (combined short- and long-term) incidence of seizures. CONCLUSIONS This meta-analysis demonstrates for the first time that perioperative AED prophylaxis for brain tumor surgery provides a statistically significant reduction in early postoperative seizure risk.

  4. Quantity of dates trumps quality of dates for dense Bayesian radiocarbon sediment chronologies - Gas ion source 14C dating instructed by simultaneous Bayesian accumulation rate modeling

    NASA Astrophysics Data System (ADS)

    Rosenheim, B. E.; Firesinger, D.; Roberts, M. L.; Burton, J. R.; Khan, N.; Moyer, R. P.

    2016-12-01

    Radiocarbon (14C) sediment core chronologies benefit from a high density of dates, even when precision of individual dates is sacrificed. This is demonstrated by a combined approach of rapid 14C analysis of CO2 gas generated from carbonates and organic material coupled with Bayesian statistical modeling. Analysis of 14C is facilitated by the gas ion source on the Continuous Flow Accelerator Mass Spectrometry (CFAMS) system at the Woods Hole Oceanographic Institution's National Ocean Sciences Accelerator Mass Spectrometry facility. This instrument is capable of producing a 14C determination of +/- 100 14C y precision every 4-5 minutes, with limited sample handling (dissolution of carbonates and/or combustion of organic carbon in evacuated containers). Rapid analysis allows over-preparation of samples to include replicates at each depth and/or comparison of different sample types at particular depths in a sediment or peat core. Analysis priority is given to depths that have the least chronologic precision as determined by Bayesian modeling of the chronology of calibrated ages. Use of such a statistical approach to determine the order in which samples are run ensures that the chronology constantly improves so long as material is available for the analysis of chronologic weak points. Ultimately, accuracy of the chronology is determined by the material that is actually being dated, and our combined approach allows testing of different constituents of the organic carbon pool and the carbonate minerals within a core. We will present preliminary results from a deep-sea sediment core abundant in deep-sea foraminifera as well as coastal wetland peat cores to demonstrate statistical improvements in sediment- and peat-core chronologies obtained by increasing the quantity and decreasing the quality of individual dates.

  5. Voxel-based statistical analysis of cerebral glucose metabolism in patients with permanent vegetative state after acquired brain injury.

    PubMed

    Kim, Yong Wook; Kim, Hyoung Seop; An, Young-Sil; Im, Sang Hee

    2010-10-01

    Permanent vegetative state is defined as the impaired level of consciousness longer than 12 months after traumatic causes and 3 months after non-traumatic causes of brain injury. Although many studies assessed the cerebral metabolism in patients with acute and persistent vegetative state after brain injury, few studies investigated the cerebral metabolism in patients with permanent vegetative state. In this study, we performed the voxel-based analysis of cerebral glucose metabolism and investigated the relationship between regional cerebral glucose metabolism and the severity of impaired consciousness in patients with permanent vegetative state after acquired brain injury. We compared the regional cerebral glucose metabolism as demonstrated by F-18 fluorodeoxyglucose positron emission tomography from 12 patients with permanent vegetative state after acquired brain injury with those from 12 control subjects. Additionally, covariance analysis was performed to identify regions where decreased changes in regional cerebral glucose metabolism significantly correlated with a decrease of level of consciousness measured by JFK-coma recovery scale. Statistical analysis was performed using statistical parametric mapping. Compared with controls, patients with permanent vegetative state demonstrated decreased cerebral glucose metabolism in the left precuneus, both posterior cingulate cortices, the left superior parietal lobule (P(corrected) < 0.001), and increased cerebral glucose metabolism in the both cerebellum and the right supramarginal cortices (P(corrected) < 0.001). In the covariance analysis, a decrease in the level of consciousness was significantly correlated with decreased cerebral glucose metabolism in the both posterior cingulate cortices (P(uncorrected) < 0.005). Our findings suggest that the posteromedial parietal cortex, which are part of neural network for consciousness, may be relevant structure for pathophysiological mechanism in patients with permanent vegetative state after acquired brain injury.

  6. Advanced Gear Alloys for Ultra High Strength Applications

    NASA Technical Reports Server (NTRS)

    Shen, Tony; Krantz, Timothy; Sebastian, Jason

    2011-01-01

    Single tooth bending fatigue (STBF) test data of UHS Ferrium C61 and C64 alloys are presented in comparison with historical test data of conventional gear steels (9310 and Pyrowear 53) with comparable statistical analysis methods. Pitting and scoring tests of C61 and C64 are works in progress. Boeing statistical analysis of STBF test data for the four gear steels (C61, C64, 9310 and Pyrowear 53) indicates that the UHS grades exhibit increases in fatigue strength in the low cycle fatigue (LCF) regime. In the high cycle fatigue (HCF) regime, the UHS steels exhibit better mean fatigue strength endurance limit behavior (particularly as compared to Pyrowear 53). However, due to considerable scatter in the UHS test data, the anticipated overall benefits of the UHS grades in bending fatigue have not been fully demonstrated. Based on all the test data and on Boeing s analysis, C61 has been selected by Boeing as the gear steel for the final ERDS demonstrator test gearboxes. In terms of potential follow-up work, detailed physics-based, micromechanical analysis and modeling of the fatigue data would allow for a better understanding of the causes of the experimental scatter, and of the transition from high-stress LCF (surface-dominated) to low-stress HCF (subsurface-dominated) fatigue failure. Additional STBF test data and failure analysis work, particularly in the HCF regime and around the endurance limit stress, could allow for better statistical confidence and could reduce the observed effects of experimental test scatter. Finally, the need for further optimization of the residual compressive stress profiles of the UHS steels (resulting from carburization and peening) is noted, particularly for the case of the higher hardness C64 material.

  7. Novel Image Encryption Scheme Based on Chebyshev Polynomial and Duffing Map

    PubMed Central

    2014-01-01

    We present a novel image encryption algorithm using Chebyshev polynomial based on permutation and substitution and Duffing map based on substitution. Comprehensive security analysis has been performed on the designed scheme using key space analysis, visual testing, histogram analysis, information entropy calculation, correlation coefficient analysis, differential analysis, key sensitivity test, and speed test. The study demonstrates that the proposed image encryption algorithm shows advantages of more than 10113 key space and desirable level of security based on the good statistical results and theoretical arguments. PMID:25143970

  8. Assessment of statistical significance and clinical relevance.

    PubMed

    Kieser, Meinhard; Friede, Tim; Gondan, Matthias

    2013-05-10

    In drug development, it is well accepted that a successful study will demonstrate not only a statistically significant result but also a clinically relevant effect size. Whereas standard hypothesis tests are used to demonstrate the former, it is less clear how the latter should be established. In the first part of this paper, we consider the responder analysis approach and study the performance of locally optimal rank tests when the outcome distribution is a mixture of responder and non-responder distributions. We find that these tests are quite sensitive to their planning assumptions and have therefore not really any advantage over standard tests such as the t-test and the Wilcoxon-Mann-Whitney test, which perform overall well and can be recommended for applications. In the second part, we present a new approach to the assessment of clinical relevance based on the so-called relative effect (or probabilistic index) and derive appropriate sample size formulae for the design of studies aiming at demonstrating both a statistically significant and clinically relevant effect. Referring to recent studies in multiple sclerosis, we discuss potential issues in the application of this approach. Copyright © 2012 John Wiley & Sons, Ltd.

  9. Statistically Controlling for Confounding Constructs Is Harder than You Think

    PubMed Central

    Westfall, Jacob; Yarkoni, Tal

    2016-01-01

    Social scientists often seek to demonstrate that a construct has incremental validity over and above other related constructs. However, these claims are typically supported by measurement-level models that fail to consider the effects of measurement (un)reliability. We use intuitive examples, Monte Carlo simulations, and a novel analytical framework to demonstrate that common strategies for establishing incremental construct validity using multiple regression analysis exhibit extremely high Type I error rates under parameter regimes common in many psychological domains. Counterintuitively, we find that error rates are highest—in some cases approaching 100%—when sample sizes are large and reliability is moderate. Our findings suggest that a potentially large proportion of incremental validity claims made in the literature are spurious. We present a web application (http://jakewestfall.org/ivy/) that readers can use to explore the statistical properties of these and other incremental validity arguments. We conclude by reviewing SEM-based statistical approaches that appropriately control the Type I error rate when attempting to establish incremental validity. PMID:27031707

  10. Statistical summaries of fatigue data for design purposes

    NASA Technical Reports Server (NTRS)

    Wirsching, P. H.

    1983-01-01

    Two methods are discussed for constructing a design curve on the safe side of fatigue data. Both the tolerance interval and equivalent prediction interval (EPI) concepts provide such a curve while accounting for both the distribution of the estimators in small samples and the data scatter. The EPI is also useful as a mechanism for providing necessary statistics on S-N data for a full reliability analysis which includes uncertainty in all fatigue design factors. Examples of statistical analyses of the general strain life relationship are presented. The tolerance limit and EPI techniques for defining a design curve are demonstrated. Examples usng WASPALOY B and RQC-100 data demonstrate that a reliability model could be constructed by considering the fatigue strength and fatigue ductility coefficients as two independent random variables. A technique given for establishing the fatigue strength for high cycle lives relies on an extrapolation technique and also accounts for "runners." A reliability model or design value can be specified.

  11. Statistical Analysis of Time-Series from Monitoring of Active Volcanic Vents

    NASA Astrophysics Data System (ADS)

    Lachowycz, S.; Cosma, I.; Pyle, D. M.; Mather, T. A.; Rodgers, M.; Varley, N. R.

    2016-12-01

    Despite recent advances in the collection and analysis of time-series from volcano monitoring, and the resulting insights into volcanic processes, challenges remain in forecasting and interpreting activity from near real-time analysis of monitoring data. Statistical methods have potential to characterise the underlying structure and facilitate intercomparison of these time-series, and so inform interpretation of volcanic activity. We explore the utility of multiple statistical techniques that could be widely applicable to monitoring data, including Shannon entropy and detrended fluctuation analysis, by their application to various data streams from volcanic vents during periods of temporally variable activity. Each technique reveals changes through time in the structure of some of the data that were not apparent from conventional analysis. For example, we calculate the Shannon entropy (a measure of the randomness of a signal) of time-series from the recent dome-forming eruptions of Volcán de Colima (Mexico) and Soufrière Hills (Montserrat). The entropy of real-time seismic measurements and the count rate of certain volcano-seismic event types from both volcanoes is found to be temporally variable, with these data generally having higher entropy during periods of lava effusion and/or larger explosions. In some instances, the entropy shifts prior to or coincident with changes in seismic or eruptive activity, some of which were not clearly recognised by real-time monitoring. Comparison with other statistics demonstrates the sensitivity of the entropy to the data distribution, but that it is distinct from conventional statistical measures such as coefficient of variation. We conclude that each analysis technique examined could provide valuable insights for interpretation of diverse monitoring time-series.

  12. Functional brain networks for learning predictive statistics.

    PubMed

    Giorgio, Joseph; Karlaftis, Vasilis M; Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew; Kourtzi, Zoe

    2017-08-18

    Making predictions about future events relies on interpreting streams of information that may initially appear incomprehensible. This skill relies on extracting regular patterns in space and time by mere exposure to the environment (i.e., without explicit feedback). Yet, we know little about the functional brain networks that mediate this type of statistical learning. Here, we test whether changes in the processing and connectivity of functional brain networks due to training relate to our ability to learn temporal regularities. By combining behavioral training and functional brain connectivity analysis, we demonstrate that individuals adapt to the environment's statistics as they change over time from simple repetition to probabilistic combinations. Further, we show that individual learning of temporal structures relates to decision strategy. Our fMRI results demonstrate that learning-dependent changes in fMRI activation within and functional connectivity between brain networks relate to individual variability in strategy. In particular, extracting the exact sequence statistics (i.e., matching) relates to changes in brain networks known to be involved in memory and stimulus-response associations, while selecting the most probable outcomes in a given context (i.e., maximizing) relates to changes in frontal and striatal networks. Thus, our findings provide evidence that dissociable brain networks mediate individual ability in learning behaviorally-relevant statistics. Copyright © 2017 The Authors. Published by Elsevier Ltd.. All rights reserved.

  13. Learning and understanding the Kruskal-Wallis one-way analysis-of-variance-by-ranks test for differences among three or more independent groups.

    PubMed

    Chan, Y; Walmsley, R P

    1997-12-01

    When several treatment methods are available for the same problem, many clinicians are faced with the task of deciding which treatment to use. Many clinicians may have conducted informal "mini-experiments" on their own to determine which treatment is best suited for the problem. These results are usually not documented or reported in a formal manner because many clinicians feel that they are "statistically challenged." Another reason may be because clinicians do not feel they have controlled enough test conditions to warrant analysis. In this update, a statistic is described that does not involve complicated statistical assumptions, making it a simple and easy-to-use statistical method. This update examines the use of two statistics and does not deal with other issues that could affect clinical research such as issues affecting credibility. For readers who want a more in-depth examination of this topic, references have been provided. The Kruskal-Wallis one-way analysis-of-variance-by-ranks test (or H test) is used to determine whether three or more independent groups are the same or different on some variable of interest when an ordinal level of data or an interval or ratio level of data is available. A hypothetical example will be presented to explain when and how to use this statistic, how to interpret results using the statistic, the advantages and disadvantages of the statistic, and what to look for in a written report. This hypothetical example will involve the use of ratio data to demonstrate how to choose between using the nonparametric H test and the more powerful parametric F test.

  14. Parametric and experimental analysis using a power flow approach

    NASA Technical Reports Server (NTRS)

    Cuschieri, J. M.

    1990-01-01

    A structural power flow approach for the analysis of structure-borne transmission of vibrations is used to analyze the influence of structural parameters on transmitted power. The parametric analysis is also performed using the Statistical Energy Analysis approach and the results are compared with those obtained using the power flow approach. The advantages of structural power flow analysis are demonstrated by comparing the type of results that are obtained by the two analytical methods. Also, to demonstrate that the power flow results represent a direct physical parameter that can be measured on a typical structure, an experimental study of structural power flow is presented. This experimental study presents results for an L shaped beam for which an available solution was already obtained. Various methods to measure vibrational power flow are compared to study their advantages and disadvantages.

  15. SQC: secure quality control for meta-analysis of genome-wide association studies.

    PubMed

    Huang, Zhicong; Lin, Huang; Fellay, Jacques; Kutalik, Zoltán; Hubaux, Jean-Pierre

    2017-08-01

    Due to the limited power of small-scale genome-wide association studies (GWAS), researchers tend to collaborate and establish a larger consortium in order to perform large-scale GWAS. Genome-wide association meta-analysis (GWAMA) is a statistical tool that aims to synthesize results from multiple independent studies to increase the statistical power and reduce false-positive findings of GWAS. However, it has been demonstrated that the aggregate data of individual studies are subject to inference attacks, hence privacy concerns arise when researchers share study data in GWAMA. In this article, we propose a secure quality control (SQC) protocol, which enables checking the quality of data in a privacy-preserving way without revealing sensitive information to a potential adversary. SQC employs state-of-the-art cryptographic and statistical techniques for privacy protection. We implement the solution in a meta-analysis pipeline with real data to demonstrate the efficiency and scalability on commodity machines. The distributed execution of SQC on a cluster of 128 cores for one million genetic variants takes less than one hour, which is a modest cost considering the 10-month time span usually observed for the completion of the QC procedure that includes timing of logistics. SQC is implemented in Java and is publicly available at https://github.com/acs6610987/secureqc. jean-pierre.hubaux@epfl.ch. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  16. Recovering incomplete data using Statistical Multiple Imputations (SMI): a case study in environmental chemistry.

    PubMed

    Mercer, Theresa G; Frostick, Lynne E; Walmsley, Anthony D

    2011-10-15

    This paper presents a statistical technique that can be applied to environmental chemistry data where missing values and limit of detection levels prevent the application of statistics. A working example is taken from an environmental leaching study that was set up to determine if there were significant differences in levels of leached arsenic (As), chromium (Cr) and copper (Cu) between lysimeters containing preservative treated wood waste and those containing untreated wood. Fourteen lysimeters were setup and left in natural conditions for 21 weeks. The resultant leachate was analysed by ICP-OES to determine the As, Cr and Cu concentrations. However, due to the variation inherent in each lysimeter combined with the limits of detection offered by ICP-OES, the collected quantitative data was somewhat incomplete. Initial data analysis was hampered by the number of 'missing values' in the data. To recover the dataset, the statistical tool of Statistical Multiple Imputation (SMI) was applied, and the data was re-analysed successfully. It was demonstrated that using SMI did not affect the variance in the data, but facilitated analysis of the complete dataset. Copyright © 2011 Elsevier B.V. All rights reserved.

  17. Rescaled earthquake recurrence time statistics: application to microrepeaters

    NASA Astrophysics Data System (ADS)

    Goltz, Christian; Turcotte, Donald L.; Abaimov, Sergey G.; Nadeau, Robert M.; Uchida, Naoki; Matsuzawa, Toru

    2009-01-01

    Slip on major faults primarily occurs during `characteristic' earthquakes. The recurrence statistics of characteristic earthquakes play an important role in seismic hazard assessment. A major problem in determining applicable statistics is the short sequences of characteristic earthquakes that are available worldwide. In this paper, we introduce a rescaling technique in which sequences can be superimposed to establish larger numbers of data points. We consider the Weibull and log-normal distributions, in both cases we rescale the data using means and standard deviations. We test our approach utilizing sequences of microrepeaters, micro-earthquakes which recur in the same location on a fault. It seems plausible to regard these earthquakes as a miniature version of the classic characteristic earthquakes. Microrepeaters are much more frequent than major earthquakes, leading to longer sequences for analysis. In this paper, we present results for the analysis of recurrence times for several microrepeater sequences from Parkfield, CA as well as NE Japan. We find that, once the respective sequence can be considered to be of sufficient stationarity, the statistics can be well fitted by either a Weibull or a log-normal distribution. We clearly demonstrate this fact by our technique of rescaled combination. We conclude that the recurrence statistics of the microrepeater sequences we consider are similar to the recurrence statistics of characteristic earthquakes on major faults.

  18. Asymptotic Linear Spectral Statistics for Spiked Hermitian Random Matrices

    NASA Astrophysics Data System (ADS)

    Passemier, Damien; McKay, Matthew R.; Chen, Yang

    2015-07-01

    Using the Coulomb Fluid method, this paper derives central limit theorems (CLTs) for linear spectral statistics of three "spiked" Hermitian random matrix ensembles. These include Johnstone's spiked model (i.e., central Wishart with spiked correlation), non-central Wishart with rank-one non-centrality, and a related class of non-central matrices. For a generic linear statistic, we derive simple and explicit CLT expressions as the matrix dimensions grow large. For all three ensembles under consideration, we find that the primary effect of the spike is to introduce an correction term to the asymptotic mean of the linear spectral statistic, which we characterize with simple formulas. The utility of our proposed framework is demonstrated through application to three different linear statistics problems: the classical likelihood ratio test for a population covariance, the capacity analysis of multi-antenna wireless communication systems with a line-of-sight transmission path, and a classical multiple sample significance testing problem.

  19. Race and Gender Bias in the Administration of Corporal Punishment.

    ERIC Educational Resources Information Center

    Shaw, Steven R.; Braden, Jeffery B.

    1990-01-01

    Examined disciplinary actions taken by school building administrators after receiving discipline referral to identify evidence of race and gender bias in administration of corporal punishment (CP). Analysis of discipline files (n=6,244) demonstrated statistically significant relationships between race and CP and between gender and CP. Results…

  20. An Analysis of Teacher Sorting in Secondary Special Education and Alternative Schools

    ERIC Educational Resources Information Center

    Mason-Williams, Loretta; Gagnon, Joseph Calvin

    2017-01-01

    This study provides nationally representative information about the qualifications and preparation of secondary content and special education teachers in special education and alternative school settings, as compared with teachers in regular schools. Findings demonstrate that a statistically significant relationship did not exist between school…

  1. Sensory Integration and Ego Development in a Schizophrenic Adolescent Male.

    ERIC Educational Resources Information Center

    Pettit, Karen A.

    1987-01-01

    A retrospective study compared hours spent by a schizophrenic adolescent in "time out" before and after initiation of treatment. The study evaluated the effects of sensory integrative treatment on the ability to handle anger and frustration. Results demonstrate the utility of statistical analysis versus visual comparison to validate effectiveness…

  2. An Examination of Commercial Motor Vehicle Hours of Service Safety Regulation

    DTIC Science & Technology

    2016-09-01

    enough data for a statistical analysis of its effects on truck driver safety. This research found that by comparing truck driving safety data prior...impact on the commercial motor vehicle industry. Research has demonstrated how fatigue leads to increased accident rates by decreasing reaction time

  3. Big Data and Neuroimaging.

    PubMed

    Webb-Vargas, Yenny; Chen, Shaojie; Fisher, Aaron; Mejia, Amanda; Xu, Yuting; Crainiceanu, Ciprian; Caffo, Brian; Lindquist, Martin A

    2017-12-01

    Big Data are of increasing importance in a variety of areas, especially in the biosciences. There is an emerging critical need for Big Data tools and methods, because of the potential impact of advancements in these areas. Importantly, statisticians and statistical thinking have a major role to play in creating meaningful progress in this arena. We would like to emphasize this point in this special issue, as it highlights both the dramatic need for statistical input for Big Data analysis and for a greater number of statisticians working on Big Data problems. We use the field of statistical neuroimaging to demonstrate these points. As such, this paper covers several applications and novel methodological developments of Big Data tools applied to neuroimaging data.

  4. Dark Energy Survey Year 1 Results: Multi-Probe Methodology and Simulated Likelihood Analyses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Krause, E.; et al.

    We present the methodology for and detail the implementation of the Dark Energy Survey (DES) 3x2pt DES Year 1 (Y1) analysis, which combines configuration-space two-point statistics from three different cosmological probes: cosmic shear, galaxy-galaxy lensing, and galaxy clustering, using data from the first year of DES observations. We have developed two independent modeling pipelines and describe the code validation process. We derive expressions for analytical real-space multi-probe covariances, and describe their validation with numerical simulations. We stress-test the inference pipelines in simulated likelihood analyses that vary 6-7 cosmology parameters plus 20 nuisance parameters and precisely resemble the analysis to be presented in the DES 3x2pt analysis paper, using a variety of simulated input data vectors with varying assumptions. We find that any disagreement between pipelines leads to changes in assigned likelihoodmore » $$\\Delta \\chi^2 \\le 0.045$$ with respect to the statistical error of the DES Y1 data vector. We also find that angular binning and survey mask do not impact our analytic covariance at a significant level. We determine lower bounds on scales used for analysis of galaxy clustering (8 Mpc$$~h^{-1}$$) and galaxy-galaxy lensing (12 Mpc$$~h^{-1}$$) such that the impact of modeling uncertainties in the non-linear regime is well below statistical errors, and show that our analysis choices are robust against a variety of systematics. These tests demonstrate that we have a robust analysis pipeline that yields unbiased cosmological parameter inferences for the flagship 3x2pt DES Y1 analysis. We emphasize that the level of independent code development and subsequent code comparison as demonstrated in this paper is necessary to produce credible constraints from increasingly complex multi-probe analyses of current data.« less

  5. The demonstration of an advanced cyclone coal combustor, with internal sulfur, nitrogen, and ash control for the conversion of a 23 MMBTU/hour oil fired boiler to pulverized coal

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zauderer, B.; Fleming, E.S.

    1991-08-30

    This work contains to the final report of the demonstration of an advanced cyclone coal combustor. Titles include: Chronological Description of the Clean Coal Project Tests,'' Statistical Analysis of Operating Data for the Coal Tech Combustor,'' Photographic History of the Project,'' Results of Slag Analysis by PA DER Module 1 Procedure,'' Properties of the Coals Limestone Used in the Test Effort,'' Results of the Solid Waste Sampling Performed on the Coal Tech Combustor by an Independent Contractor During the February 1990 Tests.'' (VC)

  6. Materials of acoustic analysis: sustained vowel versus sentence.

    PubMed

    Moon, Kyung Ray; Chung, Sung Min; Park, Hae Sang; Kim, Han Su

    2012-09-01

    Sustained vowel is a widely used material of acoustic analysis. However, vowel phonation does not sufficiently demonstrate sentence-based real-life phonation, and biases may occur depending on the test subjects intent during pronunciation. The purpose of this study was to investigate the differences between the results of acoustic analysis using each material. An individual prospective study. Two hundred two individuals (87 men and 115 women) with normal findings in videostroboscopy were enrolled. Acoustic analysis was done using the speech pattern element acquisition and display program. Fundamental frequency (Fx), amplitude (Ax), contact quotient (Qx), jitter, and shimmer were measured with sustained vowel-based acoustic analysis. Average fundamental frequency (FxM), average amplitude (AxM), average contact quotient (QxM), Fx perturbation (CFx), and amplitude perturbation (CAx) were measured with sentence-based acoustic analysis. Corresponding data of the two methods were compared with each other. SPSS (Statistical Package for the Social Sciences, Version 12.0; SPSS, Inc., Chicago, IL) software was used for statistical analysis. FxM was higher than Fx in men (Fx, 124.45 Hz; FxM, 133.09 Hz; P=0.000). In women, FxM seemed to be lower than Fx, but the results were not statistically significant (Fx, 210.58 Hz; FxM, 208.34 Hz; P=0.065). There was no statistical significance between Ax and AxM in both the groups. QxM was higher than Qx in men and women. Jitter was lower in men, but CFx was lower in women. Both Shimmer and CAx were higher in men. Sustained vowel phonation could not be a complete substitute for real-time phonation in acoustic analysis. Characteristics of acoustic materials should be considered when choosing the material for acoustic analysis and interpreting the results. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  7. Computerized system for assessing heart rate variability.

    PubMed

    Frigy, A; Incze, A; Brânzaniuc, E; Cotoi, S

    1996-01-01

    The principal theoretical, methodological and clinical aspects of heart rate variability (HRV) analysis are reviewed. This method has been developed over the last 10 years as a useful noninvasive method of measuring the activity of the autonomic nervous system. The main components and the functioning of the computerized rhythm-analyzer system developed by our team are presented. The system is able to perform short-term (maximum 20 minutes) time domain HRV analysis and statistical analysis of the ventricular rate in any rhythm, particularly in atrial fibrillation. The performances of our system are demonstrated by using the graphics (RR histograms, delta RR histograms, RR scattergrams) and the statistical parameters resulted from the processing of three ECG recordings. These recordings are obtained from a normal subject, from a patient with advanced heart failure, and from a patient with atrial fibrillation.

  8. Four modes of optical parametric operation for squeezed state generation

    NASA Astrophysics Data System (ADS)

    Andersen, U. L.; Buchler, B. C.; Lam, P. K.; Wu, J. W.; Gao, J. R.; Bachor, H.-A.

    2003-11-01

    We report a versatile instrument, based on a monolithic optical parametric amplifier, which reliably generates four different types of squeezed light. We obtained vacuum squeezing, low power amplitude squeezing, phase squeezing and bright amplitude squeezing. We show a complete analysis of this light, including a full quantum state tomography. In addition we demonstrate the direct detection of the squeezed state statistics without the aid of a spectrum analyser. This technique makes the nonclassical properties directly visible and allows complete measurement of the statistical moments of the squeezed quadrature.

  9. Small sample estimation of the reliability function for technical products

    NASA Astrophysics Data System (ADS)

    Lyamets, L. L.; Yakimenko, I. V.; Kanishchev, O. A.; Bliznyuk, O. A.

    2017-12-01

    It is demonstrated that, in the absence of big statistic samples obtained as a result of testing complex technical products for failure, statistic estimation of the reliability function of initial elements can be made by the moments method. A formal description of the moments method is given and its advantages in the analysis of small censored samples are discussed. A modified algorithm is proposed for the implementation of the moments method with the use of only the moments at which the failures of initial elements occur.

  10. Analysis of repeated measurement data in the clinical trials

    PubMed Central

    Singh, Vineeta; Rana, Rakesh Kumar; Singhal, Richa

    2013-01-01

    Statistics is an integral part of Clinical Trials. Elements of statistics span Clinical Trial design, data monitoring, analyses and reporting. A solid understanding of statistical concepts by clinicians improves the comprehension and the resulting quality of Clinical Trials. In biomedical research it has been seen that researcher frequently use t-test and ANOVA to compare means between the groups of interest irrespective of the nature of the data. In Clinical Trials we record the data on the patients more than two times. In such a situation using the standard ANOVA procedures is not appropriate as it does not consider dependencies between observations within subjects in the analysis. To deal with such types of study data Repeated Measure ANOVA should be used. In this article the application of One-way Repeated Measure ANOVA has been demonstrated by using the software SPSS (Statistical Package for Social Sciences) Version 15.0 on the data collected at four time points 0 day, 15th day, 30th day, and 45th day of multicentre clinical trial conducted on Pandu Roga (~Iron Deficiency Anemia) with an Ayurvedic formulation Dhatrilauha. PMID:23930038

  11. High order statistical signatures from source-driven measurements of subcritical fissile systems

    NASA Astrophysics Data System (ADS)

    Mattingly, John Kelly

    1998-11-01

    This research focuses on the development and application of high order statistical analyses applied to measurements performed with subcritical fissile systems driven by an introduced neutron source. The signatures presented are derived from counting statistics of the introduced source and radiation detectors that observe the response of the fissile system. It is demonstrated that successively higher order counting statistics possess progressively higher sensitivity to reactivity. Consequently, these signatures are more sensitive to changes in the composition, fissile mass, and configuration of the fissile assembly. Furthermore, it is shown that these techniques are capable of distinguishing the response of the fissile system to the introduced source from its response to any internal or inherent sources. This ability combined with the enhanced sensitivity of higher order signatures indicates that these techniques will be of significant utility in a variety of applications. Potential applications include enhanced radiation signature identification of weapons components for nuclear disarmament and safeguards applications and augmented nondestructive analysis of spent nuclear fuel. In general, these techniques expand present capabilities in the analysis of subcritical measurements.

  12. IGESS: a statistical approach to integrating individual-level genotype data and summary statistics in genome-wide association studies.

    PubMed

    Dai, Mingwei; Ming, Jingsi; Cai, Mingxuan; Liu, Jin; Yang, Can; Wan, Xiang; Xu, Zongben

    2017-09-15

    Results from genome-wide association studies (GWAS) suggest that a complex phenotype is often affected by many variants with small effects, known as 'polygenicity'. Tens of thousands of samples are often required to ensure statistical power of identifying these variants with small effects. However, it is often the case that a research group can only get approval for the access to individual-level genotype data with a limited sample size (e.g. a few hundreds or thousands). Meanwhile, summary statistics generated using single-variant-based analysis are becoming publicly available. The sample sizes associated with the summary statistics datasets are usually quite large. How to make the most efficient use of existing abundant data resources largely remains an open question. In this study, we propose a statistical approach, IGESS, to increasing statistical power of identifying risk variants and improving accuracy of risk prediction by i ntegrating individual level ge notype data and s ummary s tatistics. An efficient algorithm based on variational inference is developed to handle the genome-wide analysis. Through comprehensive simulation studies, we demonstrated the advantages of IGESS over the methods which take either individual-level data or summary statistics data as input. We applied IGESS to perform integrative analysis of Crohns Disease from WTCCC and summary statistics from other studies. IGESS was able to significantly increase the statistical power of identifying risk variants and improve the risk prediction accuracy from 63.2% ( ±0.4% ) to 69.4% ( ±0.1% ) using about 240 000 variants. The IGESS software is available at https://github.com/daviddaigithub/IGESS . zbxu@xjtu.edu.cn or xwan@comp.hkbu.edu.hk or eeyang@hkbu.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  13. Revisiting Information Technology tools serving authorship and editorship: a case-guided tutorial to statistical analysis and plagiarism detection

    PubMed Central

    Bamidis, P D; Lithari, C; Konstantinidis, S T

    2010-01-01

    With the number of scientific papers published in journals, conference proceedings, and international literature ever increasing, authors and reviewers are not only facilitated with an abundance of information, but unfortunately continuously confronted with risks associated with the erroneous copy of another's material. In parallel, Information Communication Technology (ICT) tools provide to researchers novel and continuously more effective ways to analyze and present their work. Software tools regarding statistical analysis offer scientists the chance to validate their work and enhance the quality of published papers. Moreover, from the reviewers and the editor's perspective, it is now possible to ensure the (text-content) originality of a scientific article with automated software tools for plagiarism detection. In this paper, we provide a step-bystep demonstration of two categories of tools, namely, statistical analysis and plagiarism detection. The aim is not to come up with a specific tool recommendation, but rather to provide useful guidelines on the proper use and efficiency of either category of tools. In the context of this special issue, this paper offers a useful tutorial to specific problems concerned with scientific writing and review discourse. A specific neuroscience experimental case example is utilized to illustrate the young researcher's statistical analysis burden, while a test scenario is purpose-built using open access journal articles to exemplify the use and comparative outputs of seven plagiarism detection software pieces. PMID:21487489

  14. Is the spatial distribution of brain lesions associated with closed-head injury predictive of subsequent development of attention-deficit/hyperactivity disorder? Analysis with brain-image database

    NASA Technical Reports Server (NTRS)

    Herskovits, E. H.; Megalooikonomou, V.; Davatzikos, C.; Chen, A.; Bryan, R. N.; Gerring, J. P.

    1999-01-01

    PURPOSE: To determine whether there is an association between the spatial distribution of lesions detected at magnetic resonance (MR) imaging of the brain in children after closed-head injury and the development of secondary attention-deficit/hyperactivity disorder (ADHD). MATERIALS AND METHODS: Data obtained from 76 children without prior history of ADHD were analyzed. MR images were obtained 3 months after closed-head injury. After manual delineation of lesions, images were registered to the Talairach coordinate system. For each subject, registered images and secondary ADHD status were integrated into a brain-image database, which contains depiction (visualization) and statistical analysis software. Using this database, we assessed visually the spatial distributions of lesions and performed statistical analysis of image and clinical variables. RESULTS: Of the 76 children, 15 developed secondary ADHD. Depiction of the data suggested that children who developed secondary ADHD had more lesions in the right putamen than children who did not develop secondary ADHD; this impression was confirmed statistically. After Bonferroni correction, we could not demonstrate significant differences between secondary ADHD status and lesion burdens for the right caudate nucleus or the right globus pallidus. CONCLUSION: Closed-head injury-induced lesions in the right putamen in children are associated with subsequent development of secondary ADHD. Depiction software is useful in guiding statistical analysis of image data.

  15. Revisiting Information Technology tools serving authorship and editorship: a case-guided tutorial to statistical analysis and plagiarism detection.

    PubMed

    Bamidis, P D; Lithari, C; Konstantinidis, S T

    2010-12-01

    With the number of scientific papers published in journals, conference proceedings, and international literature ever increasing, authors and reviewers are not only facilitated with an abundance of information, but unfortunately continuously confronted with risks associated with the erroneous copy of another's material. In parallel, Information Communication Technology (ICT) tools provide to researchers novel and continuously more effective ways to analyze and present their work. Software tools regarding statistical analysis offer scientists the chance to validate their work and enhance the quality of published papers. Moreover, from the reviewers and the editor's perspective, it is now possible to ensure the (text-content) originality of a scientific article with automated software tools for plagiarism detection. In this paper, we provide a step-bystep demonstration of two categories of tools, namely, statistical analysis and plagiarism detection. The aim is not to come up with a specific tool recommendation, but rather to provide useful guidelines on the proper use and efficiency of either category of tools. In the context of this special issue, this paper offers a useful tutorial to specific problems concerned with scientific writing and review discourse. A specific neuroscience experimental case example is utilized to illustrate the young researcher's statistical analysis burden, while a test scenario is purpose-built using open access journal articles to exemplify the use and comparative outputs of seven plagiarism detection software pieces.

  16. Estimating the proportion of true null hypotheses when the statistics are discrete.

    PubMed

    Dialsingh, Isaac; Austin, Stefanie R; Altman, Naomi S

    2015-07-15

    In high-dimensional testing problems π0, the proportion of null hypotheses that are true is an important parameter. For discrete test statistics, the P values come from a discrete distribution with finite support and the null distribution may depend on an ancillary statistic such as a table margin that varies among the test statistics. Methods for estimating π0 developed for continuous test statistics, which depend on a uniform or identical null distribution of P values, may not perform well when applied to discrete testing problems. This article introduces a number of π0 estimators, the regression and 'T' methods that perform well with discrete test statistics and also assesses how well methods developed for or adapted from continuous tests perform with discrete tests. We demonstrate the usefulness of these estimators in the analysis of high-throughput biological RNA-seq and single-nucleotide polymorphism data. implemented in R. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  17. MAGMA: analysis of two-channel microarrays made easy.

    PubMed

    Rehrauer, Hubert; Zoller, Stefan; Schlapbach, Ralph

    2007-07-01

    The web application MAGMA provides a simple and intuitive interface to identify differentially expressed genes from two-channel microarray data. While the underlying algorithms are not superior to those of similar web applications, MAGMA is particularly user friendly and can be used without prior training. The user interface guides the novice user through the most typical microarray analysis workflow consisting of data upload, annotation, normalization and statistical analysis. It automatically generates R-scripts that document MAGMA's entire data processing steps, thereby allowing the user to regenerate all results in his local R installation. The implementation of MAGMA follows the model-view-controller design pattern that strictly separates the R-based statistical data processing, the web-representation and the application logic. This modular design makes the application flexible and easily extendible by experts in one of the fields: statistical microarray analysis, web design or software development. State-of-the-art Java Server Faces technology was used to generate the web interface and to perform user input processing. MAGMA's object-oriented modular framework makes it easily extendible and applicable to other fields and demonstrates that modern Java technology is also suitable for rather small and concise academic projects. MAGMA is freely available at www.magma-fgcz.uzh.ch.

  18. The mediating effect of calling on the relationship between medical school students’ academic burnout and empathy

    PubMed Central

    2017-01-01

    Purpose This study is aimed at identifying the relationships between medical school students’ academic burnout, empathy, and calling, and determining whether their calling has a mediating effect on the relationship between academic burnout and empathy. Methods A mixed method study was conducted. One hundred twenty-seven medical students completed a survey. Scales measuring academic burnout, medical students’ empathy, and calling were utilized. For statistical analysis, correlation analysis, descriptive statistics analysis, and hierarchical multiple regression analyses were conducted. For qualitative approach, eight medical students participated in a focus group interview. Results The study found that empathy has a statistically significant, negative correlation with academic burnout, while having a significant, positive correlation with calling. Sense of calling proved to be an effective mediator of the relationship between academic burnout and empathy. Conclusion This result demonstrates that calling is a key variable that mediates the relationship between medical students’ academic burnout and empathy. As such, this study provides baseline data for an education that could improve medical students’ empathy skills. PMID:28870019

  19. Quantifying the evolution of flow boiling bubbles by statistical testing and image analysis: toward a general model.

    PubMed

    Xiao, Qingtai; Xu, Jianxin; Wang, Hua

    2016-08-16

    A new index, the estimate of the error variance, which can be used to quantify the evolution of the flow patterns when multiphase components or tracers are difficultly distinguishable, was proposed. The homogeneity degree of the luminance space distribution behind the viewing windows in the direct contact boiling heat transfer process was explored. With image analysis and a linear statistical model, the F-test of the statistical analysis was used to test whether the light was uniform, and a non-linear method was used to determine the direction and position of a fixed source light. The experimental results showed that the inflection point of the new index was approximately equal to the mixing time. The new index has been popularized and applied to a multiphase macro mixing process by top blowing in a stirred tank. Moreover, a general quantifying model was introduced for demonstrating the relationship between the flow patterns of the bubble swarms and heat transfer. The results can be applied to investigate other mixing processes that are very difficult to recognize the target.

  20. 2D Affine and Projective Shape Analysis.

    PubMed

    Bryner, Darshan; Klassen, Eric; Huiling Le; Srivastava, Anuj

    2014-05-01

    Current techniques for shape analysis tend to seek invariance to similarity transformations (rotation, translation, and scale), but certain imaging situations require invariance to larger groups, such as affine or projective groups. Here we present a general Riemannian framework for shape analysis of planar objects where metrics and related quantities are invariant to affine and projective groups. Highlighting two possibilities for representing object boundaries-ordered points (or landmarks) and parameterized curves-we study different combinations of these representations (points and curves) and transformations (affine and projective). Specifically, we provide solutions to three out of four situations and develop algorithms for computing geodesics and intrinsic sample statistics, leading up to Gaussian-type statistical models, and classifying test shapes using such models learned from training data. In the case of parameterized curves, we also achieve the desired goal of invariance to re-parameterizations. The geodesics are constructed by particularizing the path-straightening algorithm to geometries of current manifolds and are used, in turn, to compute shape statistics and Gaussian-type shape models. We demonstrate these ideas using a number of examples from shape and activity recognition.

  1. Quantifying the evolution of flow boiling bubbles by statistical testing and image analysis: toward a general model

    PubMed Central

    Xiao, Qingtai; Xu, Jianxin; Wang, Hua

    2016-01-01

    A new index, the estimate of the error variance, which can be used to quantify the evolution of the flow patterns when multiphase components or tracers are difficultly distinguishable, was proposed. The homogeneity degree of the luminance space distribution behind the viewing windows in the direct contact boiling heat transfer process was explored. With image analysis and a linear statistical model, the F-test of the statistical analysis was used to test whether the light was uniform, and a non-linear method was used to determine the direction and position of a fixed source light. The experimental results showed that the inflection point of the new index was approximately equal to the mixing time. The new index has been popularized and applied to a multiphase macro mixing process by top blowing in a stirred tank. Moreover, a general quantifying model was introduced for demonstrating the relationship between the flow patterns of the bubble swarms and heat transfer. The results can be applied to investigate other mixing processes that are very difficult to recognize the target. PMID:27527065

  2. A Statistical Analysis of the Economic Drivers of Battery Energy Storage in Commercial Buildings: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Long, Matthew; Simpkins, Travis; Cutler, Dylan

    There is significant interest in using battery energy storage systems (BESS) to reduce peak demand charges, and therefore the life cycle cost of electricity, in commercial buildings. This paper explores the drivers of economic viability of BESS in commercial buildings through statistical analysis. A sample population of buildings was generated, a techno-economic optimization model was used to size and dispatch the BESS, and the resulting optimal BESS sizes were analyzed for relevant predictor variables. Explanatory regression analyses were used to demonstrate that peak demand charges are the most significant predictor of an economically viable battery, and that the shape ofmore » the load profile is the most significant predictor of the size of the battery.« less

  3. A Statistical Analysis of the Economic Drivers of Battery Energy Storage in Commercial Buildings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Long, Matthew; Simpkins, Travis; Cutler, Dylan

    There is significant interest in using battery energy storage systems (BESS) to reduce peak demand charges, and therefore the life cycle cost of electricity, in commercial buildings. This paper explores the drivers of economic viability of BESS in commercial buildings through statistical analysis. A sample population of buildings was generated, a techno-economic optimization model was used to size and dispatch the BESS, and the resulting optimal BESS sizes were analyzed for relevant predictor variables. Explanatory regression analyses were used to demonstrate that peak demand charges are the most significant predictor of an economically viable battery, and that the shape ofmore » the load profile is the most significant predictor of the size of the battery.« less

  4. Treated cabin acoustic prediction using statistical energy analysis

    NASA Technical Reports Server (NTRS)

    Yoerkie, Charles A.; Ingraham, Steven T.; Moore, James A.

    1987-01-01

    The application of statistical energy analysis (SEA) to the modeling and design of helicopter cabin interior noise control treatment is demonstrated. The information presented here is obtained from work sponsored at NASA Langley for the development of analytic modeling techniques and the basic understanding of cabin noise. Utility and executive interior models are developed directly from existing S-76 aircraft designs. The relative importance of panel transmission loss (TL), acoustic leakage, and absorption to the control of cabin noise is shown using the SEA modeling parameters. It is shown that the major cabin noise improvement below 1000 Hz comes from increased panel TL, while above 1000 Hz it comes from reduced acoustic leakage and increased absorption in the cabin and overhead cavities.

  5. Estimating weak ratiometric signals in imaging data. II. Meta-analysis with multiple, dual-channel datasets.

    PubMed

    Sornborger, Andrew; Broder, Josef; Majumder, Anirban; Srinivasamoorthy, Ganesh; Porter, Erika; Reagin, Sean S; Keith, Charles; Lauderdale, James D

    2008-09-01

    Ratiometric fluorescent indicators are used for making quantitative measurements of a variety of physiological variables. Their utility is often limited by noise. This is the second in a series of papers describing statistical methods for denoising ratiometric data with the aim of obtaining improved quantitative estimates of variables of interest. Here, we outline a statistical optimization method that is designed for the analysis of ratiometric imaging data in which multiple measurements have been taken of systems responding to the same stimulation protocol. This method takes advantage of correlated information across multiple datasets for objectively detecting and estimating ratiometric signals. We demonstrate our method by showing results of its application on multiple, ratiometric calcium imaging experiments.

  6. Functional recognition imaging using artificial neural networks: applications to rapid cellular identification via broadband electromechanical response

    NASA Astrophysics Data System (ADS)

    Nikiforov, M. P.; Reukov, V. V.; Thompson, G. L.; Vertegel, A. A.; Guo, S.; Kalinin, S. V.; Jesse, S.

    2009-10-01

    Functional recognition imaging in scanning probe microscopy (SPM) using artificial neural network identification is demonstrated. This approach utilizes statistical analysis of complex SPM responses at a single spatial location to identify the target behavior, which is reminiscent of associative thinking in the human brain, obviating the need for analytical models. We demonstrate, as an example of recognition imaging, rapid identification of cellular organisms using the difference in electromechanical activity over a broad frequency range. Single-pixel identification of model Micrococcus lysodeikticus and Pseudomonas fluorescens bacteria is achieved, demonstrating the viability of the method.

  7. Detection of semi-volatile organic compounds in permeable ...

    EPA Pesticide Factsheets

    Abstract The Edison Environmental Center (EEC) has a research and demonstration permeable parking lot comprised of three different permeable systems: permeable asphalt, porous concrete and interlocking concrete permeable pavers. Water quality and quantity analysis has been ongoing since January, 2010. This paper describes a subset of the water quality analysis, analysis of semivolatile organic compounds (SVOCs) to determine if hydrocarbons were in water infiltrated through the permeable surfaces. SVOCs were analyzed in samples collected from 11 dates over a 3 year period, from 2/8/2010 to 4/1/2013.Results are broadly divided into three categories: 42 chemicals were never detected; 12 chemicals (11 chemical test) were detected at a rate of less than 10% or less; and 22 chemicals were detected at a frequency of 10% or greater (ranging from 10% to 66.5% detections). Fundamental and exploratory statistical analyses were performed on these latter analyses results by grouping results by surface type. The statistical analyses were limited due to low frequency of detections and dilutions of samples which impacted detection limits. The infiltrate data through three permeable surfaces were analyzed as non-parametric data by the Kaplan-Meier estimation method for fundamental statistics; there were some statistically observable difference in concentration between pavement types when using Tarone-Ware Comparison Hypothesis Test. Additionally Spearman Rank order non-parame

  8. A new image encryption algorithm based on the fractional-order hyperchaotic Lorenz system

    NASA Astrophysics Data System (ADS)

    Wang, Zhen; Huang, Xia; Li, Yu-Xia; Song, Xiao-Na

    2013-01-01

    We propose a new image encryption algorithm on the basis of the fractional-order hyperchaotic Lorenz system. While in the process of generating a key stream, the system parameters and the derivative order are embedded in the proposed algorithm to enhance the security. Such an algorithm is detailed in terms of security analyses, including correlation analysis, information entropy analysis, run statistic analysis, mean-variance gray value analysis, and key sensitivity analysis. The experimental results demonstrate that the proposed image encryption scheme has the advantages of large key space and high security for practical image encryption.

  9. Role of diversity in ICA and IVA: theory and applications

    NASA Astrophysics Data System (ADS)

    Adalı, Tülay

    2016-05-01

    Independent component analysis (ICA) has been the most popular approach for solving the blind source separation problem. Starting from a simple linear mixing model and the assumption of statistical independence, ICA can recover a set of linearly-mixed sources to within a scaling and permutation ambiguity. It has been successfully applied to numerous data analysis problems in areas as diverse as biomedicine, communications, finance, geo- physics, and remote sensing. ICA can be achieved using different types of diversity—statistical property—and, can be posed to simultaneously account for multiple types of diversity such as higher-order-statistics, sample dependence, non-circularity, and nonstationarity. A recent generalization of ICA, independent vector analysis (IVA), generalizes ICA to multiple data sets and adds the use of one more type of diversity, statistical dependence across the data sets, for jointly achieving independent decomposition of multiple data sets. With the addition of each new diversity type, identification of a broader class of signals become possible, and in the case of IVA, this includes sources that are independent and identically distributed Gaussians. We review the fundamentals and properties of ICA and IVA when multiple types of diversity are taken into account, and then ask the question whether diversity plays an important role in practical applications as well. Examples from various domains are presented to demonstrate that in many scenarios it might be worthwhile to jointly account for multiple statistical properties. This paper is submitted in conjunction with the talk delivered for the "Unsupervised Learning and ICA Pioneer Award" at the 2016 SPIE Conference on Sensing and Analysis Technologies for Biomedical and Cognitive Applications.

  10. Face recognition using an enhanced independent component analysis approach.

    PubMed

    Kwak, Keun-Chang; Pedrycz, Witold

    2007-03-01

    This paper is concerned with an enhanced independent component analysis (ICA) and its application to face recognition. Typically, face representations obtained by ICA involve unsupervised learning and high-order statistics. In this paper, we develop an enhancement of the generic ICA by augmenting this method by the Fisher linear discriminant analysis (LDA); hence, its abbreviation, FICA. The FICA is systematically developed and presented along with its underlying architecture. A comparative analysis explores four distance metrics, as well as classification with support vector machines (SVMs). We demonstrate that the FICA approach leads to the formation of well-separated classes in low-dimension subspace and is endowed with a great deal of insensitivity to large variation in illumination and facial expression. The comprehensive experiments are completed for the facial-recognition technology (FERET) face database; a comparative analysis demonstrates that FICA comes with improved classification rates when compared with some other conventional approaches such as eigenface, fisherface, and the ICA itself.

  11. Overweight, but not obesity, paradox on mortality following coronary artery bypass grafting.

    PubMed

    Takagi, Hisato; Umemoto, Takuya

    2016-09-01

    To determine whether an "obesity paradox" on post-coronary artery bypass grafting (CABG) mortality exists, we abstracted exclusively adjusted odds ratios (ORs) and/or hazard ratios (HRs) for mortality from each study, and then combined them in a meta-analysis. MEDLINE and EMBASE were searched through April 2015 using PubMed and OVID, to identify comparative studies, of overweight or obese versus normal weight patients undergoing CABG, reporting adjusted relative risk estimates for short-term (30-day or in-hospital) and/or mid-to-long-term all-cause mortality. Our search identified 14 eligible studies. In total our meta-analysis included data on 79,140 patients undergoing CABG. Pooled analyses in short-term mortality demonstrated that overweight was associated with a statistically significant 15% reduction relative to normal weight (OR, 0.85; 95% confidence interval [CI], 0.74-0.98; p=0.03) and no statistically significant differences between mild obesity, moderate/severe obesity, or overall obesity and normal weight. Pooled analyses in mid-to-long-term mortality demonstrated that overweight was associated with a statistically significant 10% reduction relative to normal weight (HR, 0.90; 95% CI, 0.84 to 0.96; p=0.001); and no statistically significant differences between mild obesity, moderate/severe obesity, or overall obesity and normal weight. Overweight, but not obesity, may be associated with better short-term and mid-to-long-term post-CABG survival relative to normal weight. An overweight, but not obesity, paradox on post-CABG mortality appears to exist. Copyright © 2015 Japanese College of Cardiology. Published by Elsevier Ltd. All rights reserved.

  12. Electron microscopic quantification of collagen fibril diameters in the rabbit medial collateral ligament: a baseline for comparison.

    PubMed

    Frank, C; Bray, D; Rademaker, A; Chrusch, C; Sabiston, P; Bodie, D; Rangayyan, R

    1989-01-01

    To establish a normal baseline for comparison, thirty-one thousand collagen fibril diameters were measured in calibrated transmission electron (TEM) photomicrographs of normal rabbit medial collateral ligaments (MCL's). A new automated method of quantitation was used to compare statistically fibril minimum diameter distributions in one midsubstance location in both MCL's from six animals at 3 months of age (immature) and three animals at 10 months of age (mature). Pooled results demonstrate that rabbit MCL's have statistically different (p less than 0.001) mean minimum diameters at these two ages. Interanimal differences in mean fibril minimum diameters were also significant (p less than 0.001) and varied by 20% to 25% in both mature and immature animals. Finally, there were significant differences (p less than 0.001) in mean diameters and distributions from side-to-side in all animals. These mean left-to-right differences were less than 10% in all mature animals but as much as 62% in some immature animals. Statistical analysis of these data demonstrate that animal-to-animal comparisons using these protocols require a large number of animals with appropriate numbers of fibrils being measured to detect small intergroup differences. With experiments which compare left to right ligaments, far fewer animals are required to detect similarly small differences. These results demonstrate the necessity for rigorous control of sampling, an extensive normal baseline and statistically confirmed experimental designs in any TEM comparisons of collagen fibril diameters.

  13. Federal Programs Supporting Educational Change, Vol. 2: Factors Affecting Change Agent Projects.

    ERIC Educational Resources Information Center

    Berman, Paul; Pauly, Edward W.

    This second volume in the change-agent series reports the interim results of an exploratory statistical analysis of a survey of a nationwide sample of 293 change-agent projects funded by four federal demonstration programs--Elementary Secondary Education Act (ESEA) Title III, Innovative Projects; ESEA Title VII, Bilingual Projects; Vocational…

  14. Women, University and Science in Twentieth-Century Spain

    ERIC Educational Resources Information Center

    Canales, Antonio Fco.

    2018-01-01

    This article aims to question the widely accepted idea that female university students in Spain have, in the past, tended to opt for degrees in the field of humanities. Based on an analysis of the official statistics that are currently available, the paper demonstrates that Spanish female university students showed a clear preference for…

  15. A user-friendly workflow for analysis of Illumina gene expression bead array data available at the arrayanalysis.org portal.

    PubMed

    Eijssen, Lars M T; Goelela, Varshna S; Kelder, Thomas; Adriaens, Michiel E; Evelo, Chris T; Radonjic, Marijana

    2015-06-30

    Illumina whole-genome expression bead arrays are a widely used platform for transcriptomics. Most of the tools available for the analysis of the resulting data are not easily applicable by less experienced users. ArrayAnalysis.org provides researchers with an easy-to-use and comprehensive interface to the functionality of R and Bioconductor packages for microarray data analysis. As a modular open source project, it allows developers to contribute modules that provide support for additional types of data or extend workflows. To enable data analysis of Illumina bead arrays for a broad user community, we have developed a module for ArrayAnalysis.org that provides a free and user-friendly web interface for quality control and pre-processing for these arrays. This module can be used together with existing modules for statistical and pathway analysis to provide a full workflow for Illumina gene expression data analysis. The module accepts data exported from Illumina's GenomeStudio, and provides the user with quality control plots and normalized data. The outputs are directly linked to the existing statistics module of ArrayAnalysis.org, but can also be downloaded for further downstream analysis in third-party tools. The Illumina bead arrays analysis module is available at http://www.arrayanalysis.org . A user guide, a tutorial demonstrating the analysis of an example dataset, and R scripts are available. The module can be used as a starting point for statistical evaluation and pathway analysis provided on the website or to generate processed input data for a broad range of applications in life sciences research.

  16. Analysis of scattering statistics and governing distribution functions in optical coherence tomography.

    PubMed

    Sugita, Mitsuro; Weatherbee, Andrew; Bizheva, Kostadinka; Popov, Ivan; Vitkin, Alex

    2016-07-01

    The probability density function (PDF) of light scattering intensity can be used to characterize the scattering medium. We have recently shown that in optical coherence tomography (OCT), a PDF formalism can be sensitive to the number of scatterers in the probed scattering volume and can be represented by the K-distribution, a functional descriptor for non-Gaussian scattering statistics. Expanding on this initial finding, here we examine polystyrene microsphere phantoms with different sphere sizes and concentrations, and also human skin and fingernail in vivo. It is demonstrated that the K-distribution offers an accurate representation for the measured OCT PDFs. The behavior of the shape parameter of K-distribution that best fits the OCT scattering results is investigated in detail, and the applicability of this methodology for biological tissue characterization is demonstrated and discussed.

  17. Model-Based Linkage Analysis of a Quantitative Trait.

    PubMed

    Song, Yeunjoo E; Song, Sunah; Schnell, Audrey H

    2017-01-01

    Linkage Analysis is a family-based method of analysis to examine whether any typed genetic markers cosegregate with a given trait, in this case a quantitative trait. If linkage exists, this is taken as evidence in support of a genetic basis for the trait. Historically, linkage analysis was performed using a binary disease trait, but has been extended to include quantitative disease measures. Quantitative traits are desirable as they provide more information than binary traits. Linkage analysis can be performed using single-marker methods (one marker at a time) or multipoint (using multiple markers simultaneously). In model-based linkage analysis the genetic model for the trait of interest is specified. There are many software options for performing linkage analysis. Here, we use the program package Statistical Analysis for Genetic Epidemiology (S.A.G.E.). S.A.G.E. was chosen because it also includes programs to perform data cleaning procedures and to generate and test genetic models for a quantitative trait, in addition to performing linkage analysis. We demonstrate in detail the process of running the program LODLINK to perform single-marker analysis, and MLOD to perform multipoint analysis using output from SEGREG, where SEGREG was used to determine the best fitting statistical model for the trait.

  18. Feature-Based Statistical Analysis of Combustion Simulation Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bennett, J; Krishnamoorthy, V; Liu, S

    2011-11-18

    We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information, and hence, fail to provide mechanistic causality information between organized fluid motion and mixing andmore » reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. We highlight the utility of this new framework for combustion science; however, it is applicable to many other science domains.« less

  19. High Dimensional Classification Using Features Annealed Independence Rules.

    PubMed

    Fan, Jianqing; Fan, Yingying

    2008-01-01

    Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classifications is largely poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra and they propose to use the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as the random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as bad as the random guessing. Thus, it is paramountly important to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistics are proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.

  20. Trial Sequential Analysis in systematic reviews with meta-analysis.

    PubMed

    Wetterslev, Jørn; Jakobsen, Janus Christian; Gluud, Christian

    2017-03-06

    Most meta-analyses in systematic reviews, including Cochrane ones, do not have sufficient statistical power to detect or refute even large intervention effects. This is why a meta-analysis ought to be regarded as an interim analysis on its way towards a required information size. The results of the meta-analyses should relate the total number of randomised participants to the estimated required meta-analytic information size accounting for statistical diversity. When the number of participants and the corresponding number of trials in a meta-analysis are insufficient, the use of the traditional 95% confidence interval or the 5% statistical significance threshold will lead to too many false positive conclusions (type I errors) and too many false negative conclusions (type II errors). We developed a methodology for interpreting meta-analysis results, using generally accepted, valid evidence on how to adjust thresholds for significance in randomised clinical trials when the required sample size has not been reached. The Lan-DeMets trial sequential monitoring boundaries in Trial Sequential Analysis offer adjusted confidence intervals and restricted thresholds for statistical significance when the diversity-adjusted required information size and the corresponding number of required trials for the meta-analysis have not been reached. Trial Sequential Analysis provides a frequentistic approach to control both type I and type II errors. We define the required information size and the corresponding number of required trials in a meta-analysis and the diversity (D 2 ) measure of heterogeneity. We explain the reasons for using Trial Sequential Analysis of meta-analysis when the actual information size fails to reach the required information size. We present examples drawn from traditional meta-analyses using unadjusted naïve 95% confidence intervals and 5% thresholds for statistical significance. Spurious conclusions in systematic reviews with traditional meta-analyses can be reduced using Trial Sequential Analysis. Several empirical studies have demonstrated that the Trial Sequential Analysis provides better control of type I errors and of type II errors than the traditional naïve meta-analysis. Trial Sequential Analysis represents analysis of meta-analytic data, with transparent assumptions, and better control of type I and type II errors than the traditional meta-analysis using naïve unadjusted confidence intervals.

  1. The determinants of bond angle variability in protein/peptide backbones: A comprehensive statistical/quantum mechanics analysis.

    PubMed

    Improta, Roberto; Vitagliano, Luigi; Esposito, Luciana

    2015-11-01

    The elucidation of the mutual influence between peptide bond geometry and local conformation has important implications for protein structure refinement, validation, and prediction. To gain insights into the structural determinants and the energetic contributions associated with protein/peptide backbone plasticity, we here report an extensive analysis of the variability of the peptide bond angles by combining statistical analyses of protein structures and quantum mechanics calculations on small model peptide systems. Our analyses demonstrate that all the backbone bond angles strongly depend on the peptide conformation and unveil the existence of regular trends as function of ψ and/or φ. The excellent agreement of the quantum mechanics calculations with the statistical surveys of protein structures validates the computational scheme here employed and demonstrates that the valence geometry of protein/peptide backbone is primarily dictated by local interactions. Notably, for the first time we show that the position of the H(α) hydrogen atom, which is an important parameter in NMR structural studies, is also dependent on the local conformation. Most of the trends observed may be satisfactorily explained by invoking steric repulsive interactions; in some specific cases the valence bond variability is also influenced by hydrogen-bond like interactions. Moreover, we can provide a reliable estimate of the energies involved in the interplay between geometry and conformations. © 2015 Wiley Periodicals, Inc.

  2. Risk-based Methodology for Validation of Pharmaceutical Batch Processes.

    PubMed

    Wiles, Frederick

    2013-01-01

    In January 2011, the U.S. Food and Drug Administration published new process validation guidance for pharmaceutical processes. The new guidance debunks the long-held industry notion that three consecutive validation batches or runs are all that are required to demonstrate that a process is operating in a validated state. Instead, the new guidance now emphasizes that the level of monitoring and testing performed during process performance qualification (PPQ) studies must be sufficient to demonstrate statistical confidence both within and between batches. In some cases, three qualification runs may not be enough. Nearly two years after the guidance was first published, little has been written defining a statistical methodology for determining the number of samples and qualification runs required to satisfy Stage 2 requirements of the new guidance. This article proposes using a combination of risk assessment, control charting, and capability statistics to define the monitoring and testing scheme required to show that a pharmaceutical batch process is operating in a validated state. In this methodology, an assessment of process risk is performed through application of a process failure mode, effects, and criticality analysis (PFMECA). The output of PFMECA is used to select appropriate levels of statistical confidence and coverage which, in turn, are used in capability calculations to determine when significant Stage 2 (PPQ) milestones have been met. The achievement of Stage 2 milestones signals the release of batches for commercial distribution and the reduction of monitoring and testing to commercial production levels. Individuals, moving range, and range/sigma charts are used in conjunction with capability statistics to demonstrate that the commercial process is operating in a state of statistical control. The new process validation guidance published by the U.S. Food and Drug Administration in January of 2011 indicates that the number of process validation batches or runs required to demonstrate that a pharmaceutical process is operating in a validated state should be based on sound statistical principles. The old rule of "three consecutive batches and you're done" is no longer sufficient. The guidance, however, does not provide any specific methodology for determining the number of runs required, and little has been published to augment this shortcoming. The paper titled "Risk-based Methodology for Validation of Pharmaceutical Batch Processes" describes a statistically sound methodology for determining when a statistically valid number of validation runs has been acquired based on risk assessment and calculation of process capability.

  3. Fast and accurate imputation of summary statistics enhances evidence of functional enrichment.

    PubMed

    Pasaniuc, Bogdan; Zaitlen, Noah; Shi, Huwenbo; Bhatia, Gaurav; Gusev, Alexander; Pickrell, Joseph; Hirschhorn, Joel; Strachan, David P; Patterson, Nick; Price, Alkes L

    2014-10-15

    Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov models (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1-5%) variants [increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples] versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case-control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of [Formula: see text] association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Publicly available software package available at http://bogdan.bioinformatics.ucla.edu/software/. bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu Supplementary materials are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  4. Impact of Damping Uncertainty on SEA Model Response Variance

    NASA Technical Reports Server (NTRS)

    Schiller, Noah; Cabell, Randolph; Grosveld, Ferdinand

    2010-01-01

    Statistical Energy Analysis (SEA) is commonly used to predict high-frequency vibroacoustic levels. This statistical approach provides the mean response over an ensemble of random subsystems that share the same gross system properties such as density, size, and damping. Recently, techniques have been developed to predict the ensemble variance as well as the mean response. However these techniques do not account for uncertainties in the system properties. In the present paper uncertainty in the damping loss factor is propagated through SEA to obtain more realistic prediction bounds that account for both ensemble and damping variance. The analysis is performed on a floor-equipped cylindrical test article that resembles an aircraft fuselage. Realistic bounds on the damping loss factor are determined from measurements acquired on the sidewall of the test article. The analysis demonstrates that uncertainties in damping have the potential to significantly impact the mean and variance of the predicted response.

  5. Reply: Comparison of slope instability screening tools following a large storm event and application to forest management and policy

    NASA Astrophysics Data System (ADS)

    Whittaker, Kara A.; McShane, Dan

    2013-02-01

    A large storm event in southwest Washington State triggered over 2500 landslides and provided an opportunity to assess two slope stability screening tools. The statistical analysis conducted demonstrated that both screening tools are effective at predicting where landslides were likely to take place (Whittaker and McShane, 2012). Here we reply to two discussions of this article related to the development of the slope stability screening tools and the accuracy and scale of the spatial data used. Neither of the discussions address our statistical analysis or results. We provide greater detail on our sampling criteria and also elaborate on the policy and management implications of our findings and how they complement those of a separate investigation of landslides resulting from the same storm. The conclusions made in Whittaker and McShane (2012) stand as originally published unless future analysis indicates otherwise.

  6. RAD-ADAPT: Software for modelling clonogenic assay data in radiation biology.

    PubMed

    Zhang, Yaping; Hu, Kaiqiang; Beumer, Jan H; Bakkenist, Christopher J; D'Argenio, David Z

    2017-04-01

    We present a comprehensive software program, RAD-ADAPT, for the quantitative analysis of clonogenic assays in radiation biology. Two commonly used models for clonogenic assay analysis, the linear-quadratic model and single-hit multi-target model, are included in the software. RAD-ADAPT uses maximum likelihood estimation method to obtain parameter estimates with the assumption that cell colony count data follow a Poisson distribution. The program has an intuitive interface, generates model prediction plots, tabulates model parameter estimates, and allows automatic statistical comparison of parameters between different groups. The RAD-ADAPT interface is written using the statistical software R and the underlying computations are accomplished by the ADAPT software system for pharmacokinetic/pharmacodynamic systems analysis. The use of RAD-ADAPT is demonstrated using an example that examines the impact of pharmacologic ATM and ATR kinase inhibition on human lung cancer cell line A549 after ionizing radiation. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Descent graphs in pedigree analysis: applications to haplotyping, location scores, and marker-sharing statistics.

    PubMed Central

    Sobel, E.; Lange, K.

    1996-01-01

    The introduction of stochastic methods in pedigree analysis has enabled geneticists to tackle computations intractable by standard deterministic methods. Until now these stochastic techniques have worked by running a Markov chain on the set of genetic descent states of a pedigree. Each descent state specifies the paths of gene flow in the pedigree and the founder alleles dropped down each path. The current paper follows up on a suggestion by Elizabeth Thompson that genetic descent graphs offer a more appropriate space for executing a Markov chain. A descent graph specifies the paths of gene flow but not the particular founder alleles traveling down the paths. This paper explores algorithms for implementing Thompson's suggestion for codominant markers in the context of automatic haplotyping, estimating location scores, and computing gene-clustering statistics for robust linkage analysis. Realistic numerical examples demonstrate the feasibility of the algorithms. PMID:8651310

  8. Size of the Dynamic Bead in Polymers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Agapov, Alexander L; Sokolov, Alexei P

    2010-01-01

    Presented analysis of neutron, mechanical, and MD simulation data available in the literature demonstrates that the dynamic bead size (the smallest subchain that still exhibits the Rouse-like dynamics) in most of the polymers is significantly larger than the traditionally defined Kuhn segment. Moreover, our analysis emphasizes that even the static bead size (e.g., chain statistics) disagrees with the Kuhn segment length. We demonstrate that the deficiency of the Kuhn segment definition is based on the assumption of a chain being completely extended inside a single bead. The analysis suggests that representation of a real polymer chain by the bead-and-spring modelmore » with a single parameter C cannot be correct. One needs more parameters to reflect correctly details of the chain structure in the bead-and-spring model.« less

  9. Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates.

    PubMed

    Xia, Li C; Steele, Joshua A; Cram, Jacob A; Cardon, Zoe G; Simmons, Sheri L; Vallino, Joseph J; Fuhrman, Jed A; Sun, Fengzhu

    2011-01-01

    The increasing availability of time series microbial community data from metagenomics and other molecular biological studies has enabled the analysis of large-scale microbial co-occurrence and association networks. Among the many analytical techniques available, the Local Similarity Analysis (LSA) method is unique in that it captures local and potentially time-delayed co-occurrence and association patterns in time series data that cannot otherwise be identified by ordinary correlation analysis. However LSA, as originally developed, does not consider time series data with replicates, which hinders the full exploitation of available information. With replicates, it is possible to understand the variability of local similarity (LS) score and to obtain its confidence interval. We extended our LSA technique to time series data with replicates and termed it extended LSA, or eLSA. Simulations showed the capability of eLSA to capture subinterval and time-delayed associations. We implemented the eLSA technique into an easy-to-use analytic software package. The software pipeline integrates data normalization, statistical correlation calculation, statistical significance evaluation, and association network construction steps. We applied the eLSA technique to microbial community and gene expression datasets, where unique time-dependent associations were identified. The extended LSA analysis technique was demonstrated to reveal statistically significant local and potentially time-delayed association patterns in replicated time series data beyond that of ordinary correlation analysis. These statistically significant associations can provide insights to the real dynamics of biological systems. The newly designed eLSA software efficiently streamlines the analysis and is freely available from the eLSA homepage, which can be accessed at http://meta.usc.edu/softs/lsa.

  10. Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates

    PubMed Central

    2011-01-01

    Background The increasing availability of time series microbial community data from metagenomics and other molecular biological studies has enabled the analysis of large-scale microbial co-occurrence and association networks. Among the many analytical techniques available, the Local Similarity Analysis (LSA) method is unique in that it captures local and potentially time-delayed co-occurrence and association patterns in time series data that cannot otherwise be identified by ordinary correlation analysis. However LSA, as originally developed, does not consider time series data with replicates, which hinders the full exploitation of available information. With replicates, it is possible to understand the variability of local similarity (LS) score and to obtain its confidence interval. Results We extended our LSA technique to time series data with replicates and termed it extended LSA, or eLSA. Simulations showed the capability of eLSA to capture subinterval and time-delayed associations. We implemented the eLSA technique into an easy-to-use analytic software package. The software pipeline integrates data normalization, statistical correlation calculation, statistical significance evaluation, and association network construction steps. We applied the eLSA technique to microbial community and gene expression datasets, where unique time-dependent associations were identified. Conclusions The extended LSA analysis technique was demonstrated to reveal statistically significant local and potentially time-delayed association patterns in replicated time series data beyond that of ordinary correlation analysis. These statistically significant associations can provide insights to the real dynamics of biological systems. The newly designed eLSA software efficiently streamlines the analysis and is freely available from the eLSA homepage, which can be accessed at http://meta.usc.edu/softs/lsa. PMID:22784572

  11. Maxillary distraction osteogenesis in the adolescent cleft patient: three-dimensional computed tomography analysis of linear and volumetric changes over five years.

    PubMed

    Chen, Philip Kuo-Ting; Por, Yong-Chen; Liou, Eric Jein-Wein; Chang, Frank Chun-Shin

    2011-07-01

    To assess the results of maxillary distraction osteogenesis with the Rigid External Distraction System using three-dimensional computed tomography scan volume-rendered images with respect to stability and facial growth at three time frames: preoperative (T0), 1-year postoperative (T1), and 5-years postoperative (T2). Retrospective analysis. Tertiary. A total of 12 patients with severe cleft maxillary hypoplasia were treated between June 30, 1997, and July 15, 1998. The mean age at surgery was 11 years 1 month. Le Fort I maxillary distraction osteogenesis. Distraction was started 2 to 5 days postsurgery at a rate of 1 mm per day. The consolidation period was 3 months. No face mask was used. A paired t test was used for statistical analysis. Overjet, ANB, and SNA and maxillary, pterygoid, and mandibular volumes. From T0 to T1, there were statistically significant increments of overjet, ANB, and SNA and maxillary, pterygoid, and mandibular volumes. The T1 to T2 period demonstrated a reduction of overjet (30.07%) and ANB (54.42%). The maxilla showed a stable SNA and a small but statistically significant advancement of the ANS point. There was a significant increase in the mandibular volume. However, there was no significant change in the maxillary and pterygoid volumes. Maxillary distraction osteogenesis demonstrated linear and volumetric maxillary growth during the distraction phase without clinically significant continued growth thereafter. Overcorrection is required to take into account recurrence of midface retrusion over the long term.

  12. A statistical analysis of the elastic distortion and dislocation density fields in deformed crystals

    DOE PAGES

    Mohamed, Mamdouh S.; Larson, Bennett C.; Tischler, Jonathan Z.; ...

    2015-05-18

    The statistical properties of the elastic distortion fields of dislocations in deforming crystals are investigated using the method of discrete dislocation dynamics to simulate dislocation structures and dislocation density evolution under tensile loading. Probability distribution functions (PDF) and pair correlation functions (PCF) of the simulated internal elastic strains and lattice rotations are generated for tensile strain levels up to 0.85%. The PDFs of simulated lattice rotation are compared with sub-micrometer resolution three-dimensional X-ray microscopy measurements of rotation magnitudes and deformation length scales in 1.0% and 2.3% compression strained Cu single crystals to explore the linkage between experiment and the theoreticalmore » analysis. The statistical properties of the deformation simulations are analyzed through determinations of the Nye and Kr ner dislocation density tensors. The significance of the magnitudes and the length scales of the elastic strain and the rotation parts of dislocation density tensors are demonstrated, and their relevance to understanding the fundamental aspects of deformation is discussed.« less

  13. Statistical Analysis of a Large Sample Size Pyroshock Test Data Set Including Post Flight Data Assessment. Revision 1

    NASA Technical Reports Server (NTRS)

    Hughes, William O.; McNelis, Anne M.

    2010-01-01

    The Earth Observing System (EOS) Terra spacecraft was launched on an Atlas IIAS launch vehicle on its mission to observe planet Earth in late 1999. Prior to launch, the new design of the spacecraft's pyroshock separation system was characterized by a series of 13 separation ground tests. The analysis methods used to evaluate this unusually large amount of shock data will be discussed in this paper, with particular emphasis on population distributions and finding statistically significant families of data, leading to an overall shock separation interface level. The wealth of ground test data also allowed a derivation of a Mission Assurance level for the flight. All of the flight shock measurements were below the EOS Terra Mission Assurance level thus contributing to the overall success of the EOS Terra mission. The effectiveness of the statistical methodology for characterizing the shock interface level and for developing a flight Mission Assurance level from a large sample size of shock data is demonstrated in this paper.

  14. A phylogenetic transform enhances analysis of compositional microbiota data

    PubMed Central

    Silverman, Justin D; Washburne, Alex D; Mukherjee, Sayan; David, Lawrence A

    2017-01-01

    Surveys of microbial communities (microbiota), typically measured as relative abundance of species, have illustrated the importance of these communities in human health and disease. Yet, statistical artifacts commonly plague the analysis of relative abundance data. Here, we introduce the PhILR transform, which incorporates microbial evolutionary models with the isometric log-ratio transform to allow off-the-shelf statistical tools to be safely applied to microbiota surveys. We demonstrate that analyses of community-level structure can be applied to PhILR transformed data with performance on benchmarks rivaling or surpassing standard tools. Additionally, by decomposing distance in the PhILR transformed space, we identified neighboring clades that may have adapted to distinct human body sites. Decomposing variance revealed that covariation of bacterial clades within human body sites increases with phylogenetic relatedness. Together, these findings illustrate how the PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges and enable evolutionary insights relevant to microbial communities. DOI: http://dx.doi.org/10.7554/eLife.21887.001 PMID:28198697

  15. Statistical functions and relevant correlation coefficients of clearness index

    NASA Astrophysics Data System (ADS)

    Pavanello, Diego; Zaaiman, Willem; Colli, Alessandra; Heiser, John; Smith, Scott

    2015-08-01

    This article presents a statistical analysis of the sky conditions, during years from 2010 to 2012, for three different locations: the Joint Research Centre site in Ispra (Italy, European Solar Test Installation - ESTI laboratories), the site of National Renewable Energy Laboratory in Golden (Colorado, USA) and the site of Brookhaven National Laboratories in Upton (New York, USA). The key parameter is the clearness index kT, a dimensionless expression of the global irradiance impinging upon a horizontal surface at a given instant of time. In the first part, the sky conditions are characterized using daily averages, giving a general overview of the three sites. In the second part the analysis is performed using data sets with a short-term resolution of 1 sample per minute, demonstrating remarkable properties of the statistical distributions of the clearness index, reinforced by a proof using fuzzy logic methods. Successively some time-dependent correlations between different meteorological variables are presented in terms of Pearson and Spearman correlation coefficients, and introducing a new one.

  16. A Flexible Approach for the Statistical Visualization of Ensemble Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Potter, K.; Wilson, A.; Bremer, P.

    2009-09-29

    Scientists are increasingly moving towards ensemble data sets to explore relationships present in dynamic systems. Ensemble data sets combine spatio-temporal simulation results generated using multiple numerical models, sampled input conditions and perturbed parameters. While ensemble data sets are a powerful tool for mitigating uncertainty, they pose significant visualization and analysis challenges due to their complexity. We present a collection of overview and statistical displays linked through a high level of interactivity to provide a framework for gaining key scientific insight into the distribution of the simulation results as well as the uncertainty associated with the data. In contrast to methodsmore » that present large amounts of diverse information in a single display, we argue that combining multiple linked statistical displays yields a clearer presentation of the data and facilitates a greater level of visual data analysis. We demonstrate this approach using driving problems from climate modeling and meteorology and discuss generalizations to other fields.« less

  17. Distinguishing Mediational Models and Analyses in Clinical Psychology: Atemporal Associations Do Not Imply Causation.

    PubMed

    Winer, E Samuel; Cervone, Daniel; Bryant, Jessica; McKinney, Cliff; Liu, Richard T; Nadorff, Michael R

    2016-09-01

    A popular way to attempt to discern causality in clinical psychology is through mediation analysis. However, mediation analysis is sometimes applied to research questions in clinical psychology when inferring causality is impossible. This practice may soon increase with new, readily available, and easy-to-use statistical advances. Thus, we here provide a heuristic to remind clinical psychological scientists of the assumptions of mediation analyses. We describe recent statistical advances and unpack assumptions of causality in mediation, underscoring the importance of time in understanding mediational hypotheses and analyses in clinical psychology. Example analyses demonstrate that statistical mediation can occur despite theoretical mediation being improbable. We propose a delineation of mediational effects derived from cross-sectional designs into the terms temporal and atemporal associations to emphasize time in conceptualizing process models in clinical psychology. The general implications for mediational hypotheses and the temporal frameworks from within which they may be drawn are discussed. © 2016 Wiley Periodicals, Inc.

  18. Accuracy Evaluation of the Unified P-Value from Combining Correlated P-Values

    PubMed Central

    Alves, Gelio; Yu, Yi-Kuo

    2014-01-01

    Meta-analysis methods that combine -values into a single unified -value are frequently employed to improve confidence in hypothesis testing. An assumption made by most meta-analysis methods is that the -values to be combined are independent, which may not always be true. To investigate the accuracy of the unified -value from combining correlated -values, we have evaluated a family of statistical methods that combine: independent, weighted independent, correlated, and weighted correlated -values. Statistical accuracy evaluation by combining simulated correlated -values showed that correlation among -values can have a significant effect on the accuracy of the combined -value obtained. Among the statistical methods evaluated those that weight -values compute more accurate combined -values than those that do not. Also, statistical methods that utilize the correlation information have the best performance, producing significantly more accurate combined -values. In our study we have demonstrated that statistical methods that combine -values based on the assumption of independence can produce inaccurate -values when combining correlated -values, even when the -values are only weakly correlated. Therefore, to prevent from drawing false conclusions during hypothesis testing, our study advises caution be used when interpreting the -value obtained from combining -values of unknown correlation. However, when the correlation information is available, the weighting-capable statistical method, first introduced by Brown and recently modified by Hou, seems to perform the best amongst the methods investigated. PMID:24663491

  19. Gene-Based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions.

    PubMed

    Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y; Chen, Wei

    2016-02-01

    Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, here we develop Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox RF LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT), which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than or similar power as Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in the whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example. © 2016 WILEY PERIODICALS, INC.

  20. Gene-based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions

    PubMed Central

    Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E.; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y.; Chen, Wei

    2015-01-01

    Summary Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, we develop here Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox RF LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT) which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than or similar power as Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in the whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example. PMID:26782979

  1. Reliability and statistical power analysis of cortical and subcortical FreeSurfer metrics in a large sample of healthy elderly.

    PubMed

    Liem, Franziskus; Mérillat, Susan; Bezzola, Ladina; Hirsiger, Sarah; Philipp, Michel; Madhyastha, Tara; Jäncke, Lutz

    2015-03-01

    FreeSurfer is a tool to quantify cortical and subcortical brain anatomy automatically and noninvasively. Previous studies have reported reliability and statistical power analyses in relatively small samples or only selected one aspect of brain anatomy. Here, we investigated reliability and statistical power of cortical thickness, surface area, volume, and the volume of subcortical structures in a large sample (N=189) of healthy elderly subjects (64+ years). Reliability (intraclass correlation coefficient) of cortical and subcortical parameters is generally high (cortical: ICCs>0.87, subcortical: ICCs>0.95). Surface-based smoothing increases reliability of cortical thickness maps, while it decreases reliability of cortical surface area and volume. Nevertheless, statistical power of all measures benefits from smoothing. When aiming to detect a 10% difference between groups, the number of subjects required to test effects with sufficient power over the entire cortex varies between cortical measures (cortical thickness: N=39, surface area: N=21, volume: N=81; 10mm smoothing, power=0.8, α=0.05). For subcortical regions this number is between 16 and 76 subjects, depending on the region. We also demonstrate the advantage of within-subject designs over between-subject designs. Furthermore, we publicly provide a tool that allows researchers to perform a priori power analysis and sensitivity analysis to help evaluate previously published studies and to design future studies with sufficient statistical power. Copyright © 2014 Elsevier Inc. All rights reserved.

  2. Secure and scalable deduplication of horizontally partitioned health data for privacy-preserving distributed statistical computation.

    PubMed

    Yigzaw, Kassaye Yitbarek; Michalas, Antonis; Bellika, Johan Gustav

    2017-01-03

    Techniques have been developed to compute statistics on distributed datasets without revealing private information except the statistical results. However, duplicate records in a distributed dataset may lead to incorrect statistical results. Therefore, to increase the accuracy of the statistical analysis of a distributed dataset, secure deduplication is an important preprocessing step. We designed a secure protocol for the deduplication of horizontally partitioned datasets with deterministic record linkage algorithms. We provided a formal security analysis of the protocol in the presence of semi-honest adversaries. The protocol was implemented and deployed across three microbiology laboratories located in Norway, and we ran experiments on the datasets in which the number of records for each laboratory varied. Experiments were also performed on simulated microbiology datasets and data custodians connected through a local area network. The security analysis demonstrated that the protocol protects the privacy of individuals and data custodians under a semi-honest adversarial model. More precisely, the protocol remains secure with the collusion of up to N - 2 corrupt data custodians. The total runtime for the protocol scales linearly with the addition of data custodians and records. One million simulated records distributed across 20 data custodians were deduplicated within 45 s. The experimental results showed that the protocol is more efficient and scalable than previous protocols for the same problem. The proposed deduplication protocol is efficient and scalable for practical uses while protecting the privacy of patients and data custodians.

  3. Improving information retrieval in functional analysis.

    PubMed

    Rodriguez, Juan C; González, Germán A; Fresno, Cristóbal; Llera, Andrea S; Fernández, Elmer A

    2016-12-01

    Transcriptome analysis is essential to understand the mechanisms regulating key biological processes and functions. The first step usually consists of identifying candidate genes; to find out which pathways are affected by those genes, however, functional analysis (FA) is mandatory. The most frequently used strategies for this purpose are Gene Set and Singular Enrichment Analysis (GSEA and SEA) over Gene Ontology. Several statistical methods have been developed and compared in terms of computational efficiency and/or statistical appropriateness. However, whether their results are similar or complementary, the sensitivity to parameter settings, or possible bias in the analyzed terms has not been addressed so far. Here, two GSEA and four SEA methods and their parameter combinations were evaluated in six datasets by comparing two breast cancer subtypes with well-known differences in genetic background and patient outcomes. We show that GSEA and SEA lead to different results depending on the chosen statistic, model and/or parameters. Both approaches provide complementary results from a biological perspective. Hence, an Integrative Functional Analysis (IFA) tool is proposed to improve information retrieval in FA. It provides a common gene expression analytic framework that grants a comprehensive and coherent analysis. Only a minimal user parameter setting is required, since the best SEA/GSEA alternatives are integrated. IFA utility was demonstrated by evaluating four prostate cancer and the TCGA breast cancer microarray datasets, which showed its biological generalization capabilities. Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. Disability Measurement for Korean Community-Dwelling Adults With Stroke: Item-Level Psychometric Analysis of the Korean Longitudinal Study of Ageing

    PubMed Central

    2018-01-01

    Objective To investigate the psychometric properties of the activities of daily living (ADL) instrument used in the analysis of Korean Longitudinal Study of Ageing (KLoSA) dataset. Methods A retrospective study was carried out involving 2006 KLoSA records of community-dwelling adults diagnosed with stroke. The ADL instrument used for the analysis of KLoSA included 17 items, which were analyzed using Rasch modeling to develop a robust outcome measure. The unidimensionality of the ADL instrument was examined based on confirmatory factor analysis with a one-factor model. Item-level psychometric analysis of the ADL instrument included fit statistics, internal consistency, precision, and the item difficulty hierarchy. Results The study sample included a total of 201 community-dwelling adults (1.5% of the Korean population with an age over 45 years; mean age=70.0 years, SD=9.7) having a history of stroke. The ADL instrument demonstrated unidimensional construct. Two misfit items, money management (mean square [MnSq]=1.56, standardized Z-statistics [ZSTD]=2.3) and phone use (MnSq=1.78, ZSTD=2.3) were removed from the analysis. The remaining 15 items demonstrated good item fit, high internal consistency (person reliability=0.91), and good precision (person strata=3.48). The instrument precisely estimated person measures within a wide range of theta (−4.75 logits < θ < 3.97 logits) and a reliability of 0.9, with a conceptual hierarchy of item difficulty. Conclusion The findings indicate that the 15 ADL items met Rasch expectations of unidimensionality and demonstrated good psychometric properties. It is proposed that the validated ADL instrument can be used as a primary outcome measure for assessing longitudinal disability trajectories in the Korean adult population and can be employed for comparative analysis of international disability across national aging studies. PMID:29765888

  5. Statistical quality control through overall vibration analysis

    NASA Astrophysics Data System (ADS)

    Carnero, M. a. Carmen; González-Palma, Rafael; Almorza, David; Mayorga, Pedro; López-Escobar, Carlos

    2010-05-01

    The present study introduces the concept of statistical quality control in automotive wheel bearings manufacturing processes. Defects on products under analysis can have a direct influence on passengers' safety and comfort. At present, the use of vibration analysis on machine tools for quality control purposes is not very extensive in manufacturing facilities. Noise and vibration are common quality problems in bearings. These failure modes likely occur under certain operating conditions and do not require high vibration amplitudes but relate to certain vibration frequencies. The vibration frequencies are affected by the type of surface problems (chattering) of ball races that are generated through grinding processes. The purpose of this paper is to identify grinding process variables that affect the quality of bearings by using statistical principles in the field of machine tools. In addition, an evaluation of the quality results of the finished parts under different combinations of process variables is assessed. This paper intends to establish the foundations to predict the quality of the products through the analysis of self-induced vibrations during the contact between the grinding wheel and the parts. To achieve this goal, the overall self-induced vibration readings under different combinations of process variables are analysed using statistical tools. The analysis of data and design of experiments follows a classical approach, considering all potential interactions between variables. The analysis of data is conducted through analysis of variance (ANOVA) for data sets that meet normality and homoscedasticity criteria. This paper utilizes different statistical tools to support the conclusions such as chi squared, Shapiro-Wilks, symmetry, Kurtosis, Cochran, Hartlett, and Hartley and Krushal-Wallis. The analysis presented is the starting point to extend the use of predictive techniques (vibration analysis) for quality control. This paper demonstrates the existence of predictive variables (high-frequency vibration displacements) that are sensible to the processes setup and the quality of the products obtained. Based on the result of this overall vibration analysis, a second paper will analyse self-induced vibration spectrums in order to define limit vibration bands, controllable every cycle or connected to permanent vibration-monitoring systems able to adjust sensible process variables identified by ANOVA, once the vibration readings exceed established quality limits.

  6. Effects of measurement errors on psychometric measurements in ergonomics studies: Implications for correlations, ANOVA, linear regression, factor analysis, and linear discriminant analysis.

    PubMed

    Liu, Yan; Salvendy, Gavriel

    2009-05-01

    This paper aims to demonstrate the effects of measurement errors on psychometric measurements in ergonomics studies. A variety of sources can cause random measurement errors in ergonomics studies and these errors can distort virtually every statistic computed and lead investigators to erroneous conclusions. The effects of measurement errors on five most widely used statistical analysis tools have been discussed and illustrated: correlation; ANOVA; linear regression; factor analysis; linear discriminant analysis. It has been shown that measurement errors can greatly attenuate correlations between variables, reduce statistical power of ANOVA, distort (overestimate, underestimate or even change the sign of) regression coefficients, underrate the explanation contributions of the most important factors in factor analysis and depreciate the significance of discriminant function and discrimination abilities of individual variables in discrimination analysis. The discussions will be restricted to subjective scales and survey methods and their reliability estimates. Other methods applied in ergonomics research, such as physical and electrophysiological measurements and chemical and biomedical analysis methods, also have issues of measurement errors, but they are beyond the scope of this paper. As there has been increasing interest in the development and testing of theories in ergonomics research, it has become very important for ergonomics researchers to understand the effects of measurement errors on their experiment results, which the authors believe is very critical to research progress in theory development and cumulative knowledge in the ergonomics field.

  7. Demonstrating microbial co-occurrence pattern analyses within and between ecosystems

    PubMed Central

    Williams, Ryan J.; Howe, Adina; Hofmockel, Kirsten S.

    2014-01-01

    Co-occurrence patterns are used in ecology to explore interactions between organisms and environmental effects on coexistence within biological communities. Analysis of co-occurrence patterns among microbial communities has ranged from simple pairwise comparisons between all community members to direct hypothesis testing between focal species. However, co-occurrence patterns are rarely studied across multiple ecosystems or multiple scales of biological organization within the same study. Here we outline an approach to produce co-occurrence analyses that are focused at three different scales: co-occurrence patterns between ecosystems at the community scale, modules of co-occurring microorganisms within communities, and co-occurring pairs within modules that are nested within microbial communities. To demonstrate our co-occurrence analysis approach, we gathered publicly available 16S rRNA amplicon datasets to compare and contrast microbial co-occurrence at different taxonomic levels across different ecosystems. We found differences in community composition and co-occurrence that reflect environmental filtering at the community scale and consistent pairwise occurrences that may be used to infer ecological traits about poorly understood microbial taxa. However, we also found that conclusions derived from applying network statistics to microbial relationships can vary depending on the taxonomic level chosen and criteria used to build co-occurrence networks. We present our statistical analysis and code for public use in analysis of co-occurrence patterns across microbial communities. PMID:25101065

  8. Automated system for the on-line monitoring of powder blending processes using near-infrared spectroscopy. Part I. System development and control.

    PubMed

    Hailey, P A; Doherty, P; Tapsell, P; Oliver, T; Aldridge, P K

    1996-03-01

    An automated system for the on-line monitoring of powder blending processes is described. The system employs near-infrared (NIR) spectroscopy using fibre-optics and a graphical user interface (GUI) developed in the LabVIEW environment. The complete supervisory control and data analysis (SCADA) software controls blender and spectrophotometer operation and performs statistical spectral data analysis in real time. A data analysis routine using standard deviation is described to demonstrate an approach to the real-time determination of blend homogeneity.

  9. Load Model Verification, Validation and Calibration Framework by Statistical Analysis on Field Data

    NASA Astrophysics Data System (ADS)

    Jiao, Xiangqing; Liao, Yuan; Nguyen, Thai

    2017-11-01

    Accurate load models are critical for power system analysis and operation. A large amount of research work has been done on load modeling. Most of the existing research focuses on developing load models, while little has been done on developing formal load model verification and validation (V&V) methodologies or procedures. Most of the existing load model validation is based on qualitative rather than quantitative analysis. In addition, not all aspects of model V&V problem have been addressed by the existing approaches. To complement the existing methods, this paper proposes a novel load model verification and validation framework that can systematically and more comprehensively examine load model's effectiveness and accuracy. Statistical analysis, instead of visual check, quantifies the load model's accuracy, and provides a confidence level of the developed load model for model users. The analysis results can also be used to calibrate load models. The proposed framework can be used as a guidance to systematically examine load models for utility engineers and researchers. The proposed method is demonstrated through analysis of field measurements collected from a utility system.

  10. Correlation and simple linear regression.

    PubMed

    Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G

    2003-06-01

    In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.

  11. Assessing signal-to-noise in quantitative proteomics: multivariate statistical analysis in DIGE experiments.

    PubMed

    Friedman, David B

    2012-01-01

    All quantitative proteomics experiments measure variation between samples. When performing large-scale experiments that involve multiple conditions or treatments, the experimental design should include the appropriate number of individual biological replicates from each condition to enable the distinction between a relevant biological signal from technical noise. Multivariate statistical analyses, such as principal component analysis (PCA), provide a global perspective on experimental variation, thereby enabling the assessment of whether the variation describes the expected biological signal or the unanticipated technical/biological noise inherent in the system. Examples will be shown from high-resolution multivariable DIGE experiments where PCA was instrumental in demonstrating biologically significant variation as well as sample outliers, fouled samples, and overriding technical variation that would not be readily observed using standard univariate tests.

  12. Therapies for acute myeloid leukemia: vosaroxin

    PubMed Central

    Sayar, Hamid; Bashardoust, Parvaneh

    2017-01-01

    Vosaroxin, a quinolone-derivative chemotherapeutic agent, was considered a promising drug for the treatment of acute myeloid leukemia (AML). Early-stage clinical trials with this agent led to a large randomized double-blind placebo-controlled study of vosaroxin in combination with intermediate-dose cytarabine for the treatment of relapsed or refractory AML. The study demonstrated better complete remission rates with vosaroxin, but there was no statistically significant overall survival benefit in the whole cohort. A subset analysis censoring patients who had undergone allogeneic stem cell transplantation, however, revealed a modest but statistically significant improvement in overall survival particularly among older patients. This article reviews the data available on vosaroxin including clinical trials in AML and offers an analysis of findings of these studies as well as the current status of vosaroxin. PMID:28860803

  13. Therapies for acute myeloid leukemia: vosaroxin.

    PubMed

    Sayar, Hamid; Bashardoust, Parvaneh

    2017-01-01

    Vosaroxin, a quinolone-derivative chemotherapeutic agent, was considered a promising drug for the treatment of acute myeloid leukemia (AML). Early-stage clinical trials with this agent led to a large randomized double-blind placebo-controlled study of vosaroxin in combination with intermediate-dose cytarabine for the treatment of relapsed or refractory AML. The study demonstrated better complete remission rates with vosaroxin, but there was no statistically significant overall survival benefit in the whole cohort. A subset analysis censoring patients who had undergone allogeneic stem cell transplantation, however, revealed a modest but statistically significant improvement in overall survival particularly among older patients. This article reviews the data available on vosaroxin including clinical trials in AML and offers an analysis of findings of these studies as well as the current status of vosaroxin.

  14. Statistical Inference for Data Adaptive Target Parameters.

    PubMed

    Hubbard, Alan E; Kherad-Pajouh, Sara; van der Laan, Mark J

    2016-05-01

    Consider one observes n i.i.d. copies of a random variable with a probability distribution that is known to be an element of a particular statistical model. In order to define our statistical target we partition the sample in V equal size sub-samples, and use this partitioning to define V splits in an estimation sample (one of the V subsamples) and corresponding complementary parameter-generating sample. For each of the V parameter-generating samples, we apply an algorithm that maps the sample to a statistical target parameter. We define our sample-split data adaptive statistical target parameter as the average of these V-sample specific target parameters. We present an estimator (and corresponding central limit theorem) of this type of data adaptive target parameter. This general methodology for generating data adaptive target parameters is demonstrated with a number of practical examples that highlight new opportunities for statistical learning from data. This new framework provides a rigorous statistical methodology for both exploratory and confirmatory analysis within the same data. Given that more research is becoming "data-driven", the theory developed within this paper provides a new impetus for a greater involvement of statistical inference into problems that are being increasingly addressed by clever, yet ad hoc pattern finding methods. To suggest such potential, and to verify the predictions of the theory, extensive simulation studies, along with a data analysis based on adaptively determined intervention rules are shown and give insight into how to structure such an approach. The results show that the data adaptive target parameter approach provides a general framework and resulting methodology for data-driven science.

  15. Spatial analyses for nonoverlapping objects with size variations and their application to coral communities.

    PubMed

    Muko, Soyoka; Shimatani, Ichiro K; Nozawa, Yoko

    2014-07-01

    Spatial distributions of individuals are conventionally analysed by representing objects as dimensionless points, in which spatial statistics are based on centre-to-centre distances. However, if organisms expand without overlapping and show size variations, such as is the case for encrusting corals, interobject spacing is crucial for spatial associations where interactions occur. We introduced new pairwise statistics using minimum distances between objects and demonstrated their utility when examining encrusting coral community data. We also calculated the conventional point process statistics and the grid-based statistics to clarify the advantages and limitations of each spatial statistical method. For simplicity, coral colonies were approximated by disks in these demonstrations. Focusing on short-distance effects, the use of minimum distances revealed that almost all coral genera were aggregated at a scale of 1-25 cm. However, when fragmented colonies (ramets) were treated as a genet, a genet-level analysis indicated weak or no aggregation, suggesting that most corals were randomly distributed and that fragmentation was the primary cause of colony aggregations. In contrast, point process statistics showed larger aggregation scales, presumably because centre-to-centre distances included both intercolony spacing and colony sizes (radius). The grid-based statistics were able to quantify the patch (aggregation) scale of colonies, but the scale was strongly affected by the colony size. Our approach quantitatively showed repulsive effects between an aggressive genus and a competitively weak genus, while the grid-based statistics (covariance function) also showed repulsion although the spatial scale indicated from the statistics was not directly interpretable in terms of ecological meaning. The use of minimum distances together with previously proposed spatial statistics helped us to extend our understanding of the spatial patterns of nonoverlapping objects that vary in size and the associated specific scales. © 2013 The Authors. Journal of Animal Ecology © 2013 British Ecological Society.

  16. Infinite von Mises-Fisher Mixture Modeling of Whole Brain fMRI Data.

    PubMed

    Røge, Rasmus E; Madsen, Kristoffer H; Schmidt, Mikkel N; Mørup, Morten

    2017-10-01

    Cluster analysis of functional magnetic resonance imaging (fMRI) data is often performed using gaussian mixture models, but when the time series are standardized such that the data reside on a hypersphere, this modeling assumption is questionable. The consequences of ignoring the underlying spherical manifold are rarely analyzed, in part due to the computational challenges imposed by directional statistics. In this letter, we discuss a Bayesian von Mises-Fisher (vMF) mixture model for data on the unit hypersphere and present an efficient inference procedure based on collapsed Markov chain Monte Carlo sampling. Comparing the vMF and gaussian mixture models on synthetic data, we demonstrate that the vMF model has a slight advantage inferring the true underlying clustering when compared to gaussian-based models on data generated from both a mixture of vMFs and a mixture of gaussians subsequently normalized. Thus, when performing model selection, the two models are not in agreement. Analyzing multisubject whole brain resting-state fMRI data from healthy adult subjects, we find that the vMF mixture model is considerably more reliable than the gaussian mixture model when comparing solutions across models trained on different groups of subjects, and again we find that the two models disagree on the optimal number of components. The analysis indicates that the fMRI data support more than a thousand clusters, and we confirm this is not a result of overfitting by demonstrating better prediction on data from held-out subjects. Our results highlight the utility of using directional statistics to model standardized fMRI data and demonstrate that whole brain segmentation of fMRI data requires a very large number of functional units in order to adequately account for the discernible statistical patterns in the data.

  17. Chronic lung disease in very low birth weight infants: Persistence and improvement of a quality improvement process in a tertiary level neonatal intensive care unit.

    PubMed

    Birenbaum, H J; Pfoh, E R; Helou, S; Pane, M A; Marinkovich, G A; Dentry, A; Yeh, Hsin-Chieh; Updegraff, L; Arnold, C; Liverman, S; Cawman, H

    2016-05-19

    We previously demonstrated a significant reduction in our incidence of chronic lung disease in our NICU using potentially better practices of avoiding delivery room endotracheal intubation and using early nasal CPAP. We sought to demonstrate whether these improvements were sustained and or improved over time. We conducted a retrospective, cross-sectional analysis of infants 501-1500 grams born at our hospital between 2005 and 2013. Infants born during the 2005-2007, 2008-2010 and 2011-2013 epochs were grouped together, respectively. Descriptive analysis was conducted to determine the number and percent of maternal and neonatal characteristics by year grouping. Chi-squared tests were used to determine whether there were any statistically significant changes in characteristics across year groupings.. Two outcome variables were assessed: a diagnosis of chronic lung disease based on the Vermont Oxford Network definition and being discharged home on supplemental oxygen. There was a statistically significant improvement in the incidence of chronic lung disease in infants below 27 weeks' gestation in the three year period in the 2011-2013 cohort compared with those in the 2005-2007 cohort. We also found a statistically significant improvement in the number of infants discharged on home oxygen with birth weights 751-1000 grams and infants with gestational age less than 27 weeks in the 2011-2013 cohort compared to the 2005-2007 cohort. We demonstrated sustained improvement in our incidence of CLD between 2005 and 2013. We speculate that a multifaceted strategy of avoiding intubation and excessive oxygen in the delivery room, the early use of CPAP, as well as the use of volume targeted ventilation, when needed, may help significantly reduce the incidence of CLD.

  18. A comparison of spectral magnitude and phase-locking value analyses of the frequency-following response to complex tones

    PubMed Central

    Zhu, Li; Bharadwaj, Hari; Xia, Jing; Shinn-Cunningham, Barbara

    2013-01-01

    Two experiments, both presenting diotic, harmonic tone complexes (100 Hz fundamental), were conducted to explore the envelope-related component of the frequency-following response (FFRENV), a measure of synchronous, subcortical neural activity evoked by a periodic acoustic input. Experiment 1 directly compared two common analysis methods, computing the magnitude spectrum and the phase-locking value (PLV). Bootstrapping identified which FFRENV frequency components were statistically above the noise floor for each metric and quantified the statistical power of the approaches. Across listeners and conditions, the two methods produced highly correlated results. However, PLV analysis required fewer processing stages to produce readily interpretable results. Moreover, at the fundamental frequency of the input, PLVs were farther above the metric's noise floor than spectral magnitudes. Having established the advantages of PLV analysis, the efficacy of the approach was further demonstrated by investigating how different acoustic frequencies contribute to FFRENV, analyzing responses to complex tones composed of different acoustic harmonics of 100 Hz (Experiment 2). Results show that the FFRENV response is dominated by peripheral auditory channels responding to unresolved harmonics, although low-frequency channels driven by resolved harmonics also contribute. These results demonstrate the utility of the PLV for quantifying the strength of FFRENV across conditions. PMID:23862815

  19. Relationship between teacher preparedness and inquiry-based instructional practices to students' science achievement: Evidence from TIMSS 2007

    NASA Astrophysics Data System (ADS)

    Martin, Lynn A.

    The purpose of this study was to examine the relationship between teachers' self-reported preparedness for teaching science content and their instructional practices to the science achievement of eighth grade science students in the United States as demonstrated by TIMSS 2007. Six hundred eighty-seven eighth grade science teachers in the United States representing 7,377 students responded to the TIMSS 2007 questionnaire about their instructional preparedness and their instructional practices. Quantitative data were reported. Through correlation analysis, the researcher found statistically significant positive relationships emerge between eighth grade science teachers' main area of study and their self-reported beliefs about their preparedness to teach that same content area. Another correlation analysis found a statistically significant negative relationship existed between teachers' self-reported use of inquiry-based instruction and preparedness to teach chemistry, physics and earth science. Another correlation analysis discovered a statistically significant positive relationship existed between physics preparedness and student science achievement. Finally, a correlation analysis found a statistically significant positive relationship existed between science teachers' self-reported implementation of inquiry-based instructional practices and student achievement. The data findings support the conclusion that teachers who have feelings of preparedness to teach science content and implement more inquiry-based instruction and less didactic instruction produce high achieving science students. As science teachers obtain the appropriate knowledge in science content and pedagogy, science teachers will feel prepared and will implement inquiry-based instruction in science classrooms.

  20. Post-operative diffusion weighted imaging as a predictor of posterior fossa syndrome permanence in paediatric medulloblastoma.

    PubMed

    Chua, Felicia H Z; Thien, Ady; Ng, Lee Ping; Seow, Wan Tew; Low, David C Y; Chang, Kenneth T E; Lian, Derrick W Q; Loh, Eva; Low, Sharon Y Y

    2017-03-01

    Posterior fossa syndrome (PFS) is a serious complication faced by neurosurgeons and their patients, especially in paediatric medulloblastoma patients. The uncertain aetiology of PFS, myriad of cited risk factors and therapeutic challenges make this phenomenon an elusive entity. The primary objective of this study was to identify associative factors related to the development of PFS in medulloblastoma patient post-tumour resection. This is a retrospective study based at a single institution. Patient data and all related information were collected from the hospital records, in accordance to a list of possible risk factors associated with PFS. These included pre-operative tumour volume, hydrocephalus, age, gender, extent of resection, metastasis, ventriculoperitoneal shunt insertion, post-operative meningitis and radiological changes in MRI. Additional variables included molecular and histological subtypes of each patient's medulloblastoma tumour. Statistical analysis was employed to determine evidence of each variable's significance in PFS permanence. A total of 19 patients with appropriately complete data was identified. Initial univariate analysis did not show any statistical significance. However, multivariate analysis for MRI-specific changes reported bilateral DWI restricted diffusion changes involving both right and left sides of the surgical cavity was of statistical significance for PFS permanence. The authors performed a clinical study that evaluated possible risk factors for permanent PFS in paediatric medulloblastoma patients. Analysis of collated results found that post-operative DWI restriction in bilateral regions within the surgical cavity demonstrated statistical significance as a predictor of PFS permanence-a novel finding in the current literature.

  1. Preoperative radiation and free flap outcomes for head and neck reconstruction: a systematic review and meta-analysis.

    PubMed

    Herle, Pradyumna; Shukla, Lipi; Morrison, Wayne A; Shayan, Ramin

    2015-03-01

    There is a general consensus among reconstructive surgeons that preoperative radiotherapy is associated with a higher risk of flap failure and complications in head and neck surgery. Opinion is also divided regarding the effects of radiation dose on free flap outcomes and timing of preoperative radiation to minimize adverse outcomes. Our meta-analysis will attempt to address these issues. A systematic review of the literature was conducted in concordance to PRISMA protocol. Data were combined using STATA 12 and Open Meta-Analyst software programmes. Twenty-four studies were included comparing 2842 flaps performed in irradiated fields and 3491 flaps performed in non-irradiated fields. Meta-analysis yielded statistically significant risk ratios for flap failure (RR 1.48, P = 0.004), complications (RR 1.84, P < 0.001), reoperation (RR 2.06, P < 0.001) and fistula (RR 2.05, P < 0.001). Mean radiation dose demonstrated a trend towards increased risk of flap failure, but this was not statistically significant. On subgroup analysis, flaps with >60 Gy radiation had a non-statistically significant higher risk of flap failure (RR 1.61, P = 0.145). Preoperative radiation is associated with a statistically significant increased risk of flap complications, failure and fistula. Preoperative radiation in excess of 60 Gy after radiotherapy represents a potential risk factor for increased flap loss and should be avoided where possible. © 2014 Royal Australasian College of Surgeons.

  2. Toward statistical modeling of saccadic eye-movement and visual saliency.

    PubMed

    Sun, Xiaoshuai; Yao, Hongxun; Ji, Rongrong; Liu, Xian-Ming

    2014-11-01

    In this paper, we present a unified statistical framework for modeling both saccadic eye movements and visual saliency. By analyzing the statistical properties of human eye fixations on natural images, we found that human attention is sparsely distributed and usually deployed to locations with abundant structural information. This observations inspired us to model saccadic behavior and visual saliency based on super-Gaussian component (SGC) analysis. Our model sequentially obtains SGC using projection pursuit, and generates eye movements by selecting the location with maximum SGC response. Besides human saccadic behavior simulation, we also demonstrated our superior effectiveness and robustness over state-of-the-arts by carrying out dense experiments on synthetic patterns and human eye fixation benchmarks. Multiple key issues in saliency modeling research, such as individual differences, the effects of scale and blur, are explored in this paper. Based on extensive qualitative and quantitative experimental results, we show promising potentials of statistical approaches for human behavior research.

  3. Bayesian models based on test statistics for multiple hypothesis testing problems.

    PubMed

    Ji, Yuan; Lu, Yiling; Mills, Gordon B

    2008-04-01

    We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.

  4. Steganalysis based on reducing the differences of image statistical characteristics

    NASA Astrophysics Data System (ADS)

    Wang, Ran; Niu, Shaozhang; Ping, Xijian; Zhang, Tao

    2018-04-01

    Compared with the process of embedding, the image contents make a more significant impact on the differences of image statistical characteristics. This makes the image steganalysis to be a classification problem with bigger withinclass scatter distances and smaller between-class scatter distances. As a result, the steganalysis features will be inseparate caused by the differences of image statistical characteristics. In this paper, a new steganalysis framework which can reduce the differences of image statistical characteristics caused by various content and processing methods is proposed. The given images are segmented to several sub-images according to the texture complexity. Steganalysis features are separately extracted from each subset with the same or close texture complexity to build a classifier. The final steganalysis result is figured out through a weighted fusing process. The theoretical analysis and experimental results can demonstrate the validity of the framework.

  5. Validating Coherence Measurements Using Aligned and Unaligned Coherence Functions

    NASA Technical Reports Server (NTRS)

    Miles, Jeffrey Hilton

    2006-01-01

    This paper describes a novel approach based on the use of coherence functions and statistical theory for sensor validation in a harsh environment. By the use of aligned and unaligned coherence functions and statistical theory one can test for sensor degradation, total sensor failure or changes in the signal. This advanced diagnostic approach and the novel data processing methodology discussed provides a single number that conveys this information. This number as calculated with standard statistical procedures for comparing the means of two distributions is compared with results obtained using Yuen's robust statistical method to create confidence intervals. Examination of experimental data from Kulite pressure transducers mounted in a Pratt & Whitney PW4098 combustor using spectrum analysis methods on aligned and unaligned time histories has verified the effectiveness of the proposed method. All the procedures produce good results which demonstrates how robust the technique is.

  6. An astronomer's guide to period searching

    NASA Astrophysics Data System (ADS)

    Schwarzenberg-Czerny, A.

    2003-03-01

    We concentrate on analysis of unevenly sampled time series, interrupted by periodic gaps, as often encountered in astronomy. While some of our conclusions may appear surprising, all are based on classical statistical principles of Fisher & successors. Except for discussion of the resolution issues, it is best for the reader to forget temporarily about Fourier transforms and to concentrate on problems of fitting of a time series with a model curve. According to their statistical content we divide the issues into several sections, consisting of: (ii) statistical numerical aspects of model fitting, (iii) evaluation of fitted models as hypotheses testing, (iv) the role of the orthogonal models in signal detection (v) conditions for equivalence of periodograms (vi) rating sensitivity by test power. An experienced observer working with individual objects would benefit little from formalized statistical approach. However, we demonstrate the usefulness of this approach in evaluation of performance of periodograms and in quantitative design of large variability surveys.

  7. Theoretical analysis of photon statistics on excited number of modes and dopant concentration in Er3+:Ti:LiNbO, waveguide amplifiers

    NASA Astrophysics Data System (ADS)

    Ducariu, A.; Constantin, G. C.; Puscas, N. N.

    2005-08-01

    In the small gain approximation and the unsaturated regime in this paper we report some original results concerning the evaluation of the Fano factor, statistical fluctuation and spontaneous emission factor which characterize the photon statistics on the number of excited modes, dopant concentration and power pumping in the single and double pass Er3+ - doped LiNbO, straight waveguide amplifiers pumped near 1484 nm using erfc, Gaussian and constant profile of the Er3+ ions in LiNbO, crystal. We demonstrated that for 50 mW input pump power the Poisson photon statistics are maintained in the above mentioned amplifiers for concentrations of the Er ions smaller than l026 m-3 and also high gains and low noise figures are achievable. The obtained results can be used for the design of optoelectronic integrated circuits.

  8. An empirical comparison of statistical tests for assessing the proportional hazards assumption of Cox's model.

    PubMed

    Ng'andu, N H

    1997-03-30

    In the analysis of survival data using the Cox proportional hazard (PH) model, it is important to verify that the explanatory variables analysed satisfy the proportional hazard assumption of the model. This paper presents results of a simulation study that compares five test statistics to check the proportional hazard assumption of Cox's model. The test statistics were evaluated under proportional hazards and the following types of departures from the proportional hazard assumption: increasing relative hazards; decreasing relative hazards; crossing hazards; diverging hazards, and non-monotonic hazards. The test statistics compared include those based on partitioning of failure time and those that do not require partitioning of failure time. The simulation results demonstrate that the time-dependent covariate test, the weighted residuals score test and the linear correlation test have equally good power for detection of non-proportionality in the varieties of non-proportional hazards studied. Using illustrative data from the literature, these test statistics performed similarly.

  9. The Problem of Auto-Correlation in Parasitology

    PubMed Central

    Pollitt, Laura C.; Reece, Sarah E.; Mideo, Nicole; Nussey, Daniel H.; Colegrave, Nick

    2012-01-01

    Explaining the contribution of host and pathogen factors in driving infection dynamics is a major ambition in parasitology. There is increasing recognition that analyses based on single summary measures of an infection (e.g., peak parasitaemia) do not adequately capture infection dynamics and so, the appropriate use of statistical techniques to analyse dynamics is necessary to understand infections and, ultimately, control parasites. However, the complexities of within-host environments mean that tracking and analysing pathogen dynamics within infections and among hosts poses considerable statistical challenges. Simple statistical models make assumptions that will rarely be satisfied in data collected on host and parasite parameters. In particular, model residuals (unexplained variance in the data) should not be correlated in time or space. Here we demonstrate how failure to account for such correlations can result in incorrect biological inference from statistical analysis. We then show how mixed effects models can be used as a powerful tool to analyse such repeated measures data in the hope that this will encourage better statistical practices in parasitology. PMID:22511865

  10. Business intelligence from social media: a study from the VAST Box Office Challenge.

    PubMed

    Lu, Yafeng; Wang, Feng; Maciejewski, Ross

    2014-01-01

    With over 16 million tweets per hour, 600 new blog posts per minute, and 400 million active users on Facebook, businesses have begun searching for ways to turn real-time consumer-based posts into actionable intelligence. The goal is to extract information from this noisy, unstructured data and use it for trend analysis and prediction. Current practices support the idea that visual analytics (VA) can help enable the effective analysis of such data. However, empirical evidence demonstrating the effectiveness of a VA solution is still lacking. A proposed VA toolkit extracts data from Bitly and Twitter to predict movie revenue and ratings. Results from the 2013 VAST Box Office Challenge demonstrate the benefit of an interactive environment for predictive analysis, compared to a purely statistical modeling approach. The VA approach used by the toolkit is generalizable to other domains involving social media data, such as sales forecasting and advertisement analysis.

  11. Mild Memory Impairment in Healthy Older Adults Is Distinct from Normal Aging

    ERIC Educational Resources Information Center

    Cargin, J. Weaver; Maruff, P.; Collie, A.; Masters, C.

    2006-01-01

    Mild memory impairment was detected in 28% of a sample of healthy community-dwelling older adults using the delayed recall trial of a word list learning task. Statistical analysis revealed that individuals with memory impairment also demonstrated relative deficits on other measures of memory, and tests of executive function, processing speed and…

  12. A Laboratory Exercise for Ecology Teaching: The Use of Photographs in Detecting Dispersion Patterns in Animals

    ERIC Educational Resources Information Center

    Lenton, G. M.

    1975-01-01

    Photographs of a beetle, Catamerus rugosus, were taken at different stages in its life cycle. Students were asked to relate these to real life and carry out a statistical analysis to determine the degree of dispersion of animals. Results demonstrate a change in dispersion throughout the life cycle. (Author/EB)

  13. Comparison of Unidimensional and Multidimensional Approaches to IRT Parameter Estimation. Research Report. ETS RR-04-44

    ERIC Educational Resources Information Center

    Zhang, Jinming

    2004-01-01

    It is common to assume during statistical analysis of a multiscale assessment that the assessment has simple structure or that it is composed of several unidimensional subtests. Under this assumption, both the unidimensional and multidimensional approaches can be used to estimate item parameters. This paper theoretically demonstrates that these…

  14. [Occupational risk according to occupational traumatism parameters in Russia].

    PubMed

    Tikhonova, G I; Churanova, A N; Gorchakova, T Iu

    2012-01-01

    The article covers analysis of occupational traumatism in Russia over 2009 in concern with economic activity types, with small enterprises accent. Based on method adapted to national information sources and assessing statistics reliability in countries with imperfect accounting, the authors demonstrated that with various hypotheses occupational accidents risk in Russian Federation is considerably higher than the registered one.

  15. Impact of in-woods product merchandizing on profitable logging opportunities in southern upland hardwood forests

    Treesearch

    Dennis M. May; Chris B. LeDoux; John B. Tansey; Richard Widmann

    1994-01-01

    Procedures developed to assess available timber supplies from upland hardwood forest statistics reported by the U.S. Department of Agriculture, Forest Service, Forest Inventory and Analysis (FIA) units, were modified to demonstrate the impact of three in-woods product-merchandizing options on profitable logging opportunities in upland hardwood forests in 14 Southern...

  16. Cognitive-Developmental Hierarchies: A Search for Structure Using Item-Level Data.

    ERIC Educational Resources Information Center

    Martinez, Michael E.; Simpson, R. Scott

    Item-level statistics from ability and achievement tests have been underutilized as sources of data for building models of cognitive development. How item data can be used to build a cognitive-developmental map of proportional reasoning is demonstrated. The product of the analysis is a cognitive hierarchy with levels corresponding to categories of…

  17. An Asynchronous Many-Task Implementation of In-Situ Statistical Analysis using Legion.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2015-11-01

    In this report, we propose a framework for the design and implementation of in-situ analy- ses using an asynchronous many-task (AMT) model, using the Legion programming model together with the MiniAero mini-application as a surrogate for full-scale parallel scientific computing applications. The bulk of this work consists of converting the Learn/Derive/Assess model which we had initially developed for parallel statistical analysis using MPI [PTBM11], from a SPMD to an AMT model. In this goal, we propose an original use of the concept of Legion logical regions as a replacement for the parallel communication schemes used for the only operation ofmore » the statistics engines that require explicit communication. We then evaluate this proposed scheme in a shared memory environment, using the Legion port of MiniAero as a proxy for a full-scale scientific application, as a means to provide input data sets of variable size for the in-situ statistical analyses in an AMT context. We demonstrate in particular that the approach has merit, and warrants further investigation, in collaboration with ongoing efforts to improve the overall parallel performance of the Legion system.« less

  18. Scaling Laws in Canopy Flows: A Wind-Tunnel Analysis

    NASA Astrophysics Data System (ADS)

    Segalini, Antonio; Fransson, Jens H. M.; Alfredsson, P. Henrik

    2013-08-01

    An analysis of velocity statistics and spectra measured above a wind-tunnel forest model is reported. Several measurement stations downstream of the forest edge have been investigated and it is observed that, while the mean velocity profile adjusts quickly to the new canopy boundary condition, the turbulence lags behind and shows a continuous penetration towards the free stream along the canopy model. The statistical profiles illustrate this growth and do not collapse when plotted as a function of the vertical coordinate. However, when the statistics are plotted as function of the local mean velocity (normalized with a characteristic velocity scale), they do collapse, independently of the streamwise position and freestream velocity. A new scaling for the spectra of all three velocity components is proposed based on the velocity variance and integral time scale. This normalization improves the collapse of the spectra compared to existing scalings adopted in atmospheric measurements, and allows the determination of a universal function that provides the velocity spectrum. Furthermore, a comparison of the proposed scaling laws for two different canopy densities is shown, demonstrating that the vertical velocity variance is the most sensible statistical quantity to the characteristics of the canopy roughness.

  19. Digital Correlation Microwave Polarimetry: Analysis and Demonstration

    NASA Technical Reports Server (NTRS)

    Piepmeier, J. R.; Gasiewski, A. J.; Krebs, Carolyn A. (Technical Monitor)

    2000-01-01

    The design, analysis, and demonstration of a digital-correlation microwave polarimeter for use in earth remote sensing is presented. We begin with an analysis of three-level digital correlation and develop the correlator transfer function and radiometric sensitivity. A fifth-order polynomial regression is derived for inverting the digital correlation coefficient into the analog statistic. In addition, the effects of quantizer threshold asymmetry and hysteresis are discussed. A two-look unpolarized calibration scheme is developed for identifying correlation offsets. The developed theory and calibration method are verified using a 10.7 GHz and a 37.0 GHz polarimeter. The polarimeters are based upon 1-GS/s three-level digital correlators and measure the first three Stokes parameters. Through experiment, the radiometric sensitivity is shown to approach the theoretical as derived earlier in the paper and the two-look unpolarized calibration method is successfully compared with results using a polarimetric scheme. Finally, sample data from an aircraft experiment demonstrates that the polarimeter is highly-useful for ocean wind-vector measurement.

  20. Data series embedding and scale invariant statistics.

    PubMed

    Michieli, I; Medved, B; Ristov, S

    2010-06-01

    Data sequences acquired from bio-systems such as human gait data, heart rate interbeat data, or DNA sequences exhibit complex dynamics that is frequently described by a long-memory or power-law decay of autocorrelation function. One way of characterizing that dynamics is through scale invariant statistics or "fractal-like" behavior. For quantifying scale invariant parameters of physiological signals several methods have been proposed. Among them the most common are detrended fluctuation analysis, sample mean variance analyses, power spectral density analysis, R/S analysis, and recently in the realm of the multifractal approach, wavelet analysis. In this paper it is demonstrated that embedding the time series data in the high-dimensional pseudo-phase space reveals scale invariant statistics in the simple fashion. The procedure is applied on different stride interval data sets from human gait measurements time series (Physio-Bank data library). Results show that introduced mapping adequately separates long-memory from random behavior. Smaller gait data sets were analyzed and scale-free trends for limited scale intervals were successfully detected. The method was verified on artificially produced time series with known scaling behavior and with the varying content of noise. The possibility for the method to falsely detect long-range dependence in the artificially generated short range dependence series was investigated. (c) 2009 Elsevier B.V. All rights reserved.

  1. A large-scale perspective on stress-induced alterations in resting-state networks

    NASA Astrophysics Data System (ADS)

    Maron-Katz, Adi; Vaisvaser, Sharon; Lin, Tamar; Hendler, Talma; Shamir, Ron

    2016-02-01

    Stress is known to induce large-scale neural modulations. However, its neural effect once the stressor is removed and how it relates to subjective experience are not fully understood. Here we used a statistically sound data-driven approach to investigate alterations in large-scale resting-state functional connectivity (rsFC) induced by acute social stress. We compared rsfMRI profiles of 57 healthy male subjects before and after stress induction. Using a parcellation-based univariate statistical analysis, we identified a large-scale rsFC change, involving 490 parcel-pairs. Aiming to characterize this change, we employed statistical enrichment analysis, identifying anatomic structures that were significantly interconnected by these pairs. This analysis revealed strengthening of thalamo-cortical connectivity and weakening of cross-hemispheral parieto-temporal connectivity. These alterations were further found to be associated with change in subjective stress reports. Integrating report-based information on stress sustainment 20 minutes post induction, revealed a single significant rsFC change between the right amygdala and the precuneus, which inversely correlated with the level of subjective recovery. Our study demonstrates the value of enrichment analysis for exploring large-scale network reorganization patterns, and provides new insight on stress-induced neural modulations and their relation to subjective experience.

  2. Surgical Treatment for Discogenic Low-Back Pain: Lumbar Arthroplasty Results in Superior Pain Reduction and Disability Level Improvement Compared With Lumbar Fusion

    PubMed Central

    2007-01-01

    Background The US Food and Drug Administration approved the Charité artificial disc on October 26, 2004. This approval was based on an extensive analysis and review process; 20 years of disc usage worldwide; and the results of a prospective, randomized, controlled clinical trial that compared lumbar artificial disc replacement to fusion. The results of the investigational device exemption (IDE) study led to a conclusion that clinical outcomes following lumbar arthroplasty were at least as good as outcomes from fusion. Methods The author performed a new analysis of the Visual Analog Scale pain scores and the Oswestry Disability Index scores from the Charité artificial disc IDE study and used a nonparametric statistical test, because observed data distributions were not normal. The analysis included all of the enrolled subjects in both the nonrandomized and randomized phases of the study. Results Subjects from both the treatment and control groups improved from the baseline situation (P < .001) at all follow-up times (6 weeks to 24 months). Additionally, these pain and disability levels with artificial disc replacement were superior (P < .05) to the fusion treatment at all follow-up times including 2 years. Conclusions The a priori statistical plan for an IDE study may not adequately address the final distribution of the data. Therefore, statistical analyses more appropriate to the distribution may be necessary to develop meaningful statistical conclusions from the study. A nonparametric statistical analysis of the Charité artificial disc IDE outcomes scores demonstrates superiority for lumbar arthroplasty versus fusion at all follow-up time points to 24 months. PMID:25802574

  3. Coordinate based random effect size meta-analysis of neuroimaging studies.

    PubMed

    Tench, C R; Tanasescu, Radu; Constantinescu, C S; Auer, D P; Cottam, W J

    2017-06-01

    Low power in neuroimaging studies can make them difficult to interpret, and Coordinate based meta-analysis (CBMA) may go some way to mitigating this issue. CBMA has been used in many analyses to detect where published functional MRI or voxel-based morphometry studies testing similar hypotheses report significant summary results (coordinates) consistently. Only the reported coordinates and possibly t statistics are analysed, and statistical significance of clusters is determined by coordinate density. Here a method of performing coordinate based random effect size meta-analysis and meta-regression is introduced. The algorithm (ClusterZ) analyses both coordinates and reported t statistic or Z score, standardised by the number of subjects. Statistical significance is determined not by coordinate density, but by a random effects meta-analyses of reported effects performed cluster-wise using standard statistical methods and taking account of censoring inherent in the published summary results. Type 1 error control is achieved using the false cluster discovery rate (FCDR), which is based on the false discovery rate. This controls both the family wise error rate under the null hypothesis that coordinates are randomly drawn from a standard stereotaxic space, and the proportion of significant clusters that are expected under the null. Such control is necessary to avoid propagating and even amplifying the very issues motivating the meta-analysis in the first place. ClusterZ is demonstrated on both numerically simulated data and on real data from reports of grey matter loss in multiple sclerosis (MS) and syndromes suggestive of MS, and of painful stimulus in healthy controls. The software implementation is available to download and use freely. Copyright © 2017 Elsevier Inc. All rights reserved.

  4. Algorithm for computing descriptive statistics for very large data sets and the exa-scale era

    NASA Astrophysics Data System (ADS)

    Beekman, Izaak

    2017-11-01

    An algorithm for Single-point, Parallel, Online, Converging Statistics (SPOCS) is presented. It is suited for in situ analysis that traditionally would be relegated to post-processing, and can be used to monitor the statistical convergence and estimate the error/residual in the quantity-useful for uncertainty quantification too. Today, data may be generated at an overwhelming rate by numerical simulations and proliferating sensing apparatuses in experiments and engineering applications. Monitoring descriptive statistics in real time lets costly computations and experiments be gracefully aborted if an error has occurred, and monitoring the level of statistical convergence allows them to be run for the shortest amount of time required to obtain good results. This algorithm extends work by Pébay (Sandia Report SAND2008-6212). Pébay's algorithms are recast into a converging delta formulation, with provably favorable properties. The mean, variance, covariances and arbitrary higher order statistical moments are computed in one pass. The algorithm is tested using Sillero, Jiménez, & Moser's (2013, 2014) publicly available UPM high Reynolds number turbulent boundary layer data set, demonstrating numerical robustness, efficiency and other favorable properties.

  5. The same analysis approach: Practical protection against the pitfalls of novel neuroimaging analysis methods.

    PubMed

    Görgen, Kai; Hebart, Martin N; Allefeld, Carsten; Haynes, John-Dylan

    2017-12-27

    Standard neuroimaging data analysis based on traditional principles of experimental design, modelling, and statistical inference is increasingly complemented by novel analysis methods, driven e.g. by machine learning methods. While these novel approaches provide new insights into neuroimaging data, they often have unexpected properties, generating a growing literature on possible pitfalls. We propose to meet this challenge by adopting a habit of systematic testing of experimental design, analysis procedures, and statistical inference. Specifically, we suggest to apply the analysis method used for experimental data also to aspects of the experimental design, simulated confounds, simulated null data, and control data. We stress the importance of keeping the analysis method the same in main and test analyses, because only this way possible confounds and unexpected properties can be reliably detected and avoided. We describe and discuss this Same Analysis Approach in detail, and demonstrate it in two worked examples using multivariate decoding. With these examples, we reveal two sources of error: A mismatch between counterbalancing (crossover designs) and cross-validation which leads to systematic below-chance accuracies, and linear decoding of a nonlinear effect, a difference in variance. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Identifying biologically relevant differences between metagenomic communities.

    PubMed

    Parks, Donovan H; Beiko, Robert G

    2010-03-15

    Metagenomics is the study of genetic material recovered directly from environmental samples. Taxonomic and functional differences between metagenomic samples can highlight the influence of ecological factors on patterns of microbial life in a wide range of habitats. Statistical hypothesis tests can help us distinguish ecological influences from sampling artifacts, but knowledge of only the P-value from a statistical hypothesis test is insufficient to make inferences about biological relevance. Current reporting practices for pairwise comparative metagenomics are inadequate, and better tools are needed for comparative metagenomic analysis. We have developed a new software package, STAMP, for comparative metagenomics that supports best practices in analysis and reporting. Examination of a pair of iron mine metagenomes demonstrates that deeper biological insights can be gained using statistical techniques available in our software. An analysis of the functional potential of 'Candidatus Accumulibacter phosphatis' in two enhanced biological phosphorus removal metagenomes identified several subsystems that differ between the A.phosphatis stains in these related communities, including phosphate metabolism, secretion and metal transport. Python source code and binaries are freely available from our website at http://kiwi.cs.dal.ca/Software/STAMP CONTACT: beiko@cs.dal.ca Supplementary data are available at Bioinformatics online.

  7. Improved Test Planning and Analysis Through the Use of Advanced Statistical Methods

    NASA Technical Reports Server (NTRS)

    Green, Lawrence L.; Maxwell, Katherine A.; Glass, David E.; Vaughn, Wallace L.; Barger, Weston; Cook, Mylan

    2016-01-01

    The goal of this work is, through computational simulations, to provide statistically-based evidence to convince the testing community that a distributed testing approach is superior to a clustered testing approach for most situations. For clustered testing, numerous, repeated test points are acquired at a limited number of test conditions. For distributed testing, only one or a few test points are requested at many different conditions. The statistical techniques of Analysis of Variance (ANOVA), Design of Experiments (DOE) and Response Surface Methods (RSM) are applied to enable distributed test planning, data analysis and test augmentation. The D-Optimal class of DOE is used to plan an optimally efficient single- and multi-factor test. The resulting simulated test data are analyzed via ANOVA and a parametric model is constructed using RSM. Finally, ANOVA can be used to plan a second round of testing to augment the existing data set with new data points. The use of these techniques is demonstrated through several illustrative examples. To date, many thousands of comparisons have been performed and the results strongly support the conclusion that the distributed testing approach outperforms the clustered testing approach.

  8. Compositional data analysis for physical activity, sedentary time and sleep research.

    PubMed

    Dumuid, Dorothea; Stanford, Tyman E; Martin-Fernández, Josep-Antoni; Pedišić, Željko; Maher, Carol A; Lewis, Lucy K; Hron, Karel; Katzmarzyk, Peter T; Chaput, Jean-Philippe; Fogelholm, Mikael; Hu, Gang; Lambert, Estelle V; Maia, José; Sarmiento, Olga L; Standage, Martyn; Barreira, Tiago V; Broyles, Stephanie T; Tudor-Locke, Catrine; Tremblay, Mark S; Olds, Timothy

    2017-01-01

    The health effects of daily activity behaviours (physical activity, sedentary time and sleep) are widely studied. While previous research has largely examined activity behaviours in isolation, recent studies have adjusted for multiple behaviours. However, the inclusion of all activity behaviours in traditional multivariate analyses has not been possible due to the perfect multicollinearity of 24-h time budget data. The ensuing lack of adjustment for known effects on the outcome undermines the validity of study findings. We describe a statistical approach that enables the inclusion of all daily activity behaviours, based on the principles of compositional data analysis. Using data from the International Study of Childhood Obesity, Lifestyle and the Environment, we demonstrate the application of compositional multiple linear regression to estimate adiposity from children's daily activity behaviours expressed as isometric log-ratio coordinates. We present a novel method for predicting change in a continuous outcome based on relative changes within a composition, and for calculating associated confidence intervals to allow for statistical inference. The compositional data analysis presented overcomes the lack of adjustment that has plagued traditional statistical methods in the field, and provides robust and reliable insights into the health effects of daily activity behaviours.

  9. An On-Demand Optical Quantum Random Number Generator with In-Future Action and Ultra-Fast Response

    PubMed Central

    Stipčević, Mario; Ursin, Rupert

    2015-01-01

    Random numbers are essential for our modern information based society e.g. in cryptography. Unlike frequently used pseudo-random generators, physical random number generators do not depend on complex algorithms but rather on a physicsal process to provide true randomness. Quantum random number generators (QRNG) do rely on a process, wich can be described by a probabilistic theory only, even in principle. Here we present a conceptualy simple implementation, which offers a 100% efficiency of producing a random bit upon a request and simultaneously exhibits an ultra low latency. A careful technical and statistical analysis demonstrates its robustness against imperfections of the actual implemented technology and enables to quickly estimate randomness of very long sequences. Generated random numbers pass standard statistical tests without any post-processing. The setup described, as well as the theory presented here, demonstrate the maturity and overall understanding of the technology. PMID:26057576

  10. Profile of intimate partner violence in Family Health Units.

    PubMed

    Rafael, Ricardo de Mattos Russo; Moura, Anna Tereza Miranda Soares de; Tavares, Jeane Marques Cunha; Ferreira, Renata Evelin Moreno; Camilo, Glauce Gomes da Silva; Neto, Mercedes

    2017-01-01

    To estimate the profile of intimate partner violence involving women in a scenario of Family Health Strategy in the municipality of Nova Iguaçu (Rio de Janeiro). A transversal study was conducted in four units with a sample of 640 women between the ages of 25 to 64. The phenomena of violence was determined using the tool Revised Conflict Tactics Scales, validated for Brazil. Statistical analysis took into consideration an estimation of prevalence in the calculation of the p values. The situations of violence and the sociodemographic profiles demonstrated a statistically significant relationship with the variables of educational level and housing conditions. Age, ethnicity and economic class demonstrated an association with certain types of violence, varying in type and severity. The study investigated the profile of these situations of violence and enabled reflection regarding the approaches adopted by the Family Health Strategy teams.

  11. Gene- and pathway-based association tests for multiple traits with GWAS summary statistics.

    PubMed

    Kwak, Il-Youp; Pan, Wei

    2017-01-01

    To identify novel genetic variants associated with complex traits and to shed new insights on underlying biology, in addition to the most popular single SNP-single trait association analysis, it would be useful to explore multiple correlated (intermediate) traits at the gene- or pathway-level by mining existing single GWAS or meta-analyzed GWAS data. For this purpose, we present an adaptive gene-based test and a pathway-based test for association analysis of multiple traits with GWAS summary statistics. The proposed tests are adaptive at both the SNP- and trait-levels; that is, they account for possibly varying association patterns (e.g. signal sparsity levels) across SNPs and traits, thus maintaining high power across a wide range of situations. Furthermore, the proposed methods are general: they can be applied to mixed types of traits, and to Z-statistics or P-values as summary statistics obtained from either a single GWAS or a meta-analysis of multiple GWAS. Our numerical studies with simulated and real data demonstrated the promising performance of the proposed methods. The methods are implemented in R package aSPU, freely and publicly available at: https://cran.r-project.org/web/packages/aSPU/ CONTACT: weip@biostat.umn.eduSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  12. SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.

    PubMed

    Chu, Annie; Cui, Jenny; Dinov, Ivo D

    2009-03-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test.The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website.In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most updated information and newly added models.

  13. Proper interpretation of chronic toxicity studies and their statistics: A critique of "Which level of evidence does the US National Toxicology Program provide? Statistical considerations using the Technical Report 578 on Ginkgo biloba as an example".

    PubMed

    Kissling, Grace E; Haseman, Joseph K; Zeiger, Errol

    2015-09-02

    A recent article by Gaus (2014) demonstrates a serious misunderstanding of the NTP's statistical analysis and interpretation of rodent carcinogenicity data as reported in Technical Report 578 (Ginkgo biloba) (NTP, 2013), as well as a failure to acknowledge the abundant literature on false positive rates in rodent carcinogenicity studies. The NTP reported Ginkgo biloba extract to be carcinogenic in mice and rats. Gaus claims that, in this study, 4800 statistical comparisons were possible, and that 209 of them were statistically significant (p<0.05) compared with 240 (4800×0.05) expected by chance alone; thus, the carcinogenicity of Ginkgo biloba extract cannot be definitively established. However, his assumptions and calculations are flawed since he incorrectly assumes that the NTP uses no correction for multiple comparisons, and that significance tests for discrete data operate at exactly the nominal level. He also misrepresents the NTP's decision making process, overstates the number of statistical comparisons made, and ignores the fact that the mouse liver tumor effects were so striking (e.g., p<0.0000000000001) that it is virtually impossible that they could be false positive outcomes. Gaus' conclusion that such obvious responses merely "generate a hypothesis" rather than demonstrate a real carcinogenic effect has no scientific credibility. Moreover, his claims regarding the high frequency of false positive outcomes in carcinogenicity studies are misleading because of his methodological misconceptions and errors. Published by Elsevier Ireland Ltd.

  14. Proper interpretation of chronic toxicity studies and their statistics: A critique of “Which level of evidence does the US National Toxicology Program provide? Statistical considerations using the Technical Report 578 on Ginkgo biloba as an example”

    PubMed Central

    Kissling, Grace E.; Haseman, Joseph K.; Zeiger, Errol

    2014-01-01

    A recent article by Gaus (2014) demonstrates a serious misunderstanding of the NTP’s statistical analysis and interpretation of rodent carcinogenicity data as reported in Technical Report 578 (Ginkgo biloba) (NTP 2013), as well as a failure to acknowledge the abundant literature on false positive rates in rodent carcinogenicity studies. The NTP reported Ginkgo biloba extract to be carcinogenic in mice and rats. Gaus claims that, in this study, 4800 statistical comparisons were possible, and that 209 of them were statistically significant (p<0.05) compared with 240 (4800 × 0.05) expected by chance alone; thus, the carcinogenicity of Ginkgo biloba extract cannot be definitively established. However, his assumptions and calculations are flawed since he incorrectly assumes that the NTP uses no correction for multiple comparisons, and that significance tests for discrete data operate at exactly the nominal level. He also misrepresents the NTP’s decision making process, overstates the number of statistical comparisons made, and ignores that fact that that the mouse liver tumor effects were so striking (e.g., p<0.0000000000001) that it is virtually impossible that they could be false positive outcomes. Gaus’ conclusion that such obvious responses merely “generate a hypothesis” rather than demonstrate a real carcinogenic effect has no scientific credibility. Moreover, his claims regarding the high frequency of false positive outcomes in carcinogenicity studies are misleading because of his methodological misconceptions and errors. PMID:25261588

  15. Depth-Dependent Glycosaminoglycan Concentration in Articular Cartilage by Quantitative Contrast-Enhanced Micro–Computed Tomography

    PubMed Central

    Mittelstaedt, Daniel

    2015-01-01

    Objective A quantitative contrast-enhanced micro–computed tomography (qCECT) method was developed to investigate the depth dependency and heterogeneity of the glycosaminoglycan (GAG) concentration of ex vivo cartilage equilibrated with an anionic radiographic contrast agent, Hexabrix. Design Full-thickness fresh native (n = 19 in 3 subgroups) and trypsin-degraded (n = 6) articular cartilage blocks were imaged using micro–computed tomography (μCT) at high resolution (13.4 μm3) before and after equilibration with various Hexabrix bathing concentrations. The GAG concentration was calculated depth-dependently based on Gibbs-Donnan equilibrium theory. Analysis of variance with Tukey’s post hoc was used to test for statistical significance (P < 0.05) for effect of Hexabrix bathing concentration, and for differences in bulk and zonal GAG concentrations individually and compared between native and trypsin-degraded cartilage. Results The bulk GAG concentration was calculated to be 74.44 ± 6.09 and 11.99 ± 4.24 mg/mL for native and degraded cartilage, respectively. A statistical difference was demonstrated for bulk and zonal GAG between native and degraded cartilage (P < 0.032). A statistical difference was not demonstrated for bulk GAG when comparing Hexabrix bathing concentrations (P > 0.3214) for neither native nor degraded cartilage. Depth-dependent GAG analysis of native cartilage revealed a statistical difference only in the radial zone between 30% and 50% Hexabrix bathing concentrations. Conclusions This nondestructive qCECT methodology calculated the depth-dependent GAG concentration for both native and trypsin-degraded cartilage at high spatial resolution. qCECT allows for more detailed understanding of the topography and depth dependency, which could help diagnose health, degradation, and repair of native and contrived cartilage. PMID:26425259

  16. Demonstrating the Effects of Shop Flow Process Variability on the Air Force Depot Level Reparable Item Pipeline

    DTIC Science & Technology

    1992-09-01

    Crawford found that pipeline contents are extremely variable about their mean (10:24) and Kettner and Wheatley said that "a statistical analysis of data...write the results from this replication "* to the ANOVA files for later analysis . The first set outputs points "* for overall pipeline contents . The...families and friends for their unselfishness and support. Marvin A. Arostegui and Jon A. Larvick ii Table of Contents Page Preface

  17. Computation of statistical secondary structure of nucleic acids.

    PubMed Central

    Yamamoto, K; Kitamura, Y; Yoshikura, H

    1984-01-01

    This paper presents a computer analysis of statistical secondary structure of nucleic acids. For a given single stranded nucleic acid, we generated "structure map" which included all the annealing structures in the sequence. The map was transformed into "energy map" by rough approximation; here, the energy level of every pairing structure consisting of more than 2 successive nucleic acid pairs was calculated. By using the "energy map", the probability of occurrence of each annealed structure was computed, i.e., the structure was computed statistically. The basis of computation was the 8-queen problem in the chess game. The validity of our computer programme was checked by computing tRNA structure which has been well established. Successful application of this programme to small nuclear RNAs of various origins is demonstrated. PMID:6198622

  18. Modeling stock price dynamics by continuum percolation system and relevant complex systems analysis

    NASA Astrophysics Data System (ADS)

    Xiao, Di; Wang, Jun

    2012-10-01

    The continuum percolation system is developed to model a random stock price process in this work. Recent empirical research has demonstrated various statistical features of stock price changes, the financial model aiming at understanding price fluctuations needs to define a mechanism for the formation of the price, in an attempt to reproduce and explain this set of empirical facts. The continuum percolation model is usually referred to as a random coverage process or a Boolean model, the local interaction or influence among traders is constructed by the continuum percolation, and a cluster of continuum percolation is applied to define the cluster of traders sharing the same opinion about the market. We investigate and analyze the statistical behaviors of normalized returns of the price model by some analysis methods, including power-law tail distribution analysis, chaotic behavior analysis and Zipf analysis. Moreover, we consider the daily returns of Shanghai Stock Exchange Composite Index from January 1997 to July 2011, and the comparisons of return behaviors between the actual data and the simulation data are exhibited.

  19. MetaboLyzer: A Novel Statistical Workflow for Analyzing Post-Processed LC/MS Metabolomics Data

    PubMed Central

    Mak, Tytus D.; Laiakis, Evagelia C.; Goudarzi, Maryam; Fornace, Albert J.

    2014-01-01

    Metabolomics, the global study of small molecules in a particular system, has in the last few years risen to become a primary –omics platform for the study of metabolic processes. With the ever-increasing pool of quantitative data yielded from metabolomic research, specialized methods and tools with which to analyze and extract meaningful conclusions from these data are becoming more and more crucial. Furthermore, the depth of knowledge and expertise required to undertake a metabolomics oriented study is a daunting obstacle to investigators new to the field. As such, we have created a new statistical analysis workflow, MetaboLyzer, which aims to both simplify analysis for investigators new to metabolomics, as well as provide experienced investigators the flexibility to conduct sophisticated analysis. MetaboLyzer’s workflow is specifically tailored to the unique characteristics and idiosyncrasies of postprocessed liquid chromatography/mass spectrometry (LC/MS) based metabolomic datasets. It utilizes a wide gamut of statistical tests, procedures, and methodologies that belong to classical biostatistics, as well as several novel statistical techniques that we have developed specifically for metabolomics data. Furthermore, MetaboLyzer conducts rapid putative ion identification and putative biologically relevant analysis via incorporation of four major small molecule databases: KEGG, HMDB, Lipid Maps, and BioCyc. MetaboLyzer incorporates these aspects into a comprehensive workflow that outputs easy to understand statistically significant and potentially biologically relevant information in the form of heatmaps, volcano plots, 3D visualization plots, correlation maps, and metabolic pathway hit histograms. For demonstration purposes, a urine metabolomics data set from a previously reported radiobiology study in which samples were collected from mice exposed to gamma radiation was analyzed. MetaboLyzer was able to identify 243 statistically significant ions out of a total of 1942. Numerous putative metabolites and pathways were found to be biologically significant from the putative ion identification workflow. PMID:24266674

  20. Endogenous Retrovirus Insertion in the KIT Oncogene Determines White and White spotting in Domestic Cats

    PubMed Central

    David, Victor A.; Menotti-Raymond, Marilyn; Wallace, Andrea Coots; Roelke, Melody; Kehler, James; Leighty, Robert; Eizirik, Eduardo; Hannah, Steven S.; Nelson, George; Schäffer, Alejandro A.; Connelly, Catherine J.; O’Brien, Stephen J.; Ryugo, David K.

    2014-01-01

    The Dominant White locus (W) in the domestic cat demonstrates pleiotropic effects exhibiting complete penetrance for absence of coat pigmentation and incomplete penetrance for deafness and iris hypopigmentation. We performed linkage analysis using a pedigree segregating White to identify KIT (Chr. B1) as the feline W locus. Segregation and sequence analysis of the KIT gene in two pedigrees (P1 and P2) revealed the remarkable retrotransposition and evolution of a feline endogenous retrovirus (FERV1) as responsible for two distinct phenotypes of the W locus, Dominant White, and white spotting. A full-length (7125 bp) FERV1 element is associated with white spotting, whereas a FERV1 long terminal repeat (LTR) is associated with all Dominant White individuals. For purposes of statistical analysis, the alternatives of wild-type sequence, FERV1 element, and LTR-only define a triallelic marker. Taking into account pedigree relationships, deafness is genetically linked and associated with this marker; estimated P values for association are in the range of 0.007 to 0.10. The retrotransposition interrupts a DNAase I hypersensitive site in KIT intron 1 that is highly conserved across mammals and was previously demonstrated to regulate temporal and tissue-specific expression of KIT in murine hematopoietic and melanocytic cells. A large-population genetic survey of cats (n = 270), representing 30 cat breeds, supports our findings and demonstrates statistical significance of the FERV1 LTR and full-length element with Dominant White/blue iris (P < 0.0001) and white spotting (P < 0.0001), respectively. PMID:25085922

  1. Endogenous retrovirus insertion in the KIT oncogene determines white and white spotting in domestic cats.

    PubMed

    David, Victor A; Menotti-Raymond, Marilyn; Wallace, Andrea Coots; Roelke, Melody; Kehler, James; Leighty, Robert; Eizirik, Eduardo; Hannah, Steven S; Nelson, George; Schäffer, Alejandro A; Connelly, Catherine J; O'Brien, Stephen J; Ryugo, David K

    2014-08-01

    The Dominant White locus (W) in the domestic cat demonstrates pleiotropic effects exhibiting complete penetrance for absence of coat pigmentation and incomplete penetrance for deafness and iris hypopigmentation. We performed linkage analysis using a pedigree segregating White to identify KIT (Chr. B1) as the feline W locus. Segregation and sequence analysis of the KIT gene in two pedigrees (P1 and P2) revealed the remarkable retrotransposition and evolution of a feline endogenous retrovirus (FERV1) as responsible for two distinct phenotypes of the W locus, Dominant White, and white spotting. A full-length (7125 bp) FERV1 element is associated with white spotting, whereas a FERV1 long terminal repeat (LTR) is associated with all Dominant White individuals. For purposes of statistical analysis, the alternatives of wild-type sequence, FERV1 element, and LTR-only define a triallelic marker. Taking into account pedigree relationships, deafness is genetically linked and associated with this marker; estimated P values for association are in the range of 0.007 to 0.10. The retrotransposition interrupts a DNAase I hypersensitive site in KIT intron 1 that is highly conserved across mammals and was previously demonstrated to regulate temporal and tissue-specific expression of KIT in murine hematopoietic and melanocytic cells. A large-population genetic survey of cats (n = 270), representing 30 cat breeds, supports our findings and demonstrates statistical significance of the FERV1 LTR and full-length element with Dominant White/blue iris (P < 0.0001) and white spotting (P < 0.0001), respectively. Copyright © 2014 David et al.

  2. Assessing the significance of pedobarographic signals using random field theory.

    PubMed

    Pataky, Todd C

    2008-08-07

    Traditional pedobarographic statistical analyses are conducted over discrete regions. Recent studies have demonstrated that regionalization can corrupt pedobarographic field data through conflation when arbitrary dividing lines inappropriately delineate smooth field processes. An alternative is to register images such that homologous structures optimally overlap and then conduct statistical tests at each pixel to generate statistical parametric maps (SPMs). The significance of SPM processes may be assessed within the framework of random field theory (RFT). RFT is ideally suited to pedobarographic image analysis because its fundamental data unit is a lattice sampling of a smooth and continuous spatial field. To correct for the vast number of multiple comparisons inherent in such data, recent pedobarographic studies have employed a Bonferroni correction to retain a constant family-wise error rate. This approach unfortunately neglects the spatial correlation of neighbouring pixels, so provides an overly conservative (albeit valid) statistical threshold. RFT generally relaxes the threshold depending on field smoothness and on the geometry of the search area, but it also provides a framework for assigning p values to suprathreshold clusters based on their spatial extent. The current paper provides an overview of basic RFT concepts and uses simulated and experimental data to validate both RFT-relevant field smoothness estimations and RFT predictions regarding the topological characteristics of random pedobarographic fields. Finally, previously published experimental data are re-analysed using RFT inference procedures to demonstrate how RFT yields easily understandable statistical results that may be incorporated into routine clinical and laboratory analyses.

  3. Quality assurance of temporal variability of natural decay chain and neutron induced background for low-level NORM analysis

    DOE PAGES

    Yoho, Michael; Porterfield, Donivan R.; Landsberger, Sheldon

    2015-09-22

    In this study, twenty-one high purity germanium (HPGe) background spectra were collected over 2 years at Los Alamos National Laboratory. A quality assurance methodology was developed to monitor spectral background levels from thermal and fast neutron flux levels and naturally occurring radioactive material decay series radionuclides. 238U decay products above 222Rn demonstrated minimal temporal variability beyond that expected from counting statistics. 238U and 232Th progeny below Rn gas displayed at most twice the expected variability. Further, an analysis of the 139 keV 74Ge(n, γ) and 691 keV 72Ge(n, n') spectral features demonstrated temporal stability for both thermal and fastmore » neutron fluxes.« less

  4. No-Reference Video Quality Assessment Based on Statistical Analysis in 3D-DCT Domain.

    PubMed

    Li, Xuelong; Guo, Qun; Lu, Xiaoqiang

    2016-05-13

    It is an important task to design models for universal no-reference video quality assessment (NR-VQA) in multiple video processing and computer vision applications. However, most existing NR-VQA metrics are designed for specific distortion types which are not often aware in practical applications. A further deficiency is that the spatial and temporal information of videos is hardly considered simultaneously. In this paper, we propose a new NR-VQA metric based on the spatiotemporal natural video statistics (NVS) in 3D discrete cosine transform (3D-DCT) domain. In the proposed method, a set of features are firstly extracted based on the statistical analysis of 3D-DCT coefficients to characterize the spatiotemporal statistics of videos in different views. These features are used to predict the perceived video quality via the efficient linear support vector regression (SVR) model afterwards. The contributions of this paper are: 1) we explore the spatiotemporal statistics of videos in 3DDCT domain which has the inherent spatiotemporal encoding advantage over other widely used 2D transformations; 2) we extract a small set of simple but effective statistical features for video visual quality prediction; 3) the proposed method is universal for multiple types of distortions and robust to different databases. The proposed method is tested on four widely used video databases. Extensive experimental results demonstrate that the proposed method is competitive with the state-of-art NR-VQA metrics and the top-performing FR-VQA and RR-VQA metrics.

  5. Increasing URM Undergraduate Student Success through Assessment-Driven Interventions: A Multiyear Study Using Freshman-Level General Biology as a Model System.

    PubMed

    Carmichael, Mary C; St Clair, Candace; Edwards, Andrea M; Barrett, Peter; McFerrin, Harris; Davenport, Ian; Awad, Mohamed; Kundu, Anup; Ireland, Shubha Kale

    2016-01-01

    Xavier University of Louisiana leads the nation in awarding BS degrees in the biological sciences to African-American students. In this multiyear study with ∼5500 participants, data-driven interventions were adopted to improve student academic performance in a freshman-level general biology course. The three hour-long exams were common and administered concurrently to all students. New exam questions were developed using Bloom's taxonomy, and exam results were analyzed statistically with validated assessment tools. All but the comprehensive final exam were returned to students for self-evaluation and remediation. Among other approaches, course rigor was monitored by using an identical set of 60 questions on the final exam across 10 semesters. Analysis of the identical sets of 60 final exam questions revealed that overall averages increased from 72.9% (2010) to 83.5% (2015). Regression analysis demonstrated a statistically significant correlation between high-risk students and their averages on the 60 questions. Additional analysis demonstrated statistically significant improvements for at least one letter grade from midterm to final and a 20% increase in the course pass rates over time, also for the high-risk population. These results support the hypothesis that our data-driven interventions and assessment techniques are successful in improving student retention, particularly for our academically at-risk students. © 2016 M. C. Carmichael et al. CBE—Life Sciences Education © 2016 The American Society for Cell Biology. This article is distributed by The American Society for Cell Biology under license from the author(s). It is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).

  6. Imaging of Al/Fe ratios in synthetic Al-goethite revealed by nanoscale secondary ion mass spectrometry.

    PubMed

    Pohl, Lydia; Kölbl, Angelika; Werner, Florian; Mueller, Carsten W; Höschen, Carmen; Häusler, Werner; Kögel-Knabner, Ingrid

    2018-04-30

    Aluminium (Al)-substituted goethite is ubiquitous in soils and sediments. The extent of Al-substitution affects the physicochemical properties of the mineral and influences its macroscale properties. Bulk analysis only provides total Al/Fe ratios without providing information with respect to the Al-substitution of single minerals. Here, we demonstrate that nanoscale secondary ion mass spectrometry (NanoSIMS) enables the precise determination of Al-content in single minerals, while simultaneously visualising the variation of the Al/Fe ratio. Al-substituted goethite samples were synthesized with increasing Al concentrations of 0.1, 3, and 7 % and analysed by NanoSIMS in combination with established bulk spectroscopic methods (XRD, FTIR, Mössbauer spectroscopy). The high spatial resolution (50-150 nm) of NanoSIMS is accompanied by a high number of single-point measurements. We statistically evaluated the Al/Fe ratios derived from NanoSIMS, while maintaining the spatial information and reassigning it to its original localization. XRD analyses confirmed increasing concentration of incorporated Al within the goethite structure. Mössbauer spectroscopy revealed 11 % of the goethite samples generated at high Al concentrations consisted of hematite. The NanoSIMS data show that the Al/Fe ratios are in agreement with bulk data derived from total digestion and demonstrated small spatial variability between single-point measurements. More advantageously, statistical analysis and reassignment of single-point measurements allowed us to identify distinct spots with significantly higher or lower Al/Fe ratios. NanoSIMS measurements confirmed the capacity to produce images, which indicated the uniform increase in Al-concentrations in goethite. Using a combination of statistical analysis with information from complementary spectroscopic techniques (XRD, FTIR and Mössbauer spectroscopy) we were further able to reveal spots with lower Al/Fe ratios as hematite. Copyright © 2018 John Wiley & Sons, Ltd.

  7. Global Sensitivity Analysis of Environmental Systems via Multiple Indices based on Statistical Moments of Model Outputs

    NASA Astrophysics Data System (ADS)

    Guadagnini, A.; Riva, M.; Dell'Oca, A.

    2017-12-01

    We propose to ground sensitivity of uncertain parameters of environmental models on a set of indices based on the main (statistical) moments, i.e., mean, variance, skewness and kurtosis, of the probability density function (pdf) of a target model output. This enables us to perform Global Sensitivity Analysis (GSA) of a model in terms of multiple statistical moments and yields a quantification of the impact of model parameters on features driving the shape of the pdf of model output. Our GSA approach includes the possibility of being coupled with the construction of a reduced complexity model that allows approximating the full model response at a reduced computational cost. We demonstrate our approach through a variety of test cases. These include a commonly used analytical benchmark, a simplified model representing pumping in a coastal aquifer, a laboratory-scale tracer experiment, and the migration of fracturing fluid through a naturally fractured reservoir (source) to reach an overlying formation (target). Our strategy allows discriminating the relative importance of model parameters to the four statistical moments considered. We also provide an appraisal of the error associated with the evaluation of our sensitivity metrics by replacing the original system model through the selected surrogate model. Our results suggest that one might need to construct a surrogate model with increasing level of accuracy depending on the statistical moment considered in the GSA. The methodological framework we propose can assist the development of analysis techniques targeted to model calibration, design of experiment, uncertainty quantification and risk assessment.

  8. When can social media lead financial markets?

    PubMed

    Zheludev, Ilya; Smith, Robert; Aste, Tomaso

    2014-02-27

    Social media analytics is showing promise for the prediction of financial markets. However, the true value of such data for trading is unclear due to a lack of consensus on which instruments can be predicted and how. Current approaches are based on the evaluation of message volumes and are typically assessed via retrospective (ex-post facto) evaluation of trading strategy returns. In this paper, we present instead a sentiment analysis methodology to quantify and statistically validate which assets could qualify for trading from social media analytics in an ex-ante configuration. We use sentiment analysis techniques and Information Theory measures to demonstrate that social media message sentiment can contain statistically-significant ex-ante information on the future prices of the S&P500 index and a limited set of stocks, in excess of what is achievable using solely message volumes.

  9. When Can Social Media Lead Financial Markets?

    NASA Astrophysics Data System (ADS)

    Zheludev, Ilya; Smith, Robert; Aste, Tomaso

    2014-02-01

    Social media analytics is showing promise for the prediction of financial markets. However, the true value of such data for trading is unclear due to a lack of consensus on which instruments can be predicted and how. Current approaches are based on the evaluation of message volumes and are typically assessed via retrospective (ex-post facto) evaluation of trading strategy returns. In this paper, we present instead a sentiment analysis methodology to quantify and statistically validate which assets could qualify for trading from social media analytics in an ex-ante configuration. We use sentiment analysis techniques and Information Theory measures to demonstrate that social media message sentiment can contain statistically-significant ex-ante information on the future prices of the S&P500 index and a limited set of stocks, in excess of what is achievable using solely message volumes.

  10. When Can Social Media Lead Financial Markets?

    PubMed Central

    Zheludev, Ilya; Smith, Robert; Aste, Tomaso

    2014-01-01

    Social media analytics is showing promise for the prediction of financial markets. However, the true value of such data for trading is unclear due to a lack of consensus on which instruments can be predicted and how. Current approaches are based on the evaluation of message volumes and are typically assessed via retrospective (ex-post facto) evaluation of trading strategy returns. In this paper, we present instead a sentiment analysis methodology to quantify and statistically validate which assets could qualify for trading from social media analytics in an ex-ante configuration. We use sentiment analysis techniques and Information Theory measures to demonstrate that social media message sentiment can contain statistically-significant ex-ante information on the future prices of the S&P500 index and a limited set of stocks, in excess of what is achievable using solely message volumes. PMID:24572909

  11. Statistical physics in foreign exchange currency and stock markets

    NASA Astrophysics Data System (ADS)

    Ausloos, M.

    2000-09-01

    Problems in economy and finance have attracted the interest of statistical physicists all over the world. Fundamental problems pertain to the existence or not of long-, medium- or/and short-range power-law correlations in various economic systems, to the presence of financial cycles and on economic considerations, including economic policy. A method like the detrended fluctuation analysis is recalled emphasizing its value in sorting out correlation ranges, thereby leading to predictability at short horizon. The ( m, k)-Zipf method is presented for sorting out short-range correlations in the sign and amplitude of the fluctuations. A well-known financial analysis technique, the so-called moving average, is shown to raise questions to physicists about fractional Brownian motion properties. Among spectacular results, the possibility of crash predictions has been demonstrated through the log-periodicity of financial index oscillations.

  12. Studies of oceanic tectonics based on GEOS-3 satellite altimetry

    NASA Technical Reports Server (NTRS)

    Poehls, K. A.; Kaula, W. M.; Schubert, G.; Sandwell, D.

    1979-01-01

    Using statistical analysis, geoidal admittance (the relationship between the ocean geoid and seafloor topography) obtained from GEOS-3 altimetry was compared to various model admittances. Analysis of several altimetry tracks in the Pacific Ocean demonstrated a low coherence between altimetry and seafloor topography except where the track crosses active or recent tectonic features. However, global statistical studies using the much larger data base of all available gravimetry showed a positive correlation of oceanic gravity with topography. The oceanic lithosphere was modeled by simultaneously inverting surface wave dispersion, topography, and gravity data. Efforts to incorporate geoid data into the inversion showed that the base of the subchannel can be better resolved with geoid rather than gravity data. Thermomechanical models of seafloor spreading taking into account differing plate velocities, heat source distributions, and rock rheologies were discussed.

  13. Constraining the mass–richness relationship of redMaPPer clusters with angular clustering

    DOE PAGES

    Baxter, Eric J.; Rozo, Eduardo; Jain, Bhuvnesh; ...

    2016-08-04

    The potential of using cluster clustering for calibrating the mass–richness relation of galaxy clusters has been recognized theoretically for over a decade. In this paper, we demonstrate the feasibility of this technique to achieve high-precision mass calibration using redMaPPer clusters in the Sloan Digital Sky Survey North Galactic Cap. By including cross-correlations between several richness bins in our analysis, we significantly improve the statistical precision of our mass constraints. The amplitude of the mass–richness relation is constrained to 7 per cent statistical precision by our analysis. However, the error budget is systematics dominated, reaching a 19 per cent total errormore » that is dominated by theoretical uncertainty in the bias–mass relation for dark matter haloes. We confirm the result from Miyatake et al. that the clustering amplitude of redMaPPer clusters depends on galaxy concentration as defined therein, and we provide additional evidence that this dependence cannot be sourced by mass dependences: some other effect must account for the observed variation in clustering amplitude with galaxy concentration. Assuming that the observed dependence of redMaPPer clustering on galaxy concentration is a form of assembly bias, we find that such effects introduce a systematic error on the amplitude of the mass–richness relation that is comparable to the error bar from statistical noise. Finally, the results presented here demonstrate the power of cluster clustering for mass calibration and cosmology provided the current theoretical systematics can be ameliorated.« less

  14. Parametric and experimental analysis using a power flow approach

    NASA Technical Reports Server (NTRS)

    Cuschieri, J. M.

    1988-01-01

    Having defined and developed a structural power flow approach for the analysis of structure-borne transmission of structural vibrations, the technique is used to perform an analysis of the influence of structural parameters on the transmitted energy. As a base for comparison, the parametric analysis is first performed using a Statistical Energy Analysis approach and the results compared with those obtained using the power flow approach. The advantages of using structural power flow are thus demonstrated by comparing the type of results obtained by the two methods. Additionally, to demonstrate the advantages of using the power flow method and to show that the power flow results represent a direct physical parameter that can be measured on a typical structure, an experimental investigation of structural power flow is also presented. Results are presented for an L-shaped beam for which an analytical solution has already been obtained. Furthermore, the various methods available to measure vibrational power flow are compared to investigate the advantages and disadvantages of each method.

  15. [The characteristic of stomatologic service manpower in Omsk region].

    PubMed

    Prokop'ev, K A; Ravdugina, T G

    2013-01-01

    The article considers the network and manpower characteristics of stomatologic service in Omsk oblast during 2006-3011. The results are presented concerning statistical analysis of dynamics of manpower size and staffing of specialists in stomatologic rooms (departments) in state (budget) health institutions. The differences are demonstrated concerning the accessibility of stomatologic care to residents of city and rural regions of oblast.

  16. 1H NMR-based metabolic profiling for evaluating poppy seed rancidity and brewing.

    PubMed

    Jawień, Ewa; Ząbek, Adam; Deja, Stanisław; Łukaszewicz, Marcin; Młynarz, Piotr

    2015-12-01

    Poppy seeds are widely used in household and commercial confectionery. The aim of this study was to demonstrate the application of metabolic profiling for industrial monitoring of the molecular changes which occur during minced poppy seed rancidity and brewing processes performed on raw seeds. Both forms of poppy seeds were obtained from a confectionery company. Proton nuclear magnetic resonance (1H NMR) was applied as the analytical method of choice together with multivariate statistical data analysis. Metabolic fingerprinting was applied as a bioprocess control tool to monitor rancidity with the trajectory of change and brewing progressions. Low molecular weight compounds were found to be statistically significant biomarkers of these bioprocesses. Changes in concentrations of chemical compounds were explained relative to the biochemical processes and external conditions. The obtained results provide valuable and comprehensive information to gain a better understanding of the biology of rancidity and brewing processes, while demonstrating the potential for applying NMR spectroscopy combined with multivariate data analysis tools for quality control in food industries involved in the processing of oilseeds. This precious and versatile information gives a better understanding of the biology of these processes.

  17. Visualizing statistical significance of disease clusters using cartograms.

    PubMed

    Kronenfeld, Barry J; Wong, David W S

    2017-05-15

    Health officials and epidemiological researchers often use maps of disease rates to identify potential disease clusters. Because these maps exaggerate the prominence of low-density districts and hide potential clusters in urban (high-density) areas, many researchers have used density-equalizing maps (cartograms) as a basis for epidemiological mapping. However, we do not have existing guidelines for visual assessment of statistical uncertainty. To address this shortcoming, we develop techniques for visual determination of statistical significance of clusters spanning one or more districts on a cartogram. We developed the techniques within a geovisual analytics framework that does not rely on automated significance testing, and can therefore facilitate visual analysis to detect clusters that automated techniques might miss. On a cartogram of the at-risk population, the statistical significance of a disease cluster is determinate from the rate, area and shape of the cluster under standard hypothesis testing scenarios. We develop formulae to determine, for a given rate, the area required for statistical significance of a priori and a posteriori designated regions under certain test assumptions. Uniquely, our approach enables dynamic inference of aggregate regions formed by combining individual districts. The method is implemented in interactive tools that provide choropleth mapping, automated legend construction and dynamic search tools to facilitate cluster detection and assessment of the validity of tested assumptions. A case study of leukemia incidence analysis in California demonstrates the ability to visually distinguish between statistically significant and insignificant regions. The proposed geovisual analytics approach enables intuitive visual assessment of statistical significance of arbitrarily defined regions on a cartogram. Our research prompts a broader discussion of the role of geovisual exploratory analyses in disease mapping and the appropriate framework for visually assessing the statistical significance of spatial clusters.

  18. Fully Bayesian tests of neutrality using genealogical summary statistics.

    PubMed

    Drummond, Alexei J; Suchard, Marc A

    2008-10-31

    Many data summary statistics have been developed to detect departures from neutral expectations of evolutionary models. However questions about the neutrality of the evolution of genetic loci within natural populations remain difficult to assess. One critical cause of this difficulty is that most methods for testing neutrality make simplifying assumptions simultaneously about the mutational model and the population size model. Consequentially, rejecting the null hypothesis of neutrality under these methods could result from violations of either or both assumptions, making interpretation troublesome. Here we harness posterior predictive simulation to exploit summary statistics of both the data and model parameters to test the goodness-of-fit of standard models of evolution. We apply the method to test the selective neutrality of molecular evolution in non-recombining gene genealogies and we demonstrate the utility of our method on four real data sets, identifying significant departures of neutrality in human influenza A virus, even after controlling for variation in population size. Importantly, by employing a full model-based Bayesian analysis, our method separates the effects of demography from the effects of selection. The method also allows multiple summary statistics to be used in concert, thus potentially increasing sensitivity. Furthermore, our method remains useful in situations where analytical expectations and variances of summary statistics are not available. This aspect has great potential for the analysis of temporally spaced data, an expanding area previously ignored for limited availability of theory and methods.

  19. Evaluation of Solid Rocket Motor Component Data Using a Commercially Available Statistical Software Package

    NASA Technical Reports Server (NTRS)

    Stefanski, Philip L.

    2015-01-01

    Commercially available software packages today allow users to quickly perform the routine evaluations of (1) descriptive statistics to numerically and graphically summarize both sample and population data, (2) inferential statistics that draws conclusions about a given population from samples taken of it, (3) probability determinations that can be used to generate estimates of reliability allowables, and finally (4) the setup of designed experiments and analysis of their data to identify significant material and process characteristics for application in both product manufacturing and performance enhancement. This paper presents examples of analysis and experimental design work that has been conducted using Statgraphics®(Registered Trademark) statistical software to obtain useful information with regard to solid rocket motor propellants and internal insulation material. Data were obtained from a number of programs (Shuttle, Constellation, and Space Launch System) and sources that include solid propellant burn rate strands, tensile specimens, sub-scale test motors, full-scale operational motors, rubber insulation specimens, and sub-scale rubber insulation analog samples. Besides facilitating the experimental design process to yield meaningful results, statistical software has demonstrated its ability to quickly perform complex data analyses and yield significant findings that might otherwise have gone unnoticed. One caveat to these successes is that useful results not only derive from the inherent power of the software package, but also from the skill and understanding of the data analyst.

  20. Predictors of workplace violence among female sex workers in Tijuana, Mexico.

    PubMed

    Katsulis, Yasmina; Durfee, Alesha; Lopez, Vera; Robillard, Alyssa

    2015-05-01

    For sex workers, differences in rates of exposure to workplace violence are likely influenced by a variety of risk factors, including where one works and under what circumstances. Economic stressors, such as housing insecurity, may also increase the likelihood of exposure. Bivariate analyses demonstrate statistically significant associations between workplace violence and selected predictor variables, including age, drug use, exchanging sex for goods, soliciting clients outdoors, and experiencing housing insecurity. Multivariate regression analysis shows that after controlling for each of these variables in one model, only soliciting clients outdoors and housing insecurity emerge as statistically significant predictors for workplace violence. © The Author(s) 2014.

  1. Kurtosis Approach Nonlinear Blind Source Separation

    NASA Technical Reports Server (NTRS)

    Duong, Vu A.; Stubbemd, Allen R.

    2005-01-01

    In this paper, we introduce a new algorithm for blind source signal separation for post-nonlinear mixtures. The mixtures are assumed to be linearly mixed from unknown sources first and then distorted by memoryless nonlinear functions. The nonlinear functions are assumed to be smooth and can be approximated by polynomials. Both the coefficients of the unknown mixing matrix and the coefficients of the approximated polynomials are estimated by the gradient descent method conditional on the higher order statistical requirements. The results of simulation experiments presented in this paper demonstrate the validity and usefulness of our approach for nonlinear blind source signal separation Keywords: Independent Component Analysis, Kurtosis, Higher order statistics.

  2. The extent and consequences of p-hacking in science.

    PubMed

    Head, Megan L; Holman, Luke; Lanfear, Rob; Kahn, Andrew T; Jennions, Michael D

    2015-03-01

    A focus on novel, confirmatory, and statistically significant results leads to substantial bias in the scientific literature. One type of bias, known as "p-hacking," occurs when researchers collect or select data or statistical analyses until nonsignificant results become significant. Here, we use text-mining to demonstrate that p-hacking is widespread throughout science. We then illustrate how one can test for p-hacking when performing a meta-analysis and show that, while p-hacking is probably common, its effect seems to be weak relative to the real effect sizes being measured. This result suggests that p-hacking probably does not drastically alter scientific consensuses drawn from meta-analyses.

  3. Demystification of Bell inequality

    NASA Astrophysics Data System (ADS)

    Khrennikov, Andrei

    2009-08-01

    The main aim of this review is to show that the common conclusion that Bell's argument implies that any attempt to proceed beyond quantum mechanics induces a nonlocal model was not totally justified. Our analysis of Bell's argument demonstrates that violation of Bell's inequality implies neither "death of realism" nor nonlocality. This violation is just a sign of non-Kolmogorovness of statistical data - impossibility to put statistical data collected in a few different experiments (corresponding to incompatible settings of polarization beam splitters) in one probability space. This inequality was well known in theoretical probability since 19th century (from works of Boole). We couple non-Kolmogorovness of data with design of modern detectors of photons.

  4. Intraoperative optical biopsy for brain tumors using spectro-lifetime properties of intrinsic fluorophores

    NASA Astrophysics Data System (ADS)

    Vasefi, Fartash; Kittle, David S.; Nie, Zhaojun; Falcone, Christina; Patil, Chirag G.; Chu, Ray M.; Mamelak, Adam N.; Black, Keith L.; Butte, Pramod V.

    2016-04-01

    We have developed and tested a system for real-time intra-operative optical identification and classification of brain tissues using time-resolved fluorescence spectroscopy (TRFS). A supervised learning algorithm using linear discriminant analysis (LDA) employing selected intrinsic fluorescence decay temporal points in 6 spectral bands was employed to maximize statistical significance difference between training groups. The linear discriminant analysis on in vivo human tissues obtained by TRFS measurements (N = 35) were validated by histopathologic analysis and neuronavigation correlation to pre-operative MRI images. These results demonstrate that TRFS can differentiate between normal cortex, white matter and glioma.

  5. Demonstration of fundamental statistics by studying timing of electronics signals in a physics-based laboratory

    NASA Astrophysics Data System (ADS)

    Beach, Shaun E.; Semkow, Thomas M.; Remling, David J.; Bradt, Clayton J.

    2017-07-01

    We have developed accessible methods to demonstrate fundamental statistics in several phenomena, in the context of teaching electronic signal processing in a physics-based college-level curriculum. A relationship between the exponential time-interval distribution and Poisson counting distribution for a Markov process with constant rate is derived in a novel way and demonstrated using nuclear counting. Negative binomial statistics is demonstrated as a model for overdispersion and justified by the effect of electronic noise in nuclear counting. The statistics of digital packets on a computer network are shown to be compatible with the fractal-point stochastic process leading to a power-law as well as generalized inverse Gaussian density distributions of time intervals between packets.

  6. Local sensitivity analysis for inverse problems solved by singular value decomposition

    USGS Publications Warehouse

    Hill, M.C.; Nolan, B.T.

    2010-01-01

    Local sensitivity analysis provides computationally frugal ways to evaluate models commonly used for resource management, risk assessment, and so on. This includes diagnosing inverse model convergence problems caused by parameter insensitivity and(or) parameter interdependence (correlation), understanding what aspects of the model and data contribute to measures of uncertainty, and identifying new data likely to reduce model uncertainty. Here, we consider sensitivity statistics relevant to models in which the process model parameters are transformed using singular value decomposition (SVD) to create SVD parameters for model calibration. The statistics considered include the PEST identifiability statistic, and combined use of the process-model parameter statistics composite scaled sensitivities and parameter correlation coefficients (CSS and PCC). The statistics are complimentary in that the identifiability statistic integrates the effects of parameter sensitivity and interdependence, while CSS and PCC provide individual measures of sensitivity and interdependence. PCC quantifies correlations between pairs or larger sets of parameters; when a set of parameters is intercorrelated, the absolute value of PCC is close to 1.00 for all pairs in the set. The number of singular vectors to include in the calculation of the identifiability statistic is somewhat subjective and influences the statistic. To demonstrate the statistics, we use the USDA’s Root Zone Water Quality Model to simulate nitrogen fate and transport in the unsaturated zone of the Merced River Basin, CA. There are 16 log-transformed process-model parameters, including water content at field capacity (WFC) and bulk density (BD) for each of five soil layers. Calibration data consisted of 1,670 observations comprising soil moisture, soil water tension, aqueous nitrate and bromide concentrations, soil nitrate concentration, and organic matter content. All 16 of the SVD parameters could be estimated by regression based on the range of singular values. Identifiability statistic results varied based on the number of SVD parameters included. Identifiability statistics calculated for four SVD parameters indicate the same three most important process-model parameters as CSS/PCC (WFC1, WFC2, and BD2), but the order differed. Additionally, the identifiability statistic showed that BD1 was almost as dominant as WFC1. The CSS/PCC analysis showed that this results from its high correlation with WCF1 (-0.94), and not its individual sensitivity. Such distinctions, combined with analysis of how high correlations and(or) sensitivities result from the constructed model, can produce important insights into, for example, the use of sensitivity analysis to design monitoring networks. In conclusion, the statistics considered identified similar important parameters. They differ because (1) with CSS/PCC can be more awkward because sensitivity and interdependence are considered separately and (2) identifiability requires consideration of how many SVD parameters to include. A continuing challenge is to understand how these computationally efficient methods compare with computationally demanding global methods like Markov-Chain Monte Carlo given common nonlinear processes and the often even more nonlinear models.

  7. Combining statistical inference and decisions in ecology

    USGS Publications Warehouse

    Williams, Perry J.; Hooten, Mevin B.

    2016-01-01

    Statistical decision theory (SDT) is a sub-field of decision theory that formally incorporates statistical investigation into a decision-theoretic framework to account for uncertainties in a decision problem. SDT provides a unifying analysis of three types of information: statistical results from a data set, knowledge of the consequences of potential choices (i.e., loss), and prior beliefs about a system. SDT links the theoretical development of a large body of statistical methods including point estimation, hypothesis testing, and confidence interval estimation. The theory and application of SDT have mainly been developed and published in the fields of mathematics, statistics, operations research, and other decision sciences, but have had limited exposure in ecology. Thus, we provide an introduction to SDT for ecologists and describe its utility for linking the conventionally separate tasks of statistical investigation and decision making in a single framework. We describe the basic framework of both Bayesian and frequentist SDT, its traditional use in statistics, and discuss its application to decision problems that occur in ecology. We demonstrate SDT with two types of decisions: Bayesian point estimation, and an applied management problem of selecting a prescribed fire rotation for managing a grassland bird species. Central to SDT, and decision theory in general, are loss functions. Thus, we also provide basic guidance and references for constructing loss functions for an SDT problem.

  8. Humans make efficient use of natural image statistics when performing spatial interpolation.

    PubMed

    D'Antona, Anthony D; Perry, Jeffrey S; Geisler, Wilson S

    2013-12-16

    Visual systems learn through evolution and experience over the lifespan to exploit the statistical structure of natural images when performing visual tasks. Understanding which aspects of this statistical structure are incorporated into the human nervous system is a fundamental goal in vision science. To address this goal, we measured human ability to estimate the intensity of missing image pixels in natural images. Human estimation accuracy is compared with various simple heuristics (e.g., local mean) and with optimal observers that have nearly complete knowledge of the local statistical structure of natural images. Human estimates are more accurate than those of simple heuristics, and they match the performance of an optimal observer that knows the local statistical structure of relative intensities (contrasts). This optimal observer predicts the detailed pattern of human estimation errors and hence the results place strong constraints on the underlying neural mechanisms. However, humans do not reach the performance of an optimal observer that knows the local statistical structure of the absolute intensities, which reflect both local relative intensities and local mean intensity. As predicted from a statistical analysis of natural images, human estimation accuracy is negligibly improved by expanding the context from a local patch to the whole image. Our results demonstrate that the human visual system exploits efficiently the statistical structure of natural images.

  9. Correlation of FCGRT genomic structure with serum immunoglobulin, albumin and farletuzumab pharmacokinetics in patients with first relapsed ovarian cancer.

    PubMed

    O'Shannessy, Daniel J; Bendas, Katie; Schweizer, Charles; Wang, Wenquan; Albone, Earl; Somers, Elizabeth B; Weil, Susan; Meredith, Rhonda K; Wustner, Jason; Grasso, Luigi; Landers, Mark; Nicolaides, Nicholas C

    2017-07-01

    Farletuzumab (FAR) is a humanized monoclonal antibody (mAb) that binds to folate receptor alpha. A Ph3 trial in ovarian cancer patients treated with carboplatin/taxane plus FAR or placebo did not meet the primary statistical endpoint. Subgroup analysis demonstrated that subjects with high FAR exposure levels (Cmin>57.6μg/mL) showed statistically significant improvements in PFS and OS. The neonatal Fc receptor (fcgrt) plays a central role in albumin/IgG stasis and mAb pharmacokinetics (PK). Here we evaluated fcgrt sequence and association of its promoter variable number tandem repeats (VNTR) and coding single nucleotide variants (SNV) with albumin/IgG levels and FAR PK in the Ph3 patients. A statistical correlation existed between high FAR Cmin and AUC in patients with the highest quartile of albumin and lowest quartile of IgG1. Analysis of fcgrt identified 5 different VNTRs in the promoter region and 9 SNVs within the coding region, 4 which are novel. Copyright © 2017. Published by Elsevier Inc.

  10. Machine learning to analyze images of shocked materials for precise and accurate measurements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dresselhaus-Cooper, Leora; Howard, Marylesa; Hock, Margaret C.

    A supervised machine learning algorithm, called locally adaptive discriminant analysis (LADA), has been developed to locate boundaries between identifiable image features that have varying intensities. LADA is an adaptation of image segmentation, which includes techniques that find the positions of image features (classes) using statistical intensity distributions for each class in the image. In order to place a pixel in the proper class, LADA considers the intensity at that pixel and the distribution of intensities in local (nearby) pixels. This paper presents the use of LADA to provide, with statistical uncertainties, the positions and shapes of features within ultrafast imagesmore » of shock waves. We demonstrate the ability to locate image features including crystals, density changes associated with shock waves, and material jetting caused by shock waves. This algorithm can analyze images that exhibit a wide range of physical phenomena because it does not rely on comparison to a model. LADA enables analysis of images from shock physics with statistical rigor independent of underlying models or simulations.« less

  11. Error Analysis for RADAR Neighbor Matching Localization in Linear Logarithmic Strength Varying Wi-Fi Environment

    PubMed Central

    Tian, Zengshan; Xu, Kunjie; Yu, Xiang

    2014-01-01

    This paper studies the statistical errors for the fingerprint-based RADAR neighbor matching localization with the linearly calibrated reference points (RPs) in logarithmic received signal strength (RSS) varying Wi-Fi environment. To the best of our knowledge, little comprehensive analysis work has appeared on the error performance of neighbor matching localization with respect to the deployment of RPs. However, in order to achieve the efficient and reliable location-based services (LBSs) as well as the ubiquitous context-awareness in Wi-Fi environment, much attention has to be paid to the highly accurate and cost-efficient localization systems. To this end, the statistical errors by the widely used neighbor matching localization are significantly discussed in this paper to examine the inherent mathematical relations between the localization errors and the locations of RPs by using a basic linear logarithmic strength varying model. Furthermore, based on the mathematical demonstrations and some testing results, the closed-form solutions to the statistical errors by RADAR neighbor matching localization can be an effective tool to explore alternative deployment of fingerprint-based neighbor matching localization systems in the future. PMID:24683349

  12. Error analysis for RADAR neighbor matching localization in linear logarithmic strength varying Wi-Fi environment.

    PubMed

    Zhou, Mu; Tian, Zengshan; Xu, Kunjie; Yu, Xiang; Wu, Haibo

    2014-01-01

    This paper studies the statistical errors for the fingerprint-based RADAR neighbor matching localization with the linearly calibrated reference points (RPs) in logarithmic received signal strength (RSS) varying Wi-Fi environment. To the best of our knowledge, little comprehensive analysis work has appeared on the error performance of neighbor matching localization with respect to the deployment of RPs. However, in order to achieve the efficient and reliable location-based services (LBSs) as well as the ubiquitous context-awareness in Wi-Fi environment, much attention has to be paid to the highly accurate and cost-efficient localization systems. To this end, the statistical errors by the widely used neighbor matching localization are significantly discussed in this paper to examine the inherent mathematical relations between the localization errors and the locations of RPs by using a basic linear logarithmic strength varying model. Furthermore, based on the mathematical demonstrations and some testing results, the closed-form solutions to the statistical errors by RADAR neighbor matching localization can be an effective tool to explore alternative deployment of fingerprint-based neighbor matching localization systems in the future.

  13. NASA DOE POD NDE Capabilities Data Book

    NASA Technical Reports Server (NTRS)

    Generazio, Edward R.

    2015-01-01

    This data book contains the Directed Design of Experiments for Validating Probability of Detection (POD) Capability of NDE Systems (DOEPOD) analyses of the nondestructive inspection data presented in the NTIAC, Nondestructive Evaluation (NDE) Capabilities Data Book, 3rd ed., NTIAC DB-97-02. DOEPOD is designed as a decision support system to validate inspection system, personnel, and protocol demonstrating 0.90 POD with 95% confidence at critical flaw sizes, a90/95. The test methodology used in DOEPOD is based on the field of statistical sequential analysis founded by Abraham Wald. Sequential analysis is a method of statistical inference whose characteristic feature is that the number of observations required by the procedure is not determined in advance of the experiment. The decision to terminate the experiment depends, at each stage, on the results of the observations previously made. A merit of the sequential method, as applied to testing statistical hypotheses, is that test procedures can be constructed which require, on average, a substantially smaller number of observations than equally reliable test procedures based on a predetermined number of observations.

  14. Connectivity-based fixel enhancement: Whole-brain statistical analysis of diffusion MRI measures in the presence of crossing fibres

    PubMed Central

    Raffelt, David A.; Smith, Robert E.; Ridgway, Gerard R.; Tournier, J-Donald; Vaughan, David N.; Rose, Stephen; Henderson, Robert; Connelly, Alan

    2015-01-01

    In brain regions containing crossing fibre bundles, voxel-average diffusion MRI measures such as fractional anisotropy (FA) are difficult to interpret, and lack within-voxel single fibre population specificity. Recent work has focused on the development of more interpretable quantitative measures that can be associated with a specific fibre population within a voxel containing crossing fibres (herein we use fixel to refer to a specific fibre population within a single voxel). Unfortunately, traditional 3D methods for smoothing and cluster-based statistical inference cannot be used for voxel-based analysis of these measures, since the local neighbourhood for smoothing and cluster formation can be ambiguous when adjacent voxels may have different numbers of fixels, or ill-defined when they belong to different tracts. Here we introduce a novel statistical method to perform whole-brain fixel-based analysis called connectivity-based fixel enhancement (CFE). CFE uses probabilistic tractography to identify structurally connected fixels that are likely to share underlying anatomy and pathology. Probabilistic connectivity information is then used for tract-specific smoothing (prior to the statistical analysis) and enhancement of the statistical map (using a threshold-free cluster enhancement-like approach). To investigate the characteristics of the CFE method, we assessed sensitivity and specificity using a large number of combinations of CFE enhancement parameters and smoothing extents, using simulated pathology generated with a range of test-statistic signal-to-noise ratios in five different white matter regions (chosen to cover a broad range of fibre bundle features). The results suggest that CFE input parameters are relatively insensitive to the characteristics of the simulated pathology. We therefore recommend a single set of CFE parameters that should give near optimal results in future studies where the group effect is unknown. We then demonstrate the proposed method by comparing apparent fibre density between motor neurone disease (MND) patients with control subjects. The MND results illustrate the benefit of fixel-specific statistical inference in white matter regions that contain crossing fibres. PMID:26004503

  15. The Importance of Medical Students' Attitudes Regarding Cognitive Competence for Teaching Applied Statistics: Multi-Site Study and Meta-Analysis

    PubMed Central

    Milic, Natasa M.; Masic, Srdjan; Milin-Lazovic, Jelena; Trajkovic, Goran; Bukumiric, Zoran; Savic, Marko; Milic, Nikola V.; Cirkovic, Andja; Gajic, Milan; Kostic, Mirjana; Ilic, Aleksandra; Stanisavljevic, Dejana

    2016-01-01

    Background The scientific community increasingly is recognizing the need to bolster standards of data analysis given the widespread concern that basic mistakes in data analysis are contributing to the irreproducibility of many published research findings. The aim of this study was to investigate students’ attitudes towards statistics within a multi-site medical educational context, monitor their changes and impact on student achievement. In addition, we performed a systematic review to better support our future pedagogical decisions in teaching applied statistics to medical students. Methods A validated Serbian Survey of Attitudes Towards Statistics (SATS-36) questionnaire was administered to medical students attending obligatory introductory courses in biostatistics from three medical universities in the Western Balkans. A systematic review of peer-reviewed publications was performed through searches of Scopus, Web of Science, Science Direct, Medline, and APA databases through 1994. A meta-analysis was performed for the correlation coefficients between SATS component scores and statistics achievement. Pooled estimates were calculated using random effects models. Results SATS-36 was completed by 461 medical students. Most of the students held positive attitudes towards statistics. Ability in mathematics and grade point average were associated in a multivariate regression model with the Cognitive Competence score, after adjusting for age, gender and computer ability. The results of 90 paired data showed that Affect, Cognitive Competence, and Effort scores demonstrated significant positive changes. The Cognitive Competence score showed the largest increase (M = 0.48, SD = 0.95). The positive correlation found between the Cognitive Competence score and students’ achievement (r = 0.41; p<0.001), was also shown in the meta-analysis (r = 0.37; 95% CI 0.32–0.41). Conclusion Students' subjective attitudes regarding Cognitive Competence at the beginning of the biostatistics course, which were directly linked to mathematical knowledge, affected their attitudes at the end of the course that, in turn, influenced students' performance. This indicates the importance of positively changing not only students’ cognitive competency, but also their perceptions of gained competency during the biostatistics course. PMID:27764123

  16. The Importance of Medical Students' Attitudes Regarding Cognitive Competence for Teaching Applied Statistics: Multi-Site Study and Meta-Analysis.

    PubMed

    Milic, Natasa M; Masic, Srdjan; Milin-Lazovic, Jelena; Trajkovic, Goran; Bukumiric, Zoran; Savic, Marko; Milic, Nikola V; Cirkovic, Andja; Gajic, Milan; Kostic, Mirjana; Ilic, Aleksandra; Stanisavljevic, Dejana

    2016-01-01

    The scientific community increasingly is recognizing the need to bolster standards of data analysis given the widespread concern that basic mistakes in data analysis are contributing to the irreproducibility of many published research findings. The aim of this study was to investigate students' attitudes towards statistics within a multi-site medical educational context, monitor their changes and impact on student achievement. In addition, we performed a systematic review to better support our future pedagogical decisions in teaching applied statistics to medical students. A validated Serbian Survey of Attitudes Towards Statistics (SATS-36) questionnaire was administered to medical students attending obligatory introductory courses in biostatistics from three medical universities in the Western Balkans. A systematic review of peer-reviewed publications was performed through searches of Scopus, Web of Science, Science Direct, Medline, and APA databases through 1994. A meta-analysis was performed for the correlation coefficients between SATS component scores and statistics achievement. Pooled estimates were calculated using random effects models. SATS-36 was completed by 461 medical students. Most of the students held positive attitudes towards statistics. Ability in mathematics and grade point average were associated in a multivariate regression model with the Cognitive Competence score, after adjusting for age, gender and computer ability. The results of 90 paired data showed that Affect, Cognitive Competence, and Effort scores demonstrated significant positive changes. The Cognitive Competence score showed the largest increase (M = 0.48, SD = 0.95). The positive correlation found between the Cognitive Competence score and students' achievement (r = 0.41; p<0.001), was also shown in the meta-analysis (r = 0.37; 95% CI 0.32-0.41). Students' subjective attitudes regarding Cognitive Competence at the beginning of the biostatistics course, which were directly linked to mathematical knowledge, affected their attitudes at the end of the course that, in turn, influenced students' performance. This indicates the importance of positively changing not only students' cognitive competency, but also their perceptions of gained competency during the biostatistics course.

  17. Using Computational Modeling to Assess the Impact of Clinical Decision Support on Cancer Screening within Community Health Centers

    PubMed Central

    Carney, Timothy Jay; Morgan, Geoffrey P.; Jones, Josette; McDaniel, Anna M.; Weaver, Michael; Weiner, Bryan; Haggstrom, David A.

    2014-01-01

    Our conceptual model demonstrates our goal to investigate the impact of clinical decision support (CDS) utilization on cancer screening improvement strategies in the community health care (CHC) setting. We employed a dual modeling technique using both statistical and computational modeling to evaluate impact. Our statistical model used the Spearman’s Rho test to evaluate the strength of relationship between our proximal outcome measures (CDS utilization) against our distal outcome measure (provider self-reported cancer screening improvement). Our computational model relied on network evolution theory and made use of a tool called Construct-TM to model the use of CDS measured by the rate of organizational learning. We employed the use of previously collected survey data from community health centers Cancer Health Disparities Collaborative (HDCC). Our intent is to demonstrate the added valued gained by using a computational modeling tool in conjunction with a statistical analysis when evaluating the impact a health information technology, in the form of CDS, on health care quality process outcomes such as facility-level screening improvement. Significant simulated disparities in organizational learning over time were observed between community health centers beginning the simulation with high and low clinical decision support capability. PMID:24953241

  18. Analysis of model development strategies: predicting ventral hernia recurrence.

    PubMed

    Holihan, Julie L; Li, Linda T; Askenasy, Erik P; Greenberg, Jacob A; Keith, Jerrod N; Martindale, Robert G; Roth, J Scott; Liang, Mike K

    2016-11-01

    There have been many attempts to identify variables associated with ventral hernia recurrence; however, it is unclear which statistical modeling approach results in models with greatest internal and external validity. We aim to assess the predictive accuracy of models developed using five common variable selection strategies to determine variables associated with hernia recurrence. Two multicenter ventral hernia databases were used. Database 1 was randomly split into "development" and "internal validation" cohorts. Database 2 was designated "external validation". The dependent variable for model development was hernia recurrence. Five variable selection strategies were used: (1) "clinical"-variables considered clinically relevant, (2) "selective stepwise"-all variables with a P value <0.20 were assessed in a step-backward model, (3) "liberal stepwise"-all variables were included and step-backward regression was performed, (4) "restrictive internal resampling," and (5) "liberal internal resampling." Variables were included with P < 0.05 for the Restrictive model and P < 0.10 for the Liberal model. A time-to-event analysis using Cox regression was performed using these strategies. The predictive accuracy of the developed models was tested on the internal and external validation cohorts using Harrell's C-statistic where C > 0.70 was considered "reasonable". The recurrence rate was 32.9% (n = 173/526; median/range follow-up, 20/1-58 mo) for the development cohort, 36.0% (n = 95/264, median/range follow-up 20/1-61 mo) for the internal validation cohort, and 12.7% (n = 155/1224, median/range follow-up 9/1-50 mo) for the external validation cohort. Internal validation demonstrated reasonable predictive accuracy (C-statistics = 0.772, 0.760, 0.767, 0.757, 0.763), while on external validation, predictive accuracy dipped precipitously (C-statistic = 0.561, 0.557, 0.562, 0.553, 0.560). Predictive accuracy was equally adequate on internal validation among models; however, on external validation, all five models failed to demonstrate utility. Future studies should report multiple variable selection techniques and demonstrate predictive accuracy on external data sets for model validation. Copyright © 2016 Elsevier Inc. All rights reserved.

  19. Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis

    PubMed Central

    Steele, Joe; Bastola, Dhundy

    2014-01-01

    Modern sequencing and genome assembly technologies have provided a wealth of data, which will soon require an analysis by comparison for discovery. Sequence alignment, a fundamental task in bioinformatics research, may be used but with some caveats. Seminal techniques and methods from dynamic programming are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. These methods are prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These new methods are often preferred, as their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools, which stem from alignment-free methods based on statistical analysis from word frequencies. We provide several clear examples to demonstrate applications and the interpretations over several different areas of alignment-free analysis such as base–base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic metric. Additionally, we provide detailed discussion and an example of analysis by Lempel–Ziv techniques from data compression. PMID:23904502

  20. Time-variant random interval natural frequency analysis of structures

    NASA Astrophysics Data System (ADS)

    Wu, Binhua; Wu, Di; Gao, Wei; Song, Chongmin

    2018-02-01

    This paper presents a new robust method namely, unified interval Chebyshev-based random perturbation method, to tackle hybrid random interval structural natural frequency problem. In the proposed approach, random perturbation method is implemented to furnish the statistical features (i.e., mean and standard deviation) and Chebyshev surrogate model strategy is incorporated to formulate the statistical information of natural frequency with regards to the interval inputs. The comprehensive analysis framework combines the superiority of both methods in a way that computational cost is dramatically reduced. This presented method is thus capable of investigating the day-to-day based time-variant natural frequency of structures accurately and efficiently under concrete intrinsic creep effect with probabilistic and interval uncertain variables. The extreme bounds of the mean and standard deviation of natural frequency are captured through the embedded optimization strategy within the analysis procedure. Three particularly motivated numerical examples with progressive relationship in perspective of both structure type and uncertainty variables are demonstrated to justify the computational applicability, accuracy and efficiency of the proposed method.

  1. Complex networks as a unified framework for descriptive analysis and predictive modeling in climate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steinhaeuser, Karsten J K; Chawla, Nitesh; Ganguly, Auroop R

    The analysis of climate data has relied heavily on hypothesis-driven statistical methods, while projections of future climate are based primarily on physics-based computational models. However, in recent years a wealth of new datasets has become available. Therefore, we take a more data-centric approach and propose a unified framework for studying climate, with an aim towards characterizing observed phenomena as well as discovering new knowledge in the climate domain. Specifically, we posit that complex networks are well-suited for both descriptive analysis and predictive modeling tasks. We show that the structural properties of climate networks have useful interpretation within the domain. Further,more » we extract clusters from these networks and demonstrate their predictive power as climate indices. Our experimental results establish that the network clusters are statistically significantly better predictors than clusters derived using a more traditional clustering approach. Using complex networks as data representation thus enables the unique opportunity for descriptive and predictive modeling to inform each other.« less

  2. Impact of distributions on the archetypes and prototypes in heterogeneous nanoparticle ensembles.

    PubMed

    Fernandez, Michael; Wilson, Hugh F; Barnard, Amanda S

    2017-01-05

    The magnitude and complexity of the structural and functional data available on nanomaterials requires data analytics, statistical analysis and information technology to drive discovery. We demonstrate that multivariate statistical analysis can recognise the sets of truly significant nanostructures and their most relevant properties in heterogeneous ensembles with different probability distributions. The prototypical and archetypal nanostructures of five virtual ensembles of Si quantum dots (SiQDs) with Boltzmann, frequency, normal, Poisson and random distributions are identified using clustering and archetypal analysis, where we find that their diversity is defined by size and shape, regardless of the type of distribution. At the complex hull of the SiQD ensembles, simple configuration archetypes can efficiently describe a large number of SiQDs, whereas more complex shapes are needed to represent the average ordering of the ensembles. This approach provides a route towards the characterisation of computationally intractable virtual nanomaterial spaces, which can convert big data into smart data, and significantly reduce the workload to simulate experimentally relevant virtual samples.

  3. Statistical analysis of cascading failures in power grids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chertkov, Michael; Pfitzner, Rene; Turitsyn, Konstantin

    2010-12-01

    We introduce a new microscopic model of cascading failures in transmission power grids. This model accounts for automatic response of the grid to load fluctuations that take place on the scale of minutes, when optimum power flow adjustments and load shedding controls are unavailable. We describe extreme events, caused by load fluctuations, which cause cascading failures of loads, generators and lines. Our model is quasi-static in the causal, discrete time and sequential resolution of individual failures. The model, in its simplest realization based on the Directed Current description of the power flow problem, is tested on three standard IEEE systemsmore » consisting of 30, 39 and 118 buses. Our statistical analysis suggests a straightforward classification of cascading and islanding phases in terms of the ratios between average number of removed loads, generators and links. The analysis also demonstrates sensitivity to variations in line capacities. Future research challenges in modeling and control of cascading outages over real-world power networks are discussed.« less

  4. Individualism: a valid and important dimension of cultural differences between nations.

    PubMed

    Schimmack, Ulrich; Oishi, Shigehiro; Diener, Ed

    2005-01-01

    Oyserman, Coon, and Kemmelmeier's (2002) meta-analysis suggested problems in the measurement of individualism and collectivism. Studies using Hofstede's individualism scores show little convergent validity with more recent measures of individualism and collectivism. We propose that the lack of convergent validity is due to national differences in response styles. Whereas Hofstede statistically controlled for response styles, Oyserman et al.'s meta-analysis relied on uncorrected ratings. Data from an international student survey demonstrated convergent validity between Hofstede's individualism dimension and horizontal individualism when response styles were statistically controlled, whereas uncorrected scores correlated highly with the individualism scores in Oyserman et al.'s meta-analysis. Uncorrected horizontal individualism scores and meta-analytic individualism scores did not correlate significantly with nations' development, whereas corrected horizontal individualism scores and Hofstede's individualism dimension were significantly correlated with development. This pattern of results suggests that individualism is a valid construct for cross-cultural comparisons, but that the measurement of this construct needs improvement.

  5. Damage detection of engine bladed-disks using multivariate statistical analysis

    NASA Astrophysics Data System (ADS)

    Fang, X.; Tang, J.

    2006-03-01

    The timely detection of damage in aero-engine bladed-disks is an extremely important and challenging research topic. Bladed-disks have high modal density and, particularly, their vibration responses are subject to significant uncertainties due to manufacturing tolerance (blade-to-blade difference or mistuning), operating condition change and sensor noise. In this study, we present a new methodology for the on-line damage detection of engine bladed-disks using their vibratory responses during spin-up or spin-down operations which can be measured by blade-tip-timing sensing technique. We apply a principle component analysis (PCA)-based approach for data compression, feature extraction, and denoising. The non-model based damage detection is achieved by analyzing the change between response features of the healthy structure and of the damaged one. We facilitate such comparison by incorporating the Hotelling's statistic T2 analysis, which yields damage declaration with a given confidence level. The effectiveness of the method is demonstrated by case studies.

  6. Applied immuno-epidemiological research: an approach for integrating existing knowledge into the statistical analysis of multiple immune markers.

    PubMed

    Genser, Bernd; Fischer, Joachim E; Figueiredo, Camila A; Alcântara-Neves, Neuza; Barreto, Mauricio L; Cooper, Philip J; Amorim, Leila D; Saemann, Marcus D; Weichhart, Thomas; Rodrigues, Laura C

    2016-05-20

    Immunologists often measure several correlated immunological markers, such as concentrations of different cytokines produced by different immune cells and/or measured under different conditions, to draw insights from complex immunological mechanisms. Although there have been recent methodological efforts to improve the statistical analysis of immunological data, a framework is still needed for the simultaneous analysis of multiple, often correlated, immune markers. This framework would allow the immunologists' hypotheses about the underlying biological mechanisms to be integrated. We present an analytical approach for statistical analysis of correlated immune markers, such as those commonly collected in modern immuno-epidemiological studies. We demonstrate i) how to deal with interdependencies among multiple measurements of the same immune marker, ii) how to analyse association patterns among different markers, iii) how to aggregate different measures and/or markers to immunological summary scores, iv) how to model the inter-relationships among these scores, and v) how to use these scores in epidemiological association analyses. We illustrate the application of our approach to multiple cytokine measurements from 818 children enrolled in a large immuno-epidemiological study (SCAALA Salvador), which aimed to quantify the major immunological mechanisms underlying atopic diseases or asthma. We demonstrate how to aggregate systematically the information captured in multiple cytokine measurements to immunological summary scores aimed at reflecting the presumed underlying immunological mechanisms (Th1/Th2 balance and immune regulatory network). We show how these aggregated immune scores can be used as predictors in regression models with outcomes of immunological studies (e.g. specific IgE) and compare the results to those obtained by a traditional multivariate regression approach. The proposed analytical approach may be especially useful to quantify complex immune responses in immuno-epidemiological studies, where investigators examine the relationship among epidemiological patterns, immune response, and disease outcomes.

  7. A bibliometric analysis of statistical terms used in American Physical Therapy Association journals (2011-2012): evidence for educating physical therapists.

    PubMed

    Tilson, Julie K; Marshall, Katie; Tam, Jodi J; Fetters, Linda

    2016-04-22

    A primary barrier to the implementation of evidence based practice (EBP) in physical therapy is therapists' limited ability to understand and interpret statistics. Physical therapists demonstrate limited skills and report low self-efficacy for interpreting results of statistical procedures. While standards for physical therapist education include statistics, little empirical evidence is available to inform what should constitute such curricula. The purpose of this study was to conduct a census of the statistical terms and study designs used in physical therapy literature and to use the results to make recommendations for curricular development in physical therapist education. We conducted a bibliometric analysis of 14 peer-reviewed journals associated with the American Physical Therapy Association over 12 months (Oct 2011-Sept 2012). Trained raters recorded every statistical term appearing in identified systematic reviews, primary research reports, and case series and case reports. Investigator-reported study design was also recorded. Terms representing the same statistical test or concept were combined into a single, representative term. Cumulative percentage was used to identify the most common representative statistical terms. Common representative terms were organized into eight categories to inform curricular design. Of 485 articles reviewed, 391 met the inclusion criteria. These 391 articles used 532 different terms which were combined into 321 representative terms; 13.1 (sd = 8.0) terms per article. Eighty-one representative terms constituted 90% of all representative term occurrences. Of the remaining 240 representative terms, 105 (44%) were used in only one article. The most common study design was prospective cohort (32.5%). Physical therapy literature contains a large number of statistical terms and concepts for readers to navigate. However, in the year sampled, 81 representative terms accounted for 90% of all occurrences. These "common representative terms" can be used to inform curricula to promote physical therapists' skills, competency, and confidence in interpreting statistics in their professional literature. We make specific recommendations for curriculum development informed by our findings.

  8. Markov chain Monte Carlo estimation of quantum states

    NASA Astrophysics Data System (ADS)

    Diguglielmo, James; Messenger, Chris; Fiurášek, Jaromír; Hage, Boris; Samblowski, Aiko; Schmidt, Tabea; Schnabel, Roman

    2009-03-01

    We apply a Bayesian data analysis scheme known as the Markov chain Monte Carlo to the tomographic reconstruction of quantum states. This method yields a vector, known as the Markov chain, which contains the full statistical information concerning all reconstruction parameters including their statistical correlations with no a priori assumptions as to the form of the distribution from which it has been obtained. From this vector we can derive, e.g., the marginal distributions and uncertainties of all model parameters, and also of other quantities such as the purity of the reconstructed state. We demonstrate the utility of this scheme by reconstructing the Wigner function of phase-diffused squeezed states. These states possess non-Gaussian statistics and therefore represent a nontrivial case of tomographic reconstruction. We compare our results to those obtained through pure maximum-likelihood and Fisher information approaches.

  9. Superposed epoch analysis of physiological fluctuations: possible space weather connections

    NASA Astrophysics Data System (ADS)

    Wanliss, James; Cornélissen, Germaine; Halberg, Franz; Brown, Denzel; Washington, Brien

    2018-03-01

    There is a strong connection between space weather and fluctuations in technological systems. Some studies also suggest a statistical connection between space weather and subsequent fluctuations in the physiology of living creatures. This connection, however, has remained controversial and difficult to demonstrate. Here we present support for a response of human physiology to forcing from the explosive onset of the largest of space weather events—space storms. We consider a case study with over 16 years of high temporal resolution measurements of human blood pressure (systolic, diastolic) and heart rate variability to search for associations with space weather. We find no statistically significant change in human blood pressure but a statistically significant drop in heart rate during the main phase of space storms. Our empirical findings shed light on how human physiology may respond to exogenous space weather forcing.

  10. Superposed epoch analysis of physiological fluctuations: possible space weather connections.

    PubMed

    Wanliss, James; Cornélissen, Germaine; Halberg, Franz; Brown, Denzel; Washington, Brien

    2018-03-01

    There is a strong connection between space weather and fluctuations in technological systems. Some studies also suggest a statistical connection between space weather and subsequent fluctuations in the physiology of living creatures. This connection, however, has remained controversial and difficult to demonstrate. Here we present support for a response of human physiology to forcing from the explosive onset of the largest of space weather events-space storms. We consider a case study with over 16 years of high temporal resolution measurements of human blood pressure (systolic, diastolic) and heart rate variability to search for associations with space weather. We find no statistically significant change in human blood pressure but a statistically significant drop in heart rate during the main phase of space storms. Our empirical findings shed light on how human physiology may respond to exogenous space weather forcing.

  11. MANCOVA for one way classification with homogeneity of regression coefficient vectors

    NASA Astrophysics Data System (ADS)

    Mokesh Rayalu, G.; Ravisankar, J.; Mythili, G. Y.

    2017-11-01

    The MANOVA and MANCOVA are the extensions of the univariate ANOVA and ANCOVA techniques to multidimensional or vector valued observations. The assumption of a Gaussian distribution has been replaced with the Multivariate Gaussian distribution for the vectors data and residual term variables in the statistical models of these techniques. The objective of MANCOVA is to determine if there are statistically reliable mean differences that can be demonstrated between groups later modifying the newly created variable. When randomization assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting dependent variables as if all subjects scored the same on the covariates. In this research article, an extension has been made to the MANCOVA technique with more number of covariates and homogeneity of regression coefficient vectors is also tested.

  12. Flow Injection Mass Spectral Fingerprints Demonstrate Chemical Differences in Rio Red Grapefruit with Respect to Year, Harvest Time, and Conventional versus Organic Farming

    PubMed Central

    Chen, Pei; Harnly, James M.; Lester, Gene E.

    2013-01-01

    Spectral fingerprints were acquired for Rio Red grapefruit using flow injection electrospray ionization with ion trap and time-of-flight mass spectrometry (FI-ESI-IT-MS and FI-ESI-TOF-MS). Rio Red grapefruits were harvested 3 times a year (early, mid, and late harvests) in 2005 and 2006 from conventionally and organically grown trees. Data analysis using analysis of variance principal component analysis (ANOVA-PCA) demonstrated that, for both MS systems, the chemical patterns were different as a function of farming mode (conventional vs organic), as well as growing year and time of harvest. This was visually obvious with PCA and was shown to be statistically significant using ANOVA. The spectral fingerprints provided a more inclusive view of the chemical composition of the grapefruit and extended previous conclusions regarding the chemical differences between conventionally and organically grown Rio Red grapefruit. PMID:20337420

  13. Low-Dimensional Statistics of Anatomical Variability via Compact Representation of Image Deformations.

    PubMed

    Zhang, Miaomiao; Wells, William M; Golland, Polina

    2016-10-01

    Using image-based descriptors to investigate clinical hypotheses and therapeutic implications is challenging due to the notorious "curse of dimensionality" coupled with a small sample size. In this paper, we present a low-dimensional analysis of anatomical shape variability in the space of diffeomorphisms and demonstrate its benefits for clinical studies. To combat the high dimensionality of the deformation descriptors, we develop a probabilistic model of principal geodesic analysis in a bandlimited low-dimensional space that still captures the underlying variability of image data. We demonstrate the performance of our model on a set of 3D brain MRI scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. Our model yields a more compact representation of group variation at substantially lower computational cost than models based on the high-dimensional state-of-the-art approaches such as tangent space PCA (TPCA) and probabilistic principal geodesic analysis (PPGA).

  14. [The role of meta-analysis in assessing the treatment of advanced non-small cell lung cancer].

    PubMed

    Pérol, M; Pérol, D

    2004-02-01

    Meta-analysis is a statistical method allowing an evaluation of the direction and quantitative importance of a treatment effect observed in randomized trials which have tested the treatment but have not provided a definitive conclusion. In the present review, we discuss the methodology and the contribution of meta-analyses to the treatment of advanced-stage or metastatic non-small-cell lung cancer. In this area of cancerology, meta-analyses have provided determining information demonstrating the impact of chemotherapy on patient survival. They have also helped define a two-drug regimen based on cisplatin as the gold standard treatment for patients with a satisfactory general status. Recently, the meta-analysis method was used to measure the influence of gemcitabin in combination with platinium salts and demonstrated a small but significant benefit in survival, confirming that gemcitabin remains the gold standard treatment in combination with cisplatin.

  15. It's all relative: ranking the diversity of aquatic bacterial communities.

    PubMed

    Shaw, Allison K; Halpern, Aaron L; Beeson, Karen; Tran, Bao; Venter, J Craig; Martiny, Jennifer B H

    2008-09-01

    The study of microbial diversity patterns is hampered by the enormous diversity of microbial communities and the lack of resources to sample them exhaustively. For many questions about richness and evenness, however, one only needs to know the relative order of diversity among samples rather than total diversity. We used 16S libraries from the Global Ocean Survey to investigate the ability of 10 diversity statistics (including rarefaction, non-parametric, parametric, curve extrapolation and diversity indices) to assess the relative diversity of six aquatic bacterial communities. Overall, we found that the statistics yielded remarkably similar rankings of the samples for a given sequence similarity cut-off. This correspondence, despite the different underlying assumptions of the statistics, suggests that diversity statistics are a useful tool for ranking samples of microbial diversity. In addition, sequence similarity cut-off influenced the diversity ranking of the samples, demonstrating that diversity statistics can also be used to detect differences in phylogenetic structure among microbial communities. Finally, a subsampling analysis suggests that further sequencing from these particular clone libraries would not have substantially changed the richness rankings of the samples.

  16. Importance of the Correlation between Width and Length in the Shape Analysis of Nanorods: Use of a 2D Size Plot To Probe Such a Correlation.

    PubMed

    Zhao, Zhihua; Zheng, Zhiqin; Roux, Clément; Delmas, Céline; Marty, Jean-Daniel; Kahn, Myrtil L; Mingotaud, Christophe

    2016-08-22

    Analysis of nanoparticle size through a simple 2D plot is proposed in order to extract the correlation between length and width in a collection or a mixture of anisotropic particles. Compared to the usual statistics on the length associated with a second and independent statistical analysis of the width, this simple plot easily points out the various types of nanoparticles and their (an)isotropy. For each class of nano-objects, the relationship between width and length (i.e., the strong or weak correlations between these two parameters) may suggest information concerning the nucleation/growth processes. It allows one to follow the effect on the shape and size distribution of physical or chemical processes such as simple ripening. Various electron microscopy pictures from the literature or from the authors' own syntheses are used as examples to demonstrate the efficiency and simplicity of the proposed 2D plot combined with a multivariate analysis. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. Computerized EEG analysis for studying the effect of drugs on the central nervous system.

    PubMed

    Rosadini, G; Cavazza, B; Rodriguez, G; Sannita, W G; Siccardi, A

    1977-11-01

    Samples of our experience in quantitative pharmaco-EEG are reviewed to discuss and define its applicability and limits. Simple processing systems, such as the computation of Hjorth's descriptors, are useful for on-line monitoring of drug-induced EEG modifications which are evident also at the visual visual analysis. Power spectral analysis is suitable to identify and quantify EEG effects not evident at the visual inspection. It demonstrated how the EEG effects of compounds in a long-acting formulation vary according to the sampling time and the explored cerebral area. EEG modifications not detected by power spectral analysis can be defined by comparing statistically (F test) the spectral values of the EEG from a single lead at the different samples (longitudinal comparison), or the spectral values from different leads at any sample (intrahemispheric comparison). The presently available procedures of quantitative pharmaco-EEG are effective when applied to the study of mutltilead EEG recordings in a statistically significant sample of population. They do not seem reliable in the monitoring of directing of neuropyschiatric therapies in single patients, due to individual variability of drug effects.

  18. Changes of statistical structural fluctuations unveils an early compacted degraded stage of PNS myelin

    NASA Astrophysics Data System (ADS)

    Poccia, Nicola; Campi, Gaetano; Ricci, Alessandro; Caporale, Alessandra S.; di Cola, Emanuela; Hawkins, Thomas A.; Bianconi, Antonio

    2014-06-01

    Degradation of the myelin sheath is a common pathology underlying demyelinating neurological diseases from Multiple Sclerosis to Leukodistrophies. Although large malformations of myelin ultrastructure in the advanced stages of Wallerian degradation is known, its subtle structural variations at early stages of demyelination remains poorly characterized. This is partly due to the lack of suitable and non-invasive experimental probes possessing sufficient resolution to detect the degradation. Here we report the feasibility of the application of an innovative non-invasive local structure experimental approach for imaging the changes of statistical structural fluctuations in the first stage of myelin degeneration. Scanning micro X-ray diffraction, using advances in synchrotron x-ray beam focusing, fast data collection, paired with spatial statistical analysis, has been used to unveil temporal changes in the myelin structure of dissected nerves following extraction of the Xenopus laevis sciatic nerve. The early myelin degeneration is a specific ordered compacted phase preceding the swollen myelin phase of Wallerian degradation. Our demonstration of the feasibility of the statistical analysis of SµXRD measurements using biological tissue paves the way for further structural investigations of degradation and death of neurons and other cells and tissues in diverse pathological states where nanoscale structural changes may be uncovered.

  19. Identifying drought response of semi-arid aeolian systems using near-surface luminescence profiles and changepoint analysis, Nebraska Sandhills.

    NASA Astrophysics Data System (ADS)

    Buckland, Catherine; Bailey, Richard; Thomas, David

    2017-04-01

    Two billion people living in drylands are affected by land degradation. Sediment erosion by wind and water removes fertile soil and destabilises landscapes. Vegetation disturbance is a key driver of dryland erosion caused by both natural and human forcings: drought, fire, land use, grazing pressure. A quantified understanding of vegetation cover sensitivities and resultant surface change to forcing factors is needed if the vegetation and landscape response to future climate change and human pressure are to be better predicted. Using quartz luminescence dating and statistical changepoint analysis (Killick & Eckley, 2014) this study demonstrates the ability to identify step-changes in depositional age of near-surface sediments. Lx/Tx luminescence profiles coupled with statistical analysis show the use of near-surface sediments in providing a high-resolution record of recent system response and aeolian system thresholds. This research determines how the environment has recorded and retained sedimentary evidence of drought response and land use disturbances over the last two hundred years across both individual landforms and the wider Nebraska Sandhills. Identifying surface deposition and comparing with records of climate, fire and land use changes allows us to assess the sensitivity and stability of the surface sediment to a range of forcing factors. Killick, R and Eckley, IA. (2014) "changepoint: An R Package for Changepoint Analysis." Journal of Statistical Software, (58) 1-19.

  20. Detailed Spectral Analysis of the 260 ks XMM-Newton Data of 1E 1207.4-5209 and Significance of a 2.1 keV Absorption Feature

    NASA Astrophysics Data System (ADS)

    Mori, Kaya; Chonko, James C.; Hailey, Charles J.

    2005-10-01

    We have reanalyzed the 260 ks XMM-Newton observation of 1E 1207.4-5209. There are several significant improvements over previous work. First, a much broader range of physically plausible spectral models was used. Second, we have used a more rigorous statistical analysis. The standard F-distribution was not employed, but rather the exact finite statistics F-distribution was determined by Monte Carlo simulations. This approach was motivated by the recent work of Protassov and coworkers and Freeman and coworkers. They demonstrated that the standard F-distribution is not even asymptotically correct when applied to assess the significance of additional absorption features in a spectrum. With our improved analysis we do not find a third and fourth spectral feature in 1E 1207.4-5209 but only the two broad absorption features previously reported. Two additional statistical tests, one line model dependent and the other line model independent, confirmed our modified F-test analysis. For all physically plausible continuum models in which the weak residuals are strong enough to fit, the residuals occur at the instrument Au M edge. As a sanity check we confirmed that the residuals are consistent in strength and position with the instrument Au M residuals observed in 3C 273.

  1. Statistical process control: separating signal from noise in emergency department operations.

    PubMed

    Pimentel, Laura; Barrueto, Fermin

    2015-05-01

    Statistical process control (SPC) is a visually appealing and statistically rigorous methodology very suitable to the analysis of emergency department (ED) operations. We demonstrate that the control chart is the primary tool of SPC; it is constructed by plotting data measuring the key quality indicators of operational processes in rationally ordered subgroups such as units of time. Control limits are calculated using formulas reflecting the variation in the data points from one another and from the mean. SPC allows managers to determine whether operational processes are controlled and predictable. We review why the moving range chart is most appropriate for use in the complex ED milieu, how to apply SPC to ED operations, and how to determine when performance improvement is needed. SPC is an excellent tool for operational analysis and quality improvement for these reasons: 1) control charts make large data sets intuitively coherent by integrating statistical and visual descriptions; 2) SPC provides analysis of process stability and capability rather than simple comparison with a benchmark; 3) SPC allows distinction between special cause variation (signal), indicating an unstable process requiring action, and common cause variation (noise), reflecting a stable process; and 4) SPC keeps the focus of quality improvement on process rather than individual performance. Because data have no meaning apart from their context, and every process generates information that can be used to improve it, we contend that SPC should be seriously considered for driving quality improvement in emergency medicine. Copyright © 2015 Elsevier Inc. All rights reserved.

  2. Identification of Microorganisms by High Resolution Tandem Mass Spectrometry with Accurate Statistical Significance

    NASA Astrophysics Data System (ADS)

    Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y.; Drake, Steven K.; Gucek, Marjan; Suffredini, Anthony F.; Sacks, David B.; Yu, Yi-Kuo

    2016-02-01

    Correct and rapid identification of microorganisms is the key to the success of many important applications in health and safety, including, but not limited to, infection treatment, food safety, and biodefense. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is challenging correct microbial identification because of the large number of choices present. To properly disentangle candidate microbes, one needs to go beyond apparent morphology or simple `fingerprinting'; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptidome profiles of microbes to better separate them and by designing an analysis method that yields accurate statistical significance. Here, we present an analysis pipeline that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using MS/MS data of 81 samples, each composed of a single known microorganism, that the proposed pipeline can correctly identify microorganisms at least at the genus and species levels. We have also shown that the proposed pipeline computes accurate statistical significances, i.e., E-values for identified peptides and unified E-values for identified microorganisms. The proposed analysis pipeline has been implemented in MiCId, a freely available software for Microorganism Classification and Identification. MiCId is available for download at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.

  3. The relationship between obligatory cortical auditory evoked potentials (CAEPs) and functional measures in young infants.

    PubMed

    Golding, Maryanne; Pearce, Wendy; Seymour, John; Cooper, Alison; Ching, Teresa; Dillon, Harvey

    2007-02-01

    Finding ways to evaluate the success of hearing aid fittings in young infants has increased in importance with the implementation of hearing screening programs. Cortical auditory evoked potentials (CAEP) can be recorded in infants and provides evidence for speech detection at the cortical level. The validity of this technique as a tool of hearing aid evaluation needs, however, to be demonstrated. The present study examined the relationship between the presence/absence of CAEPs to speech stimuli and the outcomes of a parental questionnaire in young infants who were fitted with hearing aids. The presence/absence of responses was determined by an experienced examiner as well as by a statistical measure, Hotelling's T(2). A statistically significant correlation between CAEPs and questionnaire scores was found using the examiner's grading (rs = 0.45) and using the statistical grading (rs = 0.41), and there was reasonably good agreement between traditional response detection methods and the statistical analysis.

  4. Exploratory Analysis of Survey Data for Understanding Adoption of Novel Aerospace Systems

    NASA Astrophysics Data System (ADS)

    Reddy, Lauren M.

    In order to meet the increasing demand for manned and unmanned flight, the air transportation system must constantly evolve. As new technologies or operational procedures are conceived, we must determine their effect on humans in the system. In this research, we introduce a strategy to assess how individuals or organizations would respond to a novel aerospace system. We employ the most appropriate and sophisticated exploratory analysis techniques on the survey data to generate insight and identify significant variables. We employ three different methods for eliciting views from individuals or organizations who are affected by a system: an opinion survey, a stated preference survey, and structured interviews. We conduct an opinion survey of both the general public and stakeholders in the unmanned aircraft industry to assess their knowledge, attitude, and practices regarding unmanned aircraft. We complete a statistical analysis of the multiple-choice questions using multinomial logit and multivariate probit models and conduct qualitative analysis on free-text questions. We next present a stated preference survey of the general public on the use of an unmanned aircraft package delivery service. We complete a statistical analysis of the questions using multinomial logit, ordered probit, linear regression, and negative binomial models. Finally, we discuss structured interviews conducted on stakeholders from ANSPs and airlines operating in the North Atlantic. We describe how these groups may choose to adopt a new technology (space-based ADS-B) or operational procedure (in-trail procedures). We discuss similarities and differences between the stakeholders groups, the benefits and costs of in-trail procedures and space-based ADS-B as reported by the stakeholders, and interdependencies between the groups interviewed. To demonstrate the value of the data we generated, we explore how the findings from the surveys can be used to better characterize uncertainty in the cost-benefit analysis of aerospace systems. We demonstrate how the findings from the opinion and stated preference surveys can be infused into the cost-benefit analysis of an unmanned aircraft delivery system. We also demonstrate how to apply the findings from the interviews to characterize uncertainty in the estimation of the benefits of space-based ADS-B.

  5. CALL versus Paper: In Which Context Are L1 Glosses More Effective?

    ERIC Educational Resources Information Center

    Taylor, Alan M.

    2013-01-01

    CALL glossing in first language (L1) or second language (L2) texts has been shown by previous studies to be more effective than traditional, paper-and-pen L1 glossing. Using a pool of studies with much more statistical power and more accurate results, this meta-analysis demonstrates more precisely the degree to which CALL L1 glossing can be more…

  6. Does My Baby Really Look Like Me? Using Tests for Resemblance between Parent and Child to Teach Topics in Categorical Data Analysis

    ERIC Educational Resources Information Center

    Froelich, Amy G.; Nettleton, Dan

    2013-01-01

    In this article, we present a study to test whether neutral observers perceive a resemblance between a parent and a child. We demonstrate the general approach for two separate parent/ child pairs using survey data collected from introductory statistics students serving as neutral observers. We then present ideas for incorporating the study design…

  7. Statistically advanced, self-similar, radial probability density functions of atmospheric and under-expanded hydrogen jets

    NASA Astrophysics Data System (ADS)

    Ruggles, Adam J.

    2015-11-01

    This paper presents improved statistical insight regarding the self-similar scalar mixing process of atmospheric hydrogen jets and the downstream region of under-expanded hydrogen jets. Quantitative planar laser Rayleigh scattering imaging is used to probe both jets. The self-similarity of statistical moments up to the sixth order (beyond the literature established second order) is documented in both cases. This is achieved using a novel self-similar normalization method that facilitated a degree of statistical convergence that is typically limited to continuous, point-based measurements. This demonstrates that image-based measurements of a limited number of samples can be used for self-similar scalar mixing studies. Both jets exhibit the same radial trends of these moments demonstrating that advanced atmospheric self-similarity can be applied in the analysis of under-expanded jets. Self-similar histograms away from the centerline are shown to be the combination of two distributions. The first is attributed to turbulent mixing. The second, a symmetric Poisson-type distribution centered on zero mass fraction, progressively becomes the dominant and eventually sole distribution at the edge of the jet. This distribution is attributed to shot noise-affected pure air measurements, rather than a diffusive superlayer at the jet boundary. This conclusion is reached after a rigorous measurement uncertainty analysis and inspection of pure air data collected with each hydrogen data set. A threshold based upon the measurement noise analysis is used to separate the turbulent and pure air data, and thusly estimate intermittency. Beta-distributions (four parameters) are used to accurately represent the turbulent distribution moments. This combination of measured intermittency and four-parameter beta-distributions constitutes a new, simple approach to model scalar mixing. Comparisons between global moments from the data and moments calculated using the proposed model show excellent agreement. This was attributed to the high quality of the measurements which reduced the width of the correctly identified, noise-affected pure air distribution, with respect to the turbulent mixing distribution. The ignitability of the atmospheric jet is determined using the flammability factor calculated from both kernel density estimated (KDE) PDFs and PDFs generated using the newly proposed model. Agreement between contours from both approaches is excellent. Ignitability of the under-expanded jet is also calculated using KDE PDFs. Contours are compared with those calculated by applying the atmospheric model to the under-expanded jet. Once again, agreement is excellent. This work demonstrates that self-similar scalar mixing statistics and ignitability of atmospheric jets can be accurately described by the proposed model. This description can be applied with confidence to under-expanded jets, which are more realistic of leak and fuel injection scenarios.

  8. Chemical discrimination of lubricant marketing types using direct analysis in real time time-of-flight mass spectrometry.

    PubMed

    Maric, Mark; Harvey, Lauren; Tomcsak, Maren; Solano, Angelique; Bridge, Candice

    2017-06-30

    In comparison to other violent crimes, sexual assaults suffer from very low prosecution and conviction rates especially in the absence of DNA evidence. As a result, the forensic community needs to utilize other forms of trace contact evidence, like lubricant evidence, in order to provide a link between the victim and the assailant. In this study, 90 personal bottled and condom lubricants from the three main marketing types, silicone-based, water-based and condoms, were characterized by direct analysis in real time time of flight mass spectrometry (DART-TOFMS). The instrumental data was analyzed by multivariate statistics including hierarchal cluster analysis, principal component analysis, and linear discriminant analysis. By interpreting the mass spectral data with multivariate statistics, 12 discrete groupings were identified, indicating inherent chemical diversity not only between but within the three main marketing groups. A number of unique chemical markers, both major and minor, were identified, other than the three main chemical components (i.e. PEG, PDMS and nonoxynol-9) currently used for lubricant classification. The data was validated by a stratified 20% withheld cross-validation which demonstrated that there was minimal overlap between the groupings. Based on the groupings identified and unique features of each group, a highly discriminating statistical model was then developed that aims to provide the foundation for the development of a forensic lubricant database that may eventually be applied to casework. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  9. Significance testing of clinical data using virus dynamics models with a Markov chain Monte Carlo method: application to emergence of lamivudine-resistant hepatitis B virus.

    PubMed Central

    Burroughs, N J; Pillay, D; Mutimer, D

    1999-01-01

    Bayesian analysis using a virus dynamics model is demonstrated to facilitate hypothesis testing of patterns in clinical time-series. Our Markov chain Monte Carlo implementation demonstrates that the viraemia time-series observed in two sets of hepatitis B patients on antiviral (lamivudine) therapy, chronic carriers and liver transplant patients, are significantly different, overcoming clinical trial design differences that question the validity of non-parametric tests. We show that lamivudine-resistant mutants grow faster in transplant patients than in chronic carriers, which probably explains the differences in emergence times and failure rates between these two sets of patients. Incorporation of dynamic models into Bayesian parameter analysis is of general applicability in medical statistics. PMID:10643081

  10. Effect of steady magnetic field on human lymphocytes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mileva, M.; Ivanov, B.; Bulanova, M.

    1983-01-01

    Exposure to steady magnetic field (SMF) for different periods of time did not elicit statistically reliable increase in chromosome aberrations in human peripheral blood lymphocytes. Metaphase analysis of Crepis capilaris cells revealed that SMF (9 k0e, 200 0e/cm) for 2 days did not induce chromosome aberrations. Nor were any changes demonstrated in roots of beans, onions and L-fibroblasts of subcutaneous tissue of mice and Chinese hamsters. The obtained data are indicative of absence of cytogenetic effect of SMF. The level and spectrum of chromosome aberrations did not exceed the values for spontaneous chromatic fragments in cultures. Cytogenetic analysis of DEDEmore » cells of the Chinese hamster revealed a mild mutagenic effect of SMF. Chromosomal aberrations were also demonstrated after exposure (5 min) of garlic roots.« less

  11. Identification of key micro-organisms involved in Douchi fermentation by statistical analysis and their use in an experimental fermentation.

    PubMed

    Chen, C; Xiang, J Y; Hu, W; Xie, Y B; Wang, T J; Cui, J W; Xu, Y; Liu, Z; Xiang, H; Xie, Q

    2015-11-01

    To screen and identify safe micro-organisms used during Douchi fermentation, and verify the feasibility of producing high-quality Douchi using these identified micro-organisms. PCR-denaturing gradient gel electrophoresis (DGGE) and automatic amino-acid analyser were used to investigate the microbial diversity and free amino acids (FAAs) content of 10 commercial Douchi samples. The correlations between microbial communities and FAAs were analysed by statistical analysis. Ten strains with significant positive correlation were identified. Then an experiment on Douchi fermentation by identified strains was carried out, and the nutritional composition in Douchi was analysed. Results showed that FAAs and relative content of isoflavone aglycones in verification Douchi samples were generally higher than those in commercial Douchi samples. Our study indicated that fungi, yeasts, Bacillus and lactic acid bacteria were the key players in Douchi fermentation, and with identified probiotic micro-organisms participating in fermentation, a higher quality Douchi product was produced. This is the first report to analyse and confirm the key micro-organisms during Douchi fermentation by statistical analysis. This work proves fermentation micro-organisms to be the key influencing factor of Douchi quality, and demonstrates the feasibility of fermenting Douchi using identified starter micro-organisms. © 2015 The Society for Applied Microbiology.

  12. Empirical validation of statistical parametric mapping for group imaging of fast neural activity using electrical impedance tomography.

    PubMed

    Packham, B; Barnes, G; Dos Santos, G Sato; Aristovich, K; Gilad, O; Ghosh, A; Oh, T; Holder, D

    2016-06-01

    Electrical impedance tomography (EIT) allows for the reconstruction of internal conductivity from surface measurements. A change in conductivity occurs as ion channels open during neural activity, making EIT a potential tool for functional brain imaging. EIT images can have  >10 000 voxels, which means statistical analysis of such images presents a substantial multiple testing problem. One way to optimally correct for these issues and still maintain the flexibility of complicated experimental designs is to use random field theory. This parametric method estimates the distribution of peaks one would expect by chance in a smooth random field of a given size. Random field theory has been used in several other neuroimaging techniques but never validated for EIT images of fast neural activity, such validation can be achieved using non-parametric techniques. Both parametric and non-parametric techniques were used to analyze a set of 22 images collected from 8 rats. Significant group activations were detected using both techniques (corrected p  <  0.05). Both parametric and non-parametric analyses yielded similar results, although the latter was less conservative. These results demonstrate the first statistical analysis of such an image set and indicate that such an analysis is an approach for EIT images of neural activity.

  13. Empirical validation of statistical parametric mapping for group imaging of fast neural activity using electrical impedance tomography

    PubMed Central

    Packham, B; Barnes, G; dos Santos, G Sato; Aristovich, K; Gilad, O; Ghosh, A; Oh, T; Holder, D

    2016-01-01

    Abstract Electrical impedance tomography (EIT) allows for the reconstruction of internal conductivity from surface measurements. A change in conductivity occurs as ion channels open during neural activity, making EIT a potential tool for functional brain imaging. EIT images can have  >10 000 voxels, which means statistical analysis of such images presents a substantial multiple testing problem. One way to optimally correct for these issues and still maintain the flexibility of complicated experimental designs is to use random field theory. This parametric method estimates the distribution of peaks one would expect by chance in a smooth random field of a given size. Random field theory has been used in several other neuroimaging techniques but never validated for EIT images of fast neural activity, such validation can be achieved using non-parametric techniques. Both parametric and non-parametric techniques were used to analyze a set of 22 images collected from 8 rats. Significant group activations were detected using both techniques (corrected p  <  0.05). Both parametric and non-parametric analyses yielded similar results, although the latter was less conservative. These results demonstrate the first statistical analysis of such an image set and indicate that such an analysis is an approach for EIT images of neural activity. PMID:27203477

  14. Correcting for population structure and kinship using the linear mixed model: theory and extensions.

    PubMed

    Hoffman, Gabriel E

    2013-01-01

    Population structure and kinship are widespread confounding factors in genome-wide association studies (GWAS). It has been standard practice to include principal components of the genotypes in a regression model in order to account for population structure. More recently, the linear mixed model (LMM) has emerged as a powerful method for simultaneously accounting for population structure and kinship. The statistical theory underlying the differences in empirical performance between modeling principal components as fixed versus random effects has not been thoroughly examined. We undertake an analysis to formalize the relationship between these widely used methods and elucidate the statistical properties of each. Moreover, we introduce a new statistic, effective degrees of freedom, that serves as a metric of model complexity and a novel low rank linear mixed model (LRLMM) to learn the dimensionality of the correction for population structure and kinship, and we assess its performance through simulations. A comparison of the results of LRLMM and a standard LMM analysis applied to GWAS data from the Multi-Ethnic Study of Atherosclerosis (MESA) illustrates how our theoretical results translate into empirical properties of the mixed model. Finally, the analysis demonstrates the ability of the LRLMM to substantially boost the strength of an association for HDL cholesterol in Europeans.

  15. Method for Identifying Probable Archaeological Sites from Remotely Sensed Data

    NASA Technical Reports Server (NTRS)

    Tilton, James C.; Comer, Douglas C.; Priebe, Carey E.; Sussman, Daniel

    2011-01-01

    Archaeological sites are being compromised or destroyed at a catastrophic rate in most regions of the world. The best solution to this problem is for archaeologists to find and study these sites before they are compromised or destroyed. One way to facilitate the necessary rapid, wide area surveys needed to find these archaeological sites is through the generation of maps of probable archaeological sites from remotely sensed data. We describe an approach for identifying probable locations of archaeological sites over a wide area based on detecting subtle anomalies in vegetative cover through a statistically based analysis of remotely sensed data from multiple sources. We further developed this approach under a recent NASA ROSES Space Archaeology Program project. Under this project we refined and elaborated this statistical analysis to compensate for potential slight miss-registrations between the remote sensing data sources and the archaeological site location data. We also explored data quantization approaches (required by the statistical analysis approach), and we identified a superior data quantization approached based on a unique image segmentation approach. In our presentation we will summarize our refined approach and demonstrate the effectiveness of the overall approach with test data from Santa Catalina Island off the southern California coast. Finally, we discuss our future plans for further improving our approach.

  16. A change in humidification system can eliminate endotracheal tube occlusion.

    PubMed

    Doyle, Alex; Joshi, Manasi; Frank, Peter; Craven, Thomas; Moondi, Parvez; Young, Peter

    2011-12-01

    Inadequate airway humidification can result in endotracheal tube occlusion. There is evidence that heat and moisture exchangers (HMEs) are more prone to endotracheal tube occlusion than heated humidifiers (HHs) that contain a heated wire circuit. We aimed to compare the incidence of endotracheal tube occlusion while introducing a new dual-heated wire circuit HH in place of an established hydrophobic HME. This was a prospective observational study. All patients who required intubation were included in our analysis. Univariate statistical analysis was performed using a Fisher exact test. P < .05 was considered statistically significant. There were 158 patients in the HME group and 88 patients in the HH group. The incidence of endotracheal tube occlusion was 5.7% in the HME group and 0% in the HH group. Statistical analysis revealed a significant difference between the 2 groups (P = .02). In light of this finding, we changed our practice to provide humidification exclusively by HH. In the subsequent 18-month period, there were no further episodes of endotracheal tube occlusion. Our study demonstrates that there is a significant increase in the incidence of endotracheal tube occlusion when using a hydrophobic HME compared with an HH and that using a dual-heated wire circuit HH can eliminate endotracheal tube occlusion. Copyright © 2011 Elsevier Inc. All rights reserved.

  17. DICON: interactive visual analysis of multidimensional clusters.

    PubMed

    Cao, Nan; Gotz, David; Sun, Jimeng; Qu, Huamin

    2011-12-01

    Clustering as a fundamental data analysis technique has been widely used in many analytic applications. However, it is often difficult for users to understand and evaluate multidimensional clustering results, especially the quality of clusters and their semantics. For large and complex data, high-level statistical information about the clusters is often needed for users to evaluate cluster quality while a detailed display of multidimensional attributes of the data is necessary to understand the meaning of clusters. In this paper, we introduce DICON, an icon-based cluster visualization that embeds statistical information into a multi-attribute display to facilitate cluster interpretation, evaluation, and comparison. We design a treemap-like icon to represent a multidimensional cluster, and the quality of the cluster can be conveniently evaluated with the embedded statistical information. We further develop a novel layout algorithm which can generate similar icons for similar clusters, making comparisons of clusters easier. User interaction and clutter reduction are integrated into the system to help users more effectively analyze and refine clustering results for large datasets. We demonstrate the power of DICON through a user study and a case study in the healthcare domain. Our evaluation shows the benefits of the technique, especially in support of complex multidimensional cluster analysis. © 2011 IEEE

  18. Similar protein expression profiles of ovarian and endometrial high-grade serous carcinomas.

    PubMed

    Hiramatsu, Kosuke; Yoshino, Kiyoshi; Serada, Satoshi; Yoshihara, Kosuke; Hori, Yumiko; Fujimoto, Minoru; Matsuzaki, Shinya; Egawa-Takata, Tomomi; Kobayashi, Eiji; Ueda, Yutaka; Morii, Eiichi; Enomoto, Takayuki; Naka, Tetsuji; Kimura, Tadashi

    2016-03-01

    Ovarian and endometrial high-grade serous carcinomas (HGSCs) have similar clinical and pathological characteristics; however, exhaustive protein expression profiling of these cancers has yet to be reported. We performed protein expression profiling on 14 cases of HGSCs (7 ovarian and 7 endometrial) and 18 endometrioid carcinomas (9 ovarian and 9 endometrial) using iTRAQ-based exhaustive and quantitative protein analysis. We identified 828 tumour-expressed proteins and evaluated the statistical similarity of protein expression profiles between ovarian and endometrial HGSCs using unsupervised hierarchical cluster analysis (P<0.01). Using 45 statistically highly expressed proteins in HGSCs, protein ontology analysis detected two enriched terms and proteins composing each term: IMP2 and MCM2. Immunohistochemical analyses confirmed the higher expression of IMP2 and MCM2 in ovarian and endometrial HGSCs as well as in tubal and peritoneal HGSCs than in endometrioid carcinomas (P<0.01). The knockdown of either IMP2 or MCM2 by siRNA interference significantly decreased the proliferation rate of ovarian HGSC cell line (P<0.01). We demonstrated the statistical similarity of the protein expression profiles of ovarian and endometrial HGSC beyond the organs. We suggest that increased IMP2 and MCM2 expression may underlie some of the rapid HGSC growth observed clinically.

  19. The Statistical Basis of Chemical Equilibria.

    ERIC Educational Resources Information Center

    Hauptmann, Siegfried; Menger, Eva

    1978-01-01

    Describes a machine which demonstrates the statistical bases of chemical equilibrium, and in doing so conveys insight into the connections among statistical mechanics, quantum mechanics, Maxwell Boltzmann statistics, statistical thermodynamics, and transition state theory. (GA)

  20. Spatial analysis on future housing markets: economic development and housing implications.

    PubMed

    Liu, Xin; Wang, Lizhe

    2014-01-01

    A coupled projection method combining formal modelling and other statistical techniques was developed to delineate the relationship between economic and social drivers for net new housing allocations. Using the example of employment growth in Tyne and Wear, UK, until 2016, the empirical analysis yields housing projections at the macro- and microspatial levels (e.g., region to subregion to elected ward levels). The results have important implications for the strategic planning of locations for housing and employment, demonstrating both intuitively and quantitatively how local economic developments affect housing demand.

  1. Spatial Analysis on Future Housing Markets: Economic Development and Housing Implications

    PubMed Central

    Liu, Xin; Wang, Lizhe

    2014-01-01

    A coupled projection method combining formal modelling and other statistical techniques was developed to delineate the relationship between economic and social drivers for net new housing allocations. Using the example of employment growth in Tyne and Wear, UK, until 2016, the empirical analysis yields housing projections at the macro- and microspatial levels (e.g., region to subregion to elected ward levels). The results have important implications for the strategic planning of locations for housing and employment, demonstrating both intuitively and quantitatively how local economic developments affect housing demand. PMID:24892097

  2. Response surface methodology as an approach to determine optimal activities of lipase entrapped in sol-gel matrix using different vegetable oils.

    PubMed

    Pinheiro, Rubiane C; Soares, Cleide M F; de Castro, Heizir F; Moraes, Flavio F; Zanin, Gisella M

    2008-03-01

    The conditions for maximization of the enzymatic activity of lipase entrapped in sol-gel matrix were determined for different vegetable oils using an experimental design. The effects of pH, temperature, and biocatalyst loading on lipase activity were verified using a central composite experimental design leading to a set of 13 assays and the surface response analysis. For canola oil and entrapped lipase, statistical analyses showed significant effects for pH and temperature and also the interactions between pH and temperature and temperature and biocatalyst loading. For the olive oil and entrapped lipase, it was verified that the pH was the only variable statistically significant. This study demonstrated that response surface analysis is a methodology appropriate for the maximization of the percentage of hydrolysis, as a function of pH, temperature, and lipase loading.

  3. Visual analysis of geocoded twin data puts nature and nurture on the map.

    PubMed

    Davis, O S P; Haworth, C M A; Lewis, C M; Plomin, R

    2012-09-01

    Twin studies allow us to estimate the relative contributions of nature and nurture to human phenotypes by comparing the resemblance of identical and fraternal twins. Variation in complex traits is a balance of genetic and environmental influences; these influences are typically estimated at a population level. However, what if the balance of nature and nurture varies depending on where we grow up? Here we use statistical and visual analysis of geocoded data from over 6700 families to show that genetic and environmental contributions to 45 childhood cognitive and behavioral phenotypes vary geographically in the United Kingdom. This has implications for detecting environmental exposures that may interact with the genetic influences on complex traits, and for the statistical power of samples recruited for genetic association studies. More broadly, our experience demonstrates the potential for collaborative exploratory visualization to act as a lingua franca for large-scale interdisciplinary research.

  4. Provenance establishment of coffee using solution ICP-MS and ICP-AES.

    PubMed

    Valentin, Jenna L; Watling, R John

    2013-11-01

    Statistical interpretation of the concentrations of 59 elements, determined using solution based inductively coupled plasma mass spectrometry (ICP-MS) and inductively coupled plasma emission spectroscopy (ICP-AES), was used to establish the provenance of coffee samples from 15 countries across five continents. Data confirmed that the harvest year, degree of ripeness and whether the coffees were green or roasted had little effect on the elemental composition of the coffees. The application of linear discriminant analysis and principal component analysis of the elemental concentrations permitted up to 96.9% correct classification of the coffee samples according to their continent of origin. When samples from each continent were considered separately, up to 100% correct classification of coffee samples into their countries, and plantations of origin was achieved. This research demonstrates the potential of using elemental composition, in combination with statistical classification methods, for accurate provenance establishment of coffee. Copyright © 2013 Elsevier Ltd. All rights reserved.

  5. Statistical mapping of zones of focused groundwater/surface-water exchange using fiber-optic distributed temperature sensing

    USGS Publications Warehouse

    Mwakanyamale, Kisa; Day-Lewis, Frederick D.; Slater, Lee D.

    2013-01-01

    Fiber-optic distributed temperature sensing (FO-DTS) increasingly is used to map zones of focused groundwater/surface-water exchange (GWSWE). Previous studies of GWSWE using FO-DTS involved identification of zones of focused GWSWE based on arbitrary cutoffs of FO-DTS time-series statistics (e.g., variance, cross-correlation between temperature and stage, or spectral power). New approaches are needed to extract more quantitative information from large, complex FO-DTS data sets while concurrently providing an assessment of uncertainty associated with mapping zones of focused GSWSE. Toward this end, we present a strategy combining discriminant analysis (DA) and spectral analysis (SA). We demonstrate the approach using field experimental data from a reach of the Columbia River adjacent to the Hanford 300 Area site. Results of the combined SA/DA approach are shown to be superior to previous results from qualitative interpretation of FO-DTS spectra alone.

  6. Performance analysis of different tuning rules for an isothermal CSTR using integrated EPC and SPC

    NASA Astrophysics Data System (ADS)

    Roslan, A. H.; Karim, S. F. Abd; Hamzah, N.

    2018-03-01

    This paper demonstrates the integration of Engineering Process Control (EPC) and Statistical Process Control (SPC) for the control of product concentration of an isothermal CSTR. The objectives of this study are to evaluate the performance of Ziegler-Nichols (Z-N), Direct Synthesis, (DS) and Internal Model Control (IMC) tuning methods and determine the most effective method for this process. The simulation model was obtained from past literature and re-constructed using SIMULINK MATLAB to evaluate the process response. Additionally, the process stability, capability and normality were analyzed using Process Capability Sixpack reports in Minitab. Based on the results, DS displays the best response for having the smallest rise time, settling time, overshoot, undershoot, Integral Time Absolute Error (ITAE) and Integral Square Error (ISE). Also, based on statistical analysis, DS yields as the best tuning method as it exhibits the highest process stability and capability.

  7. Application of random match probability calculations to mixed STR profiles.

    PubMed

    Bille, Todd; Bright, Jo-Anne; Buckleton, John

    2013-03-01

    Mixed DNA profiles are being encountered more frequently as laboratories analyze increasing amounts of touch evidence. If it is determined that an individual could be a possible contributor to the mixture, it is necessary to perform a statistical analysis to allow an assignment of weight to the evidence. Currently, the combined probability of inclusion (CPI) and the likelihood ratio (LR) are the most commonly used methods to perform the statistical analysis. A third method, random match probability (RMP), is available. This article compares the advantages and disadvantages of the CPI and LR methods to the RMP method. We demonstrate that although the LR method is still considered the most powerful of the binary methods, the RMP and LR methods make similar use of the observed data such as peak height, assumed number of contributors, and known contributors where the CPI calculation tends to waste information and be less informative. © 2013 American Academy of Forensic Sciences.

  8. Anisotropic analysis of trabecular architecture in human femur bone radiographs using quaternion wavelet transforms.

    PubMed

    Sangeetha, S; Sujatha, C M; Manamalli, D

    2014-01-01

    In this work, anisotropy of compressive and tensile strength regions of femur trabecular bone are analysed using quaternion wavelet transforms. The normal and abnormal femur trabecular bone radiographic images are considered for this study. The sub-anatomic regions, which include compressive and tensile regions, are delineated using pre-processing procedures. These delineated regions are subjected to quaternion wavelet transforms and statistical parameters are derived from the transformed images. These parameters are correlated with apparent porosity, which is derived from the strength regions. Further, anisotropy is also calculated from the transformed images and is analyzed. Results show that the anisotropy values derived from second and third phase components of quaternion wavelet transform are found to be distinct for normal and abnormal samples with high statistical significance for both compressive and tensile regions. These investigations demonstrate that architectural anisotropy derived from QWT analysis is able to differentiate normal and abnormal samples.

  9. Inference on network statistics by restricting to the network space: applications to sexual history data.

    PubMed

    Goyal, Ravi; De Gruttola, Victor

    2018-01-30

    Analysis of sexual history data intended to describe sexual networks presents many challenges arising from the fact that most surveys collect information on only a very small fraction of the population of interest. In addition, partners are rarely identified and responses are subject to reporting biases. Typically, each network statistic of interest, such as mean number of sexual partners for men or women, is estimated independently of other network statistics. There is, however, a complex relationship among networks statistics; and knowledge of these relationships can aid in addressing concerns mentioned earlier. We develop a novel method that constrains a posterior predictive distribution of a collection of network statistics in order to leverage the relationships among network statistics in making inference about network properties of interest. The method ensures that inference on network properties is compatible with an actual network. Through extensive simulation studies, we also demonstrate that use of this method can improve estimates in settings where there is uncertainty that arises both from sampling and from systematic reporting bias compared with currently available approaches to estimation. To illustrate the method, we apply it to estimate network statistics using data from the Chicago Health and Social Life Survey. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  10. Development of computer-assisted instruction application for statistical data analysis android platform as learning resource

    NASA Astrophysics Data System (ADS)

    Hendikawati, P.; Arifudin, R.; Zahid, M. Z.

    2018-03-01

    This study aims to design an android Statistics Data Analysis application that can be accessed through mobile devices to making it easier for users to access. The Statistics Data Analysis application includes various topics of basic statistical along with a parametric statistics data analysis application. The output of this application system is parametric statistics data analysis that can be used for students, lecturers, and users who need the results of statistical calculations quickly and easily understood. Android application development is created using Java programming language. The server programming language uses PHP with the Code Igniter framework, and the database used MySQL. The system development methodology used is the Waterfall methodology with the stages of analysis, design, coding, testing, and implementation and system maintenance. This statistical data analysis application is expected to support statistical lecturing activities and make students easier to understand the statistical analysis of mobile devices.

  11. Application of a data-mining method based on Bayesian networks to lesion-deficit analysis

    NASA Technical Reports Server (NTRS)

    Herskovits, Edward H.; Gerring, Joan P.

    2003-01-01

    Although lesion-deficit analysis (LDA) has provided extensive information about structure-function associations in the human brain, LDA has suffered from the difficulties inherent to the analysis of spatial data, i.e., there are many more variables than subjects, and data may be difficult to model using standard distributions, such as the normal distribution. We herein describe a Bayesian method for LDA; this method is based on data-mining techniques that employ Bayesian networks to represent structure-function associations. These methods are computationally tractable, and can represent complex, nonlinear structure-function associations. When applied to the evaluation of data obtained from a study of the psychiatric sequelae of traumatic brain injury in children, this method generates a Bayesian network that demonstrates complex, nonlinear associations among lesions in the left caudate, right globus pallidus, right side of the corpus callosum, right caudate, and left thalamus, and subsequent development of attention-deficit hyperactivity disorder, confirming and extending our previous statistical analysis of these data. Furthermore, analysis of simulated data indicates that methods based on Bayesian networks may be more sensitive and specific for detecting associations among categorical variables than methods based on chi-square and Fisher exact statistics.

  12. Event time analysis of longitudinal neuroimage data.

    PubMed

    Sabuncu, Mert R; Bernal-Rusiel, Jorge L; Reuter, Martin; Greve, Douglas N; Fischl, Bruce

    2014-08-15

    This paper presents a method for the statistical analysis of the associations between longitudinal neuroimaging measurements, e.g., of cortical thickness, and the timing of a clinical event of interest, e.g., disease onset. The proposed approach consists of two steps, the first of which employs a linear mixed effects (LME) model to capture temporal variation in serial imaging data. The second step utilizes the extended Cox regression model to examine the relationship between time-dependent imaging measurements and the timing of the event of interest. We demonstrate the proposed method both for the univariate analysis of image-derived biomarkers, e.g., the volume of a structure of interest, and the exploratory mass-univariate analysis of measurements contained in maps, such as cortical thickness and gray matter density. The mass-univariate method employs a recently developed spatial extension of the LME model. We applied our method to analyze structural measurements computed using FreeSurfer, a widely used brain Magnetic Resonance Image (MRI) analysis software package. We provide a quantitative and objective empirical evaluation of the statistical performance of the proposed method on longitudinal data from subjects suffering from Mild Cognitive Impairment (MCI) at baseline. Copyright © 2014 Elsevier Inc. All rights reserved.

  13. Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies

    PubMed Central

    Vatcheva, Kristina P.; Lee, MinJae; McCormick, Joseph B.; Rahbar, Mohammad H.

    2016-01-01

    The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013, illustrated the need for a greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epidemiologic studies. We used simulated datasets and real life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in the regression analysis and encourage researchers to consider the diagnostic for multicollinearity as one of the steps in regression analysis. PMID:27274911

  14. Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies.

    PubMed

    Vatcheva, Kristina P; Lee, MinJae; McCormick, Joseph B; Rahbar, Mohammad H

    2016-04-01

    The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013, illustrated the need for a greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epidemiologic studies. We used simulated datasets and real life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in the regression analysis and encourage researchers to consider the diagnostic for multicollinearity as one of the steps in regression analysis.

  15. Local coexistence of VO 2 phases revealed by deep data analysis

    DOE PAGES

    Strelcov, Evgheni; Ievlev, Anton; Tselev, Alexander; ...

    2016-07-07

    We report a synergistic approach of micro-Raman spectroscopic mapping and deep data analysis to study the distribution of crystallographic phases and ferroelastic domains in a defected Al-doped VO 2 microcrystal. Bayesian linear unmixing revealed an uneven distribution of the T phase, which is stabilized by the surface defects and uneven local doping that went undetectable by other classical analysis techniques such as PCA and SIMPLISMA. This work demonstrates the impact of information recovery via statistical analysis and full mapping in spectroscopic studies of vanadium dioxide systems, which is commonly substituted by averaging or single point-probing approaches, both of which suffermore » from information misinterpretation due to low resolving power.« less

  16. Spatial distribution and cluster analysis of retail drug shop characteristics and antimalarial behaviors as reported by private medicine retailers in western Kenya: informing future interventions.

    PubMed

    Rusk, Andria; Highfield, Linda; Wilkerson, J Michael; Harrell, Melissa; Obala, Andrew; Amick, Benjamin

    2016-02-19

    Efforts to improve malaria case management in sub-Saharan Africa have shifted focus to private antimalarial retailers to increase access to appropriate treatment. Demands to decrease intervention cost while increasing efficacy requires interventions tailored to geographic regions with demonstrated need. Cluster analysis presents an opportunity to meet this demand, but has not been applied to the retail sector or antimalarial retailer behaviors. This research conducted cluster analysis on medicine retailer behaviors in Kenya, to improve malaria case management and inform future interventions. Ninety-seven surveys were collected from medicine retailers working in the Webuye Health and Demographic Surveillance Site. Survey items included retailer training, education, antimalarial drug knowledge, recommending behavior, sales, and shop characteristics, and were analyzed using Kulldorff's spatial scan statistic. The Bernoulli purely spatial model for binomial data was used, comparing cases to controls. Statistical significance of found clusters was tested with a likelihood ratio test, using the null hypothesis of no clustering, and a p value based on 999 Monte Carlo simulations. The null hypothesis was rejected with p values of 0.05 or less. A statistically significant cluster of fewer than expected pharmacy-trained retailers was found (RR = .09, p = .001) when compared to the expected random distribution. Drug recommending behavior also yielded a statistically significant cluster, with fewer than expected retailers recommending the correct antimalarial medication to adults (RR = .018, p = .01), and fewer than expected shops selling that medication more often than outdated antimalarials when compared to random distribution (RR = 0.23, p = .007). All three of these clusters were co-located, overlapping in the northwest of the study area. Spatial clustering was found in the data. A concerning amount of correlation was found in one specific region in the study area where multiple behaviors converged in space, highlighting a prime target for interventions. These results also demonstrate the utility of applying geospatial methods in the study of medicine retailer behaviors, making the case for expanding this approach to other regions.

  17. Seabed mapping and characterization of sediment variability using the usSEABED data base

    USGS Publications Warehouse

    Goff, J.A.; Jenkins, C.J.; Jeffress, Williams S.

    2008-01-01

    We present a methodology for statistical analysis of randomly located marine sediment point data, and apply it to the US continental shelf portions of usSEABED mean grain size records. The usSEABED database, like many modern, large environmental datasets, is heterogeneous and interdisciplinary. We statistically test the database as a source of mean grain size data, and from it provide a first examination of regional seafloor sediment variability across the entire US continental shelf. Data derived from laboratory analyses ("extracted") and from word-based descriptions ("parsed") are treated separately, and they are compared statistically and deterministically. Data records are selected for spatial analysis by their location within sample regions: polygonal areas defined in ArcGIS chosen by geography, water depth, and data sufficiency. We derive isotropic, binned semivariograms from the data, and invert these for estimates of noise variance, field variance, and decorrelation distance. The highly erratic nature of the semivariograms is a result both of the random locations of the data and of the high level of data uncertainty (noise). This decorrelates the data covariance matrix for the inversion, and largely prevents robust estimation of the fractal dimension. Our comparison of the extracted and parsed mean grain size data demonstrates important differences between the two. In particular, extracted measurements generally produce finer mean grain sizes, lower noise variance, and lower field variance than parsed values. Such relationships can be used to derive a regionally dependent conversion factor between the two. Our analysis of sample regions on the US continental shelf revealed considerable geographic variability in the estimated statistical parameters of field variance and decorrelation distance. Some regional relationships are evident, and overall there is a tendency for field variance to be higher where the average mean grain size is finer grained. Surprisingly, parsed and extracted noise magnitudes correlate with each other, which may indicate that some portion of the data variability that we identify as "noise" is caused by real grain size variability at very short scales. Our analyses demonstrate that by applying a bias-correction proxy, usSEABED data can be used to generate reliable interpolated maps of regional mean grain size and sediment character. 

  18. Massive optimal data compression and density estimation for scalable, likelihood-free inference in cosmology

    NASA Astrophysics Data System (ADS)

    Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen

    2018-07-01

    Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ˜104 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.

  19. Sequential Least-Squares Using Orthogonal Transformations. [spacecraft communication/spacecraft tracking-data smoothing

    NASA Technical Reports Server (NTRS)

    Bierman, G. J.

    1975-01-01

    Square root information estimation, starting from its beginnings in least-squares parameter estimation, is considered. Special attention is devoted to discussions of sensitivity and perturbation matrices, computed solutions and their formal statistics, consider-parameters and consider-covariances, and the effects of a priori statistics. The constant-parameter model is extended to include time-varying parameters and process noise, and the error analysis capabilities are generalized. Efficient and elegant smoothing results are obtained as easy consequences of the filter formulation. The value of the techniques is demonstrated by the navigation results that were obtained for the Mariner Venus-Mercury (Mariner 10) multiple-planetary space probe and for the Viking Mars space mission.

  20. Experimental validation of a distribution theory based analysis of the effect of manufacturing tolerances on permanent magnet synchronous machines

    NASA Astrophysics Data System (ADS)

    Boscaino, V.; Cipriani, G.; Di Dio, V.; Corpora, M.; Curto, D.; Franzitta, V.; Trapanese, M.

    2017-05-01

    An experimental study on the effect of permanent magnet tolerances on the performances of a Tubular Linear Ferrite Motor is presented in this paper. The performances that have been investigated are: cogging force, end effect cogging force and generated thrust. It is demonstrated that: 1) the statistical variability of the magnets introduces harmonics in the spectrum of the cogging force; 2) the value of the end effect cogging force is directly linked to the values of then remanence field of the external magnets placed on the slider; 3) the generated thrust and its statistical distribution depend on the remanence field of the magnets placed on the translator.

  1. The Extent and Consequences of P-Hacking in Science

    PubMed Central

    Head, Megan L.; Holman, Luke; Lanfear, Rob; Kahn, Andrew T.; Jennions, Michael D.

    2015-01-01

    A focus on novel, confirmatory, and statistically significant results leads to substantial bias in the scientific literature. One type of bias, known as “p-hacking,” occurs when researchers collect or select data or statistical analyses until nonsignificant results become significant. Here, we use text-mining to demonstrate that p-hacking is widespread throughout science. We then illustrate how one can test for p-hacking when performing a meta-analysis and show that, while p-hacking is probably common, its effect seems to be weak relative to the real effect sizes being measured. This result suggests that p-hacking probably does not drastically alter scientific consensuses drawn from meta-analyses. PMID:25768323

  2. Andreev Bound States Formation and Quasiparticle Trapping in Quench Dynamics Revealed by Time-Dependent Counting Statistics.

    PubMed

    Souto, R Seoane; Martín-Rodero, A; Yeyati, A Levy

    2016-12-23

    We analyze the quantum quench dynamics in the formation of a phase-biased superconducting nanojunction. We find that in the absence of an external relaxation mechanism and for very general conditions the system gets trapped in a metastable state, corresponding to a nonequilibrium population of the Andreev bound states. The use of the time-dependent full counting statistics analysis allows us to extract information on the asymptotic population of even and odd many-body states, demonstrating that a universal behavior, dependent only on the Andreev state energy, is reached in the quantum point contact limit. These results shed light on recent experimental observations on quasiparticle trapping in superconducting atomic contacts.

  3. Diffraction based Hanbury Brown and Twiss interferometry at a hard x-ray free-electron laser

    DOE PAGES

    Gorobtsov, O. Yu.; Mukharamova, N.; Lazarev, S.; ...

    2018-02-02

    X-ray free-electron lasers (XFELs) provide extremely bright and highly spatially coherent x-ray radiation with femtosecond pulse duration. Currently, they are widely used in biology and material science. Knowledge of the XFEL statistical properties during an experiment may be vitally important for the accurate interpretation of the results. Here, for the first time, we demonstrate Hanbury Brown and Twiss (HBT) interferometry performed in diffraction mode at an XFEL source. It allowed us to determine the XFEL statistical properties directly from the Bragg peaks originating from colloidal crystals. This approach is different from the traditional one when HBT interferometry is performed inmore » the direct beam without a sample. Our analysis has demonstrated nearly full (80%) global spatial coherence of the XFEL pulses and an average pulse duration on the order of ten femtoseconds for the monochromatized beam, which is significantly shorter than expected from the electron bunch measurements.« less

  4. Neighborhood archetypes for population health research: is there no place like home?

    PubMed

    Weden, Margaret M; Bird, Chloe E; Escarce, José J; Lurie, Nicole

    2011-01-01

    This study presents a new, latent archetype approach for studying place in population health. Latent class analysis is used to show how the number, defining attributes, and change/stability of neighborhood archetypes can be characterized and tested for statistical significance. The approach is demonstrated using data on contextual determinants of health for US neighborhoods defined by census tracts in 1990 and 2000. Six archetypes (prevalence 13-20%) characterize the statistically significant combinations of contextual determinants of health from the social environment, built environment, commuting and migration patterns, and demographics and household composition of US neighborhoods. Longitudinal analyses based on the findings demonstrate notable stability (76.4% of neighborhoods categorized as the same archetype ten years later), with exceptions reflecting trends in (ex)urbanization, gentrification/downgrading, and racial/ethnic reconfiguration. The findings and approach is applicable to both research and practice (e.g. surveillance) and can be scaled up or down to study health and place in other geographical contexts or historical periods. Copyright © 2010 Elsevier Ltd. All rights reserved.

  5. Diffraction based Hanbury Brown and Twiss interferometry at a hard x-ray free-electron laser

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gorobtsov, O. Yu.; Mukharamova, N.; Lazarev, S.

    X-ray free-electron lasers (XFELs) provide extremely bright and highly spatially coherent x-ray radiation with femtosecond pulse duration. Currently, they are widely used in biology and material science. Knowledge of the XFEL statistical properties during an experiment may be vitally important for the accurate interpretation of the results. Here, for the first time, we demonstrate Hanbury Brown and Twiss (HBT) interferometry performed in diffraction mode at an XFEL source. It allowed us to determine the XFEL statistical properties directly from the Bragg peaks originating from colloidal crystals. This approach is different from the traditional one when HBT interferometry is performed inmore » the direct beam without a sample. Our analysis has demonstrated nearly full (80%) global spatial coherence of the XFEL pulses and an average pulse duration on the order of ten femtoseconds for the monochromatized beam, which is significantly shorter than expected from the electron bunch measurements.« less

  6. Application of multislice spiral CT for guidance of insertion of thoracic spine pedicle screws: an in vitro study.

    PubMed

    Wang, Juan; Zhou, Yicheng; Hu, Ning; Wang, Renfa

    2006-01-01

    To investigate the value of the guidance of three dimensional (3-D) reconstruction of multi-slice spiral CT (MSCT) for the placement of pedicle screws, the 3-D anatomical data of the thoracic pedicles were measured by MSCT in two embalmed human cadaveric thoracic pedicles spines (T1-T10) to guide the insertion of pedicle screws. After pulling the screws out, the pathways were filled with contrast media. The PW, PH, TSA and SSA of developed pathways were measured on the CT images and they were also measured on the real objects by caliper and goniometer. Analysis of variance demonstrated that the difference between the CT scans and real objects had no statistical significance (P > 0.05). Moreover, the difference between pedicle axis and developed pathway also had no statistical significance (P > 0.05). The data obtained from 3-D reconstruction of MSCT demonstrated that individualized standards, are not only accurate but also helpful for the successful placement of pedicle screws.

  7. Thermoelastic Stress Analysis: The Mean Stress Effect in Metallic Alloys

    NASA Technical Reports Server (NTRS)

    Gyekenyesi, Andrew L.; Baaklini, George Y.

    1999-01-01

    The primary objective of this study involved the utilization of the thermoelastic stress analysis (TSA) method to demonstrate the mean stress dependence of the thermoelastic constant. Titanium and nickel base alloys, commonly employed in aerospace gas turbines, were the materials of interest. The repeatability of the results was studied through a statistical analysis of the data. Although the mean stress dependence was well established, the ability to confidently quantify it was diminished by the experimental variations. If calibration of the thermoelastic response to mean stress can be successfully implemented, it is feasible to use the relationship to determine a structure's residual stress state.

  8. A Statistical Assessment of Information, Knowledge and Attitudes of Medical Students Regarding Contraception Use.

    PubMed

    Simionescu, Anca A; Horobet, Alexandra; Belascu, Lucian

    2017-12-01

    To evaluate how contraception use is linked to information, knowledge and attitudes towards family planning and contraception of medical students. This is a voluntary cross-sectional study using an anonymous questionnaire applied to 62 medical students. The questionnaire had the following main structure: characteristics of the studied population, information on contraception, knowledge about contraception methods, attitudes regarding family planning and contraception, and contraception use. Statistical analysis was performed using STATISTICA 8.0 software and statistical significance of the data was verified using the t-statistic test. The survey had a 95% response rate. Seventy seven percent of the studied population consisted of females aged between 20-40 years, with 85.50% of them being 20-25 years old. The overwhelming majority of respondents believed it was important to be informed on the subject and considered themselves to be well informed on contraception. The internet and courses are the main sources of information. Of all respondents, 75.41% had routine discussions with their partners regarding contraception, 53.23% talked about it with family members and 46.77% with their physician; 90.16% had at least one gynecological examination and 47.54% got themselves tested for sexually transmitted diseases. The condom and the contraceptive pill were the main contraceptive methods for the respondents. Romanian medical students share similar features to their peers in European developed countries. We used a statistical analysis to demonstrate that information, knowledge and attitudes on contraception are closely linked to contraceptive choice.

  9. Senior Computational Scientist | Center for Cancer Research

    Cancer.gov

    The Basic Science Program (BSP) pursues independent, multidisciplinary research in basic and applied molecular biology, immunology, retrovirology, cancer biology, and human genetics. Research efforts and support are an integral part of the Center for Cancer Research (CCR) at the Frederick National Laboratory for Cancer Research (FNLCR). The Cancer & Inflammation Program (CIP), Basic Science Program, HLA Immunogenetics Section, under the leadership of Dr. Mary Carrington, studies the influence of human leukocyte antigens (HLA) and specific KIR/HLA genotypes on risk of and outcomes to infection, cancer, autoimmune disease, and maternal-fetal disease. Recent studies have focused on the impact of HLA gene expression in disease, the molecular mechanism regulating expression levels, and the functional basis for the effect of differential expression on disease outcome. The lab’s further focus is on the genetic basis for resistance/susceptibility to disease conferred by immunogenetic variation. KEY ROLES/RESPONSIBILITIES The Senior Computational Scientist will provide research support to the CIP-BSP-HLA Immunogenetics Section performing bio-statistical design, analysis and reporting of research projects conducted in the lab. This individual will be involved in the implementation of statistical models and data preparation. Successful candidate should have 5 or more years of competent, innovative biostatistics/bioinformatics research experience, beyond doctoral training Considerable experience with statistical software, such as SAS, R and S-Plus Sound knowledge, and demonstrated experience of theoretical and applied statistics Write program code to analyze data using statistical analysis software Contribute to the interpretation and publication of research results

  10. Tacrolimus in the treatment of myasthenia gravis in patients with an inadequate response to glucocorticoid therapy: randomized, double-blind, placebo-controlled study conducted in China.

    PubMed

    Zhou, Lei; Liu, Weibin; Li, Wei; Li, Haifeng; Zhang, Xu; Shang, Huifang; Zhang, Xu; Bu, Bitao; Deng, Hui; Fang, Qi; Li, Jimei; Zhang, Hua; Song, Zhi; Ou, Changyi; Yan, Chuanzhu; Liu, Tao; Zhou, Hongyu; Bao, Jianhong; Lu, Jiahong; Shi, Huawei; Zhao, Chongbo

    2017-09-01

    To determine the efficacy of low-dose, immediate-release tacrolimus in patients with myasthenia gravis (MG) with inadequate response to glucocorticoid therapy in a randomized, double-blind, placebo-controlled study. Eligible patients had inadequate response to glucocorticoids (GCs) after ⩾6 weeks of treatment with prednisone ⩾0.75 mg/kg/day or 60-100 mg/day. Patients were randomized to receive 3 mg tacrolimus or placebo daily (orally) for 24 weeks. Concomitant glucocorticoids and pyridostigmine were allowed. Patients continued GC therapy from weeks 1-4; from week 5, the dose was decreased at the discretion of the investigator. The primary efficacy outcome measure was a reduction, relative to baseline, in quantitative myasthenia gravis (QMG) score assessed using a generalized linear model; supportive analyses used alternative models. Of 138 patients screened, 83 [tacrolimus ( n = 45); placebo ( n = 38)] were enrolled and treated. The change in adjusted mean QMG score from baseline to week 24 was -4.9 for tacrolimus and -3.3 for placebo (least squares mean difference: -1.7, 95% confidence interval: -3.5, -0.1; p = 0.067). A post-hoc analysis demonstrated a statistically significant difference for QMG score reduction of ⩾4 points in the tacrolimus group (68.2%) versus the placebo group (44.7%; p = 0.044). Adverse event profiles were similar between treatment groups. Tacrolimus 3 mg treatment for patients with MG and inadequate response to GCs did not demonstrate a statistically significant improvement in the primary endpoint versus placebo over 24 weeks; however, a post-hoc analysis demonstrated a statistically significant difference for QMG score reduction of ⩾4 points in the tacrolimus group versus the placebo group. This study was limited by the low number of patients, the absence of testing for acetylcholine receptor antibody and the absence of stratification by disease duration (which led to a disparity between the two groups). ClinicalTrials.gov identifier: NCT01325571.

  11. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics.

    PubMed

    Giambartolomei, Claudia; Vukcevic, Damjan; Schadt, Eric E; Franke, Lude; Hingorani, Aroon D; Wallace, Chris; Plagnol, Vincent

    2014-05-01

    Genetic association studies, in particular the genome-wide association study (GWAS) design, have provided a wealth of novel insights into the aetiology of a wide range of human diseases and traits, in particular cardiovascular diseases and lipid biomarkers. The next challenge consists of understanding the molecular basis of these associations. The integration of multiple association datasets, including gene expression datasets, can contribute to this goal. We have developed a novel statistical methodology to assess whether two association signals are consistent with a shared causal variant. An application is the integration of disease scans with expression quantitative trait locus (eQTL) studies, but any pair of GWAS datasets can be integrated in this framework. We demonstrate the value of the approach by re-analysing a gene expression dataset in 966 liver samples with a published meta-analysis of lipid traits including >100,000 individuals of European ancestry. Combining all lipid biomarkers, our re-analysis supported 26 out of 38 reported colocalisation results with eQTLs and identified 14 new colocalisation results, hence highlighting the value of a formal statistical test. In three cases of reported eQTL-lipid pairs (SYPL2, IFT172, TBKBP1) for which our analysis suggests that the eQTL pattern is not consistent with the lipid association, we identify alternative colocalisation results with SORT1, GCKR, and KPNB1, indicating that these genes are more likely to be causal in these genomic intervals. A key feature of the method is the ability to derive the output statistics from single SNP summary statistics, hence making it possible to perform systematic meta-analysis type comparisons across multiple GWAS datasets (implemented online at http://coloc.cs.ucl.ac.uk/coloc/). Our methodology provides information about candidate causal genes in associated intervals and has direct implications for the understanding of complex diseases as well as the design of drugs to target disease pathways.

  12. Guidelines for the design and statistical analysis of experiments in papers submitted to ATLA.

    PubMed

    Festing, M F

    2001-01-01

    In vitro experiments need to be well designed and correctly analysed if they are to achieve their full potential to replace the use of animals in research. An "experiment" is a procedure for collecting scientific data in order to answer a hypothesis, or to provide material for generating new hypotheses, and differs from a survey because the scientist has control over the treatments that can be applied. Most experiments can be classified into one of a few formal designs, the most common being completely randomised, and randomised block designs. These are quite common with in vitro experiments, which are often replicated in time. Some experiments involve a single independent (treatment) variable, while other "factorial" designs simultaneously vary two or more independent variables, such as drug treatment and cell line. Factorial designs often provide additional information at little extra cost. Experiments need to be carefully planned to avoid bias, be powerful yet simple, provide for a valid statistical analysis and, in some cases, have a wide range of applicability. Virtually all experiments need some sort of statistical analysis in order to take account of biological variation among the experimental subjects. Parametric methods using the t test or analysis of variance are usually more powerful than non-parametric methods, provided the underlying assumptions of normality of the residuals and equal variances are approximately valid. The statistical analyses of data from a completely randomised design, and from a randomised-block design are demonstrated in Appendices 1 and 2, and methods of determining sample size are discussed in Appendix 3. Appendix 4 gives a checklist for authors submitting papers to ATLA.

  13. Integrated design and manufacturing for the high speed civil transport (a combined aerodynamics/propulsion optimization study)

    NASA Technical Reports Server (NTRS)

    Baecher, Juergen; Bandte, Oliver; DeLaurentis, Dan; Lewis, Kemper; Sicilia, Jose; Soboleski, Craig

    1995-01-01

    This report documents the efforts of a Georgia Tech High Speed Civil Transport (HSCT) aerospace student design team in completing a design methodology demonstration under NASA's Advanced Design Program (ADP). Aerodynamic and propulsion analyses are integrated into the synthesis code FLOPS in order to improve its prediction accuracy. Executing the integrated product and process development (IPPD) methodology proposed at the Aerospace Systems Design Laboratory (ASDL), an improved sizing process is described followed by a combined aero-propulsion optimization, where the objective function, average yield per revenue passenger mile ($/RPM), is constrained by flight stability, noise, approach speed, and field length restrictions. Primary goals include successful demonstration of the application of the response surface methodolgy (RSM) to parameter design, introduction to higher fidelity disciplinary analysis than normally feasible at the conceptual and early preliminary level, and investigations of relationships between aerodynamic and propulsion design parameters and their effect on the objective function, $/RPM. A unique approach to aircraft synthesis is developed in which statistical methods, specifically design of experiments and the RSM, are used to more efficiently search the design space for optimum configurations. In particular, two uses of these techniques are demonstrated. First, response model equations are formed which represent complex analysis in the form of a regression polynomial. Next, a second regression equation is constructed, not for modeling purposes, but instead for the purpose of optimization at the system level. Such an optimization problem with the given tools normally would be difficult due to the need for hard connections between the various complex codes involved. The statistical methodology presents an alternative and is demonstrated via an example of aerodynamic modeling and planform optimization for a HSCT.

  14. Combining statistical inference and decisions in ecology.

    PubMed

    Williams, Perry J; Hooten, Mevin B

    2016-09-01

    Statistical decision theory (SDT) is a sub-field of decision theory that formally incorporates statistical investigation into a decision-theoretic framework to account for uncertainties in a decision problem. SDT provides a unifying analysis of three types of information: statistical results from a data set, knowledge of the consequences of potential choices (i.e., loss), and prior beliefs about a system. SDT links the theoretical development of a large body of statistical methods, including point estimation, hypothesis testing, and confidence interval estimation. The theory and application of SDT have mainly been developed and published in the fields of mathematics, statistics, operations research, and other decision sciences, but have had limited exposure in ecology. Thus, we provide an introduction to SDT for ecologists and describe its utility for linking the conventionally separate tasks of statistical investigation and decision making in a single framework. We describe the basic framework of both Bayesian and frequentist SDT, its traditional use in statistics, and discuss its application to decision problems that occur in ecology. We demonstrate SDT with two types of decisions: Bayesian point estimation and an applied management problem of selecting a prescribed fire rotation for managing a grassland bird species. Central to SDT, and decision theory in general, are loss functions. Thus, we also provide basic guidance and references for constructing loss functions for an SDT problem. © 2016 by the Ecological Society of America.

  15. The Intensity, Directionality, and Statistics of Underwater Noise From Melting Icebergs

    NASA Astrophysics Data System (ADS)

    Glowacki, Oskar; Deane, Grant B.; Moskalik, Mateusz

    2018-05-01

    Freshwater fluxes from melting icebergs and glaciers are important contributors to both sea level rise and anomalies of seawater salinity in polar regions. However, the hazards encountered close to icebergs and glaciers make it difficult to quantify their melt rates directly, motivating the development of cryoacoustics as a remote sensing technique. Recent studies have shown a qualitative link between ice melting and the accompanying underwater noise, but the properties of this signal remain poorly understood. Here we examine the intensity, directionality, and temporal statistics of the underwater noise radiated by melting icebergs in Hornsund Fjord, Svalbard, using a three-element acoustic array. We present the first estimate of noise energy per unit area associated with iceberg melt and demonstrate its qualitative dependence on exposure to surface current. Finally, we show that the analysis of noise directionality and statistics makes it possible to distinguish iceberg melt from the glacier terminus melt.

  16. Statistical power comparisons at 3T and 7T with a GO / NOGO task.

    PubMed

    Torrisi, Salvatore; Chen, Gang; Glen, Daniel; Bandettini, Peter A; Baker, Chris I; Reynolds, Richard; Yen-Ting Liu, Jeffrey; Leshin, Joseph; Balderston, Nicholas; Grillon, Christian; Ernst, Monique

    2018-07-15

    The field of cognitive neuroscience is weighing evidence about whether to move from standard field strength to ultra-high field (UHF). The present study contributes to the evidence by comparing a cognitive neuroscience paradigm at 3 Tesla (3T) and 7 Tesla (7T). The goal was to test and demonstrate the practical effects of field strength on a standard GO/NOGO task using accessible preprocessing and analysis tools. Two independent matched healthy samples (N = 31 each) were analyzed at 3T and 7T. Results show gains at 7T in statistical strength, the detection of smaller effects and group-level power. With an increased availability of UHF scanners, these gains may be exploited by cognitive neuroscientists and other neuroimaging researchers to develop more efficient or comprehensive experimental designs and, given the same sample size, achieve greater statistical power at 7T. Published by Elsevier Inc.

  17. US Geological Survey nutrient preservation experiment : experimental design, statistical analysis, and interpretation of analytical results

    USGS Publications Warehouse

    Patton, Charles J.; Gilroy, Edward J.

    1999-01-01

    Data on which this report is based, including nutrient concentrations in synthetic reference samples determined concurrently with those in real samples, are extensive (greater than 20,000 determinations) and have been published separately. In addition to confirming the well-documented instability of nitrite in acidified samples, this study also demonstrates that when biota are removed from samples at collection sites by 0.45-micrometer membrane filtration, subsequent preservation with sulfuric acid or mercury (II) provides no statistically significant improvement in nutrient concentration stability during storage at 4 degrees Celsius for 30 days. Biocide preservation had no statistically significant effect on the 30-day stability of phosphorus concentrations in whole-water splits from any of the 15 stations, but did stabilize Kjeldahl nitrogen concentrations in whole-water splits from three data-collection stations where ammonium accounted for at least half of the measured Kjeldahl nitrogen.

  18. Geostatistics and GIS: tools for characterizing environmental contamination.

    PubMed

    Henshaw, Shannon L; Curriero, Frank C; Shields, Timothy M; Glass, Gregory E; Strickland, Paul T; Breysse, Patrick N

    2004-08-01

    Geostatistics is a set of statistical techniques used in the analysis of georeferenced data that can be applied to environmental contamination and remediation studies. In this study, the 1,1-dichloro-2,2-bis(p-chlorophenyl)ethylene (DDE) contamination at a Superfund site in western Maryland is evaluated. Concern about the site and its future clean up has triggered interest within the community because residential development surrounds the area. Spatial statistical methods, of which geostatistics is a subset, are becoming increasingly popular, in part due to the availability of geographic information system (GIS) software in a variety of application packages. In this article, the joint use of ArcGIS software and the R statistical computing environment are demonstrated as an approach for comprehensive geostatistical analyses. The spatial regression method, kriging, is used to provide predictions of DDE levels at unsampled locations both within the site and the surrounding areas where residential development is ongoing.

  19. A SAS macro for testing differences among three or more independent groups using Kruskal-Wallis and Nemenyi tests.

    PubMed

    Liu, Yuewei; Chen, Weihong

    2012-02-01

    As a nonparametric method, the Kruskal-Wallis test is widely used to compare three or more independent groups when an ordinal or interval level of data is available, especially when the assumptions of analysis of variance (ANOVA) are not met. If the Kruskal-Wallis statistic is statistically significant, Nemenyi test is an alternative method for further pairwise multiple comparisons to locate the source of significance. Unfortunately, most popular statistical packages do not integrate the Nemenyi test, which is not easy to be calculated by hand. We described the theory and applications of the Kruskal-Wallis and Nemenyi tests, and presented a flexible SAS macro to implement the two tests. The SAS macro was demonstrated by two examples from our cohort study in occupational epidemiology. It provides a useful tool for SAS users to test the differences among three or more independent groups using a nonparametric method.

  20. A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data

    PubMed Central

    Chen, Yi-Hau

    2017-01-01

    Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data stays difficult because of limited sample size. This limitation also leads to the practice of using a competitive null as common approach; which fundamentally implies genes or proteins as independent units. The independent assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, sample covariance may not be a precise estimation if the sample size is very limited, which is usually the case for the data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed by the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways of sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication. We implemented the T2-statistic into an R package T2GA, which is available at https://github.com/roqe/T2GA. PMID:28622336

  1. A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data.

    PubMed

    Lai, En-Yu; Chen, Yi-Hau; Wu, Kun-Pin

    2017-06-01

    Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data stays difficult because of limited sample size. This limitation also leads to the practice of using a competitive null as common approach; which fundamentally implies genes or proteins as independent units. The independent assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, sample covariance may not be a precise estimation if the sample size is very limited, which is usually the case for the data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed by the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways of sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication. We implemented the T2-statistic into an R package T2GA, which is available at https://github.com/roqe/T2GA.

  2. SOCR Analyses – an Instructional Java Web-based Statistical Analysis Toolkit

    PubMed Central

    Chu, Annie; Cui, Jenny; Dinov, Ivo D.

    2011-01-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most updated information and newly added models. PMID:21546994

  3. Quantifying Individual Brain Connectivity with Functional Principal Component Analysis for Networks.

    PubMed

    Petersen, Alexander; Zhao, Jianyang; Carmichael, Owen; Müller, Hans-Georg

    2016-09-01

    In typical functional connectivity studies, connections between voxels or regions in the brain are represented as edges in a network. Networks for different subjects are constructed at a given graph density and are summarized by some network measure such as path length. Examining these summary measures for many density values yields samples of connectivity curves, one for each individual. This has led to the adoption of basic tools of functional data analysis, most commonly to compare control and disease groups through the average curves in each group. Such group differences, however, neglect the variability in the sample of connectivity curves. In this article, the use of functional principal component analysis (FPCA) is demonstrated to enrich functional connectivity studies by providing increased power and flexibility for statistical inference. Specifically, individual connectivity curves are related to individual characteristics such as age and measures of cognitive function, thus providing a tool to relate brain connectivity with these variables at the individual level. This individual level analysis opens a new perspective that goes beyond previous group level comparisons. Using a large data set of resting-state functional magnetic resonance imaging scans, relationships between connectivity and two measures of cognitive function-episodic memory and executive function-were investigated. The group-based approach was implemented by dichotomizing the continuous cognitive variable and testing for group differences, resulting in no statistically significant findings. To demonstrate the new approach, FPCA was implemented, followed by linear regression models with cognitive scores as responses, identifying significant associations of connectivity in the right middle temporal region with both cognitive scores.

  4. Satellite temperature monitoring and prediction system

    NASA Technical Reports Server (NTRS)

    Barnett, U. R.; Martsolf, J. D.; Crosby, F. L.

    1980-01-01

    The paper describes the Florida Satellite Freeze Forecast System (SFFS) in its current state. All data collection options have been demonstrated, and data collected over a three year period have been stored for future analysis. Presently, specific minimum temperature forecasts are issued routinely from November through March. The procedures for issuing these forecast are discussed. The automated data acquisition and processing system is described, and the physical and statistical models employed are examined.

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kreuzer-Martin, Helen W.; Hegg, Eric L.

    The use of isotopic signatures for forensic analysis of biological materials is well-established, and the same general principles that apply to interpretation of stable isotope content of C, N, O, and H apply to the analysis of microorganisms. Heterotrophic microorganisms derive their isotopic content from their growth substrates, which are largely plant and animal products, and the water in their culture medium. Thus the isotope signatures of microbes are tied to their growth environment. The C, N, O, and H isotope ratios of spores have been demonstrated to constitute highly discriminating signatures for sample matching. They can rule out specificmore » samples of media and/or water as possible production media, and can predict isotope ratio ranges of the culture media and water used to produce a given sample. These applications have been developed and tested through analyses of approximately 250 samples of Bacillus subtilis spores and over 500 samples of culture media, providing a strong statistical basis for data interpretation. A Bayesian statistical framework for integrating stable isotope data with other types of signatures derived from microorganisms has been able to characterize the culture medium used to produce spores of various Bacillus species, leveraging isotopic differences in different medium types and demonstrating the power of data integration for forensic investigations.« less

  6. Empirical test of the performance of an acoustic-phonetic approach to forensic voice comparison under conditions similar to those of a real case.

    PubMed

    Enzinger, Ewald; Morrison, Geoffrey Stewart

    2017-08-01

    In a 2012 case in New South Wales, Australia, the identity of a speaker on several audio recordings was in question. Forensic voice comparison testimony was presented based on an auditory-acoustic-phonetic-spectrographic analysis. No empirical demonstration of the validity and reliability of the analytical methodology was presented. Unlike the admissibility standards in some other jurisdictions (e.g., US Federal Rule of Evidence 702 and the Daubert criteria, or England & Wales Criminal Practice Directions 19A), Australia's Unified Evidence Acts do not require demonstration of the validity and reliability of analytical methods and their implementation before testimony based upon them is presented in court. The present paper reports on empirical tests of the performance of an acoustic-phonetic-statistical forensic voice comparison system which exploited the same features as were the focus of the auditory-acoustic-phonetic-spectrographic analysis in the case, i.e., second-formant (F2) trajectories in /o/ tokens and mean fundamental frequency (f0). The tests were conducted under conditions similar to those in the case. The performance of the acoustic-phonetic-statistical system was very poor compared to that of an automatic system. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Improvement of IFNγ ELISPOT Performance Following Overnight Resting of Frozen PBMC Samples Confirmed Through Rigorous Statistical Analysis

    PubMed Central

    Santos, Radleigh; Buying, Alcinette; Sabri, Nazila; Yu, John; Gringeri, Anthony; Bender, James; Janetzki, Sylvia; Pinilla, Clemencia; Judkowski, Valeria A.

    2014-01-01

    Immune monitoring of functional responses is a fundamental parameter to establish correlates of protection in clinical trials evaluating vaccines and therapies to boost antigen-specific responses. The IFNγ ELISPOT assay is a well-standardized and validated method for the determination of functional IFNγ-producing T-cells in peripheral blood mononuclear cells (PBMC); however, its performance greatly depends on the quality and integrity of the cryopreserved PBMC. Here, we investigate the effect of overnight (ON) resting of the PBMC on the detection of CD8-restricted peptide-specific responses by IFNγ ELISPOT. The study used PBMC from healthy donors to evaluate the CD8 T-cell response to five pooled or individual HLA-A2 viral peptides. The results were analyzed using a modification of the existing distribution free resampling (DFR) recommended for the analysis of ELISPOT data to ensure the most rigorous possible standard of significance. The results of the study demonstrate that ON resting of PBMC samples prior to IFNγ ELISPOT increases both the magnitude and the statistical significance of the responses. In addition, a comparison of the results with a 13-day preculture of PBMC with the peptides before testing demonstrates that ON resting is sufficient for the efficient evaluation of immune functioning. PMID:25546016

  8. Descriptive data analysis.

    PubMed

    Thompson, Cheryl Bagley

    2009-01-01

    This 13th article of the Basics of Research series is first in a short series on statistical analysis. These articles will discuss creating your statistical analysis plan, levels of measurement, descriptive statistics, probability theory, inferential statistics, and general considerations for interpretation of the results of a statistical analysis.

  9. Temporal and spatial assessment of river surface water quality using multivariate statistical techniques: a study in Can Tho City, a Mekong Delta area, Vietnam.

    PubMed

    Phung, Dung; Huang, Cunrui; Rutherford, Shannon; Dwirahmadi, Febi; Chu, Cordia; Wang, Xiaoming; Nguyen, Minh; Nguyen, Nga Huy; Do, Cuong Manh; Nguyen, Trung Hieu; Dinh, Tuan Anh Diep

    2015-05-01

    The present study is an evaluation of temporal/spatial variations of surface water quality using multivariate statistical techniques, comprising cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA). Eleven water quality parameters were monitored at 38 different sites in Can Tho City, a Mekong Delta area of Vietnam from 2008 to 2012. Hierarchical cluster analysis grouped the 38 sampling sites into three clusters, representing mixed urban-rural areas, agricultural areas and industrial zone. FA/PCA resulted in three latent factors for the entire research location, three for cluster 1, four for cluster 2, and four for cluster 3 explaining 60, 60.2, 80.9, and 70% of the total variance in the respective water quality. The varifactors from FA indicated that the parameters responsible for water quality variations are related to erosion from disturbed land or inflow of effluent from sewage plants and industry, discharges from wastewater treatment plants and domestic wastewater, agricultural activities and industrial effluents, and contamination by sewage waste with faecal coliform bacteria through sewer and septic systems. Discriminant analysis (DA) revealed that nephelometric turbidity units (NTU), chemical oxygen demand (COD) and NH₃ are the discriminating parameters in space, affording 67% correct assignation in spatial analysis; pH and NO₂ are the discriminating parameters according to season, assigning approximately 60% of cases correctly. The findings suggest a possible revised sampling strategy that can reduce the number of sampling sites and the indicator parameters responsible for large variations in water quality. This study demonstrates the usefulness of multivariate statistical techniques for evaluation of temporal/spatial variations in water quality assessment and management.

  10. Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis.

    PubMed

    Bonham-Carter, Oliver; Steele, Joe; Bastola, Dhundy

    2014-11-01

    Modern sequencing and genome assembly technologies have provided a wealth of data, which will soon require an analysis by comparison for discovery. Sequence alignment, a fundamental task in bioinformatics research, may be used but with some caveats. Seminal techniques and methods from dynamic programming are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. These methods are prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These new methods are often preferred, as their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools, which stem from alignment-free methods based on statistical analysis from word frequencies. We provide several clear examples to demonstrate applications and the interpretations over several different areas of alignment-free analysis such as base-base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic metric. Additionally, we provide detailed discussion and an example of analysis by Lempel-Ziv techniques from data compression. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  11. Forecasting volatility with neural regression: a contribution to model adequacy.

    PubMed

    Refenes, A N; Holt, W T

    2001-01-01

    Neural nets' usefulness for forecasting is limited by problems of overfitting and the lack of rigorous procedures for model identification, selection and adequacy testing. This paper describes a methodology for neural model misspecification testing. We introduce a generalization of the Durbin-Watson statistic for neural regression and discuss the general issues of misspecification testing using residual analysis. We derive a generalized influence matrix for neural estimators which enables us to evaluate the distribution of the statistic. We deploy Monte Carlo simulation to compare the power of the test for neural and linear regressors. While residual testing is not a sufficient condition for model adequacy, it is nevertheless a necessary condition to demonstrate that the model is a good approximation to the data generating process, particularly as neural-network estimation procedures are susceptible to partial convergence. The work is also an important step toward developing rigorous procedures for neural model identification, selection and adequacy testing which have started to appear in the literature. We demonstrate its applicability in the nontrivial problem of forecasting implied volatility innovations using high-frequency stock index options. Each step of the model building process is validated using statistical tests to verify variable significance and model adequacy with the results confirming the presence of nonlinear relationships in implied volatility innovations.

  12. MORTICIA, a statistical analysis software package for determining optical surveillance system effectiveness.

    NASA Astrophysics Data System (ADS)

    Ramkilowan, A.; Griffith, D. J.

    2017-10-01

    Surveillance modelling in terms of the standard Detect, Recognise and Identify (DRI) thresholds remains a key requirement for determining the effectiveness of surveillance sensors. With readily available computational resources it has become feasible to perform statistically representative evaluations of the effectiveness of these sensors. A new capability for performing this Monte-Carlo type analysis is demonstrated in the MORTICIA (Monte- Carlo Optical Rendering for Theatre Investigations of Capability under the Influence of the Atmosphere) software package developed at the Council for Scientific and Industrial Research (CSIR). This first generation, python-based open-source integrated software package, currently in the alpha stage of development aims to provide all the functionality required to perform statistical investigations of the effectiveness of optical surveillance systems in specific or generic deployment theatres. This includes modelling of the mathematical and physical processes that govern amongst other components of a surveillance system; a sensor's detector and optical components, a target and its background as well as the intervening atmospheric influences. In this paper we discuss integral aspects of the bespoke framework that are critical to the longevity of all subsequent modelling efforts. Additionally, some preliminary results are presented.

  13. Extending local canonical correlation analysis to handle general linear contrasts for FMRI data.

    PubMed

    Jin, Mingwu; Nandy, Rajesh; Curran, Tim; Cordes, Dietmar

    2012-01-01

    Local canonical correlation analysis (CCA) is a multivariate method that has been proposed to more accurately determine activation patterns in fMRI data. In its conventional formulation, CCA has several drawbacks that limit its usefulness in fMRI. A major drawback is that, unlike the general linear model (GLM), a test of general linear contrasts of the temporal regressors has not been incorporated into the CCA formalism. To overcome this drawback, a novel directional test statistic was derived using the equivalence of multivariate multiple regression (MVMR) and CCA. This extension will allow CCA to be used for inference of general linear contrasts in more complicated fMRI designs without reparameterization of the design matrix and without reestimating the CCA solutions for each particular contrast of interest. With the proper constraints on the spatial coefficients of CCA, this test statistic can yield a more powerful test on the inference of evoked brain regional activations from noisy fMRI data than the conventional t-test in the GLM. The quantitative results from simulated and pseudoreal data and activation maps from fMRI data were used to demonstrate the advantage of this novel test statistic.

  14. Machine learning patterns for neuroimaging-genetic studies in the cloud.

    PubMed

    Da Mota, Benoit; Tudoran, Radu; Costan, Alexandru; Varoquaux, Gaël; Brasche, Goetz; Conrod, Patricia; Lemaitre, Herve; Paus, Tomas; Rietschel, Marcella; Frouin, Vincent; Poline, Jean-Baptiste; Antoniu, Gabriel; Thirion, Bertrand

    2014-01-01

    Brain imaging is a natural intermediate phenotype to understand the link between genetic information and behavior or brain pathologies risk factors. Massive efforts have been made in the last few years to acquire high-dimensional neuroimaging and genetic data on large cohorts of subjects. The statistical analysis of such data is carried out with increasingly sophisticated techniques and represents a great computational challenge. Fortunately, increasing computational power in distributed architectures can be harnessed, if new neuroinformatics infrastructures are designed and training to use these new tools is provided. Combining a MapReduce framework (TomusBLOB) with machine learning algorithms (Scikit-learn library), we design a scalable analysis tool that can deal with non-parametric statistics on high-dimensional data. End-users describe the statistical procedure to perform and can then test the model on their own computers before running the very same code in the cloud at a larger scale. We illustrate the potential of our approach on real data with an experiment showing how the functional signal in subcortical brain regions can be significantly fit with genome-wide genotypes. This experiment demonstrates the scalability and the reliability of our framework in the cloud with a 2 weeks deployment on hundreds of virtual machines.

  15. Extending Local Canonical Correlation Analysis to Handle General Linear Contrasts for fMRI Data

    PubMed Central

    Jin, Mingwu; Nandy, Rajesh; Curran, Tim; Cordes, Dietmar

    2012-01-01

    Local canonical correlation analysis (CCA) is a multivariate method that has been proposed to more accurately determine activation patterns in fMRI data. In its conventional formulation, CCA has several drawbacks that limit its usefulness in fMRI. A major drawback is that, unlike the general linear model (GLM), a test of general linear contrasts of the temporal regressors has not been incorporated into the CCA formalism. To overcome this drawback, a novel directional test statistic was derived using the equivalence of multivariate multiple regression (MVMR) and CCA. This extension will allow CCA to be used for inference of general linear contrasts in more complicated fMRI designs without reparameterization of the design matrix and without reestimating the CCA solutions for each particular contrast of interest. With the proper constraints on the spatial coefficients of CCA, this test statistic can yield a more powerful test on the inference of evoked brain regional activations from noisy fMRI data than the conventional t-test in the GLM. The quantitative results from simulated and pseudoreal data and activation maps from fMRI data were used to demonstrate the advantage of this novel test statistic. PMID:22461786

  16. Gap Shape Classification using Landscape Indices and Multivariate Statistics

    PubMed Central

    Wu, Chih-Da; Cheng, Chi-Chuan; Chang, Che-Chang; Lin, Chinsu; Chang, Kun-Cheng; Chuang, Yung-Chung

    2016-01-01

    This study proposed a novel methodology to classify the shape of gaps using landscape indices and multivariate statistics. Patch-level indices were used to collect the qualified shape and spatial configuration characteristics for canopy gaps in the Lienhuachih Experimental Forest in Taiwan in 1998 and 2002. Non-hierarchical cluster analysis was used to assess the optimal number of gap clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy gap classification. The gaps for the two periods were optimally classified into three categories. In general, gap type 1 had a more complex shape, gap type 2 was more elongated and gap type 3 had the largest gaps that were more regular in shape. The results were evaluated using Wilks’ lambda as satisfactory (p < 0.001). The agreement rate of confusion matrices exceeded 96%. Differences in gap characteristics between the classified gap types that were determined using a one-way ANOVA showed a statistical significance in all patch indices (p = 0.00), except for the Euclidean nearest neighbor distance (ENN) in 2002. Taken together, these results demonstrated the feasibility and applicability of the proposed methodology to classify the shape of a gap. PMID:27901127

  17. Choroidal Thickness Analysis in Patients with Usher Syndrome Type 2 Using EDI OCT.

    PubMed

    Colombo, L; Sala, B; Montesano, G; Pierrottet, C; De Cillà, S; Maltese, P; Bertelli, M; Rossetti, L

    2015-01-01

    To portray Usher Syndrome type 2, analyzing choroidal thickness and comparing data reported in published literature on RP and healthy subjects. Methods. 20 eyes of 10 patients with clinical signs and genetic diagnosis of Usher Syndrome type 2. Each patient underwent a complete ophthalmologic examination including Best Corrected Visual Acuity (BCVA), intraocular pressure (IOP), axial length (AL), automated visual field (VF), and EDI OCT. Both retinal and choroidal measures were measured. Statistical analysis was performed to correlate choroidal thickness with age, BCVA, IOP, AL, VF, and RT. Comparison with data about healthy people and nonsyndromic RP patients was performed. Results. Mean subfoveal choroidal thickness (SFCT) was 248.21 ± 79.88 microns. SFCT was statistically significant correlated with age (correlation coefficient -0.7248179, p < 0.01). No statistically significant correlation was found between SFCT and BCVA, IOP, AL, VF, and RT. SFCT was reduced if compared to healthy subjects (p < 0.01). No difference was found when compared to choroidal thickness from nonsyndromic RP patients (p = 0.2138). Conclusions. Our study demonstrated in vivo choroidal thickness reduction in patients with Usher Syndrome type 2. These data are important for the comprehension of mechanisms of disease and for the evaluation of therapeutic approaches.

  18. Improved spatial regression analysis of diffusion tensor imaging for lesion detection during longitudinal progression of multiple sclerosis in individual subjects

    NASA Astrophysics Data System (ADS)

    Liu, Bilan; Qiu, Xing; Zhu, Tong; Tian, Wei; Hu, Rui; Ekholm, Sven; Schifitto, Giovanni; Zhong, Jianhui

    2016-03-01

    Subject-specific longitudinal DTI study is vital for investigation of pathological changes of lesions and disease evolution. Spatial Regression Analysis of Diffusion tensor imaging (SPREAD) is a non-parametric permutation-based statistical framework that combines spatial regression and resampling techniques to achieve effective detection of localized longitudinal diffusion changes within the whole brain at individual level without a priori hypotheses. However, boundary blurring and dislocation limit its sensitivity, especially towards detecting lesions of irregular shapes. In the present study, we propose an improved SPREAD (dubbed improved SPREAD, or iSPREAD) method by incorporating a three-dimensional (3D) nonlinear anisotropic diffusion filtering method, which provides edge-preserving image smoothing through a nonlinear scale space approach. The statistical inference based on iSPREAD was evaluated and compared with the original SPREAD method using both simulated and in vivo human brain data. Results demonstrated that the sensitivity and accuracy of the SPREAD method has been improved substantially by adapting nonlinear anisotropic filtering. iSPREAD identifies subject-specific longitudinal changes in the brain with improved sensitivity, accuracy, and enhanced statistical power, especially when the spatial correlation is heterogeneous among neighboring image pixels in DTI.

  19. Gap Shape Classification using Landscape Indices and Multivariate Statistics.

    PubMed

    Wu, Chih-Da; Cheng, Chi-Chuan; Chang, Che-Chang; Lin, Chinsu; Chang, Kun-Cheng; Chuang, Yung-Chung

    2016-11-30

    This study proposed a novel methodology to classify the shape of gaps using landscape indices and multivariate statistics. Patch-level indices were used to collect the qualified shape and spatial configuration characteristics for canopy gaps in the Lienhuachih Experimental Forest in Taiwan in 1998 and 2002. Non-hierarchical cluster analysis was used to assess the optimal number of gap clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy gap classification. The gaps for the two periods were optimally classified into three categories. In general, gap type 1 had a more complex shape, gap type 2 was more elongated and gap type 3 had the largest gaps that were more regular in shape. The results were evaluated using Wilks' lambda as satisfactory (p < 0.001). The agreement rate of confusion matrices exceeded 96%. Differences in gap characteristics between the classified gap types that were determined using a one-way ANOVA showed a statistical significance in all patch indices (p = 0.00), except for the Euclidean nearest neighbor distance (ENN) in 2002. Taken together, these results demonstrated the feasibility and applicability of the proposed methodology to classify the shape of a gap.

  20. A Statistical Approach for Testing Cross-Phenotype Effects of Rare Variants

    PubMed Central

    Broadaway, K. Alaine; Cutler, David J.; Duncan, Richard; Moore, Jacob L.; Ware, Erin B.; Jhun, Min A.; Bielak, Lawrence F.; Zhao, Wei; Smith, Jennifer A.; Peyser, Patricia A.; Kardia, Sharon L.R.; Ghosh, Debashis; Epstein, Michael P.

    2016-01-01

    Increasing empirical evidence suggests that many genetic variants influence multiple distinct phenotypes. When cross-phenotype effects exist, multivariate association methods that consider pleiotropy are often more powerful than univariate methods that model each phenotype separately. Although several statistical approaches exist for testing cross-phenotype effects for common variants, there is a lack of similar tests for gene-based analysis of rare variants. In order to fill this important gap, we introduce a statistical method for cross-phenotype analysis of rare variants using a nonparametric distance-covariance approach that compares similarity in multivariate phenotypes to similarity in rare-variant genotypes across a gene. The approach can accommodate both binary and continuous phenotypes and further can adjust for covariates. Our approach yields a closed-form test whose significance can be evaluated analytically, thereby improving computational efficiency and permitting application on a genome-wide scale. We use simulated data to demonstrate that our method, which we refer to as the Gene Association with Multiple Traits (GAMuT) test, provides increased power over competing approaches. We also illustrate our approach using exome-chip data from the Genetic Epidemiology Network of Arteriopathy. PMID:26942286

  1. Congruence analysis of geodetic networks - hypothesis tests versus model selection by information criteria

    NASA Astrophysics Data System (ADS)

    Lehmann, Rüdiger; Lösler, Michael

    2017-12-01

    Geodetic deformation analysis can be interpreted as a model selection problem. The null model indicates that no deformation has occurred. It is opposed to a number of alternative models, which stipulate different deformation patterns. A common way to select the right model is the usage of a statistical hypothesis test. However, since we have to test a series of deformation patterns, this must be a multiple test. As an alternative solution for the test problem, we propose the p-value approach. Another approach arises from information theory. Here, the Akaike information criterion (AIC) or some alternative is used to select an appropriate model for a given set of observations. Both approaches are discussed and applied to two test scenarios: A synthetic levelling network and the Delft test data set. It is demonstrated that they work but behave differently, sometimes even producing different results. Hypothesis tests are well-established in geodesy, but may suffer from an unfavourable choice of the decision error rates. The multiple test also suffers from statistical dependencies between the test statistics, which are neglected. Both problems are overcome by applying information criterions like AIC.

  2. Rapid Classification and Identification of Multiple Microorganisms with Accurate Statistical Significance via High-Resolution Tandem Mass Spectrometry

    NASA Astrophysics Data System (ADS)

    Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y.; Drake, Steven K.; Gucek, Marjan; Sacks, David B.; Yu, Yi-Kuo

    2018-06-01

    Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.

  3. A statistical analysis of cervical auscultation signals from adults with unsafe airway protection.

    PubMed

    Dudik, Joshua M; Kurosu, Atsuko; Coyle, James L; Sejdić, Ervin

    2016-01-22

    Aspiration, where food or liquid is allowed to enter the larynx during a swallow, is recognized as the most clinically salient feature of oropharyngeal dysphagia. This event can lead to short-term harm via airway obstruction or more long-term effects such as pneumonia. In order to non-invasively identify this event using high resolution cervical auscultation there is a need to characterize cervical auscultation signals from subjects with dysphagia who aspirate. In this study, we collected swallowing sound and vibration data from 76 adults (50 men, 26 women, mean age 62) who underwent a routine videofluoroscopy swallowing examination. The analysis was limited to swallows of liquid with either thin (<5 cps) or viscous (≈300 cps) consistency and was divided into those with deep laryngeal penetration or aspiration (unsafe airway protection), and those with either shallow or no laryngeal penetration (safe airway protection), using a standardized scale. After calculating a selection of time, frequency, and time-frequency features for each swallow, the safe and unsafe categories were compared using Wilcoxon rank-sum statistical tests. Our analysis found that few of our chosen features varied in magnitude between safe and unsafe swallows with thin swallows demonstrating no statistical variation. We also supported our past findings with regard to the effects of sex and the presence or absence of stroke on cervical ausculation signals, but noticed certain discrepancies with regards to bolus viscosity. Overall, our results support the necessity of using multiple statistical features concurrently to identify laryngeal penetration of swallowed boluses in future work with high resolution cervical auscultation.

  4. Rapid Classification and Identification of Multiple Microorganisms with Accurate Statistical Significance via High-Resolution Tandem Mass Spectrometry.

    PubMed

    Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y; Drake, Steven K; Gucek, Marjan; Sacks, David B; Yu, Yi-Kuo

    2018-06-05

    Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html . Graphical Abstract ᅟ.

  5. Nonindependence and sensitivity analyses in ecological and evolutionary meta-analyses.

    PubMed

    Noble, Daniel W A; Lagisz, Malgorzata; O'dea, Rose E; Nakagawa, Shinichi

    2017-05-01

    Meta-analysis is an important tool for synthesizing research on a variety of topics in ecology and evolution, including molecular ecology, but can be susceptible to nonindependence. Nonindependence can affect two major interrelated components of a meta-analysis: (i) the calculation of effect size statistics and (ii) the estimation of overall meta-analytic estimates and their uncertainty. While some solutions to nonindependence exist at the statistical analysis stages, there is little advice on what to do when complex analyses are not possible, or when studies with nonindependent experimental designs exist in the data. Here we argue that exploring the effects of procedural decisions in a meta-analysis (e.g. inclusion of different quality data, choice of effect size) and statistical assumptions (e.g. assuming no phylogenetic covariance) using sensitivity analyses are extremely important in assessing the impact of nonindependence. Sensitivity analyses can provide greater confidence in results and highlight important limitations of empirical work (e.g. impact of study design on overall effects). Despite their importance, sensitivity analyses are seldom applied to problems of nonindependence. To encourage better practice for dealing with nonindependence in meta-analytic studies, we present accessible examples demonstrating the impact that ignoring nonindependence can have on meta-analytic estimates. We also provide pragmatic solutions for dealing with nonindependent study designs, and for analysing dependent effect sizes. Additionally, we offer reporting guidelines that will facilitate disclosure of the sources of nonindependence in meta-analyses, leading to greater transparency and more robust conclusions. © 2017 John Wiley & Sons Ltd.

  6. Detection of Fatty Acids from Intact Microorganisms by Molecular Beam Static Secondary Ion Mass Spectrometry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ingram, Jani Cheri; Lehman, Richard Michael; Bauer, William Francis

    We report the use of a surface analysis approach, static secondary ion mass spectrometry (SIMS) equipped with a molecular (ReO4-) ion primary beam, to analyze the surface of intact microbial cells. SIMS spectra of 28 microorganisms were compared to fatty acid profiles determined by gas chromatographic analysis of transesterfied fatty acids extracted from the same organisms. The results indicate that surface bombardment using the molecular primary beam cleaved the ester linkage characteristic of bacteria at the glycerophosphate backbone of the phospholipid components of the cell membrane. This cleavage enables direct detection of the fatty acid conjugate base of intact microorganismsmore » by static SIMS. The limit of detection for this approach is approximately 107 bacterial cells/cm2. Multivariate statistical methods were applied in a graded approach to the SIMS microbial data. The results showed that the full data set could initially be statistically grouped based upon major differences in biochemical composition of the cell wall. The gram-positive bacteria were further statistically analyzed, followed by final analysis of a specific bacterial genus that was successfully grouped by species. Additionally, the use of SIMS to detect microbes on mineral surfaces is demonstrated by an analysis of Shewanella oneidensis on crushed hematite. The results of this study provide evidence for the potential of static SIMS to rapidly detect bacterial species based on ion fragments originating from cell membrane lipids directly from sample surfaces.« less

  7. Gas detection by correlation spectroscopy employing a multimode diode laser.

    PubMed

    Lou, Xiutao; Somesfalean, Gabriel; Zhang, Zhiguo

    2008-05-01

    A gas sensor based on the gas-correlation technique has been developed using a multimode diode laser (MDL) in a dual-beam detection scheme. Measurement of CO(2) mixed with CO as an interfering gas is successfully demonstrated using a 1570 nm tunable MDL. Despite overlapping absorption spectra and occasional mode hops, the interfering signals can be effectively excluded by a statistical procedure including correlation analysis and outlier identification. The gas concentration is retrieved from several pair-correlated signals by a linear-regression scheme, yielding a reliable and accurate measurement. This demonstrates the utility of the unsophisticated MDLs as novel light sources for gas detection applications.

  8. Validation tools for image segmentation

    NASA Astrophysics Data System (ADS)

    Padfield, Dirk; Ross, James

    2009-02-01

    A large variety of image analysis tasks require the segmentation of various regions in an image. For example, segmentation is required to generate accurate models of brain pathology that are important components of modern diagnosis and therapy. While the manual delineation of such structures gives accurate information, the automatic segmentation of regions such as the brain and tumors from such images greatly enhances the speed and repeatability of quantifying such structures. The ubiquitous need for such algorithms has lead to a wide range of image segmentation algorithms with various assumptions, parameters, and robustness. The evaluation of such algorithms is an important step in determining their effectiveness. Therefore, rather than developing new segmentation algorithms, we here describe validation methods for segmentation algorithms. Using similarity metrics comparing the automatic to manual segmentations, we demonstrate methods for optimizing the parameter settings for individual cases and across a collection of datasets using the Design of Experiment framework. We then employ statistical analysis methods to compare the effectiveness of various algorithms. We investigate several region-growing algorithms from the Insight Toolkit and compare their accuracy to that of a separate statistical segmentation algorithm. The segmentation algorithms are used with their optimized parameters to automatically segment the brain and tumor regions in MRI images of 10 patients. The validation tools indicate that none of the ITK algorithms studied are able to outperform with statistical significance the statistical segmentation algorithm although they perform reasonably well considering their simplicity.

  9. A novel bi-level meta-analysis approach: applied to biological pathway analysis.

    PubMed

    Nguyen, Tin; Tagett, Rebecca; Donato, Michele; Mitrea, Cristina; Draghici, Sorin

    2016-02-01

    The accumulation of high-throughput data in public repositories creates a pressing need for integrative analysis of multiple datasets from independent experiments. However, study heterogeneity, study bias, outliers and the lack of power of available methods present real challenge in integrating genomic data. One practical drawback of many P-value-based meta-analysis methods, including Fisher's, Stouffer's, minP and maxP, is that they are sensitive to outliers. Another drawback is that, because they perform just one statistical test for each individual experiment, they may not fully exploit the potentially large number of samples within each study. We propose a novel bi-level meta-analysis approach that employs the additive method and the Central Limit Theorem within each individual experiment and also across multiple experiments. We prove that the bi-level framework is robust against bias, less sensitive to outliers than other methods, and more sensitive to small changes in signal. For comparative analysis, we demonstrate that the intra-experiment analysis has more power than the equivalent statistical test performed on a single large experiment. For pathway analysis, we compare the proposed framework versus classical meta-analysis approaches (Fisher's, Stouffer's and the additive method) as well as against a dedicated pathway meta-analysis package (MetaPath), using 1252 samples from 21 datasets related to three human diseases, acute myeloid leukemia (9 datasets), type II diabetes (5 datasets) and Alzheimer's disease (7 datasets). Our framework outperforms its competitors to correctly identify pathways relevant to the phenotypes. The framework is sufficiently general to be applied to any type of statistical meta-analysis. The R scripts are available on demand from the authors. sorin@wayne.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  10. Geostatistics - a tool applied to the distribution of Legionella pneumophila in a hospital water system.

    PubMed

    Laganà, Pasqualina; Moscato, Umberto; Poscia, Andrea; La Milia, Daniele Ignazio; Boccia, Stefania; Avventuroso, Emanuela; Delia, Santi

    2015-01-01

    Legionnaires' disease is normally acquired by inhalation of legionellae from a contaminated environmental source. Water systems of large buildings, such as hospitals, are often contaminated with legionellae and therefore represent a potential risk for the hospital population. The aim of this study was to evaluate the potential contamination of Legionella pneumophila (LP) in a large hospital in Italy through georeferential statistical analysis to assess the possible sources of dispersion and, consequently, the risk of exposure for both health care staff and patients. LP serogroups 1 and 2-14 distribution was considered in the wards housed on two consecutive floors of the hospital building. On the basis of information provided by 53 bacteriological analysis, a 'random' grid of points was chosen and spatial geostatistics or FAIk Kriging was applied and compared with the results of classical statistical analysis. Over 50% of the examined samples were positive for Legionella pneumophila. LP 1 was isolated in 69% of samples from the ground floor and in 60% of sample from the first floor; LP 2-14 in 36% of sample from the ground floor and 24% from the first. The iso-estimation maps show clearly the most contaminated pipe and the difference in the diffusion of the different L. pneumophila serogroups. Experimental work has demonstrated that geostatistical methods applied to the microbiological analysis of water matrices allows a better modeling of the phenomenon under study, a greater potential for risk management and a greater choice of methods of prevention and environmental recovery to be put in place with respect to the classical statistical analysis.

  11. Texture analysis of apparent diffusion coefficient maps for treatment response assessment in prostate cancer bone metastases-A pilot study.

    PubMed

    Reischauer, Carolin; Patzwahl, René; Koh, Dow-Mu; Froehlich, Johannes M; Gutzeit, Andreas

    2018-04-01

    To evaluate whole-lesion volumetric texture analysis of apparent diffusion coefficient (ADC) maps for assessing treatment response in prostate cancer bone metastases. Texture analysis is performed in 12 treatment-naïve patients with 34 metastases before treatment and at one, two, and three months after the initiation of androgen deprivation therapy. Four first-order and 19 second-order statistical texture features are computed on the ADC maps in each lesion at every time point. Repeatability, inter-patient variability, and changes in the feature values under therapy are investigated. Spearman rank's correlation coefficients are calculated across time to demonstrate the relationship between the texture features and the serum prostate specific antigen (PSA) levels. With few exceptions, the texture features exhibited moderate to high precision. At the same time, Friedman's tests revealed that all first-order and second-order statistical texture features changed significantly in response to therapy. Thereby, the majority of texture features showed significant changes in their values at all post-treatment time points relative to baseline. Bivariate analysis detected significant correlations between the great majority of texture features and the serum PSA levels. Thereby, three first-order and six second-order statistical features showed strong correlations with the serum PSA levels across time. The findings in the present work indicate that whole-tumor volumetric texture analysis may be utilized for response assessment in prostate cancer bone metastases. The approach may be used as a complementary measure for treatment monitoring in conjunction with averaged ADC values. Copyright © 2018 Elsevier B.V. All rights reserved.

  12. Statistical principle and methodology in the NISAN system.

    PubMed Central

    Asano, C

    1979-01-01

    The NISAN system is a new interactive statistical analysis program package constructed by an organization of Japanese statisticans. The package is widely available for both statistical situations, confirmatory analysis and exploratory analysis, and is planned to obtain statistical wisdom and to choose optimal process of statistical analysis for senior statisticians. PMID:540594

  13. Finding the Root Causes of Statistical Inconsistency in Community Earth System Model Output

    NASA Astrophysics Data System (ADS)

    Milroy, D.; Hammerling, D.; Baker, A. H.

    2017-12-01

    Baker et al (2015) developed the Community Earth System Model Ensemble Consistency Test (CESM-ECT) to provide a metric for software quality assurance by determining statistical consistency between an ensemble of CESM outputs and new test runs. The test has proved useful for detecting statistical difference caused by compiler bugs and errors in physical modules. However, detection is only the necessary first step in finding the causes of statistical difference. The CESM is a vastly complex model comprised of millions of lines of code which is developed and maintained by a large community of software engineers and scientists. Any root cause analysis is correspondingly challenging. We propose a new capability for CESM-ECT: identifying the sections of code that cause statistical distinguishability. The first step is to discover CESM variables that cause CESM-ECT to classify new runs as statistically distinct, which we achieve via Randomized Logistic Regression. Next we use a tool developed to identify CESM components that define or compute the variables found in the first step. Finally, we employ the application Kernel GENerator (KGEN) created in Kim et al (2016) to detect fine-grained floating point differences. We demonstrate an example of the procedure and advance a plan to automate this process in our future work.

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yoon, Boram; Gupta, Rajan; Bhattacharya, Tanmoy

    We present a detailed analysis of methods to reduce statistical errors and excited-state contamination in the calculation of matrix elements of quark bilinear operators in nucleon states. All the calculations were done on a 2+1 flavor ensemble with lattices of sizemore » $$32^3 \\times 64$$ generated using the rational hybrid Monte Carlo algorithm at $a=0.081$~fm and with $$M_\\pi=312$$~MeV. The statistical precision of the data is improved using the all-mode-averaging method. We compare two methods for reducing excited-state contamination: a variational analysis and a two-state fit to data at multiple values of the source-sink separation $$t_{\\rm sep}$$. We show that both methods can be tuned to significantly reduce excited-state contamination and discuss their relative advantages and cost-effectiveness. A detailed analysis of the size of source smearing used in the calculation of quark propagators and the range of values of $$t_{\\rm sep}$$ needed to demonstrate convergence of the isovector charges of the nucleon to the $$t_{\\rm sep} \\to \\infty $$ estimates is presented.« less

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yoon, Boram; Gupta, Rajan; Bhattacharya, Tanmoy

    We present a detailed analysis of methods to reduce statistical errors and excited-state contamination in the calculation of matrix elements of quark bilinear operators in nucleon states. All the calculations were done on a 2+1-flavor ensemble with lattices of size 32 3 × 64 generated using the rational hybrid Monte Carlo algorithm at a = 0.081 fm and with M π = 312 MeV. The statistical precision of the data is improved using the all-mode-averaging method. We compare two methods for reducing excited-state contamination: a variational analysis and a 2-state fit to data at multiple values of the source-sink separationmore » t sep. We show that both methods can be tuned to significantly reduce excited-state contamination and discuss their relative advantages and cost effectiveness. As a result, a detailed analysis of the size of source smearing used in the calculation of quark propagators and the range of values of t sep needed to demonstrate convergence of the isovector charges of the nucleon to the t sep → ∞ estimates is presented.« less

  16. Principal component analysis of the cytokine and chemokine response to human traumatic brain injury.

    PubMed

    Helmy, Adel; Antoniades, Chrystalina A; Guilfoyle, Mathew R; Carpenter, Keri L H; Hutchinson, Peter J

    2012-01-01

    There is a growing realisation that neuro-inflammation plays a fundamental role in the pathology of Traumatic Brain Injury (TBI). This has led to the search for biomarkers that reflect these underlying inflammatory processes using techniques such as cerebral microdialysis. The interpretation of such biomarker data has been limited by the statistical methods used. When analysing data of this sort the multiple putative interactions between mediators need to be considered as well as the timing of production and high degree of statistical co-variance in levels of these mediators. Here we present a cytokine and chemokine dataset from human brain following human traumatic brain injury and use principal component analysis and partial least squares discriminant analysis to demonstrate the pattern of production following TBI, distinct phases of the humoral inflammatory response and the differing patterns of response in brain and in peripheral blood. This technique has the added advantage of making no assumptions about the Relative Recovery (RR) of microdialysis derived parameters. Taken together these techniques can be used in complex microdialysis datasets to summarise the data succinctly and generate hypotheses for future study.

  17. Kinetic analysis of single molecule FRET transitions without trajectories

    NASA Astrophysics Data System (ADS)

    Schrangl, Lukas; Göhring, Janett; Schütz, Gerhard J.

    2018-03-01

    Single molecule Förster resonance energy transfer (smFRET) is a popular tool to study biological systems that undergo topological transitions on the nanometer scale. smFRET experiments typically require recording of long smFRET trajectories and subsequent statistical analysis to extract parameters such as the states' lifetimes. Alternatively, analysis of probability distributions exploits the shapes of smFRET distributions at well chosen exposure times and hence works without the acquisition of time traces. Here, we describe a variant that utilizes statistical tests to compare experimental datasets with Monte Carlo simulations. For a given model, parameters are varied to cover the full realistic parameter space. As output, the method yields p-values which quantify the likelihood for each parameter setting to be consistent with the experimental data. The method provides suitable results even if the actual lifetimes differ by an order of magnitude. We also demonstrated the robustness of the method to inaccurately determine input parameters. As proof of concept, the new method was applied to the determination of transition rate constants for Holliday junctions.

  18. Stochastic dynamic analysis of marine risers considering Gaussian system uncertainties

    NASA Astrophysics Data System (ADS)

    Ni, Pinghe; Li, Jun; Hao, Hong; Xia, Yong

    2018-03-01

    This paper performs the stochastic dynamic response analysis of marine risers with material uncertainties, i.e. in the mass density and elastic modulus, by using Stochastic Finite Element Method (SFEM) and model reduction technique. These uncertainties are assumed having Gaussian distributions. The random mass density and elastic modulus are represented by using the Karhunen-Loève (KL) expansion. The Polynomial Chaos (PC) expansion is adopted to represent the vibration response because the covariance of the output is unknown. Model reduction based on the Iterated Improved Reduced System (IIRS) technique is applied to eliminate the PC coefficients of the slave degrees of freedom to reduce the dimension of the stochastic system. Monte Carlo Simulation (MCS) is conducted to obtain the reference response statistics. Two numerical examples are studied in this paper. The response statistics from the proposed approach are compared with those from MCS. It is noted that the computational time is significantly reduced while the accuracy is kept. The results demonstrate the efficiency of the proposed approach for stochastic dynamic response analysis of marine risers.

  19. Laryngospasm during emergency department ketamine sedation: a case-control study.

    PubMed

    Green, Steven M; Roback, Mark G; Krauss, Baruch

    2010-11-01

    The objective of this study was to assess predictors of emergency department (ED) ketamine-associated laryngospasm using case-control techniques. We performed a matched case-control analysis of a sample of 8282 ED ketamine sedations (including 22 occurrences of laryngospasm) assembled from 32 prior published series. We sequentially studied the association of each of 7 clinical variables with laryngospasm by assigning 4 controls to each case while matching for the remaining 6 variables. We then used univariate statistics and conditional logistic regression to analyze the matched sets. We found no statistical association of age, dose, oropharyngeal procedure, underlying physical illness, route, or coadministered anticholinergics with laryngospasm. Coadministered benzodiazepines showed a borderline association in the multivariate but not univariate analysis that was considered anomalous. This case-control analysis of the largest available sample of ED ketamine-associated laryngospasm did not demonstrate evidence of association with age, dose, or other clinical factors. Such laryngospasm seems to be idiosyncratic, and accordingly, clinicians administering ketamine must be prepared for its rapid identification and management. Given no evidence that they decrease the risk of laryngospasm, coadministered anticholinergics seem unnecessary.

  20. Statistical Analysis of the First Passage Path Ensemble of Jump Processes

    NASA Astrophysics Data System (ADS)

    von Kleist, Max; Schütte, Christof; Zhang, Wei

    2018-02-01

    The transition mechanism of jump processes between two different subsets in state space reveals important dynamical information of the processes and therefore has attracted considerable attention in the past years. In this paper, we study the first passage path ensemble of both discrete-time and continuous-time jump processes on a finite state space. The main approach is to divide each first passage path into nonreactive and reactive segments and to study them separately. The analysis can be applied to jump processes which are non-ergodic, as well as continuous-time jump processes where the waiting time distributions are non-exponential. In the particular case that the jump processes are both Markovian and ergodic, our analysis elucidates the relations between the study of the first passage paths and the study of the transition paths in transition path theory. We provide algorithms to numerically compute statistics of the first passage path ensemble. The computational complexity of these algorithms scales with the complexity of solving a linear system, for which efficient methods are available. Several examples demonstrate the wide applicability of the derived results across research areas.

  1. Effects of gyrokinesis exercise on the gait pattern of female patients with chronic low back pain

    PubMed Central

    Seo, Kook-Eun; Park, Tae-Jin

    2016-01-01

    [Purpose] The purpose of the present study was to use kinematic variables to identify the effects of 8/weeks’ performance of a gyrokinesis exercise on the gait pattern of females with chronic low back pain. [Subjects] The subjects of the present study were females in their late 20s to mid 30s who were chronic back pain patients. [Methods] A 3-D motion analysis system was used to measure the changes in their gait patterns between pre and post-gyrokintic exercise. The SPSS 21.0 statistics program was used to perform the paired t-test, to compare the gait patterns of pre-post-gyrokinesis exercise. [Results] In the gait analysis, pre-post-gyrokinesis exercise gait patterns showed statistically significant differences in right and left step length, stride length, right-left step widths, and stride speed. [Conclusion] Gait pattern analysis revealed increases in step length, stride length, and stride speed along with a decrease in step width after 8 weeks of gyrokinesis exercise, demonstrating it improved gait pattern. PMID:27065537

  2. Traumatic injury among drywall installers, 1992 to 1995.

    PubMed

    Chiou, S S; Pan, C S; Keane, P

    2000-11-01

    This study examined the traumatic-injury characteristics associated with one of the high-risk occupations in the construction industry--drywall installers--through an analysis of the traumatic-injury data obtained from the Bureau of Labor Statistics. An additional objective was to demonstrate a feasible and economic approach to identify risk factors associated with a specific occupation by using an existing database. An analysis of nonfatal traumatic injuries with days away from work among wage-and-salary drywall installers was performed for 1992 through 1995 using the Occupational Injury and Illness Survey conducted by the Bureau of Labor Statistics. Results from this study indicate that drywall installers are at a high risk of overexertion and falls to a lower level. More than 40% of the injured drywall installers suffered sprains, strains, and/or tears. The most frequently injured body part was the trunk. More than one-third of the trunk injuries occurred while handling solid building materials, mainly drywall. In addition, the database analysis used in this study is valid in identifying overall risk factors for specific occupations.

  3. Monitoring the metering performance of an electronic voltage transformer on-line based on cyber-physics correlation analysis

    NASA Astrophysics Data System (ADS)

    Zhang, Zhu; Li, Hongbin; Tang, Dengping; Hu, Chen; Jiao, Yang

    2017-10-01

    Metering performance is the key parameter of an electronic voltage transformer (EVT), and it requires high accuracy. The conventional off-line calibration method using a standard voltage transformer is not suitable for the key equipment in a smart substation, which needs on-line monitoring. In this article, we propose a method for monitoring the metering performance of an EVT on-line based on cyber-physics correlation analysis. By the electrical and physical properties of a substation running in three-phase symmetry, the principal component analysis method is used to separate the metering deviation caused by the primary fluctuation and the EVT anomaly. The characteristic statistics of the measured data during operation are extracted, and the metering performance of the EVT is evaluated by analyzing the change in statistics. The experimental results show that the method successfully monitors the metering deviation of a Class 0.2 EVT accurately. The method demonstrates the accurate evaluation of on-line monitoring of the metering performance on an EVT without a standard voltage transformer.

  4. Aftershock identification problem via the nearest-neighbor analysis for marked point processes

    NASA Astrophysics Data System (ADS)

    Gabrielov, A.; Zaliapin, I.; Wong, H.; Keilis-Borok, V.

    2007-12-01

    The centennial observations on the world seismicity have revealed a wide variety of clustering phenomena that unfold in the space-time-energy domain and provide most reliable information about the earthquake dynamics. However, there is neither a unifying theory nor a convenient statistical apparatus that would naturally account for the different types of seismic clustering. In this talk we present a theoretical framework for nearest-neighbor analysis of marked processes and obtain new results on hierarchical approach to studying seismic clustering introduced by Baiesi and Paczuski (2004). Recall that under this approach one defines an asymmetric distance D in space-time-energy domain such that the nearest-neighbor spanning graph with respect to D becomes a time- oriented tree. We demonstrate how this approach can be used to detect earthquake clustering. We apply our analysis to the observed seismicity of California and synthetic catalogs from ETAS model and show that the earthquake clustering part is statistically different from the homogeneous part. This finding may serve as a basis for an objective aftershock identification procedure.

  5. Statistical tools for analysis and modeling of cosmic populations and astronomical time series: CUDAHM and TSE

    NASA Astrophysics Data System (ADS)

    Loredo, Thomas; Budavari, Tamas; Scargle, Jeffrey D.

    2018-01-01

    This presentation provides an overview of open-source software packages addressing two challenging classes of astrostatistics problems. (1) CUDAHM is a C++ framework for hierarchical Bayesian modeling of cosmic populations, leveraging graphics processing units (GPUs) to enable applying this computationally challenging paradigm to large datasets. CUDAHM is motivated by measurement error problems in astronomy, where density estimation and linear and nonlinear regression must be addressed for populations of thousands to millions of objects whose features are measured with possibly complex uncertainties, potentially including selection effects. An example calculation demonstrates accurate GPU-accelerated luminosity function estimation for simulated populations of $10^6$ objects in about two hours using a single NVIDIA Tesla K40c GPU. (2) Time Series Explorer (TSE) is a collection of software in Python and MATLAB for exploratory analysis and statistical modeling of astronomical time series. It comprises a library of stand-alone functions and classes, as well as an application environment for interactive exploration of times series data. The presentation will summarize key capabilities of this emerging project, including new algorithms for analysis of irregularly-sampled time series.

  6. Controlling excited-state contamination in nucleon matrix elements

    DOE PAGES

    Yoon, Boram; Gupta, Rajan; Bhattacharya, Tanmoy; ...

    2016-06-08

    We present a detailed analysis of methods to reduce statistical errors and excited-state contamination in the calculation of matrix elements of quark bilinear operators in nucleon states. All the calculations were done on a 2+1-flavor ensemble with lattices of size 32 3 × 64 generated using the rational hybrid Monte Carlo algorithm at a = 0.081 fm and with M π = 312 MeV. The statistical precision of the data is improved using the all-mode-averaging method. We compare two methods for reducing excited-state contamination: a variational analysis and a 2-state fit to data at multiple values of the source-sink separationmore » t sep. We show that both methods can be tuned to significantly reduce excited-state contamination and discuss their relative advantages and cost effectiveness. As a result, a detailed analysis of the size of source smearing used in the calculation of quark propagators and the range of values of t sep needed to demonstrate convergence of the isovector charges of the nucleon to the t sep → ∞ estimates is presented.« less

  7. Geographical origin discrimination of lentils (Lens culinaris Medik.) using 1H NMR fingerprinting and multivariate statistical analyses.

    PubMed

    Longobardi, Francesco; Innamorato, Valentina; Di Gioia, Annalisa; Ventrella, Andrea; Lippolis, Vincenzo; Logrieco, Antonio F; Catucci, Lucia; Agostiano, Angela

    2017-12-15

    Lentil samples coming from two different countries, i.e. Italy and Canada, were analysed using untargeted 1 H NMR fingerprinting in combination with chemometrics in order to build models able to classify them according to their geographical origin. For such aim, Soft Independent Modelling of Class Analogy (SIMCA), k-Nearest Neighbor (k-NN), Principal Component Analysis followed by Linear Discriminant Analysis (PCA-LDA) and Partial Least Squares-Discriminant Analysis (PLS-DA) were applied to the NMR data and the results were compared. The best combination of average recognition (100%) and cross-validation prediction abilities (96.7%) was obtained for the PCA-LDA. All the statistical models were validated both by using a test set and by carrying out a Monte Carlo Cross Validation: the obtained performances were found to be satisfying for all the models, with prediction abilities higher than 95% demonstrating the suitability of the developed methods. Finally, the metabolites that mostly contributed to the lentil discrimination were indicated. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. A new statistic for identifying batch effects in high-throughput genomic data that uses guided principal component analysis.

    PubMed

    Reese, Sarah E; Archer, Kellie J; Therneau, Terry M; Atkinson, Elizabeth J; Vachon, Celine M; de Andrade, Mariza; Kocher, Jean-Pierre A; Eckel-Passow, Jeanette E

    2013-11-15

    Batch effects are due to probe-specific systematic variation between groups of samples (batches) resulting from experimental features that are not of biological interest. Principal component analysis (PCA) is commonly used as a visual tool to determine whether batch effects exist after applying a global normalization method. However, PCA yields linear combinations of the variables that contribute maximum variance and thus will not necessarily detect batch effects if they are not the largest source of variability in the data. We present an extension of PCA to quantify the existence of batch effects, called guided PCA (gPCA). We describe a test statistic that uses gPCA to test whether a batch effect exists. We apply our proposed test statistic derived using gPCA to simulated data and to two copy number variation case studies: the first study consisted of 614 samples from a breast cancer family study using Illumina Human 660 bead-chip arrays, whereas the second case study consisted of 703 samples from a family blood pressure study that used Affymetrix SNP Array 6.0. We demonstrate that our statistic has good statistical properties and is able to identify significant batch effects in two copy number variation case studies. We developed a new statistic that uses gPCA to identify whether batch effects exist in high-throughput genomic data. Although our examples pertain to copy number data, gPCA is general and can be used on other data types as well. The gPCA R package (Available via CRAN) provides functionality and data to perform the methods in this article. reesese@vcu.edu

  9. Joint principal trend analysis for longitudinal high-dimensional data.

    PubMed

    Zhang, Yuping; Ouyang, Zhengqing

    2018-06-01

    We consider a research scenario motivated by integrating multiple sources of information for better knowledge discovery in diverse dynamic biological processes. Given two longitudinal high-dimensional datasets for a group of subjects, we want to extract shared latent trends and identify relevant features. To solve this problem, we present a new statistical method named as joint principal trend analysis (JPTA). We demonstrate the utility of JPTA through simulations and applications to gene expression data of the mammalian cell cycle and longitudinal transcriptional profiling data in response to influenza viral infections. © 2017, The International Biometric Society.

  10. Discrimination surfaces with application to region-specific brain asymmetry analysis.

    PubMed

    Martos, Gabriel; de Carvalho, Miguel

    2018-05-20

    Discrimination surfaces are here introduced as a diagnostic tool for localizing brain regions where discrimination between diseased and nondiseased participants is higher. To estimate discrimination surfaces, we introduce a Mann-Whitney type of statistic for random fields and present large-sample results characterizing its asymptotic behavior. Simulation results demonstrate that our estimator accurately recovers the true surface and corresponding interval of maximal discrimination. The empirical analysis suggests that in the anterior region of the brain, schizophrenic patients tend to present lower local asymmetry scores in comparison with participants in the control group. Copyright © 2018 John Wiley & Sons, Ltd.

  11. A Method for Evaluating the Safety Impacts of Air Traffic Automation

    NASA Technical Reports Server (NTRS)

    Kostiuk, Peter; Shapiro, Gerald; Hanson, Dave; Kolitz, Stephan; Leong, Frank; Rosch, Gene; Bonesteel, Charles

    1998-01-01

    This report describes a methodology for analyzing the safety and operational impacts of emerging air traffic technologies. The approach integrates traditional reliability models of the system infrastructure with models that analyze the environment within which the system operates, and models of how the system responds to different scenarios. Products of the analysis include safety measures such as predicted incident rates, predicted accident statistics, and false alarm rates; and operational availability data. The report demonstrates the methodology with an analysis of the operation of the Center-TRACON Automation System at Dallas-Fort Worth International Airport.

  12. Information categorization approach to literary authorship disputes

    NASA Astrophysics Data System (ADS)

    Yang, Albert C.-C.; Peng, C.-K.; Yien, H.-W.; Goldberger, Ary L.

    2003-11-01

    Scientific analysis of the linguistic styles of different authors has generated considerable interest. We present a generic approach to measuring the similarity of two symbolic sequences that requires minimal background knowledge about a given human language. Our analysis is based on word rank order-frequency statistics and phylogenetic tree construction. We demonstrate the applicability of this method to historic authorship questions related to the classic Chinese novel “The Dream of the Red Chamber,” to the plays of William Shakespeare, and to the Federalist papers. This method may also provide a simple approach to other large databases based on their information content.

  13. Evolutionary computing for the design search and optimization of space vehicle power subsystems

    NASA Technical Reports Server (NTRS)

    Kordon, Mark; Klimeck, Gerhard; Hanks, David; Hua, Hook

    2004-01-01

    Evolutionary computing has proven to be a straightforward and robust approach for optimizing a wide range of difficult analysis and design problems. This paper discusses the application of these techniques to an existing space vehicle power subsystem resource and performance analysis simulation in a parallel processing environment. Out preliminary results demonstrate that this approach has the potential to improve the space system trade study process by allowing engineers to statistically weight subsystem goals of mass, cost and performance then automatically size power elements based on anticipated performance of the subsystem rather than on worst-case estimates.

  14. Analyzing Data for Systems Biology: Working at the Intersection of Thermodynamics and Data Analytics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cannon, William R.; Baxter, Douglas J.

    2012-08-15

    Many challenges in systems biology have to do with analyzing data within the framework of molecular phenomena and cellular pathways. How does this relate to thermodynamics that we know govern the behavior of molecules? Making progress in relating data analysis to thermodynamics is essential in systems biology if we are to build predictive models that enable the field of synthetic biology. This report discusses work at the crossroads of thermodynamics and data analysis, and demonstrates that statistical mechanical free energy is a multinomial log likelihood. Applications to systems biology are presented.

  15. Discriminant analysis of Raman spectra for body fluid identification for forensic purposes.

    PubMed

    Sikirzhytski, Vitali; Virkler, Kelly; Lednev, Igor K

    2010-01-01

    Detection and identification of blood, semen and saliva stains, the most common body fluids encountered at a crime scene, are very important aspects of forensic science today. This study targets the development of a nondestructive, confirmatory method for body fluid identification based on Raman spectroscopy coupled with advanced statistical analysis. Dry traces of blood, semen and saliva obtained from multiple donors were probed using a confocal Raman microscope with a 785-nm excitation wavelength under controlled laboratory conditions. Results demonstrated the capability of Raman spectroscopy to identify an unknown substance to be semen, blood or saliva with high confidence.

  16. Atomic-scale phase composition through multivariate statistical analysis of atom probe tomography data.

    PubMed

    Keenan, Michael R; Smentkowski, Vincent S; Ulfig, Robert M; Oltman, Edward; Larson, David J; Kelly, Thomas F

    2011-06-01

    We demonstrate for the first time that multivariate statistical analysis techniques can be applied to atom probe tomography data to estimate the chemical composition of a sample at the full spatial resolution of the atom probe in three dimensions. Whereas the raw atom probe data provide the specific identity of an atom at a precise location, the multivariate results can be interpreted in terms of the probabilities that an atom representing a particular chemical phase is situated there. When aggregated to the size scale of a single atom (∼0.2 nm), atom probe spectral-image datasets are huge and extremely sparse. In fact, the average spectrum will have somewhat less than one total count per spectrum due to imperfect detection efficiency. These conditions, under which the variance in the data is completely dominated by counting noise, test the limits of multivariate analysis, and an extensive discussion of how to extract the chemical information is presented. Efficient numerical approaches to performing principal component analysis (PCA) on these datasets, which may number hundreds of millions of individual spectra, are put forward, and it is shown that PCA can be computed in a few seconds on a typical laptop computer.

  17. Cigarette smoking impairs clinical outcomes of assisted reproduction technologies: a meta-analysis of the literature.

    PubMed

    Budani, Maria Cristina; Fensore, Stefania; Di Marzio, Marco; Tiboni, Gian Mario

    2018-06-12

    There is convincing evidence that cigarette smoking can impair female reproductive potential. This meta-analysis updates the knowledge regarding the effects of cigarette smoking on clinical outcomes of assisted reproductive technologies (ART). Twenty-six studies were included in this meta-analysis. Results were expressed as odds ratios (OR) with 95% confidence intervals (CI) and statistical heterogeneity between the studies was evaluated with Higgins (I 2 ), Breslow (τ 2 ), Birge's ratio (H 2 ) indices and Chi-square test (χ 2 ). A P-value < 0.05 was considered statistically significant. The analysis showed a significant decrease in live birth rate per cycle for smoking patients (OR 0.59, 95% CI 0.44-0.79; P = 0.0005), a significant lower clinical pregnancy rate per cycle for smoking women (OR 0.53, 95% CI 0.41-0.68; P < 0.0001), and a significant increase in terms of spontaneous miscarriage rate (OR 2.22, 95% CI 1.10-4.48; P = 0.025) for smokers. These findings demonstrate clear negative effects of cigarette smoking on the outcome of ART programs. Copyright © 2018 Elsevier Inc. All rights reserved.

  18. A standards-based method for compositional analysis by energy dispersive X-ray spectrometry using multivariate statistical analysis: application to multicomponent alloys.

    PubMed

    Rathi, Monika; Ahrenkiel, S P; Carapella, J J; Wanlass, M W

    2013-02-01

    Given an unknown multicomponent alloy, and a set of standard compounds or alloys of known composition, can one improve upon popular standards-based methods for energy dispersive X-ray (EDX) spectrometry to quantify the elemental composition of the unknown specimen? A method is presented here for determining elemental composition of alloys using transmission electron microscopy-based EDX with appropriate standards. The method begins with a discrete set of related reference standards of known composition, applies multivariate statistical analysis to those spectra, and evaluates the compositions with a linear matrix algebra method to relate the spectra to elemental composition. By using associated standards, only limited assumptions about the physical origins of the EDX spectra are needed. Spectral absorption corrections can be performed by providing an estimate of the foil thickness of one or more reference standards. The technique was applied to III-V multicomponent alloy thin films: composition and foil thickness were determined for various III-V alloys. The results were then validated by comparing with X-ray diffraction and photoluminescence analysis, demonstrating accuracy of approximately 1% in atomic fraction.

  19. Improving Accuracy and Temporal Resolution of Learning Curve Estimation for within- and across-Session Analysis

    PubMed Central

    Tabelow, Karsten; König, Reinhard; Polzehl, Jörg

    2016-01-01

    Estimation of learning curves is ubiquitously based on proportions of correct responses within moving trial windows. Thereby, it is tacitly assumed that learning performance is constant within the moving windows, which, however, is often not the case. In the present study we demonstrate that violations of this assumption lead to systematic errors in the analysis of learning curves, and we explored the dependency of these errors on window size, different statistical models, and learning phase. To reduce these errors in the analysis of single-subject data as well as on the population level, we propose adequate statistical methods for the estimation of learning curves and the construction of confidence intervals, trial by trial. Applied to data from an avoidance learning experiment with rodents, these methods revealed performance changes occurring at multiple time scales within and across training sessions which were otherwise obscured in the conventional analysis. Our work shows that the proper assessment of the behavioral dynamics of learning at high temporal resolution can shed new light on specific learning processes, and, thus, allows to refine existing learning concepts. It further disambiguates the interpretation of neurophysiological signal changes recorded during training in relation to learning. PMID:27303809

  20. Computer-Assisted Instruction in Statistics. Technical Report.

    ERIC Educational Resources Information Center

    Cooley, William W.

    A paper given at a conference on statistical computation discussed teaching statistics with computers. It concluded that computer-assisted instruction is most appropriately employed in the numerical demonstration of statistical concepts, and for statistical laboratory instruction. The student thus learns simultaneously about the use of computers…

  1. Transient evoked oto-acoustic emission screening in newborns in Bogotá, Colombia: a retrospective study.

    PubMed

    Rojas, Jorge A; Bernal, Jaime E; García, Mary A; Zarante, Ignacio; Ramírez, Natalia; Bernal, Constanza; Gelvez, Nancy; Tamayo, Marta L

    2014-10-01

    The aim of this study was to investigate the characteristics and performance of transient evoked oto-acoustic emission (TEOAE) hearing screening in newborns in Colombia, and analyze all possible variables and factors affecting the results. An observational, descriptive and retrospective study with bivariate analysis was performed. The study population consisted of 56,822 newborns evaluated at the private institution, PREGEN. TEOAE testing was carried out as a pediatric hearing screening test from December 2003 to March 2012. The database from PREGEN was revised, and the protocol for evaluation included the same screening test performed twice. Demographic characteristics were recorded and the newborn's background was evaluated. Basic statistics of the qualitative and quantitative variables, and statistical analysis were obtained using the chi-square test. Of the 56,822 records examined, 0.28% were classed as abnormal, which corresponded to a prevalence of 1 in 350. In the screened newborns, 0.08% had a major abnormality or other clinical condition diagnosed, and 0.29% reported a family history of hearing loss. A prevalence of 6.7 in 10,000 was obtained for microtia, which is similar to the 6.4 in 10,000 previously reported in Colombia (database of the Latin-American Collaborative Study of Congenital Malformations - ECLAMC). Statistical analysis demonstrated an association between presenting with a major anomaly and a higher frequency of abnormal results on both TEOAE tests. Newborns in Colombia do not currently undergo screening for the early detection of hearing impairment. The results from this study suggest TEOAE screening tests, when performed twice, are able to detect hearing abnormalities in newborns. This highlights the need to improve the long-term evaluation and monitoring of patients in Colombia through diagnostic tests, and to provide tests that are both sensitive and specific. Furthermore, the use of TEOAE screening is justified by the favorable cost: benefit ratio demonstrated in many countries worldwide. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  2. ON MODEL SELECTION STRATEGIES TO IDENTIFY GENES UNDERLYING BINARY TRAITS USING GENOME-WIDE ASSOCIATION DATA.

    PubMed

    Wu, Zheyang; Zhao, Hongyu

    2012-01-01

    For more fruitful discoveries of genetic variants associated with diseases in genome-wide association studies, it is important to know whether joint analysis of multiple markers is more powerful than the commonly used single-marker analysis, especially in the presence of gene-gene interactions. This article provides a statistical framework to rigorously address this question through analytical power calculations for common model search strategies to detect binary trait loci: marginal search, exhaustive search, forward search, and two-stage screening search. Our approach incorporates linkage disequilibrium, random genotypes, and correlations among score test statistics of logistic regressions. We derive analytical results under two power definitions: the power of finding all the associated markers and the power of finding at least one associated marker. We also consider two types of error controls: the discovery number control and the Bonferroni type I error rate control. After demonstrating the accuracy of our analytical results by simulations, we apply them to consider a broad genetic model space to investigate the relative performances of different model search strategies. Our analytical study provides rapid computation as well as insights into the statistical mechanism of capturing genetic signals under different genetic models including gene-gene interactions. Even though we focus on genetic association analysis, our results on the power of model selection procedures are clearly very general and applicable to other studies.

  3. Optimizing human activity patterns using global sensitivity analysis.

    PubMed

    Fairchild, Geoffrey; Hickmann, Kyle S; Mniszewski, Susan M; Del Valle, Sara Y; Hyman, James M

    2014-12-01

    Implementing realistic activity patterns for a population is crucial for modeling, for example, disease spread, supply and demand, and disaster response. Using the dynamic activity simulation engine, DASim, we generate schedules for a population that capture regular (e.g., working, eating, and sleeping) and irregular activities (e.g., shopping or going to the doctor). We use the sample entropy (SampEn) statistic to quantify a schedule's regularity for a population. We show how to tune an activity's regularity by adjusting SampEn, thereby making it possible to realistically design activities when creating a schedule. The tuning process sets up a computationally intractable high-dimensional optimization problem. To reduce the computational demand, we use Bayesian Gaussian process regression to compute global sensitivity indices and identify the parameters that have the greatest effect on the variance of SampEn. We use the harmony search (HS) global optimization algorithm to locate global optima. Our results show that HS combined with global sensitivity analysis can efficiently tune the SampEn statistic with few search iterations. We demonstrate how global sensitivity analysis can guide statistical emulation and global optimization algorithms to efficiently tune activities and generate realistic activity patterns. Though our tuning methods are applied to dynamic activity schedule generation, they are general and represent a significant step in the direction of automated tuning and optimization of high-dimensional computer simulations.

  4. In-depth statistical analysis of the use of a website providing patients' narratives on lifestyle change when living with chronic back pain or coronary heart disease.

    PubMed

    Schweier, Rebecca; Grande, Gesine; Richter, Cynthia; Riedel-Heller, Steffi G; Romppel, Matthias

    2018-07-01

    To investigate the use of lebensstil-aendern.de ("lifestyle change"), a website providing peer narratives of experiences with successful lifestyle change, and to analyze whether peer model characteristics, clip content, and media type have an influence on the number of visitors, dwell time, and exit rates. An in-depth statistical analysis of website use with multilevel regression analyses. In two years, lebensstil-aendern.de attracted 12,844 visitors. The in-depth statistical analysis of usage rates demonstrated that audio clips were less popular than video or text-only clips, longer clips attracted more visitors, and clips by younger and female interviewees were preferred. User preferences for clip content categories differed between heart and back pain patients. Clips about stress management drew the smallest numbers of visitors in both indication modules. Patients are interested in the experiences of others. Because the quality of information for user-generated content is generally low, healthcare providers should include quality-assured patient narratives in their interventions. User preferences for content, medium, and peer characteristics need to be taken into account. If healthcare providers decide to include patient experiences in their websites, they should plan their intervention according to the different needs and preferences of users. Copyright © 2018 Elsevier B.V. All rights reserved.

  5. Optimizing human activity patterns using global sensitivity analysis

    PubMed Central

    Hickmann, Kyle S.; Mniszewski, Susan M.; Del Valle, Sara Y.; Hyman, James M.

    2014-01-01

    Implementing realistic activity patterns for a population is crucial for modeling, for example, disease spread, supply and demand, and disaster response. Using the dynamic activity simulation engine, DASim, we generate schedules for a population that capture regular (e.g., working, eating, and sleeping) and irregular activities (e.g., shopping or going to the doctor). We use the sample entropy (SampEn) statistic to quantify a schedule’s regularity for a population. We show how to tune an activity’s regularity by adjusting SampEn, thereby making it possible to realistically design activities when creating a schedule. The tuning process sets up a computationally intractable high-dimensional optimization problem. To reduce the computational demand, we use Bayesian Gaussian process regression to compute global sensitivity indices and identify the parameters that have the greatest effect on the variance of SampEn. We use the harmony search (HS) global optimization algorithm to locate global optima. Our results show that HS combined with global sensitivity analysis can efficiently tune the SampEn statistic with few search iterations. We demonstrate how global sensitivity analysis can guide statistical emulation and global optimization algorithms to efficiently tune activities and generate realistic activity patterns. Though our tuning methods are applied to dynamic activity schedule generation, they are general and represent a significant step in the direction of automated tuning and optimization of high-dimensional computer simulations. PMID:25580080

  6. Quality of life in breast cancer patients--a quantile regression analysis.

    PubMed

    Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma

    2008-01-01

    Quality of life study has an important role in health care especially in chronic diseases, in clinical judgment and in medical resources supplying. Statistical tools like linear regression are widely used to assess the predictors of quality of life. But when the response is not normal the results are misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients, using quantile regression model and compare to linear regression. A cross-sectional study conducted on 119 breast cancer patients that admitted and treated in chemotherapy ward of Namazi hospital in Shiraz. We used QLQ-C30 questionnaire to assessment quality of life in these patients. A quantile regression was employed to assess the assocciated factors and the results were compared to linear regression. All analysis carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In spite of linear regression, financial difficulties were not significant in quantile regression analysis and dyspnea was only significant for first quartile. Also emotion functioning and duration of disease statistically predicted the QOL score in the third quartile. The results have demonstrated that using quantile regression leads to better interpretation and richer inference about predictors of the breast cancer patient quality of life.

  7. Ankle plantarflexion strength in rearfoot and forefoot runners: a novel clusteranalytic approach.

    PubMed

    Liebl, Dominik; Willwacher, Steffen; Hamill, Joseph; Brüggemann, Gert-Peter

    2014-06-01

    The purpose of the present study was to test for differences in ankle plantarflexion strengths of habitually rearfoot and forefoot runners. In order to approach this issue, we revisit the problem of classifying different footfall patterns in human runners. A dataset of 119 subjects running shod and barefoot (speed 3.5m/s) was analyzed. The footfall patterns were clustered by a novel statistical approach, which is motivated by advances in the statistical literature on functional data analysis. We explain the novel statistical approach in detail and compare it to the classically used strike index of Cavanagh and Lafortune (1980). The two groups found by the new cluster approach are well interpretable as a forefoot and a rearfoot footfall groups. The subsequent comparison study of the clustered subjects reveals that runners with a forefoot footfall pattern are capable of producing significantly higher joint moments in a maximum voluntary contraction (MVC) of their ankle plantarflexor muscles tendon units; difference in means: 0.28Nm/kg. This effect remains significant after controlling for an additional gender effect and for differences in training levels. Our analysis confirms the hypothesis that forefoot runners have a higher mean MVC plantarflexion strength than rearfoot runners. Furthermore, we demonstrate that our proposed stochastic cluster analysis provides a robust and useful framework for clustering foot strikes. Copyright © 2014 Elsevier B.V. All rights reserved.

  8. ON MODEL SELECTION STRATEGIES TO IDENTIFY GENES UNDERLYING BINARY TRAITS USING GENOME-WIDE ASSOCIATION DATA

    PubMed Central

    Wu, Zheyang; Zhao, Hongyu

    2013-01-01

    For more fruitful discoveries of genetic variants associated with diseases in genome-wide association studies, it is important to know whether joint analysis of multiple markers is more powerful than the commonly used single-marker analysis, especially in the presence of gene-gene interactions. This article provides a statistical framework to rigorously address this question through analytical power calculations for common model search strategies to detect binary trait loci: marginal search, exhaustive search, forward search, and two-stage screening search. Our approach incorporates linkage disequilibrium, random genotypes, and correlations among score test statistics of logistic regressions. We derive analytical results under two power definitions: the power of finding all the associated markers and the power of finding at least one associated marker. We also consider two types of error controls: the discovery number control and the Bonferroni type I error rate control. After demonstrating the accuracy of our analytical results by simulations, we apply them to consider a broad genetic model space to investigate the relative performances of different model search strategies. Our analytical study provides rapid computation as well as insights into the statistical mechanism of capturing genetic signals under different genetic models including gene-gene interactions. Even though we focus on genetic association analysis, our results on the power of model selection procedures are clearly very general and applicable to other studies. PMID:23956610

  9. iTTVis: Interactive Visualization of Table Tennis Data.

    PubMed

    Wu, Yingcai; Lan, Ji; Shu, Xinhuan; Ji, Chenyang; Zhao, Kejian; Wang, Jiachen; Zhang, Hui

    2018-01-01

    The rapid development of information technology paved the way for the recording of fine-grained data, such as stroke techniques and stroke placements, during a table tennis match. This data recording creates opportunities to analyze and evaluate matches from new perspectives. Nevertheless, the increasingly complex data poses a significant challenge to make sense of and gain insights into. Analysts usually employ tedious and cumbersome methods which are limited to watching videos and reading statistical tables. However, existing sports visualization methods cannot be applied to visualizing table tennis competitions due to different competition rules and particular data attributes. In this work, we collaborate with data analysts to understand and characterize the sophisticated domain problem of analysis of table tennis data. We propose iTTVis, a novel interactive table tennis visualization system, which to our knowledge, is the first visual analysis system for analyzing and exploring table tennis data. iTTVis provides a holistic visualization of an entire match from three main perspectives, namely, time-oriented, statistical, and tactical analyses. The proposed system with several well-coordinated views not only supports correlation identification through statistics and pattern detection of tactics with a score timeline but also allows cross analysis to gain insights. Data analysts have obtained several new insights by using iTTVis. The effectiveness and usability of the proposed system are demonstrated with four case studies.

  10. Evaluating the efficiency of environmental monitoring programs

    USGS Publications Warehouse

    Levine, Carrie R.; Yanai, Ruth D.; Lampman, Gregory G.; Burns, Douglas A.; Driscoll, Charles T.; Lawrence, Gregory B.; Lynch, Jason; Schoch, Nina

    2014-01-01

    Statistical uncertainty analyses can be used to improve the efficiency of environmental monitoring, allowing sampling designs to maximize information gained relative to resources required for data collection and analysis. In this paper, we illustrate four methods of data analysis appropriate to four types of environmental monitoring designs. To analyze a long-term record from a single site, we applied a general linear model to weekly stream chemistry data at Biscuit Brook, NY, to simulate the effects of reducing sampling effort and to evaluate statistical confidence in the detection of change over time. To illustrate a detectable difference analysis, we analyzed a one-time survey of mercury concentrations in loon tissues in lakes in the Adirondack Park, NY, demonstrating the effects of sampling intensity on statistical power and the selection of a resampling interval. To illustrate a bootstrapping method, we analyzed the plot-level sampling intensity of forest inventory at the Hubbard Brook Experimental Forest, NH, to quantify the sampling regime needed to achieve a desired confidence interval. Finally, to analyze time-series data from multiple sites, we assessed the number of lakes and the number of samples per year needed to monitor change over time in Adirondack lake chemistry using a repeated-measures mixed-effects model. Evaluations of time series and synoptic long-term monitoring data can help determine whether sampling should be re-allocated in space or time to optimize the use of financial and human resources.

  11. Optimizing human activity patterns using global sensitivity analysis

    DOE PAGES

    Fairchild, Geoffrey; Hickmann, Kyle S.; Mniszewski, Susan M.; ...

    2013-12-10

    Implementing realistic activity patterns for a population is crucial for modeling, for example, disease spread, supply and demand, and disaster response. Using the dynamic activity simulation engine, DASim, we generate schedules for a population that capture regular (e.g., working, eating, and sleeping) and irregular activities (e.g., shopping or going to the doctor). We use the sample entropy (SampEn) statistic to quantify a schedule’s regularity for a population. We show how to tune an activity’s regularity by adjusting SampEn, thereby making it possible to realistically design activities when creating a schedule. The tuning process sets up a computationally intractable high-dimensional optimizationmore » problem. To reduce the computational demand, we use Bayesian Gaussian process regression to compute global sensitivity indices and identify the parameters that have the greatest effect on the variance of SampEn. Here we use the harmony search (HS) global optimization algorithm to locate global optima. Our results show that HS combined with global sensitivity analysis can efficiently tune the SampEn statistic with few search iterations. We demonstrate how global sensitivity analysis can guide statistical emulation and global optimization algorithms to efficiently tune activities and generate realistic activity patterns. Finally, though our tuning methods are applied to dynamic activity schedule generation, they are general and represent a significant step in the direction of automated tuning and optimization of high-dimensional computer simulations.« less

  12. A powerful approach for association analysis incorporating imprinting effects

    PubMed Central

    Xia, Fan; Zhou, Ji-Yuan; Fung, Wing Kam

    2011-01-01

    Motivation: For a diallelic marker locus, the transmission disequilibrium test (TDT) is a simple and powerful design for genetic studies. The TDT was originally proposed for use in families with both parents available (complete nuclear families) and has further been extended to 1-TDT for use in families with only one of the parents available (incomplete nuclear families). Currently, the increasing interest of the influence of parental imprinting on heritability indicates the importance of incorporating imprinting effects into the mapping of association variants. Results: In this article, we extend the TDT-type statistics to incorporate imprinting effects and develop a series of new test statistics in a general two-stage framework for association studies. Our test statistics enjoy the nature of family-based designs that need no assumption of Hardy–Weinberg equilibrium. Also, the proposed methods accommodate complete and incomplete nuclear families with one or more affected children. In the simulation study, we verify the validity of the proposed test statistics under various scenarios, and compare the powers of the proposed statistics with some existing test statistics. It is shown that our methods greatly improve the power for detecting association in the presence of imprinting effects. We further demonstrate the advantage of our methods by the application of the proposed test statistics to a rheumatoid arthritis dataset. Contact: wingfung@hku.hk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21798962

  13. A powerful approach for association analysis incorporating imprinting effects.

    PubMed

    Xia, Fan; Zhou, Ji-Yuan; Fung, Wing Kam

    2011-09-15

    For a diallelic marker locus, the transmission disequilibrium test (TDT) is a simple and powerful design for genetic studies. The TDT was originally proposed for use in families with both parents available (complete nuclear families) and has further been extended to 1-TDT for use in families with only one of the parents available (incomplete nuclear families). Currently, the increasing interest of the influence of parental imprinting on heritability indicates the importance of incorporating imprinting effects into the mapping of association variants. In this article, we extend the TDT-type statistics to incorporate imprinting effects and develop a series of new test statistics in a general two-stage framework for association studies. Our test statistics enjoy the nature of family-based designs that need no assumption of Hardy-Weinberg equilibrium. Also, the proposed methods accommodate complete and incomplete nuclear families with one or more affected children. In the simulation study, we verify the validity of the proposed test statistics under various scenarios, and compare the powers of the proposed statistics with some existing test statistics. It is shown that our methods greatly improve the power for detecting association in the presence of imprinting effects. We further demonstrate the advantage of our methods by the application of the proposed test statistics to a rheumatoid arthritis dataset. wingfung@hku.hk Supplementary data are available at Bioinformatics online.

  14. Scan statistics with local vote for target detection in distributed system

    NASA Astrophysics Data System (ADS)

    Luo, Junhai; Wu, Qi

    2017-12-01

    Target detection has occupied a pivotal position in distributed system. Scan statistics, as one of the most efficient detection methods, has been applied to a variety of anomaly detection problems and significantly improves the probability of detection. However, scan statistics cannot achieve the expected performance when the noise intensity is strong, or the signal emitted by the target is weak. The local vote algorithm can also achieve higher target detection rate. After the local vote, the counting rule is always adopted for decision fusion. The counting rule does not use the information about the contiguity of sensors but takes all sensors' data into consideration, which makes the result undesirable. In this paper, we propose a scan statistics with local vote (SSLV) method. This method combines scan statistics with local vote decision. Before scan statistics, each sensor executes local vote decision according to the data of its neighbors and its own. By combining the advantages of both, our method can obtain higher detection rate in low signal-to-noise ratio environment than the scan statistics. After the local vote decision, the distribution of sensors which have detected the target becomes more intensive. To make full use of local vote decision, we introduce a variable-step-parameter for the SSLV. It significantly shortens the scan period especially when the target is absent. Analysis and simulations are presented to demonstrate the performance of our method.

  15. A Powerful Procedure for Pathway-Based Meta-analysis Using Summary Statistics Identifies 43 Pathways Associated with Type II Diabetes in European Populations.

    PubMed

    Zhang, Han; Wheeler, William; Hyland, Paula L; Yang, Yifan; Shi, Jianxin; Chatterjee, Nilanjan; Yu, Kai

    2016-06-01

    Meta-analysis of multiple genome-wide association studies (GWAS) has become an effective approach for detecting single nucleotide polymorphism (SNP) associations with complex traits. However, it is difficult to integrate the readily accessible SNP-level summary statistics from a meta-analysis into more powerful multi-marker testing procedures, which generally require individual-level genetic data. We developed a general procedure called Summary based Adaptive Rank Truncated Product (sARTP) for conducting gene and pathway meta-analysis that uses only SNP-level summary statistics in combination with genotype correlation estimated from a panel of individual-level genetic data. We demonstrated the validity and power advantage of sARTP through empirical and simulated data. We conducted a comprehensive pathway-based meta-analysis with sARTP on type 2 diabetes (T2D) by integrating SNP-level summary statistics from two large studies consisting of 19,809 T2D cases and 111,181 controls with European ancestry. Among 4,713 candidate pathways from which genes in neighborhoods of 170 GWAS established T2D loci were excluded, we detected 43 T2D globally significant pathways (with Bonferroni corrected p-values < 0.05), which included the insulin signaling pathway and T2D pathway defined by KEGG, as well as the pathways defined according to specific gene expression patterns on pancreatic adenocarcinoma, hepatocellular carcinoma, and bladder carcinoma. Using summary data from 8 eastern Asian T2D GWAS with 6,952 cases and 11,865 controls, we showed 7 out of the 43 pathways identified in European populations remained to be significant in eastern Asians at the false discovery rate of 0.1. We created an R package and a web-based tool for sARTP with the capability to analyze pathways with thousands of genes and tens of thousands of SNPs.

  16. A Powerful Procedure for Pathway-Based Meta-analysis Using Summary Statistics Identifies 43 Pathways Associated with Type II Diabetes in European Populations

    PubMed Central

    Zhang, Han; Wheeler, William; Hyland, Paula L.; Yang, Yifan; Shi, Jianxin; Chatterjee, Nilanjan; Yu, Kai

    2016-01-01

    Meta-analysis of multiple genome-wide association studies (GWAS) has become an effective approach for detecting single nucleotide polymorphism (SNP) associations with complex traits. However, it is difficult to integrate the readily accessible SNP-level summary statistics from a meta-analysis into more powerful multi-marker testing procedures, which generally require individual-level genetic data. We developed a general procedure called Summary based Adaptive Rank Truncated Product (sARTP) for conducting gene and pathway meta-analysis that uses only SNP-level summary statistics in combination with genotype correlation estimated from a panel of individual-level genetic data. We demonstrated the validity and power advantage of sARTP through empirical and simulated data. We conducted a comprehensive pathway-based meta-analysis with sARTP on type 2 diabetes (T2D) by integrating SNP-level summary statistics from two large studies consisting of 19,809 T2D cases and 111,181 controls with European ancestry. Among 4,713 candidate pathways from which genes in neighborhoods of 170 GWAS established T2D loci were excluded, we detected 43 T2D globally significant pathways (with Bonferroni corrected p-values < 0.05), which included the insulin signaling pathway and T2D pathway defined by KEGG, as well as the pathways defined according to specific gene expression patterns on pancreatic adenocarcinoma, hepatocellular carcinoma, and bladder carcinoma. Using summary data from 8 eastern Asian T2D GWAS with 6,952 cases and 11,865 controls, we showed 7 out of the 43 pathways identified in European populations remained to be significant in eastern Asians at the false discovery rate of 0.1. We created an R package and a web-based tool for sARTP with the capability to analyze pathways with thousands of genes and tens of thousands of SNPs. PMID:27362418

  17. Application of computer-aided diagnosis (CAD) in MR-mammography (MRM): do we really need whole lesion time curve distribution analysis?

    PubMed

    Baltzer, Pascal Andreas Thomas; Renz, Diane M; Kullnig, Petra E; Gajda, Mieczyslaw; Camara, Oumar; Kaiser, Werner A

    2009-04-01

    The identification of the most suspect enhancing part of a lesion is regarded as a major diagnostic criterion in dynamic magnetic resonance mammography. Computer-aided diagnosis (CAD) software allows the semi-automatic analysis of the kinetic characteristics of complete enhancing lesions, providing additional information about lesion vasculature. The diagnostic value of this information has not yet been quantified. Consecutive patients from routine diagnostic studies (1.5 T, 0.1 mmol gadopentetate dimeglumine, dynamic gradient-echo sequences at 1-minute intervals) were analyzed prospectively using CAD. Dynamic sequences were processed and reduced to a parametric map. Curve types were classified by initial signal increase (not significant, intermediate, and strong) and the delayed time course of signal intensity (continuous, plateau, and washout). Lesion enhancement was measured using CAD. The most suspect curve, the curve-type distribution percentage, and combined dynamic data were compared. Statistical analysis included logistic regression analysis and receiver-operating characteristic analysis. Fifty-one patients with 46 malignant and 44 benign lesions were enrolled. On receiver-operating characteristic analysis, the most suspect curve showed diagnostic accuracy of 76.7 +/- 5%. In comparison, the curve-type distribution percentage demonstrated accuracy of 80.2 +/- 4.9%. Combined dynamic data had the highest diagnostic accuracy (84.3 +/- 4.2%). These differences did not achieve statistical significance. With appropriate cutoff values, sensitivity and specificity, respectively, were found to be 80.4% and 72.7% for the most suspect curve, 76.1% and 83.6% for the curve-type distribution percentage, and 78.3% and 84.5% for both parameters. The integration of whole-lesion dynamic data tends to improve specificity. However, no statistical significance backs up this finding.

  18. Statistical significance approximation in local trend analysis of high-throughput time-series data using the theory of Markov chains.

    PubMed

    Xia, Li C; Ai, Dongmei; Cram, Jacob A; Liang, Xiaoyi; Fuhrman, Jed A; Sun, Fengzhu

    2015-09-21

    Local trend (i.e. shape) analysis of time series data reveals co-changing patterns in dynamics of biological systems. However, slow permutation procedures to evaluate the statistical significance of local trend scores have limited its applications to high-throughput time series data analysis, e.g., data from the next generation sequencing technology based studies. By extending the theories for the tail probability of the range of sum of Markovian random variables, we propose formulae for approximating the statistical significance of local trend scores. Using simulations and real data, we show that the approximate p-value is close to that obtained using a large number of permutations (starting at time points >20 with no delay and >30 with delay of at most three time steps) in that the non-zero decimals of the p-values obtained by the approximation and the permutations are mostly the same when the approximate p-value is less than 0.05. In addition, the approximate p-value is slightly larger than that based on permutations making hypothesis testing based on the approximate p-value conservative. The approximation enables efficient calculation of p-values for pairwise local trend analysis, making large scale all-versus-all comparisons possible. We also propose a hybrid approach by integrating the approximation and permutations to obtain accurate p-values for significantly associated pairs. We further demonstrate its use with the analysis of the Polymouth Marine Laboratory (PML) microbial community time series from high-throughput sequencing data and found interesting organism co-occurrence dynamic patterns. The software tool is integrated into the eLSA software package that now provides accelerated local trend and similarity analysis pipelines for time series data. The package is freely available from the eLSA website: http://bitbucket.org/charade/elsa.

  19. A Mokken scale analysis of the peer physical examination questionnaire.

    PubMed

    Vaughan, Brett; Grace, Sandra

    2018-01-01

    Peer physical examination (PPE) is a teaching and learning strategy utilised in most health profession education programs. Perceptions of participating in PPE have been described in the literature, focusing on areas of the body students are willing, or unwilling, to examine. A small number of questionnaires exist to evaluate these perceptions, however none have described the measurement properties that may allow them to be used longitudinally. The present study undertook a Mokken scale analysis of the Peer Physical Examination Questionnaire (PPEQ) to evaluate its dimensionality and structure when used with Australian osteopathy students. Students enrolled in Year 1 of the osteopathy programs at Victoria University (Melbourne, Australia) and Southern Cross University (Lismore, Australia) were invited to complete the PPEQ prior to their first practical skills examination class. R, an open-source statistics program, was used to generate the descriptive statistics and perform a Mokken scale analysis. Mokken scale analysis is a non-parametric item response theory approach that is used to cluster items measuring a latent construct. Initial analysis suggested the PPEQ did not form a single scale. Further analysis identified three subscales: 'comfort', 'concern', and 'professionalism and education'. The properties of each subscale suggested they were unidimensional with variable internal structures. The 'comfort' subscale was the strongest of the three identified. All subscales demonstrated acceptable reliability estimation statistics (McDonald's omega > 0.75) supporting the calculation of a sum score for each subscale. The subscales identified are consistent with the literature. The 'comfort' subscale may be useful to longitudinally evaluate student perceptions of PPE. Further research is required to evaluate changes with PPE and the utility of the questionnaire with other health profession education programs.

  20. Spatial variation of volcanic rock geochemistry in the Virunga Volcanic Province: Statistical analysis of an integrated database

    NASA Astrophysics Data System (ADS)

    Barette, Florian; Poppe, Sam; Smets, Benoît; Benbakkar, Mhammed; Kervyn, Matthieu

    2017-10-01

    We present an integrated, spatially-explicit database of existing geochemical major-element analyses available from (post-) colonial scientific reports, PhD Theses and international publications for the Virunga Volcanic Province, located in the western branch of the East African Rift System. This volcanic province is characterised by alkaline volcanism, including silica-undersaturated, alkaline and potassic lavas. The database contains a total of 908 geochemical analyses of eruptive rocks for the entire volcanic province with a localisation for most samples. A preliminary analysis of the overall consistency of the database, using statistical techniques on sets of geochemical analyses with contrasted analytical methods or dates, demonstrates that the database is consistent. We applied a principal component analysis and cluster analysis on whole-rock major element compositions included in the database to study the spatial variation of the chemical composition of eruptive products in the Virunga Volcanic Province. These statistical analyses identify spatially distributed clusters of eruptive products. The known geochemical contrasts are highlighted by the spatial analysis, such as the unique geochemical signature of Nyiragongo lavas compared to other Virunga lavas, the geochemical heterogeneity of the Bulengo area, and the trachyte flows of Karisimbi volcano. Most importantly, we identified separate clusters of eruptive products which originate from primitive magmatic sources. These lavas of primitive composition are preferentially located along NE-SW inherited rift structures, often at distance from the central Virunga volcanoes. Our results illustrate the relevance of a spatial analysis on integrated geochemical data for a volcanic province, as a complement to classical petrological investigations. This approach indeed helps to characterise geochemical variations within a complex of magmatic systems and to identify specific petrologic and geochemical investigations that should be tackled within a study area.

  1. Statistical Analysis of Wireless Networks: Predicting Performance in Multiple Environments

    DTIC Science & Technology

    2006-06-01

    The demonstration planned for May 2006 is an air, ground, and water- based scenario, occurring just north of Chiang Mai , Thailand. The scenario...will be fused, displayed, and distributed in real-time to local ( Chiang Mai ), theater (Bangkok), and global (Alameda, California) command and control...COASTS 2006 TOPOLOGY The 2006 version of the COASTS project occurred just north of Chiang Mai , Thailand, at the Mae Ngat Dam. COTS systems were

  2. Power-up: A Reanalysis of 'Power Failure' in Neuroscience Using Mixture Modeling

    PubMed Central

    Wood, John

    2017-01-01

    Recently, evidence for endemically low statistical power has cast neuroscience findings into doubt. If low statistical power plagues neuroscience, then this reduces confidence in the reported effects. However, if statistical power is not uniformly low, then such blanket mistrust might not be warranted. Here, we provide a different perspective on this issue, analyzing data from an influential study reporting a median power of 21% across 49 meta-analyses (Button et al., 2013). We demonstrate, using Gaussian mixture modeling, that the sample of 730 studies included in that analysis comprises several subcomponents so the use of a single summary statistic is insufficient to characterize the nature of the distribution. We find that statistical power is extremely low for studies included in meta-analyses that reported a null result and that it varies substantially across subfields of neuroscience, with particularly low power in candidate gene association studies. Therefore, whereas power in neuroscience remains a critical issue, the notion that studies are systematically underpowered is not the full story: low power is far from a universal problem. SIGNIFICANCE STATEMENT Recently, researchers across the biomedical and psychological sciences have become concerned with the reliability of results. One marker for reliability is statistical power: the probability of finding a statistically significant result given that the effect exists. Previous evidence suggests that statistical power is low across the field of neuroscience. Our results present a more comprehensive picture of statistical power in neuroscience: on average, studies are indeed underpowered—some very seriously so—but many studies show acceptable or even exemplary statistical power. We show that this heterogeneity in statistical power is common across most subfields in neuroscience. This new, more nuanced picture of statistical power in neuroscience could affect not only scientific understanding, but potentially policy and funding decisions for neuroscience research. PMID:28706080

  3. Accounting for standard errors of vision-specific latent trait in regression models.

    PubMed

    Wong, Wan Ling; Li, Xiang; Li, Jialiang; Wong, Tien Yin; Cheng, Ching-Yu; Lamoureux, Ecosse L

    2014-07-11

    To demonstrate the effectiveness of Hierarchical Bayesian (HB) approach in a modeling framework for association effects that accounts for SEs of vision-specific latent traits assessed using Rasch analysis. A systematic literature review was conducted in four major ophthalmic journals to evaluate Rasch analysis performed on vision-specific instruments. The HB approach was used to synthesize the Rasch model and multiple linear regression model for the assessment of the association effects related to vision-specific latent traits. The effectiveness of this novel HB one-stage "joint-analysis" approach allows all model parameters to be estimated simultaneously and was compared with the frequently used two-stage "separate-analysis" approach in our simulation study (Rasch analysis followed by traditional statistical analyses without adjustment for SE of latent trait). Sixty-six reviewed articles performed evaluation and validation of vision-specific instruments using Rasch analysis, and 86.4% (n = 57) performed further statistical analyses on the Rasch-scaled data using traditional statistical methods; none took into consideration SEs of the estimated Rasch-scaled scores. The two models on real data differed for effect size estimations and the identification of "independent risk factors." Simulation results showed that our proposed HB one-stage "joint-analysis" approach produces greater accuracy (average of 5-fold decrease in bias) with comparable power and precision in estimation of associations when compared with the frequently used two-stage "separate-analysis" procedure despite accounting for greater uncertainty due to the latent trait. Patient-reported data, using Rasch analysis techniques, do not take into account the SE of latent trait in association analyses. The HB one-stage "joint-analysis" is a better approach, producing accurate effect size estimations and information about the independent association of exposure variables with vision-specific latent traits. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  4. The influence of depression and anxiety in the development of heart failure after coronary angioplasty.

    PubMed

    Gegenava, T; Gegenava, M; Kavtaradze, G

    2009-03-01

    The aim of our study was to investigate the association between history of depressive episode and anxiety and complications in patients after 6 months of coronary artery angioplasty. The research was conducted on 70 patients, the grade of coronary occlusion that would not respond to therapeutic treatment and need coronary angioplasty had been established. Complications were estimated in 60 patients after 6 months of coronary angioplasty. To evaluate depression we used Beck depression scale Anxiety was assessed by Spilberger State-trait anxiety scale. Statistic analysis of the data was made by means of the methods of variation statistics using Students' criterion and program of STATISTICA w 5.0. Complications were discovered in 36 (60%) patients; 24 (40%) patients had not complications. There was not revealed significant statistical differences in depression and anxiety degree in coronary angioplasty period and after 6 months of coronary angioplasty. There was not revealed significant statistical differences in depression and anxiety degree in coronary angioplasty period and after 6 months of coronary angioplasty. Our study demonstrated that complications were revealed in patients who had high degree of depression and anxiety.

  5. Texture as a basis for acoustic classification of substrate in the nearshore region

    NASA Astrophysics Data System (ADS)

    Dennison, A.; Wattrus, N. J.

    2016-12-01

    Segmentation and classification of substrate type from two locations in Lake Superior, are predicted using multivariate statistical processing of textural measures derived from shallow-water, high-resolution multibeam bathymetric data. During a multibeam sonar survey, both bathymetric and backscatter data are collected. It is well documented that the statistical characteristic of a sonar backscatter mosaic is dependent on substrate type. While classifying the bottom-type on the basis on backscatter alone can accurately predict and map bottom-type, it lacks the ability to resolve and capture fine textural details, an important factor in many habitat mapping studies. Statistical processing can capture the pertinent details about the bottom-type that are rich in textural information. Further multivariate statistical processing can then isolate characteristic features, and provide the basis for an accurate classification scheme. Preliminary results from an analysis of bathymetric data and ground-truth samples collected from the Amnicon River, Superior, Wisconsin, and the Lester River, Duluth, Minnesota, demonstrate the ability to process and develop a novel classification scheme of the bottom type in two geomorphologically distinct areas.

  6. Sb2Te3 and Its Superlattices: Optimization by Statistical Design.

    PubMed

    Behera, Jitendra K; Zhou, Xilin; Ranjan, Alok; Simpson, Robert E

    2018-05-02

    The objective of this work is to demonstrate the usefulness of fractional factorial design for optimizing the crystal quality of chalcogenide van der Waals (vdW) crystals. We statistically analyze the growth parameters of highly c axis oriented Sb 2 Te 3 crystals and Sb 2 Te 3 -GeTe phase change vdW heterostructured superlattices. The statistical significance of the growth parameters of temperature, pressure, power, buffer materials, and buffer layer thickness was found by fractional factorial design and response surface analysis. Temperature, pressure, power, and their second-order interactions are the major factors that significantly influence the quality of the crystals. Additionally, using tungsten rather than molybdenum as a buffer layer significantly enhances the crystal quality. Fractional factorial design minimizes the number of experiments that are necessary to find the optimal growth conditions, resulting in an order of magnitude improvement in the crystal quality. We highlight that statistical design of experiment methods, which is more commonly used in product design, should be considered more broadly by those designing and optimizing materials.

  7. An Adaptive Association Test for Multiple Phenotypes with GWAS Summary Statistics.

    PubMed

    Kim, Junghi; Bai, Yun; Pan, Wei

    2015-12-01

    We study the problem of testing for single marker-multiple phenotype associations based on genome-wide association study (GWAS) summary statistics without access to individual-level genotype and phenotype data. For most published GWASs, because obtaining summary data is substantially easier than accessing individual-level phenotype and genotype data, while often multiple correlated traits have been collected, the problem studied here has become increasingly important. We propose a powerful adaptive test and compare its performance with some existing tests. We illustrate its applications to analyses of a meta-analyzed GWAS dataset with three blood lipid traits and another with sex-stratified anthropometric traits, and further demonstrate its potential power gain over some existing methods through realistic simulation studies. We start from the situation with only one set of (possibly meta-analyzed) genome-wide summary statistics, then extend the method to meta-analysis of multiple sets of genome-wide summary statistics, each from one GWAS. We expect the proposed test to be useful in practice as more powerful than or complementary to existing methods. © 2015 WILEY PERIODICALS, INC.

  8. Brain MRI analysis for Alzheimer's disease diagnosis using an ensemble system of deep convolutional neural networks.

    PubMed

    Islam, Jyoti; Zhang, Yanqing

    2018-05-31

    Alzheimer's disease is an incurable, progressive neurological brain disorder. Earlier detection of Alzheimer's disease can help with proper treatment and prevent brain tissue damage. Several statistical and machine learning models have been exploited by researchers for Alzheimer's disease diagnosis. Analyzing magnetic resonance imaging (MRI) is a common practice for Alzheimer's disease diagnosis in clinical research. Detection of Alzheimer's disease is exacting due to the similarity in Alzheimer's disease MRI data and standard healthy MRI data of older people. Recently, advanced deep learning techniques have successfully demonstrated human-level performance in numerous fields including medical image analysis. We propose a deep convolutional neural network for Alzheimer's disease diagnosis using brain MRI data analysis. While most of the existing approaches perform binary classification, our model can identify different stages of Alzheimer's disease and obtains superior performance for early-stage diagnosis. We conducted ample experiments to demonstrate that our proposed model outperformed comparative baselines on the Open Access Series of Imaging Studies dataset.

  9. [Comparative analysis of variable regions in the genomes of variola virus].

    PubMed

    Babkin, I V; Nepomniashchikh, T S; Maksiutov, R A; Gutorov, V V; Babkina, I N; Shchelkunov, S N

    2008-01-01

    Nucleotide sequences of two extended segments of the terminal variable regions in variola virus genome were determined. The size of the left segment was 13.5 kbp and of the right, 10.5 kbp. Totally, over 540 kbp were sequenced for 22 variola virus strains. The conducted phylogenetic analysis and the data published earlier allowed us to find the interrelations between 70 variola virus isolates, the character of their clustering, and the degree of intergroup and intragroup variations of the clusters of variola virus strains. The most polymorphic loci of the genome segments studied were determined. It was demonstrated that that these loci are localized to either noncoding genome regions or to the regions of destroyed open reading frames, characteristic of the ancestor virus. These loci are promising for development of the strategy for genotyping variola virus strains. Analysis of recombination using various methods demonstrated that, with the only exception, no statistically significant recombinational events in the genomes of variola virus strains studied were detectable.

  10. Programmable quantum random number generator without postprocessing.

    PubMed

    Nguyen, Lac; Rehain, Patrick; Sua, Yong Meng; Huang, Yu-Ping

    2018-02-15

    We demonstrate a viable source of unbiased quantum random numbers whose statistical properties can be arbitrarily programmed without the need for any postprocessing such as randomness distillation or distribution transformation. It is based on measuring the arrival time of single photons in shaped temporal modes that are tailored with an electro-optical modulator. We show that quantum random numbers can be created directly in customized probability distributions and pass all randomness tests of the NIST and Dieharder test suites without any randomness extraction. The min-entropies of such generated random numbers are measured close to the theoretical limits, indicating their near-ideal statistics and ultrahigh purity. Easy to implement and arbitrarily programmable, this technique can find versatile uses in a multitude of data analysis areas.

  11. Improving Our Ability to Evaluate Underlying Mechanisms of Behavioral Onset and Other Event Occurrence Outcomes: A Discrete-Time Survival Mediation Model

    PubMed Central

    Fairchild, Amanda J.; Abara, Winston E.; Gottschall, Amanda C.; Tein, Jenn-Yun; Prinz, Ronald J.

    2015-01-01

    The purpose of this article is to introduce and describe a statistical model that researchers can use to evaluate underlying mechanisms of behavioral onset and other event occurrence outcomes. Specifically, the article develops a framework for estimating mediation effects with outcomes measured in discrete-time epochs by integrating the statistical mediation model with discrete-time survival analysis. The methodology has the potential to help strengthen health research by targeting prevention and intervention work more effectively as well as by improving our understanding of discretized periods of risk. The model is applied to an existing longitudinal data set to demonstrate its use, and programming code is provided to facilitate its implementation. PMID:24296470

  12. Towards Principled Experimental Study of Autonomous Mobile Robots

    NASA Technical Reports Server (NTRS)

    Gat, Erann

    1995-01-01

    We review the current state of research in autonomous mobile robots and conclude that there is an inadequate basis for predicting the reliability and behavior of robots operating in unengineered environments. We present a new approach to the study of autonomous mobile robot performance based on formal statistical analysis of independently reproducible experiments conducted on real robots. Simulators serve as models rather than experimental surrogates. We demonstrate three new results: 1) Two commonly used performance metrics (time and distance) are not as well correlated as is often tacitly assumed. 2) The probability distributions of these performance metrics are exponential rather than normal, and 3) a modular, object-oriented simulation accurately predicts the behavior of the real robot in a statistically significant manner.

  13. Hypothesis testing for band size detection of high-dimensional banded precision matrices.

    PubMed

    An, Baiguo; Guo, Jianhua; Liu, Yufeng

    2014-06-01

    Many statistical analysis procedures require a good estimator for a high-dimensional covariance matrix or its inverse, the precision matrix. When the precision matrix is banded, the Cholesky-based method often yields a good estimator of the precision matrix. One important aspect of this method is determination of the band size of the precision matrix. In practice, crossvalidation is commonly used; however, we show that crossvalidation not only is computationally intensive but can be very unstable. In this paper, we propose a new hypothesis testing procedure to determine the band size in high dimensions. Our proposed test statistic is shown to be asymptotically normal under the null hypothesis, and its theoretical power is studied. Numerical examples demonstrate the effectiveness of our testing procedure.

  14. [The incidence of craniomandibular disorders in patients with cervical dysfunctions. A clinico-statistical assessment].

    PubMed

    Carossa, S; Catapano, S; Previgliano, V; Preti, G

    1993-05-01

    The aim of this research was to measure the incidence of craniomandibular disorders in a group of patients with functional-type cervical alterations. The group consisted of 50 patients undergoing treatment for disorders of the cervical sectors of the spine. Each patient was subjected to a medical examination to investigate the presence of CMD signs or symptoms. From the data statistical analysis a higher percentage of cases with muscular and joint pain, limited mouth opening, deviation and deflection, were found in comparison with the percentage found among the general population. This demonstrates an overloading of the entire masticatory apparatus. Joint noise was less frequent, probably due to its exclusion from our sample of patients with arthrosis-type degenerative pathology.

  15. Distribution of CD163-positive cell and MHC class II-positive cell in the normal equine uveal tract.

    PubMed

    Sano, Yuto; Matsuda, Kazuya; Okamoto, Minoru; Takehana, Kazushige; Hirayama, Kazuko; Taniyama, Hiroyuki

    2016-02-01

    Antigen-presenting cells (APCs) in the uveal tract participate in ocular immunity including immune homeostasis and the pathogenesis of uveitis. In horses, although uveitis is the most common ocular disorder, little is known about ocular immunity, such as the distribution of APCs. In this study, we investigated the distribution of CD163-positive and MHC II-positive cells in the normal equine uveal tract using an immunofluorescence technique. Eleven eyes from 10 Thoroughbred horses aged 1 to 24 years old were used. Indirect immunofluorescence was performed using the primary antibodies CD163, MHC class II (MHC II) and CD20. To demonstrate the site of their greatest distribution, positive cells were manually counted in 3 different parts of the uveal tract (ciliary body, iris and choroid), and their average number was assessed by statistical analysis. The distribution of pleomorphic CD163- and MHC II-expressed cells was detected throughout the equine uveal tract, but no CD20-expressed cells were detected. The statistical analysis demonstrated the distribution of CD163- and MHC II-positive cells focusing on the ciliary body. These results demonstrated that the ciliary body is the largest site of their distribution in the normal equine uveal tract, and the ciliary body is considered to play important roles in uveal and/or ocular immune homeostasis. The data provided in this study will help further understanding of equine ocular immunity in the normal state and might be beneficial for understanding of mechanisms of ocular disorders, such as equine uveitis.

  16. Theoretical analysis of HVAC duct hanger systems

    NASA Technical Reports Server (NTRS)

    Miller, R. D.

    1987-01-01

    Several methods are presented which, together, may be used in the analysis of duct hanger systems over a wide range of frequencies. The finite element method (FEM) and component mode synthesis (CMS) method are used for low- to mid-frequency range computations and have been shown to yield reasonably close results. The statistical energy analysis (SEA) method yields predictions which agree with the CMS results for the 800 to 1000 Hz range provided that a sufficient number of modes participate. The CMS approach has been shown to yield valuable insight into the mid-frequency range of the analysis. It has been demonstrated that it is possible to conduct an analysis of a duct/hanger system in a cost-effective way for a wide frequency range, using several methods which overlap for several frequency bands.

  17. Psychometric evaluation of the Persian version of the Templer's Death Anxiety Scale in cancer patients.

    PubMed

    Soleimani, Mohammad Ali; Yaghoobzadeh, Ameneh; Bahrami, Nasim; Sharif, Saeed Pahlevan; Sharif Nia, Hamid

    2016-10-01

    In this study, 398 Iranian cancer patients completed the 15-item Templer's Death Anxiety Scale (TDAS). Tests of internal consistency, principal components analysis, and confirmatory factor analysis were conducted to assess the internal consistency and factorial validity of the Persian TDAS. The construct reliability statistic and average variance extracted were also calculated to measure construct reliability, convergent validity, and discriminant validity. Principal components analysis indicated a 3-component solution, which was generally supported in the confirmatory analysis. However, acceptable cutoffs for construct reliability, convergent validity, and discriminant validity were not fulfilled for the three subscales that were derived from the principal component analysis. This study demonstrated both the advantages and potential limitations of using the TDAS with Persian-speaking cancer patients.

  18. An analysis of pilot error-related aircraft accidents

    NASA Technical Reports Server (NTRS)

    Kowalsky, N. B.; Masters, R. L.; Stone, R. B.; Babcock, G. L.; Rypka, E. W.

    1974-01-01

    A multidisciplinary team approach to pilot error-related U.S. air carrier jet aircraft accident investigation records successfully reclaimed hidden human error information not shown in statistical studies. New analytic techniques were developed and applied to the data to discover and identify multiple elements of commonality and shared characteristics within this group of accidents. Three techniques of analysis were used: Critical element analysis, which demonstrated the importance of a subjective qualitative approach to raw accident data and surfaced information heretofore unavailable. Cluster analysis, which was an exploratory research tool that will lead to increased understanding and improved organization of facts, the discovery of new meaning in large data sets, and the generation of explanatory hypotheses. Pattern recognition, by which accidents can be categorized by pattern conformity after critical element identification by cluster analysis.

  19. Efficiency of bowel preparation for capsule endoscopy examination: a meta-analysis.

    PubMed

    Niv, Yaron

    2008-03-07

    Good preparation before endoscopic procedures is essential for successful visualization. The small bowel is difficult to evaluate because of its length and complex configuration. A meta-analysis was conducted of studies comparing small bowel visualization by capsule endoscopy with and without preparation. Medical data bases were searched for all studies investigating the preparation for capsule endoscopy of the small bowel up to July 31, 2007. Studies that scored bowel cleanness and measured gastric and small bowel transit time and rate of cecum visualization were included. The primary endpoint was the quality of bowel visualization. The secondary endpoints were transit times and proportion of examinations that demonstrated the cecum, with and without preparation. Meta-analysis was performed with StatDirect Statistical software, version 2.6.1 (http://statsdirect.com). Eight studies met the inclusion criteria. Bowel visualization was scored as "good" in 78% of the examinations performed with preparation and 49% performed without (P<0.0001). There were no significant differences in transit times or in the proportion of examinations that demonstrated the cecum with and without preparation. Capsule endoscopy preparation improves the quality of small bowel visualization, but has no effect on transit times, or demonstration of the cecum.

  20. Efficiency of bowel preparation for capsule endoscopy examination: A meta-analysis

    PubMed Central

    Niv, Yaron

    2008-01-01

    Good preparation before endoscopic procedures is essential for successful visualization. The small bowel is difficult to evaluate because of its length and complex configuration. A meta-analysis was conducted of studies comparing small bowel visualization by capsule endoscopy with and without preparation. Medical data bases were searched for all studies investigating the preparation for capsule endoscopy of the small bowel up to July 31, 2007. Studies that scored bowel cleanness and measured gastric and small bowel transit time and rate of cecum visualization were included. The primary endpoint was the quality of bowel visualization. The secondary endpoints were transit times and proportion of examinations that demonstrated the cecum, with and without preparation. Meta-analysis was performed with StatDirect Statistical software, version 2.6.1 (http://statsdirect.com). Eight studies met the inclusion criteria. Bowel visualization was scored as “good” in 78% of the examinations performed with preparation and 49% performed without (P < 0.0001). There were no significant differences in transit times or in the proportion of examinations that demonstrated the cecum with and without preparation. Capsule endoscopy preparation improves the quality of small bowel visualization, but has no effect on transit times, or demonstration of the cecum. PMID:18322940

Top