Sample records for statistical analysis includes

  1. Statistics 101 for Radiologists.

    PubMed

    Anvari, Arash; Halpern, Elkan F; Samir, Anthony E

    2015-10-01

    Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. © RSNA, 2015.

  2. Statistical Tutorial | Center for Cancer Research

    Cancer.gov

    Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data.  ST is designed as a follow up to Statistical Analysis of Research Data (SARD) held in April 2018.  The tutorial will apply the general principles of statistical analysis of research data including descriptive statistics, z- and t-tests of means and mean

  3. Statistical Analysis of Research Data | Center for Cancer Research

    Cancer.gov

    Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Analysis of Research Data (SARD) course will be held on April 5-6, 2018 from 9 a.m.-5 p.m. at the National Institutes of Health's Natcher Conference Center, Balcony C on the Bethesda Campus. SARD is designed to provide an overview on the general principles of statistical analysis of research data.  The first day will feature univariate data analysis, including descriptive statistics, probability distributions, one- and two-sample inferential statistics.

  4. Development of computer-assisted instruction application for statistical data analysis android platform as learning resource

    NASA Astrophysics Data System (ADS)

    Hendikawati, P.; Arifudin, R.; Zahid, M. Z.

    2018-03-01

    This study aims to design an android Statistics Data Analysis application that can be accessed through mobile devices to making it easier for users to access. The Statistics Data Analysis application includes various topics of basic statistical along with a parametric statistics data analysis application. The output of this application system is parametric statistics data analysis that can be used for students, lecturers, and users who need the results of statistical calculations quickly and easily understood. Android application development is created using Java programming language. The server programming language uses PHP with the Code Igniter framework, and the database used MySQL. The system development methodology used is the Waterfall methodology with the stages of analysis, design, coding, testing, and implementation and system maintenance. This statistical data analysis application is expected to support statistical lecturing activities and make students easier to understand the statistical analysis of mobile devices.

  5. The Content of Statistical Requirements for Authors in Biomedical Research Journals

    PubMed Central

    Liu, Tian-Yi; Cai, Si-Yu; Nie, Xiao-Lu; Lyu, Ya-Qi; Peng, Xiao-Xia; Feng, Guo-Shuang

    2016-01-01

    Background: Robust statistical designing, sound statistical analysis, and standardized presentation are important to enhance the quality and transparency of biomedical research. This systematic review was conducted to summarize the statistical reporting requirements introduced by biomedical research journals with an impact factor of 10 or above so that researchers are able to give statistical issues’ serious considerations not only at the stage of data analysis but also at the stage of methodological design. Methods: Detailed statistical instructions for authors were downloaded from the homepage of each of the included journals or obtained from the editors directly via email. Then, we described the types and numbers of statistical guidelines introduced by different press groups. Items of statistical reporting guideline as well as particular requirements were summarized in frequency, which were grouped into design, method of analysis, and presentation, respectively. Finally, updated statistical guidelines and particular requirements for improvement were summed up. Results: Totally, 21 of 23 press groups introduced at least one statistical guideline. More than half of press groups can update their statistical instruction for authors gradually relative to issues of new statistical reporting guidelines. In addition, 16 press groups, covering 44 journals, address particular statistical requirements. The most of the particular requirements focused on the performance of statistical analysis and transparency in statistical reporting, including “address issues relevant to research design, including participant flow diagram, eligibility criteria, and sample size estimation,” and “statistical methods and the reasons.” Conclusions: Statistical requirements for authors are becoming increasingly perfected. Statistical requirements for authors remind researchers that they should make sufficient consideration not only in regards to statistical methods during the research design, but also standardized statistical reporting, which would be beneficial in providing stronger evidence and making a greater critical appraisal of evidence more accessible. PMID:27748343

  6. The Content of Statistical Requirements for Authors in Biomedical Research Journals.

    PubMed

    Liu, Tian-Yi; Cai, Si-Yu; Nie, Xiao-Lu; Lyu, Ya-Qi; Peng, Xiao-Xia; Feng, Guo-Shuang

    2016-10-20

    Robust statistical designing, sound statistical analysis, and standardized presentation are important to enhance the quality and transparency of biomedical research. This systematic review was conducted to summarize the statistical reporting requirements introduced by biomedical research journals with an impact factor of 10 or above so that researchers are able to give statistical issues' serious considerations not only at the stage of data analysis but also at the stage of methodological design. Detailed statistical instructions for authors were downloaded from the homepage of each of the included journals or obtained from the editors directly via email. Then, we described the types and numbers of statistical guidelines introduced by different press groups. Items of statistical reporting guideline as well as particular requirements were summarized in frequency, which were grouped into design, method of analysis, and presentation, respectively. Finally, updated statistical guidelines and particular requirements for improvement were summed up. Totally, 21 of 23 press groups introduced at least one statistical guideline. More than half of press groups can update their statistical instruction for authors gradually relative to issues of new statistical reporting guidelines. In addition, 16 press groups, covering 44 journals, address particular statistical requirements. The most of the particular requirements focused on the performance of statistical analysis and transparency in statistical reporting, including "address issues relevant to research design, including participant flow diagram, eligibility criteria, and sample size estimation," and "statistical methods and the reasons." Statistical requirements for authors are becoming increasingly perfected. Statistical requirements for authors remind researchers that they should make sufficient consideration not only in regards to statistical methods during the research design, but also standardized statistical reporting, which would be beneficial in providing stronger evidence and making a greater critical appraisal of evidence more accessible.

  7. Statistical Tutorial | Center for Cancer Research

    Cancer.gov

    Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data.  ST is designed as a follow up to Statistical Analysis of Research Data (SARD) held in April 2018.  The tutorial will apply the general principles of statistical analysis of research data including descriptive statistics, z- and t-tests of means and mean differences, simple and multiple linear regression, ANOVA tests, and Chi-Squared distribution.

  8. Evaluating and Reporting Statistical Power in Counseling Research

    ERIC Educational Resources Information Center

    Balkin, Richard S.; Sheperis, Carl J.

    2011-01-01

    Despite recommendations from the "Publication Manual of the American Psychological Association" (6th ed.) to include information on statistical power when publishing quantitative results, authors seldom include analysis or discussion of statistical power. The rationale for discussing statistical power is addressed, approaches to using "G*Power" to…

  9. Journal of Transportation and Statistics, Vol. 3, No. 2 : special issue on the statistical analysis and modeling of automotive emissions

    DOT National Transportation Integrated Search

    2000-09-01

    This special issue of the Journal of Transportation and Statistics is devoted to the statistical analysis and modeling of automotive emissions. It contains many of the papers presented in the mini-symposium last August and also includes one additiona...

  10. Analysis and meta-analysis of single-case designs: an introduction.

    PubMed

    Shadish, William R

    2014-04-01

    The last 10 years have seen great progress in the analysis and meta-analysis of single-case designs (SCDs). This special issue includes five articles that provide an overview of current work on that topic, including standardized mean difference statistics, multilevel models, Bayesian statistics, and generalized additive models. Each article analyzes a common example across articles and presents syntax or macros for how to do them. These articles are followed by commentaries from single-case design researchers and journal editors. This introduction briefly describes each article and then discusses several issues that must be addressed before we can know what analyses will eventually be best to use in SCD research. These issues include modeling trend, modeling error covariances, computing standardized effect size estimates, assessing statistical power, incorporating more accurate models of outcome distributions, exploring whether Bayesian statistics can improve estimation given the small samples common in SCDs, and the need for annotated syntax and graphical user interfaces that make complex statistics accessible to SCD researchers. The article then discusses reasons why SCD researchers are likely to incorporate statistical analyses into their research more often in the future, including changing expectations and contingencies regarding SCD research from outside SCD communities, changes and diversity within SCD communities, corrections of erroneous beliefs about the relationship between SCD research and statistics, and demonstrations of how statistics can help SCD researchers better meet their goals. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  11. A statistical package for computing time and frequency domain analysis

    NASA Technical Reports Server (NTRS)

    Brownlow, J.

    1978-01-01

    The spectrum analysis (SPA) program is a general purpose digital computer program designed to aid in data analysis. The program does time and frequency domain statistical analyses as well as some preanalysis data preparation. The capabilities of the SPA program include linear trend removal and/or digital filtering of data, plotting and/or listing of both filtered and unfiltered data, time domain statistical characterization of data, and frequency domain statistical characterization of data.

  12. Network Data: Statistical Theory and New Models

    DTIC Science & Technology

    2016-02-17

    SECURITY CLASSIFICATION OF: During this period of review, Bin Yu worked on many thrusts of high-dimensional statistical theory and methodologies. Her...research covered a wide range of topics in statistics including analysis and methods for spectral clustering for sparse and structured networks...2,7,8,21], sparse modeling (e.g. Lasso) [4,10,11,17,18,19], statistical guarantees for the EM algorithm [3], statistical analysis of algorithm leveraging

  13. Quantifying the impact of between-study heterogeneity in multivariate meta-analyses

    PubMed Central

    Jackson, Dan; White, Ian R; Riley, Richard D

    2012-01-01

    Measures that quantify the impact of heterogeneity in univariate meta-analysis, including the very popular I2 statistic, are now well established. Multivariate meta-analysis, where studies provide multiple outcomes that are pooled in a single analysis, is also becoming more commonly used. The question of how to quantify heterogeneity in the multivariate setting is therefore raised. It is the univariate R2 statistic, the ratio of the variance of the estimated treatment effect under the random and fixed effects models, that generalises most naturally, so this statistic provides our basis. This statistic is then used to derive a multivariate analogue of I2, which we call . We also provide a multivariate H2 statistic, the ratio of a generalisation of Cochran's heterogeneity statistic and its associated degrees of freedom, with an accompanying generalisation of the usual I2 statistic, . Our proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis. We apply our methods to some real datasets and show how our statistics are equally appropriate in the context of multivariate meta-regression, where study level covariate effects are included in the model. Our heterogeneity statistics may be used when applying any procedure for fitting the multivariate random effects model. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22763950

  14. 75 FR 72611 - Assessments, Large Bank Pricing

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-11-24

    ... the worst risk ranking and are included in the statistical analysis. Appendix 1 to the NPR describes the statistical analysis in detail. \\12\\ The percentage approximated by factors is based on the statistical model for that particual year. Actual weights assigned to each scorecard measure are largely based...

  15. Recent statistical methods for orientation data

    NASA Technical Reports Server (NTRS)

    Batschelet, E.

    1972-01-01

    The application of statistical methods for determining the areas of animal orientation and navigation are discussed. The method employed is limited to the two-dimensional case. Various tests for determining the validity of the statistical analysis are presented. Mathematical models are included to support the theoretical considerations and tables of data are developed to show the value of information obtained by statistical analysis.

  16. Common pitfalls in statistical analysis: “P” values, statistical significance and confidence intervals

    PubMed Central

    Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc

    2015-01-01

    In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ‘P’ value, explain the importance of ‘confidence intervals’ and clarify the importance of including both values in a paper PMID:25878958

  17. Analysis and meta-analysis of single-case designs with a standardized mean difference statistic: a primer and applications.

    PubMed

    Shadish, William R; Hedges, Larry V; Pustejovsky, James E

    2014-04-01

    This article presents a d-statistic for single-case designs that is in the same metric as the d-statistic used in between-subjects designs such as randomized experiments and offers some reasons why such a statistic would be useful in SCD research. The d has a formal statistical development, is accompanied by appropriate power analyses, and can be estimated using user-friendly SPSS macros. We discuss both advantages and disadvantages of d compared to other approaches such as previous d-statistics, overlap statistics, and multilevel modeling. It requires at least three cases for computation and assumes normally distributed outcomes and stationarity, assumptions that are discussed in some detail. We also show how to test these assumptions. The core of the article then demonstrates in depth how to compute d for one study, including estimation of the autocorrelation and the ratio of between case variance to total variance (between case plus within case variance), how to compute power using a macro, and how to use the d to conduct a meta-analysis of studies using single-case designs in the free program R, including syntax in an appendix. This syntax includes how to read data, compute fixed and random effect average effect sizes, prepare a forest plot and a cumulative meta-analysis, estimate various influence statistics to identify studies contributing to heterogeneity and effect size, and do various kinds of publication bias analyses. This d may prove useful for both the analysis and meta-analysis of data from SCDs. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  18. Analysis and Report on SD2000: A Workshop to Determine Structural Dynamics Research for the Millenium

    DTIC Science & Technology

    2000-04-10

    interest. These include Statistical Energy Analysis (SEA), fuzzy structure theory, and approaches combining modal analysis and SEA. Non-determinism...34 arising with increasing frequency. This has led to Statistical Energy Analysis , in which a system is modelled as a collection of coupled subsystems...22. IUTAM Symposium on Statistical Energy Analysis . 1999 Ed. F.J. Fahy and W.G. Price. Kluwer Academic Publishing. • 23. R.S. Langley and P

  19. Primer of statistics in dental research: part I.

    PubMed

    Shintani, Ayumi

    2014-01-01

    Statistics play essential roles in evidence-based dentistry (EBD) practice and research. It ranges widely from formulating scientific questions, designing studies, collecting and analyzing data to interpreting, reporting, and presenting study findings. Mastering statistical concepts appears to be an unreachable goal among many dental researchers in part due to statistical authorities' limitations of explaining statistical principles to health researchers without elaborating complex mathematical concepts. This series of 2 articles aim to introduce dental researchers to 9 essential topics in statistics to conduct EBD with intuitive examples. The part I of the series includes the first 5 topics (1) statistical graph, (2) how to deal with outliers, (3) p-value and confidence interval, (4) testing equivalence, and (5) multiplicity adjustment. Part II will follow to cover the remaining topics including (6) selecting the proper statistical tests, (7) repeated measures analysis, (8) epidemiological consideration for causal association, and (9) analysis of agreement. Copyright © 2014. Published by Elsevier Ltd.

  20. Interrupted Time Series Versus Statistical Process Control in Quality Improvement Projects.

    PubMed

    Andersson Hagiwara, Magnus; Andersson Gäre, Boel; Elg, Mattias

    2016-01-01

    To measure the effect of quality improvement interventions, it is appropriate to use analysis methods that measure data over time. Examples of such methods include statistical process control analysis and interrupted time series with segmented regression analysis. This article compares the use of statistical process control analysis and interrupted time series with segmented regression analysis for evaluating the longitudinal effects of quality improvement interventions, using an example study on an evaluation of a computerized decision support system.

  1. Statistical analysis of 59 inspected SSME HPFTP turbine blades (uncracked and cracked)

    NASA Technical Reports Server (NTRS)

    Wheeler, John T.

    1987-01-01

    The numerical results of statistical analysis of the test data of Space Shuttle Main Engine high pressure fuel turbopump second-stage turbine blades, including some with cracks are presented. Several statistical methods use the test data to determine the application of differences in frequency variations between the uncracked and cracked blades.

  2. Advances in Statistical Methods for Substance Abuse Prevention Research

    PubMed Central

    MacKinnon, David P.; Lockwood, Chondra M.

    2010-01-01

    The paper describes advances in statistical methods for prevention research with a particular focus on substance abuse prevention. Standard analysis methods are extended to the typical research designs and characteristics of the data collected in prevention research. Prevention research often includes longitudinal measurement, clustering of data in units such as schools or clinics, missing data, and categorical as well as continuous outcome variables. Statistical methods to handle these features of prevention data are outlined. Developments in mediation, moderation, and implementation analysis allow for the extraction of more detailed information from a prevention study. Advancements in the interpretation of prevention research results include more widespread calculation of effect size and statistical power, the use of confidence intervals as well as hypothesis testing, detailed causal analysis of research findings, and meta-analysis. The increased availability of statistical software has contributed greatly to the use of new methods in prevention research. It is likely that the Internet will continue to stimulate the development and application of new methods. PMID:12940467

  3. Online Statistical Modeling (Regression Analysis) for Independent Responses

    NASA Astrophysics Data System (ADS)

    Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus

    2017-06-01

    Regression analysis (statistical analmodelling) are among statistical methods which are frequently needed in analyzing quantitative data, especially to model relationship between response and explanatory variables. Nowadays, statistical models have been developed into various directions to model various type and complex relationship of data. Rich varieties of advanced and recent statistical modelling are mostly available on open source software (one of them is R). However, these advanced statistical modelling, are not very friendly to novice R users, since they are based on programming script or command line interface. Our research aims to developed web interface (based on R and shiny), so that most recent and advanced statistical modelling are readily available, accessible and applicable on web. We have previously made interface in the form of e-tutorial for several modern and advanced statistical modelling on R especially for independent responses (including linear models/LM, generalized linier models/GLM, generalized additive model/GAM and generalized additive model for location scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including model using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/ MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web (interface) make the statistical modeling becomes easier to apply and easier to compare them in order to find the most appropriate model for the data.

  4. An overview of the mathematical and statistical analysis component of RICIS

    NASA Technical Reports Server (NTRS)

    Hallum, Cecil R.

    1987-01-01

    Mathematical and statistical analysis components of RICIS (Research Institute for Computing and Information Systems) can be used in the following problem areas: (1) quantification and measurement of software reliability; (2) assessment of changes in software reliability over time (reliability growth); (3) analysis of software-failure data; and (4) decision logic for whether to continue or stop testing software. Other areas of interest to NASA/JSC where mathematical and statistical analysis can be successfully employed include: math modeling of physical systems, simulation, statistical data reduction, evaluation methods, optimization, algorithm development, and mathematical methods in signal processing.

  5. Some issues in the statistical analysis of vehicle emissions

    DOT National Transportation Integrated Search

    2000-09-01

    Some of the issues complicating the statistical analysis of vehicle emissions and the effectiveness of emissions control programs are presented in this article. Issues discussed include: the variability of inter- and intra-vehicle emissions; the skew...

  6. Statistical correlation of structural mode shapes from test measurements and NASTRAN analytical values

    NASA Technical Reports Server (NTRS)

    Purves, L.; Strang, R. F.; Dube, M. P.; Alea, P.; Ferragut, N.; Hershfeld, D.

    1983-01-01

    The software and procedures of a system of programs used to generate a report of the statistical correlation between NASTRAN modal analysis results and physical tests results from modal surveys are described. Topics discussed include: a mathematical description of statistical correlation, a user's guide for generating a statistical correlation report, a programmer's guide describing the organization and functions of individual programs leading to a statistical correlation report, and a set of examples including complete listings of programs, and input and output data.

  7. Reporting quality of statistical methods in surgical observational studies: protocol for systematic review.

    PubMed

    Wu, Robert; Glen, Peter; Ramsay, Tim; Martel, Guillaume

    2014-06-28

    Observational studies dominate the surgical literature. Statistical adjustment is an important strategy to account for confounders in observational studies. Research has shown that published articles are often poor in statistical quality, which may jeopardize their conclusions. The Statistical Analyses and Methods in the Published Literature (SAMPL) guidelines have been published to help establish standards for statistical reporting.This study will seek to determine whether the quality of statistical adjustment and the reporting of these methods are adequate in surgical observational studies. We hypothesize that incomplete reporting will be found in all surgical observational studies, and that the quality and reporting of these methods will be of lower quality in surgical journals when compared with medical journals. Finally, this work will seek to identify predictors of high-quality reporting. This work will examine the top five general surgical and medical journals, based on a 5-year impact factor (2007-2012). All observational studies investigating an intervention related to an essential component area of general surgery (defined by the American Board of Surgery), with an exposure, outcome, and comparator, will be included in this systematic review. Essential elements related to statistical reporting and quality were extracted from the SAMPL guidelines and include domains such as intent of analysis, primary analysis, multiple comparisons, numbers and descriptive statistics, association and correlation analyses, linear regression, logistic regression, Cox proportional hazard analysis, analysis of variance, survival analysis, propensity analysis, and independent and correlated analyses. Each article will be scored as a proportion based on fulfilling criteria in relevant analyses used in the study. A logistic regression model will be built to identify variables associated with high-quality reporting. A comparison will be made between the scores of surgical observational studies published in medical versus surgical journals. Secondary outcomes will pertain to individual domains of analysis. Sensitivity analyses will be conducted. This study will explore the reporting and quality of statistical analyses in surgical observational studies published in the most referenced surgical and medical journals in 2013 and examine whether variables (including the type of journal) can predict high-quality reporting.

  8. Limitations of Using Microsoft Excel Version 2016 (MS Excel 2016) for Statistical Analysis for Medical Research.

    PubMed

    Tanavalee, Chotetawan; Luksanapruksa, Panya; Singhatanadgige, Weerasak

    2016-06-01

    Microsoft Excel (MS Excel) is a commonly used program for data collection and statistical analysis in biomedical research. However, this program has many limitations, including fewer functions that can be used for analysis and a limited number of total cells compared with dedicated statistical programs. MS Excel cannot complete analyses with blank cells, and cells must be selected manually for analysis. In addition, it requires multiple steps of data transformation and formulas to plot survival analysis graphs, among others. The Megastat add-on program, which will be supported by MS Excel 2016 soon, would eliminate some limitations of using statistic formulas within MS Excel.

  9. Comparing Methods for Item Analysis: The Impact of Different Item-Selection Statistics on Test Difficulty

    ERIC Educational Resources Information Center

    Jones, Andrew T.

    2011-01-01

    Practitioners often depend on item analysis to select items for exam forms and have a variety of options available to them. These include the point-biserial correlation, the agreement statistic, the B index, and the phi coefficient. Although research has demonstrated that these statistics can be useful for item selection, no research as of yet has…

  10. Local sensitivity analysis for inverse problems solved by singular value decomposition

    USGS Publications Warehouse

    Hill, M.C.; Nolan, B.T.

    2010-01-01

    Local sensitivity analysis provides computationally frugal ways to evaluate models commonly used for resource management, risk assessment, and so on. This includes diagnosing inverse model convergence problems caused by parameter insensitivity and(or) parameter interdependence (correlation), understanding what aspects of the model and data contribute to measures of uncertainty, and identifying new data likely to reduce model uncertainty. Here, we consider sensitivity statistics relevant to models in which the process model parameters are transformed using singular value decomposition (SVD) to create SVD parameters for model calibration. The statistics considered include the PEST identifiability statistic, and combined use of the process-model parameter statistics composite scaled sensitivities and parameter correlation coefficients (CSS and PCC). The statistics are complimentary in that the identifiability statistic integrates the effects of parameter sensitivity and interdependence, while CSS and PCC provide individual measures of sensitivity and interdependence. PCC quantifies correlations between pairs or larger sets of parameters; when a set of parameters is intercorrelated, the absolute value of PCC is close to 1.00 for all pairs in the set. The number of singular vectors to include in the calculation of the identifiability statistic is somewhat subjective and influences the statistic. To demonstrate the statistics, we use the USDA’s Root Zone Water Quality Model to simulate nitrogen fate and transport in the unsaturated zone of the Merced River Basin, CA. There are 16 log-transformed process-model parameters, including water content at field capacity (WFC) and bulk density (BD) for each of five soil layers. Calibration data consisted of 1,670 observations comprising soil moisture, soil water tension, aqueous nitrate and bromide concentrations, soil nitrate concentration, and organic matter content. All 16 of the SVD parameters could be estimated by regression based on the range of singular values. Identifiability statistic results varied based on the number of SVD parameters included. Identifiability statistics calculated for four SVD parameters indicate the same three most important process-model parameters as CSS/PCC (WFC1, WFC2, and BD2), but the order differed. Additionally, the identifiability statistic showed that BD1 was almost as dominant as WFC1. The CSS/PCC analysis showed that this results from its high correlation with WCF1 (-0.94), and not its individual sensitivity. Such distinctions, combined with analysis of how high correlations and(or) sensitivities result from the constructed model, can produce important insights into, for example, the use of sensitivity analysis to design monitoring networks. In conclusion, the statistics considered identified similar important parameters. They differ because (1) with CSS/PCC can be more awkward because sensitivity and interdependence are considered separately and (2) identifiability requires consideration of how many SVD parameters to include. A continuing challenge is to understand how these computationally efficient methods compare with computationally demanding global methods like Markov-Chain Monte Carlo given common nonlinear processes and the often even more nonlinear models.

  11. A new u-statistic with superior design sensitivity in matched observational studies.

    PubMed

    Rosenbaum, Paul R

    2011-09-01

    In an observational or nonrandomized study of treatment effects, a sensitivity analysis indicates the magnitude of bias from unmeasured covariates that would need to be present to alter the conclusions of a naïve analysis that presumes adjustments for observed covariates suffice to remove all bias. The power of sensitivity analysis is the probability that it will reject a false hypothesis about treatment effects allowing for a departure from random assignment of a specified magnitude; in particular, if this specified magnitude is "no departure" then this is the same as the power of a randomization test in a randomized experiment. A new family of u-statistics is proposed that includes Wilcoxon's signed rank statistic but also includes other statistics with substantially higher power when a sensitivity analysis is performed in an observational study. Wilcoxon's statistic has high power to detect small effects in large randomized experiments-that is, it often has good Pitman efficiency-but small effects are invariably sensitive to small unobserved biases. Members of this family of u-statistics that emphasize medium to large effects can have substantially higher power in a sensitivity analysis. For example, in one situation with 250 pair differences that are Normal with expectation 1/2 and variance 1, the power of a sensitivity analysis that uses Wilcoxon's statistic is 0.08 while the power of another member of the family of u-statistics is 0.66. The topic is examined by performing a sensitivity analysis in three observational studies, using an asymptotic measure called the design sensitivity, and by simulating power in finite samples. The three examples are drawn from epidemiology, clinical medicine, and genetic toxicology. © 2010, The International Biometric Society.

  12. Performance Analysis of Live-Virtual-Constructive and Distributed Virtual Simulations: Defining Requirements in Terms of Temporal Consistency

    DTIC Science & Technology

    2009-12-01

    events. Work associated with aperiodic tasks have the same statistical behavior and the same timing requirements. The timing deadlines are soft. • Sporadic...answers, but it is possible to calculate how precise the estimates are. Simulation-based performance analysis of a model includes a statistical ...to evaluate all pos- sible states in a timely manner. This is the principle reason for resorting to simulation and statistical analysis to evaluate

  13. Data analysis techniques

    NASA Technical Reports Server (NTRS)

    Park, Steve

    1990-01-01

    A large and diverse number of computational techniques are routinely used to process and analyze remotely sensed data. These techniques include: univariate statistics; multivariate statistics; principal component analysis; pattern recognition and classification; other multivariate techniques; geometric correction; registration and resampling; radiometric correction; enhancement; restoration; Fourier analysis; and filtering. Each of these techniques will be considered, in order.

  14. Research Education in Undergraduate Occupational Therapy Programs.

    ERIC Educational Resources Information Center

    Petersen, Paul; And Others

    1992-01-01

    Of 63 undergraduate occupational therapy programs surveyed, the 38 responses revealed some common areas covered: elementary descriptive statistics, validity, reliability, and measurement. Areas underrepresented include statistical analysis with or without computers, research design, and advanced statistics. (SK)

  15. Surface flaw reliability analysis of ceramic components with the SCARE finite element postprocessor program

    NASA Technical Reports Server (NTRS)

    Gyekenyesi, John P.; Nemeth, Noel N.

    1987-01-01

    The SCARE (Structural Ceramics Analysis and Reliability Evaluation) computer program on statistical fast fracture reliability analysis with quadratic elements for volume distributed imperfections is enhanced to include the use of linear finite elements and the capability of designing against concurrent surface flaw induced ceramic component failure. The SCARE code is presently coupled as a postprocessor to the MSC/NASTRAN general purpose, finite element analysis program. The improved version now includes the Weibull and Batdorf statistical failure theories for both surface and volume flaw based reliability analysis. The program uses the two-parameter Weibull fracture strength cumulative failure probability distribution model with the principle of independent action for poly-axial stress states, and Batdorf's shear-sensitive as well as shear-insensitive statistical theories. The shear-sensitive surface crack configurations include the Griffith crack and Griffith notch geometries, using the total critical coplanar strain energy release rate criterion to predict mixed-mode fracture. Weibull material parameters based on both surface and volume flaw induced fracture can also be calculated from modulus of rupture bar tests, using the least squares method with known specimen geometry and grouped fracture data. The statistical fast fracture theories for surface flaw induced failure, along with selected input and output formats and options, are summarized. An example problem to demonstrate various features of the program is included.

  16. Prison Radicalization: The New Extremist Training Grounds?

    DTIC Science & Technology

    2007-09-01

    distributing and collecting survey data , and the data analysis. The analytical methodology includes descriptive and inferential statistical methods, in... statistical analysis of the responses to identify significant correlations and relationships. B. SURVEY DATA COLLECTION To effectively access a...Q18, Q19, Q20, and Q21. Due to the exploratory nature of this small survey, data analyses were confined mostly to descriptive statistics and

  17. Proceedings of the first ERDA statistical symposium, Los Alamos, NM, November 3--5, 1975. [Sixteen papers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nicholson, W L; Harris, J L

    1976-03-01

    The First ERDA Statistical Symposium was organized to provide a means for communication among ERDA statisticians, and the sixteen papers presented at the meeting are given. Topics include techniques of numerical analysis used for accelerators, nuclear reactors, skewness and kurtosis statistics, radiochemical spectral analysis, quality control, and other statistics problems. Nine of the papers were previously announced in Nuclear Science Abstracts (NSA), while the remaining seven were abstracted for ERDA Energy Research Abstracts (ERA) and INIS Atomindex. (PMA)

  18. Bootstrap Methods: A Very Leisurely Look.

    ERIC Educational Resources Information Center

    Hinkle, Dennis E.; Winstead, Wayland H.

    The Bootstrap method, a computer-intensive statistical method of estimation, is illustrated using a simple and efficient Statistical Analysis System (SAS) routine. The utility of the method for generating unknown parameters, including standard errors for simple statistics, regression coefficients, discriminant function coefficients, and factor…

  19. Examples of Data Analysis with SPSS-X.

    ERIC Educational Resources Information Center

    MacFarland, Thomas W.

    Intended for classroom use only, these unpublished notes contain computer lessons on descriptive statistics using SPSS-X Release 3.0 for VAX/UNIX. Statistical measures covered include Chi-square analysis; Spearman's rank correlation coefficient; Student's t-test with two independent samples; Student's t-test with a paired sample; One-way analysis…

  20. Applied Statistics: From Bivariate through Multivariate Techniques [with CD-ROM

    ERIC Educational Resources Information Center

    Warner, Rebecca M.

    2007-01-01

    This book provides a clear introduction to widely used topics in bivariate and multivariate statistics, including multiple regression, discriminant analysis, MANOVA, factor analysis, and binary logistic regression. The approach is applied and does not require formal mathematics; equations are accompanied by verbal explanations. Students are asked…

  1. Statistics for nuclear engineers and scientists. Part 1. Basic statistical inference

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beggs, W.J.

    1981-02-01

    This report is intended for the use of engineers and scientists working in the nuclear industry, especially at the Bettis Atomic Power Laboratory. It serves as the basis for several Bettis in-house statistics courses. The objectives of the report are to introduce the reader to the language and concepts of statistics and to provide a basic set of techniques to apply to problems of the collection and analysis of data. Part 1 covers subjects of basic inference. The subjects include: descriptive statistics; probability; simple inference for normally distributed populations, and for non-normal populations as well; comparison of two populations; themore » analysis of variance; quality control procedures; and linear regression analysis.« less

  2. Sigsearch: a new term for post hoc unplanned search for statistically significant relationships with the intent to create publishable findings.

    PubMed

    Hashim, Muhammad Jawad

    2010-09-01

    Post-hoc secondary data analysis with no prespecified hypotheses has been discouraged by textbook authors and journal editors alike. Unfortunately no single term describes this phenomenon succinctly. I would like to coin the term "sigsearch" to define this practice and bring it within the teaching lexicon of statistics courses. Sigsearch would include any unplanned, post-hoc search for statistical significance using multiple comparisons of subgroups. It would also include data analysis with outcomes other than the prespecified primary outcome measure of a study as well as secondary data analyses of earlier research.

  3. New dimensions from statistical graphics for GIS (geographic information system) analysis and interpretation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McCord, R.A.; Olson, R.J.

    1988-01-01

    Environmental research and assessment activities at Oak Ridge National Laboratory (ORNL) include the analysis of spatial and temporal patterns of ecosystem response at a landscape scale. Analysis through use of geographic information system (GIS) involves an interaction between the user and thematic data sets frequently expressed as maps. A portion of GIS analysis has a mathematical or statistical aspect, especially for the analysis of temporal patterns. ARC/INFO is an excellent tool for manipulating GIS data and producing the appropriate map graphics. INFO also has some limited ability to produce statistical tabulation. At ORNL we have extended our capabilities by graphicallymore » interfacing ARC/INFO and SAS/GRAPH to provide a combined mapping and statistical graphics environment. With the data management, statistical, and graphics capabilities of SAS added to ARC/INFO, we have expanded the analytical and graphical dimensions of the GIS environment. Pie or bar charts, frequency curves, hydrographs, or scatter plots as produced by SAS can be added to maps from attribute data associated with ARC/INFO coverages. Numerous, small, simplified graphs can also become a source of complex map ''symbols.'' These additions extend the dimensions of GIS graphics to include time, details of the thematic composition, distribution, and interrelationships. 7 refs., 3 figs.« less

  4. Teaching statistics in biology: using inquiry-based learning to strengthen understanding of statistical analysis in biology laboratory courses.

    PubMed

    Metz, Anneke M

    2008-01-01

    There is an increasing need for students in the biological sciences to build a strong foundation in quantitative approaches to data analyses. Although most science, engineering, and math field majors are required to take at least one statistics course, statistical analysis is poorly integrated into undergraduate biology course work, particularly at the lower-division level. Elements of statistics were incorporated into an introductory biology course, including a review of statistics concepts and opportunity for students to perform statistical analysis in a biological context. Learning gains were measured with an 11-item statistics learning survey instrument developed for the course. Students showed a statistically significant 25% (p < 0.005) increase in statistics knowledge after completing introductory biology. Students improved their scores on the survey after completing introductory biology, even if they had previously completed an introductory statistics course (9%, improvement p < 0.005). Students retested 1 yr after completing introductory biology showed no loss of their statistics knowledge as measured by this instrument, suggesting that the use of statistics in biology course work may aid long-term retention of statistics knowledge. No statistically significant differences in learning were detected between male and female students in the study.

  5. World Population: Facts in Focus. World Population Data Sheet Workbook. Population Learning Series.

    ERIC Educational Resources Information Center

    Crews, Kimberly A.

    This workbook teaches population analysis using world population statistics. To complete the four student activity sheets, the students refer to the included "1988 World Population Data Sheet" which lists nations' statistical data that includes population totals, projected population, birth and death rates, fertility levels, and the…

  6. Paleomagnetism.org: An online multi-platform open source environment for paleomagnetic data analysis

    NASA Astrophysics Data System (ADS)

    Koymans, Mathijs R.; Langereis, Cor G.; Pastor-Galán, Daniel; van Hinsbergen, Douwe J. J.

    2016-08-01

    This contribution provides an overview of Paleomagnetism.org, an open-source, multi-platform online environment for paleomagnetic data analysis. Paleomagnetism.org provides an interactive environment where paleomagnetic data can be interpreted, evaluated, visualized, and exported. The Paleomagnetism.org application is split in to an interpretation portal, a statistics portal, and a portal for miscellaneous paleomagnetic tools. In the interpretation portal, principle component analysis can be performed on visualized demagnetization diagrams. Interpreted directions and great circles can be combined to find great circle solutions. These directions can be used in the statistics portal, or exported as data and figures. The tools in the statistics portal cover standard Fisher statistics for directions and VGPs, including other statistical parameters used as reliability criteria. Other available tools include an eigenvector approach foldtest, two reversal test including a Monte Carlo simulation on mean directions, and a coordinate bootstrap on the original data. An implementation is included for the detection and correction of inclination shallowing in sediments following TK03.GAD. Finally we provide a module to visualize VGPs and expected paleolatitudes, declinations, and inclinations relative to widely used global apparent polar wander path models in coordinates of major continent-bearing plates. The tools in the miscellaneous portal include a net tectonic rotation (NTR) analysis to restore a body to its paleo-vertical and a bootstrapped oroclinal test using linear regressive techniques, including a modified foldtest around a vertical axis. Paleomagnetism.org provides an integrated approach for researchers to work with visualized (e.g. hemisphere projections, Zijderveld diagrams) paleomagnetic data. The application constructs a custom exportable file that can be shared freely and included in public databases. This exported file contains all data and can later be imported to the application by other researchers. The accessibility and simplicity through which paleomagnetic data can be interpreted, analyzed, visualized, and shared makes Paleomagnetism.org of interest to the community.

  7. Statistical Approaches Used to Assess the Equity of Access to Food Outlets: A Systematic Review

    PubMed Central

    Lamb, Karen E.; Thornton, Lukar E.; Cerin, Ester; Ball, Kylie

    2015-01-01

    Background Inequalities in eating behaviours are often linked to the types of food retailers accessible in neighbourhood environments. Numerous studies have aimed to identify if access to healthy and unhealthy food retailers is socioeconomically patterned across neighbourhoods, and thus a potential risk factor for dietary inequalities. Existing reviews have examined differences between methodologies, particularly focussing on neighbourhood and food outlet access measure definitions. However, no review has informatively discussed the suitability of the statistical methodologies employed; a key issue determining the validity of study findings. Our aim was to examine the suitability of statistical approaches adopted in these analyses. Methods Searches were conducted for articles published from 2000–2014. Eligible studies included objective measures of the neighbourhood food environment and neighbourhood-level socio-economic status, with a statistical analysis of the association between food outlet access and socio-economic status. Results Fifty-four papers were included. Outlet accessibility was typically defined as the distance to the nearest outlet from the neighbourhood centroid, or as the number of food outlets within a neighbourhood (or buffer). To assess if these measures were linked to neighbourhood disadvantage, common statistical methods included ANOVA, correlation, and Poisson or negative binomial regression. Although all studies involved spatial data, few considered spatial analysis techniques or spatial autocorrelation. Conclusions With advances in GIS software, sophisticated measures of neighbourhood outlet accessibility can be considered. However, approaches to statistical analysis often appear less sophisticated. Care should be taken to consider assumptions underlying the analysis and the possibility of spatially correlated residuals which could affect the results. PMID:29546115

  8. The Statistical Consulting Center for Astronomy (SCCA)

    NASA Technical Reports Server (NTRS)

    Akritas, Michael

    2001-01-01

    The process by which raw astronomical data acquisition is transformed into scientifically meaningful results and interpretation typically involves many statistical steps. Traditional astronomy limits itself to a narrow range of old and familiar statistical methods: means and standard deviations; least-squares methods like chi(sup 2) minimization; and simple nonparametric procedures such as the Kolmogorov-Smirnov tests. These tools are often inadequate for the complex problems and datasets under investigations, and recent years have witnessed an increased usage of maximum-likelihood, survival analysis, multivariate analysis, wavelet and advanced time-series methods. The Statistical Consulting Center for Astronomy (SCCA) assisted astronomers with the use of sophisticated tools, and to match these tools with specific problems. The SCCA operated with two professors of statistics and a professor of astronomy working together. Questions were received by e-mail, and were discussed in detail with the questioner. Summaries of those questions and answers leading to new approaches were posted on the Web (www.state.psu.edu/ mga/SCCA). In addition to serving individual astronomers, the SCCA established a Web site for general use that provides hypertext links to selected on-line public-domain statistical software and services. The StatCodes site (www.astro.psu.edu/statcodes) provides over 200 links in the areas of: Bayesian statistics; censored and truncated data; correlation and regression, density estimation and smoothing, general statistics packages and information; image analysis; interactive Web tools; multivariate analysis; multivariate clustering and classification; nonparametric analysis; software written by astronomers; spatial statistics; statistical distributions; time series analysis; and visualization tools. StatCodes has received a remarkable high and constant hit rate of 250 hits/week (over 10,000/year) since its inception in mid-1997. It is of interest to scientists both within and outside of astronomy. The most popular sections are multivariate techniques, image analysis, and time series analysis. Hundreds of copies of the ASURV, SLOPES and CENS-TAU codes developed by SCCA scientists were also downloaded from the StatCodes site. In addition to formal SCCA duties, SCCA scientists continued a variety of related activities in astrostatistics, including refereeing of statistically oriented papers submitted to the Astrophysical Journal, talks in meetings including Feigelson's talk to science journalists entitled "The reemergence of astrostatistics" at the American Association for the Advancement of Science meeting, and published papers of astrostatistical content.

  9. Uncertainty Analysis of Seebeck Coefficient and Electrical Resistivity Characterization

    NASA Technical Reports Server (NTRS)

    Mackey, Jon; Sehirlioglu, Alp; Dynys, Fred

    2014-01-01

    In order to provide a complete description of a materials thermoelectric power factor, in addition to the measured nominal value, an uncertainty interval is required. The uncertainty may contain sources of measurement error including systematic bias error and precision error of a statistical nature. The work focuses specifically on the popular ZEM-3 (Ulvac Technologies) measurement system, but the methods apply to any measurement system. The analysis accounts for sources of systematic error including sample preparation tolerance, measurement probe placement, thermocouple cold-finger effect, and measurement parameters; in addition to including uncertainty of a statistical nature. Complete uncertainty analysis of a measurement system allows for more reliable comparison of measurement data between laboratories.

  10. Students' Appreciation of Expectation and Variation as a Foundation for Statistical Understanding

    ERIC Educational Resources Information Center

    Watson, Jane M.; Callingham, Rosemary A.; Kelly, Ben A.

    2007-01-01

    This study presents the results of a partial credit Rasch analysis of in-depth interview data exploring statistical understanding of 73 school students in 6 contextual settings. The use of Rasch analysis allowed the exploration of a single underlying variable across contexts, which included probability sampling, representation of temperature…

  11. AN EMPIRICAL BAYES APPROACH TO COMBINING ESTIMATES OF THE VALUE OF A STATISTICAL LIFE FOR ENVIRONMENTAL POLICY ANALYSIS

    EPA Science Inventory

    This analysis updates EPA's standard VSL estimate by using a more comprehensive collection of VSL studies that include studies published between 1992 and 2000, as well as applying a more appropriate statistical method. We provide a pooled effect VSL estimate by applying the empi...

  12. [Statistical analysis using freely-available "EZR (Easy R)" software].

    PubMed

    Kanda, Yoshinobu

    2015-10-01

    Clinicians must often perform statistical analyses for purposes such evaluating preexisting evidence and designing or executing clinical studies. R is a free software environment for statistical computing. R supports many statistical analysis functions, but does not incorporate a statistical graphical user interface (GUI). The R commander provides an easy-to-use basic-statistics GUI for R. However, the statistical function of the R commander is limited, especially in the field of biostatistics. Therefore, the author added several important statistical functions to the R commander and named it "EZR (Easy R)", which is now being distributed on the following website: http://www.jichi.ac.jp/saitama-sct/. EZR allows the application of statistical functions that are frequently used in clinical studies, such as survival analyses, including competing risk analyses and the use of time-dependent covariates and so on, by point-and-click access. In addition, by saving the script automatically created by EZR, users can learn R script writing, maintain the traceability of the analysis, and assure that the statistical process is overseen by a supervisor.

  13. Cognition of and Demand for Education and Teaching in Medical Statistics in China: A Systematic Review and Meta-Analysis

    PubMed Central

    Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong

    2015-01-01

    Background Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. Objectives This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. Methods We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. Results There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. Conclusion The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent. PMID:26053876

  14. Cognition of and Demand for Education and Teaching in Medical Statistics in China: A Systematic Review and Meta-Analysis.

    PubMed

    Wu, Yazhou; Zhou, Liang; Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong

    2015-01-01

    Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent.

  15. Mild cognitive impairment and fMRI studies of brain functional connectivity: the state of the art

    PubMed Central

    Farràs-Permanyer, Laia; Guàrdia-Olmos, Joan; Peró-Cebollero, Maribel

    2015-01-01

    In the last 15 years, many articles have studied brain connectivity in Mild Cognitive Impairment patients with fMRI techniques, seemingly using different connectivity statistical models in each investigation to identify complex connectivity structures so as to recognize typical behavior in this type of patient. This diversity in statistical approaches may cause problems in results comparison. This paper seeks to describe how researchers approached the study of brain connectivity in MCI patients using fMRI techniques from 2002 to 2014. The focus is on the statistical analysis proposed by each research group in reference to the limitations and possibilities of those techniques to identify some recommendations to improve the study of functional connectivity. The included articles came from a search of Web of Science and PsycINFO using the following keywords: f MRI, MCI, and functional connectivity. Eighty-one papers were found, but two of them were discarded because of the lack of statistical analysis. Accordingly, 79 articles were included in this review. We summarized some parts of the articles, including the goal of every investigation, the cognitive paradigm and methods used, brain regions involved, use of ROI analysis and statistical analysis, emphasizing on the connectivity estimation model used in each investigation. The present analysis allowed us to confirm the remarkable variability of the statistical analysis methods found. Additionally, the study of brain connectivity in this type of population is not providing, at the moment, any significant information or results related to clinical aspects relevant for prediction and treatment. We propose to follow guidelines for publishing fMRI data that would be a good solution to the problem of study replication. The latter aspect could be important for future publications because a higher homogeneity would benefit the comparison between publications and the generalization of results. PMID:26300802

  16. Kansas's forests, 2005: statistics, methods, and quality assurance

    Treesearch

    Patrick D. Miles; W. Keith Moser; Charles J. Barnett

    2011-01-01

    The first full annual inventory of Kansas's forests was completed in 2005 after 8,868 plots were selected and 468 forested plots were visited and measured. This report includes detailed information on forest inventory methods and data quality estimates. Important resource statistics are included in the tables. A detailed analysis of Kansas inventory is presented...

  17. Advance Report of Final Mortality Statistics, 1985.

    ERIC Educational Resources Information Center

    Monthly Vital Statistics Report, 1987

    1987-01-01

    This document presents mortality statistics for 1985 for the entire United States. Data analysis and discussion of these factors is included: death and death rates; death rates by age, sex, and race; expectation of life at birth and at specified ages; causes of death; infant mortality; and maternal mortality. Highlights reported include: (1) the…

  18. 34 CFR Appendix A to Subpart N of... - Sample Default Prevention Plan

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... relevant default prevention statistics, including a statistical analysis of the borrowers who default on...'s delinquency status by obtaining reports from data managers and FFEL Program lenders. 5. Enhance... academic study. III. Statistics for Measuring Progress 1. The number of students enrolled at your...

  19. The Effect of Folate and Folate Plus Zinc Supplementation on Endocrine Parameters and Sperm Characteristics in Sub-Fertile Men: A Systematic Review and Meta-Analysis.

    PubMed

    Irani, Morvarid; Amirian, Malihe; Sadeghi, Ramin; Lez, Justine Le; Latifnejad Roudsari, Robab

    2017-08-29

    To evaluate the effect of folate and folate plus zinc supplementation on endocrine parameters and sperm characteristics in sub fertile men. We conducted a systematic review and meta-analysis. Electronic databases of Medline, Scopus , Google scholar and Persian databases (SID, Iran medex, Magiran, Medlib, Iran doc) were searched from 1966 to December 2016 using a set of relevant keywords including "folate or folic acid AND (infertility, infertile, sterility)".All available randomized controlled trials (RCTs), conducted on a sample of sub fertile men with semen analyses, who took oral folic acid or folate plus zinc, were included. Data collected included endocrine parameters and sperm characteristics. Statistical analyses were done by Comprehensive Meta-analysis Version 2. In total, seven studies were included. Six studies had sufficient data for meta-analysis. "Sperm concentration was statistically higher in men supplemented with folate than with placebo (P < .001)". However, folate supplementation alone did not seem to be more effective than the placebo on the morphology (P = .056) and motility of the sperms (P = .652). Folate plus zinc supplementation did not show any statistically different effect on serum testosterone (P = .86), inhibin B (P = .84), FSH (P = .054), and sperm motility (P = .169) as compared to the placebo. Yet, folate plus zinc showed statistically higher effect on the sperm concentration (P < .001), morphology (P < .001), and serum folate level (P < .001) as compared to placebo. Folate plus zinc supplementation has a positive effect on sperm characteristics in sub fertile men. However, these results should be interpreted with caution due to the important heterogeneity of the studies included in this meta-analysis. Further trials are still needed to confirm the current findings.

  20. Teaching Statistics in Biology: Using Inquiry-based Learning to Strengthen Understanding of Statistical Analysis in Biology Laboratory Courses

    PubMed Central

    2008-01-01

    There is an increasing need for students in the biological sciences to build a strong foundation in quantitative approaches to data analyses. Although most science, engineering, and math field majors are required to take at least one statistics course, statistical analysis is poorly integrated into undergraduate biology course work, particularly at the lower-division level. Elements of statistics were incorporated into an introductory biology course, including a review of statistics concepts and opportunity for students to perform statistical analysis in a biological context. Learning gains were measured with an 11-item statistics learning survey instrument developed for the course. Students showed a statistically significant 25% (p < 0.005) increase in statistics knowledge after completing introductory biology. Students improved their scores on the survey after completing introductory biology, even if they had previously completed an introductory statistics course (9%, improvement p < 0.005). Students retested 1 yr after completing introductory biology showed no loss of their statistics knowledge as measured by this instrument, suggesting that the use of statistics in biology course work may aid long-term retention of statistics knowledge. No statistically significant differences in learning were detected between male and female students in the study. PMID:18765754

  1. Dealing with missing standard deviation and mean values in meta-analysis of continuous outcomes: a systematic review.

    PubMed

    Weir, Christopher J; Butcher, Isabella; Assi, Valentina; Lewis, Stephanie C; Murray, Gordon D; Langhorne, Peter; Brady, Marian C

    2018-03-07

    Rigorous, informative meta-analyses rely on availability of appropriate summary statistics or individual participant data. For continuous outcomes, especially those with naturally skewed distributions, summary information on the mean or variability often goes unreported. While full reporting of original trial data is the ideal, we sought to identify methods for handling unreported mean or variability summary statistics in meta-analysis. We undertook two systematic literature reviews to identify methodological approaches used to deal with missing mean or variability summary statistics. Five electronic databases were searched, in addition to the Cochrane Colloquium abstract books and the Cochrane Statistics Methods Group mailing list archive. We also conducted cited reference searching and emailed topic experts to identify recent methodological developments. Details recorded included the description of the method, the information required to implement the method, any underlying assumptions and whether the method could be readily applied in standard statistical software. We provided a summary description of the methods identified, illustrating selected methods in example meta-analysis scenarios. For missing standard deviations (SDs), following screening of 503 articles, fifteen methods were identified in addition to those reported in a previous review. These included Bayesian hierarchical modelling at the meta-analysis level; summary statistic level imputation based on observed SD values from other trials in the meta-analysis; a practical approximation based on the range; and algebraic estimation of the SD based on other summary statistics. Following screening of 1124 articles for methods estimating the mean, one approximate Bayesian computation approach and three papers based on alternative summary statistics were identified. Illustrative meta-analyses showed that when replacing a missing SD the approximation using the range minimised loss of precision and generally performed better than omitting trials. When estimating missing means, a formula using the median, lower quartile and upper quartile performed best in preserving the precision of the meta-analysis findings, although in some scenarios, omitting trials gave superior results. Methods based on summary statistics (minimum, maximum, lower quartile, upper quartile, median) reported in the literature facilitate more comprehensive inclusion of randomised controlled trials with missing mean or variability summary statistics within meta-analyses.

  2. Taxation in Public Education. Analysis and Bibliography Series, No. 12.

    ERIC Educational Resources Information Center

    Ross, Larry L.

    Intended for both researchers and practitioners, this analysis and bibliography cites approximately 100 publications on educational taxation, including general texts and reports, statistical reports, taxation guidelines, and alternative proposals for taxation. Topics covered in the analysis section include State and Federal aid, urban and suburban…

  3. Statistical models for the analysis and design of digital polymerase chain (dPCR) experiments

    USGS Publications Warehouse

    Dorazio, Robert; Hunter, Margaret

    2015-01-01

    Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary, log–log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model’s parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.

  4. Statistical Models for the Analysis and Design of Digital Polymerase Chain Reaction (dPCR) Experiments.

    PubMed

    Dorazio, Robert M; Hunter, Margaret E

    2015-11-03

    Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary, log-log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model's parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.

  5. Statistical methods in personality assessment research.

    PubMed

    Schinka, J A; LaLone, L; Broeckel, J A

    1997-06-01

    Emerging models of personality structure and advances in the measurement of personality and psychopathology suggest that research in personality and personality assessment has entered a stage of advanced development, in this article we examine whether researchers in these areas have taken advantage of new and evolving statistical procedures. We conducted a review of articles published in the Journal of Personality, Assessment during the past 5 years. Of the 449 articles that included some form of data analysis, 12.7% used only descriptive statistics, most employed only univariate statistics, and fewer than 10% used multivariate methods of data analysis. We discuss the cost of using limited statistical methods, the possible reasons for the apparent reluctance to employ advanced statistical procedures, and potential solutions to this technical shortcoming.

  6. Profile of Undergraduates in U.S. Postsecondary Education Institutions, 2003-04: With a Special Analysis of Community College Students. Statistical Analysis Report. NCES 2006-184

    ERIC Educational Resources Information Center

    Horn, Laura; Nevill, Stephanie; Griffith, James

    2006-01-01

    This report is the fifth in a series of reports that provide a statistical snapshot of the undergraduate population. The reports accompany the newly released data from the National Postsecondary Student Aid Study (NPSAS), and each one includes a focused analysis on a particular topic. This report focuses on community college students, who…

  7. Data Analysis and Graphing in an Introductory Physics Laboratory: Spreadsheet versus Statistics Suite

    ERIC Educational Resources Information Center

    Peterlin, Primoz

    2010-01-01

    Two methods of data analysis are compared: spreadsheet software and a statistics software suite. Their use is compared analysing data collected in three selected experiments taken from an introductory physics laboratory, which include a linear dependence, a nonlinear dependence and a histogram. The merits of each method are compared. (Contains 7…

  8. Examples of Data Analysis with SPSS/PC+ Studentware.

    ERIC Educational Resources Information Center

    MacFarland, Thomas W.

    Intended for classroom use only, these unpublished notes contain computer lessons on descriptive statistics with files previously created in WordPerfect 4.2 and Lotus 1-2-3 Version 1.A for the IBM PC+. The statistical measures covered include Student's t-test with two independent samples; Student's t-test with a paired sample; Chi-square analysis;…

  9. ParallABEL: an R library for generalized parallelization of genome-wide association studies.

    PubMed

    Sangket, Unitsa; Mahasirimongkol, Surakameth; Chantratita, Wasun; Tandayya, Pichaya; Aulchenko, Yurii S

    2010-04-29

    Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. It is arduous acquiring the necessary programming skills to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files. Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics. The input data of this group includes the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample. The input data of this group includes individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses. The input data of this group includes pairs of SNPs/traits. The final group concerns pair-wise statistics derived for pairs of SNPs, such as the linkage disequilibrium characterisation. The input data of this group includes pairs of individuals. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC) includes 2,062 individuals with 545,080, SNPs' genotyping, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity-by-state matrix was linearly reduced from approximately eight hours to one hour when ParallABEL employed eight processors. Executing genome-wide association analysis using the ParallABEL library on a computer cluster is an effective way to boost performance, and simplify the parallelization of GWA studies. ParallABEL is a user-friendly parallelization of GenABEL.

  10. Approaching Career Criminals With An Intelligence Cycle

    DTIC Science & Technology

    2015-12-01

    including arrest statistics and “arrest statistics have been used as the main barometer of juvenile delinquent activity, (but) many juvenile... Statistical Briefing Book,” 187. 26 guided by theories about the causes of delinquent behavior, but there was no determination if those efforts achieved the...children.”110 However, the most evidence-based comparison of juvenile delinquency reduction programs is the statistical meta-analysis (a systematic

  11. North Dakota's forests, 2005: statistics, methods, and quality assurance

    Treesearch

    Patrick D. Miles; David E. Haugen; Charles J. Barnett

    2011-01-01

    The first full annual inventory of North Dakota's forests was completed in 2005 after 7,622 plots were selected and 164 forested plots were visited and measured. This report includes detailed information on forest inventory methods and data quality estimates. Important resource statistics are included in the tables. A detailed analysis of the North Dakota...

  12. South Dakota's forests, 2005: statistics, methods, and quality assurance

    Treesearch

    Patrick D. Miles; Ronald J. Piva; Charles J. Barnett

    2011-01-01

    The first full annual inventory of South Dakota's forests was completed in 2005 after 8,302 plots were selected and 325 forested plots were visited and measured. This report includes detailed information on forest inventory methods and data quality estimates. Important resource statistics are included in the tables. A detailed analysis of the South Dakota...

  13. Should I Pack My Umbrella? Clinical versus Statistical Prediction of Mental Health Decisions

    ERIC Educational Resources Information Center

    Aegisdottir, Stefania; Spengler, Paul M.; White, Michael J.

    2006-01-01

    In this rejoinder, the authors respond to the insightful commentary of Strohmer and Arm, Chwalisz, and Hilton, Harris, and Rice about the meta-analysis on statistical versus clinical prediction techniques for mental health judgments. The authors address issues including the availability of statistical prediction techniques for real-life psychology…

  14. Meta- and statistical analysis of single-case intervention research data: quantitative gifts and a wish list.

    PubMed

    Kratochwill, Thomas R; Levin, Joel R

    2014-04-01

    In this commentary, we add to the spirit of the articles appearing in the special series devoted to meta- and statistical analysis of single-case intervention-design data. Following a brief discussion of historical factors leading to our initial involvement in statistical analysis of such data, we discuss: (a) the value added by including statistical-analysis recommendations in the What Works Clearinghouse Standards for single-case intervention designs; (b) the importance of visual analysis in single-case intervention research, along with the distinctive role that could be played by single-case effect-size measures; and (c) the elevated internal validity and statistical-conclusion validity afforded by the incorporation of various forms of randomization into basic single-case design structures. For the future, we envision more widespread application of quantitative analyses, as critical adjuncts to visual analysis, in both primary single-case intervention research studies and literature reviews in the behavioral, educational, and health sciences. Copyright © 2014 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  15. The use and misuse of statistical analyses. [in geophysics and space physics

    NASA Technical Reports Server (NTRS)

    Reiff, P. H.

    1983-01-01

    The statistical techniques most often used in space physics include Fourier analysis, linear correlation, auto- and cross-correlation, power spectral density, and superposed epoch analysis. Tests are presented which can evaluate the significance of the results obtained through each of these. Data presented without some form of error analysis are frequently useless, since they offer no way of assessing whether a bump on a spectrum or on a superposed epoch analysis is real or merely a statistical fluctuation. Among many of the published linear correlations, for instance, the uncertainty in the intercept and slope is not given, so that the significance of the fitted parameters cannot be assessed.

  16. Statistical Considerations of Food Allergy Prevention Studies.

    PubMed

    Bahnson, Henry T; du Toit, George; Lack, Gideon

    Clinical studies to prevent the development of food allergy have recently helped reshape public policy recommendations on the early introduction of allergenic foods. These trials are also prompting new research, and it is therefore important to address the unique design and analysis challenges of prevention trials. We highlight statistical concepts and give recommendations that clinical researchers may wish to adopt when designing future study protocols and analysis plans for prevention studies. Topics include selecting a study sample, addressing internal and external validity, improving statistical power, choosing alpha and beta, analysis innovations to address dilution effects, and analysis methods to deal with poor compliance, dropout, and missing data. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  17. Statistical Estimation of Rollover Risk

    DOT National Transportation Integrated Search

    1989-08-01

    This report describes the results of a statistical analysis to determine the : probability of a rollover in a single vehicle accident. Over 39,000 accidents, : which included 4910 rollovers in the states of Texas, Maryland, and Washington were : exam...

  18. Analysis models for the estimation of oceanic fields

    NASA Technical Reports Server (NTRS)

    Carter, E. F.; Robinson, A. R.

    1987-01-01

    A general model for statistically optimal estimates is presented for dealing with scalar, vector and multivariate datasets. The method deals with anisotropic fields and treats space and time dependence equivalently. Problems addressed include the analysis, or the production of synoptic time series of regularly gridded fields from irregular and gappy datasets, and the estimate of fields by compositing observations from several different instruments and sampling schemes. Technical issues are discussed, including the convergence of statistical estimates, the choice of representation of the correlations, the influential domain of an observation, and the efficiency of numerical computations.

  19. 2011 statistical abstract of the United States

    USGS Publications Warehouse

    Krisanda, Joseph M.

    2011-01-01

    The Statistical Abstract of the United States, published since 1878, is the authoritative and comprehensive summary of statistics on the social, political, and economic organization of the United States.Use the Abstract as a convenient volume for statistical reference, and as a guide to sources of more information both in print and on the Web.Sources of data include the Census Bureau, Bureau of Labor Statistics, Bureau of Economic Analysis, and many other Federal agencies and private organizations.

  20. Public and patient involvement in quantitative health research: A statistical perspective.

    PubMed

    Hannigan, Ailish

    2018-06-19

    The majority of studies included in recent reviews of impact for public and patient involvement (PPI) in health research had a qualitative design. PPI in solely quantitative designs is underexplored, particularly its impact on statistical analysis. Statisticians in practice have a long history of working in both consultative (indirect) and collaborative (direct) roles in health research, yet their perspective on PPI in quantitative health research has never been explicitly examined. To explore the potential and challenges of PPI from a statistical perspective at distinct stages of quantitative research, that is sampling, measurement and statistical analysis, distinguishing between indirect and direct PPI. Statistical analysis is underpinned by having a representative sample, and a collaborative or direct approach to PPI may help achieve that by supporting access to and increasing participation of under-represented groups in the population. Acknowledging and valuing the role of lay knowledge of the context in statistical analysis and in deciding what variables to measure may support collective learning and advance scientific understanding, as evidenced by the use of participatory modelling in other disciplines. A recurring issue for quantitative researchers, which reflects quantitative sampling methods, is the selection and required number of PPI contributors, and this requires further methodological development. Direct approaches to PPI in quantitative health research may potentially increase its impact, but the facilitation and partnership skills required may require further training for all stakeholders, including statisticians. © 2018 The Authors Health Expectations published by John Wiley & Sons Ltd.

  1. Math Academy: Play Ball! Explorations in Data Analysis & Statistics. Book 3: Supplemental Math Materials for Grades 3-8

    ERIC Educational Resources Information Center

    Rimbey, Kimberly

    2008-01-01

    Created by teachers for teachers, the Math Academy tools and activities included in this booklet were designed to create hands-on activities and a fun learning environment for the teaching of mathematics to the students. This booklet contains the "Math Academy--Play Ball! Explorations in Data Analysis & Statistics," which teachers can use to…

  2. Gaia DR2 documentation Chapter 1: Introduction

    NASA Astrophysics Data System (ADS)

    de Bruijne, J. H. J.; Abreu, A.; Brown, A. G. A.; Castañeda, J.; Cheek, N.; Crowley, C.; De Angeli, F.; Drimmel, R.; Fabricius, C.; Fleitas, J.; Gracia-Abril, G.; Guerra, R.; Hutton, A.; Messineo, R.; Mora, A.; Nienartowicz, K.; Panem, C.; Siddiqui, H.

    2018-04-01

    This chapter of the Gaia DR2 documentation describes the Gaia mission, the Gaia spacecraft, and the organisation of the Gaia Data Processing and Analysis Consortium (DPAC), which is responsible for the processing and analysis of the Gaia data. Furthermore, various properties of the data release are summarised, including statistical properties, object statistics, completeness, selection and filtering criteria, and limitations of the data.

  3. Metamodels for Computer-Based Engineering Design: Survey and Recommendations

    NASA Technical Reports Server (NTRS)

    Simpson, Timothy W.; Peplinski, Jesse; Koch, Patrick N.; Allen, Janet K.

    1997-01-01

    The use of statistical techniques to build approximations of expensive computer analysis codes pervades much of todays engineering design. These statistical approximations, or metamodels, are used to replace the actual expensive computer analyses, facilitating multidisciplinary, multiobjective optimization and concept exploration. In this paper we review several of these techniques including design of experiments, response surface methodology, Taguchi methods, neural networks, inductive learning, and kriging. We survey their existing application in engineering design and then address the dangers of applying traditional statistical techniques to approximate deterministic computer analysis codes. We conclude with recommendations for the appropriate use of statistical approximation techniques in given situations and how common pitfalls can be avoided.

  4. Statistical analysis of Thematic Mapper Simulator data for the geobotanical discrimination of rock types in southwest Oregon

    NASA Technical Reports Server (NTRS)

    Morrissey, L. A.; Weinstock, K. J.; Mouat, D. A.; Card, D. H.

    1984-01-01

    An evaluation of Thematic Mapper Simulator (TMS) data for the geobotanical discrimination of rock types based on vegetative cover characteristics is addressed in this research. A methodology for accomplishing this evaluation utilizing univariate and multivariate techniques is presented. TMS data acquired with a Daedalus DEI-1260 multispectral scanner were integrated with vegetation and geologic information for subsequent statistical analyses, which included a chi-square test, an analysis of variance, stepwise discriminant analysis, and Duncan's multiple range test. Results indicate that ultramafic rock types are spectrally separable from nonultramafics based on vegetative cover through the use of statistical analyses.

  5. Statistical summaries of selected Iowa streamflow data through September 2013.

    DOT National Transportation Integrated Search

    2015-01-01

    Statistical summaries of streamflow data collected at : 184 streamgages in Iowa are presented in this report. All : streamgages included for analysis have at least 10 years of : continuous record collected before or through September : 2013. This rep...

  6. Accession Medical Standards Analysis and Research Activity (AMSARA) 2014, Annual Report, and four Supplemental Applicants and Accessions Tables for: Army, Air Force, Marine, and Navy

    DTIC Science & Technology

    2016-02-02

    23 Descriptive Statistics for Enlisted Service Applicants and Accessions...33 Summary Statistics for Applicants and Accessions for Enlisted Service ..................................... 36 Applicants and...utilization among Soldiers screened using TAPAS. Section 2 of this report includes the descriptive statistics AMSARA compiles and publishes

  7. Classical Statistics and Statistical Learning in Imaging Neuroscience

    PubMed Central

    Bzdok, Danilo

    2017-01-01

    Brain-imaging research has predominantly generated insight by means of classical statistics, including regression-type analyses and null-hypothesis testing using t-test and ANOVA. Throughout recent years, statistical learning methods enjoy increasing popularity especially for applications in rich and complex data, including cross-validated out-of-sample prediction using pattern classification and sparsity-inducing regression. This concept paper discusses the implications of inferential justifications and algorithmic methodologies in common data analysis scenarios in neuroimaging. It is retraced how classical statistics and statistical learning originated from different historical contexts, build on different theoretical foundations, make different assumptions, and evaluate different outcome metrics to permit differently nuanced conclusions. The present considerations should help reduce current confusion between model-driven classical hypothesis testing and data-driven learning algorithms for investigating the brain with imaging techniques. PMID:29056896

  8. Statistics and bioinformatics in nutritional sciences: analysis of complex data in the era of systems biology⋆

    PubMed Central

    Fu, Wenjiang J.; Stromberg, Arnold J.; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao

    2009-01-01

    Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral role in advancing nutritional sciences. Currently, in the era of systems biology, statistics has become an increasingly important tool to quantitatively analyze information about biological macromolecules. This article describes general terms used in statistical analysis of large, complex experimental data. These terms include experimental design, power analysis, sample size calculation, and experimental errors (type I and II errors) for nutritional studies at population, tissue, cellular, and molecular levels. In addition, we highlighted various sources of experimental variations in studies involving microarray gene expression, real-time polymerase chain reaction, proteomics, and other bioinformatics technologies. Moreover, we provided guidelines for nutritionists and other biomedical scientists to plan and conduct studies and to analyze the complex data. Appropriate statistical analyses are expected to make an important contribution to solving major nutrition-associated problems in humans and animals (including obesity, diabetes, cardiovascular disease, cancer, ageing, and intrauterine fetal retardation). PMID:20233650

  9. Quantitative investigation of inappropriate regression model construction and the importance of medical statistics experts in observational medical research: a cross-sectional study.

    PubMed

    Nojima, Masanori; Tokunaga, Mutsumi; Nagamura, Fumitaka

    2018-05-05

    To investigate under what circumstances inappropriate use of 'multivariate analysis' is likely to occur and to identify the population that needs more support with medical statistics. The frequency of inappropriate regression model construction in multivariate analysis and related factors were investigated in observational medical research publications. The inappropriate algorithm of using only variables that were significant in univariate analysis was estimated to occur at 6.4% (95% CI 4.8% to 8.5%). This was observed in 1.1% of the publications with a medical statistics expert (hereinafter 'expert') as the first author, 3.5% if an expert was included as coauthor and in 12.2% if experts were not involved. In the publications where the number of cases was 50 or less and the study did not include experts, inappropriate algorithm usage was observed with a high proportion of 20.2%. The OR of the involvement of experts for this outcome was 0.28 (95% CI 0.15 to 0.53). A further, nation-level, analysis showed that the involvement of experts and the implementation of unfavourable multivariate analysis are associated at the nation-level analysis (R=-0.652). Based on the results of this study, the benefit of participation of medical statistics experts is obvious. Experts should be involved for proper confounding adjustment and interpretation of statistical models. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  10. Research design and statistical methods in Pakistan Journal of Medical Sciences (PJMS).

    PubMed

    Akhtar, Sohail; Shah, Syed Wadood Ali; Rafiq, M; Khan, Ajmal

    2016-01-01

    This article compares the study design and statistical methods used in 2005, 2010 and 2015 of Pakistan Journal of Medical Sciences (PJMS). Only original articles of PJMS were considered for the analysis. The articles were carefully reviewed for statistical methods and designs, and then recorded accordingly. The frequency of each statistical method and research design was estimated and compared with previous years. A total of 429 articles were evaluated (n=74 in 2005, n=179 in 2010, n=176 in 2015) in which 171 (40%) were cross-sectional and 116 (27%) were prospective study designs. A verity of statistical methods were found in the analysis. The most frequent methods include: descriptive statistics (n=315, 73.4%), chi-square/Fisher's exact tests (n=205, 47.8%) and student t-test (n=186, 43.4%). There was a significant increase in the use of statistical methods over time period: t-test, chi-square/Fisher's exact test, logistic regression, epidemiological statistics, and non-parametric tests. This study shows that a diverse variety of statistical methods have been used in the research articles of PJMS and frequency improved from 2005 to 2015. However, descriptive statistics was the most frequent method of statistical analysis in the published articles while cross-sectional study design was common study design.

  11. Statistical properties of filtered pseudorandom digital sequences formed from the sum of maximum-length sequences

    NASA Technical Reports Server (NTRS)

    Wallace, G. R.; Weathers, G. D.; Graf, E. R.

    1973-01-01

    The statistics of filtered pseudorandom digital sequences called hybrid-sum sequences, formed from the modulo-two sum of several maximum-length sequences, are analyzed. The results indicate that a relation exists between the statistics of the filtered sequence and the characteristic polynomials of the component maximum length sequences. An analysis procedure is developed for identifying a large group of sequences with good statistical properties for applications requiring the generation of analog pseudorandom noise. By use of the analysis approach, the filtering process is approximated by the convolution of the sequence with a sum of unit step functions. A parameter reflecting the overall statistical properties of filtered pseudorandom sequences is derived. This parameter is called the statistical quality factor. A computer algorithm to calculate the statistical quality factor for the filtered sequences is presented, and the results for two examples of sequence combinations are included. The analysis reveals that the statistics of the signals generated with the hybrid-sum generator are potentially superior to the statistics of signals generated with maximum-length generators. Furthermore, fewer calculations are required to evaluate the statistics of a large group of hybrid-sum generators than are required to evaluate the statistics of the same size group of approximately equivalent maximum-length sequences.

  12. 2011 statistical abstract of the United States

    USGS Publications Warehouse

    Krisanda, Joseph M.

    2011-01-01

    The Statistical Abstract of the United States, published since 1878, is the authoritative and comprehensive summary of statistics on the social, political, and economic organization of the United States.


    Use the Abstract as a convenient volume for statistical reference, and as a guide to sources of more information both in print and on the Web.


    Sources of data include the Census Bureau, Bureau of Labor Statistics, Bureau of Economic Analysis, and many other Federal agencies and private organizations.

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilbert, Richard O.

    The application of statistics to environmental pollution monitoring studies requires a knowledge of statistical analysis methods particularly well suited to pollution data. This book fills that need by providing sampling plans, statistical tests, parameter estimation procedure techniques, and references to pertinent publications. Most of the statistical techniques are relatively simple, and examples, exercises, and case studies are provided to illustrate procedures. The book is logically divided into three parts. Chapters 1, 2, and 3 are introductory chapters. Chapters 4 through 10 discuss field sampling designs and Chapters 11 through 18 deal with a broad range of statistical analysis procedures. Somemore » statistical techniques given here are not commonly seen in statistics book. For example, see methods for handling correlated data (Sections 4.5 and 11.12), for detecting hot spots (Chapter 10), and for estimating a confidence interval for the mean of a lognormal distribution (Section 13.2). Also, Appendix B lists a computer code that estimates and tests for trends over time at one or more monitoring stations using nonparametric methods (Chapters 16 and 17). Unfortunately, some important topics could not be included because of their complexity and the need to limit the length of the book. For example, only brief mention could be made of time series analysis using Box-Jenkins methods and of kriging techniques for estimating spatial and spatial-time patterns of pollution, although multiple references on these topics are provided. Also, no discussion of methods for assessing risks from environmental pollution could be included.« less

  14. Statistical assessment of the learning curves of health technologies.

    PubMed

    Ramsay, C R; Grant, A M; Wallace, S A; Garthwaite, P H; Monk, A F; Russell, I T

    2001-01-01

    (1) To describe systematically studies that directly assessed the learning curve effect of health technologies. (2) Systematically to identify 'novel' statistical techniques applied to learning curve data in other fields, such as psychology and manufacturing. (3) To test these statistical techniques in data sets from studies of varying designs to assess health technologies in which learning curve effects are known to exist. METHODS - STUDY SELECTION (HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW): For a study to be included, it had to include a formal analysis of the learning curve of a health technology using a graphical, tabular or statistical technique. METHODS - STUDY SELECTION (NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH): For a study to be included, it had to include a formal assessment of a learning curve using a statistical technique that had not been identified in the previous search. METHODS - DATA SOURCES: Six clinical and 16 non-clinical biomedical databases were searched. A limited amount of handsearching and scanning of reference lists was also undertaken. METHODS - DATA EXTRACTION (HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW): A number of study characteristics were abstracted from the papers such as study design, study size, number of operators and the statistical method used. METHODS - DATA EXTRACTION (NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH): The new statistical techniques identified were categorised into four subgroups of increasing complexity: exploratory data analysis; simple series data analysis; complex data structure analysis, generic techniques. METHODS - TESTING OF STATISTICAL METHODS: Some of the statistical methods identified in the systematic searches for single (simple) operator series data and for multiple (complex) operator series data were illustrated and explored using three data sets. The first was a case series of 190 consecutive laparoscopic fundoplication procedures performed by a single surgeon; the second was a case series of consecutive laparoscopic cholecystectomy procedures performed by ten surgeons; the third was randomised trial data derived from the laparoscopic procedure arm of a multicentre trial of groin hernia repair, supplemented by data from non-randomised operations performed during the trial. RESULTS - HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW: Of 4571 abstracts identified, 272 (6%) were later included in the study after review of the full paper. Some 51% of studies assessed a surgical minimal access technique and 95% were case series. The statistical method used most often (60%) was splitting the data into consecutive parts (such as halves or thirds), with only 14% attempting a more formal statistical analysis. The reporting of the studies was poor, with 31% giving no details of data collection methods. RESULTS - NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH: Of 9431 abstracts assessed, 115 (1%) were deemed appropriate for further investigation and, of these, 18 were included in the study. All of the methods for complex data sets were identified in the non-clinical literature. These were discriminant analysis, two-stage estimation of learning rates, generalised estimating equations, multilevel models, latent curve models, time series models and stochastic parameter models. In addition, eight new shapes of learning curves were identified. RESULTS - TESTING OF STATISTICAL METHODS: No one particular shape of learning curve performed significantly better than another. The performance of 'operation time' as a proxy for learning differed between the three procedures. Multilevel modelling using the laparoscopic cholecystectomy data demonstrated and measured surgeon-specific and confounding effects. The inclusion of non-randomised cases, despite the possible limitations of the method, enhanced the interpretation of learning effects. CONCLUSIONS - HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW: The statistical methods used for assessing learning effects in health technology assessment have been crude and the reporting of studies poor. CONCLUSIONS - NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH: A number of statistical methods for assessing learning effects were identified that had not hitherto been used in health technology assessment. There was a hierarchy of methods for the identification and measurement of learning, and the more sophisticated methods for both have had little if any use in health technology assessment. This demonstrated the value of considering fields outside clinical research when addressing methodological issues in health technology assessment. CONCLUSIONS - TESTING OF STATISTICAL METHODS: It has been demonstrated that the portfolio of techniques identified can enhance investigations of learning curve effects. (ABSTRACT TRUNCATED)

  15. Landsat real-time processing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Davis, E.L.

    A novel method for performing real-time acquisition and processing Landsat/EROS data covers all aspects including radiometric and geometric corrections of multispectral scanner or return-beam vidicon inputs, image enhancement, statistical analysis, feature extraction, and classification. Radiometric transformations include bias/gain adjustment, noise suppression, calibration, scan angle compensation, and illumination compensation, including topography and atmospheric effects. Correction or compensation for geometric distortion includes sensor-related distortions, such as centering, skew, size, scan nonlinearity, radial symmetry, and tangential symmetry. Also included are object image-related distortions such as aspect angle (altitude), scale distortion (altitude), terrain relief, and earth curvature. Ephemeral corrections are also applied to compensatemore » for satellite forward movement, earth rotation, altitude variations, satellite vibration, and mirror scan velocity. Image enhancement includes high-pass, low-pass, and Laplacian mask filtering and data restoration for intermittent losses. Resource classification is provided by statistical analysis including histograms, correlational analysis, matrix manipulations, and determination of spectral responses. Feature extraction includes spatial frequency analysis, which is used in parallel discriminant functions in each array processor for rapid determination. The technique uses integrated parallel array processors that decimate the tasks concurrently under supervision of a control processor. The operator-machine interface is optimized for programming ease and graphics image windowing.« less

  16. Statistical Software and Artificial Intelligence: A Watershed in Applications Programming.

    ERIC Educational Resources Information Center

    Pickett, John C.

    1984-01-01

    AUTOBJ and AUTOBOX are revolutionary software programs which contain the first application of artificial intelligence to statistical procedures used in analysis of time series data. The artificial intelligence included in the programs and program features are discussed. (JN)

  17. Statistical Analysis Tools for Learning in Engineering Laboratories.

    ERIC Educational Resources Information Center

    Maher, Carolyn A.

    1990-01-01

    Described are engineering programs that have used automated data acquisition systems to implement data collection and analyze experiments. Applications include a biochemical engineering laboratory, heat transfer performance, engineering materials testing, mechanical system reliability, statistical control laboratory, thermo-fluid laboratory, and a…

  18. The statistical analysis of global climate change studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hardin, J.W.

    1992-01-01

    The focus of this work is to contribute to the enhancement of the relationship between climatologists and statisticians. The analysis of global change data has been underway for many years by atmospheric scientists. Much of this analysis includes a heavy reliance on statistics and statistical inference. Some specific climatological analyses are presented and the dependence on statistics is documented before the analysis is undertaken. The first problem presented involves the fluctuation-dissipation theorem and its application to global climate models. This problem has a sound theoretical niche in the literature of both climate modeling and physics, but a statistical analysis inmore » which the data is obtained from the model to show graphically the relationship has not been undertaken. It is under this motivation that the author presents this problem. A second problem concerning the standard errors in estimating global temperatures is purely statistical in nature although very little materials exists for sampling on such a frame. This problem not only has climatological and statistical ramifications, but political ones as well. It is planned to use these results in a further analysis of global warming using actual data collected on the earth. In order to simplify the analysis of these problems, the development of a computer program, MISHA, is presented. This interactive program contains many of the routines, functions, graphics, and map projections needed by the climatologist in order to effectively enter the arena of data visualization.« less

  19. A Guerilla Guide to Common Problems in ‘Neurostatistics’: Essential Statistical Topics in Neuroscience

    PubMed Central

    Smith, Paul F.

    2017-01-01

    Effective inferential statistical analysis is essential for high quality studies in neuroscience. However, recently, neuroscience has been criticised for the poor use of experimental design and statistical analysis. Many of the statistical issues confronting neuroscience are similar to other areas of biology; however, there are some that occur more regularly in neuroscience studies. This review attempts to provide a succinct overview of some of the major issues that arise commonly in the analyses of neuroscience data. These include: the non-normal distribution of the data; inequality of variance between groups; extensive correlation in data for repeated measurements across time or space; excessive multiple testing; inadequate statistical power due to small sample sizes; pseudo-replication; and an over-emphasis on binary conclusions about statistical significance as opposed to effect sizes. Statistical analysis should be viewed as just another neuroscience tool, which is critical to the final outcome of the study. Therefore, it needs to be done well and it is a good idea to be proactive and seek help early, preferably before the study even begins. PMID:29371855

  20. A Guerilla Guide to Common Problems in 'Neurostatistics': Essential Statistical Topics in Neuroscience.

    PubMed

    Smith, Paul F

    2017-01-01

    Effective inferential statistical analysis is essential for high quality studies in neuroscience. However, recently, neuroscience has been criticised for the poor use of experimental design and statistical analysis. Many of the statistical issues confronting neuroscience are similar to other areas of biology; however, there are some that occur more regularly in neuroscience studies. This review attempts to provide a succinct overview of some of the major issues that arise commonly in the analyses of neuroscience data. These include: the non-normal distribution of the data; inequality of variance between groups; extensive correlation in data for repeated measurements across time or space; excessive multiple testing; inadequate statistical power due to small sample sizes; pseudo-replication; and an over-emphasis on binary conclusions about statistical significance as opposed to effect sizes. Statistical analysis should be viewed as just another neuroscience tool, which is critical to the final outcome of the study. Therefore, it needs to be done well and it is a good idea to be proactive and seek help early, preferably before the study even begins.

  1. A comparison of performance of automatic cloud coverage assessment algorithm for Formosat-2 image using clustering-based and spatial thresholding methods

    NASA Astrophysics Data System (ADS)

    Hsu, Kuo-Hsien

    2012-11-01

    Formosat-2 image is a kind of high-spatial-resolution (2 meters GSD) remote sensing satellite data, which includes one panchromatic band and four multispectral bands (Blue, Green, Red, near-infrared). An essential sector in the daily processing of received Formosat-2 image is to estimate the cloud statistic of image using Automatic Cloud Coverage Assessment (ACCA) algorithm. The information of cloud statistic of image is subsequently recorded as an important metadata for image product catalog. In this paper, we propose an ACCA method with two consecutive stages: preprocessing and post-processing analysis. For pre-processing analysis, the un-supervised K-means classification, Sobel's method, thresholding method, non-cloudy pixels reexamination, and cross-band filter method are implemented in sequence for cloud statistic determination. For post-processing analysis, Box-Counting fractal method is implemented. In other words, the cloud statistic is firstly determined via pre-processing analysis, the correctness of cloud statistic of image of different spectral band is eventually cross-examined qualitatively and quantitatively via post-processing analysis. The selection of an appropriate thresholding method is very critical to the result of ACCA method. Therefore, in this work, We firstly conduct a series of experiments of the clustering-based and spatial thresholding methods that include Otsu's, Local Entropy(LE), Joint Entropy(JE), Global Entropy(GE), and Global Relative Entropy(GRE) method, for performance comparison. The result shows that Otsu's and GE methods both perform better than others for Formosat-2 image. Additionally, our proposed ACCA method by selecting Otsu's method as the threshoding method has successfully extracted the cloudy pixels of Formosat-2 image for accurate cloud statistic estimation.

  2. Statistics and Discoveries at the LHC (1/4)

    ScienceCinema

    Cowan, Glen

    2018-02-09

    The lectures will give an introduction to statistics as applied in particle physics and will provide all the necessary basics for data analysis at the LHC. Special emphasis will be placed on the the problems and questions that arise when searching for new phenomena, including p-values, discovery significance, limit setting procedures, treatment of small signals in the presence of large backgrounds. Specific issues that will be addressed include the advantages and drawbacks of different statistical test procedures (cut-based, likelihood-ratio, etc.), the look-elsewhere effect and treatment of systematic uncertainties.

  3. Statistics and Discoveries at the LHC (3/4)

    ScienceCinema

    Cowan, Glen

    2018-02-19

    The lectures will give an introduction to statistics as applied in particle physics and will provide all the necessary basics for data analysis at the LHC. Special emphasis will be placed on the the problems and questions that arise when searching for new phenomena, including p-values, discovery significance, limit setting procedures, treatment of small signals in the presence of large backgrounds. Specific issues that will be addressed include the advantages and drawbacks of different statistical test procedures (cut-based, likelihood-ratio, etc.), the look-elsewhere effect and treatment of systematic uncertainties.

  4. Statistics and Discoveries at the LHC (4/4)

    ScienceCinema

    Cowan, Glen

    2018-05-22

    The lectures will give an introduction to statistics as applied in particle physics and will provide all the necessary basics for data analysis at the LHC. Special emphasis will be placed on the the problems and questions that arise when searching for new phenomena, including p-values, discovery significance, limit setting procedures, treatment of small signals in the presence of large backgrounds. Specific issues that will be addressed include the advantages and drawbacks of different statistical test procedures (cut-based, likelihood-ratio, etc.), the look-elsewhere effect and treatment of systematic uncertainties.

  5. Statistics and Discoveries at the LHC (2/4)

    ScienceCinema

    Cowan, Glen

    2018-04-26

    The lectures will give an introduction to statistics as applied in particle physics and will provide all the necessary basics for data analysis at the LHC. Special emphasis will be placed on the the problems and questions that arise when searching for new phenomena, including p-values, discovery significance, limit setting procedures, treatment of small signals in the presence of large backgrounds. Specific issues that will be addressed include the advantages and drawbacks of different statistical test procedures (cut-based, likelihood-ratio, etc.), the look-elsewhere effect and treatment of systematic uncertainties.

  6. A Statistical Analysis of Brain Morphology Using Wild Bootstrapping

    PubMed Central

    Ibrahim, Joseph G.; Tang, Niansheng; Rowe, Daniel B.; Hao, Xuejun; Bansal, Ravi; Peterson, Bradley S.

    2008-01-01

    Methods for the analysis of brain morphology, including voxel-based morphology and surface-based morphometries, have been used to detect associations between brain structure and covariates of interest, such as diagnosis, severity of disease, age, IQ, and genotype. The statistical analysis of morphometric measures usually involves two statistical procedures: 1) invoking a statistical model at each voxel (or point) on the surface of the brain or brain subregion, followed by mapping test statistics (e.g., t test) or their associated p values at each of those voxels; 2) correction for the multiple statistical tests conducted across all voxels on the surface of the brain region under investigation. We propose the use of new statistical methods for each of these procedures. We first use a heteroscedastic linear model to test the associations between the morphological measures at each voxel on the surface of the specified subregion (e.g., cortical or subcortical surfaces) and the covariates of interest. Moreover, we develop a robust test procedure that is based on a resampling method, called wild bootstrapping. This procedure assesses the statistical significance of the associations between a measure of given brain structure and the covariates of interest. The value of this robust test procedure lies in its computationally simplicity and in its applicability to a wide range of imaging data, including data from both anatomical and functional magnetic resonance imaging (fMRI). Simulation studies demonstrate that this robust test procedure can accurately control the family-wise error rate. We demonstrate the application of this robust test procedure to the detection of statistically significant differences in the morphology of the hippocampus over time across gender groups in a large sample of healthy subjects. PMID:17649909

  7. CORSSA: Community Online Resource for Statistical Seismicity Analysis

    NASA Astrophysics Data System (ADS)

    Zechar, J. D.; Hardebeck, J. L.; Michael, A. J.; Naylor, M.; Steacy, S.; Wiemer, S.; Zhuang, J.

    2011-12-01

    Statistical seismology is critical to the understanding of seismicity, the evaluation of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology-especially to those aspects with great impact on public policy-statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA, www.corssa.org). We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each will contain between four and eight articles. CORSSA now includes seven articles with an additional six in draft form along with forums for discussion, a glossary, and news about upcoming meetings, special issues, and recent papers. Each article is peer-reviewed and presents a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. We have also begun curating a collection of statistical seismology software packages.

  8. Periodontal disease and carotid atherosclerosis: A meta-analysis of 17,330 participants.

    PubMed

    Zeng, Xian-Tao; Leng, Wei-Dong; Lam, Yat-Yin; Yan, Bryan P; Wei, Xue-Mei; Weng, Hong; Kwong, Joey S W

    2016-01-15

    The association between periodontal disease and carotid atherosclerosis has been evaluated primarily in single-center studies, and whether periodontal disease is an independent risk factor of carotid atherosclerosis remains uncertain. This meta-analysis aimed to evaluate the association between periodontal disease and carotid atherosclerosis. We searched PubMed and Embase for relevant observational studies up to February 20, 2015. Two authors independently extracted data from included studies, and odds ratios (ORs) with 95% confidence intervals (CIs) were calculated for overall and subgroup meta-analyses. Statistical heterogeneity was assessed by the chi-squared test (P<0.1 for statistical significance) and quantified by the I(2) statistic. Data analysis was conducted using the Comprehensive Meta-Analysis (CMA) software. Fifteen observational studies involving 17,330 participants were included in the meta-analysis. The overall pooled result showed that periodontal disease was associated with carotid atherosclerosis (OR: 1.27, 95% CI: 1.14-1.41; P<0.001) but statistical heterogeneity was substantial (I(2)=78.90%). Subgroup analysis of adjusted smoking and diabetes mellitus showed borderline significance (OR: 1.08; 95% CI: 1.00-1.18; P=0.05). Sensitivity and cumulative analyses both indicated that our results were robust. Findings of our meta-analysis indicated that the presence of periodontal disease was associated with carotid atherosclerosis; however, further large-scale, well-conducted clinical studies are needed to explore the precise risk of developing carotid atherosclerosis in patients with periodontal disease. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  9. Diagnosis checking of statistical analysis in RCTs indexed in PubMed.

    PubMed

    Lee, Paul H; Tse, Andy C Y

    2017-11-01

    Statistical analysis is essential for reporting of the results of randomized controlled trials (RCTs), as well as evaluating their effectiveness. However, the validity of a statistical analysis also depends on whether the assumptions of that analysis are valid. To review all RCTs published in journals indexed in PubMed during December 2014 to provide a complete picture of how RCTs handle assumptions of statistical analysis. We reviewed all RCTs published in December 2014 that appeared in journals indexed in PubMed using the Cochrane highly sensitive search strategy. The 2014 impact factors of the journals were used as proxies for their quality. The type of statistical analysis used and whether the assumptions of the analysis were tested were reviewed. In total, 451 papers were included. Of the 278 papers that reported a crude analysis for the primary outcomes, 31 (27·2%) reported whether the outcome was normally distributed. Of the 172 papers that reported an adjusted analysis for the primary outcomes, diagnosis checking was rarely conducted, with only 20%, 8·6% and 7% checked for generalized linear model, Cox proportional hazard model and multilevel model, respectively. Study characteristics (study type, drug trial, funding sources, journal type and endorsement of CONSORT guidelines) were not associated with the reporting of diagnosis checking. The diagnosis of statistical analyses in RCTs published in PubMed-indexed journals was usually absent. Journals should provide guidelines about the reporting of a diagnosis of assumptions. © 2017 Stichting European Society for Clinical Investigation Journal Foundation.

  10. ParallABEL: an R library for generalized parallelization of genome-wide association studies

    PubMed Central

    2010-01-01

    Background Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. It is arduous acquiring the necessary programming skills to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files. Results Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics. The input data of this group includes the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample. The input data of this group includes individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses. The input data of this group includes pairs of SNPs/traits. The final group concerns pair-wise statistics derived for pairs of SNPs, such as the linkage disequilibrium characterisation. The input data of this group includes pairs of individuals. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC) includes 2,062 individuals with 545,080, SNPs' genotyping, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity-by-state matrix was linearly reduced from approximately eight hours to one hour when ParallABEL employed eight processors. Conclusions Executing genome-wide association analysis using the ParallABEL library on a computer cluster is an effective way to boost performance, and simplify the parallelization of GWA studies. ParallABEL is a user-friendly parallelization of GenABEL. PMID:20429914

  11. Meta-analysis of neutropenia or leukopenia as a prognostic factor in patients with malignant disease undergoing chemotherapy.

    PubMed

    Shitara, Kohei; Matsuo, Keitaro; Oze, Isao; Mizota, Ayako; Kondo, Chihiro; Nomura, Motoo; Yokota, Tomoya; Takahari, Daisuke; Ura, Takashi; Muro, Kei

    2011-08-01

    We performed a systematic review and meta-analysis to determine the impact of neutropenia or leukopenia experienced during chemotherapy on survival. Eligible studies included prospective or retrospective analyses that evaluated neutropenia or leukopenia as a prognostic factor for overall survival or disease-free survival. Statistical analyses were conducted to calculate a summary hazard ratio and 95% confidence interval (CI) using random-effects or fixed-effects models based on the heterogeneity of the included studies. Thirteen trials were selected for the meta-analysis, with a total of 9,528 patients. The hazard ratio of death was 0.69 (95% CI, 0.64-0.75) for patients with higher-grade neutropenia or leukopenia compared to patients with lower-grade or lack of cytopenia. Our analysis was also stratified by statistical method (any statistical method to decrease lead-time bias; time-varying analysis or landmark analysis), but no differences were observed. Our results indicate that neutropenia or leukopenia experienced during chemotherapy is associated with improved survival in patients with advanced cancer or hematological malignancies undergoing chemotherapy. Future prospective analyses designed to investigate the potential impact of chemotherapy dose adjustment coupled with monitoring of neutropenia or leukopenia on survival are warranted.

  12. SWToolbox: A surface-water tool-box for statistical analysis of streamflow time series

    USGS Publications Warehouse

    Kiang, Julie E.; Flynn, Kate; Zhai, Tong; Hummel, Paul; Granato, Gregory

    2018-03-07

    This report is a user guide for the low-flow analysis methods provided with version 1.0 of the Surface Water Toolbox (SWToolbox) computer program. The software combines functionality from two software programs—U.S. Geological Survey (USGS) SWSTAT and U.S. Environmental Protection Agency (EPA) DFLOW. Both of these programs have been used primarily for computation of critical low-flow statistics. The main analysis methods are the computation of hydrologic frequency statistics such as the 7-day minimum flow that occurs on average only once every 10 years (7Q10), computation of design flows including biologically based flows, and computation of flow-duration curves and duration hydrographs. Other annual, monthly, and seasonal statistics can also be computed. The interface facilitates retrieval of streamflow discharge data from the USGS National Water Information System and outputs text reports for a record of the analysis. Tools for graphing data and screening tests are available to assist the analyst in conducting the analysis.

  13. Visual Data Analysis for Satellites

    NASA Technical Reports Server (NTRS)

    Lau, Yee; Bhate, Sachin; Fitzpatrick, Patrick

    2008-01-01

    The Visual Data Analysis Package is a collection of programs and scripts that facilitate visual analysis of data available from NASA and NOAA satellites, as well as dropsonde, buoy, and conventional in-situ observations. The package features utilities for data extraction, data quality control, statistical analysis, and data visualization. The Hierarchical Data Format (HDF) satellite data extraction routines from NASA's Jet Propulsion Laboratory were customized for specific spatial coverage and file input/output. Statistical analysis includes the calculation of the relative error, the absolute error, and the root mean square error. Other capabilities include curve fitting through the data points to fill in missing data points between satellite passes or where clouds obscure satellite data. For data visualization, the software provides customizable Generic Mapping Tool (GMT) scripts to generate difference maps, scatter plots, line plots, vector plots, histograms, timeseries, and color fill images.

  14. Comment on “Two statistics for evaluating parameter identifiability and error reduction” by John Doherty and Randall J. Hunt

    USGS Publications Warehouse

    Hill, Mary C.

    2010-01-01

    Doherty and Hunt (2009) present important ideas for first-order-second moment sensitivity analysis, but five issues are discussed in this comment. First, considering the composite-scaled sensitivity (CSS) jointly with parameter correlation coefficients (PCC) in a CSS/PCC analysis addresses the difficulties with CSS mentioned in the introduction. Second, their new parameter identifiability statistic actually is likely to do a poor job of parameter identifiability in common situations. The statistic instead performs the very useful role of showing how model parameters are included in the estimated singular value decomposition (SVD) parameters. Its close relation to CSS is shown. Third, the idea from p. 125 that a suitable truncation point for SVD parameters can be identified using the prediction variance is challenged using results from Moore and Doherty (2005). Fourth, the relative error reduction statistic of Doherty and Hunt is shown to belong to an emerging set of statistics here named perturbed calculated variance statistics. Finally, the perturbed calculated variance statistics OPR and PPR mentioned on p. 121 are shown to explicitly include the parameter null-space component of uncertainty. Indeed, OPR and PPR results that account for null-space uncertainty have appeared in the literature since 2000.

  15. Statistical Analysis of Big Data on Pharmacogenomics

    PubMed Central

    Fan, Jianqing; Liu, Han

    2013-01-01

    This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecule mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905

  16. Linnorm: improved statistical analysis for single cell RNA-seq expression data

    PubMed Central

    Yip, Shun H.; Wang, Panwen; Kocher, Jean-Pierre A.; Sham, Pak Chung

    2017-01-01

    Abstract Linnorm is a novel normalization and transformation method for the analysis of single cell RNA sequencing (scRNA-seq) data. Linnorm is developed to remove technical noises and simultaneously preserve biological variations in scRNA-seq data, such that existing statistical methods can be improved. Using real scRNA-seq data, we compared Linnorm with existing normalization methods, including NODES, SAMstrt, SCnorm, scran, DESeq and TMM. Linnorm shows advantages in speed, technical noise removal and preservation of cell heterogeneity, which can improve existing methods in the discovery of novel subtypes, pseudo-temporal ordering of cells, clustering analysis, etc. Linnorm also performs better than existing DEG analysis methods, including BASiCS, NODES, SAMstrt, Seurat and DESeq2, in false positive rate control and accuracy. PMID:28981748

  17. Statistical Analysis of speckle noise reduction techniques for echocardiographic Images

    NASA Astrophysics Data System (ADS)

    Saini, Kalpana; Dewal, M. L.; Rohit, Manojkumar

    2011-12-01

    Echocardiography is the safe, easy and fast technology for diagnosing the cardiac diseases. As in other ultrasound images these images also contain speckle noise. In some cases this speckle noise is useful such as in motion detection. But in general noise removal is required for better analysis of the image and proper diagnosis. Different Adaptive and anisotropic filters are included for statistical analysis. Statistical parameters such as Signal-to-Noise Ratio (SNR), Peak Signal-to-Noise Ratio (PSNR), and Root Mean Square Error (RMSE) calculated for performance measurement. One more important aspect that there may be blurring during speckle noise removal. So it is prefered that filter should be able to enhance edges during noise removal.

  18. Monte Carlo based statistical power analysis for mediation models: methods and software.

    PubMed

    Zhang, Zhiyong

    2014-12-01

    The existing literature on statistical power analysis for mediation models often assumes data normality and is based on a less powerful Sobel test instead of the more powerful bootstrap test. This study proposes to estimate statistical power to detect mediation effects on the basis of the bootstrap method through Monte Carlo simulation. Nonnormal data with excessive skewness and kurtosis are allowed in the proposed method. A free R package called bmem is developed to conduct the power analysis discussed in this study. Four examples, including a simple mediation model, a multiple-mediator model with a latent mediator, a multiple-group mediation model, and a longitudinal mediation model, are provided to illustrate the proposed method.

  19. SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.

    PubMed

    Chu, Annie; Cui, Jenny; Dinov, Ivo D

    2009-03-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test.The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website.In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most updated information and newly added models.

  20. Integration of modern statistical tools for the analysis of climate extremes into the web-GIS “CLIMATE”

    NASA Astrophysics Data System (ADS)

    Ryazanova, A. A.; Okladnikov, I. G.; Gordov, E. P.

    2017-11-01

    The frequency of occurrence and magnitude of precipitation and temperature extreme events show positive trends in several geographical regions. These events must be analyzed and studied in order to better understand their impact on the environment, predict their occurrences, and mitigate their effects. For this purpose, we augmented web-GIS called “CLIMATE” to include a dedicated statistical package developed in the R language. The web-GIS “CLIMATE” is a software platform for cloud storage processing and visualization of distributed archives of spatial datasets. It is based on a combined use of web and GIS technologies with reliable procedures for searching, extracting, processing, and visualizing the spatial data archives. The system provides a set of thematic online tools for the complex analysis of current and future climate changes and their effects on the environment. The package includes new powerful methods of time-dependent statistics of extremes, quantile regression and copula approach for the detailed analysis of various climate extreme events. Specifically, the very promising copula approach allows obtaining the structural connections between the extremes and the various environmental characteristics. The new statistical methods integrated into the web-GIS “CLIMATE” can significantly facilitate and accelerate the complex analysis of climate extremes using only a desktop PC connected to the Internet.

  1. Comparisons of non-Gaussian statistical models in DNA methylation analysis.

    PubMed

    Ma, Zhanyu; Teschendorff, Andrew E; Yu, Hong; Taghia, Jalil; Guo, Jun

    2014-06-16

    As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance.

  2. Comparisons of Non-Gaussian Statistical Models in DNA Methylation Analysis

    PubMed Central

    Ma, Zhanyu; Teschendorff, Andrew E.; Yu, Hong; Taghia, Jalil; Guo, Jun

    2014-01-01

    As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance. PMID:24937687

  3. Assessing compositional variability through graphical analysis and Bayesian statistical approaches: case studies on transgenic crops.

    PubMed

    Harrigan, George G; Harrison, Jay M

    2012-01-01

    New transgenic (GM) crops are subjected to extensive safety assessments that include compositional comparisons with conventional counterparts as a cornerstone of the process. The influence of germplasm, location, environment, and agronomic treatments on compositional variability is, however, often obscured in these pair-wise comparisons. Furthermore, classical statistical significance testing can often provide an incomplete and over-simplified summary of highly responsive variables such as crop composition. In order to more clearly describe the influence of the numerous sources of compositional variation we present an introduction to two alternative but complementary approaches to data analysis and interpretation. These include i) exploratory data analysis (EDA) with its emphasis on visualization and graphics-based approaches and ii) Bayesian statistical methodology that provides easily interpretable and meaningful evaluations of data in terms of probability distributions. The EDA case-studies include analyses of herbicide-tolerant GM soybean and insect-protected GM maize and soybean. Bayesian approaches are presented in an analysis of herbicide-tolerant GM soybean. Advantages of these approaches over classical frequentist significance testing include the more direct interpretation of results in terms of probabilities pertaining to quantities of interest and no confusion over the application of corrections for multiple comparisons. It is concluded that a standardized framework for these methodologies could provide specific advantages through enhanced clarity of presentation and interpretation in comparative assessments of crop composition.

  4. GARNET--gene set analysis with exploration of annotation relations.

    PubMed

    Rho, Kyoohyoung; Kim, Bumjin; Jang, Youngjun; Lee, Sanghyun; Bae, Taejeong; Seo, Jihae; Seo, Chaehwa; Lee, Jihyun; Kang, Hyunjung; Yu, Ungsik; Kim, Sunghoon; Lee, Sanghyuk; Kim, Wan Kyu

    2011-02-15

    Gene set analysis is a powerful method of deducing biological meaning for an a priori defined set of genes. Numerous tools have been developed to test statistical enrichment or depletion in specific pathways or gene ontology (GO) terms. Major difficulties towards biological interpretation are integrating diverse types of annotation categories and exploring the relationships between annotation terms of similar information. GARNET (Gene Annotation Relationship NEtwork Tools) is an integrative platform for gene set analysis with many novel features. It includes tools for retrieval of genes from annotation database, statistical analysis & visualization of annotation relationships, and managing gene sets. In an effort to allow access to a full spectrum of amassed biological knowledge, we have integrated a variety of annotation data that include the GO, domain, disease, drug, chromosomal location, and custom-defined annotations. Diverse types of molecular networks (pathways, transcription and microRNA regulations, protein-protein interaction) are also included. The pair-wise relationship between annotation gene sets was calculated using kappa statistics. GARNET consists of three modules--gene set manager, gene set analysis and gene set retrieval, which are tightly integrated to provide virtually automatic analysis for gene sets. A dedicated viewer for annotation network has been developed to facilitate exploration of the related annotations. GARNET (gene annotation relationship network tools) is an integrative platform for diverse types of gene set analysis, where complex relationships among gene annotations can be easily explored with an intuitive network visualization tool (http://garnet.isysbio.org/ or http://ercsb.ewha.ac.kr/garnet/).

  5. From Combat to Campus

    ERIC Educational Resources Information Center

    Bellafiore, Margaret

    2012-01-01

    Soldiers are returning from war to college. The number of veterans enrolled nationally is hard to find. Data from the National Center for Veterans Analysis and Statistics identify nearly 924,000 veterans as "total education program beneficiaries" for 2011. These statistics combine many categories, including dependents and survivors. The…

  6. Research design and statistical methods in Pakistan Journal of Medical Sciences (PJMS)

    PubMed Central

    Akhtar, Sohail; Shah, Syed Wadood Ali; Rafiq, M.; Khan, Ajmal

    2016-01-01

    Objective: This article compares the study design and statistical methods used in 2005, 2010 and 2015 of Pakistan Journal of Medical Sciences (PJMS). Methods: Only original articles of PJMS were considered for the analysis. The articles were carefully reviewed for statistical methods and designs, and then recorded accordingly. The frequency of each statistical method and research design was estimated and compared with previous years. Results: A total of 429 articles were evaluated (n=74 in 2005, n=179 in 2010, n=176 in 2015) in which 171 (40%) were cross-sectional and 116 (27%) were prospective study designs. A verity of statistical methods were found in the analysis. The most frequent methods include: descriptive statistics (n=315, 73.4%), chi-square/Fisher’s exact tests (n=205, 47.8%) and student t-test (n=186, 43.4%). There was a significant increase in the use of statistical methods over time period: t-test, chi-square/Fisher’s exact test, logistic regression, epidemiological statistics, and non-parametric tests. Conclusion: This study shows that a diverse variety of statistical methods have been used in the research articles of PJMS and frequency improved from 2005 to 2015. However, descriptive statistics was the most frequent method of statistical analysis in the published articles while cross-sectional study design was common study design. PMID:27022365

  7. AGR-1 Thermocouple Data Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jeff Einerson

    2012-05-01

    This report documents an effort to analyze measured and simulated data obtained in the Advanced Gas Reactor (AGR) fuel irradiation test program conducted in the INL's Advanced Test Reactor (ATR) to support the Next Generation Nuclear Plant (NGNP) R&D program. The work follows up on a previous study (Pham and Einerson, 2010), in which statistical analysis methods were applied for AGR-1 thermocouple data qualification. The present work exercises the idea that, while recognizing uncertainties inherent in physics and thermal simulations of the AGR-1 test, results of the numerical simulations can be used in combination with the statistical analysis methods tomore » further improve qualification of measured data. Additionally, the combined analysis of measured and simulation data can generate insights about simulation model uncertainty that can be useful for model improvement. This report also describes an experimental control procedure to maintain fuel target temperature in the future AGR tests using regression relationships that include simulation results. The report is organized into four chapters. Chapter 1 introduces the AGR Fuel Development and Qualification program, AGR-1 test configuration and test procedure, overview of AGR-1 measured data, and overview of physics and thermal simulation, including modeling assumptions and uncertainties. A brief summary of statistical analysis methods developed in (Pham and Einerson 2010) for AGR-1 measured data qualification within NGNP Data Management and Analysis System (NDMAS) is also included for completeness. Chapters 2-3 describe and discuss cases, in which the combined use of experimental and simulation data is realized. A set of issues associated with measurement and modeling uncertainties resulted from the combined analysis are identified. This includes demonstration that such a combined analysis led to important insights for reducing uncertainty in presentation of AGR-1 measured data (Chapter 2) and interpretation of simulation results (Chapter 3). The statistics-based simulation-aided experimental control procedure described for the future AGR tests is developed and demonstrated in Chapter 4. The procedure for controlling the target fuel temperature (capsule peak or average) is based on regression functions of thermocouple readings and other relevant parameters and accounting for possible changes in both physical and thermal conditions and in instrument performance.« less

  8. Significant Association of Urinary Toxic Metals and Autism-Related Symptoms—A Nonlinear Statistical Analysis with Cross Validation

    PubMed Central

    Adams, James; Kruger, Uwe; Geis, Elizabeth; Gehn, Eva; Fimbres, Valeria; Pollard, Elena; Mitchell, Jessica; Ingram, Julie; Hellmers, Robert; Quig, David; Hahn, Juergen

    2017-01-01

    Introduction A number of previous studies examined a possible association of toxic metals and autism, and over half of those studies suggest that toxic metal levels are different in individuals with Autism Spectrum Disorders (ASD). Additionally, several studies found that those levels correlate with the severity of ASD. Methods In order to further investigate these points, this paper performs the most detailed statistical analysis to date of a data set in this field. First morning urine samples were collected from 67 children and adults with ASD and 50 neurotypical controls of similar age and gender. The samples were analyzed to determine the levels of 10 urinary toxic metals (UTM). Autism-related symptoms were assessed with eleven behavioral measures. Statistical analysis was used to distinguish participants on the ASD spectrum and neurotypical participants based upon the UTM data alone. The analysis also included examining the association of autism severity with toxic metal excretion data using linear and nonlinear analysis. “Leave-one-out” cross-validation was used to ensure statistical independence of results. Results and Discussion Average excretion levels of several toxic metals (lead, tin, thallium, antimony) were significantly higher in the ASD group. However, ASD classification using univariate statistics proved difficult due to large variability, but nonlinear multivariate statistical analysis significantly improved ASD classification with Type I/II errors of 15% and 18%, respectively. These results clearly indicate that the urinary toxic metal excretion profiles of participants in the ASD group were significantly different from those of the neurotypical participants. Similarly, nonlinear methods determined a significantly stronger association between the behavioral measures and toxic metal excretion. The association was strongest for the Aberrant Behavior Checklist (including subscales on Irritability, Stereotypy, Hyperactivity, and Inappropriate Speech), but significant associations were found for UTM with all eleven autism-related assessments with cross-validation R2 values ranging from 0.12–0.48. PMID:28068407

  9. Frontiers of Two-Dimensional Correlation Spectroscopy. Part 1. New concepts and noteworthy developments

    NASA Astrophysics Data System (ADS)

    Noda, Isao

    2014-07-01

    A comprehensive survey review of new and noteworthy developments, which are advancing forward the frontiers in the field of 2D correlation spectroscopy during the last four years, is compiled. This review covers books, proceedings, and review articles published on 2D correlation spectroscopy, a number of significant conceptual developments in the field, data pretreatment methods and other pertinent topics, as well as patent and publication trends and citation activities. Developments discussed include projection 2D correlation analysis, concatenated 2D correlation, and correlation under multiple perturbation effects, as well as orthogonal sample design, predicting 2D correlation spectra, manipulating and comparing 2D spectra, correlation strategy based on segmented data blocks, such as moving-window analysis, features like determination of sequential order and enhanced spectral resolution, statistical 2D spectroscopy using covariance and other statistical metrics, hetero-correlation analysis, and sample-sample correlation technique. Data pretreatment operations prior to 2D correlation analysis are discussed, including the correction for physical effects, background and baseline subtraction, selection of reference spectrum, normalization and scaling of data, derivatives spectra and deconvolution technique, and smoothing and noise reduction. Other pertinent topics include chemometrics and statistical considerations, peak position shift phenomena, variable sampling increments, computation and software, display schemes, such as color coded format, slice and power spectra, tabulation, and other schemes.

  10. Statistical analysis of Skylab 3. [endocrine/metabolic studies of astronauts

    NASA Technical Reports Server (NTRS)

    Johnston, D. A.

    1974-01-01

    The results of endocrine/metabolic studies of astronauts on Skylab 3 are reported. One-way analysis of variance, contrasts, two-way unbalanced analysis of variance, and analysis of periodic changes in flight are included. Results for blood tests, and urine tests are presented.

  11. A Matlab user interface for the statistically assisted fluid registration algorithm and tensor-based morphometry

    NASA Astrophysics Data System (ADS)

    Yepes-Calderon, Fernando; Brun, Caroline; Sant, Nishita; Thompson, Paul; Lepore, Natasha

    2015-01-01

    Tensor-Based Morphometry (TBM) is an increasingly popular method for group analysis of brain MRI data. The main steps in the analysis consist of a nonlinear registration to align each individual scan to a common space, and a subsequent statistical analysis to determine morphometric differences, or difference in fiber structure between groups. Recently, we implemented the Statistically-Assisted Fluid Registration Algorithm or SAFIRA,1 which is designed for tracking morphometric differences among populations. To this end, SAFIRA allows the inclusion of statistical priors extracted from the populations being studied as regularizers in the registration. This flexibility and degree of sophistication limit the tool to expert use, even more so considering that SAFIRA was initially implemented in command line mode. Here, we introduce a new, intuitive, easy to use, Matlab-based graphical user interface for SAFIRA's multivariate TBM. The interface also generates different choices for the TBM statistics, including both the traditional univariate statistics on the Jacobian matrix, and comparison of the full deformation tensors.2 This software will be freely disseminated to the neuroimaging research community.

  12. Generalization of Entropy Based Divergence Measures for Symbolic Sequence Analysis

    PubMed Central

    Ré, Miguel A.; Azad, Rajeev K.

    2014-01-01

    Entropy based measures have been frequently used in symbolic sequence analysis. A symmetrized and smoothed form of Kullback-Leibler divergence or relative entropy, the Jensen-Shannon divergence (JSD), is of particular interest because of its sharing properties with families of other divergence measures and its interpretability in different domains including statistical physics, information theory and mathematical statistics. The uniqueness and versatility of this measure arise because of a number of attributes including generalization to any number of probability distributions and association of weights to the distributions. Furthermore, its entropic formulation allows its generalization in different statistical frameworks, such as, non-extensive Tsallis statistics and higher order Markovian statistics. We revisit these generalizations and propose a new generalization of JSD in the integrated Tsallis and Markovian statistical framework. We show that this generalization can be interpreted in terms of mutual information. We also investigate the performance of different JSD generalizations in deconstructing chimeric DNA sequences assembled from bacterial genomes including that of E. coli, S. enterica typhi, Y. pestis and H. influenzae. Our results show that the JSD generalizations bring in more pronounced improvements when the sequences being compared are from phylogenetically proximal organisms, which are often difficult to distinguish because of their compositional similarity. While small but noticeable improvements were observed with the Tsallis statistical JSD generalization, relatively large improvements were observed with the Markovian generalization. In contrast, the proposed Tsallis-Markovian generalization yielded more pronounced improvements relative to the Tsallis and Markovian generalizations, specifically when the sequences being compared arose from phylogenetically proximal organisms. PMID:24728338

  13. Generalization of entropy based divergence measures for symbolic sequence analysis.

    PubMed

    Ré, Miguel A; Azad, Rajeev K

    2014-01-01

    Entropy based measures have been frequently used in symbolic sequence analysis. A symmetrized and smoothed form of Kullback-Leibler divergence or relative entropy, the Jensen-Shannon divergence (JSD), is of particular interest because of its sharing properties with families of other divergence measures and its interpretability in different domains including statistical physics, information theory and mathematical statistics. The uniqueness and versatility of this measure arise because of a number of attributes including generalization to any number of probability distributions and association of weights to the distributions. Furthermore, its entropic formulation allows its generalization in different statistical frameworks, such as, non-extensive Tsallis statistics and higher order Markovian statistics. We revisit these generalizations and propose a new generalization of JSD in the integrated Tsallis and Markovian statistical framework. We show that this generalization can be interpreted in terms of mutual information. We also investigate the performance of different JSD generalizations in deconstructing chimeric DNA sequences assembled from bacterial genomes including that of E. coli, S. enterica typhi, Y. pestis and H. influenzae. Our results show that the JSD generalizations bring in more pronounced improvements when the sequences being compared are from phylogenetically proximal organisms, which are often difficult to distinguish because of their compositional similarity. While small but noticeable improvements were observed with the Tsallis statistical JSD generalization, relatively large improvements were observed with the Markovian generalization. In contrast, the proposed Tsallis-Markovian generalization yielded more pronounced improvements relative to the Tsallis and Markovian generalizations, specifically when the sequences being compared arose from phylogenetically proximal organisms.

  14. Improved Statistics for Genome-Wide Interaction Analysis

    PubMed Central

    Ueki, Masao; Cordell, Heather J.

    2012-01-01

    Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new “joint effects” statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al.'s originally-proposed statistics, on account of the inflated error rate that can result. PMID:22496670

  15. Analysis strategies for longitudinal attachment loss data.

    PubMed

    Beck, J D; Elter, J R

    2000-02-01

    The purpose of this invited review is to describe and discuss methods currently in use to quantify the progression of attachment loss in epidemiological studies of periodontal disease, and to make recommendations for specific analytic methods based upon the particular design of the study and structure of the data. The review concentrates on the definition of incident attachment loss (ALOSS) and its component parts; measurement issues including thresholds and regression to the mean; methods of accounting for longitudinal change, including changes in means, changes in proportions of affected sites, incidence density, the effect of tooth loss and reversals, and repeated events; statistical models of longitudinal change, including the incorporation of the time element, use of linear, logistic or Poisson regression or survival analysis, and statistical tests; site vs person level of analysis, including statistical adjustment for correlated data; the strengths and limitations of ALOSS data. Examples from the Piedmont 65+ Dental Study are used to illustrate specific concepts. We conclude that incidence density is the preferred methodology to use for periodontal studies with more than one period of follow-up and that the use of studies not employing methods for dealing with complex samples, correlated data, and repeated measures does not take advantage of our current understanding of the site- and person-level variables important in periodontal disease and may generate biased results.

  16. Noise limitations in optical linear algebra processors.

    PubMed

    Batsell, S G; Jong, T L; Walkup, J F; Krile, T F

    1990-05-10

    A general statistical noise model is presented for optical linear algebra processors. A statistical analysis which includes device noise, the multiplication process, and the addition operation is undertaken. We focus on those processes which are architecturally independent. Finally, experimental results which verify the analytical predictions are also presented.

  17. ASURV: Astronomical SURVival Statistics

    NASA Astrophysics Data System (ADS)

    Feigelson, E. D.; Nelson, P. I.; Isobe, T.; LaValley, M.

    2014-06-01

    ASURV (Astronomical SURVival Statistics) provides astronomy survival analysis for right- and left-censored data including the maximum-likelihood Kaplan-Meier estimator and several univariate two-sample tests, bivariate correlation measures, and linear regressions. ASURV is written in FORTRAN 77, and is stand-alone and does not call any specialized libraries.

  18. An introduction to Bayesian statistics in health psychology.

    PubMed

    Depaoli, Sarah; Rus, Holly M; Clifton, James P; van de Schoot, Rens; Tiemensma, Jitske

    2017-09-01

    The aim of the current article is to provide a brief introduction to Bayesian statistics within the field of health psychology. Bayesian methods are increasing in prevalence in applied fields, and they have been shown in simulation research to improve the estimation accuracy of structural equation models, latent growth curve (and mixture) models, and hierarchical linear models. Likewise, Bayesian methods can be used with small sample sizes since they do not rely on large sample theory. In this article, we discuss several important components of Bayesian statistics as they relate to health-based inquiries. We discuss the incorporation and impact of prior knowledge into the estimation process and the different components of the analysis that should be reported in an article. We present an example implementing Bayesian estimation in the context of blood pressure changes after participants experienced an acute stressor. We conclude with final thoughts on the implementation of Bayesian statistics in health psychology, including suggestions for reviewing Bayesian manuscripts and grant proposals. We have also included an extensive amount of online supplementary material to complement the content presented here, including Bayesian examples using many different software programmes and an extensive sensitivity analysis examining the impact of priors.

  19. Misuse of statistical methods: critical assessment of articles in BMJ from January to March 1976.

    PubMed Central

    Gore, S M; Jones, I G; Rytter, E C

    1977-01-01

    Sixty-two reports that appeared as Papers and Originals (excluding short reports) in 13 consecutive issues of the British Medical journal included statistical analysis. Thirty-two had statistical errors of one kind or another; in 18 fairly serious faults were discovered. The summaries of five reports made some claim that was unsupportable on re-examination of the data. Medical investigators should consult with people who have a real understanding of statistical methods throughout their projects. PMID:832023

  20. Australia 31-GHz brightness temperature exceedance statistics

    NASA Technical Reports Server (NTRS)

    Gary, B. L.

    1988-01-01

    Water vapor radiometer measurements were made at DSS 43 during an 18 month period. Brightness temperatures at 31 GHz were subjected to a statistical analysis which included correction for the effects of occasional water on the radiometer radome. An exceedance plot was constructed, and the 1 percent exceedance statistics occurs at 120 K. The 5 percent exceedance statistics occurs at 70 K, compared with 75 K in Spain. These values are valid for all of the three month groupings that were studied.

  1. Systematic Review and Meta-Analysis of Studies Evaluating Diagnostic Test Accuracy: A Practical Review for Clinical Researchers-Part II. Statistical Methods of Meta-Analysis

    PubMed Central

    Lee, Juneyoung; Kim, Kyung Won; Choi, Sang Hyun; Huh, Jimi

    2015-01-01

    Meta-analysis of diagnostic test accuracy studies differs from the usual meta-analysis of therapeutic/interventional studies in that, it is required to simultaneously analyze a pair of two outcome measures such as sensitivity and specificity, instead of a single outcome. Since sensitivity and specificity are generally inversely correlated and could be affected by a threshold effect, more sophisticated statistical methods are required for the meta-analysis of diagnostic test accuracy. Hierarchical models including the bivariate model and the hierarchical summary receiver operating characteristic model are increasingly being accepted as standard methods for meta-analysis of diagnostic test accuracy studies. We provide a conceptual review of statistical methods currently used and recommended for meta-analysis of diagnostic test accuracy studies. This article could serve as a methodological reference for those who perform systematic review and meta-analysis of diagnostic test accuracy studies. PMID:26576107

  2. Risk of thromboembolism with thrombopoietin receptor agonists in adult patients with thrombocytopenia: Systematic review and meta-analysis of randomized controlled trials.

    PubMed

    Catalá-López, Ferrán; Corrales, Inmaculada; de la Fuente-Honrubia, César; González-Bermejo, Diana; Martín-Serrano, Gloria; Montero, Dolores; Saint-Gerons, Diego Macías

    2015-12-21

    Romiplostim and eltrombopag are thrombopoietin receptor (TPOr) agonists that promote megakaryocyte differentiation, proliferation and platelet production. In 2012, a systematic review and meta-analysis reported a non-statistically significant increased risk of thromboembolic events for these drugs, but analyses were limited by lack of statistical power. Our objective was to update the 2012 meta-analysis examining whether TPOr agonists affect thromboembolism occurrence in adult thrombocytopenic patients. We conducted a systematic review and meta-analysis of randomized controlled trials (RCTs). Updated searches were conduced on PubMed, Cochrane Central, and publicly available registries (up to December 2014). RCTs using romiplostim or eltrombopag in at least one group were included. Relative risks (RR), absolute risk ratios (ARR) and number needed to harm (NNH) were estimated. Heterogeneity was analyzed using Cochran's Q test and I(2) statistic. Fifteen studies with 3026 adult thrombocytopenic patients were included. Estimated frequency of thromboembolism was 3.69% (95% CI: 2.95-4.61%) for TPOr agonists and 1.46% (95% CI: 0.89-2.40%) for controls. TPOr agonists were associated with a RR of thromboembolism of 1.81 (95% CI: 1.04-3.14) and an ARR of 2.10% (95% CI: 0.03-3.90%) meaning a NNH of 48. Overall, we did not find evidence of statistical heterogeneity (p=0.43; I(2)=1.60%). Our updated meta-analysis suggested that TPOr agonists are associated with a higher risk of thromboemboembolic events compared with controls, and supports the current recommendations included in the European product information on this respect. Copyright © 2015 Elsevier España, S.L.U. All rights reserved.

  3. Algorithm for Identifying Erroneous Rain-Gauge Readings

    NASA Technical Reports Server (NTRS)

    Rickman, Doug

    2005-01-01

    An algorithm analyzes rain-gauge data to identify statistical outliers that could be deemed to be erroneous readings. Heretofore, analyses of this type have been performed in burdensome manual procedures that have involved subjective judgements. Sometimes, the analyses have included computational assistance for detecting values falling outside of arbitrary limits. The analyses have been performed without statistically valid knowledge of the spatial and temporal variations of precipitation within rain events. In contrast, the present algorithm makes it possible to automate such an analysis, makes the analysis objective, takes account of the spatial distribution of rain gauges in conjunction with the statistical nature of spatial variations in rainfall readings, and minimizes the use of arbitrary criteria. The algorithm implements an iterative process that involves nonparametric statistics.

  4. Event coincidence analysis for quantifying statistical interrelationships between event time series. On the role of flood events as triggers of epidemic outbreaks

    NASA Astrophysics Data System (ADS)

    Donges, J. F.; Schleussner, C.-F.; Siegmund, J. F.; Donner, R. V.

    2016-05-01

    Studying event time series is a powerful approach for analyzing the dynamics of complex dynamical systems in many fields of science. In this paper, we describe the method of event coincidence analysis to provide a framework for quantifying the strength, directionality and time lag of statistical interrelationships between event series. Event coincidence analysis allows to formulate and test null hypotheses on the origin of the observed interrelationships including tests based on Poisson processes or, more generally, stochastic point processes with a prescribed inter-event time distribution and other higher-order properties. Applying the framework to country-level observational data yields evidence that flood events have acted as triggers of epidemic outbreaks globally since the 1950s. Facing projected future changes in the statistics of climatic extreme events, statistical techniques such as event coincidence analysis will be relevant for investigating the impacts of anthropogenic climate change on human societies and ecosystems worldwide.

  5. Linnorm: improved statistical analysis for single cell RNA-seq expression data.

    PubMed

    Yip, Shun H; Wang, Panwen; Kocher, Jean-Pierre A; Sham, Pak Chung; Wang, Junwen

    2017-12-15

    Linnorm is a novel normalization and transformation method for the analysis of single cell RNA sequencing (scRNA-seq) data. Linnorm is developed to remove technical noises and simultaneously preserve biological variations in scRNA-seq data, such that existing statistical methods can be improved. Using real scRNA-seq data, we compared Linnorm with existing normalization methods, including NODES, SAMstrt, SCnorm, scran, DESeq and TMM. Linnorm shows advantages in speed, technical noise removal and preservation of cell heterogeneity, which can improve existing methods in the discovery of novel subtypes, pseudo-temporal ordering of cells, clustering analysis, etc. Linnorm also performs better than existing DEG analysis methods, including BASiCS, NODES, SAMstrt, Seurat and DESeq2, in false positive rate control and accuracy. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Experimental analysis of computer system dependability

    NASA Technical Reports Server (NTRS)

    Iyer, Ravishankar, K.; Tang, Dong

    1993-01-01

    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: design phase, prototype phase, and operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software-dependability, and fault diagnosis. The discussion involves several important issues studies in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance.

  7. Landing Site Dispersion Analysis and Statistical Assessment for the Mars Phoenix Lander

    NASA Technical Reports Server (NTRS)

    Bonfiglio, Eugene P.; Adams, Douglas; Craig, Lynn; Spencer, David A.; Strauss, William; Seelos, Frank P.; Seelos, Kimberly D.; Arvidson, Ray; Heet, Tabatha

    2008-01-01

    The Mars Phoenix Lander launched on August 4, 2007 and successfully landed on Mars 10 months later on May 25, 2008. Landing ellipse predicts and hazard maps were key in selecting safe surface targets for Phoenix. Hazard maps were based on terrain slopes, geomorphology maps and automated rock counts of MRO's High Resolution Imaging Science Experiment (HiRISE) images. The expected landing dispersion which led to the selection of Phoenix's surface target is discussed as well as the actual landing dispersion predicts determined during operations in the weeks, days, and hours before landing. A statistical assessment of these dispersions is performed, comparing the actual landing-safety probabilities to criteria levied by the project. Also discussed are applications for this statistical analysis which were used by the Phoenix project. These include using the statistical analysis used to verify the effectiveness of a pre-planned maneuver menu and calculating the probability of future maneuvers.

  8. SPA- STATISTICAL PACKAGE FOR TIME AND FREQUENCY DOMAIN ANALYSIS

    NASA Technical Reports Server (NTRS)

    Brownlow, J. D.

    1994-01-01

    The need for statistical analysis often arises when data is in the form of a time series. This type of data is usually a collection of numerical observations made at specified time intervals. Two kinds of analysis may be performed on the data. First, the time series may be treated as a set of independent observations using a time domain analysis to derive the usual statistical properties including the mean, variance, and distribution form. Secondly, the order and time intervals of the observations may be used in a frequency domain analysis to examine the time series for periodicities. In almost all practical applications, the collected data is actually a mixture of the desired signal and a noise signal which is collected over a finite time period with a finite precision. Therefore, any statistical calculations and analyses are actually estimates. The Spectrum Analysis (SPA) program was developed to perform a wide range of statistical estimation functions. SPA can provide the data analyst with a rigorous tool for performing time and frequency domain studies. In a time domain statistical analysis the SPA program will compute the mean variance, standard deviation, mean square, and root mean square. It also lists the data maximum, data minimum, and the number of observations included in the sample. In addition, a histogram of the time domain data is generated, a normal curve is fit to the histogram, and a goodness-of-fit test is performed. These time domain calculations may be performed on both raw and filtered data. For a frequency domain statistical analysis the SPA program computes the power spectrum, cross spectrum, coherence, phase angle, amplitude ratio, and transfer function. The estimates of the frequency domain parameters may be smoothed with the use of Hann-Tukey, Hamming, Barlett, or moving average windows. Various digital filters are available to isolate data frequency components. Frequency components with periods longer than the data collection interval are removed by least-squares detrending. As many as ten channels of data may be analyzed at one time. Both tabular and plotted output may be generated by the SPA program. This program is written in FORTRAN IV and has been implemented on a CDC 6000 series computer with a central memory requirement of approximately 142K (octal) of 60 bit words. This core requirement can be reduced by segmentation of the program. The SPA program was developed in 1978.

  9. WASP (Write a Scientific Paper) using Excel - 6: Standard error and confidence interval.

    PubMed

    Grech, Victor

    2018-03-01

    The calculation of descriptive statistics includes the calculation of standard error and confidence interval, an inevitable component of data analysis in inferential statistics. This paper provides pointers as to how to do this in Microsoft Excel™. Copyright © 2018 Elsevier B.V. All rights reserved.

  10. Analyzing a Mature Software Inspection Process Using Statistical Process Control (SPC)

    NASA Technical Reports Server (NTRS)

    Barnard, Julie; Carleton, Anita; Stamper, Darrell E. (Technical Monitor)

    1999-01-01

    This paper presents a cooperative effort where the Software Engineering Institute and the Space Shuttle Onboard Software Project could experiment applying Statistical Process Control (SPC) analysis to inspection activities. The topics include: 1) SPC Collaboration Overview; 2) SPC Collaboration Approach and Results; and 3) Lessons Learned.

  11. Genetic structure of populations and differentiation in forest trees

    Treesearch

    Raymond P. Guries; F. Thomas Ledig

    1981-01-01

    Electrophoretic techniques permit population biologists to analyze genetic structure of natural populations by using large numbers of allozyme loci. Several methods of analysis have been applied to allozyme data, including chi-square contingency tests, F-statistics, and genetic distance. This paper compares such statistics for pitch pine (Pinus rigida...

  12. PANDA-view: An easy-to-use tool for statistical analysis and visualization of quantitative proteomics data.

    PubMed

    Chang, Cheng; Xu, Kaikun; Guo, Chaoping; Wang, Jinxia; Yan, Qi; Zhang, Jian; He, Fuchu; Zhu, Yunping

    2018-05-22

    Compared with the numerous software tools developed for identification and quantification of -omics data, there remains a lack of suitable tools for both downstream analysis and data visualization. To help researchers better understand the biological meanings in their -omics data, we present an easy-to-use tool, named PANDA-view, for both statistical analysis and visualization of quantitative proteomics data and other -omics data. PANDA-view contains various kinds of analysis methods such as normalization, missing value imputation, statistical tests, clustering and principal component analysis, as well as the most commonly-used data visualization methods including an interactive volcano plot. Additionally, it provides user-friendly interfaces for protein-peptide-spectrum representation of the quantitative proteomics data. PANDA-view is freely available at https://sourceforge.net/projects/panda-view/. 1987ccpacer@163.com and zhuyunping@gmail.com. Supplementary data are available at Bioinformatics online.

  13. Measuring the statistical validity of summary meta-analysis and meta-regression results for use in clinical practice.

    PubMed

    Willis, Brian H; Riley, Richard D

    2017-09-20

    An important question for clinicians appraising a meta-analysis is: are the findings likely to be valid in their own practice-does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity-where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how we may test meta-analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

  14. On the Application of Syntactic Methodologies in Automatic Text Analysis.

    ERIC Educational Resources Information Center

    Salton, Gerard; And Others

    1990-01-01

    Summarizes various linguistic approaches proposed for document analysis in information retrieval environments. Topics discussed include syntactic analysis; use of machine-readable dictionary information; knowledge base construction; the PLNLP English Grammar (PEG) system; phrase normalization; and statistical and syntactic phrase evaluation used…

  15. Random forests for classification in ecology

    USGS Publications Warehouse

    Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J.

    2007-01-01

    Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature. ?? 2007 by the Ecological Society of America.

  16. Advanced statistical methods for improved data analysis of NASA astrophysics missions

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.

    1992-01-01

    The investigators under this grant studied ways to improve the statistical analysis of astronomical data. They looked at existing techniques, the development of new techniques, and the production and distribution of specialized software to the astronomical community. Abstracts of nine papers that were produced are included, as well as brief descriptions of four software packages. The articles that are abstracted discuss analytical and Monte Carlo comparisons of six different linear least squares fits, a (second) paper on linear regression in astronomy, two reviews of public domain software for the astronomer, subsample and half-sample methods for estimating sampling distributions, a nonparametric estimation of survival functions under dependent competing risks, censoring in astronomical data due to nondetections, an astronomy survival analysis computer package called ASURV, and improving the statistical methodology of astronomical data analysis.

  17. Scaling up to address data science challenges

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wendelberger, Joanne R.

    Statistics and Data Science provide a variety of perspectives and technical approaches for exploring and understanding Big Data. Partnerships between scientists from different fields such as statistics, machine learning, computer science, and applied mathematics can lead to innovative approaches for addressing problems involving increasingly large amounts of data in a rigorous and effective manner that takes advantage of advances in computing. Here, this article will explore various challenges in Data Science and will highlight statistical approaches that can facilitate analysis of large-scale data including sampling and data reduction methods, techniques for effective analysis and visualization of large-scale simulations, and algorithmsmore » and procedures for efficient processing.« less

  18. Statistical innovations in the medical device world sparked by the FDA.

    PubMed

    Campbell, Gregory; Yue, Lilly Q

    2016-01-01

    The world of medical devices while highly diverse is extremely innovative, and this facilitates the adoption of innovative statistical techniques. Statisticians in the Center for Devices and Radiological Health (CDRH) at the Food and Drug Administration (FDA) have provided leadership in implementing statistical innovations. The innovations discussed include: the incorporation of Bayesian methods in clinical trials, adaptive designs, the use and development of propensity score methodology in the design and analysis of non-randomized observational studies, the use of tipping-point analysis for missing data, techniques for diagnostic test evaluation, bridging studies for companion diagnostic tests, quantitative benefit-risk decisions, and patient preference studies.

  19. Scaling up to address data science challenges

    DOE PAGES

    Wendelberger, Joanne R.

    2017-04-27

    Statistics and Data Science provide a variety of perspectives and technical approaches for exploring and understanding Big Data. Partnerships between scientists from different fields such as statistics, machine learning, computer science, and applied mathematics can lead to innovative approaches for addressing problems involving increasingly large amounts of data in a rigorous and effective manner that takes advantage of advances in computing. Here, this article will explore various challenges in Data Science and will highlight statistical approaches that can facilitate analysis of large-scale data including sampling and data reduction methods, techniques for effective analysis and visualization of large-scale simulations, and algorithmsmore » and procedures for efficient processing.« less

  20. Cognition, comprehension and application of biostatistics in research by Indian postgraduate students in periodontics.

    PubMed

    Swetha, Jonnalagadda Laxmi; Arpita, Ramisetti; Srikanth, Chintalapani; Nutalapati, Rajasekhar

    2014-01-01

    Biostatistics is an integral part of research protocols. In any field of inquiry or investigation, data obtained is subsequently classified, analyzed and tested for accuracy by statistical methods. Statistical analysis of collected data, thus, forms the basis for all evidence-based conclusions. The aim of this study is to evaluate the cognition, comprehension and application of biostatistics in research among post graduate students in Periodontics, in India. A total of 391 post graduate students registered for a master's course in periodontics at various dental colleges across India were included in the survey. Data regarding the level of knowledge, understanding and its application in design and conduct of the research protocol was collected using a dichotomous questionnaire. A descriptive statistics was used for data analysis. Nearly 79.2% students were aware of the importance of biostatistics in research, 55-65% were familiar with MS-EXCEL spreadsheet for graphical representation of data and with the statistical softwares available on the internet, 26.0% had biostatistics as mandatory subject in their curriculum, 9.5% tried to perform statistical analysis on their own while 3.0% were successful in performing statistical analysis of their studies on their own. Biostatistics should play a central role in planning, conduct, interim analysis, final analysis and reporting of periodontal research especially by the postgraduate students. Indian postgraduate students in periodontics are aware of the importance of biostatistics in research but the level of understanding and application is still basic and needs to be addressed.

  1. Analysis and interpretation of cost data in randomised controlled trials: review of published studies

    PubMed Central

    Barber, Julie A; Thompson, Simon G

    1998-01-01

    Objective To review critically the statistical methods used for health economic evaluations in randomised controlled trials where an estimate of cost is available for each patient in the study. Design Survey of published randomised trials including an economic evaluation with cost values suitable for statistical analysis; 45 such trials published in 1995 were identified from Medline. Main outcome measures The use of statistical methods for cost data was assessed in terms of the descriptive statistics reported, use of statistical inference, and whether the reported conclusions were justified. Results Although all 45 trials reviewed apparently had cost data for each patient, only 9 (20%) reported adequate measures of variability for these data and only 25 (56%) gave results of statistical tests or a measure of precision for the comparison of costs between the randomised groups. Only 16 (36%) of the articles gave conclusions which were justified on the basis of results presented in the paper. No paper reported sample size calculations for costs. Conclusions The analysis and interpretation of cost data from published trials reveal a lack of statistical awareness. Strong and potentially misleading conclusions about the relative costs of alternative therapies have often been reported in the absence of supporting statistical evidence. Improvements in the analysis and reporting of health economic assessments are urgently required. Health economic guidelines need to be revised to incorporate more detailed statistical advice. Key messagesHealth economic evaluations required for important healthcare policy decisions are often carried out in randomised controlled trialsA review of such published economic evaluations assessed whether statistical methods for cost outcomes have been appropriately used and interpretedFew publications presented adequate descriptive information for costs or performed appropriate statistical analysesIn at least two thirds of the papers, the main conclusions regarding costs were not justifiedThe analysis and reporting of health economic assessments within randomised controlled trials urgently need improving PMID:9794854

  2. Use of model calibration to achieve high accuracy in analysis of computer networks

    DOEpatents

    Frogner, Bjorn; Guarro, Sergio; Scharf, Guy

    2004-05-11

    A system and method are provided for creating a network performance prediction model, and calibrating the prediction model, through application of network load statistical analyses. The method includes characterizing the measured load on the network, which may include background load data obtained over time, and may further include directed load data representative of a transaction-level event. Probabilistic representations of load data are derived to characterize the statistical persistence of the network performance variability and to determine delays throughout the network. The probabilistic representations are applied to the network performance prediction model to adapt the model for accurate prediction of network performance. Certain embodiments of the method and system may be used for analysis of the performance of a distributed application characterized as data packet streams.

  3. Compression Algorithm Analysis of In-Situ (S)TEM Video: Towards Automatic Event Detection and Characterization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Teuton, Jeremy R.; Griswold, Richard L.; Mehdi, Beata L.

    Precise analysis of both (S)TEM images and video are time and labor intensive processes. As an example, determining when crystal growth and shrinkage occurs during the dynamic process of Li dendrite deposition and stripping involves manually scanning through each frame in the video to extract a specific set of frames/images. For large numbers of images, this process can be very time consuming, so a fast and accurate automated method is desirable. Given this need, we developed software that uses analysis of video compression statistics for detecting and characterizing events in large data sets. This software works by converting the datamore » into a series of images which it compresses into an MPEG-2 video using the open source “avconv” utility [1]. The software does not use the video itself, but rather analyzes the video statistics from the first pass of the video encoding that avconv records in the log file. This file contains statistics for each frame of the video including the frame quality, intra-texture and predicted texture bits, forward and backward motion vector resolution, among others. In all, avconv records 15 statistics for each frame. By combining different statistics, we have been able to detect events in various types of data. We have developed an interactive tool for exploring the data and the statistics that aids the analyst in selecting useful statistics for each analysis. Going forward, an algorithm for detecting and possibly describing events automatically can be written based on statistic(s) for each data type.« less

  4. Developing a Campaign Plan to Target Centers of Gravity Within Economic Systems

    DTIC Science & Technology

    1995-05-01

    Conclusion 67 CHAPTER 7: CURRENT AND FUTURE CONCERNS 69 Decision Making and Planning 69 Conclusion 72 CHAPTER 8: CONCLUSION 73 APPENDIX A: STATISTICS 80...Terminology and Statistical Tests 80 Country Analysis 84 APPENDIX B 154 BIBLIOGRAPHY 157 VITAE 162 IV LIST OF FIGURES Figure 1. Air Campaign...This project furthers the original statistical effort and adds to this a campaign planning approach (including both systems and operational level

  5. The Ontology of Biological and Clinical Statistics (OBCS) for standardized and reproducible statistical analysis.

    PubMed

    Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria; Lin, Yu; Hero, Alfred; Smith, Barry; He, Yongqun

    2016-09-14

    Statistics play a critical role in biological and clinical research. However, most reports of scientific results in the published literature make it difficult for the reader to reproduce the statistical analyses performed in achieving those results because they provide inadequate documentation of the statistical tests and algorithms applied. The Ontology of Biological and Clinical Statistics (OBCS) is put forward here as a step towards solving this problem. The terms in OBCS including 'data collection', 'data transformation in statistics', 'data visualization', 'statistical data analysis', and 'drawing a conclusion based on data', cover the major types of statistical processes used in basic biological research and clinical outcome studies. OBCS is aligned with the Basic Formal Ontology (BFO) and extends the Ontology of Biomedical Investigations (OBI), an OBO (Open Biological and Biomedical Ontologies) Foundry ontology supported by over 20 research communities. Currently, OBCS comprehends 878 terms, representing 20 BFO classes, 403 OBI classes, 229 OBCS specific classes, and 122 classes imported from ten other OBO ontologies. We discuss two examples illustrating how the ontology is being applied. In the first (biological) use case, we describe how OBCS was applied to represent the high throughput microarray data analysis of immunological transcriptional profiles in human subjects vaccinated with an influenza vaccine. In the second (clinical outcomes) use case, we applied OBCS to represent the processing of electronic health care data to determine the associations between hospital staffing levels and patient mortality. Our case studies were designed to show how OBCS can be used for the consistent representation of statistical analysis pipelines under two different research paradigms. Other ongoing projects using OBCS for statistical data processing are also discussed. The OBCS source code and documentation are available at: https://github.com/obcs/obcs . The Ontology of Biological and Clinical Statistics (OBCS) is a community-based open source ontology in the domain of biological and clinical statistics. OBCS is a timely ontology that represents statistics-related terms and their relations in a rigorous fashion, facilitates standard data analysis and integration, and supports reproducible biological and clinical research.

  6. Biological Parametric Mapping: A Statistical Toolbox for Multi-Modality Brain Image Analysis

    PubMed Central

    Casanova, Ramon; Ryali, Srikanth; Baer, Aaron; Laurienti, Paul J.; Burdette, Jonathan H.; Hayasaka, Satoru; Flowers, Lynn; Wood, Frank; Maldjian, Joseph A.

    2006-01-01

    In recent years multiple brain MR imaging modalities have emerged; however, analysis methodologies have mainly remained modality specific. In addition, when comparing across imaging modalities, most researchers have been forced to rely on simple region-of-interest type analyses, which do not allow the voxel-by-voxel comparisons necessary to answer more sophisticated neuroscience questions. To overcome these limitations, we developed a toolbox for multimodal image analysis called biological parametric mapping (BPM), based on a voxel-wise use of the general linear model. The BPM toolbox incorporates information obtained from other modalities as regressors in a voxel-wise analysis, thereby permitting investigation of more sophisticated hypotheses. The BPM toolbox has been developed in MATLAB with a user friendly interface for performing analyses, including voxel-wise multimodal correlation, ANCOVA, and multiple regression. It has a high degree of integration with the SPM (statistical parametric mapping) software relying on it for visualization and statistical inference. Furthermore, statistical inference for a correlation field, rather than a widely-used T-field, has been implemented in the correlation analysis for more accurate results. An example with in-vivo data is presented demonstrating the potential of the BPM methodology as a tool for multimodal image analysis. PMID:17070709

  7. Higher Education: Gaps in Access and Persistence Study. Statistical Analysis Report. NCES 2012-046

    ERIC Educational Resources Information Center

    Ross, Terris; Kena, Grace; Rathbun, Amy; KewalRamani, Angelina; Zhang, Jijun; Kristapovich, Paul; Manning, Eileen

    2012-01-01

    Numerous studies, including those of the National Center for Education Statistics (NCES), have documented persistent gaps between the educational attainment of White males and that of Black, Hispanic, American Indian/Alaska Native, and Native Hawaiian/Pacific Islander males. Further, there is evidence of growing gaps by sex within these…

  8. [Artificial neural networks for decision making in urologic oncology].

    PubMed

    Remzi, M; Djavan, B

    2007-06-01

    This chapter presents a detailed introduction regarding Artificial Neural Networks (ANNs) and their contribution to modern Urologic Oncology. It includes a description of ANNs methodology and points out the differences between Artifical Intelligence and traditional statistic models in terms of usefulness for patients and clinicians, and its advantages over current statistical analysis.

  9. The Effects of Measurement Error on Statistical Models for Analyzing Change. Final Report.

    ERIC Educational Resources Information Center

    Dunivant, Noel

    The results of six major projects are discussed including a comprehensive mathematical and statistical analysis of the problems caused by errors of measurement in linear models for assessing change. In a general matrix representation of the problem, several new analytic results are proved concerning the parameters which affect bias in…

  10. Nebraska's forests, 2005: statistics, methods, and quality assurance

    Treesearch

    Patrick D. Miles; Dacia M. Meneguzzo; Charles J. Barnett

    2011-01-01

    The first full annual inventory of Nebraska's forests was completed in 2005 after 8,335 plots were selected and 274 forested plots were visited and measured. This report includes detailed information on forest inventory methods, and data quality estimates. Tables of various important resource statistics are presented. Detailed analysis of the inventory data are...

  11. Statistical description of tectonic motions

    NASA Technical Reports Server (NTRS)

    Agnew, Duncan Carr

    1993-01-01

    This report summarizes investigations regarding tectonic motions. The topics discussed include statistics of crustal deformation, Earth rotation studies, using multitaper spectrum analysis techniques applied to both space-geodetic data and conventional astrometric estimates of the Earth's polar motion, and the development, design, and installation of high-stability geodetic monuments for use with the global positioning system.

  12. Critical discussion of evaluation parameters for inter-observer variability in target definition for radiation therapy.

    PubMed

    Fotina, I; Lütgendorf-Caucig, C; Stock, M; Pötter, R; Georg, D

    2012-02-01

    Inter-observer studies represent a valid method for the evaluation of target definition uncertainties and contouring guidelines. However, data from the literature do not yet give clear guidelines for reporting contouring variability. Thus, the purpose of this work was to compare and discuss various methods to determine variability on the basis of clinical cases and a literature review. In this study, 7 prostate and 8 lung cases were contoured on CT images by 8 experienced observers. Analysis of variability included descriptive statistics, calculation of overlap measures, and statistical measures of agreement. Cross tables with ratios and correlations were established for overlap parameters. It was shown that the minimal set of parameters to be reported should include at least one of three volume overlap measures (i.e., generalized conformity index, Jaccard coefficient, or conformation number). High correlation between these parameters and scatter of the results was observed. A combination of descriptive statistics, overlap measure, and statistical measure of agreement or reliability analysis is required to fully report the interrater variability in delineation.

  13. The analysis of morphometric data on rocky mountain wolves and artic wolves using statistical method

    NASA Astrophysics Data System (ADS)

    Ammar Shafi, Muhammad; Saifullah Rusiman, Mohd; Hamzah, Nor Shamsidah Amir; Nor, Maria Elena; Ahmad, Noor’ani; Azia Hazida Mohamad Azmi, Nur; Latip, Muhammad Faez Ab; Hilmi Azman, Ahmad

    2018-04-01

    Morphometrics is a quantitative analysis depending on the shape and size of several specimens. Morphometric quantitative analyses are commonly used to analyse fossil record, shape and size of specimens and others. The aim of the study is to find the differences between rocky mountain wolves and arctic wolves based on gender. The sample utilised secondary data which included seven variables as independent variables and two dependent variables. Statistical modelling was used in the analysis such was the analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA). The results showed there exist differentiating results between arctic wolves and rocky mountain wolves based on independent factors and gender.

  14. Mathematics and statistics research progress report, period ending June 30, 1983

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beauchamp, J. J.; Denson, M. V.; Heath, M. T.

    1983-08-01

    This report is the twenty-sixth in the series of progress reports of Mathematics and Statistics Research of the Computer Sciences organization, Union Carbide Corporation Nuclear Division. Part A records research progress in analysis of large data sets, applied analysis, biometrics research, computational statistics, materials science applications, numerical linear algebra, and risk analysis. Collaboration and consulting with others throughout the Oak Ridge Department of Energy complex are recorded in Part B. Included are sections on biological sciences, energy, engineering, environmental sciences, health and safety, and safeguards. Part C summarizes the various educational activities in which the staff was engaged. Part Dmore » lists the presentations of research results, and Part E records the staff's other professional activities during the report period.« less

  15. Mathematical Ties That Bind.

    ERIC Educational Resources Information Center

    House, Peggy A.

    1994-01-01

    Describes some mathematical investigations of the necktie which includes applications of geometry, statistics, data analysis, sampling, probability, symmetry, proportion, problem solving, and business. (MKR)

  16. Intrabolus pressure on high-resolution manometry distinguishes fibrostenotic and inflammatory phenotypes of eosinophilic esophagitis.

    PubMed

    Colizzo, J M; Clayton, S B; Richter, J E

    2016-08-01

    The aim of this investigation was to determine the motility patterns of inflammatory and fibrostenotic phenotypes of eosinophilic esophagitis (EoE) utilizing high-resolution manometry (HRM). Twenty-nine patients with a confirmed diagnosis of EoE according to clinicopathological criteria currently being managed at the Joy McCann Culverhouse Swallowing Center at the University of South Florida were included in the retrospective analysis. Only patients who completed HRM studies were included in the analysis. Patients were classified into inflammatory or fibrostenotic subtypes based on baseline endoscopic evidence. Their baseline HRM studies prior to therapy were analyzed. Manometric data including distal contractile integral, integrated relaxation pressure, and intrabolus pressure (IBP) values were recorded. HRM results were interpreted according to the Chicago Classification system. Statistical analysis was performed with SPSS software (Version 22, IBM Co., Armonk, NY, USA). Data were compared utilizing Student's t-test, χ(2) test, Pearson correlation, and Spearman correlation tests. Statistical significance was set at P < 0.05. A total of 29 patients with EoE were included into the retrospective analysis. The overall average age among patients was 40 years. Male patients comprised 62% of the overall population. Both groups were similar in age, gender, and overall clinical presentation. Seventeen patients (58%) had fibrostenotic disease, and 12 (42%) displayed inflammatory disease. The average IBP for the fibrostenotic and inflammatory groups were 18.6 ± 6.0 mmHg and 12.6 ± 3.5 mmHg, respectively (P < 0.05). Strictures were only seen in the fibrostenotic group. Of the fibrostenotic group, 6 (35%) demonstrated proximal esophageal strictures, 7 (41%) had distal strictures, 3 (18%) had mid-esophageal strictures, and 1 (6%) patient had pan-esophageal strictures. There was no statistically significant correlation between the level of esophageal stricture and degree of IBP. Integrated relaxation pressure, distal contractile integral, and other HRM metrics did not demonstrate statistical significance between the two subtypes. There also appeared no statistically significant correlation between patient demographics and esophageal metrics. Patients with the fibrostenotic phenotype of EoE demonstrated an IBP that was significantly higher than that of the inflammatory group. © 2015 International Society for Diseases of the Esophagus.

  17. Statistical summaries of selected Iowa streamflow data through September 2013

    USGS Publications Warehouse

    Eash, David A.; O'Shea, Padraic S.; Weber, Jared R.; Nguyen, Kevin T.; Montgomery, Nicholas L.; Simonson, Adrian J.

    2016-01-04

    Statistical summaries of streamflow data collected at 184 streamgages in Iowa are presented in this report. All streamgages included for analysis have at least 10 years of continuous record collected before or through September 2013. This report is an update to two previously published reports that presented statistical summaries of selected Iowa streamflow data through September 1988 and September 1996. The statistical summaries include (1) monthly and annual flow durations, (2) annual exceedance probabilities of instantaneous peak discharges (flood frequencies), (3) annual exceedance probabilities of high discharges, and (4) annual nonexceedance probabilities of low discharges and seasonal low discharges. Also presented for each streamgage are graphs of the annual mean discharges, mean annual mean discharges, 50-percent annual flow-duration discharges (median flows), harmonic mean flows, mean daily mean discharges, and flow-duration curves. Two sets of statistical summaries are presented for each streamgage, which include (1) long-term statistics for the entire period of streamflow record and (2) recent-term statistics for or during the 30-year period of record from 1984 to 2013. The recent-term statistics are only calculated for streamgages with streamflow records pre-dating the 1984 water year and with at least 10 years of record during 1984–2013. The streamflow statistics in this report are not adjusted for the effects of water use; although some of this water is used consumptively, most of it is returned to the streams.

  18. A New Methodology for Systematic Exploitation of Technology Databases.

    ERIC Educational Resources Information Center

    Bedecarrax, Chantal; Huot, Charles

    1994-01-01

    Presents the theoretical aspects of a data analysis methodology that can help transform sequential raw data from a database into useful information, using the statistical analysis of patents as an example. Topics discussed include relational analysis and a technology watch approach. (Contains 17 references.) (LRW)

  19. The Development of Career Naval Officers from the U.S. Naval Academy: A Statistical Analysis of the Effects of Selectivity and Human Capital

    DTIC Science & Technology

    1997-06-01

    career success for academy graduates relative to officers commissioned from other sources. Favoritism occurs if high-ranking officers who are service... career success as a naval officer? 6 The thesis investigates several databases in an effort to paint a complete statistical picture of naval officer...including both public and private sector career success was conducted by the Standard & Poor’s Corporation with a related analysis by Professor Michael Useem

  20. Statistical analysis of fNIRS data: a comprehensive review.

    PubMed

    Tak, Sungho; Ye, Jong Chul

    2014-01-15

    Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activities using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of historical developments of statistical analyses of fNIRS signal, which include motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and statistical parameter mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effect model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some of the open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.

  1. Analysis of Doppler radar windshear data

    NASA Technical Reports Server (NTRS)

    Williams, F.; Mckinney, P.; Ozmen, F.

    1989-01-01

    The objective of this analysis is to process Lincoln Laboratory Doppler radar data obtained during FLOWS testing at Huntsville, Alabama, in the summer of 1986, to characterize windshear events. The processing includes plotting velocity and F-factor profiles, histogram analysis to summarize statistics, and correlation analysis to demonstrate any correlation between different data fields.

  2. Test 6, Test 7, and Gas Standard Analysis Results

    NASA Technical Reports Server (NTRS)

    Perez, Horacio, III

    2007-01-01

    This viewgraph presentation shows results of analyses on odor, toxic off gassing and gas standards. The topics include: 1) Statistical Analysis Definitions; 2) Odor Analysis Results NASA Standard 6001 Test 6; 3) Toxic Off gassing Analysis Results NASA Standard 6001 Test 7; and 4) Gas Standard Results NASA Standard 6001 Test 7;

  3. Outcomes Definitions and Statistical Tests in Oncology Studies: A Systematic Review of the Reporting Consistency.

    PubMed

    Rivoirard, Romain; Duplay, Vianney; Oriol, Mathieu; Tinquaut, Fabien; Chauvin, Franck; Magne, Nicolas; Bourmaud, Aurelie

    2016-01-01

    Quality of reporting for Randomized Clinical Trials (RCTs) in oncology was analyzed in several systematic reviews, but, in this setting, there is paucity of data for the outcomes definitions and consistency of reporting for statistical tests in RCTs and Observational Studies (OBS). The objective of this review was to describe those two reporting aspects, for OBS and RCTs in oncology. From a list of 19 medical journals, three were retained for analysis, after a random selection: British Medical Journal (BMJ), Annals of Oncology (AoO) and British Journal of Cancer (BJC). All original articles published between March 2009 and March 2014 were screened. Only studies whose main outcome was accompanied by a corresponding statistical test were included in the analysis. Studies based on censored data were excluded. Primary outcome was to assess quality of reporting for description of primary outcome measure in RCTs and of variables of interest in OBS. A logistic regression was performed to identify covariates of studies potentially associated with concordance of tests between Methods and Results parts. 826 studies were included in the review, and 698 were OBS. Variables were described in Methods section for all OBS studies and primary endpoint was clearly detailed in Methods section for 109 RCTs (85.2%). 295 OBS (42.2%) and 43 RCTs (33.6%) had perfect agreement for reported statistical test between Methods and Results parts. In multivariable analysis, variable "number of included patients in study" was associated with test consistency: aOR (adjusted Odds Ratio) for third group compared to first group was equal to: aOR Grp3 = 0.52 [0.31-0.89] (P value = 0.009). Variables in OBS and primary endpoint in RCTs are reported and described with a high frequency. However, statistical tests consistency between methods and Results sections of OBS is not always noted. Therefore, we encourage authors and peer reviewers to verify consistency of statistical tests in oncology studies.

  4. Outcomes Definitions and Statistical Tests in Oncology Studies: A Systematic Review of the Reporting Consistency

    PubMed Central

    Rivoirard, Romain; Duplay, Vianney; Oriol, Mathieu; Tinquaut, Fabien; Chauvin, Franck; Magne, Nicolas; Bourmaud, Aurelie

    2016-01-01

    Background Quality of reporting for Randomized Clinical Trials (RCTs) in oncology was analyzed in several systematic reviews, but, in this setting, there is paucity of data for the outcomes definitions and consistency of reporting for statistical tests in RCTs and Observational Studies (OBS). The objective of this review was to describe those two reporting aspects, for OBS and RCTs in oncology. Methods From a list of 19 medical journals, three were retained for analysis, after a random selection: British Medical Journal (BMJ), Annals of Oncology (AoO) and British Journal of Cancer (BJC). All original articles published between March 2009 and March 2014 were screened. Only studies whose main outcome was accompanied by a corresponding statistical test were included in the analysis. Studies based on censored data were excluded. Primary outcome was to assess quality of reporting for description of primary outcome measure in RCTs and of variables of interest in OBS. A logistic regression was performed to identify covariates of studies potentially associated with concordance of tests between Methods and Results parts. Results 826 studies were included in the review, and 698 were OBS. Variables were described in Methods section for all OBS studies and primary endpoint was clearly detailed in Methods section for 109 RCTs (85.2%). 295 OBS (42.2%) and 43 RCTs (33.6%) had perfect agreement for reported statistical test between Methods and Results parts. In multivariable analysis, variable "number of included patients in study" was associated with test consistency: aOR (adjusted Odds Ratio) for third group compared to first group was equal to: aOR Grp3 = 0.52 [0.31–0.89] (P value = 0.009). Conclusion Variables in OBS and primary endpoint in RCTs are reported and described with a high frequency. However, statistical tests consistency between methods and Results sections of OBS is not always noted. Therefore, we encourage authors and peer reviewers to verify consistency of statistical tests in oncology studies. PMID:27716793

  5. SOCR Analyses – an Instructional Java Web-based Statistical Analysis Toolkit

    PubMed Central

    Chu, Annie; Cui, Jenny; Dinov, Ivo D.

    2011-01-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most updated information and newly added models. PMID:21546994

  6. Statistical analysis and application of quasi experiments to antimicrobial resistance intervention studies.

    PubMed

    Shardell, Michelle; Harris, Anthony D; El-Kamary, Samer S; Furuno, Jon P; Miller, Ram R; Perencevich, Eli N

    2007-10-01

    Quasi-experimental study designs are frequently used to assess interventions that aim to limit the emergence of antimicrobial-resistant pathogens. However, previous studies using these designs have often used suboptimal statistical methods, which may result in researchers making spurious conclusions. Methods used to analyze quasi-experimental data include 2-group tests, regression analysis, and time-series analysis, and they all have specific assumptions, data requirements, strengths, and limitations. An example of a hospital-based intervention to reduce methicillin-resistant Staphylococcus aureus infection rates and reduce overall length of stay is used to explore these methods.

  7. Research Update: Spatially resolved mapping of electronic structure on atomic level by multivariate statistical analysis

    NASA Astrophysics Data System (ADS)

    Belianinov, Alex; Ganesh, Panchapakesan; Lin, Wenzhi; Sales, Brian C.; Sefat, Athena S.; Jesse, Stephen; Pan, Minghu; Kalinin, Sergei V.

    2014-12-01

    Atomic level spatial variability of electronic structure in Fe-based superconductor FeTe0.55Se0.45 (Tc = 15 K) is explored using current-imaging tunneling-spectroscopy. Multivariate statistical analysis of the data differentiates regions of dissimilar electronic behavior that can be identified with the segregation of chalcogen atoms, as well as boundaries between terminations and near neighbor interactions. Subsequent clustering analysis allows identification of the spatial localization of these dissimilar regions. Similar statistical analysis of modeled calculated density of states of chemically inhomogeneous FeTe1-xSex structures further confirms that the two types of chalcogens, i.e., Te and Se, can be identified by their electronic signature and differentiated by their local chemical environment. This approach allows detailed chemical discrimination of the scanning tunneling microscopy data including separation of atomic identities, proximity, and local configuration effects and can be universally applicable to chemically and electronically inhomogeneous surfaces.

  8. The potential of statistical shape modelling for geometric morphometric analysis of human teeth in archaeological research

    PubMed Central

    Fernee, Christianne; Browne, Martin; Zakrzewski, Sonia

    2017-01-01

    This paper introduces statistical shape modelling (SSM) for use in osteoarchaeology research. SSM is a full field, multi-material analytical technique, and is presented as a supplementary geometric morphometric (GM) tool. Lower mandibular canines from two archaeological populations and one modern population were sampled, digitised using micro-CT, aligned, registered to a baseline and statistically modelled using principal component analysis (PCA). Sample material properties were incorporated as a binary enamel/dentin parameter. Results were assessed qualitatively and quantitatively using anatomical landmarks. Finally, the technique’s application was demonstrated for inter-sample comparison through analysis of the principal component (PC) weights. It was found that SSM could provide high detail qualitative and quantitative insight with respect to archaeological inter- and intra-sample variability. This technique has value for archaeological, biomechanical and forensic applications including identification, finite element analysis (FEA) and reconstruction from partial datasets. PMID:29216199

  9. Multivariate analysis: A statistical approach for computations

    NASA Astrophysics Data System (ADS)

    Michu, Sachin; Kaushik, Vandana

    2014-10-01

    Multivariate analysis is a type of multivariate statistical approach commonly used in, automotive diagnosis, education evaluating clusters in finance etc and more recently in the health-related professions. The objective of the paper is to provide a detailed exploratory discussion about factor analysis (FA) in image retrieval method and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database, based on their content. The problem is made more difficult due to the high dimension of the variable space in which the images are represented. Multivariate correlation analysis proposes an anomaly detection and analysis method based on the correlation coefficient matrix. Anomaly behaviors in the network include the various attacks on the network like DDOs attacks and network scanning.

  10. Measuring the statistical validity of summary meta‐analysis and meta‐regression results for use in clinical practice

    PubMed Central

    Riley, Richard D.

    2017-01-01

    An important question for clinicians appraising a meta‐analysis is: are the findings likely to be valid in their own practice—does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity—where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple (‘leave‐one‐out’) cross‐validation technique, we demonstrate how we may test meta‐analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta‐analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta‐analysis and a tailored meta‐regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within‐study variance, between‐study variance, study sample size, and the number of studies in the meta‐analysis. Finally, we apply Vn to two published meta‐analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta‐analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28620945

  11. Monitoring and Evaluation: Statistical Support for Life-cycle Studies, Annual Report 2003.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Skalski, John

    2003-11-01

    The ongoing mission of this project is the development of statistical tools for analyzing fisheries tagging data in the most precise and appropriate manner possible. This mission also includes providing statistical guidance on the best ways to design large-scale tagging studies. This mission continues because the technologies for conducting fish tagging studies continuously evolve. In just the last decade, fisheries biologists have seen the evolution from freeze-brands and coded wire tags (CWT) to passive integrated transponder (PIT) tags, balloon-tags, radiotelemetry, and now, acoustic-tags. With each advance, the technology holds the promise of more detailed and precise information. However, the technologymore » for analyzing and interpreting the data also becomes more complex as the tagging techniques become more sophisticated. The goal of the project is to develop the analytical tools in parallel with the technical advances in tagging studies, so that maximum information can be extracted on a timely basis. Associated with this mission is the transfer of these analytical capabilities to the field investigators to assure consistency and the highest levels of design and analysis throughout the fisheries community. Consequently, this project provides detailed technical assistance on the design and analysis of tagging studies to groups requesting assistance throughout the fisheries community. Ideally, each project and each investigator would invest in the statistical support needed for the successful completion of their study. However, this is an ideal that is rarely if every attained. Furthermore, there is only a small pool of highly trained scientists in this specialized area of tag analysis here in the Northwest. Project 198910700 provides the financial support to sustain this local expertise on the statistical theory of tag analysis at the University of Washington and make it available to the fisheries community. Piecemeal and fragmented support from various agencies and organizations would be incapable of maintaining a center of expertise. The mission of the project is to help assure tagging studies are designed and analyzed from the onset to extract the best available information using state-of-the-art statistical methods. The overarching goals of the project is to assure statistically sound survival studies so that fish managers can focus on the management implications of their findings and not be distracted by concerns whether the studies are statistically reliable or not. Specific goals and objectives of the study include the following: (1) Provide consistent application of statistical methodologies for survival estimation across all salmon life cycle stages to assure comparable performance measures and assessment of results through time, to maximize learning and adaptive management opportunities, and to improve and maintain the ability to responsibly evaluate the success of implemented Columbia River FWP salmonid mitigation programs and identify future mitigation options. (2) Improve analytical capabilities to conduct research on survival processes of wild and hatchery chinook and steelhead during smolt outmigration, to improve monitoring and evaluation capabilities and assist in-season river management to optimize operational and fish passage strategies to maximize survival. (3) Extend statistical support to estimate ocean survival and in-river survival of returning adults. Provide statistical guidance in implementing a river-wide adult PIT-tag detection capability. (4) Develop statistical methods for survival estimation for all potential users and make this information available through peer-reviewed publications, statistical software, and technology transfers to organizations such as NOAA Fisheries, the Fish Passage Center, US Fish and Wildlife Service, US Geological Survey (USGS), US Army Corps of Engineers (USACE), Public Utility Districts (PUDs), the Independent Scientific Advisory Board (ISAB), and other members of the Northwest fisheries community. (5) Provide and maintain statistical software for tag analysis and user support. (6) Provide improvements in statistical theory and software as requested by user groups. These improvements include extending software capabilities to address new research issues, adapting tagging techniques to new study designs, and extending the analysis capabilities to new technologies such as radio-tags and acoustic-tags.« less

  12. Graphical augmentations to the funnel plot assess the impact of additional evidence on a meta-analysis.

    PubMed

    Langan, Dean; Higgins, Julian P T; Gregory, Walter; Sutton, Alexander J

    2012-05-01

    We aim to illustrate the potential impact of a new study on a meta-analysis, which gives an indication of the robustness of the meta-analysis. A number of augmentations are proposed to one of the most widely used of graphical displays, the funnel plot. Namely, 1) statistical significance contours, which define regions of the funnel plot in which a new study would have to be located to change the statistical significance of the meta-analysis; and 2) heterogeneity contours, which show how a new study would affect the extent of heterogeneity in a given meta-analysis. Several other features are also described, and the use of multiple features simultaneously is considered. The statistical significance contours suggest that one additional study, no matter how large, may have a very limited impact on the statistical significance of a meta-analysis. The heterogeneity contours illustrate that one outlying study can increase the level of heterogeneity dramatically. The additional features of the funnel plot have applications including 1) informing sample size calculations for the design of future studies eligible for inclusion in the meta-analysis; and 2) informing the updating prioritization of a portfolio of meta-analyses such as those prepared by the Cochrane Collaboration. Copyright © 2012 Elsevier Inc. All rights reserved.

  13. Utilization of an Enhanced Canonical Correlation Analysis (ECCA) to Predict Daily Precipitation and Temperature in a Semi-Arid Environment

    NASA Astrophysics Data System (ADS)

    Lopez, S. R.; Hogue, T. S.

    2011-12-01

    Global climate models (GCMs) are primarily used to generate historical and future large-scale circulation patterns at a coarse resolution (typical order of 50,000 km2) and fail to capture climate variability at the ground level due to localized surface influences (i.e topography, marine, layer, land cover, etc). Their inability to accurately resolve these processes has led to the development of numerous 'downscaling' techniques. The goal of this study is to enhance statistical downscaling of daily precipitation and temperature for regions with heterogeneous land cover and topography. Our analysis was divided into two periods, historical (1961-2000) and contemporary (1980-2000), and tested using sixteen predictand combinations from four GCMs (GFDL CM2.0, GFDL CM2.1, CNRM-CM3 and MRI-CGCM2 3.2a. The Southern California area was separated into five county regions: Santa Barbara, Ventura, Los Angeles, Orange and San Diego. Principle component analysis (PCA) was performed on ground-based observations in order to (1) reduce the number of redundant gauges and minimize dimensionality and (2) cluster gauges that behave statistically similarly for post-analysis. Post-PCA analysis included extensive testing of predictor-predictand relationships using an enhanced canonical correlation analysis (ECCA). The ECCA includes obtaining the optimal predictand sets for all models within each spatial domain (county) as governed by daily and monthly overall statistics. Results show all models maintain mean annual and monthly behavior within each county and daily statistics are improved. The level of improvement highly depends on the vegetation extent within each county and the land-to-ocean ratio within the GCM spatial grid. The utilization of the entire historical period also leads to better statistical representation of observed daily precipitation. The validated ECCA technique is being applied to future climate scenarios distributed by the IPCC in order to provide forcing data for regional hydrologic models and assess future water resources in the Southern California region.

  14. Statistical methods to estimate treatment effects from multichannel electroencephalography (EEG) data in clinical trials.

    PubMed

    Ma, Junshui; Wang, Shubing; Raubertas, Richard; Svetnik, Vladimir

    2010-07-15

    With the increasing popularity of using electroencephalography (EEG) to reveal the treatment effect in drug development clinical trials, the vast volume and complex nature of EEG data compose an intriguing, but challenging, topic. In this paper the statistical analysis methods recommended by the EEG community, along with methods frequently used in the published literature, are first reviewed. A straightforward adjustment of the existing methods to handle multichannel EEG data is then introduced. In addition, based on the spatial smoothness property of EEG data, a new category of statistical methods is proposed. The new methods use a linear combination of low-degree spherical harmonic (SPHARM) basis functions to represent a spatially smoothed version of the EEG data on the scalp, which is close to a sphere in shape. In total, seven statistical methods, including both the existing and the newly proposed methods, are applied to two clinical datasets to compare their power to detect a drug effect. Contrary to the EEG community's recommendation, our results suggest that (1) the nonparametric method does not outperform its parametric counterpart; and (2) including baseline data in the analysis does not always improve the statistical power. In addition, our results recommend that (3) simple paired statistical tests should be avoided due to their poor power; and (4) the proposed spatially smoothed methods perform better than their unsmoothed versions. Copyright 2010 Elsevier B.V. All rights reserved.

  15. RepExplore: addressing technical replicate variance in proteomics and metabolomics data analysis.

    PubMed

    Glaab, Enrico; Schneider, Reinhard

    2015-07-01

    High-throughput omics datasets often contain technical replicates included to account for technical sources of noise in the measurement process. Although summarizing these replicate measurements by using robust averages may help to reduce the influence of noise on downstream data analysis, the information on the variance across the replicate measurements is lost in the averaging process and therefore typically disregarded in subsequent statistical analyses.We introduce RepExplore, a web-service dedicated to exploit the information captured in the technical replicate variance to provide more reliable and informative differential expression and abundance statistics for omics datasets. The software builds on previously published statistical methods, which have been applied successfully to biomedical omics data but are difficult to use without prior experience in programming or scripting. RepExplore facilitates the analysis by providing a fully automated data processing and interactive ranking tables, whisker plot, heat map and principal component analysis visualizations to interpret omics data and derived statistics. Freely available at http://www.repexplore.tk enrico.glaab@uni.lu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  16. Uncertainty Analysis of Inertial Model Attitude Sensor Calibration and Application with a Recommended New Calibration Method

    NASA Technical Reports Server (NTRS)

    Tripp, John S.; Tcheng, Ping

    1999-01-01

    Statistical tools, previously developed for nonlinear least-squares estimation of multivariate sensor calibration parameters and the associated calibration uncertainty analysis, have been applied to single- and multiple-axis inertial model attitude sensors used in wind tunnel testing to measure angle of attack and roll angle. The analysis provides confidence and prediction intervals of calibrated sensor measurement uncertainty as functions of applied input pitch and roll angles. A comparative performance study of various experimental designs for inertial sensor calibration is presented along with corroborating experimental data. The importance of replicated calibrations over extended time periods has been emphasized; replication provides independent estimates of calibration precision and bias uncertainties, statistical tests for calibration or modeling bias uncertainty, and statistical tests for sensor parameter drift over time. A set of recommendations for a new standardized model attitude sensor calibration method and usage procedures is included. The statistical information provided by these procedures is necessary for the uncertainty analysis of aerospace test results now required by users of industrial wind tunnel test facilities.

  17. Statistical considerations in the development of injury risk functions.

    PubMed

    McMurry, Timothy L; Poplin, Gerald S

    2015-01-01

    We address 4 frequently misunderstood and important statistical ideas in the construction of injury risk functions. These include the similarities of survival analysis and logistic regression, the correct scale on which to construct pointwise confidence intervals for injury risk, the ability to discern which form of injury risk function is optimal, and the handling of repeated tests on the same subject. The statistical models are explored through simulation and examination of the underlying mathematics. We provide recommendations for the statistically valid construction and correct interpretation of single-predictor injury risk functions. This article aims to provide useful and understandable statistical guidance to improve the practice in constructing injury risk functions.

  18. Psychometric Analysis of Role Conflict and Ambiguity Scales in Academia

    ERIC Educational Resources Information Center

    Khan, Anwar; Yusoff, Rosman Bin Md.; Khan, Muhammad Muddassar; Yasir, Muhammad; Khan, Faisal

    2014-01-01

    A comprehensive Psychometric Analysis of Rizzo et al.'s (1970) Role Conflict & Ambiguity (RCA) scales were performed after its distribution among 600 academic staff working in six universities of Pakistan. The reliability analysis includes calculation of Cronbach Alpha Coefficients and Inter-Items statistics, whereas validity was determined by…

  19. Meta-Analysis in Stata Using Gllamm

    ERIC Educational Resources Information Center

    Bagos, Pantelis G.

    2015-01-01

    There are several user-written programs for performing meta-analysis in Stata (Stata Statistical Software: College Station, TX: Stata Corp LP). These include metan, metareg, mvmeta, and glst. However, there are several cases for which these programs do not suffice. For instance, there is no software for performing univariate meta-analysis with…

  20. Genome-wide scans of genetic variants for psychophysiological endophenotypes: a methodological overview.

    PubMed

    Iacono, William G; Malone, Stephen M; Vaidyanathan, Uma; Vrieze, Scott I

    2014-12-01

    This article provides an introductory overview of the investigative strategy employed to evaluate the genetic basis of 17 endophenotypes examined as part of a 20-year data collection effort from the Minnesota Center for Twin and Family Research. Included are characterization of the study samples, descriptive statistics for key properties of the psychophysiological measures, and rationale behind the steps taken in the molecular genetic study design. The statistical approach included (a) biometric analysis of twin and family data, (b) heritability analysis using 527,829 single nucleotide polymorphisms (SNPs), (c) genome-wide association analysis of these SNPs and 17,601 autosomal genes, (d) follow-up analyses of candidate SNPs and genes hypothesized to have an association with each endophenotype, (e) rare variant analysis of nonsynonymous SNPs in the exome, and (f) whole genome sequencing association analysis using 27 million genetic variants. These methods were used in the accompanying empirical articles comprising this special issue, Genome-Wide Scans of Genetic Variants for Psychophysiological Endophenotypes. Copyright © 2014 Society for Psychophysiological Research.

  1. Statistical process control methods allow the analysis and improvement of anesthesia care.

    PubMed

    Fasting, Sigurd; Gisvold, Sven E

    2003-10-01

    Quality aspects of the anesthetic process are reflected in the rate of intraoperative adverse events. The purpose of this report is to illustrate how the quality of the anesthesia process can be analyzed using statistical process control methods, and exemplify how this analysis can be used for quality improvement. We prospectively recorded anesthesia-related data from all anesthetics for five years. The data included intraoperative adverse events, which were graded into four levels, according to severity. We selected four adverse events, representing important quality and safety aspects, for statistical process control analysis. These were: inadequate regional anesthesia, difficult emergence from general anesthesia, intubation difficulties and drug errors. We analyzed the underlying process using 'p-charts' for statistical process control. In 65,170 anesthetics we recorded adverse events in 18.3%; mostly of lesser severity. Control charts were used to define statistically the predictable normal variation in problem rate, and then used as a basis for analysis of the selected problems with the following results: Inadequate plexus anesthesia: stable process, but unacceptably high failure rate; Difficult emergence: unstable process, because of quality improvement efforts; Intubation difficulties: stable process, rate acceptable; Medication errors: methodology not suited because of low rate of errors. By applying statistical process control methods to the analysis of adverse events, we have exemplified how this allows us to determine if a process is stable, whether an intervention is required, and if quality improvement efforts have the desired effect.

  2. Assessment of statistical methods used in library-based approaches to microbial source tracking.

    PubMed

    Ritter, Kerry J; Carruthers, Ethan; Carson, C Andrew; Ellender, R D; Harwood, Valerie J; Kingsley, Kyle; Nakatsu, Cindy; Sadowsky, Michael; Shear, Brian; West, Brian; Whitlock, John E; Wiggins, Bruce A; Wilbur, Jayson D

    2003-12-01

    Several commonly used statistical methods for fingerprint identification in microbial source tracking (MST) were examined to assess the effectiveness of pattern-matching algorithms to correctly identify sources. Although numerous statistical methods have been employed for source identification, no widespread consensus exists as to which is most appropriate. A large-scale comparison of several MST methods, using identical fecal sources, presented a unique opportunity to assess the utility of several popular statistical methods. These included discriminant analysis, nearest neighbour analysis, maximum similarity and average similarity, along with several measures of distance or similarity. Threshold criteria for excluding uncertain or poorly matched isolates from final analysis were also examined for their ability to reduce false positives and increase prediction success. Six independent libraries used in the study were constructed from indicator bacteria isolated from fecal materials of humans, seagulls, cows and dogs. Three of these libraries were constructed using the rep-PCR technique and three relied on antibiotic resistance analysis (ARA). Five of the libraries were constructed using Escherichia coli and one using Enterococcus spp. (ARA). Overall, the outcome of this study suggests a high degree of variability across statistical methods. Despite large differences in correct classification rates among the statistical methods, no single statistical approach emerged as superior. Thresholds failed to consistently increase rates of correct classification and improvement was often associated with substantial effective sample size reduction. Recommendations are provided to aid in selecting appropriate analyses for these types of data.

  3. Model-Based Linkage Analysis of a Quantitative Trait.

    PubMed

    Song, Yeunjoo E; Song, Sunah; Schnell, Audrey H

    2017-01-01

    Linkage Analysis is a family-based method of analysis to examine whether any typed genetic markers cosegregate with a given trait, in this case a quantitative trait. If linkage exists, this is taken as evidence in support of a genetic basis for the trait. Historically, linkage analysis was performed using a binary disease trait, but has been extended to include quantitative disease measures. Quantitative traits are desirable as they provide more information than binary traits. Linkage analysis can be performed using single-marker methods (one marker at a time) or multipoint (using multiple markers simultaneously). In model-based linkage analysis the genetic model for the trait of interest is specified. There are many software options for performing linkage analysis. Here, we use the program package Statistical Analysis for Genetic Epidemiology (S.A.G.E.). S.A.G.E. was chosen because it also includes programs to perform data cleaning procedures and to generate and test genetic models for a quantitative trait, in addition to performing linkage analysis. We demonstrate in detail the process of running the program LODLINK to perform single-marker analysis, and MLOD to perform multipoint analysis using output from SEGREG, where SEGREG was used to determine the best fitting statistical model for the trait.

  4. Statistical Analysis of CFD Solutions from the Fourth AIAA Drag Prediction Workshop

    NASA Technical Reports Server (NTRS)

    Morrison, Joseph H.

    2010-01-01

    A graphical framework is used for statistical analysis of the results from an extensive N-version test of a collection of Reynolds-averaged Navier-Stokes computational fluid dynamics codes. The solutions were obtained by code developers and users from the U.S., Europe, Asia, and Russia using a variety of grid systems and turbulence models for the June 2009 4th Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration for this workshop was a new subsonic transport model, the Common Research Model, designed using a modern approach for the wing and included a horizontal tail. The fourth workshop focused on the prediction of both absolute and incremental drag levels for wing-body and wing-body-horizontal tail configurations. This work continues the statistical analysis begun in the earlier workshops and compares the results from the grid convergence study of the most recent workshop with earlier workshops using the statistical framework.

  5. Classification and statistical analysis of mine spoils chemical composition from Oliete basin (Teruel, NE Spain)

    NASA Astrophysics Data System (ADS)

    Meseguer, S.; Sanfeliu, T.; Jordán, M. M.

    2009-02-01

    The Oliete basin (Early Cretaceous, NE Teruel, Spain) is one of the most important areas for the supply of mine spoils used as ball clays for the production of white and red stoneware in the Spanish ceramic industry of wall and floor tiles. This study corresponds to the second part of the paper published recently by Meseguer et al. (Environ Geol 2008) about the use of mine spoils from Teruel coal mining district. The present study shows a statistical data analysis from chemical data (major, minor and trace elements). The performed statistical analysis of chemical data included descriptive statistics and cluster analysis (with ANOVA and Scheffé methods). The cluster analysis of chemical data provided three main groups: C3 with the highest mean SiO2 content (66%) and lowest mean Al2O3 content (20%); C2 with lower SiO2 content (48%) and higher mean Al2O3 content (28%); and C1 with medium values for the SiO2 and Al2O3 mean content. The main applications of these materials are refractory, white and red ceramics, stoneware, heavy ceramics (including red earthenware, bricks and roof tiles), and components of white Portland cement and aluminous cement. Clays from group 2 are used in refractories (with higher kaolinite content, and constrictions to CaO + MgO and K2O + Na2O contents). All materials can be used in fine ceramics (white or red, according to the Fe2O3 + TiO2 content).

  6. Cognition, comprehension and application of biostatistics in research by Indian postgraduate students in periodontics

    PubMed Central

    Swetha, Jonnalagadda Laxmi; Arpita, Ramisetti; Srikanth, Chintalapani; Nutalapati, Rajasekhar

    2014-01-01

    Background: Biostatistics is an integral part of research protocols. In any field of inquiry or investigation, data obtained is subsequently classified, analyzed and tested for accuracy by statistical methods. Statistical analysis of collected data, thus, forms the basis for all evidence-based conclusions. Aim: The aim of this study is to evaluate the cognition, comprehension and application of biostatistics in research among post graduate students in Periodontics, in India. Materials and Methods: A total of 391 post graduate students registered for a master's course in periodontics at various dental colleges across India were included in the survey. Data regarding the level of knowledge, understanding and its application in design and conduct of the research protocol was collected using a dichotomous questionnaire. A descriptive statistics was used for data analysis. Results: Nearly 79.2% students were aware of the importance of biostatistics in research, 55-65% were familiar with MS-EXCEL spreadsheet for graphical representation of data and with the statistical softwares available on the internet, 26.0% had biostatistics as mandatory subject in their curriculum, 9.5% tried to perform statistical analysis on their own while 3.0% were successful in performing statistical analysis of their studies on their own. Conclusion: Biostatistics should play a central role in planning, conduct, interim analysis, final analysis and reporting of periodontal research especially by the postgraduate students. Indian postgraduate students in periodontics are aware of the importance of biostatistics in research but the level of understanding and application is still basic and needs to be addressed. PMID:24744547

  7. Correlation of RNA secondary structure statistics with thermodynamic stability and applications to folding.

    PubMed

    Wu, Johnny C; Gardner, David P; Ozer, Stuart; Gutell, Robin R; Ren, Pengyu

    2009-08-28

    The accurate prediction of the secondary and tertiary structure of an RNA with different folding algorithms is dependent on several factors, including the energy functions. However, an RNA higher-order structure cannot be predicted accurately from its sequence based on a limited set of energy parameters. The inter- and intramolecular forces between this RNA and other small molecules and macromolecules, in addition to other factors in the cell such as pH, ionic strength, and temperature, influence the complex dynamics associated with transition of a single stranded RNA to its secondary and tertiary structure. Since all of the factors that affect the formation of an RNAs 3D structure cannot be determined experimentally, statistically derived potential energy has been used in the prediction of protein structure. In the current work, we evaluate the statistical free energy of various secondary structure motifs, including base-pair stacks, hairpin loops, and internal loops, using their statistical frequency obtained from the comparative analysis of more than 50,000 RNA sequences stored in the RNA Comparative Analysis Database (rCAD) at the Comparative RNA Web (CRW) Site. Statistical energy was computed from the structural statistics for several datasets. While the statistical energy for a base-pair stack correlates with experimentally derived free energy values, suggesting a Boltzmann-like distribution, variation is observed between different molecules and their location on the phylogenetic tree of life. Our statistical energy values calculated for several structural elements were utilized in the Mfold RNA-folding algorithm. The combined statistical energy values for base-pair stacks, hairpins and internal loop flanks result in a significant improvement in the accuracy of secondary structure prediction; the hairpin flanks contribute the most.

  8. Post-operative diffusion weighted imaging as a predictor of posterior fossa syndrome permanence in paediatric medulloblastoma.

    PubMed

    Chua, Felicia H Z; Thien, Ady; Ng, Lee Ping; Seow, Wan Tew; Low, David C Y; Chang, Kenneth T E; Lian, Derrick W Q; Loh, Eva; Low, Sharon Y Y

    2017-03-01

    Posterior fossa syndrome (PFS) is a serious complication faced by neurosurgeons and their patients, especially in paediatric medulloblastoma patients. The uncertain aetiology of PFS, myriad of cited risk factors and therapeutic challenges make this phenomenon an elusive entity. The primary objective of this study was to identify associative factors related to the development of PFS in medulloblastoma patient post-tumour resection. This is a retrospective study based at a single institution. Patient data and all related information were collected from the hospital records, in accordance to a list of possible risk factors associated with PFS. These included pre-operative tumour volume, hydrocephalus, age, gender, extent of resection, metastasis, ventriculoperitoneal shunt insertion, post-operative meningitis and radiological changes in MRI. Additional variables included molecular and histological subtypes of each patient's medulloblastoma tumour. Statistical analysis was employed to determine evidence of each variable's significance in PFS permanence. A total of 19 patients with appropriately complete data was identified. Initial univariate analysis did not show any statistical significance. However, multivariate analysis for MRI-specific changes reported bilateral DWI restricted diffusion changes involving both right and left sides of the surgical cavity was of statistical significance for PFS permanence. The authors performed a clinical study that evaluated possible risk factors for permanent PFS in paediatric medulloblastoma patients. Analysis of collated results found that post-operative DWI restriction in bilateral regions within the surgical cavity demonstrated statistical significance as a predictor of PFS permanence-a novel finding in the current literature.

  9. Analysis of traffic accident data in Kentucky (1994-1998)

    DOT National Transportation Integrated Search

    1999-09-01

    This report includes an analysis of traffic accident data in Kentucky for the years of 1994 through 1998. A primary objective of this study was to determine average accident statistics for Kentucky highways. Average and critical numbers and rates of ...

  10. Analysis of traffic crash data in Kentucky : 2002-2006.

    DOT National Transportation Integrated Search

    2007-09-01

    This report includes an analysis of traffic accident data in Kentucky for the years of 2002 through 2006. A primary objective of this study was to determine average accident statistics for Kentucky highways. Average and critical numbers and rates of ...

  11. Analysis of traffic crash data in Kentucky : 2003-2007.

    DOT National Transportation Integrated Search

    2008-08-01

    This report includes an analysis of traffic accident data in Kentucky for the years of 2003 through 2007. A primary objective of this study was to determine average accident statistics for Kentucky highways. Average and critical numbers and rates of ...

  12. Surgical Treatment for Discogenic Low-Back Pain: Lumbar Arthroplasty Results in Superior Pain Reduction and Disability Level Improvement Compared With Lumbar Fusion

    PubMed Central

    2007-01-01

    Background The US Food and Drug Administration approved the Charité artificial disc on October 26, 2004. This approval was based on an extensive analysis and review process; 20 years of disc usage worldwide; and the results of a prospective, randomized, controlled clinical trial that compared lumbar artificial disc replacement to fusion. The results of the investigational device exemption (IDE) study led to a conclusion that clinical outcomes following lumbar arthroplasty were at least as good as outcomes from fusion. Methods The author performed a new analysis of the Visual Analog Scale pain scores and the Oswestry Disability Index scores from the Charité artificial disc IDE study and used a nonparametric statistical test, because observed data distributions were not normal. The analysis included all of the enrolled subjects in both the nonrandomized and randomized phases of the study. Results Subjects from both the treatment and control groups improved from the baseline situation (P < .001) at all follow-up times (6 weeks to 24 months). Additionally, these pain and disability levels with artificial disc replacement were superior (P < .05) to the fusion treatment at all follow-up times including 2 years. Conclusions The a priori statistical plan for an IDE study may not adequately address the final distribution of the data. Therefore, statistical analyses more appropriate to the distribution may be necessary to develop meaningful statistical conclusions from the study. A nonparametric statistical analysis of the Charité artificial disc IDE outcomes scores demonstrates superiority for lumbar arthroplasty versus fusion at all follow-up time points to 24 months. PMID:25802574

  13. Multivariate Meta-Analysis of Heterogeneous Studies Using Only Summary Statistics: Efficiency and Robustness

    PubMed Central

    Liu, Dungang; Liu, Regina; Xie, Minge

    2014-01-01

    Meta-analysis has been widely used to synthesize evidence from multiple studies for common hypotheses or parameters of interest. However, it has not yet been fully developed for incorporating heterogeneous studies, which arise often in applications due to different study designs, populations or outcomes. For heterogeneous studies, the parameter of interest may not be estimable for certain studies, and in such a case, these studies are typically excluded from conventional meta-analysis. The exclusion of part of the studies can lead to a non-negligible loss of information. This paper introduces a metaanalysis for heterogeneous studies by combining the confidence density functions derived from the summary statistics of individual studies, hence referred to as the CD approach. It includes all the studies in the analysis and makes use of all information, direct as well as indirect. Under a general likelihood inference framework, this new approach is shown to have several desirable properties, including: i) it is asymptotically as efficient as the maximum likelihood approach using individual participant data (IPD) from all studies; ii) unlike the IPD analysis, it suffices to use summary statistics to carry out the CD approach. Individual-level data are not required; and iii) it is robust against misspecification of the working covariance structure of the parameter estimates. Besides its own theoretical significance, the last property also substantially broadens the applicability of the CD approach. All the properties of the CD approach are further confirmed by data simulated from a randomized clinical trials setting as well as by real data on aircraft landing performance. Overall, one obtains an unifying approach for combining summary statistics, subsuming many of the existing meta-analysis methods as special cases. PMID:26190875

  14. Precipitate statistics in an Al-Mg-Si-Cu alloy from scanning precession electron diffraction data

    NASA Astrophysics Data System (ADS)

    Sunde, J. K.; Paulsen, Ø.; Wenner, S.; Holmestad, R.

    2017-09-01

    The key microstructural feature providing strength to age-hardenable Al alloys is nanoscale precipitates. Alloy development requires a reliable statistical assessment of these precipitates, in order to link the microstructure with material properties. Here, it is demonstrated that scanning precession electron diffraction combined with computational analysis enable the semi-automated extraction of precipitate statistics in an Al-Mg-Si-Cu alloy. Among the main findings is the precipitate number density, which agrees well with a conventional method based on manual counting and measurements. By virtue of its data analysis objectivity, our methodology is therefore seen as an advantageous alternative to existing routines, offering reproducibility and efficiency in alloy statistics. Additional results include improved qualitative information on phase distributions. The developed procedure is generic and applicable to any material containing nanoscale precipitates.

  15. Biomedical applications of aerospace technology

    NASA Technical Reports Server (NTRS)

    Castles, T. R.

    1971-01-01

    Aerospace technology transfer to biomedical research problems is discussed, including transfer innovations and potential applications. Statistical analysis of the transfer activities and impact is also presented.

  16. New software for statistical analysis of Cambridge Structural Database data

    PubMed Central

    Sykes, Richard A.; McCabe, Patrick; Allen, Frank H.; Battle, Gary M.; Bruno, Ian J.; Wood, Peter A.

    2011-01-01

    A collection of new software tools is presented for the analysis of geometrical, chemical and crystallographic data from the Cambridge Structural Database (CSD). This software supersedes the program Vista. The new functionality is integrated into the program Mercury in order to provide statistical, charting and plotting options alongside three-dimensional structural visualization and analysis. The integration also permits immediate access to other information about specific CSD entries through the Mercury framework, a common requirement in CSD data analyses. In addition, the new software includes a range of more advanced features focused towards structural analysis such as principal components analysis, cone-angle correction in hydrogen-bond analyses and the ability to deal with topological symmetry that may be exhibited in molecular search fragments. PMID:22477784

  17. Statistics Clinic

    NASA Technical Reports Server (NTRS)

    Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James

    2014-01-01

    Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.

  18. The Deployment Life Study: Longitudinal Analysis of Military Families Across the Deployment Cycle

    DTIC Science & Technology

    2016-01-01

    psychological and physical aggression than they reported prior to the deployment. 1 H. Fischer, A Guide to U.S. Military Casualty Statistics ...analyses include a large number of statistical tests and thus the results pre- sented in this report should be viewed in terms of patterns, rather...Military Children and Families,” The Future of Children, Vol. 23, No. 2, 2013, pp. 13–39. Fischer, H., A Guide to U.S. Military Casualty Statistics

  19. Trend analysis and selected summary statistics of annual mean streamflow for 38 selected long-term U.S. Geological Survey streamgages in Texas, water years 1916-2012

    USGS Publications Warehouse

    Asquith, William H.; Barbie, Dana L.

    2014-01-01

    Selected summary statistics (L-moments) and estimates of respective sampling variances were computed for the 35 streamgages lacking statistically significant trends. From the L-moments and estimated sampling variances, weighted means or regional values were computed for each L-moment. An example application is included demonstrating how the L-moments could be used to evaluate the magnitude and frequency of annual mean streamflow.

  20. A review of empirical research related to the use of small quantitative samples in clinical outcome scale development.

    PubMed

    Houts, Carrie R; Edwards, Michael C; Wirth, R J; Deal, Linda S

    2016-11-01

    There has been a notable increase in the advocacy of using small-sample designs as an initial quantitative assessment of item and scale performance during the scale development process. This is particularly true in the development of clinical outcome assessments (COAs), where Rasch analysis has been advanced as an appropriate statistical tool for evaluating the developing COAs using a small sample. We review the benefits such methods are purported to offer from both a practical and statistical standpoint and detail several problematic areas, including both practical and statistical theory concerns, with respect to the use of quantitative methods, including Rasch-consistent methods, with small samples. The feasibility of obtaining accurate information and the potential negative impacts of misusing large-sample statistical methods with small samples during COA development are discussed.

  1. Statistical analysis and interpolation of compositional data in materials science.

    PubMed

    Pesenson, Misha Z; Suram, Santosh K; Gregoire, John M

    2015-02-09

    Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.

  2. Using SPSS to Analyze Book Collection Data.

    ERIC Educational Resources Information Center

    Townley, Charles T.

    1981-01-01

    Describes and illustrates Statistical Package for the Social Sciences (SPSS) procedures appropriate for book collection data analysis. Several different procedures for univariate, bivariate, and multivariate analysis are discussed, and applications of procedures for book collection studies are presented. Included are 24 tables illustrating output…

  3. Analysis of traffic accident data in Kentucky (1986-1990)

    DOT National Transportation Integrated Search

    1991-09-01

    This report includes an analysis of traffic accident data in Kentucky for the years of 1986-1990. A primary objectve of this study was to determine average statistics for kentucky highways. Average and critical number and rates of accidents were calc...

  4. Analysis of traffic crash data in Kentucky : 2000-2004.

    DOT National Transportation Integrated Search

    2005-08-01

    This report includes an analysis of traffic accident data in Kentucky for the years of 2000 through 2004. A primary objective of this study was to determine average crash statistics for Kentucky highways. Average and critical numbers and rates of cra...

  5. Analysis of traffic crash data in Kentucky : 2001-2005.

    DOT National Transportation Integrated Search

    2006-08-01

    This report includes an analysis of traffic accident data in Kentucky for the years of 2001 through 2005. A primary objective of this study was to determine average crash statistics for Kentucky highways. Average and critical rates of crashes were ca...

  6. Guidelines for Genome-Scale Analysis of Biological Rhythms.

    PubMed

    Hughes, Michael E; Abruzzi, Katherine C; Allada, Ravi; Anafi, Ron; Arpat, Alaaddin Bulak; Asher, Gad; Baldi, Pierre; de Bekker, Charissa; Bell-Pedersen, Deborah; Blau, Justin; Brown, Steve; Ceriani, M Fernanda; Chen, Zheng; Chiu, Joanna C; Cox, Juergen; Crowell, Alexander M; DeBruyne, Jason P; Dijk, Derk-Jan; DiTacchio, Luciano; Doyle, Francis J; Duffield, Giles E; Dunlap, Jay C; Eckel-Mahan, Kristin; Esser, Karyn A; FitzGerald, Garret A; Forger, Daniel B; Francey, Lauren J; Fu, Ying-Hui; Gachon, Frédéric; Gatfield, David; de Goede, Paul; Golden, Susan S; Green, Carla; Harer, John; Harmer, Stacey; Haspel, Jeff; Hastings, Michael H; Herzel, Hanspeter; Herzog, Erik D; Hoffmann, Christy; Hong, Christian; Hughey, Jacob J; Hurley, Jennifer M; de la Iglesia, Horacio O; Johnson, Carl; Kay, Steve A; Koike, Nobuya; Kornacker, Karl; Kramer, Achim; Lamia, Katja; Leise, Tanya; Lewis, Scott A; Li, Jiajia; Li, Xiaodong; Liu, Andrew C; Loros, Jennifer J; Martino, Tami A; Menet, Jerome S; Merrow, Martha; Millar, Andrew J; Mockler, Todd; Naef, Felix; Nagoshi, Emi; Nitabach, Michael N; Olmedo, Maria; Nusinow, Dmitri A; Ptáček, Louis J; Rand, David; Reddy, Akhilesh B; Robles, Maria S; Roenneberg, Till; Rosbash, Michael; Ruben, Marc D; Rund, Samuel S C; Sancar, Aziz; Sassone-Corsi, Paolo; Sehgal, Amita; Sherrill-Mix, Scott; Skene, Debra J; Storch, Kai-Florian; Takahashi, Joseph S; Ueda, Hiroki R; Wang, Han; Weitz, Charles; Westermark, Pål O; Wijnen, Herman; Xu, Ying; Wu, Gang; Yoo, Seung-Hee; Young, Michael; Zhang, Eric Erquan; Zielinski, Tomasz; Hogenesch, John B

    2017-10-01

    Genome biology approaches have made enormous contributions to our understanding of biological rhythms, particularly in identifying outputs of the clock, including RNAs, proteins, and metabolites, whose abundance oscillates throughout the day. These methods hold significant promise for future discovery, particularly when combined with computational modeling. However, genome-scale experiments are costly and laborious, yielding "big data" that are conceptually and statistically difficult to analyze. There is no obvious consensus regarding design or analysis. Here we discuss the relevant technical considerations to generate reproducible, statistically sound, and broadly useful genome-scale data. Rather than suggest a set of rigid rules, we aim to codify principles by which investigators, reviewers, and readers of the primary literature can evaluate the suitability of different experimental designs for measuring different aspects of biological rhythms. We introduce CircaInSilico, a web-based application for generating synthetic genome biology data to benchmark statistical methods for studying biological rhythms. Finally, we discuss several unmet analytical needs, including applications to clinical medicine, and suggest productive avenues to address them.

  7. Guidelines for Genome-Scale Analysis of Biological Rhythms

    PubMed Central

    Hughes, Michael E.; Abruzzi, Katherine C.; Allada, Ravi; Anafi, Ron; Arpat, Alaaddin Bulak; Asher, Gad; Baldi, Pierre; de Bekker, Charissa; Bell-Pedersen, Deborah; Blau, Justin; Brown, Steve; Ceriani, M. Fernanda; Chen, Zheng; Chiu, Joanna C.; Cox, Juergen; Crowell, Alexander M.; DeBruyne, Jason P.; Dijk, Derk-Jan; DiTacchio, Luciano; Doyle, Francis J.; Duffield, Giles E.; Dunlap, Jay C.; Eckel-Mahan, Kristin; Esser, Karyn A.; FitzGerald, Garret A.; Forger, Daniel B.; Francey, Lauren J.; Fu, Ying-Hui; Gachon, Frédéric; Gatfield, David; de Goede, Paul; Golden, Susan S.; Green, Carla; Harer, John; Harmer, Stacey; Haspel, Jeff; Hastings, Michael H.; Herzel, Hanspeter; Herzog, Erik D.; Hoffmann, Christy; Hong, Christian; Hughey, Jacob J.; Hurley, Jennifer M.; de la Iglesia, Horacio O.; Johnson, Carl; Kay, Steve A.; Koike, Nobuya; Kornacker, Karl; Kramer, Achim; Lamia, Katja; Leise, Tanya; Lewis, Scott A.; Li, Jiajia; Li, Xiaodong; Liu, Andrew C.; Loros, Jennifer J.; Martino, Tami A.; Menet, Jerome S.; Merrow, Martha; Millar, Andrew J.; Mockler, Todd; Naef, Felix; Nagoshi, Emi; Nitabach, Michael N.; Olmedo, Maria; Nusinow, Dmitri A.; Ptáček, Louis J.; Rand, David; Reddy, Akhilesh B.; Robles, Maria S.; Roenneberg, Till; Rosbash, Michael; Ruben, Marc D.; Rund, Samuel S.C.; Sancar, Aziz; Sassone-Corsi, Paolo; Sehgal, Amita; Sherrill-Mix, Scott; Skene, Debra J.; Storch, Kai-Florian; Takahashi, Joseph S.; Ueda, Hiroki R.; Wang, Han; Weitz, Charles; Westermark, Pål O.; Wijnen, Herman; Xu, Ying; Wu, Gang; Yoo, Seung-Hee; Young, Michael; Zhang, Eric Erquan; Zielinski, Tomasz; Hogenesch, John B.

    2017-01-01

    Genome biology approaches have made enormous contributions to our understanding of biological rhythms, particularly in identifying outputs of the clock, including RNAs, proteins, and metabolites, whose abundance oscillates throughout the day. These methods hold significant promise for future discovery, particularly when combined with computational modeling. However, genome-scale experiments are costly and laborious, yielding “big data” that are conceptually and statistically difficult to analyze. There is no obvious consensus regarding design or analysis. Here we discuss the relevant technical considerations to generate reproducible, statistically sound, and broadly useful genome-scale data. Rather than suggest a set of rigid rules, we aim to codify principles by which investigators, reviewers, and readers of the primary literature can evaluate the suitability of different experimental designs for measuring different aspects of biological rhythms. We introduce CircaInSilico, a web-based application for generating synthetic genome biology data to benchmark statistical methods for studying biological rhythms. Finally, we discuss several unmet analytical needs, including applications to clinical medicine, and suggest productive avenues to address them. PMID:29098954

  8. medplot: a web application for dynamic summary and analysis of longitudinal medical data based on R.

    PubMed

    Ahlin, Črt; Stupica, Daša; Strle, Franc; Lusa, Lara

    2015-01-01

    In biomedical studies the patients are often evaluated numerous times and a large number of variables are recorded at each time-point. Data entry and manipulation of longitudinal data can be performed using spreadsheet programs, which usually include some data plotting and analysis capabilities and are straightforward to use, but are not designed for the analyses of complex longitudinal data. Specialized statistical software offers more flexibility and capabilities, but first time users with biomedical background often find its use difficult. We developed medplot, an interactive web application that simplifies the exploration and analysis of longitudinal data. The application can be used to summarize, visualize and analyze data by researchers that are not familiar with statistical programs and whose knowledge of statistics is limited. The summary tools produce publication-ready tables and graphs. The analysis tools include features that are seldom available in spreadsheet software, such as correction for multiple testing, repeated measurement analyses and flexible non-linear modeling of the association of the numerical variables with the outcome. medplot is freely available and open source, it has an intuitive graphical user interface (GUI), it is accessible via the Internet and can be used within a web browser, without the need for installing and maintaining programs locally on the user's computer. This paper describes the application and gives detailed examples describing how to use the application on real data from a clinical study including patients with early Lyme borreliosis.

  9. Gene Level Meta-Analysis of Quantitative Traits by Functional Linear Models.

    PubMed

    Fan, Ruzong; Wang, Yifan; Boehnke, Michael; Chen, Wei; Li, Yun; Ren, Haobo; Lobach, Iryna; Xiong, Momiao

    2015-08-01

    Meta-analysis of genetic data must account for differences among studies including study designs, markers genotyped, and covariates. The effects of genetic variants may differ from population to population, i.e., heterogeneity. Thus, meta-analysis of combining data of multiple studies is difficult. Novel statistical methods for meta-analysis are needed. In this article, functional linear models are developed for meta-analyses that connect genetic data to quantitative traits, adjusting for covariates. The models can be used to analyze rare variants, common variants, or a combination of the two. Both likelihood-ratio test (LRT) and F-distributed statistics are introduced to test association between quantitative traits and multiple variants in one genetic region. Extensive simulations are performed to evaluate empirical type I error rates and power performance of the proposed tests. The proposed LRT and F-distributed statistics control the type I error very well and have higher power than the existing methods of the meta-analysis sequence kernel association test (MetaSKAT). We analyze four blood lipid levels in data from a meta-analysis of eight European studies. The proposed methods detect more significant associations than MetaSKAT and the P-values of the proposed LRT and F-distributed statistics are usually much smaller than those of MetaSKAT. The functional linear models and related test statistics can be useful in whole-genome and whole-exome association studies. Copyright © 2015 by the Genetics Society of America.

  10. Web 2.0 in the Professional LIS Literature: An Exploratory Analysis

    ERIC Educational Resources Information Center

    Aharony, Noa

    2011-01-01

    This paper presents a statistical descriptive analysis and a thorough content analysis of descriptors and journal titles extracted from the Library and Information Science Abstracts (LISA) database, focusing on the subject of Web 2.0 and its main applications: blog, wiki, social network and tags.The primary research questions include: whether the…

  11. Analyzing the Validity of the Adult-Adolescent Parenting Inventory for Low-Income Populations

    ERIC Educational Resources Information Center

    Lawson, Michael A.; Alameda-Lawson, Tania; Byrnes, Edward

    2017-01-01

    Objectives: The purpose of this study was to examine the construct and predictive validity of the Adult-Adolescent Parenting Inventory (AAPI-2). Methods: The validity of the AAPI-2 was evaluated using multiple statistical methods, including exploratory factor analysis, confirmatory factor analysis, and latent class analysis. These analyses were…

  12. HYPOTHESIS SETTING AND ORDER STATISTIC FOR ROBUST GENOMIC META-ANALYSIS.

    PubMed

    Song, Chi; Tseng, George C

    2014-01-01

    Meta-analysis techniques have been widely developed and applied in genomic applications, especially for combining multiple transcriptomic studies. In this paper, we propose an order statistic of p-values ( r th ordered p-value, rOP) across combined studies as the test statistic. We illustrate different hypothesis settings that detect gene markers differentially expressed (DE) "in all studies", "in the majority of studies", or "in one or more studies", and specify rOP as a suitable method for detecting DE genes "in the majority of studies". We develop methods to estimate the parameter r in rOP for real applications. Statistical properties such as its asymptotic behavior and a one-sided testing correction for detecting markers of concordant expression changes are explored. Power calculation and simulation show better performance of rOP compared to classical Fisher's method, Stouffer's method, minimum p-value method and maximum p-value method under the focused hypothesis setting. Theoretically, rOP is found connected to the naïve vote counting method and can be viewed as a generalized form of vote counting with better statistical properties. The method is applied to three microarray meta-analysis examples including major depressive disorder, brain cancer and diabetes. The results demonstrate rOP as a more generalizable, robust and sensitive statistical framework to detect disease-related markers.

  13. Review of Statistical Learning Methods in Integrated Omics Studies (An Integrated Information Science).

    PubMed

    Zeng, Irene Sui Lan; Lumley, Thomas

    2018-01-01

    Integrated omics is becoming a new channel for investigating the complex molecular system in modern biological science and sets a foundation for systematic learning for precision medicine. The statistical/machine learning methods that have emerged in the past decade for integrated omics are not only innovative but also multidisciplinary with integrated knowledge in biology, medicine, statistics, machine learning, and artificial intelligence. Here, we review the nontrivial classes of learning methods from the statistical aspects and streamline these learning methods within the statistical learning framework. The intriguing findings from the review are that the methods used are generalizable to other disciplines with complex systematic structure, and the integrated omics is part of an integrated information science which has collated and integrated different types of information for inferences and decision making. We review the statistical learning methods of exploratory and supervised learning from 42 publications. We also discuss the strengths and limitations of the extended principal component analysis, cluster analysis, network analysis, and regression methods. Statistical techniques such as penalization for sparsity induction when there are fewer observations than the number of features and using Bayesian approach when there are prior knowledge to be integrated are also included in the commentary. For the completeness of the review, a table of currently available software and packages from 23 publications for omics are summarized in the appendix.

  14. Ketamine as a novel treatment for major depressive disorder and bipolar depression: a systematic review and quantitative meta-analysis.

    PubMed

    Lee, Ellen E; Della Selva, Megan P; Liu, Anson; Himelhoch, Seth

    2015-01-01

    Given the significant disability, morbidity and mortality associated with depression, the promising recent trials of ketamine highlight a novel intervention. A meta-analysis was conducted to assess the efficacy of ketamine in comparison with placebo for the reduction of depressive symptoms in patients who meet criteria for a major depressive episode. Two electronic databases were searched in September 2013 for English-language studies that were randomized placebo-controlled trials of ketamine treatment for patients with major depressive disorder or bipolar depression and utilized a standardized rating scale. Studies including participants receiving electroconvulsive therapy and adolescent/child participants were excluded. Five studies were included in the quantitative meta-analysis. The quantitative meta-analysis showed that ketamine significantly reduced depressive symptoms. The overall effect size at day 1 was large and statistically significant with an overall standardized mean difference of 1.01 (95% confidence interval 0.69-1.34) (P<.001), with the effects sustained at 7 days postinfusion. The heterogeneity of the studies was low and not statistically significant, and the funnel plot showed no publication bias. The large and statistically significant effect of ketamine on depressive symptoms supports a promising, new and effective pharmacotherapy with rapid onset, high efficacy and good tolerability. Copyright © 2015. Published by Elsevier Inc.

  15. Statistics teaching in medical school: opinions of practising doctors.

    PubMed

    Miles, Susan; Price, Gill M; Swift, Louise; Shepstone, Lee; Leinster, Sam J

    2010-11-04

    The General Medical Council expects UK medical graduates to gain some statistical knowledge during their undergraduate education; but provides no specific guidance as to amount, content or teaching method. Published work on statistics teaching for medical undergraduates has been dominated by medical statisticians, with little input from the doctors who will actually be using this knowledge and these skills after graduation. Furthermore, doctor's statistical training needs may have changed due to advances in information technology and the increasing importance of evidence-based medicine. Thus there exists a need to investigate the views of practising medical doctors as to the statistical training required for undergraduate medical students, based on their own use of these skills in daily practice. A questionnaire was designed to investigate doctors' views about undergraduate training in statistics and the need for these skills in daily practice, with a view to informing future teaching. The questionnaire was emailed to all clinicians with a link to the University of East Anglia Medical School. Open ended questions were included to elicit doctors' opinions about both their own undergraduate training in statistics and recommendations for the training of current medical students. Content analysis was performed by two of the authors to systematically categorize and describe all the responses provided by participants. 130 doctors responded, including both hospital consultants and general practitioners. The findings indicated that most had not recognised the value of their undergraduate teaching in statistics and probability at the time, but had subsequently found the skills relevant to their career. Suggestions for improving undergraduate teaching in these areas included referring to actual research and ensuring relevance to, and integration with, clinical practice. Grounding the teaching of statistics in the context of real research studies and including examples of typical clinical work may better prepare medical students for their subsequent career.

  16. Challenges of Big Data Analysis.

    PubMed

    Fan, Jianqing; Han, Fang; Liu, Han

    2014-06-01

    Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinguished and require new computational and statistical paradigm. This article gives overviews on the salient features of Big Data and how these features impact on paradigm change on statistical and computational methods as well as computing architectures. We also provide various new perspectives on the Big Data analysis and computation. In particular, we emphasize on the viability of the sparsest solution in high-confidence set and point out that exogeneous assumptions in most statistical methods for Big Data can not be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.

  17. Challenges of Big Data Analysis

    PubMed Central

    Fan, Jianqing; Han, Fang; Liu, Han

    2014-01-01

    Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinguished and require new computational and statistical paradigm. This article gives overviews on the salient features of Big Data and how these features impact on paradigm change on statistical and computational methods as well as computing architectures. We also provide various new perspectives on the Big Data analysis and computation. In particular, we emphasize on the viability of the sparsest solution in high-confidence set and point out that exogeneous assumptions in most statistical methods for Big Data can not be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions. PMID:25419469

  18. A Meta-Meta-Analysis: Empirical Review of Statistical Power, Type I Error Rates, Effect Sizes, and Model Selection of Meta-Analyses Published in Psychology

    ERIC Educational Resources Information Center

    Cafri, Guy; Kromrey, Jeffrey D.; Brannick, Michael T.

    2010-01-01

    This article uses meta-analyses published in "Psychological Bulletin" from 1995 to 2005 to describe meta-analyses in psychology, including examination of statistical power, Type I errors resulting from multiple comparisons, and model choice. Retrospective power estimates indicated that univariate categorical and continuous moderators, individual…

  19. Effects of Cognitive Load on Trust

    DTIC Science & Technology

    2013-10-01

    that may be affected by load  Build a parsing tool to extract relevant features  Statistical analysis of results (by load components) Achieved...for a business application. Participants assessed potential job candidates and reviewed the applicants’ virtual resume which included standard...substantially different from each other that would make any confounding problems or other issues. Some statistics of the Australian data collection are

  20. Why Don't All Professors Use Computers?

    ERIC Educational Resources Information Center

    Drew, David Eli

    1989-01-01

    Discusses the adoption of computer technology at universities and examines reasons why some professors don't use computers. Topics discussed include computer applications, including artificial intelligence, social science research, statistical analysis, and cooperative research; appropriateness of the technology for the task; the Computer Aptitude…

  1. APOD Data Release of Social Network Footprint for 2015

    NASA Astrophysics Data System (ADS)

    Nemiroff, Robert J.; Russell, David; Allen, Alice; Connelly, Paul; Lowe, Stuart R.; Petz, Sydney; Haring, Ralf; Bonnell, Jerry T.; APOD Team

    2017-01-01

    APOD data for 2015 are being made freely available for download and analysis. The data includes page view statistics for the main NASA APOD website at https://apod.nasa.gov, as well as for APOD's social media sites on Facebook, Instagram, Google Plus, and Twitter. General APOD-specific demographic information for each site is included. Popularity statistics that have been archived including Page Views, Likes, Shares, Hearts, and Retweets. The downloadable Excel-type spreadsheet also includes the APOD title and (unlinked) explanation. This data is released not to highlight APOD's popularity but to encourage analyses, with potential examples involving which astronomy topics trend the best and whether popularity is social group dependent.

  2. Global Sensitivity Analysis of Environmental Systems via Multiple Indices based on Statistical Moments of Model Outputs

    NASA Astrophysics Data System (ADS)

    Guadagnini, A.; Riva, M.; Dell'Oca, A.

    2017-12-01

    We propose to ground sensitivity of uncertain parameters of environmental models on a set of indices based on the main (statistical) moments, i.e., mean, variance, skewness and kurtosis, of the probability density function (pdf) of a target model output. This enables us to perform Global Sensitivity Analysis (GSA) of a model in terms of multiple statistical moments and yields a quantification of the impact of model parameters on features driving the shape of the pdf of model output. Our GSA approach includes the possibility of being coupled with the construction of a reduced complexity model that allows approximating the full model response at a reduced computational cost. We demonstrate our approach through a variety of test cases. These include a commonly used analytical benchmark, a simplified model representing pumping in a coastal aquifer, a laboratory-scale tracer experiment, and the migration of fracturing fluid through a naturally fractured reservoir (source) to reach an overlying formation (target). Our strategy allows discriminating the relative importance of model parameters to the four statistical moments considered. We also provide an appraisal of the error associated with the evaluation of our sensitivity metrics by replacing the original system model through the selected surrogate model. Our results suggest that one might need to construct a surrogate model with increasing level of accuracy depending on the statistical moment considered in the GSA. The methodological framework we propose can assist the development of analysis techniques targeted to model calibration, design of experiment, uncertainty quantification and risk assessment.

  3. Statistical trends of episiotomy around the world: Comparative systematic review of changing practices.

    PubMed

    Clesse, Christophe; Lighezzolo-Alnot, Joëlle; De Lavergne, Sylvie; Hamlin, Sandrine; Scheffler, Michèle

    2018-06-01

    The authors' purpose for this article is to identify, review and interpret all publications about the episiotomy rates worldwide. Based on the criteria from the PRISMA guidelines, twenty databases were scrutinized. All studies which include national statistics related to episiotomy were selected, as well as studies presenting estimated data. Sixty-one papers were selected with publication dates between 1995 and 2016. A static and dynamic analysis of all the results was carried out. The assumption for the decline in the number of episiotomies is discussed and confirmed, recalling that nowadays high rates of episiotomy remain in less industrialized countries and East Asia. Finally, our analysis aims to investigate the potential determinants which influence apparent statistical disparities.

  4. Statistical research on the bioactivity of new marine natural products discovered during the 28 years from 1985 to 2012.

    PubMed

    Hu, Yiwen; Chen, Jiahui; Hu, Guping; Yu, Jianchen; Zhu, Xun; Lin, Yongcheng; Chen, Shengping; Yuan, Jie

    2015-01-07

    Every year, hundreds of new compounds are discovered from the metabolites of marine organisms. Finding new and useful compounds is one of the crucial drivers for this field of research. Here we describe the statistics of bioactive compounds discovered from marine organisms from 1985 to 2012. This work is based on our database, which contains information on more than 15,000 chemical substances including 4196 bioactive marine natural products. We performed a comprehensive statistical analysis to understand the characteristics of the novel bioactive compounds and detail temporal trends, chemical structures, species distribution, and research progress. We hope this meta-analysis will provide useful information for research into the bioactivity of marine natural products and drug development.

  5. Building the Community Online Resource for Statistical Seismicity Analysis (CORSSA)

    NASA Astrophysics Data System (ADS)

    Michael, A. J.; Wiemer, S.; Zechar, J. D.; Hardebeck, J. L.; Naylor, M.; Zhuang, J.; Steacy, S.; Corssa Executive Committee

    2010-12-01

    Statistical seismology is critical to the understanding of seismicity, the testing of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology - especially to those aspects with great impact on public policy - statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA). CORSSA is a web-based educational platform that is authoritative, up-to-date, prominent, and user-friendly. We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each containing between four and eight articles. The CORSSA web page, www.corssa.org, officially unveiled on September 6, 2010, debuts with an initial set of approximately 10 to 15 articles available online for viewing and commenting with additional articles to be added over the coming months. Each article will be peer-reviewed and will present a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles will include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. A special article will compare and review available statistical seismology software packages.

  6. The impact of mother's literacy on child dental caries: Individual data or aggregate data analysis?

    PubMed

    Haghdoost, Ali-Akbar; Hessari, Hossein; Baneshi, Mohammad Reza; Rad, Maryam; Shahravan, Arash

    2017-01-01

    To evaluate the impact of mother's literacy on child dental caries based on a national oral health survey in Iran and to investigate the possibility of ecological fallacy in aggregate data analysis. Existing data were from second national oral health survey that was carried out in 2004, which including 8725 6 years old participants. The association of mother's literacy with caries occurrence (DMF (Decayed, Missing, Filling) total score >0) of her child was assessed using individual data by logistic regression model. Then the association of the percentages of mother's literacy and the percentages of decayed teeth in each 30 provinces of Iran was assessed using aggregated data retrieved from the data of second national oral health survey of Iran and alternatively from census of "Statistical Center of Iran" using linear regression model. The significance level was set at 0.05 for all analysis. Individual data analysis showed a statistically significant association between mother's literacy and decayed teeth of children ( P = 0.02, odds ratio = 0.83). There were not statistical significant association between mother's literacy and child dental caries in aggregate data analysis of oral health survey ( P = 0.79, B = 0.03) and census of "Statistical Center of Statistics" ( P = 0.60, B = 0.14). Literate mothers have a preventive effect on occurring dental caries of children. According to the high percentage of illiterate parents in Iran, it's logical to consider suitable methods of oral health education which do not need reading or writing. Aggregate data analysis and individual data analysis had completely different results in this study.

  7. Methodological quality of behavioural weight loss studies: a systematic review

    PubMed Central

    Lemon, S. C.; Wang, M. L.; Haughton, C. F.; Estabrook, D. P.; Frisard, C. F.; Pagoto, S. L.

    2018-01-01

    Summary This systematic review assessed the methodological quality of behavioural weight loss intervention studies conducted among adults and associations between quality and statistically significant weight loss outcome, strength of intervention effectiveness and sample size. Searches for trials published between January, 2009 and December, 2014 were conducted using PUBMED, MEDLINE and PSYCINFO and identified ninety studies. Methodological quality indicators included study design, anthropometric measurement approach, sample size calculations, intent-to-treat (ITT) analysis, loss to follow-up rate, missing data strategy, sampling strategy, report of treatment receipt and report of intervention fidelity (mean = 6.3). Indicators most commonly utilized included randomized design (100%), objectively measured anthropometrics (96.7%), ITT analysis (86.7%) and reporting treatment adherence (76.7%). Most studies (62.2%) had a follow-up rate >75% and reported a loss to follow-up analytic strategy or minimal missing data (69.9%). Describing intervention fidelity (34.4%) and sampling from a known population (41.1%) were least common. Methodological quality was not associated with reporting a statistically significant result, effect size or sample size. This review found the published literature of behavioural weight loss trials to be of high quality for specific indicators, including study design and measurement. Identified for improvement include utilization of more rigorous statistical approaches to loss to follow up and better fidelity reporting. PMID:27071775

  8. Meta-analysis of correlated traits via summary statistics from GWASs with an application in hypertension.

    PubMed

    Zhu, Xiaofeng; Feng, Tao; Tayo, Bamidele O; Liang, Jingjing; Young, J Hunter; Franceschini, Nora; Smith, Jennifer A; Yanek, Lisa R; Sun, Yan V; Edwards, Todd L; Chen, Wei; Nalls, Mike; Fox, Ervin; Sale, Michele; Bottinger, Erwin; Rotimi, Charles; Liu, Yongmei; McKnight, Barbara; Liu, Kiang; Arnett, Donna K; Chakravati, Aravinda; Cooper, Richard S; Redline, Susan

    2015-01-08

    Genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits. Many detected genetic loci harbor variants that associate with multiple-even distinct-traits. Most current analysis approaches focus on single traits, even though the final results from multiple traits are evaluated together. Such approaches miss the opportunity to systemically integrate the phenome-wide data available for genetic association analysis. In this study, we propose a general approach that can integrate association evidence from summary statistics of multiple traits, either correlated, independent, continuous, or binary traits, which might come from the same or different studies. We allow for trait heterogeneity effects. Population structure and cryptic relatedness can also be controlled. Our simulations suggest that the proposed method has improved statistical power over single-trait analysis in most of the cases we studied. We applied our method to the Continental Origins and Genetic Epidemiology Network (COGENT) African ancestry samples for three blood pressure traits and identified four loci (CHIC2, HOXA-EVX1, IGFBP1/IGFBP3, and CDH17; p < 5.0 × 10(-8)) associated with hypertension-related traits that were missed by a single-trait analysis in the original report. Six additional loci with suggestive association evidence (p < 5.0 × 10(-7)) were also observed, including CACNA1D and WNT3. Our study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs. Our method also provides a way to study a cross phenotype (CP) association by using summary statistics from GWASs of multiple phenotypes. Copyright © 2015 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  9. UNITY: Confronting Supernova Cosmology's Statistical and Systematic Uncertainties in a Unified Bayesian Framework

    NASA Astrophysics Data System (ADS)

    Rubin, D.; Aldering, G.; Barbary, K.; Boone, K.; Chappell, G.; Currie, M.; Deustua, S.; Fagrelius, P.; Fruchter, A.; Hayden, B.; Lidman, C.; Nordin, J.; Perlmutter, S.; Saunders, C.; Sofiatti, C.; Supernova Cosmology Project, The

    2015-11-01

    While recent supernova (SN) cosmology research has benefited from improved measurements, current analysis approaches are not statistically optimal and will prove insufficient for future surveys. This paper discusses the limitations of current SN cosmological analyses in treating outliers, selection effects, shape- and color-standardization relations, unexplained dispersion, and heterogeneous observations. We present a new Bayesian framework, called UNITY (Unified Nonlinear Inference for Type-Ia cosmologY), that incorporates significant improvements in our ability to confront these effects. We apply the framework to real SN observations and demonstrate smaller statistical and systematic uncertainties. We verify earlier results that SNe Ia require nonlinear shape and color standardizations, but we now include these nonlinear relations in a statistically well-justified way. This analysis was primarily performed blinded, in that the basic framework was first validated on simulated data before transitioning to real data. We also discuss possible extensions of the method.

  10. Standard deviation and standard error of the mean.

    PubMed

    Lee, Dong Kyu; In, Junyong; Lee, Sangseok

    2015-06-01

    In most clinical and experimental studies, the standard deviation (SD) and the estimated standard error of the mean (SEM) are used to present the characteristics of sample data and to explain statistical analysis results. However, some authors occasionally muddle the distinctive usage between the SD and SEM in medical literature. Because the process of calculating the SD and SEM includes different statistical inferences, each of them has its own meaning. SD is the dispersion of data in a normal distribution. In other words, SD indicates how accurately the mean represents sample data. However the meaning of SEM includes statistical inference based on the sampling distribution. SEM is the SD of the theoretical distribution of the sample means (the sampling distribution). While either SD or SEM can be applied to describe data and statistical results, one should be aware of reasonable methods with which to use SD and SEM. We aim to elucidate the distinctions between SD and SEM and to provide proper usage guidelines for both, which summarize data and describe statistical results.

  11. Standard deviation and standard error of the mean

    PubMed Central

    In, Junyong; Lee, Sangseok

    2015-01-01

    In most clinical and experimental studies, the standard deviation (SD) and the estimated standard error of the mean (SEM) are used to present the characteristics of sample data and to explain statistical analysis results. However, some authors occasionally muddle the distinctive usage between the SD and SEM in medical literature. Because the process of calculating the SD and SEM includes different statistical inferences, each of them has its own meaning. SD is the dispersion of data in a normal distribution. In other words, SD indicates how accurately the mean represents sample data. However the meaning of SEM includes statistical inference based on the sampling distribution. SEM is the SD of the theoretical distribution of the sample means (the sampling distribution). While either SD or SEM can be applied to describe data and statistical results, one should be aware of reasonable methods with which to use SD and SEM. We aim to elucidate the distinctions between SD and SEM and to provide proper usage guidelines for both, which summarize data and describe statistical results. PMID:26045923

  12. Association of obesity with hypertension and type 2 diabetes mellitus in India: A meta-analysis of observational studies

    PubMed Central

    Babu, Giridhara R; Murthy, G V S; Ana, Yamuna; Patel, Prital; Deepa, R; Neelon, Sara E Benjamin; Kinra, Sanjay; Reddy, K Srinath

    2018-01-01

    AIM To perform a meta-analysis of the association of obesity with hypertension and type 2 diabetes mellitus (T2DM) in India among adults. METHODS To conduct meta-analysis, we performed comprehensive, electronic literature search in the PubMed, CINAHL Plus, and Google Scholar. We restricted the analysis to studies with documentation of some measure of obesity namely; body mass index, waist-hip ratio, waist circumference and diagnosis of hypertension or diagnosis of T2DM. By obtaining summary estimates of all included studies, the meta-analysis was performed using both RevMan version 5 and “metan” command STATA version 11. Heterogeneity was measured by I2 statistic. Funnel plot analysis has been done to assess the study publication bias. RESULTS Of the 956 studies screened, 18 met the eligibility criteria. The pooled odds ratio between obesity and hypertension was 3.82 (95%CI: 3.39 to 4.25). The heterogeneity around this estimate (I2 statistic) was 0%, indicating low variability. The pooled odds ratio from the included studies showed a statistically significant association between obesity and T2DM (OR = 1.14, 95%CI: 1.04 to 1.24) with a high degree of variability. CONCLUSION Despite methodological differences, obesity showed significant, potentially plausible association with hypertension and T2DM in studies conducted in India. Being a modifiable risk factor, our study informs setting policy priority and intervention efforts to prevent debilitating complications. PMID:29359028

  13. Association of obesity with hypertension and type 2 diabetes mellitus in India: A meta-analysis of observational studies.

    PubMed

    Babu, Giridhara R; Murthy, G V S; Ana, Yamuna; Patel, Prital; Deepa, R; Neelon, Sara E Benjamin; Kinra, Sanjay; Reddy, K Srinath

    2018-01-15

    To perform a meta-analysis of the association of obesity with hypertension and type 2 diabetes mellitus (T2DM) in India among adults. To conduct meta-analysis, we performed comprehensive, electronic literature search in the PubMed, CINAHL Plus, and Google Scholar. We restricted the analysis to studies with documentation of some measure of obesity namely; body mass index, waist-hip ratio, waist circumference and diagnosis of hypertension or diagnosis of T2DM. By obtaining summary estimates of all included studies, the meta-analysis was performed using both RevMan version 5 and "metan" command STATA version 11. Heterogeneity was measured by I 2 statistic. Funnel plot analysis has been done to assess the study publication bias. Of the 956 studies screened, 18 met the eligibility criteria. The pooled odds ratio between obesity and hypertension was 3.82 (95%CI: 3.39 to 4.25). The heterogeneity around this estimate (I2 statistic) was 0%, indicating low variability. The pooled odds ratio from the included studies showed a statistically significant association between obesity and T2DM (OR = 1.14, 95%CI: 1.04 to 1.24) with a high degree of variability. Despite methodological differences, obesity showed significant, potentially plausible association with hypertension and T2DM in studies conducted in India. Being a modifiable risk factor, our study informs setting policy priority and intervention efforts to prevent debilitating complications.

  14. HydroClimATe: hydrologic and climatic analysis toolkit

    USGS Publications Warehouse

    Dickinson, Jesse; Hanson, Randall T.; Predmore, Steven K.

    2014-01-01

    The potential consequences of climate variability and climate change have been identified as major issues for the sustainability and availability of the worldwide water resources. Unlike global climate change, climate variability represents deviations from the long-term state of the climate over periods of a few years to several decades. Currently, rich hydrologic time-series data are available, but the combination of data preparation and statistical methods developed by the U.S. Geological Survey as part of the Groundwater Resources Program is relatively unavailable to hydrologists and engineers who could benefit from estimates of climate variability and its effects on periodic recharge and water-resource availability. This report documents HydroClimATe, a computer program for assessing the relations between variable climatic and hydrologic time-series data. HydroClimATe was developed for a Windows operating system. The software includes statistical tools for (1) time-series preprocessing, (2) spectral analysis, (3) spatial and temporal analysis, (4) correlation analysis, and (5) projections. The time-series preprocessing tools include spline fitting, standardization using a normal or gamma distribution, and transformation by a cumulative departure. The spectral analysis tools include discrete Fourier transform, maximum entropy method, and singular spectrum analysis. The spatial and temporal analysis tool is empirical orthogonal function analysis. The correlation analysis tools are linear regression and lag correlation. The projection tools include autoregressive time-series modeling and generation of many realizations. These tools are demonstrated in four examples that use stream-flow discharge data, groundwater-level records, gridded time series of precipitation data, and the Multivariate ENSO Index.

  15. A perceptual space of local image statistics.

    PubMed

    Victor, Jonathan D; Thengone, Daniel J; Rizvi, Syed M; Conte, Mary M

    2015-12-01

    Local image statistics are important for visual analysis of textures, surfaces, and form. There are many kinds of local statistics, including those that capture luminance distributions, spatial contrast, oriented segments, and corners. While sensitivity to each of these kinds of statistics have been well-studied, much less is known about visual processing when multiple kinds of statistics are relevant, in large part because the dimensionality of the problem is high and different kinds of statistics interact. To approach this problem, we focused on binary images on a square lattice - a reduced set of stimuli which nevertheless taps many kinds of local statistics. In this 10-parameter space, we determined psychophysical thresholds to each kind of statistic (16 observers) and all of their pairwise combinations (4 observers). Sensitivities and isodiscrimination contours were consistent across observers. Isodiscrimination contours were elliptical, implying a quadratic interaction rule, which in turn determined ellipsoidal isodiscrimination surfaces in the full 10-dimensional space, and made predictions for sensitivities to complex combinations of statistics. These predictions, including the prediction of a combination of statistics that was metameric to random, were verified experimentally. Finally, check size had only a mild effect on sensitivities over the range from 2.8 to 14min, but sensitivities to second- and higher-order statistics was substantially lower at 1.4min. In sum, local image statistics form a perceptual space that is highly stereotyped across observers, in which different kinds of statistics interact according to simple rules. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. A perceptual space of local image statistics

    PubMed Central

    Victor, Jonathan D.; Thengone, Daniel J.; Rizvi, Syed M.; Conte, Mary M.

    2015-01-01

    Local image statistics are important for visual analysis of textures, surfaces, and form. There are many kinds of local statistics, including those that capture luminance distributions, spatial contrast, oriented segments, and corners. While sensitivity to each of these kinds of statistics have been well-studied, much less is known about visual processing when multiple kinds of statistics are relevant, in large part because the dimensionality of the problem is high and different kinds of statistics interact. To approach this problem, we focused on binary images on a square lattice – a reduced set of stimuli which nevertheless taps many kinds of local statistics. In this 10-parameter space, we determined psychophysical thresholds to each kind of statistic (16 observers) and all of their pairwise combinations (4 observers). Sensitivities and isodiscrimination contours were consistent across observers. Isodiscrimination contours were elliptical, implying a quadratic interaction rule, which in turn determined ellipsoidal isodiscrimination surfaces in the full 10-dimensional space, and made predictions for sensitivities to complex combinations of statistics. These predictions, including the prediction of a combination of statistics that was metameric to random, were verified experimentally. Finally, check size had only a mild effect on sensitivities over the range from 2.8 to 14 min, but sensitivities to second- and higher-order statistics was substantially lower at 1.4 min. In sum, local image statistics forms a perceptual space that is highly stereotyped across observers, in which different kinds of statistics interact according to simple rules. PMID:26130606

  17. Rare-Variant Association Analysis: Study Designs and Statistical Tests

    PubMed Central

    Lee, Seunggeung; Abecasis, Gonçalo R.; Boehnke, Michael; Lin, Xihong

    2014-01-01

    Despite the extensive discovery of trait- and disease-associated common variants, much of the genetic contribution to complex traits remains unexplained. Rare variants can explain additional disease risk or trait variability. An increasing number of studies are underway to identify trait- and disease-associated rare variants. In this review, we provide an overview of statistical issues in rare-variant association studies with a focus on study designs and statistical tests. We present the design and analysis pipeline of rare-variant studies and review cost-effective sequencing designs and genotyping platforms. We compare various gene- or region-based association tests, including burden tests, variance-component tests, and combined omnibus tests, in terms of their assumptions and performance. Also discussed are the related topics of meta-analysis, population-stratification adjustment, genotype imputation, follow-up studies, and heritability due to rare variants. We provide guidelines for analysis and discuss some of the challenges inherent in these studies and future research directions. PMID:24995866

  18. Research Update: Spatially resolved mapping of electronic structure on atomic level by multivariate statistical analysis

    DOE PAGES

    Belianinov, Alex; Panchapakesan, G.; Lin, Wenzhi; ...

    2014-12-02

    Atomic level spatial variability of electronic structure in Fe-based superconductor FeTe0.55Se0.45 (Tc = 15 K) is explored using current-imaging tunneling-spectroscopy. Multivariate statistical analysis of the data differentiates regions of dissimilar electronic behavior that can be identified with the segregation of chalcogen atoms, as well as boundaries between terminations and near neighbor interactions. Subsequent clustering analysis allows identification of the spatial localization of these dissimilar regions. Similar statistical analysis of modeled calculated density of states of chemically inhomogeneous FeTe1 x Sex structures further confirms that the two types of chalcogens, i.e., Te and Se, can be identified by their electronic signaturemore » and differentiated by their local chemical environment. This approach allows detailed chemical discrimination of the scanning tunneling microscopy data including separation of atomic identities, proximity, and local configuration effects and can be universally applicable to chemically and electronically inhomogeneous surfaces.« less

  19. Research Update: Spatially resolved mapping of electronic structure on atomic level by multivariate statistical analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Belianinov, Alex, E-mail: belianinova@ornl.gov; Ganesh, Panchapakesan; Lin, Wenzhi

    2014-12-01

    Atomic level spatial variability of electronic structure in Fe-based superconductor FeTe{sub 0.55}Se{sub 0.45} (T{sub c} = 15 K) is explored using current-imaging tunneling-spectroscopy. Multivariate statistical analysis of the data differentiates regions of dissimilar electronic behavior that can be identified with the segregation of chalcogen atoms, as well as boundaries between terminations and near neighbor interactions. Subsequent clustering analysis allows identification of the spatial localization of these dissimilar regions. Similar statistical analysis of modeled calculated density of states of chemically inhomogeneous FeTe{sub 1−x}Se{sub x} structures further confirms that the two types of chalcogens, i.e., Te and Se, can be identified bymore » their electronic signature and differentiated by their local chemical environment. This approach allows detailed chemical discrimination of the scanning tunneling microscopy data including separation of atomic identities, proximity, and local configuration effects and can be universally applicable to chemically and electronically inhomogeneous surfaces.« less

  20. Active Nonlinear Feedback Control for Aerospace Systems. Processor

    DTIC Science & Technology

    1990-12-01

    relating to the role of nonlinearities in feedback control. These area include Lyapunov function theory, chaotic controllers, statistical energy analysis , phase robustness, and optimal nonlinear control theory.

  1. Data management in large-scale collaborative toxicity studies: how to file experimental data for automated statistical analysis.

    PubMed

    Stanzel, Sven; Weimer, Marc; Kopp-Schneider, Annette

    2013-06-01

    High-throughput screening approaches are carried out for the toxicity assessment of a large number of chemical compounds. In such large-scale in vitro toxicity studies several hundred or thousand concentration-response experiments are conducted. The automated evaluation of concentration-response data using statistical analysis scripts saves time and yields more consistent results in comparison to data analysis performed by the use of menu-driven statistical software. Automated statistical analysis requires that concentration-response data are available in a standardised data format across all compounds. To obtain consistent data formats, a standardised data management workflow must be established, including guidelines for data storage, data handling and data extraction. In this paper two procedures for data management within large-scale toxicological projects are proposed. Both procedures are based on Microsoft Excel files as the researcher's primary data format and use a computer programme to automate the handling of data files. The first procedure assumes that data collection has not yet started whereas the second procedure can be used when data files already exist. Successful implementation of the two approaches into the European project ACuteTox is illustrated. Copyright © 2012 Elsevier Ltd. All rights reserved.

  2. Computer program documentation for the pasture/range condition assessment processor

    NASA Technical Reports Server (NTRS)

    Mcintyre, K. S.; Miller, T. G. (Principal Investigator)

    1982-01-01

    The processor which drives for the RANGE software allows the user to analyze LANDSAT data containing pasture and rangeland. Analysis includes mapping, generating statistics, calculating vegetative indexes, and plotting vegetative indexes. Routines for using the processor are given. A flow diagram is included.

  3. EHME: a new word database for research in Basque language.

    PubMed

    Acha, Joana; Laka, Itziar; Landa, Josu; Salaburu, Pello

    2014-11-14

    This article presents EHME, the frequency dictionary of Basque structure, an online program that enables researchers in psycholinguistics to extract word and nonword stimuli, based on a broad range of statistics concerning the properties of Basque words. The database consists of 22.7 million tokens, and properties available include morphological structure frequency and word-similarity measures, apart from classical indexes: word frequency, orthographic structure, orthographic similarity, bigram and biphone frequency, and syllable-based measures. Measures are indexed at the lemma, morpheme and word level. We include reliability and validation analysis. The application is freely available, and enables the user to extract words based on concrete statistical criteria 1 , as well as to obtain statistical characteristics from a list of words

  4. Specialized data analysis of SSME and advanced propulsion system vibration measurements

    NASA Technical Reports Server (NTRS)

    Coffin, Thomas; Swanson, Wayne L.; Jong, Yen-Yi

    1993-01-01

    The basic objectives of this contract were to perform detailed analysis and evaluation of dynamic data obtained during Space Shuttle Main Engine (SSME) test and flight operations, including analytical/statistical assessment of component dynamic performance, and to continue the development and implementation of analytical/statistical models to effectively define nominal component dynamic characteristics, detect anomalous behavior, and assess machinery operational conditions. This study was to provide timely assessment of engine component operational status, identify probable causes of malfunction, and define feasible engineering solutions. The work was performed under three broad tasks: (1) Analysis, Evaluation, and Documentation of SSME Dynamic Test Results; (2) Data Base and Analytical Model Development and Application; and (3) Development and Application of Vibration Signature Analysis Techniques.

  5. A Novel Genome-Information Content-Based Statistic for Genome-Wide Association Analysis Designed for Next-Generation Sequencing Data

    PubMed Central

    Luo, Li; Zhu, Yun

    2012-01-01

    Abstract The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistics for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistics, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T2, collapsing method, multivariate and collapsing (CMC) method, individual χ2 test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets. PMID:22651812

  6. A novel genome-information content-based statistic for genome-wide association analysis designed for next-generation sequencing data.

    PubMed

    Luo, Li; Zhu, Yun; Xiong, Momiao

    2012-06-01

    The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistics for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistics, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T(2), collapsing method, multivariate and collapsing (CMC) method, individual χ(2) test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets.

  7. Enhancing predictive accuracy and reproducibility in clinical evaluation research: Commentary on the special section of the Journal of Evaluation in Clinical Practice.

    PubMed

    Bryant, Fred B

    2016-12-01

    This paper introduces a special section of the current issue of the Journal of Evaluation in Clinical Practice that includes a set of 6 empirical articles showcasing a versatile, new machine-learning statistical method, known as optimal data (or discriminant) analysis (ODA), specifically designed to produce statistical models that maximize predictive accuracy. As this set of papers clearly illustrates, ODA offers numerous important advantages over traditional statistical methods-advantages that enhance the validity and reproducibility of statistical conclusions in empirical research. This issue of the journal also includes a review of a recently published book that provides a comprehensive introduction to the logic, theory, and application of ODA in empirical research. It is argued that researchers have much to gain by using ODA to analyze their data. © 2016 John Wiley & Sons, Ltd.

  8. The classification of gunshot residue using laser electrospray mass spectrometry and offline multivariate statistical analysis

    USDA-ARS?s Scientific Manuscript database

    Nonresonant laser vaporization combined with high-resolution electrospray time-of-flight mass spectrometry enables analysis of a casing after discharge of a firearm revealing organic signature molecules including methyl centralite (MC), diphenylamine (DPA), N-nitrosodiphenylamine (N-NO-DPA), 4-nitro...

  9. An Analysis of Methods Used to Examine Gender Differences in Computer-Related Behavior.

    ERIC Educational Resources Information Center

    Kay, Robin

    1992-01-01

    Review of research investigating gender differences in computer-related behavior examines statistical and methodological flaws. Issues addressed include sample selection, sample size, scale development, scale quality, the use of univariate and multivariate analyses, regressional analysis, construct definition, construct testing, and the…

  10. Links between Characteristics of Collaborative Peer Video Analysis Events and Literacy Teachers' Outcomes

    ERIC Educational Resources Information Center

    Arya, Poonam; Christ, Tanya; Chiu, Ming

    2015-01-01

    This study examined how characteristics of Collaborative Peer Video Analysis (CPVA) events are related to teachers' pedagogical outcomes. Data included 39 transcribed literacy video events, in which 14 in-service teachers engaged in discussions of their video clips. Emergent coding and Statistical Discourse Analysis were used to analyze the data.…

  11. Comprehension-Based versus Production-Based Grammar Instruction: A Meta-Analysis of Comparative Studies

    ERIC Educational Resources Information Center

    Shintani, Natsuko; Li, Shaofeng; Ellis, Rod

    2013-01-01

    This article reports a meta-analysis of studies that investigated the relative effectiveness of comprehension-based instruction (CBI) and production-based instruction (PBI). The meta-analysis only included studies that featured a direct comparison of CBI and PBI in order to ensure methodological and statistical robustness. A total of 35 research…

  12. Combining the Bourne-Shell, sed and awk in the UNIX Environment for Language Analysis.

    ERIC Educational Resources Information Center

    Schmitt, Lothar M.; Christianson, Kiel T.

    This document describes how to construct tools for language analysis in research and teaching using the Bourne-shell, sed, and awk, three search tools, in the UNIX operating system. Applications include: searches for words, phrases, grammatical patterns, and phonemic patterns in text; statistical analysis of text in regard to such searches,…

  13. Review of U.S. Army Unmanned Aerial Systems Accident Reports: Analysis of Human Error Contributions

    DTIC Science & Technology

    2018-03-20

    USAARL Report No. 2018-08 Review of U.S. Army Unmanned Aerial Systems Accident Reports: Analysis of Human Error Contributions By Kathryn A...3 Statistical Analysis Approach ..............................................................................................3 Results...1 Introduction The success of unmanned aerial systems (UAS) operations relies upon a variety of factors, including, but not limited to

  14. Putting the "But" Back in Meta-Analysis: Issues Affecting the Validity of Quantitative Reviews.

    ERIC Educational Resources Information Center

    L'Hommedieu, Randi; And Others

    Some of the frustrations inherent in trying to incorporate qualifications of statistical results into meta-analysis are reviewed, and some solutions are proposed to prevent the loss of information in meta-analytic reports. The validity of a meta-analysis depends on several factors, including the: thoroughness of the literature search; selection of…

  15. Use of check lists in assessing the statistical content of medical studies.

    PubMed Central

    Gardner, M J; Machin, D; Campbell, M J

    1986-01-01

    Two check lists are used routinely in the statistical assessment of manuscripts submitted to the "BMJ." One is for papers of a general nature and the other specifically for reports on clinical trials. Each check list includes questions on the design, conduct, analysis, and presentation of studies, and answers to these contribute to the overall statistical evaluation. Only a small proportion of submitted papers are assessed statistically, and these are selected at the refereeing or editorial stage. Examination of the use of the check lists showed that most papers contained statistical failings, many of which could easily be remedied. It is recommended that the check lists should be used by statistical referees, editorial staff, and authors and also during the design stage of studies. PMID:3082452

  16. Report: Scientific Software.

    ERIC Educational Resources Information Center

    Borman, Stuart A.

    1985-01-01

    Discusses various aspects of scientific software, including evaluation and selection of commercial software products; program exchanges, catalogs, and other information sources; major data analysis packages; statistics and chemometrics software; and artificial intelligence. (JN)

  17. Marketing of Personalized Cancer Care on the Web: An Analysis of Internet Websites

    PubMed Central

    Cronin, Angel; Bair, Elizabeth; Lindeman, Neal; Viswanath, Vish; Janeway, Katherine A.

    2015-01-01

    Internet marketing may accelerate the use of care based on genomic or tumor-derived data. However, online marketing may be detrimental if it endorses products of unproven benefit. We conducted an analysis of Internet websites to identify personalized cancer medicine (PCM) products and claims. A Delphi Panel categorized PCM as standard or nonstandard based on evidence of clinical utility. Fifty-five websites, sponsored by commercial entities, academic institutions, physicians, research institutes, and organizations, that marketed PCM included somatic (58%) and germline (20%) analysis, interpretive services (15%), and physicians/institutions offering personalized care (44%). Of 32 sites offering somatic analysis, 56% included specific test information (range 1–152 tests). All statistical tests were two-sided, and comparisons of website content were conducted using McNemar’s test. More websites contained information about the benefits than limitations of PCM (85% vs 27%, P < .001). Websites specifying somatic analysis were statistically significantly more likely to market one or more nonstandard tests as compared with standard tests (88% vs 44%, P = .04). PMID:25745021

  18. Technical Brief for the final report presentation for Statistical summaries of selected Iowa streamflow data through September 2013, U.S. Geological Survey Open-File Report 2015-1214, Iowa DOT Research Project TR-669.

    DOT National Transportation Integrated Search

    2015-01-01

    Statistical summaries of streamflow data collected at 184 streamgages in Iowa are presented in this report. All streamgages included for analysis have at least 10 years of continuous record collected before or through September 2013. This report is a...

  19. Developing statistical wildlife habitat relationships for assessing cumulative effects of fuels treatments: Final Report for Joint Fire Science Program Project

    Treesearch

    Samuel A. Cushman; Kevin S. McKelvey

    2006-01-01

    The primary weakness in our current ability to evaluate future landscapes in terms of wildlife lies in the lack of quantitative models linking wildlife to forest stand conditions, including fuels treatments. This project focuses on 1) developing statistical wildlife habitat relationships models (WHR) utilizing Forest Inventory and Analysis (FIA) and National Vegetation...

  20. Statistical modeling of space shuttle environmental data

    NASA Technical Reports Server (NTRS)

    Tubbs, J. D.; Brewer, D. W.

    1983-01-01

    Statistical models which use a class of bivariate gamma distribution are examined. Topics discussed include: (1) the ratio of positively correlated gamma varieties; (2) a method to determine if unequal shape parameters are necessary in bivariate gamma distribution; (3) differential equations for modal location of a family of bivariate gamma distribution; and (4) analysis of some wind gust data using the analytical results developed for modeling application.

  1. Methodological Standards for Meta-Analyses and Qualitative Systematic Reviews of Cardiac Prevention and Treatment Studies: A Scientific Statement From the American Heart Association.

    PubMed

    Rao, Goutham; Lopez-Jimenez, Francisco; Boyd, Jack; D'Amico, Frank; Durant, Nefertiti H; Hlatky, Mark A; Howard, George; Kirley, Katherine; Masi, Christopher; Powell-Wiley, Tiffany M; Solomonides, Anthony E; West, Colin P; Wessel, Jennifer

    2017-09-05

    Meta-analyses are becoming increasingly popular, especially in the fields of cardiovascular disease prevention and treatment. They are often considered to be a reliable source of evidence for making healthcare decisions. Unfortunately, problems among meta-analyses such as the misapplication and misinterpretation of statistical methods and tests are long-standing and widespread. The purposes of this statement are to review key steps in the development of a meta-analysis and to provide recommendations that will be useful for carrying out meta-analyses and for readers and journal editors, who must interpret the findings and gauge methodological quality. To make the statement practical and accessible, detailed descriptions of statistical methods have been omitted. Based on a survey of cardiovascular meta-analyses, published literature on methodology, expert consultation, and consensus among the writing group, key recommendations are provided. Recommendations reinforce several current practices, including protocol registration; comprehensive search strategies; methods for data extraction and abstraction; methods for identifying, measuring, and dealing with heterogeneity; and statistical methods for pooling results. Other practices should be discontinued, including the use of levels of evidence and evidence hierarchies to gauge the value and impact of different study designs (including meta-analyses) and the use of structured tools to assess the quality of studies to be included in a meta-analysis. We also recommend choosing a pooling model for conventional meta-analyses (fixed effect or random effects) on the basis of clinical and methodological similarities among studies to be included, rather than the results of a test for statistical heterogeneity. © 2017 American Heart Association, Inc.

  2. A guide to understanding meta-analysis.

    PubMed

    Israel, Heidi; Richter, Randy R

    2011-07-01

    With the focus on evidence-based practice in healthcare, a well-conducted systematic review that includes a meta-analysis where indicated represents a high level of evidence for treatment effectiveness. The purpose of this commentary is to assist clinicians in understanding meta-analysis as a statistical tool using both published articles and explanations of components of the technique. We describe what meta-analysis is, what heterogeneity is, and how it affects meta-analysis, effect size, the modeling techniques of meta-analysis, and strengths and weaknesses of meta-analysis. Common components like forest plot interpretation, software that may be used, special cases for meta-analysis, such as subgroup analysis, individual patient data, and meta-regression, and a discussion of criticisms, are included.

  3. A probabilistic analysis of electrical equipment vulnerability to carbon fibers

    NASA Technical Reports Server (NTRS)

    Elber, W.

    1980-01-01

    The statistical problems of airborne carbon fibers falling onto electrical circuits were idealized and analyzed. The probability of making contact between randomly oriented finite length fibers and sets of parallel conductors with various spacings and lengths was developed theoretically. The probability of multiple fibers joining to bridge a single gap between conductors, or forming continuous networks is included. From these theoretical considerations, practical statistical analyses to assess the likelihood of causing electrical malfunctions was produced. The statistics obtained were confirmed by comparison with results of controlled experiments.

  4. Statistical Modeling for Radiation Hardness Assurance

    NASA Technical Reports Server (NTRS)

    Ladbury, Raymond L.

    2014-01-01

    We cover the models and statistics associated with single event effects (and total ionizing dose), why we need them, and how to use them: What models are used, what errors exist in real test data, and what the model allows us to say about the DUT will be discussed. In addition, how to use other sources of data such as historical, heritage, and similar part and how to apply experience, physics, and expert opinion to the analysis will be covered. Also included will be concepts of Bayesian statistics, data fitting, and bounding rates.

  5. The skeletal maturation status estimated by statistical shape analysis: axial images of Japanese cervical vertebra.

    PubMed

    Shin, S M; Kim, Y-I; Choi, Y-S; Yamaguchi, T; Maki, K; Cho, B-H; Park, S-B

    2015-01-01

    To evaluate axial cervical vertebral (ACV) shape quantitatively and to build a prediction model for skeletal maturation level using statistical shape analysis for Japanese individuals. The sample included 24 female and 19 male patients with hand-wrist radiographs and CBCT images. Through generalized Procrustes analysis and principal components (PCs) analysis, the meaningful PCs were extracted from each ACV shape and analysed for the estimation regression model. Each ACV shape had meaningful PCs, except for the second axial cervical vertebra. Based on these models, the smallest prediction intervals (PIs) were from the combination of the shape space PCs, age and gender. Overall, the PIs of the male group were smaller than those of the female group. There was no significant correlation between centroid size as a size factor and skeletal maturation level. Our findings suggest that the ACV maturation method, which was applied by statistical shape analysis, could confirm information about skeletal maturation in Japanese individuals as an available quantifier of skeletal maturation and could be as useful a quantitative method as the skeletal maturation index.

  6. The skeletal maturation status estimated by statistical shape analysis: axial images of Japanese cervical vertebra

    PubMed Central

    Shin, S M; Choi, Y-S; Yamaguchi, T; Maki, K; Cho, B-H; Park, S-B

    2015-01-01

    Objectives: To evaluate axial cervical vertebral (ACV) shape quantitatively and to build a prediction model for skeletal maturation level using statistical shape analysis for Japanese individuals. Methods: The sample included 24 female and 19 male patients with hand–wrist radiographs and CBCT images. Through generalized Procrustes analysis and principal components (PCs) analysis, the meaningful PCs were extracted from each ACV shape and analysed for the estimation regression model. Results: Each ACV shape had meaningful PCs, except for the second axial cervical vertebra. Based on these models, the smallest prediction intervals (PIs) were from the combination of the shape space PCs, age and gender. Overall, the PIs of the male group were smaller than those of the female group. There was no significant correlation between centroid size as a size factor and skeletal maturation level. Conclusions: Our findings suggest that the ACV maturation method, which was applied by statistical shape analysis, could confirm information about skeletal maturation in Japanese individuals as an available quantifier of skeletal maturation and could be as useful a quantitative method as the skeletal maturation index. PMID:25411713

  7. Access and Ownership in the Academic Environment: One Library's Progress Report.

    ERIC Educational Resources Information Center

    Brin, Beth; Cochran, Elissa

    1994-01-01

    Describes the methodology used at the University of Arizona Library to address the issue of access versus ownership of library materials. Topics discussed include participatory management; data collection, including focus groups, interlibrary loan statistics, and graduate research citation analysis; and resulting recommendations, including…

  8. Statistical Methods for the Analysis of Discrete Choice Experiments: A Report of the ISPOR Conjoint Analysis Good Research Practices Task Force.

    PubMed

    Hauber, A Brett; González, Juan Marcos; Groothuis-Oudshoorn, Catharina G M; Prior, Thomas; Marshall, Deborah A; Cunningham, Charles; IJzerman, Maarten J; Bridges, John F P

    2016-06-01

    Conjoint analysis is a stated-preference survey method that can be used to elicit responses that reveal preferences, priorities, and the relative importance of individual features associated with health care interventions or services. Conjoint analysis methods, particularly discrete choice experiments (DCEs), have been increasingly used to quantify preferences of patients, caregivers, physicians, and other stakeholders. Recent consensus-based guidance on good research practices, including two recent task force reports from the International Society for Pharmacoeconomics and Outcomes Research, has aided in improving the quality of conjoint analyses and DCEs in outcomes research. Nevertheless, uncertainty regarding good research practices for the statistical analysis of data from DCEs persists. There are multiple methods for analyzing DCE data. Understanding the characteristics and appropriate use of different analysis methods is critical to conducting a well-designed DCE study. This report will assist researchers in evaluating and selecting among alternative approaches to conducting statistical analysis of DCE data. We first present a simplistic DCE example and a simple method for using the resulting data. We then present a pedagogical example of a DCE and one of the most common approaches to analyzing data from such a question format-conditional logit. We then describe some common alternative methods for analyzing these data and the strengths and weaknesses of each alternative. We present the ESTIMATE checklist, which includes a list of questions to consider when justifying the choice of analysis method, describing the analysis, and interpreting the results. Copyright © 2016 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  9. RADSS: an integration of GIS, spatial statistics, and network service for regional data mining

    NASA Astrophysics Data System (ADS)

    Hu, Haitang; Bao, Shuming; Lin, Hui; Zhu, Qing

    2005-10-01

    Regional data mining, which aims at the discovery of knowledge about spatial patterns, clusters or association between regions, has widely applications nowadays in social science, such as sociology, economics, epidemiology, crime, and so on. Many applications in the regional or other social sciences are more concerned with the spatial relationship, rather than the precise geographical location. Based on the spatial continuity rule derived from Tobler's first law of geography: observations at two sites tend to be more similar to each other if the sites are close together than if far apart, spatial statistics, as an important means for spatial data mining, allow the users to extract the interesting and useful information like spatial pattern, spatial structure, spatial association, spatial outlier and spatial interaction, from the vast amount of spatial data or non-spatial data. Therefore, by integrating with the spatial statistical methods, the geographical information systems will become more powerful in gaining further insights into the nature of spatial structure of regional system, and help the researchers to be more careful when selecting appropriate models. However, the lack of such tools holds back the application of spatial data analysis techniques and development of new methods and models (e.g., spatio-temporal models). Herein, we make an attempt to develop such an integrated software and apply it into the complex system analysis for the Poyang Lake Basin. This paper presents a framework for integrating GIS, spatial statistics and network service in regional data mining, as well as their implementation. After discussing the spatial statistics methods involved in regional complex system analysis, we introduce RADSS (Regional Analysis and Decision Support System), our new regional data mining tool, by integrating GIS, spatial statistics and network service. RADSS includes the functions of spatial data visualization, exploratory spatial data analysis, and spatial statistics. The tool also includes some fundamental spatial and non-spatial database in regional population and environment, which can be updated by external database via CD or network. Utilizing this data mining and exploratory analytical tool, the users can easily and quickly analyse the huge mount of the interrelated regional data, and better understand the spatial patterns and trends of the regional development, so as to make a credible and scientific decision. Moreover, it can be used as an educational tool for spatial data analysis and environmental studies. In this paper, we also present a case study on Poyang Lake Basin as an application of the tool and spatial data mining in complex environmental studies. At last, several concluding remarks are discussed.

  10. Clinical periodontal variables in patients with and without dementia-a systematic review and meta-analysis.

    PubMed

    Maldonado, Alejandra; Laugisch, Oliver; Bürgin, Walter; Sculean, Anton; Eick, Sigrun

    2018-06-22

    Considering the increasing number of elderly people, dementia has gained an important role in today's society. Although the contributing factors for dementia have not been fully understood, chronic periodontitis (CP) seems to have a possible link to dementia. To conduct a systematic review including meta-analysis in order to assess potential differences in clinical periodontal variables between patients with dementia and non-demented individuals. The following focused question was evaluated: is periodontitis associated with dementia? Electronic searches in two databases, MEDLINE and EMBASE, were conducted. Meta-analysis was performed with the collected data in order to find a statistically significant difference in clinical periodontal variables between the group of dementia and the cognitive normal controls. Forty-two articles remained for full text reading. Finally, seven articles met the inclusion criteria and only five studies provided data suitable for meta-analysis. Periodontal probing depth (PPD), bleeding on probing (BOP), gingival bleeding index (GBI), clinical attachment level (CAL), and plaque index (PI) were included as periodontal variables in the meta-analysis. Each variable revealed a statistically significant difference between the groups. In an attempt to reveal an overall difference between the periodontal variables in dementia patients and non-demented individuals, the chosen variables were transformed into units that resulted in a statistically significant overall difference (p < 0.00001). The current findings indicate that compared to systemically healthy individuals, demented patients show significantly worse clinical periodontal variables. However, further epidemiological studies including a high numbers of participants, the use of exact definitions both for dementia and chronic periodontitis and adjusted for cofounders is warranted. These findings appear to support the putative link between CP and dementia. Consequently, the need for periodontal screening and treatment of elderly demented people should be emphasized.

  11. The Effectiveness of Computer-Assisted Instruction to Teach Physical Examination to Students and Trainees in the Health Sciences Professions: A Systematic Review and Meta-Analysis.

    PubMed

    Tomesko, Jennifer; Touger-Decker, Riva; Dreker, Margaret; Zelig, Rena; Parrott, James Scott

    2017-01-01

    To explore knowledge and skill acquisition outcomes related to learning physical examination (PE) through computer-assisted instruction (CAI) compared with a face-to-face (F2F) approach. A systematic literature review and meta-analysis published between January 2001 and December 2016 was conducted. Databases searched included Medline, Cochrane, CINAHL, ERIC, Ebsco, Scopus, and Web of Science. Studies were synthesized by study design, intervention, and outcomes. Statistical analyses included DerSimonian-Laird random-effects model. In total, 7 studies were included in the review, and 5 in the meta-analysis. There were no statistically significant differences for knowledge (mean difference [MD] = 5.39, 95% confidence interval [CI]: -2.05 to 12.84) or skill acquisition (MD = 0.35, 95% CI: -5.30 to 6.01). The evidence does not suggest a strong consistent preference for either CAI or F2F instruction to teach students/trainees PE. Further research is needed to identify conditions which examine knowledge and skill acquisition outcomes that favor one mode of instruction over the other.

  12. New Trends in Mathematics Teaching, Volume III.

    ERIC Educational Resources Information Center

    United Nations Educational, Scientific, and Cultural Organization, Paris (France).

    Each of the ten chapters in this volume is intended to present an objective analysis of the trends of some important subtopic in mathematics education and each includes a bibliography for fuller study. The chapters cover primary school mathematics, algebra, geometry, probability and statistics, analysis, logic, applications of mathematics, methods…

  13. Analysis of Alaska transportation sectors to assess energy use and impacts of price shocks and climate change legislation.

    DOT National Transportation Integrated Search

    2013-04-01

    We analyzed the use of energy by Alaskas transportation sectors to assess the impact of sudden fuel prices changes. : We conducted three types of analysis: 1) Development of broad energy use statistics for each transportation sector, : including t...

  14. Interpreting Bivariate Regression Coefficients: Going beyond the Average

    ERIC Educational Resources Information Center

    Halcoussis, Dennis; Phillips, G. Michael

    2010-01-01

    Statistics, econometrics, investment analysis, and data analysis classes often review the calculation of several types of averages, including the arithmetic mean, geometric mean, harmonic mean, and various weighted averages. This note shows how each of these can be computed using a basic regression framework. By recognizing when a regression model…

  15. INVESTIGATION OF THE USE OF STATISTICS IN COUNSELING STUDENTS.

    ERIC Educational Resources Information Center

    HEWES, ROBERT F.

    THE OBJECTIVE WAS TO EMPLOY TECHNIQUES OF PROFILE ANALYSIS TO DEVELOP THE JOINT PROBABILITY OF SELECTING A SUITABLE SUBJECT MAJOR AND OF ASSURING TO A HIGH DEGREE GRADUATION FROM COLLEGE WITH THAT MAJOR. THE SAMPLE INCLUDED 1,197 MIT FRESHMEN STUDENTS IN 1952-53, AND THE VALIDATION GROUP INCLUDED 699 ENTRANTS IN 1954. DATA INCLUDED SECONDARY…

  16. Clustering, randomness and regularity in cloud fields. I - Theoretical considerations. II - Cumulus cloud fields

    NASA Technical Reports Server (NTRS)

    Weger, R. C.; Lee, J.; Zhu, Tianri; Welch, R. M.

    1992-01-01

    The current controversy existing in reference to the regularity vs. clustering in cloud fields is examined by means of analysis and simulation studies based upon nearest-neighbor cumulative distribution statistics. It is shown that the Poisson representation of random point processes is superior to pseudorandom-number-generated models and that pseudorandom-number-generated models bias the observed nearest-neighbor statistics towards regularity. Interpretation of this nearest-neighbor statistics is discussed for many cases of superpositions of clustering, randomness, and regularity. A detailed analysis is carried out of cumulus cloud field spatial distributions based upon Landsat, AVHRR, and Skylab data, showing that, when both large and small clouds are included in the cloud field distributions, the cloud field always has a strong clustering signal.

  17. Statistical Research on the Bioactivity of New Marine Natural Products Discovered during the 28 Years from 1985 to 2012

    PubMed Central

    Hu, Yiwen; Chen, Jiahui; Hu, Guping; Yu, Jianchen; Zhu, Xun; Lin, Yongcheng; Chen, Shengping; Yuan, Jie

    2015-01-01

    Every year, hundreds of new compounds are discovered from the metabolites of marine organisms. Finding new and useful compounds is one of the crucial drivers for this field of research. Here we describe the statistics of bioactive compounds discovered from marine organisms from 1985 to 2012. This work is based on our database, which contains information on more than 15,000 chemical substances including 4196 bioactive marine natural products. We performed a comprehensive statistical analysis to understand the characteristics of the novel bioactive compounds and detail temporal trends, chemical structures, species distribution, and research progress. We hope this meta-analysis will provide useful information for research into the bioactivity of marine natural products and drug development. PMID:25574736

  18. [Statistical validity of the Mexican Food Security Scale and the Latin American and Caribbean Food Security Scale].

    PubMed

    Villagómez-Ornelas, Paloma; Hernández-López, Pedro; Carrasco-Enríquez, Brenda; Barrios-Sánchez, Karina; Pérez-Escamilla, Rafael; Melgar-Quiñónez, Hugo

    2014-01-01

    This article validates the statistical consistency of two food security scales: the Mexican Food Security Scale (EMSA) and the Latin American and Caribbean Food Security Scale (ELCSA). Validity tests were conducted in order to verify that both scales were consistent instruments, conformed by independent, properly calibrated and adequately sorted items, arranged in a continuum of severity. The following tests were developed: sorting of items; Cronbach's alpha analysis; parallelism of prevalence curves; Rasch models; sensitivity analysis through mean differences' hypothesis test. The tests showed that both scales meet the required attributes and are robust statistical instruments for food security measurement. This is relevant given that the lack of access to food indicator, included in multidimensional poverty measurement in Mexico, is calculated with EMSA.

  19. Shift work, long working hours and preterm birth: a systematic review and meta-analysis.

    PubMed

    van Melick, M J G J; van Beukering, M D M; Mol, B W; Frings-Dresen, M H W; Hulshof, C T J

    2014-11-01

    Specific physical activities or working conditions are suspected for increasing the risk of preterm birth (PTB). The aim of this meta-analysis is to review and summarize the pre-existing evidence on the effect of shift work or long working hours on the risk of PTB. We conducted a systematic search in MEDLINE and EMBASE (1990-2013) for observational and intervention studies with original data. We only included articles that met our specific criteria for language, exposure, outcome, data collection and original data that were of at least of moderate quality. The data of the included studies were pooled. Eight high-quality studies and eight moderate-quality studies were included in the meta-analysis. In these studies, no clear or statistically significant relationship between shift work and PTB was found. The summary estimate OR for performing shift work during pregnancy and the risk of PTB were 1.04 (95% CI 0.90-1.20). For long working hours during pregnancy, the summary estimate OR was 1.25 (95% CI 1.01-1.54), indicating a marginally statistically significant relationship but an only slightly elevated risk. Although in many of the included studies a positive association between long working hours and PTB was seen this did reach only marginal statistical significance. In the studies included in this review, working in shifts or in night shifts during pregnancy was not significantly associated with an increased risk for PTB. For both risk factors, due to the lack of high-quality studies focusing on the risks per trimester, in particular the third trimester, a firm conclusion about an association cannot be stated.

  20. Trends in selected streamflow statistics at 19 long-term streamflow-gaging stations indicative of outflows from Texas to Arkansas, Louisiana, Galveston Bay, and the Gulf of Mexico, 1922-2009

    USGS Publications Warehouse

    Barbie, Dana L.; Wehmeyer, Loren L.

    2012-01-01

    Trends in selected streamflow statistics during 1922-2009 were evaluated at 19 long-term streamflow-gaging stations considered indicative of outflows from Texas to Arkansas, Louisiana, Galveston Bay, and the Gulf of Mexico. The U.S. Geological Survey, in cooperation with the Texas Water Development Board, evaluated streamflow data from streamflow-gaging stations with more than 50 years of record that were active as of 2009. The outflows into Arkansas and Louisiana were represented by 3 streamflow-gaging stations, and outflows into the Gulf of Mexico, including Galveston Bay, were represented by 16 streamflow-gaging stations. Monotonic trend analyses were done using the following three streamflow statistics generated from daily mean values of streamflow: (1) annual mean daily discharge, (2) annual maximum daily discharge, and (3) annual minimum daily discharge. The trend analyses were based on the nonparametric Kendall's Tau test, which is useful for the detection of monotonic upward or downward trends with time. A total of 69 trend analyses by Kendall's Tau were computed - 19 periods of streamflow multiplied by the 3 streamflow statistics plus 12 additional trend analyses because the periods of record for 2 streamflow-gaging stations were divided into periods representing pre- and post-reservoir impoundment. Unless otherwise described, each trend analysis used the entire period of record for each streamflow-gaging station. The monotonic trend analysis detected 11 statistically significant downward trends, 37 instances of no trend, and 21 statistically significant upward trends. One general region studied, which seemingly has relatively more upward trends for many of the streamflow statistics analyzed, includes the rivers and associated creeks and bayous to Galveston Bay in the Houston metropolitan area. Lastly, the most western river basins considered (the Nueces and Rio Grande) had statistically significant downward trends for many of the streamflow statistics analyzed.

  1. A retrospective analysis of hyperthermic intraperitoneal chemotherapy for gastric cancer with peritoneal metastasis

    PubMed Central

    Yuan, Meiqin; Wang, Zeng; Hu, Guinv; Yang, Yunshan; Lv, Wangxia; Lu, Fangxiao; Zhong, Haijun

    2016-01-01

    Peritoneal metastasis (PM) is a poor prognostic factor in patients with gastric cancer. The aim of this study was to evaluate the efficacy and safety of hyperthermic intraperitoneal chemotherapy (HIPEC) in patients with advanced gastric cancer with PM by retrospective analysis. A total of 54 gastric cancer patients with positive ascitic fluid cytology were included in this study: 23 patients were treated with systemic chemotherapy combined with HIPEC (HIPEC+ group) and 31 received systemic chemotherapy alone (HIPEC- group). The patients were divided into 4 categories according to the changes of ascites, namely disappear, decrease, stable and increase. The disappear + decrease rate in the HIPEC+ group was 82.60%, which was statistically significantly superior to that of the HIPEC- group (54.80%). The disappear + decrease + stable rate was 95.70% in the HIPEC+ group and 74.20% in the HIPEC- group, but the difference was not statistically significant. In 33 patients with complete survival data, including 12 from the HIPEC+ and 21 from the HIPEC- group, the median progression-free survival was 164 and 129 days, respectively, and the median overall survival (OS) was 494 and 223 days, respectively. In patients with ascites disappear/decrease/stable, the OS appeared to be better compared with that in patients with ascites increase, but the difference was not statistically significant. Further analysis revealed that patients with controlled disease (complete response + partial response + stable disease) may have a better OS compared with patients with progressive disease, with a statistically significant difference. The toxicities were well tolerated in both groups. Therefore, HIPEC was found to improve survival in advanced gastric cancer patients with PM, but the difference was not statistically significant, which may be attributed to the small number of cases. Further studies with larger samples are required to confirm our data. PMID:27446587

  2. A statistical simulation model for field testing of non-target organisms in environmental risk assessment of genetically modified plants.

    PubMed

    Goedhart, Paul W; van der Voet, Hilko; Baldacchino, Ferdinando; Arpaia, Salvatore

    2014-04-01

    Genetic modification of plants may result in unintended effects causing potentially adverse effects on the environment. A comparative safety assessment is therefore required by authorities, such as the European Food Safety Authority, in which the genetically modified plant is compared with its conventional counterpart. Part of the environmental risk assessment is a comparative field experiment in which the effect on non-target organisms is compared. Statistical analysis of such trials come in two flavors: difference testing and equivalence testing. It is important to know the statistical properties of these, for example, the power to detect environmental change of a given magnitude, before the start of an experiment. Such prospective power analysis can best be studied by means of a statistical simulation model. This paper describes a general framework for simulating data typically encountered in environmental risk assessment of genetically modified plants. The simulation model, available as Supplementary Material, can be used to generate count data having different statistical distributions possibly with excess-zeros. In addition the model employs completely randomized or randomized block experiments, can be used to simulate single or multiple trials across environments, enables genotype by environment interaction by adding random variety effects, and finally includes repeated measures in time following a constant, linear or quadratic pattern in time possibly with some form of autocorrelation. The model also allows to add a set of reference varieties to the GM plants and its comparator to assess the natural variation which can then be used to set limits of concern for equivalence testing. The different count distributions are described in some detail and some examples of how to use the simulation model to study various aspects, including a prospective power analysis, are provided.

  3. A statistical simulation model for field testing of non-target organisms in environmental risk assessment of genetically modified plants

    PubMed Central

    Goedhart, Paul W; van der Voet, Hilko; Baldacchino, Ferdinando; Arpaia, Salvatore

    2014-01-01

    Genetic modification of plants may result in unintended effects causing potentially adverse effects on the environment. A comparative safety assessment is therefore required by authorities, such as the European Food Safety Authority, in which the genetically modified plant is compared with its conventional counterpart. Part of the environmental risk assessment is a comparative field experiment in which the effect on non-target organisms is compared. Statistical analysis of such trials come in two flavors: difference testing and equivalence testing. It is important to know the statistical properties of these, for example, the power to detect environmental change of a given magnitude, before the start of an experiment. Such prospective power analysis can best be studied by means of a statistical simulation model. This paper describes a general framework for simulating data typically encountered in environmental risk assessment of genetically modified plants. The simulation model, available as Supplementary Material, can be used to generate count data having different statistical distributions possibly with excess-zeros. In addition the model employs completely randomized or randomized block experiments, can be used to simulate single or multiple trials across environments, enables genotype by environment interaction by adding random variety effects, and finally includes repeated measures in time following a constant, linear or quadratic pattern in time possibly with some form of autocorrelation. The model also allows to add a set of reference varieties to the GM plants and its comparator to assess the natural variation which can then be used to set limits of concern for equivalence testing. The different count distributions are described in some detail and some examples of how to use the simulation model to study various aspects, including a prospective power analysis, are provided. PMID:24834325

  4. The Spreadsheet in an Educational Setting. Microcomputing Working Paper Series F 84-4.

    ERIC Educational Resources Information Center

    Wozny, Lucy

    This overview of a specific spreadsheet, Microsoft's Multiplan for the Apple Macintosh microcomputer, emphasizes specific features that are important to the academic community, including the mathematical functions of algebra, trigonometry, and statistical analysis. Additional features are summarized, including data formats for both numerical and…

  5. Quantitative Microbial Risk Assessment Tutorial: Publishing a Microbial Density Time Series as a Txt File

    EPA Science Inventory

    A SARA Timeseries Utility supports analysis and management of time-varying environmental data including listing, graphing, computing statistics, computing meteorological data and saving in a WDM or text file. File formats supported include WDM, HSPF Binary (.hbn), USGS RDB, and T...

  6. Johnson Space Center's Risk and Reliability Analysis Group 2008 Annual Report

    NASA Technical Reports Server (NTRS)

    Valentine, Mark; Boyer, Roger; Cross, Bob; Hamlin, Teri; Roelant, Henk; Stewart, Mike; Bigler, Mark; Winter, Scott; Reistle, Bruce; Heydorn,Dick

    2009-01-01

    The Johnson Space Center (JSC) Safety & Mission Assurance (S&MA) Directorate s Risk and Reliability Analysis Group provides both mathematical and engineering analysis expertise in the areas of Probabilistic Risk Assessment (PRA), Reliability and Maintainability (R&M) analysis, and data collection and analysis. The fundamental goal of this group is to provide National Aeronautics and Space Administration (NASA) decisionmakers with the necessary information to make informed decisions when evaluating personnel, flight hardware, and public safety concerns associated with current operating systems as well as with any future systems. The Analysis Group includes a staff of statistical and reliability experts with valuable backgrounds in the statistical, reliability, and engineering fields. This group includes JSC S&MA Analysis Branch personnel as well as S&MA support services contractors, such as Science Applications International Corporation (SAIC) and SoHaR. The Analysis Group s experience base includes nuclear power (both commercial and navy), manufacturing, Department of Defense, chemical, and shipping industries, as well as significant aerospace experience specifically in the Shuttle, International Space Station (ISS), and Constellation Programs. The Analysis Group partners with project and program offices, other NASA centers, NASA contractors, and universities to provide additional resources or information to the group when performing various analysis tasks. The JSC S&MA Analysis Group is recognized as a leader in risk and reliability analysis within the NASA community. Therefore, the Analysis Group is in high demand to help the Space Shuttle Program (SSP) continue to fly safely, assist in designing the next generation spacecraft for the Constellation Program (CxP), and promote advanced analytical techniques. The Analysis Section s tasks include teaching classes and instituting personnel qualification processes to enhance the professional abilities of our analysts as well as performing major probabilistic assessments used to support flight rationale and help establish program requirements. During 2008, the Analysis Group performed more than 70 assessments. Although all these assessments were important, some were instrumental in the decisionmaking processes for the Shuttle and Constellation Programs. Two of the more significant tasks were the Space Transportation System (STS)-122 Low Level Cutoff PRA for the SSP and the Orion Pad Abort One (PA-1) PRA for the CxP. These two activities, along with the numerous other tasks the Analysis Group performed in 2008, are summarized in this report. This report also highlights several ongoing and upcoming efforts to provide crucial statistical and probabilistic assessments, such as the Extravehicular Activity (EVA) PRA for the Hubble Space Telescope service mission and the first fully integrated PRAs for the CxP's Lunar Sortie and ISS missions.

  7. System analysis for the Huntsville Operational Support Center distributed computer system

    NASA Technical Reports Server (NTRS)

    Ingels, E. M.

    1983-01-01

    A simulation model was developed and programmed in three languages BASIC, PASCAL, and SLAM. Two of the programs are included in this report, the BASIC and the PASCAL language programs. SLAM is not supported by NASA/MSFC facilities and hence was not included. The statistical comparison of simulations of the same HOSC system configurations are in good agreement and are in agreement with the operational statistics of HOSC that were obtained. Three variations of the most recent HOSC configuration was run and some conclusions drawn as to the system performance under these variations.

  8. Job Satisfaction DEOCS 4.1 Construct Validity Summary

    DTIC Science & Technology

    2017-08-01

    focuses more specifically on satisfaction with the job. Included is a review of the 4.0 description and items, followed by the proposed modifications to...the factor. The DEOCS 4.0 description provided for job satisfaction is “the perception of personal fulfillment in a specific vocation, and sense of...piloting items on the DEOCS; (4) examining the descriptive statistics, exploratory factor analysis results, and aggregation statistics; and (5

  9. The Statistical Analysis Techniques to Support the NGNP Fuel Performance Experiments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bihn T. Pham; Jeffrey J. Einerson

    2010-06-01

    This paper describes the development and application of statistical analysis techniques to support the AGR experimental program on NGNP fuel performance. The experiments conducted in the Idaho National Laboratory’s Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel/graphite temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the SAS-based NGNP Data Management and Analysis System (NDMAS) for automatedmore » processing and qualification of the AGR measured data. The NDMAS also stores daily neutronic (power) and thermal (heat transfer) code simulation results along with the measurement data, allowing for their combined use and comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the target quantity (fuel temperature) within a given range.« less

  10. Dietary Inflammatory Potential Score and Risk of Breast Cancer: Systematic Review and Meta-analysis.

    PubMed

    Zahedi, Hoda; Djalalinia, Shirin; Sadeghi, Omid; Asayesh, Hamid; Noroozi, Mehdi; Gorabi, Armita Mahdavi; Mohammadi, Rasool; Qorbani, Mostafa

    2018-02-07

    Several studies have been conducted on the relationship between dietary inflammatory potential (DIP) and breast cancer. However, the findings are conflicting. This systematic review and meta-analysis summarizes the findings on the association between DIP and the risk of breast cancer. We used relevant keywords and searched online international electronic databases, including PubMed and NLM Gateway (for Medline), Institute for Scientific Information (ISI), and Scopus for articles published through February 2017. All cross-sectional, case-control, and cohort studies were included in this meta-analysis. Meta-analysis was performed using the random effects meta-analysis method to address heterogeneity among studies. Findings were analyzed statistically. Nine studies were included in the present systematic review and meta-analysis. The total sample size of these studies was 296,102, and the number of participants varied from 1453 to 122,788. The random effects meta-analysis showed a positive and significant association between DIP and the risk of breast cancer (pooled odds ratio, 1.14; 95% confidence interval, 1.01-1.27). The pooled effect size was not statistically significant because of the type of studies, including cohort (pooled relative risk, 1.04; 95% confidence interval, 0.98-1.10) and case-control (pooled odds ratio, 1.63; 95% confidence interval, 0.89-2.37) studies. We found a significant and positive association between higher DIP score and risk of breast cancer. Modifying inflammatory characteristics of diet can substantially reduce the risk of breast cancer. Copyright © 2018 Elsevier Inc. All rights reserved.

  11. Machine learning to analyze images of shocked materials for precise and accurate measurements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dresselhaus-Cooper, Leora; Howard, Marylesa; Hock, Margaret C.

    A supervised machine learning algorithm, called locally adaptive discriminant analysis (LADA), has been developed to locate boundaries between identifiable image features that have varying intensities. LADA is an adaptation of image segmentation, which includes techniques that find the positions of image features (classes) using statistical intensity distributions for each class in the image. In order to place a pixel in the proper class, LADA considers the intensity at that pixel and the distribution of intensities in local (nearby) pixels. This paper presents the use of LADA to provide, with statistical uncertainties, the positions and shapes of features within ultrafast imagesmore » of shock waves. We demonstrate the ability to locate image features including crystals, density changes associated with shock waves, and material jetting caused by shock waves. This algorithm can analyze images that exhibit a wide range of physical phenomena because it does not rely on comparison to a model. LADA enables analysis of images from shock physics with statistical rigor independent of underlying models or simulations.« less

  12. Exploring the Link Between Streamflow Trends and Climate Change in Indiana, USA

    NASA Astrophysics Data System (ADS)

    Kumar, S.; Kam, J.; Thurner, K.; Merwade, V.

    2007-12-01

    Streamflow trends in Indiana are evaluated for 85 USGS streamflow gaging stations that have continuous unregulated streamflow records varying from 10 to 80 years. The trends are analyzed by using the non-parametric Mann-Kendall test with prior trend-free pre-whitening to remove serial correlation in the data. Bootstrap method is used to establish field significance of the results. Trends are computed for 12 streamflow statistics to include low-, medium- (median and mean flow), and high-flow conditions on annual and seasonal time step. The analysis is done for six study periods, ranging from 10 years to more than 65 years, all ending in 2003. The trends in annual average streamflow, for 50 years study period, are compared with annual average precipitation trends from 14 National Climatic Data Center (NCDC) stations in Indiana, that have 50 years of continuous daily record. The results show field significant positive trends in annual low and medium streamflow statistics at majority of gaging stations for study periods that include 40 or more years of records. In seasonal analysis, all flow statistics in summer and fall (low flow seasons), and only low flow statistics in winter and spring (high flow seasons) are showing positive trends. No field significant trends in annual and seasonal flow statistics are observed for study periods that include 25 or fewer years of records, except for northern Indiana where localized negative trends are observed in 10 and 15 years study periods. Further, stream flow trends are found to be highly correlated with precipitation trends on annual time step. No apparent climate change signal is observed in Indiana stream flow records.

  13. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics.

    PubMed

    Giambartolomei, Claudia; Vukcevic, Damjan; Schadt, Eric E; Franke, Lude; Hingorani, Aroon D; Wallace, Chris; Plagnol, Vincent

    2014-05-01

    Genetic association studies, in particular the genome-wide association study (GWAS) design, have provided a wealth of novel insights into the aetiology of a wide range of human diseases and traits, in particular cardiovascular diseases and lipid biomarkers. The next challenge consists of understanding the molecular basis of these associations. The integration of multiple association datasets, including gene expression datasets, can contribute to this goal. We have developed a novel statistical methodology to assess whether two association signals are consistent with a shared causal variant. An application is the integration of disease scans with expression quantitative trait locus (eQTL) studies, but any pair of GWAS datasets can be integrated in this framework. We demonstrate the value of the approach by re-analysing a gene expression dataset in 966 liver samples with a published meta-analysis of lipid traits including >100,000 individuals of European ancestry. Combining all lipid biomarkers, our re-analysis supported 26 out of 38 reported colocalisation results with eQTLs and identified 14 new colocalisation results, hence highlighting the value of a formal statistical test. In three cases of reported eQTL-lipid pairs (SYPL2, IFT172, TBKBP1) for which our analysis suggests that the eQTL pattern is not consistent with the lipid association, we identify alternative colocalisation results with SORT1, GCKR, and KPNB1, indicating that these genes are more likely to be causal in these genomic intervals. A key feature of the method is the ability to derive the output statistics from single SNP summary statistics, hence making it possible to perform systematic meta-analysis type comparisons across multiple GWAS datasets (implemented online at http://coloc.cs.ucl.ac.uk/coloc/). Our methodology provides information about candidate causal genes in associated intervals and has direct implications for the understanding of complex diseases as well as the design of drugs to target disease pathways.

  14. Single-row, double-row, and transosseous equivalent techniques for isolated supraspinatus tendon tears with minimal atrophy: A retrospective comparative outcome and radiographic analysis at minimum 2-year followup

    PubMed Central

    McCormick, Frank; Gupta, Anil; Bruce, Ben; Harris, Josh; Abrams, Geoff; Wilson, Hillary; Hussey, Kristen; Cole, Brian J.

    2014-01-01

    Purpose: The purpose of this study was to measure and compare the subjective, objective, and radiographic healing outcomes of single-row (SR), double-row (DR), and transosseous equivalent (TOE) suture techniques for arthroscopic rotator cuff repair. Materials and Methods: A retrospective comparative analysis of arthroscopic rotator cuff repairs by one surgeon from 2004 to 2010 at minimum 2-year followup was performed. Cohorts were matched for age, sex, and tear size. Subjective outcome variables included ASES, Constant, SST, UCLA, and SF-12 scores. Objective outcome variables included strength, active range of motion (ROM). Radiographic healing was assessed by magnetic resonance imaging (MRI). Statistical analysis was performed using analysis of variance (ANOVA), Mann — Whitney and Kruskal — Wallis tests with significance, and the Fisher exact probability test <0.05. Results: Sixty-three patients completed the study requirements (20 SR, 21 DR, 22 TOE). There was a clinically and statistically significant improvement in outcomes with all repair techniques (ASES mean improvement P = <0.0001). The mean final ASES scores were: SR 83; (SD 21.4); DR 87 (SD 18.2); TOE 87 (SD 13.2); (P = 0.73). There was a statistically significant improvement in strength for each repair technique (P < 0.001). There was no significant difference between techniques across all secondary outcome assessments: ASES improvement, Constant, SST, UCLA, SF-12, ROM, Strength, and MRI re-tear rates. There was a decrease in re-tear rates from single row (22%) to double-row (18%) to transosseous equivalent (11%); however, this difference was not statistically significant (P = 0.6). Conclusions: Compared to preoperatively, arthroscopic rotator cuff repair, using SR, DR, or TOE techniques, yielded a clinically and statistically significant improvement in subjective and objective outcomes at a minimum 2-year follow-up. Level of Evidence: Therapeutic level 3. PMID:24926159

  15. Conducting Simulation Studies in the R Programming Environment.

    PubMed

    Hallgren, Kevin A

    2013-10-12

    Simulation studies allow researchers to answer specific questions about data analysis, statistical power, and best-practices for obtaining accurate results in empirical research. Despite the benefits that simulation research can provide, many researchers are unfamiliar with available tools for conducting their own simulation studies. The use of simulation studies need not be restricted to researchers with advanced skills in statistics and computer programming, and such methods can be implemented by researchers with a variety of abilities and interests. The present paper provides an introduction to methods used for running simulation studies using the R statistical programming environment and is written for individuals with minimal experience running simulation studies or using R. The paper describes the rationale and benefits of using simulations and introduces R functions relevant for many simulation studies. Three examples illustrate different applications for simulation studies, including (a) the use of simulations to answer a novel question about statistical analysis, (b) the use of simulations to estimate statistical power, and (c) the use of simulations to obtain confidence intervals of parameter estimates through bootstrapping. Results and fully annotated syntax from these examples are provided.

  16. Phenotypic mapping of metabolic profiles using self-organizing maps of high-dimensional mass spectrometry data.

    PubMed

    Goodwin, Cody R; Sherrod, Stacy D; Marasco, Christina C; Bachmann, Brian O; Schramm-Sapyta, Nicole; Wikswo, John P; McLean, John A

    2014-07-01

    A metabolic system is composed of inherently interconnected metabolic precursors, intermediates, and products. The analysis of untargeted metabolomics data has conventionally been performed through the use of comparative statistics or multivariate statistical analysis-based approaches; however, each falls short in representing the related nature of metabolic perturbations. Herein, we describe a complementary method for the analysis of large metabolite inventories using a data-driven approach based upon a self-organizing map algorithm. This workflow allows for the unsupervised clustering, and subsequent prioritization of, correlated features through Gestalt comparisons of metabolic heat maps. We describe this methodology in detail, including a comparison to conventional metabolomics approaches, and demonstrate the application of this method to the analysis of the metabolic repercussions of prolonged cocaine exposure in rat sera profiles.

  17. Potential errors and misuse of statistics in studies on leakage in endodontics.

    PubMed

    Lucena, C; Lopez, J M; Pulgar, R; Abalos, C; Valderrama, M J

    2013-04-01

    To assess the quality of the statistical methodology used in studies of leakage in Endodontics, and to compare the results found using appropriate versus inappropriate inferential statistical methods. The search strategy used the descriptors 'root filling' 'microleakage', 'dye penetration', 'dye leakage', 'polymicrobial leakage' and 'fluid filtration' for the time interval 2001-2010 in journals within the categories 'Dentistry, Oral Surgery and Medicine' and 'Materials Science, Biomaterials' of the Journal Citation Report. All retrieved articles were reviewed to find potential pitfalls in statistical methodology that may be encountered during study design, data management or data analysis. The database included 209 papers. In all the studies reviewed, the statistical methods used were appropriate for the category attributed to the outcome variable, but in 41% of the cases, the chi-square test or parametric methods were inappropriately selected subsequently. In 2% of the papers, no statistical test was used. In 99% of cases, a statistically 'significant' or 'not significant' effect was reported as a main finding, whilst only 1% also presented an estimation of the magnitude of the effect. When the appropriate statistical methods were applied in the studies with originally inappropriate data analysis, the conclusions changed in 19% of the cases. Statistical deficiencies in leakage studies may affect their results and interpretation and might be one of the reasons for the poor agreement amongst the reported findings. Therefore, more effort should be made to standardize statistical methodology. © 2012 International Endodontic Journal.

  18. eSACP - a new Nordic initiative towards developing statistical climate services

    NASA Astrophysics Data System (ADS)

    Thorarinsdottir, Thordis; Thejll, Peter; Drews, Martin; Guttorp, Peter; Venälainen, Ari; Uotila, Petteri; Benestad, Rasmus; Mesquita, Michel d. S.; Madsen, Henrik; Fox Maule, Cathrine

    2015-04-01

    The Nordic research council NordForsk has recently announced its support for a new 3-year research initiative on "statistical analysis of climate projections" (eSACP). eSACP will focus on developing e-science tools and services based on statistical analysis of climate projections for the purpose of helping decision-makers and planners in the face of expected future challenges in regional climate change. The motivation behind the project is the growing recognition in our society that forecasts of future climate change is associated with various sources of uncertainty, and that any long-term planning and decision-making dependent on a changing climate must account for this. At the same time there is an obvious gap between scientists from different fields and between practitioners in terms of understanding how climate information relates to different parts of the "uncertainty cascade". In eSACP we will develop generic e-science tools and statistical climate services to facilitate the use of climate projections by decision-makers and scientists from all fields for climate impact analyses and for the development of robust adaptation strategies, which properly (in a statistical sense) account for the inherent uncertainty. The new tool will be publically available and include functionality to utilize the extensive and dynamically growing repositories of data and use state-of-the-art statistical techniques to quantify the uncertainty and innovative approaches to visualize the results. Such a tool will not only be valuable for future assessments and underpin the development of dedicated climate services, but will also assist the scientific community in making more clearly its case on the consequences of our changing climate to policy makers and the general public. The eSACP project is led by Thordis Thorarinsdottir, Norwegian Computing Center, and also includes the Finnish Meteorological Institute, the Norwegian Meteorological Institute, the Technical University of Denmark and the Bjerknes Centre for Climate Research, Norway. This poster will present details of focus areas in the project and show some examples of the expected analysis tools.

  19. An analysis of science versus pseudoscience

    NASA Astrophysics Data System (ADS)

    Hooten, James T.

    2011-12-01

    This quantitative study identified distinctive features in archival datasets commissioned by the National Science Foundation (NSF) for Science and Engineering Indicators reports. The dependent variables included education level, and scores for science fact knowledge, science process knowledge, and pseudoscience beliefs. The dependent variables were aggregated into nine NSF-defined geographic regions and examined for the years 2004 and 2006. The variables were also examined over all years available in the dataset. Descriptive statistics were determined and tests for normality and homogeneity of variances were performed using Statistical Package for the Social Sciences. Analysis of Variance was used to test for statistically significant differences between the nine geographic regions for each of the four dependent variables. Statistical significance of 0.05 was used. Tukey post-hoc analysis was used to compute practical significance of differences between regions. Post-hoc power analysis using G*Power was used to calculate the probability of Type II errors. Tests for correlations across all years of the dependent variables were also performed. Pearson's r was used to indicate the strength of the relationship between the dependent variables. Small to medium differences in science literacy and education level were observed between many of the nine U.S. geographic regions. The most significant differences occurred when the West South Central region was compared to the New England and the Pacific regions. Belief in pseudoscience appeared to be distributed evenly across all U.S. geographic regions. Education level was a strong indicator of science literacy regardless of a respondent's region of residence. Recommendations for further study include more in-depth investigation to uncover the nature of the relationship between education level and belief in pseudoscience.

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Apte, A; Veeraraghavan, H; Oh, J

    Purpose: To present an open source and free platform to facilitate radiomics research — The “Radiomics toolbox” in CERR. Method: There is scarcity of open source tools that support end-to-end modeling of image features to predict patient outcomes. The “Radiomics toolbox” strives to fill the need for such a software platform. The platform supports (1) import of various kinds of image modalities like CT, PET, MR, SPECT, US. (2) Contouring tools to delineate structures of interest. (3) Extraction and storage of image based features like 1st order statistics, gray-scale co-occurrence and zonesize matrix based texture features and shape features andmore » (4) Statistical Analysis. Statistical analysis of the extracted features is supported with basic functionality that includes univariate correlations, Kaplan-Meir curves and advanced functionality that includes feature reduction and multivariate modeling. The graphical user interface and the data management are performed with Matlab for the ease of development and readability of code and features for wide audience. Open-source software developed with other programming languages is integrated to enhance various components of this toolbox. For example: Java-based DCM4CHE for import of DICOM, R for statistical analysis. Results: The Radiomics toolbox will be distributed as an open source, GNU copyrighted software. The toolbox was prototyped for modeling Oropharyngeal PET dataset at MSKCC. The analysis will be presented in a separate paper. Conclusion: The Radiomics Toolbox provides an extensible platform for extracting and modeling image features. To emphasize new uses of CERR for radiomics and image-based research, we have changed the name from the “Computational Environment for Radiotherapy Research” to the “Computational Environment for Radiological Research”.« less

  1. Ground-Based Telescope Parametric Cost Model

    NASA Technical Reports Server (NTRS)

    Stahl, H. Philip; Rowell, Ginger Holmes

    2004-01-01

    A parametric cost model for ground-based telescopes is developed using multi-variable statistical analysis, The model includes both engineering and performance parameters. While diameter continues to be the dominant cost driver, other significant factors include primary mirror radius of curvature and diffraction limited wavelength. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e.. multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter are derived. This analysis indicates that recent mirror technology advances have indeed reduced the historical telescope cost curve.

  2. Assigning statistical significance to proteotypic peptides via database searches

    PubMed Central

    Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo

    2011-01-01

    Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating systematically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs), in the proteotypic peptide libraries searched. To address both issues simultaneously, we have extended RAId’s knowledge database to include proteotypic information, utilized RAId’s statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId’s programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those occurred within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches. PMID:21055489

  3. Parasites as valuable stock markers for fisheries in Australasia, East Asia and the Pacific Islands.

    PubMed

    Lester, R J G; Moore, B R

    2015-01-01

    Over 30 studies in Australasia, East Asia and the Pacific Islands region have collected and analysed parasite data to determine the ranges of individual fish, many leading to conclusions about stock delineation. Parasites used as biological tags have included both those known to have long residence times in the fish and those thought to be relatively transient. In many cases the parasitological conclusions have been supported by other methods especially analysis of the chemical constituents of otoliths, and to a lesser extent, genetic data. In analysing parasite data, authors have applied multiple different statistical methodologies, including summary statistics, and univariate and multivariate approaches. Recently, a growing number of researchers have found non-parametric methods, such as analysis of similarities and cluster analysis, to be valuable. Future studies into the residence times, life cycles and geographical distributions of parasites together with more robust analytical methods will yield much important information to clarify stock structures in the area.

  4. Effects of Instructional Design with Mental Model Analysis on Learning.

    ERIC Educational Resources Information Center

    Hong, Eunsook

    This paper presents a model for systematic instructional design that includes mental model analysis together with the procedures used in developing computer-based instructional materials in the area of statistical hypothesis testing. The instructional design model is based on the premise that the objective for learning is to achieve expert-like…

  5. An Investigation of Civilians Preparedness to Compete with Individuals with Military Experience for Army Board Select Acquisition Positions

    DTIC Science & Technology

    2017-05-25

    37 Research Design ... research employed a mixed research methodology – quantitative with descriptive statistical analysis and qualitative with a thematic analysis approach...mixed research methodology – quantitative and qualitative, using interviews to collect the data. The interviews included demographic and open-ended

  6. Conjoint Analysis: A Study of the Effects of Using Person Variables.

    ERIC Educational Resources Information Center

    Fraas, John W.; Newman, Isadore

    Three statistical techniques--conjoint analysis, a multiple linear regression model, and a multiple linear regression model with a surrogate person variable--were used to estimate the relative importance of five university attributes for students in the process of selecting a college. The five attributes include: availability and variety of…

  7. The Effects of Ability Grouping: A Meta-Analysis of Research Findings.

    ERIC Educational Resources Information Center

    Noland, Theresa Koontz; Taylor, Bob L.

    The study reported in this paper quantitatively integrated the recent research findings on ability grouping in order to generalize about these effects on student achievement and student self-concept. Meta-analysis was used to statistically integrate the empirical data. The relationships among various experimental variables including grade level,…

  8. Principal Component Analysis: Resources for an Essential Application of Linear Algebra

    ERIC Educational Resources Information Center

    Pankavich, Stephen; Swanson, Rebecca

    2015-01-01

    Principal Component Analysis (PCA) is a highly useful topic within an introductory Linear Algebra course, especially since it can be used to incorporate a number of applied projects. This method represents an essential application and extension of the Spectral Theorem and is commonly used within a variety of fields, including statistics,…

  9. A Program Evaluation of a Summer Research Training Institute for American Indian and Alaska Native Health Professionals

    ERIC Educational Resources Information Center

    Zaback, Tosha; Becker, Thomas M.; Dignan, Mark B.; Lambert, William E.

    2010-01-01

    In this article, the authors describe a unique summer program to train American Indian/Alaska Native (AI/AN) health professionals in a variety of health research-related skills, including epidemiology, data management, statistical analysis, program evaluation, cost-benefit analysis, community-based participatory research, grant writing, and…

  10. A Meta-Analysis: The Relationship between Father Involvement and Student Academic Achievement

    ERIC Educational Resources Information Center

    Jeynes, William H.

    2015-01-01

    A meta-analysis was undertaken, including 66 studies, to determine the relationship between father involvement and the educational outcomes of urban school children. Statistical analyses were done to determine the overall impact and specific components of father involvement. The possible differing effects of paternal involvement by race were also…

  11. A Meta-Analysis of Writing Instruction for Students in the Elementary Grades

    ERIC Educational Resources Information Center

    Graham, Steve; McKeown, Debra; Kiuhara, Sharlene; Harris, Karen R.

    2012-01-01

    In an effort to identify effective instructional practices for teaching writing to elementary grade students, we conducted a meta-analysis of the writing intervention literature, focusing our efforts on true and quasi-experiments. We located 115 documents that included the statistics for computing an effect size (ES). We calculated an average…

  12. QPROT: Statistical method for testing differential expression using protein-level intensity data in label-free quantitative proteomics.

    PubMed

    Choi, Hyungwon; Kim, Sinae; Fermin, Damian; Tsou, Chih-Chiang; Nesvizhskii, Alexey I

    2015-11-03

    We introduce QPROT, a statistical framework and computational tool for differential protein expression analysis using protein intensity data. QPROT is an extension of the QSPEC suite, originally developed for spectral count data, adapted for the analysis using continuously measured protein-level intensity data. QPROT offers a new intensity normalization procedure and model-based differential expression analysis, both of which account for missing data. Determination of differential expression of each protein is based on the standardized Z-statistic based on the posterior distribution of the log fold change parameter, guided by the false discovery rate estimated by a well-known Empirical Bayes method. We evaluated the classification performance of QPROT using the quantification calibration data from the clinical proteomic technology assessment for cancer (CPTAC) study and a recently published Escherichia coli benchmark dataset, with evaluation of FDR accuracy in the latter. QPROT is a statistical framework with computational software tool for comparative quantitative proteomics analysis. It features various extensions of QSPEC method originally built for spectral count data analysis, including probabilistic treatment of missing values in protein intensity data. With the increasing popularity of label-free quantitative proteomics data, the proposed method and accompanying software suite will be immediately useful for many proteomics laboratories. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Lower back pain and absenteeism among professional public transport drivers.

    PubMed

    Kresal, Friderika; Roblek, Vasja; Jerman, Andrej; Meško, Maja

    2015-01-01

    Drivers in public transport are subjected to lower back pain. The reason for the pain is associated with the characteristics of the physical position imposed on the worker while performing the job. Lower back pain is the main cause of absenteeism among drivers. The present study includes 145 public transport drivers employed as professional drivers for an average of 14.14 years. Analysis of the data obtained in the study includes the basic descriptive statistics, χ(2) test and multiple regression analysis. Analysis of the incidence of lower back pain showed that the majority of our sample population suffered from pain in the lower back. We found that there are no statistically significant differences between the groups formed by the length of service as a professional driver and incidence of lower back pain; we were also interested in whether or not the risk factors of lower back pain affects the absenteeism of city bus drivers. Analysis of the data has shown that the risk factors of pain in the lower part of the spine do affect the absenteeism of city bus drivers.

  14. Critical Analysis of Primary Literature in a Master’s-Level Class: Effects on Self-Efficacy and Science-Process Skills

    PubMed Central

    Abdullah, Christopher; Parris, Julian; Lie, Richard; Guzdar, Amy; Tour, Ella

    2015-01-01

    The ability to think analytically and creatively is crucial for success in the modern workforce, particularly for graduate students, who often aim to become physicians or researchers. Analysis of the primary literature provides an excellent opportunity to practice these skills. We describe a course that includes a structured analysis of four research papers from diverse fields of biology and group exercises in proposing experiments that would follow up on these papers. To facilitate a critical approach to primary literature, we included a paper with questionable data interpretation and two papers investigating the same biological question yet reaching opposite conclusions. We report a significant increase in students’ self-efficacy in analyzing data from research papers, evaluating authors’ conclusions, and designing experiments. Using our science-process skills test, we observe a statistically significant increase in students’ ability to propose an experiment that matches the goal of investigation. We also detect gains in interpretation of controls and quantitative analysis of data. No statistically significant changes were observed in questions that tested the skills of interpretation, inference, and evaluation. PMID:26250564

  15. Selected papers in the hydrologic sciences, 1986

    USGS Publications Warehouse

    Subitzky, Seymour

    1987-01-01

    Water-quality data from long-term (24 years), fixed- station monitoring at the Cape Fear River at Lock 1 near Kelly, N.C., and various measures of basin development are correlated. Subbasin population, number of acres of cropland in the subbasin, number of people employed in manufacturing, and tons of fertilizer applied in the basin are considered as measures of basinwide development activity. Linear correlations show statistically significant posi- tive relations between both population and manufacturing activity and most of the dissolved constituents considered. Negative correlations were found between the acres of harvested cropland and most of the water-quality measures. The amount of fertilizer sold in the subbasin was not statistically related to the water-quality measures considered in this report. The statistical analysis was limited to several commonly used measures of water quality including specific conductance, pH, dissolved solids, several major dissolved ions, and a few nutrients. The major dissolved ions included in the analysis were calcium, sodium, potassium, magnesium, chloride, sulfate, silica, bicarbonate, and fluoride. The nutrients included were dissolved nitrite plus nitrate nitrogen, dissolved ammonia nitrogen, total nitrogen, dissolved phosphates, and total phosphorus. For the chemicals evaluated, manufacturing and population sources are more closely associated with water quality in the Cape Fear River at Lock 1 than are agricultural variables.

  16. Statistical Surrogate Modeling of Atmospheric Dispersion Events Using Bayesian Adaptive Splines

    NASA Astrophysics Data System (ADS)

    Francom, D.; Sansó, B.; Bulaevskaya, V.; Lucas, D. D.

    2016-12-01

    Uncertainty in the inputs of complex computer models, including atmospheric dispersion and transport codes, is often assessed via statistical surrogate models. Surrogate models are computationally efficient statistical approximations of expensive computer models that enable uncertainty analysis. We introduce Bayesian adaptive spline methods for producing surrogate models that capture the major spatiotemporal patterns of the parent model, while satisfying all the necessities of flexibility, accuracy and computational feasibility. We present novel methodological and computational approaches motivated by a controlled atmospheric tracer release experiment conducted at the Diablo Canyon nuclear power plant in California. Traditional methods for building statistical surrogate models often do not scale well to experiments with large amounts of data. Our approach is well suited to experiments involving large numbers of model inputs, large numbers of simulations, and functional output for each simulation. Our approach allows us to perform global sensitivity analysis with ease. We also present an approach to calibration of simulators using field data.

  17. Statistical analysis of NaOH pretreatment effects on sweet sorghum bagasse characteristics

    NASA Astrophysics Data System (ADS)

    Putri, Ary Mauliva Hada; Wahyuni, Eka Tri; Sudiyani, Yanni

    2017-01-01

    We analyze the behavior of sweet sorghum bagasse characteristics before and after NaOH pretreatments by statistical analysis. These characteristics include the percentages of lignocellulosic materials and the degree of crystallinity. We use the chi-square method to get the values of fitted parameters, and then deploy student's t-test to check whether they are significantly different from zero at 99.73% confidence level (C.L.). We obtain, in the cases of hemicellulose and lignin, that their percentages after pretreatment decrease statistically. On the other hand, crystallinity does not possess similar behavior as the data proves that all fitted parameters in this case might be consistent with zero. Our statistical result is then cross examined with the observations from X-ray diffraction (XRD) and Fourier Transform Infrared (FTIR) Spectroscopy, showing pretty good agreement. This result may indicate that the 10% NaOH pretreatment might not be sufficient in changing the crystallinity index of the sweet sorghum bagasse.

  18. Hybrid statistics-simulations based method for atom-counting from ADF STEM images.

    PubMed

    De Wael, Annelies; De Backer, Annick; Jones, Lewys; Nellist, Peter D; Van Aert, Sandra

    2017-06-01

    A hybrid statistics-simulations based method for atom-counting from annular dark field scanning transmission electron microscopy (ADF STEM) images of monotype crystalline nanostructures is presented. Different atom-counting methods already exist for model-like systems. However, the increasing relevance of radiation damage in the study of nanostructures demands a method that allows atom-counting from low dose images with a low signal-to-noise ratio. Therefore, the hybrid method directly includes prior knowledge from image simulations into the existing statistics-based method for atom-counting, and accounts in this manner for possible discrepancies between actual and simulated experimental conditions. It is shown by means of simulations and experiments that this hybrid method outperforms the statistics-based method, especially for low electron doses and small nanoparticles. The analysis of a simulated low dose image of a small nanoparticle suggests that this method allows for far more reliable quantitative analysis of beam-sensitive materials. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Cosmology constraints from shear peak statistics in Dark Energy Survey Science Verification data

    NASA Astrophysics Data System (ADS)

    Kacprzak, T.; Kirk, D.; Friedrich, O.; Amara, A.; Refregier, A.; Marian, L.; Dietrich, J. P.; Suchyta, E.; Aleksić, J.; Bacon, D.; Becker, M. R.; Bonnett, C.; Bridle, S. L.; Chang, C.; Eifler, T. F.; Hartley, W. G.; Huff, E. M.; Krause, E.; MacCrann, N.; Melchior, P.; Nicola, A.; Samuroff, S.; Sheldon, E.; Troxel, M. A.; Weller, J.; Zuntz, J.; Abbott, T. M. C.; Abdalla, F. B.; Armstrong, R.; Benoit-Lévy, A.; Bernstein, G. M.; Bernstein, R. A.; Bertin, E.; Brooks, D.; Burke, D. L.; Carnero Rosell, A.; Carrasco Kind, M.; Carretero, J.; Castander, F. J.; Crocce, M.; D'Andrea, C. B.; da Costa, L. N.; Desai, S.; Diehl, H. T.; Evrard, A. E.; Neto, A. Fausti; Flaugher, B.; Fosalba, P.; Frieman, J.; Gerdes, D. W.; Goldstein, D. A.; Gruen, D.; Gruendl, R. A.; Gutierrez, G.; Honscheid, K.; Jain, B.; James, D. J.; Jarvis, M.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Lima, M.; March, M.; Marshall, J. L.; Martini, P.; Miller, C. J.; Miquel, R.; Mohr, J. J.; Nichol, R. C.; Nord, B.; Plazas, A. A.; Romer, A. K.; Roodman, A.; Rykoff, E. S.; Sanchez, E.; Scarpine, V.; Schubnell, M.; Sevilla-Noarbe, I.; Smith, R. C.; Soares-Santos, M.; Sobreira, F.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Vikram, V.; Walker, A. R.; Zhang, Y.; DES Collaboration

    2016-12-01

    Shear peak statistics has gained a lot of attention recently as a practical alternative to the two-point statistics for constraining cosmological parameters. We perform a shear peak statistics analysis of the Dark Energy Survey (DES) Science Verification (SV) data, using weak gravitational lensing measurements from a 139 deg2 field. We measure the abundance of peaks identified in aperture mass maps, as a function of their signal-to-noise ratio, in the signal-to-noise range 04 would require significant corrections, which is why we do not include them in our analysis. We compare our results to the cosmological constraints from the two-point analysis on the SV field and find them to be in good agreement in both the central value and its uncertainty. We discuss prospects for future peak statistics analysis with upcoming DES data.

  20. Statistical analysis plan for the Alveolar Recruitment for Acute Respiratory Distress Syndrome Trial (ART). A randomized controlled trial

    PubMed Central

    Damiani, Lucas Petri; Berwanger, Otavio; Paisani, Denise; Laranjeira, Ligia Nasi; Suzumura, Erica Aranha; Amato, Marcelo Britto Passos; Carvalho, Carlos Roberto Ribeiro; Cavalcanti, Alexandre Biasi

    2017-01-01

    Background The Alveolar Recruitment for Acute Respiratory Distress Syndrome Trial (ART) is an international multicenter randomized pragmatic controlled trial with allocation concealment involving 120 intensive care units in Brazil, Argentina, Colombia, Italy, Poland, Portugal, Malaysia, Spain, and Uruguay. The primary objective of ART is to determine whether maximum stepwise alveolar recruitment associated with PEEP titration, adjusted according to the static compliance of the respiratory system (ART strategy), is able to increase 28-day survival in patients with acute respiratory distress syndrome compared to conventional treatment (ARDSNet strategy). Objective To describe the data management process and statistical analysis plan. Methods The statistical analysis plan was designed by the trial executive committee and reviewed and approved by the trial steering committee. We provide an overview of the trial design with a special focus on describing the primary (28-day survival) and secondary outcomes. We describe our data management process, data monitoring committee, interim analyses, and sample size calculation. We describe our planned statistical analyses for primary and secondary outcomes as well as pre-specified subgroup analyses. We also provide details for presenting results, including mock tables for baseline characteristics, adherence to the protocol and effect on clinical outcomes. Conclusion According to best trial practice, we report our statistical analysis plan and data management plan prior to locking the database and beginning analyses. We anticipate that this document will prevent analysis bias and enhance the utility of the reported results. Trial registration ClinicalTrials.gov number, NCT01374022. PMID:28977255

  1. An Overview of Interrater Agreement on Likert Scales for Researchers and Practitioners

    PubMed Central

    O'Neill, Thomas A.

    2017-01-01

    Applications of interrater agreement (IRA) statistics for Likert scales are plentiful in research and practice. IRA may be implicated in job analysis, performance appraisal, panel interviews, and any other approach to gathering systematic observations. Any rating system involving subject-matter experts can also benefit from IRA as a measure of consensus. Further, IRA is fundamental to aggregation in multilevel research, which is becoming increasingly common in order to address nesting. Although, several technical descriptions of a few specific IRA statistics exist, this paper aims to provide a tractable orientation to common IRA indices to support application. The introductory overview is written with the intent of facilitating contrasts among IRA statistics by critically reviewing equations, interpretations, strengths, and weaknesses. Statistics considered include rwg, rwg*, r′wg, rwg(p), average deviation (AD), awg, standard deviation (Swg), and the coefficient of variation (CVwg). Equations support quick calculation and contrasting of different agreement indices. The article also includes a “quick reference” table and three figures in order to help readers identify how IRA statistics differ and how interpretations of IRA will depend strongly on the statistic employed. A brief consideration of recommended practices involving statistical and practical cutoff standards is presented, and conclusions are offered in light of the current literature. PMID:28553257

  2. Statistical Analysis of Time-Series from Monitoring of Active Volcanic Vents

    NASA Astrophysics Data System (ADS)

    Lachowycz, S.; Cosma, I.; Pyle, D. M.; Mather, T. A.; Rodgers, M.; Varley, N. R.

    2016-12-01

    Despite recent advances in the collection and analysis of time-series from volcano monitoring, and the resulting insights into volcanic processes, challenges remain in forecasting and interpreting activity from near real-time analysis of monitoring data. Statistical methods have potential to characterise the underlying structure and facilitate intercomparison of these time-series, and so inform interpretation of volcanic activity. We explore the utility of multiple statistical techniques that could be widely applicable to monitoring data, including Shannon entropy and detrended fluctuation analysis, by their application to various data streams from volcanic vents during periods of temporally variable activity. Each technique reveals changes through time in the structure of some of the data that were not apparent from conventional analysis. For example, we calculate the Shannon entropy (a measure of the randomness of a signal) of time-series from the recent dome-forming eruptions of Volcán de Colima (Mexico) and Soufrière Hills (Montserrat). The entropy of real-time seismic measurements and the count rate of certain volcano-seismic event types from both volcanoes is found to be temporally variable, with these data generally having higher entropy during periods of lava effusion and/or larger explosions. In some instances, the entropy shifts prior to or coincident with changes in seismic or eruptive activity, some of which were not clearly recognised by real-time monitoring. Comparison with other statistics demonstrates the sensitivity of the entropy to the data distribution, but that it is distinct from conventional statistical measures such as coefficient of variation. We conclude that each analysis technique examined could provide valuable insights for interpretation of diverse monitoring time-series.

  3. Study design and statistical analysis of data in human population studies with the micronucleus assay.

    PubMed

    Ceppi, Marcello; Gallo, Fabio; Bonassi, Stefano

    2011-01-01

    The most common study design performed in population studies based on the micronucleus (MN) assay, is the cross-sectional study, which is largely performed to evaluate the DNA damaging effects of exposure to genotoxic agents in the workplace, in the environment, as well as from diet or lifestyle factors. Sample size is still a critical issue in the design of MN studies since most recent studies considering gene-environment interaction, often require a sample size of several hundred subjects, which is in many cases difficult to achieve. The control of confounding is another major threat to the validity of causal inference. The most popular confounders considered in population studies using MN are age, gender and smoking habit. Extensive attention is given to the assessment of effect modification, given the increasing inclusion of biomarkers of genetic susceptibility in the study design. Selected issues concerning the statistical treatment of data have been addressed in this mini-review, starting from data description, which is a critical step of statistical analysis, since it allows to detect possible errors in the dataset to be analysed and to check the validity of assumptions required for more complex analyses. Basic issues dealing with statistical analysis of biomarkers are extensively evaluated, including methods to explore the dose-response relationship among two continuous variables and inferential analysis. A critical approach to the use of parametric and non-parametric methods is presented, before addressing the issue of most suitable multivariate models to fit MN data. In the last decade, the quality of statistical analysis of MN data has certainly evolved, although even nowadays only a small number of studies apply the Poisson model, which is the most suitable method for the analysis of MN data.

  4. New Optical Transforms For Statistical Image Recognition

    NASA Astrophysics Data System (ADS)

    Lee, Sing H.

    1983-12-01

    In optical implementation of statistical image recognition, new optical transforms on large images for real-time recognition are of special interest. Several important linear transformations frequently used in statistical pattern recognition have now been optically implemented, including the Karhunen-Loeve transform (KLT), the Fukunaga-Koontz transform (FKT) and the least-squares linear mapping technique (LSLMT).1-3 The KLT performs principle components analysis on one class of patterns for feature extraction. The FKT performs feature extraction for separating two classes of patterns. The LSLMT separates multiple classes of patterns by maximizing the interclass differences and minimizing the intraclass variations.

  5. Estimation of elastic moduli of graphene monolayer in lattice statics approach at nonzero temperature

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zubko, I. Yu., E-mail: zoubko@list.ru; Kochurov, V. I.

    2015-10-27

    For the aim of the crystal temperature control the computational-statistical approach to studying thermo-mechanical properties for finite sized crystals is presented. The approach is based on the combination of the high-performance computational techniques and statistical analysis of the crystal response on external thermo-mechanical actions for specimens with the statistically small amount of atoms (for instance, nanoparticles). The heat motion of atoms is imitated in the statics approach by including the independent degrees of freedom for atoms connected with their oscillations. We obtained that under heating, graphene material response is nonsymmetric.

  6. Meta-analysis of haplotype-association studies: comparison of methods and empirical evaluation of the literature

    PubMed Central

    2011-01-01

    Background Meta-analysis is a popular methodology in several fields of medical research, including genetic association studies. However, the methods used for meta-analysis of association studies that report haplotypes have not been studied in detail. In this work, methods for performing meta-analysis of haplotype association studies are summarized, compared and presented in a unified framework along with an empirical evaluation of the literature. Results We present multivariate methods that use summary-based data as well as methods that use binary and count data in a generalized linear mixed model framework (logistic regression, multinomial regression and Poisson regression). The methods presented here avoid the inflation of the type I error rate that could be the result of the traditional approach of comparing a haplotype against the remaining ones, whereas, they can be fitted using standard software. Moreover, formal global tests are presented for assessing the statistical significance of the overall association. Although the methods presented here assume that the haplotypes are directly observed, they can be easily extended to allow for such an uncertainty by weighting the haplotypes by their probability. Conclusions An empirical evaluation of the published literature and a comparison against the meta-analyses that use single nucleotide polymorphisms, suggests that the studies reporting meta-analysis of haplotypes contain approximately half of the included studies and produce significant results twice more often. We show that this excess of statistically significant results, stems from the sub-optimal method of analysis used and, in approximately half of the cases, the statistical significance is refuted if the data are properly re-analyzed. Illustrative examples of code are given in Stata and it is anticipated that the methods developed in this work will be widely applied in the meta-analysis of haplotype association studies. PMID:21247440

  7. Mathematics and Statistics Research Department progress report, period ending June 30, 1982

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Denson, M.V.; Funderlic, R.E.; Gosslee, D.G.

    1982-08-01

    This report is the twenty-fifth in the series of progress reports of the Mathematics and Statistics Research Department of the Computer Sciences Division, Union Carbide Corporation Nuclear Division (UCC-ND). Part A records research progress in analysis of large data sets, biometrics research, computational statistics, materials science applications, moving boundary problems, numerical linear algebra, and risk analysis. Collaboration and consulting with others throughout the UCC-ND complex are recorded in Part B. Included are sections on biology, chemistry, energy, engineering, environmental sciences, health and safety, materials science, safeguards, surveys, and the waste storage program. Part C summarizes the various educational activities inmore » which the staff was engaged. Part D lists the presentations of research results, and Part E records the staff's other professional activities during the report period.« less

  8. Time Series Expression Analyses Using RNA-seq: A Statistical Approach

    PubMed Central

    Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P.

    2013-01-01

    RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis. PMID:23586021

  9. Statistical Optimality in Multipartite Ranking and Ordinal Regression.

    PubMed

    Uematsu, Kazuki; Lee, Yoonkyung

    2015-05-01

    Statistical optimality in multipartite ranking is investigated as an extension of bipartite ranking. We consider the optimality of ranking algorithms through minimization of the theoretical risk which combines pairwise ranking errors of ordinal categories with differential ranking costs. The extension shows that for a certain class of convex loss functions including exponential loss, the optimal ranking function can be represented as a ratio of weighted conditional probability of upper categories to lower categories, where the weights are given by the misranking costs. This result also bridges traditional ranking methods such as proportional odds model in statistics with various ranking algorithms in machine learning. Further, the analysis of multipartite ranking with different costs provides a new perspective on non-smooth list-wise ranking measures such as the discounted cumulative gain and preference learning. We illustrate our findings with simulation study and real data analysis.

  10. Detection of reflecting surfaces by a statistical model

    NASA Astrophysics Data System (ADS)

    He, Qiang; Chu, Chee-Hung H.

    2009-02-01

    Remote sensing is widely used assess the destruction from natural disasters and to plan relief and recovery operations. How to automatically extract useful features and segment interesting objects from digital images, including remote sensing imagery, becomes a critical task for image understanding. Unfortunately, current research on automated feature extraction is ignorant of contextual information. As a result, the fidelity of populating attributes corresponding to interesting features and objects cannot be satisfied. In this paper, we present an exploration on meaningful object extraction integrating reflecting surfaces. Detection of specular reflecting surfaces can be useful in target identification and then can be applied to environmental monitoring, disaster prediction and analysis, military, and counter-terrorism. Our method is based on a statistical model to capture the statistical properties of specular reflecting surfaces. And then the reflecting surfaces are detected through cluster analysis.

  11. Time series expression analyses using RNA-seq: a statistical approach.

    PubMed

    Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P

    2013-01-01

    RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis.

  12. Temporal Comparisons of Internet Topology

    DTIC Science & Technology

    2014-06-01

    Number CAIDA Cooperative Association of Internet Data Analysis CDN Content Delivery Network CI Confidence Interval DoS denial of service GMT Greenwich...the CAIDA data. Our methods include analysis of graph theoretical measures as well as complex network and statistical measures that will quantify the...tool that probes the Internet for topology analysis and performance [26]. Scamper uses network diagnostic tools, such as traceroute and ping, to probe

  13. Transfusion Indication Threshold Reduction (TITRe2) randomized controlled trial in cardiac surgery: statistical analysis plan.

    PubMed

    Pike, Katie; Nash, Rachel L; Murphy, Gavin J; Reeves, Barnaby C; Rogers, Chris A

    2015-02-22

    The Transfusion Indication Threshold Reduction (TITRe2) trial is the largest randomized controlled trial to date to compare red blood cell transfusion strategies following cardiac surgery. This update presents the statistical analysis plan, detailing how the study will be analyzed and presented. The statistical analysis plan has been written following recommendations from the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, prior to database lock and the final analysis of trial data. Outlined analyses are in line with the Consolidated Standards of Reporting Trials (CONSORT). The study aims to randomize 2000 patients from 17 UK centres. Patients are randomized to either a restrictive (transfuse if haemoglobin concentration <7.5 g/dl) or liberal (transfuse if haemoglobin concentration <9 g/dl) transfusion strategy. The primary outcome is a binary composite outcome of any serious infectious or ischaemic event in the first 3 months following randomization. The statistical analysis plan details how non-adherence with the intervention, withdrawals from the study, and the study population will be derived and dealt with in the analysis. The planned analyses of the trial primary and secondary outcome measures are described in detail, including approaches taken to deal with multiple testing, model assumptions not being met and missing data. Details of planned subgroup and sensitivity analyses and pre-specified ancillary analyses are given, along with potential issues that have been identified with such analyses and possible approaches to overcome such issues. ISRCTN70923932 .

  14. Methods for the evaluation of alternative disaster warning systems

    NASA Technical Reports Server (NTRS)

    Agnew, C. E.; Anderson, R. J., Jr.; Lanen, W. N.

    1977-01-01

    For each of the methods identified, a theoretical basis is provided and an illustrative example is described. The example includes sufficient realism and detail to enable an analyst to conduct an evaluation of other systems. The methods discussed in the study include equal capability cost analysis, consumers' surplus, and statistical decision theory.

  15. Automatic Rock Detection and Mapping from HiRISE Imagery

    NASA Technical Reports Server (NTRS)

    Huertas, Andres; Adams, Douglas S.; Cheng, Yang

    2008-01-01

    This system includes a C-code software program and a set of MATLAB software tools for statistical analysis and rock distribution mapping. The major functions include rock detection and rock detection validation. The rock detection code has been evolved into a production tool that can be used by engineers and geologists with minor training.

  16. Meta-analysis of pemetrexed plus carboplatin doublet safety profile in first-line non-squamous non-small cell lung cancer studies.

    PubMed

    Okamoto, Isamu; Schuette, Wolfgang H W; Stinchcombe, Thomas E; Rodrigues-Pereira, José; San Antonio, Belén; Chen, Jian; Liu, Jingyi; John, William J; Zinner, Ralph G

    2017-05-01

    This meta-analysis compared safety profiles (selected drug-related treatment-emergent adverse events [TEAEs]) of first-line pemetrexed plus carboplatin (PCb) area under the concentration-time curve 5 mg/min•mL (PCb5) or 6 mg/min•mL (PCb6), two widely used regimens in clinical practice for advanced non-squamous non-small cell lung cancer. All patients received pemetrexed 500 mg/m 2 every 21 days with either of two carboplatin doses for up to 4-6 cycles. Safety profiles of PCb doses were compared using three statistical analysis methods: frequency table analysis (FTA), generalized linear mixed effect model (GLMM), and the propensity score method. Efficacy outcomes of PCb5 and PCb6 regimens were summarized. A total of 486 patients mainly from the US, Europe, and East Asia were included in the analysis; 22% (n = 105) received PCb5 in one trial and 78% (n = 381) received PCb6 in four trials. The FTA comparison demonstrated that PCb5 vs PCb6 was associated with a statistically significantly lower incidence of TEAEs, including all-grade thrombocytopenia, anemia, fatigue, and vomiting, and grade 3/4 thrombocytopenia. In the GLMM analysis, PCb5 patients were numerically less likely to experience all-grade and grade 3/4 neutropenia, anemia, and thrombocytopenia. The propensity score regression analysis showed PCb5 group patients were significantly less likely than PCb6 group patients to experience all-grade hematologic TEAEs and grade 3/4 thrombocytopenia and anemia. After applying propensity score 1:1 matching, FTA analysis showed that the PCb5 group had significantly less all-grade and grade 3/4 hematologic toxicities. Overall efficacy outcomes, including overall survival, progression-free survival, and response rate, were similar between the two carboplatin doses. Acknowledging the limitations of this meta-analysis of five trials, heterogeneous in patient's characteristics and trial designs, the results show that the PCb5 regimen was generally associated with a better safety profile than PCb6 across three statistical approaches, with no apparent impact on survival outcomes.

  17. Implementation and evaluation of an efficient secure computation system using ‘R’ for healthcare statistics

    PubMed Central

    Chida, Koji; Morohashi, Gembu; Fuji, Hitoshi; Magata, Fumihiko; Fujimura, Akiko; Hamada, Koki; Ikarashi, Dai; Yamamoto, Ryuichi

    2014-01-01

    Background and objective While the secondary use of medical data has gained attention, its adoption has been constrained due to protection of patient privacy. Making medical data secure by de-identification can be problematic, especially when the data concerns rare diseases. We require rigorous security management measures. Materials and methods Using secure computation, an approach from cryptography, our system can compute various statistics over encrypted medical records without decrypting them. An issue of secure computation is that the amount of processing time required is immense. We implemented a system that securely computes healthcare statistics from the statistical computing software ‘R’ by effectively combining secret-sharing-based secure computation with original computation. Results Testing confirmed that our system could correctly complete computation of average and unbiased variance of approximately 50 000 records of dummy insurance claim data in a little over a second. Computation including conditional expressions and/or comparison of values, for example, t test and median, could also be correctly completed in several tens of seconds to a few minutes. Discussion If medical records are simply encrypted, the risk of leaks exists because decryption is usually required during statistical analysis. Our system possesses high-level security because medical records remain in encrypted state even during statistical analysis. Also, our system can securely compute some basic statistics with conditional expressions using ‘R’ that works interactively while secure computation protocols generally require a significant amount of processing time. Conclusions We propose a secure statistical analysis system using ‘R’ for medical data that effectively integrates secret-sharing-based secure computation and original computation. PMID:24763677

  18. Implementation and evaluation of an efficient secure computation system using 'R' for healthcare statistics.

    PubMed

    Chida, Koji; Morohashi, Gembu; Fuji, Hitoshi; Magata, Fumihiko; Fujimura, Akiko; Hamada, Koki; Ikarashi, Dai; Yamamoto, Ryuichi

    2014-10-01

    While the secondary use of medical data has gained attention, its adoption has been constrained due to protection of patient privacy. Making medical data secure by de-identification can be problematic, especially when the data concerns rare diseases. We require rigorous security management measures. Using secure computation, an approach from cryptography, our system can compute various statistics over encrypted medical records without decrypting them. An issue of secure computation is that the amount of processing time required is immense. We implemented a system that securely computes healthcare statistics from the statistical computing software 'R' by effectively combining secret-sharing-based secure computation with original computation. Testing confirmed that our system could correctly complete computation of average and unbiased variance of approximately 50,000 records of dummy insurance claim data in a little over a second. Computation including conditional expressions and/or comparison of values, for example, t test and median, could also be correctly completed in several tens of seconds to a few minutes. If medical records are simply encrypted, the risk of leaks exists because decryption is usually required during statistical analysis. Our system possesses high-level security because medical records remain in encrypted state even during statistical analysis. Also, our system can securely compute some basic statistics with conditional expressions using 'R' that works interactively while secure computation protocols generally require a significant amount of processing time. We propose a secure statistical analysis system using 'R' for medical data that effectively integrates secret-sharing-based secure computation and original computation. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  19. Automation method to identify the geological structure of seabed using spatial statistic analysis of echo sounding data

    NASA Astrophysics Data System (ADS)

    Kwon, O.; Kim, W.; Kim, J.

    2017-12-01

    Recently construction of subsea tunnel has been increased globally. For safe construction of subsea tunnel, identifying the geological structure including fault at design and construction stage is more than important. Then unlike the tunnel in land, it's very difficult to obtain the data on geological structure because of the limit in geological survey. This study is intended to challenge such difficulties in a way of developing the technology to identify the geological structure of seabed automatically by using echo sounding data. When investigation a potential site for a deep subsea tunnel, there is the technical and economical limit with borehole of geophysical investigation. On the contrary, echo sounding data is easily obtainable while information reliability is higher comparing to above approaches. This study is aimed at developing the algorithm that identifies the large scale of geological structure of seabed using geostatic approach. This study is based on theory of structural geology that topographic features indicate geological structure. Basic concept of algorithm is outlined as follows; (1) convert the seabed topography to the grid data using echo sounding data, (2) apply the moving window in optimal size to the grid data, (3) estimate the spatial statistics of the grid data in the window area, (4) set the percentile standard of spatial statistics, (5) display the values satisfying the standard on the map, (6) visualize the geological structure on the map. The important elements in this study include optimal size of moving window, kinds of optimal spatial statistics and determination of optimal percentile standard. To determine such optimal elements, a numerous simulations were implemented. Eventually, user program based on R was developed using optimal analysis algorithm. The user program was designed to identify the variations of various spatial statistics. It leads to easy analysis of geological structure depending on variation of spatial statistics by arranging to easily designate the type of spatial statistics and percentile standard. This research was supported by the Korea Agency for Infrastructure Technology Advancement under the Ministry of Land, Infrastructure and Transport of the Korean government. (Project Number: 13 Construction Research T01)

  20. Marigold (Calendula officinalis L.): an evidence-based systematic review by the Natural Standard Research Collaboration.

    PubMed

    Basch, Ethan; Bent, Steve; Foppa, Ivo; Haskmi, Sadaf; Kroll, David; Mele, Michelle; Szapary, Philippe; Ulbricht, Catherine; Vora, Mamta; Yong, Sophanna

    2006-01-01

    An evidence-based systematic review including written and statistical analysis of scientific literature, expert opinion, folkloric precedent, history, pharmacology, kinetics/dynamics, interactions, adverse effects, toxicology and dosing.

  1. Forest fires in Pennsylvania.

    Treesearch

    Donald A. Haines; William A. Main; Eugene F. McNamara

    1978-01-01

    Describes factors that contribute to forest fires in Pennsylvania. Includes an analysis of basic statistics; distribution of fires during normal, drought, and wet years; fire cause, fire activity by day-of-week; multiple-fire day; and fire climatology.

  2. An evidence-based systematic review of saffron (Crocus sativus) by the Natural Standard Research Collaboration.

    PubMed

    Ulbricht, Catherine; Conquer, Julie; Costa, Dawn; Hollands, Whitney; Iannuzzi, Carmen; Isaac, Richard; Jordan, Joseph K; Ledesma, Natalie; Ostroff, Cathy; Serrano, Jill M Grimes; Shaffer, Michael D; Varghese, Minney

    2011-03-01

    An evidence-based systematic review including written and statistical analysis of scientific literature, expert opinion, folkloric precedent, history, pharmacology, kinetics/dynamics, interactions, adverse effects, toxicology, and dosing.

  3. Evidence-based systematic review of saw palmetto by the Natural Standard Research Collaboration.

    PubMed

    Ulbricht, Catherine; Basch, Ethan; Bent, Steve; Boon, Heather; Corrado, Michelle; Foppa, Ivo; Hashmi, Sadaf; Hammerness, Paul; Kingsbury, Eileen; Smith, Michael; Szapary, Philippe; Vora, Mamta; Weissner, Wendy

    2006-01-01

    Here presented is an evidence-based systematic review including written and statistical analysis of scientific literature, expert opinion, folkloric precedent, history, pharmacology, kinetics/dynamics, interactions, adverse effects, toxicology, and dosing.

  4. Differences in game-related statistics of basketball performance by game location for men's winning and losing teams.

    PubMed

    Gómez, Miguel A; Lorenzo, Alberto; Barakat, Rubén; Ortega, Enrique; Palao, José M

    2008-02-01

    The aim of the present study was to identify game-related statistics that differentiate winning and losing teams according to game location. The sample included 306 games of the 2004-2005 regular season of the Spanish professional men's league (ACB League). The independent variables were game location (home or away) and game result (win or loss). The game-related statistics registered were free throws (successful and unsuccessful), 2- and 3-point field goals (successful and unsuccessful), offensive and defensive rebounds, blocks, assists, fouls, steals, and turnovers. Descriptive and inferential analyses were done (one-way analysis of variance and discriminate analysis). The multivariate analysis showed that winning teams differ from losing teams in defensive rebounds (SC = .42) and in assists (SC = .38). Similarly, winning teams differ from losing teams when they play at home in defensive rebounds (SC = .40) and in assists (SC = .41). On the other hand, winning teams differ from losing teams when they play away in defensive rebounds (SC = .44), assists (SC = .30), successful 2-point field goals (SC = .31), and unsuccessful 3-point field goals (SC = -.35). Defensive rebounds and assists were the only game-related statistics common to all three analyses.

  5. Characterizing microstructural features of biomedical samples by statistical analysis of Mueller matrix images

    NASA Astrophysics Data System (ADS)

    He, Honghui; Dong, Yang; Zhou, Jialing; Ma, Hui

    2017-03-01

    As one of the salient features of light, polarization contains abundant structural and optical information of media. Recently, as a comprehensive description of polarization property, the Mueller matrix polarimetry has been applied to various biomedical studies such as cancerous tissues detections. In previous works, it has been found that the structural information encoded in the 2D Mueller matrix images can be presented by other transformed parameters with more explicit relationship to certain microstructural features. In this paper, we present a statistical analyzing method to transform the 2D Mueller matrix images into frequency distribution histograms (FDHs) and their central moments to reveal the dominant structural features of samples quantitatively. The experimental results of porcine heart, intestine, stomach, and liver tissues demonstrate that the transformation parameters and central moments based on the statistical analysis of Mueller matrix elements have simple relationships to the dominant microstructural properties of biomedical samples, including the density and orientation of fibrous structures, the depolarization power, diattenuation and absorption abilities. It is shown in this paper that the statistical analysis of 2D images of Mueller matrix elements may provide quantitative or semi-quantitative criteria for biomedical diagnosis.

  6. Statistical analysis of secondary particle distributions in relativistic nucleus-nucleus collisions

    NASA Technical Reports Server (NTRS)

    Mcguire, Stephen C.

    1987-01-01

    The use is described of several statistical techniques to characterize structure in the angular distributions of secondary particles from nucleus-nucleus collisions in the energy range 24 to 61 GeV/nucleon. The objective of this work was to determine whether there are correlations between emitted particle intensity and angle that may be used to support the existence of the quark gluon plasma. The techniques include chi-square null hypothesis tests, the method of discrete Fourier transform analysis, and fluctuation analysis. We have also used the method of composite unit vectors to test for azimuthal asymmetry in a data set of 63 JACEE-3 events. Each method is presented in a manner that provides the reader with some practical detail regarding its application. Of those events with relatively high statistics, Fe approaches 0 at 55 GeV/nucleon was found to possess an azimuthal distribution with a highly non-random structure. No evidence of non-statistical fluctuations was found in the pseudo-rapidity distributions of the events studied. It is seen that the most effective application of these methods relies upon the availability of many events or single events that possess very high multiplicities.

  7. The Need for Anticoagulation Following Inferior Vena Cava Filter Placement: Systematic Review

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ray, Charles E.; Prochazka, Allan

    Purpose. To perform a systemic review to determine the effect of anticoagulation on the rates of venous thromboembolism (pulmonary embolus, deep venous thrombosis, inferior vena cava (IVC) filter thrombosis) following placement of an IVC filter. Methods. A comprehensive computerized literature search was performed to identify relevant articles. Data were abstracted by two reviewers. Studies were included if it could be determined whether or not subjects received anticoagulation following filter placement, and if follow-up data were presented. A meta-analysis of patients from all included studies was performed. A total of 14 articles were included in the final analysis, but the datamore » from only nine articles could be used in the meta-analysis; five studies were excluded because they did not present raw data which could be analyzed in the meta-analysis. A total of 1,369 subjects were included in the final meta-analysis. Results. The summary odds ratio for the effect of anticoagulation on venous thromboembolism rates following filter deployment was 0.639 (95% CI 0.351 to 1.159, p = 0.141). There was significant heterogeneity in the results from different studies [Q statistic of 15.95 (p = 0.043)]. Following the meta-analysis, there was a trend toward decreased venous thromboembolism rates in patients with post-filter anticoagulation (12.3% vs. 15.8%), but the result failed to reach statistical significance. Conclusion. Inferior vena cava filters can be placed in patients who cannot receive concomitant anticoagulation without placing them at significantly higher risk of development of venous thromboembolism.« less

  8. The new statistics: why and how.

    PubMed

    Cumming, Geoff

    2014-01-01

    We need to make substantial changes to how we conduct research. First, in response to heightened concern that our published research literature is incomplete and untrustworthy, we need new requirements to ensure research integrity. These include prespecification of studies whenever possible, avoidance of selection and other inappropriate data-analytic practices, complete reporting, and encouragement of replication. Second, in response to renewed recognition of the severe flaws of null-hypothesis significance testing (NHST), we need to shift from reliance on NHST to estimation and other preferred techniques. The new statistics refers to recommended practices, including estimation based on effect sizes, confidence intervals, and meta-analysis. The techniques are not new, but adopting them widely would be new for many researchers, as well as highly beneficial. This article explains why the new statistics are important and offers guidance for their use. It describes an eight-step new-statistics strategy for research with integrity, which starts with formulation of research questions in estimation terms, has no place for NHST, and is aimed at building a cumulative quantitative discipline.

  9. Marketing of personalized cancer care on the web: an analysis of Internet websites.

    PubMed

    Gray, Stacy W; Cronin, Angel; Bair, Elizabeth; Lindeman, Neal; Viswanath, Vish; Janeway, Katherine A

    2015-05-01

    Internet marketing may accelerate the use of care based on genomic or tumor-derived data. However, online marketing may be detrimental if it endorses products of unproven benefit. We conducted an analysis of Internet websites to identify personalized cancer medicine (PCM) products and claims. A Delphi Panel categorized PCM as standard or nonstandard based on evidence of clinical utility. Fifty-five websites, sponsored by commercial entities, academic institutions, physicians, research institutes, and organizations, that marketed PCM included somatic (58%) and germline (20%) analysis, interpretive services (15%), and physicians/institutions offering personalized care (44%). Of 32 sites offering somatic analysis, 56% included specific test information (range 1-152 tests). All statistical tests were two-sided, and comparisons of website content were conducted using McNemar's test. More websites contained information about the benefits than limitations of PCM (85% vs 27%, P < .001). Websites specifying somatic analysis were statistically significantly more likely to market one or more nonstandard tests as compared with standard tests (88% vs 44%, P = .04). © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  10. Analyzing thematic maps and mapping for accuracy

    USGS Publications Warehouse

    Rosenfield, G.H.

    1982-01-01

    Two problems which exist while attempting to test the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given. classification category. The results of an accuracy test are presented in a contingency table sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors by commission, and the remaining elements of the columns represent the errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification. The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated step of data analysis techniques would be to use the entire classification error matrices using the methods of discrete multivariate analysis or of multiviariate analysis of variance.

  11. Methods for collection and analysis of aquatic biological and microbiological samples

    USGS Publications Warehouse

    Greeson, Phillip E.; Ehlke, T.A.; Irwin, G.A.; Lium, B.W.; Slack, K.V.

    1977-01-01

    Chapter A4 contains methods used by the U.S. Geological Survey to collect, preserve, and analyze waters to determine their biological and microbiological properties. Part 1 discusses biological sampling and sampling statistics. The statistical procedures are accompanied by examples. Part 2 consists of detailed descriptions of more than 45 individual methods, including those for bacteria, phytoplankton, zooplankton, seston, periphyton, macrophytes, benthic invertebrates, fish and other vertebrates, cellular contents, productivity, and bioassays. Each method is summarized, and the application, interferences, apparatus, reagents, collection, analysis, calculations, reporting of results, precision and references are given. Part 3 consists of a glossary. Part 4 is a list of taxonomic references.

  12. A Longitudinal Analysis of the Influence of a Peer Run Warm Line Phone Service on Psychiatric Recovery.

    PubMed

    Dalgin, Rebecca Spirito; Dalgin, M Halim; Metzger, Scott J

    2018-05-01

    This article focuses on the impact of a peer run warm line as part of the psychiatric recovery process. It utilized data including the Recovery Assessment Scale, community integration measures and crisis service usage. Longitudinal statistical analysis was completed on 48 sets of data from 2011, 2012, and 2013. Although no statistically significant differences were observed for the RAS score, community integration data showed increases in visits to primary care doctors, leisure/recreation activities and socialization with others. This study highlights the complexity of psychiatric recovery and that nonclinical peer services like peer run warm lines may be critical to the process.

  13. MWASTools: an R/bioconductor package for metabolome-wide association studies.

    PubMed

    Rodriguez-Martinez, Andrea; Posma, Joram M; Ayala, Rafael; Neves, Ana L; Anwar, Maryam; Petretto, Enrico; Emanueli, Costanza; Gauguier, Dominique; Nicholson, Jeremy K; Dumas, Marc-Emmanuel

    2018-03-01

    MWASTools is an R package designed to provide an integrated pipeline to analyse metabonomic data in large-scale epidemiological studies. Key functionalities of our package include: quality control analysis; metabolome-wide association analysis using various models (partial correlations, generalized linear models); visualization of statistical outcomes; metabolite assignment using statistical total correlation spectroscopy (STOCSY); and biological interpretation of metabolome-wide association studies results. The MWASTools R package is implemented in R (version  > =3.4) and is available from Bioconductor: https://bioconductor.org/packages/MWASTools/. m.dumas@imperial.ac.uk. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  14. Statistics of SU(5) D-brane models on a type II orientifold

    NASA Astrophysics Data System (ADS)

    Gmeiner, Florian; Stein, Maren

    2006-06-01

    We perform a statistical analysis of models with SU(5) and flipped SU(5) gauge group in a type II orientifold setup. We investigate the distribution and correlation of properties of these models, including the number of generations and the hidden sector gauge group. Compared to the recent analysis [F. Gmeiner, R. Blumenhagen, G. Honecker, D. Lüst, and T. Weigand, J. High Energy Phys.JHEPFG1029-8479 01 (2006) 004; F. Gmeiner, Fortschr. Phys.FPYKA60015-8208 54, 391 (2006).10.1088/1126-6708/2006/01/004] of models with a standard model-like gauge group, we find very similar results.

  15. Clinical and Radiographic Mid-Term Outcomes After Total Shoulder Replacement: A Retrospective Study Protocol Including 400 Anatomical and Reverse Prosthetic Implants

    PubMed Central

    Merolla, Giovanni; Tartarone, Antonio; Porcellini, Giuseppe

    2016-01-01

    Objectives: To obtain outcomes data on anatomical and reverse total shoulder arthroplasty by analysis of clinical scores and standard radiographs. Subject selection and enrollment: 400 consecutive series of patients replaced with anatomical and reverse total shoulder arthroplasty (minimum 3 years follow-up). Study Design: retrospective monocenter. Preoperative assessment: Demographics, clinical scores (Constant-Murley) as available, shoulder X-ray (AP, outlet and axillary views) . Last follow-up: Postoperative radiographhs and clinical scores. Adverse events and complications to be reported as occurred since implantation. Statistical analysis: Data collected will be summarized and analyzed for statistical significance. PMID:27326389

  16. 2008 Homeland Security S and T Stakeholders Conference West-Volume 3 Tuesday

    DTIC Science & Technology

    2008-01-16

    Architecture ( PNNL SRS) • Online data collection / entry • Data Warehouse • On Demand Analysis and Reporting Tools • Reports, Charts & Graphs • Visual / Data...Sustainability 2007– 2016 Our region wide investment include all PANYNJ business areas Computer Statistical Analysis COMPSTAT •NYPD 1990’s •Personnel Management...Coast Guard, and public health Expertise, Depth, Agility Staff Degrees 6 Our Value Added Capabilities • Risk Analysis • Operations Analysis

  17. A random-sum Wilcoxon statistic and its application to analysis of ROC and LROC data.

    PubMed

    Tang, Liansheng Larry; Balakrishnan, N

    2011-01-01

    The Wilcoxon-Mann-Whitney statistic is commonly used for a distribution-free comparison of two groups. One requirement for its use is that the sample sizes of the two groups are fixed. This is violated in some of the applications such as medical imaging studies and diagnostic marker studies; in the former, the violation occurs since the number of correctly localized abnormal images is random, while in the latter the violation is due to some subjects not having observable measurements. For this reason, we propose here a random-sum Wilcoxon statistic for comparing two groups in the presence of ties, and derive its variance as well as its asymptotic distribution for large sample sizes. The proposed statistic includes the regular Wilcoxon rank-sum statistic. Finally, we apply the proposed statistic for summarizing location response operating characteristic data from a liver computed tomography study, and also for summarizing diagnostic accuracy of biomarker data.

  18. LFSTAT - Low-Flow Analysis in R

    NASA Astrophysics Data System (ADS)

    Koffler, Daniel; Laaha, Gregor

    2013-04-01

    The calculation of characteristic stream flow during dry conditions is a basic requirement for many problems in hydrology, ecohydrology and water resources management. As opposed to floods, a number of different indices are used to characterise low flows and streamflow droughts. Although these indices and methods of calculation have been well documented in the WMO Manual on Low-flow Estimation and Prediction [1], a comprehensive software was missing which enables a fast and standardized calculation of low flow statistics. We present the new software package lfstat to fill in this obvious gap. Our software package is based on the statistical open source software R, and expands it to analyse daily stream flow data records focusing on low-flows. As command-line based programs are not everyone's preference, we also offer a plug-in for the R-Commander, an easy to use graphical user interface (GUI) provided for R which is based on tcl/tk. The functionality of lfstat includes estimation methods for low-flow indices, extreme value statistics, deficit characteristics, and additional graphical methods to control the computation of complex indices and to illustrate the data. Beside the basic low flow indices, the baseflow index and recession constants can be computed. For extreme value statistics, state-of-the-art methods for L-moment based local and regional frequency analysis (RFA) are available. The tools for deficit characteristics include various pooling and threshold selection methods to support the calculation of drought duration and deficit indices. The most common graphics for low flow analysis are available, and the plots can be modified according to the user preferences. Graphics include hydrographs for different periods, flexible streamflow deficit plots, baseflow visualisation, recession diagnostic, flow duration curves as well as double mass curves, and many more. From a technical point of view, the package uses a S3-class called lfobj (low-flow objects). This objects are usual R-data-frames including date, flow, hydrological year and possibly baseflow information. Once these objects are created, analysis can be performed by mouse-click and a script can be saved to make the analysis easily reproducible. At the moment we are offering implementation of all major methods proposed in the WMO manual on Low-flow Estimation and Predictions [1]. Future plans include a dynamic low flow report in odt-file format using odf-weave which allows automatic updates if data or analysis change. We hope to offer a tool to ease and structure the analysis of stream flow data focusing on low-flows and to make analysis transparent and communicable. The package can also be used in teaching students the first steps in low-flow hydrology. The software packages can be installed from CRAN (latest stable) and R-Forge: http://r-forge.r-project.org (development version). References: [1] Gustard, Alan; Demuth, Siegfried, (eds.) Manual on Low-flow Estimation and Prediction. Geneva, Switzerland, World Meteorological Organization, (Operational Hydrology Report No. 50, WMO-No. 1029).

  19. Quantitative trait nucleotide analysis using Bayesian model selection.

    PubMed

    Blangero, John; Goring, Harald H H; Kent, Jack W; Williams, Jeff T; Peterson, Charles P; Almasy, Laura; Dyer, Thomas D

    2005-10-01

    Although much attention has been given to statistical genetic methods for the initial localization and fine mapping of quantitative trait loci (QTLs), little methodological work has been done to date on the problem of statistically identifying the most likely functional polymorphisms using sequence data. In this paper we provide a general statistical genetic framework, called Bayesian quantitative trait nucleotide (BQTN) analysis, for assessing the likely functional status of genetic variants. The approach requires the initial enumeration of all genetic variants in a set of resequenced individuals. These polymorphisms are then typed in a large number of individuals (potentially in families), and marker variation is related to quantitative phenotypic variation using Bayesian model selection and averaging. For each sequence variant a posterior probability of effect is obtained and can be used to prioritize additional molecular functional experiments. An example of this quantitative nucleotide analysis is provided using the GAW12 simulated data. The results show that the BQTN method may be useful for choosing the most likely functional variants within a gene (or set of genes). We also include instructions on how to use our computer program, SOLAR, for association analysis and BQTN analysis.

  20. Data exploration, quality control and statistical analysis of ChIP-exo/nexus experiments

    PubMed Central

    Welch, Rene; Chung, Dongjun; Grass, Jeffrey; Landick, Robert

    2017-01-01

    Abstract ChIP-exo/nexus experiments rely on innovative modifications of the commonly used ChIP-seq protocol for high resolution mapping of transcription factor binding sites. Although many aspects of the ChIP-exo data analysis are similar to those of ChIP-seq, these high throughput experiments pose a number of unique quality control and analysis challenges. We develop a novel statistical quality control pipeline and accompanying R/Bioconductor package, ChIPexoQual, to enable exploration and analysis of ChIP-exo and related experiments. ChIPexoQual evaluates a number of key issues including strand imbalance, library complexity, and signal enrichment of data. Assessment of these features are facilitated through diagnostic plots and summary statistics computed over regions of the genome with varying levels of coverage. We evaluated our QC pipeline with both large collections of public ChIP-exo/nexus data and multiple, new ChIP-exo datasets from Escherichia coli. ChIPexoQual analysis of these datasets resulted in guidelines for using these QC metrics across a wide range of sequencing depths and provided further insights for modelling ChIP-exo data. PMID:28911122

  1. Data exploration, quality control and statistical analysis of ChIP-exo/nexus experiments.

    PubMed

    Welch, Rene; Chung, Dongjun; Grass, Jeffrey; Landick, Robert; Keles, Sündüz

    2017-09-06

    ChIP-exo/nexus experiments rely on innovative modifications of the commonly used ChIP-seq protocol for high resolution mapping of transcription factor binding sites. Although many aspects of the ChIP-exo data analysis are similar to those of ChIP-seq, these high throughput experiments pose a number of unique quality control and analysis challenges. We develop a novel statistical quality control pipeline and accompanying R/Bioconductor package, ChIPexoQual, to enable exploration and analysis of ChIP-exo and related experiments. ChIPexoQual evaluates a number of key issues including strand imbalance, library complexity, and signal enrichment of data. Assessment of these features are facilitated through diagnostic plots and summary statistics computed over regions of the genome with varying levels of coverage. We evaluated our QC pipeline with both large collections of public ChIP-exo/nexus data and multiple, new ChIP-exo datasets from Escherichia coli. ChIPexoQual analysis of these datasets resulted in guidelines for using these QC metrics across a wide range of sequencing depths and provided further insights for modelling ChIP-exo data. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Descriptive data analysis.

    PubMed

    Thompson, Cheryl Bagley

    2009-01-01

    This 13th article of the Basics of Research series is first in a short series on statistical analysis. These articles will discuss creating your statistical analysis plan, levels of measurement, descriptive statistics, probability theory, inferential statistics, and general considerations for interpretation of the results of a statistical analysis.

  3. Experimental design, power and sample size for animal reproduction experiments.

    PubMed

    Chapman, Phillip L; Seidel, George E

    2008-01-01

    The present paper concerns statistical issues in the design of animal reproduction experiments, with emphasis on the problems of sample size determination and power calculations. We include examples and non-technical discussions aimed at helping researchers avoid serious errors that may invalidate or seriously impair the validity of conclusions from experiments. Screen shots from interactive power calculation programs and basic SAS power calculation programs are presented to aid in understanding statistical power and computing power in some common experimental situations. Practical issues that are common to most statistical design problems are briefly discussed. These include one-sided hypothesis tests, power level criteria, equality of within-group variances, transformations of response variables to achieve variance equality, optimal specification of treatment group sizes, 'post hoc' power analysis and arguments for the increased use of confidence intervals in place of hypothesis tests.

  4. AASC Recommendations for the Education of an Applied Climatologist

    NASA Astrophysics Data System (ADS)

    Nielsen-Gammon, J. W.; Stooksbury, D.; Akyuz, A.; Dupigny-Giroux, L.; Hubbard, K. G.; Timofeyeva, M. M.

    2011-12-01

    The American Association of State Climatologists (AASC) has developed curricular recommendations for the education of future applied and service climatologists. The AASC was founded in 1976. Membership of the AASC includes state climatologists and others who work in state climate offices; climate researchers in academia and educators; applied climatologists in NOAA and other federal agencies; and the private sector. The AASC is the only professional organization dedicated solely to the growth and development of applied and service climatology. The purpose of the recommendations is to offer a framework for existing and developing academic climatology programs. These recommendations are intended to serve as a road map and to help distinguish the educational needs for future applied climatologists from those of operational meteorologists or other scientists and practitioners. While the home department of climatology students may differ from one program to the next, the most essential factor is that students can demonstrate a breadth and depth of understanding in the knowledge and tools needed to be an applied climatologist. Because the training of an applied climatologist requires significant depth and breadth, the Masters degree is recommended as the minimum level of education needed. This presentation will highlight the AASC recommendations. These include a strong foundation in: - climatology (instrumentation and data collection, climate dynamics, physical climatology, synoptic and regional climatology, applied climatology, climate models, etc.) - basic natural sciences and mathematics including calculus, physics, chemistry, and biology/ecology - fundamental atmospheric sciences (atmospheric dynamics, atmospheric thermodynamics, atmospheric radiation, and weather analysis/synoptic meteorology) and - data analysis and spatial analysis (descriptive statistics, statistical methods, multivariate statistics, geostatistics, GIS, etc.). The recommendations also include a secondary area of concentration (agriculture, economics, geography, hydrology, marine sciences, natural resources, policy, etc.) and a major applied climate research component.

  5. Statistical assessment on a combined analysis of GRYN-ROMN-UCBN upland vegetation vital signs

    USGS Publications Warehouse

    Irvine, Kathryn M.; Rodhouse, Thomas J.

    2014-01-01

    As of 2013, Rocky Mountain and Upper Columbia Basin Inventory and Monitoring Networks have multiple years of vegetation data and Greater Yellowstone Network has three years of vegetation data and monitoring is ongoing in all three networks. Our primary objective is to assess whether a combined analysis of these data aimed at exploring correlations with climate and weather data is feasible. We summarize the core survey design elements across protocols and point out the major statistical challenges for a combined analysis at present. The dissimilarity in response designs between ROMN and UCBN-GRYN network protocols presents a statistical challenge that has not been resolved yet. However, the UCBN and GRYN data are compatible as they implement a similar response design; therefore, a combined analysis is feasible and will be pursued in future. When data collected by different networks are combined, the survey design describing the merged dataset is (likely) a complex survey design. A complex survey design is the result of combining datasets from different sampling designs. A complex survey design is characterized by unequal probability sampling, varying stratification, and clustering (see Lohr 2010 Chapter 7 for general overview). Statistical analysis of complex survey data requires modifications to standard methods, one of which is to include survey design weights within a statistical model. We focus on this issue for a combined analysis of upland vegetation from these networks, leaving other topics for future research. We conduct a simulation study on the possible effects of equal versus unequal probability selection of points on parameter estimates of temporal trend using available packages within the R statistical computing package. We find that, as written, using lmer or lm for trend detection in a continuous response and clm and clmm for visually estimated cover classes with “raw” GRTS design weights specified for the weight argument leads to substantially different results and/or computational instability. However, when only fixed effects are of interest, the survey package (svyglm and svyolr) may be suitable for a model-assisted analysis for trend. We provide possible directions for future research into combined analysis for ordinal and continuous vital sign indictors.

  6. Can upstaging of ductal carcinoma in situ be predicted at biopsy by histologic and mammographic features?

    NASA Astrophysics Data System (ADS)

    Shi, Bibo; Grimm, Lars J.; Mazurowski, Maciej A.; Marks, Jeffrey R.; King, Lorraine M.; Maley, Carlo C.; Hwang, E. Shelley; Lo, Joseph Y.

    2017-03-01

    Reducing the overdiagnosis and overtreatment associated with ductal carcinoma in situ (DCIS) requires accurate prediction of the invasive potential at cancer screening. In this work, we investigated the utility of pre-operative histologic and mammographic features to predict upstaging of DCIS. The goal was to provide intentionally conservative baseline performance using readily available data from radiologists and pathologists and only linear models. We conducted a retrospective analysis on 99 patients with DCIS. Of those 25 were upstaged to invasive cancer at the time of definitive surgery. Pre-operative factors including both the histologic features extracted from stereotactic core needle biopsy (SCNB) reports and the mammographic features annotated by an expert breast radiologist were investigated with statistical analysis. Furthermore, we built classification models based on those features in an attempt to predict the presence of an occult invasive component in DCIS, with generalization performance assessed by receiver operating characteristic (ROC) curve analysis. Histologic features including nuclear grade and DCIS subtype did not show statistically significant differences between cases with pure DCIS and with DCIS plus invasive disease. However, three mammographic features, i.e., the major axis length of DCIS lesion, the BI-RADS level of suspicion, and radiologist's assessment did achieve the statistical significance. Using those three statistically significant features as input, a linear discriminant model was able to distinguish patients with DCIS plus invasive disease from those with pure DCIS, with AUC-ROC equal to 0.62. Overall, mammograms used for breast screening contain useful information that can be perceived by radiologists and help predict occult invasive components in DCIS.

  7. The Effectiveness of Computer-Assisted Instruction to Teach Physical Examination to Students and Trainees in the Health Sciences Professions: A Systematic Review and Meta-Analysis

    PubMed Central

    Tomesko, Jennifer; Touger-Decker, Riva; Dreker, Margaret; Zelig, Rena; Parrott, James Scott

    2017-01-01

    Purpose: To explore knowledge and skill acquisition outcomes related to learning physical examination (PE) through computer-assisted instruction (CAI) compared with a face-to-face (F2F) approach. Method: A systematic literature review and meta-analysis published between January 2001 and December 2016 was conducted. Databases searched included Medline, Cochrane, CINAHL, ERIC, Ebsco, Scopus, and Web of Science. Studies were synthesized by study design, intervention, and outcomes. Statistical analyses included DerSimonian-Laird random-effects model. Results: In total, 7 studies were included in the review, and 5 in the meta-analysis. There were no statistically significant differences for knowledge (mean difference [MD] = 5.39, 95% confidence interval [CI]: −2.05 to 12.84) or skill acquisition (MD = 0.35, 95% CI: −5.30 to 6.01). Conclusions: The evidence does not suggest a strong consistent preference for either CAI or F2F instruction to teach students/trainees PE. Further research is needed to identify conditions which examine knowledge and skill acquisition outcomes that favor one mode of instruction over the other. PMID:29349338

  8. Educational games in geriatric medicine education: a systematic review

    PubMed Central

    2010-01-01

    Objective To systematically review the medical literature to assess the effect of geriatric educational games on the satisfaction, knowledge, beliefs, attitudes and behaviors of health care professionals. Methods We conducted a systematic review following the Cochrane Collaboration methodology including an electronic search of 10 electronic databases. We included randomized controlled trials (RCT) and controlled clinical trials (CCT) and excluded single arm studies. Population of interests included members (practitioners or students) of the health care professions. Outcomes of interests were participants' satisfaction, knowledge, beliefs, attitude, and behaviors. Results We included 8 studies evaluating 5 geriatric role playing games, all conducted in United States. All studies suffered from one or more methodological limitations but the overall quality of evidence was acceptable. None of the studies assessed the effects of the games on beliefs or behaviors. None of the 8 studies reported a statistically significant difference between the 2 groups in terms of change in attitude. One study assessed the impact on knowledge and found non-statistically significant difference between the 2 groups. Two studies found levels of satisfaction among participants to be high. We did not conduct a planned meta-analysis because the included studies either reported no statistical data or reported different summary statistics. Conclusion The available evidence does not support the use of role playing interventions in geriatric medical education with the aim of improving the attitudes towards the elderly. PMID:20416055

  9. Lactobacillus for preventing recurrent urinary tract infections in women: meta-analysis.

    PubMed

    Grin, Peter M; Kowalewska, Paulina M; Alhazzan, Waleed; Fox-Robichaud, Alison E

    2013-02-01

    Urinary tract infections (UTIs) are the most common infections affecting women, and often recur. Lactobacillus probiotics could potentially replace low dose, long term antibiotics as a safer prophylactic for recurrent UTI (rUTI). This systematic review and meta-analysis was performed to compile the results of existing randomized clinical trials (RCTs) to determine the efficacy of probiotic Lactobacillus species in preventing rUTI. MEDLINE and EMBASE were searched from inception to July 2012 for RCTs using a Lactobacillus prophylactic against rUTI in premenopausal adult women. A random-effects model meta-analysis was performed using a pooled risk ratio, comparing incidence of rUTI in patients receiving Lactobacillus to control. Data from 294 patients across five studies were included. There was no statistically significant difference in the risk for rUTI in patients receiving Lactobacillus versus controls, as indicated by the pooled risk ratio of 0.85 (95% confidence interval of 0.58-1.25, p = 0.41). A sensitivity analysis was performed, excluding studies using ineffective strains and studies testing for safety. Data from 127 patients in two studies were included. A statistically significant decrease in rUTI was found in patients given Lactobacillus, denoted by the pooled risk ratio of 0.51 (95% confidence interval 0.26-0.99, p = 0.05) with no statistical heterogeneity (I2 = 0%). Probiotic strains of Lactobacillus are safe and effective in preventing rUTI in adult women. However, more RCTs are required before a definitive recommendation can be made since the patient population contributing data to this meta-analysis was small.

  10. No direct correlation between rotavirus diarrhea and breast feeding: A meta-analysis.

    PubMed

    Shen, Jian; Zhang, Bi-Meng; Zhu, Sheng-Guo; Chen, Jian-Jie

    2018-04-01

    Some studies indicated that children with exclusive breast feeding had a reduction in the prevalence of rotavirus diarrhea, while some others held the opposite views. In this study, we aimed to systematically find the associations between rotavirus diarrhea and breast feeding. A literature search up to June 2016 in electronic literature databases, including PubMed and Embase, was performed. The Newcastle-Ottawa Scale was used to conduct the quality assessment of all the selected studies. Statistical analyses were performed using the R package version 3.12 (R Foundation for Statistical Computing, Beijing1, China, meta package), and odds ratio (OR) and 95% confidence interval (CI) were used to assess the strength of the association. The heterogeneity was assessed by Cochran's Q-statistic and I 2 test, and the sensitivity analysis was performed by trimming one study at a time. A total of 17 articles, which included 10,841 participants, were investigated in the present meta-analysis. There was no significant difference between the case group and control group (OR, 0.59 95% CI 0.33-1.07) in the meta-analysis of exclusive breast feeding, and no significant difference was found between the case group and the control group (OR, 0.86; 95% CI 0.63-1.16) in the meta-analysis of breast feeding. No significant difference was found between the case group and control group (OR, 0.78 95% CI 0.59-1.04) for all quantitative data. There may be no direct correlation between rotavirus diarrhea and breast feeding. Copyright © 2017. Published by Elsevier B.V.

  11. Planck 2015 results. XVI. Isotropy and statistics of the CMB

    NASA Astrophysics Data System (ADS)

    Planck Collaboration; Ade, P. A. R.; Aghanim, N.; Akrami, Y.; Aluri, P. K.; Arnaud, M.; Ashdown, M.; Aumont, J.; Baccigalupi, C.; Banday, A. J.; Barreiro, R. B.; Bartolo, N.; Basak, S.; Battaner, E.; Benabed, K.; Benoît, A.; Benoit-Lévy, A.; Bernard, J.-P.; Bersanelli, M.; Bielewicz, P.; Bock, J. J.; Bonaldi, A.; Bonavera, L.; Bond, J. R.; Borrill, J.; Bouchet, F. R.; Boulanger, F.; Bucher, M.; Burigana, C.; Butler, R. C.; Calabrese, E.; Cardoso, J.-F.; Casaponsa, B.; Catalano, A.; Challinor, A.; Chamballu, A.; Chiang, H. C.; Christensen, P. R.; Church, S.; Clements, D. L.; Colombi, S.; Colombo, L. P. L.; Combet, C.; Contreras, D.; Couchot, F.; Coulais, A.; Crill, B. P.; Cruz, M.; Curto, A.; Cuttaia, F.; Danese, L.; Davies, R. D.; Davis, R. J.; de Bernardis, P.; de Rosa, A.; de Zotti, G.; Delabrouille, J.; Désert, F.-X.; Diego, J. M.; Dole, H.; Donzelli, S.; Doré, O.; Douspis, M.; Ducout, A.; Dupac, X.; Efstathiou, G.; Elsner, F.; Enßlin, T. A.; Eriksen, H. K.; Fantaye, Y.; Fergusson, J.; Fernandez-Cobos, R.; Finelli, F.; Forni, O.; Frailis, M.; Fraisse, A. A.; Franceschi, E.; Frejsel, A.; Frolov, A.; Galeotta, S.; Galli, S.; Ganga, K.; Gauthier, C.; Ghosh, T.; Giard, M.; Giraud-Héraud, Y.; Gjerløw, E.; González-Nuevo, J.; Górski, K. M.; Gratton, S.; Gregorio, A.; Gruppuso, A.; Gudmundsson, J. E.; Hansen, F. K.; Hanson, D.; Harrison, D. L.; Henrot-Versillé, S.; Hernández-Monteagudo, C.; Herranz, D.; Hildebrandt, S. R.; Hivon, E.; Hobson, M.; Holmes, W. A.; Hornstrup, A.; Hovest, W.; Huang, Z.; Huffenberger, K. M.; Hurier, G.; Jaffe, A. H.; Jaffe, T. R.; Jones, W. C.; Juvela, M.; Keihänen, E.; Keskitalo, R.; Kim, J.; Kisner, T. S.; Knoche, J.; Kunz, M.; Kurki-Suonio, H.; Lagache, G.; Lähteenmäki, A.; Lamarre, J.-M.; Lasenby, A.; Lattanzi, M.; Lawrence, C. R.; Leonardi, R.; Lesgourgues, J.; Levrier, F.; Liguori, M.; Lilje, P. B.; Linden-Vørnle, M.; Liu, H.; López-Caniego, M.; Lubin, P. M.; Macías-Pérez, J. F.; Maggio, G.; Maino, D.; Mandolesi, N.; Mangilli, A.; Marinucci, D.; Maris, M.; Martin, P. G.; Martínez-González, E.; Masi, S.; Matarrese, S.; McGehee, P.; Meinhold, P. R.; Melchiorri, A.; Mendes, L.; Mennella, A.; Migliaccio, M.; Mikkelsen, K.; Mitra, S.; Miville-Deschênes, M.-A.; Molinari, D.; Moneti, A.; Montier, L.; Morgante, G.; Mortlock, D.; Moss, A.; Munshi, D.; Murphy, J. A.; Naselsky, P.; Nati, F.; Natoli, P.; Netterfield, C. B.; Nørgaard-Nielsen, H. U.; Noviello, F.; Novikov, D.; Novikov, I.; Oxborrow, C. A.; Paci, F.; Pagano, L.; Pajot, F.; Pant, N.; Paoletti, D.; Pasian, F.; Patanchon, G.; Pearson, T. J.; Perdereau, O.; Perotto, L.; Perrotta, F.; Pettorino, V.; Piacentini, F.; Piat, M.; Pierpaoli, E.; Pietrobon, D.; Plaszczynski, S.; Pointecouteau, E.; Polenta, G.; Popa, L.; Pratt, G. W.; Prézeau, G.; Prunet, S.; Puget, J.-L.; Rachen, J. P.; Rebolo, R.; Reinecke, M.; Remazeilles, M.; Renault, C.; Renzi, A.; Ristorcelli, I.; Rocha, G.; Rosset, C.; Rossetti, M.; Rotti, A.; Roudier, G.; Rubiño-Martín, J. A.; Rusholme, B.; Sandri, M.; Santos, D.; Savelainen, M.; Savini, G.; Scott, D.; Seiffert, M. D.; Shellard, E. P. S.; Souradeep, T.; Spencer, L. D.; Stolyarov, V.; Stompor, R.; Sudiwala, R.; Sunyaev, R.; Sutton, D.; Suur-Uski, A.-S.; Sygnet, J.-F.; Tauber, J. A.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Trombetti, T.; Tucci, M.; Tuovinen, J.; Valenziano, L.; Valiviita, J.; Van Tent, B.; Vielva, P.; Villa, F.; Wade, L. A.; Wandelt, B. D.; Wehus, I. K.; Yvon, D.; Zacchei, A.; Zibin, J. P.; Zonca, A.

    2016-09-01

    We test the statistical isotropy and Gaussianity of the cosmic microwave background (CMB) anisotropies using observations made by the Planck satellite. Our results are based mainly on the full Planck mission for temperature, but also include some polarization measurements. In particular, we consider the CMB anisotropy maps derived from the multi-frequency Planck data by several component-separation methods. For the temperature anisotropies, we find excellent agreement between results based on these sky maps over both a very large fraction of the sky and a broad range of angular scales, establishing that potential foreground residuals do not affect our studies. Tests of skewness, kurtosis, multi-normality, N-point functions, and Minkowski functionals indicate consistency with Gaussianity, while a power deficit at large angular scales is manifested in several ways, for example low map variance. The results of a peak statistics analysis are consistent with the expectations of a Gaussian random field. The "Cold Spot" is detected with several methods, including map kurtosis, peak statistics, and mean temperature profile. We thoroughly probe the large-scale dipolar power asymmetry, detecting it with several independent tests, and address the subject of a posteriori correction. Tests of directionality suggest the presence of angular clustering from large to small scales, but at a significance that is dependent on the details of the approach. We perform the first examination of polarization data, finding the morphology of stacked peaks to be consistent with the expectations of statistically isotropic simulations. Where they overlap, these results are consistent with the Planck 2013 analysis based on the nominal mission data and provide our most thorough view of the statistics of the CMB fluctuations to date.

  12. Planck 2015 results: XVI. Isotropy and statistics of the CMB

    DOE PAGES

    Ade, P. A. R.; Aghanim, N.; Akrami, Y.; ...

    2016-09-20

    In this paper, we test the statistical isotropy and Gaussianity of the cosmic microwave background (CMB) anisotropies using observations made by the Planck satellite. Our results are based mainly on the full Planck mission for temperature, but also include some polarization measurements. In particular, we consider the CMB anisotropy maps derived from the multi-frequency Planck data by several component-separation methods. For the temperature anisotropies, we find excellent agreement between results based on these sky maps over both a very large fraction of the sky and a broad range of angular scales, establishing that potential foreground residuals do not affect ourmore » studies. Tests of skewness, kurtosis, multi-normality, N-point functions, and Minkowski functionals indicate consistency with Gaussianity, while a power deficit at large angular scales is manifested in several ways, for example low map variance. The results of a peak statistics analysis are consistent with the expectations of a Gaussian random field. The “Cold Spot” is detected with several methods, including map kurtosis, peak statistics, and mean temperature profile. We thoroughly probe the large-scale dipolar power asymmetry, detecting it with several independent tests, and address the subject of a posteriori correction. Tests of directionality suggest the presence of angular clustering from large to small scales, but at a significance that is dependent on the details of the approach. We perform the first examination of polarization data, finding the morphology of stacked peaks to be consistent with the expectations of statistically isotropic simulations. Finally, where they overlap, these results are consistent with the Planck 2013 analysis based on the nominal mission data and provide our most thorough view of the statistics of the CMB fluctuations to date.« less

  13. Accuracy of trace element determinations in alternate fuels

    NASA Technical Reports Server (NTRS)

    Greenbauer-Seng, L. A.

    1980-01-01

    A review of the techniques used at Lewis Research Center (LeRC) in trace metals analysis is presented, including the results of Atomic Absorption Spectrometry and DC Arc Emission Spectrometry of blank levels and recovery experiments for several metals. The design of an Interlaboratory Study conducted by LeRC is presented. Several factors were investigated, including: laboratory, analytical technique, fuel type, concentration, and ashing additive. Conclusions drawn from the statistical analysis will help direct research efforts toward those areas most responsible for the poor interlaboratory analytical results.

  14. Predictors of workplace violence among female sex workers in Tijuana, Mexico.

    PubMed

    Katsulis, Yasmina; Durfee, Alesha; Lopez, Vera; Robillard, Alyssa

    2015-05-01

    For sex workers, differences in rates of exposure to workplace violence are likely influenced by a variety of risk factors, including where one works and under what circumstances. Economic stressors, such as housing insecurity, may also increase the likelihood of exposure. Bivariate analyses demonstrate statistically significant associations between workplace violence and selected predictor variables, including age, drug use, exchanging sex for goods, soliciting clients outdoors, and experiencing housing insecurity. Multivariate regression analysis shows that after controlling for each of these variables in one model, only soliciting clients outdoors and housing insecurity emerge as statistically significant predictors for workplace violence. © The Author(s) 2014.

  15. The Relationship between Parental Involvement and Urban Secondary School Student Academic Achievement: A Meta-Analysis

    ERIC Educational Resources Information Center

    Jeynes, William H.

    2007-01-01

    A meta-analysis is undertaken, including 52 studies, to determine the influence of parental involvement on the educational outcomes of urban secondary school children. Statistical analyses are done to determine the overall impact of parental involvement as well as specific components of parental involvement. Four different measures of educational…

  16. Spectral analysis of groove spacing on Ganymede

    NASA Technical Reports Server (NTRS)

    Grimm, R. E.

    1984-01-01

    The technique used to analyze groove spacing on Ganymede is presented. Data from Voyager images are used determine the surface topography and position of the grooves. Power spectal estimates are statistically analyzed and sample data is included.

  17. The Third U.S.A. Mathematical Olympiad

    ERIC Educational Resources Information Center

    Greitzer, Samuel L.

    1975-01-01

    The 1974 Third United States of America Mathematical Olympiad for secondary school students is described. Included are five test problems with solutions, a brief statistical analysis of test scores, and a list of the eight finalists. (CR)

  18. Sample size and power considerations in network meta-analysis

    PubMed Central

    2012-01-01

    Background Network meta-analysis is becoming increasingly popular for establishing comparative effectiveness among multiple interventions for the same disease. Network meta-analysis inherits all methodological challenges of standard pairwise meta-analysis, but with increased complexity due to the multitude of intervention comparisons. One issue that is now widely recognized in pairwise meta-analysis is the issue of sample size and statistical power. This issue, however, has so far only received little attention in network meta-analysis. To date, no approaches have been proposed for evaluating the adequacy of the sample size, and thus power, in a treatment network. Findings In this article, we develop easy-to-use flexible methods for estimating the ‘effective sample size’ in indirect comparison meta-analysis and network meta-analysis. The effective sample size for a particular treatment comparison can be interpreted as the number of patients in a pairwise meta-analysis that would provide the same degree and strength of evidence as that which is provided in the indirect comparison or network meta-analysis. We further develop methods for retrospectively estimating the statistical power for each comparison in a network meta-analysis. We illustrate the performance of the proposed methods for estimating effective sample size and statistical power using data from a network meta-analysis on interventions for smoking cessation including over 100 trials. Conclusion The proposed methods are easy to use and will be of high value to regulatory agencies and decision makers who must assess the strength of the evidence supporting comparative effectiveness estimates. PMID:22992327

  19. The Essential Genome of Escherichia coli K-12.

    PubMed

    Goodall, Emily C A; Robinson, Ashley; Johnston, Iain G; Jabbari, Sara; Turner, Keith A; Cunningham, Adam F; Lund, Peter A; Cole, Jeffrey A; Henderson, Ian R

    2018-02-20

    Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. IMPORTANCE Incentives to define lists of genes that are essential for bacterial survival include the identification of potential targets for antibacterial drug development, genes required for rapid growth for exploitation in biotechnology, and discovery of new biochemical pathways. To identify essential genes in Escherichia coli , we constructed a transposon mutant library of unprecedented density. Initial automated analysis of the resulting data revealed many discrepancies compared to the literature. We now report more extensive statistical analysis supported by both literature searches and detailed inspection of high-density TraDIS sequencing data for each putative essential gene for the E. coli model laboratory organism. This paper is important because it provides a better understanding of the essential genes of E. coli , reveals the limitations of relying on automated analysis alone, and provides a new standard for the analysis of TraDIS data. Copyright © 2018 Goodall et al.

  20. Racial and Gender Disparities in the Physician Assistant Profession.

    PubMed

    Smith, Darron T; Jacobson, Cardell K

    2016-06-01

    To examine whether racial, gender, and ethnic salary disparities exist in the physician assistant (PA) profession and what factors, if any, are associated with the differentials. We use a nationally representative survey of 15,105 PAs from the American Academy of Physician Assistants (AAPA). We use bivariate and multivariate statistics to analyze pay differentials from the 2009 AAPA survey. Women represent nearly two-thirds of the profession but receive approximately $18,000 less in primary compensation. The differential reduces to just over $9,500 when the analysis includes a variety of other variables. According to AAPA survey, minority PAs tend to make slightly higher salaries than White PAs nationally, although the differences are not statistically significant once the control variables are included in the analysis. Despite the rough parity in primary salary, PAs of color are vastly underrepresented in the profession. The salaries of women lag in comparison to their male counterparts. © Health Research and Educational Trust.

  1. Statistical analysis of water-quality data containing multiple detection limits: S-language software for regression on order statistics

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2005-01-01

    Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. ?? 2005 Elsevier Ltd. All rights reserved.

  2. Spatial variation of volcanic rock geochemistry in the Virunga Volcanic Province: Statistical analysis of an integrated database

    NASA Astrophysics Data System (ADS)

    Barette, Florian; Poppe, Sam; Smets, Benoît; Benbakkar, Mhammed; Kervyn, Matthieu

    2017-10-01

    We present an integrated, spatially-explicit database of existing geochemical major-element analyses available from (post-) colonial scientific reports, PhD Theses and international publications for the Virunga Volcanic Province, located in the western branch of the East African Rift System. This volcanic province is characterised by alkaline volcanism, including silica-undersaturated, alkaline and potassic lavas. The database contains a total of 908 geochemical analyses of eruptive rocks for the entire volcanic province with a localisation for most samples. A preliminary analysis of the overall consistency of the database, using statistical techniques on sets of geochemical analyses with contrasted analytical methods or dates, demonstrates that the database is consistent. We applied a principal component analysis and cluster analysis on whole-rock major element compositions included in the database to study the spatial variation of the chemical composition of eruptive products in the Virunga Volcanic Province. These statistical analyses identify spatially distributed clusters of eruptive products. The known geochemical contrasts are highlighted by the spatial analysis, such as the unique geochemical signature of Nyiragongo lavas compared to other Virunga lavas, the geochemical heterogeneity of the Bulengo area, and the trachyte flows of Karisimbi volcano. Most importantly, we identified separate clusters of eruptive products which originate from primitive magmatic sources. These lavas of primitive composition are preferentially located along NE-SW inherited rift structures, often at distance from the central Virunga volcanoes. Our results illustrate the relevance of a spatial analysis on integrated geochemical data for a volcanic province, as a complement to classical petrological investigations. This approach indeed helps to characterise geochemical variations within a complex of magmatic systems and to identify specific petrologic and geochemical investigations that should be tackled within a study area.

  3. Bias due to selective inclusion and reporting of outcomes and analyses in systematic reviews of randomised trials of healthcare interventions.

    PubMed

    Page, Matthew J; McKenzie, Joanne E; Kirkham, Jamie; Dwan, Kerry; Kramer, Sharon; Green, Sally; Forbes, Andrew

    2014-10-01

    Systematic reviews may be compromised by selective inclusion and reporting of outcomes and analyses. Selective inclusion occurs when there are multiple effect estimates in a trial report that could be included in a particular meta-analysis (e.g. from multiple measurement scales and time points) and the choice of effect estimate to include in the meta-analysis is based on the results (e.g. statistical significance, magnitude or direction of effect). Selective reporting occurs when the reporting of a subset of outcomes and analyses in the systematic review is based on the results (e.g. a protocol-defined outcome is omitted from the published systematic review). To summarise the characteristics and synthesise the results of empirical studies that have investigated the prevalence of selective inclusion or reporting in systematic reviews of randomised controlled trials (RCTs), investigated the factors (e.g. statistical significance or direction of effect) associated with the prevalence and quantified the bias. We searched the Cochrane Methodology Register (to July 2012), Ovid MEDLINE, Ovid EMBASE, Ovid PsycINFO and ISI Web of Science (each up to May 2013), and the US Agency for Healthcare Research and Quality (AHRQ) Effective Healthcare Program's Scientific Resource Center (SRC) Methods Library (to June 2013). We also searched the abstract books of the 2011 and 2012 Cochrane Colloquia and the article alerts for methodological work in research synthesis published from 2009 to 2011 and compiled in Research Synthesis Methods. We included both published and unpublished empirical studies that investigated the prevalence and factors associated with selective inclusion or reporting, or both, in systematic reviews of RCTs of healthcare interventions. We included empirical studies assessing any type of selective inclusion or reporting, such as investigations of how frequently RCT outcome data is selectively included in systematic reviews based on the results, outcomes and analyses are discrepant between protocol and published review or non-significant outcomes are partially reported in the full text or summary within systematic reviews. Two review authors independently selected empirical studies for inclusion, extracted the data and performed a risk of bias assessment. A third review author resolved any disagreements about inclusion or exclusion of empirical studies, data extraction and risk of bias. We contacted authors of included studies for additional unpublished data. Primary outcomes included overall prevalence of selective inclusion or reporting, association between selective inclusion or reporting and the statistical significance of the effect estimate, and association between selective inclusion or reporting and the direction of the effect estimate. We combined prevalence estimates and risk ratios (RRs) using a random-effects meta-analysis model. Seven studies met the inclusion criteria. No studies had investigated selective inclusion of results in systematic reviews, or discrepancies in outcomes and analyses between systematic review registry entries and published systematic reviews. Based on a meta-analysis of four studies (including 485 Cochrane Reviews), 38% (95% confidence interval (CI) 23% to 54%) of systematic reviews added, omitted, upgraded or downgraded at least one outcome between the protocol and published systematic review. The association between statistical significance and discrepant outcome reporting between protocol and published systematic review was uncertain. The meta-analytic estimate suggested an increased risk of adding or upgrading (i.e. changing a secondary outcome to primary) when the outcome was statistically significant, although the 95% CI included no association and a decreased risk as plausible estimates (RR 1.43, 95% CI 0.71 to 2.85; two studies, n = 552 meta-analyses). Also, the meta-analytic estimate suggested an increased risk of downgrading (i.e. changing a primary outcome to secondary) when the outcome was statistically significant, although the 95% CI included no association and a decreased risk as plausible estimates (RR 1.26, 95% CI 0.60 to 2.62; two studies, n = 484 meta-analyses). None of the included studies had investigated whether the association between statistical significance and adding, upgrading or downgrading of outcomes was modified by the type of comparison, direction of effect or type of outcome; or whether there is an association between direction of the effect estimate and discrepant outcome reporting.Several secondary outcomes were reported in the included studies. Two studies found that reasons for discrepant outcome reporting were infrequently reported in published systematic reviews (6% in one study and 22% in the other). One study (including 62 Cochrane Reviews) found that 32% (95% CI 21% to 45%) of systematic reviews did not report all primary outcomes in the abstract. Another study (including 64 Cochrane and 118 non-Cochrane reviews) found that statistically significant primary outcomes were more likely to be completely reported in the systematic review abstract than non-significant primary outcomes (RR 2.66, 95% CI 1.81 to 3.90). None of the studies included systematic reviews published after 2009 when reporting standards for systematic reviews (Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) Statement, and Methodological Expectations of Cochrane Intervention Reviews (MECIR)) were disseminated, so the results might not be generalisable to more recent systematic reviews. Discrepant outcome reporting between the protocol and published systematic review is fairly common, although the association between statistical significance and discrepant outcome reporting is uncertain. Complete reporting of outcomes in systematic review abstracts is associated with statistical significance of the results for those outcomes. Systematic review outcomes and analysis plans should be specified prior to seeing the results of included studies to minimise post-hoc decisions that may be based on the observed results. Modifications that occur once the review has commenced, along with their justification, should be clearly reported. Effect estimates and CIs should be reported for all systematic review outcomes regardless of the results. The lack of research on selective inclusion of results in systematic reviews needs to be addressed and studies that avoid the methodological weaknesses of existing research are also needed.

  4. Image encryption based on a delayed fractional-order chaotic logistic system

    NASA Astrophysics Data System (ADS)

    Wang, Zhen; Huang, Xia; Li, Ning; Song, Xiao-Na

    2012-05-01

    A new image encryption scheme is proposed based on a delayed fractional-order chaotic logistic system. In the process of generating a key stream, the time-varying delay and fractional derivative are embedded in the proposed scheme to improve the security. Such a scheme is described in detail with security analyses including correlation analysis, information entropy analysis, run statistic analysis, mean-variance gray value analysis, and key sensitivity analysis. Experimental results show that the newly proposed image encryption scheme possesses high security.

  5. Study/experimental/research design: much more than statistics.

    PubMed

    Knight, Kenneth L

    2010-01-01

    The purpose of study, experimental, or research design in scientific manuscripts has changed significantly over the years. It has evolved from an explanation of the design of the experiment (ie, data gathering or acquisition) to an explanation of the statistical analysis. This practice makes "Methods" sections hard to read and understand. To clarify the difference between study design and statistical analysis, to show the advantages of a properly written study design on article comprehension, and to encourage authors to correctly describe study designs. The role of study design is explored from the introduction of the concept by Fisher through modern-day scientists and the AMA Manual of Style. At one time, when experiments were simpler, the study design and statistical design were identical or very similar. With the complex research that is common today, which often includes manipulating variables to create new variables and the multiple (and different) analyses of a single data set, data collection is very different than statistical design. Thus, both a study design and a statistical design are necessary. Scientific manuscripts will be much easier to read and comprehend. A proper experimental design serves as a road map to the study methods, helping readers to understand more clearly how the data were obtained and, therefore, assisting them in properly analyzing the results.

  6. Power-up: A Reanalysis of 'Power Failure' in Neuroscience Using Mixture Modeling

    PubMed Central

    Wood, John

    2017-01-01

    Recently, evidence for endemically low statistical power has cast neuroscience findings into doubt. If low statistical power plagues neuroscience, then this reduces confidence in the reported effects. However, if statistical power is not uniformly low, then such blanket mistrust might not be warranted. Here, we provide a different perspective on this issue, analyzing data from an influential study reporting a median power of 21% across 49 meta-analyses (Button et al., 2013). We demonstrate, using Gaussian mixture modeling, that the sample of 730 studies included in that analysis comprises several subcomponents so the use of a single summary statistic is insufficient to characterize the nature of the distribution. We find that statistical power is extremely low for studies included in meta-analyses that reported a null result and that it varies substantially across subfields of neuroscience, with particularly low power in candidate gene association studies. Therefore, whereas power in neuroscience remains a critical issue, the notion that studies are systematically underpowered is not the full story: low power is far from a universal problem. SIGNIFICANCE STATEMENT Recently, researchers across the biomedical and psychological sciences have become concerned with the reliability of results. One marker for reliability is statistical power: the probability of finding a statistically significant result given that the effect exists. Previous evidence suggests that statistical power is low across the field of neuroscience. Our results present a more comprehensive picture of statistical power in neuroscience: on average, studies are indeed underpowered—some very seriously so—but many studies show acceptable or even exemplary statistical power. We show that this heterogeneity in statistical power is common across most subfields in neuroscience. This new, more nuanced picture of statistical power in neuroscience could affect not only scientific understanding, but potentially policy and funding decisions for neuroscience research. PMID:28706080

  7. The Chandra Source Catalog

    NASA Astrophysics Data System (ADS)

    Evans, Ian; Primini, Francis A.; Glotfelty, Kenny J.; Anderson, Craig S.; Bonaventura, Nina R.; Chen, Judy C.; Davis, John E.; Doe, Stephen M.; Evans, Janet D.; Fabbiano, Giuseppina; Galle, Elizabeth C.; Gibbs, Danny G., II; Grier, John D.; Hain, Roger; Hall, Diane M.; Harbo, Peter N.; He, Xiang Qun (Helen); Houck, John C.; Karovska, Margarita; Kashyap, Vinay L.; Lauer, Jennifer; McCollough, Michael L.; McDowell, Jonathan C.; Miller, Joseph B.; Mitschang, Arik W.; Morgan, Douglas L.; Mossman, Amy E.; Nichols, Joy S.; Nowak, Michael A.; Plummer, David A.; Refsdal, Brian L.; Rots, Arnold H.; Siemiginowska, Aneta L.; Sundheim, Beth A.; Tibbetts, Michael S.; van Stone, David W.; Winkelman, Sherry L.; Zografou, Panagoula

    2009-09-01

    The first release of the Chandra Source Catalog (CSC) was published in 2009 March, and includes information about 94,676 X-ray sources detected in a subset of public ACIS imaging observations from roughly the first eight years of the Chandra mission. This release of the catalog includes point and compact sources with observed spatial extents <˜30''.The CSC is a general purpose virtual X-ray astrophysics facility that provides access to a carefully selected set of generally useful quantities for individual X-ray sources, and is designed to satisfy the needs of a broad-based group of scientists, including those who may be less familiar with astronomical data analysis in the X-ray regime.The catalog (1) provides access to the best estimates of the X-ray source properties for detected sources, with good scientific fidelity, and directly supports medium sophistication scientific analysis on using the individual source data; (2) facilitates analysis of a wide range of statistical properties for classes of X-ray sources; (3) provides efficient access to calibrated observational data and ancillary data products for individual X-ray sources, so that users can perform detailed further analysis using existing tools; and (4) includes real X-ray sources detected with flux significance greater than a predefined threshold, while maintaining the number of spurious sources at an acceptable level. For each detected X-ray source, the CSC provides commonly tabulated quantities, including source position, extent, multi-band fluxes, hardness ratios, and variability statistics, derived from the observations in which the source is detected. In addition to these traditional catalog elements, for each X-ray source the CSC includes an extensive set of file-based data products that can be manipulated interactively, including source images, event lists, light curves, and spectra from each observation in which a source is detected.

  8. The coordinate-based meta-analysis of neuroimaging data.

    PubMed

    Samartsidis, Pantelis; Montagna, Silvia; Nichols, Thomas E; Johnson, Timothy D

    2017-01-01

    Neuroimaging meta-analysis is an area of growing interest in statistics. The special characteristics of neuroimaging data render classical meta-analysis methods inapplicable and therefore new methods have been developed. We review existing methodologies, explaining the benefits and drawbacks of each. A demonstration on a real dataset of emotion studies is included. We discuss some still-open problems in the field to highlight the need for future research.

  9. The coordinate-based meta-analysis of neuroimaging data

    PubMed Central

    Samartsidis, Pantelis; Montagna, Silvia; Nichols, Thomas E.; Johnson, Timothy D.

    2017-01-01

    Neuroimaging meta-analysis is an area of growing interest in statistics. The special characteristics of neuroimaging data render classical meta-analysis methods inapplicable and therefore new methods have been developed. We review existing methodologies, explaining the benefits and drawbacks of each. A demonstration on a real dataset of emotion studies is included. We discuss some still-open problems in the field to highlight the need for future research. PMID:29545671

  10. Ozone data and mission sampling analysis

    NASA Technical Reports Server (NTRS)

    Robbins, J. L.

    1980-01-01

    A methodology was developed to analyze discrete data obtained from the global distribution of ozone. Statistical analysis techniques were applied to describe the distribution of data variance in terms of empirical orthogonal functions and components of spherical harmonic models. The effects of uneven data distribution and missing data were considered. Data fill based on the autocorrelation structure of the data is described. Computer coding of the analysis techniques is included.

  11. The CTS 11.7 GHz angle of arrival experiment

    NASA Technical Reports Server (NTRS)

    Kwan, B. W.; Hodge, D. B.

    1981-01-01

    The objective of the experiment was to determine the statistical behavior of attenuation and angle of arrival on an Earth-space propagation path using the CTS 11.7 GHz beacon. Measurements performed from 1976 to 1978 form the data base for analysis. The statistics of the signal attenuation and phase variations due to atmospheric disturbances are presented. Rainfall rate distributions are also included to provide a link between the above effects on wave propagation and meteorological conditions.

  12. qFeature

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2015-09-14

    This package contains statistical routines for extracting features from multivariate time-series data which can then be used for subsequent multivariate statistical analysis to identify patterns and anomalous behavior. It calculates local linear or quadratic regression model fits to moving windows for each series and then summarizes the model coefficients across user-defined time intervals for each series. These methods are domain agnostic-but they have been successfully applied to a variety of domains, including commercial aviation and electric power grid data.

  13. SRS 2010 Vegetation Inventory GeoStatistical Mapping Results for Custom Reaction Intensity and Total Dead Fuels.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Edwards, Lloyd A.; Paresol, Bernard

    This report of the geostatistical analysis results of the fire fuels response variables, custom reaction intensity and total dead fuels is but a part of an SRS 2010 vegetation inventory project. For detailed description of project, theory and background including sample design, methods, and results please refer to USDA Forest Service Savannah River Site internal report “SRS 2010 Vegetation Inventory GeoStatistical Mapping Report”, (Edwards & Parresol 2013).

  14. Noise characteristics of the Skylab S-193 altimeter altitude measurements

    NASA Technical Reports Server (NTRS)

    Hatch, W. E.

    1975-01-01

    The statistical characteristics of the SKYLAB S-193 altimeter altitude noise are considered. These results are reported in a concise format for use and analysis by the scientific community. In most instances the results have been grouped according to satellite pointing so that the effects of pointing on the statistical characteristics can be readily seen. The altimeter measurements and the processing techniques are described. The mathematical descriptions of the computer programs used for these results are included.

  15. Introduction to bioinformatics.

    PubMed

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistics analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.

  16. FADTTSter: accelerating hypothesis testing with functional analysis of diffusion tensor tract statistics

    NASA Astrophysics Data System (ADS)

    Noel, Jean; Prieto, Juan C.; Styner, Martin

    2017-03-01

    Functional Analysis of Diffusion Tensor Tract Statistics (FADTTS) is a toolbox for analysis of white matter (WM) fiber tracts. It allows associating diffusion properties along major WM bundles with a set of covariates of interest, such as age, diagnostic status and gender, and the structure of the variability of these WM tract properties. However, to use this toolbox, a user must have an intermediate knowledge in scripting languages (MATLAB). FADTTSter was created to overcome this issue and make the statistical analysis accessible to any non-technical researcher. FADTTSter is actively being used by researchers at the University of North Carolina. FADTTSter guides non-technical users through a series of steps including quality control of subjects and fibers in order to setup the necessary parameters to run FADTTS. Additionally, FADTTSter implements interactive charts for FADTTS' outputs. This interactive chart enhances the researcher experience and facilitates the analysis of the results. FADTTSter's motivation is to improve usability and provide a new analysis tool to the community that complements FADTTS. Ultimately, by enabling FADTTS to a broader audience, FADTTSter seeks to accelerate hypothesis testing in neuroimaging studies involving heterogeneous clinical data and diffusion tensor imaging. This work is submitted to the Biomedical Applications in Molecular, Structural, and Functional Imaging conference. The source code of this application is available in NITRC.

  17. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    PubMed

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.

  18. Evaluating regional trends in ground-water nitrate concentrations of the Columbia Basin ground water management area, Washington

    USGS Publications Warehouse

    Frans, Lonna M.; Helsel, Dennis R.

    2005-01-01

    Trends in nitrate concentrations in water from 474 wells in 17 subregions in the Columbia Basin Ground Water Management Area (GWMA) in three counties in eastern Washington were evaluated using a variety of statistical techniques, including the Friedman test and the Kendall test. The Kendall test was modified from its typical 'seasonal' version into a 'regional' version by using well locations in place of seasons. No statistically significant trends in nitrate concentrations were identified in samples from wells in the GWMA, the three counties, or the 17 subregions from 1998 to 2002 when all data were included in the analysis. For wells in which nitrate concentrations were greater than 10 milligrams per liter (mg/L), however, a significant downward trend of -0.4 mg/L per year was observed between 1998 and 2002 for the GWMA as a whole, as well as for Adams County (-0.35 mg/L per year) and for Franklin County (-0.46 mg/L per year). Trend analysis for a smaller but longer-term 51-well dataset in Franklin County found a statistically significant upward trend in nitrate concentrations of 0.1 mg/L per year between 1986 and 2003. The largest increase of nitrate concentrations occurred between 1986 and 1991. No statistically significant differences were observed in this dataset between 1998 and 2003 indicating that the increase in nitrate concentrations has leveled off.

  19. Trial Sequential Analysis in systematic reviews with meta-analysis.

    PubMed

    Wetterslev, Jørn; Jakobsen, Janus Christian; Gluud, Christian

    2017-03-06

    Most meta-analyses in systematic reviews, including Cochrane ones, do not have sufficient statistical power to detect or refute even large intervention effects. This is why a meta-analysis ought to be regarded as an interim analysis on its way towards a required information size. The results of the meta-analyses should relate the total number of randomised participants to the estimated required meta-analytic information size accounting for statistical diversity. When the number of participants and the corresponding number of trials in a meta-analysis are insufficient, the use of the traditional 95% confidence interval or the 5% statistical significance threshold will lead to too many false positive conclusions (type I errors) and too many false negative conclusions (type II errors). We developed a methodology for interpreting meta-analysis results, using generally accepted, valid evidence on how to adjust thresholds for significance in randomised clinical trials when the required sample size has not been reached. The Lan-DeMets trial sequential monitoring boundaries in Trial Sequential Analysis offer adjusted confidence intervals and restricted thresholds for statistical significance when the diversity-adjusted required information size and the corresponding number of required trials for the meta-analysis have not been reached. Trial Sequential Analysis provides a frequentistic approach to control both type I and type II errors. We define the required information size and the corresponding number of required trials in a meta-analysis and the diversity (D 2 ) measure of heterogeneity. We explain the reasons for using Trial Sequential Analysis of meta-analysis when the actual information size fails to reach the required information size. We present examples drawn from traditional meta-analyses using unadjusted naïve 95% confidence intervals and 5% thresholds for statistical significance. Spurious conclusions in systematic reviews with traditional meta-analyses can be reduced using Trial Sequential Analysis. Several empirical studies have demonstrated that the Trial Sequential Analysis provides better control of type I errors and of type II errors than the traditional naïve meta-analysis. Trial Sequential Analysis represents analysis of meta-analytic data, with transparent assumptions, and better control of type I and type II errors than the traditional meta-analysis using naïve unadjusted confidence intervals.

  20. Statistical tools for transgene copy number estimation based on real-time PCR.

    PubMed

    Yuan, Joshua S; Burris, Jason; Stewart, Nathan R; Mentewab, Ayalew; Stewart, C Neal

    2007-11-01

    As compared with traditional transgene copy number detection technologies such as Southern blot analysis, real-time PCR provides a fast, inexpensive and high-throughput alternative. However, the real-time PCR based transgene copy number estimation tends to be ambiguous and subjective stemming from the lack of proper statistical analysis and data quality control to render a reliable estimation of copy number with a prediction value. Despite the recent progresses in statistical analysis of real-time PCR, few publications have integrated these advancements in real-time PCR based transgene copy number determination. Three experimental designs and four data quality control integrated statistical models are presented. For the first method, external calibration curves are established for the transgene based on serially-diluted templates. The Ct number from a control transgenic event and putative transgenic event are compared to derive the transgene copy number or zygosity estimation. Simple linear regression and two group T-test procedures were combined to model the data from this design. For the second experimental design, standard curves were generated for both an internal reference gene and the transgene, and the copy number of transgene was compared with that of internal reference gene. Multiple regression models and ANOVA models can be employed to analyze the data and perform quality control for this approach. In the third experimental design, transgene copy number is compared with reference gene without a standard curve, but rather, is based directly on fluorescence data. Two different multiple regression models were proposed to analyze the data based on two different approaches of amplification efficiency integration. Our results highlight the importance of proper statistical treatment and quality control integration in real-time PCR-based transgene copy number determination. These statistical methods allow the real-time PCR-based transgene copy number estimation to be more reliable and precise with a proper statistical estimation. Proper confidence intervals are necessary for unambiguous prediction of trangene copy number. The four different statistical methods are compared for their advantages and disadvantages. Moreover, the statistical methods can also be applied for other real-time PCR-based quantification assays including transfection efficiency analysis and pathogen quantification.

  1. Statistical principle and methodology in the NISAN system.

    PubMed Central

    Asano, C

    1979-01-01

    The NISAN system is a new interactive statistical analysis program package constructed by an organization of Japanese statisticans. The package is widely available for both statistical situations, confirmatory analysis and exploratory analysis, and is planned to obtain statistical wisdom and to choose optimal process of statistical analysis for senior statisticians. PMID:540594

  2. Evaluating the statistical methodology of randomized trials on dentin hypersensitivity management.

    PubMed

    Matranga, Domenica; Matera, Federico; Pizzo, Giuseppe

    2017-12-27

    The present study aimed to evaluate the characteristics and quality of statistical methodology used in clinical studies on dentin hypersensitivity management. An electronic search was performed for data published from 2009 to 2014 by using PubMed, Ovid/MEDLINE, and Cochrane Library databases. The primary search terms were used in combination. Eligibility criteria included randomized clinical trials that evaluated the efficacy of desensitizing agents in terms of reducing dentin hypersensitivity. A total of 40 studies were considered eligible for assessment of quality statistical methodology. The four main concerns identified were i) use of nonparametric tests in the presence of large samples, coupled with lack of information about normality and equality of variances of the response; ii) lack of P-value adjustment for multiple comparisons; iii) failure to account for interactions between treatment and follow-up time; and iv) no information about the number of teeth examined per patient and the consequent lack of cluster-specific approach in data analysis. Owing to these concerns, statistical methodology was judged as inappropriate in 77.1% of the 35 studies that used parametric methods. Additional studies with appropriate statistical analysis are required to obtain appropriate assessment of the efficacy of desensitizing agents.

  3. An experiment on the impact of a neonicotinoid pesticide on honeybees: the value of a formal analysis of the data.

    PubMed

    Schick, Robert S; Greenwood, Jeremy J D; Buckland, Stephen T

    2017-01-01

    We assess the analysis of the data resulting from a field experiment conducted by Pilling et al. (PLoS ONE. doi: 10.1371/journal.pone.0077193, 5) on the potential effects of thiamethoxam on honeybees. The experiment had low levels of replication, so Pilling et al. concluded that formal statistical analysis would be misleading. This would be true if such an analysis merely comprised tests of statistical significance and if the investigators concluded that lack of significance meant little or no effect. However, an analysis that includes estimation of the size of any effects-with confidence limits-allows one to reach conclusions that are not misleading and that produce useful insights. For the data of Pilling et al., we use straightforward statistical analysis to show that the confidence limits are generally so wide that any effects of thiamethoxam could have been large without being statistically significant. Instead of formal analysis, Pilling et al. simply inspected the data and concluded that they provided no evidence of detrimental effects and from this that thiamethoxam poses a "low risk" to bees. Conclusions derived from the inspection of the data were not just misleading in this case but also are unacceptable in principle, for if data are inadequate for a formal analysis (or only good enough to provide estimates with wide confidence intervals), then they are bound to be inadequate as a basis for reaching any sound conclusions. Given that the data in this case are largely uninformative with respect to the treatment effect, any conclusions reached from such informal approaches can do little more than reflect the prior beliefs of those involved.

  4. A statistically derived index for classifying East Coast fever reactions in cattle challenged with Theileria parva under experimental conditions.

    PubMed

    Rowlands, G J; Musoke, A J; Morzaria, S P; Nagda, S M; Ballingall, K T; McKeever, D J

    2000-04-01

    A statistically derived disease reaction index based on parasitological, clinical and haematological measurements observed in 309 5 to 8-month-old Boran cattle following laboratory challenge with Theileria parva is described. Principal component analysis was applied to 13 measures including first appearance of schizonts, first appearance of piroplasms and first occurrence of pyrexia, together with the duration and severity of these symptoms, and white blood cell count. The first principal component, which was based on approximately equal contributions of the 13 variables, provided the definition for the disease reaction index, defined on a scale of 0-10. As well as providing a more objective measure of the severity of the reaction, the continuous nature of the index score enables more powerful statistical analysis of the data compared with that which has been previously possible through clinically derived categories of non-, mild, moderate and severe reactions.

  5. High-Density Signal Interface Electromagnetic Radiation Prediction for Electromagnetic Compatibility Evaluation.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Halligan, Matthew

    Radiated power calculation approaches for practical scenarios of incomplete high- density interface characterization information and incomplete incident power information are presented. The suggested approaches build upon a method that characterizes power losses through the definition of power loss constant matrices. Potential radiated power estimates include using total power loss information, partial radiated power loss information, worst case analysis, and statistical bounding analysis. A method is also proposed to calculate radiated power when incident power information is not fully known for non-periodic signals at the interface. Incident data signals are modeled from a two-state Markov chain where bit state probabilities aremore » derived. The total spectrum for windowed signals is postulated as the superposition of spectra from individual pulses in a data sequence. Statistical bounding methods are proposed as a basis for the radiated power calculation due to the statistical calculation complexity to find a radiated power probability density function.« less

  6. The statistical analysis of circadian phase and amplitude in constant-routine core-temperature data

    NASA Technical Reports Server (NTRS)

    Brown, E. N.; Czeisler, C. A.

    1992-01-01

    Accurate estimation of the phases and amplitude of the endogenous circadian pacemaker from constant-routine core-temperature series is crucial for making inferences about the properties of the human biological clock from data collected under this protocol. This paper presents a set of statistical methods based on a harmonic-regression-plus-correlated-noise model for estimating the phases and the amplitude of the endogenous circadian pacemaker from constant-routine core-temperature data. The methods include a Bayesian Monte Carlo procedure for computing the uncertainty in these circadian functions. We illustrate the techniques with a detailed study of a single subject's core-temperature series and describe their relationship to other statistical methods for circadian data analysis. In our laboratory, these methods have been successfully used to analyze more than 300 constant routines and provide a highly reliable means of extracting phase and amplitude information from core-temperature data.

  7. Cosmology constraints from shear peak statistics in Dark Energy Survey Science Verification data

    DOE PAGES

    Kacprzak, T.; Kirk, D.; Friedrich, O.; ...

    2016-08-19

    Shear peak statistics has gained a lot of attention recently as a practical alternative to the two point statistics for constraining cosmological parameters. We perform a shear peak statistics analysis of the Dark Energy Survey (DES) Science Verification (SV) data, using weak gravitational lensing measurements from a 139 degmore » $^2$ field. We measure the abundance of peaks identified in aperture mass maps, as a function of their signal-to-noise ratio, in the signal-to-noise range $$0<\\mathcal S / \\mathcal N<4$$. To predict the peak counts as a function of cosmological parameters we use a suite of $N$-body simulations spanning 158 models with varying $$\\Omega_{\\rm m}$$ and $$\\sigma_8$$, fixing $w = -1$, $$\\Omega_{\\rm b} = 0.04$$, $h = 0.7$ and $$n_s=1$$, to which we have applied the DES SV mask and redshift distribution. In our fiducial analysis we measure $$\\sigma_{8}(\\Omega_{\\rm m}/0.3)^{0.6}=0.77 \\pm 0.07$$, after marginalising over the shear multiplicative bias and the error on the mean redshift of the galaxy sample. We introduce models of intrinsic alignments, blending, and source contamination by cluster members. These models indicate that peaks with $$\\mathcal S / \\mathcal N>4$$ would require significant corrections, which is why we do not include them in our analysis. We compare our results to the cosmological constraints from the two point analysis on the SV field and find them to be in good agreement in both the central value and its uncertainty. As a result, we discuss prospects for future peak statistics analysis with upcoming DES data.« less

  8. Effects of Inaccurate Identification of Interictal Epileptiform Discharges in Concurrent EEG-fMRI

    NASA Astrophysics Data System (ADS)

    Gkiatis, K.; Bromis, K.; Kakkos, I.; Karanasiou, I. S.; Matsopoulos, G. K.; Garganis, K.

    2017-11-01

    Concurrent continuous EEG-fMRI is a novel multimodal technique that is finding its way into clinical practice in epilepsy. EEG timeseries are used to identify the timing of interictal epileptiform discharges (IEDs) which is then included in a GLM analysis in fMRI to localize the epileptic onset zone. Nevertheless, there are still some concerns about its reliability concerning BOLD changes correlated with IEDs. Even though IEDs are identified by an experienced neurologist-epiliptologist, the reliability and concordance of the mark-ups is depending on many factors including the level of fatigue, the amount of time that he spent or, in some cases, even the screen that is being used for the display of timeseries. This investigation is aiming to unravel the effect of misidentification or inaccuracy in the mark-ups of IEDs in the fMRI statistical parametric maps. Concurrent EEG-fMRI was conducted in six subjects with various types of epilepsy. IEDs were identified by an experienced neurologist-epiliptologist. Analysis of EEG was performed with EEGLAB and analysis of fMRI was conducted in FSL. Preliminary results revealed lower statistical significance for missing events or larger period of IEDs than the actual ones and the introduction of false positives and false negatives in statistical parametric maps when random events were included in the GLM on top of the IEDs. Our results suggest that mark-ups in EEG for simultaneous EEG-fMRI should be done with caution from an experienced and restful neurologist as it affects the fMRI results in various and unpredicted ways.

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steed, Chad Allen

    EDENx is a multivariate data visualization tool that allows interactive user driven analysis of large-scale data sets with high dimensionality. EDENx builds on our earlier system, called EDEN to enable analysis of more dimensions and larger scale data sets. EDENx provides an initial overview of summary statistics for each variable in the data set under investigation. EDENx allows the user to interact with graphical summary plots of the data to investigate subsets and their statistical associations. These plots include histograms, binned scatterplots, binned parallel coordinate plots, timeline plots, and graphical correlation indicators. From the EDENx interface, a user can selectmore » a subsample of interest and launch a more detailed data visualization via the EDEN system. EDENx is best suited for high-level, aggregate analysis tasks while EDEN is more appropriate for detail data investigations.« less

  10. Inactive Hepatitis B Carrier and Pregnancy Outcomes: A Systematic Review and Meta-analysis.

    PubMed

    Keramat, Afsaneh; Younesian, Masud; Gholami Fesharaki, Mohammad; Hasani, Maryam; Mirzaei, Samaneh; Ebrahimi, Elham; Alavian, Seyed Moaed; Mohammadi, Fatemeh

    2017-04-01

    We aimed to explore whether maternal asymptomatic hepatitis B (HB) infection effects on pre-term rupture of membranous (PROM), stillbirth, preeclampsia, eclampsia, gestational hypertension, or antepartum hemorrhage. We searched the PubMed, Scopus, and ISI web of science from 1990 to Feb 2015. In addition, electronic literature searches supplemented by searching the gray literature (e.g., conference abstracts thesis and the result of technical reports) and scanning the reference lists of included studies and relevant systematic reviews. We explored statistical heterogeneity using the, I2 and tau-squared (Tau2) statistical tests. Eighteen studies were included. Preterm rupture of membranous (PROM), stillbirth, preeclampsia, eclampsia, gestational hypertension and antepartum hemorrhage were considerable outcomes in this survey. The results showed no significant association between inactive HB and these complications in pregnancy. The small amounts of P -value and chi-square and large amount of I2 suggested the probable heterogeneity in this part, which we tried to modify with statistical methods such as subgroup analysis. Inactive HB infection did not increase the risk of adversely mentioned outcomes in this study. Further, well-designed studies should be performed to confirm the results.

  11. A Monte Carlo investigation of thrust imbalance of solid rocket motor pairs

    NASA Technical Reports Server (NTRS)

    Sforzini, R. H.; Foster, W. A., Jr.; Johnson, J. S., Jr.

    1974-01-01

    A technique is described for theoretical, statistical evaluation of the thrust imbalance of pairs of solid-propellant rocket motors (SRMs) firing in parallel. Sets of the significant variables, determined as a part of the research, are selected using a random sampling technique and the imbalance calculated for a large number of motor pairs. The performance model is upgraded to include the effects of statistical variations in the ovality and alignment of the motor case and mandrel. Effects of cross-correlations of variables are minimized by selecting for the most part completely independent input variables, over forty in number. The imbalance is evaluated in terms of six time - varying parameters as well as eleven single valued ones which themselves are subject to statistical analysis. A sample study of the thrust imbalance of 50 pairs of 146 in. dia. SRMs of the type to be used on the space shuttle is presented. The FORTRAN IV computer program of the analysis and complete instructions for its use are included. Performance computation time for one pair of SRMs is approximately 35 seconds on the IBM 370/155 using the FORTRAN H compiler.

  12. Estimating individual benefits of medical or behavioral treatments in severely ill patients.

    PubMed

    Diaz, Francisco J

    2017-01-01

    There is a need for statistical methods appropriate for the analysis of clinical trials from a personalized-medicine viewpoint as opposed to the common statistical practice that simply examines average treatment effects. This article proposes an approach to quantifying, reporting and analyzing individual benefits of medical or behavioral treatments to severely ill patients with chronic conditions, using data from clinical trials. The approach is a new development of a published framework for measuring the severity of a chronic disease and the benefits treatments provide to individuals, which utilizes regression models with random coefficients. Here, a patient is considered to be severely ill if the patient's basal severity is close to one. This allows the derivation of a very flexible family of probability distributions of individual benefits that depend on treatment duration and the covariates included in the regression model. Our approach may enrich the statistical analysis of clinical trials of severely ill patients because it allows investigating the probability distribution of individual benefits in the patient population and the variables that influence it, and we can also measure the benefits achieved in specific patients including new patients. We illustrate our approach using data from a clinical trial of the anti-depressant imipramine.

  13. Antiviral treatment of Bell's palsy based on baseline severity: a systematic review and meta-analysis.

    PubMed

    Turgeon, Ricky D; Wilby, Kyle J; Ensom, Mary H H

    2015-06-01

    We conducted a systematic review with meta-analysis to evaluate the efficacy of antiviral agents on complete recovery of Bell's palsy. We searched CENTRAL, Embase, MEDLINE, International Pharmaceutical Abstracts, and sources of unpublished literature to November 1, 2014. Primary and secondary outcomes were complete and satisfactory recovery, respectively. To evaluate statistical heterogeneity, we performed subgroup analysis of baseline severity of Bell's palsy and between-study sensitivity analyses based on risk of allocation and detection bias. The 10 included randomized controlled trials (2419 patients; 807 with severe Bell's palsy at onset) had variable risk of bias, with 9 trials having a high risk of bias in at least 1 domain. Complete recovery was not statistically significantly greater with antiviral use versus no antiviral use in the random-effects meta-analysis of 6 trials (relative risk, 1.06; 95% confidence interval, 0.97-1.16; I(2) = 65%). Conversely, random-effects meta-analysis of 9 trials showed a statistically significant difference in satisfactory recovery (relative risk, 1.10; 95% confidence interval, 1.02-1.18; I(2) = 63%). Response to antiviral agents did not differ visually or statistically between patients with severe symptoms at baseline and those with milder disease (test for interaction, P = .11). Sensitivity analyses did not show a clear effect of bias on outcomes. Antiviral agents are not efficacious in increasing the proportion of patients with Bell's palsy who achieved complete recovery, regardless of baseline symptom severity. Copyright © 2015 Elsevier Inc. All rights reserved.

  14. Impact of e-learning on nurses' and student nurses knowledge, skills, and satisfaction: a systematic review and meta-analysis.

    PubMed

    Lahti, Mari; Hätönen, Heli; Välimäki, Maritta

    2014-01-01

    To review the impact of e-learning on nurses' and nursing student's knowledge, skills and satisfaction related to e-learning. We conducted a systematic review and meta-analysis of randomized controlled trials (RCT) to assess the impact of e-learning on nurses' and nursing student's knowledge, skills and satisfaction. Electronic databases including MEDLINE (1948-2010), CINAHL (1981-2010), Psychinfo (1967-2010) and Eric (1966-2010) were searched in May 2010 and again in December 2010. All RCT studies evaluating the effectiveness of e-learning and differentiating between traditional learning methods among nurses were included. Data was extracted related to the purpose of the trial, sample, measurements used, index test results and reference standard. An extraction tool developed for Cochrane reviews was used. Methodological quality of eligible trials was assessed. 11 trials were eligible for inclusion in the analysis. We identified 11 randomized controlled trials including a total of 2491 nurses and student nurses'. First, the random effect size for four studies showed some improvement associated with e-learning compared to traditional techniques on knowledge. However, the difference was not statistically significant (p=0.39, MD 0.44, 95% CI -0.57 to 1.46). Second, one study reported a slight impact on e-learning on skills, but the difference was not statistically significant, either (p=0.13, MD 0.03, 95% CI -0.09 to 0.69). And third, no results on nurses or student nurses' satisfaction could be reported as the statistical data from three possible studies were not available. Overall, there was no statistical difference between groups in e-learning and traditional learning relating to nurses' or student nurses' knowledge, skills and satisfaction. E-learning can, however, offer an alternative method of education. In future, more studies following the CONSORT and QUOROM statements are needed to evaluate the effects of these interventions. Copyright © 2013 Elsevier Ltd. All rights reserved.

  15. Evidence of nonextensive statistical physics behavior in the watershed distribution in active tectonic areas: examples from Greece

    NASA Astrophysics Data System (ADS)

    Vallianatos, Filippos; Kouli, Maria

    2013-08-01

    The Digital Elevation Model (DEM) for the Crete Island with a resolution of approximately 20 meters was used in order to delineate watersheds by computing the flow direction and using it in the Watershed function. The Watershed function uses a raster of flow direction to determine contributing area. The Geographic Information Systems routine procedure was applied and the watersheds as well as the streams network (using a threshold of 2000 cells, i.e. the minimum number of cells that constitute a stream) were extracted from the hydrologically corrected (free of sinks) DEM. A number of a few thousand watersheds were delineated, and their areal extent was calculated. From these watersheds a number of 300 was finally selected for further analysis as the watersheds of extremely small area were excluded in order to avoid possible artifacts. Our analysis approach is based on the basic principles of Complexity theory and Tsallis Entropy introduces in the frame of non-extensive statistical physics. This concept has been successfully used for the analysis of a variety of complex dynamic systems including natural hazards, where fractality and long-range interactions are important. The analysis indicates that the statistical distribution of watersheds can be successfully described with the theoretical estimations of non-extensive statistical physics implying the complexity that characterizes the occurrences of them.

  16. In-depth statistical analysis of the use of a website providing patients' narratives on lifestyle change when living with chronic back pain or coronary heart disease.

    PubMed

    Schweier, Rebecca; Grande, Gesine; Richter, Cynthia; Riedel-Heller, Steffi G; Romppel, Matthias

    2018-07-01

    To investigate the use of lebensstil-aendern.de ("lifestyle change"), a website providing peer narratives of experiences with successful lifestyle change, and to analyze whether peer model characteristics, clip content, and media type have an influence on the number of visitors, dwell time, and exit rates. An in-depth statistical analysis of website use with multilevel regression analyses. In two years, lebensstil-aendern.de attracted 12,844 visitors. The in-depth statistical analysis of usage rates demonstrated that audio clips were less popular than video or text-only clips, longer clips attracted more visitors, and clips by younger and female interviewees were preferred. User preferences for clip content categories differed between heart and back pain patients. Clips about stress management drew the smallest numbers of visitors in both indication modules. Patients are interested in the experiences of others. Because the quality of information for user-generated content is generally low, healthcare providers should include quality-assured patient narratives in their interventions. User preferences for content, medium, and peer characteristics need to be taken into account. If healthcare providers decide to include patient experiences in their websites, they should plan their intervention according to the different needs and preferences of users. Copyright © 2018 Elsevier B.V. All rights reserved.

  17. Non-extensivity and complexity in the earthquake activity at the West Corinth rift (Greece)

    NASA Astrophysics Data System (ADS)

    Michas, Georgios; Vallianatos, Filippos; Sammonds, Peter

    2013-04-01

    Earthquakes exhibit complex phenomenology that is revealed from the fractal structure in space, time and magnitude. For that reason other tools rather than the simple Poissonian statistics seem more appropriate to describe the statistical properties of the phenomenon. Here we use Non-Extensive Statistical Physics [NESP] to investigate the inter-event time distribution of the earthquake activity at the west Corinth rift (central Greece). This area is one of the most seismotectonically active areas in Europe, with an important continental N-S extension and high seismicity rates. NESP concept refers to the non-additive Tsallis entropy Sq that includes Boltzmann-Gibbs entropy as a particular case. This concept has been successfully used for the analysis of a variety of complex dynamic systems including earthquakes, where fractality and long-range interactions are important. The analysis indicates that the cumulative inter-event time distribution can be successfully described with NESP, implying the complexity that characterizes the temporal occurrences of earthquakes. Further on, we use the Tsallis entropy (Sq) and the Fischer Information Measure (FIM) to investigate the complexity that characterizes the inter-event time distribution through different time windows along the evolution of the seismic activity at the West Corinth rift. The results of this analysis reveal a different level of organization and clusterization of the seismic activity in time. Acknowledgments. GM wish to acknowledge the partial support of the Greek State Scholarships Foundation (IKY).

  18. Preservation of Mercury in Polyethylene Containers.

    ERIC Educational Resources Information Center

    Piccolino, Samuel Paul

    1983-01-01

    Reports results of experiments favoring use of 0.5 percent nitric acid with an oxidant (potassium dichromate or potassium permanganate) to preserve samples in polyethylene containers for mercury analysis. Includes procedures used and statistical data obtained from the experiments. (JN)

  19. Transportation Satellite Accounts : A New Way of Measuring Transportation Services in America : [2011

    DOT National Transportation Integrated Search

    2011-01-01

    Transportation Satellite Accounts (TSA), produced by the Bureau of Economic Analysis and the Bureau of Transportation Statistics, provides measures of national transportation output. TSA includes both in-house and for-hire transportation services. Fo...

  20. [Evaluation of using statistical methods in selected national medical journals].

    PubMed

    Sych, Z

    1996-01-01

    The paper covers the performed evaluation of frequency with which the statistical methods were applied in analyzed works having been published in six selected, national medical journals in the years 1988-1992. For analysis the following journals were chosen, namely: Klinika Oczna, Medycyna Pracy, Pediatria Polska, Polski Tygodnik Lekarski, Roczniki Państwowego Zakładu Higieny, Zdrowie Publiczne. Appropriate number of works up to the average in the remaining medical journals was randomly selected from respective volumes of Pol. Tyg. Lek. The studies did not include works wherein the statistical analysis was not implemented, which referred both to national and international publications. That exemption was also extended to review papers, casuistic ones, reviews of books, handbooks, monographies, reports from scientific congresses, as well as papers on historical topics. The number of works was defined in each volume. Next, analysis was performed to establish the mode of finding out a suitable sample in respective studies, differentiating two categories: random and target selections. Attention was also paid to the presence of control sample in the individual works. In the analysis attention was also focussed on the existence of sample characteristics, setting up three categories: complete, partial and lacking. In evaluating the analyzed works an effort was made to present the results of studies in tables and figures (Tab. 1, 3). Analysis was accomplished with regard to the rate of employing statistical methods in analyzed works in relevant volumes of six selected, national medical journals for the years 1988-1992, simultaneously determining the number of works, in which no statistical methods were used. Concurrently the frequency of applying the individual statistical methods was analyzed in the scrutinized works. Prominence was given to fundamental statistical methods in the field of descriptive statistics (measures of position, measures of dispersion) as well as most important methods of mathematical statistics such as parametric tests of significance, analysis of variance (in single and dual classifications). non-parametric tests of significance, correlation and regression. The works, in which use was made of either multiple correlation or multiple regression or else more complex methods of studying the relationship for two or more numbers of variables, were incorporated into the works whose statistical methods were constituted by correlation and regression as well as other methods, e.g. statistical methods being used in epidemiology (coefficients of incidence and morbidity, standardization of coefficients, survival tables) factor analysis conducted by Jacobi-Hotellng's method, taxonomic methods and others. On the basis of the performed studies it has been established that the frequency of employing statistical methods in the six selected national, medical journals in the years 1988-1992 was 61.1-66.0% of the analyzed works (Tab. 3), and they generally were almost similar to the frequency provided in English language medical journals. On a whole, no significant differences were disclosed in the frequency of applied statistical methods (Tab. 4) as well as in frequency of random tests (Tab. 3) in the analyzed works, appearing in the medical journals in respective years 1988-1992. The most frequently used statistical methods in analyzed works for 1988-1992 were the measures of position 44.2-55.6% and measures of dispersion 32.5-38.5% as well as parametric tests of significance 26.3-33.1% of the works analyzed (Tab. 4). For the purpose of increasing the frequency and reliability of the used statistical methods, the didactics should be widened in the field of biostatistics at medical studies and postgraduation training designed for physicians and scientific-didactic workers.

  1. Robust inference for group sequential trials.

    PubMed

    Ganju, Jitendra; Lin, Yunzhi; Zhou, Kefei

    2017-03-01

    For ethical reasons, group sequential trials were introduced to allow trials to stop early in the event of extreme results. Endpoints in such trials are usually mortality or irreversible morbidity. For a given endpoint, the norm is to use a single test statistic and to use that same statistic for each analysis. This approach is risky because the test statistic has to be specified before the study is unblinded, and there is loss in power if the assumptions that ensure optimality for each analysis are not met. To minimize the risk of moderate to substantial loss in power due to a suboptimal choice of a statistic, a robust method was developed for nonsequential trials. The concept is analogous to diversification of financial investments to minimize risk. The method is based on combining P values from multiple test statistics for formal inference while controlling the type I error rate at its designated value.This article evaluates the performance of 2 P value combining methods for group sequential trials. The emphasis is on time to event trials although results from less complex trials are also included. The gain or loss in power with the combination method relative to a single statistic is asymmetric in its favor. Depending on the power of each individual test, the combination method can give more power than any single test or give power that is closer to the test with the most power. The versatility of the method is that it can combine P values from different test statistics for analysis at different times. The robustness of results suggests that inference from group sequential trials can be strengthened with the use of combined tests. Copyright © 2017 John Wiley & Sons, Ltd.

  2. Markov Logic Networks in the Analysis of Genetic Data

    PubMed Central

    Sakhanenko, Nikita A.

    2010-01-01

    Abstract Complex, non-additive genetic interactions are common and can be critical in determining phenotypes. Genome-wide association studies (GWAS) and similar statistical studies of linkage data, however, assume additive models of gene interactions in looking for genotype-phenotype associations. These statistical methods view the compound effects of multiple genes on a phenotype as a sum of influences of each gene and often miss a substantial part of the heritable effect. Such methods do not use any biological knowledge about underlying mechanisms. Modeling approaches from the artificial intelligence (AI) field that incorporate deterministic knowledge into models to perform statistical analysis can be applied to include prior knowledge in genetic analysis. We chose to use the most general such approach, Markov Logic Networks (MLNs), for combining deterministic knowledge with statistical analysis. Using simple, logistic regression-type MLNs we can replicate the results of traditional statistical methods, but we also show that we are able to go beyond finding independent markers linked to a phenotype by using joint inference without an independence assumption. The method is applied to genetic data on yeast sporulation, a complex phenotype with gene interactions. In addition to detecting all of the previously identified loci associated with sporulation, our method identifies four loci with smaller effects. Since their effect on sporulation is small, these four loci were not detected with methods that do not account for dependence between markers due to gene interactions. We show how gene interactions can be detected using more complex models, which can be used as a general framework for incorporating systems biology with genetics. PMID:20958249

  3. Statistical assessment of bi-exponential diffusion weighted imaging signal characteristics induced by intravoxel incoherent motion in malignant breast tumors

    PubMed Central

    Wong, Oi Lei; Lo, Gladys G.; Chan, Helen H. L.; Wong, Ting Ting; Cheung, Polly S. Y.

    2016-01-01

    Background The purpose of this study is to statistically assess whether bi-exponential intravoxel incoherent motion (IVIM) model better characterizes diffusion weighted imaging (DWI) signal of malignant breast tumor than mono-exponential Gaussian diffusion model. Methods 3 T DWI data of 29 malignant breast tumors were retrospectively included. Linear least-square mono-exponential fitting and segmented least-square bi-exponential fitting were used for apparent diffusion coefficient (ADC) and IVIM parameter quantification, respectively. F-test and Akaike Information Criterion (AIC) were used to statistically assess the preference of mono-exponential and bi-exponential model using region-of-interests (ROI)-averaged and voxel-wise analysis. Results For ROI-averaged analysis, 15 tumors were significantly better fitted by bi-exponential function and 14 tumors exhibited mono-exponential behavior. The calculated ADC, D (true diffusion coefficient) and f (pseudo-diffusion fraction) showed no significant differences between mono-exponential and bi-exponential preferable tumors. Voxel-wise analysis revealed that 27 tumors contained more voxels exhibiting mono-exponential DWI decay while only 2 tumors presented more bi-exponential decay voxels. ADC was consistently and significantly larger than D for both ROI-averaged and voxel-wise analysis. Conclusions Although the presence of IVIM effect in malignant breast tumors could be suggested, statistical assessment shows that bi-exponential fitting does not necessarily better represent the DWI signal decay in breast cancer under clinically typical acquisition protocol and signal-to-noise ratio (SNR). Our study indicates the importance to statistically examine the breast cancer DWI signal characteristics in practice. PMID:27709078

  4. Role of diversity in ICA and IVA: theory and applications

    NASA Astrophysics Data System (ADS)

    Adalı, Tülay

    2016-05-01

    Independent component analysis (ICA) has been the most popular approach for solving the blind source separation problem. Starting from a simple linear mixing model and the assumption of statistical independence, ICA can recover a set of linearly-mixed sources to within a scaling and permutation ambiguity. It has been successfully applied to numerous data analysis problems in areas as diverse as biomedicine, communications, finance, geo- physics, and remote sensing. ICA can be achieved using different types of diversity—statistical property—and, can be posed to simultaneously account for multiple types of diversity such as higher-order-statistics, sample dependence, non-circularity, and nonstationarity. A recent generalization of ICA, independent vector analysis (IVA), generalizes ICA to multiple data sets and adds the use of one more type of diversity, statistical dependence across the data sets, for jointly achieving independent decomposition of multiple data sets. With the addition of each new diversity type, identification of a broader class of signals become possible, and in the case of IVA, this includes sources that are independent and identically distributed Gaussians. We review the fundamentals and properties of ICA and IVA when multiple types of diversity are taken into account, and then ask the question whether diversity plays an important role in practical applications as well. Examples from various domains are presented to demonstrate that in many scenarios it might be worthwhile to jointly account for multiple statistical properties. This paper is submitted in conjunction with the talk delivered for the "Unsupervised Learning and ICA Pioneer Award" at the 2016 SPIE Conference on Sensing and Analysis Technologies for Biomedical and Cognitive Applications.

  5. Developing and validating a nutrition knowledge questionnaire: key methods and considerations.

    PubMed

    Trakman, Gina Louise; Forsyth, Adrienne; Hoye, Russell; Belski, Regina

    2017-10-01

    To outline key statistical considerations and detailed methodologies for the development and evaluation of a valid and reliable nutrition knowledge questionnaire. Literature on questionnaire development in a range of fields was reviewed and a set of evidence-based guidelines specific to the creation of a nutrition knowledge questionnaire have been developed. The recommendations describe key qualitative methods and statistical considerations, and include relevant examples from previous papers and existing nutrition knowledge questionnaires. Where details have been omitted for the sake of brevity, the reader has been directed to suitable references. We recommend an eight-step methodology for nutrition knowledge questionnaire development as follows: (i) definition of the construct and development of a test plan; (ii) generation of the item pool; (iii) choice of the scoring system and response format; (iv) assessment of content validity; (v) assessment of face validity; (vi) purification of the scale using item analysis, including item characteristics, difficulty and discrimination; (vii) evaluation of the scale including its factor structure and internal reliability, or Rasch analysis, including assessment of dimensionality and internal reliability; and (viii) gathering of data to re-examine the questionnaire's properties, assess temporal stability and confirm construct validity. Several of these methods have previously been overlooked. The measurement of nutrition knowledge is an important consideration for individuals working in the nutrition field. Improved methods in the development of nutrition knowledge questionnaires, such as the use of factor analysis or Rasch analysis, will enable more confidence in reported measures of nutrition knowledge.

  6. AN EXPLORATION OF THE STATISTICAL SIGNATURES OF STELLAR FEEDBACK

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boyden, Ryan D.; Offner, Stella S. R.; Koch, Eric W.

    2016-12-20

    All molecular clouds are observed to be turbulent, but the origin, means of sustenance, and evolution of the turbulence remain debated. One possibility is that stellar feedback injects enough energy into the cloud to drive observed motions on parsec scales. Recent numerical studies of molecular clouds have found that feedback from stars, such as protostellar outflows and winds, injects energy and impacts turbulence. We expand upon these studies by analyzing magnetohydrodynamic simulations of molecular clouds, including stellar winds, with a range of stellar mass-loss rates and magnetic field strengths. We generate synthetic {sup 12}CO(1–0) maps assuming that the simulations aremore » at the distance of the nearby Perseus molecular cloud. By comparing the outputs from different initial conditions and evolutionary times, we identify differences in the synthetic observations and characterize these using common astrostatistics. We quantify the different statistical responses using a variety of metrics proposed in the literature. We find that multiple astrostatistics, including the principal component analysis, the spectral correlation function, and the velocity coordinate spectrum (VCS), are sensitive to changes in stellar mass-loss rates and/or time evolution. A few statistics, including the Cramer statistic and VCS, are sensitive to the magnetic field strength. These findings demonstrate that stellar feedback influences molecular cloud turbulence and can be identified and quantified observationally using such statistics.« less

  7. Holo-analysis.

    PubMed

    Rosen, G D

    2006-06-01

    Meta-analysis is a vague descriptor used to encompass very diverse methods of data collection analysis, ranging from simple averages to more complex statistical methods. Holo-analysis is a fully comprehensive statistical analysis of all available data and all available variables in a specified topic, with results expressed in a holistic factual empirical model. The objectives and applications of holo-analysis include software production for prediction of responses with confidence limits, translation of research conditions to praxis (field) circumstances, exposure of key missing variables, discovery of theoretically unpredictable variables and interactions, and planning future research. Holo-analyses are cited as examples of the effects on broiler feed intake and live weight gain of exogenous phytases, which account for 70% of variation in responses in terms of 20 highly significant chronological, dietary, environmental, genetic, managemental, and nutrient variables. Even better future accountancy of variation will be facilitated if and when authors of papers routinely provide key data for currently neglected variables, such as temperatures, complete feed formulations, and mortalities.

  8. Local statistics of retinal optic flow for self-motion through natural sceneries.

    PubMed

    Calow, Dirk; Lappe, Markus

    2007-12-01

    Image analysis in the visual system is well adapted to the statistics of natural scenes. Investigations of natural image statistics have so far mainly focused on static features. The present study is dedicated to the measurement and the analysis of the statistics of optic flow generated on the retina during locomotion through natural environments. Natural locomotion includes bouncing and swaying of the head and eye movement reflexes that stabilize gaze onto interesting objects in the scene while walking. We investigate the dependencies of the local statistics of optic flow on the depth structure of the natural environment and on the ego-motion parameters. To measure these dependencies we estimate the mutual information between correlated data sets. We analyze the results with respect to the variation of the dependencies over the visual field, since the visual motions in the optic flow vary depending on visual field position. We find that retinal flow direction and retinal speed show only minor statistical interdependencies. Retinal speed is statistically tightly connected to the depth structure of the scene. Retinal flow direction is statistically mostly driven by the relation between the direction of gaze and the direction of ego-motion. These dependencies differ at different visual field positions such that certain areas of the visual field provide more information about ego-motion and other areas provide more information about depth. The statistical properties of natural optic flow may be used to tune the performance of artificial vision systems based on human imitating behavior, and may be useful for analyzing properties of natural vision systems.

  9. Navigation analysis for Viking 1979, option B

    NASA Technical Reports Server (NTRS)

    Mitchell, P. H.

    1971-01-01

    A parametric study performed for 48 trans-Mars reference missions in support of the Viking program is reported. The launch dates cover several months in the year 1979, and each launch date has multiple arrival dates in 1980. A plot of launch versus arrival dates with case numbers designated for reference purposes is included. The analysis consists of the computation of statistical covariance matrices based on certain assumptions about the ground-based tracking systems. The error model statistics are listed in tables. Tracking systems were assumed at three sites: Goldstone, California; Canberra, Australia; and Madrid, Spain. The tracking data consisted of range and Doppler measurements taken during the tracking intervals starting at E-30(d) and ending at E-10(d) for the control data and ending at E-18(h) for the knowledge data. The control and knowledge covariance matrices were delivered to the Planetary Mission Analysis Branch for inputs into a delta V dispersion analysis.

  10. A Study of the NASS-CDS System for Injury/Fatality Rates of Occupants in Various Restraints and A Discussion of Alternative Presentation Methods

    PubMed Central

    Stucki, Sheldon Lee; Biss, David J.

    2000-01-01

    An analysis was performed using the National Automotive Sampling System Crashworthiness Data System (NASS-CDS) database to compare the injury/fatality rates of variously restrained driver occupants as compared to unrestrained driver occupants in the total database of drivers/frontals, and also by Delta-V. A structured search of the NASS-CDS was done using the SAS® statistical analysis software to extract the data for this analysis and the SUDAAN software package was used to arrive at statistical significance indicators. In addition, this paper goes on to investigate different methods for presenting results of accident database searches including significance results; a risk versus Delta-V format for specific exposures; and, a percent cumulative injury versus Delta-V format to characterize injury trends. These alternative analysis presentation methods are then discussed by example using the present study results. PMID:11558105

  11. Analysis/forecast experiments with a multivariate statistical analysis scheme using FGGE data

    NASA Technical Reports Server (NTRS)

    Baker, W. E.; Bloom, S. C.; Nestler, M. S.

    1985-01-01

    A three-dimensional, multivariate, statistical analysis method, optimal interpolation (OI) is described for modeling meteorological data from widely dispersed sites. The model was developed to analyze FGGE data at the NASA-Goddard Laboratory of Atmospherics. The model features a multivariate surface analysis over the oceans, including maintenance of the Ekman balance and a geographically dependent correlation function. Preliminary comparisons are made between the OI model and similar schemes employed at the European Center for Medium Range Weather Forecasts and the National Meteorological Center. The OI scheme is used to provide input to a GCM, and model error correlations are calculated for forecasts of 500 mb vertical water mixing ratios and the wind profiles. Comparisons are made between the predictions and measured data. The model is shown to be as accurate as a successive corrections model out to 4.5 days.

  12. An Overview of R in Health Decision Sciences.

    PubMed

    Jalal, Hawre; Pechlivanoglou, Petros; Krijkamp, Eline; Alarid-Escudero, Fernando; Enns, Eva; Hunink, M G Myriam

    2017-10-01

    As the complexity of health decision science applications increases, high-level programming languages are increasingly adopted for statistical analyses and numerical computations. These programming languages facilitate sophisticated modeling, model documentation, and analysis reproducibility. Among the high-level programming languages, the statistical programming framework R is gaining increased recognition. R is freely available, cross-platform compatible, and open source. A large community of users who have generated an extensive collection of well-documented packages and functions supports it. These functions facilitate applications of health decision science methodology as well as the visualization and communication of results. Although R's popularity is increasing among health decision scientists, methodological extensions of R in the field of decision analysis remain isolated. The purpose of this article is to provide an overview of existing R functionality that is applicable to the various stages of decision analysis, including model design, input parameter estimation, and analysis of model outputs.

  13. Study designs, use of statistical tests, and statistical analysis software choice in 2015: Results from two Pakistani monthly Medline indexed journals.

    PubMed

    Shaikh, Masood Ali

    2017-09-01

    Assessment of research articles in terms of study designs used, statistical tests applied and the use of statistical analysis programmes help determine research activity profile and trends in the country. In this descriptive study, all original articles published by Journal of Pakistan Medical Association (JPMA) and Journal of the College of Physicians and Surgeons Pakistan (JCPSP), in the year 2015 were reviewed in terms of study designs used, application of statistical tests, and the use of statistical analysis programmes. JPMA and JCPSP published 192 and 128 original articles, respectively, in the year 2015. Results of this study indicate that cross-sectional study design, bivariate inferential statistical analysis entailing comparison between two variables/groups, and use of statistical software programme SPSS to be the most common study design, inferential statistical analysis, and statistical analysis software programmes, respectively. These results echo previously published assessment of these two journals for the year 2014.

  14. The other half of the story: effect size analysis in quantitative research.

    PubMed

    Maher, Jessica Middlemis; Markey, Jonathan C; Ebert-May, Diane

    2013-01-01

    Statistical significance testing is the cornerstone of quantitative research, but studies that fail to report measures of effect size are potentially missing a robust part of the analysis. We provide a rationale for why effect size measures should be included in quantitative discipline-based education research. Examples from both biological and educational research demonstrate the utility of effect size for evaluating practical significance. We also provide details about some effect size indices that are paired with common statistical significance tests used in educational research and offer general suggestions for interpreting effect size measures. Finally, we discuss some inherent limitations of effect size measures and provide further recommendations about reporting confidence intervals.

  15. An Exploration of Bias in Meta-Analysis: The Case of Technology Integration Research in Higher Education

    ERIC Educational Resources Information Center

    Bernard, Robert M.; Borokhovski, Eugene; Schmid, Richard F.; Tamim, Rana M.

    2014-01-01

    This article contains a second-order meta-analysis and an exploration of bias in the technology integration literature in higher education. Thirteen meta-analyses, dated from 2000 to 2014 were selected to be included based on the questions asked and the presence of adequate statistical information to conduct a quantitative synthesis. The weighted…

  16. Ethics Education in University Aviation Management Programs in the US: Part Two B--Statistical Analysis of Current Practice.

    ERIC Educational Resources Information Center

    Oderman, Dale

    2003-01-01

    Part Two B of a three-part study examined how 40 universities with baccalaureate programs in aviation management include ethics education in the curricula. Analysis of responses suggests that there is strong support for ethics instruction and that active department head involvement leads to higher levels of planned ethics inclusion. (JOW)

  17. The Essential Genome of Escherichia coli K-12

    PubMed Central

    2018-01-01

    ABSTRACT Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. PMID:29463657

  18. Handling nonnormality and variance heterogeneity for quantitative sublethal toxicity tests.

    PubMed

    Ritz, Christian; Van der Vliet, Leana

    2009-09-01

    The advantages of using regression-based techniques to derive endpoints from environmental toxicity data are clear, and slowly, this superior analytical technique is gaining acceptance. As use of regression-based analysis becomes more widespread, some of the associated nuances and potential problems come into sharper focus. Looking at data sets that cover a broad spectrum of standard test species, we noticed that some model fits to data failed to meet two key assumptions-variance homogeneity and normality-that are necessary for correct statistical analysis via regression-based techniques. Failure to meet these assumptions often is caused by reduced variance at the concentrations showing severe adverse effects. Although commonly used with linear regression analysis, transformation of the response variable only is not appropriate when fitting data using nonlinear regression techniques. Through analysis of sample data sets, including Lemna minor, Eisenia andrei (terrestrial earthworm), and algae, we show that both the so-called Box-Cox transformation and use of the Poisson distribution can help to correct variance heterogeneity and nonnormality and so allow nonlinear regression analysis to be implemented. Both the Box-Cox transformation and the Poisson distribution can be readily implemented into existing protocols for statistical analysis. By correcting for nonnormality and variance heterogeneity, these two statistical tools can be used to encourage the transition to regression-based analysis and the depreciation of less-desirable and less-flexible analytical techniques, such as linear interpolation.

  19. Local image statistics: maximum-entropy constructions and perceptual salience

    PubMed Central

    Victor, Jonathan D.; Conte, Mary M.

    2012-01-01

    The space of visual signals is high-dimensional and natural visual images have a highly complex statistical structure. While many studies suggest that only a limited number of image statistics are used for perceptual judgments, a full understanding of visual function requires analysis not only of the impact of individual image statistics, but also, how they interact. In natural images, these statistical elements (luminance distributions, correlations of low and high order, edges, occlusions, etc.) are intermixed, and their effects are difficult to disentangle. Thus, there is a need for construction of stimuli in which one or more statistical elements are introduced in a controlled fashion, so that their individual and joint contributions can be analyzed. With this as motivation, we present algorithms to construct synthetic images in which local image statistics—including luminance distributions, pair-wise correlations, and higher-order correlations—are explicitly specified and all other statistics are determined implicitly by maximum-entropy. We then apply this approach to measure the sensitivity of the human visual system to local image statistics and to sample their interactions. PMID:22751397

  20. Development of new on-line statistical program for the Korean Society for Radiation Oncology

    PubMed Central

    Song, Si Yeol; Ahn, Seung Do; Chung, Weon Kuu; Choi, Eun Kyung; Cho, Kwan Ho

    2015-01-01

    Purpose To develop new on-line statistical program for the Korean Society for Radiation Oncology (KOSRO) to collect and extract medical data in radiation oncology more efficiently. Materials and Methods The statistical program is a web-based program. The directory was placed in a sub-folder of the homepage of KOSRO and its web address is http://www.kosro.or.kr/asda. The operating systems server is Linux and the webserver is the Apache HTTP server. For database (DB) server, MySQL is adopted and dedicated scripting language is the PHP. Each ID and password are controlled independently and all screen pages for data input or analysis are made to be friendly to users. Scroll-down menu is actively used for the convenience of user and the consistence of data analysis. Results Year of data is one of top categories and main topics include human resource, equipment, clinical statistics, specialized treatment and research achievement. Each topic or category has several subcategorized topics. Real-time on-line report of analysis is produced immediately after entering each data and the administrator is able to monitor status of data input of each hospital. Backup of data as spread sheets can be accessed by the administrator and be used for academic works by any members of the KOSRO. Conclusion The new on-line statistical program was developed to collect data from nationwide departments of radiation oncology. Intuitive screen and consistent input structure are expected to promote entering data of member hospitals and annual statistics should be a cornerstone of advance in radiation oncology. PMID:26157684

  1. Development of new on-line statistical program for the Korean Society for Radiation Oncology.

    PubMed

    Song, Si Yeol; Ahn, Seung Do; Chung, Weon Kuu; Shin, Kyung Hwan; Choi, Eun Kyung; Cho, Kwan Ho

    2015-06-01

    To develop new on-line statistical program for the Korean Society for Radiation Oncology (KOSRO) to collect and extract medical data in radiation oncology more efficiently. The statistical program is a web-based program. The directory was placed in a sub-folder of the homepage of KOSRO and its web address is http://www.kosro.or.kr/asda. The operating systems server is Linux and the webserver is the Apache HTTP server. For database (DB) server, MySQL is adopted and dedicated scripting language is the PHP. Each ID and password are controlled independently and all screen pages for data input or analysis are made to be friendly to users. Scroll-down menu is actively used for the convenience of user and the consistence of data analysis. Year of data is one of top categories and main topics include human resource, equipment, clinical statistics, specialized treatment and research achievement. Each topic or category has several subcategorized topics. Real-time on-line report of analysis is produced immediately after entering each data and the administrator is able to monitor status of data input of each hospital. Backup of data as spread sheets can be accessed by the administrator and be used for academic works by any members of the KOSRO. The new on-line statistical program was developed to collect data from nationwide departments of radiation oncology. Intuitive screen and consistent input structure are expected to promote entering data of member hospitals and annual statistics should be a cornerstone of advance in radiation oncology.

  2. Software for Statistical Analysis of Weibull Distributions with Application to Gear Fatigue Data: User Manual with Verification

    NASA Technical Reports Server (NTRS)

    Krantz, Timothy L.

    2002-01-01

    The Weibull distribution has been widely adopted for the statistical description and inference of fatigue data. This document provides user instructions, examples, and verification for software to analyze gear fatigue test data. The software was developed presuming the data are adequately modeled using a two-parameter Weibull distribution. The calculations are based on likelihood methods, and the approach taken is valid for data that include type 1 censoring. The software was verified by reproducing results published by others.

  3. Computing Science and Statistics: Volume 24. Graphics and Visualization

    DTIC Science & Technology

    1993-03-20

    r, is set to 3.569, the population examples include: kneading ingredients into a bread eventually oscillates about 16 fixed values. However the dough ...34fun statistics". My goal is to offer leagues I said in jest "After all, regression analysis is you the equivalent of a fortune cookie which clearly is... cookie of the night reads: One problem that statisticians traditionally seem to "You have good friends who will come to your aid in have is that they

  4. Software for Statistical Analysis of Weibull Distributions with Application to Gear Fatigue Data: User Manual with Verification

    NASA Technical Reports Server (NTRS)

    Kranz, Timothy L.

    2002-01-01

    The Weibull distribution has been widely adopted for the statistical description and inference of fatigue data. This document provides user instructions, examples, and verification for software to analyze gear fatigue test data. The software was developed presuming the data are adequately modeled using a two-parameter Weibull distribution. The calculations are based on likelihood methods, and the approach taken is valid for data that include type I censoring. The software was verified by reproducing results published by others.

  5. High order statistical signatures from source-driven measurements of subcritical fissile systems

    NASA Astrophysics Data System (ADS)

    Mattingly, John Kelly

    1998-11-01

    This research focuses on the development and application of high order statistical analyses applied to measurements performed with subcritical fissile systems driven by an introduced neutron source. The signatures presented are derived from counting statistics of the introduced source and radiation detectors that observe the response of the fissile system. It is demonstrated that successively higher order counting statistics possess progressively higher sensitivity to reactivity. Consequently, these signatures are more sensitive to changes in the composition, fissile mass, and configuration of the fissile assembly. Furthermore, it is shown that these techniques are capable of distinguishing the response of the fissile system to the introduced source from its response to any internal or inherent sources. This ability combined with the enhanced sensitivity of higher order signatures indicates that these techniques will be of significant utility in a variety of applications. Potential applications include enhanced radiation signature identification of weapons components for nuclear disarmament and safeguards applications and augmented nondestructive analysis of spent nuclear fuel. In general, these techniques expand present capabilities in the analysis of subcritical measurements.

  6. Oregon ground-water quality and its relation to hydrogeological factors; a statistical approach

    USGS Publications Warehouse

    Miller, T.L.; Gonthier, J.B.

    1984-01-01

    An appraisal of Oregon ground-water quality was made using existing data accessible through the U.S. Geological Survey computer system. The data available for about 1,000 sites were separated by aquifer units and hydrologic units. Selected statistical moments were described for 19 constituents including major ions. About 96 percent of all sites in the data base were sampled only once. The sample data were classified by aquifer unit and hydrologic unit and analysis of variance was run to determine if significant differences exist between the units within each of these two classifications for the same 19 constituents on which statistical moments were determined. Results of the analysis of variance indicated both classification variables performed about the same, but aquifer unit did provide more separation for some constituents. Samples from the Rogue River basin were classified by location within the flow system and type of flow system. The samples were then analyzed using analysis of variance on 14 constituents to determine if there were significant differences between subsets classified by flow path. Results of this analysis were not definitive, but classification as to the type of flow system did indicate potential for segregating water-quality data into distinct subsets. (USGS)

  7. Effectiveness of propolis on oral health: a meta-analysis.

    PubMed

    Hwu, Yueh-Juen; Lin, Feng-Yu

    2014-12-01

    The use of propolis mouth rinse or gel as a supplementary intervention has increased during the last decade in Taiwan. However, the effect of propolis on oral health is not well understood. The purpose of this meta-analysis was to present the best available evidence regarding the effects of propolis use on oral health, including oral infection, dental plaque, and stomatitis. Researchers searched seven electronic databases for relevant articles published between 1969 and 2012. Data were collected using inclusion and exclusion criteria. The Joanna Briggs Institute Meta Analysis of Statistics Assessment and Review Instrument was used to evaluate the quality of the identified articles. Eight trials published from 1997 to 2011 with 194 participants had extractable data. The result of the meta-analysis indicated that, although propolis had an effect on reducing dental plaque, this effect was not statistically significant. The results were not statistically significant for oral infection or stomatitis. Although there are a number of promising indications, in view of the limited number and quality of studies and the variation in results among studies, this review highlights the need for additional well-designed trials to draw conclusions that are more robust.

  8. Sources of Safety Data and Statistical Strategies for Design and Analysis: Clinical Trials.

    PubMed

    Zink, Richard C; Marchenko, Olga; Sanchez-Kam, Matilde; Ma, Haijun; Jiang, Qi

    2018-03-01

    There has been an increased emphasis on the proactive and comprehensive evaluation of safety endpoints to ensure patient well-being throughout the medical product life cycle. In fact, depending on the severity of the underlying disease, it is important to plan for a comprehensive safety evaluation at the start of any development program. Statisticians should be intimately involved in this process and contribute their expertise to study design, safety data collection, analysis, reporting (including data visualization), and interpretation. In this manuscript, we review the challenges associated with the analysis of safety endpoints and describe the safety data that are available to influence the design and analysis of premarket clinical trials. We share our recommendations for the statistical and graphical methodologies necessary to appropriately analyze, report, and interpret safety outcomes, and we discuss the advantages and disadvantages of safety data obtained from clinical trials compared to other sources. Clinical trials are an important source of safety data that contribute to the totality of safety information available to generate evidence for regulators, sponsors, payers, physicians, and patients. This work is a result of the efforts of the American Statistical Association Biopharmaceutical Section Safety Working Group.

  9. MoleculaRnetworks: an integrated graph theoretic and data mining tool to explore solvent organization in molecular simulation.

    PubMed

    Mooney, Barbara Logan; Corrales, L René; Clark, Aurora E

    2012-03-30

    This work discusses scripts for processing molecular simulations data written using the software package R: A Language and Environment for Statistical Computing. These scripts, named moleculaRnetworks, are intended for the geometric and solvent network analysis of aqueous solutes and can be extended to other H-bonded solvents. New algorithms, several of which are based on graph theory, that interrogate the solvent environment about a solute are presented and described. This includes a novel method for identifying the geometric shape adopted by the solvent in the immediate vicinity of the solute and an exploratory approach for describing H-bonding, both based on the PageRank algorithm of Google search fame. The moleculaRnetworks codes include a preprocessor, which distills simulation trajectories into physicochemical data arrays, and an interactive analysis script that enables statistical, trend, and correlation analysis, and other data mining. The goal of these scripts is to increase access to the wealth of structural and dynamical information that can be obtained from molecular simulations. Copyright © 2012 Wiley Periodicals, Inc.

  10. Length bias correction in gene ontology enrichment analysis using logistic regression.

    PubMed

    Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H

    2012-01-01

    When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.

  11. Balneotherapy for osteoarthritis. A cochrane review.

    PubMed

    Verhagen, Arianne; Bierma-Zeinstra, Sita; Lambeck, Johan; Cardoso, Jefferson Rosa; de Bie, Rob; Boers, Maarten; de Vet, Henrica C W

    2008-06-01

    Balneotherapy (or spa therapy, mineral baths) for patients with arthritis is one of the oldest forms of therapy. We assessed effectiveness of balneotherapy for patients with osteoarthritis (OA). We performed a broad search strategy to retrieve eligible studies, selecting randomized controlled trials comparing balneotherapy with any intervention or with no intervention. Two authors independently assessed quality and extracted data. Disagreements were solved by consensus. In the event of clinical heterogeneity or lack of data we refrained from statistical pooling. Seven trials (498 patients) were included in this review: one performed an intention-to-treat analysis, 2 provided data for our own analysis, and one reported a "quality of life" outcome. We found silver-level evidence of mineral baths compared to no treatment (effect sizes 0.34-1.82). Adverse events were not measured or found in included trials. We found silver-level evidence concerning the beneficial effects of mineral baths compared to no treatment. Of all other balneological treatments, no clear effects were found. However, the scientific evidence is weak because of the poor methodological quality and the absence of an adequate statistical analysis and data presentation.

  12. Statistical tools for analysis and modeling of cosmic populations and astronomical time series: CUDAHM and TSE

    NASA Astrophysics Data System (ADS)

    Loredo, Thomas; Budavari, Tamas; Scargle, Jeffrey D.

    2018-01-01

    This presentation provides an overview of open-source software packages addressing two challenging classes of astrostatistics problems. (1) CUDAHM is a C++ framework for hierarchical Bayesian modeling of cosmic populations, leveraging graphics processing units (GPUs) to enable applying this computationally challenging paradigm to large datasets. CUDAHM is motivated by measurement error problems in astronomy, where density estimation and linear and nonlinear regression must be addressed for populations of thousands to millions of objects whose features are measured with possibly complex uncertainties, potentially including selection effects. An example calculation demonstrates accurate GPU-accelerated luminosity function estimation for simulated populations of $10^6$ objects in about two hours using a single NVIDIA Tesla K40c GPU. (2) Time Series Explorer (TSE) is a collection of software in Python and MATLAB for exploratory analysis and statistical modeling of astronomical time series. It comprises a library of stand-alone functions and classes, as well as an application environment for interactive exploration of times series data. The presentation will summarize key capabilities of this emerging project, including new algorithms for analysis of irregularly-sampled time series.

  13. Analysis of S-box in Image Encryption Using Root Mean Square Error Method

    NASA Astrophysics Data System (ADS)

    Hussain, Iqtadar; Shah, Tariq; Gondal, Muhammad Asif; Mahmood, Hasan

    2012-07-01

    The use of substitution boxes (S-boxes) in encryption applications has proven to be an effective nonlinear component in creating confusion and randomness. The S-box is evolving and many variants appear in literature, which include advanced encryption standard (AES) S-box, affine power affine (APA) S-box, Skipjack S-box, Gray S-box, Lui J S-box, residue prime number S-box, Xyi S-box, and S8 S-box. These S-boxes have algebraic and statistical properties which distinguish them from each other in terms of encryption strength. In some circumstances, the parameters from algebraic and statistical analysis yield results which do not provide clear evidence in distinguishing an S-box for an application to a particular set of data. In image encryption applications, the use of S-boxes needs special care because the visual analysis and perception of a viewer can sometimes identify artifacts embedded in the image. In addition to existing algebraic and statistical analysis already used for image encryption applications, we propose an application of root mean square error technique, which further elaborates the results and enables the analyst to vividly distinguish between the performances of various S-boxes. While the use of the root mean square error analysis in statistics has proven to be effective in determining the difference in original data and the processed data, its use in image encryption has shown promising results in estimating the strength of the encryption method. In this paper, we show the application of the root mean square error analysis to S-box image encryption. The parameters from this analysis are used in determining the strength of S-boxes

  14. Analysis of Wood Structure Connections Using Cylindrical Steel and Carbon Fiber Dowel Pins

    NASA Astrophysics Data System (ADS)

    Vodiannikov, Mikhail A.; Kashevarova, Galina G., Dr.

    2017-06-01

    In this paper, the results of the statistical analysis of corrosion processes and moisture saturation of glued laminated timber structures and their joints in corrosive environment are shown. This paper includes calculation results for dowel connections of wood structures using steel and carbon fiber reinforced plastic cylindrical dowel pins in accordance with applicable regulatory documents by means of finite element analysis in ANSYS software, as well as experimental findings. Dependence diagrams are shown; comparative analysis of the results obtained is conducted.

  15. Revised Planning Methodology For Signalized Intersections And Operational Analysis Of Exclusive Left-Turn Lanes, Part-II: Models And Procedures (Final Report)

    DOT National Transportation Integrated Search

    1996-04-01

    THIS REPORT ALSO DESCRIBES THE PROCEDURES FOR DIRECT ESTIMATION OF INTERSECTION CAPACITY WITH SIMULATION, INCLUDING A SET OF RIGOROUS STATISTICAL TESTS FOR SIMULATION PARAMETER CALIBRATION FROM FIELD DATA.

  16. The use of multivariate statistics in studies of wildlife habitat

    Treesearch

    David E. Capen

    1981-01-01

    This report contains edited and reviewed versions of papers presented at a workshop held at the University of Vermont in April 1980. Topics include sampling avian habitats, multivariate methods, applications, examples, and new approaches to analysis and interpretation.

  17. Black Male Labor Force Participation.

    ERIC Educational Resources Information Center

    Baer, Roger K.

    This study attempts to test (via multiple regression analysis) hypothesized relationships between designated independent variables and age specific incidences of labor force participation for black male subpopulations in 54 Standard Metropolitan Statistical Areas. Leading independent variables tested include net migration, earnings, unemployment,…

  18. Design and analysis of multiple diseases genome-wide association studies without controls.

    PubMed

    Chen, Zhongxue; Huang, Hanwen; Ng, Hon Keung Tony

    2012-11-15

    In genome-wide association studies (GWAS), multiple diseases with shared controls is one of the case-control study designs. If data obtained from these studies are appropriately analyzed, this design can have several advantages such as improving statistical power in detecting associations and reducing the time and cost in the data collection process. In this paper, we propose a study design for GWAS which involves multiple diseases but without controls. We also propose corresponding statistical data analysis strategy for GWAS with multiple diseases but no controls. Through a simulation study, we show that the statistical association test with the proposed study design is more powerful than the test with single disease sharing common controls, and it has comparable power to the overall test based on the whole dataset including the controls. We also apply the proposed method to a real GWAS dataset to illustrate the methodologies and the advantages of the proposed design. Some possible limitations of this study design and testing method and their solutions are also discussed. Our findings indicate that the proposed study design and statistical analysis strategy could be more efficient than the usual case-control GWAS as well as those with shared controls. Copyright © 2012 Elsevier B.V. All rights reserved.

  19. Quasi-experimental study designs series-paper 10: synthesizing evidence for effects collected from quasi-experimental studies presents surmountable challenges.

    PubMed

    Becker, Betsy Jane; Aloe, Ariel M; Duvendack, Maren; Stanley, T D; Valentine, Jeffrey C; Fretheim, Atle; Tugwell, Peter

    2017-09-01

    To outline issues of importance to analytic approaches to the synthesis of quasi-experiments (QEs) and to provide a statistical model for use in analysis. We drew on studies of statistics, epidemiology, and social-science methodology to outline methods for synthesis of QE studies. The design and conduct of QEs, effect sizes from QEs, and moderator variables for the analysis of those effect sizes were discussed. Biases, confounding, design complexities, and comparisons across designs offer serious challenges to syntheses of QEs. Key components of meta-analyses of QEs were identified, including the aspects of QE study design to be coded and analyzed. Of utmost importance are the design and statistical controls implemented in the QEs. Such controls and any potential sources of bias and confounding must be modeled in analyses, along with aspects of the interventions and populations studied. Because of such controls, effect sizes from QEs are more complex than those from randomized experiments. A statistical meta-regression model that incorporates important features of the QEs under review was presented. Meta-analyses of QEs provide particular challenges, but thorough coding of intervention characteristics and study methods, along with careful analysis, should allow for sound inferences. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. THE MEASUREMENT OF BONE QUALITY USING GRAY LEVEL CO-OCCURRENCE MATRIX TEXTURAL FEATURES.

    PubMed

    Shirvaikar, Mukul; Huang, Ning; Dong, Xuanliang Neil

    2016-10-01

    In this paper, statistical methods for the estimation of bone quality to predict the risk of fracture are reported. Bone mineral density and bone architecture properties are the main contributors of bone quality. Dual-energy X-ray Absorptiometry (DXA) is the traditional clinical measurement technique for bone mineral density, but does not include architectural information to enhance the prediction of bone fragility. Other modalities are not practical due to cost and access considerations. This study investigates statistical parameters based on the Gray Level Co-occurrence Matrix (GLCM) extracted from two-dimensional projection images and explores links with architectural properties and bone mechanics. Data analysis was conducted on Micro-CT images of 13 trabecular bones (with an in-plane spatial resolution of about 50μm). Ground truth data for bone volume fraction (BV/TV), bone strength and modulus were available based on complex 3D analysis and mechanical tests. Correlation between the statistical parameters and biomechanical test results was studied using regression analysis. The results showed Cluster-Shade was strongly correlated with the microarchitecture of the trabecular bone and related to mechanical properties. Once the principle thesis of utilizing second-order statistics is established, it can be extended to other modalities, providing cost and convenience advantages for patients and doctors.

  1. THE MEASUREMENT OF BONE QUALITY USING GRAY LEVEL CO-OCCURRENCE MATRIX TEXTURAL FEATURES

    PubMed Central

    Shirvaikar, Mukul; Huang, Ning; Dong, Xuanliang Neil

    2016-01-01

    In this paper, statistical methods for the estimation of bone quality to predict the risk of fracture are reported. Bone mineral density and bone architecture properties are the main contributors of bone quality. Dual-energy X-ray Absorptiometry (DXA) is the traditional clinical measurement technique for bone mineral density, but does not include architectural information to enhance the prediction of bone fragility. Other modalities are not practical due to cost and access considerations. This study investigates statistical parameters based on the Gray Level Co-occurrence Matrix (GLCM) extracted from two-dimensional projection images and explores links with architectural properties and bone mechanics. Data analysis was conducted on Micro-CT images of 13 trabecular bones (with an in-plane spatial resolution of about 50μm). Ground truth data for bone volume fraction (BV/TV), bone strength and modulus were available based on complex 3D analysis and mechanical tests. Correlation between the statistical parameters and biomechanical test results was studied using regression analysis. The results showed Cluster-Shade was strongly correlated with the microarchitecture of the trabecular bone and related to mechanical properties. Once the principle thesis of utilizing second-order statistics is established, it can be extended to other modalities, providing cost and convenience advantages for patients and doctors. PMID:28042512

  2. Statistical ecology comes of age.

    PubMed

    Gimenez, Olivier; Buckland, Stephen T; Morgan, Byron J T; Bez, Nicolas; Bertrand, Sophie; Choquet, Rémi; Dray, Stéphane; Etienne, Marie-Pierre; Fewster, Rachel; Gosselin, Frédéric; Mérigot, Bastien; Monestiez, Pascal; Morales, Juan M; Mortier, Frédéric; Munoz, François; Ovaskainen, Otso; Pavoine, Sandrine; Pradel, Roger; Schurr, Frank M; Thomas, Len; Thuiller, Wilfried; Trenkel, Verena; de Valpine, Perry; Rexstad, Eric

    2014-12-01

    The desire to predict the consequences of global environmental change has been the driver towards more realistic models embracing the variability and uncertainties inherent in ecology. Statistical ecology has gelled over the past decade as a discipline that moves away from describing patterns towards modelling the ecological processes that generate these patterns. Following the fourth International Statistical Ecology Conference (1-4 July 2014) in Montpellier, France, we analyse current trends in statistical ecology. Important advances in the analysis of individual movement, and in the modelling of population dynamics and species distributions, are made possible by the increasing use of hierarchical and hidden process models. Exciting research perspectives include the development of methods to interpret citizen science data and of efficient, flexible computational algorithms for model fitting. Statistical ecology has come of age: it now provides a general and mathematically rigorous framework linking ecological theory and empirical data.

  3. Statistical ecology comes of age

    PubMed Central

    Gimenez, Olivier; Buckland, Stephen T.; Morgan, Byron J. T.; Bez, Nicolas; Bertrand, Sophie; Choquet, Rémi; Dray, Stéphane; Etienne, Marie-Pierre; Fewster, Rachel; Gosselin, Frédéric; Mérigot, Bastien; Monestiez, Pascal; Morales, Juan M.; Mortier, Frédéric; Munoz, François; Ovaskainen, Otso; Pavoine, Sandrine; Pradel, Roger; Schurr, Frank M.; Thomas, Len; Thuiller, Wilfried; Trenkel, Verena; de Valpine, Perry; Rexstad, Eric

    2014-01-01

    The desire to predict the consequences of global environmental change has been the driver towards more realistic models embracing the variability and uncertainties inherent in ecology. Statistical ecology has gelled over the past decade as a discipline that moves away from describing patterns towards modelling the ecological processes that generate these patterns. Following the fourth International Statistical Ecology Conference (1–4 July 2014) in Montpellier, France, we analyse current trends in statistical ecology. Important advances in the analysis of individual movement, and in the modelling of population dynamics and species distributions, are made possible by the increasing use of hierarchical and hidden process models. Exciting research perspectives include the development of methods to interpret citizen science data and of efficient, flexible computational algorithms for model fitting. Statistical ecology has come of age: it now provides a general and mathematically rigorous framework linking ecological theory and empirical data. PMID:25540151

  4. Asymptotic Linear Spectral Statistics for Spiked Hermitian Random Matrices

    NASA Astrophysics Data System (ADS)

    Passemier, Damien; McKay, Matthew R.; Chen, Yang

    2015-07-01

    Using the Coulomb Fluid method, this paper derives central limit theorems (CLTs) for linear spectral statistics of three "spiked" Hermitian random matrix ensembles. These include Johnstone's spiked model (i.e., central Wishart with spiked correlation), non-central Wishart with rank-one non-centrality, and a related class of non-central matrices. For a generic linear statistic, we derive simple and explicit CLT expressions as the matrix dimensions grow large. For all three ensembles under consideration, we find that the primary effect of the spike is to introduce an correction term to the asymptotic mean of the linear spectral statistic, which we characterize with simple formulas. The utility of our proposed framework is demonstrated through application to three different linear statistics problems: the classical likelihood ratio test for a population covariance, the capacity analysis of multi-antenna wireless communication systems with a line-of-sight transmission path, and a classical multiple sample significance testing problem.

  5. Clinical study of the Erlanger silver catheter--data management and biometry.

    PubMed

    Martus, P; Geis, C; Lugauer, S; Böswald, M; Guggenbichler, J P

    1999-01-01

    The clinical evaluation of venous catheters for catheter-induced infections must conform to a strict biometric methodology. The statistical planning of the study (target population, design, degree of blinding), data management (database design, definition of variables, coding), quality assurance (data inspection at several levels) and the biometric evaluation of the Erlanger silver catheter project are described. The three-step data flow included: 1) primary data from the hospital, 2) relational database, 3) files accessible for statistical evaluation. Two different statistical models were compared: analyzing the first catheter only of a patient in the analysis (independent data) and analyzing several catheters from the same patient (dependent data) by means of the generalized estimating equations (GEE) method. The main result of the study was based on the comparison of both statistical models.

  6. Statistical methods for meta-analyses including information from studies without any events-add nothing to nothing and succeed nevertheless.

    PubMed

    Kuss, O

    2015-03-30

    Meta-analyses with rare events, especially those that include studies with no event in one ('single-zero') or even both ('double-zero') treatment arms, are still a statistical challenge. In the case of double-zero studies, researchers in general delete these studies or use continuity corrections to avoid them. A number of arguments against both options has been given, and statistical methods that use the information from double-zero studies without using continuity corrections have been proposed. In this paper, we collect them and compare them by simulation. This simulation study tries to mirror real-life situations as completely as possible by deriving true underlying parameters from empirical data on actually performed meta-analyses. It is shown that for each of the commonly encountered effect estimators valid statistical methods are available that use the information from double-zero studies without using continuity corrections. Interestingly, all of them are truly random effects models, and so also the current standard method for very sparse data as recommended from the Cochrane collaboration, the Yusuf-Peto odds ratio, can be improved on. For actual analysis, we recommend to use beta-binomial regression methods to arrive at summary estimates for the odds ratio, the relative risk, or the risk difference. Methods that ignore information from double-zero studies or use continuity corrections should no longer be used. We illustrate the situation with an example where the original analysis ignores 35 double-zero studies, and a superior analysis discovers a clinically relevant advantage of off-pump surgery in coronary artery bypass grafting. Copyright © 2014 John Wiley & Sons, Ltd.

  7. On statistical inference in time series analysis of the evolution of road safety.

    PubMed

    Commandeur, Jacques J F; Bijleveld, Frits D; Bergel-Hayat, Ruth; Antoniou, Constantinos; Yannis, George; Papadimitriou, Eleonora

    2013-11-01

    Data collected for building a road safety observatory usually include observations made sequentially through time. Examples of such data, called time series data, include annual (or monthly) number of road traffic accidents, traffic fatalities or vehicle kilometers driven in a country, as well as the corresponding values of safety performance indicators (e.g., data on speeding, seat belt use, alcohol use, etc.). Some commonly used statistical techniques imply assumptions that are often violated by the special properties of time series data, namely serial dependency among disturbances associated with the observations. The first objective of this paper is to demonstrate the impact of such violations to the applicability of standard methods of statistical inference, which leads to an under or overestimation of the standard error and consequently may produce erroneous inferences. Moreover, having established the adverse consequences of ignoring serial dependency issues, the paper aims to describe rigorous statistical techniques used to overcome them. In particular, appropriate time series analysis techniques of varying complexity are employed to describe the development over time, relating the accident-occurrences to explanatory factors such as exposure measures or safety performance indicators, and forecasting the development into the near future. Traditional regression models (whether they are linear, generalized linear or nonlinear) are shown not to naturally capture the inherent dependencies in time series data. Dedicated time series analysis techniques, such as the ARMA-type and DRAG approaches are discussed next, followed by structural time series models, which are a subclass of state space methods. The paper concludes with general recommendations and practice guidelines for the use of time series models in road safety research. Copyright © 2012 Elsevier Ltd. All rights reserved.

  8. THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures

    PubMed Central

    Theobald, Douglas L.; Wuttke, Deborah S.

    2008-01-01

    Summary THESEUS is a command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While conventional superpositioning methods use ordinary least-squares (LS) as the optimization criterion, ML superpositions provide substantially improved accuracy by down-weighting variable structural regions and by correcting for correlations among atoms. ML superpositioning is robust and insensitive to the specific atoms included in the analysis, and thus it does not require subjective pruning of selected variable atomic coordinates. Output includes both likelihood-based and frequentist statistics for accurate evaluation of the adequacy of a superposition and for reliable analysis of structural similarities and differences. THESEUS performs principal components analysis for analyzing the complex correlations found among atoms within a structural ensemble. PMID:16777907

  9. Genome-wide association analysis of secondary imaging phenotypes from the Alzheimer's disease neuroimaging initiative study.

    PubMed

    Zhu, Wensheng; Yuan, Ying; Zhang, Jingwen; Zhou, Fan; Knickmeyer, Rebecca C; Zhu, Hongtu

    2017-02-01

    The aim of this paper is to systematically evaluate a biased sampling issue associated with genome-wide association analysis (GWAS) of imaging phenotypes for most imaging genetic studies, including the Alzheimer's Disease Neuroimaging Initiative (ADNI). Specifically, the original sampling scheme of these imaging genetic studies is primarily the retrospective case-control design, whereas most existing statistical analyses of these studies ignore such sampling scheme by directly correlating imaging phenotypes (called the secondary traits) with genotype. Although it has been well documented in genetic epidemiology that ignoring the case-control sampling scheme can produce highly biased estimates, and subsequently lead to misleading results and suspicious associations, such findings are not well documented in imaging genetics. We use extensive simulations and a large-scale imaging genetic data analysis of the Alzheimer's Disease Neuroimaging Initiative (ADNI) data to evaluate the effects of the case-control sampling scheme on GWAS results based on some standard statistical methods, such as linear regression methods, while comparing it with several advanced statistical methods that appropriately adjust for the case-control sampling scheme. Copyright © 2016 Elsevier Inc. All rights reserved.

  10. BCM: toolkit for Bayesian analysis of Computational Models using samplers.

    PubMed

    Thijssen, Bram; Dijkstra, Tjeerd M H; Heskes, Tom; Wessels, Lodewyk F A

    2016-10-21

    Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics, however, the sampling algorithms that are frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. BCM represents an efficient one-stop-shop for computational modelers wishing to use sampler-based Bayesian statistics.

  11. Statistical analysis of the calibration procedure for personnel radiation measurement instruments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bush, W.J.; Bengston, S.J.; Kalbeitzer, F.L.

    1980-11-01

    Thermoluminescent analyzer (TLA) calibration procedures were used to estimate personnel radiation exposure levels at the Idaho National Engineering Laboratory (INEL). A statistical analysis is presented herein based on data collected over a six month period in 1979 on four TLA's located in the Department of Energy (DOE) Radiological and Environmental Sciences Laboratory at the INEL. The data were collected according to the day-to-day procedure in effect at that time. Both gamma and beta radiation models are developed. Observed TLA readings of thermoluminescent dosimeters are correlated with known radiation levels. This correlation is then used to predict unknown radiation doses frommore » future analyzer readings of personnel thermoluminescent dosimeters. The statistical techniques applied in this analysis include weighted linear regression, estimation of systematic and random error variances, prediction interval estimation using Scheffe's theory of calibration, the estimation of the ratio of the means of two normal bivariate distributed random variables and their corresponding confidence limits according to Kendall and Stuart, tests of normality, experimental design, a comparison between instruments, and quality control.« less

  12. Influence of periodontal treatment on rheumatoid arthritis: a systematic review and meta-analysis.

    PubMed

    Calderaro, Débora Cerqueira; Corrêa, Jôice Dias; Ferreira, Gilda Aparecida; Barbosa, Izabela Guimarães; Martins, Carolina Castro; Silva, Tarcília Aparecida; Teixeira, Antônio Lúcio

    To evaluate the influence of periodontal treatment on rheumatoid arthritis activity. MEDLINE/PUBMED, The Cochrane Library, Clinical Trials, SciELO and LILACS were searched for studies published until December 2014. Included articles were: prospective studies; including patients older than 18 years, diagnosed with periodontitis and rheumatoid arthritis submitted to non-surgical periodontal treatment; with a control group receiving no periodontal treatment; with outcomes including at least one marker of rheumatoid arthritis activity. Methodological quality of the studies was assessed using PEDro scale. Quantitative data were pooled in statistical meta-analysis using Review Manager 5. Four articles were included. Non-surgical periodontal treatment was associated with a significant reduction of DAS28 (OR: -1.18; 95% CI: -1.43, -0.93; p<0.00001). Erythrocyte sedimentation rate, C-reactive protein, patient's assessment of rheumatoid activity using visual analogical scale, tender and swollen joint counts showed a trend toward reduction (not statistically significant). The reduction of DAS 28 in patients with rheumatoid arthritis after periodontal treatment suggests that the improvement of periodontal condition is beneficial to these patients. Further randomized controlled clinical trials are necessary to confirm this finding. Copyright © 2016. Published by Elsevier Editora Ltda.

  13. Influence of periodontal treatment on rheumatoid arthritis: a systematic review and meta-analysis.

    PubMed

    Calderaro, Débora Cerqueira; Corrêa, Jôice Dias; Ferreira, Gilda Aparecida; Barbosa, Izabela Guimarães; Martins, Carolina Castro; Silva, Tarcília Aparecida; Teixeira, Antônio Lúcio

    2016-11-26

    To evaluate the influence of periodontal treatment on rheumatoid arthritis activity. MEDLINE/PUBMED, The Cochrane Library, Clinical Trials, SciELO and LILACS were searched for studies published until December 2014. Included articles were: prospective studies; including patients older than 18 years, diagnosed with periodontitis and rheumatoid arthritis submitted to non-surgical periodontal treatment; with a control group receiving no periodontal treatment; with outcomes including at least one marker of rheumatoid arthritis activity. Methodological quality of the studies was assessed using PEDro scale. Quantitative data were pooled in statistical meta-analysis using Review Manager 5. Four articles were included. Non-surgical periodontal treatment was associated with a significant reduction of DAS28 (OR: -1.18; 95% CI: -1.43, -0.93; p <0.00001). Erythrocyte sedimentation rate, C-reactive protein, patient's assessment of rheumatoid activity using visual analogical scale, tender and swollen joint counts showed a trend towards reduction (not statistically significant). The reduction of DAS 28 in patients with rheumatoid arthritis after periodontal treatment suggests that the improvement of periodontal condition is beneficial to these patients. Further randomized controlled clinical trials are necessary to confirm this finding. Copyright © 2016. Published by Elsevier Editora Ltda.

  14. SCARE: A post-processor program to MSC/NASTRAN for the reliability analysis of structural ceramic components

    NASA Technical Reports Server (NTRS)

    Gyekenyesi, J. P.

    1985-01-01

    A computer program was developed for calculating the statistical fast fracture reliability and failure probability of ceramic components. The program includes the two-parameter Weibull material fracture strength distribution model, using the principle of independent action for polyaxial stress states and Batdorf's shear-sensitive as well as shear-insensitive crack theories, all for volume distributed flaws in macroscopically isotropic solids. Both penny-shaped cracks and Griffith cracks are included in the Batdorf shear-sensitive crack response calculations, using Griffith's maximum tensile stress or critical coplanar strain energy release rate criteria to predict mixed mode fracture. Weibull material parameters can also be calculated from modulus of rupture bar tests, using the least squares method with known specimen geometry and fracture data. The reliability prediction analysis uses MSC/NASTRAN stress, temperature and volume output, obtained from the use of three-dimensional, quadratic, isoparametric, or axisymmetric finite elements. The statistical fast fracture theories employed, along with selected input and output formats and options, are summarized. An example problem to demonstrate various features of the program is included.

  15. Living systematic reviews: 3. Statistical methods for updating meta-analyses.

    PubMed

    Simmonds, Mark; Salanti, Georgia; McKenzie, Joanne; Elliott, Julian

    2017-11-01

    A living systematic review (LSR) should keep the review current as new research evidence emerges. Any meta-analyses included in the review will also need updating as new material is identified. If the aim of the review is solely to present the best current evidence standard meta-analysis may be sufficient, provided reviewers are aware that results may change at later updates. If the review is used in a decision-making context, more caution may be needed. When using standard meta-analysis methods, the chance of incorrectly concluding that any updated meta-analysis is statistically significant when there is no effect (the type I error) increases rapidly as more updates are performed. Inaccurate estimation of any heterogeneity across studies may also lead to inappropriate conclusions. This paper considers four methods to avoid some of these statistical problems when updating meta-analyses: two methods, that is, law of the iterated logarithm and the Shuster method control primarily for inflation of type I error and two other methods, that is, trial sequential analysis and sequential meta-analysis control for type I and II errors (failing to detect a genuine effect) and take account of heterogeneity. This paper compares the methods and considers how they could be applied to LSRs. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. [Review of meta-analysis research on exercise in South Korea].

    PubMed

    Song, Youngshin; Gang, Moonhee; Kim, Sun Ae; Shin, In Soo

    2014-10-01

    The purpose of this study was to evaluate the quality of meta-analysis regarding exercise using Assessment of Multiple Systematic Reviews (AMSTAR) as well as to compare effect size according to outcomes. Electronic databases including the Korean Studies Information Service System (KISS), the National Assembly Library and the DBpia, HAKJISA and RISS4U for the dates 1990 to January 2014 were searched for 'meta-analysis' and 'exercise' in the fields of medical, nursing, physical therapy and physical exercise in Korea. AMSTAR was scored for quality assessment of the 33 articles included in the study. Data were analyzed using descriptive statistics, t-test, ANOVA and χ²-test. The mean score for AMSTAR evaluations was 4.18 (SD=1.78) and about 67% were classified at the low-quality level and 30% at the moderate-quality level. The scores of quality were statistically different by field of research, number of participants, number of databases, financial support and approval by IRB. The effect size that presented in individual studies were different by type of exercise in the applied intervention. This critical appraisal of meta-analysis published in various field that focused on exercise indicates that a guideline such as the PRISMA checklist should be strongly recommended for optimum reporting of meta-analysis across research fields.

  17. Lung Cancer Risk Prediction Model Incorporating Lung Function: Development and Validation in the UK Biobank Prospective Cohort Study.

    PubMed

    Muller, David C; Johansson, Mattias; Brennan, Paul

    2017-03-10

    Purpose Several lung cancer risk prediction models have been developed, but none to date have assessed the predictive ability of lung function in a population-based cohort. We sought to develop and internally validate a model incorporating lung function using data from the UK Biobank prospective cohort study. Methods This analysis included 502,321 participants without a previous diagnosis of lung cancer, predominantly between 40 and 70 years of age. We used flexible parametric survival models to estimate the 2-year probability of lung cancer, accounting for the competing risk of death. Models included predictors previously shown to be associated with lung cancer risk, including sex, variables related to smoking history and nicotine addiction, medical history, family history of lung cancer, and lung function (forced expiratory volume in 1 second [FEV1]). Results During accumulated follow-up of 1,469,518 person-years, there were 738 lung cancer diagnoses. A model incorporating all predictors had excellent discrimination (concordance (c)-statistic [95% CI] = 0.85 [0.82 to 0.87]). Internal validation suggested that the model will discriminate well when applied to new data (optimism-corrected c-statistic = 0.84). The full model, including FEV1, also had modestly superior discriminatory power than one that was designed solely on the basis of questionnaire variables (c-statistic = 0.84 [0.82 to 0.86]; optimism-corrected c-statistic = 0.83; p FEV1 = 3.4 × 10 -13 ). The full model had better discrimination than standard lung cancer screening eligibility criteria (c-statistic = 0.66 [0.64 to 0.69]). Conclusion A risk prediction model that includes lung function has strong predictive ability, which could improve eligibility criteria for lung cancer screening programs.

  18. The neuronal correlates of intranasal trigeminal function – An ALE meta-analysis of human functional brain imaging data

    PubMed Central

    Albrecht, Jessica; Kopietz, Rainer; Frasnelli, Johannes; Wiesmann, Martin; Hummel, Thomas; Lundström, Johan N.

    2009-01-01

    Almost every odor we encounter in daily life has the capacity to produce a trigeminal sensation. Surprisingly, few functional imaging studies exploring human neuronal correlates of intranasal trigeminal function exist, and results are to some degree inconsistent. We utilized activation likelihood estimation (ALE), a quantitative voxel-based meta-analysis tool, to analyze functional imaging data (fMRI/PET) following intranasal trigeminal stimulation with carbon dioxide (CO2), a stimulus known to exclusively activate the trigeminal system. Meta-analysis tools are able to identify activations common across studies, thereby enabling activation mapping with higher certainty. Activation foci of nine studies utilizing trigeminal stimulation were included in the meta-analysis. We found significant ALE scores, thus indicating consistent activation across studies, in the brainstem, ventrolateral posterior thalamic nucleus, anterior cingulate cortex, insula, precentral gyrus, as well as in primary and secondary somatosensory cortices – a network known for the processing of intranasal nociceptive stimuli. Significant ALE values were also observed in the piriform cortex, insula, and the orbitofrontal cortex, areas known to process chemosensory stimuli, and in association cortices. Additionally, the trigeminal ALE statistics were directly compared with ALE statistics originating from olfactory stimulation, demonstrating considerable overlap in activation. In conclusion, the results of this meta-analysis map the human neuronal correlates of intranasal trigeminal stimulation with high statistical certainty and demonstrate that the cortical areas recruited during the processing of intranasal CO2 stimuli include those outside traditional trigeminal areas. Moreover, through illustrations of the considerable overlap between brain areas that process trigeminal and olfactory information; these results demonstrate the interconnectivity of flavor processing. PMID:19913573

  19. Substituting values for censored data from Texas, USA, reservoirs inflated and obscured trends in analyses commonly used for water quality target development.

    PubMed

    Grantz, Erin; Haggard, Brian; Scott, J Thad

    2018-06-12

    We calculated four median datasets (chlorophyll a, Chl a; total phosphorus, TP; and transparency) using multiple approaches to handling censored observations, including substituting fractions of the quantification limit (QL; dataset 1 = 1QL, dataset 2 = 0.5QL) and statistical methods for censored datasets (datasets 3-4) for approximately 100 Texas, USA reservoirs. Trend analyses of differences between dataset 1 and 3 medians indicated percent difference increased linearly above thresholds in percent censored data (%Cen). This relationship was extrapolated to estimate medians for site-parameter combinations with %Cen > 80%, which were combined with dataset 3 as dataset 4. Changepoint analysis of Chl a- and transparency-TP relationships indicated threshold differences up to 50% between datasets. Recursive analysis identified secondary thresholds in dataset 4. Threshold differences show that information introduced via substitution or missing due to limitations of statistical methods biased values, underestimated error, and inflated the strength of TP thresholds identified in datasets 1-3. Analysis of covariance identified differences in linear regression models relating transparency-TP between datasets 1, 2, and the more statistically robust datasets 3-4. Study findings identify high-risk scenarios for biased analytical outcomes when using substitution. These include high probability of median overestimation when %Cen > 50-60% for a single QL, or when %Cen is as low 16% for multiple QL's. Changepoint analysis was uniquely vulnerable to substitution effects when using medians from sites with %Cen > 50%. Linear regression analysis was less sensitive to substitution and missing data effects, but differences in model parameters for transparency cannot be discounted and could be magnified by log-transformation of the variables.

  20. Analysis of Variance: What Is Your Statistical Software Actually Doing?

    ERIC Educational Resources Information Center

    Li, Jian; Lomax, Richard G.

    2011-01-01

    Users assume statistical software packages produce accurate results. In this article, the authors systematically examined Statistical Package for the Social Sciences (SPSS) and Statistical Analysis System (SAS) for 3 analysis of variance (ANOVA) designs, mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…

Top