Sample records for analysis statistical analyses

  1. Statistical approaches in published ophthalmic clinical science papers: a comparison to statistical practice two decades ago.

    PubMed

    Zhang, Harrison G; Ying, Gui-Shuang

    2018-02-09

    The aim of this study is to evaluate the current practice of statistical analysis of eye data in clinical science papers published in British Journal of Ophthalmology (BJO) and to determine whether the practice of statistical analysis has improved in the past two decades. All clinical science papers (n=125) published in BJO in January-June 2017 were reviewed for their statistical analysis approaches for analysing the primary ocular measure. We compared our findings to the results from a previous paper that reviewed BJO papers in 1995. Of 112 papers eligible for analysis, half of the studies analysed the data at an individual level because of the nature of observation, 16 (14%) studies analysed data from one eye only, 36 (32%) studies analysed data from both eyes at ocular level, one study (1%) analysed the overall summary of ocular finding per individual and three (3%) studies used the paired comparison. Among studies with data available from both eyes, 50 (89%) of 56 papers in 2017 did not analyse data from both eyes or ignored the intereye correlation, as compared with 60 (90%) of 67 papers in 1995 (P=0.96). Among studies that analysed data from both eyes at an ocular level, 33 (92%) of 36 studies completely ignored the intereye correlation in 2017, as compared with 16 (89%) of 18 studies in 1995 (P=0.40). A majority of studies did not analyse the data properly when data from both eyes were available. The practice of statistical analysis did not improve in the past two decades. Collaborative efforts should be made in the vision research community to improve the practice of statistical analysis for ocular data. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
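
    For readers unfamiliar with how inter-eye correlation can be handled, here is a minimal sketch (not the authors' method) using a generalized estimating equation with an exchangeable working correlation, so both eyes of a patient contribute without being treated as independent. The data frame, variable names (patient, treated, iop) and the Gaussian family are assumptions made for illustration only.

```python
# A minimal sketch: GEE with an exchangeable working correlation treats the two
# eyes of each patient as correlated observations rather than independent ones.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical data: one row per eye, two eyes per patient, patient-level exposure.
df = pd.DataFrame({
    "patient": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "treated": [1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0],
    "iop":     [15.8, 16.2, 21.1, 20.4, 17.5, 18.0,
                19.6, 19.2, 16.9, 16.4, 20.8, 21.3],
})

gee = smf.gee("iop ~ treated", groups="patient", data=df,
              cov_struct=sm.cov_struct.Exchangeable(),
              family=sm.families.Gaussian()).fit()
print(gee.summary())  # robust standard errors account for within-patient correlation
```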

  2. Research of Extension of the Life Cycle of Helicopter Rotor Blade in Hungary

    DTIC Science & Technology

    2003-02-01

    Radiography (DXR), and (iii) Vibration Diagnostics (VD) with Statistical Energy Analysis (SEA) were semi-simultaneously applied [1]. The used three...2.2. Vibration Diagnostics (VD) Parallel to the NDT measurements the Statistical Energy Analysis (SEA) as a vibration diagnostical tool was...noises were analysed with a dual-channel real time frequency analyser (BK2035). In addition to the Statistical Energy Analysis measurement a small

  3. Comparing Visual and Statistical Analysis of Multiple Baseline Design Graphs.

    PubMed

    Wolfe, Katie; Dickenson, Tammiee S; Miller, Bridget; McGrath, Kathleen V

    2018-04-01

    A growing number of statistical analyses are being developed for single-case research. One important factor in evaluating these methods is the extent to which each corresponds to visual analysis. Few studies have compared statistical and visual analysis, and information about more recently developed statistics is scarce. Therefore, our purpose was to evaluate the agreement between visual analysis and four statistical analyses: improvement rate difference (IRD); Tau-U; Hedges, Pustejovsky, Shadish (HPS) effect size; and between-case standardized mean difference (BC-SMD). Results indicate that IRD and BC-SMD had the strongest overall agreement with visual analysis. Although Tau-U had strong agreement with visual analysis on raw values, it had poorer agreement when those values were dichotomized to represent the presence or absence of a functional relation. Overall, visual analysis appeared to be more conservative than statistical analysis, but further research is needed to evaluate the nature of these disagreements.
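
    For orientation on the nonoverlap statistics named above, the sketch below computes a simple pairwise nonoverlap index in the spirit of Tau-U, without the baseline-trend correction that full Tau-U adds; the phase data are invented and the function name is hypothetical.

```python
import numpy as np

def tau_nonoverlap(baseline, treatment):
    """Simple nonoverlap index: (#improved pairs - #degraded pairs) / (nA * nB)."""
    a = np.asarray(baseline, dtype=float)
    b = np.asarray(treatment, dtype=float)
    diffs = b[:, None] - a[None, :]          # all pairwise treatment-minus-baseline comparisons
    return (np.sum(diffs > 0) - np.sum(diffs < 0)) / (a.size * b.size)

# Invented single-case phase data; 1.0 would indicate complete nonoverlap.
print(tau_nonoverlap([2, 3, 2, 4], [5, 6, 4, 7, 6]))
```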

  4. Sunspot activity and influenza pandemics: a statistical assessment of the purported association.

    PubMed

    Towers, S

    2017-10-01

    Since 1978, a series of papers in the literature have claimed to find a significant association between sunspot activity and the timing of influenza pandemics. This paper examines these analyses, and attempts to recreate the three most recent statistical analyses by Ertel (1994), Tapping et al. (2001), and Yeung (2006), all of which purported to find a significant relationship between sunspot numbers and pandemic influenza. As will be discussed, each analysis had errors in the data. In addition, each analysis made arbitrary selections or assumptions, and the authors did not assess the robustness of their analyses to changes in those arbitrary assumptions. Varying the arbitrary assumptions to other, equally valid, assumptions negates the claims of significance. Indeed, an arbitrary selection made in one of the analyses appears to have resulted in almost maximal apparent significance; changing it only slightly yields a null result. This analysis applies statistically rigorous methodology to examine the purported sunspot/pandemic link, using more statistically powerful un-binned analysis methods, rather than relying on arbitrarily binned data. The analyses are repeated using both the Wolf and Group sunspot numbers. In all cases, no statistically significant evidence of any association was found. However, while the focus in this particular analysis was on the purported relationship of influenza pandemics to sunspot activity, the faults found in the past analyses are common pitfalls; inattention to analysis reproducibility and robustness assessment are common problems in the sciences that are unfortunately not noted often enough in review.
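
    As a hedged illustration of an un-binned approach of the kind the abstract contrasts with arbitrarily binned analyses, the sketch below runs a permutation (randomisation) test of whether pandemic years coincide with unusually high sunspot numbers. The sunspot series and pandemic years here are placeholders, not the data analysed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative inputs only: an annual sunspot series and a set of pandemic years.
years = np.arange(1700, 2017)
sunspots = rng.gamma(shape=2.0, scale=40.0, size=years.size)   # placeholder series
pandemic_years = np.array([1729, 1781, 1830, 1889, 1918, 1957, 1968, 2009])

observed = sunspots[np.isin(years, pandemic_years)].mean()

# Null hypothesis: pandemic years are unrelated to sunspot activity, so any set of
# the same number of years is equally likely.  No binning of the series is involved.
null = np.array([
    rng.choice(sunspots, size=pandemic_years.size, replace=False).mean()
    for _ in range(10_000)
])
p_value = (np.sum(null >= observed) + 1) / (null.size + 1)
print(f"one-sided permutation p = {p_value:.3f}")
```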

  5. [Statistical analysis using freely-available "EZR (Easy R)" software].

    PubMed

    Kanda, Yoshinobu

    2015-10-01

    Clinicians must often perform statistical analyses for purposes such as evaluating preexisting evidence and designing or executing clinical studies. R is a free software environment for statistical computing. R supports many statistical analysis functions, but does not incorporate a statistical graphical user interface (GUI). The R commander provides an easy-to-use basic-statistics GUI for R. However, the statistical function of the R commander is limited, especially in the field of biostatistics. Therefore, the author added several important statistical functions to the R commander and named it "EZR (Easy R)", which is now being distributed on the following website: http://www.jichi.ac.jp/saitama-sct/. EZR allows the application of statistical functions that are frequently used in clinical studies, such as survival analyses (including competing risk analyses and the use of time-dependent covariates), by point-and-click access. In addition, by saving the script automatically created by EZR, users can learn R script writing, maintain the traceability of the analysis, and assure that the statistical process is overseen by a supervisor.
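
    EZR itself is an R GUI, so the scripts it saves are R. Purely to illustrate the kind of scripted survival analysis it automates through point-and-click menus, here is an analogous Python sketch using the lifelines package; the simulated durations, censoring indicators and group labels are assumptions of this note, not part of the record.

```python
import numpy as np
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

rng = np.random.default_rng(5)
t_a, t_b = rng.exponential(24, 80), rng.exponential(36, 80)   # invented months to event
e_a, e_b = rng.random(80) < 0.7, rng.random(80) < 0.7          # invented event indicators

kmf = KaplanMeierFitter()
kmf.fit(t_a, event_observed=e_a, label="group A")              # Kaplan-Meier estimate
print(kmf.median_survival_time_)

result = logrank_test(t_a, t_b, event_observed_A=e_a, event_observed_B=e_b)
print(result.p_value)                                          # two-group log-rank comparison
```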

  6. Reporting quality of statistical methods in surgical observational studies: protocol for systematic review.

    PubMed

    Wu, Robert; Glen, Peter; Ramsay, Tim; Martel, Guillaume

    2014-06-28

    Observational studies dominate the surgical literature. Statistical adjustment is an important strategy to account for confounders in observational studies. Research has shown that published articles are often poor in statistical quality, which may jeopardize their conclusions. The Statistical Analyses and Methods in the Published Literature (SAMPL) guidelines have been published to help establish standards for statistical reporting. This study will seek to determine whether the quality of statistical adjustment and the reporting of these methods are adequate in surgical observational studies. We hypothesize that incomplete reporting will be found in all surgical observational studies, and that the quality and reporting of these methods will be lower in surgical journals than in medical journals. Finally, this work will seek to identify predictors of high-quality reporting. This work will examine the top five general surgical and medical journals, based on a 5-year impact factor (2007-2012). All observational studies investigating an intervention related to an essential component area of general surgery (defined by the American Board of Surgery), with an exposure, outcome, and comparator, will be included in this systematic review. Essential elements related to statistical reporting and quality were extracted from the SAMPL guidelines and include domains such as intent of analysis, primary analysis, multiple comparisons, numbers and descriptive statistics, association and correlation analyses, linear regression, logistic regression, Cox proportional hazard analysis, analysis of variance, survival analysis, propensity analysis, and independent and correlated analyses. Each article will be scored as a proportion based on fulfilling criteria in relevant analyses used in the study. A logistic regression model will be built to identify variables associated with high-quality reporting. A comparison will be made between the scores of surgical observational studies published in medical versus surgical journals. Secondary outcomes will pertain to individual domains of analysis. Sensitivity analyses will be conducted. This study will explore the reporting and quality of statistical analyses in surgical observational studies published in the most referenced surgical and medical journals in 2013 and examine whether variables (including the type of journal) can predict high-quality reporting.

  7. Algorithm for Identifying Erroneous Rain-Gauge Readings

    NASA Technical Reports Server (NTRS)

    Rickman, Doug

    2005-01-01

    An algorithm analyzes rain-gauge data to identify statistical outliers that could be deemed to be erroneous readings. Heretofore, analyses of this type have been performed in burdensome manual procedures that have involved subjective judgements. Sometimes, the analyses have included computational assistance for detecting values falling outside of arbitrary limits. The analyses have been performed without statistically valid knowledge of the spatial and temporal variations of precipitation within rain events. In contrast, the present algorithm makes it possible to automate such an analysis, makes the analysis objective, takes account of the spatial distribution of rain gauges in conjunction with the statistical nature of spatial variations in rainfall readings, and minimizes the use of arbitrary criteria. The algorithm implements an iterative process that involves nonparametric statistics.
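
    The record does not publish the algorithm itself, so the following is only an illustrative sketch of an iterative, nonparametric outlier screen (median/MAD with repeated re-estimation) in the spirit described; the threshold, iteration cap and example readings are invented, and the spatial weighting of neighbouring gauges is omitted.

```python
import numpy as np

def flag_outlier_gauges(readings, z_thresh=3.5, max_iter=10):
    """Iteratively flag readings far from the robust centre of the remaining gauges.

    Illustrative only: a median/MAD screen applied repeatedly, not the NASA
    algorithm, whose spatial statistics are not described in the record.
    """
    readings = np.asarray(readings, dtype=float)
    keep = np.ones(readings.size, dtype=bool)
    for _ in range(max_iter):
        med = np.median(readings[keep])
        mad = np.median(np.abs(readings[keep] - med)) or 1e-9
        robust_z = 0.6745 * (readings - med) / mad     # MAD-based robust z-score
        new_keep = np.abs(robust_z) <= z_thresh
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return ~keep   # True where the reading is flagged as suspect

gauges = [12.1, 11.8, 13.0, 12.4, 55.0, 12.7, 11.9]   # one implausible reading
print(flag_outlier_gauges(gauges))
```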

  8. Using R-Project for Free Statistical Analysis in Extension Research

    ERIC Educational Resources Information Center

    Mangiafico, Salvatore S.

    2013-01-01

    One option for Extension professionals wishing to use free statistical software is to use online calculators, which are useful for common, simple analyses. A second option is to use a free computing environment capable of performing statistical analyses, like R-project. R-project is free, cross-platform, powerful, and respected, but may be…

  9. Quantitative Methods for Analysing Joint Questionnaire Data: Exploring the Role of Joint in Force Design

    DTIC Science & Technology

    2015-08-01

    the nine questions. The Statistical Package for the Social Sciences (SPSS) [11] was used to conduct statistical analysis on the sample. Two types...constructs. SPSS was again used to conduct statistical analysis on the sample. This time factor analysis was conducted. Factor analysis attempts to...Business Research Methods and Statistics using SPSS. P432. 11 IBM SPSS Statistics. (2012) 12 Burns, R.B., Burns, R.A. (2008) ‘Business Research

  10. Formalizing the definition of meta-analysis in Molecular Ecology.

    PubMed

    ArchMiller, Althea A; Bauer, Eric F; Koch, Rebecca E; Wijayawardena, Bhagya K; Anil, Ammu; Kottwitz, Jack J; Munsterman, Amelia S; Wilson, Alan E

    2015-08-01

    Meta-analysis, the statistical synthesis of pertinent literature to develop evidence-based conclusions, is relatively new to the field of molecular ecology, with the first meta-analysis published in the journal Molecular Ecology in 2003 (Slate & Phua 2003). The goal of this article is to formalize the definition of meta-analysis for the authors, editors, reviewers and readers of Molecular Ecology by completing a review of the meta-analyses previously published in this journal. We also provide a brief overview of the many components required for meta-analysis with a more specific discussion of the issues related to the field of molecular ecology, including the use and statistical considerations of Wright's FST and its related analogues as effect sizes in meta-analysis. We performed a literature review to identify articles published as 'meta-analyses' in Molecular Ecology, which were then evaluated by at least two reviewers. We specifically targeted Molecular Ecology publications because as a flagship journal in this field, meta-analyses published in Molecular Ecology have the potential to set the standard for meta-analyses in other journals. We found that while many of these reviewed articles were strong meta-analyses, others failed to follow standard meta-analytical techniques. One of these unsatisfactory meta-analyses was in fact a secondary analysis. Other studies attempted meta-analyses but lacked the fundamental statistics that are considered necessary for an effective and powerful meta-analysis. By drawing attention to the inconsistency of studies labelled as meta-analyses, we emphasize the importance of understanding the components of traditional meta-analyses to fully embrace the strengths of quantitative data synthesis in the field of molecular ecology. © 2015 John Wiley & Sons Ltd.

  11. Transfusion Indication Threshold Reduction (TITRe2) randomized controlled trial in cardiac surgery: statistical analysis plan.

    PubMed

    Pike, Katie; Nash, Rachel L; Murphy, Gavin J; Reeves, Barnaby C; Rogers, Chris A

    2015-02-22

    The Transfusion Indication Threshold Reduction (TITRe2) trial is the largest randomized controlled trial to date to compare red blood cell transfusion strategies following cardiac surgery. This update presents the statistical analysis plan, detailing how the study will be analyzed and presented. The statistical analysis plan has been written following recommendations from the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, prior to database lock and the final analysis of trial data. Outlined analyses are in line with the Consolidated Standards of Reporting Trials (CONSORT). The study aims to randomize 2000 patients from 17 UK centres. Patients are randomized to either a restrictive (transfuse if haemoglobin concentration <7.5 g/dl) or liberal (transfuse if haemoglobin concentration <9 g/dl) transfusion strategy. The primary outcome is a binary composite outcome of any serious infectious or ischaemic event in the first 3 months following randomization. The statistical analysis plan details how non-adherence with the intervention, withdrawals from the study, and the study population will be derived and dealt with in the analysis. The planned analyses of the trial primary and secondary outcome measures are described in detail, including approaches taken to deal with multiple testing, model assumptions not being met and missing data. Details of planned subgroup and sensitivity analyses and pre-specified ancillary analyses are given, along with potential issues that have been identified with such analyses and possible approaches to overcome such issues. ISRCTN70923932 .

  12. Statistical analysis and interpretation of prenatal diagnostic imaging studies, Part 2: descriptive and inferential statistical methods.

    PubMed

    Tuuli, Methodius G; Odibo, Anthony O

    2011-08-01

    The objective of this article is to discuss the rationale for common statistical tests used for the analysis and interpretation of prenatal diagnostic imaging studies. Examples from the literature are used to illustrate descriptive and inferential statistics. The uses and limitations of linear and logistic regression analyses are discussed in detail.

  13. Statistical analysis of Thematic Mapper Simulator data for the geobotanical discrimination of rock types in southwest Oregon

    NASA Technical Reports Server (NTRS)

    Morrissey, L. A.; Weinstock, K. J.; Mouat, D. A.; Card, D. H.

    1984-01-01

    An evaluation of Thematic Mapper Simulator (TMS) data for the geobotanical discrimination of rock types based on vegetative cover characteristics is addressed in this research. A methodology for accomplishing this evaluation utilizing univariate and multivariate techniques is presented. TMS data acquired with a Daedalus DEI-1260 multispectral scanner were integrated with vegetation and geologic information for subsequent statistical analyses, which included a chi-square test, an analysis of variance, stepwise discriminant analysis, and Duncan's multiple range test. Results indicate that ultramafic rock types are spectrally separable from nonultramafics based on vegetative cover through the use of statistical analyses.

  14. A Meta-Meta-Analysis: Empirical Review of Statistical Power, Type I Error Rates, Effect Sizes, and Model Selection of Meta-Analyses Published in Psychology

    ERIC Educational Resources Information Center

    Cafri, Guy; Kromrey, Jeffrey D.; Brannick, Michael T.

    2010-01-01

    This article uses meta-analyses published in "Psychological Bulletin" from 1995 to 2005 to describe meta-analyses in psychology, including examination of statistical power, Type I errors resulting from multiple comparisons, and model choice. Retrospective power estimates indicated that univariate categorical and continuous moderators, individual…

  15. 75 FR 24718 - Guidance for Industry on Documenting Statistical Analysis Programs and Data Files; Availability

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-05-05

    ...] Guidance for Industry on Documenting Statistical Analysis Programs and Data Files; Availability AGENCY... Programs and Data Files.'' This guidance is provided to inform study statisticians of recommendations for documenting statistical analyses and data files submitted to the Center for Veterinary Medicine (CVM) for the...

  16. A statistical package for computing time and frequency domain analysis

    NASA Technical Reports Server (NTRS)

    Brownlow, J.

    1978-01-01

    The spectrum analysis (SPA) program is a general purpose digital computer program designed to aid in data analysis. The program does time and frequency domain statistical analyses as well as some preanalysis data preparation. The capabilities of the SPA program include linear trend removal and/or digital filtering of data, plotting and/or listing of both filtered and unfiltered data, time domain statistical characterization of data, and frequency domain statistical characterization of data.
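
    SPA is a 1978 NASA program, but the operations described (trend removal, filtering, and time- and frequency-domain characterisation) map onto standard routines. A small Python sketch of that pipeline follows; the sampling rate and simulated signal are assumptions for illustration.

```python
import numpy as np
from scipy import signal

fs = 100.0                                       # assumed sampling rate in Hz
t = np.arange(0, 10, 1 / fs)
x = 0.05 * t + np.sin(2 * np.pi * 5 * t) \
    + 0.3 * np.random.default_rng(1).normal(size=t.size)   # invented signal

x_detrended = signal.detrend(x, type="linear")   # linear trend removal
# Time-domain statistical characterisation
print(x_detrended.mean(), x_detrended.std(), x_detrended.min(), x_detrended.max())
# Frequency-domain statistical characterisation
freqs, psd = signal.periodogram(x_detrended, fs=fs)
print(freqs[np.argmax(psd)])                     # dominant frequency, ~5 Hz here
```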

  17. Biomechanical Analysis of Military Boots. Phase 1. Materials Testing of Military and Commercial Footwear

    DTIC Science & Technology

    1992-10-01

    N=8) and Results of 44 Statistical Analyses for Impact Test Performed on Forefoot of Unworn Footwear A-2. Summary Statistics (N=8) and Results of...on Forefoot of Worn Footwear VIII Tables (continued) Table Page B-2. Summary Statistics (N=4) and Results of 76 Statistical Analyses for Impact...used tests to assess heel and forefoot shock absorption, upper and sole durability, and flexibility (Cavanagh, 1978). Later, the number of tests was

  18. Quantifying, displaying and accounting for heterogeneity in the meta-analysis of RCTs using standard and generalised Q statistics

    PubMed Central

    2011-01-01

    Background Clinical researchers have often preferred to use a fixed effects model for the primary interpretation of a meta-analysis. Heterogeneity is usually assessed via the well known Q and I2 statistics, along with the random effects estimate they imply. In recent years, alternative methods for quantifying heterogeneity have been proposed, that are based on a 'generalised' Q statistic. Methods We review 18 IPD meta-analyses of RCTs into treatments for cancer, in order to quantify the amount of heterogeneity present and also to discuss practical methods for explaining heterogeneity. Results Differing results were obtained when the standard Q and I2 statistics were used to test for the presence of heterogeneity. The two meta-analyses with the largest amount of heterogeneity were investigated further, and on inspection the straightforward application of a random effects model was not deemed appropriate. Compared to the standard Q statistic, the generalised Q statistic provided a more accurate platform for estimating the amount of heterogeneity in the 18 meta-analyses. Conclusions Explaining heterogeneity via the pre-specification of trial subgroups, graphical diagnostic tools and sensitivity analyses produced a more desirable outcome than an automatic application of the random effects model. Generalised Q statistic methods for quantifying and adjusting for heterogeneity should be incorporated as standard into statistical software. Software is provided to help achieve this aim. PMID:21473747
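
    For reference, the standard (non-generalised) Q and I² statistics discussed here can be computed directly from study effect estimates and variances. A minimal sketch follows, with invented effect sizes and variances; the generalised Q of the paper is not implemented.

```python
import numpy as np

def q_and_i2(effects, variances):
    """Cochran's Q and the I^2 heterogeneity statistic from a fixed-effect pooling."""
    y = np.asarray(effects, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)       # inverse-variance weights
    pooled = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - pooled) ** 2)
    df = y.size - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2

# Hypothetical log hazard ratios and their variances from five trials.
print(q_and_i2([-0.20, -0.35, -0.05, -0.50, -0.10],
               [0.02, 0.03, 0.04, 0.05, 0.03]))
```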

  19. Nonindependence and sensitivity analyses in ecological and evolutionary meta-analyses.

    PubMed

    Noble, Daniel W A; Lagisz, Malgorzata; O'dea, Rose E; Nakagawa, Shinichi

    2017-05-01

    Meta-analysis is an important tool for synthesizing research on a variety of topics in ecology and evolution, including molecular ecology, but can be susceptible to nonindependence. Nonindependence can affect two major interrelated components of a meta-analysis: (i) the calculation of effect size statistics and (ii) the estimation of overall meta-analytic estimates and their uncertainty. While some solutions to nonindependence exist at the statistical analysis stages, there is little advice on what to do when complex analyses are not possible, or when studies with nonindependent experimental designs exist in the data. Here we argue that exploring the effects of procedural decisions in a meta-analysis (e.g. inclusion of different quality data, choice of effect size) and statistical assumptions (e.g. assuming no phylogenetic covariance) using sensitivity analyses are extremely important in assessing the impact of nonindependence. Sensitivity analyses can provide greater confidence in results and highlight important limitations of empirical work (e.g. impact of study design on overall effects). Despite their importance, sensitivity analyses are seldom applied to problems of nonindependence. To encourage better practice for dealing with nonindependence in meta-analytic studies, we present accessible examples demonstrating the impact that ignoring nonindependence can have on meta-analytic estimates. We also provide pragmatic solutions for dealing with nonindependent study designs, and for analysing dependent effect sizes. Additionally, we offer reporting guidelines that will facilitate disclosure of the sources of nonindependence in meta-analyses, leading to greater transparency and more robust conclusions. © 2017 John Wiley & Sons Ltd.

  20. Statistical analysis plan for the Alveolar Recruitment for Acute Respiratory Distress Syndrome Trial (ART). A randomized controlled trial

    PubMed Central

    Damiani, Lucas Petri; Berwanger, Otavio; Paisani, Denise; Laranjeira, Ligia Nasi; Suzumura, Erica Aranha; Amato, Marcelo Britto Passos; Carvalho, Carlos Roberto Ribeiro; Cavalcanti, Alexandre Biasi

    2017-01-01

    Background The Alveolar Recruitment for Acute Respiratory Distress Syndrome Trial (ART) is an international multicenter randomized pragmatic controlled trial with allocation concealment involving 120 intensive care units in Brazil, Argentina, Colombia, Italy, Poland, Portugal, Malaysia, Spain, and Uruguay. The primary objective of ART is to determine whether maximum stepwise alveolar recruitment associated with PEEP titration, adjusted according to the static compliance of the respiratory system (ART strategy), is able to increase 28-day survival in patients with acute respiratory distress syndrome compared to conventional treatment (ARDSNet strategy). Objective To describe the data management process and statistical analysis plan. Methods The statistical analysis plan was designed by the trial executive committee and reviewed and approved by the trial steering committee. We provide an overview of the trial design with a special focus on describing the primary (28-day survival) and secondary outcomes. We describe our data management process, data monitoring committee, interim analyses, and sample size calculation. We describe our planned statistical analyses for primary and secondary outcomes as well as pre-specified subgroup analyses. We also provide details for presenting results, including mock tables for baseline characteristics, adherence to the protocol and effect on clinical outcomes. Conclusion According to best trial practice, we report our statistical analysis plan and data management plan prior to locking the database and beginning analyses. We anticipate that this document will prevent analysis bias and enhance the utility of the reported results. Trial registration ClinicalTrials.gov number, NCT01374022. PMID:28977255

  1. Secondary Analysis of National Longitudinal Transition Study 2 Data

    ERIC Educational Resources Information Center

    Hicks, Tyler A.; Knollman, Greg A.

    2015-01-01

    This review examines published secondary analyses of National Longitudinal Transition Study 2 (NLTS2) data, with a primary focus upon statistical objectives, paradigms, inferences, and methods. Its primary purpose was to determine which statistical techniques have been common in secondary analyses of NLTS2 data. The review begins with an…

  2. SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.

    PubMed

    Chu, Annie; Cui, Jenny; Dinov, Ivo D

    2009-03-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most updated information and newly added models.
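
    SOCR Analyses is a Java web toolkit; for readers who prefer scripted analysis, the specific tests listed in the abstract also exist as one-line calls in SciPy. The sketch below simply runs those tests on simulated data and is not part of the SOCR code base.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
a, b, c = rng.normal(0, 1, 30), rng.normal(0.5, 1, 30), rng.normal(1.0, 1, 30)

print(stats.ttest_ind(a, b))             # parametric two-sample comparison
print(stats.mannwhitneyu(a, b))          # Wilcoxon rank-sum equivalent
print(stats.kruskal(a, b, c))            # non-parametric one-way comparison
print(stats.friedmanchisquare(a, b, c))  # Friedman's repeated-measures test
print(stats.fisher_exact([[8, 2], [1, 9]]))  # 2x2 contingency table
```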

  3. Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data.

    PubMed

    Tintle, Nathan L; Sitarik, Alexandra; Boerema, Benjamin; Young, Kylie; Best, Aaron A; Dejongh, Matthew

    2012-08-08

    Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.

  4. Patterns of medicinal plant use: an examination of the Ecuadorian Shuar medicinal flora using contingency table and binomial analyses.

    PubMed

    Bennett, Bradley C; Husby, Chad E

    2008-03-28

    Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R² from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
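
    As a hedged illustration of the binomial approach described, the sketch below tests whether the number of medicinal species in each family departs from the flora-wide proportion. The counts are invented, and scipy.stats.binomtest (SciPy >= 1.7; older versions expose binom_test) stands in for whatever software the authors used.

```python
from scipy import stats

# Hypothetical counts: medicinal and total species in a few families, plus
# flora-wide totals that define the null proportion of medicinal species.
flora_medicinal, flora_total = 730, 4000
families = {"Asteraceae": (60, 250), "Solanaceae": (35, 90), "Poaceae": (20, 300)}

p_null = flora_medicinal / flora_total
for name, (used, total) in families.items():
    res = stats.binomtest(used, n=total, p=p_null, alternative="two-sided")
    print(f"{name}: observed {used}/{total}, p = {res.pvalue:.4f}")
```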

  5. Citation of previous meta-analyses on the same topic: a clue to perpetuation of incorrect methods?

    PubMed

    Li, Tianjing; Dickersin, Kay

    2013-06-01

    Systematic reviews and meta-analyses serve as a basis for decision-making and clinical practice guidelines and should be carried out using appropriate methodology to avoid incorrect inferences. We describe the characteristics, statistical methods used for meta-analyses, and citation patterns of all 21 glaucoma systematic reviews we identified pertaining to the effectiveness of prostaglandin analog eye drops in treating primary open-angle glaucoma, published between December 2000 and February 2012. We abstracted data, assessed whether appropriate statistical methods were applied in meta-analyses, and examined citation patterns of included reviews. We identified two forms of problematic statistical analyses in 9 of the 21 systematic reviews examined. Except in 1 case, none of the 9 reviews that used incorrect statistical methods cited a previously published review that used appropriate methods. Reviews that used incorrect methods were cited 2.6 times more often than reviews that used appropriate statistical methods. We speculate that by emulating the statistical methodology of previous systematic reviews, systematic review authors may have perpetuated incorrect approaches to meta-analysis. The use of incorrect statistical methods, perhaps through emulating methods described in previous research, calls conclusions of systematic reviews into question and may lead to inappropriate patient care. We urge systematic review authors and journal editors to seek the advice of experienced statisticians before undertaking or accepting for publication a systematic review and meta-analysis. The author(s) have no proprietary or commercial interest in any materials discussed in this article. Copyright © 2013 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.

  6. Separate-channel analysis of two-channel microarrays: recovering inter-spot information.

    PubMed

    Smyth, Gordon K; Altman, Naomi S

    2013-05-26

    Two-channel (or two-color) microarrays are cost-effective platforms for comparative analysis of gene expression. They are traditionally analysed in terms of the log-ratios (M-values) of the two channel intensities at each spot, but this analysis does not use all the information available in the separate channel observations. Mixed models have been proposed to analyse intensities from the two channels as separate observations, but such models can be complex to use and the gain in efficiency over the log-ratio analysis is difficult to quantify. Mixed models yield test statistics for which the null distributions can be specified only approximately, and some approaches do not borrow strength between genes. This article reformulates the mixed model to clarify the relationship with the traditional log-ratio analysis, to facilitate information borrowing between genes, and to obtain an exact distributional theory for the resulting test statistics. The mixed model is transformed to operate on the M-values and A-values (average log-expression for each spot) instead of on the log-expression values. The log-ratio analysis is shown to ignore information contained in the A-values. The relative efficiency of the log-ratio analysis is shown to depend on the size of the intra-spot correlation. A new separate channel analysis method is proposed that assumes a constant intra-spot correlation coefficient across all genes. This approach permits the mixed model to be transformed into an ordinary linear model, allowing the data analysis to use a well-understood empirical Bayes analysis pipeline for linear modeling of microarray data. This yields statistically powerful test statistics that have an exact distributional theory. The log-ratio, mixed model and common correlation methods are compared using three case studies. The results show that separate channel analyses that borrow strength between genes are more powerful than log-ratio analyses. The common correlation analysis is the most powerful of all. The common correlation method proposed in this article for separate-channel analysis of two-channel microarray data is no more difficult to apply in practice than the traditional log-ratio analysis. It provides an intuitive and powerful means to conduct analyses and make comparisons that might otherwise not be possible.

  7. The effect of berberine on insulin resistance in women with polycystic ovary syndrome: detailed statistical analysis plan (SAP) for a multicenter randomized controlled trial.

    PubMed

    Zhang, Ying; Sun, Jin; Zhang, Yun-Jiao; Chai, Qian-Yun; Zhang, Kang; Ma, Hong-Li; Wu, Xiao-Ke; Liu, Jian-Ping

    2016-10-21

    Although Traditional Chinese Medicine (TCM) has been widely used in clinical settings, a major challenge that remains in TCM is to evaluate its efficacy scientifically. This randomized controlled trial aims to evaluate the efficacy and safety of berberine in the treatment of patients with polycystic ovary syndrome. In order to improve the transparency and research quality of this clinical trial, we prepared this statistical analysis plan (SAP). The trial design, primary and secondary outcomes, and safety outcomes were declared to reduce selection biases in data analysis and result reporting. We specified detailed methods for data management and statistical analyses. Statistics in corresponding tables, listings, and graphs were outlined. The SAP provided more detailed information than trial protocol on data management and statistical analysis methods. Any post hoc analyses could be identified via referring to this SAP, and the possible selection bias and performance bias will be reduced in the trial. This study is registered at ClinicalTrials.gov, NCT01138930 , registered on 7 June 2010.

  8. Dealing with missing standard deviation and mean values in meta-analysis of continuous outcomes: a systematic review.

    PubMed

    Weir, Christopher J; Butcher, Isabella; Assi, Valentina; Lewis, Stephanie C; Murray, Gordon D; Langhorne, Peter; Brady, Marian C

    2018-03-07

    Rigorous, informative meta-analyses rely on availability of appropriate summary statistics or individual participant data. For continuous outcomes, especially those with naturally skewed distributions, summary information on the mean or variability often goes unreported. While full reporting of original trial data is the ideal, we sought to identify methods for handling unreported mean or variability summary statistics in meta-analysis. We undertook two systematic literature reviews to identify methodological approaches used to deal with missing mean or variability summary statistics. Five electronic databases were searched, in addition to the Cochrane Colloquium abstract books and the Cochrane Statistics Methods Group mailing list archive. We also conducted cited reference searching and emailed topic experts to identify recent methodological developments. Details recorded included the description of the method, the information required to implement the method, any underlying assumptions and whether the method could be readily applied in standard statistical software. We provided a summary description of the methods identified, illustrating selected methods in example meta-analysis scenarios. For missing standard deviations (SDs), following screening of 503 articles, fifteen methods were identified in addition to those reported in a previous review. These included Bayesian hierarchical modelling at the meta-analysis level; summary statistic level imputation based on observed SD values from other trials in the meta-analysis; a practical approximation based on the range; and algebraic estimation of the SD based on other summary statistics. Following screening of 1124 articles for methods estimating the mean, one approximate Bayesian computation approach and three papers based on alternative summary statistics were identified. Illustrative meta-analyses showed that when replacing a missing SD the approximation using the range minimised loss of precision and generally performed better than omitting trials. When estimating missing means, a formula using the median, lower quartile and upper quartile performed best in preserving the precision of the meta-analysis findings, although in some scenarios, omitting trials gave superior results. Methods based on summary statistics (minimum, maximum, lower quartile, upper quartile, median) reported in the literature facilitate more comprehensive inclusion of randomised controlled trials with missing mean or variability summary statistics within meta-analyses.
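
    Two of the simplest approximations mentioned in this review can be written in a few lines. The sketch below uses the common range/4 rule for a missing SD and the (q1 + median + q3)/3 rule for a missing mean; these are illustrative versions of the kinds of approximation described, not necessarily the exact formulas evaluated in the paper.

```python
def sd_from_range(minimum, maximum):
    """Rough SD approximation from the range (range / 4), a common practical rule."""
    return (maximum - minimum) / 4.0

def mean_from_quartiles(q1, median, q3):
    """Mean approximation from the lower quartile, median and upper quartile."""
    return (q1 + median + q3) / 3.0

print(sd_from_range(2.0, 18.0))              # 4.0
print(mean_from_quartiles(4.0, 7.0, 12.0))   # about 7.67
```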

  9. Meta-analysis of neutropenia or leukopenia as a prognostic factor in patients with malignant disease undergoing chemotherapy.

    PubMed

    Shitara, Kohei; Matsuo, Keitaro; Oze, Isao; Mizota, Ayako; Kondo, Chihiro; Nomura, Motoo; Yokota, Tomoya; Takahari, Daisuke; Ura, Takashi; Muro, Kei

    2011-08-01

    We performed a systematic review and meta-analysis to determine the impact of neutropenia or leukopenia experienced during chemotherapy on survival. Eligible studies included prospective or retrospective analyses that evaluated neutropenia or leukopenia as a prognostic factor for overall survival or disease-free survival. Statistical analyses were conducted to calculate a summary hazard ratio and 95% confidence interval (CI) using random-effects or fixed-effects models based on the heterogeneity of the included studies. Thirteen trials were selected for the meta-analysis, with a total of 9,528 patients. The hazard ratio of death was 0.69 (95% CI, 0.64-0.75) for patients with higher-grade neutropenia or leukopenia compared to patients with lower-grade or lack of cytopenia. Our analysis was also stratified by statistical method (any statistical method to decrease lead-time bias; time-varying analysis or landmark analysis), but no differences were observed. Our results indicate that neutropenia or leukopenia experienced during chemotherapy is associated with improved survival in patients with advanced cancer or hematological malignancies undergoing chemotherapy. Future prospective analyses designed to investigate the potential impact of chemotherapy dose adjustment coupled with monitoring of neutropenia or leukopenia on survival are warranted.

  10. The analysis of morphometric data on rocky mountain wolves and arctic wolves using statistical method

    NASA Astrophysics Data System (ADS)

    Ammar Shafi, Muhammad; Saifullah Rusiman, Mohd; Hamzah, Nor Shamsidah Amir; Nor, Maria Elena; Ahmad, Noor’ani; Azia Hazida Mohamad Azmi, Nur; Latip, Muhammad Faez Ab; Hilmi Azman, Ahmad

    2018-04-01

    Morphometrics is a quantitative analysis depending on the shape and size of several specimens. Morphometric quantitative analyses are commonly used to analyse the fossil record and the shape and size of specimens, among other applications. The aim of the study is to find the differences between rocky mountain wolves and arctic wolves based on gender. The sample utilised secondary data comprising seven independent variables and two dependent variables. Statistical modelling was used in the analysis, namely the analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA). The results showed that arctic wolves and rocky mountain wolves differ based on the independent factors and gender.

  11. A systematic review of the quality of statistical methods employed for analysing quality of life data in cancer randomised controlled trials.

    PubMed

    Hamel, Jean-Francois; Saulnier, Patrick; Pe, Madeline; Zikos, Efstathios; Musoro, Jammbe; Coens, Corneel; Bottomley, Andrew

    2017-09-01

    Over the last decades, Health-related Quality of Life (HRQoL) end-points have become an important outcome of the randomised controlled trials (RCTs). HRQoL methodology in RCTs has improved following international consensus recommendations. However, no international recommendations exist concerning the statistical analysis of such data. The aim of our study was to identify and characterise the quality of the statistical methods commonly used for analysing HRQoL data in cancer RCTs. Building on our recently published systematic review, we analysed a total of 33 published RCTs studying the HRQoL methods reported in RCTs since 1991. We focussed on the ability of the methods to deal with the three major problems commonly encountered when analysing HRQoL data: their multidimensional and longitudinal structure and the commonly high rate of missing data. All studies reported HRQoL being assessed repeatedly over time for a period ranging from 2 to 36 months. Missing data were common, with compliance rates ranging from 45% to 90%. From the 33 studies considered, 12 different statistical methods were identified. Twenty-nine studies analysed each of the questionnaire sub-dimensions without type I error adjustment. Thirteen studies repeated the HRQoL analysis at each assessment time again without type I error adjustment. Only 8 studies used methods suitable for repeated measurements. Our findings show a lack of consistency in statistical methods for analysing HRQoL data. Problems related to multiple comparisons were rarely considered leading to a high risk of false positive results. It is therefore critical that international recommendations for improving such statistical practices are developed. Copyright © 2017. Published by Elsevier Ltd.
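
    The review's central complaint is repeated testing of many sub-dimensions and time points without type I error adjustment. A minimal sketch of such an adjustment, using statsmodels' multipletests with Bonferroni, Holm and Benjamini-Hochberg corrections on invented p-values, is shown below.

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Hypothetical raw p-values from testing each HRQoL sub-dimension separately.
raw_p = np.array([0.003, 0.021, 0.048, 0.260, 0.012, 0.430, 0.049, 0.071])

for method in ("bonferroni", "holm", "fdr_bh"):
    reject, adj_p, _, _ = multipletests(raw_p, alpha=0.05, method=method)
    print(method, np.round(adj_p, 3), int(reject.sum()), "significant")
```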

  12. Trial Sequential Analysis in systematic reviews with meta-analysis.

    PubMed

    Wetterslev, Jørn; Jakobsen, Janus Christian; Gluud, Christian

    2017-03-06

    Most meta-analyses in systematic reviews, including Cochrane ones, do not have sufficient statistical power to detect or refute even large intervention effects. This is why a meta-analysis ought to be regarded as an interim analysis on its way towards a required information size. The results of the meta-analyses should relate the total number of randomised participants to the estimated required meta-analytic information size accounting for statistical diversity. When the number of participants and the corresponding number of trials in a meta-analysis are insufficient, the use of the traditional 95% confidence interval or the 5% statistical significance threshold will lead to too many false positive conclusions (type I errors) and too many false negative conclusions (type II errors). We developed a methodology for interpreting meta-analysis results, using generally accepted, valid evidence on how to adjust thresholds for significance in randomised clinical trials when the required sample size has not been reached. The Lan-DeMets trial sequential monitoring boundaries in Trial Sequential Analysis offer adjusted confidence intervals and restricted thresholds for statistical significance when the diversity-adjusted required information size and the corresponding number of required trials for the meta-analysis have not been reached. Trial Sequential Analysis provides a frequentist approach to control both type I and type II errors. We define the required information size and the corresponding number of required trials in a meta-analysis and the diversity (D²) measure of heterogeneity. We explain the reasons for using Trial Sequential Analysis of meta-analysis when the actual information size fails to reach the required information size. We present examples drawn from traditional meta-analyses using unadjusted naïve 95% confidence intervals and 5% thresholds for statistical significance. Spurious conclusions in systematic reviews with traditional meta-analyses can be reduced using Trial Sequential Analysis. Several empirical studies have demonstrated that the Trial Sequential Analysis provides better control of type I errors and of type II errors than the traditional naïve meta-analysis. Trial Sequential Analysis represents analysis of meta-analytic data, with transparent assumptions, and better control of type I and type II errors than the traditional meta-analysis using naïve unadjusted confidence intervals.

  13. Global atmospheric circulation statistics, 1000-1 mb

    NASA Technical Reports Server (NTRS)

    Randel, William J.

    1992-01-01

    The atlas presents atmospheric general circulation statistics derived from twelve years (1979-90) of daily National Meteorological Center (NMC) operational geopotential height analyses; it is an update of a prior atlas using data over 1979-1986. These global analyses are available on pressure levels covering 1000-1 mb (approximately 0-50 km). The geopotential grids are a combined product of the Climate Analysis Center (which produces analyses over 70-1 mb) and operational NMC analyses (over 1000-100 mb). Balance horizontal winds and hydrostatic temperatures are derived from the geopotential fields.

  14. Distinguishing Mediational Models and Analyses in Clinical Psychology: Atemporal Associations Do Not Imply Causation.

    PubMed

    Winer, E Samuel; Cervone, Daniel; Bryant, Jessica; McKinney, Cliff; Liu, Richard T; Nadorff, Michael R

    2016-09-01

    A popular way to attempt to discern causality in clinical psychology is through mediation analysis. However, mediation analysis is sometimes applied to research questions in clinical psychology when inferring causality is impossible. This practice may soon increase with new, readily available, and easy-to-use statistical advances. Thus, we here provide a heuristic to remind clinical psychological scientists of the assumptions of mediation analyses. We describe recent statistical advances and unpack assumptions of causality in mediation, underscoring the importance of time in understanding mediational hypotheses and analyses in clinical psychology. Example analyses demonstrate that statistical mediation can occur despite theoretical mediation being improbable. We propose a delineation of mediational effects derived from cross-sectional designs into the terms temporal and atemporal associations to emphasize time in conceptualizing process models in clinical psychology. The general implications for mediational hypotheses and the temporal frameworks from within which they may be drawn are discussed. © 2016 Wiley Periodicals, Inc.
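
    To make the distinction concrete, the sketch below computes a product-of-coefficients indirect effect with a bootstrap confidence interval on simulated cross-sectional data; as the abstract stresses, a "significant" result from such machinery says nothing about temporal or causal ordering. All data and coefficients here are invented.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
x = rng.normal(size=n)                       # predictor
m = 0.5 * x + rng.normal(size=n)             # hypothetical mediator
y = 0.4 * m + 0.2 * x + rng.normal(size=n)   # outcome

def indirect_effect(x, m, y):
    a = np.polyfit(x, m, 1)[0]                                        # X -> M slope
    b = np.linalg.lstsq(np.column_stack([m, x, np.ones_like(x)]),
                        y, rcond=None)[0][0]                          # M -> Y slope given X
    return a * b                                                      # product of coefficients

boot = np.array([
    indirect_effect(x[idx], m[idx], y[idx])
    for idx in (rng.integers(0, n, n) for _ in range(2000))           # bootstrap resamples
])
print("indirect effect:", indirect_effect(x, m, y))
print("95% bootstrap CI:", np.percentile(boot, [2.5, 97.5]))
```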

  15. Secondary Analysis of Qualitative Data.

    ERIC Educational Resources Information Center

    Turner, Paul D.

    The reanalysis of data to answer the original research question with better statistical techniques or to answer new questions with old data is not uncommon in quantitative studies. Meta-analysis and research syntheses have increased with the increase in research using similar statistical analyses, refinements of analytical techniques, and the…

  16. Impact of ontology evolution on functional analyses.

    PubMed

    Groß, Anika; Hartung, Michael; Prüfer, Kay; Kelso, Janet; Rahm, Erhard

    2012-10-15

    Ontologies are used in the annotation and analysis of biological data. As knowledge accumulates, ontologies and annotation undergo constant modifications to reflect this new knowledge. These modifications may influence the results of statistical applications such as functional enrichment analyses that describe experimental data in terms of ontological groupings. Here, we investigate to what degree modifications of the Gene Ontology (GO) impact these statistical analyses for both experimental and simulated data. The analysis is based on new measures for the stability of result sets and considers different ontology and annotation changes. Our results show that past changes in the GO are non-uniformly distributed over different branches of the ontology. Considering the semantic relatedness of significant categories in analysis results allows a more realistic stability assessment for functional enrichment studies. We observe that the results of term-enrichment analyses tend to be surprisingly stable despite changes in ontology and annotation.

  17. Meta‐analysis using individual participant data: one‐stage and two‐stage approaches, and why they may differ

    PubMed Central

    Ensor, Joie; Riley, Richard D.

    2016-01-01

    Meta‐analysis using individual participant data (IPD) obtains and synthesises the raw, participant‐level data from a set of relevant studies. The IPD approach is becoming an increasingly popular tool as an alternative to traditional aggregate data meta‐analysis, especially as it avoids reliance on published results and provides an opportunity to investigate individual‐level interactions, such as treatment‐effect modifiers. There are two statistical approaches for conducting an IPD meta‐analysis: one‐stage and two‐stage. The one‐stage approach analyses the IPD from all studies simultaneously, for example, in a hierarchical regression model with random effects. The two‐stage approach derives aggregate data (such as effect estimates) in each study separately and then combines these in a traditional meta‐analysis model. There have been numerous comparisons of the one‐stage and two‐stage approaches via theoretical consideration, simulation and empirical examples, yet there remains confusion regarding when each approach should be adopted, and indeed why they may differ. In this tutorial paper, we outline the key statistical methods for one‐stage and two‐stage IPD meta‐analyses, and provide 10 key reasons why they may produce different summary results. We explain that most differences arise because of different modelling assumptions, rather than the choice of one‐stage or two‐stage itself. We illustrate the concepts with recently published IPD meta‐analyses, summarise key statistical software and provide recommendations for future IPD meta‐analyses. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:27747915

  18. ISSUES IN THE STATISTICAL ANALYSIS OF SMALL-AREA HEALTH DATA. (R825173)

    EPA Science Inventory

    The availability of geographically indexed health and population data, with advances in computing, geographical information systems and statistical methodology, have opened the way for serious exploration of small area health statistics based on routine data. Such analyses may be...

  19. Ten Ways to Improve the Use of Statistical Mediation Analysis in the Practice of Child and Adolescent Treatment Research

    ERIC Educational Resources Information Center

    Maric, Marija; Wiers, Reinout W.; Prins, Pier J. M.

    2012-01-01

    Despite guidelines and repeated calls from the literature, statistical mediation analysis in youth treatment outcome research is rare. Even more concerning is that many studies that "have" reported mediation analyses do not fulfill basic requirements for mediation analysis, providing inconclusive data and clinical implications. As a result, after…

  20. Prison Radicalization: The New Extremist Training Grounds?

    DTIC Science & Technology

    2007-09-01

    distributing and collecting survey data, and the data analysis. The analytical methodology includes descriptive and inferential statistical methods, in... statistical analysis of the responses to identify significant correlations and relationships. B. SURVEY DATA COLLECTION To effectively access a...Q18, Q19, Q20, and Q21. Due to the exploratory nature of this small survey, data analyses were confined mostly to descriptive statistics and

  1. Coordinate based random effect size meta-analysis of neuroimaging studies.

    PubMed

    Tench, C R; Tanasescu, Radu; Constantinescu, C S; Auer, D P; Cottam, W J

    2017-06-01

    Low power in neuroimaging studies can make them difficult to interpret, and Coordinate based meta-analysis (CBMA) may go some way to mitigating this issue. CBMA has been used in many analyses to detect where published functional MRI or voxel-based morphometry studies testing similar hypotheses report significant summary results (coordinates) consistently. Only the reported coordinates and possibly t statistics are analysed, and statistical significance of clusters is determined by coordinate density. Here a method of performing coordinate based random effect size meta-analysis and meta-regression is introduced. The algorithm (ClusterZ) analyses both coordinates and reported t statistic or Z score, standardised by the number of subjects. Statistical significance is determined not by coordinate density, but by a random effects meta-analyses of reported effects performed cluster-wise using standard statistical methods and taking account of censoring inherent in the published summary results. Type 1 error control is achieved using the false cluster discovery rate (FCDR), which is based on the false discovery rate. This controls both the family wise error rate under the null hypothesis that coordinates are randomly drawn from a standard stereotaxic space, and the proportion of significant clusters that are expected under the null. Such control is necessary to avoid propagating and even amplifying the very issues motivating the meta-analysis in the first place. ClusterZ is demonstrated on both numerically simulated data and on real data from reports of grey matter loss in multiple sclerosis (MS) and syndromes suggestive of MS, and of painful stimulus in healthy controls. The software implementation is available to download and use freely. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. The Problem of Auto-Correlation in Parasitology

    PubMed Central

    Pollitt, Laura C.; Reece, Sarah E.; Mideo, Nicole; Nussey, Daniel H.; Colegrave, Nick

    2012-01-01

    Explaining the contribution of host and pathogen factors in driving infection dynamics is a major ambition in parasitology. There is increasing recognition that analyses based on single summary measures of an infection (e.g., peak parasitaemia) do not adequately capture infection dynamics and so, the appropriate use of statistical techniques to analyse dynamics is necessary to understand infections and, ultimately, control parasites. However, the complexities of within-host environments mean that tracking and analysing pathogen dynamics within infections and among hosts poses considerable statistical challenges. Simple statistical models make assumptions that will rarely be satisfied in data collected on host and parasite parameters. In particular, model residuals (unexplained variance in the data) should not be correlated in time or space. Here we demonstrate how failure to account for such correlations can result in incorrect biological inference from statistical analysis. We then show how mixed effects models can be used as a powerful tool to analyse such repeated measures data in the hope that this will encourage better statistical practices in parasitology. PMID:22511865
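
    As a rough illustration of the mixed-effects approach the authors recommend, the sketch below fits a random-intercept model to simulated parasitaemia measurements nested within hosts; all variable names and values are hypothetical, and the model is a generic example rather than the authors' own analysis.

        # Repeated measures within hosts: a random intercept per host relaxes the
        # assumption that residuals are independent across time points.
        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(1)
        hosts = np.repeat(np.arange(20), 10)            # 20 hosts, 10 time points each
        day = np.tile(np.arange(10), 20)
        host_effect = rng.normal(0, 1, 20)[hosts]       # host-level variation
        parasitaemia = 5 + 0.3 * day + host_effect + rng.normal(0, 0.5, 200)
        data = pd.DataFrame({"host": hosts, "day": day, "parasitaemia": parasitaemia})

        # Random intercept per host; add re_formula="~day" for a random slope as well.
        model = smf.mixedlm("parasitaemia ~ day", data, groups=data["host"])
        print(model.fit().summary())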

  3. SOCR Analyses – an Instructional Java Web-based Statistical Analysis Toolkit

    PubMed Central

    Chu, Annie; Cui, Jenny; Dinov, Ivo D.

    2011-01-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as the t-test in the parametric category, and the Wilcoxon rank sum test, Kruskal-Wallis test and Friedman's test in the non-parametric category. SOCR Analyses also includes several hypothesis test models, such as contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), in the hope of contributing to the efforts of the statistical computing community. The code includes functionality for each specific analysis model, and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with an API (Application Programming Interface) have been implemented for statistical summaries, least squares solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for the most up-to-date information and newly added models. PMID:21546994

  4. New software for statistical analysis of Cambridge Structural Database data

    PubMed Central

    Sykes, Richard A.; McCabe, Patrick; Allen, Frank H.; Battle, Gary M.; Bruno, Ian J.; Wood, Peter A.

    2011-01-01

    A collection of new software tools is presented for the analysis of geometrical, chemical and crystallographic data from the Cambridge Structural Database (CSD). This software supersedes the program Vista. The new functionality is integrated into the program Mercury in order to provide statistical, charting and plotting options alongside three-dimensional structural visualization and analysis. The integration also permits immediate access to other information about specific CSD entries through the Mercury framework, a common requirement in CSD data analyses. In addition, the new software includes a range of more advanced features focused towards structural analysis such as principal components analysis, cone-angle correction in hydrogen-bond analyses and the ability to deal with topological symmetry that may be exhibited in molecular search fragments. PMID:22477784

  5. Statistical Analysis Techniques for Small Sample Sizes

    NASA Technical Reports Server (NTRS)

    Navard, S. E.

    1984-01-01

    The small sample size problem encountered when dealing with analysis of space-flight data is examined. Because of the small amount of data available, careful analyses are essential to extract the maximum amount of information with acceptable accuracy. Statistical analysis of small samples is described. The background material necessary for understanding statistical hypothesis testing is outlined and the various tests which can be done on small samples are explained. Emphasis is on the underlying assumptions of each test and on considerations needed to choose the most appropriate test for a given type of analysis.

  6. Family Early Literacy Practices Questionnaire: A Validation Study for a Spanish-Speaking Population

    ERIC Educational Resources Information Center

    Lewis, Kandia

    2012-01-01

    The purpose of the current study was to evaluate the psychometric validity of a Spanish translated version of a family involvement questionnaire (the FELP) using a mixed-methods design. Thus, statistical analyses (i.e., factor analysis, reliability analysis, and item analysis) and qualitative analyses (i.e., focus group data) were assessed.…

  7. The Relationship between Visual Analysis and Five Statistical Analyses in a Simple AB Single-Case Research Design

    ERIC Educational Resources Information Center

    Brossart, Daniel F.; Parker, Richard I.; Olson, Elizabeth A.; Mahadevan, Lakshmi

    2006-01-01

    This study explored some practical issues for single-case researchers who rely on visual analysis of graphed data, but who also may consider supplemental use of promising statistical analysis techniques. The study sought to answer three major questions: (a) What is a typical range of effect sizes from these analytic techniques for data from…

  8. Biological Parametric Mapping: A Statistical Toolbox for Multi-Modality Brain Image Analysis

    PubMed Central

    Casanova, Ramon; Ryali, Srikanth; Baer, Aaron; Laurienti, Paul J.; Burdette, Jonathan H.; Hayasaka, Satoru; Flowers, Lynn; Wood, Frank; Maldjian, Joseph A.

    2006-01-01

    In recent years multiple brain MR imaging modalities have emerged; however, analysis methodologies have mainly remained modality specific. In addition, when comparing across imaging modalities, most researchers have been forced to rely on simple region-of-interest type analyses, which do not allow the voxel-by-voxel comparisons necessary to answer more sophisticated neuroscience questions. To overcome these limitations, we developed a toolbox for multimodal image analysis called biological parametric mapping (BPM), based on a voxel-wise use of the general linear model. The BPM toolbox incorporates information obtained from other modalities as regressors in a voxel-wise analysis, thereby permitting investigation of more sophisticated hypotheses. The BPM toolbox has been developed in MATLAB with a user friendly interface for performing analyses, including voxel-wise multimodal correlation, ANCOVA, and multiple regression. It has a high degree of integration with the SPM (statistical parametric mapping) software relying on it for visualization and statistical inference. Furthermore, statistical inference for a correlation field, rather than a widely-used T-field, has been implemented in the correlation analysis for more accurate results. An example with in-vivo data is presented demonstrating the potential of the BPM methodology as a tool for multimodal image analysis. PMID:17070709
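
    The voxel-wise use of the general linear model, with a second imaging modality entering as a regressor, can be illustrated with a minimal sketch on simulated arrays (ordinary least squares per voxel; this is not the BPM/SPM MATLAB code):

        # At each voxel, regress the primary modality on the second modality plus a
        # covariate across subjects, and keep the voxel-wise coefficient of interest.
        import numpy as np

        rng = np.random.default_rng(5)
        n_subjects, n_voxels = 30, 1000
        fmri = rng.normal(size=(n_subjects, n_voxels))                 # primary modality
        anat = 0.4 * fmri + rng.normal(size=(n_subjects, n_voxels))    # second modality
        age = rng.normal(50, 8, n_subjects)                            # covariate

        betas = np.empty(n_voxels)
        for v in range(n_voxels):
            X = np.column_stack([np.ones(n_subjects), anat[:, v], age])  # design matrix
            coef, *_ = np.linalg.lstsq(X, fmri[:, v], rcond=None)
            betas[v] = coef[1]            # voxel-wise effect of the second modality
        print(betas.mean())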

  9. Limitations of Using Microsoft Excel Version 2016 (MS Excel 2016) for Statistical Analysis for Medical Research.

    PubMed

    Tanavalee, Chotetawan; Luksanapruksa, Panya; Singhatanadgige, Weerasak

    2016-06-01

    Microsoft Excel (MS Excel) is a commonly used program for data collection and statistical analysis in biomedical research. However, this program has many limitations, including fewer functions that can be used for analysis and a limited number of total cells compared with dedicated statistical programs. MS Excel cannot complete analyses with blank cells, and cells must be selected manually for analysis. In addition, it requires multiple steps of data transformation and formulas to plot survival analysis graphs, among others. The Megastat add-on program, which will be supported by MS Excel 2016 soon, would eliminate some limitations of using statistical formulas within MS Excel.

  10. Ecological Momentary Assessments and Automated Time Series Analysis to Promote Tailored Health Care: A Proof-of-Principle Study.

    PubMed

    van der Krieke, Lian; Emerencia, Ando C; Bos, Elisabeth H; Rosmalen, Judith Gm; Riese, Harriëtte; Aiello, Marco; Sytema, Sjoerd; de Jonge, Peter

    2015-08-07

    Health promotion can be tailored by combining ecological momentary assessments (EMA) with time series analysis. This combined method allows for studying the temporal order of dynamic relationships among variables, which may provide concrete indications for intervention. However, application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required. This study aims to show how this limitation can be overcome by introducing automated vector autoregressive modeling (VAR) of EMA data and to evaluate its feasibility through comparisons with results of previously published manual analyses. We developed a Web-based open source application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. The statistical technique we used was VAR. AutoVAR tests and evaluates all possible VAR models within a given combinatorial search space and summarizes their results, thereby replacing the researcher's tasks of conducting the analysis, making an informed selection of models, and choosing the best model. We compared the output of AutoVAR to the output of a previously published manual analysis (n=4). An illustrative example consisting of 4 analyses was provided. Compared to the manual output, the AutoVAR output presents similar model characteristics and statistical results in terms of the Akaike information criterion, the Bayesian information criterion, and the test statistic of the Granger causality test. Results suggest that automated analysis and interpretation of time series is feasible. Compared to a manual procedure, the automated procedure is more robust and can save days of time. These findings may pave the way for using time series analysis for health promotion on a larger scale. AutoVAR was evaluated using the results of a previously conducted manual analysis. Analysis of additional datasets is needed in order to validate and refine the application for general use.
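
    A minimal sketch of this kind of automation, fitting vector autoregressive models over a small lag-order search space, selecting by information criterion and testing Granger causality, is shown below using statsmodels; the EMA variable names are hypothetical and this is not the AutoVAR implementation.

        # Fit VAR models up to lag 5, compare information criteria, pick the best
        # model by AIC, and test whether "mood" Granger-causes "activity".
        import numpy as np
        import pandas as pd
        from statsmodels.tsa.api import VAR

        rng = np.random.default_rng(0)
        n = 90                                          # e.g. 90 daily EMA observations
        mood = np.cumsum(rng.normal(0, 1, n))
        activity = 0.5 * np.roll(mood, 1) + rng.normal(0, 1, n)
        data = pd.DataFrame({"mood": mood, "activity": activity}).iloc[1:]

        model = VAR(data)
        print(model.select_order(maxlags=5).summary())  # AIC, BIC, HQIC per lag order
        best = model.fit(maxlags=5, ic="aic")           # best model by AIC
        print(best.test_causality("activity", ["mood"], kind="f").summary())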

  11. Ecological Momentary Assessments and Automated Time Series Analysis to Promote Tailored Health Care: A Proof-of-Principle Study

    PubMed Central

    Emerencia, Ando C; Bos, Elisabeth H; Rosmalen, Judith GM; Riese, Harriëtte; Aiello, Marco; Sytema, Sjoerd; de Jonge, Peter

    2015-01-01

    Background Health promotion can be tailored by combining ecological momentary assessments (EMA) with time series analysis. This combined method allows for studying the temporal order of dynamic relationships among variables, which may provide concrete indications for intervention. However, application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required. Objective This study aims to show how this limitation can be overcome by introducing automated vector autoregressive modeling (VAR) of EMA data and to evaluate its feasibility through comparisons with results of previously published manual analyses. Methods We developed a Web-based open source application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. The statistical technique we used was VAR. AutoVAR tests and evaluates all possible VAR models within a given combinatorial search space and summarizes their results, thereby replacing the researcher’s tasks of conducting the analysis, making an informed selection of models, and choosing the best model. We compared the output of AutoVAR to the output of a previously published manual analysis (n=4). Results An illustrative example consisting of 4 analyses was provided. Compared to the manual output, the AutoVAR output presents similar model characteristics and statistical results in terms of the Akaike information criterion, the Bayesian information criterion, and the test statistic of the Granger causality test. Conclusions Results suggest that automated analysis and interpretation of time series is feasible. Compared to a manual procedure, the automated procedure is more robust and can save days of time. These findings may pave the way for using time series analysis for health promotion on a larger scale. AutoVAR was evaluated using the results of a previously conducted manual analysis. Analysis of additional datasets is needed in order to validate and refine the application for general use. PMID:26254160

  12. Visualization of time series statistical data by shape analysis (GDP ratio changes among Asia countries)

    NASA Astrophysics Data System (ADS)

    Shirota, Yukari; Hashimoto, Takako; Fitri Sari, Riri

    2018-03-01

    Visualizing time series big data has become very important. In this paper we discuss a new analysis method called “statistical shape analysis” or “geometry driven statistics” applied to time series statistical data in economics. We analyse changes in agriculture value added and industry value added (as a percentage of GDP) from 2000 to 2010 in Asia. We handle the data as a set of landmarks on a two-dimensional image to see the deformation using the principal components. The key quantities of the method are the principal components of the given formation, which are eigenvectors of its bending energy matrix. The local deformation can be expressed as a set of non-affine transformations, which give us information about the local differences between 2000 and 2010. Because the non-affine transformation can be decomposed into a set of partial warps, we present the partial warps visually. Statistical shape analysis is widely used in biology, but no applications can be found in economics. In this paper, we investigate its potential for analysing economic data.

  13. Statistical Diversions

    ERIC Educational Resources Information Center

    Petocz, Peter; Sowey, Eric

    2012-01-01

    The term "data snooping" refers to the practice of choosing which statistical analyses to apply to a set of data after having first looked at those data. Data snooping contradicts a fundamental precept of applied statistics, that the scheme of analysis is to be planned in advance. In this column, the authors shall elucidate the…

  14. Analysis and meta-analysis of single-case designs: an introduction.

    PubMed

    Shadish, William R

    2014-04-01

    The last 10 years have seen great progress in the analysis and meta-analysis of single-case designs (SCDs). This special issue includes five articles that provide an overview of current work on that topic, including standardized mean difference statistics, multilevel models, Bayesian statistics, and generalized additive models. Each article analyzes a common example across articles and presents syntax or macros for how to do them. These articles are followed by commentaries from single-case design researchers and journal editors. This introduction briefly describes each article and then discusses several issues that must be addressed before we can know what analyses will eventually be best to use in SCD research. These issues include modeling trend, modeling error covariances, computing standardized effect size estimates, assessing statistical power, incorporating more accurate models of outcome distributions, exploring whether Bayesian statistics can improve estimation given the small samples common in SCDs, and the need for annotated syntax and graphical user interfaces that make complex statistics accessible to SCD researchers. The article then discusses reasons why SCD researchers are likely to incorporate statistical analyses into their research more often in the future, including changing expectations and contingencies regarding SCD research from outside SCD communities, changes and diversity within SCD communities, corrections of erroneous beliefs about the relationship between SCD research and statistics, and demonstrations of how statistics can help SCD researchers better meet their goals. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  15. Meta-analyses and Forest plots using a microsoft excel spreadsheet: step-by-step guide focusing on descriptive data analysis.

    PubMed

    Neyeloff, Jeruza L; Fuchs, Sandra C; Moreira, Leila B

    2012-01-20

    Meta-analyses are necessary to synthesize data obtained from primary research, and in many situations reviews of observational studies are the only available alternative. General purpose statistical packages can meta-analyze data, but usually require external macros or coding. Commercial specialist software is available, but may be expensive and focused on a particular type of primary data. Most available software packages have limitations in dealing with descriptive data, and the graphical display of summary statistics such as incidence and prevalence is unsatisfactory. Analyses can be conducted using Microsoft Excel, but there was no previous guide available. We constructed a step-by-step guide to perform a meta-analysis in a Microsoft Excel spreadsheet, using either fixed-effect or random-effects models. We have also developed a second spreadsheet capable of producing customized forest plots. It is possible to conduct a meta-analysis using only Microsoft Excel. More important, to our knowledge this is the first description of a method for producing a statistically adequate but graphically appealing forest plot summarizing descriptive data, using widely available software.
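
    The arithmetic the spreadsheet automates can be made explicit with a short sketch: inverse-variance fixed-effect pooling of study prevalences and the per-study confidence intervals a forest plot would display (the numbers are illustrative, not data from the paper):

        # Fixed-effect (inverse-variance) pooling of study prevalences, plus the
        # per-study confidence intervals a forest plot would display.
        import numpy as np

        events = np.array([12, 30, 8, 45])
        n = np.array([100, 250, 90, 400])
        p = events / n                                   # study prevalences
        var = p * (1 - p) / n                            # binomial variances
        lower, upper = p - 1.96 * np.sqrt(var), p + 1.96 * np.sqrt(var)
        print(np.round(np.column_stack([p, lower, upper]), 3))   # rows of the forest plot

        w = 1.0 / var                                    # inverse-variance weights
        pooled = np.sum(w * p) / np.sum(w)               # fixed-effect summary estimate
        pooled_se = np.sqrt(1.0 / np.sum(w))
        print(f"pooled prevalence = {pooled:.3f} "
              f"(95% CI {pooled - 1.96 * pooled_se:.3f} to {pooled + 1.96 * pooled_se:.3f})")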

  16. Meta-analyses and Forest plots using a microsoft excel spreadsheet: step-by-step guide focusing on descriptive data analysis

    PubMed Central

    2012-01-01

    Background Meta-analyses are necessary to synthesize data obtained from primary research, and in many situations reviews of observational studies are the only available alternative. General purpose statistical packages can meta-analyze data, but usually require external macros or coding. Commercial specialist software is available, but may be expensive and focused on a particular type of primary data. Most available software packages have limitations in dealing with descriptive data, and the graphical display of summary statistics such as incidence and prevalence is unsatisfactory. Analyses can be conducted using Microsoft Excel, but there was no previous guide available. Findings We constructed a step-by-step guide to perform a meta-analysis in a Microsoft Excel spreadsheet, using either fixed-effect or random-effects models. We have also developed a second spreadsheet capable of producing customized forest plots. Conclusions It is possible to conduct a meta-analysis using only Microsoft Excel. More important, to our knowledge this is the first description of a method for producing a statistically adequate but graphically appealing forest plot summarizing descriptive data, using widely available software. PMID:22264277

  17. Use of Statistical Analyses in the Ophthalmic Literature

    PubMed Central

    Lisboa, Renato; Meira-Freitas, Daniel; Tatham, Andrew J.; Marvasti, Amir H.; Sharpsten, Lucie; Medeiros, Felipe A.

    2014-01-01

    Purpose To identify the most commonly used statistical analyses in the ophthalmic literature and to determine the likely gain in comprehension of the literature that readers could expect if they were to sequentially add knowledge of more advanced techniques to their statistical repertoire. Design Cross-sectional study Methods All articles published from January 2012 to December 2012 in Ophthalmology, American Journal of Ophthalmology and Archives of Ophthalmology were reviewed. A total of 780 peer-reviewed articles were included. Two reviewers examined each article and assigned categories to each one depending on the type of statistical analyses used. Discrepancies between reviewers were resolved by consensus. Main Outcome Measures Total number and percentage of articles containing each category of statistical analysis were obtained. Additionally we estimated the accumulated number and percentage of articles that a reader would be expected to be able to interpret depending on their statistical repertoire. Results Readers with little or no statistical knowledge would be expected to be able to interpret the statistical methods presented in only 20.8% of articles. In order to understand more than half (51.4%) of the articles published, readers were expected to be familiar with at least 15 different statistical methods. Knowledge of 21 categories of statistical methods was necessary to comprehend 70.9% of articles, while knowledge of more than 29 categories was necessary to comprehend more than 90% of articles. Articles in retina and glaucoma subspecialties showed a tendency for using more complex analysis when compared to cornea. Conclusions Readers of clinical journals in ophthalmology need to have substantial knowledge of statistical methodology to understand the results of published studies in the literature. The frequency of use of complex statistical analyses also indicates that those involved in the editorial peer-review process must have sound statistical knowledge in order to critically appraise articles submitted for publication. The results of this study could provide guidance to direct the statistical learning of clinical ophthalmologists, researchers and educators involved in the design of courses for residents and medical students. PMID:24612977

  18. Analysis and meta-analysis of single-case designs with a standardized mean difference statistic: a primer and applications.

    PubMed

    Shadish, William R; Hedges, Larry V; Pustejovsky, James E

    2014-04-01

    This article presents a d-statistic for single-case designs that is in the same metric as the d-statistic used in between-subjects designs such as randomized experiments and offers some reasons why such a statistic would be useful in SCD research. The d has a formal statistical development, is accompanied by appropriate power analyses, and can be estimated using user-friendly SPSS macros. We discuss both advantages and disadvantages of d compared to other approaches such as previous d-statistics, overlap statistics, and multilevel modeling. It requires at least three cases for computation and assumes normally distributed outcomes and stationarity, assumptions that are discussed in some detail. We also show how to test these assumptions. The core of the article then demonstrates in depth how to compute d for one study, including estimation of the autocorrelation and the ratio of between case variance to total variance (between case plus within case variance), how to compute power using a macro, and how to use the d to conduct a meta-analysis of studies using single-case designs in the free program R, including syntax in an appendix. This syntax includes how to read data, compute fixed and random effect average effect sizes, prepare a forest plot and a cumulative meta-analysis, estimate various influence statistics to identify studies contributing to heterogeneity and effect size, and do various kinds of publication bias analyses. This d may prove useful for both the analysis and meta-analysis of data from SCDs. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  19. Measuring the statistical validity of summary meta-analysis and meta-regression results for use in clinical practice.

    PubMed

    Willis, Brian H; Riley, Richard D

    2017-09-20

    An important question for clinicians appraising a meta-analysis is: are the findings likely to be valid in their own practice-does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity-where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how we may test meta-analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
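
    The leave-one-out idea can be sketched as follows: re-estimate the summary effect with each study omitted and standardise the omitted study's deviation from that prediction. This shows only the cross-validation loop, not the Vn statistic or its distribution as derived in the paper, and the effects and variances are hypothetical.

        # Leave-one-out cross-validation of a (fixed-effect) meta-analysis estimate.
        import numpy as np

        def fixed_effect(effects, variances):
            w = 1.0 / variances
            return np.sum(w * effects) / np.sum(w), 1.0 / np.sum(w)  # estimate, variance

        effects = np.array([0.30, 0.45, 0.20, 0.55, 0.35])
        variances = np.array([0.02, 0.03, 0.04, 0.05, 0.02])

        z_scores = []
        for i in range(len(effects)):
            keep = np.arange(len(effects)) != i
            est, var_est = fixed_effect(effects[keep], variances[keep])
            # standardised difference between the omitted study and the prediction
            z_scores.append((effects[i] - est) / np.sqrt(variances[i] + var_est))
        print(np.round(z_scores, 2))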

  20. CADDIS Volume 4. Data Analysis: Basic Analyses

    EPA Pesticide Factsheets

    Use of statistical tests to determine if an observation is outside the normal range of expected values. Details of CART, regression analysis, use of quantile regression analysis, CART in causal analysis, simplifying or pruning resulting trees.

  1. Statistical analysis of fNIRS data: a comprehensive review.

    PubMed

    Tak, Sungho; Ye, Jong Chul

    2014-01-15

    Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activities using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of historical developments of statistical analyses of fNIRS signal, which include motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and statistical parameter mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effect model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some of the open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.

  2. Quasi-experimental study designs series-paper 10: synthesizing evidence for effects collected from quasi-experimental studies presents surmountable challenges.

    PubMed

    Becker, Betsy Jane; Aloe, Ariel M; Duvendack, Maren; Stanley, T D; Valentine, Jeffrey C; Fretheim, Atle; Tugwell, Peter

    2017-09-01

    To outline issues of importance to analytic approaches to the synthesis of quasi-experiments (QEs) and to provide a statistical model for use in analysis. We drew on studies of statistics, epidemiology, and social-science methodology to outline methods for synthesis of QE studies. The design and conduct of QEs, effect sizes from QEs, and moderator variables for the analysis of those effect sizes were discussed. Biases, confounding, design complexities, and comparisons across designs offer serious challenges to syntheses of QEs. Key components of meta-analyses of QEs were identified, including the aspects of QE study design to be coded and analyzed. Of utmost importance are the design and statistical controls implemented in the QEs. Such controls and any potential sources of bias and confounding must be modeled in analyses, along with aspects of the interventions and populations studied. Because of such controls, effect sizes from QEs are more complex than those from randomized experiments. A statistical meta-regression model that incorporates important features of the QEs under review was presented. Meta-analyses of QEs provide particular challenges, but thorough coding of intervention characteristics and study methods, along with careful analysis, should allow for sound inferences. Copyright © 2017 Elsevier Inc. All rights reserved.

  3. Detection of semi-volatile organic compounds in permeable ...

    EPA Pesticide Factsheets

    The Edison Environmental Center (EEC) has a research and demonstration permeable parking lot comprised of three different permeable systems: permeable asphalt, porous concrete and interlocking concrete permeable pavers. Water quality and quantity analysis has been ongoing since January 2010. This paper describes a subset of the water quality analysis: analysis of semivolatile organic compounds (SVOCs) to determine whether hydrocarbons were present in water infiltrated through the permeable surfaces. SVOCs were analyzed in samples collected on 11 dates over a 3-year period, from 2/8/2010 to 4/1/2013. Results are broadly divided into three categories: 42 chemicals were never detected; 12 chemicals (11 chemical tests) were detected at a rate of less than 10%; and 22 chemicals were detected at a frequency of 10% or greater (ranging from 10% to 66.5% detections). Fundamental and exploratory statistical analyses were performed on the results for this latter group by grouping results by surface type. The statistical analyses were limited due to the low frequency of detections and dilutions of samples, which affected detection limits. The infiltrate data through the three permeable surfaces were analyzed as non-parametric data by the Kaplan-Meier estimation method for fundamental statistics; there were some statistically observable differences in concentration between pavement types when using the Tarone-Ware comparison hypothesis test. Additionally, Spearman rank order non-parametric

  4. P-MartCancer-Interactive Online Software to Enable Analysis of Shotgun Cancer Proteomic Datasets.

    PubMed

    Webb-Robertson, Bobbie-Jo M; Bramer, Lisa M; Jensen, Jeffrey L; Kobold, Markus A; Stratton, Kelly G; White, Amanda M; Rodland, Karin D

    2017-11-01

    P-MartCancer is an interactive web-based software environment that enables statistical analyses of peptide or protein data, quantitated from mass spectrometry-based global proteomics experiments, without requiring in-depth knowledge of statistical programming. P-MartCancer offers a series of statistical modules associated with quality assessment, peptide and protein statistics, protein quantification, and exploratory data analyses driven by the user via customized workflows and interactive visualization. Currently, P-MartCancer offers access and the capability to analyze multiple cancer proteomic datasets generated through the Clinical Proteomics Tumor Analysis Consortium at the peptide, gene, and protein levels. P-MartCancer is deployed as a web service (https://pmart.labworks.org/cptac.html), alternatively available via Docker Hub (https://hub.docker.com/r/pnnl/pmart-web/). Cancer Res; 77(21); e47-50. ©2017 AACR . ©2017 American Association for Cancer Research.

  5. Living systematic reviews: 3. Statistical methods for updating meta-analyses.

    PubMed

    Simmonds, Mark; Salanti, Georgia; McKenzie, Joanne; Elliott, Julian

    2017-11-01

    A living systematic review (LSR) should keep the review current as new research evidence emerges. Any meta-analyses included in the review will also need updating as new material is identified. If the aim of the review is solely to present the best current evidence standard meta-analysis may be sufficient, provided reviewers are aware that results may change at later updates. If the review is used in a decision-making context, more caution may be needed. When using standard meta-analysis methods, the chance of incorrectly concluding that any updated meta-analysis is statistically significant when there is no effect (the type I error) increases rapidly as more updates are performed. Inaccurate estimation of any heterogeneity across studies may also lead to inappropriate conclusions. This paper considers four methods to avoid some of these statistical problems when updating meta-analyses: two methods, that is, law of the iterated logarithm and the Shuster method control primarily for inflation of type I error and two other methods, that is, trial sequential analysis and sequential meta-analysis control for type I and II errors (failing to detect a genuine effect) and take account of heterogeneity. This paper compares the methods and considers how they could be applied to LSRs. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. The Data from Aeromechanics Test and Analytics -- Management and Analysis Package (DATAMAP). Volume I. User’s Manual.

    DTIC Science & Technology

    1980-12-01

    to sound pressure level in decibels assuming a frequency of 1000 Hz. The perceived noisiness values are derived from a formula specified in... Excerpted section titles: Perceived Noise Level Analysis; Acoustic Weighting Networks; Derivations... Band analyses (octave analysis, third-octave analysis, perceived noise level) and basic statistical analyses (mean, variance, standard deviation calculation).

  7. CADDIS Volume 4. Data Analysis: Selecting an Analysis Approach

    EPA Pesticide Factsheets

    An approach for selecting statistical analyses to inform causal analysis. Describes methods for determining whether test site conditions differ from reference expectations. Describes an approach for estimating stressor-response relationships.

  8. An Exploratory Data Analysis System for Support in Medical Decision-Making

    PubMed Central

    Copeland, J. A.; Hamel, B.; Bourne, J. R.

    1979-01-01

    An experimental system was developed to allow retrieval and analysis of data collected during a study of neurobehavioral correlates of renal disease. After retrieving data organized in a relational data base, simple bivariate statistics of parametric and nonparametric nature could be conducted. An “exploratory” mode in which the system provided guidance in selection of appropriate statistical analyses was also available to the user. The system traversed a decision tree using the inherent qualities of the data (e.g., the identity and number of patients, tests, and time epochs) to search for the appropriate analyses to employ.
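
    A toy version of such an "exploratory" mode, walking a small decision tree over qualities of the data to suggest an analysis, might look like the following; the rules are illustrative and are not those of the original system.

        # Suggest a bivariate test from simple qualities of the data.
        from scipy import stats

        def suggest_test(x, y, paired):
            normal = stats.shapiro(x).pvalue > 0.05 and stats.shapiro(y).pvalue > 0.05
            if paired:
                return "paired t-test" if normal else "Wilcoxon signed-rank test"
            return "two-sample t-test" if normal else "Mann-Whitney U test"

        print(suggest_test([5.1, 4.8, 5.5, 5.0, 4.9],
                           [5.6, 5.4, 5.9, 5.7, 5.5], paired=True))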

  9. The Australasian Resuscitation in Sepsis Evaluation (ARISE) trial statistical analysis plan.

    PubMed

    Delaney, Anthony P; Peake, Sandra L; Bellomo, Rinaldo; Cameron, Peter; Holdgate, Anna; Howe, Belinda; Higgins, Alisa; Presneill, Jeffrey; Webb, Steve

    2013-09-01

    The Australasian Resuscitation in Sepsis Evaluation (ARISE) study is an international, multicentre, randomised, controlled trial designed to evaluate the effectiveness of early goal-directed therapy compared with standard care for patients presenting to the emergency department with severe sepsis. In keeping with current practice, and considering aspects of trial design and reporting specific to non-pharmacological interventions, our plan outlines the principles and methods for analysing and reporting the trial results. The document is prepared before completion of recruitment into the ARISE study, without knowledge of the results of the interim analysis conducted by the data safety and monitoring committee and before completion of the two related international studies. Our statistical analysis plan was designed by the ARISE chief investigators, and reviewed and approved by the ARISE steering committee. We reviewed the data collected by the research team as specified in the study protocol and detailed in the study case report form. We describe information related to baseline characteristics, characteristics of delivery of the trial interventions, details of resuscitation, other related therapies and other relevant data with appropriate comparisons between groups. We define the primary, secondary and tertiary outcomes for the study, with description of the planned statistical analyses. We have developed a statistical analysis plan with a trial profile, mock-up tables and figures. We describe a plan for presenting baseline characteristics, microbiological and antibiotic therapy, details of the interventions, processes of care and concomitant therapies and adverse events. We describe the primary, secondary and tertiary outcomes with identification of subgroups to be analysed. We have developed a statistical analysis plan for the ARISE study, available in the public domain, before the completion of recruitment into the study. This will minimise analytical bias and conforms to current best practice in conducting clinical trials.

  10. Identification of key micro-organisms involved in Douchi fermentation by statistical analysis and their use in an experimental fermentation.

    PubMed

    Chen, C; Xiang, J Y; Hu, W; Xie, Y B; Wang, T J; Cui, J W; Xu, Y; Liu, Z; Xiang, H; Xie, Q

    2015-11-01

    To screen and identify safe micro-organisms used during Douchi fermentation, and verify the feasibility of producing high-quality Douchi using these identified micro-organisms. PCR-denaturing gradient gel electrophoresis (DGGE) and automatic amino-acid analyser were used to investigate the microbial diversity and free amino acids (FAAs) content of 10 commercial Douchi samples. The correlations between microbial communities and FAAs were analysed by statistical analysis. Ten strains with significant positive correlation were identified. Then an experiment on Douchi fermentation by identified strains was carried out, and the nutritional composition in Douchi was analysed. Results showed that FAAs and relative content of isoflavone aglycones in verification Douchi samples were generally higher than those in commercial Douchi samples. Our study indicated that fungi, yeasts, Bacillus and lactic acid bacteria were the key players in Douchi fermentation, and with identified probiotic micro-organisms participating in fermentation, a higher quality Douchi product was produced. This is the first report to analyse and confirm the key micro-organisms during Douchi fermentation by statistical analysis. This work proves fermentation micro-organisms to be the key influencing factor of Douchi quality, and demonstrates the feasibility of fermenting Douchi using identified starter micro-organisms. © 2015 The Society for Applied Microbiology.

  11. Targeted On-Demand Team Performance App Development

    DTIC Science & Technology

    2016-10-01

    from three sites; 6) Preliminary analysis indicates a larger-than-estimated effect size and the study is sufficiently powered for generalizable outcomes... statistical analyses, and examine any resulting qualitative data for trends or connections to statistical outcomes... What opportunities for

  12. Data Analysis and Graphing in an Introductory Physics Laboratory: Spreadsheet versus Statistics Suite

    ERIC Educational Resources Information Center

    Peterlin, Primoz

    2010-01-01

    Two methods of data analysis are compared: spreadsheet software and a statistics software suite. Their use is compared analysing data collected in three selected experiments taken from an introductory physics laboratory, which include a linear dependence, a nonlinear dependence and a histogram. The merits of each method are compared. (Contains 7…

  13. Validating Future Force Performance Measures (Army Class): Concluding Analyses

    DTIC Science & Technology

    2016-06-01

    Table 3.10. Descriptive Statistics and Intercorrelations for LV Final Predictor Factor Scores... Table 4.7. Descriptive Statistics for Analysis Criteria... Soldier attrition and performance: Dependability (Non-Delinquency), Adjustment, Physical Conditioning, Leadership, Work Orientation, and Agreeableness

  14. ParallABEL: an R library for generalized parallelization of genome-wide association studies.

    PubMed

    Sangket, Unitsa; Mahasirimongkol, Surakameth; Chantratita, Wasun; Tandayya, Pichaya; Aulchenko, Yurii S

    2010-04-29

    Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. It is arduous acquiring the necessary programming skills to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files. Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics. The input data of this group includes the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample. The input data of this group includes individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses. The input data of this group includes pairs of individuals. The final group concerns pair-wise statistics derived for pairs of SNPs, such as the linkage disequilibrium characterisation. The input data of this group includes pairs of SNPs. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. The ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC), which includes 2,062 individuals genotyped at 545,080 SNPs, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity-by-state matrix was linearly reduced from approximately eight hours to one hour when ParallABEL employed eight processors. Executing genome-wide association analysis using the ParallABEL library on a computer cluster is an effective way to boost performance, and simplify the parallelization of GWA studies. ParallABEL is a user-friendly parallelization of GenABEL.
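
    The first group of computations (per-SNP association statistics) parallelises naturally because each SNP can be tested independently. The sketch below shows the same idea with Python's multiprocessing on simulated genotypes; it is only an analogy to what ParallABEL does for GenABEL with Rmpi, not the library itself.

        # Per-SNP trend tests distributed over worker processes.
        import numpy as np
        from multiprocessing import Pool
        from scipy import stats

        rng = np.random.default_rng(0)
        n_ind, n_snp = 500, 1000
        genotypes = rng.integers(0, 3, size=(n_snp, n_ind))   # 0/1/2 allele counts
        phenotype = rng.normal(size=n_ind)

        def snp_test(g):
            """Regress the phenotype on allele count and return the p-value."""
            slope, intercept, r, p, se = stats.linregress(g, phenotype)
            return p

        if __name__ == "__main__":
            with Pool(4) as pool:                              # 4 worker processes
                pvalues = pool.map(snp_test, genotypes)
            print(min(pvalues))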

  15. Trends in selected streamflow statistics at 19 long-term streamflow-gaging stations indicative of outflows from Texas to Arkansas, Louisiana, Galveston Bay, and the Gulf of Mexico, 1922-2009

    USGS Publications Warehouse

    Barbie, Dana L.; Wehmeyer, Loren L.

    2012-01-01

    Trends in selected streamflow statistics during 1922-2009 were evaluated at 19 long-term streamflow-gaging stations considered indicative of outflows from Texas to Arkansas, Louisiana, Galveston Bay, and the Gulf of Mexico. The U.S. Geological Survey, in cooperation with the Texas Water Development Board, evaluated streamflow data from streamflow-gaging stations with more than 50 years of record that were active as of 2009. The outflows into Arkansas and Louisiana were represented by 3 streamflow-gaging stations, and outflows into the Gulf of Mexico, including Galveston Bay, were represented by 16 streamflow-gaging stations. Monotonic trend analyses were done using the following three streamflow statistics generated from daily mean values of streamflow: (1) annual mean daily discharge, (2) annual maximum daily discharge, and (3) annual minimum daily discharge. The trend analyses were based on the nonparametric Kendall's Tau test, which is useful for the detection of monotonic upward or downward trends with time. A total of 69 trend analyses by Kendall's Tau were computed - 19 periods of streamflow multiplied by the 3 streamflow statistics plus 12 additional trend analyses because the periods of record for 2 streamflow-gaging stations were divided into periods representing pre- and post-reservoir impoundment. Unless otherwise described, each trend analysis used the entire period of record for each streamflow-gaging station. The monotonic trend analysis detected 11 statistically significant downward trends, 37 instances of no trend, and 21 statistically significant upward trends. One general region studied, which seemingly has relatively more upward trends for many of the streamflow statistics analyzed, includes the rivers and associated creeks and bayous to Galveston Bay in the Houston metropolitan area. Lastly, the most western river basins considered (the Nueces and Rio Grande) had statistically significant downward trends for many of the streamflow statistics analyzed.
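
    The monotonic trend test used in the report can be illustrated with a minimal example: Kendall's tau between water year and an annual streamflow statistic (the series below is simulated with a downward trend, not USGS data).

        # Kendall's tau trend test on a simulated annual mean daily discharge series.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(42)
        years = np.arange(1922, 2010)
        annual_mean_discharge = 1000 - 2.5 * (years - years[0]) + rng.normal(0, 80, len(years))

        tau, p_value = stats.kendalltau(years, annual_mean_discharge)
        trend = "downward" if tau < 0 else "upward"
        print(f"Kendall's tau = {tau:.2f}, p = {p_value:.4f} ({trend} trend)")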

  16. Methodological Standards for Meta-Analyses and Qualitative Systematic Reviews of Cardiac Prevention and Treatment Studies: A Scientific Statement From the American Heart Association.

    PubMed

    Rao, Goutham; Lopez-Jimenez, Francisco; Boyd, Jack; D'Amico, Frank; Durant, Nefertiti H; Hlatky, Mark A; Howard, George; Kirley, Katherine; Masi, Christopher; Powell-Wiley, Tiffany M; Solomonides, Anthony E; West, Colin P; Wessel, Jennifer

    2017-09-05

    Meta-analyses are becoming increasingly popular, especially in the fields of cardiovascular disease prevention and treatment. They are often considered to be a reliable source of evidence for making healthcare decisions. Unfortunately, problems among meta-analyses such as the misapplication and misinterpretation of statistical methods and tests are long-standing and widespread. The purposes of this statement are to review key steps in the development of a meta-analysis and to provide recommendations that will be useful for carrying out meta-analyses and for readers and journal editors, who must interpret the findings and gauge methodological quality. To make the statement practical and accessible, detailed descriptions of statistical methods have been omitted. Based on a survey of cardiovascular meta-analyses, published literature on methodology, expert consultation, and consensus among the writing group, key recommendations are provided. Recommendations reinforce several current practices, including protocol registration; comprehensive search strategies; methods for data extraction and abstraction; methods for identifying, measuring, and dealing with heterogeneity; and statistical methods for pooling results. Other practices should be discontinued, including the use of levels of evidence and evidence hierarchies to gauge the value and impact of different study designs (including meta-analyses) and the use of structured tools to assess the quality of studies to be included in a meta-analysis. We also recommend choosing a pooling model for conventional meta-analyses (fixed effect or random effects) on the basis of clinical and methodological similarities among studies to be included, rather than the results of a test for statistical heterogeneity. © 2017 American Heart Association, Inc.

  17. Measuring the statistical validity of summary meta‐analysis and meta‐regression results for use in clinical practice

    PubMed Central

    Riley, Richard D.

    2017-01-01

    An important question for clinicians appraising a meta‐analysis is: are the findings likely to be valid in their own practice—does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity—where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple (‘leave‐one‐out’) cross‐validation technique, we demonstrate how we may test meta‐analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta‐analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta‐analysis and a tailored meta‐regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within‐study variance, between‐study variance, study sample size, and the number of studies in the meta‐analysis. Finally, we apply Vn to two published meta‐analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta‐analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28620945

  18. Meta-analysis of magnitudes, differences and variation in evolutionary parameters.

    PubMed

    Morrissey, M B

    2016-10-01

    Meta-analysis is increasingly used to synthesize major patterns in the large literatures within ecology and evolution. Meta-analytic methods that do not account for the process of observing data, which we may refer to as 'informal meta-analyses', may have undesirable properties. In some cases, informal meta-analyses may produce results that are unbiased, but do not necessarily make the best possible use of available data. In other cases, unbiased statistical noise in individual reports in the literature can potentially be converted into severe systematic biases in informal meta-analyses. I first present a general description of how failure to account for noise in individual inferences should be expected to lead to biases in some kinds of meta-analysis. In particular, informal meta-analyses of quantities that reflect the dispersion of parameters in nature, for example, the mean absolute value of a quantity, are likely to be generally highly misleading. I then re-analyse three previously published informal meta-analyses, where key inferences were of aspects of the dispersion of values in nature, for example, the mean absolute value of selection gradients. Major biological conclusions in each original informal meta-analysis closely match those that could arise as artefacts due to statistical noise. I present alternative mixed-model-based analyses that are specifically tailored to each situation, but where all analyses may be implemented with widely available open-source software. In each example meta-re-analysis, major conclusions change substantially. © 2016 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.
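
    The central problem can be shown with a short simulation: even when every true parameter is zero, the mean absolute value of noisy estimates is well above zero, so an informal meta-analysis of absolute values overstates the typical magnitude. The numbers are purely illustrative.

        # Statistical noise alone inflates the mean absolute value of estimates.
        import numpy as np

        rng = np.random.default_rng(7)
        true_values = np.zeros(200)                          # e.g. 200 parameters, all truly 0
        estimates = true_values + rng.normal(0, 0.2, 200)    # sampling noise with SE = 0.2

        print(np.mean(np.abs(true_values)))    # 0.0 -- the real mean magnitude
        print(np.mean(np.abs(estimates)))      # about 0.16 -- inflated purely by noise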

  19. Implementation of Head Start Planned Variation: 1970-1971. Part II.

    ERIC Educational Resources Information Center

    Lukas, Carol Van Deusen; Wohlleb, Cynthia

    This volume of appendices is Part II of a study of program implementation in 12 models of Head Start Planned Variation. It presents details of the data analysis, copies of data collection instruments, and additional analyses and statistics. The appendices are: (A) Analysis of Variance Designs, (B) Copies of Instruments, (C) Additional Analyses,…

  20. Early Millennials: The Sophomore Class of 2002 a Decade Later. Statistical Analysis Report. NCES 2017-437

    ERIC Educational Resources Information Center

    Chen, Xianglei; Lauff, Erich; Arbeit, Caren A.; Henke, Robin; Skomsvold, Paul; Hufford, Justine

    2017-01-01

    This Statistical Analysis Report tracks a cohort of 2002 high school sophomores over 10 years, examining the extent to which cohort members had reached such life course milestones as finishing school, starting a job, leaving home, getting married, and having children. The analyses in this report are based on data from the Education Longitudinal…

  1. A d-statistic for single-case designs that is equivalent to the usual between-groups d-statistic.

    PubMed

    Shadish, William R; Hedges, Larry V; Pustejovsky, James E; Boyajian, Jonathan G; Sullivan, Kristynn J; Andrade, Alma; Barrientos, Jeannette L

    2014-01-01

    We describe a standardised mean difference statistic (d) for single-case designs that is equivalent to the usual d in between-groups experiments. We show how it can be used to summarise treatment effects over cases within a study, to do power analyses in planning new studies and grant proposals, and to meta-analyse effects across studies of the same question. We discuss limitations of this d-statistic, and possible remedies to them. Even so, this d-statistic is better founded statistically than other effect size measures for single-case design, and unlike many general linear model approaches such as multilevel modelling or generalised additive models, it produces a standardised effect size that can be integrated over studies with different outcome measures. SPSS macros for both effect size computation and power analysis are available.
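
    The basic building block of such a d, a between-phase mean difference scaled by a pooled within-phase standard deviation, can be sketched as follows on hypothetical data; the published d additionally adjusts for autocorrelation and between-case variance, so this is only the unadjusted core quantity.

        # Unadjusted standardised mean difference between baseline and treatment phases.
        import numpy as np

        baseline = np.array([3, 4, 3, 5, 4], dtype=float)      # phase A observations
        treatment = np.array([7, 8, 6, 9, 8], dtype=float)     # phase B observations

        pooled_sd = np.sqrt(((len(baseline) - 1) * baseline.var(ddof=1) +
                             (len(treatment) - 1) * treatment.var(ddof=1)) /
                            (len(baseline) + len(treatment) - 2))
        d = (treatment.mean() - baseline.mean()) / pooled_sd
        print(round(d, 2))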

  2. A Monte Carlo Analysis of the Thrust Imbalance for the RSRMV Booster During Both the Ignition Transient and Steady State Operation

    NASA Technical Reports Server (NTRS)

    Foster, Winfred A., Jr.; Crowder, Winston; Steadman, Todd E.

    2014-01-01

    This paper presents the results of statistical analyses performed to predict the thrust imbalance between two solid rocket motor boosters to be used on the Space Launch System (SLS) vehicle. Two legacy internal ballistics codes developed for the Space Shuttle program were coupled with a Monte Carlo analysis code to determine a thrust imbalance envelope for the SLS vehicle based on the performance of 1000 motor pairs. Thirty three variables which could impact the performance of the motors during the ignition transient and thirty eight variables which could impact the performance of the motors during steady state operation of the motor were identified and treated as statistical variables for the analyses. The effects of motor to motor variation as well as variations between motors of a single pair were included in the analyses. The statistical variations of the variables were defined based on data provided by NASA's Marshall Space Flight Center for the upgraded five segment booster and from the Space Shuttle booster when appropriate. The results obtained for the statistical envelope are compared with the design specification thrust imbalance limits for the SLS launch vehicle.

  3. A Monte Carlo Analysis of the Thrust Imbalance for the Space Launch System Booster During Both the Ignition Transient and Steady State Operation

    NASA Technical Reports Server (NTRS)

    Foster, Winfred A., Jr.; Crowder, Winston; Steadman, Todd E.

    2014-01-01

    This paper presents the results of statistical analyses performed to predict the thrust imbalance between two solid rocket motor boosters to be used on the Space Launch System (SLS) vehicle. Two legacy internal ballistics codes developed for the Space Shuttle program were coupled with a Monte Carlo analysis code to determine a thrust imbalance envelope for the SLS vehicle based on the performance of 1000 motor pairs. Thirty-three variables that could impact the performance of the motors during the ignition transient and thirty-eight variables that could impact their performance during steady-state operation were identified and treated as statistical variables for the analyses. The effects of motor-to-motor variation as well as variations between motors of a single pair were included in the analyses. The statistical variations of the variables were defined based on data provided by NASA's Marshall Space Flight Center for the upgraded five-segment booster and from the Space Shuttle booster when appropriate. The results obtained for the statistical envelope are compared with the design specification thrust imbalance limits for the SLS launch vehicle.

  4. Statistical Analyses of Scatterplots to Identify Important Factors in Large-Scale Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kleijnen, J.P.C.; Helton, J.C.

    1999-04-01

    The robustness of procedures for identifying patterns in scatterplots generated in Monte Carlo sensitivity analyses is investigated. These procedures are based on attempts to detect increasingly complex patterns in the scatterplots under consideration and involve the identification of (1) linear relationships with correlation coefficients, (2) monotonic relationships with rank correlation coefficients, (3) trends in central tendency as defined by means, medians and the Kruskal-Wallis statistic, (4) trends in variability as defined by variances and interquartile ranges, and (5) deviations from randomness as defined by the chi-square statistic. The following two topics related to the robustness of these procedures are considered for a sequence of example analyses with a large model for two-phase fluid flow: the presence of Type I and Type II errors, and the stability of results obtained with independent Latin hypercube samples. Observations from analysis include: (1) Type I errors are unavoidable, (2) Type II errors can occur when inappropriate analysis procedures are used, (3) physical explanations should always be sought for why statistical procedures identify variables as being important, and (4) the identification of important variables tends to be stable for independent Latin hypercube samples.
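
    A minimal sketch of the first three detection procedures (linear correlation, rank correlation, and a trend in central tendency across bins of an input variable), applied to an invented input/output sample; the data and bin choices are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 300)                    # sampled input variable
y = np.sin(3 * x) + rng.normal(0, 0.3, 300)   # model output (non-monotone example)

r, p_r = stats.pearsonr(x, y)        # (1) linear relationship
rho, p_rho = stats.spearmanr(x, y)   # (2) monotonic relationship
bins = np.quantile(x, [0.2, 0.4, 0.6, 0.8])
groups = [y[np.digitize(x, bins) == k] for k in range(5)]
h, p_h = stats.kruskal(*groups)      # (3) trend in central tendency across x-bins

print(f"Pearson r={r:.2f} (p={p_r:.3g}); Spearman rho={rho:.2f} (p={p_rho:.3g}); "
      f"Kruskal-Wallis H={h:.1f} (p={p_h:.3g})")
```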

  5. Bias, precision and statistical power of analysis of covariance in the analysis of randomized trials with baseline imbalance: a simulation study.

    PubMed

    Egbewale, Bolaji E; Lewis, Martyn; Sim, Julius

    2014-04-09

    Analysis of variance (ANOVA), change-score analysis (CSA) and analysis of covariance (ANCOVA) respond differently to baseline imbalance in randomized controlled trials. However, no empirical studies appear to have quantified the differential bias and precision of estimates derived from these methods of analysis, and their relative statistical power, in relation to combinations of levels of key trial characteristics. This simulation study therefore examined the relative bias, precision and statistical power of these three analyses using simulated trial data. 126 hypothetical trial scenarios were evaluated (126,000 datasets), each with continuous data simulated by using a combination of levels of: treatment effect; pretest-posttest correlation; direction and magnitude of baseline imbalance. The bias, precision and power of each method of analysis were calculated for each scenario. Compared to the unbiased estimates produced by ANCOVA, both ANOVA and CSA are subject to bias, in relation to pretest-posttest correlation and the direction of baseline imbalance. Additionally, ANOVA and CSA are less precise than ANCOVA, especially when pretest-posttest correlation ≥ 0.3. When groups are balanced at baseline, ANCOVA is at least as powerful as the other analyses. Apparently greater power of ANOVA and CSA at certain imbalances is achieved in respect of a biased treatment effect. Across a range of correlations between pre- and post-treatment scores and at varying levels and direction of baseline imbalance, ANCOVA remains the optimum statistical method for the analysis of continuous outcomes in RCTs, in terms of bias, precision and statistical power.
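
    The sketch below simulates a single hypothetical scenario (its correlation, baseline imbalance and effect-size values are illustrative only, not those of the study) and contrasts the treatment-effect estimates produced by the three analyses.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n, rho, effect = 100, 0.5, 0.4                 # one hypothetical scenario
group = np.repeat([0, 1], n)
pre = rng.normal(0, 1, 2 * n) + 0.3 * group    # induce baseline imbalance
post = rho * pre + np.sqrt(1 - rho**2) * rng.normal(0, 1, 2 * n) + effect * group
df = pd.DataFrame({"group": group, "pre": pre, "post": post, "change": post - pre})

anova  = smf.ols("post ~ group", df).fit().params["group"]         # ignores baseline
csa    = smf.ols("change ~ group", df).fit().params["group"]       # change-score analysis
ancova = smf.ols("post ~ group + pre", df).fit().params["group"]   # adjusts for baseline
print(f"ANOVA={anova:.3f}  CSA={csa:.3f}  ANCOVA={ancova:.3f}  (true effect={effect})")
```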

  6. Bias, precision and statistical power of analysis of covariance in the analysis of randomized trials with baseline imbalance: a simulation study

    PubMed Central

    2014-01-01

    Background Analysis of variance (ANOVA), change-score analysis (CSA) and analysis of covariance (ANCOVA) respond differently to baseline imbalance in randomized controlled trials. However, no empirical studies appear to have quantified the differential bias and precision of estimates derived from these methods of analysis, and their relative statistical power, in relation to combinations of levels of key trial characteristics. This simulation study therefore examined the relative bias, precision and statistical power of these three analyses using simulated trial data. Methods 126 hypothetical trial scenarios were evaluated (126 000 datasets), each with continuous data simulated by using a combination of levels of: treatment effect; pretest-posttest correlation; direction and magnitude of baseline imbalance. The bias, precision and power of each method of analysis were calculated for each scenario. Results Compared to the unbiased estimates produced by ANCOVA, both ANOVA and CSA are subject to bias, in relation to pretest-posttest correlation and the direction of baseline imbalance. Additionally, ANOVA and CSA are less precise than ANCOVA, especially when pretest-posttest correlation ≥ 0.3. When groups are balanced at baseline, ANCOVA is at least as powerful as the other analyses. Apparently greater power of ANOVA and CSA at certain imbalances is achieved in respect of a biased treatment effect. Conclusions Across a range of correlations between pre- and post-treatment scores and at varying levels and direction of baseline imbalance, ANCOVA remains the optimum statistical method for the analysis of continuous outcomes in RCTs, in terms of bias, precision and statistical power. PMID:24712304

  7. Scripts for TRUMP data analyses. Part II (HLA-related data): statistical analyses specific for hematopoietic stem cell transplantation.

    PubMed

    Kanda, Junya

    2016-01-01

    The Transplant Registry Unified Management Program (TRUMP) made it possible for members of the Japan Society for Hematopoietic Cell Transplantation (JSHCT) to analyze large sets of national registry data on autologous and allogeneic hematopoietic stem cell transplantation. However, as the processes used to collect transplantation information are complex and differed over time, the background of these processes should be understood when using TRUMP data. Previously, information on the HLA locus of patients and donors had been collected using a questionnaire-based free-description method, resulting in some input errors. To correct minor but significant errors and provide accurate HLA matching data, the use of a Stata or EZR/R script offered by the JSHCT is strongly recommended when analyzing HLA data in the TRUMP dataset. The HLA mismatch direction, mismatch counting method, and different impacts of HLA mismatches by stem cell source are other important factors in the analysis of HLA data. Additionally, researchers should understand the statistical analyses specific for hematopoietic stem cell transplantation, such as competing risk, landmark analysis, and time-dependent analysis, to correctly analyze transplant data. The data center of the JSHCT can be contacted if statistical assistance is required.

  8. Crop identification technology assessment for remote sensing. (CITARS) Volume 9: Statistical analysis of results

    NASA Technical Reports Server (NTRS)

    Davis, B. J.; Feiveson, A. H.

    1975-01-01

    Results are presented of CITARS data processing in raw form. Tables of descriptive statistics are given along with descriptions and results of inferential analyses. The inferential results are organized by questions which CITARS was designed to answer.

  9. Data Processing System (DPS) software with experimental design, statistical analysis and data mining developed for use in entomological research.

    PubMed

    Tang, Qi-Yi; Zhang, Chuan-Xi

    2013-04-01

    A comprehensive but simple-to-use software package called DPS (Data Processing System) has been developed to execute a range of standard numerical analyses and operations used in experimental design, statistics and data mining. This program runs on standard Windows computers. Many of the functions are specific to entomological and other biological research and are not found in standard statistical software. This paper presents applications of DPS to experimental design, statistical analysis and data mining in entomology. © 2012 The Authors Insect Science © 2012 Institute of Zoology, Chinese Academy of Sciences.

  10. The Thurgood Marshall School of Law Empirical Findings: A Report of the Statistical Analysis of the July 2010 TMSL Texas Bar Results

    ERIC Educational Resources Information Center

    Kadhi, Tau; Holley, D.

    2010-01-01

    The following report gives the statistical findings of the July 2010 TMSL Bar results. Procedures: Data is pre-existing and was given to the Evaluator by email from the Registrar and Dean. Statistical analyses were run using SPSS 17 to address the following research questions: 1. What are the statistical descriptors of the July 2010 overall TMSL…

  11. Mediation analysis in nursing research: a methodological review.

    PubMed

    Liu, Jianghong; Ulrich, Connie

    2016-12-01

    Mediation statistical models help clarify the relationship between independent predictor variables and dependent outcomes of interest by assessing the impact of third variables. This type of statistical analysis is applicable for many clinical nursing research questions, yet its use within nursing remains low. Indeed, mediational analyses may help nurse researchers develop more effective and accurate prevention and treatment programs as well as help bridge the gap between scientific knowledge and clinical practice. In addition, this statistical approach allows nurse researchers to ask - and answer - more meaningful and nuanced questions that extend beyond merely determining whether an outcome occurs. Therefore, the goal of this paper is to provide a brief tutorial on the use of mediational analyses in clinical nursing research by briefly introducing the technique and, through selected empirical examples from the nursing literature, demonstrating its applicability in advancing nursing science.
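
    As a minimal illustration of the product-of-coefficients approach with a Sobel standard error (one of several ways to test mediation, and not necessarily the procedure used in the cited examples), consider the following sketch on invented data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)                       # hypothetical predictor (e.g., intervention dose)
m = 0.5 * x + rng.normal(size=n)             # hypothesized mediator
y = 0.4 * m + 0.1 * x + rng.normal(size=n)   # outcome
df = pd.DataFrame({"x": x, "m": m, "y": y})

fit_a = smf.ols("m ~ x", df).fit()           # path a: predictor -> mediator
fit_b = smf.ols("y ~ m + x", df).fit()       # path b: mediator -> outcome, adjusting for x
a, se_a = fit_a.params["x"], fit_a.bse["x"]
b, se_b = fit_b.params["m"], fit_b.bse["m"]

indirect = a * b                                         # mediated (indirect) effect
sobel_se = np.sqrt(b**2 * se_a**2 + a**2 * se_b**2)      # Sobel (delta-method) SE
z = indirect / sobel_se
print(f"indirect effect = {indirect:.3f}, Sobel z = {z:.2f}, "
      f"p = {2 * stats.norm.sf(abs(z)):.3g}")
```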

  12. LSAT Dimensionality Analysis for the December 1991, June 1992, and October 1992 Administrations. Statistical Report. LSAC Research Report Series.

    ERIC Educational Resources Information Center

    Douglas, Jeff; Kim, Hae-Rim; Roussos, Louis; Stout, William; Zhang, Jinming

    An extensive nonparametric dimensionality analysis of latent structure was conducted on three forms of the Law School Admission Test (LSAT) (December 1991, June 1992, and October 1992) using the DIMTEST model in confirmatory analyses and using DIMTEST, FAC, DETECT, HCA, PROX, and a genetic algorithm in exploratory analyses. Results indicate that…

  13. Spatial variation of volcanic rock geochemistry in the Virunga Volcanic Province: Statistical analysis of an integrated database

    NASA Astrophysics Data System (ADS)

    Barette, Florian; Poppe, Sam; Smets, Benoît; Benbakkar, Mhammed; Kervyn, Matthieu

    2017-10-01

    We present an integrated, spatially-explicit database of existing geochemical major-element analyses available from (post-) colonial scientific reports, PhD Theses and international publications for the Virunga Volcanic Province, located in the western branch of the East African Rift System. This volcanic province is characterised by alkaline volcanism, including silica-undersaturated, alkaline and potassic lavas. The database contains a total of 908 geochemical analyses of eruptive rocks for the entire volcanic province with a localisation for most samples. A preliminary analysis of the overall consistency of the database, using statistical techniques on sets of geochemical analyses with contrasted analytical methods or dates, demonstrates that the database is consistent. We applied a principal component analysis and cluster analysis on whole-rock major element compositions included in the database to study the spatial variation of the chemical composition of eruptive products in the Virunga Volcanic Province. These statistical analyses identify spatially distributed clusters of eruptive products. The known geochemical contrasts are highlighted by the spatial analysis, such as the unique geochemical signature of Nyiragongo lavas compared to other Virunga lavas, the geochemical heterogeneity of the Bulengo area, and the trachyte flows of Karisimbi volcano. Most importantly, we identified separate clusters of eruptive products which originate from primitive magmatic sources. These lavas of primitive composition are preferentially located along NE-SW inherited rift structures, often at distance from the central Virunga volcanoes. Our results illustrate the relevance of a spatial analysis on integrated geochemical data for a volcanic province, as a complement to classical petrological investigations. This approach indeed helps to characterise geochemical variations within a complex of magmatic systems and to identify specific petrologic and geochemical investigations that should be tackled within a study area.
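
    A minimal sketch of the statistical core (standardization, principal component analysis and clustering) on an invented major-element table; the real analysis would use the 908 database entries and map cluster labels back to sample locations.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Hypothetical whole-rock major-element table (wt%); in the real database the
# columns would be SiO2, TiO2, Al2O3, ..., K2O for each located sample.
rng = np.random.default_rng(7)
X = rng.normal(loc=[45, 2, 14, 10, 8, 4], scale=[3, 0.5, 1, 2, 2, 1], size=(200, 6))

Z = StandardScaler().fit_transform(X)           # standardize each oxide
scores = PCA(n_components=2).fit_transform(Z)   # principal component scores
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(scores)

# 'labels' could then be mapped back to sample coordinates to inspect whether
# clusters group spatially (e.g., along inherited rift structures).
print(np.bincount(labels))
```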

  14. Topographic ERP analyses: a step-by-step tutorial review.

    PubMed

    Murray, Micah M; Brunet, Denis; Michel, Christoph M

    2008-06-01

    In this tutorial review, we detail both the rationale for as well as the implementation of a set of analyses of surface-recorded event-related potentials (ERPs) that uses the reference-free spatial (i.e. topographic) information available from high-density electrode montages to render statistical information concerning modulations in response strength, latency, and topography both between and within experimental conditions. In these and other ways these topographic analysis methods allow the experimenter to glean additional information and neurophysiologic interpretability beyond what is available from canonical waveform analyses. In this tutorial we present the example of somatosensory evoked potentials (SEPs) in response to stimulation of each hand to illustrate these points. For each step of these analyses, we provide the reader with both a conceptual and mathematical description of how the analysis is carried out, what it yields, and how to interpret its statistical outcome. We show that these topographic analysis methods are intuitive and easy-to-use approaches that can remove much of the guesswork often confronting ERP researchers and also assist in identifying the information contained within high-density ERP datasets.

  15. Dark matter constraints from a joint analysis of dwarf Spheroidal galaxy observations with VERITAS

    DOE PAGES

    Archambault, S.; Archer, A.; Benbow, W.; ...

    2017-04-05

    We present constraints on the annihilation cross section of weakly interacting massive particle (WIMP) dark matter based on the joint statistical analysis of four dwarf galaxies with VERITAS. These results are derived from an optimized photon weighting statistical technique that improves on standard imaging atmospheric Cherenkov telescope (IACT) analyses by utilizing the spectral and spatial properties of individual photon events.

  16. mvMapper: statistical and geographical data exploration and visualization of multivariate analysis of population structure

    USDA-ARS?s Scientific Manuscript database

    Characterizing population genetic structure across geographic space is a fundamental challenge in population genetics. Multivariate statistical analyses are powerful tools for summarizing genetic variability, but geographic information and accompanying metadata is not always easily integrated into t...

  17. P-MartCancer–Interactive Online Software to Enable Analysis of Shotgun Cancer Proteomic Datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Webb-Robertson, Bobbie-Jo M.; Bramer, Lisa M.; Jensen, Jeffrey L.

    P-MartCancer is a new interactive web-based software environment that enables biomedical and biological scientists to perform in-depth analyses of global proteomics data without requiring direct interaction with the data or with statistical software. P-MartCancer offers a series of statistical modules associated with quality assessment, peptide and protein statistics, protein quantification and exploratory data analyses driven by the user via customized workflows and interactive visualization. Currently, P-MartCancer offers access to multiple cancer proteomic datasets generated through the Clinical Proteomics Tumor Analysis Consortium (CPTAC) at the peptide, gene and protein levels. P-MartCancer is deployed using Azure technologies (http://pmart.labworks.org/cptac.html), the web-service is alternatively available via Docker Hub (https://hub.docker.com/r/pnnl/pmart-web/) and many statistical functions can be utilized directly from an R package available on GitHub (https://github.com/pmartR).

  18. Emotional and cognitive effects of peer tutoring among secondary school mathematics students

    NASA Astrophysics Data System (ADS)

    Alegre Ansuategui, Francisco José; Moliner Miravet, Lidón

    2017-11-01

    This paper describes an experience of same-age peer tutoring conducted with 19 eighth-grade mathematics students in a secondary school in Castellon de la Plana (Spain). Three constructs were analysed before and after launching the program: academic performance, mathematics self-concept and attitude of solidarity. Students' perceptions of the method were also analysed. The quantitative data was gathered by means of a mathematics self-concept questionnaire, an attitude of solidarity questionnaire and the students' numerical ratings. A statistical analysis was performed using Student's t-test. The qualitative information was gathered by means of discussion groups and a field diary. This information was analysed using descriptive analysis and by categorizing the information. Results show statistically significant improvements in all the variables and the positive assessment of the experience and the interactions that took place between the students.

  19. Statistical analysis of the determinations of the Sun's Galactocentric distance

    NASA Astrophysics Data System (ADS)

    Malkin, Zinovy

    2013-02-01

    Based on several tens of R0 measurements made during the past two decades, several studies have been performed to derive the best estimate of R0. Some used just simple averaging to derive a result, whereas others provided comprehensive analyses of possible errors in published results. In either case, detailed statistical analyses of data used were not performed. However, a computation of the best estimates of the Galactic rotation constants is not only an astronomical but also a metrological task. Here we perform an analysis of 53 R0 measurements (published in the past 20 years) to assess the consistency of the data. Our analysis shows that they are internally consistent. It is also shown that any trend in the R0 estimates from the last 20 years is statistically negligible, which renders the presence of a bandwagon effect doubtful. On the other hand, the formal errors in the published R0 estimates improve significantly with time.

  20. The Thurgood Marshall School of Law Empirical Findings: A Report of the Statistical Analysis of the February 2010 TMSL Texas Bar Results

    ERIC Educational Resources Information Center

    Kadhi, T.; Holley, D.; Rudley, D.; Garrison, P.; Green, T.

    2010-01-01

    The following report gives the statistical findings of the 2010 Thurgood Marshall School of Law (TMSL) Texas Bar results. This data was pre-existing and was given to the Evaluator by email from the Dean. Then, in-depth statistical analyses were run using SPSS 17 to address the following questions: 1. What are the statistical descriptors of the…

  1. Statistical analysis of solid waste composition data: Arithmetic mean, standard deviation and correlation coefficients.

    PubMed

    Edjabou, Maklawe Essonanawe; Martín-Fernández, Josep Antoni; Scheutz, Charlotte; Astrup, Thomas Fruergaard

    2017-11-01

    Data for fractional solid waste composition provide relative magnitudes of individual waste fractions, the percentages of which always sum to 100, thereby connecting them intrinsically. Due to this sum constraint, waste composition data represent closed data, and their interpretation and analysis require statistical methods other than classical statistics, which are suitable only for non-constrained data such as absolute values. However, the closed characteristics of waste composition data are often ignored when analysed. The results of this study showed, for example, that unavoidable animal-derived food waste amounted to 2.21±3.12% with a confidence interval of (-4.03; 8.45), which highlights the problem of biased negative proportions. A Pearson's correlation test, applied to waste fraction generation (kg mass), indicated a positive correlation between avoidable vegetable food waste and plastic packaging. However, correlation tests applied to waste fraction compositions (percentage values) showed a negative association in this regard, thus demonstrating that statistical analyses applied to compositional waste fraction data, without addressing the closed characteristics of these data, have the potential to generate spurious or misleading results. Therefore, compositional data should be transformed adequately prior to any statistical analysis, such as computing mean, standard deviation and correlation coefficients. Copyright © 2017 Elsevier Ltd. All rights reserved.
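
    One common transformation for closed data is the centred log-ratio (clr); the sketch below applies it to invented composition data before computing a correlation, illustrating the point rather than reproducing the study's analysis.

```python
import numpy as np
from scipy import stats

# Hypothetical waste-composition data: each row is a sample whose fractions
# (e.g., food, plastic, paper) sum to 100%.
rng = np.random.default_rng(5)
raw = rng.dirichlet(alpha=[4, 2, 3], size=50) * 100.0

def clr(comp):
    """Centred log-ratio transform: log(x_i / geometric mean of the row)."""
    logx = np.log(comp)
    return logx - logx.mean(axis=1, keepdims=True)

z = clr(raw)
# Correlations on clr-transformed parts avoid the spurious negative associations
# induced by the constant-sum constraint on raw percentages.
r_raw, _ = stats.pearsonr(raw[:, 0], raw[:, 1])
r_clr, _ = stats.pearsonr(z[:, 0], z[:, 1])
print(f"raw-percentage r = {r_raw:.2f}, clr-transformed r = {r_clr:.2f}")
```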

  2. How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?

    PubMed

    West, Brady T; Sakshaug, Joseph W; Aurelien, Guy Alain S

    2016-01-01

    Secondary analyses of survey data collected from large probability samples of persons or establishments further scientific progress in many fields. The complex design features of these samples improve data collection efficiency, but also require analysts to account for these features when conducting analysis. Unfortunately, many secondary analysts from fields outside of statistics, biostatistics, and survey methodology do not have adequate training in this area, and as a result may apply incorrect statistical methods when analyzing these survey data sets. This in turn could lead to the publication of incorrect inferences based on the survey data that effectively negate the resources dedicated to these surveys. In this article, we build on the results of a preliminary meta-analysis of 100 peer-reviewed journal articles presenting analyses of data from a variety of national health surveys, which suggested that analytic errors may be extremely prevalent in these types of investigations. We first perform a meta-analysis of a stratified random sample of 145 additional research products analyzing survey data from the Scientists and Engineers Statistical Data System (SESTAT), which describes features of the U.S. Science and Engineering workforce, and examine trends in the prevalence of analytic error across the decades used to stratify the sample. We once again find that analytic errors appear to be quite prevalent in these studies. Next, we present several example analyses of real SESTAT data, and demonstrate that a failure to perform these analyses correctly can result in substantially biased estimates with standard errors that do not adequately reflect complex sample design features. Collectively, the results of this investigation suggest that reviewers of this type of research need to pay much closer attention to the analytic methods employed by researchers attempting to publish or present secondary analyses of survey data.

  3. How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?

    PubMed Central

    West, Brady T.; Sakshaug, Joseph W.; Aurelien, Guy Alain S.

    2016-01-01

    Secondary analyses of survey data collected from large probability samples of persons or establishments further scientific progress in many fields. The complex design features of these samples improve data collection efficiency, but also require analysts to account for these features when conducting analysis. Unfortunately, many secondary analysts from fields outside of statistics, biostatistics, and survey methodology do not have adequate training in this area, and as a result may apply incorrect statistical methods when analyzing these survey data sets. This in turn could lead to the publication of incorrect inferences based on the survey data that effectively negate the resources dedicated to these surveys. In this article, we build on the results of a preliminary meta-analysis of 100 peer-reviewed journal articles presenting analyses of data from a variety of national health surveys, which suggested that analytic errors may be extremely prevalent in these types of investigations. We first perform a meta-analysis of a stratified random sample of 145 additional research products analyzing survey data from the Scientists and Engineers Statistical Data System (SESTAT), which describes features of the U.S. Science and Engineering workforce, and examine trends in the prevalence of analytic error across the decades used to stratify the sample. We once again find that analytic errors appear to be quite prevalent in these studies. Next, we present several example analyses of real SESTAT data, and demonstrate that a failure to perform these analyses correctly can result in substantially biased estimates with standard errors that do not adequately reflect complex sample design features. Collectively, the results of this investigation suggest that reviewers of this type of research need to pay much closer attention to the analytic methods employed by researchers attempting to publish or present secondary analyses of survey data. PMID:27355817

  4. Enhanced secondary analysis of survival data: reconstructing the data from published Kaplan-Meier survival curves.

    PubMed

    Guyot, Patricia; Ades, A E; Ouwens, Mario J N M; Welton, Nicky J

    2012-02-01

    The results of randomized controlled trials (RCTs) on time-to-event outcomes are usually reported as the median time to event and the Cox hazard ratio. These do not constitute the sufficient statistics required for meta-analysis or cost-effectiveness analysis, and their use in secondary analyses requires strong assumptions that may not have been adequately tested. In order to enhance the quality of secondary data analyses, we propose a method which derives from the published Kaplan-Meier survival curves a close approximation to the original individual patient time-to-event data from which they were generated. We develop an algorithm that maps from digitised curves back to KM data by finding numerical solutions to the inverted KM equations, using, where available, information on the number of events and numbers at risk. The reproducibility and accuracy of survival probabilities, median survival times and hazard ratios based on reconstructed KM data was assessed by comparing published statistics (survival probabilities, medians and hazard ratios) with statistics based on repeated reconstructions by multiple observers. The validation exercise established there was no material systematic error and that there was a high degree of reproducibility for all statistics. Accuracy was excellent for survival probabilities and medians; for hazard ratios, reasonable accuracy can only be obtained if at least numbers at risk or the total number of events are reported. The algorithm is a reliable tool for meta-analysis and cost-effectiveness analyses of RCTs reporting time-to-event data. It is recommended that all RCTs should report information on numbers at risk and total number of events alongside KM curves.
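
    The sketch below shows only the kernel of the idea, solving the Kaplan-Meier recurrence for the number of events in one digitised interval; the published algorithm additionally allocates censoring between reported numbers at risk and iterates to match totals. The values used are hypothetical.

```python
# Core idea of the inversion for a single Kaplan-Meier interval: given survival
# probabilities read off a digitised curve and the number at risk, the KM
# recurrence S_k = S_{k-1} * (1 - d_k / n_k) can be solved for the events d_k.

def events_in_interval(s_prev: float, s_curr: float, n_at_risk: int) -> int:
    """Estimate the number of events between two digitised survival points."""
    if s_prev <= 0:
        return 0
    d = n_at_risk * (1.0 - s_curr / s_prev)
    return max(0, round(d))

# Hypothetical digitised points: S drops from 0.82 to 0.74 with 120 at risk.
print(events_in_interval(0.82, 0.74, 120))   # roughly 12 events
```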

  5. A randomized, placebo-controlled trial of patient education for acute low back pain (PREVENT Trial): statistical analysis plan.

    PubMed

    Traeger, Adrian C; Skinner, Ian W; Hübscher, Markus; Lee, Hopin; Moseley, G Lorimer; Nicholas, Michael K; Henschke, Nicholas; Refshauge, Kathryn M; Blyth, Fiona M; Main, Chris J; Hush, Julia M; Pearce, Garry; Lo, Serigne; McAuley, James H

    Statistical analysis plans increase the transparency of decisions made in the analysis of clinical trial results. The purpose of this paper is to detail the planned analyses for the PREVENT trial, a randomized, placebo-controlled trial of patient education for acute low back pain. We report the pre-specified principles, methods, and procedures to be adhered to in the main analysis of the PREVENT trial data. The primary outcome analysis will be based on Mixed Models for Repeated Measures (MMRM), which can test treatment effects at specific time points, and the assumptions of this analysis are outlined. We also outline the treatment of secondary outcomes and planned sensitivity analyses. We provide decisions regarding the treatment of missing data, handling of descriptive and process measure data, and blinded review procedures. Making public the pre-specified statistical analysis plan for the PREVENT trial minimizes the potential for bias in the analysis of trial data, and in the interpretation and reporting of trial results. ACTRN12612001180808 (https://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?ACTRN=12612001180808). Copyright © 2017 Associação Brasileira de Pesquisa e Pós-Graduação em Fisioterapia. Publicado por Elsevier Editora Ltda. All rights reserved.

  6. Meta-analysis of randomized clinical trials in the era of individual patient data sharing.

    PubMed

    Kawahara, Takuya; Fukuda, Musashi; Oba, Koji; Sakamoto, Junichi; Buyse, Marc

    2018-06-01

    Individual patient data (IPD) meta-analysis is considered to be a gold standard when the results of several randomized trials are combined. Recent initiatives on sharing IPD from clinical trials offer unprecedented opportunities for using such data in IPD meta-analyses. First, we discuss the evidence generated and the benefits obtained by a long-established prospective IPD meta-analysis in early breast cancer. Next, we discuss a data-sharing system that has been adopted by several pharmaceutical sponsors. We review a number of retrospective IPD meta-analyses that have already been proposed using this data-sharing system. Finally, we discuss the role of data sharing in IPD meta-analysis in the future. Treatment effects can be more reliably estimated in both types of IPD meta-analyses than with summary statistics extracted from published papers. Specifically, with rich covariate information available on each patient, prognostic and predictive factors can be identified or confirmed. Also, when several endpoints are available, surrogate endpoints can be assessed statistically. Although there are difficulties in conducting, analyzing, and interpreting retrospective IPD meta-analysis utilizing the currently available data-sharing systems, data sharing will play an important role in IPD meta-analysis in the future.

  7. The use and misuse of statistical analyses. [in geophysics and space physics

    NASA Technical Reports Server (NTRS)

    Reiff, P. H.

    1983-01-01

    The statistical techniques most often used in space physics include Fourier analysis, linear correlation, auto- and cross-correlation, power spectral density, and superposed epoch analysis. Tests are presented which can evaluate the significance of the results obtained through each of these. Data presented without some form of error analysis are frequently useless, since they offer no way of assessing whether a bump on a spectrum or on a superposed epoch analysis is real or merely a statistical fluctuation. Among many of the published linear correlations, for instance, the uncertainty in the intercept and slope is not given, so that the significance of the fitted parameters cannot be assessed.
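
    For instance, a linear fit should be reported with the uncertainties of its slope and intercept; the sketch below does this with scipy.stats.linregress (the intercept_stderr field is available in recent SciPy releases) on invented data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
x = np.linspace(0, 10, 40)                       # e.g., a solar-wind driver
y = 2.0 * x + 5.0 + rng.normal(0, 3.0, x.size)   # noisy geophysical response

res = stats.linregress(x, y)
# Reporting slope and intercept together with their standard errors lets a
# reader judge whether a claimed linear correlation is significant.
print(f"slope = {res.slope:.2f} +/- {res.stderr:.2f}")
print(f"intercept = {res.intercept:.2f} +/- {res.intercept_stderr:.2f}")
print(f"r = {res.rvalue:.2f}, p = {res.pvalue:.3g}")
```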

  8. Teaching statistics in biology: using inquiry-based learning to strengthen understanding of statistical analysis in biology laboratory courses.

    PubMed

    Metz, Anneke M

    2008-01-01

    There is an increasing need for students in the biological sciences to build a strong foundation in quantitative approaches to data analyses. Although most science, engineering, and math field majors are required to take at least one statistics course, statistical analysis is poorly integrated into undergraduate biology course work, particularly at the lower-division level. Elements of statistics were incorporated into an introductory biology course, including a review of statistics concepts and opportunity for students to perform statistical analysis in a biological context. Learning gains were measured with an 11-item statistics learning survey instrument developed for the course. Students showed a statistically significant 25% (p < 0.005) increase in statistics knowledge after completing introductory biology. Students improved their scores on the survey after completing introductory biology, even if they had previously completed an introductory statistics course (9%, improvement p < 0.005). Students retested 1 yr after completing introductory biology showed no loss of their statistics knowledge as measured by this instrument, suggesting that the use of statistics in biology course work may aid long-term retention of statistics knowledge. No statistically significant differences in learning were detected between male and female students in the study.

  9. Multi-trait analysis of genome-wide association summary statistics using MTAG.

    PubMed

    Turley, Patrick; Walters, Raymond K; Maghzian, Omeed; Okbay, Aysu; Lee, James J; Fontana, Mark Alan; Nguyen-Viet, Tuan Anh; Wedow, Robbee; Zacher, Meghan; Furlotte, Nicholas A; Magnusson, Patrik; Oskarsson, Sven; Johannesson, Magnus; Visscher, Peter M; Laibson, David; Cesarini, David; Neale, Benjamin M; Benjamin, Daniel J

    2018-02-01

    We introduce multi-trait analysis of GWAS (MTAG), a method for joint analysis of summary statistics from genome-wide association studies (GWAS) of different traits, possibly from overlapping samples. We apply MTAG to summary statistics for depressive symptoms (N_eff = 354,862), neuroticism (N = 168,105), and subjective well-being (N = 388,538). As compared to the 32, 9, and 13 genome-wide significant loci identified in the single-trait GWAS (most of which are themselves novel), MTAG increases the number of associated loci to 64, 37, and 49, respectively. Moreover, association statistics from MTAG yield more informative bioinformatics analyses and increase the variance explained by polygenic scores by approximately 25%, matching theoretical expectations.

  10. Australasian Resuscitation In Sepsis Evaluation trial statistical analysis plan.

    PubMed

    Delaney, Anthony; Peake, Sandra L; Bellomo, Rinaldo; Cameron, Peter; Holdgate, Anna; Howe, Belinda; Higgins, Alisa; Presneill, Jeffrey; Webb, Steve

    2013-10-01

    The Australasian Resuscitation In Sepsis Evaluation (ARISE) study is an international, multicentre, randomised, controlled trial designed to evaluate the effectiveness of early goal-directed therapy compared with standard care for patients presenting to the ED with severe sepsis. In keeping with current practice, and taking into consideration aspects of trial design and reporting specific to non-pharmacologic interventions, this document outlines the principles and methods for analysing and reporting the trial results. The document is prepared prior to completion of recruitment into the ARISE study, without knowledge of the results of the interim analysis conducted by the data safety and monitoring committee and prior to completion of the two related international studies. The statistical analysis plan was designed by the ARISE chief investigators, and reviewed and approved by the ARISE steering committee. The data collected by the research team as specified in the study protocol, and detailed in the study case report form were reviewed. Information related to baseline characteristics, characteristics of delivery of the trial interventions, details of resuscitation and other related therapies, and other relevant data are described with appropriate comparisons between groups. The primary, secondary and tertiary outcomes for the study are defined, with description of the planned statistical analyses. A statistical analysis plan was developed, along with a trial profile, mock-up tables and figures. A plan for presenting baseline characteristics, microbiological and antibiotic therapy, details of the interventions, processes of care and concomitant therapies, along with adverse events are described. The primary, secondary and tertiary outcomes are described along with identification of subgroups to be analysed. A statistical analysis plan for the ARISE study has been developed, and is available in the public domain, prior to the completion of recruitment into the study. This will minimise analytic bias and conforms to current best practice in conducting clinical trials. © 2013 Australasian College for Emergency Medicine and Australasian Society for Emergency Medicine.

  11. Borrowing of strength and study weights in multivariate and network meta-analysis.

    PubMed

    Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D

    2017-12-01

    Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of 'borrowing of strength'. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis).

  12. Borrowing of strength and study weights in multivariate and network meta-analysis

    PubMed Central

    Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D

    2016-01-01

    Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of ‘borrowing of strength’. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis). PMID:26546254

  13. Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.

    PubMed

    Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg

    2009-11-01

    G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
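
    G*Power itself is a standalone program; as a rough stand-in, the sketch below approximates the power of a two-sided test of a bivariate correlation using Fisher's z transformation, with illustrative values of r and n.

```python
import numpy as np
from scipy import stats

def power_correlation(r: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test of H0: rho = 0 via Fisher's z."""
    z_effect = np.arctanh(r) * np.sqrt(n - 3)     # noncentrality on the z scale
    z_crit = stats.norm.ppf(1 - alpha / 2)
    return stats.norm.sf(z_crit - z_effect) + stats.norm.cdf(-z_crit - z_effect)

print(f"power = {power_correlation(r=0.3, n=84):.2f}")  # roughly 0.80
```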

  14. ParallABEL: an R library for generalized parallelization of genome-wide association studies

    PubMed Central

    2010-01-01

    Background Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. It is arduous to acquire the necessary programming skills to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files. Results Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics. The input data of this group includes the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample. The input data of this group includes individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses. The input data of this group includes pairs of individuals. The final group concerns pair-wise statistics derived for pairs of SNPs, such as the linkage disequilibrium characterisation. The input data of this group includes pairs of SNPs. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. The ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC), which includes 2,062 individuals genotyped at 545,080 SNPs, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity-by-state matrix was linearly reduced from approximately eight hours to one hour when ParallABEL employed eight processors. Conclusions Executing genome-wide association analysis using the ParallABEL library on a computer cluster is an effective way to boost performance, and simplify the parallelization of GWA studies. ParallABEL is a user-friendly parallelization of GenABEL. PMID:20429914

  15. Anticoagulant vs. antiplatelet therapy in patients with cryptogenic stroke and patent foramen ovale: an individual participant data meta-analysis.

    PubMed

    Kent, David M; Dahabreh, Issa J; Ruthazer, Robin; Furlan, Anthony J; Weimar, Christian; Serena, Joaquín; Meier, Bernhard; Mattle, Heinrich P; Di Angelantonio, Emanuele; Paciaroni, Maurizio; Schuchlenz, Herwig; Homma, Shunichi; Lutz, Jennifer S; Thaler, David E

    2015-09-14

    The preferred antithrombotic strategy for secondary prevention in patients with cryptogenic stroke (CS) and patent foramen ovale (PFO) is unknown. We pooled multiple observational studies and used propensity score-based methods to estimate the comparative effectiveness of oral anticoagulation (OAC) compared with antiplatelet therapy (APT). Individual participant data from 12 databases of medically treated patients with CS and PFO were analysed with Cox regression models, to estimate database-specific hazard ratios (HRs) comparing OAC with APT, for both the primary composite outcome [recurrent stroke, transient ischaemic attack (TIA), or death] and stroke alone. Propensity scores were applied via inverse probability of treatment weighting to control for confounding. We synthesized database-specific HRs using random-effects meta-analysis models. This analysis included 2385 (OAC = 804 and APT = 1581) patients with 227 composite endpoints (stroke/TIA/death). The difference between OAC and APT was not statistically significant for the primary composite outcome [adjusted HR = 0.76, 95% confidence interval (CI) 0.52-1.12] or for the secondary outcome of stroke alone (adjusted HR = 0.75, 95% CI 0.44-1.27). Results were consistent in analyses applying alternative weighting schemes, with the exception that OAC had a statistically significant beneficial effect on the composite outcome in analyses standardized to the patient population who actually received APT (adjusted HR = 0.64, 95% CI 0.42-0.99). Subgroup analyses did not detect statistically significant heterogeneity of treatment effects across clinically important patient groups. We did not find a statistically significant difference comparing OAC with APT; our results justify randomized trials comparing different antithrombotic approaches in these patients. Published on behalf of the European Society of Cardiology. All rights reserved. © The Author 2015. For permissions please email: journals.permissions@oup.com.
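
    A minimal sketch of the weighting step only (propensity scores and inverse probability of treatment weights) on invented data; it omits the weighted Cox models and the random-effects synthesis across databases, and all variable names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical pooled dataset: treatment (1 = OAC, 0 = APT), two confounders,
# and a binary composite outcome.
rng = np.random.default_rng(8)
n = 2385
age = rng.normal(50, 12, n)
shunt = rng.binomial(1, 0.4, n)
treat = rng.binomial(1, 1 / (1 + np.exp(-(0.02 * (age - 50) + 0.5 * shunt - 0.7))))
outcome = rng.binomial(1, 0.10 + 0.02 * shunt)

X = np.column_stack([age, shunt])
ps = LogisticRegression().fit(X, treat).predict_proba(X)[:, 1]   # propensity scores
w = np.where(treat == 1, 1 / ps, 1 / (1 - ps))                   # IPT weights

rate = lambda t: np.average(outcome[treat == t], weights=w[treat == t])
print(f"weighted outcome risk: OAC {rate(1):.3f} vs APT {rate(0):.3f}")
```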

  16. Metamodels for Computer-Based Engineering Design: Survey and Recommendations

    NASA Technical Reports Server (NTRS)

    Simpson, Timothy W.; Peplinski, Jesse; Koch, Patrick N.; Allen, Janet K.

    1997-01-01

    The use of statistical techniques to build approximations of expensive computer analysis codes pervades much of today's engineering design. These statistical approximations, or metamodels, are used to replace the actual expensive computer analyses, facilitating multidisciplinary, multiobjective optimization and concept exploration. In this paper we review several of these techniques including design of experiments, response surface methodology, Taguchi methods, neural networks, inductive learning, and kriging. We survey their existing application in engineering design and then address the dangers of applying traditional statistical techniques to approximate deterministic computer analysis codes. We conclude with recommendations for the appropriate use of statistical approximation techniques in given situations and how common pitfalls can be avoided.
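
    As one concrete example of a metamodel, the sketch below fits a kriging-style Gaussian process surrogate to a handful of runs of a stand-in "expensive" function; the function and the design of experiments are invented.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def expensive_code(x):
    # Stand-in for an expensive analysis code (e.g., a finite-element run).
    return np.sin(3 * x) + 0.5 * x

x_train = np.linspace(0, 2, 8).reshape(-1, 1)     # a small design of experiments
y_train = expensive_code(x_train).ravel()

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
gp.fit(x_train, y_train)                          # build the kriging metamodel

x_new = np.array([[1.3]])
mean, std = gp.predict(x_new, return_std=True)    # cheap surrogate prediction
print(f"metamodel prediction {mean[0]:.3f} +/- {std[0]:.3f}; "
      f"true value {expensive_code(1.3):.3f}")
```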

  17. Performance of Between-Study Heterogeneity Measures in the Cochrane Library.

    PubMed

    Ma, Xiaoyue; Lin, Lifeng; Qu, Zhiyong; Zhu, Motao; Chu, Haitao

    2018-05-29

    The growth in comparative effectiveness research and evidence-based medicine has increased attention to systematic reviews and meta-analyses. Meta-analysis synthesizes and contrasts evidence from multiple independent studies to improve statistical efficiency and reduce bias. Assessing heterogeneity is critical for performing a meta-analysis and interpreting results. As a widely used heterogeneity measure, the I² statistic quantifies the proportion of total variation across studies that is due to real differences in effect size. The presence of outlying studies can seriously exaggerate the I² statistic. Two alternative heterogeneity measures, the I²r and I²m, have been recently proposed to reduce the impact of outlying studies. To evaluate these measures' performance empirically, we applied them to 20,599 meta-analyses in the Cochrane Library. We found that the I²r and I²m have strong agreement with the I², while they are more robust than the I² when outlying studies appear.
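
    For reference, the conventional I² can be computed from study effects and variances via Cochran's Q, as in the sketch below (the study values are invented); the robust alternatives evaluated in the paper are not reproduced here.

```python
import numpy as np

def i_squared(effects, variances):
    """Compute the I^2 statistic (%) from study effects and variances via Cochran's Q."""
    w = 1.0 / np.asarray(variances)
    theta_fixed = np.sum(w * effects) / np.sum(w)     # fixed-effect pooled estimate
    q = np.sum(w * (effects - theta_fixed) ** 2)      # Cochran's Q
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

effects = np.array([0.20, 0.35, 0.10, 1.20])     # last study is an outlier
variances = np.array([0.04, 0.05, 0.03, 0.05])
print(f"I^2 = {i_squared(effects, variances):.1f}%")
```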

  18. Mediation analysis in nursing research: a methodological review

    PubMed Central

    Liu, Jianghong; Ulrich, Connie

    2017-01-01

    Mediation statistical models help clarify the relationship between independent predictor variables and dependent outcomes of interest by assessing the impact of third variables. This type of statistical analysis is applicable for many clinical nursing research questions, yet its use within nursing remains low. Indeed, mediational analyses may help nurse researchers develop more effective and accurate prevention and treatment programs as well as help bridge the gap between scientific knowledge and clinical practice. In addition, this statistical approach allows nurse researchers to ask – and answer – more meaningful and nuanced questions that extend beyond merely determining whether an outcome occurs. Therefore, the goal of this paper is to provide a brief tutorial on the use of mediational analyses in clinical nursing research by briefly introducing the technique and, through selected empirical examples from the nursing literature, demonstrating its applicability in advancing nursing science. PMID:26176804

  19. Multivariate statistical analysis of low-voltage EDS spectrum images

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anderson, I.M.

    1998-03-01

    Whereas energy-dispersive X-ray spectrometry (EDS) has been used for compositional analysis in the scanning electron microscope for 30 years, the benefits of using low operating voltages for such analyses have been explored only during the last few years. This paper couples low-voltage EDS with two other emerging areas of characterization: spectrum imaging and multivariate statistical analysis. The specimen analyzed for this study was a finished Intel Pentium processor, with the polyimide protective coating stripped off to expose the final active layers.

  20. Periodontal disease and carotid atherosclerosis: A meta-analysis of 17,330 participants.

    PubMed

    Zeng, Xian-Tao; Leng, Wei-Dong; Lam, Yat-Yin; Yan, Bryan P; Wei, Xue-Mei; Weng, Hong; Kwong, Joey S W

    2016-01-15

    The association between periodontal disease and carotid atherosclerosis has been evaluated primarily in single-center studies, and whether periodontal disease is an independent risk factor of carotid atherosclerosis remains uncertain. This meta-analysis aimed to evaluate the association between periodontal disease and carotid atherosclerosis. We searched PubMed and Embase for relevant observational studies up to February 20, 2015. Two authors independently extracted data from included studies, and odds ratios (ORs) with 95% confidence intervals (CIs) were calculated for overall and subgroup meta-analyses. Statistical heterogeneity was assessed by the chi-squared test (P<0.1 for statistical significance) and quantified by the I² statistic. Data analysis was conducted using the Comprehensive Meta-Analysis (CMA) software. Fifteen observational studies involving 17,330 participants were included in the meta-analysis. The overall pooled result showed that periodontal disease was associated with carotid atherosclerosis (OR: 1.27, 95% CI: 1.14-1.41; P<0.001) but statistical heterogeneity was substantial (I²=78.90%). Subgroup analysis of adjusted smoking and diabetes mellitus showed borderline significance (OR: 1.08; 95% CI: 1.00-1.18; P=0.05). Sensitivity and cumulative analyses both indicated that our results were robust. Findings of our meta-analysis indicated that the presence of periodontal disease was associated with carotid atherosclerosis; however, further large-scale, well-conducted clinical studies are needed to explore the precise risk of developing carotid atherosclerosis in patients with periodontal disease. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  1. Differences in game-related statistics of basketball performance by game location for men's winning and losing teams.

    PubMed

    Gómez, Miguel A; Lorenzo, Alberto; Barakat, Rubén; Ortega, Enrique; Palao, José M

    2008-02-01

    The aim of the present study was to identify game-related statistics that differentiate winning and losing teams according to game location. The sample included 306 games of the 2004-2005 regular season of the Spanish professional men's league (ACB League). The independent variables were game location (home or away) and game result (win or loss). The game-related statistics registered were free throws (successful and unsuccessful), 2- and 3-point field goals (successful and unsuccessful), offensive and defensive rebounds, blocks, assists, fouls, steals, and turnovers. Descriptive and inferential analyses were done (one-way analysis of variance and discriminant analysis). The multivariate analysis showed that winning teams differ from losing teams in defensive rebounds (SC = .42) and in assists (SC = .38). Similarly, winning teams differ from losing teams when they play at home in defensive rebounds (SC = .40) and in assists (SC = .41). On the other hand, winning teams differ from losing teams when they play away in defensive rebounds (SC = .44), assists (SC = .30), successful 2-point field goals (SC = .31), and unsuccessful 3-point field goals (SC = -.35). Defensive rebounds and assists were the only game-related statistics common to all three analyses.

  2. Epidemiology Characteristics, Methodological Assessment and Reporting of Statistical Analysis of Network Meta-Analyses in the Field of Cancer

    PubMed Central

    Ge, Long; Tian, Jin-hui; Li, Xiu-xia; Song, Fujian; Li, Lun; Zhang, Jun; Li, Ge; Pei, Gai-qin; Qiu, Xia; Yang, Ke-hu

    2016-01-01

    Because of the methodological complexity of network meta-analyses (NMAs), NMAs may be more vulnerable to methodological risks than conventional pair-wise meta-analysis. Our study aims to investigate the epidemiological characteristics, conduct of the literature search, methodological quality and reporting of the statistical analysis process in the field of cancer, based on the PRISMA extension statement and a modified AMSTAR checklist. We identified and included 102 NMAs in the field of cancer. 61 NMAs were conducted using a Bayesian framework. Of them, more than half (60.66%) did not report assessment of convergence. Inconsistency was assessed in 27.87% of NMAs. Assessment of heterogeneity in traditional meta-analyses was more common (42.62%) than in NMAs (6.56%). Most NMAs did not report assessment of similarity (86.89%) and did not use the GRADE tool to assess quality of evidence (95.08%). 43 NMAs used adjusted indirect comparisons; the methods used were described in 53.49% of these NMAs. Only 4.65% of NMAs described the details of handling multi-group trials and 6.98% described the methods of similarity assessment. The median total AMSTAR score was 8.00 (IQR: 6.00–8.25). Methodological quality and reporting of statistical analysis did not substantially differ by selected general characteristics. Overall, the quality of NMAs in the field of cancer was generally acceptable. PMID:27848997

  3. A Geospatial Statistical Analysis of the Density of Lottery Outlets within Ethnically Concentrated Neighborhoods

    ERIC Educational Resources Information Center

    Wiggins, Lyna; Nower, Lia; Mayers, Raymond Sanchez; Peterson, N. Andrew

    2010-01-01

    This study examines the density of lottery outlets within ethnically concentrated neighborhoods in Middlesex County, New Jersey, using geospatial statistical analyses. No prior studies have empirically examined the relationship between lottery outlet density and population demographics. Results indicate that lottery outlets were not randomly…

  4. Conducting Multilevel Analyses in Medical Education

    ERIC Educational Resources Information Center

    Zyphur, Michael J.; Kaplan, Seth A.; Islam, Gazi; Barsky, Adam P.; Franklin, Michael S.

    2008-01-01

    A significant body of education literature has begun using multilevel statistical models to examine data that reside at multiple levels of analysis. In order to provide a primer for medical education researchers, the current work gives a brief overview of some issues associated with multilevel statistical modeling. To provide an example of this…

  5. Centralized Analysis of Local Data, With Dollars and Lives on the Line: Lessons From The Home Radon Experience

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Price, PhillipN.; Gelman, Andrew

    2014-11-24

    In this chapter we elucidate four main themes. The first is that modern data analyses, including "Big Data" analyses, often rely on data from different sources, which can present challenges in constructing statistical models that can make effective use of all of the data. The second theme is that although data analysis is usually centralized, frequently the final outcome is to provide information or allow decision-making for individuals. Third, data analyses often have multiple uses by design: the outcomes of the analysis are intended to be used by more than one person or group, for more than one purpose. Finally, issues of privacy and confidentiality can cause problems in more subtle ways than are usually considered; we will illustrate this point by discussing a case in which there is substantial and effective political opposition to simply acknowledging the geographic distribution of a health hazard. A researcher analyzes some data and learns something important. What happens next? What does it take for the results to make a difference in people's lives? In this chapter we tell a story - a true story - about a statistical analysis that should have changed government policy, but didn't. The project was a research success that did not make its way into policy, and we think it provides some useful insights into the interplay between locally-collected data, statistical analysis, and individual decision making.

  6. Implementation of novel statistical procedures and other advanced approaches to improve analysis of CASA data.

    PubMed

    Ramón, M; Martínez-Pastor, F

    2018-04-23

    Computer-aided sperm analysis (CASA) produces a wealth of data that is frequently ignored. The use of multiparametric statistical methods can help explore these datasets, unveiling the subpopulation structure of sperm samples. In this review we analyse the significance of the internal heterogeneity of sperm samples and its relevance. We also provide a brief description of the statistical tools used for extracting sperm subpopulations from the datasets, namely unsupervised clustering (with non-hierarchical, hierarchical and two-step methods) and the most advanced supervised methods, based on machine learning. The former methods have allowed the exploration of subpopulation patterns in many species, whereas the latter offer further possibilities, especially considering functional studies and the practical use of subpopulation analysis. We also consider novel approaches, such as the use of geometric morphometrics or imaging flow cytometry. Finally, although applying clustering analyses to the data provided by CASA systems yields valuable information on sperm samples, there are several caveats. Protocols for capturing and analysing motility or morphometry should be standardised and adapted to each experiment, and the algorithms should be open in order to allow comparison of results between laboratories. Moreover, we must be aware of new technology that could change the paradigm for studying sperm motility and morphology.
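
    The non-hierarchical clustering step mentioned in this review can be sketched as below. This is a minimal illustration on simulated data: the CASA kinematic variable names (VCL, VSL, LIN, ALH), the cluster count, and the values are assumptions, not the review's protocol.

```python
# Sketch: extract sperm subpopulations from simulated CASA kinematics with k-means.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Hypothetical per-spermatozoon kinematics: VCL, VSL, LIN, ALH
casa = rng.normal(loc=[120, 60, 0.5, 3.0], scale=[30, 20, 0.15, 1.0], size=(500, 4))

Z = StandardScaler().fit_transform(casa)            # put variables on a common scale
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(Z)
labels = km.labels_

# Describe each subpopulation by its size and mean velocity
for k in range(3):
    members = labels == k
    print(f"subpopulation {k}: n = {members.sum()}, mean VCL = {casa[members, 0].mean():.1f}")
```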

  7. Unconscious analyses of visual scenes based on feature conjunctions.

    PubMed

    Tachibana, Ryosuke; Noguchi, Yasuki

    2015-06-01

    To efficiently process a cluttered scene, the visual system analyzes statistical properties or regularities of visual elements embedded in the scene. It is controversial, however, whether those scene analyses could also work for stimuli unconsciously perceived. Here we show that our brain performs the unconscious scene analyses not only using a single featural cue (e.g., orientation) but also based on conjunctions of multiple visual features (e.g., combinations of color and orientation information). Subjects foveally viewed a stimulus array (duration: 50 ms) where 4 types of bars (red-horizontal, red-vertical, green-horizontal, and green-vertical) were intermixed. Although a conscious perception of those bars was inhibited by a subsequent mask stimulus, the brain correctly analyzed the information about color, orientation, and color-orientation conjunctions of those invisible bars. The information of those features was then used for the unconscious configuration analysis (statistical processing) of the central bars, which induced a perceptual bias and illusory feature binding in visible stimuli at peripheral locations. While statistical analyses and feature binding are normally 2 key functions of the visual system to construct coherent percepts of visual scenes, our results show that a high-level analysis combining those 2 functions is correctly performed by unconscious computations in the brain. (c) 2015 APA, all rights reserved.

  8. A Primer on Receiver Operating Characteristic Analysis and Diagnostic Efficiency Statistics for Pediatric Psychology: We Are Ready to ROC

    PubMed Central

    2014-01-01

    Objective To offer a practical demonstration of receiver operating characteristic (ROC) analyses, diagnostic efficiency statistics, and their application to clinical decision making using a popular parent checklist to assess for potential mood disorder. Method Secondary analyses of data from 589 families seeking outpatient mental health services, completing the Child Behavior Checklist and semi-structured diagnostic interviews. Results Internalizing Problems raw scores discriminated mood disorders significantly better than did age- and gender-normed T scores, or an Affective Problems score. Internalizing scores <8 had a diagnostic likelihood ratio <0.3, and scores >30 had a diagnostic likelihood ratio of 7.4. Conclusions This study illustrates a series of steps in defining a clinical problem, operationalizing it, selecting a valid study design, and using ROC analyses to generate statistics that support clinical decisions. The ROC framework offers important advantages for clinical interpretation. Appendices include sample scripts using SPSS and R to check assumptions and conduct ROC analyses. PMID:23965298
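
    The ROC curve and diagnostic likelihood ratios described in this primer can be sketched as follows. The data, score distributions, and cut-offs below are simulated assumptions chosen only to illustrate the statistics; they do not reproduce the study's results or its SPSS/R scripts.

```python
# Sketch: ROC AUC for a raw checklist score, plus diagnostic likelihood ratios
# (DLR) for low and high score ranges. All numbers are simulated.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(2)
has_mood_disorder = rng.integers(0, 2, 589)
score = rng.normal(15, 8, 589) + 8 * has_mood_disorder  # hypothetical raw score

auc = roc_auc_score(has_mood_disorder, score)
fpr, tpr, thresholds = roc_curve(has_mood_disorder, score)
print(f"AUC = {auc:.2f}")

def dlr(in_range):
    """DLR for a score range: P(range | disorder) / P(range | no disorder)."""
    p_given_disorder = in_range[has_mood_disorder == 1].mean()
    p_given_healthy = in_range[has_mood_disorder == 0].mean()
    return p_given_disorder / p_given_healthy

print("DLR for scores < 8 :", round(dlr(score < 8), 2))
print("DLR for scores > 30:", round(dlr(score > 30), 2))
```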

  9. A primer on receiver operating characteristic analysis and diagnostic efficiency statistics for pediatric psychology: we are ready to ROC.

    PubMed

    Youngstrom, Eric A

    2014-03-01

    To offer a practical demonstration of receiver operating characteristic (ROC) analyses, diagnostic efficiency statistics, and their application to clinical decision making using a popular parent checklist to assess for potential mood disorder. Secondary analyses of data from 589 families seeking outpatient mental health services, completing the Child Behavior Checklist and semi-structured diagnostic interviews. Internalizing Problems raw scores discriminated mood disorders significantly better than did age- and gender-normed T scores, or an Affective Problems score. Internalizing scores <8 had a diagnostic likelihood ratio <0.3, and scores >30 had a diagnostic likelihood ratio of 7.4. This study illustrates a series of steps in defining a clinical problem, operationalizing it, selecting a valid study design, and using ROC analyses to generate statistics that support clinical decisions. The ROC framework offers important advantages for clinical interpretation. Appendices include sample scripts using SPSS and R to check assumptions and conduct ROC analyses.

  10. [Gender-sensitive epidemiological data analysis: methodological aspects and empirical outcomes. Illustrated by a health reporting example].

    PubMed

    Jahn, I; Foraita, R

    2008-01-01

    In Germany, gender-sensitive approaches are part of the guidelines for good epidemiological practice as well as of health reporting. They are increasingly demanded in order to implement the gender mainstreaming strategy in research funding by the federal government and the federal states. This paper focuses on methodological aspects of data analysis, using the Bremen health report, a population-based cross-sectional study, as an empirical example. Health reporting requires analysis and reporting methods that are able to uncover sex/gender aspects of a question, on the one hand, and to consider how results can be communicated adequately, on the other. The core question is: what consequences does the different inclusion of the category sex/gender in different statistical analyses for the identification of potential target groups have on the results? As evaluation methods, logistic regressions and a two-stage exploratory procedure were used. This procedure combines graphical models with CHAID decision trees and allows complex results to be visualised. Both methods were applied stratified by sex/gender as well as adjusted for sex/gender, and the results were compared with each other. As a result, only stratified analyses are able to detect differences between the sexes and within the sex/gender groups when one cannot draw on prior knowledge. Adjusted analyses can detect sex/gender differences only if interaction terms have been included in the model. Results are discussed from a statistical-epidemiological perspective as well as in the context of health reporting. In conclusion, the question of whether a statistical method is gender-sensitive can only be answered for concrete research questions and known conditions. Often, an appropriate statistical procedure can be chosen after conducting separate analyses for women and men. Future gender studies require innovative study designs as well as conceptual clarity with regard to the biological and the sociocultural elements of the category sex/gender.

  11. Contour plot assessment of existing meta-analyses confirms robust association of statin use and acute kidney injury risk.

    PubMed

    Chevance, Aurélie; Schuster, Tibor; Steele, Russell; Ternès, Nils; Platt, Robert W

    2015-10-01

    Robustness of an existing meta-analysis can justify decisions on whether to conduct an additional study addressing the same research question. We illustrate the graphical assessment of the potential impact of an additional study on an existing meta-analysis using published data on statin use and the risk of acute kidney injury. A previously proposed graphical augmentation approach is used to assess the sensitivity of the current test and heterogeneity statistics extracted from existing meta-analysis data. In addition, we extended the graphical augmentation approach to assess potential changes in the pooled effect estimate after updating a current meta-analysis and applied the three graphical contour definitions to data from meta-analyses on statin use and acute kidney injury risk. In the considered example data, the pooled effect estimates and heterogeneity indices proved to be considerably robust to the addition of a future study. Moreover, for some previously inconclusive meta-analyses, a study update might yield a statistically significant increase in kidney injury risk associated with higher statin exposure. The illustrated contour approach should become a standard tool for assessing the robustness of meta-analyses. It can guide decisions on whether to conduct additional studies addressing a relevant research question. Copyright © 2015 Elsevier Inc. All rights reserved.

  12. Recovering incomplete data using Statistical Multiple Imputations (SMI): a case study in environmental chemistry.

    PubMed

    Mercer, Theresa G; Frostick, Lynne E; Walmsley, Anthony D

    2011-10-15

    This paper presents a statistical technique that can be applied to environmental chemistry data where missing values and limit-of-detection levels prevent the application of standard statistical tests. A working example is taken from an environmental leaching study that was set up to determine if there were significant differences in levels of leached arsenic (As), chromium (Cr) and copper (Cu) between lysimeters containing preservative-treated wood waste and those containing untreated wood. Fourteen lysimeters were set up and left in natural conditions for 21 weeks. The resultant leachate was analysed by ICP-OES to determine the As, Cr and Cu concentrations. However, due to the variation inherent in each lysimeter combined with the limits of detection offered by ICP-OES, the collected quantitative data were somewhat incomplete. Initial data analysis was hampered by the number of 'missing values' in the data. To recover the dataset, the statistical tool of Statistical Multiple Imputation (SMI) was applied, and the data were re-analysed successfully. It was demonstrated that using SMI did not affect the variance in the data, but facilitated analysis of the complete dataset. Copyright © 2011 Elsevier B.V. All rights reserved.
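
    The general idea of statistical multiple imputation can be sketched as below: create several completed copies of an incomplete dataset, analyse each, and pool the results with Rubin's rules. This is an illustration of the technique under assumed data, using scikit-learn's IterativeImputer; it is not the authors' procedure or software.

```python
# Sketch: multiple imputation of hypothetical leachate concentrations
# followed by Rubin's rules for a pooled mean and standard error.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates IterativeImputer)
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(3)
# Hypothetical As, Cr, Cu concentrations from 14 lysimeters x 21 weeks
data = rng.lognormal(mean=[-2.0, -1.5, -1.0], sigma=0.5, size=(294, 3))
mask = rng.random(data.shape) < 0.2           # ~20% values missing / below detection
incomplete = np.where(mask, np.nan, data)

m = 5  # number of imputations
as_means, as_vars = [], []
for i in range(m):
    imp = IterativeImputer(sample_posterior=True, random_state=i)
    completed = imp.fit_transform(incomplete)
    as_col = completed[:, 0]                  # arsenic column
    as_means.append(as_col.mean())
    as_vars.append(as_col.var(ddof=1) / len(as_col))  # within-imputation variance of the mean

# Rubin's rules: pooled estimate and total variance
qbar = np.mean(as_means)
within = np.mean(as_vars)
between = np.var(as_means, ddof=1)
total_var = within + (1 + 1 / m) * between
print(f"pooled As mean = {qbar:.3f}, SE = {np.sqrt(total_var):.3f}")
```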

  13. Kidney function changes with aging in adults: comparison between cross-sectional and longitudinal data analyses in renal function assessment.

    PubMed

    Chung, Sang M; Lee, David J; Hand, Austin; Young, Philip; Vaidyanathan, Jayabharathi; Sahajwalla, Chandrahas

    2015-12-01

    The study evaluated whether the renal function decline rate per year with age in adults varies based on two primary statistical analyses: cross-sectional (CS), using one observation per subject, and longitudinal (LT), using multiple observations per subject over time. A total of 16,628 records (3,946 subjects; age range 30-92 years) of creatinine clearance and relevant demographic data were used. On average, four samples per subject were collected for up to 2,364 days (mean: 793 days). A simple linear regression and random coefficient models were selected for the CS and LT analyses, respectively. The renal function decline rates per year were 1.33 and 0.95 ml/min/year for the CS and LT analyses, respectively; the decline was slower when the repeated individual measurements were considered. The study confirms that the rates differ depending on the statistical analysis, and that a statistically robust longitudinal model with a proper sampling design provides reliable individual as well as population estimates of the renal function decline rate per year with age in adults. In conclusion, our findings indicated that one should be cautious in interpreting the renal function decline rate with aging because its estimation was highly dependent on the statistical analysis. From our analyses, a population longitudinal analysis (e.g. a random coefficient model) is recommended if individualization is critical, such as a dose adjustment based on renal function during chronic therapy. Copyright © 2015 John Wiley & Sons, Ltd.
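
    The contrast between the two analyses the abstract compares can be sketched as below: a cross-sectional simple linear regression using one record per subject versus a longitudinal random-coefficient (mixed-effects) model using the repeated records. The data, slopes, and variable names are simulated assumptions, not the study's dataset.

```python
# Sketch: cross-sectional OLS vs. longitudinal random-coefficient model
# for the decline of creatinine clearance (crcl) with age. Simulated data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
subjects, per_subject = 300, 4
subj_id = np.repeat(np.arange(subjects), per_subject)
age = rng.uniform(30, 92, subjects).repeat(per_subject) + np.tile(np.arange(per_subject), subjects)
true_slope = rng.normal(-1.0, 0.3, subjects)            # subject-specific decline per year
crcl = 130 + true_slope[subj_id] * (age - 30) + rng.normal(0, 8, subjects * per_subject)
df = pd.DataFrame({"subj": subj_id, "age": age, "crcl": crcl})

# Cross-sectional analysis: first record per subject, simple linear regression
cs = df.groupby("subj").first()
cs_fit = smf.ols("crcl ~ age", data=cs).fit()

# Longitudinal analysis: random intercept and random slope for age per subject
lt_fit = smf.mixedlm("crcl ~ age", data=df, groups=df["subj"], re_formula="~age").fit()

print("cross-sectional decline (ml/min/year):", round(-cs_fit.params["age"], 2))
print("longitudinal decline   (ml/min/year):", round(-lt_fit.fe_params["age"], 2))
```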

  14. A comparison of InVivoStat with other statistical software packages for analysis of data generated from animal experiments.

    PubMed

    Clark, Robin A; Shoaib, Mohammed; Hewitt, Katherine N; Stanford, S Clare; Bate, Simon T

    2012-08-01

    InVivoStat is a free-to-use statistical software package for analysis of data generated from animal experiments. The package is designed specifically for researchers in the behavioural sciences, where exploiting the experimental design is crucial for reliable statistical analyses. This paper compares the analysis of three experiments conducted using InVivoStat with other widely used statistical packages: SPSS (V19), PRISM (V5), UniStat (V5.6) and Statistica (V9). We show that InVivoStat provides results that are similar to those from the other packages and, in some cases, are more advanced. This investigation provides evidence of further validation of InVivoStat and should strengthen users' confidence in this new software package.

  15. Longitudinal Assessment of Self-Reported Recent Back Pain and Combat Deployment in the Millennium Cohort Study

    DTIC Science & Technology

    2016-11-15

    Participants were followed for the development of back pain for an average of 3.9 years. Methods: descriptive statistics and longitudinal analyses were used; the back pain measure was based on the National Health and Nutrition Examination Survey. Statistical analysis: descriptive and univariate analyses compared characteristics of participants. Key words: health, military personnel, occupational health, outcome assessment, statistics, survey methodology. Level of Evidence: 3. Spine 2016;41:1754-1763.

  16. Human Deception Detection from Whole Body Motion Analysis

    DTIC Science & Technology

    2015-12-01

    9.3.2. Prediction Probability. The output reports from SPSS detail the stepwise procedures for each series of analyses, using Wald statistic values. Statistical significance alone was not used to determine replication; instead, a combination of significance and direction of means was used. All data were analyzed using the Statistical Package for the Social Sciences (SPSS, v.19.0, Chicago, IL).

  17. Characteristics of meta-analyses and their component studies in the Cochrane Database of Systematic Reviews: a cross-sectional, descriptive analysis

    PubMed Central

    2011-01-01

    Background Cochrane systematic reviews collate and summarise studies of the effects of healthcare interventions. The characteristics of these reviews and the meta-analyses and individual studies they contain provide insights into the nature of healthcare research and important context for the development of relevant statistical and other methods. Methods We classified every meta-analysis with at least two studies in every review in the January 2008 issue of the Cochrane Database of Systematic Reviews (CDSR) according to the medical specialty, the types of interventions being compared and the type of outcome. We provide descriptive statistics for numbers of meta-analyses, numbers of component studies and sample sizes of component studies, broken down by these categories. Results We included 2321 reviews containing 22,453 meta-analyses, which themselves consist of data from 112,600 individual studies (which may appear in more than one meta-analysis). Meta-analyses in the areas of gynaecology, pregnancy and childbirth (21%), mental health (13%) and respiratory diseases (13%) are well represented in the CDSR. Most meta-analyses address drugs, either with a control or placebo group (37%) or in a comparison with another drug (25%). The median number of meta-analyses per review is six (inter-quartile range 3 to 12). The median number of studies included in the meta-analyses with at least two studies is three (inter-quartile range 2 to 6). Sample sizes of individual studies range from 2 to 1,242,071, with a median of 91 participants. Discussion It is clear that the numbers of studies eligible for meta-analyses are typically very small for all medical areas, outcomes and interventions covered by Cochrane reviews. This highlights the particular importance of suitable methods for the meta-analysis of small data sets. There was little variation in number of studies per meta-analysis across medical areas, across outcome data types or across types of interventions being compared. PMID:22114982

  18. Study Designs and Statistical Analyses for Biomarker Research

    PubMed Central

    Gosho, Masahiko; Nagashima, Kengo; Sato, Yasunori

    2012-01-01

    Biomarkers are becoming increasingly important for streamlining drug discovery and development. In addition, biomarkers are widely expected to be used as a tool for disease diagnosis, personalized medication, and surrogate endpoints in clinical research. In this paper, we highlight several important aspects related to study design and statistical analysis for clinical research incorporating biomarkers. We describe the typical and current study designs for exploring, detecting, and utilizing biomarkers. Furthermore, we introduce statistical issues such as confounding and multiplicity for statistical tests in biomarker research. PMID:23012528

  19. Informal Statistics Help Desk

    NASA Technical Reports Server (NTRS)

    Young, M.; Koslovsky, M.; Schaefer, Caroline M.; Feiveson, A. H.

    2017-01-01

    Back by popular demand, the JSC Biostatistics Laboratory and LSAH statisticians are offering an opportunity to discuss your statistical challenges and needs. Take the opportunity to meet the individuals offering expert statistical support to the JSC community. Join us for an informal conversation about any questions you may have encountered with issues of experimental design, analysis, or data visualization. Get answers to common questions about sample size, repeated measures, statistical assumptions, missing data, multiple testing, time-to-event data, and when to trust the results of your analyses.

  20. Mathematical background and attitudes toward statistics in a sample of Spanish college students.

    PubMed

    Carmona, José; Martínez, Rafael J; Sánchez, Manuel

    2005-08-01

    To examine the relation between mathematical background and initial attitudes toward statistics among Spanish college students in the social sciences, the Survey of Attitudes Toward Statistics was given to 827 students. Multivariate analyses tested the effects of two indicators of mathematical background (amount of exposure and achievement in previous courses) on the four subscales. The analysis suggested that grades in previous courses are more related to initial attitudes toward statistics than the number of mathematics courses taken. Mathematical background was related to students' affective responses to statistics but not to their valuing of statistics. Implications for further research are discussed.

  1. Analysis of the dependence of extreme rainfalls

    NASA Astrophysics Data System (ADS)

    Padoan, Simone; Ancey, Christophe; Parlange, Marc

    2010-05-01

    The aim of spatial analysis is to quantitatively describe the behavior of environmental phenomena such as precipitation levels, wind speed or daily temperatures. A number of generic approaches to spatial modeling have been developed [1], but these are not necessarily ideal for handling extremal aspects given their focus on mean process levels. The areal modelling of the extremes of a natural process observed at points in space is important in environmental statistics; for example, understanding extremal spatial rainfall is crucial in flood protection. In light of recent concerns over climate change, the use of robust mathematical and statistical methods for such analyses has grown in importance. Multivariate extreme value models and the class of max-stable processes [2] have a similar asymptotic motivation to the univariate Generalized Extreme Value (GEV) distribution, but provide a general approach to modelling extreme processes that incorporates temporal or spatial dependence. Statistical methods for max-stable processes and data analyses of practical problems are discussed by [3] and [4]. This work illustrates methods for the statistical modelling of spatial extremes and gives examples of their use by means of a real extreme-value analysis of Swiss precipitation levels. [1] Cressie, N. A. C. (1993). Statistics for Spatial Data. Wiley, New York. [2] de Haan, L. and Ferreira, A. (2006). Extreme Value Theory: An Introduction. Springer, USA. [3] Padoan, S. A., Ribatet, M. and Sisson, S. A. (2009). Likelihood-Based Inference for Max-Stable Processes. Journal of the American Statistical Association, Theory & Methods. In press. [4] Davison, A. C. and Gholamrezaee, M. (2009). Geostatistics of extremes. Journal of the Royal Statistical Society, Series B. To appear.
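
    As a small illustration of the univariate building block the abstract refers to, the sketch below fits a GEV distribution to simulated annual maximum rainfall at a single site and reads off a return level. The values and site are hypothetical, and the spatial max-stable modelling that is the paper's focus requires specialised packages not shown here.

```python
# Sketch: fit a univariate GEV to simulated annual rainfall maxima and
# estimate a 100-year return level (the 0.99 quantile of the fitted GEV).
import numpy as np
from scipy.stats import genextreme

annual_max_rain = genextreme.rvs(c=-0.1, loc=60, scale=15, size=50, random_state=5)  # mm, simulated

shape, loc, scale = genextreme.fit(annual_max_rain)
return_level_100 = genextreme.ppf(0.99, shape, loc=loc, scale=scale)

print(f"fitted GEV: shape = {shape:.2f}, loc = {loc:.1f}, scale = {scale:.1f}")
print(f"estimated 100-year rainfall: {return_level_100:.1f} mm")
```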

  2. The sumLINK statistic for genetic linkage analysis in the presence of heterogeneity.

    PubMed

    Christensen, G B; Knight, S; Camp, N J

    2009-11-01

    We present the "sumLINK" statistic--the sum of multipoint LOD scores for the subset of pedigrees with nominally significant linkage evidence at a given locus--as an alternative to common methods to identify susceptibility loci in the presence of heterogeneity. We also suggest the "sumLOD" statistic (the sum of positive multipoint LOD scores) as a companion to the sumLINK. sumLINK analysis identifies genetic regions of extreme consistency across pedigrees without regard to negative evidence from unlinked or uninformative pedigrees. Significance is determined by an innovative permutation procedure based on genome shuffling that randomizes linkage information across pedigrees. This procedure for generating the empirical null distribution may be useful for other linkage-based statistics as well. Using 500 genome-wide analyses of simulated null data, we show that the genome shuffling procedure results in the correct type 1 error rates for both the sumLINK and sumLOD. The power of the statistics was tested using 100 sets of simulated genome-wide data from the alternative hypothesis from GAW13. Finally, we illustrate the statistics in an analysis of 190 aggressive prostate cancer pedigrees from the International Consortium for Prostate Cancer Genetics, where we identified a new susceptibility locus. We propose that the sumLINK and sumLOD are ideal for collaborative projects and meta-analyses, as they do not require any sharing of identifiable data between contributing institutions. Further, loci identified with the sumLINK have good potential for gene localization via statistical recombinant mapping, as, by definition, several linked pedigrees contribute to each peak.
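
    The two statistics defined in this abstract lend themselves to a short numerical sketch: sumLINK sums multipoint LOD scores over pedigrees whose LOD exceeds a nominal threshold at a locus, and sumLOD sums all positive LOD scores. The permutation scheme and the numbers below are simplified, hypothetical stand-ins for the genome shuffling procedure the authors describe.

```python
# Sketch: sumLINK and sumLOD statistics with a simplified permutation null.
import numpy as np

rng = np.random.default_rng(6)
n_pedigrees, n_loci = 190, 400
lod = rng.normal(0, 0.6, size=(n_pedigrees, n_loci))   # hypothetical multipoint LOD scores

NOMINAL = 0.588  # assumed LOD cut-off for nominal pointwise significance (illustrative)

def sum_link(scores):
    return np.where(scores >= NOMINAL, scores, 0.0).sum(axis=0)

def sum_lod(scores):
    return np.clip(scores, 0.0, None).sum(axis=0)

observed = sum_link(lod)

# Simplified null: shuffle each pedigree's LOD vector across loci, breaking
# any across-pedigree consistency at a given locus.
n_perm, null_max = 500, []
for _ in range(n_perm):
    shuffled = np.array([rng.permutation(row) for row in lod])
    null_max.append(sum_link(shuffled).max())

threshold = np.quantile(null_max, 0.95)
print("loci exceeding the empirical genome-wide 0.05 threshold:", int(np.sum(observed > threshold)))
```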

  3. Finding Balance at the Elusive Mean

    ERIC Educational Resources Information Center

    Hudson, Rick A.

    2012-01-01

    Data analysis plays an important role in people's lives. Citizens need to be able to conduct critical analyses of statistical information in the work place, in their personal lives, and when portrayed by the media. However, becoming a conscientious consumer of statistics is a gradual process. The experiences that students have with data in the…

  4. Improved analyses using function datasets and statistical modeling

    Treesearch

    John S. Hogland; Nathaniel M. Anderson

    2014-01-01

    Raster modeling is an integral component of spatial analysis. However, conventional raster modeling techniques can require a substantial amount of processing time and storage space and have limited statistical functionality and machine learning algorithms. To address this issue, we developed a new modeling framework using C# and ArcObjects and integrated that framework...

  5. QUANTITATIVE IMAGING AND STATISTICAL ANALYSIS OF FLUORESCENCE IN SITU HYBRIDIZATION (FISH) OF AUREOBASIDIUM PULLULANS. (R823845)

    EPA Science Inventory

    Abstract

    Image and multifactorial statistical analyses were used to evaluate the intensity of fluorescence signal from cells of three strains of A. pullulans and one strain of Rhodosporidium toruloides, as an outgroup, hybridized with either a universal o...

  6. Reframing Serial Murder Within Empirical Research.

    PubMed

    Gurian, Elizabeth A

    2017-04-01

    Empirical research on serial murder is limited due to the lack of consensus on a definition, the continued use of primarily descriptive statistics, and linkage to popular culture depictions. These limitations also inhibit our understanding of these offenders and affect credibility in the field of research. Therefore, this comprehensive overview of a sample of 508 cases (738 total offenders, including partnered groups of two or more offenders) provides analyses of solo male, solo female, and partnered serial killers to elucidate statistical differences and similarities in offending and adjudication patterns among the three groups. This analysis of serial homicide offenders not only supports previous research on offending patterns present in the serial homicide literature but also reveals that empirically based analyses can enhance our understanding beyond traditional case studies and descriptive statistics. Further research based on these empirical analyses can aid in the development of more accurate classifications and definitions of serial murderers.

  7. Meta- and statistical analysis of single-case intervention research data: quantitative gifts and a wish list.

    PubMed

    Kratochwill, Thomas R; Levin, Joel R

    2014-04-01

    In this commentary, we add to the spirit of the articles appearing in the special series devoted to meta- and statistical analysis of single-case intervention-design data. Following a brief discussion of historical factors leading to our initial involvement in statistical analysis of such data, we discuss: (a) the value added by including statistical-analysis recommendations in the What Works Clearinghouse Standards for single-case intervention designs; (b) the importance of visual analysis in single-case intervention research, along with the distinctive role that could be played by single-case effect-size measures; and (c) the elevated internal validity and statistical-conclusion validity afforded by the incorporation of various forms of randomization into basic single-case design structures. For the future, we envision more widespread application of quantitative analyses, as critical adjuncts to visual analysis, in both primary single-case intervention research studies and literature reviews in the behavioral, educational, and health sciences. Copyright © 2014 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  8. Teaching Statistics in Biology: Using Inquiry-based Learning to Strengthen Understanding of Statistical Analysis in Biology Laboratory Courses

    PubMed Central

    2008-01-01

    There is an increasing need for students in the biological sciences to build a strong foundation in quantitative approaches to data analyses. Although most science, engineering, and math field majors are required to take at least one statistics course, statistical analysis is poorly integrated into undergraduate biology course work, particularly at the lower-division level. Elements of statistics were incorporated into an introductory biology course, including a review of statistics concepts and opportunity for students to perform statistical analysis in a biological context. Learning gains were measured with an 11-item statistics learning survey instrument developed for the course. Students showed a statistically significant 25% (p < 0.005) increase in statistics knowledge after completing introductory biology. Students improved their scores on the survey after completing introductory biology, even if they had previously completed an introductory statistics course (9%, improvement p < 0.005). Students retested 1 yr after completing introductory biology showed no loss of their statistics knowledge as measured by this instrument, suggesting that the use of statistics in biology course work may aid long-term retention of statistics knowledge. No statistically significant differences in learning were detected between male and female students in the study. PMID:18765754

  9. Study design and statistical analysis of data in human population studies with the micronucleus assay.

    PubMed

    Ceppi, Marcello; Gallo, Fabio; Bonassi, Stefano

    2011-01-01

    The most common study design in population studies based on the micronucleus (MN) assay is the cross-sectional study, which is largely performed to evaluate the DNA-damaging effects of exposure to genotoxic agents in the workplace, in the environment, as well as from diet or lifestyle factors. Sample size is still a critical issue in the design of MN studies, since most recent studies considering gene-environment interaction often require a sample size of several hundred subjects, which is in many cases difficult to achieve. The control of confounding is another major threat to the validity of causal inference. The most popular confounders considered in population studies using MN are age, gender and smoking habit. Extensive attention is given to the assessment of effect modification, given the increasing inclusion of biomarkers of genetic susceptibility in the study design. Selected issues concerning the statistical treatment of data are addressed in this mini-review, starting from data description, which is a critical step of statistical analysis, since it allows possible errors in the dataset to be detected and the validity of the assumptions required for more complex analyses to be checked. Basic issues dealing with the statistical analysis of biomarkers are extensively evaluated, including methods to explore the dose-response relationship between two continuous variables and inferential analysis. A critical approach to the use of parametric and non-parametric methods is presented, before addressing the issue of the most suitable multivariate models to fit MN data. In the last decade, the quality of statistical analysis of MN data has certainly evolved, although even nowadays only a small number of studies apply the Poisson model, which is the most suitable method for the analysis of MN data.
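
    The Poisson model the review recommends for MN counts can be sketched as follows, adjusting for the usual confounders (age, gender, smoking) and using the number of scored cells as an offset. The data, effect sizes, and variable names are simulated assumptions for illustration only.

```python
# Sketch: Poisson regression for micronucleus counts with an offset for cells scored.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 200
df = pd.DataFrame({
    "age": rng.uniform(20, 70, n),
    "female": rng.integers(0, 2, n),
    "smoker": rng.integers(0, 2, n),
    "exposed": rng.integers(0, 2, n),
    "cells": np.full(n, 1000),               # binucleated cells scored per subject
})
rate = np.exp(-4.5 + 0.01 * df.age + 0.1 * df.female + 0.2 * df.smoker + 0.3 * df.exposed)
df["mn"] = rng.poisson(rate * df.cells)       # simulated MN counts

fit = smf.glm("mn ~ exposed + age + female + smoker",
              data=df, family=sm.families.Poisson(),
              offset=np.log(df["cells"])).fit()

print(fit.summary().tables[1])
# exp(coefficient) is the MN frequency ratio associated with each covariate
print("frequency ratio for exposure:", round(np.exp(fit.params["exposed"]), 2))
```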

  10. Systems and methods for detection of blowout precursors in combustors

    DOEpatents

    Lieuwen, Tim C.; Nair, Suraj

    2006-08-15

    The present invention comprises systems and methods for detecting flame blowout precursors in combustors. The blowout precursor detection system comprises a combustor, a pressure measuring device, and blowout precursor detection unit. A combustion controller may also be used to control combustor parameters. The methods of the present invention comprise receiving pressure data measured by an acoustic pressure measuring device, performing one or a combination of spectral analysis, statistical analysis, and wavelet analysis on received pressure data, and determining the existence of a blowout precursor based on such analyses. The spectral analysis, statistical analysis, and wavelet analysis further comprise their respective sub-methods to determine the existence of blowout precursors.
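
    Two of the three analyses this patent combines (spectral analysis and simple statistical analysis of combustor pressure data) can be sketched on a synthetic signal as below. The sampling rate, tone frequency, and thresholds are illustrative assumptions; the wavelet stage and the actual detection logic of the invention are not reproduced.

```python
# Sketch: Welch power spectral density and basic statistics of a synthetic
# combustor pressure signal, the kind of quantities a precursor detector tracks.
import numpy as np
from scipy.signal import welch
from scipy.stats import kurtosis

fs = 10_000                                  # sampling rate, Hz (assumed)
t = np.arange(0, 1.0, 1 / fs)
rng = np.random.default_rng(11)
pressure = 0.2 * np.sin(2 * np.pi * 210 * t) + rng.normal(0, 0.05, t.size)

freqs, psd = welch(pressure, fs=fs, nperseg=2048)
dominant = freqs[np.argmax(psd)]

print(f"dominant frequency: {dominant:.0f} Hz")
print(f"pressure variance:  {pressure.var():.4f}")
print(f"kurtosis:           {kurtosis(pressure):.2f}")
# A detector would monitor how these quantities drift as the flame approaches blowout.
```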

  11. Quantifying the impact of between-study heterogeneity in multivariate meta-analyses

    PubMed Central

    Jackson, Dan; White, Ian R; Riley, Richard D

    2012-01-01

    Measures that quantify the impact of heterogeneity in univariate meta-analysis, including the very popular I2 statistic, are now well established. Multivariate meta-analysis, where studies provide multiple outcomes that are pooled in a single analysis, is also becoming more commonly used. The question of how to quantify heterogeneity in the multivariate setting is therefore raised. It is the univariate R2 statistic, the ratio of the variance of the estimated treatment effect under the random and fixed effects models, that generalises most naturally, so this statistic provides our basis. This statistic is then used to derive a multivariate analogue of I2. We also provide a multivariate H2 statistic, the ratio of a generalisation of Cochran's heterogeneity statistic and its associated degrees of freedom, together with an accompanying generalisation of the usual I2 statistic. Our proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis. We apply our methods to some real datasets and show how our statistics are equally appropriate in the context of multivariate meta-regression, where study level covariate effects are included in the model. Our heterogeneity statistics may be used when applying any procedure for fitting the multivariate random effects model. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22763950
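
    For context, the univariate quantities that these multivariate statistics generalise can be computed in a few lines: Cochran's Q, H^2 = Q / (k - 1), and I^2 = (H^2 - 1) / H^2. The sketch below uses simulated study estimates and standard errors; it illustrates only the univariate case, not the multivariate statistics proposed in the paper.

```python
# Sketch: Cochran's Q, H^2 and I^2 from simulated study effects and standard errors.
import numpy as np

rng = np.random.default_rng(8)
k = 12
se = rng.uniform(0.1, 0.3, k)                          # per-study standard errors
theta = rng.normal(0.4, 0.15, k) + rng.normal(0, se)   # per-study effect estimates

w = 1 / se**2                                          # inverse-variance (fixed-effect) weights
theta_fe = np.sum(w * theta) / np.sum(w)               # pooled fixed-effect estimate
Q = np.sum(w * (theta - theta_fe) ** 2)                # Cochran's heterogeneity statistic
H2 = Q / (k - 1)
I2 = max(0.0, (H2 - 1) / H2)                           # fraction of variation due to heterogeneity

print(f"Q = {Q:.2f}, H^2 = {H2:.2f}, I^2 = {100 * I2:.0f}%")
```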

  12. Test 6, Test 7, and Gas Standard Analysis Results

    NASA Technical Reports Server (NTRS)

    Perez, Horacio, III

    2007-01-01

    This viewgraph presentation shows results of analyses of odor, toxic off-gassing, and gas standards. The topics include: 1) Statistical Analysis Definitions; 2) Odor Analysis Results, NASA Standard 6001 Test 6; 3) Toxic Off-gassing Analysis Results, NASA Standard 6001 Test 7; and 4) Gas Standard Results, NASA Standard 6001 Test 7.

  13. Quantitative Analysis of Venus Radar Backscatter Data in ArcGIS

    NASA Technical Reports Server (NTRS)

    Long, S. M.; Grosfils, E. B.

    2005-01-01

    Ongoing mapping of the Ganiki Planitia (V14) quadrangle of Venus and definition of material units has involved an integrated but qualitative analysis of Magellan radar backscatter images and topography using standard geomorphological mapping techniques. However, such analyses do not take full advantage of the quantitative information contained within the images. Analysis of the backscatter coefficient allows a much more rigorous statistical comparison between mapped units, permitting first-order self-similarity tests of geographically separated materials assigned identical geomorphological labels. Such analyses cannot be performed directly on pixel (DN) values from Magellan backscatter images, because the pixels are scaled to the Muhleman law for radar echoes on Venus and are not corrected for latitudinal variations in incidence angle. Therefore, DN values must be converted, based on pixel latitude, back to their backscatter coefficient values before accurate statistical analysis can occur. Here we present a method for performing the conversions and analysis of Magellan backscatter data using commonly available ArcGIS software and illustrate the advantages of the process for geological mapping.

  14. Problem area descriptions : motor vehicle crashes - data analysis and IVI program analysis

    DOT National Transportation Integrated Search

    In general, the IVI program focuses on the more significant safety problem categories as indicated by statistical analyses of crash data. However, other factors were considered in setting program priorities and schedules. For some problem areas, ...

  15. Gait patterns for crime fighting: statistical evaluation

    NASA Astrophysics Data System (ADS)

    Sulovská, Kateřina; Bělašková, Silvie; Adámek, Milan

    2013-10-01

    Criminality has been omnipresent throughout human history. Modern technology brings novel opportunities for the identification of a perpetrator. One of these opportunities is the analysis of video recordings, which may be taken during the crime itself or before/after the crime. Video analysis can be classed as an identification analysis, that is, identification of a person via external characteristics. The analysis of bipedal locomotion characterises human movement on the basis of anatomical-physiological features. Human gait is now being tested by many laboratories to learn whether identification via bipedal locomotion is possible or not. The aim of our study is to use 2D components out of 3D data from the VICON Mocap system for deep statistical analyses. This paper introduces recent results of a fundamental study focused on various gait patterns under different conditions. The study contains data from 12 participants. Curves obtained from these measurements were sorted, averaged and statistically tested to estimate the stability and distinctiveness of this biometric. Results show satisfactory distinctiveness for some of the chosen points, while others do not show significant differences. However, the results presented in this paper come from the initial phase of further, deeper and more exacting analyses of gait patterns under different conditions.

  16. Efficiency Analysis of Public Universities in Thailand

    ERIC Educational Resources Information Center

    Kantabutra, Saranya; Tang, John C. S.

    2010-01-01

    This paper examines the performance of Thai public universities in terms of efficiency, using a non-parametric approach called data envelopment analysis. Two efficiency models, the teaching efficiency model and the research efficiency model, are developed and the analysis is conducted at the faculty level. Further statistical analyses are also…

  17. Statistical analysis of iron geochemical data suggests limited late Proterozoic oxygenation

    NASA Astrophysics Data System (ADS)

    Sperling, Erik A.; Wolock, Charles J.; Morgan, Alex S.; Gill, Benjamin C.; Kunzmann, Marcus; Halverson, Galen P.; MacDonald, Francis A.; Knoll, Andrew H.; Johnston, David T.

    2015-07-01

    Sedimentary rocks deposited across the Proterozoic-Phanerozoic transition record extreme climate fluctuations, a potential rise in atmospheric oxygen or re-organization of the seafloor redox landscape, and the initial diversification of animals. It is widely assumed that the inferred redox change facilitated the observed trends in biodiversity. Establishing this palaeoenvironmental context, however, requires that changes in marine redox structure be tracked by means of geochemical proxies and translated into estimates of atmospheric oxygen. Iron-based proxies are among the most effective tools for tracking the redox chemistry of ancient oceans. These proxies are inherently local, but have global implications when analysed collectively and statistically. Here we analyse about 4,700 iron-speciation measurements from shales 2,300 to 360 million years old. Our statistical analyses suggest that subsurface water masses in mid-Proterozoic oceans were predominantly anoxic and ferruginous (depleted in dissolved oxygen and iron-bearing), but with a tendency towards euxinia (sulfide-bearing) that is not observed in the Neoproterozoic era. Analyses further indicate that early animals did not experience appreciable benthic sulfide stress. Finally, unlike proxies based on redox-sensitive trace-metal abundances, iron geochemical data do not show a statistically significant change in oxygen content through the Ediacaran and Cambrian periods, sharply constraining the magnitude of the end-Proterozoic oxygen increase. Indeed, this re-analysis of trace-metal data is consistent with oxygenation continuing well into the Palaeozoic era. Therefore, if changing redox conditions facilitated animal diversification, it did so through a limited rise in oxygen past critical functional and ecological thresholds, as is seen in modern oxygen minimum zone benthic animal communities.

  18. Antiviral treatment of Bell's palsy based on baseline severity: a systematic review and meta-analysis.

    PubMed

    Turgeon, Ricky D; Wilby, Kyle J; Ensom, Mary H H

    2015-06-01

    We conducted a systematic review with meta-analysis to evaluate the efficacy of antiviral agents on complete recovery of Bell's palsy. We searched CENTRAL, Embase, MEDLINE, International Pharmaceutical Abstracts, and sources of unpublished literature to November 1, 2014. Primary and secondary outcomes were complete and satisfactory recovery, respectively. To evaluate statistical heterogeneity, we performed subgroup analysis of baseline severity of Bell's palsy and between-study sensitivity analyses based on risk of allocation and detection bias. The 10 included randomized controlled trials (2419 patients; 807 with severe Bell's palsy at onset) had variable risk of bias, with 9 trials having a high risk of bias in at least 1 domain. Complete recovery was not statistically significantly greater with antiviral use versus no antiviral use in the random-effects meta-analysis of 6 trials (relative risk, 1.06; 95% confidence interval, 0.97-1.16; I(2) = 65%). Conversely, random-effects meta-analysis of 9 trials showed a statistically significant difference in satisfactory recovery (relative risk, 1.10; 95% confidence interval, 1.02-1.18; I(2) = 63%). Response to antiviral agents did not differ visually or statistically between patients with severe symptoms at baseline and those with milder disease (test for interaction, P = .11). Sensitivity analyses did not show a clear effect of bias on outcomes. Antiviral agents are not efficacious in increasing the proportion of patients with Bell's palsy who achieved complete recovery, regardless of baseline symptom severity. Copyright © 2015 Elsevier Inc. All rights reserved.
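
    The pooled relative risks and I^2 values reported in this abstract come from a random-effects meta-analysis; the sketch below shows the standard DerSimonian-Laird calculation on made-up trial counts. It illustrates the method only and is not a re-analysis of the Bell's palsy trials.

```python
# Sketch: DerSimonian-Laird random-effects meta-analysis of relative risks.
import numpy as np

# events / totals in treatment and control arms for six hypothetical trials
ev_t = np.array([80, 55, 120, 60, 90, 70]); n_t = np.array([100, 80, 160, 90, 120, 100])
ev_c = np.array([75, 50, 110, 58, 85, 62]); n_c = np.array([100, 80, 160, 90, 120, 100])

log_rr = np.log((ev_t / n_t) / (ev_c / n_c))
var = 1 / ev_t - 1 / n_t + 1 / ev_c - 1 / n_c          # variance of each log relative risk

w = 1 / var
fixed = np.sum(w * log_rr) / np.sum(w)
Q = np.sum(w * (log_rr - fixed) ** 2)
k = len(log_rr)
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))  # DL between-study variance

w_re = 1 / (var + tau2)
pooled = np.sum(w_re * log_rr) / np.sum(w_re)
se = np.sqrt(1 / np.sum(w_re))
lo, hi = np.exp(pooled - 1.96 * se), np.exp(pooled + 1.96 * se)
i2 = max(0.0, (Q - (k - 1)) / Q)

print(f"RR = {np.exp(pooled):.2f} (95% CI {lo:.2f}-{hi:.2f}), I^2 = {100 * i2:.0f}%")
```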

  19. Kolmogorov-Smirnov statistical test for analysis of ZAP-70 expression in B-CLL, compared with quantitative PCR and IgV(H) mutation status.

    PubMed

    Van Bockstaele, Femke; Janssens, Ann; Piette, Anne; Callewaert, Filip; Pede, Valerie; Offner, Fritz; Verhasselt, Bruno; Philippé, Jan

    2006-07-15

    ZAP-70 has been proposed as a surrogate marker for immunoglobulin heavy-chain variable region (IgV(H)) mutation status, which is known as a prognostic marker in B-cell chronic lymphocytic leukemia (CLL). The flow cytometric analysis of ZAP-70 suffers from difficulties in standardization and interpretation. We applied the Kolmogorov-Smirnov (KS) statistical test to make analysis more straightforward. We examined ZAP-70 expression by flow cytometry in 53 patients with CLL. Analysis was performed as initially described by Crespo et al. (New England J Med 2003; 348:1764-1775) and alternatively by application of the KS statistical test comparing T cells with B cells. Receiver-operating-characteristics (ROC)-curve analyses were performed to determine the optimal cut-off values for ZAP-70 measured by the two approaches. ZAP-70 protein expression was compared with ZAP-70 mRNA expression measured by a quantitative PCR (qPCR) and with the IgV(H) mutation status. Both flow cytometric analyses correlated well with the molecular technique and proved to be of equal value in predicting the IgV(H) mutation status. Applying the KS test is reproducible, simple, straightforward, and overcomes a number of difficulties encountered in the Crespo-method. The KS statistical test is an essential part of the software delivered with modern routine analytical flow cytometers and is well suited for analysis of ZAP-70 expression in CLL. (c) 2006 International Society for Analytical Cytology.
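
    The core comparison described here, contrasting the ZAP-70 fluorescence distribution of B cells against the internal T-cell reference population with a two-sample Kolmogorov-Smirnov test, can be sketched as follows. The intensity distributions are simulated and purely illustrative; the clinical cut-off in the study was derived by ROC analysis, which is not reproduced here.

```python
# Sketch: two-sample KS test comparing simulated ZAP-70 intensities
# in CLL B cells versus the T-cell reference population.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(9)
t_cell_zap70 = rng.lognormal(mean=3.0, sigma=0.4, size=2000)   # reference population
b_cell_zap70 = rng.lognormal(mean=2.3, sigma=0.5, size=5000)   # CLL B cells

stat, p = ks_2samp(b_cell_zap70, t_cell_zap70)
print(f"KS D = {stat:.3f}, p = {p:.2e}")
# A large D (B-cell distribution far below the T-cell reference) points to ZAP-70-negative CLL.
```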

  20. Generalized Majority Logic Criterion to Analyze the Statistical Strength of S-Boxes

    NASA Astrophysics Data System (ADS)

    Hussain, Iqtadar; Shah, Tariq; Gondal, Muhammad Asif; Mahmood, Hasan

    2012-05-01

    The majority logic criterion is applicable in the evaluation process of substitution boxes used in the advanced encryption standard (AES). The performance of modified or advanced substitution boxes is predicted by processing the results of statistical analysis by the majority logic criteria. In this paper, we use the majority logic criteria to analyze some popular and prevailing substitution boxes used in encryption processes. In particular, the majority logic criterion is applied to AES, affine power affine (APA), Gray, Lui J, residue prime, S8 AES, Skipjack, and Xyi substitution boxes. The majority logic criterion is further extended into a generalized majority logic criterion which has a broader spectrum of analyzing the effectiveness of substitution boxes in image encryption applications. The integral components of the statistical analyses used for the generalized majority logic criterion are derived from results of entropy analysis, contrast analysis, correlation analysis, homogeneity analysis, energy analysis, and mean of absolute deviation (MAD) analysis.
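
    Some of the component analyses listed above can be illustrated with a short, numpy-only sketch on simulated 8-bit images: entropy and energy of the encrypted image, correlation between the plain and encrypted images, and mean of absolute deviation (MAD). The images are random stand-ins, and the co-occurrence-based contrast and homogeneity measures are omitted for brevity; this is not the authors' evaluation code.

```python
# Sketch: entropy, energy, plain-vs-cipher correlation and MAD for simulated images.
import numpy as np

rng = np.random.default_rng(10)
plain = rng.integers(0, 256, size=(256, 256))
cipher = rng.integers(0, 256, size=(256, 256))        # stand-in for an S-box-encrypted image

def entropy(img):
    p = np.bincount(img.ravel(), minlength=256) / img.size
    p = p[p > 0]
    return -(p * np.log2(p)).sum()                    # an ideal cipher image approaches 8 bits

def energy(img):
    p = np.bincount(img.ravel(), minlength=256) / img.size
    return np.sum(p ** 2)

correlation = np.corrcoef(plain.ravel(), cipher.ravel())[0, 1]   # should be near 0
mad = np.abs(plain.astype(float) - cipher.astype(float)).mean()  # mean of absolute deviation

print(f"entropy = {entropy(cipher):.3f}, energy = {energy(cipher):.5f}, "
      f"corr = {correlation:.4f}, MAD = {mad:.1f}")
```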

  1. Factor Scores, Structure and Communality Coefficients: A Primer

    ERIC Educational Resources Information Center

    Odum, Mary

    2011-01-01

    The purpose of this paper is to present an easy-to-understand primer on three important concepts of factor analysis: factor scores, structure coefficients, and communality coefficients. Given that statistical analyses are a part of a global general linear model (GLM), and utilize weights as an integral part of analyses (Thompson, 2006;…

  2. A data storage, retrieval and analysis system for endocrine research. [for Skylab

    NASA Technical Reports Server (NTRS)

    Newton, L. E.; Johnston, D. A.

    1975-01-01

    This retrieval system builds, updates, retrieves, and performs basic statistical analyses on blood, urine, and diet parameters for the M071 and M073 Skylab and Apollo experiments. This system permits data entry from cards to build an indexed sequential file. Programs are easily modified for specialized analyses.

  3. A randomised trial of adaptive pacing therapy, cognitive behaviour therapy, graded exercise, and specialist medical care for chronic fatigue syndrome (PACE): statistical analysis plan

    PubMed Central

    2013-01-01

    Background The publication of protocols by medical journals is increasingly becoming an accepted means for promoting good quality research and maximising transparency. Recently, Finfer and Bellomo have suggested the publication of statistical analysis plans (SAPs). The aim of this paper is to make public and to report in detail the planned analyses that were approved by the Trial Steering Committee in May 2010 for the principal papers of the PACE (Pacing, graded Activity, and Cognitive behaviour therapy: a randomised Evaluation) trial, a treatment trial for chronic fatigue syndrome. It illustrates planned analyses of a complex intervention trial that allows for the impact of clustering by care providers, where multiple care providers are present for each patient in some but not all arms of the trial. Results The trial design, objectives and data collection are reported. Considerations relating to blinding, samples, adherence to the protocol, stratification, centre and other clustering effects, missing data, multiplicity and compliance are described. Descriptive, interim and final analyses of the primary and secondary outcomes are then outlined. Conclusions This SAP maximises transparency, providing a record of all planned analyses, and it may be a resource for those who are developing SAPs, acting as an illustrative example for teaching and methodological research. It is not the sum of the statistical analysis sections of the principal papers, as it was completed well before the individual papers were drafted. Trial registration ISRCTN54285094 assigned 22 May 2003; First participant was randomised on 18 March 2005. PMID:24225069

  4. BRepertoire: a user-friendly web server for analysing antibody repertoire data.

    PubMed

    Margreitter, Christian; Lu, Hui-Chun; Townsend, Catherine; Stewart, Alexander; Dunn-Walters, Deborah K; Fraternali, Franca

    2018-04-14

    Antibody repertoire analysis by high-throughput sequencing is now widely used, but a persisting challenge is enabling immunologists to explore their data to discover discriminating repertoire features for their own particular investigations. Computational methods are necessary for large-scale evaluation of antibody properties. We have developed BRepertoire, a suite of user-friendly web-based software tools for large-scale statistical analyses of repertoire data. The software is able to use data preprocessed by IMGT, and performs statistical and comparative analyses with versatile plotting options. BRepertoire has been designed to operate in various modes, for example analysing sequence-specific V(D)J gene usage, discerning physico-chemical properties of the CDR regions and clustering of clonotypes. Those analyses are performed on the fly by a number of R packages and are deployed via a Shiny web platform. The user can download the analysed data in different table formats and save the generated plots as image files ready for publication. We believe BRepertoire to be a versatile analytical tool that complements experimental studies of immune repertoires. To illustrate the server's functionality, we show use cases including differential gene usage in a vaccination dataset and analysis of CDR3H properties in old and young individuals. The server is accessible under http://mabra.biomed.kcl.ac.uk/BRepertoire.

  5. Conceptual and statistical problems associated with the use of diversity indices in ecology.

    PubMed

    Barrantes, Gilbert; Sandoval, Luis

    2009-09-01

    Diversity indices, particularly the Shannon-Wiener index, have been used extensively in analyzing patterns of diversity at different geographic and ecological scales. These indices have serious conceptual and statistical problems which make comparisons of species richness or species abundances across communities nearly impossible. There is often no single statistical method that retains all the information needed to answer even a simple question. However, multivariate analyses, such as cluster analyses or multiple regressions, could be used instead of diversity indices. More complex multivariate analyses, such as Canonical Correspondence Analysis, provide very valuable information on the environmental variables associated with the presence and abundance of the species in a community. In addition, particular hypotheses about changes in species richness across localities, or changes in the abundance of one species or a group of species, can be tested using univariate, bivariate, and/or rarefaction statistical tests. The rarefaction method has proved to be robust for standardizing all samples to a common size. Even the simplest approach, such as reporting the number of species per taxonomic category, may provide more information than a diversity index value.
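
    The contrast drawn in this abstract, between collapsing a community into a single index value and standardising richness by rarefaction, can be sketched numerically as below. The abundance vector is invented for illustration; the rarefaction formula is the standard Hurlbert expectation.

```python
# Sketch: Shannon-Wiener index versus Hurlbert rarefaction for one made-up community.
import numpy as np
from scipy.special import comb

community = np.array([50, 30, 10, 5, 3, 1, 1])       # individuals per species

def shannon(counts):
    p = counts / counts.sum()
    return -(p * np.log(p)).sum()

def rarefied_richness(counts, n):
    """Expected number of species in a random subsample of n individuals."""
    N = counts.sum()
    return sum(1 - comb(N - Ni, n) / comb(N, n) for Ni in counts)

print(f"Shannon-Wiener H' = {shannon(community):.2f}")
print(f"expected richness at n = 20: {rarefied_richness(community, 20):.2f}")
```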

  6. The change of adjacent segment after cervical disc arthroplasty compared with anterior cervical discectomy and fusion: a meta-analysis of randomized controlled trials.

    PubMed

    Dong, Liang; Xu, Zhengwei; Chen, Xiujin; Wang, Dongqi; Li, Dichen; Liu, Tuanjing; Hao, Dingjun

    2017-10-01

    Many meta-analyses have been performed to study the efficacy of cervical disc arthroplasty (CDA) compared with anterior cervical discectomy and fusion (ACDF); however, there are few data on adjacent segments within these meta-analyses, and investigators have been unable to arrive at consistent conclusions in the few meta-analyses that address adjacent segments. With the increased concerns surrounding adjacent segment degeneration (ASDeg) and adjacent segment disease (ASDis) after anterior cervical surgery, it is necessary to perform a comprehensive meta-analysis to analyze adjacent segment parameters. To perform a comprehensive meta-analysis to elaborate adjacent segment motion, degeneration, disease, and reoperation after CDA compared with ACDF. Meta-analysis of randomized controlled trials (RCTs). PubMed, Embase, and the Cochrane Library were searched for RCTs comparing CDA and ACDF before May 2016. The analysis parameters included follow-up time, operative segments, adjacent segment motion, ASDeg, ASDis, and adjacent segment reoperation. The risk-of-bias scale was used to assess the papers. Subgroup analysis and sensitivity analysis were used to analyze the reasons for high heterogeneity. Twenty-nine RCTs fulfilled the inclusion criteria. Compared with ACDF, the rate of adjacent segment reoperation in the CDA group was significantly lower (p<.01), and subgroup analysis showed that the advantage of CDA in reducing adjacent segment reoperation increased with increasing follow-up time. There was no statistically significant difference in ASDeg between CDA and ACDF within the 24-month follow-up period; however, the rate of ASDeg with CDA was significantly lower than that with ACDF as follow-up time increased (p<.01). There was no statistically significant difference in ASDis between CDA and ACDF (p>.05). Cervical disc arthroplasty provided a lower adjacent segment range of motion (ROM) than did ACDF, but the difference was not statistically significant. Compared with ACDF, the advantages of CDA were lower ASDeg and adjacent segment reoperation. However, there was no statistically significant difference in ASDis and adjacent segment ROM. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. From sexless to sexy: Why it is time for human genetics to consider and report analyses of sex.

    PubMed

    Powers, Matthew S; Smith, Phillip H; McKee, Sherry A; Ehringer, Marissa A

    2017-01-01

    Science has come a long way with regard to the consideration of sex differences in clinical and preclinical research, but one field remains behind the curve: human statistical genetics. The goal of this commentary is to raise awareness and discussion about how to best consider and evaluate possible sex effects in the context of large-scale human genetic studies. Over the course of this commentary, we reinforce the importance of interpreting genetic results in the context of biological sex, establish evidence that sex differences are not being considered in human statistical genetics, and discuss how best to conduct and report such analyses. Our recommendation is to run stratified analyses by sex no matter the sample size or the result and report the findings. Summary statistics from stratified analyses are helpful for meta-analyses, and patterns of sex-dependent associations may be hidden in a combined dataset. In the age of declining sequencing costs, large consortia efforts, and a number of useful control samples, it is now time for the field of human genetics to appropriately include sex in the design, analysis, and reporting of results.

  8. A Guerilla Guide to Common Problems in ‘Neurostatistics’: Essential Statistical Topics in Neuroscience

    PubMed Central

    Smith, Paul F.

    2017-01-01

    Effective inferential statistical analysis is essential for high quality studies in neuroscience. However, recently, neuroscience has been criticised for the poor use of experimental design and statistical analysis. Many of the statistical issues confronting neuroscience are similar to other areas of biology; however, there are some that occur more regularly in neuroscience studies. This review attempts to provide a succinct overview of some of the major issues that arise commonly in the analyses of neuroscience data. These include: the non-normal distribution of the data; inequality of variance between groups; extensive correlation in data for repeated measurements across time or space; excessive multiple testing; inadequate statistical power due to small sample sizes; pseudo-replication; and an over-emphasis on binary conclusions about statistical significance as opposed to effect sizes. Statistical analysis should be viewed as just another neuroscience tool, which is critical to the final outcome of the study. Therefore, it needs to be done well and it is a good idea to be proactive and seek help early, preferably before the study even begins. PMID:29371855

  9. A Guerilla Guide to Common Problems in 'Neurostatistics': Essential Statistical Topics in Neuroscience.

    PubMed

    Smith, Paul F

    2017-01-01

    Effective inferential statistical analysis is essential for high quality studies in neuroscience. However, recently, neuroscience has been criticised for the poor use of experimental design and statistical analysis. Many of the statistical issues confronting neuroscience are similar to other areas of biology; however, there are some that occur more regularly in neuroscience studies. This review attempts to provide a succinct overview of some of the major issues that arise commonly in the analyses of neuroscience data. These include: the non-normal distribution of the data; inequality of variance between groups; extensive correlation in data for repeated measurements across time or space; excessive multiple testing; inadequate statistical power due to small sample sizes; pseudo-replication; and an over-emphasis on binary conclusions about statistical significance as opposed to effect sizes. Statistical analysis should be viewed as just another neuroscience tool, which is critical to the final outcome of the study. Therefore, it needs to be done well and it is a good idea to be proactive and seek help early, preferably before the study even begins.

  10. First Monte Carlo analysis of fragmentation functions from single-inclusive e + e - annihilation

    DOE PAGES

    Sato, Nobuo; Ethier, J. J.; Melnitchouk, W.; ...

    2016-12-02

    Here, we perform the first iterative Monte Carlo (IMC) analysis of fragmentation functions constrained by all available data from single-inclusive $e^+ e^-$ annihilation into pions and kaons. The IMC method eliminates the potential bias that traditional single-fit analyses introduce by fixing parameters not well constrained by the data, and provides a statistically rigorous determination of uncertainties. Our analysis reveals specific differences between the fragmentation functions obtained with the new IMC methodology and those obtained from previous analyses, especially for light quarks and for strange quark fragmentation to kaons.

  11. Do regional methods really help reduce uncertainties in flood frequency analyses?

    NASA Astrophysics Data System (ADS)

    Cong Nguyen, Chi; Payrastre, Olivier; Gaume, Eric

    2013-04-01

    Flood frequency analyses are often based on continuous measured series at gauge sites. However, the length of the available data sets is usually too short to provide reliable estimates of extreme design floods. To reduce the estimation uncertainties, the analyzed data sets have to be extended either in time, making use of historical and paleoflood data, or in space, merging data sets considered as statistically homogeneous to build large regional data samples. Nevertheless, the advantage of regional analyses, namely the substantial increase in the size of the studied data sets, may be counterbalanced by the possible heterogeneities of the merged sets. The application and comparison of four different flood frequency analysis methods to two regions affected by flash floods in the south of France (Ardèche and Var) illustrates how this balance between the number of records and possible heterogeneities plays out in real-world applications. The four tested methods are: (1) a local statistical analysis based on the existing series of measured discharges, (2) a local analysis making use of the existing information on historical floods, (3) a standard regional flood frequency analysis based on existing measured series at gauged sites and (4) a modified regional analysis including estimated extreme peak discharges at ungauged sites. Monte Carlo simulations are conducted to simulate a large number of discharge series with characteristics similar to the observed ones (type of statistical distributions, number of sites and records) to evaluate to what extent the results obtained on these case studies can be generalized. These two case studies indicate that even small statistical heterogeneities, which are not detected by the standard homogeneity tests implemented in regional flood frequency studies, may drastically limit the usefulness of such approaches. On the other hand, these results show that the use of information on extreme events, either historical flood events at gauged sites or estimated extremes at ungauged sites in the considered region, is an efficient way to reduce uncertainties in flood frequency studies.

  12. Meta-analyses on intra-aortic balloon pump in cardiogenic shock complicating acute myocardial infarction may provide biased results.

    PubMed

    Acconcia, M C; Caretta, Q; Romeo, F; Borzi, M; Perrone, M A; Sergi, D; Chiarotti, F; Calabrese, C M; Sili Scavalli, A; Gaudio, C

    2018-04-01

    Intra-aortic balloon pump (IABP) is the device most commonly investigated in patients with cardiogenic shock (CS) complicating acute myocardial infarction (AMI). Recently, meta-analyses on this topic have shown opposite results: some complied with current guideline recommendations, while others did not, owing to the presence of bias. We investigated the reasons for the discrepancy among meta-analyses and the strategies employed to avoid potential sources of bias. Scientific databases were searched for meta-analyses of IABP support in AMI complicated by CS. The presence of clinical diversity, methodological diversity and statistical heterogeneity was analyzed. When we found clinical or methodological diversity, we reanalyzed the data by comparing the patients reallocated into homogeneous groups. When the fixed effect model was employed despite the presence of statistical heterogeneity, the meta-analysis was repeated adopting the random effect model, with the same estimator used in the original meta-analysis. Twelve meta-analyses were selected. Six meta-analyses of randomized controlled trials (RCTs) were inconclusive because they were underpowered to detect the IABP effect. Five included RCTs and observational studies (Obs), and one included only Obs. Some meta-analyses of RCTs and Obs had biased results due to the presence of clinical and/or methodological diversity. After the data were reanalyzed in homogeneous groups, the results were no longer in conflict with guideline recommendations. Meta-analyses performed without controlling for clinical and/or methodological diversity convey a confounding message that works against good clinical practice. The reanalysis of the data demonstrates the validity of the current guideline recommendations in addressing clinical decision making on IABP support in AMI complicated by CS.

  13. Accounting for standard errors of vision-specific latent trait in regression models.

    PubMed

    Wong, Wan Ling; Li, Xiang; Li, Jialiang; Wong, Tien Yin; Cheng, Ching-Yu; Lamoureux, Ecosse L

    2014-07-11

    To demonstrate the effectiveness of the Hierarchical Bayesian (HB) approach in a modeling framework for association effects that accounts for SEs of vision-specific latent traits assessed using Rasch analysis. A systematic literature review was conducted in four major ophthalmic journals to evaluate Rasch analysis performed on vision-specific instruments. The HB approach was used to synthesize the Rasch model and multiple linear regression model for the assessment of the association effects related to vision-specific latent traits. This novel HB one-stage "joint-analysis" approach allows all model parameters to be estimated simultaneously; its effectiveness was compared in a simulation study with the frequently used two-stage "separate-analysis" approach (Rasch analysis followed by traditional statistical analyses without adjustment for the SE of the latent trait). Sixty-six reviewed articles performed evaluation and validation of vision-specific instruments using Rasch analysis, and 86.4% (n = 57) performed further statistical analyses on the Rasch-scaled data using traditional statistical methods; none took into consideration SEs of the estimated Rasch-scaled scores. On real data, the two approaches differed in their effect size estimates and in the identification of "independent risk factors." Simulation results showed that our proposed HB one-stage "joint-analysis" approach produces greater accuracy (average of 5-fold decrease in bias) with comparable power and precision in estimation of associations when compared with the frequently used two-stage "separate-analysis" procedure, despite accounting for greater uncertainty due to the latent trait. Analyses of patient-reported data using Rasch analysis techniques typically do not take into account the SE of the latent trait in association analyses. The HB one-stage "joint-analysis" is a better approach, producing accurate effect size estimations and information about the independent association of exposure variables with vision-specific latent traits. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  14. Statistical analysis of hydrological response in urbanising catchments based on adaptive sampling using inter-amount times

    NASA Astrophysics Data System (ADS)

    ten Veldhuis, Marie-Claire; Schleiss, Marc

    2017-04-01

    In this study, we introduced an alternative approach for analysis of hydrological flow time series, using an adaptive sampling framework based on inter-amount times (IATs). The main difference from conventional flow time series is the rate at which low and high flows are sampled: the unit of analysis for IATs is a fixed flow amount, instead of a fixed time window. We analysed statistical distributions of flows and IATs across a wide range of sampling scales to investigate sensitivity of statistical properties such as quantiles, variance, skewness, scaling parameters and flashiness indicators to the sampling scale. We did this based on streamflow time series for 17 (semi)urbanised basins in North Carolina, US, ranging from 13 km2 to 238 km2 in size. Results showed that adaptive sampling of flow time series based on inter-amounts leads to a more balanced representation of low flow and peak flow values in the statistical distribution. While conventional sampling gives a lot of weight to low flows, as these are most ubiquitous in flow time series, IAT sampling gives relatively more weight to high flow values, which accumulate a given flow amount in a shorter time. As a consequence, IAT sampling gives more information about the tail of the distribution associated with high flows, while conventional sampling gives relatively more information about low flow periods. We will present results of statistical analyses across a range of subdaily to seasonal scales and will highlight some interesting insights that can be derived from IAT statistics with respect to basin flashiness and the impact of urbanisation on hydrological response.
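
    To make the inter-amount-time idea concrete, the short Python sketch below (toy data, hypothetical function and variable names) resamples a regularly measured flow series by fixed flow amounts: flow is accumulated until a chosen amount has passed and the elapsed time is recorded, so peak-flow periods contribute many short inter-amount times and low-flow periods contribute a few long ones. It is a minimal illustration of the sampling concept, not the authors' implementation.

        import numpy as np

        def inter_amount_times(flow, dt, amount):
            # volume passed in each time step, then the running total
            cumulative = np.cumsum(np.asarray(flow, dtype=float) * dt)
            # successive fixed-amount thresholds and the steps at which they are first exceeded
            thresholds = np.arange(amount, cumulative[-1], amount)
            crossing_times = np.searchsorted(cumulative, thresholds) * dt
            # time elapsed between successive threshold crossings = inter-amount times
            return np.diff(np.concatenate(([0.0], crossing_times)))

        # toy series: a long low-flow period interrupted by a short peak
        flow = np.array([1.0] * 50 + [20.0] * 5 + [1.0] * 50)
        iats = inter_amount_times(flow, dt=1.0, amount=10.0)
        print("IAT samples:", iats.size, "shortest:", iats.min(), "longest:", iats.max())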

  15. Sigsearch: a new term for post hoc unplanned search for statistically significant relationships with the intent to create publishable findings.

    PubMed

    Hashim, Muhammad Jawad

    2010-09-01

    Post-hoc secondary data analysis with no prespecified hypotheses has been discouraged by textbook authors and journal editors alike. Unfortunately no single term describes this phenomenon succinctly. I would like to coin the term "sigsearch" to define this practice and bring it within the teaching lexicon of statistics courses. Sigsearch would include any unplanned, post-hoc search for statistical significance using multiple comparisons of subgroups. It would also include data analysis with outcomes other than the prespecified primary outcome measure of a study as well as secondary data analyses of earlier research.

  16. Analyzing the Validity of the Adult-Adolescent Parenting Inventory for Low-Income Populations

    ERIC Educational Resources Information Center

    Lawson, Michael A.; Alameda-Lawson, Tania; Byrnes, Edward

    2017-01-01

    Objectives: The purpose of this study was to examine the construct and predictive validity of the Adult-Adolescent Parenting Inventory (AAPI-2). Methods: The validity of the AAPI-2 was evaluated using multiple statistical methods, including exploratory factor analysis, confirmatory factor analysis, and latent class analysis. These analyses were…

  17. Coloc-stats: a unified web interface to perform colocalization analysis of genomic features.

    PubMed

    Simovski, Boris; Kanduri, Chakravarthi; Gundersen, Sveinung; Titov, Dmytro; Domanska, Diana; Bock, Christoph; Bossini-Castillo, Lara; Chikina, Maria; Favorov, Alexander; Layer, Ryan M; Mironov, Andrey A; Quinlan, Aaron R; Sheffield, Nathan C; Trynka, Gosia; Sandve, Geir K

    2018-06-05

    Functional genomics assays produce sets of genomic regions as one of their main outputs. To biologically interpret such region-sets, researchers often use colocalization analysis, where the statistical significance of colocalization (overlap, spatial proximity) between two or more region-sets is tested. Existing colocalization analysis tools vary in the statistical methodology and analysis approaches, thus potentially providing different conclusions for the same research question. As the findings of colocalization analysis are often the basis for follow-up experiments, it is helpful to use several tools in parallel and to compare the results. We developed the Coloc-stats web service to facilitate such analyses. Coloc-stats provides a unified interface to perform colocalization analysis across various analytical methods and method-specific options (e.g. colocalization measures, resolution, null models). Coloc-stats helps the user to find a method that supports their experimental requirements and allows for a straightforward comparison across methods. Coloc-stats is implemented as a web server with a graphical user interface that assists users with configuring their colocalization analyses. Coloc-stats is freely available at https://hyperbrowser.uio.no/coloc-stats/.
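
    The core of a colocalization test can be illustrated with a small Monte Carlo sketch in Python (invented toy intervals): the observed base-pair overlap between two region-sets is compared with the overlap obtained after repeatedly repositioning one set at random along the chromosome. The uniform-repositioning null model used here is deliberately naive; the tools wrapped by Coloc-stats provide far more careful null models and measures.

        import numpy as np

        rng = np.random.default_rng(0)

        def bp_overlap(a, b):
            # total base pairs shared by two lists of (start, end) intervals
            return sum(max(0, min(e1, e2) - max(s1, s2)) for s1, e1 in a for s2, e2 in b)

        def colocalization_test(set_a, set_b, chrom_len, n_perm=1000):
            observed = bp_overlap(set_a, set_b)
            lengths = [e - s for s, e in set_b]
            null = np.empty(n_perm)
            for i in range(n_perm):
                starts = [rng.integers(0, chrom_len - L) for L in lengths]  # random repositioning
                null[i] = bp_overlap(set_a, [(s, s + L) for s, L in zip(starts, lengths)])
            return observed, (np.sum(null >= observed) + 1) / (n_perm + 1)

        peaks = [(10_000, 12_000), (50_000, 51_500), (200_000, 203_000)]
        snps = [(10_500, 10_600), (202_000, 202_200), (800_000, 800_100)]
        obs, p = colocalization_test(peaks, snps, chrom_len=1_000_000)
        print(f"observed overlap = {obs} bp, empirical p = {p:.3f}")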

  18. Point-by-point compositional analysis for atom probe tomography.

    PubMed

    Stephenson, Leigh T; Ceguerra, Anna V; Li, Tong; Rojhirunsakool, Tanaporn; Nag, Soumya; Banerjee, Rajarshi; Cairney, Julie M; Ringer, Simon P

    2014-01-01

    This new alternate approach to data processing for analyses that traditionally employed grid-based counting methods is necessary because it removes a user-imposed coordinate system that not only limits an analysis but also may introduce errors. We have modified the widely used "binomial" analysis for APT data by replacing grid-based counting with coordinate-independent nearest neighbour identification, improving the measurements and the statistics obtained, allowing quantitative analysis of smaller datasets, and datasets from non-dilute solid solutions. It also allows better visualisation of compositional fluctuations in the data. Our modifications include: (i) using spherical k-atom blocks identified by each detected atom's first k nearest neighbours; (ii) 3D data visualisation of block composition and nearest neighbour anisotropy; and (iii) using z-statistics to directly compare experimental and expected composition curves. Similar modifications may be made to other grid-based counting analyses (contingency table, Langer-Bar-on-Miller, sinusoidal model) and could be instrumental in developing novel data visualisation options.
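
    The nearest-neighbour blocking and z-statistic comparison can be sketched as follows in Python (synthetic, randomly mixed atom positions; all names are illustrative). Each detected atom and its k nearest neighbours form a block, the solute count per block is compared with the binomial expectation under random mixing, and a z-statistic summarises the deviation. This is only a schematic of the idea, not the published method.

        import numpy as np
        from scipy.spatial import cKDTree

        rng = np.random.default_rng(1)
        n_atoms, solute_fraction, k = 20_000, 0.10, 30
        positions = rng.uniform(0.0, 50.0, size=(n_atoms, 3))     # nm, random solid solution
        is_solute = rng.random(n_atoms) < solute_fraction

        # block = each atom plus its k nearest neighbours (no user-imposed grid)
        tree = cKDTree(positions)
        _, neighbours = tree.query(positions, k=k + 1)             # first neighbour is the atom itself
        solute_counts = is_solute[neighbours].sum(axis=1)

        # z-statistic against the binomial expectation for a randomly mixed solution
        p0 = is_solute.mean()
        z = (solute_counts - (k + 1) * p0) / np.sqrt((k + 1) * p0 * (1 - p0))
        print("fraction of blocks with |z| > 3:", np.mean(np.abs(z) > 3))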

  19. Bioconductor Workflow for Microbiome Data Analysis: from raw reads to community analyses

    PubMed Central

    Callahan, Ben J.; Sankaran, Kris; Fukuyama, Julia A.; McMurdie, Paul J.; Holmes, Susan P.

    2016-01-01

    High-throughput sequencing of PCR-amplified taxonomic markers (like the 16S rRNA gene) has enabled a new level of analysis of complex bacterial communities known as microbiomes. Many tools exist to quantify and compare abundance levels or OTU composition of communities in different conditions. The sequencing reads have to be denoised and assigned to the closest taxa from a reference database. Common approaches use a notion of 97% similarity and normalize the data by subsampling to equalize library sizes. In this paper, we show that statistical models allow more accurate abundance estimates. By providing a complete workflow in R, we enable the user to do sophisticated downstream statistical analyses, whether parametric or nonparametric. We provide examples of using the R packages dada2, phyloseq, DESeq2, ggplot2 and vegan to filter, visualize and test microbiome data. We also provide examples of supervised analyses using random forests and nonparametric testing using community networks and the ggnetwork package. PMID:27508062

  20. Data on xylem sap proteins from Mn- and Fe-deficient tomato plants obtained using shotgun proteomics.

    PubMed

    Ceballos-Laita, Laura; Gutierrez-Carbonell, Elain; Takahashi, Daisuke; Abadía, Anunciación; Uemura, Matsuo; Abadía, Javier; López-Millán, Ana Flor

    2018-04-01

    This article contains consolidated proteomic data obtained from xylem sap collected from tomato plants grown in Fe- and Mn-sufficient control conditions, as well as in Fe-deficient and Mn-deficient conditions. Data presented here cover proteins identified and quantified by shotgun proteomics and Progenesis LC-MS analyses: proteins identified with at least two peptides and showing statistically significant changes (ANOVA; p ≤ 0.05) above a biologically relevant threshold (fold change ≥ 2) between treatments are listed. The comparison between Fe-deficient, Mn-deficient and control xylem sap samples using a multivariate statistical data analysis (Principal Component Analysis, PCA) is also included. Data included in this article are discussed in depth in the research article entitled "Effects of Fe and Mn deficiencies on the protein profiles of tomato (Solanum lycopersicum) xylem sap as revealed by shotgun analyses" [1]. This dataset is made available to support the cited study as well as to extend analyses at a later stage.

  1. Extreme between-study homogeneity in meta-analyses could offer useful insights.

    PubMed

    Ioannidis, John P A; Trikalinos, Thomas A; Zintzaras, Elias

    2006-10-01

    Meta-analyses are routinely evaluated for the presence of large between-study heterogeneity. We examined whether it is also important to probe whether there is extreme between-study homogeneity. We used heterogeneity tests with left-sided statistical significance for inference and developed a Monte Carlo simulation test for testing extreme homogeneity in risk ratios across studies, using the empiric distribution of the summary risk ratio and heterogeneity statistic. A left-sided P=0.01 threshold was set for claiming extreme homogeneity to minimize type I error. Among 11,803 meta-analyses with binary contrasts from the Cochrane Library, 143 (1.21%) had left-sided P-value <0.01 for the asymptotic Q statistic and 1,004 (8.50%) had left-sided P-value <0.10. The frequency of extreme between-study homogeneity did not depend on the number of studies in the meta-analyses. We identified examples where extreme between-study homogeneity (left-sided P-value <0.01) could result from various possibilities beyond chance. These included inappropriate statistical inference (asymptotic vs. Monte Carlo), use of a specific effect metric, correlated data or stratification using strong predictors of outcome, and biases and potential fraud. Extreme between-study homogeneity may provide useful insights about a meta-analysis and its constituent studies.
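
    The left-sided Monte Carlo idea can be sketched in Python (invented study data): Cochran's Q is computed for the observed log risk ratios, many datasets are then re-simulated under exact homogeneity with the observed within-study standard errors, and the left-sided p-value is the fraction of simulated Q values at or below the observed one. This is a simplified sketch of the general approach rather than the authors' exact procedure.

        import numpy as np

        rng = np.random.default_rng(42)

        def cochran_q(y, se):
            w = 1.0 / se**2
            mu = np.sum(w * y) / np.sum(w)            # fixed-effect summary estimate
            return np.sum(w * (y - mu) ** 2)

        def left_sided_homogeneity_p(y, se, n_sim=10_000):
            q_obs = cochran_q(y, se)
            w = 1.0 / se**2
            mu_hat = np.sum(w * y) / np.sum(w)
            # re-simulate each study around the common effect (exact homogeneity)
            q_sim = np.array([cochran_q(rng.normal(mu_hat, se), se) for _ in range(n_sim)])
            return q_obs, np.mean(q_sim <= q_obs)

        log_rr = np.array([0.10, 0.11, 0.09, 0.10, 0.105])   # suspiciously similar study results
        se = np.array([0.20, 0.25, 0.22, 0.30, 0.24])
        q, p_left = left_sided_homogeneity_p(log_rr, se)
        print(f"Q = {q:.3f}, left-sided Monte Carlo p = {p_left:.3f}")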

  2. Metal and physico-chemical variations at a hydroelectric reservoir analyzed by Multivariate Analyses and Artificial Neural Networks: environmental management and policy/decision-making tools.

    PubMed

    Cavalcante, Y L; Hauser-Davis, R A; Saraiva, A C F; Brandão, I L S; Oliveira, T F; Silveira, A M

    2013-01-01

    This paper compared and evaluated seasonal variations in physico-chemical parameters and metals at a hydroelectric power station reservoir by applying Multivariate Analyses and Artificial Neural Networks (ANN) statistical techniques. A Factor Analysis was used to reduce the number of variables: the first factor was composed of elements Ca, K, Mg and Na, and the second by Chemical Oxygen Demand. The ANN showed 100% correct classifications in training and validation samples. Physico-chemical analyses showed that water pH values were not statistically different between the dry and rainy seasons, while temperature, conductivity, alkalinity, ammonia and DO were higher in the dry period. TSS, hardness and COD, on the other hand, were higher during the rainy season. The statistical analyses showed that Ca, K, Mg and Na are directly connected to the Chemical Oxygen Demand, which indicates a possibility of their input into the reservoir system by domestic sewage and agricultural run-offs. These statistical applications, thus, are also relevant in cases of environmental management and policy decision-making processes, to identify which factors should be further studied and/or modified to recover degraded or contaminated water bodies. Copyright © 2012 Elsevier B.V. All rights reserved.

  3. Status of selected ion flow tube MS: accomplishments and challenges in breath analysis and other areas.

    PubMed

    Smith, David; Španěl, Patrik

    2016-06-01

    This article reflects our observations of recent accomplishments made using selected ion flow tube MS (SIFT-MS). Only brief descriptions are given of SIFT-MS as an analytical method and of the recent extensions to the underpinning analytical ion chemistry required to realize more robust analyses. The challenge of breath analysis is given special attention because, when achieved, it renders analysis of other air media relatively straightforward. Brief overviews are given of recent SIFT-MS breath analyses by leading research groups, noting the desirability of detection and quantification of single volatile biomarkers rather than reliance on statistical analyses, if breath analysis is to be accepted into clinical practice. A 'strengths, weaknesses, opportunities and threats' analysis of SIFT-MS is made, which should help to increase its utility for trace gas analysis.

  4. Statistical power analysis in wildlife research

    USGS Publications Warehouse

    Steidl, R.J.; Hayes, J.P.

    1997-01-01

    Statistical power analysis can be used to increase the efficiency of research efforts and to clarify research results. Power analysis is most valuable in the design or planning phases of research efforts. Such prospective (a priori) power analyses can be used to guide research design and to estimate the number of samples necessary to achieve a high probability of detecting biologically significant effects. Retrospective (a posteriori) power analysis has been advocated as a method to increase information about hypothesis tests that were not rejected. However, estimating power for tests of null hypotheses that were not rejected with the effect size observed in the study is incorrect; these power estimates will always be ≤0.50 when bias adjusted and have no relation to true power. Therefore, retrospective power estimates based on the observed effect size for hypothesis tests that were not rejected are misleading; retrospective power estimates are only meaningful when based on effect sizes other than the observed effect size, such as those effect sizes hypothesized to be biologically significant. Retrospective power analysis can be used effectively to estimate the number of samples or effect size that would have been necessary for a completed study to have rejected a specific null hypothesis. Simply presenting confidence intervals can provide additional information about null hypotheses that were not rejected, including information about the size of the true effect and whether or not there is adequate evidence to 'accept' a null hypothesis as true. We suggest that (1) statistical power analyses be routinely incorporated into research planning efforts to increase their efficiency, (2) confidence intervals be used in lieu of retrospective power analyses for null hypotheses that were not rejected to assess the likely size of the true effect, (3) minimum biologically significant effect sizes be used for all power analyses, and (4) if retrospective power estimates are to be reported, then the α-level, effect sizes, and sample sizes used in calculations must also be reported.
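
    A prospective power analysis of the kind recommended here takes only a few lines; the Python sketch below (statsmodels, assumed design values) solves for the per-group sample size needed to detect a minimum biologically significant standardized effect with a two-sample t-test, and then reports the power achieved by a planned sample size. Effect size, alpha, and sample sizes are placeholders.

        from statsmodels.stats.power import TTestIndPower

        analysis = TTestIndPower()
        effect_size = 0.5   # minimum biologically significant effect (Cohen's d), assumed

        # a priori: sample size per group for 80% power at alpha = 0.05
        n_per_group = analysis.solve_power(effect_size=effect_size, power=0.80, alpha=0.05)
        print(f"required n per group: {n_per_group:.1f}")

        # power actually achieved with a planned 30 samples per group
        achieved = analysis.power(effect_size=effect_size, nobs1=30, alpha=0.05)
        print(f"power with n = 30 per group: {achieved:.2f}")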

  5. A decade of individual participant data meta-analyses: A review of current practice.

    PubMed

    Simmonds, Mark; Stewart, Gavin; Stewart, Lesley

    2015-11-01

    Individual participant data (IPD) systematic reviews and meta-analyses are often considered to be the gold standard for meta-analysis. In the ten years since the first review into the methodology and reporting practice of IPD reviews was published, much has changed in the field. This paper investigates current reporting and statistical practice in IPD systematic reviews. A systematic review was performed to identify systematic reviews that collected and analysed IPD. Data were extracted from each included publication on a variety of issues related to the reporting of the IPD review process, and the statistical methods used. There has been considerable growth in the use of "one-stage" methods to perform IPD meta-analyses. The majority of reviews consider at least one covariate other than the primary intervention, either using subgroup analysis or including covariates in one-stage regression models. Random-effects analyses, however, are not often used. Reporting of review methods was often limited, with few reviews presenting a risk-of-bias assessment. Details on issues specific to the use of IPD were seldom reported, including how IPD were obtained; how data were managed and checked for consistency and errors; and for how many studies and participants IPD were sought and obtained. While the last ten years have seen substantial changes in how IPD meta-analyses are performed, there remains considerable scope for improving the quality of reporting for both the process of IPD systematic reviews and the statistical methods employed in them. It is to be hoped that the publication of the PRISMA-IPD guidelines specific to IPD reviews will improve reporting in this area. Copyright © 2015 Elsevier Inc. All rights reserved.
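
    The contrast between the two-stage and one-stage IPD approaches mentioned above can be sketched roughly in Python (statsmodels, simulated trial data): the two-stage route estimates a treatment effect within each study and pools the estimates by inverse-variance weighting, whereas the one-stage route fits a single regression to all participants with study-specific intercepts. The fixed study intercepts used here are a simple stand-in for the random-effects models typically preferred.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(7)

        # simulate IPD from six trials sharing a true treatment effect of -2.0
        rows = []
        for study in range(6):
            n = int(rng.integers(40, 120))
            treat = rng.integers(0, 2, n)
            y = 50 + rng.normal(0, 3) - 2.0 * treat + rng.normal(0, 5, n)
            rows.append(pd.DataFrame({"study": study, "treat": treat, "y": y}))
        ipd = pd.concat(rows, ignore_index=True)

        # two-stage: per-study estimates, then inverse-variance pooling
        est, var = [], []
        for _, d in ipd.groupby("study"):
            fit = smf.ols("y ~ treat", data=d).fit()
            est.append(fit.params["treat"]); var.append(fit.bse["treat"] ** 2)
        w = 1.0 / np.array(var)
        two_stage = np.sum(w * np.array(est)) / np.sum(w)

        # one-stage: all participants in a single model with study intercepts
        one_stage = smf.ols("y ~ treat + C(study)", data=ipd).fit().params["treat"]
        print(f"two-stage pooled effect: {two_stage:.2f}, one-stage effect: {one_stage:.2f}")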

  6. Evaluation of a weighted test in the analysis of ordinal gait scores in an additivity model for five OP pesticides.

    EPA Science Inventory

    Appropriate statistical analyses are critical for evaluating interactions of mixtures with a common mode of action, as is often the case for cumulative risk assessments. Our objective is to develop analyses for use when a response variable is ordinal, and to test for interaction...

  7. School Finance Court Cases and Disparate Racial Impact: The Contribution of Statistical Analysis in New York

    ERIC Educational Resources Information Center

    Stiefel, Leanna; Schwartz, Amy Ellen; Berne, Robert; Chellman, Colin C.

    2005-01-01

    Although analyses of state school finance systems rarely focus on the distribution of funds to students of different races, the advent of racial discrimination as an issue in school finance court cases may change that situation. In this article, we describe the background, analyses, and results of plaintiffs' testimony regarding racial…

  8. Differences in reporting of analyses in internal company documents versus published trial reports: comparisons in industry-sponsored trials in off-label uses of gabapentin.

    PubMed

    Vedula, S Swaroop; Li, Tianjing; Dickersin, Kay

    2013-01-01

    Details about the type of analysis (e.g., intent to treat [ITT]) and definitions (i.e., criteria for including participants in the analysis) are necessary for interpreting a clinical trial's findings. Our objective was to compare the description of types of analyses and criteria for including participants in the publication (i.e., what was reported) with descriptions in the corresponding internal company documents (i.e., what was planned and what was done). Trials were for off-label uses of gabapentin sponsored by Pfizer and Parke-Davis, and documents were obtained through litigation. For each trial, we compared internal company documents (protocols, statistical analysis plans, and research reports, all unpublished) with publications. One author extracted data and another verified them, with a third person checking discordant items and a sample of the rest. Extracted data included the number of participants randomized and analyzed for efficacy, and types of analyses for efficacy and safety and their definitions (i.e., criteria for including participants in each type of analysis). We identified 21 trials; 11 were published randomized controlled trials that provided the documents needed for planned comparisons. For three trials, there was disagreement on the number of randomized participants between the research report and publication. Seven types of efficacy analyses were described in the protocols, statistical analysis plans, and publications, including ITT and six others. The protocol or publication described ITT using six different definitions, resulting in frequent disagreements between the two documents (i.e., different numbers of participants were included in the analyses). Descriptions of analyses conducted did not agree between internal company documents and what was publicly reported. Internal company documents provide extensive documentation of methods planned and used, and trial findings, and should be publicly accessible. Reporting standards for randomized controlled trials should recommend transparent descriptions and definitions of the analyses performed and of which study participants were excluded.

  9. PIXE analysis of elements in gastric cancer and adjacent mucosa

    NASA Astrophysics Data System (ADS)

    Liu, Qixin; Zhong, Ming; Zhang, Xiaofeng; Yan, Lingnuo; Xu, Yongling; Ye, Simao

    1990-04-01

    The elemental regional distributions in 20 resected human stomach tissues were obtained using PIXE analysis. The samples were pathologically divided into four types: normal, adjacent mucosa A, adjacent mucosa B and cancer. The targets for PIXE analysis were prepared by wet digestion with a pressure bomb system. P, K, Fe, Cu, Zn and Se were measured and statistically analysed. We found significantly higher concentrations of P, K, Cu and Zn, and a higher Cu/Zn ratio, in cancer tissue than in normal tissue, but no statistically significant difference between adjacent mucosa and cancer tissue.

  10. Statistical analysis of an inter-laboratory comparison of small-scale safety and thermal testing of RDX

    DOE PAGES

    Brown, Geoffrey W.; Sandstrom, Mary M.; Preston, Daniel N.; ...

    2014-11-17

    In this study, the Integrated Data Collection Analysis (IDCA) program has conducted a proficiency test for small-scale safety and thermal (SSST) testing of homemade explosives (HMEs). Described here are statistical analyses of the results from this test for impact, friction, electrostatic discharge, and differential scanning calorimetry analysis of the RDX Class 5 Type II standard. The material was tested as a well-characterized standard several times during the proficiency test to assess differences among participants and the range of results that may arise for well-behaved explosive materials.

  11. A General Framework for Power Analysis to Detect the Moderator Effects in Two- and Three-Level Cluster Randomized Trials

    ERIC Educational Resources Information Center

    Dong, Nianbo; Spybrook, Jessaca; Kelcey, Ben

    2016-01-01

    The purpose of this study is to propose a general framework for power analyses to detect the moderator effects in two- and three-level cluster randomized trials (CRTs). The study specifically aims to: (1) develop the statistical formulations for calculating statistical power, minimum detectable effect size (MDES) and its confidence interval to…

  12. Using Markov Chain Analyses in Counselor Education Research

    ERIC Educational Resources Information Center

    Duys, David K.; Headrick, Todd C.

    2004-01-01

    This study examined the efficacy of an infrequently used statistical analysis in counselor education research. A Markov chain analysis was used to examine hypothesized differences between students' use of counseling skills in an introductory course. Thirty graduate students participated in the study. Independent raters identified the microskills…
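
    The mechanics of such a Markov chain analysis can be sketched in Python with invented skill codes: transitions between consecutive codes are tallied into a matrix and each row is normalised into transition probabilities, which can then be compared across groups or courses. All codes and sequences below are illustrative only.

        import numpy as np

        def transition_matrix(sequences, states):
            # count transitions between consecutive codes, then normalise each row
            idx = {s: i for i, s in enumerate(states)}
            counts = np.zeros((len(states), len(states)))
            for seq in sequences:
                for a, b in zip(seq[:-1], seq[1:]):
                    counts[idx[a], idx[b]] += 1
            rows = counts.sum(axis=1, keepdims=True)
            return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

        # toy coded skill sequences (O = open question, R = reflection, S = summary)
        states = ["O", "R", "S"]
        sequences = [["O", "R", "R", "S", "O", "R"],
                     ["O", "O", "R", "S", "S"],
                     ["R", "O", "R", "R", "S"]]
        P = transition_matrix(sequences, states)
        print(np.round(P, 2))   # entry (i, j) = P(next skill = j | current skill = i)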

  13. An Analysis of Methods Used to Examine Gender Differences in Computer-Related Behavior.

    ERIC Educational Resources Information Center

    Kay, Robin

    1992-01-01

    Review of research investigating gender differences in computer-related behavior examines statistical and methodological flaws. Issues addressed include sample selection, sample size, scale development, scale quality, the use of univariate and multivariate analyses, regressional analysis, construct definition, construct testing, and the…

  14. Reporting Practices and Use of Quantitative Methods in Canadian Journal Articles in Psychology.

    PubMed

    Counsell, Alyssa; Harlow, Lisa L

    2017-05-01

    With recent focus on the state of research in psychology, it is essential to assess the nature of the statistical methods and analyses used and reported by psychological researchers. To that end, we investigated the prevalence of different statistical procedures and the nature of statistical reporting practices in recent articles from the four major Canadian psychology journals. The majority of authors evaluated their research hypotheses through the use of analysis of variance (ANOVA), t -tests, and multiple regression. Multivariate approaches were less common. Null hypothesis significance testing remains a popular strategy, but the majority of authors reported a standardized or unstandardized effect size measure alongside their significance test results. Confidence intervals on effect sizes were infrequently employed. Many authors provided minimal details about their statistical analyses and less than a third of the articles presented on data complications such as missing data and violations of statistical assumptions. Strengths of and areas needing improvement for reporting quantitative results are highlighted. The paper concludes with recommendations for how researchers and reviewers can improve comprehension and transparency in statistical reporting.

  15. The response of numerical weather prediction analysis systems to FGGE 2b data

    NASA Technical Reports Server (NTRS)

    Hollingsworth, A.; Lorenc, A.; Tracton, S.; Arpe, K.; Cats, G.; Uppala, S.; Kallberg, P.

    1985-01-01

    An intercomparison of analyses of the main FGGE Level IIb data set is presented with three advanced analysis systems. The aims of the work are to estimate the extent and magnitude of the differences between the analyses, to identify the reasons for the differences, and finally to estimate the significance of the differences. Only extratropical analyses are considered. Objective evaluations of analysis quality, such as fit to observations, statistics of analysis differences, and mean fields, are discussed. In addition, substantial emphasis is placed on subjective evaluation of a series of case studies that were selected to illustrate the importance of different aspects of the analysis procedures, such as quality control, data selection, resolution, dynamical balance, and the role of the assimilating forecast model. In some cases, the forecast models are used as selective amplifiers of analysis differences to assist in deciding which analysis was more nearly correct in the treatment of particular data.

  16. Logistic regression applied to natural hazards: rare event logistic regression with replications

    NASA Astrophysics Data System (ADS)

    Guns, M.; Vanacker, V.

    2012-06-01

    Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, called rare event logistic regression with replications, combines the strengths of probabilistic and statistical methods and allows some of the limitations of previous developments to be overcome through robust variable selection. The technique was developed here for the analysis of landslide controlling factors, but the concept is widely applicable to statistical analyses of natural hazards.
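
    A schematic Python version of the replication idea (statsmodels, synthetic landslide-style data; not the authors' algorithm): because rare-event results can be strongly sample-dependent, the logistic regression is refitted on many random subsamples of the non-event observations while all events are retained, and each candidate factor is judged by how often it comes out statistically significant across replications.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(3)

        # synthetic inventory: rare events driven by two real factors, plus one noise factor
        n = 5000
        X = rng.normal(size=(n, 3))
        prob = 1.0 / (1.0 + np.exp(-(-4.5 + 1.2 * X[:, 0] + 0.8 * X[:, 1])))
        y = rng.random(n) < prob

        events, non_events = np.where(y)[0], np.where(~y)[0]
        n_rep, selected = 200, np.zeros(3)
        for _ in range(n_rep):
            controls = rng.choice(non_events, size=5 * events.size, replace=False)
            sample = np.concatenate([events, controls])
            fit = sm.Logit(y[sample].astype(float), sm.add_constant(X[sample])).fit(disp=0)
            selected += (fit.pvalues[1:] < 0.05)      # tally significant factors, intercept excluded

        print("selection frequency per factor:", selected / n_rep)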

  17. Statistical methods for meta-analyses including information from studies without any events-add nothing to nothing and succeed nevertheless.

    PubMed

    Kuss, O

    2015-03-30

    Meta-analyses with rare events, especially those that include studies with no event in one ('single-zero') or even both ('double-zero') treatment arms, are still a statistical challenge. In the case of double-zero studies, researchers generally delete these studies or use continuity corrections to avoid them. A number of arguments against both options have been given, and statistical methods that use the information from double-zero studies without using continuity corrections have been proposed. In this paper, we collect them and compare them by simulation. This simulation study tries to mirror real-life situations as completely as possible by deriving true underlying parameters from empirical data on actually performed meta-analyses. It is shown that for each of the commonly encountered effect estimators, valid statistical methods are available that use the information from double-zero studies without using continuity corrections. Interestingly, all of them are truly random effects models, and so even the current standard method for very sparse data recommended by the Cochrane Collaboration, the Yusuf-Peto odds ratio, can be improved on. For actual analysis, we recommend using beta-binomial regression methods to arrive at summary estimates for the odds ratio, the relative risk, or the risk difference. Methods that ignore information from double-zero studies or use continuity corrections should no longer be used. We illustrate the situation with an example where the original analysis ignores 35 double-zero studies, and a superior analysis discovers a clinically relevant advantage of off-pump surgery in coronary artery bypass grafting. Copyright © 2014 John Wiley & Sons, Ltd.
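
    To make the beta-binomial recommendation concrete, the Python sketch below (scipy; invented data that include double-zero studies) fits a separate beta-binomial distribution to each treatment arm by maximum likelihood and derives a summary odds ratio from the two marginal event probabilities. This deliberately simplified marginal version is for illustration only; the models recommended in the paper are fitted jointly as beta-binomial regressions.

        import numpy as np
        from scipy.optimize import minimize
        from scipy.special import betaln

        def neg_loglik(params, events, n):
            a, b = np.exp(params)                       # keep both shape parameters positive
            # beta-binomial log-likelihood up to a constant not involving a and b
            return -np.sum(betaln(events + a, n - events + b) - betaln(a, b))

        def marginal_risk(events, n):
            res = minimize(neg_loglik, x0=[0.0, 2.0], args=(events, n), method="Nelder-Mead")
            a, b = np.exp(res.x)
            return a / (a + b)

        # invented meta-analysis, including two double-zero studies (they still contribute)
        events_t = np.array([0, 1, 0, 2, 0, 3]); n_t = np.array([25, 40, 30, 60, 20, 80])
        events_c = np.array([0, 4, 0, 5, 1, 7]); n_c = np.array([25, 42, 28, 55, 22, 78])

        p_t, p_c = marginal_risk(events_t, n_t), marginal_risk(events_c, n_c)
        summary_or = (p_t / (1 - p_t)) / (p_c / (1 - p_c))
        print(f"treatment risk {p_t:.3f}, control risk {p_c:.3f}, summary OR {summary_or:.2f}")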

  18. Characteristics of genomic signatures derived using univariate methods and mechanistically anchored functional descriptors for predicting drug- and xenobiotic-induced nephrotoxicity.

    PubMed

    Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J

    2008-01-01

    The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, Hotelling T-square test, and, finally out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps-sets of genes coordinately involved in key biological processes-with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.

  19. Metaprop: a Stata command to perform meta-analysis of binomial data.

    PubMed

    Nyaga, Victoria N; Arbyn, Marc; Aerts, Marc

    2014-01-01

    Meta-analyses have become an essential tool in synthesizing evidence on clinical and epidemiological questions derived from a multitude of similar studies assessing the particular issue. Appropriate and accessible statistical software is needed to produce the summary statistic of interest. Metaprop is a statistical program implemented to perform meta-analyses of proportions in Stata. It builds further on the existing Stata procedure metan which is typically used to pool effects (risk ratios, odds ratios, differences of risks or means) but which is also used to pool proportions. Metaprop implements procedures which are specific to binomial data and allows computation of exact binomial and score test-based confidence intervals. It provides appropriate methods for dealing with proportions close to or at the margins where the normal approximation procedures often break down, by use of the binomial distribution to model the within-study variability or by allowing Freeman-Tukey double arcsine transformation to stabilize the variances. Metaprop was applied on two published meta-analyses: 1) prevalence of HPV-infection in women with a Pap smear showing ASC-US; 2) cure rate after treatment for cervical precancer using cold coagulation. The first meta-analysis showed a pooled HPV-prevalence of 43% (95% CI: 38%-48%). In the second meta-analysis, the pooled percentage of cured women was 94% (95% CI: 86%-97%). By using metaprop, no studies with 0% or 100% proportions were excluded from the meta-analysis. Furthermore, study specific and pooled confidence intervals always were within admissible values, contrary to the original publication, where metan was used.
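
    The Freeman-Tukey double arcsine pooling that metaprop exposes can be sketched in Python (numpy, toy data). Fixed-effect weights and a simple sin^2 back-transformation are used here purely for illustration; published implementations typically also offer random-effects weighting and an exact back-transformation.

        import numpy as np

        def freeman_tukey(x, n):
            # double arcsine transform of a proportion x/n; defined even for 0% and 100% studies
            return np.arcsin(np.sqrt(x / (n + 1))) + np.arcsin(np.sqrt((x + 1) / (n + 1)))

        # toy data: events and sample sizes, including a 0% and a 100% study
        x = np.array([0, 3, 12, 20, 20])
        n = np.array([15, 25, 40, 60, 20])

        t = freeman_tukey(x, n)
        w = n + 0.5                              # inverse of the approximate variance 1/(n + 0.5)
        t_pooled = np.sum(w * t) / np.sum(w)

        p_pooled = np.sin(t_pooled / 2.0) ** 2   # simple back-transformation to a proportion
        print(f"pooled proportion (approx.): {p_pooled:.3f}")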

  20. Single-case research design in pediatric psychology: considerations regarding data analysis.

    PubMed

    Cohen, Lindsey L; Feinstein, Amanda; Masuda, Akihiko; Vowles, Kevin E

    2014-03-01

    Single-case research allows for an examination of behavior and can demonstrate the functional relation between intervention and outcome in pediatric psychology. This review highlights key assumptions, methodological and design considerations, and options for data analysis. Single-case methodology and guidelines are reviewed with an in-depth focus on visual and statistical analyses. Guidelines allow for the careful evaluation of design quality and visual analysis. A number of statistical techniques have been introduced to supplement visual analysis, but to date, there is no consensus on their recommended use in single-case research design. Single-case methodology is invaluable for advancing pediatric psychology science and practice, and guidelines have been introduced to enhance the consistency, validity, and reliability of these studies. Experts generally agree that visual inspection is the optimal method of analysis in single-case design; however, statistical approaches are becoming increasingly evaluated and used to augment data interpretation.

  1. Statistical analysis of major ion and trace element geochemistry of water, 1986-2006, at seven wells transecting the freshwater/saline-water interface of the Edwards Aquifer, San Antonio, Texas

    USGS Publications Warehouse

    Mahler, Barbara J.

    2008-01-01

    The statistical analyses taken together indicate that the geochemistry at the freshwater-zone wells is more variable than that at the transition-zone wells. The geochemical variability at the freshwater-zone wells might result from dilution of ground water by meteoric water. This is indicated by relatively constant major ion molar ratios; a preponderance of positive correlations between SC, major ions, and trace elements; and a principal components analysis in which the major ions are strongly loaded on the first principal component. Much of the variability at three of the four transition-zone wells might result from the use of different laboratory analytical methods or reporting procedures during the period of sampling. This is reflected by a lack of correlation between SC and major ion concentrations at the transition-zone wells and by a principal components analysis in which the variability is fairly evenly distributed across several principal components. The statistical analyses further indicate that, although the transition-zone wells are less well connected to surficial hydrologic conditions than the freshwater-zone wells, there is some connection but the response time is longer. 

  2. Considerations in the statistical analysis of clinical trials in periodontitis.

    PubMed

    Imrey, P B

    1986-05-01

    Adult periodontitis has been described as a chronic infectious process exhibiting sporadic, acute exacerbations which cause quantal, localized losses of dental attachment. Many analytic problems of periodontal trials are similar to those of other chronic diseases. However, the episodic, localized, infrequent, and relatively unpredictable behavior of exacerbations, coupled with measurement error difficulties, causes some specific problems. Considerable controversy exists as to the proper selection and treatment of multiple site data from the same patient for group comparisons for epidemiologic or therapeutic evaluative purposes. This paper comments, with varying degrees of emphasis, on several issues pertinent to the analysis of periodontal trials. Considerable attention is given to the ways in which measurement variability may distort analytic results. Statistical treatments of multiple site data for descriptive summaries are distinguished from treatments for formal statistical inference to validate therapeutic effects. Evidence suggesting that sites behave independently is contested. For inferential analyses directed at therapeutic or preventive effects, analytic models based on site independence are deemed unsatisfactory. Methods of summarization that may yield more powerful analyses than all-site mean scores, while retaining appropriate treatment of inter-site associations, are suggested. Brief comments and opinions on an assortment of other issues in clinical trial analysis are offered.

  3. Implementing informative priors for heterogeneity in meta-analysis using meta-regression and pseudo data.

    PubMed

    Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T

    2016-12-20

    Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
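
    The benefit of external evidence on heterogeneity can be illustrated with a small grid-approximation sketch in Python (numpy/scipy, invented data, assumed prior hyperparameters): the between-study variance is given an informative inverse-gamma prior, its posterior is evaluated on a grid after integrating out the mean effect analytically under a flat prior, and the summary effect averages over that posterior. This illustrates only the general idea of an informative heterogeneity prior; it is not the data-augmentation and meta-regression method proposed in the paper.

        import numpy as np
        from scipy.stats import invgamma

        # invented meta-analysis of three small studies: effects and within-study variances
        y = np.array([0.30, 0.10, 0.55])
        v = np.array([0.04, 0.09, 0.12])

        prior = invgamma(a=2.0, scale=0.10)          # assumed informative prior for tau^2

        tau2_grid = np.linspace(1e-4, 1.0, 2000)
        log_post = np.empty_like(tau2_grid)
        mu_hat = np.empty_like(tau2_grid)
        for i, t2 in enumerate(tau2_grid):
            w = 1.0 / (v + t2)
            mu_hat[i] = np.sum(w * y) / np.sum(w)
            # marginal log-likelihood of the data given tau^2, mean integrated out (flat prior)
            log_lik = 0.5 * np.sum(np.log(w)) - 0.5 * np.log(np.sum(w)) \
                      - 0.5 * np.sum(w * (y - mu_hat[i]) ** 2)
            log_post[i] = log_lik + prior.logpdf(t2)

        post = np.exp(log_post - log_post.max())
        post /= post.sum()
        print("posterior mean tau^2:", float(np.sum(post * tau2_grid)))
        print("posterior mean summary effect:", float(np.sum(post * mu_hat)))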

  4. The effect of ion-exchange purification on the determination of plutonium at the New Brunswick Laboratory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mitchell, W.G.; Spaletto, M.I.; Lewis, K.

    The method of plutonium (Pu) determination at the New Brunswick Laboratory (NBL) consists of a combination of ion-exchange purification followed by controlled-potential coulometric analysis (IE/CPC). The present report's purpose is to quantify any detectable Pu loss occurring in the ion-exchange (IE) purification step which would cause a negative bias in the NBL method for Pu analysis. The magnitude of any such loss would be contained within the reproducibility (0.05%) of the IE/CPC method which utilizes a state-of-the-art autocoulometer developed at NBL. When the NBL IE/CPC method is used for Pu analysis, any loss in ion-exchange purification (<0.05%) is confounded with the repeatability of the ion-exchange and the precision of the CPC analysis technique (<0.05%). Consequently, to detect a bias in the IE/CPC method due to the IE alone using the IE/CPC method itself requires that many randomized analyses on a single material be performed over time and that statistical analysis of the data be performed. The initial approach described in this report to quantify any IE loss was an independent method, Isotope Dilution Mass Spectrometry; however, the number of analyses performed was insufficient to assign a statistically significant value to the IE loss (<0.02% of 10 mg samples of Pu). The second method used for quantifying any IE loss of Pu was multiple ion exchanges of the same Pu aliquant; the small number of analyses possible per individual IE together with the column-to-column variability over multiple ion exchanges prevented statistical detection of any loss of <0.05%. 12 refs.

  5. Behavior, sensitivity, and power of activation likelihood estimation characterized by massive empirical simulation.

    PubMed

    Eickhoff, Simon B; Nichols, Thomas E; Laird, Angela R; Hoffstaedter, Felix; Amunts, Katrin; Fox, Peter T; Bzdok, Danilo; Eickhoff, Claudia R

    2016-08-15

    Given the increasing number of neuroimaging publications, the automated knowledge extraction on brain-behavior associations by quantitative meta-analyses has become a highly important and rapidly growing field of research. Among several methods to perform coordinate-based neuroimaging meta-analyses, Activation Likelihood Estimation (ALE) has been widely adopted. In this paper, we addressed two pressing questions related to ALE meta-analysis: i) Which thresholding method is most appropriate to perform statistical inference? ii) Which sample size, i.e., number of experiments, is needed to perform robust meta-analyses? We provided quantitative answers to these questions by simulating more than 120,000 meta-analysis datasets using empirical parameters (i.e., number of subjects, number of reported foci, distribution of activation foci) derived from the BrainMap database. This allowed us to characterize the behavior of ALE analyses, to derive first power estimates for neuroimaging meta-analyses, and thus to formulate recommendations for future ALE studies. We could show as a first consequence that cluster-level family-wise error (FWE) correction represents the most appropriate method for statistical inference, while voxel-level FWE correction is valid but more conservative. In contrast, uncorrected inference and false-discovery rate correction should be avoided. As a second consequence, researchers should aim to include at least 20 experiments into an ALE meta-analysis to achieve sufficient power for moderate effects. We would like to note, though, that these calculations and recommendations are specific to ALE and may not be extrapolated to other approaches for (neuroimaging) meta-analysis. Copyright © 2016 Elsevier Inc. All rights reserved.

  6. Behavior, Sensitivity, and power of activation likelihood estimation characterized by massive empirical simulation

    PubMed Central

    Eickhoff, Simon B.; Nichols, Thomas E.; Laird, Angela R.; Hoffstaedter, Felix; Amunts, Katrin; Fox, Peter T.

    2016-01-01

    Given the increasing number of neuroimaging publications, the automated knowledge extraction on brain-behavior associations by quantitative meta-analyses has become a highly important and rapidly growing field of research. Among several methods to perform coordinate-based neuroimaging meta-analyses, Activation Likelihood Estimation (ALE) has been widely adopted. In this paper, we addressed two pressing questions related to ALE meta-analysis: i) Which thresholding method is most appropriate to perform statistical inference? ii) Which sample size, i.e., number of experiments, is needed to perform robust meta-analyses? We provided quantitative answers to these questions by simulating more than 120,000 meta-analysis datasets using empirical parameters (i.e., number of subjects, number of reported foci, distribution of activation foci) derived from the BrainMap database. This allowed us to characterize the behavior of ALE analyses, to derive first power estimates for neuroimaging meta-analyses, and thus to formulate recommendations for future ALE studies. We could show as a first consequence that cluster-level family-wise error (FWE) correction represents the most appropriate method for statistical inference, while voxel-level FWE correction is valid but more conservative. In contrast, uncorrected inference and false-discovery rate correction should be avoided. As a second consequence, researchers should aim to include at least 20 experiments into an ALE meta-analysis to achieve sufficient power for moderate effects. We would like to note, though, that these calculations and recommendations are specific to ALE and may not be extrapolated to other approaches for (neuroimaging) meta-analysis. PMID:27179606

  7. Mass univariate analysis of event-related brain potentials/fields I: a critical tutorial review.

    PubMed

    Groppe, David M; Urbach, Thomas P; Kutas, Marta

    2011-12-01

    Event-related potentials (ERPs) and magnetic fields (ERFs) are typically analyzed via ANOVAs on mean activity in a priori windows. Advances in computing power and statistics have produced an alternative: mass univariate analyses consisting of thousands of statistical tests and powerful corrections for multiple comparisons. Such analyses are most useful when one has little a priori knowledge of effect locations or latencies, and for delineating effect boundaries. Mass univariate analyses complement and, at times, obviate traditional analyses. Here we review this approach as applied to ERP/ERF data and four methods for multiple comparison correction: strong control of the familywise error rate (FWER) via permutation tests, weak control of FWER via cluster-based permutation tests, false discovery rate control, and control of the generalized FWER. We end with recommendations for their use and introduce free MATLAB software for their implementation. Copyright © 2011 Society for Psychophysiological Research.
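
    To make two of the reviewed correction strategies concrete, the following sketch runs a mass univariate analysis on simulated ERP-style data and applies (a) Benjamini-Hochberg false discovery rate control and (b) strong familywise error control via a sign-flip permutation test on the maximum absolute t statistic. All dimensions, data and effect locations are invented for illustration, and the Python/NumPy code is an independent sketch rather than the MATLAB toolbox the paper introduces.

      import numpy as np
      from scipy import stats
      from statsmodels.stats.multitest import multipletests

      rng = np.random.default_rng(0)

      # Simulated difference waves: 20 subjects x 64 channels x 100 time points,
      # with a modest true effect injected at a few channel/time locations.
      n_sub, n_chan, n_time = 20, 64, 100
      data = rng.normal(0.0, 1.0, size=(n_sub, n_chan, n_time))
      data[:, 10:12, 40:60] += 0.8

      # Mass univariate one-sample t-tests (one test per channel/time point)
      t_obs, p_obs = stats.ttest_1samp(data, 0.0, axis=0)

      # (a) False discovery rate control (Benjamini-Hochberg)
      reject_fdr, _, _, _ = multipletests(p_obs.ravel(), alpha=0.05, method="fdr_bh")
      print("FDR-significant tests:", int(reject_fdr.sum()))

      # (b) Strong FWER control via sign-flip permutations of the maximum |t|
      n_perm = 1000
      max_t = np.empty(n_perm)
      for i in range(n_perm):
          signs = rng.choice([-1.0, 1.0], size=(n_sub, 1, 1))
          t_perm, _ = stats.ttest_1samp(data * signs, 0.0, axis=0)
          max_t[i] = np.abs(t_perm).max()
      crit = np.quantile(max_t, 0.95)  # FWER-controlling critical value
      print("Permutation-FWER-significant tests:", int((np.abs(t_obs) > crit).sum()))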

  8. Consumer-driven definition of traditional food products and innovation in traditional foods. A qualitative cross-cultural study.

    PubMed

    Guerrero, Luis; Guàrdia, Maria Dolors; Xicola, Joan; Verbeke, Wim; Vanhonacker, Filiep; Zakowska-Biemans, Sylwia; Sajdakowska, Marta; Sulmont-Rossé, Claire; Issanchou, Sylvie; Contel, Michele; Scalvedi, M Luisa; Granli, Britt Signe; Hersleth, Margrethe

    2009-04-01

    Traditional food products (TFP) are an important part of European culture, identity, and heritage. In order to maintain and expand the market share of TFP, further improvements in safety, health, or convenience are needed by means of different innovations. The aim of this study was to obtain a consumer-driven definition of the concepts of TFP and innovation and to compare these across six European countries (Belgium, France, Italy, Norway, Poland and Spain) by means of semantic and textual statistical analyses. Twelve focus groups were performed, two per country, under similar conditions. The transcriptions obtained were submitted to an ordinary semantic analysis and to a textual statistical analysis using the software ALCESTE. Four main dimensions were identified for the concept of TFP: habit-natural, origin-locality, processing-elaboration and sensory properties. Five dimensions emerged around the concept of innovation: novelty-change, variety, processing-technology, origin-ethnicity and convenience. TFP were similarly perceived in the countries analysed, while some differences were detected for the concept of innovation. Semantic and statistical analyses of the focus groups led to similar results for both concepts. In some cases, from the consumers' point of view, the application of innovations may damage the traditional character of TFP.

  9. The Deployment Life Study: Longitudinal Analysis of Military Families Across the Deployment Cycle

    DTIC Science & Technology

    2016-01-01

    psychological and physical aggression than they reported prior to the deployment. 1 H. Fischer, A Guide to U.S. Military Casualty Statistics ...analyses include a large number of statistical tests and thus the results presented in this report should be viewed in terms of patterns, rather...Military Children and Families," The Future of Children, Vol. 23, No. 2, 2013, pp. 13–39. Fischer, H., A Guide to U.S. Military Casualty Statistics

  10. Group Influences on Young Adult Warfighters’ Risk Taking

    DTIC Science & Technology

    2016-12-01

    Statistical Analysis Latent linear growth models were fitted using the maximum likelihood estimation method in Mplus (version 7.0; Muthen & Muthen...condition had a higher net score than those in the alone condition (b = 20.53, SE = 6.29, p < .001). Results of the relevant statistical analyses are...8.56 110.86*** 22.01 158.25*** 29.91 Model fit statistics BIC 4004.50 5302.539 5540.58 Chi-square (df) 41.51*** (16) 38.10** (20) 42.19** (20

  11. The application of artificial intelligence to microarray data: identification of a novel gene signature to identify bladder cancer progression.

    PubMed

    Catto, James W F; Abbod, Maysam F; Wild, Peter J; Linkens, Derek A; Pilarsky, Christian; Rehman, Ishtiaq; Rosario, Derek J; Denzinger, Stefan; Burger, Maximilian; Stoehr, Robert; Knuechel, Ruth; Hartmann, Arndt; Hamdy, Freddie C

    2010-03-01

    New methods for identifying bladder cancer (BCa) progression are required. Gene expression microarrays can reveal insights into disease biology and identify novel biomarkers. However, these experiments produce large datasets that are difficult to interpret. To develop a novel method of microarray analysis combining two forms of artificial intelligence (AI), neurofuzzy modelling (NFM) and artificial neural networks (ANN), and to validate it in a BCa cohort. We used AI and statistical analyses to identify progression-related genes in a microarray dataset (n=66 tumours, n=2800 genes). The AI-selected genes were then investigated in a second cohort (n=262 tumours) using immunohistochemistry. We compared the accuracy of AI and statistical approaches to identify tumour progression. AI identified 11 progression-associated genes (odds ratio [OR]: 0.70; 95% confidence interval [CI], 0.56-0.87; p=0.0004), and these were more discriminating than genes chosen using statistical analyses (OR: 1.24; 95% CI, 0.96-1.60; p=0.09). The expression of six AI-selected genes (LIG3, FAS, KRT18, ICAM1, DSG2, and BRCA2) was determined using commercial antibodies and successfully identified tumour progression (concordance index: 0.66; log-rank test: p=0.01). AI-selected genes were more discriminating than pathologic criteria at determining progression (Cox multivariate analysis: p=0.01). Limitations include the use of statistical correlation to identify 200 genes for AI analysis and the fact that we did not compare regression-identified genes with immunohistochemistry. AI and statistical analyses use different techniques of inference to determine gene-phenotype associations and identify distinct prognostic gene signatures that are equally valid. We have identified a prognostic gene signature whose members reflect a variety of carcinogenic pathways that could identify progression in non-muscle-invasive BCa. 2009 European Association of Urology. Published by Elsevier B.V. All rights reserved.

  12. [A Review on the Use of Effect Size in Nursing Research].

    PubMed

    Kang, Hyuncheol; Yeon, Kyupil; Han, Sang Tae

    2015-10-01

    The purpose of this study was to introduce the main concepts of statistical testing and effect size and to provide researchers in nursing science with guidance on how to calculate the effect size for the statistical analysis methods mainly used in nursing. For the t-test, analysis of variance, correlation analysis and regression analysis, which are used frequently in nursing research, the generally accepted definitions of the effect size were explained. Some formulae for calculating the effect size are described with several examples from nursing research. Furthermore, the authors present the required minimum sample size for each example utilizing G*Power 3, the most widely used program for calculating sample size. It is noted that statistical significance testing and effect size measurement serve different purposes, and reliance on only one of them may be misleading. Some practical guidelines are recommended for combining statistical significance testing and effect size measures in order to make more balanced decisions in quantitative analyses.
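
    As a hedged illustration of the effect size and sample size calculations discussed above, the sketch below computes Cohen's d for two simulated groups and then solves for the per-group sample size giving 80% power, the same calculation G*Power 3 performs for an independent-samples t-test. The data and the resulting effect size are invented, and statsmodels is used here in place of G*Power.

      import numpy as np
      from statsmodels.stats.power import TTestIndPower

      def cohens_d(x, y):
          """Cohen's d for two independent groups, using the pooled SD."""
          nx, ny = len(x), len(y)
          pooled_sd = np.sqrt(((nx - 1) * np.var(x, ddof=1) +
                               (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2))
          return (np.mean(x) - np.mean(y)) / pooled_sd

      rng = np.random.default_rng(1)
      treatment = rng.normal(5.5, 2.0, 30)  # hypothetical outcome scores
      control = rng.normal(4.5, 2.0, 30)
      d = cohens_d(treatment, control)
      print(f"Cohen's d = {d:.2f}")

      # Minimum n per group for 80% power at alpha = .05, given this effect size
      n_per_group = TTestIndPower().solve_power(effect_size=d, alpha=0.05, power=0.80)
      print(f"Required n per group = {int(np.ceil(n_per_group))}")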

  13. Statistics for the Relative Detectability of Chemicals in Weak Gaseous Plumes in LWIR Hyperspectral Imagery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Metoyer, Candace N.; Walsh, Stephen J.; Tardiff, Mark F.

    2008-10-30

    The detection and identification of weak gaseous plumes using thermal imaging data is complicated by many factors. These include variability due to atmosphere, ground and plume temperature, and background clutter. This paper presents an analysis of one formulation of the physics-based model that describes the at-sensor observed radiance. The motivating question for the analyses performed in this paper is as follows. Given a set of backgrounds, is there a way to predict the background over which the probability of detecting a given chemical will be the highest? Two statistics were developed to address this question. These statistics incorporate data from the long-wave infrared band to predict the background over which chemical detectability will be the highest. These statistics can be computed prior to data collection. As a preliminary exploration into the predictive ability of these statistics, analyses were performed on synthetic hyperspectral images. Each image contained one chemical (either carbon tetrachloride or ammonia) spread across six distinct background types. The statistics were used to generate predictions for the background ranks. Then, the predicted ranks were compared to the empirical ranks obtained from the analyses of the synthetic images. For the simplified images under consideration, the predicted and empirical ranks showed a promising amount of agreement. One statistic accurately predicted the best and worst background for detection in all of the images. Future work may include explorations of more complicated plume ingredients, background types, and noise structures.

  14. The statistical analysis of global climate change studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hardin, J.W.

    1992-01-01

    The focus of this work is to contribute to the enhancement of the relationship between climatologists and statisticians. The analysis of global change data has been underway for many years by atmospheric scientists. Much of this analysis includes a heavy reliance on statistics and statistical inference. Some specific climatological analyses are presented and the dependence on statistics is documented before the analysis is undertaken. The first problem presented involves the fluctuation-dissipation theorem and its application to global climate models. This problem has a sound theoretical niche in the literature of both climate modeling and physics, but a statistical analysis in which the data are obtained from the model to show graphically the relationship has not been undertaken. It is under this motivation that the author presents this problem. A second problem concerning the standard errors in estimating global temperatures is purely statistical in nature, although very little material exists for sampling on such a frame. This problem not only has climatological and statistical ramifications, but political ones as well. It is planned to use these results in a further analysis of global warming using actual data collected on the earth. In order to simplify the analysis of these problems, the development of a computer program, MISHA, is presented. This interactive program contains many of the routines, functions, graphics, and map projections needed by the climatologist in order to effectively enter the arena of data visualization.

  15. msap: a tool for the statistical analysis of methylation-sensitive amplified polymorphism data.

    PubMed

    Pérez-Figueroa, A

    2013-05-01

    In this study msap, an R package which analyses methylation-sensitive amplified polymorphism (MSAP or MS-AFLP) data, is presented. The program provides a deep analysis of epigenetic variation starting from a binary data matrix indicating the banding pattern between the isoschizomeric endonucleases HpaII and MspI, which have differential sensitivity to cytosine methylation. After comparing the restriction fragments, the program determines if each fragment is susceptible to methylation (representative of epigenetic variation) or if there is no evidence of methylation (representative of genetic variation). The package provides, in a user-friendly command line interface, a pipeline of different analyses of the variation (genetic and epigenetic) among user-defined groups of samples, as well as the classification of the methylation occurrences in those groups. Statistical testing provides support to the analyses. A comprehensive report of the analyses and several useful plots could help researchers to assess the epigenetic and genetic variation in their MSAP experiments. msap is downloadable from CRAN (http://cran.r-project.org/) and its own webpage (http://msap.r-forge.R-project.org/). The package is intended to be easy to use even for those unfamiliar with the R command line environment. Advanced users may take advantage of the available source code to adapt msap to more complex analyses. © 2013 Blackwell Publishing Ltd.

  16. Effective Analysis of Reaction Time Data

    ERIC Educational Resources Information Center

    Whelan, Robert

    2008-01-01

    Most analyses of reaction time (RT) data are conducted by using the statistical techniques with which psychologists are most familiar, such as analysis of variance on the sample mean. Unfortunately, these methods are usually inappropriate for RT data, because they have little power to detect genuine differences in RT between conditions. In…

  17. Determining Sample Sizes for Precise Contrast Analysis with Heterogeneous Variances

    ERIC Educational Resources Information Center

    Jan, Show-Li; Shieh, Gwowen

    2014-01-01

    The analysis of variance (ANOVA) is one of the most frequently used statistical analyses in practical applications. Accordingly, the single and multiple comparison procedures are frequently applied to assess the differences among mean effects. However, the underlying assumption of homogeneous variances may not always be tenable. This study…

  18. Single-Level and Multilevel Mediation Analysis

    ERIC Educational Resources Information Center

    Tofighi, Davood; Thoemmes, Felix

    2014-01-01

    Mediation analysis is a statistical approach used to examine how the effect of an independent variable on an outcome is transmitted through an intervening variable (mediator). In this article, we provide a gentle introduction to single-level and multilevel mediation analyses. Using single-level data, we demonstrate an application of structural…

  19. The classification of secondary colorectal liver cancer in human biopsy samples using angular dispersive x-ray diffraction and multivariate analysis

    NASA Astrophysics Data System (ADS)

    Theodorakou, Chrysoula; Farquharson, Michael J.

    2009-08-01

    The motivation behind this study is to assess whether angular dispersive x-ray diffraction (ADXRD) data, processed using multivariate analysis techniques, can be used for classifying secondary colorectal liver cancer tissue and normal surrounding liver tissue in human liver biopsy samples. The ADXRD profiles from a total of 60 samples of normal liver tissue and colorectal liver metastases were measured using a synchrotron radiation source. The data were analysed for 56 samples using nonlinear peak-fitting software. Four peaks were fitted to all of the ADXRD profiles, and the amplitude, area, amplitude and area ratios for three of the four peaks were calculated and used for the statistical and multivariate analysis. The statistical analysis showed that there are significant differences between all the peak-fitting parameters and ratios between the normal and the diseased tissue groups. The technique of soft independent modelling of class analogy (SIMCA) was used to classify normal liver tissue and colorectal liver metastases resulting in 67% of the normal tissue samples and 60% of the secondary colorectal liver tissue samples being classified correctly. This study has shown that the ADXRD data of normal and secondary colorectal liver cancer are statistically different and x-ray diffraction data analysed using multivariate analysis have the potential to be used as a method of tissue classification.

  20. Statistical quality control through overall vibration analysis

    NASA Astrophysics Data System (ADS)

    Carnero, M. a. Carmen; González-Palma, Rafael; Almorza, David; Mayorga, Pedro; López-Escobar, Carlos

    2010-05-01

    The present study introduces the concept of statistical quality control in automotive wheel bearing manufacturing processes. Defects in the products under analysis can have a direct influence on passengers' safety and comfort. At present, the use of vibration analysis on machine tools for quality control purposes is not very extensive in manufacturing facilities. Noise and vibration are common quality problems in bearings. These failure modes likely occur under certain operating conditions and do not require high vibration amplitudes but relate to certain vibration frequencies. The vibration frequencies are affected by the type of surface problems (chattering) of ball races that are generated through grinding processes. The purpose of this paper is to identify grinding process variables that affect the quality of bearings by using statistical principles in the field of machine tools. In addition, an evaluation of the quality results of the finished parts under different combinations of process variables is assessed. This paper intends to establish the foundations to predict the quality of the products through the analysis of self-induced vibrations during the contact between the grinding wheel and the parts. To achieve this goal, the overall self-induced vibration readings under different combinations of process variables are analysed using statistical tools. The analysis of data and design of experiments follows a classical approach, considering all potential interactions between variables. The analysis of data is conducted through analysis of variance (ANOVA) for data sets that meet normality and homoscedasticity criteria. This paper utilizes different statistical tools to support the conclusions, such as the chi-squared, Shapiro-Wilk, symmetry, kurtosis, Cochran, Bartlett, Hartley and Kruskal-Wallis tests. The analysis presented is the starting point to extend the use of predictive techniques (vibration analysis) for quality control. This paper demonstrates the existence of predictive variables (high-frequency vibration displacements) that are sensitive to the process setup and the quality of the products obtained. Based on the results of this overall vibration analysis, a second paper will analyse self-induced vibration spectra in order to define limit vibration bands, controllable every cycle or connected to permanent vibration-monitoring systems able to adjust sensitive process variables identified by ANOVA once the vibration readings exceed established quality limits.
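
    A minimal Python sketch of the assumption-checking pipeline described above, applied to hypothetical overall-vibration readings from three process setups: normality is checked with the Shapiro-Wilk test and homogeneity of variances with Levene's test (standing in for the Cochran/Bartlett/Hartley checks), after which either one-way ANOVA or the Kruskal-Wallis test is applied. All readings are simulated for illustration only.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(2)
      # Hypothetical overall-vibration readings (e.g. mm/s RMS) for three
      # combinations of grinding-process variables; values are simulated.
      setup_a = rng.normal(1.20, 0.15, 25)
      setup_b = rng.normal(1.35, 0.15, 25)
      setup_c = rng.normal(1.60, 0.20, 25)
      groups = [setup_a, setup_b, setup_c]

      # Check the ANOVA assumptions first
      normal_ok = all(stats.shapiro(g).pvalue > 0.05 for g in groups)  # normality
      _, p_levene = stats.levene(*groups)                              # equal variances

      if normal_ok and p_levene > 0.05:
          stat, p = stats.f_oneway(*groups)   # parametric one-way ANOVA
          test = "one-way ANOVA"
      else:
          stat, p = stats.kruskal(*groups)    # non-parametric fallback
          test = "Kruskal-Wallis"
      print(f"{test}: statistic = {stat:.2f}, p = {p:.4g}")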

  1. [Analysis of the technical efficiency of hospitals in the Spanish National Health Service].

    PubMed

    Pérez-Romero, Carmen; Ortega-Díaz, M Isabel; Ocaña-Riola, Ricardo; Martín-Martín, José Jesús

    To analyse the technical efficiency and productivity of general hospitals in the Spanish National Health Service (NHS) (2010-2012) and to identify explanatory hospital and regional variables. 230 NHS hospitals were analysed by data envelopment analysis for overall, technical and scale efficiency, and the Malmquist index. The robustness of the analysis is contrasted with alternative input-output models. A fixed effects multilevel cross-sectional linear model was used to analyse the explanatory efficiency variables. The average rate of overall technical efficiency (OTE) was 0.736 in 2012; there was considerable variability by region. The Malmquist index (2010-2012) is 1.013. Some 23% of the variability in OTE is attributable to the region in question. Statistically significant exogenous variables (residents per 100 physicians, aging index, average annual income per household, essential public service expenditure and public health expenditure per capita) explain 42% of the OTE variability between hospitals and 64% between regions. At the hospital level, the number of residents showed a statistically significant relationship with OTE. As regards regions, there is a statistically significant direct linear association between OTE and annual income per capita and essential public service expenditure, and an indirect association with the aging index and annual public health expenditure per capita. The significant room for improvement in the efficiency of hospitals is conditioned by region-specific characteristics, specifically the aging, wealth and public expenditure policies of each one. Copyright © 2016 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.

  2. Biometric Analysis – A Reliable Indicator for Diagnosing Taurodontism using Panoramic Radiographs

    PubMed Central

    Hegde, Veda; Anegundi, Rajesh Trayambhak; Pravinchandra, K.R.

    2013-01-01

    Background: Taurodontism is a clinical entity with a morpho-anatomical change in the shape of the tooth, which was thought to be absent in modern man. Taurodontism is mostly observed as an isolated trait or as a component of a syndrome. Various techniques have been devised to diagnose taurodontism. Aim: The aim of this study was to analyze whether a biometric analysis was useful in diagnosing taurodontism in radiographs which appeared to be normal on cursory observation. Setting and Design: This study was carried out in our institution by using radiographs which were taken for routine procedures. Material and Methods: In this retrospective study, panoramic radiographs were obtained from the dental records of children aged between 9 and 14 years who did not have any abnormality on cursory observation. Biometric analyses were carried out on permanent mandibular first molar(s) by using a novel biometric method. The values were tabulated and analysed. Statistics: The Fisher exact probability test, chi-square test and chi-square test with Yates correction were used for statistical analysis of the data. Results: Cursory observation did not yield any case of taurodontism. In contrast, the biometric analysis yielded a statistically significant number of cases of taurodontism. However, there was no statistically significant difference in the number of cases with taurodontism between the genders or the age groups considered. Conclusion: Thus, taurodontism was diagnosed on a biometric analysis that was otherwise missed on cursory observation. It is therefore necessary, from the clinical point of view, to diagnose even the mildest form of taurodontism by using metric analysis rather than just relying on a visual radiographic assessment, as its occurrence has many clinical implications and a diagnostic importance. PMID:24086912

  3. Statistical analysis plan for the family-led rehabilitation after stroke in India (ATTEND) trial: A multicenter randomized controlled trial of a new model of stroke rehabilitation compared to usual care.

    PubMed

    Billot, Laurent; Lindley, Richard I; Harvey, Lisa A; Maulik, Pallab K; Hackett, Maree L; Murthy, Gudlavalleti Vs; Anderson, Craig S; Shamanna, Bindiganavale R; Jan, Stephen; Walker, Marion; Forster, Anne; Langhorne, Peter; Verma, Shweta J; Felix, Cynthia; Alim, Mohammed; Gandhi, Dorcas Bc; Pandian, Jeyaraj Durai

    2017-02-01

    Background In low- and middle-income countries, few patients receive organized rehabilitation after stroke, yet the burden of chronic diseases such as stroke is increasing in these countries. Affordable models of effective rehabilitation could have a major impact. The ATTEND trial is evaluating a family-led caregiver delivered rehabilitation program after stroke. Objective To publish the detailed statistical analysis plan for the ATTEND trial prior to trial unblinding. Methods Based upon the published registration and protocol, the blinded steering committee and management team, led by the trial statistician, have developed a statistical analysis plan. The plan has been informed by the chosen outcome measures, the data collection forms and knowledge of key baseline data. Results The resulting statistical analysis plan is consistent with best practice and will allow open and transparent reporting. Conclusions Publication of the trial statistical analysis plan reduces potential bias in trial reporting, and clearly outlines pre-specified analyses. Clinical Trial Registrations India CTRI/2013/04/003557; Australian New Zealand Clinical Trials Registry ACTRN1261000078752; Universal Trial Number U1111-1138-6707.

  4. Meta-analysis of thirty-two case-control and two ecological radon studies of lung cancer.

    PubMed

    Dobrzynski, Ludwik; Fornalski, Krzysztof W; Reszczynska, Joanna

    2018-03-01

    A re-analysis has been carried out of thirty-two case-control and two ecological studies concerning the influence of radon, a radioactive gas, on the risk of lung cancer. Three mathematically simplest dose-response relationships (models) were tested: constant (zero health effect), linear, and parabolic (linear-quadratic). Health effect end-points reported in the analysed studies are odds ratios or relative risk ratios, related either to morbidity or mortality. In our preliminary analysis, we show that the results of dose-response fitting are qualitatively (within uncertainties, given as error bars) the same, whichever of these health effect end-points are applied. Therefore, we deemed it reasonable to aggregate all response data into the so-called Relative Health Factor and jointly analysed such mixed data, to obtain better statistical power. In the second part of our analysis, robust Bayesian and classical methods of analysis were applied to this combined dataset. In this part of our analysis, we selected different subranges of radon concentrations. In view of substantial differences between the methodology used by the authors of case-control and ecological studies, the mathematical relationships (models) were applied mainly to the thirty-two case-control studies. The degree to which the two ecological studies, analysed separately, affect the overall results when combined with the thirty-two case-control studies, has also been evaluated. In all, as a result of our meta-analysis of the combined cohort, we conclude that the analysed data concerning radon concentrations below ~1000 Bq/m3 (~20 mSv/year of effective dose to the whole body) do not support the thesis that radon may be a cause of any statistically significant increase in lung cancer incidence.
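
    The sketch below illustrates the kind of model comparison described above, fitting constant, linear and linear-quadratic dose-response curves to a small set of invented Relative Health Factor values by weighted least squares and comparing the fits with a simple chi-square-based AIC. The concentrations, responses and uncertainties are hypothetical and are not taken from the analysed studies.

      import numpy as np
      from scipy.optimize import curve_fit

      # Hypothetical pooled data: radon concentration (Bq/m^3) and Relative
      # Health Factor (RHF) with standard errors; values invented for illustration.
      conc = np.array([25.0, 50.0, 100.0, 200.0, 400.0, 800.0])
      rhf = np.array([1.02, 0.98, 1.01, 1.05, 0.97, 1.04])
      se = np.array([0.06, 0.05, 0.05, 0.07, 0.08, 0.10])

      models = {
          "constant": lambda x, a: np.full_like(x, a),
          "linear": lambda x, a, b: a + b * x,
          "linear-quadratic": lambda x, a, b, c: a + b * x + c * x**2,
      }

      for name, f in models.items():
          n_par = f.__code__.co_argcount - 1          # parameters besides x
          p0 = [1.0] + [0.0] * (n_par - 1)
          popt, _ = curve_fit(f, conc, rhf, sigma=se, absolute_sigma=True, p0=p0)
          chi2 = np.sum(((rhf - f(conc, *popt)) / se) ** 2)
          aic = chi2 + 2 * n_par                      # chi-square-based AIC
          print(f"{name:>17}: chi2 = {chi2:.2f}, AIC = {aic:.2f}, params = {np.round(popt, 5)}")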

  5. A practical and systematic review of Weibull statistics for reporting strengths of dental materials

    PubMed Central

    Quinn, George D.; Quinn, Janet B.

    2011-01-01

    Objectives To review the history, theory and current applications of Weibull analyses sufficient to make informed decisions regarding practical use of the analysis in dental material strength testing. Data References are made to examples in the engineering and dental literature, but this paper also includes illustrative analyses of Weibull plots, fractographic interpretations, and Weibull distribution parameters obtained for a dense alumina, two feldspathic porcelains, and a zirconia. Sources Informational sources include Weibull's original articles, later articles specific to applications and theoretical foundations of Weibull analysis, texts on statistics and fracture mechanics and the international standards literature. Study Selection The chosen Weibull analyses are used to illustrate technique, the importance of flaw size distributions, physical meaning of Weibull parameters and concepts of “equivalent volumes” to compare measured strengths obtained from different test configurations. Conclusions Weibull analysis has a strong theoretical basis and can be of particular value in dental applications, primarily because of test specimen size limitations and the use of different test configurations. Also endemic to dental materials, however, is increased difficulty in satisfying application requirements, such as confirming fracture origin type and diligence in obtaining quality strength data. PMID:19945745
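
    As a brief illustration of the two-parameter Weibull analysis discussed above, the sketch below fits a Weibull distribution (location fixed at zero, as is usual for strength data) to hypothetical flexural strength values and reports the Weibull modulus, the characteristic strength, and a predicted failure probability. The strength values are invented purely to show the fitting procedure.

      import numpy as np
      from scipy import stats

      # Hypothetical flexural strengths (MPa) of a dental ceramic
      strengths = np.array([612, 655, 688, 701, 723, 745, 760, 772, 790,
                            804, 815, 833, 848, 862, 880, 905, 921, 950], dtype=float)

      # Two-parameter Weibull fit: shape m = Weibull modulus,
      # scale sigma0 = characteristic strength (63.2% failure probability)
      m, loc, sigma0 = stats.weibull_min.fit(strengths, floc=0)
      print(f"Weibull modulus m = {m:.1f}, characteristic strength = {sigma0:.0f} MPa")

      # Failure probability at a given stress: P_f = 1 - exp[-(sigma/sigma0)^m]
      stress = 700.0
      p_fail = stats.weibull_min.cdf(stress, m, loc=0, scale=sigma0)
      print(f"P_f({stress:.0f} MPa) = {p_fail:.2f}")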

  6. SPSS and SAS programs for addressing interdependence and basic levels-of-analysis issues in psychological data.

    PubMed

    O'Connor, Brian P

    2004-02-01

    Levels-of-analysis issues arise whenever individual-level data are collected from more than one person from the same dyad, family, classroom, work group, or other interaction unit. Interdependence in data from individuals in the same interaction units also violates the independence-of-observations assumption that underlies commonly used statistical tests. This article describes the data analysis challenges that are presented by these issues and presents SPSS and SAS programs for conducting appropriate analyses. The programs conduct the within-and-between-analyses described by Dansereau, Alutto, and Yammarino (1984) and the dyad-level analyses described by Gonzalez and Griffin (1999) and Griffin and Gonzalez (1995). Contrasts with general multilevel modeling procedures are then discussed.

  7. Design and analysis of randomized clinical trials requiring prolonged observation of each patient. II. analysis and examples.

    PubMed Central

    Peto, R.; Pike, M. C.; Armitage, P.; Breslow, N. E.; Cox, D. R.; Howard, S. V.; Mantel, N.; McPherson, K.; Peto, J.; Smith, P. G.

    1977-01-01

    Part I of this report appeared in the previous issue (Br. J. Cancer (1976) 34,585), and discussed the design of randomized clinical trials. Part II now describes efficient methods of analysis of randomized clinical trials in which we wish to compare the duration of survival (or the time until some other untoward event first occurs) among different groups of patients. It is intended to enable physicians without statistical training either to analyse such data themselves using life tables, the logrank test and retrospective stratification, or, when such analyses are presented, to appreciate them more critically, but the discussion may also be of interest to statisticians who have not yet specialized in clinical trial analyses. PMID:831755
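
    For readers who want to reproduce the flavour of the life-table and logrank calculations described above in modern software, the sketch below uses simulated survival times for two randomized groups, assuming the Python lifelines package is available; it illustrates the general approach rather than the authors' original worked examples.

      import numpy as np
      from lifelines import KaplanMeierFitter
      from lifelines.statistics import logrank_test

      rng = np.random.default_rng(3)
      # Hypothetical survival times (months) and event indicators
      # (1 = death observed, 0 = censored at 36 months); data are simulated.
      t_ctrl = rng.exponential(20.0, 60)
      t_trt = rng.exponential(30.0, 60)
      e_ctrl = (t_ctrl < 36.0).astype(int)
      e_trt = (t_trt < 36.0).astype(int)
      t_ctrl = np.minimum(t_ctrl, 36.0)
      t_trt = np.minimum(t_trt, 36.0)

      # Life-table style summary via the Kaplan-Meier estimator
      km = KaplanMeierFitter()
      km.fit(t_ctrl, event_observed=e_ctrl, label="control")
      print(km.survival_function_.tail())

      # Logrank test comparing the two survival curves
      result = logrank_test(t_ctrl, t_trt, event_observed_A=e_ctrl, event_observed_B=e_trt)
      print(f"logrank chi-square = {result.test_statistic:.2f}, p = {result.p_value:.4f}")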

  8. How can my research paper be useful for future meta-analyses on forest restoration practices?

    Treesearch

    Enrique Andivia; Pedro Villar‑Salvador; Juan A. Oliet; Jaime Puertolas; R. Kasten Dumroese

    2018-01-01

    Statistical meta-analysis is a powerful and useful tool to quantitatively synthesize the information conveyed in published studies on a particular topic. It allows identifying and quantifying overall patterns and exploring causes of variation. The inclusion of published works in meta-analyses requires, however, a minimum quality standard of the reported data and...

  9. SimHap GUI: an intuitive graphical user interface for genetic association analysis.

    PubMed

    Carter, Kim W; McCaskie, Pamela A; Palmer, Lyle J

    2008-12-25

    Researchers wishing to conduct genetic association analysis involving single nucleotide polymorphisms (SNPs) or haplotypes are often confronted with the lack of user-friendly graphical analysis tools, requiring sophisticated statistical and informatics expertise to perform relatively straightforward tasks. Tools such as the SimHap package for the R statistics language provide the necessary statistical operations to conduct sophisticated genetic analysis, but lack a graphical user interface that would allow anyone other than a professional statistician to utilise the tool effectively. We have developed SimHap GUI, a cross-platform integrated graphical analysis tool for conducting epidemiological, single SNP and haplotype-based association analysis. SimHap GUI features a novel workflow interface that guides the user through each logical step of the analysis process, making it accessible to both novice and advanced users. This tool provides a seamless interface to the SimHap R package, while providing enhanced functionality such as sophisticated data checking, automated data conversion, and real-time estimation of haplotype simulation progress. SimHap GUI provides a novel, easy-to-use, cross-platform solution for conducting a range of genetic and non-genetic association analyses. This provides a free alternative to commercial statistics packages that is specifically designed for genetic association analysis.

  10. Diagnosis checking of statistical analysis in RCTs indexed in PubMed.

    PubMed

    Lee, Paul H; Tse, Andy C Y

    2017-11-01

    Statistical analysis is essential for reporting of the results of randomized controlled trials (RCTs), as well as evaluating their effectiveness. However, the validity of a statistical analysis also depends on whether the assumptions of that analysis are valid. To review all RCTs published in journals indexed in PubMed during December 2014 to provide a complete picture of how RCTs handle assumptions of statistical analysis. We reviewed all RCTs published in December 2014 that appeared in journals indexed in PubMed using the Cochrane highly sensitive search strategy. The 2014 impact factors of the journals were used as proxies for their quality. The type of statistical analysis used and whether the assumptions of the analysis were tested were reviewed. In total, 451 papers were included. Of the 278 papers that reported a crude analysis for the primary outcomes, 31 (27·2%) reported whether the outcome was normally distributed. Of the 172 papers that reported an adjusted analysis for the primary outcomes, diagnosis checking was rarely conducted, with only 20%, 8·6% and 7% checked for generalized linear model, Cox proportional hazard model and multilevel model, respectively. Study characteristics (study type, drug trial, funding sources, journal type and endorsement of CONSORT guidelines) were not associated with the reporting of diagnosis checking. The diagnosis of statistical analyses in RCTs published in PubMed-indexed journals was usually absent. Journals should provide guidelines about the reporting of a diagnosis of assumptions. © 2017 Stichting European Society for Clinical Investigation Journal Foundation.
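
    The sketch below shows the kind of diagnosis checking whose absence the review documents: a normality check before a crude two-group comparison, and residual normality and homoscedasticity checks after an adjusted linear model. The trial data are simulated and the variable names are hypothetical.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf
      from statsmodels.stats.diagnostic import het_breuschpagan
      from scipy import stats

      rng = np.random.default_rng(4)
      # Simulated RCT: continuous primary outcome, treatment arm, one covariate
      df = pd.DataFrame({
          "arm": np.repeat(["control", "treatment"], 50),
          "age": rng.normal(60.0, 10.0, 100),
      })
      df["outcome"] = (10.0 + 2.0 * (df["arm"] == "treatment")
                       + 0.1 * df["age"] + rng.normal(0.0, 3.0, 100))

      # Crude analysis: check normality in each arm before trusting a t-test
      for arm, grp in df.groupby("arm"):
          print(arm, "Shapiro-Wilk p =", round(stats.shapiro(grp["outcome"]).pvalue, 3))

      # Adjusted analysis: fit the model, then check the residual diagnostics
      fit = smf.ols("outcome ~ arm + age", data=df).fit()
      print("Residual normality (Shapiro-Wilk) p =", round(stats.shapiro(fit.resid).pvalue, 3))
      _, bp_pvalue, _, _ = het_breuschpagan(fit.resid, fit.model.exog)
      print("Homoscedasticity (Breusch-Pagan) p =", round(bp_pvalue, 3))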

  11. Improving phylogenetic analyses by incorporating additional information from genetic sequence databases.

    PubMed

    Liang, Li-Jung; Weiss, Robert E; Redelings, Benjamin; Suchard, Marc A

    2009-10-01

    Statistical analyses of phylogenetic data culminate in uncertain estimates of underlying model parameters. Lack of additional data hinders the ability to reduce this uncertainty, as the original phylogenetic dataset is often complete, containing the entire gene or genome information available for the given set of taxa. Informative priors in a Bayesian analysis can reduce posterior uncertainty; however, publicly available phylogenetic software specifies vague priors for model parameters by default. We build objective and informative priors using hierarchical random effect models that combine additional datasets whose parameters are not of direct interest but are similar to the analysis of interest. We propose principled statistical methods that permit more precise parameter estimates in phylogenetic analyses by creating informative priors for parameters of interest. Using additional sequence datasets from our lab or public databases, we construct a fully Bayesian semiparametric hierarchical model to combine datasets. A dynamic iteratively reweighted Markov chain Monte Carlo algorithm conveniently recycles posterior samples from the individual analyses. We demonstrate the value of our approach by examining the insertion-deletion (indel) process in the enolase gene across the Tree of Life using the phylogenetic software BALI-PHY; we incorporate prior information about indels from 82 curated alignments downloaded from the BAliBASE database.

  12. The extent and consequences of p-hacking in science.

    PubMed

    Head, Megan L; Holman, Luke; Lanfear, Rob; Kahn, Andrew T; Jennions, Michael D

    2015-03-01

    A focus on novel, confirmatory, and statistically significant results leads to substantial bias in the scientific literature. One type of bias, known as "p-hacking," occurs when researchers collect or select data or statistical analyses until nonsignificant results become significant. Here, we use text-mining to demonstrate that p-hacking is widespread throughout science. We then illustrate how one can test for p-hacking when performing a meta-analysis and show that, while p-hacking is probably common, its effect seems to be weak relative to the real effect sizes being measured. This result suggests that p-hacking probably does not drastically alter scientific consensuses drawn from meta-analyses.

  13. Analysis methodology and development of a statistical tool for biodistribution data from internal contamination with actinides.

    PubMed

    Lamart, Stephanie; Griffiths, Nina M; Tchitchek, Nicolas; Angulo, Jaime F; Van der Meeren, Anne

    2017-03-01

    The aim of this work was to develop a computational tool that integrates several statistical analysis features for biodistribution data from internal contamination experiments. These data represent actinide levels in biological compartments as a function of time and are derived from activity measurements in tissues and excreta. These experiments aim at assessing the influence of different contamination conditions (e.g. intake route or radioelement) on the biological behavior of the contaminant. The ever increasing number of datasets and diversity of experimental conditions make the handling and analysis of biodistribution data difficult. This work sought to facilitate the statistical analysis of a large number of datasets and the comparison of results from diverse experimental conditions. Functional modules were developed using the open-source programming language R to facilitate specific operations: descriptive statistics, visual comparison, curve fitting, and implementation of biokinetic models. In addition, the structure of the datasets was harmonized using the same table format. Analysis outputs can be written in text files and updated data can be written in the consistent table format. Hence, a data repository is built progressively, which is essential for the optimal use of animal data. Graphical representations can be automatically generated and saved as image files. The resulting computational tool was applied using data derived from wound contamination experiments conducted under different conditions. In facilitating biodistribution data handling and statistical analyses, this computational tool ensures faster analyses and a better reproducibility compared with the use of multiple office software applications. Furthermore, re-analysis of archival data and comparison of data from different sources is made much easier. Hence this tool will help to understand better the influence of contamination characteristics on actinide biokinetics. Our approach can aid the optimization of treatment protocols and therefore contribute to the improvement of the medical response after internal contamination with actinides.

  14. Statistics Clinic

    NASA Technical Reports Server (NTRS)

    Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James

    2014-01-01

    Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.

  15. Exploratory study on a statistical method to analyse time resolved data obtained during nanomaterial exposure measurements

    NASA Astrophysics Data System (ADS)

    Clerc, F.; Njiki-Menga, G.-H.; Witschger, O.

    2013-04-01

    Most of the measurement strategies that are suggested at the international level to assess workplace exposure to nanomaterials rely on devices measuring, in real time, airborne particle concentrations (according to different metrics). Since none of the instruments used to measure aerosols can distinguish a particle of interest from the background aerosol, the statistical analysis of time-resolved data requires special attention. So far, very few approaches have been used for statistical analysis in the literature. These range from simple qualitative analysis of graphs to the implementation of more complex statistical models. To date, there is still no consensus on a particular approach, and the field is still looking for an appropriate and robust method. In this context, this exploratory study investigates a statistical method to analyse time-resolved data based on a Bayesian probabilistic approach. To investigate and illustrate the use of this statistical method, particle number concentration data from a workplace study that investigated the potential for exposure via inhalation from cleanout operations by sandpapering of a reactor producing nanocomposite thin films have been used. In this workplace study, the background issue was addressed through the near-field and far-field approaches, and several size-integrated and time-resolved devices were used. The analysis of the results presented here focuses only on data obtained with two handheld condensation particle counters. While one was measuring at the source of the released particles, the other was measuring in parallel in the far field. The Bayesian probabilistic approach allows a probabilistic modelling of data series, and the observed task is modelled in the form of probability distributions. The probability distributions issuing from time-resolved data obtained at the source can be compared with the probability distributions issuing from the time-resolved data obtained in the far field, leading to a quantitative estimate of the airborne particles released at the source when the task is performed. Beyond the results obtained, this exploratory study indicates that the analysis of the results requires specific experience in statistics.

  16. Maintenance therapy with sucralfate in duodenal ulcer: genuine prevention or accelerated healing of ulcer recurrence?

    PubMed

    Bynum, T E; Koch, G G

    1991-08-08

    We sought to compare the efficacy of sucralfate to placebo for the prevention of duodenal ulcer recurrence and to determine that the efficacy of sucralfate was due to a true reduction in ulcer prevalence and not due to secondary effects such as analgesic activity or accelerated healing. This was a double-blind, randomized, placebo-controlled, parallel groups, multicenter clinical study with 254 patients. All patients had a past history of at least two duodenal ulcers with at least one ulcer diagnosed by endoscopic examination 3 months or less before the start of the study. Complete ulcer healing without erosions was required to enter the study. Sucralfate or placebo were dosed as a 1-g tablet twice a day for 4 months, or until ulcer recurrence. Endoscopic examinations once a month and when symptoms developed determined the presence or absence of duodenal ulcers. If a patient developed an ulcer between monthly scheduled visits, the patient was dosed with a 1-g sucralfate tablet twice a day until the next scheduled visit. Statistical analyses of the results determined the efficacy of sucralfate compared with placebo for preventing duodenal ulcer recurrence. Comparisons of therapeutic agents for preventing duodenal ulcers have usually been made by testing for statistical differences in the cumulative rates for all ulcers developed during a follow-up period, regardless of the time of detection. Statistical experts at the United States Food and Drug Administration (FDA) and on the FDA Advisory Panel expressed doubts about clinical study results based on this type of analysis. They suggested three possible mechanisms for reducing the number of observed ulcers: (a) analgesic effects, (b) accelerated healing, and (c) true ulcer prevention. Traditional ulcer analysis could miss recurring ulcers due to an analgesic effect or accelerated healing. Point-prevalence analysis could miss recurring ulcers due to accelerated healing between endoscopic examinations. Maximum ulcer analyses, a novel statistical method, eliminated analgesic effects by regularly scheduled endoscopies and accelerated healing of recurring ulcers by frequent endoscopies and an open-label phase. Maximum ulcer analysis reflects true ulcer recurrence and prevention. Sucralfate was significantly superior to placebo in reducing ulcer prevalence by all analyses. Significance (p less than 0.05) was found at months 3 and 4 for all analyses. All months were significant in the traditional analysis, months 2-4 in point-prevalence analysis, and months 3-4 in the maximal ulcer prevalence analysis. Sucralfate was shown to be effective for the prevention of duodenal ulcer recurrence by a true reduction in new ulcer development.

  17. Statistical issues on the analysis of change in follow-up studies in dental research.

    PubMed

    Blance, Andrew; Tu, Yu-Kang; Baelum, Vibeke; Gilthorpe, Mark S

    2007-12-01

    To provide an overview of the problems in study design and associated analyses of follow-up studies in dental research, particularly addressing three issues: treatment-baseline interactions; statistical power; and nonrandomization. Our previous work has shown that many studies purport an interaction between change (from baseline) and baseline values, which is often based on inappropriate statistical analyses. A priori power calculations are essential for randomized controlled trials (RCTs), but in the pre-test/post-test RCT design it is not well known to dental researchers that the choice of statistical method affects power, and that power is affected by treatment-baseline interactions. A common (good) practice in the analysis of RCT data is to adjust for baseline outcome values using ANCOVA, thereby increasing statistical power. However, an important requirement for ANCOVA is that there be no interaction between the groups and the baseline outcome (i.e. effective randomization); the patient-selection process should not cause differences in mean baseline values across groups. This assumption is often violated for nonrandomized (observational) studies and the use of ANCOVA is thus problematic, potentially giving biased estimates, invoking Lord's paradox and leading to difficulties in the interpretation of results. Baseline interaction issues can be overcome by statistical methods not widely practiced in dental research: Oldham's method and multilevel modelling; the latter is preferred for its greater flexibility to deal with more than one follow-up occasion as well as additional covariates. To illustrate these three key issues, hypothetical examples are considered from the fields of periodontology, orthodontics, and oral implantology. Caution needs to be exercised when considering the design and analysis of follow-up studies. ANCOVA is generally inappropriate for nonrandomized studies, and causal inferences from observational data should be avoided.
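
    To make the contrast concrete, the sketch below analyses simulated pre-test/post-test data in two ways: an ANCOVA that adjusts the follow-up value for baseline (appropriate when randomization makes the no-interaction assumption plausible) and Oldham's method, which correlates change with the average of baseline and follow-up. The dataset, effect sizes and variable names are all hypothetical.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf
      from scipy import stats

      rng = np.random.default_rng(5)
      # Simulated pre/post data (e.g. probing depth in mm) for two randomized groups
      n = 40
      group = np.repeat(["control", "test"], n)
      baseline = rng.normal(5.0, 1.0, 2 * n)
      change = -0.5 - 0.8 * (group == "test") + rng.normal(0.0, 0.6, 2 * n)
      df = pd.DataFrame({"group": group, "baseline": baseline,
                         "followup": baseline + change})

      # ANCOVA: follow-up adjusted for baseline
      ancova = smf.ols("followup ~ group + baseline", data=df).fit()
      print("ANCOVA group effect:", round(ancova.params["group[T.test]"], 2),
            "p =", round(ancova.pvalues["group[T.test]"], 4))

      # Oldham's method: relate change to the mean of baseline and follow-up,
      # avoiding the spurious change-baseline correlation from regression to the mean
      chg = df["followup"] - df["baseline"]
      avg = (df["followup"] + df["baseline"]) / 2.0
      r, p = stats.pearsonr(chg, avg)
      print(f"Oldham correlation r = {r:.2f}, p = {p:.3f}")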

  18. Evaluation of Evidence of Statistical Support and Corroboration of Subgroup Claims in Randomized Clinical Trials.

    PubMed

    Wallach, Joshua D; Sullivan, Patrick G; Trepanowski, John F; Sainani, Kristin L; Steyerberg, Ewout W; Ioannidis, John P A

    2017-04-01

    Many published randomized clinical trials (RCTs) make claims for subgroup differences. To evaluate how often subgroup claims reported in the abstracts of RCTs are actually supported by statistical evidence (P < .05 from an interaction test) and corroborated by subsequent RCTs and meta-analyses. This meta-epidemiological survey examines data sets of trials with at least 1 subgroup claim, including Subgroup Analysis of Trials Is Rarely Easy (SATIRE) articles and Discontinuation of Randomized Trials (DISCO) articles. We used Scopus (updated July 2016) to search for English-language articles citing each of the eligible index articles with at least 1 subgroup finding in the abstract. Articles with a subgroup claim in the abstract with or without evidence of statistical heterogeneity (P < .05 from an interaction test) in the text and articles attempting to corroborate the subgroup findings. Study characteristics of trials with at least 1 subgroup claim in the abstract were recorded. Two reviewers extracted the data necessary to calculate subgroup-level effect sizes, standard errors, and the P values for interaction. For individual RCTs and meta-analyses that attempted to corroborate the subgroup findings from the index articles, trial characteristics were extracted. Cochran Q test was used to reevaluate heterogeneity with the data from all available trials. The number of subgroup claims in the abstracts of RCTs, the number of subgroup claims in the abstracts of RCTs with statistical support (subgroup findings), and the number of subgroup findings corroborated by subsequent RCTs and meta-analyses. Sixty-four eligible RCTs made a total of 117 subgroup claims in their abstracts. Of these 117 claims, only 46 (39.3%) in 33 articles had evidence of statistically significant heterogeneity from a test for interaction. In addition, out of these 46 subgroup findings, only 16 (34.8%) ensured balance between randomization groups within the subgroups (eg, through stratified randomization), 13 (28.3%) entailed a prespecified subgroup analysis, and 1 (2.2%) was adjusted for multiple testing. Only 5 (10.9%) of the 46 subgroup findings had at least 1 subsequent pure corroboration attempt by a meta-analysis or an RCT. In all 5 cases, the corroboration attempts found no evidence of a statistically significant subgroup effect. In addition, all effect sizes from meta-analyses were attenuated toward the null. A minority of subgroup claims made in the abstracts of RCTs are supported by their own data (ie, a significant interaction effect). For those that have statistical support (P < .05 from an interaction test), most fail to meet other best practices for subgroup tests, including prespecification, stratified randomization, and adjustment for multiple testing. Attempts to corroborate statistically significant subgroup differences are rare; when done, the initially observed subgroup differences are not reproduced.
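
    The statistical support the survey looks for is an interaction test rather than separate within-subgroup p-values. The sketch below shows that basic check on simulated trial data: fit a model with a treatment-by-subgroup interaction term and read off its p-value. Variable names and effect sizes are hypothetical.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(6)
      # Simulated trial: treatment arm, binary subgroup (e.g. sex), continuous outcome
      n = 400
      df = pd.DataFrame({
          "treat": rng.integers(0, 2, n),
          "subgroup": rng.integers(0, 2, n),
      })
      df["outcome"] = 1.0 * df["treat"] + 0.3 * df["subgroup"] + rng.normal(0.0, 2.0, n)

      # A subgroup claim should be backed by the interaction term
      fit = smf.ols("outcome ~ treat * subgroup", data=df).fit()
      print("Treatment-by-subgroup interaction p =",
            round(fit.pvalues["treat:subgroup"], 3))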

  19. The High Cost of Complexity in Experimental Design and Data Analysis: Type I and Type II Error Rates in Multiway ANOVA.

    ERIC Educational Resources Information Center

    Smith, Rachel A.; Levine, Timothy R.; Lachlan, Kenneth A.; Fediuk, Thomas A.

    2002-01-01

    Notes that the availability of statistical software packages has led to a sharp increase in use of complex research designs and complex statistical analyses in communication research. Reports a series of Monte Carlo simulations which demonstrate that this complexity may come at a heavier cost than many communication researchers realize. Warns…
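
    The inflation the authors warn about can be demonstrated with a small Monte Carlo sketch: on pure-noise data, a 2 x 2 x 2 ANOVA tests seven effects, so the chance of at least one false positive at alpha = .05 is roughly 1 - 0.95^7, or about 30%. The simulation below is an independent illustration of that point, not a reproduction of the authors' simulations.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf
      from statsmodels.stats.anova import anova_lm

      rng = np.random.default_rng(8)
      levels = [0, 1]
      design = pd.DataFrame([(a, b, c) for a in levels for b in levels for c in levels],
                            columns=["A", "B", "C"])

      n_sims, n_per_cell, alpha = 500, 10, 0.05
      hits = 0
      for _ in range(n_sims):
          df = design.loc[design.index.repeat(n_per_cell)].copy()
          df["y"] = rng.normal(size=len(df))  # null data: no true effects anywhere
          fit = smf.ols("y ~ C(A) * C(B) * C(C)", data=df).fit()
          pvals = anova_lm(fit, typ=2)["PR(>F)"].drop("Residual")
          hits += (pvals < alpha).any()       # any of the 7 effects "significant"?
      print(f"Estimated experimentwise Type I error = {hits / n_sims:.2f}")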

  20. Plant selection for ethnobotanical uses on the Amalfi Coast (Southern Italy).

    PubMed

    Savo, V; Joy, R; Caneva, G; McClatchey, W C

    2015-07-15

    Many ethnobotanical studies have investigated selection criteria for medicinal and non-medicinal plants. In this paper we test several statistical methods using different ethnobotanical datasets in order to 1) define to what extent the nature of the datasets can affect the interpretation of results; and 2) determine whether the selection of plants for different uses is based on phylogeny or on other selection criteria. We considered three different ethnobotanical datasets: two datasets of medicinal plants and a dataset of non-medicinal plants (handicraft production, domestic and agro-pastoral practices), together with two floras of the Amalfi Coast. We performed residual analysis from linear regression, the binomial test and the Bayesian approach for calculating under-used and over-used plant families within the ethnobotanical datasets. Percentages of agreement were calculated to compare the results of the analyses. We also analyzed the relationship between plant selection and phylogeny, chorology, life form and habitat using the chi-square test. Pearson's residuals for each of the significant chi-square analyses were examined to investigate alternative hypotheses about plant selection criteria. The results of the three statistical methods differed within the same dataset and between different datasets and floras, but with some similarities. In the two medicinal datasets, only Lamiaceae was identified in both floras as an over-used family by all three statistical methods. All statistical methods in one flora agreed that Malvaceae was over-used and Poaceae under-used, but this was not consistent with the results of the second flora, in which one statistical result was non-significant. All other families had some discrepancy in significance across methods or floras. Significant over- or under-use was observed in only a minority of cases. The chi-square analyses were significant for phylogeny, life form and habitat. Pearson's residuals indicated a non-random selection of woody species for non-medicinal uses and an under-use of plants of temperate forests for medicinal uses. Our study showed that selection criteria for plant uses (including medicinal) are not always based on phylogeny. The comparison of different statistical methods (regression, binomial and Bayesian) under different conditions led to the conclusion that the most conservative results are obtained using regression analysis.
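
    A minimal sketch of the chi-square plus Pearson-residual step described above, applied to an invented contingency table of life form by use category; the counts are hypothetical and serve only to show how residuals point to over- or under-used cells.

      import numpy as np
      from scipy.stats import chi2_contingency

      # Hypothetical counts: rows = life form (woody, herbaceous),
      # columns = use category (medicinal, non-medicinal)
      observed = np.array([[35, 80],
                           [120, 60]])

      chi2, p, dof, expected = chi2_contingency(observed)
      print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4g}")

      # Pearson residuals show which cells drive a significant result
      # (|residual| greater than about 2 suggests over- or under-representation)
      pearson_resid = (observed - expected) / np.sqrt(expected)
      print(np.round(pearson_resid, 2))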

  1. Ataxia Telangiectasia–Mutated Gene Polymorphisms and Acute Normal Tissue Injuries in Cancer Patients After Radiation Therapy: A Systematic Review and Meta-analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dong, Lihua; Cui, Jingkun; Tang, Fengjiao

    Purpose: Studies of the association between ataxia telangiectasia–mutated (ATM) gene polymorphisms and acute radiation injuries are often small in sample size, and the results are inconsistent. We conducted the first meta-analysis to provide a systematic review of published findings. Methods and Materials: Publications were identified by searching PubMed up to April 25, 2014. Primary meta-analysis was performed for all acute radiation injuries, and subgroup meta-analyses were based on clinical endpoint. The influence of sample size and radiation injury incidence on genetic effects was estimated in sensitivity analyses. Power calculations were also conducted. Results: The meta-analysis was conducted on the ATM polymorphism rs1801516, including 5 studies with 1588 participants. For all studies, the cut-off for differentiating cases from controls was grade 2 acute radiation injuries. The primary meta-analysis showed a significant association with overall acute radiation injuries (allelic model: odds ratio = 1.33, 95% confidence interval: 1.04-1.71). Subgroup analyses detected an association between the rs1801516 polymorphism and a significant increase in urinary and lower gastrointestinal injuries and an increase in skin injury that was not statistically significant. There was no between-study heterogeneity in any meta-analyses. In the sensitivity analyses, small studies did not show larger effects than large studies. In addition, studies with a high incidence of acute radiation injuries showed larger effects than studies with a low incidence. Power calculations revealed that the statistical power of the primary meta-analysis was borderline, whereas there was adequate power for the subgroup analysis of studies with a high incidence of acute radiation injuries. Conclusions: Our meta-analysis showed a consistency of the results from the overall and subgroup analyses. We also showed that the genetic effect of the rs1801516 polymorphism on acute radiation injuries was dependent on the incidence of the injury. These findings support the evidence of an association between the rs1801516 polymorphism and acute radiation injuries, encouraging further research on this topic.
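
    For readers unfamiliar with the pooling step behind the reported allelic-model odds ratio, the sketch below performs a fixed-effect inverse-variance meta-analysis of log odds ratios and computes Cochran's Q for heterogeneity. The five odds ratios and confidence intervals are invented for illustration and are not the studies analysed above.

      import numpy as np

      # Hypothetical per-study odds ratios with 95% confidence intervals
      or_est = np.array([1.20, 1.55, 1.10, 1.40, 1.35])
      ci_low = np.array([0.85, 1.05, 0.70, 0.95, 1.00])
      ci_high = np.array([1.70, 2.30, 1.73, 2.06, 1.82])

      log_or = np.log(or_est)
      se = (np.log(ci_high) - np.log(ci_low)) / (2 * 1.96)  # SE from the CI width
      w = 1.0 / se**2                                        # inverse-variance weights

      pooled = np.sum(w * log_or) / np.sum(w)
      pooled_se = np.sqrt(1.0 / np.sum(w))
      ci = np.exp([pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se])
      print(f"Pooled OR = {np.exp(pooled):.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f})")

      # Cochran's Q for between-study heterogeneity
      q = np.sum(w * (log_or - pooled) ** 2)
      print(f"Cochran's Q = {q:.2f} on {len(or_est) - 1} df")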

  2. Vitamin D and depression: a systematic review and meta-analysis comparing studies with and without biological flaws.

    PubMed

    Spedding, Simon

    2014-04-11

    The efficacy of Vitamin D supplements in depression is controversial, awaiting further literature analysis. Biological flaws in primary studies are a possible reason why meta-analyses of Vitamin D have failed to demonstrate efficacy. This systematic review and meta-analysis of Vitamin D and depression compared studies with and without biological flaws. The systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The literature search was undertaken through four databases for randomized controlled trials (RCTs). Studies were critically appraised for methodological quality and biological flaws, in relation to the hypothesis and study design. Meta-analyses were performed for studies according to the presence of biological flaws. The 15 RCTs identified provide a more comprehensive evidence base than previous systematic reviews; the methodological quality of the studies was generally good and the methodology was diverse. A meta-analysis of all studies without flaws demonstrated a statistically significant improvement in depression with Vitamin D supplements (+0.78 CI +0.24, +1.27). Studies with biological flaws were mainly inconclusive, with the meta-analysis demonstrating a statistically significant worsening in depression with Vitamin D supplements (-1.1 CI -0.7, -1.5). Vitamin D supplementation (≥800 I.U. daily) was somewhat favorable in the management of depression in studies that demonstrated a change in vitamin levels, and the effect size was comparable to that of anti-depressant medication.

  3. Interpretation of correlations in clinical research.

    PubMed

    Hung, Man; Bounsanga, Jerry; Voss, Maren Wright

    2017-11-01

    Critically analyzing research is a key skill in evidence-based practice and requires knowledge of research methods, results interpretation, and applications, all of which rely on a foundation based in statistics. Evidence-based practice makes high demands on trained medical professionals to interpret an ever-expanding array of research evidence. As clinical training emphasizes medical care rather than statistics, it is useful to review the basics of statistical methods and what they mean for interpreting clinical studies. We reviewed the basic concepts of correlational associations, violations of normality, unobserved variable bias, sample size, and alpha inflation. The foundations of causal inference were discussed and sound statistical analyses were examined. We discuss four ways in which correlational analysis is misused, including causal inference overreach, over-reliance on significance, alpha inflation, and sample size bias. Recent published studies in the medical field provide evidence of causal assertion overreach drawn from correlational findings. The findings present a primer on the assumptions and nature of correlational methods of analysis and urge clinicians to exercise appropriate caution as they critically analyze the evidence before them and evaluate evidence that supports practice. Critically analyzing new evidence requires statistical knowledge in addition to clinical knowledge. Studies can overstate relationships, expressing causal assertions when only correlational evidence is available. Failure to account for the effect of sample size in the analyses tends to overstate the importance of predictive variables. It is important not to overemphasize the statistical significance without consideration of effect size and whether differences could be considered clinically meaningful.
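
    The alpha-inflation point above can be made concrete with a few lines of arithmetic. The following minimal sketch (alpha and test counts are illustrative, not taken from the review) shows how the family-wise error rate grows as more independent tests are run at a nominal alpha of 0.05:

```python
# Illustration of alpha inflation: the chance of at least one false positive
# grows quickly as more independent tests are run at a nominal alpha of 0.05.
alpha = 0.05
for n_tests in (1, 5, 10, 20, 50):
    fwer = 1 - (1 - alpha) ** n_tests  # P(at least one false positive)
    print(f"{n_tests:3d} tests -> family-wise error rate = {fwer:.2f}")
```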

  4. The Use of the Position Analysis Questionnaire (PAQ) for Establishing the Job Component Validity of Tests. Report No. 5. Final Report.

    ERIC Educational Resources Information Center

    McCormick, Ernest J.; And Others

    The Position Analysis Questionnaire (PAQ), a structured job analysis questionnaire that provides for the analysis of individual jobs in terms of each of 187 job elements, was used to establish the job component validity of certain commercially-available vocational aptitude tests. Prior to the general analyses reported here, a statistical analysis…

  5. How to Make Nothing Out of Something: Analyses of the Impact of Study Sampling and Statistical Interpretation in Misleading Meta-Analytic Conclusions

    PubMed Central

    Cunningham, Michael R.; Baumeister, Roy F.

    2016-01-01

    The limited resource model states that self-control is governed by a relatively finite set of inner resources on which people draw when exerting willpower. Once self-control resources have been used up or depleted, they are less available for other self-control tasks, leading to a decrement in subsequent self-control success. The depletion effect has been studied for over 20 years, tested or extended in more than 600 studies, and supported in an independent meta-analysis (Hagger et al., 2010). Meta-analyses are supposed to reduce bias in literature reviews. Carter et al.’s (2015) meta-analysis, by contrast, included a series of questionable decisions involving sampling, methods, and data analysis. We provide quantitative analyses of key sampling issues: exclusion of many of the best depletion studies based on idiosyncratic criteria and the emphasis on mini meta-analyses with low statistical power as opposed to the overall depletion effect. We discuss two key methodological issues: failure to code for research quality, and the quantitative impact of weak studies by novice researchers. We discuss two key data analysis issues: questionable interpretation of the results of trim and fill and Funnel Plot Asymmetry test procedures, and the use and misinterpretation of the untested Precision Effect Test and Precision Effect Estimate with Standard Error (PEESE) procedures. Despite these serious problems, the Carter et al. (2015) meta-analysis results actually indicate that there is a real depletion effect – contrary to their title. PMID:27826272

  6. Dark Energy Survey Year 1 Results: Multi-Probe Methodology and Simulated Likelihood Analyses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Krause, E.; et al.

    We present the methodology for and detail the implementation of the Dark Energy Survey (DES) 3x2pt DES Year 1 (Y1) analysis, which combines configuration-space two-point statistics from three different cosmological probes: cosmic shear, galaxy-galaxy lensing, and galaxy clustering, using data from the first year of DES observations. We have developed two independent modeling pipelines and describe the code validation process. We derive expressions for analytical real-space multi-probe covariances, and describe their validation with numerical simulations. We stress-test the inference pipelines in simulated likelihood analyses that vary 6-7 cosmology parameters plus 20 nuisance parameters and precisely resemble the analysis to be presented in the DES 3x2pt analysis paper, using a variety of simulated input data vectors with varying assumptions. We find that any disagreement between pipelines leads to changes in assigned likelihood $\Delta \chi^2 \le 0.045$ with respect to the statistical error of the DES Y1 data vector. We also find that angular binning and survey mask do not impact our analytic covariance at a significant level. We determine lower bounds on scales used for analysis of galaxy clustering (8 Mpc $h^{-1}$) and galaxy-galaxy lensing (12 Mpc $h^{-1}$) such that the impact of modeling uncertainties in the non-linear regime is well below statistical errors, and show that our analysis choices are robust against a variety of systematics. These tests demonstrate that we have a robust analysis pipeline that yields unbiased cosmological parameter inferences for the flagship 3x2pt DES Y1 analysis. We emphasize that the level of independent code development and subsequent code comparison as demonstrated in this paper is necessary to produce credible constraints from increasingly complex multi-probe analyses of current data.

  7. ViPAR: a software platform for the Virtual Pooling and Analysis of Research Data.

    PubMed

    Carter, Kim W; Francis, Richard W; Carter, K W; Francis, R W; Bresnahan, M; Gissler, M; Grønborg, T K; Gross, R; Gunnes, N; Hammond, G; Hornig, M; Hultman, C M; Huttunen, J; Langridge, A; Leonard, H; Newman, S; Parner, E T; Petersson, G; Reichenberg, A; Sandin, S; Schendel, D E; Schalkwyk, L; Sourander, A; Steadman, C; Stoltenberg, C; Suominen, A; Surén, P; Susser, E; Sylvester Vethanayagam, A; Yusof, Z

    2016-04-01

    Research studies exploring the determinants of disease require sufficient statistical power to detect meaningful effects. Sample size is often increased through centralized pooling of disparately located datasets, though ethical, privacy and data ownership issues can often hamper this process. Methods that facilitate the sharing of research data that are sympathetic with these issues and which allow flexible and detailed statistical analyses are therefore in critical need. We have created a software platform for the Virtual Pooling and Analysis of Research data (ViPAR), which employs free and open source methods to provide researchers with a web-based platform to analyse datasets housed in disparate locations. Database federation permits controlled access to remotely located datasets from a central location. The Secure Shell protocol allows data to be securely exchanged between devices over an insecure network. ViPAR combines these free technologies into a solution that facilitates 'virtual pooling' where data can be temporarily pooled into computer memory and made available for analysis without the need for permanent central storage. Within the ViPAR infrastructure, remote sites manage their own harmonized research dataset in a database hosted at their site, while a central server hosts the data federation component and a secure analysis portal. When an analysis is initiated, requested data are retrieved from each remote site and virtually pooled at the central site. The data are then analysed by statistical software and, on completion, results of the analysis are returned to the user and the virtually pooled data are removed from memory. ViPAR is a secure, flexible and powerful analysis platform built on open source technology that is currently in use by large international consortia, and is made publicly available at [http://bioinformatics.childhealthresearch.org.au/software/vipar/]. © The Author 2015. Published by Oxford University Press on behalf of the International Epidemiological Association.
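
    The virtual-pooling idea can be illustrated with a toy sketch. The DataFrames below stand in for harmonized datasets that, in the real platform, would be retrieved over secure connections from remote sites and held only in memory; all variable names and values are hypothetical and the sketch is not ViPAR's actual implementation.

```python
import pandas as pd

# Toy stand-ins for harmonized datasets held at two remote sites; in ViPAR these
# would be retrieved over secure connections and never stored permanently at the centre.
site_a = pd.DataFrame({"site": "A", "exposed": [1, 0, 1, 1], "outcome": [1, 0, 0, 1]})
site_b = pd.DataFrame({"site": "B", "exposed": [0, 0, 1, 0], "outcome": [0, 1, 1, 0]})

pooled = pd.concat([site_a, site_b], ignore_index=True)   # 'virtual pool' held only in memory
print(pooled.groupby("exposed")["outcome"].mean())        # a pooled analysis across sites

del pooled   # the pooled copy is discarded once the analysis completes
```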

  8. A guide to statistical analysis in microbial ecology: a community-focused, living review of multivariate data analyses.

    PubMed

    Buttigieg, Pier Luigi; Ramette, Alban

    2014-12-01

    The application of multivariate statistical analyses has become a consistent feature in microbial ecology. However, many microbial ecologists are still in the process of developing a deep understanding of these methods and appreciating their limitations. As a consequence, staying abreast of progress and debate in this arena poses an additional challenge to many microbial ecologists. To address these issues, we present the GUide to STatistical Analysis in Microbial Ecology (GUSTA ME): a dynamic, web-based resource providing accessible descriptions of numerous multivariate techniques relevant to microbial ecologists. A combination of interactive elements allows users to discover and navigate between methods relevant to their needs and examine how they have been used by others in the field. We have designed GUSTA ME to become a community-led and -curated service, which we hope will provide a common reference and forum to discuss and disseminate analytical techniques relevant to the microbial ecology community. © 2014 The Authors. FEMS Microbiology Ecology published by John Wiley & Sons Ltd on behalf of Federation of European Microbiological Societies.

  9. Parasites as valuable stock markers for fisheries in Australasia, East Asia and the Pacific Islands.

    PubMed

    Lester, R J G; Moore, B R

    2015-01-01

    Over 30 studies in Australasia, East Asia and the Pacific Islands region have collected and analysed parasite data to determine the ranges of individual fish, many leading to conclusions about stock delineation. Parasites used as biological tags have included both those known to have long residence times in the fish and those thought to be relatively transient. In many cases the parasitological conclusions have been supported by other methods especially analysis of the chemical constituents of otoliths, and to a lesser extent, genetic data. In analysing parasite data, authors have applied multiple different statistical methodologies, including summary statistics, and univariate and multivariate approaches. Recently, a growing number of researchers have found non-parametric methods, such as analysis of similarities and cluster analysis, to be valuable. Future studies into the residence times, life cycles and geographical distributions of parasites together with more robust analytical methods will yield much important information to clarify stock structures in the area.

  10. Implication of the cause of differences in 3D structures of proteins with high sequence identity based on analyses of amino acid sequences and 3D structures.

    PubMed

    Matsuoka, Masanari; Sugita, Masatake; Kikuchi, Takeshi

    2014-09-18

    Proteins that share a high sequence homology while exhibiting drastically different 3D structures are investigated in this study. Recently, artificial proteins related to the sequences of the GA and IgG binding GB domains of human serum albumin have been designed. These artificial proteins, referred to as GA and GB, share 98% amino acid sequence identity but exhibit different 3D structures, namely, a 3α bundle versus a 4β + α structure. Discriminating between their 3D structures based on their amino acid sequences is a very difficult problem. In the present work, in addition to using bioinformatics techniques, an analysis based on inter-residue average distance statistics is used to address this problem. Ordinary analyses such as BLAST searches and conservation analyses alone could not distinguish which structure a given sequence would adopt. However, combining these analyses with the inter-residue average distance statistics and our sequence tendency analysis allowed us to infer which parts of each sequence play an important role in its structural formation. The results suggest possible determinants of the different 3D structures for sequences with high sequence identity. The possibility of discriminating between the 3D structures based on the given sequences is also discussed.

  11. [Cluster analysis applicability to fitness evaluation of cosmonauts on long-term missions of the International space station].

    PubMed

    Egorov, A D; Stepantsov, V I; Nosovskiĭ, A M; Shipov, A A

    2009-01-01

    Cluster analysis was applied to evaluate locomotion training (running and running intermingled with walking) of 13 cosmonauts on long-term ISS missions by the parameters of duration (min), distance (m) and intensity (km/h). Based on the results of analyses, the cosmonauts were distributed into three steady groups of 2, 5 and 6 persons. Distance and speed showed a statistical rise (p < 0.03) from group 1 to group 3. Duration of physical locomotion training was not statistically different in the groups (p = 0.125). Therefore, cluster analysis is an adequate method of evaluating fitness of cosmonauts on long-term missions.
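
    The abstract does not state which clustering algorithm was used; the sketch below illustrates the general approach with k-means on standardised duration, distance and intensity values for hypothetical training summaries (all numbers invented).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-cosmonaut locomotion-training summaries:
# duration (min), distance (m) and intensity (km/h) for 13 crew members.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(45, 10, 13),      # duration
    rng.normal(4000, 800, 13),   # distance
    rng.normal(8, 1.5, 13),      # intensity
])

# Standardise so no single unit dominates the distance metric, then form three clusters.
X_std = StandardScaler().fit_transform(X)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_std)
print(labels)   # cluster membership for each cosmonaut
```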

  12. Suggestions for presenting the results of data analyses

    USGS Publications Warehouse

    Anderson, David R.; Link, William A.; Johnson, Douglas H.; Burnham, Kenneth P.

    2001-01-01

    We give suggestions for the presentation of research results from frequentist, information-theoretic, and Bayesian analysis paradigms, followed by several general suggestions. The information-theoretic and Bayesian methods offer alternative approaches to data analysis and inference compared to traditionally used methods. Guidance is lacking on the presentation of results under these alternative procedures and on non-testing aspects of classical frequentist methods of statistical analysis. Null hypothesis testing has come under intense criticism. We recommend less reporting of the results of statistical tests of null hypotheses in cases where the null is surely false anyway, or where the null hypothesis is of little interest to science or management.

  13. Time Advice and Learning Questions in Computer Simulations

    ERIC Educational Resources Information Center

    Rey, Gunter Daniel

    2011-01-01

    Students (N = 101) used an introductory text and a computer simulation to learn fundamental concepts about statistical analyses (e.g., analysis of variance, regression analysis and General Linear Model). Each learner was randomly assigned to one cell of a 2 (with or without time advice) x 3 (with learning questions and corrective feedback, with…

  14. Testing Mediation Using Multiple Regression and Structural Equation Modeling Analyses in Secondary Data

    ERIC Educational Resources Information Center

    Li, Spencer D.

    2011-01-01

    Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…

  15. A Comparison of Imputation Methods for Bayesian Factor Analysis Models

    ERIC Educational Resources Information Center

    Merkle, Edgar C.

    2011-01-01

    Imputation methods are popular for the handling of missing data in psychology. The methods generally consist of predicting missing data based on observed data, yielding a complete data set that is amenable to standard statistical analyses. In the context of Bayesian factor analysis, this article compares imputation under an unrestricted…

  16. Multivariate geomorphic analysis of forest streams: Implications for assessment of land use impacts on channel condition

    Treesearch

    Richard. D. Wood-Smith; John M. Buffington

    1996-01-01

    Multivariate statistical analyses of geomorphic variables from 23 forest stream reaches in southeast Alaska result in successful discrimination between pristine streams and those disturbed by land management, specifically timber harvesting and associated road building. Results of discriminant function analysis indicate that a three-variable model discriminates 10...

  17. Using Rasch Analysis to Identify Uncharacteristic Responses to Undergraduate Assessments

    ERIC Educational Resources Information Center

    Edwards, Antony; Alcock, Lara

    2010-01-01

    Rasch Analysis is a statistical technique that is commonly used to analyse both test data and Likert survey data, to construct and evaluate question item banks, and to evaluate change in longitudinal studies. In this article, we introduce the dichotomous Rasch model, briefly discussing its assumptions. Then, using data collected in an…

  18. A book review of Spatial data analysis in ecology and agriculture using R

    USDA-ARS?s Scientific Manuscript database

    Spatial Data Analysis in Ecology and Agriculture Using R is a valuable resource to assist agricultural and ecological researchers with spatial data analyses using the R statistical software (www.r-project.org). Special emphasis is on spatial data sets; however, the text also provides ample guidance ...

  19. Technologies for Teaching and Learning about Box Plots and Statistical Analysis

    ERIC Educational Resources Information Center

    Forster, Patricia A.

    2007-01-01

    This paper analyses technology-based instruction on data-analysis with box plots. Examples of instruction taken from the research literature inform a study of two classes of 17 year-old students (upper secondary) in which the mathematical relationships that their teachers targeted are distinguished as being, or not being, relevant to statistical…

  20. A Meta-Analysis: The Relationship between Father Involvement and Student Academic Achievement

    ERIC Educational Resources Information Center

    Jeynes, William H.

    2015-01-01

    A meta-analysis was undertaken, including 66 studies, to determine the relationship between father involvement and the educational outcomes of urban school children. Statistical analyses were done to determine the overall impact and specific components of father involvement. The possible differing effects of paternal involvement by race were also…

  1. Grade Trend Analysis for a Credit-Bearing Library Instruction Course

    ERIC Educational Resources Information Center

    Guo, Shu

    2015-01-01

    Statistics suggest the prevalence of grade inflation nationwide, and researchers perform many analyses on student grades at both university and college levels. This analysis focuses on a one-credit library instruction course for undergraduate students at a large public university. The studies examine thirty semester GPAs and the percentages of As…

  2. [Basic concepts for network meta-analysis].

    PubMed

    Catalá-López, Ferrán; Tobías, Aurelio; Roqué, Marta

    2014-12-01

    Systematic reviews and meta-analyses have long been fundamental tools for evidence-based clinical practice. Initially, meta-analyses were proposed as a technique that could improve the accuracy and the statistical power of previous research from individual studies with small sample size. However, one of its main limitations has been that no more than two treatments can be compared in a single analysis, even when the clinical research question requires a comparison of multiple interventions. Network meta-analysis (NMA) uses novel statistical methods that incorporate information from both direct and indirect treatment comparisons in a network of studies examining the effects of various competing treatments, estimating comparisons between many treatments in a single analysis. Despite its potential limitations, NMA applications in clinical epidemiology can be of great value in situations where there are several treatments that have been compared against a common comparator. Also, NMA can be relevant to a research or clinical question when many treatments must be considered or when there is a mix of both direct and indirect information in the body of evidence. Copyright © 2013 Elsevier España, S.L.U. All rights reserved.
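
    The simplest building block of an indirect comparison is the Bucher method: two treatments that were each compared against a common comparator can be compared with one another by differencing the log odds ratios. A minimal sketch with made-up effect estimates, not taken from any study in this record:

```python
import numpy as np

# Hypothetical direct comparisons against a common comparator C:
# log odds ratios and standard errors for A vs C and B vs C.
log_or_ac, se_ac = np.log(0.80), 0.10
log_or_bc, se_bc = np.log(0.95), 0.12

# Bucher adjusted indirect comparison of A vs B through the common comparator.
log_or_ab = log_or_ac - log_or_bc
se_ab = (se_ac**2 + se_bc**2) ** 0.5
lo, hi = np.exp(log_or_ab - 1.96 * se_ab), np.exp(log_or_ab + 1.96 * se_ab)
print(f"indirect OR(A vs B) = {np.exp(log_or_ab):.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```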

  3. The Effect of Folate and Folate Plus Zinc Supplementation on Endocrine Parameters and Sperm Characteristics in Sub-Fertile Men: A Systematic Review and Meta-Analysis.

    PubMed

    Irani, Morvarid; Amirian, Malihe; Sadeghi, Ramin; Lez, Justine Le; Latifnejad Roudsari, Robab

    2017-08-29

    To evaluate the effect of folate and folate plus zinc supplementation on endocrine parameters and sperm characteristics in sub-fertile men, we conducted a systematic review and meta-analysis. Electronic databases (Medline, Scopus, Google Scholar) and Persian databases (SID, Iran medex, Magiran, Medlib, Iran doc) were searched from 1966 to December 2016 using a set of relevant keywords including "folate or folic acid AND (infertility, infertile, sterility)". All available randomized controlled trials (RCTs), conducted on samples of sub-fertile men with semen analyses who took oral folic acid or folate plus zinc, were included. Data collected included endocrine parameters and sperm characteristics. Statistical analyses were done with Comprehensive Meta-analysis Version 2. In total, seven studies were included; six had sufficient data for meta-analysis. Sperm concentration was statistically higher in men supplemented with folate than with placebo (P < .001). However, folate supplementation alone did not seem to be more effective than placebo for sperm morphology (P = .056) or motility (P = .652). Folate plus zinc supplementation did not show any statistically different effect on serum testosterone (P = .86), inhibin B (P = .84), FSH (P = .054), or sperm motility (P = .169) compared with placebo. Yet, folate plus zinc showed a statistically greater effect on sperm concentration (P < .001), morphology (P < .001) and serum folate level (P < .001) compared with placebo. Folate plus zinc supplementation has a positive effect on sperm characteristics in sub-fertile men. However, these results should be interpreted with caution due to the substantial heterogeneity of the studies included in this meta-analysis. Further trials are still needed to confirm the current findings.

  4. Geographically Sourcing Cocaine's Origin - Delineation of the Nineteen Major Coca Growing Regions in South America.

    PubMed

    Mallette, Jennifer R; Casale, John F; Jordan, James; Morello, David R; Beyer, Paul M

    2016-03-23

    Previously, geo-sourcing to five major coca growing regions within South America was accomplished. However, the expansion of coca cultivation throughout South America made sub-regional origin determinations increasingly difficult. The former methodology was recently enhanced with additional stable isotope analyses ((2)H and (18)O) to fully characterize cocaine due to the varying environmental conditions in which the coca was grown. An improved data analysis method was implemented with the combination of machine learning and multivariate statistical analysis methods to provide further partitioning between growing regions. Here, we show how the combination of trace cocaine alkaloids, stable isotopes, and multivariate statistical analyses can be used to classify illicit cocaine as originating from one of 19 growing regions within South America. The data obtained through this approach can be used to describe current coca cultivation and production trends, highlight trafficking routes, as well as identify new coca growing regions.

  5. Geographically Sourcing Cocaine’s Origin - Delineation of the Nineteen Major Coca Growing Regions in South America

    NASA Astrophysics Data System (ADS)

    Mallette, Jennifer R.; Casale, John F.; Jordan, James; Morello, David R.; Beyer, Paul M.

    2016-03-01

    Previously, geo-sourcing to five major coca growing regions within South America was accomplished. However, the expansion of coca cultivation throughout South America made sub-regional origin determinations increasingly difficult. The former methodology was recently enhanced with additional stable isotope analyses (2H and 18O) to fully characterize cocaine due to the varying environmental conditions in which the coca was grown. An improved data analysis method was implemented with the combination of machine learning and multivariate statistical analysis methods to provide further partitioning between growing regions. Here, we show how the combination of trace cocaine alkaloids, stable isotopes, and multivariate statistical analyses can be used to classify illicit cocaine as originating from one of 19 growing regions within South America. The data obtained through this approach can be used to describe current coca cultivation and production trends, highlight trafficking routes, as well as identify new coca growing regions.

  6. Good analytical practice: statistics and handling data in biomedical science. A primer and directions for authors. Part 1: Introduction. Data within and between one or two sets of individuals.

    PubMed

    Blann, A D; Nation, B R

    2008-01-01

    The biomedical scientist is bombarded on a daily basis by information, almost all of which refers to the health status of an individual or groups of individuals. This review is the first of a two-part article written to explain some of the issues related to the presentation and analysis of data. The first part focuses on types of data and how to present and analyse data from an individual or from one or two groups of persons. The second part will examine data from three or more sets of persons, what methods are available to allow this analysis (i.e., statistical software packages), and will conclude with a statement on appropriate descriptors of data, their analyses, and presentation for authors considering submission of their data to this journal.

  7. A weighted U-statistic for genetic association analyses of sequencing data.

    PubMed

    Wei, Changshuai; Li, Ming; He, Zihuai; Vsevolozhskaya, Olga; Schaid, Daniel J; Lu, Qing

    2014-12-01

    With advancements in next-generation sequencing technology, a massive amount of sequencing data is generated, which offers a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, the high-dimensional sequencing data poses a great challenge for statistical analysis. The association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a Weighted U Sequencing test, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a nonparametric U-statistic, WU-SEQ makes no assumption of the underlying disease model and phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed a commonly used sequence kernel association test (SKAT) method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL4 and very low density lipoprotein cholesterol. © 2014 WILEY PERIODICALS, INC.

  8. A probabilistic analysis of electrical equipment vulnerability to carbon fibers

    NASA Technical Reports Server (NTRS)

    Elber, W.

    1980-01-01

    The statistical problems of airborne carbon fibers falling onto electrical circuits were idealized and analyzed. The probability of making contact between randomly oriented finite-length fibers and sets of parallel conductors with various spacings and lengths was developed theoretically. The probability of multiple fibers joining to bridge a single gap between conductors, or forming continuous networks, is also included. From these theoretical considerations, practical statistical analyses to assess the likelihood of causing electrical malfunctions were produced. The statistics obtained were confirmed by comparison with results of controlled experiments.
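
    The paper derives contact probabilities analytically; as an illustrative alternative, the same kind of quantity can be estimated by Monte Carlo simulation. The sketch below (all dimensions and the geometric idealisation are hypothetical) estimates the probability that a randomly placed, randomly oriented fiber spans two parallel conductors.

```python
import numpy as np

rng = np.random.default_rng(1)
fiber_len, gap, n = 10.0, 4.0, 100_000   # fiber length and conductor spacing in mm (illustrative)

# Drop fibers with a random centre height around the gap and a random in-plane orientation;
# a fiber bridges the conductors when its vertical extent spans the whole gap.
centre_y = rng.uniform(-fiber_len / 2, gap + fiber_len / 2, n)
theta = rng.uniform(0.0, np.pi, n)
half_span = (fiber_len / 2) * np.abs(np.sin(theta))
bridges = (centre_y - half_span <= 0.0) & (centre_y + half_span >= gap)
print(f"estimated bridging probability: {bridges.mean():.3f}")
```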

  9. The added value of ordinal analysis in clinical trials: an example in traumatic brain injury.

    PubMed

    Roozenbeek, Bob; Lingsma, Hester F; Perel, Pablo; Edwards, Phil; Roberts, Ian; Murray, Gordon D; Maas, Andrew Ir; Steyerberg, Ewout W

    2011-01-01

    In clinical trials, ordinal outcome measures are often dichotomized into two categories. In traumatic brain injury (TBI) the 5-point Glasgow outcome scale (GOS) is collapsed into unfavourable versus favourable outcome. Simulation studies have shown that exploiting the ordinal nature of the GOS increases chances of detecting treatment effects. The objective of this study is to quantify the benefits of ordinal analysis in the real-life situation of a large TBI trial. We used data from the CRASH trial that investigated the efficacy of corticosteroids in TBI patients (n = 9,554). We applied two techniques for ordinal analysis: proportional odds analysis and the sliding dichotomy approach, where the GOS is dichotomized at different cut-offs according to baseline prognostic risk. These approaches were compared to dichotomous analysis. The information density in each analysis was indicated by a Wald statistic. All analyses were adjusted for baseline characteristics. Dichotomous analysis of the six-month GOS showed a non-significant treatment effect (OR = 1.09, 95% CI 0.98 to 1.21, P = 0.096). Ordinal analysis with proportional odds regression or sliding dichotomy showed highly statistically significant treatment effects (OR 1.15, 95% CI 1.06 to 1.25, P = 0.0007 and 1.19, 95% CI 1.08 to 1.30, P = 0.0002), with 2.05-fold and 2.56-fold higher information density compared to the dichotomous approach respectively. Analysis of the CRASH trial data confirmed that ordinal analysis of outcome substantially increases statistical power. We expect these results to hold for other fields of critical care medicine that use ordinal outcome measures and recommend that future trials adopt ordinal analyses. This will permit detection of smaller treatment effects.
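
    A hedged sketch of the contrast between a dichotomised and a proportional-odds analysis of an ordinal outcome, using simulated data and the OrderedModel class available in recent statsmodels releases; this is only an illustration of the two approaches, not the CRASH analysis itself.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.miscmodels.ordinal_model import OrderedModel

# Simulated 5-point ordinal outcome (a GOS-like scale) and a binary treatment indicator.
rng = np.random.default_rng(2)
treat = rng.integers(0, 2, 1000).astype(float)
latent = 0.3 * treat + rng.logistic(size=1000)
outcome = np.digitize(latent, [-1.5, -0.5, 0.5, 1.5])   # ordinal categories 0..4

# Proportional-odds analysis keeps all five categories.
ordinal_fit = OrderedModel(outcome, treat.reshape(-1, 1), distr="logit").fit(method="bfgs", disp=False)
print(ordinal_fit.params)   # first entry: common log odds ratio; the rest are thresholds

# Dichotomised analysis collapses the scale into unfavourable (0-2) vs favourable (3-4).
binary = (outcome >= 3).astype(int)
logit_fit = sm.Logit(binary, sm.add_constant(treat)).fit(disp=False)
print(logit_fit.params)     # second entry: log odds ratio from the dichotomised outcome
```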

  10. Statistical Analysis of Individual Participant Data Meta-Analyses: A Comparison of Methods and Recommendations for Practice

    PubMed Central

    Stewart, Gavin B.; Altman, Douglas G.; Askie, Lisa M.; Duley, Lelia; Simmonds, Mark C.; Stewart, Lesley A.

    2012-01-01

    Background Individual participant data (IPD) meta-analyses that obtain “raw” data from studies rather than summary data typically adopt a “two-stage” approach to analysis whereby IPD within trials generate summary measures, which are combined using standard meta-analytical methods. Recently, a range of “one-stage” approaches which combine all individual participant data in a single meta-analysis have been suggested as providing a more powerful and flexible approach. However, they are more complex to implement and require statistical support. This study uses a dataset to compare “two-stage” and “one-stage” models of varying complexity, to ascertain whether results obtained from the approaches differ in a clinically meaningful way. Methods and Findings We included data from 24 randomised controlled trials, evaluating antiplatelet agents, for the prevention of pre-eclampsia in pregnancy. We performed two-stage and one-stage IPD meta-analyses to estimate overall treatment effect and to explore potential treatment interactions whereby particular types of women and their babies might benefit differentially from receiving antiplatelets. Two-stage and one-stage approaches gave similar results, showing a benefit of using anti-platelets (Relative risk 0.90, 95% CI 0.84 to 0.97). Neither approach suggested that any particular type of women benefited more or less from antiplatelets. There were no material differences in results between different types of one-stage model. Conclusions For these data, two-stage and one-stage approaches to analysis produce similar results. Although one-stage models offer a flexible environment for exploring model structure and are useful where across study patterns relating to types of participant, intervention and outcome mask similar relationships within trials, the additional insights provided by their usage may not outweigh the costs of statistical support for routine application in syntheses of randomised controlled trials. Researchers considering undertaking an IPD meta-analysis should not necessarily be deterred by a perceived need for sophisticated statistical methods when combining information from large randomised trials. PMID:23056232
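
    The "two-stage" approach can be sketched in a few lines: each trial contributes a summary effect and its variance (stage one), and the summaries are combined by inverse-variance weighting (stage two). The counts below are invented for illustration and a fixed-effect model is assumed; the trial's actual analysis is more elaborate.

```python
import numpy as np

# Hypothetical per-trial 2x2 counts: (events_treat, n_treat, events_control, n_control).
trials = [(30, 200, 40, 200),
          (12, 150, 18, 150),
          (55, 400, 70, 400)]

# Stage one: a log relative risk and its approximate variance from each trial.
log_rr = np.array([np.log((et / nt) / (ec / nc)) for et, nt, ec, nc in trials])
var = np.array([1 / et - 1 / nt + 1 / ec - 1 / nc for et, nt, ec, nc in trials])

# Stage two: fixed-effect inverse-variance pooling of the trial summaries.
w = 1.0 / var
pooled = np.sum(w * log_rr) / np.sum(w)
se = np.sqrt(1.0 / np.sum(w))
print(f"pooled RR = {np.exp(pooled):.2f} "
      f"(95% CI {np.exp(pooled - 1.96 * se):.2f} to {np.exp(pooled + 1.96 * se):.2f})")
```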

  11. Exploratory Visual Analysis of Statistical Results from Microarray Experiments Comparing High and Low Grade Glioma

    PubMed Central

    Reif, David M.; Israel, Mark A.; Moore, Jason H.

    2007-01-01

    The biological interpretation of gene expression microarray results is a daunting challenge. For complex diseases such as cancer, wherein the body of published research is extensive, the incorporation of expert knowledge provides a useful analytical framework. We have previously developed the Exploratory Visual Analysis (EVA) software for exploring data analysis results in the context of annotation information about each gene, as well as biologically relevant groups of genes. We present EVA as a flexible combination of statistics and biological annotation that provides a straightforward visual interface for the interpretation of microarray analyses of gene expression in the most commonly occurring class of brain tumors, glioma. We demonstrate the utility of EVA for the biological interpretation of statistical results by analyzing publicly available gene expression profiles of two important glial tumors. The results of a statistical comparison between 21 malignant, high-grade glioblastoma multiforme (GBM) tumors and 19 indolent, low-grade pilocytic astrocytomas were analyzed using EVA. By using EVA to examine the results of a relatively simple statistical analysis, we were able to identify tumor class-specific gene expression patterns having both statistical and biological significance. Our interactive analysis highlighted the potential importance of genes involved in cell cycle progression, proliferation, signaling, adhesion, migration, motility, and structure, as well as candidate gene loci on a region of Chromosome 7 that has been implicated in glioma. Because EVA does not require statistical or computational expertise and has the flexibility to accommodate any type of statistical analysis, we anticipate EVA will prove a useful addition to the repertoire of computational methods used for microarray data analysis. EVA is available at no charge to academic users and can be found at http://www.epistasis.org. PMID:19390666

  12. Using Network Analysis to Characterize Biogeographic Data in a Community Archive

    NASA Astrophysics Data System (ADS)

    Wellman, T. P.; Bristol, S.

    2017-12-01

    Informative measures are needed to evaluate and compare data from multiple providers in a community-driven data archive. This study explores insights from network theory and other descriptive and inferential statistics to examine data content and application across an assemblage of publicly available biogeographic data sets. The data are archived in ScienceBase, a collaborative catalog of scientific data supported by the U.S. Geological Survey to enhance scientific inquiry and acuity. Through this investigation and related scientific venues, our goal is to improve scientific insight and data use across a spectrum of scientific applications. Network analysis is a tool to reveal patterns of non-trivial topological features in data that exhibit neither complete regularity nor complete randomness. In this work, network analyses are used to explore shared events and dependencies between measures of data content and application derived from metadata and catalog information and measures relevant to biogeographic study. Descriptive statistical tools are used to explore relations between network analysis properties, while inferential statistics are used to evaluate the degree of confidence in these assessments. Network analyses have been used successfully in related fields to examine social awareness of scientific issues, taxonomic structures of biological organisms, and ecosystem resilience to environmental change. Use of network analysis also shows promising potential to identify relationships in biogeographic data that inform programmatic goals and scientific interests.

  13. Modelling the Effects of Land-Use Changes on Climate: a Case Study on Yamula DAM

    NASA Astrophysics Data System (ADS)

    Köylü, Ü.; Geymen, A.

    2016-10-01

    Dams block the flow of rivers and create artificial water reservoirs that affect the climate and land use characteristics of the river basin. In this research, the effect of the large water body impounded by the Yamula Dam in the Kızılırmak Basin on the surrounding area's land use and on climate change is analysed. The Mann-Kendall non-parametric statistical test, the Theil-Sen slope method, Inverse Distance Weighting (IDW) and the Soil Conservation Service-Curve Number (SCS-CN) method are integrated for spatial and temporal analysis of the research area. For this research, humidity, temperature, wind speed and precipitation observations collected at 16 weather stations near the Kızılırmak Basin are analysed. This statistical information is then combined with GIS data over the years. An application for GIS analysis was developed in the Python programming language and integrated with ArcGIS software; the statistical analyses were calculated in the R Project for Statistical Computing and integrated with the developed application. According to the statistical analysis of the extracted time series of meteorological parameters, statistically significant spatiotemporal trends are observed in climate change and land use characteristics. In this study, we demonstrate the effect of a large dam on local climate in the semi-arid Yamula Dam region.

  14. Analysis of Exhaled Breath Volatile Organic Compounds in Inflammatory Bowel Disease: A Pilot Study.

    PubMed

    Hicks, Lucy C; Huang, Juzheng; Kumar, Sacheen; Powles, Sam T; Orchard, Timothy R; Hanna, George B; Williams, Horace R T

    2015-09-01

    Distinguishing between the inflammatory bowel diseases [IBD], Crohn's disease [CD] and ulcerative colitis [UC], is important for determining management and prognosis. Selected ion flow tube mass spectrometry [SIFT-MS] may be used to analyse volatile organic compounds [VOCs] in exhaled breath: these may be altered in disease states, and distinguishing breath VOC profiles can be identified. The aim of this pilot study was to identify, quantify, and analyse VOCs present in the breath of IBD patients and controls, potentially providing insights into disease pathogenesis and complementing current diagnostic algorithms. SIFT-MS breath profiling of 56 individuals [20 UC, 18 CD, and 18 healthy controls] was undertaken. Multivariate analysis included principal components analysis and partial least squares discriminant analysis with orthogonal signal correction [OSC-PLS-DA]. Receiver operating characteristic [ROC] analysis was performed for each comparative analysis using statistically significant VOCs. OSC-PLS-DA modelling was able to distinguish both CD and UC from healthy controls and from one other with good sensitivity and specificity. ROC analysis using combinations of statistically significant VOCs [dimethyl sulphide, hydrogen sulphide, hydrogen cyanide, ammonia, butanal, and nonanal] gave integrated areas under the curve of 0.86 [CD vs healthy controls], 0.74 [UC vs healthy controls], and 0.83 [CD vs UC]. Exhaled breath VOC profiling was able to distinguish IBD patients from controls, as well as to separate UC from CD, using both multivariate and univariate statistical techniques. Copyright © 2015 European Crohn’s and Colitis Organisation (ECCO). Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
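
    A minimal sketch of the PLS-DA plus ROC workflow on simulated VOC data, omitting the orthogonal signal correction step used in the study; group sizes, the planted effect, and the number of components are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

# Simulated concentrations of six VOCs for 56 subjects in two groups (0 = control, 1 = case).
rng = np.random.default_rng(3)
X = rng.normal(size=(56, 6))
y = rng.integers(0, 2, 56)
X[y == 1, 0] += 1.0                        # give one VOC a genuine group difference

# PLS-DA: regress the class label on the measurements and score subjects by the prediction.
scores = cross_val_predict(PLSRegression(n_components=2), X, y.astype(float), cv=5).ravel()
print("cross-validated AUC:", round(roc_auc_score(y, scores), 2))
```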

  15. Statistical analysis plan for the Pneumatic CompREssion for PreVENting Venous Thromboembolism (PREVENT) trial: a study protocol for a randomized controlled trial.

    PubMed

    Arabi, Yaseen; Al-Hameed, Fahad; Burns, Karen E A; Mehta, Sangeeta; Alsolamy, Sami; Almaani, Mohammed; Mandourah, Yasser; Almekhlafi, Ghaleb A; Al Bshabshe, Ali; Finfer, Simon; Alshahrani, Mohammed; Khalid, Imran; Mehta, Yatin; Gaur, Atul; Hawa, Hassan; Buscher, Hergen; Arshad, Zia; Lababidi, Hani; Al Aithan, Abdulsalam; Jose, Jesna; Abdukahil, Sheryl Ann I; Afesh, Lara Y; Dbsawy, Maamoun; Al-Dawood, Abdulaziz

    2018-03-15

    The Pneumatic CompREssion for Preventing VENous Thromboembolism (PREVENT) trial evaluates the effect of adjunctive intermittent pneumatic compression (IPC) with pharmacologic thromboprophylaxis compared to pharmacologic thromboprophylaxis alone on venous thromboembolism (VTE) in critically ill adults. In this multicenter randomized trial, critically ill patients receiving pharmacologic thromboprophylaxis will be randomized to an IPC or a no IPC (control) group. The primary outcome is "incident" proximal lower-extremity deep vein thrombosis (DVT) within 28 days after randomization. Radiologists interpreting the lower-extremity ultrasonography will be blinded to intervention allocation, whereas the patients and treating team will be unblinded. The trial has 80% power to detect a 3% absolute risk reduction in the rate of proximal DVT from 7% to 4%. Consistent with international guidelines, we have developed a detailed plan to guide the analysis of the PREVENT trial. This plan specifies the statistical methods for the evaluation of primary and secondary outcomes, and defines covariates for adjusted analyses a priori. Application of this statistical analysis plan to the PREVENT trial will facilitate unbiased analyses of clinical data. ClinicalTrials.gov , ID: NCT02040103 . Registered on 3 November 2013; Current controlled trials, ID: ISRCTN44653506 . Registered on 30 October 2013.

  16. Agriculture, population growth, and statistical analysis of the radiocarbon record.

    PubMed

    Zahid, H Jabran; Robinson, Erick; Kelly, Robert L

    2016-01-26

    The human population has grown significantly since the onset of the Holocene about 12,000 y ago. Despite decades of research, the factors determining prehistoric population growth remain uncertain. Here, we examine measurements of the rate of growth of the prehistoric human population based on statistical analysis of the radiocarbon record. We find that, during most of the Holocene, human populations worldwide grew at a long-term annual rate of 0.04%. Statistical analysis of the radiocarbon record shows that transitioning farming societies experienced the same rate of growth as contemporaneous foraging societies. The same rate of growth measured for populations dwelling in a range of environments and practicing a variety of subsistence strategies suggests that the global climate and/or endogenous biological factors, not adaptability to local environment or subsistence practices, regulated the long-term growth of the human population during most of the Holocene. Our results demonstrate that statistical analyses of large ensembles of radiocarbon dates are robust and valuable for quantitatively investigating the demography of prehistoric human populations worldwide.

  17. OSPAR standard method and software for statistical analysis of beach litter data.

    PubMed

    Schulz, Marcus; van Loon, Willem; Fleet, David M; Baggelaar, Paul; van der Meulen, Eit

    2017-09-15

    The aim of this study is to develop standard statistical methods and software for the analysis of beach litter data. The optimal ensemble of statistical methods comprises the Mann-Kendall trend test, the Theil-Sen slope estimation, the Wilcoxon step trend test and basic descriptive statistics. The application of Litter Analyst, a tailor-made software for analysing the results of beach litter surveys, to OSPAR beach litter data from seven beaches bordering on the south-eastern North Sea, revealed 23 significant trends in the abundances of beach litter types for the period 2009-2014. Litter Analyst revealed a large variation in the abundance of litter types between beaches. To reduce the effects of spatial variation, trend analysis of beach litter data can most effectively be performed at the beach or national level. Spatial aggregation of beach litter data within a region is possible, but resulted in a considerable reduction in the number of significant trends. Copyright © 2017 Elsevier Ltd. All rights reserved.
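
    A small sketch of the core of the recommended ensemble, a Mann-Kendall-style trend test and the Theil-Sen slope, applied to invented annual counts of one litter type; scipy's kendalltau and theilslopes are used here as stand-ins for the dedicated Litter Analyst implementation.

```python
import numpy as np
from scipy.stats import kendalltau, theilslopes

# Invented annual counts of one litter type on a single beach, 2009-2014.
years = np.arange(2009, 2015)
counts = np.array([120, 95, 110, 80, 70, 65])

tau, p = kendalltau(years, counts)                     # Mann-Kendall-style monotonic trend test
slope, intercept, lo, hi = theilslopes(counts, years)  # Theil-Sen slope with 95% CI
print(f"tau = {tau:.2f}, p = {p:.3f}; Theil-Sen slope = {slope:.1f} items/year "
      f"(95% CI {lo:.1f} to {hi:.1f})")
```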

  18. Quantile regression for the statistical analysis of immunological data with many non-detects.

    PubMed

    Eilers, Paul H C; Röder, Esther; Savelkoul, Huub F J; van Wijk, Roy Gerth

    2012-07-07

    Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an implementation to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.
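
    A hedged sketch of median (quantile) regression on simulated data with non-detects recorded at the detection limit, using statsmodels' quantreg; the variable names, the detection limit, and the dose-response shape are all invented for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated immunological marker with a detection limit; values below it are non-detects
# and are recorded at the limit itself (a common reporting convention).
rng = np.random.default_rng(4)
dose = np.repeat([0, 1, 2, 3], 30)
marker = np.exp(0.4 * dose + rng.normal(0.0, 1.0, dose.size))
lod = 1.0
observed = np.where(marker < lod, lod, marker)

df = pd.DataFrame({"dose": dose, "marker": observed})
median_fit = smf.quantreg("marker ~ dose", df).fit(q=0.5)
print(median_fit.params)   # the median trend remains estimable despite many censored values
```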

  19. Reduction of Complications of Local Anaesthesia in Dental Healthcare Setups by Application of the Six Sigma Methodology: A Statistical Quality Improvement Technique.

    PubMed

    Akifuddin, Syed; Khatoon, Farheen

    2015-12-01

    Health care faces challenges due to complications, inefficiencies and other concerns that threaten the safety of patients. The purpose of this study was to identify causes of complications encountered after administration of local anaesthesia for dental and oral surgical procedures and to reduce the incidence of complications by introduction of the Six Sigma methodology. The DMAIC (Define, Measure, Analyse, Improve and Control) process of Six Sigma was used to reduce the incidence of complications encountered after administration of local anaesthesia injections for dental and oral surgical procedures, using failure mode and effect analysis. Pareto analysis was used to identify the most frequently recurring complications. A paired z-sample test using Minitab statistical software and Fisher's exact test were used to statistically analyse the obtained data. A p-value <0.05 was considered significant. In total, 54 systemic and 62 local complications occurred during the three months of the analyse and measure phases. Syncope, failure of anaesthesia, trismus, self-biting injuries (auto mordeduras) and pain at the injection site were found to be the most frequently recurring complications. The cumulative defective percentage was 7.99 for the pre-improvement data and decreased to 4.58 in the control phase. The estimate for the difference was 0.0341228 and the 95% lower bound for the difference was 0.0193966. The p-value was highly significant (p = 0.000). The application of the Six Sigma improvement methodology in healthcare tends to deliver consistently better results to patients as well as hospitals, and results in better patient compliance and satisfaction.
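
    Two of the quantitative steps mentioned, the Pareto ranking of complication causes and a test of pre- versus post-improvement proportions, can be sketched as follows; all counts are hypothetical, and Fisher's exact test is shown in place of the study's paired z-test.

```python
from collections import Counter
from scipy.stats import fisher_exact

# Hypothetical complication tallies from a measure phase; Pareto analysis ranks the causes
# and shows how few of them account for most events.
complications = Counter({"syncope": 28, "failure of anaesthesia": 24, "trismus": 18,
                         "self-biting": 12, "pain at injection site": 10, "other": 8})
total = sum(complications.values())
running = 0
for cause, n in complications.most_common():
    running += n
    print(f"{cause:25s} {n:3d}  cumulative {100 * running / total:5.1f}%")

# Compare defect proportions before and after improvement (counts invented) with Fisher's exact test.
table = [[80, 920],    # before: complications, injections without complication
         [46, 954]]    # after
odds_ratio, p = fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, p = {p:.4f}")
```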

  20. The Extent and Consequences of P-Hacking in Science

    PubMed Central

    Head, Megan L.; Holman, Luke; Lanfear, Rob; Kahn, Andrew T.; Jennions, Michael D.

    2015-01-01

    A focus on novel, confirmatory, and statistically significant results leads to substantial bias in the scientific literature. One type of bias, known as “p-hacking,” occurs when researchers collect or select data or statistical analyses until nonsignificant results become significant. Here, we use text-mining to demonstrate that p-hacking is widespread throughout science. We then illustrate how one can test for p-hacking when performing a meta-analysis and show that, while p-hacking is probably common, its effect seems to be weak relative to the real effect sizes being measured. This result suggests that p-hacking probably does not drastically alter scientific consensuses drawn from meta-analyses. PMID:25768323

  1. permGPU: Using graphics processing units in RNA microarray association studies.

    PubMed

    Shterev, Ivo D; Jung, Sin-Ho; George, Stephen L; Owzar, Kouros

    2010-06-16

    Many analyses of microarray association studies involve permutation, bootstrap resampling and cross-validation, that are ideally formulated as embarrassingly parallel computing problems. Given that these analyses are computationally intensive, scalable approaches that can take advantage of multi-core processor systems need to be developed. We have developed a CUDA based implementation, permGPU, that employs graphics processing units in microarray association studies. We illustrate the performance and applicability of permGPU within the context of permutation resampling for a number of test statistics. An extensive simulation study demonstrates a dramatic increase in performance when using permGPU on an NVIDIA GTX 280 card compared to an optimized C/C++ solution running on a conventional Linux server. permGPU is available as an open-source stand-alone application and as an extension package for the R statistical environment. It provides a dramatic increase in performance for permutation resampling analysis in the context of microarray association studies. The current version offers six test statistics for carrying out permutation resampling analyses for binary, quantitative and censored time-to-event traits.
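
    A CPU-only sketch of the permutation-resampling idea for a single probe (permGPU runs this kind of logic on the GPU across many probes and several test statistics); the data are simulated and a simple mean difference is used as the test statistic.

```python
import numpy as np

# CPU sketch of permutation resampling for one probe: how extreme is the observed group
# difference relative to differences obtained under random relabelling of the samples?
rng = np.random.default_rng(5)
expression = rng.normal(size=60)
group = np.array([0] * 30 + [1] * 30)
expression[group == 1] += 0.6                      # planted effect for illustration

observed = expression[group == 1].mean() - expression[group == 0].mean()
perm_diffs = np.empty(10_000)
for i in range(perm_diffs.size):
    perm = rng.permutation(group)
    perm_diffs[i] = expression[perm == 1].mean() - expression[perm == 0].mean()

p_value = float(np.mean(np.abs(perm_diffs) >= abs(observed)))
print(f"observed difference = {observed:.2f}, permutation p = {p_value:.4f}")
```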

  2. Statistical analysis plan for the Laser-1st versus Drops-1st for Glaucoma and Ocular Hypertension Trial (LiGHT): a multi-centre randomised controlled trial.

    PubMed

    Vickerstaff, Victoria; Ambler, Gareth; Bunce, Catey; Xing, Wen; Gazzard, Gus

    2015-11-11

    The LiGHT trial (Laser-1st versus Drops-1st for Glaucoma and Ocular Hypertension Trial) is a multicentre randomised controlled trial of two treatment pathways for patients who are newly diagnosed with open-angle glaucoma (OAG) and ocular hypertension (OHT). The main hypothesis for the trial is that lowering intraocular pressure (IOP) with selective laser trabeculoplasty (SLT) as the primary treatment ('Laser-1st') leads to a better health-related quality of life than for those started on IOP-lowering drops as their primary treatment ('Medicine-1st') and that this is associated with reduced costs and improved tolerability of treatment. This paper describes the statistical analysis plan for the study. The LiGHT trial is an unmasked, multi-centre randomised controlled trial. A total of 718 patients (359 per arm) are being randomised to two groups: medicine-first or laser-first treatment. Outcomes are recorded at baseline and at 6-month intervals up to 36 months. The primary outcome measure is health-related quality of life (HRQL) at 36 months measured using the EQ-5D-5L. The main secondary outcome is the Glaucoma Utility Index. We plan to analyse the patient outcome data according to the group to which the patient was originally assigned. Methods of statistical analysis are described, including the handling of missing data, the covariates used in the adjusted analyses and the planned sensitivity analyses. The trial was registered with the ISRCTN register on 23/07/2012, number ISRCTN32038223 .

  3. Which statistics should tropical biologists learn?

    PubMed

    Loaiza Velásquez, Natalia; González Lutz, María Isabel; Monge-Nájera, Julián

    2011-09-01

    Tropical biologists study the richest and most endangered biodiversity in the planet, and in these times of climate change and mega-extinctions, the need for efficient, good quality research is more pressing than in the past. However, the statistical component in research published by tropical authors sometimes suffers from poor quality in data collection; mediocre or bad experimental design and a rigid and outdated view of data analysis. To suggest improvements in their statistical education, we listed all the statistical tests and other quantitative analyses used in two leading tropical journals, the Revista de Biología Tropical and Biotropica, during a year. The 12 most frequent tests in the articles were: Analysis of Variance (ANOVA), Chi-Square Test, Student's T Test, Linear Regression, Pearson's Correlation Coefficient, Mann-Whitney U Test, Kruskal-Wallis Test, Shannon's Diversity Index, Tukey's Test, Cluster Analysis, Spearman's Rank Correlation Test and Principal Component Analysis. We conclude that statistical education for tropical biologists must abandon the old syllabus based on the mathematical side of statistics and concentrate on the correct selection of these and other procedures and tests, on their biological interpretation and on the use of reliable and friendly freeware. We think that their time will be better spent understanding and protecting tropical ecosystems than trying to learn the mathematical foundations of statistics: in most cases, a well designed one-semester course should be enough for their basic requirements.
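
    Several of the most frequent tests in the survey are one-liners in freely available software; a brief sketch with simulated data using scipy (the groups and values are invented):

```python
import numpy as np
from scipy import stats

# Three hypothetical treatment groups measured on one response variable.
rng = np.random.default_rng(6)
a, b, c = rng.normal(10, 2, 20), rng.normal(11, 2, 20), rng.normal(13, 2, 20)

print(stats.f_oneway(a, b, c))     # one-way ANOVA
print(stats.kruskal(a, b, c))      # its rank-based, non-parametric counterpart
print(stats.ttest_ind(a, b))       # Student's t test for two of the groups
print(stats.spearmanr(a, b))       # Spearman's rank correlation
```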

  4. An audit of the statistics and the comparison with the parameter in the population

    NASA Astrophysics Data System (ADS)

    Bujang, Mohamad Adam; Sa'at, Nadiah; Joys, A. Reena; Ali, Mariana Mohamad

    2015-10-01

    The sample size needed for sample statistics to closely estimate particular population parameters has long been an issue. Although the sample size may have been calculated with reference to the objective of the study, it is difficult to confirm whether the resulting statistics are close to the parameters of a particular population. Meanwhile, a guideline based on a p-value of less than 0.05 is widely used as inferential evidence. Therefore, this study audited results analyzed from various sub-samples and statistical analyses and compared them with the parameters of three different populations. Eight types of statistical analysis, each with eight sub-samples, were analyzed. The results show that the statistics were consistent and close to the parameters when the sample covered at least 15% to 35% of the population. Larger sample sizes are needed to estimate parameters involving categorical variables than those involving numerical variables. Sample sizes of 300 to 500 are sufficient to estimate the parameters for a medium-sized population.
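
    The auditing idea can be mimicked with a simple resampling sketch: draw sub-samples covering increasing fractions of a known population and check how closely the sample statistic tracks the parameter. The values are simulated and the fractions are not the study's thresholds.

```python
import numpy as np

# Draw sub-samples covering increasing fractions of a known population and check how
# closely the sample mean tracks the population parameter.
rng = np.random.default_rng(7)
population = rng.normal(50, 12, 10_000)
parameter = population.mean()

for fraction in (0.05, 0.15, 0.25, 0.35):
    size = int(fraction * population.size)
    estimates = [rng.choice(population, size, replace=False).mean() for _ in range(200)]
    print(f"{fraction:.0%} of population: mean of estimates {np.mean(estimates):.2f} "
          f"(parameter {parameter:.2f}), spread {np.std(estimates):.3f}")
```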

  5. A Comparative Evaluation of Mixed Dentition Analysis on Reliability of Cone Beam Computed Tomography Image Compared to Plaster Model.

    PubMed

    Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam

    2017-01-01

    The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT)-derived images compared with plaster models for mixed dentition analysis. Thirty CBCT-derived images and thirty plaster models were retrieved from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analysis along with Student's t-test was performed to evaluate the data, and P < 0.05 was considered statistically significant. Statistically significant differences were found between CBCT-derived images and plaster models; the mean for Moyer's analysis in the left and right lower arch for CBCT and plaster models was 21.2 mm, 21.1 mm and 22.5 mm, 22.5 mm, respectively. CBCT-derived images were less reliable than data obtained directly from plaster models for mixed dentition analysis.
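
    For reference, the Tanaka-Johnston prediction used in such comparisons is a simple linear rule: half the summed mesiodistal widths of the four mandibular incisors plus a constant (10.5 mm for the mandibular arch, 11.0 mm for the maxillary). A small sketch with an illustrative incisor measurement:

```python
def tanaka_johnston(lower_incisor_sum_mm: float) -> dict:
    """Predicted mesiodistal width (mm) of the unerupted canine and premolars in one
    quadrant, from half the summed widths of the four mandibular incisors."""
    half = lower_incisor_sum_mm / 2.0
    return {"mandibular": half + 10.5, "maxillary": half + 11.0}

# Example: an incisor sum of 23.0 mm predicts about 22.0 mm (lower) and 22.5 mm (upper).
print(tanaka_johnston(23.0))
```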

  6. Grey literature in meta-analyses.

    PubMed

    Conn, Vicki S; Valentine, Jeffrey C; Cooper, Harris M; Rantz, Marilyn J

    2003-01-01

    In meta-analysis, researchers combine the results of individual studies to arrive at cumulative conclusions. Meta-analysts sometimes include "grey literature" in their evidential base, which includes unpublished studies and studies published outside widely available journals. Because grey literature is a source of data that might not employ peer review, critics have questioned the validity of its data and the results of meta-analyses that include it. To examine evidence regarding whether grey literature should be included in meta-analyses and strategies to manage grey literature in quantitative synthesis. This article reviews evidence on whether the results of studies published in peer-reviewed journals are representative of results from broader samplings of research on a topic as a rationale for inclusion of grey literature. Strategies to enhance access to grey literature are addressed. The most consistent and robust difference between published and grey literature is that published research is more likely to contain results that are statistically significant. Effect size estimates of published research are about one-third larger than those of unpublished studies. Unfunded and small sample studies are less likely to be published. Yet, importantly, methodological rigor does not differ between published and grey literature. Meta-analyses that exclude grey literature likely (a) over-represent studies with statistically significant findings, (b) inflate effect size estimates, and (c) provide less precise effect size estimates than meta-analyses including grey literature. Meta-analyses should include grey literature to fully reflect the existing evidential base and should assess the impact of methodological variations through moderator analysis.

  7. Differences in Reporting of Analyses in Internal Company Documents Versus Published Trial Reports: Comparisons in Industry-Sponsored Trials in Off-Label Uses of Gabapentin

    PubMed Central

    Vedula, S. Swaroop; Li, Tianjing; Dickersin, Kay

    2013-01-01

    Background Details about the type of analysis (e.g., intent to treat [ITT]) and definitions (i.e., criteria for including participants in the analysis) are necessary for interpreting a clinical trial's findings. Our objective was to compare the description of types of analyses and criteria for including participants in the publication (i.e., what was reported) with descriptions in the corresponding internal company documents (i.e., what was planned and what was done). Trials were for off-label uses of gabapentin sponsored by Pfizer and Parke-Davis, and documents were obtained through litigation. Methods and Findings For each trial, we compared internal company documents (protocols, statistical analysis plans, and research reports, all unpublished), with publications. One author extracted data and another verified, with a third person verifying discordant items and a sample of the rest. Extracted data included the number of participants randomized and analyzed for efficacy, and types of analyses for efficacy and safety and their definitions (i.e., criteria for including participants in each type of analysis). We identified 21 trials, 11 of which were published randomized controlled trials, and that provided the documents needed for planned comparisons. For three trials, there was disagreement on the number of randomized participants between the research report and publication. Seven types of efficacy analyses were described in the protocols, statistical analysis plans, and publications, including ITT and six others. The protocol or publication described ITT using six different definitions, resulting in frequent disagreements between the two documents (i.e., different numbers of participants were included in the analyses). Conclusions Descriptions of analyses conducted did not agree between internal company documents and what was publicly reported. Internal company documents provide extensive documentation of methods planned and used, and trial findings, and should be publicly accessible. Reporting standards for randomized controlled trials should recommend transparent descriptions and definitions of analyses performed and which study participants are excluded. Please see later in the article for the Editors' Summary PMID:23382656

  8. RepExplore: addressing technical replicate variance in proteomics and metabolomics data analysis.

    PubMed

    Glaab, Enrico; Schneider, Reinhard

    2015-07-01

    High-throughput omics datasets often contain technical replicates included to account for technical sources of noise in the measurement process. Although summarizing these replicate measurements by using robust averages may help to reduce the influence of noise on downstream data analysis, the information on the variance across the replicate measurements is lost in the averaging process and therefore typically disregarded in subsequent statistical analyses. We introduce RepExplore, a web service dedicated to exploiting the information captured in the technical replicate variance to provide more reliable and informative differential expression and abundance statistics for omics datasets. The software builds on previously published statistical methods, which have been applied successfully to biomedical omics data but are difficult to use without prior experience in programming or scripting. RepExplore facilitates the analysis by providing fully automated data processing and interactive ranking tables, whisker plots, heat maps and principal component analysis visualizations to interpret omics data and derived statistics. Freely available at http://www.repexplore.tk. Contact: enrico.glaab@uni.lu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
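
    RepExplore itself is a web service built on previously published statistics, so the snippet below is only a generic sketch of the underlying idea: rather than averaging technical-replicate variability away, weight each biological sample by the inverse variance of its replicate mean when comparing two groups. All names and data are hypothetical, and this is not the RepExplore algorithm.

```python
# Generic sketch (not the RepExplore algorithm): use technical-replicate variance to
# down-weight noisy samples instead of discarding it through plain averaging.
import numpy as np

rng = np.random.default_rng(2)

# Each entry is one biological sample's technical replicates for a single feature (hypothetical).
group_a = [rng.normal(10.0, s, 3) for s in (0.2, 0.5, 1.5, 0.3)]
group_b = [rng.normal(11.0, s, 3) for s in (0.4, 0.2, 0.9, 2.0)]

def weighted_group_mean(samples):
    """Inverse-variance weighted mean of per-sample replicate averages."""
    means = np.array([s.mean() for s in samples])
    var_of_means = np.array([s.var(ddof=1) / len(s) for s in samples])  # variance of each sample mean
    weights = 1.0 / var_of_means
    mean = np.sum(weights * means) / np.sum(weights)
    se = np.sqrt(1.0 / np.sum(weights))
    return mean, se

mean_a, se_a = weighted_group_mean(group_a)
mean_b, se_b = weighted_group_mean(group_b)
z = (mean_b - mean_a) / np.hypot(se_a, se_b)   # simple z-like statistic for the group difference
print(f"difference = {mean_b - mean_a:.2f}, z = {z:.2f}")
```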

  9. Tipping point analysis of atmospheric oxygen concentration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Livina, V. N.; Forbes, A. B.; Vaz Martins, T. M.

    2015-03-15

    We apply tipping point analysis to nine observational oxygen concentration records around the globe, analyse their dynamics and perform projections under possible future scenarios that could lead to oxygen deficiency in the atmosphere. The analysis is based on a statistical physics framework with stochastic modelling, where we represent the observed data as a composition of deterministic and stochastic components estimated from the observed data using Bayesian and wavelet techniques.

  10. SimHap GUI: An intuitive graphical user interface for genetic association analysis

    PubMed Central

    Carter, Kim W; McCaskie, Pamela A; Palmer, Lyle J

    2008-01-01

    Background Researchers wishing to conduct genetic association analysis involving single nucleotide polymorphisms (SNPs) or haplotypes are often confronted with the lack of user-friendly graphical analysis tools, requiring sophisticated statistical and informatics expertise to perform relatively straightforward tasks. Tools such as the SimHap package for the R statistics language provide the necessary statistical operations to conduct sophisticated genetic analysis, but lack a graphical user interface that would allow anyone other than a professional statistician to utilise them effectively. Results We have developed SimHap GUI, a cross-platform integrated graphical analysis tool for conducting epidemiological, single SNP and haplotype-based association analysis. SimHap GUI features a novel workflow interface that guides the user through each logical step of the analysis process, making it accessible to both novice and advanced users. This tool provides a seamless interface to the SimHap R package, while providing enhanced functionality such as sophisticated data checking, automated data conversion, and real-time estimations of haplotype simulation progress. Conclusion SimHap GUI provides a novel, easy-to-use, cross-platform solution for conducting a range of genetic and non-genetic association analyses. This provides a free alternative to commercial statistics packages that is specifically designed for genetic association analysis. PMID:19109877

  11. Proceedings of the NASA Symposium on Mathematical Pattern Recognition and Image Analysis

    NASA Technical Reports Server (NTRS)

    Guseman, L. F., Jr.

    1983-01-01

    The application of mathematical and statistical analysis techniques to imagery obtained by remote sensors is described by Principal Investigators. Scene-to-map registration, geometric rectification, and image matching are among the pattern recognition aspects discussed.

  12. Combined Analyses of Bacterial, Fungal and Nematode Communities in Andosolic Agricultural Soils in Japan

    PubMed Central

    Bao, Zhihua; Ikunaga, Yoko; Matsushita, Yuko; Morimoto, Sho; Takada-Hoshino, Yuko; Okada, Hiroaki; Oba, Hirosuke; Takemoto, Shuhei; Niwa, Shigeru; Ohigashi, Kentaro; Suzuki, Chika; Nagaoka, Kazunari; Takenaka, Makoto; Urashima, Yasufumi; Sekiguchi, Hiroyuki; Kushida, Atsuhiko; Toyota, Koki; Saito, Masanori; Tsushima, Seiya

    2012-01-01

    We simultaneously examined the bacteria, fungi and nematode communities in Andosols from four agro-geographical sites in Japan using polymerase chain reaction-denaturing gradient gel electrophoresis (PCR-DGGE) and statistical analyses to test the effects of environmental factors including soil properties on these communities depending on geographical sites. Statistical analyses such as Principal component analysis (PCA) and Redundancy analysis (RDA) revealed that the compositions of the three soil biota communities were strongly affected by geographical sites, which were in turn strongly associated with soil characteristics such as total C (TC), total N (TN), C/N ratio and annual mean soil temperature (ST). In particular, the TC, TN and C/N ratio had stronger effects on bacterial and fungal communities than on the nematode community. Additionally, two-way cluster analysis using the combined DGGE profile also indicated that all soil samples were classified into four clusters corresponding to the four sites, showing high site specificity of soil samples, and all DNA bands were classified into four clusters, showing the coexistence of specific DGGE bands of bacteria, fungi and nematodes in Andosol fields. The results of this study suggest that geography relative to soil properties has a simultaneous impact on soil microbial and nematode community compositions. This is the first combined profile analysis of bacteria, fungi and nematodes at different sites with agricultural Andosols. PMID:22223474

  13. Combined analyses of bacterial, fungal and nematode communities in andosolic agricultural soils in Japan.

    PubMed

    Bao, Zhihua; Ikunaga, Yoko; Matsushita, Yuko; Morimoto, Sho; Takada-Hoshino, Yuko; Okada, Hiroaki; Oba, Hirosuke; Takemoto, Shuhei; Niwa, Shigeru; Ohigashi, Kentaro; Suzuki, Chika; Nagaoka, Kazunari; Takenaka, Makoto; Urashima, Yasufumi; Sekiguchi, Hiroyuki; Kushida, Atsuhiko; Toyota, Koki; Saito, Masanori; Tsushima, Seiya

    2012-01-01

    We simultaneously examined the bacteria, fungi and nematode communities in Andosols from four agro-geographical sites in Japan using polymerase chain reaction-denaturing gradient gel electrophoresis (PCR-DGGE) and statistical analyses to test the effects of environmental factors including soil properties on these communities depending on geographical sites. Statistical analyses such as Principal component analysis (PCA) and Redundancy analysis (RDA) revealed that the compositions of the three soil biota communities were strongly affected by geographical sites, which were in turn strongly associated with soil characteristics such as total C (TC), total N (TN), C/N ratio and annual mean soil temperature (ST). In particular, the TC, TN and C/N ratio had stronger effects on bacterial and fungal communities than on the nematode community. Additionally, two-way cluster analysis using the combined DGGE profile also indicated that all soil samples were classified into four clusters corresponding to the four sites, showing high site specificity of soil samples, and all DNA bands were classified into four clusters, showing the coexistence of specific DGGE bands of bacteria, fungi and nematodes in Andosol fields. The results of this study suggest that geography relative to soil properties has a simultaneous impact on soil microbial and nematode community compositions. This is the first combined profile analysis of bacteria, fungi and nematodes at different sites with agricultural Andosols.

  14. Intratumoral heterogeneity analysis reveals hidden associations between protein expression losses and patient survival in clear cell renal cell carcinoma

    PubMed Central

    Devarajan, Karthik; Parsons, Theodore; Wang, Qiong; O'Neill, Raymond; Solomides, Charalambos; Peiper, Stephen C.; Testa, Joseph R.; Uzzo, Robert; Yang, Haifeng

    2017-01-01

    Intratumoral heterogeneity (ITH) is a prominent feature of kidney cancer. It is not known whether it has utility in finding associations between protein expression and clinical parameters. We used ITH that is detected by immunohistochemistry (IHC) to aid the association analysis between the loss of SWI/SNF components and clinical parameters. A total of 160 ccRCC tumors (40 per tumor stage) were used to generate a tissue microarray (TMA). Four foci from different regions of each tumor were selected. IHC was performed against PBRM1, ARID1A, SETD2, SMARCA4, and SMARCA2. Statistical analyses were performed to correlate biomarker losses with patho-clinical parameters. Categorical variables were compared between groups using Fisher's exact tests. Univariate and multivariable analyses were used to correlate biomarker changes and patient survival. Multivariable analyses were performed by constructing decision trees using the classification and regression trees (CART) methodology. IHC detected widespread ITH in ccRCC tumors. The statistical analysis of the “Truncal loss” (root loss) found more correlations between biomarker losses and tumor stages than the traditional “Loss in tumor (total)”. Losses of SMARCA4 or SMARCA2 were associated with a significantly improved prognosis for overall survival (OS). Losses of PBRM1, ARID1A or SETD2 had the opposite effect. Thus “Truncal Loss” analysis revealed hidden links between protein losses and patient survival in ccRCC. PMID:28445125

  15. The heterogeneity statistic I(2) can be biased in small meta-analyses.

    PubMed

    von Hippel, Paul T

    2015-04-14

    Estimated effects vary across studies, partly because of random sampling error and partly because of heterogeneity. In meta-analysis, the fraction of variance that is due to heterogeneity is estimated by the statistic I(2). We calculate the bias of I(2), focusing on the situation where the number of studies in the meta-analysis is small. Small meta-analyses are common; in the Cochrane Library, the median number of studies per meta-analysis is 7 or fewer. We use Mathematica software to calculate the expectation and bias of I(2). I(2) has a substantial bias when the number of studies is small. The bias is positive when the true fraction of heterogeneity is small, but the bias is typically negative when the true fraction of heterogeneity is large. For example, with 7 studies and no true heterogeneity, I(2) will overestimate heterogeneity by an average of 12 percentage points, but with 7 studies and 80 percent true heterogeneity, I(2) can underestimate heterogeneity by an average of 28 percentage points. Biases of 12-28 percentage points are not trivial when one considers that, in the Cochrane Library, the median I(2) estimate is 21 percent. The point estimate I(2) should be interpreted cautiously when a meta-analysis has few studies. In small meta-analyses, confidence intervals should supplement or replace the biased point estimate I(2).
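
    For reference, I(2) is conventionally computed from Cochran's Q; the sketch below reproduces that textbook calculation for a seven-study meta-analysis with invented effect sizes and variances, which makes it easy to see how little information underlies the estimate when the number of studies is small.

```python
# Textbook computation of Cochran's Q and I^2 for a small meta-analysis (hypothetical data).
import numpy as np

effects = np.array([0.30, 0.10, 0.45, 0.25, 0.05, 0.40, 0.20])    # study effect sizes
variances = np.array([0.02, 0.03, 0.05, 0.02, 0.04, 0.06, 0.03])  # within-study variances

weights = 1.0 / variances
pooled = np.sum(weights * effects) / np.sum(weights)   # fixed-effect pooled estimate
q = np.sum(weights * (effects - pooled) ** 2)          # Cochran's Q
df = len(effects) - 1
i_squared = max(0.0, (q - df) / q) * 100               # I^2 as a percentage

print(f"Q = {q:.2f} on {df} df, I^2 = {i_squared:.1f}%")
```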

  16. Testing a Coupled Global-limited-area Data Assimilation System using Observations from the 2004 Pacific Typhoon Season

    NASA Astrophysics Data System (ADS)

    Holt, C. R.; Szunyogh, I.; Gyarmati, G.; Hoffman, R. N.; Leidner, M.

    2011-12-01

    Tropical cyclone (TC) track and intensity forecasts have improved in recent years due to increased model resolution, improved data assimilation, and the rapid increase in the number of routinely assimilated observations over oceans. The data assimilation approach that has received the most attention in recent years is Ensemble Kalman Filtering (EnKF). The most attractive feature of the EnKF is that it uses a fully flow-dependent estimate of the error statistics, which can have important benefits for the analysis of rapidly developing TCs. We implement the Local Ensemble Transform Kalman Filter algorithm, a variation of the EnKF, on a reduced-resolution version of the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS) model and the NCEP Regional Spectral Model (RSM) to build a coupled global-limited-area analysis/forecast system. This is the first time, to our knowledge, that such a system is used for the analysis and forecast of tropical cyclones. We use data from summer 2004 to study eight tropical cyclones in the Northwest Pacific. The benchmark data sets that we use to assess the performance of our system are the NCEP Reanalysis and the NCEP Operational GFS analyses from 2004. These benchmark analyses were both obtained by the Spectral Statistical Interpolation, which was the operational data assimilation system of NCEP in 2004. The GFS Operational analysis assimilated a large number of satellite radiance observations in addition to the observations assimilated in our system. All analyses are verified against the Joint Typhoon Warning Center Best Track data set. The errors are calculated for the position and intensity of the TCs. The global component of the ensemble-based system shows improvement in position analysis over the NCEP Reanalysis, but shows no significant difference from the NCEP operational analysis for most of the storm tracks. The regional component of our system improves position analysis over all the global analyses. The intensity analyses, measured by the minimum sea level pressure, are of similar quality in all of the analyses. Regional deterministic forecasts started from our analyses are generally not significantly different from those started from the GFS operational analysis. On average, the regional experiments performed better for longer than 48 h sea level pressure forecasts, while the global forecast performed better in predicting the position for longer than 48 h.
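
    The ensemble transform update at the heart of such a system follows a standard formulation; the snippet below sketches a single global ETKF analysis step, with no localization, no model coupling, and none of the operational details of the system described above. The dimensions, the linear observation operator and the error covariance are hypothetical.

```python
# Minimal global ETKF analysis step (no localization), following the standard
# ensemble-transform formulation. All dimensions and inputs are hypothetical.
import numpy as np

def etkf_update(Xb, y, H, R):
    """Xb: (n, m) background ensemble; y: (p,) observations;
    H: (p, n) linear observation operator; R: (p, p) observation-error covariance."""
    n, m = Xb.shape
    xb_mean = Xb.mean(axis=1)
    Xb_pert = Xb - xb_mean[:, None]                  # background perturbations
    Yb = H @ Xb                                      # ensemble mapped to observation space
    yb_mean = Yb.mean(axis=1)
    Yb_pert = Yb - yb_mean[:, None]

    C = Yb_pert.T @ np.linalg.inv(R)                 # (m, p)
    Pa_tilde = np.linalg.inv((m - 1) * np.eye(m) + C @ Yb_pert)
    w_mean = Pa_tilde @ C @ (y - yb_mean)            # weights for the mean update

    # Symmetric square root of (m-1)*Pa_tilde for the perturbation update
    evals, evecs = np.linalg.eigh((m - 1) * Pa_tilde)
    W = evecs @ np.diag(np.sqrt(np.maximum(evals, 0.0))) @ evecs.T

    xa_mean = xb_mean + Xb_pert @ w_mean
    return xa_mean[:, None] + Xb_pert @ W            # analysis ensemble

# Tiny synthetic example
rng = np.random.default_rng(3)
n, m, p = 8, 10, 4
Xa = etkf_update(rng.normal(size=(n, m)), rng.normal(size=p), np.eye(p, n), 0.5 * np.eye(p))
print(Xa.shape)
```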

  17. Bayesian model selection techniques as decision support for shaping a statistical analysis plan of a clinical trial: An example from a vertigo phase III study with longitudinal count data as primary endpoint

    PubMed Central

    2012-01-01

    Background A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. Methods We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score). Results The instruments under study provide excellent tools for preparing decisions within the SAP in a transparent way when structuring the primary analysis, sensitivity or ancillary analyses, and specific analyses for secondary endpoints. The mean logarithmic score and DIC discriminate well between different model scenarios. It becomes obvious that the naive choice of a conventional random effects Poisson model is often inappropriate for real-life count data. The findings are used to specify an appropriate mixed model employed in the sensitivity analyses of an ongoing phase III trial. Conclusions The proposed Bayesian methods are not only appealing for inference but notably provide a sophisticated insight into different aspects of model performance, such as forecast verification or calibration checks, and can be applied within the model selection process. The mean of the logarithmic score is a robust tool for model ranking and is not sensitive to sample size. Therefore, these Bayesian model selection techniques offer helpful decision support for shaping sensitivity and ancillary analyses in a statistical analysis plan of a clinical trial with longitudinal count data as the primary endpoint. PMID:22962944

  18. Bayesian model selection techniques as decision support for shaping a statistical analysis plan of a clinical trial: an example from a vertigo phase III study with longitudinal count data as primary endpoint.

    PubMed

    Adrion, Christine; Mansmann, Ulrich

    2012-09-10

    A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score). The instruments under study provide excellent tools for preparing decisions within the SAP in a transparent way when structuring the primary analysis, sensitivity or ancillary analyses, and specific analyses for secondary endpoints. The mean logarithmic score and DIC discriminate well between different model scenarios. It becomes obvious that the naive choice of a conventional random effects Poisson model is often inappropriate for real-life count data. The findings are used to specify an appropriate mixed model employed in the sensitivity analyses of an ongoing phase III trial. The proposed Bayesian methods are not only appealing for inference but notably provide a sophisticated insight into different aspects of model performance, such as forecast verification or calibration checks, and can be applied within the model selection process. The mean of the logarithmic score is a robust tool for model ranking and is not sensitive to sample size. Therefore, these Bayesian model selection techniques offer helpful decision support for shaping sensitivity and ancillary analyses in a statistical analysis plan of a clinical trial with longitudinal count data as the primary endpoint.
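
    The INLA-based leave-one-out predictive distributions themselves are not easy to reproduce outside R, but the logarithmic scoring rule used to rank the candidate models is simple. The sketch below illustrates it with plug-in predictive probabilities from two count models (Poisson and a moment-matched negative binomial) on invented attack counts; these stand in for, and are not, the INLA cross-validated predictions.

```python
# Illustration of the mean logarithmic score for ranking candidate count models.
# Plug-in predictive probabilities stand in for the INLA leave-one-out predictions.
import numpy as np
from scipy import stats

counts = np.array([0, 2, 1, 4, 0, 7, 3, 1, 0, 5])   # hypothetical attack counts

def mean_log_score(pmf_values):
    """Mean logarithmic score: average of -log p_i(y_i); lower is better."""
    return float(np.mean(-np.log(pmf_values)))

# Candidate 1: plain Poisson model with the sample mean as rate
lam = counts.mean()
score_poisson = mean_log_score(stats.poisson.pmf(counts, lam))

# Candidate 2: negative binomial allowing over-dispersion (moment-matched)
var = counts.var(ddof=1)
size = lam ** 2 / max(var - lam, 1e-6)               # NB dispersion parameter
prob = size / (size + lam)
score_negbin = mean_log_score(stats.nbinom.pmf(counts, size, prob))

print(f"Poisson mean log score: {score_poisson:.3f}")
print(f"NegBin  mean log score: {score_negbin:.3f}")
```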

  19. Cyber Risk Management for Critical Infrastructure: A Risk Analysis Model and Three Case Studies.

    PubMed

    Paté-Cornell, M-Elisabeth; Kuypers, Marshall; Smith, Matthew; Keller, Philip

    2018-02-01

    Managing cyber security in an organization involves allocating the protection budget across a spectrum of possible options. This requires assessing the benefits and the costs of these options. The risk analyses presented here are statistical when relevant data are available, and system-based for high-consequence events that have not happened yet. This article presents, first, a general probabilistic risk analysis framework for cyber security in an organization to be specified. It then describes three examples of forward-looking analyses motivated by recent cyber attacks. The first one is the statistical analysis of an actual database, extended at the upper end of the loss distribution by a Bayesian analysis of possible, high-consequence attack scenarios that may happen in the future. The second is a systems analysis of cyber risks for a smart, connected electric grid, showing that there is an optimal level of connectivity. The third is an analysis of sequential decisions to upgrade the software of an existing cyber security system or to adopt a new one to stay ahead of adversaries trying to find their way in. The results are distributions of losses to cyber attacks, with and without some considered countermeasures in support of risk management decisions based both on past data and anticipated incidents. © 2017 Society for Risk Analysis.

  20. Meta-analyses of Adverse Effects Data Derived from Randomised Controlled Trials as Compared to Observational Studies: Methodological Overview

    PubMed Central

    Golder, Su; Loke, Yoon K.; Bland, Martin

    2011-01-01

    Background There is considerable debate as to the relative merits of using randomised controlled trial (RCT) data as opposed to observational data in systematic reviews of adverse effects. This meta-analysis of meta-analyses aimed to assess the level of agreement or disagreement in the estimates of harm derived from meta-analysis of RCTs as compared to meta-analysis of observational studies. Methods and Findings Searches were carried out in ten databases in addition to reference checking, contacting experts, citation searches, and hand-searching key journals, conference proceedings, and Web sites. Studies were included where a pooled relative measure of an adverse effect (odds ratio or risk ratio) from RCTs could be directly compared, using the ratio of odds ratios, with the pooled estimate for the same adverse effect arising from observational studies. Nineteen studies, yielding 58 meta-analyses, were identified for inclusion. The pooled ratio of odds ratios of RCTs compared to observational studies was estimated to be 1.03 (95% confidence interval 0.93–1.15). There was less discrepancy with larger studies. The symmetric funnel plot suggests that there is no consistent difference between risk estimates from meta-analysis of RCT data and those from meta-analysis of observational studies. In almost all instances, the estimates of harm from meta-analyses of the different study designs had 95% confidence intervals that overlapped (54/58, 93%). In terms of statistical significance, in nearly two-thirds (37/58, 64%), the results agreed (both studies showing a significant increase or significant decrease or both showing no significant difference). In only one meta-analysis about one adverse effect was there opposing statistical significance. Conclusions Empirical evidence from this overview indicates that there is no difference on average in the risk estimate of adverse effects of an intervention derived from meta-analyses of RCTs and meta-analyses of observational studies. This suggests that systematic reviews of adverse effects should not be restricted to specific study types. Please see later in the article for the Editors' Summary PMID:21559325
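
    The ratio of odds ratios reported above is straightforward to compute: on the log scale it is the difference of the two log odds ratios, with standard errors (recovered from the confidence intervals) combined in quadrature. The numbers in the sketch are invented and are not taken from the review.

```python
# Ratio of odds ratios (RCTs vs observational studies) with a 95% CI, on the log scale.
# The input odds ratios and confidence intervals are hypothetical.
import numpy as np
from scipy import stats

def log_or_and_se(or_point, ci_low, ci_high):
    """Recover log(OR) and its standard error from a point estimate and 95% CI."""
    se = (np.log(ci_high) - np.log(ci_low)) / (2 * stats.norm.ppf(0.975))
    return np.log(or_point), se

log_or_rct, se_rct = log_or_and_se(1.20, 0.95, 1.52)   # pooled OR from RCTs
log_or_obs, se_obs = log_or_and_se(1.15, 1.00, 1.32)   # pooled OR from observational studies

log_ror = log_or_rct - log_or_obs
se_ror = np.hypot(se_rct, se_obs)
z = stats.norm.ppf(0.975)
print(f"ROR = {np.exp(log_ror):.2f} "
      f"(95% CI {np.exp(log_ror - z * se_ror):.2f}-{np.exp(log_ror + z * se_ror):.2f})")
```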

  1. The GnRH analogue triptorelin confers ovarian radio-protection to adult female rats.

    PubMed

    Camats, N; García, F; Parrilla, J J; Calaf, J; Martín-Mateo, M; Caldés, M Garcia

    2009-10-02

    There is controversy regarding the effects of analogues of the gonadotrophin-releasing hormone (GnRH) in radiotherapy. This led us to study the possible radio-protection of ovarian function by a GnRH agonist analogue (GnRHa), triptorelin, in adult female rats (Rattus norvegicus sp.). The effects of X-irradiation on the oocytes of ovarian primordial follicles, with and without GnRHa treatment, were compared directly in the female rats (F(0)) using reproductive parameters, and in the somatic cells of the resulting foetuses (F(1)) using cytogenetical parameters. To this end, the ovaries and uteri from 82 females were extracted for the reproductive analysis and 236 foetuses were obtained for cytogenetical analysis. The cytogenetical study was based on data from 22,151 metaphases analysed. The cytogenetical parameters analysed to assess the existence of chromosomal instability were the number of aberrant metaphases (2234) and the number (2854) and type of structural chromosomal aberrations, including gaps and breaks. For the reproductive analysis of the ovaries and uteri, the parameters analysed were the number of corpora lutea, implantations, implantation losses and foetuses. Triptorelin confers radio-protection on the ovaries against chromosomal instability, and this protection differs between the single and fractionated doses. The cytogenetical analysis shows a general decrease in most of the parameters in the triptorelin-treated groups with respect to their controls, and some of these differences were statistically significant. The reproductive analysis indicates that there is also radio-protection by the agonist, although less marked than the cytogenetical one. Only some of the analysed parameters show a statistically significant decrease in the triptorelin-treated groups.

  2. Instructional Advice, Time Advice and Learning Questions in Computer Simulations

    ERIC Educational Resources Information Center

    Rey, Gunter Daniel

    2010-01-01

    Undergraduate students (N = 97) used an introductory text and a computer simulation to learn fundamental concepts about statistical analyses (e.g., analysis of variance, regression analysis and General Linear Model). Each learner was randomly assigned to one cell of a 2 (with or without instructional advice) x 2 (with or without time advice) x 2…

  3. Assessment of Students' Scientific and Alternative Conceptions of Energy and Momentum Using Concentration Analysis

    ERIC Educational Resources Information Center

    Dega, Bekele Gashe; Govender, Nadaraj

    2016-01-01

    This study compares the scientific and alternative conceptions of energy and momentum of university first-year science students in Ethiopia and the US. Written data were collected using the Energy and Momentum Conceptual Survey developed by Singh and Rosengrant. The Concentration Analysis statistical method was used for analysing the Ethiopian…

  4. PREDICTED GROUND WATER, SOIL AND SOIL GAS IMPACTS FROM U.S. GASOLINES, 2004, FIRST ANALYSIS OF THE AUTUMNAL DATA

    EPA Science Inventory

    Ninety six gasoline samples were collected from around the U.S. in Autumn 2004. A detailed hydrocarbon analysis was performed on each sample resulting in a data set of approximately 300 chemicals per sample. Statistical analyses were performed on the entire suite of reported chem...

  5. Sample Size Calculations for Precise Interval Estimation of the Eta-Squared Effect Size

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2015-01-01

    Analysis of variance is one of the most frequently used statistical analyses in the behavioral, educational, and social sciences, and special attention has been paid to the selection and use of an appropriate effect size measure of association in analysis of variance. This article presents the sample size procedures for precise interval estimation…

  6. The Relationship between Parental Involvement and Urban Secondary School Student Academic Achievement: A Meta-Analysis

    ERIC Educational Resources Information Center

    Jeynes, William H.

    2007-01-01

    A meta-analysis is undertaken, including 52 studies, to determine the influence of parental involvement on the educational outcomes of urban secondary school children. Statistical analyses are done to determine the overall impact of parental involvement as well as specific components of parental involvement. Four different measures of educational…

  7. Reply to discussion: ground water response to forest harvest: implications for hillslope stability

    Treesearch

    Amod Dhakal; Roy C. Sidle; A.C. Johnson; R.T. Edwards

    2008-01-01

    Dhakal and Sidle (this volume) have requested clarification of some of the rationales and approaches used in analyses described by Johnson et al. (2007). Here we further describe hydrologic conditions typical of southeast Alaska and elaborate on an accepted methodology used for conducting analysis of covariance statistical analysis (ANCOVA). We discuss Dhakal and Sidle...

  8. A Content Analysis of Dissertations in the Field of Educational Technology: The Case of Turkey

    ERIC Educational Resources Information Center

    Durak, Gurhan; Cankaya, Serkan; Yunkul, Eyup; Misirli, Zeynel Abidin

    2018-01-01

    The present study aimed at conducting content analysis on dissertations carried out so far in the field of Educational Technology in Turkey. A total of 137 dissertations were examined to determine the key words, academic discipline, research areas, theoretical frameworks, research designs and models, statistical analyses, data collection tools,…

  9. STRengthening analytical thinking for observational studies: the STRATOS initiative.

    PubMed

    Sauerbrei, Willi; Abrahamowicz, Michal; Altman, Douglas G; le Cessie, Saskia; Carpenter, James

    2014-12-30

    The validity and practical utility of observational medical research depends critically on good study design, excellent data quality, appropriate statistical methods and accurate interpretation of results. Statistical methodology has seen substantial development in recent times. Unfortunately, many of these methodological developments are ignored in practice. Consequently, design and analysis of observational studies often exhibit serious weaknesses. The lack of guidance on vital practical issues discourages many applied researchers from using more sophisticated and possibly more appropriate methods when analyzing observational studies. Furthermore, many analyses are conducted by researchers with a relatively weak statistical background and limited experience in using statistical methodology and software. Consequently, even 'standard' analyses reported in the medical literature are often flawed, casting doubt on their results and conclusions. An efficient way to help researchers to keep up with recent methodological developments is to develop guidance documents that are spread to the research community at large. These observations led to the initiation of the strengthening analytical thinking for observational studies (STRATOS) initiative, a large collaboration of experts in many different areas of biostatistical research. The objective of STRATOS is to provide accessible and accurate guidance in the design and analysis of observational studies. The guidance is intended for applied statisticians and other data analysts with varying levels of statistical education, experience and interests. In this article, we introduce the STRATOS initiative and its main aims, present the need for guidance documents and outline the planned approach and progress so far. We encourage other biostatisticians to become involved. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.

  10. Correlating tephras and cryptotephras using glass compositional analyses and numerical and statistical methods: Review and evaluation

    NASA Astrophysics Data System (ADS)

    Lowe, David J.; Pearce, Nicholas J. G.; Jorgensen, Murray A.; Kuehn, Stephen C.; Tryon, Christian A.; Hayward, Chris L.

    2017-11-01

    We define tephras and cryptotephras and their components (mainly ash-sized particles of glass ± crystals in distal deposits) and summarize the basis of tephrochronology as a chronostratigraphic correlational and dating tool for palaeoenvironmental, geological, and archaeological research. We then document and appraise recent advances in analytical methods used to determine the major, minor, and trace elements of individual glass shards from tephra or cryptotephra deposits to aid their correlation and application. Protocols developed recently for the electron probe microanalysis of major elements in individual glass shards help to improve data quality and standardize reporting procedures. A narrow electron beam (diameter ∼3-5 μm) can now be used to analyze smaller glass shards than previously attainable. Reliable analyses of 'microshards' (defined here as glass shards <32 μm in diameter) using narrow beams are useful for fine-grained samples from distal or ultra-distal geographic locations, and for vesicular or microlite-rich glass shards or small melt inclusions. Caveats apply, however, in the microprobe analysis of very small microshards (≤∼5 μm in diameter), where particle geometry becomes important, and of microlite-rich glass shards where the potential problem of secondary fluorescence across phase boundaries needs to be recognised. Trace element analyses of individual glass shards using laser ablation inductively coupled plasma-mass spectrometry (LA-ICP-MS), with crater diameters of 20 μm and 10 μm, are now effectively routine, giving detection limits well below 1 ppm. Smaller ablation craters (<10 μm) can be subject to significant element fractionation during analysis, but the systematic relationship of such fractionation with glass composition suggests that analyses for some elements at these resolutions may be quantifiable. In undertaking analyses, either by microprobe or LA-ICP-MS, reference material data acquired using the same procedure, and preferably from the same analytical session, should be presented alongside new analytical data. In part 2 of the review, we describe, critically assess, and recommend ways in which tephras or cryptotephras can be correlated (in conjunction with other information) using numerical or statistical analyses of compositional data. Statistical methods provide a less subjective means of dealing with analytical data pertaining to tephra components (usually glass or crystals/phenocrysts) than heuristic alternatives. They enable a better understanding of relationships among the data from multiple viewpoints to be developed and help quantify the degree of uncertainty in establishing correlations. In common with other scientific hypothesis testing, it is easier to infer using such analysis that two or more tephras are different rather than the same. Adding stratigraphic, chronological, spatial, or palaeoenvironmental data (i.e. multiple criteria) is usually necessary and allows for more robust correlations to be made. A two-stage approach is useful, the first focussed on differences in the mean composition of samples, or their range, which can be visualised graphically via scatterplot matrices or bivariate plots coupled with the use of statistical tools such as distance measures, similarity coefficients, hierarchical cluster analysis (informed by distance measures or similarity or cophenetic coefficients), and principal components analysis (PCA). 
Some statistical methods (cluster analysis, discriminant analysis) are referred to as 'machine learning' in the computing literature. The second stage examines sample variance and the degree of compositional similarity so that sample equivalence or otherwise can be established on a statistical basis. This stage may involve discriminant function analysis (DFA), support vector machines (SVMs), canonical variates analysis (CVA), and ANOVA or MANOVA (or its two-sample special case, the Hotelling two-sample T2 test). Randomization tests can be used where distributional assumptions such as multivariate normality underlying parametric tests are doubtful. Compositional data may be transformed and scaled before being subjected to multivariate statistical procedures including calculation of distance matrices, hierarchical cluster analysis, and PCA. Such transformations may make the assumption of multivariate normality more appropriate. A sequential procedure using Mahalanobis distance and the Hotelling two-sample T2 test is illustrated using glass major element data from trachytic to phonolitic Kenyan tephras. All these methods require a broad range of high-quality compositional data which can be used to compare 'unknowns' with reference (training) sets that are sufficiently complete to account for all possible correlatives, including tephras with heterogeneous glasses that contain multiple compositional groups. Currently, incomplete databases are tending to limit correlation efficacy. The development of an open, online global database to facilitate progress towards integrated, high-quality tephrostratigraphic frameworks for different regions is encouraged.
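
    The sequential Mahalanobis-distance / Hotelling two-sample T2 procedure mentioned above can be sketched for two groups of glass-shard analyses as follows; the compositions are simulated, and a real application would use (log-ratio transformed) major-element data and adequate reference datasets.

```python
# Sketch of comparing two tephra glass-shard datasets: Mahalanobis distance between
# group means and the Hotelling two-sample T^2 test. All compositions are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
p = 4                                        # number of oxides considered
mean1 = np.array([74.0, 13.0, 2.0, 4.0])     # e.g. SiO2, Al2O3, FeO, K2O (wt%), hypothetical
mean2 = np.array([73.5, 13.2, 2.2, 4.1])
cov = np.diag([0.5, 0.2, 0.1, 0.1])
tephra1 = rng.multivariate_normal(mean1, cov, size=25)
tephra2 = rng.multivariate_normal(mean2, cov, size=30)

n1, n2 = len(tephra1), len(tephra2)
diff = tephra1.mean(axis=0) - tephra2.mean(axis=0)
S_pooled = ((n1 - 1) * np.cov(tephra1, rowvar=False) +
            (n2 - 1) * np.cov(tephra2, rowvar=False)) / (n1 + n2 - 2)

d2 = diff @ np.linalg.solve(S_pooled, diff)            # squared Mahalanobis distance
t2 = (n1 * n2) / (n1 + n2) * d2                        # Hotelling two-sample T^2
f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
p_value = stats.f.sf(f_stat, p, n1 + n2 - p - 1)

print(f"D^2 = {d2:.2f}, T^2 = {t2:.2f}, F = {f_stat:.2f}, p = {p_value:.3f}")
```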

  11. Selection and Reporting of Statistical Methods to Assess Reliability of a Diagnostic Test: Conformity to Recommended Methods in a Peer-Reviewed Journal

    PubMed Central

    Park, Ji Eun; Han, Kyunghwa; Sung, Yu Sub; Chung, Mi Sun; Koo, Hyun Jung; Yoon, Hee Mang; Choi, Young Jun; Lee, Seung Soo; Kim, Kyung Won; Shin, Youngbin; An, Suah; Cho, Hyo-Min

    2017-01-01

    Objective To evaluate the frequency and adequacy of statistical analyses in a general radiology journal when reporting a reliability analysis for a diagnostic test. Materials and Methods Sixty-three studies of diagnostic test accuracy (DTA) and 36 studies reporting reliability analyses published in the Korean Journal of Radiology between 2012 and 2016 were analyzed. Studies were judged using the methodological guidelines of the Radiological Society of North America-Quantitative Imaging Biomarkers Alliance (RSNA-QIBA), and COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) initiative. DTA studies were evaluated by nine editorial board members of the journal. Reliability studies were evaluated by study reviewers experienced with reliability analysis. Results Thirty-one (49.2%) of the 63 DTA studies did not include a reliability analysis when deemed necessary. Among the 36 reliability studies, proper statistical methods were used in all (5/5) studies dealing with dichotomous/nominal data, 46.7% (7/15) of studies dealing with ordinal data, and 95.2% (20/21) of studies dealing with continuous data. Statistical methods were described in sufficient detail regarding weighted kappa in 28.6% (2/7) of studies and regarding the model and assumptions of intraclass correlation coefficient in 35.3% (6/17) and 29.4% (5/17) of studies, respectively. Reliability parameters were used as if they were agreement parameters in 23.1% (3/13) of studies. Reproducibility and repeatability were used incorrectly in 20% (3/15) of studies. Conclusion Greater attention to the importance of reporting reliability, thorough description of the related statistical methods, efforts not to neglect agreement parameters, and better use of relevant terminology is necessary. PMID:29089821
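
    As an example of the level of reporting detail the authors call for, the sketch below computes a quadratically weighted kappa for two raters scoring the same cases on a three-level ordinal scale, with the weighting scheme written out explicitly. The rating counts are invented.

```python
# Quadratically weighted kappa for two raters on an ordinal scale (hypothetical counts).
import numpy as np

# Confusion matrix: rows = rater 1 category, columns = rater 2 category
O = np.array([[12,  3,  0],
              [ 4, 20,  5],
              [ 1,  6, 14]], dtype=float)

k = O.shape[0]
N = O.sum()
expected = np.outer(O.sum(axis=1), O.sum(axis=0)) / N   # chance-expected counts

i, j = np.indices((k, k))
weights = (i - j) ** 2 / (k - 1) ** 2                   # quadratic disagreement weights

kappa_w = 1.0 - (weights * O).sum() / (weights * expected).sum()
print(f"Quadratically weighted kappa = {kappa_w:.3f}")
```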

  12. Full in-vitro analyses of new-generation bulk fill dental composites cured by halogen light.

    PubMed

    Tekin, Tuçe Hazal; Kantürk Figen, Aysel; Yılmaz Atalı, Pınar; Coşkuner Filiz, Bilge; Pişkin, Mehmet Burçin

    2017-08-01

    The objective of this study was to perform full in-vitro analyses of new-generation bulk-fill dental composites cured by halogen light (HLG). Four composites of two types were studied: Surefill SDR (SDR) and Xtra Base (XB) as bulk-fill flowable materials; QuixFill (QF) and XtraFill (XF) as packable bulk-fill materials. Samples were prepared for each analysis and test by applying the same procedure, but with different diameters and thicknesses appropriate to the analysis and test requirements. Thermal properties were determined by thermogravimetric analysis (TG/DTG) and differential scanning calorimetry (DSC); the Vickers microhardness (VHN) was measured after 1, 7, 15 and 30 days of storage in water. The degree of conversion values for the materials (DC, %) were measured immediately using near-infrared spectroscopy (FT-IR). The surface morphology of the composites was investigated by scanning electron microscopy (SEM) and atomic force microscopy (AFM). Sorption and solubility measurements were also performed after 1, 7, 15 and 30 days of storage in water. In addition, the data were statistically analyzed using one-way analysis of variance and both the Newman-Keuls and Tukey multiple comparison tests. The statistical significance level was set at p<0.05. According to the ISO 4049 standards, all the tested materials showed acceptable water sorption and solubility, and a halogen light source was an option to polymerize bulk-fill, resin-based dental composites. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Vitamin D and Depression: A Systematic Review and Meta-Analysis Comparing Studies with and without Biological Flaws

    PubMed Central

    Spedding, Simon

    2014-01-01

    Efficacy of Vitamin D supplements in depression is controversial, awaiting further literature analysis. Biological flaws in primary studies are a possible reason why meta-analyses of Vitamin D have failed to demonstrate efficacy. This systematic review and meta-analysis of Vitamin D and depression compared studies with and without biological flaws. The systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The literature search was undertaken through four databases for randomized controlled trials (RCTs). Studies were critically appraised for methodological quality and biological flaws, in relation to the hypothesis and study design. Meta-analyses were performed for studies according to the presence of biological flaws. The 15 RCTs identified provide a more comprehensive evidence base than previous systematic reviews; methodological quality of studies was generally good and methodology was diverse. A meta-analysis of all studies without flaws demonstrated a statistically significant improvement in depression with Vitamin D supplements (+0.78 CI +0.24, +1.27). Studies with biological flaws were mainly inconclusive, with the meta-analysis demonstrating a statistically significant worsening in depression by taking Vitamin D supplements (−1.1 CI −0.7, −1.5). Vitamin D supplementation (≥800 I.U. daily) was somewhat favorable in the management of depression in studies that demonstrated a change in vitamin levels, and the effect size was comparable to that of anti-depressant medication. PMID:24732019

  14. Evaluation and application of summary statistic imputation to discover new height-associated loci.

    PubMed

    Rüeger, Sina; McDaid, Aaron; Kutalik, Zoltán

    2018-05-01

    As most of the heritability of complex traits is attributed to common and low frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discover the genetic basis of these traits. Association summary statistics from genome-wide meta-analyses are available for hundreds of traits. Updating these to ever-increasing reference panels is very cumbersome as it requires reimputation of the genetic data, rerunning the association scan, and meta-analysing the results. A much more efficient method is to directly impute the summary statistics, termed summary statistics imputation, which we improved to accommodate variable sample size across SNVs. Its performance relative to genotype imputation and practical utility has not yet been fully investigated. To this end, we compared the two approaches on real (genotyped and imputed) data from 120K samples from the UK Biobank and show that genotype imputation boasts a 3- to 5-fold lower root-mean-square error, and better distinguishes true associations from null ones: we observed the largest differences in power for variants with low minor allele frequency and low imputation quality. For fixed false positive rates of 0.001, 0.01, 0.05, using summary statistics imputation yielded a decrease in statistical power by 9, 43 and 35%, respectively. To test its capacity to discover novel associations, we applied summary statistics imputation to the GIANT height meta-analysis summary statistics covering HapMap variants, and identified 34 novel loci, 19 of which replicated using data in the UK Biobank. Additionally, we successfully replicated 55 out of the 111 variants published in an exome chip study. Our study demonstrates that summary statistics imputation is a very efficient and cost-effective way to identify and fine-map trait-associated loci. Moreover, the ability to impute summary statistics is important for follow-up analyses, such as Mendelian randomisation or LD-score regression.

  15. Evaluation and application of summary statistic imputation to discover new height-associated loci

    PubMed Central

    2018-01-01

    As most of the heritability of complex traits is attributed to common and low frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discover the genetic basis of these traits. Association summary statistics from genome-wide meta-analyses are available for hundreds of traits. Updating these to ever-increasing reference panels is very cumbersome as it requires reimputation of the genetic data, rerunning the association scan, and meta-analysing the results. A much more efficient method is to directly impute the summary statistics, termed summary statistics imputation, which we improved to accommodate variable sample size across SNVs. Its performance relative to genotype imputation and practical utility has not yet been fully investigated. To this end, we compared the two approaches on real (genotyped and imputed) data from 120K samples from the UK Biobank and show that genotype imputation boasts a 3- to 5-fold lower root-mean-square error, and better distinguishes true associations from null ones: we observed the largest differences in power for variants with low minor allele frequency and low imputation quality. For fixed false positive rates of 0.001, 0.01, 0.05, using summary statistics imputation yielded a decrease in statistical power by 9, 43 and 35%, respectively. To test its capacity to discover novel associations, we applied summary statistics imputation to the GIANT height meta-analysis summary statistics covering HapMap variants, and identified 34 novel loci, 19 of which replicated using data in the UK Biobank. Additionally, we successfully replicated 55 out of the 111 variants published in an exome chip study. Our study demonstrates that summary statistics imputation is a very efficient and cost-effective way to identify and fine-map trait-associated loci. Moreover, the ability to impute summary statistics is important for follow-up analyses, such as Mendelian randomisation or LD-score regression. PMID:29782485
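
    At its core, summary statistics imputation predicts the z-score of an untyped variant from observed z-scores using reference-panel linkage disequilibrium via a Gaussian conditional expectation. The sketch below shows only that generic formula; it is not the authors' implementation (which, among other refinements, accommodates variable per-SNP sample sizes), and the LD values and z-scores are invented.

```python
# Generic sketch of summary statistics imputation: impute an untyped variant's z-score
# from observed z-scores and reference-panel LD (Gaussian conditional expectation).
# Not the authors' exact method; LD values and z-scores are hypothetical.
import numpy as np

z_obs = np.array([3.1, 2.4, 1.8])           # z-scores at typed variants
ld_obs = np.array([[1.00, 0.60, 0.30],      # LD (correlation) among the typed variants
                   [0.60, 1.00, 0.45],
                   [0.30, 0.45, 1.00]])
ld_target = np.array([0.80, 0.55, 0.25])    # LD between the untyped and the typed variants

lam = 0.1                                   # ridge term guarding against a noisy LD estimate
A = ld_obs + lam * np.eye(len(z_obs))
weights = np.linalg.solve(A, ld_target)     # imputation weights

z_imp = weights @ z_obs                                  # imputed z-score
r2_info = ld_target @ np.linalg.solve(A, ld_target)     # rough imputation-quality measure
print(f"imputed z = {z_imp:.2f}, imputation r^2 ≈ {r2_info:.2f}")
```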

  16. UNITY: Confronting Supernova Cosmology's Statistical and Systematic Uncertainties in a Unified Bayesian Framework

    NASA Astrophysics Data System (ADS)

    Rubin, D.; Aldering, G.; Barbary, K.; Boone, K.; Chappell, G.; Currie, M.; Deustua, S.; Fagrelius, P.; Fruchter, A.; Hayden, B.; Lidman, C.; Nordin, J.; Perlmutter, S.; Saunders, C.; Sofiatti, C.; Supernova Cosmology Project, The

    2015-11-01

    While recent supernova (SN) cosmology research has benefited from improved measurements, current analysis approaches are not statistically optimal and will prove insufficient for future surveys. This paper discusses the limitations of current SN cosmological analyses in treating outliers, selection effects, shape- and color-standardization relations, unexplained dispersion, and heterogeneous observations. We present a new Bayesian framework, called UNITY (Unified Nonlinear Inference for Type-Ia cosmologY), that incorporates significant improvements in our ability to confront these effects. We apply the framework to real SN observations and demonstrate smaller statistical and systematic uncertainties. We verify earlier results that SNe Ia require nonlinear shape and color standardizations, but we now include these nonlinear relations in a statistically well-justified way. This analysis was primarily performed blinded, in that the basic framework was first validated on simulated data before transitioning to real data. We also discuss possible extensions of the method.

  17. Classical Statistics and Statistical Learning in Imaging Neuroscience

    PubMed Central

    Bzdok, Danilo

    2017-01-01

    Brain-imaging research has predominantly generated insight by means of classical statistics, including regression-type analyses and null-hypothesis testing using t-test and ANOVA. Throughout recent years, statistical learning methods enjoy increasing popularity especially for applications in rich and complex data, including cross-validated out-of-sample prediction using pattern classification and sparsity-inducing regression. This concept paper discusses the implications of inferential justifications and algorithmic methodologies in common data analysis scenarios in neuroimaging. It is retraced how classical statistics and statistical learning originated from different historical contexts, build on different theoretical foundations, make different assumptions, and evaluate different outcome metrics to permit differently nuanced conclusions. The present considerations should help reduce current confusion between model-driven classical hypothesis testing and data-driven learning algorithms for investigating the brain with imaging techniques. PMID:29056896

  18. A statistical framework for neuroimaging data analysis based on mutual information estimated via a gaussian copula

    PubMed Central

    Giordano, Bruno L.; Kayser, Christoph; Rousselet, Guillaume A.; Gross, Joachim; Schyns, Philippe G.

    2016-01-01

    Abstract We begin by reviewing the statistical framework of information theory as applicable to neuroimaging data analysis. A major factor hindering wider adoption of this framework in neuroimaging is the difficulty of estimating information theoretic quantities in practice. We present a novel estimation technique that combines the statistical theory of copulas with the closed form solution for the entropy of Gaussian variables. This results in a general, computationally efficient, flexible, and robust multivariate statistical framework that provides effect sizes on a common meaningful scale, allows for unified treatment of discrete, continuous, unidimensional and multidimensional variables, and enables direct comparisons of representations from behavioral and brain responses across any recording modality. We validate the use of this estimate as a statistical test within a neuroimaging context, considering both discrete stimulus classes and continuous stimulus features. We also present examples of analyses facilitated by these developments, including application of multivariate analyses to MEG planar magnetic field gradients, and pairwise temporal interactions in evoked EEG responses. We show the benefit of considering the instantaneous temporal derivative together with the raw values of M/EEG signals as a multivariate response, how we can separately quantify modulations of amplitude and direction for vector quantities, and how we can measure the emergence of novel information over time in evoked responses. Open‐source Matlab and Python code implementing the new methods accompanies this article. Hum Brain Mapp 38:1541–1573, 2017. © 2016 Wiley Periodicals, Inc. PMID:27860095
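
    The core of the estimator can be sketched very compactly for two continuous variables: each variable is rank-transformed to standard normal margins (a 'copula normalization'), and the mutual information then follows in closed form from the correlation of the transformed variables. The sketch below shows only this bivariate, bias-uncorrected core; the published toolbox adds multivariate cases, discrete variables and bias corrections, and the data here are simulated.

```python
# Bare-bones Gaussian-copula mutual information estimate for two continuous variables.
# Only the bivariate core of the idea; bias corrections and multivariate extensions of
# the published toolbox are omitted. Data are simulated.
import numpy as np
from scipy import stats

def copnorm(x):
    """Empirical copula normalization: map ranks to standard normal quantiles."""
    return stats.norm.ppf(stats.rankdata(x) / (len(x) + 1))

def gaussian_copula_mi(x, y):
    """Gaussian-copula mutual information (in bits) between two 1-D variables."""
    cx, cy = copnorm(x), copnorm(y)
    r = np.corrcoef(cx, cy)[0, 1]
    return -0.5 * np.log2(1.0 - r ** 2)

rng = np.random.default_rng(5)
stimulus = rng.normal(size=500)
response = np.tanh(stimulus) + 0.5 * rng.normal(size=500)   # monotonic, noisy coupling
print(f"Gaussian-copula MI ≈ {gaussian_copula_mi(stimulus, response):.3f} bits")
```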

  19. STRengthening Analytical Thinking for Observational Studies: the STRATOS initiative

    PubMed Central

    Sauerbrei, Willi; Abrahamowicz, Michal; Altman, Douglas G; le Cessie, Saskia; Carpenter, James

    2014-01-01

    The validity and practical utility of observational medical research depends critically on good study design, excellent data quality, appropriate statistical methods and accurate interpretation of results. Statistical methodology has seen substantial development in recent times. Unfortunately, many of these methodological developments are ignored in practice. Consequently, design and analysis of observational studies often exhibit serious weaknesses. The lack of guidance on vital practical issues discourages many applied researchers from using more sophisticated and possibly more appropriate methods when analyzing observational studies. Furthermore, many analyses are conducted by researchers with a relatively weak statistical background and limited experience in using statistical methodology and software. Consequently, even ‘standard’ analyses reported in the medical literature are often flawed, casting doubt on their results and conclusions. An efficient way to help researchers to keep up with recent methodological developments is to develop guidance documents that are spread to the research community at large. These observations led to the initiation of the strengthening analytical thinking for observational studies (STRATOS) initiative, a large collaboration of experts in many different areas of biostatistical research. The objective of STRATOS is to provide accessible and accurate guidance in the design and analysis of observational studies. The guidance is intended for applied statisticians and other data analysts with varying levels of statistical education, experience and interests. In this article, we introduce the STRATOS initiative and its main aims, present the need for guidance documents and outline the planned approach and progress so far. We encourage other biostatisticians to become involved. PMID:25074480

  20. Statistical analyses on sandstones: Systematic approach for predicting petrographical and petrophysical properties

    NASA Astrophysics Data System (ADS)

    Stück, H. L.; Siegesmund, S.

    2012-04-01

    Sandstones are a popular natural stone due to their wide occurrence and availability. The different applications for these stones have led to an increase in demand. From the viewpoint of conservation and the natural stone industry, an understanding of the material behaviour of this construction material is very important. Sandstones are a highly heterogeneous material. Based on statistical analyses with a sufficiently large dataset, a systematic approach to predicting the material behaviour should be possible. Since the literature already contains a large volume of data concerning the petrographical and petrophysical properties of sandstones, a large dataset could be compiled for the statistical analyses. The aim of this study is to develop constraints on the material behaviour and especially on the weathering behaviour of sandstones. Approximately 300 samples from historical and presently mined natural sandstones in Germany and ones described worldwide were included in the statistical approach. The mineralogical composition and fabric characteristics were determined from detailed thin section analyses and descriptions in the literature. Particular attention was paid to evaluating the compositional and textural maturity, grain contacts and contact thickness, type of cement, degree of alteration and the intergranular volume. Statistical methods were used to test for normal distributions and to calculate linear regressions of the basic petrophysical properties: density, porosity, water uptake and strength. The sandstones were classified into three different pore size distributions and evaluated against the other petrophysical properties. Tests of weathering behaviour, such as hygric swelling and salt loading, were also included. To identify similarities between individual sandstones or to define groups of specific sandstone types, principal component analysis, cluster analysis and factor analysis were applied. Our results show that composition and porosity evolution during diagenesis are very important controls on the petrophysical properties of a building stone. The relationship between intergranular volume, cementation and grain contacts can also provide valuable information for predicting strength properties. Since the samples investigated mainly originate from the Triassic German epicontinental basin, arkoses and feldspar-arenites are underrepresented. In general, the sandstones can be grouped as follows: i) quartzites, highly mature with a primary porosity of about 40%, ii) quartzites, highly mature, showing a primary porosity of 40% but with early clay infiltration, iii) sublitharenites-lithic arenites exhibiting a lower primary porosity and higher cementation with quartz and ferritic Fe-oxides and iv) sublitharenites-lithic arenites with a higher content of pseudomatrix. However, in the last two groups the feldspar and lithoclasts can also show considerable alteration. All sandstone groups differ with respect to the pore space, strength and water uptake properties obtained by linear regression analysis. Similar petrophysical properties are discernible for each type when using principal component analysis. Furthermore, the strength and porosity of the sandstones show distinct differences according to their stratigraphic ages and compositions. The relationship between porosity, strength and salt resistance could also be verified. Hygric swelling is interrelated with pore size type, porosity and strength, but also with the degree of alteration (e.g. lithoclasts, pseudomatrix). To summarize, the different regression analyses and the calculated confidence regions provide a significant tool for classifying the petrographical and petrophysical parameters of sandstones. Based on this, the durability and the weathering behaviour of the sandstone groups can be constrained. Keywords: sandstones, petrographical & petrophysical properties, predictive approach, statistical investigation

  1. Graphical augmentations to the funnel plot assess the impact of additional evidence on a meta-analysis.

    PubMed

    Langan, Dean; Higgins, Julian P T; Gregory, Walter; Sutton, Alexander J

    2012-05-01

    We aim to illustrate the potential impact of a new study on a meta-analysis, which gives an indication of the robustness of the meta-analysis. A number of augmentations are proposed to one of the most widely used of graphical displays, the funnel plot. Namely, 1) statistical significance contours, which define regions of the funnel plot in which a new study would have to be located to change the statistical significance of the meta-analysis; and 2) heterogeneity contours, which show how a new study would affect the extent of heterogeneity in a given meta-analysis. Several other features are also described, and the use of multiple features simultaneously is considered. The statistical significance contours suggest that one additional study, no matter how large, may have a very limited impact on the statistical significance of a meta-analysis. The heterogeneity contours illustrate that one outlying study can increase the level of heterogeneity dramatically. The additional features of the funnel plot have applications including 1) informing sample size calculations for the design of future studies eligible for inclusion in the meta-analysis; and 2) informing the updating prioritization of a portfolio of meta-analyses such as those prepared by the Cochrane Collaboration. Copyright © 2012 Elsevier Inc. All rights reserved.
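
    To make the contour idea concrete, the sketch below asks, for a hypothetical fixed-effect meta-analysis, which effect sizes a single new study of a given precision would need to report in order to move the pooled result across the p = 0.05 boundary. It is a rough illustration under inverse-variance pooling with invented effect sizes and standard errors, not the authors' graphical implementation, which also covers heterogeneity contours and random-effects models.

      import numpy as np
      from scipy.stats import norm

      # Invented meta-analysis: study effect sizes and their standard errors.
      yi = np.array([0.10, 0.35, -0.05, 0.42, 0.20])
      sei = np.array([0.15, 0.20, 0.25, 0.18, 0.30])

      def pooled_z(effects, ses):
          # Fixed-effect (inverse-variance) pooled estimate and its z statistic.
          w = 1.0 / ses ** 2
          est = np.sum(w * effects) / np.sum(w)
          return est, est * np.sqrt(np.sum(w))

      est, z = pooled_z(yi, sei)
      print(f"current pooled estimate {est:.3f}, p = {2 * norm.sf(abs(z)):.3f}")

      # Contour idea: scan candidate effects for a new study with standard error se_new
      # and record where the updated meta-analysis crosses the significance threshold.
      se_new = 0.10
      grid = np.linspace(-1.0, 1.0, 401)
      sig = np.array([abs(pooled_z(np.append(yi, y), np.append(sei, se_new))[1]) >= norm.ppf(0.975)
                      for y in grid])
      flips = grid[np.where(np.diff(sig.astype(int)) != 0)[0]]
      print("new-study effects at which significance flips:", np.round(flips, 3))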

  2. Customer perceived service quality, satisfaction and loyalty in Indian private healthcare.

    PubMed

    Kondasani, Rama Koteswara Rao; Panda, Rajeev Kumar

    2015-01-01

    The purpose of this paper is to analyse how perceived service quality and customer satisfaction lead to loyalty towards healthcare service providers. In total, 475 hospital patients participated in a questionnaire survey in five Indian private hospitals. Descriptive statistics, factor analysis, regression and correlation statistics were employed to analyse customer perceived service quality and how it leads to loyalty towards service providers. Results indicate that the service seeker-service provider relationship, quality of facilities and the interaction with supporting staff have a positive effect on customer perception. Findings help healthcare managers to formulate effective strategies to ensure a better quality of services to the customers. This study helps healthcare managers to build customer loyalty towards healthcare services, thereby attracting and gaining more customers. This paper will help healthcare managers and service providers to analyse customer perceptions and their loyalty towards Indian private healthcare services.

  3. Statistics for NAEG: past efforts, new results, and future plans

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilbert, R.O.; Simpson, J.C.; Kinnison, R.R.

    A brief review of Nevada Applied Ecology Group (NAEG) objectives is followed by a summary of past statistical analyses conducted by Pacific Northwest Laboratory for the NAEG. Estimates of spatial pattern of radionuclides and other statistical analyses at NS's 201, 219 and 221 are reviewed as background for new analyses presented in this paper. Suggested NAEG activities and statistical analyses needed for the projected termination date of NAEG studies in March 1986 are given.

  4. On the Use of Biomineral Oxygen Isotope Data to Identify Human Migrants in the Archaeological Record: Intra-Sample Variation, Statistical Methods and Geographical Considerations

    PubMed Central

    Lightfoot, Emma; O’Connell, Tamsin C.

    2016-01-01

    Oxygen isotope analysis of archaeological skeletal remains is an increasingly popular tool to study past human migrations. It is based on the assumption that human body chemistry preserves the δ18O of precipitation in such a way as to be a useful technique for identifying migrants and, potentially, their homelands. In this study, the first such global survey, we draw on published human tooth enamel and bone bioapatite data to explore the validity of using oxygen isotope analyses to identify migrants in the archaeological record. We use human δ18O results to show that there are large variations in human oxygen isotope values within a population sample. This may relate to physiological factors influencing the preservation of the primary isotope signal, or to human activities (such as brewing, boiling, stewing, differential access to water sources and so on) that cause variation in ingested water and food isotope values. We compare the number of outliers identified using various statistical methods. We determine that the most appropriate method for identifying migrants is dependent on the data but is likely to be the IQR or median absolute deviation from the median under most archaeological circumstances. Finally, through a spatial assessment of the dataset, we show that the degree of overlap in human isotope values from different locations across Europe is such that identifying individuals’ homelands on the basis of oxygen isotope analysis alone is not possible for the regions analysed to date. Oxygen isotope analysis is a valid method for identifying first-generation migrants from an archaeological site when used appropriately; however, it is difficult to identify migrants using statistical methods for a sample size of less than c. 25 individuals. In the absence of local previous analyses, each sample should be treated as an individual dataset and statistical techniques can be used to identify migrants, but in most cases pinpointing a specific homeland should not be attempted. PMID:27124001
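
    The interquartile-range and median-absolute-deviation criteria favoured above are straightforward to apply; the sketch below flags candidate migrants in a small, invented set of enamel δ18O values. The cut-offs (1.5×IQR Tukey fences, 3 scaled MADs) are conventional defaults rather than the exact thresholds used in the paper.

      import numpy as np

      def iqr_outliers(values, k=1.5):
          # Flag values outside the Tukey fences Q1 - k*IQR and Q3 + k*IQR.
          q1, q3 = np.percentile(values, [25, 75])
          iqr = q3 - q1
          return (values < q1 - k * iqr) | (values > q3 + k * iqr)

      def mad_outliers(values, k=3.0):
          # Flag values more than k robust SDs from the median, using the median absolute
          # deviation scaled (x1.4826) to be consistent with a normal standard deviation.
          med = np.median(values)
          mad = 1.4826 * np.median(np.abs(values - med))
          return np.abs(values - med) > k * mad

      # Toy delta-18-O enamel values (permil) for one site, with one possible migrant.
      d18o = np.array([26.1, 26.4, 25.9, 26.7, 26.2, 26.0, 26.5, 28.9, 26.3, 25.8])
      print("IQR outliers:", d18o[iqr_outliers(d18o)])
      print("MAD outliers:", d18o[mad_outliers(d18o)])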

  5. Night shift work and breast cancer risk: what do the meta-analyses tell us?

    PubMed

    Pahwa, Manisha; Labrèche, France; Demers, Paul A

    2018-05-22

    Objectives This paper aims to compare results, assess the quality, and discuss the implications of recently published meta-analyses of night shift work and breast cancer risk. Methods A comprehensive search was conducted for meta-analyses published from 2007-2017 that included at least one pooled effect size (ES) for breast cancer associated with any night shift work exposure metric and were accompanied by a systematic literature review. Pooled ES from each meta-analysis were ascertained with a focus on ever/never exposure associations. Assessments of heterogeneity and publication bias were also extracted. The AMSTAR 2 checklist was used to evaluate quality. Results Seven meta-analyses, published from 2013-2016, collectively included 30 cohort and case-control studies spanning 1996-2016. Five meta-analyses reported pooled ES for ever/never night shift work exposure; these ranged from 0.99 [95% confidence interval (CI) 0.95-1.03, N=10 cohort studies] to 1.40 (95% CI 1.13-1.73, N=9 high quality studies). Estimates for duration, frequency, and cumulative night shift work exposure were scant and mostly not statistically significant. Meta-analyses of cohort, Asian, and more fully-adjusted studies generally resulted in lower pooled ES than case-control, European, American, or minimally-adjusted studies. Most reported statistically significant between-study heterogeneity. Publication bias was not evident in any of the meta-analyses. Only one meta-analysis was strong in critical quality domains. Conclusions Fairly consistent elevated pooled ES were found for ever/never night shift work and breast cancer risk, but results for other shift work exposure metrics were inconclusive. Future evaluations of shift work should incorporate high quality meta-analyses that better appraise individual study quality.

  6. Extraction of information from major element chemical analyses of lunar basalts

    NASA Technical Reports Server (NTRS)

    Butler, J. C.

    1985-01-01

    Major element chemical analyses often form the framework within which similarities and differences of analyzed specimens are noted and used to propose or devise models. When percentages are formed the ratios of pairs of components are preserved whereas many familiar statistical and geometrical descriptors are likely to exhibit major changes. This ratio preserving aspect forms the basis for a proposed framework. An analysis of compositional variability within the data set of 42 major element analyses of lunar reference samples was selected to investigate this proposal.

  7. Guidelines for the design and statistical analysis of experiments in papers submitted to ATLA.

    PubMed

    Festing, M F

    2001-01-01

    In vitro experiments need to be well designed and correctly analysed if they are to achieve their full potential to replace the use of animals in research. An "experiment" is a procedure for collecting scientific data in order to answer a hypothesis, or to provide material for generating new hypotheses, and differs from a survey because the scientist has control over the treatments that can be applied. Most experiments can be classified into one of a few formal designs, the most common being completely randomised, and randomised block designs. These are quite common with in vitro experiments, which are often replicated in time. Some experiments involve a single independent (treatment) variable, while other "factorial" designs simultaneously vary two or more independent variables, such as drug treatment and cell line. Factorial designs often provide additional information at little extra cost. Experiments need to be carefully planned to avoid bias, be powerful yet simple, provide for a valid statistical analysis and, in some cases, have a wide range of applicability. Virtually all experiments need some sort of statistical analysis in order to take account of biological variation among the experimental subjects. Parametric methods using the t test or analysis of variance are usually more powerful than non-parametric methods, provided the underlying assumptions of normality of the residuals and equal variances are approximately valid. The statistical analyses of data from a completely randomised design, and from a randomised-block design are demonstrated in Appendices 1 and 2, and methods of determining sample size are discussed in Appendix 3. Appendix 4 gives a checklist for authors submitting papers to ATLA.
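
    As a worked illustration of the randomised-block analysis recommended above, the sketch below simulates a small in vitro experiment replicated across independent runs (blocks) and tests the treatment effect after removing run-to-run variation. The factor names, effect sizes and the statsmodels-based analysis are illustrative assumptions, not the worked examples given in the ATLA appendices.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf
      from statsmodels.stats.anova import anova_lm

      rng = np.random.default_rng(1)
      # Hypothetical in vitro experiment: 4 treatments replicated in 5 independent runs (blocks).
      treatments = ["control", "low", "mid", "high"]
      rows = []
      for block in range(5):
          block_effect = rng.normal(scale=0.5)          # run-to-run variation shared by all treatments
          for i, trt in enumerate(treatments):
              rows.append({"treatment": trt, "run": f"run{block}",
                           "response": 10 + 0.8 * i + block_effect + rng.normal(scale=0.4)})
      df = pd.DataFrame(rows)

      # Randomised-block analysis: treatment is tested against the residual after the block term.
      block_model = smf.ols("response ~ C(treatment) + C(run)", data=df).fit()
      print(anova_lm(block_model, typ=2))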

  8. [Continuity of hospital identifiers in hospital discharge data - Analysis of the nationwide German DRG Statistics from 2005 to 2013].

    PubMed

    Nimptsch, Ulrike; Wengler, Annelene; Mansky, Thomas

    2016-11-01

    In Germany, nationwide hospital discharge data (DRG statistics provided by the research data centers of the Federal Statistical Office and the Statistical Offices of the 'Länder') are increasingly used as data source for health services research. Within this data hospitals can be separated via their hospital identifier ([Institutionskennzeichen] IK). However, this hospital identifier primarily designates the invoicing unit and is not necessarily equivalent to one hospital location. Aiming to investigate direction and extent of possible bias in hospital-level analyses this study examines the continuity of the hospital identifier within a cross-sectional and longitudinal approach and compares the results to official hospital census statistics. Within the DRG statistics from 2005 to 2013 the annual number of hospitals as classified by hospital identifiers was counted for each year of observation. The annual number of hospitals derived from DRG statistics was compared to the number of hospitals in the official census statistics 'Grunddaten der Krankenhäuser'. Subsequently, the temporal continuity of hospital identifiers in the DRG statistics was analyzed within cohorts of hospitals. Until 2013, the annual number of hospital identifiers in the DRG statistics fell by 175 (from 1,725 to 1,550). This decline affected only providers with small or medium case volume. The number of hospitals identified in the DRG statistics was lower than the number given in the census statistics (e.g., in 2013 1,550 IK vs. 1,668 hospitals in the census statistics). The longitudinal analyses revealed that the majority of hospital identifiers persisted in the years of observation, while one fifth of hospital identifiers changed. In cross-sectional studies of German hospital discharge data the separation of hospitals via the hospital identifier might lead to underestimating the number of hospitals and consequential overestimation of caseload per hospital. Discontinuities of hospital identifiers over time might impair the follow-up of hospital cohorts. These limitations must be taken into account in analyses of German hospital discharge data focusing on the hospital level. Copyright © 2016. Published by Elsevier GmbH.

  9. Mobile phones and head tumours. The discrepancies in cause-effect relationships in the epidemiological studies - how do they arise?

    PubMed

    Levis, Angelo G; Minicuci, Nadia; Ricci, Paolo; Gennaro, Valerio; Garbisa, Spiridione

    2011-06-17

    Whether or not there is a relationship between use of mobile phones (analogue and digital cellulars, and cordless) and head tumour risk (brain tumours, acoustic neuromas, and salivary gland tumours) is still a matter of debate; progress requires a critical analysis of the methodological elements necessary for an impartial evaluation of contradictory studies. A close examination of the protocols and results from all case-control and cohort studies, pooled- and meta-analyses on head tumour risk for mobile phone users was carried out, and for each study the elements necessary for evaluating its reliability were identified. In addition, new meta-analyses of the literature data were undertaken. These were limited to subjects with mobile phone latency time compatible with the progression of the examined tumours, and with analysis of the laterality of head tumour localisation corresponding to the habitual laterality of mobile phone use. Blind protocols, free from errors, bias, and financial conditioning factors, give positive results that reveal a cause-effect relationship between long-term mobile phone use or latency and statistically significant increase of ipsilateral head tumour risk, with biological plausibility. Non-blind protocols, which instead are affected by errors, bias, and financial conditioning factors, give negative results with systematic underestimate of such risk. However, also in these studies a statistically significant increase in risk of ipsilateral head tumours is quite common after more than 10 years of mobile phone use or latency. The meta-analyses, our included, examining only data on ipsilateral tumours in subjects using mobile phones since or for at least 10 years, show large and statistically significant increases in risk of ipsilateral brain gliomas and acoustic neuromas. Our analysis of the literature studies and of the results from meta-analyses of the significant data alone shows an almost doubling of the risk of head tumours induced by long-term mobile phone use or latency.

  10. Mobile phones and head tumours. The discrepancies in cause-effect relationships in the epidemiological studies - how do they arise?

    PubMed Central

    2011-01-01

    Background Whether or not there is a relationship between use of mobile phones (analogue and digital cellulars, and cordless) and head tumour risk (brain tumours, acoustic neuromas, and salivary gland tumours) is still a matter of debate; progress requires a critical analysis of the methodological elements necessary for an impartial evaluation of contradictory studies. Methods A close examination of the protocols and results from all case-control and cohort studies, pooled- and meta-analyses on head tumour risk for mobile phone users was carried out, and for each study the elements necessary for evaluating its reliability were identified. In addition, new meta-analyses of the literature data were undertaken. These were limited to subjects with mobile phone latency time compatible with the progression of the examined tumours, and with analysis of the laterality of head tumour localisation corresponding to the habitual laterality of mobile phone use. Results Blind protocols, free from errors, bias, and financial conditioning factors, give positive results that reveal a cause-effect relationship between long-term mobile phone use or latency and statistically significant increase of ipsilateral head tumour risk, with biological plausibility. Non-blind protocols, which instead are affected by errors, bias, and financial conditioning factors, give negative results with systematic underestimate of such risk. However, also in these studies a statistically significant increase in risk of ipsilateral head tumours is quite common after more than 10 years of mobile phone use or latency. The meta-analyses, our included, examining only data on ipsilateral tumours in subjects using mobile phones since or for at least 10 years, show large and statistically significant increases in risk of ipsilateral brain gliomas and acoustic neuromas. Conclusions Our analysis of the literature studies and of the results from meta-analyses of the significant data alone shows an almost doubling of the risk of head tumours induced by long-term mobile phone use or latency. PMID:21679472

  11. The use of statistical tools in field testing of putative effects of genetically modified plants on nontarget organisms

    PubMed Central

    Semenov, Alexander V; Elsas, Jan Dirk; Glandorf, Debora C M; Schilthuizen, Menno; Boer, Willem F

    2013-01-01

    Abstract To fulfill existing guidelines, applicants that aim to place their genetically modified (GM) insect-resistant crop plants on the market are required to provide data from field experiments that address the potential impacts of the GM plants on nontarget organisms (NTO's). Such data may be based on varied experimental designs. The recent EFSA guidance document for environmental risk assessment (2010) does not provide clear and structured suggestions that address the statistics of field trials on effects on NTO's. This review examines existing practices in GM plant field testing such as the way of randomization, replication, and pseudoreplication. Emphasis is placed on the importance of design features used for the field trials in which effects on NTO's are assessed. The importance of statistical power and the positive and negative aspects of various statistical models are discussed. Equivalence and difference testing are compared, and the importance of checking the distribution of experimental data is stressed to decide on the selection of the proper statistical model. While for continuous data (e.g., pH and temperature) classical statistical approaches – for example, analysis of variance (ANOVA) – are appropriate, for discontinuous data (counts) only generalized linear models (GLM) are shown to be efficient. There is no golden rule as to which statistical test is the most appropriate for any experimental situation. In particular, in experiments in which block designs are used and covariates play a role GLMs should be used. Generic advice is offered that will help in both the setting up of field testing and the interpretation and data analysis of the data obtained in this testing. The combination of decision trees and a checklist for field trials, which are provided, will help in the interpretation of the statistical analyses of field trials and to assess whether such analyses were correctly applied. We offer generic advice to risk assessors and applicants that will help in both the setting up of field testing and the interpretation and data analysis of the data obtained in field testing. PMID:24567836
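
    The point that counts of nontarget organisms call for a generalized linear model rather than ANOVA on raw counts can be sketched as below: a Poisson GLM with a block term fitted to invented plot-level insect counts. The dataset, factor names and effect sizes are assumptions for illustration; a negative-binomial family would be the natural replacement if the residual deviance indicates overdispersion.

      import numpy as np
      import pandas as pd
      import statsmodels.api as sm
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(2)
      # Hypothetical field trial: insect counts per plot for a GM and a conventional variety in 6 blocks.
      rows = []
      for block in range(6):
          for variety, mu in [("conventional", 12.0), ("GM", 10.0)]:
              rows.append({"block": f"b{block}", "variety": variety,
                           "count": rng.poisson(mu * rng.gamma(10, 0.1))})   # mildly overdispersed counts
      df = pd.DataFrame(rows)

      # Poisson GLM with block as a covariate.
      model = smf.glm("count ~ variety + C(block)", data=df,
                      family=sm.families.Poisson()).fit()
      print(model.summary().tables[1])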

  12. A quantitative analysis of factors influencing the professional longevity of high school science teachers in Florida

    NASA Astrophysics Data System (ADS)

    Ridgley, James Alexander, Jr.

    This dissertation is an exploratory quantitative analysis of various independent variables to determine their effect on the professional longevity (years of service) of high school science teachers in the state of Florida for the academic years 2011-2012 to 2013-2014. Data are collected from the Florida Department of Education, National Center for Education Statistics, and the National Assessment of Educational Progress databases. The following research hypotheses are examined: H1 - There are statistically significant differences in Level 1 (teacher variables) that influence the professional longevity of a high school science teacher in Florida. H2 - There are statistically significant differences in Level 2 (school variables) that influence the professional longevity of a high school science teacher in Florida. H3 - There are statistically significant differences in Level 3 (district variables) that influence the professional longevity of a high school science teacher in Florida. H4 - When tested in a hierarchical multiple regression, there are statistically significant differences in Level 1, Level 2, or Level 3 that influence the professional longevity of a high school science teacher in Florida. The professional longevity of a Floridian high school science teacher is the dependent variable. The independent variables are: (Level 1) a teacher's sex, age, ethnicity, earned degree, salary, number of schools taught in, migration count, and various years of service in different areas of education; (Level 2) a school's geographic location, residential population density, average class size, charter status, and SES; and (Level 3) a school district's average SES and average spending per pupil. Statistical analyses of exploratory MLRs and a HMR are used to support the research hypotheses. The final results of the HMR analysis show a teacher's age, salary, earned degree (unknown, associate, and doctorate), and ethnicity (Hispanic and Native Hawaiian/Pacific Islander); a school's charter status; and a school district's average SES are all significant predictors of a Florida high school science teacher's professional longevity. Although statistically significant in the initial exploratory MLR analyses, a teacher's ethnicity (Asian and Black), a school's geographic location (city and rural), and a school's SES are not statistically significant in the final HMR model.

  13. The use of statistical tools in field testing of putative effects of genetically modified plants on nontarget organisms.

    PubMed

    Semenov, Alexander V; Elsas, Jan Dirk; Glandorf, Debora C M; Schilthuizen, Menno; Boer, Willem F

    2013-08-01

    To fulfill existing guidelines, applicants that aim to place their genetically modified (GM) insect-resistant crop plants on the market are required to provide data from field experiments that address the potential impacts of the GM plants on nontarget organisms (NTO's). Such data may be based on varied experimental designs. The recent EFSA guidance document for environmental risk assessment (2010) does not provide clear and structured suggestions that address the statistics of field trials on effects on NTO's. This review examines existing practices in GM plant field testing such as the way of randomization, replication, and pseudoreplication. Emphasis is placed on the importance of design features used for the field trials in which effects on NTO's are assessed. The importance of statistical power and the positive and negative aspects of various statistical models are discussed. Equivalence and difference testing are compared, and the importance of checking the distribution of experimental data is stressed to decide on the selection of the proper statistical model. While for continuous data (e.g., pH and temperature) classical statistical approaches - for example, analysis of variance (ANOVA) - are appropriate, for discontinuous data (counts) only generalized linear models (GLM) are shown to be efficient. There is no golden rule as to which statistical test is the most appropriate for any experimental situation. In particular, in experiments in which block designs are used and covariates play a role GLMs should be used. Generic advice is offered that will help in both the setting up of field testing and the interpretation and data analysis of the data obtained in this testing. The combination of decision trees and a checklist for field trials, which are provided, will help in the interpretation of the statistical analyses of field trials and to assess whether such analyses were correctly applied. We offer generic advice to risk assessors and applicants that will help in both the setting up of field testing and the interpretation and data analysis of the data obtained in field testing.

  14. Models of dyadic social interaction.

    PubMed Central

    Griffin, Dale; Gonzalez, Richard

    2003-01-01

    We discuss the logic of research designs for dyadic interaction and present statistical models with parameters that are tied to psychologically relevant constructs. Building on Karl Pearson's classic nineteenth-century statistical analysis of within-organism similarity, we describe several approaches to indexing dyadic interdependence and provide graphical methods for visualizing dyadic data. We also describe several statistical and conceptual solutions to the 'levels of analysis' problem in analysing dyadic data. These analytic strategies allow the researcher to examine and measure psychological questions of interdependence and social influence. We provide illustrative data from casually interacting and romantic dyads. PMID:12689382

  15. Methodological approaches in analysing observational data: A practical example on how to address clustering and selection bias.

    PubMed

    Trutschel, Diana; Palm, Rebecca; Holle, Bernhard; Simon, Michael

    2017-11-01

    Because not every scientific question on effectiveness can be answered with randomised controlled trials, research methods that minimise bias in observational studies are required. Two major concerns influence the internal validity of effect estimates: selection bias and clustering. Hence, to reduce the bias of the effect estimates, more sophisticated statistical methods are needed. This article introduces statistical approaches such as propensity score matching and mixed models in a representative real-world analysis, and presents the implementation in the statistical software R so that the results can be reproduced. We perform a two-level analytic strategy to address the problems of bias and clustering: (i) generalised models with different abilities to adjust for dependencies are used to analyse binary data and (ii) the genetic matching and covariate adjustment methods are used to adjust for selection bias. Hence, we analyse the data from two population samples, the sample produced by the matching method and the full sample. The different analysis methods in this article present different results but still point in the same direction. In our example, the estimate of the probability of receiving a case conference is higher in the treatment group than in the control group. Both strategies, genetic matching and covariate adjustment, have their limitations but complement each other to provide the whole picture. The statistical approaches were feasible for reducing bias but were nevertheless limited by the sample used. For each study and obtained sample, the pros and cons of the different methods have to be weighed. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
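
    A minimal sketch of the selection-bias step is given below: a propensity score from a logistic model of treatment on the confounders, followed by nearest-neighbour matching, with the raw and matched outcome differences compared. Note that this substitutes ordinary propensity-score matching for the genetic matching used in the paper, and the variables (severity, age, a binary case-conference-style outcome) are invented for illustration.

      import numpy as np
      import pandas as pd
      from sklearn.linear_model import LogisticRegression
      from sklearn.neighbors import NearestNeighbors

      rng = np.random.default_rng(3)
      n = 400
      # Hypothetical observational data: sicker residents are more likely to receive the intervention.
      severity = rng.normal(size=n)
      age = rng.normal(80, 6, size=n)
      treated = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * severity - 0.02 * (age - 80)))))
      outcome = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 0.7 * treated + 0.6 * severity))))
      df = pd.DataFrame({"severity": severity, "age": age, "treated": treated, "outcome": outcome})

      # 1) Propensity score from a logistic model of treatment on the confounders.
      ps_model = LogisticRegression().fit(df[["severity", "age"]], df["treated"])
      df["ps"] = ps_model.predict_proba(df[["severity", "age"]])[:, 1]

      # 2) Match each treated unit to the control with the closest propensity score (with replacement).
      treated_df = df[df["treated"] == 1]
      control_df = df[df["treated"] == 0]
      nn = NearestNeighbors(n_neighbors=1).fit(control_df[["ps"]])
      _, idx = nn.kneighbors(treated_df[["ps"]])
      matched_controls = control_df.iloc[idx[:, 0]]

      print("raw difference in outcome:    ",
            round(treated_df["outcome"].mean() - control_df["outcome"].mean(), 3))
      print("matched difference in outcome:",
            round(treated_df["outcome"].mean() - matched_controls["outcome"].mean(), 3))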

  16. Narrative Review of Statistical Reporting Checklists, Mandatory Statistical Editing, and Rectifying Common Problems in the Reporting of Scientific Articles.

    PubMed

    Dexter, Franklin; Shafer, Steven L

    2017-03-01

    Considerable attention has been drawn to poor reproducibility in the biomedical literature. One explanation is inadequate reporting of statistical methods by authors and inadequate assessment of statistical reporting and methods during peer review. In this narrative review, we examine scientific studies of several well-publicized efforts to improve statistical reporting. We also review several retrospective assessments of the impact of these efforts. These studies show that instructions to authors and statistical checklists are not sufficient; no findings suggested that either improves the quality of statistical methods and reporting. Second, even basic statistics, such as power analyses, are frequently missing or incorrectly performed. Third, statistical review is needed for all papers that involve data analysis. A consistent finding in the studies was that nonstatistical reviewers (eg, "scientific reviewers") and journal editors generally poorly assess statistical quality. We finish by discussing our experience with statistical review at Anesthesia & Analgesia from 2006 to 2016.

  17. Descriptive and inferential statistical methods used in burns research.

    PubMed

    Al-Benna, Sammy; Al-Ajam, Yazan; Way, Benjamin; Steinstraesser, Lars

    2010-05-01

    Burns research articles utilise a variety of descriptive and inferential methods to present and analyse data. The aim of this study was to determine the descriptive methods (e.g. mean, median, SD, range, etc.) and survey the use of inferential methods (statistical tests) used in articles in the journal Burns. This study defined its population as all original articles published in the journal Burns in 2007. Letters to the editor, brief reports, reviews, and case reports were excluded. Study characteristics, use of descriptive statistics and the number and types of statistical methods employed were evaluated. Of the 51 articles analysed, 11(22%) were randomised controlled trials, 18(35%) were cohort studies, 11(22%) were case control studies and 11(22%) were case series. The study design and objectives were defined in all articles. All articles made use of continuous and descriptive data. Inferential statistics were used in 49(96%) articles. Data dispersion was calculated by standard deviation in 30(59%). Standard error of the mean was quoted in 19(37%). The statistical software product was named in 33(65%). Of the 49 articles that used inferential statistics, the tests were named in 47(96%). The 6 most common tests used (Student's t-test (53%), analysis of variance/co-variance (33%), chi(2) test (27%), Wilcoxon & Mann-Whitney tests (22%), Fisher's exact test (12%)) accounted for the majority (72%) of statistical methods employed. A specified significance level was named in 43(88%) and the exact significance levels were reported in 28(57%). Descriptive analysis and basic statistical techniques account for most of the statistical tests reported. This information should prove useful in deciding which tests should be emphasised in educating burn care professionals. These results highlight the need for burn care professionals to have a sound understanding of basic statistics, which is crucial in interpreting and reporting data. Advice should be sought from professionals in the fields of biostatistics and epidemiology when using more advanced statistical techniques. Copyright 2009 Elsevier Ltd and ISBI. All rights reserved.

  18. Analysis of categorical moderators in mixed-effects meta-analysis: Consequences of using pooled versus separate estimates of the residual between-studies variances.

    PubMed

    Rubio-Aparicio, María; Sánchez-Meca, Julio; López-López, José Antonio; Botella, Juan; Marín-Martínez, Fulgencio

    2017-11-01

    Subgroup analyses allow us to examine the influence of a categorical moderator on the effect size in meta-analysis. We conducted a simulation study using a dichotomous moderator, and compared the impact of pooled versus separate estimates of the residual between-studies variance on the statistical performance of the Q_B(P) and Q_B(S) tests for subgroup analyses assuming a mixed-effects model. Our results suggested that similar performance can be expected as long as there are at least 20 studies and these are approximately balanced across categories. Conversely, when subgroups were unbalanced, the practical consequences of having heterogeneous residual between-studies variances were more evident, with both tests leading to the wrong statistical conclusion more often than in the conditions with balanced subgroups. A pooled estimate should be preferred for most scenarios, unless the residual between-studies variances are clearly different and there are enough studies in each category to obtain precise separate estimates. © 2017 The British Psychological Society.
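
    The sketch below illustrates the two variants being compared: a between-subgroups Q_B statistic computed under a mixed-effects model with either a pooled or a separate DerSimonian-Laird estimate of the residual between-studies variance. The data are invented and the helper functions are ours; the paper's own simulation conditions are not reproduced.

      import numpy as np
      from scipy.stats import chi2

      def dl_tau2(y, v):
          # DerSimonian-Laird estimate of the between-studies variance.
          w = 1.0 / v
          ybar = np.sum(w * y) / np.sum(w)
          q = np.sum(w * (y - ybar) ** 2)
          c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
          return max(0.0, (q - (len(y) - 1)) / c)

      def subgroup_qb(y, v, g, pooled=True):
          # Between-subgroups Q statistic (and p value) under a mixed-effects model.
          groups = np.unique(g)
          tau2_all = dl_tau2(y, v)
          means, weights = [], []
          for grp in groups:
              m = g == grp
              tau2 = tau2_all if pooled else dl_tau2(y[m], v[m])
              w = 1.0 / (v[m] + tau2)
              means.append(np.sum(w * y[m]) / np.sum(w))
              weights.append(np.sum(w))
          means, weights = np.array(means), np.array(weights)
          grand = np.sum(weights * means) / np.sum(weights)
          qb = np.sum(weights * (means - grand) ** 2)
          return qb, chi2.sf(qb, len(groups) - 1)

      # Toy data: 24 effect sizes, dichotomous moderator, true residual heterogeneity 0.03.
      rng = np.random.default_rng(4)
      g = np.repeat(["A", "B"], 12)
      v = rng.uniform(0.02, 0.08, size=24)
      y = np.where(g == "A", 0.2, 0.5) + rng.normal(scale=np.sqrt(v + 0.03))
      print("pooled tau^2:   Q_B = %.2f, p = %.3f" % subgroup_qb(y, v, g, pooled=True))
      print("separate tau^2: Q_B = %.2f, p = %.3f" % subgroup_qb(y, v, g, pooled=False))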

  19. An evaluation of various methods of treatment for Legg-Calvé-Perthes disease.

    PubMed

    Wang, L; Bowen, J R; Puniak, M A; Guille, J T; Glutting, J

    1995-05-01

    An analysis of 5 methods of treatment for Legg-Calvé-Perthes disease was done on 124 patients with 141 affected hips. Before treatment, all groups were statistically similar concerning initial Mose measurement, age at onset of the disease, gender, and Catterall class. Treatments included the Scottish Rite orthosis (41 hips), nonweight bearing and exercises (41 hips), Petrie cast (29 hips), femoral varus osteotomy (15 hips), or Salter osteotomy (15 hips). Hips treated by the Scottish Rite orthosis had a significantly worse Mose measurement across time interaction (repeated measures analysis of variance, post hoc analyses, p < 0.05). For the other 4 treatment methods, there was no statistically different change. At followup, the Mose measurements for hips treated with the Scottish Rite orthosis were significantly worse than those for hips treated by nonweight bearing and exercises, Petrie cast, varus osteotomy, or Salter osteotomy (repeated measures analysis of variance, post hoc analyses, p < 0.05). There was, however, no significant difference in the distribution of hips according to the Stulberg et al classification at the last followup.

  20. Statistical Analyses of Femur Parameters for Designing Anatomical Plates.

    PubMed

    Wang, Lin; He, Kunjin; Chen, Zhengming

    2016-01-01

    Femur parameters are key prerequisites for scientifically designing anatomical plates. Meanwhile, individual differences in femurs present a challenge to designing well-fitting anatomical plates. Therefore, to design anatomical plates more scientifically, analyses of femur parameters with statistical methods were performed in this study. The specific steps were as follows. First, taking eight anatomical femur parameters as variables, 100 femur samples were classified into three classes with factor analysis and Q-type cluster analysis. Second, based on the mean parameter values of the three classes of femurs, three sizes of average anatomical plates corresponding to the three classes of femurs were designed. Finally, based on Bayes discriminant analysis, a new femur could be assigned to the proper class. Thereafter, the average anatomical plate suitable for that new femur was selected from the three available sizes of plates. Experimental results showed that the classification of femurs was quite reasonable based on the anatomical aspects of the femurs. For instance, three sizes of condylar buttress plates were designed. Meanwhile, 20 new femurs were assigned to their proper classes, and suitable condylar buttress plates were then determined and selected.
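
    The classify-then-assign workflow described above can be sketched as follows: cluster standardized femur parameter vectors into three classes, fit a discriminant model on the resulting labels, and use it to assign a new femur (and hence a plate size). The sketch uses k-means and linear discriminant analysis as stand-ins for the Q-type cluster analysis and Bayes discriminant analysis of the paper, and the eight parameter values are simulated.

      import numpy as np
      from sklearn.preprocessing import StandardScaler
      from sklearn.cluster import KMeans
      from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

      rng = np.random.default_rng(5)
      # 100 hypothetical femurs, 8 anatomical parameters (length, angles, widths, ...).
      X = np.column_stack([rng.normal(loc, scale, 100) for loc, scale in
                           [(430, 25), (44, 3), (129, 5), (32, 2), (27, 2), (75, 5), (10, 1), (150, 8)]])

      scaler = StandardScaler().fit(X)
      Xs = scaler.transform(X)

      # Step 1: partition the sample into three classes (cluster analysis).
      classes = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Xs)

      # Step 2: fit a discriminant model so a new femur can be assigned to a class,
      # after which the matching plate size would be selected.
      lda = LinearDiscriminantAnalysis().fit(Xs, classes)
      new_femur = scaler.transform(rng.normal([430, 44, 129, 32, 27, 75, 10, 150], 1.0, size=(1, 8)))
      print("assigned class for the new femur:", int(lda.predict(new_femur)[0]))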

  1. Geographical origin discrimination of lentils (Lens culinaris Medik.) using 1H NMR fingerprinting and multivariate statistical analyses.

    PubMed

    Longobardi, Francesco; Innamorato, Valentina; Di Gioia, Annalisa; Ventrella, Andrea; Lippolis, Vincenzo; Logrieco, Antonio F; Catucci, Lucia; Agostiano, Angela

    2017-12-15

    Lentil samples coming from two different countries, i.e. Italy and Canada, were analysed using untargeted 1H NMR fingerprinting in combination with chemometrics in order to build models able to classify them according to their geographical origin. For such aim, Soft Independent Modelling of Class Analogy (SIMCA), k-Nearest Neighbor (k-NN), Principal Component Analysis followed by Linear Discriminant Analysis (PCA-LDA) and Partial Least Squares-Discriminant Analysis (PLS-DA) were applied to the NMR data and the results were compared. The best combination of average recognition (100%) and cross-validation prediction abilities (96.7%) was obtained for the PCA-LDA. All the statistical models were validated both by using a test set and by carrying out a Monte Carlo Cross Validation: the obtained performances were found to be satisfying for all the models, with prediction abilities higher than 95% demonstrating the suitability of the developed methods. Finally, the metabolites that mostly contributed to the lentil discrimination were indicated. Copyright © 2017 Elsevier Ltd. All rights reserved.
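
    A hedged sketch of the best-performing model, PCA followed by LDA evaluated by cross-validation, is given below using simulated NMR-like fingerprints; the number of retained components, the bin count and the class structure are illustrative assumptions rather than the settings used in the study.

      import numpy as np
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.decomposition import PCA
      from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
      from sklearn.model_selection import cross_val_score, StratifiedKFold

      rng = np.random.default_rng(6)
      # Simulated NMR fingerprints: 60 samples x 200 spectral bins, two origins that differ
      # in a handful of metabolite signals.
      X = rng.normal(size=(60, 200))
      y = np.repeat(["Italy", "Canada"], 30)
      X[y == "Italy", :5] += 0.8          # origin-dependent shift in a few bins

      pca_lda = make_pipeline(StandardScaler(), PCA(n_components=10), LinearDiscriminantAnalysis())
      scores = cross_val_score(pca_lda, X, y,
                               cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0))
      print(f"cross-validated prediction ability: {scores.mean():.1%}")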

  2. Sieve analysis in HIV-1 vaccine efficacy trials

    PubMed Central

    Edlefsen, Paul T.; Gilbert, Peter B.; Rolland, Morgane

    2013-01-01

    Purpose of review The genetic characterization of HIV-1 breakthrough infections in vaccine and placebo recipients offers new ways to assess vaccine efficacy trials. Statistical and sequence analysis methods provide opportunities to mine the mechanisms behind the effect of an HIV vaccine. Recent findings The release of results from two HIV-1 vaccine efficacy trials, Step/HVTN-502 and RV144, led to numerous studies in the last five years, including efforts to sequence HIV-1 breakthrough infections and compare viral characteristics between the vaccine and placebo groups. Novel genetic and statistical analysis methods uncovered features that distinguished founder viruses isolated from vaccinees from those isolated from placebo recipients, and identified HIV-1 genetic targets of vaccine-induced immune responses. Summary Studies of HIV-1 breakthrough infections in vaccine efficacy trials can provide an independent confirmation to correlates of risk studies, as they take advantage of vaccine/placebo comparisons while correlates of risk analyses are limited to vaccine recipients. Through the identification of viral determinants impacted by vaccine-mediated host immune responses, sieve analyses can shed light on potential mechanisms of vaccine protection. PMID:23719202

  3. Sieve analysis in HIV-1 vaccine efficacy trials.

    PubMed

    Edlefsen, Paul T; Gilbert, Peter B; Rolland, Morgane

    2013-09-01

    The genetic characterization of HIV-1 breakthrough infections in vaccine and placebo recipients offers new ways to assess vaccine efficacy trials. Statistical and sequence analysis methods provide opportunities to mine the mechanisms behind the effect of an HIV vaccine. The release of results from two HIV-1 vaccine efficacy trials, Step/HVTN-502 (HIV Vaccine Trials Network-502) and RV144, led to numerous studies in the last 5 years, including efforts to sequence HIV-1 breakthrough infections and compare viral characteristics between the vaccine and placebo groups. Novel genetic and statistical analysis methods uncovered features that distinguished founder viruses isolated from vaccinees from those isolated from placebo recipients, and identified HIV-1 genetic targets of vaccine-induced immune responses. Studies of HIV-1 breakthrough infections in vaccine efficacy trials can provide an independent confirmation to correlates of risk studies, as they take advantage of vaccine/placebo comparisons, whereas correlates of risk analyses are limited to vaccine recipients. Through the identification of viral determinants impacted by vaccine-mediated host immune responses, sieve analyses can shed light on potential mechanisms of vaccine protection.

  4. Temporal scaling and spatial statistical analyses of groundwater level fluctuations

    NASA Astrophysics Data System (ADS)

    Sun, H.; Yuan, L., Sr.; Zhang, Y.

    2017-12-01

    Natural dynamics such as groundwater level fluctuations can exhibit multifractionality and/or multifractality due likely to multi-scale aquifer heterogeneity and controlling factors, whose statistics requires efficient quantification methods. This study explores multifractionality and non-Gaussian properties in groundwater dynamics expressed by time series of daily level fluctuation at three wells located in the lower Mississippi valley, after removing the seasonal cycle in the temporal scaling and spatial statistical analysis. First, using the time-scale multifractional analysis, a systematic statistical method is developed to analyze groundwater level fluctuations quantified by the time-scale local Hurst exponent (TS-LHE). Results show that the TS-LHE does not remain constant, implying the fractal-scaling behavior changing with time and location. Hence, we can distinguish the potentially location-dependent scaling feature, which may characterize the hydrology dynamic system. Second, spatial statistical analysis shows that the increment of groundwater level fluctuations exhibits a heavy tailed, non-Gaussian distribution, which can be better quantified by a Lévy stable distribution. Monte Carlo simulations of the fluctuation process also show that the linear fractional stable motion model can well depict the transient dynamics (i.e., fractal non-Gaussian property) of groundwater level, while fractional Brownian motion is inadequate to describe natural processes with anomalous dynamics. Analysis of temporal scaling and spatial statistics therefore may provide useful information and quantification to understand further the nature of complex dynamics in hydrology.
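
    For orientation, the sketch below estimates a single, global Hurst exponent with the simple aggregated-variance method (the variance of block means scales as m**(2H-2) for self-similar increments); the time-scale local Hurst exponent and Lévy-stable fitting developed in the paper are more elaborate and are not reproduced here. The white-noise test series is simulated.

      import numpy as np

      def hurst_aggvar(series, block_sizes):
          # Aggregated-variance Hurst estimate: fit the log-log slope of var(block means) vs m,
          # then H = 1 + slope/2.
          series = np.asarray(series, dtype=float)
          log_m, log_var = [], []
          for m in block_sizes:
              n_blocks = len(series) // m
              means = series[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)
              log_m.append(np.log(m))
              log_var.append(np.log(means.var()))
          slope = np.polyfit(log_m, log_var, 1)[0]
          return 1.0 + slope / 2.0

      rng = np.random.default_rng(7)
      increments = rng.normal(size=20000)        # white noise: increments of ordinary Brownian motion
      print("H estimate (expected near 0.5):",
            round(hurst_aggvar(increments, [10, 20, 50, 100, 200, 500]), 2))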

  5. Multispectral determination of soil moisture-2. [Guymon, Oklahoma and Dalhart, Texas

    NASA Technical Reports Server (NTRS)

    Estes, J. E.; Simonett, D. S. (Principal Investigator); Hajic, E. J.; Hilton, B. M.; Lees, R. D.

    1982-01-01

    Soil moisture data obtained using scatterometers, modular multispectral scanners and passive microwave radiometers were revised and grouped into four field cover types for statistical analysis. Guymon data are grouped as alfalfa, bare, milo with rows perpendicular to the field of view, and milo viewed parallel to the field of view. Dalhart data are grouped as bare combo, stubble, disked stubble, and corn field. Summary graphs combine selected analyses to compare the effects of field cover. The analysis for each of the cover types is presented in tables and graphs. Other tables show elementary statistics, correlation matrices, and single variable regressions. Selected eigenvectors and factor analyses are included and the highest correlating sensor types for each location are summarized.

  6. Incorporating oximeter analyses to investigate synchronies in heart rate while teaching and learning about race

    NASA Astrophysics Data System (ADS)

    Amat, Arnau; Zapata, Corinna; Alexakos, Konstantinos; Pride, Leah D.; Paylor-Smith, Christian; Hernandez, Matthew

    2016-09-01

    In this paper, we look closely at two events selected through event-oriented inquiry that were part of a classroom presentation on race. The first event was a provocative discussion about Mark Twain's Pudd'nhead Wilson (Harper, New York, 1899) and passing for being White. The other was a discussion on the use of the N-word. Grounded in authentic inquiry, we use ethnographic narrative, cogenerative dialogues, and video and oximeter data analyses as part of a multi-ontological approach for studying emotions. Statistical analysis of oximeter data shows statistically significant heart rate synchrony between two of the coteachers during their presentations, providing evidence of emotional synchrony, resonance, and social and emotional contagion.

  7. Inelastic Single Pion Signal Study in T2K νe Appearance using Modified Decay Electron Cut

    NASA Astrophysics Data System (ADS)

    Iwamoto, Konosuke; T2K Collaboration

    2015-04-01

    The T2K long-baseline neutrino experiment uses sophisticated selection criteria to identify the neutrino oscillation signals among the events reconstructed in the Super-Kamiokande (SK) detector for νe and νμ appearance and disappearance analyses. In current analyses, charged-current quasi-elastic (CCQE) events are used as the signal reaction in the SK detector because the energy can be precisely reconstructed. This talk presents an approach to increase the statistics of the oscillation analysis by including non-CCQE events with one Michel electron and reconstructing them as inelastic single-pion production events. The increase in statistics, the backgrounds to this new process, and the implications for energy reconstruction with this enlarged event sample will be presented.

  8. Use of recurrence plots in the analysis of pupil diameter dynamics in narcoleptics

    NASA Astrophysics Data System (ADS)

    Keegan, Andrew P.; Zbilut, J. P.; Merritt, S. L.; Mercer, P. J.

    1993-11-01

    Recurrence plots were used to evaluate pupil dynamics of subjects with narcolepsy. Preliminary data indicate that this nonlinear method of analyses may be more useful in revealing underlying deterministic differences than traditional methods like FFT and counting statistics.

  9. Statistical analysis of sperm sorting

    NASA Astrophysics Data System (ADS)

    Koh, James; Marcos, Marcos

    2017-11-01

    The success rate of assisted reproduction depends on the proportion of morphologically normal sperm. It is possible to use an external field for manipulation and sorting. Depending on their morphology, the extent of response varies. Due to the wide distribution in sperm morphology even among individuals, the resulting distribution of kinematic behaviour, and consequently the feasibility of sorting, should be analysed statistically. In this theoretical work, Resistive Force Theory and Slender Body Theory will be applied and compared.

  10. Assessing the significance of pedobarographic signals using random field theory.

    PubMed

    Pataky, Todd C

    2008-08-07

    Traditional pedobarographic statistical analyses are conducted over discrete regions. Recent studies have demonstrated that regionalization can corrupt pedobarographic field data through conflation when arbitrary dividing lines inappropriately delineate smooth field processes. An alternative is to register images such that homologous structures optimally overlap and then conduct statistical tests at each pixel to generate statistical parametric maps (SPMs). The significance of SPM processes may be assessed within the framework of random field theory (RFT). RFT is ideally suited to pedobarographic image analysis because its fundamental data unit is a lattice sampling of a smooth and continuous spatial field. To correct for the vast number of multiple comparisons inherent in such data, recent pedobarographic studies have employed a Bonferroni correction to retain a constant family-wise error rate. This approach unfortunately neglects the spatial correlation of neighbouring pixels, so provides an overly conservative (albeit valid) statistical threshold. RFT generally relaxes the threshold depending on field smoothness and on the geometry of the search area, but it also provides a framework for assigning p values to suprathreshold clusters based on their spatial extent. The current paper provides an overview of basic RFT concepts and uses simulated and experimental data to validate both RFT-relevant field smoothness estimations and RFT predictions regarding the topological characteristics of random pedobarographic fields. Finally, previously published experimental data are re-analysed using RFT inference procedures to demonstrate how RFT yields easily understandable statistical results that may be incorporated into routine clinical and laboratory analyses.
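
    The pixel-wise SPM step with the conservative Bonferroni threshold discussed above can be sketched as below on simulated, spatially smooth pressure images; the random field theory threshold itself, which depends on the estimated field smoothness and the geometry of the search region, is not implemented here, so the example only reproduces the baseline that RFT improves upon.

      import numpy as np
      from scipy import ndimage, stats

      rng = np.random.default_rng(8)
      shape = (64, 32)                     # registered plantar-pressure image grid
      n_a, n_b = 15, 15

      def smooth_images(n):
          # Simulated registered pressure images with spatially correlated noise.
          return np.stack([ndimage.gaussian_filter(rng.normal(size=shape), sigma=3) for _ in range(n)])

      group_a = smooth_images(n_a)
      group_b = smooth_images(n_b)
      group_b[:, 20:30, 10:20] += 0.3      # a localized regional difference

      # Pixel-wise two-sample t test -> statistical parametric map.
      t_map, _ = stats.ttest_ind(group_a, group_b, axis=0)

      # Bonferroni threshold over all pixels: valid but conservative, because it ignores
      # the spatial correlation of neighbouring pixels that RFT exploits.
      alpha = 0.05
      n_pixels = np.prod(shape)
      t_crit = stats.t.ppf(1 - alpha / (2 * n_pixels), df=n_a + n_b - 2)
      print("Bonferroni critical |t|:", round(float(t_crit), 2))
      print("suprathreshold pixels:", int(np.sum(np.abs(t_map) > t_crit)))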

  11. Size and shape measurement in contemporary cephalometrics.

    PubMed

    McIntyre, Grant T; Mossey, Peter A

    2003-06-01

    The traditional method of analysing cephalograms--conventional cephalometric analysis (CCA)--involves the calculation of linear distance measurements, angular measurements, area measurements, and ratios. Because shape information cannot be determined from these 'size-based' measurements, an increasing number of studies employ geometric morphometric tools in the cephalometric analysis of craniofacial morphology. Most of the discussions surrounding the appropriateness of CCA, Procrustes superimposition, Euclidean distance matrix analysis (EDMA), thin-plate spline analysis (TPS), finite element morphometry (FEM), elliptical Fourier functions (EFF), and medial axis analysis (MAA) have centred upon mathematical and statistical arguments. Surprisingly, little information is available to assist the orthodontist in the clinical relevance of each technique. This article evaluates the advantages and limitations of the above methods currently used to analyse the craniofacial morphology on cephalograms and investigates their clinical relevance and possible applications.

  12. Confounding in statistical mediation analysis: What it is and how to address it.

    PubMed

    Valente, Matthew J; Pelham, William E; Smyth, Heather; MacKinnon, David P

    2017-11-01

    Psychology researchers are often interested in mechanisms underlying how randomized interventions affect outcomes such as substance use and mental health. Mediation analysis is a common statistical method for investigating psychological mechanisms that has benefited from exciting new methodological improvements over the last 2 decades. One of the most important new developments is methodology for estimating causal mediated effects using the potential outcomes framework for causal inference. Potential outcomes-based methods developed in epidemiology and statistics have important implications for understanding psychological mechanisms. We aim to provide a concise introduction to and illustration of these new methods and emphasize the importance of confounder adjustment. First, we review the traditional regression approach for estimating mediated effects. Second, we describe the potential outcomes framework. Third, we define what a confounder is and how the presence of a confounder can provide misleading evidence regarding mechanisms of interventions. Fourth, we describe experimental designs that can help rule out confounder bias. Fifth, we describe new statistical approaches to adjust for measured confounders of the mediator-outcome relation and sensitivity analyses to probe effects of unmeasured confounders on the mediated effect. All approaches are illustrated with application to a real counseling intervention dataset. Counseling psychologists interested in understanding the causal mechanisms of their interventions can benefit from incorporating the most up-to-date techniques into their mediation analyses. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
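
    The traditional product-of-coefficients approach reviewed first in the article can be sketched in a few lines; the intervention, mediator and outcome below are simulated, and the causal interpretation of a*b still rests on the no-unmeasured-confounding assumptions the article goes on to discuss (the potential-outcomes estimators and sensitivity analyses are not reproduced here).

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(9)
      n = 300
      treat = rng.binomial(1, 0.5, n)                    # randomized intervention
      mediator = 0.5 * treat + rng.normal(size=n)        # e.g., motivation to change
      outcome = 0.4 * mediator + 0.1 * treat + rng.normal(size=n)
      df = pd.DataFrame({"treat": treat, "mediator": mediator, "outcome": outcome})

      # a path: effect of the intervention on the mediator.
      a = smf.ols("mediator ~ treat", df).fit().params["treat"]
      # b path and direct effect: outcome regressed on mediator and intervention together.
      out_model = smf.ols("outcome ~ mediator + treat", df).fit()
      b, direct = out_model.params["mediator"], out_model.params["treat"]

      # The a*b product equals the causal mediated effect only if the mediator-outcome
      # relation is unconfounded, which randomization of the intervention does not guarantee.
      print(f"mediated (a*b) effect: {a * b:.3f}, direct effect: {direct:.3f}")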

  13. Two-Year versus One-Year Head Start Program Impact: Addressing Selection Bias by Comparing Regression Modeling with Propensity Score Analysis

    ERIC Educational Resources Information Center

    Leow, Christine; Wen, Xiaoli; Korfmacher, Jon

    2015-01-01

    This article compares regression modeling and propensity score analysis as different types of statistical techniques used in addressing selection bias when estimating the impact of two-year versus one-year Head Start on children's school readiness. The analyses were based on the national Head Start secondary dataset. After controlling for…
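
    A generic sketch of the two strategies being compared is given below; the covariates, outcome and effect sizes are simulated stand-ins, not the Head Start data, and the propensity score estimate uses simple inverse probability weighting rather than any specific procedure from the article.

```python
# Sketch: regression adjustment versus propensity-score inverse probability
# weighting (IPW) on simulated data with hypothetical variable names.
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
covs = rng.normal(size=(n, 3))                          # family background covariates
p_treat = 1 / (1 + np.exp(-(covs @ [0.8, -0.5, 0.3])))
two_year = rng.binomial(1, p_treat)                     # 1 = two-year programme
readiness = 0.3 * two_year + covs @ [0.5, 0.2, -0.4] + rng.normal(size=n)

# (1) Regression adjustment
X = sm.add_constant(np.column_stack([two_year, covs]))
print("regression estimate:", sm.OLS(readiness, X).fit().params[1].round(3))

# (2) Propensity scores + inverse probability weighting
ps = LogisticRegression(max_iter=1000).fit(covs, two_year).predict_proba(covs)[:, 1]
w = np.where(two_year == 1, 1 / ps, 1 / (1 - ps))
ipw = (np.average(readiness[two_year == 1], weights=w[two_year == 1])
       - np.average(readiness[two_year == 0], weights=w[two_year == 0]))
print("IPW estimate:", round(ipw, 3))
```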

  14. An Exploration of Bias in Meta-Analysis: The Case of Technology Integration Research in Higher Education

    ERIC Educational Resources Information Center

    Bernard, Robert M.; Borokhovski, Eugene; Schmid, Richard F.; Tamim, Rana M.

    2014-01-01

    This article contains a second-order meta-analysis and an exploration of bias in the technology integration literature in higher education. Thirteen meta-analyses, dated from 2000 to 2014, were selected for inclusion based on the questions asked and the presence of adequate statistical information to conduct a quantitative synthesis. The weighted…

  15. medplot: a web application for dynamic summary and analysis of longitudinal medical data based on R.

    PubMed

    Ahlin, Črt; Stupica, Daša; Strle, Franc; Lusa, Lara

    2015-01-01

    In biomedical studies the patients are often evaluated numerous times and a large number of variables are recorded at each time-point. Data entry and manipulation of longitudinal data can be performed using spreadsheet programs, which usually include some data plotting and analysis capabilities and are straightforward to use, but are not designed for the analysis of complex longitudinal data. Specialized statistical software offers more flexibility and capabilities, but first-time users with a biomedical background often find its use difficult. We developed medplot, an interactive web application that simplifies the exploration and analysis of longitudinal data. The application can be used to summarize, visualize and analyze data by researchers who are not familiar with statistical programs and whose knowledge of statistics is limited. The summary tools produce publication-ready tables and graphs. The analysis tools include features that are seldom available in spreadsheet software, such as correction for multiple testing, repeated measurement analyses and flexible non-linear modeling of the association of the numerical variables with the outcome. medplot is freely available and open source; it has an intuitive graphical user interface (GUI), is accessible via the Internet and can be used within a web browser, without the need to install and maintain programs locally on the user's computer. This paper describes the application and gives detailed examples describing how to use the application on real data from a clinical study including patients with early Lyme borreliosis.
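
    The multiple-testing correction mentioned above can be illustrated with a short, generic sketch; the p-values are invented and the correction methods shown are standard ones, not necessarily the exact procedure medplot applies.

```python
# Sketch: adjusting a set of per-variable p-values for multiple testing.
import numpy as np
from statsmodels.stats.multitest import multipletests

pvals = np.array([0.001, 0.008, 0.020, 0.041, 0.12, 0.34, 0.67])  # made-up p-values

for method in ("holm", "fdr_bh"):
    reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method=method)
    print(method, np.round(p_adj, 3), "significant:", reject.sum())
```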

  16. Analysis of longitudinal data from animals where some data are missing in SPSS

    PubMed Central

    Duricki, DA; Soleman, S; Moon, LDF

    2017-01-01

    Testing of therapies for disease or injury often involves analysis of longitudinal data from animals. Modern analytical methods have advantages over conventional methods (particularly where some data are missing) yet are not used widely by pre-clinical researchers. We provide here an easy-to-use protocol for analysing longitudinal data from animals and present a click-by-click guide for performing suitable analyses using the statistical package SPSS. We guide readers through analysis of a real-life data set obtained when testing a therapy for brain injury (stroke) in elderly rats. We show that repeated measures analysis of covariance failed to detect a treatment effect when a few data points were missing (due to animal drop-out) whereas analysis using an alternative method detected a beneficial effect of treatment; specifically, we demonstrate the superiority of linear models (with various covariance structures) analysed using Restricted Maximum Likelihood estimation (to include all available data). This protocol takes two hours to follow. PMID:27196723
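
    A minimal sketch of the recommended alternative, a linear mixed model fitted by restricted maximum likelihood, is shown below using statsmodels rather than SPSS; the data are simulated and the model specification is illustrative only, not the protocol's exact analysis.

```python
# Sketch: a linear mixed model fitted by REML on simulated longitudinal animal
# data, which keeps all available observations even when some are missing.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
animals, times = 20, 4
df = pd.DataFrame({
    "animal": np.repeat(np.arange(animals), times),
    "week": np.tile(np.arange(times), animals),
    "treated": np.repeat(rng.binomial(1, 0.5, animals), times),
})
animal_effect = np.repeat(rng.normal(0, 1, animals), times)
df["score"] = 10 + 1.5 * df["treated"] * df["week"] + animal_effect + rng.normal(0, 1, len(df))
df = df.drop(index=rng.choice(len(df), size=6, replace=False))   # mimic animal drop-out

model = smf.mixedlm("score ~ week * treated", df, groups=df["animal"])
result = model.fit(reml=True)        # restricted maximum likelihood estimation
print(result.summary())
```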

  17. SNPassoc: an R package to perform whole genome association studies.

    PubMed

    González, Juan R; Armengol, Lluís; Solé, Xavier; Guinó, Elisabet; Mercader, Josep M; Estivill, Xavier; Moreno, Víctor

    2007-03-01

    The popularization of large-scale genotyping projects has led to the widespread adoption of genetic association studies as the tool of choice in the search for single nucleotide polymorphisms (SNPs) underlying susceptibility to complex diseases. Although the analysis of individual SNPs is a relatively trivial task, when the number of SNPs is large and multiple genetic models need to be explored, a tool to automate the analyses becomes necessary. In order to address this issue, we developed SNPassoc, an R package to carry out the most common analyses in whole genome association studies. These analyses include descriptive statistics and exploratory analysis of missing values, calculation of Hardy-Weinberg equilibrium, analysis of association based on generalized linear models (either for quantitative or binary traits), and analysis of multiple SNPs (haplotype and epistasis analysis). Package SNPassoc is available at CRAN from http://cran.r-project.org. A tutorial is available on Bioinformatics online and at http://davinci.crg.es/estivill_lab/snpassoc.

  18. Discovering human germ cell mutagens with whole genome sequencing: Insights from power calculations reveal the importance of controlling for between-family variability.

    PubMed

    Webster, R J; Williams, A; Marchetti, F; Yauk, C L

    2018-07-01

    Mutations in germ cells pose potential genetic risks to offspring. However, de novo mutations are rare events that are spread across the genome and are difficult to detect. Thus, studies in this area have generally been under-powered, and no human germ cell mutagen has been identified. Whole Genome Sequencing (WGS) of human pedigrees has been proposed as an approach to overcome these technical and statistical challenges. WGS enables analysis of a much wider breadth of the genome than traditional approaches. Here, we performed power analyses to determine the feasibility of using WGS in human families to identify germ cell mutagens. Different statistical models were compared in the power analyses (ANOVA and multiple regression for one-child families, and mixed effect models sampling two to four siblings per family). Assumptions were made based on parameters from the existing literature, such as the mutation-by-paternal age effect. We explored two scenarios: a constant effect due to an exposure that occurred in the past, and an accumulating effect where the exposure is continuing. Our analysis revealed the importance of modeling inter-family variability of the mutation-by-paternal age effect. Statistical power was improved by models accounting for the family-to-family variability. Our power analyses suggest that sufficient statistical power can be attained with 4-28 four-sibling families per treatment group when the increase in mutations ranges from 40% down to 10%, respectively. Modeling family variability using mixed effect models provided a reduction in sample size compared to a multiple regression approach. Much larger sample sizes were required to detect an interaction effect between environmental exposures and paternal age. These findings inform study design and statistical modeling approaches to improve power and reduce sequencing costs for future studies in this area. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
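
    The flavour of such a power calculation can be conveyed with a simplified Monte Carlo sketch. For brevity it analyses family means with a t-test instead of the paper's mixed models, and all rates, variances and effect sizes are illustrative assumptions rather than the study's parameters.

```python
# Simplified Monte Carlo power sketch: Poisson mutation counts with
# between-family variability, four siblings per family, two treatment groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

def power(n_families, effect=1.4, base_rate=60.0, family_sd=0.15,
          sibs=4, n_sim=2000, alpha=0.05):
    hits = 0
    for _ in range(n_sim):
        # family-specific variability in the mutation rate, in both groups
        ctrl = rng.poisson(base_rate * np.exp(rng.normal(0, family_sd, (n_families, 1))),
                           size=(n_families, sibs)).mean(axis=1)
        trt = rng.poisson(base_rate * effect * np.exp(rng.normal(0, family_sd, (n_families, 1))),
                          size=(n_families, sibs)).mean(axis=1)
        if stats.ttest_ind(trt, ctrl).pvalue < alpha:
            hits += 1
    return hits / n_sim

print("power with 10 families/group, 40% increase:", power(10))
```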

  19. Image encryption based on a delayed fractional-order chaotic logistic system

    NASA Astrophysics Data System (ADS)

    Wang, Zhen; Huang, Xia; Li, Ning; Song, Xiao-Na

    2012-05-01

    A new image encryption scheme is proposed based on a delayed fractional-order chaotic logistic system. In the process of generating a key stream, the time-varying delay and fractional derivative are embedded in the proposed scheme to improve the security. Such a scheme is described in detail with security analyses including correlation analysis, information entropy analysis, run statistic analysis, mean-variance gray value analysis, and key sensitivity analysis. Experimental results show that the newly proposed image encryption scheme possesses high security.
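
    To convey the general idea only, the sketch below uses an ordinary (integer-order, non-delayed) logistic map to generate a keystream that is XORed with image bytes; the fractional derivative and time-varying delay of the proposed scheme are not reproduced, and all parameter values are arbitrary.

```python
# Toy sketch: logistic-map keystream XORed with image bytes (not the paper's scheme).
import numpy as np

def logistic_keystream(n_bytes, x0=0.3141592, r=3.99, burn_in=1000):
    x = x0
    for _ in range(burn_in):              # discard transient iterations
        x = r * x * (1 - x)
    stream = np.empty(n_bytes, dtype=np.uint8)
    for i in range(n_bytes):
        x = r * x * (1 - x)
        stream[i] = int(x * 256) % 256    # quantize chaotic state to a byte
    return stream

image = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)   # stand-in image
ks = logistic_keystream(image.size).reshape(image.shape)
cipher = image ^ ks                         # encrypt
assert np.array_equal(cipher ^ ks, image)   # decryption restores the image
```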

  20. A Tutorial in Bayesian Potential Outcomes Mediation Analysis.

    PubMed

    Miočević, Milica; Gonzalez, Oscar; Valente, Matthew J; MacKinnon, David P

    2018-01-01

    Statistical mediation analysis is used to investigate intermediate variables in the relation between independent and dependent variables. Causal interpretation of mediation analyses is challenging because randomization of subjects to levels of the independent variable does not rule out the possibility of unmeasured confounders of the mediator to outcome relation. Furthermore, commonly used frequentist methods for mediation analysis compute the probability of the data given the null hypothesis, which is not the probability of a hypothesis given the data as in Bayesian analysis. Under certain assumptions, applying the potential outcomes framework to mediation analysis allows for the computation of causal effects, and statistical mediation in the Bayesian framework gives indirect effects probabilistic interpretations. This tutorial combines causal inference and Bayesian methods for mediation analysis so the indirect and direct effects have both causal and probabilistic interpretations. Steps in Bayesian causal mediation analysis are shown in the application to an empirical example.

  1. Assessment of statistical methods used in library-based approaches to microbial source tracking.

    PubMed

    Ritter, Kerry J; Carruthers, Ethan; Carson, C Andrew; Ellender, R D; Harwood, Valerie J; Kingsley, Kyle; Nakatsu, Cindy; Sadowsky, Michael; Shear, Brian; West, Brian; Whitlock, John E; Wiggins, Bruce A; Wilbur, Jayson D

    2003-12-01

    Several commonly used statistical methods for fingerprint identification in microbial source tracking (MST) were examined to assess the effectiveness of pattern-matching algorithms to correctly identify sources. Although numerous statistical methods have been employed for source identification, no widespread consensus exists as to which is most appropriate. A large-scale comparison of several MST methods, using identical fecal sources, presented a unique opportunity to assess the utility of several popular statistical methods. These included discriminant analysis, nearest neighbour analysis, maximum similarity and average similarity, along with several measures of distance or similarity. Threshold criteria for excluding uncertain or poorly matched isolates from final analysis were also examined for their ability to reduce false positives and increase prediction success. Six independent libraries used in the study were constructed from indicator bacteria isolated from fecal materials of humans, seagulls, cows and dogs. Three of these libraries were constructed using the rep-PCR technique and three relied on antibiotic resistance analysis (ARA). Five of the libraries were constructed using Escherichia coli and one using Enterococcus spp. (ARA). Overall, the outcome of this study suggests a high degree of variability across statistical methods. Despite large differences in correct classification rates among the statistical methods, no single statistical approach emerged as superior. Thresholds failed to consistently increase rates of correct classification and improvement was often associated with substantial effective sample size reduction. Recommendations are provided to aid in selecting appropriate analyses for these types of data.
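
    Two of the pattern-matching approaches compared in the study, discriminant analysis and nearest-neighbour classification, can be sketched generically as follows; the fingerprint features and source labels are synthetic, not drawn from the rep-PCR or ARA libraries.

```python
# Sketch: correct-classification rates for two source-tracking classifiers
# on a made-up known-source library.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
sources = ["human", "gull", "cow", "dog"]
X = np.vstack([rng.normal(loc=i, scale=1.5, size=(50, 10)) for i in range(len(sources))])
y = np.repeat(sources, 50)                 # known-source library isolates

for name, clf in [("discriminant analysis", LinearDiscriminantAnalysis()),
                  ("nearest neighbour", KNeighborsClassifier(n_neighbors=3))]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: mean correct-classification rate = {acc:.2f}")
```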

  2. Reduction of Complications of Local Anaesthesia in Dental Healthcare Setups by Application of the Six Sigma Methodology: A Statistical Quality Improvement Technique

    PubMed Central

    Khatoon, Farheen

    2015-01-01

    Background Health care faces challenges due to complications, inefficiencies and other concerns that threaten the safety of patients. Aim The purpose of this study was to identify causes of complications encountered after administration of local anaesthesia for dental and oral surgical procedures and to reduce the incidence of complications by introducing the Six Sigma methodology. Materials and Methods The DMAIC (Define, Measure, Analyse, Improve and Control) process of Six Sigma was applied to reduce the incidence of complications encountered after administration of local anaesthesia injections for dental and oral surgical procedures, using failure mode and effect analysis. Pareto analysis was used to identify the most recurring complications. A paired z-test using Minitab Statistical Inference and Fisher’s exact test were used to analyse the obtained data statistically. A p-value <0.05 was considered significant. Results In total, 54 systemic and 62 local complications occurred during the three-month analyse and measure phases. Syncope, failure of anaesthesia, trismus, self-biting (auto mordeduras) and pain at the injection site were found to be the most recurring complications. The cumulative defective percentage was 7.99 for the pre-improvement data and decreased to 4.58 in the control phase. The estimate for the difference was 0.0341228 and the 95% lower bound for the difference was 0.0193966; the p-value was highly significant (reported as p=0.000). Conclusion The application of the Six Sigma improvement methodology in healthcare tends to deliver consistently better results for patients as well as hospitals, and leads to better patient compliance and satisfaction. PMID:26816989
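
    The kind of pre/post comparison described can be sketched as follows; the 2x2 counts below are invented for illustration and are not the study's reported figures.

```python
# Sketch: Fisher's exact test comparing complication proportions before and
# after the improvement phase (hypothetical counts).
from scipy.stats import fisher_exact

#                  complications, no complications
pre_improvement  = [116, 1335]
post_improvement = [64, 1333]

odds_ratio, p_value = fisher_exact([pre_improvement, post_improvement])
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4f}")
```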

  3. Expanding the enablement framework and testing an evaluative instrument for diabetes patient education.

    PubMed

    Leeseberg Stamler, L; Cole, M M; Patrick, L J

    2001-08-01

    Strategies to delay or prevent complications from diabetes include diabetes patient education. Diabetes educators seek to provide education that meets the needs of clients and influences positive health outcomes. The aims were (1) to expand prior research exploring an enablement framework for patient education by examining perceptions of patient education among persons with diabetes and (2) to test the mastery of stress instrument (MSI) as a potential evaluative instrument for patient education. The design was triangulated data collection with a convenience sample of adults taking diabetes education classes. Half the sample completed audio-taped semi-structured interviews before, during and after education, and all completed the MSI after education. Qualitative data were analysed using latent content analysis, and descriptive statistics were computed. Qualitative analysis revealed content categories similar to previous work with prenatal participants, supporting the enablement framework. Statistical analyses noted congruence with psychometric findings from the development of the MSI; secondary qualitative analyses revealed congruency between MSI scores and patient perceptions. Mastery is an outcome congruent with the enablement framework for patient education across content areas. The mastery of stress instrument may be an instrument for identifying patients who are coping well with diabetes self-management, as well as those who are not and who require further nursing interventions.

  4. FARVATX: FAmily-based Rare Variant Association Test for X-linked genes

    PubMed Central

    Choi, Sungkyoung; Lee, Sungyoung; Qiao, Dandi; Hardin, Megan; Cho, Michael H.; Silverman, Edwin K; Park, Taesung; Won, Sungho

    2016-01-01

    Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease (COPD). Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods. PMID:27325607

  5. FARVATX: Family-Based Rare Variant Association Test for X-Linked Genes.

    PubMed

    Choi, Sungkyoung; Lee, Sungyoung; Qiao, Dandi; Hardin, Megan; Cho, Michael H; Silverman, Edwin K; Park, Taesung; Won, Sungho

    2016-09-01

    Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease. Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods. © 2016 WILEY PERIODICALS, INC.

  6. Are conventional statistical techniques exhaustive for defining metal background concentrations in harbour sediments? A case study: The Coastal Area of Bari (Southeast Italy).

    PubMed

    Mali, Matilda; Dell'Anna, Maria Michela; Mastrorilli, Piero; Damiani, Leonardo; Ungaro, Nicola; Belviso, Claudia; Fiore, Saverio

    2015-11-01

    Sediment contamination by metals poses significant risks to coastal ecosystems and is considered to be problematic for dredging operations. The determination of the background values of metal and metalloid distribution based on site-specific variability is fundamental in assessing pollution levels in harbour sediments. The novelty of the present work consists of addressing the scope and limitation of analysing port sediments through the use of conventional statistical techniques (such as: linear regression analysis, construction of cumulative frequency curves and the iterative 2σ technique), that are commonly employed for assessing Regional Geochemical Background (RGB) values in coastal sediments. This study ascertained that although the tout court use of such techniques in determining the RGB values in harbour sediments seems appropriate (the chemical-physical parameters of port sediments fit well with statistical equations), it should nevertheless be avoided because it may be misleading and can mask key aspects of the study area that can only be revealed by further investigations, such as mineralogical and multivariate statistical analyses. Copyright © 2015 Elsevier Ltd. All rights reserved.
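
    One of the conventional techniques named above, the iterative 2σ method, can be sketched as follows; the concentration values and units are simulated and purely illustrative.

```python
# Sketch: iterative 2-sigma technique for estimating a geochemical background.
# Values outside mean ± 2*SD are removed repeatedly until all remaining values
# fall inside the interval, which is then taken as the background range.
import numpy as np

def iterative_two_sigma(values, k=2.0, max_iter=100):
    x = np.asarray(values, dtype=float)
    for _ in range(max_iter):
        mean, sd = x.mean(), x.std(ddof=1)
        keep = np.abs(x - mean) <= k * sd
        if keep.all():
            break
        x = x[keep]
    return x.mean(), x.std(ddof=1)

rng = np.random.default_rng(5)
background = rng.normal(30, 5, 200)         # e.g. metal concentration, mg/kg
contaminated = rng.normal(120, 20, 15)      # anomalous (polluted) samples
mean, sd = iterative_two_sigma(np.concatenate([background, contaminated]))
print(f"estimated background: {mean:.1f} +/- {2 * sd:.1f} mg/kg")
```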

  7. Analysis and interpretation of cost data in randomised controlled trials: review of published studies

    PubMed Central

    Barber, Julie A; Thompson, Simon G

    1998-01-01

    Objective To review critically the statistical methods used for health economic evaluations in randomised controlled trials where an estimate of cost is available for each patient in the study. Design Survey of published randomised trials including an economic evaluation with cost values suitable for statistical analysis; 45 such trials published in 1995 were identified from Medline. Main outcome measures The use of statistical methods for cost data was assessed in terms of the descriptive statistics reported, use of statistical inference, and whether the reported conclusions were justified. Results Although all 45 trials reviewed apparently had cost data for each patient, only 9 (20%) reported adequate measures of variability for these data and only 25 (56%) gave results of statistical tests or a measure of precision for the comparison of costs between the randomised groups. Only 16 (36%) of the articles gave conclusions which were justified on the basis of results presented in the paper. No paper reported sample size calculations for costs. Conclusions The analysis and interpretation of cost data from published trials reveal a lack of statistical awareness. Strong and potentially misleading conclusions about the relative costs of alternative therapies have often been reported in the absence of supporting statistical evidence. Improvements in the analysis and reporting of health economic assessments are urgently required. Health economic guidelines need to be revised to incorporate more detailed statistical advice. Key messages: Health economic evaluations required for important healthcare policy decisions are often carried out in randomised controlled trials. A review of such published economic evaluations assessed whether statistical methods for cost outcomes have been appropriately used and interpreted. Few publications presented adequate descriptive information for costs or performed appropriate statistical analyses. In at least two thirds of the papers, the main conclusions regarding costs were not justified. The analysis and reporting of health economic assessments within randomised controlled trials urgently need improving. PMID:9794854

  8. Methodological and Reporting Quality of Systematic Reviews and Meta-analyses in Endodontics.

    PubMed

    Nagendrababu, Venkateshbabu; Pulikkotil, Shaju Jacob; Sultan, Omer Sheriff; Jayaraman, Jayakumar; Peters, Ove A

    2018-06-01

    The aim of this systematic review (SR) was to evaluate the quality of SRs and meta-analyses (MAs) in endodontics. A comprehensive literature search was conducted to identify relevant articles in the electronic databases from January 2000 to June 2017. Two reviewers independently assessed the articles for eligibility and data extraction. SRs and MAs on interventional studies with a minimum of 2 therapeutic strategies in endodontics were included in this SR. Methodologic and reporting quality were assessed using A Measurement Tool to Assess Systematic Reviews (AMSTAR) and Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA), respectively. The interobserver reliability was calculated using the Cohen kappa statistic. Statistical analysis with the level of significance at P < .05 was performed using Kruskal-Wallis tests and simple linear regression analysis. A total of 30 articles were selected for the current SR. Using AMSTAR, the item on using the scientific quality of the included studies in formulating conclusions was adhered to by less than 40% of studies. Using PRISMA, 3 items were reported by less than 40% of studies, which were on objectives, protocol registration, and funding. No association with quality was evident for the number of authors or country of origin. Statistical significance was observed when quality was compared among journals, with studies published as Cochrane reviews superior to those published in other journals. AMSTAR and PRISMA scores were significantly related. SRs in endodontics showed variability in both methodologic and reporting quality. Copyright © 2018 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.

  9. [Clinical research XXIII. From clinical judgment to meta-analyses].

    PubMed

    Rivas-Ruiz, Rodolfo; Castelán-Martínez, Osvaldo D; Pérez-Rodríguez, Marcela; Palacios-Cruz, Lino; Noyola-Castillo, Maura E; Talavera, Juan O

    2014-01-01

    Systematic reviews (SR) are studies conducted to answer clinical questions on the basis of original articles. Meta-analysis (MTA) is the mathematical analysis of an SR. These analyses are divided into two groups: those which evaluate the measured results of quantitative variables (for example, the body mass index, BMI) and those which evaluate qualitative variables (for example, whether a patient is alive or dead, or cured or not). Quantitative variables are generally analysed with mean differences, and qualitative variables can be analysed using several measures: odds ratio (OR), relative risk (RR), absolute risk reduction (ARR) and hazard ratio (HR). These analyses are represented through forest plots, which allow the evaluation of each individual study, as well as the heterogeneity between studies and the overall effect of the intervention. These analyses are mainly based on Student's t test and chi-squared. To make appropriate decisions based on the MTA, it is important to understand the characteristics of the statistical methods in order to avoid misinterpretations.
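
    The pooling calculation summarized in a forest plot can be illustrated with a small fixed-effect (inverse-variance) combination of odds ratios; the 2x2 counts below are invented for the example.

```python
# Sketch: fixed-effect (inverse-variance) meta-analysis of odds ratios.
import numpy as np

# each study: (a, b, c, d) = (treated events, treated non-events,
#                             control events, control non-events)
studies = [
    (12, 88, 20, 80),
    (30, 170, 45, 155),
    (8, 42, 15, 35),
]

log_or = np.array([np.log((a * d) / (b * c)) for a, b, c, d in studies])
se = np.array([np.sqrt(1/a + 1/b + 1/c + 1/d) for a, b, c, d in studies])
w = 1 / se**2                                   # inverse-variance weights

pooled = np.sum(w * log_or) / np.sum(w)
pooled_se = np.sqrt(1 / np.sum(w))
ci = np.exp([pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se])
print(f"pooled OR = {np.exp(pooled):.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f})")
```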

  10. Differentiation of chocolates according to the cocoa's geographical origin using chemometrics.

    PubMed

    Cambrai, Amandine; Marcic, Christophe; Morville, Stéphane; Sae Houer, Pierre; Bindler, Françoise; Marchioni, Eric

    2010-02-10

    The determination of the geographical origin of cocoa used to produce chocolate has been assessed through the analysis of the volatile compounds of chocolate samples. Analysis of the volatile content and its statistical processing by multivariate analyses tended to form independent groups for both Africa and Madagascar, even if some of the chocolate samples analyzed appeared in a mixed zone together with those from America. This analysis also allowed a clear separation between Caribbean chocolates and those from other origins. Eight compounds (such as linalool or (E,E)-2,4-decadienal) characteristic of the chocolates' different geographical origins were also identified. The method described in this work (hydrodistillation, GC analysis, and statistical treatment) may improve the control of the geographical origin of chocolate during its long production process.

  11. Predictors of persistent pain after total knee arthroplasty: a systematic review and meta-analysis.

    PubMed

    Lewis, G N; Rice, D A; McNair, P J; Kluger, M

    2015-04-01

    Several studies have identified clinical, psychosocial, patient characteristic, and perioperative variables that are associated with persistent postsurgical pain; however, the relative effect of these variables has yet to be quantified. The aim of the study was to provide a systematic review and meta-analysis of predictor variables associated with persistent pain after total knee arthroplasty (TKA). Included studies were required to measure predictor variables prior to or at the time of surgery, include a pain outcome measure at least 3 months post-TKA, and include a statistical analysis of the effect of the predictor variable(s) on the outcome measure. Counts were undertaken of the number of times each predictor was analysed and the number of times it was found to have a significant relationship with persistent pain. Separate meta-analyses were performed to determine the effect size of each predictor on persistent pain. Outcomes from studies implementing uni- and multivariable statistical models were analysed separately. Thirty-two studies involving almost 30 000 patients were included in the review. Preoperative pain was the predictor that most commonly demonstrated a significant relationship with persistent pain across uni- and multivariable analyses. In the meta-analyses of data from univariate models, the largest effect sizes were found for: other pain sites, catastrophizing, and depression. For data from multivariate models, significant effects were evident for: catastrophizing, preoperative pain, mental health, and comorbidities. Catastrophizing, mental health, preoperative knee pain, and pain at other sites are the strongest independent predictors of persistent pain after TKA. © The Author 2014. Published by Oxford University Press on behalf of the British Journal of Anaesthesia. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  12. GSuite HyperBrowser: integrative analysis of dataset collections across the genome and epigenome.

    PubMed

    Simovski, Boris; Vodák, Daniel; Gundersen, Sveinung; Domanska, Diana; Azab, Abdulrahman; Holden, Lars; Holden, Marit; Grytten, Ivar; Rand, Knut; Drabløs, Finn; Johansen, Morten; Mora, Antonio; Lund-Andersen, Christin; Fromm, Bastian; Eskeland, Ragnhild; Gabrielsen, Odd Stokke; Ferkingstad, Egil; Nakken, Sigve; Bengtsen, Mads; Nederbragt, Alexander Johan; Thorarensen, Hildur Sif; Akse, Johannes Andreas; Glad, Ingrid; Hovig, Eivind; Sandve, Geir Kjetil

    2017-07-01

    Recent large-scale undertakings such as ENCODE and Roadmap Epigenomics have generated experimental data mapped to the human reference genome (as genomic tracks) representing a variety of functional elements across a large number of cell types. Despite the high potential value of these publicly available data for a broad variety of investigations, little attention has been given to the analytical methodology necessary for their widespread utilisation. We here present a first principled treatment of the analysis of collections of genomic tracks. We have developed novel computational and statistical methodology to permit comparative and confirmatory analyses across multiple and disparate data sources. We delineate a set of generic questions that are useful across a broad range of investigations and discuss the implications of choosing different statistical measures and null models. Examples include contrasting analyses across different tissues or diseases. The methodology has been implemented in a comprehensive open-source software system, the GSuite HyperBrowser. To make the functionality accessible to biologists, and to facilitate reproducible analysis, we have also developed a web-based interface providing an expertly guided and customizable way of utilizing the methodology. With this system, many novel biological questions can flexibly be posed and rapidly answered. Through a combination of streamlined data acquisition, interoperable representation of dataset collections, and customizable statistical analysis with guided setup and interpretation, the GSuite HyperBrowser represents a first comprehensive solution for integrative analysis of track collections across the genome and epigenome. The software is available at: https://hyperbrowser.uio.no. © The Author 2017. Published by Oxford University Press.

  13. Stroke Treatment Academic Industry Roundtable Recommendations for Individual Data Pooling Analyses in Stroke.

    PubMed

    Lees, Kennedy R; Khatri, Pooja

    2016-08-01

    Pooled analysis of individual patient data from stroke trials can deliver more precise estimates of treatment effect, enhance power to examine prespecified subgroups, and facilitate exploration of treatment-modifying influences. Analysis plans should be declared, and preferably published, before trial results are known. For pooling trials that used diverse analytic approaches, an ordinal analysis is favored, with justification for considering deaths and severe disability jointly. Because trial pooling is an incremental process, analyses should follow a sequential approach, with statistical adjustment for iterations. Updated analyses should be published when revised conclusions have a clinical implication. However, caution is recommended in declaring pooled findings that may prejudice ongoing trials, unless clinical implications are compelling. All contributing trial teams should contribute to leadership, data verification, and authorship of pooled analyses. Development work is needed to enable reliable inferences to be drawn about individual drug or device effects that contribute to a pooled analysis, versus a class effect, if the treatment strategy combines ≥2 such drugs or devices. Despite the practical challenges, pooled analyses are powerful and essential tools in interpreting clinical trial findings and advancing clinical care. © 2016 American Heart Association, Inc.

  14. Systematic review and meta-analysis in cardiac surgery: a primer.

    PubMed

    Yanagawa, Bobby; Tam, Derrick Y; Mazine, Amine; Tricco, Andrea C

    2018-03-01

    The purpose of this article is to review the strengths and weaknesses of systematic reviews and meta-analyses to inform our current understanding of cardiac surgery. A systematic review and meta-analysis of a focused topic can provide a quantitative estimate for the effect of a treatment intervention or exposure. In cardiac surgery, observational studies and small, single-center prospective trials provide most of the clinical outcomes that form the evidence base for patient management and guideline recommendations. As such, meta-analyses can be particularly valuable in synthesizing the literature for a particular focused surgical question. Since the year 2000, there are over 800 meta-analysis-related publications in our field. There are some limitations to this technique, including clinical, methodological and statistical heterogeneity, among other challenges. Despite these caveats, results of meta-analyses have been useful in forming treatment recommendations or in providing guidance in the design of future clinical trials. There is a growing number of meta-analyses in the field of cardiac surgery. Knowledge translation via meta-analyses will continue to guide and inform cardiac surgical practice and our practice guidelines.

  15. Modeling and replicating statistical topology and evidence for CMB nonhomogeneity

    PubMed Central

    Agami, Sarit

    2017-01-01

    Under the banner of “big data,” the detection and classification of structure in extremely large, high-dimensional, data sets are two of the central statistical challenges of our times. Among the most intriguing new approaches to this challenge is “TDA,” or “topological data analysis,” one of the primary aims of which is providing nonmetric, but topologically informative, preanalyses of data which make later, more quantitative, analyses feasible. While TDA rests on strong mathematical foundations from topology, in applications, it has faced challenges due to difficulties in handling issues of statistical reliability and robustness, often leading to an inability to make scientific claims with verifiable levels of statistical confidence. We propose a methodology for the parametric representation, estimation, and replication of persistence diagrams, the main diagnostic tool of TDA. The power of the methodology lies in the fact that even if only one persistence diagram is available for analysis—the typical case for big data applications—the replications permit conventional statistical hypothesis testing. The methodology is conceptually simple and computationally practical, and provides a broadly effective statistical framework for persistence diagram TDA analysis. We demonstrate the basic ideas on a toy example, and the power of the parametric approach to TDA modeling in an analysis of cosmic microwave background (CMB) nonhomogeneity. PMID:29078301

  16. Sources of Safety Data and Statistical Strategies for Design and Analysis: Postmarket Surveillance.

    PubMed

    Izem, Rima; Sanchez-Kam, Matilde; Ma, Haijun; Zink, Richard; Zhao, Yueqin

    2018-03-01

    Safety data are continuously evaluated throughout the life cycle of a medical product to accurately assess and characterize the risks associated with the product. The knowledge about a medical product's safety profile continually evolves as safety data accumulate. This paper discusses data sources and analysis considerations for safety signal detection after a medical product is approved for marketing. This manuscript is the second in a series of papers from the American Statistical Association Biopharmaceutical Section Safety Working Group. We share our recommendations for the statistical and graphical methodologies necessary to appropriately analyze, report, and interpret safety outcomes, and we discuss the advantages and disadvantages of safety data obtained from passive postmarketing surveillance systems compared to other sources. Signal detection has traditionally relied on spontaneous reporting databases that have been available worldwide for decades. However, current regulatory guidelines and ease of reporting have increased the size of these databases exponentially over the last few years. With such large databases, data-mining tools using disproportionality analysis and helpful graphics are often used to detect potential signals. Although the data sources have many limitations, analyses of these data have been successful at identifying safety signals postmarketing. Experience analyzing these dynamic data is useful in understanding the potential and limitations of analyses with new data sources such as social media, claims, or electronic medical records data.
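
    A basic disproportionality statistic of the kind used with spontaneous reporting databases, the proportional reporting ratio (PRR), can be sketched as follows; the report counts are made up for illustration.

```python
# Sketch: proportional reporting ratio (PRR) from a spontaneous-report 2x2 table.
import numpy as np

a = 40      # reports: drug of interest AND event of interest
b = 960     # reports: drug of interest, other events
c = 200     # reports: all other drugs, event of interest
d = 38800   # reports: all other drugs, other events

prr = (a / (a + b)) / (c / (c + d))
se_log_prr = np.sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))
ci = np.exp(np.log(prr) + np.array([-1.96, 1.96]) * se_log_prr)
print(f"PRR = {prr:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f})")
```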

  17. On the structure and phase transitions of power-law Poissonian ensembles

    NASA Astrophysics Data System (ADS)

    Eliazar, Iddo; Oshanin, Gleb

    2012-10-01

    Power-law Poissonian ensembles are Poisson processes that are defined on the positive half-line, and that are governed by power-law intensities. Power-law Poissonian ensembles are stochastic objects of fundamental significance; they uniquely display an array of fractal features and they uniquely generate a span of important applications. In this paper we apply three different methods—oligarchic analysis, Lorenzian analysis and heterogeneity analysis—to explore power-law Poissonian ensembles. The amalgamation of these analyses, combined with the topology of power-law Poissonian ensembles, establishes a detailed and multi-faceted picture of the statistical structure and the statistical phase transitions of these elemental ensembles.

  18. Sister chromatid exchanges and micronuclei analysis in lymphocytes of men exposed to simazine through drinking water.

    PubMed

    Suárez, Susanna; Rubio, Arantxa; Sueiro, Rosa Ana; Garrido, Joaquín

    2003-06-06

    In some cities of the autonomous community of Extremadura (south-west of Spain), levels of simazine from 10 to 30 ppm were detected in tap water. To analyse the possible effect of this herbicide, two biomarkers, sister chromatid exchanges (SCE) and micronuclei (MN), were used in peripheral blood lymphocytes from males exposed to simazine through drinking water. SCE and MN analysis failed to detect any statistically significant increase in the people exposed to simazine when compared with the controls. With respect to high frequency cells (HFC), a statistically significant difference was detected between exposed and control groups.

  19. MWASTools: an R/bioconductor package for metabolome-wide association studies.

    PubMed

    Rodriguez-Martinez, Andrea; Posma, Joram M; Ayala, Rafael; Neves, Ana L; Anwar, Maryam; Petretto, Enrico; Emanueli, Costanza; Gauguier, Dominique; Nicholson, Jeremy K; Dumas, Marc-Emmanuel

    2018-03-01

    MWASTools is an R package designed to provide an integrated pipeline to analyse metabonomic data in large-scale epidemiological studies. Key functionalities of our package include: quality control analysis; metabolome-wide association analysis using various models (partial correlations, generalized linear models); visualization of statistical outcomes; metabolite assignment using statistical total correlation spectroscopy (STOCSY); and biological interpretation of metabolome-wide association study results. The MWASTools R package is implemented in R (version >= 3.4) and is available from Bioconductor: https://bioconductor.org/packages/MWASTools/. m.dumas@imperial.ac.uk. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  20. Students' attitudes towards learning statistics

    NASA Astrophysics Data System (ADS)

    Ghulami, Hassan Rahnaward; Hamid, Mohd Rashid Ab; Zakaria, Roslinazairimah

    2015-05-01

    A positive attitude towards learning is vital in order to master the core content of the subject matter under study. Learning a statistics course, especially at the university level, is no exception. Therefore, this study investigates students' attitudes towards learning statistics. Six variables or constructs were identified: affect, cognitive competence, value, difficulty, interest, and effort. The instrument used for the study was a questionnaire adopted and adapted from the reliable Survey of Attitudes towards Statistics (SATS©). The study was conducted with engineering undergraduate students at a university on the East Coast of Malaysia. The respondents were students from different faculties who were taking the applied statistics course. The results are analysed descriptively and contribute to a descriptive understanding of students' attitudes towards the teaching and learning of statistics.

  1. Use of Multivariate Linkage Analysis for Dissection of a Complex Cognitive Trait

    PubMed Central

    Marlow, Angela J.; Fisher, Simon E.; Francks, Clyde; MacPhie, I. Laurence; Cherny, Stacey S.; Richardson, Alex J.; Talcott, Joel B.; Stein, John F.; Monaco, Anthony P.; Cardon, Lon R.

    2003-01-01

    Replication of linkage results for complex traits has been exceedingly difficult, owing in part to the inability to measure the precise underlying phenotype, small sample sizes, genetic heterogeneity, and statistical methods employed in analysis. Often, in any particular study, multiple correlated traits have been collected, yet these have been analyzed independently or, at most, in bivariate analyses. Theoretical arguments suggest that full multivariate analysis of all available traits should offer more power to detect linkage; however, this has not yet been evaluated on a genomewide scale. Here, we conduct multivariate genomewide analyses of quantitative-trait loci that influence reading- and language-related measures in families affected with developmental dyslexia. The results of these analyses are substantially clearer than those of previous univariate analyses of the same data set, helping to resolve a number of key issues. These outcomes highlight the relevance of multivariate analysis for complex disorders for dissection of linkage results in correlated traits. The approach employed here may aid positional cloning of susceptibility genes in a wide spectrum of complex traits. PMID:12587094

  2. Surgical adverse outcome reporting as part of routine clinical care.

    PubMed

    Kievit, J; Krukerink, M; Marang-van de Mheen, P J

    2010-12-01

    In The Netherlands, health professionals have created a doctor-driven standardised system to report and analyse adverse outcomes (AO). The aim is to improve healthcare by learning from past experiences. The key elements of this system are (1) an unequivocal definition of an adverse outcome, (2) appropriate contextual information and (3) a three-dimensional hierarchical classification system. First, to assess whether routine doctor-driven AO reporting is feasible. Second, to investigate how doctors can learn from AO reporting and analysis to improve the quality of care. Feasibility was assessed by how well doctors reported AO in the surgical department of a Dutch university hospital over a period of 9 years. AO incidence was analysed per patient subgroup and over time, in a time-trend analysis of three equal 3-year periods. AO were analysed case by case and statistically, to learn lessons from past events. In 19,907 surgical admissions, 9189 AOs were reported: one or more AO in 18.2% of admissions. On average, 55 lessons were learnt each year (in 4.3% of AO). More AO were reported in P3 than P1 (OR 1.39 (1.23-1.57)). Although minor AO increased, fatal AO decreased over time (OR 0.59 (0.45-0.77)). Doctor-driven AO reporting is shown to be feasible. Lessons can be learnt from case-by-case analyses of individual AO, as well as by statistical analysis of AO groups and subgroups (illustrated by time-trend analysis), thus contributing to the improvement of the quality of care. Moreover, by standardising AO reporting, data can be compared across departments or hospitals, to generate (confidential) mirror information for professionals cooperating in a peer-review setting.

  3. Surgical adverse outcome reporting as part of routine clinical care

    PubMed Central

    Krukerink, M; Marang-van de Mheen, P J

    2010-01-01

    Background In The Netherlands, health professionals have created a doctor-driven standardised system to report and analyse adverse outcomes (AO). The aim is to improve healthcare by learning from past experiences. The key elements of this system are (1) an unequivocal definition of an adverse outcome, (2) appropriate contextual information and (3) a three-dimensional hierarchical classification system. Objectives First, to assess whether routine doctor-driven AO reporting is feasible. Second, to investigate how doctors can learn from AO reporting and analysis to improve the quality of care. Methods Feasibility was assessed by how well doctors reported AO in the surgical department of a Dutch university hospital over a period of 9 years. AO incidence was analysed per patient subgroup and over time, in a time-trend analysis of three equal 3-year periods. AO were analysed case by case and statistically, to learn lessons from past events. Results In 19 907 surgical admissions, 9189 AOs were reported: one or more AO in 18.2% of admissions. On average, 55 lessons were learnt each year (in 4.3% of AO). More AO were reported in P3 than P1 (OR 1.39 (1.23–1.57)). Although minor AO increased, fatal AO decreased over time (OR 0.59 (0.45–0.77)). Conclusions Doctor-driven AO reporting is shown to be feasible. Lessons can be learnt from case-by-case analyses of individual AO, as well as by statistical analysis of AO groups and subgroups (illustrated by time-trend analysis), thus contributing to the improvement of the quality of care. Moreover, by standardising AO reporting, data can be compared across departments or hospitals, to generate (confidential) mirror information for professionals cooperating in a peer-review setting. PMID:20430928

  4. Statistical Learning Analysis in Neuroscience: Aiming for Transparency

    PubMed Central

    Hanke, Michael; Halchenko, Yaroslav O.; Haxby, James V.; Pollmann, Stefan

    2009-01-01

    Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods, neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires “neuroscience-aware” technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here, we review its features and applicability to various neural data modalities. PMID:20582270

  5. Performing statistical analyses on quantitative data in Taverna workflows: an example using R and maxdBrowse to identify differentially-expressed genes from microarray data.

    PubMed

    Li, Peter; Castrillo, Juan I; Velarde, Giles; Wassink, Ingo; Soiland-Reyes, Stian; Owen, Stuart; Withers, David; Oinn, Tom; Pocock, Matthew R; Goble, Carole A; Oliver, Stephen G; Kell, Douglas B

    2008-08-07

    There has been a dramatic increase in the amount of quantitative data derived from the measurement of changes at different levels of biological complexity during the post-genomic era. However, there are a number of issues associated with the use of computational tools employed for the analysis of such data. For example, computational tools such as R and MATLAB require prior knowledge of their programming languages in order to implement statistical analyses on data. Combining two or more tools in an analysis may also be problematic since data may have to be manually copied and pasted between separate user interfaces for each tool. Furthermore, this transfer of data may require a reconciliation step in order for there to be interoperability between computational tools. Developments in the Taverna workflow system have enabled pipelines to be constructed and enacted for generic and ad hoc analyses of quantitative data. Here, we present an example of such a workflow involving the statistical identification of differentially-expressed genes from microarray data followed by the annotation of their relationships to cellular processes. This workflow makes use of customised maxdBrowse web services, a system that allows Taverna to query and retrieve gene expression data from the maxdLoad2 microarray database. These data are then analysed by R to identify differentially-expressed genes using the Taverna RShell processor which has been developed for invoking this tool when it has been deployed as a service using the RServe library. In addition, the workflow uses Beanshell scripts to reconcile mismatches of data between services as well as to implement a form of user interaction for selecting subsets of microarray data for analysis as part of the workflow execution. A new plugin system in the Taverna software architecture is demonstrated by the use of renderers for displaying PDF files and CSV formatted data within the Taverna workbench. Taverna can be used by data analysis experts as a generic tool for composing ad hoc analyses of quantitative data by combining the use of scripts written in the R programming language with tools exposed as services in workflows. When these workflows are shared with colleagues and the wider scientific community, they provide an approach for other scientists wanting to use tools such as R without having to learn the corresponding programming language to analyse their own data.

  6. Performing statistical analyses on quantitative data in Taverna workflows: An example using R and maxdBrowse to identify differentially-expressed genes from microarray data

    PubMed Central

    Li, Peter; Castrillo, Juan I; Velarde, Giles; Wassink, Ingo; Soiland-Reyes, Stian; Owen, Stuart; Withers, David; Oinn, Tom; Pocock, Matthew R; Goble, Carole A; Oliver, Stephen G; Kell, Douglas B

    2008-01-01

    Background There has been a dramatic increase in the amount of quantitative data derived from the measurement of changes at different levels of biological complexity during the post-genomic era. However, there are a number of issues associated with the use of computational tools employed for the analysis of such data. For example, computational tools such as R and MATLAB require prior knowledge of their programming languages in order to implement statistical analyses on data. Combining two or more tools in an analysis may also be problematic since data may have to be manually copied and pasted between separate user interfaces for each tool. Furthermore, this transfer of data may require a reconciliation step in order for there to be interoperability between computational tools. Results Developments in the Taverna workflow system have enabled pipelines to be constructed and enacted for generic and ad hoc analyses of quantitative data. Here, we present an example of such a workflow involving the statistical identification of differentially-expressed genes from microarray data followed by the annotation of their relationships to cellular processes. This workflow makes use of customised maxdBrowse web services, a system that allows Taverna to query and retrieve gene expression data from the maxdLoad2 microarray database. These data are then analysed by R to identify differentially-expressed genes using the Taverna RShell processor which has been developed for invoking this tool when it has been deployed as a service using the RServe library. In addition, the workflow uses Beanshell scripts to reconcile mismatches of data between services as well as to implement a form of user interaction for selecting subsets of microarray data for analysis as part of the workflow execution. A new plugin system in the Taverna software architecture is demonstrated by the use of renderers for displaying PDF files and CSV formatted data within the Taverna workbench. Conclusion Taverna can be used by data analysis experts as a generic tool for composing ad hoc analyses of quantitative data by combining the use of scripts written in the R programming language with tools exposed as services in workflows. When these workflows are shared with colleagues and the wider scientific community, they provide an approach for other scientists wanting to use tools such as R without having to learn the corresponding programming language to analyse their own data. PMID:18687127

  7. dartr: An r package to facilitate analysis of SNP data generated from reduced representation genome sequencing.

    PubMed

    Gruber, Bernd; Unmack, Peter J; Berry, Oliver F; Georges, Arthur

    2018-05-01

    Although vast technological advances have been made and genetic software packages are growing in number, it is not a trivial task to analyse SNP data. We announce a new r package, dartr, enabling the analysis of single nucleotide polymorphism data for population genomic and phylogenomic applications. dartr provides user-friendly functions for data quality control and marker selection, and permits rigorous evaluations of conformation to Hardy-Weinberg equilibrium, gametic-phase disequilibrium and neutrality. The package reports standard descriptive statistics, permits exploration of patterns in the data through principal components analysis and conducts standard F-statistics, as well as basic phylogenetic analyses, population assignment and isolation-by-distance analyses, and exports data to a variety of commonly used downstream applications (e.g., newhybrids, faststructure and phylogeny applications) outside of the r environment. The package serves two main purposes: first, to provide a user-friendly approach that lowers the hurdle of analysing such data; to that end, the package comes with a detailed tutorial targeted at the r beginner to allow data analysis without requiring deep knowledge of r. Second, we use a single, well-established format (genlight from the adegenet package) as input for all our functions to avoid data reformatting. By strictly using the genlight format, we hope to facilitate this format as the de facto standard of future software developments and hence reduce the format jungle of genetic data sets. The dartr package is available via the r CRAN network and GitHub. © 2017 John Wiley & Sons Ltd.

  8. METHODS OF DEALING WITH VALUES BELOW THE LIMIT OF DETECTION USING SAS

    EPA Science Inventory

    Due to limitations of chemical analysis procedures, small concentrations cannot be precisely measured. These concentrations are said to be below the limit of detection (LOD). In statistical analyses, these values are often censored and substituted with a constant value, such ...
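
    A generic illustration of the constant-substitution conventions commonly used for such censored values (for example LOD/2 or LOD/sqrt(2)) is sketched below in Python; this is only a toy example, not the SAS approach the record refers to.

```python
# Sketch: substituting a constant for concentrations below the limit of detection.
import numpy as np

lod = 0.5
measured = np.array([0.82, np.nan, 1.10, np.nan, 0.63, 2.40, np.nan])  # NaN = below LOD

for label, sub in [("LOD/2", lod / 2), ("LOD/sqrt(2)", lod / np.sqrt(2))]:
    filled = np.where(np.isnan(measured), sub, measured)
    print(f"{label:12s} mean = {filled.mean():.3f}")
```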

  9. Fatality Reduction by Air Bags: Analyses of Accident Data through Early 1996

    DOT National Transportation Integrated Search

    1996-08-01

    The fatality risk of front-seat occupants of passenger cars and light trucks equipped with air bags is compared to the corresponding risk in similar vehicles without air bags, based on statistical analysis of Fatal Accident Reporting System (FARS)dat...

  10. Analyzing Mixed-Dyadic Data Using Structural Equation Models

    ERIC Educational Resources Information Center

    Peugh, James L.; DiLillo, David; Panuzio, Jillian

    2013-01-01

    Mixed-dyadic data, collected from distinguishable (nonexchangeable) or indistinguishable (exchangeable) dyads, require statistical analysis techniques that model the variation within dyads and between dyads appropriately. The purpose of this article is to provide a tutorial for performing structural equation modeling analyses of cross-sectional…

  11. Taxonomic evaluation of Streptomyces hirsutus and related species using multi-locus sequence analysis

    USDA-ARS?s Scientific Manuscript database

    Phylogenetic analyses of species of Streptomyces based on 16S rRNA gene sequences resulted in a statistically well-supported clade (100% bootstrap value) containing 8 species having very similar gross morphology. These species, including Streptomyces bambergiensis, Streptomyces chlorus, Streptomyces...

  12. Geographically Sourcing Cocaine’s Origin – Delineation of the Nineteen Major Coca Growing Regions in South America

    PubMed Central

    Mallette, Jennifer R.; Casale, John F.; Jordan, James; Morello, David R.; Beyer, Paul M.

    2016-01-01

    Previously, geo-sourcing to five major coca growing regions within South America was accomplished. However, the expansion of coca cultivation throughout South America made sub-regional origin determinations increasingly difficult. The former methodology was recently enhanced with additional stable isotope analyses (²H and ¹⁸O) to fully characterize cocaine due to the varying environmental conditions in which the coca was grown. An improved data analysis method was implemented with the combination of machine learning and multivariate statistical analysis methods to provide further partitioning between growing regions. Here, we show how the combination of trace cocaine alkaloids, stable isotopes, and multivariate statistical analyses can be used to classify illicit cocaine as originating from one of 19 growing regions within South America. The data obtained through this approach can be used to describe current coca cultivation and production trends, highlight trafficking routes, as well as identify new coca growing regions. PMID:27006288
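
    As a rough sketch of the kind of multivariate classification described here, the fragment below trains a classifier to assign samples to one of 19 regions. The features, their number, and the synthetic data are hypothetical stand-ins for the measured alkaloid and isotope profiles; the published method is not reproduced.

```python
# Illustrative sketch only: a multivariate classifier of growing region from
# trace-alkaloid and stable-isotope features. Feature names, the number of
# features, and the synthetic data are hypothetical; the published work combines
# machine learning with multivariate statistics on measured cocaine profiles.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n_regions, n_per_region, n_features = 19, 40, 6   # e.g. alkaloid ratios plus two isotope values
X = np.vstack([rng.normal(loc=r, scale=2.0, size=(n_per_region, n_features))
               for r in range(n_regions)])
y = np.repeat(np.arange(n_regions), n_per_region)

clf = RandomForestClassifier(n_estimators=300, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```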

  13. The Ontology of Biological and Clinical Statistics (OBCS) for standardized and reproducible statistical analysis.

    PubMed

    Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria; Lin, Yu; Hero, Alfred; Smith, Barry; He, Yongqun

    2016-09-14

    Statistics play a critical role in biological and clinical research. However, most reports of scientific results in the published literature make it difficult for the reader to reproduce the statistical analyses performed in achieving those results because they provide inadequate documentation of the statistical tests and algorithms applied. The Ontology of Biological and Clinical Statistics (OBCS) is put forward here as a step towards solving this problem. The terms in OBCS, including 'data collection', 'data transformation in statistics', 'data visualization', 'statistical data analysis', and 'drawing a conclusion based on data', cover the major types of statistical processes used in basic biological research and clinical outcome studies. OBCS is aligned with the Basic Formal Ontology (BFO) and extends the Ontology of Biomedical Investigations (OBI), an OBO (Open Biological and Biomedical Ontologies) Foundry ontology supported by over 20 research communities. Currently, OBCS comprises 878 terms, representing 20 BFO classes, 403 OBI classes, 229 OBCS specific classes, and 122 classes imported from ten other OBO ontologies. We discuss two examples illustrating how the ontology is being applied. In the first (biological) use case, we describe how OBCS was applied to represent the high throughput microarray data analysis of immunological transcriptional profiles in human subjects vaccinated with an influenza vaccine. In the second (clinical outcomes) use case, we applied OBCS to represent the processing of electronic health care data to determine the associations between hospital staffing levels and patient mortality. Our case studies were designed to show how OBCS can be used for the consistent representation of statistical analysis pipelines under two different research paradigms. Other ongoing projects using OBCS for statistical data processing are also discussed. The OBCS source code and documentation are available at: https://github.com/obcs/obcs . The Ontology of Biological and Clinical Statistics (OBCS) is a community-based open source ontology in the domain of biological and clinical statistics. OBCS is a timely ontology that represents statistics-related terms and their relations in a rigorous fashion, facilitates standard data analysis and integration, and supports reproducible biological and clinical research.

  14. Leasing Into the Sun: A Mixed Method Analysis of Transactions of Homes with Third Party Owned Solar

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hoen, Ben; Rand, Joseph; Adomatis, Sandra

    This analysis is the first to examine if homes with third-party owned (TPO) PV systems are unique in the marketplace as compared to non-PV or non-TPO PV homes. This is of growing importance as the number of homes with TPO systems is nearly half a million in the US currently and is growing. A hedonic pricing model analysis of 20,106 homes that sold in California between 2011 and 2013 is conducted, as well as a paired sales analysis of 18 pairs of TPO PV and non-PV homes in San Diego spanning 2012 and 2013. The hedonic model examined 2,914 non-TPO PV home sales and 113 TPO PV sales and fails to uncover statistically significant premiums either for TPO PV homes or for those with pre-paid leases as compared to non-PV homes. Similarly, the paired sales analysis does not find evidence of an impact to value for the TPO homes when comparing to non-PV homes. Analyses of non-TPO PV sales both here and previously have found larger and statistically significant premiums. Collection of a larger dataset that covers the present period is recommended for future analyses so that smaller, more nuanced and recent effects can be discovered.
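
    A minimal hedonic-model sketch, under assumed synthetic data, illustrates the modelling idea: regress log sale price on an indicator for a third-party-owned PV system plus home characteristics and inspect whether the indicator's coefficient is distinguishable from zero. Variable names and magnitudes are hypothetical.

```python
# A minimal hedonic-model sketch under assumed, synthetic data: log sale price
# regressed on a third-party-owned (TPO) PV indicator plus home characteristics.
# Variable names and coefficients are hypothetical; the study's model is richer.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 2000
df = pd.DataFrame({
    "sqft": rng.normal(1800, 400, n),
    "age": rng.integers(0, 60, n),
    "tpo_pv": rng.binomial(1, 0.05, n),          # 1 if the home has a TPO PV system
})
df["log_price"] = (11.5 + 0.0004 * df["sqft"] - 0.002 * df["age"]
                   + 0.0 * df["tpo_pv"] + rng.normal(0, 0.15, n))

model = smf.ols("log_price ~ sqft + age + tpo_pv", data=df).fit()
print(model.summary().tables[1])   # is the tpo_pv premium distinguishable from zero?
```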

  15. Proper joint analysis of summary association statistics requires the adjustment of heterogeneity in SNP coverage pattern.

    PubMed

    Zhang, Han; Wheeler, William; Song, Lei; Yu, Kai

    2017-07-07

    As meta-analysis results published by consortia of genome-wide association studies (GWASs) become increasingly available, many association summary statistics-based multi-locus tests have been developed to jointly evaluate multiple single-nucleotide polymorphisms (SNPs) to reveal novel genetic architectures of various complex traits. The validity of these approaches relies on the accurate estimate of z-score correlations at considered SNPs, which in turn requires knowledge on the set of SNPs assessed by each study participating in the meta-analysis. However, this exact SNP coverage information is usually unavailable from the meta-analysis results published by GWAS consortia. In the absence of the coverage information, researchers typically estimate the z-score correlations by making oversimplified coverage assumptions. We show through real studies that such a practice can generate highly inflated type I errors, and we demonstrate the proper way to incorporate correct coverage information into multi-locus analyses. We advocate that consortia should make SNP coverage information available when posting their meta-analysis results, and that investigators who develop analytic tools for joint analyses based on summary data should pay attention to the variation in SNP coverage and adjust for it appropriately. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.

  16. The skeletal maturation status estimated by statistical shape analysis: axial images of Japanese cervical vertebra.

    PubMed

    Shin, S M; Kim, Y-I; Choi, Y-S; Yamaguchi, T; Maki, K; Cho, B-H; Park, S-B

    2015-01-01

    To evaluate axial cervical vertebral (ACV) shape quantitatively and to build a prediction model for skeletal maturation level using statistical shape analysis for Japanese individuals. The sample included 24 female and 19 male patients with hand-wrist radiographs and CBCT images. Through generalized Procrustes analysis and principal components (PCs) analysis, the meaningful PCs were extracted from each ACV shape and analysed for the estimation regression model. Each ACV shape had meaningful PCs, except for the second axial cervical vertebra. Based on these models, the smallest prediction intervals (PIs) were from the combination of the shape space PCs, age and gender. Overall, the PIs of the male group were smaller than those of the female group. There was no significant correlation between centroid size as a size factor and skeletal maturation level. Our findings suggest that the ACV maturation method, which was applied by statistical shape analysis, could confirm information about skeletal maturation in Japanese individuals as an available quantifier of skeletal maturation and could be as useful a quantitative method as the skeletal maturation index.
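
    The shape-analysis pipeline can be illustrated with a simplified sketch: Procrustes alignment of landmark configurations followed by principal components analysis, with the resulting scores available for a regression model of maturation level. Pairwise alignment to a reference is used here as a stand-in for full generalized Procrustes analysis, and the landmark data are synthetic.

```python
# Simplified sketch of the shape-analysis pipeline: Procrustes alignment of 2D
# landmark configurations followed by principal components analysis. Pairwise
# alignment to a reference is a stand-in for full generalized Procrustes
# analysis; the landmark data are synthetic.
import numpy as np
from scipy.spatial import procrustes
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
n_subjects, n_landmarks = 43, 10          # e.g. 24 female + 19 male subjects
shapes = (rng.normal(size=(n_subjects, n_landmarks, 2))
          + np.linspace(0, 9, n_landmarks)[None, :, None])

reference = shapes[0]
aligned = np.stack([procrustes(reference, s)[1] for s in shapes])   # scaled/rotated copies

pca = PCA(n_components=3)
scores = pca.fit_transform(aligned.reshape(n_subjects, -1))
print("variance explained by first PCs:", np.round(pca.explained_variance_ratio_, 3))
# 'scores' could then enter a regression model predicting skeletal maturation level.
```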

  17. The skeletal maturation status estimated by statistical shape analysis: axial images of Japanese cervical vertebra

    PubMed Central

    Shin, S M; Choi, Y-S; Yamaguchi, T; Maki, K; Cho, B-H; Park, S-B

    2015-01-01

    Objectives: To evaluate axial cervical vertebral (ACV) shape quantitatively and to build a prediction model for skeletal maturation level using statistical shape analysis for Japanese individuals. Methods: The sample included 24 female and 19 male patients with hand–wrist radiographs and CBCT images. Through generalized Procrustes analysis and principal components (PCs) analysis, the meaningful PCs were extracted from each ACV shape and analysed for the estimation regression model. Results: Each ACV shape had meaningful PCs, except for the second axial cervical vertebra. Based on these models, the smallest prediction intervals (PIs) were from the combination of the shape space PCs, age and gender. Overall, the PIs of the male group were smaller than those of the female group. There was no significant correlation between centroid size as a size factor and skeletal maturation level. Conclusions: Our findings suggest that the ACV maturation method, which was applied by statistical shape analysis, could confirm information about skeletal maturation in Japanese individuals as an available quantifier of skeletal maturation and could be as useful a quantitative method as the skeletal maturation index. PMID:25411713

  18. A Comparative Evaluation of Mixed Dentition Analysis on Reliability of Cone Beam Computed Tomography Image Compared to Plaster Model

    PubMed Central

    Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam

    2017-01-01

    Aims and Objective: The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT)-derived images relative to plaster models for the assessment of mixed dentition analysis. Materials and Methods: Thirty CBCT-derived images and thirty plaster models were derived from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analyses, along with Student's t-test, were performed to qualitatively evaluate the data, and P < 0.05 was considered statistically significant. Results: Statistically significant results were obtained on data comparison between CBCT-derived images and plaster models; the mean for Moyer's analysis in the left and right lower arch for CBCT and plaster model was 21.2 mm, 21.1 mm and 22.5 mm, 22.5 mm, respectively. Conclusion: CBCT-derived images were less reliable as compared to data obtained directly from plaster models for mixed dentition analysis. PMID:28852639

  19. Biometric Analysis - A Reliable Indicator for Diagnosing Taurodontism using Panoramic Radiographs.

    PubMed

    Hegde, Veda; Anegundi, Rajesh Trayambhak; Pravinchandra, K R

    2013-08-01

    Taurodontism is a clinical entity with a morpho-anatomical change in the shape of the tooth, which was thought to be absent in modern man. Taurodontism is mostly observed as an isolated trait or a component of a syndrome. Various techniques have been devised to diagnose taurodontism. The aim of this study was to analyze whether a biometric analysis was useful in diagnosing taurodontism in radiographs which appeared to be normal on cursory observation. This study was carried out in our institution by using radiographs which were taken for routine procedures. In this retrospective study, panoramic radiographs were obtained from dental records of children aged between 9 and 14 years who did not have any abnormality on cursory observation. Biometric analyses were carried out on permanent mandibular first molar(s) by using a novel biometric method. The values were tabulated and analysed. The Fisher exact probability test, Chi-square test and Chi-square test with Yates correction were used for statistical analysis of the data. Cursory observation did not yield any case of taurodontism. In contrast, the biometric analysis yielded a statistically significant number of cases of taurodontism. However, there was no statistically significant difference in the number of cases with taurodontism between the genders or across the age group which was considered. Thus, taurodontism was diagnosed on a biometric analysis, which was otherwise missed on a cursory observation. It is therefore necessary, from the clinical point of view, to diagnose even the mildest form of taurodontism by using metric analysis rather than just relying on a visual radiographic assessment, as its occurrence has many clinical implications and a diagnostic importance.

  20. Quantitative chromatin pattern description in Feulgen-stained nuclei as a diagnostic tool to characterize the oligodendroglial and astroglial components in mixed oligo-astrocytomas.

    PubMed

    Decaestecker, C; Lopes, B S; Gordower, L; Camby, I; Cras, P; Martin, J J; Kiss, R; VandenBerg, S R; Salmon, I

    1997-04-01

    The oligoastrocytoma, as a mixed glioma, represents a nosologic dilemma with respect to precisely defining the oligodendroglial and astroglial phenotypes that constitute the neoplastic cell lineages of these tumors. In this study, cell image analysis with Feulgen-stained nuclei was used to distinguish between oligodendroglial and astrocytic phenotypes in oligodendrogliomas and astrocytomas and then applied to mixed oligoastrocytomas. Quantitative features with respect to chromatin pattern (30 variables) and DNA ploidy (8 variables) were evaluated on Feulgen-stained nuclei in a series of 71 gliomas using computer-assisted microscopy. These included 32 oligodendrogliomas (OLG group: 24 grade II and 8 grade III tumors according to the WHO classification), 32 astrocytomas (AST group: 13 grade II and 19 grade III tumors), and 7 oligoastrocytomas (OLGAST group). Initially, image analysis with multivariate statistical analyses (Discriminant Analysis) could identify each glial tumor group. Highly significant statistical differences were obtained distinguishing the morphonuclear features of oligodendrogliomas from those of astrocytomas, regardless of their histological grade. Of the 7 mixed oligoastrocytomas under study, 5 exhibited DNA ploidy and chromatin pattern characteristics similar to grade II oligodendrogliomas, 1 to grade III oligodendrogliomas, and 1 to grade II astrocytomas. Using multifactorial statistical analyses (Discriminant Analysis combined with Principal Component Analysis), it was possible to quantify the proportion of "typical" glial cell phenotypes that compose grade II and III oligodendrogliomas and grade II and III astrocytomas in each mixed glioma. Cytometric image analysis may be an important adjunct to routine histopathology for the reproducible identification of neoplasms containing a mixture of oligodendroglial and astrocytic phenotypes.

  1. Power-up: A Reanalysis of 'Power Failure' in Neuroscience Using Mixture Modeling

    PubMed Central

    Wood, John

    2017-01-01

    Recently, evidence for endemically low statistical power has cast neuroscience findings into doubt. If low statistical power plagues neuroscience, then this reduces confidence in the reported effects. However, if statistical power is not uniformly low, then such blanket mistrust might not be warranted. Here, we provide a different perspective on this issue, analyzing data from an influential study reporting a median power of 21% across 49 meta-analyses (Button et al., 2013). We demonstrate, using Gaussian mixture modeling, that the sample of 730 studies included in that analysis comprises several subcomponents so the use of a single summary statistic is insufficient to characterize the nature of the distribution. We find that statistical power is extremely low for studies included in meta-analyses that reported a null result and that it varies substantially across subfields of neuroscience, with particularly low power in candidate gene association studies. Therefore, whereas power in neuroscience remains a critical issue, the notion that studies are systematically underpowered is not the full story: low power is far from a universal problem. SIGNIFICANCE STATEMENT Recently, researchers across the biomedical and psychological sciences have become concerned with the reliability of results. One marker for reliability is statistical power: the probability of finding a statistically significant result given that the effect exists. Previous evidence suggests that statistical power is low across the field of neuroscience. Our results present a more comprehensive picture of statistical power in neuroscience: on average, studies are indeed underpowered—some very seriously so—but many studies show acceptable or even exemplary statistical power. We show that this heterogeneity in statistical power is common across most subfields in neuroscience. This new, more nuanced picture of statistical power in neuroscience could affect not only scientific understanding, but potentially policy and funding decisions for neuroscience research. PMID:28706080
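
    The mixture-modelling idea can be sketched briefly: fit Gaussian mixtures with different numbers of components to a distribution of study-level power estimates and let an information criterion choose among them, rather than summarising with a single median. The power values below are simulated, not the data reanalysed in the paper.

```python
# Sketch of the reanalysis idea: fit Gaussian mixture models to a distribution of
# study-level statistical power estimates and pick the number of components by
# BIC rather than summarising the distribution by a single median. The power
# values below are simulated, not the Button et al. (2013) data.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
power = np.concatenate([rng.beta(2, 18, 400),     # a low-power subcomponent
                        rng.beta(12, 4, 330)])    # a well-powered subcomponent
X = power.reshape(-1, 1)

fits = {k: GaussianMixture(n_components=k, random_state=0).fit(X) for k in range(1, 5)}
best_k = min(fits, key=lambda k: fits[k].bic(X))
print("components preferred by BIC:", best_k)
print("component means:", np.sort(fits[best_k].means_.ravel()).round(2))
```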

  2. MGAS: a powerful tool for multivariate gene-based genome-wide association analysis.

    PubMed

    Van der Sluis, Sophie; Dolan, Conor V; Li, Jiang; Song, Youqiang; Sham, Pak; Posthuma, Danielle; Li, Miao-Xin

    2015-04-01

    Standard genome-wide association studies, testing the association between one phenotype and a large number of single nucleotide polymorphisms (SNPs), are limited in two ways: (i) traits are often multivariate, and analysis of composite scores entails loss in statistical power, and (ii) gene-based analyses may be preferred, e.g. to decrease the multiple testing problem. Here we present a new method, multivariate gene-based association test by extended Simes procedure (MGAS), that allows gene-based testing of multivariate phenotypes in unrelated individuals. Through extensive simulation, we show that under most trait-generating genotype-phenotype models, MGAS has superior statistical power to detect associated genes compared with gene-based analyses of univariate phenotypic composite scores (i.e. GATES, multiple regression), and multivariate analysis of variance (MANOVA). Re-analysis of metabolic data revealed 32 False Discovery Rate controlled genome-wide significant genes, and 12 regions harboring multiple genes; of these 44 regions, 30 were not reported in the original analysis. MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with an incorrectly specified genotype-phenotype model. MGAS is freely available in KGG v3.0 (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php). Access to the metabolic dataset can be requested at dbGaP (https://dbgap.ncbi.nlm.nih.gov/). The R-simulation code is available from http://ctglab.nl/people/sophie_van_der_sluis. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  3. Sensitivity analysis of static resistance of slender beam under bending

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Valeš, Jan

    2016-06-08

    The paper deals with statistical and sensitivity analyses of the resistance of simply supported I-beams under bending. The resistance was solved by the geometrically nonlinear finite element method in the programme Ansys. The beams are modelled with initial geometrical imperfections following the first eigenmode of buckling. The imperfections, together with the geometrical characteristics of the cross section and the material characteristics of the steel, were considered as random quantities. The Latin Hypercube Sampling method was applied to perform the statistical and sensitivity analyses of resistance.
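
    A toy Latin Hypercube Sampling sketch in Python illustrates the sampling step; the input distributions and the closed-form "resistance" response are hypothetical placeholders for the random imperfection, material and cross-section quantities and the nonlinear finite element model used in the paper.

```python
# Illustrative Latin Hypercube Sampling of random beam parameters; the input
# distributions are hypothetical stand-ins for the imperfection amplitude, yield
# stress and flange thickness, and the "resistance" function is a toy placeholder
# rather than the nonlinear FE model.
import numpy as np
from scipy.stats import qmc, norm, spearmanr

sampler = qmc.LatinHypercube(d=3, seed=0)
u = sampler.random(n=1000)                                   # uniform samples in [0, 1)^3

imperfection = qmc.scale(u[:, [0]], 0.0, 5.0).ravel()        # mm, uniform
yield_stress = norm(loc=297.0, scale=16.0).ppf(u[:, 1])      # MPa, Gaussian
thickness = norm(loc=8.0, scale=0.3).ppf(u[:, 2])            # mm, Gaussian

resistance = thickness * yield_stress * (1.0 - 0.03 * imperfection)   # toy response
print(f"mean resistance {resistance.mean():.1f}, std {resistance.std():.1f}")

# crude sensitivity measure: rank correlation of each input with the response
for name, x in [("imperfection", imperfection),
                ("yield stress", yield_stress),
                ("thickness", thickness)]:
    print(name, round(spearmanr(x, resistance)[0], 2))
```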

  4. The Need for Speed in Rodent Locomotion Analyses

    PubMed Central

    Batka, Richard J.; Brown, Todd J.; Mcmillan, Kathryn P.; Meadows, Rena M.; Jones, Kathryn J.; Haulcomb, Melissa M.

    2016-01-01

    Locomotion analysis is now widely used across many animal species to understand the motor defects in disease, functional recovery following neural injury, and the effectiveness of various treatments. More recently, rodent locomotion analysis has become an increasingly popular method in a diverse range of research. Speed is an inseparable aspect of locomotion that is still not fully understood, and its effects are often not properly incorporated while analyzing data. In this hybrid manuscript, we accomplish three things: (1) review the interaction between speed and locomotion variables in rodent studies, (2) comprehensively analyze the relationship between speed and 162 locomotion variables in a group of 16 wild-type mice using the CatWalk gait analysis system, and (3) develop and test a statistical method in which locomotion variables are analyzed and reported in the context of speed. Notable results include the following: (1) over 90% of variables, reported by CatWalk, were dependent on speed with an average R2 value of 0.624, (2) most variables were related to speed in a nonlinear manner, (3) current methods of controlling for speed are insufficient, and (4) the linear mixed model is an appropriate and effective statistical method for locomotion analyses that is inclusive of speed-dependent relationships. Given the pervasive dependency of locomotion variables on speed, we maintain that valid conclusions from locomotion analyses cannot be made unless they are analyzed and reported within the context of speed. PMID:24890845
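
    The statistical recommendation can be sketched with a linear mixed model in which speed enters as a covariate and animal as a random effect, so that a gait variable is always reported in the context of speed. The data below are simulated rather than CatWalk output, and the variable names are illustrative.

```python
# Sketch of the recommended approach: a linear mixed model with walking speed as
# a covariate and animal as a random effect, so a gait variable is analysed in
# the context of speed. The gait data below are simulated, not CatWalk output.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
animals, runs_per_animal = 16, 12
records = []
for a in range(animals):
    intercept = rng.normal(0.0, 0.5)                       # animal-specific offset
    for _ in range(runs_per_animal):
        speed = rng.uniform(10, 40)                        # cm/s
        stride = 3.0 + 0.12 * speed + intercept + rng.normal(0, 0.4)
        records.append({"animal": a, "speed": speed, "stride_length": stride})
df = pd.DataFrame(records)

model = smf.mixedlm("stride_length ~ speed", data=df, groups=df["animal"]).fit()
print(model.summary())    # the 'speed' coefficient quantifies the speed dependency
```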

  5. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor.

    PubMed

    Davis, Sean; Meltzer, Paul S

    2007-07-15

    Microarray technology has become a standard molecular biology tool. Experimental data have been generated on a huge number of organisms, tissue types, treatment conditions and disease states. The Gene Expression Omnibus (Barrett et al., 2005), developed by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health, is a repository of nearly 140,000 gene expression experiments. The BioConductor project (Gentleman et al., 2004) is an open-source and open-development software project built in the R statistical programming environment (R Development Core Team, 2005) for the analysis and comprehension of genomic data. The tools contained in the BioConductor project represent many state-of-the-art methods for the analysis of microarray and genomics data. We have developed a software tool that allows access to the wealth of information within GEO directly from BioConductor, eliminating many of the formatting and parsing problems that have made such analyses labor-intensive in the past. The software, called GEOquery, effectively establishes a bridge between GEO and BioConductor. Easy access to GEO data from BioConductor will likely lead to new analyses of GEO data using novel and rigorous statistical and bioinformatic tools. Facilitating analyses and meta-analyses of microarray data will increase the efficiency with which biologically important conclusions can be drawn from published genomic data. GEOquery is available as part of the BioConductor project.

  6. Gene Level Meta-Analysis of Quantitative Traits by Functional Linear Models.

    PubMed

    Fan, Ruzong; Wang, Yifan; Boehnke, Michael; Chen, Wei; Li, Yun; Ren, Haobo; Lobach, Iryna; Xiong, Momiao

    2015-08-01

    Meta-analysis of genetic data must account for differences among studies including study designs, markers genotyped, and covariates. The effects of genetic variants may differ from population to population, i.e., heterogeneity. Thus, meta-analysis of combining data of multiple studies is difficult. Novel statistical methods for meta-analysis are needed. In this article, functional linear models are developed for meta-analyses that connect genetic data to quantitative traits, adjusting for covariates. The models can be used to analyze rare variants, common variants, or a combination of the two. Both likelihood-ratio test (LRT) and F-distributed statistics are introduced to test association between quantitative traits and multiple variants in one genetic region. Extensive simulations are performed to evaluate empirical type I error rates and power performance of the proposed tests. The proposed LRT and F-distributed statistics control the type I error very well and have higher power than the existing methods of the meta-analysis sequence kernel association test (MetaSKAT). We analyze four blood lipid levels in data from a meta-analysis of eight European studies. The proposed methods detect more significant associations than MetaSKAT and the P-values of the proposed LRT and F-distributed statistics are usually much smaller than those of MetaSKAT. The functional linear models and related test statistics can be useful in whole-genome and whole-exome association studies. Copyright © 2015 by the Genetics Society of America.

  7. Lehrer in der Bundesrepublik Deutschland. Eine Kritische Analyse Statistischer Daten uber das Lehrpersonal an Allgemeinbildenden Schulen. (Education in the Federal Republic of Germany. A Statistical Study of Teachers in Schools of General Education.)

    ERIC Educational Resources Information Center

    Kohler, Helmut

    The purpose of this study was to analyze the available statistics concerning teachers in schools of general education in the Federal Republic of Germany. An analysis of the demographic structure of the pool of full-time teachers showed that in 1971 30 percent of the teachers were under age 30, and 50 percent were under age 35. It was expected that…

  8. Statistical Performances of Resistive Active Power Splitter

    NASA Astrophysics Data System (ADS)

    Lalléchère, Sébastien; Ravelo, Blaise; Thakur, Atul

    2016-03-01

    In this paper, the synthesis and sensitivity analysis of an active power splitter (PWS) is proposed. It is based on an active cell composed of a Field Effect Transistor in cascade with a shunted resistor at the input and the output (resistive amplifier topology). The PWS uncertainty versus resistance tolerances is assessed by using a stochastic method. Furthermore, with the proposed topology, the device gain can be controlled easily by varying a resistance. This provides a useful tool to analyse the statistical sensitivity of the system in an uncertain environment.

  9. A statistical framework for neuroimaging data analysis based on mutual information estimated via a gaussian copula.

    PubMed

    Ince, Robin A A; Giordano, Bruno L; Kayser, Christoph; Rousselet, Guillaume A; Gross, Joachim; Schyns, Philippe G

    2017-03-01

    We begin by reviewing the statistical framework of information theory as applicable to neuroimaging data analysis. A major factor hindering wider adoption of this framework in neuroimaging is the difficulty of estimating information theoretic quantities in practice. We present a novel estimation technique that combines the statistical theory of copulas with the closed form solution for the entropy of Gaussian variables. This results in a general, computationally efficient, flexible, and robust multivariate statistical framework that provides effect sizes on a common meaningful scale, allows for unified treatment of discrete, continuous, unidimensional and multidimensional variables, and enables direct comparisons of representations from behavioral and brain responses across any recording modality. We validate the use of this estimate as a statistical test within a neuroimaging context, considering both discrete stimulus classes and continuous stimulus features. We also present examples of analyses facilitated by these developments, including application of multivariate analyses to MEG planar magnetic field gradients, and pairwise temporal interactions in evoked EEG responses. We show the benefit of considering the instantaneous temporal derivative together with the raw values of M/EEG signals as a multivariate response, how we can separately quantify modulations of amplitude and direction for vector quantities, and how we can measure the emergence of novel information over time in evoked responses. Open-source Matlab and Python code implementing the new methods accompanies this article. Hum Brain Mapp 38:1541-1573, 2017. © 2016 Wiley Periodicals, Inc. 2016 The Authors Human Brain Mapping Published by Wiley Periodicals, Inc.
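
    The core of the estimator can be sketched in a few lines: rank-transform each variable to standard normal quantiles (the copula normalisation) and then evaluate mutual information with the closed-form Gaussian expression. This is a simplified illustration under assumed data, not the authors' released toolbox.

```python
# Minimal sketch of the Gaussian-copula mutual information estimate described in
# the abstract: each variable is rank-transformed to a standard normal (copula
# normalisation) and MI is then computed from the closed-form Gaussian expression.
# This is a simplified illustration, not the authors' released toolbox.
import numpy as np
from scipy.stats import norm, rankdata

def copnorm(x):
    """Rank-transform a 1-D sample to standard normal quantiles."""
    return norm.ppf(rankdata(x) / (len(x) + 1))

def gaussian_copula_mi(x, y):
    """MI (in bits) between two 1-D variables via the Gaussian copula."""
    cx, cy = copnorm(x), copnorm(y)
    r = np.corrcoef(cx, cy)[0, 1]
    return -0.5 * np.log2(1.0 - r ** 2)

rng = np.random.default_rng(7)
stim = rng.normal(size=5000)                       # e.g. a continuous stimulus feature
resp = 0.6 * stim ** 3 + rng.normal(size=5000)     # a nonlinearly related "response"
print(f"Gaussian-copula MI: {gaussian_copula_mi(stim, resp):.3f} bits")
```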

  10. Ocean data assimilation using optimal interpolation with a quasi-geostrophic model

    NASA Technical Reports Server (NTRS)

    Rienecker, Michele M.; Miller, Robert N.

    1991-01-01

    A quasi-geostrophic (QG) stream function is analyzed by optimal interpolation (OI) over a 59-day period in a 150-km-square domain off northern California. Hydrographic observations acquired over five surveys were assimilated into a QG open boundary ocean model. Assimilation experiments were conducted separately for individual surveys to investigate the sensitivity of the OI analyses to parameters defining the decorrelation scale of an assumed error covariance function. The analyses were intercompared through dynamical hindcasts between surveys. The best hindcast was obtained using the smooth analyses produced with assumed error decorrelation scales identical to those of the observed stream function. The rms difference between the hindcast stream function and the final analysis was only 23 percent of the observation standard deviation. The two sets of OI analyses were temporally smoother than the fields from statistical objective analysis and in good agreement with the only independent data available for comparison.

  11. Using Meta-analyses for Comparative Effectiveness Research

    PubMed Central

    Ruppar, Todd M.; Phillips, Lorraine J.; Chase, Jo-Ana D.

    2012-01-01

    Comparative effectiveness research seeks to identify the most effective interventions for particular patient populations. Meta-analysis is an especially valuable form of comparative effectiveness research because it emphasizes the magnitude of intervention effects rather than relying on tests of statistical significance among primary studies. Overall effects can be calculated for diverse clinical and patient-centered variables to determine the outcome patterns. Moderator analyses compare intervention characteristics among primary studies by determining if effect sizes vary among studies with different intervention characteristics. Intervention effectiveness can be linked to patient characteristics to provide evidence for patient-centered care. Moderator analyses often answer questions never posed by primary studies because neither multiple intervention characteristics nor populations are compared in single primary studies. Thus meta-analyses provide unique contributions to knowledge. Although meta-analysis is a powerful comparative effectiveness strategy, methodological challenges and limitations in primary research must be acknowledged to interpret findings. PMID:22789450

  12. Reuse, Recycle, Reweigh: Combating Influenza through Efficient Sequential Bayesian Computation for Massive Data.

    PubMed

    Tom, Jennifer A; Sinsheimer, Janet S; Suchard, Marc A

    Massive datasets in the gigabyte and terabyte range combined with the availability of increasingly sophisticated statistical tools yield analyses at the boundary of what is computationally feasible. Compromising in the face of this computational burden by partitioning the dataset into more tractable sizes results in stratified analyses, removed from the context that justified the initial data collection. In a Bayesian framework, these stratified analyses generate intermediate realizations, often compared using point estimates that fail to account for the variability within and correlation between the distributions these realizations approximate. However, although the initial concession to stratify generally precludes the more sensible analysis using a single joint hierarchical model, we can circumvent this outcome and capitalize on the intermediate realizations by extending the dynamic iterative reweighting MCMC algorithm. In doing so, we reuse the available realizations by reweighting them with importance weights, recycling them into a now tractable joint hierarchical model. We apply this technique to intermediate realizations generated from stratified analyses of 687 influenza A genomes spanning 13 years allowing us to revisit hypotheses regarding the evolutionary history of influenza within a hierarchical statistical framework.
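
    The reweighting idea admits a toy sketch: draws from a stratified analysis (the proposal) are recycled toward a joint model (the target) via normalised importance weights, with the effective sample size indicating how much information survives. The one-dimensional densities below are stand-ins for the phylogenetic models in the paper.

```python
# Toy sketch of the reweighting idea: posterior draws from a stratified analysis
# (the "proposal") are recycled into a joint model (the "target") via normalised
# importance weights. Densities and draws are one-dimensional stand-ins for the
# models in the paper.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(8)
draws = rng.normal(loc=0.0, scale=1.2, size=20_000)        # realizations from the stratified run

proposal = norm(0.0, 1.2)                                  # density the draws came from
target = norm(0.4, 1.0)                                    # density implied by the joint hierarchical model

log_w = target.logpdf(draws) - proposal.logpdf(draws)
w = np.exp(log_w - log_w.max())
w /= w.sum()

posterior_mean = np.sum(w * draws)                          # reweighted estimate under the joint model
ess = 1.0 / np.sum(w ** 2)                                  # effective sample size of the recycled draws
print(f"reweighted mean {posterior_mean:.3f}, effective sample size {ess:.0f}")
```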

  13. Reuse, Recycle, Reweigh: Combating Influenza through Efficient Sequential Bayesian Computation for Massive Data

    PubMed Central

    Tom, Jennifer A.; Sinsheimer, Janet S.; Suchard, Marc A.

    2015-01-01

    Massive datasets in the gigabyte and terabyte range combined with the availability of increasingly sophisticated statistical tools yield analyses at the boundary of what is computationally feasible. Compromising in the face of this computational burden by partitioning the dataset into more tractable sizes results in stratified analyses, removed from the context that justified the initial data collection. In a Bayesian framework, these stratified analyses generate intermediate realizations, often compared using point estimates that fail to account for the variability within and correlation between the distributions these realizations approximate. However, although the initial concession to stratify generally precludes the more sensible analysis using a single joint hierarchical model, we can circumvent this outcome and capitalize on the intermediate realizations by extending the dynamic iterative reweighting MCMC algorithm. In doing so, we reuse the available realizations by reweighting them with importance weights, recycling them into a now tractable joint hierarchical model. We apply this technique to intermediate realizations generated from stratified analyses of 687 influenza A genomes spanning 13 years allowing us to revisit hypotheses regarding the evolutionary history of influenza within a hierarchical statistical framework. PMID:26681992

  14. A software platform for statistical evaluation of patient respiratory patterns in radiation therapy.

    PubMed

    Dunn, Leon; Kenny, John

    2017-10-01

    The aim of this work was to design and evaluate a software tool for analysis of a patient's respiration, with the goal of optimizing the effectiveness of motion management techniques during radiotherapy imaging and treatment. A software tool was developed to analyse patient respiratory data files (.vxp files) created by the Varian Real-Time Position Management System (RPM). The software, called RespAnalysis, was created in MATLAB and provides four modules, one each for determining respiration characteristics, providing breathing coaching (biofeedback training), comparing pre- and post-training characteristics and performing a fraction-by-fraction assessment. The modules analyse respiratory traces to determine signal characteristics and specifically use a Sample Entropy algorithm as the key means to quantify breathing irregularity. Simulated respiratory signals, as well as 91 patient RPM traces, were analysed with RespAnalysis to test the viability of using the Sample Entropy for predicting breathing regularity. Retrospective assessment of patient data demonstrated that the Sample Entropy metric was a predictor of periodic irregularity in respiration data; however, it was found to be insensitive to amplitude variation. Additional waveform statistics assessing the distribution of signal amplitudes over time, coupled with the Sample Entropy method, were found to be useful in assessing breathing regularity. The RespAnalysis software tool presented in this work uses the Sample Entropy method to analyse patient respiratory data recorded for motion management purposes in radiation therapy. This is applicable during treatment simulation and during subsequent treatment fractions, providing a way to quantify breathing irregularity, as well as assess the need for breathing coaching. It was demonstrated that the Sample Entropy metric was correlated to the irregularity of the patient's respiratory motion in terms of periodicity, whilst other metrics, such as percentage deviation of inhale/exhale peak positions, provided insight into respiratory amplitude regularity. Copyright © 2017 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
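
    A compact Sample Entropy implementation illustrates the irregularity metric the tool is described as using; the template length, tolerance and the synthetic respiratory traces are illustrative choices rather than RespAnalysis defaults.

```python
# A compact Sample Entropy (SampEn) implementation of the kind RespAnalysis is
# described as using to quantify breathing irregularity. The template length m,
# tolerance r, and the synthetic respiratory traces are illustrative choices.
import numpy as np

def sample_entropy(signal, m=2, r=None):
    x = np.asarray(signal, dtype=float)
    if r is None:
        r = 0.2 * x.std()
    def count_matches(length):
        templates = np.array([x[i:i + length] for i in range(len(x) - length)])
        # Chebyshev distance between all template pairs
        d = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        n = len(templates)
        return (np.sum(d <= r) - n) / 2        # exclude self-matches, count pairs once
    B, A = count_matches(m), count_matches(m + 1)
    return -np.log(A / B)

t = np.linspace(0, 60, 1000)                   # 60 s of simulated "respiration"
regular = np.sin(2 * np.pi * 0.25 * t)
irregular = regular + 0.5 * np.random.default_rng(9).normal(size=t.size)
print("SampEn regular  :", round(sample_entropy(regular), 3))
print("SampEn irregular:", round(sample_entropy(irregular), 3))
```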

  15. Separate enrichment analysis of pathways for up- and downregulated genes.

    PubMed

    Hong, Guini; Zhang, Wenjing; Li, Hongdong; Shen, Xiaopei; Guo, Zheng

    2014-03-06

    Two strategies are often adopted for enrichment analysis of pathways: the analysis of all differentially expressed (DE) genes together or the analysis of up- and downregulated genes separately. However, few studies have examined the rationales of these enrichment analysis strategies. Using both microarray and RNA-seq data, we show that gene pairs with functional links in pathways tended to have positively correlated expression levels, which could result in an imbalance between the up- and downregulated genes in particular pathways. We then show that the imbalance could greatly reduce the statistical power for finding disease-associated pathways through the analysis of all-DE genes. Further, using gene expression profiles from five types of tumours, we illustrate that the separate analysis of up- and downregulated genes could identify more pathways that are really pertinent to phenotypic difference. In conclusion, analysing up- and downregulated genes separately is more powerful than analysing all of the DE genes together.
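
    The two strategies being contrasted can be sketched with a Fisher exact enrichment test applied once to all DE genes pooled and once to the up- and downregulated sets separately. The gene universe, pathway membership and DE calls below are synthetic.

```python
# Sketch of the two strategies contrasted in the abstract: pathway enrichment
# tested with Fisher's exact test on all DE genes together versus separately on
# up- and downregulated genes. Gene universe, pathway membership and DE calls
# are synthetic.
import numpy as np
from scipy.stats import fisher_exact

rng = np.random.default_rng(10)
universe = 10_000
pathway = set(rng.choice(universe, 150, replace=False))
up = set(rng.choice(list(pathway), 40, replace=False)) | set(rng.choice(universe, 160, replace=False))
down = set(rng.choice(universe, 200, replace=False))           # downregulation unrelated to the pathway

def enrichment_p(de_genes):
    in_path_de = len(de_genes & pathway)
    table = [[in_path_de, len(de_genes) - in_path_de],
             [len(pathway) - in_path_de,
              universe - len(pathway) - (len(de_genes) - in_path_de)]]
    return fisher_exact(table, alternative="greater")[1]

print("all DE genes together:", f"{enrichment_p(up | down):.2e}")
print("upregulated only     :", f"{enrichment_p(up):.2e}")
print("downregulated only   :", f"{enrichment_p(down):.2e}")
```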

  16. An exploration of counterfeit medicine surveillance strategies guided by geospatial analysis: lessons learned from counterfeit Avastin detection in the US drug supply chain.

    PubMed

    Cuomo, Raphael E; Mackey, Tim K

    2014-12-02

    To explore healthcare policy and system improvements that would more proactively respond to future penetration of counterfeit cancer medications in the USA drug supply chain using geospatial analysis. A statistical and geospatial analysis of areas that received notices from the Food and Drug Administration (FDA) about the possibility of counterfeit Avastin penetrating the US drug supply chain. Data from FDA warning notices were compared to data from 44 demographic variables available from the US Census Bureau via correlation, means testing and geospatial visualisation. Results were interpreted in light of existing literature in order to recommend improvements to surveillance of counterfeit medicines. This study analysed 791 distinct healthcare provider addresses that received FDA warning notices across 30,431 zip codes in the USA. Statistical outputs were Pearson's correlation coefficients and t values. Geospatial outputs were cartographic visualisations. These data were used to generate the overarching study outcome, which was a recommendation for a strategy for drug safety surveillance congruent with existing literature on counterfeit medication. Zip codes with greater numbers of individuals age 65+ and greater numbers of ethnic white individuals were most correlated with receipt of a counterfeit Avastin notice. Geospatial visualisations designed in conjunction with statistical analysis of demographic variables appeared more capable of suggesting areas and populations that may be at risk for undetected counterfeit Avastin penetration. This study suggests that dual incorporation of statistical and geospatial analysis in surveillance of counterfeit medicine may be helpful in guiding efforts to prevent, detect and visualise counterfeit medicines penetrations in the US drug supply chain and other settings. Importantly, the information generated by these analyses could be utilised to identify at-risk populations associated with demographic characteristics. Stakeholders should explore these results as another tool to improve on counterfeit medicine surveillance. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  17. Sampling surface and subsurface particle-size distributions in wadable gravel-and cobble-bed streams for analyses in sediment transport, hydraulics, and streambed monitoring

    Treesearch

    Kristin Bunte; Steven R. Abt

    2001-01-01

    This document provides guidance for sampling surface and subsurface sediment from wadable gravel-and cobble-bed streams. After a short introduction to streams types and classifications in gravel-bed rivers, the document explains the field and laboratory measurement of particle sizes and the statistical analysis of particle-size distributions. Analysis of particle...

  18. Navigation and Dispersion Analysis of the First Orion Exploration Mission

    NASA Technical Reports Server (NTRS)

    Zanetti, Renato; D'Souza, Christopher

    2015-01-01

    This paper seeks to present the Orion EM-1 Linear Covariance Analysis for the DRO mission. The delta V statistics for each maneuver are presented. Included in the memo are several sensitivity analyses: variation in the time of OTC-1 (the first outbound correction maneuver), variation in the accuracy of the trans-Lunar injection, and variation in the length of the optical navigation passes.

  19. Publication Bias Currently Makes an Accurate Estimate of the Benefits of Enrichment Programs Difficult: A Postmortem of Two Meta-Analyses Using Statistical Power Analysis

    ERIC Educational Resources Information Center

    Warne, Russell T.

    2016-01-01

    Recently Kim (2016) published a meta-analysis on the effects of enrichment programs for gifted students. She found that these programs produced substantial effects for academic achievement (g = 0.96) and socioemotional outcomes (g = 0.55). However, given current theory and empirical research these estimates of the benefits of enrichment programs…

  20. Not so Fast My Friend: The Rush to R and the Need for Rigorous Evaluation of Data Analysis and Software in Education

    ERIC Educational Resources Information Center

    Harwell, Michael

    2014-01-01

    Commercial data analysis software has been a fixture of quantitative analyses in education for more than three decades. Despite its apparent widespread use there is no formal evidence cataloging what software is used in educational research and educational statistics classes, by whom and for what purpose, and whether some programs should be…

  1. Time Series Expression Analyses Using RNA-seq: A Statistical Approach

    PubMed Central

    Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P.

    2013-01-01

    RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis. PMID:23586021
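
    One of the named approaches, the autoregressive time-lagged AR(1) model, can be sketched on a single simulated expression time course; real RNA-seq counts would first be normalised, and the fitting choices here are illustrative.

```python
# Sketch of one of the approaches named in the abstract: an autoregressive
# time-lagged AR(1) model fitted to a single gene's expression time course.
# The time series is simulated; real RNA-seq counts would first be normalised.
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

rng = np.random.default_rng(11)
n_timepoints = 24
expr = np.empty(n_timepoints)
expr[0] = 5.0
for t in range(1, n_timepoints):                 # generate an AR(1)-like trajectory
    expr[t] = 1.0 + 0.8 * expr[t - 1] + rng.normal(0, 0.3)

fit = AutoReg(expr, lags=1).fit()
print("intercept and AR(1) coefficient:", np.round(fit.params, 3))
```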

  2. Statistical Analysis of a Round-Robin Measurement Survey of Two Candidate Materials for a Seebeck Coefficient Standard Reference Material

    PubMed Central

    Lu, Z. Q. J.; Lowhorn, N. D.; Wong-Ng, W.; Zhang, W.; Thomas, E. L.; Otani, M.; Green, M. L.; Tran, T. N.; Caylor, C.; Dilley, N. R.; Downey, A.; Edwards, B.; Elsner, N.; Ghamaty, S.; Hogan, T.; Jie, Q.; Li, Q.; Martin, J.; Nolas, G.; Obara, H.; Sharp, J.; Venkatasubramanian, R.; Willigan, R.; Yang, J.; Tritt, T.

    2009-01-01

    In an effort to develop a Standard Reference Material (SRM™) for Seebeck coefficient, we have conducted a round-robin measurement survey of two candidate materials—undoped Bi₂Te₃ and Constantan (55 % Cu and 45 % Ni alloy). Measurements were performed in two rounds by twelve laboratories involved in active thermoelectric research using a number of different commercial and custom-built measurement systems and techniques. In this paper we report the detailed statistical analyses on the interlaboratory measurement results and the statistical methodology for analysis of irregularly sampled measurement curves in the interlaboratory study setting. Based on these results, we have selected Bi₂Te₃ as the prototype standard material. Once available, this SRM will be useful for future interlaboratory data comparison and instrument calibrations. PMID:27504212

  3. Time series expression analyses using RNA-seq: a statistical approach.

    PubMed

    Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P

    2013-01-01

    RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis.

  4. Cluster analysis of phytoplankton data collected from the National Stream Quality Accounting Network in the Tennessee River basin, 1974-81

    USGS Publications Warehouse

    Stephens, D.W.; Wangsgard, J.B.

    1988-01-01

    A computer program, Numerical Taxonomy System of Multivariate Statistical Programs (NTSYS), was used with interfacing software to perform cluster analyses of phytoplankton data stored in the biological files of the U.S. Geological Survey. The NTSYS software performs various types of statistical analyses and is capable of handling a large matrix of data. Cluster analyses were done on phytoplankton data collected from 1974 to 1981 at four national Stream Quality Accounting Network stations in the Tennessee River basin. Analysis of the changes in clusters of phytoplankton genera indicated possible changes in the water quality of the French Broad River near Knoxville, Tennessee. At this station, the most common diatom groups indicated a shift in dominant forms with some of the less common diatoms being replaced by green and blue-green algae. There was a reduction in genera variability between 1974-77 and 1979-81 sampling periods. Statistical analysis of chloride and dissolved solids confirmed that concentrations of these substances were smaller in 1974-77 than in 1979-81. At Pickwick Landing Dam, the furthest downstream station used in the study, there was an increase in the number of genera of 'rare' organisms with time. The appearance of two groups of green and blue-green algae indicated that an increase in temperature or nutrient concentrations occurred from 1974 to 1981, but this could not be confirmed using available water quality data. Associations of genera forming the phytoplankton communities at three stations on the Tennessee River were found to be seasonal. Nodal analysis of combined data from all four stations used in the study did not identify any seasonal or temporal patterns during 1974-81. Cluster analysis using the NTSYS programs was effective in reducing the large phytoplankton data set to a manageable size and provided considerable insight into the structure of phytoplankton communities in the Tennessee River basin. Problems encountered using cluster analysis were the subjectivity introduced in the definition of meaningful clusters, and the lack of taxonomic identification to the species level. (Author's abstract)
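
    A modern stand-in for such cluster analyses is easy to sketch: hierarchical clustering of a genus-by-sample abundance matrix with an ecological dissimilarity measure. The abundances, distance metric and linkage choice below are assumptions made for illustration, not a reconstruction of the NTSYS runs.

```python
# Illustrative hierarchical clustering of a genus-by-sample abundance matrix, a
# modern stand-in for the NTSYS cluster analyses described above. Abundances,
# distance metric and linkage choice are assumptions for the sketch.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(12)
n_genera, n_samples = 30, 12
abundance = rng.poisson(lam=rng.uniform(1, 20, size=(n_genera, 1)), size=(n_genera, n_samples))

# Bray-Curtis dissimilarity between genera, then average (UPGMA) linkage
dist = pdist(abundance, metric="braycurtis")
tree = linkage(dist, method="average")
clusters = fcluster(tree, t=4, criterion="maxclust")
print("genera per cluster:", np.bincount(clusters)[1:])
```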

  5. Evidence for the Selective Reporting of Analyses and Discrepancies in Clinical Trials: A Systematic Review of Cohort Studies of Clinical Trials

    PubMed Central

    Dwan, Kerry; Altman, Douglas G.; Clarke, Mike; Gamble, Carrol; Higgins, Julian P. T.; Sterne, Jonathan A. C.; Williamson, Paula R.; Kirkham, Jamie J.

    2014-01-01

    Background Most publications about selective reporting in clinical trials have focussed on outcomes. However, selective reporting of analyses for a given outcome may also affect the validity of findings. If analyses are selected on the basis of the results, reporting bias may occur. The aims of this study were to review and summarise the evidence from empirical cohort studies that assessed discrepant or selective reporting of analyses in randomised controlled trials (RCTs). Methods and Findings A systematic review was conducted and included cohort studies that assessed any aspect of the reporting of analyses of RCTs by comparing different trial documents, e.g., protocol compared to trial report, or different sections within a trial publication. The Cochrane Methodology Register, Medline (Ovid), PsycInfo (Ovid), and PubMed were searched on 5 February 2014. Two authors independently selected studies, performed data extraction, and assessed the methodological quality of the eligible studies. Twenty-two studies (containing 3,140 RCTs) published between 2000 and 2013 were included. Twenty-two studies reported on discrepancies between information given in different sources. Discrepancies were found in statistical analyses (eight studies), composite outcomes (one study), the handling of missing data (three studies), unadjusted versus adjusted analyses (three studies), handling of continuous data (three studies), and subgroup analyses (12 studies). Discrepancy rates varied, ranging from 7% (3/42) to 88% (7/8) in statistical analyses, 46% (36/79) to 82% (23/28) in adjusted versus unadjusted analyses, and 61% (11/18) to 100% (25/25) in subgroup analyses. This review is limited in that none of the included studies investigated the evidence for bias resulting from selective reporting of analyses. It was not possible to combine studies to provide overall summary estimates, and so the results of studies are discussed narratively. Conclusions Discrepancies in analyses between publications and other study documentation were common, but reasons for these discrepancies were not discussed in the trial reports. To ensure transparency, protocols and statistical analysis plans need to be published, and investigators should adhere to these or explain discrepancies. Please see later in the article for the Editors' Summary PMID:24959719

  6. Research on Visual Analysis Methods of Terrorism Events

    NASA Astrophysics Data System (ADS)

    Guo, Wenyue; Liu, Haiyan; Yu, Anzhu; Li, Jing

    2016-06-01

    Under the situation that terrorism events occur more and more frequently throughout the world, improving the response capability to social security incidents has become an important test of governments' ability to govern. Visual analysis has become an important method of event analysis because of its intuitiveness and effectiveness. To analyse events' spatio-temporal distribution characteristics, the correlations among event items and the development trend, terrorism events' spatio-temporal characteristics are discussed. A suitable event data table structure based on the "5W" theory is designed. Then, six types of visual analysis are proposed, and how to use thematic maps and statistical charts to realize visual analysis of terrorism events is studied. Finally, experiments have been carried out using the data provided by the Global Terrorism Database, and the results of the experiments demonstrate the feasibility of the methods.

  7. The CMS Data Analysis School Experience

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    De Filippis, N.; Bauerdick, L.; Chen, J.

    The CMS Data Analysis School is an official event organized by the CMS Collaboration to teach students and post-docs how to perform a physics analysis. The school is coordinated by the CMS schools committee and was first implemented at the LHC Physics Center at Fermilab in 2010. As part of the training, there are a number of “short” exercises on physics object reconstruction and identification, Monte Carlo simulation, and statistical analysis, which are followed by “long” exercises based on physics analyses. Some of the long exercises go beyond the current state of the art of the corresponding CMS analyses. This paper describes the goals of the school, the preparations for a school, the structure of the training, and student satisfaction with the experience as measured by surveys.

  8. The CMS data analysis school experience

    NASA Astrophysics Data System (ADS)

    De Filippis, N.; Bauerdick, L.; Chen, J.; Gallo, E.; Klima, B.; Malik, S.; Mulders, M.; Palla, F.; Rolandi, G.

    2017-10-01

    The CMS Data Analysis School is an official event organized by the CMS Collaboration to teach students and post-docs how to perform a physics analysis. The school is coordinated by the CMS schools committee and was first implemented at the LHC Physics Center at Fermilab in 2010. As part of the training, there are a number of “short” exercises on physics object reconstruction and identification, Monte Carlo simulation, and statistical analysis, which are followed by “long” exercises based on physics analyses. Some of the long exercises go beyond the current state of the art of the corresponding CMS analyses. This paper describes the goals of the school, the preparations for a school, the structure of the training, and student satisfaction with the experience as measured by surveys.

  9. Vibro-acoustic analysis of composite plates

    NASA Astrophysics Data System (ADS)

    Sarigül, A. S.; Karagözlü, E.

    2014-03-01

    Vibro-acoustic analysis plays a vital role in the design of aircraft, spacecraft, land vehicles and ships produced from thin plates backed by closed cavities, with regard to human health and living comfort. For this type of structure, a coupled solution that takes structural-acoustic interaction into account is required, which is crucial for accurate results. In this study, coupled vibro-acoustic analyses of plates produced from composite materials have been performed by using finite element analysis software. The study has been carried out for E-glass/Epoxy, Kevlar/Epoxy and Carbon/Epoxy plates with different ply angles and numbers of plies. The effects of composite material, ply orientation and number of layers on the coupled vibro-acoustic characteristics of plates have been analysed for various combinations. The analysis results have been statistically examined and assessed.

  10. Applications of Stochastic Analyses for Collaborative Learning and Cognitive Assessment

    DTIC Science & Technology

    2007-04-01

    models (Visser, Maartje, Raijmakers, & Molenaar , 2002). The second part of this paper illustrates two applications of the methods described in the...clustering three-way data sets. Computational Statistics and Data Analysis, 51 (11), 5368–5376. Visser, I., Maartje, E., Raijmakers, E. J., & Molenaar

  11. NEUROBEHAVIORAL EVALUATIONS OF BINARY AND TERTIARY MIXTURES OF CHEMICALS: LESSONS LEARNED.

    EPA Science Inventory

    The classical approach to the statistical analysis of binary chemical mixtures is to construct full dose-response curves for one compound in the presence of a range of doses of the second compound (isobolographic analyses). For interaction studies using more than two chemicals, ...

  12. Validation of contractor HMA testing data in the materials acceptance process - phase II : final report.

    DOT National Transportation Integrated Search

    2016-08-01

    This study conducted an analysis of the SCDOT HMA specification. A Research Steering Committee provided oversight : of the process. The research process included extensive statistical analyses of test data supplied by SCDOT. : A total of 2,789 AC tes...

  13. Descriptive Statistics for Modern Test Score Distributions: Skewness, Kurtosis, Discreteness, and Ceiling Effects.

    PubMed

    Ho, Andrew D; Yu, Carol C

    2015-06-01

    Many statistical analyses benefit from the assumption that unconditional or conditional distributions are continuous and normal. More than 50 years ago in this journal, Lord and Cook chronicled departures from normality in educational tests, and Micceri similarly showed that the normality assumption is met rarely in educational and psychological practice. In this article, the authors extend these previous analyses to state-level educational test score distributions that are an increasingly common target of high-stakes analysis and interpretation. Among 504 scale-score and raw-score distributions from state testing programs from recent years, nonnormal distributions are common and are often associated with particular state programs. The authors explain how scaling procedures from item response theory lead to nonnormal distributions as well as unusual patterns of discreteness. The authors recommend that distributional descriptive statistics be calculated routinely to inform model selection for large-scale test score data, and they illustrate consequences of nonnormality using sensitivity studies that compare baseline results to those from normalized score scales.
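
    As an illustration of the routine distributional descriptives recommended above, the following sketch (assuming Python with numpy and scipy, and simulated scores standing in for real test data) computes skewness, excess kurtosis, a discreteness count, and the share of scores at the ceiling.

      # Minimal sketch: routine distributional descriptives for a score distribution.
      # The simulated integer "scale scores" below are hypothetical stand-ins for real data.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)
      scores = np.clip(rng.normal(70, 15, size=5000).round(), 0, 100)  # discrete, ceiling at 100

      print("skewness:", stats.skew(scores))
      print("excess kurtosis:", stats.kurtosis(scores))          # Fisher definition, 0 for normal
      print("distinct score points:", np.unique(scores).size)    # coarse discreteness check
      print("share at ceiling:", np.mean(scores == scores.max()))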

  14. Effect of filter type on the statistics of energy transfer between resolved and subfilter scales from a-priori analysis of direct numerical simulations of isotropic turbulence

    NASA Astrophysics Data System (ADS)

    Buzzicotti, M.; Linkmann, M.; Aluie, H.; Biferale, L.; Brasseur, J.; Meneveau, C.

    2018-02-01

    The effects of different filtering strategies on the statistical properties of the resolved-to-subfilter scale (SFS) energy transfer are analysed in forced homogeneous and isotropic turbulence. We carry out a-priori analyses of the statistical characteristics of SFS energy transfer by filtering data obtained from direct numerical simulations with up to 2048³ grid points as a function of the filter cutoff scale. In order to quantify the dependence of extreme events and anomalous scaling on the filter, we compare a sharp Fourier Galerkin projector, a Gaussian filter and a novel class of Galerkin projectors with non-sharp spectral filter profiles. Of interest is the importance of Galilean invariance and we confirm that local SFS energy transfer displays intermittency scaling in both skewness and flatness as a function of the cutoff scale. Furthermore, we quantify the robustness of scaling as a function of the filtering type.

  15. SPSS and SAS programs for determining the number of components using parallel analysis and velicer's MAP test.

    PubMed

    O'Connor, B P

    2000-08-01

    Popular statistical software packages do not have the proper procedures for determining the number of components in factor and principal components analyses. Parallel analysis and Velicer's minimum average partial (MAP) test are validated procedures, recommended widely by statisticians. However, many researchers continue to use alternative, simpler, but flawed procedures, such as the eigenvalues-greater-than-one rule. Use of the proper procedures might be increased if these procedures could be conducted within familiar software environments. This paper describes brief and efficient programs for using SPSS and SAS to conduct parallel analyses and the MAP test.
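
    The programs described in this record are written for SPSS and SAS; as an illustration of the parallel-analysis criterion itself, here is a minimal Python sketch (numpy only, with hypothetical data) that retains leading components whose observed eigenvalues exceed the 95th percentile of eigenvalues obtained from random data of the same size.

      # Illustrative sketch of Horn's parallel analysis (not O'Connor's SPSS/SAS programs):
      # retain leading components whose observed eigenvalues exceed the chosen percentile
      # of eigenvalues computed from random data with the same dimensions.
      import numpy as np

      def parallel_analysis(X, n_sims=500, percentile=95, seed=0):
          rng = np.random.default_rng(seed)
          n, p = X.shape
          obs = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]     # descending order
          rand = np.empty((n_sims, p))
          for i in range(n_sims):
              R = rng.standard_normal((n, p))
              rand[i] = np.linalg.eigvalsh(np.corrcoef(R, rowvar=False))[::-1]
          thresh = np.percentile(rand, percentile, axis=0)
          k = 0
          while k < p and obs[k] > thresh[k]:                              # sequential retention rule
              k += 1
          return k, obs, thresh

      X = np.random.default_rng(1).standard_normal((300, 10))              # hypothetical data
      k, obs, thresh = parallel_analysis(X)
      print("components to retain:", k)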

  16. Use of multiple cluster analysis methods to explore the validity of a community outcomes concept map.

    PubMed

    Orsi, Rebecca

    2017-02-01

    Concept mapping is now a commonly-used technique for articulating and evaluating programmatic outcomes. However, research regarding validity of knowledge and outcomes produced with concept mapping is sparse. The current study describes quantitative validity analyses using a concept mapping dataset. We sought to increase the validity of concept mapping evaluation results by running multiple cluster analysis methods and then using several metrics to choose from among solutions. We present four different clustering methods based on analyses using the R statistical software package: partitioning around medoids (PAM), fuzzy analysis (FANNY), agglomerative nesting (AGNES) and divisive analysis (DIANA). We then used the Dunn and Davies-Bouldin indices to assist in choosing a valid cluster solution for a concept mapping outcomes evaluation. We conclude that the validity of the outcomes map is high, based on the analyses described. Finally, we discuss areas for further concept mapping methods research. Copyright © 2016 Elsevier Ltd. All rights reserved.
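
    A minimal sketch of the validity-index step described above, assuming Python with scikit-learn and synthetic two-dimensional data: several candidate cluster solutions are compared with the Davies-Bouldin index. The R methods used in the study (PAM, FANNY, AGNES, DIANA) are not all available in scikit-learn, so agglomerative clustering stands in for AGNES here.

      # Sketch of the validity-index idea: fit several candidate cluster solutions and
      # compare them with the Davies-Bouldin index (lower is better).
      import numpy as np
      from sklearn.cluster import AgglomerativeClustering
      from sklearn.metrics import davies_bouldin_score

      rng = np.random.default_rng(0)
      X = np.vstack([rng.normal(loc, 0.5, size=(50, 2)) for loc in ([0, 0], [3, 3], [0, 4])])

      for k in range(2, 7):
          labels = AgglomerativeClustering(n_clusters=k, linkage="average").fit_predict(X)
          print(k, "clusters -> Davies-Bouldin:", round(davies_bouldin_score(X, labels), 3))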

  17. Learning investment indicators through data extension

    NASA Astrophysics Data System (ADS)

    Dvořák, Marek

    2017-07-01

    Stock prices in the form of time series were analysed using univariate and multivariate statistical methods. After simple data preprocessing in the form of logarithmic differences, we augmented this univariate time series to a multivariate representation. This method makes use of sliding windows to calculate several dozen new variables using simple statistical tools, such as first and second moments, as well as more complicated statistics, such as autoregression coefficients and residual analysis, followed by an optional quadratic transformation used for further data extension. These were used as explanatory variables in a regularized logistic LASSO regression that estimated the Buy-Sell Index (BSI) from real stock market data.
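
    A hedged sketch of the pipeline described above, in Python with scikit-learn: log-differences of a simulated price series, sliding-window summary features, and an L1-penalized (LASSO) logistic regression. The window length, feature set, and the next-step "buy" label are illustrative assumptions, not the paper's exact specification.

      # Sketch: log-returns, sliding-window features, and L1-penalized logistic regression.
      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(0)
      price = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1000)))   # simulated price path
      r = np.diff(np.log(price))                                   # log-returns

      w = 20
      X, y = [], []
      for t in range(w, len(r) - 1):
          win = r[t - w:t]
          lag1 = ((win[:-1] - win[:-1].mean()) * (win[1:] - win[1:].mean())).mean() / (win.var() + 1e-12)
          X.append([win.mean(), win.std(), lag1])                  # simple window summaries
          y.append(int(r[t + 1] > 0))                              # proxy "buy" label for the next step
      X, y = np.array(X), np.array(y)

      clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
      print("nonzero coefficients:", int(np.sum(clf.coef_ != 0)))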

  18. An ANOVA approach for statistical comparisons of brain networks.

    PubMed

    Fraiman, Daniel; Fraiman, Ricardo

    2018-03-16

    The study of brain networks has developed extensively over the last couple of decades. By contrast, techniques for the statistical analysis of these networks are less developed. In this paper, we focus on the statistical comparison of brain networks in a nonparametric framework and discuss the associated detection and identification problems. We tested network differences between groups with an analysis of variance (ANOVA) test we developed specifically for networks. We also propose and analyse the behaviour of a new statistical procedure designed to identify different subnetworks. As an example, we show the application of this tool in resting-state fMRI data obtained from the Human Connectome Project. We identify, among other variables, that the amount of sleep in the days before the scan is a relevant variable that must be controlled. Finally, we discuss the potential bias in neuroimaging findings that is generated by some behavioural and brain structure variables. Our method can also be applied to other kinds of networks, such as protein interaction networks, gene networks or social networks.

  19. Waveform classification and statistical analysis of seismic precursors to the July 2008 Vulcanian Eruption of Soufrière Hills Volcano, Montserrat

    NASA Astrophysics Data System (ADS)

    Rodgers, Mel; Smith, Patrick; Pyle, David; Mather, Tamsin

    2016-04-01

    Understanding the transition between quiescence and eruption at dome-forming volcanoes, such as Soufrière Hills Volcano (SHV), Montserrat, is important for monitoring volcanic activity during long-lived eruptions. Statistical analysis of seismic events (e.g. spectral analysis and identification of multiplets via cross-correlation) can be useful for characterising seismicity patterns and can be a powerful tool for analysing temporal changes in behaviour. Waveform classification is crucial for volcano monitoring, but consistent classification, both during real-time analysis and for retrospective analysis of previous volcanic activity, remains a challenge. Automated classification allows consistent re-classification of events. We present a machine learning (random forest) approach to rapidly classify waveforms that requires minimal training data. We analyse the seismic precursors to the July 2008 Vulcanian explosion at SHV and show systematic changes in frequency content and multiplet behaviour that had not previously been recognised. These precursory patterns of seismicity may be interpreted as changes in pressure conditions within the conduit during magma ascent and could be linked to magma flow rates. Frequency analysis of the different waveform classes supports the growing consensus that LP and Hybrid events should be considered end members of a continuum of low-frequency source processes. By using both supervised and unsupervised machine-learning methods we investigate the nature of waveform classification and assess current classification schemes.
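
    As a sketch of the random-forest classification step, the following Python code (scipy and scikit-learn, with synthetic waveforms) summarizes each event by a few spectral features and cross-validates a RandomForestClassifier. The feature set and the two synthetic event classes are assumptions for illustration only.

      # Sketch of a random-forest waveform classifier on simple spectral features.
      import numpy as np
      from scipy.signal import welch
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(0)
      fs, n = 100.0, 1024

      def make_event(f0):
          t = np.arange(n) / fs
          return np.sin(2 * np.pi * f0 * t) * np.exp(-2 * t) + 0.3 * rng.standard_normal(n)

      waves = [make_event(2.0) for _ in range(60)] + [make_event(6.0) for _ in range(60)]
      labels = np.array([0] * 60 + [1] * 60)

      def features(x):
          f, p = welch(x, fs=fs, nperseg=256)
          p = p / p.sum()
          return [np.sum(f * p),                    # spectral centroid
                  f[np.argmax(p)],                  # dominant frequency
                  -np.sum(p * np.log(p + 1e-12))]   # spectral entropy

      X = np.array([features(w) for w in waves])
      rf = RandomForestClassifier(n_estimators=200, random_state=0)
      print("CV accuracy:", cross_val_score(rf, X, labels, cv=5).mean())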

  20. Web-TCGA: an online platform for integrated analysis of molecular cancer data sets.

    PubMed

    Deng, Mario; Brägelmann, Johannes; Schultze, Joachim L; Perner, Sven

    2016-02-06

    The Cancer Genome Atlas (TCGA) is a pool of molecular data sets publicly accessible and freely available to cancer researchers anywhere around the world. However, widespread use is limited since an advanced knowledge of statistics and statistical software is required. In order to improve accessibility we created Web-TCGA, a web-based, freely accessible online tool, which can also be run in a private instance, for integrated analysis of molecular cancer data sets provided by TCGA. In contrast to already available tools, Web-TCGA utilizes different methods for analysis and visualization of TCGA data, allowing users to generate global molecular profiles across different cancer entities simultaneously. In addition to global molecular profiles, Web-TCGA offers highly detailed gene and tumor entity centric analysis by providing interactive tables and views. As a supplement to other already available tools, such as cBioPortal (Sci Signal 6:pl1, 2013, Cancer Discov 2:401-4, 2012), Web-TCGA offers an analysis service, which does not require any installation or configuration, for molecular data sets available at the TCGA. Individual processing requests (queries) are generated by the user for mutation, methylation, expression and copy number variation (CNV) analyses. The user can focus analyses on results from single genes and cancer entities or perform a global analysis (multiple cancer entities and genes simultaneously).

  1. Using structural equation modeling for network meta-analysis.

    PubMed

    Tu, Yu-Kang; Wu, Yun-Chun

    2017-07-14

    Network meta-analysis overcomes the limitations of traditional pair-wise meta-analysis by incorporating all available evidence into a general statistical framework for simultaneous comparisons of several treatments. Currently, network meta-analyses are undertaken either within Bayesian hierarchical linear models or frequentist generalized linear mixed models. Structural equation modeling (SEM) is a statistical method originally developed for modeling causal relations among observed and latent variables. As the random effect is explicitly modeled as a latent variable in SEM, it is very flexible for analysts to specify complex random effect structures and to impose linear and nonlinear constraints on parameters. The aim of this article is to show how to undertake a network meta-analysis within the statistical framework of SEM. We used an example dataset to demonstrate that the standard fixed and random effect network meta-analysis models can be easily implemented in SEM. It contains results of 26 studies that directly compared three treatment groups A, B and C for prevention of first bleeding in patients with liver cirrhosis. We also showed that a new approach to network meta-analysis based on the unrestricted weighted least squares (UWLS) method can likewise be undertaken using SEM. For both the fixed and random effect network meta-analyses, SEM yielded coefficients and confidence intervals similar to those reported in the previous literature. The point estimates of the two UWLS models were identical to those in the fixed effect model, but the confidence intervals were wider. This is consistent with results from the traditional pairwise meta-analyses. Compared to the UWLS model with a common variance adjusted factor, the UWLS model with a unique variance adjusted factor has wider confidence intervals when the heterogeneity was larger in the pairwise comparison. The UWLS model with a unique variance adjusted factor reflects the difference in heterogeneity within each comparison. SEM provides a very flexible framework for univariate and multivariate meta-analysis, and its potential as a powerful tool for advanced meta-analysis is still to be explored.

  2. Identification of the isomers using principal component analysis (PCA) method

    NASA Astrophysics Data System (ADS)

    Kepceoǧlu, Abdullah; Gündoǧdu, Yasemin; Ledingham, Kenneth William David; Kilic, Hamdi Sukur

    2016-03-01

    In this work, we have carried out a detailed statistical analysis of experimental mass spectra from xylene isomers. Principal Component Analysis (PCA) was used to identify the isomers, which cannot be distinguished using conventional statistical methods for interpreting their mass spectra. Experiments were carried out using a linear TOF-MS coupled to a femtosecond laser system as the energy source for the ionisation processes. The collected data have been analysed and interpreted using PCA as a multivariate analysis of these spectra. This demonstrates the strength of the method for distinguishing isomers that cannot be identified through conventional mass analysis of dissociative ionisation processes on these molecules. PCA results depending on the laser pulse energy and the background pressure in the spectrometer are presented in this work.
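
    A minimal sketch of the PCA step, assuming Python with scikit-learn and simulated spectra standing in for the TOF-MS data: intensity vectors are standardized and projected onto the first two principal components to check whether isomer groups separate.

      # Sketch: PCA projection of mass-spectral intensity vectors (simulated stand-ins).
      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.preprocessing import StandardScaler

      rng = np.random.default_rng(0)
      n_channels = 200
      base = rng.random(n_channels)
      spectra, isomer = [], []
      for label, shift in enumerate([0.0, 0.05, -0.05]):       # three "isomers"
          for _ in range(30):
              spectra.append(base * (1 + shift) + 0.02 * rng.standard_normal(n_channels))
              isomer.append(label)

      X = StandardScaler().fit_transform(np.array(spectra))
      pca = PCA(n_components=2)
      scores = pca.fit_transform(X)
      print("explained variance ratio:", pca.explained_variance_ratio_)
      print("class means on PC1:", [scores[np.array(isomer) == k, 0].mean() for k in range(3)])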

  3. Automatically visualise and analyse data on pathways using PathVisioRPC from any programming environment.

    PubMed

    Bohler, Anwesha; Eijssen, Lars M T; van Iersel, Martijn P; Leemans, Christ; Willighagen, Egon L; Kutmon, Martina; Jaillard, Magali; Evelo, Chris T

    2015-08-23

    Biological pathways are descriptive diagrams of biological processes widely used for functional analysis of differentially expressed genes or proteins. Primary data analysis, such as quality control, normalisation, and statistical analysis, is often performed in scripting languages like R, Perl, and Python. Subsequent pathway analysis is usually performed using dedicated external applications. Workflows involving manual use of multiple environments are time consuming and error prone. Therefore, tools are needed that enable pathway analysis directly within the same scripting languages used for primary data analyses. Existing tools have limited capability in terms of available pathway content, pathway editing and visualisation options, and export file formats. Consequently, making the full-fledged pathway analysis tool PathVisio available from various scripting languages will benefit researchers. We developed PathVisioRPC, an XMLRPC interface for the pathway analysis software PathVisio. PathVisioRPC enables creating and editing biological pathways, visualising data on pathways, performing pathway statistics, and exporting results in several image formats in multiple programming environments. We demonstrate PathVisioRPC functionalities using examples in Python. Subsequently, we analyse a publicly available NCBI GEO gene expression dataset studying tumour bearing mice treated with cyclophosphamide in R. The R scripts demonstrate how calls to existing R packages for data processing and calls to PathVisioRPC can directly work together. To further support R users, we have created RPathVisio simplifying the use of PathVisioRPC in this environment. We have also created a pathway module for the microarray data analysis portal ArrayAnalysis.org that calls the PathVisioRPC interface to perform pathway analysis. This module allows users to use PathVisio functionality online without having to download and install the software and exemplifies how the PathVisioRPC interface can be used by data analysis pipelines for functional analysis of processed genomics data. PathVisioRPC enables data visualisation and pathway analysis directly from within various analytical environments used for preliminary analyses. It supports the use of existing pathways from WikiPathways or pathways created using the RPC itself. It also enables automation of tasks performed using PathVisio, making it useful to PathVisio users performing repeated visualisation and analysis tasks. PathVisioRPC is freely available for academic and commercial use at http://projects.bigcat.unimaas.nl/pathvisiorpc.

  4. Investigation of continuous effect modifiers in a meta-analysis on higher versus lower PEEP in patients requiring mechanical ventilation--protocol of the ICEM study.

    PubMed

    Kasenda, Benjamin; Sauerbrei, Willi; Royston, Patrick; Briel, Matthias

    2014-05-20

    Categorizing an inherently continuous predictor in prognostic analyses raises several critical methodological issues: dependence of the statistical significance on the number and position of the chosen cut-point(s), loss of statistical power, and faulty interpretation of the results if a non-linear association is incorrectly assumed to be linear. This also applies to a therapeutic context where investigators of randomized clinical trials (RCTs) are interested in interactions between treatment assignment and one or more continuous predictors. Our goal is to apply the multivariable fractional polynomial interaction (MFPI) approach to investigate interactions between continuous patient baseline variables and the allocated treatment in an individual patient data meta-analysis of three RCTs (N = 2,299) from the intensive care field. For each study, MFPI will provide a continuous treatment effect function. Functions from each of the three studies will be averaged by a novel meta-analysis approach for functions. We will plot treatment effect functions separately for each study and also the averaged function. The averaged function with a related confidence interval will provide a suitable basis to assess whether a continuous patient characteristic modifies the treatment comparison and may be relevant for clinical decision-making. The compared interventions will be a higher or lower positive end-expiratory pressure (PEEP) ventilation strategy in patients requiring mechanical ventilation. The continuous baseline variables body mass index, PaO2/FiO2, respiratory compliance, and oxygenation index will be the investigated potential effect modifiers. Clinical outcomes for this analysis will be in-hospital mortality, time to death, time to unassisted breathing, and pneumothorax. This project will be the first meta-analysis to combine continuous treatment effect functions derived by the MFPI procedure separately in each of several RCTs. Such an approach requires individual patient data (IPD). They are available from an earlier IPD meta-analysis using different methods for analysis. This new analysis strategy allows assessing whether treatment effects interact with continuous baseline patient characteristics and avoids categorization-based subgroup analyses. These interaction analyses of the present study will be exploratory in nature. However, they may help to foster future research using the MFPI approach to improve interaction analyses of continuous predictors in RCTs and IPD meta-analyses. This study is registered in PROSPERO (CRD42012003129).

  5. The influence of control group reproduction on the statistical ...

    EPA Pesticide Factsheets

    Because of various Congressional mandates to protect the environment from endocrine disrupting chemicals (EDCs), the United States Environmental Protection Agency (USEPA) initiated the Endocrine Disruptor Screening Program. In the context of this framework, the Office of Research and Development within the USEPA developed the Medaka Extended One Generation Reproduction Test (MEOGRT) to characterize the endocrine action of a suspected EDC. One important endpoint of the MEOGRT is fecundity of breeding pairs of medaka. Power analyses were conducted to determine the number of replicates needed in proposed test designs and to determine the effects that varying reproductive parameters (e.g. mean fecundity, variance, and days with no egg production) will have on the statistical power of the test. A software tool, the MEOGRT Reproduction Power Analysis Tool, was developed to expedite these power analyses by both calculating estimates of the needed reproductive parameters (e.g. population mean and variance) and performing the power analysis under user specified scenarios. The manuscript illustrates how the reproductive performance of the control medaka that are used in a MEOGRT influence statistical power, and therefore the successful implementation of the protocol. Example scenarios, based upon medaka reproduction data collected at MED, are discussed that bolster the recommendation that facilities planning to implement the MEOGRT should have a culture of medaka with hi
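
    The following is a simulation-based power sketch in the spirit of the tool described above, not the MEOGRT Reproduction Power Analysis Tool itself; the negative-binomial fecundity model, replicate number, effect size, and choice of test are illustrative assumptions (Python with numpy and scipy).

      # Sketch: simulation-based power for a fecundity (count) endpoint under assumed parameters.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)

      def power(n_rep=12, control_mean=25.0, effect=0.8, dispersion=5.0, n_sims=2000, alpha=0.05):
          # numpy's negative_binomial(n, p) has mean n*(1-p)/p, so p = n / (n + mean)
          def draw(mean):
              p = dispersion / (dispersion + mean)
              return rng.negative_binomial(dispersion, p, size=n_rep)
          hits = 0
          for _ in range(n_sims):
              control, treated = draw(control_mean), draw(control_mean * effect)
              if stats.mannwhitneyu(control, treated, alternative="two-sided").pvalue < alpha:
                  hits += 1
          return hits / n_sims

      print("estimated power:", power())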

  6. Bayesian approach for counting experiment statistics applied to a neutrino point source analysis

    NASA Astrophysics Data System (ADS)

    Bose, D.; Brayeur, L.; Casier, M.; de Vries, K. D.; Golup, G.; van Eijndhoven, N.

    2013-12-01

    In this paper we present a model independent analysis method following Bayesian statistics to analyse data from a generic counting experiment and apply it to the search for neutrinos from point sources. We discuss a test statistic defined following a Bayesian framework that will be used in the search for a signal. In case no signal is found, we derive an upper limit without the introduction of approximations. The Bayesian approach allows us to obtain the full probability density function for both the background and the signal rate. As such, we have direct access to any signal upper limit. The upper limit derivation directly compares with a frequentist approach and is robust in the case of low-counting observations. Furthermore, it also allows accounting for previous upper limits obtained by other analyses via the concept of prior information, without the need for ad hoc application of trial factors. To investigate the validity of the presented Bayesian approach, we have applied this method to the public IceCube 40-string configuration data for 10 nearby blazars and we have obtained a flux upper limit, which is in agreement with the upper limits determined via a frequentist approach. Furthermore, the upper limit obtained compares well with the previously published result of IceCube, using the same data set.
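
    A numerical sketch of the counting-experiment idea, assuming Python with numpy and scipy: with n observed events and a known expected background, the posterior of the signal rate under a flat prior is evaluated on a grid and a 90% credible upper limit is read off. The numbers and the flat prior are illustrative, not the analysis in the paper.

      # Sketch: Poisson counting experiment, grid posterior for the signal rate, 90% upper limit.
      import numpy as np
      from scipy import stats

      n_obs, background = 4, 3.2
      s_grid = np.linspace(0, 30, 3001)

      likelihood = stats.poisson.pmf(n_obs, mu=s_grid + background)
      posterior = likelihood / np.trapz(likelihood, s_grid)       # flat prior on s >= 0

      cdf = np.cumsum(posterior) * (s_grid[1] - s_grid[0])
      upper_90 = s_grid[np.searchsorted(cdf, 0.90)]
      print("90% credible upper limit on signal rate:", round(upper_90, 2))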

  7. The mediating effect of calling on the relationship between medical school students' academic burnout and empathy.

    PubMed

    Chae, Su Jin; Jeong, So Mi; Chung, Yoon-Sok

    2017-09-01

    This study is aimed at identifying the relationships between medical school students' academic burnout, empathy, and calling, and determining whether their calling has a mediating effect on the relationship between academic burnout and empathy. A mixed method study was conducted. One hundred twenty-seven medical students completed a survey. Scales measuring academic burnout, medical students' empathy, and calling were utilized. For statistical analysis, correlation analysis, descriptive statistics analysis, and hierarchical multiple regression analyses were conducted. For qualitative approach, eight medical students participated in a focus group interview. The study found that empathy has a statistically significant, negative correlation with academic burnout, while having a significant, positive correlation with calling. Sense of calling proved to be an effective mediator of the relationship between academic burnout and empathy. This result demonstrates that calling is a key variable that mediates the relationship between medical students' academic burnout and empathy. As such, this study provides baseline data for an education that could improve medical students' empathy skills.
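
    As an illustration of the mediation logic tested above, here is a regression-based sketch in Python with statsmodels (Baron-and-Kenny-style steps on simulated data); the variable names and simulated effects are hypothetical, and the study itself used hierarchical multiple regression on survey scales.

      # Sketch of a regression-based mediation check: does the burnout-empathy effect
      # shrink once calling is added to the model? All data here are simulated.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(0)
      n = 127
      burnout = rng.normal(0, 1, n)
      calling = -0.5 * burnout + rng.normal(0, 1, n)                   # path a
      empathy = 0.6 * calling - 0.1 * burnout + rng.normal(0, 1, n)    # paths b and c'

      total = sm.OLS(empathy, sm.add_constant(burnout)).fit()
      full = sm.OLS(empathy, sm.add_constant(np.column_stack([burnout, calling]))).fit()
      print("total effect c:", round(total.params[1], 3))
      print("direct effect c' (controlling for calling):", round(full.params[1], 3))
      print("mediator effect b:", round(full.params[2], 3))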

  8. Comparative effectiveness research methodology using secondary data: A starting user's guide.

    PubMed

    Sun, Maxine; Lipsitz, Stuart R

    2018-04-01

    The use of secondary data, such as claims or administrative data, in comparative effectiveness research has grown tremendously in recent years. We believe that the current review can help investigators relying on secondary data to (1) gain insight into both the methodologies and statistical methods, (2) better understand the necessity of a rigorous planning before initiating a comparative effectiveness investigation, and (3) optimize the quality of their investigations. Specifically, we review concepts of adjusted analyses and confounders, methods of propensity score analyses, and instrumental variable analyses, risk prediction models (logistic and time-to-event), decision-curve analysis, as well as the interpretation of the P value and hypothesis testing. Overall, we hope that the current review article can help research investigators relying on secondary data to perform comparative effectiveness research better understand the necessity of a rigorous planning before study start, and gain better insight in the choice of statistical methods so as to optimize the quality of the research study. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. Performance simulation in high altitude platforms (HAPs) communications systems

    NASA Astrophysics Data System (ADS)

    Ulloa-Vásquez, Fernando; Delgado-Penin, J. A.

    2002-07-01

    This paper considers the analysis by simulation of a digital narrowband communication system for a scenario consisting of a High-Altitude aeronautical Platform (HAP) and fixed/mobile terrestrial transceivers. The aeronautical channel is modelled using geometrical (angle of elevation vs. horizontal distance of the terrestrial reflectors) and statistical arguments, and under these circumstances a serial concatenated coded digital transmission is analysed under several hypotheses related to radio-electric coverage areas. The results indicate good feasibility for the proposed communication system.

  10. Meta-analysis inside and outside particle physics: two traditions that should converge?

    PubMed

    Baker, Rose D; Jackson, Dan

    2013-06-01

    The use of meta-analysis in medicine and epidemiology really took off in the 1970s. However, in high-energy physics, the Particle Data Group has been carrying out meta-analyses of measurements of particle masses and other properties since 1957. Curiously, there has been virtually no interaction between those working inside and outside particle physics. In this paper, we use statistical models to study two major differences in practice. The first is the usefulness of systematic errors, which physicists are now beginning to quote in addition to statistical errors. The second is whether it is better to treat heterogeneity by scaling up errors as do the Particle Data Group or by adding a random effect as does the rest of the community. Besides fitting models, we derive and use an exact test of the error-scaling hypothesis. We also discuss the other methodological differences between the two streams of meta-analysis. Our conclusion is that systematic errors are not currently very useful and that the conventional random effects model, as routinely used in meta-analysis, has a useful role to play in particle physics. The moral we draw for statisticians is that we should be more willing to explore 'grassroots' areas of statistical application, so that good statistical practice can flow both from and back to the statistical mainstream. Copyright © 2012 John Wiley & Sons, Ltd.

  11. Statistics and bioinformatics in nutritional sciences: analysis of complex data in the era of systems biology⋆

    PubMed Central

    Fu, Wenjiang J.; Stromberg, Arnold J.; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao

    2009-01-01

    Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral role in advancing nutritional sciences. Currently, in the era of systems biology, statistics has become an increasingly important tool to quantitatively analyze information about biological macromolecules. This article describes general terms used in statistical analysis of large, complex experimental data. These terms include experimental design, power analysis, sample size calculation, and experimental errors (type I and II errors) for nutritional studies at population, tissue, cellular, and molecular levels. In addition, we highlighted various sources of experimental variations in studies involving microarray gene expression, real-time polymerase chain reaction, proteomics, and other bioinformatics technologies. Moreover, we provided guidelines for nutritionists and other biomedical scientists to plan and conduct studies and to analyze the complex data. Appropriate statistical analyses are expected to make an important contribution to solving major nutrition-associated problems in humans and animals (including obesity, diabetes, cardiovascular disease, cancer, ageing, and intrauterine fetal retardation). PMID:20233650

  12. Assessment of water quality parameters using multivariate analysis for Klang River basin, Malaysia.

    PubMed

    Mohamed, Ibrahim; Othman, Faridah; Ibrahim, Adriana I N; Alaa-Eldin, M E; Yunus, Rossita M

    2015-01-01

    This case study uses several univariate and multivariate statistical techniques to evaluate and interpret a water quality data set obtained from the Klang River basin located within the state of Selangor and the Federal Territory of Kuala Lumpur, Malaysia. The river drains an area of 1,288 km(2), from the steep mountain rainforests of the main Central Range along Peninsular Malaysia to the river mouth in Port Klang, into the Straits of Malacca. Water quality was monitored at 20 stations, nine of which are situated along the main river and 11 along six tributaries. Data was collected from 1997 to 2007 for seven parameters used to evaluate the status of the water quality, namely dissolved oxygen, biochemical oxygen demand, chemical oxygen demand, suspended solids, ammoniacal nitrogen, pH, and temperature. The data were first investigated using descriptive statistical tools, followed by two practical multivariate analyses that reduced the data dimensions for better interpretation. The analyses employed were factor analysis and principal component analysis, which explain 60 and 81.6% of the total variation in the data, respectively. We found that the resulting latent variables from the factor analysis are interpretable and beneficial for describing the water quality in the Klang River. This study presents the usefulness of several statistical methods in evaluating and interpreting water quality data for the purpose of monitoring the effectiveness of water resource management. The results should provide more straightforward data interpretation as well as valuable insight for managers to conceive optimum action plans for controlling pollution in river water.

  13. Modeling Human-Computer Decision Making with Covariance Structure Analysis.

    ERIC Educational Resources Information Center

    Coovert, Michael D.; And Others

    Arguing that sufficient theory exists about the interplay between human information processing, computer systems, and the demands of various tasks to construct useful theories of human-computer interaction, this study presents a structural model of human-computer interaction and reports the results of various statistical analyses of this model.…

  14. Statistical Analysis of PDF's for Na Released by Photons from Solid Surfaces

    NASA Astrophysics Data System (ADS)

    Gamborino, D.; Wurz, P.

    2018-05-01

    We analyse the adequacy of three model speed PDF's previously used to describe the desorption of Na from a solid surface either by ESD or PSD. We found that the Maxwell PDF is too wide compared to measurements and non-thermal PDF's are better suited.

  15. Spatial variability effects on precision and power of forage yield estimation

    USDA-ARS?s Scientific Manuscript database

    Spatial analyses of yield trials are important, as they adjust cultivar means for spatial variation and improve the statistical precision of yield estimation. While the relative efficiency of spatial analysis has been frequently reported in several yield trials, its application on long-term forage y...

  16. METHODS OF DEALING WITH VALUES BELOW THE LIMIT OF DETECTION USING SAS

    EPA Science Inventory

    Due to limitations of chemical analysis procedures, small values cannot be precisely measured. These values are said to be below the limit of detection (LOD). In statistical analyses, these values are often censored and substituted with a constant value, such as half the LOD,...
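
    A small sketch contrasting substitution with a likelihood-based treatment of censored values, assuming Python with scipy and a lognormal data-generating model chosen for illustration: values below the LOD contribute the distribution's CDF at the LOD to a censored maximum-likelihood fit.

      # Sketch: LOD/2 substitution versus a left-censored lognormal maximum-likelihood fit.
      import numpy as np
      from scipy import stats, optimize

      rng = np.random.default_rng(0)
      true = rng.lognormal(mean=0.0, sigma=1.0, size=200)
      lod = 0.8
      observed = np.where(true < lod, np.nan, true)           # values below LOD are censored
      detected = observed[~np.isnan(observed)]
      n_cens = int(np.isnan(observed).sum())

      sub = np.where(np.isnan(observed), lod / 2, observed)   # naive substitution at LOD/2
      print("substitution mean:", sub.mean())

      def negloglik(theta):
          mu, log_sigma = theta
          sigma = np.exp(log_sigma)
          ll_det = stats.lognorm.logpdf(detected, s=sigma, scale=np.exp(mu)).sum()
          ll_cen = n_cens * stats.lognorm.logcdf(lod, s=sigma, scale=np.exp(mu))
          return -(ll_det + ll_cen)

      res = optimize.minimize(negloglik, x0=[0.0, 0.0])
      mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
      print("censored-MLE mean:", np.exp(mu_hat + sigma_hat**2 / 2))   # lognormal mean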

  17. Tactics of Interventions: Student Mobility and Human Capital Building in Singapore

    ERIC Educational Resources Information Center

    Koh, Aaron

    2012-01-01

    Hitherto, research on transnational higher education student mobility tended to narrowly present hard statistics on student mobility, analysing these in terms of "trends" and the implication this has on policy and internationalizing strategies. What is missing from this "big picture" is a close-up analysis of the micropolitics…

  18. Detecting most influencing courses on students grades using block PCA

    NASA Astrophysics Data System (ADS)

    Othman, Osama H.; Gebril, Rami Salah

    2014-12-01

    One of the modern solutions adopted for dealing with the problem of a large number of variables in statistical analyses is Block Principal Component Analysis (Block PCA). This modified technique can be used to reduce the vertical dimension (variables) of the data matrix Xn×p by selecting a smaller number of variables (say m) containing most of the statistical information. These selected variables can then be employed in further investigations and analyses. Block PCA is an adapted multistage technique derived from the original PCA. It involves the application of Cluster Analysis (CA) and variable selection through sub-principal component scores (PCs). The application of Block PCA in this paper is a modified version of the original work of Liu et al (2002). The main objective was to apply PCA to each group of variables (established using cluster analysis), instead of involving the whole large pack of variables, which was shown to be unreliable. In this work, Block PCA is used to reduce the size of a huge data matrix ((n = 41) × (p = 251)) consisting of the Grade Point Averages (GPA) of students in 251 courses (variables) in the Faculty of Science at Benghazi University. In other words, we construct a smaller analytical data matrix of the students' GPAs with fewer variables that contain most of the variation (statistical information) in the original database. By applying Block PCA, 12 courses were found to 'absorb' most of the variation or influence in the original data matrix, and hence are worth keeping for future statistical exploration and analytical studies. In addition, the course Independent Study (Math.) was found to be the most influential course on students' GPA among the 12 selected courses.
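
    A simplified sketch of the Block PCA idea described above, in Python with scipy and scikit-learn: variables are grouped by clustering their correlation pattern, PCA is run within each block, and the variable most aligned with each block's first principal component is kept. This is an illustrative reduction on simulated data, not the authors' exact procedure.

      # Simplified Block PCA sketch: cluster variables, PCA within each block, keep one
      # representative variable per block.
      import numpy as np
      from scipy.cluster.hierarchy import linkage, fcluster
      from scipy.spatial.distance import squareform
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(0)
      X = rng.standard_normal((41, 30))                 # hypothetical students-by-courses matrix
      X[:, 10:20] += X[:, [0]]                          # induce a correlated block of variables

      corr = np.corrcoef(X, rowvar=False)
      dist = squareform(1 - np.abs(corr), checks=False) # distance = 1 - |correlation|
      blocks = fcluster(linkage(dist, method="average"), t=4, criterion="maxclust")

      selected = []
      for b in np.unique(blocks):
          idx = np.where(blocks == b)[0]
          pc1 = PCA(n_components=1).fit_transform(X[:, idx]).ravel()
          loadings = [abs(np.corrcoef(X[:, j], pc1)[0, 1]) for j in idx]
          selected.append(int(idx[int(np.argmax(loadings))]))
      print("representative variables:", sorted(selected))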

  19. Systematic survey of the design, statistical analysis, and reporting of studies published in the 2008 volume of the Journal of Cerebral Blood Flow and Metabolism.

    PubMed

    Vesterinen, Hanna M; Egan, Kieren; Deister, Amelie; Schlattmann, Peter; Macleod, Malcolm R; Dirnagl, Ulrich

    2011-04-01

    Translating experimental findings into clinically effective therapies is one of the major bottlenecks of modern medicine. As this has been particularly true for cerebrovascular research, attention has turned to the quality and validity of experimental cerebrovascular studies. We set out to assess the study design, statistical analyses, and reporting of cerebrovascular research. We assessed all original articles published in the Journal of Cerebral Blood Flow and Metabolism during the year 2008 against a checklist designed to capture the key attributes relating to study design, statistical analyses, and reporting. A total of 156 original publications were included (animal, in vitro, human). Few studies reported a primary research hypothesis, statement of purpose, or measures to safeguard internal validity (such as randomization, blinding, exclusion or inclusion criteria). Many studies lacked sufficient information regarding methods and results to form a reasonable judgment about their validity. In nearly 20% of studies, statistical tests were either not appropriate or information to allow assessment of appropriateness was lacking. This study identifies a number of factors that should be addressed if the quality of research in basic and translational biomedicine is to be improved. We support the widespread implementation of the ARRIVE (Animal Research Reporting In Vivo Experiments) statement for the reporting of experimental studies in biomedicine, for improving training in proper study design and analysis, and that reviewers and editors adopt a more constructively critical approach in the assessment of manuscripts for publication.

  20. Experimental design matters for statistical analysis: how to handle blocking.

    PubMed

    Jensen, Signe M; Schaarschmidt, Frank; Onofri, Andrea; Ritz, Christian

    2018-03-01

    Nowadays, evaluation of the effects of pesticides often relies on experimental designs that involve multiple concentrations of the pesticide of interest or multiple pesticides at specific comparable concentrations and, possibly, secondary factors of interest. Unfortunately, the experimental design is often more or less neglected when analysing data. Two data examples were analysed using different modelling strategies. First, in a randomized complete block design, mean heights of maize treated with a herbicide and one of several adjuvants were compared. Second, translocation of an insecticide applied to maize as a seed treatment was evaluated using incomplete data from an unbalanced design with several layers of hierarchical sampling. Extensive simulations were carried out to further substantiate the effects of different modelling strategies. It was shown that results from suboptimal approaches (two-sample t-tests and ordinary ANOVA assuming independent observations) may be both quantitatively and qualitatively different from the results obtained using an appropriate linear mixed model. The simulations demonstrated that the different approaches may lead to differences in coverage percentages of confidence intervals and type 1 error rates, confirming that misleading conclusions can easily happen when an inappropriate statistical approach is chosen. To ensure that experimental data are summarized appropriately, avoiding misleading conclusions, the experimental design should duly be reflected in the choice of statistical approaches and models. We recommend that author guidelines should explicitly point out that authors need to indicate how the statistical analysis reflects the experimental design. © 2017 Society of Chemical Industry.
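
    A minimal sketch of the paper's central point, assuming Python with statsmodels and simulated randomized-complete-block data: the same comparison is analysed with a two-sample t-test that ignores blocks and with a linear mixed model that includes a random block effect.

      # Sketch: analyse a randomized complete block design with a mixed model (random block
      # effect) versus a t-test that ignores blocking. Data and effect sizes are illustrative.
      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf
      from scipy import stats

      rng = np.random.default_rng(0)
      rows = []
      for b in range(8):                                 # eight blocks
          block_effect = rng.normal(0, 2.0)
          for trt in ["control", "adjuvant"]:
              y = 100 + (3.0 if trt == "adjuvant" else 0.0) + block_effect + rng.normal(0, 1.0)
              rows.append({"block": b, "trt": trt, "height": y})
      df = pd.DataFrame(rows)

      # Naive two-sample t-test ignoring the blocking structure
      t, p = stats.ttest_ind(df.loc[df.trt == "adjuvant", "height"],
                             df.loc[df.trt == "control", "height"])
      print("t-test p (blocks ignored):", round(p, 4))

      # Linear mixed model with a random intercept per block
      mm = smf.mixedlm("height ~ trt", df, groups=df["block"]).fit()
      print(mm.summary())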

  1. Systematic survey of the design, statistical analysis, and reporting of studies published in the 2008 volume of the Journal of Cerebral Blood Flow and Metabolism

    PubMed Central

    Vesterinen, Hanna V; Egan, Kieren; Deister, Amelie; Schlattmann, Peter; Macleod, Malcolm R; Dirnagl, Ulrich

    2011-01-01

    Translating experimental findings into clinically effective therapies is one of the major bottlenecks of modern medicine. As this has been particularly true for cerebrovascular research, attention has turned to the quality and validity of experimental cerebrovascular studies. We set out to assess the study design, statistical analyses, and reporting of cerebrovascular research. We assessed all original articles published in the Journal of Cerebral Blood Flow and Metabolism during the year 2008 against a checklist designed to capture the key attributes relating to study design, statistical analyses, and reporting. A total of 156 original publications were included (animal, in vitro, human). Few studies reported a primary research hypothesis, statement of purpose, or measures to safeguard internal validity (such as randomization, blinding, exclusion or inclusion criteria). Many studies lacked sufficient information regarding methods and results to form a reasonable judgment about their validity. In nearly 20% of studies, statistical tests were either not appropriate or information to allow assessment of appropriateness was lacking. This study identifies a number of factors that should be addressed if the quality of research in basic and translational biomedicine is to be improved. We support the widespread implementation of the ARRIVE (Animal Research Reporting In Vivo Experiments) statement for the reporting of experimental studies in biomedicine, for improving training in proper study design and analysis, and that reviewers and editors adopt a more constructively critical approach in the assessment of manuscripts for publication. PMID:21157472

  2. Statistical analysis plan of the head position in acute ischemic stroke trial pilot (HEADPOST pilot).

    PubMed

    Olavarría, Verónica V; Arima, Hisatomi; Anderson, Craig S; Brunser, Alejandro; Muñoz-Venturelli, Paula; Billot, Laurent; Lavados, Pablo M

    2017-02-01

    Background The HEADPOST Pilot is a proof-of-concept, open, prospective, multicenter, international, cluster randomized, phase IIb controlled trial, with masked outcome assessment. The trial will test if lying flat head position initiated in patients within 12 h of onset of acute ischemic stroke involving the anterior circulation increases cerebral blood flow in the middle cerebral arteries, as measured by transcranial Doppler. The study will also assess the safety and feasibility of patients lying flat for ≥24 h. The trial was conducted in centers in three countries, with ability to perform early transcranial Doppler. A feature of this trial was that patients were randomized to a certain position according to the month of admission to hospital. Objective To outline in detail the predetermined statistical analysis plan for HEADPOST Pilot study. Methods All data collected by participating researchers will be reviewed and formally assessed. Information pertaining to the baseline characteristics of patients, their process of care, and the delivery of treatments will be classified, and for each item, appropriate descriptive statistical analyses are planned with comparisons made between randomized groups. For the outcomes, statistical comparisons to be made between groups are planned and described. Results This statistical analysis plan was developed for the analysis of the results of the HEADPOST Pilot study to be transparent, available, verifiable, and predetermined before data lock. Conclusions We have developed a statistical analysis plan for the HEADPOST Pilot study which is to be followed to avoid analysis bias arising from prior knowledge of the study findings. Trial registration The study is registered under HEADPOST-Pilot, ClinicalTrials.gov Identifier NCT01706094.

  3. Statistical issues in quality control of proteomic analyses: good experimental design and planning.

    PubMed

    Cairns, David A

    2011-03-01

    Quality control is becoming increasingly important in proteomic investigations as experiments become more multivariate and quantitative. Quality control applies to all stages of an investigation and statistics can play a key role. In this review, the role of statistical ideas in the design and planning of an investigation is described. This involves the design of unbiased experiments using key concepts from statistical experimental design, the understanding of the biological and analytical variation in a system using variance components analysis and the determination of a required sample size to perform a statistically powerful investigation. These concepts are described through simple examples and an example data set from a 2-D DIGE pilot experiment. Each of these concepts can prove useful in producing better and more reproducible data. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. Do polymorphisms of 5,10-methylenetetrahydrofolate reductase (MTHFR) gene affect the risk of childhood acute lymphoblastic leukemia?

    PubMed

    Pereira, Tiago Veiga; Rudnicki, Martina; Pereira, Alexandre Costa; Pombo-de-Oliveira, Maria S; Franco, Rendrik França

    2006-01-01

    Meta-analysis has become an important statistical tool in genetic association studies, since it may provide more powerful and precise estimates. However, meta-analytic studies are prone to several potential biases, not only because of the preferential publication of "positive" studies but also because of difficulties in obtaining all relevant information during the study selection process. In this letter, we point out major problems in meta-analysis that may lead to biased conclusions, illustrated with an empirical example of two recent meta-analyses on the relation between MTHFR polymorphisms and the risk of acute lymphoblastic leukemia that, despite similar statistical methods and study selection periods, provided partially conflicting results.

  5. A simple technique investigating baseline heterogeneity helped to eliminate potential bias in meta-analyses.

    PubMed

    Hicks, Amy; Fairhurst, Caroline; Torgerson, David J

    2018-03-01

    To perform a worked example of an approach that can be used to identify and remove potentially biased trials from meta-analyses via the analysis of baseline variables. True randomisation produces treatment groups that differ only by chance; therefore, a meta-analysis of a baseline measurement should produce no overall difference and zero heterogeneity. A meta-analysis from the British Medical Journal, known to contain significant heterogeneity and imbalance in baseline age, was chosen. Meta-analyses of baseline variables were performed and trials systematically removed, starting with those with the largest t-statistic, until the I² measure of heterogeneity became 0%; then the outcome meta-analysis was repeated with only the remaining trials as a sensitivity check. We argue that heterogeneity in a meta-analysis of baseline variables should not exist, and therefore removing trials which contribute to heterogeneity from a meta-analysis will produce a more valid result. In our example none of the overall outcomes changed when studies contributing to heterogeneity were removed. We recommend routine use of this technique, using age and a second baseline variable predictive of outcome for the particular study chosen, to help eliminate potential bias in meta-analyses. Copyright © 2017 Elsevier Inc. All rights reserved.
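
    A sketch of the screening procedure described above, in Python with numpy and illustrative effect estimates: a fixed-effect meta-analysis of a baseline variable is computed, and trials are dropped (largest contribution to Cochran's Q first, a simplification of the largest-t rule used in the paper) until I² reaches 0%.

      # Sketch: drop trials from a baseline-variable meta-analysis until I-squared is 0%.
      import numpy as np

      # Hypothetical baseline-age differences (treatment - control) and standard errors per trial
      d = np.array([0.2, -0.1, 1.8, 0.0, -0.3, 2.4, 0.1])
      se = np.array([0.4, 0.5, 0.4, 0.3, 0.6, 0.5, 0.4])

      def i_squared(d, se):
          w = 1.0 / se**2
          pooled = np.sum(w * d) / np.sum(w)
          q_contrib = w * (d - pooled) ** 2                # per-trial contribution to Cochran's Q
          q, df = q_contrib.sum(), len(d) - 1
          i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
          return i2, q_contrib

      keep = np.arange(len(d))
      i2, contrib = i_squared(d[keep], se[keep])
      while i2 > 0 and len(keep) > 2:
          keep = np.delete(keep, np.argmax(contrib))       # drop the most discrepant trial
          i2, contrib = i_squared(d[keep], se[keep])
      print("trials retained:", keep.tolist(), "baseline I2: %.1f%%" % i2)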

  6. Substituting values for censored data from Texas, USA, reservoirs inflated and obscured trends in analyses commonly used for water quality target development.

    PubMed

    Grantz, Erin; Haggard, Brian; Scott, J Thad

    2018-06-12

    We calculated four median datasets (chlorophyll a, Chl a; total phosphorus, TP; and transparency) using multiple approaches to handling censored observations, including substituting fractions of the quantification limit (QL; dataset 1 = 1QL, dataset 2 = 0.5QL) and statistical methods for censored datasets (datasets 3-4) for approximately 100 Texas, USA reservoirs. Trend analyses of differences between dataset 1 and dataset 3 medians indicated that the percent difference increased linearly above thresholds in percent censored data (%Cen). This relationship was extrapolated to estimate medians for site-parameter combinations with %Cen > 80%, which were combined with dataset 3 as dataset 4. Changepoint analysis of Chl a- and transparency-TP relationships indicated threshold differences up to 50% between datasets. Recursive analysis identified secondary thresholds in dataset 4. Threshold differences show that information introduced via substitution or missing due to limitations of statistical methods biased values, underestimated error, and inflated the strength of TP thresholds identified in datasets 1-3. Analysis of covariance identified differences in linear regression models relating transparency-TP between datasets 1, 2, and the more statistically robust datasets 3-4. Study findings identify high-risk scenarios for biased analytical outcomes when using substitution. These include a high probability of median overestimation when %Cen > 50-60% for a single QL, or when %Cen is as low as 16% for multiple QLs. Changepoint analysis was uniquely vulnerable to substitution effects when using medians from sites with %Cen > 50%. Linear regression analysis was less sensitive to substitution and missing data effects, but differences in model parameters for transparency cannot be discounted and could be magnified by log-transformation of the variables.

  7. Trends in study design and the statistical methods employed in a leading general medicine journal.

    PubMed

    Gosho, M; Sato, Y; Nagashima, K; Takahashi, S

    2018-02-01

    Study design and statistical methods have become core components of medical research, and the methodology has become more multifaceted and complicated over time. The study of the comprehensive details and current trends of study design and statistical methods is required to support the future implementation of well-planned clinical studies providing information about evidence-based medicine. Our purpose was to illustrate study design and statistical methods employed in recent medical literature. This was an extension study of Sato et al. (N Engl J Med 2017; 376: 1086-1087), which reviewed 238 articles published in 2015 in the New England Journal of Medicine (NEJM) and briefly summarized the statistical methods employed in NEJM. Using the same database, we performed a new investigation of the detailed trends in study design and individual statistical methods that were not reported in the Sato study. Due to the CONSORT statement, prespecification and justification of sample size are obligatory in planning intervention studies. Although standard survival methods (eg Kaplan-Meier estimator and Cox regression model) were most frequently applied, the Gray test and Fine-Gray proportional hazard model for considering competing risks were sometimes used for a more valid statistical inference. With respect to handling missing data, model-based methods, which are valid for missing-at-random data, were more frequently used than single imputation methods. These methods are not recommended as a primary analysis, but they have been applied in many clinical trials. Group sequential design with interim analyses was one of the standard designs, and novel design, such as adaptive dose selection and sample size re-estimation, was sometimes employed in NEJM. Model-based approaches for handling missing data should replace single imputation methods for primary analysis in the light of the information found in some publications. Use of adaptive design with interim analyses is increasing after the presentation of the FDA guidance for adaptive design. © 2017 John Wiley & Sons Ltd.

  8. Modelling multiple sources of dissemination bias in meta-analysis.

    PubMed

    Bowden, Jack; Jackson, Dan; Thompson, Simon G

    2010-03-30

    Asymmetry in the funnel plot for a meta-analysis suggests the presence of dissemination bias. This may be caused by publication bias through the decisions of journal editors, by selective reporting of research results by authors or by a combination of both. Typically, study results that are statistically significant or have larger estimated effect sizes are more likely to appear in the published literature, hence giving a biased picture of the evidence-base. Previous statistical approaches for addressing dissemination bias have assumed only a single selection mechanism. Here we consider a more realistic scenario in which multiple dissemination processes, involving both the publishing authors and journals, are operating. In practical applications, the methods can be used to provide sensitivity analyses for the potential effects of multiple dissemination biases operating in meta-analysis.

  9. Processes and subdivisions in diogenites, a multivariate statistical analysis

    NASA Technical Reports Server (NTRS)

    Harriott, T. A.; Hewins, R. H.

    1984-01-01

    Multivariate statistical techniques used on diogenite orthopyroxene analyses show the relationships that occur within diogenites and the two orthopyroxenite components (class I and II) in the polymict diogenite Garland. Cluster analysis shows that only Peckelsheim is similar to Garland class I (Fe-rich) and the other diogenites resemble Garland class II. The unique diogenite Y 75032 may be related to type I by fractionation. Factor analysis confirms the subdivision and shows that Fe does not correlate with the weakly incompatible elements across the entire pyroxene composition range, indicating that igneous fractionation is not the process controlling total diogenite composition variation. The occurrence of two groups of diogenites is interpreted as the result of sampling or mixing of two main sequences of orthopyroxene cumulates with slightly different compositions.

  10. Sensitivity Analyses of the Change in FVC in a Phase 3 Trial of Pirfenidone for Idiopathic Pulmonary Fibrosis.

    PubMed

    Lederer, David J; Bradford, Williamson Z; Fagan, Elizabeth A; Glaspole, Ian; Glassberg, Marilyn K; Glasscock, Kenneth F; Kardatzke, David; King, Talmadge E; Lancaster, Lisa H; Nathan, Steven D; Pereira, Carlos A; Sahn, Steven A; Swigris, Jeffrey J; Noble, Paul W

    2015-07-01

    FVC outcomes in clinical trials on idiopathic pulmonary fibrosis (IPF) can be substantially influenced by the analytic methodology and the handling of missing data. We conducted a series of sensitivity analyses to assess the robustness of the statistical finding and the stability of the estimate of the magnitude of treatment effect on the primary end point of FVC change in a phase 3 trial evaluating pirfenidone in adults with IPF. Source data included all 555 study participants randomized to treatment with pirfenidone or placebo in the Assessment of Pirfenidone to Confirm Efficacy and Safety in Idiopathic Pulmonary Fibrosis (ASCEND) study. Sensitivity analyses were conducted to assess whether alternative statistical tests and methods for handling missing data influenced the observed magnitude of treatment effect on the primary end point of change from baseline to week 52 in FVC. The distribution of FVC change at week 52 was systematically different between the two treatment groups and favored pirfenidone in each analysis. The method used to impute missing data due to death had a marked effect on the magnitude of change in FVC in both treatment groups; however, the magnitude of treatment benefit was generally consistent on a relative basis, with an approximate 50% reduction in FVC decline observed in the pirfenidone group in each analysis. Our results confirm the robustness of the statistical finding on the primary end point of change in FVC in the ASCEND trial and corroborate the estimated magnitude of the pirfenidone treatment effect in patients with IPF. ClinicalTrials.gov; No.: NCT01366209; URL: www.clinicaltrials.gov.

  11. Differences in Looking at Own- and Other-Race Faces Are Subtle and Analysis-Dependent: An Account of Discrepant Reports.

    PubMed

    Arizpe, Joseph; Kravitz, Dwight J; Walsh, Vincent; Yovel, Galit; Baker, Chris I

    2016-01-01

    The Other-Race Effect (ORE) is the robust and well-established finding that people are generally poorer at facial recognition of individuals of another race than of their own race. Over the past four decades, much research has focused on the ORE because understanding this phenomenon is expected to elucidate fundamental face processing mechanisms and the influence of experience on such mechanisms. Several recent studies of the ORE in which the eye-movements of participants viewing own- and other-race faces were tracked have, however, reported highly conflicting results regarding the presence or absence of differential patterns of eye-movements to own- versus other-race faces. This discrepancy, of course, leads to conflicting theoretical interpretations of the perceptual basis for the ORE. Here we investigate fixation patterns to own- versus other-race (African and Chinese) faces for Caucasian participants using different analysis methods. While we detect statistically significant, though subtle, differences in fixation pattern using an Area of Interest (AOI) approach, we fail to detect significant differences when applying a spatial density map approach. Though there were no significant differences in the spatial density maps, the qualitative patterns matched the results from the AOI analyses reflecting how, in certain contexts, Area of Interest (AOI) analyses can be more sensitive in detecting the differential fixation patterns than spatial density analyses, due to spatial pooling of data with AOIs. AOI analyses, however, also come with the limitation of requiring a priori specification. These findings provide evidence that the conflicting reports in the prior literature may be at least partially accounted for by the differences in the statistical sensitivity associated with the different analysis methods employed across studies. Overall, our results suggest that detection of differences in eye-movement patterns can be analysis-dependent and rests on the assumptions inherent in the given analysis.

  12. Differences in Looking at Own- and Other-Race Faces Are Subtle and Analysis-Dependent: An Account of Discrepant Reports

    PubMed Central

    Arizpe, Joseph; Kravitz, Dwight J.; Walsh, Vincent; Yovel, Galit; Baker, Chris I.

    2016-01-01

    The Other-Race Effect (ORE) is the robust and well-established finding that people are generally poorer at facial recognition of individuals of another race than of their own race. Over the past four decades, much research has focused on the ORE because understanding this phenomenon is expected to elucidate fundamental face processing mechanisms and the influence of experience on such mechanisms. Several recent studies of the ORE in which the eye-movements of participants viewing own- and other-race faces were tracked have, however, reported highly conflicting results regarding the presence or absence of differential patterns of eye-movements to own- versus other-race faces. This discrepancy, of course, leads to conflicting theoretical interpretations of the perceptual basis for the ORE. Here we investigate fixation patterns to own- versus other-race (African and Chinese) faces for Caucasian participants using different analysis methods. While we detect statistically significant, though subtle, differences in fixation pattern using an Area of Interest (AOI) approach, we fail to detect significant differences when applying a spatial density map approach. Though there were no significant differences in the spatial density maps, the qualitative patterns matched the results from the AOI analyses reflecting how, in certain contexts, Area of Interest (AOI) analyses can be more sensitive in detecting the differential fixation patterns than spatial density analyses, due to spatial pooling of data with AOIs. AOI analyses, however, also come with the limitation of requiring a priori specification. These findings provide evidence that the conflicting reports in the prior literature may be at least partially accounted for by the differences in the statistical sensitivity associated with the different analysis methods employed across studies. Overall, our results suggest that detection of differences in eye-movement patterns can be analysis-dependent and rests on the assumptions inherent in the given analysis. PMID:26849447

  13. Prognostic value of inflammation-based scores in patients with osteosarcoma

    PubMed Central

    Liu, Bangjian; Huang, Yujing; Sun, Yuanjue; Zhang, Jianjun; Yao, Yang; Shen, Zan; Xiang, Dongxi; He, Aina

    2016-01-01

    Systemic inflammation responses have been associated with cancer development and progression. C-reactive protein (CRP), Glasgow prognostic score (GPS), neutrophil-lymphocyte ratio (NLR), platelet-lymphocyte ratio (PLR), lymphocyte-monocyte ratio (LMR), and neutrophil-platelet score (NPS) have been shown to be independent risk factors in various types of malignant tumors. This retrospective analysis of 162 osteosarcoma cases was performed to estimate their predictive value for survival in osteosarcoma. All statistical analyses were performed with SPSS statistical software. Receiver operating characteristic (ROC) analysis was performed to set optimal thresholds; area under the curve (AUC) was used to show the discriminatory abilities of inflammation-based scores; Kaplan-Meier analysis was performed to plot the survival curve; Cox regression models were employed to determine the independent prognostic factors. The optimal cut-off points of NLR, PLR, and LMR were 2.57, 123.5 and 4.73, respectively. GPS and NLR had a markedly larger AUC than CRP, PLR and LMR. High levels of CRP, GPS, NLR, PLR, and low level of LMR were significantly associated with adverse prognosis (P < 0.05). Multivariate Cox regression analyses revealed that GPS, NLR, and occurrence of metastasis were the top risk factors associated with death of osteosarcoma patients. PMID:28008988
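
    Optimal cut-offs such as the reported NLR threshold of 2.57 are typically derived from a ROC curve by maximizing Youden's index. The sketch below shows that generic step on simulated data; it is not the authors' SPSS workflow, and the variable names and values are hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical data: NLR values and a binary outcome (1 = death / event).
rng = np.random.default_rng(1)
nlr = np.concatenate([rng.normal(2.0, 0.6, 80), rng.normal(3.2, 0.9, 40)])
event = np.concatenate([np.zeros(80, int), np.ones(40, int)])

fpr, tpr, thresholds = roc_curve(event, nlr)
youden = tpr - fpr                       # Youden's J at each candidate threshold
best = thresholds[np.argmax(youden)]     # cut-off maximizing sensitivity + specificity - 1
auc = roc_auc_score(event, nlr)
print(f"optimal cut-off ~ {best:.2f}, AUC = {auc:.2f}")
```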

  14. Seeking a fingerprint: analysis of point processes in actigraphy recording

    NASA Astrophysics Data System (ADS)

    Gudowska-Nowak, Ewa; Ochab, Jeremi K.; Oleś, Katarzyna; Beldzik, Ewa; Chialvo, Dante R.; Domagalik, Aleksandra; Fąfrowicz, Magdalena; Marek, Tadeusz; Nowak, Maciej A.; Ogińska, Halszka; Szwed, Jerzy; Tyburczyk, Jacek

    2016-05-01

    Motor activity of humans displays complex temporal fluctuations which can be characterised by scale-invariant statistics, thus demonstrating that structure and fluctuations of such kinetics remain similar over a broad range of time scales. Previous studies on humans regularly deprived of sleep or suffering from sleep disorders predicted a change in the invariant scale parameters with respect to those for healthy subjects. In this study we investigate the signal patterns from actigraphy recordings by means of characteristic measures of fractional point processes. We analyse spontaneous locomotor activity of healthy individuals recorded during a week of regular sleep and a week of chronic partial sleep deprivation. Behavioural symptoms of lack of sleep can be evaluated by analysing statistics of duration times during active and resting states, and alteration of behavioural organisation can be assessed by analysis of power laws detected in the event count distribution, distribution of waiting times between consecutive movements and detrended fluctuation analysis of recorded time series. We claim that among different measures characterising complexity of the actigraphy recordings and their variations implied by chronic sleep distress, the exponents characterising slopes of survival functions in resting states are the most effective biomarkers distinguishing between healthy and sleep-deprived groups.

  15. Quantitative Analysis of Repertoire-Scale Immunoglobulin Properties in Vaccine-Induced B-Cell Responses

    DTIC Science & Technology

    2017-05-10

    repertoire-wide properties. Finally, through the use of appropriate statistical analyses, the repertoire profiles can be quantitatively compared and ... cell response to eVLP and quantitatively compare GC B-cell repertoires from immunization conditions. We partitioned the resulting clonotype ... Quantitative analysis of repertoire-scale immunoglobulin properties in vaccine-induced B-cell responses. Ilja V. Khavrutskii, Sidhartha Chaudhury

  16. Statistical Analysis of Fort Hood Quality-of-Life Questionnaire.

    DTIC Science & Technology

    1978-10-01

    The objective of this work was to provide supplementary data analyses of data abstracted from the Quality-of-Life questionnaire developed earlier at...the Fort Hood Field Unit at the request of Headquarters, TRADOC Combined Arms Test Activity (TCATA). The Quality-of-Life questionnaire data were...to the Quality-of-Life questionnaire. These data were then intensively analyzed using analysis of variance and correlational techniques. The results

  17. Experimental design of an interlaboratory study for trace metal analysis of liquid fuels. [for aerospace vehicles]

    NASA Technical Reports Server (NTRS)

    Greenbauer-Seng, L. A.

    1983-01-01

    The accurate determination of trace metals in fuels is an important requirement in much of the research into and development of alternative fuels for aerospace applications. Recognizing the detrimental effects of certain metals on fuel performance and fuel systems at the part per million and in some cases part per billion levels requires improved accuracy in determining these low concentration elements. Accurate analyses are also required to ensure interchangeability of analysis results between vendor, researcher, and end user for purposes of quality control. Previous interlaboratory studies have demonstrated the inability of different laboratories to agree on the results of metal analysis, particularly at low concentration levels, yet good precision is typically reported within each laboratory. An interlaboratory study was designed to gain statistical information about the sources of variation in the reported concentrations. Five participant laboratories were used on a fee basis and were not informed of the purpose of the analyses. The effects of laboratory, analytical technique, concentration level, and ashing additive were studied in four fuel types for 20 elements of interest. The prescribed sample preparation schemes (variations of dry ashing) were used by all of the laboratories. The analytical data were statistically evaluated using a computer program implementing the analysis of variance technique.

  18. "What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"

    ERIC Educational Resources Information Center

    Ozturk, Elif

    2012-01-01

    The present paper reviews two motivations for conducting "what if" analyses using Excel and "R" to understand statistical significance tests in the context of sample size. "What if" analyses can be used to teach students what statistical significance tests really do and, in applied research, either prospectively to estimate what sample size…
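
    The core of a "what if" analysis is holding an observed effect fixed and recomputing the significance test for hypothetical sample sizes. The sketch below does this in Python rather than Excel or R (so it is not the paper's material) for a two-sample t-test with an assumed mean difference and standard deviation.

```python
import numpy as np
from scipy import stats

def what_if_p(mean_diff, sd, n_per_group):
    """p-value a two-sample t-test would give for a fixed mean difference
    and common SD if each group contained n_per_group observations."""
    se = sd * np.sqrt(2.0 / n_per_group)
    t = mean_diff / se
    df = 2 * n_per_group - 2
    return 2 * stats.t.sf(abs(t), df)

# Same observed effect (difference 0.3, SD 1.0), different hypothetical sample sizes.
for n in (10, 30, 100, 300):
    print(n, round(what_if_p(mean_diff=0.3, sd=1.0, n_per_group=n), 4))
```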

  19. Treatment of missing data in follow-up studies of randomised controlled trials: A systematic review of the literature.

    PubMed

    Sullivan, Thomas R; Yelland, Lisa N; Lee, Katherine J; Ryan, Philip; Salter, Amy B

    2017-08-01

    After completion of a randomised controlled trial, an extended follow-up period may be initiated to learn about longer term impacts of the intervention. Since extended follow-up studies often involve additional eligibility restrictions and consent processes for participation, and a longer duration of follow-up entails a greater risk of participant attrition, missing data can be a considerable threat in this setting. As a potential source of bias, it is critical that missing data are appropriately handled in the statistical analysis, yet little is known about the treatment of missing data in extended follow-up studies. The aims of this review were to summarise the extent of missing data in extended follow-up studies and the use of statistical approaches to address this potentially serious problem. We performed a systematic literature search in PubMed to identify extended follow-up studies published from January to June 2015. Studies were eligible for inclusion if the original randomised controlled trial results were also published and if the main objective of extended follow-up was to compare the original randomised groups. We recorded information on the extent of missing data and the approach used to treat missing data in the statistical analysis of the primary outcome of the extended follow-up study. Of the 81 studies included in the review, 36 (44%) reported additional eligibility restrictions and 24 (30%) consent processes for entry into extended follow-up. Data were collected at a median of 7 years after randomisation. Excluding 28 studies with a time to event primary outcome, 51/53 studies (96%) reported missing data on the primary outcome. The median percentage of randomised participants with complete data on the primary outcome was just 66% in these studies. The most common statistical approach to address missing data was complete case analysis (51% of studies), while likelihood-based analyses were also well represented (25%). Sensitivity analyses around the missing data mechanism were rarely performed (25% of studies), and when they were, they often involved unrealistic assumptions about the mechanism. Despite missing data being a serious problem in extended follow-up studies, statistical approaches to addressing missing data were often inadequate. We recommend researchers clearly specify all sources of missing data in follow-up studies and use statistical methods that are valid under a plausible assumption about the missing data mechanism. Sensitivity analyses should also be undertaken to assess the robustness of findings to assumptions about the missing data mechanism.
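
    One simple form of the sensitivity analysis the review recommends is a delta-adjustment: impute missing outcomes as the observed group mean shifted by an assumed amount and see how the treatment estimate moves. The sketch below contrasts this with complete-case analysis on simulated data; it is an illustration of the general idea, not a method endorsed by the review, and all values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
group = rng.integers(0, 2, n)                     # 0 = control, 1 = intervention
y = 1.0 * group + rng.normal(0, 2, n)             # true treatment effect = 1.0
missing = rng.random(n) < np.where(group == 1, 0.35, 0.15)   # differential dropout
y_obs = np.where(missing, np.nan, y)

# Complete-case estimate of the group difference.
cc = np.nanmean(y_obs[group == 1]) - np.nanmean(y_obs[group == 0])

# Delta-adjustment sensitivity analysis: assume missing outcomes are worse
# than observed ones by `delta` (a missing-not-at-random scenario).
for delta in (0.0, -0.5, -1.0):
    y_imp = y_obs.copy()
    for g in (0, 1):
        fill = np.nanmean(y_obs[group == g]) + delta
        y_imp[missing & (group == g)] = fill
    diff = y_imp[group == 1].mean() - y_imp[group == 0].mean()
    print(f"delta={delta:>4}: estimated effect {diff:.2f}  (complete case {cc:.2f})")
```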

  20. A wind proxy based on migrating dunes at the Baltic coast: statistical analysis of the link between wind conditions and sand movement

    NASA Astrophysics Data System (ADS)

    Bierstedt, Svenja E.; Hünicke, Birgit; Zorita, Eduardo; Ludwig, Juliane

    2017-07-01

    We statistically analyse the relationship between the structure of migrating dunes in the southern Baltic and the driving wind conditions over the past 26 years, with the long-term aim of using migrating dunes as a proxy for past wind conditions at an interannual resolution. The present analysis is based on the dune record derived from geo-radar measurements by Ludwig et al. (2017). The dune system is located at the Baltic Sea coast of Poland and is migrating from west to east along the coast. The dunes present layers with different thicknesses that can be assigned to absolute dates at interannual timescales and put in relation to seasonal wind conditions. To statistically analyse this record and calibrate it as a wind proxy, we used a gridded regional meteorological reanalysis data set (coastDat2) covering recent decades. The identified link between the dune annual layers and wind conditions was additionally supported by the co-variability between dune layers and observed sea level variations in the southern Baltic Sea. We include precipitation and temperature in our analysis, in addition to wind, to learn more about the dependency between these three atmospheric factors and their common influence on the dune system. We set up a statistical linear model based on the correlation between the frequency of days with specific wind conditions in a given season and dune migration velocities derived for that season. To some extent, the dune records can be seen as analogous to tree-ring width records, and hence, because the observational record is short, we use a proxy validation method usually applied in dendrochronology: cross-validation with the leave-one-out method. The correlations revealed between the wind record from the reanalysis and the wind record derived from the dune structure are in the range of 0.28 to 0.63, yielding statistical validation skill similar to that of dendroclimatological records.
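
    Leave-one-out cross-validation of a simple linear calibration, as used above, can be sketched as follows. The paired wind/velocity values are hypothetical and the model is a plain least-squares line, not the authors' exact calibration.

```python
import numpy as np

# Hypothetical paired record: seasonal count of high-wind days and the dune
# migration velocity derived for the same season.
wind_days = np.array([12., 18., 9., 22., 15., 11., 20., 17., 14., 10.])
velocity  = np.array([1.4, 2.1, 1.0, 2.6, 1.8, 1.2, 2.3, 2.0, 1.6, 1.1])

preds = np.empty_like(velocity)
for i in range(len(velocity)):
    keep = np.arange(len(velocity)) != i          # leave observation i out
    slope, intercept = np.polyfit(wind_days[keep], velocity[keep], deg=1)
    preds[i] = slope * wind_days[i] + intercept   # predict the held-out season

r = np.corrcoef(velocity, preds)[0, 1]            # cross-validated correlation
print(f"leave-one-out correlation: {r:.2f}")
```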

  1. A concept for holistic whole body MRI data analysis, Imiomics

    PubMed Central

    Malmberg, Filip; Johansson, Lars; Lind, Lars; Sundbom, Magnus; Ahlström, Håkan; Kullberg, Joel

    2017-01-01

    Purpose To present and evaluate a whole-body image analysis concept, Imiomics (imaging–omics) and an image registration method that enables Imiomics analyses by deforming all image data to a common coordinate system, so that the information in each voxel can be compared between persons or within a person over time and integrated with non-imaging data. Methods The presented image registration method utilizes relative elasticity constraints of different tissue obtained from whole-body water-fat MRI. The registration method is evaluated by inverse consistency and Dice coefficients and the Imiomics concept is evaluated by example analyses of importance for metabolic research using non-imaging parameters where we know what to expect. The example analyses include whole body imaging atlas creation, anomaly detection, and cross-sectional and longitudinal analysis. Results The image registration method evaluation on 128 subjects shows low inverse consistency errors and high Dice coefficients. Also, the statistical atlas with fat content intensity values shows low standard deviation values, indicating successful deformations to the common coordinate system. The example analyses show expected associations and correlations which agree with explicit measurements, and thereby illustrate the usefulness of the proposed Imiomics concept. Conclusions The registration method is well-suited for Imiomics analyses, which enable analyses of relationships to non-imaging data, e.g. clinical data, in new types of holistic targeted and untargeted big-data analysis. PMID:28241015
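
    One of the registration-quality metrics named above, the Dice coefficient, has a compact generic implementation. The sketch below computes it for two binary masks; it is a standard definition, not the authors' pipeline, and the toy masks are hypothetical.

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice coefficient between two boolean masks (1 = perfect overlap)."""
    a = np.asarray(mask_a, bool)
    b = np.asarray(mask_b, bool)
    intersection = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    return 2.0 * intersection / denom if denom else 1.0

# Toy 2D example: a reference organ mask and a deformed/registered mask.
ref = np.zeros((8, 8), bool); ref[2:6, 2:6] = True
reg = np.zeros((8, 8), bool); reg[3:7, 2:6] = True
print(f"Dice = {dice(ref, reg):.2f}")   # 0.75 for this toy overlap
```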

  2. Statistical analyses and characteristics of volcanic tremor on Stromboli Volcano (Italy)

    NASA Astrophysics Data System (ADS)

    Falsaperla, S.; Langer, H.; Spampinato, S.

    A study of volcanic tremor on Stromboli is carried out on the basis of data recorded daily between 1993 and 1995 by a permanent seismic station (STR) located 1.8km away from the active craters. We also consider the signal of a second station (TF1), which operated for a shorter time span. Changes in the spectral tremor characteristics can be related to modifications in volcanic activity, particularly to lava effusions and explosive sequences. Statistical analyses were carried out on a set of spectra calculated daily from seismic signals where explosion quakes were present or excluded. Principal component analysis and cluster analysis were applied to identify different classes of spectra. Three clusters of spectra are associated with two different states of volcanic activity. One cluster corresponds to a state of low to moderate activity, whereas the two other clusters are present during phases with a high magma column as inferred from the occurrence of lava fountains or effusions. We therefore conclude that variations in volcanic activity at Stromboli are usually linked to changes in the spectral characteristics of volcanic tremor. Site effects are evident when comparing the spectra calculated from signals synchronously recorded at STR and TF1. However, some major spectral peaks at both stations may reflect source properties. Statistical considerations and polarization analysis are in favor of a prevailing presence of P-waves in the tremor signal along with a position of the source northwest of the craters and at shallow depth.

  3. Spatial variation in the bacterial and denitrifying bacterial community in a biofilter treating subsurface agricultural drainage.

    PubMed

    Andrus, J Malia; Porter, Matthew D; Rodríguez, Luis F; Kuehlhorn, Timothy; Cooke, Richard A C; Zhang, Yuanhui; Kent, Angela D; Zilles, Julie L

    2014-02-01

    Denitrifying biofilters can remove agricultural nitrates from subsurface drainage, reducing nitrate pollution that contributes to coastal hypoxic zones. The performance and reliability of natural and engineered systems dependent upon microbially mediated processes, such as denitrifying biofilters, can be affected by the spatial structure of their microbial communities. Furthermore, our understanding of the relationship between microbial community composition and function is influenced by the spatial distribution of samples. In this study we characterized the spatial structure of bacterial communities in a denitrifying biofilter in central Illinois. Bacterial communities were assessed using automated ribosomal intergenic spacer analysis for bacteria and terminal restriction fragment length polymorphism of nosZ for denitrifying bacteria. Non-metric multidimensional scaling and analysis of similarity (ANOSIM) analyses indicated that bacteria showed statistically significant spatial structure by depth and transect, while denitrifying bacteria did not exhibit significant spatial structure. For determination of spatial patterns, we developed a package of automated functions for the R statistical environment that allows directional analysis of microbial community composition data using either ANOSIM or Mantel statistics. Applying this package to the biofilter data, the flow path correlation range for the bacterial community was 6.4 m at the shallower, periodically inundated depth and 10.7 m at the deeper, continually submerged depth. These spatial structures suggest a strong influence of hydrology on the microbial community composition in these denitrifying biofilters. Understanding such spatial structure can also guide optimal sample collection strategies for microbial community analyses.
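
    The abstract's analysis relies on an R package for directional ANOSIM/Mantel tests; a plain (non-directional) permutation Mantel test can be sketched generically as below. This Python version is not that package, and the community profiles and coordinates are simulated.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def mantel(d1, d2, n_perm=999, seed=0):
    """One-sided permutation Mantel test: correlation between two distance matrices."""
    rng = np.random.default_rng(seed)
    n = d1.shape[0]
    iu = np.triu_indices(n, k=1)
    r_obs = np.corrcoef(d1[iu], d2[iu])[0, 1]
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        d1p = d1[np.ix_(perm, perm)]              # permute rows and columns together
        if np.corrcoef(d1p[iu], d2[iu])[0, 1] >= r_obs:
            count += 1
    return r_obs, (count + 1) / (n_perm + 1)

# Toy data: community profiles and sample coordinates along a flow path.
rng = np.random.default_rng(3)
coords = rng.uniform(0, 20, size=(12, 2))
profiles = rng.random((12, 6)) + 0.05 * coords[:, :1]   # weak spatial signal
d_comm = squareform(pdist(profiles, metric="braycurtis"))
d_geo = squareform(pdist(coords))
print(mantel(d_comm, d_geo))
```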

  4. Fast and accurate imputation of summary statistics enhances evidence of functional enrichment

    PubMed Central

    Pasaniuc, Bogdan; Zaitlen, Noah; Shi, Huwenbo; Bhatia, Gaurav; Gusev, Alexander; Pickrell, Joseph; Hirschhorn, Joel; Strachan, David P.; Patterson, Nick; Price, Alkes L.

    2014-01-01

    Motivation: Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov models (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. Results: In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1–5%) variants [increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples] versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case–control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of χ2 association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Availability and implementation: Publicly available software package available at http://bogdan.bioinformatics.ucla.edu/software/. Contact: bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu Supplementary information: Supplementary materials are available at Bioinformatics online. PMID:24990607
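
    The core of Gaussian imputation from summary statistics is a conditional-normal step: the untyped variant's z-score is predicted from typed variants' z-scores weighted by the reference-panel LD matrix. The sketch below shows that step in simplified form with a ridge term; it is not the released software, and all numbers are hypothetical.

```python
import numpy as np

def impute_z(z_typed, ld_tt, ld_ut, ridge=0.1):
    """Impute the z-score of an untyped SNP from typed SNPs.

    z_typed : z-scores at typed SNPs
    ld_tt   : LD (correlation) matrix among typed SNPs
    ld_ut   : LD between the untyped SNP and each typed SNP
    A small ridge term regularizes the finite reference-panel LD estimate.
    """
    sigma = ld_tt + ridge * np.eye(len(z_typed))
    weights = np.linalg.solve(sigma, ld_ut)       # Sigma_tt^{-1} Sigma_tu
    z_hat = weights @ z_typed                     # conditional mean of the untyped z-score
    info = ld_ut @ weights                        # expected r^2 of the imputation
    return z_hat, info

z_typed = np.array([3.1, 2.4, 0.8])
ld_tt = np.array([[1.0, 0.6, 0.2],
                  [0.6, 1.0, 0.3],
                  [0.2, 0.3, 1.0]])
ld_ut = np.array([0.8, 0.5, 0.1])
print(impute_z(z_typed, ld_tt, ld_ut))
```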

  5. Comparison of bacterial community structure and dynamics during the thermophilic composting of different types of solid wastes: anaerobic digestion residue, pig manure and chicken manure

    PubMed Central

    Song, Caihong; Li, Mingxiao; Jia, Xuan; Wei, Zimin; Zhao, Yue; Xi, Beidou; Zhu, Chaowei; Liu, Dongming

    2014-01-01

    This study investigated the impact of composting substrate types on the bacterial community structure and dynamics during composting processes. To this end, pig manure (PM), chicken manure (CM), a mixture of PM and CM (PM + CM), and a mixture of PM, CM and anaerobic digestion residue (ADR) (PM + CM + ADR) were selected for thermophilic composting. The bacterial community structure and dynamics during the composting process were detected and analysed by polymerase chain reaction–denaturing gradient gel electrophoresis (DGGE) coupled with statistical analysis. The physical-chemical analyses indicated that, compared to single-material composting (PM, CM), co-composting (PM + CM, PM + CM + ADR) could promote the degradation of organic matter and strengthen nitrogen conservation. A DGGE profile and statistical analysis demonstrated that co-composting, especially PM + CM + ADR, could improve the bacterial community structure and functional diversity, even in the thermophilic stage. Therefore, co-composting could weaken the screening effect of high temperature on bacterial communities. Dominant sequencing analyses indicated a dramatic shift in the dominant bacterial communities from single-material composting to co-composting. Notably, compared with PM, PM + CM increased the quantity of xylan-degrading bacteria and reduced the quantity of human pathogens. PMID:24963997

  6. Interim analyses in 2 x 2 crossover trials.

    PubMed

    Cook, R J

    1995-09-01

    A method is presented for performing interim analyses in long term 2 x 2 crossover trials with serial patient entry. The analyses are based on a linear statistic that combines data from individuals observed for one treatment period with data from individuals observed for both periods. The coefficients in this linear combination can be chosen quite arbitrarily, but we focus on variance-based weights to maximize power for tests regarding direct treatment effects. The type I error rate of this procedure is controlled by utilizing the joint distribution of the linear statistics over analysis stages. Methods for performing power and sample size calculations are indicated. A two-stage sequential design involving simultaneous patient entry and a single between-period interim analysis is considered in detail. The power and average number of measurements required for this design are compared to those of the usual crossover trial. The results indicate that, while there is minimal loss in power relative to the usual crossover design in the absence of differential carry-over effects, the proposed design can have substantially greater power when differential carry-over effects are present. The two-stage crossover design can also lead to more economical studies in terms of the expected number of measurements required, due to the potential for early stopping. Attention is directed toward normally distributed responses.

  7. Tips and Tricks for Successful Application of Statistical Methods to Biological Data.

    PubMed

    Schlenker, Evelyn

    2016-01-01

    This chapter discusses experimental design and the use of statistics to describe characteristics of data (descriptive statistics) and inferential statistics that test the hypothesis posed by the investigator. Inferential statistics, based on probability distributions, depend upon the type and distribution of the data. For data that are continuous, randomly and independently selected, and normally distributed, more powerful parametric tests such as Student's t test and analysis of variance (ANOVA) can be used. For non-normally distributed or skewed data, transformation of the data (using logarithms) may normalize the data, allowing use of parametric tests. Alternatively, with skewed data, nonparametric tests can be utilized, some of which rely on data that are ranked prior to statistical analysis. Experimental designs and analyses need to balance the risk of type 1 errors (false positives) against the risk of type 2 errors (false negatives). For a variety of clinical studies that determine risk or benefit, relative risk ratios (randomized clinical trials and cohort studies) or odds ratios (case-control studies) are utilized. Although both use 2 × 2 tables, their premise and calculations differ. Finally, special statistical methods are applied to microarray and proteomics data, since the large number of genes or proteins evaluated increases the likelihood of false discoveries. Additional studies in separate samples are used to verify microarray and proteomic data. Examples in this chapter and references are available to help continued investigation of experimental designs and appropriate data analysis.

  8. mvp - an open-source preprocessor for cleaning duplicate records and missing values in mass spectrometry data.

    PubMed

    Lee, Geunho; Lee, Hyun Beom; Jung, Byung Hwa; Nam, Hojung

    2017-07-01

    Mass spectrometry (MS) data are used to analyze biological phenomena based on chemical species. However, these data often contain unexpected duplicate records and missing values due to technical or biological factors. These 'dirty data' problems increase the difficulty of performing MS analyses because they lead to performance degradation when statistical or machine-learning tests are applied to the data. Thus, we have developed missing values preprocessor (mvp), an open-source software for preprocessing data that might include duplicate records and missing values. mvp uses the property of MS data in which identical chemical species present the same or similar values for key identifiers, such as the mass-to-charge ratio and intensity signal, and forms cliques via graph theory to process dirty data. We evaluated the validity of the mvp process via quantitative and qualitative analyses and compared the results from a statistical test that analyzed the original and mvp-applied data. This analysis showed that using mvp reduces problems associated with duplicate records and missing values. We also examined the effects of using unprocessed data in statistical tests and examined the improved statistical test results obtained with data preprocessed using mvp.

  9. Analysis of Precipitation (Rain and Snow) Levels and Straight-line Wind Speeds in Support of the 10-year Natural Phenomena Hazards Review for Los Alamos National Laboratory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kelly, Elizabeth J.; Dewart, Jean Marie; Deola, Regina

    This report provides site-specific return level analyses for rain, snow, and straight-line wind extreme events. These analyses are in support of the 10-year review plan for the assessment of meteorological natural phenomena hazards at Los Alamos National Laboratory (LANL). These analyses follow guidance from Department of Energy, DOE Standard, Natural Phenomena Hazards Analysis and Design Criteria for DOE Facilities (DOE-STD-1020-2012), Nuclear Regulatory Commission Standard Review Plan (NUREG-0800, 2007) and ANSI/ANS-2.3-2011, Estimating Tornado, Hurricane, and Extreme Straight-Line Wind Characteristics at Nuclear Facility Sites. LANL precipitation and snow level data have been collected since 1910, although not all years are complete. In this report the results from the more recent data (1990–2014) are compared to those of past analyses and a 2004 National Oceanographic and Atmospheric Administration report. Given the many differences in the data sets used in these different analyses, the lack of statistically significant differences in return level estimates increases confidence in the data and in the modeling and analysis approach.
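
    A common way to obtain return levels like those described is to fit a generalized extreme value (GEV) distribution to annual maxima and read off the 1 - 1/T quantile. The sketch below does this with scipy on simulated annual maxima; it is an illustrative approach, not the LANL analysis, and the values are hypothetical.

```python
import numpy as np
from scipy.stats import genextreme

# Hypothetical annual maximum daily precipitation (mm) for 25 years.
annual_max = genextreme.rvs(c=-0.1, loc=40, scale=10, size=25, random_state=42)

# Fit the GEV and compute the T-year return level (the 1 - 1/T quantile).
c, loc, scale = genextreme.fit(annual_max)
for T in (10, 50, 100):
    level = genextreme.ppf(1 - 1 / T, c, loc=loc, scale=scale)
    print(f"{T:>3}-year return level ~ {level:.1f} mm")
```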

  10. Empirical evidence about inconsistency among studies in a pair‐wise meta‐analysis

    PubMed Central

    Turner, Rebecca M.; Higgins, Julian P. T.

    2015-01-01

    This paper investigates how inconsistency (as measured by the I² statistic) among studies in a meta-analysis may differ, according to the type of outcome data and effect measure. We used hierarchical models to analyse data from 3873 binary, 5132 continuous and 880 mixed outcome meta-analyses within the Cochrane Database of Systematic Reviews. Predictive distributions for inconsistency expected in future meta-analyses were obtained, which can inform priors for between-study variance. Inconsistency estimates were highest on average for binary outcome meta-analyses of risk differences and continuous outcome meta-analyses. For a planned binary outcome meta-analysis in a general research setting, the predictive distribution for inconsistency among log odds ratios had median 22% and 95% CI: 12% to 39%. For a continuous outcome meta-analysis, the predictive distribution for inconsistency among standardized mean differences had median 40% and 95% CI: 15% to 73%. Levels of inconsistency were similar for binary data measured by log odds ratios and log relative risks. Fitted distributions for inconsistency expected in continuous outcome meta-analyses using mean differences were almost identical to those using standardized mean differences. The empirical evidence on inconsistency gives guidance on which outcome measures are most likely to be consistent in particular circumstances and facilitates Bayesian meta-analysis with an informative prior for heterogeneity. © 2015 The Authors. Research Synthesis Methods published by John Wiley & Sons, Ltd. PMID:26679486
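
    For reference, the I² statistic summarized in this paper is computed from Cochran's Q for a given set of study estimates. The sketch below uses the standard formula with hypothetical study values; it is not the paper's hierarchical model.

```python
import numpy as np

def i_squared(effects, variances):
    """Cochran's Q and the I^2 inconsistency statistic for a set of study estimates."""
    effects = np.asarray(effects, float)
    w = 1.0 / np.asarray(variances, float)        # inverse-variance weights
    pooled = np.sum(w * effects) / np.sum(w)      # fixed-effect pooled estimate
    q = np.sum(w * (effects - pooled) ** 2)
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, 100 * i2                            # I^2 expressed as a percentage

# Hypothetical log odds ratios and variances from six trials.
log_or = [0.25, 0.10, 0.40, -0.05, 0.30, 0.55]
var = [0.04, 0.02, 0.05, 0.03, 0.06, 0.08]
print(i_squared(log_or, var))
```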

  11. Comparison of corneal endothelial image analysis by Konan SP8000 noncontact and Bio-Optics Bambi systems.

    PubMed

    Benetz, B A; Diaconu, E; Bowlin, S J; Oak, S S; Laing, R A; Lass, J H

    1999-01-01

    To compare corneal endothelial image analysis by the Konan SP8000 and Bio-Optics Bambi image-analysis systems. Corneal endothelial images from 98 individuals (191 eyes), ranging in age from 4 to 87 years, with a normal slit-lamp examination and no history of ocular trauma, intraocular surgery, or intraocular inflammation were obtained by the Konan SP8000 noncontact specular microscope. One observer analyzed these images by using the Konan system and a second observer by using the Bio-Optics Bambi system. Three methods of analysis were used: a fixed-frame method to obtain cell density (for both Konan and Bio-Optics Bambi) and a "dot" (Konan) or "corners" (Bio-Optics Bambi) method to determine morphometric parameters. The cell density determined by the Konan fixed-frame method was significantly higher (157 cells/mm²) than the Bio-Optics Bambi fixed-frame method determination (p<0.0001). However, the difference in cell density, although still statistically significant, was smaller and reversed when comparing the Konan fixed-frame method with both the Konan dot and Bio-Optics Bambi corners methods (-74 cells/mm², p<0.0001; -55 cells/mm², p<0.0001, respectively). Small but statistically significant morphometric differences between Konan and Bio-Optics Bambi were seen: cell density, +19 cells/mm² (p = 0.03); cell area, -3.0 µm² (p = 0.008); and coefficient of variation, +1.0 (p = 0.003). There was no statistically significant difference between these two methods in the percentage of six-sided cells detected (p = 0.55). Cell densities measured by the Konan fixed-frame method were comparable with Konan and Bio-Optics Bambi's morphometric analyses, but not with the Bio-Optics Bambi fixed-frame method. The two morphometric analyses were comparable with minimal or no differences for the parameters that were studied. The Konan SP8000 endothelial image-analysis system may be useful for large-scale clinical trials determining cell loss; its noncontact system has many clinical benefits (including patient comfort, safety, ease of use, and short procedure time) and provides reliable cell-density calculations.

  12. Adopting a Patient-Centered Approach to Primary Outcome Analysis of Acute Stroke Trials by Use of a Utility-Weighted Modified Rankin Scale

    PubMed Central

    Chaisinanunkul, Napasri; Adeoye, Opeolu; Lewis, Roger J.; Grotta, James C.; Broderick, Joseph; Jovin, Tudor G.; Nogueira, Raul G.; Elm, Jordan; Graves, Todd; Berry, Scott; Lees, Kennedy R.; Barreto, Andrew D.; Saver, Jeffrey L.

    2015-01-01

    Background and Purpose Although the modified Rankin Scale (mRS) is the most commonly employed primary endpoint in acute stroke trials, its power is limited when analyzed in dichotomized fashion and its indication of effect size challenging to interpret when analyzed ordinally. Weighting the seven Rankin levels by utilities may improve scale interpretability while preserving statistical power. Methods A utility weighted mRS (UW-mRS) was derived by averaging values from time-tradeoff (patient centered) and person-tradeoff (clinician centered) studies. The UW-mRS, standard ordinal mRS, and dichotomized mRS were applied to 11 trials or meta-analyses of acute stroke treatments, including lytic, endovascular reperfusion, blood pressure moderation, and hemicraniectomy interventions. Results Utility values were: mRS 0 = 1.0; mRS 1 = 0.91; mRS 2 = 0.76; mRS 3 = 0.65; mRS 4 = 0.33; mRS 5 and 6 = 0. For trials with unidirectional treatment effects, the UW-mRS paralleled the ordinal mRS and outperformed dichotomous mRS analyses. Both the UW-mRS and the ordinal mRS were statistically significant in six of eight unidirectional effect trials, while dichotomous analyses were statistically significant in two to four of eight. In bidirectional effect trials, both the UW-mRS and ordinal tests captured the divergent treatment effects by showing neutral results whereas some dichotomized analyses showed positive results. Mean utility differences in trials with statistically significant positive results ranged from 0.026 to 0.249. Conclusion A utility-weighted mRS performs similarly to the standard ordinal mRS in detecting treatment effects in actual stroke trials and ensures the quantitative outcome is a valid reflection of patient-centered benefits. PMID:26138130
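
    Applying the reported utility weights to two arms' mRS distributions and comparing mean utilities can be sketched as below. The outcome data are simulated and the permutation test is a generic choice, not the trial analyses summarized above.

```python
import numpy as np

# Utility weights from the abstract for mRS 0..6.
UTILITY = np.array([1.0, 0.91, 0.76, 0.65, 0.33, 0.0, 0.0])

def mean_utility_diff(mrs_treat, mrs_ctrl):
    return UTILITY[mrs_treat].mean() - UTILITY[mrs_ctrl].mean()

rng = np.random.default_rng(5)
# Hypothetical 90-day mRS scores for two arms of a stroke trial.
mrs_treat = rng.choice(7, size=200, p=[0.18, 0.17, 0.15, 0.15, 0.15, 0.10, 0.10])
mrs_ctrl  = rng.choice(7, size=200, p=[0.10, 0.13, 0.15, 0.17, 0.20, 0.12, 0.13])

obs = mean_utility_diff(mrs_treat, mrs_ctrl)
pooled = np.concatenate([mrs_treat, mrs_ctrl])
perms = []
for _ in range(2000):
    rng.shuffle(pooled)                           # relabel arms at random
    perms.append(mean_utility_diff(pooled[:200], pooled[200:]))
p = np.mean(np.abs(perms) >= abs(obs))            # two-sided permutation p-value
print(f"mean utility difference {obs:.3f}, permutation p = {p:.3f}")
```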

  13. A phylogenetic transform enhances analysis of compositional microbiota data.

    PubMed

    Silverman, Justin D; Washburne, Alex D; Mukherjee, Sayan; David, Lawrence A

    2017-02-15

    Surveys of microbial communities (microbiota), typically measured as relative abundance of species, have illustrated the importance of these communities in human health and disease. Yet, statistical artifacts commonly plague the analysis of relative abundance data. Here, we introduce the PhILR transform, which incorporates microbial evolutionary models with the isometric log-ratio transform to allow off-the-shelf statistical tools to be safely applied to microbiota surveys. We demonstrate that analyses of community-level structure can be applied to PhILR transformed data with performance on benchmarks rivaling or surpassing standard tools. Additionally, by decomposing distance in the PhILR transformed space, we identified neighboring clades that may have adapted to distinct human body sites. Decomposing variance revealed that covariation of bacterial clades within human body sites increases with phylogenetic relatedness. Together, these findings illustrate how the PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges and enable evolutionary insights relevant to microbial communities.

  14. Psychology, Science, and Knowledge Construction: Broadening Perspectives from the Replication Crisis.

    PubMed

    Shrout, Patrick E; Rodgers, Joseph L

    2018-01-04

    Psychology advances knowledge by testing statistical hypotheses using empirical observations and data. The expectation is that most statistically significant findings can be replicated in new data and in new laboratories, but in practice many findings have replicated less often than expected, leading to claims of a replication crisis. We review recent methodological literature on questionable research practices, meta-analysis, and power analysis to explain the apparently high rates of failure to replicate. Psychologists can improve research practices to advance knowledge in ways that improve replicability. We recommend that researchers adopt open science conventions of preregistration and full disclosure and that replication efforts be based on multiple studies rather than on a single replication attempt. We call for more sophisticated power analyses, careful consideration of the various influences on effect sizes, and more complete disclosure of nonsignificant as well as statistically significant findings.
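
    A basic version of the power analysis the authors call for can be run with statsmodels. The effect size and sample sizes below are illustrative, not values from the article.

```python
from statsmodels.stats.power import TTestIndPower

power_analysis = TTestIndPower()

# Sample size per group needed to detect a small-to-medium effect (d = 0.3)
# with 90% power in a two-sided independent-samples t-test.
n_per_group = power_analysis.solve_power(effect_size=0.3, power=0.9,
                                         alpha=0.05, alternative="two-sided")
print(round(n_per_group))   # roughly 235 per group

# Power actually achieved by a typical n = 50 per group study for d = 0.3.
achieved = power_analysis.solve_power(effect_size=0.3, nobs1=50, alpha=0.05)
print(round(achieved, 2))   # about 0.32
```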

  15. Effects of different preservation methods on inter simple sequence repeat (ISSR) and random amplified polymorphic DNA (RAPD) molecular markers in botanic samples.

    PubMed

    Wang, Xiaolong; Li, Lin; Zhao, Jiaxin; Li, Fangliang; Guo, Wei; Chen, Xia

    2017-04-01

    To evaluate the effects of different preservation methods (stored in a -20°C ice chest, preserved in liquid nitrogen and dried in silica gel) on inter simple sequence repeat (ISSR) or random amplified polymorphic DNA (RAPD) analyses in various botanical specimens (including broad-leaved plants, needle-leaved plants and succulent plants) for different times (three weeks and three years), we used a statistical analysis based on the number of bands, genetic index and cluster analysis. The results demonstrate that methods used to preserve samples can provide sufficient amounts of genomic DNA for ISSR and RAPD analyses; however, the effect of different preservation methods on these analyses vary significantly, and the preservation time has little effect on these analyses. Our results provide a reference for researchers to select the most suitable preservation method depending on their study subject for the analysis of molecular markers based on genomic DNA. Copyright © 2017 Académie des sciences. Published by Elsevier Masson SAS. All rights reserved.

  16. Cryptic or pseudocryptic: can morphological methods inform copepod taxonomy? An analysis of publications and a case study of the Eurytemora affinis species complex

    PubMed Central

    Lajus, Dmitry; Sukhikh, Natalia; Alekseev, Victor

    2015-01-01

    Interest in cryptic species has increased significantly with current progress in genetic methods. The large number of cryptic species suggests that the resolution of traditional morphological techniques may be insufficient for taxonomical research. However, some species now considered to be cryptic may, in fact, be designated pseudocryptic after close morphological examination. Thus the "cryptic or pseudocryptic" dilemma speaks to the resolution of morphological analysis and its utility for identifying species. We address this dilemma first by systematically reviewing data published from 1980 to 2013 on cryptic species of Copepoda and then by performing an in-depth morphological study of the former Eurytemora affinis complex of cryptic species. Analyzing the published data showed that, in 5 of 24 revisions eligible for systematic review, cryptic species assignment was based solely on the genetic variation of forms without detailed morphological analysis to confirm the assignment. Therefore, some newly described cryptic species might be designated pseudocryptic under more detailed morphological analysis, as happened with the Eurytemora affinis complex. Recent genetic analyses of the complex found high levels of heterogeneity without morphological differences, and the complex was therefore argued to be cryptic. However, subsequent detailed morphological analyses allowed a number of valid species to be described. Our study of this species complex, using in-depth statistical analyses not usually applied when describing new species, confirmed considerable differences between the former cryptic species. In particular, fluctuating asymmetry (FA), the random variation of left and right structures, was significantly different between forms and provided independent information about their status. Our work showed that multivariate statistical approaches, such as principal component analysis, can be powerful techniques for the morphological discrimination of cryptic taxa. Despite increasing cryptic species designations, morphological techniques have great potential in determining copepod taxonomy. PMID:26120427

  17. [Database supported electronic retrospective analyses in radiation oncology: establishing a workflow using the example of pancreatic cancer].

    PubMed

    Kessel, K A; Habermehl, D; Bohn, C; Jäger, A; Floca, R O; Zhang, L; Bougatf, N; Bendl, R; Debus, J; Combs, S E

    2012-12-01

    Especially in the field of radiation oncology, handling a large variety of voluminous datasets from various information systems in different documentation styles efficiently is crucial for patient care and research. To date, conducting retrospective clinical analyses is rather difficult and time consuming. With the example of patients with pancreatic cancer treated with radio-chemotherapy, we performed a therapy evaluation by using an analysis system connected with a documentation system. A total number of 783 patients have been documented into a professional, database-based documentation system. Information about radiation therapy, diagnostic images and dose distributions have been imported into the web-based system. For 36 patients with disease progression after neoadjuvant chemoradiation, we designed and established an analysis workflow. After an automatic registration of the radiation plans with the follow-up images, the recurrence volumes are segmented manually. Based on these volumes the DVH (dose volume histogram) statistic is calculated, followed by the determination of the dose applied to the region of recurrence. All results are saved in the database and included in statistical calculations. The main goal of using an automatic analysis tool is to reduce time and effort conducting clinical analyses, especially with large patient groups. We showed a first approach and use of some existing tools, however manual interaction is still necessary. Further steps need to be taken to enhance automation. Already, it has become apparent that the benefits of digital data management and analysis lie in the central storage of data and reusability of the results. Therefore, we intend to adapt the analysis system to other types of tumors in radiation oncology.

  18. On Improving the Quality and Interpretation of Environmental Assessments using Statistical Analysis and Geographic Information Systems

    NASA Astrophysics Data System (ADS)

    Karuppiah, R.; Faldi, A.; Laurenzi, I.; Usadi, A.; Venkatesh, A.

    2014-12-01

    An increasing number of studies are focused on assessing the environmental footprint of different products and processes, especially using life cycle assessment (LCA). This work shows how combining statistical methods and Geographic Information Systems (GIS) with environmental analyses can help improve the quality of results and their interpretation. Most environmental assessments in literature yield single numbers that characterize the environmental impact of a process/product - typically global or country averages, often unchanging in time. In this work, we show how statistical analysis and GIS can help address these limitations. For example, we demonstrate a method to separately quantify uncertainty and variability in the result of LCA models using a power generation case study. This is important for rigorous comparisons between the impacts of different processes. Another challenge is lack of data that can affect the rigor of LCAs. We have developed an approach to estimate environmental impacts of incompletely characterized processes using predictive statistical models. This method is applied to estimate unreported coal power plant emissions in several world regions. There is also a general lack of spatio-temporal characterization of the results in environmental analyses. For instance, studies that focus on water usage do not put in context where and when water is withdrawn. Through the use of hydrological modeling combined with GIS, we quantify water stress on a regional and seasonal basis to understand water supply and demand risks for multiple users. Another example where it is important to consider regional dependency of impacts is when characterizing how agricultural land occupation affects biodiversity in a region. We developed a data-driven methodology used in conjunction with GIS to determine if there is a statistically significant difference between the impacts of growing different crops on different species in various biomes of the world.

  19. The Active for Life Year 5 (AFLY5) school-based cluster randomised controlled trial protocol: detailed statistical analysis plan.

    PubMed

    Lawlor, Debbie A; Peters, Tim J; Howe, Laura D; Noble, Sian M; Kipping, Ruth R; Jago, Russell

    2013-07-24

    The Active For Life Year 5 (AFLY5) randomised controlled trial protocol was published in this journal in 2011. It provided a summary analysis plan. This publication is an update of that protocol and provides a detailed analysis plan. This update provides a detailed analysis plan of the effectiveness and cost-effectiveness of the AFLY5 intervention. The plan includes details of how variables will be quality control checked and the criteria used to define derived variables. Details of four key analyses are provided: (a) effectiveness analysis 1 (the effect of the AFLY5 intervention on primary and secondary outcomes at the end of the school year in which the intervention is delivered); (b) mediation analyses (secondary analyses examining the extent to which any effects of the intervention are mediated via self-efficacy, parental support and knowledge, through which the intervention is theoretically believed to act); (c) effectiveness analysis 2 (the effect of the AFLY5 intervention on primary and secondary outcomes 12 months after the end of the intervention) and (d) cost effectiveness analysis (the cost-effectiveness of the AFLY5 intervention). The details include how the intention to treat and per-protocol analyses were defined and planned sensitivity analyses for dealing with missing data. A set of dummy tables are provided in Additional file 1. This detailed analysis plan was written prior to any analyst having access to any data and was approved by the AFLY5 Trial Steering Committee. Its publication will ensure that analyses are in accordance with an a priori plan related to the trial objectives and not driven by knowledge of the data. ISRCTN50133740.

  20. A Statistical Method for Synthesizing Mediation Analyses Using the Product of Coefficient Approach Across Multiple Trials

    PubMed Central

    Huang, Shi; MacKinnon, David P.; Perrino, Tatiana; Gallo, Carlos; Cruden, Gracelyn; Brown, C Hendricks

    2016-01-01

    Mediation analysis often requires larger sample sizes than main effect analysis to achieve the same statistical power. Combining results across similar trials may be the only practical option for increasing statistical power for mediation analysis in some situations. In this paper, we propose a method to estimate: 1) marginal means for mediation path a, the relation of the independent variable to the mediator; 2) marginal means for path b, the relation of the mediator to the outcome, across multiple trials; and 3) the between-trial level variance-covariance matrix based on a bivariate normal distribution. We present the statistical theory and an R computer program to combine regression coefficients from multiple trials to estimate a combined mediated effect and confidence interval under a random effects model. Values of coefficients a and b, along with their standard errors from each trial are the input for the method. This marginal likelihood based approach with Monte Carlo confidence intervals provides more accurate inference than the standard meta-analytic approach. We discuss computational issues, apply the method to two real-data examples and make recommendations for the use of the method in different settings. PMID:28239330
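
    The Monte Carlo confidence-interval step for a mediated effect a x b can be sketched in simplified form as below. This assumes approximately normal, independent pooled estimates of a and b with hypothetical values; it is not the paper's R program or its full random-effects machinery.

```python
import numpy as np

def monte_carlo_ci(a, se_a, b, se_b, n_draws=100_000, alpha=0.05, seed=0):
    """Monte Carlo confidence interval for the mediated effect a * b,
    assuming approximately normal, independent estimates of a and b."""
    rng = np.random.default_rng(seed)
    draws = rng.normal(a, se_a, n_draws) * rng.normal(b, se_b, n_draws)
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return a * b, (lo, hi)

# Hypothetical pooled path coefficients across trials.
est, ci = monte_carlo_ci(a=0.40, se_a=0.10, b=0.25, se_b=0.08)
print(f"mediated effect {est:.3f}, 95% CI ({ci[0]:.3f}, {ci[1]:.3f})")
```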
