Sample records for method statistical analyses

  1. Citation of previous meta-analyses on the same topic: a clue to perpetuation of incorrect methods?

    PubMed

    Li, Tianjing; Dickersin, Kay

    2013-06-01

    Systematic reviews and meta-analyses serve as a basis for decision-making and clinical practice guidelines and should be carried out using appropriate methodology to avoid incorrect inferences. We describe the characteristics, statistical methods used for meta-analyses, and citation patterns of all 21 glaucoma systematic reviews we identified pertaining to the effectiveness of prostaglandin analog eye drops in treating primary open-angle glaucoma, published between December 2000 and February 2012. We abstracted data, assessed whether appropriate statistical methods were applied in meta-analyses, and examined citation patterns of included reviews. We identified two forms of problematic statistical analyses in 9 of the 21 systematic reviews examined. Except in 1 case, none of the 9 reviews that used incorrect statistical methods cited a previously published review that used appropriate methods. Reviews that used incorrect methods were cited 2.6 times more often than reviews that used appropriate statistical methods. We speculate that by emulating the statistical methodology of previous systematic reviews, systematic review authors may have perpetuated incorrect approaches to meta-analysis. The use of incorrect statistical methods, perhaps through emulating methods described in previous research, calls conclusions of systematic reviews into question and may lead to inappropriate patient care. We urge systematic review authors and journal editors to seek the advice of experienced statisticians before undertaking or accepting for publication a systematic review and meta-analysis. The author(s) have no proprietary or commercial interest in any materials discussed in this article. Copyright © 2013 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.

  2. A Nonparametric Geostatistical Method For Estimating Species Importance

    Treesearch

    Andrew J. Lister; Rachel Riemann; Michael Hoppus

    2001-01-01

    Parametric statistical methods are not always appropriate for conducting spatial analyses of forest inventory data. Parametric geostatistical methods such as variography and kriging are essentially averaging procedures, and thus can be affected by extreme values. Furthermore, non-normal distributions violate the assumptions of analyses in which test statistics are...

  3. A systematic review of the quality of statistical methods employed for analysing quality of life data in cancer randomised controlled trials.

    PubMed

    Hamel, Jean-Francois; Saulnier, Patrick; Pe, Madeline; Zikos, Efstathios; Musoro, Jammbe; Coens, Corneel; Bottomley, Andrew

    2017-09-01

    Over the last decades, Health-related Quality of Life (HRQoL) end-points have become an important outcome of the randomised controlled trials (RCTs). HRQoL methodology in RCTs has improved following international consensus recommendations. However, no international recommendations exist concerning the statistical analysis of such data. The aim of our study was to identify and characterise the quality of the statistical methods commonly used for analysing HRQoL data in cancer RCTs. Building on our recently published systematic review, we analysed a total of 33 published RCTs studying the HRQoL methods reported in RCTs since 1991. We focussed on the ability of the methods to deal with the three major problems commonly encountered when analysing HRQoL data: their multidimensional and longitudinal structure and the commonly high rate of missing data. All studies reported HRQoL being assessed repeatedly over time for a period ranging from 2 to 36 months. Missing data were common, with compliance rates ranging from 45% to 90%. From the 33 studies considered, 12 different statistical methods were identified. Twenty-nine studies analysed each of the questionnaire sub-dimensions without type I error adjustment. Thirteen studies repeated the HRQoL analysis at each assessment time again without type I error adjustment. Only 8 studies used methods suitable for repeated measurements. Our findings show a lack of consistency in statistical methods for analysing HRQoL data. Problems related to multiple comparisons were rarely considered leading to a high risk of false positive results. It is therefore critical that international recommendations for improving such statistical practices are developed. Copyright © 2017. Published by Elsevier Ltd.

  4. Quantifying, displaying and accounting for heterogeneity in the meta-analysis of RCTs using standard and generalised Q statistics

    PubMed Central

    2011-01-01

    Background: Clinical researchers have often preferred to use a fixed effects model for the primary interpretation of a meta-analysis. Heterogeneity is usually assessed via the well-known Q and I² statistics, along with the random effects estimate they imply. In recent years, alternative methods for quantifying heterogeneity have been proposed that are based on a 'generalised' Q statistic. Methods: We review 18 IPD meta-analyses of RCTs into treatments for cancer, in order to quantify the amount of heterogeneity present and also to discuss practical methods for explaining heterogeneity. Results: Differing results were obtained when the standard Q and I² statistics were used to test for the presence of heterogeneity. The two meta-analyses with the largest amount of heterogeneity were investigated further, and on inspection the straightforward application of a random effects model was not deemed appropriate. Compared to the standard Q statistic, the generalised Q statistic provided a more accurate platform for estimating the amount of heterogeneity in the 18 meta-analyses. Conclusions: Explaining heterogeneity via the pre-specification of trial subgroups, graphical diagnostic tools and sensitivity analyses produced a more desirable outcome than an automatic application of the random effects model. Generalised Q statistic methods for quantifying and adjusting for heterogeneity should be incorporated as standard into statistical software. Software is provided to help achieve this aim. PMID:21473747
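
    A minimal sketch of the standard (not generalised) heterogeneity statistics discussed above, computing Cochran's Q and I² for an inverse-variance fixed-effect meta-analysis; the effect estimates and standard errors are hypothetical.

```python
# Minimal sketch: Cochran's Q and I^2 for an inverse-variance meta-analysis.
# Effect estimates and standard errors below are hypothetical illustration data.
import numpy as np

effects = np.array([0.20, 0.35, 0.10, 0.42, 0.28])   # per-trial effect estimates
se = np.array([0.10, 0.12, 0.15, 0.11, 0.09])        # their standard errors

w = 1.0 / se**2                                       # inverse-variance weights
pooled = np.sum(w * effects) / np.sum(w)              # fixed-effect pooled estimate
Q = np.sum(w * (effects - pooled) ** 2)               # Cochran's Q statistic
df = len(effects) - 1
I2 = max(0.0, (Q - df) / Q) * 100                     # I^2 as a percentage

print(f"pooled = {pooled:.3f}, Q = {Q:.2f} on {df} df, I^2 = {I2:.1f}%")
```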

  5. Trends in statistical methods in articles published in Archives of Plastic Surgery between 2012 and 2017.

    PubMed

    Han, Kyunghwa; Jung, Inkyung

    2018-05-01

    This review article presents an assessment of trends in statistical methods and an evaluation of their appropriateness in articles published in the Archives of Plastic Surgery (APS) from 2012 to 2017. We reviewed 388 original articles published in APS between 2012 and 2017. We categorized the articles that used statistical methods according to the type of statistical method, the number of statistical methods, and the type of statistical software used. We checked whether there were errors in the description of statistical methods and results. A total of 230 articles (59.3%) published in APS between 2012 and 2017 used one or more statistical methods. Within these articles, there were 261 applications of statistical methods with continuous or ordinal outcomes, and 139 applications of statistical methods with categorical outcomes. The Pearson chi-square test (17.4%) and the Mann-Whitney U test (14.4%) were the most frequently used methods. Errors in describing statistical methods and results were found in 133 of the 230 articles (57.8%). Inadequate description of P-values was the most common error (39.1%). Among the 230 articles that used statistical methods, 71.7% provided details about the statistical software programs used for the analyses. SPSS was predominantly used in the articles that presented statistical analyses. We found that the use of statistical methods in APS has increased over the last 6 years. It seems that researchers have been paying more attention to the proper use of statistics in recent years. It is expected that these positive trends will continue in APS.

  6. Statistical Data Analyses of Trace Chemical, Biochemical, and Physical Analytical Signatures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Udey, Ruth Norma

    Analytical and bioanalytical chemistry measurement results are most meaningful when interpreted using rigorous statistical treatments of the data. The same data set may provide many dimensions of information depending on the questions asked through the applied statistical methods. Three principal projects illustrated the wealth of information gained through the application of statistical data analyses to diverse problems.

  7. Reporting quality of statistical methods in surgical observational studies: protocol for systematic review.

    PubMed

    Wu, Robert; Glen, Peter; Ramsay, Tim; Martel, Guillaume

    2014-06-28

    Observational studies dominate the surgical literature. Statistical adjustment is an important strategy to account for confounders in observational studies. Research has shown that published articles are often poor in statistical quality, which may jeopardize their conclusions. The Statistical Analyses and Methods in the Published Literature (SAMPL) guidelines have been published to help establish standards for statistical reporting. This study will seek to determine whether the quality of statistical adjustment and the reporting of these methods are adequate in surgical observational studies. We hypothesize that incomplete reporting will be found in all surgical observational studies, and that the quality and reporting of these methods will be of lower quality in surgical journals when compared with medical journals. Finally, this work will seek to identify predictors of high-quality reporting. This work will examine the top five general surgical and medical journals, based on a 5-year impact factor (2007-2012). All observational studies investigating an intervention related to an essential component area of general surgery (defined by the American Board of Surgery), with an exposure, outcome, and comparator, will be included in this systematic review. Essential elements related to statistical reporting and quality were extracted from the SAMPL guidelines and include domains such as intent of analysis, primary analysis, multiple comparisons, numbers and descriptive statistics, association and correlation analyses, linear regression, logistic regression, Cox proportional hazard analysis, analysis of variance, survival analysis, propensity analysis, and independent and correlated analyses. Each article will be scored as a proportion based on fulfilling criteria in relevant analyses used in the study. A logistic regression model will be built to identify variables associated with high-quality reporting. A comparison will be made between the scores of surgical observational studies published in medical versus surgical journals. Secondary outcomes will pertain to individual domains of analysis. Sensitivity analyses will be conducted. This study will explore the reporting and quality of statistical analyses in surgical observational studies published in the most referenced surgical and medical journals in 2013 and examine whether variables (including the type of journal) can predict high-quality reporting.

  8. A new statistical method for design and analyses of component tolerance

    NASA Astrophysics Data System (ADS)

    Movahedi, Mohammad Mehdi; Khounsiavash, Mohsen; Otadi, Mahmood; Mosleh, Maryam

    2017-03-01

    Tolerancing conducted by design engineers to meet customers' needs is a prerequisite for producing high-quality products. Engineers use handbooks to conduct tolerancing. While the use of statistical methods for tolerancing is not new, engineers often rely on known distributions, including the normal distribution. Yet, when the statistical distribution of the variable of interest is unknown, a new statistical method is needed to design the tolerance. In this paper, we use the generalized lambda distribution for the design and analysis of component tolerances, and we use the percentile method (PM) to estimate the distribution parameters. The findings indicate that, when the distribution of the component data is unknown, the proposed method can be used to expedite the design of component tolerances. Moreover, in the case of assembled sets, wider tolerances can be specified for each component while maintaining the same target performance.

  9. Methodological Standards for Meta-Analyses and Qualitative Systematic Reviews of Cardiac Prevention and Treatment Studies: A Scientific Statement From the American Heart Association.

    PubMed

    Rao, Goutham; Lopez-Jimenez, Francisco; Boyd, Jack; D'Amico, Frank; Durant, Nefertiti H; Hlatky, Mark A; Howard, George; Kirley, Katherine; Masi, Christopher; Powell-Wiley, Tiffany M; Solomonides, Anthony E; West, Colin P; Wessel, Jennifer

    2017-09-05

    Meta-analyses are becoming increasingly popular, especially in the fields of cardiovascular disease prevention and treatment. They are often considered to be a reliable source of evidence for making healthcare decisions. Unfortunately, problems among meta-analyses such as the misapplication and misinterpretation of statistical methods and tests are long-standing and widespread. The purposes of this statement are to review key steps in the development of a meta-analysis and to provide recommendations that will be useful for carrying out meta-analyses and for readers and journal editors, who must interpret the findings and gauge methodological quality. To make the statement practical and accessible, detailed descriptions of statistical methods have been omitted. Based on a survey of cardiovascular meta-analyses, published literature on methodology, expert consultation, and consensus among the writing group, key recommendations are provided. Recommendations reinforce several current practices, including protocol registration; comprehensive search strategies; methods for data extraction and abstraction; methods for identifying, measuring, and dealing with heterogeneity; and statistical methods for pooling results. Other practices should be discontinued, including the use of levels of evidence and evidence hierarchies to gauge the value and impact of different study designs (including meta-analyses) and the use of structured tools to assess the quality of studies to be included in a meta-analysis. We also recommend choosing a pooling model for conventional meta-analyses (fixed effect or random effects) on the basis of clinical and methodological similarities among studies to be included, rather than the results of a test for statistical heterogeneity. © 2017 American Heart Association, Inc.

  10. Use of Statistical Analyses in the Ophthalmic Literature

    PubMed Central

    Lisboa, Renato; Meira-Freitas, Daniel; Tatham, Andrew J.; Marvasti, Amir H.; Sharpsten, Lucie; Medeiros, Felipe A.

    2014-01-01

    Purpose: To identify the most commonly used statistical analyses in the ophthalmic literature and to determine the likely gain in comprehension of the literature that readers could expect if they were to sequentially add knowledge of more advanced techniques to their statistical repertoire. Design: Cross-sectional study. Methods: All articles published from January 2012 to December 2012 in Ophthalmology, American Journal of Ophthalmology and Archives of Ophthalmology were reviewed. A total of 780 peer-reviewed articles were included. Two reviewers examined each article and assigned categories to each one depending on the type of statistical analyses used. Discrepancies between reviewers were resolved by consensus. Main Outcome Measures: Total number and percentage of articles containing each category of statistical analysis were obtained. Additionally we estimated the accumulated number and percentage of articles that a reader would be expected to be able to interpret depending on their statistical repertoire. Results: Readers with little or no statistical knowledge would be expected to be able to interpret the statistical methods presented in only 20.8% of articles. In order to understand more than half (51.4%) of the articles published, readers were expected to be familiar with at least 15 different statistical methods. Knowledge of 21 categories of statistical methods was necessary to comprehend 70.9% of articles, while knowledge of more than 29 categories was necessary to comprehend more than 90% of articles. Articles in retina and glaucoma subspecialties showed a tendency for using more complex analysis when compared to cornea. Conclusions: Readers of clinical journals in ophthalmology need to have substantial knowledge of statistical methodology to understand the results of published studies in the literature. The frequency of use of complex statistical analyses also indicates that those involved in the editorial peer-review process must have sound statistical knowledge in order to critically appraise articles submitted for publication. The results of this study could provide guidance to direct the statistical learning of clinical ophthalmologists, researchers and educators involved in the design of courses for residents and medical students. PMID:24612977

  11. Dealing with missing standard deviation and mean values in meta-analysis of continuous outcomes: a systematic review.

    PubMed

    Weir, Christopher J; Butcher, Isabella; Assi, Valentina; Lewis, Stephanie C; Murray, Gordon D; Langhorne, Peter; Brady, Marian C

    2018-03-07

    Rigorous, informative meta-analyses rely on availability of appropriate summary statistics or individual participant data. For continuous outcomes, especially those with naturally skewed distributions, summary information on the mean or variability often goes unreported. While full reporting of original trial data is the ideal, we sought to identify methods for handling unreported mean or variability summary statistics in meta-analysis. We undertook two systematic literature reviews to identify methodological approaches used to deal with missing mean or variability summary statistics. Five electronic databases were searched, in addition to the Cochrane Colloquium abstract books and the Cochrane Statistics Methods Group mailing list archive. We also conducted cited reference searching and emailed topic experts to identify recent methodological developments. Details recorded included the description of the method, the information required to implement the method, any underlying assumptions and whether the method could be readily applied in standard statistical software. We provided a summary description of the methods identified, illustrating selected methods in example meta-analysis scenarios. For missing standard deviations (SDs), following screening of 503 articles, fifteen methods were identified in addition to those reported in a previous review. These included Bayesian hierarchical modelling at the meta-analysis level; summary statistic level imputation based on observed SD values from other trials in the meta-analysis; a practical approximation based on the range; and algebraic estimation of the SD based on other summary statistics. Following screening of 1124 articles for methods estimating the mean, one approximate Bayesian computation approach and three papers based on alternative summary statistics were identified. Illustrative meta-analyses showed that when replacing a missing SD the approximation using the range minimised loss of precision and generally performed better than omitting trials. When estimating missing means, a formula using the median, lower quartile and upper quartile performed best in preserving the precision of the meta-analysis findings, although in some scenarios, omitting trials gave superior results. Methods based on summary statistics (minimum, maximum, lower quartile, upper quartile, median) reported in the literature facilitate more comprehensive inclusion of randomised controlled trials with missing mean or variability summary statistics within meta-analyses.
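
    The kinds of summary-statistic approximations surveyed in this review can be illustrated with a short sketch; the specific formulas below (quartile-based mean, range/4 and IQR-based SD) are common simple approximations and are not necessarily the exact estimators the authors evaluated.

```python
# Rough sketch of summary-statistic approximations of the kind surveyed above;
# the exact formulas assessed in the review may differ. Inputs are hypothetical.
def mean_from_quartiles(q1, median, q3):
    # Simple approximation of a missing mean from the median and quartiles.
    return (q1 + median + q3) / 3.0

def sd_from_range(minimum, maximum):
    # Classic "range/4" approximation for a missing standard deviation.
    return (maximum - minimum) / 4.0

def sd_from_iqr(q1, q3):
    # IQR-based approximation, exact for a normal distribution (IQR ~ 1.35 SD).
    return (q3 - q1) / 1.35

print(mean_from_quartiles(2.1, 3.0, 4.4))   # ~3.17
print(sd_from_range(0.5, 9.5))              # 2.25
print(sd_from_iqr(2.1, 4.4))                # ~1.70
```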

  12. Survey of the Methods and Reporting Practices in Published Meta-analyses of Test Performance: 1987 to 2009

    ERIC Educational Resources Information Center

    Dahabreh, Issa J.; Chung, Mei; Kitsios, Georgios D.; Terasawa, Teruhiko; Raman, Gowri; Tatsioni, Athina; Tobar, Annette; Lau, Joseph; Trikalinos, Thomas A.; Schmid, Christopher H.

    2013-01-01

    We performed a survey of meta-analyses of test performance to describe the evolution in their methods and reporting. Studies were identified through MEDLINE (1966-2009), reference lists, and relevant reviews. We extracted information on clinical topics, literature review methods, quality assessment, and statistical analyses. We reviewed 760…

  13. Secondary Analysis of National Longitudinal Transition Study 2 Data

    ERIC Educational Resources Information Center

    Hicks, Tyler A.; Knollman, Greg A.

    2015-01-01

    This review examines published secondary analyses of National Longitudinal Transition Study 2 (NLTS2) data, with a primary focus upon statistical objectives, paradigms, inferences, and methods. Its primary purpose was to determine which statistical techniques have been common in secondary analyses of NLTS2 data. The review begins with an…

  14. Dissecting the genetics of complex traits using summary association statistics.

    PubMed

    Pasaniuc, Bogdan; Price, Alkes L

    2017-02-01

    During the past decade, genome-wide association studies (GWAS) have been used to successfully identify tens of thousands of genetic variants associated with complex traits and diseases. These studies have produced extensive repositories of genetic variation and trait measurements across large numbers of individuals, providing tremendous opportunities for further analyses. However, privacy concerns and other logistical considerations often limit access to individual-level genetic data, motivating the development of methods that analyse summary association statistics. Here, we review recent progress on statistical methods that leverage summary association data to gain insights into the genetic basis of complex traits and diseases.

  15. Statistical innovations in diagnostic device evaluation.

    PubMed

    Yu, Tinghui; Li, Qin; Gray, Gerry; Yue, Lilly Q

    2016-01-01

    Due to rapid technological development, innovations in diagnostic devices are proceeding at an extremely fast pace. Accordingly, the needs for adopting innovative statistical methods have emerged in the evaluation of diagnostic devices. Statisticians in the Center for Devices and Radiological Health at the Food and Drug Administration have provided leadership in implementing statistical innovations. The innovations discussed in this article include: the adoption of bootstrap and Jackknife methods, the implementation of appropriate multiple reader multiple case study design, the application of robustness analyses for missing data, and the development of study designs and data analyses for companion diagnostics.
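
    A minimal sketch of the bootstrap idea mentioned above, applied to the sensitivity of a diagnostic device; the case counts are hypothetical and the percentile interval is only one of several bootstrap CI constructions.

```python
# Sketch of a nonparametric bootstrap CI for sensitivity, illustrating the kind
# of resampling method mentioned above; the data below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
# 1 = diseased case correctly flagged by the device, 0 = missed (hypothetical)
results_in_diseased = np.array([1] * 42 + [0] * 8)

boot_sens = []
for _ in range(5000):
    resample = rng.choice(results_in_diseased, size=results_in_diseased.size, replace=True)
    boot_sens.append(resample.mean())

lo, hi = np.percentile(boot_sens, [2.5, 97.5])
print(f"sensitivity = {results_in_diseased.mean():.2f}, 95% bootstrap CI ({lo:.2f}, {hi:.2f})")
```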

  16. Statistical power of intervention analyses: simulation and empirical application to treated lumber prices

    Treesearch

    Jeffrey P. Prestemon

    2009-01-01

    Timber product markets are subject to large shocks deriving from natural disturbances and policy shifts. Statistical modeling of shocks is often done to assess their economic importance. In this article, I simulate the statistical power of univariate and bivariate methods of shock detection using time series intervention models. Simulations show that bivariate methods...

  17. Conceptual and statistical problems associated with the use of diversity indices in ecology.

    PubMed

    Barrantes, Gilbert; Sandoval, Luis

    2009-09-01

    Diversity indices, particularly the Shannon-Wiener index, have been used extensively in analyzing patterns of diversity at different geographic and ecological scales. These indices have serious conceptual and statistical problems which make comparisons of species richness or species abundances across communities nearly impossible. There is often no single statistical method that retains all the information needed to answer even a simple question. However, multivariate analyses, such as cluster analysis or multiple regression, could be used instead of diversity indices. More complex multivariate analyses, such as Canonical Correspondence Analysis, provide very valuable information on environmental variables associated with the presence and abundance of the species in a community. In addition, particular hypotheses associated with changes in species richness across localities, or with changes in the abundance of one species or a group of species, can be tested using univariate, bivariate, and/or rarefaction statistical tests. The rarefaction method has proved robust for standardizing all samples to a common size. Even the simplest method, such as reporting the number of species per taxonomic category, possibly provides more information than a diversity index value.
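
    As an illustration of the quantities discussed above, the sketch below computes the Shannon-Wiener index and an individual-based (Hurlbert) rarefaction estimate for one hypothetical sample; it is a generic illustration, not code from the article.

```python
# Sketch: Shannon-Wiener index and individual-based rarefaction for one sample.
# Species counts are hypothetical; rarefaction gives the expected species count
# in a random subsample of n individuals, used to standardise sample sizes.
import numpy as np
from math import comb

counts = np.array([25, 14, 9, 5, 3, 2, 1, 1])   # individuals per species
N = counts.sum()

p = counts / N
shannon = -np.sum(p * np.log(p))                 # Shannon-Wiener H'

def rarefied_richness(counts, n):
    # Hurlbert's expected species richness in a random subsample of n individuals.
    N = counts.sum()
    return sum(1 - comb(N - Ni, n) / comb(N, n) for Ni in counts)

print(f"H' = {shannon:.3f}, E[S_20] = {rarefied_richness(counts, 20):.2f}")
```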

  18. Quantitative Methods for Analysing Joint Questionnaire Data: Exploring the Role of Joint in Force Design

    DTIC Science & Technology

    2015-08-01

    the nine questions. The Statistical Package for the Social Sciences (SPSS) [11] was used to conduct statistical analysis on the sample. Two types...constructs. SPSS was again used to conduct statistical analysis on the sample. This time factor analysis was conducted. Factor analysis attempts to...Business Research Methods and Statistics using SPSS. P432. 11 IBM SPSS Statistics. (2012) 12 Burns, R.B., Burns, R.A. (2008) ‘Business Research

  19. Statistical strategies to quantify respiratory sinus arrhythmia: Are commonly used metrics equivalent?

    PubMed Central

    Lewis, Gregory F.; Furman, Senta A.; McCool, Martha F.; Porges, Stephen W.

    2011-01-01

    Three frequently used RSA metrics are investigated to document violations of assumptions for parametric analyses, moderation by respiration, influences of nonstationarity, and sensitivity to vagal blockade. Although all metrics are highly correlated, new findings illustrate that the metrics are noticeably different on the above dimensions. Only one method conforms to the assumptions for parametric analyses, is not moderated by respiration, is not influenced by nonstationarity, and reliably generates stronger effect sizes. Moreover, this method is also the most sensitive to vagal blockade. Specific features of this method may provide insights into improving the statistical characteristics of other commonly used RSA metrics. These data provide the evidence to question, based on statistical grounds, published reports using particular metrics of RSA. PMID:22138367

  20. Methods in pharmacoepidemiology: a review of statistical analyses and data reporting in pediatric drug utilization studies.

    PubMed

    Sequi, Marco; Campi, Rita; Clavenna, Antonio; Bonati, Maurizio

    2013-03-01

    To evaluate the quality of data reporting and statistical methods performed in drug utilization studies in the pediatric population. Drug utilization studies evaluating all drug prescriptions to children and adolescents published between January 1994 and December 2011 were retrieved and analyzed. For each study, information on measures of exposure/consumption, the covariates considered, descriptive and inferential analyses, statistical tests, and methods of data reporting was extracted. An overall quality score was created for each study using a 12-item checklist that took into account the presence of outcome measures, covariates of measures, descriptive measures, statistical tests, and graphical representation. A total of 22 studies were reviewed and analyzed. Of these, 20 studies reported at least one descriptive measure. The mean was the most commonly used measure (18 studies), but only five of these also reported the standard deviation. Statistical analyses were performed in 12 studies, with the chi-square test being the most commonly performed test. Graphs were presented in 14 papers. Sixteen papers reported the number of drug prescriptions and/or packages, and ten reported the prevalence of the drug prescription. The mean quality score was 8 (median 9). Only seven of the 22 studies received a score of ≥10, while four studies received a score of <6. Our findings document that only a few of the studies reviewed applied statistical methods and reported data in a satisfactory manner. We therefore conclude that the methodology of drug utilization studies needs to be improved.

  21. The intervals method: a new approach to analyse finite element outputs using multivariate statistics

    PubMed Central

    De Esteban-Trivigno, Soledad; Püschel, Thomas A.; Fortuny, Josep

    2017-01-01

    Background: In this paper, we propose a new method, named the intervals’ method, to analyse data from finite element models in a comparative multivariate framework. As a case study, several armadillo mandibles are analysed, showing that the proposed method is useful to distinguish and characterise biomechanical differences related to diet/ecomorphology. Methods: The intervals’ method consists of generating a set of variables, each one defined by an interval of stress values. Each variable is expressed as a percentage of the area of the mandible occupied by those stress values. Afterwards these newly generated variables can be analysed using multivariate methods. Results: Applying this novel method to the biological case study of whether armadillo mandibles differ according to dietary groups, we show that the intervals’ method is a powerful tool to characterize biomechanical performance and how this relates to different diets. This allows us to positively discriminate between specialist and generalist species. Discussion: We show that the proposed approach is a useful methodology not affected by the characteristics of the finite element mesh. Additionally, the positive discriminating results obtained when analysing a difficult case study suggest that the proposed method could be a very useful tool for comparative studies in finite element analysis using multivariate statistical approaches. PMID:29043107
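
    A schematic reading of the intervals’ method as described in this abstract: bin element stress values into fixed intervals, express each interval as the percentage of total area it occupies, and pass the resulting variables to a multivariate method. The stress values, element areas, interval edges and the use of PCA are assumptions of this illustration.

```python
# Schematic sketch of the intervals' method described above: each variable is the
# percentage of model area whose stress falls inside a fixed interval. The stress
# arrays, areas and interval edges are hypothetical, and PCA stands in for the
# multivariate analysis a real study would choose.
import numpy as np

def interval_variables(stress, area, edges):
    """Percentage of total model area occupied by each stress interval."""
    share, _ = np.histogram(stress, bins=edges, weights=area)
    return 100.0 * share / area.sum()

rng = np.random.default_rng(1)
edges = np.linspace(0.0, 10.0, 11)   # ten stress intervals (hypothetical units)

specimens = []
for scale in (2.0, 3.0, 4.5):        # three hypothetical finite element models
    stress = np.clip(rng.gamma(shape=2.0, scale=scale, size=500), 0.0, 10.0)
    area = rng.uniform(0.5, 1.5, size=500)      # element areas
    specimens.append(interval_variables(stress, area, edges))

X = np.vstack(specimens)             # one row of interval variables per specimen

# Simple PCA (via SVD of the centred variables) as the multivariate step.
Xc = X - X.mean(axis=0)
_, s, _ = np.linalg.svd(Xc, full_matrices=False)
print("variance explained by each PC:", np.round(s**2 / np.sum(s**2), 3))
```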

  22. Errors in statistical decision making Chapter 2 in Applied Statistics in Agricultural, Biological, and Environmental Sciences

    USDA-ARS?s Scientific Manuscript database

    Agronomic and Environmental research experiments result in data that are analyzed using statistical methods. These data are unavoidably accompanied by uncertainty. Decisions about hypotheses, based on statistical analyses of these data are therefore subject to error. This error is of three types,...

  23. Methodological difficulties of conducting agroecological studies from a statistical perspective

    USDA-ARS?s Scientific Manuscript database

    Statistical methods for analysing agroecological data might not be able to help agroecologists to solve all of the current problems concerning crop and animal husbandry, but such methods could well help agroecologists to assess, tackle, and resolve several agroecological issues in a more reliable an...

  24. Method and data evaluation at NASA endocrine laboratory. [Skylab 3 experiments]

    NASA Technical Reports Server (NTRS)

    Johnston, D. A.

    1974-01-01

    The biomedical data of the astronauts on Skylab 3 were analyzed to evaluate the univariate statistical methods for comparing endocrine series experiments in relation to other medical experiments. It was found that an information storage and retrieval system was needed to facilitate statistical analyses.

  25. Quasi-Static Probabilistic Structural Analyses Process and Criteria

    NASA Technical Reports Server (NTRS)

    Goldberg, B.; Verderaime, V.

    1999-01-01

    Current deterministic structural methods are easily applied to substructures and components, and analysts have built great design insights and confidence in them over the years. However, deterministic methods cannot support systems risk analyses, and it was recently reported that deterministic treatment of statistical data is inconsistent with error propagation laws that can result in unevenly conservative structural predictions. Assuming normal distributions and using statistical data formats throughout prevailing deterministic stress processes leads to a safety factor in statistical format, which, integrated into the safety index, provides a safety factor and first-order reliability relationship. The embedded safety factor in the safety index expression allows a historically based risk to be determined and verified over a variety of quasi-static metallic substructures, consistent with traditional safety factor methods and NASA Std. 5001 criteria.
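
    The safety factor and first-order reliability relationship alluded to above can be illustrated with a generic first-order sketch assuming normally distributed strength and stress; the numbers and the normality assumption are illustrative and are not taken from the paper.

```python
# Generic first-order sketch of a safety-factor / safety-index relationship,
# assuming normally distributed strength R and stress S; the values below are
# hypothetical and this is not the paper's specific formulation.
from math import sqrt
from scipy.stats import norm

mu_R, sd_R = 60.0, 6.0     # mean and SD of strength (hypothetical units)
mu_S, sd_S = 40.0, 8.0     # mean and SD of applied stress

central_safety_factor = mu_R / mu_S
beta = (mu_R - mu_S) / sqrt(sd_R**2 + sd_S**2)   # first-order safety (reliability) index
p_failure = norm.cdf(-beta)                      # implied probability of failure

print(f"SF = {central_safety_factor:.2f}, beta = {beta:.2f}, P(failure) = {p_failure:.2e}")
```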

  26. Inappropriate Fiddling with Statistical Analyses to Obtain a Desirable P-value: Tests to Detect its Presence in Published Literature

    PubMed Central

    Gadbury, Gary L.; Allison, David B.

    2012-01-01

    Much has been written regarding p-values below certain thresholds (most notably 0.05) denoting statistical significance and the tendency of such p-values to be more readily publishable in peer-reviewed journals. Intuition suggests that there may be a tendency to manipulate statistical analyses to push a “near significant p-value” to a level that is considered significant. This article presents a method for detecting the presence of such manipulation (herein called “fiddling”) in a distribution of p-values from independent studies. Simulations are used to illustrate the properties of the method. The results suggest that the method has low type I error and that power approaches acceptable levels as the number of p-values being studied approaches 1000. PMID:23056287

  27. Inappropriate fiddling with statistical analyses to obtain a desirable p-value: tests to detect its presence in published literature.

    PubMed

    Gadbury, Gary L; Allison, David B

    2012-01-01

    Much has been written regarding p-values below certain thresholds (most notably 0.05) denoting statistical significance and the tendency of such p-values to be more readily publishable in peer-reviewed journals. Intuition suggests that there may be a tendency to manipulate statistical analyses to push a "near significant p-value" to a level that is considered significant. This article presents a method for detecting the presence of such manipulation (herein called "fiddling") in a distribution of p-values from independent studies. Simulations are used to illustrate the properties of the method. The results suggest that the method has low type I error and that power approaches acceptable levels as the number of p-values being studied approaches 1000.
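
    The abstracts above do not spell out the detection test itself, so the sketch below only illustrates the general idea with a simple caliper-style comparison of p-values just below versus just above 0.05; it is not the authors' method, and the p-values are hypothetical.

```python
# Illustrative caliper-style check for an excess of p-values just below 0.05.
# This is NOT necessarily the test proposed in the article above (the abstract
# does not specify it); it only illustrates the idea on hypothetical data.
import numpy as np
from scipy.stats import binomtest

p_values = np.array([0.048, 0.049, 0.044, 0.051, 0.047, 0.012, 0.046, 0.053,
                     0.049, 0.045, 0.031, 0.047, 0.050, 0.043, 0.049])

below = np.sum((p_values >= 0.040) & (p_values < 0.050))
above = np.sum((p_values >= 0.050) & (p_values < 0.060))

# With no manipulation, a p-value in this narrow window should fall on either
# side of 0.05 with roughly equal probability.
result = binomtest(int(below), int(below + above), p=0.5, alternative="greater")
print(f"below = {below}, above = {above}, one-sided p = {result.pvalue:.4f}")
```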

  28. Statistical methods for convergence detection of multi-objective evolutionary algorithms.

    PubMed

    Trautmann, H; Wagner, T; Naujoks, B; Preuss, M; Mehnen, J

    2009-01-01

    In this paper, two approaches for estimating the generation in which a multi-objective evolutionary algorithm (MOEA) shows statistically significant signs of convergence are introduced. A set-based perspective is taken where convergence is measured by performance indicators. The proposed techniques fulfill the requirements of proper statistical assessment on the one hand and efficient optimisation for real-world problems on the other hand. The first approach accounts for the stochastic nature of the MOEA by repeating the optimisation runs for increasing generation numbers and analysing the performance indicators using statistical tools. This technique results in a very robust offline procedure. Moreover, an online convergence detection method is introduced as well. This method automatically stops the MOEA when either the variance of the performance indicators falls below a specified threshold or a stagnation of their overall trend is detected. Both methods are analysed and compared for two MOEAs and on different classes of benchmark functions. It is shown that the methods successfully operate on all stated problems, needing fewer function evaluations while preserving good approximation quality.
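
    A schematic sketch of the online stopping rule described above: stop once the variance of a performance indicator over a recent window falls below a threshold or its trend stagnates. The window size, tolerances and the synthetic indicator series are illustrative assumptions, not the authors' settings.

```python
# Schematic sketch of an online convergence-detection rule of the kind described
# above: stop when the indicator variance over a recent window drops below a
# threshold or its trend stagnates. Window size, tolerances and the indicator
# series are illustrative assumptions.
import numpy as np
from scipy.stats import linregress

def should_stop(indicator_history, window=20, var_tol=1e-6, slope_tol=1e-5):
    """Return True once the indicator has (apparently) converged."""
    if len(indicator_history) < window:
        return False
    recent = np.asarray(indicator_history[-window:])
    stagnant_variance = recent.var() < var_tol
    stagnant_trend = abs(linregress(np.arange(window), recent).slope) < slope_tol
    return stagnant_variance or stagnant_trend

# Hypothetical hypervolume-like indicator that saturates over the generations.
rng = np.random.default_rng(2)
gens = np.arange(200)
indicator = 1.0 - np.exp(-gens / 30.0) + 1e-4 * rng.standard_normal(200)

history = []
for g, val in zip(gens, indicator):
    history.append(val)
    if should_stop(history):
        print(f"stop at generation {g}")
        break
```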

  29. Statistical limitations in functional neuroimaging. I. Non-inferential methods and statistical models.

    PubMed Central

    Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P

    1999-01-01

    Functional neuroimaging (FNI) provides experimental access to the intact living brain making it possible to study higher cognitive functions in humans. In this review and in a companion paper in this issue, we discuss some common methods used to analyse FNI data. The emphasis in both papers is on assumptions and limitations of the methods reviewed. There are several methods available to analyse FNI data indicating that none is optimal for all purposes. In order to make optimal use of the methods available it is important to know the limits of applicability. For the interpretation of FNI results it is also important to take into account the assumptions, approximations and inherent limitations of the methods used. This paper gives a brief overview over some non-inferential descriptive methods and common statistical models used in FNI. Issues relating to the complex problem of model selection are discussed. In general, proper model selection is a necessary prerequisite for the validity of the subsequent statistical inference. The non-inferential section describes methods that, combined with inspection of parameter estimates and other simple measures, can aid in the process of model selection and verification of assumptions. The section on statistical models covers approaches to global normalization and some aspects of univariate, multivariate, and Bayesian models. Finally, approaches to functional connectivity and effective connectivity are discussed. In the companion paper we review issues related to signal detection and statistical inference. PMID:10466149

  30. Living systematic reviews: 3. Statistical methods for updating meta-analyses.

    PubMed

    Simmonds, Mark; Salanti, Georgia; McKenzie, Joanne; Elliott, Julian

    2017-11-01

    A living systematic review (LSR) should keep the review current as new research evidence emerges. Any meta-analyses included in the review will also need updating as new material is identified. If the aim of the review is solely to present the best current evidence, standard meta-analysis may be sufficient, provided reviewers are aware that results may change at later updates. If the review is used in a decision-making context, more caution may be needed. When using standard meta-analysis methods, the chance of incorrectly concluding that any updated meta-analysis is statistically significant when there is no effect (the type I error) increases rapidly as more updates are performed. Inaccurate estimation of any heterogeneity across studies may also lead to inappropriate conclusions. This paper considers four methods to avoid some of these statistical problems when updating meta-analyses: two methods (the law of the iterated logarithm and the Shuster method) control primarily for inflation of type I error, and two other methods (trial sequential analysis and sequential meta-analysis) control for type I and II errors (failing to detect a genuine effect) and take account of heterogeneity. This paper compares the methods and considers how they could be applied to LSRs. Copyright © 2017 Elsevier Inc. All rights reserved.
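
    A small simulation of the problem this paper addresses, showing how naively re-testing a cumulative fixed-effect meta-analysis at every update inflates the type I error well above the nominal 5%; it does not implement any of the four correction methods, and the trial characteristics are hypothetical.

```python
# Small simulation of the problem described above: naively re-testing a cumulative
# fixed-effect meta-analysis at every update inflates the type I error.
# Trial standard errors and the number of updates are hypothetical.
import numpy as np

rng = np.random.default_rng(3)
n_sims, n_updates, se = 2000, 10, 0.2   # 10 updates, each adding one null trial

false_positive_anywhere = 0
for _ in range(n_sims):
    effects = rng.normal(0.0, se, size=n_updates)   # true effect is zero
    w = np.full(n_updates, 1.0 / se**2)             # inverse-variance weights
    significant = False
    for k in range(1, n_updates + 1):
        pooled = np.sum(w[:k] * effects[:k]) / np.sum(w[:k])
        z = pooled * np.sqrt(np.sum(w[:k]))         # pooled / SE(pooled)
        if abs(z) > 1.96:                           # naive 5% test at each update
            significant = True
            break
    false_positive_anywhere += significant

print(f"type I error across updates ~ {false_positive_anywhere / n_sims:.3f} (nominal 0.05)")
```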

  31. Using Artificial Neural Networks in Educational Research: Some Comparisons with Linear Statistical Models.

    ERIC Educational Resources Information Center

    Everson, Howard T.; And Others

    This paper explores the feasibility of neural computing methods such as artificial neural networks (ANNs) and abductory induction mechanisms (AIM) for use in educational measurement. ANN and AIM methods are contrasted with more traditional statistical techniques, such as multiple regression and discriminant function analyses, for making…

  32. Statistical analysis and interpretation of prenatal diagnostic imaging studies, Part 2: descriptive and inferential statistical methods.

    PubMed

    Tuuli, Methodius G; Odibo, Anthony O

    2011-08-01

    The objective of this article is to discuss the rationale for common statistical tests used for the analysis and interpretation of prenatal diagnostic imaging studies. Examples from the literature are used to illustrate descriptive and inferential statistics. The uses and limitations of linear and logistic regression analyses are discussed in detail.

  33. Weighted Statistical Binning: Enabling Statistically Consistent Genome-Scale Phylogenetic Analyses

    PubMed Central

    Bayzid, Md Shamsuzzoha; Mirarab, Siavash; Boussau, Bastien; Warnow, Tandy

    2015-01-01

    Because biological processes can result in different loci having different evolutionary histories, species tree estimation requires multiple loci from across multiple genomes. While many processes can result in discord between gene trees and species trees, incomplete lineage sorting (ILS), modeled by the multi-species coalescent, is considered to be a dominant cause for gene tree heterogeneity. Coalescent-based methods have been developed to estimate species trees, many of which operate by combining estimated gene trees, and so are called "summary methods". Because summary methods are generally fast (and much faster than more complicated coalescent-based methods that co-estimate gene trees and species trees), they have become very popular techniques for estimating species trees from multiple loci. However, recent studies have established that summary methods can have reduced accuracy in the presence of gene tree estimation error, and also that many biological datasets have substantial gene tree estimation error, so that summary methods may not be highly accurate in biologically realistic conditions. Mirarab et al. (Science 2014) presented the "statistical binning" technique to improve gene tree estimation in multi-locus analyses, and showed that it improved the accuracy of MP-EST, one of the most popular coalescent-based summary methods. Statistical binning, which uses a simple heuristic to evaluate "combinability" and then uses the larger sets of genes to re-calculate gene trees, has good empirical performance, but using statistical binning within a phylogenomic pipeline does not have the desirable property of being statistically consistent. We show that weighting the re-calculated gene trees by the bin sizes makes statistical binning statistically consistent under the multispecies coalescent, and maintains the good empirical performance. Thus, "weighted statistical binning" enables highly accurate genome-scale species tree estimation, and is also statistically consistent under the multi-species coalescent model. New data used in this study are available at DOI: http://dx.doi.org/10.6084/m9.figshare.1411146, and the software is available at https://github.com/smirarab/binning. PMID:26086579

  34. Statistical Design Model (SDM) of satellite thermal control subsystem

    NASA Astrophysics Data System (ADS)

    Mirshams, Mehran; Zabihian, Ehsan; Aarabi Chamalishahi, Mahdi

    2016-07-01

    Satellite thermal control is the subsystem whose main task is to keep the satellite components within their survival and operating temperature ranges. The capability of the thermal control subsystem plays a key role in satisfying a satellite's operational requirements, and designing this subsystem is part of the overall satellite design. On the other hand, owing to the lack of information released by companies and designers, this subsystem still does not have a specific design process, even though it is one of the fundamental subsystems. The aim of this paper is to identify and extract statistical design models of the spacecraft thermal control subsystem by using the SDM design method, which analyses statistical data with a particular procedure. To implement the SDM method, a complete database is required. Therefore, we first collect spacecraft data and create a database, and then we extract statistical graphs using Microsoft Excel, from which we further extract mathematical models. The input parameters of the method are the mass, mission, and lifetime of the satellite. To this end, the thermal control subsystem is first introduced and the hardware used in this subsystem and its variants is surveyed. Next, different statistical models are presented and briefly compared. Finally, the paper's particular statistical model is extracted from the collected statistical data. The accuracy of the method is tested and verified using a case study: comparisons between the specifications of the thermal control subsystem of a fabricated satellite and the analysis results showed the methodology to be effective. Key Words: Thermal control subsystem design, Statistical design model (SDM), Satellite conceptual design, Thermal hardware

  35. Statistical Selection of Biological Models for Genome-Wide Association Analyses.

    PubMed

    Bi, Wenjian; Kang, Guolian; Pounds, Stanley B

    2018-05-24

    Genome-wide association studies have discovered many biologically important associations of genes with phenotypes. Typically, genome-wide association analyses formally test the association of each genetic feature (SNP, CNV, etc) with the phenotype of interest and summarize the results with multiplicity-adjusted p-values. However, very small p-values only provide evidence against the null hypothesis of no association without indicating which biological model best explains the observed data. Correctly identifying a specific biological model may improve the scientific interpretation and can be used to more effectively select and design a follow-up validation study. Thus, statistical methodology to identify the correct biological model for a particular genotype-phenotype association can be very useful to investigators. Here, we propose a general statistical method to summarize how accurately each of five biological models (null, additive, dominant, recessive, co-dominant) represents the data observed for each variant in a GWAS study. We show that the new method stringently controls the false discovery rate and asymptotically selects the correct biological model. Simulations of two-stage discovery-validation studies show that the new method has these properties and that its validation power is similar to or exceeds that of simple methods that use the same statistical model for all SNPs. Example analyses of three data sets also highlight these advantages of the new method. An R package is freely available at www.stjuderesearch.org/site/depts/biostats/maew. Copyright © 2018. Published by Elsevier Inc.
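
    A generic sketch of fitting the five candidate genetic models named above for a single simulated variant and comparing them by BIC; it illustrates the model set only and is not the authors' FDR-controlled selection procedure.

```python
# Generic sketch of fitting the candidate genetic models for a single variant and
# comparing them by BIC. This only illustrates the model set the abstract names
# (null, additive, dominant, recessive, co-dominant); it is not the authors'
# FDR-controlled procedure. Data are simulated.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 500
g = rng.choice([0, 1, 2], size=n, p=[0.49, 0.42, 0.09])   # genotype = minor-allele count
y = 0.4 * g + rng.standard_normal(n)                      # phenotype with an additive effect

ones = np.ones(n)
designs = {
    "null":       ones.reshape(-1, 1),                    # intercept only
    "additive":   np.column_stack([ones, g]),
    "dominant":   np.column_stack([ones, g >= 1]),
    "recessive":  np.column_stack([ones, g == 2]),
    "codominant": np.column_stack([ones, g == 1, g == 2]),
}

for name, X in designs.items():
    res = sm.OLS(y, X.astype(float)).fit()
    print(f"{name:10s}  BIC = {res.bic:9.2f}")
```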

  36. Epidemiologic programs for computers and calculators. A microcomputer program for multiple logistic regression by unconditional and conditional maximum likelihood methods.

    PubMed

    Campos-Filho, N; Franco, E L

    1989-02-01

    A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
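
    A sketch of the unconditional versus conditional maximum-likelihood comparison described above, using Python's statsmodels (assuming its Logit and ConditionalLogit classes) rather than the original Pascal program; the matched-pair data are simulated for illustration.

```python
# Sketch of the unconditional vs. conditional ML comparison the abstract describes,
# using statsmodels instead of the original Pascal program. The simulated
# matched-pair data and use of ConditionalLogit are assumptions of this sketch.
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.conditional_models import ConditionalLogit

rng = np.random.default_rng(5)
n_pairs = 200
strata = np.repeat(np.arange(n_pairs), 2)                 # matched case-control pairs
exposure = rng.binomial(1, 0.4, size=2 * n_pairs).astype(float)
stratum_effect = np.repeat(rng.normal(0, 1, n_pairs), 2)  # pair-level confounding
logit_p = -0.5 + 0.8 * exposure + stratum_effect
y = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Unconditional ML: ordinary logistic regression ignoring the matching.
uncond = sm.Logit(y, sm.add_constant(exposure)).fit(disp=0)

# Conditional ML: conditions on the matched sets, so matching variables drop out.
cond = ConditionalLogit(y, exposure.reshape(-1, 1), groups=strata).fit()

print("unconditional OR:", np.exp(uncond.params[1]))
print("conditional OR:  ", np.exp(cond.params[0]))
```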

  37. Bayesian methods in reliability

    NASA Astrophysics Data System (ADS)

    Sander, P.; Badoux, R.

    1991-11-01

    The present proceedings from a course on Bayesian methods in reliability encompass Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.
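
    In the spirit of the pipeline leak-rate example mentioned above, the sketch below shows a conjugate Gamma-Poisson update for an event rate; the prior parameters and counts are hypothetical and this is not the course's actual analysis.

```python
# Tiny conjugate Bayesian example in the spirit of the leak-rate estimation the
# proceedings mention: a Gamma prior on a Poisson event rate, updated with
# hypothetical inspection data. Prior parameters and counts are illustrative.
from scipy.stats import gamma

a0, b0 = 2.0, 4.0            # Gamma(shape, rate) prior: prior mean 0.5 events/year
events, years = 3, 10.0      # observed leaks over the inspection period

a_post, b_post = a0 + events, b0 + years     # conjugate Gamma posterior
posterior = gamma(a_post, scale=1.0 / b_post)

print(f"posterior mean rate = {posterior.mean():.3f} events/year")
print("95% credible interval:", [round(x, 3) for x in posterior.interval(0.95)])
```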

  38. Data Analysis and Graphing in an Introductory Physics Laboratory: Spreadsheet versus Statistics Suite

    ERIC Educational Resources Information Center

    Peterlin, Primoz

    2010-01-01

    Two methods of data analysis are compared: spreadsheet software and a statistics software suite. Their use is compared by analysing data collected in three selected experiments taken from an introductory physics laboratory, which include a linear dependence, a nonlinear dependence and a histogram. The merits of each method are compared. (Contains 7…

  39. Trends in study design and the statistical methods employed in a leading general medicine journal.

    PubMed

    Gosho, M; Sato, Y; Nagashima, K; Takahashi, S

    2018-02-01

    Study design and statistical methods have become core components of medical research, and the methodology has become more multifaceted and complicated over time. The study of the comprehensive details and current trends of study design and statistical methods is required to support the future implementation of well-planned clinical studies providing information about evidence-based medicine. Our purpose was to illustrate study design and statistical methods employed in recent medical literature. This was an extension study of Sato et al. (N Engl J Med 2017; 376: 1086-1087), which reviewed 238 articles published in 2015 in the New England Journal of Medicine (NEJM) and briefly summarized the statistical methods employed in NEJM. Using the same database, we performed a new investigation of the detailed trends in study design and individual statistical methods that were not reported in the Sato study. Due to the CONSORT statement, prespecification and justification of sample size are obligatory in planning intervention studies. Although standard survival methods (eg Kaplan-Meier estimator and Cox regression model) were most frequently applied, the Gray test and Fine-Gray proportional hazard model for considering competing risks were sometimes used for a more valid statistical inference. With respect to handling missing data, model-based methods, which are valid for missing-at-random data, were more frequently used than single imputation methods. These methods are not recommended as a primary analysis, but they have been applied in many clinical trials. Group sequential design with interim analyses was one of the standard designs, and novel design, such as adaptive dose selection and sample size re-estimation, was sometimes employed in NEJM. Model-based approaches for handling missing data should replace single imputation methods for primary analysis in the light of the information found in some publications. Use of adaptive design with interim analyses is increasing after the presentation of the FDA guidance for adaptive design. © 2017 John Wiley & Sons Ltd.

  40. A decade of individual participant data meta-analyses: A review of current practice.

    PubMed

    Simmonds, Mark; Stewart, Gavin; Stewart, Lesley

    2015-11-01

    Individual participant data (IPD) systematic reviews and meta-analyses are often considered to be the gold standard for meta-analysis. In the ten years since the first review into the methodology and reporting practice of IPD reviews was published much has changed in the field. This paper investigates current reporting and statistical practice in IPD systematic reviews. A systematic review was performed to identify systematic reviews that collected and analysed IPD. Data were extracted from each included publication on a variety of issues related to the reporting of IPD review process, and the statistical methods used. There has been considerable growth in the use of "one-stage" methods to perform IPD meta-analyses. The majority of reviews consider at least one covariate other than the primary intervention, either using subgroup analysis or including covariates in one-stage regression models. Random-effects analyses, however, are not often used. Reporting of review methods was often limited, with few reviews presenting a risk-of-bias assessment. Details on issues specific to the use of IPD were little reported, including how IPD were obtained; how data was managed and checked for consistency and errors; and for how many studies and participants IPD were sought and obtained. While the last ten years have seen substantial changes in how IPD meta-analyses are performed there remains considerable scope for improving the quality of reporting for both the process of IPD systematic reviews, and the statistical methods employed in them. It is to be hoped that the publication of the PRISMA-IPD guidelines specific to IPD reviews will improve reporting in this area. Copyright © 2015 Elsevier Inc. All rights reserved.

  41. Statistical Literacy in the Data Science Workplace

    ERIC Educational Resources Information Center

    Grant, Robert

    2017-01-01

    Statistical literacy, the ability to understand and make use of statistical information including methods, has particular relevance in the age of data science, when complex analyses are undertaken by teams from diverse backgrounds. It is essential not only for communicating with the consumers of information but also within the team. Writing from the…

  42. Reporting Practices and Use of Quantitative Methods in Canadian Journal Articles in Psychology.

    PubMed

    Counsell, Alyssa; Harlow, Lisa L

    2017-05-01

    With recent focus on the state of research in psychology, it is essential to assess the nature of the statistical methods and analyses used and reported by psychological researchers. To that end, we investigated the prevalence of different statistical procedures and the nature of statistical reporting practices in recent articles from the four major Canadian psychology journals. The majority of authors evaluated their research hypotheses through the use of analysis of variance (ANOVA), t-tests, and multiple regression. Multivariate approaches were less common. Null hypothesis significance testing remains a popular strategy, but the majority of authors reported a standardized or unstandardized effect size measure alongside their significance test results. Confidence intervals on effect sizes were infrequently employed. Many authors provided minimal details about their statistical analyses and less than a third of the articles presented on data complications such as missing data and violations of statistical assumptions. Strengths of and areas needing improvement for reporting quantitative results are highlighted. The paper concludes with recommendations for how researchers and reviewers can improve comprehension and transparency in statistical reporting.

  43. Narrative Review of Statistical Reporting Checklists, Mandatory Statistical Editing, and Rectifying Common Problems in the Reporting of Scientific Articles.

    PubMed

    Dexter, Franklin; Shafer, Steven L

    2017-03-01

    Considerable attention has been drawn to poor reproducibility in the biomedical literature. One explanation is inadequate reporting of statistical methods by authors and inadequate assessment of statistical reporting and methods during peer review. In this narrative review, we examine scientific studies of several well-publicized efforts to improve statistical reporting. We also review several retrospective assessments of the impact of these efforts. These studies show that instructions to authors and statistical checklists are not sufficient; no findings suggested that either improves the quality of statistical methods and reporting. Second, even basic statistics, such as power analyses, are frequently missing or incorrectly performed. Third, statistical review is needed for all papers that involve data analysis. A consistent finding in the studies was that nonstatistical reviewers (eg, "scientific reviewers") and journal editors generally poorly assess statistical quality. We finish by discussing our experience with statistical review at Anesthesia & Analgesia from 2006 to 2016.

  4. Statistical parameters of random heterogeneity estimated by analysing coda waves based on finite difference method

    NASA Astrophysics Data System (ADS)

    Emoto, K.; Saito, T.; Shiomi, K.

    2017-12-01

    Short-period (<1 s) seismograms are strongly affected by small-scale (<10 km) heterogeneities in the lithosphere. In general, short-period seismograms are analysed with statistical methods that consider the interaction between seismic waves and randomly distributed small-scale heterogeneities. Statistical properties of the random heterogeneities have been estimated by analysing short-period seismograms. However, generally, the small-scale random heterogeneity is not taken into account for the modelling of long-period (>2 s) seismograms. We found that the energy of the coda of long-period seismograms shows a spatially flat distribution. This phenomenon is well known in short-period seismograms and results from the scattering by small-scale heterogeneities. We estimate the statistical parameters that characterize the small-scale random heterogeneity by modelling the spatiotemporal energy distribution of long-period seismograms. We analyse three moderate-size earthquakes that occurred in southwest Japan. We calculate the spatial distribution of the energy density recorded by a dense seismograph network in Japan at the period bands of 8-16 s, 4-8 s and 2-4 s and model them by using 3-D finite difference (FD) simulations. Compared to conventional methods based on statistical theories, we can calculate more realistic synthetics by using the FD simulation. It is not necessary to assume a uniform background velocity, a separation into body or surface waves, or the scattering properties assumed in general scattering theories. By taking the ratio of the energy of the coda area to that of the entire area, we can separately estimate the scattering and the intrinsic absorption effects. Our result reveals the spectrum of the random inhomogeneity in a wide wavenumber range, including the intensity around the corner wavenumber, as P(m) = 8πε²a³/(1 + a²m²)², where ε = 0.05 and a = 3.1 km, even though past studies analysing higher-frequency records could not detect the corner. Finally, we estimate the intrinsic attenuation by modelling the decay rate of the energy. The method proposed in this study is suitable for quantifying the statistical properties of long-wavelength subsurface random inhomogeneity, which leads the way to characterizing a wider wavenumber range of spectra, including the corner wavenumber.
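
    As a small aid to reading the reported spectrum, the sketch below simply evaluates P(m) = 8πε²a³/(1 + a²m²)² with the values quoted in the abstract (ε = 0.05, a = 3.1 km); the wavenumber grid is an assumption made for illustration.

    ```python
    import numpy as np

    def power_spectrum(m, eps=0.05, a=3.1):
        """Power spectrum of the random inhomogeneity, P(m), in the form quoted above."""
        return 8.0 * np.pi * eps**2 * a**3 / (1.0 + a**2 * m**2) ** 2

    m = np.logspace(-2, 1, 200)                 # wavenumber in 1/km
    corner = 1.0 / 3.1                          # corner wavenumber, roughly 1/a
    print(power_spectrum(np.array([corner])))   # spectral level near the corner
    ```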

  5. Descriptive and inferential statistical methods used in burns research.

    PubMed

    Al-Benna, Sammy; Al-Ajam, Yazan; Way, Benjamin; Steinstraesser, Lars

    2010-05-01

    Burns research articles utilise a variety of descriptive and inferential methods to present and analyse data. The aim of this study was to determine the descriptive methods (e.g. mean, median, SD, range, etc.) and survey the use of inferential methods (statistical tests) used in articles in the journal Burns. This study defined its population as all original articles published in the journal Burns in 2007. Letters to the editor, brief reports, reviews, and case reports were excluded. Study characteristics, use of descriptive statistics and the number and types of statistical methods employed were evaluated. Of the 51 articles analysed, 11 (22%) were randomised controlled trials, 18 (35%) were cohort studies, 11 (22%) were case control studies and 11 (22%) were case series. The study design and objectives were defined in all articles. All articles made use of continuous and descriptive data. Inferential statistics were used in 49 (96%) articles. Data dispersion was calculated by standard deviation in 30 (59%). Standard error of the mean was quoted in 19 (37%). The statistical software product was named in 33 (65%). Of the 49 articles that used inferential statistics, the tests were named in 47 (96%). The 6 most common tests used (Student's t-test (53%), analysis of variance/co-variance (33%), χ² test (27%), Wilcoxon & Mann-Whitney tests (22%), Fisher's exact test (12%)) accounted for the majority (72%) of statistical methods employed. A specified significance level was named in 43 (88%) and the exact significance levels were reported in 28 (57%). Descriptive analysis and basic statistical techniques account for most of the statistical tests reported. This information should prove useful in deciding which tests should be emphasised in educating burn care professionals. These results highlight the need for burn care professionals to have a sound understanding of basic statistics, which is crucial in interpreting and reporting data. Advice should be sought from professionals in the fields of biostatistics and epidemiology when using more advanced statistical techniques. Copyright 2009 Elsevier Ltd and ISBI. All rights reserved.

  6. Mass spectrometry-based protein identification with accurate statistical significance assignment.

    PubMed

    Alves, Gelio; Yu, Yi-Kuo

    2015-03-01

    Assigning statistical significance accurately has become increasingly important as metadata of many types, often assembled in hierarchies, are constructed and combined for further biological analyses. Statistical inaccuracy of metadata at any level may propagate to downstream analyses, undermining the validity of scientific conclusions thus drawn. From the perspective of mass spectrometry-based proteomics, even though accurate statistics for peptide identification can now be achieved, accurate protein level statistics remain challenging. We have constructed a protein ID method that combines the peptide evidence for a candidate protein based on a rigorous formula derived earlier; in this formula the database P-value of every peptide is weighted, prior to the final combination, according to the number of proteins it maps to. We have also shown that this protein ID method provides accurate protein level E-values, eliminating the need for empirical post-processing methods for type-I error control. Using a known protein mixture, we find that this protein ID method, when combined with the Sorić formula, yields accurate values for the proportion of false discoveries. In terms of retrieval efficacy, the results from our method are comparable with other methods tested. The source code, implemented in C++ on a Linux system, is available for download at ftp://ftp.ncbi.nlm.nih.gov/pub/qmbp/qmbp_ms/RAId/RAId_Linux_64Bit. Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.

  7. Implementation of novel statistical procedures and other advanced approaches to improve analysis of CASA data.

    PubMed

    Ramón, M; Martínez-Pastor, F

    2018-04-23

    Computer-aided sperm analysis (CASA) produces a wealth of data that is frequently ignored. The use of multiparametric statistical methods can help explore these datasets, unveiling the subpopulation structure of sperm samples. In this review we analyse the significance of the internal heterogeneity of sperm samples and its relevance. We also provide a brief description of the statistical tools used for extracting sperm subpopulations from the datasets, namely unsupervised clustering (with non-hierarchical, hierarchical and two-step methods) and the most advanced supervised methods, based on machine learning. The former has allowed exploration of subpopulation patterns in many species, whereas the latter offers further possibilities, especially considering functional studies and the practical use of subpopulation analysis. We also consider novel approaches, such as the use of geometric morphometrics or imaging flow cytometry. Finally, although applying clustering analyses to the data provided by CASA systems yields valuable information on sperm samples, there are several caveats. Protocols for capturing and analysing motility or morphometry should be standardised and adapted to each experiment, and the algorithms should be open in order to allow comparison of results between laboratories. Moreover, we must be aware of new technology that could change the paradigm for studying sperm motility and morphology.
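
    A minimal sketch (with assumed column names, not the authors' protocol) of the non-hierarchical clustering step described above: CASA kinematic variables are standardised and k-means is used to extract sperm subpopulations.

    ```python
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    # Hypothetical per-spermatozoon CASA output with typical kinematic variables.
    casa = pd.read_csv("casa_tracks.csv")
    kinematics = casa[["VCL", "VSL", "ALH", "LIN"]]

    z = StandardScaler().fit_transform(kinematics)   # put variables on a common scale
    casa["subpopulation"] = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(z)
    print(casa.groupby("subpopulation")[["VCL", "VSL"]].mean())
    ```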

  8. Reporting guidance considerations from a statistical perspective: overview of tools to enhance the rigour of reporting of randomised trials and systematic reviews.

    PubMed

    Hutton, Brian; Wolfe, Dianna; Moher, David; Shamseer, Larissa

    2017-05-01

    Research waste has received considerable attention from the biomedical community. One noteworthy contributor is incomplete reporting in research publications. When detailing statistical methods and results, ensuring analytic methods and findings are completely documented improves transparency. For publications describing randomised trials and systematic reviews, guidelines have been developed to facilitate complete reporting. This overview summarises aspects of statistical reporting in trials and systematic reviews of health interventions. A narrative approach was taken to summarise features regarding statistical methods and findings from reporting guidelines for trials and reviews. We aim to enhance familiarity with the statistical details that should be reported in biomedical research among statisticians and their collaborators. We summarise statistical reporting considerations for trials and systematic reviews from guidance documents including the Consolidated Standards of Reporting Trials (CONSORT) Statement for reporting of trials, the Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) Statement for trial protocols, the Statistical Analyses and Methods in the Published Literature (SAMPL) Guidelines for statistical reporting principles, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement for systematic reviews and PRISMA for Protocols (PRISMA-P). Considerations regarding sharing of study data and statistical code are also addressed. Reporting guidelines provide researchers with minimum criteria for reporting. If followed, they can enhance research transparency and contribute to improving the quality of biomedical publications. Authors should employ these tools for planning and reporting of their research. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.

  9. Statistical methods for meta-analyses including information from studies without any events-add nothing to nothing and succeed nevertheless.

    PubMed

    Kuss, O

    2015-03-30

    Meta-analyses with rare events, especially those that include studies with no event in one ('single-zero') or even both ('double-zero') treatment arms, are still a statistical challenge. In the case of double-zero studies, researchers in general delete these studies or use continuity corrections to avoid them. A number of arguments against both options have been given, and statistical methods that use the information from double-zero studies without using continuity corrections have been proposed. In this paper, we collect them and compare them by simulation. This simulation study tries to mirror real-life situations as completely as possible by deriving true underlying parameters from empirical data on actually performed meta-analyses. It is shown that, for each of the commonly encountered effect estimators, valid statistical methods are available that use the information from double-zero studies without using continuity corrections. Interestingly, all of them are truly random effects models, and so even the current standard method for very sparse data as recommended by the Cochrane Collaboration, the Yusuf-Peto odds ratio, can be improved on. For actual analysis, we recommend using beta-binomial regression methods to arrive at summary estimates for the odds ratio, the relative risk, or the risk difference. Methods that ignore information from double-zero studies or use continuity corrections should no longer be used. We illustrate the situation with an example where the original analysis ignores 35 double-zero studies, and a superior analysis discovers a clinically relevant advantage of off-pump surgery in coronary artery bypass grafting. Copyright © 2014 John Wiley & Sons, Ltd.
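
    A hedged sketch of the beta-binomial idea recommended above (not the authors' implementation): arm-level event counts are modelled with a beta-binomial likelihood whose mean depends on treatment, so double-zero studies contribute without continuity corrections. The counts below are invented for illustration.

    ```python
    import numpy as np
    from scipy.optimize import minimize
    from scipy.special import betaln, expit

    # Hypothetical arm-level data: events, sample size and treatment indicator.
    y = np.array([0, 0, 2, 1, 0, 3])
    n = np.array([50, 48, 60, 59, 40, 42])
    t = np.array([1, 0, 1, 0, 1, 0])

    def neg_loglik(params):
        b0, b1, log_phi = params
        mu = expit(b0 + b1 * t)            # mean event probability per arm (logit link)
        phi = np.exp(log_phi)              # precision of the beta mixing distribution
        a, b = mu * phi, (1.0 - mu) * phi
        # Beta-binomial log-likelihood; the binomial coefficient is constant and dropped.
        return -(betaln(y + a, n - y + b) - betaln(a, b)).sum()

    fit = minimize(neg_loglik, x0=np.array([-3.0, 0.0, 1.0]), method="Nelder-Mead")
    print("approximate log odds ratio (treatment vs control):", fit.x[1])
    ```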

  10. Methodological approaches in analysing observational data: A practical example on how to address clustering and selection bias.

    PubMed

    Trutschel, Diana; Palm, Rebecca; Holle, Bernhard; Simon, Michael

    2017-11-01

    Because not every scientific question on effectiveness can be answered with randomised controlled trials, research methods that minimise bias in observational studies are required. Two major concerns influence the internal validity of effect estimates: selection bias and clustering. Hence, to reduce the bias of the effect estimates, more sophisticated statistical methods are needed. The aims were to introduce statistical approaches such as propensity score matching and mixed models into a representative real-world analysis, and to present the implementation in the statistical software R so that the results can be reproduced. We perform a two-level analytic strategy to address the problems of bias and clustering: (i) generalised models with different abilities to adjust for dependencies are used to analyse binary data and (ii) the genetic matching and covariate adjustment methods are used to adjust for selection bias. Hence, we analyse the data from two population samples, the sample produced by the matching method and the full sample. The different analysis methods in this article present different results but still point in the same direction. In our example, the estimate of the probability of receiving a case conference is higher in the treatment group than in the control group. Both strategies, genetic matching and covariate adjustment, have their limitations but complement each other to provide the whole picture. The statistical approaches were feasible for reducing bias but were nevertheless limited by the sample used. For each study and obtained sample, the pros and cons of the different methods have to be weighed. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
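
    A minimal sketch of the two-step strategy described above, transposed from the authors' R workflow into Python and simplified (plain nearest-neighbour propensity score matching rather than genetic matching): (i) estimate propensity scores with a logistic model, (ii) match 1:1 on the score, (iii) compare the outcome in the matched sample. The file and variable names are assumptions.

    ```python
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("observational_sample.csv")     # hypothetical observational data

    # Step 1: propensity score from a logistic model of treatment on covariates.
    ps_model = smf.logit("treated ~ age + sex + dependency", data=df).fit(disp=0)
    df["ps"] = ps_model.predict(df)

    # Step 2: 1:1 nearest-neighbour matching on the propensity score, without replacement.
    treated = df[df.treated == 1]
    controls = df[df.treated == 0].copy()
    matched_controls = []
    for _, row in treated.iterrows():
        j = (controls.ps - row.ps).abs().idxmin()
        matched_controls.append(j)
        controls = controls.drop(index=j)

    # Step 3: compare the binary outcome between groups in the matched sample.
    matched = pd.concat([treated, df.loc[matched_controls]])
    print(matched.groupby("treated")["case_conference"].mean())
    ```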

  11. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing

    PubMed Central

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-01-01

    Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
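
    A minimal sketch of a Bland-Altman comparison for two quantitative assays (for instance, variant allele fractions from a new and a reference NGS pipeline); the values are illustrative, not the authors' data.

    ```python
    import numpy as np

    new_assay = np.array([0.12, 0.33, 0.48, 0.05, 0.27])
    reference = np.array([0.10, 0.35, 0.45, 0.06, 0.25])

    diff = new_assay - reference
    bias = diff.mean()                                   # constant (systematic) error
    sd = diff.std(ddof=1)
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd    # 95% limits of agreement
    print(f"bias = {bias:.3f}, limits of agreement = {lower:.3f} to {upper:.3f}")
    ```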

  12. Plant selection for ethnobotanical uses on the Amalfi Coast (Southern Italy).

    PubMed

    Savo, V; Joy, R; Caneva, G; McClatchey, W C

    2015-07-15

    Many ethnobotanical studies have investigated selection criteria for medicinal and non-medicinal plants. In this paper we test several statistical methods using different ethnobotanical datasets in order to 1) determine to what extent the nature of the datasets can affect the interpretation of results, and 2) determine whether the selection for different plant uses is based on phylogeny or on other selection criteria. We considered three different ethnobotanical datasets: two datasets of medicinal plants and a dataset of non-medicinal plants (handicraft production, domestic and agro-pastoral practices), and two floras of the Amalfi Coast. We performed residual analysis from linear regression, the binomial test and the Bayesian approach for calculating under-used and over-used plant families within ethnobotanical datasets. Percentages of agreement were calculated to compare the results of the analyses. We also analyzed the relationship between plant selection and phylogeny, chorology, life form and habitat using the chi-square test. Pearson's residuals for each of the significant chi-square analyses were examined for investigating alternative hypotheses of plant selection criteria. The results of the three statistical analysis methods differed within the same dataset, and between different datasets and floras, but with some similarities. In the two medicinal datasets, only Lamiaceae was identified in both floras as an over-used family by all three statistical methods. All statistical methods in one flora agreed that Malvaceae was over-used and Poaceae under-used, but this was not found to be consistent with results of the second flora, in which one statistical result was non-significant. All other families had some discrepancy in significance across methods, or floras. Significant over- or under-use was observed in only a minority of cases. The chi-square analyses were significant for phylogeny, life form and habitat. Pearson's residuals indicated a non-random selection of woody species for non-medicinal uses and an under-use of plants of temperate forests for medicinal uses. Our study showed that selection criteria for plant uses (including medicinal) are not always based on phylogeny. The comparison of different statistical methods (regression, binomial and Bayesian) under different conditions led to the conclusion that the most conservative results are obtained using regression analysis.
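
    A minimal sketch of the binomial test used above to flag over- or under-used families: the number of used species in a family is compared with the share expected from that family's size in the flora. All counts are invented for illustration.

    ```python
    from scipy.stats import binomtest

    used_in_family = 18        # used species belonging to one family (e.g., Lamiaceae)
    total_used = 250           # species in the ethnobotanical dataset
    family_in_flora = 40       # species of that family in the regional flora
    total_flora = 1800         # species in the regional flora

    expected_share = family_in_flora / total_flora
    result = binomtest(used_in_family, n=total_used, p=expected_share)
    print("two-sided p-value for over-/under-use:", result.pvalue)
    ```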

  13. The effect of berberine on insulin resistance in women with polycystic ovary syndrome: detailed statistical analysis plan (SAP) for a multicenter randomized controlled trial.

    PubMed

    Zhang, Ying; Sun, Jin; Zhang, Yun-Jiao; Chai, Qian-Yun; Zhang, Kang; Ma, Hong-Li; Wu, Xiao-Ke; Liu, Jian-Ping

    2016-10-21

    Although Traditional Chinese Medicine (TCM) has been widely used in clinical settings, a major challenge that remains in TCM is to evaluate its efficacy scientifically. This randomized controlled trial aims to evaluate the efficacy and safety of berberine in the treatment of patients with polycystic ovary syndrome. In order to improve the transparency and research quality of this clinical trial, we prepared this statistical analysis plan (SAP). The trial design, primary and secondary outcomes, and safety outcomes were declared to reduce selection biases in data analysis and result reporting. We specified detailed methods for data management and statistical analyses. Statistics in corresponding tables, listings, and graphs were outlined. The SAP provides more detailed information than the trial protocol on data management and statistical analysis methods. Any post hoc analyses can be identified by referring to this SAP, and the possible selection bias and performance bias will be reduced in the trial. This study is registered at ClinicalTrials.gov, NCT01138930 , registered on 7 June 2010.

  14. Analysis of the dependence of extreme rainfalls

    NASA Astrophysics Data System (ADS)

    Padoan, Simone; Ancey, Christophe; Parlange, Marc

    2010-05-01

    The aim of spatial analysis is to quantitatively describe the behavior of environmental phenomena such as precipitation levels, wind speed or daily temperatures. A number of generic approaches to spatial modeling have been developed [1], but these are not necessarily ideal for handling extremal aspects given their focus on mean process levels. The areal modelling of the extremes of a natural process observed at points in space is important in environmental statistics; for example, understanding extremal spatial rainfall is crucial in flood protection. In light of recent concerns over climate change, the use of robust mathematical and statistical methods for such analyses has grown in importance. Multivariate extreme value models and the class of max-stable processes [2] have a similar asymptotic motivation to the univariate Generalized Extreme Value (GEV) distribution, but provide a general approach to modeling extreme processes that incorporates temporal or spatial dependence. Statistical methods for max-stable processes and data analyses of practical problems are discussed by [3] and [4]. This work illustrates methods for the statistical modelling of spatial extremes and gives examples of their use by means of a real extremal data analysis of Swiss precipitation levels. [1] Cressie, N. A. C. (1993). Statistics for Spatial Data. Wiley, New York. [2] de Haan, L. and Ferreira, A. (2006). Extreme Value Theory: An Introduction. Springer, USA. [3] Padoan, S. A., Ribatet, M. and Sisson, S. A. (2009). Likelihood-Based Inference for Max-Stable Processes. Journal of the American Statistical Association, Theory & Methods. In press. [4] Davison, A. C. and Gholamrezaee, M. (2009). Geostatistics of extremes. Journal of the Royal Statistical Society, Series B. To appear.
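
    A minimal sketch of the univariate building block mentioned above: fitting a Generalized Extreme Value (GEV) distribution to annual maximum rainfall at a single site and reading off a return level. The values are illustrative, not the Swiss data analysed in the work.

    ```python
    import numpy as np
    from scipy.stats import genextreme

    # Hypothetical annual maximum daily rainfall (mm) at one station.
    annual_max_mm = np.array([62., 71., 55., 90., 68., 84., 77., 59., 103., 88.])

    shape, loc, scale = genextreme.fit(annual_max_mm)
    r50 = genextreme.ppf(1 - 1 / 50, shape, loc=loc, scale=scale)
    print(f"estimated 50-year return level: {r50:.1f} mm")
    ```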

  15. Statistical Prediction in Proprietary Rehabilitation.

    ERIC Educational Resources Information Center

    Johnson, Kurt L.; And Others

    1987-01-01

    Applied statistical methods to predict case expenditures for low back pain rehabilitation cases in proprietary rehabilitation. Extracted predictor variables from case records of 175 workers compensation claimants with some degree of permanent disability due to back injury. Performed several multiple regression analyses resulting in a formula that…

  16. Comparing Visual and Statistical Analysis of Multiple Baseline Design Graphs.

    PubMed

    Wolfe, Katie; Dickenson, Tammiee S; Miller, Bridget; McGrath, Kathleen V

    2018-04-01

    A growing number of statistical analyses are being developed for single-case research. One important factor in evaluating these methods is the extent to which each corresponds to visual analysis. Few studies have compared statistical and visual analysis, and information about more recently developed statistics is scarce. Therefore, our purpose was to evaluate the agreement between visual analysis and four statistical analyses: improvement rate difference (IRD); Tau-U; Hedges, Pustejovsky, Shadish (HPS) effect size; and between-case standardized mean difference (BC-SMD). Results indicate that IRD and BC-SMD had the strongest overall agreement with visual analysis. Although Tau-U had strong agreement with visual analysis on raw values, it had poorer agreement when those values were dichotomized to represent the presence or absence of a functional relation. Overall, visual analysis appeared to be more conservative than statistical analysis, but further research is needed to evaluate the nature of these disagreements.

  17. Application of multivariate statistical techniques in microbial ecology.

    PubMed

    Paliy, O; Shankar, V

    2016-03-01

    Recent advances in high-throughput methods of molecular analyses have led to an explosion of studies generating large-scale ecological data sets. In particular, a noticeable effect has been attained in the field of microbial ecology, where new experimental approaches provided in-depth assessments of the composition, functions and dynamic changes of complex microbial communities. Because even a single high-throughput experiment produces a large amount of data, powerful statistical techniques of multivariate analysis are well suited to analyse and interpret these data sets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular data set. In this review, we describe and compare the most widely used multivariate statistical techniques including exploratory, interpretive and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and data set structure. © 2016 John Wiley & Sons Ltd.

  18. Detecting differential DNA methylation from sequencing of bisulfite converted DNA of diverse species.

    PubMed

    Huh, Iksoo; Wu, Xin; Park, Taesung; Yi, Soojin V

    2017-07-21

    DNA methylation is one of the most extensively studied epigenetic modifications of genomic DNA. In recent years, sequencing of bisulfite-converted DNA, particularly via next-generation sequencing technologies, has become a widely popular method to study DNA methylation. This method can be readily applied to a variety of species, dramatically expanding the scope of DNA methylation studies beyond the traditionally studied human and mouse systems. In parallel to the increasing wealth of genomic methylation profiles, many statistical tools have been developed to detect differentially methylated loci (DMLs) or differentially methylated regions (DMRs) between biological conditions. We discuss and summarize several key properties of currently available tools to detect DMLs and DMRs from sequencing of bisulfite-converted DNA. However, the majority of the statistical tools developed for DML/DMR analyses have been validated using only mammalian data sets, and less priority has been placed on the analyses of invertebrate or plant DNA methylation data. We demonstrate that genomic methylation profiles of non-mammalian species are often highly distinct from those of mammalian species using examples of honey bees and humans. We then discuss how such differences in data properties may affect statistical analyses. Based on these differences, we provide three specific recommendations to improve the power and accuracy of DML and DMR analyses of invertebrate data when using currently available statistical tools. These considerations should facilitate systematic and robust analyses of DNA methylation from diverse species, thus advancing our understanding of DNA methylation. © The Author 2017. Published by Oxford University Press.

  19. Dynamic modelling of n-of-1 data: powerful and flexible data analytics applied to individualised studies.

    PubMed

    Vieira, Rute; McDonald, Suzanne; Araújo-Soares, Vera; Sniehotta, Falko F; Henderson, Robin

    2017-09-01

    N-of-1 studies are based on repeated observations within an individual or unit over time and are acknowledged as an important research method for generating scientific evidence about the health or behaviour of an individual. Statistical analyses of n-of-1 data require accurate modelling of the outcome while accounting for its distribution, time-related trend and error structures (e.g., autocorrelation) as well as reporting readily usable contextualised effect sizes for decision-making. A number of statistical approaches have been documented but no consensus exists on which method is most appropriate for which type of n-of-1 design. We discuss the statistical considerations for analysing n-of-1 studies and briefly review some currently used methodologies. We describe dynamic regression modelling as a flexible and powerful approach, adaptable to different types of outcomes and capable of dealing with the different challenges inherent to n-of-1 statistical modelling. Dynamic modelling borrows ideas from longitudinal and event history methodologies which explicitly incorporate the role of time and the influence of past on future. We also present an illustrative example of the use of dynamic regression on monitoring physical activity during the retirement transition. Dynamic modelling has the potential to expand researchers' access to robust and user-friendly statistical methods for individualised studies.
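
    A minimal sketch of one simple dynamic regression for n-of-1 data, in the spirit described above: today's outcome is regressed on an intervention indicator, a time trend and yesterday's outcome (a lag term), letting the past influence the future. The file and column names are assumptions.

    ```python
    import pandas as pd
    import statsmodels.formula.api as smf

    daily = pd.read_csv("n_of_1_diary.csv")          # hypothetical single-person daily series
    daily["day"] = range(len(daily))                 # simple time trend
    daily["steps_lag1"] = daily["steps"].shift(1)    # yesterday's value

    fit = smf.ols("steps ~ intervention + day + steps_lag1", data=daily.dropna()).fit()
    print(fit.params)
    ```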

  20. Comparative effectiveness research methodology using secondary data: A starting user's guide.

    PubMed

    Sun, Maxine; Lipsitz, Stuart R

    2018-04-01

    The use of secondary data, such as claims or administrative data, in comparative effectiveness research has grown tremendously in recent years. We believe that the current review can help investigators relying on secondary data to (1) gain insight into both the methodologies and statistical methods, (2) better understand the necessity of rigorous planning before initiating a comparative effectiveness investigation, and (3) optimize the quality of their investigations. Specifically, we review concepts of adjusted analyses and confounders, methods of propensity score analyses and instrumental variable analyses, risk prediction models (logistic and time-to-event), decision-curve analysis, as well as the interpretation of the P value and hypothesis testing. Overall, we hope that this review helps investigators relying on secondary data to appreciate the need for rigorous planning before a study starts and to make better-informed choices of statistical methods, so as to optimize the quality of their research. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Review of Statistical Methods for Analysing Healthcare Resources and Costs

    PubMed Central

    Mihaylova, Borislava; Briggs, Andrew; O'Hagan, Anthony; Thompson, Simon G

    2011-01-01

    We review statistical methods for analysing healthcare resource use and costs, their ability to address skewness, excess zeros, multimodality and heavy right tails, and their ease for general use. We aim to provide guidance on analysing resource use and costs focusing on randomised trials, although methods often have wider applicability. Twelve broad categories of methods were identified: (I) methods based on the normal distribution, (II) methods following transformation of data, (III) single-distribution generalized linear models (GLMs), (IV) parametric models based on skewed distributions outside the GLM family, (V) models based on mixtures of parametric distributions, (VI) two (or multi)-part and Tobit models, (VII) survival methods, (VIII) non-parametric methods, (IX) methods based on truncation or trimming of data, (X) data components models, (XI) methods based on averaging across models, and (XII) Markov chain methods. Based on this review, our recommendations are that, first, simple methods are preferred in large samples where the near-normality of sample means is assured. Second, in somewhat smaller samples, relatively simple methods, able to deal with one or two of above data characteristics, may be preferable but checking sensitivity to assumptions is necessary. Finally, some more complex methods hold promise, but are relatively untried; their implementation requires substantial expertise and they are not currently recommended for wider applied work. Copyright © 2010 John Wiley & Sons, Ltd. PMID:20799344
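
    A minimal sketch of one of the categories above, a two-part model (category VI): a logistic model for the probability of incurring any cost and a gamma GLM with log link for the positive costs, with the mean cost obtained as the product of the two parts. The file and column names are assumptions.

    ```python
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    costs = pd.read_csv("trial_costs.csv")                 # hypothetical per-patient data
    costs["any_cost"] = (costs["cost"] > 0).astype(int)

    part1 = smf.logit("any_cost ~ arm", data=costs).fit(disp=0)
    part2 = smf.glm("cost ~ arm", data=costs[costs.cost > 0],
                    family=sm.families.Gamma(link=sm.families.links.Log())).fit()

    p_any = part1.predict(costs)          # probability of any cost
    mean_pos = part2.predict(costs)       # expected cost given a cost is incurred
    print("predicted mean cost per patient:", (p_any * mean_pos).mean())
    ```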

  2. Sunspot activity and influenza pandemics: a statistical assessment of the purported association.

    PubMed

    Towers, S

    2017-10-01

    Since 1978, a series of papers in the literature has claimed to find a significant association between sunspot activity and the timing of influenza pandemics. This paper examines these analyses, and attempts to recreate the three most recent statistical analyses by Ertel (1994), Tapping et al. (2001), and Yeung (2006), which all purported to find a significant relationship between sunspot numbers and pandemic influenza. As will be discussed, each analysis had errors in the data. In addition, in each analysis arbitrary selections or assumptions were made, and the authors did not assess the robustness of their analyses to changes in those arbitrary assumptions. Varying the arbitrary assumptions to other, equally valid, assumptions negates the claims of significance. Indeed, an arbitrary selection made in one of the analyses appears to have resulted in almost maximal apparent significance; changing it only slightly yields a null result. This analysis applies statistically rigorous methodology to examine the purported sunspot/pandemic link, using more statistically powerful un-binned analysis methods rather than relying on arbitrarily binned data. The analyses are repeated using both the Wolf and Group sunspot numbers. In all cases, no statistically significant evidence of any association was found. However, while the focus in this particular analysis was on the purported relationship of influenza pandemics to sunspot activity, the faults found in the past analyses are common pitfalls; inattention to analysis reproducibility and robustness assessment are common problems in the sciences that are, unfortunately, not noted often enough in review.

  3. Genomic similarity and kernel methods I: advancements by building on mathematical and statistical foundations.

    PubMed

    Schaid, Daniel J

    2010-01-01

    Measures of genomic similarity are the basis of many statistical analytic methods. We review the mathematical and statistical basis of similarity methods, particularly based on kernel methods. A kernel function converts information for a pair of subjects to a quantitative value representing either similarity (larger values meaning more similar) or distance (smaller values meaning more similar), with the requirement that it must create a positive semidefinite matrix when applied to all pairs of subjects. This review emphasizes the wide range of statistical methods and software that can be used when similarity is based on kernel methods, such as nonparametric regression, linear mixed models and generalized linear mixed models, hierarchical models, score statistics, and support vector machines. The mathematical rigor for these methods is summarized, as is the mathematical framework for making kernels. This review provides a framework to move from intuitive and heuristic approaches to define genomic similarities to more rigorous methods that can take advantage of powerful statistical modeling and existing software. A companion paper reviews novel approaches to creating kernels that might be useful for genomic analyses, providing insights with examples [1]. Copyright © 2010 S. Karger AG, Basel.
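
    A minimal sketch of the kernel idea reviewed above: a linear kernel over centred genotype dosages gives a subject-by-subject similarity matrix, and positive semidefiniteness can be checked from its eigenvalues. The genotype matrix is randomly generated purely for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    G = rng.integers(0, 3, size=(20, 500)).astype(float)  # 20 subjects x 500 SNP dosages
    Gc = G - G.mean(axis=0)                               # centre each SNP

    K = Gc @ Gc.T / Gc.shape[1]                           # linear (relationship-like) kernel
    eigvals = np.linalg.eigvalsh(K)
    print("kernel matrix is positive semidefinite:", bool(eigvals.min() > -1e-10))
    ```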

  4. A risk-based statistical investigation of the quantification of polymorphic purity of a pharmaceutical candidate by solid-state 19F NMR.

    PubMed

    Barry, Samantha J; Pham, Tran N; Borman, Phil J; Edwards, Andrew J; Watson, Simon A

    2012-01-27

    The DMAIC (Define, Measure, Analyse, Improve and Control) framework and associated statistical tools have been applied to both identify and reduce variability observed in a quantitative ¹⁹F solid-state NMR (SSNMR) analytical method. The method had been developed to quantify levels of an additional polymorph (Form 3) in batches of an active pharmaceutical ingredient (API), where Form 1 is the predominant polymorph. In order to validate analyses of the polymorphic form, a single batch of API was used as a standard each time the method was used. The level of Form 3 in this standard was observed to gradually increase over time, the effect not being immediately apparent due to method variability. In order to determine the cause of this unexpected increase and to reduce method variability, a risk-based statistical investigation was performed to identify potential factors which could be responsible for these effects. Factors identified by the risk assessment were investigated using a series of designed experiments to gain a greater understanding of the method. The increase of the level of Form 3 in the standard was primarily found to correlate with the number of repeat analyses, an effect not previously reported in SSNMR literature. Differences in data processing (phasing and linewidth) were found to be responsible for the variability in the method. After implementing corrective actions the variability was reduced such that the level of Form 3 was within an acceptable range of ±1% w/w in fresh samples of API. Copyright © 2011. Published by Elsevier B.V.

  5. Least Squares Procedures.

    ERIC Educational Resources Information Center

    Hester, Yvette

    Least squares methods are sophisticated mathematical curve fitting procedures used in all classical parametric methods. The linear least squares approximation is most often associated with finding the "line of best fit" or the regression line. Since all statistical analyses are correlational and all classical parametric methods are least…

  6. Quasi-experimental study designs series-paper 10: synthesizing evidence for effects collected from quasi-experimental studies presents surmountable challenges.

    PubMed

    Becker, Betsy Jane; Aloe, Ariel M; Duvendack, Maren; Stanley, T D; Valentine, Jeffrey C; Fretheim, Atle; Tugwell, Peter

    2017-09-01

    To outline issues of importance to analytic approaches to the synthesis of quasi-experiments (QEs) and to provide a statistical model for use in analysis. We drew on studies of statistics, epidemiology, and social-science methodology to outline methods for synthesis of QE studies. The design and conduct of QEs, effect sizes from QEs, and moderator variables for the analysis of those effect sizes were discussed. Biases, confounding, design complexities, and comparisons across designs offer serious challenges to syntheses of QEs. Key components of meta-analyses of QEs were identified, including the aspects of QE study design to be coded and analyzed. Of utmost importance are the design and statistical controls implemented in the QEs. Such controls and any potential sources of bias and confounding must be modeled in analyses, along with aspects of the interventions and populations studied. Because of such controls, effect sizes from QEs are more complex than those from randomized experiments. A statistical meta-regression model that incorporates important features of the QEs under review was presented. Meta-analyses of QEs provide particular challenges, but thorough coding of intervention characteristics and study methods, along with careful analysis, should allow for sound inferences. Copyright © 2017 Elsevier Inc. All rights reserved.
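
    A hedged sketch of a simple meta-regression along the lines argued above (not the authors' model): quasi-experimental effect sizes are regressed on a design moderator, with each study weighted by its inverse variance. The numbers are invented for illustration.

    ```python
    import numpy as np
    import statsmodels.api as sm

    effect = np.array([0.21, 0.35, 0.10, 0.44, 0.05])     # study effect sizes
    var = np.array([0.010, 0.020, 0.015, 0.030, 0.012])   # their sampling variances
    matched = np.array([1, 0, 1, 0, 0])                   # moderator: study used matched controls

    X = sm.add_constant(matched)
    fit = sm.WLS(effect, X, weights=1.0 / var).fit()      # fixed-effect meta-regression
    print(fit.params)                                     # intercept and moderator coefficient
    ```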

  7. Computed statistics at streamgages, and methods for estimating low-flow frequency statistics and development of regional regression equations for estimating low-flow frequency statistics at ungaged locations in Missouri

    USGS Publications Warehouse

    Southard, Rodney E.

    2013-01-01

    The weather and precipitation patterns in Missouri vary considerably from year to year. In 2008, the statewide average rainfall was 57.34 inches and in 2012, the statewide average rainfall was 30.64 inches. This variability in precipitation and resulting streamflow in Missouri underlies the necessity for water managers and users to have reliable streamflow statistics and a means to compute select statistics at ungaged locations for a better understanding of water availability. Knowledge of surface-water availability is dependent on the streamflow data that have been collected and analyzed by the U.S. Geological Survey for more than 100 years at approximately 350 streamgages throughout Missouri. The U.S. Geological Survey, in cooperation with the Missouri Department of Natural Resources, computed streamflow statistics at streamgages through the 2010 water year, defined periods of drought and defined methods to estimate streamflow statistics at ungaged locations, and developed regional regression equations to compute selected streamflow statistics at ungaged locations. Streamflow statistics and flow durations were computed for 532 streamgages in Missouri and in neighboring States of Missouri. For streamgages with more than 10 years of record, Kendall’s tau was computed to evaluate for trends in streamflow data. If trends were detected, the variable length method was used to define the period of no trend. Water years were removed from the dataset from the beginning of the record for a streamgage until no trend was detected. Low-flow frequency statistics were then computed for the entire period of record and for the period of no trend if 10 or more years of record were available for each analysis. Three methods are presented for computing selected streamflow statistics at ungaged locations. The first method uses power curve equations developed for 28 selected streams in Missouri and neighboring States that have multiple streamgages on the same streams. Statistical estimates on one of these streams can be calculated at an ungaged location that has a drainage area that is between 40 percent of the drainage area of the farthest upstream streamgage and within 150 percent of the drainage area of the farthest downstream streamgage along the stream of interest. The second method may be used on any stream with a streamgage that has operated for 10 years or longer and for which anthropogenic effects have not changed the low-flow characteristics at the ungaged location since collection of the streamflow data. A ratio of drainage area of the stream at the ungaged location to the drainage area of the stream at the streamgage was computed to estimate the statistic at the ungaged location. The range of applicability is between 40- and 150-percent of the drainage area of the streamgage, and the ungaged location must be located on the same stream as the streamgage. The third method uses regional regression equations to estimate selected low-flow frequency statistics for unregulated streams in Missouri. This report presents regression equations to estimate frequency statistics for the 10-year recurrence interval and for the N-day durations of 1, 2, 3, 7, 10, 30, and 60 days. Basin and climatic characteristics were computed using geographic information system software and digital geospatial data. A total of 35 characteristics were computed for use in preliminary statewide and regional regression analyses based on existing digital geospatial data and previous studies. 
Spatial analyses for geographical bias in the predictive accuracy of the regional regression equations defined three low-flow regions within the State, representing the three major physiographic provinces in Missouri. Region 1 includes the Central Lowlands, Region 2 includes the Ozark Plateaus, and Region 3 includes the Mississippi Alluvial Plain. A total of 207 streamgages were used in the regression analyses for the regional equations. Of the 207 U.S. Geological Survey streamgages, 77 were located in Region 1, 120 were located in Region 2, and 10 were located in Region 3. Streamgages located outside of Missouri were selected to extend the range of data used for the independent variables in the regression analyses. Streamgages included in the regression analyses had 10 or more years of record and were considered to be affected minimally by anthropogenic activities or trends. Regional regression analyses identified three characteristics as statistically significant for the development of regional equations. For Region 1, drainage area, longest flow path, and streamflow-variability index were statistically significant. The range in the standard error of estimate for Region 1 is 79.6 to 94.2 percent. For Region 2, drainage area and streamflow-variability index were statistically significant, and the range in the standard error of estimate is 48.2 to 72.1 percent. For Region 3, drainage area and streamflow-variability index also were statistically significant with a range in the standard error of estimate of 48.1 to 96.2 percent. Limitations on estimating low-flow frequency statistics at ungaged locations depend on the method used. The first method outlined for use in Missouri, power curve equations, was developed to estimate the selected statistics for ungaged locations on 28 selected streams with multiple streamgages located on the same stream. A second method uses a drainage-area ratio to compute statistics at an ungaged location using data from a single streamgage on the same stream with 10 or more years of record. Ungaged locations on these streams may use the ratio of the drainage area at an ungaged location to the drainage area at a streamgage location to scale the selected statistic value from the streamgage location to the ungaged location. This method can be used if the drainage area of the ungaged location is within 40 to 150 percent of the streamgage drainage area. The third method is the use of the regional regression equations. The limits for the use of these equations are based on the ranges of the characteristics used as independent variables and on the requirement that streams be affected minimally by anthropogenic activities.
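
    A minimal sketch of the drainage-area-ratio method (the second method above): a low-flow statistic at an ungaged site is scaled from a streamgage on the same stream by the ratio of drainage areas, after checking the 40- to 150-percent applicability range. The numbers are illustrative.

    ```python
    # Hypothetical inputs: a low-flow statistic at the streamgage and two drainage areas.
    gaged_statistic_cfs = 12.0     # e.g., 7-day, 10-year low flow at the streamgage
    gaged_area_mi2 = 250.0
    ungaged_area_mi2 = 180.0

    ratio = ungaged_area_mi2 / gaged_area_mi2
    if 0.40 <= ratio <= 1.50:
        estimate = gaged_statistic_cfs * ratio
        print(f"estimated statistic at the ungaged site: {estimate:.1f} cfs")
    else:
        print("drainage-area ratio outside the 40- to 150-percent range; method not applicable")
    ```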

  8. Prison Radicalization: The New Extremist Training Grounds?

    DTIC Science & Technology

    2007-09-01

    distributing and collecting survey data, and the data analysis. The analytical methodology includes descriptive and inferential statistical methods, in... statistical analysis of the responses to identify significant correlations and relationships. B. SURVEY DATA COLLECTION To effectively access a... Q18, Q19, Q20, and Q21. Due to the exploratory nature of this small survey, data analyses were confined mostly to descriptive statistics and

  9. Is using multiple imputation better than complete case analysis for estimating a prevalence (risk) difference in randomized controlled trials when binary outcome observations are missing?

    PubMed

    Mukaka, Mavuto; White, Sarah A; Terlouw, Dianne J; Mwapasa, Victor; Kalilani-Phiri, Linda; Faragher, E Brian

    2016-07-22

    Missing outcomes can seriously impair the ability to make correct inferences from randomized controlled trials (RCTs). Complete case (CC) analysis is commonly used, but it reduces sample size and is perceived to lead to reduced statistical efficiency of estimates while increasing the potential for bias. As multiple imputation (MI) methods preserve sample size, they are generally viewed as the preferred analytical approach. We examined this assumption, comparing the performance of CC and MI methods to determine risk difference (RD) estimates in the presence of missing binary outcomes. We conducted simulation studies of 5000 simulated data sets with 50 imputations of RCTs with one primary follow-up endpoint at different underlying levels of RD (3-25 %) and missing outcomes (5-30 %). For missing at random (MAR) or missing completely at random (MCAR) outcomes, CC method estimates generally remained unbiased and achieved precision similar to or better than MI methods, and high statistical coverage. Missing not at random (MNAR) scenarios yielded invalid inferences with both methods. Effect size estimate bias was reduced in MI methods by always including group membership even if this was unrelated to missingness. Surprisingly, under MAR and MCAR conditions in the assessed scenarios, MI offered no statistical advantage over CC methods. While MI must inherently accompany CC methods for intention-to-treat analyses, these findings endorse CC methods for per protocol risk difference analyses in these conditions. These findings provide an argument for the use of the CC approach to always complement MI analyses, with the usual caveat that the validity of the mechanism for missingness be thoroughly discussed. More importantly, researchers should strive to collect as much data as possible.

  10. Spurious correlations and inference in landscape genetics

    Treesearch

    Samuel A. Cushman; Erin L. Landguth

    2010-01-01

    Reliable interpretation of landscape genetic analyses depends on statistical methods that have high power to identify the correct process driving gene flow while rejecting incorrect alternative hypotheses. Little is known about statistical power and inference in individual-based landscape genetics. Our objective was to evaluate the power of causal modelling with partial...

  11. Statistical Analyses of Raw Material Data for MTM45-1/CF7442A-36% RW: CMH Cure Cycle

    NASA Technical Reports Server (NTRS)

    Coroneos, Rula; Pai, Shantaram, S.; Murthy, Pappu

    2013-01-01

    This report describes statistical characterization of physical properties of the composite material system MTM45-1/CF7442A, which has been tested and is currently being considered for use on spacecraft structures. This composite system is made of 6K plain weave graphite fibers in a highly toughened resin system. This report summarizes the distribution types and statistical details of the tests and the conditions for the experimental data generated. These distributions will be used in multivariate regression analyses to help determine material and design allowables for similar material systems and to establish a procedure for other material systems. Additionally, these distributions will be used in future probabilistic analyses of spacecraft structures. The specific properties characterized, using a commercially available statistical package, are the ultimate strength, modulus, and Poisson's ratio. Results are displayed using graphical and semigraphical methods and are included in the accompanying appendixes.

  12. Cluster mass inference via random field theory.

    PubMed

    Zhang, Hui; Nichols, Thomas E; Johnson, Timothy D

    2009-01-01

    Cluster extent and voxel intensity are two widely used statistics in neuroimaging inference. Cluster extent is sensitive to spatially extended signals while voxel intensity is better for intense but focal signals. In order to leverage strength from both statistics, several nonparametric permutation methods have been proposed to combine the two methods. Simulation studies have shown that of the different cluster permutation methods, the cluster mass statistic is generally the best. However, to date, there is no parametric cluster mass inference available. In this paper, we propose a cluster mass inference method based on random field theory (RFT). We develop this method for Gaussian images, evaluate it on Gaussian and Gaussianized t-statistic images and investigate its statistical properties via simulation studies and real data. Simulation results show that the method is valid under the null hypothesis and demonstrate that it can be more powerful than the cluster extent inference method. Further, analyses with a single subject and a group fMRI dataset demonstrate better power than traditional cluster size inference, and good accuracy relative to a gold-standard permutation test.
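
    A minimal sketch of the cluster mass statistic discussed above: a statistic image is thresholded, contiguous suprathreshold voxels are labelled, and each cluster's mass is taken here as the summed excess over the threshold (one common definition). The image is random noise purely for illustration.

    ```python
    import numpy as np
    from scipy import ndimage

    rng = np.random.default_rng(1)
    stat_img = rng.normal(size=(32, 32, 32))   # stand-in for a Gaussian statistic image
    threshold = 2.3

    supra = stat_img > threshold
    labels, n_clusters = ndimage.label(supra)
    masses = ndimage.sum_labels(stat_img - threshold, labels, index=range(1, n_clusters + 1))
    print("largest cluster mass:", masses.max() if n_clusters else 0.0)
    ```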

  13. [Gender-sensitive epidemiological data analysis: methodological aspects and empirical outcomes. Illustrated by a health reporting example].

    PubMed

    Jahn, I; Foraita, R

    2008-01-01

    In Germany, gender-sensitive approaches are part of guidelines for good epidemiological practice as well as health reporting. They are increasingly demanded in order to implement the gender mainstreaming strategy in research funding by the federal and state governments. This paper focuses on methodological aspects of data analysis, using the Bremen health report, a population-based cross-sectional study, as an empirical example. Health reporting requires analysis and reporting methods that are able, on the one hand, to uncover sex/gender aspects of the research questions and, on the other hand, to consider how results can be communicated adequately. The core question is: what consequences does the way the category sex is included in different statistical analyses have on the results when identifying potential target groups? Logistic regressions and a two-stage procedure were conducted exploratively as evaluation methods. The two-stage procedure combines graphical models with CHAID decision trees and allows complex results to be visualised. Both methods are run stratified by sex/gender as well as adjusted for sex/gender, and the results are compared with each other. Only stratified analyses are able to detect differences between the sexes and within the sex/gender groups when no prior knowledge is available. Adjusted analyses can detect sex/gender differences only if interaction terms have been included in the model. Results are discussed from a statistical-epidemiological perspective as well as in the context of health reporting. In conclusion, the question of whether a statistical method is gender-sensitive can only be answered for concrete research questions and known conditions. Often, an appropriate statistical procedure can be chosen after conducting separate analyses for women and men. Future gender studies call for innovative study designs as well as conceptual distinctiveness with regard to the biological and the sociocultural elements of the category sex/gender.
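
    A minimal sketch contrasting the two strategies compared above: an adjusted model that only reveals sex/gender differences through an interaction term, versus separate (stratified) models for women and men. The file and variable names are assumptions, with predictors coded numerically.

    ```python
    import pandas as pd
    import statsmodels.formula.api as smf

    survey = pd.read_csv("health_report_survey.csv")   # hypothetical cross-sectional data

    # Adjusted analysis: sex/gender differences appear only via the interaction term.
    adjusted = smf.logit("poor_health ~ age + employment * sex", data=survey).fit(disp=0)
    print(adjusted.params)

    # Stratified analysis: one model per sex/gender group.
    for value, group in survey.groupby("sex"):
        stratified = smf.logit("poor_health ~ age + employment", data=group).fit(disp=0)
        print(value, stratified.params["employment"])
    ```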

  14. Coordinate based random effect size meta-analysis of neuroimaging studies.

    PubMed

    Tench, C R; Tanasescu, Radu; Constantinescu, C S; Auer, D P; Cottam, W J

    2017-06-01

    Low power in neuroimaging studies can make them difficult to interpret, and coordinate-based meta-analysis (CBMA) may go some way to mitigating this issue. CBMA has been used in many analyses to detect where published functional MRI or voxel-based morphometry studies testing similar hypotheses report significant summary results (coordinates) consistently. Only the reported coordinates and possibly t statistics are analysed, and statistical significance of clusters is determined by coordinate density. Here a method of performing coordinate-based random effect size meta-analysis and meta-regression is introduced. The algorithm (ClusterZ) analyses both coordinates and the reported t statistic or Z score, standardised by the number of subjects. Statistical significance is determined not by coordinate density, but by random-effects meta-analyses of reported effects performed cluster-wise using standard statistical methods and taking account of censoring inherent in the published summary results. Type 1 error control is achieved using the false cluster discovery rate (FCDR), which is based on the false discovery rate. This controls both the family-wise error rate under the null hypothesis that coordinates are randomly drawn from a standard stereotaxic space, and the proportion of significant clusters that are expected under the null. Such control is necessary to avoid propagating and even amplifying the very issues motivating the meta-analysis in the first place. ClusterZ is demonstrated on both numerically simulated data and on real data from reports of grey matter loss in multiple sclerosis (MS) and syndromes suggestive of MS, and of painful stimulus in healthy controls. The software implementation is available to download and use freely. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. The effect of ion-exchange purification on the determination of plutonium at the New Brunswick Laboratory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mitchell, W.G.; Spaletto, M.I.; Lewis, K.

    The method of plutonium (Pu) determination at the New Brunswick Laboratory (NBL) consists of a combination of ion-exchange purification followed by controlled-potential coulometric analysis (IE/CPC). The present report's purpose is to quantify any detectable Pu loss occurring in the ion-exchange (IE) purification step, which would cause a negative bias in the NBL method for Pu analysis. The magnitude of any such loss would be contained within the reproducibility (0.05%) of the IE/CPC method, which utilizes a state-of-the-art autocoulometer developed at NBL. When the NBL IE/CPC method is used for Pu analysis, any loss in ion-exchange purification (<0.05%) is confounded with the repeatability of the ion exchange and the precision of the CPC analysis technique (<0.05%). Consequently, to detect a bias in the IE/CPC method due to the IE alone using the IE/CPC method itself requires that many randomized analyses on a single material be performed over time and that statistical analysis of the data be performed. The initial approach described in this report to quantify any IE loss was an independent method, Isotope Dilution Mass Spectrometry; however, the number of analyses performed was insufficient to assign a statistically significant value to the IE loss (<0.02% of 10 mg samples of Pu). The second method used for quantifying any IE loss of Pu was multiple ion exchanges of the same Pu aliquant; the small number of analyses possible per individual IE, together with the column-to-column variability over multiple ion exchanges, prevented statistical detection of any loss of <0.05%. 12 refs.

  16. Computer program for prediction of fuel consumption statistical data for an upper stage three-axes stabilized on-off control system

    NASA Technical Reports Server (NTRS)

    1982-01-01

    A FORTRAN computer program and method for predicting the reaction control fuel consumption statistics of a three-axis stabilized rocket vehicle upper stage are described. A Monte Carlo approach is used, made more efficient by closed-form estimates of impulses. The effects of rocket motor thrust misalignment, static unbalance, aerodynamic disturbances, and deviations in trajectory, mass properties and control system characteristics are included. The routine can be applied to many types of on-off reaction controlled vehicles. The pseudorandom number generation and statistical analysis subroutines, including the output histograms, can be used for other Monte Carlo analysis problems.

  17. Colorimetric determination of nitrate plus nitrite in water by enzymatic reduction, automated discrete analyzer methods

    USGS Publications Warehouse

    Patton, Charles J.; Kryskalla, Jennifer R.

    2011-01-01

    In addition to operational details and performance benchmarks for these new DA-AtNaR2 nitrate + nitrite assays, this report also provides results of interference studies for common inorganic and organic matrix constituents at 1, 10, and 100 times their median concentrations in surface-water and groundwater samples submitted annually to the NWQL for nitrate + nitrite analyses. Paired t-test and Wilcoxon signed-rank statistical analyses of results determined by CFA-CdR methods and DA-AtNaR2 methods indicate that nitrate concentration differences between population means or sign ranks were either statistically equivalent to zero at the 95 percent confidence level (p ≥ 0.05) or analytically equivalent to zero; that is, when p < 0.05, concentration differences between population means or medians were less than method detection limits (MDLs).
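
    The paired t-test and Wilcoxon signed-rank comparison described above can be outlined with SciPy. The concentration values below are invented paired results for two methods, not the NWQL data, so the printed p-values are purely illustrative.

```python
# Sketch: paired comparison of two analytical methods on the same samples.
# The concentrations are made up for illustration; they are not the CFA-CdR / DA-AtNaR2 results.
import numpy as np
from scipy import stats

cfa_cdr   = np.array([0.52, 1.10, 2.35, 0.08, 4.90, 0.75, 1.60, 3.20])  # mg/L as N, method 1
da_atnar2 = np.array([0.50, 1.12, 2.30, 0.09, 4.85, 0.78, 1.58, 3.25])  # mg/L as N, method 2

t_stat, t_p = stats.ttest_rel(cfa_cdr, da_atnar2)    # paired t-test on the mean difference
w_stat, w_p = stats.wilcoxon(cfa_cdr, da_atnar2)     # signed-rank test on the median difference

print(f"paired t-test:        t = {t_stat:.3f}, p = {t_p:.3f}")
print(f"Wilcoxon signed-rank: W = {w_stat:.3f}, p = {w_p:.3f}")
# p >= 0.05 would be read as no detectable difference between methods; if p < 0.05 one
# would still check whether the mean difference is smaller than the method detection limit.
```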

  18. Cluster detection methods applied to the Upper Cape Cod cancer data.

    PubMed

    Ozonoff, Al; Webster, Thomas; Vieira, Veronica; Weinberg, Janice; Ozonoff, David; Aschengrau, Ann

    2005-09-15

    A variety of statistical methods have been suggested to assess the degree and/or the location of spatial clustering of disease cases. However, there is relatively little in the literature devoted to comparison and critique of different methods. Most of the available comparative studies rely on simulated data rather than real data sets. We have chosen three methods currently used for examining spatial disease patterns: the M-statistic of Bonetti and Pagano; the Generalized Additive Model (GAM) method as applied by Webster; and Kulldorff's spatial scan statistic. We apply these statistics to analyze breast cancer data from the Upper Cape Cancer Incidence Study using three different latency assumptions. The three different latency assumptions produced three different spatial patterns of cases and controls. For 20 year latency, all three methods generally concur. However, for 15 year latency and no latency assumptions, the methods produce different results when testing for global clustering. The comparative analyses of real data sets by different statistical methods provides insight into directions for further research. We suggest a research program designed around examining real data sets to guide focused investigation of relevant features using simulated data, for the purpose of understanding how to interpret statistical methods applied to epidemiological data with a spatial component.

  19. Longitudinal Assessment of Self-Reported Recent Back Pain and Combat Deployment in the Millennium Cohort Study

    DTIC Science & Technology

    2016-11-15

    participants who were followed for the development of back pain for an average of 3.9 years. Methods: descriptive statistics and longitudinal ... health, military personnel, occupational health, outcome assessment, statistics, survey methodology. Level of Evidence: 3. Spine 2016;41:1754–1763. ... based on the National Health and Nutrition Examination Survey. Statistical Analysis: descriptive and univariate analyses compared characteristics ...

  20. Interpretation of correlations in clinical research.

    PubMed

    Hung, Man; Bounsanga, Jerry; Voss, Maren Wright

    2017-11-01

    Critically analyzing research is a key skill in evidence-based practice and requires knowledge of research methods, results interpretation, and applications, all of which rely on a foundation based in statistics. Evidence-based practice makes high demands on trained medical professionals to interpret an ever-expanding array of research evidence. As clinical training emphasizes medical care rather than statistics, it is useful to review the basics of statistical methods and what they mean for interpreting clinical studies. We reviewed the basic concepts of correlational associations, violations of normality, unobserved variable bias, sample size, and alpha inflation. The foundations of causal inference were discussed and sound statistical analyses were examined. We discuss four ways in which correlational analysis is misused, including causal inference overreach, over-reliance on significance, alpha inflation, and sample size bias. Recent published studies in the medical field provide evidence of causal assertion overreach drawn from correlational findings. The findings present a primer on the assumptions and nature of correlational methods of analysis and urge clinicians to exercise appropriate caution as they critically analyze the evidence before them and evaluate evidence that supports practice. Critically analyzing new evidence requires statistical knowledge in addition to clinical knowledge. Studies can overstate relationships, expressing causal assertions when only correlational evidence is available. Failure to account for the effect of sample size in the analyses tends to overstate the importance of predictive variables. It is important not to overemphasize the statistical significance without consideration of effect size and whether differences could be considered clinically meaningful.
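
    The sample-size point above can be made concrete with a short simulation: for a fixed, clinically negligible correlation, the p-value eventually falls below 0.05 simply because n grows. The data are simulated and 0.05 is only the conventional cut-off.

```python
# Sketch: statistical "significance" of a tiny correlation as the sample size grows.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_r = 0.05                                   # negligible association by most clinical standards
for n in (50, 500, 50_000):
    x = rng.normal(size=n)
    y = true_r * x + np.sqrt(1 - true_r**2) * rng.normal(size=n)
    r, p = stats.pearsonr(x, y)
    print(f"n = {n:6d}  r = {r:+.3f}  p = {p:.3g}")
# The estimated r hovers near 0.05 throughout, yet p drops below 0.05 once n is large enough,
# which is why effect size and clinical meaning matter more than the p-value alone.
```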

  1. Across-cohort QC analyses of GWAS summary statistics from complex traits.

    PubMed

    Chen, Guo-Bo; Lee, Sang Hong; Robinson, Matthew R; Trzaskowski, Maciej; Zhu, Zhi-Xiang; Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Kutalik, Zoltán; Loos, Ruth J F; Frayling, Timothy M; Hirschhorn, Joel N; Yang, Jian; Wray, Naomi R; Visscher, Peter M

    2016-01-01

    Genome-wide association studies (GWASs) have been successful in discovering SNP trait associations for many quantitative traits and common diseases. Typically, the effect sizes of SNP alleles are very small and this requires large genome-wide association meta-analyses (GWAMAs) to maximize statistical power. A trend towards ever-larger GWAMA is likely to continue, yet dealing with summary statistics from hundreds of cohorts increases logistical and quality control problems, including unknown sample overlap, and these can lead to both false positive and false negative findings. In this study, we propose four metrics and visualization tools for GWAMA, using summary statistics from cohort-level GWASs. We propose methods to examine the concordance between demographic information, and summary statistics and methods to investigate sample overlap. (I) We use the population genetics FST statistic to verify the genetic origin of each cohort and their geographic location, and demonstrate using GWAMA data from the GIANT Consortium that geographic locations of cohorts can be recovered and outlier cohorts can be detected. (II) We conduct principal component analysis based on reported allele frequencies, and are able to recover the ancestral information for each cohort. (III) We propose a new statistic that uses the reported allelic effect sizes and their standard errors to identify significant sample overlap or heterogeneity between pairs of cohorts. (IV) To quantify unknown sample overlap across all pairs of cohorts, we propose a method that uses randomly generated genetic predictors that does not require the sharing of individual-level genotype data and does not breach individual privacy.
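
    The intuition behind the sample-overlap statistic (III) can be illustrated crudely: for two cohorts with no shared individuals, the z-scores of independent null SNPs are uncorrelated, so a clearly non-zero correlation across many SNPs flags overlap or heterogeneity. The sketch below simulates summary statistics and is a simplification of the idea, not the authors' implementation.

```python
# Sketch: correlation of per-SNP z-scores between cohorts as a crude sample-overlap diagnostic.
import numpy as np

rng = np.random.default_rng(1)
n_snps = 20_000
shared = rng.normal(size=n_snps)   # noise contributed by individuals present in both cohorts

def cohort_z(overlap_fraction):
    """z-scores of null SNPs; overlapping samples induce correlated noise across cohorts."""
    return (np.sqrt(overlap_fraction) * shared
            + np.sqrt(1.0 - overlap_fraction) * rng.normal(size=n_snps))

z_a = cohort_z(0.30)   # cohort A: ~30% effective overlap with the shared pool (hypothetical)
z_b = cohort_z(0.30)   # cohort B: likewise, so corr(A, B) is expected near 0.30
z_c = cohort_z(0.00)   # cohort C: independent samples, corr near zero

print("corr(A, B):", round(float(np.corrcoef(z_a, z_b)[0, 1]), 3))
print("corr(A, C):", round(float(np.corrcoef(z_a, z_c)[0, 1]), 3))
```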

  2. Across-cohort QC analyses of GWAS summary statistics from complex traits

    PubMed Central

    Chen, Guo-Bo; Lee, Sang Hong; Robinson, Matthew R; Trzaskowski, Maciej; Zhu, Zhi-Xiang; Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Kutalik, Zoltán; Loos, Ruth J F; Frayling, Timothy M; Hirschhorn, Joel N; Yang, Jian; Wray, Naomi R; Visscher, Peter M

    2017-01-01

    Genome-wide association studies (GWASs) have been successful in discovering SNP trait associations for many quantitative traits and common diseases. Typically, the effect sizes of SNP alleles are very small and this requires large genome-wide association meta-analyses (GWAMAs) to maximize statistical power. A trend towards ever-larger GWAMA is likely to continue, yet dealing with summary statistics from hundreds of cohorts increases logistical and quality control problems, including unknown sample overlap, and these can lead to both false positive and false negative findings. In this study, we propose four metrics and visualization tools for GWAMA, using summary statistics from cohort-level GWASs. We propose methods to examine the concordance between demographic information, and summary statistics and methods to investigate sample overlap. (I) We use the population genetics Fst statistic to verify the genetic origin of each cohort and their geographic location, and demonstrate using GWAMA data from the GIANT Consortium that geographic locations of cohorts can be recovered and outlier cohorts can be detected. (II) We conduct principal component analysis based on reported allele frequencies, and are able to recover the ancestral information for each cohort. (III) We propose a new statistic that uses the reported allelic effect sizes and their standard errors to identify significant sample overlap or heterogeneity between pairs of cohorts. (IV) To quantify unknown sample overlap across all pairs of cohorts, we propose a method that uses randomly generated genetic predictors that does not require the sharing of individual-level genotype data and does not breach individual privacy. PMID:27552965

  3. Dissecting the genetics of complex traits using summary association statistics

    PubMed Central

    Pasaniuc, Bogdan; Price, Alkes L.

    2017-01-01

    During the past decade, genome-wide association studies (GWAS) have successfully identified tens of thousands of genetic variants associated with complex traits and diseases. These studies have produced extensive repositories of genetic variation and trait measurements across large numbers of individuals, providing tremendous opportunities for further analyses. However, privacy concerns and other logistical considerations often limit access to individual-level genetic data, motivating the development of methods that analyze summary association statistics. Here we review recent progress on statistical methods that leverage summary association data to gain insights into the genetic basis of complex traits and diseases. PMID:27840428

  4. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing.

    PubMed

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-02-01

    A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
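
    A minimal sketch of the two complementary analyses on simulated paired measurements (for example, variant allele fractions reported by a reference assay and an NGS assay) is given below; this is not the validation data used in the study, and the error parameters are arbitrary.

```python
# Sketch: Bland-Altman statistics and Deming regression for paired quantitative assay results.
import numpy as np

rng = np.random.default_rng(2)
truth = rng.uniform(0.05, 0.60, size=60)
x = truth + rng.normal(0, 0.01, size=60)                  # reference assay
y = 0.02 + 1.05 * truth + rng.normal(0, 0.01, size=60)    # new assay with constant + proportional error

# Bland-Altman: mean bias and 95% limits of agreement
diff = y - x
bias = diff.mean()
loa = (bias - 1.96 * diff.std(ddof=1), bias + 1.96 * diff.std(ddof=1))
print(f"bias = {bias:.3f}, limits of agreement = ({loa[0]:.3f}, {loa[1]:.3f})")

# Deming regression (errors in both variables; error-variance ratio lambda assumed = 1)
lam = 1.0
sxx, syy = np.var(x, ddof=1), np.var(y, ddof=1)
sxy = np.cov(x, y, ddof=1)[0, 1]
slope = (syy - lam * sxx + np.sqrt((syy - lam * sxx) ** 2 + 4 * lam * sxy ** 2)) / (2 * sxy)
intercept = y.mean() - slope * x.mean()
print(f"Deming slope = {slope:.3f} (proportional error), intercept = {intercept:.3f} (constant error)")
```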

  5. Selection and Reporting of Statistical Methods to Assess Reliability of a Diagnostic Test: Conformity to Recommended Methods in a Peer-Reviewed Journal

    PubMed Central

    Park, Ji Eun; Han, Kyunghwa; Sung, Yu Sub; Chung, Mi Sun; Koo, Hyun Jung; Yoon, Hee Mang; Choi, Young Jun; Lee, Seung Soo; Kim, Kyung Won; Shin, Youngbin; An, Suah; Cho, Hyo-Min

    2017-01-01

    Objective To evaluate the frequency and adequacy of statistical analyses in a general radiology journal when reporting a reliability analysis for a diagnostic test. Materials and Methods Sixty-three studies of diagnostic test accuracy (DTA) and 36 studies reporting reliability analyses published in the Korean Journal of Radiology between 2012 and 2016 were analyzed. Studies were judged using the methodological guidelines of the Radiological Society of North America-Quantitative Imaging Biomarkers Alliance (RSNA-QIBA), and COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) initiative. DTA studies were evaluated by nine editorial board members of the journal. Reliability studies were evaluated by study reviewers experienced with reliability analysis. Results Thirty-one (49.2%) of the 63 DTA studies did not include a reliability analysis when deemed necessary. Among the 36 reliability studies, proper statistical methods were used in all (5/5) studies dealing with dichotomous/nominal data, 46.7% (7/15) of studies dealing with ordinal data, and 95.2% (20/21) of studies dealing with continuous data. Statistical methods were described in sufficient detail regarding weighted kappa in 28.6% (2/7) of studies and regarding the model and assumptions of intraclass correlation coefficient in 35.3% (6/17) and 29.4% (5/17) of studies, respectively. Reliability parameters were used as if they were agreement parameters in 23.1% (3/13) of studies. Reproducibility and repeatability were used incorrectly in 20% (3/15) of studies. Conclusion Greater attention to the importance of reporting reliability, thorough description of the related statistical methods, efforts not to neglect agreement parameters, and better use of relevant terminology is necessary. PMID:29089821
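
    One of the statistics whose reporting the review scrutinises, weighted kappa for ordinal ratings, can be computed as follows. The two readers' scores are invented; the point is that the weighting scheme (none, linear, quadratic) changes the result and therefore needs to be reported.

```python
# Sketch: unweighted versus quadratic-weighted kappa for two readers scoring an ordinal 0-3 scale.
from sklearn.metrics import cohen_kappa_score

reader_1 = [0, 1, 1, 2, 3, 2, 1, 0, 3, 2, 2, 1]   # invented ratings
reader_2 = [0, 1, 2, 2, 3, 1, 1, 0, 3, 2, 1, 1]

kappa_plain     = cohen_kappa_score(reader_1, reader_2)
kappa_quadratic = cohen_kappa_score(reader_1, reader_2, weights="quadratic")
print(f"unweighted kappa = {kappa_plain:.2f}, quadratic-weighted kappa = {kappa_quadratic:.2f}")
# Stating which weighting was used (and why) is exactly the kind of methodological
# detail the review found to be under-reported.
```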

  6. Visualization of time series statistical data by shape analysis (GDP ratio changes among Asia countries)

    NASA Astrophysics Data System (ADS)

    Shirota, Yukari; Hashimoto, Takako; Fitri Sari, Riri

    2018-03-01

    Visualizing time series big data has become very important. In this paper we discuss a new analysis method, called “statistical shape analysis” or “geometry driven statistics”, applied to time series statistical data in economics. We analyse the changes in agriculture, value added and industry, value added (percentage of GDP) from 2000 to 2010 in Asia. We handle the data as a set of landmarks on a two-dimensional image and examine the deformation using principal components. The key point of the method is that the principal components of a given formation are eigenvectors of its bending energy matrix. The local deformation can be expressed as a set of non-affine transformations, which provide information about the local differences between 2000 and 2010. Because a non-affine transformation can be decomposed into a set of partial warps, we present the partial warps visually. Statistical shape analysis is widely used in biology, but applications in economics are hard to find. In this paper, we investigate its potential for analysing economic data.

  7. Implementing informative priors for heterogeneity in meta-analysis using meta-regression and pseudo data.

    PubMed

    Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T

    2016-12-20

    Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
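
    For orientation, the frequentist DerSimonian and Laird procedure that the Bayesian approaches are compared against can be written in a few lines; the study effects below are hypothetical log odds ratios, chosen only to show how imprecise the between-study variance is when few studies are available.

```python
# Sketch: DerSimonian-Laird random-effects meta-analysis (the frequentist comparator above).
import numpy as np

y = np.array([0.30, 0.10, 0.45, -0.05, 0.25])   # per-study effect estimates (hypothetical log ORs)
v = np.array([0.04, 0.06, 0.09, 0.05, 0.07])    # per-study sampling variances

w = 1.0 / v
q = np.sum(w * (y - np.sum(w * y) / np.sum(w)) ** 2)          # Cochran's Q
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (q - (len(y) - 1)) / c)                       # DL between-study variance estimate

w_re = 1.0 / (v + tau2)
mu = np.sum(w_re * y) / np.sum(w_re)
se = np.sqrt(1.0 / np.sum(w_re))
print(f"tau^2 = {tau2:.4f}, pooled effect = {mu:.3f} "
      f"(95% CI {mu - 1.96 * se:.3f} to {mu + 1.96 * se:.3f})")
# With only five studies tau^2 is poorly estimated, which is precisely the motivation
# for borrowing external information on heterogeneity through an informative prior.
```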

  8. Scripts for TRUMP data analyses. Part II (HLA-related data): statistical analyses specific for hematopoietic stem cell transplantation.

    PubMed

    Kanda, Junya

    2016-01-01

    The Transplant Registry Unified Management Program (TRUMP) made it possible for members of the Japan Society for Hematopoietic Cell Transplantation (JSHCT) to analyze large sets of national registry data on autologous and allogeneic hematopoietic stem cell transplantation. However, as the processes used to collect transplantation information are complex and differed over time, the background of these processes should be understood when using TRUMP data. Previously, information on the HLA locus of patients and donors had been collected using a questionnaire-based free-description method, resulting in some input errors. To correct minor but significant errors and provide accurate HLA matching data, the use of a Stata or EZR/R script offered by the JSHCT is strongly recommended when analyzing HLA data in the TRUMP dataset. The HLA mismatch direction, mismatch counting method, and different impacts of HLA mismatches by stem cell source are other important factors in the analysis of HLA data. Additionally, researchers should understand the statistical analyses specific for hematopoietic stem cell transplantation, such as competing risk, landmark analysis, and time-dependent analysis, to correctly analyze transplant data. The data center of the JSHCT can be contacted if statistical assistance is required.

  9. Meta-analysis of neutropenia or leukopenia as a prognostic factor in patients with malignant disease undergoing chemotherapy.

    PubMed

    Shitara, Kohei; Matsuo, Keitaro; Oze, Isao; Mizota, Ayako; Kondo, Chihiro; Nomura, Motoo; Yokota, Tomoya; Takahari, Daisuke; Ura, Takashi; Muro, Kei

    2011-08-01

    We performed a systematic review and meta-analysis to determine the impact of neutropenia or leukopenia experienced during chemotherapy on survival. Eligible studies included prospective or retrospective analyses that evaluated neutropenia or leukopenia as a prognostic factor for overall survival or disease-free survival. Statistical analyses were conducted to calculate a summary hazard ratio and 95% confidence interval (CI) using random-effects or fixed-effects models based on the heterogeneity of the included studies. Thirteen trials were selected for the meta-analysis, with a total of 9,528 patients. The hazard ratio of death was 0.69 (95% CI, 0.64-0.75) for patients with higher-grade neutropenia or leukopenia compared to patients with lower-grade or lack of cytopenia. Our analysis was also stratified by statistical method (any statistical method to decrease lead-time bias; time-varying analysis or landmark analysis), but no differences were observed. Our results indicate that neutropenia or leukopenia experienced during chemotherapy is associated with improved survival in patients with advanced cancer or hematological malignancies undergoing chemotherapy. Future prospective analyses designed to investigate the potential impact of chemotherapy dose adjustment coupled with monitoring of neutropenia or leukopenia on survival are warranted.

  10. Trends in Citations to Books on Epidemiological and Statistical Methods in the Biomedical Literature

    PubMed Central

    Porta, Miquel; Vandenbroucke, Jan P.; Ioannidis, John P. A.; Sanz, Sergio; Fernandez, Esteve; Bhopal, Raj; Morabia, Alfredo; Victora, Cesar; Lopez, Tomàs

    2013-01-01

    Background There are no analyses of citations to books on epidemiological and statistical methods in the biomedical literature. Such analyses may shed light on how concepts and methods changed while biomedical research evolved. Our aim was to analyze the number and time trends of citations received from biomedical articles by books on epidemiological and statistical methods, and related disciplines. Methods and Findings The data source was the Web of Science. The study books were published between 1957 and 2010. The first year of publication of the citing articles was 1945. We identified 125 books that received at least 25 citations. Books first published in 1980–1989 had the highest total and median number of citations per year. Nine of the 10 most cited texts focused on statistical methods. Hosmer & Lemeshow's Applied logistic regression received the highest number of citations and highest average annual rate. It was followed by books by Fleiss, Armitage, et al., Rothman, et al., and Kalbfleisch and Prentice. Fifth in citations per year was Sackett, et al., Evidence-based medicine. The rise of multivariate methods, clinical epidemiology, or nutritional epidemiology was reflected in the citation trends. Educational textbooks, practice-oriented books, books on epidemiological substantive knowledge, and on theory and health policies were much less cited. None of the 25 top-cited books had the theoretical or sociopolitical scope of works by Cochrane, McKeown, Rose, or Morris. Conclusions Books were mainly cited to reference methods. Books first published in the 1980s continue to be most influential. Older books on theory and policies were rooted in societal and general medical concerns, while the most modern books are almost purely on methods. PMID:23667447

  11. Polygenic scores via penalized regression on summary statistics.

    PubMed

    Mak, Timothy Shin Heng; Porsch, Robert Milan; Choi, Shing Wan; Zhou, Xueya; Sham, Pak Chung

    2017-09-01

    Polygenic scores (PGS) summarize the genetic contribution of a person's genotype to a disease or phenotype. They can be used to group participants into different risk categories for diseases, and are also used as covariates in epidemiological analyses. A number of possible ways of calculating PGS have been proposed, and recently there is much interest in methods that incorporate information available in published summary statistics. As there is no inherent information on linkage disequilibrium (LD) in summary statistics, a pertinent question is how we can use LD information available elsewhere to supplement such analyses. To answer this question, we propose a method for constructing PGS using summary statistics and a reference panel in a penalized regression framework, which we call lassosum. We also propose a general method for choosing the value of the tuning parameter in the absence of validation data. In our simulations, we showed that pseudovalidation often resulted in prediction accuracy that is comparable to using a dataset with validation phenotype and was clearly superior to the conservative option of setting the tuning parameter of lassosum to its lowest value. We also showed that lassosum achieved better prediction accuracy than simple clumping and P-value thresholding in almost all scenarios. It was also substantially faster and more accurate than the recently proposed LDpred. © 2017 WILEY PERIODICALS, INC.
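
    The P-value thresholding baseline that lassosum is benchmarked against reduces to keeping SNPs below a P-value cut-off and scoring individuals by their weighted allele counts. The sketch below simulates the summary statistics and genotypes and ignores LD entirely, so it corresponds to thresholding without clumping rather than to lassosum itself.

```python
# Sketch: polygenic score by simple P-value thresholding of summary statistics (no LD adjustment).
import numpy as np

rng = np.random.default_rng(3)
n_snps, n_people = 1_000, 200
betas = rng.normal(0, 0.05, size=n_snps)                    # published per-allele effect sizes
pvals = rng.uniform(size=n_snps)                            # published P-values (simulated)
genotypes = rng.binomial(2, 0.3, size=(n_people, n_snps))   # 0/1/2 allele counts per person

threshold = 0.05
keep = pvals < threshold                          # thresholding step
pgs = genotypes[:, keep] @ betas[keep]            # weighted allele count = polygenic score
print(f"SNPs retained: {keep.sum()}, PGS mean = {pgs.mean():.3f}, SD = {pgs.std():.3f}")
# lassosum instead shrinks the betas with an L1 penalty, using LD from a reference panel,
# which is what typically improves prediction over a hard threshold.
```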

  12. FARVATX: FAmily-based Rare Variant Association Test for X-linked genes

    PubMed Central

    Choi, Sungkyoung; Lee, Sungyoung; Qiao, Dandi; Hardin, Megan; Cho, Michael H.; Silverman, Edwin K; Park, Taesung; Won, Sungho

    2016-01-01

    Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease (COPD). Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods. PMID:27325607

  13. FARVATX: Family-Based Rare Variant Association Test for X-Linked Genes.

    PubMed

    Choi, Sungkyoung; Lee, Sungyoung; Qiao, Dandi; Hardin, Megan; Cho, Michael H; Silverman, Edwin K; Park, Taesung; Won, Sungho

    2016-09-01

    Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease. Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods. © 2016 WILEY PERIODICALS, INC.

  14. Topographic ERP analyses: a step-by-step tutorial review.

    PubMed

    Murray, Micah M; Brunet, Denis; Michel, Christoph M

    2008-06-01

    In this tutorial review, we detail both the rationale for as well as the implementation of a set of analyses of surface-recorded event-related potentials (ERPs) that uses the reference-free spatial (i.e. topographic) information available from high-density electrode montages to render statistical information concerning modulations in response strength, latency, and topography both between and within experimental conditions. In these and other ways these topographic analysis methods allow the experimenter to glean additional information and neurophysiologic interpretability beyond what is available from canonical waveform analyses. In this tutorial we present the example of somatosensory evoked potentials (SEPs) in response to stimulation of each hand to illustrate these points. For each step of these analyses, we provide the reader with both a conceptual and mathematical description of how the analysis is carried out, what it yields, and how to interpret its statistical outcome. We show that these topographic analysis methods are intuitive and easy-to-use approaches that can remove much of the guesswork often confronting ERP researchers and also assist in identifying the information contained within high-density ERP datasets.

  15. Use of recurrence plots in the analysis of pupil diameter dynamics in narcoleptics

    NASA Astrophysics Data System (ADS)

    Keegan, Andrew P.; Zbilut, J. P.; Merritt, S. L.; Mercer, P. J.

    1993-11-01

    Recurrence plots were used to evaluate pupil dynamics of subjects with narcolepsy. Preliminary data indicate that this nonlinear method of analysis may be more useful in revealing underlying deterministic differences than traditional methods such as the FFT and counting statistics.
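
    A recurrence plot for a scalar series such as pupil diameter is built from the pairwise distances between delay-embedded states; points closer than a chosen radius are marked as recurrent. The signal, embedding parameters and radius below are all illustrative choices, not those of the study.

```python
# Sketch: recurrence plot of a synthetic pupil-diameter-like signal.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
t = np.arange(0, 60, 0.1)
signal = 4 + 0.3 * np.sin(0.5 * t) + 0.1 * rng.normal(size=t.size)   # synthetic diameter (mm)

m, tau = 3, 5                                  # embedding dimension and delay (illustrative)
n = signal.size - (m - 1) * tau
states = np.column_stack([signal[i * tau : i * tau + n] for i in range(m)])

dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
epsilon = 0.1 * dists.max()                    # recurrence radius (illustrative)
recurrence = (dists < epsilon).astype(int)

plt.imshow(recurrence, cmap="binary", origin="lower")
plt.xlabel("time index"); plt.ylabel("time index"); plt.title("Recurrence plot (synthetic signal)")
plt.show()
```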

  16. A robust and efficient statistical method for genetic association studies using case and control samples from multiple cohorts

    PubMed Central

    2013-01-01

    Background The theoretical basis of genome-wide association studies (GWAS) is statistical inference of linkage disequilibrium (LD) between any polymorphic marker and a putative disease locus. Most methods widely implemented for such analyses are vulnerable to several key demographic factors, deliver poor statistical power for detecting genuine associations, and also have a high false positive rate. Here, we present a likelihood-based statistical approach that properly accounts for the non-random nature of case–control samples with regard to the genotypic distribution at the loci in the populations under study, and that offers the flexibility to test for genetic association in the presence of different confounding factors such as population structure and non-random sampling. Results We implemented this novel method, together with several popular methods from the GWAS literature, to re-analyze recently published Parkinson’s disease (PD) case–control samples. The real-data analysis and computer simulations show that, compared with its rivals, the new method confers not only significantly improved statistical power for detecting associations but also robustness to the difficulties stemming from non-random sampling and genetic structure. In particular, the new method detected 44 significant SNPs within 25 chromosomal regions of size < 1 Mb, whereas only 6 SNPs in two of these regions had previously been detected by the trend-test-based methods. It discovered two SNPs located 1.18 Mb and 0.18 Mb from the PD candidate genes FGF20 and PARK8, without inflating the false positive risk. Conclusions We developed a novel likelihood-based method that provides adequate estimation of LD and other population model parameters from case and control samples, allows easy integration of samples from multiple genetically divergent populations, and thus confers statistically robust and powerful GWAS analyses. On the basis of simulation studies and analysis of real datasets, we demonstrated significant improvement of the new method over the non-parametric trend test, which is the most widely used in the GWAS literature. PMID:23394771

  17. Methods for detecting, quantifying, and adjusting for dissemination bias in meta-analysis are described.

    PubMed

    Mueller, Katharina Felicitas; Meerpohl, Joerg J; Briel, Matthias; Antes, Gerd; von Elm, Erik; Lang, Britta; Motschall, Edith; Schwarzer, Guido; Bassler, Dirk

    2016-12-01

    To systematically review methodological articles which focus on nonpublication of studies and to describe methods of detecting and/or quantifying and/or adjusting for dissemination in meta-analyses. To evaluate whether the methods have been applied to an empirical data set for which one can be reasonably confident that all studies conducted have been included. We systematically searched Medline, the Cochrane Library, and Web of Science, for methodological articles that describe at least one method of detecting and/or quantifying and/or adjusting for dissemination bias in meta-analyses. The literature search retrieved 2,224 records, of which we finally included 150 full-text articles. A great variety of methods to detect, quantify, or adjust for dissemination bias were described. Methods included graphical methods mainly based on funnel plot approaches, statistical methods, such as regression tests, selection models, sensitivity analyses, and a great number of more recent statistical approaches. Only few methods have been validated in empirical evaluations using unpublished studies obtained from regulators (Food and Drug Administration, European Medicines Agency). We present an overview of existing methods to detect, quantify, or adjust for dissemination bias. It remains difficult to advise which method should be used as they are all limited and their validity has rarely been assessed. Therefore, a thorough literature search remains crucial in systematic reviews, and further steps to increase the availability of all research results need to be taken. Copyright © 2016 Elsevier Inc. All rights reserved.
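
    One of the regression tests covered by such reviews, Egger's test for funnel-plot asymmetry, regresses each study's standard normal deviate on its precision and asks whether the intercept departs from zero. The studies below are simulated with a built-in small-study effect so the test has something to find; this is a sketch of the generic test, not of any specific method from the review.

```python
# Sketch: Egger's regression test for funnel-plot asymmetry on simulated study results.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
se = rng.uniform(0.05, 0.4, size=30)            # study standard errors
effect = 0.2 + 0.8 * se + rng.normal(0, se)     # small studies report larger effects (simulated bias)

snd = effect / se                               # standard normal deviate of each study
precision = 1.0 / se
fit = sm.OLS(snd, sm.add_constant(precision)).fit()
print(f"Egger intercept = {fit.params[0]:.2f}, p = {fit.pvalues[0]:.3f}")
# An intercept clearly different from zero is read as evidence of small-study effects,
# one possible (but not conclusive) signature of dissemination bias.
```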

  18. Trends in citations to books on epidemiological and statistical methods in the biomedical literature.

    PubMed

    Porta, Miquel; Vandenbroucke, Jan P; Ioannidis, John P A; Sanz, Sergio; Fernandez, Esteve; Bhopal, Raj; Morabia, Alfredo; Victora, Cesar; Lopez, Tomàs

    2013-01-01

    There are no analyses of citations to books on epidemiological and statistical methods in the biomedical literature. Such analyses may shed light on how concepts and methods changed while biomedical research evolved. Our aim was to analyze the number and time trends of citations received from biomedical articles by books on epidemiological and statistical methods, and related disciplines. The data source was the Web of Science. The study books were published between 1957 and 2010. The first year of publication of the citing articles was 1945. We identified 125 books that received at least 25 citations. Books first published in 1980-1989 had the highest total and median number of citations per year. Nine of the 10 most cited texts focused on statistical methods. Hosmer & Lemeshow's Applied logistic regression received the highest number of citations and highest average annual rate. It was followed by books by Fleiss, Armitage, et al., Rothman, et al., and Kalbfleisch and Prentice. Fifth in citations per year was Sackett, et al., Evidence-based medicine. The rise of multivariate methods, clinical epidemiology, or nutritional epidemiology was reflected in the citation trends. Educational textbooks, practice-oriented books, books on epidemiological substantive knowledge, and on theory and health policies were much less cited. None of the 25 top-cited books had the theoretical or sociopolitical scope of works by Cochrane, McKeown, Rose, or Morris. Books were mainly cited to reference methods. Books first published in the 1980s continue to be most influential. Older books on theory and policies were rooted in societal and general medical concerns, while the most modern books are almost purely on methods.

  19. Bias, precision and statistical power of analysis of covariance in the analysis of randomized trials with baseline imbalance: a simulation study

    PubMed Central

    2014-01-01

    Background Analysis of variance (ANOVA), change-score analysis (CSA) and analysis of covariance (ANCOVA) respond differently to baseline imbalance in randomized controlled trials. However, no empirical studies appear to have quantified the differential bias and precision of estimates derived from these methods of analysis, and their relative statistical power, in relation to combinations of levels of key trial characteristics. This simulation study therefore examined the relative bias, precision and statistical power of these three analyses using simulated trial data. Methods 126 hypothetical trial scenarios were evaluated (126 000 datasets), each with continuous data simulated by using a combination of levels of: treatment effect; pretest-posttest correlation; direction and magnitude of baseline imbalance. The bias, precision and power of each method of analysis were calculated for each scenario. Results Compared to the unbiased estimates produced by ANCOVA, both ANOVA and CSA are subject to bias, in relation to pretest-posttest correlation and the direction of baseline imbalance. Additionally, ANOVA and CSA are less precise than ANCOVA, especially when pretest-posttest correlation ≥ 0.3. When groups are balanced at baseline, ANCOVA is at least as powerful as the other analyses. The apparently greater power of ANOVA and CSA at certain imbalances is achieved at the expense of a biased treatment effect estimate. Conclusions Across a range of correlations between pre- and post-treatment scores and at varying levels and directions of baseline imbalance, ANCOVA remains the optimum statistical method for the analysis of continuous outcomes in RCTs, in terms of bias, precision and statistical power. PMID:24712304
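
    A single simulated scenario makes the pattern above easy to see: generate a trial with baseline imbalance, then estimate the treatment effect three ways. The parameters below (imbalance of 3 points, true adjusted effect of 5) are arbitrary choices, not the ones used in the published simulation.

```python
# Sketch: one simulated trial with baseline imbalance analysed by ANOVA, CSA and ANCOVA.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n = 100
group = np.repeat([0, 1], n)                          # 0 = control, 1 = treatment
pre = rng.normal(50, 10, size=2 * n) + 3 * group      # baseline imbalance favouring treatment
post = 0.6 * pre + 5.0 * group + rng.normal(0, 8, size=2 * n)   # true adjusted effect = 5

df = pd.DataFrame({"group": group, "pre": pre, "post": post, "change": post - pre})
fits = {
    "ANOVA":  smf.ols("post ~ group", data=df).fit(),          # ignores baseline
    "CSA":    smf.ols("change ~ group", data=df).fit(),        # change-score analysis
    "ANCOVA": smf.ols("post ~ group + pre", data=df).fit(),    # adjusts for baseline
}
for name, fit in fits.items():
    print(f"{name:7s} effect = {fit.params['group']:5.2f}  (SE {fit.bse['group']:.2f})")
# In this setup ANOVA over-estimates and CSA under-estimates the effect because of the
# imbalance, while ANCOVA recovers an estimate near 5 with the smallest standard error.
```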

  20. SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.

    PubMed

    Chu, Annie; Cui, Jenny; Dinov, Ivo D

    2009-03-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most updated information and newly added models.

  1. Sensitivity analysis of static resistance of slender beam under bending

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Valeš, Jan

    2016-06-08

    The paper deals with statistical and sensitivity analyses of the resistance of simply supported I-beams under bending. The resistance was computed by the geometrically nonlinear finite element method in the programme Ansys. The beams are modelled with initial geometrical imperfections following the first eigenmode of buckling. The imperfections were, together with the geometrical characteristics of the cross section and the material characteristics of the steel, considered as random quantities. The Latin Hypercube Sampling method was applied to carry out the statistical and sensitivity analyses of the resistance.
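
    The Latin Hypercube step can be illustrated with SciPy's QMC module. The three input quantities, their ranges and the dummy response below are hypothetical placeholders, not the imperfection, cross-section and material variables actually randomised in the study.

```python
# Sketch: Latin Hypercube Sampling of three inputs for a (stand-in) beam resistance model.
import numpy as np
from scipy.stats import qmc

sampler = qmc.LatinHypercube(d=3, seed=7)
unit_sample = sampler.random(n=100)               # 100 stratified points in [0, 1)^3

#                 imperfection (mm)  yield strength (MPa)  flange thickness (mm)  -- hypothetical
lower = np.array([0.1,               235.0,                 8.0])
upper = np.array([5.0,               355.0,                12.0])
inputs = qmc.scale(unit_sample, lower, upper)

# Each row would normally be passed to the nonlinear FE model; a simple algebraic
# stand-in response is used here so the downstream statistics can be shown.
rng = np.random.default_rng(7)
resistance = 0.8 * inputs[:, 1] * inputs[:, 2] - 15.0 * inputs[:, 0] + rng.normal(0, 50, 100)
print(f"mean resistance = {resistance.mean():.1f}, std = {resistance.std(ddof=1):.1f}")
# Sensitivity can then be screened, e.g. by rank correlation of each input with the response.
```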

  2. On the Use of Biomineral Oxygen Isotope Data to Identify Human Migrants in the Archaeological Record: Intra-Sample Variation, Statistical Methods and Geographical Considerations

    PubMed Central

    Lightfoot, Emma; O’Connell, Tamsin C.

    2016-01-01

    Oxygen isotope analysis of archaeological skeletal remains is an increasingly popular tool to study past human migrations. It is based on the assumption that human body chemistry preserves the δ18O of precipitation in such a way as to be a useful technique for identifying migrants and, potentially, their homelands. In this study, the first such global survey, we draw on published human tooth enamel and bone bioapatite data to explore the validity of using oxygen isotope analyses to identify migrants in the archaeological record. We use human δ18O results to show that there are large variations in human oxygen isotope values within a population sample. This may relate to physiological factors influencing the preservation of the primary isotope signal, or due to human activities (such as brewing, boiling, stewing, differential access to water sources and so on) causing variation in ingested water and food isotope values. We compare the number of outliers identified using various statistical methods. We determine that the most appropriate method for identifying migrants is dependent on the data but is likely to be the IQR or median absolute deviation from the median under most archaeological circumstances. Finally, through a spatial assessment of the dataset, we show that the degree of overlap in human isotope values from different locations across Europe is such that identifying individuals’ homelands on the basis of oxygen isotope analysis alone is not possible for the regions analysed to date. Oxygen isotope analysis is a valid method for identifying first-generation migrants from an archaeological site when used appropriately, however it is difficult to identify migrants using statistical methods for a sample size of less than c. 25 individuals. In the absence of local previous analyses, each sample should be treated as an individual dataset and statistical techniques can be used to identify migrants, but in most cases pinpointing a specific homeland should not be attempted. PMID:27124001
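
    The outlier rules compared in the study, the IQR rule and the median absolute deviation from the median, are straightforward to apply to a set of enamel δ18O values. The values below are invented; two extreme individuals are planted so that both rules have something to flag.

```python
# Sketch: flagging possible migrants among enamel delta-18O values with IQR and MAD rules.
import numpy as np

d18o = np.array([26.1, 26.4, 25.9, 26.2, 26.0, 26.3, 26.5, 25.8, 26.1, 24.2, 28.0])  # invented, permil

# IQR rule: outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
q1, q3 = np.percentile(d18o, [25, 75])
iqr = q3 - q1
iqr_outliers = (d18o < q1 - 1.5 * iqr) | (d18o > q3 + 1.5 * iqr)

# MAD rule: robust z-score |x - median| / (1.4826 * MAD) above a cut-off (here 3)
med = np.median(d18o)
mad = np.median(np.abs(d18o - med))
mad_outliers = np.abs(d18o - med) / (1.4826 * mad) > 3

print("IQR rule flags:", d18o[iqr_outliers])
print("MAD rule flags:", d18o[mad_outliers])
# Flagged individuals are candidates for first-generation migrants; as the paper argues,
# pinpointing a specific homeland from the isotope values alone should not be attempted.
```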

  3. Statistical methods and errors in family medicine articles between 2010 and 2014-Suez Canal University, Egypt: A cross-sectional study

    PubMed Central

    Nour-Eldein, Hebatallah

    2016-01-01

    Background: Given the limited statistical knowledge of most physicians, it is not uncommon to find statistical errors in research articles. Objectives: To determine the statistical methods and to assess the statistical errors in family medicine (FM) research articles that were published between 2010 and 2014. Methods: This was a cross-sectional study. All 66 FM research articles that were published over 5 years by FM authors with affiliation to Suez Canal University were screened by the researcher between May and August 2015. Types and frequencies of statistical methods were reviewed in all 66 FM articles. All 60 articles with identified inferential statistics were examined for statistical errors and deficiencies. A comprehensive 58-item checklist based on statistical guidelines was used to evaluate the statistical quality of FM articles. Results: Inferential methods were recorded in 62/66 (93.9%) of FM articles. Advanced analyses were used in 29/66 (43.9%). Contingency tables 38/66 (57.6%), regression (logistic, linear) 26/66 (39.4%), and t-test 17/66 (25.8%) were the most commonly used inferential tests. Within the 60 FM articles with identified inferential statistics, deficiencies included no prior sample size calculation in 19/60 (31.7%), application of wrong statistical tests in 17/60 (28.3%), incomplete documentation of statistics in 59/60 (98.3%), reporting P values without test statistics in 32/60 (53.3%), no reporting of confidence intervals with effect size measures in 12/60 (20.0%), use of the mean (standard deviation) to describe ordinal/non-normal data in 8/60 (13.3%), and interpretation errors, mainly conclusions not supported by the study data, in 5/60 (8.3%). Conclusion: Inferential statistics were used in the majority of FM articles. Data analysis and reporting statistics are areas for improvement in FM research articles. PMID:27453839

  4. Plant Taxonomy as a Field Study

    ERIC Educational Resources Information Center

    Dalby, D. H.

    1970-01-01

    Suggests methods of teaching plant identification and taxonomic theory using keys, statistical analyses, and biometrics. Population variation, genotype-environment interaction and experimental taxonomy are used in laboratory and field. (AL)

  5. A Comparison of Imputation Methods for Bayesian Factor Analysis Models

    ERIC Educational Resources Information Center

    Merkle, Edgar C.

    2011-01-01

    Imputation methods are popular for the handling of missing data in psychology. The methods generally consist of predicting missing data based on observed data, yielding a complete data set that is amenable to standard statistical analyses. In the context of Bayesian factor analysis, this article compares imputation under an unrestricted…

  6. Learning investment indicators through data extension

    NASA Astrophysics Data System (ADS)

    Dvořák, Marek

    2017-07-01

    Stock prices in the form of time series were analysed using univariate and multivariate statistical methods. After simple data preprocessing in the form of logarithmic differences, we augmented the univariate time series into a multivariate representation. The method uses sliding windows to calculate several dozen new variables, based on simple statistical tools such as first and second moments as well as more complicated statistics such as autoregression coefficients and residual analysis, followed by an optional quadratic transformation used for further data extension. These were used as explanatory variables in a regularized logistic LASSO regression that estimated the Buy-Sell Index (BSI) from real stock market data.
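
    A compressed sketch of the pipeline just described: log-differences of the price series, sliding-window summary features, and an L1-regularised logistic regression predicting a binary label. The prices are simulated and the sign of the next return stands in for the Buy-Sell Index, so nothing here reflects the actual BSI construction.

```python
# Sketch: log-return windows -> summary features -> L1 logistic regression on a buy/sell label.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=2_000)))   # simulated price path
log_ret = np.diff(np.log(prices))                                   # log-differences

window = 20
rows, labels = [], []
for end in range(window, log_ret.size - 1):
    w = log_ret[end - window:end]
    rows.append([w.mean(), w.std(ddof=1), ((w - w.mean()) ** 3).mean(),  # mean, volatility, 3rd moment
                 np.corrcoef(w[:-1], w[1:])[0, 1]])                      # lag-1 autocorrelation
    labels.append(int(log_ret[end] > 0))              # stand-in for the Buy-Sell Index label

X, y = np.array(rows), np.array(labels)
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
print("non-zero coefficients:", np.flatnonzero(clf.coef_[0]))   # the L1 penalty keeps only some features
```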

  7. GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies

    PubMed Central

    Jia, Erik; Chen, Tianlu

    2018-01-01

    Left-censored missing values commonly exist in targeted metabolomics datasets and can be considered missing not at random (MNAR). Improper data processing procedures for missing values will have adverse impacts on subsequent statistical analyses. However, few imputation methods have been developed and applied to the situation of MNAR in the field of metabolomics. Thus, a practical left-censored missing value imputation method is urgently needed. We developed an iterative Gibbs sampler based left-censored missing value imputation approach (GSimp). We compared GSimp with three other imputation methods on two real-world targeted metabolomics datasets and one simulation dataset using our imputation evaluation pipeline. The results show that GSimp outperforms the other imputation methods in terms of imputation accuracy, observation distribution, univariate and multivariate analyses, and statistical sensitivity. Additionally, a parallel version of GSimp was developed for dealing with large-scale metabolomics datasets. The R code for GSimp, evaluation pipeline, tutorial, real-world and simulated targeted metabolomics datasets are available at: https://github.com/WandeRum/GSimp. PMID:29385130

  8. Ecological Momentary Assessments and Automated Time Series Analysis to Promote Tailored Health Care: A Proof-of-Principle Study

    PubMed Central

    Emerencia, Ando C; Bos, Elisabeth H; Rosmalen, Judith GM; Riese, Harriëtte; Aiello, Marco; Sytema, Sjoerd; de Jonge, Peter

    2015-01-01

    Background Health promotion can be tailored by combining ecological momentary assessments (EMA) with time series analysis. This combined method allows for studying the temporal order of dynamic relationships among variables, which may provide concrete indications for intervention. However, application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required. Objective This study aims to show how this limitation can be overcome by introducing automated vector autoregressive modeling (VAR) of EMA data and to evaluate its feasibility through comparisons with results of previously published manual analyses. Methods We developed a Web-based open source application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. The statistical technique we used was VAR. AutoVAR tests and evaluates all possible VAR models within a given combinatorial search space and summarizes their results, thereby replacing the researcher’s tasks of conducting the analysis, making an informed selection of models, and choosing the best model. We compared the output of AutoVAR to the output of a previously published manual analysis (n=4). Results An illustrative example consisting of 4 analyses was provided. Compared to the manual output, the AutoVAR output presents similar model characteristics and statistical results in terms of the Akaike information criterion, the Bayesian information criterion, and the test statistic of the Granger causality test. Conclusions Results suggest that automated analysis and interpretation of times series is feasible. Compared to a manual procedure, the automated procedure is more robust and can save days of time. These findings may pave the way for using time series analysis for health promotion on a larger scale. AutoVAR was evaluated using the results of a previously conducted manual analysis. Analysis of additional datasets is needed in order to validate and refine the application for general use. PMID:26254160
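
    The core of what AutoVAR automates, fitting a VAR to one person's EMA series, letting an information criterion choose the lag order, and running a Granger causality test, can be expressed with statsmodels as below. The two variables and their dynamics are simulated; this mimics the type of analysis, not the AutoVAR code itself.

```python
# Sketch: VAR on two simulated EMA variables, AIC-based lag selection, and a Granger test.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(9)
n = 90                                              # roughly 90 daily assessments
activity, mood = np.zeros(n), np.zeros(n)
for t in range(1, n):                               # mood depends on yesterday's activity
    activity[t] = 0.5 * activity[t - 1] + rng.normal()
    mood[t] = 0.4 * mood[t - 1] + 0.3 * activity[t - 1] + rng.normal()

data = pd.DataFrame({"activity": activity, "mood": mood})
results = VAR(data).fit(maxlags=5, ic="aic")        # AIC selects the lag order
print("selected lag order:", results.k_ar)

granger = results.test_causality("mood", ["activity"], kind="f")
print(granger.summary())                            # does activity Granger-cause mood?
```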

  9. [Statistics for statistics?--Thoughts about psychological tools].

    PubMed

    Berger, Uwe; Stöbel-Richter, Yve

    2007-12-01

    Statistical methods occupy a prominent place in psychologists' educational programmes. Known as difficult to understand and hard to learn, these contents are feared by students. Those who do not aspire to a research career at a university quickly forget the drilled material. Furthermore, because at first glance it does not apply to work with patients and other target groups, the methodological education as a whole has often been questioned. For many psychological practitioners, statistical education seems to make sense only insofar as it commands respect from other professions, namely physicians. For their own work, statistics is rarely taken seriously as a professional tool. The reason seems clear: statistics deals with numbers, while psychotherapy deals with subjects. So, is statistics an end in itself? With this article, we try to answer the question of whether and how statistical methods are represented within psychotherapeutic and psychological research. We therefore analyzed 46 original articles from a complete volume of the journal Psychotherapy, Psychosomatics, Psychological Medicine (PPmP). Within the volume, 28 different analysis methods were applied, of which 89 per cent were directly based upon statistics. Being able to write and critically read original articles, as the backbone of research, presupposes a high degree of statistical education. To ignore statistics means to ignore research and, ultimately, to expose one's own professional work to arbitrariness.

  10. Strengthen forensic entomology in court--the need for data exploration and the validation of a generalised additive mixed model.

    PubMed

    Baqué, Michèle; Amendt, Jens

    2013-01-01

    Developmental data of juvenile blow flies (Diptera: Calliphoridae) are typically used to calculate the age of immature stages found on or around a corpse and thus to estimate a minimum post-mortem interval (PMI(min)). However, many of those data sets do not take into account that immature blow flies grow in a non-linear fashion. Linear models do not provide sufficiently reliable age estimates and may even lead to an erroneous determination of the PMI(min). In view of the Daubert standard and the need for improvements in forensic science, new statistical tools such as smoothing methods and mixed models allow the modelling of non-linear relationships and expand the field of statistical analyses. The present study introduces the background and application of these statistical techniques by analysing a model that describes the development of the forensically important blow fly Calliphora vicina at different temperatures. The comparison of three statistical methods (linear regression, generalised additive modelling and generalised additive mixed modelling) clearly demonstrates that only the latter provided regression parameters that reflect the data adequately. We focus explicitly both on the exploration of the data, to assure their quality and to show the importance of checking them carefully prior to conducting the statistical tests, and on the validation of the resulting models. Hence, we present a common method for evaluating and testing forensic entomological data sets by using, for the first time, generalised additive mixed models.

  11. Separate-channel analysis of two-channel microarrays: recovering inter-spot information.

    PubMed

    Smyth, Gordon K; Altman, Naomi S

    2013-05-26

    Two-channel (or two-color) microarrays are cost-effective platforms for comparative analysis of gene expression. They are traditionally analysed in terms of the log-ratios (M-values) of the two channel intensities at each spot, but this analysis does not use all the information available in the separate channel observations. Mixed models have been proposed to analyse intensities from the two channels as separate observations, but such models can be complex to use and the gain in efficiency over the log-ratio analysis is difficult to quantify. Mixed models yield test statistics for which the null distributions can be specified only approximately, and some approaches do not borrow strength between genes. This article reformulates the mixed model to clarify the relationship with the traditional log-ratio analysis, to facilitate information borrowing between genes, and to obtain an exact distributional theory for the resulting test statistics. The mixed model is transformed to operate on the M-values and A-values (average log-expression for each spot) instead of on the log-expression values. The log-ratio analysis is shown to ignore information contained in the A-values. The relative efficiency of the log-ratio analysis is shown to depend on the size of the intraspot correlation. A new separate channel analysis method is proposed that assumes a constant intra-spot correlation coefficient across all genes. This approach permits the mixed model to be transformed into an ordinary linear model, allowing the data analysis to use a well-understood empirical Bayes analysis pipeline for linear modeling of microarray data. This yields statistically powerful test statistics that have an exact distributional theory. The log-ratio, mixed model and common correlation methods are compared using three case studies. The results show that separate channel analyses that borrow strength between genes are more powerful than log-ratio analyses. The common correlation analysis is the most powerful of all. The common correlation method proposed in this article for separate-channel analysis of two-channel microarray data is no more difficult to apply in practice than the traditional log-ratio analysis. It provides an intuitive and powerful means to conduct analyses and make comparisons that might otherwise not be possible.
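
    The transformation at the heart of the reformulated model, working with the M-value and A-value of each spot rather than the two log-intensities, is a one-liner per spot. The red/green intensities below are simulated lognormal values, used only to show the transformation and the intra-spot correlation that the common-correlation method assumes to be constant across genes.

```python
# Sketch: M-values and A-values from simulated two-channel (red/green) spot intensities.
import numpy as np

rng = np.random.default_rng(10)
green = rng.lognormal(mean=7.0, sigma=1.0, size=1_000)            # channel 1
red = green * rng.lognormal(mean=0.1, sigma=0.3, size=1_000)      # channel 2, mild fold changes

m_values = np.log2(red) - np.log2(green)           # within-spot log-ratio (kept by the log-ratio analysis)
a_values = 0.5 * (np.log2(red) + np.log2(green))   # within-spot average log-intensity (discarded by it)

print(f"mean M = {m_values.mean():.2f}, mean A = {a_values.mean():.2f}")
print(f"intra-spot correlation of log-channels = {np.corrcoef(np.log2(red), np.log2(green))[0, 1]:.2f}")
```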

  12. How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?

    PubMed

    West, Brady T; Sakshaug, Joseph W; Aurelien, Guy Alain S

    2016-01-01

    Secondary analyses of survey data collected from large probability samples of persons or establishments further scientific progress in many fields. The complex design features of these samples improve data collection efficiency, but also require analysts to account for these features when conducting analysis. Unfortunately, many secondary analysts from fields outside of statistics, biostatistics, and survey methodology do not have adequate training in this area, and as a result may apply incorrect statistical methods when analyzing these survey data sets. This in turn could lead to the publication of incorrect inferences based on the survey data that effectively negate the resources dedicated to these surveys. In this article, we build on the results of a preliminary meta-analysis of 100 peer-reviewed journal articles presenting analyses of data from a variety of national health surveys, which suggested that analytic errors may be extremely prevalent in these types of investigations. We first perform a meta-analysis of a stratified random sample of 145 additional research products analyzing survey data from the Scientists and Engineers Statistical Data System (SESTAT), which describes features of the U.S. Science and Engineering workforce, and examine trends in the prevalence of analytic error across the decades used to stratify the sample. We once again find that analytic errors appear to be quite prevalent in these studies. Next, we present several example analyses of real SESTAT data, and demonstrate that a failure to perform these analyses correctly can result in substantially biased estimates with standard errors that do not adequately reflect complex sample design features. Collectively, the results of this investigation suggest that reviewers of this type of research need to pay much closer attention to the analytic methods employed by researchers attempting to publish or present secondary analyses of survey data.
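
    A hedged illustration of the kind of analytic error discussed above: the Python sketch below uses simulated data (not SESTAT) with unequal selection weights and cluster sampling, and contrasts a naive iid analysis with a weighted estimate and a crude cluster-based standard error.

      # Ignoring weights and clustering vs a simple design-aware estimate (toy data).
      import numpy as np

      rng = np.random.default_rng(42)
      n_clusters, per_cluster = 50, 40
      cluster_effect = rng.normal(0, 1.0, n_clusters)
      y = np.concatenate([rng.normal(5 + c, 1.0, per_cluster) for c in cluster_effect])
      w = rng.uniform(0.5, 4.0, y.size)            # unequal selection weights (assumed)
      cluster_id = np.repeat(np.arange(n_clusters), per_cluster)

      naive_mean = y.mean()
      naive_se = y.std(ddof=1) / np.sqrt(y.size)   # treats observations as an iid simple random sample

      weighted_mean = np.sum(w * y) / np.sum(w)
      # Crude ultimate-cluster (linearisation-style) variance estimate for the weighted mean
      zc = np.array([np.sum(w[cluster_id == c] * (y[cluster_id == c] - weighted_mean))
                     for c in range(n_clusters)]) / np.sum(w)
      cluster_se = np.sqrt(n_clusters / (n_clusters - 1) * np.sum(zc**2))

      print(f"naive:    {naive_mean:.2f} (SE {naive_se:.3f})")
      print(f"weighted: {weighted_mean:.2f} (SE {cluster_se:.3f})  # design-aware SE is typically much larger")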

  13. How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?

    PubMed Central

    West, Brady T.; Sakshaug, Joseph W.; Aurelien, Guy Alain S.

    2016-01-01

    Secondary analyses of survey data collected from large probability samples of persons or establishments further scientific progress in many fields. The complex design features of these samples improve data collection efficiency, but also require analysts to account for these features when conducting analysis. Unfortunately, many secondary analysts from fields outside of statistics, biostatistics, and survey methodology do not have adequate training in this area, and as a result may apply incorrect statistical methods when analyzing these survey data sets. This in turn could lead to the publication of incorrect inferences based on the survey data that effectively negate the resources dedicated to these surveys. In this article, we build on the results of a preliminary meta-analysis of 100 peer-reviewed journal articles presenting analyses of data from a variety of national health surveys, which suggested that analytic errors may be extremely prevalent in these types of investigations. We first perform a meta-analysis of a stratified random sample of 145 additional research products analyzing survey data from the Scientists and Engineers Statistical Data System (SESTAT), which describes features of the U.S. Science and Engineering workforce, and examine trends in the prevalence of analytic error across the decades used to stratify the sample. We once again find that analytic errors appear to be quite prevalent in these studies. Next, we present several example analyses of real SESTAT data, and demonstrate that a failure to perform these analyses correctly can result in substantially biased estimates with standard errors that do not adequately reflect complex sample design features. Collectively, the results of this investigation suggest that reviewers of this type of research need to pay much closer attention to the analytic methods employed by researchers attempting to publish or present secondary analyses of survey data. PMID:27355817

  14. Systems and methods for detection of blowout precursors in combustors

    DOEpatents

    Lieuwen, Tim C.; Nair, Suraj

    2006-08-15

    The present invention comprises systems and methods for detecting flame blowout precursors in combustors. The blowout precursor detection system comprises a combustor, a pressure measuring device, and a blowout precursor detection unit. A combustion controller may also be used to control combustor parameters. The methods of the present invention comprise receiving pressure data measured by an acoustic pressure measuring device, performing one or a combination of spectral analysis, statistical analysis, and wavelet analysis on the received pressure data, and determining the existence of a blowout precursor based on such analyses. The spectral analysis, statistical analysis, and wavelet analysis further comprise their respective sub-methods to determine the existence of blowout precursors.
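
    As a loose illustration only (not the patented algorithm), the sketch below computes two precursor-style statistics from a simulated pressure trace: the fraction of spectral power in a low-frequency band of the FFT, and the kurtosis of the pressure fluctuations. The signal model, band limits and sampling rate are made up.

      import numpy as np
      from scipy.stats import kurtosis

      fs = 10_000.0                                   # sampling rate, Hz (assumed)
      t = np.arange(0, 1.0, 1 / fs)
      rng = np.random.default_rng(7)
      # Toy model: stable narrowband tone plus occasional low-frequency bursts near blowout
      pressure = np.sin(2 * np.pi * 210 * t) + 0.4 * rng.normal(size=t.size)
      pressure += 0.8 * np.sin(2 * np.pi * 15 * t) * (rng.random(t.size) < 0.05)

      spectrum = np.abs(np.fft.rfft(pressure - pressure.mean()))**2
      freqs = np.fft.rfftfreq(t.size, 1 / fs)
      low_band_power = spectrum[(freqs > 5) & (freqs < 50)].sum() / spectrum.sum()

      print(f"low-band power fraction: {low_band_power:.3f}")
      print(f"kurtosis of fluctuations: {kurtosis(pressure):.2f}")
      # In a detection unit, such statistics would be tracked over time and compared
      # against empirically chosen thresholds.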

  15. Comparison of Time-to-First Event and Recurrent Event Methods in Randomized Clinical Trials.

    PubMed

    Claggett, Brian; Pocock, Stuart; Wei, L J; Pfeffer, Marc A; McMurray, John J V; Solomon, Scott D

    2018-03-27

    Background: Most Phase 3 trials feature time-to-first-event endpoints for their primary and/or secondary analyses. In chronic diseases where a clinical event can occur more than once, recurrent-event methods have been proposed to more fully capture disease burden and have been assumed to improve statistical precision and power compared to conventional "time-to-first" methods. Methods: To better characterize factors that influence the statistical properties of recurrent-event and time-to-first methods in the evaluation of randomized therapy, we repeatedly simulated trials with 1:1 randomization of 4000 patients to active vs control therapy, with a true patient-level risk reduction of 20% (i.e. RR=0.80). For patients who discontinued active therapy after a first event, we assumed their risk reverted subsequently to their original placebo-level risk. Through simulation, we varied a) the degree of between-patient heterogeneity of risk and b) the extent of treatment discontinuation. Findings were compared with those from actual randomized clinical trials. Results: As the degree of between-patient heterogeneity of risk was increased, both time-to-first and recurrent-event methods lost statistical power to detect a true risk reduction and confidence intervals widened. The recurrent-event analyses continued to estimate the true RR=0.80 as heterogeneity increased, while the Cox model produced estimates that were attenuated. The power of recurrent-event methods declined as the rate of study drug discontinuation post-event increased. Recurrent-event methods provided greater power than time-to-first methods in scenarios where drug discontinuation was ≤30% following a first event, lesser power with drug discontinuation rates of ≥60%, and comparable power otherwise. We confirmed in several actual trials in chronic heart failure that treatment effect estimates were attenuated when estimated via the Cox model and that the increased statistical power from recurrent-event methods was most pronounced in trials with lower treatment discontinuation rates. Conclusions: We find that the statistical power of both recurrent-event and time-to-first methods is reduced by increasing heterogeneity of patient risk, a parameter not included in conventional power and sample size formulas. Data from real clinical trials are consistent with the simulation studies, confirming that the greatest statistical gains from use of recurrent-event methods occur in the presence of high patient heterogeneity and low rates of study drug discontinuation.
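
    The simulation idea can be reproduced in miniature. The Python sketch below is a simplification, not the authors' code: a gamma frailty supplies between-patient heterogeneity, and crude count-based and first-event-proportion estimators stand in for the negative binomial and Cox models.

      import numpy as np

      rng = np.random.default_rng(3)
      n = 4000
      frailty_var = 1.0                                    # between-patient heterogeneity (varied in the paper)
      frailty = rng.gamma(1 / frailty_var, frailty_var, n)
      treat = rng.integers(0, 2, n)
      rate = 0.5 * frailty * np.where(treat == 1, 0.8, 1.0)   # expected events per patient over follow-up

      events = rng.poisson(rate)

      # Recurrent-events style estimate: ratio of mean event counts
      rr_recurrent = events[treat == 1].mean() / events[treat == 0].mean()
      # Time-to-first style proxy: ratio of proportions with at least one event
      rr_first = (events[treat == 1] > 0).mean() / (events[treat == 0] > 0).mean()

      print(f"count-based RR:       {rr_recurrent:.3f}   (targets the true 0.80)")
      print(f"first-event-based RR: {rr_first:.3f}   (typically closer to 1 under heterogeneity)")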

  16. Family Early Literacy Practices Questionnaire: A Validation Study for a Spanish-Speaking Population

    ERIC Educational Resources Information Center

    Lewis, Kandia

    2012-01-01

    The purpose of the current study was to evaluate the psychometric validity of a Spanish translated version of a family involvement questionnaire (the FELP) using a mixed-methods design. Thus, statistical analyses (i.e., factor analysis, reliability analysis, and item analysis) and qualitative analyses (i.e., focus group data) were assessed.…

  17. Evaluation of maintenance/rehabilitation alternatives for continuously reinforced concrete pavement

    NASA Astrophysics Data System (ADS)

    Barnett, T. L.; Darter, M. I.; Laybourne, N. R.

    1981-05-01

    The design, construction, performance, and costs of several maintenance and rehabilitation methods were evaluated. Patching, cement grout and asphalt undersealing, epoxying of cracks, and an asphalt overlay were considered. Nondestructive testing, deflections, reflection cracking, cost, and statistical analyses were used to evaluate the methods.

  18. Regression methods for spatially correlated data: an example using beetle attacks in a seed orchard

    Treesearch

    Preisler Haiganoush; Nancy G. Rappaport; David L. Wood

    1997-01-01

    We present a statistical procedure for studying the simultaneous effects of observed covariates and unmeasured spatial variables on responses of interest. The procedure uses regression type analyses that can be used with existing statistical software packages. An example using the rate of twig beetle attacks on Douglas-fir trees in a seed orchard illustrates the...

  19. Identification of natural images and computer-generated graphics based on statistical and textural features.

    PubMed

    Peng, Fei; Li, Jiao-ting; Long, Min

    2015-03-01

    To discriminate the acquisition pipelines of digital images, a novel scheme for the identification of natural images and computer-generated graphics is proposed based on statistical and textural features. First, the differences between them are investigated from the viewpoints of statistics and texture, and 31 feature dimensions are extracted for identification. Then, LIBSVM is used for the classification. Finally, the experimental results are presented. The results show that the scheme can achieve an identification accuracy of 97.89% for computer-generated graphics, and an identification accuracy of 97.75% for natural images. The analyses also demonstrate that the proposed method has excellent performance compared with some existing methods based only on statistical features or other features. The method has great potential to be implemented for the identification of natural images and computer-generated graphics. © 2014 American Academy of Forensic Sciences.
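
    The classification step can be sketched as follows. Scikit-learn's SVC stands in for LIBSVM, and random numbers stand in for the 31 statistical/textural features, which in the actual method are extracted from the images; everything here is placeholder data.

      import numpy as np
      from sklearn.svm import SVC
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import accuracy_score

      rng = np.random.default_rng(0)
      X_natural = rng.normal(0.0, 1.0, (500, 31))        # placeholder features, natural images
      X_cg      = rng.normal(0.7, 1.0, (500, 31))        # placeholder features, computer graphics
      X = np.vstack([X_natural, X_cg])
      y = np.array([0] * 500 + [1] * 500)

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)
      clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)
      print(f"hold-out accuracy: {accuracy_score(y_te, clf.predict(X_te)):.3f}")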

  20. Statistical technique for analysing functional connectivity of multiple spike trains.

    PubMed

    Masud, Mohammad Shahed; Borisyuk, Roman

    2011-03-15

    A new statistical technique, the Cox method, for analysing the functional connectivity of simultaneously recorded multiple spike trains is presented. This method is based on the theory of modulated renewal processes and it estimates a vector of influence strengths from multiple spike trains (called reference trains) to the selected (target) spike train. Selecting another target spike train and repeating the calculation of the influence strengths from the reference spike trains enables researchers to find all functional connections among multiple spike trains. In order to study functional connectivity, an "influence function" is identified. This function recognises the specificity of neuronal interactions and reflects the dynamics of the postsynaptic potential. In comparison to existing techniques, the Cox method has the following advantages: it does not use bins (it is a binless method); it is applicable to cases where the sample size is small; it is sufficiently sensitive to estimate weak influences; it supports the simultaneous analysis of multiple influences; and it is able to identify a correct connectivity scheme in difficult cases of "common source" or "indirect" connectivity. The Cox method has been thoroughly tested using multiple sets of data generated by a neural network model of leaky integrate-and-fire neurons with a prescribed architecture of connections. The results suggest that this method is highly successful for analysing the functional connectivity of simultaneously recorded multiple spike trains. Copyright © 2011 Elsevier B.V. All rights reserved.

  1. ARC Researchers at ASME 2015 Internal Combustion Engine Division Fall

    Science.gov Websites

    The focus of this paper is on the various methods of computing CA50 for analysing and classifying cycle-to-cycle variability in an SI engine, the assumptions made to establish fast and possibly on-line methods, and how the various fast methods for computing CA50 feed the two statistical methods.

  2. Fast and accurate imputation of summary statistics enhances evidence of functional enrichment

    PubMed Central

    Pasaniuc, Bogdan; Zaitlen, Noah; Shi, Huwenbo; Bhatia, Gaurav; Gusev, Alexander; Pickrell, Joseph; Hirschhorn, Joel; Strachan, David P.; Patterson, Nick; Price, Alkes L.

    2014-01-01

    Motivation: Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov models (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. Results: In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1–5%) variants [increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples] versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case–control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of χ2 association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Availability and implementation: Publicly available software package available at http://bogdan.bioinformatics.ucla.edu/software/. Contact: bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu Supplementary information: Supplementary materials are available at Bioinformatics online. PMID:24990607
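
    The core conditional-Gaussian computation behind this kind of summary-statistic imputation can be written in a few lines. The sketch below uses a toy LD matrix and a simple ridge term in place of the authors' reference-panel sample-size adjustment; it is not their released software, and all values are invented.

      # Impute z-scores at unobserved SNPs from observed z-scores via E[z_u | z_o] under
      # a multivariate normal model with reference-panel LD.
      import numpy as np

      # LD (correlation) matrix for 5 SNPs from a reference panel; SNPs 0, 2, 4 observed, 1, 3 not.
      R = np.array([[1.00, 0.80, 0.30, 0.10, 0.05],
                    [0.80, 1.00, 0.40, 0.15, 0.05],
                    [0.30, 0.40, 1.00, 0.60, 0.20],
                    [0.10, 0.15, 0.60, 1.00, 0.70],
                    [0.05, 0.05, 0.20, 0.70, 1.00]])
      obs, unobs = [0, 2, 4], [1, 3]
      z_obs = np.array([4.1, 2.5, 0.3])

      lam = 0.1                                     # ridge regularisation (assumed value)
      R_oo = R[np.ix_(obs, obs)] + lam * np.eye(len(obs))
      R_uo = R[np.ix_(unobs, obs)]

      z_imputed = R_uo @ np.linalg.solve(R_oo, z_obs)          # conditional mean of the unobserved z-scores
      info = np.diag(R_uo @ np.linalg.solve(R_oo, R_uo.T))     # rough per-SNP imputation quality
      print("imputed z:", np.round(z_imputed, 2))
      print("info:     ", np.round(info, 2))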

  3. Time Series Expression Analyses Using RNA-seq: A Statistical Approach

    PubMed Central

    Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P.

    2013-01-01

    RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis. PMID:23586021

  4. Time series expression analyses using RNA-seq: a statistical approach.

    PubMed

    Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P

    2013-01-01

    RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis.

  5. Characteristics of genomic signatures derived using univariate methods and mechanistically anchored functional descriptors for predicting drug- and xenobiotic-induced nephrotoxicity.

    PubMed

    Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J

    2008-01-01

    The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, Hotelling T-square test, and, finally, out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps (sets of genes coordinately involved in key biological processes) with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.

  6. Statistical Methods Applied to Gamma-ray Spectroscopy Algorithms in Nuclear Security Missions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fagan, Deborah K.; Robinson, Sean M.; Runkle, Robert C.

    2012-10-01

    In a wide range of nuclear security missions, gamma-ray spectroscopy is a critical research and development priority. One particularly relevant challenge is the interdiction of special nuclear material, for which gamma-ray spectroscopy supports the goals of detecting and identifying gamma-ray sources. This manuscript examines the existing set of spectroscopy methods, attempts to categorize them by the statistical methods on which they rely, and identifies methods that have yet to be considered. Our examination shows that current methods effectively estimate the effect of counting uncertainty but in many cases do not address larger sources of decision uncertainty, ones that are significantly more complex. We thus explore the premise that significantly improving algorithm performance requires greater coupling between the problem physics that drives data acquisition and the statistical methods that analyze such data. Untapped statistical methods, such as Bayesian model averaging and hierarchical and empirical Bayes methods, have the potential to reduce decision uncertainty by more rigorously and comprehensively incorporating all sources of uncertainty. We expect that application of such methods will demonstrate progress in meeting the needs of nuclear security missions by improving on the existing numerical infrastructure for which these analyses have not been conducted.

  7. Logistic regression applied to natural hazards: rare event logistic regression with replications

    NASA Astrophysics Data System (ADS)

    Guns, M.; Vanacker, V.

    2012-06-01

    Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations into rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strengths of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. The technique was developed here for the analysis of landslide controlling factors, but the concept is widely applicable to statistical analyses of natural hazards.
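
    A conceptual sketch of the replication idea is given below: an ordinary logistic model is refit on many resampled datasets and each covariate is scored by how often it remains significant. The data, covariate names and cut-offs are hypothetical; this is not the authors' implementation.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(11)
      n, n_rep = 2000, 200
      X = rng.normal(size=(n, 3))                                  # e.g. slope, wetness, road distance (hypothetical)
      logit = -4.0 + 1.2 * X[:, 0] + 0.8 * X[:, 1] + 0.0 * X[:, 2] # low intercept: rare events
      y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

      sig_counts = np.zeros(3)
      for _ in range(n_rep):
          idx = rng.choice(n, n, replace=True)                     # bootstrap replication
          res = sm.Logit(y[idx], sm.add_constant(X[idx])).fit(disp=0)
          sig_counts += (res.pvalues[1:] < 0.05).astype(float)

      print("fraction of replications significant:", np.round(sig_counts / n_rep, 2))
      # Covariates significant in only a minority of replications are flagged as
      # sample-dependent rather than robust controlling factors.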

  8. CADDIS Volume 4. Data Analysis: Selecting an Analysis Approach

    EPA Pesticide Factsheets

    An approach for selecting statistical analyses to inform causal analysis. Describes methods for determining whether test site conditions differ from reference expectations. Describes an approach for estimating stressor-response relationships.

  9. Antimicrobial susceptibility of Escherichia coli F4, Pasteurella multocida, and Streptococcus suis isolates from a diagnostic veterinary laboratory and recommendations for a surveillance system

    PubMed Central

    Glass-Kaastra, Shiona K.; Pearl, David L.; Reid-Smith, Richard J.; McEwen, Beverly; Slavic, Durda; McEwen, Scott A.; Fairles, Jim

    2014-01-01

    Antimicrobial susceptibility data on Escherichia coli F4, Pasteurella multocida, and Streptococcus suis isolates from Ontario swine (January 1998 to October 2010) were acquired from a comprehensive diagnostic veterinary laboratory in Ontario, Canada. In relation to the possible development of a surveillance system for antimicrobial resistance, data were assessed for ease of management, completeness, consistency, and applicability for temporal and spatial statistical analyses. Limited farm location data precluded spatial analyses and missing demographic data limited their use as predictors within multivariable statistical models. Changes in the standard panel of antimicrobials used for susceptibility testing reduced the number of antimicrobials available for temporal analyses. Data consistency and quality could improve over time in this and similar diagnostic laboratory settings by encouraging complete reporting with sample submission and by modifying database systems to limit free-text data entry. These changes could make more statistical methods available for disease surveillance and cluster detection. PMID:24688133

  10. Antimicrobial susceptibility of Escherichia coli F4, Pasteurella multocida, and Streptococcus suis isolates from a diagnostic veterinary laboratory and recommendations for a surveillance system.

    PubMed

    Glass-Kaastra, Shiona K; Pearl, David L; Reid-Smith, Richard J; McEwen, Beverly; Slavic, Durda; McEwen, Scott A; Fairles, Jim

    2014-04-01

    Antimicrobial susceptibility data on Escherichia coli F4, Pasteurella multocida, and Streptococcus suis isolates from Ontario swine (January 1998 to October 2010) were acquired from a comprehensive diagnostic veterinary laboratory in Ontario, Canada. In relation to the possible development of a surveillance system for antimicrobial resistance, data were assessed for ease of management, completeness, consistency, and applicability for temporal and spatial statistical analyses. Limited farm location data precluded spatial analyses and missing demographic data limited their use as predictors within multivariable statistical models. Changes in the standard panel of antimicrobials used for susceptibility testing reduced the number of antimicrobials available for temporal analyses. Data consistency and quality could improve over time in this and similar diagnostic laboratory settings by encouraging complete reporting with sample submission and by modifying database systems to limit free-text data entry. These changes could make more statistical methods available for disease surveillance and cluster detection.

  11. Simultaneous assessment of phase chemistry, phase abundance and bulk chemistry with statistical electron probe micro-analyses: Application to cement clinkers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wilson, William; Krakowiak, Konrad J.; Ulm, Franz-Josef, E-mail: ulm@mit.edu

    2014-01-15

    According to recent developments in cement clinker engineering, the optimization of chemical substitutions in the main clinker phases offers a promising approach to improve both reactivity and grindability of clinkers. Thus, monitoring the chemistry of the phases may become part of the quality control at the cement plants, along with the usual measurements of the abundance of the mineralogical phases (quantitative X-ray diffraction) and the bulk chemistry (X-ray fluorescence). This paper presents a new method to assess these three complementary quantities with a single experiment. The method is based on electron microprobe spot analyses, performed over a grid located on a representative surface of the sample and interpreted with advanced statistical tools. This paper describes the method and the experimental program performed on industrial clinkers to establish the accuracy in comparison to conventional methods. Highlights: a new method of clinker characterization; combination of the electron probe technique with cluster analysis; simultaneous assessment of phase abundance, composition and bulk chemistry; experimental validation performed on industrial clinkers.
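
    The statistical step can be caricatured with k-means on toy two-oxide compositions: cluster fractions give phase abundance, cluster means give phase chemistry, and the grand mean gives bulk chemistry. The compositions below are invented, and k-means merely stands in for the clustering procedure used in the paper.

      import numpy as np
      from sklearn.cluster import KMeans

      rng = np.random.default_rng(5)
      # Toy spot analyses: wt% CaO and SiO2 for two synthetic "phases" (hypothetical values)
      phase_a = rng.normal([71, 25], 1.0, (300, 2))      # alite-like
      phase_b = rng.normal([65, 32], 1.0, (120, 2))      # belite-like
      spots = np.vstack([phase_a, phase_b])

      km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(spots)
      for k in range(2):
          members = spots[km.labels_ == k]
          print(f"phase {k}: abundance {members.shape[0] / spots.shape[0]:.2f}, "
                f"mean (CaO, SiO2) = {members.mean(axis=0).round(1)}")
      print("bulk chemistry (CaO, SiO2):", spots.mean(axis=0).round(1))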

  12. Statistical methods for the beta-binomial model in teratology.

    PubMed Central

    Yamamoto, E; Yanagimoto, T

    1994-01-01

    The beta-binomial model is widely used for analyzing teratological data involving littermates. Recent developments in statistical analyses of teratological data are briefly reviewed with emphasis on the model. For statistical inference of the parameters in the beta-binomial distribution, separation of the likelihood leads to reduced biases of the estimators and to improved accuracy of the empirical significance levels of tests. Separate inference of the parameters can be conducted in a unified way. PMID:8187716

  13. Group Influences on Young Adult Warfighters’ Risk Taking

    DTIC Science & Technology

    2016-12-01

    Statistical Analysis: Latent linear growth models were fitted using the maximum likelihood estimation method in Mplus (version 7.0; Muthen & Muthen)... condition had a higher net score than those in the alone condition (b = 20.53, SE = 6.29, p < .001). Results of the relevant statistical analyses are... Model fit statistics (BIC and chi-square with degrees of freedom) are reported for the three fitted models.

  14. Mass univariate analysis of event-related brain potentials/fields I: a critical tutorial review.

    PubMed

    Groppe, David M; Urbach, Thomas P; Kutas, Marta

    2011-12-01

    Event-related potentials (ERPs) and magnetic fields (ERFs) are typically analyzed via ANOVAs on mean activity in a priori windows. Advances in computing power and statistics have produced an alternative, mass univariate analyses consisting of thousands of statistical tests and powerful corrections for multiple comparisons. Such analyses are most useful when one has little a priori knowledge of effect locations or latencies, and for delineating effect boundaries. Mass univariate analyses complement and, at times, obviate traditional analyses. Here we review this approach as applied to ERP/ERF data and four methods for multiple comparison correction: strong control of the familywise error rate (FWER) via permutation tests, weak control of FWER via cluster-based permutation tests, false discovery rate control, and control of the generalized FWER. We end with recommendations for their use and introduce free MATLAB software for their implementation. Copyright © 2011 Society for Psychophysiological Research.
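
    One of the corrections reviewed, strong control of the FWER via a max-statistic permutation test (here implemented as a sign-flip test), can be sketched as follows on simulated single-channel data. The data, effect size and number of permutations are arbitrary, and the free MATLAB toolbox mentioned in the article is not used.

      import numpy as np

      rng = np.random.default_rng(2)
      n_subj, n_time = 20, 300
      data = rng.normal(0, 1, (n_subj, n_time))          # subject-by-timepoint difference waves
      data[:, 120:160] += 1.5                             # a true effect in one latency window

      def tvals(x):
          return x.mean(0) / (x.std(0, ddof=1) / np.sqrt(x.shape[0]))

      t_obs = tvals(data)
      max_null = np.empty(2000)
      for i in range(max_null.size):
          signs = rng.choice([-1, 1], size=(n_subj, 1))   # sign-flip permutation under H0
          max_null[i] = np.abs(tvals(data * signs)).max()

      t_crit = np.quantile(max_null, 0.95)                # FWER-controlling critical value
      sig = np.where(np.abs(t_obs) > t_crit)[0]
      if sig.size:
          print(f"critical |t| = {t_crit:.2f}; significant time points: {sig.min()}-{sig.max()}")
      else:
          print(f"critical |t| = {t_crit:.2f}; no time point exceeds it")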

  15. Statistical methods for analysing responses of wildlife to human disturbance.

    Treesearch

    Haiganoush K. Preisler; Alan A. Ager; Michael J. Wisdom

    2006-01-01

    1. Off-road recreation is increasing rapidly in many areas of the world, and effects on wildlife can be highly detrimental. Consequently, we have developed methods for studying wildlife responses to off-road recreation with the use of new technologies that allow frequent and accurate monitoring of human-wildlife interactions. To illustrate these methods, we studied the...

  16. A Mixed-Methods Study Investigating the Relationship between Media Multitasking Orientation and Grade Point Average

    ERIC Educational Resources Information Center

    Lee, Jennifer

    2012-01-01

    The intent of this study was to examine the relationship between media multitasking orientation and grade point average. The study utilized a mixed-methods approach to investigate the research questions. In the quantitative section of the study, the primary method of statistical analyses was multiple regression. The independent variables for the…

  17. Assessment of statistical methods used in library-based approaches to microbial source tracking.

    PubMed

    Ritter, Kerry J; Carruthers, Ethan; Carson, C Andrew; Ellender, R D; Harwood, Valerie J; Kingsley, Kyle; Nakatsu, Cindy; Sadowsky, Michael; Shear, Brian; West, Brian; Whitlock, John E; Wiggins, Bruce A; Wilbur, Jayson D

    2003-12-01

    Several commonly used statistical methods for fingerprint identification in microbial source tracking (MST) were examined to assess the effectiveness of pattern-matching algorithms to correctly identify sources. Although numerous statistical methods have been employed for source identification, no widespread consensus exists as to which is most appropriate. A large-scale comparison of several MST methods, using identical fecal sources, presented a unique opportunity to assess the utility of several popular statistical methods. These included discriminant analysis, nearest neighbour analysis, maximum similarity and average similarity, along with several measures of distance or similarity. Threshold criteria for excluding uncertain or poorly matched isolates from final analysis were also examined for their ability to reduce false positives and increase prediction success. Six independent libraries used in the study were constructed from indicator bacteria isolated from fecal materials of humans, seagulls, cows and dogs. Three of these libraries were constructed using the rep-PCR technique and three relied on antibiotic resistance analysis (ARA). Five of the libraries were constructed using Escherichia coli and one using Enterococcus spp. (ARA). Overall, the outcome of this study suggests a high degree of variability across statistical methods. Despite large differences in correct classification rates among the statistical methods, no single statistical approach emerged as superior. Thresholds failed to consistently increase rates of correct classification and improvement was often associated with substantial effective sample size reduction. Recommendations are provided to aid in selecting appropriate analyses for these types of data.

  18. Confounding in statistical mediation analysis: What it is and how to address it.

    PubMed

    Valente, Matthew J; Pelham, William E; Smyth, Heather; MacKinnon, David P

    2017-11-01

    Psychology researchers are often interested in mechanisms underlying how randomized interventions affect outcomes such as substance use and mental health. Mediation analysis is a common statistical method for investigating psychological mechanisms that has benefited from exciting new methodological improvements over the last 2 decades. One of the most important new developments is methodology for estimating causal mediated effects using the potential outcomes framework for causal inference. Potential outcomes-based methods developed in epidemiology and statistics have important implications for understanding psychological mechanisms. We aim to provide a concise introduction to and illustration of these new methods and emphasize the importance of confounder adjustment. First, we review the traditional regression approach for estimating mediated effects. Second, we describe the potential outcomes framework. Third, we define what a confounder is and how the presence of a confounder can provide misleading evidence regarding mechanisms of interventions. Fourth, we describe experimental designs that can help rule out confounder bias. Fifth, we describe new statistical approaches to adjust for measured confounders of the mediator-outcome relation and sensitivity analyses to probe effects of unmeasured confounders on the mediated effect. All approaches are illustrated with application to a real counseling intervention dataset. Counseling psychologists interested in understanding the causal mechanisms of their interventions can benefit from incorporating the most up-to-date techniques into their mediation analyses. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
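
    The central point about mediator-outcome confounding can be shown with a toy simulation (not the counseling dataset used in the article): the product-of-coefficients estimate is computed with and without adjustment for a simulated confounder. Variable names and effect sizes are hypothetical.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(8)
      n = 1000
      x = rng.integers(0, 2, n)                     # randomized intervention
      u = rng.normal(size=n)                        # confounder of the mediator-outcome relation
      m = 0.5 * x + 0.8 * u + rng.normal(size=n)    # mediator
      y = 0.4 * m + 0.8 * u + rng.normal(size=n)    # outcome; true mediated effect a*b = 0.5 * 0.4 = 0.20

      a = sm.OLS(m, sm.add_constant(x)).fit().params[1]
      b_unadj = sm.OLS(y, sm.add_constant(np.column_stack([m, x]))).fit().params[1]
      b_adj   = sm.OLS(y, sm.add_constant(np.column_stack([m, x, u]))).fit().params[1]

      print(f"a*b ignoring the confounder:  {a * b_unadj:.2f}   (biased upward here)")
      print(f"a*b adjusting for it:         {a * b_adj:.2f}   (close to the true 0.20)")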

  19. Content and Citation Analyses of "Public Relations Review."

    ERIC Educational Resources Information Center

    Morton, Linda P.; Lin, Li-Yun

    1995-01-01

    Analyzes 161 cited and 177 uncited articles published in "Public Relations Review" (1975-93) to determine if 3 independent variables--research methods, type of statistics, and topics--influenced whether or not articles were cited in other research articles. Finds significant differences between quantitative and qualitative research methods but not…

  20. Pooling sexes when assessing ground reaction forces during walking: Statistical Parametric Mapping versus traditional approach.

    PubMed

    Castro, Marcelo P; Pataky, Todd C; Sole, Gisela; Vilas-Boas, Joao Paulo

    2015-07-16

    Ground reaction force (GRF) data from men and women are commonly pooled for analyses. However, it may not be justifiable to pool sexes on the basis of discrete parameters extracted from continuous GRF gait waveforms because this can miss continuous effects. Forty healthy participants (20 men and 20 women) walked at a cadence of 100 steps per minute across two force plates, recording GRFs. Two statistical methods were used to test the null hypothesis of no mean GRF differences between sexes: (i) Statistical Parametric Mapping-using the entire three-component GRF waveform; and (ii) traditional approach-using the first and second vertical GRF peaks. Statistical Parametric Mapping results suggested large sex differences, which post-hoc analyses suggested were due predominantly to higher anterior-posterior and vertical GRFs in early stance in women compared to men. Statistically significant differences were observed for the first GRF peak and similar values for the second GRF peak. These contrasting results emphasise that different parts of the waveform have different signal strengths and thus that one may use the traditional approach to choose arbitrary metrics and make arbitrary conclusions. We suggest that researchers and clinicians consider both the entire gait waveforms and sex-specificity when analysing GRF data. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Post Hoc Analyses of ApoE Genotype-Defined Subgroups in Clinical Trials.

    PubMed

    Kennedy, Richard E; Cutter, Gary R; Wang, Guoqiao; Schneider, Lon S

    2016-01-01

    Many post hoc analyses of clinical trials in Alzheimer's disease (AD) and mild cognitive impairment (MCI) are in small Phase 2 trials. Subject heterogeneity may lead to statistically significant post hoc results that cannot be replicated in larger follow-up studies. We investigated the extent of this problem using simulation studies mimicking current trial methods with post hoc analyses based on ApoE4 carrier status. We used a meta-database of 24 studies, including 3,574 subjects with mild AD and 1,171 subjects with MCI/prodromal AD, to simulate clinical trial scenarios. Post hoc analyses examined if rates of progression on the Alzheimer's Disease Assessment Scale-cognitive (ADAS-cog) differed between ApoE4 carriers and non-carriers. Across studies, ApoE4 carriers were younger and had lower baseline scores, greater rates of progression, and greater variability on the ADAS-cog. Up to 18% of post hoc analyses for 18-month trials in AD showed greater rates of progression for ApoE4 non-carriers that were statistically significant but unlikely to be confirmed in follow-up studies. The frequency of erroneous conclusions dropped below 3% with trials of 100 subjects per arm. In MCI, rates of statistically significant differences with greater progression in ApoE4 non-carriers remained below 3% unless sample sizes were below 25 subjects per arm. Statistically significant differences for ApoE4 in post hoc analyses often reflect heterogeneity among small samples rather than true differential effect among ApoE4 subtypes. Such analyses must be viewed cautiously. ApoE genotype should be incorporated into the design stage to minimize erroneous conclusions.

  2. GMHDIF: A Computer Program for Detecting DIF in Dichotomous and Polytomous Items Using Generalized Mantel-Haenszel Statistics

    ERIC Educational Resources Information Center

    Fidalgo, Angel M.

    2011-01-01

    Mantel-Haenszel (MH) methods constitute one of the most popular nonparametric differential item functioning (DIF) detection procedures. GMHDIF has been developed to provide an easy-to-use program for conducting DIF analyses. Some of the advantages of this program are that (a) it performs two-stage DIF analyses in multiple groups simultaneously;…

  3. Statistical methods and errors in family medicine articles between 2010 and 2014-Suez Canal University, Egypt: A cross-sectional study.

    PubMed

    Nour-Eldein, Hebatallah

    2016-01-01

    Given the limited statistical knowledge of most physicians, it is not uncommon to find statistical errors in research articles. To determine the statistical methods used and to assess the statistical errors in family medicine (FM) research articles that were published between 2010 and 2014. This was a cross-sectional study. All 66 FM research articles that were published over 5 years by FM authors with affiliation to Suez Canal University were screened by the researcher between May and August 2015. Types and frequencies of statistical methods were reviewed in all 66 FM articles. All 60 articles with identified inferential statistics were examined for statistical errors and deficiencies. A comprehensive 58-item checklist based on statistical guidelines was used to evaluate the statistical quality of the FM articles. Inferential methods were recorded in 62/66 (93.9%) of the FM articles. Advanced analyses were used in 29/66 (43.9%). Contingency tables 38/66 (57.6%), regression (logistic, linear) 26/66 (39.4%), and t-tests 17/66 (25.8%) were the most commonly used inferential tests. Within the 60 FM articles with identified inferential statistics, the following problems were observed: no prior sample size calculation 19/60 (31.7%), application of the wrong statistical tests 17/60 (28.3%), incomplete documentation of statistics 59/60 (98.3%), reporting a P value without the test statistic 32/60 (53.3%), no confidence interval reported with effect size measures 12/60 (20.0%), use of the mean (standard deviation) to describe ordinal/non-normal data 8/60 (13.3%), and errors related to interpretation, mainly conclusions not supported by the study data 5/60 (8.3%). Inferential statistics were used in the majority of FM articles. Data analysis and the reporting of statistics are areas for improvement in FM research articles.

  4. Spectral statistics of the uni-modular ensemble

    NASA Astrophysics Data System (ADS)

    Joyner, Christopher H.; Smilansky, Uzy; Weidenmüller, Hans A.

    2017-09-01

    We investigate the spectral statistics of Hermitian matrices in which the elements are chosen uniformly from U(1), called the uni-modular ensemble (UME), in the limit of large matrix size. Using three complementary methods (a supersymmetric integration method, a combinatorial graph-theoretical analysis and a Brownian motion approach), we are able to derive expressions for the 1/N corrections to the mean spectral moments and also analyse the fluctuations about this mean. By addressing the same ensemble from three different points of view, we can critically compare their relative advantages and derive some new results.

  5. Statistical issues on the analysis of change in follow-up studies in dental research.

    PubMed

    Blance, Andrew; Tu, Yu-Kang; Baelum, Vibeke; Gilthorpe, Mark S

    2007-12-01

    To provide an overview of the problems in study design and associated analyses of follow-up studies in dental research, particularly addressing three issues: treatment-baseline interactions; statistical power; and nonrandomization. Our previous work has shown that many studies purport an interaction between change (from baseline) and baseline values, which is often based on inappropriate statistical analyses. A priori power calculations are essential for randomized controlled trials (RCTs), but in the pre-test/post-test RCT design it is not well known to dental researchers that the choice of statistical method affects power, and that power is affected by treatment-baseline interactions. A common (good) practice in the analysis of RCT data is to adjust for baseline outcome values using ancova, thereby increasing statistical power. However, an important requirement for ancova is that there be no interaction between the groups and the baseline outcome (i.e. effective randomization); the patient-selection process should not cause differences in mean baseline values across groups. This assumption is often violated for nonrandomized (observational) studies and the use of ancova is thus problematic, potentially giving biased estimates, invoking Lord's paradox and leading to difficulties in the interpretation of results. Baseline interaction issues can be overcome by the use of statistical methods not widely practised in dental research: Oldham's method and multilevel modelling; the latter is preferred for its greater flexibility to deal with more than one follow-up occasion as well as additional covariates. To illustrate these three key issues, hypothetical examples are considered from the fields of periodontology, orthodontics, and oral implantology. Caution needs to be exercised when considering the design and analysis of follow-up studies. ancova is generally inappropriate for nonrandomized studies and causal inferences from observational data should be avoided.

  6. Statistics Clinic

    NASA Technical Reports Server (NTRS)

    Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James

    2014-01-01

    Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.

  7. Approach for Input Uncertainty Propagation and Robust Design in CFD Using Sensitivity Derivatives

    NASA Technical Reports Server (NTRS)

    Putko, Michele M.; Taylor, Arthur C., III; Newman, Perry A.; Green, Lawrence L.

    2002-01-01

    An implementation of the approximate statistical moment method for uncertainty propagation and robust optimization for quasi 3-D Euler CFD code is presented. Given uncertainties in statistically independent, random, normally distributed input variables, first- and second-order statistical moment procedures are performed to approximate the uncertainty in the CFD output. Efficient calculation of both first- and second-order sensitivity derivatives is required. In order to assess the validity of the approximations, these moments are compared with statistical moments generated through Monte Carlo simulations. The uncertainties in the CFD input variables are also incorporated into a robust optimization procedure. For this optimization, statistical moments involving first-order sensitivity derivatives appear in the objective function and system constraints. Second-order sensitivity derivatives are used in a gradient-based search to successfully execute a robust optimization. The approximate methods used throughout the analyses are found to be valid when considering robustness about input parameter mean values.
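
    A toy version of first-order moment propagation is sketched below, with a cheap algebraic function standing in for the CFD response and assumed input means and standard deviations; the result is checked against Monte Carlo sampling.

      # Propagate input standard deviations through a model using finite-difference
      # sensitivity derivatives (first-order moment method), then verify by Monte Carlo.
      import numpy as np

      def model(x):                                   # stand-in for the CFD response
          return x[0]**2 * np.sin(x[1]) + 3.0 * x[2]

      mu = np.array([1.0, 0.6, 2.0])                  # input means (assumed)
      sigma = np.array([0.05, 0.02, 0.10])            # input standard deviations (assumed)

      grad = np.empty(3)
      for i in range(3):
          h = 1e-6
          xp, xm = mu.copy(), mu.copy()
          xp[i] += h
          xm[i] -= h
          grad[i] = (model(xp) - model(xm)) / (2 * h) # central-difference sensitivity

      sd_first_order = np.sqrt(np.sum((grad * sigma)**2))

      rng = np.random.default_rng(0)
      samples = rng.normal(mu, sigma, (100_000, 3))
      sd_monte_carlo = model(samples.T).std()

      print(f"first-order sigma: {sd_first_order:.4f}")
      print(f"Monte Carlo sigma: {sd_monte_carlo:.4f}")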

  8. Adjusting the Adjusted X[superscript 2]/df Ratio Statistic for Dichotomous Item Response Theory Analyses: Does the Model Fit?

    ERIC Educational Resources Information Center

    Tay, Louis; Drasgow, Fritz

    2012-01-01

    Two Monte Carlo simulation studies investigated the effectiveness of the mean adjusted X[superscript 2]/df statistic proposed by Drasgow and colleagues and, because of problems with the method, a new approach for assessing the goodness of fit of an item response theory model was developed. It has been previously recommended that mean adjusted…

  9. The allele combinations of three loci based on, liver, stomach cancers, hematencephalon, COPD and normal population: A preliminary study.

    PubMed

    Gai, Liping; Liu, Hui; Cui, Jing-Hui; Yu, Weijian; Ding, Xiao-Dong

    2017-03-20

    The purpose of this study was to examine the specific allele combinations of three loci connected with the liver cancers, stomach cancers, hematencephalon and patients with chronic obstructive pulmonary disease (COPD) and to explore the feasibility of the research methods. We explored different mathematical methods for statistical analyses to assess the association between the genotype and phenotype. At the same time we still analyses the statistical results of allele combinations of three loci by difference value method and ratio method. All the DNA blood samples were collected from patients with 50 liver cancers, 75 stomach cancers, 50 hematencephalon, 72 COPD and 200 normal populations. All the samples were from Chinese. Alleles from short tandem repeat (STR) loci were determined using the STR Profiler plus PCR amplification kit (15 STR loci). Previous research was based on combinations of single-locus alleles, and combinations of cross-loci (two loci) alleles. Allele combinations of three loci were obtained by computer counting and stronger genetic signal was obtained. The methods of allele combinations of three loci can help to identify the statistically significant differences of allele combinations between liver cancers, stomach cancers, patients with hematencephalon, COPD and the normal population. The probability of illness followed different rules and had apparent specificity. This method can be extended to other diseases and provide reference for early clinical diagnosis. Copyright © 2016. Published by Elsevier B.V.

  10. Parasites as valuable stock markers for fisheries in Australasia, East Asia and the Pacific Islands.

    PubMed

    Lester, R J G; Moore, B R

    2015-01-01

    Over 30 studies in Australasia, East Asia and the Pacific Islands region have collected and analysed parasite data to determine the ranges of individual fish, many leading to conclusions about stock delineation. Parasites used as biological tags have included both those known to have long residence times in the fish and those thought to be relatively transient. In many cases the parasitological conclusions have been supported by other methods especially analysis of the chemical constituents of otoliths, and to a lesser extent, genetic data. In analysing parasite data, authors have applied multiple different statistical methodologies, including summary statistics, and univariate and multivariate approaches. Recently, a growing number of researchers have found non-parametric methods, such as analysis of similarities and cluster analysis, to be valuable. Future studies into the residence times, life cycles and geographical distributions of parasites together with more robust analytical methods will yield much important information to clarify stock structures in the area.

  11. Clinical trials, epidemiology, and public confidence.

    PubMed

    Seigel, Daniel

    2003-11-15

    Critics in the media have become wary of exaggerated research claims from clinical trials and epidemiological studies. Closer to home, reviews of published studies find a high frequency of poor quality in research methods, including those used for statistical analysis. The statistical literature has long recognized that questionable research findings can occur when investigators fail to set aside their own outcome preferences as they analyse and interpret data. These preferences can be related to financial interests, a concern for patients, peer recognition, and commitment to a hypothesis. Several analyses of published papers provide evidence of an association between financial conflicts of interest and reported results. If we are to regain professional and lay confidence in research findings some changes are required. Clinical journals need to develop more competence in the review of analytic methods and provide space for thorough discussion of published papers whose results are challenged. Graduate schools need to prepare students for the conflicting interests that surround the practice of statistics. Above all, each of us must recognize our responsibility to use analytic procedures that illuminate the research issues rather than those serving special interests. Copyright 2003 John Wiley & Sons, Ltd.

  12. A guide to statistical analysis in microbial ecology: a community-focused, living review of multivariate data analyses.

    PubMed

    Buttigieg, Pier Luigi; Ramette, Alban

    2014-12-01

    The application of multivariate statistical analyses has become a consistent feature in microbial ecology. However, many microbial ecologists are still in the process of developing a deep understanding of these methods and appreciating their limitations. As a consequence, staying abreast of progress and debate in this arena poses an additional challenge to many microbial ecologists. To address these issues, we present the GUide to STatistical Analysis in Microbial Ecology (GUSTA ME): a dynamic, web-based resource providing accessible descriptions of numerous multivariate techniques relevant to microbial ecologists. A combination of interactive elements allows users to discover and navigate between methods relevant to their needs and examine how they have been used by others in the field. We have designed GUSTA ME to become a community-led and -curated service, which we hope will provide a common reference and forum to discuss and disseminate analytical techniques relevant to the microbial ecology community. © 2014 The Authors. FEMS Microbiology Ecology published by John Wiley & Sons Ltd on behalf of Federation of European Microbiological Societies.

  13. Bias, precision and statistical power of analysis of covariance in the analysis of randomized trials with baseline imbalance: a simulation study.

    PubMed

    Egbewale, Bolaji E; Lewis, Martyn; Sim, Julius

    2014-04-09

    Analysis of variance (ANOVA), change-score analysis (CSA) and analysis of covariance (ANCOVA) respond differently to baseline imbalance in randomized controlled trials. However, no empirical studies appear to have quantified the differential bias and precision of estimates derived from these methods of analysis, and their relative statistical power, in relation to combinations of levels of key trial characteristics. This simulation study therefore examined the relative bias, precision and statistical power of these three analyses using simulated trial data. 126 hypothetical trial scenarios were evaluated (126,000 datasets), each with continuous data simulated by using a combination of levels of: treatment effect; pretest-posttest correlation; direction and magnitude of baseline imbalance. The bias, precision and power of each method of analysis were calculated for each scenario. Compared to the unbiased estimates produced by ANCOVA, both ANOVA and CSA are subject to bias, in relation to pretest-posttest correlation and the direction of baseline imbalance. Additionally, ANOVA and CSA are less precise than ANCOVA, especially when pretest-posttest correlation ≥ 0.3. When groups are balanced at baseline, ANCOVA is at least as powerful as the other analyses. Apparently greater power of ANOVA and CSA at certain imbalances is achieved in respect of a biased treatment effect. Across a range of correlations between pre- and post-treatment scores and at varying levels and direction of baseline imbalance, ANCOVA remains the optimum statistical method for the analysis of continuous outcomes in RCTs, in terms of bias, precision and statistical power.
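
    A single simulated scenario is enough to see the pattern reported above. In the sketch below (one toy scenario, not the study's 126 scenarios or 126,000 datasets), the same deliberately imbalanced trial is analysed three ways; the ANCOVA estimate stays closest to the true effect.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(4)
      n, true_effect, rho = 200, 0.5, 0.6
      group = np.repeat([0, 1], n // 2)
      pre = rng.normal(0, 1, n) + 0.3 * group            # deliberate baseline imbalance
      post = rho * pre + true_effect * group + rng.normal(0, np.sqrt(1 - rho**2), n)

      anova  = sm.OLS(post, sm.add_constant(group)).fit().params[1]                              # post-score ANOVA
      csa    = sm.OLS(post - pre, sm.add_constant(group)).fit().params[1]                        # change-score analysis
      ancova = sm.OLS(post, sm.add_constant(np.column_stack([group, pre]))).fit().params[1]      # baseline-adjusted

      print(f"ANOVA: {anova:.3f}   CSA: {csa:.3f}   ANCOVA: {ancova:.3f}   (true effect {true_effect})")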

  14. Statistical analysis of fNIRS data: a comprehensive review.

    PubMed

    Tak, Sungho; Ye, Jong Chul

    2014-01-15

    Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activities using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of historical developments of statistical analyses of fNIRS signal, which include motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and statistical parameter mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effect model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some of the open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.

  15. Suggestions for presenting the results of data analyses

    USGS Publications Warehouse

    Anderson, David R.; Link, William A.; Johnson, Douglas H.; Burnham, Kenneth P.

    2001-01-01

    We give suggestions for the presentation of research results from frequentist, information-theoretic, and Bayesian analysis paradigms, followed by several general suggestions. The information-theoretic and Bayesian methods offer alternative approaches to data analysis and inference compared to traditionally used methods. Guidance is lacking on the presentation of results under these alternative procedures and on nontesting aspects of classical frequentist methods of statistical analysis. Null hypothesis testing has come under intense criticism. We recommend less reporting of the results of statistical tests of null hypotheses in cases where the null is surely false anyway, or where the null hypothesis is of little interest to science or management.

  16. Cross-Sectional and Panel Data Analyses of an Incompletely Observed Variable Derived from the Nonrandomized Method for Surveying Sensitive Questions

    ERIC Educational Resources Information Center

    Yamaguchi, Kazuo

    2016-01-01

    This article describes (1) the survey methodological and statistical characteristics of the nonrandomized method for surveying sensitive questions for both cross-sectional and panel survey data and (2) the way to use the incompletely observed variable obtained from this survey method in logistic regression and in loglinear and log-multiplicative…

  17. Statistical methods for analysing responses of wildlife to human disturbance

    Treesearch

    Haiganoush K. Preisler; Alan A. Ager; Michael J. Wisdom

    2006-01-01

    Off-road recreation is increasing rapidly in many areas of the world, and effects on wildlife can be highly detrimental. Consequently, we have developed methods for studying wildlife responses to off-road recreation with the use of new technologies that allow frequent and accurate monitoring of human-wildlife interactions. To...

  18. Power Analysis for Complex Mediational Designs Using Monte Carlo Methods

    ERIC Educational Resources Information Center

    Thoemmes, Felix; MacKinnon, David P.; Reiser, Mark R.

    2010-01-01

    Applied researchers often include mediation effects in applications of advanced methods such as latent variable models and linear growth curve models. Guidance on how to estimate statistical power to detect mediation for these models has not yet been addressed in the literature. We describe a general framework for power analyses for complex…
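
    As an illustration of the general idea (not the specific latent-variable or growth-curve models discussed in the article), the sketch below estimates power for a single-mediator model by simulation: data are generated from X → M → Y with chosen path coefficients, and power is the proportion of replicates in which both paths are significant (a joint-significance test of the indirect effect). All parameter values are invented.

    ```python
    # Monte Carlo power estimate for a single-mediator model X -> M -> Y using
    # the joint-significance test of the a-path and b-path.
    import numpy as np
    import statsmodels.api as sm

    def mediation_power(n, a=0.3, b=0.3, c_prime=0.1, n_sims=2000, alpha=0.05, seed=0):
        rng = np.random.default_rng(seed)
        hits = 0
        for _ in range(n_sims):
            x = rng.normal(size=n)
            m = a * x + rng.normal(size=n)
            y = b * m + c_prime * x + rng.normal(size=n)
            p_a = sm.OLS(m, sm.add_constant(x)).fit().pvalues[1]                       # a-path
            p_b = sm.OLS(y, sm.add_constant(np.column_stack([m, x]))).fit().pvalues[1]  # b-path
            hits += (p_a < alpha) and (p_b < alpha)
        return hits / n_sims

    print(mediation_power(n=100))   # estimated power to detect the indirect effect
    ```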

  19. A Simple Method to Control Positive Baseline Trend within Data Nonoverlap

    ERIC Educational Resources Information Center

    Parker, Richard I.; Vannest, Kimberly J.; Davis, John L.

    2014-01-01

    Nonoverlap is widely used as a statistical summary of data; however, these analyses rarely correct unwanted positive baseline trend. This article presents and validates the graph rotation for overlap and trend (GROT) technique, a hand calculation method for controlling positive baseline trend within an analysis of data nonoverlap. GROT is…

  20. Analyzing the Validity of the Adult-Adolescent Parenting Inventory for Low-Income Populations

    ERIC Educational Resources Information Center

    Lawson, Michael A.; Alameda-Lawson, Tania; Byrnes, Edward

    2017-01-01

    Objectives: The purpose of this study was to examine the construct and predictive validity of the Adult-Adolescent Parenting Inventory (AAPI-2). Methods: The validity of the AAPI-2 was evaluated using multiple statistical methods, including exploratory factor analysis, confirmatory factor analysis, and latent class analysis. These analyses were…

  1. Coloc-stats: a unified web interface to perform colocalization analysis of genomic features.

    PubMed

    Simovski, Boris; Kanduri, Chakravarthi; Gundersen, Sveinung; Titov, Dmytro; Domanska, Diana; Bock, Christoph; Bossini-Castillo, Lara; Chikina, Maria; Favorov, Alexander; Layer, Ryan M; Mironov, Andrey A; Quinlan, Aaron R; Sheffield, Nathan C; Trynka, Gosia; Sandve, Geir K

    2018-06-05

    Functional genomics assays produce sets of genomic regions as one of their main outputs. To biologically interpret such region-sets, researchers often use colocalization analysis, where the statistical significance of colocalization (overlap, spatial proximity) between two or more region-sets is tested. Existing colocalization analysis tools vary in the statistical methodology and analysis approaches, thus potentially providing different conclusions for the same research question. As the findings of colocalization analysis are often the basis for follow-up experiments, it is helpful to use several tools in parallel and to compare the results. We developed the Coloc-stats web service to facilitate such analyses. Coloc-stats provides a unified interface to perform colocalization analysis across various analytical methods and method-specific options (e.g. colocalization measures, resolution, null models). Coloc-stats helps the user to find a method that supports their experimental requirements and allows for a straightforward comparison across methods. Coloc-stats is implemented as a web server with a graphical user interface that assists users with configuring their colocalization analyses. Coloc-stats is freely available at https://hyperbrowser.uio.no/coloc-stats/.

  2. Characteristics of meta-analyses and their component studies in the Cochrane Database of Systematic Reviews: a cross-sectional, descriptive analysis

    PubMed Central

    2011-01-01

    Background Cochrane systematic reviews collate and summarise studies of the effects of healthcare interventions. The characteristics of these reviews and the meta-analyses and individual studies they contain provide insights into the nature of healthcare research and important context for the development of relevant statistical and other methods. Methods We classified every meta-analysis with at least two studies in every review in the January 2008 issue of the Cochrane Database of Systematic Reviews (CDSR) according to the medical specialty, the types of interventions being compared and the type of outcome. We provide descriptive statistics for numbers of meta-analyses, numbers of component studies and sample sizes of component studies, broken down by these categories. Results We included 2321 reviews containing 22,453 meta-analyses, which themselves consist of data from 112,600 individual studies (which may appear in more than one meta-analysis). Meta-analyses in the areas of gynaecology, pregnancy and childbirth (21%), mental health (13%) and respiratory diseases (13%) are well represented in the CDSR. Most meta-analyses address drugs, either with a control or placebo group (37%) or in a comparison with another drug (25%). The median number of meta-analyses per review is six (inter-quartile range 3 to 12). The median number of studies included in the meta-analyses with at least two studies is three (inter-quartile range 2 to 6). Sample sizes of individual studies range from 2 to 1,242,071, with a median of 91 participants. Discussion It is clear that the numbers of studies eligible for meta-analyses are typically very small for all medical areas, outcomes and interventions covered by Cochrane reviews. This highlights the particular importance of suitable methods for the meta-analysis of small data sets. There was little variation in number of studies per meta-analysis across medical areas, across outcome data types or across types of interventions being compared. PMID:22114982

  3. SOCR Analyses – an Instructional Java Web-based Statistical Analysis Toolkit

    PubMed Central

    Chu, Annie; Cui, Jenny; Dinov, Ivo D.

    2011-01-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most updated information and newly added models. PMID:21546994

  4. SAS Code for Calculating Intraclass Correlation Coefficients and Effect Size Benchmarks for Site-Randomized Education Experiments

    ERIC Educational Resources Information Center

    Brandon, Paul R.; Harrison, George M.; Lawton, Brian E.

    2013-01-01

    When evaluators plan site-randomized experiments, they must conduct the appropriate statistical power analyses. These analyses are most likely to be valid when they are based on data from the jurisdictions in which the studies are to be conducted. In this method note, we provide software code, in the form of a SAS macro, for producing statistical…

  5. Cocaine profiling for strategic intelligence, a cross-border project between France and Switzerland: part II. Validation of the statistical methodology for the profiling of cocaine.

    PubMed

    Lociciro, S; Esseiva, P; Hayoz, P; Dujourdy, L; Besacier, F; Margot, P

    2008-05-20

    Harmonisation and optimization of analytical and statistical methodologies were carried out between two forensic laboratories (Lausanne, Switzerland and Lyon, France) in order to provide drug intelligence for cross-border cocaine seizures. Part I dealt with the optimization of the analytical method and its robustness. This second part investigates statistical methodologies that will provide reliable comparison of cocaine seizures analysed on two different gas chromatographs interfaced with flame ionisation detectors (GC-FIDs) in two distinct laboratories. Sixty-six statistical combinations (ten data pre-treatments followed by six different distance measurements and correlation coefficients) were applied. One pre-treatment (N+S: area of each peak is divided by its standard deviation calculated from the whole data set) followed by the Cosine or Pearson correlation coefficients was found to be the best statistical compromise for optimal discrimination of linked and non-linked samples. The centralisation of the analyses in one single laboratory is no longer a prerequisite for comparing samples seized in different countries. This allows collaboration, but also jurisdictional control over data.
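
    A minimal sketch of the retained combination, assuming a small invented matrix of GC-FID peak areas: the "N+S" pre-treatment divides each peak area by that peak's standard deviation over the whole data set, after which pairs of profiles are compared with the cosine and Pearson correlation coefficients.

    ```python
    # "N+S" pre-treatment followed by cosine and Pearson similarity between two
    # cocaine profiles; the peak-area matrix here is invented for illustration.
    import numpy as np

    def pretreat(profiles):
        """profiles: (n_samples, n_peaks) array of GC-FID peak areas."""
        return profiles / profiles.std(axis=0, ddof=1)   # divide each peak by its std over the data set

    def cosine(u, v):
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

    def pearson(u, v):
        return np.corrcoef(u, v)[0, 1]

    profiles = np.abs(np.random.default_rng(2).normal(10, 3, size=(50, 8)))  # fake seizure profiles
    scaled = pretreat(profiles)
    print("cosine :", cosine(scaled[0], scaled[1]))
    print("pearson:", pearson(scaled[0], scaled[1]))
    ```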

  6. Experimental design and statistical methods for improved hit detection in high-throughput screening.

    PubMed

    Malo, Nathalie; Hanley, James A; Carlile, Graeme; Liu, Jing; Pelletier, Jerry; Thomas, David; Nadon, Robert

    2010-09-01

    Identification of active compounds in high-throughput screening (HTS) contexts can be substantially improved by applying classical experimental design and statistical inference principles to all phases of HTS studies. The authors present both experimental and simulated data to illustrate how true-positive rates can be maximized without increasing false-positive rates by the following analytical process. First, the use of robust data preprocessing methods reduces unwanted variation by removing row, column, and plate biases. Second, replicate measurements allow estimation of the magnitude of the remaining random error and the use of formal statistical models to benchmark putative hits relative to what is expected by chance. Receiver Operating Characteristic (ROC) analyses revealed superior power for data preprocessed by a trimmed-mean polish method combined with the RVM t-test, particularly for small- to moderate-sized biological hits.
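
    The sketch below illustrates the flavour of this preprocessing on invented data: a simple trimmed-mean polish removes row and column biases from a plate of log-signals, and replicate measurements of one compound are then benchmarked with an ordinary one-sample t-test. The RVM t-test used by the authors shrinks the per-compound variance estimates and is not reproduced here.

    ```python
    # Plate-bias removal with a simple trimmed-mean polish, then a one-sample
    # t-test on replicate corrected signals for one compound (illustrative data).
    import numpy as np
    from scipy import stats

    def trimmed_mean_polish(plate, trim=0.1, n_iter=2):
        """Remove row and column biases from a (rows x columns) plate of log-signals."""
        plate = plate.astype(float).copy()
        for _ in range(n_iter):
            plate -= stats.trim_mean(plate, trim, axis=1)[:, None]   # row effects
            plate -= stats.trim_mean(plate, trim, axis=0)[None, :]   # column effects
        return plate

    rng = np.random.default_rng(3)
    plates = [rng.normal(0, 0.3, (16, 24)) + np.linspace(0, 0.5, 24)[None, :]   # column drift
              for _ in range(3)]                                                # three replicate plates
    corrected = [trimmed_mean_polish(p) for p in plates]

    replicate_values = np.array([c[0, 0] for c in corrected])   # one compound across the replicates
    t, p = stats.ttest_1samp(replicate_values, 0.0)
    print(f"t = {t:.2f}, p = {p:.3f}")
    ```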

  7. Comparison of corneal endothelial image analysis by Konan SP8000 noncontact and Bio-Optics Bambi systems.

    PubMed

    Benetz, B A; Diaconu, E; Bowlin, S J; Oak, S S; Laing, R A; Lass, J H

    1999-01-01

    To compare corneal endothelial image analysis by Konan SP8000 and Bio-Optics Bambi image-analysis systems. Corneal endothelial images from 98 individuals (191 eyes), ranging in age from 4 to 87 years, with a normal slit-lamp examination and no history of ocular trauma, intraocular surgery, or intraocular inflammation were obtained by the Konan SP8000 noncontact specular microscope. One observer analyzed these images by using the Konan system and a second observer by using the Bio-Optics Bambi system. Three methods of analyses were used: a fixed-frame method to obtain cell density (for both Konan and Bio-Optics Bambi) and a "dot" (Konan) or "corners" (Bio-Optics Bambi) method to determine morphometric parameters. The cell density determined by the Konan fixed-frame method was significantly higher (157 cells/mm2) than the Bio-Optics Bambi fixed-frame method determination (p<0.0001). However, the difference in cell density, although still statistically significant, was smaller and reversed comparing the Konan fixed-frame method with both Konan dot and Bio-Optics Bambi corners method (-74 cells/mm2, p<0.0001; -55 cells/mm2, p<0.0001, respectively). Small but statistically significant differences in morphometric analyses between Konan and Bio-Optics Bambi were seen: cell density, +19 cells/mm2 (p = 0.03); cell area, -3.0 µm2 (p = 0.008); and coefficient of variation, +1.0 (p = 0.003). There was no statistically significant difference between these two methods in the percentage of six-sided cells detected (p = 0.55). Cell densities measured by the Konan fixed-frame method were comparable with Konan and Bio-Optics Bambi's morphometric analysis, but not with the Bio-Optics Bambi fixed-frame method. The two morphometric analyses were comparable with minimal or no differences for the parameters that were studied. The Konan SP8000 endothelial image-analysis system may be useful for large-scale clinical trials determining cell loss; its noncontact system has many clinical benefits (including patient comfort, safety, ease of use, and short procedure time) and provides reliable cell-density calculations.

  8. Geographically Sourcing Cocaine's Origin - Delineation of the Nineteen Major Coca Growing Regions in South America.

    PubMed

    Mallette, Jennifer R; Casale, John F; Jordan, James; Morello, David R; Beyer, Paul M

    2016-03-23

    Previously, geo-sourcing to five major coca growing regions within South America was accomplished. However, the expansion of coca cultivation throughout South America made sub-regional origin determinations increasingly difficult. The former methodology was recently enhanced with additional stable isotope analyses ((2)H and (18)O) to fully characterize cocaine due to the varying environmental conditions in which the coca was grown. An improved data analysis method was implemented with the combination of machine learning and multivariate statistical analysis methods to provide further partitioning between growing regions. Here, we show how the combination of trace cocaine alkaloids, stable isotopes, and multivariate statistical analyses can be used to classify illicit cocaine as originating from one of 19 growing regions within South America. The data obtained through this approach can be used to describe current coca cultivation and production trends, highlight trafficking routes, as well as identify new coca growing regions.

  9. Geographically Sourcing Cocaine’s Origin - Delineation of the Nineteen Major Coca Growing Regions in South America

    NASA Astrophysics Data System (ADS)

    Mallette, Jennifer R.; Casale, John F.; Jordan, James; Morello, David R.; Beyer, Paul M.

    2016-03-01

    Previously, geo-sourcing to five major coca growing regions within South America was accomplished. However, the expansion of coca cultivation throughout South America made sub-regional origin determinations increasingly difficult. The former methodology was recently enhanced with additional stable isotope analyses (2H and 18O) to fully characterize cocaine due to the varying environmental conditions in which the coca was grown. An improved data analysis method was implemented with the combination of machine learning and multivariate statistical analysis methods to provide further partitioning between growing regions. Here, we show how the combination of trace cocaine alkaloids, stable isotopes, and multivariate statistical analyses can be used to classify illicit cocaine as originating from one of 19 growing regions within South America. The data obtained through this approach can be used to describe current coca cultivation and production trends, highlight trafficking routes, as well as identify new coca growing regions.

  10. A weighted U-statistic for genetic association analyses of sequencing data.

    PubMed

    Wei, Changshuai; Li, Ming; He, Zihuai; Vsevolozhskaya, Olga; Schaid, Daniel J; Lu, Qing

    2014-12-01

    With advancements in next-generation sequencing technology, a massive amount of sequencing data is generated, which offers a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, the high-dimensional sequencing data poses a great challenge for statistical analysis. The association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a Weighted U Sequencing test, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a nonparametric U-statistic, WU-SEQ makes no assumption of the underlying disease model and phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed a commonly used sequence kernel association test (SKAT) method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL 4 and very low density lipoprotein cholesterol. © 2014 WILEY PERIODICALS, INC.
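
    A generic weighted-U sketch (not the authors' WU-SEQ implementation; the kernels, weights and data below are invented) illustrating the idea: pairwise phenotype similarity is weighted by pairwise genotype similarity across a region, and significance is assessed by permuting phenotypes.

    ```python
    # Generic weighted U-statistic for region-based association, with a
    # permutation p-value; genotype and phenotype kernels are illustrative only.
    import numpy as np

    def weighted_u(genotypes, phenotype):
        """genotypes: (n, m) rare-variant dosage matrix; phenotype: length-n vector."""
        n = len(phenotype)
        g_sim = genotypes @ genotypes.T                                 # genotype-similarity weights
        p_sim = -np.abs(phenotype[:, None] - phenotype[None, :])        # phenotype-similarity kernel
        iu = np.triu_indices(n, k=1)                                    # distinct pairs i < j
        return np.sum(g_sim[iu] * p_sim[iu]) / iu[0].size

    def permutation_pvalue(genotypes, phenotype, n_perm=999, seed=0):
        rng = np.random.default_rng(seed)
        obs = weighted_u(genotypes, phenotype)
        perms = [weighted_u(genotypes, rng.permutation(phenotype)) for _ in range(n_perm)]
        return (1 + sum(u >= obs for u in perms)) / (n_perm + 1)

    rng = np.random.default_rng(4)
    G = rng.binomial(2, 0.02, size=(200, 30))                 # rare-variant genotypes
    y = 0.5 * G[:, :5].sum(axis=1) + rng.normal(size=200)     # phenotype with signal in 5 variants
    print(permutation_pvalue(G, y))
    ```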

  11. Emotional and cognitive effects of peer tutoring among secondary school mathematics students

    NASA Astrophysics Data System (ADS)

    Alegre Ansuategui, Francisco José; Moliner Miravet, Lidón

    2017-11-01

    This paper describes an experience of same-age peer tutoring conducted with 19 eighth-grade mathematics students in a secondary school in Castellon de la Plana (Spain). Three constructs were analysed before and after launching the program: academic performance, mathematics self-concept and attitude of solidarity. Students' perceptions of the method were also analysed. The quantitative data was gathered by means of a mathematics self-concept questionnaire, an attitude of solidarity questionnaire and the students' numerical ratings. A statistical analysis was performed using Student's t-test. The qualitative information was gathered by means of discussion groups and a field diary. This information was analysed using descriptive analysis and by categorizing the information. Results show statistically significant improvements in all the variables and the positive assessment of the experience and the interactions that took place between the students.

  12. Quantifying Safety Margin Using the Risk-Informed Safety Margin Characterization (RISMC)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grabaskas, David; Bucknor, Matthew; Brunett, Acacia

    2015-04-26

    The Risk-Informed Safety Margin Characterization (RISMC), developed by Idaho National Laboratory as part of the Light-Water Reactor Sustainability Project, utilizes a probabilistic safety margin comparison between a load and capacity distribution, rather than a deterministic comparison between two values, as is usually done in best-estimate plus uncertainty analyses. The goal is to determine the failure probability, or in other words, the probability of the system load equaling or exceeding the system capacity. While this method has been used in pilot studies, there has been little work conducted investigating the statistical significance of the resulting failure probability. In particular, it is difficult to determine how many simulations are necessary to properly characterize the failure probability. This work uses classical (frequentist) statistics and confidence intervals to examine the impact in statistical accuracy when the number of simulations is varied. Two methods are proposed to establish confidence intervals related to the failure probability established using a RISMC analysis. The confidence interval provides information about the statistical accuracy of the method utilized to explore the uncertainty space, and offers a quantitative method to gauge the increase in statistical accuracy due to performing additional simulations.
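
    As an illustration of how classical statistics can bound a Monte Carlo failure probability, the sketch below computes the load-exceeds-capacity fraction from simulated load and capacity samples and attaches an exact Clopper-Pearson binomial confidence interval; the two methods proposed in the report itself are not reproduced, and the distributions are invented.

    ```python
    # Failure probability (fraction of simulations with load >= capacity) with an
    # exact Clopper-Pearson binomial confidence interval; distributions are fake.
    import numpy as np
    from scipy import stats

    def failure_probability_ci(load, capacity, conf=0.95):
        n = len(load)
        k = int(np.sum(load >= capacity))                     # number of "failures"
        alpha = 1 - conf
        lo = stats.beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
        hi = stats.beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
        return k / n, (lo, hi)

    rng = np.random.default_rng(5)
    load = rng.normal(480, 30, 1000)        # illustrative load distribution
    capacity = rng.normal(550, 25, 1000)    # illustrative capacity distribution
    print(failure_probability_ci(load, capacity))
    ```

    Repeating the calculation with more simulations narrows the interval, which is the quantitative gauge of statistical accuracy the abstract describes.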

  13. Statistical analysis plan for the Alveolar Recruitment for Acute Respiratory Distress Syndrome Trial (ART). A randomized controlled trial

    PubMed Central

    Damiani, Lucas Petri; Berwanger, Otavio; Paisani, Denise; Laranjeira, Ligia Nasi; Suzumura, Erica Aranha; Amato, Marcelo Britto Passos; Carvalho, Carlos Roberto Ribeiro; Cavalcanti, Alexandre Biasi

    2017-01-01

    Background The Alveolar Recruitment for Acute Respiratory Distress Syndrome Trial (ART) is an international multicenter randomized pragmatic controlled trial with allocation concealment involving 120 intensive care units in Brazil, Argentina, Colombia, Italy, Poland, Portugal, Malaysia, Spain, and Uruguay. The primary objective of ART is to determine whether maximum stepwise alveolar recruitment associated with PEEP titration, adjusted according to the static compliance of the respiratory system (ART strategy), is able to increase 28-day survival in patients with acute respiratory distress syndrome compared to conventional treatment (ARDSNet strategy). Objective To describe the data management process and statistical analysis plan. Methods The statistical analysis plan was designed by the trial executive committee and reviewed and approved by the trial steering committee. We provide an overview of the trial design with a special focus on describing the primary (28-day survival) and secondary outcomes. We describe our data management process, data monitoring committee, interim analyses, and sample size calculation. We describe our planned statistical analyses for primary and secondary outcomes as well as pre-specified subgroup analyses. We also provide details for presenting results, including mock tables for baseline characteristics, adherence to the protocol and effect on clinical outcomes. Conclusion According to best trial practice, we report our statistical analysis plan and data management plan prior to locking the database and beginning analyses. We anticipate that this document will prevent analysis bias and enhance the utility of the reported results. Trial registration ClinicalTrials.gov number, NCT01374022. PMID:28977255

  14. Adapt-Mix: learning local genetic correlation structure improves summary statistics-based analyses

    PubMed Central

    Park, Danny S.; Brown, Brielin; Eng, Celeste; Huntsman, Scott; Hu, Donglei; Torgerson, Dara G.; Burchard, Esteban G.; Zaitlen, Noah

    2015-01-01

    Motivation: Approaches to identifying new risk loci, training risk prediction models, imputing untyped variants and fine-mapping causal variants from summary statistics of genome-wide association studies are playing an increasingly important role in the human genetics community. Current summary statistics-based methods rely on global ‘best guess’ reference panels to model the genetic correlation structure of the dataset being studied. This approach, especially in admixed populations, has the potential to produce misleading results, ignores variation in local structure and is not feasible when appropriate reference panels are missing or small. Here, we develop a method, Adapt-Mix, that combines information across all available reference panels to produce estimates of local genetic correlation structure for summary statistics-based methods in arbitrary populations. Results: We applied Adapt-Mix to estimate the genetic correlation structure of both admixed and non-admixed individuals using simulated and real data. We evaluated our method by measuring the performance of two summary statistics-based methods: imputation and joint-testing. When using our method as opposed to the current standard of ‘best guess’ reference panels, we observed a 28% decrease in mean-squared error for imputation and a 73.7% decrease in mean-squared error for joint-testing. Availability and implementation: Our method is publicly available in a software package called ADAPT-Mix available at https://github.com/dpark27/adapt_mix. Contact: noah.zaitlen@ucsf.edu PMID:26072481

  15. OSPAR standard method and software for statistical analysis of beach litter data.

    PubMed

    Schulz, Marcus; van Loon, Willem; Fleet, David M; Baggelaar, Paul; van der Meulen, Eit

    2017-09-15

    The aim of this study is to develop standard statistical methods and software for the analysis of beach litter data. The optimal ensemble of statistical methods comprises the Mann-Kendall trend test, the Theil-Sen slope estimation, the Wilcoxon step trend test and basic descriptive statistics. The application of Litter Analyst, a tailor-made software for analysing the results of beach litter surveys, to OSPAR beach litter data from seven beaches bordering on the south-eastern North Sea, revealed 23 significant trends in the abundances of beach litter types for the period 2009-2014. Litter Analyst revealed a large variation in the abundance of litter types between beaches. To reduce the effects of spatial variation, trend analysis of beach litter data can most effectively be performed at the beach or national level. Spatial aggregation of beach litter data within a region is possible, but resulted in a considerable reduction in the number of significant trends. Copyright © 2017 Elsevier Ltd. All rights reserved.
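
    Two components of that ensemble, a Mann-Kendall-type trend test and the Theil-Sen slope, can be reproduced with SciPy as sketched below (illustrative counts, not OSPAR data; the Litter Analyst software itself is not used here).

    ```python
    # Kendall's tau of counts against survey year (Mann-Kendall-style trend test)
    # and a Theil-Sen slope with its 95% confidence interval, on fake counts.
    import numpy as np
    from scipy import stats

    years = np.arange(2009, 2015)
    counts = np.array([120, 95, 110, 80, 70, 65])             # illustrative litter counts per survey

    tau, p_value = stats.kendalltau(years, counts)            # monotonic trend test
    slope, intercept, lo, hi = stats.theilslopes(counts, years, 0.95)

    print(f"tau = {tau:.2f}, p = {p_value:.3f}")
    print(f"Theil-Sen slope = {slope:.1f} items/year (95% CI {lo:.1f} to {hi:.1f})")
    ```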

  16. Epidemiology Characteristics, Methodological Assessment and Reporting of Statistical Analysis of Network Meta-Analyses in the Field of Cancer

    PubMed Central

    Ge, Long; Tian, Jin-hui; Li, Xiu-xia; Song, Fujian; Li, Lun; Zhang, Jun; Li, Ge; Pei, Gai-qin; Qiu, Xia; Yang, Ke-hu

    2016-01-01

    Because of the methodological complexity of network meta-analyses (NMAs), NMAs may be more vulnerable to methodological risks than conventional pair-wise meta-analysis. Our study aims to investigate epidemiological characteristics, the conduct of the literature search, methodological quality and the reporting of the statistical analysis process in the field of cancer, based on the PRISMA extension statement and a modified AMSTAR checklist. We identified and included 102 NMAs in the field of cancer. 61 NMAs were conducted using a Bayesian framework. Of them, more than half of NMAs did not report assessment of convergence (60.66%). Inconsistency was assessed in 27.87% of NMAs. Assessment of heterogeneity in traditional meta-analyses was more common (42.62%) than in NMAs (6.56%). Most NMAs did not report assessment of similarity (86.89%) and did not use the GRADE tool to assess quality of evidence (95.08%). 43 NMAs used adjusted indirect comparisons; the methods used were described in 53.49% of NMAs. Only 4.65% of NMAs described the details of handling multi-group trials and 6.98% described the methods of similarity assessment. The median total AMSTAR-score was 8.00 (IQR: 6.00–8.25). Methodological quality and reporting of statistical analysis did not substantially differ by selected general characteristics. Overall, the quality of NMAs in the field of cancer was generally acceptable.

  17. A new method to reduce the statistical and systematic uncertainty of chance coincidence backgrounds measured with waveform digitizers

    DOE PAGES

    O'Donnell, John M.

    2015-06-30

    We present a new method for measuring chance-coincidence backgrounds during the collection of coincidence data. The method relies on acquiring data with near-zero dead time, which is now realistic due to the increasing deployment of flash electronic-digitizer (waveform digitizer) techniques. An experiment designed to use this new method is capable of acquiring more coincidence data, with a much reduced statistical fluctuation of the measured background. A statistical analysis is presented, and used to derive a figure of merit for the new method. Factors of four improvement over other analyses are realistic. The technique is illustrated with preliminary data taken as part of a program to make new measurements of the prompt fission neutron spectra at Los Alamos Neutron Science Center. In conclusion, it is expected that these measurements will occur in a regime where the maximum figure of merit will be exploited.

  18. A Fast Framework for Abrupt Change Detection Based on Binary Search Trees and Kolmogorov Statistic

    PubMed Central

    Qi, Jin-Peng; Qi, Jie; Zhang, Qing

    2016-01-01

    Change-Point (CP) detection has attracted considerable attention in the fields of data mining and statistics; it is very meaningful to discuss how to quickly and efficiently detect abrupt change from large-scale bioelectric signals. Currently, most of the existing methods, like Kolmogorov-Smirnov (KS) statistic and so forth, are time-consuming, especially for large-scale datasets. In this paper, we propose a fast framework for abrupt change detection based on binary search trees (BSTs) and a modified KS statistic, named BSTKS (binary search trees and Kolmogorov statistic). In this method, first, two binary search trees, termed as BSTcA and BSTcD, are constructed by multilevel Haar Wavelet Transform (HWT); second, three search criteria are introduced in terms of the statistic and variance fluctuations in the diagnosed time series; last, an optimal search path is detected from the root to leaf nodes of two BSTs. The studies on both the synthetic time series samples and the real electroencephalograph (EEG) recordings indicate that the proposed BSTKS can detect abrupt change more quickly and efficiently than KS, t-statistic (t), and Singular-Spectrum Analyses (SSA) methods, with the shortest computation time, the highest hit rate, the smallest error, and the highest accuracy out of four methods. This study suggests that the proposed BSTKS is very helpful for useful information inspection on all kinds of bioelectric time series signals. PMID:27413364
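
    For orientation, the sketch below shows the naive KS-based change-point scan that BSTKS is designed to accelerate, not the tree-based method itself: every candidate split point is scored by the two-sample KS statistic between the two segments. The signal is synthetic.

    ```python
    # Brute-force single change-point detection: score every candidate split with
    # the two-sample Kolmogorov-Smirnov statistic and return the best split.
    import numpy as np
    from scipy import stats

    def ks_change_point(x, min_seg=20):
        best_k, best_d = None, -1.0
        for k in range(min_seg, len(x) - min_seg):
            d = stats.ks_2samp(x[:k], x[k:]).statistic
            if d > best_d:
                best_k, best_d = k, d
        return best_k, best_d

    rng = np.random.default_rng(6)
    signal = np.r_[rng.normal(0, 1, 300), rng.normal(1.5, 1, 300)]   # abrupt mean shift at index 300
    print(ks_change_point(signal))        # expected change point near index 300
    ```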

  19. A Fast Framework for Abrupt Change Detection Based on Binary Search Trees and Kolmogorov Statistic.

    PubMed

    Qi, Jin-Peng; Qi, Jie; Zhang, Qing

    2016-01-01

    Change-Point (CP) detection has attracted considerable attention in the fields of data mining and statistics; it is very meaningful to discuss how to quickly and efficiently detect abrupt change from large-scale bioelectric signals. Currently, most of the existing methods, like Kolmogorov-Smirnov (KS) statistic and so forth, are time-consuming, especially for large-scale datasets. In this paper, we propose a fast framework for abrupt change detection based on binary search trees (BSTs) and a modified KS statistic, named BSTKS (binary search trees and Kolmogorov statistic). In this method, first, two binary search trees, termed as BSTcA and BSTcD, are constructed by multilevel Haar Wavelet Transform (HWT); second, three search criteria are introduced in terms of the statistic and variance fluctuations in the diagnosed time series; last, an optimal search path is detected from the root to leaf nodes of two BSTs. The studies on both the synthetic time series samples and the real electroencephalograph (EEG) recordings indicate that the proposed BSTKS can detect abrupt change more quickly and efficiently than KS, t-statistic (t), and Singular-Spectrum Analyses (SSA) methods, with the shortest computation time, the highest hit rate, the smallest error, and the highest accuracy out of four methods. This study suggests that the proposed BSTKS is very helpful for useful information inspection on all kinds of bioelectric time series signals.

  20. How Historical Information Can Improve Extreme Value Analysis of Coastal Water Levels

    NASA Astrophysics Data System (ADS)

    Le Cozannet, G.; Bulteau, T.; Idier, D.; Lambert, J.; Garcin, M.

    2016-12-01

    The knowledge of extreme coastal water levels is useful for coastal flooding studies or the design of coastal defences. While deriving such extremes with standard analyses using tide gauge measurements, one often needs to deal with limited effective duration of observation which can result in large statistical uncertainties. This is even truer when one faces outliers, those particularly extreme values distant from the others. In a recent work (Bulteau et al., 2015), we investigated how historical information of past events reported in archives can reduce statistical uncertainties and relativize such outlying observations. We adapted a Bayesian Markov Chain Monte Carlo method, initially developed in the hydrology field (Reis and Stedinger, 2005), to the specific case of coastal water levels. We applied this method to the site of La Rochelle (France), where the storm Xynthia in 2010 generated a water level considered so far as an outlier. Based on 30 years of tide gauge measurements and 8 historical events since 1890, the results showed a significant decrease in statistical uncertainties on return levels when historical information is used. Also, Xynthia's water level no longer appeared as an outlier and we could have reasonably predicted the annual exceedance probability of that level beforehand (predictive probability for 2010 based on data until the end of 2009 of the same order of magnitude as the standard estimative probability using data until the end of 2010). Such results illustrate the usefulness of historical information in extreme value analyses of coastal water levels, as well as the relevance of the proposed method to integrate heterogeneous data in such analyses.

  1. The analysis of morphometric data on rocky mountain wolves and artic wolves using statistical method

    NASA Astrophysics Data System (ADS)

    Ammar Shafi, Muhammad; Saifullah Rusiman, Mohd; Hamzah, Nor Shamsidah Amir; Nor, Maria Elena; Ahmad, Noor’ani; Azia Hazida Mohamad Azmi, Nur; Latip, Muhammad Faez Ab; Hilmi Azman, Ahmad

    2018-04-01

    Morphometrics is the quantitative analysis of the shape and size of specimens. Morphometric analyses are commonly used to analyse the fossil record and the shape and size of specimens, among other applications. The aim of the study is to find the differences between rocky mountain wolves and arctic wolves based on gender. The sample utilised secondary data comprising seven independent variables and two dependent variables. Statistical modelling was used in the analysis, namely analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA). The results showed significant differences between arctic wolves and rocky mountain wolves with respect to the independent factors and gender.
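
    A sketch of the kind of MANOVA applied in the study, using statsmodels on an invented morphometric data frame (the study's actual variables and measurements are not reproduced here):

    ```python
    # MANOVA of two fake morphometric measurements against subspecies and gender.
    import numpy as np
    import pandas as pd
    from statsmodels.multivariate.manova import MANOVA

    rng = np.random.default_rng(7)
    n = 40
    df = pd.DataFrame({
        "subspecies": np.repeat(["rocky_mountain", "arctic"], n),
        "gender": np.tile(np.repeat(["male", "female"], n // 2), 2),
        "skull_length": rng.normal(250, 8, 2 * n) + np.repeat([5, 0], n),   # built-in subspecies difference
        "skull_width": rng.normal(140, 5, 2 * n) + np.repeat([3, 0], n),
    })

    fit = MANOVA.from_formula("skull_length + skull_width ~ subspecies + gender", data=df)
    print(fit.mv_test())                  # Wilks' lambda, Pillai's trace, etc.
    ```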

  2. Metamodels for Computer-Based Engineering Design: Survey and Recommendations

    NASA Technical Reports Server (NTRS)

    Simpson, Timothy W.; Peplinski, Jesse; Koch, Patrick N.; Allen, Janet K.

    1997-01-01

    The use of statistical techniques to build approximations of expensive computer analysis codes pervades much of today's engineering design. These statistical approximations, or metamodels, are used to replace the actual expensive computer analyses, facilitating multidisciplinary, multiobjective optimization and concept exploration. In this paper we review several of these techniques, including design of experiments, response surface methodology, Taguchi methods, neural networks, inductive learning, and kriging. We survey their existing application in engineering design and then address the dangers of applying traditional statistical techniques to approximate deterministic computer analysis codes. We conclude with recommendations for the appropriate use of statistical approximation techniques in given situations and how common pitfalls can be avoided.

  3. Cluster-level statistical inference in fMRI datasets: The unexpected behavior of random fields in high dimensions.

    PubMed

    Bansal, Ravi; Peterson, Bradley S

    2018-06-01

    Identifying regional effects of interest in MRI datasets usually entails testing a priori hypotheses across many thousands of brain voxels, requiring control for false positive findings in these multiple hypotheses testing. Recent studies have suggested that parametric statistical methods may have incorrectly modeled functional MRI data, thereby leading to higher false positive rates than their nominal rates. Nonparametric methods for statistical inference when conducting multiple statistical tests, in contrast, are thought to produce false positives at the nominal rate, which has thus led to the suggestion that previously reported studies should reanalyze their fMRI data using nonparametric tools. To understand better why parametric methods may yield excessive false positives, we assessed their performance when applied both to simulated datasets of 1D, 2D, and 3D Gaussian Random Fields (GRFs) and to 710 real-world, resting-state fMRI datasets. We showed that both the simulated 2D and 3D GRFs and the real-world data contain a small percentage (<6%) of very large clusters (on average 60 times larger than the average cluster size), which were not present in 1D GRFs. These unexpectedly large clusters were deemed statistically significant using parametric methods, leading to empirical familywise error rates (FWERs) as high as 65%: the high empirical FWERs were not a consequence of parametric methods failing to model spatial smoothness accurately, but rather of these very large clusters that are inherently present in smooth, high-dimensional random fields. In fact, when discounting these very large clusters, the empirical FWER for parametric methods was 3.24%. Furthermore, even an empirical FWER of 65% would yield on average less than one of those very large clusters in each brain-wide analysis. Nonparametric methods, in contrast, estimated distributions from those large clusters, and therefore, by construct rejected the large clusters as false positives at the nominal FWERs. Those rejected clusters were outlying values in the distribution of cluster size but cannot be distinguished from true positive findings without further analyses, including assessing whether fMRI signal in those regions correlates with other clinical, behavioral, or cognitive measures. Rejecting the large clusters, however, significantly reduced the statistical power of nonparametric methods in detecting true findings compared with parametric methods, which would have detected most true findings that are essential for making valid biological inferences in MRI data. Parametric analyses, in contrast, detected most true findings while generating relatively few false positives: on average, less than one of those very large clusters would be deemed a true finding in each brain-wide analysis. We therefore recommend the continued use of parametric methods that model nonstationary smoothness for cluster-level, familywise control of false positives, particularly when using a Cluster Defining Threshold of 2.5 or higher, and subsequently assessing rigorously the biological plausibility of the findings, even for large clusters. Finally, because nonparametric methods yielded a large reduction in statistical power to detect true positive findings, we conclude that the modest reduction in false positive findings that nonparametric analyses afford does not warrant a re-analysis of previously published fMRI studies using nonparametric techniques. Copyright © 2018 Elsevier Inc. All rights reserved.

  4. Behavior, sensitivity, and power of activation likelihood estimation characterized by massive empirical simulation.

    PubMed

    Eickhoff, Simon B; Nichols, Thomas E; Laird, Angela R; Hoffstaedter, Felix; Amunts, Katrin; Fox, Peter T; Bzdok, Danilo; Eickhoff, Claudia R

    2016-08-15

    Given the increasing number of neuroimaging publications, the automated knowledge extraction on brain-behavior associations by quantitative meta-analyses has become a highly important and rapidly growing field of research. Among several methods to perform coordinate-based neuroimaging meta-analyses, Activation Likelihood Estimation (ALE) has been widely adopted. In this paper, we addressed two pressing questions related to ALE meta-analysis: i) Which thresholding method is most appropriate to perform statistical inference? ii) Which sample size, i.e., number of experiments, is needed to perform robust meta-analyses? We provided quantitative answers to these questions by simulating more than 120,000 meta-analysis datasets using empirical parameters (i.e., number of subjects, number of reported foci, distribution of activation foci) derived from the BrainMap database. This allowed us to characterize the behavior of ALE analyses, to derive first power estimates for neuroimaging meta-analyses, and to thus formulate recommendations for future ALE studies. We could show as a first consequence that cluster-level family-wise error (FWE) correction represents the most appropriate method for statistical inference, while voxel-level FWE correction is valid but more conservative. In contrast, uncorrected inference and false-discovery rate correction should be avoided. As a second consequence, researchers should aim to include at least 20 experiments into an ALE meta-analysis to achieve sufficient power for moderate effects. We would like to note, though, that these calculations and recommendations are specific to ALE and may not be extrapolated to other approaches for (neuroimaging) meta-analysis. Copyright © 2016 Elsevier Inc. All rights reserved.

  5. Behavior, Sensitivity, and power of activation likelihood estimation characterized by massive empirical simulation

    PubMed Central

    Eickhoff, Simon B.; Nichols, Thomas E.; Laird, Angela R.; Hoffstaedter, Felix; Amunts, Katrin; Fox, Peter T.

    2016-01-01

    Given the increasing number of neuroimaging publications, the automated knowledge extraction on brain-behavior associations by quantitative meta-analyses has become a highly important and rapidly growing field of research. Among several methods to perform coordinate-based neuroimaging meta-analyses, Activation Likelihood Estimation (ALE) has been widely adopted. In this paper, we addressed two pressing questions related to ALE meta-analysis: i) Which thresholding method is most appropriate to perform statistical inference? ii) Which sample size, i.e., number of experiments, is needed to perform robust meta-analyses? We provided quantitative answers to these questions by simulating more than 120,000 meta-analysis datasets using empirical parameters (i.e., number of subjects, number of reported foci, distribution of activation foci) derived from the BrainMap database. This allowed us to characterize the behavior of ALE analyses, to derive first power estimates for neuroimaging meta-analyses, and to thus formulate recommendations for future ALE studies. We could show as a first consequence that cluster-level family-wise error (FWE) correction represents the most appropriate method for statistical inference, while voxel-level FWE correction is valid but more conservative. In contrast, uncorrected inference and false-discovery rate correction should be avoided. As a second consequence, researchers should aim to include at least 20 experiments into an ALE meta-analysis to achieve sufficient power for moderate effects. We would like to note, though, that these calculations and recommendations are specific to ALE and may not be extrapolated to other approaches for (neuroimaging) meta-analysis. PMID:27179606

  6. A visual basic program to generate sediment grain-size statistics and to extrapolate particle distributions

    USGS Publications Warehouse

    Poppe, L.J.; Eliason, A.H.; Hastings, M.E.

    2004-01-01

    Measures that describe and summarize sediment grain-size distributions are important to geologists because of the large amount of information contained in textural data sets. Statistical methods are usually employed to simplify the necessary comparisons among samples and quantify the observed differences. The two statistical methods most commonly used by sedimentologists to describe particle distributions are mathematical moments (Krumbein and Pettijohn, 1938) and inclusive graphics (Folk, 1974). The choice of which of these statistical measures to use is typically governed by the amount of data available (Royse, 1970). If the entire distribution is known, the method of moments may be used; if the next to last accumulated percent is greater than 95, inclusive graphics statistics can be generated. Unfortunately, earlier programs designed to describe sediment grain-size distributions statistically do not run in a Windows environment, do not allow extrapolation of the distribution's tails, or do not generate both moment and graphic statistics (Kane and Hubert, 1963; Collias et al., 1963; Schlee and Webster, 1967; Poppe et al., 2000). Owing to analytical limitations, electro-resistance multichannel particle-size analyzers, such as Coulter Counters, commonly truncate the tails of the fine-fraction part of grain-size distributions. These devices do not detect fine clay in the 0.6–0.1 μm range (part of the 11-phi and all of the 12-phi and 13-phi fractions). Although size analyses performed down to 0.6 μm are adequate for most freshwater and near shore marine sediments, samples from many deeper water marine environments (e.g. rise and abyssal plain) may contain significant material in the fine clay fraction, and these analyses benefit from extrapolation. The program (GSSTAT) described herein generates statistics to characterize sediment grain-size distributions and can extrapolate the fine-grained end of the particle distribution. It is written in Microsoft Visual Basic 6.0 and provides a window to facilitate program execution. The input for the sediment fractions is weight percentages in whole-phi notation (Krumbein, 1934; Inman, 1952), and the program permits the user to select output in either method of moments or inclusive graphics statistics (Fig. 1). Users select options primarily with mouse-click events, or through interactive dialogue boxes.
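
    For reference, the method-of-moments statistics can be computed from whole-phi weight percentages as sketched below (illustrative distribution; GSSTAT's inclusive-graphics statistics and fine-tail extrapolation are not shown).

    ```python
    # Method-of-moments grain-size statistics (mean, sorting, skewness, kurtosis)
    # from weight percentages in whole-phi classes; the distribution is made up.
    import numpy as np

    phi_midpoints = np.arange(-1, 12)                     # whole-phi class midpoints, -1 to 11 phi
    weight_pct = np.array([1, 2, 5, 10, 18, 22, 16, 10, 7, 4, 3, 1, 1], dtype=float)
    w = weight_pct / weight_pct.sum()                     # fractional weights

    mean_phi = np.sum(w * phi_midpoints)                                   # 1st moment
    sorting = np.sqrt(np.sum(w * (phi_midpoints - mean_phi) ** 2))         # 2nd moment (standard deviation)
    skewness = np.sum(w * (phi_midpoints - mean_phi) ** 3) / sorting ** 3  # 3rd moment
    kurtosis = np.sum(w * (phi_midpoints - mean_phi) ** 4) / sorting ** 4  # 4th moment

    print(f"mean = {mean_phi:.2f} phi, sorting = {sorting:.2f}, "
          f"skewness = {skewness:.2f}, kurtosis = {kurtosis:.2f}")
    ```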

  7. Fast and accurate imputation of summary statistics enhances evidence of functional enrichment.

    PubMed

    Pasaniuc, Bogdan; Zaitlen, Noah; Shi, Huwenbo; Bhatia, Gaurav; Gusev, Alexander; Pickrell, Joseph; Hirschhorn, Joel; Strachan, David P; Patterson, Nick; Price, Alkes L

    2014-10-15

    Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov models (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1-5%) variants [increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples] versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case-control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of χ² association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Publicly available software package available at http://bogdan.bioinformatics.ucla.edu/software/. bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu Supplementary materials are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
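
    The core of Gaussian summary-statistic imputation is the textbook conditional-mean formula sketched below, in which the z-score at an untyped SNP is predicted from typed z-scores and an LD (correlation) matrix with a small ridge term; the published method's correction for the finite reference-panel size is not reproduced here, and the LD values are invented.

    ```python
    # Conditional-Gaussian imputation of an untyped SNP's z-score from typed
    # z-scores and reference-panel LD, with ridge regularization for stability.
    import numpy as np

    def impute_z(z_typed, ld_typed, ld_cross, lam=0.1):
        """z_typed: (t,) z-scores; ld_typed: (t, t) LD among typed SNPs;
        ld_cross: (t,) LD between the untyped SNP and each typed SNP."""
        t = len(z_typed)
        weights = np.linalg.solve(ld_typed + lam * np.eye(t), ld_cross)
        return weights @ z_typed

    ld_typed = np.array([[1.0, 0.6, 0.3],
                         [0.6, 1.0, 0.5],
                         [0.3, 0.5, 1.0]])        # made-up LD among three typed SNPs
    ld_cross = np.array([0.7, 0.8, 0.4])          # made-up LD with the untyped SNP
    z_typed = np.array([2.1, 2.8, 1.2])
    print(impute_z(z_typed, ld_typed, ld_cross))
    ```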

  8. Meta-analyses and Forest plots using a microsoft excel spreadsheet: step-by-step guide focusing on descriptive data analysis.

    PubMed

    Neyeloff, Jeruza L; Fuchs, Sandra C; Moreira, Leila B

    2012-01-20

    Meta-analyses are necessary to synthesize data obtained from primary research, and in many situations reviews of observational studies are the only available alternative. General purpose statistical packages can meta-analyze data, but usually require external macros or coding. Commercial specialist software is available, but may be expensive and focused on a particular type of primary data. Most available software packages have limitations in dealing with descriptive data, and the graphical display of summary statistics such as incidence and prevalence is unsatisfactory. Analyses can be conducted using Microsoft Excel, but there was no previous guide available. We constructed a step-by-step guide to perform a meta-analysis in a Microsoft Excel spreadsheet, using either fixed-effect or random-effects models. We have also developed a second spreadsheet capable of producing customized forest plots. It is possible to conduct a meta-analysis using only Microsoft Excel. More important, to our knowledge this is the first description of a method for producing a statistically adequate but graphically appealing forest plot summarizing descriptive data, using widely available software.
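
    For readers who prefer code to a spreadsheet, the same fixed-effect computation and a bare-bones forest plot can be sketched in Python as below (invented study estimates; this is not the spreadsheet described in the article).

    ```python
    # Inverse-variance fixed-effect pooling and a minimal forest plot.
    import numpy as np
    import matplotlib.pyplot as plt

    studies = ["Study A", "Study B", "Study C", "Study D"]
    estimates = np.array([0.12, 0.20, 0.08, 0.15])            # e.g. prevalences (invented)
    ses = np.array([0.03, 0.05, 0.02, 0.04])                  # standard errors (invented)

    weights = 1 / ses**2                                       # inverse-variance weights
    pooled = np.sum(weights * estimates) / np.sum(weights)
    pooled_se = np.sqrt(1 / np.sum(weights))

    fig, ax = plt.subplots()
    y = np.arange(len(studies))[::-1] + 1
    ax.errorbar(estimates, y, xerr=1.96 * ses, fmt="s", color="k")           # study estimates
    ax.errorbar([pooled], [0], xerr=[1.96 * pooled_se], fmt="D", color="b")  # pooled estimate
    ax.set_yticks(np.r_[y, 0])
    ax.set_yticklabels(studies + ["Pooled (fixed effect)"])
    ax.axvline(pooled, linestyle="--", linewidth=0.8)
    ax.set_xlabel("Estimate (95% CI)")
    plt.tight_layout()
    plt.show()
    ```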

  9. Meta-analyses and Forest plots using a microsoft excel spreadsheet: step-by-step guide focusing on descriptive data analysis

    PubMed Central

    2012-01-01

    Background Meta-analyses are necessary to synthesize data obtained from primary research, and in many situations reviews of observational studies are the only available alternative. General purpose statistical packages can meta-analyze data, but usually require external macros or coding. Commercial specialist software is available, but may be expensive and focused on a particular type of primary data. Most available software packages have limitations in dealing with descriptive data, and the graphical display of summary statistics such as incidence and prevalence is unsatisfactory. Analyses can be conducted using Microsoft Excel, but there was no previous guide available. Findings We constructed a step-by-step guide to perform a meta-analysis in a Microsoft Excel spreadsheet, using either fixed-effect or random-effects models. We have also developed a second spreadsheet capable of producing customized forest plots. Conclusions It is possible to conduct a meta-analysis using only Microsoft Excel. More important, to our knowledge this is the first description of a method for producing a statistically adequate but graphically appealing forest plot summarizing descriptive data, using widely available software. PMID:22264277

  10. The Need for Speed in Rodent Locomotion Analyses

    PubMed Central

    Batka, Richard J.; Brown, Todd J.; Mcmillan, Kathryn P.; Meadows, Rena M.; Jones, Kathryn J.; Haulcomb, Melissa M.

    2016-01-01

    Locomotion analysis is now widely used across many animal species to understand the motor defects in disease, functional recovery following neural injury, and the effectiveness of various treatments. More recently, rodent locomotion analysis has become an increasingly popular method in a diverse range of research. Speed is an inseparable aspect of locomotion that is still not fully understood, and its effects are often not properly incorporated while analyzing data. In this hybrid manuscript, we accomplish three things: (1) review the interaction between speed and locomotion variables in rodent studies, (2) comprehensively analyze the relationship between speed and 162 locomotion variables in a group of 16 wild-type mice using the CatWalk gait analysis system, and (3) develop and test a statistical method in which locomotion variables are analyzed and reported in the context of speed. Notable results include the following: (1) over 90% of variables, reported by CatWalk, were dependent on speed with an average R2 value of 0.624, (2) most variables were related to speed in a nonlinear manner, (3) current methods of controlling for speed are insufficient, and (4) the linear mixed model is an appropriate and effective statistical method for locomotion analyses that is inclusive of speed-dependent relationships. Given the pervasive dependency of locomotion variables on speed, we maintain that valid conclusions from locomotion analyses cannot be made unless they are analyzed and reported within the context of speed. PMID:24890845
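
    A sketch of the recommended kind of linear mixed model, using statsmodels on simulated data: a gait variable is regressed on group and walking speed, with a random intercept per animal to account for repeated runs (variable names and values are invented).

    ```python
    # Linear mixed model: stride ~ group + speed, random intercept per animal.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(8)
    n_mice, runs = 16, 10
    animal = np.repeat(np.arange(n_mice), runs)
    group = (animal < n_mice // 2).astype(int)                 # 0 = control, 1 = injured
    speed = rng.uniform(10, 40, n_mice * runs)                 # walking speed, cm/s
    animal_effect = rng.normal(0, 2, n_mice)[animal]           # per-animal random intercept
    stride = 30 + 0.8 * speed - 4 * group + animal_effect + rng.normal(0, 2, n_mice * runs)

    df = pd.DataFrame({"animal": animal, "group": group, "speed": speed, "stride": stride})
    model = smf.mixedlm("stride ~ group + speed", df, groups=df["animal"])
    print(model.fit().summary())
    ```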

  11. Exploratory study on a statistical method to analyse time resolved data obtained during nanomaterial exposure measurements

    NASA Astrophysics Data System (ADS)

    Clerc, F.; Njiki-Menga, G.-H.; Witschger, O.

    2013-04-01

    Most of the measurement strategies that are suggested at the international level to assess workplace exposure to nanomaterials rely on devices measuring, in real time, airborne particle concentrations (according to different metrics). Since none of the instruments to measure aerosols can distinguish a particle of interest from the background aerosol, the statistical analysis of time resolved data requires special attention. So far, very few approaches have been used for statistical analysis in the literature. This ranges from simple qualitative analysis of graphs to the implementation of more complex statistical models. To date, there is still no consensus on a particular approach, and the search for an appropriate and robust method continues. In this context, this exploratory study investigates a statistical method to analyse time resolved data based on a Bayesian probabilistic approach. To investigate and illustrate the use of this statistical method, particle number concentration data from a workplace study that investigated the potential for exposure via inhalation from cleanout operations by sandpapering of a reactor producing nanocomposite thin films have been used. In this workplace study, the background issue has been addressed through the near-field and far-field approaches and several size integrated and time resolved devices have been used. The analysis of the results presented here focuses only on data obtained with two handheld condensation particle counters. While one was measuring at the source of the released particles, the other one was measuring in parallel far-field. The Bayesian probabilistic approach allows a probabilistic modelling of data series, and the observed task is modelled in the form of probability distributions. The probability distributions issuing from time resolved data obtained at the source can be compared with the probability distributions issuing from the time resolved data obtained far-field, leading to a quantitative estimate of the airborne particles released at the source when the task is performed. Beyond the results obtained, this exploratory study indicates that the analysis of the results requires specific experience in statistics.

  12. Methodological reporting of randomized trials in five leading Chinese nursing journals.

    PubMed

    Shi, Chunhu; Tian, Jinhui; Ren, Dan; Wei, Hongli; Zhang, Lihuan; Wang, Quan; Yang, Kehu

    2014-01-01

    Randomized controlled trials (RCTs) are not always well reported, especially in terms of their methodological descriptions. This study aimed to investigate the adherence of methodological reporting complying with CONSORT and explore associated trial level variables in the Chinese nursing care field. In June 2012, we identified RCTs published in five leading Chinese nursing journals and included trials with details of randomized methods. The quality of methodological reporting was measured through the methods section of the CONSORT checklist and the overall CONSORT methodological items score was calculated and expressed as a percentage. Meanwhile, we hypothesized that some general and methodological characteristics were associated with reporting quality and conducted a regression with these data to explore the correlation. The descriptive and regression statistics were calculated via SPSS 13.0. In total, 680 RCTs were included. The overall CONSORT methodological items score was 6.34 ± 0.97 (Mean ± SD). No RCT reported descriptions and changes in "trial design," changes in "outcomes" and "implementation," or descriptions of the similarity of interventions for "blinding." Poor reporting was found in detailing the "settings of participants" (13.1%), "type of randomization sequence generation" (1.8%), calculation methods of "sample size" (0.4%), explanation of any interim analyses and stopping guidelines for "sample size" (0.3%), "allocation concealment mechanism" (0.3%), additional analyses in "statistical methods" (2.1%), and targeted subjects and methods of "blinding" (5.9%). More than 50% of trials described randomization sequence generation, the eligibility criteria of "participants," "interventions," and definitions of the "outcomes" and "statistical methods." The regression analysis found that publication year and ITT analysis were weakly associated with CONSORT score. The completeness of methodological reporting of RCTs in the Chinese nursing care field is poor, especially with regard to the reporting of trial design, changes in outcomes, sample size calculation, allocation concealment, blinding, and statistical methods.

  13. Phylogenetic relationships of South American lizards of the genus Stenocercus (Squamata: Iguania): A new approach using a general mixture model for gene sequence data.

    PubMed

    Torres-Carvajal, Omar; Schulte, James A; Cadle, John E

    2006-04-01

    The South American iguanian lizard genus Stenocercus includes 54 species occurring mostly in the Andes and adjacent lowland areas from northern Venezuela and Colombia to central Argentina at elevations of 0-4000m. Small taxon or character sampling has characterized all phylogenetic analyses of Stenocercus, which has long been recognized as sister taxon to the Tropidurus Group. In this study, we use mtDNA sequence data to perform phylogenetic analyses that include 32 species of Stenocercus and 12 outgroup taxa. Monophyly of this genus is strongly supported by maximum parsimony and Bayesian analyses. Evolutionary relationships within Stenocercus are further analyzed with a Bayesian implementation of a general mixture model, which accommodates variability in the pattern of evolution across sites. These analyses indicate a basal split of Stenocercus into two clades, one of which receives very strong statistical support. In addition, we test previous hypotheses using non-parametric and parametric statistical methods, and provide a phylogenetic classification for Stenocercus.

  14. Decomposing biodiversity data using the Latent Dirichlet Allocation model, a probabilistic multivariate statistical method

    Treesearch

    Denis Valle; Benjamin Baiser; Christopher W. Woodall; Robin Chazdon; Jerome Chave

    2014-01-01

    We propose a novel multivariate method to analyse biodiversity data based on the Latent Dirichlet Allocation (LDA) model. LDA, a probabilistic model, reduces assemblages to sets of distinct component communities. It produces easily interpretable results, can represent abrupt and gradual changes in composition, accommodates missing data and allows for coherent estimates...
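
    A minimal sketch of the general LDA decomposition described above, using scikit-learn on a simulated site-by-species abundance matrix; the authors' own implementation and data differ.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
n_sites, n_species = 50, 30
abundance = rng.poisson(lam=3, size=(n_sites, n_species))   # simulated count table

lda = LatentDirichletAllocation(n_components=3, random_state=0)
site_mixtures = lda.fit_transform(abundance)   # proportion of each component community per site
community_profiles = lda.components_           # species composition of each component community

print(site_mixtures[:5].round(2))              # rows reveal gradual or abrupt turnover across sites
```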

  15. The statistical reporting quality of articles published in 2010 in five dental journals.

    PubMed

    Vähänikkilä, Hannu; Tjäderhane, Leo; Nieminen, Pentti

    2015-01-01

    Statistical methods play an important role in medical and dental research. Earlier studies have observed that the current use of statistical methods and reporting of statistics are responsible for some of the errors in the interpretation of results. The aim of this study was to investigate the quality of statistical reporting in dental research articles. A total of 200 articles published in 2010 were analysed covering five dental journals: Journal of Dental Research, Caries Research, Community Dentistry and Oral Epidemiology, Journal of Dentistry and Acta Odontologica Scandinavica. Each paper underwent careful scrutiny for the use of statistical methods and reporting. A paper with at least one poor reporting item was classified as 'problems with reporting statistics' and a paper without any poor reporting item as 'acceptable'. The investigation showed that 18 (9%) papers were acceptable and 182 (91%) papers contained at least one poor reporting item. The proportion of papers with at least one poor reporting item in this survey was high (91%). Authors publishing in dental journals should be encouraged to improve the statistical sections of their research articles and to present the results in a way that is in line with the policy and presentation of the leading dental journals.

  16. Effects of different preservation methods on inter simple sequence repeat (ISSR) and random amplified polymorphic DNA (RAPD) molecular markers in botanic samples.

    PubMed

    Wang, Xiaolong; Li, Lin; Zhao, Jiaxin; Li, Fangliang; Guo, Wei; Chen, Xia

    2017-04-01

    To evaluate the effects of different preservation methods (storage in a -20°C ice chest, preservation in liquid nitrogen and drying in silica gel) on inter simple sequence repeat (ISSR) and random amplified polymorphic DNA (RAPD) analyses of various botanical specimens (including broad-leaved, needle-leaved and succulent plants) over different times (three weeks and three years), we performed a statistical analysis based on the number of bands, a genetic index and cluster analysis. The results demonstrate that all of these preservation methods can provide sufficient amounts of genomic DNA for ISSR and RAPD analyses; however, the effects of the different preservation methods on these analyses vary significantly, whereas preservation time has little effect. Our results provide a reference for researchers selecting the most suitable preservation method for their study material when analysing molecular markers based on genomic DNA. Copyright © 2017 Académie des sciences. Published by Elsevier Masson SAS. All rights reserved.

  17. STRengthening analytical thinking for observational studies: the STRATOS initiative.

    PubMed

    Sauerbrei, Willi; Abrahamowicz, Michal; Altman, Douglas G; le Cessie, Saskia; Carpenter, James

    2014-12-30

    The validity and practical utility of observational medical research depends critically on good study design, excellent data quality, appropriate statistical methods and accurate interpretation of results. Statistical methodology has seen substantial development in recent times. Unfortunately, many of these methodological developments are ignored in practice. Consequently, design and analysis of observational studies often exhibit serious weaknesses. The lack of guidance on vital practical issues discourages many applied researchers from using more sophisticated and possibly more appropriate methods when analyzing observational studies. Furthermore, many analyses are conducted by researchers with a relatively weak statistical background and limited experience in using statistical methodology and software. Consequently, even 'standard' analyses reported in the medical literature are often flawed, casting doubt on their results and conclusions. An efficient way to help researchers to keep up with recent methodological developments is to develop guidance documents that are spread to the research community at large. These observations led to the initiation of the strengthening analytical thinking for observational studies (STRATOS) initiative, a large collaboration of experts in many different areas of biostatistical research. The objective of STRATOS is to provide accessible and accurate guidance in the design and analysis of observational studies. The guidance is intended for applied statisticians and other data analysts with varying levels of statistical education, experience and interests. In this article, we introduce the STRATOS initiative and its main aims, present the need for guidance documents and outline the planned approach and progress so far. We encourage other biostatisticians to become involved. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.

  18. Drying method has no substantial effect on δ(15)N or δ(13)C values of muscle tissue from teleost fishes.

    PubMed

    Bessey, Cindy; Vanderklift, Mathew A

    2014-02-15

    Stable isotope analysis (SIA) is a powerful tool in many fields of research that enables quantitative comparisons among studies, if similar methods have been used. The goal of this study was to determine if three different drying methods commonly used to prepare samples for SIA yielded different δ(15)N and δ(13)C values. Muscle subsamples from 10 individuals each of three teleost species were dried using three methods: (i) oven, (ii) food dehydrator, and (iii) freeze-dryer. All subsamples were analysed for δ(15)N and δ(13)C values, and nitrogen and carbon content, using a continuous flow system consisting of a Delta V Plus mass spectrometer and a Flush 1112 elemental analyser via a Conflo IV universal interface. The δ(13)C values were normalized to constant lipid content using the equations proposed by McConnaughey and McRoy. Although statistically significant, the differences in δ(15)N values between the drying methods were small (mean differences ≤0.21‰). The differences in δ(13)C values between the drying methods were not statistically significant, and normalising the δ(13)C values to constant lipid content reduced the mean differences for all treatments to ≤0.65‰. A statistically significant difference of ~2% in C content existed between tissues dried in a food dehydrator and those dried in a freeze-dryer for two fish species. There was no significant effect of fish size on the differences between methods. No substantial effect of drying method was found on the δ(15)N or δ(13)C values of teleost muscle tissue. Copyright © 2013 John Wiley & Sons, Ltd.

  19. On Improving the Quality and Interpretation of Environmental Assessments using Statistical Analysis and Geographic Information Systems

    NASA Astrophysics Data System (ADS)

    Karuppiah, R.; Faldi, A.; Laurenzi, I.; Usadi, A.; Venkatesh, A.

    2014-12-01

    An increasing number of studies are focused on assessing the environmental footprint of different products and processes, especially using life cycle assessment (LCA). This work shows how combining statistical methods and Geographic Information Systems (GIS) with environmental analyses can help improve the quality of results and their interpretation. Most environmental assessments in the literature yield single numbers that characterize the environmental impact of a process/product - typically global or country averages, often unchanging in time. In this work, we show how statistical analysis and GIS can help address these limitations. For example, we demonstrate a method to separately quantify uncertainty and variability in the result of LCA models using a power generation case study. This is important for rigorous comparisons between the impacts of different processes. Another challenge is lack of data that can affect the rigor of LCAs. We have developed an approach to estimate environmental impacts of incompletely characterized processes using predictive statistical models. This method is applied to estimate unreported coal power plant emissions in several world regions. There is also a general lack of spatio-temporal characterization of the results in environmental analyses. For instance, studies that focus on water usage do not put in context where and when water is withdrawn. Through the use of hydrological modeling combined with GIS, we quantify water stress on a regional and seasonal basis to understand water supply and demand risks for multiple users. Another example where it is important to consider regional dependency of impacts is when characterizing how agricultural land occupation affects biodiversity in a region. We developed a data-driven methodology used in conjunction with GIS to determine if there is a statistically significant difference between the impacts of growing different crops on different species in various biomes of the world.

  20. Study design and statistical analysis of data in human population studies with the micronucleus assay.

    PubMed

    Ceppi, Marcello; Gallo, Fabio; Bonassi, Stefano

    2011-01-01

    The most common study design performed in population studies based on the micronucleus (MN) assay, is the cross-sectional study, which is largely performed to evaluate the DNA damaging effects of exposure to genotoxic agents in the workplace, in the environment, as well as from diet or lifestyle factors. Sample size is still a critical issue in the design of MN studies since most recent studies considering gene-environment interaction, often require a sample size of several hundred subjects, which is in many cases difficult to achieve. Uncontrolled confounding is another major threat to the validity of causal inference. The most popular confounders considered in population studies using MN are age, gender and smoking habit. Extensive attention is given to the assessment of effect modification, given the increasing inclusion of biomarkers of genetic susceptibility in the study design. Selected issues concerning the statistical treatment of data have been addressed in this mini-review, starting from data description, which is a critical step of statistical analysis, since it makes it possible to detect errors in the dataset to be analysed and to check the validity of assumptions required for more complex analyses. Basic issues dealing with statistical analysis of biomarkers are extensively evaluated, including methods to explore the dose-response relationship between two continuous variables and inferential analysis. A critical approach to the use of parametric and non-parametric methods is presented, before addressing the issue of the most suitable multivariate models to fit MN data. In the last decade, the quality of statistical analysis of MN data has certainly evolved, although even nowadays only a small number of studies apply the Poisson model, which is the most suitable method for the analysis of MN data.
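
    A minimal sketch of the Poisson modelling recommended above, assuming illustrative covariates (age, smoking, exposure) and the number of cells scored as an exposure offset; real analyses would also check for overdispersion.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.uniform(20, 65, n),
    "smoker": rng.integers(0, 2, n),
    "exposed": rng.integers(0, 2, n),
    "cells_scored": np.full(n, 1000),
})
# Simulated MN counts with a log-linear dependence on the covariates
rate = np.exp(-6.0 + 0.01 * df.age + 0.2 * df.smoker + 0.4 * df.exposed)
df["mn_count"] = rng.poisson(rate * df.cells_scored)

# Poisson GLM with the number of scored cells as the exposure term
model = smf.glm("mn_count ~ age + smoker + exposed", data=df,
                family=sm.families.Poisson(), exposure=df["cells_scored"]).fit()
print(model.summary())
```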

  1. Detection of semi-volatile organic compounds in permeable ...

    EPA Pesticide Factsheets

    Abstract The Edison Environmental Center (EEC) has a research and demonstration permeable parking lot comprised of three different permeable systems: permeable asphalt, porous concrete and interlocking concrete permeable pavers. Water quality and quantity analysis has been ongoing since January 2010. This paper describes a subset of the water quality analysis, analysis of semivolatile organic compounds (SVOCs), to determine whether hydrocarbons were present in water that infiltrated through the permeable surfaces. SVOCs were analyzed in samples collected on 11 dates over a 3 year period, from 2/8/2010 to 4/1/2013. Results are broadly divided into three categories: 42 chemicals were never detected; 12 chemicals (11 chemical tests) were detected at a rate of 10% or less; and 22 chemicals were detected at a frequency of 10% or greater (ranging from 10% to 66.5% detections). Fundamental and exploratory statistical analyses were performed on these latter results by grouping them by surface type. The statistical analyses were limited by the low frequency of detections and by sample dilutions, which affected detection limits. The infiltrate data through the three permeable surfaces were analyzed as non-parametric data by the Kaplan-Meier estimation method for fundamental statistics; there were some statistically observable differences in concentration between pavement types when using the Tarone-Ware comparison hypothesis test. Additionally Spearman rank-order non-parametric
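
    A minimal sketch of Kaplan-Meier estimation for concentration data containing non-detects, using the common device of flipping the data so that non-detects become right-censored observations; the study's exact workflow is not given in the abstract, and the values below are illustrative.

```python
import numpy as np
from lifelines import KaplanMeierFitter

conc = np.array([0.5, 1.2, 0.8, 2.4, 0.3, 1.9, 0.7, 3.1])   # measured values or detection limits (ug/L)
detected = np.array([1, 1, 0, 1, 0, 1, 1, 1], dtype=bool)    # False = non-detect, value is its detection limit

flip = conc.max() + 1.0
flipped = flip - conc            # after flipping, non-detects are right-censored

kmf = KaplanMeierFitter().fit(flipped, event_observed=detected)
median_conc = flip - kmf.median_survival_time_   # flip the KM median back to concentration units
print(f"Kaplan-Meier median concentration: {median_conc:.2f} ug/L")
```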

  2. Accounting for Multiple Births in Neonatal and Perinatal Trials: Systematic Review and Case Study

    PubMed Central

    Hibbs, Anna Maria; Black, Dennis; Palermo, Lisa; Cnaan, Avital; Luan, Xianqun; Truog, William E; Walsh, Michele C; Ballard, Roberta A

    2010-01-01

    Objectives To determine the prevalence in the neonatal literature of statistical approaches accounting for the unique clustering patterns of multiple births. To explore the sensitivity of an actual trial to several analytic approaches to multiples. Methods A systematic review of recent perinatal trials assessed the prevalence of studies accounting for clustering of multiples. The NO CLD trial served as a case study of the sensitivity of the outcome to several statistical strategies. We calculated odds ratios using non-clustered (logistic regression) and clustered (generalized estimating equations, multiple outputation) analyses. Results In the systematic review, most studies did not describe the randomization of twins and did not account for clustering. Of those studies that did, exclusion of multiples and generalized estimating equations were the most common strategies. The NO CLD study included 84 infants with a sibling enrolled in the study. Multiples were more likely than singletons to be white and were born to older mothers (p<0.01). Analyses that accounted for clustering were statistically significant; analyses assuming independence were not. Conclusions The statistical approach to multiples can influence the odds ratio and width of confidence intervals, thereby affecting the interpretation of a study outcome. A minority of perinatal studies address this issue. PMID:19969305
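
    A minimal sketch of one of the clustered approaches identified in the review: a generalized estimating equations model with an exchangeable working correlation at the mother level. The variables and data below are illustrative, not the NO CLD trial data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_mothers = 300
births = rng.choice([1, 2], size=n_mothers, p=[0.9, 0.1])   # ~10% twin pregnancies
mother_id = np.repeat(np.arange(n_mothers), births)
treated = rng.integers(0, 2, n_mothers)[mother_id]          # randomised at the mother level
outcome = rng.binomial(1, 0.35 - 0.10 * treated)            # binary neonatal outcome

df = pd.DataFrame({"mother_id": mother_id, "treated": treated, "outcome": outcome})

# GEE with an exchangeable correlation structure so siblings are not treated as independent
gee = smf.gee("outcome ~ treated", groups="mother_id", data=df,
              family=sm.families.Binomial(), cov_struct=sm.cov_struct.Exchangeable()).fit()
print(gee.summary())
```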

  3. A Statistical Method for Synthesizing Mediation Analyses Using the Product of Coefficient Approach Across Multiple Trials

    PubMed Central

    Huang, Shi; MacKinnon, David P.; Perrino, Tatiana; Gallo, Carlos; Cruden, Gracelyn; Brown, C Hendricks

    2016-01-01

    Mediation analysis often requires larger sample sizes than main effect analysis to achieve the same statistical power. Combining results across similar trials may be the only practical option for increasing statistical power for mediation analysis in some situations. In this paper, we propose a method to estimate: 1) marginal means for mediation path a, the relation of the independent variable to the mediator; 2) marginal means for path b, the relation of the mediator to the outcome, across multiple trials; and 3) the between-trial level variance-covariance matrix based on a bivariate normal distribution. We present the statistical theory and an R computer program to combine regression coefficients from multiple trials to estimate a combined mediated effect and confidence interval under a random effects model. Values of coefficients a and b, along with their standard errors from each trial are the input for the method. This marginal likelihood based approach with Monte Carlo confidence intervals provides more accurate inference than the standard meta-analytic approach. We discuss computational issues, apply the method to two real-data examples and make recommendations for the use of the method in different settings. PMID:28239330
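
    A simplified sketch of combining paths a and b across trials with a Monte Carlo confidence interval for the pooled mediated effect. It pools each path separately with DerSimonian-Laird weights, which is a simplification of, not a substitute for, the bivariate marginal-likelihood model the authors propose; all numbers are illustrative.

```python
import numpy as np

def dl_pool(est, se):
    """DerSimonian-Laird random-effects pooling; returns pooled estimate and its standard error."""
    w = 1.0 / se**2
    fixed = np.sum(w * est) / np.sum(w)
    q = np.sum(w * (est - fixed) ** 2)
    tau2 = max(0.0, (q - (len(est) - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_star = 1.0 / (se**2 + tau2)
    return np.sum(w_star * est) / np.sum(w_star), np.sqrt(1.0 / np.sum(w_star))

# Per-trial path coefficients and standard errors (illustrative)
a, se_a = np.array([0.42, 0.35, 0.50, 0.28]), np.array([0.10, 0.12, 0.09, 0.15])
b, se_b = np.array([0.30, 0.22, 0.35, 0.18]), np.array([0.08, 0.10, 0.07, 0.11])

a_hat, a_se = dl_pool(a, se_a)
b_hat, b_se = dl_pool(b, se_b)

# Monte Carlo interval for the product of the pooled paths
rng = np.random.default_rng(0)
ab = rng.normal(a_hat, a_se, 100_000) * rng.normal(b_hat, b_se, 100_000)
lo, hi = np.percentile(ab, [2.5, 97.5])
print(f"pooled mediated effect a*b = {a_hat * b_hat:.3f} (95% MC CI {lo:.3f} to {hi:.3f})")
```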

  4. LandScape: a simple method to aggregate p-values and other stochastic variables without a priori grouping.

    PubMed

    Wiuf, Carsten; Schaumburg-Müller Pallesen, Jonatan; Foldager, Leslie; Grove, Jakob

    2016-08-01

    In many areas of science it is customary to perform many tests, potentially millions, simultaneously. To gain statistical power it is common to group tests based on a priori criteria such as predefined regions or sliding windows. However, it is not straightforward to choose grouping criteria and the results might depend on the chosen criteria. Methods that summarize, or aggregate, test statistics or p-values, without relying on a priori criteria, are therefore desirable. We present a simple method to aggregate a sequence of stochastic variables, such as test statistics or p-values, into fewer variables without assuming a priori defined groups. We provide different ways to evaluate the significance of the aggregated variables based on theoretical considerations and resampling techniques, and show that under certain assumptions the family-wise error rate (FWER) is controlled in the strong sense. Validity of the method was demonstrated using simulations and real data analyses. Our method may be a useful supplement to standard procedures relying on evaluation of test statistics individually. Moreover, by being agnostic and not relying on predefined selected regions, it might be a practical alternative to conventionally used methods of aggregation of p-values over regions. The method is implemented in Python and freely available online (through GitHub, see the Supplementary information).

  5. STRengthening Analytical Thinking for Observational Studies: the STRATOS initiative

    PubMed Central

    Sauerbrei, Willi; Abrahamowicz, Michal; Altman, Douglas G; le Cessie, Saskia; Carpenter, James

    2014-01-01

    The validity and practical utility of observational medical research depends critically on good study design, excellent data quality, appropriate statistical methods and accurate interpretation of results. Statistical methodology has seen substantial development in recent times. Unfortunately, many of these methodological developments are ignored in practice. Consequently, design and analysis of observational studies often exhibit serious weaknesses. The lack of guidance on vital practical issues discourages many applied researchers from using more sophisticated and possibly more appropriate methods when analyzing observational studies. Furthermore, many analyses are conducted by researchers with a relatively weak statistical background and limited experience in using statistical methodology and software. Consequently, even ‘standard’ analyses reported in the medical literature are often flawed, casting doubt on their results and conclusions. An efficient way to help researchers to keep up with recent methodological developments is to develop guidance documents that are spread to the research community at large. These observations led to the initiation of the strengthening analytical thinking for observational studies (STRATOS) initiative, a large collaboration of experts in many different areas of biostatistical research. The objective of STRATOS is to provide accessible and accurate guidance in the design and analysis of observational studies. The guidance is intended for applied statisticians and other data analysts with varying levels of statistical education, experience and interests. In this article, we introduce the STRATOS initiative and its main aims, present the need for guidance documents and outline the planned approach and progress so far. We encourage other biostatisticians to become involved. PMID:25074480

  6. Classical Statistics and Statistical Learning in Imaging Neuroscience

    PubMed Central

    Bzdok, Danilo

    2017-01-01

    Brain-imaging research has predominantly generated insight by means of classical statistics, including regression-type analyses and null-hypothesis testing using the t-test and ANOVA. In recent years, statistical learning methods have enjoyed increasing popularity, especially for applications to rich and complex data, including cross-validated out-of-sample prediction using pattern classification and sparsity-inducing regression. This concept paper discusses the implications of inferential justifications and algorithmic methodologies in common data analysis scenarios in neuroimaging. The paper retraces how classical statistics and statistical learning originated in different historical contexts, build on different theoretical foundations, make different assumptions, and evaluate different outcome metrics to permit differently nuanced conclusions. The present considerations should help reduce current confusion between model-driven classical hypothesis testing and data-driven learning algorithms for investigating the brain with imaging techniques. PMID:29056896

  7. Ecological Momentary Assessments and Automated Time Series Analysis to Promote Tailored Health Care: A Proof-of-Principle Study.

    PubMed

    van der Krieke, Lian; Emerencia, Ando C; Bos, Elisabeth H; Rosmalen, Judith Gm; Riese, Harriëtte; Aiello, Marco; Sytema, Sjoerd; de Jonge, Peter

    2015-08-07

    Health promotion can be tailored by combining ecological momentary assessments (EMA) with time series analysis. This combined method allows for studying the temporal order of dynamic relationships among variables, which may provide concrete indications for intervention. However, application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required. This study aims to show how this limitation can be overcome by introducing automated vector autoregressive modeling (VAR) of EMA data and to evaluate its feasibility through comparisons with results of previously published manual analyses. We developed a Web-based open source application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. The statistical technique we used was VAR. AutoVAR tests and evaluates all possible VAR models within a given combinatorial search space and summarizes their results, thereby replacing the researcher's tasks of conducting the analysis, making an informed selection of models, and choosing the best model. We compared the output of AutoVAR to the output of a previously published manual analysis (n=4). An illustrative example consisting of 4 analyses was provided. Compared to the manual output, the AutoVAR output presents similar model characteristics and statistical results in terms of the Akaike information criterion, the Bayesian information criterion, and the test statistic of the Granger causality test. Results suggest that automated analysis and interpretation of time series is feasible. Compared to a manual procedure, the automated procedure is more robust and can save days of time. These findings may pave the way for using time series analysis for health promotion on a larger scale. AutoVAR was evaluated using the results of a previously conducted manual analysis. Analysis of additional datasets is needed in order to validate and refine the application for general use.
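
    A minimal sketch of VAR fitting and Granger-causality testing on two EMA series, in the spirit of the automated workflow described above; AutoVAR itself is a separate Web application, and the series and variable names here are simulated.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
n = 90                                       # e.g. 90 daily EMA assessments
activity, mood = np.zeros(n), np.zeros(n)
for t in range(1, n):
    activity[t] = 0.5 * activity[t - 1] + rng.normal(0, 1)
    mood[t] = 0.4 * mood[t - 1] + 0.3 * activity[t - 1] + rng.normal(0, 1)

data = pd.DataFrame({"activity": activity, "mood": mood})

model = VAR(data)
results = model.fit(maxlags=5, ic="aic")     # automated lag selection by information criterion
print("AIC:", results.aic, "BIC:", results.bic)
# Does activity Granger-cause mood in this series?
print(results.test_causality("mood", ["activity"], kind="f").summary())
```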

  8. Approach for Uncertainty Propagation and Robust Design in CFD Using Sensitivity Derivatives

    NASA Technical Reports Server (NTRS)

    Putko, Michele M.; Newman, Perry A.; Taylor, Arthur C., III; Green, Lawrence L.

    2001-01-01

    This paper presents an implementation of the approximate statistical moment method for uncertainty propagation and robust optimization for a quasi 1-D Euler CFD (computational fluid dynamics) code. Given uncertainties in statistically independent, random, normally distributed input variables, a first- and second-order statistical moment matching procedure is performed to approximate the uncertainty in the CFD output. Efficient calculation of both first- and second-order sensitivity derivatives is required. In order to assess the validity of the approximations, the moments are compared with statistical moments generated through Monte Carlo simulations. The uncertainties in the CFD input variables are also incorporated into a robust optimization procedure. For this optimization, statistical moments involving first-order sensitivity derivatives appear in the objective function and system constraints. Second-order sensitivity derivatives are used in a gradient-based search to successfully execute a robust optimization. The approximate methods used throughout the analyses are found to be valid when considering robustness about input parameter mean values.
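
    A minimal sketch of first-order moment propagation for independent, normally distributed inputs, checked against Monte Carlo sampling; the function below is a generic stand-in, not the quasi 1-D Euler CFD code.

```python
import numpy as np

def f(x):
    # Stand-in nonlinear output; in the paper this is a CFD output quantity.
    return x[0] ** 2 + np.sin(x[1]) + x[0] * x[2]

mu = np.array([1.0, 0.5, 2.0])        # input means
sigma = np.array([0.05, 0.02, 0.10])  # input standard deviations

# First-order sensitivity derivatives via central finite differences
grad = np.array([(f(mu + h) - f(mu - h)) / (2 * 1e-6) for h in 1e-6 * np.eye(3)])

mean_fo = f(mu)                                  # first-order mean estimate
std_fo = np.sqrt(np.sum((grad * sigma) ** 2))    # first-order standard deviation estimate

# Monte Carlo check of the approximation
rng = np.random.default_rng(0)
samples = f(rng.normal(mu[:, None], sigma[:, None], size=(3, 200_000)))
print(f"moment method: mean={mean_fo:.4f}, std={std_fo:.4f}")
print(f"Monte Carlo:   mean={samples.mean():.4f}, std={samples.std():.4f}")
```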

  9. Improving surveillance for injuries associated with potential motor vehicle safety defects

    PubMed Central

    Whitfield, R; Whitfield, A

    2004-01-01

    Objective: To improve surveillance for deaths and injuries associated with potential motor vehicle safety defects. Design: Vehicles in fatal crashes can be studied for indications of potential defects using an "early warning" surveillance statistic previously suggested for screening reports of adverse drug reactions. This statistic is illustrated with time series data for fatal, tire-related and fire-related crashes. Geographic analyses are used to augment the tire-related statistics. Results: A statistical criterion based on the Poisson distribution, which tests how likely the number of events that actually occurred is given the expected number of events, is a promising method that can be readily adapted for use in injury surveillance. Conclusions: Use of the demonstrated techniques could have helped to avert a well-known injury surveillance failure. This method is adaptable to aid in directing engineering and statistical reviews to prevent deaths and injuries associated with potential motor vehicle safety defects using available databases. PMID:15066972
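
    A minimal sketch of a Poisson screening criterion of the kind described above: the probability of observing at least the recorded number of events given the expected count. The counts and threshold are illustrative, not values from the study.

```python
from scipy.stats import poisson

observed = 18        # crashes observed for the vehicle/tire combination under review (illustrative)
expected = 6.5       # crashes expected from the comparison group, scaled for exposure (illustrative)

p_value = poisson.sf(observed - 1, expected)   # P(X >= observed | mu = expected)
print(f"P(X >= {observed} | mu = {expected}) = {p_value:.2e}")
if p_value < 0.001:
    print("Exceeds the screening threshold: flag for engineering and statistical review.")
```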

  10. Evaluation and application of summary statistic imputation to discover new height-associated loci.

    PubMed

    Rüeger, Sina; McDaid, Aaron; Kutalik, Zoltán

    2018-05-01

    As most of the heritability of complex traits is attributed to common and low-frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discover the genetic basis of these traits. Association summary statistics from genome-wide meta-analyses are available for hundreds of traits. Updating these to ever-increasing reference panels is very cumbersome as it requires reimputation of the genetic data, rerunning the association scan, and meta-analysing the results. A much more efficient method is to directly impute the summary statistics, termed summary statistics imputation, which we improved to accommodate variable sample size across SNVs. Its performance relative to genotype imputation and practical utility has not yet been fully investigated. To this end, we compared the two approaches on real (genotyped and imputed) data from 120K samples from the UK Biobank and show that genotype imputation boasts a 3- to 5-fold lower root-mean-square error and better distinguishes true associations from null ones: we observed the largest differences in power for variants with low minor allele frequency and low imputation quality. For fixed false positive rates of 0.001, 0.01, and 0.05, using summary statistics imputation yielded a decrease in statistical power by 9, 43 and 35%, respectively. To test its capacity to discover novel associations, we applied summary statistics imputation to the GIANT height meta-analysis summary statistics covering HapMap variants, and identified 34 novel loci, 19 of which replicated using data in the UK Biobank. Additionally, we successfully replicated 55 out of the 111 variants published in an exome chip study. Our study demonstrates that summary statistics imputation is a very efficient and cost-effective way to identify and fine-map trait-associated loci. Moreover, the ability to impute summary statistics is important for follow-up analyses, such as Mendelian randomisation or LD-score regression.
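
    A minimal sketch of the conditional-expectation step that underlies summary statistics imputation: the z-score at an untyped SNP is predicted from z-scores at typed SNPs via the LD (correlation) matrix of a reference panel. The ridge term below and the omission of the variable-sample-size adjustment are simplifications for illustration; the LD values and z-scores are invented.

```python
import numpy as np

# Reference-panel LD correlation matrix for [typed SNP 1, typed SNP 2, untyped SNP]
ld = np.array([[1.00, 0.30, 0.70],
               [0.30, 1.00, 0.40],
               [0.70, 0.40, 1.00]])
z_typed = np.array([3.2, 1.1])          # observed association z-scores at the typed SNPs

c_uo = ld[2, :2]                        # correlations of the untyped SNP with the typed SNPs
c_oo = ld[:2, :2] + 0.01 * np.eye(2)    # small ridge term for numerical stability (assumption)

z_untyped = c_uo @ np.linalg.solve(c_oo, z_typed)          # imputed z-score
r2_imp = c_uo @ np.linalg.solve(c_oo, c_uo)                # expected imputation quality
print(f"imputed z = {z_untyped:.2f}, imputation r2 = {r2_imp:.2f}")
```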

  11. Evaluation and application of summary statistic imputation to discover new height-associated loci

    PubMed Central

    2018-01-01

    As most of the heritability of complex traits is attributed to common and low-frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discover the genetic basis of these traits. Association summary statistics from genome-wide meta-analyses are available for hundreds of traits. Updating these to ever-increasing reference panels is very cumbersome as it requires reimputation of the genetic data, rerunning the association scan, and meta-analysing the results. A much more efficient method is to directly impute the summary statistics, termed summary statistics imputation, which we improved to accommodate variable sample size across SNVs. Its performance relative to genotype imputation and practical utility has not yet been fully investigated. To this end, we compared the two approaches on real (genotyped and imputed) data from 120K samples from the UK Biobank and show that genotype imputation boasts a 3- to 5-fold lower root-mean-square error and better distinguishes true associations from null ones: we observed the largest differences in power for variants with low minor allele frequency and low imputation quality. For fixed false positive rates of 0.001, 0.01, and 0.05, using summary statistics imputation yielded a decrease in statistical power by 9, 43 and 35%, respectively. To test its capacity to discover novel associations, we applied summary statistics imputation to the GIANT height meta-analysis summary statistics covering HapMap variants, and identified 34 novel loci, 19 of which replicated using data in the UK Biobank. Additionally, we successfully replicated 55 out of the 111 variants published in an exome chip study. Our study demonstrates that summary statistics imputation is a very efficient and cost-effective way to identify and fine-map trait-associated loci. Moreover, the ability to impute summary statistics is important for follow-up analyses, such as Mendelian randomisation or LD-score regression. PMID:29782485

  12. Do regional methods really help reduce uncertainties in flood frequency analyses?

    NASA Astrophysics Data System (ADS)

    Cong Nguyen, Chi; Payrastre, Olivier; Gaume, Eric

    2013-04-01

    Flood frequency analyses are often based on continuous measured series at gauge sites. However, the length of the available data sets is usually too short to provide reliable estimates of extreme design floods. To reduce the estimation uncertainties, the analyzed data sets have to be extended either in time, making use of historical and paleoflood data, or in space, merging data sets considered as statistically homogeneous to build large regional data samples. Nevertheless, the main advantage of regional analyses, the large increase in the size of the studied data sets, may be counterbalanced by the possible heterogeneities of the merged sets. The application and comparison of four different flood frequency analysis methods to two regions affected by flash floods in the south of France (Ardèche and Var) illustrates how this balance between the number of records and possible heterogeneities plays out in real-world applications. The four tested methods are: (1) a local statistical analysis based on the existing series of measured discharges, (2) a local analysis incorporating existing information on historical floods, (3) a standard regional flood frequency analysis based on existing measured series at gauged sites and (4) a modified regional analysis including estimated extreme peak discharges at ungauged sites. Monte Carlo simulations are conducted to simulate a large number of discharge series with characteristics similar to the observed ones (type of statistical distributions, number of sites and records) to evaluate the extent to which the results obtained on these case studies can be generalized. These two case studies indicate that even small statistical heterogeneities, which are not detected by the standard homogeneity tests implemented in regional flood frequency studies, may drastically limit the usefulness of such approaches. On the other hand, these results show that incorporating information on extreme events, either historical flood events at gauged sites or estimated extremes at ungauged sites in the considered region, is an efficient way to reduce uncertainties in flood frequency studies.

  13. Confidence intervals for the between-study variance in random-effects meta-analysis using generalised heterogeneity statistics: should we use unequal tails?

    PubMed

    Jackson, Dan; Bowden, Jack

    2016-09-07

    Confidence intervals for the between study variance are useful in random-effects meta-analyses because they quantify the uncertainty in the corresponding point estimates. Methods for calculating these confidence intervals have been developed that are based on inverting hypothesis tests using generalised heterogeneity statistics. Whilst, under the random effects model, these new methods furnish confidence intervals with the correct coverage, the resulting intervals are usually very wide, making them uninformative. We discuss a simple strategy for obtaining 95 % confidence intervals for the between-study variance with a markedly reduced width, whilst retaining the nominal coverage probability. Specifically, we consider the possibility of using methods based on generalised heterogeneity statistics with unequal tail probabilities, where the tail probability used to compute the upper bound is greater than 2.5 %. This idea is assessed using four real examples and a variety of simulation studies. Supporting analytical results are also obtained. Our results provide evidence that using unequal tail probabilities can result in shorter 95 % confidence intervals for the between-study variance. We also show some further results for a real example that illustrates how shorter confidence intervals for the between-study variance can be useful when performing sensitivity analyses for the average effect, which is usually the parameter of primary interest. We conclude that using unequal tail probabilities when computing 95 % confidence intervals for the between-study variance, when using methods based on generalised heterogeneity statistics, can result in shorter confidence intervals. We suggest that those who find the case for using unequal tail probabilities convincing should use the '1-4 % split', where greater tail probability is allocated to the upper confidence bound. The 'width-optimal' interval that we present deserves further investigation.

  14. A concept for holistic whole body MRI data analysis, Imiomics

    PubMed Central

    Malmberg, Filip; Johansson, Lars; Lind, Lars; Sundbom, Magnus; Ahlström, Håkan; Kullberg, Joel

    2017-01-01

    Purpose To present and evaluate a whole-body image analysis concept, Imiomics (imaging–omics) and an image registration method that enables Imiomics analyses by deforming all image data to a common coordinate system, so that the information in each voxel can be compared between persons or within a person over time and integrated with non-imaging data. Methods The presented image registration method utilizes relative elasticity constraints of different tissue obtained from whole-body water-fat MRI. The registration method is evaluated by inverse consistency and Dice coefficients and the Imiomics concept is evaluated by example analyses of importance for metabolic research using non-imaging parameters where we know what to expect. The example analyses include whole body imaging atlas creation, anomaly detection, and cross-sectional and longitudinal analysis. Results The image registration method evaluation on 128 subjects shows low inverse consistency errors and high Dice coefficients. Also, the statistical atlas with fat content intensity values shows low standard deviation values, indicating successful deformations to the common coordinate system. The example analyses show expected associations and correlations which agree with explicit measurements, and thereby illustrate the usefulness of the proposed Imiomics concept. Conclusions The registration method is well-suited for Imiomics analyses, which enable analyses of relationships to non-imaging data, e.g. clinical data, in new types of holistic targeted and untargeted big-data analysis. PMID:28241015

  15. Detecting Genomic Clustering of Risk Variants from Sequence Data: Cases vs. Controls

    PubMed Central

    Schaid, Daniel J.; Sinnwell, Jason P.; McDonnell, Shannon K.; Thibodeau, Stephen N.

    2013-01-01

    As the ability to measure dense genetic markers approaches the limit of the DNA sequence itself, taking advantage of possible clustering of genetic variants in, and around, a gene would benefit genetic association analyses, and likely provide biological insights. The greatest benefit might be realized when multiple rare variants cluster in a functional region. Several statistical tests have been developed, one of which is based on the popular Kulldorff scan statistic for spatial clustering of disease. We extended another popular spatial clustering method – Tango’s statistic – to genomic sequence data. An advantage of Tango’s method is that it is rapid to compute, and when a single test statistic is computed, its distribution is well approximated by a scaled chi-square distribution, making computation of p-values very rapid. We compared the Type-I error rates and power of several clustering statistics, as well as the omnibus sequence kernel association test (SKAT). Although our version of Tango’s statistic, which we call the “Kernel Distance” statistic, took approximately half as long to compute as the Kulldorff scan statistic, it had slightly less power than the scan statistic. Our results showed that the Ionita-Laza version of Kulldorff’s scan statistic had the greatest power over a range of clustering scenarios. PMID:23842950

  16. Statistics for NAEG: past efforts, new results, and future plans

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilbert, R.O.; Simpson, J.C.; Kinnison, R.R.

    A brief review of Nevada Applied Ecology Group (NAEG) objectives is followed by a summary of past statistical analyses conducted by Pacific Northwest Laboratory for the NAEG. Estimates of spatial pattern of radionuclides and other statistical analyses at NS's 201, 219 and 221 are reviewed as background for new analyses presented in this paper. Suggested NAEG activities and statistical analyses needed for the projected termination date of NAEG studies in March 1986 are given.

  17. Estimation versus falsification approaches in sport and exercise science.

    PubMed

    Wilkinson, Michael; Winter, Edward M

    2018-05-22

    There has been a recent resurgence in debate about methods for statistical inference in science. The debate addresses statistical concepts and their impact on the value and meaning of analyses' outcomes. In contrast, philosophical underpinnings of approaches and the extent to which analytical tools match philosophical goals of the scientific method have received less attention. This short piece considers application of the scientific method to "what-is-the-influence-of x-on-y" type questions characteristic of sport and exercise science. We consider applications and interpretations of estimation versus falsification based statistical approaches and their value in addressing how much x influences y, and in measurement error and method agreement settings. We compare estimation using magnitude based inference (MBI) with falsification using null hypothesis significance testing (NHST), and highlight the limited value both of falsification and NHST to address problems in sport and exercise science. We recommend adopting an estimation approach, expressing the uncertainty of effects of x on y, and their practical/clinical value against pre-determined effect magnitudes using MBI.

  18. Application of multivariate statistical techniques in microbial ecology

    PubMed Central

    Paliy, O.; Shankar, V.

    2016-01-01

    Recent advances in high-throughput methods of molecular analyses have led to an explosion of studies generating large-scale ecological datasets. An especially noticeable effect has been attained in the field of microbial ecology, where new experimental approaches have provided in-depth assessments of the composition, functions, and dynamic changes of complex microbial communities. Because even a single high-throughput experiment produces large amounts of data, powerful statistical techniques of multivariate analysis are well suited to analyze and interpret these datasets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular dataset. In this review we describe and compare the most widely used multivariate statistical techniques including exploratory, interpretive, and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and dataset structure. PMID:26786791
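
    A minimal sketch of one widely used exploratory workflow from this family of methods: Bray-Curtis dissimilarities followed by non-metric multidimensional scaling (NMDS), applied to a simulated sample-by-taxon abundance table.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
abundance = rng.poisson(lam=4, size=(40, 25))            # 40 samples x 25 taxa (simulated)

# Relative abundances, then pairwise Bray-Curtis dissimilarities
rel = abundance / abundance.sum(axis=1, keepdims=True)
dist = squareform(pdist(rel, metric="braycurtis"))

# Non-metric MDS on the precomputed dissimilarity matrix
nmds = MDS(n_components=2, metric=False, dissimilarity="precomputed",
           n_init=10, random_state=0)
coords = nmds.fit_transform(dist)
print("stress:", round(nmds.stress_, 3))
print(coords[:5].round(2))                               # ordination coordinates of the first samples
```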

  19. An evaluation of various methods of treatment for Legg-Calvé-Perthes disease.

    PubMed

    Wang, L; Bowen, J R; Puniak, M A; Guille, J T; Glutting, J

    1995-05-01

    An analysis of 5 methods of treatment for Legg-Calvé-Perthes disease was done on 124 patients with 141 affected hips. Before treatment, all groups were statistically similar concerning initial Mose measurement, age at onset of the disease, gender, and Catterall class. Treatments included the Scottish Rite orthosis (41 hips), nonweight bearing and exercises (41 hips), Petrie cast (29 hips), femoral varus osteotomy (15 hips), or Salter osteotomy (15 hips). Hips treated by the Scottish Rite orthosis had a significantly worse Mose measurement across time interaction (repeated measures analysis of variance, post hoc analyses, p < 0.05). For the other 4 treatment methods, there was no statistically different change. At followup, the Mose measurements for hips treated with the Scottish Rite orthosis were significantly worse than those for hips treated by nonweight bearing and exercises, Petrie cast, varus osteotomy, or Salter osteotomy (repeated measures analysis of variance, post hoc analyses, p < 0.05). There was, however, no significant difference in the distribution of hips according to the Stulberg et al classification at the last followup.

  20. Models of dyadic social interaction.

    PubMed Central

    Griffin, Dale; Gonzalez, Richard

    2003-01-01

    We discuss the logic of research designs for dyadic interaction and present statistical models with parameters that are tied to psychologically relevant constructs. Building on Karl Pearson's classic nineteenth-century statistical analysis of within-organism similarity, we describe several approaches to indexing dyadic interdependence and provide graphical methods for visualizing dyadic data. We also describe several statistical and conceptual solutions to the 'levels of analysis' problem in analysing dyadic data. These analytic strategies allow the researcher to examine and measure psychological questions of interdependence and social influence. We provide illustrative data from casually interacting and romantic dyads. PMID:12689382

  1. Walking through the statistical black boxes of plant breeding.

    PubMed

    Xavier, Alencar; Muir, William M; Craig, Bruce; Rainey, Katy Martin

    2016-10-01

    The main statistical procedures in plant breeding are based on Gaussian processes and can be computed through mixed linear models. Intelligent decision making relies on our ability to extract useful information from data to help us achieve our goals more efficiently. Many plant breeders and geneticists perform statistical analyses without understanding the underlying assumptions of the methods or their strengths and pitfalls. In other words, they treat these statistical methods (software and programs) like black boxes. Black boxes represent complex pieces of machinery with contents that are not fully understood by the user. The user sees the inputs and outputs without knowing how the outputs are generated. By providing a general background on statistical methodologies, this review aims (1) to introduce basic concepts of machine learning and its applications to plant breeding; (2) to link classical selection theory to current statistical approaches; (3) to show how to solve mixed models and extend their application to pedigree-based and genomic-based prediction; and (4) to clarify how the algorithms of genome-wide association studies work, including their assumptions and limitations.

  2. Statistical Analysis of Individual Participant Data Meta-Analyses: A Comparison of Methods and Recommendations for Practice

    PubMed Central

    Stewart, Gavin B.; Altman, Douglas G.; Askie, Lisa M.; Duley, Lelia; Simmonds, Mark C.; Stewart, Lesley A.

    2012-01-01

    Background Individual participant data (IPD) meta-analyses that obtain “raw” data from studies rather than summary data typically adopt a “two-stage” approach to analysis whereby IPD within trials generate summary measures, which are combined using standard meta-analytical methods. Recently, a range of “one-stage” approaches which combine all individual participant data in a single meta-analysis have been suggested as providing a more powerful and flexible approach. However, they are more complex to implement and require statistical support. This study uses a dataset to compare “two-stage” and “one-stage” models of varying complexity, to ascertain whether results obtained from the approaches differ in a clinically meaningful way. Methods and Findings We included data from 24 randomised controlled trials, evaluating antiplatelet agents, for the prevention of pre-eclampsia in pregnancy. We performed two-stage and one-stage IPD meta-analyses to estimate overall treatment effect and to explore potential treatment interactions whereby particular types of women and their babies might benefit differentially from receiving antiplatelets. Two-stage and one-stage approaches gave similar results, showing a benefit of using anti-platelets (Relative risk 0.90, 95% CI 0.84 to 0.97). Neither approach suggested that any particular type of women benefited more or less from antiplatelets. There were no material differences in results between different types of one-stage model. Conclusions For these data, two-stage and one-stage approaches to analysis produce similar results. Although one-stage models offer a flexible environment for exploring model structure and are useful where across study patterns relating to types of participant, intervention and outcome mask similar relationships within trials, the additional insights provided by their usage may not outweigh the costs of statistical support for routine application in syntheses of randomised controlled trials. Researchers considering undertaking an IPD meta-analysis should not necessarily be deterred by a perceived need for sophisticated statistical methods when combining information from large randomised trials. PMID:23056232
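
    A minimal sketch of the "two-stage" approach in its simplest form: a per-trial summary (here a log relative risk) computed from the individual participant data, then inverse-variance pooling across trials. The counts below are illustrative, not the antiplatelet trial data analysed in the paper.

```python
import numpy as np

# Stage 1 input: per-trial 2x2 counts (events/total in treated and control arms), illustrative
events_t = np.array([30, 12, 45, 20]); n_t = np.array([400, 150, 600, 250])
events_c = np.array([40, 18, 55, 28]); n_c = np.array([410, 148, 590, 255])

# Stage 1: trial-level log relative risks and their variances
log_rr = np.log((events_t / n_t) / (events_c / n_c))
var_rr = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c

# Stage 2: fixed-effect inverse-variance pooling across trials
w = 1 / var_rr
pooled = np.sum(w * log_rr) / np.sum(w)
se = np.sqrt(1 / np.sum(w))
print(f"pooled RR = {np.exp(pooled):.2f} "
      f"(95% CI {np.exp(pooled - 1.96 * se):.2f} to {np.exp(pooled + 1.96 * se):.2f})")
```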

  3. Analyses of global sea surface temperature 1856-1991

    NASA Astrophysics Data System (ADS)

    Kaplan, Alexey; Cane, Mark A.; Kushnir, Yochanan; Clement, Amy C.; Blumenthal, M. Benno; Rajagopalan, Balaji

    1998-08-01

    Global analyses of monthly sea surface temperature (SST) anomalies from 1856 to 1991 are produced using three statistically based methods: optimal smoothing (OS), the Kalman filter (KF) and optimal interpolation (OI). Each of these is accompanied by estimates of the error covariance of the analyzed fields. The spatial covariance function these methods require is estimated from the available data; the time-marching model is a first-order autoregressive model again estimated from data. The data input for the analyses are monthly anomalies from the United Kingdom Meteorological Office historical sea surface temperature data set (MOHSST5) [Parker et al., 1994] of the Global Ocean Surface Temperature Atlas (GOSTA) [Bottomley et al., 1990]. These analyses are compared with each other, with GOSTA, and with an analysis generated by projection (P) onto a set of empirical orthogonal functions (as in Smith et al. [1996]). In theory, the quality of the analyses should rank in the order OS, KF, OI, P, and GOSTA. It is found that the first four give comparable results in the data-rich periods (1951-1991), but at times when data is sparse the first three differ significantly from P and GOSTA. At these times the latter two often have extreme and fluctuating values, prima facie evidence of error. The statistical schemes are also verified against data not used in any of the analyses (proxy records derived from corals and air temperature records from coastal and island stations). We also present evidence that the analysis error estimates are indeed indicative of the quality of the products. At most times the OS and KF products are close to the OI product, but at times of especially poor coverage their use of information from other times is advantageous. The methods appear to reconstruct the major features of the global SST field from very sparse data. Comparison with other indications of the El Niño-Southern Oscillation cycle show that the analyses provide usable information on interannual variability as far back as the 1860s.

  4. Effects of Learning Style and Training Method on Computer Attitude and Performance in World Wide Web Page Design Training.

    ERIC Educational Resources Information Center

    Chou, Huey-Wen; Wang, Yu-Fang

    1999-01-01

    Compares the effects of two training methods on computer attitude and performance in a World Wide Web page design program in a field experiment with high school students in Taiwan. Discusses individual differences, Kolb's Experiential Learning Theory and Learning Style Inventory, Computer Attitude Scale, and results of statistical analyses.…

  5. Wisconsin's forest, 2004: statistics and quality assurance

    Treesearch

    Mark H. Hansen; Charles H. Perry; Gary Brand; Ronald E. McRoberts

    2008-01-01

    The first full, annualized inventory of Wisconsin's forests was completed in 2004 after 6,478 forested plots were visited. An earlier publication summarized the results and presented issue-driven analyses (Perry et al. 2008). This report includes detailed information on forest inventory methods...

  6. Fundamentals of Petroleum.

    ERIC Educational Resources Information Center

    Bureau of Naval Personnel, Washington, DC.

    Basic information on petroleum is presented in this book prepared for naval logistics officers. Petroleum in national defense is discussed in connection with consumption statistics, productive capacity, the world's resources, and steps in logistics. Chemical and geological analyses are presented to familiarize readers with methods of refining, measuring,…

  7. Identifying and characterizing hepatitis C virus hotspots in Massachusetts: a spatial epidemiological approach.

    PubMed

    Stopka, Thomas J; Goulart, Michael A; Meyers, David J; Hutcheson, Marga; Barton, Kerri; Onofrey, Shauna; Church, Daniel; Donahue, Ashley; Chui, Kenneth K H

    2017-04-20

    Hepatitis C virus (HCV) infections have increased during the past decade but little is known about geographic clustering patterns. We used a unique analytical approach, combining geographic information systems (GIS), spatial epidemiology, and statistical modeling to identify and characterize HCV hotspots, statistically significant clusters of census tracts with elevated HCV counts and rates. We compiled sociodemographic and HCV surveillance data (n = 99,780 cases) for Massachusetts census tracts (n = 1464) from 2002 to 2013. We used a five-step spatial epidemiological approach, calculating incremental spatial autocorrelations and Getis-Ord Gi* statistics to identify clusters. We conducted logistic regression analyses to determine factors associated with the HCV hotspots. We identified nine HCV clusters, with the largest in Boston, New Bedford/Fall River, Worcester, and Springfield (p < 0.05). In multivariable analyses, we found that HCV hotspots were independently and positively associated with the percent of the population that was Hispanic (adjusted odds ratio [AOR]: 1.07; 95% confidence interval [CI]: 1.04, 1.09) and the percent of households receiving food stamps (AOR: 1.83; 95% CI: 1.22, 2.74). HCV hotspots were independently and negatively associated with the percent of the population that were high school graduates or higher (AOR: 0.91; 95% CI: 0.89, 0.93) and the percent of the population in the "other" race/ethnicity category (AOR: 0.88; 95% CI: 0.85, 0.91). We identified locations where HCV clusters were a concern, and where enhanced HCV prevention, treatment, and care can help combat the HCV epidemic in Massachusetts. GIS, spatial epidemiological and statistical analyses provided a rigorous approach to identify hotspot clusters of disease, which can inform public health policy and intervention targeting. Further studies that incorporate spatiotemporal cluster analyses, Bayesian spatial and geostatistical models, spatially weighted regression analyses, and assessment of associations between HCV clustering and the built environment are needed to expand upon our combined spatial epidemiological and statistical methods.
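
    A minimal sketch of the Getis-Ord Gi* hotspot step on a toy lattice of "tracts", assuming the PySAL libraries (libpysal, esda) are available; the study used Massachusetts census-tract geographies and surveillance counts rather than the simulated lattice below.

```python
import numpy as np
from libpysal.weights import lat2W
from esda.getisord import G_Local

rng = np.random.default_rng(0)
side = 10
rates = rng.poisson(lam=20, size=side * side).astype(float)   # simulated HCV cases per tract
rates[:15] += 40                                               # implant a band of elevated counts

w = lat2W(side, side)                      # contiguity weights on the 10x10 lattice
gi_star = G_Local(rates, w, star=True, permutations=999)       # Gi* with permutation inference

# Tracts with high z-scores and significant permutation p-values form the hotspot clusters
hotspots = np.where((gi_star.Zs > 1.96) & (gi_star.p_sim < 0.05))[0]
print("hotspot tract indices:", hotspots)
```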

  8. ParallABEL: an R library for generalized parallelization of genome-wide association studies.

    PubMed

    Sangket, Unitsa; Mahasirimongkol, Surakameth; Chantratita, Wasun; Tandayya, Pichaya; Aulchenko, Yurii S

    2010-04-29

    Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. Acquiring the programming skills to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files is arduous, however. Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics; the input data of this group are the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample; the input data of this group are the individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses; the input data of this group are pairs of individuals. The final group concerns pair-wise statistics derived for pairs of SNPs, such as the linkage disequilibrium characterisation; the input data of this group are pairs of SNPs. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. The ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC), which includes 2,062 individuals genotyped at 545,080 SNPs, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity-by-state matrix was linearly reduced from approximately eight hours to one hour when ParallABEL employed eight processors. Executing genome-wide association analysis using the ParallABEL library on a computer cluster is an effective way to boost performance and simplify the parallelization of GWA studies. ParallABEL is a user-friendly parallelization of GenABEL.
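
    ParallABEL itself is an R library layered on Rmpi, so the snippet below is only an illustration of the paper's first group of computations (independent per-SNP statistics) being farmed out to worker processes. It uses Python's multiprocessing module and simulated genotypes rather than GenABEL data structures, and the simple per-SNP regression is a stand-in for whatever association test is actually run.

    ```python
    import numpy as np
    from multiprocessing import Pool
    from scipy import stats

    def snp_association(args):
        """Per-SNP test: regress a quantitative trait on allele dosage, return the p-value."""
        genotypes, phenotype = args
        slope, intercept, r, p, se = stats.linregress(genotypes, phenotype)
        return p

    def parallel_gwas(genotype_matrix, phenotype, processes=8):
        """Split the SNP-wise work ('group one' in the paper's taxonomy) across workers."""
        tasks = [(genotype_matrix[:, j], phenotype) for j in range(genotype_matrix.shape[1])]
        with Pool(processes) as pool:
            return np.array(pool.map(snp_association, tasks))

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        geno = rng.integers(0, 3, size=(500, 1000)).astype(float)   # 500 people x 1000 SNPs
        pheno = rng.normal(size=500) + 0.3 * geno[:, 10]             # one causal SNP
        p_values = parallel_gwas(geno, pheno, processes=4)
        print(p_values[10], p_values[:5])
    ```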

  9. The application of artificial intelligence to microarray data: identification of a novel gene signature to identify bladder cancer progression.

    PubMed

    Catto, James W F; Abbod, Maysam F; Wild, Peter J; Linkens, Derek A; Pilarsky, Christian; Rehman, Ishtiaq; Rosario, Derek J; Denzinger, Stefan; Burger, Maximilian; Stoehr, Robert; Knuechel, Ruth; Hartmann, Arndt; Hamdy, Freddie C

    2010-03-01

    New methods for identifying bladder cancer (BCa) progression are required. Gene expression microarrays can reveal insights into disease biology and identify novel biomarkers. However, these experiments produce large datasets that are difficult to interpret. To develop a novel method of microarray analysis combining two forms of artificial intelligence (AI), neurofuzzy modelling (NFM) and artificial neural networks (ANN), and to validate it in a BCa cohort. We used AI and statistical analyses to identify progression-related genes in a microarray dataset (n=66 tumours, n=2800 genes). The AI-selected genes were then investigated in a second cohort (n=262 tumours) using immunohistochemistry. We compared the accuracy of AI and statistical approaches to identify tumour progression. AI identified 11 progression-associated genes (odds ratio [OR]: 0.70; 95% confidence interval [CI], 0.56-0.87; p=0.0004), and these were more discriminating than genes chosen using statistical analyses (OR: 1.24; 95% CI, 0.96-1.60; p=0.09). The expression of six AI-selected genes (LIG3, FAS, KRT18, ICAM1, DSG2, and BRCA2) was determined using commercial antibodies and successfully identified tumour progression (concordance index: 0.66; log-rank test: p=0.01). AI-selected genes were more discriminating than pathologic criteria at determining progression (Cox multivariate analysis: p=0.01). Limitations include the use of statistical correlation to identify 200 genes for AI analysis and the fact that we did not compare regression-identified genes with immunohistochemistry. AI and statistical analyses use different techniques of inference to determine gene-phenotype associations and identify distinct prognostic gene signatures that are equally valid. We have identified a prognostic gene signature whose members reflect a variety of carcinogenic pathways that could identify progression in non-muscle-invasive BCa. 2009 European Association of Urology. Published by Elsevier B.V. All rights reserved.

  10. [Quality of clinical studies published in the RBGO over one decade (1999-2009): methodological and ethical aspects and statistical procedures].

    PubMed

    de Sá, Joceline Cássia Ferezini; Marini, Gabriela; Gelaleti, Rafael Bottaro; da Silva, João Batista; de Azevedo, George Gantas; Rudge, Marilza Vieira Cunha

    2013-11-01

    To evaluate the evolution of the methodological and statistical design of publications in the Brazilian Journal of Gynecology and Obstetrics (RBGO) since resolution 196/96. A review of 133 articles published in 1999 (65) and 2009 (68) was performed by two independent reviewers with training in clinical epidemiology and methodology of scientific research. We included all original clinical articles and case and series reports, and excluded editorials, letters to the editor, systematic reviews, experimental studies, opinion articles, and abstracts of theses and dissertations. Characteristics related to the methodological quality of the studies were analyzed in each article using a checklist that evaluated two criteria: methodological aspects and statistical procedures. We used descriptive statistics and the χ2 test for comparison of the two years. There was a difference between 1999 and 2009 in study design and statistical procedures, with more accurate procedures and the use of more robust tests in 2009. In RBGO, we observed an evolution in the methods of published articles and a more in-depth use of statistical analyses, with more sophisticated tests such as regression and multilevel analyses, which are essential techniques for the understanding and planning of health interventions, leading to fewer interpretation errors.

  11. DISSCO: direct imputation of summary statistics allowing covariates

    PubMed Central

    Xu, Zheng; Duan, Qing; Yan, Song; Chen, Wei; Li, Mingyao; Lange, Ethan; Li, Yun

    2015-01-01

    Background: Imputation of individual level genotypes at untyped markers using an external reference panel of genotyped or sequenced individuals has become standard practice in genetic association studies. Direct imputation of summary statistics can also be valuable, for example in meta-analyses where individual level genotype data are not available. Two methods (DIST and ImpG-Summary/LD), that assume a multivariate Gaussian distribution for the association summary statistics, have been proposed for imputing association summary statistics. However, both methods assume that the correlations between association summary statistics are the same as the correlations between the corresponding genotypes. This assumption can be violated in the presence of confounding covariates. Methods: We analytically show that in the absence of covariates, correlation among association summary statistics is indeed the same as that among the corresponding genotypes, thus serving as a theoretical justification for the recently proposed methods. We continue to prove that in the presence of covariates, correlation among association summary statistics becomes the partial correlation of the corresponding genotypes controlling for covariates. We therefore develop direct imputation of summary statistics allowing covariates (DISSCO). Results: We consider two real-life scenarios where the correlation and partial correlation likely make practical difference: (i) association studies in admixed populations; (ii) association studies in presence of other confounding covariate(s). Application of DISSCO to real datasets under both scenarios shows at least comparable, if not better, performance compared with existing correlation-based methods, particularly for lower frequency variants. For example, DISSCO can reduce the absolute deviation from the truth by 3.9–15.2% for variants with minor allele frequency <5%. Availability and implementation: http://www.unc.edu/∼yunmli/DISSCO. Contact: yunli@med.unc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25810429
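
    The paper's central point, that once covariates enter the model the relevant quantity is the partial correlation of genotypes given those covariates, can be checked numerically. The sketch below uses simulated data and is not DISSCO's implementation; it contrasts the raw genotype correlation with the partial correlation obtained by residualising both genotypes on a confounding covariate.

    ```python
    import numpy as np

    def partial_corr(x, y, covariate):
        """Correlation between x and y after regressing out the covariate."""
        c = np.column_stack([np.ones(len(x)), covariate])
        rx = x - c @ np.linalg.lstsq(c, x, rcond=None)[0]   # residualise x
        ry = y - c @ np.linalg.lstsq(c, y, rcond=None)[0]   # residualise y
        return np.corrcoef(rx, ry)[0, 1]

    rng = np.random.default_rng(1)
    ancestry = rng.normal(size=2000)                          # confounder, e.g. admixture proportion
    g1 = rng.binomial(2, 0.3 + 0.1 * (ancestry > 0), 2000).astype(float)
    g2 = rng.binomial(2, 0.3 + 0.1 * (ancestry > 0), 2000).astype(float)

    print("genotype correlation:", np.corrcoef(g1, g2)[0, 1])
    print("partial correlation :", partial_corr(g1, g2, ancestry))
    ```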

  12. Sensitivity Analyses of the Change in FVC in a Phase 3 Trial of Pirfenidone for Idiopathic Pulmonary Fibrosis

    PubMed Central

    Bradford, Williamson Z.; Fagan, Elizabeth A.; Glaspole, Ian; Glassberg, Marilyn K.; Glasscock, Kenneth F.; King, Talmadge E.; Lancaster, Lisa H.; Nathan, Steven D.; Pereira, Carlos A.; Sahn, Steven A.; Swigris, Jeffrey J.; Noble, Paul W.

    2015-01-01

    BACKGROUND: FVC outcomes in clinical trials on idiopathic pulmonary fibrosis (IPF) can be substantially influenced by the analytic methodology and the handling of missing data. We conducted a series of sensitivity analyses to assess the robustness of the statistical finding and the stability of the estimate of the magnitude of treatment effect on the primary end point of FVC change in a phase 3 trial evaluating pirfenidone in adults with IPF. METHODS: Source data included all 555 study participants randomized to treatment with pirfenidone or placebo in the Assessment of Pirfenidone to Confirm Efficacy and Safety in Idiopathic Pulmonary Fibrosis (ASCEND) study. Sensitivity analyses were conducted to assess whether alternative statistical tests and methods for handling missing data influenced the observed magnitude of treatment effect on the primary end point of change from baseline to week 52 in FVC. RESULTS: The distribution of FVC change at week 52 was systematically different between the two treatment groups and favored pirfenidone in each analysis. The method used to impute missing data due to death had a marked effect on the magnitude of change in FVC in both treatment groups; however, the magnitude of treatment benefit was generally consistent on a relative basis, with an approximate 50% reduction in FVC decline observed in the pirfenidone group in each analysis. CONCLUSIONS: Our results confirm the robustness of the statistical finding on the primary end point of change in FVC in the ASCEND trial and corroborate the estimated magnitude of the pirfenidone treatment effect in patients with IPF. TRIAL REGISTRY: ClinicalTrials.gov; No.: NCT01366209; URL: www.clinicaltrials.gov PMID:25856121

  13. Effect of plasma spraying modes on material properties of internal combustion engine cylinder liners

    NASA Astrophysics Data System (ADS)

    Timokhova, O. M.; Burmistrova, O. N.; Sirina, E. A.; Timokhov, R. S.

    2018-03-01

    The paper analyses different methods of remanufacturing worn-out machine parts in order to obtain the best performance characteristics. One of the most promising of these is the plasma spraying method. The mathematical models presented in the paper are intended to predict the results of plasma spraying and its effect on the properties of the material of internal combustion engine cylinder liners under repair. The experimental data and research results were processed with the Statistica 10.0 software package. The pair correlation coefficient values (R) and the F-statistic criterion are given to confirm the statistical properties and adequacy of the obtained regression equations.

  14. Flux control coefficients determined by inhibitor titration: the design and analysis of experiments to minimize errors.

    PubMed Central

    Small, J R

    1993-01-01

    This paper is a study of the effects of experimental error on the estimated values of flux control coefficients obtained using specific inhibitors. Two possible techniques for analysing the experimental data are compared: a simple extrapolation method (the so-called graph method) and a non-linear function fitting method. For these techniques, the sources of systematic errors are identified and the effects of systematic and random errors are quantified, using both statistical analysis and numerical computation. It is shown that the graph method is very sensitive to random errors and that, under all conditions studied, the fitting method outperformed the graph method, even when the assumptions underlying the fitted function did not hold. Possible ways of designing experiments to minimize the effects of experimental errors are analysed and discussed. PMID:8257434

  15. An operational definition of a statistically meaningful trend.

    PubMed

    Bryhn, Andreas C; Dimberg, Peter H

    2011-04-28

    Linear trend analysis of time series is standard procedure in many scientific disciplines. If the number of data is large, a trend may be statistically significant even if data are scattered far from the trend line. This study introduces and tests a quality criterion for time trends referred to as statistical meaningfulness, which is a stricter quality criterion for trends than high statistical significance. The time series is divided into intervals and interval mean values are calculated. Thereafter, r² and p values are calculated from regressions concerning time and interval mean values. If r² ≥ 0.65 at p ≤ 0.05 in any of these regressions, then the trend is regarded as statistically meaningful. Out of ten investigated time series from different scientific disciplines, five displayed statistically meaningful trends. A Microsoft Excel application (add-in) was developed which can perform statistical meaningfulness tests and which may increase the operationality of the test. The presented method for distinguishing statistically meaningful trends should be reasonably uncomplicated for researchers with basic statistics skills and may thus be useful for determining which trends are worth analysing further, for instance with respect to causal factors. The method can also be used for determining which segments of a time trend may be particularly worthwhile to focus on.
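
    The criterion is simple enough to express directly. A minimal sketch follows, assuming contiguous, roughly equal-sized intervals and an arbitrary set of candidate interval counts (choices the paper leaves to the analyst):

    ```python
    import numpy as np
    from scipy import stats

    def statistically_meaningful(t, y, interval_counts=(3, 4, 5), r2_min=0.65, p_max=0.05):
        """Regress interval means on interval mid-times; flag the trend if any
        regression reaches r^2 >= 0.65 at p <= 0.05."""
        order = np.argsort(t)
        for k in interval_counts:
            groups = np.array_split(order, k)                 # contiguous intervals in time
            t_means = np.array([t[g].mean() for g in groups])
            y_means = np.array([y[g].mean() for g in groups])
            slope, intercept, r, p, se = stats.linregress(t_means, y_means)
            if r ** 2 >= r2_min and p <= p_max:
                return True
        return False

    t = np.arange(100, dtype=float)
    y = 0.05 * t + np.random.default_rng(2).normal(scale=2.0, size=100)   # noisy upward trend
    print(statistically_meaningful(t, y))
    ```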

  16. Healthy Worker Effect Phenomenon: Revisited with Emphasis on Statistical Methods – A Review

    PubMed Central

    Chowdhury, Ritam; Shah, Divyang; Payal, Abhishek R.

    2017-01-01

    Known since 1885 but studied systematically only in the past four decades, the healthy worker effect (HWE) is a special form of selection bias common to occupational cohort studies. The phenomenon has been under debate for many years with respect to its impact, conceptual approach (confounding, selection bias, or both), and ways to resolve or account for its effect. The effect is not uniform across age groups, gender, race, and types of occupations, nor is it constant over time. Hence, assessing HWE and accounting for it in statistical analyses is complicated and requires sophisticated methods. Here, we review the HWE, factors affecting it, and methods developed so far to deal with it. PMID:29391741

  17. Distinguishing synchronous and time-varying synergies using point process interval statistics: motor primitives in frog and rat

    PubMed Central

    Hart, Corey B.; Giszter, Simon F.

    2013-01-01

    We present and apply a method that uses point process statistics to discriminate the forms of synergies in motor pattern data, prior to explicit synergy extraction. The method uses electromyogram (EMG) pulse peak timing or onset timing. Peak timing is preferable in complex patterns where pulse onsets may be overlapping. An interval statistic derived from the point processes of EMG peak timings distinguishes time-varying synergies from synchronous synergies (SS). Model data show that the statistic is robust under most conditions. Its application to both frog hindlimb EMG and rat locomotion hindlimb EMG shows that data from these preparations are clearly most consistent with synchronous synergy models (p < 0.001). Additional direct tests of pulse and interval relations in frog data further bolster the support for synchronous synergy mechanisms in these data. Our method and analyses support separated control of rhythm and pattern of motor primitives, with the low-level execution primitives comprising pulsed SS in both frog and rat, and both episodic and rhythmic behaviors. PMID:23675341

  18. An Adaptive Association Test for Multiple Phenotypes with GWAS Summary Statistics.

    PubMed

    Kim, Junghi; Bai, Yun; Pan, Wei

    2015-12-01

    We study the problem of testing for single marker-multiple phenotype associations based on genome-wide association study (GWAS) summary statistics without access to individual-level genotype and phenotype data. For most published GWASs, because obtaining summary data is substantially easier than accessing individual-level phenotype and genotype data, while often multiple correlated traits have been collected, the problem studied here has become increasingly important. We propose a powerful adaptive test and compare its performance with some existing tests. We illustrate its applications to analyses of a meta-analyzed GWAS dataset with three blood lipid traits and another with sex-stratified anthropometric traits, and further demonstrate its potential power gain over some existing methods through realistic simulation studies. We start from the situation with only one set of (possibly meta-analyzed) genome-wide summary statistics, then extend the method to meta-analysis of multiple sets of genome-wide summary statistics, each from one GWAS. We expect the proposed test to be useful in practice as more powerful than or complementary to existing methods. © 2015 WILEY PERIODICALS, INC.

  19. METHODS OF DEALING WITH VALUES BELOW THE LIMIT OF DETECTION USING SAS

    EPA Science Inventory

    Due to limitations of chemical analysis procedures, small concentrations cannot be precisely measured. These concentrations are said to be below the limit of detection (LOD). In statistical analyses, these values are often censored and substituted with a constant value, such ...
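
    The effect of the constant-substitution strategies alluded to above (for example LOD, LOD/2 or LOD/√2) is easy to demonstrate on simulated data. The sketch below is illustrative only and is unrelated to the SAS procedures the record refers to.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    true_conc = rng.lognormal(mean=0.0, sigma=1.0, size=1000)   # "true" concentrations
    lod = 0.5
    observed = np.where(true_conc < lod, np.nan, true_conc)      # values below the LOD are censored

    for label, fill in [("LOD", lod), ("LOD/2", lod / 2),
                        ("LOD/sqrt(2)", lod / np.sqrt(2)), ("zero", 0.0)]:
        est = np.where(np.isnan(observed), fill, observed).mean()
        print(f"{label:12s} substitution -> mean {est:.3f} (true mean {true_conc.mean():.3f})")
    ```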

  20. Periodicity of microfilariae of human filariasis analysed by a trigonometric method (Aikat and Das).

    PubMed

    Tanaka, H

    1981-04-01

    The microfilarial periodicity of human filariae was characterized statistically by fitting the observed change of microfilaria (mf) counts to the formula of a simple harmonic wave using two parameters, the peak hour (K) and periodicity index (D) (Sasa & Tanaka, 1972, 1974). Later Aikat and Das (1976) proposed a simple calculation method using trigonometry (A-D method) to determine the peak hour (K) and periodicity index (P). All data of microfilarial periodicity analysed previously by the method of Sasa and Tanaka (S-T method) were calculated again by the A-D method in the present study to evaluate the latter method. The results of calculations showed that P was not proportional to D and the ratios of P/D were mostly smaller than expected, especially when P or D was small in less periodic forms. The peak hour calculated by the A-D method did not differ much from that calculated by the S-T method. Goodness of fit was improved slightly by the A-D method in two thirds of the analysed data. The classification of human filariae with respect to the type of periodicity was, however, changed little by the results calculated by the A-D method.
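
    For readers unfamiliar with the underlying model, a least-squares fit of a simple harmonic wave to hourly mf counts can be sketched as follows. The parameterisation m[1 + (D/100)·cos 15(t − K)°] is assumed here as a plausible reading of the Sasa-Tanaka formulation and may differ in detail from both the S-T and A-D calculations.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def harmonic(t_hours, m, d, k):
        """Simple harmonic wave: mean level m, periodicity index d (%), peak hour k."""
        return m * (1.0 + (d / 100.0) * np.cos(np.radians(15.0 * (t_hours - k))))

    # simulated two-hourly mf counts peaking around midnight (nocturnally periodic form)
    t = np.arange(0, 24, 2.0)
    counts = harmonic(t, m=50, d=90, k=24) + np.random.default_rng(4).normal(scale=3, size=t.size)

    (m_hat, d_hat, k_hat), _ = curve_fit(harmonic, t, counts, p0=[counts.mean(), 50.0, 22.0])
    print(f"mean={m_hat:.1f}, periodicity index={d_hat:.1f}%, peak hour={k_hat % 24:.1f}")
    ```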

  1. Training in metabolomics research. II. Processing and statistical analysis of metabolomics data, metabolite identification, pathway analysis, applications of metabolomics and its future

    PubMed Central

    Barnes, Stephen; Benton, H. Paul; Casazza, Krista; Cooper, Sara; Cui, Xiangqin; Du, Xiuxia; Engler, Jeffrey; Kabarowski, Janusz H.; Li, Shuzhao; Pathmasiri, Wimal; Prasain, Jeevan K.; Renfrow, Matthew B.; Tiwari, Hemant K.

    2017-01-01

    Metabolomics, a systems biology discipline representing analysis of known and unknown pathways of metabolism, has grown tremendously over the past 20 years. Because of its comprehensive nature, metabolomics requires careful consideration of the question(s) being asked, the scale needed to answer the question(s), collection and storage of the sample specimens, methods for extraction of the metabolites from biological matrices, the analytical method(s) to be employed and the quality control of the analyses, how collected data are correlated, the statistical methods to determine metabolites undergoing significant change, putative identification of metabolites, and the use of stable isotopes to aid in verifying metabolite identity and establishing pathway connections and fluxes. This second part of a comprehensive description of the methods of metabolomics focuses on data analysis, emerging methods in metabolomics and the future of this discipline. PMID:28239968

  2. Estimating population diversity with CatchAll

    PubMed Central

    Bunge, John; Woodard, Linda; Böhning, Dankmar; Foster, James A.; Connolly, Sean; Allen, Heather K.

    2012-01-01

    Motivation: The massive data produced by next-generation sequencing require advanced statistical tools. We address estimating the total diversity or species richness in a population. To date, only relatively simple methods have been implemented in available software. There is a need for software employing modern, computationally intensive statistical analyses including error, goodness-of-fit and robustness assessments. Results: We present CatchAll, a fast, easy-to-use, platform-independent program that computes maximum likelihood estimates for finite-mixture models, weighted linear regression-based analyses and coverage-based non-parametric methods, along with outlier diagnostics. Given sample ‘frequency count’ data, CatchAll computes 12 different diversity estimates and applies a model-selection algorithm. CatchAll also derives discounted diversity estimates to adjust for possibly uncertain low-frequency counts. It is accompanied by an Excel-based graphics program. Availability: Free executable downloads for Linux, Windows and Mac OS, with manual and source code, at www.northeastern.edu/catchall. Contact: jab18@cornell.edu PMID:22333246
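
    CatchAll's finite-mixture and coverage-based estimators do not fit in a few lines, but the flavour of estimating richness from 'frequency count' data can be shown with the much simpler non-parametric Chao1 lower bound (a different estimator, used here purely for illustration):

    ```python
    from collections import Counter

    def chao1(frequency_counts):
        """Chao1 lower-bound richness from frequency-count data {frequency: number of species}."""
        s_obs = sum(frequency_counts.values())
        f1 = frequency_counts.get(1, 0)   # singletons
        f2 = frequency_counts.get(2, 0)   # doubletons
        if f2 == 0:
            return s_obs + f1 * (f1 - 1) / 2.0       # bias-corrected form when no doubletons
        return s_obs + f1 * f1 / (2.0 * f2)

    # reads assigned to OTUs -> frequency counts -> richness estimate
    reads = ["otu%d" % i for i in (1, 1, 1, 2, 2, 3, 4, 5, 5, 5, 5, 6)]
    freq_counts = dict(Counter(Counter(reads).values()))
    print(freq_counts, "-> Chao1 estimate:", chao1(freq_counts))
    ```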

  3. Geographically Sourcing Cocaine’s Origin – Delineation of the Nineteen Major Coca Growing Regions in South America

    PubMed Central

    Mallette, Jennifer R.; Casale, John F.; Jordan, James; Morello, David R.; Beyer, Paul M.

    2016-01-01

    Previously, geo-sourcing to five major coca growing regions within South America was accomplished. However, the expansion of coca cultivation throughout South America made sub-regional origin determinations increasingly difficult. The former methodology was recently enhanced with additional stable isotope analyses (2H and 18O) to fully characterize cocaine due to the varying environmental conditions in which the coca was grown. An improved data analysis method was implemented with the combination of machine learning and multivariate statistical analysis methods to provide further partitioning between growing regions. Here, we show how the combination of trace cocaine alkaloids, stable isotopes, and multivariate statistical analyses can be used to classify illicit cocaine as originating from one of 19 growing regions within South America. The data obtained through this approach can be used to describe current coca cultivation and production trends, highlight trafficking routes, as well as identify new coca growing regions. PMID:27006288

  4. An Assessment of Phylogenetic Tools for Analyzing the Interplay Between Interspecific Interactions and Phenotypic Evolution.

    PubMed

    Drury, J P; Grether, G F; Garland, T; Morlon, H

    2018-05-01

    Much ecological and evolutionary theory predicts that interspecific interactions often drive phenotypic diversification and that species phenotypes in turn influence species interactions. Several phylogenetic comparative methods have been developed to assess the importance of such processes in nature; however, the statistical properties of these methods have gone largely untested. Focusing mainly on scenarios of competition between closely-related species, we assess the performance of available comparative approaches for analyzing the interplay between interspecific interactions and species phenotypes. We find that many currently used statistical methods often fail to detect the impact of interspecific interactions on trait evolution, that sister-taxa analyses are particularly unreliable in general, and that recently developed process-based models have more satisfactory statistical properties. Methods for detecting predictors of species interactions are generally more reliable than methods for detecting character displacement. In weighing the strengths and weaknesses of different approaches, we hope to provide a clear guide for empiricists testing hypotheses about the reciprocal effect of interspecific interactions and species phenotypes and to inspire further development of process-based models.

  5. HAPRAP: a haplotype-based iterative method for statistical fine mapping using GWAS summary statistics.

    PubMed

    Zheng, Jie; Rodriguez, Santiago; Laurin, Charles; Baird, Denis; Trela-Larsen, Lea; Erzurumluoglu, Mesut A; Zheng, Yi; White, Jon; Giambartolomei, Claudia; Zabaneh, Delilah; Morris, Richard; Kumari, Meena; Casas, Juan P; Hingorani, Aroon D; Evans, David M; Gaunt, Tom R; Day, Ian N M

    2017-01-01

    Fine mapping is a widely used approach for identifying the causal variant(s) at disease-associated loci. Standard methods (e.g. multiple regression) require individual-level genotypes. Recent fine mapping methods using summary-level data require the pairwise correlation coefficients (r²) of the variants. However, haplotypes, rather than pairwise r², are the true biological representation of linkage disequilibrium (LD) among multiple loci. In this article, we present an empirical iterative method, HAPlotype Regional Association analysis Program (HAPRAP), that enables fine mapping using summary statistics and haplotype information from an individual-level reference panel. Simulations with individual-level genotypes show that the results of HAPRAP and multiple regression are highly consistent. In simulations with summary-level data, we demonstrate that HAPRAP is less sensitive to poor LD estimates. In a parametric simulation using Genetic Investigation of ANthropometric Traits height data, HAPRAP performs well with a small training sample size (N < 2000) while other methods become suboptimal. Moreover, HAPRAP's performance is not affected substantially by single nucleotide polymorphisms (SNPs) with low minor allele frequencies. We applied the method to existing quantitative trait and binary outcome meta-analyses (human height, QTc interval and gallbladder disease); all previously reported association signals were replicated and two additional variants were independently associated with human height. Due to the growing availability of summary-level data, the value of HAPRAP is likely to increase markedly for future analyses (e.g. functional prediction and identification of instruments for Mendelian randomization). The HAPRAP package and documentation are available at http://apps.biocompute.org.uk/haprap/. Contact: jie.zheng@bristol.ac.uk or tom.gaunt@bristol.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  6. Statistical Analyses for Probabilistic Assessments of the Reactor Pressure Vessel Structural Integrity: Building a Master Curve on an Extract of the 'Euro' Fracture Toughness Dataset, Controlling Statistical Uncertainty for Both Mono-Temperature and Multi-Temperature Tests

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Josse, Florent; Lefebvre, Yannick; Todeschini, Patrick

    2006-07-01

    Assessing the structural integrity of a nuclear Reactor Pressure Vessel (RPV) subjected to pressurized-thermal-shock (PTS) transients is extremely important to safety. In addition to conventional deterministic calculations to confirm RPV integrity, Electricite de France (EDF) carries out probabilistic analyses. Probabilistic analyses are interesting because some key variables, albeit conventionally taken at conservative values, can be modeled more accurately through statistical variability. One variable which significantly affects RPV structural integrity assessment is cleavage fracture initiation toughness. The reference fracture toughness method currently in use at EDF is the RCCM and ASME Code lower-bound K_IC based on the indexing parameter RT_NDT. However, in order to quantify the toughness scatter for probabilistic analyses, the master curve method is being analyzed at present. Furthermore, the master curve method is a direct means of evaluating fracture toughness based on K_JC data. In the framework of the master curve investigation undertaken by EDF, this article deals with the following two statistical items: building a master curve from an extract of a fracture toughness dataset (from the European project 'Unified Reference Fracture Toughness Design curves for RPV Steels') and controlling statistical uncertainty for both mono-temperature and multi-temperature tests. Concerning the first point, master curve temperature dependence is empirical in nature. To determine the 'original' master curve, Wallin postulated that a unified description of fracture toughness temperature dependence for ferritic steels is possible, and used a large number of data corresponding to nuclear-grade pressure vessel steels and welds. Our working hypothesis is that some ferritic steels may behave in slightly different ways. Therefore we focused exclusively on the French reactor vessel base metals of types A508 Class 3 and A533 Grade B Class 1, taking the sampling level and direction into account as well as the test specimen type. As for the second point, the emphasis is placed on the uncertainties in applying the master curve approach. For a toughness dataset based on different specimens of a single product, application of the master curve methodology requires the statistical estimation of one parameter: the reference temperature T_0. Because of the limited number of specimens, estimation of this temperature is uncertain. The ASTM standard provides a rough evaluation of this statistical uncertainty through an approximate confidence interval. In this paper, a thorough study is carried out to build more meaningful confidence intervals (for both mono-temperature and multi-temperature tests). These results ensure better control over uncertainty, and allow rigorous analysis of the impact of its influencing factors: the number of specimens and the temperatures at which they have been tested. (authors)
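
    For orientation, the master curve referred to above places the median toughness at K_Jc(med) = 30 + 70·exp[0.019(T − T_0)] MPa√m (ASTM E1921). The sketch below inverts that relation after a naive mono-temperature fit; it deliberately omits the censoring, specimen-size adjustment and bias corrections that a real estimate of T_0, and the confidence intervals discussed in the article, would require.

    ```python
    import numpy as np

    def t0_single_temperature(k_jc, test_temp_c):
        """Rough mono-temperature T0 estimate (no censoring, size adjustment or
        bias correction, all of which a standard-conforming analysis requires)."""
        k_jc = np.asarray(k_jc, dtype=float)
        # Weibull (shape 4, threshold 20 MPa*sqrt(m)) maximum-likelihood scale parameter
        k0 = 20.0 + np.mean((k_jc - 20.0) ** 4) ** 0.25
        k_med = 20.0 + (k0 - 20.0) * np.log(2.0) ** 0.25
        # invert the master curve K_med = 30 + 70*exp[0.019*(T - T0)]
        return test_temp_c - np.log((k_med - 30.0) / 70.0) / 0.019

    toughness = [62.0, 85.0, 95.0, 110.0, 130.0, 74.0]   # illustrative K_Jc values at -60 C
    print("estimated T0 ~ %.1f C" % t0_single_temperature(toughness, -60.0))
    ```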

  7. The use of belief-based probabilistic methods in volcanology: Scientists' views and implications for risk assessments

    NASA Astrophysics Data System (ADS)

    Donovan, Amy; Oppenheimer, Clive; Bravo, Michael

    2012-12-01

    This paper constitutes a philosophical and social scientific study of expert elicitation in the assessment and management of volcanic risk on Montserrat during the 1995-present volcanic activity. It outlines the broader context of subjective probabilistic methods and then uses a mixed-method approach to analyse the use of these methods in volcanic crises. Data from a global survey of volcanologists regarding the use of statistical methods in hazard assessment are presented. Detailed qualitative data from Montserrat are then discussed, particularly concerning the expert elicitation procedure that was pioneered during the eruptions. These data are analysed and conclusions about the use of these methods in volcanology are drawn. The paper finds that while many volcanologists are open to the use of these methods, there are still some concerns, which are similar to the concerns encountered in the literature on probabilistic and deterministic approaches to seismic hazard analysis.

  8. Evaluating statistical and clinical significance of intervention effects in single-case experimental designs: an SPSS method to analyze univariate data.

    PubMed

    Maric, Marija; de Haan, Else; Hogendoorn, Sanne M; Wolters, Lidewij H; Huizenga, Hilde M

    2015-03-01

    Single-case experimental designs are useful methods in clinical research practice to investigate individual client progress. Their proliferation might have been hampered by methodological challenges such as the difficulty applying existing statistical procedures. In this article, we describe a data-analytic method to analyze univariate (i.e., one symptom) single-case data using the common package SPSS. This method can help the clinical researcher to investigate whether an intervention works as compared with a baseline period or another intervention type, and to determine whether symptom improvement is clinically significant. First, we describe the statistical method in a conceptual way and show how it can be implemented in SPSS. Simulation studies were performed to determine the number of observation points required per intervention phase. Second, to illustrate this method and its implications, we present a case study of an adolescent with anxiety disorders treated with cognitive-behavioral therapy techniques in an outpatient psychotherapy clinic, whose symptoms were regularly assessed before each session. We provide a description of the data analyses and results of this case study. Finally, we discuss the advantages and shortcomings of the proposed method. Copyright © 2014. Published by Elsevier Ltd.

  9. Meta‐analysis using individual participant data: one‐stage and two‐stage approaches, and why they may differ

    PubMed Central

    Ensor, Joie; Riley, Richard D.

    2016-01-01

    Meta‐analysis using individual participant data (IPD) obtains and synthesises the raw, participant‐level data from a set of relevant studies. The IPD approach is becoming an increasingly popular tool as an alternative to traditional aggregate data meta‐analysis, especially as it avoids reliance on published results and provides an opportunity to investigate individual‐level interactions, such as treatment‐effect modifiers. There are two statistical approaches for conducting an IPD meta‐analysis: one‐stage and two‐stage. The one‐stage approach analyses the IPD from all studies simultaneously, for example, in a hierarchical regression model with random effects. The two‐stage approach derives aggregate data (such as effect estimates) in each study separately and then combines these in a traditional meta‐analysis model. There have been numerous comparisons of the one‐stage and two‐stage approaches via theoretical consideration, simulation and empirical examples, yet there remains confusion regarding when each approach should be adopted, and indeed why they may differ. In this tutorial paper, we outline the key statistical methods for one‐stage and two‐stage IPD meta‐analyses, and provide 10 key reasons why they may produce different summary results. We explain that most differences arise because of different modelling assumptions, rather than the choice of one‐stage or two‐stage itself. We illustrate the concepts with recently published IPD meta‐analyses, summarise key statistical software and provide recommendations for future IPD meta‐analyses. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:27747915

  10. Statistical Performances of Resistive Active Power Splitter

    NASA Astrophysics Data System (ADS)

    Lalléchère, Sébastien; Ravelo, Blaise; Thakur, Atul

    2016-03-01

    In this paper, the synthesis and sensitivity analysis of an active power splitter (PWS) are presented. The splitter is based on an active cell composed of a field-effect transistor in cascade with a shunted resistor at the input and the output (resistive amplifier topology). The PWS uncertainty versus resistance tolerances is assessed using a stochastic method. Furthermore, with the proposed topology, the device gain can be controlled easily by varying a resistance. This provides a useful tool for analysing the statistical sensitivity of the system in an uncertain environment.

  11. Performance of statistical process control methods for regional surgical site infection surveillance: a 10-year multicentre pilot study.

    PubMed

    Baker, Arthur W; Haridy, Salah; Salem, Joseph; Ilieş, Iulian; Ergai, Awatef O; Samareh, Aven; Andrianas, Nicholas; Benneyan, James C; Sexton, Daniel J; Anderson, Deverick J

    2017-11-24

    Traditional strategies for surveillance of surgical site infections (SSI) have multiple limitations, including delayed and incomplete outbreak detection. Statistical process control (SPC) methods address these deficiencies by combining longitudinal analysis with graphical presentation of data. We performed a pilot study within a large network of community hospitals to evaluate performance of SPC methods for detecting SSI outbreaks. We applied conventional Shewhart and exponentially weighted moving average (EWMA) SPC charts to 10 previously investigated SSI outbreaks that occurred from 2003 to 2013. We compared the results of SPC surveillance to the results of traditional SSI surveillance methods. Then, we analysed the performance of modified SPC charts constructed with different outbreak detection rules, EWMA smoothing factors and baseline SSI rate calculations. Conventional Shewhart and EWMA SPC charts both detected 8 of the 10 SSI outbreaks analysed, in each case prior to the date of traditional detection. Among detected outbreaks, conventional Shewhart chart detection occurred a median of 12 months prior to outbreak onset and 22 months prior to traditional detection. Conventional EWMA chart detection occurred a median of 7 months prior to outbreak onset and 14 months prior to traditional detection. Modified Shewhart and EWMA charts additionally detected several outbreaks earlier than conventional SPC charts. Shewhart and EWMA charts had low false-positive rates when used to analyse separate control hospital SSI data. Our findings illustrate the potential usefulness and feasibility of real-time SPC surveillance of SSI to rapidly identify outbreaks and improve patient safety. Further study is needed to optimise SPC chart selection and calculation, statistical outbreak detection rules and the process for reacting to signals of potential outbreaks. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
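
    As an illustration of the EWMA charts applied in the study, the sketch below monitors a monthly SSI rate; the smoothing factor, control-limit width and baseline length are arbitrary illustrative choices, not the settings used by the authors.

    ```python
    import numpy as np

    def ewma_signals(rates, lam=0.2, L=2.7, baseline_n=24):
        """EWMA chart for a monthly rate: z_t = lam*x_t + (1-lam)*z_{t-1}, with
        UCL_t = mu0 + L*sigma0*sqrt(lam/(2-lam)*(1-(1-lam)**(2t)))."""
        rates = np.asarray(rates, dtype=float)
        mu0 = rates[:baseline_n].mean()                  # baseline (in-control) mean
        sigma0 = rates[:baseline_n].std(ddof=1)          # baseline standard deviation
        z, signals = mu0, []
        for t, x in enumerate(rates, start=1):
            z = lam * x + (1 - lam) * z
            ucl = mu0 + L * sigma0 * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
            signals.append(z > ucl)
        return signals

    monthly_rate = [1.0, 0.8, 1.2, 1.1, 0.9] * 5 + [2.2, 2.5, 2.8, 3.0]   # upward shift at the end
    print([i for i, s in enumerate(ewma_signals(monthly_rate)) if s])      # months that signal
    ```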

  12. Methods for estimating low-flow statistics for Massachusetts streams

    USGS Publications Warehouse

    Ries, Kernell G.; Friesz, Paul J.

    2000-01-01

    Methods and computer software are described in this report for determining flow duration, low-flow frequency statistics, and August median flows. These low-flow statistics can be estimated for unregulated streams in Massachusetts using different methods depending on whether the location of interest is at a streamgaging station, a low-flow partial-record station, or an ungaged site where no data are available. Low-flow statistics for streamgaging stations can be estimated using standard U.S. Geological Survey methods described in the report. The MOVE.1 mathematical method and a graphical correlation method can be used to estimate low-flow statistics for low-flow partial-record stations. The MOVE.1 method is recommended when the relation between measured flows at a partial-record station and daily mean flows at a nearby, hydrologically similar streamgaging station is linear, and the graphical method is recommended when the relation is curved. Equations are presented for computing the variance and equivalent years of record for estimates of low-flow statistics for low-flow partial-record stations when either a single or multiple index stations are used to determine the estimates. The drainage-area ratio method or regression equations can be used to estimate low-flow statistics for ungaged sites where no data are available. The drainage-area ratio method is generally as accurate as or more accurate than regression estimates when the drainage-area ratio for an ungaged site is between 0.3 and 1.5 times the drainage area of the index data-collection site. Regression equations were developed to estimate the natural, long-term 99-, 98-, 95-, 90-, 85-, 80-, 75-, 70-, 60-, and 50-percent duration flows; the 7-day, 2-year and the 7-day, 10-year low flows; and the August median flow for ungaged sites in Massachusetts. Streamflow statistics and basin characteristics for 87 to 133 streamgaging stations and low-flow partial-record stations were used to develop the equations. The streamgaging stations had from 2 to 81 years of record, with a mean record length of 37 years. The low-flow partial-record stations had from 8 to 36 streamflow measurements, with a median of 14 measurements. All basin characteristics were determined from digital map data. The basin characteristics that were statistically significant in most of the final regression equations were drainage area, the area of stratified-drift deposits per unit of stream length plus 0.1, mean basin slope, and an indicator variable that was 0 in the eastern region and 1 in the western region of Massachusetts. The equations were developed by use of weighted-least-squares regression analyses, with weights assigned proportional to the years of record and inversely proportional to the variances of the streamflow statistics for the stations. Standard errors of prediction ranged from 70.7 to 17.5 percent for the equations to predict the 7-day, 10-year low flow and 50-percent duration flow, respectively. The equations are not applicable for use in the Southeast Coastal region of the State, or where basin characteristics for the selected ungaged site are outside the ranges of those for the stations used in the regression analyses. A World Wide Web application was developed that provides streamflow statistics for data collection stations from a data base and for ungaged sites by measuring the necessary basin characteristics for the site and solving the regression equations. 
Output provided by the Web application for ungaged sites includes a map of the drainage-basin boundary determined for the site, the measured basin characteristics, the estimated streamflow statistics, and 90-percent prediction intervals for the estimates. An equation is provided for combining regression and correlation estimates to obtain improved estimates of the streamflow statistics for low-flow partial-record stations. An equation is also provided for combining regression and drainage-area ratio estimates to obtain improved e
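
    The drainage-area ratio method described above reduces to a one-line transfer of the index-gage statistic, scaled by the ratio of drainage areas and restricted to the 0.3 to 1.5 ratio range quoted in the report; the numbers below are invented for illustration.

    ```python
    def drainage_area_ratio_estimate(q_gaged, area_gaged, area_ungaged):
        """Transfer a low-flow statistic from an index gage to an ungaged site."""
        ratio = area_ungaged / area_gaged
        if not 0.3 <= ratio <= 1.5:
            raise ValueError("area ratio %.2f is outside the 0.3-1.5 range in which the "
                             "method is recommended over the regression equations" % ratio)
        return q_gaged * ratio

    # 7-day, 10-year low flow of 0.42 m3/s at an index gage draining 120 km2,
    # transferred to a nearby ungaged site draining 95 km2 (illustrative values)
    print(drainage_area_ratio_estimate(0.42, 120.0, 95.0))
    ```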

  13. Inferring causal relationships between phenotypes using summary statistics from genome-wide association studies.

    PubMed

    Meng, Xiang-He; Shen, Hui; Chen, Xiang-Ding; Xiao, Hong-Mei; Deng, Hong-Wen

    2018-03-01

    Genome-wide association studies (GWAS) have successfully identified numerous genetic variants associated with diverse complex phenotypes and diseases, and provided tremendous opportunities for further analyses using summary association statistics. Recently, Pickrell et al. developed a robust method for causal inference using independent putative causal SNPs. However, this method may fail to infer the causal relationship between two phenotypes when only a limited number of independent putative causal SNPs are identified. Here, we extended Pickrell's method to make it applicable to more general situations. We extended the causal inference method by replacing the putative causal SNPs with the lead SNPs (the set of the most significant SNPs in each independent locus) and tested the performance of our extended method using both simulated and empirical data. Simulations suggested that when the same number of genetic variants is used, our extended method had a similar distribution of the test statistic under the null model and comparable power under the causal model compared with the original method by Pickrell et al. In practice, however, our extended method would generally be more powerful because the number of independent lead SNPs is often larger than the number of independent putative causal SNPs; including more SNPs, on the other hand, did not cause more false positives. By applying our extended method to summary statistics from GWAS for blood metabolites and femoral neck bone mineral density (FN-BMD), we successfully identified ten blood metabolites that may causally influence FN-BMD. We extended a causal inference method for inferring the putative causal relationship between two phenotypes using summary statistics from GWAS, and identified a number of potential causal metabolites for FN-BMD, which may provide novel insights into the pathophysiological mechanisms underlying osteoporosis.

  14. Applications of Stochastic Analyses for Collaborative Learning and Cognitive Assessment

    DTIC Science & Technology

    2007-04-01

    models (Visser, Maartje, Raijmakers, & Molenaar, 2002). The second part of this paper illustrates two applications of the methods described in the...clustering three-way data sets. Computational Statistics and Data Analysis, 51(11), 5368–5376. Visser, I., Maartje, E., Raijmakers, E. J., & Molenaar

  15. An Analysis of Methods Used to Examine Gender Differences in Computer-Related Behavior.

    ERIC Educational Resources Information Center

    Kay, Robin

    1992-01-01

    Review of research investigating gender differences in computer-related behavior examines statistical and methodological flaws. Issues addressed include sample selection, sample size, scale development, scale quality, the use of univariate and multivariate analyses, regressional analysis, construct definition, construct testing, and the…

  16. Use of multiple cluster analysis methods to explore the validity of a community outcomes concept map.

    PubMed

    Orsi, Rebecca

    2017-02-01

    Concept mapping is now a commonly-used technique for articulating and evaluating programmatic outcomes. However, research regarding validity of knowledge and outcomes produced with concept mapping is sparse. The current study describes quantitative validity analyses using a concept mapping dataset. We sought to increase the validity of concept mapping evaluation results by running multiple cluster analysis methods and then using several metrics to choose from among solutions. We present four different clustering methods based on analyses using the R statistical software package: partitioning around medoids (PAM), fuzzy analysis (FANNY), agglomerative nesting (AGNES) and divisive analysis (DIANA). We then used the Dunn and Davies-Bouldin indices to assist in choosing a valid cluster solution for a concept mapping outcomes evaluation. We conclude that the validity of the outcomes map is high, based on the analyses described. Finally, we discuss areas for further concept mapping methods research. Copyright © 2016 Elsevier Ltd. All rights reserved.
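
    The study's comparison of cluster solutions was carried out in R; purely to illustrate the idea of scoring alternative solutions with the Dunn and Davies-Bouldin indices, the sketch below performs an analogous computation in Python (scikit-learn and SciPy) on simulated two-dimensional data.

    ```python
    import numpy as np
    from scipy.spatial.distance import cdist, pdist
    from sklearn.cluster import AgglomerativeClustering, KMeans
    from sklearn.metrics import davies_bouldin_score

    def dunn_index(X, labels):
        """Smallest between-cluster distance divided by largest within-cluster diameter
        (higher is better)."""
        clusters = [X[labels == k] for k in np.unique(labels)]
        min_between = min(cdist(a, b).min()
                          for i, a in enumerate(clusters) for b in clusters[i + 1:])
        max_within = max(pdist(c).max() if len(c) > 1 else 0.0 for c in clusters)
        return min_between / max_within

    rng = np.random.default_rng(5)
    X = np.vstack([rng.normal(loc, 0.4, size=(30, 2)) for loc in ([0, 0], [3, 3], [0, 4])])

    for name, model in [("k-means", KMeans(n_clusters=3, n_init=10, random_state=0)),
                        ("agglomerative", AgglomerativeClustering(n_clusters=3))]:
        labels = model.fit_predict(X)
        print(name, "Dunn %.2f" % dunn_index(X, labels),
              "Davies-Bouldin %.2f" % davies_bouldin_score(X, labels))
    ```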

  17. Statistical process control: A feasibility study of the application of time-series measurement in early neurorehabilitation after acquired brain injury.

    PubMed

    Markovic, Gabriela; Schult, Marie-Louise; Bartfai, Aniko; Elg, Mattias

    2017-01-31

    Progress in early cognitive recovery after acquired brain injury is uneven and unpredictable, and thus the evaluation of rehabilitation is complex. The use of time-series measurements is susceptible to statistical change due to process variation. To evaluate the feasibility of using a time-series method, statistical process control, in early cognitive rehabilitation. Participants were 27 patients with acquired brain injury undergoing interdisciplinary rehabilitation of attention within 4 months post-injury. The outcome measure, the Paced Auditory Serial Addition Test, was analysed using statistical process control. Statistical process control identifies if and when change occurs in the process according to 3 patterns: rapid, steady or stationary performers. The statistical process control method was adjusted, in terms of constructing the baseline and the total number of measurement points, in order to measure a process in change. Statistical process control methodology is feasible for use in early cognitive rehabilitation, since it provides information about change in a process, thus enabling adjustment of the individual treatment response. Together with the results indicating discernible subgroups that respond differently to rehabilitation, statistical process control could be a valid tool in clinical decision-making. This study is a starting-point in understanding the rehabilitation process using a real-time-measurements approach.

  18. Evidence for the Selective Reporting of Analyses and Discrepancies in Clinical Trials: A Systematic Review of Cohort Studies of Clinical Trials

    PubMed Central

    Dwan, Kerry; Altman, Douglas G.; Clarke, Mike; Gamble, Carrol; Higgins, Julian P. T.; Sterne, Jonathan A. C.; Williamson, Paula R.; Kirkham, Jamie J.

    2014-01-01

    Background Most publications about selective reporting in clinical trials have focussed on outcomes. However, selective reporting of analyses for a given outcome may also affect the validity of findings. If analyses are selected on the basis of the results, reporting bias may occur. The aims of this study were to review and summarise the evidence from empirical cohort studies that assessed discrepant or selective reporting of analyses in randomised controlled trials (RCTs). Methods and Findings A systematic review was conducted and included cohort studies that assessed any aspect of the reporting of analyses of RCTs by comparing different trial documents, e.g., protocol compared to trial report, or different sections within a trial publication. The Cochrane Methodology Register, Medline (Ovid), PsycInfo (Ovid), and PubMed were searched on 5 February 2014. Two authors independently selected studies, performed data extraction, and assessed the methodological quality of the eligible studies. Twenty-two studies (containing 3,140 RCTs) published between 2000 and 2013 were included. Twenty-two studies reported on discrepancies between information given in different sources. Discrepancies were found in statistical analyses (eight studies), composite outcomes (one study), the handling of missing data (three studies), unadjusted versus adjusted analyses (three studies), handling of continuous data (three studies), and subgroup analyses (12 studies). Discrepancy rates varied, ranging from 7% (3/42) to 88% (7/8) in statistical analyses, 46% (36/79) to 82% (23/28) in adjusted versus unadjusted analyses, and 61% (11/18) to 100% (25/25) in subgroup analyses. This review is limited in that none of the included studies investigated the evidence for bias resulting from selective reporting of analyses. It was not possible to combine studies to provide overall summary estimates, and so the results of studies are discussed narratively. Conclusions Discrepancies in analyses between publications and other study documentation were common, but reasons for these discrepancies were not discussed in the trial reports. To ensure transparency, protocols and statistical analysis plans need to be published, and investigators should adhere to these or explain discrepancies. Please see later in the article for the Editors' Summary PMID:24959719

  19. Refining cost-effectiveness analyses using the net benefit approach and econometric methods: an example from a trial of anti-depressant treatment.

    PubMed

    Sabes-Figuera, Ramon; McCrone, Paul; Kendricks, Antony

    2013-04-01

    Economic evaluation analyses can be enhanced by employing regression methods, allowing for the identification of important sub-groups, adjustment for imperfect randomisation in clinical trials, or the analysis of non-randomised data. To explore the benefits of combining regression techniques and the standard Bayesian approach to refine cost-effectiveness analyses using data from randomised clinical trials. Data from a randomised trial of anti-depressant treatment were analysed and a regression model was used to explore the factors that have an impact on the net benefit (NB) statistic, with the aim of using these findings to adjust the cost-effectiveness acceptability curves. Exploratory sub-sample analyses were carried out to explore possible differences in cost-effectiveness. Results: The analysis found that having suffered a previous similar depression is strongly correlated with a lower NB, independent of the outcome measure or follow-up point. In patients with previous similar depression, adding a selective serotonin reuptake inhibitor (SSRI) to supportive care for mild-to-moderate depression is probably cost-effective at the threshold used by the English National Institute for Health and Clinical Excellence to make recommendations. This analysis highlights the need for incorporation of econometric methods into cost-effectiveness analyses using the NB approach.
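
    The net-benefit regression described above treats NB_i = λ·effect_i − cost_i as the dependent variable and regresses it on treatment allocation and covariates such as previous depression. The sketch below uses simulated data, an arbitrary willingness-to-pay value λ and statsmodels for the OLS fit; it is not the trial's actual model.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    n, wtp = 400, 20000.0                          # willingness to pay per unit of effect (assumed)
    treated = rng.integers(0, 2, n)                # randomised allocation
    prior_episode = rng.integers(0, 2, n)          # e.g. previous similar depression
    effect = 0.60 + 0.05 * treated - 0.10 * prior_episode + rng.normal(0, 0.15, n)
    cost = 900 + 350 * treated + rng.normal(0, 200, n)

    net_benefit = wtp * effect - cost              # individual-level net benefit
    X = sm.add_constant(np.column_stack([treated, prior_episode, treated * prior_episode]))
    fit = sm.OLS(net_benefit, X).fit()
    print(fit.params)   # the 'treated' coefficient approximates the incremental net benefit
    ```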

  20. Effects of Exercise in the Treatment of Overweight and Obese Children and Adolescents: A Systematic Review of Meta-Analyses

    PubMed Central

    Kelley, George A.; Kelley, Kristi S.

    2013-01-01

    Purpose. Conduct a systematic review of previous meta-analyses addressing the effects of exercise in the treatment of overweight and obese children and adolescents. Methods. Previous meta-analyses of randomized controlled exercise trials that assessed adiposity in overweight and obese children and adolescents were included by searching nine electronic databases and cross-referencing from retrieved studies. Methodological quality was assessed using the Assessment of Multiple Systematic Reviews (AMSTAR) Instrument. The alpha level for statistical significance was set at P ≤ 0.05. Results. Of the 308 studies reviewed, two aggregate data meta-analyses representing 14 and 17 studies and 481 and 701 boys and girls met all eligibility criteria. Methodological quality was 64% and 73%. For both studies, statistically significant reductions in percent body fat were observed (P = 0.006 and P < 0.00001). The number-needed-to treat (NNT) was 4 and 3 with an estimated 24.5 and 31.5 million overweight and obese children in the world potentially benefitting, 2.8 and 3.6 million in the US. No other measures of adiposity (BMI-related measures, body weight, and central obesity) were statistically significant. Conclusions. Exercise is efficacious for reducing percent body fat in overweight and obese children and adolescents. Insufficient evidence exists to suggest that exercise reduces other measures of adiposity. PMID:24455215

  1. Body Weight Reducing Effect of Oral Boric Acid Intake

    PubMed Central

    Aysan, Erhan; Sahin, Fikrettin; Telci, Dilek; Yalvac, Mehmet Emir; Emre, Sinem Hocaoglu; Karaca, Cetin; Muslumanoglu, Mahmut

    2011-01-01

    Background: Boric acid is widely used in biology, but its body weight reducing effect has not been researched. Methods: Twenty mice were divided into two equal groups. Control group mice drank standard tap water, whereas study group mice drank tap water with boric acid added (0.28 mg/250 ml) over five days. Total body weight changes, major organ histopathology, blood biochemistry, and urine and feces analyses were compared. Results: Study group mice lost a mean of 28.1% of body weight, whereas control group mice showed no weight loss and instead gained a mean of 0.09% (p<0.001). Total drinking water and urine outputs were not statistically different. Cholesterol, LDL, AST, ALT, LDH, amylase and urobilinogen levels were statistically significantly higher in the study group. Other variables were not statistically different. No histopathologic differences were detected in evaluations of all resected major organs. Conclusion: Low-dose oral boric acid intake causes serious body weight reduction. Blood and urine analyses support high glucose, lipid and moderate protein catabolism, but the mechanism is unclear. PMID:22135611

  2. BRepertoire: a user-friendly web server for analysing antibody repertoire data.

    PubMed

    Margreitter, Christian; Lu, Hui-Chun; Townsend, Catherine; Stewart, Alexander; Dunn-Walters, Deborah K; Fraternali, Franca

    2018-04-14

    Antibody repertoire analysis by high throughput sequencing is now widely used, but a persisting challenge is enabling immunologists to explore their data to discover discriminating repertoire features for their own particular investigations. Computational methods are necessary for large-scale evaluation of antibody properties. We have developed BRepertoire, a suite of user-friendly web-based software tools for large-scale statistical analyses of repertoire data. The software is able to use data preprocessed by IMGT, and performs statistical and comparative analyses with versatile plotting options. BRepertoire has been designed to operate in various modes, for example analysing sequence-specific V(D)J gene usage, discerning physico-chemical properties of the CDR regions and clustering of clonotypes. Those analyses are performed on the fly by a number of R packages and are deployed by a shiny web platform. The user can download the analysed data in different table formats and save the generated plots as image files ready for publication. We believe BRepertoire to be a versatile analytical tool that complements experimental studies of immune repertoires. To illustrate the server's functionality, we show use cases including differential gene usage in a vaccination dataset and analysis of CDR3H properties in old and young individuals. The server is accessible under http://mabra.biomed.kcl.ac.uk/BRepertoire.

  3. Multi-trait analysis of genome-wide association summary statistics using MTAG.

    PubMed

    Turley, Patrick; Walters, Raymond K; Maghzian, Omeed; Okbay, Aysu; Lee, James J; Fontana, Mark Alan; Nguyen-Viet, Tuan Anh; Wedow, Robbee; Zacher, Meghan; Furlotte, Nicholas A; Magnusson, Patrik; Oskarsson, Sven; Johannesson, Magnus; Visscher, Peter M; Laibson, David; Cesarini, David; Neale, Benjamin M; Benjamin, Daniel J

    2018-02-01

    We introduce multi-trait analysis of GWAS (MTAG), a method for joint analysis of summary statistics from genome-wide association studies (GWAS) of different traits, possibly from overlapping samples. We apply MTAG to summary statistics for depressive symptoms (N_eff = 354,862), neuroticism (N = 168,105), and subjective well-being (N = 388,538). As compared to the 32, 9, and 13 genome-wide significant loci identified in the single-trait GWAS (most of which are themselves novel), MTAG increases the number of associated loci to 64, 37, and 49, respectively. Moreover, association statistics from MTAG yield more informative bioinformatics analyses and increase the variance explained by polygenic scores by approximately 25%, matching theoretical expectations.

  4. A likelihood ratio test for evolutionary rate shifts and functional divergence among proteins

    PubMed Central

    Knudsen, Bjarne; Miyamoto, Michael M.

    2001-01-01

    Changes in protein function can lead to changes in the selection acting on specific residues. This can often be detected as evolutionary rate changes at the sites in question. A maximum-likelihood method for detecting evolutionary rate shifts at specific protein positions is presented. The method determines significance values of the rate differences to give a sound statistical foundation for the conclusions drawn from the analyses. A statistical test for detecting slowly evolving sites is also described. The methods are applied to a set of Myc proteins for the identification of both conserved sites and those with changing evolutionary rates. Those positions with conserved and changing rates are related to the structures and functions of their proteins. The results are compared with an earlier Bayesian method, thereby highlighting the advantages of the new likelihood ratio tests. PMID:11734650
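    The core of such an analysis is the generic likelihood-ratio test. The snippet below shows only that machinery, with made-up log-likelihoods, and is not the authors' phylogenetic implementation.

```python
# Generic likelihood-ratio test: 2*(lnL_alt - lnL_null) compared with a chi-squared distribution.
from scipy.stats import chi2

def lrt(loglik_null, loglik_alt, df):
    stat = 2.0 * (loglik_alt - loglik_null)
    return stat, chi2.sf(stat, df)

# e.g. null: a single evolutionary rate at a site; alternative: a rate shift (one extra parameter)
stat, p = lrt(loglik_null=-1052.3, loglik_alt=-1048.1, df=1)  # made-up log-likelihoods
print(stat, p)
```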

  5. Usefulness and limitations of various guinea-pig test methods in detecting human skin sensitizers-validation of guinea-pig tests for skin hypersensitivity.

    PubMed

    Marzulli, F; Maguire, H C

    1982-02-01

    Several guinea-pig predictive test methods were evaluated by comparison of results with those obtained with human predictive tests, using ten compounds that have been used in cosmetics. The method involves the statistical analysis of the frequency with which guinea-pig tests agree with the findings of tests in humans. In addition, the frequencies of false positive and false negative predictive findings are considered and statistically analysed. The results clearly demonstrate the superiority of adjuvant tests (complete Freund's adjuvant) in determining skin sensitizers and the overall superiority of the guinea-pig maximization test in providing results similar to those obtained by human testing. A procedure is suggested for utilizing adjuvant and non-adjuvant test methods for characterizing compounds as of weak, moderate or strong sensitizing potential.

  6. Quantification and Statistical Analysis Methods for Vessel Wall Components from Stained Images with Masson's Trichrome

    PubMed Central

    Hernández-Morera, Pablo; Castaño-González, Irene; Travieso-González, Carlos M.; Mompeó-Corredera, Blanca; Ortega-Santana, Francisco

    2016-01-01

    Purpose To develop a digital image processing method to quantify structural components (smooth muscle fibers and extracellular matrix) in the vessel wall stained with Masson’s trichrome, and a statistical method suitable for small sample sizes to analyze the results previously obtained. Methods The quantification method comprises two stages. The pre-processing stage improves tissue image appearance and the vessel wall area is delimited. In the feature extraction stage, the vessel wall components are segmented by grouping pixels with a similar color. The area of each component is calculated by normalizing the number of pixels of each group by the vessel wall area. Statistical analyses are implemented by permutation tests, based on resampling without replacement from the set of the observed data to obtain a sampling distribution of an estimator. The implementation can be parallelized on a multicore machine to reduce execution time. Results The methods have been tested on 48 vessel wall samples of the internal saphenous vein stained with Masson’s trichrome. The results show that the segmented areas are consistent with the perception of a team of doctors and demonstrate good correlation between the expert judgments and the measured parameters for evaluating vessel wall changes. Conclusion The proposed methodology offers a powerful tool to quantify some components of the vessel wall. It is more objective, sensitive and accurate than the biochemical and qualitative methods traditionally used. The permutation tests are suitable statistical techniques to analyze the numerical measurements obtained when the underlying assumptions of the other statistical techniques are not met. PMID:26761643
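    A minimal version of the resampling-without-replacement idea described above is sketched below for a two-sample comparison of means; the data are toy values, not the study's measurements.

```python
# Two-sample permutation test on a difference in means (toy data).
import numpy as np

def permutation_test(a, b, n_perm=10000, seed=0):
    rng = np.random.default_rng(seed)
    observed = np.mean(a) - np.mean(b)
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                                   # resample without replacement
        diff = np.mean(pooled[:len(a)]) - np.mean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# hypothetical smooth-muscle area fractions in two small groups of vessel samples
group1 = np.array([0.42, 0.39, 0.47, 0.44, 0.41])
group2 = np.array([0.35, 0.38, 0.33, 0.40, 0.36])
print(permutation_test(group1, group2))
```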

  7. A survey of design methods for failure detection in dynamic systems

    NASA Technical Reports Server (NTRS)

    Willsky, A. S.

    1975-01-01

    A number of methods for detecting abrupt changes (such as failures) in stochastic dynamical systems are surveyed. The survey concentrates on the class of linear systems, but the basic concepts, if not the detailed analyses, carry over to other classes of systems. The methods surveyed range from the design of specific failure-sensitive filters, to the use of statistical tests on filter innovations, to the development of jump process formulations. Tradeoffs in complexity versus performance are discussed.

  8. Propensity score to detect baseline imbalance in cluster randomized trials: the role of the c-statistic.

    PubMed

    Leyrat, Clémence; Caille, Agnès; Foucher, Yohann; Giraudeau, Bruno

    2016-01-22

    Despite randomization, baseline imbalance and confounding bias may occur in cluster randomized trials (CRTs). Covariate imbalance may jeopardize the validity of statistical inferences if it occurs on prognostic factors. Thus, diagnosing such an imbalance is essential in order to adjust the statistical analysis if required. We developed a tool based on the c-statistic of the propensity score (PS) model to detect global baseline covariate imbalance in CRTs and assess the risk of confounding bias. We performed a simulation study to assess the performance of the proposed tool and applied this method to analyze the data from 2 published CRTs. The proposed method had good performance for large sample sizes (n = 500 per arm) and when the number of unbalanced covariates was not too small compared with the total number of baseline covariates (≥40% of unbalanced covariates). We also provide a strategy for pre-selection of the covariates needed to be included in the PS model to enhance imbalance detection. The proposed tool could be useful in deciding whether covariate adjustment is required before performing statistical analyses of CRTs.
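    The underlying idea can be sketched as follows: fit a propensity score model of arm membership on the baseline covariates and read its c-statistic (area under the ROC curve) as a global imbalance measure. The data and model below are illustrative assumptions, not the authors' simulation code.

```python
# Sketch: c-statistic of a propensity score model as a global baseline-imbalance measure.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 1000
X = rng.normal(size=(n, 5))            # baseline covariates
arm = rng.integers(0, 2, n)            # randomised arm; under good balance the AUC is near 0.5

ps_model = LogisticRegression(max_iter=1000).fit(X, arm)
c_stat = roc_auc_score(arm, ps_model.predict_proba(X)[:, 1])
print(f"c-statistic of the PS model: {c_stat:.3f}")  # values well above 0.5 suggest imbalance
```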

  9. Propensity-score matching in the cardiovascular surgery literature from 2004 to 2006: a systematic review and suggestions for improvement.

    PubMed

    Austin, Peter C

    2007-11-01

    I conducted a systematic review of the use of propensity score matching in the cardiovascular surgery literature. I examined the adequacy of reporting and whether appropriate statistical methods were used. I examined 60 articles published in the Annals of Thoracic Surgery, European Journal of Cardio-thoracic Surgery, Journal of Cardiovascular Surgery, and the Journal of Thoracic and Cardiovascular Surgery between January 1, 2004, and December 31, 2006. Thirty-one of the 60 studies did not provide adequate information on how the propensity score-matched pairs were formed. Eleven (18%) of studies did not report on whether matching on the propensity score balanced baseline characteristics between treated and untreated subjects in the matched sample. No studies used appropriate methods to compare baseline characteristics between treated and untreated subjects in the propensity score-matched sample. Eight (13%) of the 60 studies explicitly used statistical methods appropriate for the analysis of matched data when estimating the effect of treatment on the outcomes. Two studies used appropriate methods for some outcomes, but not for all outcomes. Thirty-nine (65%) studies explicitly used statistical methods that were inappropriate for matched-pairs data when estimating the effect of treatment on outcomes. Eleven studies did not report the statistical tests that were used to assess the statistical significance of the treatment effect. Analysis of propensity score-matched samples tended to be poor in the cardiovascular surgery literature. Most statistical analyses ignored the matched nature of the sample. I provide suggestions for improving the reporting and analysis of studies that use propensity score matching.

  10. Behind the statistics: the ethnography of suicide in Palestine.

    PubMed

    Dabbagh, Nadia

    2012-06-01

    As part of the first anthropological study on suicide in the modern Arab world, statistics gathered from the Ramallah region of the West Bank in Palestine painted an apparently remarkably similar picture to that found in Western countries such as the UK and France. More men than women completed suicide, more women than men attempted suicide. Men used more violent methods such as hanging and women softer methods such as medication overdose. Completed suicide was higher in the older age range, attempted suicide in the younger. However, ethnographic fieldwork and detailed examination of the case studies and suicide narratives gathered and analysed within the cultural, political and economic contexts illustrated more starkly the differences in suicidal practices between Palestinian West Bank society of the 1990s and other regions of the world. The central argument of the paper is that although statistics tell a very important story, ethnography uncovers a multitude of stories 'behind the statistics', and thus helps us to make sense of both cultural context and subjective experience.

  11. Organizational downsizing and age discrimination litigation: the influence of personnel practices and statistical evidence on litigation outcomes.

    PubMed

    Wingate, Peter H; Thornton, George C; McIntyre, Kelly S; Frame, Jennifer H

    2003-02-01

    The present study examined relationships between reduction-in-force (RIF) personnel practices, presentation of statistical evidence, and litigation outcomes. Policy capturing methods were utilized to analyze the components of 115 federal district court opinions involving age discrimination disparate treatment allegations and organizational downsizing. Univariate analyses revealed meaningful links between RIF personnel practices, use of statistical evidence, and judicial verdict. The defendant organization was awarded summary judgment in 73% of the claims included in the study. Judicial decisions in favor of the defendant organization were found to be significantly related to such variables as formal performance appraisal systems, termination decision review within the organization, methods of employee assessment and selection for termination, and the presence of a concrete layoff policy. The use of statistical evidence in ADEA disparate treatment litigation was investigated and found to be a potentially persuasive type of indirect evidence. Legal, personnel, and evidentiary ramifications are reviewed, and a framework of downsizing mechanics emphasizing legal defensibility is presented.

  12. Statistical methods used to test for agreement of medical instruments measuring continuous variables in method comparison studies: a systematic review.

    PubMed

    Zaki, Rafdzah; Bulgiba, Awang; Ismail, Roshidi; Ismail, Noor Azina

    2012-01-01

    Accurate values are a must in medicine. An important parameter in determining the quality of a medical instrument is agreement with a gold standard. Various statistical methods have been used to test for agreement. Some of these methods have been shown to be inappropriate. This can result in misleading conclusions about the validity of an instrument. The Bland-Altman method is the most popular method judging by the many citations of the article proposing this method. However, the number of citations does not necessarily mean that this method has been applied in agreement research. No previous study has been conducted to look into this. This is the first systematic review to identify statistical methods used to test for agreement of medical instruments. The proportion of various statistical methods found in this review will also reflect the proportion of medical instruments that have been validated using those particular methods in current clinical practice. Five electronic databases were searched between 2007 and 2009 to look for agreement studies. A total of 3,260 titles were initially identified. Only 412 titles were potentially related, and finally 210 fitted the inclusion criteria. The Bland-Altman method is the most popular method with 178 (85%) studies having used this method, followed by the correlation coefficient (27%) and means comparison (18%). Some of the inappropriate methods highlighted by Altman and Bland since the 1980s are still in use. This study finds that the Bland-Altman method is the most popular method used in agreement research. There are still inappropriate applications of statistical methods in some studies. It is important for a clinician or medical researcher to be aware of this issue because misleading conclusions from inappropriate analyses will jeopardize the quality of the evidence, which in turn will influence quality of care given to patients in the future.
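    For reference, the Bland-Altman computation itself is short: the bias is the mean of the paired differences and the 95% limits of agreement are the bias plus or minus 1.96 standard deviations of the differences. The sketch below uses toy data.

```python
# Bland-Altman agreement summary for two instruments measuring the same quantity (toy data).
import numpy as np

method_a = np.array([120., 135., 150., 142., 128., 160., 155.])
method_b = np.array([118., 138., 148., 145., 125., 158., 159.])

diff = method_a - method_b
bias = diff.mean()
sd = diff.std(ddof=1)
loa_low, loa_high = bias - 1.96 * sd, bias + 1.96 * sd
print(f"bias = {bias:.2f}, 95% limits of agreement = ({loa_low:.2f}, {loa_high:.2f})")
```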

  13. Statistical Methods Used to Test for Agreement of Medical Instruments Measuring Continuous Variables in Method Comparison Studies: A Systematic Review

    PubMed Central

    Zaki, Rafdzah; Bulgiba, Awang; Ismail, Roshidi; Ismail, Noor Azina

    2012-01-01

    Background Accurate values are a must in medicine. An important parameter in determining the quality of a medical instrument is agreement with a gold standard. Various statistical methods have been used to test for agreement. Some of these methods have been shown to be inappropriate. This can result in misleading conclusions about the validity of an instrument. The Bland-Altman method is the most popular method judging by the many citations of the article proposing this method. However, the number of citations does not necessarily mean that this method has been applied in agreement research. No previous study has been conducted to look into this. This is the first systematic review to identify statistical methods used to test for agreement of medical instruments. The proportion of various statistical methods found in this review will also reflect the proportion of medical instruments that have been validated using those particular methods in current clinical practice. Methodology/Findings Five electronic databases were searched between 2007 and 2009 to look for agreement studies. A total of 3,260 titles were initially identified. Only 412 titles were potentially related, and finally 210 fitted the inclusion criteria. The Bland-Altman method is the most popular method with 178 (85%) studies having used this method, followed by the correlation coefficient (27%) and means comparison (18%). Some of the inappropriate methods highlighted by Altman and Bland since the 1980s are still in use. Conclusions This study finds that the Bland-Altman method is the most popular method used in agreement research. There are still inappropriate applications of statistical methods in some studies. It is important for a clinician or medical researcher to be aware of this issue because misleading conclusions from inappropriate analyses will jeopardize the quality of the evidence, which in turn will influence quality of care given to patients in the future. PMID:22662248

  14. DISTMIX: direct imputation of summary statistics for unmeasured SNPs from mixed ethnicity cohorts.

    PubMed

    Lee, Donghyung; Bigdeli, T Bernard; Williamson, Vernell S; Vladimirov, Vladimir I; Riley, Brien P; Fanous, Ayman H; Bacanu, Silviu-Alin

    2015-10-01

    To increase the signal resolution for large-scale meta-analyses of genome-wide association studies, genotypes at unmeasured single nucleotide polymorphisms (SNPs) are commonly imputed using large multi-ethnic reference panels. However, the ever-increasing size and ethnic diversity of both reference panels and cohorts make genotype imputation computationally challenging for moderately sized computer clusters. Moreover, genotype imputation requires subject-level genetic data, which, unlike the summary statistics provided by virtually all studies, are not publicly available. While there are far less demanding methods that avoid the genotype imputation step by directly imputing SNP statistics, e.g. Directly Imputing summary STatistics (DIST) proposed by our group, their implicit assumptions make them applicable only to ethnically homogeneous cohorts. To decrease computational and access requirements for the analysis of cosmopolitan cohorts, we propose DISTMIX, which extends DIST capabilities to the analysis of mixed-ethnicity cohorts. The method uses a relevant reference panel to directly impute unmeasured SNP statistics based only on statistics at measured SNPs and estimated/user-specified ethnic proportions. Simulations show that the proposed method adequately controls the Type I error rates. The 1000 Genomes panel imputation of summary statistics from the ethnically diverse Psychiatric Genetic Consortium Schizophrenia Phase 2 suggests that, when compared to genotype imputation methods, DISTMIX offers comparable imputation accuracy for only a fraction of the computational resources. DISTMIX software, its reference population data, and usage examples are publicly available at http://code.google.com/p/distmix. dlee4@vcu.edu Supplementary Data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
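    A heavily simplified sketch of the family of methods that DIST and DISTMIX belong to is shown below: the unmeasured SNP's z-score is imputed as the conditional mean of a multivariate normal given the measured z-scores and reference-panel LD. This illustrates the general principle only and omits everything that makes DISTMIX specific (ethnic-proportion weighting, panel construction, regularisation).

```python
# Simplified conditional-Gaussian imputation of an unmeasured SNP's association z-score.
import numpy as np

def impute_z(z_measured, R_mm, r_um):
    """
    z_measured : z-scores at measured SNPs
    R_mm       : LD (correlation) matrix among measured SNPs, from a reference panel
    r_um       : LD between the unmeasured SNP and each measured SNP
    """
    w = np.linalg.solve(R_mm, r_um)   # weights = R_mm^{-1} r_um
    z_hat = w @ z_measured            # conditional mean of the unmeasured z-score
    info = r_um @ w                   # imputation "information" (variance explained)
    return z_hat, info

R_mm = np.array([[1.0, 0.6], [0.6, 1.0]])   # toy LD among two measured SNPs
r_um = np.array([0.7, 0.5])                 # toy LD with the unmeasured SNP
print(impute_z(np.array([2.5, 1.8]), R_mm, r_um))
```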

  15. Methodological Reporting of Randomized Trials in Five Leading Chinese Nursing Journals

    PubMed Central

    Shi, Chunhu; Tian, Jinhui; Ren, Dan; Wei, Hongli; Zhang, Lihuan; Wang, Quan; Yang, Kehu

    2014-01-01

    Background Randomized controlled trials (RCTs) are not always well reported, especially in terms of their methodological descriptions. This study aimed to investigate the adherence of methodological reporting complying with CONSORT and explore associated trial level variables in the Chinese nursing care field. Methods In June 2012, we identified RCTs published in five leading Chinese nursing journals and included trials with details of randomized methods. The quality of methodological reporting was measured through the methods section of the CONSORT checklist and the overall CONSORT methodological items score was calculated and expressed as a percentage. Meanwhile, we hypothesized that some general and methodological characteristics were associated with reporting quality and conducted a regression with these data to explore the correlation. The descriptive and regression statistics were calculated via SPSS 13.0. Results In total, 680 RCTs were included. The overall CONSORT methodological items score was 6.34±0.97 (Mean ± SD). No RCT reported descriptions and changes in “trial design,” changes in “outcomes” and “implementation,” or descriptions of the similarity of interventions for “blinding.” Poor reporting was found in detailing the “settings of participants” (13.1%), “type of randomization sequence generation” (1.8%), calculation methods of “sample size” (0.4%), explanation of any interim analyses and stopping guidelines for “sample size” (0.3%), “allocation concealment mechanism” (0.3%), additional analyses in “statistical methods” (2.1%), and targeted subjects and methods of “blinding” (5.9%). More than 50% of trials described randomization sequence generation, the eligibility criteria of “participants,” “interventions,” and definitions of the “outcomes” and “statistical methods.” The regression analysis found that publication year and ITT analysis were weakly associated with CONSORT score. Conclusions The completeness of methodological reporting of RCTs in the Chinese nursing care field is poor, especially with regard to the reporting of trial design, changes in outcomes, sample size calculation, allocation concealment, blinding, and statistical methods. PMID:25415382

  16. Machine Learning Methods for Attack Detection in the Smart Grid.

    PubMed

    Ozay, Mete; Esnaola, Inaki; Yarman Vural, Fatos Tunay; Kulkarni, Sanjeev R; Poor, H Vincent

    2016-08-01

    Attack detection problems in the smart grid are posed as statistical learning problems for different attack scenarios in which the measurements are observed in batch or online settings. In this approach, machine learning algorithms are used to classify measurements as being either secure or attacked. An attack detection framework is provided to exploit any available prior knowledge about the system and surmount constraints arising from the sparse structure of the problem in the proposed approach. Well-known batch and online learning algorithms (supervised and semisupervised) are employed with decision- and feature-level fusion to model the attack detection problem. The relationships between statistical and geometric properties of attack vectors employed in the attack scenarios and learning algorithms are analyzed to detect unobservable attacks using statistical learning methods. The proposed algorithms are examined on various IEEE test systems. Experimental analyses show that machine learning algorithms can detect attacks with performances higher than attack detection algorithms that employ state vector estimation methods in the proposed attack detection framework.
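    In the supervised setting described above, attack detection reduces to classifying measurement vectors as secure or attacked. The sketch below uses synthetic data and a standard SVM purely to illustrate that framing; it is not the paper's feature construction, algorithms or IEEE test systems.

```python
# Toy supervised attack detection: label measurement vectors as secure (0) or attacked (1).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(2)
n, d = 2000, 30
secure = rng.normal(0, 1, (n // 2, d))
attacked = rng.normal(0, 1, (n // 2, d)) + rng.normal(0.3, 0.1, (1, d))  # small additive bias

X = np.vstack([secure, attacked])
y = np.r_[np.zeros(n // 2), np.ones(n // 2)]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("held-out accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```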

  17. A Space–Time Permutation Scan Statistic for Disease Outbreak Detection

    PubMed Central

    Kulldorff, Martin; Heffernan, Richard; Hartman, Jessica; Assunção, Renato; Mostashari, Farzad

    2005-01-01

    Background The ability to detect disease outbreaks early is important in order to minimize morbidity and mortality through timely implementation of disease prevention and control measures. Many national, state, and local health departments are launching disease surveillance systems with daily analyses of hospital emergency department visits, ambulance dispatch calls, or pharmacy sales for which population-at-risk information is unavailable or irrelevant. Methods and Findings We propose a prospective space–time permutation scan statistic for the early detection of disease outbreaks that uses only case numbers, with no need for population-at-risk data. It makes minimal assumptions about the time, geographical location, or size of the outbreak, and it adjusts for natural purely spatial and purely temporal variation. The new method was evaluated using daily analyses of hospital emergency department visits in New York City. Four of the five strongest signals were likely local precursors to citywide outbreaks due to rotavirus, norovirus, and influenza. The number of false signals was at most modest. Conclusion If such results hold up over longer study times and in other locations, the space–time permutation scan statistic will be an important tool for local and national health departments that are setting up early disease detection surveillance systems. PMID:15719066

  18. Bayesian approach for counting experiment statistics applied to a neutrino point source analysis

    NASA Astrophysics Data System (ADS)

    Bose, D.; Brayeur, L.; Casier, M.; de Vries, K. D.; Golup, G.; van Eijndhoven, N.

    2013-12-01

    In this paper we present a model-independent analysis method following Bayesian statistics to analyse data from a generic counting experiment, and apply it to the search for neutrinos from point sources. We discuss a test statistic defined within a Bayesian framework that will be used in the search for a signal. In case no signal is found, we derive an upper limit without the introduction of approximations. The Bayesian approach allows us to obtain the full probability density function for both the background and the signal rate. As such, we have direct access to any signal upper limit. The upper limit derivation compares directly with a frequentist approach and is robust in the case of low-counting observations. Furthermore, it also allows one to account for previous upper limits obtained by other analyses via the concept of prior information, without the need for the ad hoc application of trial factors. To investigate the validity of the presented Bayesian approach, we have applied this method to the public IceCube 40-string configuration data for 10 nearby blazars and we have obtained a flux upper limit, which is in agreement with the upper limits determined via a frequentist approach. Furthermore, the upper limit obtained compares well with the previously published result of IceCube, using the same data set.
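    A generic version of such a Bayesian counting analysis is sketched below: with a flat prior on the signal rate and a known background, the posterior is the normalised Poisson likelihood, and an upper limit is read from its cumulative distribution. The counts and background are made up; this is not the IceCube analysis chain.

```python
# Posterior and credible upper limit for a Poisson signal rate s, given n observed events,
# a known background b and a flat prior on s >= 0 (generic machinery, toy numbers).
import numpy as np
from scipy.stats import poisson

def signal_posterior(n_obs, background, s_grid):
    like = poisson.pmf(n_obs, background + s_grid)   # likelihood on a grid of signal rates
    ds = s_grid[1] - s_grid[0]
    return like / (like.sum() * ds)                  # flat prior: normalise the likelihood

def upper_limit(s_grid, post, cl=0.90):
    cdf = np.cumsum(post) * (s_grid[1] - s_grid[0])
    return s_grid[np.searchsorted(cdf, cl)]

s = np.linspace(0, 30, 3001)
post = signal_posterior(n_obs=5, background=3.2, s_grid=s)   # made-up counts
print(f"90% credible upper limit on the signal rate: {upper_limit(s, post):.2f}")
```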

  19. Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the I2 statistic.

    PubMed

    Bowden, Jack; Del Greco M, Fabiola; Minelli, Cosetta; Davey Smith, George; Sheehan, Nuala A; Thompson, John R

    2016-12-01

    MR-Egger regression has recently been proposed as a method for Mendelian randomization (MR) analyses incorporating summary data estimates of causal effect from multiple individual variants, which is robust to invalid instruments. It can be used to test for directional pleiotropy and provides an estimate of the causal effect adjusted for its presence. MR-Egger regression provides a useful additional sensitivity analysis to the standard inverse variance weighted (IVW) approach that assumes all variants are valid instruments. Both methods use weights that consider the single nucleotide polymorphism (SNP)-exposure associations to be known, rather than estimated. We call this the 'NO Measurement Error' (NOME) assumption. Causal effect estimates from the IVW approach exhibit weak instrument bias whenever the genetic variants utilized violate the NOME assumption, which can be reliably measured using the F-statistic. The effect of NOME violation on MR-Egger regression has yet to be studied. An adaptation of the I2 statistic from the field of meta-analysis is proposed to quantify the strength of NOME violation for MR-Egger. It lies between 0 and 1, and indicates the expected relative bias (or dilution) of the MR-Egger causal estimate in the two-sample MR context. We call it I2GX. The method of simulation extrapolation is also explored to counteract the dilution. Their joint utility is evaluated using simulated data and applied to a real MR example. In simulated two-sample MR analyses we show that, when a causal effect exists, the MR-Egger estimate of causal effect is biased towards the null when NOME is violated, and the stronger the violation (as indicated by lower values of I2GX), the stronger the dilution. When additionally all genetic variants are valid instruments, the type I error rate of the MR-Egger test for pleiotropy is inflated and the causal effect underestimated. Simulation extrapolation is shown to substantially mitigate these adverse effects. We demonstrate our proposed approach for a two-sample summary data MR analysis to estimate the causal effect of low-density lipoprotein on heart disease risk. A high value of I2GX close to 1 indicates that dilution does not materially affect the standard MR-Egger analyses for these data. Care must be taken to assess the NOME assumption via the I2GX statistic before implementing standard MR-Egger regression in the two-sample summary data context. If I2GX is sufficiently low (less than 90%), inferences from the method should be interpreted with caution and adjustment methods considered. © The Author 2016. Published by Oxford University Press on behalf of the International Epidemiological Association.
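    The dilution statistic is an adaptation of the usual I2 formula applied to the SNP-exposure estimates. A sketch of that calculation, under the assumption that it mirrors the standard Cochran's Q construction, is given below with hypothetical inputs.

```python
# Sketch of an I2-type statistic computed from SNP-exposure estimates (hypothetical inputs);
# assumed to follow the standard (Q - df) / Q construction from meta-analysis.
import numpy as np

def i2_gx(gx, se):
    w = 1.0 / se**2
    gx_bar = np.sum(w * gx) / np.sum(w)           # inverse-variance weighted mean
    Q = np.sum(w * (gx - gx_bar) ** 2)            # Cochran's Q for the SNP-exposure estimates
    k = len(gx)
    return max(0.0, (Q - (k - 1)) / Q)            # lies between 0 and 1

gx = np.array([0.10, 0.12, 0.08, 0.15, 0.11])     # hypothetical SNP-exposure associations
se = np.array([0.010, 0.012, 0.015, 0.020, 0.011])
print(i2_gx(gx, se))
```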

  20. DISSCO: direct imputation of summary statistics allowing covariates.

    PubMed

    Xu, Zheng; Duan, Qing; Yan, Song; Chen, Wei; Li, Mingyao; Lange, Ethan; Li, Yun

    2015-08-01

    Imputation of individual level genotypes at untyped markers using an external reference panel of genotyped or sequenced individuals has become standard practice in genetic association studies. Direct imputation of summary statistics can also be valuable, for example in meta-analyses where individual level genotype data are not available. Two methods (DIST and ImpG-Summary/LD), which assume a multivariate Gaussian distribution for the association summary statistics, have been proposed for imputing association summary statistics. However, both methods assume that the correlations between association summary statistics are the same as the correlations between the corresponding genotypes. This assumption can be violated in the presence of confounding covariates. We analytically show that in the absence of covariates, correlation among association summary statistics is indeed the same as that among the corresponding genotypes, thus serving as a theoretical justification for the recently proposed methods. We continue to prove that in the presence of covariates, correlation among association summary statistics becomes the partial correlation of the corresponding genotypes controlling for covariates. We therefore develop direct imputation of summary statistics allowing covariates (DISSCO). We consider two real-life scenarios where the correlation and partial correlation likely make a practical difference: (i) association studies in admixed populations; (ii) association studies in the presence of other confounding covariate(s). Application of DISSCO to real datasets under both scenarios shows at least comparable, if not better, performance compared with existing correlation-based methods, particularly for lower frequency variants. For example, DISSCO can reduce the absolute deviation from the truth by 3.9-15.2% for variants with minor allele frequency <5%. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  1. Research Design and Statistical Methods in Indian Medical Journals: A Retrospective Survey

    PubMed Central

    Hassan, Shabbeer; Yellur, Rajashree; Subramani, Pooventhan; Adiga, Poornima; Gokhale, Manoj; Iyer, Manasa S.; Mayya, Shreemathi S.

    2015-01-01

    Good quality medical research generally requires not only expertise in the chosen medical field of interest but also a sound knowledge of statistical methodology. The number of medical research articles published in Indian medical journals has increased quite substantially in the past decade. The aim of this study was to collate all evidence on study design quality and statistical analyses used in selected leading Indian medical journals. Ten (10) leading Indian medical journals were selected based on impact factors, and all original research articles published in 2003 (N = 588) and 2013 (N = 774) were categorized and reviewed. A validated checklist on study design, statistical analyses, results presentation, and interpretation was used for review and evaluation of the articles. Main outcomes considered in the present study were: study design types and their frequencies, error/defect proportions in study design, statistical analyses, and implementation of the CONSORT checklist in RCTs (randomized clinical trials). From 2003 to 2013, the proportion of erroneous statistical analyses did not decrease (χ2=0.592, Φ=0.027, p=0.4418): 25% (80/320) in 2003 compared to 22.6% (111/490) in 2013. Compared with 2003, significant improvement was seen in 2013; the proportion of papers using statistical tests increased significantly (χ2=26.96, Φ=0.16, p<0.0001) from 42.5% (250/588) to 56.7% (439/774). The overall proportion of errors in study design decreased significantly (χ2=16.783, Φ=0.12, p<0.0001): 41.3% (243/588) compared to 30.6% (237/774). In 2013, the proportion of randomized clinical trial designs remained very low (7.3%, 43/588), with the majority showing some errors (41 papers, 95.3%). The majority of the published studies were retrospective in nature, both in 2003 [79.1% (465/588)] and in 2013 [78.2% (605/774)]. Major decreases in error proportions were observed in both results presentation (χ2=24.477, Φ=0.17, p<0.0001), 82.2% (263/320) compared to 66.3% (325/490), and interpretation (χ2=25.616, Φ=0.173, p<0.0001), 32.5% (104/320) compared to 17.1% (84/490), though some serious ones were still present. Indian medical research seems to have made no major progress regarding the use of correct statistical analyses, but errors/defects in study design have decreased significantly. Randomized clinical trials are quite rarely published and have a high proportion of methodological problems. PMID:25856194

  2. Research design and statistical methods in Indian medical journals: a retrospective survey.

    PubMed

    Hassan, Shabbeer; Yellur, Rajashree; Subramani, Pooventhan; Adiga, Poornima; Gokhale, Manoj; Iyer, Manasa S; Mayya, Shreemathi S

    2015-01-01

    Good quality medical research generally requires not only expertise in the chosen medical field of interest but also a sound knowledge of statistical methodology. The number of medical research articles published in Indian medical journals has increased quite substantially in the past decade. The aim of this study was to collate all evidence on study design quality and statistical analyses used in selected leading Indian medical journals. Ten (10) leading Indian medical journals were selected based on impact factors, and all original research articles published in 2003 (N = 588) and 2013 (N = 774) were categorized and reviewed. A validated checklist on study design, statistical analyses, results presentation, and interpretation was used for review and evaluation of the articles. Main outcomes considered in the present study were: study design types and their frequencies, error/defect proportions in study design, statistical analyses, and implementation of the CONSORT checklist in RCTs (randomized clinical trials). From 2003 to 2013, the proportion of erroneous statistical analyses did not decrease (χ2=0.592, Φ=0.027, p=0.4418): 25% (80/320) in 2003 compared to 22.6% (111/490) in 2013. Compared with 2003, significant improvement was seen in 2013; the proportion of papers using statistical tests increased significantly (χ2=26.96, Φ=0.16, p<0.0001) from 42.5% (250/588) to 56.7% (439/774). The overall proportion of errors in study design decreased significantly (χ2=16.783, Φ=0.12, p<0.0001): 41.3% (243/588) compared to 30.6% (237/774). In 2013, the proportion of randomized clinical trial designs remained very low (7.3%, 43/588), with the majority showing some errors (41 papers, 95.3%). The majority of the published studies were retrospective in nature, both in 2003 [79.1% (465/588)] and in 2013 [78.2% (605/774)]. Major decreases in error proportions were observed in both results presentation (χ2=24.477, Φ=0.17, p<0.0001), 82.2% (263/320) compared to 66.3% (325/490), and interpretation (χ2=25.616, Φ=0.173, p<0.0001), 32.5% (104/320) compared to 17.1% (84/490), though some serious ones were still present. Indian medical research seems to have made no major progress regarding the use of correct statistical analyses, but errors/defects in study design have decreased significantly. Randomized clinical trials are quite rarely published and have a high proportion of methodological problems.

  3. Applying a Mixed Methods Framework to Differential Item Function Analyses

    ERIC Educational Resources Information Center

    Hitchcock, John H.; Johanson, George A.

    2015-01-01

    Understanding the reason(s) for Differential Item Functioning (DIF) in the context of measurement is difficult. Although identifying potential DIF items is typically a statistical endeavor, understanding the reasons for DIF (and item repair or replacement) might require investigations that can be informed by qualitative work. Such work is…

  4. Effective Analysis of Reaction Time Data

    ERIC Educational Resources Information Center

    Whelan, Robert

    2008-01-01

    Most analyses of reaction time (RT) data are conducted by using the statistical techniques with which psychologists are most familiar, such as analysis of variance on the sample mean. Unfortunately, these methods are usually inappropriate for RT data, because they have little power to detect genuine differences in RT between conditions. In…

  5. Michigan's forests, 2004: statistics and quality assurance

    Treesearch

    Scott A. Pugh; Mark H. Hansen; Gary Brand; Ronald E. McRoberts

    2010-01-01

    The first annual inventory of Michigan's forests was completed in 2004 after 18,916 plots were selected and 10,355 forested plots were visited. This report includes detailed information on forest inventory methods, quality of estimates, and additional tables. An earlier publication presented analyses of the inventoried data (Pugh et al. 2009).

  6. Student Evaluation of Instruction: Comparison between In-Class and Online Methods

    ERIC Educational Resources Information Center

    Capa-Aydin, Yesim

    2016-01-01

    This study compares student evaluations of instruction that were collected in-class with those gathered through an online survey. The two modes of administration were compared with respect to response rate, psychometric characteristics and mean ratings through different statistical analyses. Findings indicated that in-class evaluations produced a…

  7. Congruence between Disabled Elders and Their Primary Caregivers

    ERIC Educational Resources Information Center

    Horowitz, Amy; Goodman, Caryn R.; Reinhardt, Joann P.

    2004-01-01

    Purpose: This study examines the extent and independent correlates of congruence between disabled elders and their caregivers on several aspects of the caregiving experience. Design and Methods: Participants were 117 visually impaired elders and their caregivers. Correlational analyses, kappa statistics, and paired t tests were used to examine the…

  8. METHODS OF DEALING WITH VALUES BELOW THE LIMIT OF DETECTION USING SAS

    EPA Science Inventory

    Due to limitations of chemical analysis procedures, small values cannot be precisely measured. These values are said to be below the limit of detection (LOD). In statistical analyses, these values are often censored and substituted with a constant value, such as half the LOD,...
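    The substitution approach mentioned above is simple to express; the sketch below uses Python rather than SAS purely for illustration, replacing non-detects with half the LOD before computing summary statistics (substitution is convenient but can bias estimates, which is why alternatives are discussed in the source).

```python
# Replace values below the limit of detection (LOD) with LOD/2 before summarising (illustrative).
import numpy as np

lod = 0.5
raw = [1.2, 0.8, None, 2.1, None, 0.9]                       # None marks non-detects
filled = np.array([v if v is not None else lod / 2 for v in raw])
print(filled.mean(), filled.std(ddof=1))
```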

  9. Learning Opportunities for Group Learning

    ERIC Educational Resources Information Center

    Gil, Alfonso J.; Mataveli, Mara

    2017-01-01

    Purpose: This paper aims to analyse the impact of organizational learning culture and learning facilitators in group learning. Design/methodology/approach: This study was conducted using a survey method applied to a statistically representative sample of employees from Rioja wine companies in Spain. A model was tested using a structural equation…

  10. PARTIAL LEAST SQUARE ANALYSES FOR ASSOCIATION OF LANDSCAPE METRICS WITH WATER BIOLOGICAL AND CHEMICAL PROPERTIES IN THE SAVANNAH RIVER BASIN

    EPA Science Inventory

    Surface water quality is related to conditions in the surrounding geophysical environment, including soils, landcover, and anthropogenic activities. A number of statistical methods may be used to analyze and explore relationships among variables. Single-, multiple- and multivaria...

  11. An Experimental Ecological Study of a Garden Compost Heap.

    ERIC Educational Resources Information Center

    Curds, Tracy

    1985-01-01

    A quantitative study of the fauna of a garden compost heap shows it to be similar to that of organisms found in soil and leaf litter. Materials, methods, and results are discussed and extensive tables of fauna lists, wet/dry masses, and statistical analyses are presented. (Author/DH)

  12. Statistics provide guidance for indigenous organic carbon detection on Mars missions.

    PubMed

    Sephton, Mark A; Carter, Jonathan N

    2014-08-01

    Data from the Viking and Mars Science Laboratory missions indicate the presence of organic compounds that are not definitively martian in origin. Both contamination and confounding mineralogies have been suggested as alternatives to indigenous organic carbon. Intuitive thought suggests that we are repeatedly obtaining data that confirms the same level of uncertainty. Bayesian statistics may suggest otherwise. If an organic detection method has a true positive to false positive ratio greater than one, then repeated organic matter detection progressively increases the probability of indigeneity. Bayesian statistics also reveal that methods with higher ratios of true positives to false positives give higher overall probabilities and that detection of organic matter in a sample with a higher prior probability of indigenous organic carbon produces greater confidence. Bayesian statistics, therefore, provide guidance for the planning and operation of organic carbon detection activities on Mars. Suggestions for future organic carbon detection missions and instruments are as follows: (i) On Earth, instruments should be tested with analog samples of known organic content to determine their true positive to false positive ratios. (ii) On the mission, for an instrument with a true positive to false positive ratio above one, it should be recognized that each positive detection of organic carbon will result in a progressive increase in the probability of indigenous organic carbon being present; repeated measurements, therefore, can overcome some of the deficiencies of a less-than-definitive test. (iii) For a fixed number of analyses, the highest true positive to false positive ratio method or instrument will provide the greatest probability that indigenous organic carbon is present. (iv) On Mars, analyses should concentrate on samples with highest prior probability of indigenous organic carbon; intuitive desires to contrast samples of high prior probability and low prior probability of indigenous organic carbon should be resisted.
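    The abstract's argument can be made concrete with posterior odds: each positive detection multiplies the prior odds of indigenous organic carbon by the instrument's true-positive-to-false-positive (likelihood) ratio. The numbers below are illustrative, not mission data.

```python
# Repeated positive detections update the probability of indigenous organic carbon.
def posterior_probability(prior_prob, tp_fp_ratio, n_positive):
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * tp_fp_ratio ** n_positive   # one factor per positive detection
    return posterior_odds / (1.0 + posterior_odds)

for n in range(1, 5):
    print(n, round(posterior_probability(prior_prob=0.3, tp_fp_ratio=2.0, n_positive=n), 3))
# with a ratio above one, each additional positive raises the probability of indigeneity
```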

  13. Guidelines for the design and statistical analysis of experiments in papers submitted to ATLA.

    PubMed

    Festing, M F

    2001-01-01

    In vitro experiments need to be well designed and correctly analysed if they are to achieve their full potential to replace the use of animals in research. An "experiment" is a procedure for collecting scientific data in order to answer a hypothesis, or to provide material for generating new hypotheses, and differs from a survey because the scientist has control over the treatments that can be applied. Most experiments can be classified into one of a few formal designs, the most common being completely randomised, and randomised block designs. These are quite common with in vitro experiments, which are often replicated in time. Some experiments involve a single independent (treatment) variable, while other "factorial" designs simultaneously vary two or more independent variables, such as drug treatment and cell line. Factorial designs often provide additional information at little extra cost. Experiments need to be carefully planned to avoid bias, be powerful yet simple, provide for a valid statistical analysis and, in some cases, have a wide range of applicability. Virtually all experiments need some sort of statistical analysis in order to take account of biological variation among the experimental subjects. Parametric methods using the t test or analysis of variance are usually more powerful than non-parametric methods, provided the underlying assumptions of normality of the residuals and equal variances are approximately valid. The statistical analyses of data from a completely randomised design, and from a randomised-block design are demonstrated in Appendices 1 and 2, and methods of determining sample size are discussed in Appendix 3. Appendix 4 gives a checklist for authors submitting papers to ATLA.

  14. Visual field progression with frequency-doubling matrix perimetry and standard automated perimetry in patients with glaucoma and in healthy controls.

    PubMed

    Redmond, Tony; O'Leary, Neil; Hutchison, Donna M; Nicolela, Marcelo T; Artes, Paul H; Chauhan, Balwantray C

    2013-12-01

    A new analysis method called permutation of pointwise linear regression measures the significance of deterioration over time at each visual field location, combines the significance values into an overall statistic, and then determines the likelihood of change in the visual field. Because the outcome is a single P value, individualized to that specific visual field and independent of the scale of the original measurement, the method is well suited for comparing techniques with different stimuli and scales. To test the hypothesis that frequency-doubling matrix perimetry (FDT2) is more sensitive than standard automated perimetry (SAP) in identifying visual field progression in glaucoma. Patients with open-angle glaucoma and healthy controls were examined by FDT2 and SAP, both with the 24-2 test pattern, on the same day at 6-month intervals in a longitudinal prospective study conducted in a hospital-based setting. Only participants with at least 5 examinations were included. Data were analyzed with permutation of pointwise linear regression. Permutation of pointwise linear regression is individualized to each participant, in contrast to current analyses in which the statistical significance is inferred from population-based approaches. Analyses were performed with both total deviation and pattern deviation. Sixty-four patients and 36 controls were included in the study. The median age, SAP mean deviation, and follow-up period were 65 years, -2.6 dB, and 5.4 years, respectively, in patients and 62 years, +0.4 dB, and 5.2 years, respectively, in controls. Using total deviation analyses, statistically significant deterioration was identified in 17% of patients with FDT2, in 34% of patients with SAP, and in 14% of patients with both techniques; in controls these percentages were 8% with FDT2, 31% with SAP, and 8% with both. Using pattern deviation analyses, statistically significant deterioration was identified in 16% of patients with FDT2, in 17% of patients with SAP, and in 3% of patients with both techniques; in controls these values were 3% with FDT2 and none with SAP. No evidence was found that FDT2 is more sensitive than SAP in identifying visual field deterioration. In about one-third of healthy controls, age-related deterioration with SAP reached statistical significance.
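    A simplified sketch of the permutation-of-pointwise-linear-regression idea is given below: regress sensitivity on time at each field location, combine the per-location evidence for deterioration into one statistic, and compare the observed statistic with its distribution under permutations of the visit order. The combination rule and the toy data are assumptions, not the authors' implementation.

```python
# Simplified permutation-of-pointwise-linear-regression sketch (not the published method).
import numpy as np
from scipy import stats

def combined_stat(series, times):
    # sum of -log(one-sided p) for a negative slope, across visual field locations
    s = 0.0
    for y in series:                                  # series: n_locations x n_visits
        slope, _, _, p_two, _ = stats.linregress(times, y)
        p_one = p_two / 2 if slope < 0 else 1 - p_two / 2
        s += -np.log(p_one)
    return s

def progression_p(series, times, n_perm=1000, seed=0):
    rng = np.random.default_rng(seed)
    obs = combined_stat(series, times)
    perm_stats = [combined_stat(series[:, rng.permutation(len(times))], times)
                  for _ in range(n_perm)]
    return (np.sum(np.array(perm_stats) >= obs) + 1) / (n_perm + 1)

times = np.arange(5) * 0.5                                 # five visits, six-monthly (in years)
series = np.random.default_rng(1).normal(28, 2, (52, 5))   # toy 24-2 sensitivities (dB)
print(progression_p(series, times))
```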

  15. Statistical analyses of influence of solar and geomagnetic activities on car accident events

    NASA Astrophysics Data System (ADS)

    Alania, M. V.; Gil, A.; Wieliczuk, R.

    2001-01-01

    Statistical analyses of the influence of solar and geomagnetic activity, the sector structure of the interplanetary magnetic field, and galactic cosmic ray Forbush effects on car accident events in Poland for the period 1990-1999 have been carried out. Using auto-correlation, cross-correlation, spectral analyses and superposed epoch methods, it has been shown that there are separate periods when car accident events correlate directly with the Ap index of geomagnetic activity, the sector structure of the interplanetary magnetic field, and Forbush decreases of galactic cosmic rays. Nevertheless, a single-valued direct correlation cannot be established for the whole period 1990-1999. A periodicity of 7 days and its second harmonic (3.5 days) has been reliably revealed in the car accident data in Poland for each year of the period 1990-1999. It is shown that the maximum number of car accident events in Poland takes place on Fridays and is practically independent of the level of solar and geomagnetic activity.

  16. Borrowing of strength and study weights in multivariate and network meta-analysis.

    PubMed

    Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D

    2017-12-01

    Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of 'borrowing of strength'. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis).

  17. Borrowing of strength and study weights in multivariate and network meta-analysis

    PubMed Central

    Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D

    2016-01-01

    Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of ‘borrowing of strength’. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis). PMID:26546254

  18. Use of model calibration to achieve high accuracy in analysis of computer networks

    DOEpatents

    Frogner, Bjorn; Guarro, Sergio; Scharf, Guy

    2004-05-11

    A system and method are provided for creating a network performance prediction model, and calibrating the prediction model, through application of network load statistical analyses. The method includes characterizing the measured load on the network, which may include background load data obtained over time, and may further include directed load data representative of a transaction-level event. Probabilistic representations of load data are derived to characterize the statistical persistence of the network performance variability and to determine delays throughout the network. The probabilistic representations are applied to the network performance prediction model to adapt the model for accurate prediction of network performance. Certain embodiments of the method and system may be used for analysis of the performance of a distributed application characterized as data packet streams.

  19. Statistical Methods for Rapid Aerothermal Analysis and Design Technology: Validation

    NASA Technical Reports Server (NTRS)

    DePriest, Douglas; Morgan, Carolyn

    2003-01-01

    The cost and safety goals for NASA's next generation of reusable launch vehicle (RLV) will require that rapid high-fidelity aerothermodynamic design tools be used early in the design cycle. To meet these requirements, it is desirable to identify adequate statistical models that quantify and improve the accuracy, extend the applicability, and enable combined analyses using existing prediction tools. The initial research work focused on establishing suitable candidate models for these purposes. The second phase is focused on assessing the performance of these models to accurately predict the heat rate for a given candidate data set. This validation work compared models and methods that may be useful in predicting the heat rate.

  20. Bayes in biological anthropology.

    PubMed

    Konigsberg, Lyle W; Frankenberg, Susan R

    2013-12-01

    In this article, we both contend and illustrate that biological anthropologists, particularly in the Americas, often think like Bayesians but act like frequentists when it comes to analyzing a wide variety of data. In other words, while our research goals and perspectives are rooted in probabilistic thinking and rest on prior knowledge, we often proceed to use statistical hypothesis tests and confidence interval methods unrelated (or tenuously related) to the research questions of interest. We advocate for applying Bayesian analyses to a number of different bioanthropological questions, especially since many of the programming and computational challenges to doing so have been overcome in the past two decades. To facilitate such applications, this article explains Bayesian principles and concepts, and provides concrete examples of Bayesian computer simulations and statistics that address questions relevant to biological anthropology, focusing particularly on bioarchaeology and forensic anthropology. It also simultaneously reviews the use of Bayesian methods and inference within the discipline to date. This article is intended to act as primer to Bayesian methods and inference in biological anthropology, explaining the relationships of various methods to likelihoods or probabilities and to classical statistical models. Our contention is not that traditional frequentist statistics should be rejected outright, but that there are many situations where biological anthropology is better served by taking a Bayesian approach. To this end it is hoped that the examples provided in this article will assist researchers in choosing from among the broad array of statistical methods currently available. Copyright © 2013 Wiley Periodicals, Inc.

  1. Statistical analysis of solid waste composition data: Arithmetic mean, standard deviation and correlation coefficients.

    PubMed

    Edjabou, Maklawe Essonanawe; Martín-Fernández, Josep Antoni; Scheutz, Charlotte; Astrup, Thomas Fruergaard

    2017-11-01

    Data for fractional solid waste composition provide relative magnitudes of individual waste fractions, the percentages of which always sum to 100, thereby connecting them intrinsically. Due to this sum constraint, waste composition data represent closed data, and their interpretation and analysis require statistical methods, other than classical statistics that are suitable only for non-constrained data such as absolute values. However, the closed characteristics of waste composition data are often ignored when analysed. The results of this study showed, for example, that unavoidable animal-derived food waste amounted to 2.21±3.12% with a confidence interval of (-4.03; 8.45), which highlights the problem of the biased negative proportions. A Pearson's correlation test, applied to waste fraction generation (kg mass), indicated a positive correlation between avoidable vegetable food waste and plastic packaging. However, correlation tests applied to waste fraction compositions (percentage values) showed a negative association in this regard, thus demonstrating that statistical analyses applied to compositional waste fraction data, without addressing the closed characteristics of these data, have the potential to generate spurious or misleading results. Therefore, compositional data should be transformed adequately prior to any statistical analysis, such as computing mean, standard deviation and correlation coefficients. Copyright © 2017 Elsevier Ltd. All rights reserved.
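    One standard way to respect the sum constraint is to move to log-ratio coordinates before computing summary statistics. The sketch below applies a centred log-ratio (clr) transform to toy composition data; it illustrates the general approach, not necessarily the exact transformation the authors applied.

```python
# Centred log-ratio (clr) transform of compositional (closed) data before analysis.
import numpy as np

def clr(compositions):
    # rows sum to 100 (or 1); all parts must be strictly positive
    x = np.asarray(compositions, dtype=float)
    x = x / x.sum(axis=1, keepdims=True)
    g = np.exp(np.mean(np.log(x), axis=1, keepdims=True))   # per-sample geometric mean
    return np.log(x / g)

waste = np.array([[55, 25, 20],     # toy waste fractions (e.g. food, paper, plastic), in %
                  [60, 22, 18],
                  [48, 30, 22]])
print(clr(waste))   # means, SDs and correlations are then computed on these coordinates
```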

  2. Sensitivity Analyses of the Change in FVC in a Phase 3 Trial of Pirfenidone for Idiopathic Pulmonary Fibrosis.

    PubMed

    Lederer, David J; Bradford, Williamson Z; Fagan, Elizabeth A; Glaspole, Ian; Glassberg, Marilyn K; Glasscock, Kenneth F; Kardatzke, David; King, Talmadge E; Lancaster, Lisa H; Nathan, Steven D; Pereira, Carlos A; Sahn, Steven A; Swigris, Jeffrey J; Noble, Paul W

    2015-07-01

    FVC outcomes in clinical trials on idiopathic pulmonary fibrosis (IPF) can be substantially influenced by the analytic methodology and the handling of missing data. We conducted a series of sensitivity analyses to assess the robustness of the statistical finding and the stability of the estimate of the magnitude of treatment effect on the primary end point of FVC change in a phase 3 trial evaluating pirfenidone in adults with IPF. Source data included all 555 study participants randomized to treatment with pirfenidone or placebo in the Assessment of Pirfenidone to Confirm Efficacy and Safety in Idiopathic Pulmonary Fibrosis (ASCEND) study. Sensitivity analyses were conducted to assess whether alternative statistical tests and methods for handling missing data influenced the observed magnitude of treatment effect on the primary end point of change from baseline to week 52 in FVC. The distribution of FVC change at week 52 was systematically different between the two treatment groups and favored pirfenidone in each analysis. The method used to impute missing data due to death had a marked effect on the magnitude of change in FVC in both treatment groups; however, the magnitude of treatment benefit was generally consistent on a relative basis, with an approximate 50% reduction in FVC decline observed in the pirfenidone group in each analysis. Our results confirm the robustness of the statistical finding on the primary end point of change in FVC in the ASCEND trial and corroborate the estimated magnitude of the pirfenidone treatment effect in patients with IPF. ClinicalTrials.gov; No.: NCT01366209; URL: www.clinicaltrials.gov.

  3. An Improved LC-ESI-MS/MS Method to Quantify Pregabalin in Human Plasma and Dry Plasma Spot for Therapeutic Monitoring and Pharmacokinetic Applications.

    PubMed

    Dwivedi, Jaya; Namdev, Kuldeep K; Chilkoti, Deepak C; Verma, Surajpal; Sharma, Swapnil

    2018-06-06

    Therapeutic drug monitoring (TDM) of anti-epileptic drugs provides a valid clinical tool in optimization of overall therapy. However, TDM is challenging due to the high costs of storing and shipping biological samples (plasma/blood) and the limited availability of laboratories providing TDM services. Sampling in the form of dry plasma spot (DPS) or dry blood spot (DBS) is a suitable alternative to overcome these issues. An improved, simple, rapid, and stability-indicating method for quantification of pregabalin in human plasma and DPS has been developed and validated. Analyses were performed on a liquid chromatography tandem mass spectrometer operated in the positive ionization mode of the electrospray interface. Pregabalin-d4 was used as internal standard, and the chromatographic separations were performed on a Poroshell 120 EC-C18 column using an isocratic mobile phase flow rate of 1 mL/min. Stability of pregabalin in DPS was evaluated under simulated real-time conditions. Extraction procedures from plasma and DPS samples were compared using statistical tests. The method was validated in accordance with the FDA method validation guideline. The method was linear over the concentration range of 20-16000 ng/mL and 100-10000 ng/mL in plasma and DPS, respectively. DPS samples were found stable for only one week upon storage at room temperature and for at least four weeks at freezing temperature (-20 ± 5 °C). The method was applied for quantification of pregabalin in over 600 samples of a clinical study. Statistical analyses revealed no statistically significant difference between the two extraction procedures in plasma and DPS samples, which can therefore be used interchangeably without bias. The proposed method involves simple and rapid steps of sample processing that do not require a pre- or post-column derivatization procedure. The method is suitable for routine pharmacokinetic analysis and therapeutic monitoring of pregabalin.

  4. Effects of Interventions on Survival in Acute Respiratory Distress Syndrome: an Umbrella Review of 159 Published Randomized Trials and 29 Meta-analyses

    PubMed Central

    Tonelli, Adriano R.; Zein, Joe; Adams, Jacob; Ioannidis, John P.A.

    2014-01-01

    Purpose Multiple interventions have been tested in acute respiratory distress syndrome (ARDS). We examined the entire agenda of published randomized controlled trials (RCTs) in ARDS that reported on mortality and of respective meta-analyses. Methods We searched PubMed, the Cochrane Library and Web of Knowledge until July 2013. We included RCTs in ARDS published in English. We excluded trials of newborns and children, and those on short-term interventions, ARDS prevention or post-traumatic lung injury. We also reviewed all meta-analyses of RCTs in this field that addressed mortality. Treatment modalities were grouped into five categories: mechanical ventilation strategies and respiratory care, enteral or parenteral therapies, inhaled / intratracheal medications, nutritional support and hemodynamic monitoring. Results We identified 159 published RCTs of which 93 had overall mortality reported (n= 20,671 patients) - 44 trials (14,426 patients) reported mortality as a primary outcome. A statistically significant survival benefit was observed in 8 trials (7 interventions) and two trials reported an adverse effect on survival. Among RCTs with >50 deaths in at least 1 treatment arm (n=21), 2 showed a statistically significant mortality benefit of the intervention (lower tidal volumes and prone positioning), 1 showed a statistically significant mortality benefit only in adjusted analyses (cisatracurium) and 1 (high-frequency oscillatory ventilation) showed a significant detrimental effect. Across 29 meta-analyses, the most consistent evidence was seen for low tidal volumes and prone positioning in severe ARDS. Conclusions There is limited supportive evidence that specific interventions can decrease mortality in ARDS. While low tidal volumes and prone positioning in severe ARDS seem effective, most sporadic findings of interventions suggesting reduced mortality are not corroborated consistently in large-scale evidence including meta-analyses. PMID:24667919

  5. A common base method for analysis of qPCR data and the application of simple blocking in qPCR experiments.

    PubMed

    Ganger, Michael T; Dietz, Geoffrey D; Ewing, Sarah J

    2017-12-01

    qPCR has established itself as the technique of choice for the quantification of gene expression. Procedures for conducting qPCR have received significant attention; however, more rigorous approaches to the statistical analysis of qPCR data are needed. Here we develop a mathematical model, termed the Common Base Method, for analysis of qPCR data based on threshold cycle values (Cq) and efficiencies of reactions (E). The Common Base Method keeps all calculations in the log scale as long as possible by working with log10(E) · Cq, which we call the efficiency-weighted Cq value; subsequent statistical analyses are then applied in the log scale. We show how efficiency-weighted Cq values may be analyzed using a simple paired or unpaired experimental design and develop blocking methods to help reduce unexplained variation. The Common Base Method has several advantages. It allows for the incorporation of well-specific efficiencies and multiple reference genes. The method does not necessitate the pairing of samples that must be performed using traditional analysis methods in order to calculate relative expression ratios. Our method is also simple enough to be implemented in any spreadsheet or statistical software without additional scripts or proprietary components.
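
    One plausible reading of the calculation described above is sketched below with invented reaction data: each Cq is weighted by log10 of its reaction efficiency, relative expression is formed in log scale, and groups are compared with an ordinary t-test. The actual Common Base Method includes refinements (blocking, multiple reference genes, well-specific efficiencies) not shown here.

        # Sketch (hypothetical data): efficiency-weighted Cq values, log10(E) * Cq,
        # analysed in log scale with an unpaired t-test.
        import numpy as np
        from scipy import stats

        def weighted_cq(E, cq):
            return np.log10(E) * np.asarray(cq)

        ctrl_target = weighted_cq(1.95, [24.1, 23.8, 24.5, 24.0])
        ctrl_ref    = weighted_cq(1.98, [18.2, 18.0, 18.4, 18.1])
        trt_target  = weighted_cq(1.95, [22.3, 22.0, 22.6, 22.1])
        trt_ref     = weighted_cq(1.98, [18.1, 18.3, 18.0, 18.2])

        # Log-scale expression of the target gene relative to the reference gene.
        ctrl_delta = ctrl_ref - ctrl_target
        trt_delta  = trt_ref - trt_target

        t, p = stats.ttest_ind(trt_delta, ctrl_delta)
        print(f"log10 relative expression difference = {trt_delta.mean() - ctrl_delta.mean():.2f}, p = {p:.3f}")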

  6. Interim analyses in 2 x 2 crossover trials.

    PubMed

    Cook, R J

    1995-09-01

    A method is presented for performing interim analyses in long term 2 x 2 crossover trials with serial patient entry. The analyses are based on a linear statistic that combines data from individuals observed for one treatment period with data from individuals observed for both periods. The coefficients in this linear combination can be chosen quite arbitrarily, but we focus on variance-based weights to maximize power for tests regarding direct treatment effects. The type I error rate of this procedure is controlled by utilizing the joint distribution of the linear statistics over analysis stages. Methods for performing power and sample size calculations are indicated. A two-stage sequential design involving simultaneous patient entry and a single between-period interim analysis is considered in detail. The power and average number of measurements required for this design are compared to those of the usual crossover trial. The results indicate that, while there is minimal loss in power relative to the usual crossover design in the absence of differential carry-over effects, the proposed design can have substantially greater power when differential carry-over effects are present. The two-stage crossover design can also lead to more economical studies in terms of the expected number of measurements required, due to the potential for early stopping. Attention is directed toward normally distributed responses.
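
    The idea of a variance-weighted linear combination of the two sources of information can be illustrated generically; the estimates and variances below are invented, and the paper's actual statistic and its sequential type I error control are more involved.

        # Generic sketch: combine an estimate from patients observed for one period
        # with one from patients observed for both periods, using inverse-variance
        # (variance-based) weights.
        import numpy as np

        def combine(est1, var1, est2, var2):
            w1, w2 = 1.0 / var1, 1.0 / var2
            est = (w1 * est1 + w2 * est2) / (w1 + w2)
            return est, 1.0 / (w1 + w2)

        est, var = combine(est1=1.8, var1=0.9, est2=2.3, var2=0.4)   # hypothetical stage data
        print(f"combined direct-treatment-effect estimate = {est:.2f}, z = {est / np.sqrt(var):.2f}")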

  7. Investigation of 2-stage meta-analysis methods for joint longitudinal and time-to-event data through simulation and real data application.

    PubMed

    Sudell, Maria; Tudur Smith, Catrin; Gueyffier, François; Kolamunnage-Dona, Ruwanthi

    2018-04-15

    Joint modelling of longitudinal and time-to-event data is often preferred over separate longitudinal or time-to-event analyses as it can account for study dropout, error in longitudinally measured covariates, and correlation between longitudinal and time-to-event outcomes. The joint modelling literature focuses mainly on the analysis of single studies with no methods currently available for the meta-analysis of joint model estimates from multiple studies. We propose a 2-stage method for meta-analysis of joint model estimates. These methods are applied to the INDANA dataset to combine joint model estimates of systolic blood pressure with time to death, time to myocardial infarction, and time to stroke. Results are compared to meta-analyses of separate longitudinal or time-to-event models. A simulation study is conducted to contrast separate versus joint analyses over a range of scenarios. Using the real dataset, similar results were obtained by using the separate and joint analyses. However, the simulation study indicated a benefit of use of joint rather than separate methods in a meta-analytic setting where association exists between the longitudinal and time-to-event outcomes. Where evidence of association between longitudinal and time-to-event outcomes exists, results from joint models over standalone analyses should be pooled in 2-stage meta-analyses. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
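
    The second stage of a 2-stage approach can be sketched as standard random-effects pooling of study-specific estimates; the log hazard ratios and standard errors below are hypothetical stand-ins for study-level joint-model estimates, and the authors' implementation may differ in detail.

        # Sketch: pool study-specific log hazard ratios (e.g. the association of the
        # longitudinal marker with the event) with DerSimonian-Laird random effects.
        import numpy as np

        log_hr = np.array([0.12, 0.18, 0.09, 0.22, 0.15])   # hypothetical estimates
        se     = np.array([0.05, 0.07, 0.04, 0.09, 0.06])   # hypothetical standard errors

        w = 1.0 / se**2
        mu_fixed = np.sum(w * log_hr) / np.sum(w)

        # DerSimonian-Laird estimate of the between-study variance tau^2.
        q = np.sum(w * (log_hr - mu_fixed) ** 2)
        c = np.sum(w) - np.sum(w**2) / np.sum(w)
        tau2 = max(0.0, (q - (len(log_hr) - 1)) / c)

        w_re = 1.0 / (se**2 + tau2)
        mu = np.sum(w_re * log_hr) / np.sum(w_re)
        se_mu = np.sqrt(1.0 / np.sum(w_re))
        print(f"pooled HR = {np.exp(mu):.3f}, 95% CI "
              f"({np.exp(mu - 1.96 * se_mu):.3f}, {np.exp(mu + 1.96 * se_mu):.3f})")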

  8. Geostatistics and GIS: tools for characterizing environmental contamination.

    PubMed

    Henshaw, Shannon L; Curriero, Frank C; Shields, Timothy M; Glass, Gregory E; Strickland, Paul T; Breysse, Patrick N

    2004-08-01

    Geostatistics is a set of statistical techniques used in the analysis of georeferenced data that can be applied to environmental contamination and remediation studies. In this study, the 1,1-dichloro-2,2-bis(p-chlorophenyl)ethylene (DDE) contamination at a Superfund site in western Maryland is evaluated. Concern about the site and its future clean-up has triggered interest within the community because residential development surrounds the area. Spatial statistical methods, of which geostatistics is a subset, are becoming increasingly popular, in part due to the availability of geographic information system (GIS) software in a variety of application packages. In this article, the joint use of ArcGIS software and the R statistical computing environment is demonstrated as an approach for comprehensive geostatistical analyses. The spatial regression method, kriging, is used to provide predictions of DDE levels at unsampled locations both within the site and the surrounding areas where residential development is ongoing.
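
    As a self-contained illustration of the kriging idea (not the study's ArcGIS/R workflow), the sketch below performs ordinary kriging at one unsampled location with an assumed exponential semivariogram; coordinates, concentrations and variogram parameters are all invented, whereas a real analysis would fit the variogram to the data.

        # Ordinary kriging at a single unsampled location (hypothetical data,
        # assumed exponential semivariogram with hand-picked sill and range).
        import numpy as np

        def gamma(h, sill=1.0, rng=500.0):
            return sill * (1.0 - np.exp(-h / rng))

        xy = np.array([[0, 0], [400, 100], [150, 600], [700, 500], [300, 300]], float)
        z = np.array([2.1, 1.4, 0.9, 0.5, 1.7])          # e.g. log DDE concentrations
        target = np.array([350.0, 250.0])

        n = len(z)
        d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)   # pairwise distances
        d0 = np.linalg.norm(xy - target, axis=1)                      # distances to target

        # Ordinary kriging system with a Lagrange multiplier for unbiasedness.
        A = np.ones((n + 1, n + 1))
        A[:n, :n] = gamma(d)
        A[n, n] = 0.0
        b = np.append(gamma(d0), 1.0)

        weights = np.linalg.solve(A, b)[:n]
        print(f"kriged prediction at {target}: {weights @ z:.2f}")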

  9. Tipping points in the arctic: eyeballing or statistical significance?

    PubMed

    Carstensen, Jacob; Weydmann, Agata

    2012-02-01

    Arctic ecosystems have experienced and are projected to experience continued large increases in temperature and declines in sea ice cover. It has been hypothesized that small changes in ecosystem drivers can fundamentally alter ecosystem functioning, and that this might be particularly pronounced for Arctic ecosystems. We present a suite of simple statistical analyses to identify changes in the statistical properties of data, emphasizing that changes in the standard error should be considered in addition to changes in mean properties. The methods are exemplified using sea ice extent, and suggest that the loss rate of sea ice accelerated by a factor of ~5 in 1996, as reported in other studies, but increases in random fluctuations, as an early warning signal, were already observed in 1990. We recommend employing the proposed methods more systematically for analyzing tipping points to document effects of climate change in the Arctic.
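
    The notion of watching random fluctuations as well as the mean can be illustrated with synthetic data: fit a simple trend, then track a rolling standard deviation of the residuals as an early-warning indicator. The series below is invented and is not the sea ice record analysed in the paper.

        # Illustrative sketch (synthetic series): rolling standard deviation of
        # detrended values as a simple early-warning indicator.
        import numpy as np

        rng = np.random.default_rng(0)
        years = np.arange(1979, 2012)
        trend = 7.5 - 0.02 * (years - 1979) - 0.05 * np.clip(years - 1996, 0, None)
        noise = rng.normal(0.0, 0.05 + 0.01 * np.clip(years - 1990, 0, None))
        series = trend + noise

        resid = series - np.polyval(np.polyfit(years, series, 1), years)
        window = 5
        for i in range(window, len(resid) + 1):
            print(years[i - 1], round(float(resid[i - window:i].std()), 3))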

  10. When mechanism matters: Bayesian forecasting using models of ecological diffusion

    USGS Publications Warehouse

    Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.

    2017-01-01

    Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.

  11. A survey of design methods for failure detection in dynamic systems

    NASA Technical Reports Server (NTRS)

    Willsky, A. S.

    1975-01-01

    A number of methods for the detection of abrupt changes (such as failures) in stochastic dynamical systems were surveyed. The class of linear systems was emphasized, but the basic concepts, if not the detailed analyses, carry over to other classes of systems. The methods surveyed range from the design of specific failure-sensitive filters, to the use of statistical tests on filter innovations, to the development of jump process formulations. Tradeoffs in complexity versus performance are discussed.

  12. Quantitative analysis of tympanic membrane perforation: a simple and reliable method.

    PubMed

    Ibekwe, T S; Adeosun, A A; Nwaorgu, O G

    2009-01-01

    Accurate assessment of the features of tympanic membrane perforation, especially size, site, duration and aetiology, is important, as it enables optimum management. To describe a simple, cheap and effective method of quantitatively analysing tympanic membrane perforations. The system described comprises a video-otoscope (capable of generating still and video images of the tympanic membrane), adapted via a universal serial bus box to a computer screen, with images analysed using the Image J geometrical analysis software package. The reproducibility of results and their correlation with conventional otoscopic methods of estimation were tested statistically with the paired t-test and correlational tests, using the Statistical Package for the Social Sciences version 11 software. The following equation was generated: percentage perforation = (P/T) × 100%, where P is the area (in pixels²) of the tympanic membrane perforation and T is the total area (in pixels²) of the entire tympanic membrane (including the perforation). Illustrations are shown. Comparison of blinded data on tympanic membrane perforation area obtained independently from assessments by two trained otologists, of comparative years of experience, using the video-otoscopy system described, showed similar findings, with strong correlations devoid of inter-observer error (p = 0.000, r = 1). Comparison with conventional otoscopic assessment also indicated significant correlation, comparing results for two trained otologists, but some inter-observer variation was present (p = 0.000, r = 0.896). Correlation between the two methods for each of the otologists was also highly significant (p = 0.000). A computer-adapted video-otoscope, with images analysed by Image J software, represents a cheap, reliable, technology-driven, clinical method of quantitative analysis of tympanic membrane perforations and injuries.

  13. ParallABEL: an R library for generalized parallelization of genome-wide association studies

    PubMed Central

    2010-01-01

    Background Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. It is arduous acquiring the necessary programming skills to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files. Results Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics. The input data of this group includes the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample. The input data of this group includes individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses. The input data of this group includes pairs of individuals. The final group concerns pair-wise statistics derived for pairs of SNPs, such as the linkage disequilibrium characterisation. The input data of this group includes pairs of SNPs. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. The ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC), comprising 2,062 individuals genotyped at 545,080 SNPs, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity-by-state matrix was linearly reduced from approximately eight hours to one hour when ParallABEL employed eight processors. Conclusions Executing genome-wide association analysis using the ParallABEL library on a computer cluster is an effective way to boost performance, and simplify the parallelization of GWA studies. ParallABEL is a user-friendly parallelization of GenABEL. PMID:20429914

  14. Multivariate two-part statistics for analysis of correlated mass spectrometry data from multiple biological specimens.

    PubMed

    Taylor, Sandra L; Ruhaak, L Renee; Weiss, Robert H; Kelly, Karen; Kim, Kyoungmi

    2017-01-01

    High-throughput mass spectrometry (MS) is now being used to profile small molecular compounds across multiple biological sample types from the same subjects with the goal of leveraging information across biospecimens. Multivariate statistical methods that combine information from all biospecimens could be more powerful than the usual univariate analyses. However, missing values are common in MS data and imputation can impact between-biospecimen correlation and multivariate analysis results. We propose two multivariate two-part statistics that accommodate missing values and combine data from all biospecimens to identify differentially regulated compounds. Statistical significance is determined using a multivariate permutation null distribution. Relative to univariate tests, the multivariate procedures detected more significant compounds in three biological datasets. In a simulation study, we showed that multi-biospecimen testing procedures were more powerful than single-biospecimen methods when compounds are differentially regulated in multiple biospecimens but univariate methods can be more powerful if compounds are differentially regulated in only one biospecimen. We provide R functions to implement and illustrate our method as supplementary information. Contact: sltaylor@ucdavis.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
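
    A generic univariate version of a two-part statistic can be sketched as follows: one component compares detection (presence) rates, the other compares observed abundances, and the two are combined and referred to a permutation null. This illustrates the general idea only, not the authors' multivariate statistics; all abundance values are invented.

        # Generic two-part statistic for one compound in two groups, with a
        # permutation p-value (np.nan marks non-detected values).
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(1)

        def two_part_stat(x, y):
            dx, dy = ~np.isnan(x), ~np.isnan(y)
            # Part 1: z-statistic for the difference in detection proportions.
            p_pool = (dx.sum() + dy.sum()) / (len(x) + len(y))
            se = np.sqrt(p_pool * (1 - p_pool) * (1 / len(x) + 1 / len(y)))
            z_detect = (dx.mean() - dy.mean()) / se if se > 0 else 0.0
            # Part 2: t-statistic on detected abundances only.
            z_amount = 0.0
            if dx.sum() > 1 and dy.sum() > 1:
                z_amount, _ = stats.ttest_ind(x[dx], y[dy], equal_var=False)
            return z_detect**2 + z_amount**2

        grp1 = np.array([5.2, np.nan, 4.8, 6.1, np.nan, 5.5])
        grp2 = np.array([3.9, 4.1, np.nan, 3.6, 4.4, 4.0])

        obs = two_part_stat(grp1, grp2)
        pooled = np.concatenate([grp1, grp2])
        perm = []
        for _ in range(2000):
            rng.shuffle(pooled)
            perm.append(two_part_stat(pooled[:len(grp1)], pooled[len(grp1):]))
        print(f"two-part statistic = {obs:.2f}, permutation p = {np.mean(np.array(perm) >= obs):.3f}")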

  15. Anticoagulant vs. antiplatelet therapy in patients with cryptogenic stroke and patent foramen ovale: an individual participant data meta-analysis.

    PubMed

    Kent, David M; Dahabreh, Issa J; Ruthazer, Robin; Furlan, Anthony J; Weimar, Christian; Serena, Joaquín; Meier, Bernhard; Mattle, Heinrich P; Di Angelantonio, Emanuele; Paciaroni, Maurizio; Schuchlenz, Herwig; Homma, Shunichi; Lutz, Jennifer S; Thaler, David E

    2015-09-14

    The preferred antithrombotic strategy for secondary prevention in patients with cryptogenic stroke (CS) and patent foramen ovale (PFO) is unknown. We pooled multiple observational studies and used propensity score-based methods to estimate the comparative effectiveness of oral anticoagulation (OAC) compared with antiplatelet therapy (APT). Individual participant data from 12 databases of medically treated patients with CS and PFO were analysed with Cox regression models, to estimate database-specific hazard ratios (HRs) comparing OAC with APT, for both the primary composite outcome [recurrent stroke, transient ischaemic attack (TIA), or death] and stroke alone. Propensity scores were applied via inverse probability of treatment weighting to control for confounding. We synthesized database-specific HRs using random-effects meta-analysis models. This analysis included 2385 (OAC = 804 and APT = 1581) patients with 227 composite endpoints (stroke/TIA/death). The difference between OAC and APT was not statistically significant for the primary composite outcome [adjusted HR = 0.76, 95% confidence interval (CI) 0.52-1.12] or for the secondary outcome of stroke alone (adjusted HR = 0.75, 95% CI 0.44-1.27). Results were consistent in analyses applying alternative weighting schemes, with the exception that OAC had a statistically significant beneficial effect on the composite outcome in analyses standardized to the patient population who actually received APT (adjusted HR = 0.64, 95% CI 0.42-0.99). Subgroup analyses did not detect statistically significant heterogeneity of treatment effects across clinically important patient groups. We did not find a statistically significant difference comparing OAC with APT; our results justify randomized trials comparing different antithrombotic approaches in these patients. Published on behalf of the European Society of Cardiology. All rights reserved. © The Author 2015. For permissions please email: journals.permissions@oup.com.

  16. Adopting a Patient-Centered Approach to Primary Outcome Analysis of Acute Stroke Trials by Use of a Utility-Weighted Modified Rankin Scale

    PubMed Central

    Chaisinanunkul, Napasri; Adeoye, Opeolu; Lewis, Roger J.; Grotta, James C.; Broderick, Joseph; Jovin, Tudor G.; Nogueira, Raul G.; Elm, Jordan; Graves, Todd; Berry, Scott; Lees, Kennedy R.; Barreto, Andrew D.; Saver, Jeffrey L.

    2015-01-01

    Background and Purpose Although the modified Rankin Scale (mRS) is the most commonly employed primary endpoint in acute stroke trials, its power is limited when analyzed in dichotomized fashion and its indication of effect size is challenging to interpret when analyzed ordinally. Weighting the seven Rankin levels by utilities may improve scale interpretability while preserving statistical power. Methods A utility weighted mRS (UW-mRS) was derived by averaging values from time-tradeoff (patient centered) and person-tradeoff (clinician centered) studies. The UW-mRS, standard ordinal mRS, and dichotomized mRS were applied to 11 trials or meta-analyses of acute stroke treatments, including lytic, endovascular reperfusion, blood pressure moderation, and hemicraniectomy interventions. Results Utility values were: mRS 0, 1.0; mRS 1, 0.91; mRS 2, 0.76; mRS 3, 0.65; mRS 4, 0.33; mRS 5 and 6, 0. For trials with unidirectional treatment effects, the UW-mRS paralleled the ordinal mRS and outperformed dichotomous mRS analyses. Both the UW-mRS and the ordinal mRS were statistically significant in six of eight unidirectional effect trials, while dichotomous analyses were statistically significant in two to four of eight. In bidirectional effect trials, both the UW-mRS and ordinal tests captured the divergent treatment effects by showing neutral results whereas some dichotomized analyses showed positive results. Mean utility differences in trials with statistically significant positive results ranged from 0.026 to 0.249. Conclusion A utility-weighted mRS performs similarly to the standard ordinal mRS in detecting treatment effects in actual stroke trials and ensures the quantitative outcome is a valid reflection of patient-centered benefits. PMID:26138130
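
    Applying the utility weights reported above is straightforward; the sketch below maps hypothetical 90-day mRS scores to utilities and compares mean utility between two arms. The trial data are invented, and the actual trial analyses used the models described in the paper.

        # Sketch: map mRS scores to the reported utility weights and compare
        # mean utilities between two hypothetical treatment arms.
        import numpy as np
        from scipy import stats

        UTILITY = {0: 1.0, 1: 0.91, 2: 0.76, 3: 0.65, 4: 0.33, 5: 0.0, 6: 0.0}

        def to_utility(mrs_scores):
            return np.array([UTILITY[s] for s in mrs_scores])

        treated = to_utility([0, 1, 1, 2, 2, 3, 4, 4, 5, 6])
        control = to_utility([1, 2, 2, 3, 3, 4, 4, 5, 6, 6])

        t, p = stats.ttest_ind(treated, control)
        print(f"mean utility difference = {treated.mean() - control.mean():.3f}, p = {p:.3f}")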

  17. Longitudinal data analyses using linear mixed models in SPSS: concepts, procedures and illustrations.

    PubMed

    Shek, Daniel T L; Ma, Cecilia M S

    2011-01-05

    Although different methods are available for the analyses of longitudinal data, analyses based on generalized linear models (GLM) are criticized as violating the assumption of independence of observations. Alternatively, linear mixed models (LMM) are commonly used to understand changes in human behavior over time. In this paper, the basic concepts surrounding LMM (or hierarchical linear models) are outlined. Although SPSS is a statistical analyses package commonly used by researchers, documentation on LMM procedures in SPSS is not thorough or user friendly. With reference to this limitation, the related procedures for performing analyses based on LMM in SPSS are described. To demonstrate the application of LMM analyses in SPSS, findings based on six waves of data collected in the Project P.A.T.H.S. (Positive Adolescent Training through Holistic Social Programmes) in Hong Kong are presented.

  18. Longitudinal Data Analyses Using Linear Mixed Models in SPSS: Concepts, Procedures and Illustrations

    PubMed Central

    Shek, Daniel T. L.; Ma, Cecilia M. S.

    2011-01-01

    Although different methods are available for the analyses of longitudinal data, analyses based on generalized linear models (GLM) are criticized as violating the assumption of independence of observations. Alternatively, linear mixed models (LMM) are commonly used to understand changes in human behavior over time. In this paper, the basic concepts surrounding LMM (or hierarchical linear models) are outlined. Although SPSS is a statistical analyses package commonly used by researchers, documentation on LMM procedures in SPSS is not thorough or user friendly. With reference to this limitation, the related procedures for performing analyses based on LMM in SPSS are described. To demonstrate the application of LMM analyses in SPSS, findings based on six waves of data collected in the Project P.A.T.H.S. (Positive Adolescent Training through Holistic Social Programmes) in Hong Kong are presented. PMID:21218263

  19. "What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"

    ERIC Educational Resources Information Center

    Ozturk, Elif

    2012-01-01

    The present paper aims to review two motivations to conduct "what if" analyses using Excel and "R" to understand the statistical significance tests through the sample size context. "What if" analyses can be used to teach students what statistical significance tests really do and in applied research either prospectively to estimate what sample size…

  20. Assessment and statistics of surgically induced astigmatism.

    PubMed

    Naeser, Kristian

    2008-05-01

    The aim of the thesis was to develop methods for assessment of surgically induced astigmatism (SIA) in individual eyes, and in groups of eyes. The thesis is based on 12 peer-reviewed publications, published over a period of 16 years. In these publications, older and contemporary literature was reviewed(1). A new method (the polar system) for analysis of SIA was developed. Multivariate statistical analysis of refractive data was described(2-4). Clinical validation studies were performed. Descriptions of a cylinder surface using polar values and using differential geometry were compared. The main results were: refractive data in the form of sphere, cylinder and axis may define an individual patient or data set, but are unsuited for mathematical and statistical analyses(1). The polar value system converts net astigmatisms to orthonormal components in dioptric space. A polar value is the difference in meridional power between two orthogonal meridians(5,6). Any pair of polar values, separated by an arc of 45 degrees, characterizes a net astigmatism completely(7). The two polar values represent the net curvital and net torsional power over the chosen meridian(8). The spherical component is described by the spherical equivalent power. Several clinical studies demonstrated the efficiency of multivariate statistical analysis of refractive data(4,9-11). Polar values and formal differential geometry describe astigmatic surfaces with similar concepts and mathematical functions(8). Other contemporary methods, such as Long's power matrix, Holladay's and Alpins' methods, Zernike(12) and Fourier analyses(8), are correlated to the polar value system. In conclusion, analysis of SIA should be performed with polar values or other contemporary component systems. The study was supported by Statens Sundhedsvidenskabeligt Forskningsråd, Cykelhandler P. Th. Rasmussen og Hustrus Mindelegat, Hotelejer Carl Larsen og Hustru Nicoline Larsens Mindelegat, Landsforeningen til Vaern om Synet, Forskningsinitiativet for Arhus Amt, Alcon Denmark, and Desirée and Niels Ydes Fond.
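
    Working directly from the definition given above (a polar value is the difference in meridional power between two orthogonal meridians), a small sketch with an invented plus-cylinder refraction is shown below; the sign and axis conventions are illustrative and may differ from those used in the thesis.

        # Sketch: meridional power of a spherocylinder and polar values taken
        # 45 degrees apart, following the definition quoted in the abstract.
        import numpy as np

        def meridional_power(sphere, cyl, axis_deg, meridian_deg):
            a = np.radians(meridian_deg - axis_deg)
            return sphere + cyl * np.sin(a) ** 2          # plus-cylinder convention

        def polar_value(cyl, axis_deg, meridian_deg):
            return (meridional_power(0.0, cyl, axis_deg, meridian_deg)
                    - meridional_power(0.0, cyl, axis_deg, meridian_deg + 90.0))

        cyl, axis = 1.50, 20.0                            # hypothetical net astigmatism
        kp0, kp45 = polar_value(cyl, axis, 0.0), polar_value(cyl, axis, 45.0)
        print(round(kp0, 3), round(kp45, 3))
        print("recovered cylinder magnitude:", round(float(np.hypot(kp0, kp45)), 3))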

  1. Improving the Prognostic Ability through Better Use of Standard Clinical Data - The Nottingham Prognostic Index as an Example

    PubMed Central

    Winzer, Klaus-Jürgen; Buchholz, Anika; Schumacher, Martin; Sauerbrei, Willi

    2016-01-01

    Background Prognostic factors and prognostic models play a key role in medical research and patient management. The Nottingham Prognostic Index (NPI) is a well-established prognostic classification scheme for patients with breast cancer. In a very simple way, it combines the information from tumor size, lymph node stage and tumor grade. For the resulting index, cutpoints are proposed to classify it into three to six groups with different prognoses. As not all prognostic information from the three and other standard factors is used, we will consider improvement of the prognostic ability using suitable analysis approaches. Methods and Findings Reanalyzing overall survival data of 1560 patients from a clinical database by using multivariable fractional polynomials and further modern statistical methods, we illustrate suitable multivariable modelling and methods to derive and assess the prognostic ability of an index. Using a REMARK-type profile, we summarize relevant steps of the analysis. Adding the information from hormonal receptor status and using the full information from the three NPI components, specifically concerning the number of positive lymph nodes, an extended NPI with improved prognostic ability is derived. Conclusions The prognostic ability of even one of the best-established prognostic indices in medicine can be improved by using suitable statistical methodology to extract the full information from standard clinical data. This extended version of the NPI can serve as a benchmark to assess the added value of new information, ranging from a new single clinical marker to a derived index from omics data. An established benchmark would also help to harmonize the statistical analyses of such studies and protect against the propagation of many false promises concerning the prognostic value of new measurements. Statistical methods used are generally available and can be used for similar analyses in other diseases. PMID:26938061
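
    The abstract does not restate the index itself; in its commonly cited form the NPI combines the three factors as 0.2 x tumour size (cm) + lymph node stage (1-3) + histological grade (1-3), with cutpoints defining prognostic groups. The sketch below uses that common form and one widely quoted set of cutpoints, both of which vary slightly between reports.

        # Commonly cited form of the Nottingham Prognostic Index (not taken from
        # the abstract above); cutpoints vary between publications.
        def nottingham_prognostic_index(size_cm: float, node_stage: int, grade: int) -> float:
            return 0.2 * size_cm + node_stage + grade

        def npi_group(npi: float) -> str:
            if npi <= 3.4:
                return "good prognosis"
            if npi <= 5.4:
                return "moderate prognosis"
            return "poor prognosis"

        npi = nottingham_prognostic_index(size_cm=2.2, node_stage=2, grade=3)
        print(round(npi, 2), npi_group(npi))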

  2. Comparison of methods for estimating flood magnitudes on small streams in Georgia

    USGS Publications Warehouse

    Hess, Glen W.; Price, McGlone

    1989-01-01

    The U.S. Geological Survey has collected flood data for small, natural streams at many sites throughout Georgia during the past 20 years. Flood-frequency relations were developed for these data using four methods: (1) observed (log-Pearson Type III analysis) data, (2) rainfall-runoff model, (3) regional regression equations, and (4) map-model combination. The results of the latter three methods were compared to the analyses of the observed data in order to quantify the differences in the methods and determine if the differences are statistically significant.

  3. [Correlation of dental age and anthropometric parameters of the overall growth and development in children].

    PubMed

    Triković-Janjić, Olivera; Apostolović, Mirjana; Janosević, Mirjana; Filipović, Gordana

    2008-02-01

    Anthropometric methods of measuring the whole body and body parts are the most commonly applied methods of analysing the growth and development of children. Anthropometric measures are interconnected, so that with growth and development the change of one of the parameters causes the change of the other. The aim of the paper was to analyse whether dental development follows the overall growth and development and to what extent the two are interdependent. The research involved a sample of 134 participants, aged between 6 and 8 years. Dental age was determined as the average of the sum of existing permanent teeth from the participants aged 6, 7 and 8. With the aim of analysing physical growth and development, commonly accepted anthropometric indexes were applied: height, weight, circumference of the head, the chest cavity at its widest point, the upper arm, the abdomen, the thigh and thickness of the epidermis. The dimensions were measured according to the methodology of the International Biological Programme. The influence of the relevant explanatory variables on the analysed variable was determined by multivariable regression. The mean values of all the anthropometric parameters, except for the thickness of the epidermis, were slightly larger in male participants, and the circumference of the chest cavity was statistically significantly larger (p < 0.05). The results of anthropometric measurement showed in general a distinct homogeneity not only of the sample group but also within gender, in relation to all the dimensions, except for the thickness of the epidermis. The average of the dental age of the participants was 10.36 (10.42 and 10.31 for females and males respectively). Considerable correlation (R = 0.59) with high statistical significance (p < 0.001) was determined between dental age and the set of anthropometric parameters of general growth and development. There is a considerable positive correlation (R = 0.59) between dental age and anthropometric parameters of general growth and development, which confirms that dental development follows the overall growth and development of children, aged between 6 and 8 years.

  4. Association analysis of multiple traits by an approach of combining P values.

    PubMed

    Chen, Lili; Wang, Yong; Zhou, Yajing

    2018-03-01

    Increasing evidence shows that one variant can affect multiple traits, which is a widespread phenomenon in complex diseases. Joint analysis of multiple traits can increase the statistical power of association analysis and uncover the underlying genetic mechanism. Although there are many statistical methods to analyse multiple traits, most of these methods are usually suitable for detecting common variants associated with multiple traits. However, because of the low minor allele frequencies of rare variants, these methods are not optimal for rare variant association analysis. In this paper, we extend an adaptive combination of P values method (termed ADA), originally developed for a single trait, to test associations between multiple traits and rare variants in a given region. For a given region, we use a reverse regression model to test the association of each rare variant with the multiple traits and obtain the P value of the single-variant test. Further, we take the weighted combination of these P values as the test statistic. Extensive simulation studies show that our approach is more powerful than several other comparison methods in most cases and is robust to the inclusion of a high proportion of neutral variants and the different directions of effects of causal variants.
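
    A generic sketch of a weighted combination of per-variant P values with a permutation null is given below; the per-variant test is a simple carrier versus non-carrier comparison standing in for the reverse-regression test, so this illustrates the flavour of the approach rather than the exact ADA procedure (which, for example, adaptively truncates small P values). All genotypes and traits are simulated.

        # Generic sketch: weighted combination of per-variant P values in a region,
        # significance assessed by permutation of the trait matrix.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(2)
        n, m = 500, 20
        geno = (rng.random((n, m)) < 0.02).astype(float)     # simulated rare genotypes
        traits = rng.normal(size=(n, 2))
        traits[:, 0] += 0.8 * geno[:, :3].sum(axis=1)        # variants 0-2 affect trait 1

        def region_stat(traits, geno, weights):
            pvals = np.ones(geno.shape[1])
            for j in range(geno.shape[1]):
                carrier = geno[:, j] > 0
                if 2 <= carrier.sum() <= n - 2:
                    p_each = [stats.ttest_ind(t[carrier], t[~carrier])[1] for t in traits.T]
                    pvals[j] = min(1.0, min(p_each) * traits.shape[1])   # Bonferroni over traits
            return float(np.sum(weights * -np.log(pvals)))

        weights = np.ones(m) / m
        obs = region_stat(traits, geno, weights)
        perms = [region_stat(traits[rng.permutation(n)], geno, weights) for _ in range(500)]
        print(f"region statistic = {obs:.2f}, permutation p = {np.mean(np.array(perms) >= obs):.3f}")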

  5. Performance testing of NIOSH Method 5524/ASTM Method D-7049-04, for determination of metalworking fluids.

    PubMed

    Glaser, Robert; Kurimo, Robert; Shulman, Stanley

    2007-08-01

    A performance test of NIOSH Method 5524/ASTM Method D-7049-04 for analysis of metalworking fluids (MWF) was conducted. These methods involve determination of the total and extractable weights of MWF samples; extractions are performed using a ternary blend of toluene:dichloromethane:methanol and a binary blend of methanol:water. Six laboratories participated in this study. A preliminary analysis of 20 blank samples was made to familiarize the laboratories with the procedure(s) and to estimate the methods' limits of detection/quantitation (LODs/LOQs). Synthetically generated samples of a semisynthetic MWF aerosol were then collected on tared polytetrafluoroethylene (PTFE) filters and analyzed according to the methods by all participants. Sample masses deposited (approximately 400-500 µg) corresponded to amounts expected in an 8-hr shift at the NIOSH recommended exposure levels (REL) of 0.4 mg/m³ (thoracic) and 0.5 mg/m³ (total particulate). The generator output was monitored with a calibrated laser particle counter. One laboratory significantly underreported the sampled masses relative to the other five labs. A follow-up study compared only gravimetric results of this laboratory with those of two other labs. In the preliminary analysis of blanks, the average LOQs were 0.094 mg for the total weight analysis and 0.136 mg for the extracted weight analyses. For the six-lab study, the average LOQs were 0.064 mg for the total weight analyses and 0.067 mg for the extracted weight analyses. Using ASTM conventions, h and k statistics were computed to determine the degree of consistency of each laboratory with the others. One laboratory experienced problems with precision but not bias. The precision estimates for the remaining five labs were not different statistically (alpha = 0.005) for either the total or extractable weights. For all six labs, the average fraction extracted was ≥0.94 (CV = 0.025). Pooled estimates of the total coefficients of variation of analysis were 0.13 for the total weight samples and 0.13 for the extracted weight samples. An overall method bias of -5% was determined by comparing the overall mean concentration reported by the participants to that determined by the particle counter. In the three-lab follow-up study, the nonconsistent lab reported results that were unbiased but statistically less precise than the others; the average LOQ was 0.133 mg for the total weight analyses. It is concluded that aerosolized MWF sampled at concentrations corresponding to either of the NIOSH RELs can generally be shipped unrefrigerated, stored refrigerated up to 7 days, and then analyzed quantitatively and precisely for MWF using the NIOSH/ASTM procedures.

  6. Pediatric patient safety events during hospitalization: approaches to accounting for institution-level effects.

    PubMed

    Slonim, Anthony D; Marcin, James P; Turenne, Wendy; Hall, Matt; Joseph, Jill G

    2007-12-01

    To determine the rates, patient, and institutional characteristics associated with the occurrence of patient safety indicators (PSIs) in hospitalized children and the degree of statistical difference derived from using three approaches to controlling for institution-level effects. Pediatric Health Information System Dataset consisting of all pediatric discharges (<21 years of age) from 34 academic, freestanding children's hospitals for calendar year 2003. The rates of PSIs were computed for all discharges. The patient and institutional characteristics associated with these PSIs were calculated. The analyses sequentially applied three increasingly conservative methods to control for institution-level effects: robust standard error estimation, a fixed effects model, and a random effects model. The degree of difference from a "base state," which excluded institution-level variables, and between the models was calculated. The effects of these analyses on the interpretation of the PSIs are presented. PSIs are relatively infrequent events in hospitalized children, ranging from 0 per 10,000 (postoperative hip fracture) to 87 per 10,000 (postoperative respiratory failure). Significant variables associated with PSIs included age (neonates), race (Caucasians), payor status (public insurance), severity of illness (extreme), and hospital size (>300 beds), which all had higher rates of PSIs than their reference groups in the bivariable logistic regression results. The three different approaches of adjusting for institution-level effects demonstrated that there were similarities in both the clinical and statistical significance across each of the models. Institution-level effects can be appropriately controlled for by using a variety of methods in the analyses of administrative data. Whenever possible, resource-conservative methods should be used in the analyses, especially if clinical implications are minimal.

  7. Assessing groundwater vulnerability to agrichemical contamination in the Midwest US

    USGS Publications Warehouse

    Burkart, M.R.; Kolpin, D.W.; James, D.E.

    1999-01-01

    Agrichemicals (herbicides and nitrate) are significant sources of diffuse pollution to groundwater. Indirect methods are needed to assess the potential for groundwater contamination by diffuse sources because groundwater monitoring is too costly to adequately define the geographic extent of contamination at a regional or national scale. This paper presents examples of the application of statistical, overlay and index, and process-based modeling methods for groundwater vulnerability assessments to a variety of data from the Midwest U.S. The principles for vulnerability assessment include both intrinsic (pedologic, climatologic, and hydrogeologic factors) and specific (contaminant and other anthropogenic factors) vulnerability of a location. Statistical methods use the frequency of contaminant occurrence, contaminant concentration, or contamination probability as a response variable. Statistical assessments are useful for defining the relations among explanatory and response variables whether they define intrinsic or specific vulnerability. Multivariate statistical analyses are useful for ranking variables critical to estimating water quality responses of interest. Overlay and index methods involve intersecting maps of intrinsic and specific vulnerability properties and indexing the variables by applying appropriate weights. Deterministic models use process-based equations to simulate contaminant transport and are distinguished from the other methods in their potential to predict contaminant transport in both space and time. An example of a one-dimensional leaching model linked to a geographic information system (GIS) to define a regional metamodel for contamination in the Midwest is included.

  8. Research on Visual Analysis Methods of Terrorism Events

    NASA Astrophysics Data System (ADS)

    Guo, Wenyue; Liu, Haiyan; Yu, Anzhu; Li, Jing

    2016-06-01

    As terrorist events occur with increasing frequency throughout the world, improving the capability to respond to social security incidents has become an important test of governments' governance ability. Visual analysis has become an important method for analysing events because it is intuitive and effective. To analyse the spatio-temporal distribution characteristics of events, the correlations among event items and development trends, the spatio-temporal characteristics of terrorist events are first discussed. A suitable event data table structure based on the "5W" theory is designed. Then, six types of visual analysis are proposed, and the use of thematic maps and statistical charts to realise visual analysis of terrorist events is studied. Finally, experiments were carried out using data from the Global Terrorism Database, and the results demonstrate the feasibility of the methods.

  9. Proliferative Changes in the Bronchial Epithelium of Former Smokers Treated With Retinoids

    PubMed Central

    Hittelman, Walter N.; Liu, Diane D.; Kurie, Jonathan M.; Lotan, Reuben; Lee, Jin Soo; Khuri, Fadlo; Ibarguen, Heladio; Morice, Rodolfo C.; Walsh, Garrett; Roth, Jack A.; Minna, John; Ro, Jae Y.; Broxson, Anita; Hong, Waun Ki; Lee, J. Jack

    2012-01-01

    Background Retinoids have shown antiproliferative and chemopreventive activity. We analyzed data from a randomized, placebo-controlled chemoprevention trial to determine whether a 3-month treatment with either 9-cis-retinoic acid (RA) or 13-cis-RA and α-tocopherol reduced Ki-67, a proliferation biomarker, in the bronchial epithelium. Methods Former smokers (n = 225) were randomly assigned to receive 3 months of daily oral 9-cis-RA (100 mg), 13-cis-RA (1 mg/kg) and α-tocopherol (1200 IU), or placebo. Bronchoscopic biopsy specimens obtained before and after treatment were immunohistochemically assessed for changes in the Ki-67 proliferative index (i.e., percentage of cells with Ki-67–positive nuclear staining) in the basal and parabasal layers of the bronchial epithelium. Per-subject and per–biopsy site analyses were conducted. Multicovariable analyses, including a mixed-effects model and a generalized estimating equations model, were used to investigate the treatment effect (Ki-67 labeling index and percentage of bronchial epithelial biopsy sites with a Ki-67 index ≥ 5%) with adjustment for multiple covariates, such as smoking history and metaplasia. Coefficient estimates and 95% confidence intervals (CIs) were obtained from the models. All statistical tests were two-sided. Results In per-subject analyses, Ki-67 labeling in the basal layer was not changed by any treatment; the percentage of subjects with a high Ki-67 labeling in the parabasal layer dropped statistically significantly after 13-cis-RA and α-tocopherol treatment (P = .04) compared with placebo, but the drop was not statistically significant after 9-cis-RA treatment (P = .17). A similar effect was observed in the parabasal layer in a per-site analysis; the percentage of sites with high Ki-67 labeling dropped statistically significantly after 9-cis-RA treatment (coefficient estimate = −0.72, 95% CI = −1.24 to −0.20; P = .007) compared with placebo, and after 13-cis-RA and α-tocopherol treatment (coefficient estimate = −0.66, 95% CI = −1.15 to −0.17; P = .008). Conclusions In per-subject analyses, treatment with 13-cis-RA and α-tocopherol, compared with placebo, was statistically significantly associated with reduced bronchial epithelial cell proliferation; treatment with 9-cis-RA was not. In per-site analyses, statistically significant associations were obtained with both treatments. PMID:17971525

  10. A new method for estimating the usual intake of episodically-consumed foods with application to their distribution

    PubMed Central

    Midthune, Douglas; Dodd, Kevin W.; Freedman, Laurence S.; Krebs-Smith, Susan M.; Subar, Amy F.; Guenther, Patricia M.; Carroll, Raymond J.; Kipnis, Victor

    2007-01-01

    Objective We propose a new statistical method that uses information from two 24-hour recalls (24HRs) to estimate usual intake of episodically-consumed foods. Statistical Analyses Performed The method developed at the National Cancer Institute (NCI) accommodates the large number of non-consumption days that arise with foods by separating the probability of consumption from the consumption-day amount, using a two-part model. Covariates, such as sex, age, race, or information from a food frequency questionnaire (FFQ), may supplement the information from two or more 24HRs using correlated mixed model regression. The model allows for correlation between the probability of consuming a food on a single day and the consumption-day amount. Percentiles of the distribution of usual intake are computed from the estimated model parameters. Results The Eating at America's Table Study (EATS) data are used to illustrate the method to estimate the distribution of usual intake for whole grains and dark green vegetables for men and women and the distribution of usual intakes of whole grains by educational level among men. A simulation study indicates that the NCI method leads to substantial improvement over existing methods for estimating the distribution of usual intake of foods. Applications/Conclusions The NCI method provides distinct advantages over previously proposed methods by accounting for the correlation between probability of consumption and amount consumed and by incorporating covariate information. Researchers interested in estimating the distribution of usual intakes of foods for a population or subpopulation are advised to work with a statistician and incorporate the NCI method in analyses. PMID:17000190
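
    The core two-part idea (probability of consumption times the usual consumption-day amount) can be sketched very simply; the recalls below are invented, and the full NCI method additionally models person-specific random effects, covariates and within-person variation rather than using raw recall means.

        # Simplified sketch of the two-part idea: usual intake = probability of
        # consumption x mean amount on consumption days (hypothetical 24-hour
        # recalls, 0 = food not consumed on that day).
        import numpy as np

        recalls = np.array([
            [0.0, 1.5],
            [0.0, 0.0],
            [2.0, 1.0],
            [0.5, 0.0],
            [1.2, 2.3],
        ])

        prob_consume = (recalls > 0).mean(axis=1)
        mean_amount = np.array([row[row > 0].mean() if (row > 0).any() else 0.0 for row in recalls])
        print((prob_consume * mean_amount).round(2))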

  11. Training in metabolomics research. II. Processing and statistical analysis of metabolomics data, metabolite identification, pathway analysis, applications of metabolomics and its future.

    PubMed

    Barnes, Stephen; Benton, H Paul; Casazza, Krista; Cooper, Sara J; Cui, Xiangqin; Du, Xiuxia; Engler, Jeffrey; Kabarowski, Janusz H; Li, Shuzhao; Pathmasiri, Wimal; Prasain, Jeevan K; Renfrow, Matthew B; Tiwari, Hemant K

    2016-08-01

    Metabolomics, a systems biology discipline representing analysis of known and unknown pathways of metabolism, has grown tremendously over the past 20 years. Because of its comprehensive nature, metabolomics requires careful consideration of the question(s) being asked, the scale needed to answer the question(s), collection and storage of the sample specimens, methods for extraction of the metabolites from biological matrices, the analytical method(s) to be employed and the quality control of the analyses, how collected data are correlated, the statistical methods to determine metabolites undergoing significant change, putative identification of metabolites and the use of stable isotopes to aid in verifying metabolite identity and establishing pathway connections and fluxes. This second part of a comprehensive description of the methods of metabolomics focuses on data analysis, emerging methods in metabolomics and the future of this discipline. Copyright © 2016 John Wiley & Sons, Ltd.

  12. Generalising Ward's Method for Use with Manhattan Distances.

    PubMed

    Strauss, Trudie; von Maltitz, Michael Johan

    2017-01-01

    The claim that Ward's linkage algorithm in hierarchical clustering is limited to use with Euclidean distances is investigated. In this paper, Ward's clustering algorithm is generalised for use with the l1 norm, or Manhattan, distance. We argue that the generalisation of Ward's linkage method to incorporate Manhattan distances is theoretically sound and provide an example of where this method outperforms the method using Euclidean distances. As an application, we perform statistical analyses on languages using methods normally applied to biology and genetic classification. We aim to quantify differences in character traits between languages and use a statistical language signature based on relative bi-gram (sequence of two letters) frequencies to calculate a distance matrix between 32 Indo-European languages. We then use Ward's method of hierarchical clustering to classify the languages, using the Euclidean distance and the Manhattan distance. Results obtained from using the different distance metrics are compared to show that the characteristic of Ward's algorithm of minimising intra-cluster variation and maximising inter-cluster variation is not violated when using the Manhattan metric.
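
    The pipeline described above can be sketched end to end with toy data: build relative bigram-frequency signatures, compute pairwise Manhattan (l1) distances, and cluster hierarchically. Note that scipy's built-in 'ward' linkage is derived for Euclidean geometry, so feeding it a Manhattan distance matrix, as below, only mimics the generalisation argued for in the paper; the corpora and language labels are invented.

        # Sketch: bigram-frequency signatures, Manhattan distances, hierarchical clustering.
        from collections import Counter
        from itertools import combinations

        import numpy as np
        from scipy.cluster.hierarchy import linkage

        texts = {                                  # tiny hypothetical corpora
            "lang_A": "the quick brown fox jumps over the lazy dog",
            "lang_B": "ein schneller brauner fuchs springt ueber den faulen hund",
            "lang_C": "le renard brun rapide saute par dessus le chien paresseux",
        }

        def bigram_signature(text, alphabet="abcdefghijklmnopqrstuvwxyz"):
            letters = [c for c in text.lower() if c in alphabet]
            counts = Counter(zip(letters, letters[1:]))
            total = sum(counts.values())
            idx = {(a, b): i for i, (a, b) in enumerate((a, b) for a in alphabet for b in alphabet)}
            vec = np.zeros(len(idx))
            for bg, c in counts.items():
                vec[idx[bg]] = c / total
            return vec

        names = list(texts)
        sigs = np.array([bigram_signature(texts[n]) for n in names])

        # Condensed Manhattan (l1) distance matrix between language signatures.
        dists = np.array([np.abs(sigs[i] - sigs[j]).sum()
                          for i, j in combinations(range(len(names)), 2)])

        print(linkage(dists, method="ward"))       # dists is already in condensed form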

  13. Tests of Alignment among Assessment, Standards, and Instruction Using Generalized Linear Model Regression

    ERIC Educational Resources Information Center

    Fulmer, Gavin W.; Polikoff, Morgan S.

    2014-01-01

    An essential component in school accountability efforts is for assessments to be well-aligned with the standards or curriculum they are intended to measure. However, relatively little prior research has explored methods to determine statistical significance of alignment or misalignment. This study explores analyses of alignment as a special case…

  14. Testing Mediation Using Multiple Regression and Structural Equation Modeling Analyses in Secondary Data

    ERIC Educational Resources Information Center

    Li, Spencer D.

    2011-01-01

    Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…

  15. Publication Bias in Meta-Analyses of the Efficacy of Psychotherapeutic Interventions for Depression

    ERIC Educational Resources Information Center

    Niemeyer, Helen; Musch, Jochen; Pietrowsky, Reinhard

    2013-01-01

    Objective: The aim of this study was to assess whether systematic reviews investigating psychotherapeutic interventions for depression are affected by publication bias. Only homogeneous data sets were included, as heterogeneous data sets can distort statistical tests of publication bias. Method: We applied Begg and Mazumdar's adjusted rank…

  16. The Development of a Decision Support System for Mobile Learning: A Case Study in Taiwan

    ERIC Educational Resources Information Center

    Chiu, Po-Sheng; Huang, Yueh-Min

    2016-01-01

    While mobile learning (m-learning) has considerable potential, most previous strategies for developing this new approach to education were analysed using the knowledge, experience and judgement of individuals, with the support of statistical software. Although these methods provide systematic steps for the implementation of m-learning…

  17. Attitudes towards Participation in Business Development Programmes: An Ethnic Comparison in Sweden

    ERIC Educational Resources Information Center

    Abbasian, Saeid; Yazdanfar, Darush

    2015-01-01

    Purpose: The aim of the study is to investigate whether there are any differences between the attitudes towards participation in development programmes of entrepreneurs who are immigrants and those who are native-born. Design/methodology/approach: Several statistical methods, including a binary logistic regression model, were used to analyse a…

  18. A Comparison of Self versus Tutor Assessment among Hungarian Undergraduate Business Students

    ERIC Educational Resources Information Center

    Kun, András István

    2016-01-01

    This study analyses the self-assessment behaviour and efficiency of 163 undergraduate business students from Hungary. Using various statistical methods, the results support the hypothesis that high-achieving students are more accurate in their pre- and post-examination self-assessments, and also less likely to overestimate their performance, and,…

  19. Knowledge about Hepatitis B and Predictors of Hepatitis B Vaccination among Vietnamese American College Students

    ERIC Educational Resources Information Center

    Hwang, Jessica P.; Huang, Chih-Hsun; Yi, Jenny K.

    2008-01-01

    Asian American college students are at high risk for hepatitis B virus (HBV). Participants and Methods: Vietnamese American students completed a questionnaire assessing HBV knowledge and attitudes. The authors performed statistical analyses to examine the relationship between HBV knowledge and participant characteristics. They also performed…

  20. Psycho-Motor Needs Assessment of Virginia School Children.

    ERIC Educational Resources Information Center

    Glen Haven Achievement Center, Fort Collins, CO.

    An effort to assess psycho-motor (P-M) needs among Virginia children in K-4 and in special primary classes for the educable mentally retarded is presented. Included are methods for selecting, combining, and developing evaluation measures, which are verified statistically by analyses of data collected from a stratified sample of approximately 4,500…

  1. Use of Spatial Epidemiology and Hot Spot Analysis to Target Women Eligible for Prenatal Women, Infants, and Children Services

    PubMed Central

    Krawczyk, Christopher; Gradziel, Pat; Geraghty, Estella M.

    2014-01-01

    Objectives. We used a geographic information system and cluster analyses to determine locations in need of enhanced Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) Program services. Methods. We linked documented births in the 2010 California Birth Statistical Master File with the 2010 data from the WIC Integrated Statewide Information System. Analyses focused on the density of pregnant women who were eligible for but not receiving WIC services in California’s 7049 census tracts. We used incremental spatial autocorrelation and hot spot analyses to identify clusters of WIC-eligible nonparticipants. Results. We detected clusters of census tracts with higher-than-expected densities, compared with the state mean density of WIC-eligible nonparticipants, in 21 of 58 (36.2%) California counties (P < .05). In subsequent county-level analyses, we located neighborhood-level clusters of higher-than-expected densities of eligible nonparticipants in Sacramento, San Francisco, Fresno, and Los Angeles Counties (P < .05). Conclusions. Hot spot analyses provided a rigorous and objective approach to determine the locations of statistically significant clusters of WIC-eligible nonparticipants. Results helped inform WIC program and funding decisions, including the opening of new WIC centers, and offered a novel approach for targeting public health services. PMID:24354821
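    As a rough illustration of the hot spot idea (not the authors' GIS workflow), the sketch below computes a Getis-Ord Gi*-type z-score for synthetic tract counts using a simple distance-band weights matrix; tracts with large positive z-scores are candidate hot spots. The coordinates, counts and 25-unit distance band are all placeholders.

```python
# Minimal sketch of a Getis-Ord Gi*-style hot spot statistic on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(50, 2))      # hypothetical tract centroids
x = rng.poisson(20, size=50).astype(float)      # hypothetical eligible-nonparticipant counts

# Binary distance-band weights; the focal tract weights itself (the "star" in Gi*).
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
W = (d <= 25.0).astype(float)

n = x.size
xbar, s = x.mean(), x.std()                     # global mean and SD (population form)
w_sum = W.sum(axis=1)
num = W @ x - xbar * w_sum
den = s * np.sqrt((n * (W ** 2).sum(axis=1) - w_sum ** 2) / (n - 1))
gi_star = num / den                             # approximately standard normal per tract

print(np.where(gi_star > 1.96)[0])              # naive p < .05 hot spots (no multiplicity correction)
```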

  2. Use of the Analysis of the Volatile Faecal Metabolome in Screening for Colorectal Cancer

    PubMed Central

    2015-01-01

    Diagnosis of colorectal cancer requires an invasive and expensive colonoscopy, which is usually carried out after a positive screening test. Unfortunately, existing screening tests lack specificity and sensitivity, hence many unnecessary colonoscopies are performed. Here we report on a potential new screening test for colorectal cancer based on the analysis of volatile organic compounds (VOCs) in the headspace of faecal samples. Faecal samples were obtained from subjects who had a positive faecal occult blood test (FOBT) result. Subjects subsequently had colonoscopies performed to classify them into low risk (non-cancer) and high risk (colorectal cancer) groups. Volatile organic compounds were analysed by selected ion flow tube mass spectrometry (SIFT-MS) and the data were then analysed using both univariate and multivariate statistical methods. Ions most likely from hydrogen sulphide, dimethyl sulphide and dimethyl disulphide are present at statistically significantly higher levels in samples from high-risk than from low-risk subjects. Results using multivariate methods show that the test gives a correct classification of 75%, with 78% specificity and 72% sensitivity, on FOBT-positive samples, offering a potentially effective alternative to the FOBT. PMID:26086914

  3. Introduction to Bayesian statistical approaches to compositional analyses of transgenic crops 1. Model validation and setting the stage.

    PubMed

    Harrison, Jay M; Breeze, Matthew L; Harrigan, George G

    2011-08-01

    Statistical comparisons of compositional data generated on genetically modified (GM) crops and their near-isogenic conventional (non-GM) counterparts typically rely on classical significance testing. This manuscript presents an introduction to Bayesian methods for compositional analysis along with recommendations for model validation. The approach is illustrated using protein and fat data from two herbicide tolerant GM soybeans (MON87708 and MON87708×MON89788) and a conventional comparator grown in the US in 2008 and 2009. Guidelines recommended by the US Food and Drug Administration (FDA) in conducting Bayesian analyses of clinical studies on medical devices were followed. This study is the first Bayesian approach to GM and non-GM compositional comparisons. The evaluation presented here supports a conclusion that a Bayesian approach to analyzing compositional data can provide meaningful and interpretable results. We further describe the importance of method validation and approaches to model checking if Bayesian approaches to compositional data analysis are to be considered viable by scientists involved in GM research and regulation. Copyright © 2011 Elsevier Inc. All rights reserved.

  4. Practice-based evidence study design for comparative effectiveness research.

    PubMed

    Horn, Susan D; Gassaway, Julie

    2007-10-01

    To describe a new, rigorous, comprehensive practice-based evidence for clinical practice improvement (PBE-CPI) study methodology, and compare its features, advantages, and disadvantages to those of randomized controlled trials and sophisticated statistical methods for comparative effectiveness research. PBE-CPI incorporates natural variation within data from routine clinical practice to determine what works, for whom, when, and at what cost. It uses the knowledge of front-line caregivers, who develop study questions and define variables as part of a transdisciplinary team. Its comprehensive measurement framework provides a basis for analyses of significant bivariate and multivariate associations between treatments and outcomes, controlling for patient differences, such as severity of illness. PBE-CPI studies can uncover better practices more quickly than randomized controlled trials or sophisticated statistical methods, while achieving many of the same advantages. We present examples of actionable findings from PBE-CPI studies in postacute care settings related to comparative effectiveness of medications, nutritional support approaches, incontinence products, physical therapy activities, and other services. Outcomes improved when practices associated with better outcomes in PBE-CPI analyses were adopted in practice.

  5. Sieve analysis in HIV-1 vaccine efficacy trials

    PubMed Central

    Edlefsen, Paul T.; Gilbert, Peter B.; Rolland, Morgane

    2013-01-01

    Purpose of review The genetic characterization of HIV-1 breakthrough infections in vaccine and placebo recipients offers new ways to assess vaccine efficacy trials. Statistical and sequence analysis methods provide opportunities to mine the mechanisms behind the effect of an HIV vaccine. Recent findings The release of results from two HIV-1 vaccine efficacy trials, Step/HVTN-502 and RV144, led to numerous studies in the last five years, including efforts to sequence HIV-1 breakthrough infections and compare viral characteristics between the vaccine and placebo groups. Novel genetic and statistical analysis methods uncovered features that distinguished founder viruses isolated from vaccinees from those isolated from placebo recipients, and identified HIV-1 genetic targets of vaccine-induced immune responses. Summary Studies of HIV-1 breakthrough infections in vaccine efficacy trials can provide an independent confirmation to correlates of risk studies, as they take advantage of vaccine/placebo comparisons while correlates of risk analyses are limited to vaccine recipients. Through the identification of viral determinants impacted by vaccine-mediated host immune responses, sieve analyses can shed light on potential mechanisms of vaccine protection. PMID:23719202

  6. Sieve analysis in HIV-1 vaccine efficacy trials.

    PubMed

    Edlefsen, Paul T; Gilbert, Peter B; Rolland, Morgane

    2013-09-01

    The genetic characterization of HIV-1 breakthrough infections in vaccine and placebo recipients offers new ways to assess vaccine efficacy trials. Statistical and sequence analysis methods provide opportunities to mine the mechanisms behind the effect of an HIV vaccine. The release of results from two HIV-1 vaccine efficacy trials, Step/HVTN-502 (HIV Vaccine Trials Network-502) and RV144, led to numerous studies in the last 5 years, including efforts to sequence HIV-1 breakthrough infections and compare viral characteristics between the vaccine and placebo groups. Novel genetic and statistical analysis methods uncovered features that distinguished founder viruses isolated from vaccinees from those isolated from placebo recipients, and identified HIV-1 genetic targets of vaccine-induced immune responses. Studies of HIV-1 breakthrough infections in vaccine efficacy trials can provide an independent confirmation to correlates of risk studies, as they take advantage of vaccine/placebo comparisons, whereas correlates of risk analyses are limited to vaccine recipients. Through the identification of viral determinants impacted by vaccine-mediated host immune responses, sieve analyses can shed light on potential mechanisms of vaccine protection.

  7. Use of MALDI-TOF Mass Spectrometry and a Custom Database to Characterize Bacteria Indigenous to a Unique Cave Environment (Kartchner Caverns, AZ, USA)

    PubMed Central

    Zhang, Lin; Vranckx, Katleen; Janssens, Koen; Sandrin, Todd R.

    2015-01-01

    MALDI-TOF mass spectrometry has been shown to be a rapid and reliable tool for identification of bacteria at the genus and species, and in some cases, strain levels. Commercially available and open source software tools have been developed to facilitate identification; however, no universal/standardized data analysis pipeline has been described in the literature. Here, we provide a comprehensive and detailed demonstration of bacterial identification procedures using a MALDI-TOF mass spectrometer. Mass spectra were collected from 15 diverse bacteria isolated from Kartchner Caverns, AZ, USA, and identified by 16S rDNA sequencing. Databases were constructed in BioNumerics 7.1. Follow-up analyses of mass spectra were performed, including cluster analyses, peak matching, and statistical analyses. Identification was performed using blind-coded samples randomly selected from these 15 bacteria. Two identification methods are presented: similarity coefficient-based and biomarker-based methods. Results show that both identification methods can identify the bacteria to the species level. PMID:25590854

  8. Use of MALDI-TOF mass spectrometry and a custom database to characterize bacteria indigenous to a unique cave environment (Kartchner Caverns, AZ, USA).

    PubMed

    Zhang, Lin; Vranckx, Katleen; Janssens, Koen; Sandrin, Todd R

    2015-01-02

    MALDI-TOF mass spectrometry has been shown to be a rapid and reliable tool for identification of bacteria at the genus and species, and in some cases, strain levels. Commercially available and open source software tools have been developed to facilitate identification; however, no universal/standardized data analysis pipeline has been described in the literature. Here, we provide a comprehensive and detailed demonstration of bacterial identification procedures using a MALDI-TOF mass spectrometer. Mass spectra were collected from 15 diverse bacteria isolated from Kartchner Caverns, AZ, USA, and identified by 16S rDNA sequencing. Databases were constructed in BioNumerics 7.1. Follow-up analyses of mass spectra were performed, including cluster analyses, peak matching, and statistical analyses. Identification was performed using blind-coded samples randomly selected from these 15 bacteria. Two identification methods are presented: similarity coefficient-based and biomarker-based methods. Results show that both identification methods can identify the bacteria to the species level.

  9. Analysis of longitudinal data from animals where some data are missing in SPSS

    PubMed Central

    Duricki, DA; Soleman, S; Moon, LDF

    2017-01-01

    Testing of therapies for disease or injury often involves analysis of longitudinal data from animals. Modern analytical methods have advantages over conventional methods (particularly where some data are missing) yet are not used widely by pre-clinical researchers. We provide here an easy to use protocol for analysing longitudinal data from animals and present a click-by-click guide for performing suitable analyses using the statistical package SPSS. We guide readers through analysis of a real-life data set obtained when testing a therapy for brain injury (stroke) in elderly rats. We show that repeated measures analysis of covariance failed to detect a treatment effect when a few data points were missing (due to animal drop-out) whereas analysis using an alternative method detected a beneficial effect of treatment; specifically, we demonstrate the superiority of linear models (with various covariance structures) analysed using Restricted Maximum Likelihood estimation (to include all available data). This protocol takes two hours to follow. PMID:27196723
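    The protocol itself is written for SPSS; as a language-neutral illustration of the same modelling strategy, the sketch below fits a linear mixed model by REML in Python so that animals with missing later time points still contribute their observed scores. The file name, column names and random-effects structure are assumptions for the example, not part of the protocol.

```python
# Minimal sketch: REML linear mixed model for longitudinal animal data with drop-out.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per animal per week.
df = pd.read_csv("stroke_rats_long.csv")         # columns: animal, week, group, baseline, score
df = df.dropna(subset=["score"])                 # only the missing rows are lost, not the animal

model = smf.mixedlm(
    "score ~ week * group + baseline",           # treatment-by-time fixed effects plus baseline covariate
    data=df,
    groups=df["animal"],
    re_formula="~week",                          # random intercept and slope per animal
)
result = model.fit(reml=True)                    # restricted maximum likelihood estimation
print(result.summary())
```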

  10. Arkansas StreamStats: a U.S. Geological Survey web map application for basin characteristics and streamflow statistics

    USGS Publications Warehouse

    Pugh, Aaron L.

    2014-01-01

    Users of streamflow information often require streamflow statistics and basin characteristics at various locations along a stream. The USGS periodically calculates and publishes streamflow statistics and basin characteristics for streamflow-gaging stations and partial-record stations, but these data commonly are scattered among many reports that may or may not be readily available to the public. The USGS also provides and periodically updates regional analyses of streamflow statistics that include regression equations and other prediction methods for estimating statistics for ungaged and unregulated streams across the State. Use of these regional predictions for a stream can be complex and often requires the user to determine a number of basin characteristics that may require interpretation. Basin characteristics may include drainage area, classifiers for physical properties, climatic characteristics, and other inputs. Obtaining these input values for gaged and ungaged locations traditionally has been time consuming, subjective, and can lead to inconsistent results.

  11. Statistical ecology comes of age.

    PubMed

    Gimenez, Olivier; Buckland, Stephen T; Morgan, Byron J T; Bez, Nicolas; Bertrand, Sophie; Choquet, Rémi; Dray, Stéphane; Etienne, Marie-Pierre; Fewster, Rachel; Gosselin, Frédéric; Mérigot, Bastien; Monestiez, Pascal; Morales, Juan M; Mortier, Frédéric; Munoz, François; Ovaskainen, Otso; Pavoine, Sandrine; Pradel, Roger; Schurr, Frank M; Thomas, Len; Thuiller, Wilfried; Trenkel, Verena; de Valpine, Perry; Rexstad, Eric

    2014-12-01

    The desire to predict the consequences of global environmental change has been the driver towards more realistic models embracing the variability and uncertainties inherent in ecology. Statistical ecology has gelled over the past decade as a discipline that moves away from describing patterns towards modelling the ecological processes that generate these patterns. Following the fourth International Statistical Ecology Conference (1-4 July 2014) in Montpellier, France, we analyse current trends in statistical ecology. Important advances in the analysis of individual movement, and in the modelling of population dynamics and species distributions, are made possible by the increasing use of hierarchical and hidden process models. Exciting research perspectives include the development of methods to interpret citizen science data and of efficient, flexible computational algorithms for model fitting. Statistical ecology has come of age: it now provides a general and mathematically rigorous framework linking ecological theory and empirical data.

  12. Statistical ecology comes of age

    PubMed Central

    Gimenez, Olivier; Buckland, Stephen T.; Morgan, Byron J. T.; Bez, Nicolas; Bertrand, Sophie; Choquet, Rémi; Dray, Stéphane; Etienne, Marie-Pierre; Fewster, Rachel; Gosselin, Frédéric; Mérigot, Bastien; Monestiez, Pascal; Morales, Juan M.; Mortier, Frédéric; Munoz, François; Ovaskainen, Otso; Pavoine, Sandrine; Pradel, Roger; Schurr, Frank M.; Thomas, Len; Thuiller, Wilfried; Trenkel, Verena; de Valpine, Perry; Rexstad, Eric

    2014-01-01

    The desire to predict the consequences of global environmental change has been the driver towards more realistic models embracing the variability and uncertainties inherent in ecology. Statistical ecology has gelled over the past decade as a discipline that moves away from describing patterns towards modelling the ecological processes that generate these patterns. Following the fourth International Statistical Ecology Conference (1–4 July 2014) in Montpellier, France, we analyse current trends in statistical ecology. Important advances in the analysis of individual movement, and in the modelling of population dynamics and species distributions, are made possible by the increasing use of hierarchical and hidden process models. Exciting research perspectives include the development of methods to interpret citizen science data and of efficient, flexible computational algorithms for model fitting. Statistical ecology has come of age: it now provides a general and mathematically rigorous framework linking ecological theory and empirical data. PMID:25540151

  13. Improving phylogenetic analyses by incorporating additional information from genetic sequence databases.

    PubMed

    Liang, Li-Jung; Weiss, Robert E; Redelings, Benjamin; Suchard, Marc A

    2009-10-01

    Statistical analyses of phylogenetic data culminate in uncertain estimates of underlying model parameters. Lack of additional data hinders the ability to reduce this uncertainty, as the original phylogenetic dataset is often complete, containing the entire gene or genome information available for the given set of taxa. Informative priors in a Bayesian analysis can reduce posterior uncertainty; however, publicly available phylogenetic software specifies vague priors for model parameters by default. We build objective and informative priors using hierarchical random effect models that combine additional datasets whose parameters are not of direct interest but are similar to the analysis of interest. We propose principled statistical methods that permit more precise parameter estimates in phylogenetic analyses by creating informative priors for parameters of interest. Using additional sequence datasets from our lab or public databases, we construct a fully Bayesian semiparametric hierarchical model to combine datasets. A dynamic iteratively reweighted Markov chain Monte Carlo algorithm conveniently recycles posterior samples from the individual analyses. We demonstrate the value of our approach by examining the insertion-deletion (indel) process in the enolase gene across the Tree of Life using the phylogenetic software BALI-PHY; we incorporate prior information about indels from 82 curated alignments downloaded from the BAliBASE database.

  14. Non-invasive brain stimulation to investigate language production in healthy speakers: A meta-analysis.

    PubMed

    Klaus, Jana; Schutter, Dennis J L G

    2018-06-01

    Non-invasive brain stimulation (NIBS) has become a common method to study the interrelations between the brain and language functioning. This meta-analysis examined the efficacy of transcranial magnetic stimulation (TMS) and transcranial direct current stimulation (tDCS) in the study of language production in healthy volunteers. Forty-five effect sizes from 30 studies which investigated the effects of NIBS on picture naming or verbal fluency in healthy participants were meta-analysed. Further sub-analyses investigated potential influences of stimulation type, control, target site, task, online vs. offline application, and current density of the target electrode. Random effects modelling showed a small, but reliable effect of NIBS on language production. Subsequent analyses indicated larger weighted mean effect sizes for TMS as compared to tDCS studies. No statistical differences for the other sub-analyses were observed. We conclude that NIBS is a useful method for neuroscientific studies on language production in healthy volunteers. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  15. Progressive statistics for studies in sports medicine and exercise science.

    PubMed

    Hopkins, William G; Marshall, Stephen W; Batterham, Alan M; Hanin, Juri

    2009-01-01

    Statistical guidelines and expert statements are now available to assist in the analysis and reporting of studies in some biomedical disciplines. We present here a more progressive resource for sample-based studies, meta-analyses, and case studies in sports medicine and exercise science. We offer forthright advice on the following controversial or novel issues: using precision of estimation for inferences about population effects in preference to null-hypothesis testing, which is inadequate for assessing clinical or practical importance; justifying sample size via acceptable precision or confidence for clinical decisions rather than via adequate power for statistical significance; showing SD rather than SEM, to better communicate the magnitude of differences in means and nonuniformity of error; avoiding purely nonparametric analyses, which cannot provide inferences about magnitude and are unnecessary; using regression statistics in validity studies, in preference to the impractical and biased limits of agreement; making greater use of qualitative methods to enrich sample-based quantitative projects; and seeking ethics approval for public access to the depersonalized raw data of a study, to address the need for more scrutiny of research and better meta-analyses. Advice on less contentious issues includes the following: using covariates in linear models to adjust for confounders, to account for individual differences, and to identify potential mechanisms of an effect; using log transformation to deal with nonuniformity of effects and error; identifying and deleting outliers; presenting descriptive, effect, and inferential statistics in appropriate formats; and contending with bias arising from problems with sampling, assignment, blinding, measurement error, and researchers' prejudices. This article should advance the field by stimulating debate, promoting innovative approaches, and serving as a useful checklist for authors, reviewers, and editors.

  16. The Australasian Resuscitation in Sepsis Evaluation (ARISE) trial statistical analysis plan.

    PubMed

    Delaney, Anthony P; Peake, Sandra L; Bellomo, Rinaldo; Cameron, Peter; Holdgate, Anna; Howe, Belinda; Higgins, Alisa; Presneill, Jeffrey; Webb, Steve

    2013-09-01

    The Australasian Resuscitation in Sepsis Evaluation (ARISE) study is an international, multicentre, randomised, controlled trial designed to evaluate the effectiveness of early goal-directed therapy compared with standard care for patients presenting to the emergency department with severe sepsis. In keeping with current practice, and considering aspects of trial design and reporting specific to non-pharmacological interventions, our plan outlines the principles and methods for analysing and reporting the trial results. The document is prepared before completion of recruitment into the ARISE study, without knowledge of the results of the interim analysis conducted by the data safety and monitoring committee and before completion of the two related international studies. Our statistical analysis plan was designed by the ARISE chief investigators, and reviewed and approved by the ARISE steering committee. We reviewed the data collected by the research team as specified in the study protocol and detailed in the study case report form. We describe information related to baseline characteristics, characteristics of delivery of the trial interventions, details of resuscitation, other related therapies and other relevant data with appropriate comparisons between groups. We define the primary, secondary and tertiary outcomes for the study, with description of the planned statistical analyses. We have developed a statistical analysis plan with a trial profile, mock-up tables and figures. We describe a plan for presenting baseline characteristics, microbiological and antibiotic therapy, details of the interventions, processes of care and concomitant therapies and adverse events. We describe the primary, secondary and tertiary outcomes with identification of subgroups to be analysed. We have developed a statistical analysis plan for the ARISE study, available in the public domain, before the completion of recruitment into the study. This will minimise analytical bias and conforms to current best practice in conducting clinical trials.

  17. Cormack Research Project: Glasgow University

    NASA Technical Reports Server (NTRS)

    Skinner, Susan; Ryan, James M.

    1998-01-01

    The aim of this project was to investigate and improve upon existing methods of analysing data from COMPTEL on the Gamma Ray Observatory for neutrons emitted during solar flares. In particular, a strategy has been developed for placing confidence intervals on neutron energy distributions arising from uncertainties in the response matrix. We have also been able to demonstrate the superior performance of one of a range of possible statistical regularization strategies. A method of generating likely models of neutron energy distributions has also been developed as a tool to this end. The project involved solving an inverse problem with noise added to the data in various ways. To achieve this, pre-existing C code was used to run Fortran subroutines that performed statistical regularization on the data.

  18. A Statistical Approach for the Concurrent Coupling of Molecular Dynamics and Finite Element Methods

    NASA Technical Reports Server (NTRS)

    Saether, E.; Yamakov, V.; Glaessgen, E.

    2007-01-01

    Molecular dynamics (MD) methods are opening new opportunities for simulating the fundamental processes of material behavior at the atomistic level. However, increasing the size of the MD domain quickly presents intractable computational demands. A robust approach to surmount this computational limitation has been to unite continuum modeling procedures such as the finite element method (FEM) with MD analyses thereby reducing the region of atomic scale refinement. The challenging problem is to seamlessly connect the two inherently different simulation techniques at their interface. In the present work, a new approach to MD-FEM coupling is developed based on a restatement of the typical boundary value problem used to define a coupled domain. The method uses statistical averaging of the atomistic MD domain to provide displacement interface boundary conditions to the surrounding continuum FEM region, which, in return, generates interface reaction forces applied as piecewise constant traction boundary conditions to the MD domain. The two systems are computationally disconnected and communicate only through a continuous update of their boundary conditions. With the use of statistical averages of the atomistic quantities to couple the two computational schemes, the developed approach is referred to as an embedded statistical coupling method (ESCM) as opposed to a direct coupling method where interface atoms and FEM nodes are individually related. The methodology is inherently applicable to three-dimensional domains, avoids discretization of the continuum model down to atomic scales, and permits arbitrary temperatures to be applied.

  19. Statistical relationships between journal use and research output at academic institutions in South Korea.

    PubMed

    Jung, Youngim; Kim, Jayhoon; So, Minho; Kim, Hwanmin

    In this study, we analysed the statistical association between e-journal use and research output at the institution level in South Korea by performing comparative and diachronic analyses, as well as analyses by field. The datasets were compiled from four different sources: national reports on research output indicators in science fields, two statistics databases on higher education institutions open to the public, and e-journal usage statistics generated by 47 major publishers. Due to the different data sources utilized, a considerable number of missing values appeared in our datasets and various mapping issues required corrections prior to the analysis. Two techniques for handling missing data were applied and the impact of each technique was discussed. In order to compile the institutional data by field, journals were first mapped, and then the statistics were summarized according to subject field. We observed that e-journal use exhibited stronger correlations with the number of publications and the times cited, in contrast to the number of undergraduates, graduates, faculty members and the amount of research funds, and this was the case regardless of the NA handling method or author type. The difference between the maximum correlation of external research funding with the two average indicators and the corresponding correlation for e-journal use was not significant. Statistically, the accountability of e-journal use for the average times cited per article and the average JIF was quite similar to that of external research funds. It was found that the number of e-journal articles used had a strong positive correlation (Pearson's correlation coefficients of r > 0.9, p < 0.05) with the number of articles published in SCI(E) journals and the times cited regardless of the author type, NA handling method or time period. We also observed that the top-five institutions in South Korea, with respect to the number of publications in SCI(E) journals, generally spanned a balanced range of academic activities, while producing significant research output and using published material. Finally, we confirmed that the association of e-journal use with the two quantitative research indicators is strongly positive, even for the analyses by field, with the exception of the Arts and Humanities.

  20. The space of ultrametric phylogenetic trees.

    PubMed

    Gavryushkin, Alex; Drummond, Alexei J

    2016-08-21

    The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space and formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered. In particular, the summary tree minimising the squared distance to the trees in the sample might differ between parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  1. The evolution of autodigestion in the mushroom family Psathyrellaceae (Agaricales) inferred from Maximum Likelihood and Bayesian methods.

    PubMed

    Nagy, László G; Urban, Alexander; Orstadius, Leif; Papp, Tamás; Larsson, Ellen; Vágvölgyi, Csaba

    2010-12-01

    Recently developed comparative phylogenetic methods offer a wide spectrum of applications in evolutionary biology, although it is generally accepted that their statistical properties are incompletely known. Here, we examine and compare the statistical power of the ML and Bayesian methods with regard to selection of best-fit models of fruiting-body evolution and hypothesis testing of ancestral states on a real-life data set of a physiological trait (autodigestion) in the family Psathyrellaceae. Our phylogenies are based on the first multigene data set generated for the family. Two different coding regimes (binary and multistate) and two data sets differing in taxon sampling density are examined. The Bayesian method outperformed Maximum Likelihood with regard to statistical power in all analyses. This is particularly evident if the signal in the data is weak, i.e. in cases when the ML approach does not provide support to choose among competing hypotheses. Results based on binary and multistate coding differed only modestly, although it was evident that multistate analyses were less conclusive in all cases. It seems that increased taxon sampling density has favourable effects on inference of ancestral states, while model parameters are influenced to a smaller extent. The model best fitting our data implies that the rate of losses of deliquescence equals zero, although model selection in ML does not provide proper support to reject three of the four candidate models. The results also support the hypothesis that non-deliquescence (lack of autodigestion) has been ancestral in Psathyrellaceae, and that deliquescent fruiting bodies represent the preferred state, having evolved independently several times during evolution. Copyright © 2010 Elsevier Inc. All rights reserved.

  2. MO-G-12A-01: Quantitative Imaging Metrology: What Should Be Assessed and How?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Giger, M; Petrick, N; Obuchowski, N

    The first two symposia in the Quantitative Imaging Track focused on 1) the introduction of quantitative imaging (QI) challenges and opportunities, and QI efforts of agencies and organizations such as the RSNA, NCI, FDA, and NIST, and 2) the techniques, applications, and challenges of QI, with specific examples from CT, PET/CT, and MR. This third symposium in the QI Track will focus on metrology and its importance in successfully advancing the QI field. While the specific focus will be on QI, many of the concepts presented are more broadly applicable to many areas of medical physics research and applications. As such, the topics discussed should be of interest to medical physicists involved in imaging as well as therapy. The first talk of the session will focus on the introduction to metrology and why it is critically important in QI. The second talk will focus on appropriate methods for technical performance assessment. The third talk will address statistically valid methods for algorithm comparison, a common problem not only in QI but also in other areas of medical physics. The final talk in the session will address strategies for publication of results that will allow statistically valid meta-analyses, which is critical for combining results of individual studies with typically small sample sizes in a manner that can best inform decisions and advance the field. Learning Objectives: Understand the importance of metrology in the QI efforts. Understand appropriate methods for technical performance assessment. Understand methods for comparing algorithms with or without reference data (i.e., “ground truth”). Understand the challenges and importance of reporting results in a manner that allows for statistically valid meta-analyses.

  3. Comparison of a non-stationary voxelation-corrected cluster-size test with TFCE for group-Level MRI inference.

    PubMed

    Li, Huanjie; Nickerson, Lisa D; Nichols, Thomas E; Gao, Jia-Hong

    2017-03-01

    Two powerful methods for statistical inference on MRI brain images have been proposed recently, a non-stationary voxelation-corrected cluster-size test (CST) based on random field theory and threshold-free cluster enhancement (TFCE) based on calculating the level of local support for a cluster, then using permutation testing for inference. Unlike other statistical approaches, these two methods do not rest on the assumptions of a uniform and high degree of spatial smoothness of the statistic image. Thus, they are strongly recommended for group-level fMRI analysis compared to other statistical methods. In this work, the non-stationary voxelation-corrected CST and TFCE methods for group-level analysis were evaluated for both stationary and non-stationary images under varying smoothness levels, degrees of freedom and signal-to-noise ratios. Our results suggest that both methods provide adequate control for the number of voxel-wise statistical tests being performed during inference on fMRI data and they are both superior to current CSTs implemented in popular MRI data analysis software packages. However, TFCE is more sensitive and stable for group-level analysis of VBM data. Thus, the voxelation-corrected CST approach may confer some advantages by being computationally less demanding for fMRI data analysis than TFCE with permutation testing and by also being applicable for single-subject fMRI analyses, while the TFCE approach is advantageous for VBM data. Hum Brain Mapp 38:1269-1280, 2017. © 2016 Wiley Periodicals, Inc.
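    For readers unfamiliar with TFCE, the sketch below shows the enhancement integral TFCE(v) = Σ_h e_v(h)^E · h^H · dh on a one-dimensional statistic map with the common defaults E = 0.5 and H = 2; real implementations work on 3-D images and obtain p-values by permutation. The toy map and step size are assumptions for illustration.

```python
# Minimal sketch of threshold-free cluster enhancement on a 1-D statistic map.
import numpy as np
from scipy.ndimage import label

def tfce_1d(stat_map, dh=0.1, E=0.5, H=2.0):
    enhanced = np.zeros_like(stat_map, dtype=float)
    for h in np.arange(dh, stat_map.max() + dh, dh):
        clusters, n_clusters = label(stat_map >= h)      # supra-threshold connected components
        for c in range(1, n_clusters + 1):
            mask = clusters == c
            extent = mask.sum()                          # cluster extent e(h) at this height
            enhanced[mask] += (extent ** E) * (h ** H) * dh
    return enhanced

rng = np.random.default_rng(1)
z_map = np.concatenate([rng.normal(0, 1, 80), rng.normal(3, 1, 20)])   # toy "activation" at the end
print(tfce_1d(z_map).round(1))
```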

  4. Multiple Phenotype Association Tests Using Summary Statistics in Genome-Wide Association Studies

    PubMed Central

    Liu, Zhonghua; Lin, Xihong

    2017-01-01

    Summary We study in this paper jointly testing the associations of a genetic variant with correlated multiple phenotypes using the summary statistics of individual phenotype analysis from Genome-Wide Association Studies (GWASs). We estimated the between-phenotype correlation matrix using the summary statistics of individual phenotype GWAS analyses, and developed genetic association tests for multiple phenotypes by accounting for between-phenotype correlation without the need to access individual-level data. Since genetic variants often affect multiple phenotypes differently across the genome and the between-phenotype correlation can be arbitrary, we proposed robust and powerful multiple phenotype testing procedures by jointly testing a common mean and a variance component in linear mixed models for summary statistics. We computed the p-values of the proposed tests analytically. This computational advantage makes our methods practically appealing in large-scale GWASs. We performed simulation studies to show that the proposed tests maintained correct type I error rates, and to compare their powers in various settings with the existing methods. We applied the proposed tests to a GWAS Global Lipids Genetics Consortium summary statistics data set and identified additional genetic variants that were missed by the original single-trait analysis. PMID:28653391

  5. Multiple phenotype association tests using summary statistics in genome-wide association studies.

    PubMed

    Liu, Zhonghua; Lin, Xihong

    2018-03-01

    We study in this article jointly testing the associations of a genetic variant with correlated multiple phenotypes using the summary statistics of individual phenotype analysis from Genome-Wide Association Studies (GWASs). We estimated the between-phenotype correlation matrix using the summary statistics of individual phenotype GWAS analyses, and developed genetic association tests for multiple phenotypes by accounting for between-phenotype correlation without the need to access individual-level data. Since genetic variants often affect multiple phenotypes differently across the genome and the between-phenotype correlation can be arbitrary, we proposed robust and powerful multiple phenotype testing procedures by jointly testing a common mean and a variance component in linear mixed models for summary statistics. We computed the p-values of the proposed tests analytically. This computational advantage makes our methods practically appealing in large-scale GWASs. We performed simulation studies to show that the proposed tests maintained correct type I error rates, and to compare their powers in various settings with the existing methods. We applied the proposed tests to a GWAS Global Lipids Genetics Consortium summary statistics data set and identified additional genetic variants that were missed by the original single-trait analysis. © 2017, The International Biometric Society.
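    The sketch below is not the authors' common-mean-plus-variance-component score test, but it illustrates the core ingredient: estimating the between-phenotype correlation from genome-wide summary statistics of roughly null variants and then combining one variant's z-scores across phenotypes with a correlation-adjusted chi-square statistic. The input file, the null-variant screen and the variant index are all hypothetical.

```python
# Minimal sketch of a correlation-adjusted joint test from GWAS summary statistics.
import numpy as np
from scipy import stats

Z = np.loadtxt("zscores_by_phenotype.txt")      # hypothetical matrix: variants x K phenotypes
null_rows = np.all(np.abs(Z) < 2, axis=1)       # crude screen for approximately null variants
R = np.corrcoef(Z[null_rows], rowvar=False)     # between-phenotype correlation estimate

z_variant = Z[12345]                            # z-scores of one variant across the K phenotypes
T = z_variant @ np.linalg.solve(R, z_variant)   # Wald-type omnibus statistic, ~ chi-square(K) under H0
p_value = stats.chi2.sf(T, df=Z.shape[1])
print(T, p_value)
```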

  6. An Investigation of the Variety and Complexity of Statistical Methods Used in Current Internal Medicine Literature.

    PubMed

    Narayanan, Roshni; Nugent, Rebecca; Nugent, Kenneth

    2015-10-01

    Accreditation Council for Graduate Medical Education guidelines require internal medicine residents to develop skills in the interpretation of medical literature and to understand the principles of research. A necessary component is the ability to understand the statistical methods used and their results, material that is not an in-depth focus of most medical school curricula and residency programs. Given the breadth and depth of the current medical literature and an increasing emphasis on complex, sophisticated statistical analyses, the statistical foundation and education necessary for residents are uncertain. We reviewed the statistical methods and terms used in 49 articles discussed at the journal club in the Department of Internal Medicine residency program at Texas Tech University between January 1, 2013 and June 30, 2013. We collected information on the study type and on the statistical methods used for summarizing and comparing samples, determining the relations between independent variables and dependent variables, and estimating models. We then identified the typical statistics education level at which each term or method is learned. A total of 14 articles came from the Journal of the American Medical Association Internal Medicine, 11 from the New England Journal of Medicine, 6 from the Annals of Internal Medicine, 5 from the Journal of the American Medical Association, and 13 from other journals. Twenty reported randomized controlled trials. Summary statistics included mean values (39 articles), category counts (38), and medians (28). Group comparisons were based on t tests (14 articles), χ2 tests (21), and nonparametric ranking tests (10). The relations between dependent and independent variables were analyzed with simple regression (6 articles), multivariate regression (11), and logistic regression (8). Nine studies reported odds ratios with 95% confidence intervals, and seven analyzed test performance using sensitivity and specificity calculations. These papers used 128 statistical terms and context-defined concepts, including some from data analysis (56), epidemiology-biostatistics (31), modeling (24), data collection (12), and meta-analysis (5). Ten different software programs were used in these articles. Based on usual undergraduate and graduate statistics curricula, 64.3% of the concepts and methods used in these papers required at least a master's degree-level statistics education. The interpretation of the current medical literature can require an extensive background in statistical methods at an education level exceeding the material and resources provided to most medical students and residents. Given the complexity and time pressure of medical education, these deficiencies will be hard to correct, but this project can serve as a basis for developing a curriculum in study design and statistical methods needed by physicians-in-training.

  7. First Monte Carlo analysis of fragmentation functions from single-inclusive e+e- annihilation

    DOE PAGES

    Sato, Nobuo; Ethier, J. J.; Melnitchouk, W.; ...

    2016-12-02

    Here, we perform the first iterative Monte Carlo (IMC) analysis of fragmentation functions constrained by all available data from single-inclusive e+e- annihilation into pions and kaons. The IMC method eliminates potential bias in traditional analyses based on single fits, introduced by fixing parameters not well constrained by the data, and provides a statistically rigorous determination of uncertainties. Our analysis reveals specific differences between the fragmentation functions obtained with the new IMC methodology and those obtained from previous analyses, especially for light quarks and for strange quark fragmentation to kaons.

  8. Multilevel modelling: Beyond the basic applications.

    PubMed

    Wright, Daniel B; London, Kamala

    2009-05-01

    Over the last 30 years statistical algorithms have been developed to analyse datasets that have a hierarchical/multilevel structure. Particularly within developmental and educational psychology these techniques have become common where the sample has an obvious hierarchical structure, like pupils nested within a classroom. We describe two areas beyond the basic applications of multilevel modelling that are important to psychology: modelling the covariance structure in longitudinal designs and using generalized linear multilevel modelling as an alternative to methods from signal detection theory (SDT). Detailed code for all analyses is described using packages for the freeware R.
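    The signal-detection connection mentioned above can be made concrete with a single-participant example: a probit regression of the yes/no response on item status recovers d' as the slope and the decision criterion (measured from the noise distribution) as minus the intercept; fitting the same model with participant-level random effects gives the multilevel alternative. The sketch below simulates one participant under assumed SDT parameters rather than using real data.

```python
# Minimal sketch: probit regression as an equal-variance SDT model for one participant.
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(9)
is_old = np.repeat([0, 1], 100)                 # 100 new items, 100 old items
d_prime, criterion = 1.2, 0.3                   # assumed true sensitivity and criterion
said_old = rng.binomial(1, stats.norm.cdf(d_prime * is_old - criterion))

fit = sm.Probit(said_old, sm.add_constant(is_old)).fit(disp=0)
print(fit.params)                               # [intercept, slope] ~ [-criterion, d_prime]
```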

  9. Modelling the Effects of Land-Use Changes on Climate: a Case Study on Yamula DAM

    NASA Astrophysics Data System (ADS)

    Köylü, Ü.; Geymen, A.

    2016-10-01

    Dams block the flow of rivers and create artificial water reservoirs that affect the climate and land-use characteristics of the river basin. In this research, the effect of the large water body created by Yamula Dam in the Kızılırmak Basin on surrounding land use and climate change is analysed. The Mann-Kendall non-parametric statistical test, the Theil-Sen slope method, Inverse Distance Weighting (IDW) and the Soil Conservation Service-Curve Number (SCS-CN) method are integrated for spatial and temporal analysis of the research area. Humidity, temperature, wind speed and precipitation observations collected at 16 weather stations near the Kızılırmak Basin are analysed, and these statistics are then combined with GIS data over the years. An application for GIS analysis was developed in the Python programming language and integrated with the ArcGIS software; the statistical analyses were calculated in the R Project for Statistical Computing and integrated with the developed application. According to the statistical analysis of the extracted time series of meteorological parameters, statistically significant spatiotemporal trends are observed for climate change and land-use characteristics. In this study, we demonstrate the effect of large dams on local climate in the semi-arid region around Yamula Dam.
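    As a small illustration of the trend tests named in the abstract (not the study's own code), the sketch below applies a Mann-Kendall-type test, via Kendall's tau against time, and the Theil-Sen slope estimator to a simulated annual temperature series; the series and its trend are placeholders.

```python
# Minimal sketch: Mann-Kendall-style trend test and Theil-Sen slope on a synthetic series.
import numpy as np
from scipy import stats

years = np.arange(1980, 2016)
temps = 11.0 + 0.03 * (years - 1980) + np.random.default_rng(3).normal(0, 0.4, years.size)

tau, p_value = stats.kendalltau(years, temps)               # monotonic-trend test
slope, intercept, lo, hi = stats.theilslopes(temps, years)  # robust slope with 95% confidence band

print(f"tau={tau:.2f}, p={p_value:.3f}, Theil-Sen slope={slope:.3f} degC/yr ({lo:.3f} to {hi:.3f})")
```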

  10. Study/experimental/research design: much more than statistics.

    PubMed

    Knight, Kenneth L

    2010-01-01

    The purpose of study, experimental, or research design in scientific manuscripts has changed significantly over the years. It has evolved from an explanation of the design of the experiment (ie, data gathering or acquisition) to an explanation of the statistical analysis. This practice makes "Methods" sections hard to read and understand. To clarify the difference between study design and statistical analysis, to show the advantages of a properly written study design on article comprehension, and to encourage authors to correctly describe study designs. The role of study design is explored from the introduction of the concept by Fisher through modern-day scientists and the AMA Manual of Style. At one time, when experiments were simpler, the study design and statistical design were identical or very similar. With the complex research that is common today, which often includes manipulating variables to create new variables and the multiple (and different) analyses of a single data set, data collection is very different than statistical design. Thus, both a study design and a statistical design are necessary. Scientific manuscripts will be much easier to read and comprehend. A proper experimental design serves as a road map to the study methods, helping readers to understand more clearly how the data were obtained and, therefore, assisting them in properly analyzing the results.

  11. Advanced spectrophotometric chemometric methods for resolving the binary mixture of doxylamine succinate and pyridoxine hydrochloride.

    PubMed

    Katsarov, Plamen; Gergov, Georgi; Alin, Aylin; Pilicheva, Bissera; Al-Degs, Yahya; Simeonov, Vasil; Kassarova, Margarita

    2018-03-01

    The prediction power of partial least squares (PLS) and multivariate curve resolution-alternating least squares (MCR-ALS) methods have been studied for simultaneous quantitative analysis of the binary drug combination - doxylamine succinate and pyridoxine hydrochloride. Analysis of first-order UV overlapped spectra was performed using different PLS models - classical PLS1 and PLS2 as well as partial robust M-regression (PRM). These linear models were compared to MCR-ALS with equality and correlation constraints (MCR-ALS-CC). All techniques operated within the full spectral region and extracted maximum information for the drugs analysed. The developed chemometric methods were validated on external sample sets and were applied to the analyses of pharmaceutical formulations. The obtained statistical parameters were satisfactory for calibration and validation sets. All developed methods can be successfully applied for simultaneous spectrophotometric determination of doxylamine and pyridoxine both in laboratory-prepared mixtures and commercial dosage forms.
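    To illustrate one of the chemometric models above (PLS2, i.e. a single PLS model predicting both analytes), the sketch below simulates heavily overlapped two-component UV spectra and fits scikit-learn's PLS regression; the band positions, concentration ranges and noise level are invented for the example and are not the paper's calibration data.

```python
# Minimal sketch: PLS2 calibration on simulated, overlapped binary-mixture spectra.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
wavelengths = np.linspace(220, 320, 101)

def band(center, width):
    """Gaussian absorption band used to simulate a pure-component spectrum."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

C = rng.uniform(5, 25, size=(60, 2))                        # concentrations of the two drugs (mg/L)
X = C[:, [0]] * band(262, 12) + C[:, [1]] * band(291, 10)   # overlapped mixture spectra
X += rng.normal(0, 0.01, X.shape)                           # instrumental noise

X_cal, X_val, C_cal, C_val = train_test_split(X, C, test_size=0.3, random_state=0)
pls = PLSRegression(n_components=4).fit(X_cal, C_cal)       # one model for both analytes
rmsep = np.sqrt(((pls.predict(X_val) - C_val) ** 2).mean(axis=0))
print("RMSEP per analyte:", rmsep)
```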

  12. graph-GPA: A graphical model for prioritizing GWAS results and investigating pleiotropic architecture.

    PubMed

    Chung, Dongjun; Kim, Hang J; Zhao, Hongyu

    2017-02-01

    Genome-wide association studies (GWAS) have identified tens of thousands of genetic variants associated with hundreds of phenotypes and diseases, which have provided clinical and medical benefits to patients with novel biomarkers and therapeutic targets. However, identification of risk variants associated with complex diseases remains challenging as they are often affected by many genetic variants with small or moderate effects. There has been accumulating evidence suggesting that different complex traits share common risk basis, namely pleiotropy. Recently, several statistical methods have been developed to improve statistical power to identify risk variants for complex traits through a joint analysis of multiple GWAS datasets by leveraging pleiotropy. While these methods were shown to improve statistical power for association mapping compared to separate analyses, they are still limited in the number of phenotypes that can be integrated. In order to address this challenge, in this paper, we propose a novel statistical framework, graph-GPA, to integrate a large number of GWAS datasets for multiple phenotypes using a hidden Markov random field approach. Application of graph-GPA to a joint analysis of GWAS datasets for 12 phenotypes shows that graph-GPA improves statistical power to identify risk variants compared to statistical methods based on smaller number of GWAS datasets. In addition, graph-GPA also promotes better understanding of genetic mechanisms shared among phenotypes, which can potentially be useful for the development of improved diagnosis and therapeutics. The R implementation of graph-GPA is currently available at https://dongjunchung.github.io/GGPA/.

  13. diffHic: a Bioconductor package to detect differential genomic interactions in Hi-C data.

    PubMed

    Lun, Aaron T L; Smyth, Gordon K

    2015-08-19

    Chromatin conformation capture with high-throughput sequencing (Hi-C) is a technique that measures the in vivo intensity of interactions between all pairs of loci in the genome. Most conventional analyses of Hi-C data focus on the detection of statistically significant interactions. However, an alternative strategy involves identifying significant changes in the interaction intensity (i.e., differential interactions) between two or more biological conditions. This is more statistically rigorous and may provide more biologically relevant results. Here, we present the diffHic software package for the detection of differential interactions from Hi-C data. diffHic provides methods for read pair alignment and processing, counting into bin pairs, filtering out low-abundance events and normalization of trended or CNV-driven biases. It uses the statistical framework of the edgeR package to model biological variability and to test for significant differences between conditions. Several options for the visualization of results are also included. The use of diffHic is demonstrated with real Hi-C data sets. Performance against existing methods is also evaluated with simulated data. On real data, diffHic is able to successfully detect interactions with significant differences in intensity between biological conditions. It also compares favourably to existing software tools on simulated data sets. These results suggest that diffHic is a viable approach for differential analyses of Hi-C data.

  14. Statistical analyses of the relative risk.

    PubMed Central

    Gart, J J

    1979-01-01

    Let P1 be the probability of a disease in one population and P2 be the probability of a disease in a second population. The ratio of these quantities, R = P1/P2, is termed the relative risk. We consider first the analyses of the relative risk from retrospective studies. The relation between the relative risk and the odds ratio (or cross-product ratio) is developed. The odds ratio can be considered a parameter of an exponential model possessing sufficient statistics. This permits the development of exact significance tests and confidence intervals in the conditional space. Unconditional tests and intervals are also considered briefly. The consequences of misclassification errors and ignoring matching or stratifying are also considered. The various methods are extended to combination of results over the strata. Examples of case-control studies testing the association between HL-A frequencies and cancer illustrate the techniques. The parallel analyses of prospective studies are given. If P1 and P2 are small with large sample sizes, the appropriate model is a Poisson distribution. This yields an exponential model with sufficient statistics. Exact conditional tests and confidence intervals can then be developed. Here we consider the case where two populations are compared adjusting for sex differences as well as for the strata (or covariate) differences such as age. The methods are applied to two examples: (1) testing in the two sexes the ratio of relative risks of skin cancer in people living in different latitudes, and (2) testing over time the ratio of the relative risks of cancer in two cities, one of which fluoridated its drinking water and one which did not. PMID:540589
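    A small worked example of the quantities discussed (with invented counts) is given below: the relative risk R = P1/P2, the cross-product odds ratio with a Wald confidence interval on the log scale, and Fisher's exact test as the exact conditional test.

```python
# Minimal sketch: relative risk, odds ratio and an exact conditional test for a 2x2 table.
import numpy as np
from scipy import stats

#            diseased, not diseased     (hypothetical counts)
a, b = 30, 970                          # population 1
c, d = 12, 988                          # population 2

p1, p2 = a / (a + b), c / (c + d)
rr = p1 / p2                            # relative risk R = P1 / P2
odds_ratio = (a * d) / (b * c)          # cross-product (odds) ratio

se_log_or = np.sqrt(1 / a + 1 / b + 1 / c + 1 / d)          # Wald SE of the log odds ratio
ci = np.exp(np.log(odds_ratio) + np.array([-1.96, 1.96]) * se_log_or)

_, p_exact = stats.fisher_exact([[a, b], [c, d]])           # exact conditional test
print(f"RR={rr:.2f}  OR={odds_ratio:.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f})  exact p={p_exact:.4f}")
```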

  15. The sumLINK statistic for genetic linkage analysis in the presence of heterogeneity.

    PubMed

    Christensen, G B; Knight, S; Camp, N J

    2009-11-01

    We present the "sumLINK" statistic--the sum of multipoint LOD scores for the subset of pedigrees with nominally significant linkage evidence at a given locus--as an alternative to common methods to identify susceptibility loci in the presence of heterogeneity. We also suggest the "sumLOD" statistic (the sum of positive multipoint LOD scores) as a companion to the sumLINK. sumLINK analysis identifies genetic regions of extreme consistency across pedigrees without regard to negative evidence from unlinked or uninformative pedigrees. Significance is determined by an innovative permutation procedure based on genome shuffling that randomizes linkage information across pedigrees. This procedure for generating the empirical null distribution may be useful for other linkage-based statistics as well. Using 500 genome-wide analyses of simulated null data, we show that the genome shuffling procedure results in the correct type 1 error rates for both the sumLINK and sumLOD. The power of the statistics was tested using 100 sets of simulated genome-wide data from the alternative hypothesis from GAW13. Finally, we illustrate the statistics in an analysis of 190 aggressive prostate cancer pedigrees from the International Consortium for Prostate Cancer Genetics, where we identified a new susceptibility locus. We propose that the sumLINK and sumLOD are ideal for collaborative projects and meta-analyses, as they do not require any sharing of identifiable data between contributing institutions. Further, loci identified with the sumLINK have good potential for gene localization via statistical recombinant mapping, as, by definition, several linked pedigrees contribute to each peak.
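
    A minimal sketch of the two statistics is given below, assuming a matrix of per-pedigree multipoint LOD scores at a set of loci; the nominal-significance threshold of LOD >= 0.588 and the permutation scheme (shuffling each pedigree's LOD profile across loci) are simplifying assumptions standing in for the genome-shuffling procedure described above, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sum_link(lods, threshold=0.588):
    """Sum of LOD scores over pedigrees with nominally significant linkage
    (LOD >= threshold) at each locus.  lods has shape (pedigrees, loci)."""
    return np.where(lods >= threshold, lods, 0.0).sum(axis=0)

def sum_lod(lods):
    """Sum of positive LOD scores over pedigrees at each locus."""
    return np.clip(lods, 0.0, None).sum(axis=0)

# Hypothetical per-pedigree multipoint LOD scores (50 pedigrees x 200 loci).
lods = rng.normal(0.0, 0.5, size=(50, 200))

obs_link = sum_link(lods)

# Rough permutation null: shuffle each pedigree's LOD profile across loci,
# breaking any shared signal while preserving each pedigree's distribution.
n_perm = 1000
null_max = np.empty(n_perm)
for i in range(n_perm):
    shuffled = np.array([rng.permutation(row) for row in lods])
    null_max[i] = sum_link(shuffled).max()

# Genome-wide empirical p-value for the best observed locus.
p_value = (1 + np.sum(null_max >= obs_link.max())) / (1 + n_perm)
print(f"max sumLINK = {obs_link.max():.2f}, empirical p = {p_value:.3f}")
```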

  16. Predicting clinical trial results based on announcements of interim analyses

    PubMed Central

    2014-01-01

    Background Announcements of interim analyses of a clinical trial convey information about the results beyond the trial’s Data Safety Monitoring Board (DSMB). The amount of information conveyed may be minimal, but the fact that none of the trial’s stopping boundaries has been crossed implies that the experimental therapy is neither extremely effective nor hopeless. Predicting success of the ongoing trial is of interest to the trial’s sponsor, the medical community, pharmaceutical companies, and investors. We determine the probability of trial success by quantifying only the publicly available information from interim analyses of an ongoing trial. We illustrate our method in the context of the National Surgical Adjuvant Breast and Bowel Project (NSABP) trial C-08. Methods We simulated trials based on the specifics of the NSABP C-08 protocol that were publicly available. We quantified the uncertainty around the treatment effect using prior weights for the various possibilities in light of other colon cancer studies and other studies of the investigational agent, bevacizumab. We considered alternative prior distributions. Results Subsequent to the trial’s third interim analysis, our predictive probabilities were: that the trial would eventually be successful, 48.0%; would stop for futility, 7.4%; and would continue to completion without statistical significance, 44.5%. The actual trial continued to completion without statistical significance. Conclusions Announcements of interim analyses provide information outside the DSMB’s sphere of confidentiality. This information is potentially helpful to clinical trial prognosticators. ‘Information leakage’ from standard interim analyses such as in NSABP C-08 is conventionally viewed as acceptable even though it may be quite revealing. Whether leakage from more aggressive types of adaptations is acceptable should be assessed at the design stage. PMID:24607270
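
    The reasoning can be mimicked with a toy group-sequential simulation: place prior weights on a few candidate drifts (expected final z-statistics), simulate interim and final z-statistics under a Brownian-motion approximation, keep only the simulated trials consistent with the public announcements (no boundary crossed at the interims observed so far), and tabulate the outcomes of the remainder. The information fractions, boundaries, and prior weights below are invented for illustration and are not those of NSABP C-08.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented design: information fractions of the planned analyses and z boundaries.
fracs   = np.array([0.2, 0.4, 0.6, 0.8, 1.0])       # four interim looks, then the final analysis
eff_bnd = np.array([4.5, 3.5, 3.0, 2.5, 1.96])      # efficacy (upper) boundaries
fut_bnd = np.array([-0.5, 0.0, 0.5, 1.0, -np.inf])  # futility (lower) boundaries; none at the final
observed = 3                                         # interim looks already announced as "no stop"

# Invented prior weights over the drift (the expected z-statistic at full information).
prior = {0.0: 0.3, 1.5: 0.4, 3.0: 0.3}

def simulate_path(drift):
    """Z-statistics at each analysis under an independent-increments (Brownian) model."""
    dt = np.diff(np.concatenate(([0.0], fracs)))
    score = np.cumsum(rng.normal(drift * dt, np.sqrt(dt)))
    return score / np.sqrt(fracs)

outcomes = {"success": 0, "futility": 0, "no significance": 0}
kept = 0
drifts = rng.choice(list(prior), size=50_000, p=list(prior.values()))
for drift in drifts:
    z = simulate_path(drift)
    # Condition on what the announcements reveal: no boundary crossed so far.
    if np.any(z[:observed] >= eff_bnd[:observed]) or np.any(z[:observed] <= fut_bnd[:observed]):
        continue
    kept += 1
    if z[3] >= eff_bnd[3]:
        outcomes["success"] += 1          # stops early for efficacy
    elif z[3] <= fut_bnd[3]:
        outcomes["futility"] += 1         # stops early for futility
    elif z[4] >= eff_bnd[4]:
        outcomes["success"] += 1          # significant at the final analysis
    else:
        outcomes["no significance"] += 1  # completes without statistical significance

for name, count in outcomes.items():
    print(f"P({name}) = {count / kept:.3f}")
```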

  17. Evaluation of a Partial Genome Screening of Two Asthma Susceptibility Regions Using Bayesian Network Based Bayesian Multilevel Analysis of Relevance

    PubMed Central

    Antal, Péter; Kiszel, Petra Sz.; Gézsi, András; Hadadi, Éva; Virág, Viktor; Hajós, Gergely; Millinghoffer, András; Nagy, Adrienne; Kiss, András; Semsei, Ágnes F.; Temesi, Gergely; Melegh, Béla; Kisfali, Péter; Széll, Márta; Bikov, András; Gálffy, Gabriella; Tamási, Lilla; Falus, András; Szalai, Csaba

    2012-01-01

    Genetic studies indicate a high number of potential factors related to asthma. Based on earlier linkage analyses, we selected the 11q13 and 14q22 asthma susceptibility regions, for which we designed a partial genome screening study using 145 SNPs in 1201 individuals (436 asthmatic children and 765 controls). The results were evaluated with traditional frequentist methods and with a new statistical method, called Bayesian network based Bayesian multilevel analysis of relevance (BN-BMLA). This method uses a Bayesian network representation to provide a detailed characterization of the relevance of factors, such as joint significance, the type of dependency, and multi-target aspects. We estimated posterior probabilities for these relations within the Bayesian statistical framework, in order to assess whether a variable is directly relevant or whether its association is only mediated. With frequentist methods one SNP (rs3751464 in the FRMD6 gene) provided evidence for an association with asthma (OR = 1.43(1.2–1.8); p = 3×10−4). The possible role of the FRMD6 gene in asthma was also confirmed in an animal model and in human asthmatics. In the BN-BMLA analysis altogether 5 SNPs in 4 genes were found relevant in connection with the asthma phenotype: PRPF19 on chromosome 11, and FRMD6, PTGER2 and PTGDR on chromosome 14. In a subsequent step, a partial dataset containing rhinitis and further clinical parameters was used, which allowed the analysis of the relevance of SNPs for asthma and multiple targets. These analyses suggested that SNPs in the AHNAK and MS4A2 genes were indirectly associated with asthma. This paper indicates that BN-BMLA explores the relevant factors more comprehensively than traditional statistical methods and extends the scope of strong-relevance-based methods to include partial relevance, global characterization of relevance and multi-target relevance. PMID:22432035

  18. Analysis of data collected from right and left limbs: Accounting for dependence and improving statistical efficiency in musculoskeletal research.

    PubMed

    Stewart, Sarah; Pearson, Janet; Rome, Keith; Dalbeth, Nicola; Vandal, Alain C

    2018-01-01

    Statistical techniques currently used in musculoskeletal research often inefficiently account for paired-limb measurements or the relationship between measurements taken from multiple regions within limbs. This study compared three commonly used analysis methods with a mixed-models approach that appropriately accounted for the association between limbs, regions, and trials and that utilised all information available from repeated trials. Four analysis methods were applied to an existing data set containing plantar pressure data, which was collected for seven masked regions on right and left feet, over three trials, across three participant groups. Methods 1-3 averaged data over trials and analysed right foot data (Method 1), data from a randomly selected foot (Method 2), and averaged right and left foot data (Method 3). Method 4 used all available data in a mixed-effects regression that accounted for repeated measures taken for each foot, foot region and trial. Confidence interval widths for the mean differences between groups for each foot region were used as a criterion for comparison of statistical efficiency. Mean differences in pressure between groups were similar across methods for each foot region, while the confidence interval widths were consistently smaller for Method 4. Method 4 also revealed significant between-group differences that were not detected by Methods 1-3. A mixed-effects linear model approach generates improved efficiency and power by producing more precise estimates compared with alternative approaches that discard information in the process of accounting for paired-limb measurements. This approach is recommended for generating more clinically sound and statistically efficient research outputs. Copyright © 2017 Elsevier B.V. All rights reserved.
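
    A minimal sketch of the Method 4 idea using statsmodels is shown below, with a small simulated dataset standing in for the plantar-pressure data; the variable names (participant, group, foot, region, trial) and the random-effects structure (random intercept per participant plus a variance component for foot within participant) are assumptions, not the authors' exact specification.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Simulate a toy dataset: 30 participants in 2 groups, 2 feet, 3 regions, 3 trials.
rows = []
for pid in range(30):
    group = "patient" if pid < 15 else "control"
    p_eff = rng.normal(0, 5)                      # participant-level random effect
    for foot in ("left", "right"):
        f_eff = rng.normal(0, 3)                  # foot-within-participant effect
        for region in ("heel", "midfoot", "forefoot"):
            for trial in range(3):
                pressure = (100 + (8 if group == "patient" else 0)
                            + p_eff + f_eff + rng.normal(0, 6))
                rows.append(dict(participant=pid, group=group, foot=foot,
                                 region=region, trial=trial, pressure=pressure))
df = pd.DataFrame(rows)

# Mixed-effects model: fixed effects for group and region, a random intercept for
# each participant, and a variance component for foot nested within participant.
# Repeated trials enter as residual replicates rather than being averaged away.
model = smf.mixedlm("pressure ~ group + region", df,
                    groups=df["participant"],
                    re_formula="1",
                    vc_formula={"foot": "0 + C(foot)"})
result = model.fit()
print(result.summary())
```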

  19. Genome-wide association analysis of secondary imaging phenotypes from the Alzheimer's disease neuroimaging initiative study.

    PubMed

    Zhu, Wensheng; Yuan, Ying; Zhang, Jingwen; Zhou, Fan; Knickmeyer, Rebecca C; Zhu, Hongtu

    2017-02-01

    The aim of this paper is to systematically evaluate a biased sampling issue associated with genome-wide association analysis (GWAS) of imaging phenotypes for most imaging genetic studies, including the Alzheimer's Disease Neuroimaging Initiative (ADNI). Specifically, the original sampling scheme of these imaging genetic studies is primarily the retrospective case-control design, whereas most existing statistical analyses of these studies ignore this sampling scheme by directly correlating imaging phenotypes (the secondary traits) with genotype. Although it has been well documented in genetic epidemiology that ignoring the case-control sampling scheme can produce highly biased estimates, and subsequently lead to misleading results and suspicious associations, such findings are not well documented in imaging genetics. We use extensive simulations and a large-scale imaging genetic data analysis of the Alzheimer's Disease Neuroimaging Initiative (ADNI) data to evaluate the effects of the case-control sampling scheme on GWAS results based on standard statistical methods, such as linear regression, comparing them with several advanced statistical methods that appropriately adjust for the case-control sampling scheme. Copyright © 2016 Elsevier Inc. All rights reserved.

  20. Defining window-boundaries for genomic analyses using smoothing spline techniques

    DOE PAGES

    Beissinger, Timothy M.; Rosa, Guilherme J.M.; Kaeppler, Shawn M.; ...

    2015-04-17

    High-density genomic data is often analyzed by combining information over windows of adjacent markers. Interpretation of data grouped in windows versus at individual locations may increase statistical power, simplify computation, reduce sampling noise, and reduce the total number of tests performed. However, use of adjacent marker information can result in over- or under-smoothing, undesirable window boundary specifications, or highly correlated test statistics. We introduce a method for defining windows based on statistically guided breakpoints in the data, as a foundation for the analysis of multiple adjacent data points. This method involves first fitting a cubic smoothing spline to the data and then identifying the inflection points of the fitted spline, which serve as the boundaries of adjacent windows. This technique does not require prior knowledge of linkage disequilibrium, and therefore can be applied to data collected from individual or pooled sequencing experiments. Moreover, in contrast to existing methods, an arbitrary choice of window size is not necessary, since these are determined empirically and allowed to vary along the genome.
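
    A minimal sketch of the procedure, assuming scipy is an acceptable stand-in for the authors' implementation: fit a cubic smoothing spline to per-marker statistics and take sign changes of the second derivative as window boundaries. The simulated signal and the smoothing factor are arbitrary choices for illustration.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(3)

# Toy genomic signal: per-marker test statistics along one chromosome.
positions = np.sort(rng.uniform(0, 1e6, size=500))
signal = np.sin(positions / 8e4) + rng.normal(0, 0.4, size=positions.size)

# Fit a cubic smoothing spline (k=3); the smoothing factor s is a tuning choice.
spline = UnivariateSpline(positions, signal, k=3, s=len(signal) * 0.2)

# Window boundaries = inflection points of the fitted spline, i.e. positions
# where its second derivative changes sign.
grid = np.linspace(positions.min(), positions.max(), 5000)
second = spline.derivative(n=2)(grid)
boundaries = grid[np.where(np.diff(np.sign(second)) != 0)[0]]

print(f"{len(boundaries)} window boundaries found")
print(np.round(boundaries[:5]))
```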

  1. The skeletal maturation status estimated by statistical shape analysis: axial images of Japanese cervical vertebra

    PubMed Central

    Shin, S M; Choi, Y-S; Yamaguchi, T; Maki, K; Cho, B-H; Park, S-B

    2015-01-01

    Objectives: To evaluate axial cervical vertebral (ACV) shape quantitatively and to build a prediction model for skeletal maturation level using statistical shape analysis for Japanese individuals. Methods: The sample included 24 female and 19 male patients with hand–wrist radiographs and CBCT images. Through generalized Procrustes analysis and principal components (PCs) analysis, the meaningful PCs were extracted from each ACV shape and analysed for the estimation regression model. Results: Each ACV shape had meaningful PCs, except for the second axial cervical vertebra. Based on these models, the smallest prediction intervals (PIs) were from the combination of the shape space PCs, age and gender. Overall, the PIs of the male group were smaller than those of the female group. There was no significant correlation between centroid size as a size factor and skeletal maturation level. Conclusions: Our findings suggest that the ACV maturation method, which was applied by statistical shape analysis, could confirm information about skeletal maturation in Japanese individuals as an available quantifier of skeletal maturation and could be as useful a quantitative method as the skeletal maturation index. PMID:25411713
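
    The alignment-plus-PCA pipeline described here can be sketched as follows, using simulated 2D landmark configurations in place of the vertebral outlines; the landmark count, the simple iterative Procrustes loop, and the use of raw PC scores as regressors are assumptions for illustration, not the authors' exact protocol.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(4)

def normalise(shape):
    """Centre a (landmarks x 2) configuration and scale it to unit centroid size."""
    shape = shape - shape.mean(axis=0)
    return shape / np.linalg.norm(shape)

def gpa(shapes, n_iter=10):
    """Simple generalized Procrustes analysis: iteratively rotate each
    configuration onto the current mean shape."""
    aligned = np.array([normalise(s) for s in shapes])
    for _ in range(n_iter):
        mean = normalise(aligned.mean(axis=0))
        for i, s in enumerate(aligned):
            rot, _ = orthogonal_procrustes(s, mean)
            aligned[i] = s @ rot
    return aligned

# Simulated vertebral outlines: 40 subjects x 12 landmarks in 2D.
angles = np.linspace(0, 2 * np.pi, 12, endpoint=False)
base = np.column_stack([np.cos(angles), np.sin(angles)])
shapes = [base + rng.normal(0, 0.05, size=base.shape) for _ in range(40)]

aligned = gpa(shapes)

# PCA on the flattened, aligned coordinates (the "shape space").
flat = aligned.reshape(len(aligned), -1)
flat = flat - flat.mean(axis=0)
_, svals, vt = np.linalg.svd(flat, full_matrices=False)
explained = svals**2 / np.sum(svals**2)
scores = flat @ vt.T            # PC scores per subject, usable as regressors

print("variance explained by first 3 PCs:", np.round(explained[:3], 3))
```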

  2. A modified method of 3D-SSP analysis for amyloid PET imaging using [¹¹C]BF-227.

    PubMed

    Kaneta, Tomohiro; Okamura, Nobuyuki; Minoshima, Satoshi; Furukawa, Katsutoshi; Tashiro, Manabu; Furumoto, Shozo; Iwata, Ren; Fukuda, Hiroshi; Takahashi, Shoki; Yanai, Kazuhiko; Kudo, Yukitsuka; Arai, Hiroyuki

    2011-12-01

    Three-dimensional stereotactic surface projection (3D-SSP) analyses have been widely used in dementia imaging studies. However, 3D-SSP sometimes shows paradoxical results on amyloid positron emission tomography (PET) analyses. This is thought to be caused by errors in anatomical standardization (AS) based on an (18)F-fluorodeoxyglucose (FDG) template. We developed a new method of 3D-SSP analysis for amyloid PET imaging, and used it to analyze (11)C-labeled 2-(2-[2-dimethylaminothiazol-5-yl]ethenyl)-6-(2-[fluoro]ethoxy)benzoxazole (BF-227) PET images of subjects with mild cognitive impairment (MCI) and Alzheimer's disease (AD). The subjects were 20 patients with MCI, 19 patients with AD, and 17 healthy controls. Twelve subjects with MCI were followed up for 3 years or more, and conversion to AD was seen in 6 cases. All subjects underwent PET with both FDG and BF-227. For AS and 3D-SSP analyses of PET data, Neurostat (University of Washington, WA, USA) was used. Method 1 involves AS of BF-227 images using an FDG template. In this study, we developed a new method (Method 2) for AS: first, an FDG image was subjected to AS using an FDG template; then, the BF-227 image of the same patient was registered to the FDG image, and AS was performed using the transformation parameters calculated for AS of the corresponding FDG image. Regional values were normalized by the average value obtained at the cerebellum, and values were calculated for the frontal, parietal, temporal, and occipital lobes. For statistical comparison of the 3 groups, we applied one-way analysis of variance followed by the Bonferroni post hoc test. For statistical comparison between converters and non-converters, the t test was applied. Statistical significance was defined as p < 0.05. Among the 56 cases we studied, Method 1 demonstrated slight distortions after AS in 16 cases and heavy distortions in 4 cases; these distortions were not observed with Method 2. Both methods demonstrated that the values in AD and MCI patients were significantly higher than those in the controls in the parietal, temporal, and occipital lobes. However, only Method 2 showed significant differences in the frontal lobes. In addition, Method 2 demonstrated significantly higher values in MCI-to-AD converters in the parietal and frontal lobes. Method 2 corrects the AS errors that often occur with Method 1, and has made appropriate 3D-SSP analysis of amyloid PET imaging possible. This new method of 3D-SSP analysis for BF-227 PET could prove useful for detecting differences between normal groups and AD and MCI groups, and between converters and non-converters.

  3. Economic evaluation of factorial randomised controlled trials: challenges, methods and recommendations

    PubMed Central

    Gray, Alastair

    2017-01-01

    Increasing numbers of economic evaluations are conducted alongside randomised controlled trials. Such studies include factorial trials, which randomise patients to different levels of two or more factors and can therefore evaluate the effect of multiple treatments alone and in combination. Factorial trials can provide increased statistical power or assess interactions between treatments, but raise additional challenges for trial‐based economic evaluations: interactions may occur more commonly for costs and quality‐adjusted life‐years (QALYs) than for clinical endpoints; economic endpoints raise challenges for transformation and regression analysis; and both factors must be considered simultaneously to assess which treatment combination represents best value for money. This article aims to examine issues associated with factorial trials that include assessment of costs and/or cost‐effectiveness, describe the methods that can be used to analyse such studies and make recommendations for health economists, statisticians and trialists. A hypothetical worked example is used to illustrate the challenges and demonstrate ways in which economic evaluations of factorial trials may be conducted, and how these methods affect the results and conclusions. Ignoring interactions introduces bias that could result in adopting a treatment that does not make best use of healthcare resources, while considering all interactions avoids bias but reduces statistical power. We also introduce the concept of the opportunity cost of ignoring interactions as a measure of the bias introduced by not taking account of all interactions. We conclude by offering recommendations for planning, analysing and reporting economic evaluations based on factorial trials, taking increased analysis costs into account. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28470760

  4. Analysing and correcting the differences between multi-source and multi-scale spatial remote sensing observations.

    PubMed

    Dong, Yingying; Luo, Ruisen; Feng, Haikuan; Wang, Jihua; Zhao, Jinling; Zhu, Yining; Yang, Guijun

    2014-01-01

    Differences exist among the analysis results of agricultural monitoring and crop production based on remote sensing observations that are obtained at different spatial scales from multiple remote sensors in the same time period and processed by the same algorithms, models or methods. These differences can be quantitatively described mainly from three aspects, i.e. multiple remote sensing observations, crop parameter estimation models, and spatial scale effects of surface parameters. Our research proposed a new method to analyse and correct the differences between multi-source and multi-scale spatial remote sensing surface reflectance datasets, aiming to provide a reference for further studies in agricultural applications with multiple remotely sensed observations from different sources. The new method was constructed on the basis of the physical and mathematical properties of multi-source and multi-scale reflectance datasets. Statistical theory was used to extract the statistical characteristics of the multiple surface reflectance datasets and to quantitatively analyse the spatial variations of these characteristics at multiple spatial scales. Then, taking the surface reflectance at the small spatial scale as the baseline data, Gaussian distribution theory was used to correct the multiple surface reflectance datasets on the basis of the physical characteristics, mathematical distribution properties, and spatial variations obtained above. The proposed method was verified with two sets of multiple satellite images acquired over two experimental fields located in Inner Mongolia and Beijing, China, with different degrees of homogeneity of the underlying surfaces. Experimental results indicate that differences between surface reflectance datasets at multiple spatial scales can be effectively corrected over non-homogeneous underlying surfaces, which provides a database for further multi-source and multi-scale crop growth monitoring and yield prediction, and their corresponding consistency analysis and evaluation.

  5. Analysing and Correcting the Differences between Multi-Source and Multi-Scale Spatial Remote Sensing Observations

    PubMed Central

    Dong, Yingying; Luo, Ruisen; Feng, Haikuan; Wang, Jihua; Zhao, Jinling; Zhu, Yining; Yang, Guijun

    2014-01-01

    Differences exist among the analysis results of agricultural monitoring and crop production based on remote sensing observations that are obtained at different spatial scales from multiple remote sensors in the same time period and processed by the same algorithms, models or methods. These differences can be quantitatively described mainly from three aspects, i.e. multiple remote sensing observations, crop parameter estimation models, and spatial scale effects of surface parameters. Our research proposed a new method to analyse and correct the differences between multi-source and multi-scale spatial remote sensing surface reflectance datasets, aiming to provide a reference for further studies in agricultural applications with multiple remotely sensed observations from different sources. The new method was constructed on the basis of the physical and mathematical properties of multi-source and multi-scale reflectance datasets. Statistical theory was used to extract the statistical characteristics of the multiple surface reflectance datasets and to quantitatively analyse the spatial variations of these characteristics at multiple spatial scales. Then, taking the surface reflectance at the small spatial scale as the baseline data, Gaussian distribution theory was used to correct the multiple surface reflectance datasets on the basis of the physical characteristics, mathematical distribution properties, and spatial variations obtained above. The proposed method was verified with two sets of multiple satellite images acquired over two experimental fields located in Inner Mongolia and Beijing, China, with different degrees of homogeneity of the underlying surfaces. Experimental results indicate that differences between surface reflectance datasets at multiple spatial scales can be effectively corrected over non-homogeneous underlying surfaces, which provides a database for further multi-source and multi-scale crop growth monitoring and yield prediction, and their corresponding consistency analysis and evaluation. PMID:25405760
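
    The correction step is described only in general terms; one plausible reading is a Gaussian (mean-variance) matching of each coarser-scale reflectance dataset to the fine-scale baseline. The sketch below implements that reading with invented arrays and should be taken as an interpretation, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(5)

def gaussian_match(coarse, baseline):
    """Rescale a coarse-scale reflectance array so that its mean and standard
    deviation match those of the fine-scale baseline (z-score matching)."""
    z = (coarse - coarse.mean()) / coarse.std()
    return z * baseline.std() + baseline.mean()

# Toy reflectance fields: a fine-scale baseline and a biased coarse-scale dataset.
baseline = rng.normal(0.25, 0.04, size=(200, 200))    # e.g. fine-resolution reflectance
coarse   = rng.normal(0.31, 0.06, size=(50, 50))      # e.g. coarse-resolution reflectance

corrected = gaussian_match(coarse, baseline)
print(f"before: mean={coarse.mean():.3f}, sd={coarse.std():.3f}")
print(f"after : mean={corrected.mean():.3f}, sd={corrected.std():.3f}")
```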

  6. Methods for meta-analysis of multiple traits using GWAS summary statistics.

    PubMed

    Ray, Debashree; Boehnke, Michael

    2018-03-01

    Genome-wide association studies (GWAS) for complex diseases have focused primarily on single-trait analyses for disease status and disease-related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides (TGs) separately. However, traits are often correlated and a joint analysis may yield increased statistical power for association over multiple univariate analyses. Recently several multivariate methods have been proposed that require individual-level data. Here, we develop metaUSAT (where USAT is unified score-based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Although the existing methods either perform well when most correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated, metaUSAT is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual-level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P-value for association and is computationally efficient for implementation at a genome-wide level. Simulation experiments show that metaUSAT maintains proper type-I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful for some specific association scenarios only. When applied to plasma lipids summary data from the METSIM and the T2D-GENES studies, metaUSAT detected genome-wide significant loci beyond the ones identified by univariate analyses. Evidence from larger studies suggests that the variants additionally detected by our test are, indeed, associated with lipid levels in humans. In summary, metaUSAT can provide novel insights into the genetic architecture of a common disease or traits. © 2017 WILEY PERIODICALS, INC.

  7. The MDI Method as a Generalization of Logit, Probit and Hendry Analyses in Marketing.

    DTIC Science & Technology

    1980-04-01

    ...model involves nothing more than fitting a normal distribution function (Hanushek and Jackson (1977)). For a given value of x, the probit model ... preference shifts within the soft drink category. For applications of probit models relevant for marketing, see Hausman and Wise (1978) and Hanushek and ... Marketing Research" JMR XIV, Feb. (1977). Hanushek, E.A., and J.E. Jackson, Statistical Methods for Social Scientists. Academic Press, New York (1977).

  8. Difficulties in learning and teaching statistics: teacher views

    NASA Astrophysics Data System (ADS)

    Koparan, Timur

    2015-01-01

    The purpose of this study is to define teacher views about the difficulties in learning and teaching middle school statistics subjects. To serve this aim, a number of interviews were conducted with 10 middle school maths teachers in the 2011-2012 school year in the province of Trabzon. The semi-structured interview technique, one of the qualitative descriptive research methods, was applied in the research. In accordance with the aim, teacher opinions about the statistics subjects were examined and analysed. Similar responses from the teachers were grouped and evaluated. The teachers stated that it was positive that middle school statistics subjects were taught gradually in every grade, but that some difficulties were experienced in the teaching of this subject. The findings are presented in eight themes: context, sample, data representation, central tendency and dispersion measures, probability, variance, and other difficulties.

  9. Statistical Techniques to Analyze Pesticide Data Program Food Residue Observations.

    PubMed

    Szarka, Arpad Z; Hayworth, Carol G; Ramanarayanan, Tharacad S; Joseph, Robert S I

    2018-06-26

    The U.S. EPA conducts dietary-risk assessments to ensure that levels of pesticides on food in the U.S. food supply are safe. Often these assessments utilize conservative residue estimates, maximum residue levels (MRLs), and a high-end estimate derived from registrant-generated field-trial data sets. A more realistic estimate of consumers' pesticide exposure from food may be obtained by utilizing residues from food-monitoring programs, such as the Pesticide Data Program (PDP) of the U.S. Department of Agriculture. A substantial portion of food-residue concentrations in PDP monitoring programs are below the limits of detection (left-censored), which makes the comparison of regulatory-field-trial and PDP residue levels difficult. In this paper, we present a novel adaptation of established statistical techniques, the Kaplan-Meier estimator (K-M), robust regression on order statistics (ROS), and the maximum-likelihood estimator (MLE), to quantify pesticide-residue concentrations in the presence of heavily censored data sets. The examined statistical approaches include the most commonly used parametric and nonparametric methods for handling left-censored data that have been used in the fields of medical and environmental sciences. This work presents a case study in which data of thiamethoxam residue on bell pepper generated from registrant field trials were compared with PDP-monitoring residue values. The results from the statistical techniques were evaluated and compared with commonly used simple substitution methods for the determination of summary statistics. It was found that the MLE is the most appropriate statistical method to analyze this residue data set. Using the MLE technique, the data analyses showed that the median and mean PDP bell pepper residue levels were approximately 19 and 7 times lower, respectively, than the corresponding statistics of the field-trial residues.
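
    A minimal sketch of the MLE approach for left-censored residue data is shown below, assuming a lognormal residue distribution: detected values contribute log-density terms and non-detects contribute the log-probability of falling below the limit of detection. The residue values and the limit of detection are invented.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(6)

# Invented residue data (ppm): true lognormal, heavily censored at the LOD.
true = rng.lognormal(mean=-3.0, sigma=1.0, size=200)
lod = 0.05
detected = true[true >= lod]
n_censored = np.sum(true < lod)

def neg_log_lik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)                       # keep sigma positive
    # Detected observations contribute lognormal log-density terms;
    # non-detects contribute log P(X < LOD).
    ll_det = stats.norm.logpdf(np.log(detected), mu, sigma) - np.log(detected)
    ll_cens = n_censored * stats.norm.logcdf(np.log(lod), mu, sigma)
    return -(ll_det.sum() + ll_cens)

fit = optimize.minimize(neg_log_lik, x0=[np.log(np.median(detected)), 0.0],
                        method="Nelder-Mead")
mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])

# Summary statistics of the fitted lognormal residue distribution.
median_hat = np.exp(mu_hat)
mean_hat = np.exp(mu_hat + sigma_hat**2 / 2)
print(f"MLE median = {median_hat:.4f} ppm, MLE mean = {mean_hat:.4f} ppm")
```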

  10. Comparison of two surface temperature measurement using thermocouples and infrared camera

    NASA Astrophysics Data System (ADS)

    Michalski, Dariusz; Strąk, Kinga; Piasecka, Magdalena

    This paper compares two methods applied to measure surface temperatures in an experimental setup designed to analyse flow boiling heat transfer. The temperature measurements were performed in two parallel rectangular minichannels, both 1.7 mm deep, 16 mm wide and 180 mm long. The heating element for the fluid flowing in each minichannel was a thin foil made of Haynes-230. The two measurement methods employed to determine the surface temperature of the foil were the contact method, in which thermocouples were mounted at several points in one minichannel, and the contactless method, in which an infrared camera provided the results for the other minichannel. Calculations were necessary to compare the temperature results. Two sets of measurement data obtained for different values of the heat flux were analysed using basic statistical methods, taking the method error and the method accuracy into account. The comparative analysis showed that, although the values and distributions of the surface temperatures obtained with the two methods were similar, both methods had certain limitations.

  11. Measuring the statistical validity of summary meta-analysis and meta-regression results for use in clinical practice.

    PubMed

    Willis, Brian H; Riley, Richard D

    2017-09-20

    An important question for clinicians appraising a meta-analysis is: are the findings likely to be valid in their own practice? Does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity, where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how we may test meta-analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
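
    The exact form of Vn is not given in this record, but the leave-one-out idea can be illustrated schematically: refit a random-effects (DerSimonian-Laird) summary with each study held out and check how often the held-out estimate falls inside an approximate prediction interval from the remaining studies. This is a generic cross-validation sketch, not the authors' Vn statistic or its distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def dersimonian_laird(y, v):
    """Random-effects summary (estimate, its variance, tau^2) for effects y
    with within-study variances v."""
    w = 1.0 / v
    mu_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu_fixed) ** 2)
    tau2 = max(0.0, (q - (len(y) - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_star = 1.0 / (v + tau2)
    mu = np.sum(w_star * y) / np.sum(w_star)
    return mu, 1.0 / np.sum(w_star), tau2

# Simulated meta-analysis: 15 studies with heterogeneous true effects.
k = 15
v = rng.uniform(0.02, 0.1, size=k)                  # within-study variances
y = rng.normal(0.3, 0.15, size=k) + rng.normal(0, np.sqrt(v))

covered = 0
for i in range(k):
    keep = np.arange(k) != i
    mu, var_mu, tau2 = dersimonian_laird(y[keep], v[keep])
    # Approximate 95% prediction interval for the held-out study's estimate.
    half = stats.norm.ppf(0.975) * np.sqrt(var_mu + tau2 + v[i])
    covered += (mu - half) <= y[i] <= (mu + half)

print(f"leave-one-out coverage: {covered}/{k}")
```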

  12. Analysis and interpretation of cost data in randomised controlled trials: review of published studies

    PubMed Central

    Barber, Julie A; Thompson, Simon G

    1998-01-01

    Objective To review critically the statistical methods used for health economic evaluations in randomised controlled trials where an estimate of cost is available for each patient in the study. Design Survey of published randomised trials including an economic evaluation with cost values suitable for statistical analysis; 45 such trials published in 1995 were identified from Medline. Main outcome measures The use of statistical methods for cost data was assessed in terms of the descriptive statistics reported, use of statistical inference, and whether the reported conclusions were justified. Results Although all 45 trials reviewed apparently had cost data for each patient, only 9 (20%) reported adequate measures of variability for these data and only 25 (56%) gave results of statistical tests or a measure of precision for the comparison of costs between the randomised groups. Only 16 (36%) of the articles gave conclusions which were justified on the basis of results presented in the paper. No paper reported sample size calculations for costs. Conclusions The analysis and interpretation of cost data from published trials reveal a lack of statistical awareness. Strong and potentially misleading conclusions about the relative costs of alternative therapies have often been reported in the absence of supporting statistical evidence. Improvements in the analysis and reporting of health economic assessments are urgently required. Health economic guidelines need to be revised to incorporate more detailed statistical advice. Key messages: Health economic evaluations required for important healthcare policy decisions are often carried out in randomised controlled trials. A review of such published economic evaluations assessed whether statistical methods for cost outcomes have been appropriately used and interpreted. Few publications presented adequate descriptive information for costs or performed appropriate statistical analyses. In at least two thirds of the papers, the main conclusions regarding costs were not justified. The analysis and reporting of health economic assessments within randomised controlled trials urgently need improving. PMID:9794854

  13. What do results from coordinate-based meta-analyses tell us?

    PubMed

    Albajes-Eizagirre, Anton; Radua, Joaquim

    2018-08-01

    Coordinate-based meta-analyses (CBMA) methods, such as Activation Likelihood Estimation (ALE) and Seed-based d Mapping (SDM), have become an invaluable tool for summarizing the findings of voxel-based neuroimaging studies. However, the progressive sophistication of these methods may have concealed two particularities of their statistical tests. Common univariate voxelwise tests (such as the t/z-tests used in SPM and FSL) detect voxels that activate, or voxels that show differences between groups. Conversely, the tests conducted in CBMA test for "spatial convergence" of findings, i.e., they detect regions where studies report "more peaks than in most regions", regions that activate "more than most regions do", or regions that show "larger differences between groups than most regions do". The first particularity is that these tests rely on two spatial assumptions (voxels are independent and have the same probability to have a "false" peak), whose violation may make their results either conservative or liberal, though fortunately current versions of ALE, SDM and some other methods consider these assumptions. The second particularity is that the use of these tests involves an important paradox: the statistical power to detect a given effect is higher if there are no other effects in the brain, whereas lower in presence of multiple effects. Copyright © 2018 Elsevier Inc. All rights reserved.

  14. System Synthesis in Preliminary Aircraft Design using Statistical Methods

    NASA Technical Reports Server (NTRS)

    DeLaurentis, Daniel; Mavris, Dimitri N.; Schrage, Daniel P.

    1996-01-01

    This paper documents an approach to conceptual and preliminary aircraft design in which system synthesis is achieved using statistical methods, specifically design of experiments (DOE) and response surface methodology (RSM). These methods are employed in order to more efficiently search the design space for optimum configurations. In particular, a methodology incorporating three uses of these techniques is presented. First, response surface equations are formed which represent aerodynamic analyses, in the form of regression polynomials, that are more sophisticated than those generally available in early design stages. Next, a regression equation for an overall evaluation criterion is constructed for the purpose of constrained optimization at the system level. This optimization, though achieved in an innovative way, is still traditional in that it is a point design solution. The methodology put forward here remedies this by introducing uncertainty into the problem, resulting in solutions that are probabilistic in nature. DOE/RSM is used for the third time in this setting. The process is demonstrated through a detailed aero-propulsion optimization of a high speed civil transport. Fundamental goals of the methodology, then, are to introduce higher fidelity disciplinary analyses to the conceptual aircraft synthesis and to provide a roadmap for transitioning from point solutions to probabilistic designs (and eventually robust ones).

  15. Meta-analysis of magnitudes, differences and variation in evolutionary parameters.

    PubMed

    Morrissey, M B

    2016-10-01

    Meta-analysis is increasingly used to synthesize major patterns in the large literatures within ecology and evolution. Meta-analytic methods that do not account for the process of observing data, which we may refer to as 'informal meta-analyses', may have undesirable properties. In some cases, informal meta-analyses may produce results that are unbiased, but do not necessarily make the best possible use of available data. In other cases, unbiased statistical noise in individual reports in the literature can potentially be converted into severe systematic biases in informal meta-analyses. I first present a general description of how failure to account for noise in individual inferences should be expected to lead to biases in some kinds of meta-analysis. In particular, informal meta-analyses of quantities that reflect the dispersion of parameters in nature, for example, the mean absolute value of a quantity, are likely to be generally highly misleading. I then re-analyse three previously published informal meta-analyses, where key inferences were of aspects of the dispersion of values in nature, for example, the mean absolute value of selection gradients. Major biological conclusions in each original informal meta-analysis closely match those that could arise as artefacts due to statistical noise. I present alternative mixed-model-based analyses that are specifically tailored to each situation, but where all analyses may be implemented with widely available open-source software. In each example meta-re-analysis, major conclusions change substantially. © 2016 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.
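
    The central point, that unbiased sampling noise in individual estimates becomes a systematic upward bias once absolute values are averaged, is easy to demonstrate by simulation; the effect sizes and the sampling standard error below are invented.

```python
import numpy as np

rng = np.random.default_rng(8)

# True effects (e.g. selection gradients) across 200 hypothetical studies.
true_effects = rng.normal(0.0, 0.10, size=200)

# Each study reports a noisy estimate with its own sampling error.
sampling_se = 0.20
estimates = true_effects + rng.normal(0.0, sampling_se, size=true_effects.size)

# An "informal" meta-analysis of dispersion: the mean absolute value.
print(f"mean |true effect|      = {np.mean(np.abs(true_effects)):.3f}")
print(f"mean |estimated effect| = {np.mean(np.abs(estimates)):.3f}")
# The second number is systematically larger: unbiased noise in individual
# estimates becomes an upward bias once absolute values are averaged.
```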

  16. Subtyping of Children with Developmental Dyslexia via Bootstrap Aggregated Clustering and the Gap Statistic: Comparison with the Double-Deficit Hypothesis

    ERIC Educational Resources Information Center

    King, Wayne M.; Giess, Sally A.; Lombardino, Linda J.

    2007-01-01

    Background: The marked degree of heterogeneity in persons with developmental dyslexia has motivated the investigation of possible subtypes. Attempts have proceeded both from theoretical models of reading and the application of unsupervised learning (clustering) methods. Previous cluster analyses of data obtained from persons with reading…

  17. Assessment of Students' Scientific and Alternative Conceptions of Energy and Momentum Using Concentration Analysis

    ERIC Educational Resources Information Center

    Dega, Bekele Gashe; Govender, Nadaraj

    2016-01-01

    This study compares the scientific and alternative conceptions of energy and momentum of university first-year science students in Ethiopia and the US. Written data were collected using the Energy and Momentum Conceptual Survey developed by Singh and Rosengrant. The Concentration Analysis statistical method was used for analysing the Ethiopian…

  18. From genes to ecosystems: Measuring evolutionary diversity and community structure with Forest Inventory and Analysis (FIA) data

    Treesearch

    Kevin M. Potter

    2009-01-01

    Forest genetic sustainability is an important component of forest health because genetic diversity and evolutionary processes allow for the adaptation of species and for the maintenance of ecosystem functionality and resilience. Phylogenetic community analyses, a set of new statistical methods for describing the evolutionary relationships among species, offer an...

  19. Three Studies on the Leadership Behaviors of Academic Deans in Higher Education

    ERIC Educational Resources Information Center

    Brower, Rebecca

    2013-01-01

    This three article mixed methods dissertation is titled "Three Studies on the Leadership Behaviors of Academic Deans in Higher Education." Each article is based on a sample of 51 academic deans from a three state region in the Southeastern United States. In the first study, the results of the statistical analyses reinforce the gender…

  20. Differential Neonatal and Postneonatal Infant Mortality Rates across US Counties: The Role of Socioeconomic Conditions and Rurality

    ERIC Educational Resources Information Center

    Sparks, P. Johnelle; McLaughlin, Diane K.; Stokes, C. Shannon

    2009-01-01

    Purpose: To examine differences in correlates of neonatal and postneonatal infant mortality rates, across counties, by degree of rurality. Methods: Neonatal and postneonatal mortality rates were calculated from the 1998 to 2002 Compressed Mortality Files from the National Center for Health Statistics. Bivariate analyses assessed the relationship…

  1. Methods and Challenges of Analyzing Spatial Data for Social Work Problems: The Case of Examining Child Maltreatment Geographically

    ERIC Educational Resources Information Center

    Freisthler, Bridget; Lery, Bridgette; Gruenewald, Paul J.; Chow, Julian

    2006-01-01

    Increasingly, social work researchers are interested in examining how "place" and "location" contribute to social problems. Yet, often these researchers do not use the specialized spatial statistical techniques developed to handle the analytic issues faced when conducting ecological analyses. This article explains the importance of these…

  2. Deck Wetness and Extreme Motions Experiments: An Investigation into Establishing Reliable Statistics for Rare Events

    DTIC Science & Technology

    1990-02-01

    Amalgamated for all tank runs: (1) significant wave height and modal period of the achieved wave condition; (2) mean and ... motions ... experimental conditions. It is impossible to set standard run lengths for all experimental conditions and so a method should be developed to analyse the

  3. Integrative Analysis of Salmonellosis Outbreaks in Israel 1999-2012 Revealed an Invasive S. enterica Serovar 9,12:l,v:- and Endemic S. Typhimurium DT104 strain

    USDA-ARS?s Scientific Manuscript database

    Salmonella enterica is the leading etiologic agent of bacterial foodborne outbreaks worldwide. Methods. Laboratory-based statistical surveillance, molecular and genomics analyses were applied to characterize Salmonella outbreaks pattern in Israel. 65,087 Salmonella isolates reported to the National ...

  4. Parental Socio-Economic Status as Correlate of Child Labour in Ile-Ife, Nigeria

    ERIC Educational Resources Information Center

    Elegbeleye, O. S.; Olasupo, M. O.

    2012-01-01

    This study investigated the relationship between parental socio-economic status and child labour practices in Ile-Ife, Nigeria. The study employed survey method to gather data from 200 parents which constituted the study population. Pearson Product Moment Correlation and t-test statistics were used for the data analyses. The outcome of the study…

  5. Using Person Response Functions to Investigate Areas of Person Misfit Related to Item Characteristics

    ERIC Educational Resources Information Center

    Walker, A. Adrienne; Jennings, Jeremy Kyle; Engelhard, George, Jr.

    2018-01-01

    Individual person fit analyses provide important information regarding the validity of test score inferences for an "individual" test taker. In this study, we use data from an undergraduate statistics test (N = 1135) to illustrate a two-step method that researchers and practitioners can use to examine individual person fit. First, person…

  6. Good analytical practice: statistics and handling data in biomedical science. A primer and directions for authors. Part 1: Introduction. Data within and between one or two sets of individuals.

    PubMed

    Blann, A D; Nation, B R

    2008-01-01

    The biomedical scientist is bombarded on a daily basis by information, almost all of which refers to the health status of an individual or groups of individuals. This review is the first of a two-part article written to explain some of the issues related to the presentation and analysis of data. The first part focuses on types of data and how to present and analyse data from an individual or from one or two groups of persons. The second part will examine data from three or more sets of persons, what methods are available to allow this analysis (i.e., statistical software packages), and will conclude with a statement on appropriate descriptors of data, their analyses, and presentation for authors considering submission of their data to this journal.

  7. Personal use of hair dyes and the risk of bladder cancer: results of a meta-analysis.

    PubMed Central

    Huncharek, Michael; Kupelnick, Bruce

    2005-01-01

    OBJECTIVE: This study examined the methodology of observational studies that explored an association between personal use of hair dye products and the risk of bladder cancer. METHODS: Data were pooled from epidemiological studies using a general variance-based meta-analytic method that employed confidence intervals. The outcome of interest was a summary relative risk (RR) reflecting the risk of bladder cancer development associated with use of hair dye products vs. non-use. Sensitivity analyses were performed to explain any observed statistical heterogeneity and to explore the influence of specific study characteristics on the summary estimate of effect. RESULTS: Initially combining homogeneous data from six case-control studies and one cohort study yielded a non-significant RR of 1.01 (0.92, 1.11), suggesting no association between hair dye use and bladder cancer development. Sensitivity analyses examining the influence of hair dye type, color, and study design on this suspected association showed that uncontrolled confounding and design limitations contributed to a spurious non-significant summary RR. The sensitivity analyses yielded statistically significant RRs ranging from 1.22 (1.11, 1.51) to 1.50 (1.30, 1.98), indicating that personal use of hair dye products increases bladder cancer risk by 22% to 50% vs. non-use. CONCLUSION: The available epidemiological data suggest an association between personal use of hair dye products and increased risk of bladder cancer. PMID:15736329
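
    The "general variance-based meta-analytic method" is, in outline, inverse-variance pooling of log relative risks with study variances recovered from the reported confidence intervals. The sketch below illustrates that general approach with invented study results; it is not a reconstruction of the authors' dataset.

```python
import math

# Invented study results: (relative risk, lower 95% CI, upper 95% CI).
studies = [(1.10, 0.90, 1.34), (0.95, 0.78, 1.16), (1.30, 1.05, 1.61),
           (1.02, 0.85, 1.22), (1.18, 0.94, 1.48)]

weights, weighted_logs = [], []
for rr, lo, hi in studies:
    se = (math.log(hi) - math.log(lo)) / (2 * 1.96)   # SE of log RR from the CI
    w = 1.0 / se**2
    weights.append(w)
    weighted_logs.append(w * math.log(rr))

log_rr = sum(weighted_logs) / sum(weights)
se_pooled = math.sqrt(1.0 / sum(weights))
summary = math.exp(log_rr)
ci = (math.exp(log_rr - 1.96 * se_pooled), math.exp(log_rr + 1.96 * se_pooled))
print(f"summary RR = {summary:.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f})")
```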

  8. A Cyber-Attack Detection Model Based on Multivariate Analyses

    NASA Astrophysics Data System (ADS)

    Sakai, Yuto; Rinsaka, Koichiro; Dohi, Tadashi

    In the present paper, we propose a novel cyber-attack detection model based on applying two multivariate-analysis methods to the audit data observed on a host machine. The statistical techniques used here are the well-known Hayashi's quantification method IV and the cluster analysis method. We quantify the observed qualitative audit event sequences via quantification method IV, and group similar audit event sequences based on the cluster analysis. It is shown in simulation experiments that our model can improve the cyber-attack detection accuracy in some realistic cases where both normal and attack activities are intermingled.

  9. Assessing the effect of land use change on catchment runoff by combined use of statistical tests and hydrological modelling: Case studies from Zimbabwe

    NASA Astrophysics Data System (ADS)

    Lørup, Jens Kristian; Refsgaard, Jens Christian; Mazvimavi, Dominic

    1998-03-01

    The purpose of this study was to identify and assess long-term impacts of land use change on catchment runoff in semi-arid Zimbabwe, based on analyses of long hydrological time series (25-50 years) from six medium-sized (200-1000 km²) non-experimental rural catchments. A methodology combining common statistical methods with hydrological modelling was adopted in order to distinguish between the effects of climate variability and the effects of land use change. The hydrological model (NAM) was in general able to simulate the observed hydrographs very well during the reference period, thus providing a means to account for the effects of climate variability and hence strengthening the power of the subsequent statistical tests. In the test period the validated model was used to provide the runoff record which would have occurred in the absence of land use change. The analyses indicated a decrease in the annual runoff for most of the six catchments, with the largest changes occurring for catchments located within communal land, where large increases in population and agricultural intensity have taken place. However, the decrease was only statistically significant at the 5% level for one of the catchments.

  10. Statistical universals reveal the structures and functions of human music.

    PubMed

    Savage, Patrick E; Brown, Steven; Sakai, Emi; Currie, Thomas E

    2015-07-21

    Music has been called "the universal language of mankind." Although contemporary theories of music evolution often invoke various musical universals, the existence of such universals has been disputed for decades and has never been empirically demonstrated. Here we combine a music-classification scheme with statistical analyses, including phylogenetic comparative methods, to examine a well-sampled global set of 304 music recordings. Our analyses reveal no absolute universals but strong support for many statistical universals that are consistent across all nine geographic regions sampled. These universals include 18 musical features that are common individually as well as a network of 10 features that are commonly associated with one another. They span not only features related to pitch and rhythm that are often cited as putative universals but also rarely cited domains including performance style and social context. These cross-cultural structural regularities of human music may relate to roles in facilitating group coordination and cohesion, as exemplified by the universal tendency to sing, play percussion instruments, and dance to simple, repetitive music in groups. Our findings highlight the need for scientists studying music evolution to expand the range of musical cultures and musical features under consideration. The statistical universals we identified represent important candidates for future investigation.

  11. Statistical universals reveal the structures and functions of human music

    PubMed Central

    Savage, Patrick E.; Brown, Steven; Sakai, Emi; Currie, Thomas E.

    2015-01-01

    Music has been called “the universal language of mankind.” Although contemporary theories of music evolution often invoke various musical universals, the existence of such universals has been disputed for decades and has never been empirically demonstrated. Here we combine a music-classification scheme with statistical analyses, including phylogenetic comparative methods, to examine a well-sampled global set of 304 music recordings. Our analyses reveal no absolute universals but strong support for many statistical universals that are consistent across all nine geographic regions sampled. These universals include 18 musical features that are common individually as well as a network of 10 features that are commonly associated with one another. They span not only features related to pitch and rhythm that are often cited as putative universals but also rarely cited domains including performance style and social context. These cross-cultural structural regularities of human music may relate to roles in facilitating group coordination and cohesion, as exemplified by the universal tendency to sing, play percussion instruments, and dance to simple, repetitive music in groups. Our findings highlight the need for scientists studying music evolution to expand the range of musical cultures and musical features under consideration. The statistical universals we identified represent important candidates for future investigation. PMID:26124105

  12. Enhanced secondary analysis of survival data: reconstructing the data from published Kaplan-Meier survival curves.

    PubMed

    Guyot, Patricia; Ades, A E; Ouwens, Mario J N M; Welton, Nicky J

    2012-02-01

    The results of randomized controlled trials (RCTs) on time-to-event outcomes are usually reported as median times to event and Cox hazard ratios. These do not constitute the sufficient statistics required for meta-analysis or cost-effectiveness analysis, and their use in secondary analyses requires strong assumptions that may not have been adequately tested. In order to enhance the quality of secondary data analyses, we propose a method which derives from the published Kaplan-Meier survival curves a close approximation to the original individual patient time-to-event data from which they were generated. We develop an algorithm that maps from digitised curves back to KM data by finding numerical solutions to the inverted KM equations, using, where available, information on the number of events and the numbers at risk. The reproducibility and accuracy of survival probabilities, median survival times and hazard ratios based on reconstructed KM data were assessed by comparing published statistics (survival probabilities, medians and hazard ratios) with statistics based on repeated reconstructions by multiple observers. The validation exercise established that there was no material systematic error and that there was a high degree of reproducibility for all statistics. Accuracy was excellent for survival probabilities and medians; for hazard ratios, reasonable accuracy can only be obtained if at least the numbers at risk or the total number of events are reported. The algorithm is a reliable tool for meta-analysis and cost-effectiveness analyses of RCTs reporting time-to-event data. It is recommended that all RCTs report information on numbers at risk and total number of events alongside KM curves.
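
    The full algorithm reconciles digitised survival coordinates with the published numbers at risk interval by interval; the sketch below shows only the simplest special case, assuming no censoring within each risk-table interval, so that drops in survival translate directly into event counts and the remaining discrepancy with the next published number at risk is attributed to censoring. The digitised curve and risk table are invented.

```python
# Invented digitised Kaplan-Meier coordinates: survival at the start of each
# published risk-table interval, plus the survival at the end of follow-up.
interval_times = [0, 6, 12, 18, 24]          # months at which n-at-risk is reported
surv_at_times = [1.00, 0.82, 0.65, 0.55, 0.50]
n_at_risk = [200, 150, 104, 70, 45]          # published risk table

events, censored = [], []
for i in range(len(interval_times) - 1):
    s0, s1 = surv_at_times[i], surv_at_times[i + 1]
    n0, n1 = n_at_risk[i], n_at_risk[i + 1]
    # With no censoring inside the interval, S1 = S0 * (n0 - d) / n0.
    d = int(round(n0 * (1 - s1 / s0)))
    # Whatever is left of the risk set but absent from the next published
    # number at risk is attributed to censoring at the interval boundary.
    c = max(0, n0 - d - n1)
    events.append(d)
    censored.append(c)

print("interval start:", interval_times[:-1])
print("events        :", events)
print("censored      :", censored)
print("total events  :", sum(events))
```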

  13. Using assemblage data in ecological indicators: A comparison and evaluation of commonly available statistical tools

    USGS Publications Warehouse

    Smith, Joseph M.; Mather, Martha E.

    2012-01-01

    Ecological indicators are science-based tools used to assess how human activities have impacted environmental resources. For monitoring and environmental assessment, existing species assemblage data can be used to make these comparisons through time or across sites. An impediment to using assemblage data, however, is that these data are complex and need to be simplified in an ecologically meaningful way. Because multivariate statistics are mathematical relationships, statistical groupings may not make ecological sense and will not have utility as indicators. Our goal was to define a process to select defensible and ecologically interpretable statistical simplifications of assemblage data in which researchers and managers can have confidence. For this, we chose a suite of statistical methods, compared the groupings that resulted from these analyses, identified convergence among groupings, then we interpreted the groupings using species and ecological guilds. When we tested this approach using a statewide stream fish dataset, not all statistical methods worked equally well. For our dataset, logistic regression (Log), detrended correspondence analysis (DCA), cluster analysis (CL), and non-metric multidimensional scaling (NMDS) provided consistent, simplified output. Specifically, the Log, DCA, CL-1, and NMDS-1 groupings were ≥60% similar to each other, overlapped with the fluvial-specialist ecological guild, and contained a common subset of species. Groupings based on number of species (e.g., Log, DCA, CL and NMDS) outperformed groupings based on abundance [e.g., principal components analysis (PCA) and Poisson regression]. Although the specific methods that worked on our test dataset have generality, here we are advocating a process (e.g., identifying convergent groupings with redundant species composition that are ecologically interpretable) rather than the automatic use of any single statistical tool. We summarize this process in step-by-step guidance for the future use of these commonly available ecological and statistical methods in preparing assemblage data for use in ecological indicators.

  14. Early Warning Signs of Suicide in Service Members Who Engage in Unauthorized Acts of Violence

    DTIC Science & Technology

    2016-06-01

    observable to military law enforcement personnel. Statistical analyses tested for differences in warning signs between cases of suicide, violence, or...indicators, (2) Behavioral Change indicators, (3) Social indicators, and (4) Occupational indicators. Statistical analyses were conducted to test for...

  15. [Statistical analysis using freely-available "EZR (Easy R)" software].

    PubMed

    Kanda, Yoshinobu

    2015-10-01

    Clinicians must often perform statistical analyses for purposes such as evaluating preexisting evidence and designing or executing clinical studies. R is a free software environment for statistical computing. R supports many statistical analysis functions, but does not incorporate a statistical graphical user interface (GUI). The R commander provides an easy-to-use basic-statistics GUI for R. However, the statistical functions of the R commander are limited, especially in the field of biostatistics. Therefore, the author added several important statistical functions to the R commander and named it "EZR (Easy R)", which is now being distributed on the following website: http://www.jichi.ac.jp/saitama-sct/. EZR allows the application of statistical functions that are frequently used in clinical studies, such as survival analyses (including competing-risk analyses and the use of time-dependent covariates), by point-and-click access. In addition, by saving the script automatically created by EZR, users can learn R script writing, maintain the traceability of the analysis, and assure that the statistical process is overseen by a supervisor.

  16. Implications of the methodological choices for hydrologic portrayals of climate change over the contiguous United States: Statistically downscaled forcing data and hydrologic models

    USGS Publications Warehouse

    Mizukami, Naoki; Clark, Martyn P.; Gutmann, Ethan D.; Mendoza, Pablo A.; Newman, Andrew J.; Nijssen, Bart; Livneh, Ben; Hay, Lauren E.; Arnold, Jeffrey R.; Brekke, Levi D.

    2016-01-01

    Continental-domain assessments of climate change impacts on water resources typically rely on statistically downscaled climate model outputs to force hydrologic models at a finer spatial resolution. This study examines the effects of four statistical downscaling methods [bias-corrected constructed analog (BCCA), bias-corrected spatial disaggregation applied at daily (BCSDd) and monthly scales (BCSDm), and asynchronous regression (AR)] on retrospective hydrologic simulations using three hydrologic models with their default parameters (the Community Land Model, version 4.0; the Variable Infiltration Capacity model, version 4.1.2; and the Precipitation–Runoff Modeling System, version 3.0.4) over the contiguous United States (CONUS). Biases of hydrologic simulations forced by statistically downscaled climate data relative to the simulation with observation-based gridded data are presented. Each statistical downscaling method produces different meteorological portrayals including precipitation amount, wet-day frequency, and the energy input (i.e., shortwave radiation), and their interplay affects estimations of precipitation partitioning between evapotranspiration and runoff, extreme runoff, and hydrologic states (i.e., snow and soil moisture). The analyses show that BCCA underestimates annual precipitation by as much as −250 mm, leading to unreasonable hydrologic portrayals over the CONUS for all models. Although the other three statistical downscaling methods produce a comparable precipitation bias ranging from −10 to 8 mm across the CONUS, BCSDd severely overestimates the wet-day fraction by up to 0.25, leading to different precipitation partitioning compared to the simulations with other downscaled data. Overall, the choice of downscaling method contributes to less spread in runoff estimates (by a factor of 1.5–3) than the choice of hydrologic model with use of the default parameters if BCCA is excluded.
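
    A minimal sketch of two of the diagnostics discussed (annual precipitation bias and wet-day fraction), computed from daily precipitation series; the "observed" and "downscaled" series below are randomly generated placeholders rather than gridded observations or BCSD output, and the 0.1 mm wet-day threshold is an illustrative assumption.

    ```python
    import numpy as np

    rng = np.random.default_rng(12)
    n_days = 365

    # Placeholder daily precipitation (mm): "observed" vs a downscaled product that
    # drizzles too often (inflated wet-day fraction) at lower intensity
    observed = np.where(rng.random(n_days) < 0.30, rng.gamma(2.0, 5.0, n_days), 0.0)
    downscaled = np.where(rng.random(n_days) < 0.55, rng.gamma(2.0, 2.7, n_days), 0.0)

    wet_threshold = 0.1  # mm; days above this count as wet

    def summarise(p):
        return p.sum(), np.mean(p > wet_threshold)

    obs_total, obs_wet = summarise(observed)
    ds_total, ds_wet = summarise(downscaled)
    print(f"annual precipitation bias: {ds_total - obs_total:+.0f} mm")
    print(f"wet-day fraction bias:     {ds_wet - obs_wet:+.2f}")
    ```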

  17. Identification of the isomers using principal component analysis (PCA) method

    NASA Astrophysics Data System (ADS)

    Kepceoǧlu, Abdullah; Gündoǧdu, Yasemin; Ledingham, Kenneth William David; Kilic, Hamdi Sukur

    2016-03-01

    In this work, we have carried out a detailed statistical analysis of experimental mass spectra from xylene isomers. Principal Component Analysis (PCA) was used to identify isomers which cannot be distinguished using conventional statistical methods for interpreting their mass spectra. Experiments were carried out using a linear TOF-MS coupled to a femtosecond laser system as an energy source for the ionisation processes. The collected data have been analysed and interpreted using PCA as a multivariate analysis of these spectra. This demonstrates the strength of the method for distinguishing isomers which cannot be identified using conventional mass analysis of the dissociative ionisation processes on these molecules. The dependence of the PCA results on the laser pulse energy and the background pressure in the spectrometer is also presented in this work.
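
    A minimal sketch of the multivariate step, using scikit-learn's PCA on mass spectra arranged as a samples-by-m/z matrix; the spectra below are randomly generated stand-ins for the femtosecond TOF-MS data, with an artificial isomer-specific peak added so that the projection has something to separate.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)

    # Placeholder data: 30 spectra per xylene isomer, 500 m/z bins each,
    # with a small isomer-specific peak standing in for real fragment patterns.
    n_per_class, n_bins = 30, 500
    spectra, labels = [], []
    for k, isomer in enumerate(["ortho", "meta", "para"]):
        base = rng.random(n_bins) * 0.1
        base[50 + 10 * k] = 1.0                     # isomer-specific peak (illustrative)
        spectra.append(base + 0.05 * rng.standard_normal((n_per_class, n_bins)))
        labels += [isomer] * n_per_class
    X = np.vstack(spectra)

    # Mean-centre and project onto the first two principal components
    pca = PCA(n_components=2)
    scores = pca.fit_transform(X)
    print("explained variance ratio:", pca.explained_variance_ratio_)
    for isomer in ["ortho", "meta", "para"]:
        idx = [i for i, lab in enumerate(labels) if lab == isomer]
        print(isomer, "mean PC1 score:", scores[idx, 0].mean().round(2))
    ```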

  18. Which Propensity Score Method Best Reduces Confounder Imbalance? An Example From a Retrospective Evaluation of a Childhood Obesity Intervention.

    PubMed

    Schroeder, Krista; Jia, Haomiao; Smaldone, Arlene

    Propensity score (PS) methods are increasingly being employed by researchers to reduce bias arising from confounder imbalance when using observational data to examine intervention effects. The purpose of this study was to examine PS theory and methodology and compare application of three PS methods (matching, stratification, weighting) to determine which best improves confounder balance. Baseline characteristics of a sample of 20,518 school-aged children with severe obesity (of whom 1,054 received an obesity intervention) were assessed prior to PS application. Three PS methods were then applied to the data to determine which showed the greatest improvement in confounder balance between the intervention and control group. The effect of each PS method on the outcome variable (body mass index percentile change at one year) was also examined. SAS 9.4 and Comprehensive Meta-analysis statistical software were used for analyses. Prior to PS adjustment, the intervention and control groups differed significantly on seven of 11 potential confounders. PS matching removed all differences. PS stratification and weighting both removed one difference but created two new differences. Sensitivity analyses did not change these results. Body mass index percentile at 1 year decreased in both groups. The size of the decrease was smaller in the intervention group, and the estimate of the decrease varied by PS method. Selection of a PS method should be guided by insight from statistical theory and simulation experiments, in addition to observed improvement in confounder balance. For this data set, PS matching worked best to correct confounder imbalance. Because each method varied in correcting confounder imbalance, we recommend that multiple PS methods be compared for ability to improve confounder balance before implementation in evaluating treatment effects in observational data.
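
    A minimal sketch of the balance-checking logic, assuming a logistic-regression propensity score: standardized mean differences are compared before adjustment, after simple 1:1 nearest-neighbour matching, and under inverse-probability weighting. The confounders, sample, and matching rule are simplified placeholders and do not reproduce the study's SAS implementation.

    ```python
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n = 2000

    # Placeholder confounders and a treatment whose assignment depends on them
    x1 = rng.normal(size=n)                          # e.g. baseline BMI z-score
    x2 = rng.binomial(1, 0.4, size=n)                # e.g. sex
    treat = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * x1 + 0.5 * x2 - 2.0))))
    df = pd.DataFrame({"x1": x1, "x2": x2, "treat": treat})

    # Propensity score from logistic regression on the confounders
    df["ps"] = LogisticRegression().fit(df[["x1", "x2"]], treat).predict_proba(df[["x1", "x2"]])[:, 1]

    def smd(a, b):
        """Standardized mean difference (pooled-SD denominator)."""
        return (np.mean(a) - np.mean(b)) / np.sqrt((np.var(a) + np.var(b)) / 2)

    t, c = df[df.treat == 1], df[df.treat == 0]
    print("unadjusted SMD, x1:", round(smd(t.x1, c.x1), 3))

    # 1:1 nearest-neighbour matching on the propensity score (no caliper, with replacement)
    matched = c.loc[[(c.ps - p).abs().idxmin() for p in t.ps]]
    print("matched SMD, x1:   ", round(smd(t.x1, matched.x1), 3))

    # Inverse-probability weighting: compare weighted means of the confounder
    w_t, w_c = 1 / t.ps, 1 / (1 - c.ps)
    diff = np.average(t.x1, weights=w_t) - np.average(c.x1, weights=w_c)
    print("IPW weighted mean difference, x1:", round(diff, 3))
    ```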

  19. Multivariate statistical approach to estimate mixing proportions for unknown end members

    USGS Publications Warehouse

    Valder, Joshua F.; Long, Andrew J.; Davis, Arden D.; Kenner, Scott J.

    2012-01-01

    A multivariate statistical method is presented, which includes principal components analysis (PCA) and an end-member mixing model to estimate unknown end-member hydrochemical compositions and the relative mixing proportions of those end members in mixed waters. PCA, together with the Hotelling T2 statistic and a conceptual model of groundwater flow and mixing, was used in selecting samples that best approximate end members, which then were used as initial values in optimization of the end-member mixing model. This method was tested on controlled datasets (i.e., true values of estimates were known a priori) and found effective in estimating these end members and mixing proportions. The controlled datasets included synthetically generated hydrochemical data, synthetically generated mixing proportions, and laboratory analyses of sample mixtures, which were used in an evaluation of the effectiveness of this method for potential use in actual hydrological settings. For three different scenarios tested, correlation coefficients (R2) for linear regression between the estimated and known values ranged from 0.968 to 0.993 for mixing proportions and from 0.839 to 0.998 for end-member compositions. The method also was applied to field data from a study of end-member mixing in groundwater as a field example and partial method validation.
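
    A minimal sketch of the final mixing step only: given candidate end-member compositions, the mixing proportions of a sample are estimated by bounded least squares with an approximate sum-to-one constraint. The end-member concentrations and solutes are hypothetical, and the PCA/Hotelling T2 selection and optimization of end-member compositions described in the abstract are omitted.

    ```python
    import numpy as np
    from scipy.optimize import lsq_linear

    # Hypothetical end-member compositions (rows: solutes, columns: end members),
    # e.g. concentrations of Ca, Na, Cl, SO4 in mg/L
    E = np.array([[80.0,  5.0, 30.0],
                  [10.0, 90.0, 25.0],
                  [15.0, 70.0, 20.0],
                  [40.0, 12.0, 60.0]])

    # Observed mixed-water sample
    obs = np.array([45.0, 38.0, 33.0, 36.0])

    # Enforce sum-to-one approximately via a heavily weighted extra row,
    # and non-negativity through bounds (a simple stand-in for the mixing model)
    penalty = 1e3
    A = np.vstack([E, penalty * np.ones((1, E.shape[1]))])
    b = np.concatenate([obs, [penalty]])
    res = lsq_linear(A, b, bounds=(0, 1))
    print("estimated mixing proportions:", np.round(res.x, 3), "sum =", round(res.x.sum(), 3))
    ```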

  20. Correlating tephras and cryptotephras using glass compositional analyses and numerical and statistical methods: Review and evaluation

    NASA Astrophysics Data System (ADS)

    Lowe, David J.; Pearce, Nicholas J. G.; Jorgensen, Murray A.; Kuehn, Stephen C.; Tryon, Christian A.; Hayward, Chris L.

    2017-11-01

    We define tephras and cryptotephras and their components (mainly ash-sized particles of glass ± crystals in distal deposits) and summarize the basis of tephrochronology as a chronostratigraphic correlational and dating tool for palaeoenvironmental, geological, and archaeological research. We then document and appraise recent advances in analytical methods used to determine the major, minor, and trace elements of individual glass shards from tephra or cryptotephra deposits to aid their correlation and application. Protocols developed recently for the electron probe microanalysis of major elements in individual glass shards help to improve data quality and standardize reporting procedures. A narrow electron beam (diameter ∼3-5 μm) can now be used to analyze smaller glass shards than previously attainable. Reliable analyses of 'microshards' (defined here as glass shards <32 μm in diameter) using narrow beams are useful for fine-grained samples from distal or ultra-distal geographic locations, and for vesicular or microlite-rich glass shards or small melt inclusions. Caveats apply, however, in the microprobe analysis of very small microshards (≤∼5 μm in diameter), where particle geometry becomes important, and of microlite-rich glass shards where the potential problem of secondary fluorescence across phase boundaries needs to be recognised. Trace element analyses of individual glass shards using laser ablation inductively coupled plasma-mass spectrometry (LA-ICP-MS), with crater diameters of 20 μm and 10 μm, are now effectively routine, giving detection limits well below 1 ppm. Smaller ablation craters (<10 μm) can be subject to significant element fractionation during analysis, but the systematic relationship of such fractionation with glass composition suggests that analyses for some elements at these resolutions may be quantifiable. In undertaking analyses, either by microprobe or LA-ICP-MS, reference material data acquired using the same procedure, and preferably from the same analytical session, should be presented alongside new analytical data. In part 2 of the review, we describe, critically assess, and recommend ways in which tephras or cryptotephras can be correlated (in conjunction with other information) using numerical or statistical analyses of compositional data. Statistical methods provide a less subjective means of dealing with analytical data pertaining to tephra components (usually glass or crystals/phenocrysts) than heuristic alternatives. They enable a better understanding of relationships among the data from multiple viewpoints to be developed and help quantify the degree of uncertainty in establishing correlations. In common with other scientific hypothesis testing, it is easier to infer using such analysis that two or more tephras are different rather than the same. Adding stratigraphic, chronological, spatial, or palaeoenvironmental data (i.e. multiple criteria) is usually necessary and allows for more robust correlations to be made. A two-stage approach is useful, the first focussed on differences in the mean composition of samples, or their range, which can be visualised graphically via scatterplot matrices or bivariate plots coupled with the use of statistical tools such as distance measures, similarity coefficients, hierarchical cluster analysis (informed by distance measures or similarity or cophenetic coefficients), and principal components analysis (PCA). 
Some statistical methods (cluster analysis, discriminant analysis) are referred to as 'machine learning' in the computing literature. The second stage examines sample variance and the degree of compositional similarity so that sample equivalence or otherwise can be established on a statistical basis. This stage may involve discriminant function analysis (DFA), support vector machines (SVMs), canonical variates analysis (CVA), and ANOVA or MANOVA (or its two-sample special case, the Hotelling two-sample T2 test). Randomization tests can be used where distributional assumptions such as multivariate normality underlying parametric tests are doubtful. Compositional data may be transformed and scaled before being subjected to multivariate statistical procedures including calculation of distance matrices, hierarchical cluster analysis, and PCA. Such transformations may make the assumption of multivariate normality more appropriate. A sequential procedure using Mahalanobis distance and the Hotelling two-sample T2 test is illustrated using glass major element data from trachytic to phonolitic Kenyan tephras. All these methods require a broad range of high-quality compositional data which can be used to compare 'unknowns' with reference (training) sets that are sufficiently complete to account for all possible correlatives, including tephras with heterogeneous glasses that contain multiple compositional groups. Currently, incomplete databases are tending to limit correlation efficacy. The development of an open, online global database to facilitate progress towards integrated, high-quality tephrostratigraphic frameworks for different regions is encouraged.
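
    A minimal sketch of the second-stage test mentioned above (the Hotelling two-sample T2 test, built on a Mahalanobis-type distance with a pooled covariance), applied to randomly generated placeholder shard compositions rather than the Kenyan glass data; the F approximation used for the p-value is the standard one.

    ```python
    import numpy as np
    from scipy import stats

    def hotelling_t2(X, Y):
        """Two-sample Hotelling T^2 test for equal multivariate means."""
        n1, p = X.shape
        n2, _ = Y.shape
        d = X.mean(axis=0) - Y.mean(axis=0)
        S_pooled = ((n1 - 1) * np.cov(X, rowvar=False) +
                    (n2 - 1) * np.cov(Y, rowvar=False)) / (n1 + n2 - 2)
        # Mahalanobis-type distance between the two sample mean vectors
        t2 = (n1 * n2 / (n1 + n2)) * d @ np.linalg.solve(S_pooled, d)
        f = t2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))
        pval = stats.f.sf(f, p, n1 + n2 - p - 1)
        return t2, pval

    rng = np.random.default_rng(7)
    # Placeholder shard analyses: 25 shards x 4 oxides per sample
    tephra_a = rng.normal([62.0, 15.0, 5.0, 4.0], 0.5, size=(25, 4))
    tephra_b = rng.normal([62.3, 15.1, 5.1, 3.9], 0.5, size=(25, 4))
    t2, p = hotelling_t2(tephra_a, tephra_b)
    print(f"T2 = {t2:.2f}, p = {p:.3f}")
    ```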

  1. Incorporating an Interactive Statistics Workshop into an Introductory Biology Course-Based Undergraduate Research Experience (CURE) Enhances Students’ Statistical Reasoning and Quantitative Literacy Skills †

    PubMed Central

    Olimpo, Jeffrey T.; Pevey, Ryan S.; McCabe, Thomas M.

    2018-01-01

    Course-based undergraduate research experiences (CUREs) provide an avenue for student participation in authentic scientific opportunities. Within the context of such coursework, students are often expected to collect, analyze, and evaluate data obtained from their own investigations. Yet, limited research has been conducted that examines mechanisms for supporting students in these endeavors. In this article, we discuss the development and evaluation of an interactive statistics workshop that was expressly designed to provide students with an open platform for graduate teaching assistant (GTA)-mentored data processing, statistical testing, and synthesis of their own research findings. Mixed methods analyses of pre/post-intervention survey data indicated a statistically significant increase in students’ reasoning and quantitative literacy abilities in the domain, as well as enhancement of student self-reported confidence in and knowledge of the application of various statistical metrics to real-world contexts. Collectively, these data reify an important role for scaffolded instruction in statistics in preparing emergent scientists to be data-savvy researchers in a globally expansive STEM workforce. PMID:29904549

  2. Incorporating an Interactive Statistics Workshop into an Introductory Biology Course-Based Undergraduate Research Experience (CURE) Enhances Students' Statistical Reasoning and Quantitative Literacy Skills.

    PubMed

    Olimpo, Jeffrey T; Pevey, Ryan S; McCabe, Thomas M

    2018-01-01

    Course-based undergraduate research experiences (CUREs) provide an avenue for student participation in authentic scientific opportunities. Within the context of such coursework, students are often expected to collect, analyze, and evaluate data obtained from their own investigations. Yet, limited research has been conducted that examines mechanisms for supporting students in these endeavors. In this article, we discuss the development and evaluation of an interactive statistics workshop that was expressly designed to provide students with an open platform for graduate teaching assistant (GTA)-mentored data processing, statistical testing, and synthesis of their own research findings. Mixed methods analyses of pre/post-intervention survey data indicated a statistically significant increase in students' reasoning and quantitative literacy abilities in the domain, as well as enhancement of student self-reported confidence in and knowledge of the application of various statistical metrics to real-world contexts. Collectively, these data reify an important role for scaffolded instruction in statistics in preparing emergent scientists to be data-savvy researchers in a globally expansive STEM workforce.

  3. Mediation Analysis with Survival Outcomes: Accelerated Failure Time vs. Proportional Hazards Models

    PubMed Central

    Gelfand, Lois A.; MacKinnon, David P.; DeRubeis, Robert J.; Baraldi, Amanda N.

    2016-01-01

    Objective: Survival time is an important type of outcome variable in treatment research. Currently, limited guidance is available regarding performing mediation analyses with survival outcomes, which generally do not have normally distributed errors, and contain unobserved (censored) events. We present considerations for choosing an approach, using a comparison of semi-parametric proportional hazards (PH) and fully parametric accelerated failure time (AFT) approaches for illustration. Method: We compare PH and AFT models and procedures in their integration into mediation models and review their ability to produce coefficients that estimate causal effects. Using simulation studies modeling Weibull-distributed survival times, we compare statistical properties of mediation analyses incorporating PH and AFT approaches (employing SAS procedures PHREG and LIFEREG, respectively) under varied data conditions, some including censoring. A simulated data set illustrates the findings. Results: AFT models integrate more easily than PH models into mediation models. Furthermore, mediation analyses incorporating LIFEREG produce coefficients that can estimate causal effects, and demonstrate superior statistical properties. Censoring introduces bias in the coefficient estimate representing the treatment effect on outcome—underestimation in LIFEREG, and overestimation in PHREG. With LIFEREG, this bias can be addressed using an alternative estimate obtained from combining other coefficients, whereas this is not possible with PHREG. Conclusions: When Weibull assumptions are not violated, there are compelling advantages to using LIFEREG over PHREG for mediation analyses involving survival-time outcomes. Irrespective of the procedures used, the interpretation of coefficients, effects of censoring on coefficient estimates, and statistical properties should be taken into account when reporting results. PMID:27065906
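
    As a hedged illustration of the product-of-coefficients logic in an accelerated failure time setting, the sketch below regresses a mediator on treatment and then fits a Weibull AFT model of survival time on treatment and mediator. It uses the Python lifelines package (WeibullAFTFitter) rather than SAS LIFEREG, the data are simulated placeholders, and standard errors for the indirect effect are omitted.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from lifelines import WeibullAFTFitter

    rng = np.random.default_rng(3)
    n = 1000

    # Placeholder data: treatment -> mediator -> Weibull survival time, with censoring
    treat = rng.binomial(1, 0.5, n)
    mediator = 0.5 * treat + rng.normal(size=n)                 # path a
    lin = 0.3 * treat + 0.4 * mediator                          # path b plus a direct effect
    time = rng.weibull(1.5, n) * np.exp(lin)                    # AFT: time scaled by predictors
    cens = rng.weibull(1.5, n) * np.exp(lin.mean()) * 1.5
    event = (time <= cens).astype(int)
    obs_time = np.minimum(time, cens)
    df = pd.DataFrame({"treat": treat, "mediator": mediator, "time": obs_time, "event": event})

    # Path a: mediator regressed on treatment (OLS)
    a = sm.OLS(df["mediator"], sm.add_constant(df[["treat"]])).fit().params["treat"]

    # Path b: AFT model of survival time on treatment and mediator
    aft = WeibullAFTFitter().fit(df, duration_col="time", event_col="event")
    b = aft.params_.loc[("lambda_", "mediator")]

    print("indirect (mediated) effect a*b on the log-time scale:", round(a * b, 3))
    ```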

  4. Robust approaches to quantification of margin and uncertainty for sparse data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hund, Lauren; Schroeder, Benjamin B.; Rumsey, Kelin

    Characterizing the tails of probability distributions plays a key role in quantification of margins and uncertainties (QMU), where the goal is characterization of low probability, high consequence events based on continuous measures of performance. When data are collected using physical experimentation, probability distributions are typically fit using statistical methods based on the collected data, and these parametric distributional assumptions are often used to extrapolate about the extreme tail behavior of the underlying probability distribution. In this project, we characterize the risk associated with such tail extrapolation. Specifically, we conducted a scaling study to demonstrate the large magnitude of the risk; then, we developed new methods for communicating risk associated with tail extrapolation from unvalidated statistical models; lastly, we proposed a Bayesian data-integration framework to mitigate tail extrapolation risk through integrating additional information. We conclude that decision-making using QMU is a complex process that cannot be achieved using statistical analyses alone.
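
    A minimal sketch of the kind of tail-extrapolation risk described: a normal distribution is fitted to a modest sample from a heavier-tailed (Student-t) population and then extrapolated to a far-tail threshold. The distributions, sample size, and threshold are arbitrary illustrative choices, not the project's models.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)

    # "True" performance measure is heavier-tailed than the analyst assumes
    true_dist = stats.t(df=4, loc=0.0, scale=1.0)
    sample = true_dist.rvs(size=50, random_state=rng)

    # Analyst fits a normal distribution to the limited data ...
    mu, sigma = stats.norm.fit(sample)

    # ... and extrapolates far into the tail
    threshold = 5.0
    p_extrapolated = stats.norm.sf(threshold, loc=mu, scale=sigma)
    p_true = true_dist.sf(threshold)

    print(f"extrapolated tail probability: {p_extrapolated:.2e}")
    print(f"true tail probability:         {p_true:.2e}")
    print(f"underestimated by a factor of ~{p_true / p_extrapolated:.0f}")
    ```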

  5. The application of the statistical theory of extreme values to gust-load problems

    NASA Technical Reports Server (NTRS)

    Press, Harry

    1950-01-01

    An analysis is presented which indicates that the statistical theory of extreme values is applicable to the problems of predicting the frequency of encountering the larger gust loads and gust velocities for both specific test conditions as well as commercial transport operations. The extreme-value theory provides an analytic form for the distributions of maximum values of gust load and velocity. Methods of fitting the distribution are given along with a method of estimating the reliability of the predictions. The theory of extreme values is applied to available load data from commercial transport operations. The results indicate that the estimates of the frequency of encountering the larger loads are more consistent with the data and more reliable than those obtained in previous analyses. (author)
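
    A minimal sketch, assuming per-flight maximum gust load factors are available as block maxima: the Gumbel (extreme value type I) distribution is fitted with scipy and used to estimate the exceedance probability of a larger load. The data are randomly generated placeholders, not the transport-operations records analysed in the report.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)

    # Placeholder per-flight maximum gust load factors (block maxima)
    flight_maxima = stats.gumbel_r.rvs(loc=1.2, scale=0.15, size=400, random_state=rng)

    # Fit the extreme-value (Gumbel) distribution to the observed maxima
    loc, scale = stats.gumbel_r.fit(flight_maxima)

    # Predicted probability that a flight's maximum exceeds a larger load
    load = 2.0
    p_exceed = stats.gumbel_r.sf(load, loc=loc, scale=scale)
    print(f"P(max load per flight > {load}) ~ {p_exceed:.2e}")
    print(f"expected flights between exceedances ~ {1 / p_exceed:.0f}")
    ```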

  6. Whole-Range Assessment: A Simple Method for Analysing Allelopathic Dose-Response Data

    PubMed Central

    An, Min; Pratley, J. E.; Haig, T.; Liu, D.L.

    2005-01-01

    Based on the typical biological responses of an organism to allelochemicals (hormesis), concepts of whole-range assessment and inhibition index were developed for improved analysis of allelopathic data. Examples of their application are presented using data drawn from the literature. The method is concise and comprehensive, and makes data grouping and multiple comparisons simple, logical, and possible. It improves data interpretation, enhances research outcomes, and is a statistically efficient summary of the plant response profiles. PMID:19330165

  7. Spatial analyses for nonoverlapping objects with size variations and their application to coral communities.

    PubMed

    Muko, Soyoka; Shimatani, Ichiro K; Nozawa, Yoko

    2014-07-01

    Spatial distributions of individuals are conventionally analysed by representing objects as dimensionless points, in which spatial statistics are based on centre-to-centre distances. However, if organisms expand without overlapping and show size variations, such as is the case for encrusting corals, interobject spacing is crucial for spatial associations where interactions occur. We introduced new pairwise statistics using minimum distances between objects and demonstrated their utility when examining encrusting coral community data. We also calculated the conventional point process statistics and the grid-based statistics to clarify the advantages and limitations of each spatial statistical method. For simplicity, coral colonies were approximated by disks in these demonstrations. Focusing on short-distance effects, the use of minimum distances revealed that almost all coral genera were aggregated at a scale of 1-25 cm. However, when fragmented colonies (ramets) were treated as a genet, a genet-level analysis indicated weak or no aggregation, suggesting that most corals were randomly distributed and that fragmentation was the primary cause of colony aggregations. In contrast, point process statistics showed larger aggregation scales, presumably because centre-to-centre distances included both intercolony spacing and colony sizes (radius). The grid-based statistics were able to quantify the patch (aggregation) scale of colonies, but the scale was strongly affected by the colony size. Our approach quantitatively showed repulsive effects between an aggressive genus and a competitively weak genus, while the grid-based statistics (covariance function) also showed repulsion although the spatial scale indicated from the statistics was not directly interpretable in terms of ecological meaning. The use of minimum distances together with previously proposed spatial statistics helped us to extend our understanding of the spatial patterns of nonoverlapping objects that vary in size and the associated specific scales. © 2013 The Authors. Journal of Animal Ecology © 2013 British Ecological Society.
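
    A minimal sketch of the pairwise quantity underlying the proposed statistics: the minimum (edge-to-edge) distance between two disk-approximated colonies, contrasted with the conventional centre-to-centre distance. The colony centres and radii below are invented.

    ```python
    import numpy as np

    def centre_distance(p1, p2):
        return float(np.hypot(*(np.asarray(p1) - np.asarray(p2))))

    def min_distance(p1, r1, p2, r2):
        """Edge-to-edge distance between two non-overlapping disks (0 if touching)."""
        return max(0.0, centre_distance(p1, p2) - r1 - r2)

    # Hypothetical colony centres (cm) and radii (cm)
    colonies = [((10.0, 12.0), 4.0), ((18.0, 12.0), 3.0), ((40.0, 30.0), 6.0)]

    for i in range(len(colonies)):
        for j in range(i + 1, len(colonies)):
            (p1, r1), (p2, r2) = colonies[i], colonies[j]
            print(f"pair ({i},{j}): centre-to-centre = {centre_distance(p1, p2):.1f} cm, "
                  f"minimum = {min_distance(p1, r1, p2, r2):.1f} cm")
    ```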

  8. Improved score statistics for meta-analysis in single-variant and gene-level association studies.

    PubMed

    Yang, Jingjing; Chen, Sai; Abecasis, Gonçalo

    2018-06-01

    Meta-analysis is now an essential tool for genetic association studies, allowing them to combine large studies and greatly accelerating the pace of genetic discovery. Although the standard meta-analysis methods perform equivalently to the more cumbersome joint analysis under ideal settings, they result in substantial power loss under unbalanced settings with various case-control ratios. Here, we investigate the power loss caused by the standard meta-analysis methods for unbalanced studies, and further propose novel meta-analysis methods performing equivalently to the joint analysis under both balanced and unbalanced settings. We derive improved meta-score-statistics that can accurately approximate the joint-score-statistics with combined individual-level data, for both linear and logistic regression models, with and without covariates. In addition, we propose a novel approach to adjust for population stratification by correcting for known population structures through minor allele frequencies. In the simulated gene-level association studies under unbalanced settings, our method recovered up to 85% of the power loss caused by the standard methods. We further showed the power gain of our methods in gene-level tests with 26 unbalanced studies of age-related macular degeneration. In addition, we took the meta-analysis of three unbalanced studies of type 2 diabetes as an example to discuss the challenges of meta-analyzing multi-ethnic samples. In summary, our improved meta-score-statistics with corrections for population stratification can be used to construct both single-variant and gene-level association studies, providing a useful framework for ensuring well-powered, convenient, cross-study analyses. © 2018 WILEY PERIODICALS, INC.
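
    This entry's improved statistics are specific to the authors' derivation, but the leave-nothing-out baseline they improve upon can be illustrated simply: per-study score statistics are combined by an inverse-variance (fixed-effect) rule. The sketch below shows only that standard combination with invented study-level values; it is not the proposed meta-score-statistic.

    ```python
    import numpy as np
    from scipy import stats

    # Hypothetical per-study score statistics U_k and their variances V_k
    U = np.array([3.1, 1.8, 4.2, 0.9, 2.6])
    V = np.array([2.0, 1.5, 3.0, 1.2, 2.2])

    # Standard combined score test: Z = sum(U_k) / sqrt(sum(V_k))
    z = U.sum() / np.sqrt(V.sum())
    p = 2 * stats.norm.sf(abs(z))
    print(f"combined score z = {z:.2f}, two-sided p = {p:.4f}")
    ```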

  9. Statistical approaches in published ophthalmic clinical science papers: a comparison to statistical practice two decades ago.

    PubMed

    Zhang, Harrison G; Ying, Gui-Shuang

    2018-02-09

    The aim of this study is to evaluate the current practice of statistical analysis of eye data in clinical science papers published in the British Journal of Ophthalmology (BJO) and to determine whether the practice of statistical analysis has improved in the past two decades. All clinical science papers (n=125) published in BJO in January-June 2017 were reviewed for their statistical analysis approaches for analysing the primary ocular measure. We compared our findings to the results from a previous paper that reviewed BJO papers in 1995. Of 112 papers eligible for analysis, half of the studies analysed the data at an individual level because of the nature of observation, 16 (14%) studies analysed data from one eye only, 36 (32%) studies analysed data from both eyes at ocular level, one study (1%) analysed the overall summary of ocular finding per individual and three (3%) studies used the paired comparison. Among studies with data available from both eyes, 50 (89%) of 56 papers in 2017 did not analyse data from both eyes or ignored the intereye correlation, as compared with 60 (90%) of 67 papers in 1995 (P=0.96). Among studies that analysed data from both eyes at an ocular level, 33 (92%) of 36 studies completely ignored the intereye correlation in 2017, as compared with 16 (89%) of 18 studies in 1995 (P=0.40). A majority of studies did not analyse the data properly when data from both eyes were available. The practice of statistical analysis did not improve in the past two decades. Collaborative efforts should be made in the vision research community to improve the practice of statistical analysis for ocular data. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
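
    One common remedy for the inter-eye correlation issue raised here is a generalized estimating equation with patients as clusters; the sketch below shows such a model with an exchangeable working correlation in statsmodels. The outcome (intraocular pressure), variable names, and data are placeholders, not data from the reviewed BJO papers.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(2)
    n_patients = 150

    # Placeholder ocular data: two eyes per patient with a shared patient effect
    patient = np.repeat(np.arange(n_patients), 2)
    treated = rng.binomial(1, 0.5, n_patients).repeat(2)
    patient_effect = rng.normal(0, 1.0, n_patients).repeat(2)   # induces inter-eye correlation
    iop = 16 - 2.0 * treated + patient_effect + rng.normal(0, 1.0, 2 * n_patients)
    df = pd.DataFrame({"patient": patient, "treated": treated, "iop": iop})

    # GEE with an exchangeable working correlation clusters the two eyes within a patient
    model = smf.gee("iop ~ treated", groups="patient", data=df,
                    cov_struct=sm.cov_struct.Exchangeable())
    result = model.fit()
    print(result.summary())
    ```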

  10. Subjective global assessment of nutritional status in children.

    PubMed

    Mahdavi, Aida Malek; Ostadrahimi, Alireza; Safaiyan, Abdolrasool

    2010-10-01

    This study was aimed to compare the subjective and objective nutritional assessments and to analyse the performance of subjective global assessment (SGA) of nutritional status in diagnosing undernutrition in paediatric patients. One hundred and forty children (aged 2-12 years) hospitalized consecutively in Tabriz Paediatric Hospital from June 2008 to August 2008 underwent subjective assessment using the SGA questionnaire and objective assessment, including anthropometric and biochemical measurements. Agreement between two assessment methods was analysed by the kappa (κ) statistic. Statistical indicators including (sensitivity, specificity, predictive values, error rates, accuracy, powers, likelihood ratios and odds ratio) between SGA and objective assessment method were determined. The overall prevalence of undernutrition according to the SGA (70.7%) was higher than that by objective assessment of nutritional status (48.5%). Agreement between the two evaluation methods was only fair to moderate (κ = 0.336, P < 0.001). The sensitivity, specificity, positive and negative predictive value of the SGA method for screening undernutrition in this population were 88.235%, 45.833%, 60.606% and 80.487%, respectively. Accuracy, positive and negative power of the SGA method were 66.428%, 56.074% and 41.25%, respectively. Likelihood ratio positive, likelihood ratio negative and odds ratio of the SGA method were 1.628, 0.256 and 6.359, respectively. Our findings indicated that in assessing nutritional status of children, there is not a good level of agreement between SGA and objective nutritional assessment. In addition, SGA is a highly sensitive tool for assessing nutritional status and could identify children at risk of developing undernutrition. © 2009 Blackwell Publishing Ltd.
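
    A minimal sketch of the agreement and screening statistics reported (sensitivity, specificity, predictive values, and the kappa statistic) computed from a 2x2 cross-classification of SGA against the objective assessment; the cell counts below are hypothetical and chosen only to make the arithmetic concrete.

    ```python
    # Hypothetical 2x2 table: rows = SGA result, columns = objective assessment
    #                objective undernourished   objective well-nourished
    tp, fp = 60, 39      # SGA undernourished
    fn, tn = 8, 33       # SGA well-nourished
    n = tp + fp + fn + tn

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)

    # Cohen's kappa: observed agreement relative to chance-expected agreement
    po = (tp + tn) / n
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (po - pe) / (1 - pe)

    print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} "
          f"PPV={ppv:.3f} NPV={npv:.3f} kappa={kappa:.3f}")
    ```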

  11. Using R-Project for Free Statistical Analysis in Extension Research

    ERIC Educational Resources Information Center

    Mangiafico, Salvatore S.

    2013-01-01

    One option for Extension professionals wishing to use free statistical software is to use online calculators, which are useful for common, simple analyses. A second option is to use a free computing environment capable of performing statistical analyses, like R-project. R-project is free, cross-platform, powerful, and respected, but may be…

  12. Incidence of post-operative adhesions following Misgav Ladach caesarean section--a comparative study.

    PubMed

    Fatusić, Zlatan; Hudić, Igor

    2009-02-01

    To evaluate the incidence of peritoneal adhesions as a post-operative complication after caesarean section following the Misgav Ladach method and compare it with peritoneal adhesions following traditional caesarean section methods (Pfannenstiel-Dörffler, low midline laparotomy-Dörffler). The analysis is retrospective and is based on medical documentation of the Clinic for Gynecology and Obstetrics, University Clinical Centre, Tuzla, Bosnia and Herzegovina (data from 1 January 2001 to 31 December 2005). We analysed previous caesarean sections according to caesarean section method (200 by the Misgav Ladach method, 100 by the Pfannenstiel-Dörffler method and 100 by low midline laparotomy-Dörffler). Adhesion scores were assigned using a previously validated scoring system. We found a statistically significant difference (p < 0.05) in the incidence of peritoneal adhesions in second and third caesarean sections between the Misgav Ladach method and the Pfannenstiel-Dörffler and low midline laparotomy-Dörffler methods. The difference in incidence of peritoneal adhesions between the low midline laparotomy-Dörffler and Pfannenstiel-Dörffler methods was not statistically significant (p > 0.05). The mean pelvic adhesion score was significantly lower in the Misgav Ladach group (0.43 +/- 0.79) than in the Pfannenstiel-Dörffler (0.71 +/- 1.27) and low midline laparotomy-Dörffler groups (0.99 +/- 1.49) (p < 0.05). Our study showed that the Misgav Ladach method of caesarean section is associated with a lower incidence of peritoneal adhesions as a post-operative complication of caesarean section.

  13. The expectancy-value muddle in the theory of planned behaviour - and some proposed solutions.

    PubMed

    French, David P; Hankins, Matthew

    2003-02-01

    The authors of the Theories of Reasoned Action and Planned Behaviour recommended a method for statistically analysing the relationships between beliefs and the Attitude, Subjective Norm, and Perceived Behavioural Control constructs. This method has been used in the overwhelming majority of studies using these theories. However, there is a growing awareness that this method yields statistically uninterpretable results (Evans, 1991). Despite this, the use of this method is continuing, as is uninformed interpretation of this problematic research literature. This is probably due to the lack of a simple account of where the problem lies, and the large number of alternatives available. This paper therefore summarizes the problem as simply as possible, gives consideration to the conclusions that can be validly drawn from studies that contain this problem, and critically reviews the many alternatives that have been proposed to address this problem. Different techniques are identified as being suitable, according to the purpose of the specific research project.

  14. Analysing recurrent hospitalizations in heart failure: a review of statistical methodology, with application to CHARM-Preserved.

    PubMed

    Rogers, Jennifer K; Pocock, Stuart J; McMurray, John J V; Granger, Christopher B; Michelson, Eric L; Östergren, Jan; Pfeffer, Marc A; Solomon, Scott D; Swedberg, Karl; Yusuf, Salim

    2014-01-01

    Heart failure is characterized by recurrent hospitalizations, but often only the first event is considered in clinical trial reports. In chronic diseases, such as heart failure, analysing all events gives a more complete picture of treatment benefit. We describe methods of analysing repeat hospitalizations, and illustrate their value in one major trial. The Candesartan in Heart failure Assessment of Reduction in Mortality and morbidity (CHARM)-Preserved study compared candesartan with placebo in 3023 patients with heart failure and preserved systolic function. The heart failure hospitalization rates were 12.5 and 8.9 per 100 patient-years in the placebo and candesartan groups, respectively. The repeat hospitalizations were analysed using the Andersen-Gill, Poisson, and negative binomial methods. Death was incorporated into analyses by treating it as an additional event. The win ratio method and a method that jointly models hospitalizations and mortality were also considered. Using repeat events gave larger treatment benefits than time to first event analysis. The negative binomial method for the composite of recurrent heart failure hospitalizations and cardiovascular death gave a rate ratio of 0.75 [95% confidence interval (CI) 0.62-0.91, P = 0.003], whereas the hazard ratio for time to first heart failure hospitalization or cardiovascular death was 0.86 (95% CI 0.74-1.00, P = 0.050). In patients with preserved EF, candesartan reduces the rate of admissions for worsening heart failure, to a greater extent than apparent from analysing only first hospitalizations. Recurrent events should be routinely incorporated into the analysis of future clinical trials in heart failure. © 2013 The Authors. European Journal of Heart Failure © 2013 European Society of Cardiology.
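
    A minimal sketch of the negative binomial approach described: hospitalization counts are regressed on treatment with log follow-up time as an offset, yielding a rate ratio. The trial data are simulated placeholders (with a true rate ratio near 0.75), and the dispersion parameter is held fixed rather than estimated, unlike a full analysis.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(9)
    n = 3000

    # Placeholder trial data: treatment reduces the hospitalization rate by ~25%
    treat = rng.binomial(1, 0.5, n)
    follow_up = rng.uniform(1.0, 4.0, n)                  # years
    rate = 0.125 * np.exp(np.log(0.75) * treat)           # events per patient-year
    counts = rng.negative_binomial(n=2, p=2 / (2 + rate * follow_up))

    X = sm.add_constant(pd.DataFrame({"treat": treat}))
    model = sm.GLM(counts, X, family=sm.families.NegativeBinomial(alpha=0.5),
                   offset=np.log(follow_up))
    res = model.fit()
    print("rate ratio (treated vs control):", round(np.exp(res.params["treat"]), 3))
    print("95% CI:", res.conf_int().apply(np.exp).loc["treat"].round(3).tolist())
    ```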

  15. On-Orbit System Identification

    NASA Technical Reports Server (NTRS)

    Mettler, E.; Milman, M. H.; Bayard, D.; Eldred, D. B.

    1987-01-01

    Information derived from accelerometer readings benefits important engineering and control functions. Report discusses methodology for detection, identification, and analysis of motions within space station. Techniques of vibration and rotation analyses, control theory, statistics, filter theory, and transform methods integrated to form system for generating models and model parameters that characterize total motion of complicated space station, with respect to both control-induced and random mechanical disturbances.

  16. Analysing the Opportunities and Challenges to Use of Information and Communication Technology Tools in Teaching-Learning Process

    ERIC Educational Resources Information Center

    Dastjerdi, Negin Barat

    2016-01-01

    The research aims at the evaluation of ICT use in teaching-learning process to the students of Isfahan elementary schools. The method of this research is descriptive-surveying. The statistical population of the study was all teachers of Isfahan elementary schools. The sample size was determined 350 persons that selected through cluster sampling…

  17. A method for estimating current attendance on sets of campgrounds...a pilot study

    Treesearch

    Richard L. Bury; Ruth Margolies

    1964-01-01

    Statistical models were devised for estimating both daily and seasonal attendance (and corresponding precision of estimates) through correlation-regression and ratio analyses. Total daily attendance for a test set of 23 campgrounds could be estimated from attendance measured in only one of them. The chances were that estimates would be within 10 percent of true...

  18. UNITY: Confronting Supernova Cosmology's Statistical and Systematic Uncertainties in a Unified Bayesian Framework

    NASA Astrophysics Data System (ADS)

    Rubin, D.; Aldering, G.; Barbary, K.; Boone, K.; Chappell, G.; Currie, M.; Deustua, S.; Fagrelius, P.; Fruchter, A.; Hayden, B.; Lidman, C.; Nordin, J.; Perlmutter, S.; Saunders, C.; Sofiatti, C.; Supernova Cosmology Project, The

    2015-11-01

    While recent supernova (SN) cosmology research has benefited from improved measurements, current analysis approaches are not statistically optimal and will prove insufficient for future surveys. This paper discusses the limitations of current SN cosmological analyses in treating outliers, selection effects, shape- and color-standardization relations, unexplained dispersion, and heterogeneous observations. We present a new Bayesian framework, called UNITY (Unified Nonlinear Inference for Type-Ia cosmologY), that incorporates significant improvements in our ability to confront these effects. We apply the framework to real SN observations and demonstrate smaller statistical and systematic uncertainties. We verify earlier results that SNe Ia require nonlinear shape and color standardizations, but we now include these nonlinear relations in a statistically well-justified way. This analysis was primarily performed blinded, in that the basic framework was first validated on simulated data before transitioning to real data. We also discuss possible extensions of the method.

  19. Cluster analysis of European Y-chromosomal STR haplotypes using the discrete Laplace method.

    PubMed

    Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels

    2014-07-01

    The European Y-chromosomal short tandem repeat (STR) haplotype distribution has previously been analysed in various ways. Here, we introduce a new way of analysing population substructure using a new method based on clustering within the discrete Laplace exponential family that models the probability distribution of the Y-STR haplotypes. Creating a consistent statistical model of the haplotypes enables us to perform a wide range of analyses. Previously, haplotype frequency estimation using the discrete Laplace method has been validated. In this paper we investigate how the discrete Laplace method can be used for cluster analysis to further validate the discrete Laplace method. A very important practical fact is that the calculations can be performed on a normal computer. We identified two sub-clusters of the Eastern and Western European Y-STR haplotypes similar to results of previous studies. We also compared pairwise distances (between geographically separated samples) with those obtained using the AMOVA method and found good agreement. Further analyses that are impossible with AMOVA were made using the discrete Laplace method: analysis of the homogeneity in two different ways and calculating marginal STR distributions. We found that the Y-STR haplotypes from e.g. Finland were relatively homogeneous as opposed to the relatively heterogeneous Y-STR haplotypes from e.g. Lublin, Eastern Poland and Berlin, Germany. We demonstrated that the observed distributions of alleles at each locus were similar to the expected ones. We also compared pairwise distances between geographically separated samples from Africa with those obtained using the AMOVA method and found good agreement. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  20. A review of published analyses of case-cohort studies and recommendations for future reporting.

    PubMed

    Sharp, Stephen J; Poulaliou, Manon; Thompson, Simon G; White, Ian R; Wood, Angela M

    2014-01-01

    The case-cohort study design combines the advantages of a cohort study with the efficiency of a nested case-control study. However, unlike more standard observational study designs, there are currently no guidelines for reporting results from case-cohort studies. Our aim was to review recent practice in reporting these studies, and develop recommendations for the future. By searching papers published in 24 major medical and epidemiological journals between January 2010 and March 2013 using PubMed, Scopus and Web of Knowledge, we identified 32 papers reporting case-cohort studies. The median subcohort sampling fraction was 4.1% (interquartile range 3.7% to 9.1%). The papers varied in their approaches to describing the numbers of individuals in the original cohort and the subcohort, presenting descriptive data, and in the level of detail provided about the statistical methods used, so it was not always possible to be sure that appropriate analyses had been conducted. Based on the findings of our review, we make recommendations about reporting of the study design, subcohort definition, numbers of participants, descriptive information and statistical methods, which could be used alongside existing STROBE guidelines for reporting observational studies.

  1. Measuring the statistical validity of summary meta‐analysis and meta‐regression results for use in clinical practice

    PubMed Central

    Riley, Richard D.

    2017-01-01

    An important question for clinicians appraising a meta‐analysis is: are the findings likely to be valid in their own practice—does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity—where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple (‘leave‐one‐out’) cross‐validation technique, we demonstrate how we may test meta‐analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta‐analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta‐analysis and a tailored meta‐regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within‐study variance, between‐study variance, study sample size, and the number of studies in the meta‐analysis. Finally, we apply Vn to two published meta‐analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta‐analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28620945
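
    A minimal sketch of the leave-one-out idea behind the validation concept: each study is omitted in turn, the remaining studies are pooled, and the omitted study's estimate is compared with the pooled value on a standardized scale. This simplified fixed-effect version with invented effect sizes is not the published derivation of Vn or its null distribution.

    ```python
    import numpy as np

    # Hypothetical study effect estimates (log odds ratios) and their variances
    effects = np.array([0.42, 0.30, 0.55, 0.18, 0.47, 0.35, 0.26])
    variances = np.array([0.04, 0.06, 0.09, 0.05, 0.07, 0.03, 0.08])

    def pooled(effects, variances):
        """Fixed-effect (inverse-variance) pooled estimate and its variance."""
        w = 1 / variances
        return np.sum(w * effects) / np.sum(w), 1 / np.sum(w)

    z_scores = []
    for i in range(len(effects)):
        mask = np.arange(len(effects)) != i
        est_loo, var_loo = pooled(effects[mask], variances[mask])
        # Standardized difference between the omitted study and the leave-one-out pool
        z = (effects[i] - est_loo) / np.sqrt(variances[i] + var_loo)
        z_scores.append(z)

    print("leave-one-out z-scores:", np.round(z_scores, 2))
    print("sum of squared z (validation-type summary):",
          round(float(np.sum(np.square(z_scores))), 2))
    ```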

  2. Diet misreporting can be corrected: confirmation of the association between energy intake and fat-free mass in adolescents.

    PubMed

    Vainik, Uku; Konstabel, Kenn; Lätt, Evelin; Mäestu, Jarek; Purge, Priit; Jürimäe, Jaak

    2016-10-01

    Subjective energy intake (sEI) is often misreported, providing unreliable estimates of energy consumed. Therefore, relating sEI data to health outcomes is difficult. Recently, Börnhorst et al. compared various methods to correct sEI-based energy intake estimates. They criticised approaches that categorise participants as under-reporters, plausible reporters and over-reporters based on the sEI:total energy expenditure (TEE) ratio, and thereafter use these categories as statistical covariates or exclusion criteria. Instead, they recommended using external predictors of sEI misreporting as statistical covariates. We sought to confirm and extend these findings. Using a sample of 190 adolescent boys (mean age=14), we demonstrated that dual-energy X-ray absorptiometry-measured fat-free mass is strongly associated with objective energy intake data (onsite weighted breakfast), but the association with sEI (previous 3-d dietary interview) is weak. Comparing sEI with TEE revealed that sEI was mostly under-reported (74 %). Interestingly, statistically controlling for dietary reporting groups or restricting samples to plausible reporters created a stronger-than-expected association between fat-free mass and sEI. However, the association was an artifact caused by selection bias - that is, data re-sampling and simulations showed that these methods overestimated the effect size because fat-free mass was related to sEI both directly and indirectly via TEE. A more realistic association between sEI and fat-free mass was obtained when the model included common predictors of misreporting (e.g. BMI, restraint). To conclude, restricting sEI data only to plausible reporters can cause selection bias and inflated associations in later analyses. Therefore, we further support statistically correcting sEI data in nutritional analyses. The script for running simulations is provided.

  3. Biometric Analysis – A Reliable Indicator for Diagnosing Taurodontism using Panoramic Radiographs

    PubMed Central

    Hegde, Veda; Anegundi, Rajesh Trayambhak; Pravinchandra, K.R.

    2013-01-01

    Background: Taurodontism is a clinical entity with a morpho–anatomical change in the shape of the tooth, which was thought to be absent in modern man. Taurodontism is mostly observed as an isolated trait or a component of a syndrome. Various techniques have been devised to diagnose taurodontism. Aim: The aim of this study was to analyze whether a biometric analysis was useful in diagnosing taurodontism, in radiographs which appeared to be normal on cursory observations. Setting and Design: This study was carried out in our institution by using radiographs which were taken for routine procedures. Material and Methods: In this retrospective study, panoramic radiographs were obtained from dental records of children who were aged between 9–14 years, who did not have any abnormality on cursory observations. Biometric analyses were carried out on permanent mandibular first molar(s) by using a novel biometric method. The values were tabulated and analysed. Statistics: Fischer exact probability test, Chi square test and Chi-square test with Yates correction were used for statistical analysis of the data. Results: Cursory observation did not yield us any case of taurodontism. In contrast, the biometric analysis yielded us a statistically significant number of cases of taurodontism. However, there was no statistically significant difference in the number of cases with taurodontism, which was obtained between the genders and the age group which was considered. Conclusion: Thus, taurodontism was diagnosed on a biometric analysis, which was otherwise missed on a cursory observation. It is therefore necessary from the clinical point of view, to diagnose even the mildest form of taurodontism by using metric analysis rather than just relying on a visual radiographic assessment, as its occurrence has many clinical implications and a diagnostic importance. PMID:24086912

  4. Visual field progression in glaucoma: estimating the overall significance of deterioration with permutation analyses of pointwise linear regression (PoPLR).

    PubMed

    O'Leary, Neil; Chauhan, Balwantray C; Artes, Paul H

    2012-10-01

    To establish a method for estimating the overall statistical significance of visual field deterioration from an individual patient's data, and to compare its performance to pointwise linear regression. The Truncated Product Method was used to calculate a statistic S that combines evidence of deterioration from individual test locations in the visual field. The overall statistical significance (P value) of visual field deterioration was inferred by comparing S with its permutation distribution, derived from repeated reordering of the visual field series. Permutation of pointwise linear regression (PoPLR) and pointwise linear regression were evaluated in data from patients with glaucoma (944 eyes, median mean deviation -2.9 dB, interquartile range: -6.3, -1.2 dB) followed for more than 4 years (median 10 examinations over 8 years). False-positive rates were estimated from randomly reordered series of this dataset, and hit rates (proportion of eyes with significant deterioration) were estimated from the original series. The false-positive rates of PoPLR were indistinguishable from the corresponding nominal significance levels and were independent of baseline visual field damage and length of follow-up. At P < 0.05, the hit rates of PoPLR were 12, 29, and 42%, at the fifth, eighth, and final examinations, respectively, and at matching specificities they were consistently higher than those of pointwise linear regression. In contrast to population-based progression analyses, PoPLR provides a continuous estimate of statistical significance for visual field deterioration individualized to a particular patient's data. This allows close control over specificity, essential for monitoring patients in clinical practice and in clinical trials.
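
    A minimal sketch of the two ingredients of PoPLR: pointwise linear regression p-values combined with the Truncated Product Method, and overall significance obtained by permuting the order of examinations. The simulated visual-field series, number of locations, truncation threshold, and number of permutations are illustrative only and are far smaller than in practice.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    n_visits, n_locations, tau = 10, 20, 0.05
    visit_times = np.arange(n_visits, dtype=float)

    # Simulated sensitivities (dB): a few locations deteriorate over follow-up
    series = rng.normal(28, 1.0, size=(n_visits, n_locations))
    series[:, :4] -= 0.4 * visit_times[:, None]          # progressing locations

    def truncated_product(series, times, tau=0.05):
        """Combine one-sided pointwise regression p-values with p <= tau."""
        pvals = []
        for loc in range(series.shape[1]):
            slope, _, _, p_two_sided, _ = stats.linregress(times, series[:, loc])
            p = p_two_sided / 2 if slope < 0 else 1 - p_two_sided / 2  # one-sided deterioration
            pvals.append(p)
        pvals = np.array(pvals)
        selected = pvals[pvals <= tau]
        return np.prod(selected) if selected.size else 1.0

    s_obs = truncated_product(series, visit_times, tau)

    # Permutation distribution: reorder the examinations, keeping locations intact
    n_perm = 500
    s_perm = np.array([truncated_product(series[rng.permutation(n_visits)], visit_times, tau)
                       for _ in range(n_perm)])
    p_overall = np.mean(s_perm <= s_obs)
    print(f"observed statistic S = {s_obs:.3g}, permutation p-value ~ {p_overall:.3f}")
    ```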

  5. Teenage births to ethnic minority women.

    PubMed

    Berthoud, R

    2001-01-01

    This article analyses British age-specific fertility rates by ethnic group, with a special interest in child-bearing by women below the age of 20. Birth statistics are not analysed by ethnic group, and teenage birth rates have been estimated from the dates of birth of mothers and children in the Labour Force Survey. The method appears to be robust. Caribbean, Pakistani and especially Bangladeshi women were much more likely to have been teenage mothers than white women, but Indian women were below the national average. Teenage birth rates have been falling in all three South Asian communities.

  6. Temporal scaling and spatial statistical analyses of groundwater level fluctuations

    NASA Astrophysics Data System (ADS)

    Sun, H.; Yuan, L., Sr.; Zhang, Y.

    2017-12-01

    Natural dynamics such as groundwater level fluctuations can exhibit multifractionality and/or multifractality due likely to multi-scale aquifer heterogeneity and controlling factors, whose statistics requires efficient quantification methods. This study explores multifractionality and non-Gaussian properties in groundwater dynamics expressed by time series of daily level fluctuation at three wells located in the lower Mississippi valley, after removing the seasonal cycle in the temporal scaling and spatial statistical analysis. First, using the time-scale multifractional analysis, a systematic statistical method is developed to analyze groundwater level fluctuations quantified by the time-scale local Hurst exponent (TS-LHE). Results show that the TS-LHE does not remain constant, implying the fractal-scaling behavior changing with time and location. Hence, we can distinguish the potentially location-dependent scaling feature, which may characterize the hydrology dynamic system. Second, spatial statistical analysis shows that the increment of groundwater level fluctuations exhibits a heavy tailed, non-Gaussian distribution, which can be better quantified by a Lévy stable distribution. Monte Carlo simulations of the fluctuation process also show that the linear fractional stable motion model can well depict the transient dynamics (i.e., fractal non-Gaussian property) of groundwater level, while fractional Brownian motion is inadequate to describe natural processes with anomalous dynamics. Analysis of temporal scaling and spatial statistics therefore may provide useful information and quantification to understand further the nature of complex dynamics in hydrology.

  7. The Problem of Auto-Correlation in Parasitology

    PubMed Central

    Pollitt, Laura C.; Reece, Sarah E.; Mideo, Nicole; Nussey, Daniel H.; Colegrave, Nick

    2012-01-01

    Explaining the contribution of host and pathogen factors in driving infection dynamics is a major ambition in parasitology. There is increasing recognition that analyses based on single summary measures of an infection (e.g., peak parasitaemia) do not adequately capture infection dynamics and so, the appropriate use of statistical techniques to analyse dynamics is necessary to understand infections and, ultimately, control parasites. However, the complexities of within-host environments mean that tracking and analysing pathogen dynamics within infections and among hosts poses considerable statistical challenges. Simple statistical models make assumptions that will rarely be satisfied in data collected on host and parasite parameters. In particular, model residuals (unexplained variance in the data) should not be correlated in time or space. Here we demonstrate how failure to account for such correlations can result in incorrect biological inference from statistical analysis. We then show how mixed effects models can be used as a powerful tool to analyse such repeated measures data in the hope that this will encourage better statistical practices in parasitology. PMID:22511865
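
    A minimal sketch of the recommended remedy: a linear mixed-effects model with random intercepts and slopes per host, so that repeated measures within an infection are not treated as independent. The variable names and simulated parasitaemia data are placeholders (statsmodels MixedLM), not the paper's worked example.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(6)
    n_hosts, n_days = 30, 12

    # Placeholder repeated measures: log parasitaemia tracked daily within each host
    host = np.repeat(np.arange(n_hosts), n_days)
    day = np.tile(np.arange(n_days), n_hosts)
    host_intercept = rng.normal(0, 1.0, n_hosts)[host]       # host-level variation
    host_slope = rng.normal(0.3, 0.1, n_hosts)[host]         # host-specific growth rate
    y = 2.0 + host_intercept + host_slope * day + rng.normal(0, 0.5, n_hosts * n_days)
    df = pd.DataFrame({"host": host, "day": day, "y": y})

    # Random intercept and random slope for day, grouped by host
    model = smf.mixedlm("y ~ day", data=df, groups=df["host"], re_formula="~day")
    result = model.fit()
    print(result.summary())
    ```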

  8. Rockslide susceptibility and hazard assessment for mitigation works design along vertical rocky cliffs: workflow proposal based on a real case-study conducted in Sacco (Campania), Italy

    NASA Astrophysics Data System (ADS)

    Pignalosa, Antonio; Di Crescenzo, Giuseppe; Marino, Ermanno; Terracciano, Rosario; Santo, Antonio

    2015-04-01

    The work presented here concerns a case study in which a complete multidisciplinary workflow has been applied for an extensive assessment of rockslide susceptibility and hazard in a common scenario: vertical, fractured rocky cliffs. The studied area is located in a high-relief zone in Southern Italy (Sacco, Salerno, Campania), characterized by wide vertical rocky cliffs formed by tectonized thick successions of shallow-water limestones. The study comprised the following phases: a) topographic surveying integrating 3D laser scanning, photogrammetry and GNSS; b) geological surveying, characterization of single instabilities and geomechanical surveying, conducted by geologist rock climbers; c) processing of 3D data and reconstruction of high-resolution geometrical models; d) structural and geomechanical analyses; e) data filing in a GIS-based spatial database; f) geo-statistical and spatial analyses and mapping of the whole set of data; g) 3D rockfall analysis. The main goals of the study were a) to set up an investigation method to achieve a complete and thorough characterization of the slope stability conditions and b) to provide a detailed basis for an accurate definition of the reinforcement and mitigation systems. For these purposes the most up-to-date methods of field surveying, remote sensing, 3D modelling and geospatial data analysis have been integrated in a systematic workflow, accounting for the economic sustainability of the whole project. A novel integrated approach has been applied, fusing deterministic and statistical surveying methods. This approach made it possible to deal with the wide extent of the studied area (nearly 200,000 m2) without compromising the accuracy of the results. The deterministic phase, based on a field characterization of single instabilities and their further analysis on 3D models, was applied to delineate the peculiarities of each single feature. The statistical approach, based on geostructural field mapping and on point geomechanical data from scan-line surveying, allowed partitioning of the rock mass into homogeneous geomechanical sectors and data interpolation through bounded geostatistical analyses on 3D models. All data, resulting from both approaches, have been referenced and filed in a single spatial database and considered in global geo-statistical analyses to derive a fully modelled and comprehensive evaluation of the rockslide susceptibility. The described workflow yielded the following innovative results: a) a detailed census of single potential instabilities, through a spatial database recording the geometrical, geological and mechanical features, along with the expected failure modes; b) a high-resolution characterization of the whole-slope rockslide susceptibility, based on the partitioning of the area according to the stability and mechanical conditions, which can be directly related to specific hazard mitigation systems; c) the exact extent of the area exposed to the rockslide hazard, along with the dynamic parameters of the expected phenomena; d) an intervention design for hazard mitigation.

  9. Lindemann histograms as a new method to analyse nano-patterns and phases

    NASA Astrophysics Data System (ADS)

    Makey, Ghaith; Ilday, Serim; Tokel, Onur; Ibrahim, Muhamet; Yavuz, Ozgun; Pavlov, Ihor; Gulseren, Oguz; Ilday, Omer

    The detection, observation, and analysis of material phases and atomistic patterns are of great importance for understanding systems exhibiting both equilibrium and far-from-equilibrium dynamics. As such, there is intense research on phase transitions and pattern dynamics in soft matter, statistical and nonlinear physics, and polymer physics. In order to identify phases and nano-patterns, the pair correlation function is commonly used. However, this approach is limited in terms of recognizing competing patterns in dynamic systems, and lacks visualisation capabilities. To overcome these limitations, we introduce Lindemann histogram quantification as an alternative method to analyse solid, liquid, and gas phases, along with hexagonal, square, and amorphous nano-pattern symmetries. We show that the proposed approach, based on the Lindemann parameter calculated per particle, maps local number densities to material phases or particle patterns. We apply the Lindemann histogram method to experimental data on dynamical colloidal self-assembly and identify competing patterns.
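
    The record does not spell out the exact per-particle definition used; below is a hedged sketch of one common Lindemann-type parameter (the relative temporal fluctuation of distances to a particle's neighbours), whose values over all particles could then be binned into the kind of histogram described above. The trajectory array and neighbour lists are assumed inputs.

```python
import numpy as np

def lindemann_per_particle(traj, neighbors):
    """traj: (T, N, d) particle positions over T frames; neighbors: list of index
    arrays giving each particle's neighbours (assumed fixed over the time window)."""
    T, N, _ = traj.shape
    q = np.zeros(N)
    for i in range(N):
        # Distances from particle i to its neighbours in every frame, shape (T, n_neighbours).
        d = np.linalg.norm(traj[:, neighbors[i], :] - traj[:, i:i + 1, :], axis=2)
        q[i] = np.mean(d.std(axis=0) / d.mean(axis=0))  # relative fluctuation, averaged over neighbours
    return q

# A histogram of q over all particles (the "Lindemann histogram") can then separate,
# for example, solid-like regions (small q) from liquid-like regions (large q).
```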

  10. National Trends in Trace Metals Concentrations in Ambient Particulate Matter

    NASA Astrophysics Data System (ADS)

    McCarthy, M. C.; Hafner, H. R.; Charrier, J. G.

    2007-12-01

    Ambient measurements of trace metals identified as hazardous air pollutants (HAPs, air toxics) collected in the United States from 1990 to 2006 were analyzed for long-term trends. Trace metals analyzed include lead, manganese, arsenic, chromium, nickel, cadmium, and selenium. Visual and statistical analyses were used to identify and quantify temporal variations in air toxics at national and regional levels. Trend periods were required to be at least five years. Lead particles decreased in concentration at most monitoring sites, but trends in other metals were not consistent over time or spatially. In addition, routine ambient monitoring methods had method detection limits (MDLs) too high to adequately measure concentrations for trends analysis. Differences between measurement methods at urban and rural sites also confound trends analyses. Improvements in MDLs, and a better understanding of comparability between networks, are needed to better quantify trends in trace metal concentrations in the future.

  11. A statistical framework for neuroimaging data analysis based on mutual information estimated via a gaussian copula

    PubMed Central

    Giordano, Bruno L.; Kayser, Christoph; Rousselet, Guillaume A.; Gross, Joachim; Schyns, Philippe G.

    2016-01-01

    Abstract We begin by reviewing the statistical framework of information theory as applicable to neuroimaging data analysis. A major factor hindering wider adoption of this framework in neuroimaging is the difficulty of estimating information theoretic quantities in practice. We present a novel estimation technique that combines the statistical theory of copulas with the closed form solution for the entropy of Gaussian variables. This results in a general, computationally efficient, flexible, and robust multivariate statistical framework that provides effect sizes on a common meaningful scale, allows for unified treatment of discrete, continuous, unidimensional and multidimensional variables, and enables direct comparisons of representations from behavioral and brain responses across any recording modality. We validate the use of this estimate as a statistical test within a neuroimaging context, considering both discrete stimulus classes and continuous stimulus features. We also present examples of analyses facilitated by these developments, including application of multivariate analyses to MEG planar magnetic field gradients, and pairwise temporal interactions in evoked EEG responses. We show the benefit of considering the instantaneous temporal derivative together with the raw values of M/EEG signals as a multivariate response, how we can separately quantify modulations of amplitude and direction for vector quantities, and how we can measure the emergence of novel information over time in evoked responses. Open‐source Matlab and Python code implementing the new methods accompanies this article. Hum Brain Mapp 38:1541–1573, 2017. © 2016 Wiley Periodicals, Inc. PMID:27860095
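
    As an illustration of the core idea (a minimal sketch, not the open-source toolbox released with the article), a Gaussian-copula mutual information estimate for two continuous variables: each variable is rank-transformed to standard normal margins, and the Gaussian closed-form MI is computed from their correlation.

```python
import numpy as np
from scipy.stats import rankdata, norm

def copnorm(x):
    # empirical CDF (ranks) -> uniform -> standard normal margins
    return norm.ppf(rankdata(x) / (len(x) + 1.0))

def gaussian_copula_mi(x, y):
    cx, cy = copnorm(x), copnorm(y)
    r = np.corrcoef(cx, cy)[0, 1]
    return -0.5 * np.log(1.0 - r ** 2)  # MI in nats; a lower bound on the true MI

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = x + rng.normal(size=1000)  # toy dependent data
print(gaussian_copula_mi(x, y))
```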

  12. An investigative comparison of purging and non-purging groundwater sampling methods in Karoo aquifer monitoring wells

    NASA Astrophysics Data System (ADS)

    Gomo, M.; Vermeulen, D.

    2015-03-01

    An investigation was conducted to statistically compare the influence of non-purging and purging groundwater sampling methods on analysed inorganic chemistry parameters and calculated saturation indices. Groundwater samples were collected from 15 monitoring wells drilled in Karoo aquifers before and after purging for the comparative study. For the non-purging method, samples were collected from groundwater flow zones located in the wells using electrical conductivity (EC) profiling. The two data sets of non-purged and purged groundwater samples were analysed for inorganic chemistry parameters at the Institute of Groundwater Studies (IGS) laboratory of the Free University in South Africa. Saturation indices for mineral phases found in the database of the PHREEQC hydrogeochemical model were calculated for each data set. Four one-way ANOVA tests were conducted using Microsoft Excel 2007 to investigate whether there is any statistically significant difference between: (1) all inorganic chemistry parameters measured in the non-purged and purged groundwater samples for each specific well, (2) all mineral saturation indices calculated for the non-purged and purged groundwater samples for each specific well, (3) individual inorganic chemistry parameters measured in the non-purged and purged groundwater samples across all wells and (4) individual mineral saturation indices calculated for non-purged and purged groundwater samples across all wells. For all the ANOVA tests conducted, the calculated p-values are greater than 0.05 (the significance level) and the test statistic (F) is less than the critical value (Fcrit) (F < Fcrit). The results imply that there was no statistically significant difference between the two data sets. With 95% confidence, it was therefore concluded that the variance between groups was due to random chance rather than to the influence of the sampling methods (the tested factor). It is therefore possible that under some hydrogeologic conditions non-purged groundwater samples might be just as representative as purged ones. The findings of this study can provide an important platform for future evidence-oriented research investigations to establish the necessity of purging prior to groundwater sampling in different aquifer systems.
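
    A minimal sketch of one of the comparisons described above, assuming two vectors of a single parameter measured in non-purged and purged samples (hypothetical numbers); scipy's one-way ANOVA plays the role of the Excel analysis.

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical electrical conductivity values (mS/m) for the same wells,
# sampled without purging and after purging.
non_purged = np.array([52.1, 63.4, 48.9, 71.2, 55.0, 60.3])
purged     = np.array([53.0, 61.8, 50.2, 69.5, 56.1, 59.4])

F, p = f_oneway(non_purged, purged)
print(f"F = {F:.3f}, p = {p:.3f}")
# p > 0.05 would be read, as in the study, as no statistically significant
# difference between the two sampling methods for this parameter.
```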

  13. Predictive distributions for between-study heterogeneity and simple methods for their application in Bayesian meta-analysis

    PubMed Central

    Turner, Rebecca M; Jackson, Dan; Wei, Yinghui; Thompson, Simon G; Higgins, Julian P T

    2015-01-01

    Numerous meta-analyses in healthcare research combine results from only a small number of studies, for which the variance representing between-study heterogeneity is estimated imprecisely. A Bayesian approach to estimation allows external evidence on the expected magnitude of heterogeneity to be incorporated. The aim of this paper is to provide tools that improve the accessibility of Bayesian meta-analysis. We present two methods for implementing Bayesian meta-analysis, using numerical integration and importance sampling techniques. Based on 14 886 binary outcome meta-analyses in the Cochrane Database of Systematic Reviews, we derive a novel set of predictive distributions for the degree of heterogeneity expected in 80 settings depending on the outcomes assessed and comparisons made. These can be used as prior distributions for heterogeneity in future meta-analyses. The two methods are implemented in R, for which code is provided. Both methods produce equivalent results to standard but more complex Markov chain Monte Carlo approaches. The priors are derived as log-normal distributions for the between-study variance, applicable to meta-analyses of binary outcomes on the log odds-ratio scale. The methods are applied to two example meta-analyses, incorporating the relevant predictive distributions as prior distributions for between-study heterogeneity. We have provided resources to facilitate Bayesian meta-analysis, in a form accessible to applied researchers, which allow relevant prior information on the degree of heterogeneity to be incorporated. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:25475839
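
    A hedged sketch of the numerical-integration route to Bayesian random-effects meta-analysis described above: the study estimates and standard errors are hypothetical, and the log-normal prior hyperparameters are placeholders for the setting-specific values tabulated by the authors.

```python
import numpy as np
from scipy import stats

# Hypothetical study-level log odds ratios and their standard errors.
y  = np.array([-0.35, -0.10, -0.62, 0.05])
se = np.array([ 0.25,  0.30,  0.40, 0.35])

# Log-normal prior on the between-study variance tau^2 (placeholder hyperparameters).
prior_tau2 = stats.lognorm(s=1.5, scale=np.exp(-2.0))

mu_grid   = np.linspace(-2.0, 2.0, 401)
tau2_grid = np.linspace(1e-4, 2.0, 400)

log_post = np.empty((tau2_grid.size, mu_grid.size))
for i, t2 in enumerate(tau2_grid):
    sd = np.sqrt(se ** 2 + t2)  # marginal SD of each study estimate given tau^2
    loglik = stats.norm.logpdf(y[:, None], loc=mu_grid[None, :], scale=sd[:, None]).sum(axis=0)
    log_post[i, :] = loglik + prior_tau2.logpdf(t2)  # flat prior on mu

post = np.exp(log_post - log_post.max())
post /= post.sum()
mu_marginal = post.sum(axis=0)
print("posterior mean of the pooled log odds ratio:", np.sum(mu_marginal * mu_grid))
```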

  14. [Quality assessment in anesthesia].

    PubMed

    Kupperwasser, B

    1996-01-01

    Quality assessment (assurance/improvement) is the set of methods used to measure and improve the delivered care and the department's performance against pre-established criteria or standards. The four stages of the self-maintained quality assessment cycle are: problem identification, problem analysis, problem correction and evaluation of corrective actions. Quality assessment is a measurable entity for which it is necessary to define and calibrate measurement parameters (indicators) from available data gathered from the hospital anaesthesia environment. Problem identification comes from the accumulation of indicators. There are four types of quality indicators: structure, process, outcome and sentinel indicators. The latter signal a quality defect, are independent of outcomes, are easier to analyse by statistical methods, and are closely related to processes and to the main targets of quality improvement. The three types of methods to analyse the problems (indicators) are: peer review, quantitative methods and risk management techniques. Peer review is performed by qualified anaesthesiologists. To improve its validity, the review process should be made explicit and conclusions should be based on standards of practice and literature references. The quantitative methods are statistical analyses applied to the collected data and presented in a graphic format (histogram, Pareto diagram, control charts). The risk management techniques include: a) critical incident analysis, which establishes an objective relationship between a 'critical' event and the associated human behaviours; b) system accident analysis, which, based on the fact that accidents continue to occur despite safety systems and sophisticated technologies, examines all the process components leading to the unpredictable outcome and not just the human factors; c) cause-effect diagrams, which facilitate the problem analysis by reducing its causes to four fundamental components (persons, regulations, equipment, process). Definition and implementation of corrective measures, based on the findings of the two previous stages, are the third step of the evaluation cycle. The Hawthorne effect is an improvement in outcomes that occurs before the implementation of any corrective actions. Verification of the implemented actions is the final and mandatory step closing the evaluation cycle.

  15. Gene Level Meta-Analysis of Quantitative Traits by Functional Linear Models.

    PubMed

    Fan, Ruzong; Wang, Yifan; Boehnke, Michael; Chen, Wei; Li, Yun; Ren, Haobo; Lobach, Iryna; Xiong, Momiao

    2015-08-01

    Meta-analysis of genetic data must account for differences among studies, including study designs, markers genotyped, and covariates. The effects of genetic variants may differ from population to population, i.e., there is heterogeneity. Thus, meta-analysis combining data from multiple studies is difficult. Novel statistical methods for meta-analysis are needed. In this article, functional linear models are developed for meta-analyses that connect genetic data to quantitative traits, adjusting for covariates. The models can be used to analyze rare variants, common variants, or a combination of the two. Both likelihood-ratio test (LRT) and F-distributed statistics are introduced to test association between quantitative traits and multiple variants in one genetic region. Extensive simulations are performed to evaluate empirical type I error rates and power performance of the proposed tests. The proposed LRT and F-distributed statistics control the type I error very well and have higher power than the existing method, the meta-analysis sequence kernel association test (MetaSKAT). We analyze four blood lipid levels in data from a meta-analysis of eight European studies. The proposed methods detect more significant associations than MetaSKAT, and the P-values of the proposed LRT and F-distributed statistics are usually much smaller than those of MetaSKAT. The functional linear models and related test statistics can be useful in whole-genome and whole-exome association studies. Copyright © 2015 by the Genetics Society of America.

  16. Night shift work and breast cancer risk: what do the meta-analyses tell us?

    PubMed

    Pahwa, Manisha; Labrèche, France; Demers, Paul A

    2018-05-22

    Objectives This paper aims to compare results, assess the quality, and discuss the implications of recently published meta-analyses of night shift work and breast cancer risk. Methods A comprehensive search was conducted for meta-analyses published from 2007-2017 that included at least one pooled effect size (ES) for breast cancer associated with any night shift work exposure metric and were accompanied by a systematic literature review. Pooled ES from each meta-analysis were ascertained with a focus on ever/never exposure associations. Assessments of heterogeneity and publication bias were also extracted. The AMSTAR 2 checklist was used to evaluate quality. Results Seven meta-analyses, published from 2013-2016, collectively included 30 cohort and case-control studies spanning 1996-2016. Five meta-analyses reported pooled ES for ever/never night shift work exposure; these ranged from 0.99 (95% confidence interval (CI) 0.95-1.03, N=10 cohort studies) to 1.40 (95% CI 1.13-1.73, N=9 high quality studies). Estimates for duration, frequency, and cumulative night shift work exposure were scant and mostly not statistically significant. Meta-analyses of cohort, Asian, and more fully-adjusted studies generally resulted in lower pooled ES than case-control, European, American, or minimally-adjusted studies. Most reported statistically significant between-study heterogeneity. Publication bias was not evident in any of the meta-analyses. Only one meta-analysis was strong in critical quality domains. Conclusions Fairly consistent elevated pooled ES were found for ever/never night shift work and breast cancer risk, but results for other shift work exposure metrics were inconclusive. Future evaluations of shift work should incorporate high quality meta-analyses that better appraise individual study quality.

  17. Tips and Tricks for Successful Application of Statistical Methods to Biological Data.

    PubMed

    Schlenker, Evelyn

    2016-01-01

    This chapter discusses experimental design and use of statistics to describe characteristics of data (descriptive statistics) and inferential statistics that test the hypothesis posed by the investigator. Inferential statistics, based on probability distributions, depend upon the type and distribution of the data. For data that are continuous, randomly and independently selected, and normally distributed, more powerful parametric tests such as Student's t test and analysis of variance (ANOVA) can be used. For non-normally distributed or skewed data, transformation of the data (using logarithms) may normalize the data, allowing use of parametric tests. Alternatively, with skewed data nonparametric tests can be utilized, some of which rely on data that are ranked prior to statistical analysis. Experimental designs and analyses need to balance the risks of type 1 errors (false positives) and type 2 errors (false negatives). For a variety of clinical studies that determine risk or benefit, relative risk ratios (randomized clinical trials and cohort studies) or odds ratios (case-control studies) are utilized. Although both use 2 × 2 tables, their premise and calculations differ. Finally, special statistical methods are applied to microarray and proteomics data, since the large number of genes or proteins evaluated increases the likelihood of false discoveries. Additional studies in separate samples are used to verify microarray and proteomic data. Examples in this chapter and references are available to help continued investigation of experimental designs and appropriate data analysis.
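
    A small sketch of the decision logic described above, using scipy on hypothetical measurements: check normality, optionally log-transform skewed data, then fall back to a nonparametric test if normality still fails.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.lognormal(mean=1.0, sigma=0.6, size=30)   # hypothetical skewed data
group_b = rng.lognormal(mean=1.3, sigma=0.6, size=30)

def compare(a, b, alpha=0.05):
    # If both groups look normal (Shapiro-Wilk), use Student's t test; otherwise Mann-Whitney U.
    if stats.shapiro(a).pvalue > alpha and stats.shapiro(b).pvalue > alpha:
        return "t test", stats.ttest_ind(a, b)
    return "Mann-Whitney U", stats.mannwhitneyu(a, b)

print(compare(group_a, group_b))                      # raw, skewed data
print(compare(np.log(group_a), np.log(group_b)))      # after log transformation
```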

  18. Fighting bias with statistics: Detecting gender differences in responses to items on a preschool science assessment

    NASA Astrophysics Data System (ADS)

    Greenberg, Ariela Caren

    Differential item functioning (DIF) and differential distractor functioning (DDF) are methods used to screen for item bias (Camilli & Shepard, 1994; Penfield, 2008). Using an applied empirical example, this mixed-methods study examined the congruency and relationship of DIF and DDF methods in screening multiple-choice items. Data for Study I were drawn from item responses of 271 female and 236 male low-income children on a preschool science assessment. Item analyses employed a common statistical approach, the Mantel-Haenszel log-odds ratio (MH-LOR), to detect DIF in dichotomously scored items (Holland & Thayer, 1988), and extended the approach to identify DDF (Penfield, 2008). Findings demonstrated that using the MH-LOR to detect DIF and DDF supported the theoretical relationship that the magnitude and form of DIF are dependent on the DDF effects, and demonstrated the advantages of studying DIF and DDF in multiple-choice items. A total of 4 items with DIF and DDF and 5 items with only DDF were detected. Study II incorporated an item content review, an important but often overlooked and under-published step of DIF and DDF studies (Camilli & Shepard, 1994). Interviews with 25 female and 22 male low-income preschool children and an expert review helped to interpret the DIF and DDF results and their comparison, and determined that a content review process of studied items can reveal reasons for potential item bias that are often congruent with the statistical results. Patterns emerged and are discussed in detail. The quantitative and qualitative analyses were conducted in an applied framework of examining the validity of the preschool science assessment scores for evaluating science programs serving low-income children; however, the techniques can be generalized for use with measures across various disciplines of research.

  19. Design and analysis of randomized clinical trials requiring prolonged observation of each patient. II. analysis and examples.

    PubMed Central

    Peto, R.; Pike, M. C.; Armitage, P.; Breslow, N. E.; Cox, D. R.; Howard, S. V.; Mantel, N.; McPherson, K.; Peto, J.; Smith, P. G.

    1977-01-01

    Part I of this report appeared in the previous issue (Br. J. Cancer (1976) 34,585), and discussed the design of randomized clinical trials. Part II now describes efficient methods of analysis of randomized clinical trials in which we wish to compare the duration of survival (or the time until some other untoward event first occurs) among different groups of patients. It is intended to enable physicians without statistical training either to analyse such data themselves using life tables, the logrank test and retrospective stratification, or, when such analyses are presented, to appreciate them more critically, but the discussion may also be of interest to statisticians who have not yet specialized in clinical trial analyses. PMID:831755
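
    For readers who want to reproduce a simple two-group survival comparison of the kind described, a hedged sketch using the third-party lifelines package (an assumption, not part of the original report) on hypothetical follow-up data:

```python
import numpy as np
from lifelines.statistics import logrank_test

# Hypothetical follow-up times (months) and event indicators (1 = event observed, 0 = censored).
time_a  = np.array([6, 13, 21, 30, 37, 44, 52, 60])
event_a = np.array([1,  1,  1,  0,  1,  0,  1,  0])
time_b  = np.array([5,  9, 14, 19, 25, 31, 40, 48])
event_b = np.array([1,  1,  1,  1,  0,  1,  1,  1])

result = logrank_test(time_a, time_b, event_observed_A=event_a, event_observed_B=event_b)
print(result.test_statistic, result.p_value)
```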

  20. A wind proxy based on migrating dunes at the Baltic coast: statistical analysis of the link between wind conditions and sand movement

    NASA Astrophysics Data System (ADS)

    Bierstedt, Svenja E.; Hünicke, Birgit; Zorita, Eduardo; Ludwig, Juliane

    2017-07-01

    We statistically analyse the relationship between the structure of migrating dunes in the southern Baltic and the driving wind conditions over the past 26 years, with the long-term aim of using migrating dunes as a proxy for past wind conditions at an interannual resolution. The present analysis is based on the dune record derived from geo-radar measurements by Ludwig et al. (2017). The dune system is located at the Baltic Sea coast of Poland and is migrating from west to east along the coast. The dunes present layers with different thicknesses that can be assigned to absolute dates at interannual timescales and put in relation to seasonal wind conditions. To statistically analyse this record and calibrate it as a wind proxy, we used a gridded regional meteorological reanalysis data set (coastDat2) covering recent decades. The identified link between the dune annual layers and wind conditions was additionally supported by the co-variability between dune layers and observed sea level variations in the southern Baltic Sea. We include precipitation and temperature in our analysis, in addition to wind, to learn more about the dependency between these three atmospheric factors and their common influence on the dune system. We set up a statistical linear model based on the correlation between the frequency of days with specific wind conditions in a given season and dune migration velocities derived for that season. To some extent, the dune records can be seen as analogous to tree-ring width records, and hence we use a proxy validation method usually applied in dendrochronology when the observational record is short: cross-validation with the leave-one-out method. The correlations between the wind record from the reanalysis and the wind record derived from the dune structure are in the range of 0.28 to 0.63, yielding statistical validation skill similar to that of dendroclimatological records.
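
    A hedged sketch of the leave-one-out calibration and validation step described above, with hypothetical seasonal predictor counts (days with winds from the driving sector) and dune migration velocities; scikit-learn is assumed to be available.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# Hypothetical: days per season with winds from the driving sector, and dune migration velocity (m/season).
westerly_days = np.array([35, 42, 28, 50, 38, 45, 31, 47, 40, 33]).reshape(-1, 1)
migration     = np.array([4.1, 5.0, 3.2, 6.1, 4.5, 5.4, 3.6, 5.8, 4.8, 3.9])

pred = cross_val_predict(LinearRegression(), westerly_days, migration, cv=LeaveOneOut())
r = np.corrcoef(pred, migration)[0, 1]
print(f"leave-one-out validation correlation: {r:.2f}")
```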

  1. Modelling the effect of structural QSAR parameters on skin penetration using genetic programming

    NASA Astrophysics Data System (ADS)

    Chung, K. K.; Do, D. Q.

    2010-09-01

    In order to model relationships between chemical structures and biological effects in quantitative structure-activity relationship (QSAR) data, an alternative artificial intelligence technique, genetic programming (GP), was investigated and compared with the traditional statistical approach. GP, whose primary advantage is the generation of explicit mathematical equations, was employed to model QSAR data and to identify the most important molecular descriptors in QSAR data. The models produced by GP agreed with the statistical results, and ANOVA showed that the most predictive GP models were significantly improved compared with the statistical models. Recently, artificial intelligence techniques have been applied widely to analyse QSAR data. With the capability of generating mathematical equations, GP can be considered an effective and efficient method for modelling QSAR data.

  2. Size and shape measurement in contemporary cephalometrics.

    PubMed

    McIntyre, Grant T; Mossey, Peter A

    2003-06-01

    The traditional method of analysing cephalograms--conventional cephalometric analysis (CCA)--involves the calculation of linear distance measurements, angular measurements, area measurements, and ratios. Because shape information cannot be determined from these 'size-based' measurements, an increasing number of studies employ geometric morphometric tools in the cephalometric analysis of craniofacial morphology. Most of the discussions surrounding the appropriateness of CCA, Procrustes superimposition, Euclidean distance matrix analysis (EDMA), thin-plate spline analysis (TPS), finite element morphometry (FEM), elliptical Fourier functions (EFF), and medial axis analysis (MAA) have centred upon mathematical and statistical arguments. Surprisingly, little information is available to assist the orthodontist in the clinical relevance of each technique. This article evaluates the advantages and limitations of the above methods currently used to analyse the craniofacial morphology on cephalograms and investigates their clinical relevance and possible applications.

  3. How to Get Statistically Significant Effects in Any ERP Experiment (and Why You Shouldn’t)

    PubMed Central

    Luck, Steven J.; Gaspelin, Nicholas

    2016-01-01

    Event-related potential (ERP) experiments generate massive data sets, often containing thousands of values for each participant, even after averaging. The richness of these data sets can be very useful in testing sophisticated hypotheses, but this richness also creates many opportunities to obtain effects that are statistically significant but do not reflect true differences among groups or conditions (bogus effects). The purpose of this paper is to demonstrate how common and seemingly innocuous methods for quantifying and analyzing ERP effects can lead to very high rates of significant-but-bogus effects, with the likelihood of obtaining at least one such bogus effect exceeding 50% in many experiments. We focus on two specific problems: using the grand average data to select the time windows and electrode sites for quantifying component amplitudes and latencies, and using one or more multi-factor statistical analyses. Re-analyses of prior data and simulations of typical experimental designs are used to show how these problems can greatly increase the likelihood of significant-but-bogus results. Several strategies are described for avoiding these problems and for increasing the likelihood that significant effects actually reflect true differences among groups or conditions. PMID:28000253
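
    A toy simulation of the core point, under the simplifying assumption of independent tests on pure-noise data: with many quantification choices (time windows, electrode sites, factor combinations), the chance of at least one "significant" but bogus effect grows rapidly even when no true effect exists.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)
n_experiments, n_tests, n_per_group = 2000, 20, 24

false_positive_any = 0
for _ in range(n_experiments):
    # 20 independent comparisons per "experiment", all on null (no true effect) data.
    p = [ttest_ind(rng.normal(size=n_per_group), rng.normal(size=n_per_group)).pvalue
         for _ in range(n_tests)]
    false_positive_any += any(np.array(p) < 0.05)

print(f"experiments with >=1 bogus significant effect: {false_positive_any / n_experiments:.0%}")
# Roughly 1 - 0.95**20, i.e. about 64%, when the 20 tests are independent.
```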

  4. Nonnormality and Divergence in Posttreatment Alcohol Use

    PubMed Central

    Witkiewitz, Katie; van der Maas, Han L. J.; Hufford, Michael R.; Marlatt, G. Alan

    2007-01-01

    Alcohol lapses are the modal outcome following treatment for alcohol use disorders, yet many alcohol researchers have encountered limited success in the prediction and prevention of relapse. One hypothesis is that lapses are unpredictable, but another possibility is that the complexity of the relapse process is not captured by traditional statistical methods. Data from Project Matching Alcohol Treatments to Client Heterogeneity (Project MATCH), a multisite alcohol treatment study, were reanalyzed with 2 statistical methodologies: catastrophe modeling and 2-part growth mixture modeling. Drawing on previous investigations of self-efficacy as a dynamic predictor of relapse, the current study revisits the self-efficacy matching hypothesis, which was not statistically supported in Project MATCH. Results from both the catastrophe and growth mixture analyses demonstrated a dynamic relationship between self-efficacy and drinking outcomes. The growth mixture analyses provided evidence in support of the original matching hypothesis: individuals with lower self-efficacy who received cognitive behavior therapy drank far less frequently than did those with low self-efficacy who received motivational therapy. These results highlight the dynamical nature of the relapse process and the importance of using methodologies that accommodate this complexity when evaluating treatment outcomes. PMID:17516769

  5. Statistical Analysis of Categorical Time Series of Atmospheric Elementary Circulation Mechanisms - Dzerdzeevski Classification for the Northern Hemisphere

    PubMed Central

    Brenčič, Mihael

    2016-01-01

    Northern hemisphere elementary circulation mechanisms, defined with the Dzerdzeevski classification and published on a daily basis from 1899–2012, are analysed with statistical methods as continuous categorical time series. The classification consists of 41 elementary circulation mechanisms (ECM), which are assigned to calendar days. Empirical marginal probabilities of each ECM were determined. Seasonality and the periodicity effect were investigated with moving dispersion filters and a randomisation procedure on the ECM categories, as well as with time analyses of the ECM mode. The time series were determined to be non-stationary with strong time-dependent trends. During the investigated period, periodicity interchanges with periods when no seasonality is present. In the time series structure, the strongest division is visible at the milestone of 1986, showing that the atmospheric circulation pattern reflected in the ECM has significantly changed. This change is a result of a change in the frequency of ECM categories; before 1986, the appearance of ECMs was more diverse, whereas afterwards fewer ECMs appear. The statistical approach applied to these categorical climatic time series opens up potential new insights for the climate variability and change studies that have to be performed in the future. PMID:27116375

  6. Statistical analysis of major ion and trace element geochemistry of water, 1986-2006, at seven wells transecting the freshwater/saline-water interface of the Edwards Aquifer, San Antonio, Texas

    USGS Publications Warehouse

    Mahler, Barbara J.

    2008-01-01

    The statistical analyses taken together indicate that the geochemistry at the freshwater-zone wells is more variable than that at the transition-zone wells. The geochemical variability at the freshwater-zone wells might result from dilution of ground water by meteoric water. This is indicated by relatively constant major ion molar ratios; a preponderance of positive correlations between SC, major ions, and trace elements; and a principal components analysis in which the major ions are strongly loaded on the first principal component. Much of the variability at three of the four transition-zone wells might result from the use of different laboratory analytical methods or reporting procedures during the period of sampling. This is reflected by a lack of correlation between SC and major ion concentrations at the transition-zone wells and by a principal components analysis in which the variability is fairly evenly distributed across several principal components. The statistical analyses further indicate that, although the transition-zone wells are less well connected to surficial hydrologic conditions than the freshwater-zone wells, there is some connection but the response time is longer. 

  7. Statistical Analysis of Categorical Time Series of Atmospheric Elementary Circulation Mechanisms - Dzerdzeevski Classification for the Northern Hemisphere.

    PubMed

    Brenčič, Mihael

    2016-01-01

    Northern hemisphere elementary circulation mechanisms, defined with the Dzerdzeevski classification and published on a daily basis from 1899-2012, are analysed with statistical methods as continuous categorical time series. The classification consists of 41 elementary circulation mechanisms (ECM), which are assigned to calendar days. Empirical marginal probabilities of each ECM were determined. Seasonality and the periodicity effect were investigated with moving dispersion filters and a randomisation procedure on the ECM categories, as well as with time analyses of the ECM mode. The time series were determined to be non-stationary with strong time-dependent trends. During the investigated period, periodicity interchanges with periods when no seasonality is present. In the time series structure, the strongest division is visible at the milestone of 1986, showing that the atmospheric circulation pattern reflected in the ECM has significantly changed. This change is a result of a change in the frequency of ECM categories; before 1986, the appearance of ECMs was more diverse, whereas afterwards fewer ECMs appear. The statistical approach applied to these categorical climatic time series opens up potential new insights for the climate variability and change studies that have to be performed in the future.

  8. Considerations in the statistical analysis of clinical trials in periodontitis.

    PubMed

    Imrey, P B

    1986-05-01

    Adult periodontitis has been described as a chronic infectious process exhibiting sporadic, acute exacerbations which cause quantal, localized losses of dental attachment. Many analytic problems of periodontal trials are similar to those of other chronic diseases. However, the episodic, localized, infrequent, and relatively unpredictable behavior of exacerbations, coupled with measurement error difficulties, causes some specific problems. Considerable controversy exists as to the proper selection and treatment of multiple-site data from the same patient for group comparisons for epidemiologic or therapeutic evaluative purposes. This paper comments, with varying degrees of emphasis, on several issues pertinent to the analysis of periodontal trials. Considerable attention is given to the ways in which measurement variability may distort analytic results. Statistical treatments of multiple-site data for descriptive summaries are distinguished from treatments for formal statistical inference to validate therapeutic effects. Evidence suggesting that sites behave independently is contested. For inferential analyses directed at therapeutic or preventive effects, analytic models based on site independence are deemed unsatisfactory. Methods of summarization that may yield more powerful analyses than all-site mean scores, while retaining appropriate treatment of inter-site associations, are suggested. Brief comments and opinions on an assortment of other issues in clinical trial analysis are offered.

  9. Quantifying the impact of between-study heterogeneity in multivariate meta-analyses

    PubMed Central

    Jackson, Dan; White, Ian R; Riley, Richard D

    2012-01-01

    Measures that quantify the impact of heterogeneity in univariate meta-analysis, including the very popular I2 statistic, are now well established. Multivariate meta-analysis, where studies provide multiple outcomes that are pooled in a single analysis, is also becoming more commonly used. The question of how to quantify heterogeneity in the multivariate setting is therefore raised. It is the univariate R2 statistic, the ratio of the variance of the estimated treatment effect under the random and fixed effects models, that generalises most naturally, so this statistic provides our basis. This statistic is then used to derive a multivariate analogue of I2. We also provide a multivariate H2 statistic, the ratio of a generalisation of Cochran's heterogeneity statistic and its associated degrees of freedom, with an accompanying generalisation of the usual I2 statistic. Our proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis. We apply our methods to some real datasets and show how our statistics are equally appropriate in the context of multivariate meta-regression, where study level covariate effects are included in the model. Our heterogeneity statistics may be used when applying any procedure for fitting the multivariate random effects model. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22763950
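
    For reference, the familiar univariate quantities that these multivariate measures generalise can be computed directly from study estimates and standard errors; a short sketch with hypothetical inputs:

```python
import numpy as np

# Hypothetical study effect estimates and their standard errors.
y  = np.array([0.21, 0.35, 0.10, 0.42, 0.28])
se = np.array([0.10, 0.15, 0.12, 0.20, 0.11])

w     = 1.0 / se ** 2                     # fixed-effect (inverse-variance) weights
y_bar = np.sum(w * y) / np.sum(w)         # pooled fixed-effect estimate
Q     = np.sum(w * (y - y_bar) ** 2)      # Cochran's heterogeneity statistic
df    = len(y) - 1
H2    = Q / df
I2    = max(0.0, (Q - df) / Q) * 100.0    # percentage of variability due to heterogeneity

print(f"Q = {Q:.2f}, H2 = {H2:.2f}, I2 = {I2:.1f}%")
```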

  10. Biomechanical Analysis of Military Boots. Phase 1. Materials Testing of Military and Commercial Footwear

    DTIC Science & Technology

    1992-10-01

    [Indexed report excerpt; no full abstract available. The recoverable fragments refer to appendix tables of summary statistics and results of statistical analyses for impact tests performed on the forefoot of unworn and worn footwear, and to tests assessing heel and forefoot shock absorption, upper and sole durability, and flexibility (Cavanagh, 1978).]

  11. [Methods, challenges and opportunities for big data analyses of microbiome].

    PubMed

    Sheng, Hua-Fang; Zhou, Hong-Wei

    2015-07-01

    The microbiome is a novel research field related to a variety of chronic inflammatory diseases. Technically, there are two major approaches to the analysis of the microbiome: metataxonomics, by sequencing the 16S rRNA variable tags, and metagenomics, by shotgun sequencing of the total microbial (mainly bacterial) genome mixture. The 16S rRNA sequencing analysis pipeline includes sequence quality control, diversity analyses, taxonomy and statistics; metagenome analyses further include gene annotation and functional analyses. With the development of sequencing techniques, the cost of sequencing will decrease, and big data analyses will become the central task. Data standardization, accumulation, modeling and disease prediction are crucial for future exploitation of these data. Meanwhile, the informational properties of these data, and their functional verification with culture-dependent and culture-independent experiments, remain the focus of future research. Studies of the human microbiome will bring a better understanding of the relations between the human body and the microbiome, especially in the context of disease diagnosis and therapy, which promise rich research opportunities.

  12. A simple method to accurately position Port-A-Cath without the aid of intraoperative fluoroscopy or other localizing devices.

    PubMed

    Horng, Huann-Cheng; Yuan, Chiou-Chung; Chao, Kuan-Chong; Cheng, Ming-Huei; Wang, Peng-Hui

    2007-06-01

    To evaluate the efficacy and acceptability of Port-A-Cath (PAC) insertion with (conventional method, group II) and without (modified method, group I) the aid of intraoperative fluoroscopy or other localizing devices. A total of 158 women with various kinds of gynecological cancers warranting PAC insertion (n = 86 in group I and n = 72 in group II, respectively) were evaluated. Data for analyses included patient age, main disease, dislocation site, surgical time, complications, and catheter outcome. There was no statistical difference between the two groups in terms of age, main disease, complications, and catheter patency. However, appropriate positioning in the superior vena cava (SVC) (100% in group I and 82% in group II) showed a statistically significant difference between the two groups (P = 0.001). In addition, the surgical time in group I was statistically shorter than that in group II (P < 0.001). The modified method for inserting the PAC offered the following benefits: avoidance of X-ray exposure for both the operator and the patient, appropriate positioning in the SVC, and shorter surgical time. (c) 2007 Wiley-Liss, Inc.

  13. Pseudoautosomal region in schizophrenia: linkage analysis of seven loci by sib-pair and lod-score methods.

    PubMed

    d'Amato, T; Waksman, G; Martinez, M; Laurent, C; Gorwood, P; Campion, D; Jay, M; Petit, C; Savoye, C; Bastard, C

    1994-05-01

    In a previous study, we reported a nonrandom segregation between schizophrenia and the pseudoautosomal locus DXYS14 in a sample of 33 sibships. That study has been extended by the addition of 16 new sibships from 16 different families. Data from six other loci of the pseudoautosomal region and of the immediately adjacent part of the X specific region have also been analyzed. Two methods of linkage analysis were used: the affected sibling pair (ASP) method and the lod-score method. Lod-score analyses were performed on the basis of three different models--A, B, and C--all shown to be consistent with the epidemiological data on schizophrenia. No clear evidence for linkage was obtained with any of these models. However, whatever the genetic model and the disease classification, maximum lod scores were positive with most of the markers, with the highest scores generally being obtained for the DXYS14 locus. When the ASP method was used, the earlier finding of nonrandom segregation between schizophrenia and the DXYS14 locus was still supported in this larger data set, at an increased level of statistical significance. Findings of ASP analyses were not significant for the other loci. Thus, findings obtained from analyses using the ASP method, but not the lod-score method, were consistent with the pseudoautosomal hypothesis for schizophrenia.

  14. Power, effects, confidence, and significance: an investigation of statistical practices in nursing research.

    PubMed

    Gaskin, Cadeyrn J; Happell, Brenda

    2014-05-01

    To (a) assess the statistical power of nursing research to detect small, medium, and large effect sizes; (b) estimate the experiment-wise Type I error rate in these studies; and (c) assess the extent to which (i) a priori power analyses, (ii) effect sizes (and interpretations thereof), and (iii) confidence intervals were reported. Statistical review. Papers published in the 2011 volumes of the 10 highest ranked nursing journals, based on their 5-year impact factors. Papers were assessed for statistical power, control of experiment-wise Type I error, reporting of a priori power analyses, reporting and interpretation of effect sizes, and reporting of confidence intervals. The analyses were based on 333 papers, from which 10,337 inferential statistics were identified. The median power to detect small, medium, and large effect sizes was .40 (interquartile range [IQR]=.24-.71), .98 (IQR=.85-1.00), and 1.00 (IQR=1.00-1.00), respectively. The median experiment-wise Type I error rate was .54 (IQR=.26-.80). A priori power analyses were reported in 28% of papers. Effect sizes were routinely reported for Spearman's rank correlations (100% of papers in which this test was used), Poisson regressions (100%), odds ratios (100%), Kendall's tau correlations (100%), Pearson's correlations (99%), logistic regressions (98%), structural equation modelling/confirmatory factor analyses/path analyses (97%), and linear regressions (83%), but were reported less often for two-proportion z tests (50%), analyses of variance/analyses of covariance/multivariate analyses of variance (18%), t tests (8%), Wilcoxon's tests (8%), Chi-squared tests (8%), and Fisher's exact tests (7%), and not reported for sign tests, Friedman's tests, McNemar's tests, multi-level models, and Kruskal-Wallis tests. Effect sizes were infrequently interpreted. Confidence intervals were reported in 28% of papers. The use, reporting, and interpretation of inferential statistics in nursing research need substantial improvement. Most importantly, researchers should abandon the misleading practice of interpreting the results from inferential tests based solely on whether they are statistically significant (or not) and, instead, focus on reporting and interpreting effect sizes, confidence intervals, and significance levels. Nursing researchers also need to conduct and report a priori power analyses, and to address the issue of Type I experiment-wise error inflation in their studies. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
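
    A short sketch of the two calculations the authors argue should be routine, using statsmodels for an a priori power analysis and the standard independence formula for experiment-wise Type I error; the effect size and number of tests below are illustrative assumptions.

```python
from statsmodels.stats.power import TTestIndPower

# A priori power analysis: sample size per group needed to detect a medium effect
# (d = 0.5) with alpha = 0.05 and 80% power in a two-group comparison.
n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"required n per group: {n_per_group:.0f}")

# Experiment-wise Type I error for k unadjusted, independent tests at alpha = 0.05 each
# (roughly 31 inferential statistics per paper, i.e. 10,337/333, in the papers reviewed).
k = 31
print(f"experiment-wise Type I error: {1 - (1 - 0.05) ** k:.2f}")
```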

  15. Missing data and multiple imputation in clinical epidemiological research.

    PubMed

    Pedersen, Alma B; Mikkelsen, Ellen M; Cronin-Fenton, Deirdre; Kristensen, Nickolaj R; Pham, Tra My; Pedersen, Lars; Petersen, Irene

    2017-01-01

    Missing data are ubiquitous in clinical epidemiological research. Individuals with missing data may differ from those with no missing data in terms of the outcome of interest and prognosis in general. Missing data are often categorized into the following three types: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). In clinical epidemiological research, missing data are seldom MCAR. Missing data can constitute considerable challenges in the analyses and interpretation of results and can potentially weaken the validity of results and conclusions. A number of methods have been developed for dealing with missing data. These include complete-case analyses, missing indicator method, single value imputation, and sensitivity analyses incorporating worst-case and best-case scenarios. If applied under the MCAR assumption, some of these methods can provide unbiased but often less precise estimates. Multiple imputation is an alternative method to deal with missing data, which accounts for the uncertainty associated with missing data. Multiple imputation is implemented in most statistical software under the MAR assumption and provides unbiased and valid estimates of associations based on information from the available data. The method affects not only the coefficient estimates for variables with missing data but also the estimates for other variables with no missing data.
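
    As an illustration of the multiple-imputation idea (a minimal sketch, not tied to any particular study in this record), scikit-learn's IterativeImputer can be used to create several completed datasets and pool an estimate across them; variable names are hypothetical and the pooling shown is simplified (full Rubin's rules also combine within- and between-imputation variance).

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401  (activates the estimator)
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                      # hypothetical dataset: 3 covariates
X[rng.random(X.shape) < 0.15] = np.nan             # ~15% of values missing (assumed MAR)

estimates = []
for m in range(5):                                 # 5 imputed datasets
    imputer = IterativeImputer(sample_posterior=True, random_state=m)
    X_complete = imputer.fit_transform(X)
    estimates.append(X_complete[:, 0].mean())      # analysis of interest, here a simple mean

print("pooled estimate:", np.mean(estimates))
```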

  16. Missing data and multiple imputation in clinical epidemiological research

    PubMed Central

    Pedersen, Alma B; Mikkelsen, Ellen M; Cronin-Fenton, Deirdre; Kristensen, Nickolaj R; Pham, Tra My; Pedersen, Lars; Petersen, Irene

    2017-01-01

    Missing data are ubiquitous in clinical epidemiological research. Individuals with missing data may differ from those with no missing data in terms of the outcome of interest and prognosis in general. Missing data are often categorized into the following three types: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). In clinical epidemiological research, missing data are seldom MCAR. Missing data can constitute considerable challenges in the analyses and interpretation of results and can potentially weaken the validity of results and conclusions. A number of methods have been developed for dealing with missing data. These include complete-case analyses, missing indicator method, single value imputation, and sensitivity analyses incorporating worst-case and best-case scenarios. If applied under the MCAR assumption, some of these methods can provide unbiased but often less precise estimates. Multiple imputation is an alternative method to deal with missing data, which accounts for the uncertainty associated with missing data. Multiple imputation is implemented in most statistical software under the MAR assumption and provides unbiased and valid estimates of associations based on information from the available data. The method affects not only the coefficient estimates for variables with missing data but also the estimates for other variables with no missing data. PMID:28352203

  17. Truths, lies, and statistics.

    PubMed

    Thiese, Matthew S; Walker, Skyler; Lindsey, Jenna

    2017-10-01

    Distribution of valuable research discoveries is needed for the continual advancement of patient care. Publication of, and subsequent reliance on, false study results would be detrimental to patient care. Unfortunately, research misconduct may originate from many sources. While there is evidence of ongoing research misconduct in all its forms, it is challenging to identify the actual occurrence of research misconduct, which is especially true for misconduct in clinical trials. Research misconduct is challenging to measure and there are few studies reporting the prevalence or underlying causes of research misconduct among biomedical researchers. Reported prevalence estimates of misconduct are probably underestimates, and range from 0.3% to 4.9%. There have been efforts to measure the prevalence of research misconduct; however, the relatively few published studies are not freely comparable because of varying characterizations of research misconduct and the methods used for data collection. There are some signs which may point to an increased possibility of research misconduct; however, there is a need for continued self-policing by biomedical researchers. There are existing resources to assist in ensuring appropriate statistical methods and preventing other types of research fraud. These include the "Statistical Analyses and Methods in the Published Literature", also known as the SAMPL guidelines, which help scientists determine the appropriate way to report various statistical methods; the "Strengthening Analytical Thinking for Observational Studies", or STRATOS, which emphasizes the execution and interpretation of results; and the Committee on Publication Ethics (COPE), which was created in 1997 to deliver guidance about publication ethics. COPE offers a set of views and strategies grounded in the values of honesty and accuracy.

  18. A DNA microarray-based methylation-sensitive (MS)-AFLP hybridization method for genetic and epigenetic analyses.

    PubMed

    Yamamoto, F; Yamamoto, M

    2004-07-01

    We previously developed a PCR-based DNA fingerprinting technique named the Methylation Sensitive (MS)-AFLP method, which permits comparative genome-wide scanning of methylation status with a manageable number of fingerprinting experiments. The technique uses the methylation sensitive restriction enzyme NotI in the context of the existing Amplified Fragment Length Polymorphism (AFLP) method. Here we report the successful conversion of this gel electrophoresis-based DNA fingerprinting technique into a DNA microarray hybridization technique (DNA Microarray MS-AFLP). By performing a total of 30 (15 x 2 reciprocal labeling) DNA Microarray MS-AFLP hybridization experiments on genomic DNA from two breast and three prostate cancer cell lines in all pairwise combinations, and Southern hybridization experiments using more than 100 different probes, we have demonstrated that the DNA Microarray MS-AFLP is a reliable method for genetic and epigenetic analyses. No statistically significant differences were observed in the number of differences between the breast-prostate hybridization experiments and the breast-breast or prostate-prostate comparisons.

  19. [Cluster analysis applicability to fitness evaluation of cosmonauts on long-term missions of the International space station].

    PubMed

    Egorov, A D; Stepantsov, V I; Nosovskiĭ, A M; Shipov, A A

    2009-01-01

    Cluster analysis was applied to evaluate the locomotion training (running, and running intermingled with walking) of 13 cosmonauts on long-term ISS missions by the parameters of duration (min), distance (m) and intensity (km/h). Based on the results of the analyses, the cosmonauts were distributed into three stable groups of 2, 5 and 6 persons. Distance and speed showed a statistically significant rise (p < 0.03) from group 1 to group 3. Duration of physical locomotion training was not statistically different among the groups (p = 0.125). Therefore, cluster analysis is an adequate method for evaluating the fitness of cosmonauts on long-term missions.
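
    A hedged sketch of a comparable clustering step; the specific algorithm used in the study is not stated in this record, so k-means on the three training parameters is assumed here purely for illustration, with hypothetical values.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-cosmonaut session averages: duration (min), distance (m), intensity (km/h).
X = np.array([
    [32, 3100, 7.2], [35, 3400, 7.8], [30, 2600, 6.5], [28, 2400, 6.1],
    [34, 3900, 8.9], [33, 4100, 9.3], [31, 2500, 6.3], [36, 4300, 9.6],
    [29, 2700, 6.8], [33, 3600, 8.2], [34, 4000, 9.0], [30, 2800, 6.9],
    [32, 3500, 8.0],
])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(StandardScaler().fit_transform(X))
print(labels)   # group membership for the 13 cosmonauts
```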

  20. Integrated Data Collection Analysis (IDCA) Program - Statistical Analysis of RDX Standard Data Sets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sandstrom, Mary M.; Brown, Geoffrey W.; Preston, Daniel N.

    2015-10-30

    The Integrated Data Collection Analysis (IDCA) program is conducting a Proficiency Test for Small-Scale Safety and Thermal (SSST) testing of homemade explosives (HMEs). Described here are statistical analyses of the results for impact, friction, electrostatic discharge, and differential scanning calorimetry analysis of the RDX Type II Class 5 standard. The material was tested as a well-characterized standard several times during the proficiency study to assess differences among participants and the range of results that may arise for well-behaved explosive materials. The analyses show that there are detectable differences among the results from IDCA participants. While these differences are statistically significant, most of them can be disregarded for comparison purposes to assess potential variability when laboratories attempt to measure identical samples using methods assumed to be nominally the same. The results presented in this report include the average sensitivity results for the IDCA participants and the ranges of values obtained. The ranges represent variation about the mean values of the tests of between 26% and 42%. The magnitude of this variation is attributed to differences in operator, method, and environment as well as the use of different instruments that are also of varying age. The results appear to be a good representation of the broader safety testing community based on the range of methods, instruments, and environments included in the IDCA Proficiency Test.

  1. Effect of an EBM course in combination with case method learning sessions: an RCT on professional performance, job satisfaction, and self-efficacy of occupational physicians.

    PubMed

    Hugenholtz, Nathalie I R; Schaafsma, Frederieke G; Nieuwenhuijsen, Karen; van Dijk, Frank J H

    2008-10-01

    An intervention consisting of an evidence-based medicine (EBM) course in combination with case method learning sessions (CMLSs) was designed to enhance the professional performance, self-efficacy and job satisfaction of occupational physicians. A cluster randomized controlled trial was set up and data were collected through questionnaires at baseline (T0), directly after the intervention (T1) and 7 months after baseline (T2). The data of the intervention group [T0 (n = 49), T1 (n = 31), T2 (n = 29)] and control group [T0 (n = 49), T1 (n = 28), T2 (n = 28)] were analysed in mixed model analyses. Mean scores of the perceived value of the CMLSs were calculated in the intervention group. The overall effect of the intervention over time, comparing the intervention with the control group, was statistically significant for professional performance (p < 0.001). Changes in job satisfaction and self-efficacy were small and not statistically significant between the groups. The perceived value of the CMLSs to gain new insights and to improve the quality of performance increased with the number of sessions followed. An EBM course in combination with case method learning sessions is perceived as valuable and offers evidence to enhance the professional performance of occupational physicians. However, it does not seem to influence their self-efficacy and job satisfaction.

  2. Evaluation of the validity of the Bolton Index using cone-beam computed tomography (CBCT)

    PubMed Central

    Llamas, José M.; Cibrián, Rosa; Gandía, José L.; Paredes, Vanessa

    2012-01-01

    Aims: To evaluate the reliability and reproducibility of calculating the Bolton Index using cone-beam computed tomography (CBCT), and to compare this with measurements obtained using the 2D Digital Method. Material and Methods: Traditional study models were obtained from 50 patients, which were then digitized in order to be able to measure them using the Digital Method. Likewise, CBCTs of those same patients were undertaken using the Dental Picasso Master 3D® and the images obtained were then analysed using the InVivoDental programme. Results: By determining the regression lines for both measurement methods, as well as the difference between both of their values, the two methods are shown to be comparable, despite the fact that the measurements analysed presented statistically significant differences. Conclusions: The three-dimensional models obtained from the CBCT are as accurate and reproducible as the digital models obtained from the plaster study casts for calculating the Bolton Index. The differences existing between both methods were clinically acceptable. Key words:Tooth-size, digital models, bolton index, CBCT. PMID:22549690

  3. Estimation of gene induction enables a relevance-based ranking of gene sets.

    PubMed

    Bartholomé, Kilian; Kreutz, Clemens; Timmer, Jens

    2009-07-01

    In order to handle and interpret the vast amounts of data produced by microarray experiments, the analysis of sets of genes with a common biological functionality has been shown to be advantageous compared to single gene analyses. Some statistical methods have been proposed to analyse the differential gene expression of gene sets in microarray experiments. However, most of these methods either require threshold values to be chosen for the analysis, or they need some reference set for the determination of significance. We present a method that estimates the number of differentially expressed genes in a gene set without requiring a threshold value for the significance of genes. The method is self-contained (i.e., it does not require a reference set for comparison). In contrast to other methods, which are focused on significance, our approach emphasizes the relevance of the regulation of gene sets. The presented method measures the degree of regulation of a gene set and is a useful tool to compare the induction of different gene sets and to place the results of microarray experiments into their biological context. An R-package is available.

  4. Manual vs. computer-assisted sperm analysis: can CASA replace manual assessment of human semen in clinical practice?

    PubMed

    Talarczyk-Desole, Joanna; Berger, Anna; Taszarek-Hauke, Grażyna; Hauke, Jan; Pawelczyk, Leszek; Jedrzejczak, Piotr

    2017-01-01

    The aim of the study was to check the quality of the computer-assisted sperm analysis (CASA) system in comparison to the reference manual method, as well as the standardization of computer-assisted semen assessment. The study was conducted between January and June 2015 at the Andrology Laboratory of the Division of Infertility and Reproductive Endocrinology, Poznań University of Medical Sciences, Poland. The study group consisted of 230 men who gave sperm samples for the first time in our center as part of an infertility investigation. The samples underwent manual and computer-assisted assessment of concentration, motility and morphology. A total of 184 samples were examined twice: manually, according to the 2010 WHO recommendations, and with CASA, using the program settings provided by the manufacturer. Additionally, 46 samples underwent two manual analyses and two computer-assisted analyses. A p-value < 0.05 was considered statistically significant. Statistically significant differences were found between all of the investigated sperm parameters, except for non-progressive motility, measured with CASA and manually. In the group of patients in whom all analyses with each method were performed twice on the same sample, we found no significant differences between the two assessments of the same sample, neither in the samples analyzed manually nor in those analyzed with CASA, although the standard deviation was higher in the CASA group. Our results suggest that computer-assisted sperm analysis requires further improvement for wider application in clinical practice.
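
    The record does not state which paired test was used; as a generic illustration of comparing the two assessments of the same samples, a minimal sketch with invented concentration values:

        import numpy as np
        from scipy.stats import wilcoxon

        manual = np.array([22.5, 48.0, 15.2, 60.1, 33.4, 12.8])   # sperm concentration, manual (10^6/ml)
        casa   = np.array([25.1, 46.2, 18.0, 63.5, 30.9, 14.4])   # same samples assessed with CASA

        # Paired non-parametric comparison of the two methods on the same samples
        stat, p = wilcoxon(manual, casa)
        print(f"Wilcoxon signed-rank: W = {stat:.1f}, p = {p:.3f}")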

  5. Multiple animal studies for medical chemical defense program in soldier/patient decontamination and drug development on task 85-17: Validation of an analytical method for the detection of soman (GD), mustard (HD), tabun (GA), and VX in wastewater samples. Final report, 13 October 1985-1 January 1989

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Joiner, R.L.; Hayes, L.; Rust, W.

    1989-05-01

    The following report summarizes the development and validation of an analytical method for the analysis of soman (GD), mustard (HD), VX, and tabun (GA) in wastewater. An analytical method that can detect GD, HD, VX, and GA with the necessary sensitivity (< 20 parts per billion (ppb)) and selectivity is essential to Medical Research and Evaluation Facility (MREF) operations. The analytical data were generated using liquid-liquid extraction of the wastewater, with the extract being concentrated and analyzed by gas chromatography (GC) methods. The sample preparation and analysis methods were developed in support of ongoing activities within the MREF. We have documented the precision and accuracy of the analytical method through an expected working calibration range (3.0 to 60 ppb). The analytical method was statistically evaluated over a range of concentrations to establish a detection limit and quantitation limit for the method. Whenever the true concentration is 8.5 ppb or above, the probability is at least 99.9 percent that the measured concentration will be 6 ppb or above. Thus, 6 ppb could be used as a lower reliability limit for detecting concentrations in excess of 8.5 ppb. In summary, the proposed sample extraction and analysis methods are suitable for quantitative analyses to determine the presence of GD, HD, VX, and GA in wastewater samples. Our findings indicate that we can detect any of these chemical surety materiel (CSM) in water at or below the established U.S. Army Surgeon General's safety levels in drinking water.
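
    The report's exact statistical procedure for the detection and quantitation limits is not given in this record; one conventional approach (ICH-style, shown purely as an illustration with invented calibration data) derives them from the calibration regression:

        import numpy as np

        conc = np.array([3, 6, 12, 24, 48, 60], dtype=float)          # spiked concentrations, ppb (invented)
        resp = np.array([410, 806, 1630, 3210, 6480, 8050], dtype=float)  # GC peak responses (invented)

        # Fit the calibration line and estimate the residual standard deviation
        slope, intercept = np.polyfit(conc, resp, 1)
        resid_sd = np.std(resp - (slope * conc + intercept), ddof=2)

        lod = 3.3 * resid_sd / slope    # limit of detection
        loq = 10.0 * resid_sd / slope   # limit of quantitation
        print(f"LOD ~ {lod:.1f} ppb, LOQ ~ {loq:.1f} ppb")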

  6. Initial evaluation of discrete orthogonal basis reconstruction of ECT images

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moody, E.B.; Donohue, K.D.

    1996-12-31

    Discrete orthogonal basis restoration (DOBR) is a linear, non-iterative, and robust method for solving inverse problems for systems characterized by shift-variant transfer functions. This simulation study evaluates the feasibility of using DOBR for reconstructing emission computed tomographic (ECT) images. The imaging system model uses typical SPECT parameters and incorporates the effects of attenuation, spatially-variant PSF, and Poisson noise in the projection process. Sample reconstructions and statistical error analyses for a class of digital phantoms compare the DOBR performance for Hartley and Walsh basis functions. Test results confirm that DOBR with either basis set produces images with good statistical properties. No problems were encountered with reconstruction instability. The flexibility of the DOBR method and its consistent performance warrant further investigation of DOBR as a means of ECT image reconstruction.
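
    A generic sketch of the underlying idea, restoring an unknown image by expanding it in a truncated discrete orthogonal basis and solving one small linear system (linear and non-iterative); this is not the specific DOBR formulation of the report, and the toy system matrix is invented:

        import numpy as np
        from scipy.linalg import hadamard

        # Toy shift-variant system: y = H x + noise, with a known 64x64 system matrix H
        n = 64
        rng = np.random.default_rng(1)
        H = np.tril(rng.random((n, n)))            # stand-in for a shift-variant projection operator
        x_true = np.zeros(n); x_true[20:30] = 1.0
        y = H @ x_true + rng.normal(scale=0.05, size=n)

        # Expand the unknown in the first k Walsh(-Hadamard) basis vectors and solve for coefficients
        k = 16
        B = hadamard(n)[:, :k] / np.sqrt(n)        # orthonormal truncated basis
        coeffs, *_ = np.linalg.lstsq(H @ B, y, rcond=None)
        x_rec = B @ coeffs                          # restored signal in the truncated basis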

  7. When ab ≠ c - c': published errors in the reports of single-mediator models.

    PubMed

    Petrocelli, John V; Clarkson, Joshua J; Whitmire, Melanie B; Moon, Paul E

    2013-06-01

    Accurate reports of mediation analyses are critical to the assessment of inferences related to causality, since these inferences are consequential for both the evaluation of previous research (e.g., meta-analyses) and the progression of future research. However, upon reexamination, approximately 15% of published articles in psychology contain at least one incorrect statistical conclusion (Bakker & Wicherts, Behavior Research Methods, 43, 666-678, 2011), disparities that beget the question of inaccuracy in mediation reports. To quantify this question of inaccuracy, articles reporting standard use of single-mediator models in three high-impact journals in personality and social psychology during 2011 were examined. More than 24% of the 156 models coded failed an equivalence test (i.e., ab = c - c'), suggesting that one or more regression coefficients in mediation analyses are frequently misreported. The authors cite common sources of errors, provide recommendations for enhanced accuracy in reports of single-mediator models, and discuss implications for alternative methods.
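
    The equivalence being checked is purely arithmetic; with invented coefficients from a single-mediator OLS model (a, b, c and c' are hypothetical values, not from any coded article):

        a, b = 0.50, 0.30        # X -> M path, and M -> Y path controlling for X
        c, c_prime = 0.40, 0.25  # total and direct effect of X on Y

        indirect = a * b
        # In OLS single-mediator models fitted on the same sample, ab and c - c' must agree.
        assert abs(indirect - (c - c_prime)) < 1e-9, "reported coefficients are inconsistent"
        print(f"ab = {indirect:.3f}, c - c' = {c - c_prime:.3f}")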

  8. Australasian Resuscitation In Sepsis Evaluation trial statistical analysis plan.

    PubMed

    Delaney, Anthony; Peake, Sandra L; Bellomo, Rinaldo; Cameron, Peter; Holdgate, Anna; Howe, Belinda; Higgins, Alisa; Presneill, Jeffrey; Webb, Steve

    2013-10-01

    The Australasian Resuscitation In Sepsis Evaluation (ARISE) study is an international, multicentre, randomised, controlled trial designed to evaluate the effectiveness of early goal-directed therapy compared with standard care for patients presenting to the ED with severe sepsis. In keeping with current practice, and taking into consideration aspects of trial design and reporting specific to non-pharmacologic interventions, this document outlines the principles and methods for analysing and reporting the trial results. The document is prepared prior to completion of recruitment into the ARISE study, without knowledge of the results of the interim analysis conducted by the data safety and monitoring committee and prior to completion of the two related international studies. The statistical analysis plan was designed by the ARISE chief investigators, and reviewed and approved by the ARISE steering committee. The data collected by the research team, as specified in the study protocol and detailed in the study case report form, were reviewed. Information related to baseline characteristics, characteristics of delivery of the trial interventions, details of resuscitation and other related therapies, and other relevant data are described with appropriate comparisons between groups. The primary, secondary and tertiary outcomes for the study are defined, with description of the planned statistical analyses. A statistical analysis plan was developed, along with a trial profile, mock-up tables and figures. A plan for presenting baseline characteristics, microbiological and antibiotic therapy, details of the interventions, processes of care and concomitant therapies, along with adverse events, is described. The primary, secondary and tertiary outcomes are described, along with identification of subgroups to be analysed. A statistical analysis plan for the ARISE study has been developed and is available in the public domain prior to the completion of recruitment into the study. This will minimise analytic bias and conforms to current best practice in conducting clinical trials. © 2013 Australasian College for Emergency Medicine and Australasian Society for Emergency Medicine.

  9. Differences in Reporting of Analyses in Internal Company Documents Versus Published Trial Reports: Comparisons in Industry-Sponsored Trials in Off-Label Uses of Gabapentin

    PubMed Central

    Vedula, S. Swaroop; Li, Tianjing; Dickersin, Kay

    2013-01-01

    Background Details about the type of analysis (e.g., intent to treat [ITT]) and definitions (i.e., criteria for including participants in the analysis) are necessary for interpreting a clinical trial's findings. Our objective was to compare the description of types of analyses and criteria for including participants in the publication (i.e., what was reported) with descriptions in the corresponding internal company documents (i.e., what was planned and what was done). Trials were for off-label uses of gabapentin sponsored by Pfizer and Parke-Davis, and documents were obtained through litigation. Methods and Findings For each trial, we compared internal company documents (protocols, statistical analysis plans, and research reports, all unpublished) with publications. One author extracted data and another verified, with a third person verifying discordant items and a sample of the rest. Extracted data included the number of participants randomized and analyzed for efficacy, and the types of analyses for efficacy and safety and their definitions (i.e., criteria for including participants in each type of analysis). We identified 21 trials, 11 of which were published randomized controlled trials that provided the documents needed for the planned comparisons. For three trials, there was disagreement on the number of randomized participants between the research report and the publication. Seven types of efficacy analyses were described in the protocols, statistical analysis plans, and publications, including ITT and six others. The protocol or publication described ITT using six different definitions, resulting in frequent disagreements between the two documents (i.e., different numbers of participants were included in the analyses). Conclusions Descriptions of analyses conducted did not agree between internal company documents and what was publicly reported. Internal company documents provide extensive documentation of methods planned and used, and of trial findings, and should be publicly accessible. Reporting standards for randomized controlled trials should recommend transparent descriptions and definitions of analyses performed and of which study participants are excluded. PMID:23382656

  10. Army Logistician. Volume 39, Issue 1, January-February 2007

    DTIC Science & Technology

    2007-02-01

    of electronic systems using statistical methods. P&C, however, requires advanced prognostic capabilities not only to detect the early onset of...patterns. Entities operating in a P&C-enabled environment will sense and understand contextual meaning, communicate their state and mission, and act to...accessing of historical and simulation patterns; on-board prognostics capabilities; physics of failure analyses; and predictive modeling. P&C also

  11. Superheavy-element spectroscopy: Correlations along element 115 decay chains

    NASA Astrophysics Data System (ADS)

    Rudolph, D.; Forsberg, U.; Sarmiento, L. G.; Golubev, P.; Fahlander, C.

    2016-05-01

    Following a brief summary of the region of the heaviest atomic nuclei yet created in the laboratory, data on more than a hundred α-decay chains associated with the production of element 115 are combined to investigate time and energy correlations along the observed decay chains. Several of these are analysed using a new method for statistical assessments of lifetimes in sets of decay chains.
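
    The new method itself is not spelled out in this record; as a generic illustration of this kind of statistical assessment, one standard check compares the spread of logarithmic decay times with the value expected for a single exponential activity (the times below are invented):

        import numpy as np

        # Hypothetical correlation times (s) attributed to one decay step along several chains
        times = np.array([0.17, 0.42, 0.08, 0.95, 0.31, 0.22, 0.60])

        log_t = np.log(times)
        # For one exponential decay, ln(t) has a fixed spread: sigma = pi / sqrt(6) ~= 1.28.
        # A sample spread far from this value hints that the times do not stem from a single activity.
        print(f"sample sigma of ln(t): {log_t.std(ddof=1):.2f}  (expected ~1.28 for one exponential)")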

  12. Reduction of Complications of Local Anaesthesia in Dental Healthcare Setups by Application of the Six Sigma Methodology: A Statistical Quality Improvement Technique

    PubMed Central

    Khatoon, Farheen

    2015-01-01

    Background Health care faces challenges due to complications, inefficiencies and other concerns that threaten the safety of patients. Aim The purpose of this study was to identify causes of complications encountered after administration of local anaesthesia for dental and oral surgical procedures, and to reduce the incidence of complications by introduction of the Six Sigma methodology. Materials and Methods The DMAIC (Define, Measure, Analyse, Improve and Control) process of Six Sigma was used to reduce the incidence of complications encountered after administration of local anaesthesia injections for dental and oral surgical procedures, using failure mode and effect analysis. Pareto analysis was used to identify the most recurring complications. A paired z-test, performed using Minitab, and Fisher's exact test were used to statistically analyse the obtained data; a p-value < 0.05 was considered significant. Results A total of 54 systemic and 62 local complications occurred during the three months of the analyse and measure phases. Syncope, failure of anaesthesia, trismus, self-inflicted bites (auto mordeduras) and pain at the injection site were found to be the most recurring complications. The cumulative defective percentage was 7.99 for the pre-improvement data and decreased to 4.58 in the control phase. The estimate for the difference was 0.0341228 and the 95% lower bound for the difference was 0.0193966; the p-value was highly significant (p < 0.001). Conclusion The application of the Six Sigma improvement methodology in healthcare tends to deliver consistently better results to patients as well as hospitals, and results in better patient compliance as well as satisfaction. PMID:26816989
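
    The pre/post comparison of defect proportions can be reproduced with a generic two-by-two test; the counts below are invented so that they roughly match the reported percentages, and are not the study's data:

        from scipy.stats import fisher_exact

        defective_pre, total_pre = 116, 1452     # ~7.99% defective before improvement (invented counts)
        defective_post, total_post = 66, 1441    # ~4.58% defective in the control phase (invented counts)

        table = [[defective_pre, total_pre - defective_pre],
                 [defective_post, total_post - defective_post]]
        odds_ratio, p = fisher_exact(table)
        diff = defective_pre / total_pre - defective_post / total_post
        print(f"difference in proportions = {diff:.4f}, Fisher's exact p = {p:.4f}")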

  13. Spatial variation of volcanic rock geochemistry in the Virunga Volcanic Province: Statistical analysis of an integrated database

    NASA Astrophysics Data System (ADS)

    Barette, Florian; Poppe, Sam; Smets, Benoît; Benbakkar, Mhammed; Kervyn, Matthieu

    2017-10-01

    We present an integrated, spatially-explicit database of existing geochemical major-element analyses available from (post-) colonial scientific reports, PhD Theses and international publications for the Virunga Volcanic Province, located in the western branch of the East African Rift System. This volcanic province is characterised by alkaline volcanism, including silica-undersaturated, alkaline and potassic lavas. The database contains a total of 908 geochemical analyses of eruptive rocks for the entire volcanic province with a localisation for most samples. A preliminary analysis of the overall consistency of the database, using statistical techniques on sets of geochemical analyses with contrasted analytical methods or dates, demonstrates that the database is consistent. We applied a principal component analysis and cluster analysis on whole-rock major element compositions included in the database to study the spatial variation of the chemical composition of eruptive products in the Virunga Volcanic Province. These statistical analyses identify spatially distributed clusters of eruptive products. The known geochemical contrasts are highlighted by the spatial analysis, such as the unique geochemical signature of Nyiragongo lavas compared to other Virunga lavas, the geochemical heterogeneity of the Bulengo area, and the trachyte flows of Karisimbi volcano. Most importantly, we identified separate clusters of eruptive products which originate from primitive magmatic sources. These lavas of primitive composition are preferentially located along NE-SW inherited rift structures, often at distance from the central Virunga volcanoes. Our results illustrate the relevance of a spatial analysis on integrated geochemical data for a volcanic province, as a complement to classical petrological investigations. This approach indeed helps to characterise geochemical variations within a complex of magmatic systems and to identify specific petrologic and geochemical investigations that should be tackled within a study area.
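
    A minimal sketch of the multivariate workflow described above (standardise major-element compositions, reduce with PCA, then cluster); the file name, oxide columns and number of clusters are assumptions for illustration only:

        import pandas as pd
        from sklearn.preprocessing import StandardScaler
        from sklearn.decomposition import PCA
        from sklearn.cluster import KMeans

        oxides = ["SiO2", "TiO2", "Al2O3", "FeOt", "MgO", "CaO", "Na2O", "K2O"]
        df = pd.read_csv("virunga_geochem.csv")   # hypothetical export of the integrated database

        # Principal components of the standardised major-element compositions
        scores = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(df[oxides]))
        # Cluster samples in PC space; clusters can then be mapped using the sample coordinates
        df["cluster"] = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(scores)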

  14. An Embedded Statistical Method for Coupling Molecular Dynamics and Finite Element Analyses

    NASA Technical Reports Server (NTRS)

    Saether, E.; Glaessgen, E.H.; Yamakov, V.

    2008-01-01

    The coupling of molecular dynamics (MD) simulations with finite element methods (FEM) yields computationally efficient models that link fundamental material processes at the atomistic level with continuum field responses at higher length scales. The theoretical challenge involves developing a seamless connection along an interface between two inherently different simulation frameworks. Various specialized methods have been developed to solve particular classes of problems. Many of these methods link the kinematics of individual MD atoms with FEM nodes at their common interface, necessarily requiring that the finite element mesh be refined to atomic resolution. Some of these coupling approaches also require simulations to be carried out at 0 K and restrict modeling to two-dimensional material domains due to difficulties in simulating full three-dimensional material processes. In the present work, a new approach to MD-FEM coupling is developed based on a restatement of the standard boundary value problem used to define a coupled domain. The method replaces a direct linkage of individual MD atoms and finite element (FE) nodes with a statistical averaging of atomistic displacements in local atomic volumes associated with each FE node in an interface region. The FEM and MD computational systems are effectively independent and communicate only through an iterative update of their boundary conditions. With the use of statistical averages of the atomistic quantities to couple the two computational schemes, the developed approach is referred to as an embedded statistical coupling method (ESCM). ESCM provides an enhanced coupling methodology that is inherently applicable to three-dimensional domains, avoids discretization of the continuum model to atomic scale resolution, and permits finite temperature states to be applied.
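
    A schematic of the nodal averaging step that the coupling relies on (a sketch of the idea only, not NASA's implementation); the displacement and assignment arrays are random stand-ins:

        import numpy as np

        rng = np.random.default_rng(0)
        atom_disp = rng.normal(scale=0.01, size=(1000, 3))   # displacements of interface atoms
        atom_node = rng.integers(0, 20, size=1000)           # index of the FE node owning each atom

        # Statistical average of atomic displacements within each nodal volume
        node_disp = np.array([atom_disp[atom_node == n].mean(axis=0) for n in range(20)])

        # node_disp would be applied as displacement boundary conditions on the FE side, and the
        # updated continuum solution fed back to the MD boundary on the next iteration.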

  15. A comparison of three methods of setting prescribing budgets, using data derived from defined daily dose analyses of historic patterns of use.

    PubMed Central

    Maxwell, M; Howie, J G; Pryde, C J

    1998-01-01

    BACKGROUND: Prescribing matters (particularly budget setting and research into prescribing variation between doctors) have been handicapped by the absence of credible measures of the volume of drugs prescribed. AIM: To use the defined daily dose (DDD) method to study variation in the volume and cost of drugs prescribed across the seven main British National Formulary (BNF) chapters with a view to comparing different methods of setting prescribing budgets. METHOD: Study of one year of prescribing statistics from all 129 general practices in Lothian, covering 808,059 patients: analyses of prescribing statistics for 1995 to define volume and cost/volume of prescribing for one year for 10 groups of practices defined by the age and deprivation status of their patients, for seven BNF chapters; creation of prescribing budgets for 1996 for each individual practice based on the use of target volume and cost statistics; comparison of 1996 DDD-based budgets with those set using the conventional historical approach; and comparison of DDD-based budgets with budgets set using a capitation-based formula derived from local cost/patient information. RESULTS: The volume of drugs prescribed was affected by the age structure of the practices in BNF Chapters 1 (gastrointestinal), 2 (cardiovascular), and 6 (endocrine), and by deprivation structure for BNF Chapters 3 (respiratory) and 4 (central nervous system). Costs per DDD in the major BNF chapters were largely independent of age, deprivation structure, or fundholding status. Capitation and DDD-based budgets were similar to each other, but both differed substantially from historic budgets. One practice in seven gained or lost more than 100,000 Pounds per annum using DDD or capitation budgets compared with historic budgets. The DDD-based budget, but not the capitation-based budget, can be used to set volume-specific prescribing targets. CONCLUSIONS: DDD-based and capitation-based prescribing budgets can be set using a simple explanatory model and generalizable methods. In this study, both differed substantially from historic budgets. DDD budgets could be created to accommodate new prescribing strategies and raised or lowered to reflect local intentions to alter overall prescribing volume or cost targets. We recommend that future work on setting budgets and researching prescribing variations should be based on DDD statistics. PMID:10024703
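
    For orientation, the DDD volume measure is a simple normalisation of the quantity dispensed; a minimal sketch with invented figures (the WHO DDD value shown is an assumption, not a drug-specific fact):

        # Hypothetical dispensing data for one practice and one drug over a year
        amount_dispensed_mg = 1_250_000   # total quantity issued
        who_ddd_mg = 75                   # WHO-defined daily dose for the drug (assumed value)
        total_cost = 4_800.0              # annual cost in pounds

        ddds = amount_dispensed_mg / who_ddd_mg          # prescribing volume in DDDs
        print(f"Volume: {ddds:.0f} DDDs, cost per DDD: {total_cost / ddds:.3f} pounds")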

  16. [Triple-type theory of statistics and its application in the scientific research of biomedicine].

    PubMed

    Hu, Liang-ping; Liu, Hui-gang

    2005-07-20

    To point out the crux of why so many people fail to grasp statistics, and to put forward a "triple-type theory of statistics" to solve the problem in a creative way. Based on long experience of teaching and research in statistics, the triple-type theory was formulated and clarified. Examples are provided to demonstrate that the three types, i.e., the expressive type, the prototype and the standardized type, are the essentials for applying statistics rationally both in theory and in practice, and instances show that the three types are correlated with each other. The theory helps people to see the essence when interpreting and analysing problems of experimental design and statistical analysis in medical research. Investigation reveals that for some questions the three types are mutually identical; for some questions the prototype is also the standardized type; and for others the three types are distinct from each other. In some multifactor experimental studies, no standardized type corresponding to the prototype exists at all, because the researchers have committed the mistake of "incomplete control" in setting up the experimental groups; this is a problem that should be solved by the concept and method of "division". Once the triple type for each question is clarified, a proper experimental design and statistical method can be chosen easily. The triple-type theory of statistics can help people avoid statistical mistakes, or at least decrease the misuse rate dramatically, and improve the quality, level and speed of biomedical research when applying statistics. It can also help to improve the quality of statistical textbooks and the effectiveness of statistics teaching, and it points a way to advance biomedical statistics.

  17. Regional projection of climate impact indices over the Mediterranean region

    NASA Astrophysics Data System (ADS)

    Casanueva, Ana; Frías, M. Dolores; Herrera, Sixto; Bedia, Joaquín; San Martín, Daniel; Gutiérrez, José Manuel; Zaninovic, Ksenija

    2014-05-01

    Climate Impact Indices (CIIs) are being increasingly used in different socioeconomic sectors to transfer information about climate change impacts and risks to stakeholders. CIIs are typically based on different weather variables such as temperature, wind speed, precipitation or humidity and comprise, in a single index, the relevant meteorological information for the particular impact sector (in this study wildfires and tourism). This dependence on several climate variables poses important limitations to the application of statistical downscaling techniques, since physical consistency among variables is required in most cases to obtain reliable local projections. The present study assesses the suitability of the "direct" downscaling approach, in which the downscaling method is applied directly to the CII. In particular, for illustrative purposes, we consider two popular indices used in the wildfire and tourism sectors, the Fire Weather Index (FWI) and the Physiological Equivalent Temperature (PET), respectively. As an example, two case studies are analysed over two representative Mediterranean regions of interest for the EU CLIM-RUN project: continental Spain for the FWI and Croatia for the PET. Results obtained with this "direct" downscaling approach are similar to those found from applying the statistical downscaling to the individual meteorological drivers prior to the index calculation ("component" downscaling); thus, a wider range of statistical downscaling methods could be used. As an illustration, future changes in both indices are projected by applying two direct statistical downscaling methods, analogs and linear regression, to the ECHAM5 model. Larger differences were found between the two direct statistical downscaling approaches than between the direct and the component approaches with a single downscaling method. While these examples focus on particular indices and Mediterranean regions of interest for CLIM-RUN stakeholders, the same study could be extended to other indices and regions.
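
    A schematic of "direct" downscaling of a CII with the analog method: each target day is assigned the locally observed index value of the most similar day in the historical predictor record. The arrays are random stand-ins, purely for illustration:

        import numpy as np

        rng = np.random.default_rng(0)
        train_predictors = rng.random((3650, 15))     # standardised large-scale predictor fields, training days
        train_local_index = rng.random(3650)          # observed local CII (e.g. FWI or PET) on those days
        future_predictors = rng.random((365, 15))     # model output for the target period

        # Direct analog downscaling: take the local index observed on the most similar training day
        downscaled = np.empty(len(future_predictors))
        for i, day in enumerate(future_predictors):
            nearest = np.argmin(np.linalg.norm(train_predictors - day, axis=1))
            downscaled[i] = train_local_index[nearest]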

  18. Analysis of Parasite and Other Skewed Counts

    PubMed Central

    Alexander, Neal

    2012-01-01

    Objective To review methods for the statistical analysis of parasite and other skewed count data. Methods Statistical methods for skewed count data are described and compared, with reference to those used over a ten year period of Tropical Medicine and International Health. Two parasitological datasets are used for illustration. Results Ninety papers were identified, 89 with descriptive and 60 with inferential analysis. A lack of clarity is noted in identifying measures of location, in particular the Williams and geometric mean. The different measures are compared, emphasizing the legitimacy of the arithmetic mean for skewed data. In the published papers, the t test and related methods were often used on untransformed data, which is likely to be invalid. Several approaches to inferential analysis are described, emphasizing 1) non-parametric methods, while noting that they are not simply comparisons of medians, and 2) generalized linear modelling, in particular with the negative binomial distribution. Additional methods, such as the bootstrap, with potential for greater use are described. Conclusions Clarity is recommended when describing transformations and measures of location. It is suggested that non-parametric methods and generalized linear models are likely to be sufficient for most analyses. PMID:22943299
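
    Two of the approaches discussed, the Williams mean as a measure of location and a negative binomial GLM for group comparisons, in a minimal sketch with invented counts:

        import numpy as np
        import statsmodels.api as sm

        counts = np.array([0, 0, 1, 2, 2, 3, 5, 8, 40, 120])   # hypothetical parasite counts
        group  = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])      # two hypothetical study groups

        # Williams mean: geometric-type mean of (count + 1), minus 1, which tolerates zero counts
        williams_mean = np.expm1(np.mean(np.log1p(counts)))

        # Negative binomial GLM comparing the two groups on the untransformed counts
        X = sm.add_constant(group)
        nb = sm.GLM(counts, X, family=sm.families.NegativeBinomial()).fit()
        print(williams_mean, nb.params)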

  19. An efficient empirical Bayes method for genomewide association studies.

    PubMed

    Wang, Q; Wei, J; Pan, Y; Xu, S

    2016-08-01

    Linear mixed model (LMM) is one of the most popular methods for genomewide association studies (GWAS). Numerous forms of LMM have been developed; however, there are two major issues in GWAS that have not been fully addressed before. The two issues are (i) the genomic background noise and (ii) low statistical power after Bonferroni correction. We proposed an empirical Bayes (EB) method by assigning each marker effect a normal prior distribution, resulting in shrinkage estimates of marker effects. We found that such a shrinkage approach can selectively shrink marker effects and reduce the noise level to zero for majority of non-associated markers. In the meantime, the EB method allows us to use an 'effective number of tests' to perform Bonferroni correction for multiple tests. Simulation studies for both human and pig data showed that EB method can significantly increase statistical power compared with the widely used exact GWAS methods, such as GEMMA and FaST-LMM-Select. Real data analyses in human breast cancer identified improved detection signals for markers previously known to be associated with breast cancer. We therefore believe that EB method is a valuable tool for identifying the genetic basis of complex traits. © 2015 Blackwell Verlag GmbH.
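
    The EB estimator described above estimates the prior variance from the data; the sketch below instead fixes the shrinkage parameter by hand and is only meant to illustrate how a normal prior on marker effects shrinks non-associated markers towards zero (simulated toy data, not the authors' estimator):

        import numpy as np

        rng = np.random.default_rng(0)
        n, m = 500, 2000
        Z = rng.integers(0, 3, size=(n, m)).astype(float)    # genotype codes 0/1/2
        y = Z[:, :5] @ np.array([0.5, -0.4, 0.3, 0.3, -0.2]) + rng.normal(size=n)

        # Normal prior on each marker effect -> posterior mean is a ridge-type shrinkage estimate
        lam = 50.0                                           # ratio of residual to prior variance (assumed)
        beta = np.linalg.solve(Z.T @ Z + lam * np.eye(m), Z.T @ (y - y.mean()))
        # Effects of the non-associated markers are shrunk towards zero; large effects survive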

  20. 40 CFR 91.512 - Request for public hearing.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... plans and statistical analyses have been properly applied (specifically, whether sampling procedures and statistical analyses specified in this subpart were followed and whether there exists a basis for... will be made available to the public during Agency business hours. ...
