Sample records for statistical power analyses

  1. A Meta-Meta-Analysis: Empirical Review of Statistical Power, Type I Error Rates, Effect Sizes, and Model Selection of Meta-Analyses Published in Psychology

    ERIC Educational Resources Information Center

    Cafri, Guy; Kromrey, Jeffrey D.; Brannick, Michael T.

    2010-01-01

    This article uses meta-analyses published in "Psychological Bulletin" from 1995 to 2005 to describe meta-analyses in psychology, including examination of statistical power, Type I errors resulting from multiple comparisons, and model choice. Retrospective power estimates indicated that univariate categorical and continuous moderators, individual…

  2. Power, effects, confidence, and significance: an investigation of statistical practices in nursing research.

    PubMed

    Gaskin, Cadeyrn J; Happell, Brenda

    2014-05-01

    To (a) assess the statistical power of nursing research to detect small, medium, and large effect sizes; (b) estimate the experiment-wise Type I error rate in these studies; and (c) assess the extent to which (i) a priori power analyses, (ii) effect sizes (and interpretations thereof), and (iii) confidence intervals were reported. Statistical review. Papers published in the 2011 volumes of the 10 highest ranked nursing journals, based on their 5-year impact factors. Papers were assessed for statistical power, control of experiment-wise Type I error, reporting of a priori power analyses, reporting and interpretation of effect sizes, and reporting of confidence intervals. The analyses were based on 333 papers, from which 10,337 inferential statistics were identified. The median power to detect small, medium, and large effect sizes was .40 (interquartile range [IQR]=.24-.71), .98 (IQR=.85-1.00), and 1.00 (IQR=1.00-1.00), respectively. The median experiment-wise Type I error rate was .54 (IQR=.26-.80). A priori power analyses were reported in 28% of papers. Effect sizes were routinely reported for Spearman's rank correlations (100% of papers in which this test was used), Poisson regressions (100%), odds ratios (100%), Kendall's tau correlations (100%), Pearson's correlations (99%), logistic regressions (98%), structural equation modelling/confirmatory factor analyses/path analyses (97%), and linear regressions (83%), but were reported less often for two-proportion z tests (50%), analyses of variance/analyses of covariance/multivariate analyses of variance (18%), t tests (8%), Wilcoxon's tests (8%), Chi-squared tests (8%), and Fisher's exact tests (7%), and not reported for sign tests, Friedman's tests, McNemar's tests, multi-level models, and Kruskal-Wallis tests. Effect sizes were infrequently interpreted. Confidence intervals were reported in 28% of papers. The use, reporting, and interpretation of inferential statistics in nursing research need substantial improvement. Most importantly, researchers should abandon the misleading practice of interpreting the results from inferential tests based solely on whether they are statistically significant (or not) and, instead, focus on reporting and interpreting effect sizes, confidence intervals, and significance levels. Nursing researchers also need to conduct and report a priori power analyses, and to address the issue of Type I experiment-wise error inflation in their studies. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
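
    Benchmark-based power estimates like the medians reported above can be computed directly for any test. A minimal sketch, assuming a two-sample t-test and an illustrative per-group n of 64 (the test type and sample size are assumptions, not figures from the paper):

```python
# Power of a two-sample t-test against Cohen's benchmark effect sizes.
# The per-group sample size (n = 64) is illustrative, not from the paper.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    power = analysis.power(effect_size=d, nobs1=64, alpha=0.05,
                           ratio=1.0, alternative="two-sided")
    print(f"{label} (d={d}): power = {power:.2f}")
```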

  3. Power-up: A Reanalysis of 'Power Failure' in Neuroscience Using Mixture Modeling

    PubMed Central

    Wood, John

    2017-01-01

    Recently, evidence for endemically low statistical power has cast neuroscience findings into doubt. If low statistical power plagues neuroscience, then this reduces confidence in the reported effects. However, if statistical power is not uniformly low, then such blanket mistrust might not be warranted. Here, we provide a different perspective on this issue, analyzing data from an influential study reporting a median power of 21% across 49 meta-analyses (Button et al., 2013). We demonstrate, using Gaussian mixture modeling, that the sample of 730 studies included in that analysis comprises several subcomponents so the use of a single summary statistic is insufficient to characterize the nature of the distribution. We find that statistical power is extremely low for studies included in meta-analyses that reported a null result and that it varies substantially across subfields of neuroscience, with particularly low power in candidate gene association studies. Therefore, whereas power in neuroscience remains a critical issue, the notion that studies are systematically underpowered is not the full story: low power is far from a universal problem. SIGNIFICANCE STATEMENT Recently, researchers across the biomedical and psychological sciences have become concerned with the reliability of results. One marker for reliability is statistical power: the probability of finding a statistically significant result given that the effect exists. Previous evidence suggests that statistical power is low across the field of neuroscience. Our results present a more comprehensive picture of statistical power in neuroscience: on average, studies are indeed underpowered—some very seriously so—but many studies show acceptable or even exemplary statistical power. We show that this heterogeneity in statistical power is common across most subfields in neuroscience. This new, more nuanced picture of statistical power in neuroscience could affect not only scientific understanding, but potentially policy and funding decisions for neuroscience research. PMID:28706080

  4. Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.

    PubMed

    Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg

    2009-11-01

    G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
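
    For the correlation module, the usual hand calculation approximates what G*Power computes exactly. A sketch using the Fisher z approximation; the values of rho and n are illustrative:

```python
# Approximate power of a two-sided test of H0: rho = 0 using the Fisher z
# transformation (an approximation to the exact calculation G*Power does).
import numpy as np
from scipy.stats import norm

def correlation_power(rho, n, alpha=0.05):
    z_effect = np.arctanh(rho) * np.sqrt(n - 3)   # noncentrality under H1
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.sf(z_crit - z_effect) + norm.cdf(-z_crit - z_effect)

print(correlation_power(rho=0.3, n=84))  # ~0.80 for r = .3, n = 84
```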

  5. Power-up: A Reanalysis of 'Power Failure' in Neuroscience Using Mixture Modeling.

    PubMed

    Nord, Camilla L; Valton, Vincent; Wood, John; Roiser, Jonathan P

    2017-08-23

    Recently, evidence for endemically low statistical power has cast neuroscience findings into doubt. If low statistical power plagues neuroscience, then this reduces confidence in the reported effects. However, if statistical power is not uniformly low, then such blanket mistrust might not be warranted. Here, we provide a different perspective on this issue, analyzing data from an influential study reporting a median power of 21% across 49 meta-analyses (Button et al., 2013). We demonstrate, using Gaussian mixture modeling, that the sample of 730 studies included in that analysis comprises several subcomponents so the use of a single summary statistic is insufficient to characterize the nature of the distribution. We find that statistical power is extremely low for studies included in meta-analyses that reported a null result and that it varies substantially across subfields of neuroscience, with particularly low power in candidate gene association studies. Therefore, whereas power in neuroscience remains a critical issue, the notion that studies are systematically underpowered is not the full story: low power is far from a universal problem. SIGNIFICANCE STATEMENT Recently, researchers across the biomedical and psychological sciences have become concerned with the reliability of results. One marker for reliability is statistical power: the probability of finding a statistically significant result given that the effect exists. Previous evidence suggests that statistical power is low across the field of neuroscience. Our results present a more comprehensive picture of statistical power in neuroscience: on average, studies are indeed underpowered, some very seriously so, but many studies show acceptable or even exemplary statistical power. We show that this heterogeneity in statistical power is common across most subfields in neuroscience. This new, more nuanced picture of statistical power in neuroscience could affect not only scientific understanding, but potentially policy and funding decisions for neuroscience research. Copyright © 2017 Nord, Valton et al.
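
    The mixture decomposition described above can be sketched with scikit-learn. The "power" values below are synthetic stand-ins for the 730 study-level estimates, and the two-component structure is an assumption of the toy data, not a result from the paper:

```python
# Gaussian mixture decomposition of study-level power estimates,
# in the spirit of the reanalysis described above.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
power = np.clip(np.concatenate([
    rng.normal(0.10, 0.05, 400),   # a low-power subcomponent (synthetic)
    rng.normal(0.85, 0.10, 330),   # a high-power subcomponent (synthetic)
]), 0.01, 1.0).reshape(-1, 1)

# Choose the number of components by BIC rather than assuming one.
models = {k: GaussianMixture(n_components=k, random_state=0).fit(power)
          for k in (1, 2, 3)}
best_k = min(models, key=lambda k: models[k].bic(power))
best = models[best_k]
print(best_k, best.weights_.round(2), best.means_.ravel().round(2))
```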

  6. Spurious correlations and inference in landscape genetics

    Treesearch

    Samuel A. Cushman; Erin L. Landguth

    2010-01-01

    Reliable interpretation of landscape genetic analyses depends on statistical methods that have high power to identify the correct process driving gene flow while rejecting incorrect alternative hypotheses. Little is known about statistical power and inference in individual-based landscape genetics. Our objective was to evaluate the power of causal modelling with partial…

  7. Using R-Project for Free Statistical Analysis in Extension Research

    ERIC Educational Resources Information Center

    Mangiafico, Salvatore S.

    2013-01-01

    One option for Extension professionals wishing to use free statistical software is to use online calculators, which are useful for common, simple analyses. A second option is to use a free computing environment capable of performing statistical analyses, like R-project. R-project is free, cross-platform, powerful, and respected, but may be…

  8. Statistical power of intervention analyses: simulation and empirical application to treated lumber prices

    Treesearch

    Jeffrey P. Prestemon

    2009-01-01

    Timber product markets are subject to large shocks deriving from natural disturbances and policy shifts. Statistical modeling of shocks is often done to assess their economic importance. In this article, I simulate the statistical power of univariate and bivariate methods of shock detection using time series intervention models. Simulations show that bivariate methods...

  9. A General Framework for Power Analysis to Detect the Moderator Effects in Two- and Three-Level Cluster Randomized Trials

    ERIC Educational Resources Information Center

    Dong, Nianbo; Spybrook, Jessaca; Kelcey, Ben

    2016-01-01

    The purpose of this study is to propose a general framework for power analyses to detect the moderator effects in two- and three-level cluster randomized trials (CRTs). The study specifically aims to: (1) develop the statistical formulations for calculating statistical power, minimum detectable effect size (MDES) and its confidence interval to…

  10. A d-statistic for single-case designs that is equivalent to the usual between-groups d-statistic.

    PubMed

    Shadish, William R; Hedges, Larry V; Pustejovsky, James E; Boyajian, Jonathan G; Sullivan, Kristynn J; Andrade, Alma; Barrientos, Jeannette L

    2014-01-01

    We describe a standardised mean difference statistic (d) for single-case designs that is equivalent to the usual d in between-groups experiments. We show how it can be used to summarise treatment effects over cases within a study, to do power analyses in planning new studies and grant proposals, and to meta-analyse effects across studies of the same question. We discuss limitations of this d-statistic, and possible remedies to them. Even so, this d-statistic is better founded statistically than other effect size measures for single-case design, and unlike many general linear model approaches such as multilevel modelling or generalised additive models, it produces a standardised effect size that can be integrated over studies with different outcome measures. SPSS macros for both effect size computation and power analysis are available.

  11. Statistical power analysis in wildlife research

    USGS Publications Warehouse

    Steidl, R.J.; Hayes, J.P.

    1997-01-01

    Statistical power analysis can be used to increase the efficiency of research efforts and to clarify research results. Power analysis is most valuable in the design or planning phases of research efforts. Such prospective (a priori) power analyses can be used to guide research design and to estimate the number of samples necessary to achieve a high probability of detecting biologically significant effects. Retrospective (a posteriori) power analysis has been advocated as a method to increase information about hypothesis tests that were not rejected. However, estimating power for tests of null hypotheses that were not rejected with the effect size observed in the study is incorrect; these power estimates will always be ≤0.50 when bias adjusted and have no relation to true power. Therefore, retrospective power estimates based on the observed effect size for hypothesis tests that were not rejected are misleading; retrospective power estimates are only meaningful when based on effect sizes other than the observed effect size, such as those effect sizes hypothesized to be biologically significant. Retrospective power analysis can be used effectively to estimate the number of samples or effect size that would have been necessary for a completed study to have rejected a specific null hypothesis. Simply presenting confidence intervals can provide additional information about null hypotheses that were not rejected, including information about the size of the true effect and whether or not there is adequate evidence to 'accept' a null hypothesis as true. We suggest that (1) statistical power analyses be routinely incorporated into research planning efforts to increase their efficiency, (2) confidence intervals be used in lieu of retrospective power analyses for null hypotheses that were not rejected to assess the likely size of the true effect, (3) minimum biologically significant effect sizes be used for all power analyses, and (4) if retrospective power estimates are to be reported, then the α-level, effect sizes, and sample sizes used in calculations must also be reported.
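
    The point that retrospective power computed from the observed effect is uninformative can be illustrated directly: for a two-sided z-test, a result landing exactly at the critical value (p = α) has observed power of about 0.50, and any non-rejected result has less. A minimal sketch:

```python
# Illustration of why "observed" power is uninformative: for a z-test
# whose observed effect lands exactly at the critical value (p = alpha),
# power computed at that observed effect is ~0.50, and it is below 0.50
# whenever the test was not rejected.
from scipy.stats import norm

alpha = 0.05
z_crit = norm.ppf(1 - alpha / 2)

for z_obs in (0.5, 1.0, 1.5, 1.96):       # observed test statistics
    observed_power = norm.sf(z_crit - z_obs) + norm.cdf(-z_crit - z_obs)
    print(f"z = {z_obs:.2f}: observed power = {observed_power:.2f}")
```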

  12. Understanding Statistical Power in Cluster Randomized Trials: Challenges Posed by Differences in Notation and Terminology

    ERIC Educational Resources Information Center

    Spybrook, Jessaca; Hedges, Larry; Borenstein, Michael

    2014-01-01

    Research designs in which clusters are the unit of randomization are quite common in the social sciences. Given the multilevel nature of these studies, the power analyses for these studies are more complex than in a simple individually randomized trial. Tools are now available to help researchers conduct power analyses for cluster randomized…

  13. Low statistical power in biomedical science: a review of three human research domains.

    PubMed

    Dumas-Mallet, Estelle; Button, Katherine S; Boraud, Thomas; Gonon, Francois; Munafò, Marcus R

    2017-02-01

    Studies with low statistical power increase the likelihood that a statistically significant finding represents a false positive result. We conducted a review of meta-analyses of studies investigating the association of biological, environmental or cognitive parameters with neurological, psychiatric and somatic diseases, excluding treatment studies, in order to estimate the average statistical power across these domains. Taking the effect size indicated by a meta-analysis as the best estimate of the likely true effect size, and assuming a threshold for declaring statistical significance of 5%, we found that approximately 50% of studies have statistical power in the 0-10% or 11-20% range, well below the minimum of 80% that is often considered conventional. Studies with low statistical power appear to be common in the biomedical sciences, at least in the specific subject areas captured by our search strategy. However, we also observe evidence that this depends in part on research methodology, with candidate gene studies showing very low average power and studies using cognitive/behavioural measures showing high average power. This warrants further investigation.

  14. Low statistical power in biomedical science: a review of three human research domains

    PubMed Central

    Dumas-Mallet, Estelle; Button, Katherine S.; Boraud, Thomas; Gonon, Francois

    2017-01-01

    Studies with low statistical power increase the likelihood that a statistically significant finding represents a false positive result. We conducted a review of meta-analyses of studies investigating the association of biological, environmental or cognitive parameters with neurological, psychiatric and somatic diseases, excluding treatment studies, in order to estimate the average statistical power across these domains. Taking the effect size indicated by a meta-analysis as the best estimate of the likely true effect size, and assuming a threshold for declaring statistical significance of 5%, we found that approximately 50% of studies have statistical power in the 0–10% or 11–20% range, well below the minimum of 80% that is often considered conventional. Studies with low statistical power appear to be common in the biomedical sciences, at least in the specific subject areas captured by our search strategy. However, we also observe evidence that this depends in part on research methodology, with candidate gene studies showing very low average power and studies using cognitive/behavioural measures showing high average power. This warrants further investigation. PMID:28386409

  15. Discovering human germ cell mutagens with whole genome sequencing: Insights from power calculations reveal the importance of controlling for between-family variability.

    PubMed

    Webster, R J; Williams, A; Marchetti, F; Yauk, C L

    2018-07-01

    Mutations in germ cells pose potential genetic risks to offspring. However, de novo mutations are rare events that are spread across the genome and are difficult to detect. Thus, studies in this area have generally been under-powered, and no human germ cell mutagen has been identified. Whole Genome Sequencing (WGS) of human pedigrees has been proposed as an approach to overcome these technical and statistical challenges. WGS enables analysis of a much wider breadth of the genome than traditional approaches. Here, we performed power analyses to determine the feasibility of using WGS in human families to identify germ cell mutagens. Different statistical models were compared in the power analyses (ANOVA and multiple regression for one-child families, and a mixed effect model sampling between two and four siblings per family). Assumptions were made based on parameters from the existing literature, such as the mutation-by-paternal age effect. We explored two scenarios: a constant effect due to an exposure that occurred in the past, and an accumulating effect where the exposure is continuing. Our analysis revealed the importance of modeling inter-family variability of the mutation-by-paternal age effect. Statistical power was improved by models accounting for the family-to-family variability. Our power analyses suggest that sufficient statistical power can be attained with 4 to 28 four-sibling families per treatment group when the increase in mutations ranges from 40% down to 10%, respectively. Modeling family variability using mixed effect models provided a reduction in sample size compared to a multiple regression approach. Much larger sample sizes were required to detect an interaction effect between environmental exposures and paternal age. These findings inform study design and statistical modeling approaches to improve power and reduce sequencing costs for future studies in this area. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
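
    A greatly simplified sketch of this kind of simulation-based power analysis follows. It draws per-child mutation counts from a Poisson model with a lognormal family-level effect and fits an ordinary Poisson GLM, ignoring the within-family correlation the paper shows is important (a faithful version would fit a mixed model). All parameter values are illustrative assumptions:

```python
# Simulation-based power sketch, loosely following the design above.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

def simulate_power(n_families=28, kids_per_family=4, base_mu=60,
                   effect=1.10, family_sd=0.15, n_sims=500, alpha=0.05):
    hits = 0
    for _ in range(n_sims):
        counts, exposed = [], []
        for grp, mult in ((0, 1.0), (1, effect)):
            for _f in range(n_families):
                fam = np.exp(rng.normal(0, family_sd))  # family effect
                mu = base_mu * mult * fam
                counts.extend(rng.poisson(mu, kids_per_family))
                exposed.extend([grp] * kids_per_family)
        X = sm.add_constant(np.array(exposed, dtype=float))
        fit = sm.GLM(np.array(counts), X,
                     family=sm.families.Poisson()).fit()
        hits += fit.pvalues[1] < alpha
    return hits / n_sims

print(simulate_power())   # proportion of simulations rejecting H0
```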

  16. Power estimation using simulations for air pollution time-series studies

    PubMed Central

    2012-01-01

    Background: Estimation of power to assess associations of interest can be challenging for time-series studies of the acute health effects of air pollution because there are two dimensions of sample size (time-series length and daily outcome counts), and because these studies often use generalized linear models to control for complex patterns of covariation between pollutants and time trends, meteorology and possibly other pollutants. In general, statistical software packages for power estimation rely on simplifying assumptions that may not adequately capture this complexity. Here we examine the impact of various factors affecting power using simulations, with comparison of power estimates obtained from simulations with those obtained using statistical software. Methods: Power was estimated for various analyses within a time-series study of air pollution and emergency department visits using simulations for specified scenarios. Mean daily emergency department visit counts, model parameter value estimates and daily values for air pollution and meteorological variables from actual data (8/1/98 to 7/31/99 in Atlanta) were used to generate simulated daily outcome counts with specified temporal associations with air pollutants and randomly generated error based on a Poisson distribution. Power was estimated by conducting analyses of the association between simulated daily outcome counts and air pollution in 2000 data sets for each scenario. Power estimates from simulations and statistical software (G*Power and PASS) were compared. Results: In the simulation results, increasing time-series length and average daily outcome counts both increased power to a similar extent. Our results also illustrate the low power that can result from using outcomes with low daily counts or short time series, and the reduction in power that can accompany use of multipollutant models. Power estimates obtained using standard statistical software were very similar to those from the simulations when properly implemented; implementation, however, was not straightforward. Conclusions: These analyses demonstrate the similar impact on power of increasing time-series length versus increasing daily outcome counts, which has not previously been reported. Implementation of power software for these studies is discussed and guidance is provided. PMID:22995599

  17. Power estimation using simulations for air pollution time-series studies.

    PubMed

    Winquist, Andrea; Klein, Mitchel; Tolbert, Paige; Sarnat, Stefanie Ebelt

    2012-09-20

    Estimation of power to assess associations of interest can be challenging for time-series studies of the acute health effects of air pollution because there are two dimensions of sample size (time-series length and daily outcome counts), and because these studies often use generalized linear models to control for complex patterns of covariation between pollutants and time trends, meteorology and possibly other pollutants. In general, statistical software packages for power estimation rely on simplifying assumptions that may not adequately capture this complexity. Here we examine the impact of various factors affecting power using simulations, with comparison of power estimates obtained from simulations with those obtained using statistical software. Power was estimated for various analyses within a time-series study of air pollution and emergency department visits using simulations for specified scenarios. Mean daily emergency department visit counts, model parameter value estimates and daily values for air pollution and meteorological variables from actual data (8/1/98 to 7/31/99 in Atlanta) were used to generate simulated daily outcome counts with specified temporal associations with air pollutants and randomly generated error based on a Poisson distribution. Power was estimated by conducting analyses of the association between simulated daily outcome counts and air pollution in 2000 data sets for each scenario. Power estimates from simulations and statistical software (G*Power and PASS) were compared. In the simulation results, increasing time-series length and average daily outcome counts both increased power to a similar extent. Our results also illustrate the low power that can result from using outcomes with low daily counts or short time series, and the reduction in power that can accompany use of multipollutant models. Power estimates obtained using standard statistical software were very similar to those from the simulations when properly implemented; implementation, however, was not straightforward. These analyses demonstrate the similar impact on power of increasing time-series length versus increasing daily outcome counts, which has not previously been reported. Implementation of power software for these studies is discussed and guidance is provided.
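
    The simulation strategy the paper describes can be sketched compactly: generate Poisson daily counts with a specified log-linear pollutant association, refit the model to each simulated series, and take the rejection rate as the power estimate. The series length, mean count, and effect size below are illustrative, not the Atlanta values:

```python
# Minimal sketch of simulation-based power estimation for a Poisson
# time-series model with a log-linear pollutant association.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)

def ts_power(n_days=365, mean_count=20, beta=0.03,
             n_sims=500, alpha=0.05):
    pollutant = rng.normal(0, 1, n_days)        # standardized exposure
    X = sm.add_constant(pollutant)
    mu = mean_count * np.exp(beta * pollutant)  # true expected counts
    hits = 0
    for _ in range(n_sims):
        y = rng.poisson(mu)
        fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
        hits += fit.pvalues[1] < alpha
    return hits / n_sims

print(ts_power())   # rejection rate = estimated power
```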

  18. Targeted On-Demand Team Performance App Development

    DTIC Science & Technology

    2016-10-01

    from three sites; 6) Preliminary analysis indicates a larger than estimated effect size and the study is sufficiently powered for generalizable outcomes...statistical analyses, and examine any resulting qualitative data for trends or connections to statistical outcomes. On Schedule...Predictive...

  19. Bias, precision and statistical power of analysis of covariance in the analysis of randomized trials with baseline imbalance: a simulation study.

    PubMed

    Egbewale, Bolaji E; Lewis, Martyn; Sim, Julius

    2014-04-09

    Analysis of variance (ANOVA), change-score analysis (CSA) and analysis of covariance (ANCOVA) respond differently to baseline imbalance in randomized controlled trials. However, no empirical studies appear to have quantified the differential bias and precision of estimates derived from these methods of analysis, and their relative statistical power, in relation to combinations of levels of key trial characteristics. This simulation study therefore examined the relative bias, precision and statistical power of these three analyses using simulated trial data. 126 hypothetical trial scenarios were evaluated (126,000 datasets), each with continuous data simulated by using a combination of levels of: treatment effect; pretest-posttest correlation; direction and magnitude of baseline imbalance. The bias, precision and power of each method of analysis were calculated for each scenario. Compared to the unbiased estimates produced by ANCOVA, both ANOVA and CSA are subject to bias, in relation to pretest-posttest correlation and the direction of baseline imbalance. Additionally, ANOVA and CSA are less precise than ANCOVA, especially when pretest-posttest correlation ≥ 0.3. When groups are balanced at baseline, ANCOVA is at least as powerful as the other analyses. Apparently greater power of ANOVA and CSA at certain imbalances is achieved in respect of a biased treatment effect. Across a range of correlations between pre- and post-treatment scores and at varying levels and direction of baseline imbalance, ANCOVA remains the optimum statistical method for the analysis of continuous outcomes in RCTs, in terms of bias, precision and statistical power.

  20. Bias, precision and statistical power of analysis of covariance in the analysis of randomized trials with baseline imbalance: a simulation study

    PubMed Central

    2014-01-01

    Background: Analysis of variance (ANOVA), change-score analysis (CSA) and analysis of covariance (ANCOVA) respond differently to baseline imbalance in randomized controlled trials. However, no empirical studies appear to have quantified the differential bias and precision of estimates derived from these methods of analysis, and their relative statistical power, in relation to combinations of levels of key trial characteristics. This simulation study therefore examined the relative bias, precision and statistical power of these three analyses using simulated trial data. Methods: 126 hypothetical trial scenarios were evaluated (126 000 datasets), each with continuous data simulated by using a combination of levels of: treatment effect; pretest-posttest correlation; direction and magnitude of baseline imbalance. The bias, precision and power of each method of analysis were calculated for each scenario. Results: Compared to the unbiased estimates produced by ANCOVA, both ANOVA and CSA are subject to bias, in relation to pretest-posttest correlation and the direction of baseline imbalance. Additionally, ANOVA and CSA are less precise than ANCOVA, especially when pretest-posttest correlation ≥ 0.3. When groups are balanced at baseline, ANCOVA is at least as powerful as the other analyses. Apparently greater power of ANOVA and CSA at certain imbalances is achieved in respect of a biased treatment effect. Conclusions: Across a range of correlations between pre- and post-treatment scores and at varying levels and direction of baseline imbalance, ANCOVA remains the optimum statistical method for the analysis of continuous outcomes in RCTs, in terms of bias, precision and statistical power. PMID:24712304
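
    The contrast among the three analyses is easy to reproduce on a single simulated trial with baseline imbalance: the ANOVA estimate is inflated by the imbalance, the CSA estimate is deflated, and the ANCOVA estimate is unbiased. Parameter values below are illustrative, not the paper's 126 scenarios:

```python
# One simulated trial comparing ANOVA (posttest only), change-score
# analysis, and ANCOVA under baseline imbalance.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n, rho, effect, imbalance = 100, 0.6, 0.3, 0.2

treat = np.repeat([0, 1], n)
pre = rng.normal(0, 1, 2 * n) + imbalance * treat   # baseline imbalance
post = (rho * pre + np.sqrt(1 - rho**2) * rng.normal(0, 1, 2 * n)
        + effect * treat)
df = pd.DataFrame({"treat": treat, "pre": pre, "post": post,
                   "change": post - pre})

anova  = smf.ols("post ~ treat", df).fit()          # posttest only
csa    = smf.ols("change ~ treat", df).fit()        # change score
ancova = smf.ols("post ~ treat + pre", df).fit()    # covariate-adjusted
for name, fit in [("ANOVA", anova), ("CSA", csa), ("ANCOVA", ancova)]:
    print(f"{name:6s} effect = {fit.params['treat']:.3f} "
          f"(SE {fit.bse['treat']:.3f})")
```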

  1. Designing Intervention Studies: Selected Populations, Range Restrictions, and Statistical Power

    PubMed Central

    Miciak, Jeremy; Taylor, W. Pat; Stuebing, Karla K.; Fletcher, Jack M.; Vaughn, Sharon

    2016-01-01

    An appropriate estimate of statistical power is critical for the design of intervention studies. Although the inclusion of a pretest covariate in the test of the primary outcome can increase statistical power, samples selected on the basis of pretest performance may demonstrate range restriction on the selection measure and other correlated measures. This can result in attenuated pretest-posttest correlations, reducing the variance explained by the pretest covariate. We investigated the implications of two potential range restriction scenarios: direct truncation on a selection measure and indirect range restriction on correlated measures. Empirical and simulated data indicated direct range restriction on the pretest covariate greatly reduced statistical power and necessitated sample size increases of 82%–155% (dependent on selection criteria) to achieve equivalent statistical power to parameters with unrestricted samples. However, measures demonstrating indirect range restriction required much smaller sample size increases (32%–71%) under equivalent scenarios. Additional analyses manipulated the correlations between measures and pretest-posttest correlations to guide planning experiments. Results highlight the need to differentiate between selection measures and potential covariates and to investigate range restriction as a factor impacting statistical power. PMID:28479943

  2. Designing Intervention Studies: Selected Populations, Range Restrictions, and Statistical Power.

    PubMed

    Miciak, Jeremy; Taylor, W Pat; Stuebing, Karla K; Fletcher, Jack M; Vaughn, Sharon

    2016-01-01

    An appropriate estimate of statistical power is critical for the design of intervention studies. Although the inclusion of a pretest covariate in the test of the primary outcome can increase statistical power, samples selected on the basis of pretest performance may demonstrate range restriction on the selection measure and other correlated measures. This can result in attenuated pretest-posttest correlations, reducing the variance explained by the pretest covariate. We investigated the implications of two potential range restriction scenarios: direct truncation on a selection measure and indirect range restriction on correlated measures. Empirical and simulated data indicated direct range restriction on the pretest covariate greatly reduced statistical power and necessitated sample size increases of 82%-155% (dependent on selection criteria) to achieve equivalent statistical power to parameters with unrestricted samples. However, measures demonstrating indirect range restriction required much smaller sample size increases (32%-71%) under equivalent scenarios. Additional analyses manipulated the correlations between measures and pretest-posttest correlations to guide planning experiments. Results highlight the need to differentiate between selection measures and potential covariates and to investigate range restriction as a factor impacting statistical power.
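
    Direct truncation and its attenuating effect on the pretest-posttest correlation can be demonstrated in a few lines; the population correlation and the bottom-quartile selection rule below are illustrative assumptions:

```python
# Direct range restriction: selecting the lowest-scoring portion of a
# sample on the pretest attenuates the pretest-posttest correlation,
# which is what erodes the power gain from the pretest covariate.
import numpy as np

rng = np.random.default_rng(4)
rho = 0.7
pre = rng.normal(0, 1, 100_000)
post = rho * pre + np.sqrt(1 - rho**2) * rng.normal(0, 1, 100_000)

selected = pre < np.quantile(pre, 0.25)   # keep the bottom quartile
r_full = np.corrcoef(pre, post)[0, 1]
r_trunc = np.corrcoef(pre[selected], post[selected])[0, 1]
print(f"full-range r = {r_full:.2f}, truncated r = {r_trunc:.2f}")
```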

  3. Replication Unreliability in Psychology: Elusive Phenomena or “Elusive” Statistical Power?

    PubMed Central

    Tressoldi, Patrizio E.

    2012-01-01

    The focus of this paper is to analyze whether the unreliability of results related to certain controversial psychological phenomena may be a consequence of their low statistical power. Applying the Null Hypothesis Statistical Testing (NHST), still the widest used statistical approach, unreliability derives from the failure to refute the null hypothesis, in particular when exact or quasi-exact replications of experiments are carried out. Taking as example the results of meta-analyses related to four different controversial phenomena, subliminal semantic priming, incubation effect for problem solving, unconscious thought theory, and non-local perception, it was found that, except for semantic priming on categorization, the statistical power to detect the expected effect size (ES) of the typical study, is low or very low. The low power in most studies undermines the use of NHST to study phenomena with moderate or low ESs. We conclude by providing some suggestions on how to increase the statistical power or use different statistical approaches to help discriminate whether the results obtained may or may not be used to support or to refute the reality of a phenomenon with small ES. PMID:22783215

  4. On the structure and phase transitions of power-law Poissonian ensembles

    NASA Astrophysics Data System (ADS)

    Eliazar, Iddo; Oshanin, Gleb

    2012-10-01

    Power-law Poissonian ensembles are Poisson processes that are defined on the positive half-line, and that are governed by power-law intensities. Power-law Poissonian ensembles are stochastic objects of fundamental significance; they uniquely display an array of fractal features and they uniquely generate a span of important applications. In this paper we apply three different methods—oligarchic analysis, Lorenzian analysis and heterogeneity analysis—to explore power-law Poissonian ensembles. The amalgamation of these analyses, combined with the topology of power-law Poissonian ensembles, establishes a detailed and multi-faceted picture of the statistical structure and the statistical phase transitions of these elemental ensembles.

  5. Update on work-related psychosocial factors and the development of ischemic heart disease: a systematic review.

    PubMed

    Pejtersen, Jan Hyld; Burr, Hermann; Hannerz, Harald; Fishta, Alba; Hurwitz Eller, Nanna

    2015-01-01

    The present review deals with the relationship between occupational psychosocial factors and the incidence of ischemic heart disease (IHD), with special regard to the statistical power of the findings. This review, an update of a 2009 review, applied 4 inclusion criteria, of which the first 3 were carried over from the original review: (1) STUDY: a prospective or case-control study if exposure was not self-reported (prognostic studies excluded); (2) OUTCOME: definite IHD determined externally; (3) EXPOSURE: psychosocial factors at work (excluding shift work, trauma, violence or accidents, and social capital); and (4) STATISTICAL POWER: acceptable to detect a 20% increased risk in IHD. Eleven new papers met inclusion criteria 1-3; a total of 44 papers were evaluated against criterion 4. Of 169 statistical analyses, only 10 analyses in 2 papers had acceptable statistical power. The results of the 2 papers pointed in the same direction, namely that only the control dimension of job strain explained the excess risk for myocardial infarction for job strain. The large number of underpowered studies and the focus on psychosocial models, such as the job strain models, make it difficult to determine to what extent psychosocial factors at work are risk factors of IHD. There is a need to consider statistical power when planning studies.
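
    The power criterion in inclusion criterion 4 can be sanity-checked with a standard two-proportion calculation; the baseline incidence assumed below is illustrative, and the result shows why so few analyses were adequately powered to detect a 20% increased risk:

```python
# Rough sample-size check for detecting a 20% increased risk (RR = 1.2)
# with a two-proportion z-test. The baseline incidence of 5% is an
# illustrative assumption, not a figure from the review.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

p0 = 0.05                    # assumed baseline incidence
p1 = p0 * 1.2                # 20% increased risk
h = proportion_effectsize(p1, p0)
n = NormalIndPower().solve_power(effect_size=h, alpha=0.05, power=0.80)
print(f"required n per group ~ {n:.0f}")   # several thousand per group
```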

  6. The influence of control group reproduction on the statistical ...

    EPA Pesticide Factsheets

    Because of various Congressional mandates to protect the environment from endocrine disrupting chemicals (EDCs), the United States Environmental Protection Agency (USEPA) initiated the Endocrine Disruptor Screening Program. In the context of this framework, the Office of Research and Development within the USEPA developed the Medaka Extended One Generation Reproduction Test (MEOGRT) to characterize the endocrine action of a suspected EDC. One important endpoint of the MEOGRT is fecundity of breeding pairs of medaka. Power analyses were conducted to determine the number of replicates needed in proposed test designs and to determine the effects that varying reproductive parameters (e.g. mean fecundity, variance, and days with no egg production) will have on the statistical power of the test. A software tool, the MEOGRT Reproduction Power Analysis Tool, was developed to expedite these power analyses by both calculating estimates of the needed reproductive parameters (e.g. population mean and variance) and performing the power analysis under user specified scenarios. The manuscript illustrates how the reproductive performance of the control medaka that are used in a MEOGRT influence statistical power, and therefore the successful implementation of the protocol. Example scenarios, based upon medaka reproduction data collected at MED, are discussed that bolster the recommendation that facilities planning to implement the MEOGRT should have a culture of medaka with hi

  7. Mass univariate analysis of event-related brain potentials/fields I: a critical tutorial review.

    PubMed

    Groppe, David M; Urbach, Thomas P; Kutas, Marta

    2011-12-01

    Event-related potentials (ERPs) and magnetic fields (ERFs) are typically analyzed via ANOVAs on mean activity in a priori windows. Advances in computing power and statistics have produced an alternative, mass univariate analyses consisting of thousands of statistical tests and powerful corrections for multiple comparisons. Such analyses are most useful when one has little a priori knowledge of effect locations or latencies, and for delineating effect boundaries. Mass univariate analyses complement and, at times, obviate traditional analyses. Here we review this approach as applied to ERP/ERF data and four methods for multiple comparison correction: strong control of the familywise error rate (FWER) via permutation tests, weak control of FWER via cluster-based permutation tests, false discovery rate control, and control of the generalized FWER. We end with recommendations for their use and introduce free MATLAB software for their implementation. Copyright © 2011 Society for Psychophysiological Research.
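
    Two of the reviewed corrections can be sketched on synthetic data: Benjamini-Hochberg FDR control and strong FWER control via a max-statistic permutation test (the cluster-based and generalized-FWER variants the review also covers are not shown). The data dimensions and effect locations are illustrative:

```python
# FDR correction via statsmodels, and strong FWER control via a
# max-|t| permutation test, on synthetic "mass univariate" data.
import numpy as np
from scipy.stats import ttest_ind
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(5)
n_tests, n_sub = 200, 15
a = rng.normal(0, 1, (n_sub, n_tests))
b = rng.normal(0, 1, (n_sub, n_tests))
b[:, :20] += 1.0                      # true effects at 20 "time points"

t_obs, p = ttest_ind(a, b, axis=0)
fdr_sig = multipletests(p, alpha=0.05, method="fdr_bh")[0]

# Permutation max-|t| null: shuffle group labels, keep the largest |t|.
data = np.vstack([a, b])
max_null = []
for _ in range(1000):
    perm = rng.permutation(2 * n_sub)
    t_p, _ = ttest_ind(data[perm[:n_sub]], data[perm[n_sub:]], axis=0)
    max_null.append(np.abs(t_p).max())
fwer_sig = np.abs(t_obs) > np.quantile(max_null, 0.95)
print(fdr_sig.sum(), "FDR hits;", fwer_sig.sum(), "permutation FWER hits")
```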

  8. Power of mental health nursing research: a statistical analysis of studies in the International Journal of Mental Health Nursing.

    PubMed

    Gaskin, Cadeyrn J; Happell, Brenda

    2013-02-01

    Having sufficient power to detect effect sizes of an expected magnitude is a core consideration when designing studies in which inferential statistics will be used. The main aim of this study was to investigate the statistical power in studies published in the International Journal of Mental Health Nursing. From volumes 19 (2010) and 20 (2011) of the journal, studies were analysed for their power to detect small, medium, and large effect sizes, according to Cohen's guidelines. The power of the 23 studies included in this review to detect small, medium, and large effects was 0.34, 0.79, and 0.94, respectively. In 90% of papers, no adjustments for experiment-wise error were reported. With a median of nine inferential tests per paper, the mean experiment-wise error rate was 0.51. A priori power analyses were only reported in 17% of studies. Although effect sizes for correlations and regressions were routinely reported, effect sizes for other tests (χ(2)-tests, t-tests, ANOVA/MANOVA) were largely absent from the papers. All types of effect sizes were infrequently interpreted. Researchers are strongly encouraged to conduct power analyses when designing studies, and to avoid scattergun approaches to data analysis (i.e. undertaking large numbers of tests in the hope of finding 'significant' results). Because reviewing effect sizes is essential for determining the clinical significance of study findings, researchers would better serve the field of mental health nursing if they reported and interpreted effect sizes. © 2012 The Authors. International Journal of Mental Health Nursing © 2012 Australian College of Mental Health Nurses Inc.

  9. Statistical issues on the analysis of change in follow-up studies in dental research.

    PubMed

    Blance, Andrew; Tu, Yu-Kang; Baelum, Vibeke; Gilthorpe, Mark S

    2007-12-01

    To provide an overview of the problems in study design and associated analyses of follow-up studies in dental research, particularly addressing three issues: treatment-baseline interactions; statistical power; and nonrandomization. Our previous work has shown that many studies purport an interaction between change (from baseline) and baseline values, which is often based on inappropriate statistical analyses. A priori power calculations are essential for randomized controlled trials (RCTs), but in the pre-test/post-test RCT design it is not well known to dental researchers that the choice of statistical method affects power, and that power is affected by treatment-baseline interactions. A common (good) practice in the analysis of RCT data is to adjust for baseline outcome values using ANCOVA, thereby increasing statistical power. However, an important requirement for ANCOVA is that there be no interaction between the groups and baseline outcome (i.e. effective randomization); the patient-selection process should not cause differences in mean baseline values across groups. This assumption is often violated for nonrandomized (observational) studies and the use of ANCOVA is thus problematic, potentially giving biased estimates, invoking Lord's paradox and leading to difficulties in the interpretation of results. Baseline interaction issues can be overcome by use of statistical methods not widely practiced in dental research: Oldham's method and multilevel modelling; the latter is preferred for its greater flexibility to deal with more than one follow-up occasion as well as additional covariates. To illustrate these three key issues, hypothetical examples are considered from the fields of periodontology, orthodontics, and oral implantology. Caution needs to be exercised when considering the design and analysis of follow-up studies. ANCOVA is generally inappropriate for nonrandomized studies and causal inferences from observational data should be avoided.

  10. Power Analysis for Complex Mediational Designs Using Monte Carlo Methods

    ERIC Educational Resources Information Center

    Thoemmes, Felix; MacKinnon, David P.; Reiser, Mark R.

    2010-01-01

    Applied researchers often include mediation effects in applications of advanced methods such as latent variable models and linear growth curve models. Guidance on how to estimate statistical power to detect mediation for these models has not yet been addressed in the literature. We describe a general framework for power analyses for complex…
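
    For the simplest single-mediator case (the paper's framework targets more complex latent-variable and growth models), a Monte Carlo power analysis reduces to simulating the a and b paths and testing the indirect effect. The Sobel test is used here for brevity, and all parameter values are illustrative:

```python
# Monte Carlo power for the indirect effect in a single-mediator model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)

def mediation_power(n=200, a=0.3, b=0.3, n_sims=1000, alpha=0.05):
    hits = 0
    for _ in range(n_sims):
        x = rng.normal(0, 1, n)
        m = a * x + rng.normal(0, 1, n)        # a path
        y = b * m + rng.normal(0, 1, n)        # b path
        fit_m = sm.OLS(m, sm.add_constant(x)).fit()
        fit_y = sm.OLS(y, sm.add_constant(np.column_stack([x, m]))).fit()
        a_hat, sa = fit_m.params[1], fit_m.bse[1]
        b_hat, sb = fit_y.params[2], fit_y.bse[2]
        sobel_z = (a_hat * b_hat) / np.sqrt(a_hat**2 * sb**2
                                            + b_hat**2 * sa**2)
        hits += abs(sobel_z) > 1.96
    return hits / n_sims

print(mediation_power())   # proportion of simulations detecting a*b
```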

  11. The Problem of Auto-Correlation in Parasitology

    PubMed Central

    Pollitt, Laura C.; Reece, Sarah E.; Mideo, Nicole; Nussey, Daniel H.; Colegrave, Nick

    2012-01-01

    Explaining the contribution of host and pathogen factors in driving infection dynamics is a major ambition in parasitology. There is increasing recognition that analyses based on single summary measures of an infection (e.g., peak parasitaemia) do not adequately capture infection dynamics and so, the appropriate use of statistical techniques to analyse dynamics is necessary to understand infections and, ultimately, control parasites. However, the complexities of within-host environments mean that tracking and analysing pathogen dynamics within infections and among hosts poses considerable statistical challenges. Simple statistical models make assumptions that will rarely be satisfied in data collected on host and parasite parameters. In particular, model residuals (unexplained variance in the data) should not be correlated in time or space. Here we demonstrate how failure to account for such correlations can result in incorrect biological inference from statistical analysis. We then show how mixed effects models can be used as a powerful tool to analyse such repeated measures data in the hope that this will encourage better statistical practices in parasitology. PMID:22511865
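
    The recommended mixed-effects approach can be sketched with statsmodels: a random intercept per host absorbs the host-level correlation that would otherwise violate the residual-independence assumption. The data below are synthetic stand-ins:

```python
# Mixed-effects model for repeated parasitaemia measurements per host.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n_hosts, n_days = 30, 10
host = np.repeat(np.arange(n_hosts), n_days)
day = np.tile(np.arange(n_days), n_hosts)
host_effect = rng.normal(0, 1, n_hosts)[host]   # host-level variation
y = 2.0 + 0.15 * day + host_effect + rng.normal(0, 0.5, n_hosts * n_days)
df = pd.DataFrame({"host": host, "day": day, "y": y})

# Random intercept for host models the within-host correlation.
fit = smf.mixedlm("y ~ day", df, groups=df["host"]).fit()
print(fit.summary())
```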

  12. mvMapper: statistical and geographical data exploration and visualization of multivariate analysis of population structure

    USDA-ARS?s Scientific Manuscript database

    Characterizing population genetic structure across geographic space is a fundamental challenge in population genetics. Multivariate statistical analyses are powerful tools for summarizing genetic variability, but geographic information and accompanying metadata is not always easily integrated into t...

  13. Interim analyses in 2 x 2 crossover trials.

    PubMed

    Cook, R J

    1995-09-01

    A method is presented for performing interim analyses in long term 2 x 2 crossover trials with serial patient entry. The analyses are based on a linear statistic that combines data from individuals observed for one treatment period with data from individuals observed for both periods. The coefficients in this linear combination can be chosen quite arbitrarily, but we focus on variance-based weights to maximize power for tests regarding direct treatment effects. The type I error rate of this procedure is controlled by utilizing the joint distribution of the linear statistics over analysis stages. Methods for performing power and sample size calculations are indicated. A two-stage sequential design involving simultaneous patient entry and a single between-period interim analysis is considered in detail. The power and average number of measurements required for this design are compared to those of the usual crossover trial. The results indicate that, while there is minimal loss in power relative to the usual crossover design in the absence of differential carry-over effects, the proposed design can have substantially greater power when differential carry-over effects are present. The two-stage crossover design can also lead to more economical studies in terms of the expected number of measurements required, due to the potential for early stopping. Attention is directed toward normally distributed responses.
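
    The variance-based weighting idea can be illustrated with a two-component inverse-variance combination, which minimizes the variance of the pooled estimate; the estimates and variances below are illustrative values, not the paper's design:

```python
# Inverse-variance combination of two independent estimates of the
# direct treatment effect (period-1-only patients and completers).
import numpy as np
from scipy.stats import norm

est = np.array([1.8, 2.3])        # the two component estimates (assumed)
var = np.array([0.90, 0.40])      # their variances (assumed)
w = (1 / var) / (1 / var).sum()   # inverse-variance weights
pooled = (w * est).sum()
pooled_se = np.sqrt(1 / (1 / var).sum())
z = pooled / pooled_se
print(f"pooled = {pooled:.2f}, z = {z:.2f}, p = {2 * norm.sf(abs(z)):.3f}")
```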

  14. Separate-channel analysis of two-channel microarrays: recovering inter-spot information.

    PubMed

    Smyth, Gordon K; Altman, Naomi S

    2013-05-26

    Two-channel (or two-color) microarrays are cost-effective platforms for comparative analysis of gene expression. They are traditionally analysed in terms of the log-ratios (M-values) of the two channel intensities at each spot, but this analysis does not use all the information available in the separate channel observations. Mixed models have been proposed to analyse intensities from the two channels as separate observations, but such models can be complex to use and the gain in efficiency over the log-ratio analysis is difficult to quantify. Mixed models yield test statistics for which the null distributions can be specified only approximately, and some approaches do not borrow strength between genes. This article reformulates the mixed model to clarify the relationship with the traditional log-ratio analysis, to facilitate information borrowing between genes, and to obtain an exact distributional theory for the resulting test statistics. The mixed model is transformed to operate on the M-values and A-values (average log-expression for each spot) instead of on the log-expression values. The log-ratio analysis is shown to ignore information contained in the A-values. The relative efficiency of the log-ratio analysis is shown to depend on the size of the intra-spot correlation. A new separate channel analysis method is proposed that assumes a constant intra-spot correlation coefficient across all genes. This approach permits the mixed model to be transformed into an ordinary linear model, allowing the data analysis to use a well-understood empirical Bayes analysis pipeline for linear modeling of microarray data. This yields statistically powerful test statistics that have an exact distributional theory. The log-ratio, mixed model and common correlation methods are compared using three case studies. The results show that separate channel analyses that borrow strength between genes are more powerful than log-ratio analyses. The common correlation analysis is the most powerful of all. The common correlation method proposed in this article for separate-channel analysis of two-channel microarray data is no more difficult to apply in practice than the traditional log-ratio analysis. It provides an intuitive and powerful means to conduct analyses and make comparisons that might otherwise not be possible.
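
    The M-value/A-value transformation at the heart of the reformulation is a two-line computation: M carries the within-spot contrast, A the average log-expression. The intensities below are illustrative:

```python
# M-values (log-ratios) and A-values (average log-expression) from
# two-channel spot intensities; the numbers are illustrative.
import numpy as np

red = np.array([1200.0, 850.0, 400.0])
green = np.array([600.0, 900.0, 380.0])
M = np.log2(red) - np.log2(green)          # between-channel contrast
A = (np.log2(red) + np.log2(green)) / 2    # average log-expression
print(M.round(3), A.round(3))
```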

  15. The Surprisingly Modest Relationship between SES and Educational Achievement

    ERIC Educational Resources Information Center

    Harwell, Michael; Maeda, Yukiko; Bishop, Kyoungwon; Xie, Aolin

    2017-01-01

    Measures of socioeconomic status (SES) are routinely used in analyses of achievement data to increase statistical power, statistically control for the effects of SES, and enhance causality arguments under the premise that the SES-achievement relationship is moderate to strong. Empirical evidence characterizing the strength of the SES-achievement…

  16. Statistical analyses support power law distributions found in neuronal avalanches.

    PubMed

    Klaus, Andreas; Yu, Shan; Plenz, Dietmar

    2011-01-01

    The size distribution of neuronal avalanches in cortical networks has been reported to follow a power law distribution with exponent close to -1.5, which is a reflection of long-range spatial correlations in spontaneous neuronal activity. However, identifying power law scaling in empirical data can be difficult and sometimes controversial. In the present study, we tested the power law hypothesis for neuronal avalanches by using more stringent statistical analyses. In particular, we performed the following steps: (i) analysis of finite-size scaling to identify scale-free dynamics in neuronal avalanches, (ii) model parameter estimation to determine the specific exponent of the power law, and (iii) comparison of the power law to alternative model distributions. Consistent with critical state dynamics, avalanche size distributions exhibited robust scaling behavior in which the maximum avalanche size was limited only by the spatial extent of sampling ("finite size" effect). This scale-free dynamics suggests the power law as a model for the distribution of avalanche sizes. Using both the Kolmogorov-Smirnov statistic and a maximum likelihood approach, we found the slope to be close to -1.5, which is in line with previous reports. Finally, the power law model for neuronal avalanches was compared to the exponential and to various heavy-tail distributions based on the Kolmogorov-Smirnov distance and by using a log-likelihood ratio test. Both the power law distribution without and with exponential cut-off provided significantly better fits to the cluster size distributions in neuronal avalanches than the exponential, the lognormal and the gamma distribution. In summary, our findings strongly support the power law scaling in neuronal avalanches, providing further evidence for critical state dynamics in superficial layers of cortex.
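
    Steps (ii) and (iii) of such an analysis can be sketched for a continuous power law: the maximum likelihood estimator of the exponent and the Kolmogorov-Smirnov distance to the fitted distribution, following the general Clauset-style recipe. The avalanche sizes below are synthetic, and a full analysis would also compare alternative distributions:

```python
# MLE of a continuous power-law exponent plus the KS distance between
# the empirical and fitted distributions, on synthetic "avalanche" sizes.
import numpy as np

rng = np.random.default_rng(8)
x_min, alpha_true = 1.0, 1.5
u = rng.uniform(size=5000)
sizes = x_min * (1 - u) ** (-1 / (alpha_true - 1))   # power-law samples

# MLE for a continuous power law with known x_min.
alpha_hat = 1 + len(sizes) / np.log(sizes / x_min).sum()

# KS distance between the empirical CDF and the fitted power-law CDF.
s = np.sort(sizes)
emp = np.arange(1, len(s) + 1) / len(s)
model = 1 - (s / x_min) ** (1 - alpha_hat)
ks = np.abs(emp - model).max()
print(f"alpha_hat = {alpha_hat:.3f}, KS = {ks:.4f}")
```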

  17. Statistical Performances of Resistive Active Power Splitter

    NASA Astrophysics Data System (ADS)

    Lalléchère, Sébastien; Ravelo, Blaise; Thakur, Atul

    2016-03-01

    In this paper, the synthesis and sensitivity analysis of an active power splitter (PWS) is proposed. It is based on an active cell composed of a Field Effect Transistor in cascade with a shunted resistor at the input and the output (resistive amplifier topology). The PWS uncertainty versus resistance tolerances is assessed using a stochastic method. Furthermore, with the proposed topology, we can easily control the device gain while varying a resistance. This provides a useful tool to analyse the statistical sensitivity of the system in an uncertain environment.

  18. Comparison of Time-to-First Event and Recurrent Event Methods in Randomized Clinical Trials.

    PubMed

    Claggett, Brian; Pocock, Stuart; Wei, L J; Pfeffer, Marc A; McMurray, John J V; Solomon, Scott D

    2018-03-27

    Background: Most Phase-3 trials feature time-to-first event endpoints for their primary and/or secondary analyses. In chronic diseases where a clinical event can occur more than once, recurrent-event methods have been proposed to more fully capture disease burden and have been assumed to improve statistical precision and power compared to conventional "time-to-first" methods. Methods: To better characterize factors that influence statistical properties of recurrent-events and time-to-first methods in the evaluation of randomized therapy, we repeatedly simulated trials with 1:1 randomization of 4000 patients to active vs control therapy, with true patient-level risk reduction of 20% (i.e. RR=0.80). For patients who discontinued active therapy after a first event, we assumed their risk reverted subsequently to their original placebo-level risk. Through simulation, we varied (a) the degree of between-patient heterogeneity of risk and (b) the extent of treatment discontinuation. Findings were compared with those from actual randomized clinical trials. Results: As the degree of between-patient heterogeneity of risk was increased, both time-to-first and recurrent-events methods lost statistical power to detect a true risk reduction and confidence intervals widened. The recurrent-events analyses continued to estimate the true RR=0.80 as heterogeneity increased, while the Cox model produced estimates that were attenuated. The power of recurrent-events methods declined as the rate of study drug discontinuation post-event increased. Recurrent-events methods provided greater power than time-to-first methods in scenarios where drug discontinuation was ≤30% following a first event, lesser power with drug discontinuation rates of ≥60%, and comparable power otherwise. We confirmed in several actual trials in chronic heart failure that treatment effect estimates were attenuated when estimated via the Cox model and that increased statistical power from recurrent-events methods was most pronounced in trials with lower treatment discontinuation rates. Conclusions: We find that the statistical power of both recurrent-events and time-to-first methods is reduced by increasing heterogeneity of patient risk, a parameter not included in conventional power and sample size formulas. Data from real clinical trials are consistent with simulation studies, confirming that the greatest statistical gains from use of recurrent-events methods occur in the presence of high patient heterogeneity and low rates of study drug discontinuation.

  19. The influence of control group reproduction on the statistical power of the Environmental Protection Agency's Medaka Extended One Generation Reproduction Test (MEOGRT).

    PubMed

    Flynn, Kevin; Swintek, Joe; Johnson, Rodney

    2017-02-01

    Because of various Congressional mandates to protect the environment from endocrine disrupting chemicals (EDCs), the United States Environmental Protection Agency (USEPA) initiated the Endocrine Disruptor Screening Program. In the context of this framework, the Office of Research and Development within the USEPA developed the Medaka Extended One Generation Reproduction Test (MEOGRT) to characterize the endocrine action of a suspected EDC. One important endpoint of the MEOGRT is fecundity of medaka breeding pairs. Power analyses were conducted to determine the number of replicates needed in proposed test designs and to determine the effects that varying reproductive parameters (e.g. mean fecundity, variance, and days with no egg production) would have on the statistical power of the test. The MEOGRT Reproduction Power Analysis Tool (MRPAT) is a software tool developed to expedite these power analyses by both calculating estimates of the needed reproductive parameters (e.g. population mean and variance) and performing the power analysis under user specified scenarios. Example scenarios are detailed that highlight the importance of the reproductive parameters on statistical power. When control fecundity is increased from 21 to 38 eggs per pair per day and the variance decreased from 49 to 20, the gain in power is equivalent to increasing replication by 2.5 times. On the other hand, if 10% of the breeding pairs, including controls, do not spawn, the power to detect a 40% decrease in fecundity drops to 0.54 from nearly 0.98 when all pairs have some level of egg production. Perhaps most importantly, MRPAT was used to inform the decision making process that lead to the final recommendation of the MEOGRT to have 24 control breeding pairs and 12 breeding pairs in each exposure group. Published by Elsevier Inc.

  20. Reliability and statistical power analysis of cortical and subcortical FreeSurfer metrics in a large sample of healthy elderly.

    PubMed

    Liem, Franziskus; Mérillat, Susan; Bezzola, Ladina; Hirsiger, Sarah; Philipp, Michel; Madhyastha, Tara; Jäncke, Lutz

    2015-03-01

    FreeSurfer is a tool to quantify cortical and subcortical brain anatomy automatically and noninvasively. Previous studies have reported reliability and statistical power analyses in relatively small samples or have examined only one aspect of brain anatomy. Here, we investigated reliability and statistical power of cortical thickness, surface area, volume, and the volume of subcortical structures in a large sample (N=189) of healthy elderly subjects (64+ years). Reliability (intraclass correlation coefficient) of cortical and subcortical parameters is generally high (cortical: ICCs>0.87, subcortical: ICCs>0.95). Surface-based smoothing increases reliability of cortical thickness maps, while it decreases reliability of cortical surface area and volume. Nevertheless, statistical power of all measures benefits from smoothing. When aiming to detect a 10% difference between groups, the number of subjects required to test effects with sufficient power over the entire cortex varies between cortical measures (cortical thickness: N=39, surface area: N=21, volume: N=81; 10 mm smoothing, power=0.8, α=0.05). For subcortical regions this number is between 16 and 76 subjects, depending on the region. We also demonstrate the advantage of within-subject designs over between-subject designs. Furthermore, we publicly provide a tool that allows researchers to perform a priori power analysis and sensitivity analysis to help evaluate previously published studies and to design future studies with sufficient statistical power. Copyright © 2014 Elsevier Inc. All rights reserved.
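
    The per-measure sample sizes above come from the paper's FreeSurfer variance estimates, which are not reproduced here. A generic two-sample approximation shows how such numbers arise from a measure's variability; the coefficients of variation below are illustrative assumptions, not values from the study.

    ```python
    from scipy import stats

    def n_per_group(cv, rel_diff=0.10, alpha=0.05, power=0.80):
        """Normal-approximation sample size per group for a two-sample test
        of a relative mean difference `rel_diff`, given the measure's
        coefficient of variation `cv` (sd/mean)."""
        z = stats.norm.ppf(1 - alpha / 2) + stats.norm.ppf(power)
        return 2 * z ** 2 * (cv / rel_diff) ** 2

    # Illustrative CVs only, not the paper's estimates.
    for name, cv in [("cortical thickness", 0.15), ("surface area", 0.11),
                     ("structure volume", 0.21)]:
        print(f"{name}: ~{n_per_group(cv):.0f} subjects per group")
    ```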

  1. Detecting Genomic Clustering of Risk Variants from Sequence Data: Cases vs. Controls

    PubMed Central

    Schaid, Daniel J.; Sinnwell, Jason P.; McDonnell, Shannon K.; Thibodeau, Stephen N.

    2013-01-01

    As the ability to measure dense genetic markers approaches the limit of the DNA sequence itself, taking advantage of possible clustering of genetic variants in, and around, a gene would benefit genetic association analyses, and likely provide biological insights. The greatest benefit might be realized when multiple rare variants cluster in a functional region. Several statistical tests have been developed, one of which is based on the popular Kulldorff scan statistic for spatial clustering of disease. We extended another popular spatial clustering method, Tango’s statistic, to genomic sequence data. An advantage of Tango’s method is that it is rapid to compute, and when a single test statistic is computed, its distribution is well approximated by a scaled chi-square distribution, making computation of p-values very rapid. We compared the Type-I error rates and power of several clustering statistics, as well as the omnibus sequence kernel association test (SKAT). Although our version of Tango’s statistic, which we call the “Kernel Distance” statistic, took approximately half as long to compute as the Kulldorff scan statistic, it had slightly less power than the scan statistic. Our results showed that the Ionita-Laza version of Kulldorff’s scan statistic had the greatest power over a range of clustering scenarios. PMID:23842950
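
    The Kernel Distance statistic and its scaled chi-square calibration are developed in the paper itself. The underlying idea, a kernel-smoothed contrast between case and control variant profiles, can be sketched with permutation calibration instead; the positions, counts, and kernel bandwidth below are all invented.

    ```python
    import numpy as np

    rng = np.random.default_rng(11)

    def kernel_distance_stat(case_counts, ctrl_counts, pos, tau=5000.0):
        """Kernel-distance style clustering statistic: the difference
        between case and control variant-frequency profiles is smoothed
        by a Gaussian kernel over genomic distance. A sketch of the idea
        behind Tango-type statistics only; the paper's version is
        calibrated analytically rather than by permutation."""
        diff = case_counts / case_counts.sum() - ctrl_counts / ctrl_counts.sum()
        k = np.exp(-((pos[:, None] - pos[None, :]) / tau) ** 2)
        return float(diff @ k @ diff)

    def permutation_p(case_counts, ctrl_counts, pos, n_perm=999):
        obs = kernel_distance_stat(case_counts, ctrl_counts, pos)
        total = case_counts + ctrl_counts
        p_case = case_counts.sum() / total.sum()
        hits = 1
        for _ in range(n_perm):
            # Under the null, each carrier is a case with equal probability.
            perm = rng.binomial(total, p_case)
            hits += kernel_distance_stat(perm, total - perm, pos) >= obs
        return hits / (n_perm + 1)

    # Toy region: rare variants clustered near position 20 kb in cases.
    pos = np.array([1000, 8000, 19000, 20000, 21000, 35000, 47000])
    cases = np.array([1, 0, 4, 6, 5, 1, 0])
    ctrls = np.array([2, 1, 1, 1, 1, 2, 1])
    print(permutation_p(cases, ctrls, pos))
    ```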

  2. Evaluation and application of summary statistic imputation to discover new height-associated loci.

    PubMed

    Rüeger, Sina; McDaid, Aaron; Kutalik, Zoltán

    2018-05-01

    As most of the heritability of complex traits is attributed to common and low frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discover the genetic basis of these traits. Association summary statistics from genome-wide meta-analyses are available for hundreds of traits. Updating these to ever-increasing reference panels is very cumbersome as it requires reimputation of the genetic data, rerunning the association scan, and meta-analysing the results. A much more efficient method is to directly impute the summary statistics, termed summary statistics imputation, which we improved to accommodate variable sample size across SNVs. Its performance relative to genotype imputation and practical utility has not yet been fully investigated. To this end, we compared the two approaches on real (genotyped and imputed) data from 120K samples from the UK Biobank and show that genotype imputation boasts a 3- to 5-fold lower root-mean-square error, and better distinguishes true associations from null ones: we observed the largest differences in power for variants with low minor allele frequency and low imputation quality. For fixed false positive rates of 0.001, 0.01 and 0.05, using summary statistics imputation yielded a decrease in statistical power by 9, 43 and 35%, respectively. To test its capacity to discover novel associations, we applied summary statistics imputation to the GIANT height meta-analysis summary statistics covering HapMap variants, and identified 34 novel loci, 19 of which replicated using data in the UK Biobank. Additionally, we successfully replicated 55 out of the 111 variants published in an exome chip study. Our study demonstrates that summary statistics imputation is a very efficient and cost-effective way to identify and fine-map trait-associated loci. Moreover, the ability to impute summary statistics is important for follow-up analyses, such as Mendelian randomisation or LD-score regression.
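
    The core of summary statistics imputation is the conditional-mean formula for multivariate-normal z-scores. A minimal sketch follows; the paper's refinement for variable per-SNV sample size is not included, and the ridge value and toy LD matrices are assumptions.

    ```python
    import numpy as np

    def impute_z(z_typed, ld_tt, ld_ut, ridge=0.1):
        """Impute z-scores at untyped SNVs from typed ones: under the
        standard multivariate-normal model for GWAS z-scores,
        E[z_u | z_t] = C_ut C_tt^{-1} z_t, with C the LD (correlation)
        matrix; a small ridge stabilizes the inverse. Also returns an
        r^2-like imputation-quality metric per imputed SNV."""
        w = ld_ut @ np.linalg.inv(ld_tt + ridge * np.eye(ld_tt.shape[0]))
        z_u = w @ z_typed
        r2 = np.einsum('ij,ij->i', w, ld_ut)
        return z_u, r2

    # Toy example: three typed SNVs and one untyped SNV in moderate LD.
    ld_tt = np.array([[1.0, 0.4, 0.2],
                      [0.4, 1.0, 0.3],
                      [0.2, 0.3, 1.0]])
    ld_ut = np.array([[0.6, 0.5, 0.2]])
    print(impute_z(np.array([4.1, 3.0, 1.2]), ld_tt, ld_ut))
    ```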

  3. Evaluation and application of summary statistic imputation to discover new height-associated loci

    PubMed Central

    2018-01-01

    As most of the heritability of complex traits is attributed to common and low frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discover the genetic basis of these traits. Association summary statistics from genome-wide meta-analyses are available for hundreds of traits. Updating these to ever-increasing reference panels is very cumbersome as it requires reimputation of the genetic data, rerunning the association scan, and meta-analysing the results. A much more efficient method is to directly impute the summary statistics, termed summary statistics imputation, which we improved to accommodate variable sample size across SNVs. Its performance relative to genotype imputation and practical utility has not yet been fully investigated. To this end, we compared the two approaches on real (genotyped and imputed) data from 120K samples from the UK Biobank and show that genotype imputation boasts a 3- to 5-fold lower root-mean-square error, and better distinguishes true associations from null ones: we observed the largest differences in power for variants with low minor allele frequency and low imputation quality. For fixed false positive rates of 0.001, 0.01 and 0.05, using summary statistics imputation yielded a decrease in statistical power by 9, 43 and 35%, respectively. To test its capacity to discover novel associations, we applied summary statistics imputation to the GIANT height meta-analysis summary statistics covering HapMap variants, and identified 34 novel loci, 19 of which replicated using data in the UK Biobank. Additionally, we successfully replicated 55 out of the 111 variants published in an exome chip study. Our study demonstrates that summary statistics imputation is a very efficient and cost-effective way to identify and fine-map trait-associated loci. Moreover, the ability to impute summary statistics is important for follow-up analyses, such as Mendelian randomisation or LD-score regression. PMID:29782485

  4. Characteristics of genomic signatures derived using univariate methods and mechanistically anchored functional descriptors for predicting drug- and xenobiotic-induced nephrotoxicity.

    PubMed

    Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J

    2008-01-01

    The ideal toxicity biomarker combines prediction (detection prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and a mechanistic relationship to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, the Hotelling T-square test, and, finally, out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps (sets of genes coordinately involved in key biological processes) with classification power. Differentially expressed genes identified by the different univariate statistical analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or the Hotelling T-square test had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.

  5. ResidPlots-2: Computer Software for IRT Graphical Residual Analyses

    ERIC Educational Resources Information Center

    Liang, Tie; Han, Kyung T.; Hambleton, Ronald K.

    2009-01-01

    This article discusses ResidPlots-2, computer software that provides a powerful tool for IRT graphical residual analyses. ResidPlots-2 consists of two components: a component for computing residual statistics and another for communicating with users and plotting the residual graphs. The features of the ResidPlots-2 software are…

  6. Dynamic modelling of n-of-1 data: powerful and flexible data analytics applied to individualised studies.

    PubMed

    Vieira, Rute; McDonald, Suzanne; Araújo-Soares, Vera; Sniehotta, Falko F; Henderson, Robin

    2017-09-01

    N-of-1 studies are based on repeated observations within an individual or unit over time and are acknowledged as an important research method for generating scientific evidence about the health or behaviour of an individual. Statistical analyses of n-of-1 data require accurate modelling of the outcome while accounting for its distribution, time-related trend and error structures (e.g., autocorrelation) as well as reporting readily usable contextualised effect sizes for decision-making. A number of statistical approaches have been documented but no consensus exists on which method is most appropriate for which type of n-of-1 design. We discuss the statistical considerations for analysing n-of-1 studies and briefly review some currently used methodologies. We describe dynamic regression modelling as a flexible and powerful approach, adaptable to different types of outcomes and capable of dealing with the different challenges inherent to n-of-1 statistical modelling. Dynamic modelling borrows ideas from longitudinal and event history methodologies which explicitly incorporate the role of time and the influence of past on future. We also present an illustrative example of the use of dynamic regression on monitoring physical activity during the retirement transition. Dynamic modelling has the potential to expand researchers' access to robust and user-friendly statistical methods for individualised studies.
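
    A dynamic regression in the sense described, with the lagged outcome entering as a predictor so that the past explicitly influences the future, can be sketched on simulated n-of-1 data; the scenario and all coefficients are invented for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(42)

    # Simulated n-of-1 series: daily physical activity with AR(1)
    # carry-over and a step change after a transition (e.g. retirement).
    n = 120
    treat = (np.arange(n) >= 60).astype(float)   # second half = post-transition
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = 2.0 + 0.5 * y[t - 1] - 1.0 * treat[t] + rng.normal(0, 1)

    # Dynamic regression: include the lagged outcome as a predictor so the
    # influence of the past on the future is modelled explicitly.
    X = np.column_stack([np.ones(n - 1), y[:-1], treat[1:]])
    beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    resid = y[1:] - X @ beta
    sigma2 = resid @ resid / (n - 1 - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    for name, b, s in zip(["intercept", "lag-1", "transition"], beta, se):
        print(f"{name:10s} {b:7.3f} (SE {s:.3f})")
    ```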

  7. POWER ANALYSIS FOR COMPLEX MEDIATIONAL DESIGNS USING MONTE CARLO METHODS

    PubMed Central

    Thoemmes, Felix; MacKinnon, David P.; Reiser, Mark R.

    2013-01-01

    Applied researchers often include mediation effects in applications of advanced methods such as latent variable models and linear growth curve models. Guidance on how to estimate statistical power to detect mediation for these models has not yet been addressed in the literature. We describe a general framework for power analyses for complex mediational models. The approach is based on the well known technique of generating a large number of samples in a Monte Carlo study, and estimating power as the percentage of cases in which an estimate of interest is significantly different from zero. Examples of power calculation for commonly used mediational models are provided. Power analyses for the single mediator, multiple mediators, three-path mediation, mediation with latent variables, moderated mediation, and mediation in longitudinal designs are described. Annotated sample syntax for Mplus is appended and tabled values of required sample sizes are shown for some models. PMID:23935262
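
    The article's worked examples use Mplus; the same Monte Carlo logic can be sketched for the simplest case, a single mediator tested by joint significance of both paths. Effect sizes and sample sizes below are illustrative.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    def regress_p(X, y, j):
        """Two-sided p-value for coefficient j in an OLS fit."""
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        df = len(y) - X.shape[1]
        s2 = resid @ resid / df
        se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[j, j])
        return 2 * stats.t.sf(abs(beta[j] / se), df)

    def mediation_power(n, a=0.3, b=0.3, c=0.0, n_sims=1000, alpha=0.05):
        """Monte Carlo power for the indirect effect a*b in the single-
        mediator model X -> M -> Y, using the joint-significance test
        (both paths significant)."""
        hits = 0
        for _ in range(n_sims):
            x = rng.normal(size=n)
            m = a * x + rng.normal(size=n)
            y = b * m + c * x + rng.normal(size=n)
            pa = regress_p(np.column_stack([np.ones(n), x]), m, 1)
            pb = regress_p(np.column_stack([np.ones(n), m, x]), y, 1)
            hits += (pa < alpha) and (pb < alpha)
        return hits / n_sims

    for n in (50, 100, 200):
        print(n, mediation_power(n))
    ```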

  8. How can my research paper be useful for future meta-analyses on forest restoration practices?

    Treesearch

    Enrique Andivia; Pedro Villar‑Salvador; Juan A. Oliet; Jaime Puertolas; R. Kasten Dumroese

    2018-01-01

    Statistical meta-analysis is a powerful and useful tool to quantitatively synthesize the information conveyed in published studies on a particular topic. It allows identifying and quantifying overall patterns and exploring causes of variation. The inclusion of published works in meta-analyses requires, however, a minimum quality standard of the reported data and...

  9. Confidence crisis of results in biomechanics research.

    PubMed

    Knudson, Duane

    2017-11-01

    Many biomechanics studies have small sample sizes and incorrect statistical analyses, so inaccurate inferences and inflated magnitudes of effects are commonly reported in the field. This review examines these issues in biomechanics research and summarises potential solutions from research in other fields to increase confidence in the experimental effects reported in biomechanics. Authors, reviewers and editors of biomechanics research reports are encouraged to improve sample sizes and the resulting statistical power, improve reporting transparency, improve the rigour of the statistical analyses used, and increase the acceptance of replication studies to improve the validity of inferences drawn from data in biomechanics research. The application of sports biomechanics research results would also improve if a larger percentage of unbiased effects and their uncertainty were reported in the literature.

  10. Sunspot activity and influenza pandemics: a statistical assessment of the purported association.

    PubMed

    Towers, S

    2017-10-01

    Since 1978, a series of papers in the literature have claimed to find a significant association between sunspot activity and the timing of influenza pandemics. This paper examines these analyses and attempts to recreate the three most recent statistical analyses, by Ertel (1994), Tapping et al. (2001), and Yeung (2006), all of which purported to find a significant relationship between sunspot numbers and pandemic influenza. As will be discussed, each analysis had errors in the data. In addition, each analysis made arbitrary selections or assumptions, and the authors did not assess the robustness of their analyses to changes in those arbitrary assumptions. Varying the arbitrary assumptions to other, equally valid, assumptions negates the claims of significance. Indeed, an arbitrary selection made in one of the analyses appears to have resulted in almost maximal apparent significance; changing it only slightly yields a null result. This analysis applies statistically rigorous methodology to examine the purported sunspot/pandemic link, using more statistically powerful un-binned analysis methods rather than relying on arbitrarily binned data. The analyses are repeated using both the Wolf and Group sunspot numbers. In all cases, no statistically significant evidence of any association was found. While the focus of this particular analysis was the purported relationship of influenza pandemics to sunspot activity, the faults found in the past analyses are common pitfalls: inattention to analysis reproducibility and robustness assessment are common problems in the sciences that are unfortunately not noted often enough in review.

  11. Behavior, sensitivity, and power of activation likelihood estimation characterized by massive empirical simulation.

    PubMed

    Eickhoff, Simon B; Nichols, Thomas E; Laird, Angela R; Hoffstaedter, Felix; Amunts, Katrin; Fox, Peter T; Bzdok, Danilo; Eickhoff, Claudia R

    2016-08-15

    Given the increasing number of neuroimaging publications, the automated knowledge extraction on brain-behavior associations by quantitative meta-analyses has become a highly important and rapidly growing field of research. Among several methods to perform coordinate-based neuroimaging meta-analyses, Activation Likelihood Estimation (ALE) has been widely adopted. In this paper, we addressed two pressing questions related to ALE meta-analysis: i) Which thresholding method is most appropriate to perform statistical inference? ii) Which sample size, i.e., number of experiments, is needed to perform robust meta-analyses? We provided quantitative answers to these questions by simulating more than 120,000 meta-analysis datasets using empirical parameters (i.e., number of subjects, number of reported foci, distribution of activation foci) derived from the BrainMap database. This allowed us to characterize the behavior of ALE analyses, to derive first power estimates for neuroimaging meta-analyses, and thus to formulate recommendations for future ALE studies. We show, first, that cluster-level family-wise error (FWE) correction represents the most appropriate method for statistical inference, while voxel-level FWE correction is valid but more conservative. In contrast, uncorrected inference and false-discovery rate correction should be avoided. Second, researchers should aim to include at least 20 experiments in an ALE meta-analysis to achieve sufficient power for moderate effects. We would like to note, though, that these calculations and recommendations are specific to ALE and may not be extrapolated to other approaches for (neuroimaging) meta-analysis. Copyright © 2016 Elsevier Inc. All rights reserved.

  12. Behavior, Sensitivity, and power of activation likelihood estimation characterized by massive empirical simulation

    PubMed Central

    Eickhoff, Simon B.; Nichols, Thomas E.; Laird, Angela R.; Hoffstaedter, Felix; Amunts, Katrin; Fox, Peter T.

    2016-01-01

    Given the increasing number of neuroimaging publications, the automated knowledge extraction on brain-behavior associations by quantitative meta-analyses has become a highly important and rapidly growing field of research. Among several methods to perform coordinate-based neuroimaging meta-analyses, Activation Likelihood Estimation (ALE) has been widely adopted. In this paper, we addressed two pressing questions related to ALE meta-analysis: i) Which thresholding method is most appropriate to perform statistical inference? ii) Which sample size, i.e., number of experiments, is needed to perform robust meta-analyses? We provided quantitative answers to these questions by simulating more than 120,000 meta-analysis datasets using empirical parameters (i.e., number of subjects, number of reported foci, distribution of activation foci) derived from the BrainMap database. This allowed us to characterize the behavior of ALE analyses, to derive first power estimates for neuroimaging meta-analyses, and thus to formulate recommendations for future ALE studies. We show, first, that cluster-level family-wise error (FWE) correction represents the most appropriate method for statistical inference, while voxel-level FWE correction is valid but more conservative. In contrast, uncorrected inference and false-discovery rate correction should be avoided. Second, researchers should aim to include at least 20 experiments in an ALE meta-analysis to achieve sufficient power for moderate effects. We would like to note, though, that these calculations and recommendations are specific to ALE and may not be extrapolated to other approaches for (neuroimaging) meta-analysis. PMID:27179606

  13. Adopting a Patient-Centered Approach to Primary Outcome Analysis of Acute Stroke Trials by Use of a Utility-Weighted Modified Rankin Scale

    PubMed Central

    Chaisinanunkul, Napasri; Adeoye, Opeolu; Lewis, Roger J.; Grotta, James C.; Broderick, Joseph; Jovin, Tudor G.; Nogueira, Raul G.; Elm, Jordan; Graves, Todd; Berry, Scott; Lees, Kennedy R.; Barreto, Andrew D.; Saver, Jeffrey L.

    2015-01-01

    Background and Purpose: Although the modified Rankin Scale (mRS) is the most commonly employed primary endpoint in acute stroke trials, its power is limited when analyzed in dichotomized fashion and its indication of effect size is challenging to interpret when analyzed ordinally. Weighting the seven Rankin levels by utilities may improve scale interpretability while preserving statistical power. Methods: A utility-weighted mRS (UW-mRS) was derived by averaging values from time-tradeoff (patient-centered) and person-tradeoff (clinician-centered) studies. The UW-mRS, standard ordinal mRS, and dichotomized mRS were applied to 11 trials or meta-analyses of acute stroke treatments, including lytic, endovascular reperfusion, blood pressure moderation, and hemicraniectomy interventions. Results: Utility values were 1.0 for mRS 0, 0.91 for mRS 1, 0.76 for mRS 2, 0.65 for mRS 3, 0.33 for mRS 4, and 0 for mRS 5 and 6. For trials with unidirectional treatment effects, the UW-mRS paralleled the ordinal mRS and outperformed dichotomous mRS analyses. Both the UW-mRS and the ordinal mRS were statistically significant in six of eight unidirectional-effect trials, while dichotomous analyses were statistically significant in two to four of eight. In bidirectional-effect trials, both the UW-mRS and ordinal tests captured the divergent treatment effects by showing neutral results, whereas some dichotomized analyses showed positive results. Mean utility differences in trials with statistically significant positive results ranged from 0.026 to 0.249. Conclusion: A utility-weighted mRS performs similarly to the standard ordinal mRS in detecting treatment effects in actual stroke trials and ensures the quantitative outcome is a valid reflection of patient-centered benefits. PMID:26138130
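
    Applying the utility weights is mechanically simple. A sketch using the weights reported above follows; the t-test on weighted scores and the toy outcome distributions are assumptions for illustration, not the trials' actual analyses.

    ```python
    import numpy as np
    from scipy import stats

    # Utility weights for mRS levels 0..6, as reported in the abstract.
    UTILITIES = np.array([1.0, 0.91, 0.76, 0.65, 0.33, 0.0, 0.0])

    def uw_mrs_test(mrs_treat, mrs_ctrl):
        """Compare utility-weighted mRS between arms with a Welch t-test
        (trials may instead use other tests on the weighted scores)."""
        u_t, u_c = UTILITIES[mrs_treat], UTILITIES[mrs_ctrl]
        t, p = stats.ttest_ind(u_t, u_c, equal_var=False)
        return u_t.mean() - u_c.mean(), p

    # Toy example: hypothetical 90-day mRS distributions in two arms.
    rng = np.random.default_rng(3)
    treat = rng.choice(7, size=500, p=[.18, .17, .15, .15, .15, .10, .10])
    ctrl = rng.choice(7, size=500, p=[.12, .14, .14, .16, .18, .12, .14])
    diff, p = uw_mrs_test(treat, ctrl)
    print(f"mean utility difference {diff:.3f}, p = {p:.4f}")
    ```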

  14. The relation between statistical power and inference in fMRI

    PubMed Central

    Wager, Tor D.; Yarkoni, Tal

    2017-01-01

    Statistically underpowered studies can result in experimental failure even when all other experimental considerations have been addressed impeccably. In fMRI the combination of a large number of dependent variables, a relatively small number of observations (subjects), and a need to correct for multiple comparisons can decrease statistical power dramatically. This problem has been clearly addressed yet remains controversial, especially with regard to the expected effect sizes in fMRI, and especially for between-subjects effects such as group comparisons and brain-behavior correlations. We aimed to clarify the power problem by considering and contrasting two simulated scenarios of such possible brain-behavior correlations: weak diffuse effects and strong localized effects. Sampling from these scenarios shows that, particularly in the weak diffuse scenario, common sample sizes (n = 20–30) display extremely low statistical power, poorly represent the actual effects in the full sample, and show large variation across subsequent replications. Empirical data from the Human Connectome Project resembles the weak diffuse scenario much more than the localized strong scenario, which underscores the extent of the power problem for many studies. Possible solutions to the power problem include increasing the sample size, using less stringent thresholds, or focusing on a region-of-interest. However, these approaches are not always feasible and some have major drawbacks. The most prominent solutions that may help address the power problem include model-based (multivariate) prediction methods and meta-analyses with related synthesis-oriented approaches. PMID:29155843

  15. Statistical power for nonequivalent pretest-posttest designs. The impact of change-score versus ANCOVA models.

    PubMed

    Oakes, J M; Feldman, H A

    2001-02-01

    Nonequivalent controlled pretest-posttest designs are central to evaluation science, yet no practical and unified approach for estimating power in the two most widely used analytic approaches to these designs exists. This article fills the gap by presenting and comparing useful, unified power formulas for ANCOVA and change-score analyses, indicating the implications of each on sample-size requirements. The authors close with practical recommendations for evaluators. Mathematical details and a simple spreadsheet approach are included in appendices.
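
    The contrast between the two models rests on a classic error-variance argument: change scores carry error variance proportional to 2(1 − ρ) and ANCOVA to (1 − ρ²), where ρ is the pretest-posttest correlation. The sketch below uses a generic normal approximation, not the article's exact unified formulas.

    ```python
    from scipy import stats

    def n_per_group(delta_sd, alpha=0.05, power=0.80):
        """Generic two-group normal-approximation sample size for a
        standardized difference `delta_sd`."""
        z = stats.norm.ppf(1 - alpha / 2) + stats.norm.ppf(power)
        return 2 * z ** 2 / delta_sd ** 2

    def compare(effect_sd=0.3, rho=0.6):
        """Effective error variance: 2(1 - rho) for change scores versus
        (1 - rho^2) for ANCOVA, given equal pre/post outcome variance and
        pre-post correlation rho."""
        n_change = n_per_group(effect_sd / (2 * (1 - rho)) ** 0.5)
        n_ancova = n_per_group(effect_sd / (1 - rho ** 2) ** 0.5)
        return n_change, n_ancova

    for rho in (0.3, 0.5, 0.7):
        nc, na = compare(rho=rho)
        print(f"rho={rho}: change-score n≈{nc:.0f}, ANCOVA n≈{na:.0f} per group")
    ```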

  16. SAS Code for Calculating Intraclass Correlation Coefficients and Effect Size Benchmarks for Site-Randomized Education Experiments

    ERIC Educational Resources Information Center

    Brandon, Paul R.; Harrison, George M.; Lawton, Brian E.

    2013-01-01

    When evaluators plan site-randomized experiments, they must conduct the appropriate statistical power analyses. These analyses are most likely to be valid when they are based on data from the jurisdictions in which the studies are to be conducted. In this method note, we provide software code, in the form of a SAS macro, for producing statistical…

  17. Annual energy review 1994

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NONE

    1995-07-01

    This 13th edition presents the Energy Information Administration's historical energy statistics. For most series, statistics are given for every year from 1949 through 1994; thus, this report is well suited to long-term trend analyses. It covers all major energy activities, including consumption, production, trade, stocks, and prices for all major energy commodities, including fossil fuels and electricity. Statistics on renewable energy sources are also included; this year, for the first time, usage of renewables by other consumers as well as by electric utilities is included. Also new is a two-part, comprehensive presentation of data on petroleum products supplied by sector for 1949 through 1994. Data from electric utilities and nonutilities are integrated as "electric power industry" data, and nonutility gross power generation data are presented for the first time. One section presents international statistics (for more detail see EIA's International Energy Annual).

  18. An Adaptive Association Test for Multiple Phenotypes with GWAS Summary Statistics.

    PubMed

    Kim, Junghi; Bai, Yun; Pan, Wei

    2015-12-01

    We study the problem of testing for single-marker-multiple-phenotype associations based on genome-wide association study (GWAS) summary statistics, without access to individual-level genotype and phenotype data. Because obtaining summary data for most published GWASs is substantially easier than accessing individual-level phenotype and genotype data, and because multiple correlated traits have often been collected, the problem studied here has become increasingly important. We propose a powerful adaptive test and compare its performance with some existing tests. We illustrate its applications to analyses of a meta-analyzed GWAS dataset with three blood lipid traits and another with sex-stratified anthropometric traits, and further demonstrate its potential power gain over some existing methods through realistic simulation studies. We start from the situation with only one set of (possibly meta-analyzed) genome-wide summary statistics, then extend the method to meta-analysis of multiple sets of genome-wide summary statistics, each from one GWAS. We expect the proposed test to be useful in practice as more powerful than, or complementary to, existing methods. © 2015 WILEY PERIODICALS, INC.

  19. Power considerations for λ inflation factor in meta-analyses of genome-wide association studies.

    PubMed

    Georgiopoulos, Georgios; Evangelou, Evangelos

    2016-05-19

    The genomic control (GC) approach is extensively used to effectively control false positive signals due to population stratification in genome-wide association studies (GWAS). However, GC affects the statistical power of GWAS, and the loss of power depends on the magnitude of the inflation factor (λ) that is used for GC. We simulated meta-analyses of different GWAS. Minor allele frequency (MAF) ranged from 0·001 to 0·5, and λ was sampled from two scenarios: (i) a random scenario (an empirically derived distribution of real λ values) and (ii) a selected scenario based on modification of the simulation parameters. Adjustment for λ was considered under single correction (within-study corrected standard errors) and double correction (an additionally λ-corrected summary estimate). MAF was a pivotal determinant of observed power. In the random λ scenario, double correction induced a symmetric power reduction in comparison with single correction. For MAF 1·2 and MAF >5%. Our results provide a quick but detailed index for power considerations of future meta-analyses of GWAS, enabling a more flexible design from early steps based on the number of studies accumulated in different groups and the λ values observed in the single studies.
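
    The mechanics of λ correction and its cost in power can be sketched as follows; the effect size, the genome-wide threshold, and the simple two-sided approximation are illustrative assumptions rather than the paper's simulation design.

    ```python
    import numpy as np
    from scipy import stats

    def gc_power(mu, lam, alpha=5e-8, double=False, lam_meta=1.0):
        """Approximate two-sided power for a true-effect z-score with
        mean `mu` after genomic-control correction: single correction
        divides z by sqrt(lambda); double correction additionally
        deflates the meta-analytic estimate by a second lambda."""
        z = mu / np.sqrt(lam)
        if double:
            z /= np.sqrt(lam_meta)
        return stats.norm.sf(stats.norm.isf(alpha / 2) - z)

    for lam in (1.0, 1.05, 1.1, 1.2):
        print(f"lambda={lam:.2f}: single {gc_power(5.5, lam):.2f}, "
              f"double {gc_power(5.5, lam, double=True, lam_meta=lam):.2f}")
    ```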

  20. [The application of the prospective space-time statistic in early warning of infectious disease].

    PubMed

    Yin, Fei; Li, Xiao-Song; Feng, Zi-Jian; Ma, Jia-Qi

    2007-06-01

    To investigate the application of the prospective space-time scan statistic to the early detection of infectious disease outbreaks, the statistic was tested by mimicking daily prospective analyses of bacillary dysentery data from Chengdu city in 2005 (3212 cases in 102 towns and villages), and the results were compared with those of the purely temporal scan statistic. The prospective space-time scan statistic gave specific signals in both space and time. The results for June indicated that the prospective space-time scan statistic could detect, in a timely manner, outbreaks that started from a local site, with a strong early warning signal (P = 0.007); the purely temporal scan statistic detected the outbreak two days later, with a weaker signal (P = 0.039). The prospective space-time scan statistic makes full use of the spatial and temporal information in infectious disease data and can detect, in a timely and effective way, outbreaks that start from local sites. It could be an important tool for local and national CDCs setting up early detection surveillance systems.

  1. Inappropriate Fiddling with Statistical Analyses to Obtain a Desirable P-value: Tests to Detect its Presence in Published Literature

    PubMed Central

    Gadbury, Gary L.; Allison, David B.

    2012-01-01

    Much has been written regarding p-values below certain thresholds (most notably 0.05) denoting statistical significance and the tendency of such p-values to be more readily publishable in peer-reviewed journals. Intuition suggests that there may be a tendency to manipulate statistical analyses to push a “near significant p-value” to a level that is considered significant. This article presents a method for detecting the presence of such manipulation (herein called “fiddling”) in a distribution of p-values from independent studies. Simulations are used to illustrate the properties of the method. The results suggest that the method has low type I error and that power approaches acceptable levels as the number of p-values being studied approaches 1000. PMID:23056287

  2. Inappropriate fiddling with statistical analyses to obtain a desirable p-value: tests to detect its presence in published literature.

    PubMed

    Gadbury, Gary L; Allison, David B

    2012-01-01

    Much has been written regarding p-values below certain thresholds (most notably 0.05) denoting statistical significance and the tendency of such p-values to be more readily publishable in peer-reviewed journals. Intuition suggests that there may be a tendency to manipulate statistical analyses to push a "near significant p-value" to a level that is considered significant. This article presents a method for detecting the presence of such manipulation (herein called "fiddling") in a distribution of p-values from independent studies. Simulations are used to illustrate the properties of the method. The results suggest that the method has low type I error and that power approaches acceptable levels as the number of p-values being studied approaches 1000.

  3. Analysis and meta-analysis of single-case designs with a standardized mean difference statistic: a primer and applications.

    PubMed

    Shadish, William R; Hedges, Larry V; Pustejovsky, James E

    2014-04-01

    This article presents a d-statistic for single-case designs that is in the same metric as the d-statistic used in between-subjects designs such as randomized experiments and offers some reasons why such a statistic would be useful in SCD research. The d has a formal statistical development, is accompanied by appropriate power analyses, and can be estimated using user-friendly SPSS macros. We discuss both advantages and disadvantages of d compared to other approaches such as previous d-statistics, overlap statistics, and multilevel modeling. It requires at least three cases for computation and assumes normally distributed outcomes and stationarity, assumptions that are discussed in some detail. We also show how to test these assumptions. The core of the article then demonstrates in depth how to compute d for one study, including estimation of the autocorrelation and the ratio of between-case variance to total variance (between-case plus within-case variance), how to compute power using a macro, and how to use the d to conduct a meta-analysis of studies using single-case designs in the free program R, including syntax in an appendix. This syntax includes how to read data, compute fixed and random effect average effect sizes, prepare a forest plot and a cumulative meta-analysis, estimate various influence statistics to identify studies contributing to heterogeneity and effect size, and do various kinds of publication bias analyses. This d may prove useful for both the analysis and meta-analysis of data from SCDs. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
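
    The meta-analytic side of the workflow (for which the article supplies R syntax) can be sketched language-agnostically with the standard DerSimonian-Laird estimator; the study-level d estimates and variances below are invented.

    ```python
    import numpy as np

    def dersimonian_laird(d, var_d):
        """Fixed- and random-effects averages of effect sizes, as in a
        typical meta-analysis of study-level d-statistics."""
        d, var_d = np.asarray(d, float), np.asarray(var_d, float)
        w = 1 / var_d
        d_fixed = (w * d).sum() / w.sum()
        q = (w * (d - d_fixed) ** 2).sum()              # heterogeneity Q
        c = w.sum() - (w ** 2).sum() / w.sum()
        tau2 = max(0.0, (q - (len(d) - 1)) / c)         # between-study variance
        w_re = 1 / (var_d + tau2)
        d_random = (w_re * d).sum() / w_re.sum()
        se_random = (1 / w_re.sum()) ** 0.5
        return d_fixed, d_random, se_random, tau2

    # Toy d estimates and variances from, say, five single-case studies.
    print(dersimonian_laird([0.4, 0.6, 0.3, 0.9, 0.5],
                            [0.04, 0.06, 0.05, 0.10, 0.03]))
    ```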

  4. Measuring the statistical validity of summary meta-analysis and meta-regression results for use in clinical practice.

    PubMed

    Willis, Brian H; Riley, Richard D

    2017-09-20

    An important question for clinicians appraising a meta-analysis is: are the findings likely to be valid in their own practice-does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity-where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how we may test meta-analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
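
    The Vn statistic and its reference distribution are derived in the paper; only the leave-one-out machinery on which it builds is sketched here, with invented study estimates and variances.

    ```python
    import numpy as np

    def loo_standardized_diffs(theta, var):
        """Leave-one-out cross-validation for a fixed-effect meta-analysis:
        each study is compared with the pooled estimate of the others.
        (Only the machinery is sketched; the paper's Vn statistic is
        derived from quantities like these.)"""
        theta, var = np.asarray(theta, float), np.asarray(var, float)
        w = 1 / var
        out = []
        for i in range(len(theta)):
            keep = np.arange(len(theta)) != i
            pooled = (w[keep] * theta[keep]).sum() / w[keep].sum()
            pooled_var = 1 / w[keep].sum()
            out.append((theta[i] - pooled) / np.sqrt(var[i] + pooled_var))
        return np.array(out)

    print(loo_standardized_diffs([0.30, 0.25, 0.60, 0.28],
                                 [0.02, 0.03, 0.02, 0.04]))
    ```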

  5. Public benefits of public power. [Booklet

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    1980-01-01

    The principal characteristics and benefits of public power are described using a question-and-answer format. The book begins by defining public power, describing its history, and confirming that people have a right to choose. The answers to questions about the benefits of public power are grouped under three major headings: rates and local control; economic and political benefits; and power supply and consumption. Establishing community public systems is hard work, requiring a progression through local government authorization, legal and financial analyses, public persuasion, voter approval, and a bond issue. Electric utility statistics show that local public systems outnumber all other types of ownership. (DCK)

  6. A statistical spatial power spectrum of the Earth's lithospheric magnetic field

    NASA Astrophysics Data System (ADS)

    Thébault, E.; Vervelidou, F.

    2015-05-01

    The magnetic field of the Earth's lithosphere arises from rock magnetization contrasts that were shaped over geological times. The field can be described mathematically in spherical harmonics or with distributions of magnetization. We exploit this dual representation and assume that the lithospheric field is induced by spatially varying susceptibility values within a shell of constant thickness. By introducing a statistical assumption about the power spectrum of the susceptibility, we then derive a statistical expression for the spatial power spectrum of the crustal magnetic field for spatial scales ranging from 60 to 2500 km. This expression depends on the mean induced magnetization, the thickness of the shell, and a power-law exponent for the power spectrum of the susceptibility. We test the relevance of this form with a misfit analysis against the observational NGDC-720 lithospheric magnetic field model power spectrum. This allows us to estimate, at 95 per cent confidence, a mean global apparent induced magnetization between 0.3 and 0.6 A m-1, a mean magnetic crustal thickness between 23 and 30 km, and a root-mean-square field value between 190 and 205 nT. These estimates are in good agreement with independent models of crustal magnetization and of seismic crustal thickness. We carry out the same analysis in the continental and oceanic domains separately. We complement the misfit analyses with a Kolmogorov-Smirnov goodness-of-fit test and conclude that the observed power spectrum is in each case consistent with a sample of the statistical one.

  7. Statistical power and effect sizes of depression research in Japan.

    PubMed

    Okumura, Yasuyuki; Sakamoto, Shinji

    2011-06-01

    Few studies have been conducted on the rationales for using interpretive guidelines for effect size, and most of the previous statistical power surveys have covered broad research domains. The present study aimed to estimate the statistical power and to obtain realistic target effect sizes of depression research in Japan. We systematically reviewed 18 leading journals of psychiatry and psychology in Japan and identified 974 depression studies that were mentioned in 935 articles published between 1990 and 2006. In 392 studies, logistic regression analyses revealed that using clinical populations was independently associated with being a statistical power of <0.80 (odds ratio 5.9, 95% confidence interval 2.9-12.0) and of <0.50 (odds ratio 4.9, 95% confidence interval 2.3-10.5). Of the studies using clinical populations, 80% did not achieve a power of 0.80 or more, and 44% did not achieve a power of 0.50 or more to detect the medium population effect sizes. A predictive model for the proportion of variance explained was developed using a linear mixed-effects model. The model was then used to obtain realistic target effect sizes in defined study characteristics. In the face of a real difference or correlation in population, many depression researchers are less likely to give a valid result than simply tossing a coin. It is important to educate depression researchers in order to enable them to conduct an a priori power analysis. © 2011 The Authors. Psychiatry and Clinical Neurosciences © 2011 Japanese Society of Psychiatry and Neurology.

  8. The Power Prior: Theory and Applications

    PubMed Central

    Ibrahim, Joseph G.; Chen, Ming-Hui; Gwon, Yeongjin; Chen, Fang

    2015-01-01

    The power prior has been widely used in many applications covering a large number of disciplines. The power prior is intended to be an informative prior constructed from historical data. It has been used in clinical trials, genetics, health care, psychology, environmental health, engineering, economics, and business. It has also been applied for a wide variety of models and settings, both in the experimental design and analysis contexts. In this review article, we give an A to Z exposition of the power prior and its applications to date. We review its theoretical properties, variations in its formulation, statistical contexts for which it has been used, applications, and its advantages over other informative priors. We review models for which it has been used, including generalized linear models, survival models, and random effects models. Statistical areas where the power prior has been used include model selection, experimental design, hierarchical modeling, and conjugate priors. Frequentist properties of power priors in posterior inference are established and a simulation study is conducted to further examine the empirical performance of the posterior estimates with power priors. Real data analyses are given illustrating the power prior as well as the use of the power prior in the Bayesian design of clinical trials. PMID:26346180
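
    For reference, the conditional power prior reviewed in the article takes the standard form below, with historical data D0, initial prior π0, and discounting parameter a0.

    ```latex
    % Power prior: the historical likelihood is raised to a_0 and combined
    % with the initial prior; a_0 = 0 discards the historical data, while
    % a_0 = 1 pools it fully with the current data.
    \pi(\theta \mid D_0, a_0) \;\propto\; L(\theta \mid D_0)^{a_0}\,\pi_0(\theta),
    \qquad a_0 \in [0, 1]
    ```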

  9. MGAS: a powerful tool for multivariate gene-based genome-wide association analysis.

    PubMed

    Van der Sluis, Sophie; Dolan, Conor V; Li, Jiang; Song, Youqiang; Sham, Pak; Posthuma, Danielle; Li, Miao-Xin

    2015-04-01

    Standard genome-wide association studies, testing the association between one phenotype and a large number of single nucleotide polymorphisms (SNPs), are limited in two ways: (i) traits are often multivariate, and analysis of composite scores entails a loss in statistical power; and (ii) gene-based analyses may be preferred, e.g. to decrease the multiple testing problem. Here we present a new method, multivariate gene-based association test by extended Simes procedure (MGAS), that allows gene-based testing of multivariate phenotypes in unrelated individuals. Through extensive simulation, we show that under most trait-generating genotype-phenotype models MGAS has superior statistical power to detect associated genes compared with gene-based analyses of univariate phenotypic composite scores (i.e. GATES, multiple regression), and multivariate analysis of variance (MANOVA). Re-analysis of metabolic data revealed 32 False Discovery Rate controlled genome-wide significant genes, and 12 regions harboring multiple genes; of these 44 regions, 30 were not reported in the original analysis. MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with an incorrectly specified genotype-phenotype model. MGAS is freely available in KGG v3.0 (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php). Access to the metabolic dataset can be requested at dbGaP (https://dbgap.ncbi.nlm.nih.gov/). The R-simulation code is available from http://ctglab.nl/people/sophie_van_der_sluis. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
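
    MGAS builds on an extended Simes procedure. The plain Simes combination it extends is easy to sketch; GATES and MGAS refine it by replacing the raw SNP count with effective numbers of tests estimated from LD (and, in MGAS, cross-phenotype correlation), a refinement not reproduced here.

    ```python
    import numpy as np

    def simes(pvals):
        """Simes combination of SNP p-values into one gene-level p-value:
        min over j of m * p_(j) / j, with p_(1) <= ... <= p_(m)."""
        p = np.sort(np.asarray(pvals, float))
        m = len(p)
        return (m * p / np.arange(1, m + 1)).min()

    # Toy example: p-values for six SNPs in one gene, one phenotype.
    print(simes([0.002, 0.03, 0.04, 0.20, 0.55, 0.80]))
    ```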

  10. STR data for 15 autosomal STR markers from Paraná (Southern Brazil).

    PubMed

    Alves, Hemerson B; Leite, Fábio P N; Sotomaior, Vanessa S; Rueda, Fábio F; Silva, Rosane; Moura-Neto, Rodrigo S

    2014-03-01

    Allelic frequencies, forensic parameters, and statistical parameters were calculated for 15 autosomal STR loci using the AmpFℓSTR® Identifiler™ kit. All loci reached Hardy-Weinberg equilibrium. The combined power of discrimination and the mean power of exclusion were 0.999999999999999999 and 0.9999993, respectively. The MDS plot and NJ tree analysis, generated from the FST matrix, corroborated the mainly European-derived origin of the Paraná population. The combination of these 15 STR loci represents a powerful tool for individual identification and parentage analyses in the Paraná population.

  11. Metal and physico-chemical variations at a hydroelectric reservoir analyzed by Multivariate Analyses and Artificial Neural Networks: environmental management and policy/decision-making tools.

    PubMed

    Cavalcante, Y L; Hauser-Davis, R A; Saraiva, A C F; Brandão, I L S; Oliveira, T F; Silveira, A M

    2013-01-01

    This paper compared and evaluated seasonal variations in physico-chemical parameters and metals at a hydroelectric power station reservoir by applying Multivariate Analyses and Artificial Neural Networks (ANN) statistical techniques. A Factor Analysis was used to reduce the number of variables: the first factor was composed of elements Ca, K, Mg and Na, and the second by Chemical Oxygen Demand. The ANN showed 100% correct classifications in training and validation samples. Physico-chemical analyses showed that water pH values were not statistically different between the dry and rainy seasons, while temperature, conductivity, alkalinity, ammonia and DO were higher in the dry period. TSS, hardness and COD, on the other hand, were higher during the rainy season. The statistical analyses showed that Ca, K, Mg and Na are directly connected to the Chemical Oxygen Demand, which indicates a possibility of their input into the reservoir system by domestic sewage and agricultural run-offs. These statistical applications, thus, are also relevant in cases of environmental management and policy decision-making processes, to identify which factors should be further studied and/or modified to recover degraded or contaminated water bodies. Copyright © 2012 Elsevier B.V. All rights reserved.

  12. Ataxia Telangiectasia–Mutated Gene Polymorphisms and Acute Normal Tissue Injuries in Cancer Patients After Radiation Therapy: A Systematic Review and Meta-analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dong, Lihua; Cui, Jingkun; Tang, Fengjiao

    Purpose: Studies of the association between ataxia telangiectasia–mutated (ATM) gene polymorphisms and acute radiation injuries are often small in sample size, and the results are inconsistent. We conducted the first meta-analysis to provide a systematic review of published findings. Methods and Materials: Publications were identified by searching PubMed up to April 25, 2014. Primary meta-analysis was performed for all acute radiation injuries, and subgroup meta-analyses were based on clinical endpoint. The influence of sample size and radiation injury incidence on genetic effects was estimated in sensitivity analyses. Power calculations were also conducted. Results: The meta-analysis was conducted on the ATM polymorphism rs1801516, including 5 studies with 1588 participants. For all studies, the cut-off for differentiating cases from controls was grade 2 acute radiation injuries. The primary meta-analysis showed a significant association with overall acute radiation injuries (allelic model: odds ratio = 1.33, 95% confidence interval: 1.04-1.71). Subgroup analyses detected an association between the rs1801516 polymorphism and a significant increase in urinary and lower gastrointestinal injuries and an increase in skin injury that was not statistically significant. There was no between-study heterogeneity in any meta-analyses. In the sensitivity analyses, small studies did not show larger effects than large studies. In addition, studies with high incidence of acute radiation injuries showed larger effects than studies with low incidence. Power calculations revealed that the statistical power of the primary meta-analysis was borderline, whereas there was adequate power for the subgroup analysis of studies with high incidence of acute radiation injuries. Conclusions: Our meta-analysis showed a consistency of the results from the overall and subgroup analyses. We also showed that the genetic effect of the rs1801516 polymorphism on acute radiation injuries was dependent on the incidence of the injury. These findings support the evidence of an association between the rs1801516 polymorphism and acute radiation injuries, encouraging further research on this topic.

  13. Measuring the statistical validity of summary meta‐analysis and meta‐regression results for use in clinical practice

    PubMed Central

    Riley, Richard D.

    2017-01-01

    An important question for clinicians appraising a meta‐analysis is: are the findings likely to be valid in their own practice—does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity—where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple (‘leave‐one‐out’) cross‐validation technique, we demonstrate how we may test meta‐analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta‐analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta‐analysis and a tailored meta‐regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within‐study variance, between‐study variance, study sample size, and the number of studies in the meta‐analysis. Finally, we apply Vn to two published meta‐analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta‐analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28620945

  14. Can power-law scaling and neuronal avalanches arise from stochastic dynamics?

    PubMed

    Touboul, Jonathan; Destexhe, Alain

    2010-02-11

    The presence of self-organized criticality in biology is often evidenced by a power-law scaling of event size distributions, which can be measured by linear regression on logarithmic axes. We show here that such a procedure does not necessarily mean that the system exhibits self-organized criticality. We first provide an analysis of multisite local field potential (LFP) recordings of brain activity and show that event size distributions defined as negative LFP peaks can be close to power-law distributions. However, this result is not robust to change in detection threshold, or when tested using more rigorous statistical analyses such as the Kolmogorov-Smirnov test. Similar power-law scaling is observed for surrogate signals, suggesting that power-law scaling may be a generic property of thresholded stochastic processes. We next investigate this problem analytically, and show that, indeed, stochastic processes can produce spurious power-law scaling without the presence of underlying self-organized criticality. However, this power-law is only apparent in logarithmic representations, and does not survive more rigorous analysis such as the Kolmogorov-Smirnov test. The same analysis was also performed on an artificial network known to display self-organized criticality. In this case, both the graphical representations and the rigorous statistical analysis reveal with no ambiguity that the avalanche size is distributed as a power-law. We conclude that logarithmic representations can lead to spurious power-law scaling induced by the stochastic nature of the phenomenon. This apparent power-law scaling does not constitute a proof of self-organized criticality, which should be demonstrated by more stringent statistical tests.
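
    The more rigorous route the authors advocate, maximum-likelihood fitting of the tail exponent followed by a Kolmogorov-Smirnov check rather than regression on log-log axes, can be sketched as follows; the choice of x_min and the comparison distributions are illustrative.

    ```python
    import numpy as np

    def fit_power_law(x, x_min):
        """Continuous power-law fit above x_min via the standard MLE
        alpha = 1 + n / sum(log(x_i / x_min)), plus the Kolmogorov-Smirnov
        distance between the empirical and fitted tail CDFs."""
        x = np.sort(np.asarray(x, dtype=float))
        tail = x[x >= x_min]
        alpha = 1 + len(tail) / np.log(tail / x_min).sum()
        cdf_model = 1 - (tail / x_min) ** (1 - alpha)
        cdf_emp = np.arange(1, len(tail) + 1) / len(tail)
        return alpha, np.abs(cdf_emp - cdf_model).max()

    rng = np.random.default_rng(5)
    # A true power law (alpha = 2.5) versus a lognormal that merely looks
    # roughly linear on log-log axes.
    pl = (1 - rng.random(5000)) ** (-1 / 1.5)
    ln = rng.lognormal(mean=1.0, sigma=1.2, size=5000)
    print("power law :", fit_power_law(pl, 1.0))
    print("lognormal :", fit_power_law(ln, np.median(ln)))
    ```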

  15. Treatment of Missing Data in Workforce Education Research

    ERIC Educational Resources Information Center

    Gemici, Sinan; Rojewski, Jay W.; Lee, In Heok

    2012-01-01

    Most quantitative analyses in workforce education are affected by missing data. Traditional approaches to remedy missing data problems often result in reduced statistical power and biased parameter estimates due to systematic differences between missing and observed values. This article examines the treatment of missing data in pertinent…

  16. Assessment and statistics of surgically induced astigmatism.

    PubMed

    Naeser, Kristian

    2008-05-01

    The aim of the thesis was to develop methods for assessment of surgically induced astigmatism (SIA) in individual eyes and in groups of eyes. The thesis is based on 12 peer-reviewed publications, published over a period of 16 years. In these publications, older and contemporary literature was reviewed(1). A new method (the polar system) for analysis of SIA was developed. Multivariate statistical analysis of refractive data was described(2-4). Clinical validation studies were performed. Descriptions of a cylinder surface with polar values and with differential geometry were compared. The main results were: refractive data in the form of sphere, cylinder and axis may define an individual patient or data set, but are unsuited for mathematical and statistical analyses(1). The polar value system converts net astigmatisms to orthonormal components in dioptric space. A polar value is the difference in meridional power between two orthogonal meridians(5,6). Any pair of polar values, separated by an arc of 45 degrees, characterizes a net astigmatism completely(7). The two polar values represent the net curvital and net torsional power over the chosen meridian(8). The spherical component is described by the spherical equivalent power. Several clinical studies demonstrated the efficiency of multivariate statistical analysis of refractive data(4,9-11). Polar values and formal differential geometry describe astigmatic surfaces with similar concepts and mathematical functions(8). Other contemporary methods, such as Long's power matrix, Holladay's and Alpins' methods, and Zernike(12) and Fourier analyses(8), are correlated with the polar value system. In conclusion, analysis of SIA should be performed with polar values or other contemporary component systems. The study was supported by Statens Sundhedsvidenskabeligt Forskningsråd, Cykelhandler P. Th. Rasmussen og Hustrus Mindelegat, Hotelejer Carl Larsen og Hustru Nicoline Larsens Mindelegat, Landsforeningen til Vaern om Synet, Forskningsinitiativet for Arhus Amt, Alcon Denmark, and Desirée and Niels Ydes Fond.
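
    Naeser's polar values are developed in the cited publications. A closely related orthonormal decomposition, the Thibos-style power vector (spherical equivalent plus two astigmatic components over meridians 45 degrees apart), can be sketched as follows, with invented refractions.

    ```python
    import numpy as np

    def power_vector(sphere, cyl, axis_deg):
        """Convert sphere/cylinder/axis to orthonormal components
        (Thibos-style power vectors), which, like polar values, are
        suitable for averaging and multivariate statistics."""
        ax = np.deg2rad(axis_deg)
        m = sphere + cyl / 2.0                # spherical equivalent
        j0 = -(cyl / 2.0) * np.cos(2 * ax)    # with/against-the-rule component
        j45 = -(cyl / 2.0) * np.sin(2 * ax)   # oblique component
        return m, j0, j45

    # Surgically induced astigmatism as a component-wise difference
    # between pre- and postoperative refractions (invented values).
    pre = power_vector(-2.00, -1.50, 90)
    post = power_vector(-0.25, -0.75, 100)
    sia = tuple(round(b - a, 3) for a, b in zip(pre, post))
    print("SIA (dM, dJ0, dJ45):", sia)
    ```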

  17. Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data.

    PubMed

    Tintle, Nathan L; Sitarik, Alexandra; Boerema, Benjamin; Young, Kylie; Best, Aaron A; Dejongh, Matthew

    2012-08-08

    Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.

  18. Cluster mass inference via random field theory.

    PubMed

    Zhang, Hui; Nichols, Thomas E; Johnson, Timothy D

    2009-01-01

    Cluster extent and voxel intensity are two widely used statistics in neuroimaging inference. Cluster extent is sensitive to spatially extended signals while voxel intensity is better for intense but focal signals. In order to leverage strength from both statistics, several nonparametric permutation methods have been proposed to combine the two methods. Simulation studies have shown that of the different cluster permutation methods, the cluster mass statistic is generally the best. However, to date, there is no parametric cluster mass inference available. In this paper, we propose a cluster mass inference method based on random field theory (RFT). We develop this method for Gaussian images, evaluate it on Gaussian and Gaussianized t-statistic images and investigate its statistical properties via simulation studies and real data. Simulation results show that the method is valid under the null hypothesis and demonstrate that it can be more powerful than the cluster extent inference method. Further, analyses with a single subject and a group fMRI dataset demonstrate better power than traditional cluster size inference, and good accuracy relative to a gold-standard permutation test.

  19. Adopting a Patient-Centered Approach to Primary Outcome Analysis of Acute Stroke Trials Using a Utility-Weighted Modified Rankin Scale.

    PubMed

    Chaisinanunkul, Napasri; Adeoye, Opeolu; Lewis, Roger J; Grotta, James C; Broderick, Joseph; Jovin, Tudor G; Nogueira, Raul G; Elm, Jordan J; Graves, Todd; Berry, Scott; Lees, Kennedy R; Barreto, Andrew D; Saver, Jeffrey L

    2015-08-01

    Although the modified Rankin Scale (mRS) is the most commonly used primary end point in acute stroke trials, its power is limited when analyzed in dichotomized fashion, and its indication of effect size is challenging to interpret when analyzed ordinally. Weighting the 7 Rankin levels by utilities may improve scale interpretability while preserving statistical power. A utility-weighted mRS (UW-mRS) was derived by averaging values from time-tradeoff (patient centered) and person-tradeoff (clinician centered) studies. The UW-mRS, standard ordinal mRS, and dichotomized mRS were applied to 11 trials or meta-analyses of acute stroke treatments, including lytic, endovascular reperfusion, blood pressure moderation, and hemicraniectomy interventions. Utility values were 1.0 for mRS level 0; 0.91 for mRS level 1; 0.76 for mRS level 2; 0.65 for mRS level 3; 0.33 for mRS level 4; 0 for mRS level 5; and 0 for mRS level 6. For trials with unidirectional treatment effects, the UW-mRS paralleled the ordinal mRS and outperformed dichotomous mRS analyses. Both the UW-mRS and the ordinal mRS were statistically significant in 6 of 8 unidirectional effect trials, whereas dichotomous analyses were statistically significant in 2 to 4 of 8. In bidirectional effect trials, both the UW-mRS and ordinal tests captured the divergent treatment effects by showing neutral results, whereas some dichotomized analyses showed positive results. Mean utility differences in trials with statistically significant positive results ranged from 0.026 to 0.249. A UW-mRS performs similarly to the standard ordinal mRS in detecting treatment effects in actual stroke trials and ensures that the quantitative outcome is a valid reflection of patient-centered benefits. © 2015 American Heart Association, Inc.
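
    The utility weighting itself is a one-line transformation once the weights are fixed. A minimal sketch using the utility values reported above; the mRS distributions of the two arms are invented for illustration:

    ```python
    import numpy as np
    from scipy import stats

    # Utility weights for mRS levels 0-6, as reported in the record above.
    UW = np.array([1.0, 0.91, 0.76, 0.65, 0.33, 0.0, 0.0])

    # Hypothetical 90-day mRS scores for two trial arms (illustrative only).
    rng = np.random.default_rng(1)
    treated = rng.choice(7, size=250, p=[.18, .20, .17, .15, .13, .09, .08])
    control = rng.choice(7, size=250, p=[.12, .16, .16, .17, .16, .11, .12])

    u_t, u_c = UW[treated], UW[control]           # map mRS levels to utilities
    diff = u_t.mean() - u_c.mean()                # mean utility difference
    t, p = stats.ttest_ind(u_t, u_c)
    print(f"mean utility difference = {diff:.3f} (t = {t:.2f}, p = {p:.3g})")
    ```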

  20. Effective Analysis of Reaction Time Data

    ERIC Educational Resources Information Center

    Whelan, Robert

    2008-01-01

    Most analyses of reaction time (RT) data are conducted by using the statistical techniques with which psychologists are most familiar, such as analysis of variance on the sample mean. Unfortunately, these methods are usually inappropriate for RT data, because they have little power to detect genuine differences in RT between conditions. In…

  1. Multiplicity Control in Structural Equation Modeling

    ERIC Educational Resources Information Center

    Cribbie, Robert A.

    2007-01-01

    Researchers conducting structural equation modeling analyses rarely, if ever, control for the inflated probability of Type I errors when evaluating the statistical significance of multiple parameters in a model. In this study, the Type I error control, power and true model rates of familywise and false discovery rate controlling procedures were…

  2. Spatial variability effects on precision and power of forage yield estimation

    USDA-ARS?s Scientific Manuscript database

    Spatial analyses of yield trials are important, as they adjust cultivar means for spatial variation and improve the statistical precision of yield estimation. While the relative efficiency of spatial analysis has been frequently reported in several yield trials, its application on long-term forage y...

  3. Psychology, Science, and Knowledge Construction: Broadening Perspectives from the Replication Crisis.

    PubMed

    Shrout, Patrick E; Rodgers, Joseph L

    2018-01-04

    Psychology advances knowledge by testing statistical hypotheses using empirical observations and data. The expectation is that most statistically significant findings can be replicated in new data and in new laboratories, but in practice many findings have replicated less often than expected, leading to claims of a replication crisis. We review recent methodological literature on questionable research practices, meta-analysis, and power analysis to explain the apparently high rates of failure to replicate. Psychologists can improve research practices to advance knowledge in ways that improve replicability. We recommend that researchers adopt open science conventions of preregistration and full disclosure and that replication efforts be based on multiple studies rather than on a single replication attempt. We call for more sophisticated power analyses, careful consideration of the various influences on effect sizes, and more complete disclosure of nonsignificant as well as statistically significant findings.

  4. Analysis of broadcasting satellite service feeder link power control and polarization

    NASA Technical Reports Server (NTRS)

    Sullivan, T. M.

    1982-01-01

    Statistical analyses of carrier to interference power ratios (C/Is) were performed in assessing 17.5 GHz feeder links using (1) fixed power and power control, and (2) orthogonal linear and orthogonal circular polarizations. The analysis methods and attenuation/depolarization database were based on CCIR findings to the greatest possible extent. Feeder links using adaptive power control were found neither to cause nor to suffer significant C/I degradation relative to that for fixed-power feeder links having similar or less stringent availability objectives. The C/Is for sharing between orthogonal linearly polarized feeder links were found to be significantly higher than those for circular polarization only in links to nominally colocated satellites from nominally colocated Earth stations in high-attenuation environments.

  5. The power prior: theory and applications.

    PubMed

    Ibrahim, Joseph G; Chen, Ming-Hui; Gwon, Yeongjin; Chen, Fang

    2015-12-10

    The power prior has been widely used in many applications covering a large number of disciplines. The power prior is intended to be an informative prior constructed from historical data. It has been used in clinical trials, genetics, health care, psychology, environmental health, engineering, economics, and business. It has also been applied for a wide variety of models and settings, both in the experimental design and analysis contexts. In this review article, we give an A-to-Z exposition of the power prior and its applications to date. We review its theoretical properties, variations in its formulation, statistical contexts for which it has been used, applications, and its advantages over other informative priors. We review models for which it has been used, including generalized linear models, survival models, and random effects models. Statistical areas where the power prior has been used include model selection, experimental design, hierarchical modeling, and conjugate priors. Frequentist properties of power priors in posterior inference are established, and a simulation study is conducted to further examine the empirical performance of the posterior estimates with power priors. Real data analyses are given illustrating the power prior as well as the use of the power prior in the Bayesian design of clinical trials. Copyright © 2015 John Wiley & Sons, Ltd.
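
    For reference, the basic construction (following the standard formulation of Ibrahim and Chen) conditions an initial prior on historical data D0 raised to a discounting power a0:

    ```latex
    \pi(\theta \mid D_0, a_0) \;\propto\; L(\theta \mid D_0)^{a_0}\, \pi_0(\theta),
    \qquad 0 \le a_0 \le 1,
    ```

    so that the posterior given the current data D becomes

    ```latex
    \pi(\theta \mid D, D_0, a_0) \;\propto\; L(\theta \mid D)\, L(\theta \mid D_0)^{a_0}\, \pi_0(\theta).
    ```

    Setting a0 = 0 discards the historical data entirely, while a0 = 1 pools D0 with the current data as if both came from a single study; intermediate values discount the historical information smoothly.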

  6. Wave energy resource of Brazil: An analysis from 35 years of ERA-Interim reanalysis data

    PubMed Central

    Araújo, Alex Maurício

    2017-01-01

    This paper presents a characterization of the wave power resource and an analysis of the wave power output of three different wave energy converters (WECs: AquaBuoy, Pelamis, and Wave Dragon) over the Brazilian offshore. To do so, a 35-year reanalysis database from the ERA-Interim project was used. Annual and seasonal statistical analyses of significant height and energy period were performed, and the directional variability of the incident waves was evaluated. The wave power resource was characterized in terms of the statistical parameters of mean, maximum, 95th percentile, and standard deviation, and in terms of the temporal variability coefficients COV, SV, and MV. From these analyses, the total annual wave power resource available over the Brazilian offshore was estimated at 89.97 GW, with the largest mean wave power of 20.63 kW/m in the southernmost part of the study area. The analysis of the three WECs was based on the annual wave energy output and the capacity factor. The highest capacity factor was 21.85%, for the Pelamis device in the southern region of the study area. PMID:28817731

  7. Wave energy resource of Brazil: An analysis from 35 years of ERA-Interim reanalysis data.

    PubMed

    Espindola, Rafael Luz; Araújo, Alex Maurício

    2017-01-01

    This paper presents a characterization of the wave power resource and an analysis of the wave power output of three different wave energy converters (WECs: AquaBuoy, Pelamis, and Wave Dragon) over the Brazilian offshore. To do so, a 35-year reanalysis database from the ERA-Interim project was used. Annual and seasonal statistical analyses of significant height and energy period were performed, and the directional variability of the incident waves was evaluated. The wave power resource was characterized in terms of the statistical parameters of mean, maximum, 95th percentile, and standard deviation, and in terms of the temporal variability coefficients COV, SV, and MV. From these analyses, the total annual wave power resource available over the Brazilian offshore was estimated at 89.97 GW, with the largest mean wave power of 20.63 kW/m in the southernmost part of the study area. The analysis of the three WECs was based on the annual wave energy output and the capacity factor. The highest capacity factor was 21.85%, for the Pelamis device in the southern region of the study area.
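
    The resource statistics described in these two records rest on the standard deep-water wave energy flux, P = rho g^2 Hs^2 Te / (64 pi), roughly 0.49 Hs^2 Te kW/m. A minimal sketch of the per-location computation, with invented hourly series standing in for the reanalysis data:

    ```python
    import numpy as np

    rho, g = 1025.0, 9.81                      # seawater density (kg/m^3), gravity (m/s^2)

    def wave_power_kw_per_m(hs, te):
        """Deep-water wave energy flux P = rho*g^2/(64*pi) * Hs^2 * Te, in kW/m."""
        return rho * g**2 / (64 * np.pi) * hs**2 * te / 1000.0

    # Illustrative hourly series of significant wave height (m) and energy period (s).
    rng = np.random.default_rng(2)
    hs = rng.gamma(shape=4.0, scale=0.4, size=24 * 365)
    te = rng.normal(loc=8.0, scale=1.0, size=24 * 365)

    p = wave_power_kw_per_m(hs, te)
    cov = p.std() / p.mean()                   # temporal variability coefficient (COV)
    print(f"mean = {p.mean():.1f} kW/m, max = {p.max():.1f}, "
          f"95th pct = {np.percentile(p, 95):.1f}, COV = {cov:.2f}")
    ```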

  8. Statistics for Radiology Research.

    PubMed

    Obuchowski, Nancy A; Subhas, Naveen; Polster, Joshua

    2017-02-01

    Biostatistics is an essential component in most original research studies in imaging. In this article we discuss five key statistical concepts for study design and analyses in modern imaging research: statistical hypothesis testing, particularly focusing on noninferiority studies; imaging outcomes especially when there is no reference standard; dealing with the multiplicity problem without spending all your study power; relevance of confidence intervals in reporting and interpreting study results; and finally tools for assessing quantitative imaging biomarkers. These concepts are presented first as examples of conversations between investigator and biostatistician, and then more detailed discussions of the statistical concepts follow. Three skeletal radiology examples are used to illustrate the concepts. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.

  9. graph-GPA: A graphical model for prioritizing GWAS results and investigating pleiotropic architecture.

    PubMed

    Chung, Dongjun; Kim, Hang J; Zhao, Hongyu

    2017-02-01

    Genome-wide association studies (GWAS) have identified tens of thousands of genetic variants associated with hundreds of phenotypes and diseases, which have provided clinical and medical benefits to patients with novel biomarkers and therapeutic targets. However, identification of risk variants associated with complex diseases remains challenging, as these diseases are often affected by many genetic variants with small or moderate effects. There has been accumulating evidence suggesting that different complex traits share a common risk basis, namely pleiotropy. Recently, several statistical methods have been developed to improve statistical power to identify risk variants for complex traits through a joint analysis of multiple GWAS datasets by leveraging pleiotropy. While these methods were shown to improve statistical power for association mapping compared to separate analyses, they are still limited in the number of phenotypes that can be integrated. In order to address this challenge, in this paper we propose a novel statistical framework, graph-GPA, to integrate a large number of GWAS datasets for multiple phenotypes using a hidden Markov random field approach. Application of graph-GPA to a joint analysis of GWAS datasets for 12 phenotypes shows that graph-GPA improves statistical power to identify risk variants compared to statistical methods based on a smaller number of GWAS datasets. In addition, graph-GPA also promotes better understanding of genetic mechanisms shared among phenotypes, which can potentially be useful for the development of improved diagnosis and therapeutics. The R implementation of graph-GPA is currently available at https://dongjunchung.github.io/GGPA/.

  10. Estimating statistical power for open-enrollment group treatment trials.

    PubMed

    Morgan-Lopez, Antonio A; Saavedra, Lissette M; Hien, Denise A; Fals-Stewart, William

    2011-01-01

    Modeling turnover in group membership has been identified as a key barrier contributing to a disconnect between the manner in which behavioral treatment is conducted (open-enrollment groups) and the designs of substance abuse treatment trials (closed-enrollment groups, individual therapy). Latent class pattern mixture models (LCPMMs) are emerging tools for modeling data from open-enrollment groups with membership turnover in recently proposed treatment trials. The current article illustrates an approach to conducting power analyses for open-enrollment designs based on Monte Carlo simulation of LCPMMs, using parameters derived from published data from a randomized controlled trial comparing Seeking Safety to a Community Care condition for women presenting with comorbid posttraumatic stress disorder and substance use disorders. The example addresses discrepancies between the analysis framework assumed in power analyses of many recently proposed open-enrollment trials and the proposed use of LCPMMs for data analysis. Copyright © 2011 Elsevier Inc. All rights reserved.
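
    The general recipe of simulation-based power analysis (simulate many trials under an assumed effect, fit the planned model, count rejections) can be illustrated with a far simpler model than an LCPMM. The sketch below uses a two-arm t-test purely as a stand-in for the planned analysis; the effect size and trial counts are assumptions for the example:

    ```python
    import numpy as np
    from scipy import stats

    def simulated_power(n_per_arm, effect=0.4, alpha=0.05, n_sims=2000, seed=3):
        """Estimate power by simulating many trials under an assumed effect
        and counting how often the planned test rejects."""
        rng = np.random.default_rng(seed)
        hits = 0
        for _ in range(n_sims):
            control = rng.normal(0.0, 1.0, n_per_arm)
            treated = rng.normal(effect, 1.0, n_per_arm)
            hits += stats.ttest_ind(treated, control).pvalue < alpha
        return hits / n_sims

    for n in (50, 100, 150):
        print(f"n = {n:3d} per arm -> power ~ {simulated_power(n):.2f}")
    ```

    In an actual open-enrollment application, the two lines generating data and the t-test would be replaced by simulation from, and fitting of, the LCPMM itself.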

  11. Continuous Covariate Imbalance and Conditional Power for Clinical Trial Interim Analyses

    PubMed Central

    Ciolino, Jody D.; Martin, Renee' H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.

    2014-01-01

    Oftentimes valid statistical analyses for clinical trials involve adjustment for known influential covariates, regardless of imbalance observed in these covariates at baseline across treatment groups. Thus, it must be the case that valid interim analyses also properly adjust for these covariates. There are situations, however, in which covariate adjustment is not possible, not planned, or simply carries less merit as it makes inferences less generalizable and less intuitive. In this case, covariate imbalance between treatment groups can have a substantial effect on both interim and final primary outcome analyses. This paper illustrates the effect of influential continuous baseline covariate imbalance on unadjusted conditional power (CP), and thus, on trial decisions based on futility stopping bounds. The robustness of the relationship is illustrated for normal, skewed, and bimodal continuous baseline covariates that are related to a normally distributed primary outcome. Results suggest that unadjusted CP calculations in the presence of influential covariate imbalance require careful interpretation and evaluation. PMID:24607294
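
    Unadjusted conditional power of the kind examined here can itself be estimated by simulation: freeze the interim data, complete the trial many times under an assumed effect, and count final-analysis rejections. A minimal sketch in which the interim data, drift, and sample sizes are all invented:

    ```python
    import numpy as np
    from scipy import stats

    def conditional_power(x_interim, y_interim, n_total_per_arm, assumed_effect,
                          alpha=0.05, n_sims=2000, seed=4):
        """Simulation-based unadjusted conditional power: keep the interim data
        fixed, complete the trial repeatedly under the assumed effect, and
        count rejections at the final analysis."""
        rng = np.random.default_rng(seed)
        n_rem = n_total_per_arm - len(x_interim)
        hits = 0
        for _ in range(n_sims):
            x = np.concatenate([x_interim, rng.normal(0.0, 1.0, n_rem)])
            y = np.concatenate([y_interim, rng.normal(assumed_effect, 1.0, n_rem)])
            hits += stats.ttest_ind(y, x).pvalue < alpha
        return hits / n_sims

    rng = np.random.default_rng(40)
    x_int = rng.normal(0.0, 1.0, 60)             # interim control arm
    y_int = rng.normal(0.25, 1.0, 60)            # interim treatment arm
    print(f"CP = {conditional_power(x_int, y_int, 150, assumed_effect=0.25):.2f}")
    ```

    The paper's point is that when an influential baseline covariate is imbalanced, this unadjusted quantity can mislead a futility decision, so it should be interpreted with care.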

  12. Methods for meta-analysis of multiple traits using GWAS summary statistics.

    PubMed

    Ray, Debashree; Boehnke, Michael

    2018-03-01

    Genome-wide association studies (GWAS) for complex diseases have focused primarily on single-trait analyses for disease status and disease-related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides (TGs) separately. However, traits are often correlated and a joint analysis may yield increased statistical power for association over multiple univariate analyses. Recently, several multivariate methods have been proposed that require individual-level data. Here, we develop metaUSAT (where USAT is unified score-based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Although the existing methods either perform well when most correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated, metaUSAT is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual-level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P-value for association and is computationally efficient for implementation at a genome-wide level. Simulation experiments show that metaUSAT maintains proper type-I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful for some specific association scenarios only. When applied to plasma lipids summary data from the METSIM and the T2D-GENES studies, metaUSAT detected genome-wide significant loci beyond the ones identified by univariate analyses. Evidence from larger studies suggests that the variants additionally detected by our test are, indeed, associated with lipid levels in humans. In summary, metaUSAT can provide novel insights into the genetic architecture of common diseases and traits. © 2017 WILEY PERIODICALS, INC.

  13. Impact of genotyping errors on statistical power of association tests in genomic analyses: A case study

    PubMed Central

    Hou, Lin; Sun, Ning; Mane, Shrikant; Sayward, Fred; Rajeevan, Nallakkandi; Cheung, Kei-Hoi; Cho, Kelly; Pyarajan, Saiju; Aslan, Mihaela; Miller, Perry; Harvey, Philip D.; Gaziano, J. Michael; Concato, John; Zhao, Hongyu

    2017-01-01

    A key step in genomic studies is to assess high throughput measurements across millions of markers for each participant’s DNA, either using microarrays or sequencing techniques. Accurate genotype calling is essential for downstream statistical analysis of genotype-phenotype associations, and next generation sequencing (NGS) has recently become a more common approach in genomic studies. How the accuracy of variant calling in NGS-based studies affects downstream association analysis has not, however, been studied using empirical data in which both microarrays and NGS were available. In this article, we investigate the impact of variant calling errors on the statistical power to identify associations between single nucleotides and disease, and on associations between multiple rare variants and disease. Both differential and nondifferential genotyping errors are considered. Our results show that the power of burden tests for rare variants is strongly influenced by the specificity in variant calling, but is rather robust with regard to sensitivity. By using the variant calling accuracies estimated from a substudy of a Cooperative Studies Program project conducted by the Department of Veterans Affairs, we show that the power of association tests is mostly retained with commonly adopted variant calling pipelines. An R package, GWAS.PC, is provided to accommodate power analysis that takes account of genotyping errors (http://zhaocenter.org/software/). PMID:28019059

  14. Multivariate two-part statistics for analysis of correlated mass spectrometry data from multiple biological specimens.

    PubMed

    Taylor, Sandra L; Ruhaak, L Renee; Weiss, Robert H; Kelly, Karen; Kim, Kyoungmi

    2017-01-01

    High-throughput mass spectrometry (MS) is now being used to profile small molecular compounds across multiple biological sample types from the same subjects with the goal of leveraging information across biospecimens. Multivariate statistical methods that combine information from all biospecimens could be more powerful than the usual univariate analyses. However, missing values are common in MS data and imputation can impact between-biospecimen correlation and multivariate analysis results. We propose two multivariate two-part statistics that accommodate missing values and combine data from all biospecimens to identify differentially regulated compounds. Statistical significance is determined using a multivariate permutation null distribution. Relative to univariate tests, the multivariate procedures detected more significant compounds in three biological datasets. In a simulation study, we showed that multi-biospecimen testing procedures were more powerful than single-biospecimen methods when compounds are differentially regulated in multiple biospecimens, but univariate methods can be more powerful if compounds are differentially regulated in only one biospecimen. We provide R functions to implement and illustrate our method as supplementary information. Contact: sltaylor@ucdavis.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  15. Improved score statistics for meta-analysis in single-variant and gene-level association studies.

    PubMed

    Yang, Jingjing; Chen, Sai; Abecasis, Gonçalo

    2018-06-01

    Meta-analysis is now an essential tool for genetic association studies, allowing them to combine large studies and greatly accelerating the pace of genetic discovery. Although the standard meta-analysis methods perform equivalently to the more cumbersome joint analysis under ideal settings, they result in substantial power loss under unbalanced settings with various case-control ratios. Here, we investigate the power loss problem of the standard meta-analysis methods for unbalanced studies, and further propose novel meta-analysis methods performing equivalently to the joint analysis under both balanced and unbalanced settings. We derive improved meta-score-statistics that can accurately approximate the joint-score-statistics with combined individual-level data, for both linear and logistic regression models, with and without covariates. In addition, we propose a novel approach to adjust for population stratification by correcting for known population structures through minor allele frequencies. In the simulated gene-level association studies under unbalanced settings, our method recovered up to 85% of the power loss caused by the standard methods. We further showed the power gain of our methods in gene-level tests with 26 unbalanced studies of age-related macular degeneration. In addition, we took the meta-analysis of three unbalanced studies of type 2 diabetes as an example to discuss the challenges of meta-analyzing multi-ethnic samples. In summary, our improved meta-score-statistics with corrections for population stratification can be used to construct both single-variant and gene-level association studies, providing a useful framework for ensuring well-powered, convenient, cross-study analyses. © 2018 WILEY PERIODICALS, INC.
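
    For context, the baseline these improved score statistics are measured against is the standard inverse-variance fixed-effects meta-analysis, which combines per-study effect estimates and standard errors as sketched below (illustrative numbers only, not from the record above):

    ```python
    import numpy as np
    from scipy import stats

    def fixed_effects_meta(betas, ses):
        """Standard inverse-variance fixed-effects meta-analysis of per-study
        effect estimates and their standard errors."""
        betas, ses = np.asarray(betas, float), np.asarray(ses, float)
        w = 1.0 / ses**2                        # inverse-variance weights
        beta = np.sum(w * betas) / np.sum(w)    # pooled effect estimate
        se = np.sqrt(1.0 / np.sum(w))           # pooled standard error
        z = beta / se
        p = 2 * stats.norm.sf(abs(z))
        return beta, se, p

    beta, se, p = fixed_effects_meta([0.12, 0.08, 0.15], [0.05, 0.04, 0.07])
    print(f"pooled beta = {beta:.3f} +/- {se:.3f}, p = {p:.3g}")
    ```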

  16. Distinguishing advective and powered motion in self-propelled colloids

    NASA Astrophysics Data System (ADS)

    Byun, Young-Moo; Lammert, Paul E.; Hong, Yiying; Sen, Ayusman; Crespi, Vincent H.

    2017-11-01

    Self-powered motion in catalytic colloidal particles provides a compelling example of active matter, i.e. systems that engage in single-particle and collective behavior far from equilibrium. The long-time, long-distance behavior of such systems is of particular interest, since it connects their individual micro-scale behavior to macro-scale phenomena. In such analyses, it is important to distinguish motion due to subtle advective effects—which also has long time scales and length scales—from long-timescale phenomena that derive from intrinsically powered motion. Here, we develop a methodology to analyze the statistical properties of the translational and rotational motions of powered colloids to distinguish, for example, active chemotaxis from passive advection by bulk flow.

  17. A robust and efficient statistical method for genetic association studies using case and control samples from multiple cohorts

    PubMed Central

    2013-01-01

    Background: The theoretical basis of genome-wide association studies (GWAS) is statistical inference of linkage disequilibrium (LD) between any polymorphic marker and a putative disease locus. Most methods widely implemented for such analyses are vulnerable to several key demographic factors, deliver poor statistical power for detecting genuine associations, and carry a high false-positive rate. Here, we present a likelihood-based statistical approach that properly accounts for the non-random nature of case–control samples with regard to the genotypic distribution at the loci in the populations under study, and that confers flexibility to test for genetic association in the presence of confounding factors such as population structure and non-randomness of samples. Results: We implemented this novel method, together with several popular methods from the GWAS literature, to re-analyze recently published Parkinson's disease (PD) case–control samples. The real data analysis and computer simulation show that the new method confers not only significantly improved statistical power for detecting associations but also robustness to the difficulties stemming from non-random sampling and genetic structure, when compared to its rivals. In particular, the new method detected 44 significant SNPs within 25 chromosomal regions of size < 1 Mb, but only 6 SNPs in two of these regions were previously detected by trend-test-based methods. It discovered two SNPs located 1.18 Mb and 0.18 Mb from the PD candidate genes FGF20 and PARK8 without incurring false-positive risk. Conclusions: We developed a novel likelihood-based method which provides adequate estimation of LD and other population-model parameters from case and control samples, eases the integration of samples from multiple genetically divergent populations, and thus confers statistically robust and powerful GWAS analyses. On the basis of simulation studies and analysis of real datasets, we demonstrated significant improvement of the new method over the non-parametric trend test, which is the most widely used in the GWAS literature. PMID:23394771

  18. Narrative Review of Statistical Reporting Checklists, Mandatory Statistical Editing, and Rectifying Common Problems in the Reporting of Scientific Articles.

    PubMed

    Dexter, Franklin; Shafer, Steven L

    2017-03-01

    Considerable attention has been drawn to poor reproducibility in the biomedical literature. One explanation is inadequate reporting of statistical methods by authors and inadequate assessment of statistical reporting and methods during peer review. In this narrative review, we examine scientific studies of several well-publicized efforts to improve statistical reporting. We also review several retrospective assessments of the impact of these efforts. These studies show that instructions to authors and statistical checklists are not sufficient; no findings suggested that either improves the quality of statistical methods and reporting. Second, even basic statistics, such as power analyses, are frequently missing or incorrectly performed. Third, statistical review is needed for all papers that involve data analysis. A consistent finding in the studies was that nonstatistical reviewers (eg, "scientific reviewers") and journal editors generally poorly assess statistical quality. We finish by discussing our experience with statistical review at Anesthesia & Analgesia from 2006 to 2016.

  19. Partial Least Squares Regression Can Aid in Detecting Differential Abundance of Multiple Features in Sets of Metagenomic Samples

    PubMed Central

    Libiger, Ondrej; Schork, Nicholas J.

    2015-01-01

    It is now feasible to examine the composition and diversity of microbial communities (i.e., “microbiomes”) that populate different human organs and orifices using DNA sequencing and related technologies. To explore the potential links between changes in microbial communities and various diseases in the human body, it is essential to test associations involving different species within and across microbiomes, environmental settings and disease states. Although a number of statistical techniques exist for carrying out relevant analyses, it is unclear which of these techniques exhibit the greatest statistical power to detect associations given the complexity of most microbiome datasets. We compared the statistical power of principal component regression, partial least squares regression, regularized regression, distance-based regression, Hill's diversity measures, and a modified test implemented in the popular and widely used microbiome analysis methodology “Metastats” across a wide range of simulated scenarios involving changes in feature abundance between two sets of metagenomic samples. For this purpose, simulation studies were used to change the abundance of microbial species in a real dataset from a published study examining human hands. Each technique was applied to the same data, and its ability to detect the simulated change in abundance was assessed. We hypothesized that a small subset of methods would outperform the rest in terms of the statistical power. Indeed, we found that the Metastats technique modified to accommodate multivariate analysis and partial least squares regression yielded high power under the models and data sets we studied. The statistical power of diversity measure-based tests, distance-based regression and regularized regression was significantly lower. Our results provide insight into powerful analysis strategies that utilize information on species counts from large microbiome data sets exhibiting skewed frequency distributions obtained on a small to moderate number of samples. PMID:26734061
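
    A minimal sketch of the partial least squares approach on simulated count data, using scikit-learn's PLSRegression; the feature counts, spiked features, and component number are all invented for illustration and do not reproduce the paper's simulation design:

    ```python
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(7)
    n, p = 40, 200                                   # few samples, many features (taxa)
    X = rng.poisson(2.0, size=(n, p)).astype(float)  # skewed count-like abundances
    y = np.r_[np.zeros(n // 2), np.ones(n // 2)]     # two sets of metagenomic samples
    X[y == 1, :5] += 3.0                             # simulated abundance shift in 5 features

    pls = PLSRegression(n_components=2).fit(X, y)
    top = np.argsort(np.abs(pls.coef_.ravel()))[::-1][:5]
    print("top-loading features:", sorted(top))      # should recover features 0-4
    ```

    In practice, significance of the recovered loadings would be assessed by permutation, consistent with the simulation-based comparisons described in the record.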

  20. A global estimate of the Earth's magnetic crustal thickness

    NASA Astrophysics Data System (ADS)

    Vervelidou, Foteini; Thébault, Erwan

    2014-05-01

    The Earth's lithosphere is considered to be magnetic only down to the Curie isotherm. Therefore the Curie isotherm can, in principle, be estimated by analysis of magnetic data. Here, we propose such an analysis in the spectral domain by means of a newly introduced regional spatial power spectrum. This spectrum is based on the Revised Spherical Cap Harmonic Analysis (R-SCHA) formalism (Thébault et al., 2006). We briefly discuss its properties and its relationship with the Spherical Harmonic spatial power spectrum. This relationship allows us to adapt any theoretical expression of the lithospheric field power spectrum expressed in Spherical Harmonic degrees to the regional formulation. We compared previously published statistical expressions (Jackson, 1994 ; Voorhies et al., 2002) to the recent lithospheric field models derived from the CHAMP and airborne measurements and we finally developed a new statistical form for the power spectrum of the Earth's magnetic lithosphere that we think provides more consistent results. This expression depends on the mean magnetization, the mean crustal thickness and a power law value that describes the amount of spatial correlation of the sources. In this study, we make a combine use of the R-SCHA surface power spectrum and this statistical form. We conduct a series of regional spectral analyses for the entire Earth. For each region, we estimate the R-SCHA surface power spectrum of the NGDC-720 Spherical Harmonic model (Maus, 2010). We then fit each of these observational spectra to the statistical expression of the power spectrum of the Earth's lithosphere. By doing so, we estimate the large wavelengths of the magnetic crustal thickness on a global scale that are not accessible directly from the magnetic measurements due to the masking core field. We then discuss these results and compare them to the results we obtained by conducting a similar spectral analysis, but this time in the cartesian coordinates, by means of a published statistical expression (Maus et al., 1997). We also compare our results to crustal thickness global maps derived by means of additional geophysical data (Purucker et al., 2002).

  1. Statistical aspects of genetic association testing in small samples, based on selective DNA pooling data in the arctic fox.

    PubMed

    Szyda, Joanna; Liu, Zengting; Zatoń-Dobrowolska, Magdalena; Wierzbicki, Heliodor; Rzasa, Anna

    2008-01-01

    We analysed data from a selective DNA pooling experiment with 130 individuals of the arctic fox (Alopex lagopus), which originated from 2 types differing in body size. The association between alleles of 6 selected unlinked molecular markers and body size was tested using univariate and multinomial logistic regression models, applying odds ratios and test statistics from the power divergence family. Due to the small sample size and the resulting sparseness of the data table, in hypothesis testing we could not rely on the asymptotic distributions of the tests. Instead, we tried to account for data sparseness by (i) modifying the confidence intervals of the odds ratios; (ii) using a normal approximation of the asymptotic distribution of the power divergence tests, with different approaches for calculating the moments of the statistics; and (iii) assessing P values empirically, based on bootstrap samples. As a result, a significant association was observed for 3 markers. Furthermore, we used simulations to assess the validity of the normal approximation of the asymptotic distribution of the test statistics under the conditions of small and sparse samples.
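
    Empirical p-values of the kind used in step (iii) can be sketched generically: resample from the pooled data to approximate the null distribution of the test statistic, then locate the observed value within it. The data and statistic below are invented; the "+1" smoothing keeps the estimate away from an exact zero:

    ```python
    import numpy as np

    def bootstrap_pvalue(x, y, n_boot=10000, seed=5):
        """Empirical two-sample p-value: resample from the pooled data (the
        null) and compare bootstrap mean differences with the observed one."""
        rng = np.random.default_rng(seed)
        obs = abs(x.mean() - y.mean())
        pooled = np.concatenate([x, y])
        hits = 0
        for _ in range(n_boot):
            bx = rng.choice(pooled, size=len(x), replace=True)
            by = rng.choice(pooled, size=len(y), replace=True)
            hits += abs(bx.mean() - by.mean()) >= obs
        return (1 + hits) / (1 + n_boot)

    # Illustrative marker-frequency-like measurements for two small groups.
    small = np.array([0.31, 0.28, 0.35, 0.30, 0.27, 0.33])
    large = np.array([0.41, 0.39, 0.44, 0.38, 0.42, 0.40])
    print(f"empirical p = {bootstrap_pvalue(small, large):.4f}")
    ```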

  2. Multiple Phenotype Association Tests Using Summary Statistics in Genome-Wide Association Studies

    PubMed Central

    Liu, Zhonghua; Lin, Xihong

    2017-01-01

    We study in this paper jointly testing the associations of a genetic variant with correlated multiple phenotypes using the summary statistics of individual phenotype analysis from Genome-Wide Association Studies (GWASs). We estimated the between-phenotype correlation matrix using the summary statistics of individual phenotype GWAS analyses, and developed genetic association tests for multiple phenotypes by accounting for between-phenotype correlation without the need to access individual-level data. Since genetic variants often affect multiple phenotypes differently across the genome and the between-phenotype correlation can be arbitrary, we proposed robust and powerful multiple phenotype testing procedures by jointly testing a common mean and a variance component in linear mixed models for summary statistics. We computed the p-values of the proposed tests analytically. This computational advantage makes our methods practically appealing in large-scale GWASs. We performed simulation studies to show that the proposed tests maintained correct type I error rates, and to compare their powers in various settings with the existing methods. We applied the proposed tests to a GWAS Global Lipids Genetics Consortium summary statistics data set and identified additional genetic variants that were missed by the original single-trait analysis. PMID:28653391

  3. Multiple phenotype association tests using summary statistics in genome-wide association studies.

    PubMed

    Liu, Zhonghua; Lin, Xihong

    2018-03-01

    We study in this article jointly testing the associations of a genetic variant with correlated multiple phenotypes using the summary statistics of individual phenotype analysis from Genome-Wide Association Studies (GWASs). We estimated the between-phenotype correlation matrix using the summary statistics of individual phenotype GWAS analyses, and developed genetic association tests for multiple phenotypes by accounting for between-phenotype correlation without the need to access individual-level data. Since genetic variants often affect multiple phenotypes differently across the genome and the between-phenotype correlation can be arbitrary, we proposed robust and powerful multiple phenotype testing procedures by jointly testing a common mean and a variance component in linear mixed models for summary statistics. We computed the p-values of the proposed tests analytically. This computational advantage makes our methods practically appealing in large-scale GWASs. We performed simulation studies to show that the proposed tests maintained correct type I error rates, and to compare their powers in various settings with the existing methods. We applied the proposed tests to a GWAS Global Lipids Genetics Consortium summary statistics data set and identified additional genetic variants that were missed by the original single-trait analysis. © 2017, The International Biometric Society.

  4. Detecting differential DNA methylation from sequencing of bisulfite converted DNA of diverse species.

    PubMed

    Huh, Iksoo; Wu, Xin; Park, Taesung; Yi, Soojin V

    2017-07-21

    DNA methylation is one of the most extensively studied epigenetic modifications of genomic DNA. In recent years, sequencing of bisulfite-converted DNA, particularly via next-generation sequencing technologies, has become a widely popular method to study DNA methylation. This method can be readily applied to a variety of species, dramatically expanding the scope of DNA methylation studies beyond the traditionally studied human and mouse systems. In parallel to the increasing wealth of genomic methylation profiles, many statistical tools have been developed to detect differentially methylated loci (DMLs) or differentially methylated regions (DMRs) between biological conditions. We discuss and summarize several key properties of currently available tools to detect DMLs and DMRs from sequencing of bisulfite-converted DNA. However, the majority of the statistical tools developed for DML/DMR analyses have been validated using only mammalian data sets, and less priority has been placed on the analyses of invertebrate or plant DNA methylation data. We demonstrate that genomic methylation profiles of non-mammalian species are often highly distinct from those of mammalian species using examples of honey bees and humans. We then discuss how such differences in data properties may affect statistical analyses. Based on these differences, we provide three specific recommendations to improve the power and accuracy of DML and DMR analyses of invertebrate data when using currently available statistical tools. These considerations should facilitate systematic and robust analyses of DNA methylation from diverse species, thus advancing our understanding of DNA methylation. © The Author 2017. Published by Oxford University Press.

  5. Four modes of optical parametric operation for squeezed state generation

    NASA Astrophysics Data System (ADS)

    Andersen, U. L.; Buchler, B. C.; Lam, P. K.; Wu, J. W.; Gao, J. R.; Bachor, H.-A.

    2003-11-01

    We report a versatile instrument, based on a monolithic optical parametric amplifier, which reliably generates four different types of squeezed light. We obtained vacuum squeezing, low power amplitude squeezing, phase squeezing and bright amplitude squeezing. We show a complete analysis of this light, including a full quantum state tomography. In addition we demonstrate the direct detection of the squeezed state statistics without the aid of a spectrum analyser. This technique makes the nonclassical properties directly visible and allows complete measurement of the statistical moments of the squeezed quadrature.

  6. Development of a reactive-dispersive plume model

    NASA Astrophysics Data System (ADS)

    Kim, Hyun S.; Kim, Yong H.; Song, Chul H.

    2017-04-01

    A reactive-dispersive plume model (RDPM) was developed in this study. The RDPM can represent the two main processes governing a large-scale point-source plume: i) turbulent dispersion and ii) photochemical reactions. In order to evaluate the simulation performance of the newly developed RDPM, comparisons between the model-predicted and observed mixing ratios were made using the TexAQS II 2006 (Texas Air Quality Study II 2006) power-plant experiment data. Statistical analyses show good correlation (0.61 ≤ R ≤ 0.92) and good agreement in terms of the Index of Agreement (0.70-0.95). The chemical NOx lifetimes for two power-plant plumes (Monticello and Welsh power plants) were also estimated.
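
    The Index of Agreement cited here is conventionally Willmott's (1981) index, computable in a few lines; the observed/predicted values below are invented for illustration:

    ```python
    import numpy as np

    def index_of_agreement(obs, pred):
        """Willmott's (1981) index of agreement, bounded in [0, 1]."""
        obs, pred = np.asarray(obs, float), np.asarray(pred, float)
        num = np.sum((pred - obs) ** 2)
        den = np.sum((np.abs(pred - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
        return 1.0 - num / den

    obs = np.array([4.1, 5.0, 6.3, 7.8, 6.9])    # observed mixing ratios (illustrative)
    pred = np.array([3.8, 5.4, 6.0, 8.1, 6.5])   # model-predicted values (illustrative)
    print(f"IOA = {index_of_agreement(obs, pred):.3f}")
    ```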

  7. A statistical analysis of energy and power demand for the tractive purposes of an electric vehicle in urban traffic - an analysis of a short and long observation period

    NASA Astrophysics Data System (ADS)

    Slaski, G.; Ohde, B.

    2016-09-01

    The article presents the results of a statistical dispersion analysis of the energy and power demand for the tractive purposes of a battery electric vehicle. The authors compare data distributions for different values of average speed in two approaches, namely a short and a long period of observation. The short period of observation (generally around several hundred meters) results from a previously proposed macroscopic energy consumption model based on an average speed per road section. This approach yielded high values of the standard deviation and of the coefficient of variation (the ratio of the standard deviation to the mean), around 0.7-1.2. The long period of observation (several kilometers long) is similar in length to the standardized speed cycles used in testing vehicle energy consumption and available range. The data were analysed to determine the impact of observation length on the variation in energy and power demand. The analysis was based on a simulation of electric power and energy consumption performed with speed profile data recorded in the Poznań agglomeration.

  8. Modeling and replicating statistical topology and evidence for CMB nonhomogeneity

    PubMed Central

    Agami, Sarit

    2017-01-01

    Under the banner of “big data,” the detection and classification of structure in extremely large, high-dimensional, data sets are two of the central statistical challenges of our times. Among the most intriguing new approaches to this challenge is “TDA,” or “topological data analysis,” one of the primary aims of which is providing nonmetric, but topologically informative, preanalyses of data which make later, more quantitative, analyses feasible. While TDA rests on strong mathematical foundations from topology, in applications, it has faced challenges due to difficulties in handling issues of statistical reliability and robustness, often leading to an inability to make scientific claims with verifiable levels of statistical confidence. We propose a methodology for the parametric representation, estimation, and replication of persistence diagrams, the main diagnostic tool of TDA. The power of the methodology lies in the fact that even if only one persistence diagram is available for analysis—the typical case for big data applications—the replications permit conventional statistical hypothesis testing. The methodology is conceptually simple and computationally practical, and provides a broadly effective statistical framework for persistence diagram TDA analysis. We demonstrate the basic ideas on a toy example, and the power of the parametric approach to TDA modeling in an analysis of cosmic microwave background (CMB) nonhomogeneity. PMID:29078301

  9. Bridging the Gap between Theory and Model: A Reflection on the Balance Scale Task.

    ERIC Educational Resources Information Center

    Turner, Geoffrey F. W.; Thomas, Hoben

    2002-01-01

    Focuses on individual strengths of articles by Jensen and van der Maas, and Halford et al., and the power of their combined perspectives. Suggests a performance model that can both evaluate specific theoretical claims and reveal important data features that had been previously obscured using conventional statistical analyses. Maintains that the…

  10. Literacy Inequalities in Theory and Practice: The Power to Name and Define

    ERIC Educational Resources Information Center

    Street, Brian V.

    2011-01-01

    I analyse what exactly is being addressed when the notion of "literacy inequalities" is cited in the context of international policy with regard to education in general and literacy in particular. Whilst literacy statistics are used as indicators of social inequality and as a basis for policy in improving rights, educational attainment, etc., I…

  11. Turning the Potential Liability of Large Enrollment Laboratory Science Courses into an Asset

    ERIC Educational Resources Information Center

    Johnson, Dan; Levy, Foster; Karsai, Istvan; Stroud, Kimberly

    2006-01-01

    Data sharing among multiple lab sections increases statistical power of data analyses and informs student-generated hypotheses. We describe how to collect, organize, and manage data to support replicate and rolling inquiry models, with three illustrative examples of activities from a population-level biology course for science majors. (Contains 1…

  12. Formalizing the definition of meta-analysis in Molecular Ecology.

    PubMed

    ArchMiller, Althea A; Bauer, Eric F; Koch, Rebecca E; Wijayawardena, Bhagya K; Anil, Ammu; Kottwitz, Jack J; Munsterman, Amelia S; Wilson, Alan E

    2015-08-01

    Meta-analysis, the statistical synthesis of pertinent literature to develop evidence-based conclusions, is relatively new to the field of molecular ecology, with the first meta-analysis published in the journal Molecular Ecology in 2003 (Slate & Phua 2003). The goal of this article is to formalize the definition of meta-analysis for the authors, editors, reviewers and readers of Molecular Ecology by completing a review of the meta-analyses previously published in this journal. We also provide a brief overview of the many components required for meta-analysis with a more specific discussion of the issues related to the field of molecular ecology, including the use and statistical considerations of Wright's FST and its related analogues as effect sizes in meta-analysis. We performed a literature review to identify articles published as 'meta-analyses' in Molecular Ecology, which were then evaluated by at least two reviewers. We specifically targeted Molecular Ecology publications because as a flagship journal in this field, meta-analyses published in Molecular Ecology have the potential to set the standard for meta-analyses in other journals. We found that while many of these reviewed articles were strong meta-analyses, others failed to follow standard meta-analytical techniques. One of these unsatisfactory meta-analyses was in fact a secondary analysis. Other studies attempted meta-analyses but lacked the fundamental statistics that are considered necessary for an effective and powerful meta-analysis. By drawing attention to the inconsistency of studies labelled as meta-analyses, we emphasize the importance of understanding the components of traditional meta-analyses to fully embrace the strengths of quantitative data synthesis in the field of molecular ecology. © 2015 John Wiley & Sons Ltd.

  13. On Improving the Quality and Interpretation of Environmental Assessments using Statistical Analysis and Geographic Information Systems

    NASA Astrophysics Data System (ADS)

    Karuppiah, R.; Faldi, A.; Laurenzi, I.; Usadi, A.; Venkatesh, A.

    2014-12-01

    An increasing number of studies are focused on assessing the environmental footprint of different products and processes, especially using life cycle assessment (LCA). This work shows how combining statistical methods and Geographic Information Systems (GIS) with environmental analyses can help improve the quality of results and their interpretation. Most environmental assessments in literature yield single numbers that characterize the environmental impact of a process/product - typically global or country averages, often unchanging in time. In this work, we show how statistical analysis and GIS can help address these limitations. For example, we demonstrate a method to separately quantify uncertainty and variability in the result of LCA models using a power generation case study. This is important for rigorous comparisons between the impacts of different processes. Another challenge is lack of data that can affect the rigor of LCAs. We have developed an approach to estimate environmental impacts of incompletely characterized processes using predictive statistical models. This method is applied to estimate unreported coal power plant emissions in several world regions. There is also a general lack of spatio-temporal characterization of the results in environmental analyses. For instance, studies that focus on water usage do not put in context where and when water is withdrawn. Through the use of hydrological modeling combined with GIS, we quantify water stress on a regional and seasonal basis to understand water supply and demand risks for multiple users. Another example where it is important to consider regional dependency of impacts is when characterizing how agricultural land occupation affects biodiversity in a region. We developed a data-driven methodology used in conjunction with GIS to determine if there is a statistically significant difference between the impacts of growing different crops on different species in various biomes of the world.

  14. Statistical issues in quality control of proteomic analyses: good experimental design and planning.

    PubMed

    Cairns, David A

    2011-03-01

    Quality control is becoming increasingly important in proteomic investigations as experiments become more multivariate and quantitative. Quality control applies to all stages of an investigation and statistics can play a key role. In this review, the role of statistical ideas in the design and planning of an investigation is described. This involves the design of unbiased experiments using key concepts from statistical experimental design, the understanding of the biological and analytical variation in a system using variance components analysis and the determination of a required sample size to perform a statistically powerful investigation. These concepts are described through simple examples and an example data set from a 2-D DIGE pilot experiment. Each of these concepts can prove useful in producing better and more reproducible data. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
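
    The sample-size determination step described here is routine once an effect size, significance level, and target power are fixed. A minimal sketch using statsmodels; the medium standardized effect (d = 0.5) is an assumption chosen for the example:

    ```python
    from statsmodels.stats.power import TTestIndPower

    # Sample size per group to detect a medium standardized effect (d = 0.5)
    # with 80% power at alpha = 0.05 in a two-group comparison.
    n = TTestIndPower().solve_power(effect_size=0.5, power=0.80, alpha=0.05)
    print(f"required n per group ~= {n:.0f}")   # roughly 64 per group
    ```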

  15. Effect of Different Phases of Menstrual Cycle on Heart Rate Variability (HRV).

    PubMed

    Brar, Tejinder Kaur; Singh, K D; Kumar, Avnish

    2015-10-01

    Heart Rate Variability (HRV), which is a measure of the cardiac autonomic tone, displays physiological changes throughout the menstrual cycle. The functions of the ANS in various phases of the menstrual cycle were examined in some studies. The aim of our study was to observe the effect of menstrual cycle on cardiac autonomic function parameters in healthy females. A cross-sectional (observational) study was conducted on 50 healthy females, in the age group of 18-25 years. Heart Rate Variability (HRV) was recorded by Physio Pac (PC-2004). The data consisted of Time Domain Analysis and Frequency Domain Analysis in the menstrual, proliferative and secretory phases of the menstrual cycle. Data collected were analysed statistically using Student's paired t-test. The difference in mean heart rate, LF power%, LFnu and HFnu in the menstrual and proliferative phases was found to be statistically significant. The difference in mean RR, mean HR, RMSSD (the square root of the mean of the squares of the successive differences between adjacent NNs), NN50 (the number of pairs of successive NNs that differ by more than 50 ms), pNN50 (the proportion of NN50 divided by the total number of NNs), VLF (very low frequency) power, LF (low frequency) power, LF power%, HF power%, LF/HF ratio, LFnu and HFnu was found to be statistically significant in the proliferative and secretory phases. The difference in mean RR, mean HR, LFnu and HFnu was found to be statistically significant in the secretory and menstrual phases. From the study it can be concluded that sympathetic nervous activity in the secretory phase is greater than in the proliferative phase, whereas parasympathetic nervous activity is predominant in the proliferative phase.

  16. Effect of Different Phases of Menstrual Cycle on Heart Rate Variability (HRV)

    PubMed Central

    Singh, K. D.; Kumar, Avnish

    2015-01-01

    Background: Heart Rate Variability (HRV), which is a measure of the cardiac autonomic tone, displays physiological changes throughout the menstrual cycle. The functions of the ANS in various phases of the menstrual cycle were examined in some studies. Aims and Objectives: The aim of our study was to observe the effect of menstrual cycle on cardiac autonomic function parameters in healthy females. Materials and Methods: A cross-sectional (observational) study was conducted on 50 healthy females, in the age group of 18-25 years. Heart Rate Variability (HRV) was recorded by Physio Pac (PC-2004). The data consisted of Time Domain Analysis and Frequency Domain Analysis in the menstrual, proliferative and secretory phases of the menstrual cycle. Data collected were analysed statistically using Student's paired t-test. Results: The difference in mean heart rate, LF power%, LFnu and HFnu in the menstrual and proliferative phases was found to be statistically significant. The difference in mean RR, mean HR, RMSSD (the square root of the mean of the squares of the successive differences between adjacent NNs), NN50 (the number of pairs of successive NNs that differ by more than 50 ms), pNN50 (the proportion of NN50 divided by the total number of NNs), VLF (very low frequency) power, LF (low frequency) power, LF power%, HF power%, LF/HF ratio, LFnu and HFnu was found to be statistically significant in the proliferative and secretory phases. The difference in mean RR, mean HR, LFnu and HFnu was found to be statistically significant in the secretory and menstrual phases. Conclusion: From the study it can be concluded that sympathetic nervous activity in the secretory phase is greater than in the proliferative phase, whereas parasympathetic nervous activity is predominant in the proliferative phase. PMID:26557512
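
    The paired comparison used in both versions of this record is the classic repeated-measures design: each subject serves as her own control across cycle phases. A minimal sketch with invented LF/HF ratios:

    ```python
    import numpy as np
    from scipy import stats

    # Hypothetical LF/HF ratios for the same 10 subjects in two cycle phases.
    proliferative = np.array([1.1, 0.9, 1.3, 1.0, 1.2, 0.8, 1.1, 1.4, 0.9, 1.0])
    secretory     = np.array([1.6, 1.2, 1.7, 1.3, 1.5, 1.1, 1.4, 1.9, 1.2, 1.3])

    t, p = stats.ttest_rel(secretory, proliferative)   # paired (repeated-measures) t-test
    print(f"t = {t:.2f}, p = {p:.4f}")
    ```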

  17. The use and misuse of statistical analyses [in geophysics and space physics].

    NASA Technical Reports Server (NTRS)

    Reiff, P. H.

    1983-01-01

    The statistical techniques most often used in space physics include Fourier analysis, linear correlation, auto- and cross-correlation, power spectral density, and superposed epoch analysis. Tests are presented which can evaluate the significance of the results obtained through each of these. Data presented without some form of error analysis are frequently useless, since they offer no way of assessing whether a bump on a spectrum or on a superposed epoch analysis is real or merely a statistical fluctuation. Among many of the published linear correlations, for instance, the uncertainty in the intercept and slope is not given, so that the significance of the fitted parameters cannot be assessed.
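
    As an illustration of the kind of error analysis Reiff calls for, the short R sketch below (illustrative, not from the paper) fits a linear correlation and reports the uncertainty in the intercept and slope, without which the significance of the fitted parameters cannot be assessed.

      # Illustrative R sketch: report the uncertainty of a fitted linear correlation.
      set.seed(1)
      x <- rnorm(100)                      # e.g., a solar wind driver
      y <- 0.8 * x + rnorm(100, sd = 0.5)  # e.g., a geomagnetic response
      fit <- lm(y ~ x)
      summary(fit)$coefficients   # estimates with standard errors, t- and p-values
      confint(fit, level = 0.95)  # 95% confidence intervals for intercept and slope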

  18. Computing Inter-Rater Reliability for Observational Data: An Overview and Tutorial

    PubMed Central

    Hallgren, Kevin A.

    2012-01-01

    Many research designs require the assessment of inter-rater reliability (IRR) to demonstrate consistency among observational ratings provided by multiple coders. However, many studies use incorrect statistical procedures, fail to fully report the information necessary to interpret their results, or do not address how IRR affects the power of their subsequent analyses for hypothesis testing. This paper provides an overview of methodological issues related to the assessment of IRR with a focus on study design, selection of appropriate statistics, and the computation, interpretation, and reporting of some commonly-used IRR statistics. Computational examples include SPSS and R syntax for computing Cohen’s kappa and intra-class correlations to assess IRR. PMID:22833776
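
    Hallgren's computational examples are in SPSS and R; a minimal R sketch in that spirit (using the irr package, one common choice, with illustrative data and argument settings rather than the paper's exact syntax) is:

      # Cohen's kappa for two coders assigning nominal codes (irr package).
      library(irr)
      nominal <- data.frame(coder1 = c("a", "b", "b", "a", "c"),
                            coder2 = c("a", "b", "a", "a", "c"))
      kappa2(nominal)
      # Intra-class correlation for two coders rating a continuous variable.
      continuous <- data.frame(coder1 = c(4.2, 3.1, 5.0, 2.8),
                               coder2 = c(4.0, 3.4, 4.8, 3.0))
      icc(continuous, model = "twoway", type = "agreement", unit = "single")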

  19. Modeling Cross-Situational Word–Referent Learning: Prior Questions

    PubMed Central

    Yu, Chen; Smith, Linda B.

    2013-01-01

    Both adults and young children possess powerful statistical computation capabilities—they can infer the referent of a word from highly ambiguous contexts involving many words and many referents by aggregating cross-situational statistical information across contexts. This ability has been explained by models of hypothesis testing and by models of associative learning. This article describes a series of simulation studies and analyses designed to understand the different learning mechanisms posited by the 2 classes of models and their relation to each other. Variants of a hypothesis-testing model and a simple or dumb associative mechanism were examined under different specifications of information selection, computation, and decision. Critically, these 3 components of the models interact in complex ways. The models illustrate a fundamental tradeoff between amount of data input and powerful computations: With the selection of more information, dumb associative models can mimic the powerful learning that is accomplished by hypothesis-testing models with fewer data. However, because of the interactions among the component parts of the models, the associative model can mimic various hypothesis-testing models, producing the same learning patterns but through different internal components. The simulations argue for the importance of a compositional approach to human statistical learning: the experimental decomposition of the processes that contribute to statistical learning in human learners and models with the internal components that can be evaluated independently and together. PMID:22229490

  20. Exploration of time-course combinations of outcome scales for use in a global test of stroke recovery.

    PubMed

    Goldie, Fraser C; Fulton, Rachael L; Dawson, Jesse; Bluhmki, Erich; Lees, Kennedy R

    2014-08-01

    Clinical trials for acute ischemic stroke treatment require large numbers of participants and are expensive to conduct. Methods that enhance statistical power are therefore desirable. We explored whether this can be achieved by a measure incorporating both early and late measures of outcome (e.g., seven-day NIH Stroke Scale combined with 90-day modified Rankin scale). We analyzed sensitivity to treatment effect, using proportional odds logistic regression for ordinal scales and the generalized estimating equation method for global outcomes, with all analyses adjusted for baseline severity and age. We ran simulations to assess relations between sample size and power for ordinal scales and corresponding global outcomes. We used R version 2.12.1 (R Development Core Team, R Foundation for Statistical Computing, Vienna, Austria) for simulations and SAS 9.2 (SAS Institute Inc., Cary, NC, USA) for all other analyses. Each scale considered for combination was sensitive to treatment effect in isolation. The mRS90 and NIHSS90 had adjusted odds ratios of 1.56 and 1.62, respectively. Adjusted odds ratios for global outcomes of the combination of mRS90 with NIHSS7 and NIHSS90 with NIHSS7 were 1.69 and 1.73, respectively. The smallest sample sizes required to generate statistical power ≥80% for mRS90, NIHSS7, and global outcomes of mRS90 and NIHSS7 combined and NIHSS90 and NIHSS7 combined were 500, 490, 400, and 380, respectively. When data concerning both early and late outcomes are combined into a global measure, there is increased sensitivity to treatment effect compared with solitary ordinal scales. This delivers a 20% reduction in required sample size at 80% power. Combining early with late outcomes merits further consideration. © 2013 The Authors. International Journal of Stroke © 2013 World Stroke Organization.
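
    The ordinal part of such an analysis can be sketched in R with a proportional odds model (a hedged analogue using MASS::polr on simulated data, not the authors' SAS/R code):

      # Proportional odds model for an ordinal 90-day mRS outcome, adjusted
      # for age and baseline severity (all data simulated for illustration).
      library(MASS)
      set.seed(42)
      n <- 500
      dat <- data.frame(treat    = rbinom(n, 1, 0.5),
                        age      = rnorm(n, 70, 10),
                        baseline = rnorm(n, 14, 5))
      latent <- -0.5 * dat$treat + 0.03 * dat$age + 0.10 * dat$baseline + rlogis(n)
      dat$mrs90 <- cut(latent, quantile(latent, seq(0, 1, 1/7)), labels = 0:6,
                       include.lowest = TRUE, ordered_result = TRUE)
      fit <- polr(mrs90 ~ treat + age + baseline, data = dat, Hess = TRUE)
      exp(coef(fit))   # common (proportional) odds ratios for each covariate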

  1. Categorization of the trophic status of a hydroelectric power plant reservoir in the Brazilian Amazon by statistical analyses and fuzzy approaches.

    PubMed

    da Costa Lobato, Tarcísio; Hauser-Davis, Rachel Ann; de Oliveira, Terezinha Ferreira; Maciel, Marinalva Cardoso; Tavares, Maria Regina Madruga; da Silveira, Antônio Morais; Saraiva, Augusto Cesar Fonseca

    2015-02-15

    The Amazon area has been increasingly suffering from anthropogenic impacts, especially due to the construction of hydroelectric power plant reservoirs. The analysis and categorization of the trophic status of these reservoirs are of interest to indicate man-made changes in the environment. In this context, the present study aimed to categorize the trophic status of a hydroelectric power plant reservoir located in the Brazilian Amazon by constructing a novel Water Quality Index (WQI) and Trophic State Index (TSI) for the reservoir using major ion concentrations and physico-chemical water parameters determined in the area and taking into account the sampling locations and the local hydrological regimes. After applying statistical analyses (factor analysis and cluster analysis) and establishing a rule base of a fuzzy system to these indicators, the results obtained by the proposed method were then compared to the generally applied Carlson and a modified Lamparelli trophic state index (TSI), specific for trophic regions. The categorization of the trophic status by the proposed fuzzy method was shown to be more reliable, since it takes into account the specificities of the study area, while the Carlson and Lamparelli TSI do not, and, thus, tend to over or underestimate the trophic status of these ecosystems. The statistical techniques proposed and applied in the present study, are, therefore, relevant in cases of environmental management and policy decision-making processes, aiding in the identification of the ecological status of water bodies. With this, it is possible to identify which factors should be further investigated and/or adjusted in order to attempt the recovery of degraded water bodies. Copyright © 2014 Elsevier B.V. All rights reserved.

  2. Improving qPCR telomere length assays: Controlling for well position effects increases statistical power.

    PubMed

    Eisenberg, Dan T A; Kuzawa, Christopher W; Hayes, M Geoffrey

    2015-01-01

    Telomere length (TL) is commonly measured using quantitative PCR (qPCR). Although easier than the Southern blot of terminal restriction fragments (TRF) method of TL measurement, one drawback of qPCR is that it introduces greater measurement error and thus reduces the statistical power of analyses. To address a potential source of measurement error, we consider the effect of well position on qPCR TL measurements. qPCR TL data from 3,638 people run on a Bio-Rad iCycler iQ are reanalyzed here. To evaluate measurement validity, correspondence with TRF, age, and between mother and offspring are examined. First, we present evidence for systematic variation in qPCR TL measurements in relation to thermocycler well position. Controlling for these well-position effects consistently improves measurement validity and yields estimated improvements in statistical power equivalent to increasing sample sizes by 16%. We additionally evaluated the linearity of the relationships between telomere and single copy gene control amplicons and between qPCR and TRF measures. We find that, unlike some previous reports, our data exhibit linear relationships. We introduce the standard error in percent, a superior method for quantifying measurement error as compared to the commonly used coefficient of variation. Using this measure, we find that excluding samples with high measurement error does not improve measurement validity in our study. Future studies using block-based thermocyclers should consider well position effects. Since additional information can be gleaned from well position corrections, rerunning analyses of previous results with well position correction could serve as an independent test of the validity of these results. © 2015 Wiley Periodicals, Inc.
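
    A well-position correction of this general kind can be sketched in R as a fixed-effects adjustment (variable names hypothetical; the authors' actual correction may differ in detail):

      # Regress TL on plate row and column, then keep the grand mean plus
      # residuals as position-adjusted TL values.
      adjust_tl <- function(tl, row, col) {
        fit <- lm(tl ~ factor(row) + factor(col))
        mean(tl) + resid(fit)
      }
      set.seed(7)
      plate <- expand.grid(row = LETTERS[1:8], col = 1:12)    # a 96-well layout
      plate$tl <- rnorm(96, 1.0, 0.1) + 0.02 * plate$col      # simulated column drift
      plate$tl_adj <- adjust_tl(plate$tl, plate$row, plate$col)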

  3. Coordinate based random effect size meta-analysis of neuroimaging studies.

    PubMed

    Tench, C R; Tanasescu, Radu; Constantinescu, C S; Auer, D P; Cottam, W J

    2017-06-01

    Low power in neuroimaging studies can make them difficult to interpret, and coordinate based meta-analysis (CBMA) may go some way to mitigating this issue. CBMA has been used in many analyses to detect where published functional MRI or voxel-based morphometry studies testing similar hypotheses report significant summary results (coordinates) consistently. Only the reported coordinates and possibly t statistics are analysed, and statistical significance of clusters is determined by coordinate density. Here a method of performing coordinate based random effect size meta-analysis and meta-regression is introduced. The algorithm (ClusterZ) analyses both coordinates and reported t statistic or Z score, standardised by the number of subjects. Statistical significance is determined not by coordinate density, but by random-effects meta-analyses of reported effects performed cluster-wise using standard statistical methods and taking account of censoring inherent in the published summary results. Type I error control is achieved using the false cluster discovery rate (FCDR), which is based on the false discovery rate. This controls both the family-wise error rate under the null hypothesis that coordinates are randomly drawn from a standard stereotaxic space, and the proportion of significant clusters that are expected under the null. Such control is necessary to avoid propagating and even amplifying the very issues motivating the meta-analysis in the first place. ClusterZ is demonstrated on both numerically simulated data and on real data from reports of grey matter loss in multiple sclerosis (MS) and syndromes suggestive of MS, and of painful stimulus in healthy controls. The software implementation is available to download and use freely. Copyright © 2017 Elsevier Inc. All rights reserved.

  4. Spherical aberrations of human astigmatic corneas.

    PubMed

    Zhao, Huawei; Dai, Guang-Ming; Chen, Li; Weeber, Henk A; Piers, Patricia A

    2011-11-01

    To evaluate whether the average spherical aberration of human astigmatic corneas is statistically equivalent to that of human nonastigmatic corneas. Spherical aberrations of 445 astigmatic corneas prior to laser vision correction were retrospectively investigated to determine Zernike coefficients for central corneal areas 6 mm in diameter using CTView (Sarver and Associates). Data were divided into groups according to cylinder power (0.01 to 0.25 diopters [D], 0.26 to 0.75 D, 0.76 to 1.06 D, 1.07 to 1.53 D, 1.54 to 2.00 D, and >2.00 D) and according to age by decade. Spherical aberrations were correlated with age and astigmatic power among groups and the entire population. Statistical analyses were conducted, and P<.05 was considered statistically significant. Mean patient age was 42.6±11 years. Astigmatic corneas had an average astigmatic power of 0.78±0.58 D, and mean spherical aberration was 0.25±0.13 μm for the entire population and approximately the same (0.27 μm) for individual groups, ranging from 0.23 to 0.29 μm (P>.05 for all tested groups). Mean spherical aberration of astigmatic corneas was not correlated significantly with cylinder power or age (P>.05). Spherical aberrations of astigmatic corneas are thus similar to those of nonastigmatic corneas, permitting the use of these additional data in the design of aspheric toric intraocular lenses. Copyright 2011, SLACK Incorporated.

  5. The effects of run-of-river hydroelectric power schemes on invertebrate community composition in temperate streams and rivers.

    PubMed

    Bilotta, Gary S; Burnside, Niall G; Turley, Matthew D; Gray, Jeremy C; Orr, Harriet G

    2017-01-01

    Run-of-river (ROR) hydroelectric power (HEP) schemes are often presumed to be less ecologically damaging than large-scale storage HEP schemes. However, there is currently limited scientific evidence on their ecological impact. The aim of this article is to investigate the effects of ROR HEP schemes on communities of invertebrates in temperate streams and rivers, using a multi-site Before-After, Control-Impact (BACI) study design. The study makes use of routine environmental surveillance data collected as part of long-term national and international monitoring programmes at 22 systematically-selected ROR HEP schemes and 22 systematically-selected paired control sites. Five widely-used family-level invertebrate metrics (richness, evenness, LIFE, E-PSI, WHPT) were analysed using a linear mixed effects model. The analyses showed that there was a statistically significant effect (p<0.05) of ROR HEP construction and operation on the evenness of the invertebrate community. However, no statistically significant effects were detected on the four other metrics of community composition. The implications of these findings are discussed in this article and recommendations are made for best-practice study design for future invertebrate community impact studies.
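
    A BACI analysis of this kind can be sketched in R with lme4 (simulated data and a simplified model structure; the published analysis may include additional terms):

      # The period x treatment-group interaction is the BACI impact term.
      library(lme4)
      set.seed(3)
      d <- expand.grid(pair   = factor(1:22),
                       group  = c("control", "impact"),
                       period = factor(c("before", "after"), levels = c("before", "after")),
                       sample = 1:3)
      d$site <- interaction(d$pair, d$group)                 # 44 distinct sites
      d$evenness <- 0.70 + rnorm(44, 0, 0.05)[d$site] +
                    ifelse(d$group == "impact" & d$period == "after", -0.03, 0) +
                    rnorm(nrow(d), 0, 0.04)
      fit <- lmer(evenness ~ period * group + (1 | site), data = d)
      summary(fit)$coefficients   # the 'periodafter:groupimpact' row is the impact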

  6. The effects of run-of-river hydroelectric power schemes on invertebrate community composition in temperate streams and rivers

    PubMed Central

    2017-01-01

    Run-of-river (ROR) hydroelectric power (HEP) schemes are often presumed to be less ecologically damaging than large-scale storage HEP schemes. However, there is currently limited scientific evidence on their ecological impact. The aim of this article is to investigate the effects of ROR HEP schemes on communities of invertebrates in temperate streams and rivers, using a multi-site Before-After, Control-Impact (BACI) study design. The study makes use of routine environmental surveillance data collected as part of long-term national and international monitoring programmes at 22 systematically-selected ROR HEP schemes and 22 systematically-selected paired control sites. Five widely-used family-level invertebrate metrics (richness, evenness, LIFE, E-PSI, WHPT) were analysed using a linear mixed effects model. The analyses showed that there was a statistically significant effect (p<0.05) of ROR HEP construction and operation on the evenness of the invertebrate community. However, no statistically significant effects were detected on the four other metrics of community composition. The implications of these findings are discussed in this article and recommendations are made for best-practice study design for future invertebrate community impact studies. PMID:28158282

  7. Statistical analyses to support guidelines for marine avian sampling. Final report

    USGS Publications Warehouse

    Kinlan, Brian P.; Zipkin, Elise; O'Connell, Allan F.; Caldow, Chris

    2012-01-01

    Interest in development of offshore renewable energy facilities has led to a need for high-quality, statistically robust information on marine wildlife distributions. A practical approach is described to estimate the amount of sampling effort required to have sufficient statistical power to identify species-specific “hotspots” and “coldspots” of marine bird abundance and occurrence in an offshore environment divided into discrete spatial units (e.g., lease blocks), where “hotspots” and “coldspots” are defined relative to a reference (e.g., regional) mean abundance and/or occurrence probability for each species of interest. For example, a location with average abundance or occurrence that is three times larger than the mean (3x effect size) could be defined as a “hotspot,” and a location that is three times smaller than the mean (1/3x effect size) as a “coldspot.” The choice of the effect size used to define hot and coldspots will generally depend on a combination of ecological and regulatory considerations. A method is also developed for testing the statistical significance of possible hotspots and coldspots. Both methods are illustrated with historical seabird survey data from the USGS Avian Compendium Database. Our approach consists of five main components: 1. A review of the primary scientific literature on statistical modeling of animal group size and avian count data to develop a candidate set of statistical distributions that have been used or may be useful to model seabird counts. 2. Statistical power curves for one-sample, one-tailed Monte Carlo significance tests of differences of observed small-sample means from a specified reference distribution. These curves show the power to detect "hotspots" or "coldspots" of occurrence and abundance at a range of effect sizes, given assumptions which we discuss. 3. A model selection procedure, based on maximum likelihood fits of models in the candidate set, to determine an appropriate statistical distribution to describe counts of a given species in a particular region and season. 4. Using a large database of historical at-sea seabird survey data, we applied this technique to identify appropriate statistical distributions for modeling a variety of species, allowing the distribution to vary by season. For each species and season, we used the selected distribution to calculate and map retrospective statistical power to detect hotspots and coldspots, and map p-values from Monte Carlo significance tests of hotspots and coldspots, in discrete lease blocks designated by the U.S. Department of the Interior, Bureau of Ocean Energy Management (BOEM). 5. Because our definition of hotspots and coldspots does not explicitly include variability over time, we examine the relationship between the temporal scale of sampling and the proportion of variance captured in time series of key environmental correlates of marine bird abundance, as well as available marine bird abundance time series, and use these analyses to develop recommendations for the temporal distribution of sampling to adequately represent both short-term and long-term variability. We conclude by presenting a schematic “decision tree” showing how this power analysis approach would fit in a general framework for avian survey design, and discuss implications of model assumptions and results. We discuss avenues for future development of this work, and recommendations for practical implementation in the context of siting and wildlife assessment for offshore renewable energy development projects.
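
    The Monte Carlo significance and power machinery of components 2-3 can be sketched in R (the negative binomial reference values below are assumed purely for illustration):

      # One-sample, one-tailed Monte Carlo test for a candidate 3x hotspot.
      set.seed(11)
      mu <- 2; size <- 0.5      # assumed regional reference distribution
      n_surveys <- 10           # surveys within the candidate lease block
      obs_mean <- 6.5           # observed block mean abundance
      null_means <- replicate(1e4, mean(rnbinom(n_surveys, mu = mu, size = size)))
      p_value <- mean(null_means >= obs_mean)     # significance of the hotspot
      crit <- quantile(null_means, 0.95)
      alt_means <- replicate(1e4, mean(rnbinom(n_surveys, mu = 3 * mu, size = size)))
      power <- mean(alt_means > crit)             # power to detect a true 3x hotspot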

  8. Stroke Treatment Academic Industry Roundtable Recommendations for Individual Data Pooling Analyses in Stroke.

    PubMed

    Lees, Kennedy R; Khatri, Pooja

    2016-08-01

    Pooled analysis of individual patient data from stroke trials can deliver more precise estimates of treatment effect, enhance power to examine prespecified subgroups, and facilitate exploration of treatment-modifying influences. Analysis plans should be declared, and preferably published, before trial results are known. For pooling trials that used diverse analytic approaches, an ordinal analysis is favored, with justification for considering deaths and severe disability jointly. Because trial pooling is an incremental process, analyses should follow a sequential approach, with statistical adjustment for iterations. Updated analyses should be published when revised conclusions have a clinical implication. However, caution is recommended in declaring pooled findings that may prejudice ongoing trials, unless clinical implications are compelling. All contributing trial teams should contribute to leadership, data verification, and authorship of pooled analyses. Development work is needed to enable reliable inferences to be drawn about individual drug or device effects that contribute to a pooled analysis, versus a class effect, if the treatment strategy combines ≥2 such drugs or devices. Despite the practical challenges, pooled analyses are powerful and essential tools in interpreting clinical trial findings and advancing clinical care. © 2016 American Heart Association, Inc.

  9. Non-Gaussian power grid frequency fluctuations characterized by Lévy-stable laws and superstatistics

    NASA Astrophysics Data System (ADS)

    Schäfer, Benjamin; Beck, Christian; Aihara, Kazuyuki; Witthaut, Dirk; Timme, Marc

    2018-02-01

    Multiple types of fluctuations impact the collective dynamics of power grids and thus challenge their robust operation. Fluctuations result from processes as different as dynamically changing demands, energy trading and an increasing share of renewable power feed-in. Here we analyse principles underlying the dynamics and statistics of power grid frequency fluctuations. Considering frequency time series for a range of power grids, including grids in North America, Japan and Europe, we find a strong deviation from Gaussianity best described as Lévy-stable and q-Gaussian distributions. We present a coarse framework to analytically characterize the impact of arbitrary noise distributions, as well as a superstatistical approach that systematically interprets heavy tails and skewed distributions. We identify energy trading as a substantial contribution to today's frequency fluctuations and effective damping of the grid as a controlling factor enabling reduction of fluctuation risks, with enhanced effects for small power grids.

  10. A Powerful Approach to Estimating Annotation-Stratified Genetic Covariance via GWAS Summary Statistics.

    PubMed

    Lu, Qiongshi; Li, Boyang; Ou, Derek; Erlendsdottir, Margret; Powles, Ryan L; Jiang, Tony; Hu, Yiming; Chang, David; Jin, Chentian; Dai, Wei; He, Qidu; Liu, Zefeng; Mukherjee, Shubhabrata; Crane, Paul K; Zhao, Hongyu

    2017-12-07

    Despite the success of large-scale genome-wide association studies (GWASs) on complex traits, our understanding of their genetic architecture is far from complete. Jointly modeling multiple traits' genetic profiles has provided insights into the shared genetic basis of many complex traits. However, large-scale inference sets a high bar for both statistical power and biological interpretability. Here we introduce a principled framework to estimate annotation-stratified genetic covariance between traits using GWAS summary statistics. Through theoretical and numerical analyses, we demonstrate that our method provides accurate covariance estimates, thereby enabling researchers to dissect both the shared and distinct genetic architecture across traits to better understand their etiologies. Among 50 complex traits with publicly accessible GWAS summary statistics (N total ≈ 4.5 million), we identified more than 170 pairs with statistically significant genetic covariance. In particular, we found strong genetic covariance between late-onset Alzheimer disease (LOAD) and amyotrophic lateral sclerosis (ALS), two major neurodegenerative diseases, in single-nucleotide polymorphisms (SNPs) with high minor allele frequencies and in SNPs located in the predicted functional genome. Joint analysis of LOAD, ALS, and other traits highlights LOAD's correlation with cognitive traits and hints at an autoimmune component for ALS. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  11. An add-in implementation of the RESAMPLING syntax under Microsoft EXCEL.

    PubMed

    Meineke, I

    2000-10-01

    The RESAMPLING syntax defines a set of powerful commands, which allow the programming of probabilistic statistical models with few, easily memorized statements. This paper presents an implementation of the RESAMPLING syntax using Microsoft EXCEL with Microsoft WINDOWS as a platform. Two examples are given to demonstrate typical applications of RESAMPLING in biomedicine. Details of the implementation, with special emphasis on the programming environment, are discussed at length. The add-in is available electronically to interested readers upon request. The use of the add-in facilitates numerical statistical analyses of data from within EXCEL in a comfortable way.
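
    The RESAMPLING commands themselves are specific to the add-in, but the style of analysis it supports can be sketched in a few lines of base R (a hypothetical bootstrap of a small biomedical sample):

      set.seed(5)
      conc <- c(12.1, 9.8, 14.3, 11.0, 10.5, 13.2, 9.1, 12.7)   # invented data
      boot_means <- replicate(10000, mean(sample(conc, replace = TRUE)))
      quantile(boot_means, c(0.025, 0.975))   # percentile bootstrap 95% CI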

  12. Analysis and meta-analysis of single-case designs: an introduction.

    PubMed

    Shadish, William R

    2014-04-01

    The last 10 years have seen great progress in the analysis and meta-analysis of single-case designs (SCDs). This special issue includes five articles that provide an overview of current work on that topic, including standardized mean difference statistics, multilevel models, Bayesian statistics, and generalized additive models. Each article analyzes a common example across articles and presents syntax or macros for how to do them. These articles are followed by commentaries from single-case design researchers and journal editors. This introduction briefly describes each article and then discusses several issues that must be addressed before we can know what analyses will eventually be best to use in SCD research. These issues include modeling trend, modeling error covariances, computing standardized effect size estimates, assessing statistical power, incorporating more accurate models of outcome distributions, exploring whether Bayesian statistics can improve estimation given the small samples common in SCDs, and the need for annotated syntax and graphical user interfaces that make complex statistics accessible to SCD researchers. The article then discusses reasons why SCD researchers are likely to incorporate statistical analyses into their research more often in the future, including changing expectations and contingencies regarding SCD research from outside SCD communities, changes and diversity within SCD communities, corrections of erroneous beliefs about the relationship between SCD research and statistics, and demonstrations of how statistics can help SCD researchers better meet their goals. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  13. A Statistical Method for Synthesizing Mediation Analyses Using the Product of Coefficient Approach Across Multiple Trials

    PubMed Central

    Huang, Shi; MacKinnon, David P.; Perrino, Tatiana; Gallo, Carlos; Cruden, Gracelyn; Brown, C Hendricks

    2016-01-01

    Mediation analysis often requires larger sample sizes than main effect analysis to achieve the same statistical power. Combining results across similar trials may be the only practical option for increasing statistical power for mediation analysis in some situations. In this paper, we propose a method to estimate: 1) marginal means for mediation path a, the relation of the independent variable to the mediator; 2) marginal means for path b, the relation of the mediator to the outcome, across multiple trials; and 3) the between-trial level variance-covariance matrix based on a bivariate normal distribution. We present the statistical theory and an R computer program to combine regression coefficients from multiple trials to estimate a combined mediated effect and confidence interval under a random effects model. Values of coefficients a and b, along with their standard errors from each trial are the input for the method. This marginal likelihood based approach with Monte Carlo confidence intervals provides more accurate inference than the standard meta-analytic approach. We discuss computational issues, apply the method to two real-data examples and make recommendations for the use of the method in different settings. PMID:28239330
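
    The Monte Carlo confidence interval idea can be sketched in R (a simplified fixed-effect version with invented trial coefficients; the paper's method is a random effects model with a between-trial variance-covariance matrix):

      set.seed(9)
      a_hat <- c(0.30, 0.25, 0.40); se_a <- c(0.10, 0.08, 0.12)   # path a, 3 trials
      b_hat <- c(0.50, 0.45, 0.55); se_b <- c(0.15, 0.12, 0.18)   # path b, 3 trials
      w_a <- 1 / se_a^2; w_b <- 1 / se_b^2                        # inverse-variance weights
      a_bar <- sum(w_a * a_hat) / sum(w_a); se_abar <- sqrt(1 / sum(w_a))
      b_bar <- sum(w_b * b_hat) / sum(w_b); se_bbar <- sqrt(1 / sum(w_b))
      ab <- rnorm(1e5, a_bar, se_abar) * rnorm(1e5, b_bar, se_bbar)  # simulate a*b
      quantile(ab, c(0.025, 0.975))   # Monte Carlo CI for the mediated effect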

  14. Analyzing longitudinal data with the linear mixed models procedure in SPSS.

    PubMed

    West, Brady T

    2009-09-01

    Many applied researchers analyzing longitudinal data share a common misconception: that specialized statistical software is necessary to fit hierarchical linear models (also known as linear mixed models [LMMs], or multilevel models) to longitudinal data sets. Although several specialized statistical software programs of high quality are available that allow researchers to fit these models to longitudinal data sets (e.g., HLM), rapid advances in general purpose statistical software packages have recently enabled analysts to fit these same models when using preferred packages that also enable other more common analyses. One of these general purpose statistical packages is SPSS, which includes a very flexible and powerful procedure for fitting LMMs to longitudinal data sets with continuous outcomes. This article aims to present readers with a practical discussion of how to analyze longitudinal data using the LMMs procedure in the SPSS statistical software package.
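
    The article's examples use the SPSS MIXED procedure; the same kind of growth model can be written in R with lme4, shown here as a hedged analogue on simulated data:

      # Random intercept and slope for each subject measured at five time points.
      library(lme4)
      set.seed(21)
      long <- expand.grid(subject = factor(1:40), time = 0:4)
      int_i   <- rnorm(40, 0, 2.0)[long$subject]   # subject-specific intercepts
      slope_i <- rnorm(40, 0, 0.3)[long$subject]   # subject-specific slope deviations
      long$outcome <- 10 + int_i + (1 + slope_i) * long$time + rnorm(nrow(long))
      fit <- lmer(outcome ~ time + (1 + time | subject), data = long)
      summary(fit)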

  15. Debate: Subgroup analyses in clinical trials: fun to look at - but don't believe them!

    PubMed Central

    Sleight, Peter

    2000-01-01

    Analysis of subgroup results in a clinical trial is surprisingly unreliable, even in a large trial. This is the result of a combination of reduced statistical power, increased variance and the play of chance. Reliance on such analyses is likely to be more erroneous, and hence harmful, than application of the overall proportional (or relative) result in the whole trial to the estimate of absolute risk in that subgroup. Plausible explanations can usually be found for effects that are, in reality, simply due to the play of chance. When clinicians believe such subgroup analyses, there is a real danger of harm to the individual patient. PMID:11714402

  16. ParallABEL: an R library for generalized parallelization of genome-wide association studies.

    PubMed

    Sangket, Unitsa; Mahasirimongkol, Surakameth; Chantratita, Wasun; Tandayya, Pichaya; Aulchenko, Yurii S

    2010-04-29

    Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. Acquiring the necessary programming skills to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files is arduous. Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics; the input data of this group are the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample; the input data of this group are the individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses; the input data of this group are pairs of individuals. The final group concerns pair-wise statistics derived for pairs of SNPs, such as linkage disequilibrium characterisation; the input data of this group are pairs of SNPs. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. The ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC), which includes 2,062 individuals genotyped at 545,080 SNPs, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity-by-state matrix was linearly reduced from approximately eight hours to one hour when ParallABEL employed eight processors. Executing genome-wide association analysis using the ParallABEL library on a computer cluster is an effective way to boost performance and simplify the parallelization of GWA studies. ParallABEL is a user-friendly parallelization of GenABEL.
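
    ParallABEL itself builds on Rmpi, but the underlying split-compute-merge pattern for group-one (SNP-wise) statistics can be sketched with base R's parallel package (illustrative only; mc.cores > 1 requires a Unix-alike system):

      library(parallel)
      set.seed(2)
      geno <- matrix(rbinom(1000 * 200, 2, 0.3), nrow = 1000)  # 1000 people x 200 SNPs
      trait <- rnorm(1000)
      snp_p <- function(j) summary(lm(trait ~ geno[, j]))$coefficients[2, 4]
      pvals <- unlist(mclapply(seq_len(ncol(geno)), snp_p, mc.cores = 4))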

  17. SPS market analysis. [small solar thermal power systems

    NASA Technical Reports Server (NTRS)

    Goff, H. C.

    1980-01-01

    A market analysis task included personal interviews by GE personnel and supplemental mail surveys to acquire statistical data and to identify and measure attitudes, reactions and intentions of prospective small solar thermal power systems (SPS) users. Over 500 firms were contacted, including three ownership classes of electric utilities, industrial firms in the top SIC codes for energy consumption, and design engineering firms. A market demand model was developed which utilizes the data base developed by personal interviews and surveys, and projected energy price and consumption data to perform sensitivity analyses and estimate potential markets for SPS.

  18. Publication Bias Currently Makes an Accurate Estimate of the Benefits of Enrichment Programs Difficult: A Postmortem of Two Meta-Analyses Using Statistical Power Analysis

    ERIC Educational Resources Information Center

    Warne, Russell T.

    2016-01-01

    Recently Kim (2016) published a meta-analysis on the effects of enrichment programs for gifted students. She found that these programs produced substantial effects for academic achievement (g = 0.96) and socioemotional outcomes (g = 0.55). However, given current theory and empirical research these estimates of the benefits of enrichment programs…

  19. A Fatigue Management System for Sustained Military Operations

    DTIC Science & Technology

    2008-03-31

    ...always under the direct observation of research personnel or knowingly monitored from a central control station by closed-circuit television... (blocks 1, 5, 9, and 13). Statistical Analyses: To determine the appropriate sample size for this study, a power analysis was based on the post...

  20. Bispectral analysis of equatorial spread F density irregularities

    NASA Technical Reports Server (NTRS)

    Labelle, J.; Lund, E. J.

    1992-01-01

    Bispectral analysis has been applied to density irregularities at frequencies 5-30 Hz observed with a sounding rocket launched from Peru in March 1983. Unlike the power spectrum, the bispectrum contains statistical information about the phase relations between the Fourier components which make up the waveform. In the case of spread F data from 475 km the 5-30 Hz portion of the spectrum displays overall enhanced bicoherence relative to that of the background instrumental noise and to that expected due to statistical considerations, implying that the observed f exp -2.5 power law spectrum has a significant non-Gaussian component. This is consistent with previous qualitative analyses. The bicoherence has also been calculated for simulated equatorial spread F density irregularities in approximately the same wavelength regime, and the resulting bispectrum has some features in common with that of the rocket data. The implications of this analysis for equatorial spread F are discussed, and some future investigations are suggested.
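
    The quantity at issue can be sketched in R as a segment-averaged bicoherence at one frequency pair (a generic estimator, not the authors' implementation; i1 and i2 are FFT bin indices, so bin i1 + i2 - 1 carries the sum frequency):

      bicoherence <- function(x, seg_len, i1, i2) {
        segs <- split(x, ceiling(seq_along(x) / seg_len))
        segs <- segs[lengths(segs) == seg_len]   # drop any short trailing segment
        triple <- 0; n1 <- 0; n2 <- 0
        for (s in segs) {
          X <- fft(s - mean(s))
          triple <- triple + X[i1] * X[i2] * Conj(X[i1 + i2 - 1])
          n1 <- n1 + Mod(X[i1] * X[i2])^2
          n2 <- n2 + Mod(X[i1 + i2 - 1])^2
        }
        Mod(triple)^2 / (n1 * n2)   # near 1: phase-coupled; near 0: random phases
      }
      # e.g., bicoherence(density_series, seg_len = 1024, i1 = 6, i2 = 9)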

  1. The impact of registration accuracy on imaging validation study design: A novel statistical power calculation.

    PubMed

    Gibson, Eli; Fenster, Aaron; Ward, Aaron D

    2013-10-01

    Novel imaging modalities are pushing the boundaries of what is possible in medical imaging, but their signal properties are not always well understood. The evaluation of these novel imaging modalities is critical to achieving their research and clinical potential. Image registration of novel modalities to accepted reference standard modalities is an important part of characterizing the modalities and elucidating the effect of underlying focal disease on the imaging signal. The strengths of the conclusions drawn from these analyses are limited by statistical power. Based on the observation that in this context, statistical power depends in part on uncertainty arising from registration error, we derive a power calculation formula relating registration error, number of subjects, and the minimum detectable difference between normal and pathologic regions on imaging, for an imaging validation study design that accommodates signal correlations within image regions. Monte Carlo simulations were used to evaluate the derived models and test the strength of their assumptions, showing that the model yielded predictions of the power, the number of subjects, and the minimum detectable difference of simulated experiments accurate to within a maximum error of 1% when the assumptions of the derivation were met, and characterizing sensitivities of the model to violations of the assumptions. The use of these formulae is illustrated through a calculation of the number of subjects required for a case study, modeled closely after a prostate cancer imaging validation study currently taking place at our institution. The power calculation formulae address three central questions in the design of imaging validation studies: (1) What is the maximum acceptable registration error? (2) How many subjects are needed? (3) What is the minimum detectable difference between normal and pathologic image regions? Copyright © 2013 Elsevier B.V. All rights reserved.
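
    The flavour of the three questions can be captured with a generic normal-approximation sample size formula in which registration error inflates the effective variance (a sketch only; the paper's derived formula additionally accommodates signal correlations within image regions):

      n_per_group <- function(delta, sd_signal, sd_reg, alpha = 0.05, power = 0.80) {
        sd_eff <- sqrt(sd_signal^2 + sd_reg^2)   # registration error adds variance
        2 * (qnorm(1 - alpha / 2) + qnorm(power))^2 * sd_eff^2 / delta^2
      }
      n_per_group(delta = 0.5, sd_signal = 1, sd_reg = 0.0)  # no registration error
      n_per_group(delta = 0.5, sd_signal = 1, sd_reg = 0.5)  # error raises required n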

  2. Power analysis and trend detection for water quality monitoring data. An application for the Greater Yellowstone Inventory and Monitoring Network

    USGS Publications Warehouse

    Irvine, Kathryn M.; Manlove, Kezia; Hollimon, Cynthia

    2012-01-01

    An important consideration for long term monitoring programs is determining the required sampling effort to detect trends in specific ecological indicators of interest. To enhance the Greater Yellowstone Inventory and Monitoring Network’s water resources protocol(s) (O’Ney 2006 and O’Ney et al. 2009 [under review]), we developed a set of tools to: (1) determine the statistical power for detecting trends of varying magnitude in a specified water quality parameter over different lengths of sampling (years) and different within-year collection frequencies (monthly or seasonal sampling) at particular locations using historical data, and (2) perform periodic trend analyses for water quality parameters while addressing seasonality and flow weighting. A power analysis for trend detection is a statistical procedure used to estimate the probability of rejecting the hypothesis of no trend when in fact there is a trend, within a specific modeling framework. In this report, we base our power estimates on using the seasonal Kendall test (Helsel and Hirsch 2002) for detecting trend in water quality parameters measured at fixed locations over multiple years. We also present procedures (R-scripts) for conducting a periodic trend analysis using the seasonal Kendall test with and without flow adjustment. This report provides the R-scripts developed for power and trend analysis, tutorials, and the associated tables and graphs. The purpose of this report is to provide practical information for monitoring network staff on how to use these statistical tools for water quality monitoring data sets.
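
    The report distributes R-scripts for these computations; the core simulation idea can be sketched as follows (a plain Kendall test is shown for brevity, where the report's seasonal version additionally blocks by season):

      trend_power <- function(years, pct_change, sd_log = 0.3, nsim = 1000) {
        mean(replicate(nsim, {
          t <- 1:years
          y <- 10 * (1 + pct_change)^t * exp(rnorm(years, 0, sd_log))
          cor.test(t, y, method = "kendall")$p.value < 0.05
        }))
      }
      trend_power(years = 10, pct_change = 0.05)   # power for 5%/year over 10 years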

  3. Detecting trends in raptor counts: power and type I error rates of various statistical tests

    USGS Publications Warehouse

    Hatfield, J.S.; Gould, W.R.; Hoover, B.A.; Fuller, M.R.; Lindquist, E.L.

    1996-01-01

    We conducted simulations that estimated power and type I error rates of statistical tests for detecting trends in raptor population count data collected from a single monitoring site. Results of the simulations were used to help analyze count data of bald eagles (Haliaeetus leucocephalus) from 7 national forests in Michigan, Minnesota, and Wisconsin during 1980-1989. Seven statistical tests were evaluated, including simple linear regression on the log scale and linear regression with a permutation test. Using 1,000 replications each, we simulated n = 10 and n = 50 years of count data and trends ranging from -5 to 5% change/year. We evaluated the tests at 3 critical levels (alpha = 0.01, 0.05, and 0.10) for both upper- and lower-tailed tests. Exponential count data were simulated by adding sampling error with a coefficient of variation of 40% from either a log-normal or autocorrelated log-normal distribution. Not surprisingly, tests performed with 50 years of data were much more powerful than tests with 10 years of data. Positive autocorrelation inflated alpha-levels upward from their nominal levels, making the tests less conservative and more likely to reject the null hypothesis of no trend. Of the tests studied, Cox and Stuart's test and Pollard's test clearly had lower power than the others. Surprisingly, the linear regression t-test, Collins' linear regression permutation test, and the nonparametric Lehmann's and Mann's tests all had similar power in our simulations. Analyses of the count data suggested that bald eagles had increasing trends on at least 2 of the 7 national forests during 1980-1989.
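
    A stripped-down analogue of these simulations in R (log-scale linear regression only, without the autocorrelated-error and one-tailed variants the authors also examined):

      sim_power <- function(years, rate, nsim = 1000, alpha = 0.05) {
        sd_log <- sqrt(log(1 + 0.4^2))   # lognormal sigma giving a CV of 40%
        mean(replicate(nsim, {
          t <- 1:years
          y <- 100 * (1 + rate)^t * rlnorm(years, 0, sd_log)
          coef(summary(lm(log(y) ~ t)))["t", "Pr(>|t|)"] < alpha
        }))
      }
      sim_power(10, 0.05)   # 10 years of counts, 5%/year trend
      sim_power(50, 0.05)   # 50 years: much more powerful, as in the simulations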

  4. Guidelines for the design and statistical analysis of experiments in papers submitted to ATLA.

    PubMed

    Festing, M F

    2001-01-01

    In vitro experiments need to be well designed and correctly analysed if they are to achieve their full potential to replace the use of animals in research. An "experiment" is a procedure for collecting scientific data in order to answer a hypothesis, or to provide material for generating new hypotheses, and differs from a survey because the scientist has control over the treatments that can be applied. Most experiments can be classified into one of a few formal designs, the most common being completely randomised, and randomised block designs. These are quite common with in vitro experiments, which are often replicated in time. Some experiments involve a single independent (treatment) variable, while other "factorial" designs simultaneously vary two or more independent variables, such as drug treatment and cell line. Factorial designs often provide additional information at little extra cost. Experiments need to be carefully planned to avoid bias, be powerful yet simple, provide for a valid statistical analysis and, in some cases, have a wide range of applicability. Virtually all experiments need some sort of statistical analysis in order to take account of biological variation among the experimental subjects. Parametric methods using the t test or analysis of variance are usually more powerful than non-parametric methods, provided the underlying assumptions of normality of the residuals and equal variances are approximately valid. The statistical analyses of data from a completely randomised design, and from a randomised-block design are demonstrated in Appendices 1 and 2, and methods of determining sample size are discussed in Appendix 3. Appendix 4 gives a checklist for authors submitting papers to ATLA.
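
    A generic example of the randomised block analysis of the kind demonstrated in Appendix 2 (invented data, with each in vitro run treated as a block):

      set.seed(8)
      d <- expand.grid(run = factor(1:4), treatment = factor(c("control", "low", "high")))
      d$response <- c(rnorm(4, 100), rnorm(4, 105), rnorm(4, 112)) +
                    rep(rnorm(4, 0, 5), times = 3)        # shared run-to-run variation
      summary(aov(response ~ treatment + run, data = d))  # blocking removes run variation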

  5. A randomized evaluation of a computer-based physician's workstation: design considerations and baseline results.

    PubMed Central

    Rotman, B. L.; Sullivan, A. N.; McDonald, T.; DeSmedt, P.; Goodnature, D.; Higgins, M.; Suermondt, H. J.; Young, C. Y.; Owens, D. K.

    1995-01-01

    We are performing a randomized, controlled trial of a Physician's Workstation (PWS), an ambulatory care information system, developed for use in the General Medical Clinic (GMC) of the Palo Alto VA. Goals for the project include selecting appropriate outcome variables and developing a statistically powerful experimental design with a limited number of subjects. As PWS provides real-time drug-ordering advice, we retrospectively examined drug costs and drug-drug interactions in order to select outcome variables sensitive to our short-term intervention as well as to estimate the statistical efficiency of alternative design possibilities. Drug cost data revealed that the mean daily cost per physician per patient was 99.3 ± 13.4 cents, with a range from $0.77 to $1.37. The rate of major interactions per prescription for each physician was 2.9% ± 1%, with a range from 1.5% to 4.8%. Based on these baseline analyses, we selected a two-period parallel design for the evaluation, which maximized statistical power while minimizing sources of bias. PMID:8563376

  6. Efficacy of isokinetic exercise on functional capacity and pain in patellofemoral pain syndrome.

    PubMed

    Alaca, Ridvan; Yilmaz, Bilge; Goktepe, A Salim; Mohur, Haydar; Kalyon, Tunc Alp

    2002-11-01

    To assess the effect of an isokinetic exercise program on symptoms and functions of patients with patellofemoral pain syndrome. A total of 22 consecutive patients with the complaint of anterior knee pain who met the inclusion criteria were recruited to assess the efficacy of isokinetic exercise on functional capacity, isokinetic parameters, and pain scores in patients with patellofemoral pain syndrome. A total of 37 knees were examined. Six-meter hopping, three-step hopping, and single-limb hopping course tests were performed for each patient, along with measurements on the Lysholm scale and a visual analog scale. Tested parameters were peak torque, total work, average power, and endurance ratios. Statistical analyses revealed that at the end of the 6-wk treatment period, functional and isokinetic parameters improved significantly, as did pain scores. There was no statistically significant correlation between the different groups of parameters. The isokinetic exercise treatment program used in this study prevented the extensor power loss due to patellofemoral pain syndrome, but the improvement in functional capacity was not correlated with the gained power.

  7. Application of multivariate statistical techniques in microbial ecology.

    PubMed

    Paliy, O; Shankar, V

    2016-03-01

    Recent advances in high-throughput methods of molecular analyses have led to an explosion of studies generating large-scale ecological data sets. Progress has been particularly noticeable in the field of microbial ecology, where new experimental approaches have provided in-depth assessments of the composition, functions and dynamic changes of complex microbial communities. Because even a single high-throughput experiment produces a large amount of data, powerful statistical techniques of multivariate analysis are well suited to analyse and interpret these data sets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular data set. In this review, we describe and compare the most widely used multivariate statistical techniques, including exploratory, interpretive and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and data set structure. © 2016 John Wiley & Sons Ltd.

  8. The intervals method: a new approach to analyse finite element outputs using multivariate statistics

    PubMed Central

    De Esteban-Trivigno, Soledad; Püschel, Thomas A.; Fortuny, Josep

    2017-01-01

    Background In this paper, we propose a new method, named the intervals’ method, to analyse data from finite element models in a comparative multivariate framework. As a case study, several armadillo mandibles are analysed, showing that the proposed method is useful to distinguish and characterise biomechanical differences related to diet/ecomorphology. Methods The intervals’ method consists of generating a set of variables, each one defined by an interval of stress values. Each variable is expressed as a percentage of the area of the mandible occupied by those stress values. Afterwards these newly generated variables can be analysed using multivariate methods. Results Applying this novel method to the biological case study of whether armadillo mandibles differ according to dietary groups, we show that the intervals’ method is a powerful tool to characterize biomechanical performance and how this relates to different diets. This allows us to positively discriminate between specialist and generalist species. Discussion We show that the proposed approach is a useful methodology not affected by the characteristics of the finite element mesh. Additionally, the positive discriminating results obtained when analysing a difficult case study suggest that the proposed method could be a very useful tool for comparative studies in finite element analysis using multivariate statistical approaches. PMID:29043107
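
    A condensed sketch of the intervals' method in R (hypothetical stress units and breakpoints): each model is reduced to the percentage of its area falling in fixed stress intervals, and the resulting variables feed a standard multivariate analysis.

      interval_profile <- function(stress, area, breaks) {
        bins <- cut(stress, breaks = breaks, include.lowest = TRUE)
        100 * tapply(area, bins, sum, default = 0) / sum(area)
      }
      set.seed(4)
      breaks <- seq(0, 50, by = 5)   # assumed stress intervals (e.g., MPa)
      models <- replicate(6, interval_profile(runif(200, 0, 50), runif(200), breaks))
      pca <- prcomp(t(models))       # six mandibles in interval-variable space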

  9. Power and trust in organizational relations: an empirical study in Turkish public hospitals.

    PubMed

    Bozaykut, Tuba; Gurbuz, F Gulruh

    2015-01-01

    Given the salience of the interplay between trust and power relations in organizational settings, this paper examines the perceptions of social power and its effects on trust in supervisors within the context of public hospitals. Following the theoretical background from which the study model is developed, the recent situation of hospitals within Turkish healthcare system is discussed to further elucidate the working conditions of physicians. Sample data were collected employing a structured questionnaire that was distributed to physicians working at seven different public hospitals. The statistical analyses indicate that perceptions of supervisors' social power affect subordinates' trust in supervisors. Although coercive power is found to have the greatest impact on trust in supervisors, the influence of the power base is weak. In addition, the results show that perceptions of social power differ between genders. However, the results do not support any of the hypotheses regarding the relations between trust in supervisors and the examined demographic variables. Copyright © 2014 John Wiley & Sons, Ltd.

  10. Researchers’ Intuitions About Power in Psychological Research

    PubMed Central

    Bakker, Marjan; Hartgerink, Chris H. J.; Wicherts, Jelte M.; van der Maas, Han L. J.

    2016-01-01

    Many psychology studies are statistically underpowered. In part, this may be because many researchers rely on intuition, rules of thumb, and prior practice (along with practical considerations) to determine the number of subjects to test. In Study 1, we surveyed 291 published research psychologists and found large discrepancies between their reports of their preferred amount of power and the actual power of their studies (calculated from their reported typical cell size, typical effect size, and acceptable alpha). Furthermore, in Study 2, 89% of the 214 respondents overestimated the power of specific research designs with a small expected effect size, and 95% underestimated the sample size needed to obtain .80 power for detecting a small effect. Neither researchers’ experience nor their knowledge predicted the bias in their self-reported power intuitions. Because many respondents reported that they based their sample sizes on rules of thumb or common practice in the field, we recommend that researchers conduct and report formal power analyses for their studies. PMID:27354203
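
    The corrective the authors recommend takes one line in base R; for the small effect (d = 0.2) that most respondents misjudged, the required group size is far larger than common practice:

      power.t.test(delta = 0.2, sd = 1, power = 0.80)   # about 394 per group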

  11. Researchers' Intuitions About Power in Psychological Research.

    PubMed

    Bakker, Marjan; Hartgerink, Chris H J; Wicherts, Jelte M; van der Maas, Han L J

    2016-08-01

    Many psychology studies are statistically underpowered. In part, this may be because many researchers rely on intuition, rules of thumb, and prior practice (along with practical considerations) to determine the number of subjects to test. In Study 1, we surveyed 291 published research psychologists and found large discrepancies between their reports of their preferred amount of power and the actual power of their studies (calculated from their reported typical cell size, typical effect size, and acceptable alpha). Furthermore, in Study 2, 89% of the 214 respondents overestimated the power of specific research designs with a small expected effect size, and 95% underestimated the sample size needed to obtain .80 power for detecting a small effect. Neither researchers' experience nor their knowledge predicted the bias in their self-reported power intuitions. Because many respondents reported that they based their sample sizes on rules of thumb or common practice in the field, we recommend that researchers conduct and report formal power analyses for their studies. © The Author(s) 2016.

  12. Evaluating and Reporting Statistical Power in Counseling Research

    ERIC Educational Resources Information Center

    Balkin, Richard S.; Sheperis, Carl J.

    2011-01-01

    Despite recommendations from the "Publication Manual of the American Psychological Association" (6th ed.) to include information on statistical power when publishing quantitative results, authors seldom include analysis or discussion of statistical power. The rationale for discussing statistical power is addressed, approaches to using "G*Power" to…

  13. Entropy Based Genetic Association Tests and Gene-Gene Interaction Tests

    PubMed Central

    de Andrade, Mariza; Wang, Xin

    2011-01-01

    In the past few years, several entropy-based tests have been proposed for testing either single SNP association or gene-gene interaction. These tests are mainly based on Shannon entropy and have higher statistical power when compared to standard χ2 tests. In this paper, we extend some of these tests using a more generalized entropy definition, Rényi entropy, where Shannon entropy is a special case of order 1. The order λ (>0) of Rényi entropy weights the events (genotype/haplotype) according to their probabilities (frequencies). Higher λ places more emphasis on higher probability events while smaller λ (close to 0) tends to assign weights more equally. Thus, by properly choosing the λ, one can potentially increase the power of the tests or the p-value level of significance. We conducted simulation as well as real data analyses to assess the impact of the order λ and the performance of these generalized tests. The results showed that for dominant model the order 2 test was more powerful and for multiplicative model the order 1 or 2 had similar power. The analyses indicate that the choice of λ depends on the underlying genetic model and Shannon entropy is not necessarily the most powerful entropy measure for constructing genetic association or interaction tests. PMID:23089811
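
    The order-λ entropy underlying these tests is simple to compute; a minimal R sketch (with invented genotype frequencies) that recovers Shannon entropy as λ → 1:

      renyi <- function(p, lambda) {
        p <- p[p > 0]
        if (abs(lambda - 1) < 1e-8) return(-sum(p * log(p)))   # Shannon limit
        log(sum(p^lambda)) / (1 - lambda)
      }
      geno <- c(AA = 0.49, Aa = 0.42, aa = 0.09)   # illustrative frequencies
      renyi(geno, 1)   # Shannon entropy (order 1)
      renyi(geno, 2)   # order 2, emphasising the common genotypes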

  14. A computational framework for estimating statistical power and planning hypothesis-driven experiments involving one-dimensional biomechanical continua.

    PubMed

    Pataky, Todd C; Robinson, Mark A; Vanrenterghem, Jos

    2018-01-03

    Statistical power assessment is an important component of hypothesis-driven research but until relatively recently (mid-1990s) no methods were available for assessing power in experiments involving continuum data and in particular those involving one-dimensional (1D) time series. The purpose of this study was to describe how continuum-level power analyses can be used to plan hypothesis-driven biomechanics experiments involving 1D data. In particular, we demonstrate how theory- and pilot-driven 1D effect modeling can be used for sample-size calculations for both single- and multi-subject experiments. For theory-driven power analysis we use the minimum jerk hypothesis and single-subject experiments involving straight-line, planar reaching. For pilot-driven power analysis we use a previously published knee kinematics dataset. Results show that powers on the order of 0.8 can be achieved with relatively small sample sizes, five and ten for within-subject minimum jerk analysis and between-subject knee kinematics, respectively. However, the appropriate sample size depends on a priori justifications of biomechanical meaning and effect size. The main advantage of the proposed technique is that it encourages a priori justification regarding the clinical and/or scientific meaning of particular 1D effects, thereby robustly structuring subsequent experimental inquiry. In short, it shifts focus from a search for significance to a search for non-rejectable hypotheses. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Separate enrichment analysis of pathways for up- and downregulated genes.

    PubMed

    Hong, Guini; Zhang, Wenjing; Li, Hongdong; Shen, Xiaopei; Guo, Zheng

    2014-03-06

    Two strategies are often adopted for enrichment analysis of pathways: the analysis of all differentially expressed (DE) genes together or the analysis of up- and downregulated genes separately. However, few studies have examined the rationales of these enrichment analysis strategies. Using both microarray and RNA-seq data, we show that gene pairs with functional links in pathways tended to have positively correlated expression levels, which could result in an imbalance between the up- and downregulated genes in particular pathways. We then show that the imbalance could greatly reduce the statistical power for finding disease-associated pathways through the analysis of all-DE genes. Further, using gene expression profiles from five types of tumours, we illustrate that the separate analysis of up- and downregulated genes could identify more pathways that are really pertinent to phenotypic difference. In conclusion, analysing up- and downregulated genes separately is more powerful than analysing all of the DE genes together.
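
    The two strategies can be contrasted with a toy hypergeometric enrichment test (the counts are invented; the pathway is skewed toward upregulated genes, so the all-DE test dilutes the signal):

    ```python
    from scipy.stats import hypergeom

    def enrichment_p(n_genome, n_pathway, n_selected, n_overlap):
        # P(X >= n_overlap) for X ~ Hypergeom(n_genome, n_pathway, n_selected)
        return hypergeom.sf(n_overlap - 1, n_genome, n_pathway, n_selected)

    N, K = 20000, 100            # genes in genome / in the pathway
    up, down = 400, 400          # up- and downregulated DE genes
    hit_up, hit_down = 12, 1     # pathway members in each list

    print("all DE: ", enrichment_p(N, K, up + down, hit_up + hit_down))
    print("up only:", enrichment_p(N, K, up, hit_up))  # smaller p-value
    ```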

  16. Statistical Selection of Biological Models for Genome-Wide Association Analyses.

    PubMed

    Bi, Wenjian; Kang, Guolian; Pounds, Stanley B

    2018-05-24

Genome-wide association studies have discovered many biologically important associations of genes with phenotypes. Typically, genome-wide association analyses formally test the association of each genetic feature (SNP, CNV, etc.) with the phenotype of interest and summarize the results with multiplicity-adjusted p-values. However, very small p-values only provide evidence against the null hypothesis of no association without indicating which biological model best explains the observed data. Correctly identifying a specific biological model may improve the scientific interpretation and can be used to more effectively select and design a follow-up validation study. Thus, statistical methodology to identify the correct biological model for a particular genotype-phenotype association can be very useful to investigators. Here, we propose a general statistical method to summarize how accurately each of five biological models (null, additive, dominant, recessive, co-dominant) represents the data observed for each variant in a GWAS. We show that the new method stringently controls the false discovery rate and asymptotically selects the correct biological model. Simulations of two-stage discovery-validation studies show that the new method has these properties and that its validation power is similar to or exceeds that of simple methods that use the same statistical model for all SNPs. Example analyses of three data sets also highlight these advantages of the new method. An R package is freely available at www.stjuderesearch.org/site/depts/biostats/maew. Copyright © 2018. Published by Elsevier Inc.
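
    One simple way to make the model-selection idea concrete is to refit a logistic model under each genotype encoding and compare information criteria; this sketch mirrors the spirit of the record, not its exact FDR-controlling procedure:

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    g = rng.choice([0, 1, 2], size=2000, p=[0.49, 0.42, 0.09])  # genotypes
    logit_p = -1.0 + 0.8 * (g >= 1)                # simulated dominant effect
    y = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

    encodings = {
        "additive":  g.astype(float),
        "dominant":  (g >= 1).astype(float),
        "recessive": (g == 2).astype(float),
    }
    for name, x in encodings.items():
        fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
        print(f"{name:10s} BIC = {fit.bic:8.1f}")
    # The co-dominant model would use two indicator columns; the null model,
    # an intercept only. Lowest BIC indicates the best-supported encoding.
    ```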

  17. An Evaluative Report on the Current Status of Parapsychology

    DTIC Science & Technology

    1986-05-01

    mentation" (Stanford, 1979). The ganzfeld procedure eliminates patterned stimulation in the visual h and auditory modes. Visual isolation is provided by...distracting external stimulation . The most popular of such techniques is the ganzfeld, a procedure in which the subject looks through halves of ping...powerful statistical analyses. Ongoing analog or digital feedback can be provided to subjects in innumerable ways in either the visual or auditory mode

  18. The statistical big bang of 1911: ideology, technological innovation and the production of medical statistics.

    PubMed

    Higgs, W

    1996-12-01

    This paper examines the relationship between intellectual debate, technologies for analysing information, and the production of statistics in the General Register Office (GRO) in London in the early twentieth century. It argues that controversy between eugenicists and public health officials respecting the cause and effect of class-specific variations in fertility led to the introduction of questions in the 1911 census on marital fertility. The increasing complexity of the census necessitated a shift from manual to mechanised forms of data processing within the GRO. The subsequent increase in processing power allowed the GRO to make important changes to the medical and demographic statistics it published in the annual Reports of the Registrar General. These included substituting administrative sanitary districts for registration districts as units of analysis, consistently transferring deaths in institutions back to place of residence, and abstracting deaths according to the International List of Causes of Death.

  19. A Guerilla Guide to Common Problems in ‘Neurostatistics’: Essential Statistical Topics in Neuroscience

    PubMed Central

    Smith, Paul F.

    2017-01-01

    Effective inferential statistical analysis is essential for high quality studies in neuroscience. However, recently, neuroscience has been criticised for the poor use of experimental design and statistical analysis. Many of the statistical issues confronting neuroscience are similar to other areas of biology; however, there are some that occur more regularly in neuroscience studies. This review attempts to provide a succinct overview of some of the major issues that arise commonly in the analyses of neuroscience data. These include: the non-normal distribution of the data; inequality of variance between groups; extensive correlation in data for repeated measurements across time or space; excessive multiple testing; inadequate statistical power due to small sample sizes; pseudo-replication; and an over-emphasis on binary conclusions about statistical significance as opposed to effect sizes. Statistical analysis should be viewed as just another neuroscience tool, which is critical to the final outcome of the study. Therefore, it needs to be done well and it is a good idea to be proactive and seek help early, preferably before the study even begins. PMID:29371855

  1. Are the Nonparametric Person-Fit Statistics More Powerful than Their Parametric Counterparts? Revisiting the Simulations in Karabatsos (2003)

    ERIC Educational Resources Information Center

    Sinharay, Sandip

    2017-01-01

Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristic (ROC) curves and found the "H^T" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…

  2. Power of treatment success definitions when the Canine Brief Pain Inventory is used to evaluate carprofen treatment for the control of pain and inflammation in dogs with osteoarthritis.

    PubMed

    Brown, Dorothy Cimino; Bell, Margie; Rhodes, Linda

    2013-12-01

To determine the optimal method for use of the Canine Brief Pain Inventory (CBPI) to quantitate responses of dogs with osteoarthritis to treatment with carprofen or placebo. 150 dogs with osteoarthritis. Data were analyzed from 2 studies with identical protocols in which owner-completed CBPIs were used. Treatment for each dog was classified as a success or failure by comparing the pain severity score (PSS) and pain interference score (PIS) on day 0 (baseline) with those on day 14. Treatment success or failure was defined on the basis of various combinations of reduction in the 2 scores when inclusion criteria were set as a PSS and PIS ≥ 1, 2, or 3 at baseline. Statistical analyses were performed to select the definition of treatment success that had the greatest statistical power to detect differences between carprofen and placebo treatments. Defining treatment success as a reduction of ≥ 1 in PSS and ≥ 2 in PIS in each dog had consistently robust power. Power was 62.8% in the population that included only dogs with baseline scores ≥ 2 and 64.7% in the population that included only dogs with baseline scores ≥ 3. The CBPI had robust statistical power to evaluate the treatment effect of carprofen in dogs with osteoarthritis when protocol success criteria were predefined as a reduction of ≥ 1 in PSS and ≥ 2 in PIS. Results indicated the CBPI can be used as an outcome measure in clinical trials to evaluate new pain treatments when it is desirable to evaluate success in individual dogs rather than overall mean or median scores in a test population.

  3. A weighted U-statistic for genetic association analyses of sequencing data.

    PubMed

    Wei, Changshuai; Li, Ming; He, Zihuai; Vsevolozhskaya, Olga; Schaid, Daniel J; Lu, Qing

    2014-12-01

With advancements in next-generation sequencing technology, a massive amount of sequencing data is generated, which offers a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, the high-dimensional sequencing data poses a great challenge for statistical analysis. The association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a Weighted U Sequencing test, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a nonparametric U-statistic, WU-SEQ makes no assumption of the underlying disease model and phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed a commonly used sequence kernel association test (SKAT) method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL4 and very low density lipoprotein cholesterol. © 2014 WILEY PERIODICALS, INC.
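
    The general shape of such a statistic can be sketched generically; the kernel and weights below are placeholders, not the WU-SEQ definitions (which are derived from genotype similarity):

    ```python
    import numpy as np

    def weighted_u(y, w, kernel):
        """Generic weighted U-statistic: sum of w[i, j] * kernel(y[i], y[j])
        over all pairs i < j, normalized by the total weight."""
        n = len(y)
        num, den = 0.0, 0.0
        for i in range(n):
            for j in range(i + 1, n):
                num += w[i, j] * kernel(y[i], y[j])
                den += w[i, j]
        return num / den

    rng = np.random.default_rng(2)
    y = rng.standard_t(df=3, size=50)          # heavy-tailed phenotype
    # Placeholder weights; WU-SEQ derives them from genotype similarity.
    w = np.abs(rng.normal(size=(50, 50)))
    u = weighted_u(y, w, kernel=lambda a, b: abs(a - b))
    print(f"U = {u:.3f}")
    ```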

  4. Assessing the effect of land use change on catchment runoff by combined use of statistical tests and hydrological modelling: Case studies from Zimbabwe

    NASA Astrophysics Data System (ADS)

    Lørup, Jens Kristian; Refsgaard, Jens Christian; Mazvimavi, Dominic

    1998-03-01

The purpose of this study was to identify and assess long-term impacts of land use change on catchment runoff in semi-arid Zimbabwe, based on analyses of long hydrological time series (25-50 years) from six medium-sized (200-1000 km²) non-experimental rural catchments. A methodology combining common statistical methods with hydrological modelling was adopted in order to distinguish between the effects of climate variability and the effects of land use change. The hydrological model (NAM) was in general able to simulate the observed hydrographs very well during the reference period, thus providing a means to account for the effects of climate variability and hence strengthening the power of the subsequent statistical tests. In the test period the validated model was used to provide the runoff record which would have occurred in the absence of land use change. The analyses indicated a decrease in the annual runoff for most of the six catchments, with the largest changes occurring for catchments located within communal land, where large increases in population and agricultural intensity have taken place. However, the decrease was only statistically significant at the 5% level for one of the catchments.

  5. An empirical comparison of statistical tests for assessing the proportional hazards assumption of Cox's model.

    PubMed

    Ng'andu, N H

    1997-03-30

In the analysis of survival data using the Cox proportional hazards (PH) model, it is important to verify that the explanatory variables analysed satisfy the proportional hazards assumption of the model. This paper presents results of a simulation study that compares five test statistics to check the proportional hazards assumption of Cox's model. The test statistics were evaluated under proportional hazards and the following types of departures from the proportional hazards assumption: increasing relative hazards; decreasing relative hazards; crossing hazards; diverging hazards; and non-monotonic hazards. The test statistics compared include those based on partitioning of failure time and those that do not require partitioning of failure time. The simulation results demonstrate that the time-dependent covariate test, the weighted residuals score test and the linear correlation test have equally good power for detection of non-proportionality in the varieties of non-proportional hazards studied. When applied to illustrative data from the literature, these test statistics performed similarly.
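
    One of the compared approaches, the weighted (scaled Schoenfeld) residuals score test, is implemented in the Python lifelines package. A sketch, assuming lifelines is installed and using one of its stock datasets rather than data from the paper:

    ```python
    from lifelines import CoxPHFitter
    from lifelines.datasets import load_rossi
    from lifelines.statistics import proportional_hazard_test

    df = load_rossi()
    cph = CoxPHFitter().fit(df, duration_col='week', event_col='arrest')

    # Tests each covariate for non-proportionality under a chosen time transform.
    result = proportional_hazard_test(cph, df, time_transform='rank')
    result.print_summary()
    ```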

  6. Progressive statistics for studies in sports medicine and exercise science.

    PubMed

    Hopkins, William G; Marshall, Stephen W; Batterham, Alan M; Hanin, Juri

    2009-01-01

    Statistical guidelines and expert statements are now available to assist in the analysis and reporting of studies in some biomedical disciplines. We present here a more progressive resource for sample-based studies, meta-analyses, and case studies in sports medicine and exercise science. We offer forthright advice on the following controversial or novel issues: using precision of estimation for inferences about population effects in preference to null-hypothesis testing, which is inadequate for assessing clinical or practical importance; justifying sample size via acceptable precision or confidence for clinical decisions rather than via adequate power for statistical significance; showing SD rather than SEM, to better communicate the magnitude of differences in means and nonuniformity of error; avoiding purely nonparametric analyses, which cannot provide inferences about magnitude and are unnecessary; using regression statistics in validity studies, in preference to the impractical and biased limits of agreement; making greater use of qualitative methods to enrich sample-based quantitative projects; and seeking ethics approval for public access to the depersonalized raw data of a study, to address the need for more scrutiny of research and better meta-analyses. Advice on less contentious issues includes the following: using covariates in linear models to adjust for confounders, to account for individual differences, and to identify potential mechanisms of an effect; using log transformation to deal with nonuniformity of effects and error; identifying and deleting outliers; presenting descriptive, effect, and inferential statistics in appropriate formats; and contending with bias arising from problems with sampling, assignment, blinding, measurement error, and researchers' prejudices. This article should advance the field by stimulating debate, promoting innovative approaches, and serving as a useful checklist for authors, reviewers, and editors.

  7. Economic evaluation of factorial randomised controlled trials: challenges, methods and recommendations

    PubMed Central

    Gray, Alastair

    2017-01-01

    Increasing numbers of economic evaluations are conducted alongside randomised controlled trials. Such studies include factorial trials, which randomise patients to different levels of two or more factors and can therefore evaluate the effect of multiple treatments alone and in combination. Factorial trials can provide increased statistical power or assess interactions between treatments, but raise additional challenges for trial‐based economic evaluations: interactions may occur more commonly for costs and quality‐adjusted life‐years (QALYs) than for clinical endpoints; economic endpoints raise challenges for transformation and regression analysis; and both factors must be considered simultaneously to assess which treatment combination represents best value for money. This article aims to examine issues associated with factorial trials that include assessment of costs and/or cost‐effectiveness, describe the methods that can be used to analyse such studies and make recommendations for health economists, statisticians and trialists. A hypothetical worked example is used to illustrate the challenges and demonstrate ways in which economic evaluations of factorial trials may be conducted, and how these methods affect the results and conclusions. Ignoring interactions introduces bias that could result in adopting a treatment that does not make best use of healthcare resources, while considering all interactions avoids bias but reduces statistical power. We also introduce the concept of the opportunity cost of ignoring interactions as a measure of the bias introduced by not taking account of all interactions. We conclude by offering recommendations for planning, analysing and reporting economic evaluations based on factorial trials, taking increased analysis costs into account. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28470760
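
    The core modelling choice, including or ignoring the interaction when analysing an economic endpoint, can be sketched with a simulated 2×2 factorial (variable names and values are illustrative):

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(3)
    n = 400
    df = pd.DataFrame({"a": rng.integers(0, 2, n), "b": rng.integers(0, 2, n)})
    # Simulated truth: both factors raise cost, with a positive interaction.
    df["cost"] = (1000 + 300 * df["a"] + 200 * df["b"]
                  + 400 * df["a"] * df["b"] + rng.normal(0, 250, n))

    main_only = smf.ols("cost ~ a + b", data=df).fit()   # "at the margins"
    with_int = smf.ols("cost ~ a * b", data=df).fit()    # full factorial model

    # Ignoring a real interaction biases the main effects here; modelling it
    # restores unbiasedness at the price of statistical power.
    print(main_only.params["a"], with_int.params["a"], with_int.params["a:b"])
    ```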

  8. Universal Inverse Power-Law Distribution for Fractal Fluctuations in Dynamical Systems: Applications for Predictability of Inter-Annual Variability of Indian and USA Region Rainfall

    NASA Astrophysics Data System (ADS)

    Selvam, A. M.

    2017-01-01

Dynamical systems in nature exhibit self-similar fractal space-time fluctuations on all scales, indicating long-range correlations; therefore, the statistical normal distribution, with its implicit assumptions of independence and fixed mean and standard deviation, cannot be used for the description and quantification of fractal data sets. The author has developed a general systems theory based on classical statistical physics for fractal fluctuations which predicts the following. (1) The fractal fluctuations signify an underlying eddy continuum, the larger eddies being the integrated mean of enclosed smaller-scale fluctuations. (2) The probability distribution of eddy amplitudes and the variance (square of eddy amplitude) spectrum of fractal fluctuations follow the universal Boltzmann inverse power law expressed as a function of the golden mean. (3) Fractal fluctuations are signatures of quantum-like chaos, since the additive amplitudes of eddies when squared represent probability densities, analogous to the sub-atomic dynamics of quantum systems such as the photon or electron. (4) The model-predicted distribution is very close to the statistical normal distribution for moderate events within two standard deviations from the mean but exhibits a fat long tail associated with hazardous extreme events. Continuous periodogram power spectral analyses of available GHCN annual total rainfall time series for the period 1900-2008 for Indian and USA stations show that the power spectra and the corresponding probability distributions follow the model-predicted universal inverse power-law form, signifying an eddy continuum structure underlying the observed inter-annual variability of rainfall. On a global scale, man-made greenhouse-gas-related atmospheric warming would result in intensification of natural climate variability, seen first in high-frequency fluctuations such as the QBO and ENSO and at even shorter timescales. Model concepts and results of analyses are discussed with reference to possible prediction of climate change. Model concepts, if correct, unambiguously rule out linear trends in climate. Climate change will only be manifested as an increase or decrease in the natural variability. However, more stringent tests of model concepts and predictions are required before application to such an important issue as climate change. Observations and simulations with climate models show that precipitation extremes intensify in response to a warming climate (O'Gorman in Curr Clim Change Rep 1:49-59, 2015).

  9. Across-cohort QC analyses of GWAS summary statistics from complex traits.

    PubMed

    Chen, Guo-Bo; Lee, Sang Hong; Robinson, Matthew R; Trzaskowski, Maciej; Zhu, Zhi-Xiang; Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Kutalik, Zoltán; Loos, Ruth J F; Frayling, Timothy M; Hirschhorn, Joel N; Yang, Jian; Wray, Naomi R; Visscher, Peter M

    2016-01-01

Genome-wide association studies (GWASs) have been successful in discovering SNP trait associations for many quantitative traits and common diseases. Typically, the effect sizes of SNP alleles are very small and this requires large genome-wide association meta-analyses (GWAMAs) to maximize statistical power. A trend towards ever-larger GWAMA is likely to continue, yet dealing with summary statistics from hundreds of cohorts increases logistical and quality control problems, including unknown sample overlap, and these can lead to both false positive and false negative findings. In this study, we propose four metrics and visualization tools for GWAMA, using summary statistics from cohort-level GWASs. We propose methods to examine the concordance between demographic information and summary statistics, and methods to investigate sample overlap. (I) We use the population genetics Fst statistic to verify the genetic origin of each cohort and their geographic location, and demonstrate using GWAMA data from the GIANT Consortium that geographic locations of cohorts can be recovered and outlier cohorts can be detected. (II) We conduct principal component analysis based on reported allele frequencies, and are able to recover the ancestral information for each cohort. (III) We propose a new statistic that uses the reported allelic effect sizes and their standard errors to identify significant sample overlap or heterogeneity between pairs of cohorts. (IV) To quantify unknown sample overlap across all pairs of cohorts, we propose a method that uses randomly generated genetic predictors that does not require the sharing of individual-level genotype data and does not breach individual privacy.
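
    Metric (III) reduces to a simple contrast of the reported effects. A sketch for one SNP in two cohorts (toy numbers, not GIANT data): under no sample overlap and no heterogeneity, z is approximately standard normal across SNPs, and systematic departures flag overlap or heterogeneity.

    ```python
    import numpy as np
    from scipy.stats import norm

    b1, se1 = 0.042, 0.010     # cohort 1: allelic effect and SE for one SNP
    b2, se2 = 0.025, 0.012     # cohort 2

    z = (b1 - b2) / np.sqrt(se1**2 + se2**2)
    p = 2 * norm.sf(abs(z))
    print(f"z = {z:.2f}, p = {p:.3f}")
    ```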

  10. The sumLINK statistic for genetic linkage analysis in the presence of heterogeneity.

    PubMed

    Christensen, G B; Knight, S; Camp, N J

    2009-11-01

    We present the "sumLINK" statistic--the sum of multipoint LOD scores for the subset of pedigrees with nominally significant linkage evidence at a given locus--as an alternative to common methods to identify susceptibility loci in the presence of heterogeneity. We also suggest the "sumLOD" statistic (the sum of positive multipoint LOD scores) as a companion to the sumLINK. sumLINK analysis identifies genetic regions of extreme consistency across pedigrees without regard to negative evidence from unlinked or uninformative pedigrees. Significance is determined by an innovative permutation procedure based on genome shuffling that randomizes linkage information across pedigrees. This procedure for generating the empirical null distribution may be useful for other linkage-based statistics as well. Using 500 genome-wide analyses of simulated null data, we show that the genome shuffling procedure results in the correct type 1 error rates for both the sumLINK and sumLOD. The power of the statistics was tested using 100 sets of simulated genome-wide data from the alternative hypothesis from GAW13. Finally, we illustrate the statistics in an analysis of 190 aggressive prostate cancer pedigrees from the International Consortium for Prostate Cancer Genetics, where we identified a new susceptibility locus. We propose that the sumLINK and sumLOD are ideal for collaborative projects and meta-analyses, as they do not require any sharing of identifiable data between contributing institutions. Further, loci identified with the sumLINK have good potential for gene localization via statistical recombinant mapping, as, by definition, several linked pedigrees contribute to each peak.

  12. Systematic meta-analyses and field synopsis of genetic association studies in colorectal adenomas

    PubMed Central

    Montazeri, Zahra; Theodoratou, Evropi; Nyiraneza, Christine; Timofeeva, Maria; Chen, Wanjing; Svinti, Victoria; Sivakumaran, Shanya; Gresham, Gillian; Cubitt, Laura; Carvajal-Carmona, Luis; Bertagnolli, Monica M; Zauber, Ann G; Tomlinson, Ian; Farrington, Susan M; Dunlop, Malcolm G; Campbell, Harry; Little, Julian

    2018-01-01

Background Low-penetrance genetic variants, primarily single nucleotide polymorphisms, have substantial influence on colorectal cancer (CRC) susceptibility. Most CRCs develop from colorectal adenomas (CRA). Here, we report the first comprehensive field synopsis that catalogues all genetic association studies on CRA, with a parallel online database (http://www.chs.med.ed.ac.uk/CRAgene/). Methods We performed a systematic review, reviewing 9750 titles, and then extracted data from 130 publications reporting on 181 polymorphisms in 74 genes. We conducted meta-analyses to derive summary effect estimates for 37 polymorphisms in 26 genes. We applied the Venice criteria and Bayesian False Discovery Probability (BFDP) to assess the levels of credibility of associations. Results We considered the association with the rs6983267 variant at 8q24 as “highly credible”, reaching genome-wide statistical significance in at least one meta-analysis model. We identified “less credible” associations (higher heterogeneity, lower statistical power, BFDP>0.02) with a further four variants of four independent genes: MTHFR c.677C>T p.A222V (rs1801133), TP53 c.215C>G p.R72P (rs1042522), NQO1 c.559C>T p.P187S (rs1800566), and NAT1 alleles imputed as fast acetylator genotypes. For the remaining 32 variants of 22 genes for which positive associations with CRA risk have been previously reported, the meta-analyses revealed no credible evidence to support these as true associations. Conclusions The limited number of credible associations between low-penetrance genetic variants and CRA reflects the lower volume of evidence and associated lack of statistical power to detect associations of the magnitude typically observed for genetic variants and chronic diseases. The CRAgene database provides context for CRA genetic association data and will help inform future research directions. PMID:26451011

  13. ViPAR: a software platform for the Virtual Pooling and Analysis of Research Data.

    PubMed

Carter, Kim W; Francis, Richard W; Bresnahan, M; Gissler, M; Grønborg, T K; Gross, R; Gunnes, N; Hammond, G; Hornig, M; Hultman, C M; Huttunen, J; Langridge, A; Leonard, H; Newman, S; Parner, E T; Petersson, G; Reichenberg, A; Sandin, S; Schendel, D E; Schalkwyk, L; Sourander, A; Steadman, C; Stoltenberg, C; Suominen, A; Surén, P; Susser, E; Sylvester Vethanayagam, A; Yusof, Z

    2016-04-01

    Research studies exploring the determinants of disease require sufficient statistical power to detect meaningful effects. Sample size is often increased through centralized pooling of disparately located datasets, though ethical, privacy and data ownership issues can often hamper this process. Methods that facilitate the sharing of research data that are sympathetic with these issues and which allow flexible and detailed statistical analyses are therefore in critical need. We have created a software platform for the Virtual Pooling and Analysis of Research data (ViPAR), which employs free and open source methods to provide researchers with a web-based platform to analyse datasets housed in disparate locations. Database federation permits controlled access to remotely located datasets from a central location. The Secure Shell protocol allows data to be securely exchanged between devices over an insecure network. ViPAR combines these free technologies into a solution that facilitates 'virtual pooling' where data can be temporarily pooled into computer memory and made available for analysis without the need for permanent central storage. Within the ViPAR infrastructure, remote sites manage their own harmonized research dataset in a database hosted at their site, while a central server hosts the data federation component and a secure analysis portal. When an analysis is initiated, requested data are retrieved from each remote site and virtually pooled at the central site. The data are then analysed by statistical software and, on completion, results of the analysis are returned to the user and the virtually pooled data are removed from memory. ViPAR is a secure, flexible and powerful analysis platform built on open source technology that is currently in use by large international consortia, and is made publicly available at [http://bioinformatics.childhealthresearch.org.au/software/vipar/]. © The Author 2015. Published by Oxford University Press on behalf of the International Epidemiological Association.

  14. Statistical guidelines for assessing marine avian hotspots and coldspots: A case study on wind energy development in the U.S. Atlantic Ocean

    USGS Publications Warehouse

    Zipkin, Elise F.; Kinlan, Brian P.; Sussman, Allison; Rypkema, Diana; Wimer, Mark; O'Connell, Allan F.

    2015-01-01

Estimating patterns of habitat use is challenging for marine avian species because seabirds tend to aggregate in large groups and it can be difficult to locate both individuals and groups in vast marine environments. We developed an approach to estimate the statistical power of discrete survey events to identify species-specific hotspots and coldspots of long-term seabird abundance in marine environments. We illustrate our approach using historical seabird data from survey transects in the U.S. Atlantic Ocean Outer Continental Shelf (OCS), an area that has been divided into “lease blocks” for proposed offshore wind energy development. For our power analysis, we examined whether discrete lease blocks within the region could be defined as hotspots (3 × mean abundance in the OCS) or coldspots (1/3 ×) for individual species within a given season. For each of 74 species/season combinations, we determined which of eight candidate statistical distributions (ranging in their degree of skewness) best fit the count data. We then used the selected distribution and estimates of regional prevalence to calculate and map statistical power to detect hotspots and coldspots, and estimate the p-value from Monte Carlo significance tests that specific lease blocks are in fact hotspots or coldspots relative to regional average abundance. The power to detect species-specific hotspots was higher than that of coldspots for most species because species-specific prevalence was relatively low (mean: 0.111; SD: 0.110). The number of surveys required for adequate power (> 0.6) was large for most species (tens to hundreds) using this hotspot definition. Regulators may need to accept higher proportional effect sizes, combine species into groups, and/or broaden the spatial scale by combining lease blocks in order to determine optimal placement of wind farms. Our power analysis approach provides a general framework for both retrospective analyses and future avian survey design and is applicable to a broad range of research and conservation problems.
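
    The power logic can be sketched with a Monte Carlo simulation: fit a count distribution, then ask how often a given number of surveys would flag a lease block whose true abundance is 3 × the regional mean. The negative binomial choice and its parameters below are invented; the study selected among eight candidate distributions per species/season:

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    regional_mean, k, n_sim, alpha = 2.0, 20, 5000, 0.05

    def nbinom_params(mean, dispersion=0.5):
        # numpy's negative_binomial uses (n, p); mean = n * (1 - p) / p
        n = dispersion
        return n, n / (n + mean)

    # Null distribution of the mean of k surveys under regional abundance.
    null_means = rng.negative_binomial(*nbinom_params(regional_mean),
                                       size=(n_sim, k)).mean(axis=1)
    crit = np.quantile(null_means, 1 - alpha)

    # Power: sampling from a true hotspot at 3x the regional mean.
    hot_means = rng.negative_binomial(*nbinom_params(3 * regional_mean),
                                      size=(n_sim, k)).mean(axis=1)
    print("power:", np.mean(hot_means > crit))
    ```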

  15. ParallABEL: an R library for generalized parallelization of genome-wide association studies

    PubMed Central

    2010-01-01

Background Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. It is arduous to acquire the necessary programming skills to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files. Results Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics; the input data of this group are the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example the summary statistics of genotype quality for each sample; the input data of this group are the individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses; the input data of this group are pairs of individuals. The final group concerns pair-wise statistics derived for pairs of SNPs, such as linkage disequilibrium characterisation; the input data of this group are pairs of SNPs. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. The ParallABEL library is not aimed only at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC), which includes 2,062 individuals genotyped at 545,080 SNPs, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity-by-state matrix was linearly reduced from approximately eight hours to one hour when ParallABEL employed eight processors. Conclusions Executing genome-wide association analysis using the ParallABEL library on a computer cluster is an effective way to boost performance and simplify the parallelization of GWA studies. ParallABEL is a user-friendly parallelization of GenABEL. PMID:20429914

  16. Rare Variant Association Test with Multiple Phenotypes

    PubMed Central

    Lee, Selyeong; Won, Sungho; Kim, Young Jin; Kim, Yongkang; Kim, Bong-Jo; Park, Taesung

    2016-01-01

Although genome-wide association studies (GWAS) have now discovered thousands of genetic variants associated with common traits, such variants cannot explain the large degree of “missing heritability,” likely due to rare variants. The advent of next generation sequencing technology has allowed rare variant detection and association with common traits, often by investigating specific genomic regions for rare variant effects on a trait. Although multiple correlated phenotypes are often observed concurrently in GWAS, most studies analyze only single phenotypes, which may lessen statistical power. To increase power, multivariate analyses, which consider correlations between multiple phenotypes, can be used. However, few existing multivariate analyses can identify rare variants for assessing multiple phenotypes. Here, we propose Multivariate Association Analysis using Score Statistics (MAAUSS), to identify rare variants associated with multiple phenotypes, based on the widely used Sequence Kernel Association Test (SKAT) for a single phenotype. We applied MAAUSS to Whole Exome Sequencing (WES) data from a Korean population of 1,058 subjects, to discover genes associated with multiple traits of liver function. We then assessed validation of those genes by a replication study, using an independent dataset of 3,445 individuals. Notably, we detected the gene ZNF620 among five significant genes. We then performed a simulation study to compare MAAUSS's performance with existing methods. Overall, MAAUSS successfully preserved type I error rates and, in many cases, had higher power than the existing methods. This study illustrates a feasible and straightforward approach for identifying rare variants correlated with multiple phenotypes, with likely relevance to missing heritability. PMID:28039885

  17. Selection of nontarget arthropod taxa for field research on transgenic insecticidal crops: using empirical data and statistical power.

    PubMed

    Prasifka, J R; Hellmich, R L; Dively, G P; Higgins, L S; Dixon, P M; Duan, J J

    2008-02-01

    One of the possible adverse effects of transgenic insecticidal crops is the unintended decline in the abundance of nontarget arthropods. Field trials designed to evaluate potential nontarget effects can be more complex than expected because decisions to conduct field trials and the selection of taxa to include are not always guided by the results of laboratory tests. Also, recent studies emphasize the potential for indirect effects (adverse impacts to nontarget arthropods without feeding directly on plant tissues), which are difficult to predict because of interactions among nontarget arthropods, target pests, and transgenic crops. As a consequence, field studies may attempt to monitor expansive lists of arthropod taxa, making the design of such broad studies more difficult and reducing the likelihood of detecting any negative effects that might be present. To improve the taxonomic focus and statistical rigor of future studies, existing field data and corresponding power analysis may provide useful guidance. Analysis of control data from several nontarget field trials using repeated-measures designs suggests that while detection of small effects may require considerable increases in replication, there are taxa from different ecological roles that are sampled effectively using standard methods. The use of statistical power to guide selection of taxa for nontarget trials reflects scientists' inability to predict the complex interactions among arthropod taxa, particularly when laboratory trials fail to provide guidance on which groups are more likely to be affected. However, scientists still may exercise judgment, including taxa that are not included in or supported by power analyses.

  18. Angular Baryon Acoustic Oscillation measure at z=2.225 from the SDSS quasar survey

    NASA Astrophysics Data System (ADS)

    de Carvalho, E.; Bernui, A.; Carvalho, G. C.; Novaes, C. P.; Xavier, H. S.

    2018-04-01

    Following a quasi model-independent approach we measure the transversal BAO mode at high redshift using the two-point angular correlation function (2PACF). The analyses done here are only possible now with the quasar catalogue from the twelfth data release (DR12Q) from the Sloan Digital Sky Survey, because it is spatially dense enough to allow the measurement of the angular BAO signature with moderate statistical significance and acceptable precision. Our analyses with quasars in the redshift interval z in [2.20,2.25] produce the angular BAO scale θBAO = 1.77° ± 0.31° with a statistical significance of 2.12 σ (i.e., 97% confidence level), calculated through a likelihood analysis performed using the theoretical covariance matrix sourced by the analytical power spectra expected in the ΛCDM concordance model. Additionally, we show that the BAO signal is robust—although with less statistical significance—under diverse bin-size choices and under small displacements of the quasars' angular coordinates. Finally, we also performed cosmological parameter analyses comparing the θBAO predictions for wCDM and w(a)CDM models with angular BAO data available in the literature, including the measurement obtained here, jointly with CMB data. The constraints on the parameters ΩM, w0 and wa are in excellent agreement with the ΛCDM concordance model.

19. Detecting the response of fish assemblages to stream restoration: Effects of different sampling designs

    USGS Publications Warehouse

    Baldigo, Barry P.; Warren, D.R.

    2008-01-01

Increased trout production within limited stream reaches is a popular goal for restoration projects, yet investigators seldom monitor, assess, or publish the associated effects on fish assemblages. Fish community data from a total of 40 surveys at restored and reference reaches in three streams of the Catskill Mountains, New York, were analyzed a posteriori to determine how the ability to detect significant changes in biomass of brown trout Salmo trutta, all salmonids, or the entire fish community differs with effect size, number of streams assessed, process used to quantify the index response, and number of replicates collected before and after restoration. Analyses of statistical power (probability of detecting a meaningful difference or effect) and integrated power (average power over all possible α-values) were combined with before-after, control-impact analyses to assess the effectiveness of alternate sampling and analysis designs. In general, the more robust analyses indicated that biomass of brown trout and salmonid populations increased significantly in restored reaches but that the net increases (relative to the reference reach) were significant only at two of four restored reaches. Restoration alone could not account for the net increases in total biomass of fish communities. Power analyses generally showed that integrated power was greater than 0.95 when (1) biomass increases were larger than 5.0 g/m², (2) the total number of replicates ranged from 4 to 8, and (3) coefficients of variation (CVs) for responses were less than 40%. Integrated power was often greater than 0.95 for responses as low as 1.0 g/m² if the response CVs were less than 30%. Considering that brown trout, salmonid, and community biomass increased by 2.99 g/m² on average (SD = 1.17 g/m²) in the four restored reaches, use of two to three replicates both before and after restoration would have an integrated power of about 0.95 and would help detect significant changes in fish biomass under similar situations. © Copyright by the American Fisheries Society 2008.

  20. Association analysis of multiple traits by an approach of combining P values.

    PubMed

    Chen, Lili; Wang, Yong; Zhou, Yajing

    2018-03-01

Increasing evidence shows that one variant can affect multiple traits, which is a widespread phenomenon in complex diseases. Joint analysis of multiple traits can increase the statistical power of association analysis and uncover the underlying genetic mechanism. Although there are many statistical methods to analyse multiple traits, most of these methods are usually suitable for detecting common variants associated with multiple traits. However, because of the low minor allele frequencies of rare variants, these methods are not optimal for rare variant association analysis. In this paper, we extend an adaptive combination of P values method (termed ADA) for a single trait to test association between multiple traits and rare variants in a given region. For a given region, we use a reverse regression model to test each rare variant for association with multiple traits and obtain the P value of the single-variant test. Further, we take the weighted combination of these P values as the test statistic. Extensive simulation studies show that our approach is more powerful than several other comparison methods in most cases and is robust to the inclusion of a high proportion of neutral variants and to different directions of effects of causal variants.
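
    A schematic version of the combination step (following the spirit, not the letter, of the ADA method; the weights, candidate truncation levels, and permutation null are all simplified here):

    ```python
    import numpy as np

    def combine(pvals, weights, level):
        # Weighted Fisher-style combination of the P values below a threshold.
        keep = pvals <= level
        if not keep.any():
            return 0.0
        return np.sum(weights[keep] * -np.log(pvals[keep]))

    rng = np.random.default_rng(5)
    pvals = rng.uniform(size=30)            # per-variant P values in the region
    pvals[:3] = [1e-4, 5e-4, 2e-3]          # a few causal variants
    weights = np.ones_like(pvals)           # e.g., minor-allele-frequency weights

    stats_ = [combine(pvals, weights, lv) for lv in (0.05, 0.10, 0.20)]
    print("adaptive statistic:", max(stats_))  # significance via permutation
    ```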

  1. Quantitative cancer risk assessment based on NIOSH and UCC epidemiological data for workers exposed to ethylene oxide.

    PubMed

    Valdez-Flores, Ciriaco; Sielken, Robert L; Teta, M Jane

    2010-04-01

The most recent epidemiological data on individual workers in the NIOSH and updated UCC occupational studies have been used to characterize the potential excess cancer risks of environmental exposure to ethylene oxide (EO). In addition to refined analyses of the separate cohorts, power has been increased by analyzing the combined cohorts. In previous SMR analyses of the separate studies and the present analyses of the updated and pooled studies of over 19,000 workers, none of the SMRs for any combination of the 12 cancer endpoints and six sub-cohorts analyzed were statistically significantly greater than one, including those of greatest previous interest: leukemia, lymphohematopoietic tissue, lymphoid tumors, NHL, and breast cancer. In our study, no evidence of a positive cumulative exposure-response relationship was found. Fitted Cox proportional hazards models with cumulative EO exposure do not have statistically significant positive slopes. The lack of increasing trends was corroborated by categorical analyses. Cox model estimates of the concentrations corresponding to a 1-in-a-million extra environmental cancer risk are all greater than approximately 1 ppb and are more than 1500-fold greater than the 0.4 ppt estimate in the 2006 EPA draft IRIS risk assessment. The reasons for this difference are identified and discussed. Copyright 2009 Elsevier Inc. All rights reserved.

  2. Statistical power in parallel group point exposure studies with time-to-event outcomes: an empirical comparison of the performance of randomized controlled trials and the inverse probability of treatment weighting (IPTW) approach.

    PubMed

    Austin, Peter C; Schuster, Tibor; Platt, Robert W

    2015-10-15

    Estimating statistical power is an important component of the design of both randomized controlled trials (RCTs) and observational studies. Methods for estimating statistical power in RCTs have been well described and can be implemented simply. In observational studies, statistical methods must be used to remove the effects of confounding that can occur due to non-random treatment assignment. Inverse probability of treatment weighting (IPTW) using the propensity score is an attractive method for estimating the effects of treatment using observational data. However, sample size and power calculations have not been adequately described for these methods. We used an extensive series of Monte Carlo simulations to compare the statistical power of an IPTW analysis of an observational study with time-to-event outcomes with that of an analysis of a similarly-structured RCT. We examined the impact of four factors on the statistical power function: number of observed events, prevalence of treatment, the marginal hazard ratio, and the strength of the treatment-selection process. We found that, on average, an IPTW analysis had lower statistical power compared to an analysis of a similarly-structured RCT. The difference in statistical power increased as the magnitude of the treatment-selection model increased. The statistical power of an IPTW analysis tended to be lower than the statistical power of a similarly-structured RCT.
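
    The analysis being benchmarked can be sketched end to end: fit a propensity model, form inverse-probability weights, and fit a weighted Cox model. A sketch on simulated data, assuming scikit-learn and lifelines are available:

    ```python
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from lifelines import CoxPHFitter

    rng = np.random.default_rng(6)
    n = 2000
    x = rng.normal(size=n)                                  # confounder
    t = rng.binomial(1, 1 / (1 + np.exp(-0.8 * x)))         # treatment selection
    time = rng.exponential(1 / np.exp(0.3 * t + 0.5 * x))   # event times
    df = pd.DataFrame({"x": x, "t": t, "time": time, "event": 1})

    ps = LogisticRegression().fit(df[["x"]], df["t"]).predict_proba(df[["x"]])[:, 1]
    df["w"] = np.where(df["t"] == 1, 1 / ps, 1 / (1 - ps))  # IPT weights

    cph = CoxPHFitter().fit(df[["t", "time", "event", "w"]], "time", "event",
                            weights_col="w", robust=True)
    cph.print_summary()   # marginal hazard ratio for treatment
    ```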

  3. Microfluidic-based mini-metagenomics enables discovery of novel microbial lineages from complex environmental samples.

    PubMed

    Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R

    2017-07-05

    Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell.

  4. Handwriting Examination: Moving from Art to Science

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jarman, K.H.; Hanlen, R.C.; Manzolillo, P.A.

In this document, we present a method for validating the premises and methodology of forensic handwriting examination. This method is intuitively appealing because it relies on quantitative measurements currently used qualitatively by FDEs in making comparisons, and it is scientifically rigorous because it exploits the power of multivariate statistical analysis. This approach uses measures of both central tendency and variation to construct a profile for a given individual. (Central tendency and variation are important for characterizing an individual's writing, and both are currently used by FDEs in comparative analyses.) Once constructed, different profiles are then compared for individuality using cluster analysis; they are grouped so that profiles within a group cannot be differentiated from one another based on the measured characteristics, whereas profiles between groups can. The cluster analysis procedure used here exploits the power of multivariate hypothesis testing. The result is not only a profile grouping but also an indication of the statistical significance of the groups generated.

  5. Do polymorphisms of 5,10-methylenetetrahydrofolate reductase (MTHFR) gene affect the risk of childhood acute lymphoblastic leukemia?

    PubMed

    Pereira, Tiago Veiga; Rudnicki, Martina; Pereira, Alexandre Costa; Pombo-de-Oliveira, Maria S; Franco, Rendrik França

    2006-01-01

Meta-analysis has become an important statistical tool in genetic association studies, since it may provide more powerful and precise estimates. However, meta-analytic studies are prone to several potential biases, not only because of the preferential publication of "positive" studies but also because of difficulties in obtaining all relevant information during the study selection process. In this letter, we point out major problems in meta-analysis that may lead to biased conclusions, illustrating them with an empirical example of two recent meta-analyses on the relation between MTHFR polymorphisms and risk of acute lymphoblastic leukemia that, despite the similarity in statistical methods and period of study selection, provided partially conflicting results.

  6. Experimental design and statistical methods for improved hit detection in high-throughput screening.

    PubMed

    Malo, Nathalie; Hanley, James A; Carlile, Graeme; Liu, Jing; Pelletier, Jerry; Thomas, David; Nadon, Robert

    2010-09-01

    Identification of active compounds in high-throughput screening (HTS) contexts can be substantially improved by applying classical experimental design and statistical inference principles to all phases of HTS studies. The authors present both experimental and simulated data to illustrate how true-positive rates can be maximized without increasing false-positive rates by the following analytical process. First, the use of robust data preprocessing methods reduces unwanted variation by removing row, column, and plate biases. Second, replicate measurements allow estimation of the magnitude of the remaining random error and the use of formal statistical models to benchmark putative hits relative to what is expected by chance. Receiver Operating Characteristic (ROC) analyses revealed superior power for data preprocessed by a trimmed-mean polish method combined with the RVM t-test, particularly for small- to moderate-sized biological hits.
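
    The row/column bias removal can be illustrated with a two-way polish. This sketch uses Tukey's median polish rather than the trimmed-mean polish named in the record, but the structure is the same:

    ```python
    import numpy as np

    def median_polish(plate, n_iter=10):
        """Remove row and column biases from a plate of raw HTS readings by
        iteratively sweeping out row and column medians."""
        resid = plate.astype(float).copy()
        for _ in range(n_iter):
            resid -= np.median(resid, axis=1, keepdims=True)  # row effects
            resid -= np.median(resid, axis=0, keepdims=True)  # column effects
        return resid   # residuals with plate-geometry effects removed

    rng = np.random.default_rng(7)
    plate = rng.normal(100, 5, size=(8, 12))   # a 96-well plate of readings
    plate[:, 0] += 20                          # simulated edge-column bias
    print(abs(median_polish(plate)[:, 0].mean()))  # bias largely removed
    ```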

  7. Correcting power and p-value calculations for bias in diffusion tensor imaging.

    PubMed

    Lauzon, Carolyn B; Landman, Bennett A

    2013-07-01

Diffusion tensor imaging (DTI) provides quantitative parametric maps sensitive to tissue microarchitecture (e.g., fractional anisotropy, FA). These maps are estimated through computational processes and subject to random distortions including variance and bias. Traditional statistical procedures commonly used for study planning (including power analyses and p-value/alpha-rate thresholds) specifically model variability, but neglect potential impacts of bias. Herein, we quantitatively investigate the impacts of bias in DTI on hypothesis test properties (power and alpha-rate) using a two-sided hypothesis testing framework. We present a theoretical evaluation of bias on hypothesis test properties, evaluate the bias estimation technique SIMEX for DTI hypothesis testing using simulated data, and evaluate the impacts of bias on spatially varying power and alpha rates in an empirical study of 21 subjects. Bias is shown to inflate alpha rates, distort the power curve, and cause significant power loss even in empirical settings where the expected difference in bias between groups is zero. These adverse effects can be attenuated by properly accounting for bias in the calculation of power and p-values. Copyright © 2013 Elsevier Inc. All rights reserved.
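
    The paper's central point can be reproduced in a few lines: adding a bias term to a normally distributed test statistic shifts the noncentrality and distorts both power and the alpha rate. A sketch assuming a known standard error (the numbers are illustrative, not from the paper):

    ```python
    from scipy.stats import norm

    def two_sided_power(delta, se, alpha=0.05, bias=0.0):
        # Rejection probability of a two-sided z-test when the estimate is
        # centred at delta + bias with standard error se.
        z = norm.isf(alpha / 2)
        shift = (delta + bias) / se
        return norm.sf(z - shift) + norm.cdf(-z - shift)

    print(two_sided_power(delta=0.03, se=0.01))              # nominal power
    print(two_sided_power(delta=0.03, se=0.01, bias=-0.01))  # power lost to bias
    print(two_sided_power(delta=0.0, se=0.01, bias=0.005))   # inflated alpha
    ```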

  8. Considerations in the statistical analysis of clinical trials in periodontitis.

    PubMed

    Imrey, P B

    1986-05-01

Adult periodontitis has been described as a chronic infectious process exhibiting sporadic, acute exacerbations which cause quantal, localized losses of dental attachment. Many analytic problems of periodontal trials are similar to those of other chronic diseases. However, the episodic, localized, infrequent, and relatively unpredictable behavior of exacerbations, coupled with measurement error difficulties, cause some specific problems. Considerable controversy exists as to the proper selection and treatment of multiple site data from the same patient for group comparisons for epidemiologic or therapeutic evaluative purposes. This paper comments, with varying degrees of emphasis, on several issues pertinent to the analysis of periodontal trials. Considerable attention is given to the ways in which measurement variability may distort analytic results. Statistical treatments of multiple site data for descriptive summaries are distinguished from treatments for formal statistical inference to validate therapeutic effects. Evidence suggesting that sites behave independently is contested. For inferential analyses directed at therapeutic or preventive effects, analytic models based on site independence are deemed unsatisfactory. Methods of summarization that may yield more powerful analyses than all-site mean scores, while retaining appropriate treatment of inter-site associations, are suggested. Brief comments and opinions on an assortment of other issues in clinical trial analysis are proffered.

  9. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update

    PubMed Central

    Afgan, Enis; Baker, Dannon; van den Beek, Marius; Blankenberg, Daniel; Bouvier, Dave; Čech, Martin; Chilton, John; Clements, Dave; Coraor, Nate; Eberhard, Carl; Grüning, Björn; Guerler, Aysam; Hillman-Jackson, Jennifer; Von Kuster, Greg; Rasche, Eric; Soranzo, Nicola; Turaga, Nitesh; Taylor, James; Nekrutenko, Anton; Goecks, Jeremy

    2016-01-01

    High-throughput data production technologies, particularly ‘next-generation’ DNA sequencing, have ushered in widespread and disruptive changes to biomedical research. Making sense of the large datasets produced by these technologies requires sophisticated statistical and computational methods, as well as substantial computational power. This has led to an acute crisis in the life sciences, as researchers without informatics training attempt to perform computation-dependent analyses. Since 2005, the Galaxy project has worked to address this problem by providing a framework that makes advanced computational tools usable by non-experts. Galaxy seeks to make data-intensive research more accessible, transparent and reproducible by providing a Web-based environment in which users can perform computational analyses and have all of the details automatically tracked for later inspection, publication, or reuse. In this report we highlight recently added features enabling biomedical analyses on a large scale. PMID:27137889

  10. Using Meta-analyses for Comparative Effectiveness Research

    PubMed Central

    Ruppar, Todd M.; Phillips, Lorraine J.; Chase, Jo-Ana D.

    2012-01-01

    Comparative effectiveness research seeks to identify the most effective interventions for particular patient populations. Meta-analysis is an especially valuable form of comparative effectiveness research because it emphasizes the magnitude of intervention effects rather than relying on tests of statistical significance among primary studies. Overall effects can be calculated for diverse clinical and patient-centered variables to determine the outcome patterns. Moderator analyses compare intervention characteristics among primary studies by determining if effect sizes vary among studies with different intervention characteristics. Intervention effectiveness can be linked to patient characteristics to provide evidence for patient-centered care. Moderator analyses often answer questions never posed by primary studies because neither multiple intervention characteristics nor populations are compared in single primary studies. Thus meta-analyses provide unique contributions to knowledge. Although meta-analysis is a powerful comparative effectiveness strategy, methodological challenges and limitations in primary research must be acknowledged to interpret findings. PMID:22789450

  11. Statistical Power in Meta-Analysis

    ERIC Educational Resources Information Center

    Liu, Jin

    2015-01-01

    Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation for the two-sample mean difference test under different situations: (1) the discrepancy between the analytical power and…

  12. Emissions of mercury from the power sector in Poland

    NASA Astrophysics Data System (ADS)

    Zyśk, J.; Wyrwa, A.; Pluta, M.

    2011-01-01

    Poland is among the European Union countries with the highest mercury emissions, mainly related to coal combustion. This paper presents estimates of mercury emissions from the power sector in Poland. In this work, the bottom-up approach was applied and over 160 emission point sources were analysed. For each, the characteristics of the whole technological chain, from fuel quality and boiler type to emission controls, were taken into account. Our results show that emissions of mercury from brown coal power plants in 2005 were nearly four times greater than those of hard coal power plants. These estimates differ significantly from national statistics and some possible reasons are discussed. For the first time, total mercury emissions from the Polish power sector were differentiated into the main atmospheric forms: gaseous elemental (GEM), reactive gaseous (RGM) and particulate-bound mercury. Information on emission source locations and the likely vertical distribution of mercury emissions, which can be used in modelling the atmospheric dispersion of mercury, is also provided.

  13. Determining the Statistical Power of the Kolmogorov-Smirnov and Anderson-Darling Goodness-of-Fit Tests via Monte Carlo Simulation

    DTIC Science & Technology

    2016-12-01

    Statistical power is the probability of correctly rejecting the null hypothesis when the null hypothesis is false. … real-world data to test the accuracy of the simulation. Statistical comparison of these metrics can be necessary when making such a determination.
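
    A hedged Monte Carlo sketch in the spirit of this report: estimating the power of the KS and AD tests to reject normality when the data actually come from a heavy-tailed t-distribution. The alternative distribution, sample size, and replication count are illustrative assumptions:

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    n, reps, alpha = 50, 2000, 0.05
    ks_hits = ad_hits = 0
    for _ in range(reps):
        x = stats.t.rvs(df=3, size=n, random_state=rng)
        # KS against a fully specified N(0,1) null
        if stats.kstest(x, 'norm').pvalue < alpha:
            ks_hits += 1
        # AD for normality (location/scale estimated internally);
        # critical_values[2] corresponds to the 5% significance level
        res = stats.anderson(x, dist='norm')
        if res.statistic > res.critical_values[2]:
            ad_hits += 1
    print(f"KS power ~ {ks_hits/reps:.2f}, AD power ~ {ad_hits/reps:.2f}")
    ```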

  14. A marked correlation function for constraining modified gravity models

    NASA Astrophysics Data System (ADS)

    White, Martin

    2016-11-01

    Future large scale structure surveys will provide increasingly tight constraints on our cosmological model. These surveys will report results on the distance scale and growth rate of perturbations through measurements of Baryon Acoustic Oscillations and Redshift-Space Distortions. It is interesting to ask: what further analyses should become routine, so as to test as-yet-unknown models of cosmic acceleration? Models which aim to explain the accelerated expansion rate of the Universe by modifications to General Relativity often invoke screening mechanisms which can imprint a non-standard density dependence on their predictions. This suggests density-dependent clustering as a 'generic' constraint. This paper argues that a density-marked correlation function provides a density-dependent statistic which is easy to compute and report and requires minimal additional infrastructure beyond what is routinely available to such survey analyses. We give one realization of this idea and study it using low order perturbation theory. We encourage groups developing modified gravity theories to see whether such statistics provide discriminatory power for their models.
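
    One plausible reading of a marked correlation function, sketched for a point set with scalar marks; the mark definition, box, and binning are illustrative assumptions, not the paper's exact estimator:

    ```python
    import numpy as np
    from scipy.spatial import cKDTree

    def marked_correlation(pos, marks, r_edges):
        """M(r) = <m_i m_j>_pairs(r) / mbar^2: the mean mark product over
        pairs at separation r, relative to its value for random marks."""
        pairs = cKDTree(pos).query_pairs(r_edges[-1], output_type='ndarray')
        d = np.linalg.norm(pos[pairs[:, 0]] - pos[pairs[:, 1]], axis=1)
        prod = marks[pairs[:, 0]] * marks[pairs[:, 1]]
        idx = np.digitize(d, r_edges) - 1
        mbar2 = marks.mean() ** 2
        return np.array([prod[idx == k].mean() / mbar2
                         for k in range(len(r_edges) - 1)])

    rng = np.random.default_rng(0)
    pos = rng.uniform(0, 100.0, size=(2000, 3))   # toy 'galaxy' positions
    marks = rng.lognormal(sigma=0.5, size=2000)   # e.g. a density-based mark
    print(marked_correlation(pos, marks, np.linspace(1, 10, 6)))
    ```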

  15. Gene Level Meta-Analysis of Quantitative Traits by Functional Linear Models.

    PubMed

    Fan, Ruzong; Wang, Yifan; Boehnke, Michael; Chen, Wei; Li, Yun; Ren, Haobo; Lobach, Iryna; Xiong, Momiao

    2015-08-01

    Meta-analysis of genetic data must account for differences among studies including study designs, markers genotyped, and covariates. The effects of genetic variants may differ from population to population, i.e., heterogeneity. Thus, meta-analysis combining data from multiple studies is difficult. Novel statistical methods for meta-analysis are needed. In this article, functional linear models are developed for meta-analyses that connect genetic data to quantitative traits, adjusting for covariates. The models can be used to analyze rare variants, common variants, or a combination of the two. Both likelihood-ratio test (LRT) and F-distributed statistics are introduced to test association between quantitative traits and multiple variants in one genetic region. Extensive simulations are performed to evaluate empirical type I error rates and power performance of the proposed tests. The proposed LRT and F-distributed statistics control the type I error very well and have higher power than the existing method, the meta-analysis sequence kernel association test (MetaSKAT). We analyze four blood lipid levels in data from a meta-analysis of eight European studies. The proposed methods detect more significant associations than MetaSKAT, and the P-values of the proposed LRT and F-distributed statistics are usually much smaller than those of MetaSKAT. The functional linear models and related test statistics can be useful in whole-genome and whole-exome association studies. Copyright © 2015 by the Genetics Society of America.
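
    The LRT machinery used here is standard; as a hedged, generic illustration (not the authors' functional linear model), the sketch below compares nested linear models, where the larger model adds basis-expansion columns standing in for a genetic region, and refers the statistic to a chi-squared distribution:

    ```python
    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import chi2

    rng = np.random.default_rng(3)
    n = 500
    covars = sm.add_constant(rng.normal(size=(n, 2)))   # intercept + covariates
    basis = rng.normal(size=(n, 4))                     # stand-in for a basis
                                                        # expansion of variants
    y = covars @ [1.0, 0.5, -0.3] + basis @ [0.2, 0, 0.1, 0] + rng.normal(size=n)

    fit0 = sm.OLS(y, covars).fit()                      # null: covariates only
    fit1 = sm.OLS(y, np.hstack([covars, basis])).fit()  # adds the genetic region
    lrt = 2 * (fit1.llf - fit0.llf)
    print("LRT p-value:", chi2.sf(lrt, df=basis.shape[1]))
    ```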

  16. Louder than words: power and conflict in interprofessional education articles, 1954–2013

    PubMed Central

    Paradis, Elise; Whitehead, Cynthia R

    2015-01-01

    Context Interprofessional education (IPE) aspires to enable collaborative practice. Current IPE offerings, although rapidly proliferating, lack evidence of efficacy and theoretical grounding. Objectives Our research aimed to explore the historical emergence of the field of IPE and to analyse the positioning of this academic field of inquiry. In particular, we sought to investigate the extent to which power and conflict – elements central to interprofessional care – figure in the IPE literature. Methods We used a combination of deductive and inductive automated coding and manual coding to explore the contents of 2191 articles in the IPE literature published between 1954 and 2013. Inductive coding focused on the presence and use of the sociological (rather than statistical) version of power, which refers to hierarchies and asymmetries among the professions. Articles found to be centrally about power were then analysed using content analysis. Results Publications on IPE have grown exponentially in the past decade. Deductive coding of identified articles showed an emphasis on students, learning, programmes and practice. Automated inductive coding of titles and abstracts identified 129 articles potentially about power, but manual coding found that only six articles put power and conflict at the centre. Content analysis of these six articles revealed that two provided tentative explorations of power dynamics, one skirted around this issue, and three explicitly theorised and integrated power and conflict. Conclusions The lack of attention to power and conflict in the IPE literature suggests that many educators do not foreground these issues. Education programmes are expected to transform individuals into effective collaborators, without heed to structural, organisational and institutional factors. In so doing, current constructions of IPE veil the problems that IPE attempts to solve. PMID:25800300

  17. Louder than words: power and conflict in interprofessional education articles, 1954-2013.

    PubMed

    Paradis, Elise; Whitehead, Cynthia R

    2015-04-01

    Interprofessional education (IPE) aspires to enable collaborative practice. Current IPE offerings, although rapidly proliferating, lack evidence of efficacy and theoretical grounding. Our research aimed to explore the historical emergence of the field of IPE and to analyse the positioning of this academic field of inquiry. In particular, we sought to investigate the extent to which power and conflict - elements central to interprofessional care - figure in the IPE literature. We used a combination of deductive and inductive automated coding and manual coding to explore the contents of 2191 articles in the IPE literature published between 1954 and 2013. Inductive coding focused on the presence and use of the sociological (rather than statistical) version of power, which refers to hierarchies and asymmetries among the professions. Articles found to be centrally about power were then analysed using content analysis. Publications on IPE have grown exponentially in the past decade. Deductive coding of identified articles showed an emphasis on students, learning, programmes and practice. Automated inductive coding of titles and abstracts identified 129 articles potentially about power, but manual coding found that only six articles put power and conflict at the centre. Content analysis of these six articles revealed that two provided tentative explorations of power dynamics, one skirted around this issue, and three explicitly theorised and integrated power and conflict. The lack of attention to power and conflict in the IPE literature suggests that many educators do not foreground these issues. Education programmes are expected to transform individuals into effective collaborators, without heed to structural, organisational and institutional factors. In so doing, current constructions of IPE veil the problems that IPE attempts to solve. © 2015 The Authors Medical Education Published by John Wiley & Sons Ltd.

  18. FARVATX: FAmily-based Rare Variant Association Test for X-linked genes

    PubMed Central

    Choi, Sungkyoung; Lee, Sungyoung; Qiao, Dandi; Hardin, Megan; Cho, Michael H.; Silverman, Edwin K; Park, Taesung; Won, Sungho

    2016-01-01

    Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease (COPD). Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods. PMID:27325607

  19. FARVATX: Family-Based Rare Variant Association Test for X-Linked Genes.

    PubMed

    Choi, Sungkyoung; Lee, Sungyoung; Qiao, Dandi; Hardin, Megan; Cho, Michael H; Silverman, Edwin K; Park, Taesung; Won, Sungho

    2016-09-01

    Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease. Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods. © 2016 WILEY PERIODICALS, INC.

  20. Enhanced Component Performance Study. Emergency Diesel Generators 1998–2013

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schroeder, John Alton

    2014-11-01

    This report presents an enhanced performance evaluation of emergency diesel generators (EDGs) at U.S. commercial nuclear power plants. It evaluates component performance over time using Institute of Nuclear Power Operations (INPO) Consolidated Events Database (ICES) data from 1998 through 2013 and maintenance unavailability (UA) performance data using Mitigating Systems Performance Index (MSPI) Basis Document data from 2002 through 2013. The objective is to present an analysis of factors that could influence the system and component trends in addition to annual performance trends of failure rates and probabilities. The factors analyzed for the EDG component are the differences in failures between all demands and actual unplanned engineered safety feature (ESF) demands, differences among manufacturers, and differences among EDG ratings. Statistical analyses of these differences are performed, and the results show whether pooling is acceptable across these factors. In addition, engineering analyses were performed with respect to time period and failure mode. The factors analyzed are: sub-component, failure cause, detection method, recovery, manufacturer, and EDG rating.

  1. Generalised Central Limit Theorems for Growth Rate Distribution of Complex Systems

    NASA Astrophysics Data System (ADS)

    Takayasu, Misako; Watanabe, Hayafumi; Takayasu, Hideki

    2014-04-01

    We introduce a solvable model of randomly growing systems consisting of many independent subunits. Scaling relations and growth rate distributions in the limit of infinite subunits are analysed theoretically. Various types of scaling properties and distributions reported for growth rates of complex systems in a variety of fields can be derived from this basic physical model. Statistical data of growth rates for about 1 million business firms are analysed as a real-world example of randomly growing systems. Not only are the scaling relations consistent with the theoretical solution, but the entire functional form of the growth rate distribution is fitted with a theoretical distribution that has a power-law tail.

  2. Active learning for noisy oracle via density power divergence.

    PubMed

    Sogawa, Yasuhiro; Ueno, Tsuyoshi; Kawahara, Yoshinobu; Washio, Takashi

    2013-10-01

    The accuracy of active learning is critically influenced by the existence of noisy labels given by a noisy oracle. In this paper, we propose a novel pool-based active learning framework through robust measures based on density power divergence. By minimizing density power divergence, such as β-divergence and γ-divergence, one can estimate the model accurately even under the existence of noisy labels within data. Accordingly, we develop query selecting measures for pool-based active learning using these divergences. In addition, we propose an evaluation scheme for these measures based on asymptotic statistical analyses, which enables us to perform active learning by evaluating an estimation error directly. Experiments with benchmark datasets and real-world image datasets show that our active learning scheme performs better than several baseline methods. Copyright © 2013 Elsevier Ltd. All rights reserved.
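
    A minimal sketch of the robustness idea: for a location family, the integral term of the density power divergence does not depend on the location parameter, so the β-divergence estimate of a mean reduces to maximizing Σᵢ f(xᵢ; μ)^β. Everything here (β, the fixed scale, the toy data) is an illustrative assumption, not the paper's active learning framework:

    ```python
    import numpy as np
    from scipy.optimize import minimize_scalar
    from scipy.stats import norm

    def dpd_mean(x, beta=0.5, sigma=1.0):
        """Density-power-divergence location estimate for a Gaussian model
        with known scale; outliers are down-weighted because their density
        contribution f(x)**beta is tiny."""
        obj = lambda mu: -np.sum(norm.pdf(x, loc=mu, scale=sigma) ** beta)
        return minimize_scalar(obj, bounds=(x.min(), x.max()),
                               method='bounded').x

    rng = np.random.default_rng(7)
    x = np.concatenate([rng.normal(0.0, 1.0, 95),    # clean labels
                        rng.normal(8.0, 0.5, 5)])    # 'noisy oracle' points
    print("sample mean:", x.mean(), " DPD estimate:", dpd_mean(x))
    ```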

  3. The statistical power to detect cross-scale interactions at macroscales

    USGS Publications Warehouse

    Wagner, Tyler; Fergus, C. Emi; Stow, Craig A.; Cheruvelil, Kendra S.; Soranno, Patricia A.

    2016-01-01

    Macroscale studies of ecological phenomena are increasingly common because stressors such as climate and land-use change operate at large spatial and temporal scales. Cross-scale interactions (CSIs), where ecological processes operating at one spatial or temporal scale interact with processes operating at another scale, have been documented in a variety of ecosystems and contribute to complex system dynamics. However, studies investigating CSIs are often dependent on compiling multiple data sets from different sources to create multithematic, multiscaled data sets, which results in structurally complex and sometimes incomplete data sets. The statistical power to detect CSIs needs to be evaluated because of their importance and the challenge of quantifying CSIs using data sets with complex structures and missing observations. We studied this problem using a spatially hierarchical model that measures CSIs between regional agriculture and its effects on the relationship between lake nutrients and lake productivity. We used an existing large multithematic, multiscaled database, the LAke multi-scaled GeOSpatial and temporal database (LAGOS), to parameterize the power analysis simulations. We found that the power to detect CSIs was more strongly related to the number of regions in the study than to the number of lakes nested within each region. CSI power analyses will not only help ecologists design large-scale studies aimed at detecting CSIs, but will also focus attention on CSI effect sizes and the degree to which they are ecologically relevant and detectable with large data sets.
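
    A hedged simulation sketch of this kind of power analysis, using a random-intercept model in statsmodels; all coefficients, variance components, and sizes are illustrative, and the comparison simply echoes the study's regions-versus-lakes finding:

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    def csi_power(n_regions, n_lakes, b_csi=0.3, reps=100, seed=0):
        """Simulated power to detect a cross-scale interaction between a
        region-level driver (ag) and a lake-level predictor (nutr);
        convergence warnings on small simulated data sets are expected."""
        rng = np.random.default_rng(seed)
        hits = 0
        for _ in range(reps):
            region = np.repeat(np.arange(n_regions), n_lakes)
            ag = rng.normal(size=n_regions)[region]        # region-level driver
            nutr = rng.normal(size=n_regions * n_lakes)    # lake-level predictor
            u = rng.normal(scale=0.5, size=n_regions)[region]
            y = (1 + 0.5 * nutr + 0.2 * ag + b_csi * nutr * ag
                 + u + rng.normal(size=len(nutr)))
            df = pd.DataFrame(dict(y=y, nutr=nutr, ag=ag, region=region))
            fit = smf.mixedlm("y ~ nutr * ag", df, groups="region").fit(reml=False)
            hits += fit.pvalues["nutr:ag"] < 0.05
        return hits / reps

    # Same total n: more regions vs. more lakes per region
    print(csi_power(n_regions=40, n_lakes=10))
    print(csi_power(n_regions=10, n_lakes=40))
    ```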

  4. Global maps of the magnetic thickness and magnetization of the Earth's lithosphere

    NASA Astrophysics Data System (ADS)

    Vervelidou, Foteini; Thébault, Erwan

    2015-10-01

    We have constructed global maps of the large-scale magnetic thickness and magnetization of Earth's lithosphere. Deriving such large-scale maps based on lithospheric magnetic field measurements faces the challenge of the masking effect of the core field. In this study, the maps were obtained through analyses in the spectral domain by means of a new regional spatial power spectrum based on the Revised Spherical Cap Harmonic Analysis (R-SCHA) formalism. A series of regional spectral analyses were conducted covering the entire Earth. The R-SCHA surface power spectrum for each region was estimated using the NGDC-720 spherical harmonic (SH) model of the lithospheric magnetic field, which is based on satellite, aeromagnetic, and marine measurements. These observational regional spectra were fitted to a recently proposed statistical expression of the power spectrum of Earth's lithospheric magnetic field, whose free parameters include the thickness and magnetization of the magnetic sources. The resulting global magnetic thickness map is compared to other crustal and magnetic thickness maps based upon different geophysical data. We conclude that the large-scale magnetic thickness of the lithosphere is on average confined to a layer that does not exceed the Moho.

  5. Acute nonlymphocytic leukemia and residential exposure to power frequency magnetic fields

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Severson, R.K.

    1986-01-01

    A population-based case-control study of adult acute nonlymphocytic leukemia (ANLL) and residential exposure to power frequency magnetic fields was conducted in King, Pierce and Snohomish Counties in Washington state. Of 164 cases who were diagnosed from January 1, 1981 through December 31, 1984, 114 were interviewed. Controls were selected from the study area on the basis of random digit dialing and frequency matched to the cases by age and sex. Analyses were undertaken to evaluate whether exposure to high levels of power frequency magnetic fields in the residence was associated with an increased risk of ANLL. Neither the directly measured magnetic fields nor the surrogate values based on the wiring configurations were associated with ANLL. Additional analyses suggested that persons with prior allergies were at decreased risk of acute myelocytic leukemia (AML). Also, persons with prior autoimmune diseases were at increased risk of AML. The increase in AML risk in rheumatoid arthritics was of borderline statistical significance. Finally, cigarette smoking was associated with an increased risk of AML. The risk of AML increased significantly with the number of years of cigarette smoking.

  6. The evolution of autodigestion in the mushroom family Psathyrellaceae (Agaricales) inferred from Maximum Likelihood and Bayesian methods.

    PubMed

    Nagy, László G; Urban, Alexander; Orstadius, Leif; Papp, Tamás; Larsson, Ellen; Vágvölgyi, Csaba

    2010-12-01

    Recently developed comparative phylogenetic methods offer a wide spectrum of applications in evolutionary biology, although it is generally accepted that their statistical properties are incompletely known. Here, we examine and compare the statistical power of the ML and Bayesian methods with regard to selection of best-fit models of fruiting-body evolution and hypothesis testing of ancestral states on a real-life data set of a physiological trait (autodigestion) in the family Psathyrellaceae. Our phylogenies are based on the first multigene data set generated for the family. Two different coding regimes (binary and multistate) and two data sets differing in taxon sampling density are examined. The Bayesian method outperformed Maximum Likelihood with regard to statistical power in all analyses. This is particularly evident if the signal in the data is weak, i.e. in cases when the ML approach does not provide support to choose among competing hypotheses. Results based on binary and multistate coding differed only modestly, although it was evident that multistate analyses were less conclusive in all cases. It seems that increased taxon sampling density has favourable effects on inference of ancestral states, while model parameters are influenced to a smaller extent. The model best fitting our data implies that the rate of losses of deliquescence equals zero, although model selection in ML does not provide proper support to reject three of the four candidate models. The results also support the hypothesis that non-deliquescence (lack of autodigestion) has been ancestral in Psathyrellaceae, and that deliquescent fruiting bodies represent the preferred state, having evolved independently several times during evolution. Copyright © 2010 Elsevier Inc. All rights reserved.

  7. [A Review on the Use of Effect Size in Nursing Research].

    PubMed

    Kang, Hyuncheol; Yeon, Kyupil; Han, Sang Tae

    2015-10-01

    The purpose of this study was to introduce the main concepts of statistical testing and effect size and to provide researchers in nursing science with guidance on how to calculate the effect size for the statistical analysis methods mainly used in nursing. For the t-test, analysis of variance, correlation analysis, and regression analysis, which are used frequently in nursing research, the generally accepted definitions of effect size are explained. Some formulae for calculating the effect size are described with several examples in nursing research. Furthermore, the authors present the required minimum sample size for each example utilizing G*Power 3, the most widely used program for calculating sample size. It is noted that statistical significance testing and effect size measurement serve different purposes, and reliance on only one of them may be misleading. Some practical guidelines are recommended for combining statistical significance testing and effect size measures in order to make more balanced decisions in quantitative analyses.
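
    G*Power is a standalone program; as a hedged Python equivalent for one of the cases the authors describe (an independent-samples t-test), statsmodels can solve for the minimum sample size given an effect size. The effect size, alpha, and power values are the conventional illustrative choices:

    ```python
    from statsmodels.stats.power import TTestIndPower

    # Sample size per group for a medium effect (Cohen's d = 0.5),
    # alpha = .05, power = .80, two-sided test
    n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05,
                                              power=0.8,
                                              alternative='two-sided')
    print(round(n_per_group))  # ~64, matching standard tables
    ```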

  8. Improving the power of clinical trials of rheumatoid arthritis by using data on continuous scales when analysing response rates: an application of the augmented binary method

    PubMed Central

    Jenkins, Martin

    2016-01-01

    Objective. In clinical trials of RA, it is common to assess effectiveness using end points based upon dichotomized continuous measures of disease activity, which classify patients as responders or non-responders. Although dichotomization generally loses statistical power, there are good clinical reasons to use these end points; for example, to allow for patients receiving rescue therapy to be assigned as non-responders. We adopt a statistical technique called the augmented binary method to make better use of the information provided by these continuous measures and account for how close patients were to being responders. Methods. We adapted the augmented binary method for use in RA clinical trials. We used a previously published randomized controlled trial (Oral SyK Inhibition in Rheumatoid Arthritis-1) to assess its performance in comparison to a standard method treating patients purely as responders or non-responders. The power and error rate were investigated by sampling from this study. Results. The augmented binary method reached similar conclusions to standard analysis methods but was able to estimate the difference in response rates to a higher degree of precision. Results suggested that CI widths for ACR responder end points could be reduced by at least 15%, which could equate to reducing the sample size of a study by 29% to achieve the same statistical power. For other end points, the gain was even higher. Type I error rates were not inflated. Conclusion. The augmented binary method shows considerable promise for RA trials, making more efficient use of patient data whilst still reporting outcomes in terms of recognized response end points. PMID:27338084
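
    A toy illustration of the principle behind the augmented binary method, not the authors' full model: when 'response' is a dichotomized continuous score, estimating the response probability from the continuous distribution is more precise than the raw proportion. The threshold, distribution, and bootstrap settings are illustrative assumptions:

    ```python
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(5)
    n, cut = 100, 20.0                      # 'responder' = score below cut
    scores = rng.normal(loc=22.0, scale=5.0, size=n)

    def binary_est(s):                      # raw responder proportion
        return (s < cut).mean()

    def augmented_est(s):                   # uses the continuous information
        return norm.cdf((cut - s.mean()) / s.std(ddof=1))

    boot = np.array([[binary_est(b), augmented_est(b)]
                     for b in (rng.choice(scores, n) for _ in range(2000))])
    print("bootstrap SEs (binary, augmented):", boot.std(axis=0))
    ```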

  9. Generalized Majority Logic Criterion to Analyze the Statistical Strength of S-Boxes

    NASA Astrophysics Data System (ADS)

    Hussain, Iqtadar; Shah, Tariq; Gondal, Muhammad Asif; Mahmood, Hasan

    2012-05-01

    The majority logic criterion is applicable in the evaluation process of substitution boxes used in the advanced encryption standard (AES). The performance of modified or advanced substitution boxes is predicted by processing the results of statistical analysis by the majority logic criteria. In this paper, we use the majority logic criteria to analyze some popular and prevailing substitution boxes used in encryption processes. In particular, the majority logic criterion is applied to AES, affine power affine (APA), Gray, Lui J, residue prime, S8 AES, Skipjack, and Xyi substitution boxes. The majority logic criterion is further extended into a generalized majority logic criterion which has a broader spectrum of analyzing the effectiveness of substitution boxes in image encryption applications. The integral components of the statistical analyses used for the generalized majority logic criterion are derived from results of entropy analysis, contrast analysis, correlation analysis, homogeneity analysis, energy analysis, and mean of absolute deviation (MAD) analysis.
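
    A hedged sketch of the statistical battery feeding the (generalized) majority logic criterion, computed here with scikit-image's GLCM utilities (scikit-image >= 0.19 assumed) on a toy 'encrypted' image; the pass/fail voting logic of the criterion itself is omitted:

    ```python
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    def mlc_metrics(img):
        """Entropy, contrast, correlation, homogeneity, energy and MAD for
        an 8-bit grayscale image, as used to benchmark S-box output."""
        glcm = graycomatrix(img, distances=[1], angles=[0],
                            levels=256, symmetric=True, normed=True)
        p = glcm[glcm > 0]
        out = {'entropy': float(-np.sum(p * np.log2(p))),
               'MAD': float(np.mean(np.abs(img - img.mean())))}
        for prop in ('contrast', 'correlation', 'homogeneity', 'energy'):
            out[prop] = float(graycoprops(glcm, prop)[0, 0])
        return out

    rng = np.random.default_rng(9)
    cipher_img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
    print(mlc_metrics(cipher_img))  # near-uniform: high entropy, low energy
    ```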

  10. Fuel and Lubricant Effects on Exhaust Emissions from a Light-Duty CIDI Powered Vehicle

    DTIC Science & Technology

    2003-09-01

    Particulate emissions were examined on a 1999 Mercedes-Benz C220 D. Test cycles included the FTP and the US06. Statistical analyses were performed on … A macroemulsion fuel was also evaluated. The test vehicle was a 1999 Mercedes-Benz C220 D equipped with a …

  11. Experimental Design in Clinical 'Omics Biomarker Discovery.

    PubMed

    Forshed, Jenny

    2017-11-03

    This tutorial highlights some issues in the experimental design of clinical 'omics biomarker discovery: how to avoid bias and obtain measurements as close to the true quantities as possible from biochemical analyses, and how to select samples to improve the chance of answering the clinical question at issue. This includes the importance of defining the clinical aim and end point, knowing the variability in the results, randomization of samples, sample size, statistical power, and how to avoid confounding factors by including clinical data in the sample selection, that is, how to avoid unpleasant surprises at the point of statistical analysis. The aim of this tutorial is to support translational clinical and preclinical biomarker research and to improve the validity and potential of future biomarker candidate findings.

  12. Statistical Learning Analysis in Neuroscience: Aiming for Transparency

    PubMed Central

    Hanke, Michael; Halchenko, Yaroslav O.; Haxby, James V.; Pollmann, Stefan

    2009-01-01

    Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods, neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires “neuroscience-aware” technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here, we review its features and applicability to various neural data modalities. PMID:20582270

  13. Toward "Constructing" the Concept of Statistical Power: An Optical Analogy.

    ERIC Educational Resources Information Center

    Rogers, Bruce G.

    This paper presents a visual analogy that may be used by instructors to teach the concept of statistical power in statistical courses. Statistical power is mathematically defined as the probability of rejecting a null hypothesis when that null is false, or, equivalently, the probability of detecting a relationship when it exists. The analogy…

  14. Economic Impacts of Wind Turbine Development in U.S. Counties

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, J.; Hoen, B.; Lantz, E.

    2011-07-25

    The objective is to address the research question using post-construction, county-level data and econometric evaluation methods. Wind energy is expanding rapidly in the United States: Over the last 4 years, wind power has contributed approximately 35 percent of all new electric power capacity. Wind power plants are often developed in rural areas where local economic development impacts from the installation are projected, including land lease and property tax payments and employment growth during plant construction and operation. Wind energy represented 2.3 percent of the U.S. electricity supply in 2010, but studies show that penetrations of at least 20 percent are feasible. Several studies have used input-output models to predict direct, indirect, and induced economic development impacts. These analyses have often been completed prior to project construction. Available studies have not yet investigated the economic development impacts of wind development at the county level using post-construction econometric evaluation methods. Analysis of county-level impacts is limited. However, previous county-level analyses have estimated operation-period employment at 0.2 to 0.6 jobs per megawatt (MW) of power installed and earnings at $9,000/MW to $50,000/MW. We find statistically significant evidence of positive impacts of wind development on county-level per capita income from the OLS and spatial lag models when they are applied to the full set of wind and non-wind counties. The total impact on annual per capita income of wind turbine development (measured in MW per capita) in the spatial lag model was $21,604 per MW. This estimate is within the range of values estimated in the literature using input-output models. OLS results for the wind-only counties and matched samples are similar in magnitude, but are not statistically significant at the 10-percent level. We find a statistically significant impact of wind development on employment in the OLS analysis for wind counties only, but not in the other models. Our estimates of employment impacts are not precise enough to assess the validity of employment impacts from input-output models applied in advance of wind energy project construction. The analysis provides empirical evidence of positive income effects at the county level from cumulative wind turbine development, consistent with the range of impacts estimated using input-output models. Employment impacts are less clear.

  15. Multi-Scale Modeling to Improve Single-Molecule, Single-Cell Experiments

    NASA Astrophysics Data System (ADS)

    Munsky, Brian; Shepherd, Douglas

    2014-03-01

    Single-cell, single-molecule experiments are producing an unprecedented amount of data to capture the dynamics of biological systems. When integrated with computational models, observations of spatial, temporal and stochastic fluctuations can yield powerful quantitative insight. We concentrate on experiments that localize and count individual molecules of mRNA. These high precision experiments have large imaging and computational processing costs, and we explore how improved computational analyses can dramatically reduce overall data requirements. In particular, we show how analyses of spatial, temporal and stochastic fluctuations can significantly enhance parameter estimation results for small, noisy data sets. We also show how full probability distribution analyses can constrain parameters with far less data than bulk analyses or statistical moment closures. Finally, we discuss how a systematic modeling progression from simple to more complex analyses can reduce total computational costs by orders of magnitude. We illustrate our approach using single-molecule, spatial mRNA measurements of Interleukin 1-alpha mRNA induction in human THP1 cells following stimulation. Our approach could improve the effectiveness of single-molecule gene regulation analyses for many other processes.

  16. DMINDA: an integrated web server for DNA motif identification and analyses

    PubMed Central

    Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

    2014-01-01

    DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. PMID:24753419

  17. Testosterone replacement therapy and the heart: friend, foe or bystander?

    PubMed Central

    Lopez, David S; Canfield, Steven; Wang, Run

    2016-01-01

    The role of testosterone therapy (TTh) in cardiovascular disease (CVD) outcomes is still controversial, and it seems it will remain inconclusive for the moment. An extensive body of literature has investigated the association of endogenous testosterone and use of TTh with CVD events, including several meta-analyses. In some instances, a number of studies reported beneficial effects of TTh on CVD events, and in other instances the body of literature reported detrimental effects or no effects at all. Yet, no review article has scrutinized this body of literature using the magnitude of associations and statistical significance reported for this relationship. We critically reviewed the previous and emerging body of literature that investigated the association of endogenous testosterone and use of TTh with CVD events (only fatal and nonfatal). These studies were divided into three groups, “beneficial (friendly use)”, “detrimental (foe)” and “no effects at all (bystander)”, based on their magnitude of associations and statistical significance from original research studies and meta-analyses of epidemiological studies and of randomized controlled trials (RCTs). In this review article, the studies reporting a significant association of high levels of testosterone with a reduced risk of CVD events in original prospective studies and meta-analyses of cross-sectional and prospective studies seem to be more consistent. However, the number of meta-analyses of RCTs does not provide a clear picture after we divided them into the beneficial, detrimental or no-effects-at-all groups using their magnitudes of association and statistical significance. From this review, we suggest that we need a study, or a number of studies, with adequate power and epidemiological and clinical data to provide a definitive conclusion on whether the effect of TTh on the natural history of CVD is real or not. PMID:28078222

  18. Testosterone replacement therapy and the heart: friend, foe or bystander?

    PubMed

    Lopez, David S; Canfield, Steven; Wang, Run

    2016-12-01

    The role of testosterone therapy (TTh) in cardiovascular disease (CVD) outcomes is still controversial, and it seems it will remain inconclusive for the moment. An extensive body of literature has investigated the association of endogenous testosterone and use of TTh with CVD events, including several meta-analyses. In some instances, a number of studies reported beneficial effects of TTh on CVD events, and in other instances the body of literature reported detrimental effects or no effects at all. Yet, no review article has scrutinized this body of literature using the magnitude of associations and statistical significance reported for this relationship. We critically reviewed the previous and emerging body of literature that investigated the association of endogenous testosterone and use of TTh with CVD events (only fatal and nonfatal). These studies were divided into three groups, "beneficial (friendly use)", "detrimental (foe)" and "no effects at all (bystander)", based on their magnitude of associations and statistical significance from original research studies and meta-analyses of epidemiological studies and of randomized controlled trials (RCTs). In this review article, the studies reporting a significant association of high levels of testosterone with a reduced risk of CVD events in original prospective studies and meta-analyses of cross-sectional and prospective studies seem to be more consistent. However, the number of meta-analyses of RCTs does not provide a clear picture after we divided them into the beneficial, detrimental or no-effects-at-all groups using their magnitudes of association and statistical significance. From this review, we suggest that we need a study, or a number of studies, with adequate power and epidemiological and clinical data to provide a definitive conclusion on whether the effect of TTh on the natural history of CVD is real or not.

  19. Data series embedding and scale invariant statistics.

    PubMed

    Michieli, I; Medved, B; Ristov, S

    2010-06-01

    Data sequences acquired from bio-systems such as human gait data, heart rate interbeat data, or DNA sequences exhibit complex dynamics that is frequently described by a long-memory or power-law decay of the autocorrelation function. One way of characterizing that dynamics is through scale invariant statistics or "fractal-like" behavior. Several methods have been proposed for quantifying scale invariant parameters of physiological signals. Among them, the most common are detrended fluctuation analysis, sample mean variance analyses, power spectral density analysis, R/S analysis, and, recently in the realm of the multifractal approach, wavelet analysis. In this paper it is demonstrated that embedding the time series data in a high-dimensional pseudo-phase space reveals scale invariant statistics in a simple fashion. The procedure is applied to different stride interval data sets from human gait measurement time series (PhysioBank data library). Results show that the introduced mapping adequately separates long-memory from random behavior. Smaller gait data sets were analyzed and scale-free trends for limited scale intervals were successfully detected. The method was verified on artificially produced time series with known scaling behavior and with varying content of noise. The possibility for the method to falsely detect long-range dependence in artificially generated short-range-dependence series was investigated. © 2009 Elsevier B.V. All rights reserved.
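
    For reference, a compact sketch of one of the standard methods named above, detrended fluctuation analysis; the scaling exponent is the slope of log F(s) against log s. Window sizes and the linear detrend order are illustrative choices:

    ```python
    import numpy as np

    def dfa(x, scales):
        """Detrended fluctuation analysis: RMS fluctuation F(s) of the
        integrated, per-window linearly detrended series at each scale s."""
        y = np.cumsum(x - np.mean(x))
        F = []
        for s in scales:
            n_win = len(y) // s
            segs = y[:n_win * s].reshape(n_win, s)
            t = np.arange(s)
            res = [seg - np.polyval(np.polyfit(t, seg, 1), t) for seg in segs]
            F.append(np.sqrt(np.mean(np.square(res))))
        return np.array(F)

    x = np.random.default_rng(2).normal(size=4096)   # white noise
    scales = np.array([16, 32, 64, 128, 256])
    alpha = np.polyfit(np.log(scales), np.log(dfa(x, scales)), 1)[0]
    print(alpha)  # ~0.5 for white noise; ~1.0 would indicate 1/f-type memory
    ```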

  20. easyGWAS: A Cloud-Based Platform for Comparing the Results of Genome-Wide Association Studies.

    PubMed

    Grimm, Dominik G; Roqueiro, Damian; Salomé, Patrice A; Kleeberger, Stefan; Greshake, Bastian; Zhu, Wangsheng; Liu, Chang; Lippert, Christoph; Stegle, Oliver; Schölkopf, Bernhard; Weigel, Detlef; Borgwardt, Karsten M

    2017-01-01

    The ever-growing availability of high-quality genotypes for a multitude of species has enabled researchers to explore the underlying genetic architecture of complex phenotypes at an unprecedented level of detail using genome-wide association studies (GWAS). The systematic comparison of results obtained from GWAS of different traits opens up new possibilities, including the analysis of pleiotropic effects. Other advantages that result from the integration of multiple GWAS are the ability to replicate GWAS signals and to increase statistical power to detect such signals through meta-analyses. In order to facilitate the simple comparison of GWAS results, we present easyGWAS, a powerful, species-independent online resource for computing, storing, sharing, annotating, and comparing GWAS. The easyGWAS tool supports multiple species, the uploading of private genotype data and summary statistics of existing GWAS, as well as advanced methods for comparing GWAS results across different experiments and data sets in an interactive and user-friendly interface. easyGWAS is also a public data repository for GWAS data and summary statistics and already includes published data and results from several major GWAS. We demonstrate the potential of easyGWAS with a case study of the model organism Arabidopsis thaliana, using flowering and growth-related traits. © 2016 American Society of Plant Biologists. All rights reserved.

  1. Genomic similarity and kernel methods I: advancements by building on mathematical and statistical foundations.

    PubMed

    Schaid, Daniel J

    2010-01-01

    Measures of genomic similarity are the basis of many statistical analytic methods. We review the mathematical and statistical basis of similarity methods, particularly based on kernel methods. A kernel function converts information for a pair of subjects to a quantitative value representing either similarity (larger values meaning more similar) or distance (smaller values meaning more similar), with the requirement that it must create a positive semidefinite matrix when applied to all pairs of subjects. This review emphasizes the wide range of statistical methods and software that can be used when similarity is based on kernel methods, such as nonparametric regression, linear mixed models and generalized linear mixed models, hierarchical models, score statistics, and support vector machines. The mathematical rigor for these methods is summarized, as is the mathematical framework for making kernels. This review provides a framework to move from intuitive and heuristic approaches to define genomic similarities to more rigorous methods that can take advantage of powerful statistical modeling and existing software. A companion paper reviews novel approaches to creating kernels that might be useful for genomic analyses, providing insights with examples [1]. Copyright © 2010 S. Karger AG, Basel.
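
    A hedged sketch of the basic construction: a linear genomic-similarity kernel from a genotype matrix, a check that it is positive semidefinite, and its use in a simple kernel ridge predictor. The allele frequency, regularization, and scaling are illustrative assumptions:

    ```python
    import numpy as np

    rng = np.random.default_rng(11)
    n, p = 100, 500
    X = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # 0/1/2 genotype counts
    Xc = X - X.mean(axis=0)                              # center each variant
    K = Xc @ Xc.T / p                                    # linear similarity kernel

    # Kernel functions must yield a positive semidefinite matrix
    assert np.linalg.eigvalsh(K).min() > -1e-8

    # Kernel ridge regression using the similarity matrix directly
    y = rng.normal(size=n)
    lam = 1.0
    alpha = np.linalg.solve(K + lam * np.eye(n), y)  # dual coefficients
    y_hat = K @ alpha                                # fitted values
    print(np.corrcoef(y, y_hat)[0, 1])
    ```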

  2. Effect of non-normality on test statistics for one-way independent groups designs.

    PubMed

    Cribbie, Robert A; Fiksenbaum, Lisa; Keselman, H J; Wilcox, Rand R

    2012-02-01

    The data obtained from one-way independent groups designs are typically non-normal in form and rarely equally variable across treatment populations (i.e., population variances are heterogeneous). Consequently, the classical test statistic that is used to assess statistical significance (i.e., the analysis of variance F test) typically provides invalid results (e.g., too many Type I errors, reduced power). For this reason, there has been considerable interest in finding a test statistic that is appropriate under conditions of non-normality and variance heterogeneity. Previously recommended procedures for analysing such data include the James test, the Welch test applied either to the usual least squares estimators of central tendency and variability, or the Welch test with robust estimators (i.e., trimmed means and Winsorized variances). A new statistic proposed by Krishnamoorthy, Lu, and Mathew, intended to deal with heterogeneous variances, though not non-normality, uses a parametric bootstrap procedure. In their investigation of the parametric bootstrap test, the authors examined its operating characteristics under limited conditions and did not compare it to the Welch test based on robust estimators. Thus, we investigated how the parametric bootstrap procedure and a modified parametric bootstrap procedure based on trimmed means perform relative to previously recommended procedures when data are non-normal and heterogeneous. The results indicated that the tests based on trimmed means offer the best Type I error control and power when variances are unequal and at least some of the distribution shapes are non-normal. © 2011 The British Psychological Society.
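
    For readers who want to try the recommended procedures, recent SciPy versions expose both the Welch test and a trimmed-means (Yuen-type) variant through one call; the 20% trim below is the conventional choice, and SciPy >= 1.7 is assumed for the `trim` argument:

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    a = rng.lognormal(mean=0.0, sigma=1.0, size=40)  # skewed group
    b = rng.lognormal(mean=0.3, sigma=1.5, size=25)  # skewed, more variable

    welch = stats.ttest_ind(a, b, equal_var=False)            # Welch's test
    yuen = stats.ttest_ind(a, b, equal_var=False, trim=0.2)   # 20% trimmed means
    print(welch.pvalue, yuen.pvalue)
    ```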

  3. Computer experimental analysis of the CHP performance of a 100 kWe SOFC Field Unit by a factorial design

    NASA Astrophysics Data System (ADS)

    Calì, M.; Santarelli, M. G. L.; Leone, P.

    Gas Turbine Technologies (GTT) and Politecnico di Torino, both located in Torino (Italy), have been involved in the design and installation of a SOFC laboratory in order to analyse the operation, in cogenerative configuration, of the CHP 100 kWe SOFC Field Unit, built by Siemens-Westinghouse Power Corporation (SWPC), which is at present (May 2005) starting its operation and which will supply electric and thermal power to the GTT factory. In order to take better advantage of the analysis of the on-site operation, and especially to correctly design the scheduled experimental tests on the system, we developed a mathematical model and ran a simulated experimental campaign, applying a rigorous statistical approach to the analysis of the results. The aim of this work is the computer experimental analysis, through a statistical methodology (2^k factorial experiments), of the CHP100 performance. First, the mathematical model was calibrated with the results acquired during the first CHP100 demonstration at EDB/ELSAM in Westervoort. Then, the simulated tests were performed in the form of a computer experimental session, and the measurement uncertainties were simulated with perturbations imposed on the model's independent variables. The statistical methodology used for the computer experimental analysis is factorial design (Yates' technique): using the ANOVA technique, the effect of the main independent variables (air utilization factor U_ox, fuel utilization factor U_F, internal fuel and air preheating, and anodic recycling flow rate) has been investigated in a rigorous manner. The analysis accounts for the effects of the parameters on stack electric power, recovered thermal power, single-cell voltage, cell operating temperature, consumed fuel flow, and steam-to-carbon ratio. Each main effect and interaction effect of the parameters is shown, with particular attention to the generated electric power and the heat recovered from the stack.
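
    A hedged, scaled-down analogue of such a session: a replicated 2^3 factorial design analysed with ANOVA (Yates' technique in regression form). The factor names and the toy response surface are illustrative, not the CHP100 model:

    ```python
    import itertools
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    rng = np.random.default_rng(6)
    # Two replicates of a full 2^3 design in coded (-1/+1) units,
    # e.g. U_ox, U_F and the recycle flow rate as factors
    runs = [dict(Uox=a, UF=b, rec=c)
            for a, b, c in itertools.product((-1, 1), repeat=3)] * 2
    df = pd.DataFrame(runs)
    # Hypothetical response: electric power with a Uox:UF interaction
    df["P_el"] = (100 + 5*df.Uox + 8*df.UF + 2*df.rec
                  + 3*df.Uox*df.UF + rng.normal(scale=1.0, size=len(df)))

    model = smf.ols("P_el ~ Uox * UF * rec", data=df).fit()
    print(anova_lm(model, typ=2))   # main effects and interactions
    ```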

  4. Powered versus manual toothbrushing for oral health.

    PubMed

    Yaacob, Munirah; Worthington, Helen V; Deacon, Scott A; Deery, Chris; Walmsley, A Damien; Robinson, Peter G; Glenny, Anne-Marie

    2014-06-17

    Removing dental plaque may play a key role in maintaining oral health. There is conflicting evidence for the relative merits of manual and powered toothbrushing in achieving this. This is an update of a Cochrane review first published in 2003, and previously updated in 2005. To compare manual and powered toothbrushes in everyday use, by people of any age, in relation to the removal of plaque, the health of the gingivae, staining and calculus, dependability, adverse effects and cost. We searched the following electronic databases: the Cochrane Oral Health Group's Trials Register (to 23 January 2014), the Cochrane Central Register of Controlled Trials (CENTRAL) (The Cochrane Library 2014, Issue 1), MEDLINE via OVID (1946 to 23 January 2014), EMBASE via OVID (1980 to 23 January 2014) and CINAHL via EBSCO (1980 to 23 January 2014). We searched the US National Institutes of Health Trials Register and the WHO Clinical Trials Registry Platform for ongoing trials. No restrictions were placed on the language or date of publication when searching the electronic databases. Randomised controlled trials of at least four weeks of unsupervised powered toothbrushing versus manual toothbrushing for oral health in children and adults. We used standard methodological procedures expected by The Cochrane Collaboration. Random-effects models were used provided there were four or more studies included in the meta-analysis, otherwise fixed-effect models were used. Data were classed as short term (one to three months) and long term (greater than three months). Fifty-six trials met the inclusion criteria; 51 trials involving 4624 participants provided data for meta-analysis. Five trials were at low risk of bias, five at high and 46 at unclear risk of bias. There is moderate quality evidence that powered toothbrushes provide a statistically significant benefit compared with manual toothbrushes with regard to the reduction of plaque in both the short term (standardised mean difference (SMD) -0.50 (95% confidence interval (CI) -0.70 to -0.31); 40 trials, n = 2871) and long term (SMD -0.47 (95% CI -0.82 to -0.11); 14 trials, n = 978). These results correspond to an 11% reduction in plaque for the Quigley Hein index (Turesky) in the short term and a 21% reduction long term. Both meta-analyses showed high levels of heterogeneity (I² = 83% and 86%, respectively) that was not explained by the different powered toothbrush type subgroups. With regard to gingivitis, there is moderate quality evidence that powered toothbrushes again provide a statistically significant benefit when compared with manual toothbrushes both in the short term (SMD -0.43 (95% CI -0.60 to -0.25); 44 trials, n = 3345) and long term (SMD -0.21 (95% CI -0.31 to -0.12); 16 trials, n = 1645). This corresponds to a 6% and an 11% reduction in gingivitis for the Löe and Silness index, respectively. Both meta-analyses showed high levels of heterogeneity (I² = 82% and 51%, respectively) that was not explained by the different powered toothbrush type subgroups. The number of trials for each type of powered toothbrush varied: side to side (10 trials), counter oscillation (five trials), rotation oscillation (27 trials), circular (two trials), ultrasonic (seven trials), ionic (four trials) and unknown (five trials). The greatest body of evidence was for rotation oscillation brushes, which demonstrated a statistically significant reduction in plaque and gingivitis at both time points.
Powered toothbrushes reduce plaque and gingivitis more than manual toothbrushing in the short and long term. The clinical importance of these findings remains unclear. Observation of methodological guidelines and greater standardisation of design would benefit both future trials and meta-analyses.Cost, reliability and side effects were inconsistently reported. Any reported side effects were localised and only temporary.
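
    The pooled SMDs above come from random-effects models; a hedged sketch of the standard DerSimonian-Laird computation, with made-up study estimates rather than the review's data:

    ```python
    import numpy as np

    def dersimonian_laird(y, v):
        """Random-effects pooling of effect sizes y with variances v."""
        y, v = np.asarray(y, float), np.asarray(v, float)
        w = 1.0 / v                                  # fixed-effect weights
        q = np.sum(w * (y - np.sum(w * y) / w.sum()) ** 2)
        c = w.sum() - np.sum(w ** 2) / w.sum()
        tau2 = max(0.0, (q - (len(y) - 1)) / c)      # between-study variance
        wr = 1.0 / (v + tau2)                        # random-effects weights
        est = np.sum(wr * y) / wr.sum()
        se = np.sqrt(1.0 / wr.sum())
        return est, (est - 1.96 * se, est + 1.96 * se), tau2

    # Hypothetical per-trial SMDs for plaque and their variances
    smd = [-0.62, -0.41, -0.55, -0.30, -0.48]
    var = [0.04, 0.06, 0.03, 0.05, 0.04]
    print(dersimonian_laird(smd, var))
    ```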

  5. Relating design and environmental variables to reliability

    NASA Astrophysics Data System (ADS)

    Kolarik, William J.; Landers, Thomas L.

    The combination of space application and nuclear power source demands high reliability hardware. The possibilities of failure, either an inability to provide power or a catastrophic accident, must be minimized. Nuclear power experiences on the ground have led to highly sophisticated probabilistic risk assessment procedures, most of which require quantitative information to adequately assess such risks. In the area of hardware risk analysis, reliability information plays a key role. One of the lessons learned from the Three Mile Island experience is that thorough analyses of critical components are essential. Nuclear grade equipment shows some reliability advantages over commercial. However, no statistically significant difference has been found. A recent study pertaining to spacecraft electronics reliability, examined some 2500 malfunctions on more than 300 aircraft. The study classified the equipment failures into seven general categories. Design deficiencies and lack of environmental protection accounted for about half of all failures. Within each class, limited reliability modeling was performed using a Weibull failure model.
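
    Where failure data are available, the Weibull model mentioned above can be fitted directly; a hedged sketch with simulated failure times (the shape and scale values are illustrative):

    ```python
    import numpy as np
    from scipy.stats import weibull_min

    rng = np.random.default_rng(8)
    # Simulated times-to-failure; shape < 1 would suggest infant mortality,
    # shape > 1 wear-out, shape ~ 1 a constant (exponential) hazard
    times = weibull_min.rvs(c=1.8, scale=1000.0, size=60, random_state=rng)

    shape, loc, scale = weibull_min.fit(times, floc=0)  # fix location at zero
    print(f"shape={shape:.2f}, scale={scale:.0f}")
    print("reliability at t=500 h:", weibull_min.sf(500.0, shape, loc, scale))
    ```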

  6. Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses

    PubMed Central

    Liu, Ruijie; Holik, Aliaksei Z.; Su, Shian; Jansz, Natasha; Chen, Kelan; Leong, Huei San; Blewitt, Marnie E.; Asselin-Labat, Marie-Liesse; Smyth, Gordon K.; Ritchie, Matthew E.

    2015-01-01

    Variations in sample quality are frequently encountered in small RNA-sequencing experiments and pose a major challenge in differential expression analysis. Removal of high-variation samples reduces noise, but at the cost of reduced power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean–variance relationship of the log-counts-per-million using ‘voom’. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source ‘limma’ package. PMID:25925576
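
    The voom/limma machinery itself is an R package; the Python sketch below illustrates only the core idea described above, namely down-weighting observations from more variable samples via inverse-variance weights in a weighted regression. The data and the known sample standard deviations are synthetic assumptions, and the sketch makes no claim to reproduce limma's actual estimation:

      # Illustrative sketch: down-weighting noisy samples in a group-comparison fit.
      # This mimics the *idea* of sample weights, not limma's estimation procedure.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(0)
      group = np.repeat([0, 1], 4)                     # two groups, four samples each
      sample_sd = np.array([1, 1, 1, 3, 1, 1, 1, 3])   # samples 4 and 8 are noisier
      y = 0.5 * group + rng.normal(0, sample_sd)       # log-expression for one gene

      X = sm.add_constant(group)
      weights = 1.0 / sample_sd**2                     # inverse-variance sample weights
      fit = sm.WLS(y, X, weights=weights).fit()
      print(fit.params, fit.pvalues)                   # group coefficient and its p-value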

  7. Statistical analysis plan for the Pneumatic CompREssion for PreVENting Venous Thromboembolism (PREVENT) trial: a study protocol for a randomized controlled trial.

    PubMed

    Arabi, Yaseen; Al-Hameed, Fahad; Burns, Karen E A; Mehta, Sangeeta; Alsolamy, Sami; Almaani, Mohammed; Mandourah, Yasser; Almekhlafi, Ghaleb A; Al Bshabshe, Ali; Finfer, Simon; Alshahrani, Mohammed; Khalid, Imran; Mehta, Yatin; Gaur, Atul; Hawa, Hassan; Buscher, Hergen; Arshad, Zia; Lababidi, Hani; Al Aithan, Abdulsalam; Jose, Jesna; Abdukahil, Sheryl Ann I; Afesh, Lara Y; Dbsawy, Maamoun; Al-Dawood, Abdulaziz

    2018-03-15

    The Pneumatic CompREssion for Preventing VENous Thromboembolism (PREVENT) trial evaluates the effect of adjunctive intermittent pneumatic compression (IPC) with pharmacologic thromboprophylaxis compared to pharmacologic thromboprophylaxis alone on venous thromboembolism (VTE) in critically ill adults. In this multicenter randomized trial, critically ill patients receiving pharmacologic thromboprophylaxis will be randomized to an IPC or a no IPC (control) group. The primary outcome is "incident" proximal lower-extremity deep vein thrombosis (DVT) within 28 days after randomization. Radiologists interpreting the lower-extremity ultrasonography will be blinded to intervention allocation, whereas the patients and treating team will be unblinded. The trial has 80% power to detect a 3% absolute risk reduction in the rate of proximal DVT from 7% to 4%. Consistent with international guidelines, we have developed a detailed plan to guide the analysis of the PREVENT trial. This plan specifies the statistical methods for the evaluation of primary and secondary outcomes, and defines covariates for adjusted analyses a priori. Application of this statistical analysis plan to the PREVENT trial will facilitate unbiased analyses of clinical data. ClinicalTrials.gov , ID: NCT02040103 . Registered on 3 November 2013; Current controlled trials, ID: ISRCTN44653506 . Registered on 30 October 2013.
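
    The stated power calculation can be checked approximately from the abstract's numbers alone. A hedged sketch using statsmodels (the trial's own calculation may have used different software or a different method):

      # Approximate check of the stated power calculation: detecting a drop in
      # proximal-DVT rate from 7% to 4% with 80% power at two-sided alpha = 0.05.
      from statsmodels.stats.proportion import proportion_effectsize
      from statsmodels.stats.power import NormalIndPower

      es = abs(proportion_effectsize(0.04, 0.07))      # Cohen's h for the two rates
      n_per_arm = NormalIndPower().solve_power(effect_size=es, alpha=0.05,
                                               power=0.80, alternative='two-sided')
      print(round(n_per_arm))   # patients per arm, before any inflation for dropout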

  8. Super-delta: a new differential gene expression analysis procedure with robust data normalization.

    PubMed

    Liu, Yuhang; Zhang, Jinfeng; Qiu, Xing

    2017-12-21

    Normalization is an important data preparation step in gene expression analyses, designed to remove various sources of systematic noise. Sample variance is greatly reduced after normalization, hence the power of subsequent statistical analyses is likely to increase. On the other hand, variance reduction is made possible by borrowing information across all genes, including differentially expressed genes (DEGs) and outliers, which will inevitably introduce some bias. This bias typically inflates type I error and can reduce statistical power in certain situations. In this study we propose a new differential expression analysis pipeline, dubbed super-delta, that consists of a multivariate extension of global normalization and a modified t-test. A robust procedure is designed to minimize the bias introduced by DEGs in the normalization step. The modified t-test is derived based on asymptotic theory for hypothesis testing that suitably pairs with the proposed robust normalization. We first compared super-delta with four commonly used normalization methods: global, median-IQR, quantile, and cyclic loess normalization in simulation studies. Super-delta was shown to have better statistical power with tighter control of the type I error rate than its competitors. In many cases, the performance of super-delta is close to that of an oracle test in which datasets without technical noise were used. We then applied all methods to a collection of gene expression datasets on breast cancer patients who received neoadjuvant chemotherapy. While there is a substantial overlap of the DEGs identified by all of them, super-delta was able to identify more DEGs than its competitors. Downstream gene set enrichment analysis confirmed that all these methods selected largely consistent pathways. Detailed investigation of the relatively small differences showed that pathways identified by super-delta have better connections to breast cancer than those from other methods. As a new pipeline, super-delta provides new insights into the area of differential gene expression analysis. A solid theoretical foundation supports its asymptotic unbiasedness and technical noise-free properties. Implementation on real and simulated datasets demonstrates its decent performance compared with state-of-the-art procedures. It also has the potential to be extended to other data types and/or more general between-group comparison problems.

  9. Is complex allometry in field metabolic rates of mammals a statistical artifact?

    PubMed

    Packard, Gary C

    2017-01-01

    Recent reports indicate that field metabolic rates (FMRs) of mammals conform to a pattern of complex allometry in which the exponent in a simple, two-parameter power equation increases steadily as a dependent function of body mass. The reports were based, however, on indirect analyses performed on logarithmic transformations of the original data. I re-examined values for FMR and body mass for 114 species of mammal by the conventional approach to allometric analysis (to illustrate why the approach is unreliable) and by linear and nonlinear regression on untransformed variables (to illustrate the power and versatility of newer analytical methods). The best of the regression models fitted directly to untransformed observations is a three-parameter power equation with multiplicative, lognormal, heteroscedastic error and an allometric exponent of 0.82. The mean function is a good fit to data in graphical display. The significant intercept in the model may simply have gone undetected in prior analyses because conventional allometry assumes implicitly that the intercept is zero; or the intercept may be a spurious finding resulting from bias introduced by the haphazard sampling that underlies "exploratory" analyses like the one reported here. The aforementioned issues can be resolved only by gathering new data specifically intended to address the question of scaling of FMR with body mass in mammals. However, there is no support for the concept of complex allometry in the relationship between FMR and body size in mammals. Copyright © 2016 Elsevier Inc. All rights reserved.
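
    As a sketch of the kind of model fitting described, assuming simulated data rather than the paper's 114-species data set, a three-parameter power equation with multiplicative lognormal error can be fitted by minimising residuals on the log scale:

      # Sketch: fit FMR = a + b * mass^c with multiplicative lognormal error,
      # i.e. fit log(FMR) = log(a + b * mass^c). Data are simulated, not the
      # 114-species data set from the paper.
      import numpy as np
      from scipy.optimize import curve_fit

      rng = np.random.default_rng(1)
      mass = 10 ** rng.uniform(1, 5, 114)                    # body mass
      fmr = (5.0 + 2.0 * mass ** 0.82) * rng.lognormal(0.0, 0.2, mass.size)

      def log_model(m, a, b, c):
          return np.log(a + b * m ** c)

      popt, pcov = curve_fit(log_model, mass, np.log(fmr),
                             p0=[1.0, 1.0, 0.8], bounds=(0, np.inf))
      print("a, b, c =", np.round(popt, 3))   # intercept, coefficient, exponent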

  10. Causality in Statistical Power: Isomorphic Properties of Measurement, Research Design, Effect Size, and Sample Size.

    PubMed

    Heidel, R Eric

    2016-01-01

    Statistical power is the ability to detect a significant effect, given that the effect actually exists in a population. Like most statistical concepts, statistical power tends to induce cognitive dissonance in hepatology researchers. However, planning for statistical power by an a priori sample size calculation is of paramount importance when designing a research study. There are five specific empirical components that make up an a priori sample size calculation: the scale of measurement of the outcome, the research design, the magnitude of the effect size, the variance of the effect size, and the sample size. A framework grounded in the phenomenon of isomorphism, or interdependencies amongst different constructs with similar forms, will be presented to understand the isomorphic effects of decisions made on each of the five aforementioned components of statistical power.
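
    A worked example ties the five components together. The numbers below are illustrative, not from the paper: for a two-group design with a continuous outcome, an assumed standardised effect size of d = 0.5, two-sided alpha = 0.05 and target power 0.80, the a priori sample size is:

      # A priori sample size calculation for a two-group comparison of a
      # continuous outcome (illustrative inputs).
      from statsmodels.stats.power import TTestIndPower

      n_per_group = TTestIndPower().solve_power(effect_size=0.5,
                                                alpha=0.05, power=0.80)
      print(round(n_per_group))   # ~64 participants per group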

  11. The Relation Between Inflation in Type-I and Type-II Error Rate and Population Divergence in Genome-Wide Association Analysis of Multi-Ethnic Populations.

    PubMed

    Derks, E M; Zwinderman, A H; Gamazon, E R

    2017-05-01

    Population divergence impacts the degree of population stratification in genome-wide association studies. We aim to: (i) investigate the type-I error rate as a function of population divergence (FST) in multi-ethnic (admixed) populations; (ii) evaluate the statistical power and effect size estimates; and (iii) investigate the impact of population stratification on the results of gene-based analyses. Quantitative phenotypes were simulated. The type-I error rate was investigated for Single Nucleotide Polymorphisms (SNPs) with varying levels of FST between the ancestral European and African populations. The type-II error rate was investigated for a SNP characterized by a high value of FST. In all tests, genomic MDS components were included to correct for population stratification. Type-I and type-II error rates were adequately controlled in a population that included two distinct ethnic populations, but not in admixed samples. Statistical power was reduced in the admixed samples. Gene-based tests showed no residual inflation in the type-I error rate.

  12. Microfluidic-based mini-metagenomics enables discovery of novel microbial lineages from complex environmental samples

    PubMed Central

    Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R

    2017-01-01

    Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell. DOI: http://dx.doi.org/10.7554/eLife.26580.001 PMID:28678007

  13. Predictive analysis of beer quality by correlating sensory evaluation with higher alcohol and ester production using multivariate statistics methods.

    PubMed

    Dong, Jian-Jun; Li, Qing-Liang; Yin, Hua; Zhong, Cheng; Hao, Jun-Guang; Yang, Pan-Fei; Tian, Yu-Hong; Jia, Shi-Ru

    2014-10-15

    Sensory evaluation is regarded as a necessary procedure to ensure a reproducible quality of beer. Meanwhile, high-throughput analytical methods provide a powerful tool to analyse various flavour compounds, such as higher alcohols and esters. In this study, the relationship between flavour compounds and sensory evaluation was established by non-linear models such as partial least squares (PLS), genetic algorithm back-propagation neural network (GA-BP), and support vector machine (SVM). It was shown that SVM with a radial basis function (RBF) kernel had better prediction accuracy for both the calibration set (94.3%) and the validation set (96.2%) than the other models. Relatively lower prediction abilities were observed for GA-BP (52.1%) and PLS (31.7%). In addition, the kernel function of SVM played an essential role in model training, as the prediction accuracy of SVM with a polynomial kernel function was only 32.9%. As a powerful multivariate statistics method, SVM holds great potential to assess beer quality. Copyright © 2014 Elsevier Ltd. All rights reserved.
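
    A hedged sketch of this modelling approach, using an RBF-kernel SVM on synthetic flavour-compound data; the study's actual features, preprocessing and accuracy metric are not reproduced here:

      # Sketch: predict a sensory class from flavour-compound concentrations
      # with an RBF-kernel SVM (synthetic data, illustrative only).
      import numpy as np
      from sklearn.svm import SVC
      from sklearn.preprocessing import StandardScaler
      from sklearn.pipeline import make_pipeline
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(2)
      X = rng.normal(size=(200, 6))                 # e.g. higher alcohols and esters
      y = (X[:, 0] + 0.5 * X[:, 1] ** 2 > 0.5).astype(int)   # synthetic sensory grade

      X_cal, X_val, y_cal, y_val = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)
      model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
      model.fit(X_cal, y_cal)
      print("calibration:", model.score(X_cal, y_cal),
            "validation:", model.score(X_val, y_val))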

  14. Fine-scale landscape genetics of the American badger (Taxidea taxus): disentangling landscape effects and sampling artifacts in a poorly understood species

    PubMed Central

    Kierepka, E M; Latch, E K

    2016-01-01

    Landscape genetics is a powerful tool for conservation because it identifies landscape features that are important for maintaining genetic connectivity between populations within heterogeneous landscapes. However, using landscape genetics in poorly understood species presents a number of challenges, namely, limited life history information for the focal population and spatially biased sampling. Both obstacles can reduce statistical power, particularly in individual-based studies. In this study, we genotyped 233 American badgers in Wisconsin at 12 microsatellite loci to identify alternative statistical approaches that can be applied to poorly understood species in an individual-based framework. Badgers are protected in Wisconsin owing to an overall lack of life history information, so our study utilized partial redundancy analysis (RDA) and spatially lagged regressions to quantify how three landscape factors (Wisconsin River, Ecoregions and land cover) impacted gene flow. We also performed simulations to quantify errors created by spatially biased sampling. Statistical analyses first found that geographic distance was an important influence on gene flow, mainly driven by fine-scale positive spatial autocorrelations. After controlling for geographic distance, both RDA and regressions found that the Wisconsin River and Agriculture were correlated with genetic differentiation. However, only Agriculture had an acceptable type I error rate (3–5%) to be considered biologically relevant. Collectively, this study highlights the benefits of combining robust statistics and error assessment via simulations and provides a method for hypothesis testing in individual-based landscape genetics. PMID:26243136

  15. A Bayesian Approach to the Overlap Analysis of Epidemiologically Linked Traits.

    PubMed

    Asimit, Jennifer L; Panoutsopoulou, Kalliope; Wheeler, Eleanor; Berndt, Sonja I; Cordell, Heather J; Morris, Andrew P; Zeggini, Eleftheria; Barroso, Inês

    2015-12-01

    Diseases co-occur in individuals more often than expected by chance, which may be explained by shared underlying genetic etiology. A common approach to genetic overlap analyses is to use summary genome-wide association study data to identify single-nucleotide polymorphisms (SNPs) that are associated with multiple traits at a selected P-value threshold. However, P-values do not account for differences in power, whereas Bayes' factors (BFs) do, and may be approximated using summary statistics. We use simulation studies to compare the power of frequentist and Bayesian approaches to overlap analyses, and to decide on appropriate thresholds for comparison between the two methods. It is empirically illustrated that BFs have the advantage over P-values of a decreasing type I error rate as study size increases for single-disease associations. Consequently, the overlap analysis of traits from different-sized studies encounters issues in fair P-value threshold selection, whereas BFs are adjusted automatically. Extensive simulations show that Bayesian overlap analyses tend to have higher power than those that assess association strength with P-values, particularly in low-power scenarios. Calibration tables between BFs and P-values are provided for a range of sample sizes, as well as an approximation approach for sample sizes that are not in the calibration table. Although P-values are sometimes thought more intuitive, these tables assist in removing the opaqueness of Bayesian thresholds and may also be used in the selection of a BF threshold to meet a certain type I error rate. An application of our methods is used to identify variants associated with both obesity and osteoarthritis. © 2015 The Authors. Genetic Epidemiology published by Wiley Periodicals, Inc.
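
    One widely used way to approximate a Bayes factor from GWAS summary statistics is Wakefield's approximation. The sketch below is illustrative and is not necessarily the exact form the authors used; the prior variance W of the effect is an assumption:

      # Wakefield-style approximate Bayes factor from summary statistics.
      # Larger values = more support for association (H1 over H0).
      import numpy as np

      def approx_bf(beta, se, W=0.04):     # W = 0.2**2, assumed effect-size prior
          V = se ** 2
          z2 = (beta / se) ** 2
          return np.sqrt(V / (V + W)) * np.exp(z2 * W / (2 * (V + W)))

      print(approx_bf(beta=0.1, se=0.02))  # well-powered, strongly associated SNP
      print(approx_bf(beta=0.1, se=0.08))  # same estimate from an underpowered study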

  16. The Statistical Power of Planned Comparisons.

    ERIC Educational Resources Information Center

    Benton, Roberta L.

    Basic principles underlying statistical power are examined; and issues pertaining to effect size, sample size, error variance, and significance level are highlighted via the use of specific hypothetical examples. Analysis of variance (ANOVA) and related methods remain popular, although other procedures sometimes have more statistical power against…

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shirazi, M.A.; Davis, L.R.

    To obtain improved prediction of heated plume characteristics from a surface jet, an integral analysis computer model was modified and a comprehensive set of field and laboratory data available from the literature was gathered, analyzed, and correlated for estimating the magnitude of certain coefficients that are normally introduced in these analyses to achieve closure. The parameters so estimated include the coefficients for entrainment, turbulent exchange, drag, and shear. Since there appeared considerable scatter in the data, even after appropriate subgrouping to narrow the influence of various flow conditions on the data, only statistical procedures could be applied to find the best fit. This and other analyses of its type have been widely used in industry and government for the prediction of thermal plumes from steam power plants. Although the present model has many shortcomings, a recent independent and exhaustive assessment of such predictions revealed that in comparison with other analyses of its type the present analysis predicts the field situations more successfully.

  18. Quality control and conduct of genome-wide association meta-analyses.

    PubMed

    Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Mägi, Reedik; Ferreira, Teresa; Fall, Tove; Graff, Mariaelisa; Justice, Anne E; Luan, Jian'an; Gustafsson, Stefan; Randall, Joshua C; Vedantam, Sailaja; Workalemahu, Tsegaselassie; Kilpeläinen, Tuomas O; Scherag, André; Esko, Tonu; Kutalik, Zoltán; Heid, Iris M; Loos, Ruth J F

    2014-05-01

    Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for (i) organizational aspects of GWAMAs, and for (ii) QC at the study file level, the meta-level across studies and the meta-analysis output level. Real-world examples highlight issues experienced and solutions developed by the GIANT Consortium that has conducted meta-analyses including data from 125 studies comprising more than 330,000 individuals. We provide a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data. We also include details for the use of a powerful and flexible software package called EasyQC. Precise timings will be greatly influenced by consortium size. For consortia of comparable size to the GIANT Consortium, this protocol takes a minimum of about 10 months to complete.

  19. Quality control and conduct of genome-wide association meta-analyses

    PubMed Central

    Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Mägi, Reedik; Ferreira, Teresa; Fall, Tove; Graff, Mariaelisa; Justice, Anne E; Luan, Jian'an; Gustafsson, Stefan; Randall, Joshua C; Vedantam, Sailaja; Workalemahu, Tsegaselassie; Kilpeläinen, Tuomas O; Scherag, André; Esko, Tonu; Kutalik, Zoltán; Heid, Iris M; Loos, Ruth JF

    2014-01-01

    Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for (1) organizational aspects of GWAMAs, and for (2) QC at the study file level, the meta-level across studies, and the meta-analysis output level. Real-world examples highlight issues experienced and solutions developed by the GIANT Consortium that has conducted meta-analyses including data from 125 studies comprising more than 330,000 individuals. We provide a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data. We also include details for use of a powerful and flexible software package called EasyQC. For consortia of comparable size to the GIANT consortium, the present protocol takes a minimum of about 10 months to complete. PMID:24762786

  20. Power-law ansatz in complex systems: Excessive loss of information.

    PubMed

    Tsai, Sun-Ting; Chang, Chin-De; Chang, Ching-Hao; Tsai, Meng-Xue; Hsu, Nan-Jung; Hong, Tzay-Ming

    2015-12-01

    The ubiquity of power-law relations in empirical data reflects physicists' love of simple laws and of uncovering common causes among seemingly unrelated phenomena. However, many reported power laws lack statistical support and mechanistic backing, and discrepancies with real data are often explained away as corrections due to finite size or other variables. We propose a simple experiment and rigorous statistical procedures to look into these issues. Making use of the fact that the occurrence rate and pulse intensity of crumple sound obey a power law with an exponent that varies with material, we simulate a complex system with two driving mechanisms by crumpling two different sheets together. The probability function of the crumple sound is found to transition from two power-law terms to a bona fide power law as compaction increases. In addition to showing the vicinity of these two distributions in the phase space, this observation nicely demonstrates the effect of interactions in bringing about a subtle change in macroscopic behavior, and suggests that more information may be retrieved if the data are subjected to sorting. Our analyses are based on the Akaike information criterion, which is a direct measurement of information loss and emphasizes the need to strike a balance between model simplicity and goodness of fit. As a show of force, the Akaike information criterion also found the Gutenberg-Richter law for earthquakes and the scale-free model for a brain functional network, a two-dimensional sandpile, and solar flare intensity to suffer an excessive loss of information. They resemble more the crumpled-together ball at low compactions in that there appear to be two driving mechanisms that take turns occurring.
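
    As a sketch of the AIC-based comparison described, assuming a synthetic sample rather than the crumpling data: fit a continuous power law and a competing tail model by maximum likelihood and compare AIC = 2k - 2 ln L:

      # Compare a pure power law against an alternative tail model via AIC.
      import numpy as np

      rng = np.random.default_rng(3)
      xmin = 1.0
      x = xmin * (rng.pareto(1.5, 2000) + 1.0)     # synthetic power-law sample

      # MLE for a continuous power law p(x) = (alpha-1)/xmin * (x/xmin)^-alpha
      alpha = 1.0 + x.size / np.sum(np.log(x / xmin))
      ll_pl = x.size * np.log((alpha - 1) / xmin) - alpha * np.sum(np.log(x / xmin))

      # MLE for a shifted exponential p(x) = lam * exp(-lam * (x - xmin))
      lam = 1.0 / np.mean(x - xmin)
      ll_exp = x.size * np.log(lam) - lam * np.sum(x - xmin)

      aic_pl, aic_exp = 2 * 1 - 2 * ll_pl, 2 * 1 - 2 * ll_exp   # k = 1 each
      print("AIC power law:", round(aic_pl, 1), " AIC exponential:", round(aic_exp, 1))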

  1. The power to detect linkage in complex disease by means of simple LOD-score analyses.

    PubMed Central

    Greenberg, D A; Abreu, P; Hodge, S E

    1998-01-01

    Maximum-likelihood analysis (via LOD score) provides the most powerful method for finding linkage when the mode of inheritance (MOI) is known. However, because one must assume an MOI, the application of LOD-score analysis to complex disease has been questioned. Although it is known that one can legitimately maximize the maximum LOD score with respect to genetic parameters, this approach raises three concerns: (1) multiple testing, (2) effect on power to detect linkage, and (3) adequacy of the approximate MOI for the true MOI. We evaluated the power of LOD scores to detect linkage when the true MOI was complex but a LOD score analysis assumed simple models. We simulated data from 14 different genetic models, including dominant and recessive at high (80%) and low (20%) penetrances, intermediate models, and several additive two-locus models. We calculated LOD scores by assuming two simple models, dominant and recessive, each with 50% penetrance, then took the higher of the two LOD scores as the raw test statistic and corrected for multiple tests. We call this test statistic "MMLS-C." We found that the ELODs for MMLS-C are >=80% of the ELOD under the true model when the ELOD for the true model is >=3. Similarly, the power to reach a given LOD score was usually >=80% that of the true model, when the power under the true model was >=60%. These results underscore that a critical factor in LOD-score analysis is the MOI at the linked locus, not that of the disease or trait per se. Thus, a limited set of simple genetic models in LOD-score analysis can work well in testing for linkage. PMID:9718328

  2. The power to detect linkage in complex disease by means of simple LOD-score analyses.

    PubMed

    Greenberg, D A; Abreu, P; Hodge, S E

    1998-09-01

    Maximum-likelihood analysis (via LOD score) provides the most powerful method for finding linkage when the mode of inheritance (MOI) is known. However, because one must assume an MOI, the application of LOD-score analysis to complex disease has been questioned. Although it is known that one can legitimately maximize the maximum LOD score with respect to genetic parameters, this approach raises three concerns: (1) multiple testing, (2) effect on power to detect linkage, and (3) adequacy of the approximate MOI for the true MOI. We evaluated the power of LOD scores to detect linkage when the true MOI was complex but a LOD score analysis assumed simple models. We simulated data from 14 different genetic models, including dominant and recessive at high (80%) and low (20%) penetrances, intermediate models, and several additive two-locus models. We calculated LOD scores by assuming two simple models, dominant and recessive, each with 50% penetrance, then took the higher of the two LOD scores as the raw test statistic and corrected for multiple tests. We call this test statistic "MMLS-C." We found that the ELODs for MMLS-C are >=80% of the ELOD under the true model when the ELOD for the true model is >=3. Similarly, the power to reach a given LOD score was usually >=80% that of the true model, when the power under the true model was >=60%. These results underscore that a critical factor in LOD-score analysis is the MOI at the linked locus, not that of the disease or trait per se. Thus, a limited set of simple genetic models in LOD-score analysis can work well in testing for linkage.
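
    The multiple-testing correction at the heart of MMLS-C can be illustrated generically. In the sketch below, two correlated z-statistics stand in for the dominant- and recessive-model LOD scores, and simulation under the null calibrates the maximum of the two; this illustrates only the correction idea, not the authors' actual LOD-score machinery:

      # Taking the larger of two model-specific statistics inflates the test size;
      # a null simulation recovers the corrected critical value.
      import numpy as np

      rng = np.random.default_rng(4)
      n_sim = 20000
      crit_single = 1.96                             # two-sided 5% for one test

      z1 = rng.normal(size=n_sim)                    # "dominant-model" statistic, H0 true
      z2 = 0.6 * z1 + 0.8 * rng.normal(size=n_sim)   # correlated "recessive" statistic
      zmax = np.maximum(np.abs(z1), np.abs(z2))

      print("uncorrected size:", np.mean(zmax > crit_single))   # exceeds 0.05
      print("corrected critical value:", round(np.quantile(zmax, 0.95), 2))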

  3. TATES: Efficient Multivariate Genotype-Phenotype Analysis for Genome-Wide Association Studies

    PubMed Central

    van der Sluis, Sophie; Posthuma, Danielle; Dolan, Conor V.

    2013-01-01

    To date, the genome-wide association study (GWAS) is the primary tool to identify genetic variants that cause phenotypic variation. As GWAS analyses are generally univariate in nature, multivariate phenotypic information is usually reduced to a single composite score. This practice often results in loss of statistical power to detect causal variants. Multivariate genotype–phenotype methods do exist but attain maximal power only in special circumstances. Here, we present a new multivariate method that we refer to as TATES (Trait-based Association Test that uses Extended Simes procedure), inspired by the GATES procedure proposed by Li et al (2011). For each component of a multivariate trait, TATES combines p-values obtained in standard univariate GWAS to acquire one trait-based p-value, while correcting for correlations between components. Extensive simulations, probing a wide variety of genotype–phenotype models, show that TATES's false positive rate is correct, and that TATES's statistical power to detect causal variants explaining 0.5% of the variance can be 2.5–9 times higher than the power of univariate tests based on composite scores and 1.5–2 times higher than the power of the standard MANOVA. Unlike other multivariate methods, TATES detects both genetic variants that are common to multiple phenotypes and genetic variants that are specific to a single phenotype, i.e. TATES provides a more complete view of the genetic architecture of complex traits. As the actual causal genotype–phenotype model is usually unknown and probably phenotypically and genetically complex, TATES, available as an open source program, constitutes a powerful new multivariate strategy that allows researchers to identify novel causal variants, while the complexity of traits is no longer a limiting factor. PMID:23359524
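
    The p-value combination underlying GATES/TATES is a Simes-type procedure. The sketch below shows plain Simes; TATES additionally replaces the raw test counts with effective numbers of tests derived from the phenotype correlation matrix, which is omitted here:

      # Simes combination of per-component univariate GWAS p-values for one SNP.
      import numpy as np

      def simes(pvals):
          p = np.sort(np.asarray(pvals))
          m = p.size
          return np.min(m * p / np.arange(1, m + 1))   # min over ordered p-values

      print(simes([0.001, 0.04, 0.20, 0.70]))          # trait-based p-value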

  4. [Basic concepts for network meta-analysis].

    PubMed

    Catalá-López, Ferrán; Tobías, Aurelio; Roqué, Marta

    2014-12-01

    Systematic reviews and meta-analyses have long been fundamental tools for evidence-based clinical practice. Initially, meta-analyses were proposed as a technique that could improve the accuracy and the statistical power of previous research from individual studies with small sample sizes. However, one of their main limitations has been that no more than two treatments can be compared in a single analysis, even when the clinical research question necessitates comparing multiple interventions. Network meta-analysis (NMA) uses novel statistical methods that incorporate information from both direct and indirect treatment comparisons in a network of studies examining the effects of various competing treatments, estimating comparisons between many treatments in a single analysis. Despite its potential limitations, NMA applications in clinical epidemiology can be of great value in situations where several treatments have been compared against a common comparator. Also, NMA can be relevant to a research or clinical question when many treatments must be considered or when there is a mix of both direct and indirect information in the body of evidence. Copyright © 2013 Elsevier España, S.L.U. All rights reserved.

  5. Moderate quality evidence finds statistical benefit in oral health for powered over manual toothbrushes.

    PubMed

    Niederman, Richard

    2014-09-01

    The Cochrane Oral Health Group's Trials Register, the Cochrane Central Register of Controlled Trials (CENTRAL), Medline, Embase, CINAHL, National Institutes of Health Trials Register and the WHO Clinical Trials Registry Platform for ongoing trials. Reference lists of identified articles were also scanned for relevant papers. Identified manufacturers were contacted for additional information. Only randomised controlled trials comparing manual and powered toothbrushes were considered. Crossover trials were eligible for inclusion if the wash-out period length was more than two weeks. Study assessment and data extraction were carried out independently by at least two reviewers. The primary outcome measures were quantified levels of plaque or gingivitis. Risk of bias assessment was undertaken. Standard Cochrane methodological approaches were taken. Random-effects models were used provided there were four or more studies included in the meta-analysis, otherwise fixed-effect models were used. Data were classed as short term (one to three months) and long term (greater than three months). Fifty-six trials were included with 51 (4624 patients) providing data for meta-analysis. The majority (46) were at unclear risk of bias, five at high risk of bias and five at low risk. There was moderate quality evidence that powered toothbrushes provide a statistically significant benefit compared with manual toothbrushes with regard to the reduction of plaque in both the short and long term. This corresponds to an 11% reduction in plaque for the Quigley Hein index (Turesky) in the short term and a 21% reduction in the long term. There was a high degree of heterogeneity that was not explained by the different powered toothbrush type subgroups. There was also moderate quality evidence that powered toothbrushes again provide a statistically significant reduction in gingivitis when compared with manual toothbrushes both in the short and long term. This corresponds to a 6% and 11% reduction in gingivitis for the Löe and Silness indices respectively. Again there was a high degree of heterogeneity that was not explained by the different powered toothbrush type subgroups. The greatest body of evidence was for rotation oscillation brushes which demonstrated a statistically significant reduction in plaque and gingivitis at both time points. Powered toothbrushes reduce plaque and gingivitis more than manual toothbrushing in the short and long term. The clinical importance of these findings remains unclear. Observation of methodological guidelines and greater standardisation of design would benefit both future trials and meta-analyses. Cost, reliability and side effects were inconsistently reported. Any reported side effects were localised and only temporary.

  6. On the brain structure heterogeneity of autism: Parsing out acquisition site effects with significance-weighted principal component analysis.

    PubMed

    Martinez-Murcia, Francisco Jesús; Lai, Meng-Chuan; Górriz, Juan Manuel; Ramírez, Javier; Young, Adam M H; Deoni, Sean C L; Ecker, Christine; Lombardo, Michael V; Baron-Cohen, Simon; Murphy, Declan G M; Bullmore, Edward T; Suckling, John

    2017-03-01

    Neuroimaging studies have reported structural and physiological differences that could help understand the causes and development of Autism Spectrum Disorder (ASD). Many of them rely on multisite designs, with the recruitment of larger samples increasing statistical power. However, recent large-scale studies have put some findings into question, considering the results to be strongly dependent on the database used, and demonstrating the substantial heterogeneity within this clinically defined category. One major source of variance may be the acquisition of the data in multiple centres. In this work we analysed the differences found in the multisite, multi-modal neuroimaging database from the UK Medical Research Council Autism Imaging Multicentre Study (MRC AIMS) in terms of both diagnosis and acquisition sites. Since the dissimilarities between sites were higher than between diagnostic groups, we developed a technique called Significance Weighted Principal Component Analysis (SWPCA) to reduce the undesired intensity variance due to acquisition site and to increase the statistical power in detecting group differences. After eliminating site-related variance, statistically significant group differences were found, including Broca's area and the temporo-parietal junction. However, discriminative power was not sufficient to classify diagnostic groups, yielding accuracies close to random. Our work supports recent claims that ASD is a highly heterogeneous condition that is difficult to characterize globally by neuroimaging, and therefore different (and more homogeneous) subgroups should be defined to obtain a deeper understanding of ASD. Hum Brain Mapp 38:1208-1223, 2017. © 2016 Wiley Periodicals, Inc.

  7. Identification of the Best Anthropometric Predictors of Serum High- and Low-Density Lipoproteins Using Machine Learning.

    PubMed

    Lee, Bum Ju; Kim, Jong Yeol

    2015-09-01

    Serum high-density lipoprotein (HDL) and low-density lipoprotein (LDL) cholesterol levels are associated with risk factors for various diseases and are related to anthropometric measures. However, controversy remains regarding the best anthropometric indicators of the HDL and LDL cholesterol levels. The objectives of this study were to identify the best predictors of HDL and LDL cholesterol using statistical analyses and two machine learning algorithms and to compare the predictive power of combined anthropometric measures in Korean adults. A total of 13,014 subjects participated in this study. The anthropometric measures were assessed with binary logistic regression (LR) to evaluate statistically significant differences between the subjects with normal and high LDL cholesterol levels and between the subjects with normal and low HDL cholesterol levels. LR and the naive Bayes algorithm (NB), which provides more reasonable and reliable results, were used in the analyses of the predictive power of individual and combined measures. The best predictor of HDL was the rib to hip ratio (p < 0.0001; odds ratio (OR) = 1.895; area under curve (AUC) = 0.681) in women and the waist to hip ratio (WHR) (p < 0.0001; OR = 1.624; AUC = 0.633) in men. In women, the strongest indicator of LDL was age (p < 0.0001; OR = 1.662; AUC by NB = 0.653; AUC by LR = 0.636). Among the anthropometric measures, the body mass index (BMI), WHR, forehead to waist ratio, forehead to rib ratio, and forehead to chest ratio were the strongest predictors of LDL; these measures had similar predictive powers. The strongest predictor in men was BMI (p < 0.0001; OR = 1.369; AUC by NB = 0.594; AUC by LR = 0.595). The predictive power of almost all individual anthropometric measures was higher for HDL than for LDL, and the predictive power for both HDL and LDL was higher in women than in men. A combination of anthropometric measures slightly improved the predictive power for both HDL and LDL cholesterol. The best indicator for HDL and LDL might differ according to the type of cholesterol and the gender. In women, but not men, age was the variable that strongly predicted HDL and LDL cholesterol levels. Our findings provide new information for the development of better initial screening tools for HDL and LDL cholesterol.
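
    A hedged sketch of the evaluation loop described, i.e. logistic regression on a single anthropometric predictor summarised by an odds ratio and AUC, on synthetic data with made-up coefficients:

      # Logistic regression of a binary lipid outcome on one anthropometric
      # predictor; report OR per 1 SD and AUC (synthetic data, illustrative).
      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_auc_score

      rng = np.random.default_rng(5)
      whr = rng.normal(0.9, 0.08, 2000)                 # waist-to-hip ratio
      p_true = 1 / (1 + np.exp(-(-14 + 15 * whr)))      # assumed true risk model
      low_hdl = rng.binomial(1, p_true)

      X = whr.reshape(-1, 1)
      fit = LogisticRegression(C=1e6, max_iter=1000).fit(X, low_hdl)
      auc = roc_auc_score(low_hdl, fit.predict_proba(X)[:, 1])
      odds_ratio = np.exp(fit.coef_[0][0] * 0.08)       # OR per 1 SD of WHR
      print(f"OR per SD = {odds_ratio:.2f}, AUC = {auc:.3f}")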

  8. Statistical modelling for recurrent events: an application to sports injuries

    PubMed Central

    Ullah, Shahid; Gabbett, Tim J; Finch, Caroline F

    2014-01-01

    Background Injuries are often recurrent, with subsequent injuries influenced by previous occurrences and hence correlation between events needs to be taken into account when analysing such data. Objective This paper compares five different survival models (Cox proportional hazards (CoxPH) model and the following generalisations to recurrent event data: Andersen-Gill (A-G), frailty, Wei-Lin-Weissfeld total time (WLW-TT) marginal, Prentice-Williams-Peterson gap time (PWP-GT) conditional models) for the analysis of recurrent injury data. Methods Empirical evaluation and comparison of different models were performed using model selection criteria and goodness-of-fit statistics. Simulation studies assessed the size and power of each model fit. Results The modelling approach is demonstrated through direct application to Australian National Rugby League recurrent injury data collected over the 2008 playing season. Of the 35 players analysed, 14 (40%) players had more than 1 injury and 47 contact injuries were sustained over 29 matches. The CoxPH model provided the poorest fit to the recurrent sports injury data. The fit was improved with the A-G and frailty models, compared to WLW-TT and PWP-GT models. Conclusions Despite little difference in model fit between the A-G and frailty models, in the interest of fewer statistical assumptions it is recommended that, where relevant, future studies involving modelling of recurrent sports injury data use the frailty model in preference to the CoxPH model or its other generalisations. The paper provides a rationale for future statistical modelling approaches for recurrent sports injury. PMID:22872683

  9. Statistical power to detect violation of the proportional hazards assumption when using the Cox regression model.

    PubMed

    Austin, Peter C

    2018-01-01

    The use of the Cox proportional hazards regression model is widespread. A key assumption of the model is that of proportional hazards. Analysts frequently test the validity of this assumption using statistical significance testing. However, the statistical power of such assessments is frequently unknown. We used Monte Carlo simulations to estimate the statistical power of two different methods for detecting violations of this assumption. When the covariate was binary, we found that a model-based method had greater power than a method based on cumulative sums of martingale residuals. Furthermore, the parametric nature of the distribution of event times had an impact on power when the covariate was binary. Statistical power to detect a strong violation of the proportional hazards assumption was low to moderate even when the number of observed events was high. In many data sets, power to detect a violation of this assumption is likely to be low to modest.

  10. Statistical power to detect violation of the proportional hazards assumption when using the Cox regression model

    PubMed Central

    Austin, Peter C.

    2017-01-01

    The use of the Cox proportional hazards regression model is widespread. A key assumption of the model is that of proportional hazards. Analysts frequently test the validity of this assumption using statistical significance testing. However, the statistical power of such assessments is frequently unknown. We used Monte Carlo simulations to estimate the statistical power of two different methods for detecting violations of this assumption. When the covariate was binary, we found that a model-based method had greater power than a method based on cumulative sums of martingale residuals. Furthermore, the parametric nature of the distribution of event times had an impact on power when the covariate was binary. Statistical power to detect a strong violation of the proportional hazards assumption was low to moderate even when the number of observed events was high. In many data sets, power to detect a violation of this assumption is likely to be low to modest. PMID:29321694
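
    The simulation design can be sketched compactly: generate data with a genuine proportional hazards violation (here induced by giving the two groups different Weibull shapes, an assumption chosen for illustration), test the assumption in each replicate, and count rejections. The sketch requires the lifelines package; sample sizes and effect sizes are illustrative:

      # Monte Carlo estimate of the power of a proportional-hazards test.
      import numpy as np
      import pandas as pd
      from lifelines import CoxPHFitter
      from lifelines.statistics import proportional_hazard_test

      rng = np.random.default_rng(6)
      n_sim, n, rejections = 200, 400, 0

      for _ in range(n_sim):
          x = rng.integers(0, 2, n)
          shape = np.where(x == 0, 1.0, 1.8)       # different shapes => PH violated
          t = rng.weibull(shape) * np.where(x == 0, 1.0, 0.9)
          c = rng.exponential(2.0, n)              # independent censoring times
          df = pd.DataFrame({"T": np.minimum(t, c), "E": (t <= c).astype(int), "x": x})
          cph = CoxPHFitter().fit(df, duration_col="T", event_col="E")
          res = proportional_hazard_test(cph, df, time_transform="rank")
          rejections += np.atleast_1d(res.p_value)[0] < 0.05

      print("estimated power:", rejections / n_sim)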

  11. The Power of Neuroimaging Biomarkers for Screening Frontotemporal Dementia

    PubMed Central

    McMillan, Corey T.; Avants, Brian B.; Cook, Philip; Ungar, Lyle; Trojanowski, John Q.; Grossman, Murray

    2014-01-01

    Frontotemporal dementia (FTD) is a clinically and pathologically heterogeneous neurodegenerative disease that can result from either frontotemporal lobar degeneration (FTLD) or Alzheimer’s disease (AD) pathology. It is critical to establish statistically powerful biomarkers that can achieve substantial cost-savings and increase feasibility of clinical trials. We assessed three broad categories of neuroimaging methods to screen underlying FTLD and AD pathology in a clinical FTD series: global measures (e.g., ventricular volume), anatomical volumes of interest (VOIs) (e.g., hippocampus) using a standard atlas, and data-driven VOIs using Eigenanatomy. We evaluated clinical FTD patients (N=93) with cerebrospinal fluid, gray matter (GM) MRI, and diffusion tensor imaging (DTI) to assess whether they had underlying FTLD or AD pathology. Linear regression was performed to identify the optimal VOIs for each method in a training dataset and then we evaluated classification sensitivity and specificity in an independent test cohort. Power was evaluated by calculating minimum sample sizes (mSS) required in the test classification analyses for each model. The data-driven VOI analysis using a multimodal combination of GM MRI and DTI achieved the greatest classification accuracy (89% sensitive; 89% specific) and required a lower minimum sample size (N=26) relative to anatomical VOI and global measures. We conclude that a data-driven VOI approach employing Eigenanatomy provides more accurate classification, benefits from increased statistical power in unseen datasets, and therefore provides a robust method for screening underlying pathology in FTD patients for entry into clinical trials. PMID:24687814

  12. The problem is not just sample size: the consequences of low base rates in policing experiments in smaller cities.

    PubMed

    Hinkle, Joshua C; Weisburd, David; Famega, Christine; Ready, Justin

    2013-01-01

    Hot spots policing is one of the most influential police innovations, with a strong body of experimental research showing it to be effective in reducing crime and disorder. However, most studies have been conducted in major cities, and we thus know little about whether it is effective in smaller cities, which account for a majority of police agencies. The lack of experimental studies in smaller cities is likely in part due to challenges designing statistically powerful tests in such contexts. The current article explores the challenges of statistical power and "noise" resulting from low base rates of crime in smaller cities and provides suggestions for future evaluations to overcome these limitations. Data from a randomized experimental evaluation of broken windows policing in hot spots are used to illustrate the challenges that low base rates present for evaluating hot spots policing programs in smaller cities. Analyses show low base rates make it difficult to detect treatment effects. Very large effect sizes would be required to reach sufficient power, and random fluctuations around low base rates make detecting treatment effects difficult, irrespective of power, by masking differences between treatment and control groups. Low base rates present strong challenges to researchers attempting to evaluate hot spots policing in smaller cities. As such, base rates must be taken directly into account when designing experimental evaluations. The article offers suggestions for researchers attempting to expand the examination of hot spots policing and other microplace-based interventions to smaller jurisdictions.

  13. On damage detection in wind turbine gearboxes using outlier analysis

    NASA Astrophysics Data System (ADS)

    Antoniadou, Ifigeneia; Manson, Graeme; Dervilis, Nikolaos; Staszewski, Wieslaw J.; Worden, Keith

    2012-04-01

    The proportion of installed wind power in power systems worldwide has increased over the years as a result of the steadily growing interest in renewable energy sources. Still, the advantages offered by the use of wind power are overshadowed by high operational and maintenance costs, resulting in the low competitiveness of wind power in the energy market. In order to reduce the costs of corrective maintenance, the application of condition monitoring to gearboxes becomes highly important, since gearboxes are among the wind turbine components with the most frequent failure observations. While condition monitoring of gearboxes in general is common practice, with various methods having been developed over the last few decades, wind turbine gearbox condition monitoring faces a major challenge: the detection of faults under the time-varying load conditions prevailing in wind turbine systems. Classical time and frequency domain methods fail to detect faults under variable load conditions, due to the temporary effect that these faults have on vibration signals. This paper uses the statistical discipline of outlier analysis for the damage detection of gearbox tooth faults. A simplified two-degree-of-freedom gearbox model considering nonlinear backlash, time-periodic mesh stiffness and static transmission error simulates the vibration signals to be analysed. Local stiffness reduction is used to simulate tooth faults, and statistical procedures determine the existence of intermittencies. The lowest level of fault detection, the threshold value, is considered, and the Mahalanobis squared-distance is calculated for the novelty detection problem.
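
    A minimal sketch of the outlier-analysis step described, assuming synthetic baseline feature vectors rather than the gearbox-model signals: baseline (healthy-condition) data define a Mahalanobis squared-distance threshold, and new observations above it are flagged as novel:

      # Mahalanobis-distance novelty detection on synthetic features.
      import numpy as np

      rng = np.random.default_rng(7)
      baseline = rng.normal(size=(500, 3))              # healthy-condition features
      mu = baseline.mean(axis=0)
      cov_inv = np.linalg.inv(np.cov(baseline, rowvar=False))

      def mahalanobis_sq(x):
          d = x - mu
          return d @ cov_inv @ d

      # Empirical threshold: 99th percentile of baseline distances
      d2 = np.array([mahalanobis_sq(row) for row in baseline])
      threshold = np.quantile(d2, 0.99)

      test = np.array([4.0, 0.0, 0.0])        # hypothetical faulty-tooth feature vector
      print(mahalanobis_sq(test) > threshold)  # True => flagged as an outlier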

  14. Explorations in Statistics: Power

    ERIC Educational Resources Information Center

    Curran-Everett, Douglas

    2010-01-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This fifth installment of "Explorations in Statistics" revisits power, a concept fundamental to the test of a null hypothesis. Power is the probability that we reject the null hypothesis when it is false. Four…

  15. An Examination of Statistical Power in Multigroup Dynamic Structural Equation Models

    ERIC Educational Resources Information Center

    Prindle, John J.; McArdle, John J.

    2012-01-01

    This study used statistical simulation to calculate differential statistical power in dynamic structural equation models with groups (as in McArdle & Prindle, 2008). Patterns of between-group differences were simulated to provide insight into how model parameters influence power approximations. Chi-square and root mean square error of…

  16. Publication bias was not a good reason to discourage trials with low power.

    PubMed

    Borm, George F; den Heijer, Martin; Zielhuis, Gerhard A

    2009-01-01

    The objective was to investigate whether it is justified to discourage trials with less than 80% power. Trials with low power are unlikely to produce conclusive results, but their findings can be used by pooling them in a meta-analysis. However, such an analysis may be biased, because trials with low power are likely to have a nonsignificant result and are less likely to be published than trials with a statistically significant outcome. We simulated several series of studies with varying degrees of publication bias and then calculated the "real" one-sided type I error and the bias of meta-analyses with a "nominal" error rate (significance level) of 2.5%. In single trials, in which heterogeneity was set at zero, low, and high, the error rates were 2.3%, 4.7%, and 16.5%, respectively. In multiple trials with 80%-90% power and a publication rate of 90% when the results were nonsignificant, the error rates could be as high as 5.1%. When the power was 50% and the publication rate of non-significant results was 60%, the error rates did not exceed 5.3%, whereas the bias was at most 15% of the difference used in the power calculation. The impact of publication bias does not warrant the exclusion of trials with 50% power.
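
    The simulation logic can be sketched as follows, with illustrative parameters that do not reproduce the paper's exact design: trials with a true null effect are selectively published, meta-analysed with a fixed-effect pool, and the realised one-sided type I error is compared with the nominal 2.5%:

      # Publication-bias simulation: selective publication inflates the type I
      # error of a fixed-effect meta-analysis (illustrative parameters).
      import numpy as np

      rng = np.random.default_rng(8)
      n_meta, trials_per_meta = 5000, 10
      pub_rate_nonsig = 0.6        # chance a nonsignificant trial is published
      false_pos = 0

      for _ in range(n_meta):
          z = rng.normal(size=trials_per_meta)        # true effect = 0
          sig = z > 1.96                              # "significant" in the favoured direction
          published = sig | (rng.random(trials_per_meta) < pub_rate_nonsig)
          zp = z[published]
          if zp.size == 0:
              continue
          z_pooled = zp.sum() / np.sqrt(zp.size)      # fixed-effect pooled z
          false_pos += z_pooled > 1.96                # nominal one-sided 2.5%

      print("realised one-sided type I error:", false_pos / n_meta)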

  17. Electric power annual 1992

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    The Electric Power Annual presents a summary of electric utility statistics at national, regional and State levels. The objective of the publication is to provide industry decisionmakers, government policymakers, analysts and the general public with historical data that may be used in understanding US electricity markets. The Electric Power Annual is prepared by the Survey Management Division; Office of Coal, Nuclear, Electric and Alternate Fuels; Energy Information Administration (EIA); US Department of Energy. "The US Electric Power Industry at a Glance" section presents a profile of the electric power industry ownership and performance, and a review of key statistics for the year. Subsequent sections present data on generating capability, including proposed capability additions; net generation; fossil-fuel statistics; retail sales; revenue; financial statistics; environmental statistics; electric power transactions; demand-side management; and nonutility power producers. In addition, the appendices provide supplemental data on major disturbances and unusual occurrences in US electric power systems. Each section contains related text and tables and refers the reader to the appropriate publication that contains more detailed data on the subject matter. Monetary values in this publication are expressed in nominal terms.

  18. Reporting of Positive Results in Randomized Controlled Trials of Mindfulness-Based Mental Health Interventions.

    PubMed

    Coronado-Montoya, Stephanie; Levis, Alexander W; Kwakkenbos, Linda; Steele, Russell J; Turner, Erick H; Thombs, Brett D

    2016-01-01

    A large proportion of mindfulness-based therapy trials report statistically significant results, even in the context of very low statistical power. The objective of the present study was to characterize the reporting of "positive" results in randomized controlled trials of mindfulness-based therapy. We also assessed mindfulness-based therapy trial registrations for indications of possible reporting bias and reviewed recent systematic reviews and meta-analyses to determine whether reporting biases were identified. CINAHL, Cochrane CENTRAL, EMBASE, ISI, MEDLINE, PsycInfo, and SCOPUS databases were searched for randomized controlled trials of mindfulness-based therapy. The number of positive trials was described and compared to the number that might be expected if mindfulness-based therapy were similarly effective compared to individual therapy for depression. Trial registries were searched for mindfulness-based therapy registrations. CINAHL, Cochrane CENTRAL, EMBASE, ISI, MEDLINE, PsycInfo, and SCOPUS were also searched for mindfulness-based therapy systematic reviews and meta-analyses. 108 (87%) of 124 published trials reported ≥1 positive outcome in the abstract, and 109 (88%) concluded that mindfulness-based therapy was effective; this is 1.6 times the number of positive trials expected under an effect size of d = 0.55 (expected number of positive trials = 65.7). Of 21 trial registrations, 13 (62%) remained unpublished 30 months post-trial completion. No trial registration adequately specified a single primary outcome measure with time of assessment. None of 36 systematic reviews and meta-analyses concluded that effect estimates were overestimated due to reporting biases. The proportion of mindfulness-based therapy trials with statistically significant results may overstate what would occur in practice.

  19. Homeopathy: meta-analyses of pooled clinical data.

    PubMed

    Hahn, Robert G

    2013-01-01

    In the first decade of the evidence-based era, which began in the mid-1990s, meta-analyses were used to scrutinize homeopathy for evidence of beneficial effects in medical conditions. In this review, meta-analyses including pooled data from placebo-controlled clinical trials of homeopathy, and the aftermath in the form of debate articles, were analyzed. In 1997 Klaus Linde and co-workers identified 89 clinical trials that showed an overall odds ratio of 2.45 in favor of homeopathy over placebo. There was a trend toward smaller benefit from studies of the highest quality, but the 10 trials with the highest Jadad score still showed homeopathy had a statistically significant effect. These results challenged academics to perform alternative analyses that, to demonstrate the lack of effect, relied on extensive exclusion of studies, often to the degree that conclusions were based on only 5-10% of the material, or on virtual data. The ultimate argument against homeopathy is the 'funnel plot' published by Aijing Shang's research group in 2005. However, the funnel plot is flawed when applied to a mixture of diseases, because studies with expected strong treatment effects are, for ethical reasons, powered lower than studies with expected weak or unclear treatment effects. To conclude that homeopathy lacks clinical effect, more than 90% of the available clinical trials had to be disregarded. Alternatively, flawed statistical methods had to be applied. Future meta-analyses should focus on the use of homeopathy in specific diseases or groups of diseases instead of pooling data from all clinical trials. © 2013 S. Karger GmbH, Freiburg.

  20. Human movement stochastic variability leads to diagnostic biomarkers In Autism Spectrum Disorders (ASD)

    NASA Astrophysics Data System (ADS)

    Wu, Di; Torres, Elizabeth B.; Jose, Jorge V.

    2015-03-01

    ASD is a spectrum of neurodevelopmental disorders. The high heterogeneity of the symptoms associated with the disorder impedes efficient diagnosis based on human observation. Recent advances in high-resolution MEMS wearable sensors enable accurate movement measurements that may escape the naked eye, and call for objective metrics to extract physiologically relevant information from the rapidly accumulating data. In this talk we'll discuss the statistical analysis of movement data continuously collected with high-resolution sensors at 240 Hz. We calculated statistical properties of speed fluctuations within the millisecond time range that closely correlate with the subjects' cognitive abilities. We computed the periodicity and synchronicity of the speed fluctuations from their power spectrum and ensemble-averaged two-point cross-correlation function. We built a two-parameter phase space from the temporal statistical analyses of the nearest-neighbor fluctuations that provided a quantitative biomarker for ASD and normal adult subjects and further classified ASD severity. We also found age-related developmental statistical signatures and potential ASD parental links in our movement dynamics studies. Our results may have direct clinical applications.

  1. Impact of malicious servers over trust and reputation models in wireless sensor networks

    NASA Astrophysics Data System (ADS)

    Verma, Vinod Kumar; Singh, Surinder; Pathak, N. P.

    2016-03-01

    This article deals with the impact of malicious servers on different trust and reputation models in wireless sensor networks. First, we analysed five trust and reputation models, namely BTRM-WSN, EigenTrust, PeerTrust, PowerTrust, and the linguistic fuzzy trust model. Further, we proposed a wireless sensor network design for optimisation of these models. Finally, the influence of malicious servers on the behaviour of the above-mentioned trust and reputation models is discussed. Statistical analysis has been carried out to prove the validity of our proposal.

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Merkley, Eric D.; Sego, Landon H.; Lin, Andy

    Adaptive processes in bacterial species can occur rapidly in laboratory culture, leading to genetic divergence between naturally occurring and laboratory-adapted strains. Differentiating wild and closely related laboratory strains is clearly important for biodefense and bioforensics; however, DNA sequence data alone have thus far not provided a clear signature, perhaps due to a lack of understanding of how diverse genome changes lead to adapted phenotypes. Protein abundance profiles from mass spectrometry-based proteomics analyses are a molecular measure of phenotype. Proteomics data contain sufficient information that powerful statistical methods can uncover signatures distinguishing wild strains of Yersinia pestis from laboratory-adapted strains.

  3. SPS market analysis

    NASA Astrophysics Data System (ADS)

    Goff, H. C.

    1980-05-01

    A market analysis task included personal interviews by GE personnel and supplemental mail surveys to acquire statistical data and to identify and measure the attitudes, reactions and intentions of prospective small solar thermal power system (SPS) users. Over 500 firms were contacted, including three ownership classes of electric utilities, industrial firms in the top SIC codes for energy consumption, and design engineering firms. A market demand model was developed which utilizes the database developed through personal interviews and surveys, together with projected energy price and consumption data, to perform sensitivity analyses and estimate potential markets for SPS.

  4. Designing Intervention Studies: Selected Populations, Range Restrictions, and Statistical Power

    ERIC Educational Resources Information Center

    Miciak, Jeremy; Taylor, W. Pat; Stuebing, Karla K.; Fletcher, Jack M.; Vaughn, Sharon

    2016-01-01

    An appropriate estimate of statistical power is critical for the design of intervention studies. Although the inclusion of a pretest covariate in the test of the primary outcome can increase statistical power, samples selected on the basis of pretest performance may demonstrate range restriction on the selection measure and other correlated…
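
    The power gain from a pretest covariate can be illustrated with a short sketch: adjusting for a covariate whose correlation with the outcome is rho shrinks the residual SD by sqrt(1 - rho^2), inflating the effective standardized effect size. All numbers below are illustrative assumptions, not values from the article.

    ```python
    # Sketch: gain in power from adding a pretest covariate. Adjusting for a
    # covariate with outcome correlation rho reduces residual variance by
    # (1 - rho**2). Illustrative numbers only.
    from statsmodels.stats.power import TTestIndPower

    calc = TTestIndPower()
    d, n_per_arm, rho = 0.30, 100, 0.7

    power_unadjusted = calc.power(effect_size=d, nobs1=n_per_arm, alpha=0.05)
    d_adjusted = d / (1 - rho**2) ** 0.5   # residual SD shrinks by sqrt(1-rho^2)
    power_adjusted = calc.power(effect_size=d_adjusted, nobs1=n_per_arm,
                                alpha=0.05)

    print(f"power without covariate: {power_unadjusted:.2f}")  # ~0.56
    print(f"power with covariate:    {power_adjusted:.2f}")    # ~0.84
    ```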

  5. The Importance of Teaching Power in Statistical Hypothesis Testing

    ERIC Educational Resources Information Center

    Olinsky, Alan; Schumacher, Phyllis; Quinn, John

    2012-01-01

    In this paper, we discuss the importance of teaching power considerations in statistical hypothesis testing. Statistical power analysis determines the ability of a study to detect a meaningful effect size, where the effect size is the difference between the hypothesized value of the population parameter under the null hypothesis and the true value…

  6. Statistics for NAEG: past efforts, new results, and future plans

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilbert, R.O.; Simpson, J.C.; Kinnison, R.R.

    A brief review of Nevada Applied Ecology Group (NAEG) objectives is followed by a summary of past statistical analyses conducted by Pacific Northwest Laboratory for the NAEG. Estimates of spatial pattern of radionuclides and other statistical analyses at NS's 201, 219 and 221 are reviewed as background for new analyses presented in this paper. Suggested NAEG activities and statistical analyses needed for the projected termination date of NAEG studies in March 1986 are given.

  7. Profiling the nucleobase and structure selectivity of anticancer drugs and other DNA alkylating agents by RNA sequencing.

    PubMed

    Gillingham, Dennis; Sauter, Basilius

    2018-05-06

    Drugs that covalently modify DNA are components of most chemotherapy regimens, often serving as first-line treatments. Classically, the chemical reactivity of DNA alkylators has been determined in vitro with short oligonucleotides. Here we use next-generation RNA sequencing to report on the chemoselectivity of alkylating agents. We develop the method with the well-known clinically used DNA-modifying drugs streptozotocin and temozolomide, and then apply the technique to profile RNA modification by uncharacterized alkylation reactions, such as those with powerful electrophiles like trimethylsilyldiazomethane. The multiplexed and massively parallel format of NGS allows analyses of chemical reactivity in nucleic acids to be accomplished in less time and with greater statistical power. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Efficacy of plaque removal and learning effect of a powered and a manual toothbrush.

    PubMed

    Lazarescu, D; Boccaneala, S; Illiescu, A; De Boever, J A

    2003-08-01

    Subjects with high plaque and gingivitis scores can profit most from the introduction of new manual or powered toothbrushes. To improve their hygiene, not only the technical characteristics of new brushes but also the learning effect in efficient handling are of importance. The present study compared the efficacy in plaque removal of an electric and a manual toothbrush in a general population and analysed the learning effect in efficient handling. Eighty healthy subjects, unfamiliar with electric brushes, were divided into two groups: group 1 used the Philips/Jordan HP 735 powered brush and group 2 used a manual brush, Oral-B40+. Plaque index (PI) and gingival bleeding index (GBI) were assessed at baseline and at weeks 3, 6, 12 and 18. After each evaluation, patients abstained from oral hygiene for 24 h. The next day a 3-min supervised brushing was performed. Before and after this brushing, PI was assessed for the estimation of the individual learning effect. The study was single-blinded. Over the 18-week period, PI reduced gradually and statistically significantly (p<0.001) in group 1 from 2.9 (+/-0.38) to 1.5 (+/-0.24) and in group 2 from 2.9 (+/-0.34) to 2.2 (+/-0.23). From week 3 onwards, the difference between groups was statistically significant (p<0.001). The bleeding index decreased in group 1 from 28% (+/-17%) to 7% (+/-5%) (p<0.001) and in group 2 from 30% (+/-12%) to 12% (+/-6%) (p<0.001). The difference between groups was statistically significant (p<0.001) from week 6 onwards. The learning effect, expressed as the percentage of plaque reduction after 3 min of supervised brushing, was 33% for group 1 and 26% for group 2 at week 0. This percentage increased at week 18 to 64% in group 1 and 44% in group 2 (difference between groups statistically significant: p<0.001). The powered brush was significantly more efficient in removing plaque and improving gingival health than the manual brush in the group of subjects unfamiliar with electric brushes. There was also a significant learning effect that was more pronounced with the electric toothbrush.

  9. New powerful statistics for alignment-free sequence comparison under a pattern transfer model.

    PubMed

    Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S; Sun, Fengzhu

    2011-09-07

    Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D2* and D2s showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D2* and D2s by comparing local sequence pairs and then summing over all the local sequence pairs of a certain length. We show that the new statistics are much more powerful than the corresponding statistics and that the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. Copyright © 2011 Elsevier Ltd. All rights reserved.
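
    A minimal sketch of the core D2 quantity these statistics build on (an inner product of k-mer count vectors) is shown below; k and the sequences are arbitrary, and the local-pair summation of the new statistics is not reproduced here.

    ```python
    # Sketch of the basic D2 alignment-free statistic: the inner product of
    # k-mer count vectors of two sequences. The paper's new statistics extend
    # this core quantity to local sequence pairs.
    from collections import Counter

    def kmer_counts(seq, k):
        """Count overlapping k-mers in a sequence."""
        return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

    def d2(seq_x, seq_y, k):
        """D2 = sum over words w of X_w * Y_w."""
        cx, cy = kmer_counts(seq_x, k), kmer_counts(seq_y, k)
        return sum(cx[w] * cy[w] for w in cx.keys() & cy.keys())

    print(d2("ACGTACGTGACG", "TTACGTACAACG", k=3))
    ```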

  10. New Powerful Statistics for Alignment-free Sequence Comparison Under a Pattern Transfer Model

    PubMed Central

    Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S.; Sun, Fengzhu

    2011-01-01

    Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D2∗ and D2s showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D2∗ and D2s by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. PMID:21723298

  11. Inferring causal relationships between phenotypes using summary statistics from genome-wide association studies.

    PubMed

    Meng, Xiang-He; Shen, Hui; Chen, Xiang-Ding; Xiao, Hong-Mei; Deng, Hong-Wen

    2018-03-01

    Genome-wide association studies (GWAS) have successfully identified numerous genetic variants associated with diverse complex phenotypes and diseases, and provided tremendous opportunities for further analyses using summary association statistics. Recently, Pickrell et al. developed a robust method for causal inference using independent putative causal SNPs. However, this method may fail to infer the causal relationship between two phenotypes when only a limited number of independent putative causal SNPs are identified. Here, we extended Pickrell's method to make it more applicable to general situations. We extended the causal inference method by replacing the putative causal SNPs with the lead SNPs (the set of the most significant SNPs in each independent locus) and tested the performance of our extended method using both simulation and empirical data. Simulations suggested that when the same number of genetic variants is used, our extended method had a similar distribution of the test statistic under the null model as well as comparable power under the causal model compared with the original method by Pickrell et al. In practice, however, our extended method would generally be more powerful, because the number of independent lead SNPs is often larger than the number of independent putative causal SNPs; including more SNPs, on the other hand, did not cause more false positives. By applying our extended method to summary statistics from GWAS for blood metabolites and femoral neck bone mineral density (FN-BMD), we successfully identified ten blood metabolites that may causally influence FN-BMD. We extended a causal inference method for inferring a putative causal relationship between two phenotypes using summary statistics from GWAS, and identified a number of potential causal metabolites for FN-BMD, which may provide novel insights into the pathophysiological mechanisms underlying osteoporosis.

  12. DMINDA: an integrated web server for DNA motif identification and analyses.

    PubMed

    Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

    2014-07-01

    DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. The server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important for elucidating the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived from information extracted from a control set, (ii) scanning for motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Application of multivariate statistical techniques in microbial ecology

    PubMed Central

    Paliy, O.; Shankar, V.

    2016-01-01

    Recent advances in high-throughput methods of molecular analyses have led to an explosion of studies generating large-scale ecological datasets. The effect has been especially noticeable in the field of microbial ecology, where new experimental approaches have provided in-depth assessments of the composition, functions, and dynamic changes of complex microbial communities. Because even a single high-throughput experiment produces large amounts of data, powerful statistical techniques of multivariate analysis are well suited to analyze and interpret these datasets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular dataset. In this review we describe and compare the most widely used multivariate statistical techniques, including exploratory, interpretive, and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and dataset structure. PMID:26786791

  14. Rough surface reconstruction for ultrasonic NDE simulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Choi, Wonjae; Shi, Fan; Lowe, Michael J. S.

    2014-02-18

    The reflection of ultrasound from rough surfaces is an important topic for the NDE of safety-critical components, such as pressure-containing components in power stations. The specular reflection from a rough surface of a defect is normally lower than it would be from a flat surface, so it is typical to apply a safety factor in order that justification cases for inspection planning are conservative. The study of the statistics of the rough surfaces that might be expected in candidate defects according to materials and loading, and the reflections from them, can be useful to develop arguments for realistic safety factors. This paper presents a study of real rough crack surfaces that are representative of the potential defects in pressure-containing power plant. Two-dimensional (area) values of the height of the roughness have been measured and their statistics analysed. Then a means to reconstruct model cases with similar statistics, so as to enable the creation of multiple realistic realizations of the surfaces, has been investigated, using random field theory. Rough surfaces are reconstructed, based on a real surface, and results for these two-dimensional descriptions of the original surface have been compared with those from the conventional model based on a one-dimensional correlation coefficient function. In addition, ultrasonic reflections from them are simulated using a finite element method.

  15. "Using Power Tables to Compute Statistical Power in Multilevel Experimental Designs"

    ERIC Educational Resources Information Center

    Konstantopoulos, Spyros

    2009-01-01

    Power computations for one-level experimental designs that assume simple random samples are greatly facilitated by power tables such as those presented in Cohen's book about statistical power analysis. However, in education and the social sciences experimental designs have naturally nested structures and multilevel models are needed to compute the…
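
    For the nested designs discussed here, a quick sketch of a two-level power computation using the design effect 1 + (n - 1)ρ is shown below. The inputs are illustrative, and the formula is the standard cluster-randomized approximation rather than anything specific to this article.

    ```python
    # Sketch: power for a two-level cluster-randomized design. With J clusters
    # of size n per arm and intraclass correlation rho, the design effect
    # 1 + (n - 1) * rho deflates the effective sample size. Illustrative values.
    from scipy import stats

    def cluster_power(delta, J, n, rho, alpha=0.05):
        """Approximate power for a cluster-randomized two-arm comparison."""
        deff = 1 + (n - 1) * rho                    # design effect
        ncp = delta * (J * n / (2 * deff)) ** 0.5   # noncentrality parameter
        df = 2 * (J - 1)                            # cluster-level df
        t_crit = stats.t.ppf(1 - alpha / 2, df)
        return 1 - stats.nct.cdf(t_crit, df, ncp)

    print(f"power: {cluster_power(delta=0.3, J=20, n=25, rho=0.05):.2f}")
    ```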

  16. Comparison of three longitudinal analysis models for the health-related quality of life in oncology: a simulation study.

    PubMed

    Anota, Amélie; Barbieri, Antoine; Savina, Marion; Pam, Alhousseiny; Gourgou-Bourgade, Sophie; Bonnetain, Franck; Bascoul-Mollevi, Caroline

    2014-12-31

    Health-Related Quality of Life (HRQoL) is an important endpoint in oncology clinical trials aiming to investigate the clinical benefit of new therapeutic strategies for the patient. However, the longitudinal analysis of HRQoL remains complex and unstandardized. There is clearly a need to propose accessible statistical methods and meaningful results for clinicians. The objective of this study was to compare three strategies for longitudinal analyses of HRQoL data in oncology clinical trials through a simulation study. The methods proposed were the score and mixed model (SM); a survival analysis approach based on the time to HRQoL score deterioration (TTD); and the longitudinal partial credit model (LPCM). Simulations compared the methods in terms of type I error and statistical power of the test of an interaction effect between treatment arm and time. Several simulation scenarios were explored based on the EORTC HRQoL questionnaires, varying the number of patients (100, 200 or 300), items (1, 2 or 4) and response categories per item (4 or 7). Five or 10 measurement times were considered, with correlations ranging from low to high between each measure. The impact of informative missing data on these methods was also studied to reflect the reality of most clinical trials. With complete data, the type I error rate was close to the expected value (5%) for all methods, while the SM method was the most powerful, followed by the LPCM. The power of the TTD method was low for single-item dimensions, because only four possible values exist for the score. As the number of items increased, the power of the SM approach remained stable, that of the TTD method increased, and that of the LPCM remained stable. With 10 measurement times, the LPCM was less efficient. With informative missing data, the statistical power of the SM and TTD methods tended to decrease, while that of the LPCM tended to increase. To conclude, the SM model was the most powerful, irrespective of the scenario considered and the presence or absence of missing data. The TTD method should be avoided for single-item dimensions of the EORTC questionnaire. While the LPCM was better adapted to this kind of data, it was less efficient than the SM model. These results warrant validation through comparisons on real data.

  17. A new u-statistic with superior design sensitivity in matched observational studies.

    PubMed

    Rosenbaum, Paul R

    2011-09-01

    In an observational or nonrandomized study of treatment effects, a sensitivity analysis indicates the magnitude of bias from unmeasured covariates that would need to be present to alter the conclusions of a naïve analysis that presumes adjustments for observed covariates suffice to remove all bias. The power of sensitivity analysis is the probability that it will reject a false hypothesis about treatment effects allowing for a departure from random assignment of a specified magnitude; in particular, if this specified magnitude is "no departure" then this is the same as the power of a randomization test in a randomized experiment. A new family of u-statistics is proposed that includes Wilcoxon's signed rank statistic but also includes other statistics with substantially higher power when a sensitivity analysis is performed in an observational study. Wilcoxon's statistic has high power to detect small effects in large randomized experiments-that is, it often has good Pitman efficiency-but small effects are invariably sensitive to small unobserved biases. Members of this family of u-statistics that emphasize medium to large effects can have substantially higher power in a sensitivity analysis. For example, in one situation with 250 pair differences that are Normal with expectation 1/2 and variance 1, the power of a sensitivity analysis that uses Wilcoxon's statistic is 0.08 while the power of another member of the family of u-statistics is 0.66. The topic is examined by performing a sensitivity analysis in three observational studies, using an asymptotic measure called the design sensitivity, and by simulating power in finite samples. The three examples are drawn from epidemiology, clinical medicine, and genetic toxicology. © 2010, The International Biometric Society.

  18. The vertical pattern of microwave radiation around BTS (Base Transceiver Station) antennae in Hashtgerd township.

    PubMed

    Nasseri, Simin; Monazzam, Mohammadreza; Beheshti, Meisam; Zare, Sajad; Mahvi, Amirhosein

    2013-12-20

    New environmental pollutants interfere with the environment and human life along with technology development. One of these pollutants is the electromagnetic field. This study determines the vertical microwave radiation pattern of different types of Base Transceiver Station (BTS) antennae in the city of Hashtgerd, the capital of Savojbolagh County, Alborz Province, Iran. The basic data, including the geographical location of the BTS antennae in the city, brand, operator type, installation and height, were collected from the radio communication office, and the measurements were then carried out according to IEEE Std 95.1 with the SPECTRAN 4060. The statistical analyses were carried out in SPSS 16 using the Kolmogorov-Smirnov test and multiple regression. Results indicated that for both operators, Irancell and Hamrah-e-Aval (First Operator), the power density rose with an increase in measurement height or a decrease in the vertical distance to the broadcasting antenna. With a mixed model test, a significant statistical relationship was observed between measurement height and average power density for both operators. With increasing measurement height, power density increased for both operators. The study showed that installing antennae in crowded areas needs more care because of higher radiation emission. More rigid surfaces and mobile users are two important factors in crowded areas that can increase wave density and hence raise public microwave exposure.

  19. Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses.

    PubMed

    Liu, Ruijie; Holik, Aliaksei Z; Su, Shian; Jansz, Natasha; Chen, Kelan; Leong, Huei San; Blewitt, Marnie E; Asselin-Labat, Marie-Liesse; Smyth, Gordon K; Ritchie, Matthew E

    2015-09-03

    Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean-variance relationship of the log-counts-per-million using 'voom'. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source 'limma' package. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Monitoring the impact of Bt maize on butterflies in the field: estimation of required sample sizes.

    PubMed

    Lang, Andreas

    2004-01-01

    The monitoring of genetically modified organisms (GMOs) after deliberate release is important in order to assess and evaluate possible environmental effects. Concerns have been raised that the transgenic crop Bt maize may affect butterflies occurring in field margins. Therefore, monitoring of butterflies was suggested to accompany the commercial cultivation of Bt maize. In this study, baseline data on the butterfly species and their abundance in maize field margins are presented together with implications for butterfly monitoring. The study was conducted in Bavaria, South Germany, from 2000 to 2002. A total of 33 butterfly species was recorded in field margins. A small number of species dominated the community, and the butterflies observed were mostly common species. Observation duration was the most important factor influencing the monitoring results. Field margin size affected butterfly abundance, and habitat diversity had a tendency to influence species richness. Sample size and statistical power analyses indicated that a sample size in the range of 75 to 150 field margins for treatment (transgenic maize) and control (conventional maize) would detect (power of 80%) effects larger than 15% in species richness and in butterfly abundance pooled across species. However, a much higher number of field margins must be sampled in order to achieve a higher statistical power, to detect smaller effects, and to monitor single butterfly species.
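
    The flavour of such a sample-size estimate can be reproduced with a standard two-sample calculation; the coefficient of variation below is an assumed value, not taken from the field data.

    ```python
    # Sketch: required number of field margins per group to detect a 15%
    # difference in mean butterfly abundance. The between-margin coefficient
    # of variation is an illustrative assumption.
    from statsmodels.stats.power import TTestIndPower

    cv = 0.5                      # assumed coefficient of variation
    relative_effect = 0.15        # 15% difference in means
    d = relative_effect / cv      # standardized effect size (Cohen's d)

    n_per_group = TTestIndPower().solve_power(effect_size=d, alpha=0.05,
                                              power=0.80,
                                              alternative='two-sided')
    print(f"margins needed per group: {n_per_group:.0f}")  # roughly 175
    ```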

  1. The vertical pattern of microwave radiation around BTS (Base Transceiver Station) antennae in Hashtgerd township

    PubMed Central

    2013-01-01

    New environmental pollutants interfere with the environment and human life along with technology development. One of these pollutants is the electromagnetic field. This study determines the vertical microwave radiation pattern of different types of Base Transceiver Station (BTS) antennae in the city of Hashtgerd, the capital of Savojbolagh County, Alborz Province, Iran. The basic data, including the geographical location of the BTS antennae in the city, brand, operator type, installation and height, were collected from the radio communication office, and the measurements were then carried out according to IEEE Std 95.1 with the SPECTRAN 4060. The statistical analyses were carried out in SPSS 16 using the Kolmogorov-Smirnov test and multiple regression. Results indicated that for both operators, Irancell and Hamrah-e-Aval (First Operator), the power density rose with an increase in measurement height or a decrease in the vertical distance to the broadcasting antenna. With a mixed model test, a significant statistical relationship was observed between measurement height and average power density for both operators. With increasing measurement height, power density increased for both operators. The study showed that installing antennae in crowded areas needs more care because of higher radiation emission. More rigid surfaces and mobile users are two important factors in crowded areas that can increase wave density and hence raise public microwave exposure. PMID:24359870

  2. Trial Sequential Analysis in systematic reviews with meta-analysis.

    PubMed

    Wetterslev, Jørn; Jakobsen, Janus Christian; Gluud, Christian

    2017-03-06

    Most meta-analyses in systematic reviews, including Cochrane ones, do not have sufficient statistical power to detect or refute even large intervention effects. This is why a meta-analysis ought to be regarded as an interim analysis on its way towards a required information size. The results of the meta-analyses should relate the total number of randomised participants to the estimated required meta-analytic information size accounting for statistical diversity. When the number of participants and the corresponding number of trials in a meta-analysis are insufficient, the use of the traditional 95% confidence interval or the 5% statistical significance threshold will lead to too many false positive conclusions (type I errors) and too many false negative conclusions (type II errors). We developed a methodology for interpreting meta-analysis results, using generally accepted, valid evidence on how to adjust thresholds for significance in randomised clinical trials when the required sample size has not been reached. The Lan-DeMets trial sequential monitoring boundaries in Trial Sequential Analysis offer adjusted confidence intervals and restricted thresholds for statistical significance when the diversity-adjusted required information size and the corresponding number of required trials for the meta-analysis have not been reached. Trial Sequential Analysis provides a frequentist approach to control both type I and type II errors. We define the required information size and the corresponding number of required trials in a meta-analysis, and the diversity (D²) measure of heterogeneity. We explain the reasons for using Trial Sequential Analysis of meta-analysis when the actual information size fails to reach the required information size. We present examples drawn from traditional meta-analyses using unadjusted naïve 95% confidence intervals and 5% thresholds for statistical significance. Spurious conclusions in systematic reviews with traditional meta-analyses can be reduced using Trial Sequential Analysis. Several empirical studies have demonstrated that Trial Sequential Analysis provides better control of type I and type II errors than traditional naïve meta-analysis. Trial Sequential Analysis represents analysis of meta-analytic data, with transparent assumptions, and better control of type I and type II errors than the traditional meta-analysis using naïve unadjusted confidence intervals.
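
    A minimal sketch of the diversity-adjusted required information size for a continuous outcome, following the logic described above; delta, sigma and D² are illustrative assumptions, not values from the article.

    ```python
    # Sketch: diversity-adjusted required information size (RIS) for a
    # meta-analysis of a continuous outcome. Inputs are illustrative.
    from scipy import stats

    alpha, beta = 0.05, 0.10          # type I and type II error rates
    delta, sigma = 0.5, 2.0           # assumed effect and outcome SD
    D2 = 0.40                         # diversity (heterogeneity) estimate

    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(1 - beta)
    ris_fixed = 4 * (z_a + z_b) ** 2 * sigma ** 2 / delta ** 2
    ris_adjusted = ris_fixed / (1 - D2)   # inflate for between-trial diversity

    print(f"fixed-effect RIS:       {ris_fixed:.0f} participants")
    print(f"diversity-adjusted RIS: {ris_adjusted:.0f} participants")
    ```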

  3. Memory matters: influence from a cognitive map on animal space use.

    PubMed

    Gautestad, Arild O

    2011-10-21

    A vertebrate individual's cognitive map provides a capacity for site fidelity and long-distance returns to favorable patches. Fractal-geometrical analysis of individual space use based on collection of telemetry fixes makes it possible to verify the influence of a cognitive map on the spatial scatter of habitat use and also to what extent space use has been of a scale-specific versus a scale-free kind. This approach rests on a statistical mechanical level of system abstraction, where micro-scale details of behavioral interactions are coarse-grained to macro-scale observables like the fractal dimension of space use. In this manner, the magnitude of the fractal dimension becomes a proxy variable for distinguishing between main classes of habitat exploration and site fidelity, like memory-less (Markovian) Brownian motion and Levy walk and memory-enhanced space use like Multi-scaled Random Walk (MRW). In this paper previous analyses are extended by exploring MRW simulations under three scenarios: (1) central place foraging, (2) behavioral adaptation to resource depletion (avoidance of latest visited locations) and (3) transition from MRW towards Levy walk by narrowing memory capacity to a trailing time window. A generalized statistical-mechanical theory with the power to model cognitive map influence on individual space use will be important for statistical analyses of animal habitat preferences and the mechanics behind site fidelity and home ranges. Copyright © 2011 Elsevier Ltd. All rights reserved.

  4. Relationship of the functional movement screen in-line lunge to power, speed, and balance measures.

    PubMed

    Hartigan, Erin H; Lawrence, Michael; Bisson, Brian M; Torgerson, Erik; Knight, Ryan C

    2014-05-01

    The in-line lunge of the Functional Movement Screen (FMS) evaluates lateral stability, balance, and movement asymmetries. Athletes who score poorly on the in-line lunge should avoid activities requiring power or speed until scores are improved, yet relationships between the in-line lunge scores and other measures of balance, power, and speed are unknown. (1) Lunge scores will correlate with center of pressure (COP), maximum jump height (MJH), and 36.6-meter sprint time and (2) there will be no differences between limbs on lunge scores, MJH, or COP. Descriptive laboratory study. Level 3. Thirty-seven healthy, active participants completed the first 3 tasks of the FMS (eg, deep squat, hurdle step, in-line lunge), unilateral drop jumps, and 36.6-meter sprints. A 3-dimensional motion analysis system captured MJH. Force platforms measured COP excursion. A laser timing system measured 36.6-m sprint time. Statistical analyses were used to determine whether a relationship existed between lunge scores and COP, MJH, and 36.6-m speed (Spearman rho tests) and whether differences existed between limbs in lunge scores (Wilcoxon signed-rank test), MJH, and COP (paired t tests). Lunge scores were not significantly correlated with COP, MJH, or 36.6-m sprint time. Lunge scores, COP excursion, and MJH were not statistically different between limbs. Performance on the FMS in-line lunge was not related to balance, power, or speed. Healthy participants were symmetrical in lunging measures and MJH. Scores on the FMS in-line lunge should not be attributed to power, speed, or balance performance without further examination. However, assessing limb symmetry appears to be clinically relevant.

  5. You Cannot Step Into the Same River Twice: When Power Analyses Are Optimistic.

    PubMed

    McShane, Blakeley B; Böckenholt, Ulf

    2014-11-01

    Statistical power depends on the size of the effect of interest. However, effect sizes are rarely fixed in psychological research: Study design choices, such as the operationalization of the dependent variable or the treatment manipulation, the social context, the subject pool, or the time of day, typically cause systematic variation in the effect size. Ignoring this between-study variation, as standard power formulae do, results in assessments of power that are too optimistic. Consequently, when researchers attempting replication set sample sizes using these formulae, their studies will be underpowered and will thus fail at a greater than expected rate. We illustrate this with both hypothetical examples and data on several well-studied phenomena in psychology. We provide formulae that account for between-study variation and suggest that researchers set sample sizes with respect to our generally more conservative formulae. Our formulae generalize to settings in which there are multiple effects of interest. We also introduce an easy-to-use website that implements our approach to setting sample sizes. Finally, we conclude with recommendations for quantifying between-study variation. © The Author(s) 2014.
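
    The core point can be demonstrated numerically: draw the effect size from a between-study distribution and average the per-draw power. The sketch below uses illustrative values for the mean and SD of the effect distribution and is not the authors' exact formulae.

    ```python
    # Sketch: power when the effect size varies across studies. Instead of a
    # fixed d, draw d from N(mu, tau^2) and average the per-draw power; the
    # result is lower than the naive fixed-d calculation. Values illustrative.
    import numpy as np
    from statsmodels.stats.power import TTestIndPower

    rng = np.random.default_rng(1)
    calc = TTestIndPower()
    mu, tau, n_per_arm = 0.4, 0.2, 100

    naive = calc.power(effect_size=mu, nobs1=n_per_arm, alpha=0.05)
    draws = rng.normal(mu, tau, size=5000)
    averaged = np.mean([calc.power(effect_size=abs(d), nobs1=n_per_arm,
                                   alpha=0.05) for d in draws])

    print(f"naive power (fixed d): {naive:.2f}")
    print(f"power averaged over between-study variation: {averaged:.2f}")
    ```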

  6. Dust-acoustic waves and stability in the permeating dusty plasma. II. Power-law distributions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gong Jingyu; Du Jiulin; Liu Zhipeng

    2012-08-15

    The dust-acoustic waves and the stability theory for the permeating dusty plasma with power-law distributions are studied by using nonextensive q-statistics. In two limiting physical cases, when the thermal velocity of the flowing dusty plasma is much larger than, and much smaller than, the phase velocity of the waves, we derived the dust-acoustic wave frequency, the instability growth rate, and the instability critical flowing velocity. As compared with the formulae obtained in part I [Gong et al., Phys. Plasmas 19, 043704 (2012)], all formulae of the present cases and the resulting plasma characteristics are q-dependent, and the power-law distribution of each plasma component of the permeating dusty plasma has a different q-parameter and thus a different nonextensive effect. Further, we present numerical analyses of an example in which a cometary plasma tail passes through the interplanetary space dusty plasma, and we show that these power-law distributions have significant effects on the plasma characteristics of this kind of plasma environment.

  7. Fast and accurate imputation of summary statistics enhances evidence of functional enrichment

    PubMed Central

    Pasaniuc, Bogdan; Zaitlen, Noah; Shi, Huwenbo; Bhatia, Gaurav; Gusev, Alexander; Pickrell, Joseph; Hirschhorn, Joel; Strachan, David P.; Patterson, Nick; Price, Alkes L.

    2014-01-01

    Motivation: Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov models (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. Results: In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1–5%) variants [increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples] versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case–control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of χ2 association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Availability and implementation: Publicly available software package available at http://bogdan.bioinformatics.ucla.edu/software/. Contact: bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu Supplementary information: Supplementary materials are available at Bioinformatics online. PMID:24990607
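
    A toy sketch of the Gaussian imputation idea: under a multivariate normal model for z-scores, the imputed z at an untyped SNP is the conditional mean given the typed SNPs and the LD matrix. The LD values and z-scores below are invented for illustration; the published method additionally accounts for the finite sample size of the reference panel, which this toy version omits.

    ```python
    # Sketch of Gaussian imputation of an association z-score from typed SNPs,
    # assuming z-scores are multivariate normal with covariance given by LD:
    # z_t | z_o ~ N(S_to S_oo^{-1} z_o, 1 - S_to S_oo^{-1} S_ot).
    import numpy as np

    # LD (correlation) matrix for three SNPs: the first two are typed, the
    # third is untyped (to impute). Values are illustrative.
    ld = np.array([[1.00, 0.60, 0.80],
                   [0.60, 1.00, 0.55],
                   [0.80, 0.55, 1.00]])
    z_obs = np.array([3.1, 2.4])        # observed z-scores at typed SNPs

    S_oo = ld[:2, :2]                   # LD among observed SNPs
    S_to = ld[2, :2]                    # LD between target and observed SNPs

    w = S_to @ np.linalg.inv(S_oo)      # imputation weights
    z_imputed = w @ z_obs
    info = S_to @ np.linalg.inv(S_oo) @ S_to   # imputation quality

    print(f"imputed z: {z_imputed:.2f} (info: {info:.2f})")
    ```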

  8. Sampling designs for contaminant temporal trend analyses using sedentary species exemplified by the snails Bellamya aeruginosa and Viviparus viviparus.

    PubMed

    Yin, Ge; Danielsson, Sara; Dahlberg, Anna-Karin; Zhou, Yihui; Qiu, Yanling; Nyberg, Elisabeth; Bignert, Anders

    2017-10-01

    Environmental monitoring typically assumes samples and sampling activities to be representative of the population being studied. Given a limited budget, an appropriate sampling strategy is essential to support detecting temporal trends of contaminants. In the present study, based on real chemical analysis data on polybrominated diphenyl ethers in snails collected from five subsites in Tianmu Lake, computer simulation was performed to evaluate three sampling strategies by estimating the sample size required to detect an annual change of 5% with a statistical power of 80% or 90% at a significance level of 5%. The results showed that sampling from an arbitrarily selected sampling spot is the worst strategy, requiring many more individual analyses to achieve the above-mentioned criteria compared with the other two approaches. A fixed sampling site requires the lowest sample size but may not be representative of the intended study object, e.g. a lake, and is also sensitive to changes at that particular sampling site. In contrast, sampling at multiple sites along the shore each year, and using pooled samples when the cost to collect and prepare individual specimens is much lower than the cost of chemical analysis, would be the most robust and cost-efficient strategy in the long run. Using statistical power as the criterion, the results demonstrated quantitatively the consequences of various sampling strategies, and could guide users with respect to the sample sizes required for long-term monitoring programs, depending on the sampling design. Copyright © 2017 Elsevier Ltd. All rights reserved.
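
    The simulation logic can be sketched as follows: generate lognormal concentrations with a 5% annual decline, fit a log-linear trend, and count rejections. The CV, number of years, and replicate counts are illustrative, not the study's values.

    ```python
    # Sketch of the simulation logic: power to detect a 5% annual decline in
    # contaminant concentration via log-linear regression on yearly samples.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)

    def trend_power(n_years=10, n_per_year=10, annual_change=0.05,
                    cv=0.4, alpha=0.05, n_sim=2000):
        sd_log = np.sqrt(np.log(1 + cv**2))     # lognormal SD on log scale
        slope_true = np.log(1 - annual_change)  # 5% decline per year
        years = np.repeat(np.arange(n_years), n_per_year)
        hits = 0
        for _ in range(n_sim):
            log_conc = slope_true * years + rng.normal(0, sd_log, years.size)
            res = stats.linregress(years, log_conc)
            hits += res.pvalue < alpha
        return hits / n_sim

    print(f"power: {trend_power():.2f}")
    ```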

  9. Subjective global assessment of nutritional status in children.

    PubMed

    Mahdavi, Aida Malek; Ostadrahimi, Alireza; Safaiyan, Abdolrasool

    2010-10-01

    This study aimed to compare subjective and objective nutritional assessments and to analyse the performance of the subjective global assessment (SGA) of nutritional status in diagnosing undernutrition in paediatric patients. One hundred and forty children (aged 2-12 years) hospitalized consecutively in Tabriz Paediatric Hospital from June 2008 to August 2008 underwent subjective assessment using the SGA questionnaire and objective assessment, including anthropometric and biochemical measurements. Agreement between the two assessment methods was analysed by the kappa (κ) statistic. Statistical indicators (sensitivity, specificity, predictive values, error rates, accuracy, powers, likelihood ratios and odds ratio) comparing the SGA with the objective assessment method were determined. The overall prevalence of undernutrition according to the SGA (70.7%) was higher than that by objective assessment of nutritional status (48.5%). Agreement between the two evaluation methods was only fair to moderate (κ = 0.336, P < 0.001). The sensitivity, specificity, and positive and negative predictive values of the SGA method for screening undernutrition in this population were 88.235%, 45.833%, 60.606% and 80.487%, respectively. The accuracy, and positive and negative power of the SGA method were 66.428%, 56.074% and 41.25%, respectively. The positive likelihood ratio, negative likelihood ratio and odds ratio of the SGA method were 1.628, 0.256 and 6.359, respectively. Our findings indicated that in assessing the nutritional status of children, there is not a good level of agreement between SGA and objective nutritional assessment. In addition, SGA is a highly sensitive tool for assessing nutritional status and could identify children at risk of developing undernutrition. © 2009 Blackwell Publishing Ltd.

  10. High-Density Signal Interface Electromagnetic Radiation Prediction for Electromagnetic Compatibility Evaluation.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Halligan, Matthew

    Radiated power calculation approaches for practical scenarios of incomplete high-density interface characterization information and incomplete incident power information are presented. The suggested approaches build upon a method that characterizes power losses through the definition of power loss constant matrices. Potential radiated power estimates include using total power loss information, partial radiated power loss information, worst case analysis, and statistical bounding analysis. A method is also proposed to calculate radiated power when incident power information is not fully known for non-periodic signals at the interface. Incident data signals are modeled from a two-state Markov chain where bit state probabilities are derived. The total spectrum for windowed signals is postulated as the superposition of spectra from individual pulses in a data sequence. Statistical bounding methods are proposed as a basis for the radiated power calculation due to the complexity of calculating a radiated power probability density function.

  11. Integration of statistical and physiological analyses of adaptation of near-isogenic barley lines.

    PubMed

    Romagosa, I; Fox, P N; García Del Moral, L F; Ramos, J M; García Del Moral, B; Roca de Togores, F; Molina-Cano, J L

    1993-08-01

    Seven near-isogenic barley lines, differing for three independent mutant genes, were grown in 15 environments in Spain. Genotype x environment interaction (G x E) for grain yield was examined with the Additive Main Effects and Multiplicative interaction (AMMI) model. The results of this statistical analysis of multilocation yield-data were compared with a morpho-physiological characterization of the lines at two sites (Molina-Cano et al. 1990). The first two principal component axes from the AMMI analysis were strongly associated with the morpho-physiological characters. The independent but parallel discrimination among genotypes reflects genetic differences and highlights the power of the AMMI analysis as a tool to investigate G x E. Characters which appear to be positively associated with yield in the germplasm under study could be identified for some environments.
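
    A compact sketch of an AMMI-style decomposition (additive main effects plus an SVD of the G×E residual matrix) is shown below on random data; it mirrors the structure of the analysis, not the paper's dataset.

    ```python
    # Sketch of an AMMI-style decomposition: additive main effects by two-way
    # ANOVA, then an SVD of the genotype-by-environment residual matrix.
    # The 7 x 15 yield matrix is randomly generated for illustration.
    import numpy as np

    rng = np.random.default_rng(0)
    yields = rng.normal(4.0, 0.5, size=(7, 15))   # genotypes x environments

    grand = yields.mean()
    gen_eff = yields.mean(axis=1, keepdims=True) - grand   # genotype effects
    env_eff = yields.mean(axis=0, keepdims=True) - grand   # environment effects
    residual = yields - grand - gen_eff - env_eff          # G x E interaction

    # Multiplicative terms: principal components of the interaction.
    U, s, Vt = np.linalg.svd(residual, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    print("share of GxE captured by IPCA1, IPCA2:", explained[:2].round(2))
    print("genotype scores on IPCA1:", (U[:, 0] * np.sqrt(s[0])).round(2))
    ```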

  12. Low-dose ionizing radiation increases the mortality risk of solid cancers in nuclear industry workers: A meta-analysis.

    PubMed

    Qu, Shu-Gen; Gao, Jin; Tang, Bo; Yu, Bo; Shen, Yue-Ping; Tu, Yu

    2018-05-01

    Low-dose ionizing radiation (LDIR) may increase the mortality of solid cancers in nuclear industry workers, but only a few individual cohort studies exist, and the available reports have low statistical power. The aim of the present study was to assess solid cancer mortality risk from LDIR in the nuclear industry using standardized mortality ratios (SMRs) and 95% confidence intervals. A systematic literature search through the PubMed and Embase databases identified 27 studies relevant to this meta-analysis. There was statistical significance for total, solid and lung cancers, with meta-SMR values of 0.88, 0.80, and 0.89, respectively. There was evidence of stochastic effects of IR, but more definitive conclusions require additional analyses using standardized protocols to determine whether LDIR increases the risk of solid cancer-related mortality.

  13. Seven ways to increase power without increasing N.

    PubMed

    Hansen, W B; Collins, L M

    1994-01-01

    Many readers of this monograph may wonder why a chapter on statistical power was included. After all, by now the issue of statistical power is in many respects mundane. Everyone knows that statistical power is a central research consideration, and certainly most National Institute on Drug Abuse grantees or prospective grantees understand the importance of including a power analysis in research proposals. However, there is ample evidence that, in practice, prevention researchers are not paying sufficient attention to statistical power. If they were, the findings observed by Hansen (1992) in a recent review of the prevention literature would not have emerged. Hansen (1992) examined statistical power based on 46 cohorts followed longitudinally, using nonparametric assumptions given the subjects' age at posttest and the numbers of subjects. Results of this analysis indicated that, in order for a study to attain 80-percent power for detecting differences between treatment and control groups, the difference between groups at posttest would need to be at least 8 percent (in the best studies) and as much as 16 percent (in the weakest studies). In order for a study to attain 80-percent power for detecting group differences in pre-post change, 22 of the 46 cohorts would have needed relative pre-post reductions of greater than 100 percent. Thirty-three of the 46 cohorts had less than 50-percent power to detect a 50-percent relative reduction in substance use. These results are consistent with other review findings (e.g., Lipsey 1990) that have shown a similar lack of power in a broad range of research topics. Thus, it seems that, although researchers are aware of the importance of statistical power (particularly of the necessity for calculating it when proposing research), they somehow are failing to end up with adequate power in their completed studies. This chapter argues that the failure of many prevention studies to maintain adequate statistical power is due to an overemphasis on sample size (N) as the only, or even the best, way to increase statistical power. It is easy to see how this overemphasis has come about. Sample size is easy to manipulate, has the advantage of being related to power in a straightforward way, and usually is under the direct control of the researcher, except for limitations imposed by finances or subject availability. Another option for increasing power is to increase the alpha used for hypothesis-testing but, as very few researchers seriously consider significance levels much larger than the traditional .05, this strategy seldom is used. Of course, sample size is important, and the authors of this chapter are not recommending that researchers cease choosing sample sizes carefully. Rather, they argue that researchers should not confine themselves to increasing N to enhance power. It is important to take additional measures to maintain and improve power over and above making sure the initial sample size is sufficient. The authors recommend two general strategies. One strategy involves attempting to maintain the effective initial sample size so that power is not lost needlessly. The other strategy is to take measures to maximize the third factor that determines statistical power: effect size.

  14. Relative risk estimates from spatial and space-time scan statistics: Are they biased?

    PubMed Central

    Prates, Marcos O.; Kulldorff, Martin; Assunção, Renato M.

    2014-01-01

    The purely spatial and space-time scan statistics have been successfully used by many scientists to detect and evaluate geographical disease clusters. Although the scan statistic has high power in correctly identifying a cluster, no study has considered the estimates of the cluster relative risk in the detected cluster. In this paper we evaluate whether there is any bias in these estimated relative risks. Intuitively, one may expect that the estimated relative risks have an upward bias, since the scan statistic cherry-picks high-rate areas to include in the cluster. We show that this intuition is correct for clusters with low statistical power, but with medium to high power the bias becomes negligible. The same behaviour is not observed for the prospective space-time scan statistic, where there is an increasingly conservative downward bias of the relative risk as the power to detect the cluster increases. PMID:24639031

  15. The message of the survival curves: I. Composite analysis of long-term treatment studies in bipolar disorder.

    PubMed

    Frecska, Ede; Kovacs, Attila Istvan; Balla, Petra; Falussy, Linda; Ferencz, Akos; Varga, Zsofia

    2012-09-01

    There is a shortage of studies analyzing the time course of recurrent episodes and comparing the effectiveness of long-term treatments in bipolar disorder. 'Number needed to treat' (NNT) analyses have proven useful for clinically meaningful comparisons, but results vary considerably among studies. The survival curves of different trials also show great variability, preventing reliable conclusions on the time course of maintenance therapies. The variance of survival analyses of long-term medication management can be reduced by increasing the statistical power through combining the life-tables of individual studies. In this study the survival tables of 28 studies on maintenance treatment of bipolar disorder were reconstructed from the published diagrams, and the numbers of relapsed patients in the original studies were estimated in order to plot composite survival curves for an inactive, a monotherapy and a combination therapy arm. The review was finally based on 5231 subjects. The resulting composite diagrams indicate that within the first year 48% of patients on monotherapy and 35% on combination therapy experienced recurrence of any affective episode ('early relapsers'). The rest of the patient population was affected by recurrences at a lower rate over a more extended period of time ('late relapsers'). For a favorable outcome at 40 months of episode prevention in bipolar disorder, the NNT was 6 for mono- and 3 for combination therapy. Log-rank analyses of the composite data supported the effectiveness of both medication protocols over placebo, and the superiority of drug combination over monotherapy, though there were some indications of decreased efficacy in the two treatment arms after extended maintenance. Composite analysis offers increased statistical power for studying the time course of survival data. Mood episodes in bipolar disorder are likely to recur early on, and relapses in "real life" can be more frequent than the rates published here. Our results favor combination therapy for the long-term management of bipolar disorder. Concerns are expressed that NNT analyses have significant limitations when applied to recurring events with cumulative deterioration instead of cases where cumulative improvement is expected over time.
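
    The NNT arithmetic used here is simple enough to sketch: at a chosen horizon, NNT is the reciprocal of the difference in event-free proportions. The survival fractions below are illustrative values chosen to reproduce the reported NNTs of 6 and 3, not the paper's exact curves.

    ```python
    # Sketch: NNT at a time horizon from event-free proportions on composite
    # survival curves: NNT = 1 / (S_active(t) - S_control(t)).
    import math

    def nnt(surv_active, surv_control):
        """Number needed to treat from event-free proportions at one horizon."""
        arr = surv_active - surv_control       # absolute risk reduction
        return math.inf if arr <= 0 else 1.0 / arr

    # Illustrative 40-month event-free proportions:
    # placebo, monotherapy, combination therapy
    s_placebo, s_mono, s_combo = 0.20, 0.37, 0.53
    print(f"NNT mono vs placebo:  {nnt(s_mono, s_placebo):.1f}")   # ~ 6
    print(f"NNT combo vs placebo: {nnt(s_combo, s_placebo):.1f}")  # ~ 3
    ```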

  16. Analysis of ambient SO2 concentrations and winds in the complex surroundings of a thermal power plant

    NASA Astrophysics Data System (ADS)

    Mlakar, P.

    2004-11-01

    SO2 pollution is still a significant problem in Slovenia, especially around large thermal power plants (TPPs), like the one at Šoštanj. The Šoštanj TPP is the exclusive source of SO2 in the area and is therefore a perfect example for air pollution studies. In order to understand air pollution around the Šoštanj TPP in detail, some analyses of emissions and ambient concentrations of SO2 at six automated monitoring stations in the surroundings of the TPP were made. The database from 1991 to 1993 was used, when there were no desulfurisation plants in operation. Statistical analyses of the influence of the emissions from the three TPP stacks at different measuring points were made. The analyses prove that the smallest stack (100 m) mainly pollutes villages and towns near the TPP within a radius of a few kilometres. The medium stack's (150 m) influence is noticed at shorter as well as at longer distances up to more than ten kilometres. The highest stack (230 m) pollutes mainly at longer distances, where the plume reaches the higher hills. Detailed analyses of ambient SO2 concentrations were made. They show the temporal and spatial distribution of different classes of SO2 concentrations from very low to alarming values. These analyses show that pollution patterns at a particular station remain the same if observed on a yearly basis, but can vary very much if observed on a monthly basis, mainly because of different weather patterns. Therefore the winds in the basin (as the most important feature influencing air pollution dispersion) were further analysed in detail to find clusters of similar patterns. For cluster analysis of ground-level wind patterns in the basin around the Šoštanj Thermal Power Plant, the Kohonen neural network and Leaders' method were used. Furthermore, the dependence of ambient SO2 concentrations on the clusters obtained was analysed. The results proved that effective cluster analysis can be a useful tool for compressing a huge wind database in order to find the correlation between winds and pollutant concentrations. The analyses made provide a better insight into air pollution over complex terrain.

  17. Dichotomising continuous data while retaining statistical power using a distributional approach.

    PubMed

    Peacock, J L; Sauzet, O; Ewings, S M; Kerry, S M

    2012-11-20

    Dichotomisation of continuous data is known to be hugely problematic because information is lost, power is reduced and relationships may be obscured or changed. However, not only are differences in means difficult for clinicians to interpret, but thresholds also occur in many areas of medical practice and cannot be ignored. In recognition of both the problems of dichotomisation and the ways in which it may be useful clinically, we have used a distributional approach to derive a difference in proportions with a 95% CI that retains the precision and the power of the CI for the equivalent difference in means. In this way, we propose a dual approach that analyses continuous data using both means and proportions to replace dichotomisation alone and that may be useful in certain situations. We illustrate this work with examples and simulations that show good performance of the parametric approach under standard distributional assumptions from our own research and from the literature. Copyright © 2012 John Wiley & Sons, Ltd.
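
    The distributional idea can be sketched as follows: estimate each group's proportion beyond a clinical threshold from its mean and SD under normality, then form the difference in proportions. The delta-method interval below is a simple stand-in (it ignores uncertainty in the SDs); the paper derives a more refined interval. All data are invented.

    ```python
    # Sketch of the distributional approach: proportions below a threshold
    # estimated from group means and SDs under normality, with a crude
    # delta-method CI for the difference (SD uncertainty ignored for brevity).
    import numpy as np
    from scipy import stats

    def prop_below(mean, sd, cutpoint):
        return stats.norm.cdf((cutpoint - mean) / sd)

    # Illustrative birthweight-style data: mean (g), SD (g), n per group
    m1, s1, n1 = 3150.0, 450.0, 200
    m2, s2, n2 = 3300.0, 440.0, 200
    cut = 2500.0                                  # low-birthweight threshold

    p1, p2 = prop_below(m1, s1, cut), prop_below(m2, s2, cut)
    diff = p1 - p2

    # Delta method: Var(p_hat) ~= phi(z)^2 / n, with z = (cut - mean) / sd
    se = np.sqrt(stats.norm.pdf((cut - m1) / s1)**2 / n1
                 + stats.norm.pdf((cut - m2) / s2)**2 / n2)
    print(f"difference in proportions: {diff:.3f} +/- {1.96 * se:.3f}")
    ```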

  18. Statistical inconsistencies in the KiDS-450 data set

    NASA Astrophysics Data System (ADS)

    Efstathiou, George; Lemos, Pablo

    2018-05-01

    The Kilo-Degree Survey (KiDS) has been used in several recent papers to infer constraints on the amplitude of the matter power spectrum and matter density at low redshift. Some of these analyses have claimed tension with the Planck Λ cold dark matter cosmology at the ∼2σ–3σ level, perhaps indicative of new physics. However, Planck is consistent with other low-redshift probes of the matter power spectrum such as redshift-space distortions and the combined galaxy-mass and galaxy-galaxy power spectra. Here, we perform consistency tests of the KiDS data, finding internal tensions for various cuts of the data at ∼2.2σ–3.5σ significance. Until these internal tensions are understood, we argue that it is premature to claim evidence for new physics from KiDS. We review the consistency between KiDS and other weak lensing measurements of S8, highlighting the importance of intrinsic alignments for precision cosmology.
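
    The tension levels quoted can be reproduced in spirit with the usual Gaussian metric, the difference between two independent estimates in units of their combined uncertainty; the S8 numbers below are placeholders, not the actual KiDS or Planck values:

        import numpy as np

        def tension_sigma(x1, s1, x2, s2):
            """Difference of two independent Gaussian estimates in units of combined sigma."""
            return abs(x1 - x2) / np.hypot(s1, s2)

        print(tension_sigma(0.75, 0.03, 0.83, 0.02))  # hypothetical S8 values -> about 2.2 sigma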

  19. Removing an intersubject variance component in a general linear model improves multiway factoring of event-related spectral perturbations in group EEG studies.

    PubMed

    Spence, Jeffrey S; Brier, Matthew R; Hart, John; Ferree, Thomas C

    2013-03-01

    Linear statistical models are used very effectively to assess task-related differences in EEG power spectral analyses. Mixed models, in particular, accommodate more than one variance component in a multisubject study, where many trials of each condition of interest are measured on each subject. Generally, intra- and intersubject variances are both important to determine correct standard errors for inference on functions of model parameters, but it is often assumed that intersubject variance is the most important consideration in a group study. In this article, we show that, under common assumptions, estimates of some functions of model parameters, including estimates of task-related differences, are properly tested relative to the intrasubject variance component only. A substantial gain in statistical power can arise from the proper separation of variance components when there is more than one source of variability. We first develop this result analytically, then show how it benefits a multiway factoring of spectral, spatial, and temporal components from EEG data acquired in a group of healthy subjects performing a well-studied response inhibition task. Copyright © 2011 Wiley Periodicals, Inc.
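
    A generic illustration of the variance-component separation being discussed, using a mixed model in statsmodels; the data layout (one row per trial, log spectral power as the response) and all names and values are assumptions of the sketch, not the authors' pipeline:

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(1)
        subjects = np.repeat(np.arange(20), 40)             # 20 subjects x 40 trials each
        condition = np.tile(np.repeat([0, 1], 20), 20)      # two task conditions per subject
        subj_effect = rng.normal(0.0, 1.0, 20)[subjects]    # intersubject variance
        power_db = 0.3 * condition + subj_effect + rng.normal(0.0, 0.5, subjects.size)

        df = pd.DataFrame({"power": power_db, "cond": condition, "subj": subjects})

        # random intercept per subject; the within-subject condition contrast is then
        # tested against the intrasubject (residual) variance component
        fit = smf.mixedlm("power ~ cond", df, groups=df["subj"]).fit()
        print(fit.summary())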

  20. An efficient empirical Bayes method for genomewide association studies.

    PubMed

    Wang, Q; Wei, J; Pan, Y; Xu, S

    2016-08-01

    Linear mixed model (LMM) is one of the most popular methods for genomewide association studies (GWAS). Numerous forms of LMM have been developed; however, there are two major issues in GWAS that have not been fully addressed before: (i) the genomic background noise and (ii) low statistical power after Bonferroni correction. We proposed an empirical Bayes (EB) method that assigns each marker effect a normal prior distribution, resulting in shrinkage estimates of marker effects. We found that such a shrinkage approach can selectively shrink marker effects and reduce the noise level to zero for the majority of non-associated markers. At the same time, the EB method allows us to use an 'effective number of tests' to perform Bonferroni correction for multiple tests. Simulation studies for both human and pig data showed that the EB method can significantly increase statistical power compared with widely used exact GWAS methods, such as GEMMA and FaST-LMM-Select. Real data analyses in human breast cancer identified improved detection signals for markers previously known to be associated with breast cancer. We therefore believe that the EB method is a valuable tool for identifying the genetic basis of complex traits. © 2015 Blackwell Verlag GmbH.
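
    The shrinkage idea can be sketched generically: under a normal prior beta_j ~ N(0, tau_j^2), the posterior mean multiplies each estimated effect by tau_j^2 / (tau_j^2 + se_j^2). The crude per-marker moment estimate of tau_j^2 used below mimics the selective behaviour described in the abstract (null markers are pulled to exactly zero) but is not claimed to be the authors' algorithm:

        import numpy as np

        rng = np.random.default_rng(2)
        n_markers = 10000
        se = np.full(n_markers, 0.05)
        beta = np.zeros(n_markers)
        beta[:20] = 0.4                                   # a few truly associated markers
        beta_hat = beta + rng.normal(0.0, se)

        # per-marker moment estimate of the prior variance; markers with
        # beta_hat^2 <= se^2 get tau2 = 0 and are shrunk exactly to zero
        tau2 = np.maximum(beta_hat ** 2 - se ** 2, 0.0)
        beta_eb = beta_hat * tau2 / (tau2 + se ** 2)

        print("null markers shrunk to zero:", np.mean(beta_eb[20:] == 0).round(2))
        print("mean retained signal:", beta_eb[:20].mean().round(2))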

  1. An overview of meta-analysis for clinicians.

    PubMed

    Lee, Young Ho

    2018-03-01

    The number of medical studies being published is increasing exponentially, and clinicians must routinely process large amounts of new information. Moreover, the results of individual studies are often insufficient to provide confident answers, as their results are not consistently reproducible. A meta-analysis is a statistical method for combining the results of different studies on the same topic and it may resolve conflicts among studies. Meta-analysis is being used increasingly and plays an important role in medical research. This review introduces the basic concepts, steps, advantages, and caveats of meta-analysis, to help clinicians understand it in clinical practice and research. A major advantage of a meta-analysis is that it produces a precise estimate of the effect size, with considerably increased statistical power, which is important when the power of the primary study is limited because of a small sample size. A meta-analysis may yield conclusive results when individual studies are inconclusive. Furthermore, meta-analyses investigate the source of variation and different effects among subgroups. In summary, a meta-analysis is an objective, quantitative method that provides less biased estimates on a specific topic. Understanding how to conduct a meta-analysis aids clinicians in the process of making clinical decisions.
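
    The power gain from pooling is visible in the inverse-variance machinery itself: each study is weighted by the reciprocal of its variance, and the pooled standard error shrinks as studies accumulate. A minimal fixed-effect sketch with made-up study results:

        import numpy as np

        # hypothetical standardized mean differences and their variances
        effects = np.array([0.35, 0.20, 0.41, 0.28, 0.15])
        variances = np.array([0.040, 0.055, 0.062, 0.035, 0.050])

        w = 1.0 / variances
        pooled = np.sum(w * effects) / np.sum(w)
        se = np.sqrt(1.0 / np.sum(w))      # smaller than any single-study SE
        print(f"pooled d = {pooled:.2f}, "
              f"95% CI ({pooled - 1.96 * se:.2f}, {pooled + 1.96 * se:.2f})")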

  2. In-situ structural integrity evaluation for high-power pulsed spallation neutron source - Effects of cavitation damage on structural vibration

    NASA Astrophysics Data System (ADS)

    Wan, Tao; Naoe, Takashi; Futakawa, Masatoshi

    2016-01-01

    A double-wall-structure mercury target will be installed at the high-power pulsed spallation neutron source in the Japan Proton Accelerator Research Complex (J-PARC). Cavitation damage on the inner wall is an important factor governing the lifetime of the target vessel. To monitor the structural integrity of the target vessel, the displacement velocity at a point on the outer surface of the target vessel is measured using a laser Doppler vibrometer (LDV). The measured signals can be used to evaluate the damage inside the target vessel due to cyclic loading and to cavitation bubble collapse caused by pulsed-beam-induced pressure waves. Wavelet differential analysis (WDA) was applied to reveal the effects of the damage on the cyclic vibration. To reduce the influence of noise superimposed on the vibration signals on the WDA results, the statistical methods analysis of variance (ANOVA) and analysis of covariance (ANCOVA) were applied. Results from laboratory experiments, numerical simulation results with random noise added, and target-vessel field data were analysed with the WDA and the statistical methods. The analyses demonstrated that the established in-situ diagnostic technique can effectively evaluate the structural response of the target vessel.

  3. Analysis of postoperative complications for superficial liposuction: a review of 2398 cases.

    PubMed

    Kim, Youn Hwan; Cha, Sang Myun; Naidu, Shenthilkumar; Hwang, Weon Jung

    2011-02-01

    Superficial liposuction has found application in creating and maximizing a lifting effect to achieve a better aesthetic result. Because of initially high complication rates, these procedures were generally regarded as risky. In response to increasing concerns over the safety and efficacy of superficial liposuction, the authors describe their 14-year experience of performing superficial liposuction and an analysis of the postoperative complications associated with these procedures. From March of 1995 to December of 2008, the authors performed superficial liposuction on 2398 patients. Three subgroups were defined according to liposuction method, as follows: power-assisted liposuction alone (subgroup 1), power-assisted liposuction combined with ultrasound energy (subgroup 2), and power-assisted liposuction combined with external ultrasound and postoperative Endermologie (subgroup 3). Statistical analyses of complications were performed among the subgroups. The mean age was 42.8 years, mean body mass index was 27.9 kg/m2, and mean volume of total aspiration was 5045 cc. The overall complication rate was 8.6 percent (206 patients), including four cases of skin necrosis and two cases of infection. The most common complication was postoperative contour irregularity. Power-assisted liposuction combined with external ultrasound, with or without postoperative Endermologie, was seen to decrease the overall complication rate, contour irregularity, and skin necrosis. There were no statistical differences regarding the other complications. Superficial liposuction carries potential risks of more complications than conventional suction techniques, especially postoperative contour irregularity, which can be minimized through proper selection of candidates for the procedure, avoiding overzealous suctioning of the superficial layer, and using a combination of ultrasound energy techniques.

  4. Game-Related Statistics Discriminating Between Starters and Nonstarter Players in the Women’s National Basketball Association League (WNBA)

    PubMed Central

    Gómez, Miguel-Ángel; Lorenzo, Alberto; Ortega, Enrique; Sampaio, Jaime; Ibáñez, Sergio-José

    2009-01-01

    The aim of the present study was to identify the game-related statistics that discriminate between starters and nonstarter players in women’s basketball in relation to winning or losing games and best or worst teams. The sample comprised all 216 regular-season games from the 2005 Women’s National Basketball Association League (WNBA). The game-related statistics included were 2- and 3-point field-goals (both successful and unsuccessful), free-throws (both successful and unsuccessful), defensive and offensive rebounds, assists, blocks, fouls, steals, turnovers, and minutes played. Results from multivariate analysis showed that when best teams won, the discriminant game-related statistics were successful 2-point field-goals (SC = 0.47), successful free-throws (SC = 0.44), fouls (SC = -0.41), assists (SC = 0.37), and defensive rebounds (SC = 0.37). When the worst teams won, the discriminant game-related statistics were successful 2-point field-goals (SC = 0.37), successful free-throws (SC = 0.45), assists (SC = 0.58), and steals (SC = 0.35). Successful 2-point field-goals, successful free-throws, and assists were the most powerful variables discriminating between starters and nonstarters, pointing to the importance of starters’ shooting and passing ability during competition. Key points: (1) players’ game-related statistical profiles varied according to team status, game outcome, and team quality in women’s basketball; (2) the results help describe differences in player performance between women’s and men’s basketball; (3) the results underline the contribution of starters and nonstarters to team performance in different game contexts; (4) successful 2-point field-goals, successful free-throws, and assists discriminated between starters and nonstarters in all the analyses. PMID:24149538
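
    Structure coefficients (SC) of the kind reported can be obtained as correlations between each game statistic and the discriminant score; the synthetic data and variable choices below are illustrative only and do not reproduce the WNBA dataset:

        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

        rng = np.random.default_rng(10)
        n = 400
        starter = rng.integers(0, 2, n)      # 0 = nonstarter, 1 = starter
        # columns: successful 2-point field-goals, successful free-throws, steals
        X = rng.normal(size=(n, 3)) + starter[:, None] * np.array([0.6, 0.4, 0.1])

        lda = LinearDiscriminantAnalysis().fit(X, starter)
        score = lda.transform(X).ravel()
        sc = [np.corrcoef(X[:, j], score)[0, 1] for j in range(X.shape[1])]
        print(np.round(sc, 2))               # structure coefficients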

  5. Power calculator for instrumental variable analysis in pharmacoepidemiology

    PubMed Central

    Walker, Venexia M; Davies, Neil M; Windmeijer, Frank; Burgess, Stephen; Martin, Richard M

    2017-01-01

    Background: Instrumental variable analysis, for example with physicians’ prescribing preferences as an instrument for medications issued in primary care, is an increasingly popular method in the field of pharmacoepidemiology. Existing power calculators for studies using instrumental variable analysis, such as Mendelian randomization power calculators, do not allow for the structure of research questions in this field. This is because the analysis in pharmacoepidemiology will typically have stronger instruments and detect larger causal effects than in other fields. Consequently, there is a need for dedicated power calculators for pharmacoepidemiological research. Methods and Results: The formula for calculating the power of a study using instrumental variable analysis in the context of pharmacoepidemiology is derived before being validated by a simulation study. The formula is applicable for studies using a single binary instrument to analyse the causal effect of a binary exposure on a continuous outcome. An online calculator, as well as packages in both R and Stata, are provided for the implementation of the formula by others. Conclusions: The statistical power of instrumental variable analysis in pharmacoepidemiological studies to detect a clinically meaningful treatment effect is an important consideration. Research questions in this field have distinct structures that must be accounted for when calculating power. The formula presented differs from existing instrumental variable power formulae due to its parametrization, which is designed specifically for ease of use by pharmacoepidemiologists. PMID:28575313
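
    The flavour of the calculation can be shown with a standard asymptotic approximation for a two-stage least-squares estimator, in which power depends on the sample size, the causal effect, and the instrument strength (the first-stage R^2). This generic formula is offered only as a sketch and is not the specific parametrization derived in the paper:

        from scipy.stats import norm

        def iv_power(n, beta, r2_instrument, sd_x=1.0, sd_y=1.0, alpha=0.05):
            """Approximate power of an IV analysis with a continuous outcome."""
            se = sd_y / (sd_x * (n * r2_instrument) ** 0.5)   # asymptotic SE of the IV estimate
            z_crit = norm.ppf(1 - alpha / 2)
            ncp = abs(beta) / se
            return norm.cdf(ncp - z_crit) + norm.cdf(-ncp - z_crit)

        # hypothetical: 20,000 patients, effect of 0.1 SD, instrument explaining
        # 10% of the variance in treatment received
        print(f"power = {iv_power(20000, 0.1, 0.10):.2f}")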

  6. Publication bias in obesity treatment trials?

    PubMed

    Allison, D B; Faith, M S; Gorman, B S

    1996-10-01

    The present investigation examined the extent of publication bias (namely, the tendency to publish significant findings and file away non-significant findings) within the obesity treatment literature. Quantitative literature synthesis of four published meta-analyses from the obesity treatment literature. Interventions in these studies included pharmacological, educational, child, and couples treatments. To assess publication bias, several regression procedures (for example, weighted least-squares, random-effects multi-level modeling, and robust regression methods) were used to regress effect sizes onto their standard errors, or proxies thereof, within each of the four meta-analyses. A significant positive beta weight in these analyses signified publication bias. There was evidence for publication bias within two of the four published meta-analyses, such that reviews of published studies were likely to overestimate clinical efficacy. The lack of evidence for publication bias within the other two meta-analyses might have been due to insufficient statistical power rather than the absence of selection bias. As in other disciplines, publication bias appears to exist in the obesity treatment literature. Suggestions are offered for managing publication bias once identified, or for reducing its likelihood in the first place.
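
    The detection logic can be illustrated with the simplest of the regression procedures mentioned: effect sizes regressed on their standard errors, where a significantly positive slope signals small-study (publication) bias. The simulated data and weighting choice are illustrative:

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(3)
        se = rng.uniform(0.05, 0.4, 40)
        # simulate bias: small (high-SE) studies report inflated effects
        effect = 0.2 + 1.0 * se + rng.normal(0.0, se)

        fit = sm.WLS(effect, sm.add_constant(se), weights=1.0 / se ** 2).fit()
        print(f"slope on SE = {fit.params[1]:.2f} (p = {fit.pvalues[1]:.3f})")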

  7. Use of Multivariate Linkage Analysis for Dissection of a Complex Cognitive Trait

    PubMed Central

    Marlow, Angela J.; Fisher, Simon E.; Francks, Clyde; MacPhie, I. Laurence; Cherny, Stacey S.; Richardson, Alex J.; Talcott, Joel B.; Stein, John F.; Monaco, Anthony P.; Cardon, Lon R.

    2003-01-01

    Replication of linkage results for complex traits has been exceedingly difficult, owing in part to the inability to measure the precise underlying phenotype, small sample sizes, genetic heterogeneity, and statistical methods employed in analysis. Often, in any particular study, multiple correlated traits have been collected, yet these have been analyzed independently or, at most, in bivariate analyses. Theoretical arguments suggest that full multivariate analysis of all available traits should offer more power to detect linkage; however, this has not yet been evaluated on a genomewide scale. Here, we conduct multivariate genomewide analyses of quantitative-trait loci that influence reading- and language-related measures in families affected with developmental dyslexia. The results of these analyses are substantially clearer than those of previous univariate analyses of the same data set, helping to resolve a number of key issues. These outcomes highlight the relevance of multivariate analysis for complex disorders for dissection of linkage results in correlated traits. The approach employed here may aid positional cloning of susceptibility genes in a wide spectrum of complex traits. PMID:12587094

  8. Statistical power and optimal design in experiments in which samples of participants respond to samples of stimuli.

    PubMed

    Westfall, Jacob; Kenny, David A; Judd, Charles M

    2014-10-01

    Researchers designing experiments in which a sample of participants responds to a sample of stimuli are faced with difficult questions about optimal study design. The conventional procedures of statistical power analysis fail to provide appropriate answers to these questions because they are based on statistical models in which stimuli are not assumed to be a source of random variation in the data, models that are inappropriate for experiments involving crossed random factors of participants and stimuli. In this article, we present new methods of power analysis for designs with crossed random factors, and we give detailed, practical guidance to psychology researchers planning experiments in which a sample of participants responds to a sample of stimuli. We extensively examine 5 commonly used experimental designs, describe how to estimate statistical power in each, and provide power analysis results based on a reasonable set of default parameter values. We then develop general conclusions and formulate rules of thumb concerning the optimal design of experiments in which a sample of participants responds to a sample of stimuli. We show that in crossed designs, statistical power typically does not approach unity as the number of participants goes to infinity but instead approaches a maximum attainable power value that is possibly small, depending on the stimulus sample. We also consider the statistical merits of designs involving multiple stimulus blocks. Finally, we provide a simple and flexible Web-based power application to aid researchers in planning studies with samples of stimuli.
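
    The headline result, power that saturates below 1 as participants are added, can be illustrated with a stylized variance decomposition: with m randomly sampled stimuli per condition carrying variance sigma_s^2, the condition contrast retains a variance floor of 2*sigma_s^2/m no matter how large the participant sample n becomes. The expressions below are a simplification for intuition, not the paper's exact design formulas:

        from scipy.stats import norm

        def approx_power(n, m, d=0.4, sigma_s=0.5, sigma_e=1.0, alpha=0.05):
            """Stylized power for a condition contrast with random stimuli (m per condition)."""
            var = 2 * sigma_s ** 2 / m + 2 * sigma_e ** 2 / (n * m)
            ncp = d / var ** 0.5
            zc = norm.ppf(1 - alpha / 2)
            return norm.cdf(ncp - zc) + norm.cdf(-ncp - zc)

        for n in (10, 50, 1000, 100000):
            print(n, round(approx_power(n, m=10), 3))   # power plateaus well below 1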

  9. Monte Carlo based statistical power analysis for mediation models: methods and software.

    PubMed

    Zhang, Zhiyong

    2014-12-01

    The existing literature on statistical power analysis for mediation models often assumes data normality and is based on a less powerful Sobel test instead of the more powerful bootstrap test. This study proposes to estimate statistical power to detect mediation effects on the basis of the bootstrap method through Monte Carlo simulation. Nonnormal data with excessive skewness and kurtosis are allowed in the proposed method. A free R package called bmem is developed to conduct the power analysis discussed in this study. Four examples, including a simple mediation model, a multiple-mediator model with a latent mediator, a multiple-group mediation model, and a longitudinal mediation model, are provided to illustrate the proposed method.
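
    The proposed procedure can be mimicked in outline: repeatedly simulate data from an assumed mediation model, test the indirect effect a*b with a percentile bootstrap in each replication, and take the rejection rate as the power estimate. The model, effect values, and replication counts are arbitrary, and this plain-Python sketch merely stands in for the authors' bmem R package:

        import numpy as np

        rng = np.random.default_rng(4)

        def simulate(n, a=0.3, b=0.3):
            x = rng.normal(size=n)
            m = a * x + rng.normal(size=n)
            y = b * m + rng.normal(size=n)
            return x, m, y

        def indirect(x, m, y):
            a = np.polyfit(x, m, 1)[0]   # slope of M on X
            b = np.linalg.lstsq(np.column_stack([m, x, np.ones_like(x)]),
                                y, rcond=None)[0][0]   # slope of Y on M given X
            return a * b

        def power(n, reps=100, boots=300, alpha=0.05):
            hits = 0
            for _ in range(reps):               # modest counts to keep the sketch fast
                x, m, y = simulate(n)
                idx = rng.integers(0, n, size=(boots, n))
                stats = np.array([indirect(x[i], m[i], y[i]) for i in idx])
                lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
                hits += (lo > 0) or (hi < 0)    # CI excludes zero -> rejection
            return hits / reps

        print(power(n=100))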

  10. Seeking a fingerprint: analysis of point processes in actigraphy recording

    NASA Astrophysics Data System (ADS)

    Gudowska-Nowak, Ewa; Ochab, Jeremi K.; Oleś, Katarzyna; Beldzik, Ewa; Chialvo, Dante R.; Domagalik, Aleksandra; Fąfrowicz, Magdalena; Marek, Tadeusz; Nowak, Maciej A.; Ogińska, Halszka; Szwed, Jerzy; Tyburczyk, Jacek

    2016-05-01

    Motor activity of humans displays complex temporal fluctuations which can be characterised by scale-invariant statistics, thus demonstrating that structure and fluctuations of such kinetics remain similar over a broad range of time scales. Previous studies on humans regularly deprived of sleep or suffering from sleep disorders predicted a change in the invariant scale parameters with respect to those for healthy subjects. In this study we investigate the signal patterns from actigraphy recordings by means of characteristic measures of fractional point processes. We analyse spontaneous locomotor activity of healthy individuals recorded during a week of regular sleep and a week of chronic partial sleep deprivation. Behavioural symptoms of lack of sleep can be evaluated by analysing statistics of duration times during active and resting states, and alteration of behavioural organisation can be assessed by analysis of power laws detected in the event count distribution, distribution of waiting times between consecutive movements and detrended fluctuation analysis of recorded time series. We claim that among different measures characterising complexity of the actigraphy recordings and their variations implied by chronic sleep distress, the exponents characterising slopes of survival functions in resting states are the most effective biomarkers distinguishing between healthy and sleep-deprived groups.
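
    The biomarker singled out, the slope of the survival function of rest durations, amounts to estimating a power-law tail exponent. A standard continuous maximum-likelihood estimator is sketched below on synthetic durations; the threshold and data are illustrative only:

        import numpy as np

        def tail_exponent(durations, xmin):
            """MLE for alpha in a density ~ x^(-alpha) for x >= xmin (survival slope alpha - 1)."""
            x = durations[durations >= xmin]
            return 1.0 + len(x) / np.sum(np.log(x / xmin))

        rng = np.random.default_rng(5)
        u = rng.uniform(size=5000)
        rest = 10.0 * u ** (-1.0 / 1.5)   # synthetic Pareto tail: survival exponent 1.5

        print(f"alpha = {tail_exponent(rest, xmin=10.0):.2f}")   # close to 2.5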

  11. Smart Sampling and HPC-based Probabilistic Look-ahead Contingency Analysis Implementation and its Evaluation with Real-world Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Yousu; Etingov, Pavel V.; Ren, Huiying

    This paper describes a probabilistic look-ahead contingency analysis application that incorporates smart sampling and high-performance computing (HPC) techniques. Smart sampling techniques are implemented to effectively represent the structure and statistical characteristics of uncertainty introduced by different sources in the power system. They can significantly reduce the data set size required for multiple look-ahead contingency analyses, and therefore reduce the time required to compute them. High-performance computing (HPC) techniques are used to further reduce computational time. These two techniques enable a predictive capability that forecasts the impact of various uncertainties on potential transmission limit violations. The developed package has been tested with real-world data from the Bonneville Power Administration. Case study results are presented to demonstrate the performance of the applications developed.

  12. Control of gaseous pollution via the leaves of non-edible trees

    NASA Astrophysics Data System (ADS)

    Al-Maliky, S. J. B.

    2015-11-01

    The accelerated increase in the use of various means of transportation, industrial machinery, and other power-consuming technologies has led to tremendous degradation of outdoor air quality all around the world. A green solution was tested here as an innovative means of gas control, using non-edible Myrtus communis leaves as a natural sorption medium. Statistical analyses were applied to examine the correlations between the various parameters of this study. Gas records around a tree exposed to the exhaust stream of 5 kW power generators demonstrated an excellent gas-control role of the green leaves, with average efficiencies of about 75% and 82% for the removal of nitrogen dioxide and carbon monoxide, respectively. An interesting finding of this research was that the sorption role of the green leaves increased leaf size and the chlorophyll content index.

  13. Power-law rheology controls aftershock triggering and decay

    PubMed Central

    Zhang, Xiaoming; Shcherbakov, Robert

    2016-01-01

    The occurrence of aftershocks is a signature of physical systems exhibiting relaxation phenomena. They are observed in various natural or experimental systems and usually obey several non-trivial empirical laws. Here we consider a cellular automaton realization of a nonlinear viscoelastic slider-block model in order to infer the physical mechanisms of triggering responsible for the occurrence of aftershocks. We show that nonlinear viscoelasticity plays a critical role in the occurrence of aftershocks. The model reproduces several empirical laws describing the statistics of aftershocks. In the case of earthquakes, the proposed model suggests that the power-law rheology of the fault gouge, underlying lower crust, and upper mantle controls the decay rate of aftershocks. This is verified by analysing several prominent aftershock sequences for which the rheological properties of the underlying crust and upper mantle have been established. PMID:27819355
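
    One of the empirical laws at stake is the modified Omori law for the aftershock rate, r(t) = K / (c + t)^p, whose exponent p is the decay rate the rheology is argued to control. A least-squares fit on synthetic rate data (the parameter values are generic, not taken from the paper):

        import numpy as np
        from scipy.optimize import curve_fit

        def omori(t, K, c, p):
            return K / (c + t) ** p

        t = np.linspace(0.1, 100.0, 200)    # days after the mainshock
        rng = np.random.default_rng(6)
        rate = omori(t, K=200.0, c=0.5, p=1.1) * rng.lognormal(0.0, 0.1, t.size)

        (K, c, p), _ = curve_fit(omori, t, rate, p0=(100.0, 1.0, 1.0))
        print(f"fitted decay exponent p = {p:.2f}")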

  14. Quantum fluctuation theorems and power measurements

    NASA Astrophysics Data System (ADS)

    Prasanna Venkatesh, B.; Watanabe, Gentaro; Talkner, Peter

    2015-07-01

    Work in the paradigm of the quantum fluctuation theorems of Crooks and Jarzynski is determined by projective measurements of energy at the beginning and end of the force protocol. In analogy to classical systems, we consider an alternative definition of work given by the integral of the supplied power, determined by integrating the results of repeated measurements of the instantaneous power during the force protocol. We observe that such a definition of work, despite accounting for the process dependence, has different possible values and statistics from the work determined by the conventional two-energy-measurement approach (TEMA). In the limit of many projective measurements of power, the system’s dynamics is frozen in the power measurement basis due to the quantum Zeno effect, leading to statistics only trivially dependent on the force protocol. In general the Jarzynski relation is not satisfied, except for the case when the instantaneous power operator commutes with the total Hamiltonian at all times. We also consider properties of the joint statistics of the power-based definition of work and TEMA work in protocols where both values are determined, which allows us to quantify their correlations. Relaxing the projective measurement condition, weak continuous measurements of power are considered within the stochastic master equation formalism. Even in this scenario the power-based work statistics is in general not able to reproduce qualitative features of the TEMA work statistics.

  15. Statistical Power of Psychological Research: What Have We Gained in 20 Years?

    ERIC Educational Resources Information Center

    Rossi, Joseph S.

    1990-01-01

    Calculated power for 6,155 statistical tests in 221 journal articles published in 1982 volumes of "Journal of Abnormal Psychology,""Journal of Consulting and Clinical Psychology," and "Journal of Personality and Social Psychology." Power to detect small, medium, and large effects was .17, .57, and .83, respectively. Concluded that power of…

  16. An overview of 37 randomised trials of blood pressure lowering agents among 270,000 individuals. World Health Organization-International Society of Hypertension Blood Pressure Lowering Treatment Trialists' Collaboration.

    PubMed

    Neal, B; MacMahon, S

    1999-01-01

    Overviews (meta-analyses) of the major ongoing randomized trials of blood pressure lowering drugs will be conducted to determine the effects of: first, newer versus older classes of blood pressure lowering drugs in patients with hypertension; and second, blood pressure lowering treatments versus untreated or less treated control conditions in patient groups at high risk of cardiovascular events. The principal study outcomes are stroke, coronary heart disease, total cardiovascular events and total cardiovascular deaths. The overviews have been prospectively designed and will be conducted on individual patient data. The analyses will be conducted as a collaboration between the principal investigators of participating trials involving about 270,000 patients. Full data should be available in 2003, with the first round of analyses performed in 1999-2000. The combination of trial results should provide good statistical power to detect even modest differences between the effects on the main study outcomes.

  17. Discriminating low frequency components from long range persistent fluctuations in daily atmospheric temperature variability

    NASA Astrophysics Data System (ADS)

    Lanfredi, M.; Simoniello, T.; Cuomo, V.; Macchiato, M.

    2009-02-01

    This study originated from recent results reported in the literature that support the existence of long-range (power-law) persistence in atmospheric temperature fluctuations on monthly and inter-annual scales. We investigated the results of Detrended Fluctuation Analysis (DFA) carried out on twenty-two historical daily time series recorded in Europe in order to evaluate the reliability of such findings in depth. More detailed inspection revealed systematic deviations from power-law behaviour, with high statistical confidence that the functional form was misspecified. Rigorous analyses did not support scale-free correlation as an operative concept for climate modelling, as instead suggested in the literature. In order to better understand the physical implications of our results, we designed a bivariate Markov process, parameterised on the basis of the atmospheric observational data by introducing a slow dummy variable. The time series generated by this model, analysed both in the time and frequency domains, tallied with the real ones very well. They accounted both for the deceptive scaling found in the literature and for the correlation details revealed by our analysis. Our results point to the presence of slow fluctuations from another climatic sub-system, such as the ocean, which inflate temperature variance on scales up to several months. They advise more careful re-analyses of temperature time series before suggesting dynamical paradigms for climate modelling and for the assessment of climate change.
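
    For reference, the core of DFA, whose output the paper scrutinises, integrates the series, fits polynomial trends in windows of size s, and tracks how the residual fluctuation F(s) grows with s; a straight line in log-log coordinates (F ~ s^alpha) is what gets read as scale-free persistence, and the curvature reported here is a systematic deviation from that line. A compact generic implementation:

        import numpy as np

        def dfa(x, scales, order=1):
            """Detrended fluctuation analysis: F(s) for each window size s."""
            y = np.cumsum(x - np.mean(x))       # integrated (profile) series
            F = []
            for s in scales:
                n_win = len(y) // s
                segs = y[: n_win * s].reshape(n_win, s)
                t = np.arange(s)
                res = [seg - np.polyval(np.polyfit(t, seg, order), t) for seg in segs]
                F.append(np.sqrt(np.mean(np.square(res))))
            return np.array(F)

        rng = np.random.default_rng(7)
        x = rng.normal(size=4096)               # white noise should give alpha ~ 0.5
        scales = np.unique(np.logspace(1, 3, 12).astype(int))
        F = dfa(x, scales)
        alpha = np.polyfit(np.log(scales), np.log(F), 1)[0]
        print(f"alpha = {alpha:.2f}")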

  18. Computed statistics at streamgages, and methods for estimating low-flow frequency statistics and development of regional regression equations for estimating low-flow frequency statistics at ungaged locations in Missouri

    USGS Publications Warehouse

    Southard, Rodney E.

    2013-01-01

    The weather and precipitation patterns in Missouri vary considerably from year to year. In 2008, the statewide average rainfall was 57.34 inches, and in 2012 it was 30.64 inches. This variability in precipitation and resulting streamflow underlies the necessity for water managers and users to have reliable streamflow statistics and a means to compute selected statistics at ungaged locations for a better understanding of water availability. Knowledge of surface-water availability is dependent on the streamflow data that have been collected and analyzed by the U.S. Geological Survey for more than 100 years at approximately 350 streamgages throughout Missouri. The U.S. Geological Survey, in cooperation with the Missouri Department of Natural Resources, computed streamflow statistics at streamgages through the 2010 water year, defined periods of drought, defined methods to estimate streamflow statistics at ungaged locations, and developed regional regression equations to compute selected streamflow statistics at ungaged locations. Streamflow statistics and flow durations were computed for 532 streamgages in Missouri and in neighboring States. For streamgages with more than 10 years of record, Kendall’s tau was computed to test for trends in the streamflow data. If trends were detected, the variable-length method was used to define the period of no trend: water years were removed from the beginning of the record for a streamgage until no trend was detected. Low-flow frequency statistics were then computed for the entire period of record and for the period of no trend if 10 or more years of record were available for each analysis. Three methods are presented for computing selected streamflow statistics at ungaged locations. The first method uses power-curve equations developed for 28 selected streams in Missouri and neighboring States that have multiple streamgages on the same stream. A statistic can be estimated at an ungaged location on one of these streams if its drainage area is between 40 percent of the drainage area of the farthest upstream streamgage and 150 percent of the drainage area of the farthest downstream streamgage along the stream of interest. The second method may be used on any stream with a streamgage that has operated for 10 years or longer and for which anthropogenic effects have not changed the low-flow characteristics at the ungaged location since collection of the streamflow data. A ratio of the drainage area of the stream at the ungaged location to the drainage area at the streamgage is computed to estimate the statistic at the ungaged location. The range of applicability is between 40 and 150 percent of the drainage area of the streamgage, and the ungaged location must be located on the same stream as the streamgage. The third method uses regional regression equations to estimate selected low-flow frequency statistics for unregulated streams in Missouri. This report presents regression equations to estimate frequency statistics for the 10-year recurrence interval and for the N-day durations of 1, 2, 3, 7, 10, 30, and 60 days. Basin and climatic characteristics were computed using geographic information system software and digital geospatial data. A total of 35 characteristics were computed for use in preliminary statewide and regional regression analyses based on existing digital geospatial data and previous studies.
    Spatial analyses for geographical bias in the predictive accuracy of the regional regression equations defined three low-flow regions within the State, representing its three major physiographic provinces: Region 1 includes the Central Lowlands, Region 2 the Ozark Plateaus, and Region 3 the Mississippi Alluvial Plain. A total of 207 streamgages were used in the regression analyses for the regional equations; of these, 77 were located in Region 1, 120 in Region 2, and 10 in Region 3. Streamgages located outside of Missouri were selected to extend the range of data used for the independent variables in the regression analyses. Streamgages included in the regression analyses had 10 or more years of record and were considered to be affected minimally by anthropogenic activities or trends. Regional regression analyses identified three characteristics as statistically significant for the development of regional equations. For Region 1, drainage area, longest flow path, and streamflow-variability index were statistically significant, with a standard error of estimate ranging from 79.6 to 94.2 percent. For Region 2, drainage area and streamflow-variability index were statistically significant, and the standard error of estimate ranges from 48.2 to 72.1 percent. For Region 3, drainage area and streamflow-variability index also were statistically significant, with a standard error of estimate ranging from 48.1 to 96.2 percent. Limitations on the use of these methods for estimating low-flow frequency statistics at ungaged locations depend on the method used. The first method, power-curve equations, was developed to estimate the selected statistics for ungaged locations on 28 selected streams with multiple streamgages located on the same stream. The second method uses a drainage-area ratio to compute statistics at an ungaged location using data from a single streamgage on the same stream with 10 or more years of record; it can be used if the drainage area of the ungaged location is within 40 to 150 percent of the streamgage drainage area. The third method is the use of the regional regression equations, whose limits are based on the ranges of the characteristics used as independent variables; streams must also be affected minimally by anthropogenic activities.
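
    The second method is simple enough to state in a few lines: scale the statistic by the drainage-area ratio after checking the 40- to 150-percent applicability window. The numbers are invented for illustration:

        def scale_low_flow(q_gaged, area_gaged, area_ungaged):
            """Drainage-area-ratio transfer of a low-flow statistic to an ungaged site."""
            ratio = area_ungaged / area_gaged
            if not 0.40 <= ratio <= 1.50:
                raise ValueError("outside the 40- to 150-percent applicability range")
            return q_gaged * ratio

        # hypothetical: a 7-day, 10-year low flow of 12.0 ft3/s at a streamgage draining
        # 250 mi2, transferred to an ungaged site draining 180 mi2 on the same stream
        print(scale_low_flow(12.0, 250.0, 180.0))   # 8.64 ft3/s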

  19. Experimental design, power and sample size for animal reproduction experiments.

    PubMed

    Chapman, Phillip L; Seidel, George E

    2008-01-01

    The present paper concerns statistical issues in the design of animal reproduction experiments, with emphasis on the problems of sample size determination and power calculations. We include examples and non-technical discussions aimed at helping researchers avoid serious errors that may invalidate or seriously impair the validity of conclusions from experiments. Screen shots from interactive power calculation programs and basic SAS power calculation programs are presented to aid in understanding statistical power and computing power in some common experimental situations. Practical issues that are common to most statistical design problems are briefly discussed. These include one-sided hypothesis tests, power level criteria, equality of within-group variances, transformations of response variables to achieve variance equality, optimal specification of treatment group sizes, 'post hoc' power analysis and arguments for the increased use of confidence intervals in place of hypothesis tests.
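
    An equivalent of the interactive calculations described, here using Python's statsmodels rather than SAS: solve for the group size giving 80% power to detect a one-standard-deviation difference between two treatment groups (the effect size and alpha are illustrative):

        from statsmodels.stats.power import TTestIndPower

        n_per_group = TTestIndPower().solve_power(effect_size=1.0, alpha=0.05, power=0.80)
        print(round(n_per_group))   # about 17 animals per group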

  1. Effects of system factors on the economics of and demand for small solar thermal power systems

    NASA Technical Reports Server (NTRS)

    1981-01-01

    Market penetration as a function of time, SPS performance factors, and market/economic considerations was estimated, and commercialization strategies were formulated. A market-analysis task included personal interviews and supplemental mail surveys to acquire statistical data and to identify and measure the attitudes, reactions, and intentions of prospective SPS users. Interviews encompassed three ownership classes of electric utilities and industrial firms in the SIC codes for energy consumption. A market demand model was developed that utilized the data base developed, together with projected energy price and consumption data, to perform sensitivity analyses and estimate the potential market for SPS.

  2. Development of an automated processing and screening system for the space shuttle orbiter flight test data

    NASA Technical Reports Server (NTRS)

    Mccutchen, D. K.; Brose, J. F.; Palm, W. E.

    1982-01-01

    One nemesis of the structural dynamist is the tedious task of reviewing large quantities of data. This data, obtained from various types of instrumentation, may be represented by oscillogram records, root-mean-squared (rms) time histories, power spectral densities, shock spectra, 1/3 octave band analyses, and various statistical distributions. In an attempt to reduce the laborious task of manually reviewing all of the space shuttle orbiter wideband frequency-modulated (FM) analog data, an automated processing system was developed to perform the screening process based upon predefined or predicted threshold criteria.

  3. The use of statistical tools in field testing of putative effects of genetically modified plants on nontarget organisms.

    PubMed

    Semenov, Alexander V; van Elsas, Jan Dirk; Glandorf, Debora C M; Schilthuizen, Menno; de Boer, Willem F

    2013-08-01

    To fulfill existing guidelines, applicants who aim to place their genetically modified (GM) insect-resistant crop plants on the market are required to provide data from field experiments that address the potential impacts of the GM plants on nontarget organisms (NTOs). Such data may be based on varied experimental designs. The recent EFSA guidance document for environmental risk assessment (2010) does not provide clear and structured suggestions that address the statistics of field trials on effects on NTOs. This review examines existing practices in GM plant field testing, such as the way of randomization, replication, and pseudoreplication. Emphasis is placed on the importance of the design features used for field trials in which effects on NTOs are assessed. The importance of statistical power and the positive and negative aspects of various statistical models are discussed. Equivalence and difference testing are compared, and the importance of checking the distribution of experimental data is stressed for the selection of the proper statistical model. While for continuous data (e.g., pH and temperature) classical statistical approaches, for example analysis of variance (ANOVA), are appropriate, for discontinuous data (counts) only generalized linear models (GLMs) are shown to be efficient. There is no golden rule as to which statistical test is the most appropriate for any experimental situation; in particular, in experiments in which block designs are used and covariates play a role, GLMs should be used. The combination of decision trees and a checklist for field trials, which are provided, will help in the interpretation of the statistical analyses of field trials and in assessing whether such analyses were correctly applied. We offer generic advice to risk assessors and applicants that will help both in the setting up of field testing and in the interpretation and analysis of the data obtained.
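
    The count-data recommendation can be made concrete: fit a generalized linear model with a Poisson (or negative binomial) family instead of forcing counts through ANOVA. The trial layout and all variable names are invented for the sketch:

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(8)
        plots = pd.DataFrame({
            "treatment": np.repeat(["GM", "isoline"], 40),   # 40 plots per arm
            "block": np.tile(np.arange(8), 10),              # randomized blocks
        })
        mu = np.where(plots["treatment"] == "GM", 5.0, 6.0)  # mean NTO counts per plot
        plots["count"] = rng.poisson(mu)

        fit = smf.glm("count ~ treatment + C(block)", data=plots,
                      family=sm.families.Poisson()).fit()
        print(fit.summary().tables[1])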

  4. How to Make Nothing Out of Something: Analyses of the Impact of Study Sampling and Statistical Interpretation in Misleading Meta-Analytic Conclusions

    PubMed Central

    Cunningham, Michael R.; Baumeister, Roy F.

    2016-01-01

    The limited resource model states that self-control is governed by a relatively finite set of inner resources on which people draw when exerting willpower. Once self-control resources have been used up or depleted, they are less available for other self-control tasks, leading to a decrement in subsequent self-control success. The depletion effect has been studied for over 20 years, tested or extended in more than 600 studies, and supported in an independent meta-analysis (Hagger et al., 2010). Meta-analyses are supposed to reduce bias in literature reviews. Carter et al.’s (2015) meta-analysis, by contrast, included a series of questionable decisions involving sampling, methods, and data analysis. We provide quantitative analyses of key sampling issues: exclusion of many of the best depletion studies based on idiosyncratic criteria and the emphasis on mini meta-analyses with low statistical power as opposed to the overall depletion effect. We discuss two key methodological issues: failure to code for research quality, and the quantitative impact of weak studies by novice researchers. We discuss two key data analysis issues: questionable interpretation of the results of trim and fill and Funnel Plot Asymmetry test procedures, and the use and misinterpretation of the untested Precision Effect Test and Precision Effect Estimate with Standard Error (PEESE) procedures. Despite these serious problems, the Carter et al. (2015) meta-analysis results actually indicate that there is a real depletion effect – contrary to their title. PMID:27826272

  5. Statistical power analysis of cardiovascular safety pharmacology studies in conscious rats.

    PubMed

    Bhatt, Siddhartha; Li, Dingzhou; Flynn, Declan; Wisialowski, Todd; Hemkens, Michelle; Steidl-Nichols, Jill

    2016-01-01

    Cardiovascular (CV) toxicity and related attrition are a major challenge for novel therapeutic entities, and identifying CV liability early is critical for effective derisking. CV safety pharmacology studies in rats are a valuable tool for early investigation of CV risk. Thorough understanding of the data analysis techniques and statistical power of these studies is currently lacking and is imperative for sound decision-making. Data from 24 crossover and 12 parallel design CV telemetry rat studies were used for statistical power calculations. Average values of telemetry parameters (heart rate, blood pressure, body temperature, and activity) were logged every 60 s (from 1 h predose to 24 h post-dose) and reduced to 15-min mean values. These data were subsequently binned into super intervals for statistical analysis. A repeated-measures analysis of variance was used for statistical analysis of crossover studies, and a repeated-measures analysis of covariance was used for parallel studies. Statistical power analysis was performed to generate power curves and establish relationships between detectable CV (blood pressure and heart rate) changes and statistical power. Additionally, data from a crossover CV study with phentolamine at 4, 20 and 100 mg/kg are reported as a representative example of the data analysis methods. Phentolamine produced a CV profile characteristic of alpha-adrenergic receptor antagonism, evidenced by a dose-dependent decrease in blood pressure and reflex tachycardia. Detectable blood pressure changes at 80% statistical power for crossover studies (n=8) were 4-5 mmHg. For parallel studies (n=8), detectable changes at 80% power were 6-7 mmHg. Detectable heart rate changes for both study designs were 20-22 bpm. Based on our results, the conscious rat CV model is a sensitive tool to detect and mitigate CV risk in early safety studies. Furthermore, these results will enable informed selection of appropriate models and study design for early-stage CV studies. Copyright © 2016 Elsevier Inc. All rights reserved.
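
    The detectable-change figures can be approximated with a generic paired-design calculation: solve for the standardized effect detectable at 80% power with n = 8, then rescale by the within-animal standard deviation. The SD is a placeholder, and this one-sample simplification ignores the repeated-measures structure of the actual analysis:

        from statsmodels.stats.power import TTestPower

        es = TTestPower().solve_power(nobs=8, alpha=0.05, power=0.80)  # paired/one-sample design
        sd_within = 3.5                                                # hypothetical SD, mmHg
        print(f"detectable change ~ {es * sd_within:.1f} mmHg")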

  6. Nuclear DNA analyses in genetic studies of populations: practice, problems and prospects.

    PubMed

    Zhang, De-Xing; Hewitt, Godfrey M

    2003-03-01

    Population-genetic studies have been remarkably productive and successful in the last decade following the invention of PCR technology and the introduction of mitochondrial and microsatellite DNA markers. While mitochondrial DNA has proven powerful for genealogical and evolutionary studies of animal populations, and microsatellite sequences are the most revealing DNA markers available so far for inferring population structure and dynamics, they both have important and unavoidable limitations. To obtain a fuller picture of the history and evolutionary potential of populations, genealogical data from nuclear loci are essential, and the inclusion of other nuclear markers, i.e. single copy nuclear polymorphic (scnp) sequences, is clearly needed. Four major uncertainties confront nuclear DNA analyses of populations: the availability of scnp markers for carrying out such analyses; technical laboratory hurdles for resolving haplotypes; difficulty in data analysis because of recombination, low divergence levels and intraspecific multifurcation evolution; and the utility of scnp markers for addressing population-genetic questions. In this review, we discuss the availability of highly polymorphic single copy DNA in the nuclear genome, describe patterns and rates of evolution of nuclear sequences, summarize past empirical and theoretical efforts to recover and analyse data from scnp markers, and examine the difficulties, challenges and opportunities faced in such studies. We show that although challenges still exist, the above-mentioned obstacles are now being removed. Recent advances in technology and increases in statistical power provide the prospect of nuclear DNA analyses becoming routine practice, allowing allele-discriminating characterization of scnp loci and microsatellite loci. This will certainly increase our ability to address more complex questions, and thereby the sophistication of genetic analyses of populations.

  7. Discriminatory power of water polo game-related statistics at the 2008 Olympic Games.

    PubMed

    Escalante, Yolanda; Saavedra, Jose M; Mansilla, Mirella; Tella, Victor

    2011-02-01

    The aims of this study were (1) to compare water polo game-related statistics by context (winning and losing teams) and sex (men and women), and (2) to identify characteristics discriminating the performances for each sex. The game-related statistics of the 64 matches (44 men's and 20 women's) played in the final phase of the Olympic Games held in Beijing in 2008 were analysed. Unpaired t-tests compared winners and losers and men and women, and confidence intervals and effect sizes of the differences were calculated. The results were subjected to a discriminant analysis to identify the differentiating game-related statistics of the winning and losing teams. The results showed the differences between winning and losing men's teams to be in both defence and offence, whereas in women's teams they were only in offence. In men's games, passing (assists), aggressive play (exclusions), centre position effectiveness (centre shots), and goalkeeper defence (goalkeeper-blocked 5-m shots) predominated, whereas in women's games the play was more dynamic (possessions). The variable that most discriminated performance in men was goalkeeper-blocked shots, and in women shooting effectiveness (shots). These results should help coaches when planning training and competition.

  8. How Many Studies Do You Need? A Primer on Statistical Power for Meta-Analysis

    ERIC Educational Resources Information Center

    Valentine, Jeffrey C.; Pigott, Therese D.; Rothstein, Hannah R.

    2010-01-01

    In this article, the authors outline methods for using fixed and random effects power analysis in the context of meta-analysis. Like statistical power analysis for primary studies, power analysis for meta-analysis can be done either prospectively or retrospectively and requires assumptions about parameters that are unknown. The authors provide…
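
    The fixed-effect case has a convenient closed form: with k studies of roughly common within-study variance v, the pooled estimate has variance v/k, and power follows from a normal test. A generic sketch with all parameter values assumed:

        from scipy.stats import norm

        def fe_meta_power(k, delta=0.2, v=0.04, alpha=0.05):
            """Power of a fixed-effect meta-analysis of k studies with common variance v."""
            se = (v / k) ** 0.5
            zc = norm.ppf(1 - alpha / 2)
            return norm.cdf(delta / se - zc) + norm.cdf(-delta / se - zc)

        for k in (2, 5, 10, 20):
            print(k, round(fe_meta_power(k), 2))   # power rises as studies accumulate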

  9. Monitoring Statistics Which Have Increased Power over a Reduced Time Range.

    ERIC Educational Resources Information Center

    Tang, S. M.; MacNeill, I. B.

    1992-01-01

    The problem of monitoring trends for changes at unknown times is considered. Statistics that permit one to focus high power on a segment of the monitored period are studied. Numerical procedures are developed to compute the null distribution of these statistics. (Author)

  10. The 1993 Mississippi river flood: A one hundred or a one thousand year event?

    USGS Publications Warehouse

    Malamud, B.D.; Turcotte, D.L.; Barton, C.C.

    1996-01-01

    Power-law (fractal) extreme-value statistics are applicable to many natural phenomena under a wide variety of circumstances. Data from a hydrologic station in Keokuk, Iowa, show that the great flood of the Mississippi River in 1993 has a recurrence interval on the order of 100 years using power-law statistics applied to a partial-duration flood series, and on the order of 1,000 years using a log-Pearson type 3 (LP3) distribution applied to an annual series. The LP3 analysis is the federally adopted probability distribution for flood-frequency estimation of extreme events. We suggest that power-law statistics are preferable to LP3 analysis. As a further test of the power-law approach we consider paleoflood data from the Colorado River. We compare power-law and LP3 extrapolations of historical data with these paleofloods. The results are remarkably similar to those obtained for the Mississippi River: recurrence intervals from power-law statistics applied to Lees Ferry discharge data are generally consistent with inferred 100- and 1,000-year paleofloods, whereas LP3 analysis gives recurrence intervals that are orders of magnitude longer. For both the Keokuk and Lees Ferry gauges, the use of an annual series introduces an artificial curvature in log-log space that leads to an underestimate of severe floods. Power-law statistics predict much shorter recurrence intervals than the federally adopted LP3 statistics. We suggest that if power-law behavior is applicable, then the likelihood of severe floods is much higher. More conservative dam designs and land-use restrictions may be required.
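
    The power-law side of the comparison can be sketched directly: rank the peak discharges in a partial-duration series, assign recurrence intervals T = record length / rank, fit a straight line in log-log space, and extrapolate. The discharges below are synthetic stand-ins, not the Keokuk record:

        import numpy as np

        rng = np.random.default_rng(9)
        record_years = 80
        # synthetic partial-duration flood peaks (cfs) with a heavy tail
        peaks = np.sort(10000.0 * rng.pareto(2.0, 300) + 5000.0)[::-1]

        rank = np.arange(1, len(peaks) + 1)
        T = record_years / rank                         # recurrence interval in years

        slope, intercept = np.polyfit(np.log10(T), np.log10(peaks), 1)
        q100 = 10 ** (intercept + slope * np.log10(100.0))
        print(f"power-law 100-year flood ~ {q100:,.0f} cfs")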

  11. Hydrogen Fuel Cell Performance as Telecommunications Backup Power in the United States

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kurtz, Jennifer; Saur, Genevieve; Sprik, Sam

    2015-03-01

    Working in collaboration with the U.S. Department of Energy (DOE) and industry project partners, the National Renewable Energy Laboratory (NREL) acts as the central data repository for the data collected from real-world operation of fuel cell backup power systems. With American Recovery and Reinvestment Act of 2009 (ARRA) co-funding awarded through DOE's Fuel Cell Technologies Office, more than 1,300 fuel cell units were deployed over a three-plus-year period in stationary, material handling equipment, auxiliary power, and backup power applications. This surpassed a Fuel Cell Technologies Office ARRA objective to spur commercialization of an early market technology by installing 1,000 fuel cell units across several different applications, including backup power. By December 2013, 852 backup power units out of 1,330 fuel cell units deployed were providing backup service, mainly for telecommunications towers. For 136 of the fuel cell backup units, project participants provided detailed operational data to the National Fuel Cell Technology Evaluation Center for analysis by NREL's technology validation team. NREL analyzed operational data collected from these government co-funded demonstration projects to characterize key fuel cell backup power performance metrics, including reliability and operation trends, and to highlight the business case for using fuel cells in these early market applications. NREL's analyses include these critical metrics, along with deployment, U.S. grid outage statistics, and infrastructure operation.

  14. [Development of an Excel spreadsheet for meta-analysis of indirect and mixed treatment comparisons].

    PubMed

    Tobías, Aurelio; Catalá-López, Ferrán; Roqué, Marta

    2014-01-01

    Meta-analyses in clinical research usually aim to evaluate treatment efficacy and safety through direct comparison with a single comparator. Indirect comparisons, using Bucher's method, can summarize primary data when information from direct comparisons is limited or nonexistent. Mixed comparisons combine estimates from direct and indirect comparisons, increasing statistical power. There is a need for simple applications for meta-analysis of indirect and mixed comparisons, which can easily be conducted using a Microsoft Office Excel spreadsheet. We developed a user-friendly spreadsheet for indirect and mixed comparisons, aimed at clinical researchers who are interested in systematic reviews but not familiar with more advanced statistical packages. The proposed Excel spreadsheet for indirect and mixed comparisons can be of great use in clinical epidemiology to extend the knowledge provided by traditional meta-analysis when evidence from direct comparisons is limited or nonexistent.
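
    Bucher's adjusted indirect comparison is simple enough to state in a few lines. The sketch below, in Python rather than Excel, shows the arithmetic such a spreadsheet automates; the effect sizes and standard errors are invented for illustration.

    ```python
    import math

    def bucher_indirect(d_ac, se_ac, d_bc, se_bc):
        """Adjusted indirect comparison of A vs B through a common comparator C.
        Effects are on an additive scale, e.g. log odds ratios."""
        d_ab = d_ac - d_bc
        se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
        return d_ab, se_ab

    def mixed_estimate(d_dir, se_dir, d_ind, se_ind):
        """Inverse-variance weighted combination of direct and indirect estimates."""
        w_dir, w_ind = 1 / se_dir ** 2, 1 / se_ind ** 2
        d = (w_dir * d_dir + w_ind * d_ind) / (w_dir + w_ind)
        return d, math.sqrt(1 / (w_dir + w_ind))

    # Invented log odds ratios: A vs C and B vs C from separate trial pools.
    d_ind, se_ind = bucher_indirect(-0.40, 0.15, -0.10, 0.20)
    d_mix, se_mix = mixed_estimate(-0.35, 0.18, d_ind, se_ind)
    print(f"indirect A vs B: {d_ind:.2f} (SE {se_ind:.2f})")
    print(f"mixed    A vs B: {d_mix:.2f} (SE {se_mix:.2f})")
    ```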

  15. The Growth of the User Community of the La Silla Paranal Observatory Science Archive

    NASA Astrophysics Data System (ADS)

    Romaniello, M.; Arnaboldi, M.; Da Rocha, C.; De Breuck, C.; Delmotte, N.; Dobrzycki, A.; Fourniol, N.; Freudling, W.; Mascetti, L.; Micol, A.; Retzlaff, J.; Sterzik, M.; Sequeiros, I. V.; De Breuck, M. V.

    2016-03-01

    The archive of the La Silla Paranal Observatory has grown steadily into a powerful science resource for the ESO astronomical community. Established in 1998, the Science Archive Facility (SAF) stores both the raw data generated by all ESO instruments and selected processed (science-ready) data. The growth of the SAF user community is analysed through access and publication statistics. Statistics are presented for archival users, who do not contribute to observing proposals, and contrasted with regular and archival users, who are successful in competing for observing time. Archival data from the SAF contribute to about one paper out of four that use data from ESO facilities. This study reveals that the blend of users constitutes a mixture of the traditional ESO community making novel use of the data and of a new community being built around the SAF.

  16. Low-dose ionizing radiation increases the mortality risk of solid cancers in nuclear industry workers: A meta-analysis

    PubMed Central

    Qu, Shu-Gen; Gao, Jin; Tang, Bo; Yu, Bo; Shen, Yue-Ping; Tu, Yu

    2018-01-01

    Low-dose ionizing radiation (LDIR) may increase the mortality of solid cancers in nuclear industry workers, but only a few individual cohort studies exist, and the available reports have low statistical power. The aim of the present study was to assess solid cancer mortality risk from LDIR in the nuclear industry using standardized mortality ratios (SMRs) and 95% confidence intervals. A systematic literature search through the PubMed and Embase databases identified 27 studies relevant to this meta-analysis. There was statistical significance for total, solid and lung cancers, with meta-SMR values of 0.88, 0.80, and 0.89, respectively. There was evidence of stochastic effects of IR, but more definitive conclusions require additional analyses using standardized protocols to determine whether LDIR increases the risk of solid cancer-related mortality. PMID:29725540

  17. PROMISE: a tool to identify genomic features with a specific biologically interesting pattern of associations with multiple endpoint variables.

    PubMed

    Pounds, Stan; Cheng, Cheng; Cao, Xueyuan; Crews, Kristine R; Plunkett, William; Gandhi, Varsha; Rubnitz, Jeffrey; Ribeiro, Raul C; Downing, James R; Lamba, Jatinder

    2009-08-15

    In some applications, prior biological knowledge can be used to define a specific pattern of association of multiple endpoint variables with a genomic variable that is biologically most interesting. However, to our knowledge, there is no statistical procedure designed to detect specific patterns of association with multiple endpoint variables. Projection onto the most interesting statistical evidence (PROMISE) is proposed as a general procedure to identify genomic variables that exhibit a specific biologically interesting pattern of association with multiple endpoint variables. Biological knowledge of the endpoint variables is used to define a vector that represents the biologically most interesting values for statistics that characterize the associations of the endpoint variables with a genomic variable. A test statistic is defined as the dot-product of the vector of the observed association statistics and the vector of the most interesting values of the association statistics. By definition, this test statistic is proportional to the length of the projection of the observed vector of correlations onto the vector of most interesting associations. Statistical significance is determined via permutation. In simulation studies and an example application, PROMISE shows greater statistical power to identify genes with the interesting pattern of associations than classical multivariate procedures, individual endpoint analyses or listing genes that have the pattern of interest and are significant in more than one individual endpoint analysis. Documented R routines are freely available from www.stjuderesearch.org/depts/biostats and will soon be available as a Bioconductor package from www.bioconductor.org.
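
    The core of the PROMISE statistic, as described above, is a projection: the dot product between per-endpoint association statistics and a pre-specified pattern vector, with significance assessed by permutation. A minimal sketch on invented data follows; the function name and the use of Pearson correlations as the association statistics are illustrative choices, not the package's exact interface.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def promise_like_stat(genomic, endpoints, pattern):
        """Dot product of per-endpoint association statistics (Pearson
        correlations here) with the pre-specified 'interesting' pattern."""
        r = np.array([np.corrcoef(genomic, e)[0, 1] for e in endpoints])
        return float(r @ pattern)

    # Invented data: one genomic variable and three endpoints; the prior
    # pattern says endpoints 1 and 2 associate positively, endpoint 3 negatively.
    x = rng.normal(size=60)
    endpoints = [x + rng.normal(size=60),
                 0.5 * x + rng.normal(size=60),
                 -x + rng.normal(size=60)]
    pattern = np.array([1.0, 1.0, -1.0])

    obs = promise_like_stat(x, endpoints, pattern)

    # Permutation null: shuffling the genomic variable breaks all associations.
    perm = [promise_like_stat(rng.permutation(x), endpoints, pattern)
            for _ in range(2000)]
    p = (1 + sum(abs(t) >= abs(obs) for t in perm)) / (1 + len(perm))
    print(f"statistic = {obs:.3f}, permutation p = {p:.4f}")
    ```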

  18. "What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"

    ERIC Educational Resources Information Center

    Ozturk, Elif

    2012-01-01

    The present paper reviews two motivations for conducting "what if" analyses using Excel and "R" to understand statistical significance tests in the context of sample size. "What if" analyses can be used to teach students what statistical significance tests really do and in applied research either prospectively to estimate what sample size…

  19. Potentiation Effects of Half-Squats Performed in a Ballistic or Nonballistic Manner.

    PubMed

    Suchomel, Timothy J; Sato, Kimitake; DeWeese, Brad H; Ebben, William P; Stone, Michael H

    2016-06-01

    This study examined and compared the acute effects of ballistic and nonballistic concentric-only half-squats (COHSs) on squat jump performance. Fifteen resistance-trained men performed a squat jump 2 minutes after a control protocol or 2 COHSs at 90% of their 1 repetition maximum (1RM) COHS performed in a ballistic or nonballistic manner. Jump height (JH), peak power (PP), and allometrically scaled peak power (PPa) were compared using three 3 × 2 repeated-measures analyses of variance. Statistically significant condition × time interaction effects existed for JH (p = 0.037), PP (p = 0.041), and PPa (p = 0.031). Post hoc analysis revealed that the ballistic condition produced statistically greater JH (p = 0.017 and p = 0.036), PP (p = 0.031 and p = 0.026), and PPa (p = 0.024 and p = 0.023) than the control and nonballistic conditions, respectively. Small effect sizes for JH, PP, and PPa existed during the ballistic condition (d = 0.28-0.44), whereas trivial effect sizes existed during the control (d = 0.0-0.18) and nonballistic (d = 0.0-0.17) conditions. Large statistically significant relationships existed between the JH potentiation response and the subject's relative back squat 1RM (r = 0.520; p = 0.047) and relative COHS 1RM (r = 0.569; p = 0.027) during the ballistic condition. In addition, a large statistically significant relationship existed between the JH potentiation response and the subject's relative back squat strength (r = 0.633; p = 0.011), whereas the moderate relationship with the subject's relative COHS strength trended toward significance (r = 0.483; p = 0.068). Ballistic COHS produced superior potentiation effects compared with COHS performed in a nonballistic manner. Relative strength may contribute to the elicited potentiation response after ballistic and nonballistic COHS.

  20. Functional Regression Models for Epistasis Analysis of Multiple Quantitative Traits.

    PubMed

    Zhang, Futao; Xie, Dan; Liang, Meimei; Xiong, Momiao

    2016-04-01

    To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of complex diseases. Despite their importance in uncovering the genetic structure of complex traits, statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI's Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes.

  1. Endoscopic septoplasty in primary cases using electromechanical instruments: surgical technique, efficacy and results.

    PubMed

    De Sousa Fontes, Aderito; Sandrea Jiménez, Minaret; Chacaltana Ayerve, Rosa R

    2013-01-01

    The microdebrider is a surgical tool that has been used successfully in many endoscopic surgical procedures in otolaryngology. In this study, we analysed our experience using this powered instrument in the resection of obstructive nasal septum deviations. This was a longitudinal, prospective, descriptive study conducted between January and June 2007 on 141 patients who consulted for chronic nasal obstruction caused by a septal deviation or deformity and underwent powered endoscopic septoplasty (PES). The mean age was 39.9 years (15-63 years); 60.28% were male (n=85). Nasal symptom severity decreased after surgery from 6.12 (preoperative) to 2.01 (postoperative), a statistically significant reduction (P<.05). There were no statistically significant differences between the results at the 2nd week, 6th week and 5th year after surgery. All patients (100%) were satisfied with the results of surgery, and no patient answered "No" to the question added to assess patient satisfaction after surgery. Minor complications in the postoperative period were present in 4.96% of the cases. Powered endoscopic septoplasty allows accurate, conservative repair of obstructive nasal septum deviations, with fewer complications and better functional results. In our experience, this technique offered significant perioperative advantages with high postoperative patient satisfaction in terms of reducing the severity of nasal symptoms. Copyright © 2012 Elsevier España, S.L. All rights reserved.

  2. Indoor Soiling Method and Outdoor Statistical Risk Analysis of Photovoltaic Power Plants

    NASA Astrophysics Data System (ADS)

    Rajasekar, Vidyashree

    This is a two-part thesis. Part 1 presents an approach for working towards the development of a standardized artificial soiling method for laminated photovoltaic (PV) cells or mini-modules. Construction of an artificial chamber to maintain controlled environmental conditions and components/chemicals used in artificial soil formulation is briefly explained. Both poly-Si mini-modules and single-cell mono-Si coupons were soiled, and characterization tests such as I-V, reflectance and quantum efficiency (QE) were carried out on both soiled and cleaned coupons. From the results obtained, poly-Si mini-modules proved to be a good measure of soil uniformity, as any non-uniformity present would not result in a smooth curve during I-V measurements. The challenges faced while executing reflectance and QE characterization tests on poly-Si due to the smaller cell size were eliminated on the mono-Si coupons with large cells, allowing highly repeatable measurements. This study indicates that reflectance measurements between 600 and 700 nm can be used as a direct measure of soil density on the modules. Part 2 determines the most dominant failure modes of field-aged PV modules using experimental data obtained in the field and statistical analysis, FMECA (Failure Mode, Effect, and Criticality Analysis). The failure and degradation modes of about 744 poly-Si glass/polymer frameless modules fielded for 18 years under the cold-dry climate of New York were evaluated. A defect chart, degradation rates (at both string and module levels) and a safety map were generated using the field-measured data. A statistical reliability tool, FMECA, which uses the Risk Priority Number (RPN), is used to determine the dominant failure or degradation modes in the strings and modules by ranking and prioritizing the modes. This study on PV power plants considers all the failure and degradation modes from both safety and performance perspectives. The indoor and outdoor soiling studies were jointly performed by two Masters students, Sravanthi Boppana and Vidyashree Rajasekar. This thesis presents the indoor soiling study, whereas the other thesis presents the outdoor soiling study. Similarly, the statistical risk analyses of two power plants (model J and model JVA) were jointly performed by these two Masters students. Both power plants are located in the same cold-dry climate, but one power plant carries framed modules and the other carries frameless modules. This thesis presents the results obtained on the frameless modules.

  3. Robust inference for group sequential trials.

    PubMed

    Ganju, Jitendra; Lin, Yunzhi; Zhou, Kefei

    2017-03-01

    For ethical reasons, group sequential trials were introduced to allow trials to stop early in the event of extreme results. Endpoints in such trials are usually mortality or irreversible morbidity. For a given endpoint, the norm is to use a single test statistic and to use that same statistic for each analysis. This approach is risky because the test statistic has to be specified before the study is unblinded, and there is loss in power if the assumptions that ensure optimality for each analysis are not met. To minimize the risk of moderate to substantial loss in power due to a suboptimal choice of a statistic, a robust method was developed for nonsequential trials. The concept is analogous to diversification of financial investments to minimize risk. The method is based on combining P values from multiple test statistics for formal inference while controlling the type I error rate at its designated value. This article evaluates the performance of 2 P value combining methods for group sequential trials. The emphasis is on time-to-event trials although results from less complex trials are also included. The gain or loss in power with the combination method relative to a single statistic is asymmetric in its favor. Depending on the power of each individual test, the combination method can give more power than any single test or give power that is closer to the test with the most power. The versatility of the method is that it can combine P values from different test statistics for analysis at different times. The robustness of results suggests that inference from group sequential trials can be strengthened with the use of combined tests. Copyright © 2017 John Wiley & Sons, Ltd.

  4. Exploiting excess sharing: a more powerful test of linkage for affected sib pairs than the transmission/disequilibrium test.

    PubMed Central

    Wicks, J

    2000-01-01

    The transmission/disequilibrium test (TDT) is a popular, simple, and powerful test of linkage, which can be used to analyze data consisting of transmissions to the affected members of families with any kind of pedigree structure, including affected sib pairs (ASPs). Although it is based on the preferential transmission of a particular marker allele across families, it is not a valid test of association for ASPs. Martin et al. devised a similar statistic for ASPs, Tsp, which is also based on preferential transmission of a marker allele but which is a valid test of both linkage and association for ASPs. It is, however, less powerful than the TDT as a test of linkage for ASPs. What I show is that the differences between the TDT and Tsp are due to the fact that, although both statistics are based on preferential transmission of a marker allele, the TDT also exploits excess sharing in identity-by-descent transmissions to ASPs. Furthermore, I show that both of these statistics are members of a family of "TDT-like" statistics for ASPs. The statistics in this family are based on preferential transmission but also, to varying extents, exploit excess sharing. From this family of statistics, we see that, although the TDT exploits excess sharing to some extent, it is possible to do so to a greater extent, and thus produce a more powerful test of linkage for ASPs than is provided by the TDT. Power simulations conducted under a number of disease models are used to verify that the most powerful member of this family of TDT-like statistics is more powerful than the TDT for ASPs. PMID:10788332
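
    For reference, the classical TDT statistic that this family of tests builds on is a McNemar-type count over transmissions from heterozygous parents. A minimal sketch with invented counts:

    ```python
    from scipy.stats import chi2

    def tdt(b, c):
        """Classical TDT: b = transmissions of marker allele M from heterozygous
        parents to affected offspring, c = non-transmissions. The statistic is
        asymptotically chi-squared with 1 df under the null of no linkage."""
        stat = (b - c) ** 2 / (b + c)
        return stat, chi2.sf(stat, df=1)

    # Invented counts: allele M transmitted 62 times, not transmitted 38 times.
    stat, p = tdt(62, 38)
    print(f"TDT chi-square = {stat:.2f}, p = {p:.4f}")
    ```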

  5. Exploiting excess sharing: a more powerful test of linkage for affected sib pairs than the transmission/disequilibrium test.

    PubMed

    Wicks, J

    2000-06-01

    The transmission/disequilibrium test (TDT) is a popular, simple, and powerful test of linkage, which can be used to analyze data consisting of transmissions to the affected members of families with any kind of pedigree structure, including affected sib pairs (ASPs). Although it is based on the preferential transmission of a particular marker allele across families, it is not a valid test of association for ASPs. Martin et al. devised a similar statistic for ASPs, Tsp, which is also based on preferential transmission of a marker allele but which is a valid test of both linkage and association for ASPs. It is, however, less powerful than the TDT as a test of linkage for ASPs. What I show is that the differences between the TDT and Tsp are due to the fact that, although both statistics are based on preferential transmission of a marker allele, the TDT also exploits excess sharing in identity-by-descent transmissions to ASPs. Furthermore, I show that both of these statistics are members of a family of "TDT-like" statistics for ASPs. The statistics in this family are based on preferential transmission but also, to varying extents, exploit excess sharing. From this family of statistics, we see that, although the TDT exploits excess sharing to some extent, it is possible to do so to a greater extent, and thus produce a more powerful test of linkage for ASPs than is provided by the TDT. Power simulations conducted under a number of disease models are used to verify that the most powerful member of this family of TDT-like statistics is more powerful than the TDT for ASPs.

  6. Tips and Tricks for Successful Application of Statistical Methods to Biological Data.

    PubMed

    Schlenker, Evelyn

    2016-01-01

    This chapter discusses experimental design and the use of statistics to describe characteristics of data (descriptive statistics) and inferential statistics that test the hypothesis posed by the investigator. Inferential statistics, based on probability distributions, depend upon the type and distribution of the data. For data that are continuous, randomly and independently selected, and normally distributed, more powerful parametric tests such as Student's t test and analysis of variance (ANOVA) can be used. For non-normally distributed or skewed data, transformation of the data (using logarithms) may normalize the data, allowing use of parametric tests. Alternatively, with skewed data nonparametric tests can be utilized, some of which rely on data that are ranked prior to statistical analysis. Experimental designs and analyses need to balance the risk of type 1 errors (false positives) against that of type 2 errors (false negatives). For a variety of clinical studies that determine risk or benefit, relative risk ratios (randomized clinical trials and cohort studies) or odds ratios (case-control studies) are utilized. Although both use 2 × 2 tables, their premise and calculations differ, as the worked example below shows. Finally, special statistical methods are applied to microarray and proteomics data, since the large number of genes or proteins evaluated increases the likelihood of false discoveries. Additional studies in separate samples are used to verify microarray and proteomic data. Examples in this chapter and references are available to help continued investigation of experimental designs and appropriate data analysis.
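
    The distinction between relative risk and odds ratio is easy to show with numbers. A short worked example on a hypothetical 2 × 2 table:

    ```python
    # Hypothetical 2x2 table:        outcome+   outcome-
    # exposed                           a=30       b=70
    # unexposed                         c=15       d=85
    a, b, c, d = 30, 70, 15, 85

    risk_exposed   = a / (a + b)                     # 0.30
    risk_unexposed = c / (c + d)                     # 0.15
    relative_risk  = risk_exposed / risk_unexposed   # 2.00

    odds_ratio = (a * d) / (b * c)                   # (30*85)/(70*15) ~ 2.43
    print(f"RR = {relative_risk:.2f}, OR = {odds_ratio:.2f}")
    ```

    Note that the odds ratio (2.43) overstates the relative risk (2.00) because the outcome here is not rare; the two converge only for rare outcomes.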

  7. The Effects of Run-of-River Hydroelectric Power Schemes on Fish Community Composition in Temperate Streams and Rivers

    PubMed Central

    2016-01-01

    The potential environmental impacts of large-scale storage hydroelectric power (HEP) schemes have been well-documented in the literature. In Europe, awareness of these potential impacts and limited opportunities for politically-acceptable medium- to large-scale schemes, have caused attention to focus on smaller-scale HEP schemes, particularly run-of-river (ROR) schemes, to contribute to meeting renewable energy targets. Run-of-river HEP schemes are often presumed to be less environmentally damaging than large-scale storage HEP schemes. However, there is currently a lack of peer-reviewed studies on their physical and ecological impact. The aim of this article was to investigate the effects of ROR HEP schemes on communities of fish in temperate streams and rivers, using a Before-After, Control-Impact (BACI) study design. The study makes use of routine environmental surveillance data collected as part of long-term national and international monitoring programmes at 23 systematically-selected ROR HEP schemes and 23 systematically-selected paired control sites. Six area-normalised metrics of fish community composition were analysed using a linear mixed effects model (number of species, number of fish, number of Atlantic salmon—Salmo salar, number of >1 year old Atlantic salmon, number of brown trout—Salmo trutta, and number of >1 year old brown trout). The analyses showed that there was a statistically significant effect (p<0.05) of ROR HEP construction and operation on the number of species. However, no statistically significant effects were detected on the other five metrics of community composition. The implications of these findings are discussed in this article and recommendations are made for best-practice study design for future fish community impact studies. PMID:27191717

  8. The Effects of Run-of-River Hydroelectric Power Schemes on Fish Community Composition in Temperate Streams and Rivers.

    PubMed

    Bilotta, Gary S; Burnside, Niall G; Gray, Jeremy C; Orr, Harriet G

    2016-01-01

    The potential environmental impacts of large-scale storage hydroelectric power (HEP) schemes have been well-documented in the literature. In Europe, awareness of these potential impacts and limited opportunities for politically-acceptable medium- to large-scale schemes, have caused attention to focus on smaller-scale HEP schemes, particularly run-of-river (ROR) schemes, to contribute to meeting renewable energy targets. Run-of-river HEP schemes are often presumed to be less environmentally damaging than large-scale storage HEP schemes. However, there is currently a lack of peer-reviewed studies on their physical and ecological impact. The aim of this article was to investigate the effects of ROR HEP schemes on communities of fish in temperate streams and rivers, using a Before-After, Control-Impact (BACI) study design. The study makes use of routine environmental surveillance data collected as part of long-term national and international monitoring programmes at 23 systematically-selected ROR HEP schemes and 23 systematically-selected paired control sites. Six area-normalised metrics of fish community composition were analysed using a linear mixed effects model (number of species, number of fish, number of Atlantic salmon-Salmo salar, number of >1 year old Atlantic salmon, number of brown trout-Salmo trutta, and number of >1 year old brown trout). The analyses showed that there was a statistically significant effect (p<0.05) of ROR HEP construction and operation on the number of species. However, no statistically significant effects were detected on the other five metrics of community composition. The implications of these findings are discussed in this article and recommendations are made for best-practice study design for future fish community impact studies.

  9. Origin of the correlations between exit times in pedestrian flows through a bottleneck

    NASA Astrophysics Data System (ADS)

    Nicolas, Alexandre; Touloupas, Ioannis

    2018-01-01

    Robust statistical features have emerged from the microscopic analysis of dense pedestrian flows through a bottleneck, notably with respect to the time gaps between successive passages. We pinpoint the mechanisms at the origin of these features thanks to simple models that we develop and analyse quantitatively. We disprove the idea that anticorrelations between successive time gaps (i.e. an alternation between shorter ones and longer ones) are a hallmark of a zipper-like intercalation of pedestrian lines and show that they simply result from the possibility that pedestrians from distinct ‘lines’ or directions cross the bottleneck within a short time interval. A second feature concerns the bursts of escapes, i.e. egresses that come in fast succession. Despite the ubiquity of exponential distributions of burst sizes, entailed by a Poisson process, we argue that anomalous (power-law) statistics arise if the bottleneck is nearly congested, albeit only in a tiny portion of parameter space. The generality of the proposed mechanisms implies that similar statistical features should also be observed for other types of particulate flows.

  10. Defining window-boundaries for genomic analyses using smoothing spline techniques

    DOE PAGES

    Beissinger, Timothy M.; Rosa, Guilherme J.M.; Kaeppler, Shawn M.; ...

    2015-04-17

    High-density genomic data is often analyzed by combining information over windows of adjacent markers. Interpretation of data grouped in windows versus at individual locations may increase statistical power, simplify computation, reduce sampling noise, and reduce the total number of tests performed. However, use of adjacent marker information can result in over- or under-smoothing, undesirable window boundary specifications, or highly correlated test statistics. We introduce a method for defining windows based on statistically guided breakpoints in the data, as a foundation for the analysis of multiple adjacent data points. This method involves first fitting a cubic smoothing spline to the data and then identifying the inflection points of the fitted spline, which serve as the boundaries of adjacent windows. This technique does not require prior knowledge of linkage disequilibrium, and therefore can be applied to data collected from individual or pooled sequencing experiments. Moreover, in contrast to existing methods, an arbitrary choice of window size is not necessary, since these are determined empirically and allowed to vary along the genome.
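
    A minimal sketch of the boundary-finding idea, assuming a generic smoothing-spline implementation (scipy's UnivariateSpline here) and a synthetic signal; the smoothing parameter is an arbitrary illustrative choice, whereas the paper's method selects it in a statistically guided way.

    ```python
    import numpy as np
    from scipy.interpolate import UnivariateSpline

    rng = np.random.default_rng(1)

    # Synthetic "genomic" signal: a test statistic at ordered marker positions.
    pos = np.linspace(0, 1, 500)
    signal = np.sin(6 * np.pi * pos) + rng.normal(scale=0.4, size=pos.size)

    # Fit a cubic smoothing spline (k=3); s controls the smoothing level.
    spline = UnivariateSpline(pos, signal, k=3, s=len(pos) * 0.15)

    # Window boundaries = inflection points, i.e. sign changes of the
    # second derivative of the fitted spline.
    d2 = spline.derivative(n=2)(pos)
    boundaries = pos[:-1][np.sign(d2[:-1]) != np.sign(d2[1:])]
    print(f"{boundaries.size} window boundaries:", np.round(boundaries, 3))
    ```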

  11. PEPA test: fast and powerful differential analysis from relative quantitative proteomics data using shared peptides.

    PubMed

    Jacob, Laurent; Combes, Florence; Burger, Thomas

    2018-06-18

    We propose a new hypothesis test for the differential abundance of proteins in mass-spectrometry based relative quantification. An important feature of this type of high-throughput analysis is that it involves an enzymatic digestion of the sample proteins into peptides prior to identification and quantification. Due to numerous sequence homologies, different proteins can lead to peptides with identical amino acid chains, so that their parent protein is ambiguous. These so-called shared peptides make protein-level statistical analysis a challenge and are often not accounted for. In this article, we use a linear model describing peptide-protein relationships to build a likelihood ratio test of differential abundance for proteins. We show that the likelihood ratio statistic can be computed in time linear in the number of peptides. We also provide the asymptotic null distribution of a regularized version of our statistic. Experiments on both real and simulated datasets show that our procedure outperforms state-of-the-art methods. The procedures are available via the pepa.test function of the DAPAR Bioconductor R package.

  12. Meta-analysis: Problems with Russian Publications.

    PubMed

    Verbitskaya, E V

    2015-01-01

    Meta-analysis is a powerful tool for identifying evidence-based medical technologies (interventions) for use in everyday practice. Meta-analysis uses statistical approaches to combine results from multiple studies in an effort to increase power (over individual studies), improve estimates of the size of the effect and/or resolve uncertainty when reports disagree. Meta-analysis is a quantitative, formal study design used to systematically assess previous research studies and derive conclusions from that research. A meta-analysis may provide a more precise estimate of the effect of a treatment or risk factor for a disease, or of other outcomes, than any individual study contributing to the pooled analysis. There is a substantial number of Russian medical publications, but not many meta-analyses published in Russian, and Russian publications are rarely cited in English-language papers: about 90% of published meta-analyses incorporate only English-language papers, and international studies or papers with Russian co-authors are published in English. The main question is: what prevents the inclusion of Russian medical publications in meta-analyses? The main reasons are the following. 1) It is difficult to find Russian papers and difficult to work with them and with Russian journals: a. only a few Russian biomedical journals are translated into English and included in databases (PubMed, Scopus and others), despite the fact that all of them have English-language abstracts; b. most meta-analysis authors use citation management software such as Mendeley, Reference Manager, ProCite, EndNote and others, which allows scientists to organize their own literature databases from internet searches and has add-ons for Office programs, making the citation process very convenient, and the internet sites of most international journals have built-in tools for saving citations to reference manager software, whereas most articles in Russian journals cannot be captured by citation management systems because they lack specially coded article descriptors; c. some journals still post PDF files of the whole journal issue without dividing it into articles and do not provide any descriptors, making manual, time-consuming input of information the only possibility, and the article content cannot be searched by search engines. 2) The quality of research. This problem has been discussed for more than twenty years, yet there are still too many publications with poor study design and statistical analysis. With the exception of pharmacological clinical trials designed and supervised by the international pharmaceutical industry, many interventional studies conducted in Russia have methodological flaws implying a high risk of bias: a. absence of adequate controls; b. no standard endpoints, duration of therapy or follow-up; c. absence of randomization and blinding; d. low statistical power, as sample sizes are calculated (if calculated at all) with the main goal of keeping the sample as small as possible, so statisticians often have to justify the small number of subjects the sponsor can afford instead of calculating the sample size needed to reach adequate power; e. no standards of statistical analysis; f. Russian journals do not have standards for describing and presenting study results, in particular the results of statistical analysis (a reader often cannot even tell whether the standard deviation (SD) or the standard error of the mean (SEM) is presented). We have long-standing experience in analysing the methodological and statistical quality of Russian biomedical publications and have found statistical and methodological errors and a high risk of bias in up to 80% of publications. In our own practice, we attempted two meta-analyses of two local pharmaceutical products for the prevention of stroke recurrence. For the first product, we did not find even two Russian-language studies suitable for the analysis (incomparable populations, different designs, endpoints, doses, etc.). For the second product, only four studies had comparable populations and standard, internationally approved effectiveness scales; however, the combinations of scales and the lengths of treatment and follow-up differed so widely that we could combine the results of only two or three studies for each endpoint. Russian researchers have to follow internationally recognised standards in study design, selection of endpoints, timelines and therapy regimens, data analysis and presentation of results. Russian journals need to develop consolidated reporting rules for authors of clinical trials and epidemiological research, close to international standards; here the international EQUATOR Network (Enhancing the QUAlity and Transparency Of health Research, http://www.equator-network.org/) is one to be taken into account. In addition, Russian journals have to improve their online information for better interaction with search engines and citation managers.

  13. Generalizing Terwilliger's likelihood approach: a new score statistic to test for genetic association.

    PubMed

    el Galta, Rachid; Uitte de Willige, Shirley; de Visser, Marieke C H; Helmer, Quinta; Hsu, Li; Houwing-Duistermaat, Jeanine J

    2007-09-24

    In this paper, we propose a one-degree-of-freedom test for association between a candidate gene and a binary trait. This method is a generalization of Terwilliger's likelihood ratio statistic and is especially powerful for the situation of one associated haplotype. As an alternative to the likelihood ratio statistic, we derive a score statistic, which has a tractable expression. For haplotype analysis, we assume that phase is known. By means of a simulation study, we compare the performance of the score statistic to Pearson's chi-square statistic and the likelihood ratio statistic proposed by Terwilliger. We illustrate the method on three candidate genes studied in the Leiden Thrombophilia Study. We conclude that the statistic follows a chi-square distribution under the null hypothesis and that the score statistic is more powerful than Terwilliger's likelihood ratio statistic when the associated haplotype has a frequency between 0.1 and 0.4 and has a small impact on the studied disorder. With regard to Pearson's chi-square statistic, the score statistic has more power when the associated haplotype has a frequency above 0.2 and the number of variants is above five.

  14. A flooding induced station blackout analysis for a pressurized water reactor using the RISMC toolkit

    DOE PAGES

    Mandelli, Diego; Prescott, Steven; Smith, Curtis; ...

    2015-05-17

    In this paper we evaluate the impact of a power uprate on a pressurized water reactor (PWR) for a tsunami-induced flooding test case. This analysis is performed using the RISMC toolkit: the RELAP-7 and RAVEN codes. RELAP-7 is the new generation of system analysis codes that is responsible for simulating the thermal-hydraulic dynamics of PWR and boiling water reactor systems. RAVEN has two capabilities: to act as a controller of the RELAP-7 simulation (e.g., component/system activation) and to perform statistical analyses. In our case, the simulation of the flooding is performed by using an advanced smoothed particle hydrodynamics code called NEUTRINO. The obtained results allow the user to investigate and quantify the impact of timing and sequencing of events on system safety. The impact of power uprate is determined in terms of both core damage probability and safety margins.

  15. An omnibus test for the global null hypothesis.

    PubMed

    Futschik, Andreas; Taus, Thomas; Zehetmayer, Sonja

    2018-01-01

    Global hypothesis tests are a useful tool in the context of clinical trials, genetic studies, or meta-analyses, when researchers are not interested in testing individual hypotheses but in testing whether none of the hypotheses is false. There are several ways to test the global null hypothesis when the individual null hypotheses are independent. If it is assumed that many of the individual null hypotheses are false, combination tests have been recommended to maximize power. If, however, it is assumed that only one or a few null hypotheses are false, global tests based on individual test statistics are more powerful (e.g. the Bonferroni or Simes test). However, there is usually no a priori knowledge of the number of false individual null hypotheses. We therefore propose an omnibus test based on cumulative sums of the transformed p-values. We show that this test yields an impressive overall performance. The proposed method is implemented in an R package called omnibus.
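
    A rough Monte Carlo sketch of a cumulative-sum omnibus test in the spirit described above. The inverse-normal transform, the max-over-partial-sums statistic, and the simulation-based calibration are assumptions of this illustration, not necessarily the authors' exact construction.

    ```python
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)

    def omnibus_stat(pvals):
        """Max standardized partial sum of inverse-normal transformed p-values,
        computed on the sorted p-values (most significant first)."""
        z = norm.isf(np.sort(pvals))          # large z = more significant
        k = np.arange(1, z.size + 1)
        return np.max(np.cumsum(z) / np.sqrt(k))

    def omnibus_test(pvals, n_sim=5000):
        """Calibrate against the global null of independent uniform p-values."""
        obs = omnibus_stat(pvals)
        null = np.array([omnibus_stat(rng.uniform(size=len(pvals)))
                         for _ in range(n_sim)])
        return obs, (1 + np.sum(null >= obs)) / (n_sim + 1)

    # One strong signal among 20 hypotheses.
    p = np.concatenate([[1e-4], rng.uniform(size=19)])
    stat, pval = omnibus_test(p)
    print(f"omnibus statistic {stat:.2f}, global p = {pval:.4f}")
    ```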

  16. Cluster-level statistical inference in fMRI datasets: The unexpected behavior of random fields in high dimensions.

    PubMed

    Bansal, Ravi; Peterson, Bradley S

    2018-06-01

    Identifying regional effects of interest in MRI datasets usually entails testing a priori hypotheses across many thousands of brain voxels, requiring control of false positive findings across these multiple hypothesis tests. Recent studies have suggested that parametric statistical methods may have incorrectly modeled functional MRI data, thereby leading to higher false positive rates than their nominal rates. Nonparametric methods for statistical inference when conducting multiple statistical tests, in contrast, are thought to produce false positives at the nominal rate, which has thus led to the suggestion that previously reported studies should reanalyze their fMRI data using nonparametric tools. To understand better why parametric methods may yield excessive false positives, we assessed their performance when applied both to simulated datasets of 1D, 2D, and 3D Gaussian Random Fields (GRFs) and to 710 real-world, resting-state fMRI datasets. We showed that both the simulated 2D and 3D GRFs and the real-world data contain a small percentage (<6%) of very large clusters (on average 60 times larger than the average cluster size), which were not present in 1D GRFs. These unexpectedly large clusters were deemed statistically significant using parametric methods, leading to empirical familywise error rates (FWERs) as high as 65%: the high empirical FWERs were not a consequence of parametric methods failing to model spatial smoothness accurately, but rather of these very large clusters that are inherently present in smooth, high-dimensional random fields. In fact, when discounting these very large clusters, the empirical FWER for parametric methods was 3.24%. Furthermore, even an empirical FWER of 65% would yield on average less than one of those very large clusters in each brain-wide analysis. Nonparametric methods, in contrast, estimated distributions from those large clusters, and therefore, by construction, rejected the large clusters as false positives at the nominal FWERs. Those rejected clusters were outlying values in the distribution of cluster size but cannot be distinguished from true positive findings without further analyses, including assessing whether fMRI signal in those regions correlates with other clinical, behavioral, or cognitive measures. Rejecting the large clusters, however, significantly reduced the statistical power of nonparametric methods in detecting true findings compared with parametric methods, which would have detected most true findings that are essential for making valid biological inferences in MRI data. Parametric analyses, in contrast, detected most true findings while generating relatively few false positives: on average, less than one of those very large clusters would be deemed a true finding in each brain-wide analysis. We therefore recommend the continued use of parametric methods that model nonstationary smoothness for cluster-level, familywise control of false positives, particularly when using a Cluster Defining Threshold of 2.5 or higher, and subsequently assessing rigorously the biological plausibility of the findings, even for large clusters. Finally, because nonparametric methods yielded a large reduction in statistical power to detect true positive findings, we conclude that the modest reduction in false positive findings that nonparametric analyses afford does not warrant a re-analysis of previously published fMRI studies using nonparametric techniques. Copyright © 2018 Elsevier Inc. All rights reserved.

  17. Evaluation of a regional monitoring program's statistical power to detect temporal trends in forest health indicators

    USGS Publications Warehouse

    Perles, Stephanie J.; Wagner, Tyler; Irwin, Brian J.; Manning, Douglas R.; Callahan, Kristina K.; Marshall, Matthew R.

    2014-01-01

    Forests are socioeconomically and ecologically important ecosystems that are exposed to a variety of natural and anthropogenic stressors. As such, monitoring forest condition and detecting temporal changes therein remain critical to sound public and private forestland management. The National Park Service's Vital Signs monitoring program collects information on many forest health indicators, including species richness, cover by exotics, browse pressure, and forest regeneration. We applied a mixed-model approach to partition variability in data for 30 forest health indicators collected from several national parks in the eastern United States. We then used the estimated variance components in a simulation model to evaluate trend detection capabilities for each indicator. We investigated the extent to which the following factors affected ability to detect trends: (a) sample design: using simple panel versus connected panel design, (b) effect size: increasing trend magnitude, (c) sample size: varying the number of plots sampled each year, and (d) stratified sampling: post-stratifying plots into vegetation domains. Statistical power varied among indicators; however, indicators that measured the proportion of a total yielded higher power when compared to indicators that measured absolute or average values. In addition, the total variability for an indicator appeared to influence power to detect temporal trends more than how total variance was partitioned among spatial and temporal sources. Based on these analyses and the monitoring objectives of the Vital Signs program, the current sampling design is likely overly intensive for detecting a 5% trend·year⁻¹ for all indicators and is appropriate for detecting a 1% trend·year⁻¹ in most indicators.
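
    Simulation-based power estimation of this kind is straightforward to prototype. A minimal sketch, assuming a simple model with a fixed log-scale trend, a random year effect, and residual noise; all variance components below are invented, and the real analysis partitioned variance with mixed models rather than the plain regression used here.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)

    def trend_power(trend_per_year, years=10, n_plots=30,
                    sd_year=0.05, sd_resid=0.20, alpha=0.05, n_sim=1000):
        """Fraction of simulated surveys in which a linear year trend in a
        log-scale indicator is detected (illustrative variance components)."""
        t = np.repeat(np.arange(years), n_plots)
        detected = 0
        for _ in range(n_sim):
            year_effect = rng.normal(0.0, sd_year, years)
            y = trend_per_year * t + year_effect[t] + rng.normal(0.0, sd_resid, t.size)
            if stats.linregress(t, y).pvalue < alpha:
                detected += 1
        return detected / n_sim

    print("power for 1%/yr trend:", trend_power(0.01))
    print("power for 5%/yr trend:", trend_power(0.05))
    ```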

  18. [Data collection in anesthesia. Experiences with the inauguration of a new information system].

    PubMed

    Zbinden, A M; Rothenbühler, H; Häberli, B

    1997-06-01

    In many institutions information systems are used to process off-line anaesthesia data for invoices, statistical purposes, and quality assurance. Information systems are also increasingly being used to improve process control in order to reduce costs. Most of today's systems were created when information technology and working processes in anaesthesia were very different from those in use today. Thus, many institutions must now replace their computer systems but are probably not aware of how complex this change will be. Modern information systems mostly use client-server architecture and relational databases. Substituting an old system with a new one is frequently a greater task than designing a system from scratch. This article gives the conclusions drawn from the experience obtained when a large departmental computer system was redesigned at a university hospital. The new system was based on a client-server architecture and was developed by an external company without preceding conceptual analysis. Modules for patient, anaesthesia, surgical, and pain-service data were included. Data were analysed using a separate statistical package (RS/1 from Bolt Beranek), taking advantage of its powerful precompiled procedures. Development and introduction of the new system took much more time and effort than expected despite the use of modern software tools. Introduction of the new program required intensive user training despite the choice of modern graphic screen layouts. Automatic data-reading systems could not be used, as too many faults occurred and the effort for the user was too high. However, after the initial problems were solved the system turned out to be a powerful tool for quality control (both process and outcome quality), billing, and scheduling. The statistical analysis of the data resulted in meaningful and relevant conclusions. Before creating a new information system, the working processes have to be analysed and, if possible, made more efficient; a detailed programme specification must then be made. A servicing and maintenance contract should be drawn up before the order is given to a company. Time periods of equal duration have to be scheduled for defining, writing, testing and introducing the program. Modern client-server systems with relational databases are by no means simpler to establish and maintain than previous mainframe systems with hierarchical databases, and thus experienced computer specialists need to be close at hand. We recommend collecting data only once for both statistics and quality control. To verify data quality, a system of random spot-sampling has to be established. Despite the large investments needed to build up such a system, we consider it a powerful tool for helping to solve the difficult daily problems of managing a surgical and anaesthesia unit.

  19. Statistical power to detect change in a mangrove shoreline fish community adjacent to a nuclear power plant.

    PubMed

    Dolan, T E; Lynch, P D; Karazsia, J L; Serafy, J E

    2016-03-01

    An expansion is underway of a nuclear power plant on the shoreline of Biscayne Bay, Florida, USA. While the precise effects of its construction and operation are unknown, impacts on surrounding marine habitats and biota are considered by experts to be likely. The objective of the present study was to determine the adequacy of an ongoing monitoring survey of fish communities associated with mangrove habitats directly adjacent to the power plant to detect fish community changes, should they occur, at three spatial scales. Using seasonally resolved data recorded during 532 fish surveys over an 8-year period, power analyses were performed for four mangrove fish metrics (fish diversity, fish density, and the occurrence of two ecologically important fish species: gray snapper (Lutjanus griseus) and goldspotted killifish (Floridichthys carpio)). Results indicated that the monitoring program at current sampling intensity allows for detection of <33% changes in fish density and diversity metrics in both the wet and the dry season in the two larger study areas. Sampling effort was found to be insufficient in either season to detect changes at this level (<33%) in species-specific occurrence metrics for the two fish species examined. The option of supplementing ongoing, biological monitoring programs for improved, focused change detection deserves consideration from both ecological and cost-benefit perspectives.

  20. Using public control genotype data to increase power and decrease cost of case-control genetic association studies.

    PubMed

    Ho, Lindsey A; Lange, Ethan M

    2010-12-01

    Genome-wide association (GWA) studies are a powerful approach for identifying novel genetic risk factors associated with human disease. A GWA study typically requires the inclusion of thousands of samples to have sufficient statistical power to detect single nucleotide polymorphisms that are associated with only modest increases in risk of disease given the heavy burden of a multiple test correction that is necessary to maintain valid statistical tests. Low statistical power and the high financial cost of performing a GWA study remains prohibitive for many scientific investigators anxious to perform such a study using their own samples. A number of remedies have been suggested to increase statistical power and decrease cost, including the utilization of free publicly available genotype data and multi-stage genotyping designs. Herein, we compare the statistical power and relative costs of alternative association study designs that use cases and screened controls to study designs that are based only on, or additionally include, free public control genotype data. We describe a novel replication-based two-stage study design, which uses free public control genotype data in the first stage and follow-up genotype data on case-matched controls in the second stage that preserves many of the advantages inherent when using only an epidemiologically matched set of controls. Specifically, we show that our proposed two-stage design can substantially increase statistical power and decrease cost of performing a GWA study while controlling the type-I error rate that can be inflated when using public controls due to differences in ancestry and batch genotype effects.

  1. Multiplicative point process as a model of trading activity

    NASA Astrophysics Data System (ADS)

    Gontis, V.; Kaulakys, B.

    2004-11-01

    Signals consisting of a sequence of pulses show that an inherent origin of 1/f noise is Brownian fluctuation of the average interevent time between subsequent pulses of the pulse sequence. In this paper, we generalize the model of interevent time to reproduce a variety of self-affine time series exhibiting a power spectral density S(f) scaling as a power of the frequency f. Furthermore, we analyze the relation between the power-law correlations and the origin of the power-law probability distribution of the signal intensity. We introduce a stochastic multiplicative model for the time intervals between point events and analyze the statistical properties of the signal analytically and numerically. Such a model system exhibits a power-law spectral density S(f) ∼ 1/f^β for various values of β, including β = 1/2, 1 and 3/2. Explicit expressions for the power spectra in the low-frequency limit and for the distribution density of the interevent time are obtained. The counting statistics of the events are analyzed analytically and numerically as well. The specific interest of our analysis is related to the financial markets, where long-range correlations of price fluctuations largely depend on the number of transactions. We analyze the spectral density and counting statistics of the number of transactions. The model reproduces the spectral properties of real markets and explains the mechanism of the power-law distribution of trading activity. The study provides evidence that the statistical properties of the financial markets are enclosed in the statistics of the time intervals between trades. A multiplicative point process serves as a consistent model generating these statistics.
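
    A rough numerical sketch of a multiplicative interevent-time process and its spectral exponent. The recursion below is one commonly quoted form of this class of models; the parameter values, the boundary handling (simple clipping as a crude stand-in for reflecting boundaries), and the counting window are illustrative choices, not the paper's.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)

    # Multiplicative interevent-time recursion (illustrative parameters):
    #   tau_{k+1} = tau_k + gamma * tau_k**(2*mu - 1) + sigma * tau_k**mu * eps_k
    gamma, sigma, mu = 1e-4, 0.02, 0.5
    tau_min, tau_max = 1e-3, 1.0

    n = 50_000
    tau = np.empty(n)
    tau[0] = 0.1
    for k in range(n - 1):
        step = tau[k] + gamma * tau[k] ** (2 * mu - 1) + sigma * tau[k] ** mu * rng.normal()
        tau[k + 1] = min(max(step, tau_min), tau_max)  # keep tau in [tau_min, tau_max]

    # Counting statistics: number of events per unit-time window.
    t_events = np.cumsum(tau)
    counts, _ = np.histogram(t_events, bins=np.arange(0.0, t_events[-1], 1.0))

    # Periodogram of the event-count signal; the low-frequency log-log slope
    # estimates -beta in S(f) ~ 1/f**beta.
    spec = np.abs(np.fft.rfft(counts - counts.mean())) ** 2 / counts.size
    freq = np.fft.rfftfreq(counts.size, d=1.0)
    band = (freq > 0) & (freq < 0.05)
    beta = -np.polyfit(np.log(freq[band]), np.log(spec[band]), 1)[0]
    print(f"estimated spectral exponent beta ~ {beta:.2f}")
    ```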

  2. Powerlaw: a Python package for analysis of heavy-tailed distributions.

    PubMed

    Alstott, Jeff; Bullmore, Ed; Plenz, Dietmar

    2014-01-01

    Power laws are theoretically interesting probability distributions that are also frequently used to describe empirical data. In recent years, effective statistical methods for fitting power laws have been developed, but appropriate use of these techniques requires significant programming and statistical insight. In order to greatly decrease the barriers to using good statistical methods for fitting power law distributions, we developed the powerlaw Python package. This software package provides easy commands for basic fitting and statistical analysis of distributions. Notably, it also seeks to support a variety of user needs by being exhaustive in the options available to the user. The source code is publicly available and easily extensible.
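
    The package's basic workflow, as described in the paper, fits the lower cutoff xmin and the exponent alpha jointly and compares candidate distributions by log-likelihood ratio. A short example on synthetic Pareto data (the synthetic exponent and sample size are arbitrary):

    ```python
    import numpy as np
    import powerlaw  # the package described above

    # Synthetic Pareto data with density ~ x**-2.5 for x >= 1 (inverse-CDF sampling).
    rng = np.random.default_rng(5)
    data = rng.uniform(size=10_000) ** (-1.0 / 1.5)

    fit = powerlaw.Fit(data)        # estimates xmin and the exponent alpha jointly
    print("alpha:", fit.power_law.alpha)
    print("xmin :", fit.power_law.xmin)

    # Log-likelihood ratio comparison: power law versus lognormal.
    R, p = fit.distribution_compare('power_law', 'lognormal')
    print(f"R = {R:.2f}, p = {p:.3f}  (R > 0 favours the power law)")
    ```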

  3. Fast and accurate imputation of summary statistics enhances evidence of functional enrichment.

    PubMed

    Pasaniuc, Bogdan; Zaitlen, Noah; Shi, Huwenbo; Bhatia, Gaurav; Gusev, Alexander; Pickrell, Joseph; Hirschhorn, Joel; Strachan, David P; Patterson, Nick; Price, Alkes L

    2014-10-15

    Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov models (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1-5%) variants [increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples] versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case-control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of [Formula: see text] association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Publicly available software package available at http://bogdan.bioinformatics.ucla.edu/software/. bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu Supplementary materials are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
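
    The Gaussian imputation idea can be written compactly: under the null, z-scores are approximately multivariate normal with covariance given by the LD (correlation) matrix, so z-scores at untyped SNPs are imputed by the conditional mean given the typed ones, with a ridge term accounting for the reference panel's finite sample size. A minimal sketch with an invented LD matrix; the ridge value lam and the quality measure are illustrative assumptions, not the authors' exact implementation.

    ```python
    import numpy as np

    def impute_z(z_obs, ld_oo, ld_to, lam=0.1):
        """Conditional-mean imputation of z-scores at untyped SNPs.
        ld_oo: LD among typed SNPs; ld_to: LD between each untyped SNP (rows)
        and the typed SNPs; lam: ridge term for the finite reference panel."""
        reg = ld_oo + lam * np.eye(ld_oo.shape[0])
        w = np.linalg.solve(reg, z_obs)          # (Sigma_oo + lam*I)^-1 z_obs
        z_imp = ld_to @ w
        # r^2-like imputation quality per untyped SNP.
        info = np.sum(ld_to * np.linalg.solve(reg, ld_to.T).T, axis=1)
        return z_imp, info

    # Invented LD for 3 typed SNPs and 1 untyped SNP.
    ld_oo = np.array([[1.0, 0.4, 0.2],
                      [0.4, 1.0, 0.5],
                      [0.2, 0.5, 1.0]])
    ld_to = np.array([[0.7, 0.6, 0.3]])
    z_obs = np.array([3.2, 2.8, 1.5])

    z_imp, info = impute_z(z_obs, ld_oo, ld_to)
    print("imputed z:", z_imp, " imputation quality:", info)
    ```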

  4. Benefits of a one health approach: An example using Rift Valley fever.

    PubMed

    Rostal, Melinda K; Ross, Noam; Machalaba, Catherine; Cordel, Claudia; Paweska, Janusz T; Karesh, William B

    2018-06-01

    One Health has been promoted by international institutions as a framework to improve public health outcomes. Despite strong overall interest in One Health, country-, local- and project-level implementation remains limited, likely due to the lack of pragmatic and tested operational methods for implementation and metrics for evaluation. Here we use Rift Valley fever virus as an example to demonstrate the value of a One Health approach for both its scientific and resource advantages. We demonstrate that coordinated, a priori investigations between One Health sectors can yield higher statistical power to elucidate important public health relationships, as compared to siloed investigations and post-hoc analyses. Likewise, we demonstrate that, across a project or multi-ministry health study, a One Health approach can result in improved resource efficiency, with resultant cost savings (35% in the presented case). The results of these analyses demonstrate that One Health approaches can be directly and tangibly applied to health investigations.

  5. Advanced spectrophotometric chemometric methods for resolving the binary mixture of doxylamine succinate and pyridoxine hydrochloride.

    PubMed

    Katsarov, Plamen; Gergov, Georgi; Alin, Aylin; Pilicheva, Bissera; Al-Degs, Yahya; Simeonov, Vasil; Kassarova, Margarita

    2018-03-01

    The prediction power of partial least squares (PLS) and multivariate curve resolution-alternating least squares (MCR-ALS) methods has been studied for the simultaneous quantitative analysis of the binary drug combination doxylamine succinate and pyridoxine hydrochloride. Analysis of first-order UV overlapped spectra was performed using different PLS models - classical PLS1 and PLS2 as well as partial robust M-regression (PRM). These linear models were compared to MCR-ALS with equality and correlation constraints (MCR-ALS-CC). All techniques operated within the full spectral region and extracted maximum information for the drugs analysed. The developed chemometric methods were validated on external sample sets and were applied to the analyses of pharmaceutical formulations. The obtained statistical parameters were satisfactory for both calibration and validation sets. All developed methods can be successfully applied for the simultaneous spectrophotometric determination of doxylamine and pyridoxine, both in laboratory-prepared mixtures and in commercial dosage forms.
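    For readers who want a concrete starting point, a PLS2-style calibration of overlapped spectra can be prototyped as below. scikit-learn is an assumption here (the study used its own PLS/PRM/MCR-ALS implementations), and the file names are hypothetical placeholders.

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.model_selection import cross_val_predict

        # X: (n_mixtures, n_wavelengths) absorbances over the full spectral region;
        # Y: (n_mixtures, 2) known concentrations of the two drugs.
        X = np.loadtxt('calibration_spectra.txt')   # hypothetical file
        Y = np.loadtxt('calibration_conc.txt')      # hypothetical file

        pls = PLSRegression(n_components=4)         # choose by cross-validation
        Y_cv = cross_val_predict(pls, X, Y, cv=5)
        rmsecv = np.sqrt(((Y - Y_cv) ** 2).mean(axis=0))
        print('RMSECV per analyte:', rmsecv)
        pls.fit(X, Y)                               # final model for test samples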

  6. Analysis of complex environment effect on near-field emission

    NASA Astrophysics Data System (ADS)

    Ravelo, B.; Lalléchère, S.; Bonnet, P.; Paladian, F.

    2014-10-01

    This article deals with uncertainty analysis of the electromagnetic compatibility emissions of radiofrequency circuits, based on the near-field/near-field (NF/NF) transform combined with a stochastic approach. Using 2D data corresponding to the electromagnetic (EM) field (X = E or H) scanned in an observation plane placed at a position z0 above the circuit under test (CUT), the X field map was extracted. Uncertainty analyses were then assessed via the statistical moments of the X component. In addition, a stochastic collocation approach was considered, and the calculations were applied to the planar EM NF radiated by two CUTs: a Wilkinson power divider and a microstrip line operating at GHz frequencies. After Matlab implementation, the mean and standard deviation were assessed. The present study illustrates how variations of environmental parameters may impact EM fields. The NF uncertainty methodology can be applied to the effects of any physical parameter in a complex environment and is useful for printed circuit board (PCB) design guidelines.
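    The statistical-moment step can be pictured with a toy Monte Carlo stand-in for the stochastic collocation scheme (the forward model, parameter distribution and sample count below are all illustrative): compute a field map for each random draw of the uncertain parameter, then take moments across draws.

        import numpy as np

        def field_map(eps_r, nx=64, ny=64):
            # Placeholder forward model: |E| map for a permittivity value eps_r.
            x, y = np.meshgrid(np.linspace(-1, 1, nx), np.linspace(-1, 1, ny))
            return np.exp(-(x**2 + y**2) * eps_r)

        samples = np.random.normal(loc=4.0, scale=0.2, size=200)  # uncertain parameter
        maps = np.stack([field_map(e) for e in samples])
        mean_map = maps.mean(axis=0)    # first statistical moment
        std_map = maps.std(axis=0)      # second (central) moment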

  7. Statistical power as a function of Cronbach alpha of instrument questionnaire items.

    PubMed

    Heo, Moonseong; Kim, Namhee; Faith, Myles S

    2015-10-14

    In countless clinical trials, outcome measurements rely on instrument questionnaire items, which often suffer from measurement error that in turn reduces the statistical power of study designs. The Cronbach alpha, or coefficient alpha, here denoted by C(α), can be used as a measure of internal consistency of parallel instrument items developed to measure a target unidimensional outcome construct. The scale score for the target construct is often represented by the sum of the item scores. However, power functions based on C(α) have been lacking for various study designs. We formulate a statistical model for parallel items to derive power functions as a function of C(α) under several study designs. To this end, we adopt a fixed true-score variance assumption, as opposed to the usual fixed total variance assumption. That assumption is critical and practically relevant for showing that smaller measurement errors are associated with higher inter-item correlations, and thus that greater C(α) is associated with greater statistical power. We compare the derived theoretical statistical power with empirical power obtained through Monte Carlo simulations for the following comparisons: one-sample comparison of pre- and post-treatment mean differences, two-sample comparison of pre-post mean differences between groups, and two-sample comparison of mean differences between groups. It is shown that C(α) equals the test-retest correlation of the scale scores of parallel items, which enables significance testing of C(α). Closed-form power functions and sample size determination formulas are derived in terms of C(α) for all of the aforementioned comparisons. Power functions are shown to be increasing in C(α), regardless of the comparison of interest. The derived power functions are well validated by simulation studies showing that the theoretical power is virtually identical to the empirical power. Regardless of research design or setting, developing and using instruments with greater C(α), or equivalently with greater inter-item correlations, is crucial for increasing the statistical power of trials that measure outcomes through questionnaire items. Further development of the power functions for binary or ordinal item scores, and under more general item correlation structures reflecting real-world situations, would be a valuable future study.
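    The paper's central mechanism can be sketched with a normal-approximation power function for a two-sample comparison, where the standardized effect observed on the scale scores is attenuated by the square root of the reliability; under the fixed true-score variance assumption that reliability is C(α). This is a simplified illustration, not the paper's closed-form derivations.

        import numpy as np
        from scipy.stats import norm

        def power_two_sample(d_true, n_per_group, cronbach_alpha, sig=0.05):
            d_obs = d_true * np.sqrt(cronbach_alpha)   # attenuated effect size
            ncp = d_obs * np.sqrt(n_per_group / 2.0)   # approximate noncentrality
            z_crit = norm.ppf(1 - sig / 2)
            return 1 - norm.cdf(z_crit - ncp) + norm.cdf(-z_crit - ncp)

        for ca in (0.5, 0.7, 0.9):
            print(ca, round(power_two_sample(0.4, 64, ca), 3))  # power rises with C(alpha)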

  8. Accounting for undetected compounds in statistical analyses of mass spectrometry 'omic studies.

    PubMed

    Taylor, Sandra L; Leiserowitz, Gary S; Kim, Kyoungmi

    2013-12-01

    Mass spectrometry is an important high-throughput technique for profiling small molecular compounds in biological samples and is widely used to identify potential diagnostic and prognostic compounds associated with disease. Data generated by mass spectrometry commonly have many missing values, which arise when a compound is absent from a sample or is present at a concentration below the detection limit. Several strategies are available for statistically analyzing data with missing values. The accelerated failure time (AFT) model assumes all missing values result from censoring below a detection limit. Under a mixture model, missing values can result from a combination of censoring and the absence of a compound. We compare the power and estimation of a mixture model to those of an AFT model. Based on simulated data, we found the AFT model to have greater power to detect differences in means and point mass proportions between groups. However, the AFT model yielded biased estimates, with the bias increasing as the proportion of observations in the point mass increased, while estimates were unbiased with the mixture model except when all missing observations came from censoring. These findings suggest using the AFT model for hypothesis testing and the mixture model for estimation. We demonstrate this approach through application to glycomics data of serum samples from women with ovarian cancer and matched controls.
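    The two competing likelihoods can be written down compactly for one group of log-abundances with detection limit dl: an AFT/Tobit-style model treats every missing value as left-censored, while the mixture adds a point mass pi0 of truly absent compounds. The sketch below (with made-up data, and ignoring truncation subtleties) is only meant to make the contrast concrete.

        import numpy as np
        from scipy.stats import norm
        from scipy.optimize import minimize

        def negll_censored(params, obs, n_missing, dl):
            mu, log_sd = params
            sd = np.exp(log_sd)
            ll = norm.logpdf(obs, mu, sd).sum() + n_missing * norm.logcdf(dl, mu, sd)
            return -ll

        def negll_mixture(params, obs, n_missing, dl):
            mu, log_sd, logit_pi0 = params
            sd, pi0 = np.exp(log_sd), 1 / (1 + np.exp(-logit_pi0))
            ll_obs = (np.log1p(-pi0) + norm.logpdf(obs, mu, sd)).sum()
            p_miss = pi0 + (1 - pi0) * norm.cdf(dl, mu, sd)  # absent or censored
            return -(ll_obs + n_missing * np.log(p_miss))

        obs = np.array([3.1, 2.8, 3.5, 2.6, 3.0])   # detected log-abundances (toy)
        fit = minimize(negll_censored, x0=[3.0, 0.0], args=(obs, 3, 2.5))
        print(fit.x)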

  9. SBCDDB: Sleeping Beauty Cancer Driver Database for gene discovery in mouse models of human cancers

    PubMed Central

    Mann, Michael B

    2018-01-01

    Abstract Large-scale oncogenomic studies have identified few frequently mutated cancer drivers and hundreds of infrequently mutated drivers. Defining the biological context for rare driving events is fundamentally important to increasing our understanding of the druggable pathways in cancer. Sleeping Beauty (SB) insertional mutagenesis is a powerful gene discovery tool used to model human cancers in mice. Our lab and others have published a number of studies that identify cancer drivers from these models using various statistical and computational approaches. Here, we have integrated SB data from primary tumor models into an analysis and reporting framework, the Sleeping Beauty Cancer Driver DataBase (SBCDDB, http://sbcddb.moffitt.org), which identifies drivers in individual tumors or tumor populations. Unique to this effort, the SBCDDB utilizes a single, scalable, statistical analysis method that enables data to be grouped by different biological properties. This allows for SB drivers to be evaluated (and re-evaluated) under different contexts. The SBCDDB provides visual representations highlighting the spatial attributes of transposon mutagenesis and couples this functionality with analysis of gene sets, enabling users to interrogate relationships between drivers. The SBCDDB is a powerful resource for comparative oncogenomic analyses with human cancer genomics datasets for driver prioritization. PMID:29059366

  10. The power and robustness of maximum LOD score statistics.

    PubMed

    Yoo, Y J; Mendell, N R

    2008-07-01

    The maximum LOD score statistic is extremely powerful for gene mapping when calculated using the correct genetic parameter value. When the mode of genetic transmission is unknown, the maximum of the LOD scores obtained using several genetic parameter values is reported. This latter statistic requires a higher critical value than the maximum LOD score statistic calculated from a single genetic parameter value. In this paper, we compare the power of maximum LOD scores based on three fixed sets of genetic parameter values with the power of the LOD score obtained after maximizing over the entire range of genetic parameter values. We simulate family data under nine generating models. For generating models with non-zero phenocopy rates, LOD scores maximized over the entire range of genetic parameters yielded greater power than maximum LOD scores for fixed sets of parameter values with zero phenocopy rates. No maximum LOD score was consistently more powerful than the others for generating models with a zero phenocopy rate. The power loss of the LOD score maximized over the entire range of genetic parameters, relative to the maximum LOD score calculated using the correct genetic parameter value, appeared to be robust to the generating models.
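    For orientation, the simplest version of the statistic being maximized: with n informative meioses and k observed recombinants, the LOD at recombination fraction theta compares the binomial likelihood to free recombination (theta = 0.5). Maximizing over a grid of theta values mirrors, in miniature, the maximization over genetic parameter values studied above; the counts are illustrative.

        import numpy as np

        def lod(theta, k, n):
            return (k * np.log10(theta) + (n - k) * np.log10(1 - theta)
                    - n * np.log10(0.5))

        thetas = np.linspace(0.01, 0.5, 50)
        print(max(lod(t, k=2, n=20) for t in thetas))  # maximum LOD over the grid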

  11. Power Enhancement in High Dimensional Cross-Sectional Tests

    PubMed Central

    Fan, Jianqing; Liao, Yuan; Yao, Jiawei

    2016-01-01

    We propose a novel technique to boost the power of testing a high-dimensional vector H0 : θ = 0 against sparse alternatives where the null hypothesis is violated by only a few components. Existing tests based on quadratic forms, such as the Wald statistic, often suffer from low power due to the accumulation of errors in estimating high-dimensional parameters. More powerful tests for sparse alternatives, such as thresholding and extreme-value tests, on the other hand, require either stringent conditions or the bootstrap to derive the null distribution, and often suffer from size distortions due to slow convergence. Based on a screening technique, we introduce a “power enhancement component”, which is zero under the null hypothesis with high probability but diverges quickly under sparse alternatives. The proposed test statistic combines the power enhancement component with an asymptotically pivotal statistic, and strengthens the power under sparse alternatives. The null distribution does not require stringent regularity conditions, and is completely determined by that of the pivotal statistic. As specific applications, the proposed methods are applied to testing factor pricing models and validating cross-sectional independence in panel data models. PMID:26778846
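    A schematic version of the construction (the threshold and scalings here follow the spirit of the paper but are illustrative, not its exact tuning):

        import numpy as np
        from scipy.stats import norm

        def enhanced_test(theta_hat, se, n, alpha=0.05):
            p = theta_hat.size
            z = theta_hat / se
            # Asymptotically pivotal part: standardized Wald-type quadratic form.
            j1 = (z @ z - p) / np.sqrt(2 * p)
            # Power enhancement component: screened sum, zero with high
            # probability under the null, divergent under sparse alternatives.
            delta = np.sqrt(2 * np.log(p) * np.log(np.log(n)))
            j0 = np.sqrt(p) * np.sum(z**2 * (np.abs(z) > delta))
            return j0 + j1 > norm.ppf(1 - alpha)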

  12. Power to detect trends in Missouri River fish populations within the Habitat Assessment Monitoring Program

    USGS Publications Warehouse

    Bryan, Janice L.; Wildhaber, Mark L.; Gladish, Dan W.

    2010-01-01

    As with all large rivers in the United States, the Missouri River has been altered, with approximately one-third of the mainstem length impounded and one-third channelized. These physical alterations to the environment have affected the fish populations, but studies examining the effects of alterations have been localized and for short periods of time, thereby preventing generalization. In response to the U.S. Fish and Wildlife Service Biological Opinion, the U.S. Army Corps of Engineers (USACE) initiated monitoring of habitat improvements of the Missouri River in 2005. The goal of the Habitat Assessment Monitoring Program (HAMP) is to provide information on the response of target fish species to the USACE habitat creation on the Lower Missouri River. To determine the statistical power of the HAMP and in cooperation with USACE, a power analysis was conducted using a normal linear mixed model with variance component estimates based on the first complete year of data. At a level of 20/16 (20 bends with 16 subsamples in each bend), at least one species/month/gear model has the power to determine differences between treated and untreated bends. The trammel net in September had the most species models with adequate power at the 20/16 level and overall, the trammel net had the most species/month models with adequate power at the 20/16 level. However, using only one gear or gear/month combination would eliminate other species of interest, such as three chub species (Macrhybopsis meeki, Macrhybopsis aestivalis, and Macrhybopsis gelida), sand shiners (Notropis stramineus), pallid sturgeon (Scaphirhynchus albus), and juvenile sauger (Sander canadensis). Since gear types are selective in their species efficiency, the strength of the HAMP approach is using multiple gears that have statistical power to differentiate habitat treatment differences in different fish species within the Missouri River. As is often the case with sampling rare species like the pallid sturgeon, the data used to conduct the analyses exhibit some departures from the parametric model assumptions. However, preliminary simulations indicate that the results of this study are appropriate for application to the HAMP study design.

  13. Statistical Power of Alternative Structural Models for Comparative Effectiveness Research: Advantages of Modeling Unreliability.

    PubMed

    Coman, Emil N; Iordache, Eugen; Dierker, Lisa; Fifield, Judith; Schensul, Jean J; Suggs, Suzanne; Barbour, Russell

    2014-05-01

    The advantages of modeling the unreliability of outcomes when evaluating the comparative effectiveness of health interventions are illustrated. Adding an action-research intervention component to a regular summer job program for youth was expected to help in preventing risk behaviors. A series of simple two-group alternative structural equation models are compared, to test the effect of the intervention on one key attitudinal outcome, in terms of model fit and statistical power with Monte Carlo simulations. Some models presuming parameters equal across the intervention and comparison groups were underpowered to detect the intervention effect, yet modeling the unreliability of the outcome measure increased their statistical power and helped in the detection of the hypothesized effect. Comparative effectiveness research (CER) could benefit from flexible multi-group alternative structural models organized in decision trees, and modeling the unreliability of measures can be of tremendous help for both the fit of statistical models to the data and their statistical power.
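    The role of outcome unreliability is easy to see in a stripped-down Monte Carlo (the study compared two-group structural equation models, not the t-tests used here; the effect size, sample size and reliabilities are illustrative): measurement error attenuates the group difference and drains power.

        import numpy as np
        from scipy.stats import ttest_ind

        rng = np.random.default_rng(1)

        def mc_power(reliability, effect=0.3, n=150, reps=2000):
            hits = 0
            err_sd = np.sqrt(1 / reliability - 1)  # error SD implied by reliability
            for _ in range(reps):
                y0 = rng.normal(0.0, 1.0, n) + rng.normal(0, err_sd, n)
                y1 = rng.normal(effect, 1.0, n) + rng.normal(0, err_sd, n)
                hits += ttest_ind(y0, y1).pvalue < 0.05
            return hits / reps

        print(mc_power(0.6), mc_power(0.9))  # power grows with reliability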

  14. Detecting temporal change in freshwater fisheries surveys: statistical power and the important linkages between management questions and monitoring objectives

    USGS Publications Warehouse

    Wagner, Tyler; Irwin, Brian J.; James R. Bence,; Daniel B. Hayes,

    2016-01-01

    Monitoring to detect temporal trends in biological and habitat indices is a critical component of fisheries management. Thus, it is important that management objectives are linked to monitoring objectives. This linkage requires a definition of what constitutes a management-relevant “temporal trend.” It is also important to develop expectations for the amount of time required to detect a trend (i.e., statistical power) and to choose an appropriate statistical model for analysis. We provide an overview of temporal trends commonly encountered in fisheries management, review published studies that evaluated the statistical power of long-term trend detection, and illustrate dynamic linear models in a Bayesian context as an additional analytical approach focused on shorter-term change. We show that monitoring programs generally have low statistical power for detecting linear temporal trends and argue that management should often focus on different definitions of trends, some of which can be better addressed by alternative analytical approaches.
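    The kind of power evaluation reviewed above is usually simulation-based; a minimal sketch (trend size, noise and series lengths are illustrative) counts how often a slope test rejects for series of increasing length:

        import numpy as np
        from scipy.stats import linregress

        rng = np.random.default_rng(7)

        def trend_power(n_years, slope=0.02, sd=0.25, reps=2000, alpha=0.05):
            years = np.arange(n_years)
            hits = 0
            for _ in range(reps):
                y = slope * years + rng.normal(0, sd, n_years)
                hits += linregress(years, y).pvalue < alpha
            return hits / reps

        for n in (5, 10, 20):
            print(n, trend_power(n))  # long series are needed for modest trends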

  15. Statistical Analysis of Large-Scale Structure of Universe

    NASA Astrophysics Data System (ADS)

    Tugay, A. V.

    While galaxy cluster catalogs were compiled many decades ago, other structural elements of the cosmic web have been detected with confidence only in the most recent work. For example, extragalactic filaments have been described in recent years through velocity fields and the SDSS galaxy distribution. The large-scale structure of the Universe could also be mapped in the future using ATHENA observations in X-rays and SKA observations in the radio band. Until detailed observations are available for most of the volume of the Universe, integral statistical parameters can be used to describe it. Methods such as the galaxy correlation function, power spectrum, statistical moments and peak statistics are commonly used for this purpose. The parameters of the power spectrum and other statistics are important for constraining models of dark matter, dark energy, inflation and brane cosmology. In the present work we describe the growth of large-scale density fluctuations in the one- and three-dimensional cases using Fourier harmonics of hydrodynamical parameters. As a result we obtain a power-law relation for the matter power spectrum.
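    The Fourier-harmonic description can be made concrete with a one-dimensional toy field (the spectral index and grid are illustrative): synthesize a Gaussian overdensity field with a power-law spectrum and recover the slope from its harmonics.

        import numpy as np

        n, box = 1024, 100.0
        k = np.fft.rfftfreq(n, d=box / n) * 2 * np.pi
        amp = np.zeros_like(k)
        amp[1:] = k[1:] ** (-1.5 / 2)               # target P(k) ~ k**-1.5
        phases = np.exp(2j * np.pi * np.random.rand(k.size))
        delta = np.fft.irfft(amp * phases, n)       # real-space overdensity field
        p_k = np.abs(np.fft.rfft(delta)) ** 2       # estimated power spectrum
        slope = np.polyfit(np.log(k[1:]), np.log(p_k[1:]), 1)[0]
        print(slope)                                # close to -1.5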

  16. Trait humor and longevity: do comics have the last laugh?

    PubMed

    Rotton, J

    1992-01-01

    Four sets of biographical data were analyzed in order to test the hypothesis that the ability to generate humor is associated with longevity. Although steps were taken to ensure that tests had high levels of statistical power, analyses provided very little support for the idea that individuals with a well-developed sense of humor live longer than serious writers and other entertainers. In addition, a subsidiary analysis revealed that those in the business of entertaining others died at an earlier age than those in other lines of endeavor. These findings suggest that researchers should turn their attention from trait humor to the effects of humorous material.

  17. An Improved 360 Degree and Order Model of Venus Topography

    NASA Technical Reports Server (NTRS)

    Rappaport, Nicole J.; Konopliv, Alex S.; Kucinskas, Algis B.; Ford, Peter G.

    1999-01-01

    We present an improved 360 degree and order spherical harmonic solution for Venus' topography. The new model uses the most recent set of Venus altimetry data with spacecraft positions derived from a recent high resolution gravity model. Geometric analysis indicates that the offset between the center of mass and center of figure of Venus is about 10 times smaller than that for the Earth, the Moon, or Mars. Statistical analyses confirm that the RMS topography follows a power law over the central part of the spectrum. Compared to the previous topography model, the new model is more highly correlated with Venus' harmonic gravity field.

  18. NONPARAMETRIC MANOVA APPROACHES FOR NON-NORMAL MULTIVARIATE OUTCOMES WITH MISSING VALUES

    PubMed Central

    He, Fanyin; Mazumdar, Sati; Tang, Gong; Bhatia, Triptish; Anderson, Stewart J.; Dew, Mary Amanda; Krafty, Robert; Nimgaonkar, Vishwajit; Deshpande, Smita; Hall, Martica; Reynolds, Charles F.

    2017-01-01

    Between-group comparisons often entail many correlated response variables. The multivariate linear model, with its assumption of multivariate normality, is the accepted standard tool for these tests. When this assumption is violated, the nonparametric multivariate Kruskal-Wallis (MKW) test is frequently used. However, this test requires complete cases with no missing values in response variables. Deletion of cases with missing values likely leads to inefficient statistical inference. Here we extend the MKW test to retain information from partially-observed cases. Results of simulated studies and analysis of real data show that the proposed method provides adequate coverage and superior power to complete-case analyses. PMID:29416225

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dharmarajan, Guha; Beasley, James C.; Beatty, William S.

    Many aspects of parasite biology critically depend on their hosts, and understanding how host-parasite populations are co-structured can help improve our understanding of the ecology of parasites, their hosts, and host-parasite interactions. Here, this study utilized genetic data collected from raccoons (Procyon lotor) and a specialist parasite, the raccoon tick (Ixodes texanus), to test for genetic co-structuring of host-parasite populations at both landscape and host scales. At the landscape scale, our analyses revealed a significant correlation between genetic and geographic distance matrices (i.e., isolation by distance) in ticks, but not their hosts. While there are several mechanisms that could lead to a stronger pattern of isolation by distance in tick vs. raccoon datasets, our analyses suggest that at least one reason for the above pattern is the substantial increase in statistical power (due to the ≈8-fold increase in sample size) afforded by sampling parasites. Host-scale analyses indicated higher relatedness between ticks sampled from related vs. unrelated raccoons trapped within the same habitat patch, a pattern likely driven by increased contact rates between related hosts. Lastly, by utilizing fine-scale genetic data from both parasites and hosts, our analyses help improve our understanding of epidemiology and host ecology.
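    The landscape-scale test referred to above is typically a Mantel test; a generic permutation version (the distance matrices here are placeholders for the raccoon and tick data) looks like this:

        import numpy as np

        def mantel(d_gen, d_geo, n_perm=9999, seed=0):
            rng = np.random.default_rng(seed)
            iu = np.triu_indices_from(d_gen, k=1)
            r_obs = np.corrcoef(d_gen[iu], d_geo[iu])[0, 1]
            count = 0
            for _ in range(n_perm):
                perm = rng.permutation(d_gen.shape[0])
                r = np.corrcoef(d_gen[perm][:, perm][iu], d_geo[iu])[0, 1]
                count += r >= r_obs
            return r_obs, (count + 1) / (n_perm + 1)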

  20. Meta-analysis of thirty-two case-control and two ecological radon studies of lung cancer.

    PubMed

    Dobrzynski, Ludwik; Fornalski, Krzysztof W; Reszczynska, Joanna

    2018-03-01

    A re-analysis has been carried out of thirty-two case-control and two ecological studies concerning the influence of radon, a radioactive gas, on the risk of lung cancer. The three mathematically simplest dose-response relationships (models) were tested: constant (zero health effect), linear, and parabolic (linear-quadratic). The health effect end-points reported in the analysed studies are odds ratios or relative risk ratios, related either to morbidity or mortality. In our preliminary analysis, we show that the results of dose-response fitting are qualitatively the same (within uncertainties, given as error bars) whichever of these health effect end-points is applied. Therefore, we deemed it reasonable to aggregate all response data into the so-called Relative Health Factor and jointly analysed such mixed data to obtain better statistical power. In the second part of our analysis, robust Bayesian and classical methods of analysis were applied to this combined dataset, with different subranges of radon concentration selected. In view of the substantial differences between the methodologies used by the authors of the case-control and ecological studies, the mathematical relationships (models) were applied mainly to the thirty-two case-control studies. The degree to which the two ecological studies, analysed separately, affect the overall results when combined with the thirty-two case-control studies has also been evaluated. In all, as a result of our meta-analysis of the combined cohort, we conclude that the analysed data for radon concentrations below ~1000 Bq/m3 (~20 mSv/year of effective dose to the whole body) do not support the thesis that radon may cause any statistically significant increase in lung cancer incidence.
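    The model-comparison step can be pictured as fitting the three candidate dose-response shapes and scoring them, e.g. by AIC; the numbers below are made-up placeholders, not values from the analysed studies.

        import numpy as np

        dose = np.array([25, 50, 100, 200, 400, 800.0])       # Bq/m3 (hypothetical)
        rhf = np.array([1.02, 0.98, 1.01, 0.97, 1.03, 1.00])  # hypothetical RHF

        def aic(residuals, n_params):
            n = residuals.size
            return n * np.log((residuals ** 2).mean()) + 2 * n_params

        for deg, name in ((0, 'constant'), (1, 'linear'), (2, 'linear-quadratic')):
            coef = np.polyfit(dose, rhf, deg)
            resid = rhf - np.polyval(coef, dose)
            print(name, round(aic(resid, deg + 1), 2))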

  1. Load-embedded inertial measurement unit reveals lifting performance.

    PubMed

    Tammana, Aditya; McKay, Cody; Cain, Stephen M; Davidson, Steven P; Vitali, Rachel V; Ojeda, Lauro; Stirling, Leia; Perkins, Noel C

    2018-07-01

    Manual lifting of loads arises in many occupations as well as in activities of daily living. Prior studies explore lifting biomechanics and the conditions implicated in lifting-induced injuries through laboratory-based experimental methods. This study introduces a new measurement method that uses load-embedded inertial measurement units (IMUs) to evaluate lifting tasks in varied environments outside the laboratory. An example vertical load-lifting task, included in an outdoor obstacle course, is considered. The IMU data, in the form of the load acceleration and angular velocity, are used to estimate the load's vertical velocity and three lifting performance metrics: lifting time (speed), power, and motion smoothness. Large qualitative differences in these parameters distinguish exemplar high- and low-performance trials. These differences are further supported by subsequent statistical analyses of twenty-three trials (a total of 115 lift/lower cycles) from fourteen healthy participants. Results reveal that lifting time is strongly correlated with lifting power (as expected) but also correlated with motion smoothness. Thus, participants who lift rapidly do so with significantly greater power, using motions that minimize jerk. Copyright © 2018 Elsevier Ltd. All rights reserved.
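    A rough sketch of how such metrics can be derived from the load's vertical acceleration (assuming drift-free integration over a segmented lift and gravity-only mechanical power; the paper's estimation pipeline is more careful):

        import numpy as np

        def lifting_metrics(acc_z, fs, load_mass, g=9.81):
            dt = 1.0 / fs
            vel = np.cumsum(acc_z) * dt                       # naive integration
            lift_time = acc_z.size * dt                       # segment duration
            mean_power = load_mass * g * np.abs(vel).mean()   # against gravity only
            jerk = np.gradient(acc_z, dt)
            smoothness = -np.log((jerk ** 2).mean())          # higher = smoother
            return lift_time, mean_power, smoothness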

  2. Output power fluctuations due to different weights of macro particles used in particle-in-cell simulations of Cerenkov devices

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bao, Rong; Li, Yongdong; Liu, Chunliang

    2016-07-15

    The output power fluctuations caused by the weights of macro particles used in particle-in-cell (PIC) simulations of a backward wave oscillator and a travelling wave tube are statistically analyzed. It is found that the velocities of electrons that have passed a given slow-wave structure form a characteristic electron velocity distribution. The electron velocity distribution obtained in a PIC simulation with a relatively small macro-particle weight is taken as an initial distribution. By analyzing this initial distribution with a statistical method, estimates of the output power fluctuations caused by different macro-particle weights are obtained. The statistical method is verified by comparing the estimates with the simulation results. The fluctuations become stronger with increasing macro-particle weight, which can also be determined in reverse from estimates of the output power fluctuations. With the macro-particle weights optimized by the statistical method, the output power fluctuations in PIC simulations are relatively small and acceptable.
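    The core statistical point, that beam-averaged quantities fluctuate more as fewer, heavier macro particles represent the same number of electrons, can be reproduced with a toy sampler (the velocity distribution and counts are illustrative):

        import numpy as np

        rng = np.random.default_rng(3)
        n_physical = 1_000_000            # electrons represented in total

        for weight in (10, 100, 1000):
            n_macro = n_physical // weight
            means = [rng.normal(1.0, 0.05, n_macro).mean() for _ in range(500)]
            print(weight, np.std(means))  # spread grows like sqrt(weight)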

  3. Statistical Data Analyses of Trace Chemical, Biochemical, and Physical Analytical Signatures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Udey, Ruth Norma

    Analytical and bioanalytical chemistry measurement results are most meaningful when interpreted using rigorous statistical treatments of the data. The same data set may provide many dimensions of information depending on the questions asked through the applied statistical methods. Three principal projects illustrated the wealth of information gained through the application of statistical data analyses to diverse problems.

  4. Water-quality characteristics and trends for selected sites at and near the Idaho National Laboratory, Idaho, 1949-2009

    USGS Publications Warehouse

    Bartholomay, Roy C.; Davis, Linda C.; Fisher, Jason C.; Tucker, Betty J.; Raben, Flint A.

    2012-01-01

    The U.S. Geological Survey, in cooperation with the U.S. Department of Energy, analyzed water-quality data collected from 67 aquifer wells and 7 surface-water sites at the Idaho National Laboratory (INL) from 1949 through 2009. The data analyzed included major cations, anions, nutrients, trace elements, and total organic carbon. The analyses were performed to examine water-quality trends that might inform future management decisions about the number of wells to sample at the INL and the type of constituents to monitor. Water-quality trends were determined using (1) the nonparametric Kendall's tau correlation coefficient, p-value, Theil-Sen slope estimator, and summary statistics for uncensored data; and (2) the Kaplan-Meier method for calculating summary statistics, Kendall's tau correlation coefficient, p-value, and Akritas-Theil-Sen slope estimator for robust linear regression for censored data. Statistical analyses for chloride concentrations indicate that groundwater influenced by Big Lost River seepage has decreasing chloride trends or, in some cases, has variable chloride concentration changes that correlate with above-average and below-average periods of recharge. Analyses of trends for chloride in water samples from four sites located along the Big Lost River indicate a decreasing trend or no trend for chloride, and chloride concentrations generally are much lower at these four sites than those in the aquifer. Above-average and below-average periods of recharge also affect concentration trends for sodium, sulfate, nitrate, and a few trace elements in several wells. Analyses of trends for constituents in water from several of the wells that is mostly regionally derived groundwater generally indicate increasing trends for chloride, sodium, sulfate, and nitrate concentrations. These increases are attributed to agricultural or other anthropogenic influences on the aquifer upgradient of the INL. Statistical trends of chemical constituents from several wells near the Naval Reactors Facility may be influenced by wastewater disposal at the facility or by anthropogenic influence from the Little Lost River basin. Groundwater samples from three wells downgradient of the Power Burst Facility area show increasing trends for chloride, nitrate, sodium, and sulfate concentrations. The increases could be caused by wastewater disposal in the Power Burst Facility area. Some groundwater samples in the southwestern part of the INL and southwest of the INL show concentration trends for chloride and sodium that may be influenced by wastewater disposal. Some of the groundwater samples have decreasing trends that could be attributed to the decreasing concentrations in the wastewater from the late 1970s to 2009. The young fraction of groundwater in many of the wells is more than 20 years old, so samples collected in the early 1990s are more representative of groundwater discharged in the 1960s and 1970s, when concentrations in wastewater were much higher. Groundwater sampled in 2009 would be representative of the lower concentrations of chloride and sodium in wastewater discharged in the late 1980s. Analyses of trends for sodium in several groundwater samples from the central and southern part of the eastern Snake River aquifer show increasing trends. In most cases, however, the sodium concentrations are less than background concentrations measured in the aquifer. 
Many of the wells are open to larger mixed sections of the aquifer, and the increasing trends may indicate that the long history of wastewater disposal in the central part of the INL is increasing sodium concentrations in the groundwater.
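    For the uncensored case, the trend toolkit named above maps directly onto standard SciPy calls; the chloride series here is a synthetic placeholder, not INL data.

        import numpy as np
        from scipy.stats import kendalltau, theilslopes

        year = np.arange(1990, 2010)
        rng = np.random.default_rng(5)
        chloride = 20 + 0.3 * (year - 1990) + rng.normal(0, 1, year.size)

        tau, p = kendalltau(year, chloride)
        slope, intercept, lo_slope, hi_slope = theilslopes(chloride, year)
        print(f'tau={tau:.2f} p={p:.3f} slope={slope:.2f} mg/L per year')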

  5. Sequence data and association statistics from 12,940 type 2 diabetes cases and controls.

    PubMed

    Flannick, Jason; Fuchsberger, Christian; Mahajan, Anubha; Teslovich, Tanya M; Agarwala, Vineeta; Gaulton, Kyle J; Caulkins, Lizz; Koesterer, Ryan; Ma, Clement; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Dennis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeri; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; 
O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana Cn; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C W; Pollin, Toni I; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Altshuler, David; Burtt, Noël P; Florez, Jose C; Boehnke, Michael; McCarthy, Mark I

    2017-12-19

    To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1-5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82 K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44 K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D.

  6. An Analysis Pipeline with Statistical and Visualization-Guided Knowledge Discovery for Michigan-Style Learning Classifier Systems

    PubMed Central

    Urbanowicz, Ryan J.; Granizo-Mackenzie, Ambrose; Moore, Jason H.

    2014-01-01

    Michigan-style learning classifier systems (M-LCSs) represent an adaptive and powerful class of evolutionary algorithms which distribute the learned solution over a sizable population of rules. However, their application to complex real-world data mining problems, such as genetic association studies, has been limited. Traditional knowledge discovery strategies for M-LCS rule populations involve sorting and manual rule inspection. While this approach may be sufficient for simpler problems, the confounding influence of noise and the need to discriminate between predictive and non-predictive attributes call for additional strategies. Additionally, tests of significance must be adapted to M-LCS analyses in order to make them a viable option within fields that require such analyses to assess confidence. In this work we introduce an M-LCS analysis pipeline that combines uniquely applied visualizations with objective statistical evaluation for the identification of predictive attributes and reliable rule generalizations in noisy single-step data mining problems. This work considers an alternative paradigm for knowledge discovery in M-LCSs, shifting the focus from individual rules to a global, population-wide perspective. We demonstrate the efficacy of this pipeline applied to the identification of epistasis (i.e., attribute interaction) and heterogeneity in noisy simulated genetic association data. PMID:25431544

  7. What do results from coordinate-based meta-analyses tell us?

    PubMed

    Albajes-Eizagirre, Anton; Radua, Joaquim

    2018-08-01

    Coordinate-based meta-analyses (CBMA) methods, such as Activation Likelihood Estimation (ALE) and Seed-based d Mapping (SDM), have become an invaluable tool for summarizing the findings of voxel-based neuroimaging studies. However, the progressive sophistication of these methods may have concealed two particularities of their statistical tests. Common univariate voxelwise tests (such as the t/z-tests used in SPM and FSL) detect voxels that activate, or voxels that show differences between groups. Conversely, the tests conducted in CBMA test for "spatial convergence" of findings, i.e., they detect regions where studies report "more peaks than in most regions", regions that activate "more than most regions do", or regions that show "larger differences between groups than most regions do". The first particularity is that these tests rely on two spatial assumptions (voxels are independent and have the same probability to have a "false" peak), whose violation may make their results either conservative or liberal, though fortunately current versions of ALE, SDM and some other methods consider these assumptions. The second particularity is that the use of these tests involves an important paradox: the statistical power to detect a given effect is higher if there are no other effects in the brain, whereas lower in presence of multiple effects. Copyright © 2018 Elsevier Inc. All rights reserved.

  8. A weighted U statistic for association analyses considering genetic heterogeneity.

    PubMed

    Wei, Changshuai; Elston, Robert C; Lu, Qing

    2016-07-20

    Converging evidence suggests that common complex diseases with the same or similar clinical manifestations could have different underlying genetic etiologies. While current research interests have shifted toward uncovering rare variants and structural variations predisposing to human diseases, the impact of heterogeneity in genetic studies of complex diseases has been largely overlooked. Most existing statistical methods assume the disease under investigation has a homogeneous genetic effect and could, therefore, have low power if the disease undergoes heterogeneous pathophysiological and etiological processes. In this paper, we propose a heterogeneity-weighted U (HWU) method for association analyses considering genetic heterogeneity. HWU can be applied to various types of phenotypes (e.g., binary and continuous) and is computationally efficient for high-dimensional genetic data. Through simulations, we showed the advantage of HWU when the underlying genetic etiology of a disease was heterogeneous, as well as the robustness of HWU against different model assumptions (e.g., phenotype distributions). Using HWU, we conducted a genome-wide analysis of nicotine dependence from the Study of Addiction: Genetics and Environments dataset. The genome-wide analysis of nearly one million genetic markers took 7 hours, identifying heterogeneous effects of two new genes (i.e., CYP3A5 and IKBKB) on nicotine dependence. Copyright © 2016 John Wiley & Sons, Ltd.
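    A generic weighted-U construction in the spirit of HWU as described above (a conceptual sketch, not the paper's exact statistic or its significance machinery): pairwise phenotype similarity is weighted by genetic similarity, so subgroups with heterogeneous effects still contribute signal.

        import numpy as np

        def weighted_u(pheno, geno):
            y = (pheno - pheno.mean()) / pheno.std()
            w = np.corrcoef(geno)        # individual-by-individual similarity
            np.fill_diagonal(w, 0.0)     # exclude self-pairs from the U-sum
            return (np.outer(y, y) * w).sum()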

  9. Sequence data and association statistics from 12,940 type 2 diabetes cases and controls

    PubMed Central

    Jason, Flannick; Fuchsberger, Christian; Mahajan, Anubha; Teslovich, Tanya M.; Agarwala, Vineeta; Gaulton, Kyle J.; Caulkins, Lizz; Koesterer, Ryan; Ma, Clement; Moutsianas, Loukas; McCarthy, Davis J.; Rivas, Manuel A.; Perry, John R. B.; Sim, Xueling; Blackwell, Thomas W.; Robertson, Neil R.; Rayner, N William; Cingolani, Pablo; Locke, Adam E.; Tajes, Juan Fernandez; Highland, Heather M.; Dupuis, Josee; Chines, Peter S.; Lindgren, Cecilia M.; Hartl, Christopher; Jackson, Anne U.; Chen, Han; Huyghe, Jeroen R.; van de Bunt, Martijn; Pearson, Richard D.; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M.; Gamazon, Eric R.; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A.; Below, Jennifer E.; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L.; Pasko, Dorota; Parker, Stephen C. J.; Varga, Tibor V.; Green, Todd; Beer, Nicola L.; Day-Williams, Aaron G.; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J.; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P.; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F.; Han, Bok-Ghee; Jenkinson, Christopher P.; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C. Y.; Palmer, Nicholette D.; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E.; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D.; Neale, Benjamin M.; Purcell, Shaun; Butterworth, Adam S.; Howson, Joanna M. M.; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K. L.; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H. T.; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E.; Rybin, Dennis; Farook, Vidya S.; Fowler, Sharon P.; Freedman, Barry I.; Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J.; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; Loh, Marie; Musani, Solomon K.; Puppala, Sobha; Scott, William R.; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A.; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C.; Mangino, Massimo; Bonnycastle, Lori L.; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L.; Herder, Christian; Groves, Christopher J.; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A.; Doney, Alex S. 
F.; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J.; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeri; Hollensted, Mette; Jørgensen, Marit E.; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H.; Stirrups, Kathleen; Wood, Andrew R.; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O.; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P.; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B.; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N. A.; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M.; Syvänen, Ann-Christine; Bergman, Richard N.; Bharadwaj, Dwaipayan; Bottinger, Erwin P.; Cho, Yoon Shin; Chandak, Giriraj R.; Chan, Juliana CN; Chia, Kee Seng; Daly, Mark J.; Ebrahim, Shah B.; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A.; Lehman, Donna M.; Jia, Weiping; Ma, Ronald C. W.; Pollin, Toni I.; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J. F.; Small, Kerrin S.; Ried, Janina S.; DeFronzo, Ralph A.; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J.; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W.; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R.; Gloyn, Anna L.; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D.; Hattersley, Andrew T.; Bowden, Donald W.; Collins, Francis S.; Atzmon, Gil; Chambers, John C.; Spector, Timothy D.; Laakso, Markku; Strom, Tim M.; Bell, Graeme I.; Blangero, John; Duggirala, Ravindranath; Tai, E. Shyong; McVean, Gilean; Hanis, Craig L.; Wilson, James G.; Seielstad, Mark; Frayling, Timothy M.; Meigs, James B.; Cox, Nancy J.; Sladek, Rob; Lander, Eric S.; Gabriel, Stacey; Mohlke, Karen L.; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Scott, Laura J.; Morris, Andrew P.; Kang, Hyun Min; Altshuler, David; Burtt, Noël P.; Florez, Jose C.; Boehnke, Michael; McCarthy, Mark I.

    2017-01-01

    To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1–5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82 K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44 K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D. PMID:29257133

  10. Field Synopsis and Re-analysis of Systematic Meta-analyses of Genetic Association Studies in Multiple Sclerosis: a Bayesian Approach.

    PubMed

    Park, Jae Hyon; Kim, Joo Hi; Jo, Kye Eun; Na, Se Whan; Eisenhut, Michael; Kronbichler, Andreas; Lee, Keum Hwa; Shin, Jae Il

    2018-07-01

    To provide an up-to-date summary of multiple sclerosis-susceptible gene variants and assess their noteworthiness in hopes of finding true associations, we investigated the results of 44 meta-analyses on gene variants and multiple sclerosis published through December 2016. Out of 70 statistically significant genotype associations, roughly a fifth (21%) of the comparisons showed a noteworthy false-positive rate probability (FPRP) at a statistical power to detect an OR of 1.5 and at a prior probability of 10^-6 assumed for a random single nucleotide polymorphism. These associations (IRF8/rs17445836, STAT3/rs744166, HLA/rs4959093, HLA/rs2647046, HLA/rs7382297, HLA/rs17421624, HLA/rs2517646, HLA/rs9261491, HLA/rs2857439, HLA/rs16896944, HLA/rs3132671, HLA/rs2857435, HLA/rs9261471, HLA/rs2523393, HLA-DRB1/rs3135388, RGS1/rs2760524, PTGER4/rs9292777) also showed a noteworthy Bayesian false discovery probability (BFDP), and one additional association (CD24 rs8734/rs52812045) was also noteworthy via BFDP computation. Herein, we have identified several noteworthy biomarkers of multiple sclerosis susceptibility. We hope these data are used to study multiple sclerosis genetics and inform future screening programs.
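    The FPRP screen itself is a one-line calculation (Wacholder et al.'s formula; the inputs below are illustrative): an association is deemed noteworthy when the FPRP falls below a chosen cutoff.

        def fprp(alpha, power, prior):
            # False-positive report probability given the observed p-value
            # threshold alpha, power to detect the assumed OR, and the prior.
            return (alpha * (1 - prior)) / (alpha * (1 - prior) + power * prior)

        print(fprp(alpha=1e-7, power=0.8, prior=1e-6))  # small => noteworthy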

  11. Properties of different selection signature statistics and a new strategy for combining them.

    PubMed

    Ma, Y; Ding, X; Qanbari, S; Weigend, S; Zhang, Q; Simianer, H

    2015-11-01

    Identifying signatures of recent or ongoing selection is of high relevance in livestock population genomics. From a statistical perspective, determining a proper testing procedure and combining various test statistics is challenging. On the basis of extensive simulations in this study, we discuss the statistical properties of eight different established selection signature statistics. In the considered scenario, we show that a reasonable power to detect selection signatures is achieved with high marker density (>1 SNP/kb) as obtained from sequencing, while rather small sample sizes (~15 diploid individuals) appear to be sufficient. Most selection signature statistics, such as the composite likelihood ratio and cross-population extended haplotype homozygosity, have the highest power when fixation of the selected allele is reached, while the integrated haplotype score has the highest power when selection is ongoing. We suggest a novel strategy, called de-correlated composite of multiple signals (DCMS), to combine different statistics for detecting selection signatures while accounting for the correlation between the different selection signature statistics. When examined with simulated data, DCMS consistently has a higher power than most of the single statistics and shows a reliable positional resolution. We illustrate the new statistic on the established selective sweep around the lactase gene in human HapMap data, providing further evidence of the reliability of this new statistic. We then apply it to scan for selection signatures in two chicken samples with diverse skin color. Our analysis suggests that a set of well-known genes such as BCO2, MC1R, ASIP and TYR were involved in the divergent selection for this trait.
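    In the spirit of DCMS as described (the exact transforms in the paper may differ in detail), each statistic is converted to a p-value-based score and down-weighted by its summed absolute correlation with the other statistics, so redundant signals do not count twice:

        import numpy as np

        def dcms(pvals, corr):
            """pvals: (n_loci, n_stats) per-locus p-values; corr: (n_stats,
            n_stats) correlation matrix of the statistics across loci."""
            weights = 1.0 / np.abs(corr).sum(axis=0)  # de-correlation weights
            scores = np.log10((1 - pvals) / pvals)    # large when p is small
            return scores @ weights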

  12. Statistics and bioinformatics in nutritional sciences: analysis of complex data in the era of systems biology⋆

    PubMed Central

    Fu, Wenjiang J.; Stromberg, Arnold J.; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao

    2009-01-01

    Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral role in advancing nutritional sciences. Currently, in the era of systems biology, statistics has become an increasingly important tool to quantitatively analyze information about biological macromolecules. This article describes general terms used in statistical analysis of large, complex experimental data. These terms include experimental design, power analysis, sample size calculation, and experimental errors (type I and II errors) for nutritional studies at population, tissue, cellular, and molecular levels. In addition, we highlighted various sources of experimental variations in studies involving microarray gene expression, real-time polymerase chain reaction, proteomics, and other bioinformatics technologies. Moreover, we provided guidelines for nutritionists and other biomedical scientists to plan and conduct studies and to analyze the complex data. Appropriate statistical analyses are expected to make an important contribution to solving major nutrition-associated problems in humans and animals (including obesity, diabetes, cardiovascular disease, cancer, ageing, and intrauterine fetal retardation). PMID:20233650

  13. Evaluation of Solid Rocket Motor Component Data Using a Commercially Available Statistical Software Package

    NASA Technical Reports Server (NTRS)

    Stefanski, Philip L.

    2015-01-01

    Commercially available software packages today allow users to quickly perform the routine evaluations of (1) descriptive statistics to numerically and graphically summarize both sample and population data, (2) inferential statistics that draw conclusions about a given population from samples taken of it, (3) probability determinations that can be used to generate estimates of reliability allowables, and finally (4) the setup of designed experiments and analysis of their data to identify significant material and process characteristics for application in both product manufacturing and performance enhancement. This paper presents examples of analysis and experimental design work that has been conducted using Statgraphics® statistical software to obtain useful information with regard to solid rocket motor propellants and internal insulation material. Data were obtained from a number of programs (Shuttle, Constellation, and Space Launch System) and sources that include solid propellant burn rate strands, tensile specimens, sub-scale test motors, full-scale operational motors, rubber insulation specimens, and sub-scale rubber insulation analog samples. Besides facilitating the experimental design process to yield meaningful results, statistical software has demonstrated its ability to quickly perform complex data analyses and yield significant findings that might otherwise have gone unnoticed. One caveat to these successes is that useful results derive not only from the inherent power of the software package, but also from the skill and understanding of the data analyst.

  14. An entropy-based statistic for genomewide association studies.

    PubMed

    Zhao, Jinying; Boerwinkle, Eric; Xiong, Momiao

    2005-07-01

    Efficient genotyping methods and the availability of a large collection of single-nucleotide polymorphisms provide valuable tools for genetic studies of human disease. The standard χ² statistic for case-control studies, which uses a linear function of allele frequencies, has limited power when the number of marker loci is large. We introduce a novel test statistic for genetic association studies that uses Shannon entropy and a nonlinear function of allele frequencies to amplify the differences in allele and haplotype frequencies to maintain statistical power with large numbers of marker loci. We investigate the relationship between the entropy-based test statistic and the standard χ² statistic and show that, in most cases, the power of the entropy-based statistic is greater than that of the standard χ² statistic. The distribution of the entropy-based statistic and the type I error rates are validated using simulation studies. Finally, we apply the new entropy-based test statistic to two real data sets, one for the COMT gene and schizophrenia and one for the MMP-2 gene and esophageal carcinoma, to evaluate the performance of the new method for genetic association studies. The results show that the entropy-based statistic obtained smaller P values than did the standard χ² statistic.
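
    The following toy sketch contrasts the standard χ² test with a Shannon-entropy-based comparison of allele frequency distributions. It illustrates only the general idea of using a nonlinear (entropy) function of frequencies; it is not the authors' exact test statistic, and the allele counts are hypothetical.

```python
# Toy contrast of a standard chi-squared test with a Shannon-entropy-based
# comparison of case/control allele frequency distributions. This sketches
# the general idea only; it is not the authors' exact test statistic.
import numpy as np
from scipy.stats import chi2_contingency, entropy

# Hypothetical allele counts (columns: alleles A1..A3) in cases and controls
cases    = np.array([120, 60, 20])
controls = np.array([ 90, 80, 30])

chi2_stat, p, _, _ = chi2_contingency(np.vstack([cases, controls]))
print(f"chi-squared = {chi2_stat:.2f}, p = {p:.4f}")

# Entropy-based contrast: difference in Shannon entropy of the two
# allele-frequency distributions (a nonlinear function of frequencies).
p_cases, p_controls = cases / cases.sum(), controls / controls.sum()
print(f"entropy difference = {abs(entropy(p_cases) - entropy(p_controls)):.4f}")
```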

  15. Real-time forecasting and predictability of catastrophic failure events: from rock failure to volcanoes and earthquakes

    NASA Astrophysics Data System (ADS)

    Main, I. G.; Bell, A. F.; Naylor, M.; Atkinson, M.; Filguera, R.; Meredith, P. G.; Brantut, N.

    2012-12-01

    Accurate prediction of catastrophic brittle failure in rocks and in the Earth presents a significant challenge on theoretical and practical grounds. The governing equations are not known precisely, but are known to produce highly non-linear behavior similar to that of near-critical dynamical systems, with a large and irreducible stochastic component due to material heterogeneity. In a laboratory setting, mechanical, hydraulic and rock physical properties are known to change in systematic ways prior to catastrophic failure, often with significant non-Gaussian fluctuations about the mean signal at a given time, for example in the rate of remotely-sensed acoustic emissions. The effectiveness of such signals in real-time forecasting has never been tested in a controlled laboratory setting, and previous work has often been qualitative in nature and subject to retrospective selection bias, though it has often been invoked as a basis for forecasting natural hazard events such as volcanoes and earthquakes. Here we describe a collaborative experiment in real-time data assimilation to explore the limits of predictability of rock failure in a best-case scenario. Data are streamed from a remote rock deformation laboratory to a user-friendly portal, where several proposed physical/stochastic models can be analysed in parallel in real time, using a variety of statistical fitting techniques, including least squares regression, maximum likelihood fitting, Markov chain Monte Carlo and Bayesian analysis. The results are posted and regularly updated on the web site prior to catastrophic failure, to ensure a true and verifiable prospective test of forecasting power. Preliminary tests on synthetic data with known non-Gaussian statistics show how forecasting power is likely to evolve in the live experiments. In general the predicted failure time does converge on the real failure time, illustrating the bias associated with the 'benefit of hindsight' in retrospective analyses. Inference techniques that account explicitly for non-Gaussian statistics significantly reduce the bias, and increase the reliability and accuracy, of the forecast failure time in prospective mode.
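
    As a rough illustration of prospective failure-time forecasting of this kind, the sketch below fits an accelerating event-rate law to synthetic precursor data and recovers the failure time. The model form (a Voight-style inverse power law) and all parameter values are assumptions for illustration, not the experiment's actual models.

```python
# Sketch of prospective failure-time forecasting: fit an accelerating
# event-rate law, rate(t) = k * (t_f - t)**(-p), to synthetic precursor
# data and recover the failure time t_f. The model form and all parameter
# values are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)
t_f_true, k_true, p_true = 100.0, 50.0, 0.8

t = np.linspace(0, 90, 60)                      # observation times
rate = k_true * (t_f_true - t) ** (-p_true)     # underlying event rate
obs = rate * rng.lognormal(0.0, 0.2, t.size)    # noisy (non-Gaussian) data

def model(t, t_f, k, p):
    return k * (t_f - t) ** (-p)

popt, _ = curve_fit(model, t, obs, p0=[95.0, 10.0, 1.0],
                    bounds=([t.max() + 1e-6, 0, 0], [200, 1e3, 5]))
print(f"forecast failure time: {popt[0]:.1f} (true value {t_f_true})")
```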

  16. PROMISE: a tool to identify genomic features with a specific biologically interesting pattern of associations with multiple endpoint variables

    PubMed Central

    Pounds, Stan; Cheng, Cheng; Cao, Xueyuan; Crews, Kristine R.; Plunkett, William; Gandhi, Varsha; Rubnitz, Jeffrey; Ribeiro, Raul C.; Downing, James R.; Lamba, Jatinder

    2009-01-01

    Motivation: In some applications, prior biological knowledge can be used to define a specific pattern of association of multiple endpoint variables with a genomic variable that is biologically most interesting. However, to our knowledge, there is no statistical procedure designed to detect specific patterns of association with multiple endpoint variables. Results: Projection onto the most interesting statistical evidence (PROMISE) is proposed as a general procedure to identify genomic variables that exhibit a specific biologically interesting pattern of association with multiple endpoint variables. Biological knowledge of the endpoint variables is used to define a vector that represents the biologically most interesting values for statistics that characterize the associations of the endpoint variables with a genomic variable. A test statistic is defined as the dot-product of the vector of the observed association statistics and the vector of the most interesting values of the association statistics. By definition, this test statistic is proportional to the length of the projection of the observed vector of correlations onto the vector of most interesting associations. Statistical significance is determined via permutation. In simulation studies and an example application, PROMISE shows greater statistical power to identify genes with the interesting pattern of associations than classical multivariate procedures, individual endpoint analyses or listing genes that have the pattern of interest and are significant in more than one individual endpoint analysis. Availability: Documented R routines are freely available from www.stjuderesearch.org/depts/biostats and will soon be available as a Bioconductor package from www.bioconductor.org. Contact: stanley.pounds@stjude.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19528086
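
    The authors distribute documented R routines; the following minimal Python sketch re-expresses the core idea under simplifying assumptions: per-endpoint association statistics (here Pearson correlations) are projected onto a hypothesized pattern vector, and significance is assessed by permuting the genomic variable across subjects. All data and the pattern vector are hypothetical.

```python
# Minimal sketch of the PROMISE idea: compute one association statistic per
# endpoint (here, Pearson correlations), project them onto a hypothesized
# "most interesting" pattern vector, and assess significance by permuting
# the genomic variable across subjects. Data and pattern are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n = 80
gene = rng.normal(size=n)                       # genomic variable
endpoints = np.column_stack([                   # three endpoint variables
    0.4 * gene + rng.normal(size=n),
    -0.3 * gene + rng.normal(size=n),
    0.2 * gene + rng.normal(size=n),
])
pattern = np.array([1.0, -1.0, 1.0])            # biologically expected signs

def promise_stat(g):
    r = np.array([np.corrcoef(g, endpoints[:, j])[0, 1]
                  for j in range(endpoints.shape[1])])
    return pattern @ r                          # projection onto the pattern

observed = promise_stat(gene)
perms = np.array([promise_stat(rng.permutation(gene)) for _ in range(2000)])
p_value = np.mean(perms >= observed)
print(f"PROMISE-style statistic = {observed:.3f}, permutation p = {p_value:.4f}")
```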

  17. The Ironic Effect of Significant Results on the Credibility of Multiple-Study Articles

    ERIC Educational Resources Information Center

    Schimmack, Ulrich

    2012-01-01

    Cohen (1962) pointed out the importance of statistical power for psychology as a science, but the statistical power of studies has not increased, while the number of studies in a single article has increased. It has been overlooked that multiple studies with modest power have a high probability of producing nonsignificant results because power…
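
    The arithmetic behind this point is simple: if studies are independent and each has the same power, the probability that all of them reach significance is that power raised to the number of studies, which falls quickly. A minimal sketch:

```python
# If each study has the same power and studies are independent, the
# probability that ALL of them reach significance falls quickly with the
# number of studies in the article.
for power in (0.8, 0.6):
    for k in (1, 2, 5, 10):
        print(f"power={power:.1f}, k={k:2d}: P(all significant) = {power**k:.3f}")
```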

  18. The Statistical Power of the Cluster Randomized Block Design with Matched Pairs--A Simulation Study

    ERIC Educational Resources Information Center

    Dong, Nianbo; Lipsey, Mark

    2010-01-01

    This study uses simulation techniques to examine the statistical power of the group-randomized design and the matched-pair (MP) randomized block design under various parameter combinations. Both nearest neighbor matching and random matching are used for the MP design. The power of each design for any parameter combination was calculated from…
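
    A minimal simulation of this kind, for the group-randomized design analysed on cluster means, is sketched below; the number of clusters, cluster size, intraclass correlation, and effect size are illustrative choices, not the paper's parameter combinations.

```python
# Minimal power simulation for a group-randomized design, analysed on
# cluster means. Number of clusters, cluster size, ICC, and effect size
# are illustrative assumptions.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)
n_clusters, cluster_size, icc, effect = 20, 25, 0.10, 0.3
sig_b, sig_w = np.sqrt(icc), np.sqrt(1 - icc)   # between/within cluster SDs

def one_trial():
    arm = np.repeat([0, 1], n_clusters // 2)
    cluster_fx = rng.normal(0, sig_b, n_clusters) + effect * arm
    means = np.array([np.mean(rng.normal(mu, sig_w, cluster_size))
                      for mu in cluster_fx])
    return ttest_ind(means[arm == 1], means[arm == 0]).pvalue < 0.05

power = np.mean([one_trial() for _ in range(2000)])
print(f"Estimated power: {power:.2f}")
```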

  19. Asking Sensitive Questions: A Statistical Power Analysis of Randomized Response Models

    ERIC Educational Resources Information Center

    Ulrich, Rolf; Schroter, Hannes; Striegel, Heiko; Simon, Perikles

    2012-01-01

    This article derives the power curves for a Wald test that can be applied to randomized response models when small prevalence rates must be assessed (e.g., detecting doping behavior among elite athletes). These curves enable the assessment of the statistical power that is associated with each model (e.g., Warner's model, crosswise model, unrelated…
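
    The sketch below computes a one-sided Wald-test power curve for Warner's model under standard asymptotics: each respondent answers the sensitive statement with probability p_design and its complement otherwise, so P(yes) = p_design·π + (1 - p_design)·(1 - π). The design probability, sample size, and prevalence values are assumptions for illustration and may differ from the article's parameterization.

```python
# One-sided Wald-test power for Warner's randomized response model under
# standard asymptotics. All design values below are assumed.
import numpy as np
from scipy.stats import norm

def warner_power(pi1, pi0=0.0, p_design=0.7, n=2000, alpha=0.05):
    lam0 = p_design * pi0 + (1 - p_design) * (1 - pi0)   # P(yes) under H0
    lam1 = p_design * pi1 + (1 - p_design) * (1 - pi1)   # P(yes) under H1
    c = (2 * p_design - 1) ** 2
    se0 = np.sqrt(lam0 * (1 - lam0) / (n * c))           # SE of pi-hat, H0
    se1 = np.sqrt(lam1 * (1 - lam1) / (n * c))           # SE of pi-hat, H1
    z = norm.ppf(1 - alpha)
    return norm.cdf((pi1 - pi0 - z * se0) / se1)

for pi1 in (0.01, 0.03, 0.05, 0.10):                     # small prevalences
    print(f"pi = {pi1:.2f}: power = {warner_power(pi1):.3f}")
```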

  20. Statistical Analysis of Individual Participant Data Meta-Analyses: A Comparison of Methods and Recommendations for Practice

    PubMed Central

    Stewart, Gavin B.; Altman, Douglas G.; Askie, Lisa M.; Duley, Lelia; Simmonds, Mark C.; Stewart, Lesley A.

    2012-01-01

    Background Individual participant data (IPD) meta-analyses that obtain “raw” data from studies rather than summary data typically adopt a “two-stage” approach to analysis whereby IPD within trials generate summary measures, which are combined using standard meta-analytical methods. Recently, a range of “one-stage” approaches which combine all individual participant data in a single meta-analysis have been suggested as providing a more powerful and flexible approach. However, they are more complex to implement and require statistical support. This study uses a dataset to compare “two-stage” and “one-stage” models of varying complexity, to ascertain whether results obtained from the approaches differ in a clinically meaningful way. Methods and Findings We included data from 24 randomised controlled trials, evaluating antiplatelet agents, for the prevention of pre-eclampsia in pregnancy. We performed two-stage and one-stage IPD meta-analyses to estimate the overall treatment effect and to explore potential treatment interactions whereby particular types of women and their babies might benefit differentially from receiving antiplatelets. Two-stage and one-stage approaches gave similar results, showing a benefit of using antiplatelets (relative risk 0.90, 95% CI 0.84 to 0.97). Neither approach suggested that any particular type of woman benefited more or less from antiplatelets. There were no material differences in results between different types of one-stage model. Conclusions For these data, two-stage and one-stage approaches to analysis produce similar results. Although one-stage models offer a flexible environment for exploring model structure and are useful where across-study patterns relating to types of participant, intervention and outcome mask similar relationships within trials, the additional insights provided by their usage may not outweigh the costs of statistical support for routine application in syntheses of randomised controlled trials. Researchers considering undertaking an IPD meta-analysis should not necessarily be deterred by a perceived need for sophisticated statistical methods when combining information from large randomised trials. PMID:23056232
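
    For orientation, the "two-stage" approach reduces each trial to a summary effect and then pools the summaries. A minimal sketch with hypothetical event counts (not the antiplatelet trial data):

```python
# Sketch of a "two-stage" IPD meta-analysis: stage 1 reduces each trial's
# data to a log relative risk and its variance; stage 2 pools them with
# fixed-effect inverse-variance weights. Event counts are hypothetical.
import numpy as np

# (events_treat, n_treat, events_ctrl, n_ctrl) per trial
trials = [(30, 200, 40, 200), (12, 150, 18, 145), (55, 400, 70, 410)]

log_rr, var = [], []
for a, n1, c, n0 in trials:                      # stage 1: within-trial
    log_rr.append(np.log((a / n1) / (c / n0)))
    var.append(1/a - 1/n1 + 1/c - 1/n0)          # delta-method variance

w = 1 / np.asarray(var)                          # stage 2: pooling
pooled = np.sum(w * np.asarray(log_rr)) / w.sum()
se = np.sqrt(1 / w.sum())
lo, hi = pooled - 1.96 * se, pooled + 1.96 * se
print(f"Pooled RR = {np.exp(pooled):.2f} "
      f"(95% CI {np.exp(lo):.2f} to {np.exp(hi):.2f})")
```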

  1. Using the bootstrap to establish statistical significance for relative validity comparisons among patient-reported outcome measures

    PubMed Central

    2013-01-01

    Background Relative validity (RV), a ratio of ANOVA F-statistics, is often used to compare the validity of patient-reported outcome (PRO) measures. We used the bootstrap to establish the statistical significance of the RV and to identify key factors affecting its significance. Methods Based on responses from 453 chronic kidney disease (CKD) patients to 16 CKD-specific and generic PRO measures, RVs were computed to determine how well each measure discriminated across clinically-defined groups of patients compared to the most discriminating (reference) measure. Statistical significance of RV was quantified by the 95% bootstrap confidence interval. Simulations examined the effects of sample size, denominator F-statistic, correlation between comparator and reference measures, and number of bootstrap replicates. Results The statistical significance of the RV increased as the magnitude of denominator F-statistic increased or as the correlation between comparator and reference measures increased. A denominator F-statistic of 57 conveyed sufficient power (80%) to detect an RV of 0.6 for two measures correlated at r = 0.7. Larger denominator F-statistics or higher correlations provided greater power. Larger sample size with a fixed denominator F-statistic or more bootstrap replicates (beyond 500) had minimal impact. Conclusions The bootstrap is valuable for establishing the statistical significance of RV estimates. A reasonably large denominator F-statistic (F > 57) is required for adequate power when using the RV to compare the validity of measures with small or moderate correlations (r < 0.7). Substantially greater power can be achieved when comparing measures of a very high correlation (r > 0.9). PMID:23721463
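
    A minimal sketch of the bootstrap procedure for the RV statistic, using simulated data in place of the CKD measures: subjects are resampled within clinical groups, the ratio of ANOVA F statistics is recomputed on each replicate, and a percentile interval summarizes the replicates.

```python
# Bootstrap sketch for the relative validity (RV) statistic, the ratio of
# one-way ANOVA F statistics for a comparator and a reference measure
# across clinically defined groups. Data here are simulated.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(7)
groups = np.repeat([0, 1, 2], 150)               # three clinical groups
reference  = 0.8 * groups + rng.normal(size=groups.size)
comparator = 0.5 * groups + rng.normal(size=groups.size)

def rv(ref, comp, g):
    f_ref  = f_oneway(*(ref[g == k]  for k in np.unique(g))).statistic
    f_comp = f_oneway(*(comp[g == k] for k in np.unique(g))).statistic
    return f_comp / f_ref

boot = []
for _ in range(1000):                            # resample within groups
    idx = np.concatenate([rng.choice(np.where(groups == k)[0],
                                     size=(groups == k).sum(), replace=True)
                          for k in np.unique(groups)])
    boot.append(rv(reference[idx], comparator[idx], groups[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"RV = {rv(reference, comparator, groups):.2f}, "
      f"95% bootstrap CI ({lo:.2f}, {hi:.2f})")
```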

  2. LandScape: a simple method to aggregate p-values and other stochastic variables without a priori grouping.

    PubMed

    Wiuf, Carsten; Schaumburg-Müller Pallesen, Jonatan; Foldager, Leslie; Grove, Jakob

    2016-08-01

    In many areas of science it is customary to perform many tests, potentially millions, simultaneously. To gain statistical power it is common to group tests based on a priori criteria such as predefined regions or sliding windows. However, it is not straightforward to choose grouping criteria, and the results might depend on the criteria chosen. Methods that summarize, or aggregate, test statistics or p-values without relying on a priori criteria are therefore desirable. We present a simple method to aggregate a sequence of stochastic variables, such as test statistics or p-values, into fewer variables without assuming a priori defined groups. We provide different ways to evaluate the significance of the aggregated variables based on theoretical considerations and resampling techniques, and show that under certain assumptions the family-wise error rate (FWER) is controlled in the strong sense. The validity of the method was demonstrated using simulations and real data analyses. Our method may be a useful supplement to standard procedures relying on evaluation of test statistics individually. Moreover, by being agnostic and not relying on predefined selected regions, it might be a practical alternative to conventionally used methods of aggregating p-values over regions. The method is implemented in Python and freely available online (through GitHub, see the Supplementary information).

  3. Early Warning Signs of Suicide in Service Members Who Engage in Unauthorized Acts of Violence

    DTIC Science & Technology

    2016-06-01

    observable to military law enforcement personnel. Statistical analyses tested for differences in warning signs between cases of suicide, violence, or... indicators, (2) Behavioral Change indicators, (3) Social indicators, and (4) Occupational indicators. Statistical analyses were conducted to test for...

  4. The association between major depression prevalence and sex becomes weaker with age.

    PubMed

    Patten, Scott B; Williams, Jeanne V A; Lavorato, Dina H; Wang, Jian Li; Bulloch, Andrew G M; Sajobi, Tolulope

    2016-02-01

    Women have a higher prevalence of major depressive episodes (MDE) than men, and the annual prevalence of MDE declines with age. Age by sex interactions may occur (a weakening of the sex effect with age), but are easily overlooked since individual studies lack the statistical power to detect interactions. The objective of this study was to evaluate age by sex interactions in MDE prevalence. In Canada, a series of 10 national surveys conducted between 1996 and 2013 assessed MDE prevalence in respondents over the age of 14. Treating age as a continuous variable, binomial and linear regression were used to model age by sex interactions in each survey. To increase power, the survey-specific interaction coefficients were then pooled using meta-analytic methods. The estimated interaction terms were homogeneous. In the binomial regression model, I² was 31.2% and was not statistically significant (Q statistic = 13.1, df = 9, p = 0.159). The pooled estimate (-0.004) was significant (z = 3.13, p = 0.002), indicating that the effect of sex became weaker with increasing age. This resulted in near disappearance of the sex difference in the 75+ age group. This finding was also supported by an examination of age- and sex-specific estimates pooled across the surveys. The association of MDE prevalence with sex becomes weaker with age. The interaction may reflect biological effect modification. Investigators should test for, and consider inclusion of, age by sex interactions in epidemiological analyses of MDE prevalence.
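
    The pooling step described here can be sketched compactly: survey-specific interaction coefficients are combined with inverse-variance weights, and homogeneity is summarized by Cochran's Q and I². The coefficients and standard errors below are made up for illustration, not the survey estimates.

```python
# Fixed-effect inverse-variance pooling of survey-specific interaction
# coefficients, with Cochran's Q and I-squared for homogeneity. The
# coefficients and standard errors are made up for illustration.
import numpy as np
from scipy.stats import chi2, norm

beta = np.array([-0.003, -0.005, -0.002, -0.006, -0.004])  # interactions
se   = np.array([ 0.002,  0.003,  0.002,  0.004,  0.002])

w = 1 / se**2
pooled = np.sum(w * beta) / w.sum()
pooled_se = np.sqrt(1 / w.sum())
z = pooled / pooled_se

Q = np.sum(w * (beta - pooled) ** 2)             # Cochran's Q
df = beta.size - 1
I2 = max(0.0, (Q - df) / Q) * 100                # I-squared (%)

print(f"pooled = {pooled:.4f}, z = {z:.2f}, p = {2 * norm.sf(abs(z)):.4f}")
print(f"Q = {Q:.2f} (df={df}, p = {chi2.sf(Q, df):.3f}), I2 = {I2:.1f}%")
```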

  5. [Statistical analysis using freely-available "EZR (Easy R)" software].

    PubMed

    Kanda, Yoshinobu

    2015-10-01

    Clinicians must often perform statistical analyses for purposes such as evaluating preexisting evidence and designing or executing clinical studies. R is a free software environment for statistical computing. R supports many statistical analysis functions, but does not incorporate a statistical graphical user interface (GUI). The R commander provides an easy-to-use basic-statistics GUI for R. However, the statistical functions of the R commander are limited, especially in the field of biostatistics. Therefore, the author added several important statistical functions to the R commander and named it "EZR (Easy R)", which is now being distributed on the following website: http://www.jichi.ac.jp/saitama-sct/. EZR allows the application of statistical functions that are frequently used in clinical studies, such as survival analyses (including competing-risks analyses and the use of time-dependent covariates), by point-and-click access. In addition, by saving the script automatically created by EZR, users can learn R script writing, maintain the traceability of the analysis, and assure that the statistical process is overseen by a supervisor.

  6. On the Spike Train Variability Characterized by Variance-to-Mean Power Relationship.

    PubMed

    Koyama, Shinsuke

    2015-07-01

    We propose a statistical method for modeling the non-Poisson variability of spike trains observed in a wide range of brain regions. Central to our approach is the assumption that the variance and the mean of interspike intervals are related by a power function characterized by two parameters: the scale factor and exponent. It is shown that this single assumption allows the variability of spike trains to have an arbitrary scale and various dependencies on the firing rate in the spike count statistics, as well as in the interval statistics, depending on the two parameters of the power function. We also propose a statistical model for spike trains that exhibits the variance-to-mean power relationship. Based on this, a maximum likelihood method is developed for inferring the parameters from rate-modulated spike trains. The proposed method is illustrated on simulated and experimental spike trains.
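
    A simple way to see the variance-to-mean power relationship, Var = φ · Mean^α, is to fit it by ordinary least squares on the log scale; this log-log fit stands in for the paper's maximum likelihood method. The sketch below simulates gamma-distributed interspike intervals with a known power law and recovers the two parameters.

```python
# Variance-to-mean power relationship, Var = phi * Mean**alpha, estimated
# by OLS on log-transformed interspike interval (ISI) statistics. This
# simple log-log fit stands in for the paper's maximum likelihood method;
# the data are simulated gamma ISIs with a known power law.
import numpy as np

rng = np.random.default_rng(3)
phi_true, alpha_true = 0.5, 1.5

means, variances = [], []
for mu in np.linspace(0.05, 0.5, 15):            # range of mean ISIs
    var = phi_true * mu ** alpha_true
    shape, scale = mu**2 / var, var / mu         # gamma with given mean/var
    isi = rng.gamma(shape, scale, size=5000)
    means.append(isi.mean())
    variances.append(isi.var())

slope, intercept = np.polyfit(np.log(means), np.log(variances), 1)
print(f"alpha ~ {slope:.2f} (true {alpha_true}), "
      f"phi ~ {np.exp(intercept):.2f} (true {phi_true})")
```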

  7. Joint probability of statistical success of multiple phase III trials.

    PubMed

    Zhang, Jianliang; Zhang, Jenny J

    2013-01-01

    In drug development, after completion of phase II proof-of-concept trials, the sponsor needs to make a go/no-go decision to start expensive phase III trials. The probability of statistical success (PoSS) of the phase III trials based on data from earlier studies is an important factor in that decision-making process. Instead of statistical power, the predictive power of a phase III trial, which takes into account the uncertainty in the estimation of treatment effect from earlier studies, has been proposed to evaluate the PoSS of a single trial. However, regulatory authorities generally require statistical significance in two (or more) trials for marketing licensure. We show that the predictive statistics of two future trials are statistically correlated through use of the common observed data from earlier studies. Thus, the joint predictive power should not be evaluated as a simplistic product of the predictive powers of the individual trials. We develop the relevant formulae for the appropriate evaluation of the joint predictive power and provide numerical examples. Our methodology is further extended to the more complex phase III development scenario comprising more than two (K > 2) trials, that is, the evaluation of the PoSS of at least k₀ (k₀≤ K) trials from a program of K total trials. Copyright © 2013 John Wiley & Sons, Ltd.
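
    The correlation the authors describe can be demonstrated with a short Monte Carlo sketch: both future trials condition on the same uncertain phase II estimate, so the joint probability of success differs from the naive product of the individual predictive powers. All numbers below are assumptions for illustration.

```python
# Monte Carlo sketch of why the joint probability of success of two phase
# III trials is not the product of their individual predictive powers:
# both trials condition on the same (uncertain) phase II estimate. The
# phase II estimate, its SE, and the phase III SEs are assumptions.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(11)
theta_hat, se2 = 0.30, 0.15       # phase II effect estimate and its SE
n_sims, se3 = 100_000, 0.10       # SE of each phase III estimate
z_crit = norm.ppf(0.975)

theta = rng.normal(theta_hat, se2, n_sims)       # uncertainty about truth
trial1 = rng.normal(theta, se3) / se3 > z_crit   # trial 1 significant?
trial2 = rng.normal(theta, se3) / se3 > z_crit   # trial 2 significant?

p1, p2 = trial1.mean(), trial2.mean()
print(f"individual predictive powers: {p1:.3f}, {p2:.3f}")
print(f"naive product:                {p1 * p2:.3f}")
print(f"joint probability of success: {(trial1 & trial2).mean():.3f}")
```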

  8. Power of tests for comparing trend curves with application to national immunization survey (NIS).

    PubMed

    Zhao, Zhen

    2011-02-28

    Three statistical tests were proposed to compare trend curves of study outcomes between two socio-demographic strata across consecutive time points, and their statistical power was compared under different trend-curve data. For large sample sizes, with independent normal assumptions among strata and across consecutive time points, Z and Chi-square test statistics were developed, which are functions of the outcome estimates and the standard errors at each of the study time points for the two strata. For small sample sizes, with independent normal assumptions, an F-test statistic was generated, which is a function of the sample sizes of the two strata and of the estimated parameters across the study period. If two trend curves are approximately parallel, the power of the Z-test is consistently higher than that of both the Chi-square and F-tests. If two trend curves cross at low interaction, the power of the Z-test is higher than or equal to that of both the Chi-square and F-tests; however, at high interaction, the powers of the Chi-square and F-tests are higher than that of the Z-test. A measure of the interaction of two trend curves was defined. These tests were applied to the comparison of trend curves of vaccination coverage estimates of standard vaccine series with National Immunization Survey (NIS) 2000-2007 data. Copyright © 2011 John Wiley & Sons, Ltd.
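
    One plausible form of such a Z statistic, sketched below under independence across strata and time points, sums the per-time-point differences and divides by the square root of the summed variances; this is a generic construction and not necessarily the exact statistic developed in the paper. The estimates and standard errors are made up.

```python
# Generic Z test for comparing two trend curves: under independence, the
# summed difference in estimates divided by the square root of the summed
# variances is standard normal under H0. This is a plausible sketch, not
# necessarily the paper's exact statistic. Estimates and SEs are made up.
import numpy as np
from scipy.stats import norm

est1 = np.array([0.72, 0.74, 0.77, 0.79, 0.81])  # stratum 1 coverage
est2 = np.array([0.68, 0.71, 0.72, 0.75, 0.76])  # stratum 2 coverage
se1  = np.array([0.012, 0.011, 0.012, 0.010, 0.011])
se2  = np.array([0.013, 0.012, 0.013, 0.012, 0.012])

z = np.sum(est1 - est2) / np.sqrt(np.sum(se1**2 + se2**2))
print(f"Z = {z:.2f}, two-sided p = {2 * norm.sf(abs(z)):.4f}")
```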

  9. Multivariate statistical assessment of predictors of firefighters' muscular and aerobic work capacity.

    PubMed

    Lindberg, Ann-Sofie; Oksa, Juha; Antti, Henrik; Malm, Christer

    2015-01-01

    Physical capacity has previously been deemed important for firefighters' physical work capacity, and aerobic fitness, muscular strength, and muscular endurance are the most frequently investigated parameters of importance. Traditionally, bivariate and multivariate linear regression statistics have been used to study relationships between physical capacities and work capacities among firefighters. An alternative way to handle datasets consisting of numerous correlated variables is to use multivariate projection analyses, such as Orthogonal Projection to Latent Structures. The first aim of the present study was to evaluate the prediction and predictive power of field and laboratory tests, respectively, on firefighters' physical work capacity on selected work tasks, and to study whether valid predictions could be achieved without anthropometric data. The second aim was to externally validate selected models. The third aim was to validate selected models on firefighters and on civilians. A total of 38 (26 men and 12 women) + 90 (38 men and 52 women) subjects were included in the models and the external validation, respectively. The best prediction (R²) and predictive power (Q²) of Stairs, Pulling, Demolition, Terrain, and Rescue work capacities included field tests (R² = 0.73 to 0.84, Q² = 0.68 to 0.82). The best external validation was for Stairs work capacity (R² = 0.80) and the worst for Demolition work capacity (R² = 0.40). In conclusion, field and laboratory tests could equally well predict physical work capacities for firefighting work tasks, and models excluding anthropometric data were valid. The predictive power was satisfactory for all included work tasks except Demolition.

  10. Investigating human skeletal muscle physiology with unilateral exercise models: when one limb is more powerful than two.

    PubMed

    MacInnis, Martin J; McGlory, Chris; Gibala, Martin J; Phillips, Stuart M

    2017-06-01

    Direct sampling of human skeletal muscle using the needle biopsy technique can facilitate insight into the biochemical and histological responses resulting from changes in exercise or feeding. However, the muscle biopsy procedure is invasive, and analyses are often expensive, which places pragmatic restraints on sample sizes. The unilateral exercise model can serve to increase statistical power and reduce the time and cost of a study. With this approach, 2 limbs of a participant are randomized to 1 of 2 treatments that can be applied almost concurrently or sequentially depending on the nature of the intervention. Similar to a typical repeated measures design, comparisons are made within participants, which increases statistical power by reducing the amount of between-person variability. A washout period is often unnecessary, reducing the time needed to complete the experiment and the influence of potential confounding variables such as habitual diet, activity, and sleep. Variations of the unilateral exercise model have been employed to investigate the influence of exercise, diet, and the interaction between the 2, on a wide range of variables including mitochondrial content, capillary density, and skeletal muscle hypertrophy. Like any model, unilateral exercise has some limitations: it cannot be used to study variables that potentially transfer across limbs, and it is generally limited to exercises that can be performed in pairs of treatments. Where appropriate, however, the unilateral exercise model can yield robust, well-controlled investigations of skeletal muscle responses to a wide range of interventions and conditions including exercise, dietary manipulation, and disuse or immobilization.

  11. A review of geographic variation and Geographic Information Systems (GIS) applications in prescription drug use research.

    PubMed

    Wangia, Victoria; Shireman, Theresa I

    2013-01-01

    While understanding geography's role in healthcare has been an area of research for over 40 years, the application of geography-based analyses to prescription medication use is limited. The body of literature was reviewed to assess the current state of such studies to demonstrate the scale and scope of projects in order to highlight potential research opportunities. To review systematically how researchers have applied geography-based analyses to medication use data. Empiric, English language research articles were identified through PubMed and bibliographies. Original research articles were independently reviewed as to the medications or classes studied, data sources, measures of medication exposure, geographic units of analysis, geospatial measures, and statistical approaches. From 145 publications matching key search terms, forty publications met the inclusion criteria. Cardiovascular and psychotropic classes accounted for the largest proportion of studies. Prescription drug claims were the primary source, and medication exposure was frequently captured as period prevalence. Medication exposure was documented across a variety of geopolitical units such as countries, provinces, regions, states, and postal codes. Most results were descriptive and formal statistical modeling capitalizing on geospatial techniques was rare. Despite the extensive research on small area variation analysis in healthcare, there are a limited number of studies that have examined geographic variation in medication use. Clearly, there is opportunity to collaborate with geographers and GIS professionals to harness the power of GIS technologies and to strengthen future medication studies by applying more robust geospatial statistical methods. Copyright © 2013 Elsevier Inc. All rights reserved.

  12. Using complexity metrics with R-R intervals and BPM heart rate measures.

    PubMed

    Wallot, Sebastian; Fusaroli, Riccardo; Tylén, Kristian; Jegindø, Else-Marie

    2013-01-01

    Lately, growing attention in the health sciences has been paid to the dynamics of heart rate as an indicator of impending failures and for prognoses. Likewise, in the social and cognitive sciences, heart rate is increasingly employed as a measure of arousal and emotional engagement and as a marker of interpersonal coordination. However, there is no consensus about which measurements and analytical tools are most appropriate for mapping the temporal dynamics of heart rate, and quite different metrics are reported in the literature. As complexity metrics of heart rate variability depend critically on the variability of the data, different choices regarding the kind of measures can have a substantial impact on the results. In this article we compare linear and non-linear statistics on two prominent types of heart beat data, beat-to-beat intervals (R-R intervals) and beats-per-minute (BPM). As a proof of concept, we employ a simple rest-exercise-rest task and show that non-linear statistics, namely fractal (DFA) and recurrence (RQA) analyses, reveal information about heart beat activity above and beyond the simple level of heart rate. Non-linear statistics unveil sustained post-exercise effects on heart rate dynamics, but their power to do so critically depends on the type of data that is employed: while R-R intervals are very amenable to non-linear analyses, the success of non-linear methods for BPM data critically depends on their construction. Generally, "oversampled" BPM time series can be recommended, as they retain most of the information about non-linear aspects of heart beat dynamics.

  13. Enhanced statistical tests for GWAS in admixed populations: assessment using African Americans from CARe and a Breast Cancer Consortium.

    PubMed

    Pasaniuc, Bogdan; Zaitlen, Noah; Lettre, Guillaume; Chen, Gary K; Tandon, Arti; Kao, W H Linda; Ruczinski, Ingo; Fornage, Myriam; Siscovick, David S; Zhu, Xiaofeng; Larkin, Emma; Lange, Leslie A; Cupples, L Adrienne; Yang, Qiong; Akylbekova, Ermeg L; Musani, Solomon K; Divers, Jasmin; Mychaleckyj, Joe; Li, Mingyao; Papanicolaou, George J; Millikan, Robert C; Ambrosone, Christine B; John, Esther M; Bernstein, Leslie; Zheng, Wei; Hu, Jennifer J; Ziegler, Regina G; Nyante, Sarah J; Bandera, Elisa V; Ingles, Sue A; Press, Michael F; Chanock, Stephen J; Deming, Sandra L; Rodriguez-Gil, Jorge L; Palmer, Cameron D; Buxbaum, Sarah; Ekunwe, Lynette; Hirschhorn, Joel N; Henderson, Brian E; Myers, Simon; Haiman, Christopher A; Reich, David; Patterson, Nick; Wilson, James G; Price, Alkes L

    2011-04-01

    While genome-wide association studies (GWAS) have primarily examined populations of European ancestry, more recent studies often involve additional populations, including admixed populations such as African Americans and Latinos. In admixed populations, linkage disequilibrium (LD) exists both at a fine scale in ancestral populations and at a coarse scale (admixture-LD) due to chromosomal segments of distinct ancestry. Disease association statistics in admixed populations have previously considered SNP association (LD mapping) or admixture association (mapping by admixture-LD), but not both. Here, we introduce a new statistical framework for combining SNP and admixture association in case-control studies, as well as methods for local ancestry-aware imputation. We illustrate the gain in statistical power achieved by these methods by analyzing data of 6,209 unrelated African Americans from the CARe project genotyped on the Affymetrix 6.0 chip, in conjunction with both simulated and real phenotypes, as well as by analyzing the FGFR2 locus using breast cancer GWAS data from 5,761 African-American women. We show that, at typed SNPs, our method yields an 8% increase in statistical power for finding disease risk loci compared to the power achieved by standard methods in case-control studies. At imputed SNPs, we observe an 11% increase in statistical power for mapping disease loci when our local ancestry-aware imputation framework and the new scoring statistic are jointly employed. Finally, we show that our method increases statistical power in regions harboring the causal SNP in the case when the causal SNP is untyped and cannot be imputed. Our methods and our publicly available software are broadly applicable to GWAS in admixed populations.

  14. Statistical approaches in published ophthalmic clinical science papers: a comparison to statistical practice two decades ago.

    PubMed

    Zhang, Harrison G; Ying, Gui-Shuang

    2018-02-09

    The aim of this study is to evaluate the current practice of statistical analysis of eye data in clinical science papers published in the British Journal of Ophthalmology (BJO) and to determine whether the practice of statistical analysis has improved in the past two decades. All clinical science papers (n=125) published in BJO in January-June 2017 were reviewed for their statistical approaches to analysing the primary ocular measure. We compared our findings to the results from a previous paper that reviewed BJO papers in 1995. Of 112 papers eligible for analysis, half of the studies analysed the data at an individual level because of the nature of the observation, 16 (14%) studies analysed data from one eye only, 36 (32%) studies analysed data from both eyes at the ocular level, one study (1%) analysed the overall summary of ocular findings per individual and three (3%) studies used paired comparisons. Among studies with data available from both eyes, 50 (89%) of 56 papers in 2017 did not analyse data from both eyes or ignored the intereye correlation, as compared with 60 (90%) of 67 papers in 1995 (P=0.96). Among studies that analysed data from both eyes at an ocular level, 33 (92%) of 36 studies completely ignored the intereye correlation in 2017, as compared with 16 (89%) of 18 studies in 1995 (P=0.40). A majority of studies did not analyse the data properly when data from both eyes were available. The practice of statistical analysis did not improve over the past two decades. Collaborative efforts should be made in the vision research community to improve the practice of statistical analysis for ocular data. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
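
    One standard remedy for the intereye correlation problem is a generalized estimating equation (GEE) with eyes clustered within patients. The sketch below uses simulated data and hypothetical variable names; it illustrates the approach, not an analysis from the reviewed papers.

```python
# GEE with an exchangeable working correlation, clustering the two eyes
# within each patient to account for intereye correlation. Data are
# simulated and all variable names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(5)
n_patients = 200
patient = np.repeat(np.arange(n_patients), 2)    # two eyes per patient
treated = rng.integers(0, 2, n_patients).repeat(2)
person_fx = rng.normal(0, 1.0, n_patients).repeat(2)  # shared within patient
iop = 16 - 2.0 * treated + person_fx + rng.normal(0, 1.0, 2 * n_patients)

df = pd.DataFrame({"iop": iop, "treated": treated, "patient": patient})
model = sm.GEE.from_formula("iop ~ treated", groups="patient", data=df,
                            cov_struct=sm.cov_struct.Exchangeable())
print(model.fit().summary().tables[1])
```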

  15. Accounting for standard errors of vision-specific latent trait in regression models.

    PubMed

    Wong, Wan Ling; Li, Xiang; Li, Jialiang; Wong, Tien Yin; Cheng, Ching-Yu; Lamoureux, Ecosse L

    2014-07-11

    To demonstrate the effectiveness of a Hierarchical Bayesian (HB) approach in a modeling framework for association effects that accounts for the SEs of vision-specific latent traits assessed using Rasch analysis. A systematic literature review was conducted in four major ophthalmic journals to evaluate Rasch analyses performed on vision-specific instruments. The HB approach was used to synthesize the Rasch model and the multiple linear regression model for the assessment of association effects related to vision-specific latent traits. This novel HB one-stage "joint-analysis" approach allows all model parameters to be estimated simultaneously; its effectiveness was compared in our simulation study with the frequently used two-stage "separate-analysis" approach (Rasch analysis followed by traditional statistical analyses without adjustment for the SE of the latent trait). Sixty-six reviewed articles performed evaluation and validation of vision-specific instruments using Rasch analysis, and 86.4% (n = 57) performed further statistical analyses on the Rasch-scaled data using traditional statistical methods; none took into consideration the SEs of the estimated Rasch-scaled scores. The two models applied to real data differed in effect size estimates and in the identification of "independent risk factors." Simulation results showed that our proposed HB one-stage "joint-analysis" approach produces greater accuracy (on average, a 5-fold decrease in bias) with comparable power and precision in the estimation of associations when compared with the frequently used two-stage "separate-analysis" procedure, despite accounting for greater uncertainty due to the latent trait. Patient-reported data analysed using Rasch analysis techniques do not take into account the SE of the latent trait in association analyses. The HB one-stage "joint-analysis" is a better approach, producing accurate effect size estimates and information about the independent association of exposure variables with vision-specific latent traits. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  16. Correlates of conflict, power and authority management, aggression and impulse control in the Jamaican population.

    PubMed

    Walcott, G; Hickling, F W

    2013-01-01

    The object of this study is to establish the correlates of the phenomenology of conflict and power management in the Jamaican population. A total of 1506 adult individuals were sampled from 2150 households using a stratified sampling method and assessed using the 12 questions of the Jamaica Personality Disorder Inventory (JPDI) on the phenomenology of conflict and power management, which are grouped into the psychological features of aggressive social behaviour, unlawful behaviour, socially unacceptable behaviour and financial transgressive behaviour. The database of responses to the demographic and JPDI questionnaires was created and analysed using the Statistical Package for the Social Sciences (SPSS) version 17. Of the national population sampled, 69.1% denied having any phenomenological symptoms of abnormal power management relations, while 30.9% admitted to some degree of conflict and power management problems, ranging from mild (10.3%) to moderate (17.1%) or severe (3.5%). In the population, 46.55% had problems with aggressive social behaviour, 9.33% with unlawful behaviour, 9.58% with unacceptable social behaviour and 37.74% with financial transgressive behaviour. Significant gender and socio-economic class patterns for conflict and power management were revealed. This pattern of conflict and power management behaviour is critical in understanding the distinction between normal and abnormal expression of these emotions and actions. Nearly one-third of the sample population studied reported problems with conflict, abnormal power and authority management, impulse control, and serious aggressive and transgressive behaviour.

  17. Effect size and statistical power in the rodent fear conditioning literature - A systematic review.

    PubMed

    Carneiro, Clarissa F D; Moulin, Thiago C; Macleod, Malcolm R; Amaral, Olavo B

    2018-01-01

    Proposals to increase research reproducibility frequently call for focusing on effect sizes instead of p values, as well as for increasing the statistical power of experiments. However, it is unclear to what extent these two concepts are indeed taken into account in basic biomedical science. To study this in a real-case scenario, we performed a systematic review of effect sizes and statistical power in studies on learning of rodent fear conditioning, a widely used behavioral task to evaluate memory. Our search criteria yielded 410 experiments comparing control and treated groups in 122 articles. Interventions had a mean effect size of 29.5%, and amnesia caused by memory-impairing interventions was nearly always partial. Mean statistical power to detect the average effect size observed in well-powered experiments with significant differences (37.2%) was 65%, and was lower among studies with non-significant results. Only one article reported a sample size calculation, and our estimated sample size to achieve 80% power considering typical effect sizes and variances (15 animals per group) was reached in only 12.2% of experiments. Actual effect sizes correlated with effect size inferences made by readers on the basis of textual descriptions of results only when findings were non-significant, and neither effect size nor power correlated with study quality indicators, number of citations or impact factor of the publishing journal. In summary, effect sizes and statistical power have a wide distribution in the rodent fear conditioning literature, but do not seem to have a large influence on how results are described or cited. Failure to take these concepts into consideration might limit attempts to improve reproducibility in this field of science.
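
    The power arithmetic underlying such estimates can be reproduced with a short sketch: achieved power of a two-sided two-sample t test with 15 animals per group, across a range of standardized effect sizes (the d values shown are illustrative, not the review's estimates).

```python
# Achieved power of a two-sided two-sample t test with 15 animals per
# group, across a range of standardized effect sizes. The d values are
# illustrative assumptions.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for d in (0.6, 0.8, 1.0, 1.2):
    power = analysis.power(effect_size=d, nobs1=15, alpha=0.05, ratio=1.0,
                           alternative="two-sided")
    print(f"d = {d:.1f}: power = {power:.2f}")
```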

  19. Microwave power transmission system studies. Volume 2: Introduction, organization, environmental and spaceborne systems analyses

    NASA Technical Reports Server (NTRS)

    Maynard, O. E.; Brown, W. C.; Edwards, A.; Haley, J. T.; Meltz, G.; Howell, J. M.; Nathan, A.

    1975-01-01

    Introduction, organization, analyses, conclusions, and recommendations for each of the spaceborne subsystems are presented. Environmental effects - propagation analyses are presented with appendices covering radio wave diffraction by random ionospheric irregularities, self-focusing plasma instabilities and ohmic heating of the D-region. Analyses of dc to rf conversion subsystems and system considerations for both the amplitron and the klystron are included with appendices for the klystron covering cavity circuit calculations, output power of the solenoid-focused klystron, thermal control system, and confined flow focusing of a relativistic beam. The photovoltaic power source characteristics are discussed as they apply to interfacing with the power distribution flow paths, magnetic field interaction, dc to rf converter protection, power distribution including estimates for the power budget, weights, and costs. Analyses for the transmitting antenna consider the aperture illumination and size, with associated efficiencies and ground power distributions. Analyses of subarray types and dimensions, attitude error, flatness, phase error, subarray layout, frequency tolerance, attenuation, waveguide dimensional tolerances, mechanical including thermal considerations are included. Implications associated with transportation, assembly and packaging, attitude control and alignment are discussed. The phase front control subsystem, including both ground based pilot signal driven adaptive and ground command approaches with their associated phase errors, are analyzed.

  20. Testing the equivalence of modern human cranial covariance structure: Implications for bioarchaeological applications.

    PubMed

    von Cramon-Taubadel, Noreen; Schroeder, Lauren

    2016-10-01

    Estimation of the variance-covariance (V/CV) structure of fragmentary bioarchaeological populations requires the use of proxy extant V/CV parameters. However, it is currently unclear whether extant human populations exhibit equivalent V/CV structures. Random skewers (RS) and hierarchical analyses of common principal components (CPC) were applied to a modern human cranial dataset. Cranial V/CV similarity was assessed globally for samples of individual populations (jackknifed method) and for pairwise population sample contrasts. The results were examined in light of potential explanatory factors for covariance difference, such as geographic region, among-group distance, and sample size. RS analyses showed that population samples exhibited highly correlated multivariate responses to selection, and that differences in RS results were primarily a consequence of differences in sample size. The CPC method yielded mixed results, depending upon the statistical criterion used to evaluate the hierarchy. The hypothesis-testing (step-up) approach was deemed problematic due to sensitivity to low statistical power and elevated Type I errors. In contrast, the model-fitting (lowest AIC) approach suggested that V/CV matrices were proportional and/or shared a large number of CPCs. Pairwise population sample CPC results were correlated with cranial distance, suggesting that population history explains some of the variability in V/CV structure among groups. The results indicate that patterns of covariance in human craniometric samples are broadly similar but not identical. These findings have important implications for choosing extant covariance matrices to use as proxy V/CV parameters in evolutionary analyses of past populations. © 2016 Wiley Periodicals, Inc.

  1. Biomarker analyses in REGARD gastric/GEJ carcinoma patients treated with VEGFR2-targeted antibody ramucirumab.

    PubMed

    Fuchs, Charles S; Tabernero, Josep; Tomášek, Jiří; Chau, Ian; Melichar, Bohuslav; Safran, Howard; Tehfe, Mustapha A; Filip, Dumitru; Topuzov, Eldar; Schlittler, Luis; Udrea, Anghel Adrian; Campbell, William; Brincat, Stephen; Emig, Michael; Melemed, Symantha A; Hozak, Rebecca R; Ferry, David; Caldwell, C William; Ajani, Jaffer A

    2016-10-11

    Angiogenesis inhibition is an important strategy for cancer treatment. Ramucirumab, a human IgG1 monoclonal antibody that targets VEGF receptor 2 (VEGFR2), inhibits VEGF-A, -C, and -D binding and endothelial cell proliferation. To attempt to identify prognostic and predictive biomarkers, retrospective analyses were used to assess tumour (HER2, VEGFR2) and serum (VEGF-C and -D, and soluble (s) VEGFR1 and 3) biomarkers in phase 3 REGARD patients with metastatic gastric/gastroesophageal junction carcinoma. A total of 152 out of 355 (43%) patients randomised to ramucirumab or placebo had ⩾1 evaluable biomarker result, based on VEGFR2 immunohistochemistry or HER2 immunohistochemistry or FISH of blinded baseline tumour tissue samples. Serum samples (32 patients, 9%) were assayed for VEGF-C and -D, and sVEGFR1 and 3. None of the biomarkers tested was associated with ramucirumab efficacy at a level of statistical significance. High VEGFR2 endothelial expression was associated with a non-significant prognostic trend toward shorter progression-free survival (high vs low HR=1.65, 95% CI=0.84, 3.23). Treatment with ramucirumab was associated with a trend toward improved survival in both the high (HR=0.69, 95% CI=0.38, 1.22) and low (HR=0.73, 95% CI=0.42, 1.26) VEGFR2 subgroups. The benefit associated with ramucirumab did not appear to differ by tumoural HER2 expression. REGARD exploratory analyses did not identify a strong potentially predictive biomarker of ramucirumab efficacy; however, statistical power was limited.

  2. Visual and Statistical Analysis of Digital Elevation Models Generated Using Idw Interpolator with Varying Powers

    NASA Astrophysics Data System (ADS)

    Asal, F. F.

    2012-07-01

    Digital elevation data obtained from different engineering surveying techniques are utilized in generating a Digital Elevation Model (DEM), which is employed in many engineering and environmental applications. These data are usually in discrete point format, making it necessary to utilize an interpolation approach for the creation of the DEM. Quality assessment of the DEM is a vital issue controlling its use in different applications; however, this assessment relies heavily on statistical methods while neglecting visual methods. This research applies visual analysis to DEMs generated using the IDW interpolator with varying powers, in order to examine its potential for assessing the effects of variation in the IDW power on the quality of the DEMs. Real elevation data were collected in the field using a total station instrument over corrugated terrain. DEMs were generated from the data at a unified cell size using the IDW interpolator with power values ranging from one to ten. Visual analysis was undertaken using 2D and 3D views of the DEM; in addition, statistical analysis was performed to assess the validity of the visual techniques for this purpose. Visual analysis showed that smoothing of the DEM decreases as the power value increases up to a power of four; however, increasing the power beyond four does not leave noticeable changes in 2D and 3D views of the DEM. The statistical analysis supported these results, in that the standard deviation (SD) of the DEM increased with increasing power. More specifically, changing the power from one to two produced 36% of the total increase in SD (the increase due to changing the power from one to ten), and changing to powers of three and four gave 60% and 75%, respectively. This reflects the decrease in DEM smoothing as the IDW power increases. The study also showed that visual methods supported by statistical analysis have good potential for DEM quality assessment.
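
    A minimal IDW interpolator makes the role of the power parameter concrete: low powers average broadly and smooth the surface, while high powers weight the nearest points heavily. The sample points below are hypothetical.

```python
# Minimal inverse distance weighting (IDW) interpolator, showing how the
# power parameter controls smoothing. Sample points are hypothetical.
import numpy as np

def idw(xy_known, z_known, xy_query, power=2.0, eps=1e-12):
    """Interpolate z at xy_query from scattered (xy_known, z_known)."""
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    w = 1.0 / (d + eps) ** power                 # inverse-distance weights
    return (w * z_known).sum(axis=1) / w.sum(axis=1)

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
z = np.array([10.0, 14.0, 12.0, 20.0])
query = np.array([[0.5, 0.5], [0.9, 0.9]])

for p in (1, 2, 4, 10):
    print(f"power={p:>2}: z = {idw(pts, z, query, power=p).round(2)}")
```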

  3. Power spectra as a diagnostic tool in probing statistical/nonstatistical behavior in unimolecular reactions

    NASA Astrophysics Data System (ADS)

    Chang, Xiaoyen Y.; Sewell, Thomas D.; Raff, Lionel M.; Thompson, Donald L.

    1992-11-01

    The possibility of utilizing different types of power spectra obtained from classical trajectories as a diagnostic tool to identify the presence of nonstatistical dynamics is explored by using the unimolecular bond-fission reactions of 1,2-difluoroethane and the 2-chloroethyl radical as test cases. In previous studies, the reaction rates for these systems were calculated by using a variational transition-state theory and classical trajectory methods. A comparison of the results showed that 1,2-difluoroethane is a nonstatistical system, while the 2-chloroethyl radical behaves statistically. Power spectra for these two systems have been generated under various conditions. The characteristics of these spectra are as follows: (1) The spectra for the 2-chloroethyl radical are always broader and more coupled to other modes than is the case for 1,2-difluoroethane. This is true even at very low levels of excitation. (2) When an internal energy near or above the dissociation threshold is initially partitioned into a local C-H stretching mode, the power spectra for 1,2-difluoroethane broaden somewhat, but discrete and somewhat isolated bands are still clearly evident. In contrast, the analogous power spectra for the 2-chloroethyl radical exhibit a near complete absence of isolated bands. The general appearance of the spectrum suggests a very high level of mode-to-mode coupling, large intramolecular vibrational energy redistribution (IVR) rates, and global statistical behavior. (3) The appearance of the power spectrum for the 2-chloroethyl radical is unaltered regardless of whether the initial C-H excitation is in the CH2 or the CH2Cl group. This result also suggests statistical behavior. These results are interpreted to mean that power spectra may be used as a diagnostic tool to assess the statistical character of a system. The presence of a diffuse spectrum exhibiting a nearly complete loss of isolated structures indicates that the dissociation dynamics of the molecule will be well described by statistical theories. If, however, the power spectrum maintains its discrete, isolated character, as is the case for 1,2-difluoroethane, the opposite conclusion is suggested. Since power spectra are very easily computed, this diagnostic method may prove to be useful.
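
    The diagnostic itself is easy to sketch: compute the power spectrum of a trajectory time series via FFT and inspect whether it is discrete and isolated or broad and diffuse. The two-mode signal below is synthetic, with frequencies placed on exact FFT bins for clean peaks; it stands in for a trajectory coordinate such as a bond length.

```python
# Power spectrum of a (synthetic) trajectory time series via FFT. A sparse,
# discrete spectrum suggests weak mode coupling; a broad, diffuse one
# suggests strong coupling. Time step and frequencies are assumed values.
import numpy as np

n, dt = 2**14, 1e-16                             # samples, time step (s)
t = np.arange(n) * dt
f1, f2 = 150 / (n * dt), 50 / (n * dt)           # modes on exact FFT bins
x = (np.sin(2 * np.pi * f1 * t) + 0.5 * np.sin(2 * np.pi * f2 * t)
     + 0.05 * np.random.default_rng(0).normal(size=n))

power = np.abs(np.fft.rfft(x - x.mean())) ** 2   # power spectrum
freqs = np.fft.rfftfreq(n, d=dt)
for i in sorted(np.argsort(power)[-2:]):         # the two dominant peaks
    print(f"peak at {freqs[i]:.3e} Hz")
```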

  4. Using Social Network Analysis to Better Understand Compulsive Exercise Behavior Among a Sample of Sorority Members.

    PubMed

    Patterson, Megan S; Goodson, Patricia

    2017-05-01

    Compulsive exercise, a form of unhealthy exercise often associated with prioritizing exercise and feeling guilty when exercise is missed, is a common precursor to and symptom of eating disorders. College-aged women are at high risk of exercising compulsively compared with other groups. Social network analysis (SNA) is a theoretical perspective and methodology allowing researchers to observe the effects of relational dynamics on the behaviors of people. SNA was used to assess the relationship between compulsive exercise and body dissatisfaction, physical activity, and network variables. Descriptive statistics were conducted using SPSS, and quadratic assignment procedure (QAP) analyses were conducted using UCINET. QAP regression analysis revealed a statistically significant model (R 2 = .375, P < .0001) predicting compulsive exercise behavior. Physical activity, body dissatisfaction, and network variables were statistically significant predictor variables in the QAP regression model. In our sample, women who are connected to "important" or "powerful" people in their network are likely to have higher compulsive exercise scores. This result provides healthcare practitioners key target points for intervention within similar groups of women. For scholars researching eating disorders and associated behaviors, this study supports looking into group dynamics and network structure in conjunction with body dissatisfaction and exercise frequency.

  5. The performance of the Congruence Among Distance Matrices (CADM) test in phylogenetic analysis

    PubMed Central

    2011-01-01

    Background CADM is a statistical test used to estimate the level of Congruence Among Distance Matrices. It has been shown in previous studies to have a correct rate of type I error and good power when applied to dissimilarity matrices and to ultrametric distance matrices. Contrary to most other tests of incongruence used in phylogenetic analysis, the null hypothesis of the CADM test assumes complete incongruence of the phylogenetic trees instead of congruence. In this study, we performed computer simulations to assess the type I error rate and power of the test. It was applied to additive distance matrices representing phylogenies and to genetic distance matrices obtained from nucleotide sequences of different lengths that were simulated on randomly generated trees of varying sizes, and under different evolutionary conditions. Results Our results showed that the test has an accurate type I error rate and good power. As expected, power increased with the number of objects (i.e., taxa), the number of partially or completely congruent matrices and the level of congruence among distance matrices. Conclusions Based on our results, we suggest that CADM is an excellent candidate to test for congruence and, when present, to estimate its level in phylogenomic studies where numerous genes are analysed simultaneously. PMID:21388552
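
    A compact sketch of the CADM idea: Kendall's coefficient of concordance W is computed over the ranked upper-triangular entries of the distance matrices, and the null hypothesis of complete incongruence is tested by permuting objects within each matrix. The matrices below are random examples, and tie corrections are omitted for brevity.

```python
# CADM-style test: Kendall's W over ranked upper-triangular distances,
# with a permutation test that permutes objects (rows/columns) within
# each matrix. Matrices are random examples; tie corrections omitted.
import numpy as np
from scipy.stats import rankdata

def kendall_w(mats):
    iu = np.triu_indices(mats[0].shape[0], k=1)
    ranks = np.array([rankdata(m[iu]) for m in mats])   # one row per matrix
    k, n = ranks.shape
    s = np.var(ranks.sum(axis=0)) * n                   # spread of rank sums
    return 12 * s / (k**2 * (n**3 - n))

def cadm_test(mats, n_perm=999, seed=0):
    rng = np.random.default_rng(seed)
    w_obs = kendall_w(mats)
    exceed = 1                                          # include observed
    for _ in range(n_perm):
        permuted = []
        for m in mats:                                  # permute objects
            p = rng.permutation(m.shape[0])
            permuted.append(m[np.ix_(p, p)])
        exceed += kendall_w(permuted) >= w_obs
    return w_obs, exceed / (n_perm + 1)

rng = np.random.default_rng(1)
pts = rng.normal(size=(12, 3))
d0 = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)  # base distances
d1 = d0 + rng.normal(0, 0.1, d0.shape)
d1 = (d1 + d1.T) / 2                                      # noisy congruent copy
w, p = cadm_test([d0, d1, d0 * 1.3])                      # third: rescaled copy
print(f"W = {w:.3f}, permutation p = {p:.4f}")
```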

  6. A powerful approach for association analysis incorporating imprinting effects

    PubMed Central

    Xia, Fan; Zhou, Ji-Yuan; Fung, Wing Kam

    2011-01-01

    Motivation: For a diallelic marker locus, the transmission disequilibrium test (TDT) is a simple and powerful design for genetic studies. The TDT was originally proposed for use in families with both parents available (complete nuclear families) and has further been extended to 1-TDT for use in families with only one of the parents available (incomplete nuclear families). Currently, the increasing interest of the influence of parental imprinting on heritability indicates the importance of incorporating imprinting effects into the mapping of association variants. Results: In this article, we extend the TDT-type statistics to incorporate imprinting effects and develop a series of new test statistics in a general two-stage framework for association studies. Our test statistics enjoy the nature of family-based designs that need no assumption of Hardy–Weinberg equilibrium. Also, the proposed methods accommodate complete and incomplete nuclear families with one or more affected children. In the simulation study, we verify the validity of the proposed test statistics under various scenarios, and compare the powers of the proposed statistics with some existing test statistics. It is shown that our methods greatly improve the power for detecting association in the presence of imprinting effects. We further demonstrate the advantage of our methods by the application of the proposed test statistics to a rheumatoid arthritis dataset. Contact: wingfung@hku.hk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21798962
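
    For reference, the classical TDT that these extensions build on counts, among heterozygous parents, transmissions (b) and non-transmissions (c) of the candidate allele to affected children and uses a McNemar-type statistic; a minimal sketch with illustrative counts:

```python
# Classical TDT: among heterozygous parents, (b - c)**2 / (b + c) is
# chi-squared with 1 df under no association. Counts are illustrative.
from scipy.stats import chi2

def tdt(b, c):
    stat = (b - c) ** 2 / (b + c)
    return stat, chi2.sf(stat, df=1)

stat, p = tdt(b=62, c=38)
print(f"TDT chi-squared = {stat:.2f}, p = {p:.4f}")
```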

  7. A powerful approach for association analysis incorporating imprinting effects.

    PubMed

    Xia, Fan; Zhou, Ji-Yuan; Fung, Wing Kam

    2011-09-15

    For a diallelic marker locus, the transmission disequilibrium test (TDT) is a simple and powerful design for genetic studies. The TDT was originally proposed for use in families with both parents available (complete nuclear families) and has further been extended to 1-TDT for use in families with only one of the parents available (incomplete nuclear families). Growing interest in the influence of parental imprinting on heritability underscores the importance of incorporating imprinting effects into the mapping of association variants. In this article, we extend the TDT-type statistics to incorporate imprinting effects and develop a series of new test statistics in a general two-stage framework for association studies. Our test statistics retain the advantage of family-based designs in requiring no assumption of Hardy-Weinberg equilibrium. Also, the proposed methods accommodate complete and incomplete nuclear families with one or more affected children. In the simulation study, we verify the validity of the proposed test statistics under various scenarios, and compare the powers of the proposed statistics with some existing test statistics. It is shown that our methods greatly improve the power for detecting association in the presence of imprinting effects. We further demonstrate the advantage of our methods by applying the proposed test statistics to a rheumatoid arthritis dataset. wingfung@hku.hk Supplementary data are available at Bioinformatics online.

  8. Statistical methods and computing for big data.

    PubMed

    Wang, Chun; Chen, Ming-Hui; Schifano, Elizabeth; Wu, Jing; Yan, Jun

    2016-01-01

    Big data are data on a massive scale in terms of volume, intensity, and complexity that exceed the capacity of standard analytic tools. They present opportunities as well as challenges to statisticians. The role of computational statisticians in scientific discovery from big data analyses has been under-recognized even by peer statisticians. This article summarizes recent methodological and software developments in statistics that address the big data challenges. Methodologies are grouped into three classes: subsampling-based, divide and conquer, and online updating for stream data. As a new contribution, the online updating approach is extended to variable selection with commonly used criteria, and their performances are assessed in a simulation study with stream data. Software packages are summarized, with a focus on the open-source R environment and R packages, covering recent tools that help break the barriers of computer memory and computing power. Some of the tools are illustrated in a case study with a logistic regression for the chance of airline delay.
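
    The divide-and-conquer and online-updating ideas are easiest to see for least squares, where the sufficient statistics X'X and X'y can be accumulated block by block, so no block of raw data needs to be kept in memory. A minimal sketch (the article's variable-selection extension is not shown):

      import numpy as np

      def init_state(p):
          """Sufficient statistics for least squares: X'X and X'y."""
          return np.zeros((p, p)), np.zeros(p)

      def update(state, X_block, y_block):
          """Online update: fold in one data block, then discard it."""
          XtX, Xty = state
          return XtX + X_block.T @ X_block, Xty + X_block.T @ y_block

      def estimate(state):
          XtX, Xty = state
          return np.linalg.solve(XtX, Xty)

      rng = np.random.default_rng(1)
      beta_true = np.array([1.0, -2.0, 0.5])
      state = init_state(3)
      for _ in range(100):                     # 100 streamed blocks
          X = rng.normal(size=(10_000, 3))
          y = X @ beta_true + rng.normal(size=10_000)
          state = update(state, X, y)
      print(estimate(state))                   # close to beta_true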

  9. Methods for measuring, enhancing, and accounting for medication adherence in clinical trials.

    PubMed

    Vrijens, B; Urquhart, J

    2014-06-01

    Adherence to rationally prescribed medications is essential for effective pharmacotherapy. However, widely variable adherence to protocol-specified dosing regimens is prevalent among participants in ambulatory drug trials, mostly manifested in the form of underdosing. Drug actions are inherently dose and time dependent, and as a result, variable underdosing diminishes the actions of trial medications by various degrees. The ensuing combination of increased variability and decreased magnitude of trial drug actions reduces statistical power to discern between-group differences in drug actions. Variable underdosing has many adverse consequences, some of which can be mitigated by the combination of reliable measurements of ambulatory patients' adherence to trial and nontrial medications, measurement-guided management of adherence, statistically and pharmacometrically sound analyses, and modifications in trial design. Although nonadherence is prevalent across all therapeutic areas in which the patients are responsible for treatment administration, the significance of the adverse consequences depends on the characteristics of both the disease and the medications.

  10. Safety Assessment of Food and Feed from GM Crops in Europe: Evaluating EFSA's Alternative Framework for the Rat 90-day Feeding Study.

    PubMed

    Hong, Bonnie; Du, Yingzhou; Mukerji, Pushkor; Roper, Jason M; Appenzeller, Laura M

    2017-07-12

    Regulatory-compliant rodent subchronic feeding studies are compulsory regardless of a hypothesis to test, according to recent EU legislation for the safety assessment of whole food/feed produced from genetically modified (GM) crops containing a single genetic transformation event (European Union Commission Implementing Regulation No. 503/2013). The Implementing Regulation refers to guidelines set forth by the European Food Safety Authority (EFSA) for the design, conduct, and analysis of rodent subchronic feeding studies. The set of EFSA recommendations was rigorously applied to a 90-day feeding study in Sprague-Dawley rats. After study completion, the appropriateness and applicability of these recommendations were assessed using a battery of statistical analysis approaches including both retrospective and prospective statistical power analyses as well as variance-covariance decomposition. In the interest of animal welfare considerations, alternative experimental designs were investigated and evaluated in the context of informing the health risk assessment of food/feed from GM crops.
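
    A prospective power analysis of the kind mentioned here takes only a few lines; the sketch below uses statsmodels with purely illustrative inputs (the effect size, alpha, and power are placeholders, not values from the EFSA-based study):

      from statsmodels.stats.power import TTestIndPower

      # Animals per group needed to detect a standardized difference
      # (Cohen's d) of 0.8 between a treated and a control group with a
      # two-sided alpha of 0.05 and 80% power.
      n_per_group = TTestIndPower().solve_power(effect_size=0.8, alpha=0.05,
                                                power=0.80,
                                                alternative='two-sided')
      print(round(n_per_group))                # about 26 per group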

  11. Statistical methods and computing for big data

    PubMed Central

    Wang, Chun; Chen, Ming-Hui; Schifano, Elizabeth; Wu, Jing

    2016-01-01

    Big data are data on a massive scale in terms of volume, intensity, and complexity that exceed the capacity of standard analytic tools. They present opportunities as well as challenges to statisticians. The role of computational statisticians in scientific discovery from big data analyses has been under-recognized even by peer statisticians. This article summarizes recent methodological and software developments in statistics that address the big data challenges. Methodologies are grouped into three classes: subsampling-based, divide and conquer, and online updating for stream data. As a new contribution, the online updating approach is extended to variable selection with commonly used criteria, and their performances are assessed in a simulation study with stream data. Software packages are summarized, with a focus on the open-source R environment and R packages, covering recent tools that help break the barriers of computer memory and computing power. Some of the tools are illustrated in a case study with a logistic regression for the chance of airline delay. PMID:27695593

  12. Analysis of Emergency Diesel Generators Failure Incidents in Nuclear Power Plants

    NASA Astrophysics Data System (ADS)

    Hunt, Ronderio LaDavis

    In their early years of operation, emergency diesel generators (EDGs) had a minimal rate of demand failures. EDGs are designed to operate as a backup when the main source of electricity is disrupted. In recent years, however, EDGs at nuclear power plants (NPPs) around the United States have failed, causing either station blackouts or loss of onsite and offsite power. These were demand failures, a specific failure type. This thesis evaluated a problem of growing concern in the nuclear industry: the rate of EDG demand failures rose from an average of one per year in 1997 to an excessive event of four in a single year in 2011. To estimate when the next such excessive event might occur, and to identify its possible causes, two analyses were conducted: a statistical analysis and a root cause analysis. The statistical analysis applied an extreme event probability approach to determine the year of the next excessive event and the probability of its occurrence. The root cause analysis evaluated potential causes of the excessive event (EDG manufacturers, aging, policy changes and maintenance practices, and failure components) and investigated the correlation between the demand failure data and historical data. The statistical analysis indicated that an excessive event can be expected within a fixed range of probability, with a wider range obtained from the extreme event probability approach. The root cause analysis showed that the demand failure data followed historical statistics for EDG manufacturer, aging, and policy changes/maintenance practices, but pointed to the failure components as a possible cause of the excessive event. The study concluded that predicting, with an acceptable confidence level, the year and probability of the next excessive demand failure event is difficult, but that such a failure is unlikely to be a 100-year event. Notably, as of 2005 the majority of EDG demand failures occurred within the main components. Overall, the percentages obtained in this study support the statement that the excessive event was caused by the overall age (wear and tear) of the emergency diesel generators in nuclear power plants. Future work will aim to better determine the return period of the excessive event, once it has occurred a second time, by applying the extreme event probability approach.
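
    As a back-of-the-envelope illustration of the extreme-event arithmetic (using a hypothetical baseline rate, not the thesis's fitted model): if demand failures arrive at an average of one per year, the chance of four or more in a single year, and the implied return period, follow directly from the Poisson distribution, consistent with the conclusion that such an event need not be a 100-year occurrence.

      from scipy.stats import poisson

      lam = 1.0                          # assumed mean failures per year
      p_excess = poisson.sf(3, lam)      # P(X >= 4) = 1 - P(X <= 3)
      print(p_excess)                    # about 0.019
      print(1 / p_excess)                # return period of roughly 53 years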

  13. Correlation of the rates of solvolysis of tert-butyl chlorothioformate and observations concerning the reaction mechanism

    PubMed Central

    Kyong, Jin Burm; Lee, Yelin; D’Souza, Malcolm John; Kevill, Dennis Neil

    2012-01-01

    The “parent” tertiary alkyl chloroformate, tert-butyl chloroformate, is unstable, but tert-butyl chlorothioformate (1) is more stable, and a kinetic investigation of its solvolyses is presented. Analyses in terms of the simple and extended Grunwald-Winstein equations are carried out. The original one-term equation satisfactorily correlates the data, with a sensitivity towards changes in solvent ionizing power of 0.73 ± 0.03. When the two-term equation is applied, the sensitivity towards changes in solvent nucleophilicity of 0.13 ± 0.09 is associated with a high (0.17) probability that the term it governs is not statistically significant. PMID:23538747
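
    For readers unfamiliar with the notation, the one-term and two-term Grunwald-Winstein correlations referred to above have the standard forms below (in LaTeX), where k and k0 are the specific rates of solvolysis in a given solvent and in the 80% ethanol reference, Y is the solvent ionizing power, N is the solvent nucleophilicity, and m and l are the corresponding sensitivities:

      \[
        \log\!\left(\frac{k}{k_0}\right) = mY + c
        \qquad \text{and} \qquad
        \log\!\left(\frac{k}{k_0}\right) = lN + mY + c .
      \]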

  14. Protein abundances can distinguish between naturally-occurring and laboratory strains of Yersinia pestis, the causative agent of plague

    DOE PAGES

    Merkley, Eric D.; Sego, Landon H.; Lin, Andy; ...

    2017-08-30

    Adaptive processes in bacterial species can occur rapidly in laboratory culture, leading to genetic divergence between naturally occurring and laboratory-adapted strains. Differentiating wild and closely-related laboratory strains is clearly important for biodefense and bioforensics; however, DNA sequence data alone has thus far not provided a clear signature, perhaps due to lack of understanding of how diverse genome changes lead to adapted phenotypes. Protein abundance profiles from mass spectrometry-based proteomics analyses are a molecular measure of phenotype. Proteomics data contains sufficient information that powerful statistical methods can uncover signatures that distinguish wild strains of Yersinia pestis from laboratory-adapted strains.

  15. Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution.

    PubMed

    van Iterson, Maarten; van Zwet, Erik W; Heijmans, Bastiaan T

    2017-01-27

    We show that epigenome- and transcriptome-wide association studies (EWAS and TWAS) are prone to significant inflation and bias of test statistics, an unrecognized phenomenon introducing spurious findings if left unaddressed. Neither GWAS-based methodology nor state-of-the-art confounder adjustment methods completely remove bias and inflation. We propose a Bayesian method to control bias and inflation in EWAS and TWAS based on estimation of the empirical null distribution. Using simulations and real data, we demonstrate that our method maximizes power while properly controlling the false positive rate. We illustrate the utility of our method in large-scale EWAS and TWAS meta-analyses of age and smoking.
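
    The authors' method fits a Bayesian mixture to estimate the empirical null distribution of the test statistics. As a much simpler stand-in that conveys the idea, one can estimate bias as the median of the z-scores and inflation as a genomic-control-style ratio, then rescale (the z-scores below are simulated; the published method is considerably more refined):

      import numpy as np
      from scipy.stats import chi2

      def inflation(z):
          """Genomic-control-style inflation factor: the observed median
          of z^2 relative to the median of a chi-square(1) variate."""
          return np.median(z ** 2) / chi2.ppf(0.5, df=1)

      def empirical_null_rescale(z):
          """Crude correction: remove the median (bias) and divide by the
          square root of the inflation factor."""
          centered = z - np.median(z)
          return centered / np.sqrt(inflation(centered))

      z = np.random.default_rng(2).normal(loc=0.1, scale=1.3, size=100_000)
      print(round(inflation(z), 2))                          # inflated, > 1
      print(round(inflation(empirical_null_rescale(z)), 2))  # close to 1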

  16. Powerful Statistical Inference for Nested Data Using Sufficient Summary Statistics

    PubMed Central

    Dowding, Irene; Haufe, Stefan

    2018-01-01

    Hierarchically-organized data arise naturally in many psychology and neuroscience studies. As the standard assumption of independent and identically distributed samples does not hold for such data, two important problems are to accurately estimate group-level effect sizes, and to obtain powerful statistical tests against group-level null hypotheses. A common approach is to summarize subject-level data by a single quantity per subject, which is often the mean or the difference between class means, and treat these as samples in a group-level t-test. This “naive” approach is, however, suboptimal in terms of statistical power, as it ignores information about the intra-subject variance. To address this issue, we review several approaches to deal with nested data, with a focus on methods that are easy to implement. With what we call the sufficient-summary-statistic approach, we highlight a computationally efficient technique that can improve statistical power by taking into account within-subject variances, and we provide step-by-step instructions on how to apply this approach to a number of frequently-used measures of effect size. The properties of the reviewed approaches and the potential benefits over a group-level t-test are quantitatively assessed on simulated data and demonstrated on EEG data from a simulated-driving experiment. PMID:29615885
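
    The gist of the sufficient-summary-statistic approach can be shown in a few lines: rather than feeding bare subject means into a group-level t-test, each subject's mean is weighted by its precision (the inverse of s_i^2 / n_i), so reliably measured subjects count more. A minimal fixed-effects-style sketch (the paper covers several effect-size measures and further refinements):

      import numpy as np
      from scipy.stats import norm

      def weighted_group_test(means, variances, ns):
          """Precision-weighted test of H0: group-level mean effect = 0.
          means: per-subject effect estimates; variances: within-subject
          variances; ns: number of samples behind each estimate."""
          means = np.asarray(means, dtype=float)
          se2 = np.asarray(variances, dtype=float) / np.asarray(ns)
          w = 1.0 / se2                        # inverse-variance weights
          pooled = np.sum(w * means) / np.sum(w)
          z = pooled / np.sqrt(1.0 / np.sum(w))
          return pooled, z, 2 * norm.sf(abs(z))

      print(weighted_group_test([0.3, 0.5, 0.1], [1.0, 2.0, 0.5],
                                [40, 25, 60]))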

  17. [Effect sizes, statistical power and sample sizes in "the Japanese Journal of Psychology"].

    PubMed

    Suzukawa, Yumi; Toyoda, Hideki

    2012-04-01

    This study analyzed the statistical power of research studies published in the "Japanese Journal of Psychology" in 2008 and 2009. Sample effect sizes and sample statistical powers were calculated for each statistical test and analyzed with respect to the analytical methods and the fields of the studies. The results show that in fields such as perception, cognition, or learning, the effect sizes were relatively large although the sample sizes were small; even so, some meaningful effects could not be detected because of the small sample sizes. In other fields, the large sample sizes meant that even negligible effects were detected as significant. This implies that researchers who could not obtain large enough effect sizes used larger samples to obtain significant results.

  18. MAOA genotype, childhood maltreatment, and their interaction in the etiology of adult antisocial behaviors.

    PubMed

    Haberstick, Brett C; Lessem, Jeffrey M; Hewitt, John K; Smolen, Andrew; Hopfer, Christian J; Halpern, Carolyn T; Killeya-Jones, Ley A; Boardman, Jason D; Tabor, Joyce; Siegler, Ilene C; Williams, Redford B; Mullan Harris, Kathleen

    2014-01-01

    Maltreatment by an adult or caregiver during childhood is a prevalent and important predictor of antisocial behaviors in adulthood. A functional promoter polymorphism in the monoamine oxidase A (MAOA) gene has been implicated as a moderating factor in the relationship between childhood maltreatment and antisocial behaviors. Although there have been numerous attempts at replicating this observation, results remain inconclusive. We examined this gene-environment interaction hypothesis in a sample of 3356 white and 960 black men (aged 24-34) participating in the National Longitudinal Study of Adolescent Health. Primary analysis indicated that childhood maltreatment was a significant risk factor for later behaviors that violate rules and the rights of others (p < .05), there were no main effects of MAOA genotype, and MAOA genotype was not a significant moderator of the relationship between maltreatment and antisocial behaviors in our white sample. Post hoc analyses identified a similar pattern of results among our black sample in which maltreatment was not a significant predictor of antisocial behavior. Post hoc analyses also revealed a main effect of MAOA genotype on having a disposition toward violence in both samples and for violent convictions among our black sample. None of these post hoc findings, however, survived correction for multiple testing (p > .05). Power analyses indicated that these results were not due to insufficient statistical power. We could not confirm the hypothesis that MAOA genotype moderates the relationship between childhood maltreatment and adult antisocial behaviors. Copyright © 2014 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

  19. Biomechanical Analysis of Military Boots. Phase 1. Materials Testing of Military and Commercial Footwear

    DTIC Science & Technology

    1992-10-01

    [Front-matter fragments from the scanned report] The appendices tabulate summary statistics (N=8 and N=4) and results of statistical analyses for impact tests performed on the forefoot of unworn and worn footwear. The report describes tests used to assess heel and forefoot shock absorption, upper and sole durability, and flexibility (Cavanagh, 1978).

  20. Quantifying, displaying and accounting for heterogeneity in the meta-analysis of RCTs using standard and generalised Q statistics

    PubMed Central

    2011-01-01

    Background Clinical researchers have often preferred to use a fixed effects model for the primary interpretation of a meta-analysis. Heterogeneity is usually assessed via the well known Q and I2 statistics, along with the random effects estimate they imply. In recent years, alternative methods for quantifying heterogeneity have been proposed, that are based on a 'generalised' Q statistic. Methods We review 18 IPD meta-analyses of RCTs into treatments for cancer, in order to quantify the amount of heterogeneity present and also to discuss practical methods for explaining heterogeneity. Results Differing results were obtained when the standard Q and I2 statistics were used to test for the presence of heterogeneity. The two meta-analyses with the largest amount of heterogeneity were investigated further, and on inspection the straightforward application of a random effects model was not deemed appropriate. Compared to the standard Q statistic, the generalised Q statistic provided a more accurate platform for estimating the amount of heterogeneity in the 18 meta-analyses. Conclusions Explaining heterogeneity via the pre-specification of trial subgroups, graphical diagnostic tools and sensitivity analyses produced a more desirable outcome than an automatic application of the random effects model. Generalised Q statistic methods for quantifying and adjusting for heterogeneity should be incorporated as standard into statistical software. Software is provided to help achieve this aim. PMID:21473747
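
    The standard Q and I² quantities discussed here are straightforward to compute from study effects and their within-study variances; a short sketch with made-up numbers (the paper's generalised Q methods go beyond this):

      import numpy as np
      from scipy.stats import chi2

      def q_and_i2(effects, variances):
          """Cochran's Q and Higgins' I^2 for k study effects, using
          fixed-effect weights w_i = 1 / v_i."""
          w = 1.0 / np.asarray(variances)
          theta_fe = np.sum(w * effects) / np.sum(w)   # fixed-effect estimate
          q = np.sum(w * (effects - theta_fe) ** 2)
          df = len(effects) - 1
          i2 = max(0.0, (q - df) / q) * 100            # % of variation
          return q, chi2.sf(q, df), i2

      print(q_and_i2(np.array([0.2, 0.5, 0.8, 0.1]),
                     np.array([0.02, 0.03, 0.02, 0.04])))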

  1. Samples in applied psychology: over a decade of research in review.

    PubMed

    Shen, Winny; Kiger, Thomas B; Davies, Stacy E; Rasch, Rena L; Simon, Kara M; Ones, Deniz S

    2011-09-01

    This study examines sample characteristics of articles published in Journal of Applied Psychology (JAP) from 1995 to 2008. At the individual level, the overall median sample size over the period examined was approximately 173, which is generally adequate for detecting the average magnitude of effects of primary interest to researchers who publish in JAP. Samples using higher units of analyses (e.g., teams, departments/work units, and organizations) had lower median sample sizes (Mdn ≈ 65), yet were arguably robust given typical multilevel design choices of JAP authors despite the practical constraints of collecting data at higher units of analysis. A substantial proportion of studies used student samples (~40%); surprisingly, median sample sizes for student samples were smaller than working adult samples. Samples were more commonly occupationally homogeneous (~70%) than occupationally heterogeneous. U.S. and English-speaking participants made up the vast majority of samples, whereas Middle Eastern, African, and Latin American samples were largely unrepresented. On the basis of study results, recommendations are provided for authors, editors, and readers, which converge on 3 themes: (a) appropriateness and match between sample characteristics and research questions, (b) careful consideration of statistical power, and (c) the increased popularity of quantitative synthesis. Implications are discussed in terms of theory building, generalizability of research findings, and statistical power to detect effects. PsycINFO Database Record (c) 2011 APA, all rights reserved

  2. Scaling and universality in the human voice.

    PubMed

    Luque, Jordi; Luque, Bartolo; Lacasa, Lucas

    2015-04-06

    Speech is a distinctive complex feature of human capabilities. In order to understand the physics underlying speech production, in this work, we empirically analyse the statistics of large human speech datasets spanning several languages. We first show that during speech, the energy is unevenly released and power-law distributed, reporting a universal robust Gutenberg-Richter-like law in speech. We further show that such 'earthquakes in speech' show temporal correlations, as the interevent statistics are again power-law distributed. As this feature takes place in the intraphoneme range, we conjecture that the process responsible for this complex phenomenon is not cognitive, but resides in the physiological (mechanical) mechanisms of speech production. Moreover, we show that these waiting time distributions are scale invariant under a renormalization group transformation, suggesting that the process of speech generation is indeed operating close to a critical point. These results are put in contrast with current paradigms in speech processing, which point towards low-dimensional deterministic chaos as the origin of nonlinear traits in speech fluctuations. As these latter fluctuations are indeed the aspects that humanize synthetic speech, these findings may have an impact on future speech synthesis technologies. Results are robust and independent of the communication language or the number of speakers, pointing towards a universal pattern and yet another hint of complexity in human speech. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
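
    Power-law exponents of the kind reported for energy release and interevent times are usually estimated by maximum likelihood rather than by fitting a straight line to a log-log histogram. A self-contained sketch on synthetic Pareto data (the xmin threshold is assumed known here; in practice it must be estimated too):

      import numpy as np

      def powerlaw_alpha_mle(x, xmin):
          """ML estimate of the exponent of a continuous power law
          p(x) ~ x^(-alpha) for x >= xmin (Hill-type estimator)."""
          x = x[x >= xmin]
          alpha = 1.0 + len(x) / np.sum(np.log(x / xmin))
          se = (alpha - 1.0) / np.sqrt(len(x))     # asymptotic std. error
          return alpha, se

      rng = np.random.default_rng(3)
      x = rng.pareto(1.5, size=50_000) + 1.0       # true alpha = 2.5
      print(powerlaw_alpha_mle(x, xmin=1.0))       # close to (2.5, 0.007)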

  3. Determination of Type I Error Rates and Power of Answer Copying Indices under Various Conditions

    ERIC Educational Resources Information Center

    Yormaz, Seha; Sünbül, Önder

    2017-01-01

    This study aims to determine the Type I error rates and power of S[subscript 1] , S[subscript 2] indices and kappa statistic at detecting copying on multiple-choice tests under various conditions. It also aims to determine how copying groups are created in order to calculate how kappa statistics affect Type I error rates and power. In this study,…

  4. Statistical Power Analysis with Microsoft Excel: Normal Tests for One or Two Means as a Prelude to Using Non-Central Distributions to Calculate Power

    ERIC Educational Resources Information Center

    Texeira, Antonio; Rosa, Alvaro; Calapez, Teresa

    2009-01-01

    This article presents statistical power analysis (SPA) based on the normal distribution using Excel, adopting textbook and SPA approaches. The objective is to present the latter in a comparative way within a framework that is familiar to textbook level readers, as a first step to understand SPA with other distributions. The analysis focuses on the…
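
    The normal-distribution power computation that the article builds in Excel reduces to one formula: power is the probability that the test statistic crosses the critical value under a normal shifted by the noncentrality delta*sqrt(n)/sigma. The same computation in Python, for a two-sided one-sample z-test:

      from scipy.stats import norm

      def ztest_power(delta, sigma, n, alpha=0.05):
          """Power of a two-sided one-sample z-test with known sigma."""
          z_crit = norm.ppf(1 - alpha / 2)
          nc = delta * n ** 0.5 / sigma            # noncentrality
          return norm.sf(z_crit - nc) + norm.cdf(-z_crit - nc)

      print(round(ztest_power(delta=0.5, sigma=1.0, n=32), 3))   # ≈ 0.807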

  5. On the analysis of very small samples of Gaussian repeated measurements: an alternative approach.

    PubMed

    Westgate, Philip M; Burchett, Woodrow W

    2017-03-15

    The analysis of very small samples of Gaussian repeated measurements can be challenging. First, due to a very small number of independent subjects contributing outcomes over time, statistical power can be quite small. Second, nuisance covariance parameters must be appropriately accounted for in the analysis in order to maintain the nominal test size. However, available statistical strategies that ensure valid statistical inference may lack power, whereas more powerful methods may have the potential for inflated test sizes. Therefore, we explore an alternative approach to the analysis of very small samples of Gaussian repeated measurements, with the goal of maintaining valid inference while also improving statistical power relative to other valid methods. This approach uses generalized estimating equations with a bias-corrected empirical covariance matrix that accounts for all small-sample aspects of nuisance correlation parameter estimation in order to maintain valid inference. Furthermore, the approach utilizes correlation selection strategies with the goal of choosing the working structure that will result in the greatest power. In our study, we show that when accurate modeling of the nuisance correlation structure impacts the efficiency of regression parameter estimation, this method can improve power relative to existing methods that yield valid inference. Copyright © 2017 John Wiley & Sons, Ltd.
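
    For orientation, statsmodels exposes a bias-reduced (Mancl-DeRouen-type) empirical covariance for GEE via cov_type="bias_reduced"; whether that matches the specific correction studied in this paper is not established here. A sketch on hypothetical repeated-measures data:

      import numpy as np
      import pandas as pd
      import statsmodels.api as sm

      # Ten subjects, four visits each: a deliberately small sample.
      rng = np.random.default_rng(4)
      df = pd.DataFrame({"subject": np.repeat(np.arange(10), 4),
                         "time": np.tile(np.arange(4), 10)})
      df["y"] = 0.5 * df["time"] + rng.normal(size=len(df))

      model = sm.GEE.from_formula("y ~ time", groups="subject", data=df,
                                  cov_struct=sm.cov_struct.Exchangeable())
      result = model.fit(cov_type="bias_reduced")   # small-sample correction
      print(result.summary())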

  6. Précis of statistical significance: rationale, validity, and utility.

    PubMed

    Chow, S L

    1998-04-01

    The null-hypothesis significance-test procedure (NHSTP) is defended in the context of the theory-corroboration experiment, as well as the following contrasts: (a) substantive hypotheses versus statistical hypotheses, (b) theory corroboration versus statistical hypothesis testing, (c) theoretical inference versus statistical decision, (d) experiments versus nonexperimental studies, and (e) theory corroboration versus treatment assessment. The null hypothesis can be true because it is the hypothesis that errors are randomly distributed in data. Moreover, the null hypothesis is never used as a categorical proposition. Statistical significance means only that chance influences can be excluded as an explanation of data; it does not identify the nonchance factor responsible. The experimental conclusion is drawn with the inductive principle underlying the experimental design. A chain of deductive arguments gives rise to the theoretical conclusion via the experimental conclusion. The anomalous relationship between statistical significance and the effect size often used to criticize NHSTP is more apparent than real. The absolute size of the effect is not an index of evidential support for the substantive hypothesis. Nor is the effect size, by itself, informative as to the practical importance of the research result. Being a conditional probability, statistical power cannot be the a priori probability of statistical significance. The validity of statistical power is debatable because statistical significance is determined with a single sampling distribution of the test statistic based on H0, whereas it takes two distributions to represent statistical power or effect size. Sample size should not be determined in the mechanical manner envisaged in power analysis. It is inappropriate to criticize NHSTP for nonstatistical reasons. At the same time, neither effect size, nor confidence interval estimate, nor posterior probability can be used to exclude chance as an explanation of data. Neither can any of them fulfill the nonstatistical functions expected of them by critics.

  7. The use of imputed sibling genotypes in sibship-based association analysis: on modeling alternatives, power and model misspecification.

    PubMed

    Minică, Camelia C; Dolan, Conor V; Hottenga, Jouke-Jan; Willemsen, Gonneke; Vink, Jacqueline M; Boomsma, Dorret I

    2013-05-01

    When phenotypic, but no genotypic, data are available for relatives of participants in genetic association studies, previous research has shown that family-based imputed genotypes can boost the statistical power when included in such studies. Here, using simulations, we compared the performance of two statistical approaches suitable for modeling imputed genotype data: the mixture approach, which involves the full distribution of the imputed genotypes, and the dosage approach, where the mean of the conditional distribution features as the imputed genotype. Simulations were run by varying sibship size, the size of the phenotypic correlations among siblings, imputation accuracy, and the minor allele frequency of the causal SNP. Furthermore, as imputing sibling data and extending the model to include sibships of size two or greater requires modeling the familial covariance matrix, we inquired whether model misspecification affects power. Finally, the results obtained via simulations were empirically verified in two datasets with continuous phenotype data (height) and with a dichotomous phenotype (smoking initiation). Across the settings considered, the mixture and the dosage approach are equally powerful and both produce unbiased parameter estimates. In addition, the likelihood-ratio test in the linear mixed model appears to be robust to the considered misspecification in the background covariance structure, given low to moderate phenotypic correlations among siblings. Empirical results show that the inclusion of imputed sibling genotypes in association analysis does not always result in a larger test statistic. The actual test statistic may drop in value due to small effect sizes: when the power benefit is small, that is, when the shift in the distribution of the test statistic under the alternative is slight, there is a greater probability of obtaining a smaller test statistic. As the genetic effects are typically hypothesized to be small, in practice, the decision on whether family-based imputation could be used as a means to increase power should be informed by prior power calculations and by consideration of the background correlation.

  8. EEG low-resolution brain electromagnetic tomography (LORETA) in Huntington's disease.

    PubMed

    Painold, Annamaria; Anderer, Peter; Holl, Anna K; Letmaier, Martin; Saletu-Zyhlarz, Gerda M; Saletu, Bernd; Bonelli, Raphael M

    2011-05-01

    Previous studies have shown abnormal electroencephalography (EEG) in Huntington's disease (HD). The aim of the present investigation was to compare quantitatively analyzed EEGs of HD patients and controls by means of low-resolution brain electromagnetic tomography (LORETA). Further aims were to delineate the sensitivity and utility of EEG LORETA in the progression of HD, and to correlate parameters of cognitive and motor impairment with neurophysiological variables. In 55 HD patients and 55 controls a 3-min vigilance-controlled EEG (V-EEG) was recorded during midmorning hours. Power spectra and intracortical tomography were computed by LORETA in seven frequency bands and compared between groups. Spearman rank correlations were based on V-EEG and psychometric data. Statistical overall analysis by means of the omnibus significance test demonstrated significant (p < 0.01) differences between HD patients and controls. LORETA theta, alpha and beta power were decreased from early to late stages of the disease. Only advanced disease stages showed a significant increase in delta power, mainly in the right orbitofrontal cortex. Correlation analyses revealed that a decrease of alpha and theta power correlated significantly with increasing cognitive and motor decline. LORETA proved to be a sensitive instrument for detecting progressive electrophysiological changes in HD. Reduced alpha power seems to be a trait marker of HD, whereas increased prefrontal delta power seems to reflect worsening of the disease. Motor function and cognitive function deteriorate together with a decrease in alpha and theta power. This data set, so far the largest in HD research, helps to elucidate remaining uncertainties about electrophysiological abnormalities in HD.
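
    LORETA estimates power in source space, but the underlying band-power computation is conventional: integrate a power spectral density over the band of interest. A scalp-level sketch with a synthetic 10 Hz (alpha-band) signal; the sampling rate and band edges are illustrative:

      import numpy as np
      from scipy.signal import welch
      from scipy.integrate import trapezoid

      def band_power(signal, fs, band):
          """Absolute band power from the Welch PSD estimate."""
          freqs, psd = welch(signal, fs=fs, nperseg=4 * fs)
          lo, hi = band
          mask = (freqs >= lo) & (freqs <= hi)
          return trapezoid(psd[mask], freqs[mask])

      fs = 256                                     # Hz
      t = np.arange(30 * fs) / fs                  # 30 s of signal
      rng = np.random.default_rng(5)
      eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.normal(size=t.size)
      print(band_power(eeg, fs, (8, 13)))          # alpha dominates here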

  9. System Advisor Model, SAM 2011.12.2: General Description

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilman, P.; Dobos, A.

    2012-02-01

    This document describes the capabilities of the U.S. Department of Energy and National Renewable Energy Laboratory's System Advisor Model (SAM), Version 2011.12.2, released on December 2, 2011. SAM is software that models the cost and performance of renewable energy systems. Project developers, policy makers, equipment manufacturers, and researchers use graphs and tables of SAM results in the process of evaluating financial, technology, and incentive options for renewable energy projects. SAM simulates the performance of solar, wind, geothermal, biomass, and conventional power systems. The financial model can represent financing structures for projects that either buy and sell electricity at retail rates (residential and commercial) or sell electricity at a price determined in a power purchase agreement (utility). Advanced analysis options facilitate parametric, sensitivity, and statistical analyses, and allow for interfacing SAM with Microsoft Excel or with other computer programs. SAM is available as a free download at http://sam.nrel.gov. Technical support and more information about the software are available on the website.

  10. Acute effects of The Stick on strength, power, and flexibility.

    PubMed

    Mikesky, Alan E; Bahamonde, Rafael E; Stanton, Katie; Alvey, Thurman; Fitton, Tom

    2002-08-01

    The Stick is a muscle massage device used by athletes, particularly track athletes, to improve performance. The purpose of this project was to assess the acute effects of The Stick on muscle strength, power, and flexibility. Thirty collegiate athletes consented to participate in a 4-week, double-blind study, which consisted of 4 testing sessions (1 familiarization and 3 data collection) scheduled 1 week apart. During each testing session subjects performed 4 measures in the following sequence: hamstring flexibility, vertical jump, flying-start 20-yard dash, and isokinetic knee extension at 90 degrees x s(-1). Two minutes of randomly assigned intervention treatment (visualization [control], mock insensible electrical stimulation [placebo], or massage using The Stick [experimental]) was performed immediately prior to each performance measure. Statistical analyses involved single-factor repeated measures analysis of variance (ANOVA) with Fisher's Least Significant Difference post-hoc test. None of the variables measured showed an acute improvement (p ≤ 0.05) immediately following treatment with The Stick.

  11. Genetic co-structuring in host-parasite systems: Empirical data from raccoons and raccoon ticks

    DOE PAGES

    Dharmarajan, Guha; Beasley, James C.; Beatty, William S.; ...

    2016-03-31

    Many aspects of parasite biology critically depend on their hosts, and understanding how host-parasite populations are co-structured can help improve our understanding of the ecology of parasites, their hosts, and host-parasite interactions. Here, this study utilized genetic data collected from raccoons (Procyon lotor), and a specialist parasite, the raccoon tick (Ixodes texanus), to test for genetic co-structuring of host-parasite populations at both landscape and host scales. At the landscape scale, our analyses revealed a significant correlation between genetic and geographic distance matrices (i.e., isolation by distance) in ticks, but not their hosts. While there are several mechanisms that could lead to a stronger pattern of isolation by distance in tick vs. raccoon datasets, our analyses suggest that at least one reason for the above pattern is the substantial increase in statistical power (due to the ≈8-fold increase in sample size) afforded by sampling parasites. Host-scale analyses indicated higher relatedness between ticks sampled from related vs. unrelated raccoons trapped within the same habitat patch, a pattern likely driven by increased contact rates between related hosts. Lastly, by utilizing fine-scale genetic data from both parasites and hosts, our analyses help improve our understanding of epidemiology and host ecology.
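
    The isolation-by-distance result rests on a Mantel-type matrix permutation test: the same joint row/column permutation logic sketched earlier for QAP, applied here to genetic versus geographic distances (hypothetical matrices assumed):

      import numpy as np

      def mantel(d_gen, d_geo, n_perm=9999, seed=0):
          """Mantel test: correlation between genetic and geographic
          distance matrices; significance from permuting one matrix's
          rows and columns jointly."""
          rng = np.random.default_rng(seed)
          n = d_gen.shape[0]
          iu = np.triu_indices(n, k=1)
          obs = np.corrcoef(d_gen[iu], d_geo[iu])[0, 1]
          exceed = 1
          for _ in range(n_perm):
              p = rng.permutation(n)
              r = np.corrcoef(d_gen[np.ix_(p, p)][iu], d_geo[iu])[0, 1]
              exceed += abs(r) >= abs(obs)
          return obs, exceed / (n_perm + 1)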

  12. New heterogeneous test statistics for the unbalanced fixed-effect nested design.

    PubMed

    Guo, Jiin-Huarng; Billard, L; Luh, Wei-Ming

    2011-05-01

    When the underlying variances are unknown or/and unequal, using the conventional F test is problematic in the two-factor hierarchical data structure. Prompted by the approximate test statistics (Welch and Alexander-Govern methods), the authors develop four new heterogeneous test statistics to test factor A and factor B nested within A for the unbalanced fixed-effect two-stage nested design under variance heterogeneity. The actual significance levels and statistical power of the test statistics were compared in a simulation study. The results show that the proposed procedures maintain better Type I error rate control and have greater statistical power than those obtained by the conventional F test in various conditions. Therefore, the proposed test statistics are recommended in terms of robustness and easy implementation. ©2010 The British Psychological Society.

  13. Correlation techniques and measurements of wave-height statistics

    NASA Technical Reports Server (NTRS)

    Guthart, H.; Taylor, W. C.; Graf, K. A.; Douglas, D. G.

    1972-01-01

    Statistical measurements of wave height fluctuations have been made in a wind wave tank. The power spectral density function of temporal wave height fluctuations evidenced second-harmonic components and an f^-5 power-law decay beyond the second harmonic. The observations of second harmonic effects agreed very well with a theoretical prediction. From the wave statistics, surface drift currents were inferred and compared to experimental measurements with satisfactory agreement. Measurements were made of the two-dimensional correlation coefficient at 15 deg increments in angle with respect to the wind vector. An estimate of the two-dimensional spatial power spectral density function was also made.

  14. Merging National Forest and National Forest Health Inventories to Obtain an Integrated Forest Resource Inventory – Experiences from Bavaria, Slovenia and Sweden

    PubMed Central

    Kovač, Marko; Bauer, Arthur; Ståhl, Göran

    2014-01-01

    Background, Materials and Methods To meet the demands of sustainable forest management and international commitments, European nations have designed a variety of forest-monitoring systems for specific needs. While the majority of countries are committed to independent, single-purpose inventorying, a minority of countries have merged their single-purpose forest inventory systems into integrated forest resource inventories. The statistical efficiencies of the Bavarian, Slovene and Swedish integrated forest resource inventory designs are investigated via statistical parameters of the variables growing stock volume, share of damaged trees, and deadwood volume. The parameters are derived by using the estimators for the given inventory designs. The required sample sizes are derived via the general formula for non-stratified independent samples and via statistical power analyses. The cost effectiveness of the designs is compared via two simple cost effectiveness ratios. Results In terms of precision, the most illustrative parameters of the variables are relative standard errors; their values range between 1% and 3% if the variables’ variations are low (s%<80%) and are higher in the case of higher variations. A comparison of the actual and required sample sizes shows that the actual sample sizes were deliberately set high to provide precise estimates for the majority of variables and strata. In turn, the successive inventories are statistically efficient, because they can detect mean changes of the variables with powers higher than 90%; the highest precision is attained for the changes of growing stock volume and the lowest for the changes of the shares of damaged trees. Two indicators of cost effectiveness also show that the time input spent for measuring one variable decreases with the complexity of inventories. Conclusion There is an increasing need for credible information on forest resources to be used for decision making and national and international policy making. Such information can be cost-efficiently provided through integrated forest resource inventories. PMID:24941120

  15. Bayesian model selection techniques as decision support for shaping a statistical analysis plan of a clinical trial: An example from a vertigo phase III study with longitudinal count data as primary endpoint

    PubMed Central

    2012-01-01

    Background A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. Methods We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score). Results The instruments under study provide excellent tools for preparing decisions within the SAP in a transparent way when structuring the primary analysis, sensitivity or ancillary analyses, and specific analyses for secondary endpoints. The mean logarithmic score and DIC discriminate well between different model scenarios. It becomes obvious that the naive choice of a conventional random effects Poisson model is often inappropriate for real-life count data. The findings are used to specify an appropriate mixed model employed in the sensitivity analyses of an ongoing phase III trial. Conclusions The proposed Bayesian methods are not only appealing for inference but notably provide a sophisticated insight into different aspects of model performance, such as forecast verification or calibration checks, and can be applied within the model selection process. The mean of the logarithmic score is a robust tool for model ranking and is not sensitive to sample size. Therefore, these Bayesian model selection techniques offer helpful decision support for shaping sensitivity and ancillary analyses in a statistical analysis plan of a clinical trial with longitudinal count data as the primary endpoint. PMID:22962944
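
    The logarithmic score used here for model ranking is, in essence, the negative mean log predictive density of the observations, smaller being better under this sign convention. The paper evaluates it on leave-one-out predictive distributions computed via INLA; the plug-in sketch below only shows the mechanics for two candidate count models with assumed parameters:

      import numpy as np
      from scipy.stats import poisson, nbinom

      y = np.array([0, 2, 1, 4, 0, 3, 1, 0, 2, 5])   # hypothetical counts

      # Poisson model with plug-in mean.
      score_pois = -poisson.logpmf(y, mu=y.mean()).mean()

      # Negative binomial with the same mean but extra dispersion.
      mean, disp = y.mean(), 2.0
      score_nb = -nbinom.logpmf(y, n=disp, p=disp / (disp + mean)).mean()

      print(score_pois, score_nb)    # smaller mean log score is better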

  16. Bayesian model selection techniques as decision support for shaping a statistical analysis plan of a clinical trial: an example from a vertigo phase III study with longitudinal count data as primary endpoint.

    PubMed

    Adrion, Christine; Mansmann, Ulrich

    2012-09-10

    A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score). The instruments under study provide excellent tools for preparing decisions within the SAP in a transparent way when structuring the primary analysis, sensitivity or ancillary analyses, and specific analyses for secondary endpoints. The mean logarithmic score and DIC discriminate well between different model scenarios. It becomes obvious that the naive choice of a conventional random effects Poisson model is often inappropriate for real-life count data. The findings are used to specify an appropriate mixed model employed in the sensitivity analyses of an ongoing phase III trial. The proposed Bayesian methods are not only appealing for inference but notably provide a sophisticated insight into different aspects of model performance, such as forecast verification or calibration checks, and can be applied within the model selection process. The mean of the logarithmic score is a robust tool for model ranking and is not sensitive to sample size. Therefore, these Bayesian model selection techniques offer helpful decision support for shaping sensitivity and ancillary analyses in a statistical analysis plan of a clinical trial with longitudinal count data as the primary endpoint.

  17. Ecological statistics of Gestalt laws for the perceptual organization of contours.

    PubMed

    Elder, James H; Goldberg, Richard M

    2002-01-01

    Although numerous studies have measured the strength of visual grouping cues for controlled psychophysical stimuli, little is known about the statistical utility of these various cues for natural images. In this study, we conducted experiments in which human participants trace perceived contours in natural images. These contours are automatically mapped to sequences of discrete tangent elements detected in the image. By examining relational properties between pairs of successive tangents on these traced curves, and between randomly selected pairs of tangents, we are able to estimate the likelihood distributions required to construct an optimal Bayesian model for contour grouping. We employed this novel methodology to investigate the inferential power of three classical Gestalt cues for contour grouping: proximity, good continuation, and luminance similarity. The study yielded a number of important results: (1) these cues, when appropriately defined, are approximately uncorrelated, suggesting a simple factorial model for statistical inference; (2) moderate image-to-image variation of the statistics indicates the utility of general probabilistic models for perceptual organization; (3) these cues differ greatly in their inferential power, proximity being by far the most powerful; and (4) statistical modeling of the proximity cue indicates a scale-invariant power law in close agreement with prior psychophysics.

  18. A clinicomicrobiological study to evaluate the efficacy of manual and powered toothbrushes among autistic patients

    PubMed Central

    Vajawat, Mayuri; Deepika, P. C.; Kumar, Vijay; Rajeshwari, P.

    2015-01-01

    Aim: To compare the efficacy of powered toothbrushes with manual toothbrushes in improving gingival health and reducing salivary red complex counts among autistic individuals. Materials and Methods: Forty autistic individuals were selected. The test group received powered toothbrushes, and the control group received manual toothbrushes. Plaque index and gingival index were recorded. Unstimulated saliva was collected for analysis of red complex organisms using polymerase chain reaction. Results: A statistically significant reduction in plaque scores was seen over a period of 12 weeks in both groups (P < 0.001 for tests and P = 0.002 for controls); the reduction was significantly greater in the test group (P = 0.024). A statistically significant reduction in gingival scores was seen over a period of 12 weeks in both groups (P < 0.001 for tests and P = 0.001 for controls); the reduction was significantly greater in the test group (P = 0.042). No statistically significant reduction in the detection rate of red complex organisms was seen at 4 weeks in either group. Conclusion: Powered toothbrushes result in a significant overall improvement in gingival health when constant reinforcement of oral hygiene instructions is given. PMID:26681855

  19. Waveform classification and statistical analysis of seismic precursors to the July 2008 Vulcanian Eruption of Soufrière Hills Volcano, Montserrat

    NASA Astrophysics Data System (ADS)

    Rodgers, Mel; Smith, Patrick; Pyle, David; Mather, Tamsin

    2016-04-01

    Understanding the transition between quiescence and eruption at dome-forming volcanoes, such as Soufrière Hills Volcano (SHV), Montserrat, is important for monitoring volcanic activity during long-lived eruptions. Statistical analysis of seismic events (e.g. spectral analysis and identification of multiplets via cross-correlation) can be useful for characterising seismicity patterns and can be a powerful tool for analysing temporal changes in behaviour. Waveform classification is crucial for volcano monitoring, but consistent classification, both during real-time analysis and for retrospective analysis of previous volcanic activity, remains a challenge. Automated classification allows consistent re-classification of events. We present a machine learning (random forest) approach to rapidly classify waveforms that requires minimal training data. We analyse the seismic precursors to the July 2008 Vulcanian explosion at SHV and show systematic changes in frequency content and multiplet behaviour that had not previously been recognised. These precursory patterns of seismicity may be interpreted as changes in pressure conditions within the conduit during magma ascent and could be linked to magma flow rates. Frequency analysis of the different waveform classes supports the growing consensus that LP and Hybrid events should be considered end members of a continuum of low-frequency source processes. By using both supervised and unsupervised machine-learning methods we investigate the nature of waveform classification and assess current classification schemes.
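
    Once each waveform is summarized by features (spectral measures, durations, band ratios, and the like), the classification step is a standard supervised pipeline. The sketch below runs on random features and labels, so cross-validated accuracy sits at chance; with real labelled events the same code applies (all names and dimensions are hypothetical):

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(7)
      X = rng.normal(size=(300, 12))        # 300 events, 12 features each
      y = rng.integers(0, 3, size=300)      # e.g. LP / hybrid / VT classes

      clf = RandomForestClassifier(n_estimators=500, random_state=0)
      print(cross_val_score(clf, X, y, cv=5).mean())   # ~1/3 on noise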

  20. Statistical properties of radiation power levels from a high-gain free-electron laser at and beyond saturation

    NASA Astrophysics Data System (ADS)

    Schroeder, C. B.; Fawley, W. M.; Esarey, E.

    2003-07-01

    We investigate the statistical properties (e.g., shot-to-shot power fluctuations) of the radiation from a high-gain free-electron laser (FEL) operating in the nonlinear regime. We consider the case of an FEL amplifier reaching saturation whose shot-to-shot fluctuations in input radiation power follow a gamma distribution. We analyze the corresponding output power fluctuations at and beyond saturation, including beam energy spread effects, and find that there are well-characterized values of undulator length for which the fluctuations reach a minimum.

  1. Space Station Freedom electric power system availability study

    NASA Technical Reports Server (NTRS)

    Turnquist, Scott R.

    1990-01-01

    This report details the results of follow-on availability analyses performed on the Space Station Freedom electric power system (EPS). The scope includes analyses of several EPS design variations: the 4-photovoltaic (PV) module baseline EPS design, a 6-PV module EPS design, and a 3-solar dynamic module EPS design that included a 10 kW PV module. The analyses performed included: determining the discrete power levels at which the EPS will operate upon various component failures, and the availability of each of these operating states; ranking EPS components by the relative contribution each component type makes to the power availability of the EPS; determining the availability impacts of including structural and long-life EPS components in the availability models; determining optimum sparing strategies, for storing spare EPS components on-orbit, to maintain high average power capability with low lift-mass requirements; and determining the sensitivity of EPS availability to uncertainties in the component reliability and maintainability data used.

  2. What Determines the Duration of War? Insights from Assessment Strategies in Animal Contests

    PubMed Central

    Briffa, Mark

    2014-01-01

    Interstate wars and animal contests both involve disputed resources, restraint and giving up decisions. In both cases it seems illogical for the weaker side to persist in the conflict if it will eventually lose. In the case of animal contests analyses of the links between opponent power and contest duration have provided insights into what sources of information are available to fighting animals. I outline the theory of information use during animal contests and describe a statistical framework that has been used to distinguish between two strategies that individuals use to decide whether to persist or quit. I then apply this framework to the analysis of interstate wars. War duration increases with the power of winners and losers. These patterns provide no support for the idea that wars are settled on the basis of mutual assessment of capabilities but indicate that settlement is based on attrition. In contrast to most animal contests, war duration is as closely linked to the power of the winning side as to that of the losing side. Overall, this analysis highlights a number of similarities between animal contests and interstate war, indicating that both could be investigated using similar conceptual frameworks. PMID:25247403

  3. OPATs: Omnibus P-value association tests.

    PubMed

    Chen, Chia-Wei; Yang, Hsin-Chou

    2017-07-10

    Combining statistical significances (P-values) from a set of single-locus association tests in genome-wide association studies is a proof-of-principle method for identifying disease-associated genomic segments, functional genes and biological pathways. We review P-value combinations for genome-wide association studies and introduce an integrated analysis tool, Omnibus P-value Association Tests (OPATs), which provides popular P-value combination methods. The software OPATs, programmed in R with an R graphical user interface, is user-friendly. In addition to analysis modules for data quality control and single-locus association tests, OPATs provides three types of set-based association test: window-, gene- and biopathway-based. P-value combinations with or without threshold and rank truncation are provided. The significance of a set-based association test is evaluated by using resampling procedures. The performance of the set-based association tests in OPATs has been evaluated in simulation studies and real data analyses. These set-based association tests help boost statistical power, alleviate the multiple-testing problem, reduce the impact of genetic heterogeneity, increase the replication efficiency of association tests, and facilitate the interpretation of association signals by streamlining the testing procedures and integrating the genetic effects of multiple variants in genomic regions of biological relevance. In summary, P-value combinations facilitate the identification of marker sets associated with disease susceptibility and uncover missing heritability in association studies, thereby establishing a foundation for the genetic dissection of complex diseases and traits. OPATs provides an easy-to-use and statistically powerful analysis tool for P-value combinations. OPATs, examples, and a user guide can be downloaded from http://www.stat.sinica.edu.tw/hsinchou/genetics/association/OPATs.htm. © The Author 2017. Published by Oxford University Press.
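
    The simplest P-value combination offered by tools of this kind is Fisher's method, whose statistic -2 * sum(log p) is chi-square distributed with 2k degrees of freedom under the global null; SciPy ships it directly (the truncation and rank-truncation variants provided by OPATs require extra bookkeeping):

      import numpy as np
      from scipy.stats import combine_pvalues

      # Single-locus P-values within one gene or window (made-up values).
      p_values = np.array([0.04, 0.20, 0.011, 0.65, 0.003])
      stat, p_combined = combine_pvalues(p_values, method="fisher")
      print(stat, p_combined)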

  4. Anodal transcranial direct current stimulation of the right anterior temporal lobe did not significantly affect verbal insight.

    PubMed

    Aihara, Takatsugu; Ogawa, Takeshi; Shimokawa, Takeaki; Yamashita, Okito

    2017-01-01

    Humans often utilize past experience to solve difficult problems. However, if past experience is insufficient to solve a problem, solvers may reach an impasse. Insight can be valuable for breaking an impasse, enabling the reinterpretation or re-representation of a problem. Previous studies using between-subjects designs have revealed a causal relationship between the anterior temporal lobes (ATLs) and non-verbal insight, by enhancing the right ATL while inhibiting the left ATL using transcranial direct current stimulation (tDCS). In addition, neuroimaging studies have reported a correlation between right ATL activity and verbal insight. Based on these findings, we hypothesized that the right ATL is causally related to both non-verbal and verbal insight. To test this hypothesis, we conducted an experiment with 66 subjects using a within-subjects design, which typically has greater statistical power than a between-subjects design. Subjects participated in tDCS experiments across 2 days, in which they solved both non-verbal and verbal insight problems under active or sham stimulation conditions. To dissociate the effects of right ATL stimulation from those of left ATL stimulation, we used two montage types: anodal tDCS of the right ATL together with cathodal tDCS of the left ATL (stimulating both ATLs), and anodal tDCS of the right ATL with cathodal tDCS of the left cheek (stimulating only the right ATL). The montage used was counterbalanced across subjects. Statistical analyses revealed that, regardless of the montage type, there were no significant differences between the active and sham conditions for either verbal or non-verbal insight, although the finding for non-verbal insight was inconclusive because of a lack of statistical power. These results failed to support previous findings suggesting that the right ATL is the central locus of insight.
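
    A quick sketch of the design rationale, using statsmodels power calculators; the effect size is hypothetical, and for the paired calculation it refers to the standardized difference score, which is what makes within-subjects comparisons more sensitive when measurements are positively correlated:

    ```python
    from statsmodels.stats.power import TTestPower, TTestIndPower

    d, alpha, n = 0.35, 0.05, 66
    power_within = TTestPower().power(effect_size=d, nobs=n, alpha=alpha)
    power_between = TTestIndPower().power(effect_size=d, nobs1=n, ratio=1.0, alpha=alpha)
    print(f"within-subjects:  {power_within:.2f}")   # one sample of paired differences
    print(f"between-subjects: {power_between:.2f}")  # two independent groups of 66
    ```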

  5. Generalized functional linear models for gene-based case-control association studies.

    PubMed

    Fan, Ruzong; Wang, Yifan; Mills, James L; Carter, Tonia C; Lobach, Iryna; Wilson, Alexander F; Bailey-Wilson, Joan E; Weeks, Daniel E; Xiong, Momiao

    2014-11-01

    By using functional data analysis techniques, we developed generalized functional linear models for testing association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates. Both fixed and mixed effect models are developed and compared. Extensive simulations show that Rao's efficient score tests of the fixed effect models are very conservative since they generate lower type I errors than nominal levels, and global tests of the mixed effect models generate accurate type I errors. Furthermore, we found that the Rao's efficient score test statistics of the fixed effect models have higher power than the sequence kernel association test (SKAT) and its optimal unified version (SKAT-O) in most cases when the causal variants are both rare and common. When the causal variants are all rare (i.e., minor allele frequencies less than 0.03), the Rao's efficient score test statistics and the global tests have similar or slightly lower power than SKAT and SKAT-O. In practice, it is not known whether rare variants or common variants in a gene region are disease related. All we can assume is that a combination of rare and common variants influences disease susceptibility. Thus, the improved performance of our models when the causal variants are both rare and common shows that the proposed models can be very useful in dissecting complex traits. We compare the performance of our methods with SKAT and SKAT-O on real neural tube defects and Hirschsprung's disease datasets. The Rao's efficient score test statistics and the global tests are more sensitive than SKAT and SKAT-O in the real data analysis. Our methods can be used in either gene-disease genome-wide/exome-wide association studies or candidate gene analyses. © 2014 WILEY PERIODICALS, INC.
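
    The conservativeness claim can be checked with a generic simulation of empirical type I error, sketched here with an ordinary t-test standing in for the score and global tests (illustrative only):

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    alpha, reps, n = 0.05, 5000, 50
    rejections = sum(stats.ttest_1samp(rng.normal(size=n), 0.0).pvalue < alpha
                     for _ in range(reps))
    print(rejections / reps)  # near 0.05 is accurate; well below it is conservative
    ```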

  6. Generalized Functional Linear Models for Gene-based Case-Control Association Studies

    PubMed Central

    Mills, James L.; Carter, Tonia C.; Lobach, Iryna; Wilson, Alexander F.; Bailey-Wilson, Joan E.; Weeks, Daniel E.; Xiong, Momiao

    2014-01-01

    By using functional data analysis techniques, we developed generalized functional linear models for testing association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates. Both fixed and mixed effect models are developed and compared. Extensive simulations show that Rao's efficient score tests of the fixed effect models are very conservative since they generate lower type I errors than nominal levels, and global tests of the mixed effect models generate accurate type I errors. Furthermore, we found that the Rao's efficient score test statistics of the fixed effect models have higher power than the sequence kernel association test (SKAT) and its optimal unified version (SKAT-O) in most cases when the causal variants are both rare and common. When the causal variants are all rare (i.e., minor allele frequencies less than 0.03), the Rao's efficient score test statistics and the global tests have similar or slightly lower power than SKAT and SKAT-O. In practice, it is not known whether rare variants or common variants in a gene are disease-related. All we can assume is that a combination of rare and common variants influences disease susceptibility. Thus, the improved performance of our models when the causal variants are both rare and common shows that the proposed models can be very useful in dissecting complex traits. We compare the performance of our methods with SKAT and SKAT-O on real neural tube defects and Hirschsprung's disease data sets. The Rao's efficient score test statistics and the global tests are more sensitive than SKAT and SKAT-O in the real data analysis. Our methods can be used in either gene-disease genome-wide/exome-wide association studies or candidate gene analyses. PMID:25203683

  7. Prospective multi-centre Voxel Based Morphometry study employing scanner specific segmentations: Procedure development using CaliBrain structural MRI data

    PubMed Central

    2009-01-01

    Background Structural Magnetic Resonance Imaging (sMRI) of the brain is employed in the assessment of a wide range of neuropsychiatric disorders. In order to improve statistical power in such studies it is desirable to pool scanning resources from multiple centres. The CaliBrain project was designed to provide for an assessment of scanner differences at three centres in Scotland, and to assess the practicality of pooling scans from multiple-centres. Methods We scanned healthy subjects twice on each of the 3 scanners in the CaliBrain project with T1-weighted sequences. The tissue classifier supplied within the Statistical Parametric Mapping (SPM5) application was used to map the grey and white tissue for each scan. We were thus able to assess within scanner variability and between scanner differences. We have sought to correct for between scanner differences by adjusting the probability mappings of tissue occupancy (tissue priors) used in SPM5 for tissue classification. The adjustment procedure resulted in separate sets of tissue priors being developed for each scanner and we refer to these as scanner specific priors. Results Voxel Based Morphometry (VBM) analyses and metric tests indicated that the use of scanner specific priors reduced tissue classification differences between scanners. However, the metric results also demonstrated that the between scanner differences were not reduced to the level of within scanner variability, the ideal for scanner harmonisation. Conclusion Our results indicate the development of scanner specific priors for SPM can assist in pooling of scan resources from different research centres. This can facilitate improvements in the statistical power of quantitative brain imaging studies. PMID:19445668

  8. Assessing the clinical utility of cancer genomic and proteomic data across tumor types.

    PubMed

    Yuan, Yuan; Van Allen, Eliezer M; Omberg, Larsson; Wagle, Nikhil; Amin-Mansour, Ali; Sokolov, Artem; Byers, Lauren A; Xu, Yanxun; Hess, Kenneth R; Diao, Lixia; Han, Leng; Huang, Xuelin; Lawrence, Michael S; Weinstein, John N; Stuart, Josh M; Mills, Gordon B; Garraway, Levi A; Margolin, Adam A; Getz, Gad; Liang, Han

    2014-07-01

    Molecular profiling of tumors promises to advance the clinical management of cancer, but the benefits of integrating molecular data with traditional clinical variables have not been systematically studied. Here we retrospectively predict patient survival using diverse molecular data (somatic copy-number alteration, DNA methylation and mRNA, microRNA and protein expression) from 953 samples of four cancer types from The Cancer Genome Atlas project. We find that incorporating molecular data with clinical variables yields statistically significantly improved predictions (FDR < 0.05) for three cancers but those quantitative gains were limited (2.2-23.9%). Additional analyses revealed little predictive power across tumor types except for one case. In clinically relevant genes, we identified 10,281 somatic alterations across 12 cancer types in 2,928 of 3,277 patients (89.4%), many of which would not be revealed in single-tumor analyses. Our study provides a starting point and resources, including an open-access model evaluation platform, for building reliable prognostic and therapeutic strategies that incorporate molecular data.
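
    A hedged sketch of the kind of comparison performed here, using the third-party lifelines package with simulated data and hypothetical column names: fit a Cox model on clinical covariates alone and on clinical plus molecular features, then compare predictive power via the concordance index.

    ```python
    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter

    rng = np.random.default_rng(2)
    n = 300
    df = pd.DataFrame({"age": rng.normal(60, 10, n),
                       "stage": rng.integers(1, 5, n).astype(float)})
    for i in range(5):
        df[f"gene{i}"] = rng.normal(size=n)            # mock molecular features
    risk = 0.03 * df["age"] + 0.4 * df["stage"] + 0.5 * df["gene0"]
    df["T"] = rng.exponential(np.exp(-(risk - risk.mean()) / risk.std()))
    df["E"] = rng.random(n) < 0.7                      # event indicator

    clin_cols = ["age", "stage", "T", "E"]
    c_clin = CoxPHFitter().fit(df[clin_cols], "T", "E").concordance_index_
    c_full = CoxPHFitter().fit(df, "T", "E").concordance_index_
    print(c_clin, c_full)  # the molecular increment is often modest, as found above
    ```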

  9. Tipping point analysis of ocean acoustic noise

    NASA Astrophysics Data System (ADS)

    Livina, Valerie N.; Brouwer, Albert; Harris, Peter; Wang, Lian; Sotirakopoulos, Kostas; Robinson, Stephen

    2018-02-01

    We apply tipping point analysis to a large record of ocean acoustic data to identify the main components of the acoustic dynamical system and study possible bifurcations and transitions of the system. The analysis is based on a statistical physics framework with stochastic modelling, where we represent the observed data as a composition of deterministic and stochastic components estimated from the data using time-series techniques. We analyse long-term and seasonal trends, system states and acoustic fluctuations to reconstruct a one-dimensional stochastic equation to approximate the acoustic dynamical system. We apply potential analysis to acoustic fluctuations and detect several changes in the system states in the past 14 years. These are most likely caused by climatic phenomena. We analyse trends in sound pressure level within different frequency bands and hypothesize a possible anthropogenic impact on the acoustic environment. The tipping point analysis framework provides insight into the structure of the acoustic data and helps identify its dynamic phenomena, correctly reproducing the probability distribution and scaling properties (power-law correlations) of the time series.
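
    The "one-dimensional stochastic equation" of potential analysis is conventionally written in the following form (notation assumed here, not quoted from the abstract): the state z(t) moves in a polynomial potential U(z) perturbed by noise, and the number of wells of the fitted potential gives the number of detected system states.

    ```latex
    \[
      \mathrm{d}z = -U'(z)\,\mathrm{d}t + \sigma\,\mathrm{d}W, \qquad
      U(z) = \sum_{k=1}^{2m} a_k z^{k},
    \]
    ```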

  10. 40 CFR 91.512 - Request for public hearing.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... plans and statistical analyses have been properly applied (specifically, whether sampling procedures and statistical analyses specified in this subpart were followed and whether there exists a basis for... will be made available to the public during Agency business hours. ...

  11. A retrospective survey of research design and statistical analyses in selected Chinese medical journals in 1998 and 2008.

    PubMed

    Jin, Zhichao; Yu, Danghui; Zhang, Luoman; Meng, Hong; Lu, Jian; Gao, Qingbin; Cao, Yang; Ma, Xiuqiang; Wu, Cheng; He, Qian; Wang, Rui; He, Jia

    2010-05-25

    High-quality clinical research requires not only advanced professional knowledge but also sound study design and correct statistical analyses. The number of clinical research articles published in Chinese medical journals has increased immensely in the past decade, but study design quality and statistical analyses have remained suboptimal. The aim of this investigation was to gather evidence on the quality of study design and statistical analyses in clinical research conducted in China during the first decade of the new millennium. Ten (10) leading Chinese medical journals were selected and all original articles published in 1998 (N = 1,335) and 2008 (N = 1,578) were thoroughly categorized and reviewed. A well-defined and validated checklist on study design, statistical analyses, results presentation, and interpretation was used for review and evaluation. Main outcomes were the frequencies of different types of study design, error/defect proportions in design and statistical analyses, and implementation of CONSORT in randomized clinical trials. From 1998 to 2008, the error/defect proportion in statistical analyses decreased significantly (χ² = 12.03, p < 0.001), from 59.8% (545/1,335) in 1998 to 52.2% (664/1,578) in 2008. The overall error/defect proportion in study design also decreased (χ² = 21.22, p < 0.001), from 50.9% (680/1,335) to 42.4% (669/1,578). In 2008, the proportion of randomized clinical trials remained in the single digits (3.8%, 60/1,578), with two-thirds showing poor results reporting (defects in 44 papers, 73.3%). Nearly half of the published studies were retrospective in nature: 49.3% (658/1,335) in 1998 compared to 48.2% (761/1,578) in 2008. Decreases in defect proportions were observed in both results presentation (χ² = 93.26, p < 0.001), from 92.7% (945/1,019) to 78.2% (1,023/1,309), and interpretation (χ² = 27.26, p < 0.001), from 9.7% (99/1,019) to 4.3% (56/1,309), although some serious defects persisted. Chinese medical research seems to have made significant progress regarding statistical analyses, but there remains ample room for improvement in study design. Retrospective designs are the most often used, whereas randomized clinical trials are rare and often show methodological weaknesses. Urgent implementation of the CONSORT statement is imperative.

  12. Metacontrast Inferred from Reaction Time and Verbal Report: Replication and Comments on the Feher-Biederman Experiment

    ERIC Educational Resources Information Center

    Amundson, Vickie E.; Bernstein, Ira H.

    1973-01-01

    Authors note that Fehrer and Biederman's two statistical tests were not of equal power and that their conclusion could be a statistical artifact of both the lesser power of the verbal report comparison and the insensitivity of their particular verbal report indicator. (Editor)

  13. Clinical and structural outcomes after arthroscopic repair of full-thickness rotator cuff tears with and without platelet-rich product supplementation: a meta-analysis and meta-regression.

    PubMed

    Warth, Ryan J; Dornan, Grant J; James, Evan W; Horan, Marilee P; Millett, Peter J

    2015-02-01

    The purpose of this study was to perform a systematic review, meta-analysis, and meta-regression of all Level I and Level II studies comparing the clinical or structural outcomes, or both, after rotator cuff repair with and without platelet-rich product (PRP) supplementation. A literature search of the PubMed and EMBASE databases was performed to identify all Level I or II studies comparing the clinical or structural outcomes, or both, after arthroscopic repair of full-thickness rotator cuff tears with (PRP+ group) and without (PRP- group) PRP supplementation. Data included outcome scores (American Shoulder and Elbow Surgeons [ASES], University of California Los Angeles [UCLA], Constant, Simple Shoulder Test [SST] and visual analog scale [VAS] scores) and retears diagnosed with imaging studies. Meta-analyses compared preoperative, postoperative, and gain in outcome scores and relative risk ratios for retears. Meta-regression compared the effect of PRP treatment on outcome scores and retear rates according to 6 covariates. Minimum effect sizes that were detectable with 80% power were also calculated for each study. Eleven studies were included in this review and a maximum of 8 studies were used for meta-analyses according to data availability. There were no statistically significant differences between the PRP+ and PRP- groups for overall outcome scores or retear rates (P > .05). Overall gain in the Constant score was decreased when liquid PRP was injected over the tendon surface compared with PRP application at the tendon-bone interface (-6.88 points v +0.78 points, respectively; P = .046); however, this difference did not reach the previously reported minimum clinically important difference (MCID) for Constant scores. When the initial tear size was greater than 3 cm in anterior-posterior length, the PRP+ group exhibited decreased retear rates after double-row repairs when compared with the PRP- group (25.9% v 57.1%, respectively; P = .046). Sensitivity power analyses revealed that most included studies were only powered to detect large differences in outcome scores between groups. There were no statistically significant differences in overall gain in outcome scores or retear rates between treatment groups. Gain in Constant scores was significantly increased when PRPs were applied at the tendon-bone interface when compared with application over the top of the repaired tendon. Retear rates were significantly decreased when PRPs were used for the treatment of tears greater than 3 cm in anterior-posterior length using a double-row technique. Most of the included studies were only powered to detect large differences in outcome scores between treatment groups. In addition, an increased risk for selection, performance, and attrition biases was found. Level II, meta-analysis of Level I and Level II studies. Copyright © 2015 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
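
    The core meta-analytic step for the retear outcome can be sketched as an inverse-variance pool of log risk ratios; the event counts below are invented for illustration, not taken from the review.

    ```python
    import numpy as np
    from scipy import stats

    # per study: (retears_PRP, n_PRP, retears_control, n_control)
    studies = [(5, 30, 9, 28), (7, 45, 12, 44), (4, 25, 6, 26)]

    log_rr = [np.log((a / n1) / (c / n0)) for a, n1, c, n0 in studies]
    var = [1/a - 1/n1 + 1/c - 1/n0 for a, n1, c, n0 in studies]  # log-RR variance

    w = 1 / np.array(var)
    pooled = np.sum(w * log_rr) / w.sum()
    se = np.sqrt(1 / w.sum())
    print(np.exp(pooled),                                 # pooled risk ratio
          np.exp(pooled + np.array([-1.96, 1.96]) * se),  # 95% CI
          2 * stats.norm.sf(abs(pooled / se)))            # two-sided P
    ```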

  14. Algorithm for Identifying Erroneous Rain-Gauge Readings

    NASA Technical Reports Server (NTRS)

    Rickman, Doug

    2005-01-01

    An algorithm analyzes rain-gauge data to identify statistical outliers that could be deemed to be erroneous readings. Heretofore, analyses of this type have been performed in burdensome manual procedures that have involved subjective judgements. Sometimes, the analyses have included computational assistance for detecting values falling outside of arbitrary limits. The analyses have been performed without statistically valid knowledge of the spatial and temporal variations of precipitation within rain events. In contrast, the present algorithm makes it possible to automate such an analysis, makes the analysis objective, takes account of the spatial distribution of rain gauges in conjunction with the statistical nature of spatial variations in rainfall readings, and minimizes the use of arbitrary criteria. The algorithm implements an iterative process that involves nonparametric statistics.
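
    The abstract does not give the algorithm's internals, so the following is only a loosely analogous sketch of a nonparametric, spatially aware outlier check: compare each gauge with the robust centre and spread of its nearest neighbours, and flag readings that deviate by a large robust margin.

    ```python
    import numpy as np

    def flag_outliers(readings, coords, k=5, thresh=5.0):
        """readings: (n,) rain totals; coords: (n, 2) gauge positions (hypothetical)."""
        readings = np.asarray(readings, float)
        coords = np.asarray(coords, float)
        flags = np.zeros(len(readings), dtype=bool)
        for i in range(len(readings)):
            d = np.linalg.norm(coords - coords[i], axis=1)
            nbrs = readings[np.argsort(d)[1:k + 1]]      # k nearest gauges
            med = np.median(nbrs)
            mad = np.median(np.abs(nbrs - med)) or 1e-9  # robust spread
            flags[i] = abs(readings[i] - med) / (1.4826 * mad) > thresh
        return flags
    ```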

  15. Citation of previous meta-analyses on the same topic: a clue to perpetuation of incorrect methods?

    PubMed

    Li, Tianjing; Dickersin, Kay

    2013-06-01

    Systematic reviews and meta-analyses serve as a basis for decision-making and clinical practice guidelines and should be carried out using appropriate methodology to avoid incorrect inferences. We describe the characteristics, statistical methods used for meta-analyses, and citation patterns of all 21 glaucoma systematic reviews we identified pertaining to the effectiveness of prostaglandin analog eye drops in treating primary open-angle glaucoma, published between December 2000 and February 2012. We abstracted data, assessed whether appropriate statistical methods were applied in meta-analyses, and examined citation patterns of included reviews. We identified two forms of problematic statistical analyses in 9 of the 21 systematic reviews examined. Except in 1 case, none of the 9 reviews that used incorrect statistical methods cited a previously published review that used appropriate methods. Reviews that used incorrect methods were cited 2.6 times more often than reviews that used appropriate statistical methods. We speculate that by emulating the statistical methodology of previous systematic reviews, systematic review authors may have perpetuated incorrect approaches to meta-analysis. The use of incorrect statistical methods, perhaps through emulating methods described in previous research, calls conclusions of systematic reviews into question and may lead to inappropriate patient care. We urge systematic review authors and journal editors to seek the advice of experienced statisticians before undertaking or accepting for publication a systematic review and meta-analysis. The author(s) have no proprietary or commercial interest in any materials discussed in this article. Copyright © 2013 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.

  16. Nonuniform High-Gamma (60–500 Hz) Power Changes Dissociate Cognitive Task and Anatomy in Human Cortex

    PubMed Central

    Gaona, Charles M.; Sharma, Mohit; Freudenburg, Zachary V.; Breshears, Jonathan D.; Bundy, David T.; Roland, Jarod; Barbour, Dennis L.; Schalk, Gerwin

    2011-01-01

    High-gamma-band (>60 Hz) power changes in cortical electrophysiology are a reliable indicator of focal, event-related cortical activity. Despite discoveries of oscillatory subthreshold and synchronous suprathreshold activity at the cellular level, there is an increasingly popular view that high-gamma-band amplitude changes recorded from cellular ensembles are the result of asynchronous firing activity that yields wideband and uniform power increases. Others have demonstrated independence of power changes in the low- and high-gamma bands, but to date, no studies have shown evidence of any such independence above 60 Hz. Based on nonuniformities in time-frequency analyses of electrocorticographic (ECoG) signals, we hypothesized that induced high-gamma-band (60–500 Hz) power changes are more heterogeneous than currently understood. Using single-word repetition tasks in six human subjects, we showed that functional responsiveness of different ECoG high-gamma sub-bands can discriminate cognitive task (e.g., hearing, reading, speaking) and cortical locations. Power changes in these sub-bands of the high-gamma range are consistently present within single trials and have statistically different time courses within the trial structure. Moreover, when consolidated across all subjects within three task-relevant anatomic regions (sensorimotor, Broca's area, and superior temporal gyrus), these behavior- and location-dependent power changes evidenced nonuniform trends across the population. Together, the independence and nonuniformity of power changes across a broad range of frequencies suggest that a new approach to evaluating high-gamma-band cortical activity is necessary. These findings show that in addition to time and location, frequency is another fundamental dimension of high-gamma dynamics. PMID:21307246
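
    A generic sketch of the sub-band decomposition described above: estimate the PSD per trial and integrate power within each high-gamma sub-band (the band edges and sampling rate are assumptions, not the paper's exact pipeline).

    ```python
    import numpy as np
    from scipy.signal import welch

    FS = 1200.0                              # ECoG sampling rate (assumed)
    SUB_BANDS = [(60, 120), (120, 250), (250, 500)]

    def band_powers(trial):
        f, psd = welch(trial, fs=FS, nperseg=256)
        return [np.trapz(psd[(f >= lo) & (f < hi)], f[(f >= lo) & (f < hi)])
                for lo, hi in SUB_BANDS]

    trial = np.random.default_rng(3).normal(size=int(FS))  # 1 s of dummy signal
    print(band_powers(trial))
    ```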

  17. Alignment-free sequence comparison (II): theoretical power of comparison statistics.

    PubMed

    Wan, Lin; Reinert, Gesine; Sun, Fengzhu; Waterman, Michael S

    2010-11-01

    Rapid methods for alignment-free sequence comparison make large-scale comparisons between sequences increasingly feasible. Here we study the power of the statistic D2, which counts the number of matching k-tuples between two sequences, as well as D2*, which uses centralized counts, and D2S, which is a self-standardized version, both from a theoretical viewpoint and numerically, providing an easy-to-use program. The power is assessed under two alternative hidden Markov models; the first one assumes that the two sequences share a common motif, whereas the second model is a pattern transfer model; the null model is that the two sequences are composed of independent and identically distributed letters and they are independent. Under the first alternative model, the means of the tuple counts in the individual sequences change, whereas under the second alternative model, the marginal means are the same as under the null model. Using the limit distributions of the count statistics under the null and the alternative models, we find that generally, asymptotically D2S has the largest power, followed by D2*, whereas the power of D2 can even be zero in some cases. In contrast, even for sequences of length 140,000 bp, in simulations D2* generally has the largest power. Under the first alternative model of a shared motif, the power of D2* approaches 100% when sufficiently many motifs are shared, and we recommend the use of D2* for such practical applications. Under the second alternative model of pattern transfer, the power for all three count statistics does not increase with sequence length when the sequence is sufficiently long, and hence none of the three statistics under consideration can be recommended in such a situation. We illustrate the approach on 323 transcription factor binding motifs with length at most 10 from JASPAR CORE (October 12, 2009 version), verifying that D2* is generally more powerful than D2. The program to calculate the power of D2, D2* and D2S can be downloaded from http://meta.cmb.usc.edu/d2. Supplementary Material is available at www.liebertonline.com/cmb.
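
    Since D2 counts matching k-tuples, it is exactly the inner product of the two sequences' k-mer count vectors, as in this minimal sketch:

    ```python
    from collections import Counter

    def d2(seq_a, seq_b, k=5):
        # count every k-mer in each sequence, then take the inner product
        counts_a = Counter(seq_a[i:i + k] for i in range(len(seq_a) - k + 1))
        counts_b = Counter(seq_b[i:i + k] for i in range(len(seq_b) - k + 1))
        return sum(n * counts_b[w] for w, n in counts_a.items())

    print(d2("ACGTACGTGGA", "TTACGTACGAA", k=4))
    ```

    D2* and D2S then centralize and standardize these counts under the null letter distribution, which is what gives them their better asymptotic power.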

  18. Multi-arm group sequential designs with a simultaneous stopping rule.

    PubMed

    Urach, S; Posch, M

    2016-12-30

    Multi-arm group sequential clinical trials are efficient designs to compare multiple treatments to a control. They allow one to test for treatment effects already in interim analyses and can have a lower average sample number than fixed sample designs. Their operating characteristics depend on the stopping rule: We consider simultaneous stopping, where the whole trial is stopped as soon as for any of the arms the null hypothesis of no treatment effect can be rejected, and separate stopping, where only recruitment to arms for which a significant treatment effect could be demonstrated is stopped, but the other arms are continued. For both stopping rules, the family-wise error rate can be controlled by the closed testing procedure applied to group sequential tests of intersection and elementary hypotheses. The group sequential boundaries for the separate stopping rule also control the family-wise error rate if the simultaneous stopping rule is applied. However, we show that for the simultaneous stopping rule, one can apply improved, less conservative stopping boundaries for local tests of elementary hypotheses. We derive corresponding improved Pocock and O'Brien type boundaries as well as optimized boundaries to maximize the power or average sample number and investigate the operating characteristics and small sample properties of the resulting designs. To control the power to reject at least one null hypothesis, the simultaneous stopping rule requires a lower average sample number than the separate stopping rule. This comes at the cost of a lower power to reject all null hypotheses. Some of this loss in power can be regained by applying the improved stopping boundaries for the simultaneous stopping rule. The procedures are illustrated with clinical trials in systemic sclerosis and narcolepsy. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
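
    For intuition about such boundaries, here is a Monte Carlo sketch of a Pocock-type constant boundary (illustrative, not the authors' improved boundaries): find the constant c such that the chance of |Z| crossing c at any of K equally informative looks is alpha under the null.

    ```python
    import numpy as np

    def pocock_constant(K=3, alpha=0.05, sims=200_000, seed=4):
        rng = np.random.default_rng(seed)
        incr = rng.normal(size=(sims, K))                     # independent increments
        z = np.cumsum(incr, axis=1) / np.sqrt(np.arange(1, K + 1))
        return np.quantile(np.abs(z).max(axis=1), 1 - alpha)  # max |Z_k| over looks

    print(pocock_constant())  # about 2.29 for K=3, alpha=0.05 (vs 1.96 fixed-sample)
    ```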

  19. Statistical detection of EEG synchrony using empirical bayesian inference.

    PubMed

    Singh, Archana K; Asoh, Hideki; Takeda, Yuji; Phillips, Steven

    2015-01-01

    There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit the complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimensions. Previously, we showed that hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR, and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that locFDR can effectively control false positives without compromising the power of PLV synchrony inference. Applying locFDR to the experimental data detected more significant discoveries than our previously proposed methods, whereas the standard FDR method failed to detect any significant discoveries.
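
    The PLV itself is standard: the magnitude of the trial-averaged phase difference between two narrow-band signals. A generic sketch (not the authors' pipeline):

    ```python
    import numpy as np
    from scipy.signal import hilbert

    def plv(x, y):
        """x, y: (n_trials, n_samples) band-pass-filtered signals."""
        dphi = np.angle(hilbert(x, axis=1)) - np.angle(hilbert(y, axis=1))
        return np.abs(np.mean(np.exp(1j * dphi), axis=0))  # one PLV per time sample
    ```

    Each channel pair, frequency band, and time point yields one such value, which is exactly the high-dimensional multiple-comparison problem that locFDR is proposed to control.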

  20. Assessing exclusionary power of a paternity test involving a pair of alleged grandparents.

    PubMed

    Scarpetta, Marco A; Staub, Rick W; Einum, David D

    2007-02-01

    The power of a genetic test battery to exclude a pair of individuals as grandparents is an important consideration for parentage testing laboratories. However, a reliable method to calculate such a statistic with short tandem repeat (STR) genetic markers has not been presented. Two formulae describing the random grandparents not excluded (RGPNE) statistic at a single genetic locus were derived: RGPNE = a(4 - 6a + 4a² - a³) when the paternal obligate allele (POA) is defined, and RGPNE = 2[(a + b)(2 - a - b)][1 - (a + b)(2 - a - b)] + [(a + b)(2 - a - b)] when the POA is ambiguous. A minimum number of genetic markers required to yield cumulative RGPNE values of not greater than 0.01 was calculated with weighted average allele frequencies of the CODIS STR loci. RGPNE data for actual grandparentage cases are also presented to empirically examine the exclusionary power of routine casework. A comparison of RGPNE and random man not excluded (RMNE) values demonstrates the increased difficulty involved in excluding two individuals as grandparents compared to excluding a single alleged parent. A minimum of 12 STR markers is necessary to achieve RGPNE values of not greater than 0.01 when the mother is tested; more than 25 markers are required without the mother. Cumulative RGPNE values for each of 22 nonexclusionary grandparentage cases were not more than 0.01 but were significantly weaker when calculated without data from the mother. Calculation of the RGPNE provides a simple means to help minimize the potential of false inclusions in grandparentage analyses. This study also underscores the importance of testing the mother when examining the parents of an unavailable alleged father (AF).
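
    The defined-POA formula quoted above transcribes directly into code; cumulative values over independent loci multiply, which is why adding markers drives the RGPNE down (the frequencies here are hypothetical):

    ```python
    def rgpne_defined_poa(a):
        """Single-locus RGPNE when the paternal obligate allele has frequency a."""
        return a * (4 - 6 * a + 4 * a**2 - a**3)

    cum = 1.0
    for a in [0.12, 0.08, 0.20, 0.15]:   # hypothetical POA frequencies
        cum *= rgpne_defined_poa(a)
    print(cum)
    ```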

  1. Multi-Decadal analysis of Global Trends in Microseism Intensity: A Proxy for Changes in Extremal Storm Activity and Oceanic Wave State

    NASA Astrophysics Data System (ADS)

    Anthony, R. E.; Aster, R. C.; Rowe, C. A.

    2016-12-01

    The Earth's seismic noise spectrum features two globally ubiquitous peaks near 8 and 16 s periods (secondary and primary bands) that arise when storm-generated ocean gravity waves are converted to seismic energy, predominantly into Rayleigh waves. Because of its regionally integrative nature, microseism intensity and other seismographic data from long-running sites can provide useful proxies for wave state. Expanding an earlier study of global microseism trends (Aster et al., 2010), we analyze digitally archived, up-to-date (through late 2016) multi-decadal seismic data from stations of global seismographic networks to characterize the spatiotemporal evolution of wave climate over the past >20 years. The IRIS Noise Tool Kit (Bahavar et al., 2013) is used to produce ground motion power spectral density (PSD) estimates in 3-hour overlapping time series segments. The result of this effort is a longer duration and more broadly geographically distributed PSD database than attained in previous studies, particularly for the primary microseism band. Integrating power within the primary and secondary microseism bands enables regional characterization of spatially integrated trends in wave states and storm event statistics of varying thresholds. The results of these analyses are then interpreted within the context of recognized modes of atmospheric variability, including the particularly strong 2015-2016 El Niño. We note a number of statistically significant increasing trends in both raw microseism power and storm activity occurring at multiple stations in the Northwest Atlantic and Southeast Pacific, consistent with generally increased wave heights and storminess in these regions. Such trends in wave activity have the potential to significantly influence coastal environments, particularly under rising global sea levels.
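
    The processing chain described here can be sketched generically: PSD estimates in overlapping 3-hour segments, then power integrated over a microseism band. This is plain SciPy, not the IRIS Noise Tool Kit; the 1 Hz sample rate and band edges are assumptions.

    ```python
    import numpy as np
    from scipy.signal import welch

    FS = 1.0                               # samples per second (assumed)
    WIN = int(3 * 3600 * FS)               # 3-hour window
    HOP = WIN // 2                         # 50% overlap

    def band_power_series(trace, f_lo=0.1, f_hi=0.25):  # ~secondary microseism band
        out = []
        for start in range(0, len(trace) - WIN + 1, HOP):
            f, psd = welch(trace[start:start + WIN], fs=FS, nperseg=4096)
            sel = (f >= f_lo) & (f <= f_hi)
            out.append(np.trapz(psd[sel], f[sel]))
        return np.array(out)               # one integrated power value per segment
    ```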

  2. A multi-site comparison of in vivo safety pharmacology studies conducted to support ICH S7A & B regulatory submissions.

    PubMed

    Ewart, Lorna; Milne, Aileen; Adkins, Debbie; Benjamin, Amanda; Bialecki, Russell; Chen, Yafei; Ericsson, Ann-Christin; Gardner, Stacey; Grant, Claire; Lengel, David; Lindgren, Silvana; Lowing, Sarah; Marks, Louise; Moors, Jackie; Oldman, Karen; Pietras, Mark; Prior, Helen; Punton, James; Redfern, Will S; Salmond, Ross; Skinner, Matt; Some, Margareta; Stanton, Andrea; Swedberg, Michael; Finch, John; Valentin, Jean-Pierre

    2013-01-01

    Parts A and B of the ICH S7 guidelines on safety pharmacology describe the in vivo studies that must be conducted prior to first time in man administration of any new pharmaceutical. ICH S7A requires a consideration of the sensitivity and reproducibility of the test systems used. This could encompass maintaining a dataset of historical pre-dose values, power analyses, as well as a demonstration of acceptable model sensitivity and robust pharmacological validation. During the process of outsourcing safety pharmacology studies to Charles River Laboratories, AstraZeneca set out to ensure that models were performed identically in each facility and saw this as an opportunity to review the inter-laboratory variability of these essential models. The five in vivo studies outsourced were the conscious dog telemetry model for cardiovascular assessment, the rat whole body plethysmography model for respiratory assessment, the rat modified Irwin screen for central nervous system assessment, the rat charcoal meal study for gastrointestinal assessment and the rat metabolic cage study for assessment of renal function. Each study was validated with known reference compounds and data were compared across facilities. Statistical power was also calculated for each model. The results obtained indicated that each of the studies could be performed with comparable statistical power and could achieve a similar outcome, independent of facility. The consistency of results obtained from these models across multiple facilities was high thus providing confidence that the models can be run in different facilities and maintain compliance with ICH S7A and B. Copyright © 2013 Elsevier Inc. All rights reserved.

  3. A power set-based statistical selection procedure to locate susceptible rare variants associated with complex traits with sequencing data.

    PubMed

    Sun, Hokeun; Wang, Shuang

    2014-08-15

    Existing association methods for rare variants from sequencing data have focused on aggregating variants in a gene or a genetic region because of the fact that analysing individual rare variants is underpowered. However, these existing rare variant detection methods are not able to identify which rare variants in a gene or a genetic region of all variants are associated with the complex diseases or traits. Once phenotypic associations of a gene or a genetic region are identified, the natural next step in the association study with sequencing data is to locate the susceptible rare variants within the gene or the genetic region. In this article, we propose a power set-based statistical selection procedure that is able to identify the locations of the potentially susceptible rare variants within a disease-related gene or a genetic region. The selection performance of the proposed selection procedure was evaluated through simulation studies, where we demonstrated the feasibility and superior power over several comparable existing methods. In particular, the proposed method is able to handle the mixed effects when both risk and protective variants are present in a gene or a genetic region. The proposed selection procedure was also applied to the sequence data on the ANGPTL gene family from the Dallas Heart Study to identify potentially susceptible rare variants within the trait-related genes. An R package 'rvsel' can be downloaded from http://www.columbia.edu/∼sw2206/ and http://statsun.pusan.ac.kr. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
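
    A toy sketch of a power-set style search (brute force, feasible only for a handful of variants; the published procedure adds statistical selection criteria on top of this idea): score every subset of variants in the region by a collapsed carrier-versus-case association and keep the best.

    ```python
    from itertools import combinations
    import numpy as np
    from scipy.stats import chi2_contingency

    def best_subset(G, case, max_size=4):
        """G: (n_subjects, n_variants) 0/1 carrier matrix; case: 0/1 phenotype."""
        best, best_p = None, 1.0
        for r in range(1, max_size + 1):
            for idx in combinations(range(G.shape[1]), r):
                carrier = G[:, list(idx)].any(axis=1).astype(int)
                table = np.array([[np.sum((carrier == i) & (case == j))
                                   for j in (0, 1)] for i in (0, 1)])
                if (table == 0).any():
                    continue                     # skip degenerate tables
                p = chi2_contingency(table)[1]
                if p < best_p:
                    best, best_p = idx, p
        return best, best_p
    ```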

  4. Reporting quality of statistical methods in surgical observational studies: protocol for systematic review.

    PubMed

    Wu, Robert; Glen, Peter; Ramsay, Tim; Martel, Guillaume

    2014-06-28

    Observational studies dominate the surgical literature. Statistical adjustment is an important strategy to account for confounders in observational studies. Research has shown that published articles are often poor in statistical quality, which may jeopardize their conclusions. The Statistical Analyses and Methods in the Published Literature (SAMPL) guidelines have been published to help establish standards for statistical reporting. This study will seek to determine whether the quality of statistical adjustment and the reporting of these methods are adequate in surgical observational studies. We hypothesize that incomplete reporting will be found in all surgical observational studies, and that the quality and reporting of these methods will be of lower quality in surgical journals when compared with medical journals. Finally, this work will seek to identify predictors of high-quality reporting. This work will examine the top five general surgical and medical journals, based on a 5-year impact factor (2007-2012). All observational studies investigating an intervention related to an essential component area of general surgery (defined by the American Board of Surgery), with an exposure, outcome, and comparator, will be included in this systematic review. Essential elements related to statistical reporting and quality were extracted from the SAMPL guidelines and include domains such as intent of analysis, primary analysis, multiple comparisons, numbers and descriptive statistics, association and correlation analyses, linear regression, logistic regression, Cox proportional hazard analysis, analysis of variance, survival analysis, propensity analysis, and independent and correlated analyses. Each article will be scored as a proportion based on fulfilling criteria in relevant analyses used in the study. A logistic regression model will be built to identify variables associated with high-quality reporting. A comparison will be made between the scores of surgical observational studies published in medical versus surgical journals. Secondary outcomes will pertain to individual domains of analysis. Sensitivity analyses will be conducted. This study will explore the reporting and quality of statistical analyses in surgical observational studies published in the most referenced surgical and medical journals in 2013 and examine whether variables (including the type of journal) can predict high-quality reporting.

  5. Drying method has no substantial effect on δ¹⁵N or δ¹³C values of muscle tissue from teleost fishes.

    PubMed

    Bessey, Cindy; Vanderklift, Mathew A

    2014-02-15

    Stable isotope analysis (SIA) is a powerful tool in many fields of research that enables quantitative comparisons among studies, if similar methods have been used. The goal of this study was to determine if three different drying methods commonly used to prepare samples for SIA yielded different δ¹⁵N and δ¹³C values. Muscle subsamples from 10 individuals each of three teleost species were dried using three methods: (i) oven, (ii) food dehydrator, and (iii) freeze-dryer. All subsamples were analysed for δ¹⁵N and δ¹³C values, and nitrogen and carbon content, using a continuous flow system consisting of a Delta V Plus mass spectrometer and a Flash 1112 elemental analyser via a Conflo IV universal interface. The δ¹³C values were normalized to constant lipid content using the equations proposed by McConnaughey and McRoy. Although statistically significant, the differences in δ¹⁵N values between the drying methods were small (mean differences ≤0.21‰). The differences in δ¹³C values between the drying methods were not statistically significant, and normalising the δ¹³C values to constant lipid content reduced the mean differences for all treatments to ≤0.65‰. A statistically significant difference of ~2% in C content existed between tissues dried in a food dehydrator and those dried in a freeze-dryer for two fish species. There was no significant effect of fish size on the differences between methods. No substantial effect of drying method was found on the δ¹⁵N or δ¹³C values of teleost muscle tissue. Copyright © 2013 John Wiley & Sons, Ltd.

  6. The added value of ordinal analysis in clinical trials: an example in traumatic brain injury.

    PubMed

    Roozenbeek, Bob; Lingsma, Hester F; Perel, Pablo; Edwards, Phil; Roberts, Ian; Murray, Gordon D; Maas, Andrew Ir; Steyerberg, Ewout W

    2011-01-01

    In clinical trials, ordinal outcome measures are often dichotomized into two categories. In traumatic brain injury (TBI) the 5-point Glasgow outcome scale (GOS) is collapsed into unfavourable versus favourable outcome. Simulation studies have shown that exploiting the ordinal nature of the GOS increases chances of detecting treatment effects. The objective of this study is to quantify the benefits of ordinal analysis in the real-life situation of a large TBI trial. We used data from the CRASH trial that investigated the efficacy of corticosteroids in TBI patients (n = 9,554). We applied two techniques for ordinal analysis: proportional odds analysis and the sliding dichotomy approach, where the GOS is dichotomized at different cut-offs according to baseline prognostic risk. These approaches were compared to dichotomous analysis. The information density in each analysis was indicated by a Wald statistic. All analyses were adjusted for baseline characteristics. Dichotomous analysis of the six-month GOS showed a non-significant treatment effect (OR = 1.09, 95% CI 0.98 to 1.21, P = 0.096). Ordinal analysis with proportional odds regression or sliding dichotomy showed highly statistically significant treatment effects (OR 1.15, 95% CI 1.06 to 1.25, P = 0.0007 and 1.19, 95% CI 1.08 to 1.30, P = 0.0002), with 2.05-fold and 2.56-fold higher information density compared to the dichotomous approach respectively. Analysis of the CRASH trial data confirmed that ordinal analysis of outcome substantially increases statistical power. We expect these results to hold for other fields of critical care medicine that use ordinal outcome measures and recommend that future trials adopt ordinal analyses. This will permit detection of smaller treatment effects.
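
    A minimal proportional-odds sketch with statsmodels' OrderedModel, on simulated data (not CRASH data); the exponentiated treatment coefficient is the common odds ratio that the ordinal analysis estimates across all GOS cut-points.

    ```python
    import numpy as np
    from statsmodels.miscmodels.ordinal_model import OrderedModel

    rng = np.random.default_rng(5)
    n = 1000
    treat = rng.integers(0, 2, n)
    latent = 0.2 * treat + rng.logistic(size=n)
    gos = np.digitize(latent, bins=[-1.0, 0.0, 1.0, 2.0])   # 5 ordered categories

    fit = OrderedModel(gos, treat[:, None], distr="logit").fit(method="bfgs", disp=False)
    print(np.exp(fit.params[0]))   # common (proportional) odds ratio for treatment
    ```

    The sliding dichotomy instead keeps a binary analysis but moves the favourable/unfavourable cut-point per patient according to baseline prognostic risk, which is another way of recovering information lost to a single fixed split.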

  7. Exploring the statistical and clinical impact of two interim analyses on the Phase II design with option for direct assignment.

    PubMed

    An, Ming-Wen; Mandrekar, Sumithra J; Edelman, Martin J; Sargent, Daniel J

    2014-07-01

    The primary goal of Phase II clinical trials is to understand better a treatment's safety and efficacy to inform a Phase III go/no-go decision. Many Phase II designs have been proposed, incorporating randomization, interim analyses, adaptation, and patient selection. The Phase II design with an option for direct assignment (i.e. stop randomization and assign all patients to the experimental arm based on a single interim analysis (IA) at 50% accrual) was recently proposed [An et al., 2012]. We discuss this design in the context of existing designs, and extend it from a single-IA to a two-IA design. We compared the statistical properties and clinical relevance of the direct assignment design with two IA (DAD-2) versus a balanced randomized design with two IA (BRD-2) and a direct assignment design with one IA (DAD-1), over a range of response rate ratios (2.0-3.0). The DAD-2 has minimal loss in power (<2.2%) and minimal increase in type I error rate (<1.6%) compared to a BRD-2. As many as 80% more patients were treated with experimental vs. control in the DAD-2 than with the BRD-2 (experimental vs. control ratio: 1.8 vs. 1.0), and as many as 64% more in the DAD-2 than with the DAD-1 (1.8 vs. 1.1). We illustrate the DAD-2 using a case study in lung cancer. In the spectrum of Phase II designs, the direct assignment design, especially with two IA, provides a middle ground with desirable statistical properties and likely appeal to both clinicians and patients. Copyright © 2014 Elsevier Inc. All rights reserved.

  8. Got power? A systematic review of sample size adequacy in health professions education research.

    PubMed

    Cook, David A; Hatala, Rose

    2015-03-01

    Many education research studies employ small samples, which in turn lowers statistical power. We re-analyzed the results of a meta-analysis of simulation-based education to determine study power across a range of effect sizes, and the smallest effect that could be plausibly excluded. We systematically searched multiple databases through May 2011, and included all studies evaluating simulation-based education for health professionals in comparison with no intervention or another simulation intervention. Reviewers working in duplicate abstracted information to calculate standardized mean differences (SMDs). We included 897 original research studies. Among the 627 no-intervention-comparison studies the median sample size was 25. Only two studies (0.3%) had ≥80% power to detect a small difference (SMD > 0.2 standard deviations) and 136 (22%) had power to detect a large difference (SMD > 0.8). 110 no-intervention-comparison studies failed to find a statistically significant difference, but none excluded a small difference and only 47 (43%) excluded a large difference. Among 297 studies comparing alternate simulation approaches the median sample size was 30. Only one study (0.3%) had ≥80% power to detect a small difference and 79 (27%) had power to detect a large difference. Of the 128 studies that did not detect a statistically significant effect, 4 (3%) excluded a small difference and 91 (71%) excluded a large difference. In conclusion, most education research studies are powered only to detect effects of large magnitude. For most studies that do not reach statistical significance, the possibility of large and important differences still exists.
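
    The audit's headline numbers are easy to reproduce in spirit: with the median of 25 participants per group, the power for small, medium, and large standardized differences (alpha = 0.05, two-sided) comes out as follows.

    ```python
    from statsmodels.stats.power import TTestIndPower

    solver = TTestIndPower()
    for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
        pw = solver.power(effect_size=d, nobs1=25, ratio=1.0, alpha=0.05)
        print(f"{label} (d = {d}): power = {pw:.2f}")
    ```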

  9. Statistical analyses of commercial vehicle accident factors. Volume 1 Part 1

    DOT National Transportation Integrated Search

    1978-02-01

    Procedures for conducting statistical analyses of commercial vehicle accidents have been established and initially applied. A file of some 3,000 California Highway Patrol accident reports from two areas of California during a period of about one year...

  10. 40 CFR 90.712 - Request for public hearing.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... sampling plans and statistical analyses have been properly applied (specifically, whether sampling procedures and statistical analyses specified in this subpart were followed and whether there exists a basis... Clerk and will be made available to the public during Agency business hours. ...

  11. Gene-Based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions.

    PubMed

    Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y; Chen, Wei

    2016-02-01

    Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, here we develop Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT), which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than or similar power as Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in the whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example. © 2016 WILEY PERIODICALS, INC.

  12. Gene-based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions

    PubMed Central

    Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E.; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y.; Chen, Wei

    2015-01-01

    Summary Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, we develop here Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT) which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than or similar power as Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in the whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example. PMID:26782979

  13. The statistical overlap theory of chromatography using power law (fractal) statistics.

    PubMed

    Schure, Mark R; Davis, Joe M

    2011-12-30

    The chromatographic dimensionality was recently proposed as a measure of retention time spacing based on a power law (fractal) distribution. Using this model, a statistical overlap theory (SOT) for chromatographic peaks is developed that estimates the number of peak maxima as a function of the chromatographic dimension, saturation and scale. Power law models exhibit a threshold region whereby below a critical saturation value no loss of peak maxima due to peak fusion occurs as saturation increases. At moderate saturation, behavior is similar to the random (Poisson) peak model. At still higher saturation, the power law model shows loss of peaks nearly independent of the scale and dimension of the model. The physicochemical meaning of the power law scale parameter is discussed and shown to be equal to the Boltzmann-weighted free energy of transfer over the scale limits. A small scale range (small β) is shown to generate more uniform chromatograms. Large scale range chromatograms (large β) are shown to give occasional large excursions of retention times; this is a property of power laws, where "wild" behavior is noted to occasionally occur. Both cases are shown to be useful depending on the chromatographic saturation. A scale-invariant model of the SOT shows very simple relationships between the fraction of peak maxima and the saturation, peak width and number of theoretical plates. These equations provide much insight into separations which follow power law statistics. Copyright © 2011 Elsevier B.V. All rights reserved.
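
    A neutral Monte Carlo sketch of the peak-overlap bookkeeping behind SOT (not the paper's derivation): place m component peaks along the separation axis and count how many maxima survive when neighbours closer than the resolution width x0 fuse; swapping exponential gaps for heavy-tailed gaps mimics the power-law spacing model.

    ```python
    import numpy as np

    def observed_maxima(gaps, x0):
        # m peaks separated by m-1 gaps; adjacent peaks closer than x0 fuse
        return 1 + int(np.sum(np.asarray(gaps) >= x0))

    rng = np.random.default_rng(6)
    m, x0 = 200, 1.0
    poisson_gaps = rng.exponential(scale=2.0, size=m - 1)        # random peak model
    powerlaw_gaps = 0.5 * (rng.pareto(a=1.5, size=m - 1) + 1.0)  # heavy-tailed model
    print(observed_maxima(poisson_gaps, x0), observed_maxima(powerlaw_gaps, x0))
    ```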

  14. Genetic polymorphisms in 18 autosomal STR loci in the Tibetan population living in Tibet Chamdo, Southwest China.

    PubMed

    Li, Zhenghui; Zhang, Jian; Zhang, Hantao; Lin, Ziqing; Ye, Jian

    2018-05-01

    Short tandem repeats (STRs) play a vitally important role in forensics, and population data are needed to support their application. There is currently no large population-based STR dataset for the Tibetan population of Chamdo. In our study, the allele frequencies and forensic statistical parameters of 18 autosomal STR loci (D5S818, D21S11, D7S820, CSF1PO, D2S1338, D3S1358, VWA, D8S1179, D16S539, PentaE, TPOX, TH01, D19S433, D18S51, FGA, D6S1043, D13S317, and D12S391) included in the DNATyper™19 kit were investigated in 2,249 healthy, unrelated Tibetan subjects living in Tibet Chamdo, Southwest China. The combined power of discrimination and the combined probability of exclusion of all 18 loci were 0.9999999999999999999998174 and 0.99999994704, respectively. Furthermore, the genetic relationship between our Tibetan group and 33 previously published populations was also investigated. Phylogenetic analyses revealed that the Chamdo Tibetan population is most closely related genetically to the Lhasa Tibetan group. Our results suggest that these autosomal STR loci are highly polymorphic in the Tibetan population living in Tibet Chamdo and can be used as a powerful tool in forensics, linguistics, and population genetic analyses.
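
    The two combined statistics reported here follow from standard per-locus formulas: the match probability PM is the sum of squared genotype frequencies, the power of discrimination is PD = 1 - PM, and independent loci combine multiplicatively (the frequencies below are made up for illustration).

    ```python
    import numpy as np

    def pm_pd(genotype_freqs):
        pm = float(np.sum(np.square(genotype_freqs)))
        return pm, 1.0 - pm            # match probability, power of discrimination

    loci = [np.array([0.30, 0.25, 0.20, 0.15, 0.10]),
            np.array([0.40, 0.35, 0.25])]
    combined_pd = 1.0 - np.prod([pm_pd(g)[0] for g in loci])
    print(combined_pd)
    ```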

  15. A model-based test for treatment effects with probabilistic classifications.

    PubMed

    Cavagnaro, Daniel R; Davis-Stober, Clintin P

    2018-05-21

    Within modern psychology, computational and statistical models play an important role in describing a wide variety of human behavior. Model selection analyses are typically used to classify individuals according to the model(s) that best describe their behavior. These classifications are inherently probabilistic, which presents challenges for performing group-level analyses, such as quantifying the effect of an experimental manipulation. We answer this challenge by presenting a method for quantifying treatment effects in terms of distributional changes in model-based (i.e., probabilistic) classifications across treatment conditions. The method uses hierarchical Bayesian mixture modeling to incorporate classification uncertainty at the individual level into the test for a treatment effect at the group level. We illustrate the method with several worked examples, including a reanalysis of the data from Kellen, Mata, and Davis-Stober (2017), and analyze its performance more generally through simulation studies. Our simulations show that the method is both more powerful and less prone to type-1 errors than Fisher's exact test when classifications are uncertain. In the special case where classifications are deterministic, we find a near-perfect power-law relationship between the Bayes factor, derived from our method, and the p value obtained from Fisher's exact test. We provide code in an online supplement that allows researchers to apply the method to their own data. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
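
    The deterministic-classification baseline the authors benchmark against is just Fisher's exact test on the condition-by-model cross-tabulation (counts here are illustrative):

    ```python
    from scipy.stats import fisher_exact

    #              model A  model B
    table = [[18, 7],        # condition 1
             [9, 16]]        # condition 2
    odds_ratio, p = fisher_exact(table)
    print(odds_ratio, p)
    ```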

  16. Design and Implementation of the International Genetics and Translational Research in Transplantation Network.

    PubMed

    2015-11-01

    Genetic association studies of transplantation outcomes have been hampered by small samples and highly complex multifactorial phenotypes, hindering investigations of the genetic architecture of a range of comorbidities that significantly impact graft and recipient life expectancy. We describe here the rationale and design of the International Genetics & Translational Research in Transplantation Network. The network comprises 22 studies to date, including 16494 transplant recipients and 11669 donors, of whom more than 5000 are of non-European ancestry, all with existing genomewide genotype data sets. We describe the rich genetic and phenotypic information available in this consortium, which comprises heart, kidney, liver, and lung transplant cohorts. We demonstrate that the network has significant power to detect main-effect association signals, both in regions such as the MHC and genomewide, for transplant outcomes that span all solid organs, such as graft survival, acute rejection, and new-onset diabetes after transplantation, as well as for delayed graft function in kidney alone. This consortium is designed and statistically powered to deliver pioneering insights into the genetic architecture of transplant-related outcomes across a range of different solid-organ transplant studies. The study design allows a spectrum of analyses to be performed, including recipient-only analyses and donor-recipient HLA mismatch analyses, with a focus on loss-of-function variants and nonsynonymous single nucleotide polymorphisms.
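
    The consortium's own power calculations are not given in the abstract. As a rough indication of what genomewide power means at this sample size, the sketch below computes the approximate power of a 1-df per-allele association test via the usual non-central chi-square approximation, taking the variance of the log odds ratio as roughly 1/(N·2p(1−p)·φ(1−φ)) for allele frequency p and case fraction φ. The recipient count matches the abstract; the allele frequency, odds ratios, and case fraction are assumptions for illustration only.

        import numpy as np
        from scipy.stats import chi2, ncx2

        def gwas_power(n, maf, or_per_allele, case_frac=0.5, alpha=5e-8):
            """Approximate power of a 1-df per-allele (trend) test using a
            standard non-central chi-square approximation."""
            b = np.log(or_per_allele)
            ncp = n * 2 * maf * (1 - maf) * case_frac * (1 - case_frac) * b ** 2
            crit = chi2.ppf(1 - alpha, df=1)
            return ncx2.sf(crit, df=1, nc=ncp)

        # Illustrative inputs only -- not the consortium's calculations.
        for odds_ratio in (1.2, 1.3, 1.5):
            print(odds_ratio, round(gwas_power(n=16494, maf=0.2,
                                               or_per_allele=odds_ratio), 3))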

  17. Basic statistical analyses of candidate nickel-hydrogen cells for the Space Station Freedom

    NASA Technical Reports Server (NTRS)

    Maloney, Thomas M.; Frate, David T.

    1993-01-01

    Nickel-Hydrogen (Ni/H2) secondary batteries will be implemented as a power source for the Space Station Freedom as well as for other NASA missions. Consequently, characterization tests of Ni/H2 cells from Eagle-Picher, Whittaker-Yardney, and Hughes were completed at the NASA Lewis Research Center. Watt-hour efficiencies of each Ni/H2 cell were measured for regulated charge and discharge cycles as a function of temperature, charge rate, discharge rate, and state of charge. Temperatures ranged from -5 C to 30 C, charge rates ranged from C/10 to 1C, discharge rates ranged from C/10 to 2C, and states of charge ranged from 20 percent to 100 percent. Results from regression analyses and analyses of mean watt-hour efficiencies demonstrated that overall performance was best at temperatures between 10 C and 20 C, while the discharge rate correlated most strongly with watt-hour efficiency. In general, the cell with a back-to-back electrode arrangement, single stack, 26 percent KOH, and serrated zircar separator and the cell with a recirculating electrode arrangement, unit stack, 31 percent KOH, and zircar separators performed best.
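
    Watt-hour efficiency is the ratio of energy delivered on discharge to energy supplied on charge, and the regression analyses mentioned above relate it to the cycling conditions. A minimal sketch of such a fit, ordinary least squares of efficiency on temperature and discharge rate, is shown below; the cycle data are invented stand-ins, not the NASA Lewis measurements.

        import numpy as np

        # Hypothetical cycles: temperature (C), discharge rate (C-rate), and
        # watt-hour efficiency = discharge Wh / charge Wh. Invented values.
        temp = np.array([-5.0, 0.0, 10.0, 10.0, 20.0, 20.0, 30.0, 30.0])
        rate = np.array([0.1, 0.5, 0.1, 1.0, 0.5, 2.0, 1.0, 2.0])
        eff  = np.array([0.78, 0.80, 0.86, 0.82, 0.85, 0.76, 0.80, 0.72])

        # Ordinary least squares: eff ~ 1 + temp + rate.
        X = np.column_stack([np.ones_like(temp), temp, rate])
        coef, *_ = np.linalg.lstsq(X, eff, rcond=None)
        print("intercept, per-degree, per-C-rate:", np.round(coef, 4))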

  18. Overweight and pregnancy complications.

    PubMed

    Abrams, B; Parker, J

    1988-01-01

    The association between increased prepregnancy weight for height and seven pregnancy complications was studied in a multi-racial sample of more than 4100 recent deliveries. Body mass indices were calculated and used to classify women as average weight (90-119 percent of ideal, or BMI 19.21-25.60), moderately overweight (120-135 percent of ideal, or BMI 25.61-28.90), and very overweight (greater than 135 percent of ideal, or BMI greater than 28.91) prior to pregnancy. Compared with women of average weight for height, very overweight women had a higher risk of diabetes, hypertension, pregnancy-induced hypertension, and primary cesarean section delivery. Moderately overweight women were also at higher risk than average for diabetes, pregnancy-induced hypertension, and primary cesarean deliveries, but the relative risks were of a smaller magnitude than for very overweight women. With women of average prepregnancy body mass as the reference, moderately elevated but nonsignificant relative risks were found for perinatal mortality in the very overweight group and for urinary tract infections in both overweight groups, and a decreased risk of anemia was found in the very overweight group. However, post-hoc power analyses indicated that the number of overweight women in the sample did not provide adequate statistical power to detect these small differences in risk. To overcome limitations associated with low statistical power, the results of three recent studies of these outcomes in very overweight pregnant women were combined and summarized using Mantel-Haenszel techniques. This second, larger analysis suggested that very overweight women are at significantly higher risk for all seven outcomes studied. Summary results for moderately overweight women could not be calculated, since only two of the studies had evaluated moderately overweight women separately. These latter results support other findings that both moderate overweight and very overweight are risk factors during pregnancy, with the highest risk occurring in the heaviest group. Although these results indicate that moderate overweight is a risk factor during pregnancy, additional studies are needed to confirm the impact of being 20-35 percent above ideal weight prior to pregnancy. The results of this analysis also imply that, since the baseline incidence of many perinatal complications is low, studies relating overweight and pregnancy complications should include large enough samples of overweight women to provide adequate statistical power to reliably detect differences in complication rates.
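
    The pooling step described above, combining 2x2 tables from several studies, can be sketched directly with the standard Mantel-Haenszel pooled risk ratio, RR_MH = Σ(a_i·n0_i/N_i) / Σ(b_i·n1_i/N_i). The study counts below are invented for illustration; the three source studies' tables are not reproduced in the abstract.

        import numpy as np

        # Three hypothetical stratified 2x2 tables. Each row gives
        # (a = events among very overweight, n1 = very overweight total,
        #  b = events among average weight,  n0 = average weight total).
        tables = np.array([
            [12, 150, 30,  900],
            [ 9, 120, 25,  800],
            [15, 200, 40, 1100],
        ], dtype=float)

        a, n1, b, n0 = tables.T
        N = n1 + n0
        # Mantel-Haenszel pooled risk ratio across strata (studies).
        rr_mh = np.sum(a * n0 / N) / np.sum(b * n1 / N)
        print(f"Mantel-Haenszel pooled RR = {rr_mh:.2f}")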

  19. The Lebanese Armed Forces Engaging Nahr Al-Bared Palestinian Refugee Camp Using the Instruments of National Power

    DTIC Science & Technology

    2017-06-09

    organization. Then, the study analyses the use of the Diplomatic, Informational, Military, and Economic instruments of national power (DIME) by the LAF in...

  20. A Note on Comparing the Power of Test Statistics at Low Significance Levels.

    PubMed

    Morris, Nathan; Elston, Robert

    2011-01-01

    It is an obvious fact that the power of a test statistic is dependent upon the significance (alpha) level at which the test is performed. It is perhaps a less obvious fact that the relative performance of two statistics in terms of power is also a function of the alpha level. Through numerous personal discussions, we have noted that even some competent statisticians have the mistaken intuition that relative power comparisons at traditional levels such as α = 0.05 will be roughly similar to relative power comparisons at very low levels, such as the level α = 5 × 10⁻⁸, which is commonly used in genome-wide association studies. In this brief note, we demonstrate that this notion is in fact quite wrong, especially with respect to comparing tests with differing degrees of freedom. In fact, at very low alpha levels the cost of additional degrees of freedom is often comparatively low. Thus we recommend that statisticians exercise caution when interpreting the results of power comparison studies which use alpha levels that will not be used in practice.
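
    The note's point can be checked numerically. Under a non-central chi-square model, the sketch below computes the non-centrality (a proxy for required sample size) needed to reach 80% power with a 1-df versus a 2-df statistic carrying the same signal, at α = 0.05 and at α = 5 × 10⁻⁸; the relative penalty for the extra degree of freedom shrinks markedly at the lower level. The 80% power target is an assumption for illustration.

        from scipy.optimize import brentq
        from scipy.stats import chi2, ncx2

        def ncp_for_power(df, alpha, target=0.8):
            """Non-centrality needed for a chi-square(df) test to reach the
            target power at significance level alpha."""
            crit = chi2.ppf(1 - alpha, df)
            return brentq(lambda nc: ncx2.sf(crit, df, nc) - target, 1e-6, 200.0)

        for alpha in (0.05, 5e-8):
            n1, n2 = ncp_for_power(1, alpha), ncp_for_power(2, alpha)
            print(f"alpha={alpha:g}: ncp(1 df)={n1:.1f}, ncp(2 df)={n2:.1f}, "
                  f"extra cost of 2nd df = {100 * (n2 / n1 - 1):.0f}%")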
